From ralf.gommers at gmail.com Sun Dec 2 10:16:42 2012 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sun, 2 Dec 2012 16:16:42 +0100 Subject: [Numpy-discussion] allclose changed behaviour in 1.6.2 ? In-Reply-To: <50B8692F.4050706@smhi.se> References: <50B8692F.4050706@smhi.se> Message-ID: On Fri, Nov 30, 2012 at 9:07 AM, Martin Raspaud wrote: > Hi, > > We noticed that comparing arrays of different shapes with allclose > doesn't work anymore in numpy 1.6.2. > > Is this a feature or a bug ? :) > I vote for feature. Allclose does element-wise comparison, so using different size non-broadcastable inputs is an error in user code. It should have raised ValueError in 1.6.1 also. Ralf > > See the output in both 1.6.1 and 1.6.2 at the end of this mail. > > Best regards, > Martin > > 1.6.1:: > > In [1]: import numpy as np > > In [2]: np.__version__ > Out[2]: '1.6.1' > > In [3]: a = np.array([1, 2, 3]) > > In [4]: b = np.array([1, 2, 3, 4]) > > In [5]: np.allclose(a, b) > Out[5]: False > > > 1.6.2:: > > In[1]: import numpy as np > > In[2]: np.__version__ > Out[2]: '1.6.2' > > In [3]: a = np.array([1, 2, 3]) > > In[4]: b = np.array([1, 2, 3, 4]) > > In[5]: np.allclose(a, b) > Traceback (most recent call last): > File "", line 1, in > File > > "/home/maarten/pytroll/local/lib/python2.7/site-packages/numpy-1.6.2-py2.7-linux-x86_64.egg/numpy/core/numeric.py", > line 1936, in allclose > return all(less_equal(abs(x-y), atol + rtol * abs(y))) > ValueError: operands could not be broadcast together with shapes (3) (4) > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Sun Dec 2 11:11:27 2012 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 2 Dec 2012 16:11:27 +0000 Subject: [Numpy-discussion] allclose changed behaviour in 1.6.2 ? In-Reply-To: References: <50B8692F.4050706@smhi.se> Message-ID: On Sun, Dec 2, 2012 at 3:16 PM, Ralf Gommers wrote: > > On Fri, Nov 30, 2012 at 9:07 AM, Martin Raspaud > wrote: >> >> Hi, >> >> We noticed that comparing arrays of different shapes with allclose >> doesn't work anymore in numpy 1.6.2. >> >> Is this a feature or a bug ? :) > > > I vote for feature. Allclose does element-wise comparison, so using > different size non-broadcastable inputs is an error in user code. It should > have raised ValueError in 1.6.1 also. I think I agree... in retrospect maybe we should have left the change for 1.7 rather than 1.6.2, but it's too late to do much about that, at least for this particular issue. -n From charlesr.harris at gmail.com Sun Dec 2 16:07:21 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 2 Dec 2012 14:07:21 -0700 Subject: [Numpy-discussion] Euler-Mascheroni constant Message-ID: Hi All, I put in a PR to expose the Euler-Mascheroni constant as 'euler_gamma'. The name is open to discussion. Suggestions for alternatives welcome. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From raul at virtualmaterials.com Sun Dec 2 20:28:24 2012 From: raul at virtualmaterials.com (Raul Cota) Date: Sun, 02 Dec 2012 18:28:24 -0700 Subject: [Numpy-discussion] Speed bottlenecks on simple tasks - suggested improvement Message-ID: <50BC0038.70105@virtualmaterials.com> Hello, First a quick summary of my problem and at the end I include the basic changes I am suggesting to the source (they may benefit others) I am ages behind in times and I am still using Numeric in Python 2.2.3. The main reason why it has taken so long to upgrade is because NumPy kills performance on several of my tests. I am sorry if this topic has been discussed before. I tried parsing the mailing list and also google and all I found were comments related to the fact that such is life when you use NumPy for small arrays. In my case I have several thousands of lines of code where data structures rely heavily on Numeric arrays but it is unpredictable if the problem at hand will result in large or small arrays. Furthermore, once the vectorized operations complete, the values could be assigned into scalars and just do simple math or loops. I am fairly sure the core of my problems is that the 'float64' objects start propagating all over the program data structures (not in arrays) and they are considerably slower for just about everything when compared to the native python float. Conclusion, it is not practical for me to do a massive re-structuring of code to improve speed on simple things like "a[0] < 4" (assuming "a" is an array) which is about 10 times slower than "b < 4" (assuming "b" is a float) I finally decided to track down the problem and I started by getting Python 2.6 from source and profiling it in one of my cases. By far the biggest bottleneck came out to be PyString_FromFormatV which is a function to assemble a string for a Python error caused by a failure to find an attribute when "multiarray" calls PyObject_GetAttrString. This function seems to get called way too often from NumPy. The real bottleneck of trying to find the attribute when it does not exist is not that it fails to find it, but that it builds a string to set a Python error. In other words, something as simple as "a[0] < 3.5" internally result in a call to set a python error . I downloaded NumPy code (for Python 2.6) and tracked down all the calls like this, ret = PyObject_GetAttrString(obj, "__array_priority__"); and changed to if (PyList_CheckExact(obj) || (Py_None == obj) || PyTuple_CheckExact(obj) || PyFloat_CheckExact(obj) || PyInt_CheckExact(obj) || PyString_CheckExact(obj) || PyUnicode_CheckExact(obj)){ //Avoid expensive calls when I am sure the attribute //does not exist ret = NULL; } else{ ret = PyObject_GetAttrString(obj, "__array_priority__"); ( I think I found about 7 spots ) I also noticed (not as bad in my case) that calls to PyObject_GetBuffer also resulted in Python errors being set thus unnecessarily slower code. With this change, something like this, for i in xrange(1000000): if a[1] < 35.0: pass went down from 0.8 seconds to 0.38 seconds. A bogus test like this, for i in xrange(1000000): a = array([1., 2., 3.]) went down from 8.5 seconds to 2.5 seconds. Altogether, these simple changes got me half way to the speed I used to get in Numeric and I could not see any slow down in any of my cases that benefit from heavy array manipulation. I am out of ideas on how to improve further though. Few questions: - Is there any interest for me to provide the exact details of the code I changed ? 
- I managed to compile NumPy through setup.py but I am not sure how to force it to generate pdb files from my Visual Studio Compiler. I need the pdb files such that I can run my profiler on NumPy. Anybody has any experience with this ? (Visual Studio) - The core of my problems I think boil down to things like this s = a[0] assigning a float64 into s as opposed to a native float ? Is there any way to hack code to change it to extract a native float instead ? (probably crazy talk, but I thought I'd ask :) ). I'd prefer to not use s = a.item(0) because I would have to change too much code and it is not even that much faster. For example, for i in xrange(1000000): if a.item(1) < 35.0: pass is 0.23 seconds (as opposed to 0.38 seconds with my suggested changes) I apologize again if this topic has already been discussed. Regards, Raul From cgohlke at uci.edu Sun Dec 2 21:33:33 2012 From: cgohlke at uci.edu (Christoph Gohlke) Date: Sun, 02 Dec 2012 18:33:33 -0800 Subject: [Numpy-discussion] Speed bottlenecks on simple tasks - suggested improvement In-Reply-To: <50BC0038.70105@virtualmaterials.com> References: <50BC0038.70105@virtualmaterials.com> Message-ID: <50BC0F7D.8020602@uci.edu> On 12/2/2012 5:28 PM, Raul Cota wrote: > Hello, > > First a quick summary of my problem and at the end I include the basic > changes I am suggesting to the source (they may benefit others) > > I am ages behind in times and I am still using Numeric in Python 2.2.3. > The main reason why it has taken so long to upgrade is because NumPy > kills performance on several of my tests. > > I am sorry if this topic has been discussed before. I tried parsing the > mailing list and also google and all I found were comments related to > the fact that such is life when you use NumPy for small arrays. > > In my case I have several thousands of lines of code where data > structures rely heavily on Numeric arrays but it is unpredictable if the > problem at hand will result in large or small arrays. Furthermore, once > the vectorized operations complete, the values could be assigned into > scalars and just do simple math or loops. I am fairly sure the core of > my problems is that the 'float64' objects start propagating all over the > program data structures (not in arrays) and they are considerably slower > for just about everything when compared to the native python float. > > Conclusion, it is not practical for me to do a massive re-structuring of > code to improve speed on simple things like "a[0] < 4" (assuming "a" is > an array) which is about 10 times slower than "b < 4" (assuming "b" is a > float) > > > I finally decided to track down the problem and I started by getting > Python 2.6 from source and profiling it in one of my cases. By far the > biggest bottleneck came out to be PyString_FromFormatV which is a > function to assemble a string for a Python error caused by a failure to > find an attribute when "multiarray" calls PyObject_GetAttrString. This > function seems to get called way too often from NumPy. The real > bottleneck of trying to find the attribute when it does not exist is not > that it fails to find it, but that it builds a string to set a Python > error. In other words, something as simple as "a[0] < 3.5" internally > result in a call to set a python error . 
> > I downloaded NumPy code (for Python 2.6) and tracked down all the calls > like this, > > ret = PyObject_GetAttrString(obj, "__array_priority__"); > > and changed to > if (PyList_CheckExact(obj) || (Py_None == obj) || > PyTuple_CheckExact(obj) || > PyFloat_CheckExact(obj) || > PyInt_CheckExact(obj) || > PyString_CheckExact(obj) || > PyUnicode_CheckExact(obj)){ > //Avoid expensive calls when I am sure the attribute > //does not exist > ret = NULL; > } > else{ > ret = PyObject_GetAttrString(obj, "__array_priority__"); > > > > ( I think I found about 7 spots ) > > > I also noticed (not as bad in my case) that calls to PyObject_GetBuffer > also resulted in Python errors being set thus unnecessarily slower code. > > > With this change, something like this, > for i in xrange(1000000): > if a[1] < 35.0: > pass > > went down from 0.8 seconds to 0.38 seconds. > > A bogus test like this, > for i in xrange(1000000): > a = array([1., 2., 3.]) > > went down from 8.5 seconds to 2.5 seconds. > > > > Altogether, these simple changes got me half way to the speed I used to > get in Numeric and I could not see any slow down in any of my cases that > benefit from heavy array manipulation. I am out of ideas on how to > improve further though. > > Few questions: > - Is there any interest for me to provide the exact details of the code > I changed ? > > - I managed to compile NumPy through setup.py but I am not sure how to > force it to generate pdb files from my Visual Studio Compiler. I need > the pdb files such that I can run my profiler on NumPy. Anybody has any > experience with this ? (Visual Studio) Change the compiler and linker flags in Python\Lib\distutils\msvc9compiler.py to: self.compile_options = ['/nologo', '/Ox', '/MD', '/W3', '/DNDEBUG', '/Zi'] self.ldflags_shared = ['/DLL', '/nologo', '/INCREMENTAL:YES', '/DEBUG'] Then rebuild numpy. Christoph > > - The core of my problems I think boil down to things like this > s = a[0] > assigning a float64 into s as opposed to a native float ? > Is there any way to hack code to change it to extract a native float > instead ? (probably crazy talk, but I thought I'd ask :) ). > I'd prefer to not use s = a.item(0) because I would have to change too > much code and it is not even that much faster. For example, > for i in xrange(1000000): > if a.item(1) < 35.0: > pass > is 0.23 seconds (as opposed to 0.38 seconds with my suggested changes) > > > I apologize again if this topic has already been discussed. > > > Regards, > > Raul > > From travis at continuum.io Sun Dec 2 22:31:28 2012 From: travis at continuum.io (Travis Oliphant) Date: Sun, 2 Dec 2012 21:31:28 -0600 Subject: [Numpy-discussion] Speed bottlenecks on simple tasks - suggested improvement In-Reply-To: <50BC0038.70105@virtualmaterials.com> References: <50BC0038.70105@virtualmaterials.com> Message-ID: <4F8FB56A-5C63-436D-8EFE-359C7BB70203@continuum.io> Raul, This is *fantastic work*. While many optimizations were done 6 years ago as people started to convert their code, that kind of report has trailed off in the last few years. I have not seen this kind of speed-comparison for some time --- but I think it's definitely beneficial. NumPy still has quite a bit that can be optimized. I think your example is really great. Perhaps it's worth making a C-API macro out of the short-cut to the attribute string so it can be used by others. It would be interesting to see where your other slow-downs are. I would be interested to see if the slow-math of float64 is hurting you. 
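To get a feel for that on your machine, something along these lines shows the gap quickly (just a rough sketch with made-up variable names; the exact numbers will of course vary):

import timeit
import numpy as np

py_x = 3.5              # plain Python float
np_x = np.float64(3.5)  # the kind of scalar that a[0] hands back

setup = "from __main__ import py_x, np_x"
# Identical expression; only the scalar type differs.
print(timeit.timeit("py_x * 1.1 + 2.0 < 4.0", setup=setup, number=1000000))
print(timeit.timeit("np_x * 1.1 + 2.0 < 4.0", setup=setup, number=1000000))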
It would be possible, for example, to do a simple subclass of the ndarray that overloads a[] to be the same as array.item(). The latter syntax returns python objects (i.e. floats) instead of array scalars. Also, it would not be too difficult to add fast-math paths for int64, float32, and float64 scalars (so they don't go through ufuncs but do scalar-math like the float and int objects in Python. A related thing we've been working on lately which might help you is Numba which might help speed up functions that have code like: "a[0] < 4" : http://numba.pydata.org. Numba will translate the expression a[0] < 4 to a machine-code address-lookup and math operation which is *much* faster when a is a NumPy array. Presently this requires you to wrap your function call in a decorator: from numba import autojit @autojit def function_to_speed_up(...): pass In the near future (2-4 weeks), numba will grow the experimental ability to basically replace all your function calls with @autojit versions in a Python function. I would love to see something like this work: python -m numba filename.py To get an effective autojit on all the filename.py functions (and optionally on all python modules it imports). The autojit works out of the box today --- you can get Numba from PyPI (or inside of the completely free Anaconda CE) to try it out. Best, -Travis On Dec 2, 2012, at 7:28 PM, Raul Cota wrote: > Hello, > > First a quick summary of my problem and at the end I include the basic > changes I am suggesting to the source (they may benefit others) > > I am ages behind in times and I am still using Numeric in Python 2.2.3. > The main reason why it has taken so long to upgrade is because NumPy > kills performance on several of my tests. > > I am sorry if this topic has been discussed before. I tried parsing the > mailing list and also google and all I found were comments related to > the fact that such is life when you use NumPy for small arrays. > > In my case I have several thousands of lines of code where data > structures rely heavily on Numeric arrays but it is unpredictable if the > problem at hand will result in large or small arrays. Furthermore, once > the vectorized operations complete, the values could be assigned into > scalars and just do simple math or loops. I am fairly sure the core of > my problems is that the 'float64' objects start propagating all over the > program data structures (not in arrays) and they are considerably slower > for just about everything when compared to the native python float. > > Conclusion, it is not practical for me to do a massive re-structuring of > code to improve speed on simple things like "a[0] < 4" (assuming "a" is > an array) which is about 10 times slower than "b < 4" (assuming "b" is a > float) > > > I finally decided to track down the problem and I started by getting > Python 2.6 from source and profiling it in one of my cases. By far the > biggest bottleneck came out to be PyString_FromFormatV which is a > function to assemble a string for a Python error caused by a failure to > find an attribute when "multiarray" calls PyObject_GetAttrString. This > function seems to get called way too often from NumPy. The real > bottleneck of trying to find the attribute when it does not exist is not > that it fails to find it, but that it builds a string to set a Python > error. In other words, something as simple as "a[0] < 3.5" internally > result in a call to set a python error . 
> > I downloaded NumPy code (for Python 2.6) and tracked down all the calls > like this, > > ret = PyObject_GetAttrString(obj, "__array_priority__"); > > and changed to > if (PyList_CheckExact(obj) || (Py_None == obj) || > PyTuple_CheckExact(obj) || > PyFloat_CheckExact(obj) || > PyInt_CheckExact(obj) || > PyString_CheckExact(obj) || > PyUnicode_CheckExact(obj)){ > //Avoid expensive calls when I am sure the attribute > //does not exist > ret = NULL; > } > else{ > ret = PyObject_GetAttrString(obj, "__array_priority__"); > > > > ( I think I found about 7 spots ) > > > I also noticed (not as bad in my case) that calls to PyObject_GetBuffer > also resulted in Python errors being set thus unnecessarily slower code. > > > With this change, something like this, > for i in xrange(1000000): > if a[1] < 35.0: > pass > > went down from 0.8 seconds to 0.38 seconds. > > A bogus test like this, > for i in xrange(1000000): > a = array([1., 2., 3.]) > > went down from 8.5 seconds to 2.5 seconds. > > > > Altogether, these simple changes got me half way to the speed I used to > get in Numeric and I could not see any slow down in any of my cases that > benefit from heavy array manipulation. I am out of ideas on how to > improve further though. > > Few questions: > - Is there any interest for me to provide the exact details of the code > I changed ? > > - I managed to compile NumPy through setup.py but I am not sure how to > force it to generate pdb files from my Visual Studio Compiler. I need > the pdb files such that I can run my profiler on NumPy. Anybody has any > experience with this ? (Visual Studio) > > - The core of my problems I think boil down to things like this > s = a[0] > assigning a float64 into s as opposed to a native float ? > Is there any way to hack code to change it to extract a native float > instead ? (probably crazy talk, but I thought I'd ask :) ). > I'd prefer to not use s = a.item(0) because I would have to change too > much code and it is not even that much faster. For example, > for i in xrange(1000000): > if a.item(1) < 35.0: > pass > is 0.23 seconds (as opposed to 0.38 seconds with my suggested changes) > > > I apologize again if this topic has already been discussed. > > > Regards, > > Raul > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From njs at pobox.com Mon Dec 3 06:14:13 2012 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 3 Dec 2012 11:14:13 +0000 Subject: [Numpy-discussion] Speed bottlenecks on simple tasks - suggested improvement In-Reply-To: <50BC0038.70105@virtualmaterials.com> References: <50BC0038.70105@virtualmaterials.com> Message-ID: On Mon, Dec 3, 2012 at 1:28 AM, Raul Cota wrote: > I finally decided to track down the problem and I started by getting > Python 2.6 from source and profiling it in one of my cases. By far the > biggest bottleneck came out to be PyString_FromFormatV which is a > function to assemble a string for a Python error caused by a failure to > find an attribute when "multiarray" calls PyObject_GetAttrString. This > function seems to get called way too often from NumPy. The real > bottleneck of trying to find the attribute when it does not exist is not > that it fails to find it, but that it builds a string to set a Python > error. In other words, something as simple as "a[0] < 3.5" internally > result in a call to set a python error . 
> > I downloaded NumPy code (for Python 2.6) and tracked down all the calls > like this, > > ret = PyObject_GetAttrString(obj, "__array_priority__"); > > and changed to > if (PyList_CheckExact(obj) || (Py_None == obj) || > PyTuple_CheckExact(obj) || > PyFloat_CheckExact(obj) || > PyInt_CheckExact(obj) || > PyString_CheckExact(obj) || > PyUnicode_CheckExact(obj)){ > //Avoid expensive calls when I am sure the attribute > //does not exist > ret = NULL; > } > else{ > ret = PyObject_GetAttrString(obj, "__array_priority__"); > > ( I think I found about 7 spots ) If the problem is the exception construction, then maybe this would work about as well? if (PyObject_HasAttrString(obj, "__array_priority__") { ret = PyObject_GetAttrString(obj, "__array_priority__"); } else { ret = NULL; } If so then it would be an easier and more reliable way to accomplish this. > I also noticed (not as bad in my case) that calls to PyObject_GetBuffer > also resulted in Python errors being set thus unnecessarily slower code. > > With this change, something like this, > for i in xrange(1000000): > if a[1] < 35.0: > pass > > went down from 0.8 seconds to 0.38 seconds. Huh, why is PyObject_GetBuffer even getting called in this case? > A bogus test like this, > for i in xrange(1000000): > a = array([1., 2., 3.]) > > went down from 8.5 seconds to 2.5 seconds. I can see why we'd call PyObject_GetBuffer in this case, but not why it would take 2/3rds of the total run-time... > - The core of my problems I think boil down to things like this > s = a[0] > assigning a float64 into s as opposed to a native float ? > Is there any way to hack code to change it to extract a native float > instead ? (probably crazy talk, but I thought I'd ask :) ). > I'd prefer to not use s = a.item(0) because I would have to change too > much code and it is not even that much faster. For example, > for i in xrange(1000000): > if a.item(1) < 35.0: > pass > is 0.23 seconds (as opposed to 0.38 seconds with my suggested changes) I'm confused here -- first you say that your problems would be fixed if a[0] gave you a native float, but then you say that a.item(0) (which is basically a[0] that gives a native float) is still too slow? (OTOH at 40% speedup is pretty good, even if it is just a microbenchmark :-).) Array scalars are definitely pretty slow: In [9]: timeit a[0] 1000000 loops, best of 3: 151 ns per loop In [10]: timeit a.item(0) 10000000 loops, best of 3: 169 ns per loop In [11]: timeit a[0] < 35.0 1000000 loops, best of 3: 989 ns per loop In [12]: timeit a.item(0) < 35.0 1000000 loops, best of 3: 233 ns per loop It is probably possible to make numpy scalars faster... I'm not even sure why they go through the ufunc machinery, like Travis said, since they don't even follow the ufunc rules: In [3]: np.array(2) * [1, 2, 3] # 0-dim array coerces and broadcasts Out[3]: array([2, 4, 6]) In [4]: np.array(2)[()] * [1, 2, 3] # scalar acts like python integer Out[4]: [1, 2, 3, 1, 2, 3] But you may want to experiment a bit more to make sure this is actually the problem. IME guesses about speed problems are almost always wrong (even when I take this rule into account and only guess when I'm *really* sure). 
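If scalar extraction does turn out to be the hot spot, one cheap experiment is the subclass idea Travis mentioned up-thread -- make plain integer indexing hand back Python floats instead of array scalars. A minimal, untested sketch (it only covers simple int indices; everything else falls through to the normal behaviour):

import numpy as np

class ItemArray(np.ndarray):
    # a[i] returns a Python scalar (like a.item(i)) when i is a plain int
    def __getitem__(self, index):
        if isinstance(index, int):
            return self.item(index)
        return np.ndarray.__getitem__(self, index)

a = np.array([1., 2., 3.]).view(ItemArray)
print(type(a[1]))    # plain Python float, not numpy.float64
print(type(a[0:2]))  # slicing still returns an array view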
-n From josef.pktd at gmail.com Mon Dec 3 08:56:55 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 3 Dec 2012 08:56:55 -0500 Subject: [Numpy-discussion] Speed bottlenecks on simple tasks - suggested improvement In-Reply-To: References: <50BC0038.70105@virtualmaterials.com> Message-ID: On Mon, Dec 3, 2012 at 6:14 AM, Nathaniel Smith wrote: > On Mon, Dec 3, 2012 at 1:28 AM, Raul Cota wrote: >> I finally decided to track down the problem and I started by getting >> Python 2.6 from source and profiling it in one of my cases. By far the >> biggest bottleneck came out to be PyString_FromFormatV which is a >> function to assemble a string for a Python error caused by a failure to >> find an attribute when "multiarray" calls PyObject_GetAttrString. This >> function seems to get called way too often from NumPy. The real >> bottleneck of trying to find the attribute when it does not exist is not >> that it fails to find it, but that it builds a string to set a Python >> error. In other words, something as simple as "a[0] < 3.5" internally >> result in a call to set a python error . >> >> I downloaded NumPy code (for Python 2.6) and tracked down all the calls >> like this, >> >> ret = PyObject_GetAttrString(obj, "__array_priority__"); >> >> and changed to >> if (PyList_CheckExact(obj) || (Py_None == obj) || >> PyTuple_CheckExact(obj) || >> PyFloat_CheckExact(obj) || >> PyInt_CheckExact(obj) || >> PyString_CheckExact(obj) || >> PyUnicode_CheckExact(obj)){ >> //Avoid expensive calls when I am sure the attribute >> //does not exist >> ret = NULL; >> } >> else{ >> ret = PyObject_GetAttrString(obj, "__array_priority__"); >> >> ( I think I found about 7 spots ) > > If the problem is the exception construction, then maybe this would > work about as well? > > if (PyObject_HasAttrString(obj, "__array_priority__") { > ret = PyObject_GetAttrString(obj, "__array_priority__"); > } else { > ret = NULL; > } > > If so then it would be an easier and more reliable way to accomplish this. > >> I also noticed (not as bad in my case) that calls to PyObject_GetBuffer >> also resulted in Python errors being set thus unnecessarily slower code. >> >> With this change, something like this, >> for i in xrange(1000000): >> if a[1] < 35.0: >> pass >> >> went down from 0.8 seconds to 0.38 seconds. > > Huh, why is PyObject_GetBuffer even getting called in this case? > >> A bogus test like this, >> for i in xrange(1000000): >> a = array([1., 2., 3.]) >> >> went down from 8.5 seconds to 2.5 seconds. > > I can see why we'd call PyObject_GetBuffer in this case, but not why > it would take 2/3rds of the total run-time... > >> - The core of my problems I think boil down to things like this >> s = a[0] >> assigning a float64 into s as opposed to a native float ? >> Is there any way to hack code to change it to extract a native float >> instead ? (probably crazy talk, but I thought I'd ask :) ). >> I'd prefer to not use s = a.item(0) because I would have to change too >> much code and it is not even that much faster. For example, >> for i in xrange(1000000): >> if a.item(1) < 35.0: >> pass >> is 0.23 seconds (as opposed to 0.38 seconds with my suggested changes) > > I'm confused here -- first you say that your problems would be fixed > if a[0] gave you a native float, but then you say that a.item(0) > (which is basically a[0] that gives a native float) is still too slow? > (OTOH at 40% speedup is pretty good, even if it is just a > microbenchmark :-).) 
Array scalars are definitely pretty slow: > > In [9]: timeit a[0] > 1000000 loops, best of 3: 151 ns per loop > > In [10]: timeit a.item(0) > 10000000 loops, best of 3: 169 ns per loop > > In [11]: timeit a[0] < 35.0 > 1000000 loops, best of 3: 989 ns per loop > > In [12]: timeit a.item(0) < 35.0 > 1000000 loops, best of 3: 233 ns per loop > > It is probably possible to make numpy scalars faster... I'm not even > sure why they go through the ufunc machinery, like Travis said, since > they don't even follow the ufunc rules: > > In [3]: np.array(2) * [1, 2, 3] # 0-dim array coerces and broadcasts > Out[3]: array([2, 4, 6]) > > In [4]: np.array(2)[()] * [1, 2, 3] # scalar acts like python integer > Out[4]: [1, 2, 3, 1, 2, 3] I thought it still behaves like a numpy "animal" >>> np.array(-2)[()] ** [1, 2, 3] array([-2, 4, -8]) >>> np.array(-2)[()] ** 0.5 nan >>> np.array(-2).item() ** [1, 2, 3] Traceback (most recent call last): File "", line 1, in TypeError: unsupported operand type(s) for ** or pow(): 'int' and 'list' >>> np.array(-2).item() ** 0.5 Traceback (most recent call last): File "", line 1, in ValueError: negative number cannot be raised to a fractional power >>> np.array(0)[()] ** (-1) inf >>> np.array(0).item() ** (-1) Traceback (most recent call last): File "", line 1, in ZeroDivisionError: 0.0 cannot be raised to a negative power and similar I often try to avoid python scalars to avoid "surprising" behavior, and try to work defensively or fixed bugs by switching to np.power(...) (for example in the distributions). Josef > > But you may want to experiment a bit more to make sure this is > actually the problem. IME guesses about speed problems are almost > always wrong (even when I take this rule into account and only guess > when I'm *really* sure). > > -n > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From josef.pktd at gmail.com Mon Dec 3 10:14:44 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 3 Dec 2012 10:14:44 -0500 Subject: [Numpy-discussion] scalars and strange casting Message-ID: A followup on the previous thread on scalar speed. operations with numpy scalars I can *maybe* understand this >>> np.array(2)[()] * [0.5, 1] [0.5, 1, 0.5, 1] but don't understand this >>> np.array(2.+0.1j)[()] * [0.5, 1] __main__:1: ComplexWarning: Casting complex values to real discards the imaginary part [0.5, 1, 0.5, 1] The difference in behavior compared to the other operators, +,-, /,**, looks, at least, like an inconsistency to me. Python 2.6.5 (r265:79096, Mar 19 2010, 21:48:26) [MSC v.1500 32 bit (Intel)] on win32 Type "help", "copyright", "credits" or "license" for more information. >>> import numpy as np >>> np.array(2.+0.1j)[()] * [0.5, 1] __main__:1: ComplexWarning: Casting complex values to real discards the imaginary part [0.5, 1, 0.5, 1] >>> np.array(2.+0.1j)[()] ** [0.5, 1] array([ 1.41465516+0.0353443j, 2.00000000+0.1j ]) >>> np.array(2.+0.1j)[()] + [0.5, 1] array([ 2.5+0.1j, 3.0+0.1j]) >>> np.array(2.+0.1j)[()] / [0.5, 1] array([ 4.+0.2j, 2.+0.1j]) >>> np.array(2)[()] * [0.5, 1] [0.5, 1, 0.5, 1] >>> np.array(2)[()] / [0.5, 1] array([ 4., 2.]) >>> np.array(2)[()] ** [0.5, 1] array([ 1.41421356, 2. ]) >>> np.array(2)[()] - [0.5, 1] array([ 1.5, 1. 
]) >>> np.__version__ '1.5.1' or >>> np.array(-2.+0.1j)[()] * [0.5, 1] [] >>> np.multiply(np.array(-2.+0.1j)[()], [0.5, 1]) array([-1.+0.05j, -2.+0.1j ]) >>> np.array([-2.+0.1j])[0] * [0.5, 1] [] >>> np.multiply(np.array([-2.+0.1j])[0], [0.5, 1]) array([-1.+0.05j, -2.+0.1j ]) Josef defensive programming = don't use python, use numpy arrays, or at least remember which kind of animals you have From raul at virtualmaterials.com Mon Dec 3 10:33:23 2012 From: raul at virtualmaterials.com (Raul Cota) Date: Mon, 03 Dec 2012 08:33:23 -0700 Subject: [Numpy-discussion] Speed bottlenecks on simple tasks - suggested improvement In-Reply-To: <50BC0F7D.8020602@uci.edu> References: <50BC0038.70105@virtualmaterials.com> <50BC0F7D.8020602@uci.edu> Message-ID: <50BCC643.7000601@virtualmaterials.com> Thanks Christoph. It seemed to work. Will do profile runs today/tomorrow and see what come out. Raul On 02/12/2012 7:33 PM, Christoph Gohlke wrote: > On 12/2/2012 5:28 PM, Raul Cota wrote: >> Hello, >> >> First a quick summary of my problem and at the end I include the basic >> changes I am suggesting to the source (they may benefit others) >> >> I am ages behind in times and I am still using Numeric in Python 2.2.3. >> The main reason why it has taken so long to upgrade is because NumPy >> kills performance on several of my tests. >> >> I am sorry if this topic has been discussed before. I tried parsing the >> mailing list and also google and all I found were comments related to >> the fact that such is life when you use NumPy for small arrays. >> >> In my case I have several thousands of lines of code where data >> structures rely heavily on Numeric arrays but it is unpredictable if the >> problem at hand will result in large or small arrays. Furthermore, once >> the vectorized operations complete, the values could be assigned into >> scalars and just do simple math or loops. I am fairly sure the core of >> my problems is that the 'float64' objects start propagating all over the >> program data structures (not in arrays) and they are considerably slower >> for just about everything when compared to the native python float. >> >> Conclusion, it is not practical for me to do a massive re-structuring of >> code to improve speed on simple things like "a[0] < 4" (assuming "a" is >> an array) which is about 10 times slower than "b < 4" (assuming "b" is a >> float) >> >> >> I finally decided to track down the problem and I started by getting >> Python 2.6 from source and profiling it in one of my cases. By far the >> biggest bottleneck came out to be PyString_FromFormatV which is a >> function to assemble a string for a Python error caused by a failure to >> find an attribute when "multiarray" calls PyObject_GetAttrString. This >> function seems to get called way too often from NumPy. The real >> bottleneck of trying to find the attribute when it does not exist is not >> that it fails to find it, but that it builds a string to set a Python >> error. In other words, something as simple as "a[0] < 3.5" internally >> result in a call to set a python error . 
>> >> I downloaded NumPy code (for Python 2.6) and tracked down all the calls >> like this, >> >> ret = PyObject_GetAttrString(obj, "__array_priority__"); >> >> and changed to >> if (PyList_CheckExact(obj) || (Py_None == obj) || >> PyTuple_CheckExact(obj) || >> PyFloat_CheckExact(obj) || >> PyInt_CheckExact(obj) || >> PyString_CheckExact(obj) || >> PyUnicode_CheckExact(obj)){ >> //Avoid expensive calls when I am sure the attribute >> //does not exist >> ret = NULL; >> } >> else{ >> ret = PyObject_GetAttrString(obj, "__array_priority__"); >> >> >> >> ( I think I found about 7 spots ) >> >> >> I also noticed (not as bad in my case) that calls to PyObject_GetBuffer >> also resulted in Python errors being set thus unnecessarily slower code. >> >> >> With this change, something like this, >> for i in xrange(1000000): >> if a[1] < 35.0: >> pass >> >> went down from 0.8 seconds to 0.38 seconds. >> >> A bogus test like this, >> for i in xrange(1000000): >> a = array([1., 2., 3.]) >> >> went down from 8.5 seconds to 2.5 seconds. >> >> >> >> Altogether, these simple changes got me half way to the speed I used to >> get in Numeric and I could not see any slow down in any of my cases that >> benefit from heavy array manipulation. I am out of ideas on how to >> improve further though. >> >> Few questions: >> - Is there any interest for me to provide the exact details of the code >> I changed ? >> >> - I managed to compile NumPy through setup.py but I am not sure how to >> force it to generate pdb files from my Visual Studio Compiler. I need >> the pdb files such that I can run my profiler on NumPy. Anybody has any >> experience with this ? (Visual Studio) > > Change the compiler and linker flags in > Python\Lib\distutils\msvc9compiler.py to: > > self.compile_options = ['/nologo', '/Ox', '/MD', '/W3', '/DNDEBUG', '/Zi'] > self.ldflags_shared = ['/DLL', '/nologo', '/INCREMENTAL:YES', '/DEBUG'] > > Then rebuild numpy. > > Christoph > > > >> - The core of my problems I think boil down to things like this >> s = a[0] >> assigning a float64 into s as opposed to a native float ? >> Is there any way to hack code to change it to extract a native float >> instead ? (probably crazy talk, but I thought I'd ask :) ). >> I'd prefer to not use s = a.item(0) because I would have to change too >> much code and it is not even that much faster. For example, >> for i in xrange(1000000): >> if a.item(1) < 35.0: >> pass >> is 0.23 seconds (as opposed to 0.38 seconds with my suggested changes) >> >> >> I apologize again if this topic has already been discussed. >> >> >> Regards, >> >> Raul >> >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From raul at virtualmaterials.com Mon Dec 3 10:35:58 2012 From: raul at virtualmaterials.com (Raul Cota) Date: Mon, 03 Dec 2012 08:35:58 -0700 Subject: [Numpy-discussion] Speed bottlenecks on simple tasks - suggested improvement In-Reply-To: <4F8FB56A-5C63-436D-8EFE-359C7BB70203@continuum.io> References: <50BC0038.70105@virtualmaterials.com> <4F8FB56A-5C63-436D-8EFE-359C7BB70203@continuum.io> Message-ID: <50BCC6DE.3090901@virtualmaterials.com> On 02/12/2012 8:31 PM, Travis Oliphant wrote: > Raul, > > This is *fantastic work*. While many optimizations were done 6 years ago as people started to convert their code, that kind of report has trailed off in the last few years. 
I have not seen this kind of speed-comparison for some time --- but I think it's definitely beneficial. I'll clean up a bit as a Macro and comment. > NumPy still has quite a bit that can be optimized. I think your example is really great. Perhaps it's worth making a C-API macro out of the short-cut to the attribute string so it can be used by others. It would be interesting to see where your other slow-downs are. I would be interested to see if the slow-math of float64 is hurting you. It would be possible, for example, to do a simple subclass of the ndarray that overloads a[] to be the same as array.item(). The latter syntax returns python objects (i.e. floats) instead of array scalars. > > Also, it would not be too difficult to add fast-math paths for int64, float32, and float64 scalars (so they don't go through ufuncs but do scalar-math like the float and int objects in Python. Thanks. I'll dig a bit more into the code. > > A related thing we've been working on lately which might help you is Numba which might help speed up functions that have code like: "a[0] < 4" : http://numba.pydata.org. > > Numba will translate the expression a[0] < 4 to a machine-code address-lookup and math operation which is *much* faster when a is a NumPy array. Presently this requires you to wrap your function call in a decorator: > > from numba import autojit > > @autojit > def function_to_speed_up(...): > pass > > In the near future (2-4 weeks), numba will grow the experimental ability to basically replace all your function calls with @autojit versions in a Python function. I would love to see something like this work: > > python -m numba filename.py > > To get an effective autojit on all the filename.py functions (and optionally on all python modules it imports). The autojit works out of the box today --- you can get Numba from PyPI (or inside of the completely free Anaconda CE) to try it out. This looks very interesting. Will check it out. > Best, > > -Travis > > > > > On Dec 2, 2012, at 7:28 PM, Raul Cota wrote: > >> Hello, >> >> First a quick summary of my problem and at the end I include the basic >> changes I am suggesting to the source (they may benefit others) >> >> I am ages behind in times and I am still using Numeric in Python 2.2.3. >> The main reason why it has taken so long to upgrade is because NumPy >> kills performance on several of my tests. >> >> I am sorry if this topic has been discussed before. I tried parsing the >> mailing list and also google and all I found were comments related to >> the fact that such is life when you use NumPy for small arrays. >> >> In my case I have several thousands of lines of code where data >> structures rely heavily on Numeric arrays but it is unpredictable if the >> problem at hand will result in large or small arrays. Furthermore, once >> the vectorized operations complete, the values could be assigned into >> scalars and just do simple math or loops. I am fairly sure the core of >> my problems is that the 'float64' objects start propagating all over the >> program data structures (not in arrays) and they are considerably slower >> for just about everything when compared to the native python float. 
>> >> Conclusion, it is not practical for me to do a massive re-structuring of >> code to improve speed on simple things like "a[0] < 4" (assuming "a" is >> an array) which is about 10 times slower than "b < 4" (assuming "b" is a >> float) >> >> >> I finally decided to track down the problem and I started by getting >> Python 2.6 from source and profiling it in one of my cases. By far the >> biggest bottleneck came out to be PyString_FromFormatV which is a >> function to assemble a string for a Python error caused by a failure to >> find an attribute when "multiarray" calls PyObject_GetAttrString. This >> function seems to get called way too often from NumPy. The real >> bottleneck of trying to find the attribute when it does not exist is not >> that it fails to find it, but that it builds a string to set a Python >> error. In other words, something as simple as "a[0] < 3.5" internally >> result in a call to set a python error . >> >> I downloaded NumPy code (for Python 2.6) and tracked down all the calls >> like this, >> >> ret = PyObject_GetAttrString(obj, "__array_priority__"); >> >> and changed to >> if (PyList_CheckExact(obj) || (Py_None == obj) || >> PyTuple_CheckExact(obj) || >> PyFloat_CheckExact(obj) || >> PyInt_CheckExact(obj) || >> PyString_CheckExact(obj) || >> PyUnicode_CheckExact(obj)){ >> //Avoid expensive calls when I am sure the attribute >> //does not exist >> ret = NULL; >> } >> else{ >> ret = PyObject_GetAttrString(obj, "__array_priority__"); >> >> >> >> ( I think I found about 7 spots ) >> >> >> I also noticed (not as bad in my case) that calls to PyObject_GetBuffer >> also resulted in Python errors being set thus unnecessarily slower code. >> >> >> With this change, something like this, >> for i in xrange(1000000): >> if a[1] < 35.0: >> pass >> >> went down from 0.8 seconds to 0.38 seconds. >> >> A bogus test like this, >> for i in xrange(1000000): >> a = array([1., 2., 3.]) >> >> went down from 8.5 seconds to 2.5 seconds. >> >> >> >> Altogether, these simple changes got me half way to the speed I used to >> get in Numeric and I could not see any slow down in any of my cases that >> benefit from heavy array manipulation. I am out of ideas on how to >> improve further though. >> >> Few questions: >> - Is there any interest for me to provide the exact details of the code >> I changed ? >> >> - I managed to compile NumPy through setup.py but I am not sure how to >> force it to generate pdb files from my Visual Studio Compiler. I need >> the pdb files such that I can run my profiler on NumPy. Anybody has any >> experience with this ? (Visual Studio) >> >> - The core of my problems I think boil down to things like this >> s = a[0] >> assigning a float64 into s as opposed to a native float ? >> Is there any way to hack code to change it to extract a native float >> instead ? (probably crazy talk, but I thought I'd ask :) ). >> I'd prefer to not use s = a.item(0) because I would have to change too >> much code and it is not even that much faster. For example, >> for i in xrange(1000000): >> if a.item(1) < 35.0: >> pass >> is 0.23 seconds (as opposed to 0.38 seconds with my suggested changes) >> >> >> I apologize again if this topic has already been discussed. 
>> >> >> Regards, >> >> Raul >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From raul at virtualmaterials.com Mon Dec 3 11:26:39 2012 From: raul at virtualmaterials.com (Raul Cota) Date: Mon, 03 Dec 2012 09:26:39 -0700 Subject: [Numpy-discussion] Speed bottlenecks on simple tasks - suggested improvement In-Reply-To: References: <50BC0038.70105@virtualmaterials.com> Message-ID: <50BCD2BF.1040807@virtualmaterials.com> On 03/12/2012 4:14 AM, Nathaniel Smith wrote: > On Mon, Dec 3, 2012 at 1:28 AM, Raul Cota wrote: >> I finally decided to track down the problem and I started by getting >> Python 2.6 from source and profiling it in one of my cases. By far the >> biggest bottleneck came out to be PyString_FromFormatV which is a >> function to assemble a string for a Python error caused by a failure to >> find an attribute when "multiarray" calls PyObject_GetAttrString. This >> function seems to get called way too often from NumPy. The real >> bottleneck of trying to find the attribute when it does not exist is not >> that it fails to find it, but that it builds a string to set a Python >> error. In other words, something as simple as "a[0] < 3.5" internally >> result in a call to set a python error . >> >> I downloaded NumPy code (for Python 2.6) and tracked down all the calls >> like this, >> >> ret = PyObject_GetAttrString(obj, "__array_priority__"); >> >> and changed to >> if (PyList_CheckExact(obj) || (Py_None == obj) || >> PyTuple_CheckExact(obj) || >> PyFloat_CheckExact(obj) || >> PyInt_CheckExact(obj) || >> PyString_CheckExact(obj) || >> PyUnicode_CheckExact(obj)){ >> //Avoid expensive calls when I am sure the attribute >> //does not exist >> ret = NULL; >> } >> else{ >> ret = PyObject_GetAttrString(obj, "__array_priority__"); >> >> ( I think I found about 7 spots ) > If the problem is the exception construction, then maybe this would > work about as well? > > if (PyObject_HasAttrString(obj, "__array_priority__") { > ret = PyObject_GetAttrString(obj, "__array_priority__"); > } else { > ret = NULL; > } > > If so then it would be an easier and more reliable way to accomplish this. I did think of that one but at least in Python 2.6 the implementation is just a wrapper to PyObject_GetAttrSting that clears the error """ PyObject_HasAttrString(PyObject *v, const char *name) { PyObject *res = PyObject_GetAttrString(v, name); if (res != NULL) { Py_DECREF(res); return 1; } PyErr_Clear(); return 0; } """ so it is just as bad when it fails and a waste when it succeeds (it will end up finding it twice). In my opinion, Python's source code should offer a version of PyObject_GetAttrString that does not raise an error but that is a completely different topic. >> I also noticed (not as bad in my case) that calls to PyObject_GetBuffer >> also resulted in Python errors being set thus unnecessarily slower code. >> >> With this change, something like this, >> for i in xrange(1000000): >> if a[1] < 35.0: >> pass >> >> went down from 0.8 seconds to 0.38 seconds. > Huh, why is PyObject_GetBuffer even getting called in this case? Sorry for being misleading in an already long and confusing email. PyObject_GetBuffer is not getting called doing an "if" call. 
This call showed up in my profiler as a time consuming task that raised python errors unnecessarily (not nearly as bad as often as PyObject_GetAttrString ) but since I was already there I decided to look into it as well. The point I was trying to make was that I did both changes (avoiding PyObject_GetBuffer, PyObject_GetAttrString) when I came up with the times. >> A bogus test like this, >> for i in xrange(1000000): >> a = array([1., 2., 3.]) >> >> went down from 8.5 seconds to 2.5 seconds. > I can see why we'd call PyObject_GetBuffer in this case, but not why > it would take 2/3rds of the total run-time... Same scenario. This total time includes both changes (avoiding PyObject_GetBuffer, PyObject_GetAttrString). If my memory helps, I believe PyObject_GetBuffer gets called once for every 9 times of a call to PyObject_GetAttrString in this scenario. >> - The core of my problems I think boil down to things like this >> s = a[0] >> assigning a float64 into s as opposed to a native float ? >> Is there any way to hack code to change it to extract a native float >> instead ? (probably crazy talk, but I thought I'd ask :) ). >> I'd prefer to not use s = a.item(0) because I would have to change too >> much code and it is not even that much faster. For example, >> for i in xrange(1000000): >> if a.item(1) < 35.0: >> pass >> is 0.23 seconds (as opposed to 0.38 seconds with my suggested changes) > I'm confused here -- first you say that your problems would be fixed > if a[0] gave you a native float, but then you say that a.item(0) > (which is basically a[0] that gives a native float) is still too slow? Don't get me wrong. I am confused too when it gets beyond my suggested changes :) . My "theory" for saying that a.item(1) is not the same to a[1] returning a float was that perhaps the overhead of the dot operator is too big. At the end of the day, I do want to profile NumPy and find out if there is anything I can do to speed things up. To bring things more into context, I don't really care to speed up a bogus loop with if statements. My bottom line is, - I am focusing on two cases from our software that take 141.8 seconds and 40 seconds respectively using Numeric and Python 2.2.3 . - These cases now take 229 seconds and 62 seconds respectively using NumPy and Python 2.6 . This is quite a bit of a slow down taking into account that Python code that uses only native objects is quite a bit faster in Python 2.6 Vs Python 2.2 Both cases (like most of our software) use array operations as much as possible and revert down to scalar operations when it is not practical to do otherwise. I am not saying it is impossible to optimize even more, it is just not practical. I ran the profiler on Python 2.6 and I found the bottlenecks I reported in this email. Both of my cases are now running at 170 and 50 seconds respectively. In other words, I am "almost" back to where I want to be. The improvement is huge, but in my opinion it still uncomfortably far from what it used to be in Numeric and I worry that there may be other spots in our software that may be affected on a more meaningful way that I just have not noticed. > (OTOH at 40% speedup is pretty good, even if it is just a > microbenchmark :-).) 
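To give a flavour of the kind of pattern I mean (a made-up toy example, not our actual code), it is roughly this shape:

import numpy as np

def toy_step(pressures, limit):
    # vectorized where it pays off ...
    scaled = pressures * 1.01325
    # ... but then values leave the array and live on as float64 scalars
    worst = scaled[0]
    for p in scaled:
        if p > worst:
            worst = p
    # from here on it is all scalar math/comparisons, and every one of them
    # pays the float64 overhead instead of running at native float speed
    return worst > limit

print(toy_step(np.array([1.2, 3.4, 0.7]), 3.0))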
Array scalars are definitely pretty slow: > > In [9]: timeit a[0] > 1000000 loops, best of 3: 151 ns per loop > > In [10]: timeit a.item(0) > 10000000 loops, best of 3: 169 ns per loop > > In [11]: timeit a[0] < 35.0 > 1000000 loops, best of 3: 989 ns per loop > > In [12]: timeit a.item(0) < 35.0 > 1000000 loops, best of 3: 233 ns per loop > > It is probably possible to make numpy scalars faster... I'm not even > sure why they go through the ufunc machinery, like Travis said, since > they don't even follow the ufunc rules: > > In [3]: np.array(2) * [1, 2, 3] # 0-dim array coerces and broadcasts > Out[3]: array([2, 4, 6]) > > In [4]: np.array(2)[()] * [1, 2, 3] # scalar acts like python integer > Out[4]: [1, 2, 3, 1, 2, 3] > > But you may want to experiment a bit more to make sure this is > actually the problem. IME guesses about speed problems are almost > always wrong (even when I take this rule into account and only guess > when I'm *really* sure). I agree 100% about the pitfalls of guessing. Thanks to Christoph's suggestion I should be able to profile NumPy now. Thanks for your comments, Raul > -n > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From chris.barker at noaa.gov Mon Dec 3 14:49:57 2012 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Mon, 3 Dec 2012 11:49:57 -0800 Subject: [Numpy-discussion] Speed bottlenecks on simple tasks - suggested improvement In-Reply-To: <50BCD2BF.1040807@virtualmaterials.com> References: <50BC0038.70105@virtualmaterials.com> <50BCD2BF.1040807@virtualmaterials.com> Message-ID: Raul, Thanks for doing this work -- both the profiling and actual suggestions for how to improve the code -- whoo hoo! In general, it seem that numpy performance for scalars and very small arrays (i.e (2,), (3,) maybe (3,3), the kind of thing that you'd use to hold a coordinate point or the like, not small as in "fits in cache") is pretty slow. In principle, a basic array scalar operation could be as fast as a numpy native numeric type, and it would be great is small array operations were, too. It may be that the route to those performance improvements is special-case code, which is ugly, but I think could really be worth it for the common types and operations. I'm really out of my depth for suggesting (or contributing) actual soluitons, but +1 for the idea! -Chris NOTE: Here's a example of what I'm talking about -- say you are scaling an (x,y) point by a (s_x, s_y) scale factor: def numpy_version(point, scale): return point * scale def tuple_version(point, scale): return (point[0] * scale[0], point[1] * scale[1]) In [36]: point_arr, sca scale scale_arr In [36]: point_arr, scale_arr Out[36]: (array([ 3., 5.]), array([ 2., 3.])) In [37]: timeit tuple_version(point, scale) 1000000 loops, best of 3: 397 ns per loop In [38]: timeit numpy_version(point_arr, scale_arr) 100000 loops, best of 3: 2.32 us per loop It would be great if numpy could get closer to tuple performance for this sor tof thing... -Chris -- Christopher Barker, Ph.D. 
Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From magnetotellurics at gmail.com Mon Dec 3 15:10:31 2012 From: magnetotellurics at gmail.com (Karl Kappler) Date: Mon, 3 Dec 2012 12:10:31 -0800 Subject: [Numpy-discussion] Apparently Non-deterministic behaviour of complex-array instantiation values Message-ID: Hello, This is a continuation of a problem I had last year, http://old.nabble.com/Apparently-non-deterministic-behaviour-of-complex-array-multiplication-tt32893004.html#a32931369 at least it seems to have similar symptoms. I am working again with complex valued arrays in numpy (python version 2.7.3). This time however, the dataset is not very large, and I am able to post a snippet of the code. I became aware of the problem working in and IDE: Spyder 2.1.9, where I was repeatedly running code by pushing f5 and checking that my numerical results were what I expect. What I found was that the output in spyder varied somewhat randomly. In particular, when initializing a 2x2 complex-valued numpy array on line 32 of the code called ?lowTri?. When I print the value of the upper right element (the one that should be zero) I often see 1.789+1.543j, or 0+1.543j, or 0+0j. This behavior happens when I run the code using f5 in spyder, and I thought it may be a Spyder issue, but further investigation has shown equally strange behavior on the command line as well. When I run this script on the command line the output is usually the same from run to run (although I have seen some variations, which I do not understand), but most remarkable, and reproducible, is that if I comment out line 17 (where the complex-valued array zzz is populated), the behavior of the initialization of lowTri varies. With Line 17 uncommented I usually get (on the command line): Lower Triangular [0,1]: 1.543j [[ 670.9 +1.22400000e-05j 0.0 +1.54300000e+00j] [ 195.8 -1.17300000e+02j 391.2 +1.46900000e-05j]] Lower Triangular [0,1]: 1.543j Real Part 0.0 and with line 17 it commented: Lower Triangular [0,1]: 0j [[ 670.9 +1.22400000e-05j 0.0 +0.00000000e+00j] [ 195.8 -1.17300000e+02j 391.2 +1.46900000e-05j]] Lower Triangular [0,1]: 0j Real Part 0.0 I.e.. the imaginary part is initialized to a different value. From reading up on forums I think I understand that when an array is allocated without specific values, it will be given random values which are very small, ie. ~1e-316 or so. But it would seem that sometimes initallization is done to a finite quantity. I know I can try to initialize the array using np.zeros() instead of np.ndarray(), but it is the principle I am concerned about. Last year it had been suggested that I had bad RAM, but these issues are reproducing on four computers, one of which has a new motherboard/RAM and AMD processor, and the others are Intel. Memtest has been run recently on at least two of the machines. Could someone try running this script with and without line 17 commented out and tell me if they are getting the same sort of behaviour? 
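For reference, the core of what I am puzzling over can be boiled down to this stripped-down sketch (separate from the full script below):

import numpy as np

# np.ndarray(...) only allocates; the contents are whatever bytes happen to
# be in that memory, so they can look like tiny ~1e-316 values or like
# perfectly ordinary finite numbers.
a = np.ndarray(shape=(2, 2), dtype=complex)
b = np.zeros(shape=(2, 2), dtype=complex)   # explicitly initialized
print(a)   # arbitrary, can change from run to run
print(b)   # always zeros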
Thanks, Karl ********************************** import numpy as np if __name__ == "__main__": a={} a[0] = '0.1537E+00 0.1610E+01 -0.4801E+01 -0.3175E+01' a[1] = '0.1789E+01 0.1543E+01 -0.5524E+00 -0.8423E+00' c = '0.6709E+03 0.1224E-04 0.1958E+03 -0.1173E+03 0.3912E+03 0.1469E-04' ztmp = np.zeros((4,2)) zzz = np.zeros((2,2)) + complex(0,1)*np.zeros((2,2)); line = [] for iE in range(2): line = a[iE].split() for iElt, elt in enumerate(line): ztmp[iElt,iE] = float(elt) zzz[:,0:2] = ztmp[[0,2],:] + complex(0,1)*ztmp[[1,3],:] #commenting this line seems to affect value of lowTri stemp = np.zeros((2,3)) nElts = np.prod(stemp.shape) v = [] line = c.split() for l in line: v.append(float(l)) N=len(v)/2 cVec = np.ndarray(shape=(N), dtype=complex) for i in range(N): cVec[i] = complex(float(v[2*i]),float(v[2*(i+1)-1])) lowTri = np.ndarray(shape=(2,2), dtype=complex) #lowTri = np.zeros(shape=(2,2), dtype=complex) print("Lower Triangular [0,1]: {}".format(lowTri[0,1])) TI = np.tril_indices(2) rows = TI[0] cols = TI[1] for iCell in range(len(rows)): lowTri[rows[iCell],cols[iCell]] = cVec[iCell] print lowTri print("Lower Triangular [0,1]: {}".format(lowTri[0,1])) print("Real Part {}".format(np.real(lowTri[0,1]))) -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Mon Dec 3 15:19:54 2012 From: pav at iki.fi (Pauli Virtanen) Date: Mon, 03 Dec 2012 22:19:54 +0200 Subject: [Numpy-discussion] Apparently Non-deterministic behaviour of complex-array instantiation values In-Reply-To: References: Message-ID: 03.12.2012 22:10, Karl Kappler kirjoitti: [clip] > I.e.. the imaginary part is initialized to a different value. From > reading up on forums I think I understand that when an array is > allocated without specific values, it will be given random values which > are very small, ie. ~1e-316 or so. But it would seem that sometimes > initallization is done to a finite quantity. I know I can try to > initialize the array using np.zeros() instead of np.ndarray(), but it is > the principle I am concerned about. The memory is not initialized in any way [*] if you get the array from np.empty(..) or np.ndarray(...). It contains whatever that happens to be at that location. It just happens that "typical memory content" when viewed in floating point often looks like that. [*] Except that the OS zeroes new memory pages given to the process. Processes however reuse the pages they are given. -- Pauli Virtanen From raul at virtualmaterials.com Mon Dec 3 17:12:57 2012 From: raul at virtualmaterials.com (Raul Cota) Date: Mon, 03 Dec 2012 15:12:57 -0700 Subject: [Numpy-discussion] Speed bottlenecks on simple tasks - suggested improvement In-Reply-To: References: <50BC0038.70105@virtualmaterials.com> <50BCD2BF.1040807@virtualmaterials.com> Message-ID: <50BD23E9.1080903@virtualmaterials.com> Chris, thanks for the feedback, fyi, the minor changes I talked about have different performance enhancements depending on scenario, e.g, 1) Array * Array point = array( [2.0, 3.0]) scale = array( [2.4, 0.9] ) retVal = point * scale #The line above runs 1.1 times faster with my new code (but it runs 3 times faster in Numeric in Python 2.2) #i.e. pretty meaningless but still far from old Numeric 2) Array * Tuple (item by item) point = array( [2.0, 3.0]) scale = (2.4, 0.9 ) retVal = point[0] < scale[0], point[1] < scale[1] #The line above runs 1.8 times faster with my new code (but it runs 6.8 times faster in Numeric in Python 2.2) #i.e. 
pretty decent speed up but quite far from old Numeric I am not saying that I would ever do something exactly like (2) in my code nor am I saying that the changes in NumPy Vs Numeric are not beneficial. My point is that performance in small size problems is fairly far from what it used to be in Numeric particularly when dealing with scalars and it is problematic at least to me. I am currently looking around to see if there are practical ways to speed things up without slowing anything else down. Will keep you posted. regards, Raul On 03/12/2012 12:49 PM, Chris Barker - NOAA Federal wrote: > Raul, > > Thanks for doing this work -- both the profiling and actual > suggestions for how to improve the code -- whoo hoo! > > In general, it seem that numpy performance for scalars and very small > arrays (i.e (2,), (3,) maybe (3,3), the kind of thing that you'd use > to hold a coordinate point or the like, not small as in "fits in > cache") is pretty slow. In principle, a basic array scalar operation > could be as fast as a numpy native numeric type, and it would be great > is small array operations were, too. > > It may be that the route to those performance improvements is > special-case code, which is ugly, but I think could really be worth it > for the common types and operations. > > I'm really out of my depth for suggesting (or contributing) actual > soluitons, but +1 for the idea! > > -Chris > > NOTE: Here's a example of what I'm talking about -- say you are > scaling an (x,y) point by a (s_x, s_y) scale factor: > > def numpy_version(point, scale): > return point * scale > > > def tuple_version(point, scale): > return (point[0] * scale[0], point[1] * scale[1]) > > > In [36]: point_arr, sca > scale scale_arr > > In [36]: point_arr, scale_arr > Out[36]: (array([ 3., 5.]), array([ 2., 3.])) > > In [37]: timeit tuple_version(point, scale) > 1000000 loops, best of 3: 397 ns per loop > > In [38]: timeit numpy_version(point_arr, scale_arr) > 100000 loops, best of 3: 2.32 us per loop > > It would be great if numpy could get closer to tuple performance for > this sor tof thing... 
> > > -Chris > > From ondrej.certik at gmail.com Mon Dec 3 21:27:42 2012 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Mon, 3 Dec 2012 18:27:42 -0800 Subject: [Numpy-discussion] Weird Travis-CI bugs in the release 1.7.x branch Message-ID: Hi, I started to work on the release again and noticed weird failures at Travis-CI: https://github.com/numpy/numpy/pull/2782 The first commit (8a18fc7) should not trigger this failure: ====================================================================== FAIL: test_iterator.test_iter_array_cast ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/travis/virtualenv/python2.5/lib/python2.5/site-packages/nose/case.py", line 197, in runTest self.test(*self.arg) File "/home/travis/virtualenv/python2.5/lib/python2.5/site-packages/numpy/core/tests/test_iterator.py", line 836, in test_iter_array_cast assert_equal(i.operands[0].strides, (-96,8,-32)) File "/home/travis/virtualenv/python2.5/lib/python2.5/site-packages/numpy/testing/utils.py", line 252, in assert_equal assert_equal(actual[k], desired[k], 'item=%r\n%s' % (k,err_msg), verbose) File "/home/travis/virtualenv/python2.5/lib/python2.5/site-packages/numpy/testing/utils.py", line 314, in assert_equal raise AssertionError(msg) AssertionError: Items are not equal: item=0 ACTUAL: 96 DESIRED: -96 So I pushed a whitespace commit into the PR (516b478) yet it has the same failure. So it's there, it's not some random fluke at Travis. I created this testing PR: https://github.com/numpy/numpy/pull/2783 to try to nail it down. But I can't see what could have caused this, because the release branch was passing all tests last time I worked on it. Any ideas? Btw, I managed to reproduce the SPARC64 bug: https://github.com/numpy/numpy/issues/2668 so that's good. Now I just need to debug it. Ondrej P.S. My thesis was finally approved by the grad school today, doing some final changes took more time than expected, but I think that I am done now. From njs at pobox.com Mon Dec 3 22:10:50 2012 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 4 Dec 2012 03:10:50 +0000 Subject: [Numpy-discussion] Weird Travis-CI bugs in the release 1.7.x branch In-Reply-To: References: Message-ID: On 4 Dec 2012 02:27, "Ond?ej ?ert?k" wrote: > > Hi, > > I started to work on the release again and noticed weird failures at Travis-CI: [?] > File "/home/travis/virtualenv/python2.5/lib/python2.5/site-packages/numpy/core/tests/test_iterator.py", The problem is that Travis started installing numpy in all python virtualenvs by default, and our Travis build script just runs setup.py install, which is too dumb to notice that there is a numpy already installed and just overwrites it. The file mentioned above doesn't even exist in 1.7, it's left over from the 1.6 install. I did a PR to fix this in master a few days ago, you want to back port that. (Sorry for lack of link, I'm on my phone.) > P.S. My thesis was finally approved by the grad school today, > doing some final changes took more time than expected, but > I think that I am done now. Congratulations Dr. ?ert?k! -n -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From sebastian at sipsolutions.net Tue Dec 4 08:57:12 2012 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Tue, 04 Dec 2012 14:57:12 +0100 Subject: [Numpy-discussion] Allowing 0-d arrays in np.take Message-ID: <1354629432.21666.13.camel@sebastian-laptop> Hey, Maybe someone has an opinion about this (since in fact it is new behavior, so it is undefined). `np.take` used to not allow 0-d/scalar input but did allow any other dimensions for the indices. Thinking about changing this, meaning that: np.take(np.arange(5), 0) works. I was wondering if anyone has feelings about whether this should return a scalar or a 0-d array. Typically numpy prefers scalars for these cases (indexing would return a scalar too) for good reasons, so I guess that is correct. But since I noticed this wondering if maybe it returns a 0-d array, I thought I would ask here. Regards, Sebastian From ben.root at ou.edu Tue Dec 4 09:15:45 2012 From: ben.root at ou.edu (Benjamin Root) Date: Tue, 4 Dec 2012 09:15:45 -0500 Subject: [Numpy-discussion] Allowing 0-d arrays in np.take In-Reply-To: <1354629432.21666.13.camel@sebastian-laptop> References: <1354629432.21666.13.camel@sebastian-laptop> Message-ID: On Tue, Dec 4, 2012 at 8:57 AM, Sebastian Berg wrote: > Hey, > > Maybe someone has an opinion about this (since in fact it is new > behavior, so it is undefined). `np.take` used to not allow 0-d/scalar > input but did allow any other dimensions for the indices. Thinking about > changing this, meaning that: > > np.take(np.arange(5), 0) > > works. I was wondering if anyone has feelings about whether this should > return a scalar or a 0-d array. Typically numpy prefers scalars for > these cases (indexing would return a scalar too) for good reasons, so I > guess that is correct. But since I noticed this wondering if maybe it > returns a 0-d array, I thought I would ask here. > > Regards, > > Sebastian > > At first, I was thinking that the output type should be based on what the input type is. So, if a scalar index was used, then a scalar value should be returned. But this wouldn't be true if the array had other dimensions. So, perhaps it should always be an array. The only other option is to mimic the behavior of the array indexing, which wouldn't be a bad choice. Cheers! Ben Root -------------- next part -------------- An HTML attachment was scrubbed... URL: From ondrej.certik at gmail.com Tue Dec 4 11:14:31 2012 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Tue, 4 Dec 2012 08:14:31 -0800 Subject: [Numpy-discussion] Weird Travis-CI bugs in the release 1.7.x branch In-Reply-To: References: Message-ID: On Mon, Dec 3, 2012 at 7:10 PM, Nathaniel Smith wrote: > On 4 Dec 2012 02:27, "Ond?ej ?ert?k" wrote: >> >> Hi, >> >> I started to work on the release again and noticed weird failures at >> Travis-CI: > [?] >> File >> "/home/travis/virtualenv/python2.5/lib/python2.5/site-packages/numpy/core/tests/test_iterator.py", > > The problem is that Travis started installing numpy in all python > virtualenvs by default, and our Travis build script just runs setup.py > install, which is too dumb to notice that there is a numpy already installed > and just overwrites it. The file mentioned above doesn't even exist in 1.7, > it's left over from the 1.6 install. > > I did a PR to fix this in master a few days ago, you want to back port that. > (Sorry for lack of link, I'm on my phone.) Thanks! 
I backported it in: https://github.com/numpy/numpy/pull/2786 Nice, I was not aware of the fact that "pip install ." fixes this problem with setup.py --- I've burned myself with this so many times already and I always forget about this bug. > >> P.S. My thesis was finally approved by the grad school today, >> doing some final changes took more time than expected, but >> I think that I am done now. > > Congratulations Dr. ?ert?k! Thanks. I am glad it's over. Ondrej From sebastian at sipsolutions.net Tue Dec 4 12:08:32 2012 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Tue, 04 Dec 2012 18:08:32 +0100 Subject: [Numpy-discussion] Numpy's definition of contiguous arrays Message-ID: <1354640912.21666.65.camel@sebastian-laptop> Hi, maybe someone has an opinion about how this can be handled and was not yet aware of this. In current numpy master (probably being reverted), the definition for contiguous arrays is changed such that it means "Contiguous in memory" and nothing more. What this means is this: 1. An array of size (1,3,1) is both C- and F-contiguous (Assuming `arr.strides[1] == arr.itemsize`). 2. However it is incorrect that `arr.strides[-1] == arr.itemsize` because the corresponding axes dimension is 1 so it does not matter for the memory layout. Also other similar assumptions about "clean strides" are incorrect. (This was always incorrect in corner cases) I think most will agree that this change reflects what these flags should indicate, because the exact value of the strides is not really important for the memory layout and for example for a row vector there is no reason to say it cannot be both C- and F-contiguous. However the change broke some code in scipy as well as sk-learn, that relied on `arr.strides[-1] == arr.itemsize` (for C-contiguous arrays). The fact that it was never noticed that this isn't quite correct indicates that there is certainly more code out there just like it. There is more discussion here: https://github.com/numpy/numpy/pull/2735 with suggestions for a possible deprecation process of having both definitions next to each other and deprecating the current, etc. I was personally wondering if it is good enough to ensure strides are cleaned up when an array is explicitly requested as contiguous which means: np.array(arr, copy=False, order='C').strides[-1] == arr.itemsize is always True, but: if arr.flags.c_contiguous: # It is possible that: arr.strides[-1] != arr.itemsize Which fixes the problems found yet since typically if you want to use the fact that an array is contiguous, you use this kind of command to make sure it is. But I guess it is likely too dangerous to assume that nobody only checks the flags and then continuous to do unwanted assumptions about strides. Best Regards, Sebastian From ondrej.certik at gmail.com Tue Dec 4 18:47:41 2012 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Tue, 4 Dec 2012 15:47:41 -0800 Subject: [Numpy-discussion] Weird Travis-CI bugs in the release 1.7.x branch In-Reply-To: References: Message-ID: On Tue, Dec 4, 2012 at 8:14 AM, Ond?ej ?ert?k wrote: > On Mon, Dec 3, 2012 at 7:10 PM, Nathaniel Smith wrote: >> On 4 Dec 2012 02:27, "Ond?ej ?ert?k" wrote: >>> >>> Hi, >>> >>> I started to work on the release again and noticed weird failures at >>> Travis-CI: >> [?] 
>>> File >>> "/home/travis/virtualenv/python2.5/lib/python2.5/site-packages/numpy/core/tests/test_iterator.py", >> >> The problem is that Travis started installing numpy in all python >> virtualenvs by default, and our Travis build script just runs setup.py >> install, which is too dumb to notice that there is a numpy already installed >> and just overwrites it. The file mentioned above doesn't even exist in 1.7, >> it's left over from the 1.6 install. >> >> I did a PR to fix this in master a few days ago, you want to back port that. >> (Sorry for lack of link, I'm on my phone.) > > Thanks! I backported it in: > > https://github.com/numpy/numpy/pull/2786 > > Nice, I was not aware of the fact that "pip install ." fixes this > problem with setup.py --- > I've burned myself with this so many times already and I always forget > about this bug. It's fixed in the release branch now. So both master and the release branch pass all tests on Travis again. Thanks for your help. Ondrej From markbak at gmail.com Wed Dec 5 16:35:32 2012 From: markbak at gmail.com (Mark Bakker) Date: Wed, 5 Dec 2012 22:35:32 +0100 Subject: [Numpy-discussion] how do I specify maximum line length when using savetxt? Message-ID: Hello List, I want to write a large array to file, and each line can only be 80 characters long. Can I use savetxt to do that? Where would I specify the maximum line length? Or is there a better way to do this? Thanks, Mark From markbak at gmail.com Wed Dec 5 16:42:27 2012 From: markbak at gmail.com (Mark Bakker) Date: Wed, 5 Dec 2012 22:42:27 +0100 Subject: [Numpy-discussion] turn off square brackets in set_print_options? Message-ID: Hello List, Is it possible to turn off printing the square brackets in set_print_options? Am I overlooking something? Thanks, Mark From paul.anton.letnes at gmail.com Wed Dec 5 16:56:55 2012 From: paul.anton.letnes at gmail.com (Paul Anton Letnes) Date: Wed, 5 Dec 2012 22:56:55 +0100 Subject: [Numpy-discussion] how do I specify maximum line length when using savetxt? In-Reply-To: References: Message-ID: <46DE5D18-345D-483F-808D-4776487B7884@gmail.com> On 5. des. 2012, at 22:35, Mark Bakker wrote: > Hello List, > > I want to write a large array to file, and each line can only be 80 > characters long. > Can I use savetxt to do that? Where would I specify the maximum line length? If you specify the format, %10.3f for instance, you will know the max line length if you also know the array shape. > Or is there a better way to do this? Probably 1000 ways to accomplish the same thing out there, sure. Cheers Paul > > Thanks, > > Mark > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From markbak at gmail.com Wed Dec 5 18:40:17 2012 From: markbak at gmail.com (Mark Bakker) Date: Thu, 6 Dec 2012 00:40:17 +0100 Subject: [Numpy-discussion] how do I specify maximum line length when using savetxt? Message-ID: I guess I wasn't explicit enough. Say I have an array with 100 numbers and I want to write it to a file with 6 numbers on each line (and hence, only 4 on the last line). Can I use savetxt to do that? What other easy tool does numpy have to do that? Thanks, Mark On 5. des. 2012, at 22:35, Mark Bakker wrote: > Hello List, > > I want to write a large array to file, and each line can only be 80 > characters long. > Can I use savetxt to do that? Where would I specify the maximum line length? 
If you specify the format, %10.3f for instance, you will know the max line length if you also know the array shape. > Or is there a better way to do this? Probably 1000 ways to accomplish the same thing out there, sure. Cheers Paul From derek at astro.physik.uni-goettingen.de Wed Dec 5 19:27:42 2012 From: derek at astro.physik.uni-goettingen.de (Derek Homeier) Date: Thu, 6 Dec 2012 01:27:42 +0100 Subject: [Numpy-discussion] how do I specify maximum line length when using savetxt? In-Reply-To: References: Message-ID: On 06.12.2012, at 12:40AM, Mark Bakker wrote: > I guess I wasn't explicit enough. > Say I have an array with 100 numbers and I want to write it to a file > with 6 numbers on each line (and hence, only 4 on the last line). > Can I use savetxt to do that? > What other easy tool does numpy have to do that? I've just been looking into a similar case and I think there is no easy tool for this - i.e. nothing comparable to Fortran's '(6e10.3)' or the like, so if your array does not reshape to a Nx6 array, you'd probably have to write something customised yourself. I would not be terribly difficult to add such functionality to savetxt, but then, unless you want the output file to be more human-readable, there is not really a strong case for writing a shape (100,) array into 16 lines plus an incomplete one - it just would not play well with reading back in and then determining the right shape automatically? HTH, Derek From raul at virtualmaterials.com Wed Dec 5 19:30:44 2012 From: raul at virtualmaterials.com (Raul Cota) Date: Wed, 05 Dec 2012 17:30:44 -0700 Subject: [Numpy-discussion] how do I specify maximum line length when using savetxt? In-Reply-To: References: Message-ID: <50BFE734.7090601@virtualmaterials.com> assuming savetxt does not support it, I modified a bit of code I had to do what I think you need ONLY works for a 1D array and wrapped it into a function that writes in properly formatted columns. I didn't really test it other than what is there. I "dressed" it like savetxt but the glaring difference is that it goes off significant digits as opposed to format. """ import numpy def mysavetxt_forvector(fname, x, sigDigs, delimiter, newline, maxCharsPLine, fmode='w'): padSize = sigDigs + 6 #How many characters per number including empty space fmt = str(padSize) + '.' + str(sigDigs) + 'g' #e.g. 13.7g' asTxtLst = map(lambda val: format(val, fmt), a) #from array to list of formatted strings #how many cols max ? cols = maxCharsPLine/(padSize + len(delimiter)) #write to file size = len(asTxtLst) col = 0 f = open(fname, fmode) while col < size: f.write(delimiter.join(asTxtLst[col:col+cols]) ) f.write(newline) col += cols f.close() #Test it a = numpy.ones(34, dtype='float64') * 7./3. a[3] = 123564234.0002345 a[5] = 1 a[7] = -123564234.0002345 a[9] = -.00000000000023453456345 sigDigs = 7 maxCharsPLine = 80 delimiter = ',' newline = '\n' fname = 'temp.out' mysavetxt_forvector(fname, a, sigDigs, delimiter, newline, maxCharsPLine) #append on this one maxCharsPLine = 33 mysavetxt_forvector(fname, a, sigDigs, delimiter, newline, maxCharsPLine, fmode='a') """ Raul On 05/12/2012 4:40 PM, Mark Bakker wrote: > I guess I wasn't explicit enough. > Say I have an array with 100 numbers and I want to write it to a file > with 6 numbers on each line (and hence, only 4 on the last line). > Can I use savetxt to do that? > What other easy tool does numpy have to do that? > Thanks, > Mark > > On 5. des. 
2012, at 22:35, Mark Bakker wrote: > >> Hello List, >> >> I want to write a large array to file, and each line can only be 80 >> characters long. >> Can I use savetxt to do that? Where would I specify the maximum line length? > > If you specify the format, %10.3f for instance, you will know the max > line length if you also know the array shape. > > >> Or is there a better way to do this? > > Probably 1000 ways to accomplish the same thing out there, sure. > > Cheers > Paul > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From pav at iki.fi Wed Dec 5 19:45:23 2012 From: pav at iki.fi (Pauli Virtanen) Date: Thu, 06 Dec 2012 02:45:23 +0200 Subject: [Numpy-discussion] Numpy Trac migration Message-ID: <50BFEAA3.1030306@iki.fi> Hi, For those whom it may concern: Since the Numpy Trac -> Github migration is complete, I went ahead and added redirects projects.scipy.org/numpy/register -> github.com/numpy/numpy/issues projects.scipy.org/numpy/newticket -> github.com/numpy/numpy/issues plus an ugly warning bar on top to direct potential users towards Github. Cheers, -- Pauli Virtanen From alex.eberspaecher at gmail.com Thu Dec 6 07:29:08 2012 From: alex.eberspaecher at gmail.com (Alexander =?ISO-8859-1?B?RWJlcnNw5GNoZXI=?=) Date: Thu, 6 Dec 2012 13:29:08 +0100 Subject: [Numpy-discussion] site.cfg: Custom BLAS / LAPACK configuration In-Reply-To: References: Message-ID: <20121206132908.769707dd@poetzsch.nat.uni-magdeburg.de> On Fri, 30 Nov 2012 12:13:58 -0800 "Bradley M. Froehle" wrote: > As far as I can tell, it's IMPOSSIBLE to create a site.cfg which will > link to ACML when a system installed ATLAS is present. setup.py respects environment variables. You can set ATLAS to None and force the setup to use $LAPACK and $BLAS. See also this link: http://www.der-schnorz.de/2012/06/optimized-linear-algebra-and-numpyscipy/ Greetings Alex From ondrej.certik at gmail.com Thu Dec 6 12:35:33 2012 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Thu, 6 Dec 2012 09:35:33 -0800 Subject: [Numpy-discussion] Numpy Trac migration In-Reply-To: <50BFEAA3.1030306@iki.fi> References: <50BFEAA3.1030306@iki.fi> Message-ID: On Wed, Dec 5, 2012 at 4:45 PM, Pauli Virtanen wrote: > Hi, > > For those whom it may concern: Since the Numpy Trac -> Github migration > is complete, I went ahead and added redirects > > projects.scipy.org/numpy/register -> github.com/numpy/numpy/issues > projects.scipy.org/numpy/newticket -> github.com/numpy/numpy/issues > > plus an ugly warning bar on top to direct potential users towards Github. Thanks, that is excellent. Ondrej From brad.froehle at gmail.com Thu Dec 6 13:13:26 2012 From: brad.froehle at gmail.com (Bradley M. Froehle) Date: Thu, 6 Dec 2012 10:13:26 -0800 Subject: [Numpy-discussion] site.cfg: Custom BLAS / LAPACK configuration In-Reply-To: <20121206132908.769707dd@poetzsch.nat.uni-magdeburg.de> References: <20121206132908.769707dd@poetzsch.nat.uni-magdeburg.de> Message-ID: Thanks Alexander, that was quite helpful, but unfortunately does not actually work. 
The recommendations there are akin to a site.cfg file: [atlas] atlas_libs = library_dirs = [blas] blas_libs = cblas,acml library_dirs = /opt/acml5.2.0/gfortan64_fma4/lib [lapack] blas_libs = cblas,acml library_dirs = /opt/acml5.2.0/gfortan64_fma4/lib $ python setup.py build However this makes numpy think that there is no optimized blas available and prevents the numpy.core._dotblas module from being built. -Brad On Thu, Dec 6, 2012 at 4:29 AM, Alexander Ebersp?cher < alex.eberspaecher at gmail.com> wrote: > On Fri, 30 Nov 2012 12:13:58 -0800 > "Bradley M. Froehle" wrote: > > > As far as I can tell, it's IMPOSSIBLE to create a site.cfg which will > > link to ACML when a system installed ATLAS is present. > > setup.py respects environment variables. You can set ATLAS to None and > force the setup to use $LAPACK and $BLAS. See also this link: > > http://www.der-schnorz.de/2012/06/optimized-linear-algebra-and-numpyscipy/ > > Greetings > > Alex > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From cournape at gmail.com Thu Dec 6 13:34:50 2012 From: cournape at gmail.com (David Cournapeau) Date: Thu, 6 Dec 2012 19:34:50 +0100 Subject: [Numpy-discussion] site.cfg: Custom BLAS / LAPACK configuration In-Reply-To: References: <20121206132908.769707dd@poetzsch.nat.uni-magdeburg.de> Message-ID: On Thu, Dec 6, 2012 at 7:13 PM, Bradley M. Froehle wrote: > Thanks Alexander, that was quite helpful, but unfortunately does not > actually work. The recommendations there are akin to a site.cfg file: > > [atlas] > atlas_libs = > library_dirs = > > [blas] > blas_libs = cblas,acml > library_dirs = /opt/acml5.2.0/gfortan64_fma4/lib > > [lapack] > blas_libs = cblas,acml > library_dirs = /opt/acml5.2.0/gfortan64_fma4/lib > $ python setup.py build > > However this makes numpy think that there is no optimized blas available and > prevents the numpy.core._dotblas module from being built. _dotblas is only built if *C*blas is available (atlas, accelerate and mkl only are supported ATM). David From brad.froehle at gmail.com Thu Dec 6 13:35:36 2012 From: brad.froehle at gmail.com (Bradley M. Froehle) Date: Thu, 6 Dec 2012 10:35:36 -0800 Subject: [Numpy-discussion] site.cfg: Custom BLAS / LAPACK configuration In-Reply-To: References: <20121206132908.769707dd@poetzsch.nat.uni-magdeburg.de> Message-ID: Right, but if I link to libcblas, cblas would be available, no? On Thu, Dec 6, 2012 at 10:34 AM, David Cournapeau wrote: > On Thu, Dec 6, 2012 at 7:13 PM, Bradley M. Froehle > wrote: > > Thanks Alexander, that was quite helpful, but unfortunately does not > > actually work. The recommendations there are akin to a site.cfg file: > > > > [atlas] > > atlas_libs = > > library_dirs = > > > > [blas] > > blas_libs = cblas,acml > > library_dirs = /opt/acml5.2.0/gfortan64_fma4/lib > > > > [lapack] > > blas_libs = cblas,acml > > library_dirs = /opt/acml5.2.0/gfortan64_fma4/lib > > $ python setup.py build > > > > However this makes numpy think that there is no optimized blas available > and > > prevents the numpy.core._dotblas module from being built. > > _dotblas is only built if *C*blas is available (atlas, accelerate and > mkl only are supported ATM). 
> > David > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From cgohlke at uci.edu Fri Dec 7 03:00:45 2012 From: cgohlke at uci.edu (Christoph Gohlke) Date: Fri, 07 Dec 2012 00:00:45 -0800 Subject: [Numpy-discussion] ValueError: low level cast function is for unequal type numbers Message-ID: <50C1A22D.7050209@uci.edu> Hello, the following code using np.object_ data types works with numpy 1.5.1 but fails with 1.6.2. Is this intended or a regression? Other data types, np.float64 for example, seem to work. In [1]: import numpy as np In [2]: np.array(['a'], dtype='O').astype(('O', [('name', 'O')])) --------------------------------------------------------------------------- ValueError Traceback (most recent call last) in () ----> 1 np.array(['a'], dtype='O').astype(('O', [('name', 'O')])) ValueError: low level cast function is for unequal type numbers In [3]: np.array([16], dtype='d').astype(('d', [('name', 'd')])) Out[3]: array([ 16.]) These downstream issues could be related: http://code.google.com/p/h5py/issues/detail?id=217 https://github.com/CellProfiler/CellProfiler/issues/421 Thank you, Christoph From cournape at gmail.com Fri Dec 7 07:01:37 2012 From: cournape at gmail.com (David Cournapeau) Date: Fri, 7 Dec 2012 13:01:37 +0100 Subject: [Numpy-discussion] site.cfg: Custom BLAS / LAPACK configuration In-Reply-To: References: <20121206132908.769707dd@poetzsch.nat.uni-magdeburg.de> Message-ID: On Thu, Dec 6, 2012 at 7:35 PM, Bradley M. Froehle wrote: > Right, but if I link to libcblas, cblas would be available, no? No, because we don't explicitly check for CBLAS. We assume it is there if Atlas, Accelerate or MKL is found. cheers, David > > > On Thu, Dec 6, 2012 at 10:34 AM, David Cournapeau > wrote: >> >> On Thu, Dec 6, 2012 at 7:13 PM, Bradley M. Froehle >> wrote: >> > Thanks Alexander, that was quite helpful, but unfortunately does not >> > actually work. The recommendations there are akin to a site.cfg file: >> > >> > [atlas] >> > atlas_libs = >> > library_dirs = >> > >> > [blas] >> > blas_libs = cblas,acml >> > library_dirs = /opt/acml5.2.0/gfortan64_fma4/lib >> > >> > [lapack] >> > blas_libs = cblas,acml >> > library_dirs = /opt/acml5.2.0/gfortan64_fma4/lib >> > $ python setup.py build >> > >> > However this makes numpy think that there is no optimized blas available >> > and >> > prevents the numpy.core._dotblas module from being built. >> >> _dotblas is only built if *C*blas is available (atlas, accelerate and >> mkl only are supported ATM). >> >> David >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From brad.froehle at gmail.com Fri Dec 7 13:09:00 2012 From: brad.froehle at gmail.com (Bradley M. Froehle) Date: Fri, 7 Dec 2012 10:09:00 -0800 Subject: [Numpy-discussion] site.cfg: Custom BLAS / LAPACK configuration In-Reply-To: References: <20121206132908.769707dd@poetzsch.nat.uni-magdeburg.de> Message-ID: <7C5D3880ABAC4B1AB19214CDCBA2227A@gmail.com> Aha, thanks for the clarification. I've always been surpassed that NumPy doesn't ship with a copy of CBLAS. 
It's easy to compile --- just a thin wrapper over BLAS, if I remember correctly. -Brad On Friday, December 7, 2012 at 4:01 AM, David Cournapeau wrote: > On Thu, Dec 6, 2012 at 7:35 PM, Bradley M. Froehle > wrote: > > Right, but if I link to libcblas, cblas would be available, no? > > > > No, because we don't explicitly check for CBLAS. We assume it is there > if Atlas, Accelerate or MKL is found. > > cheers, > David From d.s.seljebotn at astro.uio.no Fri Dec 7 13:58:26 2012 From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn) Date: Fri, 07 Dec 2012 19:58:26 +0100 Subject: [Numpy-discussion] site.cfg: Custom BLAS / LAPACK configuration In-Reply-To: <7C5D3880ABAC4B1AB19214CDCBA2227A@gmail.com> References: <20121206132908.769707dd@poetzsch.nat.uni-magdeburg.de> <7C5D3880ABAC4B1AB19214CDCBA2227A@gmail.com> Message-ID: <50C23C52.3030204@astro.uio.no> One way of fixing this I'm sort of itching to do is to create a "pylapack" project which can iterate quickly on these build issues, run-time selection of LAPACK backend and so on. (With some templates generating some Cython code it shouldn't be more than a few days for an MVP.) Then patch NumPy to attempt to import pylapack (and if it's there, get a table of function pointers from it). The idea would be that powerusers could more easily build pylapack the way they wanted, in isolation, and then have other things switch to that without a rebuild (note again that it would need to export a table of function pointers). But I'm itching to do too many things, we'll see. Dag Sverre On 12/07/2012 07:09 PM, Bradley M. Froehle wrote: > Aha, thanks for the clarification. I've always been surpassed that NumPy doesn't ship with a copy of CBLAS. It's easy to compile --- just a thin wrapper over BLAS, if I remember correctly. > > -Brad > > > On Friday, December 7, 2012 at 4:01 AM, David Cournapeau wrote: > >> On Thu, Dec 6, 2012 at 7:35 PM, Bradley M. Froehle >> wrote: >>> Right, but if I link to libcblas, cblas would be available, no? >> >> >> >> No, because we don't explicitly check for CBLAS. We assume it is there >> if Atlas, Accelerate or MKL is found. >> >> cheers, >> David > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From njs at pobox.com Fri Dec 7 16:00:20 2012 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 7 Dec 2012 21:00:20 +0000 Subject: [Numpy-discussion] Fwd: [matplotlib-devel] GitHub attachments In-Reply-To: References: Message-ID: Heh, looks like we did the trac migration about a month too soon... ---------- Forwarded message ---------- From: Damon McDougall Date: Fri, Dec 7, 2012 at 8:15 PM Subject: [matplotlib-devel] GitHub attachments To: matplotlib-devel at lists.sourceforge.net Did everyone see that GitHub now allows attachments in issue comments? IMO, this is awesome. Attaching a visual *thing* of output when users get unexpected results from plotting calls is a massive plus. Drag and drop right into the browser, too! No more having to upload a file to imgur or dropbox and link it to the issue comment and then have it disappear when you delete it and forgot it was linked to GitHub. Win. -- Damon McDougall http://www.damon-is-a-geek.com Institute for Computational Engineering Sciences 201 E. 24th St. Stop C0200 The University of Texas at Austin Austin, TX 78712-1229 ------------------------------------------------------------------------------ LogMeIn Rescue: Anywhere, Anytime Remote support for IT. 
Free Trial Remotely access PCs and mobile devices and provide instant support Improve your efficiency, and focus on delivering more value-add services Discover what IT Professionals Know. Rescue delivers http://p.sf.net/sfu/logmein_12329d2d _______________________________________________ Matplotlib-devel mailing list Matplotlib-devel at lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/matplotlib-devel From njs at pobox.com Fri Dec 7 16:21:25 2012 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 7 Dec 2012 21:21:25 +0000 Subject: [Numpy-discussion] [matplotlib-devel] GitHub attachments In-Reply-To: References: Message-ID: Oh, never mind, I guess they *only* allow image files. So, uh, no test data files, but if we have any lolcats in the trac attachments, we can migrate those. On Fri, Dec 7, 2012 at 9:00 PM, Nathaniel Smith wrote: > Heh, looks like we did the trac migration about a month too soon... > > ---------- Forwarded message ---------- > From: Damon McDougall > Date: Fri, Dec 7, 2012 at 8:15 PM > Subject: [matplotlib-devel] GitHub attachments > To: matplotlib-devel at lists.sourceforge.net > > > Did everyone see that GitHub now allows attachments in issue comments? > IMO, this is awesome. Attaching a visual *thing* of output when users > get unexpected results from plotting calls is a massive plus. Drag and > drop right into the browser, too! No more having to upload a file to > imgur or dropbox and link it to the issue comment and then have it > disappear when you delete it and forgot it was linked to GitHub. > > Win. > > -- > Damon McDougall > http://www.damon-is-a-geek.com > Institute for Computational Engineering Sciences > 201 E. 24th St. > Stop C0200 > The University of Texas at Austin > Austin, TX 78712-1229 > > ------------------------------------------------------------------------------ > LogMeIn Rescue: Anywhere, Anytime Remote support for IT. Free Trial > Remotely access PCs and mobile devices and provide instant support > Improve your efficiency, and focus on delivering more value-add services > Discover what IT Professionals Know. Rescue delivers > http://p.sf.net/sfu/logmein_12329d2d > _______________________________________________ > Matplotlib-devel mailing list > Matplotlib-devel at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/matplotlib-devel From jason-sage at creativetrax.com Fri Dec 7 17:16:00 2012 From: jason-sage at creativetrax.com (Jason Grout) Date: Fri, 07 Dec 2012 16:16:00 -0600 Subject: [Numpy-discussion] [matplotlib-devel] GitHub attachments In-Reply-To: References: Message-ID: <50C26AA0.7050409@creativetrax.com> On 12/7/12 3:21 PM, Nathaniel Smith wrote: > Oh, never mind, I guess they *only* allow image files. So, uh, no test > data files, but if we have any lolcats in the trac attachments, we can > migrate those. > It looks like what they do is just automatically upload it to their own cloud, and then substitute in their standard markup for embedding images. So it's just replacing the "upload your file to somewhere" to "we'll upload it automatically to our own cloud." That said, it is really important and very nice that they're doing this! Thanks, Jason From ralf.gommers at gmail.com Sat Dec 8 05:24:14 2012 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sat, 8 Dec 2012 11:24:14 +0100 Subject: [Numpy-discussion] turn off square brackets in set_print_options? 
In-Reply-To: References: Message-ID: On Wed, Dec 5, 2012 at 10:42 PM, Mark Bakker wrote: > Hello List, > > Is it possible to turn off printing the square brackets in > set_print_options? > Am I overlooking something? > You're not, those are hardcoded in _formatArray() in core/arrayprint.py Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sat Dec 8 06:15:06 2012 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sat, 8 Dec 2012 12:15:06 +0100 Subject: [Numpy-discussion] Hierarchical vs non-hierarchical ndarray.base and __array_interface__ In-Reply-To: <2B413C1254F1B44F908C9BE85A5DA0AD21B7FA@PRDEXMBX-04.the-lab.llnl.gov> References: <2B413C1254F1B44F908C9BE85A5DA0AD21B7FA@PRDEXMBX-04.the-lab.llnl.gov> Message-ID: On Sat, Nov 24, 2012 at 8:34 PM, Gamblin, Todd wrote: > Hi all, > > I posted on the change in semantics of ndarray.base here: > > > https://github.com/numpy/numpy/commit/6c0ad59#commitcomment-2153047 > > And some folks asked me to post my question to the numpy mailing list. > I've implemented a tool for mapping processes in parallel applications to > nodes in cartesian networks. It uses hierarchies of numpy arrays to > represent the domain decomposition of the application, as well as > corresponding groups of processes on the network. You can "map" an > application to the network using assignment of through views. The tool is > here if anyone is curious: https://github.com/tgamblin/rubik. I used > numpy to implement this because I wanted to be able to do mappings for > arbitrary-dimensional networks. Blue Gene/Q, for example, has a 5-D > network. > > The reason I bring this up is because I rely on the ndarray.base pointer > and some of the semantics in __array_interface__ to translate indices > within my hierarchy of views. e.g., if a value is at (0,0) in a view I > want to know that it's actually at (4,4) in its immediate parent array. > > After looking over the commit I linked to above, I realized I'm actually > relying on a lot of stuff that's not guaranteed by numpy. I rely on .base > pointing to its closest parent, and I rely on __array_interface__.data > containing the address of the array's memory and its strides. None of > these is guaranteed by the API docs: > > http://docs.scipy.org/doc/numpy/reference/arrays.interface.html > Are you saying that data/strides aren't guaranteed because they're marked optional on that page? My interpretation of "optional" is that these fields don't have to be present for all objects implementing something that qualifies as an array interface (for example ndarrays don't need a mask), but it does not mean that everything marked optional can be changed without warning for ndarrays. So I guess I have a few questions: > > 1. Is translating indices between base arrays and views something that > would be useful to other people? > > 2. Is there some better way to do this than using ndarray.base and > __array_interface__? > > 3. What's the numpy philosophy on this? Should views know about their > parents or not? They obviously have to know a little bit about their > memory, but whether or not they know how they were derived from their > owning array is a different question. There was some discussion on the > vagueness of .base here: > > > http://thread.gmane.org/gmane.comp.python.numeric.general/51688/focus=51703 > > But it doesn't look like you're deprecating .base in 1.7, only changing > its behavior, which I tend to agree is worse than deprecating it. 
> > After thinking about all this, I'm not sure what I would like to happen. > I can see the value of not keeping extra references around within numpy, > and my domain is pretty different from the ways that I imagine people use > numpy. I wouldn't have to change my code much to make it work without > .base, but I do rely on __array_interface__. If that doesn't include the > address and strides, t think I'm screwed as far as translating indices go. > > Any suggestions? > The discussion on .base seems to have converged. As for __array_interface__, you could write a test which captures the essence of your use of ndarray.__array_interface__ and send a PR for it. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From staywithpin at gmail.com Sat Dec 8 06:20:21 2012 From: staywithpin at gmail.com (Daniel Wu) Date: Sat, 08 Dec 2012 19:20:21 +0800 Subject: [Numpy-discussion] pandas dataframe memory layout Message-ID: <50C32275.6090203@gmail.com> For numpy array, we can choose to use either C style or Forran stype. For dataframe in Pandas, is it possible to choose memory layout as in numpy array? From d.s.seljebotn at astro.uio.no Sat Dec 8 08:48:31 2012 From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn) Date: Sat, 08 Dec 2012 14:48:31 +0100 Subject: [Numpy-discussion] Numpy Trac migration In-Reply-To: <50BFEAA3.1030306@iki.fi> References: <50BFEAA3.1030306@iki.fi> Message-ID: <50C3452F.9050108@astro.uio.no> On 12/06/2012 01:45 AM, Pauli Virtanen wrote: > Hi, > > For those whom it may concern: Since the Numpy Trac -> Github migration > is complete, I went ahead and added redirects > > projects.scipy.org/numpy/register -> github.com/numpy/numpy/issues > projects.scipy.org/numpy/newticket -> github.com/numpy/numpy/issues > > plus an ugly warning bar on top to direct potential users towards Github. Related news: https://github.com/blog/1347-issue-attachments Dag Sverre From chaoyuejoy at gmail.com Sat Dec 8 10:07:24 2012 From: chaoyuejoy at gmail.com (Chao YUE) Date: Sat, 8 Dec 2012 16:07:24 +0100 Subject: [Numpy-discussion] pandas dataframe memory layout In-Reply-To: <50C32275.6090203@gmail.com> References: <50C32275.6090203@gmail.com> Message-ID: I don't know. Maybe you can ask here: http://stackoverflow.com/questions/tagged/pandas chao On Sat, Dec 8, 2012 at 12:20 PM, Daniel Wu wrote: > For numpy array, we can choose to use either C style or Forran stype. > For dataframe in Pandas, is it possible to choose memory layout as in > numpy array? > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- *********************************************************************************** Chao YUE Laboratoire des Sciences du Climat et de l'Environnement (LSCE-IPSL) UMR 1572 CEA-CNRS-UVSQ Batiment 712 - Pe 119 91191 GIF Sur YVETTE Cedex Tel: (33) 01 69 08 29 02; Fax:01.69.08.77.16 ************************************************************************************ -------------- next part -------------- An HTML attachment was scrubbed... URL: From kamauallan at gmail.com Mon Dec 10 05:57:04 2012 From: kamauallan at gmail.com (Allan Kamau) Date: Mon, 10 Dec 2012 13:57:04 +0300 Subject: [Numpy-discussion] ImportError: libatlas.so.3: cannot open shared object file Message-ID: I have built and installed numpy on Debian from source successfully as follows. 
export LAPACK=/usr/lib/lapack/liblapack.so;export ATLAS=/usr/lib/atlas-base/libatlas.so; python setup.py build; python setup.py install; Then I change directory from the numpy sources directory. Then I give the command "python -c 'import numpy; numpy.test()' I get the error pasted below. How can I resource this issue? Traceback (most recent call last): File "", line 1, in File "/home/allank/SmashCell/local/lib/python2.7/site-packages/numpy/__init__.py", line 130, in import add_newdocs File "/home/allank/SmashCell/local/lib/python2.7/site-packages/numpy/add_newdocs.py", line 9, in from lib import add_newdoc File "/home/allank/SmashCell/local/lib/python2.7/site-packages/numpy/lib/__init__.py", line 13, in from polynomial import * File "/home/allank/SmashCell/local/lib/python2.7/site-packages/numpy/lib/polynomial.py", line 18, in from numpy.linalg import eigvals, lstsq File "/home/allank/SmashCell/local/lib/python2.7/site-packages/numpy/linalg/__init__.py", line 47, in from linalg import * File "/home/allank/SmashCell/local/lib/python2.7/site-packages/numpy/linalg/linalg.py", line 22, in from numpy.linalg import lapack_lite ImportError: libatlas.so.3: cannot open shared object file: No such file or directory -------------- next part -------------- An HTML attachment was scrubbed... URL: From alex.eberspaecher at gmail.com Mon Dec 10 06:54:08 2012 From: alex.eberspaecher at gmail.com (Alexander =?ISO-8859-1?B?RWJlcnNw5GNoZXI=?=) Date: Mon, 10 Dec 2012 12:54:08 +0100 Subject: [Numpy-discussion] ImportError: libatlas.so.3: cannot open shared object file In-Reply-To: References: Message-ID: <20121210125408.0c02d3f3@poetzsch.nat.uni-magdeburg.de> On Mon, 10 Dec 2012 13:57:04 +0300 Allan Kamau wrote: > I have built and installed numpy on Debian from source successfully as > follows. [...] > ImportError: libatlas.so.3: cannot open shared object file: No such > file or directory Are the paths to ATLAS in your $LD_LIBRARY_PATH? If not, try adding those. Hope that helps! Cheers, Alex From kamauallan at gmail.com Mon Dec 10 07:09:21 2012 From: kamauallan at gmail.com (Allan Kamau) Date: Mon, 10 Dec 2012 15:09:21 +0300 Subject: [Numpy-discussion] ImportError: libatlas.so.3: cannot open shared object file In-Reply-To: <20121210125408.0c02d3f3@poetzsch.nat.uni-magdeburg.de> References: <20121210125408.0c02d3f3@poetzsch.nat.uni-magdeburg.de> Message-ID: I did add the paths to LD_LIBRARY_PATH as advised (see below), then "python setup.py clean;python setup.py build;python setup.py install;" but the same error persists. export LAPACK=/usr/lib/lapack/liblapack.so;export ATLAS=/usr/lib/atlas-base/libatlas.so; export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/lib/lapack:/usr/lib/atlas-base; On Mon, Dec 10, 2012 at 2:54 PM, Alexander Ebersp?cher < alex.eberspaecher at gmail.com> wrote: > On Mon, 10 Dec 2012 13:57:04 +0300 > Allan Kamau wrote: > > > I have built and installed numpy on Debian from source successfully as > > follows. > [...] > > ImportError: libatlas.so.3: cannot open shared object file: No such > > file or directory > > Are the paths to ATLAS in your $LD_LIBRARY_PATH? If not, try adding > those. > > Hope that helps! > > Cheers, > > Alex > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From shish at keba.be Mon Dec 10 08:42:49 2012 From: shish at keba.be (Olivier Delalleau) Date: Mon, 10 Dec 2012 08:42:49 -0500 Subject: [Numpy-discussion] ImportError: libatlas.so.3: cannot open shared object file In-Reply-To: References: <20121210125408.0c02d3f3@poetzsch.nat.uni-magdeburg.de> Message-ID: 2012/12/10 Allan Kamau > I did add the paths to LD_LIBRARY_PATH as advised (see below), then > "python setup.py clean;python setup.py build;python setup.py install;" but > the same error persists. > > export LAPACK=/usr/lib/lapack/liblapack.so;export > ATLAS=/usr/lib/atlas-base/libatlas.so; > export > LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/lib/lapack:/usr/lib/atlas-base; > Is the file libatlas.so.3 present in /usr/lib/lapack:/usr/lib/atlas-base? -=- Olivier > On Mon, Dec 10, 2012 at 2:54 PM, Alexander Ebersp?cher < > alex.eberspaecher at gmail.com> wrote: > >> On Mon, 10 Dec 2012 13:57:04 +0300 >> Allan Kamau wrote: >> >> > I have built and installed numpy on Debian from source successfully as >> > follows. >> [...] >> > ImportError: libatlas.so.3: cannot open shared object file: No such >> > file or directory >> >> Are the paths to ATLAS in your $LD_LIBRARY_PATH? If not, try adding >> those. >> >> Hope that helps! >> >> Cheers, >> >> Alex >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From chaoyuejoy at gmail.com Mon Dec 10 17:30:05 2012 From: chaoyuejoy at gmail.com (Chao YUE) Date: Mon, 10 Dec 2012 23:30:05 +0100 Subject: [Numpy-discussion] np.dstack vs np.concatenate? Message-ID: Dear all, I want to concate 13 mXn arrays into mXnX13 array. np version 1.6.2 I know the correct way to do that like here: http://stackoverflow.com/questions/8898471/concatenate-two-numpy-arrays-in-the-4th-dimension yet the np.dstack documentation also gives something like this: Equivalent to ``np.concatenate(tup, axis=2)``. so I tried this: In [10]: a = np.arange(8).reshape(2,4) In [11]: b=np.arange(9,17).reshape(2,4) In [12]: abd = np.dstack((a,b)) In [13]: abd.shape Out[13]: (2, 4, 2) In [14]: np.testing.assert_array_equal(abd[...,0],a) In [15]: np.testing.assert_array_equal(abd[...,1],b) In [16]: np.concatenate((a,b),axis=2) --------------------------------------------------------------------------- ValueError Traceback (most recent call last) in () ----> 1 np.concatenate((a,b),axis=2) ValueError: bad axis1 argument to swapaxes so I guess for the 13 arrays, np.dstack will work. I can also do something like: array_list_old = [arr1, arr2, arr3, arr4, arr5, arr6] array_list = [arr[...,np.newaxis] for arr in array_list_old] array = np.concatenate(tuple(array_list),axis=2) So is there some inconsistency in the documentation? thanks, Chao -- *********************************************************************************** Chao YUE Laboratoire des Sciences du Climat et de l'Environnement (LSCE-IPSL) UMR 1572 CEA-CNRS-UVSQ Batiment 712 - Pe 119 91191 GIF Sur YVETTE Cedex Tel: (33) 01 69 08 29 02; Fax:01.69.08.77.16 ************************************************************************************ -------------- next part -------------- An HTML attachment was scrubbed... URL: From brad.froehle at gmail.com Mon Dec 10 17:38:50 2012 From: brad.froehle at gmail.com (Bradley M. Froehle) Date: Mon, 10 Dec 2012 14:38:50 -0800 Subject: [Numpy-discussion] np.dstack vs np.concatenate? 
In-Reply-To: References: Message-ID: The source for np.dstack would point the way towards a simpler implementation: array = np.concatenate(map(np.atleast_3d, (arr1, arr2, arr3, arr4, arr5, arr6)), axis=2) array_list_old = [arr1, arr2, arr3, arr4, arr5, arr6] > > array_list = [arr[...,np.newaxis] for arr in array_list_old] > array = np.concatenate(tuple(array_list),axis=2) > > So is there some inconsistency in the documentation? > Maybe. -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Mon Dec 10 19:39:39 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 10 Dec 2012 19:39:39 -0500 Subject: [Numpy-discussion] segfaulting numpy with dot and datetime Message-ID: >>> np.__version__ '1.6.2' >>> aa array([1970-01-13 96:00:00, 1970-01-13 120:00:00, 1970-01-13 144:00:00, 1970-01-13 168:00:00, 1970-01-13 192:00:00], dtype=datetime64[ns]) >>> np.dot(aa, [1]) reported at http://stackoverflow.com/questions/13786209/regression-on-stock-data-using-pandas-and-matplotlib discussion https://groups.google.com/d/topic/pystatsmodels/5zpWBzSH8UE/discussion using linalg is "safe" but I doubt this casting to float is the desired result >>> np.linalg.pinv(aa[:,None]) array([[ 1.87768878e-19, 1.87784114e-19, 1.87799350e-19, 1.87814586e-19, 1.87829822e-19]]) >>> np.linalg.pinv(np.asarray(aa, float)[:,None]) array([[ 1.87768878e-19, 1.87784114e-19, 1.87799350e-19, 1.87814586e-19, 1.87829822e-19]]) Josef From charlesr.harris at gmail.com Mon Dec 10 20:26:35 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 10 Dec 2012 18:26:35 -0700 Subject: [Numpy-discussion] segfaulting numpy with dot and datetime In-Reply-To: References: Message-ID: On Mon, Dec 10, 2012 at 5:39 PM, wrote: > >>> np.__version__ > '1.6.2' > >>> aa > array([1970-01-13 96:00:00, 1970-01-13 120:00:00, 1970-01-13 144:00:00, > 1970-01-13 168:00:00, 1970-01-13 192:00:00], dtype=datetime64[ns]) > >>> np.dot(aa, [1]) > > > Hmm, I can't even get that array using current master, what with illegal hours and all. Datetime in 1.6.x wasn't quite up to snuff, so things might have been fixed in 1.7.0. Is there any way you can check that. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Mon Dec 10 20:54:42 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 10 Dec 2012 20:54:42 -0500 Subject: [Numpy-discussion] segfaulting numpy with dot and datetime In-Reply-To: References: Message-ID: On Mon, Dec 10, 2012 at 8:26 PM, Charles R Harris wrote: > > On Mon, Dec 10, 2012 at 5:39 PM, wrote: >> >> >>> np.__version__ >> '1.6.2' >> >>> aa >> array([1970-01-13 96:00:00, 1970-01-13 120:00:00, 1970-01-13 144:00:00, >> 1970-01-13 168:00:00, 1970-01-13 192:00:00], dtype=datetime64[ns]) >> >>> np.dot(aa, [1]) >> >> > > Hmm, I can't even get that array using current master, what with illegal > hours and all. Datetime in 1.6.x wasn't quite up to snuff, so things might > have been fixed in 1.7.0. Is there any way you can check that. I didn't know the dates are illegal, they were created with pandas. Skipper said he couldn't replicate the segfault so it might be gone with a more recent numpy. I still have to setup a virtualenv for a 1.7.0 beta so I can start to test it. (I rely on binaries for numpy and scipy now.) 
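A possible interim workaround (only a sketch, not something tested against 1.6.2): since datetime64 values are stored as 64-bit integers, view the data as int64 before handing it to dot, so dot never sees a dtype it cannot handle.

import numpy as np

aa = np.arange(5).astype('datetime64[ns]')
# view the underlying int64 representation, then cast to float for dot
np.dot(aa.view('int64').astype('float64'), np.ones(5))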
Thanks, Josef > > Chuck > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From josef.pktd at gmail.com Mon Dec 10 21:46:29 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 10 Dec 2012 21:46:29 -0500 Subject: [Numpy-discussion] segfaulting numpy with dot and datetime In-Reply-To: References: Message-ID: On Mon, Dec 10, 2012 at 8:54 PM, wrote: > On Mon, Dec 10, 2012 at 8:26 PM, Charles R Harris > wrote: >> >> On Mon, Dec 10, 2012 at 5:39 PM, wrote: >>> >>> >>> np.__version__ >>> '1.6.2' >>> >>> aa >>> array([1970-01-13 96:00:00, 1970-01-13 120:00:00, 1970-01-13 144:00:00, >>> 1970-01-13 168:00:00, 1970-01-13 192:00:00], dtype=datetime64[ns]) >>> >>> np.dot(aa, [1]) >>> >>> >> >> Hmm, I can't even get that array using current master, what with illegal >> hours and all. Datetime in 1.6.x wasn't quite up to snuff, so things might >> have been fixed in 1.7.0. Is there any way you can check that. > > I didn't know the dates are illegal, they were created with pandas. here's the minimal numpy version >>> np.ones(5, 'datetime64[ns]') array([1970-01-01 00:00:00, 1970-01-01 00:00:00, 1970-01-01 00:00:00, 1970-01-01 00:00:00, 1970-01-01 00:00:00], dtype=datetime64[ns]) >>> b = np.ones(5, 'datetime64[ns]') >>> np.dot(b, [1]) virtualenv is next Josef > > Skipper said he couldn't replicate the segfault so it might be gone > with a more recent numpy. > I still have to setup a virtualenv for a 1.7.0 beta so I can start to test it. > (I rely on binaries for numpy and scipy now.) > > Thanks, > > Josef > >> >> Chuck >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> From josef.pktd at gmail.com Mon Dec 10 22:28:53 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 10 Dec 2012 22:28:53 -0500 Subject: [Numpy-discussion] segfaulting numpy with dot and datetime In-Reply-To: References: Message-ID: On Mon, Dec 10, 2012 at 9:46 PM, wrote: > On Mon, Dec 10, 2012 at 8:54 PM, wrote: >> On Mon, Dec 10, 2012 at 8:26 PM, Charles R Harris >> wrote: >>> >>> On Mon, Dec 10, 2012 at 5:39 PM, wrote: >>>> >>>> >>> np.__version__ >>>> '1.6.2' >>>> >>> aa >>>> array([1970-01-13 96:00:00, 1970-01-13 120:00:00, 1970-01-13 144:00:00, >>>> 1970-01-13 168:00:00, 1970-01-13 192:00:00], dtype=datetime64[ns]) >>>> >>> np.dot(aa, [1]) >>>> >>>> >>> >>> Hmm, I can't even get that array using current master, what with illegal >>> hours and all. Datetime in 1.6.x wasn't quite up to snuff, so things might >>> have been fixed in 1.7.0. Is there any way you can check that. >> >> I didn't know the dates are illegal, they were created with pandas. > > here's the minimal numpy version > >>>> np.ones(5, 'datetime64[ns]') > array([1970-01-01 00:00:00, 1970-01-01 00:00:00, 1970-01-01 00:00:00, > 1970-01-01 00:00:00, 1970-01-01 00:00:00], dtype=datetime64[ns]) >>>> b = np.ones(5, 'datetime64[ns]') >>>> np.dot(b, [1]) > > > virtualenv is next much better (py27d) E:\Josef\testing\tox\py27d\Scripts>python Python 2.7.1 (r271:86832, Nov 27 2010, 18:30:46) [MSC v.1500 32 bit (Intel)] on win32 Type "help", "copyright", "credits" or "license" for more information. 
>>> import numpy as np >>> np.__version__ '1.7.0b2' >>> b = np.ones(5, 'datetime64[ns]') >>> b array(['1969-12-31T19:00:00.000000001-0500', '1969-12-31T19:00:00.000000001-0500', '1969-12-31T19:00:00.000000001-0500', '1969-12-31T19:00:00.000000001-0500', '1969-12-31T19:00:00.000000001-0500'], dtype='datetime64[ns]') >>> np.dot(b, [1]) Traceback (most recent call last): File "", line 1, in TypeError: Cannot cast array data from dtype('>> b = np.arange(10).astype('datetime64[ns]').reshape(2,5) >>> b array([['1969-12-31T19:00:00.000000000-0500', '1969-12-31T19:00:00.000000001-0500', '1969-12-31T19:00:00.000000002-0500', '1969-12-31T19:00:00.000000003-0500', '1969-12-31T19:00:00.000000004-0500'], ['1969-12-31T19:00:00.000000005-0500', '1969-12-31T19:00:00.000000006-0500', '1969-12-31T19:00:00.000000007-0500', '1969-12-31T19:00:00.000000008-0500', '1969-12-31T19:00:00.000000009-0500']], dtype='datetime64[ns]') >>> np.linalg.pinv(b) array([[ -3.20000000e-01, 1.20000000e-01], [ -1.80000000e-01, 8.00000000e-02], [ -4.00000000e-02, 4.00000000e-02], [ 1.00000000e-01, -1.73472348e-17], [ 2.40000000e-01, -4.00000000e-02]]) >>> np.linalg.qr(b) (array([[ 0., -1.], [-1., 0.]]), array([[-5., -6., -7., -8., -9.], [ 0., -1., -2., -3., -4.]])) >>> c array([['2003-09-28T20:00:00.000000000-0400', '2003-09-29T20:00:00.000000000-0400', '2003-09-30T20:00:00.000000000-0400', '2003-10-01T20:00:00.000000000-0400', '2003-10-02T20:00:00.000000000-0400'], ['2003-10-05T20:00:00.000000000-0400', '2003-10-06T20:00:00.000000000-0400', '2003-10-07T20:00:00.000000000-0400', '2003-10-08T20:00:00.000000000-0400', '2003-10-09T20:00:00.000000000-0400']], dtype='datetime64[ns]') >>> np.linalg.pinv(c) array([[ -4.07870371e-12, 4.07638889e-12], [ -2.03951719e-12, 2.03835978e-12], [ -3.30684814e-16, 3.30684816e-16], [ 2.03885582e-12, -2.03769841e-12], [ 4.07804232e-12, -4.07572751e-12]]) Josef > > Josef > >> >> Skipper said he couldn't replicate the segfault so it might be gone >> with a more recent numpy. >> I still have to setup a virtualenv for a 1.7.0 beta so I can start to test it. >> (I rely on binaries for numpy and scipy now.) >> >> Thanks, >> >> Josef >> >>> >>> Chuck >>> >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> From ndbecker2 at gmail.com Tue Dec 11 11:44:26 2012 From: ndbecker2 at gmail.com (Neal Becker) Date: Tue, 11 Dec 2012 11:44:26 -0500 Subject: [Numpy-discussion] non-integer index misfeature? Message-ID: I think it's a misfeature that a floating point is silently accepted as an index. I would prefer a warning for: bins = np.arange (...) for b in bins: ... w[b] = blah when I meant: for ib,b in enumerate (bins): w[ib] = blah From jason-sage at creativetrax.com Wed Dec 12 09:51:02 2012 From: jason-sage at creativetrax.com (Jason Grout) Date: Wed, 12 Dec 2012 08:51:02 -0600 Subject: [Numpy-discussion] IPython receives $1.15 million from Alfred P. Sloan Foundation Message-ID: <50C899D6.3030805@creativetrax.com> Hi everyone, Just FYI, IPython just received $1.15 million in funding from the Alfred P. Sloan Foundation to support development over the next 2 years. Fernando talks more about this in his post to the IPython mailing list: http://mail.scipy.org/pipermail/ipython-dev/2012-December/010799.html It's great to see a significant open-source python project that many of us use on a day-to-day basis get such great funding! 
Thanks, Jason -- Jason Grout From ioannis87 at gmail.com Tue Dec 11 05:51:24 2012 From: ioannis87 at gmail.com (ioannis syntychakis) Date: Tue, 11 Dec 2012 11:51:24 +0100 Subject: [Numpy-discussion] unsubscribe Message-ID: -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Wed Dec 12 15:20:04 2012 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Wed, 12 Dec 2012 21:20:04 +0100 Subject: [Numpy-discussion] non-integer index misfeature? In-Reply-To: References: Message-ID: On Tue, Dec 11, 2012 at 5:44 PM, Neal Becker wrote: > I think it's a misfeature that a floating point is silently accepted as an > index. I would prefer a warning for: > > bins = np.arange (...) > > for b in bins: > ... > w[b] = blah > > when I meant: > > for ib,b in enumerate (bins): > w[ib] = blah > Agreed. Scipy.special functions were just changed to generate warnings on truncation of float inputs where ints are expected (only if truncation changes the value, so 3.0 is silent and 3.1 is not). For numpy indexing this may not be appropriate though; checking every index value used could slow things down and/or be quite disruptive. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Wed Dec 12 15:48:42 2012 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 12 Dec 2012 20:48:42 +0000 Subject: [Numpy-discussion] non-integer index misfeature? In-Reply-To: References: Message-ID: On Wed, Dec 12, 2012 at 8:20 PM, Ralf Gommers wrote: > > On Tue, Dec 11, 2012 at 5:44 PM, Neal Becker wrote: >> >> I think it's a misfeature that a floating point is silently accepted as an >> index. I would prefer a warning for: >> >> bins = np.arange (...) >> >> for b in bins: >> ... >> w[b] = blah >> >> when I meant: >> >> for ib,b in enumerate (bins): >> w[ib] = blah > > > Agreed. Scipy.special functions were just changed to generate warnings on > truncation of float inputs where ints are expected (only if truncation > changes the value, so 3.0 is silent and 3.1 is not). > > For numpy indexing this may not be appropriate though; checking every index > value used could slow things down and/or be quite disruptive. I doubt this is measurable, and it only affects people who are using floats as indexes, which is a risky thing to be doing in the first place. The only good reason to use floats as indexes is if you're doing floating point arithmetic to calculate indexes -- but now you're going to get bitten as soon as some operation returns N - epsilon instead of N, and gets truncated to N - 1. I'd be +1 for a patch to make numpy warn when indexing with non-integer floats. (Heck, I'd probably be +1 on deprecating allowing floating point numbers as indexes at all... it's risky as heck and reminding people to think about rounding can only be a good thing, given that risk.) -n From d.warde.farley at gmail.com Wed Dec 12 16:09:56 2012 From: d.warde.farley at gmail.com (David Warde-Farley) Date: Wed, 12 Dec 2012 16:09:56 -0500 Subject: [Numpy-discussion] non-integer index misfeature? In-Reply-To: References: Message-ID: On Wed, Dec 12, 2012 at 3:20 PM, Ralf Gommers wrote: > For numpy indexing this may not be appropriate though; checking every index > value used could slow things down and/or be quite disruptive. For array fancy indices, a dtype check on the entire array would suffice. 
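Something along these lines is what I have in mind -- purely a Python-level sketch of the kind of check, not the actual C indexing path, and the helper name here is made up for illustration:

    import warnings
    import numpy as np

    def _checked_index(index):
        # One dtype check per fancy-index array, independent of its size.
        idx = np.asarray(index)
        if idx.dtype.kind == 'f':
            warnings.warn("using a float array as an index is deprecated",
                          DeprecationWarning)
            idx = idx.astype(np.intp)
        return idx

For an integer index array that is a single dtype comparison, so the cost should be negligible next to the indexing itself.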
For lists and scalars, I doubt it'd be substantial amount of overhead compared to the overhead that already exists (for lists, I'm not totally sure whether these are coerced to arrays first -- if they are, the dtype check need only be performed once after that). At any rate, a benchmark of any proposed solution is in order. David From njs at pobox.com Wed Dec 12 16:56:32 2012 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 12 Dec 2012 21:56:32 +0000 Subject: [Numpy-discussion] non-integer index misfeature? In-Reply-To: References: Message-ID: On Wed, Dec 12, 2012 at 9:09 PM, David Warde-Farley wrote: > On Wed, Dec 12, 2012 at 3:20 PM, Ralf Gommers wrote: >> For numpy indexing this may not be appropriate though; checking every index >> value used could slow things down and/or be quite disruptive. > > For array fancy indices, a dtype check on the entire array would > suffice. For lists and scalars, I doubt it'd be substantial amount of > overhead compared to the overhead that already exists (for lists, I'm > not totally sure whether these are coerced to arrays first -- if they > are, the dtype check need only be performed once after that). At any > rate, a benchmark of any proposed solution is in order. The current behaviour seems to be: Scalars are silently cast to int: In [6]: a = np.arange(10) In [7]: a[1.5] Out[7]: 1 So are lists: In [8]: a[[1.5, 2.5]] Out[8]: array([1, 2]) In fact, it looks like lists always get passed through np.array(mylist, dtype=int), because even a list of booleans gets cast to integer: In [21]: a[[False] * 10] Out[21]: array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0]) But arrays must have integer type to start with: In [10]: a[np.array([1.5, 2.5])] IndexError: arrays used as indices must be of integer (or boolean) type Complex scalars are also cast (but with a warning): In [13]: a[np.complex64(1.5)] /home/njs/.user-python2.7-64bit/bin/ipython:1: ComplexWarning: Casting complex values to real discards the imaginary part Out[13]: 1 So now I'm even more in favour of making floating point indexes a hard error (with a deprecation period). https://github.com/numpy/numpy/issues/2810 Also we should fix it so that a[list] acts exactly like a[np.array(list)]. (Again with a suitable deprecation period.) The current behaviour is really weird! https://github.com/numpy/numpy/issues/2811 -n P.S. to Neal: none of this is going to help your original code though, because your 'bins' array actually contains integers... From sebastian at sipsolutions.net Wed Dec 12 17:29:01 2012 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Wed, 12 Dec 2012 23:29:01 +0100 Subject: [Numpy-discussion] non-integer index misfeature? In-Reply-To: References: Message-ID: <1355351341.2609.11.camel@sebastian-laptop> On Wed, 2012-12-12 at 20:48 +0000, Nathaniel Smith wrote: > On Wed, Dec 12, 2012 at 8:20 PM, Ralf Gommers wrote: > > > > On Tue, Dec 11, 2012 at 5:44 PM, Neal Becker wrote: > >> > >> I think it's a misfeature that a floating point is silently accepted as an > >> index. I would prefer a warning for: > >> > >> bins = np.arange (...) > >> > >> for b in bins: > >> ... > >> w[b] = blah > >> > >> when I meant: > >> > >> for ib,b in enumerate (bins): > >> w[ib] = blah > > > > > > Agreed. Scipy.special functions were just changed to generate warnings on > > truncation of float inputs where ints are expected (only if truncation > > changes the value, so 3.0 is silent and 3.1 is not). 
> > > > For numpy indexing this may not be appropriate though; checking every index > > value used could slow things down and/or be quite disruptive. > > I doubt this is measurable, and it only affects people who are using > floats as indexes, which is a risky thing to be doing in the first > place. The only good reason to use floats as indexes is if you're > doing floating point arithmetic to calculate indexes -- but now you're > going to get bitten as soon as some operation returns N - epsilon > instead of N, and gets truncated to N - 1. > > I'd be +1 for a patch to make numpy warn when indexing with > non-integer floats. (Heck, I'd probably be +1 on deprecating allowing > floating point numbers as indexes at all... it's risky as heck and > reminding people to think about rounding can only be a good thing, > given that risk.) > Personally +1 on just deprecating that stuff in the long run. Just if someone is interested I remember seeing this comment (which applies for the scalar case): /* * PyNumber_Index was introduced in Python 2.5 because of NumPy. * http://www.python.org/dev/peps/pep-0357/ * Let's use it for indexing! * * Unfortunately, SciPy and possibly other code seems to rely * on the lenient coercion. :( */ #if 0 /*PY_VERSION_HEX >= 0x02050000*/ PyObject *ind = PyNumber_Index(op); if (ind != NULL) { value = PyArray_PyIntAsIntp(ind); Py_DECREF(ind); } else { value = -1; } #else and is is somewhat related. But with a long deprecation process switching to using `__index__` would seem possible to me. Regards, Sebastian > -n > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From ndbecker2 at gmail.com Thu Dec 13 06:58:09 2012 From: ndbecker2 at gmail.com (Neal Becker) Date: Thu, 13 Dec 2012 06:58:09 -0500 Subject: [Numpy-discussion] non-integer index misfeature? References: <1355351341.2609.11.camel@sebastian-laptop> Message-ID: I'd be happy with disallowing floating point index at all. I would think it was almost always a mistake. From charlesr.harris at gmail.com Thu Dec 13 11:34:53 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 13 Dec 2012 09:34:53 -0700 Subject: [Numpy-discussion] Proposal to drop python 2.4 support in numpy 1.8 Message-ID: Time to raise this topic again. Opinions welcome. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From teoliphant at gmail.com Thu Dec 13 11:35:01 2012 From: teoliphant at gmail.com (Travis Oliphant) Date: Thu, 13 Dec 2012 10:35:01 -0600 Subject: [Numpy-discussion] www.numpy.org home page Message-ID: <6E409FEC-FD9F-41F6-BC02-F227EE42F077@gmail.com> For people interested in the www.numpy.org home page: Jon Turner has officially transferred the www.numpy.org domain to NumFOCUS. Thank you, Jon for this donation and for being a care-taker of the domain-name. We have setup the domain registration to point to numpy.github.com and I've changed the CNAME in that repostiory to www.numpy.org I've sent an email to have the numpy.scipy.org page to redirect to www.numpy.org. The NumPy home page can still be edited in this repository: git at github.com:numpy/numpy.org.git. Pull requests are always welcome --- especially pull requests that improve the look and feel of the web-page. 
Two of the content changes that we need to make a decision about is 1) whether or not to put links to books published (Packt publishing for example has offered a higher percentage of their revenues if we put a prominent link on www.numpy.org) 2) whether or not to accept "Sponsored by" links on the home page for donations to the project (e.g. Continuum Analytics has sponsored Ondrej release management, other companies have sponsored pull requests, other companies may want to provide donations and we would want to recognize their contributions to the numpy project). These decisions should be made by the NumPy community which in my mind are interested people on this list. Who is interested in this kind of discussion? We could have these discussions on this list or on the numfocus at googlegroups.com list and keep this list completely technical (which I prefer, but I will do whatever the consensus is). Best regards, -Travis From travis at continuum.io Thu Dec 13 11:36:49 2012 From: travis at continuum.io (Travis Oliphant) Date: Thu, 13 Dec 2012 10:36:49 -0600 Subject: [Numpy-discussion] Proposal to drop python 2.4 support in numpy 1.8 In-Reply-To: References: Message-ID: A big +1 from me --- but I don't have anyone I know using 2.4 anymore.... -Travis On Dec 13, 2012, at 10:34 AM, Charles R Harris wrote: > Time to raise this topic again. Opinions welcome. > > Chuck > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From ben.root at ou.edu Thu Dec 13 11:39:55 2012 From: ben.root at ou.edu (Benjamin Root) Date: Thu, 13 Dec 2012 11:39:55 -0500 Subject: [Numpy-discussion] Proposal to drop python 2.4 support in numpy 1.8 In-Reply-To: References: Message-ID: As a point of reference, python 2.4 is on RH5/CentOS5. While RH6 is the current version, there are still enterprises that are using version 5. Of course, at this point, one really should be working on a migration plan and shouldn't be doing new development on those machines... Ben Root -------------- next part -------------- An HTML attachment was scrubbed... URL: From scopatz at gmail.com Thu Dec 13 11:46:54 2012 From: scopatz at gmail.com (Anthony Scopatz) Date: Thu, 13 Dec 2012 10:46:54 -0600 Subject: [Numpy-discussion] Proposal to drop python 2.4 support in numpy 1.8 In-Reply-To: References: Message-ID: +1, if someone wants to use an older version of Python they can use an older version of numpy. On Thu, Dec 13, 2012 at 10:36 AM, Travis Oliphant wrote: > A big +1 from me --- but I don't have anyone I know using 2.4 anymore.... > > -Travis > > On Dec 13, 2012, at 10:34 AM, Charles R Harris wrote: > > > Time to raise this topic again. Opinions welcome. > > > > Chuck > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From nouiz at nouiz.org Thu Dec 13 11:49:28 2012 From: nouiz at nouiz.org (=?ISO-8859-1?Q?Fr=E9d=E9ric_Bastien?=) Date: Thu, 13 Dec 2012 11:49:28 -0500 Subject: [Numpy-discussion] Proposal to drop python 2.4 support in numpy 1.8 In-Reply-To: References: Message-ID: +1, especially if this is for 1.8. 
What is the plan for when it will be released? It is 1.7 that will be the long term supported version? Fred On Thu, Dec 13, 2012 at 11:46 AM, Anthony Scopatz wrote: > +1, if someone wants to use an older version of Python they can use an older > version of numpy. > > > On Thu, Dec 13, 2012 at 10:36 AM, Travis Oliphant > wrote: >> >> A big +1 from me --- but I don't have anyone I know using 2.4 anymore.... >> >> -Travis >> >> On Dec 13, 2012, at 10:34 AM, Charles R Harris wrote: >> >> > Time to raise this topic again. Opinions welcome. >> > >> > Chuck >> > _______________________________________________ >> > NumPy-Discussion mailing list >> > NumPy-Discussion at scipy.org >> > http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From cournape at gmail.com Thu Dec 13 12:00:12 2012 From: cournape at gmail.com (David Cournapeau) Date: Thu, 13 Dec 2012 18:00:12 +0100 Subject: [Numpy-discussion] Proposal to drop python 2.4 support in numpy 1.8 In-Reply-To: References: Message-ID: On Thu, Dec 13, 2012 at 5:34 PM, Charles R Harris wrote: > Time to raise this topic again. Opinions welcome. I am ok if 1.7 is the LTS. I would even go as far as dropping 2.5 as well then (RHEL 6 uses python 2.6). cheers, David From jsseabold at gmail.com Thu Dec 13 12:03:14 2012 From: jsseabold at gmail.com (Skipper Seabold) Date: Thu, 13 Dec 2012 12:03:14 -0500 Subject: [Numpy-discussion] Proposal to drop python 2.4 support in numpy 1.8 In-Reply-To: References: Message-ID: On Thu, Dec 13, 2012 at 12:00 PM, David Cournapeau wrote: > I would even go as far as dropping 2.5 as well then (RHEL 6 > uses python 2.6). +1 Skipper From sturla at molden.no Thu Dec 13 12:11:41 2012 From: sturla at molden.no (Sturla Molden) Date: Thu, 13 Dec 2012 18:11:41 +0100 Subject: [Numpy-discussion] Proposal to drop python 2.4 support in numpy 1.8 In-Reply-To: References: Message-ID: <4E1C143A-B952-4716-BDBA-1A91A3EAD82C@molden.no> Yes, and ditto for SciPy. With dropped 2.4 support we can also use the new memoryview syntax instead of ndarray syntax in Cython. That is more important for SciPy, but it has some relevance for NumPy too. Sturla Sendt fra min iPad Den 13. des. 2012 kl. 17:34 skrev Charles R Harris : > Time to raise this topic again. Opinions welcome. > > Chuck > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From brad.froehle at gmail.com Thu Dec 13 12:12:38 2012 From: brad.froehle at gmail.com (Bradley M. Froehle) Date: Thu, 13 Dec 2012 09:12:38 -0800 Subject: [Numpy-discussion] Proposal to drop python 2.4 support in numpy 1.8 In-Reply-To: References: Message-ID: Targeting >= 2.6 would be preferable to me. Several other packages including IPython, support only Python >= 2.6, >= 3.2. This change would help me from accidentally writing Python syntax which is allowable in 2.6 & 2.7 (but not in 2.4 or 2.5). Compiling a newer Python interpreter isn't very hard? probably about as difficult as installing NumPy. 
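For what it's worth, the usual offenders look something like this (fine on 2.6/2.7, broken on 2.4/2.5) -- just an illustrative snippet:

    def risky():
        raise ValueError("boom")

    try:
        risky()
    except ValueError as exc:        # 2.4/2.5 only accept "except ValueError, exc:"
        print(exc)

    data = b"raw bytes"              # b"" literals appeared in 2.6
    msg = "{0} arrays".format(3)     # str.format() appeared in 2.6

The first two are outright SyntaxErrors on 2.5; the last one fails at runtime.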
-Brad On Thursday, December 13, 2012 at 9:03 AM, Skipper Seabold wrote: > On Thu, Dec 13, 2012 at 12:00 PM, David Cournapeau wrote: > > > I would even go as far as dropping 2.5 as well then (RHEL 6 > > uses python 2.6). > > +1 From charlesr.harris at gmail.com Thu Dec 13 12:36:56 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 13 Dec 2012 10:36:56 -0700 Subject: [Numpy-discussion] Proposal to drop python 2.4 support in numpy 1.8 In-Reply-To: References: Message-ID: On Thu, Dec 13, 2012 at 10:12 AM, Bradley M. Froehle wrote: > Targeting >= 2.6 would be preferable to me. Several other packages > including IPython, support only Python >= 2.6, >= 3.2. > > This change would help me from accidentally writing Python syntax which is > allowable in 2.6 & 2.7 (but not in 2.4 or 2.5). > > Compiling a newer Python interpreter isn't very hard? probably about as > difficult as installing NumPy. > > -Brad > > > On Thursday, December 13, 2012 at 9:03 AM, Skipper Seabold wrote: > > > On Thu, Dec 13, 2012 at 12:00 PM, David Cournapeau cournape at gmail.com)> wrote: > > > > > I would even go as far as dropping 2.5 as well then (RHEL 6 > > > uses python 2.6). > > > > +1 > > OK. Dropping support for python 2.4 looks like the majority opinion. I'll put up another post for 2.5. Do we need to coordinate with scipy? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Thu Dec 13 12:38:39 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 13 Dec 2012 10:38:39 -0700 Subject: [Numpy-discussion] Support for python 2.4 dropped. Should we drop 2.5 also? Message-ID: The previous proposal to drop python 2.4 support garnered no opposition. How about dropping support for python 2.5 also? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From jniehof at lanl.gov Thu Dec 13 12:39:29 2012 From: jniehof at lanl.gov (Jonathan T. Niehof) Date: Thu, 13 Dec 2012 10:39:29 -0700 Subject: [Numpy-discussion] Proposal to drop python 2.4 support in numpy 1.8 In-Reply-To: References: Message-ID: <50CA12D1.6030406@lanl.gov> On 12/13/2012 09:39 AM, Benjamin Root wrote: > As a point of reference, python 2.4 is on RH5/CentOS5. While RH6 is the > current version, there are still enterprises that are using version 5. > Of course, at this point, one really should be working on a migration > plan and shouldn't be doing new development on those machines... FWIW this RHEL5* shop uses a local install of 2.6 rather than dealing with 2.4. Happy with dropping 2.4 support. Bonus from dropping 2.5: it's pretty easy to support a 2.6/3.x combined codebase without relying on 2to3. (*Not the same as RH5, from days of yore...) -- Jonathan Niehof ISR-3 Space Data Systems Los Alamos National Laboratory MS-D466 Los Alamos, NM 87545 Phone: 505-667-9595 email: jniehof at lanl.gov Correspondence / Technical data or Software Publicly Available From shish at keba.be Thu Dec 13 12:46:00 2012 From: shish at keba.be (Olivier Delalleau) Date: Thu, 13 Dec 2012 12:46:00 -0500 Subject: [Numpy-discussion] Support for python 2.4 dropped. Should we drop 2.5 also? In-Reply-To: References: Message-ID: I'd say it's a good idea, although I hope 1.7.x will still be maintained for a while for those who are still stuck with Python 2.4-5 (sometimes you don't have a choice). -=- Olivier 2012/12/13 Charles R Harris > The previous proposal to drop python 2.4 support garnered no opposition. > How about dropping support for python 2.5 also? 
> > Chuck > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From d.s.seljebotn at astro.uio.no Thu Dec 13 13:04:23 2012 From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn) Date: Thu, 13 Dec 2012 19:04:23 +0100 Subject: [Numpy-discussion] site.cfg: Custom BLAS / LAPACK configuration In-Reply-To: <50C23C52.3030204@astro.uio.no> References: <20121206132908.769707dd@poetzsch.nat.uni-magdeburg.de> <7C5D3880ABAC4B1AB19214CDCBA2227A@gmail.com> <50C23C52.3030204@astro.uio.no> Message-ID: <50CA18A7.3030703@astro.uio.no> On 12/07/2012 07:58 PM, Dag Sverre Seljebotn wrote: > One way of fixing this I'm sort of itching to do is to create a > "pylapack" project which can iterate quickly on these build issues, > run-time selection of LAPACK backend and so on. (With some templates > generating some Cython code it shouldn't be more than a few days for an > MVP.) > > Then patch NumPy to attempt to import pylapack (and if it's there, get a > table of function pointers from it). > > The idea would be that powerusers could more easily build pylapack the > way they wanted, in isolation, and then have other things switch to that > without a rebuild (note again that it would need to export a table of > function pointers). > > But I'm itching to do too many things, we'll see. Update: This can be tackled as part of Hashdist funding, so I'm hoping that I can 3-4 days on improving the LAPACK situation in January; as an option for power-users at first, hopefully something nice for everybody eventually. Opinions on the plan above welcome. Dag Sverre > > Dag Sverre > > On 12/07/2012 07:09 PM, Bradley M. Froehle wrote: >> Aha, thanks for the clarification. I've always been surpassed that NumPy doesn't ship with a copy of CBLAS. It's easy to compile --- just a thin wrapper over BLAS, if I remember correctly. >> >> -Brad >> >> >> On Friday, December 7, 2012 at 4:01 AM, David Cournapeau wrote: >> >>> On Thu, Dec 6, 2012 at 7:35 PM, Bradley M. Froehle >>> wrote: >>>> Right, but if I link to libcblas, cblas would be available, no? >>> >>> >>> >>> No, because we don't explicitly check for CBLAS. We assume it is there >>> if Atlas, Accelerate or MKL is found. >>> >>> cheers, >>> David >> >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From ben.root at ou.edu Thu Dec 13 13:12:23 2012 From: ben.root at ou.edu (Benjamin Root) Date: Thu, 13 Dec 2012 13:12:23 -0500 Subject: [Numpy-discussion] Support for python 2.4 dropped. Should we drop 2.5 also? In-Reply-To: References: Message-ID: On Thu, Dec 13, 2012 at 12:38 PM, Charles R Harris < charlesr.harris at gmail.com> wrote: > The previous proposal to drop python 2.4 support garnered no opposition. > How about dropping support for python 2.5 also? > > Chuck > > matplotlib 1.2 supports py2.5. I haven't seen any plan to move off of that for 1.3. Is there a compelling reason for dropping 2.5? Ben Root -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ben.root at ou.edu Thu Dec 13 13:14:54 2012 From: ben.root at ou.edu (Benjamin Root) Date: Thu, 13 Dec 2012 13:14:54 -0500 Subject: [Numpy-discussion] Support for python 2.4 dropped. Should we drop 2.5 also? In-Reply-To: References: Message-ID: My apologies... we support 2.6 and above. +1 on dropping 2.5 support. Ben On Thu, Dec 13, 2012 at 1:12 PM, Benjamin Root wrote: > On Thu, Dec 13, 2012 at 12:38 PM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> The previous proposal to drop python 2.4 support garnered no opposition. >> How about dropping support for python 2.5 also? >> >> Chuck >> >> > matplotlib 1.2 supports py2.5. I haven't seen any plan to move off of > that for 1.3. Is there a compelling reason for dropping 2.5? > > Ben Root > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From cournape at gmail.com Thu Dec 13 13:57:42 2012 From: cournape at gmail.com (David Cournapeau) Date: Thu, 13 Dec 2012 19:57:42 +0100 Subject: [Numpy-discussion] Support for python 2.4 dropped. Should we drop 2.5 also? In-Reply-To: References: Message-ID: On Thu, Dec 13, 2012 at 7:12 PM, Benjamin Root wrote: > On Thu, Dec 13, 2012 at 12:38 PM, Charles R Harris > wrote: >> >> The previous proposal to drop python 2.4 support garnered no opposition. >> How about dropping support for python 2.5 also? >> >> Chuck >> > > matplotlib 1.2 supports py2.5. I haven't seen any plan to move off of that > for 1.3. Is there a compelling reason for dropping 2.5? A rationale for the record: I don't think people who don't care about 2.4 care about 2.5, and 2.6 is a significant improvement compared to 2.5: - context manager - python 3-compatible exception syntax (writing code that works with 2 and 3 without any change is significantly easier if your baseline in 2.6 instead of 2.4/2.5) - json, ast, multiprocessing are available and potentially quite useful for NumPy itself. cheers, David From cournape at gmail.com Thu Dec 13 13:59:34 2012 From: cournape at gmail.com (David Cournapeau) Date: Thu, 13 Dec 2012 19:59:34 +0100 Subject: [Numpy-discussion] site.cfg: Custom BLAS / LAPACK configuration In-Reply-To: <50C23C52.3030204@astro.uio.no> References: <20121206132908.769707dd@poetzsch.nat.uni-magdeburg.de> <7C5D3880ABAC4B1AB19214CDCBA2227A@gmail.com> <50C23C52.3030204@astro.uio.no> Message-ID: On Fri, Dec 7, 2012 at 7:58 PM, Dag Sverre Seljebotn wrote: > One way of fixing this I'm sort of itching to do is to create a > "pylapack" project which can iterate quickly on these build issues, > run-time selection of LAPACK backend and so on. (With some templates > generating some Cython code it shouldn't be more than a few days for an > MVP.) > > Then patch NumPy to attempt to import pylapack (and if it's there, get a > table of function pointers from it). > > The idea would be that powerusers could more easily build pylapack the > way they wanted, in isolation, and then have other things switch to that > without a rebuild (note again that it would need to export a table of > function pointers). > > But I'm itching to do too many things, we'll see. It would be hard to support in a cross platform way I think (windows being the elephant in the room). 
I would be more than happy to be proven wrong, though :) cheers, David From d.s.seljebotn at astro.uio.no Thu Dec 13 14:10:28 2012 From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn) Date: Thu, 13 Dec 2012 20:10:28 +0100 Subject: [Numpy-discussion] site.cfg: Custom BLAS / LAPACK configuration In-Reply-To: References: <20121206132908.769707dd@poetzsch.nat.uni-magdeburg.de> <7C5D3880ABAC4B1AB19214CDCBA2227A@gmail.com> <50C23C52.3030204@astro.uio.no> Message-ID: <50CA2824.1050200@astro.uio.no> On 12/13/2012 07:59 PM, David Cournapeau wrote: > On Fri, Dec 7, 2012 at 7:58 PM, Dag Sverre Seljebotn > wrote: >> One way of fixing this I'm sort of itching to do is to create a >> "pylapack" project which can iterate quickly on these build issues, >> run-time selection of LAPACK backend and so on. (With some templates >> generating some Cython code it shouldn't be more than a few days for an >> MVP.) >> >> Then patch NumPy to attempt to import pylapack (and if it's there, get a >> table of function pointers from it). >> >> The idea would be that powerusers could more easily build pylapack the >> way they wanted, in isolation, and then have other things switch to that >> without a rebuild (note again that it would need to export a table of >> function pointers). >> >> But I'm itching to do too many things, we'll see. > > It would be hard to support in a cross platform way I think (windows > being the elephant in the room). I would be more than happy to be > proven wrong, though :) Right, perhaps I should have said "Linux users" rather than "power users" :-) But seriously, I'd like to have a few more details here. Is it something about calling conventions and "libc" and so on that makes the function-pointer-table approach difficult? (But the NumPy C API is exported in the same way?) Or is it about the build? Can't one just lift whatever numpy.distutils or the NumPy Bento build does for detection? Though what I'll aim for at first is the complete opposite, no auto-detection, but a Windows user should also be able to specify the right flags manually... Dag Sverre From philip at semanchuk.com Thu Dec 13 14:26:23 2012 From: philip at semanchuk.com (Philip Semanchuk) Date: Thu, 13 Dec 2012 14:26:23 -0500 Subject: [Numpy-discussion] Calling LAPACK function dbdsqr()? Message-ID: <9854A6C5-8458-4158-A999-42EA363D1A51@semanchuk.com> Hi all, I'm porting some Fortran code that makes use of a number of BLAS and LAPACK functions, including dbdsqr(). I've found all of the functions I need via scipy.linalg.lapack.get_lapack_funcs/get_blas_funcs() except for dbdsqr(). I see that the numpy source code (I looked at numpy-1.6.0b2) contains dbdsqr() in numpy/linalg/dlapack_lite.c, but grep can't find that string in the binary distribution on my Mac nor on Linux. If it's buried in a numpy binary somewhere, I'm comfortable with using ctypes to call it, but I suspect it isn't. Can anyone point me to a cross-platform (OS X, Linux & Windows) way I can call this function? I'm unfortunately quite na?ve about the math in the code I'm porting, so I'm porting the code blindly -- if you ask me what problem I'm trying to solve with dbdsqr(), I won't be able to explain. Thanks in advance for any suggestions, Philip From charlesr.harris at gmail.com Thu Dec 13 14:47:30 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 13 Dec 2012 12:47:30 -0700 Subject: [Numpy-discussion] Calling LAPACK function dbdsqr()? 
In-Reply-To: <9854A6C5-8458-4158-A999-42EA363D1A51@semanchuk.com> References: <9854A6C5-8458-4158-A999-42EA363D1A51@semanchuk.com> Message-ID: On Thu, Dec 13, 2012 at 12:26 PM, Philip Semanchuk wrote: > Hi all, > I'm porting some Fortran code that makes use of a number of BLAS and > LAPACK functions, including dbdsqr(). I've found all of the functions I > need via scipy.linalg.lapack.get_lapack_funcs/get_blas_funcs() except for > dbdsqr(). > > I see that the numpy source code (I looked at numpy-1.6.0b2) contains > dbdsqr() in numpy/linalg/dlapack_lite.c, but grep can't find that string in > the binary distribution on my Mac nor on Linux. If it's buried in a numpy > binary somewhere, I'm comfortable with using ctypes to call it, but I > suspect it isn't. > > Can anyone point me to a cross-platform (OS X, Linux & Windows) way I can > call this function? > > I'm unfortunately quite na?ve about the math in the code I'm porting, so > I'm porting the code blindly -- if you ask me what problem I'm trying to > solve with dbdsqr(), I won't be able to explain. > > Not all the functions in lapack_lite are exposed in numpy. You might want to post on the scipy list, there have been recent additions supporting more lapack functions. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Thu Dec 13 15:03:56 2012 From: pav at iki.fi (Pauli Virtanen) Date: Thu, 13 Dec 2012 22:03:56 +0200 Subject: [Numpy-discussion] Calling LAPACK function dbdsqr()? In-Reply-To: <9854A6C5-8458-4158-A999-42EA363D1A51@semanchuk.com> References: <9854A6C5-8458-4158-A999-42EA363D1A51@semanchuk.com> Message-ID: 13.12.2012 21:26, Philip Semanchuk kirjoitti: > I'm porting some Fortran code that makes use of a number of BLAS and > LAPACK functions, including dbdsqr(). I've found all of the functions > I need via scipy.linalg.lapack.get_lapack_funcs/get_blas_funcs() except > for dbdsqr(). [clip] If you tolerate having compiled code, you can relatively easily wrap the LAPACK routine you need using f2py. -- Pauli Virtanen From ralf.gommers at gmail.com Thu Dec 13 15:17:49 2012 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Thu, 13 Dec 2012 21:17:49 +0100 Subject: [Numpy-discussion] Proposal to drop python 2.4 support in numpy 1.8 In-Reply-To: References: Message-ID: On Thu, Dec 13, 2012 at 6:36 PM, Charles R Harris wrote: > > > On Thu, Dec 13, 2012 at 10:12 AM, Bradley M. Froehle < > brad.froehle at gmail.com> wrote: > >> Targeting >= 2.6 would be preferable to me. Several other packages >> including IPython, support only Python >= 2.6, >= 3.2. >> >> This change would help me from accidentally writing Python syntax which >> is allowable in 2.6 & 2.7 (but not in 2.4 or 2.5). >> >> Compiling a newer Python interpreter isn't very hard? probably about as >> difficult as installing NumPy. >> >> -Brad >> >> >> On Thursday, December 13, 2012 at 9:03 AM, Skipper Seabold wrote: >> >> > On Thu, Dec 13, 2012 at 12:00 PM, David Cournapeau > cournape at gmail.com)> wrote: >> > >> > > I would even go as far as dropping 2.5 as well then (RHEL 6 >> > > uses python 2.6). >> > >> > +1 >> > +1 > > OK. Dropping support for python 2.4 looks like the majority opinion. I'll > put up another post for 2.5. Do we need to coordinate with scipy? > Not much to coordinate I think. I'll send a message to scipy-dev proposing to simply follow the Numpy decision. Ralf -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From philip at semanchuk.com Thu Dec 13 15:34:08 2012 From: philip at semanchuk.com (Philip Semanchuk) Date: Thu, 13 Dec 2012 15:34:08 -0500 Subject: [Numpy-discussion] Calling LAPACK function dbdsqr()? In-Reply-To: References: <9854A6C5-8458-4158-A999-42EA363D1A51@semanchuk.com> Message-ID: <19E5F883-2F88-47E1-9860-0D389C3F2110@semanchuk.com> On Dec 13, 2012, at 3:03 PM, Pauli Virtanen wrote: > 13.12.2012 21:26, Philip Semanchuk kirjoitti: >> I'm porting some Fortran code that makes use of a number of BLAS and >> LAPACK functions, including dbdsqr(). I've found all of the functions >> I need via scipy.linalg.lapack.get_lapack_funcs/get_blas_funcs() except >> for dbdsqr(). > [clip] > > If you tolerate having compiled code, you can relatively easily wrap the > LAPACK routine you need using f2py. Hi Pauli, THanks for the tip. We already have access to a compiled version, but maintaining that across 3 platforms and 32/64-bit is a headache, which is why we're aiming for pure Python plus numpy/scipy. bye Philip From philip at semanchuk.com Thu Dec 13 15:40:34 2012 From: philip at semanchuk.com (Philip Semanchuk) Date: Thu, 13 Dec 2012 15:40:34 -0500 Subject: [Numpy-discussion] Calling LAPACK function dbdsqr()? In-Reply-To: References: <9854A6C5-8458-4158-A999-42EA363D1A51@semanchuk.com> Message-ID: <605A5ECB-FD91-4101-B5D8-8AE225866870@semanchuk.com> On Dec 13, 2012, at 2:47 PM, Charles R Harris wrote: > On Thu, Dec 13, 2012 at 12:26 PM, Philip Semanchuk wrote: > >> Hi all, >> I'm porting some Fortran code that makes use of a number of BLAS and >> LAPACK functions, including dbdsqr(). I've found all of the functions I >> need via scipy.linalg.lapack.get_lapack_funcs/get_blas_funcs() except for >> dbdsqr(). >> >> I see that the numpy source code (I looked at numpy-1.6.0b2) contains >> dbdsqr() in numpy/linalg/dlapack_lite.c, but grep can't find that string in >> the binary distribution on my Mac nor on Linux. If it's buried in a numpy >> binary somewhere, I'm comfortable with using ctypes to call it, but I >> suspect it isn't. >> >> Can anyone point me to a cross-platform (OS X, Linux & Windows) way I can >> call this function? >> >> I'm unfortunately quite na?ve about the math in the code I'm porting, so >> I'm porting the code blindly -- if you ask me what problem I'm trying to >> solve with dbdsqr(), I won't be able to explain. >> >> > Not all the functions in lapack_lite are exposed in numpy. You might want > to post on the scipy list, there have been recent additions supporting more > lapack functions. Thanks, I'll check scipy. In numpy, I can see that lapack_litemodule.c is the layer that exposes select lapack functions via the Python interface, and dbdsqr isn't present in that file which is why I figured I would have to use ctypes to call it. I just can't figure out why dbdsqr appears in the source code but neither grep nor nm find any reference to it in the compiled binary. Cheers Philip From chris.barker at noaa.gov Thu Dec 13 16:07:56 2012 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Thu, 13 Dec 2012 13:07:56 -0800 Subject: [Numpy-discussion] Support for python 2.4 dropped. Should we drop 2.5 also? In-Reply-To: References: Message-ID: >>> How about dropping support for python 2.5 also? I"m still dumfounded that people are working on projects where they are free to use the latest an greatest numpy, but *have* to use a more-than-four-year-old-python: """ Python 2.6 (final) was released on October 1st, 2008. """ so +1 on moving forward! 
-Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From andrew.collette at gmail.com Thu Dec 13 16:57:23 2012 From: andrew.collette at gmail.com (Andrew Collette) Date: Thu, 13 Dec 2012 14:57:23 -0700 Subject: [Numpy-discussion] ValueError: low level cast function is for unequal type numbers In-Reply-To: <50C1A22D.7050209@uci.edu> References: <50C1A22D.7050209@uci.edu> Message-ID: Hi, > the following code using np.object_ data types works with numpy 1.5.1 > but fails with 1.6.2. Is this intended or a regression? Other data > types, np.float64 for example, seem to work. I am also seeing this problem; there was a change to how string types are handled in h5py 2.1.0 which triggers this bug. It's a serious inconvenience, as people can't do e.g. np.copy() any more. > These downstream issues could be related: > > http://code.google.com/p/h5py/issues/detail?id=217 Yes, this seems to be the cause of issue 217. Andrew From cournape at gmail.com Thu Dec 13 17:02:25 2012 From: cournape at gmail.com (David Cournapeau) Date: Thu, 13 Dec 2012 23:02:25 +0100 Subject: [Numpy-discussion] Support for python 2.4 dropped. Should we drop 2.5 also? In-Reply-To: References: Message-ID: On Thu, Dec 13, 2012 at 10:07 PM, Chris Barker - NOAA Federal wrote: >>>> How about dropping support for python 2.5 also? > > I"m still dumfounded that people are working on projects where they > are free to use the latest an greatest numpy, but *have* to use a > more-than-four-year-old-python: It happens very easily in corporate environments. Compiling python it a major headache compared to numpy, not because of python itself, but because you need to recompile every single extension you're gonna use. Compiling numpy + its dependencies is at least feasable, but building things like pygtk when you don't have X11 headers installed is more or less impossible. David From cgohlke at uci.edu Thu Dec 13 17:02:45 2012 From: cgohlke at uci.edu (Christoph Gohlke) Date: Thu, 13 Dec 2012 14:02:45 -0800 Subject: [Numpy-discussion] ValueError: low level cast function is for unequal type numbers In-Reply-To: References: <50C1A22D.7050209@uci.edu> Message-ID: <50CA5085.5080606@uci.edu> On 12/13/2012 1:57 PM, Andrew Collette wrote: > Hi, > >> the following code using np.object_ data types works with numpy 1.5.1 >> but fails with 1.6.2. Is this intended or a regression? Other data >> types, np.float64 for example, seem to work. > > I am also seeing this problem; there was a change to how string types > are handled in h5py 2.1.0 which triggers this bug. It's a serious > inconvenience, as people can't do e.g. np.copy() > any more. > >> These downstream issues could be related: >> >> http://code.google.com/p/h5py/issues/detail?id=217 > > Yes, this seems to be the cause of issue 217. > > Andrew Sorry, I forgot to mention that I submitted a pull request at . Christoph From brad.froehle at gmail.com Thu Dec 13 18:01:38 2012 From: brad.froehle at gmail.com (Bradley M. Froehle) Date: Thu, 13 Dec 2012 15:01:38 -0800 Subject: [Numpy-discussion] Support for python 2.4 dropped. Should we drop 2.5 also? In-Reply-To: References: Message-ID: Yes, but the point was that since you can live with an older version on Python you can probably live with an older version of NumPy. 
On Thursday, December 13, 2012, David Cournapeau wrote: > > > I"m still dumfounded that people are working on projects where they > > are free to use the latest an greatest numpy, but *have* to use a > > more-than-four-year-old-python: > > It happens very easily in corporate environments. Compiling python it > a major headache compared to numpy, not because of python itself, but > because you need to recompile every single extension you're gonna use. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ondrej.certik at gmail.com Thu Dec 13 19:39:12 2012 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Thu, 13 Dec 2012 16:39:12 -0800 Subject: [Numpy-discussion] Travis failures with no errors Message-ID: Hi, I found these recent weird "failures" in Travis, but I can't find any problem with the log and all tests pass. Any ideas what is going on? https://travis-ci.org/numpy/numpy/jobs/3570123 https://travis-ci.org/numpy/numpy/jobs/3539549 https://travis-ci.org/numpy/numpy/jobs/3369629 Ondrej From chris.barker at noaa.gov Thu Dec 13 19:41:28 2012 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Thu, 13 Dec 2012 16:41:28 -0800 Subject: [Numpy-discussion] Support for python 2.4 dropped. Should we drop 2.5 also? In-Reply-To: References: Message-ID: On Thu, Dec 13, 2012 at 3:01 PM, Bradley M. Froehle wrote: > Yes, but the point was that since you can live with an older version on > Python you can probably live with an older version of NumPy. exactly -- also: How likely are you to nee the latest and greatest numpy but not a new PyGTK, or a new name_your_package_here. And, in fact, other packages drop support for older Python's too. However, what I can imagine is pretty irrelevant -- sorry I brought it up -- either there are a significant number of folks for whom support for old Pythons in important, or there aren't. -Chris > > On Thursday, December 13, 2012, David Cournapeau wrote: >> >> > I"m still dumfounded that people are working on projects where they >> > are free to use the latest an greatest numpy, but *have* to use a >> > more-than-four-year-old-python: >> >> It happens very easily in corporate environments. Compiling python it >> a major headache compared to numpy, not because of python itself, but >> because you need to recompile every single extension you're gonna use. > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Christopher Barker, Ph.D. 
Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From ondrej.certik at gmail.com Thu Dec 13 21:23:40 2012 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Thu, 13 Dec 2012 18:23:40 -0800 Subject: [Numpy-discussion] Failure in test_iterator.py at Travis Message-ID: Hi, Another weird bug sometimes happen in numpy/core/tests/test_iterator.py, it looks like this: ====================================================================== FAIL: test_iterator.test_iter_array_cast ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/travis/virtualenv/python2.5/lib/python2.5/site-packages/nose/case.py", line 197, in runTest self.test(*self.arg) File "/home/travis/virtualenv/python2.5/lib/python2.5/site-packages/numpy/core/tests/test_iterator.py", line 836, in test_iter_array_cast assert_equal(i.operands[0].strides, (-96,8,-32)) File "/home/travis/virtualenv/python2.5/lib/python2.5/site-packages/numpy/testing/utils.py", line 252, in assert_equal assert_equal(actual[k], desired[k], 'item=%r\n%s' % (k,err_msg), verbose) File "/home/travis/virtualenv/python2.5/lib/python2.5/site-packages/numpy/testing/utils.py", line 314, in assert_equal raise AssertionError(msg) AssertionError: Items are not equal: item=0 ACTUAL: 96 DESIRED: -96 ---------------------------------------------------------------------- But the problem is that there is no numpy/core/tests/test_iterator.py file in current branches.... This error was triggered for example by these PRs: https://github.com/numpy/numpy/pull/2765 https://github.com/numpy/numpy/pull/2815 and here are links to the failing Travis tests: https://travis-ci.org/certik/numpy/builds/3656959 https://travis-ci.org/numpy/numpy/builds/3330234 Any idea what is happening here and how to fix it? 
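(My best guess, which I haven't verified yet, is that the virtualenv is picking up a stale numpy instead of the freshly built one -- something as simple as this, run inside the Travis job, should tell us:

    import numpy
    print(numpy.__version__)
    print(numpy.__file__)    # should point at the just-installed build, not a leftover older install

)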
Ondrej From charlesr.harris at gmail.com Thu Dec 13 22:04:59 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 13 Dec 2012 20:04:59 -0700 Subject: [Numpy-discussion] Failure in test_iterator.py at Travis In-Reply-To: References: Message-ID: On Thu, Dec 13, 2012 at 7:23 PM, Ond?ej ?ert?k wrote: > Hi, > > Another weird bug sometimes happen in > numpy/core/tests/test_iterator.py, it looks like this: > > ====================================================================== > FAIL: test_iterator.test_iter_array_cast > ---------------------------------------------------------------------- > Traceback (most recent call last): > File > "/home/travis/virtualenv/python2.5/lib/python2.5/site-packages/nose/case.py", > line 197, in runTest > self.test(*self.arg) > File > "/home/travis/virtualenv/python2.5/lib/python2.5/site-packages/numpy/core/tests/test_iterator.py", > line 836, in test_iter_array_cast > assert_equal(i.operands[0].strides, (-96,8,-32)) > File > "/home/travis/virtualenv/python2.5/lib/python2.5/site-packages/numpy/testing/utils.py", > line 252, in assert_equal > assert_equal(actual[k], desired[k], 'item=%r\n%s' % (k,err_msg), > verbose) > File > "/home/travis/virtualenv/python2.5/lib/python2.5/site-packages/numpy/testing/utils.py", > line 314, in assert_equal > raise AssertionError(msg) > AssertionError: > Items are not equal: > item=0 > > ACTUAL: 96 > DESIRED: -96 > > ---------------------------------------------------------------------- > > > But the problem is that there is no numpy/core/tests/test_iterator.py > file in current branches.... This error was triggered for example by > these PRs: > > https://github.com/numpy/numpy/pull/2765 > https://github.com/numpy/numpy/pull/2815 > > and here are links to the failing Travis tests: > > https://travis-ci.org/certik/numpy/builds/3656959 > https://travis-ci.org/numpy/numpy/builds/3330234 > > > Any idea what is happening here and how to fix it? > > That should have been fixed by Nathaniel's travis fix. Hmm... Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From charlesr.harris at gmail.com Thu Dec 13 22:16:55 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 13 Dec 2012 20:16:55 -0700 Subject: [Numpy-discussion] Failure in test_iterator.py at Travis In-Reply-To: References: Message-ID: On Thu, Dec 13, 2012 at 8:04 PM, Charles R Harris wrote: > > > On Thu, Dec 13, 2012 at 7:23 PM, Ond?ej ?ert?k wrote: > >> Hi, >> >> Another weird bug sometimes happen in >> numpy/core/tests/test_iterator.py, it looks like this: >> >> ====================================================================== >> FAIL: test_iterator.test_iter_array_cast >> ---------------------------------------------------------------------- >> Traceback (most recent call last): >> File >> "/home/travis/virtualenv/python2.5/lib/python2.5/site-packages/nose/case.py", >> line 197, in runTest >> self.test(*self.arg) >> File >> "/home/travis/virtualenv/python2.5/lib/python2.5/site-packages/numpy/core/tests/test_iterator.py", >> line 836, in test_iter_array_cast >> assert_equal(i.operands[0].strides, (-96,8,-32)) >> File >> "/home/travis/virtualenv/python2.5/lib/python2.5/site-packages/numpy/testing/utils.py", >> line 252, in assert_equal >> assert_equal(actual[k], desired[k], 'item=%r\n%s' % (k,err_msg), >> verbose) >> File >> "/home/travis/virtualenv/python2.5/lib/python2.5/site-packages/numpy/testing/utils.py", >> line 314, in assert_equal >> raise AssertionError(msg) >> AssertionError: >> Items are not equal: >> item=0 >> >> ACTUAL: 96 >> DESIRED: -96 >> >> ---------------------------------------------------------------------- >> >> >> But the problem is that there is no numpy/core/tests/test_iterator.py >> file in current branches.... This error was triggered for example by >> these PRs: >> >> https://github.com/numpy/numpy/pull/2765 >> https://github.com/numpy/numpy/pull/2815 >> >> and here are links to the failing Travis tests: >> >> https://travis-ci.org/certik/numpy/builds/3656959 >> https://travis-ci.org/numpy/numpy/builds/3330234 >> >> >> Any idea what is happening here and how to fix it? >> >> > That should have been fixed by Nathaniel's travis fix. Hmm... > > And a quick check here shows pip not removing a previous 1.6.2 install. Maybe it is a pip version problem, pip 1.0.2 here. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ondrej.certik at gmail.com Thu Dec 13 22:49:20 2012 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Thu, 13 Dec 2012 19:49:20 -0800 Subject: [Numpy-discussion] Failure in test_iterator.py at Travis In-Reply-To: References: Message-ID: On Thu, Dec 13, 2012 at 7:16 PM, Charles R Harris wrote: > > > On Thu, Dec 13, 2012 at 8:04 PM, Charles R Harris > wrote: >> >> >> >> On Thu, Dec 13, 2012 at 7:23 PM, Ond?ej ?ert?k >> wrote: >>> >>> Hi, >>> >>> Another weird bug sometimes happen in >>> numpy/core/tests/test_iterator.py, it looks like this: >>> >>> ====================================================================== >>> FAIL: test_iterator.test_iter_array_cast >>> ---------------------------------------------------------------------- >>> Traceback (most recent call last): >>> File >>> "/home/travis/virtualenv/python2.5/lib/python2.5/site-packages/nose/case.py", >>> line 197, in runTest >>> self.test(*self.arg) >>> File >>> "/home/travis/virtualenv/python2.5/lib/python2.5/site-packages/numpy/core/tests/test_iterator.py", >>> line 836, in test_iter_array_cast >>> assert_equal(i.operands[0].strides, (-96,8,-32)) >>> File >>> "/home/travis/virtualenv/python2.5/lib/python2.5/site-packages/numpy/testing/utils.py", >>> line 252, in assert_equal >>> assert_equal(actual[k], desired[k], 'item=%r\n%s' % (k,err_msg), >>> verbose) >>> File >>> "/home/travis/virtualenv/python2.5/lib/python2.5/site-packages/numpy/testing/utils.py", >>> line 314, in assert_equal >>> raise AssertionError(msg) >>> AssertionError: >>> Items are not equal: >>> item=0 >>> >>> ACTUAL: 96 >>> DESIRED: -96 >>> >>> ---------------------------------------------------------------------- >>> >>> >>> But the problem is that there is no numpy/core/tests/test_iterator.py >>> file in current branches.... This error was triggered for example by >>> these PRs: >>> >>> https://github.com/numpy/numpy/pull/2765 >>> https://github.com/numpy/numpy/pull/2815 >>> >>> and here are links to the failing Travis tests: >>> >>> https://travis-ci.org/certik/numpy/builds/3656959 >>> https://travis-ci.org/numpy/numpy/builds/3330234 >>> >>> >>> Any idea what is happening here and how to fix it? >>> >> >> That should have been fixed by Nathaniel's travis fix. Hmm... >> > > And a quick check here shows pip not removing a previous 1.6.2 install. > Maybe it is a pip version problem, pip 1.0.2 here. That's what I thought as well. I think we need to update the Travis script to manually remove the installed numpy. Btw, Travis seems to be using pip 1.2.1, at least according to: https://travis-ci.org/numpy/numpy/jobs/3330236 Ondrej From shish at keba.be Thu Dec 13 23:00:48 2012 From: shish at keba.be (Olivier Delalleau) Date: Thu, 13 Dec 2012 23:00:48 -0500 Subject: [Numpy-discussion] Support for python 2.4 dropped. Should we drop 2.5 also? In-Reply-To: References: Message-ID: 2012/12/13 Chris Barker - NOAA Federal > On Thu, Dec 13, 2012 at 3:01 PM, Bradley M. Froehle > wrote: > > Yes, but the point was that since you can live with an older version on > > Python you can probably live with an older version of NumPy. > > exactly -- also: > > How likely are you to nee the latest and greatest numpy but not a new > PyGTK, or a new name_your_package_here. And, in fact, other packages > drop support for older Python's too. > > However, what I can imagine is pretty irrelevant -- sorry I brought it > up -- either there are a significant number of folks for whom support > for old Pythons in important, or there aren't. 
> I doubt it's a common situation, but just to give an example: I am developing some machine learning code that heavily relies on Numpy, and it is meant to run into a large Python 2.4 software environment, which can't easily be upgraded because it contains lots of libraries that have been built against Python 2.4. And even if I could rebuild it, they wouldn't let me ;) This Python code is mostly proprietary and doesn't require external dependencies to be upgraded... except my little module that may take advantage of Numpy improvements. -=- Olivier -------------- next part -------------- An HTML attachment was scrubbed... URL: From raul at virtualmaterials.com Thu Dec 13 23:14:30 2012 From: raul at virtualmaterials.com (Raul Cota) Date: Thu, 13 Dec 2012 21:14:30 -0700 Subject: [Numpy-discussion] Proposal to drop python 2.4 support in numpy 1.8 In-Reply-To: References: Message-ID: <50CAA7A6.6030208@virtualmaterials.com> +1 from me For what is worth, we are just moving forward from Python 2.2 / Numeric and are going to 2.6 and it has been rather painful because of the several little details of extensions and other subtleties. I believe we will settle there for a while. For companies like ours, it is a big problem to upgrade versions. There is always this or that hiccup that "works great" in a version but not so much in another and we also have all sorts of extensions. Raul On 13/12/2012 9:34 AM, Charles R Harris wrote: > Time to raise this topic again. Opinions welcome. > > Chuck > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From andrew.collette at gmail.com Fri Dec 14 00:37:20 2012 From: andrew.collette at gmail.com (Andrew Collette) Date: Thu, 13 Dec 2012 22:37:20 -0700 Subject: [Numpy-discussion] Attaching metadata to dtypes: what's the best way? Message-ID: Hi all, I have a question for the list sparked by this discussion of a bug in NumPy 1.6.2 and 1.7: http://mail.scipy.org/pipermail/numpy-discussion/2012-December/064682.html and this open issue in h5py: https://code.google.com/p/h5py/issues/detail?id=217 In h5py we need to represent variable length strings and HDF5 object references within the existing NumPy dtype system. The way this is handled at the moment is with object (type "O") dtypes with a small amount of metadata attached; in other words, an "O" array could have a dtype marked as representing variable-length strings, and HDF5 would convert the Python string objects into the corresponding type in the HDF5 file. Likewise, an "O" dtype marked as containing HDF5 object references (h5py.Reference instances) would be converted to native HDF5 references when written. The trouble I'm having is trying to attach metadata to a dtype in such a way that it is preserved in NumPy. Right now I create an "O" dtype with a single field and store the information in the field "description", e.g.: dtype(('O', [( ({'type': bytes},'vlen'), 'O' )] )) This works (it's how special types have worked in h5py for years) but is quite unwieldy, and leads to interesting side effects. For example, because of the single field used, array[index] returns a 1-element NumPy array containing a Python object, instead of the Python object itself. Worse, our fix for this behavior (remove the field when returning data from h5py) triggered the above bug in NumPy. Is there a better way to add metadata to dtypes I'm not aware of? 
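Ideally something as lightweight as the sketch below. I'm assuming here that the undocumented "metadata" argument to np.dtype is actually available and preserved in the versions we support -- I haven't checked how far back it goes or how well it survives copies:

    import numpy as np

    special = np.dtype('O', metadata={'vlen': bytes})
    print(special.metadata)               # the mapping we attached
    arr = np.empty((10,), dtype=special)  # otherwise a plain object array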
Note I'm *not* interested in creating a custom type; one of the advantages of the current system is that people deal with the resulting "O" object arrays like any other object array in NumPy. Andrew Collette From klonuo at gmail.com Fri Dec 14 01:23:08 2012 From: klonuo at gmail.com (klo) Date: Fri, 14 Dec 2012 07:23:08 +0100 Subject: [Numpy-discussion] www.numpy.org home page In-Reply-To: <6E409FEC-FD9F-41F6-BC02-F227EE42F077@gmail.com> References: <6E409FEC-FD9F-41F6-BC02-F227EE42F077@gmail.com> Message-ID: On Thu, 13 Dec 2012 17:35:01 +0100, Travis Oliphant wrote: > The NumPy home page can still be edited in this repository: > git at github.com:numpy/numpy.org.git. Pull requests are always welcome > --- especially pull requests that improve the look and feel of the > web-page. Hi, let me comment on it. 1. Page uses too intense color. Choosing color is sensitive question, but any reasonable suggestion should be better then current one 2. Numpy logo is of low quality: http://www.numpy.org/_static/numpy_logo.png and icon it's cropped carelessly 3. Links icons look messy as they look like patched from different themes - there are scipy shiny icons and then scipy matte icon and then RSS icon which links to blog. IMHO no one uses shiny icons anymore, and I don't know if these icons should have large scipy logo with thematic overlay as is. I'd suggest flat and informative icons if link icons are wanted 4. Favicon could be changed 5. Screenshot(s) could be added, but this is easier to say that suggest one. Maybe shots representing each numpy selected feature, and/or perhaps side links to Packt, as mentioned in your email, or similar, can add some dynamics in page look From sturla at molden.no Fri Dec 14 03:09:02 2012 From: sturla at molden.no (Sturla Molden) Date: Fri, 14 Dec 2012 09:09:02 +0100 Subject: [Numpy-discussion] Proposal to drop python 2.4 support in numpy 1.8 In-Reply-To: <50CAA7A6.6030208@virtualmaterials.com> References: <50CAA7A6.6030208@virtualmaterials.com> Message-ID: <7A79C555-9D79-4878-B123-8EB4DB23A726@molden.no> So when upgrading everything you prefer to keep the bugs in 2.6 that were squashed in 2.7? Who has taught IT managers that older and more buggy versions of software are more "professional" and better for corporate environments? Sturla Den 14. des. 2012 kl. 05:14 skrev Raul Cota : > > +1 from me > > For what is worth, we are just moving forward from Python 2.2 / Numeric > and are going to 2.6 and it has been rather painful because of the > several little details of extensions and other subtleties. I believe we > will settle there for a while. For companies like ours, it is a big > problem to upgrade versions. There is always this or that hiccup that > "works great" in a version but not so much in another and we also have > all sorts of extensions. > > > > Raul > > > > > > On 13/12/2012 9:34 AM, Charles R Harris wrote: >> Time to raise this topic again. Opinions welcome. 
>> >> Chuck >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From njs at pobox.com Fri Dec 14 03:29:39 2012 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 14 Dec 2012 08:29:39 +0000 Subject: [Numpy-discussion] Proposal to drop python 2.4 support in numpy 1.8 In-Reply-To: <50CAA7A6.6030208@virtualmaterials.com> References: <50CAA7A6.6030208@virtualmaterials.com> Message-ID: On 14 Dec 2012 04:14, "Raul Cota" wrote: > > > +1 from me > > For what is worth, we are just moving forward from Python 2.2 / Numeric > and are going to 2.6 and it has been rather painful because of the > several little details of extensions and other subtleties. I believe we > will settle there for a while. For companies like ours, it is a big > problem to upgrade versions. There is always this or that hiccup that > "works great" in a version but not so much in another and we also have > all sorts of extensions. Unfortunately (and I know this is a tradeoff), one consequence of this strategy is that you give up the chance to influence numpy development and avoid those hiccups in the first place. We try to catch things, but there's a *lot* more we can do if a bug gets noticed before it makes it into a final release, or multiple final releases... (This is why 1.7 has been dragging on - people testing the RCs found a number of places that it broke there code, so we're fixing numpy instead of them having to fix their code.) -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Fri Dec 14 03:32:43 2012 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 14 Dec 2012 08:32:43 +0000 Subject: [Numpy-discussion] Failure in test_iterator.py at Travis In-Reply-To: References: Message-ID: I only checked this build: https://secure.travis-ci.org/#!/certik/numpy/jobs/3656960 But that log clearly shows 'python setup.py install' being used instead of 'pip install'. How certain are you that your branch actually has my fix? 
-n On 14 Dec 2012 03:49, "Ond?ej ?ert?k" wrote: > On Thu, Dec 13, 2012 at 7:16 PM, Charles R Harris > wrote: > > > > > > On Thu, Dec 13, 2012 at 8:04 PM, Charles R Harris > > wrote: > >> > >> > >> > >> On Thu, Dec 13, 2012 at 7:23 PM, Ond?ej ?ert?k > > >> wrote: > >>> > >>> Hi, > >>> > >>> Another weird bug sometimes happen in > >>> numpy/core/tests/test_iterator.py, it looks like this: > >>> > >>> ====================================================================== > >>> FAIL: test_iterator.test_iter_array_cast > >>> ---------------------------------------------------------------------- > >>> Traceback (most recent call last): > >>> File > >>> > "/home/travis/virtualenv/python2.5/lib/python2.5/site-packages/nose/case.py", > >>> line 197, in runTest > >>> self.test(*self.arg) > >>> File > >>> > "/home/travis/virtualenv/python2.5/lib/python2.5/site-packages/numpy/core/tests/test_iterator.py", > >>> line 836, in test_iter_array_cast > >>> assert_equal(i.operands[0].strides, (-96,8,-32)) > >>> File > >>> > "/home/travis/virtualenv/python2.5/lib/python2.5/site-packages/numpy/testing/utils.py", > >>> line 252, in assert_equal > >>> assert_equal(actual[k], desired[k], 'item=%r\n%s' % (k,err_msg), > >>> verbose) > >>> File > >>> > "/home/travis/virtualenv/python2.5/lib/python2.5/site-packages/numpy/testing/utils.py", > >>> line 314, in assert_equal > >>> raise AssertionError(msg) > >>> AssertionError: > >>> Items are not equal: > >>> item=0 > >>> > >>> ACTUAL: 96 > >>> DESIRED: -96 > >>> > >>> ---------------------------------------------------------------------- > >>> > >>> > >>> But the problem is that there is no numpy/core/tests/test_iterator.py > >>> file in current branches.... This error was triggered for example by > >>> these PRs: > >>> > >>> https://github.com/numpy/numpy/pull/2765 > >>> https://github.com/numpy/numpy/pull/2815 > >>> > >>> and here are links to the failing Travis tests: > >>> > >>> https://travis-ci.org/certik/numpy/builds/3656959 > >>> https://travis-ci.org/numpy/numpy/builds/3330234 > >>> > >>> > >>> Any idea what is happening here and how to fix it? > >>> > >> > >> That should have been fixed by Nathaniel's travis fix. Hmm... > >> > > > > And a quick check here shows pip not removing a previous 1.6.2 install. > > Maybe it is a pip version problem, pip 1.0.2 here. > > That's what I thought as well. I think we need to update the Travis > script to manually remove the installed numpy. > > Btw, Travis seems to be using pip 1.2.1, at least according to: > > https://travis-ci.org/numpy/numpy/jobs/3330236 > > Ondrej > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sbos.net at gmail.com Fri Dec 14 04:17:23 2012 From: sbos.net at gmail.com (Sergey Bartunov) Date: Fri, 14 Dec 2012 13:17:23 +0400 Subject: [Numpy-discussion] Building numpy with OpenBLAS Message-ID: Hi. I'm trying to build numpy (1.6.2 and master from git) with OpenBLAS on Ubuntu server 11.10. I succeed with this just once and performance boost was really big for me, but unfortunately something went wrong with my application and I had to reinstall numpy. After that I couldn't reproduce this result and even just perform faster than default numpy installation with no external libraries anyhow. Now things went even worse. 
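Before rebuilding yet again, a quick diagnostic sketch like the following can show what a given install actually linked against and how fast its dot really is (output and timings will of course vary by machine and build):

import numpy as np

# Report the BLAS/LAPACK libraries this NumPy build detected at compile time.
np.show_config()

# np.dot is only accelerated when the optional _dotblas extension was built;
# if this import fails, dot uses the slow built-in fallback.
try:
    import numpy.core._dotblas
    print("accelerated dot (_dotblas) is available")
except ImportError:
    print("no _dotblas: np.dot falls back to the unoptimized routine")

# Rough timing of a large matrix product.
import time
a = np.random.rand(2000, 2000)
t0 = time.time()
np.dot(a, a)
print("np.dot on 2000x2000 took %.2f s" % (time.time() - t0))
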
I assume that numpy built with BLAS and LAPACK should do dot operation faster than "clean" installation on relatively large matirces (say 2000 x 2000). Here I don't use OpenBLAS anyway. I install libblas-dev and liblapack-dev by apt-get and after that build numpy from sources / by pip (that doesn't matter for the result). Building tool reports that BLAS and LAPACK are detected on my system, so says "numpy.distutils.system_info" after installation. But matrix multiplication by dot takes the same time as clean installation (12 s vs 0.16 s with OpenBLAS). That's the first thing I'm wondering about. Nevertheless I tried to compile numpy with OpenBLAS only. I have forced this by setting ATLAS="" BLAS=/usr/lib/libopenblas.a LAPACK=/usr/lib/libopenblas.a as I saw somewhere in the internet. I had installed numpy exactly this way at the first time when I was lucky. But now it doesn't work for me. I tryied installing OpenBLAS from sources and as libopenblas-dev ubuntu package. So how can I fix this? Many thanks in advance. From sturla at molden.no Fri Dec 14 07:04:14 2012 From: sturla at molden.no (Sturla Molden) Date: Fri, 14 Dec 2012 13:04:14 +0100 Subject: [Numpy-discussion] Building numpy with OpenBLAS In-Reply-To: References: Message-ID: <50CB15BE.4030605@molden.no> On 14.12.2012 10:17, Sergey Bartunov wrote: > Now things went even worse. I assume that numpy built with BLAS and > LAPACK should do dot operation faster than "clean" installation on > relatively large matirces (say 2000 x 2000). Here I don't use OpenBLAS > anyway. No, _dotblas is only built against ATLAS, MKL or Apple's accelerate framework. So with OpenBLAS you have to call e.g. DGEMM from OpenBLAS yourself instead of using np.dot. > So how can I fix this? Many thanks in advance. You might fix NumPy to build _dotblas against OpenBLAS as well :) Sturla From chaoyuejoy at gmail.com Fri Dec 14 08:57:30 2012 From: chaoyuejoy at gmail.com (Chao YUE) Date: Fri, 14 Dec 2012 14:57:30 +0100 Subject: [Numpy-discussion] np.seterr doesn't work for masked array? Message-ID: Dear all, I tried to capture the zero divide error when I divide a masked array by another. It seems that np.seterr is not working for masked array? when I do np.divide on two masked array, it directly put the zero divides part as being masked. The np.seterr works if the two arrays for dividing are not masked arrays. could anyone explain? thanks!! np.__version__ = 1.6.2 In [87]: np.seterr(all='print') Out[87]: {'divide': 'print', 'invalid': 'print', 'over': 'print', 'under': 'print'} In [88]: a = np.arange(8,dtype=float).reshape(2,4) In [89]: b = np.ma.masked_less(a,4) In [90]: b[1,-2:] = 0. In [91]: b Out[91]: masked_array(data = [[-- -- -- --] [4.0 5.0 0.0 0.0]], mask = [[ True True True True] [False False False False]], fill_value = 1e+20) In [92]: c = a.copy() In [93]: c[1,-2:] = 0. 
In [94]: c Out[94]: array([[ 0., 1., 2., 3.], [ 4., 5., 0., 0.]]) In [95]: np.divide(a,b) Warning: divide by zero encountered in divide Out[95]: masked_array(data = [[-- -- -- --] [1.0 1.0 -- --]], mask = [[ True True True True] [False False True True]], fill_value = 1e+20) In [96]: np.divide(a,c) Warning: divide by zero encountered in divide Out[96]: array([[ nan, 1., 1., 1.], [ 1., 1., inf, inf]]) Chao -- *********************************************************************************** Chao YUE Laboratoire des Sciences du Climat et de l'Environnement (LSCE-IPSL) UMR 1572 CEA-CNRS-UVSQ Batiment 712 - Pe 119 91191 GIF Sur YVETTE Cedex Tel: (33) 01 69 08 29 02; Fax:01.69.08.77.16 ************************************************************************************ -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Fri Dec 14 09:15:52 2012 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 14 Dec 2012 14:15:52 +0000 Subject: [Numpy-discussion] np.seterr doesn't work for masked array? In-Reply-To: References: Message-ID: On Fri, Dec 14, 2012 at 1:57 PM, Chao YUE wrote: > Dear all, > > I tried to capture the zero divide error when I divide a masked array by > another. It seems that np.seterr is not working for masked array? > when I do np.divide on two masked array, it directly put the zero divides > part as being masked. The np.seterr works if the two arrays for dividing are > not masked arrays. > could anyone explain? thanks!! numpy.ma uses np.seterr(divide='ignore', invalid='ignore') for most of its operations, so it is overriding your settings. This is usually desirable since many of the masked values will be trip these errors spuriously even though they will be masked out in the result. -- Robert Kern From chanley at gmail.com Fri Dec 14 09:21:32 2012 From: chanley at gmail.com (Christopher Hanley) Date: Fri, 14 Dec 2012 09:21:32 -0500 Subject: [Numpy-discussion] Support for python 2.4 dropped. Should we drop 2.5 also? In-Reply-To: References: Message-ID: We (STScI) are ending support for Python 2.5 in our stsci_python project and told our users as much last July. I have no objections to ending support for Python 2.5. Chris On Thu, Dec 13, 2012 at 12:38 PM, Charles R Harris < charlesr.harris at gmail.com> wrote: > The previous proposal to drop python 2.4 support garnered no opposition. > How about dropping support for python 2.5 also? > > Chuck > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From chaoyuejoy at gmail.com Fri Dec 14 09:34:28 2012 From: chaoyuejoy at gmail.com (Chao YUE) Date: Fri, 14 Dec 2012 15:34:28 +0100 Subject: [Numpy-discussion] an easy way to know if a functions works or not for masked array? Message-ID: Dear numpy users, I think since long I am confused by if a function works as expected or not for masked array. like np.reshape works for masked array, but not np.sum (I mean, I expect np.sum to drop the masked elements when do summing, of course we have np.ma.sum). So I always use fuctions preceding by np.ma to make sure there is nothing going woring if I expect there will be masked array participating in the calculation in the data process chain. 
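A concrete version of the quick check being described -- comparing a plain numpy function against its np.ma counterpart on a tiny masked example before trusting it in a long processing chain -- might look like this (the function pairs are only illustrations):

import numpy as np

# Small masked example in the spirit of the session earlier in this thread.
a = np.ma.masked_less(np.arange(8, dtype=float).reshape(2, 4), 4)

# Compare each plain function with its numpy.ma counterpart; if the results
# differ, the plain function is not honouring the mask and the np.ma version
# (or the array method, e.g. a.sum()) should be used instead.
for plain, masked in [(np.sum, np.ma.sum), (np.mean, np.ma.mean)]:
    print(plain.__name__, plain(a), masked(a))
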
When I handle masked array, Before I use a function, normally I check if there is an equivalent np.ma function and if there is, I use np.mafunction; Then I check how the documentation says about np.func and np.ma.func and see there is anything mentioned explicitly on the handling of masked array. Howevery, In most cases I will try with simple arrays to test both np.ma.func and np.func before I use some function. but this is sometimes time consuming. Does anyone have similar situation? thanks! Chao -- *********************************************************************************** Chao YUE Laboratoire des Sciences du Climat et de l'Environnement (LSCE-IPSL) UMR 1572 CEA-CNRS-UVSQ Batiment 712 - Pe 119 91191 GIF Sur YVETTE Cedex Tel: (33) 01 69 08 29 02; Fax:01.69.08.77.16 ************************************************************************************ -------------- next part -------------- An HTML attachment was scrubbed... URL: From chaoyuejoy at gmail.com Fri Dec 14 09:40:21 2012 From: chaoyuejoy at gmail.com (Chao YUE) Date: Fri, 14 Dec 2012 15:40:21 +0100 Subject: [Numpy-discussion] np.seterr doesn't work for masked array? In-Reply-To: References: Message-ID: Thanks. You mean actually when numpy handle masked array, it will first treat all the base data, and then apply the mask after the treatment. and normally the base data of maksed elements will very likely to intrigure these errors, and you will see a lot errory warning or print in the process, and will make it impossible to see the error information your want to see for those elements that are not masked but still can intriguer the error you would like to see or check. I didn't realize this. It's a good to overwrite the error setting. thanks for your explanation. Chao On Fri, Dec 14, 2012 at 3:15 PM, Robert Kern wrote: > On Fri, Dec 14, 2012 at 1:57 PM, Chao YUE wrote: > > Dear all, > > > > I tried to capture the zero divide error when I divide a masked array by > > another. It seems that np.seterr is not working for masked array? > > when I do np.divide on two masked array, it directly put the zero divides > > part as being masked. The np.seterr works if the two arrays for dividing > are > > not masked arrays. > > could anyone explain? thanks!! > > numpy.ma uses np.seterr(divide='ignore', invalid='ignore') for most of > its operations, so it is overriding your settings. This is usually > desirable since many of the masked values will be trip these errors > spuriously even though they will be masked out in the result. > > -- > Robert Kern > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- *********************************************************************************** Chao YUE Laboratoire des Sciences du Climat et de l'Environnement (LSCE-IPSL) UMR 1572 CEA-CNRS-UVSQ Batiment 712 - Pe 119 91191 GIF Sur YVETTE Cedex Tel: (33) 01 69 08 29 02; Fax:01.69.08.77.16 ************************************************************************************ -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Fri Dec 14 09:42:36 2012 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 14 Dec 2012 14:42:36 +0000 Subject: [Numpy-discussion] np.seterr doesn't work for masked array? In-Reply-To: References: Message-ID: On Fri, Dec 14, 2012 at 2:40 PM, Chao YUE wrote: > Thanks. 
You mean actually when numpy handle masked array, it will first > treat all the base data, and then apply the mask after the treatment. > and normally the base data of maksed elements will very likely to intrigure > these errors, and you will see a lot errory warning or print in the process, > and will make it impossible to see the error information your want to see > for those elements that are not masked but still can intriguer the error you > would like to see or check. I didn't realize this. It's a good to overwrite > the error setting. Precisely. -- Robert Kern From brad.froehle at gmail.com Fri Dec 14 12:10:15 2012 From: brad.froehle at gmail.com (Bradley M. Froehle) Date: Fri, 14 Dec 2012 09:10:15 -0800 Subject: [Numpy-discussion] Building numpy with OpenBLAS In-Reply-To: References: Message-ID: Hi Sergey: I recently ran into similar problems with ACML. See an original bug report (https://github.com/numpy/numpy/issues/2728) & documentation fix (https://github.com/numpy/numpy/pull/2809). Personally, I ended up using a patch similar to https://github.com/numpy/numpy/pull/2751 to force NumPy to respect site.cfg (so that I could put the libacml in the [blas_opt] & [lapack_opt] sections). But this seems unlikely to get merged into NumPy as it changes the behavior of site.cfg. Instead I think we should discuss adding a "have cblas" flag of some sort to the [blas] section so that the user can still get _dotblas to compile. -Brad On Friday, December 14, 2012 at 1:17 AM, Sergey Bartunov wrote: > Hi. I'm trying to build numpy (1.6.2 and master from git) with > OpenBLAS on Ubuntu server 11.10. > > I succeed with this just once and performance boost was really big for > me, but unfortunately something went wrong with my application and I > had to reinstall numpy. After that I couldn't reproduce this result > and even just perform faster than default numpy installation with no > external libraries anyhow. > > Now things went even worse. I assume that numpy built with BLAS and > LAPACK should do dot operation faster than "clean" installation on > relatively large matirces (say 2000 x 2000). Here I don't use OpenBLAS > anyway. > > I install libblas-dev and liblapack-dev by apt-get and after that > build numpy from sources / by pip (that doesn't matter for the > result). Building tool reports that BLAS and LAPACK are detected on my > system, so says "numpy.distutils.system_info" after installation. But > matrix multiplication by dot takes the same time as clean installation > (12 s vs 0.16 s with OpenBLAS). That's the first thing I'm wondering > about. > > Nevertheless I tried to compile numpy with OpenBLAS only. I have > forced this by setting ATLAS="" BLAS=/usr/lib/libopenblas.a > LAPACK=/usr/lib/libopenblas.a as I saw somewhere in the internet. I > had installed numpy exactly this way at the first time when I was > lucky. But now it doesn't work for me. I tryied installing OpenBLAS > from sources and as libopenblas-dev ubuntu package. > > So how can I fix this? Many thanks in advance. 
> _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org (mailto:NumPy-Discussion at scipy.org) > http://mail.scipy.org/mailman/listinfo/numpy-discussion From ondrej.certik at gmail.com Fri Dec 14 12:20:31 2012 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Fri, 14 Dec 2012 09:20:31 -0800 Subject: [Numpy-discussion] Failure in test_iterator.py at Travis In-Reply-To: References: Message-ID: On Fri, Dec 14, 2012 at 12:32 AM, Nathaniel Smith wrote: > I only checked this build: > https://secure.travis-ci.org/#!/certik/numpy/jobs/3656960 > But that log clearly shows 'python setup.py install' being used instead of > 'pip install'. How certain are you that your branch actually has my fix? Right. So is this one (using "python setup.py install"): https://travis-ci.org/numpy/numpy/jobs/3330235 No wonder that it fails. Since I rebased this on top of master and the master has this fix, and so does the 1.7 branch, I can't explain it. I know that the master at certik/numpy github does not have that fix (since I didn't push in there lately), but I don't see how that could matter, unless there is a bug at Travis. The branches that I created the pull requests from should have that fix. Ondrej From njs at pobox.com Fri Dec 14 12:49:37 2012 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 14 Dec 2012 17:49:37 +0000 Subject: [Numpy-discussion] Failure in test_iterator.py at Travis In-Reply-To: References: Message-ID: The top of the build log has the actual git command they used to check out the source - it's some clever GitHub thing that gives the same thing as pressing the green button would iirc. You could copy the commands from the log to check that out locally though and see what the .travis.tml actually says. And if it's wrong poke around in the git history to figure out why. -n On 14 Dec 2012 17:20, "Ond?ej ?ert?k" wrote: > On Fri, Dec 14, 2012 at 12:32 AM, Nathaniel Smith wrote: > > I only checked this build: > > https://secure.travis-ci.org/#!/certik/numpy/jobs/3656960 > > But that log clearly shows 'python setup.py install' being used instead > of > > 'pip install'. How certain are you that your branch actually has my fix? > > Right. So is this one (using "python setup.py install"): > > https://travis-ci.org/numpy/numpy/jobs/3330235 > > No wonder that it fails. Since I rebased this on top of master and the > master > has this fix, and so does the 1.7 branch, I can't explain it. I know that > the master at certik/numpy github does not have that fix (since I didn't > push in > there lately), but I don't see how that could matter, unless there is > a bug at Travis. > The branches that I created the pull requests from should have that fix. > > > Ondrej > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ondrej.certik at gmail.com Fri Dec 14 13:43:14 2012 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Fri, 14 Dec 2012 10:43:14 -0800 Subject: [Numpy-discussion] Failure in test_iterator.py at Travis In-Reply-To: References: Message-ID: On Fri, Dec 14, 2012 at 9:49 AM, Nathaniel Smith wrote: > The top of the build log has the actual git command they used to check out > the source - it's some clever GitHub thing that gives the same thing as > pressing the green button would iirc. 
You could copy the commands from the > log to check that out locally though and see what the .travis.tml actually > says. And if it's wrong poke around in the git history to figure out why. You are right! I followed the one: https://travis-ci.org/numpy/numpy/jobs/3330235 and .travis.yml indeed contains the old "python setup.py install"! I just don't understand why. My strong suspicion is that the actual commands are *not* equivalent to the github green merge button. It took the latest patches from the PR (which do not contain the .travis.yml fix), and merged into: commit 089bfa5865cd39e2b40099755e8563d8f0d04f5f Merge: 1065cc5 df2958e Author: Ralf Gommers Date: Sat Nov 24 02:54:48 2012 -0800 Merge pull request #2766 from g2p/master Assume we can use sys.stdout.fileno() and friends. I have no idea why. Unless --- it is merging into the master, which was current at the time the PR was created. That would explain it this particular failure. Ok. And finally, the one at certik/numpy fails, because that one is checking out the sources from certik/numpy, but those do not have the improvement. It is then the Travis's bug, that it links to certik/numpy log for PRs at numpy/numpy. So I think that all is explained now. Ondrej From ralf.gommers at gmail.com Fri Dec 14 16:01:41 2012 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Fri, 14 Dec 2012 22:01:41 +0100 Subject: [Numpy-discussion] MKL licenses for core scientific Python projects Message-ID: Hi all, Intel has offered to provide free MKL licenses for main contributors to scientific Python projects - at least those listed at numfocus.org/projects/. Licenses for all OSes that are required can be provided, the condition is that they're used for building/testing our projects and not for broader purposes. If you're interested, please let me know your full name and what OS you need a license for. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From raul at virtualmaterials.com Fri Dec 14 20:34:58 2012 From: raul at virtualmaterials.com (Raul Cota) Date: Fri, 14 Dec 2012 18:34:58 -0700 Subject: [Numpy-discussion] Proposal to drop python 2.4 support in numpy 1.8 In-Reply-To: <7A79C555-9D79-4878-B123-8EB4DB23A726@molden.no> References: <50CAA7A6.6030208@virtualmaterials.com> <7A79C555-9D79-4878-B123-8EB4DB23A726@molden.no> Message-ID: <50CBD3C2.7090106@virtualmaterials.com> Point well taken. It is always a tradeoff / balancing act where you can have 'anything' but not 'everything'. Where would the fun be if we could have everything :) ? . In our situation, there were a couple of extensions that did not work (at least out of the box) in Python 2.7. Raul On 14/12/2012 1:09 AM, Sturla Molden wrote: > So when upgrading everything you prefer to keep the bugs in 2.6 that were squashed in 2.7? Who has taught IT managers that older and more buggy versions of software are more "professional" and better for corporate environments? > > Sturla > > > Den 14. des. 2012 kl. 05:14 skrev Raul Cota : > >> >> +1 from me >> >> For what is worth, we are just moving forward from Python 2.2 / Numeric >> and are going to 2.6 and it has been rather painful because of the >> several little details of extensions and other subtleties. I believe we >> will settle there for a while. For companies like ours, it is a big >> problem to upgrade versions. There is always this or that hiccup that >> "works great" in a version but not so much in another and we also have >> all sorts of extensions. 
>> >> >> >> Raul >> >> >> >> >> >> On 13/12/2012 9:34 AM, Charles R Harris wrote: >>> Time to raise this topic again. Opinions welcome. >>> >>> Chuck >>> >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From chris.barker at noaa.gov Fri Dec 14 23:06:36 2012 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Fri, 14 Dec 2012 20:06:36 -0800 Subject: [Numpy-discussion] MKL licenses for core scientific Python projects In-Reply-To: References: Message-ID: <4473522818305701545@unknownmsgid> Ralf, Do these licenses allow fully free distribution of binaries? And are those binaries themselves redistributive? I.e. with py2exe and friends? If so, that could be nice. -Chris On Dec 14, 2012, at 1:01 PM, Ralf Gommers wrote: Hi all, Intel has offered to provide free MKL licenses for main contributors to scientific Python projects - at least those listed at numfocus.org/projects/. Licenses for all OSes that are required can be provided, the condition is that they're used for building/testing our projects and not for broader purposes. If you're interested, please let me know your full name and what OS you need a license for. Cheers, Ralf _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sat Dec 15 08:13:11 2012 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sat, 15 Dec 2012 14:13:11 +0100 Subject: [Numpy-discussion] MKL licenses for core scientific Python projects In-Reply-To: <4473522818305701545@unknownmsgid> References: <4473522818305701545@unknownmsgid> Message-ID: On Sat, Dec 15, 2012 at 5:06 AM, Chris Barker - NOAA Federal < chris.barker at noaa.gov> wrote: > Ralf, > > Do these licenses allow fully free distribution of binaries? And are those > binaries themselves redistributive? I.e. with py2exe and friends? > > If so, that could be nice. > Good point. It's not entirely clear from the emails I received. I've asked for clarification. Ralf > > On Dec 14, 2012, at 1:01 PM, Ralf Gommers wrote: > > Hi all, > > Intel has offered to provide free MKL licenses for main contributors to > scientific Python projects - at least those listed at > numfocus.org/projects/. Licenses for all OSes that are required can be > provided, the condition is that they're used for building/testing our > projects and not for broader purposes. > > If you're interested, please let me know your full name and what OS you > need a license for. > > Cheers, > Ralf > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From aron at ahmadia.net Sat Dec 15 08:21:28 2012 From: aron at ahmadia.net (Aron Ahmadia) Date: Sat, 15 Dec 2012 08:21:28 -0500 Subject: [Numpy-discussion] MKL licenses for core scientific Python projects In-Reply-To: References: <4473522818305701545@unknownmsgid> Message-ID: Ralf, Does "performance testing" come under building/testing? If so, Aron Ahmadia OS X.8 Thanks, A On Sat, Dec 15, 2012 at 8:13 AM, Ralf Gommers wrote: > > > > On Sat, Dec 15, 2012 at 5:06 AM, Chris Barker - NOAA Federal < > chris.barker at noaa.gov> wrote: > >> Ralf, >> >> Do these licenses allow fully free distribution of binaries? And are >> those binaries themselves redistributive? I.e. with py2exe and friends? >> >> If so, that could be nice. >> > > Good point. It's not entirely clear from the emails I received. I've asked > for clarification. > > Ralf > > >> >> On Dec 14, 2012, at 1:01 PM, Ralf Gommers wrote: >> >> Hi all, >> >> Intel has offered to provide free MKL licenses for main contributors to >> scientific Python projects - at least those listed at >> numfocus.org/projects/. Licenses for all OSes that are required can be >> provided, the condition is that they're used for building/testing our >> projects and not for broader purposes. >> >> If you're interested, please let me know your full name and what OS you >> need a license for. >> >> Cheers, >> Ralf >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sat Dec 15 08:46:48 2012 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sat, 15 Dec 2012 14:46:48 +0100 Subject: [Numpy-discussion] MKL licenses for core scientific Python projects In-Reply-To: References: <4473522818305701545@unknownmsgid> Message-ID: On Sat, Dec 15, 2012 at 2:21 PM, Aron Ahmadia wrote: > Ralf, > > Does "performance testing" come under building/testing? > As long as it's for the project(s) that these licenses are for, and not for your own research. Would this be for PyClaw? Ralf > If so, > > Aron Ahmadia > OS X.8 > > Thanks, > A > > > On Sat, Dec 15, 2012 at 8:13 AM, Ralf Gommers wrote: > >> >> >> >> On Sat, Dec 15, 2012 at 5:06 AM, Chris Barker - NOAA Federal < >> chris.barker at noaa.gov> wrote: >> >>> Ralf, >>> >>> Do these licenses allow fully free distribution of binaries? And are >>> those binaries themselves redistributive? I.e. with py2exe and friends? >>> >>> If so, that could be nice. >>> >> >> Good point. It's not entirely clear from the emails I received. I've >> asked for clarification. >> >> Ralf >> >> >>> >>> On Dec 14, 2012, at 1:01 PM, Ralf Gommers >>> wrote: >>> >>> Hi all, >>> >>> Intel has offered to provide free MKL licenses for main contributors to >>> scientific Python projects - at least those listed at >>> numfocus.org/projects/. Licenses for all OSes that are required can be >>> provided, the condition is that they're used for building/testing our >>> projects and not for broader purposes. 
>>> >>> If you're interested, please let me know your full name and what OS you >>> need a license for. >>> >>> Cheers, >>> Ralf >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >>> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From d.warde.farley at gmail.com Sat Dec 15 09:21:30 2012 From: d.warde.farley at gmail.com (David Warde-Farley) Date: Sat, 15 Dec 2012 09:21:30 -0500 Subject: [Numpy-discussion] Proposal to drop python 2.4 support in numpy 1.8 In-Reply-To: References: Message-ID: On Thu, Dec 13, 2012 at 11:34 AM, Charles R Harris wrote: > Time to raise this topic again. Opinions welcome. As you know from the pull request discussion, big +1 from me too. I'm also of the opinion with David C. and Brad that dropping 2.5 support would be a good thing too, as there's a lot of good stuff in 2.6+. Also, this is what IPython did a while back. David From tmp50 at ukr.net Sat Dec 15 10:13:31 2012 From: tmp50 at ukr.net (Dmitrey) Date: Sat, 15 Dec 2012 17:13:31 +0200 Subject: [Numpy-discussion] [ANN] OpenOpt Suite release 0.43 Message-ID: <1943.1355584411.5558559243999444992@ffe6.ukr.net> Hi all, I'm glad to inform you about new OpenOpt release 0.43 (2012-Dec-15): * interalg now can solve SNLE in 2nd mode (parameter dataHandling = "raw", before - only "sorted") * Many other improvements for interalg * Some improvements for FuncDesigner kernel * FuncDesigner ODE now has 3 arguments instead of 4 (backward incompatibility!), e.g. {t: np.linspace(0,1,100)} or mere np.linspace(0,1,100) if your ODE right side is time-independend * FuncDesigner stochastic addon now can handle some problems with gradient-based NLP / NSP solvers * Many minor improvements and some bugfixes Visit openopt.org for more details. Regards, D. -------------- next part -------------- An HTML attachment was scrubbed... URL: From aron at ahmadia.net Sat Dec 15 12:31:45 2012 From: aron at ahmadia.net (Aron Ahmadia) Date: Sat, 15 Dec 2012 12:31:45 -0500 Subject: [Numpy-discussion] MKL licenses for core scientific Python projects In-Reply-To: References: <4473522818305701545@unknownmsgid> Message-ID: All open source software and research projects with numpy in the stack, including PyClaw and petsc4py. A On Saturday, December 15, 2012, Ralf Gommers wrote: > > > > On Sat, Dec 15, 2012 at 2:21 PM, Aron Ahmadia > > wrote: > >> Ralf, >> >> Does "performance testing" come under building/testing? >> > > As long as it's for the project(s) that these licenses are for, and not > for your own research. Would this be for PyClaw? 
> > Ralf > > >> If so, >> >> Aron Ahmadia >> OS X.8 >> >> Thanks, >> A >> >> >> On Sat, Dec 15, 2012 at 8:13 AM, Ralf Gommers >> > wrote: >> >>> >>> >>> >>> On Sat, Dec 15, 2012 at 5:06 AM, Chris Barker - NOAA Federal < >>> chris.barker at noaa.gov >> 'chris.barker at noaa.gov');>> wrote: >>> >>>> Ralf, >>>> >>>> Do these licenses allow fully free distribution of binaries? And are >>>> those binaries themselves redistributive? I.e. with py2exe and friends? >>>> >>>> If so, that could be nice. >>>> >>> >>> Good point. It's not entirely clear from the emails I received. I've >>> asked for clarification. >>> >>> Ralf >>> >>> >>>> >>>> On Dec 14, 2012, at 1:01 PM, Ralf Gommers > >>>> wrote: >>>> >>>> Hi all, >>>> >>>> Intel has offered to provide free MKL licenses for main contributors to >>>> scientific Python projects - at least those listed at >>>> numfocus.org/projects/. Licenses for all OSes that are required can be >>>> provided, the condition is that they're used for building/testing our >>>> projects and not for broader purposes. >>>> >>>> If you're interested, please let me know your full name and what OS you >>>> need a license for. >>>> >>>> Cheers, >>>> Ralf >>>> >>>> _______________________________________________ >>>> NumPy-Discussion mailing list >>>> NumPy-Discussion at scipy.org >>> 'NumPy-Discussion at scipy.org');> >>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>>> >>>> >>>> _______________________________________________ >>>> NumPy-Discussion mailing list >>>> NumPy-Discussion at scipy.org >>> 'NumPy-Discussion at scipy.org');> >>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>>> >>>> >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >> 'NumPy-Discussion at scipy.org');> >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >>> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org > 'NumPy-Discussion at scipy.org');> >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ondrej.certik at gmail.com Sat Dec 15 18:52:01 2012 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Sat, 15 Dec 2012 15:52:01 -0800 Subject: [Numpy-discussion] Status of the 1.7 release Message-ID: Hi, If you go to the issues for 1.7 and click "high priority": https://github.com/numpy/numpy/issues?labels=priority%3A+high&milestone=3&state=open you will see 3 issues as of right now. Two of those have PR attached. It's been a lot of work to get to this point and I'd like to thank all of you for helping out with the issues. In particular, I have just fixed a very annoying segfault (#2738) in the PR: https://github.com/numpy/numpy/pull/2831 If you can review that one carefully, that would be highly appreciated. The more people the better, it's a reference counting issue and since this would go into the 1.7 release and it's in the core of numpy, I want to make sure that it's correct. So the last high priority issue is: https://github.com/numpy/numpy/issues/568 and that's the one I will be concentrating on now. After it's fixed, I think we are ready to release the rc1. There are more open issues (that are not "high priority"): https://github.com/numpy/numpy/issues?labels=&milestone=3&page=1&state=open But I don't think we should delay the release any longer because of them. Let me know if there are any objections. 
Of course, if you attach a PR fixing any of those, we'll merge it. Ondrej From njs at pobox.com Sat Dec 15 21:17:26 2012 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 16 Dec 2012 02:17:26 +0000 Subject: [Numpy-discussion] Status of the 1.7 release In-Reply-To: References: Message-ID: #294 is a regression, so probably should be considered release critical. I can't tell if #2750 is a real problem or not. #378 looks serious, but afaict has actually been fixed even though the bug is still marked open? At least fixed in 1.7.x? On 15 Dec 2012 23:52, "Ond?ej ?ert?k" wrote: > Hi, > > If you go to the issues for 1.7 and click "high priority": > > > https://github.com/numpy/numpy/issues?labels=priority%3A+high&milestone=3&state=open > > you will see 3 issues as of right now. Two of those have PR attached. > It's been a lot of work > to get to this point and I'd like to thank all of you for helping out > with the issues. > > > In particular, I have just fixed a very annoying segfault (#2738) in the > PR: > > https://github.com/numpy/numpy/pull/2831 > > If you can review that one carefully, that would be highly > appreciated. The more people the better, > it's a reference counting issue and since this would go into the 1.7 > release and it's in the core of numpy, > I want to make sure that it's correct. > > So the last high priority issue is: > > https://github.com/numpy/numpy/issues/568 > > and that's the one I will be concentrating on now. After it's fixed, I > think we are ready to release the rc1. > > There are more open issues (that are not "high priority"): > > https://github.com/numpy/numpy/issues?labels=&milestone=3&page=1&state=open > > But I don't think we should delay the release any longer because of > them. Let me know if there > are any objections. Of course, if you attach a PR fixing any of those, > we'll merge it. > > Ondrej > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sun Dec 16 04:49:08 2012 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sun, 16 Dec 2012 10:49:08 +0100 Subject: [Numpy-discussion] Status of the 1.7 release In-Reply-To: References: Message-ID: On Sun, Dec 16, 2012 at 3:17 AM, Nathaniel Smith wrote: > #294 is a regression, so probably should be considered release critical. I > can't tell if #2750 is a real problem or not. #378 looks serious, but > afaict has actually been fixed even though the bug is still marked open? At > least fixed in 1.7.x? > On 15 Dec 2012 23:52, "Ond?ej ?ert?k" wrote: > >> Hi, >> >> If you go to the issues for 1.7 and click "high priority": >> >> >> https://github.com/numpy/numpy/issues?labels=priority%3A+high&milestone=3&state=open >> >> you will see 3 issues as of right now. Two of those have PR attached. >> It's been a lot of work >> to get to this point and I'd like to thank all of you for helping out >> with the issues. >> >> >> In particular, I have just fixed a very annoying segfault (#2738) in the >> PR: >> >> https://github.com/numpy/numpy/pull/2831 >> >> If you can review that one carefully, that would be highly >> appreciated. The more people the better, >> it's a reference counting issue and since this would go into the 1.7 >> release and it's in the core of numpy, >> I want to make sure that it's correct. 
>> >> So the last high priority issue is: >> >> https://github.com/numpy/numpy/issues/568 >> >> and that's the one I will be concentrating on now. After it's fixed, I >> think we are ready to release the rc1. >> >> There are more open issues (that are not "high priority"): >> >> >> https://github.com/numpy/numpy/issues?labels=&milestone=3&page=1&state=open >> >> But I don't think we should delay the release any longer because of >> them. Let me know if there >> are any objections. Of course, if you attach a PR fixing any of those, >> we'll merge it. >> > Properly documenting .base (gh-2737) and casting rules (gh-561) changes should be finished before rc1. I agree that the Debian issues all shouldn't block the release. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From aron at ahmadia.net Sun Dec 16 04:57:12 2012 From: aron at ahmadia.net (Aron Ahmadia) Date: Sun, 16 Dec 2012 04:57:12 -0500 Subject: [Numpy-discussion] MKL licenses for core scientific Python projects In-Reply-To: References: <4473522818305701545@unknownmsgid> Message-ID: All open source software and research projects with numpy in the stack, including PyClaw and petsc4py. A On Saturday, December 15, 2012, Ralf Gommers wrote: > > > > On Sat, Dec 15, 2012 at 2:21 PM, Aron Ahmadia > > wrote: > >> Ralf, >> >> Does "performance testing" come under building/testing? >> > > As long as it's for the project(s) that these licenses are for, and not > for your own research. Would this be for PyClaw? > > Ralf > > >> If so, >> >> Aron Ahmadia >> OS X.8 >> >> Thanks, >> A >> >> >> On Sat, Dec 15, 2012 at 8:13 AM, Ralf Gommers >> > wrote: >> >>> >>> >>> >>> On Sat, Dec 15, 2012 at 5:06 AM, Chris Barker - NOAA Federal < >>> chris.barker at noaa.gov >> 'chris.barker at noaa.gov');>> wrote: >>> >>>> Ralf, >>>> >>>> Do these licenses allow fully free distribution of binaries? And are >>>> those binaries themselves redistributive? I.e. with py2exe and friends? >>>> >>>> If so, that could be nice. >>>> >>> >>> Good point. It's not entirely clear from the emails I received. I've >>> asked for clarification. >>> >>> Ralf >>> >>> >>>> >>>> On Dec 14, 2012, at 1:01 PM, Ralf Gommers > >>>> wrote: >>>> >>>> Hi all, >>>> >>>> Intel has offered to provide free MKL licenses for main contributors to >>>> scientific Python projects - at least those listed at >>>> numfocus.org/projects/. Licenses for all OSes that are required can be >>>> provided, the condition is that they're used for building/testing our >>>> projects and not for broader purposes. >>>> >>>> If you're interested, please let me know your full name and what OS you >>>> need a license for. 
>>>> >>>> Cheers, >>>> Ralf >>>> >>>> _______________________________________________ >>>> NumPy-Discussion mailing list >>>> NumPy-Discussion at scipy.org >>> 'NumPy-Discussion at scipy.org');> >>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>>> >>>> >>>> _______________________________________________ >>>> NumPy-Discussion mailing list >>>> NumPy-Discussion at scipy.org >>> 'NumPy-Discussion at scipy.org');> >>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>>> >>>> >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >> 'NumPy-Discussion at scipy.org');> >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >>> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org > 'NumPy-Discussion at scipy.org');> >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sun Dec 16 08:38:41 2012 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sun, 16 Dec 2012 14:38:41 +0100 Subject: [Numpy-discussion] www.numpy.org home page In-Reply-To: <6E409FEC-FD9F-41F6-BC02-F227EE42F077@gmail.com> References: <6E409FEC-FD9F-41F6-BC02-F227EE42F077@gmail.com> Message-ID: On Thu, Dec 13, 2012 at 5:35 PM, Travis Oliphant wrote: > For people interested in the www.numpy.org home page: > > Jon Turner has officially transferred the www.numpy.org domain to > NumFOCUS. Thank you, Jon for this donation and for being a care-taker > of the domain-name. We have setup the domain registration to point to > numpy.github.com and I've changed the CNAME in that repostiory to > www.numpy.org > > I've sent an email to have the numpy.scipy.org page to redirect to > www.numpy.org. > > The NumPy home page can still be edited in this repository: > git at github.com:numpy/numpy.org.git. Pull requests are always welcome > --- especially pull requests that improve the look and feel of the web-page. > > Two of the content changes that we need to make a decision about is > > 1) whether or not to put links to books published (Packt > publishing for example has offered a higher percentage of their revenues if > we put a prominent link on www.numpy.org) > I'm +1 on showing links to books in a sidebar on the main page and/or on the documentation page, provided that (a) someone in this community can vouch for the quality of the book, and (b) we accept links for all books that are relevant and of sufficient quality. 2) whether or not to accept "Sponsored by" links on the home page > for donations to the project (e.g. Continuum Analytics has sponsored Ondrej > release management, other companies have sponsored pull requests, other > companies may want to provide donations and we would want to recognize > their contributions to the numpy project). > +1 for putting this on the main page. Something like the Support section on the IPython main page would be good. It lists specifically what the support was for. > These decisions should be made by the NumPy community which in my mind are > interested people on this list. Who is interested in this kind of > discussion? > > We could have these discussions on this list or on the > numfocus at googlegroups.com list and keep this list completely technical > (which I prefer, but I will do whatever the consensus is). 
> I'd prefer things that are cross-project to move to the numfocus list, but things that are specifically about NumPy (which numpy.org content is) to stay on this list. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sun Dec 16 08:52:28 2012 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sun, 16 Dec 2012 14:52:28 +0100 Subject: [Numpy-discussion] www.numpy.org home page In-Reply-To: References: <6E409FEC-FD9F-41F6-BC02-F227EE42F077@gmail.com> Message-ID: On Sun, Dec 16, 2012 at 2:38 PM, Ralf Gommers wrote: > > > > On Thu, Dec 13, 2012 at 5:35 PM, Travis Oliphant wrote: > >> For people interested in the www.numpy.org home page: >> >> Jon Turner has officially transferred the www.numpy.org domain to >> NumFOCUS. Thank you, Jon for this donation and for being a care-taker >> of the domain-name. We have setup the domain registration to point to >> numpy.github.com and I've changed the CNAME in that repostiory to >> www.numpy.org >> >> I've sent an email to have the numpy.scipy.org page to redirect to >> www.numpy.org. >> >> The NumPy home page can still be edited in this repository: >> git at github.com:numpy/numpy.org.git. Pull requests are always welcome >> --- especially pull requests that improve the look and feel of the web-page. >> >> Two of the content changes that we need to make a decision about is >> >> 1) whether or not to put links to books published (Packt >> publishing for example has offered a higher percentage of their revenues if >> we put a prominent link on www.numpy.org) >> > > I'm +1 on showing links to books in a sidebar on the main page and/or on > the documentation page, provided that (a) someone in this community can > vouch for the quality of the book, and (b) we accept links for all books > that are relevant and of sufficient quality. > Does anyone have an informed opinion on the quality of these books: "NumPy 1.5 Beginner's Guide", Ivan Idris, http://www.packtpub.com/numpy-1-5-using-real-world-examples-beginners-guide/book "NumPy Cookbook", Ivan Idris, http://www.packtpub.com/numpy-for-python-cookbook/book "Python for Data Analysis", Wes McKinney, http://shop.oreilly.com/product/0636920023784.do "SciPy and NumPy", Eli Bressert, http://shop.oreilly.com/product/0636920020219.do The first 5 books at http://stackoverflow.com/questions/4375094/numpy-what-are-the-authoritative-numpy-resources-e-g-documentation-tutorial Are there any more I missed? Ralf > 2) whether or not to accept "Sponsored by" links on the home page >> for donations to the project (e.g. Continuum Analytics has sponsored Ondrej >> release management, other companies have sponsored pull requests, other >> companies may want to provide donations and we would want to recognize >> their contributions to the numpy project). >> > > +1 for putting this on the main page. Something like the Support section > on the IPython main page would be good. It lists specifically what the > support was for. > > >> These decisions should be made by the NumPy community which in my mind >> are interested people on this list. Who is interested in this kind of >> discussion? >> >> We could have these discussions on this list or on the >> numfocus at googlegroups.com list and keep this list completely technical >> (which I prefer, but I will do whatever the consensus is). >> > > I'd prefer things that are cross-project to move to the numfocus list, but > things that are specifically about NumPy (which numpy.org content is) to > stay on this list. 
> > Ralf > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthieu.brucher at gmail.com Sun Dec 16 09:17:57 2012 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Sun, 16 Dec 2012 15:17:57 +0100 Subject: [Numpy-discussion] www.numpy.org home page In-Reply-To: References: <6E409FEC-FD9F-41F6-BC02-F227EE42F077@gmail.com> Message-ID: > Does anyone have an informed opinion on the quality of these books: > > "NumPy 1.5 Beginner's Guide", Ivan Idris, > http://www.packtpub.com/numpy-1-5-using-real-world-examples-beginners-guide/book > > "NumPy Cookbook", Ivan Idris, > http://www.packtpub.com/numpy-for-python-cookbook/book > Packt is looking for reviewers for this (new) book. I will do one in the next few weeks. Cheers, -- Information System Engineer, Ph.D. Blog: http://matt.eifelle.com LinkedIn: http://www.linkedin.com/in/matthieubrucher Music band: http://liliejay.com/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Sun Dec 16 09:51:32 2012 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 16 Dec 2012 14:51:32 +0000 Subject: [Numpy-discussion] www.numpy.org home page In-Reply-To: References: <6E409FEC-FD9F-41F6-BC02-F227EE42F077@gmail.com> Message-ID: On 16 Dec 2012 13:38, "Ralf Gommers" wrote: > > On Thu, Dec 13, 2012 at 5:35 PM, Travis Oliphant wrote: >> >> For people interested in the www.numpy.org home page: >> >> Jon Turner has officially transferred the www.numpy.org domain to NumFOCUS. Thank you, Jon for this donation and for being a care-taker of the domain-name. We have setup the domain registration to point to numpy.github.com and I've changed the CNAME in that repostiory to www.numpy.org >> >> I've sent an email to have the numpy.scipy.org page to redirect to www.numpy.org. >> >> The NumPy home page can still be edited in this repository: git at github.com:numpy/numpy.org.git. Pull requests are always welcome --- especially pull requests that improve the look and feel of the web-page. >> >> Two of the content changes that we need to make a decision about is >> >> 1) whether or not to put links to books published (Packt publishing for example has offered a higher percentage of their revenues if we put a prominent link on www.numpy.org) > > > I'm +1 on showing links to books in a sidebar on the main page and/or on the documentation page, provided that (a) someone in this community can vouch for the quality of the book, and (b) we accept links for all books that are relevant and of sufficient quality. I agree, so long as we're careful to avoid all the huge drama that could arise from trying to come up with "official" community judgements on the quality of books produced by members of our community. In practice I guess this means that we err on the side of inclusion, where all a book would need is one person who likes it, with no voting or vetoes possible. But that seems like a fine system - there are plenty of places to get more fine-gained recommendations. >> 2) whether or not to accept "Sponsored by" links on the home page for donations to the project (e.g. Continuum Analytics has sponsored Ondrej release management, other companies have sponsored pull requests, other companies may want to provide donations and we would want to recognize their contributions to the numpy project). > > > +1 for putting this on the main page. Something like the Support section on the IPython main page would be good. It lists specifically what the support was for. 
> >> >> These decisions should be made by the NumPy community which in my mind are interested people on this list. Who is interested in this kind of discussion? >> >> We could have these discussions on this list or on the numfocus at googlegroups.com list and keep this list completely technical (which I prefer, but I will do whatever the consensus is). > > > I'd prefer things that are cross-project to move to the numfocus list, but things that are specifically about NumPy (which numpy.org content is) to stay on this list. +1 -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From klonuo at gmail.com Sun Dec 16 10:48:17 2012 From: klonuo at gmail.com (klo) Date: Sun, 16 Dec 2012 16:48:17 +0100 Subject: [Numpy-discussion] www.numpy.org home page In-Reply-To: References: <6E409FEC-FD9F-41F6-BC02-F227EE42F077@gmail.com> Message-ID: On Sun, 16 Dec 2012 14:52:28 +0100, Ralf Gommers wrote: > > Does anyone have an informed opinion on the quality of these books: > > "NumPy 1.5 Beginner's Guide", Ivan Idris, > http://www.packtpub.com/numpy-1-5-using-real-world-examples-beginners-guide/book > > "NumPy Cookbook", Ivan Idris, > http://www.packtpub.com/numpy-for-python-cookbook/book Some reviews on first title: http://gael-varoquaux.info/blog/?p=161 http://glowingpython.blogspot.com/2011/12/book-review-numpy-15-beginners-guide.html Gael noted http://scipy-lectures.github.com/ which IMHO could be more promoted. Same for Travis' free Numpy book. The second title is very fresh, I don't know if anyone did review, but seems like good companion. > "Python for Data Analysis", Wes McKinney, > http://shop.oreilly.com/product/0636920023784.do This is already allover pandas, and although there is introduction to numpy, it's more focused on pandas data object model then numpy arrays, logically. > "SciPy and NumPy", Eli Bressert, > http://shop.oreilly.com/product/0636920020219.do This is very short introductory course to numpy and scipy in 40 pages and next 10 pages about scikit.learn and scikit.image > The first 5 books at > http://stackoverflow.com/questions/4375094/numpy-what-are-the-authoritative-numpy-resources-e-g-documentation-tutorial Voted answer contains great suggestions. All those books are very good companions, especially those Springer published. From charlesr.harris at gmail.com Sun Dec 16 12:28:34 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 16 Dec 2012 10:28:34 -0700 Subject: [Numpy-discussion] Support for python 2.4 dropped. Should we drop 2.5 also? In-Reply-To: References: Message-ID: On Thu, Dec 13, 2012 at 10:38 AM, Charles R Harris < charlesr.harris at gmail.com> wrote: > The previous proposal to drop python 2.4 support garnered no opposition. > How about dropping support for python 2.5 also? > > The proposal to drop support for python 2.5 and 2.4 in numpy 1.8 has carried. It is now a todo issue on github . -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sun Dec 16 17:36:53 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 16 Dec 2012 15:36:53 -0700 Subject: [Numpy-discussion] required nose version. Message-ID: Hi All, Looking at INSTALL.txt with an eye to updating it since we have dropped Python 2.4 -2.5 support, it looks like we could update the nose version also. The first version of nose to support Python 3 was 1.0, but I think 1.1 would better because of some bug fixes. IPython also requires nose 1.1. 
So I propose the required nose version be updated to 1.1. Thoughts? -------------- next part -------------- An HTML attachment was scrubbed... URL: From ondrej.certik at gmail.com Sun Dec 16 17:50:41 2012 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Sun, 16 Dec 2012 14:50:41 -0800 Subject: [Numpy-discussion] Status of the 1.7 release In-Reply-To: References: Message-ID: Thanks Ralf and Nathan, I have put high priority on the issues that need to be fixed before the rc1. There are now 4 issues: https://github.com/numpy/numpy/issues?labels=priority%3A+high&milestone=3&page=1&state=open I am working on the mingw one, as that one is the most difficult. Ralf (or anyone else), do you know how to fix this one: https://github.com/numpy/numpy/issues/438 I am not very familiar with this part of numpy, so maybe you know how to document it well. The sooner we can fix these 4 issues, the sooner we can release. Ondrej On Sun, Dec 16, 2012 at 1:49 AM, Ralf Gommers wrote: > > > > On Sun, Dec 16, 2012 at 3:17 AM, Nathaniel Smith wrote: >> >> #294 is a regression, so probably should be considered release critical. I >> can't tell if #2750 is a real problem or not. #378 looks serious, but afaict >> has actually been fixed even though the bug is still marked open? At least >> fixed in 1.7.x? >> >> On 15 Dec 2012 23:52, "Ond?ej ?ert?k" wrote: >>> >>> Hi, >>> >>> If you go to the issues for 1.7 and click "high priority": >>> >>> >>> https://github.com/numpy/numpy/issues?labels=priority%3A+high&milestone=3&state=open >>> >>> you will see 3 issues as of right now. Two of those have PR attached. >>> It's been a lot of work >>> to get to this point and I'd like to thank all of you for helping out >>> with the issues. >>> >>> >>> In particular, I have just fixed a very annoying segfault (#2738) in the >>> PR: >>> >>> https://github.com/numpy/numpy/pull/2831 >>> >>> If you can review that one carefully, that would be highly >>> appreciated. The more people the better, >>> it's a reference counting issue and since this would go into the 1.7 >>> release and it's in the core of numpy, >>> I want to make sure that it's correct. >>> >>> So the last high priority issue is: >>> >>> https://github.com/numpy/numpy/issues/568 >>> >>> and that's the one I will be concentrating on now. After it's fixed, I >>> think we are ready to release the rc1. >>> >>> There are more open issues (that are not "high priority"): >>> >>> >>> https://github.com/numpy/numpy/issues?labels=&milestone=3&page=1&state=open >>> >>> But I don't think we should delay the release any longer because of >>> them. Let me know if there >>> are any objections. Of course, if you attach a PR fixing any of those, >>> we'll merge it. > > > Properly documenting .base (gh-2737) and casting rules (gh-561) changes > should be finished before rc1. I agree that the Debian issues all shouldn't > block the release. > > Ralf > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From charlesr.harris at gmail.com Sun Dec 16 18:01:21 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 16 Dec 2012 16:01:21 -0700 Subject: [Numpy-discussion] Status of the 1.7 release In-Reply-To: References: Message-ID: On Sun, Dec 16, 2012 at 3:50 PM, Ond?ej ?ert?k wrote: > Thanks Ralf and Nathan, > > I have put high priority on the issues that need to be fixed before the > rc1. 
> There are now 4 issues: > > > https://github.com/numpy/numpy/issues?labels=priority%3A+high&milestone=3&page=1&state=open > > I am working on the mingw one, as that one is the most difficult. > Ralf (or anyone else), do you know how to fix this one: > > https://github.com/numpy/numpy/issues/438 > > I am not very familiar with this part of numpy, so maybe you know how > to document it well. > > The sooner we can fix these 4 issues, the sooner we can release. > > I believe mingw was updated last month to a new compiler version. I don't know what other changes there were, but it is possible that some problems have been fixed. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Sun Dec 16 18:26:22 2012 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 16 Dec 2012 23:26:22 +0000 Subject: [Numpy-discussion] Status of the 1.7 release In-Reply-To: References: Message-ID: On 16 Dec 2012 23:01, "Charles R Harris" wrote: > > > > On Sun, Dec 16, 2012 at 3:50 PM, Ond?ej ?ert?k wrote: >> >> Thanks Ralf and Nathan, >> >> I have put high priority on the issues that need to be fixed before the rc1. >> There are now 4 issues: >> >> https://github.com/numpy/numpy/issues?labels=priority%3A+high&milestone=3&page=1&state=open >> >> I am working on the mingw one, as that one is the most difficult. >> Ralf (or anyone else), do you know how to fix this one: >> >> https://github.com/numpy/numpy/issues/438 >> >> I am not very familiar with this part of numpy, so maybe you know how >> to document it well. >> >> The sooner we can fix these 4 issues, the sooner we can release. >> > > I believe mingw was updated last month to a new compiler version. I don't know what other changes there were, but it is possible that some problems have been fixed. It'd be worth checking in case it allows us to get off the (incredibly old) GCC that we currently require on windows. But that's a long-term problem that we probably shouldn't be messing with for 1.7 purposes. afaict all we need to do for 1.7 is switch to using our current POSIX code on win32 as well, instead of the (weird and broken) MS-specific API that we're currently using. (Plus suppress some totally spurious warnings): http://mail.scipy.org/pipermail/numpy-discussion/2012-July/063346.html (Or I could be missing something, but I don't think any problems with that solution have been discussed on the list anyway.) -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From vs at it.uu.se Sun Dec 16 20:20:06 2012 From: vs at it.uu.se (Virgil Stokes) Date: Mon, 17 Dec 2012 02:20:06 +0100 Subject: [Numpy-discussion] On the difference of two positive definite matrices Message-ID: <50CE7346.4050002@it.uu.se> Suppose I have two positive definite matrices, A and B. Is it possible to use U*D*U^T factorizations of these matrices to obtain a numerically stable result for their difference, A - B ? My application is the "UD" factorization method for the Kalman filter followed by the Rauch-Tung-Striebel smoother --- this is where the difference of two positive definite matrices occurs. I hope that this question is appropriate for this list and does not offend any subscribers. 
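A minimal numerical sketch of the situation in the question above (the helper random_spd, the seed, and the 4x4 matrices A and B are arbitrary illustrations, not taken from the thread): it builds two symmetric positive definite matrices, checks the eigenvalues of their difference, and uses the generalized symmetric eigenproblem in scipy.linalg.eigh to diagonalize both matrices with a single congruence transform.

import numpy as np
from scipy.linalg import eigh

rng = np.random.RandomState(0)

def random_spd(n):
    # a random symmetric positive definite matrix
    m = rng.randn(n, n)
    return m.dot(m.T) + n * np.eye(n)

A = random_spd(4)
B = random_spd(4)

# The difference of two SPD matrices is symmetric but need not be
# positive definite; its eigenvalues can have mixed signs.
print(np.linalg.eigvalsh(A - B))

# Simultaneous diagonalization via the generalized problem A v = w B v.
# eigh normalizes the eigenvectors so that V.T * B * V = I, and then
# V.T * A * V = diag(w), so A - B is positive definite exactly when
# every generalized eigenvalue w is greater than 1.
w, V = eigh(A, B)
print(w)
print(V.T.dot(A).dot(V))  # approximately diag(w)
print(V.T.dot(B).dot(V))  # approximately the identity

Whether this helps make the U*D*U^T smoother update numerically stable is a separate question; the sketch only shows why the plain difference can lose definiteness and how the two factorizations can be related through one transform.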
From charlesr.harris at gmail.com Sun Dec 16 20:38:43 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 16 Dec 2012 18:38:43 -0700 Subject: [Numpy-discussion] On the difference of two positive definite matrices In-Reply-To: <50CE7346.4050002@it.uu.se> References: <50CE7346.4050002@it.uu.se> Message-ID: On Sun, Dec 16, 2012 at 6:20 PM, Virgil Stokes wrote: > Suppose I have two positive definite matrices, A and B. Is it possible > to use U*D*U^T factorizations of these matrices to obtain a numerically > stable result for their difference, A - B ? > > My application is the "UD" factorization method for the Kalman filter > followed by the Rauch-Tung-Striebel smoother --- this is where the > difference of two positive definite matrices occurs. > > I hope that this question is appropriate for this list and does not > offend any subscribers. > > Not sure what you are asking, but there is a coordinate system in which they are both diagonal. Nevertheless, the difference may not be positive definite. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From travis at continuum.io Mon Dec 17 01:07:31 2012 From: travis at continuum.io (Travis Oliphant) Date: Mon, 17 Dec 2012 00:07:31 -0600 Subject: [Numpy-discussion] DARPA funding for Blaze and passing the NumPy torch Message-ID: <5EA78D17-1494-4057-A2C6-89BFC116C69F@continuum.io> Hello all, There is a lot happening in my life right now and I am spread quite thin among the various projects that I take an interest in. In particular, I am thrilled to publicly announce on this list that Continuum Analytics has received DARPA funding (to the tune of at least $3 million) for Blaze, Numba, and Bokeh which we are writing to take NumPy, SciPy, and visualization into the domain of very large data sets. This is part of the XDATA program, and I will be taking an active role in it. You can read more about Blaze here: http://blaze.pydata.org. You can read more about XDATA here: http://www.darpa.mil/Our_Work/I2O/Programs/XDATA.aspx I personally think Blaze is the future of array-oriented computing in Python. I will be putting efforts and resources next year behind making that case. How it interacts with future incarnations of NumPy, Pandas, or other projects is an interesting and open question. I have no doubt the future will be a rich ecosystem of interoperating array-oriented data-structures. I invite anyone interested in Blaze to participate in the discussions and development at https://groups.google.com/a/continuum.io/forum/#!forum/blaze-dev or watch the project on our public GitHub repo: https://github.com/ContinuumIO/blaze. Blaze is being incubated under the ContinuumIO GitHub project for now, but eventually I hope it will receive its own GitHub project page later next year. Development of Blaze is early but we are moving rapidly with it (and have deliverable deadlines --- thus while we will welcome input and pull requests we won't have a ton of time to respond to simple queries until at least May or June). There is more that we are working on behind the scenes with respect to Blaze that will be coming out next year as well but isn't quite ready to show yet. As I look at the coming months and years, my time for direct involvement in NumPy development is therefore only going to get smaller. As a result it is not appropriate that I remain as "head steward" of the NumPy project (a term I prefer to BFD12 or anything else). 
I'm sure that it is apparent that while I've tried to help personally where I can this year on the NumPy project, my role has been more one of coordination, seeking funding, and providing expert advice on certain sections of code. I fundamentally agree with Fernando Perez that the responsibility of care-taking open source projects is one of stewardship --- something akin to public service. I have tried to emulate that belief this year --- even while not always succeeding. It is time for me to make official what is already becoming apparent to observers of this community, namely, that I am stepping down as someone who might be considered "head steward" for the NumPy project and officially leaving the development of the project in the hands of others in the community. I don't think the project actually needs a new "head steward" --- especially from a development perspective. Instead I see a lot of strong developers offering key opinions for the project as well as a great set of new developers offering pull requests. My strong suggestion is that development discussions of the project continue on this list with consensus among the active participants being the goal for development. I don't think 100% consensus is a rigid requirement --- but certainly a super-majority should be the goal, and serious changes should not be made with out a clear consensus. I would pay special attention to under-represented people (users with intense usage of NumPy but small voices on this list). There are many of them. If you push me for specifics then at this point in NumPy's history, I would say that if Chuck, Nathaniel, and Ralf agree on a course of action, it will likely be a good thing for the project. I suspect that even if only 2 of the 3 agree at one time it might still be a good thing (but I would expect more detail and discussion). There are others whose opinion should be sought as well: Ondrej Certik, Perry Greenfield, Robert Kern, David Cournapeau, Francesc Alted, and Mark Wiebe to name a few. For some questions, I might even seek input from people like Konrad Hinsen and Paul Dubois --- if they have time to give it. I will still be willing to offer my view from time to time and if I am asked. Greg Wilson (of Software Carpentry fame) asked me recently what letter I would have written to myself 5 years ago. What would I tell myself to do given the knowledge I have now? I've thought about that for a bit, and I have some answers. I don't know if these will help anyone, but I offer them as hopefully instructive: 1) Do not promise to not break the ABI of NumPy --- and in fact emphasize that it will be broken at least once in the 1.X series. NumPy was designed to add new data-types --- but not without breaking the ABI. NumPy has needed more data-types and still needs even more. While it's not beautifully simple to add new data-types, it can be done. But, it is impossible to add them without breaking the ABI in some fashion. The desire to add new data-types *and* keep ABI compatibility has led to significant pain. I think the ABI non-breakage goal has been amplified by the poor state of package management in Python. The fact that it's painful for someone to update their downstream packages when an upstream ABI breaks (on Windows and Mac in particular) has put a lot of unfortunate pressure on this community. Pressure that was not envisioned or understood when I was writing NumPy. 
(As an aside: This is one reason Continuum has invested resources in building the conda tool and a completely free set of binary packages called Anaconda CE which is becoming more and more usable thanks to the efforts of Bryan Van de Ven and Ilan Schnell and our testing team at Continuum. The conda tool: http://docs.continuum.io/conda/index.html is open source and BSD licensed and the next release will provide the ability to build packages, build indexes on package repositories and interface with pip. Expect a blog-post in the near future about how cool conda is!). 2) Don't create array-scalars. Instead, make the data-type object a meta-type object whose instances are the items returned from NumPy arrays. There is no need for a separate array-scalar object and in fact it's confusing to the type-system. I understand that now. I did not understand that 5 years ago. 3) Special-case small arrays to avoid the memory indirection and look at PDL so that generalized ufuncs are supported from the beginning. 4) Define missing-value data-types and labels on the dimensions and arrays 5) Define a standard "dictionary of NumPy arrays" interface as the basic "structure of arrays" concept to go with the "array of structures" that structured arrays provide. 6) Start work on SQL interface to NumPy arrays *now* Additional comments I would make to someone today: 1) Most of NumPy should be written in Python with Numba used as the compiler (particularly as soon as Numba gets the ability to create Python extension modules which is in the next release). 2) There are still many, many optimizations that can be made in NumPy run-time (especially in the face of modern hardware). I will continue to be available to answer questions and I may chime in here and there on pull requests. However, most of my time for NumPy will be on administrative aspects of the project where I will continue to take an active interest. To help make sure that this happens in a transparent way, I would like to propose that "administrative" support of the project be left to the NumFOCUS board of which I am currently 1 of 9 members. The other board members are currently: Ralf Gommers, Anthony Scopatz, Andy Terrel, Prabhu Ramachandran, Fernando Perez, Emmanuelle Gouillart, Jarrod Millman, and Perry Greenfield. While NumFOCUS basically seeks to promote and fund the entire scientific Python stack, I think it can also play a role in helping to administer some of the core projects which the board members themselves have a personal interest in. By administrative support, I mean decisions like "what should be done with any NumPy IP or web-domains" or "what kind of commercially-related ads or otherwise should go on the NumPy home page", or "what should be done with the NumPy github account", etc. --- basically anything that requires an executive decision that is not directly development related. I don't expect there to be many of these decisions. But, when they show up, I would like them to be made in as transparent and public of a way as possible. In practice, the way I see this working is that there are members of the NumPy community who are (like me) particularly interested in admin-related questions and serve on a NumPy team in the NumFOCUS organization. I just know I'll be attending NumFOCUS board meetings, and I would like to help move administrative decisions forward with NumPy as part of the time I spend thinking about NumFOCUS. 
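Referring back to points 2) and 5) above, a small sketch of the behaviour being described, assuming nothing beyond plain NumPy (the array contents and the field names 'x' and 'y' are arbitrary): items pulled out of an array are instances of separate array-scalar types such as numpy.float64 rather than of the data-type object itself, and a plain dictionary of one-dimensional arrays is the "structure of arrays" counterpart to a structured, "array of structures" array.

import numpy as np

a = np.array([1.0, 2.0, 3.0])
x = a[0]
print(type(x))                  # numpy.float64, an array-scalar type
print(isinstance(x, np.dtype))  # False: the scalar is not an instance
                                # of the data-type object
print(np.dtype(np.float64))     # dtype('float64'), a separate object
                                # that describes the array's elements

# "array of structures": one structured array, fields interleaved in memory
aos = np.zeros(3, dtype=[('x', 'f8'), ('y', 'f8')])
print(aos['x'])

# "structure of arrays": a plain dict of homogeneous arrays, one per field
soa = {'x': np.zeros(3), 'y': np.zeros(3)}
print(soa['x'])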
If people on this list would like to play an active role in those admin discussions, then I would heartily welcome them into NumFOCUS membership where they would work with interested members of the NumFOCUS board (like me and Ralf) to help direct that organization. I would really love to have someone from this list volunteer to serve on the NumPy team as part of the NumFOCUS project. I am certainly going to be interested in the opinions of people who are active participants on this list and on GitHub pages for NumPy on anything admin related to NumPy, and I expect Ralf would also be very interested in those views. One admin discussion that I will bring up in another email (as this one is already too long) is about making 2 or 3 lists for NumPy such as numpy-admin at numpy.org, numpy-dev at numpy.org, and numpy-users at numpy-org. Just because I'll be spending more time on Blaze, Numba, Bokeh, and the PyData ecosystem does not mean that I won't be around for NumPy. I will continue to promote NumPy. My involvement with Continuum connects me to NumPy as Continuum continues to offer commercial support contracts for NumPy (and SciPy and other open source projects). Continuum will also continue to maintain its Github NumPy project which will contain pull requests from our company that we are working to get into the mainline branch. Continuum will also continue to provide resources for release-management of NumPy (we have been funding Ondrej in this role for the past 6 months --- though I would like to see this happen through NumFOCUS in the future even if Continuum provides much of the money). We also offer optimized versions of NumPy in our commercial Anaconda distribution (Anaconda CE is free and open source). Also, I will still be available for questions and help (I'm not disappearing --- just making it clear that I'm stepping back into an occasional NumPy developer role). It has been extremely gratifying to see the number of pull-requests, GitHub-conversations, and code contributions increase this year. Even though the 1.7 release has taken a long time to stabilize, there have been a lot of people participating in the discussion and in helping to track down the problems, figure out what to do, and fix them. It even makes it possible for people to think about 1.7 as a long-term release. I will continue to hope that the spirit of openness, tolerance, respect, and gratitude continue to permeate this mailing list, and that we continue to seek to resolve any differences with trust and mutual respect. I know I have offended people in the past with quick remarks and actions made sometimes in haste without fully realizing how they might be taken. But, I also know that like many of you I have always done the very best I could for moving Python for scientific computing forward in the best way I know how. Thank you for the great memories. If you will forgive a little sentiment: My daughter who is in college now was 3 years old when I began working with this community and went down a road that would lead to my involvement with SciPy and NumPy. I have marked the building of my family and the passage of time with where the Python for Scientific Computing Community was at. Like many of you, I have given a great deal of attention and time to building this community. That sacrifice and time has led me to love what we have created. I know that I leave this segment of the community with the tools in better hands than mine. 
I am hopeful that NumPy will continue to be a useful array library for the Python community for many years to come even as we all continue to build new tools for the future. Very best regards, -Travis From ralf.gommers at gmail.com Mon Dec 17 02:07:03 2012 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Mon, 17 Dec 2012 08:07:03 +0100 Subject: [Numpy-discussion] DARPA funding for Blaze and passing the NumPy torch In-Reply-To: <5EA78D17-1494-4057-A2C6-89BFC116C69F@continuum.io> References: <5EA78D17-1494-4057-A2C6-89BFC116C69F@continuum.io> Message-ID: On Mon, Dec 17, 2012 at 7:07 AM, Travis Oliphant wrote: > Hello all, > > There is a lot happening in my life right now and I am spread quite thin > among the various projects that I take an interest in. In particular, I > am thrilled to publicly announce on this list that Continuum Analytics has > received DARPA funding (to the tune of at least $3 million) for Blaze, > Numba, and Bokeh which we are writing to take NumPy, SciPy, and > visualization into the domain of very large data sets. This is part of > the XDATA program, and I will be taking an active role in it. You can > read more about Blaze here: http://blaze.pydata.org. You can read more > about XDATA here: http://www.darpa.mil/Our_Work/I2O/Programs/XDATA.aspx > Hi Travis, that is fantastic news, congratulations! I can't wait to see what you guys will come up with in the near future. Also thank you for the rest of this thoughtful post; it'll take me some time to digest but I enjoyed the reflection on the past. Best, Ralf > > I personally think Blaze is the future of array-oriented computing in > Python. I will be putting efforts and resources next year behind making > that case. How it interacts with future incarnations of NumPy, Pandas, or > other projects is an interesting and open question. I have no doubt the > future will be a rich ecosystem of interoperating array-oriented > data-structures. I invite anyone interested in Blaze to participate in > the discussions and development at > https://groups.google.com/a/continuum.io/forum/#!forum/blaze-dev or watch > the project on our public GitHub repo: > https://github.com/ContinuumIO/blaze. Blaze is being incubated under the > ContinuumIO GitHub project for now, but eventually I hope it will receive > its own GitHub project page later next year. Development of Blaze is > early but we are moving rapidly with it (and have deliverable deadlines --- > thus while we will welcome input and pull requests we won't have a ton of > time to respond to simple queries until > at least May or June). There is more that we are working on behind > the scenes with respect to Blaze that will be coming out next year as well > but isn't quite ready to show yet. > > As I look at the coming months and years, my time for direct involvement > in NumPy development is therefore only going to get smaller. As a result > it is not appropriate that I remain as "head steward" of the NumPy project > (a term I prefer to BFD12 or anything else). I'm sure that it is apparent > that while I've tried to help personally where I can this year on the NumPy > project, my role has been more one of coordination, seeking funding, and > providing expert advice on certain sections of code. I fundamentally > agree with Fernando Perez that the responsibility of care-taking open > source projects is one of stewardship --- something akin to public service. > I have tried to emulate that belief this year --- even while not always > succeeding. 
> > It is time for me to make official what is already becoming apparent to > observers of this community, namely, that I am stepping down as someone who > might be considered "head steward" for the NumPy project and officially > leaving the development of the project in the hands of others in the > community. I don't think the project actually needs a new "head steward" > --- especially from a development perspective. Instead I see a lot of > strong developers offering key opinions for the project as well as a great > set of new developers offering pull requests. > > My strong suggestion is that development discussions of the project > continue on this list with consensus among the active participants being > the goal for development. I don't think 100% consensus is a rigid > requirement --- but certainly a super-majority should be the goal, and > serious changes should not be made with out a clear consensus. I would > pay special attention to under-represented people (users with intense usage > of NumPy but small voices on this list). There are many of them. If > you push me for specifics then at this point in NumPy's history, I would > say that if Chuck, Nathaniel, and Ralf agree on a course of action, it will > likely be a good thing for the project. I suspect that even if only 2 of > the 3 agree at one time it might still be a good thing (but I would expect > more detail and discussion). There are others whose opinion should be > sought as well: Ondrej Certik, Perry Greenfield, Robert Kern, David > Cournapeau, Francesc Alted, and Mark Wiebe to > name a few. For some questions, I might even seek input from people > like Konrad Hinsen and Paul Dubois --- if they have time to give it. I > will still be willing to offer my view from time to time and if I am asked. > > Greg Wilson (of Software Carpentry fame) asked me recently what letter I > would have written to myself 5 years ago. What would I tell myself to do > given the knowledge I have now? I've thought about that for a bit, and > I have some answers. I don't know if these will help anyone, but I offer > them as hopefully instructive: > > 1) Do not promise to not break the ABI of NumPy --- and in fact > emphasize that it will be broken at least once in the 1.X series. NumPy > was designed to add new data-types --- but not without breaking the ABI. > NumPy has needed more data-types and still needs even more. While it's > not beautifully simple to add new data-types, it can be done. But, it is > impossible to add them without breaking the ABI in some fashion. The > desire to add new data-types *and* keep ABI compatibility has led to > significant pain. I think the ABI non-breakage goal has been amplified by > the poor state of package management in Python. The fact that it's > painful for someone to update their downstream packages when an upstream > ABI breaks (on Windows and Mac in particular) has put a lot of unfortunate > pressure on this community. Pressure that was not envisioned or > understood when I was writing NumPy. > > (As an aside: This is one reason Continuum has invested resources in > building the conda tool and a completely free set of binary packages called > Anaconda CE which is becoming more and more usable thanks to the efforts of > Bryan Van de Ven and Ilan Schnell and our testing team at Continuum. 
The > conda tool: http://docs.continuum.io/conda/index.html is open source and > BSD licensed and the next release will provide the ability to build > packages, build indexes on package repositories and interface with pip. > Expect a blog-post in the near future about how cool conda is!). > > 2) Don't create array-scalars. Instead, make the data-type object > a meta-type object whose instances are the items returned from NumPy > arrays. There is no need for a separate array-scalar object and in fact > it's confusing to the type-system. I understand that now. I did not > understand that 5 years ago. > > 3) Special-case small arrays to avoid the memory indirection and > look at PDL so that generalized ufuncs are supported from the beginning. > > 4) Define missing-value data-types and labels on the dimensions > and arrays > > 5) Define a standard "dictionary of NumPy arrays" interface as the > basic "structure of arrays" concept to go with the "array of structures" > that structured arrays provide. > > 6) Start work on SQL interface to NumPy arrays *now* > > Additional comments I would make to someone today: > > 1) Most of NumPy should be written in Python with Numba used as > the compiler (particularly as soon as Numba gets the ability to create > Python extension modules which is in the next release). > 2) There are still many, many optimizations that can be made in > NumPy run-time (especially in the face of modern hardware). > > I will continue to be available to answer questions and I may chime in > here and there on pull requests. However, most of my time for NumPy will > be on administrative aspects of the project where I will continue to take > an active interest. To help make sure that this happens in a transparent > way, I would like to propose that "administrative" support of the project > be left to the NumFOCUS board of which I am currently 1 of 9 members. The > other board members are currently: Ralf Gommers, Anthony Scopatz, Andy > Terrel, Prabhu Ramachandran, Fernando Perez, Emmanuelle Gouillart, Jarrod > Millman, and Perry Greenfield. While NumFOCUS basically seeks to > promote and fund the entire scientific Python stack, I think it can also > play a role in helping to administer some of the core projects which the > board members themselves have a personal interest in. > > By administrative support, I mean decisions like "what should be done with > any NumPy IP or web-domains" or "what kind of commercially-related ads or > otherwise should go on the NumPy home page", or "what should be done with > the NumPy github account", etc. --- basically anything that requires an > executive decision that is not directly development related. I don't > expect there to be many of these decisions. But, when they show up, I > would like them to be made in as transparent and public of a way as > possible. In practice, the way I see this working is that there are > members of the NumPy community who are (like me) particularly interested in > admin-related questions and serve on a NumPy team in the NumFOCUS > organization. I just know I'll be attending NumFOCUS board meetings, > and I would like to help move administrative decisions forward with NumPy > as part of the time I spend thinking about NumFOCUS. > > If people on this list would like to play an active role in those admin > discussions, then I would heartily welcome them into NumFOCUS membership > where they would work with interested members of the NumFOCUS board (like > me and Ralf) to help direct that organization. 
I would really love to > have someone from this list volunteer to serve on the NumPy team as part of > the NumFOCUS project. I am certainly going to be interested in the > opinions of people who are active participants on this list and on GitHub > pages for NumPy on anything admin related to NumPy, and I expect Ralf would > also be very interested in those views. > > One admin discussion that I will bring up in another email (as this one is > already too long) is about making 2 or 3 lists for NumPy such as > numpy-admin at numpy.org, numpy-dev at numpy.org, and numpy-users at numpy-org. > > Just because I'll be spending more time on Blaze, Numba, Bokeh, and the > PyData ecosystem does not mean that I won't be around for NumPy. I will > continue to promote NumPy. My involvement with Continuum connects me to > NumPy as Continuum continues to offer commercial support contracts for > NumPy (and SciPy and other open source projects). Continuum will also > continue to maintain its Github NumPy project which will contain pull > requests from our company that we are working to get into the mainline > branch. Continuum will also continue to provide resources for > release-management of NumPy (we have been funding Ondrej in this role for > the past 6 months --- though I would like to see this happen through > NumFOCUS in the future even if Continuum provides much of the money). We > also offer optimized versions of NumPy in our commercial Anaconda > distribution (Anaconda CE is free and open source). > > Also, I will still be available for questions and help (I'm not > disappearing --- just making it clear that I'm stepping back into an > occasional NumPy developer role). It has been extremely gratifying to see > the number of pull-requests, GitHub-conversations, and code contributions > increase this year. Even though the 1.7 release has taken a long time to > stabilize, there have been a lot of people participating in the discussion > and in helping to track down the problems, figure out what to do, and fix > them. It even makes it possible for people to think about 1.7 as a > long-term release. > > I will continue to hope that the spirit of openness, tolerance, respect, > and gratitude continue to permeate this mailing list, and that we continue > to seek to resolve any differences with trust and mutual respect. I know > I have offended people in the past with quick remarks and actions made > sometimes in haste without fully realizing how they might be taken. But, > I also know that like many of you I have always done the very best I could > for moving Python for scientific computing forward in the best way I know > how. > > Thank you for the great memories. If you will forgive a little > sentiment: My daughter who is in college now was 3 years old when I began > working with this community and went down a road that would lead to my > involvement with SciPy and NumPy. I have marked the building of my family > and the passage of time with where the Python for Scientific Computing > Community was at. Like many of you, I have given a great deal of > attention and time to building this community. That sacrifice and time > has led me to love what we have created. I know that I leave this > segment of the community with the tools in better hands than mine. I am > hopeful that NumPy will continue to be a useful array library for the > Python community for many years to come even as we all continue to build > new tools for the future. 
> > Very best regards, > > -Travis > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Mon Dec 17 02:11:17 2012 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Mon, 17 Dec 2012 08:11:17 +0100 Subject: [Numpy-discussion] Status of the 1.7 release In-Reply-To: References: Message-ID: On Mon, Dec 17, 2012 at 12:26 AM, Nathaniel Smith wrote: > On 16 Dec 2012 23:01, "Charles R Harris" > wrote: > > > > > > > > On Sun, Dec 16, 2012 at 3:50 PM, Ond?ej ?ert?k > wrote: > >> > >> Thanks Ralf and Nathan, > >> > >> I have put high priority on the issues that need to be fixed before the > rc1. > >> There are now 4 issues: > >> > >> > https://github.com/numpy/numpy/issues?labels=priority%3A+high&milestone=3&page=1&state=open > >> > >> I am working on the mingw one, as that one is the most difficult. > >> Ralf (or anyone else), do you know how to fix this one: > >> > >> https://github.com/numpy/numpy/issues/438 > >> > >> I am not very familiar with this part of numpy, so maybe you know how > >> to document it well. > >> > >> The sooner we can fix these 4 issues, the sooner we can release. > >> > > > > I believe mingw was updated last month to a new compiler version. I > don't know what other changes there were, but it is possible that some > problems have been fixed. > > It'd be worth checking in case it allows us to get off the (incredibly > old) GCC that we currently require on windows. But that's a long-term > problem that we probably shouldn't be messing with for 1.7 purposes. afaict > all we need to do for 1.7 is switch to using our current POSIX code on > win32 as well, instead of the (weird and broken) MS-specific API that we're > currently using. (Plus suppress some totally spurious warnings): > http://mail.scipy.org/pipermail/numpy-discussion/2012-July/063346.html > > (Or I could be missing something, but I don't think any problems with that > solution have been discussed on the list anyway.) > AFAICT Nathaniel's suggestion in the thread linked above is the way to go. Trying again to go to gcc 4.x doesn't sound like a good idea. Probably David C. already has a good idea about whether recent changes to MinGW have made a difference to the issue he ran into about a year ago. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From denis-bz-gg at t-online.de Mon Dec 17 06:52:57 2012 From: denis-bz-gg at t-online.de (denis) Date: Mon, 17 Dec 2012 11:52:57 +0000 (UTC) Subject: [Numpy-discussion] =?utf-8?q?On_the_difference_of_two_positive_de?= =?utf-8?q?finite=09matrices?= References: <50CE7346.4050002@it.uu.se> Message-ID: Charles R Harris gmail.com> writes: > > On Sun, Dec 16, 2012 at 6:20 PM, Virgil Stokes it.uu.se> wrote: > Suppose I have two positive definite matrices, A and B. Is it possible > to use U*D*U^T ?factorizations of these matrices to obtain a numerically > stable result for their difference, A - B ? ... > Not sure what you are asking, but there is a coordinate system in which they > are both diagonal. Nevertheless, the difference may not be positive > definite.Chuck http://en.wikipedia.org/wiki/Positive-definite_matrix #Simultaneous_diagonalization shows how to do that, but "note that this is no longer an orthogonal diagonalization"; orthogonal can do it iff A and B commute. 
cheers -- denis From pierre.raybaut at gmail.com Mon Dec 17 08:57:28 2012 From: pierre.raybaut at gmail.com (Pierre Raybaut) Date: Mon, 17 Dec 2012 14:57:28 +0100 Subject: [Numpy-discussion] ANN: WinPython v2.7.3.2 Message-ID: Hi all, I'm pleased to announce that WinPython v2.7.3.2 has been released for 32-bit and 64-bit Windows platforms: http://code.google.com/p/winpython/ This is mainly a maintenance release (many packages have been updated since v2.7.3.1). WinPython is a free open-source portable distribution of Python for Windows, designed for scientists. It is a full-featured (see http://code.google.com/p/winpython/wiki/PackageIndex) Python-based scientific environment: * Designed for scientists (thanks to the integrated libraries NumPy, SciPy, Matplotlib, guiqwt, etc.: * Regular *scientific users*: interactive data processing and visualization using Python with Spyder * *Advanced scientific users and software developers*: Python applications development with Spyder, version control with Mercurial and other development tools (like gettext) * *Portable*: preconfigured, it should run out of the box on any machine under Windows (without any installation requirements) and the folder containing WinPython can be moved to any location (local, network or removable drive) * *Flexible*: one can install (or should I write "use" as it's portable) as many WinPython versions as necessary (like isolated and self-consistent environments), even if those versions are running different versions of Python (2.7, 3.x in the near future) or different architectures (32bit or 64bit) on the same machine * *Customizable*: using the integrated package manager (wppm, as WinPython Package Manager), it's possible to install, uninstall or upgrade Python packages (see http://code.google.com/p/winpython/wiki/WPPM for more details on supported package formats). *WinPython is not an attempt to replace Python(x,y)*, this is just something different (see http://code.google.com/p/winpython/wiki/Roadmap): more flexible, easier to maintain, movable and less invasive for the OS, but certainly less user-friendly, with less packages/contents and without any integration to Windows explorer [*]. [*] Actually there is an optional integration into Windows explorer, providing the same features as the official Python installer regarding file associations and context menu entry (this option may be activated through the WinPython Control Panel). Enjoy! -Pierre From nouiz at nouiz.org Mon Dec 17 09:42:34 2012 From: nouiz at nouiz.org (=?ISO-8859-1?Q?Fr=E9d=E9ric_Bastien?=) Date: Mon, 17 Dec 2012 09:42:34 -0500 Subject: [Numpy-discussion] required nose version. In-Reply-To: References: Message-ID: This is fine for us. Fr?d?ric On Sun, Dec 16, 2012 at 5:36 PM, Charles R Harris wrote: > Hi All, > > Looking at INSTALL.txt with an eye to updating it since we have dropped > Python 2.4 -2.5 support, it looks like we could update the nose version > also. The first version of nose to support Python 3 was 1.0, but I think 1.1 > would better because of some bug fixes. IPython also requires nose 1.1. So I > propose the required nose version be updated to 1.1. Thoughts? 
> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From nouiz at nouiz.org Mon Dec 17 11:17:59 2012 From: nouiz at nouiz.org (=?ISO-8859-1?Q?Fr=E9d=E9ric_Bastien?=) Date: Mon, 17 Dec 2012 11:17:59 -0500 Subject: [Numpy-discussion] Status of the 1.7 release In-Reply-To: References: Message-ID: Hi, I added a new issue that is a regression about numpy.ndindex() that we already talked. But it was a duplicate[1], so I closed it. I think it got lost as the ticket wasn't marked for 1.7 milestone. Ccan someone do it? I don't have the right. This regression break something in Theano. We could work around it, this also break stuff in SciPy from a comment in that ticket. Fred [1] github.com/numpy/numpy/issues/2781 On Mon, Dec 17, 2012 at 2:11 AM, Ralf Gommers wrote: > > > > On Mon, Dec 17, 2012 at 12:26 AM, Nathaniel Smith wrote: >> >> On 16 Dec 2012 23:01, "Charles R Harris" >> wrote: >> > >> > >> > >> > On Sun, Dec 16, 2012 at 3:50 PM, Ond?ej ?ert?k >> > wrote: >> >> >> >> Thanks Ralf and Nathan, >> >> >> >> I have put high priority on the issues that need to be fixed before the >> >> rc1. >> >> There are now 4 issues: >> >> >> >> >> >> https://github.com/numpy/numpy/issues?labels=priority%3A+high&milestone=3&page=1&state=open >> >> >> >> I am working on the mingw one, as that one is the most difficult. >> >> Ralf (or anyone else), do you know how to fix this one: >> >> >> >> https://github.com/numpy/numpy/issues/438 >> >> >> >> I am not very familiar with this part of numpy, so maybe you know how >> >> to document it well. >> >> >> >> The sooner we can fix these 4 issues, the sooner we can release. >> >> >> > >> > I believe mingw was updated last month to a new compiler version. I >> > don't know what other changes there were, but it is possible that some >> > problems have been fixed. >> >> It'd be worth checking in case it allows us to get off the (incredibly >> old) GCC that we currently require on windows. But that's a long-term >> problem that we probably shouldn't be messing with for 1.7 purposes. afaict >> all we need to do for 1.7 is switch to using our current POSIX code on win32 >> as well, instead of the (weird and broken) MS-specific API that we're >> currently using. (Plus suppress some totally spurious warnings): >> http://mail.scipy.org/pipermail/numpy-discussion/2012-July/063346.html >> >> (Or I could be missing something, but I don't think any problems with that >> solution have been discussed on the list anyway.) > > AFAICT Nathaniel's suggestion in the thread linked above is the way to go. > > Trying again to go to gcc 4.x doesn't sound like a good idea. Probably David > C. already has a good idea about whether recent changes to MinGW have made a > difference to the issue he ran into about a year ago. > > Ralf > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From d.warde.farley at gmail.com Mon Dec 17 11:46:03 2012 From: d.warde.farley at gmail.com (David Warde-Farley) Date: Mon, 17 Dec 2012 11:46:03 -0500 Subject: [Numpy-discussion] Status of the 1.7 release In-Reply-To: References: Message-ID: A bit off-topic, but could someone have a look at https://github.com/numpy/numpy/pull/2699 and provide some feedback? If 1.7 is meant to be an LTS release, this would be a nice wart to have out of the way. 
The Travis failure was a spurious one that has since been fixed. On Sat, Dec 15, 2012 at 6:52 PM, Ond?ej ?ert?k wrote: > Hi, > > If you go to the issues for 1.7 and click "high priority": > > https://github.com/numpy/numpy/issues?labels=priority%3A+high&milestone=3&state=open > > you will see 3 issues as of right now. Two of those have PR attached. > It's been a lot of work > to get to this point and I'd like to thank all of you for helping out > with the issues. > > > In particular, I have just fixed a very annoying segfault (#2738) in the PR: > > https://github.com/numpy/numpy/pull/2831 > > If you can review that one carefully, that would be highly > appreciated. The more people the better, > it's a reference counting issue and since this would go into the 1.7 > release and it's in the core of numpy, > I want to make sure that it's correct. > > So the last high priority issue is: > > https://github.com/numpy/numpy/issues/568 > > and that's the one I will be concentrating on now. After it's fixed, I > think we are ready to release the rc1. > > There are more open issues (that are not "high priority"): > > https://github.com/numpy/numpy/issues?labels=&milestone=3&page=1&state=open > > But I don't think we should delay the release any longer because of > them. Let me know if there > are any objections. Of course, if you attach a PR fixing any of those, > we'll merge it. > > Ondrej > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From nouiz at nouiz.org Mon Dec 17 12:09:44 2012 From: nouiz at nouiz.org (=?ISO-8859-1?Q?Fr=E9d=E9ric_Bastien?=) Date: Mon, 17 Dec 2012 12:09:44 -0500 Subject: [Numpy-discussion] Status of the 1.7 release In-Reply-To: References: Message-ID: While we are at it, back-porting https://github.com/numpy/numpy/pull/2730 Would give a good speed up for an LTS. I made a new PR that do this back-port: https://github.com/numpy/numpy/pull/2847 Fred On Mon, Dec 17, 2012 at 11:46 AM, David Warde-Farley wrote: > A bit off-topic, but could someone have a look at > https://github.com/numpy/numpy/pull/2699 and provide some feedback? > > If 1.7 is meant to be an LTS release, this would be a nice wart to > have out of the way. The Travis failure was a spurious one that has > since been fixed. > > On Sat, Dec 15, 2012 at 6:52 PM, Ond?ej ?ert?k wrote: >> Hi, >> >> If you go to the issues for 1.7 and click "high priority": >> >> https://github.com/numpy/numpy/issues?labels=priority%3A+high&milestone=3&state=open >> >> you will see 3 issues as of right now. Two of those have PR attached. >> It's been a lot of work >> to get to this point and I'd like to thank all of you for helping out >> with the issues. >> >> >> In particular, I have just fixed a very annoying segfault (#2738) in the PR: >> >> https://github.com/numpy/numpy/pull/2831 >> >> If you can review that one carefully, that would be highly >> appreciated. The more people the better, >> it's a reference counting issue and since this would go into the 1.7 >> release and it's in the core of numpy, >> I want to make sure that it's correct. >> >> So the last high priority issue is: >> >> https://github.com/numpy/numpy/issues/568 >> >> and that's the one I will be concentrating on now. After it's fixed, I >> think we are ready to release the rc1. 
>> >> There are more open issues (that are not "high priority"): >> >> https://github.com/numpy/numpy/issues?labels=&milestone=3&page=1&state=open >> >> But I don't think we should delay the release any longer because of >> them. Let me know if there >> are any objections. Of course, if you attach a PR fixing any of those, >> we'll merge it. >> >> Ondrej >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From wesmckinn at gmail.com Mon Dec 17 12:19:49 2012 From: wesmckinn at gmail.com (Wes McKinney) Date: Mon, 17 Dec 2012 12:19:49 -0500 Subject: [Numpy-discussion] ANN: pandas 0.10.0 released Message-ID: hi all, I'm super excited to announce the pandas 0.10.0 release. This is a major release including a new high performance file reading engine with tons of new user-facing functionality as well, a bunch of work on the HDF5/PyTables integration layer, much-expanded Unicode support, a new option/configuration interface, integration with the Google Analytics API, and a wide array of other new features, bug fixes, and performance improvements. I strongly recommend that all users get upgraded as soon as feasible. Many performance improvements made are quite substantial over 0.9.x, see vbenchmarks at the end of the e-mail. As of this release, we are no longer supporting Python 2.5. Also, this is the first release to officially support Python 3.3. Note: there are a number of minor, but necessary API changes that long-time pandas users should pay attention to in the What's New. Thanks to all who contributed to this release, especially Chang She, Yoval P, and Jeff Reback (and everyone else listed in the commit log!). As always source archives and Windows installers are on PyPI. What's new: http://pandas.pydata.org/pandas-docs/stable/whatsnew.html Installers: http://pypi.python.org/pypi/pandas $ git log v0.9.1..v0.10.0 --pretty=format:%aN | sort | uniq -c | sort -rn 246 Wes McKinney 140 y-p 99 Chang She 45 jreback 18 Abraham Flaxman 17 Jeff Reback 14 locojaydev 11 Keith Hughitt 5 Adam Obeng 2 Dieter Vandenbussche 1 zach powers 1 Luke Lee 1 Laurent Gautier 1 Ken Van Haren 1 Jay Bourque 1 Donald Curtis 1 Chris Mulligan 1 alex arsenovic 1 A. Flaxman Happy data hacking! - Wes What is it ========== pandas is a Python package providing fast, flexible, and expressive data structures designed to make working with relational, time series, or any other kind of labeled data both easy and intuitive. It aims to be the fundamental high-level building block for doing practical, real world data analysis in Python. Links ===== Release Notes: http://github.com/pydata/pandas/blob/master/RELEASE.rst Documentation: http://pandas.pydata.org Installers: http://pypi.python.org/pypi/pandas Code Repository: http://github.com/pydata/pandas Mailing List: http://groups.google.com/group/pydata Performance vs. 
v0.9.0 ====================== Benchmarks from https://github.com/pydata/pandas/tree/master/vb_suite Ratio < 1 means that v0.10.0 is faster v0.10.0 v0.9.0 ratio name unstack_sparse_keyspace 1.2813 144.1262 0.0089 groupby_frame_apply_overhead 20.1520 337.3330 0.0597 read_csv_comment2 25.3097 363.2860 0.0697 groupbym_frame_apply 75.1554 504.1661 0.1491 frame_iteritems_cached 0.0711 0.3919 0.1815 read_csv_thou_vb 35.2690 191.9360 0.1838 concat_small_frames 12.9019 55.3561 0.2331 join_dataframe_integer_2key 5.8184 21.5823 0.2696 series_value_counts_strings 5.3824 19.1262 0.2814 append_frame_single_homogenous 0.3413 0.9319 0.3662 read_csv_vb 18.4084 46.9500 0.3921 read_csv_standard 12.0651 29.9940 0.4023 panel_from_dict_all_different_indexes 73.6860 158.2949 0.4655 frame_constructor_ndarray 0.0471 0.0958 0.4918 groupby_first 3.8502 7.1988 0.5348 groupby_last 3.6962 6.7792 0.5452 panel_from_dict_two_different_indexes 50.7428 86.4980 0.5866 append_frame_single_mixed 1.2950 2.1930 0.5905 frame_get_numeric_data 0.0695 0.1119 0.6212 replace_fillna 4.6349 7.0540 0.6571 frame_to_csv 281.9340 427.7921 0.6590 replace_replacena 4.7154 7.1207 0.6622 frame_iteritems 2.5862 3.7463 0.6903 series_align_int64_index 29.7370 41.2791 0.7204 join_dataframe_integer_key 1.7980 2.4303 0.7398 groupby_multi_size 31.0066 41.7001 0.7436 groupby_frame_singlekey_integer 2.3579 3.1649 0.7450 write_csv_standard 326.8259 427.3241 0.7648 groupby_simple_compress_timing 41.2113 52.3993 0.7865 frame_fillna_inplace 16.2843 20.0491 0.8122 reindex_fillna_backfill 0.1364 0.1667 0.8181 groupby_multi_series_op 15.2914 18.6651 0.8193 groupby_multi_cython 17.2169 20.4420 0.8422 frame_fillna_many_columns_pad 14.9510 17.5114 0.8538 panel_from_dict_equiv_indexes 25.8427 29.9682 0.8623 merge_2intkey_nosort 19.0755 22.1138 0.8626 sparse_series_to_frame 167.8529 192.9920 0.8697 reindex_fillna_pad 0.1410 0.1617 0.8720 merge_2intkey_sort 44.7863 51.3315 0.8725 reshape_stack_simple 2.6698 3.0502 0.8753 groupby_indices 7.2264 8.2314 0.8779 sort_level_one 4.3845 4.9902 0.8786 sort_level_zero 4.3362 4.9198 0.8814 write_store 16.0587 18.2042 0.8821 frame_reindex_both_axes 0.3726 0.4183 0.8907 groupby_multi_different_numpy_functions 13.4164 15.0509 0.8914 index_int64_intersection 25.3705 28.1867 0.9001 groupby_frame_median 7.7491 8.6011 0.9009 frame_drop_dup_na_inplace 2.6290 2.9155 0.9017 dataframe_reindex_columns 0.3052 0.3372 0.9049 join_dataframe_index_multi 20.5651 22.6893 0.9064 frame_ctor_list_of_dict 101.7439 112.2260 0.9066 groupby_pivot_table 18.4551 20.3184 0.9083 reindex_frame_level_align 0.9644 1.0531 0.9158 stat_ops_level_series_sum_multiple 7.3637 8.0230 0.9178 write_store_mixed 38.2528 41.6604 0.9182 frame_reindex_both_axes_ix 0.4550 0.4950 0.9192 stat_ops_level_frame_sum_multiple 8.1975 8.9055 0.9205 panel_from_dict_same_index 25.7938 28.0147 0.9207 groupby_series_simple_cython 5.1310 5.5624 0.9224 frame_sort_index_by_columns 41.9577 45.1816 0.9286 groupby_multi_python 54.9727 59.0400 0.9311 datetimeindex_add_offset 0.2417 0.2584 0.9356 frame_boolean_row_select 0.2905 0.3100 0.9373 frame_reindex_axis1 2.9760 3.1742 0.9376 stat_ops_level_series_sum 2.3382 2.4937 0.9376 groupby_multi_different_functions 14.0333 14.9571 0.9382 timeseries_timestamp_tzinfo_cons 0.0159 0.0169 0.9397 stats_rolling_mean 1.6904 1.7959 0.9413 melt_dataframe 1.5236 1.6181 0.9416 timeseries_asof_single 0.0548 0.0582 0.9416 frame_ctor_nested_dict_int64 134.3100 142.6389 0.9416 join_dataframe_index_single_key_bigger 15.6578 16.5949 0.9435 stat_ops_level_frame_sum 
3.2475 3.4414 0.9437 indexing_dataframe_boolean_rows 0.2382 0.2518 0.9459 timeseries_asof_nan 10.0433 10.6006 0.9474 frame_reindex_axis0 1.4403 1.5184 0.9485 concat_series_axis1 69.2988 72.8099 0.9518 join_dataframe_index_single_key_small 6.8492 7.1847 0.9533 dataframe_reindex_daterange 0.4054 0.4240 0.9562 join_dataframe_index_single_key_bigger 6.4616 6.7578 0.9562 timeseries_timestamp_downsample_mean 4.5849 4.7787 0.9594 frame_fancy_lookup 2.5498 2.6544 0.9606 series_value_counts_int64 2.5569 2.6581 0.9619 frame_fancy_lookup_all 30.7510 31.8465 0.9656 index_int64_union 82.2279 85.1500 0.9657 indexing_dataframe_boolean_rows_object 0.4809 0.4977 0.9662 frame_ctor_nested_dict 91.6129 94.8122 0.9663 stat_ops_series_std 0.2450 0.2533 0.9673 groupby_frame_cython_many_columns 3.7642 3.8894 0.9678 timeseries_asof 10.4352 10.7721 0.9687 series_ctor_from_dict 3.7707 3.8749 0.9731 frame_drop_dup_inplace 3.0007 3.0746 0.9760 timeseries_large_lookup_value 0.0242 0.0248 0.9764 read_table_multiple_date_baseline 1201.2930 1224.3881 0.9811 dti_reset_index 0.6339 0.6457 0.9817 read_table_multiple_date 2600.7280 2647.8729 0.9822 reindex_frame_level_reindex 0.9524 0.9674 0.9845 reindex_multiindex 1.3483 1.3685 0.9853 frame_insert_500_columns 102.1249 103.4329 0.9874 frame_drop_duplicates 19.3780 19.6157 0.9879 reindex_daterange_backfill 0.1870 0.1889 0.9899 stats_rank2d_axis0_average 25.0480 25.2801 0.9908 series_align_left_monotonic 13.1929 13.2558 0.9953 timeseries_add_irregular 22.4635 22.5122 0.9978 read_store_mixed 13.4398 13.4560 0.9988 lib_fast_zip 11.1289 11.1354 0.9994 match_strings 0.3831 0.3833 0.9995 read_store 5.5526 5.5290 1.0043 timeseries_sort_index 22.7172 22.5976 1.0053 timeseries_1min_5min_mean 0.6224 0.6175 1.0079 stats_rank2d_axis1_average 14.6569 14.5339 1.0085 reindex_daterange_pad 0.1886 0.1867 1.0102 timeseries_period_downsample_mean 6.4241 6.3480 1.0120 frame_drop_duplicates_na 19.3303 19.0970 1.0122 stats_rank_average_int 23.3569 22.9996 1.0155 lib_fast_zip_fillna 14.1394 13.8473 1.0211 index_datetime_intersection 17.2626 16.8986 1.0215 timeseries_1min_5min_ohlc 0.7054 0.6891 1.0237 stats_rank_average 31.3440 30.3845 1.0316 timeseries_infer_freq 10.9854 10.6439 1.0321 timeseries_slice_minutely 0.0637 0.0611 1.0418 index_datetime_union 17.9083 17.1640 1.0434 series_align_irregular_string 89.9470 85.1344 1.0565 series_constructor_ndarray 0.0127 0.0119 1.0742 indexing_panel_subset 0.5692 0.5214 1.0917 groupby_apply_dict_return 46.3497 42.3220 1.0952 reshape_unstack_simple 3.2901 2.9089 1.1310 timeseries_to_datetime_iso8601 4.2305 3.6015 1.1746 frame_to_string_floats 53.6217 37.2041 1.4413 reshape_pivot_time_series 170.4340 107.9068 1.5795 sparse_frame_constructor 6.2714 3.5053 1.7891 datetimeindex_normalize 37.2718 6.9329 5.3761 Columns: test_name | target_duration [ms] | baseline_duration [ms] | ratio From charlesr.harris at gmail.com Mon Dec 17 13:50:44 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 17 Dec 2012 11:50:44 -0700 Subject: [Numpy-discussion] DARPA funding for Blaze and passing the NumPy torch In-Reply-To: <5EA78D17-1494-4057-A2C6-89BFC116C69F@continuum.io> References: <5EA78D17-1494-4057-A2C6-89BFC116C69F@continuum.io> Message-ID: Hi Travis, On Sun, Dec 16, 2012 at 11:07 PM, Travis Oliphant wrote: > Hello all, > > There is a lot happening in my life right now and I am spread quite thin > among the various projects that I take an interest in. 
In particular, I > am thrilled to publicly announce on this list that Continuum Analytics has > received DARPA funding (to the tune of at least $3 million) for Blaze, > Numba, and Bokeh which we are writing to take NumPy, SciPy, and > visualization into the domain of very large data sets. This is part of > the XDATA program, and I will be taking an active role in it. You can > read more about Blaze here: http://blaze.pydata.org. You can read more > about XDATA here: http://www.darpa.mil/Our_Work/I2O/Programs/XDATA.aspx > > I personally think Blaze is the future of array-oriented computing in > Python. I will be putting efforts and resources next year behind making > that case. How it interacts with future incarnations of NumPy, Pandas, or > other projects is an interesting and open question. I have no doubt the > future will be a rich ecosystem of interoperating array-oriented > data-structures. I invite anyone interested in Blaze to participate in > the discussions and development at > https://groups.google.com/a/continuum.io/forum/#!forum/blaze-dev or watch > the project on our public GitHub repo: > https://github.com/ContinuumIO/blaze. Blaze is being incubated under the > ContinuumIO GitHub project for now, but eventually I hope it will receive > its own GitHub project page later next year. Development of Blaze is > early but we are moving rapidly with it (and have deliverable deadlines --- > thus while we will welcome input and pull requests we won't have a ton of > time to respond to simple queries until > at least May or June). There is more that we are working on behind > the scenes with respect to Blaze that will be coming out next year as well > but isn't quite ready to show yet. > > As I look at the coming months and years, my time for direct involvement > in NumPy development is therefore only going to get smaller. As a result > it is not appropriate that I remain as "head steward" of the NumPy project > (a term I prefer to BFD12 or anything else). I'm sure that it is apparent > that while I've tried to help personally where I can this year on the NumPy > project, my role has been more one of coordination, seeking funding, and > providing expert advice on certain sections of code. I fundamentally > agree with Fernando Perez that the responsibility of care-taking open > source projects is one of stewardship --- something akin to public service. > I have tried to emulate that belief this year --- even while not always > succeeding. > > It is time for me to make official what is already becoming apparent to > observers of this community, namely, that I am stepping down as someone who > might be considered "head steward" for the NumPy project and officially > leaving the development of the project in the hands of others in the > community. I don't think the project actually needs a new "head steward" > --- especially from a development perspective. Instead I see a lot of > strong developers offering key opinions for the project as well as a great > set of new developers offering pull requests. > > My strong suggestion is that development discussions of the project > continue on this list with consensus among the active participants being > the goal for development. I don't think 100% consensus is a rigid > requirement --- but certainly a super-majority should be the goal, and > serious changes should not be made with out a clear consensus. I would > pay special attention to under-represented people (users with intense usage > of NumPy but small voices on this list). 
There are many of them. If > you push me for specifics then at this point in NumPy's history, I would > say that if Chuck, Nathaniel, and Ralf agree on a course of action, it will > likely be a good thing for the project. I suspect that even if only 2 of > the 3 agree at one time it might still be a good thing (but I would expect > more detail and discussion). There are others whose opinion should be > sought as well: Ondrej Certik, Perry Greenfield, Robert Kern, David > Cournapeau, Francesc Alted, and Mark Wiebe to > name a few. For some questions, I might even seek input from people > like Konrad Hinsen and Paul Dubois --- if they have time to give it. I > will still be willing to offer my view from time to time and if I am asked. > > Greg Wilson (of Software Carpentry fame) asked me recently what letter I > would have written to myself 5 years ago. What would I tell myself to do > given the knowledge I have now? I've thought about that for a bit, and > I have some answers. I don't know if these will help anyone, but I offer > them as hopefully instructive: > > 1) Do not promise to not break the ABI of NumPy --- and in fact > emphasize that it will be broken at least once in the 1.X series. NumPy > was designed to add new data-types --- but not without breaking the ABI. > NumPy has needed more data-types and still needs even more. While it's > not beautifully simple to add new data-types, it can be done. But, it is > impossible to add them without breaking the ABI in some fashion. The > desire to add new data-types *and* keep ABI compatibility has led to > significant pain. I think the ABI non-breakage goal has been amplified by > the poor state of package management in Python. The fact that it's > painful for someone to update their downstream packages when an upstream > ABI breaks (on Windows and Mac in particular) has put a lot of unfortunate > pressure on this community. Pressure that was not envisioned or > understood when I was writing NumPy. > > (As an aside: This is one reason Continuum has invested resources in > building the conda tool and a completely free set of binary packages called > Anaconda CE which is becoming more and more usable thanks to the efforts of > Bryan Van de Ven and Ilan Schnell and our testing team at Continuum. The > conda tool: http://docs.continuum.io/conda/index.html is open source and > BSD licensed and the next release will provide the ability to build > packages, build indexes on package repositories and interface with pip. > Expect a blog-post in the near future about how cool conda is!). > > 2) Don't create array-scalars. Instead, make the data-type object > a meta-type object whose instances are the items returned from NumPy > arrays. There is no need for a separate array-scalar object and in fact > it's confusing to the type-system. I understand that now. I did not > understand that 5 years ago. > > 3) Special-case small arrays to avoid the memory indirection and > look at PDL so that generalized ufuncs are supported from the beginning. > > 4) Define missing-value data-types and labels on the dimensions > and arrays > > 5) Define a standard "dictionary of NumPy arrays" interface as the > basic "structure of arrays" concept to go with the "array of structures" > that structured arrays provide. 
> > 6) Start work on SQL interface to NumPy arrays *now* > > Additional comments I would make to someone today: > > 1) Most of NumPy should be written in Python with Numba used as > the compiler (particularly as soon as Numba gets the ability to create > Python extension modules which is in the next release). > 2) There are still many, many optimizations that can be made in > NumPy run-time (especially in the face of modern hardware). > > I will continue to be available to answer questions and I may chime in > here and there on pull requests. However, most of my time for NumPy will > be on administrative aspects of the project where I will continue to take > an active interest. To help make sure that this happens in a transparent > way, I would like to propose that "administrative" support of the project > be left to the NumFOCUS board of which I am currently 1 of 9 members. The > other board members are currently: Ralf Gommers, Anthony Scopatz, Andy > Terrel, Prabhu Ramachandran, Fernando Perez, Emmanuelle Gouillart, Jarrod > Millman, and Perry Greenfield. While NumFOCUS basically seeks to > promote and fund the entire scientific Python stack, I think it can also > play a role in helping to administer some of the core projects which the > board members themselves have a personal interest in. > > By administrative support, I mean decisions like "what should be done with > any NumPy IP or web-domains" or "what kind of commercially-related ads or > otherwise should go on the NumPy home page", or "what should be done with > the NumPy github account", etc. --- basically anything that requires an > executive decision that is not directly development related. I don't > expect there to be many of these decisions. But, when they show up, I > would like them to be made in as transparent and public of a way as > possible. In practice, the way I see this working is that there are > members of the NumPy community who are (like me) particularly interested in > admin-related questions and serve on a NumPy team in the NumFOCUS > organization. I just know I'll be attending NumFOCUS board meetings, > and I would like to help move administrative decisions forward with NumPy > as part of the time I spend thinking about NumFOCUS. > > If people on this list would like to play an active role in those admin > discussions, then I would heartily welcome them into NumFOCUS membership > where they would work with interested members of the NumFOCUS board (like > me and Ralf) to help direct that organization. I would really love to > have someone from this list volunteer to serve on the NumPy team as part of > the NumFOCUS project. I am certainly going to be interested in the > opinions of people who are active participants on this list and on GitHub > pages for NumPy on anything admin related to NumPy, and I expect Ralf would > also be very interested in those views. > > One admin discussion that I will bring up in another email (as this one is > already too long) is about making 2 or 3 lists for NumPy such as > numpy-admin at numpy.org, numpy-dev at numpy.org, and numpy-users at numpy-org. > > Just because I'll be spending more time on Blaze, Numba, Bokeh, and the > PyData ecosystem does not mean that I won't be around for NumPy. I will > continue to promote NumPy. My involvement with Continuum connects me to > NumPy as Continuum continues to offer commercial support contracts for > NumPy (and SciPy and other open source projects). 
Continuum will also > continue to maintain its Github NumPy project which will contain pull > requests from our company that we are working to get into the mainline > branch. Continuum will also continue to provide resources for > release-management of NumPy (we have been funding Ondrej in this role for > the past 6 months --- though I would like to see this happen through > NumFOCUS in the future even if Continuum provides much of the money). We > also offer optimized versions of NumPy in our commercial Anaconda > distribution (Anaconda CE is free and open source). > > Also, I will still be available for questions and help (I'm not > disappearing --- just making it clear that I'm stepping back into an > occasional NumPy developer role). It has been extremely gratifying to see > the number of pull-requests, GitHub-conversations, and code contributions > increase this year. Even though the 1.7 release has taken a long time to > stabilize, there have been a lot of people participating in the discussion > and in helping to track down the problems, figure out what to do, and fix > them. It even makes it possible for people to think about 1.7 as a > long-term release. > > I will continue to hope that the spirit of openness, tolerance, respect, > and gratitude continue to permeate this mailing list, and that we continue > to seek to resolve any differences with trust and mutual respect. I know > I have offended people in the past with quick remarks and actions made > sometimes in haste without fully realizing how they might be taken. But, > I also know that like many of you I have always done the very best I could > for moving Python for scientific computing forward in the best way I know > how. > > Thank you for the great memories. If you will forgive a little > sentiment: My daughter who is in college now was 3 years old when I began > working with this community and went down a road that would lead to my > involvement with SciPy and NumPy. I have marked the building of my family > and the passage of time with where the Python for Scientific Computing > Community was at. Like many of you, I have given a great deal of > attention and time to building this community. That sacrifice and time > has led me to love what we have created. I know that I leave this > segment of the community with the tools in better hands than mine. I am > hopeful that NumPy will continue to be a useful array library for the > Python community for many years to come even as we all continue to build > new tools for the future. > > Congratulations on the DARPA grant and best wishes for the success of your enterprises. We will all do our best to keep Numpy moving forward and hope that Blaze will contribute to that. One administrative detail you might want to deal with at the point is ownership of the Numpy github repositories. I note that the Scipy repositories have a number of owners, but you are currently the sole owner of the Numpy site. May I suggest adding a few more owners? I'd recommend Ralf, Pauli, Nathaniel, and myself as additions. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From andrew.collette at gmail.com Mon Dec 17 14:55:54 2012 From: andrew.collette at gmail.com (Andrew Collette) Date: Mon, 17 Dec 2012 12:55:54 -0700 Subject: [Numpy-discussion] Status of the 1.7 release In-Reply-To: References: Message-ID: I am not very familiar with the NumPy development and release strategy, but is there any chance this fix could be included in 1.7.0? 
https://github.com/numpy/numpy/pull/2798 This is the source of a recently reported bug in h5py and there is nothing I can do to work around it without breaking other parts of the project. If I can be of any help just let me know how. Andrew Collette From chris.barker at noaa.gov Mon Dec 17 15:30:04 2012 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Mon, 17 Dec 2012 12:30:04 -0800 Subject: [Numpy-discussion] www.numpy.org home page In-Reply-To: References: <6E409FEC-FD9F-41F6-BC02-F227EE42F077@gmail.com> Message-ID: On Sun, Dec 16, 2012 at 7:48 AM, klo wrote: >> "NumPy 1.5 Beginner's Guide", Ivan Idris, >> http://www.packtpub.com/numpy-1-5-using-real-world-examples-beginners-guide/book > Some reviews on first title: > > http://gael-varoquaux.info/blog/?p=161 > http://glowingpython.blogspot.com/2011/12/book-review-numpy-15-beginners-guide.html Interesting -- I was asked to review the Numpy 1.5 Beginner's Guide, and I did read through the whole thing, and make notes, but never wrote up a full review. One reason is that I found it hard to motivate myself to write what would have been a bad review. I don't really disagree with the summaries provided in the two reviews above, but I found two things: 1) There was too much wasted space in the book -- places where there would be a nice example, then another one that was the same thing, but, for example, using a different random number distribution -- we really don't need to see almost identical code and plots take up another bunch of pages -- i.e. I think the book could have been maybe half as long with about the same content. 2) -- and this is worse -- The author did not seem to be all that familiar with the real strengths of numpy and idiomatic numpy code -- it felt like almost a translation of MATLAB material, or at least written by someone that had not yet made the full transition to numpy from another language. Aside from small style issues was the glaring omission of any discussion of array broadcasting! So it may be hard to reach a consensus on a book being "good"! That being said, I don't have a problem with listing this book on numpy sites -- though it would be nice to have easy access of reviews right there, too. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From pivanov314 at gmail.com Mon Dec 17 20:50:34 2012 From: pivanov314 at gmail.com (Paul Ivanov) Date: Mon, 17 Dec 2012 17:50:34 -0800 Subject: [Numpy-discussion] www.numpy.org home page In-Reply-To: References: <6E409FEC-FD9F-41F6-BC02-F227EE42F077@gmail.com> Message-ID: On Mon, Dec 17, 2012 at 12:30 PM, Chris Barker - NOAA Federal wrote: > Interesting -- I was asked to review the Numpy 1.5 Beginner's Guide, > and I did read through the whole thing, and make notes, but never > wrote up a full review. One reason is that I found it hard to motivate > myself to write what would have been a bad review. This was also my experience. I would go so far as to say that it would be a disservice to our community to link to that book. Our documentation is better. 
-- Paul Ivanov 314 address only used for lists, off-list direct email at: http://pirsquared.org | GPG/PGP key id: 0x0F3E28F7 From jason-sage at creativetrax.com Mon Dec 17 21:10:05 2012 From: jason-sage at creativetrax.com (Jason Grout) Date: Mon, 17 Dec 2012 19:10:05 -0700 Subject: [Numpy-discussion] DARPA funding for Blaze and passing the NumPy torch In-Reply-To: <5EA78D17-1494-4057-A2C6-89BFC116C69F@continuum.io> References: <5EA78D17-1494-4057-A2C6-89BFC116C69F@continuum.io> Message-ID: <50CFD07D.6050802@creativetrax.com> On 12/16/12 11:07 PM, Travis Oliphant wrote: > Hello all, > > There is a lot happening in my life right now and I am spread quite > thin among the various projects that I take an interest in. In > particular, I am thrilled to publicly announce on this list that > Continuum Analytics has received DARPA funding (to the tune of at > least $3 million) for Blaze, Numba, and Bokeh which we are writing to > take NumPy, SciPy, and visualization into the domain of very large > data sets. This is part of the XDATA program, and I will be taking > an active role in it. You can read more about Blaze here: > http://blaze.pydata.org. You can read more about XDATA here: > http://www.darpa.mil/Our_Work/I2O/Programs/XDATA.aspx This is awesome. As with the recent IPython grant, it would be great if you guys got some good publicity from this. For example, I see an article up on Hacker News about blaze, but there doesn't seem to be a mention of big funding. Has someone written a press release? Has someone submitted the grant news to Hacker News or Slashdot, where you might attract attention and mindshare? Thanks, Jason From pivanov314 at gmail.com Mon Dec 17 21:19:29 2012 From: pivanov314 at gmail.com (Paul Ivanov) Date: Mon, 17 Dec 2012 18:19:29 -0800 Subject: [Numpy-discussion] www.numpy.org home page In-Reply-To: References: <6E409FEC-FD9F-41F6-BC02-F227EE42F077@gmail.com> Message-ID: On Mon, Dec 17, 2012 at 5:50 PM, Paul Ivanov wrote: > On Mon, Dec 17, 2012 at 12:30 PM, Chris Barker - NOAA Federal > wrote: >> Interesting -- I was asked to review the Numpy 1.5 Beginner's Guide, >> and I did read through the whole thing, and make notes, but never >> wrote up a full review. One reason is that I found it hard to motivate >> myself to write what would have been a bad review. > > This was also my experience. I would go so far as to say that it would > be a disservice to our community to link to that book. Our > documentation is better. I dug up the skeleton of the review that I had written up until I lost steam and interest in going further, I think it may shed more light on my negative opinion of this book. Packt publishing approached me about doing a review of one of their newest Python books: _NumPy 1.5 Beginner's Guide_ I think it's great that publishers are making it easier for folks to get started in this hot area of computing, though obviously, being a vested member of the Scientific Python community, I'm not exactly unbiased in my opinions on the topic. I received a complementary e-book copy. 
Here are my thoughts It correctly mentions that NumPy can use a LAPACK implementation if one is available on your system, and also correctly mentions that NumPy provides its own implementation if it can't find one - but neglects to state the important fact that this will be very slow relative to a traditional LAPACK implementation No mention of CPython - since there are so many other flavors of Python out there, these days, and NumPy doesn't work on most of them Mentions that NumPy is "open source" and "free as in beer", but neglects to specifically state its license. Numpy 1.5 is in the book title - but NumPy 2.0.0.dev20100915 listed under "Here is a list of software used to develop and test the code examples" in the Preface. - the need to register for an account in order to download the sample code is annoying The author of the book uses a 64-bit machine. How do I know that? The first code example provided in the book does not work for the second set of inputs. 20:32 at ch1code$ python vectorsum.py 2000 The last 2 elements of the sum [7980015996L, 7992002000L] PythonSum elapsed time in microseconds 4943 Warning: invalid value encountered in power The last 2 elements of the sum [-2143491644 -2143487647] NumPySum elapsed time in microseconds 722 20:32 at ch1code$ Indeed the vectorsum example: doesn't work for values > 1291 ( int overflow on my 32 bit machine ) python vectorsum.py 1291 20:32 at ch1code$ python vectorsum.py 1291 The last 2 elements of the sum [2143362090, 2148353100L] PythonSum elapsed time in microseconds 1771 The last 2 elements of the sum [ 2143362090 -2146614196] NumPySum elapsed time in microseconds 374 So though the answer is attained way faster using numpy, it's wrong in this case! "What just happened?" section a bit annoying - just space filler without utility - reminiscent of closing curling brackets or line-ending semicolons of those other programming languages :) Ditto with the "time for action" headings. import numpy as np would have been nice - since that's the convention used in numpy's own doc strings. "What does arange(5) do" - namespaces are one honking good idea... Re: IPython: "The Pylab switch imports all the Scipy, NumPy, and Matplotlib packages. Without this switch, we would have to import every package we need ourselves." - biggest use of ``--pylab`` is to get a separate event loop for plots. It's confusing to have a Numpy book that, in the first chapter, dives into IPython! The book is unfocused in its presentation. A simple mention of %quickref would have sufficed for pointing out features of IPython TypeError for ints and floats is a property of the Python programming language - it's not specific to Numpy. Reshape function should have mentioned that it only changes only the metadata, so reshaping a really large array takes the same amount of time as a small one. More style problems: "This selects the first floor" Mention of ravel() and flatten() without mention of flat (which is mentioned several pages later). Mention of a.transpose() without a.T - which is mentioned seven pages later. Something I didn't know: "The flat attribute is settable. Setting the value of the flat attribute leads to overwriting the values of the whole array" (p 46). And then I figured out why I didn't know this (it's slow): In [31]: timeit a.flat=1 10 loops, best of 3: 80.4 ms per loop In [32]: timeit a.fill(1) 100 loops, best of 3: 5.62 ms per loop Unfortunately, the fill() method is *not* mentioned here, it's mentioned 25 pages later. VWA-what? 
too simplistic explanation to be useful. Mean - cheeky language, but the fact that it's also a method on numpy array is only mentioned in passing four pages later. That you can specify ``axis`` keyword to get means across rows or columns, etc., is also not mentioned here. The same thing for min, max, ptp. Uses the memory-copying numpy.msort() to sort an array, instead of the in-place a.sort(), or talking about the more general numpy.sort() function. The explanation of numpy.var include the important point about how "some books tell us to divide by the number of elements in the array minus one," but then fails to mention the ddof keyword argument to var and std. A shout-out to numpy.std would have been nice here, too. I don't care for the finance-focus of the example in this book, but I do care about being talked down to: In academic literature it is more common to base analysis on stock returns and log returns of the close price. Simple returns are just the rate of change from one value to the next. Logarithmic returns or log returns are determined by taking the log of all the prices and calculating the differences between them. In high school, we learned that the difference between the log of "a" and the log of "b" is equal to the log of "a divided by b". Log return, therefore, also measures rate of change. Indexing with masks (e.g. all positive values of an array as a[a>0]), not mentioned until chapter X, whereas it had a natural fit in either the indexing section of Chapter 2, or along side the use of numpy.where() in Chapter 3. The book talks about how it's useful for scientists and engineers, and then has a heavy focus on stock-market-related finance one-off examples. My eyes glazed over with the acronyms and, what to me, are meaningless sets of quantities. Image: ATR - if it's not important, don't tell me about it! Bollinger Bands: gimme a break! np.piecewise - I've never used it - looks quite useful As a newcomer to NumPy, I would have been too distracted by the financial focus of all of the example to get a general picture of what NumPy goodies. inconsistencies in style: numpy.arange followed by plot() plot() plot() show() no discussion of broadcasting. go ahead and time it graphic: no mention of ipython's timeit magic, or even just the python standard library timeit module. dot function described independent of matrix class multiplication. Typos / Errata -------------- numpy.arange - "the arange function was imported, that's why it is prefixed with numpy" - should be "the arange function was *not* imported..." page 17 hstack and vstack visuals on page 39 are identical in the resulting array, and should not be. poly = numpy.polyfit(t, bhp - vale, int(sys.argv[1])) on page 85 should make reference to how the script must be passed an argument for the degree of the polynomial, or the int(sys.argv[1]) should be changed to just 3 to suit the result. 
pg 86 - "extremums" should read "extrema" -- Paul Ivanov 314 address only used for lists, off-list direct email at: http://pirsquared.org | GPG/PGP key id: 0x0F3E28F7 From d.s.seljebotn at astro.uio.no Tue Dec 18 02:06:10 2012 From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn) Date: Tue, 18 Dec 2012 08:06:10 +0100 Subject: [Numpy-discussion] DARPA funding for Blaze and passing the NumPy torch In-Reply-To: <50CFD07D.6050802@creativetrax.com> References: <5EA78D17-1494-4057-A2C6-89BFC116C69F@continuum.io> <50CFD07D.6050802@creativetrax.com> Message-ID: <50D015E2.3090705@astro.uio.no> On 12/18/2012 03:10 AM, Jason Grout wrote: > On 12/16/12 11:07 PM, Travis Oliphant wrote: >> Hello all, >> >> There is a lot happening in my life right now and I am spread quite >> thin among the various projects that I take an interest in. In >> particular, I am thrilled to publicly announce on this list that >> Continuum Analytics has received DARPA funding (to the tune of at >> least $3 million) for Blaze, Numba, and Bokeh which we are writing to >> take NumPy, SciPy, and visualization into the domain of very large >> data sets. This is part of the XDATA program, and I will be taking >> an active role in it. You can read more about Blaze here: >> http://blaze.pydata.org. You can read more about XDATA here: >> http://www.darpa.mil/Our_Work/I2O/Programs/XDATA.aspx > > > This is awesome. As with the recent IPython grant, it would be great if > you guys got some good publicity from this. For example, I see an > article up on Hacker News about blaze, but there doesn't seem to be a > mention of big funding. Has someone written a press release? Has > someone submitted the grant news to Hacker News or Slashdot, where you > might attract attention and mindshare? The IPython grant was on HN front page for a day. Dag Sverre From d.s.seljebotn at astro.uio.no Tue Dec 18 02:06:35 2012 From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn) Date: Tue, 18 Dec 2012 08:06:35 +0100 Subject: [Numpy-discussion] DARPA funding for Blaze and passing the NumPy torch In-Reply-To: <50D015E2.3090705@astro.uio.no> References: <5EA78D17-1494-4057-A2C6-89BFC116C69F@continuum.io> <50CFD07D.6050802@creativetrax.com> <50D015E2.3090705@astro.uio.no> Message-ID: <50D015FB.60108@astro.uio.no> On 12/18/2012 08:06 AM, Dag Sverre Seljebotn wrote: > On 12/18/2012 03:10 AM, Jason Grout wrote: >> On 12/16/12 11:07 PM, Travis Oliphant wrote: >>> Hello all, >>> >>> There is a lot happening in my life right now and I am spread quite >>> thin among the various projects that I take an interest in. In >>> particular, I am thrilled to publicly announce on this list that >>> Continuum Analytics has received DARPA funding (to the tune of at >>> least $3 million) for Blaze, Numba, and Bokeh which we are writing to >>> take NumPy, SciPy, and visualization into the domain of very large >>> data sets. This is part of the XDATA program, and I will be taking >>> an active role in it. You can read more about Blaze here: >>> http://blaze.pydata.org. You can read more about XDATA here: >>> http://www.darpa.mil/Our_Work/I2O/Programs/XDATA.aspx >> >> >> This is awesome. As with the recent IPython grant, it would be great if >> you guys got some good publicity from this. For example, I see an >> article up on Hacker News about blaze, but there doesn't seem to be a >> mention of big funding. Has someone written a press release? Has >> someone submitted the grant news to Hacker News or Slashdot, where you >> might attract attention and mindshare? 
> > The IPython grant was on HN front page for a day. Oh. Misread. Sorry. DS > > Dag Sverre From travis at continuum.io Tue Dec 18 02:14:35 2012 From: travis at continuum.io (Travis Oliphant) Date: Tue, 18 Dec 2012 01:14:35 -0600 Subject: [Numpy-discussion] DARPA funding for Blaze and passing the NumPy torch In-Reply-To: References: <5EA78D17-1494-4057-A2C6-89BFC116C69F@continuum.io> Message-ID: <79B07804-4EF6-4C28-BA42-B022AEFAE609@continuum.io> Thanks Charles, for the reminder and for the well wishes. I added the suggested names as owners. I have no doubt you will all do very well for NumPy in the future as you have in the past. All the best, -Travis On Dec 17, 2012, at 12:50 PM, Charles R Harris wrote: > Hi Travis, > > On Sun, Dec 16, 2012 at 11:07 PM, Travis Oliphant wrote: > Hello all, > > There is a lot happening in my life right now and I am spread quite thin among the various projects that I take an interest in. In particular, I am thrilled to publicly announce on this list that Continuum Analytics has received DARPA funding (to the tune of at least $3 million) for Blaze, Numba, and Bokeh which we are writing to take NumPy, SciPy, and visualization into the domain of very large data sets. This is part of the XDATA program, and I will be taking an active role in it. You can read more about Blaze here: http://blaze.pydata.org. You can read more about XDATA here: http://www.darpa.mil/Our_Work/I2O/Programs/XDATA.aspx > > I personally think Blaze is the future of array-oriented computing in Python. I will be putting efforts and resources next year behind making that case. How it interacts with future incarnations of NumPy, Pandas, or other projects is an interesting and open question. I have no doubt the future will be a rich ecosystem of interoperating array-oriented data-structures. I invite anyone interested in Blaze to participate in the discussions and development at https://groups.google.com/a/continuum.io/forum/#!forum/blaze-dev or watch the project on our public GitHub repo: https://github.com/ContinuumIO/blaze. Blaze is being incubated under the ContinuumIO GitHub project for now, but eventually I hope it will receive its own GitHub project page later next year. Development of Blaze is early but we are moving rapidly with it (and have deliverable deadlines --- thus while we will welcome input and pull requests we won't have a ton of time to respond to simple queries until > at least May or June). There is more that we are working on behind the scenes with respect to Blaze that will be coming out next year as well but isn't quite ready to show yet. > > As I look at the coming months and years, my time for direct involvement in NumPy development is therefore only going to get smaller. As a result it is not appropriate that I remain as "head steward" of the NumPy project (a term I prefer to BFD12 or anything else). I'm sure that it is apparent that while I've tried to help personally where I can this year on the NumPy project, my role has been more one of coordination, seeking funding, and providing expert advice on certain sections of code. I fundamentally agree with Fernando Perez that the responsibility of care-taking open source projects is one of stewardship --- something akin to public service. I have tried to emulate that belief this year --- even while not always succeeding. 
> > It is time for me to make official what is already becoming apparent to observers of this community, namely, that I am stepping down as someone who might be considered "head steward" for the NumPy project and officially leaving the development of the project in the hands of others in the community. I don't think the project actually needs a new "head steward" --- especially from a development perspective. Instead I see a lot of strong developers offering key opinions for the project as well as a great set of new developers offering pull requests. > > My strong suggestion is that development discussions of the project continue on this list with consensus among the active participants being the goal for development. I don't think 100% consensus is a rigid requirement --- but certainly a super-majority should be the goal, and serious changes should not be made with out a clear consensus. I would pay special attention to under-represented people (users with intense usage of NumPy but small voices on this list). There are many of them. If you push me for specifics then at this point in NumPy's history, I would say that if Chuck, Nathaniel, and Ralf agree on a course of action, it will likely be a good thing for the project. I suspect that even if only 2 of the 3 agree at one time it might still be a good thing (but I would expect more detail and discussion). There are others whose opinion should be sought as well: Ondrej Certik, Perry Greenfield, Robert Kern, David Cournapeau, Francesc Alted, and Mark Wiebe to > name a few. For some questions, I might even seek input from people like Konrad Hinsen and Paul Dubois --- if they have time to give it. I will still be willing to offer my view from time to time and if I am asked. > > Greg Wilson (of Software Carpentry fame) asked me recently what letter I would have written to myself 5 years ago. What would I tell myself to do given the knowledge I have now? I've thought about that for a bit, and I have some answers. I don't know if these will help anyone, but I offer them as hopefully instructive: > > 1) Do not promise to not break the ABI of NumPy --- and in fact emphasize that it will be broken at least once in the 1.X series. NumPy was designed to add new data-types --- but not without breaking the ABI. NumPy has needed more data-types and still needs even more. While it's not beautifully simple to add new data-types, it can be done. But, it is impossible to add them without breaking the ABI in some fashion. The desire to add new data-types *and* keep ABI compatibility has led to significant pain. I think the ABI non-breakage goal has been amplified by the poor state of package management in Python. The fact that it's painful for someone to update their downstream packages when an upstream ABI breaks (on Windows and Mac in particular) has put a lot of unfortunate pressure on this community. Pressure that was not envisioned or understood when I was writing NumPy. > > (As an aside: This is one reason Continuum has invested resources in building the conda tool and a completely free set of binary packages called Anaconda CE which is becoming more and more usable thanks to the efforts of Bryan Van de Ven and Ilan Schnell and our testing team at Continuum. The conda tool: http://docs.continuum.io/conda/index.html is open source and BSD licensed and the next release will provide the ability to build packages, build indexes on package repositories and interface with pip. Expect a blog-post in the near future about how cool conda is!). 
> > 2) Don't create array-scalars. Instead, make the data-type object a meta-type object whose instances are the items returned from NumPy arrays. There is no need for a separate array-scalar object and in fact it's confusing to the type-system. I understand that now. I did not understand that 5 years ago. > > 3) Special-case small arrays to avoid the memory indirection and look at PDL so that generalized ufuncs are supported from the beginning. > > 4) Define missing-value data-types and labels on the dimensions and arrays > > 5) Define a standard "dictionary of NumPy arrays" interface as the basic "structure of arrays" concept to go with the "array of structures" that structured arrays provide. > > 6) Start work on SQL interface to NumPy arrays *now* > > Additional comments I would make to someone today: > > 1) Most of NumPy should be written in Python with Numba used as the compiler (particularly as soon as Numba gets the ability to create Python extension modules which is in the next release). > 2) There are still many, many optimizations that can be made in NumPy run-time (especially in the face of modern hardware). > > I will continue to be available to answer questions and I may chime in here and there on pull requests. However, most of my time for NumPy will be on administrative aspects of the project where I will continue to take an active interest. To help make sure that this happens in a transparent way, I would like to propose that "administrative" support of the project be left to the NumFOCUS board of which I am currently 1 of 9 members. The other board members are currently: Ralf Gommers, Anthony Scopatz, Andy Terrel, Prabhu Ramachandran, Fernando Perez, Emmanuelle Gouillart, Jarrod Millman, and Perry Greenfield. While NumFOCUS basically seeks to promote and fund the entire scientific Python stack, I think it can also play a role in helping to administer some of the core projects which the board members themselves have a personal interest in. > > By administrative support, I mean decisions like "what should be done with any NumPy IP or web-domains" or "what kind of commercially-related ads or otherwise should go on the NumPy home page", or "what should be done with the NumPy github account", etc. --- basically anything that requires an executive decision that is not directly development related. I don't expect there to be many of these decisions. But, when they show up, I would like them to be made in as transparent and public of a way as possible. In practice, the way I see this working is that there are members of the NumPy community who are (like me) particularly interested in admin-related questions and serve on a NumPy team in the NumFOCUS organization. I just know I'll be attending NumFOCUS board meetings, and I would like to help move administrative decisions forward with NumPy as part of the time I spend thinking about NumFOCUS. > > If people on this list would like to play an active role in those admin discussions, then I would heartily welcome them into NumFOCUS membership where they would work with interested members of the NumFOCUS board (like me and Ralf) to help direct that organization. I would really love to have someone from this list volunteer to serve on the NumPy team as part of the NumFOCUS project. I am certainly going to be interested in the opinions of people who are active participants on this list and on GitHub pages for NumPy on anything admin related to NumPy, and I expect Ralf would also be very interested in those views. 
> > One admin discussion that I will bring up in another email (as this one is already too long) is about making 2 or 3 lists for NumPy such as numpy-admin at numpy.org, numpy-dev at numpy.org, and numpy-users at numpy-org. > > Just because I'll be spending more time on Blaze, Numba, Bokeh, and the PyData ecosystem does not mean that I won't be around for NumPy. I will continue to promote NumPy. My involvement with Continuum connects me to NumPy as Continuum continues to offer commercial support contracts for NumPy (and SciPy and other open source projects). Continuum will also continue to maintain its Github NumPy project which will contain pull requests from our company that we are working to get into the mainline branch. Continuum will also continue to provide resources for release-management of NumPy (we have been funding Ondrej in this role for the past 6 months --- though I would like to see this happen through NumFOCUS in the future even if Continuum provides much of the money). We also offer optimized versions of NumPy in our commercial Anaconda distribution (Anaconda CE is free and open source). > > Also, I will still be available for questions and help (I'm not disappearing --- just making it clear that I'm stepping back into an occasional NumPy developer role). It has been extremely gratifying to see the number of pull-requests, GitHub-conversations, and code contributions increase this year. Even though the 1.7 release has taken a long time to stabilize, there have been a lot of people participating in the discussion and in helping to track down the problems, figure out what to do, and fix them. It even makes it possible for people to think about 1.7 as a long-term release. > > I will continue to hope that the spirit of openness, tolerance, respect, and gratitude continue to permeate this mailing list, and that we continue to seek to resolve any differences with trust and mutual respect. I know I have offended people in the past with quick remarks and actions made sometimes in haste without fully realizing how they might be taken. But, I also know that like many of you I have always done the very best I could for moving Python for scientific computing forward in the best way I know how. > > Thank you for the great memories. If you will forgive a little sentiment: My daughter who is in college now was 3 years old when I began working with this community and went down a road that would lead to my involvement with SciPy and NumPy. I have marked the building of my family and the passage of time with where the Python for Scientific Computing Community was at. Like many of you, I have given a great deal of attention and time to building this community. That sacrifice and time has led me to love what we have created. I know that I leave this segment of the community with the tools in better hands than mine. I am hopeful that NumPy will continue to be a useful array library for the Python community for many years to come even as we all continue to build new tools for the future. > > > Congratulations on the DARPA grant and best wishes for the success of your enterprises. We will all do our best to keep Numpy moving forward and hope that Blaze will contribute to that. > > One administrative detail you might want to deal with at the point is ownership of the Numpy github repositories. I note that the Scipy repositories have a number of owners, but you are currently the sole owner of the Numpy site. May I suggest adding a few more owners? 
I'd recommend Ralf, Pauli, Nathaniel, and myself as additions. > > Chuck > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From jniehof at lanl.gov Tue Dec 18 10:12:23 2012 From: jniehof at lanl.gov (Jonathan T. Niehof) Date: Tue, 18 Dec 2012 08:12:23 -0700 Subject: [Numpy-discussion] www.numpy.org home page In-Reply-To: References: <6E409FEC-FD9F-41F6-BC02-F227EE42F077@gmail.com> Message-ID: <50D087D7.7050406@lanl.gov> On 12/17/2012 07:19 PM, Paul Ivanov wrote: > pg 86 - "extremums" should read "extrema" In case anyone was wondering, it *is* possible to snort a bagel up into one's nose. It's also painful. (Although not as painful as that pluralization.) Thanks for the notes. -- Jonathan Niehof ISR-3 Space Data Systems Los Alamos National Laboratory MS-D466 Los Alamos, NM 87545 Phone: 505-667-9595 email: jniehof at lanl.gov Correspondence / Technical data or Software Publicly Available From charlesr.harris at gmail.com Tue Dec 18 10:41:38 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 18 Dec 2012 08:41:38 -0700 Subject: [Numpy-discussion] DARPA funding for Blaze and passing the NumPy torch In-Reply-To: <79B07804-4EF6-4C28-BA42-B022AEFAE609@continuum.io> References: <5EA78D17-1494-4057-A2C6-89BFC116C69F@continuum.io> <79B07804-4EF6-4C28-BA42-B022AEFAE609@continuum.io> Message-ID: On Tue, Dec 18, 2012 at 12:14 AM, Travis Oliphant wrote: > Thanks Charles, > > for the reminder and for the well wishes. > > I added the suggested names as owners. I have no doubt you will all do > very well for NumPy in the future as you have in the past. > > Thanks Travis. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From heng at cantab.net Wed Dec 19 03:40:39 2012 From: heng at cantab.net (Henry Gomersall) Date: Wed, 19 Dec 2012 08:40:39 +0000 Subject: [Numpy-discussion] Byte aligned arrays Message-ID: <1355906439.3456.9.camel@farnsworth> I've written a few simple cython routines for assisting in creating byte-aligned numpy arrays. The point being for the arrays to work with SSE/AVX code. https://github.com/hgomersall/pyFFTW/blob/master/pyfftw/utils.pxi The change recently has been to add a check on the CPU as to what flags are supported (though it's not complete, I should make the default return 0 or something). It occurred to me that this is something that (a) other people almost certainly need and are solving themselves and (b) I lack the necessary platforms to test all the possible CPU/OS combinations to make sure something sensible happens in all cases. Is this something that can be rolled into Numpy (the feature, not my particular implementation or interface - though I'd be happy for it to be so)? Regarding (b), I've written a test case that works for Linux on x86-64 with GCC (my platform!). I can test it on 32-bit windows, but that's it. Is ARM supported by Numpy? Neon would be great to include as well. What other platforms might need this? 
Cheers, Henry From njs at pobox.com Wed Dec 19 09:43:58 2012 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 19 Dec 2012 14:43:58 +0000 Subject: [Numpy-discussion] Byte aligned arrays In-Reply-To: <1355906439.3456.9.camel@farnsworth> References: <1355906439.3456.9.camel@farnsworth> Message-ID: On Wed, Dec 19, 2012 at 8:40 AM, Henry Gomersall wrote: > I've written a few simple cython routines for assisting in creating > byte-aligned numpy arrays. The point being for the arrays to work with > SSE/AVX code. > > https://github.com/hgomersall/pyFFTW/blob/master/pyfftw/utils.pxi > > The change recently has been to add a check on the CPU as to what flags > are supported (though it's not complete, I should make the default > return 0 or something). > > It occurred to me that this is something that (a) other people almost > certainly need and are solving themselves and (b) I lack the necessary > platforms to test all the possible CPU/OS combinations to make sure > something sensible happens in all cases. > > Is this something that can be rolled into Numpy (the feature, not my > particular implementation or interface - though I'd be happy for it to > be so)? > > Regarding (b), I've written a test case that works for Linux on x86-64 > with GCC (my platform!). I can test it on 32-bit windows, but that's it. > Is ARM supported by Numpy? Neon would be great to include as well. What > other platforms might need this? Your code looks simple and portable to me (at least the alignment part). I can see a good argument for adding this sort of functionality directly to numpy with a nice interface, though, since these kind of requirements seem quite common these days. Maybe an interface like a = np.asarray([1, 2, 3], base_alignment=32) # should this be in bits or in bytes? b = np.empty((10, 10), order="C", base_alignment=32) # etc. assert a.base_alignment == 32 which underneath tries to use posix_memalign/_aligned_malloc when possible, or falls back on the overallocation trick otherwise? -n From charlesr.harris at gmail.com Wed Dec 19 09:57:51 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 19 Dec 2012 07:57:51 -0700 Subject: [Numpy-discussion] Byte aligned arrays In-Reply-To: References: <1355906439.3456.9.camel@farnsworth> Message-ID: On Wed, Dec 19, 2012 at 7:43 AM, Nathaniel Smith wrote: > On Wed, Dec 19, 2012 at 8:40 AM, Henry Gomersall wrote: > > I've written a few simple cython routines for assisting in creating > > byte-aligned numpy arrays. The point being for the arrays to work with > > SSE/AVX code. > > > > https://github.com/hgomersall/pyFFTW/blob/master/pyfftw/utils.pxi > > > > The change recently has been to add a check on the CPU as to what flags > > are supported (though it's not complete, I should make the default > > return 0 or something). > > > > It occurred to me that this is something that (a) other people almost > > certainly need and are solving themselves and (b) I lack the necessary > > platforms to test all the possible CPU/OS combinations to make sure > > something sensible happens in all cases. > > > > Is this something that can be rolled into Numpy (the feature, not my > > particular implementation or interface - though I'd be happy for it to > > be so)? > > > > Regarding (b), I've written a test case that works for Linux on x86-64 > > with GCC (my platform!). I can test it on 32-bit windows, but that's it. > > Is ARM supported by Numpy? Neon would be great to include as well. What > > other platforms might need this? 
> > Your code looks simple and portable to me (at least the alignment > part). I can see a good argument for adding this sort of functionality > directly to numpy with a nice interface, though, since these kind of > requirements seem quite common these days. Maybe an interface like > a = np.asarray([1, 2, 3], base_alignment=32) # should this be in > bits or in bytes? > b = np.empty((10, 10), order="C", base_alignment=32) > # etc. > assert a.base_alignment == 32 > which underneath tries to use posix_memalign/_aligned_malloc when > possible, or falls back on the overallocation trick otherwise? > > There is a thread about this from several years back. IIRC, David Cournapeau was interested in the same problem. At first glance, the alignment keyword looks interesting. One possible concern is keeping alignment for rows, views, etc., which is probably not possible in any sensible way. But people who need this most likely know what they are doing and just need memory allocated on the proper boundary. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Wed Dec 19 10:10:24 2012 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 19 Dec 2012 15:10:24 +0000 Subject: [Numpy-discussion] Byte aligned arrays In-Reply-To: References: <1355906439.3456.9.camel@farnsworth> Message-ID: On Wed, Dec 19, 2012 at 2:57 PM, Charles R Harris wrote: > > > On Wed, Dec 19, 2012 at 7:43 AM, Nathaniel Smith wrote: >> >> On Wed, Dec 19, 2012 at 8:40 AM, Henry Gomersall wrote: >> > I've written a few simple cython routines for assisting in creating >> > byte-aligned numpy arrays. The point being for the arrays to work with >> > SSE/AVX code. >> > >> > https://github.com/hgomersall/pyFFTW/blob/master/pyfftw/utils.pxi >> > >> > The change recently has been to add a check on the CPU as to what flags >> > are supported (though it's not complete, I should make the default >> > return 0 or something). >> > >> > It occurred to me that this is something that (a) other people almost >> > certainly need and are solving themselves and (b) I lack the necessary >> > platforms to test all the possible CPU/OS combinations to make sure >> > something sensible happens in all cases. >> > >> > Is this something that can be rolled into Numpy (the feature, not my >> > particular implementation or interface - though I'd be happy for it to >> > be so)? >> > >> > Regarding (b), I've written a test case that works for Linux on x86-64 >> > with GCC (my platform!). I can test it on 32-bit windows, but that's it. >> > Is ARM supported by Numpy? Neon would be great to include as well. What >> > other platforms might need this? >> >> Your code looks simple and portable to me (at least the alignment >> part). I can see a good argument for adding this sort of functionality >> directly to numpy with a nice interface, though, since these kind of >> requirements seem quite common these days. Maybe an interface like >> a = np.asarray([1, 2, 3], base_alignment=32) # should this be in >> bits or in bytes? >> b = np.empty((10, 10), order="C", base_alignment=32) >> # etc. >> assert a.base_alignment == 32 >> which underneath tries to use posix_memalign/_aligned_malloc when >> possible, or falls back on the overallocation trick otherwise? >> > > There is a thread about this from several years back. IIRC, David Cournapeau > was interested in the same problem. At first glance, the alignment keyword > looks interesting. 
One possible concern is keeping alignment for rows, > views, etc., which is probably not possible in any sensible way. But people > who need this most likely know what they are doing and just need memory > allocated on the proper boundary. Right, my intuition is that it's like order="C" -- if you make a new array by, say, indexing, then it may or may not have order="C", no guarantees. So when you care, you call asarray(a, order="C") and that either makes a copy or not as needed. Similarly for base alignment. I guess to push this analogy even further we could define a set of array flags, ALIGNED_8, ALIGNED_16, etc. (In practice only power-of-2 alignment matters, I think, so the number of flags would remain manageable?) That would make the C API easier to deal with too, no need to add PyArray_FromAnyAligned. -n From charlesr.harris at gmail.com Wed Dec 19 10:27:54 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 19 Dec 2012 08:27:54 -0700 Subject: [Numpy-discussion] Byte aligned arrays In-Reply-To: References: <1355906439.3456.9.camel@farnsworth> Message-ID: On Wed, Dec 19, 2012 at 8:10 AM, Nathaniel Smith wrote: > On Wed, Dec 19, 2012 at 2:57 PM, Charles R Harris > wrote: > > > > > > On Wed, Dec 19, 2012 at 7:43 AM, Nathaniel Smith wrote: > >> > >> On Wed, Dec 19, 2012 at 8:40 AM, Henry Gomersall > wrote: > >> > I've written a few simple cython routines for assisting in creating > >> > byte-aligned numpy arrays. The point being for the arrays to work with > >> > SSE/AVX code. > >> > > >> > https://github.com/hgomersall/pyFFTW/blob/master/pyfftw/utils.pxi > >> > > >> > The change recently has been to add a check on the CPU as to what > flags > >> > are supported (though it's not complete, I should make the default > >> > return 0 or something). > >> > > >> > It occurred to me that this is something that (a) other people almost > >> > certainly need and are solving themselves and (b) I lack the necessary > >> > platforms to test all the possible CPU/OS combinations to make sure > >> > something sensible happens in all cases. > >> > > >> > Is this something that can be rolled into Numpy (the feature, not my > >> > particular implementation or interface - though I'd be happy for it to > >> > be so)? > >> > > >> > Regarding (b), I've written a test case that works for Linux on x86-64 > >> > with GCC (my platform!). I can test it on 32-bit windows, but that's > it. > >> > Is ARM supported by Numpy? Neon would be great to include as well. > What > >> > other platforms might need this? > >> > >> Your code looks simple and portable to me (at least the alignment > >> part). I can see a good argument for adding this sort of functionality > >> directly to numpy with a nice interface, though, since these kind of > >> requirements seem quite common these days. Maybe an interface like > >> a = np.asarray([1, 2, 3], base_alignment=32) # should this be in > >> bits or in bytes? > >> b = np.empty((10, 10), order="C", base_alignment=32) > >> # etc. > >> assert a.base_alignment == 32 > >> which underneath tries to use posix_memalign/_aligned_malloc when > >> possible, or falls back on the overallocation trick otherwise? > >> > > > > There is a thread about this from several years back. IIRC, David > Cournapeau > > was interested in the same problem. At first glance, the alignment > keyword > > looks interesting. One possible concern is keeping alignment for rows, > > views, etc., which is probably not possible in any sensible way. 
But > people > > who need this most likely know what they are doing and just need memory > > allocated on the proper boundary. > > Right, my intuition is that it's like order="C" -- if you make a new > array by, say, indexing, then it may or may not have order="C", no > guarantees. So when you care, you call asarray(a, order="C") and that > either makes a copy or not as needed. Similarly for base alignment. > > I guess to push this analogy even further we could define a set of > array flags, ALIGNED_8, ALIGNED_16, etc. (In practice only power-of-2 > alignment matters, I think, so the number of flags would remain > manageable?) That would make the C API easier to deal with too, no > need to add PyArray_FromAnyAligned. > > Another possibility is an aligned datatype, basically an aligned structured array with floats/ints in chunks of the appropriate size. IIRC, gcc support for sse is something like that. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Wed Dec 19 10:57:47 2012 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 19 Dec 2012 15:57:47 +0000 Subject: [Numpy-discussion] Byte aligned arrays In-Reply-To: References: <1355906439.3456.9.camel@farnsworth> Message-ID: On Wed, Dec 19, 2012 at 3:27 PM, Charles R Harris wrote: > > > On Wed, Dec 19, 2012 at 8:10 AM, Nathaniel Smith wrote: >> Right, my intuition is that it's like order="C" -- if you make a new >> array by, say, indexing, then it may or may not have order="C", no >> guarantees. So when you care, you call asarray(a, order="C") and that >> either makes a copy or not as needed. Similarly for base alignment. >> >> I guess to push this analogy even further we could define a set of >> array flags, ALIGNED_8, ALIGNED_16, etc. (In practice only power-of-2 >> alignment matters, I think, so the number of flags would remain >> manageable?) That would make the C API easier to deal with too, no >> need to add PyArray_FromAnyAligned. >> > > Another possibility is an aligned datatype, basically an aligned structured > array with floats/ints in chunks of the appropriate size. IIRC, gcc support > for sse is something like that. True; right now it looks like structured dtypes have no special alignment: In [13]: np.dtype("f4,f4").alignment Out[13]: 1 So for this approach we'd need a way to create structured dtypes with .alignment == .itemsize, and we'd need some way to request dtype-aligned memory from array allocation functions. I guess existing NPY_ALIGNED is a good enough public interface for the latter, but AFAICT the current implementation is to just assume that whatever malloc() returns will always be ALIGNED. This is true for all base C types, but not for more exotic record types with larger alignment requirements -- that would require some fancier allocation scheme. Not sure which interface is more useful to users. On the one hand, using funny dtypes makes regular non-SIMD access more cumbersome, and it forces your array size to be a multiple of the SIMD word size, which might be inconvenient if your code is smart enough to handle arbitrary-sized arrays with partial SIMD acceleration (i.e., using SIMD for most of the array, and then a slow path to handle any partial word at the end). 
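To make the "funny dtype" idea a bit more concrete, here is a rough sketch of the kind of dtype I mean -- purely illustrative, since the .alignment part is exactly what numpy doesn't give us today:

import numpy as np

# One hypothetical "SIMD word": four packed float32s seen as a single record.
simd4f = np.dtype([('v', np.float32, (4,))])

simd4f.itemsize    # 16, i.e. one 128-bit SSE word
simd4f.alignment   # currently 1; the idea would need this to be 16

a = np.zeros(8, dtype=simd4f)    # 8 SIMD words == 32 floats
a['v'][3, 0] = 1.0               # ordinary element access has to go via the field
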
OTOH, if your code *is* that smart, you should probably just make it smart enough to handle a partial word at the beginning as well and then you won't need any special alignment in the first place, and representing each SIMD word as a single numpy scalar is an intuitively appealing model of how SIMD works. OTOOH, just adding a single argument np.array() is a much simpler to explain than some elaborate scheme involving the creation of special custom dtypes. -n From heng at cantab.net Wed Dec 19 11:47:25 2012 From: heng at cantab.net (Henry Gomersall) Date: Wed, 19 Dec 2012 16:47:25 +0000 Subject: [Numpy-discussion] Byte aligned arrays In-Reply-To: References: <1355906439.3456.9.camel@farnsworth> Message-ID: <1355935645.3456.22.camel@farnsworth> On Wed, 2012-12-19 at 15:57 +0000, Nathaniel Smith wrote: > Not sure which interface is more useful to users. On the one hand, > using funny dtypes makes regular non-SIMD access more cumbersome, and > it forces your array size to be a multiple of the SIMD word size, > which might be inconvenient if your code is smart enough to handle > arbitrary-sized arrays with partial SIMD acceleration (i.e., using > SIMD for most of the array, and then a slow path to handle any partial > word at the end). OTOH, if your code *is* that smart, you should > probably just make it smart enough to handle a partial word at the > beginning as well and then you won't need any special alignment in the > first place, and representing each SIMD word as a single numpy scalar > is an intuitively appealing model of how SIMD works. OTOOH, just > adding a single argument np.array() is a much simpler to explain than > some elaborate scheme involving the creation of special custom dtypes. If it helps, my use-case is in wrapping the FFTW library. This _is_ smart enough to deal with unaligned arrays, but it just results in a performance penalty. In the case of an FFT, there are clearly going to be issues with the powers of two indices in the array not lying on a suitable n-byte boundary (which would be the case with a misaligned array), but I imagine it's not unique. The other point is that it's easy to create a suitable power of two array that should always bypass any special case unaligned code (e.g. with floats, any multiple of 4 array length will fill every 16-byte word). Finally, I think there is significant value in auto-aligning the array based on an appropriate inspection of the cpu capabilities (or alternatively, a function that reports back the appropriate SIMD alignment). Again, this makes it easier to wrap libraries that may function with any alignment, but benefit from optimum alignment. Cheers, Henry From francesc at continuum.io Wed Dec 19 13:03:56 2012 From: francesc at continuum.io (Francesc Alted) Date: Wed, 19 Dec 2012 19:03:56 +0100 Subject: [Numpy-discussion] Byte aligned arrays In-Reply-To: <1355935645.3456.22.camel@farnsworth> References: <1355906439.3456.9.camel@farnsworth> <1355935645.3456.22.camel@farnsworth> Message-ID: <50D2018C.4080106@continuum.io> On 12/19/12 5:47 PM, Henry Gomersall wrote: > On Wed, 2012-12-19 at 15:57 +0000, Nathaniel Smith wrote: >> Not sure which interface is more useful to users. 
On the one hand, >> using funny dtypes makes regular non-SIMD access more cumbersome, and >> it forces your array size to be a multiple of the SIMD word size, >> which might be inconvenient if your code is smart enough to handle >> arbitrary-sized arrays with partial SIMD acceleration (i.e., using >> SIMD for most of the array, and then a slow path to handle any partial >> word at the end). OTOH, if your code *is* that smart, you should >> probably just make it smart enough to handle a partial word at the >> beginning as well and then you won't need any special alignment in the >> first place, and representing each SIMD word as a single numpy scalar >> is an intuitively appealing model of how SIMD works. OTOOH, just >> adding a single argument np.array() is a much simpler to explain than >> some elaborate scheme involving the creation of special custom dtypes. > If it helps, my use-case is in wrapping the FFTW library. This _is_ > smart enough to deal with unaligned arrays, but it just results in a > performance penalty. In the case of an FFT, there are clearly going to > be issues with the powers of two indices in the array not lying on a > suitable n-byte boundary (which would be the case with a misaligned > array), but I imagine it's not unique. > > The other point is that it's easy to create a suitable power of two > array that should always bypass any special case unaligned code (e.g. > with floats, any multiple of 4 array length will fill every 16-byte > word). > > Finally, I think there is significant value in auto-aligning the array > based on an appropriate inspection of the cpu capabilities (or > alternatively, a function that reports back the appropriate SIMD > alignment). Again, this makes it easier to wrap libraries that may > function with any alignment, but benefit from optimum alignment. Hmm, NumPy seems to return data blocks that are aligned to 16 bytes on systems (Linux and Mac OSX): In []: np.empty(1).data Out[]: In []: np.empty(1).data Out[]: In []: np.empty(1).data Out[]: In []: np.empty(1).data Out[]: [Check that the last digit in the addresses above is always 0] The only scenario that I see that this would create unaligned arrays is for machines having AVX. But provided that the Intel architecture is making great strides in fetching unaligned data, I'd be surprised that the difference in performance would be even noticeable. Can you tell us which difference in performance are you seeing for an AVX-aligned array and other that is not AVX-aligned? Just curious. -- Francesc Alted From heng at cantab.net Wed Dec 19 13:25:34 2012 From: heng at cantab.net (Henry Gomersall) Date: Wed, 19 Dec 2012 18:25:34 +0000 Subject: [Numpy-discussion] Byte aligned arrays In-Reply-To: <50D2018C.4080106@continuum.io> References: <1355906439.3456.9.camel@farnsworth> <1355935645.3456.22.camel@farnsworth> <50D2018C.4080106@continuum.io> Message-ID: <1355941534.10732.5.camel@farnsworth> On Wed, 2012-12-19 at 19:03 +0100, Francesc Alted wrote: > > Finally, I think there is significant value in auto-aligning the > array > > based on an appropriate inspection of the cpu capabilities (or > > alternatively, a function that reports back the appropriate SIMD > > alignment). Again, this makes it easier to wrap libraries that may > > function with any alignment, but benefit from optimum alignment. > > Hmm, NumPy seems to return data blocks that are aligned to 16 bytes > on > systems (Linux and Mac OSX): That is not true at least under Windows 32-bit. 
I think also it's not true for Linux 32-bit from my vague recollections of testing in a virtual machine. (disclaimer: both those statements _may_ be out of date). But yes, under Linux 64-bit I always get my arrays aligned to 16 bytes. > > The only scenario that I see that this would create unaligned arrays > is > for machines having AVX. But provided that the Intel architecture is > making great strides in fetching unaligned data, I'd be surprised > that > the difference in performance would be even noticeable. > > Can you tell us which difference in performance are you seeing for an > AVX-aligned array and other that is not AVX-aligned? Just curious. I don't know; I don't own a machine with AVX ;) It might be that the difference is negligible, though I do think it would be _nice_ to have the arrays properly aligned if it's not too difficult. Cheers, Henry From njs at pobox.com Wed Dec 19 13:48:29 2012 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 19 Dec 2012 18:48:29 +0000 Subject: [Numpy-discussion] Byte aligned arrays In-Reply-To: <1355941534.10732.5.camel@farnsworth> References: <1355906439.3456.9.camel@farnsworth> <1355935645.3456.22.camel@farnsworth> <50D2018C.4080106@continuum.io> <1355941534.10732.5.camel@farnsworth> Message-ID: On Wed, Dec 19, 2012 at 6:25 PM, Henry Gomersall wrote: > On Wed, 2012-12-19 at 19:03 +0100, Francesc Alted wrote: > >> > Finally, I think there is significant value in auto-aligning the >> array >> > based on an appropriate inspection of the cpu capabilities (or >> > alternatively, a function that reports back the appropriate SIMD >> > alignment). Again, this makes it easier to wrap libraries that may >> > function with any alignment, but benefit from optimum alignment. >> >> Hmm, NumPy seems to return data blocks that are aligned to 16 bytes >> on >> systems (Linux and Mac OSX): > > > That is not true at least under Windows 32-bit. I think also it's not > true for Linux 32-bit from my vague recollections of testing in a > virtual machine. (disclaimer: both those statements _may_ be out of > date). > > But yes, under Linux 64-bit I always get my arrays aligned to 16 bytes. Currently numpy just uses whatever the system malloc() returns, so the alignment guarantees are entirely determined by your libc. -n From cournape at gmail.com Wed Dec 19 18:18:31 2012 From: cournape at gmail.com (David Cournapeau) Date: Wed, 19 Dec 2012 23:18:31 +0000 Subject: [Numpy-discussion] Byte aligned arrays In-Reply-To: <50D2018C.4080106@continuum.io> References: <1355906439.3456.9.camel@farnsworth> <1355935645.3456.22.camel@farnsworth> <50D2018C.4080106@continuum.io> Message-ID: On Wed, Dec 19, 2012 at 6:03 PM, Francesc Alted wrote: > On 12/19/12 5:47 PM, Henry Gomersall wrote: >> On Wed, 2012-12-19 at 15:57 +0000, Nathaniel Smith wrote: >>> Not sure which interface is more useful to users. On the one hand, >>> using funny dtypes makes regular non-SIMD access more cumbersome, and >>> it forces your array size to be a multiple of the SIMD word size, >>> which might be inconvenient if your code is smart enough to handle >>> arbitrary-sized arrays with partial SIMD acceleration (i.e., using >>> SIMD for most of the array, and then a slow path to handle any partial >>> word at the end). 
OTOH, if your code *is* that smart, you should >>> probably just make it smart enough to handle a partial word at the >>> beginning as well and then you won't need any special alignment in the >>> first place, and representing each SIMD word as a single numpy scalar >>> is an intuitively appealing model of how SIMD works. OTOOH, just >>> adding a single argument np.array() is a much simpler to explain than >>> some elaborate scheme involving the creation of special custom dtypes. >> If it helps, my use-case is in wrapping the FFTW library. This _is_ >> smart enough to deal with unaligned arrays, but it just results in a >> performance penalty. In the case of an FFT, there are clearly going to >> be issues with the powers of two indices in the array not lying on a >> suitable n-byte boundary (which would be the case with a misaligned >> array), but I imagine it's not unique. >> >> The other point is that it's easy to create a suitable power of two >> array that should always bypass any special case unaligned code (e.g. >> with floats, any multiple of 4 array length will fill every 16-byte >> word). >> >> Finally, I think there is significant value in auto-aligning the array >> based on an appropriate inspection of the cpu capabilities (or >> alternatively, a function that reports back the appropriate SIMD >> alignment). Again, this makes it easier to wrap libraries that may >> function with any alignment, but benefit from optimum alignment. > > Hmm, NumPy seems to return data blocks that are aligned to 16 bytes on > systems (Linux and Mac OSX): Only by accident, at least on linux. The pointers returned by the gnu libc malloc are at least 8 bytes aligned, but they may not be 16 bytes when you're above the threshold where mmap is used for malloc. The difference between aligned and unaligned ram <-> sse registers (e.g. movaps, movups) used to be significant. Don't know if that's still the case for recent CPUs. David From heng at cantab.net Thu Dec 20 03:12:28 2012 From: heng at cantab.net (Henry Gomersall) Date: Thu, 20 Dec 2012 08:12:28 +0000 Subject: [Numpy-discussion] Byte aligned arrays In-Reply-To: References: <1355906439.3456.9.camel@farnsworth> Message-ID: <1355991148.10732.14.camel@farnsworth> On Wed, 2012-12-19 at 15:10 +0000, Nathaniel Smith wrote: > >> > Is this something that can be rolled into Numpy (the feature, not > my > >> > particular implementation or interface - though I'd be happy for > it to > >> > be so)? > >> > > >> > Regarding (b), I've written a test case that works for Linux on > x86-64 > >> > with GCC (my platform!). I can test it on 32-bit windows, but > that's it. > >> > Is ARM supported by Numpy? Neon would be great to include as > well. What > >> > other platforms might need this? > >> > >> Your code looks simple and portable to me (at least the alignment > >> part). I can see a good argument for adding this sort of > functionality > >> directly to numpy with a nice interface, though, since these kind > of > >> requirements seem quite common these days. Maybe an interface like > >> a = np.asarray([1, 2, 3], base_alignment=32) # should this be in > >> bits or in bytes? > >> b = np.empty((10, 10), order="C", base_alignment=32) > >> # etc. > >> assert a.base_alignment == 32 > >> which underneath tries to use posix_memalign/_aligned_malloc when > >> possible, or falls back on the overallocation trick otherwise? > >> > > > > There is a thread about this from several years back. IIRC, David > Cournapeau > > was interested in the same problem. 
At first glance, the alignment > keyword > > looks interesting. One possible concern is keeping alignment for > rows, > > views, etc., which is probably not possible in any sensible way. But > people > > who need this most likely know what they are doing and just need > memory > > allocated on the proper boundary. > > Right, my intuition is that it's like order="C" -- if you make a new > array by, say, indexing, then it may or may not have order="C", no > guarantees. So when you care, you call asarray(a, order="C") and that > either makes a copy or not as needed. Similarly for base alignment. > > I guess to push this analogy even further we could define a set of > array flags, ALIGNED_8, ALIGNED_16, etc. (In practice only power-of-2 > alignment matters, I think, so the number of flags would remain > manageable?) That would make the C API easier to deal with too, no > need to add PyArray_FromAnyAligned. So, if I were to implement this, I presume the proper way would be through modifications to multiarray? Would this basic description be a reasonable initial target? Henry From heng at cantab.net Thu Dec 20 03:16:51 2012 From: heng at cantab.net (Henry Gomersall) Date: Thu, 20 Dec 2012 08:16:51 +0000 Subject: [Numpy-discussion] Byte aligned arrays In-Reply-To: <1355991148.10732.14.camel@farnsworth> References: <1355906439.3456.9.camel@farnsworth> <1355991148.10732.14.camel@farnsworth> Message-ID: <1355991411.10732.16.camel@farnsworth> On Thu, 2012-12-20 at 08:12 +0000, Henry Gomersall wrote: > On Wed, 2012-12-19 at 15:10 +0000, Nathaniel Smith wrote: > > > >> > Is this something that can be rolled into Numpy (the feature, > not > > my > > >> > particular implementation or interface - though I'd be happy > for > > it to > > >> > be so)? > > >> > > > >> > Regarding (b), I've written a test case that works for Linux on > > x86-64 > > >> > with GCC (my platform!). I can test it on 32-bit windows, but > > that's it. > > >> > Is ARM supported by Numpy? Neon would be great to include as > > well. What > > >> > other platforms might need this? > > >> > > >> Your code looks simple and portable to me (at least the alignment > > >> part). I can see a good argument for adding this sort of > > functionality > > >> directly to numpy with a nice interface, though, since these kind > > of > > >> requirements seem quite common these days. Maybe an interface > like > > >> a = np.asarray([1, 2, 3], base_alignment=32) # should this be > in > > >> bits or in bytes? > > >> b = np.empty((10, 10), order="C", base_alignment=32) > > >> # etc. > > >> assert a.base_alignment == 32 > > >> which underneath tries to use posix_memalign/_aligned_malloc when > > >> possible, or falls back on the overallocation trick otherwise? > > >> > > > > > > There is a thread about this from several years back. IIRC, David > > Cournapeau > > > was interested in the same problem. At first glance, the alignment > > keyword > > > looks interesting. One possible concern is keeping alignment for > > rows, > > > views, etc., which is probably not possible in any sensible way. > But > > people > > > who need this most likely know what they are doing and just need > > memory > > > allocated on the proper boundary. > > > > Right, my intuition is that it's like order="C" -- if you make a new > > array by, say, indexing, then it may or may not have order="C", no > > guarantees. So when you care, you call asarray(a, order="C") and > that > > either makes a copy or not as needed. Similarly for base alignment. 
> > > > I guess to push this analogy even further we could define a set of > > array flags, ALIGNED_8, ALIGNED_16, etc. (In practice only > power-of-2 > > alignment matters, I think, so the number of flags would remain > > manageable?) That would make the C API easier to deal with too, no > > need to add PyArray_FromAnyAligned. > > So, if I were to implement this, I presume the proper way would be > through modifications to multiarray? > > Would this basic description be a reasonable initial target? There is this patch: http://projects.scipy.org/numpy/attachment/ticket/568/aligned_v1.patch Which, rather amusingly to me, was written by Steven Johnson of FFTW. It looks like a good starting point. Cheers, Henry From heng at cantab.net Thu Dec 20 03:53:31 2012 From: heng at cantab.net (Henry Gomersall) Date: Thu, 20 Dec 2012 08:53:31 +0000 Subject: [Numpy-discussion] Byte aligned arrays In-Reply-To: <50D2018C.4080106@continuum.io> References: <1355906439.3456.9.camel@farnsworth> <1355935645.3456.22.camel@farnsworth> <50D2018C.4080106@continuum.io> Message-ID: <1355993611.10732.18.camel@farnsworth> On Wed, 2012-12-19 at 19:03 +0100, Francesc Alted wrote: > The only scenario that I see that this would create unaligned arrays > is > for machines having AVX. But provided that the Intel architecture is > making great strides in fetching unaligned data, I'd be surprised > that > the difference in performance would be even noticeable. > > Can you tell us which difference in performance are you seeing for an > AVX-aligned array and other that is not AVX-aligned? Just curious. Further to this point, from an Intel article... http://software.intel.com/en-us/articles/practical-intel-avx-optimization-on-2nd-generation-intel-core-processors "Aligning data to vector length is always recommended. When using Intel SSE and Intel SSE2 instructions, loaded data should be aligned to 16 bytes. Similarly, to achieve best results use Intel AVX instructions on 32-byte vectors that are 32-byte aligned. The use of Intel AVX instructions on unaligned 32-byte vectors means that every second load will be across a cache-line split, since the cache line is 64 bytes. This doubles the cache line split rate compared to Intel SSE code that uses 16-byte vectors. A high cache-line split rate in memory-intensive code is extremely likely to cause performance degradation. For that reason, it is highly recommended to align the data to 32 bytes for use with Intel AVX." Though it would be nice to put together a little example of this! Henry From njs at pobox.com Thu Dec 20 07:15:15 2012 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 20 Dec 2012 12:15:15 +0000 Subject: [Numpy-discussion] Byte aligned arrays In-Reply-To: <1355991148.10732.14.camel@farnsworth> References: <1355906439.3456.9.camel@farnsworth> <1355991148.10732.14.camel@farnsworth> Message-ID: On Thu, Dec 20, 2012 at 8:12 AM, Henry Gomersall wrote: > On Wed, 2012-12-19 at 15:10 +0000, Nathaniel Smith wrote: > >> >> > Is this something that can be rolled into Numpy (the feature, not >> my >> >> > particular implementation or interface - though I'd be happy for >> it to >> >> > be so)? >> >> > >> >> > Regarding (b), I've written a test case that works for Linux on >> x86-64 >> >> > with GCC (my platform!). I can test it on 32-bit windows, but >> that's it. >> >> > Is ARM supported by Numpy? Neon would be great to include as >> well. What >> >> > other platforms might need this? >> >> >> >> Your code looks simple and portable to me (at least the alignment >> >> part). 
I can see a good argument for adding this sort of >> functionality >> >> directly to numpy with a nice interface, though, since these kind >> of >> >> requirements seem quite common these days. Maybe an interface like >> >> a = np.asarray([1, 2, 3], base_alignment=32) # should this be in >> >> bits or in bytes? >> >> b = np.empty((10, 10), order="C", base_alignment=32) >> >> # etc. >> >> assert a.base_alignment == 32 >> >> which underneath tries to use posix_memalign/_aligned_malloc when >> >> possible, or falls back on the overallocation trick otherwise? >> >> >> > >> > There is a thread about this from several years back. IIRC, David >> Cournapeau >> > was interested in the same problem. At first glance, the alignment >> keyword >> > looks interesting. One possible concern is keeping alignment for >> rows, >> > views, etc., which is probably not possible in any sensible way. But >> people >> > who need this most likely know what they are doing and just need >> memory >> > allocated on the proper boundary. >> >> Right, my intuition is that it's like order="C" -- if you make a new >> array by, say, indexing, then it may or may not have order="C", no >> guarantees. So when you care, you call asarray(a, order="C") and that >> either makes a copy or not as needed. Similarly for base alignment. >> >> I guess to push this analogy even further we could define a set of >> array flags, ALIGNED_8, ALIGNED_16, etc. (In practice only power-of-2 >> alignment matters, I think, so the number of flags would remain >> manageable?) That would make the C API easier to deal with too, no >> need to add PyArray_FromAnyAligned. > > So, if I were to implement this, I presume the proper way would be > through modifications to multiarray? Yes, numpy/core/src/multiarray/ is the code you'd be modifying. > Would this basic description be a reasonable initial target? My feeling is that we're at the stage where we need to get more information and feedback from people with experience in this area before we'll be able to merge anything into numpy proper (since that implies nailing down and committing to an API). One way to get that might be to go ahead and implement something to experiment with, and this "basic description" does seem like one of the plausible options, so... yes it seems like a reasonable initial target to work on, but I don't want to mislead you into thinking that I think it would necessarily be a reasonable initial target to ship in 1.8 or whatever. I feel like I don't have enough information to make a judgement there. There must be other people working with SIMD and numpy, right? If you're interested in this problem, another thing that might help would be to spend some effort finding those people and convincing them to get involved in discussing what they need. -n From francesc at continuum.io Thu Dec 20 09:23:01 2012 From: francesc at continuum.io (Francesc Alted) Date: Thu, 20 Dec 2012 15:23:01 +0100 Subject: [Numpy-discussion] Byte aligned arrays In-Reply-To: <1355993611.10732.18.camel@farnsworth> References: <1355906439.3456.9.camel@farnsworth> <1355935645.3456.22.camel@farnsworth> <50D2018C.4080106@continuum.io> <1355993611.10732.18.camel@farnsworth> Message-ID: <50D31F44.5050306@continuum.io> On 12/20/12 9:53 AM, Henry Gomersall wrote: > On Wed, 2012-12-19 at 19:03 +0100, Francesc Alted wrote: >> The only scenario that I see that this would create unaligned arrays >> is >> for machines having AVX. 
But provided that the Intel architecture is >> making great strides in fetching unaligned data, I'd be surprised >> that >> the difference in performance would be even noticeable. >> >> Can you tell us which difference in performance are you seeing for an >> AVX-aligned array and other that is not AVX-aligned? Just curious. > Further to this point, from an Intel article... > > http://software.intel.com/en-us/articles/practical-intel-avx-optimization-on-2nd-generation-intel-core-processors > > "Aligning data to vector length is always recommended. When using Intel > SSE and Intel SSE2 instructions, loaded data should be aligned to 16 > bytes. Similarly, to achieve best results use Intel AVX instructions on > 32-byte vectors that are 32-byte aligned. The use of Intel AVX > instructions on unaligned 32-byte vectors means that every second load > will be across a cache-line split, since the cache line is 64 bytes. > This doubles the cache line split rate compared to Intel SSE code that > uses 16-byte vectors. A high cache-line split rate in memory-intensive > code is extremely likely to cause performance degradation. For that > reason, it is highly recommended to align the data to 32 bytes for use > with Intel AVX." > > Though it would be nice to put together a little example of this! Indeed, an example is what I was looking for. So provided that I have access to an AVX capable machine (having 6 physical cores), and that MKL 10.3 has support for AVX, I have made some comparisons using the Anaconda Python distribution (it ships with most packages linked against MKL 10.3). Here it is a first example using a DGEMM operation. First using a NumPy that is not turbo-loaded with MKL: In [34]: a = np.linspace(0,1,1e7) In [35]: b = a.reshape(1000, 10000) In [36]: c = a.reshape(10000, 1000) In [37]: time d = np.dot(b,c) CPU times: user 7.56 s, sys: 0.03 s, total: 7.59 s Wall time: 7.63 s In [38]: time d = np.dot(c,b) CPU times: user 78.52 s, sys: 0.18 s, total: 78.70 s Wall time: 78.89 s This is getting around 2.6 GFlop/s. Now, with a MKL 10.3 NumPy and AVX-unaligned data: In [7]: p = ctypes.create_string_buffer(int(8e7)); hex(ctypes.addressof(p)) Out[7]: '0x7fcdef3b4010' # 16 bytes alignment In [8]: a = np.ndarray(1e7, "f8", p) In [9]: a[:] = np.linspace(0,1,1e7) In [10]: b = a.reshape(1000, 10000) In [11]: c = a.reshape(10000, 1000) In [37]: %timeit d = np.dot(b,c) 10 loops, best of 3: 164 ms per loop In [38]: %timeit d = np.dot(c,b) 1 loops, best of 3: 1.65 s per loop That is around 120 GFlop/s (i.e. almost 50x faster than without MKL/AVX). Now, using MKL 10.3 and AVX-aligned data: In [21]: p2 = ctypes.create_string_buffer(int(8e7+16)); hex(ctypes.addressof(p)) Out[21]: '0x7f8cb9598010' In [22]: a2 = np.ndarray(1e7+2, "f8", p2)[2:] # skip the first 16 bytes (now is 32-bytes aligned) In [23]: a2[:] = np.linspace(0,1,1e7) In [24]: b2 = a2.reshape(1000, 10000) In [25]: c2 = a2.reshape(10000, 1000) In [35]: %timeit d2 = np.dot(b2,c2) 10 loops, best of 3: 163 ms per loop In [36]: %timeit d2 = np.dot(c2,b2) 1 loops, best of 3: 1.67 s per loop So, again, around 120 GFlop/s, and the difference wrt to unaligned AVX data is negligible. One may argue that DGEMM is CPU-bounded and that memory access plays little role here, and that is certainly true. So, let's go with a more memory-bounded problem, like computing a transcendental function with numexpr. 
First with a with NumPy and numexpr with no MKL support: In [8]: a = np.linspace(0,1,1e8) In [9]: %time b = np.sin(a) CPU times: user 1.20 s, sys: 0.22 s, total: 1.42 s Wall time: 1.42 s In [10]: import numexpr as ne In [12]: %time b = ne.evaluate("sin(a)") CPU times: user 1.42 s, sys: 0.27 s, total: 1.69 s Wall time: 0.37 s This time is around 4x faster than regular 'sin' in libc, and about the same speed than a memcpy(): In [13]: %time c = a.copy() CPU times: user 0.19 s, sys: 0.20 s, total: 0.39 s Wall time: 0.39 s Now, with a MKL-aware numexpr and non-AVX alignment: In [8]: p = ctypes.create_string_buffer(int(8e8)); hex(ctypes.addressof(p)) Out[8]: '0x7fce435da010' # 16 bytes alignment In [9]: a = np.ndarray(1e8, "f8", p) In [10]: a[:] = np.linspace(0,1,1e8) In [11]: %time b = ne.evaluate("sin(a)") CPU times: user 0.44 s, sys: 0.27 s, total: 0.71 s Wall time: 0.15 s That is, more than 2x faster than a memcpy() in this system, meaning that the problem is truly memory-bounded. So now, with an AVX aligned buffer: In [14]: a2 = a[2:] # skip the first 16 bytes In [15]: %time b = ne.evaluate("sin(a2)") CPU times: user 0.40 s, sys: 0.28 s, total: 0.69 s Wall time: 0.16 s Again, times are very close. Just to make sure, let's use the timeit magic: In [16]: %timeit b = ne.evaluate("sin(a)") 10 loops, best of 3: 159 ms per loop # unaligned In [17]: %timeit b = ne.evaluate("sin(a2)") 10 loops, best of 3: 154 ms per loop # aligned All in all, it is not clear that AVX alignment would have an advantage, even for memory-bounded problems. But of course, if Intel people are saying that AVX alignment is important is because they have use cases for asserting this. It is just that I'm having a difficult time to find these cases. -- Francesc Alted From sturla at molden.no Thu Dec 20 11:26:28 2012 From: sturla at molden.no (Sturla Molden) Date: Thu, 20 Dec 2012 17:26:28 +0100 Subject: [Numpy-discussion] Byte aligned arrays In-Reply-To: <1355906439.3456.9.camel@farnsworth> References: <1355906439.3456.9.camel@farnsworth> Message-ID: <50D33C34.7050506@molden.no> On 19.12.2012 09:40, Henry Gomersall wrote: > I've written a few simple cython routines for assisting in creating > byte-aligned numpy arrays. The point being for the arrays to work with > SSE/AVX code. > > https://github.com/hgomersall/pyFFTW/blob/master/pyfftw/utils.pxi Why use Cython? http://mail.scipy.org/pipermail/scipy-user/2009-March/020289.html def aligned_zeros(shape, boundary=16, dtype=float, order='C'): N = np.prod(shape) d = np.dtype(dtype) tmp = np.zeros(N * d.itemsize + boundary, dtype=np.uint8) address = tmp.__array_interface__['data'][0] offset = (boundary - address % boundary) % boundary return tmp[offset:offset+N]\ .view(dtype=d)\ .reshape(shape, order=order) Sturla From heng at cantab.net Thu Dec 20 11:45:43 2012 From: heng at cantab.net (Henry Gomersall) Date: Thu, 20 Dec 2012 16:45:43 +0000 Subject: [Numpy-discussion] Byte aligned arrays In-Reply-To: <50D33C34.7050506@molden.no> References: <1355906439.3456.9.camel@farnsworth> <50D33C34.7050506@molden.no> Message-ID: <1356021943.10732.29.camel@farnsworth> On Thu, 2012-12-20 at 17:26 +0100, Sturla Molden wrote: > On 19.12.2012 09:40, Henry Gomersall wrote: > > I've written a few simple cython routines for assisting in creating > > byte-aligned numpy arrays. The point being for the arrays to work > with > > SSE/AVX code. > > > > https://github.com/hgomersall/pyFFTW/blob/master/pyfftw/utils.pxi > > Why use Cython? 
> http://mail.scipy.org/pipermail/scipy-user/2009-March/020289.html > > > def aligned_zeros(shape, boundary=16, dtype=float, order='C'): > N = np.prod(shape) > d = np.dtype(dtype) > tmp = np.zeros(N * d.itemsize + boundary, dtype=np.uint8) > address = tmp.__array_interface__['data'][0] > offset = (boundary - address % boundary) % boundary > return tmp[offset:offset+N]\ > .view(dtype=d)\ > .reshape(shape, order=order) Initially because it kept my module in a single file. That's legacy now, but since I'm already in the Cython domain, it makes sense to get the advantages (like speed - creating a 1000 length array with n_byte_align_empty is about 7 times faster than with the code above). The alignment functions is just a utility function for the FFTW wrapper. Cheers, Henry From heng at cantab.net Thu Dec 20 11:47:50 2012 From: heng at cantab.net (Henry Gomersall) Date: Thu, 20 Dec 2012 16:47:50 +0000 Subject: [Numpy-discussion] Byte aligned arrays In-Reply-To: <50D33C34.7050506@molden.no> References: <1355906439.3456.9.camel@farnsworth> <50D33C34.7050506@molden.no> Message-ID: <1356022070.10732.31.camel@farnsworth> On Thu, 2012-12-20 at 17:26 +0100, Sturla Molden wrote: > return tmp[offset:offset+N]\ > .view(dtype=d)\ > .reshape(shape, order=order) Also, just for the email record, that should be return tmp[offset:offset+N*d.itemsize]\ .view(dtype=d)\ .reshape(shape, order=order) Cheers, Henry From sturla at molden.no Thu Dec 20 11:48:21 2012 From: sturla at molden.no (Sturla Molden) Date: Thu, 20 Dec 2012 17:48:21 +0100 Subject: [Numpy-discussion] Byte aligned arrays In-Reply-To: <1355941534.10732.5.camel@farnsworth> References: <1355906439.3456.9.camel@farnsworth> <1355935645.3456.22.camel@farnsworth> <50D2018C.4080106@continuum.io> <1355941534.10732.5.camel@farnsworth> Message-ID: <50D34155.2040105@molden.no> On 19.12.2012 19:25, Henry Gomersall wrote: > That is not true at least under Windows 32-bit. I think also it's not > true for Linux 32-bit from my vague recollections of testing in a > virtual machine. (disclaimer: both those statements _may_ be out of > date). malloc is required to return memory on 16 byte boundary on Windows. http://msdn.microsoft.com/en-us/library/ycsb6wwf.aspx On Windows we can also use _aligned_malloc and _aligned_realloc to produce the alignment we want. 
_aligned_malloc(N, 4096); // page-aligned memory Sturla From sturla at molden.no Thu Dec 20 11:53:00 2012 From: sturla at molden.no (Sturla Molden) Date: Thu, 20 Dec 2012 17:53:00 +0100 Subject: [Numpy-discussion] Byte aligned arrays In-Reply-To: <1356022070.10732.31.camel@farnsworth> References: <1355906439.3456.9.camel@farnsworth> <50D33C34.7050506@molden.no> <1356022070.10732.31.camel@farnsworth> Message-ID: <50D3426C.4010002@molden.no> On 20.12.2012 17:47, Henry Gomersall wrote: > On Thu, 2012-12-20 at 17:26 +0100, Sturla Molden wrote: >> return tmp[offset:offset+N]\ >> .view(dtype=d)\ >> .reshape(shape, order=order) > > Also, just for the email record, that should be > > return tmp[offset:offset+N*d.itemsize]\ > .view(dtype=d)\ > .reshape(shape, order=order) Oops, yes that's right :) Sturla From heng at cantab.net Thu Dec 20 12:38:58 2012 From: heng at cantab.net (Henry Gomersall) Date: Thu, 20 Dec 2012 17:38:58 +0000 Subject: [Numpy-discussion] Byte aligned arrays In-Reply-To: <50D34155.2040105@molden.no> References: <1355906439.3456.9.camel@farnsworth> <1355935645.3456.22.camel@farnsworth> <50D2018C.4080106@continuum.io> <1355941534.10732.5.camel@farnsworth> <50D34155.2040105@molden.no> Message-ID: <1356025138.12003.2.camel@farnsworth> On Thu, 2012-12-20 at 17:48 +0100, Sturla Molden wrote: > On 19.12.2012 19:25, Henry Gomersall wrote: > > > That is not true at least under Windows 32-bit. I think also it's > not > > true for Linux 32-bit from my vague recollections of testing in a > > virtual machine. (disclaimer: both those statements _may_ be out of > > date). > > malloc is required to return memory on 16 byte boundary on Windows. > > http://msdn.microsoft.com/en-us/library/ycsb6wwf.aspx > > On Windows we can also use _aligned_malloc and _aligned_realloc to > produce the alignment we want. > > _aligned_malloc(N, 4096); // page-aligned memory Except I build with MinGW. Please don't tell me I need to install Visual Studio... I have about 1GB free on my windows partition! hen From bahtiyor_zohidov at mail.ru Thu Dec 20 13:32:01 2012 From: bahtiyor_zohidov at mail.ru (=?UTF-8?B?SGFwcHltYW4=?=) Date: Thu, 20 Dec 2012 22:32:01 +0400 Subject: [Numpy-discussion] =?utf-8?q?=3D=3D=3D___I_REALLY_NEED_YOUR_HELP_?= =?utf-8?q?OUR_PYTHON_USERS__=3D=3D=3D=3D?= Message-ID: <1356028321.485046337@f352.mail.ru> ?Hi Python users,? First of all, Marry coming Cristmas!!! ALL THE BEST TO YOU AND YOUR FAMILY I need solution of integration under trapz() rule: There are following functions: ? def ?F1 (const1, x): ? ? ? """several calculations depending on bessel functions(mathematical functions) jn(), yv() exists in Python""" ? ? ? ?return a,b def ? F2(const1 ,const2, D) : ? ? ? ?"""Several calculation process""" ? ? ? ? x = D / const2 ? ? ? ? [a , b] = F1 (?const1, x) ? ? ?# Where x - the same as in F1() function ? ? ? ? S= a*b ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? # This is (a*b) just an example for simply explanation ? ? ? ?return S def F3(D, R): ? ? ? ? ? ? ? """Here I also calculated some process. So:""" ? ? ? ?return arg1**arg3 ? # Just for example def ?Integrate_all(const1, const2, min1, step1, max1): ? ? ? ?? ? ? ? ?R=arange(min1, max1,?step1) ? ? ?# This is for function "F3" ? ? ? ?D = arange ( 0.1, 7.0, 0.0001) ? ? ? ?M = zeros ( size(R) ) ? ? ? ?for i in range(0,size(R)): ? ? ? ? ? ? ? ? ?M [ i ] = integrate. trapz ( ( F2 ( const1, const2, D ) * F3 ( D ,R)) , x=D) ? ? ? return M ? ? ? const1=complex number, const2= float,? 
The aim of the calculation is to use Integrate_all function for integration function above!!!!!!! When I use those functions directly like one by one separately from python shell it works very accurately, BUT when I do it as shown above : ?ERROR OCCURS:? ??C:\calculation.py:194: RuntimeWarning: invalid value encountered in divide!!!!!! (I think this is occuring in F1()) --> bessel functions >>> jn(n,x) and yv(n,x) HERE IS THE PROBLEM!!!!! I HAVE BEEN TRYING FOR MORE THAN 1.5 MONTHS, UNFORTUNATELY I AM LOOSING MY INTEREST, and TIME. PLEEEEEAAAASEEEEEE HELP ME MY FRIENDS!!!!!!!!!? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?? ? ? ? ?? ? ? ? ? ? ? ? ?? ? ? ? -------------- next part -------------- An HTML attachment was scrubbed... URL: From heng at cantab.net Thu Dec 20 13:35:20 2012 From: heng at cantab.net (Henry Gomersall) Date: Thu, 20 Dec 2012 18:35:20 +0000 Subject: [Numpy-discussion] Byte aligned arrays In-Reply-To: <50D31F44.5050306@continuum.io> References: <1355906439.3456.9.camel@farnsworth> <1355935645.3456.22.camel@farnsworth> <50D2018C.4080106@continuum.io> <1355993611.10732.18.camel@farnsworth> <50D31F44.5050306@continuum.io> Message-ID: <1356028520.12003.9.camel@farnsworth> On Thu, 2012-12-20 at 15:23 +0100, Francesc Alted wrote: > On 12/20/12 9:53 AM, Henry Gomersall wrote: > > On Wed, 2012-12-19 at 19:03 +0100, Francesc Alted wrote: > >> The only scenario that I see that this would create unaligned > arrays > >> is > >> for machines having AVX. But provided that the Intel architecture > is > >> making great strides in fetching unaligned data, I'd be surprised > >> that > >> the difference in performance would be even noticeable. > >> > >> Can you tell us which difference in performance are you seeing for > an > >> AVX-aligned array and other that is not AVX-aligned? Just curious. > > Further to this point, from an Intel article... > > > > > http://software.intel.com/en-us/articles/practical-intel-avx-optimization-on-2nd-generation-intel-core-processors > > > > "Aligning data to vector length is always recommended. When using > Intel > > SSE and Intel SSE2 instructions, loaded data should be aligned to 16 > > bytes. Similarly, to achieve best results use Intel AVX instructions > on > > 32-byte vectors that are 32-byte aligned. The use of Intel AVX > > instructions on unaligned 32-byte vectors means that every second > load > > will be across a cache-line split, since the cache line is 64 bytes. > > This doubles the cache line split rate compared to Intel SSE code > that > > uses 16-byte vectors. A high cache-line split rate in > memory-intensive > > code is extremely likely to cause performance degradation. For that > > reason, it is highly recommended to align the data to 32 bytes for > use > > with Intel AVX." > > > > Though it would be nice to put together a little example of this! > > Indeed, an example is what I was looking for. So provided that I > have > access to an AVX capable machine (having 6 physical cores), and that > MKL > 10.3 has support for AVX, I have made some comparisons using the > Anaconda Python distribution (it ships with most packages linked > against > MKL 10.3). > All in all, it is not clear that AVX alignment would have an > advantage, > even for memory-bounded problems. But of course, if Intel people are > saying that AVX alignment is important is because they have use cases > for asserting this. It is just that I'm having a difficult time to > find > these cases. Thanks for those examples, they were very interesting. 
I managed to temporarily get my hands on a machine with AVX and I have shown some speed-up with aligned arrays. FFT (using my wrappers) gives about a 15% speedup. Also this convolution code: https://github.com/hgomersall/SSE-convolution/blob/master/convolve.c Shows a small but repeatable speed-up (a few %) when using some aligned loads (as many as I can work out to use!). Cheers, Henry From d.s.seljebotn at astro.uio.no Thu Dec 20 14:33:21 2012 From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn) Date: Thu, 20 Dec 2012 20:33:21 +0100 Subject: [Numpy-discussion] === I REALLY NEED YOUR HELP OUR PYTHON USERS ==== In-Reply-To: <1356028321.485046337@f352.mail.ru> References: <1356028321.485046337@f352.mail.ru> Message-ID: <50D36801.9090204@astro.uio.no> On 12/20/2012 07:32 PM, Happyman wrote: > Hi Python users, > > First of all, Marry coming Cristmas!!! ALL THE BEST TO YOU AND YOUR FAMILY > > I need solution of integration under trapz() rule: > There are following functions: > > def F1 (const1, x): > """several calculations depending on bessel > functions(mathematical functions) jn(), yv() exists in Python""" > return a,b > > def F2(const1 ,const2, D) : > > """Several calculation process""" > x = D / const2 > [a , b] = F1 ( const1, x) # Where x - the same as in F1() > function > S= a*b # This is (a*b) just an > example for simply explanation > return S > > def F3(D, R): > > """Here I also calculated some process. So:""" > return arg1**arg3 # Just for example > > def Integrate_all(const1, const2, min1, step1, max1): > > R=arange(min1, max1, step1) # This is for function "F3" > D = arange ( 0.1, 7.0, 0.0001) > > M = zeros ( size(R) ) > > for i in range(0,size(R)): > M [ i ] = integrate. trapz ( ( F2 ( const1, const2, D > ) * F3 ( D ,R)) , x=D) > return M > > const1=complex number, const2= float, > > The aim of the calculation is to use Integrate_all function for > integration function above!!!!!!! > > When I use those functions directly like one by one separately from > python shell it works very accurately, BUT when I do it as shown above : > ERROR OCCURS: C:\calculation.py:194: RuntimeWarning: invalid value > encountered in divide!!!!!! (I think this is occuring in F1()) --> > bessel functions >>> jn(n,x) and yv(n,x) Yet you didn't supply the source code for F1(), so nobody will be able to help you. (But what you should do is a) figure out which argument range F1 will be evaluated in ("print const1, x" should get you started if you don't know), b) write a seperate function that *only* evaluates F1 in various points in this range (perhaps plots it etc.). That should probably give you a clue about what you are doing wrong. The key is to isolate the problem. That will also help you produce a version of F1 that you feel confident about posting to the list. Also, please read http://www.catb.org/esr/faqs/smart-questions.html Dag Sverre From sturla at molden.no Thu Dec 20 14:50:41 2012 From: sturla at molden.no (Sturla Molden) Date: Thu, 20 Dec 2012 20:50:41 +0100 Subject: [Numpy-discussion] Byte aligned arrays In-Reply-To: <1356025138.12003.2.camel@farnsworth> References: <1355906439.3456.9.camel@farnsworth> <1355935645.3456.22.camel@farnsworth> <50D2018C.4080106@continuum.io> <1355941534.10732.5.camel@farnsworth> <50D34155.2040105@molden.no> <1356025138.12003.2.camel@farnsworth> Message-ID: <50D36C11.6030602@molden.no> On 20.12.2012 18:38, Henry Gomersall wrote: > Except I build with MinGW. Please don't tell me I need to install Visual > Studio... I have about 1GB free on my windows partition! 
The same DLL is used as CRT. Sturla From heng at cantab.net Thu Dec 20 14:52:52 2012 From: heng at cantab.net (Henry Gomersall) Date: Thu, 20 Dec 2012 19:52:52 +0000 Subject: [Numpy-discussion] Byte aligned arrays In-Reply-To: <50D36C11.6030602@molden.no> References: <1355906439.3456.9.camel@farnsworth> <1355935645.3456.22.camel@farnsworth> <50D2018C.4080106@continuum.io> <1355941534.10732.5.camel@farnsworth> <50D34155.2040105@molden.no> <1356025138.12003.2.camel@farnsworth> <50D36C11.6030602@molden.no> Message-ID: <1356033172.12003.12.camel@farnsworth> On Thu, 2012-12-20 at 20:50 +0100, Sturla Molden wrote: > On 20.12.2012 18:38, Henry Gomersall wrote: > > > Except I build with MinGW. Please don't tell me I need to install > Visual > > Studio... I have about 1GB free on my windows partition! > > The same DLL is used as CRT. Perhaps the DLL should go and read MS's edicts! hen From sturla at molden.no Thu Dec 20 14:57:31 2012 From: sturla at molden.no (Sturla Molden) Date: Thu, 20 Dec 2012 20:57:31 +0100 Subject: [Numpy-discussion] Byte aligned arrays In-Reply-To: <1356033172.12003.12.camel@farnsworth> References: <1355906439.3456.9.camel@farnsworth> <1355935645.3456.22.camel@farnsworth> <50D2018C.4080106@continuum.io> <1355941534.10732.5.camel@farnsworth> <50D34155.2040105@molden.no> <1356025138.12003.2.camel@farnsworth> <50D36C11.6030602@molden.no> <1356033172.12003.12.camel@farnsworth> Message-ID: <50D36DAB.8080203@molden.no> On 20.12.2012 20:52, Henry Gomersall wrote: > Perhaps the DLL should go and read MS's edicts! Do you link with same same CRT as Python? (msvcr90.dll) You should always use -lmsvcr90. If you don't, you will link with msvcrt.dll. Sturla From heng at cantab.net Thu Dec 20 15:03:44 2012 From: heng at cantab.net (Henry Gomersall) Date: Thu, 20 Dec 2012 20:03:44 +0000 Subject: [Numpy-discussion] Byte aligned arrays In-Reply-To: <50D36DAB.8080203@molden.no> References: <1355906439.3456.9.camel@farnsworth> <1355935645.3456.22.camel@farnsworth> <50D2018C.4080106@continuum.io> <1355941534.10732.5.camel@farnsworth> <50D34155.2040105@molden.no> <1356025138.12003.2.camel@farnsworth> <50D36C11.6030602@molden.no> <1356033172.12003.12.camel@farnsworth> <50D36DAB.8080203@molden.no> Message-ID: <1356033824.12003.14.camel@farnsworth> On Thu, 2012-12-20 at 20:57 +0100, Sturla Molden wrote: > On 20.12.2012 20:52, Henry Gomersall wrote: > > > Perhaps the DLL should go and read MS's edicts! > > Do you link with same same CRT as Python? (msvcr90.dll) > > You should always use -lmsvcr90. > > If you don't, you will link with msvcrt.dll. Hmmm, plausibly not. Why is it important? (for my own understanding) Cheers, Henry From sturla at molden.no Thu Dec 20 15:05:42 2012 From: sturla at molden.no (Sturla Molden) Date: Thu, 20 Dec 2012 21:05:42 +0100 Subject: [Numpy-discussion] Byte aligned arrays In-Reply-To: <50D36DAB.8080203@molden.no> References: <1355906439.3456.9.camel@farnsworth> <1355935645.3456.22.camel@farnsworth> <50D2018C.4080106@continuum.io> <1355941534.10732.5.camel@farnsworth> <50D34155.2040105@molden.no> <1356025138.12003.2.camel@farnsworth> <50D36C11.6030602@molden.no> <1356033172.12003.12.camel@farnsworth> <50D36DAB.8080203@molden.no> Message-ID: <50D36F96.1030908@molden.no> On 20.12.2012 20:57, Sturla Molden wrote: > On 20.12.2012 20:52, Henry Gomersall wrote: > >> Perhaps the DLL should go and read MS's edicts! > > Do you link with same same CRT as Python? (msvcr90.dll) > > You should always use -lmsvcr90. > > If you don't, you will link with msvcrt.dll. 
Here is VS2008, which uses malloc from msvcr90.dll: http://msdn.microsoft.com/en-us/library/ycsb6wwf(v=vs.90).aspx "malloc is required to return memory on a 16-byte boundary" If this does not happen, you are linking with the wrong CRT. When building C extensions for Python, we should always link with the same CRT as Python uses, unless you are 100% certain that CRT resources are never shared with Python. Sturla From heng at cantab.net Thu Dec 20 15:09:24 2012 From: heng at cantab.net (Henry Gomersall) Date: Thu, 20 Dec 2012 20:09:24 +0000 Subject: [Numpy-discussion] Byte aligned arrays In-Reply-To: <50D36F96.1030908@molden.no> References: <1355906439.3456.9.camel@farnsworth> <1355935645.3456.22.camel@farnsworth> <50D2018C.4080106@continuum.io> <1355941534.10732.5.camel@farnsworth> <50D34155.2040105@molden.no> <1356025138.12003.2.camel@farnsworth> <50D36C11.6030602@molden.no> <1356033172.12003.12.camel@farnsworth> <50D36DAB.8080203@molden.no> <50D36F96.1030908@molden.no> Message-ID: <1356034164.12003.16.camel@farnsworth> On Thu, 2012-12-20 at 21:05 +0100, Sturla Molden wrote: > On 20.12.2012 20:57, Sturla Molden wrote: > > On 20.12.2012 20:52, Henry Gomersall wrote: > > > >> Perhaps the DLL should go and read MS's edicts! > > > > Do you link with same same CRT as Python? (msvcr90.dll) > > > > You should always use -lmsvcr90. > > > > If you don't, you will link with msvcrt.dll. > > Here is VS2008, which uses malloc from msvcr90.dll: > > http://msdn.microsoft.com/en-us/library/ycsb6wwf(v=vs.90).aspx > > "malloc is required to return memory on a 16-byte boundary" > > If this does not happen, you are linking with the wrong CRT. When > building C extensions for Python, we should always link with the same > CRT as Python uses, unless you are 100% certain that CRT resources > are > never shared with Python. Is this something that can be established when using Cython? I haven't experienced any problems so far (other than a now solved problem trying to free memory allocated in FFTW which was down to a runtime mismatch). Cheers, Henry From sturla at molden.no Thu Dec 20 15:13:53 2012 From: sturla at molden.no (Sturla Molden) Date: Thu, 20 Dec 2012 21:13:53 +0100 Subject: [Numpy-discussion] Byte aligned arrays In-Reply-To: <1356033824.12003.14.camel@farnsworth> References: <1355906439.3456.9.camel@farnsworth> <1355935645.3456.22.camel@farnsworth> <50D2018C.4080106@continuum.io> <1355941534.10732.5.camel@farnsworth> <50D34155.2040105@molden.no> <1356025138.12003.2.camel@farnsworth> <50D36C11.6030602@molden.no> <1356033172.12003.12.camel@farnsworth> <50D36DAB.8080203@molden.no> <1356033824.12003.14.camel@farnsworth> Message-ID: <50D37181.5000301@molden.no> On 20.12.2012 21:03, Henry Gomersall wrote: > Why is it important? (for my own understanding) Because if CRT resources are shared between different CRT versions, bad things will happen (the ABIs are not equivalent, errno and other globals are at different addresses, etc.) Cython code tends to share CRT resources with Python. For example, your Cython code might (invisibly to us) invoke malloc to allocate space for an objent, and then Python will later call free. This should perferably go through the same DLL. Another thing is that msvcrt.dll is for Windows own system resources, not for user apps. 
Sturla From sturla at molden.no Thu Dec 20 15:22:46 2012 From: sturla at molden.no (Sturla Molden) Date: Thu, 20 Dec 2012 21:22:46 +0100 Subject: [Numpy-discussion] Byte aligned arrays In-Reply-To: <50D37181.5000301@molden.no> References: <1355906439.3456.9.camel@farnsworth> <1355935645.3456.22.camel@farnsworth> <50D2018C.4080106@continuum.io> <1355941534.10732.5.camel@farnsworth> <50D34155.2040105@molden.no> <1356025138.12003.2.camel@farnsworth> <50D36C11.6030602@molden.no> <1356033172.12003.12.camel@farnsworth> <50D36DAB.8080203@molden.no> <1356033824.12003.14.camel@farnsworth> <50D37181.5000301@molden.no> Message-ID: <50D37396.6010800@molden.no> On 20.12.2012 21:13, Sturla Molden wrote: > Because if CRT resources are shared between different CRT versions, bad > things will happen (the ABIs are not equivalent, errno and other globals > are at different addresses, etc.) For example, PyErr_SetFromErrno will return garbage if CRTs are shared. Sturla From heng at cantab.net Thu Dec 20 15:24:05 2012 From: heng at cantab.net (Henry Gomersall) Date: Thu, 20 Dec 2012 20:24:05 +0000 Subject: [Numpy-discussion] Byte aligned arrays In-Reply-To: <50D37181.5000301@molden.no> References: <1355906439.3456.9.camel@farnsworth> <1355935645.3456.22.camel@farnsworth> <50D2018C.4080106@continuum.io> <1355941534.10732.5.camel@farnsworth> <50D34155.2040105@molden.no> <1356025138.12003.2.camel@farnsworth> <50D36C11.6030602@molden.no> <1356033172.12003.12.camel@farnsworth> <50D36DAB.8080203@molden.no> <1356033824.12003.14.camel@farnsworth> <50D37181.5000301@molden.no> Message-ID: <1356035045.12003.20.camel@farnsworth> On Thu, 2012-12-20 at 21:13 +0100, Sturla Molden wrote: > On 20.12.2012 21:03, Henry Gomersall wrote: > > > Why is it important? (for my own understanding) > > Because if CRT resources are shared between different CRT versions, > bad > things will happen (the ABIs are not equivalent, errno and other > globals > are at different addresses, etc.) Cython code tends to share CRT > resources with Python. For example, your Cython code might (invisibly > to > us) invoke malloc to allocate space for an objent, and then Python > will > later call free. This should perferably go through the same DLL. > I understood the general point about it being bad, but it was more specific to Cython, which I think you answered. Presumably it's less of an issue with pure C libs where calls to libc are more obvious. > Another thing is that msvcrt.dll is for Windows own system resources, > not for user apps. I didn't know that. It's a real pain having so many libc libs knocking around. I have little experience of Windows, as you may have guessed! hen From sturla at molden.no Thu Dec 20 15:45:07 2012 From: sturla at molden.no (Sturla Molden) Date: Thu, 20 Dec 2012 21:45:07 +0100 Subject: [Numpy-discussion] Byte aligned arrays In-Reply-To: <1356035045.12003.20.camel@farnsworth> References: <1355906439.3456.9.camel@farnsworth> <1355935645.3456.22.camel@farnsworth> <50D2018C.4080106@continuum.io> <1355941534.10732.5.camel@farnsworth> <50D34155.2040105@molden.no> <1356025138.12003.2.camel@farnsworth> <50D36C11.6030602@molden.no> <1356033172.12003.12.camel@farnsworth> <50D36DAB.8080203@molden.no> <1356033824.12003.14.camel@farnsworth> <50D37181.5000301@molden.no> <1356035045.12003.20.camel@farnsworth> Message-ID: <50D378D3.6040407@molden.no> On 20.12.2012 21:24, Henry Gomersall wrote: > I didn't know that. It's a real pain having so many libc libs knocking > around. I have little experience of Windows, as you may have guessed! 
Originally there was only one system-wide CRT on Windows (msvcrt.dll), which is why MinGW linkes with that by default. But starting with the release of VS2003, Microsoft decided to reserve msvcrt.dll for system resources and create a libc "DLL Hell" for user apps. Visual Studio 2003 came with static and dynamic versions of the CRT library, as well as single- and multithreaded ones... Then everyone building apps that used DLLs or COM objects just had to make sure that nothing conflicted. And for every later version of Visual Studio they have released further more CRT versions, adding to the confusion. Currently: The official Python 2.7 binaries are built with Visual Studio 2008 and linked with msvcr90.dll. MinGW has import libraries for the other CRTs Microsoft has released, so just add -lmsvcr90 to your final linkage. Python's distutils will control the build process for extensions automatically. Adding -lmsvcr90 is one of the things that distutils will do. Sturla From heng at cantab.net Thu Dec 20 15:50:41 2012 From: heng at cantab.net (Henry Gomersall) Date: Thu, 20 Dec 2012 20:50:41 +0000 Subject: [Numpy-discussion] Byte aligned arrays In-Reply-To: <50D378D3.6040407@molden.no> References: <1355906439.3456.9.camel@farnsworth> <1355935645.3456.22.camel@farnsworth> <50D2018C.4080106@continuum.io> <1355941534.10732.5.camel@farnsworth> <50D34155.2040105@molden.no> <1356025138.12003.2.camel@farnsworth> <50D36C11.6030602@molden.no> <1356033172.12003.12.camel@farnsworth> <50D36DAB.8080203@molden.no> <1356033824.12003.14.camel@farnsworth> <50D37181.5000301@molden.no> <1356035045.12003.20.camel@farnsworth> <50D378D3.6040407@molden.no> Message-ID: <1356036641.12003.24.camel@farnsworth> On Thu, 2012-12-20 at 21:45 +0100, Sturla Molden wrote: > On 20.12.2012 21:24, Henry Gomersall wrote: > > > I didn't know that. It's a real pain having so many libc libs > knocking > > around. I have little experience of Windows, as you may have > guessed! > > Originally there was only one system-wide CRT on Windows > (msvcrt.dll), > which is why MinGW linkes with that by default. But starting with the > release of VS2003, Microsoft decided to reserve msvcrt.dll for system > resources and create a libc "DLL Hell" for user apps. Visual Studio > 2003 > came with static and dynamic versions of the CRT library, as well as > single- and multithreaded ones... Then everyone building apps that > used > DLLs or COM objects just had to make sure that nothing conflicted. > And > for every later version of Visual Studio they have released further > more > CRT versions, adding to the confusion. > > Currently: The official Python 2.7 binaries are built with Visual > Studio > 2008 and linked with msvcr90.dll. > > MinGW has import libraries for the other CRTs Microsoft has released, > so > just add -lmsvcr90 to your final linkage. > > Python's distutils will control the build process for extensions > automatically. Adding -lmsvcr90 is one of the things that distutils > will > do. Well, I _am_ using distutils, so I should expect it to happen then. Probably my alignment concerns are based on some previous stuff when I was building pure C libs under Windows. Anyway, thanks for all the assistance! 
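For reference, the distutils route described above amounts to something like the following minimal sketch (the module name "myext" and its source file are placeholders); when it is built with "python setup.py build_ext --compiler=mingw32" against the python.org 2.7 build, it is distutils that adds the -lmsvcr90 import library to the MinGW link step:

    # setup.py -- minimal sketch, placeholder names
    from distutils.core import setup, Extension

    setup(
        name="myext",
        ext_modules=[Extension("myext", sources=["myext.c"])],
    )

    # build on Windows with MinGW:
    #   python setup.py build_ext --compiler=mingw32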
hen From bahtiyor_zohidov at mail.ru Thu Dec 20 16:58:30 2012 From: bahtiyor_zohidov at mail.ru (=?UTF-8?B?SGFwcHltYW4=?=) Date: Fri, 21 Dec 2012 01:58:30 +0400 Subject: [Numpy-discussion] =?utf-8?q?=3D=3D=3D_RuntimeWarning=3A_INVALID_?= =?utf-8?q?VALUE_=3E_encountered_in_divide_=3D=3D=3D=3D?= In-Reply-To: <50D36801.9090204@astro.uio.no> References: <1356028321.485046337@f352.mail.ru> <50D36801.9090204@astro.uio.no> Message-ID: <1356040710.964528114@f220.mail.ru> Well, If F1 is run in Python shell, everything is properly working, BUT if call through the functions it is wrongly answering!!! ? def F1 (const1, x): # const1 should be complex number T1=round(2+x+4*x**(1.0/3.0)) T2=const1*x T3=const1**2 x1,x2,x3,x4 = sph_jnyn(T1, x) --> 1-standard function in Python x1=x1[1:] x2=x2[1:] x3=x3[1:] x4=x4[1:] a1=x1+1.0j*x3 a2=x2+1.0j*x4 y1,y2= sph_jn(T1,T2) --> 2- standard function in Python y1=y1[1:] y2=y2[1:] b1=x1+x*x2 b2=y1+T2*y2 b3=a1+x*a2 an1= T3*y1*b1-x1*b2 an2= T3*y1*b3-a1*b2 a=an1/an2 bn1= y1*b1-x1*b2 bn2= y1*b3-a1*b2 b=bn1/bn2 return a,b ? ? ???????, 20 ??????? 2012, 20:33 ?? Dag Sverre Seljebotn : >On 12/20/2012 07:32 PM, Happyman wrote: >> Hi Python users, >> >> First of all, Marry coming Cristmas!!! ALL THE BEST TO YOU AND YOUR FAMILY >> >> I need solution of integration under trapz() rule: >> There are following functions: >> >> def F1 (const1, x): >> """several calculations depending on bessel >> functions(mathematical functions) jn(), yv() exists in Python""" >> return a,b >> >> def F2(const1 ,const2, D) : >> >> """Several calculation process""" >> x = D / const2 >> [a , b] = F1 ( const1, x) # Where x - the same as in F1() >> function >> S= a*b # This is (a*b) just an >> example for simply explanation >> return S >> >> def F3(D, R): >> >> """Here I also calculated some process. So:""" >> return arg1**arg3 # Just for example >> >> def Integrate_all(const1, const2, min1, step1, max1): >> >> R=arange(min1, max1, step1) # This is for function "F3" >> D = arange ( 0.1, 7.0, 0.0001) >> >> M = zeros ( size(R) ) >> >> for i in range(0,size(R)): >> M [ i ] = integrate. trapz ( ( F2 ( const1, const2, D >> ) * F3 ( D ,R)) , x=D) >> return M >> >> const1=complex number, const2= float, >> >> The aim of the calculation is to use Integrate_all function for >> integration function above!!!!!!! >> >> When I use those functions directly like one by one separately from >> python shell it works very accurately, BUT when I do it as shown above : >> ERROR OCCURS: C:\calculation.py:194: RuntimeWarning: invalid value >> encountered in divide!!!!!! (I think this is occuring in F1()) --> >> bessel functions >>> jn(n,x) and yv(n,x) > >Yet you didn't supply the source code for F1(), so nobody will be able >to help you. > >(But what you should do is a) figure out which argument range F1 will be >evaluated in ("print const1, x" should get you started if you don't >know), b) write a seperate function that *only* evaluates F1 in various >points in this range (perhaps plots it etc.). That should probably give >you a clue about what you are doing wrong. > >The key is to isolate the problem. That will also help you produce a >version of F1 that you feel confident about posting to the list. > >Also, please read > >http://www.catb.org/esr/faqs/smart-questions.html > >Dag Sverre >_______________________________________________ >NumPy-Discussion mailing list >NumPy-Discussion at scipy.org >http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... 
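Following up on the advice above to isolate the problem: an easy way to find the exact expression that triggers the "invalid value encountered in divide" warning is to tell numpy to raise an exception instead of warning, so the traceback points at the offending line. A small self-contained sketch of the idea (the toy 0/0 division only stands in for whatever expression inside F1/F2 misbehaves; wrap the call to Integrate_all() the same way):

    import numpy as np

    with np.errstate(invalid='raise', divide='raise'):
        try:
            x = np.array([0.0, 1.0])
            y = x / x               # 0/0 -> raises FloatingPointError here
        except FloatingPointError as err:
            print("caught:", err)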
URL: From ralf.gommers at gmail.com Thu Dec 20 17:02:42 2012 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Thu, 20 Dec 2012 23:02:42 +0100 Subject: [Numpy-discussion] MKL licenses for core scientific Python projects In-Reply-To: References: <4473522818305701545@unknownmsgid> Message-ID: On Sat, Dec 15, 2012 at 2:13 PM, Ralf Gommers wrote: > > > > On Sat, Dec 15, 2012 at 5:06 AM, Chris Barker - NOAA Federal < > chris.barker at noaa.gov> wrote: > >> Ralf, >> >> Do these licenses allow fully free distribution of binaries? And are >> those binaries themselves redistributive? I.e. with py2exe and friends? >> >> If so, that could be nice. >> > > Good point. It's not entirely clear from the emails I received. I've asked > for clarification. > Yes, redistribution is fine. Intel would even be quite pleased if we would offer official binaries built against MKL next to or instead of ATLAS. They do ask us to include attribution on the website, which I don't see a problem with. Actually I'd like to see that better organized on the website anyway - we should mention companies like Enthought, Continuum, Github and others who are contributing resources now or have done so in the past. Ralf > Ralf > > >> >> On Dec 14, 2012, at 1:01 PM, Ralf Gommers wrote: >> >> Hi all, >> >> Intel has offered to provide free MKL licenses for main contributors to >> scientific Python projects - at least those listed at >> numfocus.org/projects/. Licenses for all OSes that are required can be >> provided, the condition is that they're used for building/testing our >> projects and not for broader purposes. >> >> If you're interested, please let me know your full name and what OS you >> need a license for. >> >> Cheers, >> Ralf >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Thu Dec 20 18:46:48 2012 From: matthew.brett at gmail.com (Matthew Brett) Date: Thu, 20 Dec 2012 15:46:48 -0800 Subject: [Numpy-discussion] Future of numpy (was: DARPA funding for Blaze and passing the NumPy torch) Message-ID: Hi, [Travis wrote...] > My strong suggestion is that development discussions of the project continue on > this list with consensus among the active participants being the goal for > development. I don't think 100% consensus is a rigid requirement --- but > certainly a super-majority should be the goal, and serious changes should not be > made with out a clear consensus. I would pay special attention to > under-represented people (users with intense usage of NumPy but small voices on > this list). There are many of them. If you push me for specifics then at > this point in NumPy's history, I would say that if Chuck, Nathaniel, and Ralf > agree on a course of action, it will likely be a good thing for the project. I > suspect that even if only 2 of the 3 agree at one time it might still be a good > thing (but I would expect more detail and discussion). There are others whose > opinion should be sought as well: Ondrej Certik, Perry Greenfield, Robert Kern, > David Cournapeau, Francesc Alted, and Mark Wiebe to name a few. 
For some > questions, I might even seek input from people like Konrad Hinsen and Paul > Dubois --- if they have time to give it. I will still be willing to offer my > view from time to time and if I am asked. Thank you for starting this discussion. I am more or less offline at the moment in Cuba and flying, but I hope very much this will be an opportunity for a good discussion on the best way forward for numpy. Travis - I think you are suggesting that there should be no one person in charge of numpy, and I think this is very unlikely to work well. Perhaps there are good examples of well-led projects where there is not a clear leader, but I can't think of any myself at the moment. My worry would be that, without a clear leader, it will be unclear how decisions are made, and that will make it very hard to take strategic decisions. I would like to humbly suggest the following in the hope that it spurs discussion. As first pass, Ralf, Chuck and Nathaniel decide on a core group of people that will form the succession committee. Maybe this could be the list of people you listed above. Ralf, Chuck and Nathaniel then ask for people to wish to propose themselves as the leader of numpy. Anyone proposing themselves to lead numpy would remove themselves from the succession committee. The proposed leaders of numpy write a short manifesto saying why they are the right choice for the job, and what they intend to do if elected. The manifestos and issues arising are discussed in public on the mailing list - the equivalent of an online presidential debate. In due course - say after 2 weeks or after the discussion seems to be dying out - the succession committee votes on the leader. I propose that these votes should be public, but I can see the opposite argument. How does that sound? Best, Matthew From ondrej.certik at gmail.com Thu Dec 20 20:23:40 2012 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Thu, 20 Dec 2012 17:23:40 -0800 Subject: [Numpy-discussion] Travis-CI stopped supporting Python 3.1, but added 3.3 Message-ID: Hi, I noticed that the 3.1 tests are now failing. After clarification with the Travis guys: https://groups.google.com/d/topic/travis-ci/02iRu6kmwY8/discussion I've submitted a fix to our .travis.yml (and backported to 1.7): https://github.com/numpy/numpy/pull/2850 https://github.com/numpy/numpy/pull/2851 In case you were wondering. Do we need to support Python 3.1? We could in principle test 3.1 just like we test 2.4. I don't know if it is worth the pain. Ondrej From ondrej.certik at gmail.com Thu Dec 20 20:25:15 2012 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Thu, 20 Dec 2012 17:25:15 -0800 Subject: [Numpy-discussion] Travis failures with no errors In-Reply-To: References: Message-ID: On Thu, Dec 13, 2012 at 4:39 PM, Ond?ej ?ert?k wrote: > Hi, > > I found these recent weird "failures" in Travis, but I can't find any > problem with the log and all tests pass. Any ideas what is going on? 
> > https://travis-ci.org/numpy/numpy/jobs/3570123 > https://travis-ci.org/numpy/numpy/jobs/3539549 > https://travis-ci.org/numpy/numpy/jobs/3369629 And here is another one: https://travis-ci.org/numpy/numpy/jobs/3768782 Ondrej From njs at pobox.com Thu Dec 20 20:39:19 2012 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 21 Dec 2012 01:39:19 +0000 Subject: [Numpy-discussion] Future of numpy (was: DARPA funding for Blaze and passing the NumPy torch) In-Reply-To: References: Message-ID: On Thu, Dec 20, 2012 at 11:46 PM, Matthew Brett wrote: > Travis - I think you are suggesting that there should be no one > person in charge of numpy, and I think this is very unlikely to work > well. Perhaps there are good examples of well-led projects where > there is not a clear leader, but I can't think of any myself at the > moment. My worry would be that, without a clear leader, it will be > unclear how decisions are made, and that will make it very hard to > take strategic decisions. Curious; my feeling is the opposite, that among mature and successful FOSS projects, having a clear leader is the uncommon case. GCC doesn't, Glibc not only has no leader but they recently decided to get rid of their formal steering committee, I'm pretty sure git doesn't, Apache certainly doesn't, Samba doesn't really, etc. As usual Karl Fogel has sensible comments on this: http://producingoss.com/en/consensus-democracy.html In practice the main job of a successful FOSS leader is to refuse to make decisions, nudge people to work things out, and then if they refuse to work things out tell them to go away until they do: https://lwn.net/Articles/105375/ and what actually gives people influence in a project is the respect of the other members. The former stuff is stuff anyone can do, and the latter isn't something you can confer or take away with a vote. Nor do we necessarily have a great track record for executive decisions actually working things out. -n From ondrej.certik at gmail.com Thu Dec 20 20:39:56 2012 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Thu, 20 Dec 2012 17:39:56 -0800 Subject: [Numpy-discussion] DARPA funding for Blaze and passing the NumPy torch In-Reply-To: <5EA78D17-1494-4057-A2C6-89BFC116C69F@continuum.io> References: <5EA78D17-1494-4057-A2C6-89BFC116C69F@continuum.io> Message-ID: Hi Travis, On Sun, Dec 16, 2012 at 10:07 PM, Travis Oliphant wrote: > Hello all, > > There is a lot happening in my life right now and I am spread quite thin among the various projects that I take an interest in. In particular, I am thrilled to publicly announce on this list that Continuum Analytics has received DARPA funding (to the tune of at least $3 million) for Blaze, Numba, and Bokeh which we are writing to take NumPy, SciPy, and visualization into the domain of very large data sets. This is part of the XDATA program, and I will be taking an active role in it. You can read more about Blaze here: http://blaze.pydata.org. You can read more about XDATA here: http://www.darpa.mil/Our_Work/I2O/Programs/XDATA.aspx First of all, congratulations! > > I personally think Blaze is the future of array-oriented computing in Python. I will be putting efforts and resources next year behind making that case. How it interacts with future incarnations of NumPy, Pandas, or other projects is an interesting and open question. I have no doubt the future will be a rich ecosystem of interoperating array-oriented data-structures. 
I invite anyone interested in Blaze to participate in the discussions and development at https://groups.google.com/a/continuum.io/forum/#!forum/blaze-dev or watch the project on our public GitHub repo: https://github.com/ContinuumIO/blaze. Blaze is being incubated under the ContinuumIO GitHub project for now, but eventually I hope it will receive its own GitHub project page later next year. Development of Blaze is early but we are moving rapidly with it (and have deliverable deadlines --- thus while we will welcome input and pull requests we won't have a ton of time to respond to simple queries until > at least May or June). There is more that we are working on behind the scenes with respect to Blaze that will be coming out next year as well but isn't quite ready to show yet. > > As I look at the coming months and years, my time for direct involvement in NumPy development is therefore only going to get smaller. As a result it is not appropriate that I remain as "head steward" of the NumPy project (a term I prefer to BFD12 or anything else). I'm sure that it is apparent that while I've tried to help personally where I can this year on the NumPy project, my role has been more one of coordination, seeking funding, and providing expert advice on certain sections of code. I fundamentally agree with Fernando Perez that the responsibility of care-taking open source projects is one of stewardship --- something akin to public service. I have tried to emulate that belief this year --- even while not always succeeding. > > It is time for me to make official what is already becoming apparent to observers of this community, namely, that I am stepping down as someone who might be considered "head steward" for the NumPy project and officially leaving the development of the project in the hands of others in the community. I don't think the project actually needs a new "head steward" --- especially from a development perspective. Instead I see a lot of strong developers offering key opinions for the project as well as a great set of new developers offering pull requests. > > My strong suggestion is that development discussions of the project continue on this list with consensus among the active participants being the goal for development. I don't think 100% consensus is a rigid requirement --- but certainly a super-majority should be the goal, and serious changes should not be made with out a clear consensus. I would pay special attention to under-represented people (users with intense usage of NumPy but small voices on this list). There are many of them. If you push me for specifics then at this point in NumPy's history, I would say that if Chuck, Nathaniel, and Ralf agree on a course of action, it will likely be a good thing for the project. I suspect that even if only 2 of the 3 agree at one time it might still be a good thing (but I would expect more detail and discussion). There are others whose opinion should be sought as well: Ondrej Certik, Perry Greenfield, Robert Kern, David Cournapeau, Francesc Alted, and Mark Wiebe to > name a few. For some questions, I might even seek input from people like Konrad Hinsen and Paul Dubois --- if they have time to give it. I will still be willing to offer my view from time to time and if I am asked. 
After being involved with the numpy development/release lately, Chuck, Nathaniel, and Ralf would be my (independent) choice as well, in fact I have always sought their (and yours) approval to any PR that I had doubts merging, and I have been treating them as the de-facto leaders of the project... So it's good that you made it official. > > Greg Wilson (of Software Carpentry fame) asked me recently what letter I would have written to myself 5 years ago. What would I tell myself to do given the knowledge I have now? I've thought about that for a bit, and I have some answers. I don't know if these will help anyone, but I offer them as hopefully instructive: > > 1) Do not promise to not break the ABI of NumPy --- and in fact emphasize that it will be broken at least once in the 1.X series. NumPy was designed to add new data-types --- but not without breaking the ABI. NumPy has needed more data-types and still needs even more. While it's not beautifully simple to add new data-types, it can be done. But, it is impossible to add them without breaking the ABI in some fashion. The desire to add new data-types *and* keep ABI compatibility has led to significant pain. I think the ABI non-breakage goal has been amplified by the poor state of package management in Python. The fact that it's painful for someone to update their downstream packages when an upstream ABI breaks (on Windows and Mac in particular) has put a lot of unfortunate pressure on this community. Pressure that was not envisioned or understood when I was writing NumPy. > > (As an aside: This is one reason Continuum has invested resources in building the conda tool and a completely free set of binary packages called Anaconda CE which is becoming more and more usable thanks to the efforts of Bryan Van de Ven and Ilan Schnell and our testing team at Continuum. The conda tool: http://docs.continuum.io/conda/index.html is open source and BSD licensed and the next release will provide the ability to build packages, build indexes on package repositories and interface with pip. Expect a blog-post in the near future about how cool conda is!). > > 2) Don't create array-scalars. Instead, make the data-type object a meta-type object whose instances are the items returned from NumPy arrays. There is no need for a separate array-scalar object and in fact it's confusing to the type-system. I understand that now. I did not understand that 5 years ago. > > 3) Special-case small arrays to avoid the memory indirection and look at PDL so that generalized ufuncs are supported from the beginning. > > 4) Define missing-value data-types and labels on the dimensions and arrays > > 5) Define a standard "dictionary of NumPy arrays" interface as the basic "structure of arrays" concept to go with the "array of structures" that structured arrays provide. > > 6) Start work on SQL interface to NumPy arrays *now* > > Additional comments I would make to someone today: > > 1) Most of NumPy should be written in Python with Numba used as the compiler (particularly as soon as Numba gets the ability to create Python extension modules which is in the next release). > 2) There are still many, many optimizations that can be made in NumPy run-time (especially in the face of modern hardware). > > I will continue to be available to answer questions and I may chime in here and there on pull requests. However, most of my time for NumPy will be on administrative aspects of the project where I will continue to take an active interest. 
To help make sure that this happens in a transparent way, I would like to propose that "administrative" support of the project be left to the NumFOCUS board of which I am currently 1 of 9 members. The other board members are currently: Ralf Gommers, Anthony Scopatz, Andy Terrel, Prabhu Ramachandran, Fernando Perez, Emmanuelle Gouillart, Jarrod Millman, and Perry Greenfield. While NumFOCUS basically seeks to promote and fund the entire scientific Python stack, I think it can also play a role in helping to administer some of the core projects which the board members themselves have a personal interest in. > > By administrative support, I mean decisions like "what should be done with any NumPy IP or web-domains" or "what kind of commercially-related ads or otherwise should go on the NumPy home page", or "what should be done with the NumPy github account", etc. --- basically anything that requires an executive decision that is not directly development related. I don't expect there to be many of these decisions. But, when they show up, I would like them to be made in as transparent and public of a way as possible. In practice, the way I see this working is that there are members of the NumPy community who are (like me) particularly interested in admin-related questions and serve on a NumPy team in the NumFOCUS organization. I just know I'll be attending NumFOCUS board meetings, and I would like to help move administrative decisions forward with NumPy as part of the time I spend thinking about NumFOCUS. > > If people on this list would like to play an active role in those admin discussions, then I would heartily welcome them into NumFOCUS membership where they would work with interested members of the NumFOCUS board (like me and Ralf) to help direct that organization. I would really love to have someone from this list volunteer to serve on the NumPy team as part of the NumFOCUS project. I am certainly going to be interested in the opinions of people who are active participants on this list and on GitHub pages for NumPy on anything admin related to NumPy, and I expect Ralf would also be very interested in those views. > > One admin discussion that I will bring up in another email (as this one is already too long) is about making 2 or 3 lists for NumPy such as numpy-admin at numpy.org, numpy-dev at numpy.org, and numpy-users at numpy-org. > > Just because I'll be spending more time on Blaze, Numba, Bokeh, and the PyData ecosystem does not mean that I won't be around for NumPy. I will continue to promote NumPy. My involvement with Continuum connects me to NumPy as Continuum continues to offer commercial support contracts for NumPy (and SciPy and other open source projects). Continuum will also continue to maintain its Github NumPy project which will contain pull requests from our company that we are working to get into the mainline branch. Continuum will also continue to provide resources for release-management of NumPy (we have been funding Ondrej in this role for the past 6 months --- though I would like to see this happen through NumFOCUS in the future even if Continuum provides much of the money). We also offer optimized versions of NumPy in our commercial Anaconda distribution (Anaconda CE is free and open source). > > Also, I will still be available for questions and help (I'm not disappearing --- just making it clear that I'm stepping back into an occasional NumPy developer role). 
It has been extremely gratifying to see the number of pull-requests, GitHub-conversations, and code contributions increase this year. Even though the 1.7 release has taken a long time to stabilize, there have been a lot of people participating in the discussion and in helping to track down the problems, figure out what to do, and fix them. It even makes it possible for people to think about 1.7 as a long-term release. > > I will continue to hope that the spirit of openness, tolerance, respect, and gratitude continue to permeate this mailing list, and that we continue to seek to resolve any differences with trust and mutual respect. I know I have offended people in the past with quick remarks and actions made sometimes in haste without fully realizing how they might be taken. But, I also know that like many of you I have always done the very best I could for moving Python for scientific computing forward in the best way I know how. > > Thank you for the great memories. If you will forgive a little sentiment: My daughter who is in college now was 3 years old when I began working with this community and went down a road that would lead to my involvement with SciPy and NumPy. I have marked the building of my family and the passage of time with where the Python for Scientific Computing Community was at. Like many of you, I have given a great deal of attention and time to building this community. That sacrifice and time has led me to love what we have created. I know that I leave this segment of the community with the tools in better hands than mine. I am hopeful that NumPy will continue to be a useful array library for the Python community for many years to come even as we all continue to build new tools for the future. Thank you for all the work you are doing, it gives a huge boost to the whole Python scientific/numeric/data community. I discovered it when I first came to the scipy conference in 2007. I was 23. All I could think of was wow, this is pretty awesome. I remember your energy and enthusiasm. Now I am 29. Since then I tried to contribute in my own little ways as well. Ondrej From ondrej.certik at gmail.com Thu Dec 20 20:48:18 2012 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Thu, 20 Dec 2012 17:48:18 -0800 Subject: [Numpy-discussion] Future of numpy (was: DARPA funding for Blaze and passing the NumPy torch) In-Reply-To: References: Message-ID: Hi Matthew, On Thu, Dec 20, 2012 at 3:46 PM, Matthew Brett wrote: > Hi, > > [Travis wrote...] >> My strong suggestion is that development discussions of the project continue on >> this list with consensus among the active participants being the goal for >> development. I don't think 100% consensus is a rigid requirement --- but >> certainly a super-majority should be the goal, and serious changes should not be >> made with out a clear consensus. I would pay special attention to >> under-represented people (users with intense usage of NumPy but small voices on >> this list). There are many of them. If you push me for specifics then at >> this point in NumPy's history, I would say that if Chuck, Nathaniel, and Ralf >> agree on a course of action, it will likely be a good thing for the project. I >> suspect that even if only 2 of the 3 agree at one time it might still be a good >> thing (but I would expect more detail and discussion). There are others whose >> opinion should be sought as well: Ondrej Certik, Perry Greenfield, Robert Kern, >> David Cournapeau, Francesc Alted, and Mark Wiebe to name a few. 
For some >> questions, I might even seek input from people like Konrad Hinsen and Paul >> Dubois --- if they have time to give it. I will still be willing to offer my >> view from time to time and if I am asked. > > Thank you for starting this discussion. > > I am more or less offline at the moment in Cuba and flying, but I hope > very much this will be an opportunity for a good discussion on the > best way forward for numpy. > > Travis - I think you are suggesting that there should be no one > person in charge of numpy, and I think this is very unlikely to work > well. Perhaps there are good examples of well-led projects where > there is not a clear leader, but I can't think of any myself at the > moment. My worry would be that, without a clear leader, it will be > unclear how decisions are made, and that will make it very hard to > take strategic decisions. > > I would like to humbly suggest the following in the hope that it spurs > discussion. > > As first pass, Ralf, Chuck and Nathaniel decide on a core group of > people that will form the succession committee. Maybe this could be > the list of people you listed above. > > Ralf, Chuck and Nathaniel then ask for people to wish to propose > themselves as the leader of numpy. Anyone proposing themselves to > lead numpy would remove themselves from the succession committee. > > The proposed leaders of numpy write a short manifesto saying why they > are the right choice for the job, and what they intend to do if > elected. The manifestos and issues arising are discussed in public on > the mailing list - the equivalent of an online presidential debate. > > In due course - say after 2 weeks or after the discussion seems to be > dying out - the succession committee votes on the leader. I propose > that these votes should be public, but I can see the opposite > argument. Travis has very clearly made "Chuck, Nathaniel, and Ralf" as the leaders of the project. But as always ---- he didn't pick them because he needed to create leaders. They were already the de-facto leaders of the project due to their actions, involvements and respect in the community, so he just made it official. > How does that sound? To me that sounds like a bad idea. That being said, if Chuck, Nathaniel, and Ralf agree that it would be a good idea, that's fine with me too. Ondrej From ondrej.certik at gmail.com Thu Dec 20 20:51:49 2012 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Thu, 20 Dec 2012 17:51:49 -0800 Subject: [Numpy-discussion] Future of numpy (was: DARPA funding for Blaze and passing the NumPy torch) In-Reply-To: References: Message-ID: On Thu, Dec 20, 2012 at 5:48 PM, Ond?ej ?ert?k wrote: > Hi Matthew, > > On Thu, Dec 20, 2012 at 3:46 PM, Matthew Brett wrote: >> Hi, >> >> [Travis wrote...] >>> My strong suggestion is that development discussions of the project continue on >>> this list with consensus among the active participants being the goal for >>> development. I don't think 100% consensus is a rigid requirement --- but >>> certainly a super-majority should be the goal, and serious changes should not be >>> made with out a clear consensus. I would pay special attention to >>> under-represented people (users with intense usage of NumPy but small voices on >>> this list). There are many of them. If you push me for specifics then at >>> this point in NumPy's history, I would say that if Chuck, Nathaniel, and Ralf >>> agree on a course of action, it will likely be a good thing for the project. 
I >>> suspect that even if only 2 of the 3 agree at one time it might still be a good >>> thing (but I would expect more detail and discussion). There are others whose >>> opinion should be sought as well: Ondrej Certik, Perry Greenfield, Robert Kern, >>> David Cournapeau, Francesc Alted, and Mark Wiebe to name a few. For some >>> questions, I might even seek input from people like Konrad Hinsen and Paul >>> Dubois --- if they have time to give it. I will still be willing to offer my >>> view from time to time and if I am asked. >> >> Thank you for starting this discussion. >> >> I am more or less offline at the moment in Cuba and flying, but I hope >> very much this will be an opportunity for a good discussion on the >> best way forward for numpy. >> >> Travis - I think you are suggesting that there should be no one >> person in charge of numpy, and I think this is very unlikely to work >> well. Perhaps there are good examples of well-led projects where >> there is not a clear leader, but I can't think of any myself at the >> moment. My worry would be that, without a clear leader, it will be >> unclear how decisions are made, and that will make it very hard to >> take strategic decisions. >> >> I would like to humbly suggest the following in the hope that it spurs >> discussion. >> >> As first pass, Ralf, Chuck and Nathaniel decide on a core group of >> people that will form the succession committee. Maybe this could be >> the list of people you listed above. >> >> Ralf, Chuck and Nathaniel then ask for people to wish to propose >> themselves as the leader of numpy. Anyone proposing themselves to >> lead numpy would remove themselves from the succession committee. >> >> The proposed leaders of numpy write a short manifesto saying why they >> are the right choice for the job, and what they intend to do if >> elected. The manifestos and issues arising are discussed in public on >> the mailing list - the equivalent of an online presidential debate. >> >> In due course - say after 2 weeks or after the discussion seems to be >> dying out - the succession committee votes on the leader. I propose >> that these votes should be public, but I can see the opposite >> argument. > > Travis has very clearly made "Chuck, Nathaniel, and Ralf" as the leaders > of the project. But as always ---- he didn't pick them because he > needed to create leaders. They were already the de-facto leaders > of the project due to their actions, involvements and respect in the community, > so he just made it official. > > >> How does that sound? > > To me that sounds like a bad idea. That being said, if Chuck, > Nathaniel, and Ralf > agree that it would be a good idea, that's fine with me too. I forgot to add --- Matthew, I'll be happy to discuss this over phone once you get back to the US. I know we had discussions over this in the past, but I couldn't offer much advice, as I didn't understand the inner working of the numpy development. I now understand it much better. Ondrej From charlesr.harris at gmail.com Thu Dec 20 21:32:58 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 20 Dec 2012 19:32:58 -0700 Subject: [Numpy-discussion] Travis failures with no errors In-Reply-To: References: Message-ID: On Thu, Dec 20, 2012 at 6:25 PM, Ond?ej ?ert?k wrote: > On Thu, Dec 13, 2012 at 4:39 PM, Ond?ej ?ert?k > wrote: > > Hi, > > > > I found these recent weird "failures" in Travis, but I can't find any > > problem with the log and all tests pass. Any ideas what is going on? 
> > > > https://travis-ci.org/numpy/numpy/jobs/3570123 > > https://travis-ci.org/numpy/numpy/jobs/3539549 > > https://travis-ci.org/numpy/numpy/jobs/3369629 > > And here is another one: > > https://travis-ci.org/numpy/numpy/jobs/3768782 > Hmm, that is strange indeed. The first three are old, >= 12 days, but the last is new, although the run time was getting up there. Might try running the last one again. I don't know if the is an easy way to do that. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From travis at continuum.io Fri Dec 21 03:14:53 2012 From: travis at continuum.io (Travis Oliphant) Date: Fri, 21 Dec 2012 02:14:53 -0600 Subject: [Numpy-discussion] Future of numpy (was: DARPA funding for Blaze and passing the NumPy torch) In-Reply-To: References: Message-ID: <766F069B-9C36-4BC1-8A51-BB91155104C1@continuum.io> On Dec 20, 2012, at 7:39 PM, Nathaniel Smith wrote: > On Thu, Dec 20, 2012 at 11:46 PM, Matthew Brett wrote: >> Travis - I think you are suggesting that there should be no one >> person in charge of numpy, and I think this is very unlikely to work >> well. Perhaps there are good examples of well-led projects where >> there is not a clear leader, but I can't think of any myself at the >> moment. My worry would be that, without a clear leader, it will be >> unclear how decisions are made, and that will make it very hard to >> take strategic decisions. > > Curious; my feeling is the opposite, that among mature and successful > FOSS projects, having a clear leader is the uncommon case. GCC > doesn't, Glibc not only has no leader but they recently decided to get > rid of their formal steering committee, I'm pretty sure git doesn't, > Apache certainly doesn't, Samba doesn't really, etc. As usual Karl > Fogel has sensible comments on this: > http://producingoss.com/en/consensus-democracy.html > > In practice the main job of a successful FOSS leader is to refuse to > make decisions, nudge people to work things out, and then if they > refuse to work things out tell them to go away until they do: > https://lwn.net/Articles/105375/ > and what actually gives people influence in a project is the respect > of the other members. The former stuff is stuff anyone can do, and the > latter isn't something you can confer or take away with a vote. > I will strongly voice my opinion that NumPy does not need an official single "leader". What it needs are committed, experienced, service-oriented developers and users who are willing to express their concerns and requests because they are used to being treated well. It also needs new developers who are willing to dive into code, contribute to discussions, tackle issues, make pull requests, and review pull requests. As people do this regularly, the leaders of the project will emerge as they have done in the past. Even though I called out three people explicitly --- there are many more contributors to NumPy whose voices deserve attention. But, you don't need me to point out the obvious to what the Github record shows about who is shepherding NumPy these days. But, the Github record is not the only one that matters. I would love to see NumPy developers continue to pay attention to and deeply respect the users (especially of downstream projects that depend on NumPy). I plan to continue using NumPy myself and plan to continue to encourage others around me to contribute patches, fixes and features. Obviously, there are people who have rights to merge pull-requests to the repository. 
But, this group seems always open to new, willing help. From a practical matter, this group is the head development group of the official NumPy fork. I believe this group will continue to be open enough to new, motivated contributors which will allow it to grow to the degree that such developers are available. > Nor do we necessarily have a great track record for executive > decisions actually working things out. > > -n > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From francesc at continuum.io Fri Dec 21 05:34:39 2012 From: francesc at continuum.io (Francesc Alted) Date: Fri, 21 Dec 2012 11:34:39 +0100 Subject: [Numpy-discussion] Byte aligned arrays In-Reply-To: <1356028520.12003.9.camel@farnsworth> References: <1355906439.3456.9.camel@farnsworth> <1355935645.3456.22.camel@farnsworth> <50D2018C.4080106@continuum.io> <1355993611.10732.18.camel@farnsworth> <50D31F44.5050306@continuum.io> <1356028520.12003.9.camel@farnsworth> Message-ID: <50D43B3F.4030208@continuum.io> On 12/20/12 7:35 PM, Henry Gomersall wrote: > On Thu, 2012-12-20 at 15:23 +0100, Francesc Alted wrote: >> On 12/20/12 9:53 AM, Henry Gomersall wrote: >>> On Wed, 2012-12-19 at 19:03 +0100, Francesc Alted wrote: >>>> The only scenario that I see that this would create unaligned >> arrays >>>> is >>>> for machines having AVX. But provided that the Intel architecture >> is >>>> making great strides in fetching unaligned data, I'd be surprised >>>> that >>>> the difference in performance would be even noticeable. >>>> >>>> Can you tell us which difference in performance are you seeing for >> an >>>> AVX-aligned array and other that is not AVX-aligned? Just curious. >>> Further to this point, from an Intel article... >>> >>> >> http://software.intel.com/en-us/articles/practical-intel-avx-optimization-on-2nd-generation-intel-core-processors >>> "Aligning data to vector length is always recommended. When using >> Intel >>> SSE and Intel SSE2 instructions, loaded data should be aligned to 16 >>> bytes. Similarly, to achieve best results use Intel AVX instructions >> on >>> 32-byte vectors that are 32-byte aligned. The use of Intel AVX >>> instructions on unaligned 32-byte vectors means that every second >> load >>> will be across a cache-line split, since the cache line is 64 bytes. >>> This doubles the cache line split rate compared to Intel SSE code >> that >>> uses 16-byte vectors. A high cache-line split rate in >> memory-intensive >>> code is extremely likely to cause performance degradation. For that >>> reason, it is highly recommended to align the data to 32 bytes for >> use >>> with Intel AVX." >>> >>> Though it would be nice to put together a little example of this! >> Indeed, an example is what I was looking for. So provided that I >> have >> access to an AVX capable machine (having 6 physical cores), and that >> MKL >> 10.3 has support for AVX, I have made some comparisons using the >> Anaconda Python distribution (it ships with most packages linked >> against >> MKL 10.3). > > >> All in all, it is not clear that AVX alignment would have an >> advantage, >> even for memory-bounded problems. But of course, if Intel people are >> saying that AVX alignment is important is because they have use cases >> for asserting this. It is just that I'm having a difficult time to >> find >> these cases. > Thanks for those examples, they were very interesting. 
I managed to > temporarily get my hands on a machine with AVX and I have shown some > speed-up with aligned arrays. > > FFT (using my wrappers) gives about a 15% speedup. > > Also this convolution code: > https://github.com/hgomersall/SSE-convolution/blob/master/convolve.c > > Shows a small but repeatable speed-up (a few %) when using some aligned > loads (as many as I can work out to use!). Okay, so a 15% is significant, yes. I'm still wondering why I did not get any speedup at all using MKL, but probably the reason is that it manages the unaligned corners of the datasets first, and then uses an aligned access for the rest of the data (but just guessing here). -- Francesc Alted From heng at cantab.net Fri Dec 21 05:58:45 2012 From: heng at cantab.net (Henry Gomersall) Date: Fri, 21 Dec 2012 10:58:45 +0000 Subject: [Numpy-discussion] Byte aligned arrays In-Reply-To: <50D43B3F.4030208@continuum.io> References: <1355906439.3456.9.camel@farnsworth> <1355935645.3456.22.camel@farnsworth> <50D2018C.4080106@continuum.io> <1355993611.10732.18.camel@farnsworth> <50D31F44.5050306@continuum.io> <1356028520.12003.9.camel@farnsworth> <50D43B3F.4030208@continuum.io> Message-ID: <1356087525.3473.13.camel@farnsworth> On Fri, 2012-12-21 at 11:34 +0100, Francesc Alted wrote: > > Also this convolution code: > > https://github.com/hgomersall/SSE-convolution/blob/master/convolve.c > > > > Shows a small but repeatable speed-up (a few %) when using some > aligned > > loads (as many as I can work out to use!). > > Okay, so a 15% is significant, yes. I'm still wondering why I did > not > get any speedup at all using MKL, but probably the reason is that it > manages the unaligned corners of the datasets first, and then uses an > aligned access for the rest of the data (but just guessing here). With SSE in that convolution code example above (in which all alignments need be considered for each output element), I note a significant speedup by creating 4 copies of the float input array using memcopy, each shifted by 1 float (so the 5th element is aligned again). Despite all the extra copies its still quicker than using an unaligned load. However, when one tries the same trick with 8 copies for AVX it's actually slower than the SSE case. The fastest AVX (and any) implementation I have so far is with 16-aligned arrays (made with 4 copies as with SSE), with alternate aligned and unaligned loads (which is always at worst 16-byte aligned). Fascinating stuff! hen From bahtiyor_zohidov at mail.ru Fri Dec 21 06:29:12 2012 From: bahtiyor_zohidov at mail.ru (=?UTF-8?B?SGFwcHltYW4=?=) Date: Fri, 21 Dec 2012 15:29:12 +0400 Subject: [Numpy-discussion] =?utf-8?q?NaN_=28Not_a_Number=29_occurs_in_cal?= =?utf-8?q?culation_of_complex_number_for_Bessel_functions?= Message-ID: <1356089352.739240573@f308.mail.ru> DEAR PYTHON USERS DO MATHEMATICAL FUNCTIONS HAVE LIMITATION IN PYTHON in comparison with other programming languages ???? I have two mathematical functions: from scipy.special import sph_jn, ?sph_jnyn 1) ?sph_jn (n, z) ---> n is float, z is complex number for example: ?a,b=sph_jn ( 2.0 , 5+0.4j ) ?gives the following result: >>>? a ? ? ? ? array( [ - 0.20416243 + 0.03963597j, - 0.10714653 - 0.06227716j,?0.13731305 - 0.07165432j ] ) >>>b ? ? ?? array(? [ 0.10714653 + 0.06227716j, -0.15959617 + 0.06098154j,?-0.18559289 - 0.01300886j ] ) 2) ?sph_jnyn(n , x) --> n-float, x - float ?,for example:? c,d,e,f=sph_jnyn(2.0 , 3.0) >>> c ? ? ? ? array( [ 0.04704 , 0.3456775, 0.2986375 ] ) >>> d ? ? ? ? 
array( [-0.3456775 , -0.18341166, 0.04704 ] ) >>> e ? ? ? ?array( [ 0.3299975 , 0.06295916, -0.26703834 ] ) >>> f ? ? ? ? array( [ -0.06295916, 0.28802472, 0.3299975 ] ) PROBLEM IS HERE!!! BUT , IF I GIVE ( it is necessary value for my program ): a , b =?sph_jn ( 536 , 2513.2741228718346 + 201.0619298974676j ) I would like to see even very very deep? comments as specific as possible!!!!!! -------------- next part -------------- An HTML attachment was scrubbed... URL: From francesc at continuum.io Fri Dec 21 06:45:41 2012 From: francesc at continuum.io (Francesc Alted) Date: Fri, 21 Dec 2012 12:45:41 +0100 Subject: [Numpy-discussion] Byte aligned arrays In-Reply-To: <1356087525.3473.13.camel@farnsworth> References: <1355906439.3456.9.camel@farnsworth> <1355935645.3456.22.camel@farnsworth> <50D2018C.4080106@continuum.io> <1355993611.10732.18.camel@farnsworth> <50D31F44.5050306@continuum.io> <1356028520.12003.9.camel@farnsworth> <50D43B3F.4030208@continuum.io> <1356087525.3473.13.camel@farnsworth> Message-ID: <50D44BE5.4090401@continuum.io> On 12/21/12 11:58 AM, Henry Gomersall wrote: > On Fri, 2012-12-21 at 11:34 +0100, Francesc Alted wrote: >>> Also this convolution code: >>> https://github.com/hgomersall/SSE-convolution/blob/master/convolve.c >>> >>> Shows a small but repeatable speed-up (a few %) when using some >> aligned >>> loads (as many as I can work out to use!). >> Okay, so a 15% is significant, yes. I'm still wondering why I did >> not >> get any speedup at all using MKL, but probably the reason is that it >> manages the unaligned corners of the datasets first, and then uses an >> aligned access for the rest of the data (but just guessing here). > With SSE in that convolution code example above (in which all alignments > need be considered for each output element), I note a significant > speedup by creating 4 copies of the float input array using memcopy, > each shifted by 1 float (so the 5th element is aligned again). Despite > all the extra copies its still quicker than using an unaligned load. > However, when one tries the same trick with 8 copies for AVX it's > actually slower than the SSE case. > > The fastest AVX (and any) implementation I have so far is with > 16-aligned arrays (made with 4 copies as with SSE), with alternate > aligned and unaligned loads (which is always at worst 16-byte aligned). > > Fascinating stuff! Yes, to say the least. And it supports the fact that, when fine tuning memory access performance, there is no replacement for experimentation (in some weird ways many times :) -- Francesc Alted From d.s.seljebotn at astro.uio.no Fri Dec 21 07:35:36 2012 From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn) Date: Fri, 21 Dec 2012 13:35:36 +0100 Subject: [Numpy-discussion] Byte aligned arrays In-Reply-To: <50D31F44.5050306@continuum.io> References: <1355906439.3456.9.camel@farnsworth> <1355935645.3456.22.camel@farnsworth> <50D2018C.4080106@continuum.io> <1355993611.10732.18.camel@farnsworth> <50D31F44.5050306@continuum.io> Message-ID: <50D45798.10205@astro.uio.no> On 12/20/2012 03:23 PM, Francesc Alted wrote: > On 12/20/12 9:53 AM, Henry Gomersall wrote: >> On Wed, 2012-12-19 at 19:03 +0100, Francesc Alted wrote: >>> The only scenario that I see that this would create unaligned arrays >>> is >>> for machines having AVX. But provided that the Intel architecture is >>> making great strides in fetching unaligned data, I'd be surprised >>> that >>> the difference in performance would be even noticeable. 
>>> >>> Can you tell us which difference in performance are you seeing for an >>> AVX-aligned array and other that is not AVX-aligned? Just curious. >> Further to this point, from an Intel article... >> >> http://software.intel.com/en-us/articles/practical-intel-avx-optimization-on-2nd-generation-intel-core-processors >> >> "Aligning data to vector length is always recommended. When using Intel >> SSE and Intel SSE2 instructions, loaded data should be aligned to 16 >> bytes. Similarly, to achieve best results use Intel AVX instructions on >> 32-byte vectors that are 32-byte aligned. The use of Intel AVX >> instructions on unaligned 32-byte vectors means that every second load >> will be across a cache-line split, since the cache line is 64 bytes. >> This doubles the cache line split rate compared to Intel SSE code that >> uses 16-byte vectors. A high cache-line split rate in memory-intensive >> code is extremely likely to cause performance degradation. For that >> reason, it is highly recommended to align the data to 32 bytes for use >> with Intel AVX." >> >> Though it would be nice to put together a little example of this! > > Indeed, an example is what I was looking for. So provided that I have > access to an AVX capable machine (having 6 physical cores), and that MKL > 10.3 has support for AVX, I have made some comparisons using the > Anaconda Python distribution (it ships with most packages linked against > MKL 10.3). > > Here it is a first example using a DGEMM operation. First using a NumPy > that is not turbo-loaded with MKL: > > In [34]: a = np.linspace(0,1,1e7) > > In [35]: b = a.reshape(1000, 10000) > > In [36]: c = a.reshape(10000, 1000) > > In [37]: time d = np.dot(b,c) > CPU times: user 7.56 s, sys: 0.03 s, total: 7.59 s > Wall time: 7.63 s > > In [38]: time d = np.dot(c,b) > CPU times: user 78.52 s, sys: 0.18 s, total: 78.70 s > Wall time: 78.89 s > > This is getting around 2.6 GFlop/s. Now, with a MKL 10.3 NumPy and > AVX-unaligned data: > > In [7]: p = ctypes.create_string_buffer(int(8e7)); hex(ctypes.addressof(p)) > Out[7]: '0x7fcdef3b4010' # 16 bytes alignment > > In [8]: a = np.ndarray(1e7, "f8", p) > > In [9]: a[:] = np.linspace(0,1,1e7) > > In [10]: b = a.reshape(1000, 10000) > > In [11]: c = a.reshape(10000, 1000) > > In [37]: %timeit d = np.dot(b,c) > 10 loops, best of 3: 164 ms per loop > > In [38]: %timeit d = np.dot(c,b) > 1 loops, best of 3: 1.65 s per loop > > That is around 120 GFlop/s (i.e. almost 50x faster than without MKL/AVX). > > Now, using MKL 10.3 and AVX-aligned data: > > In [21]: p2 = ctypes.create_string_buffer(int(8e7+16)); > hex(ctypes.addressof(p)) > Out[21]: '0x7f8cb9598010' > > In [22]: a2 = np.ndarray(1e7+2, "f8", p2)[2:] # skip the first 16 bytes > (now is 32-bytes aligned) > > In [23]: a2[:] = np.linspace(0,1,1e7) > > In [24]: b2 = a2.reshape(1000, 10000) > > In [25]: c2 = a2.reshape(10000, 1000) > > In [35]: %timeit d2 = np.dot(b2,c2) > 10 loops, best of 3: 163 ms per loop > > In [36]: %timeit d2 = np.dot(c2,b2) > 1 loops, best of 3: 1.67 s per loop > > So, again, around 120 GFlop/s, and the difference wrt to unaligned AVX > data is negligible. > > One may argue that DGEMM is CPU-bounded and that memory access plays > little role here, and that is certainly true. So, let's go with a more > memory-bounded problem, like computing a transcendental function with > numexpr. 
First with a with NumPy and numexpr with no MKL support: > > In [8]: a = np.linspace(0,1,1e8) > > In [9]: %time b = np.sin(a) > CPU times: user 1.20 s, sys: 0.22 s, total: 1.42 s > Wall time: 1.42 s > > In [10]: import numexpr as ne > > In [12]: %time b = ne.evaluate("sin(a)") > CPU times: user 1.42 s, sys: 0.27 s, total: 1.69 s > Wall time: 0.37 s > > This time is around 4x faster than regular 'sin' in libc, and about the > same speed than a memcpy(): > > In [13]: %time c = a.copy() > CPU times: user 0.19 s, sys: 0.20 s, total: 0.39 s > Wall time: 0.39 s > > Now, with a MKL-aware numexpr and non-AVX alignment: > > In [8]: p = ctypes.create_string_buffer(int(8e8)); hex(ctypes.addressof(p)) > Out[8]: '0x7fce435da010' # 16 bytes alignment > > In [9]: a = np.ndarray(1e8, "f8", p) > > In [10]: a[:] = np.linspace(0,1,1e8) > > In [11]: %time b = ne.evaluate("sin(a)") > CPU times: user 0.44 s, sys: 0.27 s, total: 0.71 s > Wall time: 0.15 s > > That is, more than 2x faster than a memcpy() in this system, meaning > that the problem is truly memory-bounded. So now, with an AVX aligned > buffer: > > In [14]: a2 = a[2:] # skip the first 16 bytes > > In [15]: %time b = ne.evaluate("sin(a2)") > CPU times: user 0.40 s, sys: 0.28 s, total: 0.69 s > Wall time: 0.16 s > > Again, times are very close. Just to make sure, let's use the timeit magic: > > In [16]: %timeit b = ne.evaluate("sin(a)") > 10 loops, best of 3: 159 ms per loop # unaligned > > In [17]: %timeit b = ne.evaluate("sin(a2)") > 10 loops, best of 3: 154 ms per loop # aligned > > All in all, it is not clear that AVX alignment would have an advantage, > even for memory-bounded problems. But of course, if Intel people are > saying that AVX alignment is important is because they have use cases > for asserting this. It is just that I'm having a difficult time to find > these cases. Hmm, I think it is the opposite, that it is for CPU-bound problems that alignment would have an effect? I.e. the MOVUPD would be doing some shuffling etc. to get around the non-alignment, which only matters if the data is already in cache. (There are other instructions, like the STREAM instructions and the direct writes and so on, which are much more important for the non-cached case. At least that's my understanding.) Dag Sverre From pav at iki.fi Fri Dec 21 07:46:26 2012 From: pav at iki.fi (Pauli Virtanen) Date: Fri, 21 Dec 2012 12:46:26 +0000 (UTC) Subject: [Numpy-discussion] NaN (Not a Number) occurs in calculation of complex number for Bessel functions References: <1356089352.739240573@f308.mail.ru> Message-ID: Happyman mail.ru> writes: [clip] > IF I GIVE ( it is necessary value for my program ): > a , b =?sph_jn ( 536 , 2513.2741228718346 + 201.0619298974676j ) The implementation of the spherical Bessel functions is through this Fortran code: https://github.com/scipy/scipy/blob/master/scipy/special/specfun/specfun.f#L1091 It does not have asymptotic expansions for dealing with parts of the complex plane where the computation via the recurrence does not work. 
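A quick way to see this concretely is to compare the recurrence-based routine against the direct half-integer-order relation j_n(z) = sqrt(pi/(2 z)) * J_{n+1/2}(z). The sketch below is only a diagnostic, not a fix; it assumes the scipy.special of this era, where sph_jn returns the pair (jn, jnp) and accepts a complex argument as in the report above, and the numbers it prints will depend on the scipy build at hand:

    import numpy as np
    from scipy.special import sph_jn, jv

    z = 2513.2741228718346 + 201.0619298974676j    # argument from the report above
    nmax = 536

    jn_rec, _ = sph_jn(nmax, z)                    # recurrence-based values, n = 0..nmax
    n = np.arange(nmax + 1)
    jn_dir = jv(n + 0.5, z) * np.sqrt(np.pi / (2 * z))   # direct relation (AMOS-backed jv)

    bad = ~np.isfinite(jn_rec)                     # where the recurrence broke down
    if bad.any():
        print("recurrence first goes non-finite at n = %d" % n[bad][0])
    print("direct relation stays finite: %s" % np.isfinite(jn_dir).all())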
-- Pauli Virtanen From francesc at continuum.io Fri Dec 21 07:48:03 2012 From: francesc at continuum.io (Francesc Alted) Date: Fri, 21 Dec 2012 13:48:03 +0100 Subject: [Numpy-discussion] Byte aligned arrays In-Reply-To: <50D45798.10205@astro.uio.no> References: <1355906439.3456.9.camel@farnsworth> <1355935645.3456.22.camel@farnsworth> <50D2018C.4080106@continuum.io> <1355993611.10732.18.camel@farnsworth> <50D31F44.5050306@continuum.io> <50D45798.10205@astro.uio.no> Message-ID: <50D45A83.10706@continuum.io> On 12/21/12 1:35 PM, Dag Sverre Seljebotn wrote: > On 12/20/2012 03:23 PM, Francesc Alted wrote: >> On 12/20/12 9:53 AM, Henry Gomersall wrote: >>> On Wed, 2012-12-19 at 19:03 +0100, Francesc Alted wrote: >>>> The only scenario that I see that this would create unaligned arrays >>>> is >>>> for machines having AVX. But provided that the Intel architecture is >>>> making great strides in fetching unaligned data, I'd be surprised >>>> that >>>> the difference in performance would be even noticeable. >>>> >>>> Can you tell us which difference in performance are you seeing for an >>>> AVX-aligned array and other that is not AVX-aligned? Just curious. >>> Further to this point, from an Intel article... >>> >>> http://software.intel.com/en-us/articles/practical-intel-avx-optimization-on-2nd-generation-intel-core-processors >>> >>> "Aligning data to vector length is always recommended. When using Intel >>> SSE and Intel SSE2 instructions, loaded data should be aligned to 16 >>> bytes. Similarly, to achieve best results use Intel AVX instructions on >>> 32-byte vectors that are 32-byte aligned. The use of Intel AVX >>> instructions on unaligned 32-byte vectors means that every second load >>> will be across a cache-line split, since the cache line is 64 bytes. >>> This doubles the cache line split rate compared to Intel SSE code that >>> uses 16-byte vectors. A high cache-line split rate in memory-intensive >>> code is extremely likely to cause performance degradation. For that >>> reason, it is highly recommended to align the data to 32 bytes for use >>> with Intel AVX." >>> >>> Though it would be nice to put together a little example of this! >> Indeed, an example is what I was looking for. So provided that I have >> access to an AVX capable machine (having 6 physical cores), and that MKL >> 10.3 has support for AVX, I have made some comparisons using the >> Anaconda Python distribution (it ships with most packages linked against >> MKL 10.3). >> >> Here it is a first example using a DGEMM operation. First using a NumPy >> that is not turbo-loaded with MKL: >> >> In [34]: a = np.linspace(0,1,1e7) >> >> In [35]: b = a.reshape(1000, 10000) >> >> In [36]: c = a.reshape(10000, 1000) >> >> In [37]: time d = np.dot(b,c) >> CPU times: user 7.56 s, sys: 0.03 s, total: 7.59 s >> Wall time: 7.63 s >> >> In [38]: time d = np.dot(c,b) >> CPU times: user 78.52 s, sys: 0.18 s, total: 78.70 s >> Wall time: 78.89 s >> >> This is getting around 2.6 GFlop/s. Now, with a MKL 10.3 NumPy and >> AVX-unaligned data: >> >> In [7]: p = ctypes.create_string_buffer(int(8e7)); hex(ctypes.addressof(p)) >> Out[7]: '0x7fcdef3b4010' # 16 bytes alignment >> >> In [8]: a = np.ndarray(1e7, "f8", p) >> >> In [9]: a[:] = np.linspace(0,1,1e7) >> >> In [10]: b = a.reshape(1000, 10000) >> >> In [11]: c = a.reshape(10000, 1000) >> >> In [37]: %timeit d = np.dot(b,c) >> 10 loops, best of 3: 164 ms per loop >> >> In [38]: %timeit d = np.dot(c,b) >> 1 loops, best of 3: 1.65 s per loop >> >> That is around 120 GFlop/s (i.e. 
almost 50x faster than without MKL/AVX). >> >> Now, using MKL 10.3 and AVX-aligned data: >> >> In [21]: p2 = ctypes.create_string_buffer(int(8e7+16)); >> hex(ctypes.addressof(p)) >> Out[21]: '0x7f8cb9598010' >> >> In [22]: a2 = np.ndarray(1e7+2, "f8", p2)[2:] # skip the first 16 bytes >> (now is 32-bytes aligned) >> >> In [23]: a2[:] = np.linspace(0,1,1e7) >> >> In [24]: b2 = a2.reshape(1000, 10000) >> >> In [25]: c2 = a2.reshape(10000, 1000) >> >> In [35]: %timeit d2 = np.dot(b2,c2) >> 10 loops, best of 3: 163 ms per loop >> >> In [36]: %timeit d2 = np.dot(c2,b2) >> 1 loops, best of 3: 1.67 s per loop >> >> So, again, around 120 GFlop/s, and the difference wrt to unaligned AVX >> data is negligible. >> >> One may argue that DGEMM is CPU-bounded and that memory access plays >> little role here, and that is certainly true. So, let's go with a more >> memory-bounded problem, like computing a transcendental function with >> numexpr. First with a with NumPy and numexpr with no MKL support: >> >> In [8]: a = np.linspace(0,1,1e8) >> >> In [9]: %time b = np.sin(a) >> CPU times: user 1.20 s, sys: 0.22 s, total: 1.42 s >> Wall time: 1.42 s >> >> In [10]: import numexpr as ne >> >> In [12]: %time b = ne.evaluate("sin(a)") >> CPU times: user 1.42 s, sys: 0.27 s, total: 1.69 s >> Wall time: 0.37 s >> >> This time is around 4x faster than regular 'sin' in libc, and about the >> same speed than a memcpy(): >> >> In [13]: %time c = a.copy() >> CPU times: user 0.19 s, sys: 0.20 s, total: 0.39 s >> Wall time: 0.39 s >> >> Now, with a MKL-aware numexpr and non-AVX alignment: >> >> In [8]: p = ctypes.create_string_buffer(int(8e8)); hex(ctypes.addressof(p)) >> Out[8]: '0x7fce435da010' # 16 bytes alignment >> >> In [9]: a = np.ndarray(1e8, "f8", p) >> >> In [10]: a[:] = np.linspace(0,1,1e8) >> >> In [11]: %time b = ne.evaluate("sin(a)") >> CPU times: user 0.44 s, sys: 0.27 s, total: 0.71 s >> Wall time: 0.15 s >> >> That is, more than 2x faster than a memcpy() in this system, meaning >> that the problem is truly memory-bounded. So now, with an AVX aligned >> buffer: >> >> In [14]: a2 = a[2:] # skip the first 16 bytes >> >> In [15]: %time b = ne.evaluate("sin(a2)") >> CPU times: user 0.40 s, sys: 0.28 s, total: 0.69 s >> Wall time: 0.16 s >> >> Again, times are very close. Just to make sure, let's use the timeit magic: >> >> In [16]: %timeit b = ne.evaluate("sin(a)") >> 10 loops, best of 3: 159 ms per loop # unaligned >> >> In [17]: %timeit b = ne.evaluate("sin(a2)") >> 10 loops, best of 3: 154 ms per loop # aligned >> >> All in all, it is not clear that AVX alignment would have an advantage, >> even for memory-bounded problems. But of course, if Intel people are >> saying that AVX alignment is important is because they have use cases >> for asserting this. It is just that I'm having a difficult time to find >> these cases. > Hmm, I think it is the opposite, that it is for CPU-bound problems that > alignment would have an effect? I.e. the MOVUPD would be doing some > shuffling etc. to get around the non-alignment, which only matters if > the data is already in cache. > > (There are other instructions, like the STREAM instructions and the > direct writes and so on, which are much more important for the > non-cached case. At least that's my understanding.) Yes, I think you are right. It is just that I was a bit disappointed with the DGEMM not being affected by non-AVX alignment and tried a memory-bound problem, just in case. 
But as I said before, probably Intel people have dealt with both aligned and unaligned data. -- Francesc Alted From bahtiyor_zohidov at mail.ru Fri Dec 21 07:56:08 2012 From: bahtiyor_zohidov at mail.ru (=?UTF-8?B?SGFwcHltYW4=?=) Date: Fri, 21 Dec 2012 16:56:08 +0400 Subject: [Numpy-discussion] =?utf-8?q?NaN_=28Not_a_Number=29_occurs_in_cal?= =?utf-8?q?culation_of_complex_number_for_Bessel_functions?= In-Reply-To: References: <1356089352.739240573@f308.mail.ru> Message-ID: <1356094568.754851858@f299.mail.ru> Thanks?Pauli But I have already very shortly built ?for bessel?function, but the code you gave me is in Fortran.. I also used f2py but I could not manage to read fortran codes..that is why I have asked in Python what is wrong?? ???????, 21 ??????? 2012, 12:46 UTC ?? Pauli Virtanen : >Happyman mail.ru> writes: >[clip] >> IF I GIVE ( it is necessary value for my program ): >> a , b =?sph_jn ( 536 , 2513.2741228718346 + 201.0619298974676j ) > >The implementation of the spherical Bessel functions is through >this Fortran code: > >https://github.com/scipy/scipy/blob/master/scipy/special/specfun/specfun.f#L1091 > >It does not have asymptotic expansions for dealing with parts of >the complex plane where the computation via the recurrence does not >work. > >-- >Pauli Virtanen > > >_______________________________________________ >NumPy-Discussion mailing list >NumPy-Discussion at scipy.org >http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Fri Dec 21 08:17:21 2012 From: pav at iki.fi (Pauli Virtanen) Date: Fri, 21 Dec 2012 13:17:21 +0000 (UTC) Subject: [Numpy-discussion] NaN (Not a Number) occurs in calculation of complex number for Bessel functions References: <1356089352.739240573@f308.mail.ru> <1356094568.754851858@f299.mail.ru> Message-ID: Happyman mail.ru> writes: > Thanks?Pauli But I have already very shortly built ?for bessel >?function, but the code you gave me is in Fortran.. I also used > f2py but I could not manage to read fortran codes..that is why > I have asked in Python what is wrong?? That Fortran code is `sph_jn`, which you used. It works using f2py. Only some of the special functions in scipy.special are written using Python as the language. Most of them are in C or in Fortran, using some existing special function library not written by us. Some of the implementations provided by these libraries are not complete, and do not cover the whole complex plane (or the real axis). Other functions (the more common ones), however, have very good implementations. -- Pauli Virtanen From bahtiyor_zohidov at mail.ru Fri Dec 21 08:30:20 2012 From: bahtiyor_zohidov at mail.ru (=?UTF-8?B?SGFwcHltYW4=?=) Date: Fri, 21 Dec 2012 17:30:20 +0400 Subject: [Numpy-discussion] =?utf-8?q?NaN_=28Not_a_Number=29_occurs_in_cal?= =?utf-8?q?culation_of_complex_number_for_Bessel_functions?= In-Reply-To: References: <1356089352.739240573@f308.mail.ru> Message-ID: <1356096620.594767210@f323.mail.ru> I have everything in C or Fortran...According to my friends recommendations I started learning Python for my research... Do you mean the functions which gave Nan result has not been developed properly yet in Python, Don't you???? For about 1.5 months I have been facing the same problem for Bessel functions.. I think the code that I showed like an example is not working in Python. What to do ??? ???????, 21 ??????? 2012, 13:17 ?? 
Pauli Virtanen : >Happyman mail.ru> writes: >> Thanks?Pauli But I have already very shortly built ?for bessel >>?function, but the code you gave me is in Fortran.. I also used >> f2py but I could not manage to read fortran codes..that is why >> I have asked in Python what is wrong?? > >That Fortran code is `sph_jn`, which you used. It works using f2py. > >Only some of the special functions in scipy.special are written >using Python as the language. Most of them are in C or in Fortran, >using some existing special function library not written by us. > >Some of the implementations provided by these libraries are not >complete, and do not cover the whole complex plane (or the real axis). >Other functions (the more common ones), however, have very good >implementations. > >-- >Pauli Virtanen > >_______________________________________________ >NumPy-Discussion mailing list >NumPy-Discussion at scipy.org >http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From d.s.seljebotn at astro.uio.no Fri Dec 21 08:40:27 2012 From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn) Date: Fri, 21 Dec 2012 14:40:27 +0100 Subject: [Numpy-discussion] NaN (Not a Number) occurs in calculation of complex number for Bessel functions In-Reply-To: <1356096620.594767210@f323.mail.ru> References: <1356089352.739240573@f308.mail.ru> <1356096620.594767210@f323.mail.ru> Message-ID: <50D466CB.1@astro.uio.no> On 12/21/2012 02:30 PM, Happyman wrote: > I have everything in C or Fortran...According to my friends > recommendations I started learning Python for my research... > > Do you mean the functions which gave Nan result has not been developed > properly yet in Python, Don't you???? The way most of NumPy and SciPy works is by calling into C and Fortran code. > > For about 1.5 months I have been facing the same problem for Bessel > functions.. I think the code that I showed like an example is not > working in Python. > What to do ??? Do you have an implemention of the Bessel functions that work as you wish in C or Fortran? If so, that could be wrapped and called from Python. Dag Sverre From pav at iki.fi Fri Dec 21 08:59:02 2012 From: pav at iki.fi (Pauli Virtanen) Date: Fri, 21 Dec 2012 13:59:02 +0000 (UTC) Subject: [Numpy-discussion] NaN (Not a Number) occurs in calculation of complex number for Bessel functions References: <1356089352.739240573@f308.mail.ru> <1356096620.594767210@f323.mail.ru> <50D466CB.1@astro.uio.no> Message-ID: Dag Sverre Seljebotn astro.uio.no> writes: [clip] > Do you have an implemention of the Bessel functions that work as you > wish in C or Fortran? If so, that could be wrapped and called from Python. For spherical Bessel functions it's possible to also use the relation to Bessel functions, which have a better implementation (AMOS) in Scipy: import numpy as np from scipy.special import jv def sph_jn(n, z): return jv(n + 0.5, z) * np.sqrt(np.pi / (2 * z)) print sph_jn(536, 2513.2741228718346 + 201.0619298974676j) # (2.5167666386507171e+81+3.3576357192536334e+81j) This should solve the problem. 
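The same half-integer-order trick extends to the spherical Bessel function of the second kind, and therefore to the spherical Hankel function that the scattering code later in this thread also needs. A minimal sketch using the standard identities (the helper names are made up, not part of scipy; yv, like jv, is AMOS-backed and accepts complex arguments):

    import numpy as np
    from scipy.special import jv, yv

    def sph_jn_via_jv(n, z):
        # j_n(z) = sqrt(pi/(2 z)) * J_{n+1/2}(z)
        return jv(n + 0.5, z) * np.sqrt(np.pi / (2 * z))

    def sph_yn_via_yv(n, z):
        # y_n(z) = sqrt(pi/(2 z)) * Y_{n+1/2}(z)
        return yv(n + 0.5, z) * np.sqrt(np.pi / (2 * z))

    def sph_h1(n, z):
        # h_n^(1)(z) = j_n(z) + i * y_n(z)
        return sph_jn_via_jv(n, z) + 1j * sph_yn_via_yv(n, z)

    print(sph_h1(536, 2513.2741228718346 + 201.0619298974676j))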
-- Pauli Virtanen From lev at columbia.edu Fri Dec 21 09:11:34 2012 From: lev at columbia.edu (Lev Givon) Date: Fri, 21 Dec 2012 09:11:34 -0500 Subject: [Numpy-discussion] NaN (Not a Number) occurs in calculation of complex number for Bessel functions In-Reply-To: References: <1356089352.739240573@f308.mail.ru> <1356096620.594767210@f323.mail.ru> <50D466CB.1@astro.uio.no> Message-ID: <20121221141134.GA6618@avicenna.ee.columbia.edu> Received from Pauli Virtanen on Fri, Dec 21, 2012 at 08:59:02AM EST: > Dag Sverre Seljebotn astro.uio.no> writes: > [clip] > > Do you have an implemention of the Bessel functions that work as you > > wish in C or Fortran? If so, that could be wrapped and called from Python. > > For spherical Bessel functions it's possible to also use the relation > to Bessel functions, which have a better implementation (AMOS) in Scipy: > > import numpy as np > from scipy.special import jv > > def sph_jn(n, z): > return jv(n + 0.5, z) * np.sqrt(np.pi / (2 * z)) > > print sph_jn(536, 2513.2741228718346 + 201.0619298974676j) > # (2.5167666386507171e+81+3.3576357192536334e+81j) > > This should solve the problem. You can also try the spherical Bessel function in Sympy; it should be able to handle this case as well [1]. L.G. [1] http://docs.sympy.org/0.7.2/modules/functions/special.html#bessel-type-functions From bahtiyor_zohidov at mail.ru Fri Dec 21 10:14:45 2012 From: bahtiyor_zohidov at mail.ru (=?UTF-8?B?SGFwcHltYW4=?=) Date: Fri, 21 Dec 2012 19:14:45 +0400 Subject: [Numpy-discussion] =?utf-8?q?NaN_=28Not_a_Number=29_occurs_in_cal?= =?utf-8?q?culation_of_complex_number_for_Bessel_functions?= In-Reply-To: References: <1356089352.739240573@f308.mail.ru> Message-ID: <1356102885.831374538@f279.mail.ru> I think you advised about the code which is the same appearance. ========================================================================== Problem is not here Sir.... I will give you exactly what I was talking about. I have ready codes already(It would be kind of you if you checked the following codes, may be): ------------------------------------------------------------------------------------------------------------------------------ ## Bessel function of the first kind # mathematical form: Jn(x)--> n=arg1, x=arg2 # returns --> Jn(x) value in a complex form ------------------------------ def bes_1(arg1,arg2): ? ? ? nu=arg1+0.5 ? ? ? ? ? ? ? ? ? ? ? ? ?# nu=n+0.5: Jn(x) --> Jnu(x) ? ? ? return sqrt(pi/(2*arg2))*np.round(jv(nu,arg2),5) ? ? ?# jv(nu,arg2)--> from 'numpy.special' in PYTHON --------------------------------------------------------------------------------------------------------------------------------? ## Bessel function of the second kind # mathematical form: Yn(x)--> n=arg1, x=arg2 # returns --> Yn(x) value in a complex form --------------------------------- def bes_2 ( arg1, arg2 ): ? ? ? ?nu = arg1 + 0.5 ? ? ? ? ? ? ? ? ? ? ? ?# nu=n+0.5: Yn(x)--> Ynu(x) ? ? ? ?return sqrt ( pi / ( 2 * arg2 ) ) * np.round ( yv ( nu , arg2 ) , 5) ? ? ? # yv(nu,arg2)--> from 'numpy.special' in PYTHON ---------------------------------?------------------------------------------------------------------------------------------------------- ## Hankel function of the first kind # mathematical form: Hn(x)= Jn(x)+Yn(x)j # returns --> Hn(x) value in a complex form --------------------------------- def hank_1 ( arg1, arg2 ): ? ? ? ?return bes_1 ( arg1 , arg2 ) + bes_2 ( arg1 , arg2 ) * 1.0j ? ? ? ? ? 
?# Hn(x)= Jn(x)+Yn(x)j -------------------------------------------------------------------------------------------------------------------------------------------? ## Bessel function of the first kind derivative # mathematical form: d(z*jn(z))=z*jn-1(z)-n*jn(z) where, z=x or m*x --------------------------------- def bes_der1 ( arg1 , arg2 ): ? ? ? return arg2 * bes_1 ( arg1 - 1, arg2 ) - arg1 * bes_1 ( arg1 , arg2 ) ---------------------------------------------------------------------------------------------------------------------------------------------? ## Hankel function of the first kind derivative # mathematical form: d(z*hankeln(z))=z*hankeln-1(z)-n*hankeln(z) where, z=x or m*x def hank_der1 ( arg1 , arg2 ): ? ? ? return arg2 * hank_1 ( arg1 - 1, arg2 ) - arg1 * hank_1( arg1, arg2 ) ---------------------------------------------------------------------------------------------------------------------------------------------? FOR MY CASE: m =?2513.2741228718346 + 201.0619298974676j? ? ? ? ? ? ? ? ? ? ? ? ? ? x = 502.6548245743669 def F(m,x): nmax = x + 2.0 + 4.0 * x ** ( 1.0 / 3.0 ) ? ? ? ? ?# ? ?nmax= gives 536.0 as expected value nstop = np.round( nmax ) n = np.arange ( 0.0 ,nstop, dtype = float)? z = m * x m2 = m * m? val1 = m2 * bes_1 ( en , z ) * bes_der1 ( en, x) val2 = bes_1 ( en , x ) * bes_der1 ( en , z ) val3 = m2 * bes_1 ( en , z ) * hank_der1 ( en , x )? val4 = hank_1 ( en , x ) * bes_der1 ( en , z )? an = ( val1 - val2 ) / ( val3 - val4 ) val5 = bes_1 ( en , z ) * bes_der1 ( en, x ) val6 = bes_1 ( en , x ) * bes_der1 ( en, z ) val7 = bes_1 ( en , z ) * hank_der1 ( en, x ) val8 = hank_1 ( en , x ) * bes_der1 ( en , z ) bn = ( val5 - val6 ) / ( val7 - val8 ) return an, bn !!!! ?PROBLEM IS RETURNING THE an, bn at a given value which I showed before F(m,x) function WHAT IS WRONG WITH THIS?????? ???????, 21 ??????? 2012, 13:59 ?? Pauli Virtanen : >Dag Sverre Seljebotn astro.uio.no> writes: >[clip] >> Do you have an implemention of the Bessel functions that work as you >> wish in C or Fortran? If so, that could be wrapped and called from Python. > >For spherical Bessel functions it's possible to also use the relation >to Bessel functions, which have a better implementation (AMOS) in Scipy: > >import numpy as np >from scipy.special import jv > >def sph_jn(n, z): >????return jv(n + 0.5, z) * np.sqrt(np.pi / (2 * z)) > >print sph_jn(536, 2513.2741228718346 + 201.0619298974676j) ># (2.5167666386507171e+81+3.3576357192536334e+81j) > > >This should solve the problem. > >-- >Pauli Virtanen > >_______________________________________________ >NumPy-Discussion mailing list >NumPy-Discussion at scipy.org >http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Fri Dec 21 10:45:40 2012 From: pav at iki.fi (Pauli Virtanen) Date: Fri, 21 Dec 2012 15:45:40 +0000 (UTC) Subject: [Numpy-discussion] NaN (Not a Number) occurs in calculation of complex number for Bessel functions References: <1356089352.739240573@f308.mail.ru> <1356102885.831374538@f279.mail.ru> Message-ID: Hi, Your code tries to to evaluate z = 1263309.3633394379 + 101064.74910119522j jv(536, z) # -> (inf+inf*j) In reality, this number is not infinite, but jv(536, z) == -2.3955170861527422e+43888 + 9.6910119847300024e+43887 These numbers (~ 10^43888) are too large for the floating point numbers that computers use (maximum ~ 10^308). This is why you get infinities and NaNs in the result. 
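If values of that size genuinely have to be evaluated, one option is an arbitrary-precision library such as mpmath (suggested a little further down), whose floats carry an essentially unbounded exponent. A sketch of how one might check the figure quoted above (it assumes mpmath is installed, and evaluation at such a large order can be slow; the digits you get depend on the working precision chosen):

    import mpmath

    mpmath.mp.dps = 30                               # working precision, in decimal digits
    z = mpmath.mpc('1263309.3633394379', '101064.74910119522')
    print(mpmath.besselj(536, z))                    # magnitude of order 10**43888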
The same is true for the spherical Bessel functions. You will not be able to do this calculation using any software that uses only floating point numbers (Scipy, Matlab, ...). You need to use analytical properties of your problem to get rid of such large numbers. Alternatively, you can use arbitrary precision numbers. Python has libraries for that: http://code.google.com/p/mpmath/ By the way, the proper place for this discussion is the following mailing list: http://mail.scipy.org/mailman/listinfo/scipy-user -- Pauli Virtanen From raul at virtualmaterials.com Fri Dec 21 14:20:58 2012 From: raul at virtualmaterials.com (Raul Cota) Date: Fri, 21 Dec 2012 12:20:58 -0700 Subject: [Numpy-discussion] Numpy speed ups to simple tasks - final findings and suggestions Message-ID: <50D4B69A.7000409@virtualmaterials.com> Hello, On Dec/2/2012 I sent an email about some meaningful speed problems I was facing when porting our core program from Numeric (Python 2.2) to Numpy (Python 2.6). Some of our tests went from 30 seconds to 90 seconds for example. I saw interest from some people in this list and I left the topic saying I would do a more complete profile of the program and report back anything meaningful. It took me quite a bit to get through things because I ended up having to figure out how to create a Visual Studio project that I could debug and compile from the IDE. First, the obvious, Everything that relies heavily on Numpy for speed (mid to large arrays) is pretty much the same speed when compared to Numeric. The areas that are considerably slower in Numpy Vs Numeric are the trivial tasks that we end up using either for convenience (small arrays) or because scalar types such as 'float64' propagate everywhere throughout the program and creep into several of our data structures. This email is really only relevant to people stuck with doing trivial operations with Numpy and want a meaningful speed boost. I focused on float64. * NOTE: I ended up doing everything in Numpy 1.6.2 as opposed to using the latest stuff. I am going to guess all my findings still apply but I will only be able to confirm until later. ========================================================= In this email I include, 1) Main bottlenecks I found which I list and refer to as (a), (b) and (c). 2) The benchmark tests I made and their speed ups 3) Details on the actual changes to the C code ========================================================= Summary of conclusions, - Our code is finally running as fast as it used to by doing some changes in Numpy and also some minor changes in our code. Half of our problems were caused by instantiating small arrays several times which is fairly slow in Numpy. The other half of our problems were are caused by the slow math performance of Numpy scalars. We did find a particular python function in our code that was a big candidate to be rewritten in C and just got it done. - In Numpy I did four sets of changes in the source code. I believe three of them are relevant to every one using Numpy and one of them is probably not going to be very popular. - The main speed up is in float64 scalar operations and creation of small arrays from lists or tuples. The speed up in small array operations is only marginal but I believe there is potential to get them at least twice as fast. 
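A rough micro-benchmark along the following lines is enough to reproduce the kind of overhead described above before digging into the details below. The absolute numbers are obviously machine- and NumPy-version-dependent; the point is only the relative cost of float64 scalars versus native Python floats:

    # save as bench_scalars.py and run with the interpreter under test
    import timeit
    import numpy as np

    py_f = 3.1                      # native Python float
    np_f = np.float64(3.1)          # NumPy float64 scalar
    vec = np.array([2.0, 3.1])

    tests = [
        ("PyFloat * PyFloat", "py_f * py_f"),
        ("Float64 * Float64", "np_f * np_f"),
        ("PyFloat * Float64", "py_f * np_f"),
        ("PyFloat < Float64", "py_f < np_f"),
        ("a[0] < 3.5",        "vec[0] < 3.5"),
        ("array from list",   "np.array([0.2, 0.3])"),
    ]
    setup = "from __main__ import py_f, np_f, vec; import numpy as np"
    for name, stmt in tests:
        t = timeit.timeit(stmt, setup=setup, number=100000)
        print("%-18s %.4f s per 100000 loops" % (name, t))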
========================================================= 1) By profiling the program I found three generic types of bottlenecks in Numpy that were affecting our code, a) Operations that result in Python internally raising an error e.g. PyObject_GetAttrString(obj, "__array_priority__") when __array_priority__ is not an attribute of obj b) Creation / destruction of scalar array types . In some places this was happening unnecessarily . c) Ufuncs trying to figure out the proper type for an operation (e.g. if I multiply a float64 array by a float64 array, a fair amount of time is spent deciding that it should use float64) I came up with specific changes to address (a) and (b) . I gave up on (c) for now since I couldn't think of a way to speed it up without a large re-write and I really don't know the Numpy code (never saw it before this work). ========================================================= 2) The tests I did were (some are python natives for reference), 1) Array * Array 2) PyFloat * Array 3) Float64 * Array 4) PyFloat + Array 5) Float64 + Array 6) PyFloat * PyFloat 7) Float64 * Float64 8) PyFloat * Float64 9) PyFloat * vector1[1] 10) PyFloat + Float64 11) PyFloat < Float64 12) if PyFloat < Float64: 13) Create array from list 14) Assign PyFloat to all 15) Assign Float64 to all 16) Float64 * Float64 * Float64 * Float64 * Float64 17) Float64 * Float64 * Float64 * Float64 * Float64 18) Float64 ** 2 19) PyFloat ** 2 where Array -> Numpy array of float64 of two elements (vector1 = array( [2.0, 3.1] )). PyFloat -> pyFloat = 3.1 Float64 -> Numpy scalar 'float64' (scalarFloat64 = vector1[1]) Create array from list -> newVec = array([0.2, 0.3], dtype="float64") Assign PyFloat to all -> vector1[:] = pyFloat Assign Float64 to all -> vector1[:] = scalarFloat64 I ran every test 100000 and timed it in seconds. These are the base timings with the original Numpy TIME[s] TEST 1) 0.2003 Array * Array 2) 0.2502 PyFloat * Array 3) 0.2689 Float64 * Array 4) 0.2469 PyFloat + Array 5) 0.2640 Float64 + Array 6) 0.0055 PyFloat * PyFloat 7) 0.0278 Float64 * Float64 8) 0.0778 PyFloat * Float64 9) 0.0893 PyFloat * vector1[1] 10) 0.0767 PyFloat + Float64 11) 0.0532 PyFloat < Float64 12) 0.0543 if PyFloat < Float64 : 13) 0.6788 Create array from list 14) 0.0708 Assign PyFloat to all 15) 0.0775 Assign Float64 to all 16) 0.2994 Float64 * pyFloat * pyFloat * pyFloat * pyFloat 17) 0.1053 Float64 * Float64 * Float64 * Float64 * Float64 18) 0.0918 Float64 ** 2 19) 0.0156 pyFloat ** 2 - Test (13) is the operation that takes the longest overall - PyFloat * Float64 is 14 times slower than PyFloat * PyFloat By addressing bottleneck (a) I got the following ratios of time (BaseTime/NewTime) i.e. RATIO > 1 means GOOD . RATIO TEST 1) 1.1 Array * Array 2) 1.1 PyFloat * Array 3) 1.1 Float64 * Array 4) 1.1 PyFloat + Array 5) 1.2 Float64 + Array 6) 1.0 PyFloat * PyFloat 7) 1.7 Float64 * Float64 8) 2.8 PyFloat * Float64 9) 2.1 PyFloat * vector1[1] 10) 2.8 PyFloat + Float64 11) 3.3 PyFloat < Float64 12) 3.3 if PyFloat < Float64: 13) 3.2 Create array from list 14) 1.2 Assign PyFloat to all 15) 1.2 Assign Float64 to all 16) 2.9 Float64 * pyFloat * pyFloat * pyFloat * pyFloat 17) 1.7 Float64 * Float64 * Float64 * Float64 * Float64 18) 2.4 Float64 ** 2 19) 1.0 pyFloat ** 2 Speed up from Test (13) and (16) resulted in a big speed boost in our code Keeping the changes above. By addressing (b) in a way that did not change the data types of the return values I got the following ratios of time (BaseTime/NewTime) i.e. RATIO > 1 means GOOD . 
RATIO TEST 1) 1.1 Array * Array 2) 1.1 PyFloat * Array 3) 1.2 Float64 * Array 4) 1.1 PyFloat + Array 5) 1.2 Float64 + Array 6) 1.0 PyFloat * PyFloat 7) 1.7 Float64 * Float64 8) 4.3 PyFloat * Float64 9) 3.1 PyFloat * vector1[1] 10) 4.4 PyFloat + Float64 11) 9.3 PyFloat < Float64 12) 9.2 if PyFloat < Float64 : 13) 3.2 Create array from list 14) 1.2 Assign PyFloat to all 15) 1.2 Assign Float64 to all 16) 4.7 Float64 * pyFloat * pyFloat * pyFloat * pyFloat 17) 1.8 Float64 * Float64 * Float64 * Float64 * Float64 18) 2.4 Float64 ** 2 19) 1.0 pyFloat ** 2 - Scalar operations are quite a bit faster but PyFloat * Float64 is 2.9 times slower than PyFloat * PyFloat I decided to then tackle (b) even further by changing things like PyFloat * Float64 to return a PyFloat as opposed to a Float64. This is the change that I don't think is going to be very popular. This is what I got, 1) 1.1 Array * Array 2) 1.1 PyFloat * Array 3) 1.2 Float64 * Array 4) 1.1 PyFloat + Array 5) 1.2 Float64 + Array 6) 1.0 PyFloat * PyFloat 7) 3.2 Float64 * Float64 8) 8.1 PyFloat * Float64 9) 4.1 PyFloat * vector1[1] 10) 8.3 PyFloat + Float64 11) 9.4 PyFloat < Float64 12) 9.2 if PyFloat < Float64 : 13) 3.2 Create array from list 14) 1.2 Assign PyFloat to all 15) 1.2 Assign Float64 to all 16) 17.3 Float64 * pyFloat * pyFloat * pyFloat * pyFloat 17) 3.3 Float64 * Float64 * Float64 * Float64 * Float64 18) 2.4 Float64 ** 2 19) 1.0 pyFloat ** 2 - Test (16) shows how only one Float64 spoils the speed of trivial math. Now imagine the effect in hundreds of lines like that. - Even Test (17) got faster which uses only Float64 - Test (18) Float64 ** 2 is still returning a float64 in this run. Regarding bottleneck (c) . Deciding the type of UFunc. I hacked a version for testing purposes to check the potential speed up (some dirty changes in generate_umath.py). This version avoided the overhead of the call to the calls to find the matching ufunc. The ratio of speed up for something like Array * Array was only 1.6 . This was not too exciting so I walked away for now. ========================================================= 3) These are the actual changes to the C code, For bottleneck (a) In general, - avoid calls to PyObject_GetAttrString when I know the type is List, None, Tuple, Float, Int, String or Unicode - avoid calls to PyObject_GetBuffer when I know the type is List, None or Tuple a.1) In arrayobject.h after the line #include "npy_interrupt.h" I added a couple of #define //Check for exact native types that for sure do not //support array related methods. Useful for faster checks when //validating if an object supports these methods #define ISEXACT_NATIVE_PYTYPE(op) (PyList_CheckExact(op) || (Py_None == op) || PyTuple_CheckExact(op) || PyFloat_CheckExact(op) || PyInt_CheckExact(op) || PyString_CheckExact(op) || PyUnicode_CheckExact(op)) //Check for exact native types that for sure do not //support buffer protocol. Useful for faster checks when //validating if an object supports the buffer protocol. 
#define NEVERSUPPORTS_BUFFER_PROTOCOL(op) ( PyList_CheckExact(op) || (Py_None == op) || PyTuple_CheckExact(op) ) a.2) In common.c above the line if ((ip=PyObject_GetAttrString(op, "__array_interface__"))!=NULL) { I added if (ISEXACT_NATIVE_PYTYPE(op)){ ip = NULL; } else{ and close the } before the line #if !defined(NPY_PY3K) In common.c above the line if (PyObject_HasAttrString(op, "__array__")) { I added if (ISEXACT_NATIVE_PYTYPE(op)){ } else{ and close the } before the line #if defined(NPY_PY3K) In common.c above the line if (PyObject_GetBuffer(op, &buffer_view, PyBUF_FORMAT|PyBUF_STRIDES I added if ( NEVERSUPPORTS_BUFFER_PROTOCOL(op) ){ } else{ and close the } before the line #endif a.3) In ctors.c above the line if ((e = PyObject_GetAttrString(s, "__array_struct__")) != NULL) { I added if (ISEXACT_NATIVE_PYTYPE(s)){ e = NULL; } else{ and close the } before the line n = PySequence_Size(s); In ctors.c above the line attr = PyObject_GetAttrString(input, "__array_struct__"); I added if (ISEXACT_NATIVE_PYTYPE(input)){ attr = NULL; return Py_NotImplemented; } else{ and close the } before the line if (!NpyCapsule_Check(attr)) { In ctors.c above the line inter = PyObject_GetAttrString(input, "__array_interface__"); I added if (ISEXACT_NATIVE_PYTYPE(input)){ inter = NULL; return Py_NotImplemented; } else{ and close the } before the line if (!PyDict_Check(inter)) { In ctors.c above the line array_meth = PyObject_GetAttrString(op, "__array__"); I added if (ISEXACT_NATIVE_PYTYPE(op)){ array_meth = NULL; return Py_NotImplemented; } else{ and close the } before the line if (context == NULL) { In ctors.c above the line if (PyObject_GetBuffer(s, &buffer_view, PyBUF_STRIDES) == 0 || I added if ( NEVERSUPPORTS_BUFFER_PROTOCOL(s) ){ } else{ and close the } before the line #endif a.4) In multiarraymodule.c above the line ret = PyObject_GetAttrString(obj, "__array_priority__"); I added if (ISEXACT_NATIVE_PYTYPE(obj)){ ret = NULL; } else{ and close the } before the line if (PyErr_Occurred()) { For bottleneck (b) b.1) I noticed that PyFloat * Float64 resulted in an unnecessary "on the fly" conversion of the PyFloat into a Float64 to extract its underlying C double value. This happened in the function _double_convert_to_ctype which comes from the pattern, _ at name@_convert_to_ctype I ended up splitting _ at name@_convert_to_ctype into two sections. One for double types and one for the rest of the types where I extract the C value directly if it passes the check to PyFloat_CheckExact (It could be extended for other types). in scalarmathmodule.c.src I added, /**begin repeat * #name = double# * #Name = Double# * #NAME = DOUBLE# * #PYCHECKEXACT = PyFloat_CheckExact# * #PYEXTRACTCTYPE = PyFloat_AS_DOUBLE# */ static int _ at name@_convert_to_ctype(PyObject *a, npy_ at name@ *arg1) { PyObject *temp; if (@PYCHECKEXACT@(a)){ *arg1 = @PYEXTRACTCTYPE@(a); return 0; } ... The rest of this function is the implementation of the original _ at name@_convert_to_ctype(PyObject *a, npy_ at name@ *arg1) The original implementation of _ at name@_convert_to_ctype does not include double anymore, i.e. 
/**begin repeat * #name = byte, ubyte, short, ushort, int, uint, long, ulong, longlong, * ulonglong, half, float, longdouble, cfloat, cdouble, clongdouble# * #Name = Byte, UByte, Short, UShort, Int, UInt, Long, ULong, LongLong, * ULongLong, Half, Float, LongDouble, CFloat, CDouble, CLongDouble# * #NAME = BYTE, UBYTE, SHORT, USHORT, INT, UINT, LONG, ULONG, LONGLONG, * ULONGLONG, HALF, FLOAT, LONGDOUBLE, CFLOAT, CDOUBLE, CLONGDOUBLE# */ static int _ at name@_convert_to_ctype(PyObject *a, npy_ at name@ *arg1) b.2) This is the change that may not be very popular among Numpy users. I modified Float64 operations to return a Float instead of Float64. I could not think or see any ill effects and I got a fairly decent speed boost. in scalarmathmodule.c.src I modified to this, /**begin repeat * #name=(byte,ubyte,short,ushort,int,uint,long,ulong,longlong,ulonglong)*13, * (half, float, double, longdouble, cfloat, cdouble, clongdouble)*6, * (half, float, double, longdouble)*2# * #Name=(Byte,UByte,Short,UShort,Int,UInt,Long,ULong,LongLong,ULongLong)*13, * (Half, Float, Double, LongDouble, CFloat, CDouble, CLongDouble)*6, * (Half, Float, Double, LongDouble)*2# * #oper=add*10, subtract*10, multiply*10, divide*10, remainder*10, * divmod*10, floor_divide*10, lshift*10, rshift*10, and*10, * or*10, xor*10, true_divide*10, * add*7, subtract*7, multiply*7, divide*7, floor_divide*7, true_divide*7, * divmod*4, remainder*4# * #fperr=1*70,0*50,1*10, * 1*42, * 1*8# * #twoout=0*50,1*10,0*70, * 0*42, * 1*4,0*4# * #otyp=(byte,ubyte,short,ushort,int,uint,long,ulong,longlong,ulonglong)*12, * float*4, double*6, * (half, float, double, longdouble, cfloat, cdouble, clongdouble)*6, * (half, float, double, longdouble)*2# * #OName=(Byte,UByte,Short,UShort,Int,UInt,Long,ULong,LongLong,ULongLong)*12, * Float*4, Double*6, * (Half, Float, Double, LongDouble, CFloat, CDouble, CLongDouble)*6, * (Half, Float, Double, LongDouble)*2# * #OutUseName=(Byte,UByte,Short,UShort,Int,UInt,Long,ULong,LongLong,ULongLong)*12, * Float*4, out*6, * (Half, Float, out, LongDouble, CFloat, CDouble, CLongDouble)*6, * (Half, Float, out, LongDouble)*2# * #AsScalarArr=(1,1,1,1,1,1,1,1,1,1)*12, * 1*4, 0*6, * (1, 1, 0, 1, 1, 1, 1)*6, * (1, 1, 0, 1)*2# * #RetValCreate=(PyArrayScalar_New,PyArrayScalar_New,PyArrayScalar_New,PyArrayScalar_New,PyArrayScalar_New,PyArrayScalar_New,PyArrayScalar_New,PyArrayScalar_New,PyArrayScalar_New,PyArrayScalar_New)*12, * PyArrayScalar_New*4, PyFloat_FromDouble*6, * (PyArrayScalar_New, PyArrayScalar_New, PyFloat_FromDouble, PyArrayScalar_New, PyArrayScalar_New, PyArrayScalar_New, PyArrayScalar_New)*6, * (PyArrayScalar_New, PyArrayScalar_New, PyFloat_FromDouble, PyArrayScalar_New)*2# */ #if !defined(CODEGEN_SKIP_ at oper@_FLAG) static PyObject * @name at _@oper@(PyObject *a, PyObject *b) { ... Same as before and ends with... #else ret = @RetValCreate@(@OutUseName@); if (ret == NULL) { return NULL; } if (@AsScalarArr@) PyArrayScalar_ASSIGN(ret, @OName@, out); #endif return ret; } #endif /**end repeat**/ I still need to do the section for when there are two return values and the power function. I am not sure what else could be there. ========================================================= That's about it. Sorry for the long email. I tried to summarize as much as possible. Let me know if you have any questions or if you want the actual files I modified. 
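Until changes along these lines are in a released NumPy, a purely Python-level workaround for the 'float64 creeps into every data structure' part of the problem is to convert scalars back to native floats at the point where they leave the array world, for instance with item(). A small sketch (the helper name is made up; whether the extra call pays off depends on how much scalar arithmetic follows):

    import numpy as np

    def as_pyfloat(x):
        # Return the native Python scalar for NumPy scalars and 0-d arrays,
        # and leave anything else untouched.
        if isinstance(x, np.generic) or (isinstance(x, np.ndarray) and x.ndim == 0):
            return x.item()
        return x

    a = np.array([2.0, 3.1])
    v = as_pyfloat(a[1])            # plain Python float from here on
    assert type(v) is float

    # subsequent pure-Python comparisons and arithmetic now take the fast
    # native-float path instead of the float64 scalar path
    if v < 3.5:
        v = v * 2.0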
Cheers, Raul Cota From bahtiyor_zohidov at mail.ru Fri Dec 21 14:44:10 2012 From: bahtiyor_zohidov at mail.ru (=?UTF-8?B?SGFwcHltYW4=?=) Date: Fri, 21 Dec 2012 23:44:10 +0400 Subject: [Numpy-discussion] =?utf-8?q?NaN_=28Not_a_Number=29_occurs_in_cal?= =?utf-8?q?culation_of_complex_number_for_Bessel_functions?= In-Reply-To: References: <1356089352.739240573@f308.mail.ru> Message-ID: <1356119050.307555288@f390.i.mail.ru> Thanks But I could find for Win64 bit windows???? Second question: Did you mean that I have to put lens limits of those number????? ???????, 21 ??????? 2012, 15:45 UTC ?? Pauli Virtanen : >Hi, > >Your code tries to to evaluate > >????z = 1263309.3633394379 + 101064.74910119522j >????jv(536, z) >????# -> (inf+inf*j) > >In reality, this number is not infinite, but > >????jv(536, z) == -2.3955170861527422e+43888 + 9.6910119847300024e+43887 > >These numbers (~ 10^43888) are too large for the floating point >numbers that computers use (maximum ~ 10^308). This is why you get >infinities and NaNs in the result. The same is true for the spherical >Bessel functions. > >You will not be able to do this calculation using any software >that uses only floating point numbers (Scipy, Matlab, ...). > >You need to use analytical properties of your problem to >get rid of such large numbers. Alternatively, you can use arbitrary >precision numbers. Python has libraries for that: > >http://code.google.com/p/mpmath/ > > >By the way, the proper place for this discussion is the following >mailing list: > >http://mail.scipy.org/mailman/listinfo/scipy-user > >-- >Pauli Virtanen > > >_______________________________________________ >NumPy-Discussion mailing list >NumPy-Discussion at scipy.org >http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Fri Dec 21 14:54:01 2012 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Fri, 21 Dec 2012 20:54:01 +0100 Subject: [Numpy-discussion] Travis-CI stopped supporting Python 3.1, but added 3.3 In-Reply-To: References: Message-ID: On Fri, Dec 21, 2012 at 2:23 AM, Ond?ej ?ert?k wrote: > Hi, > > I noticed that the 3.1 tests are now failing. After clarification with > the Travis guys: > > https://groups.google.com/d/topic/travis-ci/02iRu6kmwY8/discussion > > I've submitted a fix to our .travis.yml (and backported to 1.7): > > https://github.com/numpy/numpy/pull/2850 > https://github.com/numpy/numpy/pull/2851 > > In case you were wondering. Do we need to support Python 3.1? > We could in principle test 3.1 just like we test 2.4. I don't know if > it is worth the pain. > I think for 1.7.x we should. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Fri Dec 21 15:05:14 2012 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 21 Dec 2012 20:05:14 +0000 Subject: [Numpy-discussion] Travis-CI stopped supporting Python 3.1, but added 3.3 In-Reply-To: References: Message-ID: On Fri, Dec 21, 2012 at 1:23 AM, Ond?ej ?ert?k wrote: > Hi, > > I noticed that the 3.1 tests are now failing. After clarification with > the Travis guys: > > https://groups.google.com/d/topic/travis-ci/02iRu6kmwY8/discussion > > I've submitted a fix to our .travis.yml (and backported to 1.7): > > https://github.com/numpy/numpy/pull/2850 > https://github.com/numpy/numpy/pull/2851 > > In case you were wondering. Do we need to support Python 3.1? > We could in principle test 3.1 just like we test 2.4. 
I don't know if > it is worth the pain. It probably isn't much pain, since Python is easy to compile, and I don't think we've been running into many cases where supporting 3.1 required workarounds yet? (As compared to 2.4, where we get compatibility-breaking patches constantly.) It's a crude metric and I'm not sure what conclusion it suggests, but we have download stats for the different python-version-specific pre-built numpy binaries: http://sourceforge.net/projects/numpy/files/NumPy/1.6.1/ http://pypi.python.org/pypi/numpy/1.6.1 http://sourceforge.net/projects/numpy/files/NumPy/1.6.2/ http://pypi.python.org/pypi/numpy/1.6.2 It looks like python 2.5 is several times more popular than 3.1, and its popularity is dropping quickly. (30% fewer downloads so far for 1.6.2 than 1.6.1, even though 1.6.2 has more downloads overall.) -n From ondrej.certik at gmail.com Fri Dec 21 15:36:44 2012 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Fri, 21 Dec 2012 12:36:44 -0800 Subject: [Numpy-discussion] Travis-CI stopped supporting Python 3.1, but added 3.3 In-Reply-To: References: Message-ID: On Fri, Dec 21, 2012 at 12:05 PM, Nathaniel Smith wrote: > On Fri, Dec 21, 2012 at 1:23 AM, Ond?ej ?ert?k wrote: >> Hi, >> >> I noticed that the 3.1 tests are now failing. After clarification with >> the Travis guys: >> >> https://groups.google.com/d/topic/travis-ci/02iRu6kmwY8/discussion >> >> I've submitted a fix to our .travis.yml (and backported to 1.7): >> >> https://github.com/numpy/numpy/pull/2850 >> https://github.com/numpy/numpy/pull/2851 >> >> In case you were wondering. Do we need to support Python 3.1? >> We could in principle test 3.1 just like we test 2.4. I don't know if >> it is worth the pain. > > It probably isn't much pain, since Python is easy to compile, and I > don't think we've been running into many cases where supporting 3.1 > required workarounds yet? (As compared to 2.4, where we get > compatibility-breaking patches constantly.) Yes, I was going to suggest that it really is not a big deal to support 3.1, as far as I see it by my experience with Travis tests on numpy PRs in the last half a year. We can compile it like we do for 2.4, or we can even provide a prebuild binary and just install it in the Travis VM each time. So I'll try to provide a PR which implements testing of 3.1, unless someone beats me to it. Ondrej From chang at lambdafoundry.com Fri Dec 21 16:13:32 2012 From: chang at lambdafoundry.com (Chang She) Date: Fri, 21 Dec 2012 16:13:32 -0500 Subject: [Numpy-discussion] [pystatsmodels] Re: ANN: pandas 0.10.0 released In-Reply-To: <925e10ec-bfb9-42be-a4aa-e26d197a58a0@googlegroups.com> References: <925e10ec-bfb9-42be-a4aa-e26d197a58a0@googlegroups.com> Message-ID: On Dec 21, 2012, at 3:27 PM, Collin Sellman wrote: > Thanks, Wes and team. I've been looking through the new features, but haven't found any documentation on the integration with the Google Analytics API. I was just in the midst of trying to pull data into Pandas from GA in v.0.9.0, so would love to try what you built in .10. > > -Collin > > On Monday, December 17, 2012 10:19:49 AM UTC-7, Wes McKinney wrote: > hi all, > > I'm super excited to announce the pandas 0.10.0 release. 
This is > a major release including a new high performance file reading > engine with tons of new user-facing functionality as well, a > bunch of work on the HDF5/PyTables integration layer, > much-expanded Unicode support, a new option/configuration > interface, integration with the Google Analytics API, and a wide > array of other new features, bug fixes, and performance > improvements. I strongly recommend that all users get upgraded as > soon as feasible. Many performance improvements made are quite > substantial over 0.9.x, see vbenchmarks at the end of the e-mail. > > As of this release, we are no longer supporting Python 2.5. Also, > this is the first release to officially support Python 3.3. > > Note: there are a number of minor, but necessary API changes that > long-time pandas users should pay attention to in the What's New. > > Thanks to all who contributed to this release, especially Chang > She, Yoval P, and Jeff Reback (and everyone else listed in the > commit log!). > > As always source archives and Windows installers are on PyPI. > > What's new: http://pandas.pydata.org/pandas-docs/stable/whatsnew.html > Installers: http://pypi.python.org/pypi/pandas > > $ git log v0.9.1..v0.10.0 --pretty=format:%aN | sort | uniq -c | sort -rn > 246 Wes McKinney > 140 y-p > 99 Chang She > 45 jreback > 18 Abraham Flaxman > 17 Jeff Reback > 14 locojaydev > 11 Keith Hughitt > 5 Adam Obeng > 2 Dieter Vandenbussche > 1 zach powers > 1 Luke Lee > 1 Laurent Gautier > 1 Ken Van Haren > 1 Jay Bourque > 1 Donald Curtis > 1 Chris Mulligan > 1 alex arsenovic > 1 A. Flaxman > > Happy data hacking! > > - Wes > > What is it > ========== > pandas is a Python package providing fast, flexible, and > expressive data structures designed to make working with > relational, time series, or any other kind of labeled data both > easy and intuitive. It aims to be the fundamental high-level > building block for doing practical, real world data analysis in > Python. > > Links > ===== > Release Notes: http://github.com/pydata/pandas/blob/master/RELEASE.rst > Documentation: http://pandas.pydata.org > Installers: http://pypi.python.org/pypi/pandas > Code Repository: http://github.com/pydata/pandas > Mailing List: http://groups.google.com/group/pydata > > Performance vs. 
v0.9.0 > ====================== > > Benchmarks from https://github.com/pydata/pandas/tree/master/vb_suite > Ratio < 1 means that v0.10.0 is faster > > v0.10.0 v0.9.0 ratio > name > unstack_sparse_keyspace 1.2813 144.1262 0.0089 > groupby_frame_apply_overhead 20.1520 337.3330 0.0597 > read_csv_comment2 25.3097 363.2860 0.0697 > groupbym_frame_apply 75.1554 504.1661 0.1491 > frame_iteritems_cached 0.0711 0.3919 0.1815 > read_csv_thou_vb 35.2690 191.9360 0.1838 > concat_small_frames 12.9019 55.3561 0.2331 > join_dataframe_integer_2key 5.8184 21.5823 0.2696 > series_value_counts_strings 5.3824 19.1262 0.2814 > append_frame_single_homogenous 0.3413 0.9319 0.3662 > read_csv_vb 18.4084 46.9500 0.3921 > read_csv_standard 12.0651 29.9940 0.4023 > panel_from_dict_all_different_indexes 73.6860 158.2949 0.4655 > frame_constructor_ndarray 0.0471 0.0958 0.4918 > groupby_first 3.8502 7.1988 0.5348 > groupby_last 3.6962 6.7792 0.5452 > panel_from_dict_two_different_indexes 50.7428 86.4980 0.5866 > append_frame_single_mixed 1.2950 2.1930 0.5905 > frame_get_numeric_data 0.0695 0.1119 0.6212 > replace_fillna 4.6349 7.0540 0.6571 > frame_to_csv 281.9340 427.7921 0.6590 > replace_replacena 4.7154 7.1207 0.6622 > frame_iteritems 2.5862 3.7463 0.6903 > series_align_int64_index 29.7370 41.2791 0.7204 > join_dataframe_integer_key 1.7980 2.4303 0.7398 > groupby_multi_size 31.0066 41.7001 0.7436 > groupby_frame_singlekey_integer 2.3579 3.1649 0.7450 > write_csv_standard 326.8259 427.3241 0.7648 > groupby_simple_compress_timing 41.2113 52.3993 0.7865 > frame_fillna_inplace 16.2843 20.0491 0.8122 > reindex_fillna_backfill 0.1364 0.1667 0.8181 > groupby_multi_series_op 15.2914 18.6651 0.8193 > groupby_multi_cython 17.2169 20.4420 0.8422 > frame_fillna_many_columns_pad 14.9510 17.5114 0.8538 > panel_from_dict_equiv_indexes 25.8427 29.9682 0.8623 > merge_2intkey_nosort 19.0755 22.1138 0.8626 > sparse_series_to_frame 167.8529 192.9920 0.8697 > reindex_fillna_pad 0.1410 0.1617 0.8720 > merge_2intkey_sort 44.7863 51.3315 0.8725 > reshape_stack_simple 2.6698 3.0502 0.8753 > groupby_indices 7.2264 8.2314 0.8779 > sort_level_one 4.3845 4.9902 0.8786 > sort_level_zero 4.3362 4.9198 0.8814 > write_store 16.0587 18.2042 0.8821 > frame_reindex_both_axes 0.3726 0.4183 0.8907 > groupby_multi_different_numpy_functions 13.4164 15.0509 0.8914 > index_int64_intersection 25.3705 28.1867 0.9001 > groupby_frame_median 7.7491 8.6011 0.9009 > frame_drop_dup_na_inplace 2.6290 2.9155 0.9017 > dataframe_reindex_columns 0.3052 0.3372 0.9049 > join_dataframe_index_multi 20.5651 22.6893 0.9064 > frame_ctor_list_of_dict 101.7439 112.2260 0.9066 > groupby_pivot_table 18.4551 20.3184 0.9083 > reindex_frame_level_align 0.9644 1.0531 0.9158 > stat_ops_level_series_sum_multiple 7.3637 8.0230 0.9178 > write_store_mixed 38.2528 41.6604 0.9182 > frame_reindex_both_axes_ix 0.4550 0.4950 0.9192 > stat_ops_level_frame_sum_multiple 8.1975 8.9055 0.9205 > panel_from_dict_same_index 25.7938 28.0147 0.9207 > groupby_series_simple_cython 5.1310 5.5624 0.9224 > frame_sort_index_by_columns 41.9577 45.1816 0.9286 > groupby_multi_python 54.9727 59.0400 0.9311 > datetimeindex_add_offset 0.2417 0.2584 0.9356 > frame_boolean_row_select 0.2905 0.3100 0.9373 > frame_reindex_axis1 2.9760 3.1742 0.9376 > stat_ops_level_series_sum 2.3382 2.4937 0.9376 > groupby_multi_different_functions 14.0333 14.9571 0.9382 > timeseries_timestamp_tzinfo_cons 0.0159 0.0169 0.9397 > stats_rolling_mean 1.6904 1.7959 0.9413 > melt_dataframe 1.5236 1.6181 0.9416 > timeseries_asof_single 0.0548 
0.0582 0.9416 > frame_ctor_nested_dict_int64 134.3100 142.6389 0.9416 > join_dataframe_index_single_key_bigger 15.6578 16.5949 0.9435 > stat_ops_level_frame_sum 3.2475 3.4414 0.9437 > indexing_dataframe_boolean_rows 0.2382 0.2518 0.9459 > timeseries_asof_nan 10.0433 10.6006 0.9474 > frame_reindex_axis0 1.4403 1.5184 0.9485 > concat_series_axis1 69.2988 72.8099 0.9518 > join_dataframe_index_single_key_small 6.8492 7.1847 0.9533 > dataframe_reindex_daterange 0.4054 0.4240 0.9562 > join_dataframe_index_single_key_bigger 6.4616 6.7578 0.9562 > timeseries_timestamp_downsample_mean 4.5849 4.7787 0.9594 > frame_fancy_lookup 2.5498 2.6544 0.9606 > series_value_counts_int64 2.5569 2.6581 0.9619 > frame_fancy_lookup_all 30.7510 31.8465 0.9656 > index_int64_union 82.2279 85.1500 0.9657 > indexing_dataframe_boolean_rows_object 0.4809 0.4977 0.9662 > frame_ctor_nested_dict 91.6129 94.8122 0.9663 > stat_ops_series_std 0.2450 0.2533 0.9673 > groupby_frame_cython_many_columns 3.7642 3.8894 0.9678 > timeseries_asof 10.4352 10.7721 0.9687 > series_ctor_from_dict 3.7707 3.8749 0.9731 > frame_drop_dup_inplace 3.0007 3.0746 0.9760 > timeseries_large_lookup_value 0.0242 0.0248 0.9764 > read_table_multiple_date_baseline 1201.2930 1224.3881 0.9811 > dti_reset_index 0.6339 0.6457 0.9817 > read_table_multiple_date 2600.7280 2647.8729 0.9822 > reindex_frame_level_reindex 0.9524 0.9674 0.9845 > reindex_multiindex 1.3483 1.3685 0.9853 > frame_insert_500_columns 102.1249 103.4329 0.9874 > frame_drop_duplicates 19.3780 19.6157 0.9879 > reindex_daterange_backfill 0.1870 0.1889 0.9899 > stats_rank2d_axis0_average 25.0480 25.2801 0.9908 > series_align_left_monotonic 13.1929 13.2558 0.9953 > timeseries_add_irregular 22.4635 22.5122 0.9978 > read_store_mixed 13.4398 13.4560 0.9988 > lib_fast_zip 11.1289 11.1354 0.9994 > match_strings 0.3831 0.3833 0.9995 > read_store 5.5526 5.5290 1.0043 > timeseries_sort_index 22.7172 22.5976 1.0053 > timeseries_1min_5min_mean 0.6224 0.6175 1.0079 > stats_rank2d_axis1_average 14.6569 14.5339 1.0085 > reindex_daterange_pad 0.1886 0.1867 1.0102 > timeseries_period_downsample_mean 6.4241 6.3480 1.0120 > frame_drop_duplicates_na 19.3303 19.0970 1.0122 > stats_rank_average_int 23.3569 22.9996 1.0155 > lib_fast_zip_fillna 14.1394 13.8473 1.0211 > index_datetime_intersection 17.2626 16.8986 1.0215 > timeseries_1min_5min_ohlc 0.7054 0.6891 1.0237 > stats_rank_average 31.3440 30.3845 1.0316 > timeseries_infer_freq 10.9854 10.6439 1.0321 > timeseries_slice_minutely 0.0637 0.0611 1.0418 > index_datetime_union 17.9083 17.1640 1.0434 > series_align_irregular_string 89.9470 85.1344 1.0565 > series_constructor_ndarray 0.0127 0.0119 1.0742 > indexing_panel_subset 0.5692 0.5214 1.0917 > groupby_apply_dict_return 46.3497 42.3220 1.0952 > reshape_unstack_simple 3.2901 2.9089 1.1310 > timeseries_to_datetime_iso8601 4.2305 3.6015 1.1746 > frame_to_string_floats 53.6217 37.2041 1.4413 > reshape_pivot_time_series 170.4340 107.9068 1.5795 > sparse_frame_constructor 6.2714 3.5053 1.7891 > datetimeindex_normalize 37.2718 6.9329 5.3761 > > Columns: test_name | target_duration [ms] | baseline_duration [ms] | ratio Hi Collin, I didn't add it to the official docs because of the authentication step complicating the doc build, but you can reference this brief blog post I wrote here: http://quantabee.wordpress.com/2012/12/17/google-analytics-pandas/ Best, Chang -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From matthew.brett at gmail.com Sat Dec 22 12:36:58 2012 From: matthew.brett at gmail.com (Matthew Brett) Date: Sat, 22 Dec 2012 09:36:58 -0800 Subject: [Numpy-discussion] Future of numpy (was: DARPA funding for Blaze and passing the NumPy torch) In-Reply-To: References: Message-ID: Hi, On Thu, Dec 20, 2012 at 5:39 PM, Nathaniel Smith wrote: > On Thu, Dec 20, 2012 at 11:46 PM, Matthew Brett wrote: >> Travis - I think you are suggesting that there should be no one >> person in charge of numpy, and I think this is very unlikely to work >> well. Perhaps there are good examples of well-led projects where >> there is not a clear leader, but I can't think of any myself at the >> moment. My worry would be that, without a clear leader, it will be >> unclear how decisions are made, and that will make it very hard to >> take strategic decisions. > > Curious; my feeling is the opposite, that among mature and successful > FOSS projects, having a clear leader is the uncommon case. GCC > doesn't, Glibc not only has no leader but they recently decided to get > rid of their formal steering committee, I'm pretty sure git doesn't, > Apache certainly doesn't, Samba doesn't really, etc. As usual Karl > Fogel has sensible comments on this: > http://producingoss.com/en/consensus-democracy.html Ah yes - that is curious. My - er - speculation was based on: Numpy - Travis golden age in which we still bask Sympy - Ondrej, then Aaron - evolving into group decision making AFAICT IPython - Fernando, evolving into group decision making, AFAICT Cython - Robert Bradshaw - evolving into ... - you get the idea. and then reading about businesses particularly Good to Great, Built to Last, the disaster at HP when they didn't take care about succession. In general, that reading gave me the impression that successful organizations take enormous care about succession. I can't think of any case in the business literature I've read where a successful leader handed over to a group of three. > In practice the main job of a successful FOSS leader is to refuse to > make decisions, nudge people to work things out, and then if they > refuse to work things out tell them to go away until they do: > https://lwn.net/Articles/105375/ > and what actually gives people influence in a project is the respect > of the other members. The former stuff is stuff anyone can do, and the > latter isn't something you can confer or take away with a vote. Right. My impression is - I'm happy to be corrected with better information - that the leader of a to-be-successful organization is very good at encouraging a spirit of free and vigorous debate, strong opinion, and reasoned decisions - and that may be the main gift they give to the organization. At that point, usually under that leader's supervision, the decision making starts diffusing over the group, as they learn to discuss and make decisions together. As I was teaching my niece and nephew to say to their parents in the car - Daddy - are we there yet? If we are not already there, how are we going to get there? > Nor do we necessarily have a great track record for executive > decisions actually working things out. No, I agree, the right leader will help form the group well for making good group decisions. I think. In the mean-time - now that there is a change - could I ask - where do you three see Numpy going in the next five years? What do you see as the challenges to solve? What are the big risks? What are the big possibilities? 
Cheers, Matthew From matthew.brett at gmail.com Sat Dec 22 17:19:02 2012 From: matthew.brett at gmail.com (Matthew Brett) Date: Sat, 22 Dec 2012 14:19:02 -0800 Subject: [Numpy-discussion] Future of numpy (was: DARPA funding for Blaze and passing the NumPy torch) In-Reply-To: <766F069B-9C36-4BC1-8A51-BB91155104C1@continuum.io> References: <766F069B-9C36-4BC1-8A51-BB91155104C1@continuum.io> Message-ID: Hi, On Fri, Dec 21, 2012 at 12:14 AM, Travis Oliphant wrote: > > On Dec 20, 2012, at 7:39 PM, Nathaniel Smith wrote: > >> On Thu, Dec 20, 2012 at 11:46 PM, Matthew Brett wrote: >>> Travis - I think you are suggesting that there should be no one >>> person in charge of numpy, and I think this is very unlikely to work >>> well. Perhaps there are good examples of well-led projects where >>> there is not a clear leader, but I can't think of any myself at the >>> moment. My worry would be that, without a clear leader, it will be >>> unclear how decisions are made, and that will make it very hard to >>> take strategic decisions. >> >> Curious; my feeling is the opposite, that among mature and successful >> FOSS projects, having a clear leader is the uncommon case. GCC >> doesn't, Glibc not only has no leader but they recently decided to get >> rid of their formal steering committee, I'm pretty sure git doesn't, >> Apache certainly doesn't, Samba doesn't really, etc. As usual Karl >> Fogel has sensible comments on this: >> http://producingoss.com/en/consensus-democracy.html >> >> In practice the main job of a successful FOSS leader is to refuse to >> make decisions, nudge people to work things out, and then if they >> refuse to work things out tell them to go away until they do: >> https://lwn.net/Articles/105375/ >> and what actually gives people influence in a project is the respect >> of the other members. The former stuff is stuff anyone can do, and the >> latter isn't something you can confer or take away with a vote. >> > > I will strongly voice my opinion that NumPy does not need an official single "leader". I am sorry, I have a feeling this question might be unwelcome - but I think it's reasonable to say that having three people in joint charge is an unusual choice. I suppose it has various risks and advantages. Would you mind saying a little bit about why you chose this option instead of the more common one of having a single lead? What problems do you think might arise? How can they be detected and avoided? Thanks a lot, Matthew From matthew.brett at gmail.com Sun Dec 23 00:40:43 2012 From: matthew.brett at gmail.com (Matthew Brett) Date: Sun, 23 Dec 2012 05:40:43 +0000 Subject: [Numpy-discussion] Many failing doctests - release blocker? Enable for default test runs? Message-ID: Hi, I noticed that enabling the doctests on the 1.7.x maintenance branch caused lots and lots of doctest failures. (np-devel)[mb312 at blair ~/dev_trees]$ python -c 'import numpy as np; np.test(doctests=True)' 1.7.0rc1.dev-1e8fcdf Running unit tests and doctests for numpy NumPy version 1.7.0rc1.dev-1e8fcdf NumPy is installed in /Users/mb312/.virtualenvs/np-devel/lib/python2.6/site-packages/numpy Python version 2.6.6 (r266:84374, Aug 31 2010, 11:00:51) [GCC 4.0.1 (Apple Inc. build 5493)] nose version 1.1.2 ... Ran 3839 tests in 59.928s FAILED (KNOWNFAIL=4, SKIP=4, errors=23, failures=175) The doctests also throw up somewhere round 10 matplotlib plots, so presumably those would fail as well on a machine without a display without forcing the import of an 'Agg' backend or similar. 
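(As an aside, a minimal sketch of the workaround - untested here, and assuming matplotlib is importable - is just to select the non-interactive backend before anything pulls in pyplot, and then start the doctest run:

import matplotlib
matplotlib.use('Agg')   # must be called before pyplot is imported anywhere

import numpy as np
np.test(doctests=True)

That only avoids the display requirement of course; the other doctest failures reported above would still need fixing.)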
I have never checked the doctests on Python 3. Has anyone run those recently? For the projects I work on most, we enable doctests for the default test run - as in 'doctests=True' by default in the numpy testing machinery. Do ya'll see any disadvantage in doing that for numpy? In case someone gets to this before I do, we've also got some logic for doing conditional skips of doctests when optional packages are not available such as matplotlib, inspired by something similar in IPython: https://github.com/nipy/nipy/blob/master/nipy/testing/doctester.py#L193 If Christmas allows I'll send a pull request with something like that in the next few days. Cheers, Matthew From d.s.seljebotn at astro.uio.no Sun Dec 23 03:56:22 2012 From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn) Date: Sun, 23 Dec 2012 09:56:22 +0100 Subject: [Numpy-discussion] Future of numpy (was: DARPA funding for Blaze and passing the NumPy torch) In-Reply-To: References: Message-ID: <50D6C736.6020908@astro.uio.no> On 12/22/2012 06:36 PM, Matthew Brett wrote: > Hi, > > On Thu, Dec 20, 2012 at 5:39 PM, Nathaniel Smith wrote: >> On Thu, Dec 20, 2012 at 11:46 PM, Matthew Brett wrote: >>> Travis - I think you are suggesting that there should be no one >>> person in charge of numpy, and I think this is very unlikely to work >>> well. Perhaps there are good examples of well-led projects where >>> there is not a clear leader, but I can't think of any myself at the >>> moment. My worry would be that, without a clear leader, it will be >>> unclear how decisions are made, and that will make it very hard to >>> take strategic decisions. >> >> Curious; my feeling is the opposite, that among mature and successful >> FOSS projects, having a clear leader is the uncommon case. GCC >> doesn't, Glibc not only has no leader but they recently decided to get >> rid of their formal steering committee, I'm pretty sure git doesn't, >> Apache certainly doesn't, Samba doesn't really, etc. As usual Karl >> Fogel has sensible comments on this: >> http://producingoss.com/en/consensus-democracy.html > > Ah yes - that is curious. My - er - speculation was based on: > > Numpy - Travis golden age in which we still bask > Sympy - Ondrej, then Aaron - evolving into group decision making AFAICT > IPython - Fernando, evolving into group decision making, AFAICT > Cython - Robert Bradshaw - evolving into ... - you get the idea. I don't really want to prolong this thread, but I feel like I should correct a factual error. Cython started with Robert Bradshaw (and other Sage members) and Stefan Behnel exchanging patches on top of Pyrex; there was definitely no leader at that point. Then I came along; there was no leader at that point either (but I was aware that the two others had a longer track record of course). Robert Bradshaw was declared leader in order to break the tie when I and Stefan Behnel had argued for a 100-post long thread and could not reach a conclusion. And at least in this case, we were able to settle on a leadership structure then, when we needed it, and didn't regret not doing it earlier. Dag Sverre > > and then reading about businesses particularly Good to Great, Built to > Last, the disaster at HP when they didn't take care about succession. > In general, that reading gave me the impression that successful > organizations take enormous care about succession. I can't think of > any case in the business literature I've read where a successful > leader handed over to a group of three. 
> >> In practice the main job of a successful FOSS leader is to refuse to >> make decisions, nudge people to work things out, and then if they >> refuse to work things out tell them to go away until they do: >> https://lwn.net/Articles/105375/ >> and what actually gives people influence in a project is the respect >> of the other members. The former stuff is stuff anyone can do, and the >> latter isn't something you can confer or take away with a vote. > > Right. My impression is - I'm happy to be corrected with better > information - that the leader of a to-be-successful organization is > very good at encouraging a spirit of free and vigorous debate, strong > opinion, and reasoned decisions - and that may be the main gift they > give to the organization. At that point, usually under that leader's > supervision, the decision making starts diffusing over the group, as > they learn to discuss and make decisions together. > > As I was teaching my niece and nephew to say to their parents in the > car - Daddy - are we there yet? > > If we are not already there, how are we going to get there? > >> Nor do we necessarily have a great track record for executive >> decisions actually working things out. > > No, I agree, the right leader will help form the group well for making > good group decisions. I think. > > In the mean-time - now that there is a change - could I ask - where do > you three see Numpy going in the next five years? What do you see > as the challenges to solve? What are the big risks? What are the big > possibilities? > > Cheers, > > Matthew > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From matthew.brett at gmail.com Sun Dec 23 08:33:02 2012 From: matthew.brett at gmail.com (Matthew Brett) Date: Sun, 23 Dec 2012 13:33:02 +0000 Subject: [Numpy-discussion] Future of numpy (was: DARPA funding for Blaze and passing the NumPy torch) In-Reply-To: <50D6C736.6020908@astro.uio.no> References: <50D6C736.6020908@astro.uio.no> Message-ID: Hi, On Sun, Dec 23, 2012 at 8:56 AM, Dag Sverre Seljebotn wrote: > On 12/22/2012 06:36 PM, Matthew Brett wrote: >> Hi, >> >> On Thu, Dec 20, 2012 at 5:39 PM, Nathaniel Smith wrote: >>> On Thu, Dec 20, 2012 at 11:46 PM, Matthew Brett wrote: >>>> Travis - I think you are suggesting that there should be no one >>>> person in charge of numpy, and I think this is very unlikely to work >>>> well. Perhaps there are good examples of well-led projects where >>>> there is not a clear leader, but I can't think of any myself at the >>>> moment. My worry would be that, without a clear leader, it will be >>>> unclear how decisions are made, and that will make it very hard to >>>> take strategic decisions. >>> >>> Curious; my feeling is the opposite, that among mature and successful >>> FOSS projects, having a clear leader is the uncommon case. GCC >>> doesn't, Glibc not only has no leader but they recently decided to get >>> rid of their formal steering committee, I'm pretty sure git doesn't, >>> Apache certainly doesn't, Samba doesn't really, etc. As usual Karl >>> Fogel has sensible comments on this: >>> http://producingoss.com/en/consensus-democracy.html >> >> Ah yes - that is curious. 
My - er - speculation was based on: >> >> Numpy - Travis golden age in which we still bask >> Sympy - Ondrej, then Aaron - evolving into group decision making AFAICT >> IPython - Fernando, evolving into group decision making, AFAICT >> Cython - Robert Bradshaw - evolving into ... - you get the idea. > > I don't really want to prolong this thread, but I feel like I should > correct a factual error. Cython started with Robert Bradshaw (and other > Sage members) and Stefan Behnel exchanging patches on top of Pyrex; > there was definitely no leader at that point. Then I came along; there > was no leader at that point either (but I was aware that the two others > had a longer track record of course). > > Robert Bradshaw was declared leader in order to break the tie when I and > Stefan Behnel had argued for a 100-post long thread and could not reach > a conclusion. And at least in this case, we were able to settle on a > leadership structure then, when we needed it, and didn't regret not > doing it earlier. Thanks for correcting the error - sorry for passing on my half-understood knowledge, better history is useful. I am sure not discussing stuff works for some groups, but I very much doubt that it will work for numpy. The masked array discussion was a particularly good example where the attempt to shut down the discussion led to a great deal of wasted time and effort and lots of bad feeling, when some good time spent to hammer out the issues would (probably) have been a much more efficient use of energy. I hope that this time, instead of trying to shut down the conversation as fast as possible, we can have a productive and reasoned discussion about what to do next, in order to make the best possible decision. See you, Matthew From ondrej.certik at gmail.com Sun Dec 23 13:54:49 2012 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Sun, 23 Dec 2012 10:54:49 -0800 Subject: [Numpy-discussion] Many failing doctests - release blocker? Enable for default test runs? In-Reply-To: References: Message-ID: Hi Matthew, On Sat, Dec 22, 2012 at 9:40 PM, Matthew Brett wrote: > Hi, > > I noticed that enabling the doctests on the 1.7.x maintenance branch > caused lots and lots of doctest failures. > > (np-devel)[mb312 at blair ~/dev_trees]$ python -c 'import numpy as np; > np.test(doctests=True)' > 1.7.0rc1.dev-1e8fcdf > Running unit tests and doctests for numpy > NumPy version 1.7.0rc1.dev-1e8fcdf > NumPy is installed in > /Users/mb312/.virtualenvs/np-devel/lib/python2.6/site-packages/numpy > Python version 2.6.6 (r266:84374, Aug 31 2010, 11:00:51) [GCC 4.0.1 > (Apple Inc. build 5493)] > nose version 1.1.2 > ... > Ran 3839 tests in 59.928s > > FAILED (KNOWNFAIL=4, SKIP=4, errors=23, failures=175) > > The doctests also throw up somewhere round 10 matplotlib plots, so > presumably those would fail as well on a machine without a display > without forcing the import of an 'Agg' backend or similar. > > I have never checked the doctests on Python 3. Has anyone run those recently? > > For the projects I work on most, we enable doctests for the default > test run - as in 'doctests=True' by default in the numpy testing > machinery. Do ya'll see any disadvantage in doing that for numpy? 
> > In case someone gets to this before I do, we've also got some logic > for doing conditional skips of doctests when optional packages are not > available such as matplotlib, inspired by something similar in > IPython: > > https://github.com/nipy/nipy/blob/master/nipy/testing/doctester.py#L193 > > If Christmas allows I'll send a pull request with something like that > in the next few days. Thanks for pointing this out. I think in the long term, we should definitely run doctests as part of the test suite on Travis-CI. Because what use is a doctest if it doesn't work? Matthew, do you know if doctests fail for the 1.6 release as well? I am swamped with other bugs for the 1.7 release and since I assume they also fail for 1.6, I want to get the release out as soon as we fix our current issues. However, I think it's a good idea to run doctests automatically on Travis, once they are all fixed. Ondrej From njs at pobox.com Sun Dec 23 14:00:10 2012 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 23 Dec 2012 19:00:10 +0000 Subject: [Numpy-discussion] Future of numpy (was: DARPA funding for Blaze and passing the NumPy torch) In-Reply-To: References: Message-ID: On Sat, Dec 22, 2012 at 5:36 PM, Matthew Brett wrote: > Hi, > > On Thu, Dec 20, 2012 at 5:39 PM, Nathaniel Smith wrote: >> On Thu, Dec 20, 2012 at 11:46 PM, Matthew Brett wrote: >>> Travis - I think you are suggesting that there should be no one >>> person in charge of numpy, and I think this is very unlikely to work >>> well. Perhaps there are good examples of well-led projects where >>> there is not a clear leader, but I can't think of any myself at the >>> moment. My worry would be that, without a clear leader, it will be >>> unclear how decisions are made, and that will make it very hard to >>> take strategic decisions. >> >> Curious; my feeling is the opposite, that among mature and successful >> FOSS projects, having a clear leader is the uncommon case. GCC >> doesn't, Glibc not only has no leader but they recently decided to get >> rid of their formal steering committee, I'm pretty sure git doesn't, >> Apache certainly doesn't, Samba doesn't really, etc. As usual Karl >> Fogel has sensible comments on this: >> http://producingoss.com/en/consensus-democracy.html > > Ah yes - that is curious. My - er - speculation was based on: > > Numpy - Travis golden age in which we still bask > Sympy - Ondrej, then Aaron - evolving into group decision making AFAICT > IPython - Fernando, evolving into group decision making, AFAICT > Cython - Robert Bradshaw - evolving into ... - you get the idea. > > and then reading about businesses particularly Good to Great, Built to > Last, the disaster at HP when they didn't take care about succession. > In general, that reading gave me the impression that successful > organizations take enormous care about succession. I can't think of > any case in the business literature I've read where a successful > leader handed over to a group of three. I think this is just a case of different organizational styles working differently. If organizations were optimisation algorithms, good businesses would be Newton's method, and good FOSS projects would be simulated annealing, or maybe GAs. Slower and less focused, but more robust against noise and local minima, and less susceptible to perturbations. They depend much less on the focused attention of visionary leaders. (Also I wouldn't consider numpy to have a formal "group of three leaders" now just because Travis mentioned three names in his email. 
Leadership is something people do, not something people are, so it's a fuzzy category in the first place.) >> In practice the main job of a successful FOSS leader is to refuse to >> make decisions, nudge people to work things out, and then if they >> refuse to work things out tell them to go away until they do: >> https://lwn.net/Articles/105375/ >> and what actually gives people influence in a project is the respect >> of the other members. The former stuff is stuff anyone can do, and the >> latter isn't something you can confer or take away with a vote. > > Right. My impression is - I'm happy to be corrected with better > information - that the leader of a to-be-successful organization is > very good at encouraging a spirit of free and vigorous debate, strong > opinion, and reasoned decisions - and that may be the main gift they > give to the organization. At that point, usually under that leader's > supervision, the decision making starts diffusing over the group, as > they learn to discuss and make decisions together. > > As I was teaching my niece and nephew to say to their parents in the > car - Daddy - are we there yet? > > If we are not already there, how are we going to get there? > >> Nor do we necessarily have a great track record for executive >> decisions actually working things out. > > No, I agree, the right leader will help form the group well for making > good group decisions. I think. > > In the mean-time - now that there is a change - could I ask - where do > you three see Numpy going in the next five years? What do you see > as the challenges to solve? What are the big risks? What are the big > possibilities? Personally I'd like to see NA support and sparse ndarrays in numpy proper, but I'm not going to have the time to write them myself in the forseeable future... In the long run of course everyone wants a version of numpy+python that can do automatic loop fusion (since that's the core feature for achieving throughput on modern CPUs) without giving up the ability to interface with C code and CPython compatibility. In my dreams the PyPy people will get their act together WRT interfacing with C code, the Cython people will take advantage of this to write a Cython-to-RPython compiler that lets the PyPy optimizer see the internals of Cython-written code, and then we port numpy to Cython and get a single compatible code-base that can run fast on both CPython and PyPy. But who knows what will actually make sense, if anything; as they say, it's very hard to make predictions, especially about the future. And of course the actual long-term strategic plan is "review PRs, merge the good ones". -n From aronne.merrelli at gmail.com Sun Dec 23 14:48:31 2012 From: aronne.merrelli at gmail.com (Aronne Merrelli) Date: Sun, 23 Dec 2012 13:48:31 -0600 Subject: [Numpy-discussion] help with f2py Message-ID: Hi, I'm trying to run f2py and running into some trouble. Starting from http://www.scipy.org/Cookbook/F2Py, and the very simple 'Wrapping Hermite Polynomial' example, I can get the pyf file created with no issues. 
The system I am using is RedHat linux, and has several Fortran compilers: $ f2py -c --help-fcompiler Fortran compilers found: --fcompiler=g95 G95 Fortran Compiler (0.92) --fcompiler=gnu GNU Fortran 77 compiler (3.4.6) --fcompiler=gnu95 GNU Fortran 95 compiler (4.1.2) --fcompiler=intelem Intel Fortran Compiler for 64-bit apps (11.1) All of these will successfully create the .so file except for g95, but when I try to import into python I get this ImportError for any of the other three compilers: In [5]: import hermite ImportError: ./hermite.so: undefined symbol: c06ebf_ If I look at the shared object I find that symbol here: $ nm hermite.so 00000000000043c0 T array_from_pyobj U c06eaf_ U c06ebf_ And that about hits my limit of compiler knowledge, as I am pretty much a novice with these things. Any ideas on what is going wrong here, or suggestions of things to try? Thanks, Aronne -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sun Dec 23 15:53:21 2012 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sun, 23 Dec 2012 21:53:21 +0100 Subject: [Numpy-discussion] Many failing doctests - release blocker? Enable for default test runs? In-Reply-To: References: Message-ID: On Sun, Dec 23, 2012 at 7:54 PM, Ond?ej ?ert?k wrote: > Hi Matthew, > > On Sat, Dec 22, 2012 at 9:40 PM, Matthew Brett > wrote: > > Hi, > > > > I noticed that enabling the doctests on the 1.7.x maintenance branch > > caused lots and lots of doctest failures. > > > > (np-devel)[mb312 at blair ~/dev_trees]$ python -c 'import numpy as np; > > np.test(doctests=True)' > > 1.7.0rc1.dev-1e8fcdf > > Running unit tests and doctests for numpy > > NumPy version 1.7.0rc1.dev-1e8fcdf > > NumPy is installed in > > /Users/mb312/.virtualenvs/np-devel/lib/python2.6/site-packages/numpy > > Python version 2.6.6 (r266:84374, Aug 31 2010, 11:00:51) [GCC 4.0.1 > > (Apple Inc. build 5493)] > > nose version 1.1.2 > > ... > > Ran 3839 tests in 59.928s > > > > FAILED (KNOWNFAIL=4, SKIP=4, errors=23, failures=175) > > > > The doctests also throw up somewhere round 10 matplotlib plots, so > > presumably those would fail as well on a machine without a display > > without forcing the import of an 'Agg' backend or similar. > > > > I have never checked the doctests on Python 3. Has anyone run those > recently? > > > > For the projects I work on most, we enable doctests for the default > > test run - as in 'doctests=True' by default in the numpy testing > > machinery. Do ya'll see any disadvantage in doing that for numpy? > Yes, I do. The doctest framework and reproducibility of reprs across Python versions and platforms are too poor to do this. And failing tests give new users a bad impression of the quality of numpy. I'm +1 on enabling doctests on Travis for one Python version (2.7 probably) in order to reduce the number of out-of-date examples, -1 on default doctests=True. > > > In case someone gets to this before I do, we've also got some logic > > for doing conditional skips of doctests when optional packages are not > > available such as matplotlib, inspired by something similar in > > IPython: > > > > https://github.com/nipy/nipy/blob/master/nipy/testing/doctester.py#L193 > > > > If Christmas allows I'll send a pull request with something like that > > in the next few days. > > Thanks for pointing this out. I think in the long term, we should > definitely > run doctests as part of the test suite on Travis-CI. Because what use > is a doctest if it doesn't work? 
> Since a "doctest" is an example and not a test, still quite useful. > Matthew, do you know if doctests fail for the 1.6 release as well? > > I am swamped with other bugs for the 1.7 release and since I assume > they also fail for 1.6, I want to get the release out as soon as we fix our > current issues. > Agreed that this shouldn't be a release blocker. Ralf > > However, I think it's a good idea to run doctests automatically on Travis, > once they are all fixed. > > Ondrej > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ondrej.certik at gmail.com Sun Dec 23 16:11:16 2012 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Sun, 23 Dec 2012 13:11:16 -0800 Subject: [Numpy-discussion] help with f2py In-Reply-To: References: Message-ID: On Sun, Dec 23, 2012 at 11:48 AM, Aronne Merrelli wrote: > Hi, > > I'm trying to run f2py and running into some trouble. Starting from > http://www.scipy.org/Cookbook/F2Py, and the very simple 'Wrapping Hermite > Polynomial' example, I can get the pyf file created with no issues. The > system I am using is RedHat linux, and has several Fortran compilers: > > $ f2py -c --help-fcompiler > > Fortran compilers found: > --fcompiler=g95 G95 Fortran Compiler (0.92) > --fcompiler=gnu GNU Fortran 77 compiler (3.4.6) > --fcompiler=gnu95 GNU Fortran 95 compiler (4.1.2) > --fcompiler=intelem Intel Fortran Compiler for 64-bit apps (11.1) > > All of these will successfully create the .so file except for g95, but when > I try to import into python I get this ImportError for any of the other > three compilers: > > In [5]: import hermite > ImportError: ./hermite.so: undefined symbol: c06ebf_ > > If I look at the shared object I find that symbol here: > > $ nm hermite.so > > 00000000000043c0 T array_from_pyobj > U c06eaf_ > U c06ebf_ This "U" here means "undefined" (see "man nm"). I don't know if this symbol is something that f2py introduces, or if it is present in the original Fortran sources. One way to know for sure is to examine all .f90 and .c files generated by f2py and search for this symbol and make sure that these subroutines/functions are linked in. > And that about hits my limit of compiler knowledge, as I am pretty much a > novice with these things. Any ideas on what is going wrong here, or > suggestions of things to try? I personally wrap Fortran like any other C code using the iso_c_binding Fortran module. I use Cython, but you can also use ctypes or any other method. See here: http://fortran90.org/src/best-practices.html#interfacing-with-c http://fortran90.org/src/best-practices.html#interfacing-with-python That way it is easy to see what is going on under the hood. Ondrej From matthew.brett at gmail.com Mon Dec 24 03:15:42 2012 From: matthew.brett at gmail.com (Matthew Brett) Date: Mon, 24 Dec 2012 08:15:42 +0000 Subject: [Numpy-discussion] Many failing doctests - release blocker? Enable for default test runs? In-Reply-To: References: Message-ID: Hi, On Sun, Dec 23, 2012 at 8:53 PM, Ralf Gommers wrote: > > > > On Sun, Dec 23, 2012 at 7:54 PM, Ond?ej ?ert?k > wrote: >> >> Hi Matthew, >> >> On Sat, Dec 22, 2012 at 9:40 PM, Matthew Brett >> wrote: >> > Hi, >> > >> > I noticed that enabling the doctests on the 1.7.x maintenance branch >> > caused lots and lots of doctest failures. 
>> > >> > (np-devel)[mb312 at blair ~/dev_trees]$ python -c 'import numpy as np; >> > np.test(doctests=True)' >> > 1.7.0rc1.dev-1e8fcdf >> > Running unit tests and doctests for numpy >> > NumPy version 1.7.0rc1.dev-1e8fcdf >> > NumPy is installed in >> > /Users/mb312/.virtualenvs/np-devel/lib/python2.6/site-packages/numpy >> > Python version 2.6.6 (r266:84374, Aug 31 2010, 11:00:51) [GCC 4.0.1 >> > (Apple Inc. build 5493)] >> > nose version 1.1.2 >> > ... >> > Ran 3839 tests in 59.928s >> > >> > FAILED (KNOWNFAIL=4, SKIP=4, errors=23, failures=175) >> > >> > The doctests also throw up somewhere round 10 matplotlib plots, so >> > presumably those would fail as well on a machine without a display >> > without forcing the import of an 'Agg' backend or similar. >> > >> > I have never checked the doctests on Python 3. Has anyone run those >> > recently? >> > >> > For the projects I work on most, we enable doctests for the default >> > test run - as in 'doctests=True' by default in the numpy testing >> > machinery. Do ya'll see any disadvantage in doing that for numpy? > > > Yes, I do. The doctest framework and reproducibility of reprs across Python > versions and platforms are too poor to do this. And failing tests give new > users a bad impression of the quality of numpy. I believe the repr problems are fairly easily soluble by using minor extensions to the current numpy doctest machinery. I think I was the last person to do big modifications to that bit of the numpy codebase and I've been using small tweaks to that framework to run cross version and cross platform doctest runs by default for a while on lots of numpy stuff in nipy: https://github.com/nipy/nipy/blob/master/nipy/testing/doctester.py#L155 Cheers, Matthew From ralf.gommers at gmail.com Mon Dec 24 04:05:24 2012 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Mon, 24 Dec 2012 10:05:24 +0100 Subject: [Numpy-discussion] Many failing doctests - release blocker? Enable for default test runs? In-Reply-To: References: Message-ID: On Mon, Dec 24, 2012 at 9:15 AM, Matthew Brett wrote: > Hi, > > On Sun, Dec 23, 2012 at 8:53 PM, Ralf Gommers > wrote: > > > > > > > > On Sun, Dec 23, 2012 at 7:54 PM, Ond?ej ?ert?k > > wrote: > >> > >> Hi Matthew, > >> > >> On Sat, Dec 22, 2012 at 9:40 PM, Matthew Brett > > >> wrote: > >> > Hi, > >> > > >> > I noticed that enabling the doctests on the 1.7.x maintenance branch > >> > caused lots and lots of doctest failures. > >> > > >> > (np-devel)[mb312 at blair ~/dev_trees]$ python -c 'import numpy as np; > >> > np.test(doctests=True)' > >> > 1.7.0rc1.dev-1e8fcdf > >> > Running unit tests and doctests for numpy > >> > NumPy version 1.7.0rc1.dev-1e8fcdf > >> > NumPy is installed in > >> > /Users/mb312/.virtualenvs/np-devel/lib/python2.6/site-packages/numpy > >> > Python version 2.6.6 (r266:84374, Aug 31 2010, 11:00:51) [GCC 4.0.1 > >> > (Apple Inc. build 5493)] > >> > nose version 1.1.2 > >> > ... > >> > Ran 3839 tests in 59.928s > >> > > >> > FAILED (KNOWNFAIL=4, SKIP=4, errors=23, failures=175) > >> > > >> > The doctests also throw up somewhere round 10 matplotlib plots, so > >> > presumably those would fail as well on a machine without a display > >> > without forcing the import of an 'Agg' backend or similar. > >> > > >> > I have never checked the doctests on Python 3. Has anyone run those > >> > recently? > >> > > >> > For the projects I work on most, we enable doctests for the default > >> > test run - as in 'doctests=True' by default in the numpy testing > >> > machinery. 
Do ya'll see any disadvantage in doing that for numpy? > > > > > > Yes, I do. The doctest framework and reproducibility of reprs across > Python > > versions and platforms are too poor to do this. And failing tests give > new > > users a bad impression of the quality of numpy. > > I believe the repr problems are fairly easily soluble by using minor > extensions to the current numpy doctest machinery. I think I was the > last person to do big modifications to that bit of the numpy codebase > and I've been using small tweaks to that framework to run cross > version and cross platform doctest runs by default for a while on lots > of numpy stuff in nipy: > > https://github.com/nipy/nipy/blob/master/nipy/testing/doctester.py#L155 > My experience is different, but I'm happy to be proven wrong. Let's first see it running on all Python versions on Travis without issues for a while, then consider turning it on by default. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Mon Dec 24 04:24:38 2012 From: matthew.brett at gmail.com (Matthew Brett) Date: Mon, 24 Dec 2012 09:24:38 +0000 Subject: [Numpy-discussion] Future of numpy (was: DARPA funding for Blaze and passing the NumPy torch) In-Reply-To: References: Message-ID: Hi, On Sun, Dec 23, 2012 at 7:00 PM, Nathaniel Smith wrote: > On Sat, Dec 22, 2012 at 5:36 PM, Matthew Brett wrote: >> Hi, >> >> On Thu, Dec 20, 2012 at 5:39 PM, Nathaniel Smith wrote: >>> On Thu, Dec 20, 2012 at 11:46 PM, Matthew Brett wrote: >>>> Travis - I think you are suggesting that there should be no one >>>> person in charge of numpy, and I think this is very unlikely to work >>>> well. Perhaps there are good examples of well-led projects where >>>> there is not a clear leader, but I can't think of any myself at the >>>> moment. My worry would be that, without a clear leader, it will be >>>> unclear how decisions are made, and that will make it very hard to >>>> take strategic decisions. >>> >>> Curious; my feeling is the opposite, that among mature and successful >>> FOSS projects, having a clear leader is the uncommon case. GCC >>> doesn't, Glibc not only has no leader but they recently decided to get >>> rid of their formal steering committee, I'm pretty sure git doesn't, >>> Apache certainly doesn't, Samba doesn't really, etc. As usual Karl >>> Fogel has sensible comments on this: >>> http://producingoss.com/en/consensus-democracy.html >> >> Ah yes - that is curious. My - er - speculation was based on: >> >> Numpy - Travis golden age in which we still bask >> Sympy - Ondrej, then Aaron - evolving into group decision making AFAICT >> IPython - Fernando, evolving into group decision making, AFAICT >> Cython - Robert Bradshaw - evolving into ... - you get the idea. >> >> and then reading about businesses particularly Good to Great, Built to >> Last, the disaster at HP when they didn't take care about succession. >> In general, that reading gave me the impression that successful >> organizations take enormous care about succession. I can't think of >> any case in the business literature I've read where a successful >> leader handed over to a group of three. > > I think this is just a case of different organizational styles working > differently. If organizations were optimisation algorithms, good > businesses would be Newton's method, and good FOSS projects would be > simulated annealing, or maybe GAs. 
Slower and less focused, but more > robust against noise and local minima, and less susceptible to > perturbations. They depend much less on the focused attention of > visionary leaders. You seem to be implying that any management organization for numpy would have the same effect as any other as far as we know, and if that were true, it would certainly not be worth discussing in any detail. But I doubt very much that is true, and, following the optimization strategy logic, there may be a reason that having 3 people lead an organization has not been a common and visible option, and that is that it doesn't work very well. > (Also I wouldn't consider numpy to have a formal "group of three > leaders" now just because Travis mentioned three names in his email. > Leadership is something people do, not something people are, so it's a > fuzzy category in the first place.) In Travis' email I saw the three of you and a 2 to 1 voting suggestion. There doesn't seem to be much appetite for discussing alternatives and neither Chuck nor Ralf have joined this discussion, so that seems to be the only option on the table. Is there another one? Or are you thinking that something will gradually evolve from - essentially - no prescription of how things work. Again, I doubt very much that "no prescription" has been a successful option in the past, even for FOSS. It just never happens in businesses or successful governments - does it? The problem is obviously going to be what to do when we get problems, as we have in the past. When there is no or little structure to use, then decisions don't get made, or get made by default, and the debate deteriorates because there is no-one to moderate it. That seems to me the situation numpy is in, and doing nothing or having poorly defined management can only prolong that or even make it worse. >>> In practice the main job of a successful FOSS leader is to refuse to >>> make decisions, nudge people to work things out, and then if they >>> refuse to work things out tell them to go away until they do: >>> https://lwn.net/Articles/105375/ >>> and what actually gives people influence in a project is the respect >>> of the other members. The former stuff is stuff anyone can do, and the >>> latter isn't something you can confer or take away with a vote. >> >> Right. My impression is - I'm happy to be corrected with better >> information - that the leader of a to-be-successful organization is >> very good at encouraging a spirit of free and vigorous debate, strong >> opinion, and reasoned decisions - and that may be the main gift they >> give to the organization. At that point, usually under that leader's >> supervision, the decision making starts diffusing over the group, as >> they learn to discuss and make decisions together. >> >> As I was teaching my niece and nephew to say to their parents in the >> car - Daddy - are we there yet? >> >> If we are not already there, how are we going to get there? For example - you didn't answer this one. What are your thoughts? >>> Nor do we necessarily have a great track record for executive >>> decisions actually working things out. >> >> No, I agree, the right leader will help form the group well for making >> good group decisions. I think. >> >> In the mean-time - now that there is a change - could I ask - where do >> you three see Numpy going in the next five years? What do you see >> as the challenges to solve? What are the big risks? What are the big >> possibilities? 
> > Personally I'd like to see NA support and sparse ndarrays in numpy > proper, but I'm not going to have the time to write them myself in the > forseeable future... > > In the long run of course everyone wants a version of numpy+python > that can do automatic loop fusion (since that's the core feature for > achieving throughput on modern CPUs) without giving up the ability to > interface with C code and CPython compatibility. In my dreams the PyPy > people will get their act together WRT interfacing with C code, the > Cython people will take advantage of this to write a Cython-to-RPython > compiler that lets the PyPy optimizer see the internals of > Cython-written code, and then we port numpy to Cython and get a single > compatible code-base that can run fast on both CPython and PyPy. But > who knows what will actually make sense, if anything; as they say, > it's very hard to make predictions, especially about the future. > > And of course the actual long-term strategic plan is "review PRs, > merge the good ones". The contrast between these last two paragraphs is strong. The first is a vision for how numpy might be - to coin a phrase - "the next generation of numpy". It seems exciting and interesting, but you don't hold out much hope of getting there, and having not-much-management makes it less likely we will have any real new direction to the project. That's your second paragraph - keep on keeping on. In effect it condemns numpy to be a slow-moving development effort waiting for something more interesting to overtake it. Is that really necessary? Can we not hope for better? Is there any real chance of attracting a force of new developers in that situation? Best, Matthew From matthew.brett at gmail.com Mon Dec 24 13:08:03 2012 From: matthew.brett at gmail.com (Matthew Brett) Date: Mon, 24 Dec 2012 18:08:03 +0000 Subject: [Numpy-discussion] Many failing doctests - release blocker? Enable for default test runs? In-Reply-To: References: Message-ID: Hi, On Sun, Dec 23, 2012 at 6:54 PM, Ond?ej ?ert?k wrote: > Hi Matthew, > > On Sat, Dec 22, 2012 at 9:40 PM, Matthew Brett wrote: >> Hi, >> >> I noticed that enabling the doctests on the 1.7.x maintenance branch >> caused lots and lots of doctest failures. >> >> (np-devel)[mb312 at blair ~/dev_trees]$ python -c 'import numpy as np; >> np.test(doctests=True)' >> 1.7.0rc1.dev-1e8fcdf >> Running unit tests and doctests for numpy >> NumPy version 1.7.0rc1.dev-1e8fcdf >> NumPy is installed in >> /Users/mb312/.virtualenvs/np-devel/lib/python2.6/site-packages/numpy >> Python version 2.6.6 (r266:84374, Aug 31 2010, 11:00:51) [GCC 4.0.1 >> (Apple Inc. build 5493)] >> nose version 1.1.2 >> ... >> Ran 3839 tests in 59.928s >> >> FAILED (KNOWNFAIL=4, SKIP=4, errors=23, failures=175) >> >> The doctests also throw up somewhere round 10 matplotlib plots, so >> presumably those would fail as well on a machine without a display >> without forcing the import of an 'Agg' backend or similar. >> >> I have never checked the doctests on Python 3. Has anyone run those recently? >> >> For the projects I work on most, we enable doctests for the default >> test run - as in 'doctests=True' by default in the numpy testing >> machinery. Do ya'll see any disadvantage in doing that for numpy? 
>> >> In case someone gets to this before I do, we've also got some logic >> for doing conditional skips of doctests when optional packages are not >> available such as matplotlib, inspired by something similar in >> IPython: >> >> https://github.com/nipy/nipy/blob/master/nipy/testing/doctester.py#L193 >> >> If Christmas allows I'll send a pull request with something like that >> in the next few days. > > Thanks for pointing this out. I think in the long term, we should definitely > run doctests as part of the test suite on Travis-CI. Because what use > is a doctest if it doesn't work? > > Matthew, do you know if doctests fail for the 1.6 release as well? On 1.6.2: FAILED (KNOWNFAIL=5, SKIP=3, errors=43, failures=167) On Python 3.2, current 1.7.x maintenance: FAILED (KNOWNFAIL=5, SKIP=4, errors=24, failures=211) The last time I looked I had the impression that we were not doing 2to3 conversion on doctests, but that was a while ago. See you, Matthew From eric.emsellem at eso.org Wed Dec 26 04:09:42 2012 From: eric.emsellem at eso.org (Eric Emsellem) Date: Wed, 26 Dec 2012 10:09:42 +0100 Subject: [Numpy-discussion] Efficient way of binning points and applying functions to these groups Message-ID: <50DABED6.6020706@eso.org> Hi! I am looking for an efficient way of doing some simple binning of points and then applying some functions to points within each bin. I have tried several ways, including crude looping over the indices, or using digitize (see below) but I cannot manage to get it as efficient as I need it to be. I have a comparison with a similar (although complex) code in idl, and thought I would ask the forum. In idl there is a way to "invert" an histogram and get a reverse set of indices (via the histogram function) which seems to do the trick (or maybe it is faster for another reason). Below I provide a (dummy) example of what I wish to achieve. Any hint on how to do this EFFICIENTLY using numpy is most welcome. I need to speed things up quite a bit (at the moment, the version I have, see below, is 10 times slower than the more complex idl routine..., I must be doing something wrong!). thanks!! Eric ======================================================== # I have a random set of data points in 2D with coordinates x,y : import numpy as np x = np.random.random(1000) y = np.random.random(1000) # I have now a 2D grid given by let's say 10x10 grid points: nx = 11 ny = 21 lx = linspace(0,1,nx) ly = linspace(0,1,ny) gx, gy = np.meshgrid(lx, ly) # So my set of 2D bins are (not needed in the solution I present but just for clarity) bins = np.dstack((gx.ravel(), gy.ravel()))[0] # Now I want to have the list of points in each bin and # if the number of points in that bin is larger than 10, apply (dummy) function func1 (see below) # If less than 10, apply (dummy) function func2 so (dum?) 
# if 0, do nothing # for two dummy functions like (for example): def func1(x) : return x.mean() def func2(x) : return x.std() # One solution would be to use digitize in 1D and histogram in 2D (don't need gx, gy for this one): h = histogram2d(x, y, bins=[lx, ly])[0] digitX = np.digitize(x, lx) digitY = np.digitize(y, ly) # create the output array, with -999 values to make sure I see which ones are not filled in result = np.zeros_like(h) - 999 for i in range(nx-1) : for j in range(ny-1) : selectionofpoints = (digitX == i+1) & (digitY == j+1) if h[i,j] > 10 : result[i,j] = func1(x[selectionofpoints]) elif h[i,j] > 0 : result[i,j] = func2(x[selectionofpoints]) From davidmenhur at gmail.com Wed Dec 26 05:21:10 2012 From: davidmenhur at gmail.com (=?UTF-8?B?RGHPgGlk?=) Date: Wed, 26 Dec 2012 11:21:10 +0100 Subject: [Numpy-discussion] Efficient way of binning points and applying functions to these groups In-Reply-To: <50DABED6.6020706@eso.org> References: <50DABED6.6020706@eso.org> Message-ID: This looks like the perfect work for cython. It it's great opp optimizing loops. Another option is the new Numba, an automatic compiler. David. El 26/12/2012 10:09, "Eric Emsellem" escribi?: > Hi! > > I am looking for an efficient way of doing some simple binning of points > and then applying some functions to points within each bin. > > I have tried several ways, including crude looping over the indices, or > using digitize (see below) but I cannot manage to get it as efficient as > I need it to be. I have a comparison with a similar (although complex) > code in idl, and thought I would ask the forum. In idl there is a way to > "invert" an histogram and get a reverse set of indices (via the > histogram function) which seems to do the trick (or maybe it is faster > for another reason). > > Below I provide a (dummy) example of what I wish to achieve. Any hint on > how to do this EFFICIENTLY using numpy is most welcome. I need to speed > things up quite a bit (at the moment, the version I have, see below, is > 10 times slower than the more complex idl routine..., I must be doing > something wrong!). > > thanks!! > Eric > ======================================================== > # I have a random set of data points in 2D with coordinates x,y : > import numpy as np > x = np.random.random(1000) > y = np.random.random(1000) > > # I have now a 2D grid given by let's say 10x10 grid points: > nx = 11 > ny = 21 > lx = linspace(0,1,nx) > ly = linspace(0,1,ny) > gx, gy = np.meshgrid(lx, ly) > > # So my set of 2D bins are (not needed in the solution I present but > just for clarity) > bins = np.dstack((gx.ravel(), gy.ravel()))[0] > > # Now I want to have the list of points in each bin and > # if the number of points in that bin is larger than 10, apply (dummy) > function func1 (see below) > # If less than 10, apply (dummy) function func2 so (dum?) 
> # if 0, do nothing > # for two dummy functions like (for example): > def func1(x) : return x.mean() > > def func2(x) : return x.std() > > # One solution would be to use digitize in 1D and histogram in 2D (don't > need gx, gy for this one): > > h = histogram2d(x, y, bins=[lx, ly])[0] > > digitX = np.digitize(x, lx) > digitY = np.digitize(y, ly) > > # create the output array, with -999 values to make sure I see which > ones are not filled in > result = np.zeros_like(h) - 999 > > for i in range(nx-1) : > for j in range(ny-1) : > selectionofpoints = (digitX == i+1) & (digitY == j+1) > if h[i,j] > 10 : result[i,j] = func1(x[selectionofpoints]) > elif h[i,j] > 0 : result[i,j] = func2(x[selectionofpoints]) > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Wed Dec 26 07:33:58 2012 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Wed, 26 Dec 2012 13:33:58 +0100 Subject: [Numpy-discussion] Efficient way of binning points and applying functions to these groups In-Reply-To: <50DABED6.6020706@eso.org> References: <50DABED6.6020706@eso.org> Message-ID: On Wed, Dec 26, 2012 at 10:09 AM, Eric Emsellem wrote: > Hi! > > I am looking for an efficient way of doing some simple binning of points > and then applying some functions to points within each bin. > That's exactly what scipy.stats.binned_statistic does: http://docs.scipy.org/doc/scipy-dev/reference/generated/scipy.stats.binned_statistic.html binned_statistic uses np.digitize as well, but I'm not sure that in your code below digitize is the bottleneck - the nested for-loop looks like the more likely suspect. Ralf > > I have tried several ways, including crude looping over the indices, or > using digitize (see below) but I cannot manage to get it as efficient as > I need it to be. I have a comparison with a similar (although complex) > code in idl, and thought I would ask the forum. In idl there is a way to > "invert" an histogram and get a reverse set of indices (via the > histogram function) which seems to do the trick (or maybe it is faster > for another reason). > > Below I provide a (dummy) example of what I wish to achieve. Any hint on > how to do this EFFICIENTLY using numpy is most welcome. I need to speed > things up quite a bit (at the moment, the version I have, see below, is > 10 times slower than the more complex idl routine..., I must be doing > something wrong!). > > thanks!! > Eric > ======================================================== > # I have a random set of data points in 2D with coordinates x,y : > import numpy as np > x = np.random.random(1000) > y = np.random.random(1000) > > # I have now a 2D grid given by let's say 10x10 grid points: > nx = 11 > ny = 21 > lx = linspace(0,1,nx) > ly = linspace(0,1,ny) > gx, gy = np.meshgrid(lx, ly) > > # So my set of 2D bins are (not needed in the solution I present but > just for clarity) > bins = np.dstack((gx.ravel(), gy.ravel()))[0] > > # Now I want to have the list of points in each bin and > # if the number of points in that bin is larger than 10, apply (dummy) > function func1 (see below) > # If less than 10, apply (dummy) function func2 so (dum?) 
> # if 0, do nothing > # for two dummy functions like (for example): > def func1(x) : return x.mean() > > def func2(x) : return x.std() > > # One solution would be to use digitize in 1D and histogram in 2D (don't > need gx, gy for this one): > > h = histogram2d(x, y, bins=[lx, ly])[0] > > digitX = np.digitize(x, lx) > digitY = np.digitize(y, ly) > > # create the output array, with -999 values to make sure I see which > ones are not filled in > result = np.zeros_like(h) - 999 > > for i in range(nx-1) : > for j in range(ny-1) : > selectionofpoints = (digitX == i+1) & (digitY == j+1) > if h[i,j] > 10 : result[i,j] = func1(x[selectionofpoints]) > elif h[i,j] > 0 : result[i,j] = func2(x[selectionofpoints]) > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Nicolas.Rougier at inria.fr Wed Dec 26 15:09:23 2012 From: Nicolas.Rougier at inria.fr (Nicolas Rougier) Date: Wed, 26 Dec 2012 21:09:23 +0100 Subject: [Numpy-discussion] dtype "reduction" Message-ID: <88940E94-C5A2-4CBB-8D44-554193B9CF05@inria.fr> Hi all, I'm looking for a way to "reduce" dtype1 into dtype2 (when it is possible of course). Is there some easy way to do that by any chance ? dtype1 = np.dtype( [ ('vertex', [('x', 'f4'), ('y', 'f4'), ('z', 'f4')]), ('normal', [('x', 'f4'), ('y', 'f4'), ('z', 'f4')]), ('color', [('r', 'f4'), ('g', 'f4'), ('b', 'f4'), ('a', 'f4')]) ] ) dtype2 = np.dtype( [ ('vertex', 'f4', 3), ('normal', 'f4', 3), ('color', 'f4', 4)] ) Nicolas From chaoyuejoy at gmail.com Wed Dec 26 18:23:13 2012 From: chaoyuejoy at gmail.com (Chao YUE) Date: Thu, 27 Dec 2012 00:23:13 +0100 Subject: [Numpy-discussion] numpy.testing.asserts and masked array Message-ID: Dear all, I found here http://mail.scipy.org/pipermail/numpy-discussion/2009-January/039681.html that to use* numpy.ma.testutils.assert_almost_equal* for masked array assertion, but I cannot find the np.ma.testutils module? Am I getting somewhere wrong? my numpy version is 1.6.2 thanks! Chao -- *********************************************************************************** Chao YUE Laboratoire des Sciences du Climat et de l'Environnement (LSCE-IPSL) UMR 1572 CEA-CNRS-UVSQ Batiment 712 - Pe 119 91191 GIF Sur YVETTE Cedex Tel: (33) 01 69 08 29 02; Fax:01.69.08.77.16 ************************************************************************************ -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Wed Dec 26 19:32:37 2012 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 27 Dec 2012 00:32:37 +0000 Subject: [Numpy-discussion] dtype "reduction" In-Reply-To: <88940E94-C5A2-4CBB-8D44-554193B9CF05@inria.fr> References: <88940E94-C5A2-4CBB-8D44-554193B9CF05@inria.fr> Message-ID: On Wed, Dec 26, 2012 at 8:09 PM, Nicolas Rougier wrote: > > > Hi all, > > > I'm looking for a way to "reduce" dtype1 into dtype2 (when it is possible of course). > Is there some easy way to do that by any chance ? 
> > > dtype1 = np.dtype( [ ('vertex', [('x', 'f4'), > ('y', 'f4'), > ('z', 'f4')]), > ('normal', [('x', 'f4'), > ('y', 'f4'), > ('z', 'f4')]), > ('color', [('r', 'f4'), > ('g', 'f4'), > ('b', 'f4'), > ('a', 'f4')]) ] ) > > dtype2 = np.dtype( [ ('vertex', 'f4', 3), > ('normal', 'f4', 3), > ('color', 'f4', 4)] ) > If you have an array whose dtype is dtype1, and you want to convert it into an array with dtype2, then you just do my_dtype2_array = my_dtype1_array.view(dtype2) If you have dtype1 and you want to programmaticaly construct dtype2, then that's a little more fiddly and depends on what exactly you're trying to do, but start by poking around with dtype1.names and dtype1.fields, which contain information on how dtype1 is put together in the form of regular python structures. -n From eric.emsellem at eso.org Thu Dec 27 02:25:48 2012 From: eric.emsellem at eso.org (Eric Emsellem) Date: Thu, 27 Dec 2012 08:25:48 +0100 Subject: [Numpy-discussion] Efficient way of binning points and, applying functions to these groups Message-ID: <50DBF7FC.5050209@eso.org> Thanks Ralf! this module looks great in fact. I didn't know it existed, and in fact It is only available in Scipy 0.11.0 (had to install from source since an Ubuntu 12.04 bin is not available). Too bad that the User-defined function only accepts one single array. If that function should take more input you need to rely on a trick to basically duplicate the input coordinates and concatenate the input arrays you need. But apart from the fact that the programme looks much cleaner now, I just tested it and it is rather SLOW in fact. Since I have to repeat this 2 or 3 times (I have in fact 3 functions to apply), it takes about 20 seconds for a full test, while with the changes I made (see below) it takes now 3 seconds or so [I am talking about the real code, not the example I give below]. So I managed to speed up things a bit by doing two things: - keeping the first loop but replacing the second one with a loop ONLY on bins which contains the right number of points - and more importantly not addressing the full array at each loop iteration but first selecting the right points (to reduce the size). So something as shown below. it is still a factor of 2 slower than the idl routine and I have no clue why. I will analyse it further. The idl routine has similar loops etc, so there is no reason for this. if anybody has an idea .... THANKS (using cython is a bit too much on my side - being a low-profile python developer. As for numba, will have a look) and thanks again for your help! Eric ======================================================= import numpy as np x = np.random.random(1000) y = np.random.random(1000) data = np.random.random(1000) # I have now a 2D grid given by let's say 10x10 grid points: nx = 11 ny = 21 lx = linspace(0,1,nx) ly = linspace(0,1,ny) gx, gy = np.meshgrid(lx, ly) # So my set of 2D bins are (not needed in the solution I present but just for clarity) bins = np.dstack((gx.ravel(), gy.ravel()))[0] # Now I want to have the list of points in each bin and # if the number of points in that bin is larger than 10, apply (dummy) function func1 (see below) # If less than 10, apply (dummy) function func2 so (dum?) 
# if 0, do nothing # for two dummy functions like (for example): def func1(x) : return x.mean() def func2(x) : return x.std() h = histogram2d(x, y, bins=[lx, ly])[0] digitX = np.digitize(x, lx) digitY = np.digitize(y, ly) # create the output array, with -999 values to make sure I see which ones are not filled in result = np.zeros_like(h) - 999 for i in range(nx-1) : selX = (digitX == i+1) dataX = data[selX] selH10 = np.where(h > 10) selH0 = np.where((h > 0) & (h <= 10)) for j in selH10 : selectionofpoints = (digitY == j+1) result[i,j] = func1(data[selectionofpoints]) for j in selH0 : selectionofpoints = (digitY == j+1) result[i,j] = func2(data[selectionofpoints]) > Hi! > > I am looking for an efficient way of doing some simple binning of points > and then applying some functions to points within each bin. > That's exactly what scipy.stats.binned_statistic does: http://docs.scipy.org/doc/scipy-dev/reference/generated/scipy.stats.binned_statistic.html binned_statistic uses np.digitize as well, but I'm not sure that in your code below digitize is the bottleneck - the nested for-loop looks like the more likely suspect. Ralf From Nicolas.Rougier at inria.fr Thu Dec 27 03:11:05 2012 From: Nicolas.Rougier at inria.fr (Nicolas Rougier) Date: Thu, 27 Dec 2012 09:11:05 +0100 Subject: [Numpy-discussion] dtype "reduction" In-Reply-To: References: <88940E94-C5A2-4CBB-8D44-554193B9CF05@inria.fr> Message-ID: Yep, I'm trying to construct dtype2 programmaticaly and was hoping for some function giving me a "canonical" expression of the dtype. I've started playing with fields but it's just a bit harder than I though (lot of different cases and recursion). Thanks for the answer. Nicolas On Dec 27, 2012, at 1:32 , Nathaniel Smith wrote: > On Wed, Dec 26, 2012 at 8:09 PM, Nicolas Rougier > wrote: >> >> >> Hi all, >> >> >> I'm looking for a way to "reduce" dtype1 into dtype2 (when it is possible of course). >> Is there some easy way to do that by any chance ? >> >> >> dtype1 = np.dtype( [ ('vertex', [('x', 'f4'), >> ('y', 'f4'), >> ('z', 'f4')]), >> ('normal', [('x', 'f4'), >> ('y', 'f4'), >> ('z', 'f4')]), >> ('color', [('r', 'f4'), >> ('g', 'f4'), >> ('b', 'f4'), >> ('a', 'f4')]) ] ) >> >> dtype2 = np.dtype( [ ('vertex', 'f4', 3), >> ('normal', 'f4', 3), >> ('color', 'f4', 4)] ) >> > > If you have an array whose dtype is dtype1, and you want to convert it > into an array with dtype2, then you just do > my_dtype2_array = my_dtype1_array.view(dtype2) > > If you have dtype1 and you want to programmaticaly construct dtype2, > then that's a little more fiddly and depends on what exactly you're > trying to do, but start by poking around with dtype1.names and > dtype1.fields, which contain information on how dtype1 is put together > in the form of regular python structures. > > -n > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From Nikolaus at rath.org Thu Dec 27 11:44:12 2012 From: Nikolaus at rath.org (Nikolaus Rath) Date: Thu, 27 Dec 2012 08:44:12 -0800 Subject: [Numpy-discussion] Pre-allocate array Message-ID: <50DC7ADC.4090305@rath.org> Hello, I have an array that I know will need to grow to X elements. However, I will need to work with it before it's completely filled. 
I see two ways of doing this: bigarray = np.empty(X) current_size = 0 for i in something: buf = produce_data(i) bigarray[current_size:current_size+len(buf)] = buf current_size += len(buf) # Do things with bigarray[:current_size] This avoids having to allocate new buffers and copying data around, but I have to separately manage the current array size. Alternatively, I could do bigarray = np.empty(0) current_size = 0 for i in something: buf = produce_data(i) bigarray.resize(len(bigarray)+len(buf)) bigarray[-len(buf):] = buf # Do things with bigarray this is much more elegant, but the resize() calls may have to copy data around. Is there any way to tell numpy to allocate all the required memory while using only a part of it for the array? Something like: bigarray = np.empty(50, will_grow_to=X) bigarray.resize(X) # Guaranteed to work without copying stuff around Thanks, -Nikolaus From chris.barker at noaa.gov Thu Dec 27 12:40:46 2012 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Thu, 27 Dec 2012 09:40:46 -0800 Subject: [Numpy-discussion] Pre-allocate array In-Reply-To: <50DC7ADC.4090305@rath.org> References: <50DC7ADC.4090305@rath.org> Message-ID: On Thu, Dec 27, 2012 at 8:44 AM, Nikolaus Rath wrote: > I have an array that I know will need to grow to X elements. However, I > will need to work with it before it's completely filled. what sort of "work with it" do you mean? -- resize() is dangerous if there are any other views on the data block... > bigarray = np.empty(X) > current_size = 0 > for i in something: > buf = produce_data(i) > bigarray[current_size:current_size+len(buf)] = buf > current_size += len(buf) > # Do things with bigarray[:current_size] > > This avoids having to allocate new buffers and copying data around, but > I have to separately manage the current array size. yup -- but not a bad option, really. > Alternatively, I > could do > > bigarray = np.empty(0) > current_size = 0 > for i in something: > buf = produce_data(i) > bigarray.resize(len(bigarray)+len(buf)) > bigarray[-len(buf):] = buf > # Do things with bigarray > > this is much more elegant, but the resize() calls may have to copy data > around. Yes, they will -- but whether that's a problem or not depends on your use-case. If you are adding elements one-by-one, the re-allocatiing and copying of memory could be a big overhead. But if buf is not that "small", then the overhead gets lost in teh wash. Yopu'd have to profile to be sure, but I found that if, in this case, "buf" is on order of larger than 1/16 of the size of bigarray, you'll not see it (vague memory...) > Is there any way to tell numpy to allocate all the required memory while > using only a part of it for the array? Something like: > > bigarray = np.empty(50, will_grow_to=X) > bigarray.resize(X) # Guaranteed to work without copying stuff around no -- though you could probably fudge it by messing with the strides -- though you'd need to either keep track of how much memory was originally allocated, or how much is currently used yourself, like you did above. NOTE: I've written a couple of "growable array" classes for just this problem. One in pure Python, and one in Cython that isn't quite finished. I've enclosed the pure python one, let me know if your interested in the Cython version (it may need some work to b fully functional). -Chris -- Christopher Barker, Ph.D. 
Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- A non-text attachment was scrubbed... Name: accumulator.py Type: application/octet-stream Size: 4171 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: test_accumulator.py Type: application/octet-stream Size: 5154 bytes Desc: not available URL: From otrov at hush.ai Thu Dec 27 15:20:39 2012 From: otrov at hush.ai (deb) Date: Thu, 27 Dec 2012 21:20:39 +0100 Subject: [Numpy-discussion] Manipulate neighboring points in 2D array Message-ID: <20121227202040.24578E6726@smtp.hushmail.com> Hi, I have 2D array, let's say: `np.random.random((100,100))` and I want to do simple manipulation on each point neighbors, like divide their values by 3. So for each array value, x, and it neighbors n: n n n n/3 n/3 n/3 n x n -> n/3 x n/3 n n n n/3 n/3 n/3 I searched a bit, and found about scipy ndimage filters, but if I'm not wrong, there is no such function. Of course me being wrong is quite possible, as I did not comprehend whole ndimage module, but I tried generic filter for example and browser other functions. Is there better way to make above manipulation, instead using for loop over every array element? TIA -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Thu Dec 27 16:35:23 2012 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Thu, 27 Dec 2012 22:35:23 +0100 Subject: [Numpy-discussion] numpy.testing.asserts and masked array In-Reply-To: References: Message-ID: On Thu, Dec 27, 2012 at 12:23 AM, Chao YUE wrote: > Dear all, > > I found here > http://mail.scipy.org/pipermail/numpy-discussion/2009-January/039681.html > that to use* numpy.ma.testutils.assert_almost_equal* for masked array > assertion, but I cannot find the np.ma.testutils module? > Am I getting somewhere wrong? my numpy version is 1.6.2 thanks! "from numpy.ma import testutils" works for me. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From chaoyuejoy at gmail.com Thu Dec 27 17:15:41 2012 From: chaoyuejoy at gmail.com (Chao YUE) Date: Thu, 27 Dec 2012 23:15:41 +0100 Subject: [Numpy-discussion] numpy.testing.asserts and masked array In-Reply-To: References: Message-ID: Thanks. I tried again, it works. On Thu, Dec 27, 2012 at 10:35 PM, Ralf Gommers wrote: > from numpy.ma import testutils > -- *********************************************************************************** Chao YUE Laboratoire des Sciences du Climat et de l'Environnement (LSCE-IPSL) UMR 1572 CEA-CNRS-UVSQ Batiment 712 - Pe 119 91191 GIF Sur YVETTE Cedex Tel: (33) 01 69 08 29 02; Fax:01.69.08.77.16 ************************************************************************************ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From zachary.pincus at yale.edu Thu Dec 27 18:28:50 2012 From: zachary.pincus at yale.edu (Zachary Pincus) Date: Thu, 27 Dec 2012 16:28:50 -0700 Subject: [Numpy-discussion] Manipulate neighboring points in 2D array In-Reply-To: <20121227202040.24578E6726@smtp.hushmail.com> References: <20121227202040.24578E6726@smtp.hushmail.com> Message-ID: <75B8BBD4-7E3B-461A-92B4-A355EB1A6D65@yale.edu> > I have 2D array, let's say: `np.random.random((100,100))` and I want to do simple manipulation on each point neighbors, like divide their values by 3. > > So for each array value, x, and it neighbors n: > > n n n n/3 n/3 n/3 > n x n -> n/3 x n/3 > n n n n/3 n/3 n/3 > > I searched a bit, and found about scipy ndimage filters, but if I'm not wrong, there is no such function. Of course me being wrong is quite possible, as I did not comprehend whole ndimage module, but I tried generic filter for example and browser other functions. > > Is there better way to make above manipulation, instead using for loop over every array element? I am not sure I understand the above manipulation... typically neighborhood operators take an array element and the its neighborhood and then give a single output that becomes the value of the new array at that point. That is, a 3x3 neighborhood filter would act as a function F(R^{3x3}) -> R. It appears that what you're talking about above is a function F(R^{3x3}) -> R^{3x3}. But how is this output to map onto the original array positions? Is the function to be applied to non-overlapping neighborhoods? Is it to be applied to all neighborhoods and then summed at each position to give the output array? If you can describe the problem in a bit more detail, with perhaps some sample input and output for what you desire (and/or with some pseudocode describing how it would work in a looping-over-each-element approach), I'm sure folks can figure out how best to do this in numpy. Zach From otrov at hush.ai Fri Dec 28 10:00:06 2012 From: otrov at hush.ai (deb) Date: Fri, 28 Dec 2012 16:00:06 +0100 Subject: [Numpy-discussion] Manipulate neighboring points in 2D array Message-ID: <20121228150006.401C1E6739@smtp.hushmail.com> Thanks Zach You are right. I needed generic filter - to update current point, and not the neighbors as I wrote. Initial code is slow loop over 2D python lists, which I'm trying to convert to numpy and make it useful. In that loop there is inner loop for calculating neighbors properties, which confused me yesterday, and mislead to search for something that probably does not make sense. It's clear now :) Regards -------------- next part -------------- An HTML attachment was scrubbed... URL: From ondrej.certik at gmail.com Fri Dec 28 19:02:01 2012 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Fri, 28 Dec 2012 16:02:01 -0800 Subject: [Numpy-discussion] ANN: NumPy 1.7.0rc1 release Message-ID: Hi, I'm pleased to announce the availability of the first release candidate of NumPy 1.7.0rc1. Sources and binary installers can be found at https://sourceforge.net/projects/numpy/files/NumPy/1.7.0rc1/ We have fixed all issues known to us since the 1.7.0b2 release. The only remaining issue is a documentation improvement: https://github.com/numpy/numpy/issues/561 Please test this release and report any issues on the numpy-discussion mailing list. If there are no more problems, we'll release the final version soon. I'll wait at least a week and please write me an email if you need more time for testing. 
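A quick way to exercise an installed package is to run the bundled test
suite; something along these lines should work (the 'full' label needs
nose installed):

    import numpy
    numpy.test('full')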
I would like to thank Sebastian Berg, Ralf Gommers, Han Genuit, Nathaniel J. Smith, Jay Bourque, Gael Varoquaux, Mark Wiebe, Matthew Brett, Skipper Seabold, Peter Cock, Charles Harris, Frederic, Gabriel, Luis Pedro Coelho, Pauli Virtanen, Travis E. Oliphant and cgohlke for sending patches and fixes for this release since 1.7.0b2. Cheers, Ondrej P.S. Source code is uploaded to sourceforge, and I'll upload the rest of the Windows and Mac binaries in a few hours as they finish building. From zachary.pincus at yale.edu Fri Dec 28 21:13:01 2012 From: zachary.pincus at yale.edu (Zachary Pincus) Date: Fri, 28 Dec 2012 19:13:01 -0700 Subject: [Numpy-discussion] Manipulate neighboring points in 2D array In-Reply-To: <20121228150006.401C1E6739@smtp.hushmail.com> References: <20121228150006.401C1E6739@smtp.hushmail.com> Message-ID: <14606163-7B33-46E4-B7A8-C4B3C9865A61@yale.edu> > You are right. I needed generic filter - to update current point, and not the neighbors as I wrote. > Initial code is slow loop over 2D python lists, which I'm trying to convert to numpy and make it useful. In that loop there is inner loop for calculating neighbors properties, which confused me yesterday, and mislead to search for something that probably does not make sense. > > It's clear now :) It's possible that some generic filter operations can be cast in terms of pure-numpy operations, or composed out of existing filters available in scipy.ndimage. If you can describe the filter operation you wish to perform, perhaps someone can make some suggestions. Alternately, scipy.ndimage.generic_filter can take an arbitrary python function. Though it's not really fast... Zach From travis at continuum.io Sat Dec 29 01:38:54 2012 From: travis at continuum.io (Travis Oliphant) Date: Sat, 29 Dec 2012 00:38:54 -0600 Subject: [Numpy-discussion] ANN: NumPy 1.7.0rc1 release In-Reply-To: References: Message-ID: <849D607B-2F73-40F2-843E-17A785CCB95E@continuum.io> Fantastic job everyone! Hats of to you Ondrej! -Travis On Dec 28, 2012, at 6:02 PM, Ond?ej ?ert?k wrote: > Hi, > > I'm pleased to announce the availability of the first release candidate of > NumPy 1.7.0rc1. > > Sources and binary installers can be found at > https://sourceforge.net/projects/numpy/files/NumPy/1.7.0rc1/ > > We have fixed all issues known to us since the 1.7.0b2 release. > The only remaining issue is a documentation improvement: > > https://github.com/numpy/numpy/issues/561 > > Please test this release and report any issues on the numpy-discussion > mailing list. If there are no more problems, we'll release the final > version soon. I'll wait at least a week and please write me an email > if you need more time for testing. > > I would like to thank Sebastian Berg, Ralf Gommers, Han Genuit, > Nathaniel J. Smith, Jay Bourque, Gael Varoquaux, Mark Wiebe, > Matthew Brett, Skipper Seabold, Peter Cock, Charles Harris, Frederic, > Gabriel, Luis Pedro Coelho, Pauli Virtanen, Travis E. Oliphant > and cgohlke for sending patches and fixes for this release since > 1.7.0b2. > > Cheers, > Ondrej > > P.S. Source code is uploaded to sourceforge, and I'll upload the > rest of the Windows and Mac binaries in a few hours as they finish building. 
> _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From charlesr.harris at gmail.com Sat Dec 29 02:07:49 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 29 Dec 2012 00:07:49 -0700 Subject: [Numpy-discussion] A small challenge Message-ID: Hi All, I propose a challenge: express the dtype grammar in EBNF. That's all. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Sat Dec 29 06:14:13 2012 From: njs at pobox.com (Nathaniel Smith) Date: Sat, 29 Dec 2012 11:14:13 +0000 Subject: [Numpy-discussion] A small challenge In-Reply-To: References: Message-ID: On Sat, Dec 29, 2012 at 7:07 AM, Charles R Harris wrote: > Hi All, > > I propose a challenge: express the dtype grammar in EBNF. That's all. Not sure I understand. Do you mean just the little string-parsing DSL for specifying dtypes ("i4,datetime64[ms]"), or is there some way to write EBNF for describing arbitrary nested Python structures? -n From cgohlke at uci.edu Sat Dec 29 08:46:29 2012 From: cgohlke at uci.edu (Christoph Gohlke) Date: Sat, 29 Dec 2012 05:46:29 -0800 Subject: [Numpy-discussion] ANN: NumPy 1.7.0rc1 release In-Reply-To: References: Message-ID: <50DEF435.8010103@uci.edu> On 12/28/2012 4:02 PM, Ond?ej ?ert?k wrote: > Hi, > > I'm pleased to announce the availability of the first release candidate of > NumPy 1.7.0rc1. > > Sources and binary installers can be found at > https://sourceforge.net/projects/numpy/files/NumPy/1.7.0rc1/ > > We have fixed all issues known to us since the 1.7.0b2 release. > The only remaining issue is a documentation improvement: > > https://github.com/numpy/numpy/issues/561 > > Please test this release and report any issues on the numpy-discussion > mailing list. If there are no more problems, we'll release the final > version soon. I'll wait at least a week and please write me an email > if you need more time for testing. > > I would like to thank Sebastian Berg, Ralf Gommers, Han Genuit, > Nathaniel J. Smith, Jay Bourque, Gael Varoquaux, Mark Wiebe, > Matthew Brett, Skipper Seabold, Peter Cock, Charles Harris, Frederic, > Gabriel, Luis Pedro Coelho, Pauli Virtanen, Travis E. Oliphant > and cgohlke for sending patches and fixes for this release since > 1.7.0b2. > > Cheers, > Ondrej > > P.S. Source code is uploaded to sourceforge, and I'll upload the > rest of the Windows and Mac binaries in a few hours as they finish building. Looks good so far. I tested numpy-MKL-1.7.0rc1.win-amd64-py2.7 with some packages that were compiled with numpy 1.6.x . There are a few additional test failures in bottleneck and Cython, but they don't look serious. The rc works well on Python 3.3 too . Christoph From charlesr.harris at gmail.com Sat Dec 29 09:57:22 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 29 Dec 2012 07:57:22 -0700 Subject: [Numpy-discussion] A small challenge In-Reply-To: References: Message-ID: On Sat, Dec 29, 2012 at 4:14 AM, Nathaniel Smith wrote: > On Sat, Dec 29, 2012 at 7:07 AM, Charles R Harris > wrote: > > Hi All, > > > > I propose a challenge: express the dtype grammar in EBNF. That's all. > > Not sure I understand. Do you mean just the little string-parsing DSL > for specifying dtypes ("i4,datetime64[ms]"), or is there some way to > write EBNF for describing arbitrary nested Python structures? > > Heh, pinning that down is part of the problem. 
I think a good start is the string parsing, but dtypes can also be constructed using python tuples, lists and types. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From ndbecker2 at gmail.com Sat Dec 29 12:35:36 2012 From: ndbecker2 at gmail.com (Neal Becker) Date: Sat, 29 Dec 2012 12:35:36 -0500 Subject: [Numpy-discussion] ANN: NumPy 1.7.0rc1 release References: Message-ID: Are release notes available? From charlesr.harris at gmail.com Sat Dec 29 12:37:21 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 29 Dec 2012 10:37:21 -0700 Subject: [Numpy-discussion] A small challenge In-Reply-To: References: Message-ID: On Sat, Dec 29, 2012 at 7:57 AM, Charles R Harris wrote: > > > On Sat, Dec 29, 2012 at 4:14 AM, Nathaniel Smith wrote: > >> On Sat, Dec 29, 2012 at 7:07 AM, Charles R Harris >> wrote: >> > Hi All, >> > >> > I propose a challenge: express the dtype grammar in EBNF. That's all. >> >> Not sure I understand. Do you mean just the little string-parsing DSL >> for specifying dtypes ("i4,datetime64[ms]"), or is there some way to >> write EBNF for describing arbitrary nested Python structures? >> >> > Heh, pinning that down is part of the problem. I think a good start is the > string parsing, but dtypes can also be constructed using python tuples, > lists and types. > > The idea is to see if the dtype constructor can be regarded as a parser, and if so, how to describe the grammar it parses. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From ondrej.certik at gmail.com Sat Dec 29 14:37:29 2012 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Sat, 29 Dec 2012 11:37:29 -0800 Subject: [Numpy-discussion] ANN: NumPy 1.7.0rc1 release In-Reply-To: References: Message-ID: Hi Neal, On Sat, Dec 29, 2012 at 9:35 AM, Neal Becker wrote: > Are release notes available? Yes. There are here: http://sourceforge.net/projects/numpy/files/NumPy/1.7.0rc1/ if you slide the page down a little bit (sf.net just shows the file README.txt). I am posting them here as well for reference (I forgot to do it in my release email). Ondrej ------------------- ========================= NumPy 1.7.0 Release Notes ========================= This release includes several new features as well as numerous bug fixes and refactorings. It supports Python 2.4 - 2.7 and 3.1 - 3.3 and is the last release that supports Python 2.4 - 2.5. Highlights ========== * ``where=`` parameter to ufuncs (allows the use of boolean arrays to choose where a computation should be done) * ``vectorize`` improvements (added 'excluded' and 'cache' keyword, general cleanup and bug fixes) * ``numpy.random.choice`` (random sample generating function) Compatibility notes =================== In a future version of numpy, the functions np.diag, np.diagonal, and the diagonal method of ndarrays will return a view onto the original array, instead of producing a copy as they do now. This makes a difference if you write to the array returned by any of these functions. To facilitate this transition, numpy 1.7 produces a FutureWarning if it detects that you may be attempting to write to such an array. See the documentation for np.diagonal for details. Similar to np.diagonal above, in a future version of numpy, indexing a record array by a list of field names will return a view onto the original array, instead of producing a copy as they do now. 
As with np.diagonal, numpy 1.7 produces a FutureWarning if it detects that you may be attempting to write to such an array. See the documentation for array indexing for details. In a future version of numpy, the default casting rule for UFunc out= parameters will be changed from 'unsafe' to 'same_kind'. (This also applies to in-place operations like a += b, which is equivalent to np.add(a, b, out=a).) Most usages which violate the 'same_kind' rule are likely bugs, so this change may expose previously undetected errors in projects that depend on NumPy. In this version of numpy, such usages will continue to succeed, but will raise a DeprecationWarning. Full-array boolean indexing has been optimized to use a different, optimized code path. This code path should produce the same results, but any feedback about changes to your code would be appreciated. Attempting to write to a read-only array (one with ``arr.flags.writeable`` set to ``False``) used to raise either a RuntimeError, ValueError, or TypeError inconsistently, depending on which code path was taken. It now consistently raises a ValueError. The .reduce functions evaluate some reductions in a different order than in previous versions of NumPy, generally providing higher performance. Because of the nature of floating-point arithmetic, this may subtly change some results, just as linking NumPy to a different BLAS implementations such as MKL can. If upgrading from 1.5, then generally in 1.6 and 1.7 there have been substantial code added and some code paths altered, particularly in the areas of type resolution and buffered iteration over universal functions. This might have an impact on your code particularly if you relied on accidental behavior in the past. New features ============ Reduction UFuncs Generalize axis= Parameter ------------------------------------------- Any ufunc.reduce function call, as well as other reductions like sum, prod, any, all, max and min support the ability to choose a subset of the axes to reduce over. Previously, one could say axis=None to mean all the axes or axis=# to pick a single axis. Now, one can also say axis=(#,#) to pick a list of axes for reduction. Reduction UFuncs New keepdims= Parameter ---------------------------------------- There is a new keepdims= parameter, which if set to True, doesn't throw away the reduction axes but instead sets them to have size one. When this option is set, the reduction result will broadcast correctly to the original operand which was reduced. Datetime support ---------------- .. note:: The datetime API is *experimental* in 1.7.0, and may undergo changes in future versions of NumPy. There have been a lot of fixes and enhancements to datetime64 compared to NumPy 1.6: * the parser is quite strict about only accepting ISO 8601 dates, with a few convenience extensions * converts between units correctly * datetime arithmetic works correctly * business day functionality (allows the datetime to be used in contexts where only certain days of the week are valid) The notes in `doc/source/reference/arrays.datetime.rst `_ (also available in the online docs at `arrays.datetime.html `_) should be consulted for more details. Custom formatter for printing arrays ------------------------------------ See the new ``formatter`` parameter of the ``numpy.set_printoptions`` function. New function numpy.random.choice --------------------------------- A generic sampling function has been added which will generate samples from a given array-like. 
The samples can be with or without replacement, and with uniform or given non-uniform probabilities. New function isclose -------------------- Returns a boolean array where two arrays are element-wise equal within a tolerance. Both relative and absolute tolerance can be specified. Preliminary multi-dimensional support in the polynomial package --------------------------------------------------------------- Axis keywords have been added to the integration and differentiation functions and a tensor keyword was added to the evaluation functions. These additions allow multi-dimensional coefficient arrays to be used in those functions. New functions for evaluating 2-D and 3-D coefficient arrays on grids or sets of points were added together with 2-D and 3-D pseudo-Vandermonde matrices that can be used for fitting. Ability to pad rank-n arrays ---------------------------- A pad module containing functions for padding n-dimensional arrays has been added. The various private padding functions are exposed as options to a public 'pad' function. Example:: pad(a, 5, mode='mean') Current modes are ``constant``, ``edge``, ``linear_ramp``, ``maximum``, ``mean``, ``median``, ``minimum``, ``reflect``, ``symmetric``, ``wrap``, and ````. New argument to searchsorted ---------------------------- The function searchsorted now accepts a 'sorter' argument that is a permutation array that sorts the array to search. C API ----- New function ``PyArray_RequireWriteable`` provides a consistent interface for checking array writeability -- any C code which works with arrays whose WRITEABLE flag is not known to be True a priori, should make sure to call this function before writing. NumPy C Style Guide added (``doc/C_STYLE_GUIDE.rst.txt``). Changes ======= General ------- The function np.concatenate tries to match the layout of its input arrays. Previously, the layout did not follow any particular reason, and depended in an undesirable way on the particular axis chosen for concatenation. A bug was also fixed which silently allowed out of bounds axis arguments. The ufuncs logical_or, logical_and, and logical_not now follow Python's behavior with object arrays, instead of trying to call methods on the objects. For example the expression (3 and 'test') produces the string 'test', and now np.logical_and(np.array(3, 'O'), np.array('test', 'O')) produces 'test' as well. The ``.base`` attribute on ndarrays, which is used on views to ensure that the underlying array owning the memory is not deallocated prematurely, now collapses out references when you have a view-of-a-view. For example:: a = np.arange(10) b = a[1:] c = b[1:] In numpy 1.6, ``c.base`` is ``b``, and ``c.base.base`` is ``a``. In numpy 1.7, ``c.base`` is ``a``. To increase backwards compatibility for software which relies on the old behaviour of ``.base``, we only 'skip over' objects which have exactly the same type as the newly created view. This makes a difference if you use ``ndarray`` subclasses. For example, if we have a mix of ``ndarray`` and ``matrix`` objects which are all views on the same original ``ndarray``:: a = np.arange(10) b = np.asmatrix(a) c = b[0, 1:] d = c[0, 1:] then ``d.base`` will be ``b``. This is because ``d`` is a ``matrix`` object, and so the collapsing process only continues so long as it encounters other ``matrix`` objects. It considers ``c``, ``b``, and ``a`` in that order, and ``b`` is the last entry in that list which is a ``matrix`` object. 
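In terms of the example above, this means::

    d.base is b    # True: b is the last matrix view in the chain
    b.base is a    # True: a is the ndarray that owns the memory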
Deprecations ============ General ------- Specifying a custom string formatter with a `_format` array attribute is deprecated. The new `formatter` keyword in ``numpy.set_printoptions`` or ``numpy.array2string`` can be used instead. The deprecated imports in the polynomial package have been removed. ``concatenate`` now raises DepractionWarning for 1D arrays if ``axis != 0``. Versions of numpy < 1.7.0 ignored axis argument value for 1D arrays. We allow this for now, but in due course we will raise an error. C-API ----- Direct access to the fields of PyArrayObject* has been deprecated. Direct access has been recommended against for many releases. Expect similar deprecations for PyArray_Descr* and other core objects in the future as preparation for NumPy 2.0. The macros in old_defines.h are deprecated and will be removed in the next major release (>= 2.0). The sed script tools/replace_old_macros.sed can be used to replace these macros with the newer versions. You can test your code against the deprecated C API by #defining NPY_NO_DEPRECATED_API to the target version number, for example NPY_1_7_API_VERSION, before including any NumPy headers. The ``NPY_CHAR`` member of the ``NPY_TYPES`` enum is deprecated and will be removed in NumPy 1.8. See the discussion at `gh-2801 `_ for more details. Checksums ========= 0abe9356c7fc5e2dc3ff3a1f7292db23 release/installers/numpy-1.7.0rc1.zip ea4268cb12cc759a33861b8c04535f3b release/installers/numpy-1.7.0rc1-win32-superpack-python3.3.exe b5ba5ae858b8d1b4d50742aefe20e151 release/installers/numpy-1.7.0rc1-win32-superpack-python2.6.exe 6cc692e53df87e7c2a9c5dd742fa3556 release/installers/numpy-1.7.0rc1-win32-superpack-python2.5.exe e164beae6c43d514f1ebba5a34aa4162 release/installers/numpy-1.7.0rc1-win32-superpack-python3.1.exe a4719f5a1853bc0f8892a5956d5c4229 release/installers/numpy-1.7.0rc1.tar.gz ca0151c50c79c5843083c3f8817e5c20 release/installers/numpy-1.7.0rc1-win32-superpack-python3.2.exe 329c3e1560332248e2fb6efdd150e421 release/installers/numpy-1.7.0rc1-win32-superpack-python2.7.exe From ondrej.certik at gmail.com Sat Dec 29 14:48:08 2012 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Sat, 29 Dec 2012 11:48:08 -0800 Subject: [Numpy-discussion] ANN: NumPy 1.7.0rc1 release In-Reply-To: <50DEF435.8010103@uci.edu> References: <50DEF435.8010103@uci.edu> Message-ID: Hi Christoph, On Sat, Dec 29, 2012 at 5:46 AM, Christoph Gohlke wrote: > Looks good so far. > > I tested numpy-MKL-1.7.0rc1.win-amd64-py2.7 with some packages that were > compiled with numpy 1.6.x > . > There are a few additional test failures in bottleneck and Cython, but > they don't look serious. > > The rc works well on Python 3.3 too > . Thanks! I created an issue for it here: https://github.com/numpy/numpy/issues/2870 Ondrej P.S. Would you mind adding your name to your github profile (https://github.com/cgohlke) please? I was trying to figure out your full name so that I could thank you in the release email, but I could only find your handle "cgohlke" both at github and in the commit history. My apologies for that. Now I'll remember to google it in my gmail. :) But if you could also update it at github, that'd be the easiest. I can see that in master we have an updated .mailmap which fixes precisely this issue. I should have used it -- I was running it on the release branch which does not have it yet. 
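(For reference, a contributor list like the one in the announcement can be
regenerated with something along the lines of "git shortlog -ns
v1.7.0b2..HEAD" run on a checkout that has the updated .mailmap; shortlog
picks the file up automatically.)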
From otrov at hush.ai Sun Dec 30 06:21:38 2012 From: otrov at hush.ai (deb) Date: Sun, 30 Dec 2012 12:21:38 +0100 Subject: [Numpy-discussion] Manipulate neighboring points in 2D array Message-ID: <20121230112138.D9234E6726@smtp.hushmail.com> Thanks Zach for your interest I was thinking about ndimage.generic_filter when I wrote about generic filter. For generic_filter I used trivial function that returns .sum() but I can't seem to make the code any faster than it is. This is the code: http://code.activestate.com/recipes/578390-snowflake-simulation-using-reiter-cellular-automat/ As commenter suggested I thought to try and make it in numpy Interestingly, the first thing I tried before trying to use numpy was change range() loops with xrange(), as xrange is considered faster and more efficient, but result was that code was twice slower. Anyway I give up, and concluded that my numpy skills are far below I expected :D > It's possible that some generic filter operations can be cast in > terms of pure-numpy operations, or composed out of existing filters > available in scipy.ndimage. If you can describe the filter operation > you wish to perform, perhaps someone can make some suggestions. > Alternately, scipy.ndimage.generic_filter can take an arbitrary > python function. Though it's not really fast... From bahtiyor_zohidov at mail.ru Sun Dec 30 06:41:26 2012 From: bahtiyor_zohidov at mail.ru (=?UTF-8?B?SGFwcHltYW4=?=) Date: Sun, 30 Dec 2012 15:41:26 +0400 Subject: [Numpy-discussion] =?utf-8?q?3D_array_problem_in_Python?= Message-ID: <1356867686.200644432@f373.mail.ru> Hello I have 3 dimensional array ?which I want ?to calculate in a huge process. Everything is working well if I use ordinary way which is unsuitable in Python like the following: nums=32 rows=120 cols=150 for k in range(0,nums): ? ? ? ? ? for i in range(0,rows): ? ? ? ? ? ? ? ? ? ? ?for j in range(0,cols): ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? if float ( R[ k ] [ i ] [ j ] ) == 0.0: ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?val11 [ i ] =0.0 ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? else: ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?val11[ i ] [ j ], val22[ i ][ j ] = integrate.quad( lambda x : ?F1(x)*F2(x) , 0 , pi) But, this calculation takes so long time, let's say about ?1 hour (theoretically)... Is there any better way to easily and fast calculate the process such as [ F( i ) for i in xlist ] or something like that rather than using for loop? -------------- next part -------------- An HTML attachment was scrubbed... URL: From oc-spam66 at laposte.net Sun Dec 30 07:13:18 2012 From: oc-spam66 at laposte.net (oc-spam66) Date: Sun, 30 Dec 2012 13:13:18 +0100 Subject: [Numpy-discussion] 3D array problem in Python In-Reply-To: <1356867686.200644432@f373.mail.ru> References: <1356867686.200644432@f373.mail.ru> Message-ID: <50E02FDE.2060004@laposte.net> Hello, > else: > val11[i][j], val22[i][j] = integrate.quad(lambda x: F1(x)*F2(x), 0, pi) > But, this calculation takes so long time, let's say about 1 hour > (theoretically)... Is there any better way to easily and fast calculate > the process such as [ F( i ) for i in xlist ] or something like that > rather than using for loop? * What are F1() and F2()? Do they depend on anything else than 'x'? Maybe you meant Fi() and Fj(). In that case, can you benefit of a symmetry property? * It's likely that all the computing time is in the "integrate" operation (check it with a profiler? %prun under ipython for example). 
In this situation, there's no improvement possible, apart from using a
simpler function than integrate() that might be vectorized (this depends
on the definition of Fi())

From bahtiyor_zohidov at mail.ru  Sun Dec 30 07:47:26 2012
From: bahtiyor_zohidov at mail.ru (=?UTF-8?B?SGFwcHltYW4=?=)
Date: Sun, 30 Dec 2012 16:47:26 +0400
Subject: [Numpy-discussion] =?utf-8?q?3D_array_problem_in_Python?=
In-Reply-To: <50E02FDE.2060004@laposte.net>
References: <1356867686.200644432@f373.mail.ru>
	<50E02FDE.2060004@laposte.net>
Message-ID: <1356871646.828630843@f84.mail.ru>

Actually, the two functions F1 and F2 are an exponential and a Bessel
function respectively, and I cannot change their analytic form.

Is there a way to get the result more quickly? The two functions could be
combined into one, but the dimensions I showed should not be changed.

Do you think the problem is the three dimensions, or something else?
Thanks in advance for your answer!

Sunday, 30 December 2012, 13:13 +01:00 from oc-spam66 :
>Hello,
>
>> else:
>> val11[i][j], val22[i][j] = integrate.quad(lambda x: F1(x)*F2(x), 0, pi)
>
>> But, this calculation takes so long time, let's say about 1 hour
>> (theoretically)... Is there any better way to easily and fast calculate
>> the process such as [ F( i ) for i in xlist ] or something like that
>> rather than using for loop?
>
>* What are F1() and F2()? Do they depend on anything else than 'x'?
>Maybe you meant Fi() and Fj(). In that case, can you benefit of a
>symmetry property?
>* It's likely that all the computing time is in the "integrate"
>operation (check it with a profiler? %prun under ipython for example).
>In this situation, there's no improvement possible, apart from using a
>simpler function than integrate() that might be vectorized (this depends
>on the definition of Fi())
>_______________________________________________
>NumPy-Discussion mailing list
>NumPy-Discussion at scipy.org
>http://mail.scipy.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From silva at lma.cnrs-mrs.fr  Sun Dec 30 09:11:22 2012
From: silva at lma.cnrs-mrs.fr (Fabrice Silva)
Date: Sun, 30 Dec 2012 15:11:22 +0100
Subject: [Numpy-discussion] 3D array problem in Python
In-Reply-To: <1356871646.828630843@f84.mail.ru>
References: <1356867686.200644432@f373.mail.ru>
	<50E02FDE.2060004@laposte.net>
	<1356871646.828630843@f84.mail.ru>
Message-ID: <1356876682.5622.6.camel@laptop-101>

On Sunday 30 December 2012 at 16:47 +0400, Happyman wrote:
> Actually, the two functions F1 and F2 are an exponential and a Bessel
> function respectively, and I cannot change their analytic form.
>
> Is there a way to get the result more quickly? The two functions could be
> combined into one, but the dimensions I showed should not be changed.
>
> Do you think the problem is the three dimensions, or something else?
> Thanks in advance for your answer!

The question was: do F1 and F2 change depending on i, j or k? The answer
seems to be yes.

I don't think any improvement through vectorisation is possible while
using integrate.quad or similar. This function does adaptive meshing of
the integration interval. Singular points and/or discontinuities may
arise at different points for the several integrands, so that
vectorisation could even slow the computation down!

Maybe have a look at Romberg's method, which supports a vector form.
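Something like this (untested) sketches the idea, assuming F1 and F2 can
take an array-valued x:

    import numpy as np
    from scipy import integrate

    # vec_func=True lets romberg evaluate the integrand on an array of
    # sample points in one call, instead of point by point as quad does
    val = integrate.romberg(lambda x: F1(x) * F2(x), 0.0, np.pi,
                            vec_func=True)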
From Nicolas.Rougier at inria.fr Sun Dec 30 09:54:09 2012 From: Nicolas.Rougier at inria.fr (Nicolas Rougier) Date: Sun, 30 Dec 2012 15:54:09 +0100 Subject: [Numpy-discussion] Manipulate neighboring points in 2D array In-Reply-To: <20121230112138.D9234E6726@smtp.hushmail.com> References: <20121230112138.D9234E6726@smtp.hushmail.com> Message-ID: You might want to have a look at : http://code.google.com/p/glumpy/source/browse/demos/gray-scott.py which implements a Gray-Scott reaction-diffusion system. The 'convolution_matrix(src, dst, kernel, toric)' build a sparse matrix such that multiplying an array with this matrix will result in the convolution. This is very fast if your kernel is small (like for cellular automata) and if you intend to repeat the convolution several times. Note that you only need to build the matrix once. Example: >>> S = np.ones((3,3)) >>> K = np.ones((3,3)) >>> M = convolution_matrix(S,S,K,True) >> print (M*S.ravel()).reshape(S.shape) [[ 9. 9. 9.] [ 9. 9. 9.] [ 9. 9. 9.]] >>> M = convolution_matrix(S,S,K,False) >>> print (M*S.ravel()).reshape(S.shape) [[ 4. 6. 4.] [ 6. 9. 6.] [ 4. 6. 4.]] the 'dst' parameter won't be useful in your case so you have to set it to 'src'. Nicolas On Dec 30, 2012, at 12:21 , deb wrote: > Thanks Zach for your interest > > I was thinking about ndimage.generic_filter when I wrote about generic filter. > For generic_filter I used trivial function that returns .sum() but I can't seem to make the code any faster than it is. > > This is the code: http://code.activestate.com/recipes/578390-snowflake-simulation-using-reiter-cellular-automat/ > As commenter suggested I thought to try and make it in numpy > > Interestingly, the first thing I tried before trying to use numpy was change range() loops with xrange(), as xrange is considered faster and more efficient, but result was that code was twice slower. > > Anyway I give up, and concluded that my numpy skills are far below I expected :D > > >> It's possible that some generic filter operations can be cast in >> terms of pure-numpy operations, or composed out of existing filters >> available in scipy.ndimage. If you can describe the filter operation >> you wish to perform, perhaps someone can make some suggestions. > >> Alternately, scipy.ndimage.generic_filter can take an arbitrary >> python function. Though it's not really fast... > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From morph at debian.org Sun Dec 30 19:17:50 2012 From: morph at debian.org (Sandro Tosi) Date: Mon, 31 Dec 2012 01:17:50 +0100 Subject: [Numpy-discussion] ANN: NumPy 1.7.0rc1 release In-Reply-To: References: Message-ID: Hi Ondrej & al, On Sat, Dec 29, 2012 at 1:02 AM, Ond?ej ?ert?k wrote: > I'm pleased to announce the availability of the first release candidate of > NumPy 1.7.0rc1. Congrats on this RC release! I've uploaded this version to Debian and updated some of the issues related to it. There are also a couple of minor PR you might want to consider for 1.7: 2872 and 2873. 
Cheers,
-- 
Sandro Tosi (aka morph, morpheus, matrixhasu)
My website: http://matrixhasu.altervista.org/
Me at Debian: http://wiki.debian.org/SandroTosi

From chris.barker at noaa.gov  Sun Dec 30 19:35:35 2012
From: chris.barker at noaa.gov (Chris Barker - NOAA Federal)
Date: Sun, 30 Dec 2012 16:35:35 -0800
Subject: [Numpy-discussion] 3D array problem in Python
In-Reply-To: <1356867686.200644432@f373.mail.ru>
References: <1356867686.200644432@f373.mail.ru>
Message-ID: 

On Sun, Dec 30, 2012 at 3:41 AM, Happyman wrote:
> nums=32
> rows=120
> cols=150
>
> for k in range(0,nums):
>     for i in range(0,rows):
>         for j in range(0,cols):
>             if float ( R[ k ] [ i ] [ j ] ) == 0.0:

why the float() -- what data type is R?

>             else:
>                 val11[ i ] [ j ], val22[ i ][ j ] = integrate.quad( lambda x : F1(x)*F2(x) , 0 , pi)

this is odd -- Do F1 and F2 depend on i, j, or k somehow? or are you
somehow integrating over the k-dimension? In which case, I'm guessing
that integration is your time killer anyway -- do some profiling to
know for sure.

-Chris

--

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov

From bahtiyor_zohidov at mail.ru  Sun Dec 30 21:40:14 2012
From: bahtiyor_zohidov at mail.ru (=?UTF-8?B?SGFwcHltYW4=?=)
Date: Mon, 31 Dec 2012 06:40:14 +0400
Subject: [Numpy-discussion] =?utf-8?q?3D_array_problem_challenging_in_Pyth?=
	=?utf-8?q?on?=
References: <1356867686.200644432@f373.mail.ru>
Message-ID: <1356921614.45639691@f150.mail.ru>

Hi Chris,

I think I did not explain my request properly, I'm afraid (sorry about
that!). Let me explain precisely.

Last time I processed only one file, and it took exactly 237.713999987
seconds, roughly 4 minutes. But I have to process more than 30 files, one
after the other (if one file takes 4 minutes, 30 files will take about 120
minutes), and each file holds 18 000 values (for testing, any array of that
size will do). In other words, following earlier advice from the list, I
created a two-dimensional matrix R(rows, cols) and collected all the
processed data into R.

Below are the pieces of my code that work together; F1() and F2() are shown
as well. The problem is the same as before: I want to optimize the code to
avoid the loops and get an answer as quickly as possible, but it is really
confusing, so help from Python programmers would be great!

==================================
The codes here:
=================================================================

import numpy as np
import scipy.special as ss
from scipy.special import sph_jnyn,sph_jn,jv,yv
from scipy import integrate
import time
import os

---------------------------
1) Problem: no problem in this F0() function
---------------------------
Inputs: m = 5+0.4j  - complex number as an example!
        x = one value - float!
---------------------------
# This function returns the an, bn coefficients. I don't want it to be
# vectorized because that is already done; it is working well!

def F0(m, x):
    nmax = np.round(2.0+x+4.0*x**(1.0/3.0))
    mx = m * x

    j_x,jd_x,y_x,yd_x = ss.sph_jnyn(nmax, x)    # sph_jnyn - from scipy special functions
    j_x = j_x[1:]
    jd_x = jd_x[1:]
    y_x = y_x[1:]
    yd_x = yd_x[1:]

    h1_x = j_x + 1.0j*y_x
    h1d_x = jd_x + 1.0j*yd_x

    j_mx,jd_mx = ss.sph_jn(nmax, mx)            # sph_jn - from scipy special functions
    j_mx = j_mx[1:]
    jd_mx = jd_mx[1:]

    j_xp = j_x + x*jd_x
    j_mxp = j_mx + mx*jd_mx
    h1_xp = h1_x + x*h1d_x

    m2 = m * m
    an = (m2 * j_mx * j_xp - j_x * j_mxp)/(m2 * j_mx * h1_xp - h1_x * j_mxp)
    bn = (j_mx * j_xp - j_x * j_mxp)/(j_mx * h1_xp - h1_x * j_mxp)
    return an, bn

--------------------------------------
2) Problem: 1) to avoid the loop
            2) to return values from the function (below) no matter whether
               'a' is an array or a scalar!
--------------------------------------
Inputs: m = 5+0.4j  - for example
        L = 30      - for example
        a - array (one dimensional)
--------------------------------------

def F1(m,L,a):
    xs = np.pi * a / L

    if(m.imag < 0.0):
        m = np.conj(m)

    # Want to make sure we can accept single arguments or arrays
    try:
        xs.size
        xlist = xs
    except:
        xlist = np.array(xs)

    q = [ ]
    for i,s in enumerate(xlist.flat):
        if float(s) == 0.0:          # To avoid a singularity at x=0
            q.append(0.0)
        else:
            x = np.round(s,7)
            an,bn = F0(m,x)
            n = np.arange(1.0, an.size + 1.0)
            c = 2.0 * n + 1.0
            q.append((L*L)/(2*np.pi) * (c * (an.real + bn.real)).sum())
    return np.array(q)

-----------------------------
3) Problem: 1) I used "try" to check whether 'D' is an array or a single
               value! Is there a better way besides this?
            2) Is there any way to avoid the loop here in this case?
-----------------------------
Inputs: a - array (one dimensional, the same as above)
        s - array (one dimensional)
-----------------------------

def F2(a,s):
    try:            # "try" checks whether the input is an array or a single value. Is there a better way?
        a.size
        Dslist = a
    except:
        Dslist = np.array(a)

    K = np.zeros( np.size( Dslist ) )
    for i,d in enumerate(Dslist.flat):   # Is there any way to avoid the loop here in this case?
        if float(d) == 0.0:
            K[i] = 0.0
        else:
            K[i] = np.exp(s**(-0.45)) * d
    return K

----------------------
4) Problem: F3 depends on F1 and F2
----------------------

def F3(m=20,L=30):
    F_file = np.loadtxt('filename')    # Instead, any file of size 120x150 (18 000 values) can be used here, as explained above.
    val = [integrate.quad(lambda x: F1(m,L,x)*F2(x,i), 0.0, 7.0) for i in F_file]
    return np.array(val)

F3 is where I have really tried to get a more efficient result;
unfortunately, I have spent about a month on it and got nothing! I really
need help, I am stuck....

Sunday, 30 December 2012, 16:35 -08:00 from Chris Barker - NOAA Federal :
>On Sun, Dec 30, 2012 at 3:41 AM, Happyman < bahtiyor_zohidov at mail.ru > wrote:
>> nums=32
>> rows=120
>> cols=150
>>
>> for k in range(0,nums):
>>     for i in range(0,rows):
>>         for j in range(0,cols):
>>             if float ( R[ k ] [ i ] [ j ] ) == 0.0:
>
>why the float() -- what data type is R?
>
>>             else:
>>                 val11[ i ] [ j ], val22[ i ][ j ] = integrate.quad( lambda x : F1(x)*F2(x) , 0 , pi)
>
>this is odd -- Do F1 and F2 depend on i, j, or k somehow? or are you
>somehow integrating over the k-dimension? In which case, I'm guessing
>that integration is your time killer anyway -- do some profiling to
>know for sure.
>
>-Chris
>
>--
>
>Christopher Barker, Ph.D.
>Oceanographer > >Emergency Response Division >NOAA/NOS/OR&R (206) 526-6959 voice >7600 Sand Point Way NE (206) 526-6329 fax >Seattle, WA 98115 (206) 526-6317 main reception > >Chris.Barker at noaa.gov >_______________________________________________ >NumPy-Discussion mailing list >NumPy-Discussion at scipy.org >http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From raul at virtualmaterials.com Mon Dec 31 13:42:17 2012 From: raul at virtualmaterials.com (Raul Cota) Date: Mon, 31 Dec 2012 11:42:17 -0700 Subject: [Numpy-discussion] 3D array problem in Python In-Reply-To: References: <1356867686.200644432@f373.mail.ru> Message-ID: <50E1DC89.50404@virtualmaterials.com> Quick comment since I was working on timing trivial operations, If I run the triple loop with R with only zeros thus avoiding the integration, the loop takes in my computer about 1 second with the float() function and about 1.5 without it if R is dtype='float64' and 3.3 seconds if dtype='float32'. I didn't bother trying the other obvious speed up of avoiding the dot operator quad = integrate.quad and using quad inside the triple loop. I do those things out of habit but they barely ever make a meaningful difference. Conclusions: 1) The overhead of the triple loop is meaningless if the whole operation takes minutes to complete. 2) Using float() does make it faster but in this scenario the speed up is meaningless in the grand scheme of things. * It is besides the point, but for what is worth, with the modified code for numpy I suggested a week ago, using the float() function is not needed to get it to run in 1 second. Raul Cota On 30/12/2012 5:35 PM, Chris Barker - NOAA Federal wrote: > On Sun, Dec 30, 2012 at 3:41 AM, Happyman wrote: >> nums=32 >> rows=120 >> cols=150 >> >> for k in range(0,nums): >> for i in range(0,rows): >> for j in range(0,cols): >> if float ( R[ k ] [ i ] [ j ] ) == >> 0.0: > why the float() -- what data type is R? > >> else: >> val11[ i ] [ j ], val22[ i >> ][ j ] = integrate.quad( lambda x : F1(x)*F2(x) , 0 , pi) > this is odd -- Do F1 and F2 depend on i,j, or k somehow? or are you > somehow integerting over the k-dimension? In which case, I'm guessing > that integration is you time killer anyway -- do some profiling to > know for sure. > > -Chris >