From cournape at gmail.com Sat Aug 1 00:14:30 2009 From: cournape at gmail.com (David Cournapeau) Date: Sat, 1 Aug 2009 13:14:30 +0900 Subject: [Numpy-discussion] ** On entry to ILAENV parameter number 2 had an illegal value In-Reply-To: <4A72D363.8090603@ar.media.kyoto-u.ac.jp> References: <4A72A656.7040802@ar.media.kyoto-u.ac.jp> <1249030460.336856@nntpgw.ncl.ac.uk> <4A72B40D.7080004@ar.media.kyoto-u.ac.jp> <1249033198.686107@nntpgw.ncl.ac.uk> <1249037111.723816@nntpgw.ncl.ac.uk> <4A72CD4E.7000606@ar.media.kyoto-u.ac.jp> <4A72D363.8090603@ar.media.kyoto-u.ac.jp> Message-ID: <5b8d13220907312114h7106bb8fi1cedc192e980ec2f@mail.gmail.com> On Fri, Jul 31, 2009 at 8:20 PM, David Cournapeau wrote: > Steven Coutts wrote: >> David Cournapeau ar.media.kyoto-u.ac.jp> writes: >> >> If you are willing to do >> >>> it, I would also be interested whether numpy works ok if linked against >>> BLAS/LAPACK instead of atlas (i.e. build numpy, again from scratch, with >>> ATLAS=None python setup.py build, and then run the test suite). >>> >>> >> >> Yes that appears to work fine, all tests run. >> > > So that's a problem with ATLAS. Maybe a gcc bug ? Another user contacted > me privately for my rpm repository, and got exactly the same problem > with the rpms, on CentOS 5.3 as well. I will try to look at it on a > CentOS VM if I have time this WE, Ok, I have installed CentOS 5.3 on my machine (kudos to vmware fusion which installs the OS automatically), and built numpy 1.3.0 with atlas 3.8.3 + lapack 3.1.1 on 64 bits. But I could not reproduce the bug, unfortunately. Are you using the threaded atlas ? cheers, David From cournape at gmail.com Sat Aug 1 00:15:51 2009 From: cournape at gmail.com (David Cournapeau) Date: Sat, 1 Aug 2009 13:15:51 +0900 Subject: [Numpy-discussion] ** On entry to ILAENV parameter number 2 had an illegal value In-Reply-To: <5b8d13220907312114h7106bb8fi1cedc192e980ec2f@mail.gmail.com> References: <4A72A656.7040802@ar.media.kyoto-u.ac.jp> <1249030460.336856@nntpgw.ncl.ac.uk> <4A72B40D.7080004@ar.media.kyoto-u.ac.jp> <1249033198.686107@nntpgw.ncl.ac.uk> <1249037111.723816@nntpgw.ncl.ac.uk> <4A72CD4E.7000606@ar.media.kyoto-u.ac.jp> <4A72D363.8090603@ar.media.kyoto-u.ac.jp> <5b8d13220907312114h7106bb8fi1cedc192e980ec2f@mail.gmail.com> Message-ID: <5b8d13220907312115y7af850afva4659559a84f15b8@mail.gmail.com> On Sat, Aug 1, 2009 at 1:14 PM, David Cournapeau wrote: > On Fri, Jul 31, 2009 at 8:20 PM, David > Cournapeau wrote: >> Steven Coutts wrote: >>> David Cournapeau ar.media.kyoto-u.ac.jp> writes: >>> >>> If you are willing to do >>> >>>> it, I would also be interested whether numpy works ok if linked against >>>> BLAS/LAPACK instead of atlas (i.e. build numpy, again from scratch, with >>>> ATLAS=None python setup.py build, and then run the test suite). >>>> >>>> >>> >>> Yes that appears to work fine, all tests run. >>> >> >> So that's a problem with ATLAS. Maybe a gcc bug ? Another user contacted >> me privately for my rpm repository, and got exactly the same problem >> with the rpms, on CentOS 5.3 as well. I will try to look at it on a >> CentOS VM if I have time this WE, > > Ok, I have installed CentOS 5.3 on my machine (kudos to vmware fusion > which installs the OS automatically), and built numpy 1.3.0 with atlas > 3.8.3 + lapack 3.1.1 on 64 bits. But I could not reproduce the bug, > unfortunately. Are you using the threaded atlas ? 
I forgot: another thing which would be helpful, since you can reproduce the bug, would be to build a debug version of numpy (python setup.py build_ext -g), and reproduce the bug under gdb to get a traceback. David From scott.sinclair.za at gmail.com Sat Aug 1 05:16:08 2009 From: scott.sinclair.za at gmail.com (Scott Sinclair) Date: Sat, 1 Aug 2009 11:16:08 +0200 Subject: [Numpy-discussion] Doc-editor internal error Message-ID: <6a17e9ee0908010216y411b2f0eve2dfe74f5ee68553@mail.gmail.com> Hi, I'm seeing "500 Internal Error" at http://docs.scipy.org/numpy/stats/ Cheers, Scott From scott.sinclair.za at gmail.com Sat Aug 1 05:33:06 2009 From: scott.sinclair.za at gmail.com (Scott Sinclair) Date: Sat, 1 Aug 2009 11:33:06 +0200 Subject: [Numpy-discussion] Doc-editor internal error In-Reply-To: <6a17e9ee0908010216y411b2f0eve2dfe74f5ee68553@mail.gmail.com> References: <6a17e9ee0908010216y411b2f0eve2dfe74f5ee68553@mail.gmail.com> Message-ID: <6a17e9ee0908010233m58e67a1ejef88333666d3c517@mail.gmail.com> Ignore the noise. Seems to be fixed now. 2009/8/1 Scott Sinclair : > Hi, > > I'm seeing "500 Internal Error" at http://docs.scipy.org/numpy/stats/ > > Cheers, > Scott > From gav451 at gmail.com Sun Aug 2 11:14:11 2009 From: gav451 at gmail.com (Gerard Vermeulen) Date: Sun, 2 Aug 2009 17:14:11 +0200 Subject: [Numpy-discussion] PyQwt-5.2.0 released Message-ID: <20090802171411.41ec88ad@jupiter.rozan.fr> What is PyQwt ( http://pyqwt.sourceforge.net ) ? - it is a set of Python bindings for the Qwt C++ class library which extends the Qt framework with widgets for scientific and engineering applications. It provides a 2-dimensional plotting widget and various widgets to display and control bounded or unbounded floating point values. - it requires and extends PyQt, a set of Python bindings for Qt. - it supports the use of PyQt, Qt, Qwt, and optionally NumPy or SciPy in a GUI Python application or in an interactive Python session. - it runs on POSIX, Mac OS X and Windows platforms (practically any platform supported by Qt and Python). - it plots fast: fairly good hardware allows a rate of 100,000 points/second. (PyQwt with Qt-3 is faster than with Qt-4). - it is licensed under the GPL with an exception to allow dynamic linking with non-free releases of Qt and PyQt. The most important new features of PyQwt v5.2.0 are: - support for Qwt v5.2.0 - support for PyQt4 up to v4.5.4, PyQt3 up to v3.18.1, and SIP up to v4.8.2. - switch to documentation generated by Sphinx. - provide a normal qwt plugin for the pyuic4 user interface compiler instead of the abnormal qwt plugin included in PyQt. The most important bug fixes in PyQwt-5.2.0 are: - fixed crashes in the QImage-array conversion functions. - fixed three transfer-of-ownership bugs. PyQwt-5.2.0 supports: 1. Python v2.6.x and v2.5.x. 2. PyQt v3.18.1 down to v3.17.5. 3. PyQt v4.5.x, v4.4.x. 4. SIP v4.8.x down to v4.7.3. 5. Qt v3.3.x. 6. Qt v4.5.x, v4.4.x, and v4.3.x. 7. Qwt v5.2.x, v5.1.x, and v5.0.x. 8. Recent versions of NumPy, numarray, and/or Numeric. Enjoy -- Gerard Vermeulen From dwf at cs.toronto.edu Mon Aug 3 02:12:29 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Mon, 3 Aug 2009 02:12:29 -0400 Subject: [Numpy-discussion] Differences Between Arrays and Matrices in Numpy In-Reply-To: References: Message-ID: <209B50BC-3867-4A38-B429-E4F3571B69D2@cs.toronto.edu> On 30-Jul-09, at 1:14 PM, Nanime Puloski wrote: > What are some differences between arrays and matrices using the Numpy > library? 
When would one want to use arrays instead of matrices and > vice versa? This is answered in the online documentation in several places: http://preview.tinyurl.com/n6of54 http://docs.scipy.org/doc/numpy/reference/arrays.classes.html#matrix-objects Regards, David From stevec at couttsnet.com Mon Aug 3 04:30:26 2009 From: stevec at couttsnet.com (Steven Coutts) Date: Mon, 3 Aug 2009 08:30:26 +0000 (UTC) Subject: [Numpy-discussion] ** On entry to ILAENV parameter number 2 had an illegal value References: <4A72A656.7040802@ar.media.kyoto-u.ac.jp> <1249030460.336856@nntpgw.ncl.ac.uk> <4A72B40D.7080004@ar.media.kyoto-u.ac.jp> <1249033198.686107@nntpgw.ncl.ac.uk> <1249037111.723816@nntpgw.ncl.ac.uk> <4A72CD4E.7000606@ar.media.kyoto-u.ac.jp> <4A72D363.8090603@ar.media.kyoto-u.ac.jp> <5b8d13220907312114h7106bb8fi1cedc192e980ec2f@mail.gmail.com> <5b8d13220907312115y7af850afva4659559a84f15b8@mail.gmail.com> Message-ID: David Cournapeau gmail.com> writes: > > I forgot: another thing which would be helpful since you can reproduce > the bug would be to build a debug version of numpy (python setup.py > build_ext -g), and reproduce the bug under gdb to have a traceback. > > David Ok I have rebuilt numpy-1.3.0 with debugging, and it segfaults as soon as I import numpy in python2.5 Backtrace: http://pastebin.com/d27fbd2a5 Regards From stevec at couttsnet.com Mon Aug 3 04:35:31 2009 From: stevec at couttsnet.com (Steven Coutts) Date: Mon, 3 Aug 2009 08:35:31 +0000 (UTC) Subject: [Numpy-discussion] ** On entry to ILAENV parameter number 2 had an illegal value References: <4A72A656.7040802@ar.media.kyoto-u.ac.jp> <1249030460.336856@nntpgw.ncl.ac.uk> <4A72B40D.7080004@ar.media.kyoto-u.ac.jp> <1249033198.686107@nntpgw.ncl.ac.uk> <1249037111.723816@nntpgw.ncl.ac.uk> <4A72CD4E.7000606@ar.media.kyoto-u.ac.jp> <4A72D363.8090603@ar.media.kyoto-u.ac.jp> <5b8d13220907312114h7106bb8fi1cedc192e980ec2f@mail.gmail.com> <5b8d13220907312115y7af850afva4659559a84f15b8@mail.gmail.com> Message-ID: Steven Coutts couttsnet.com> writes: > > Ok I have rebuilt numpy-1.3.0 with debugging, and it segfaults as soon as I > import numpy in python2.5 > > Backtrace: > http://pastebin.com/d27fbd2a5 > > Regards > Sorry, ignore this, I cleaned out numpy properly, re-installed 1.3.0 and the tests are all running now. Regards From david at ar.media.kyoto-u.ac.jp Mon Aug 3 04:26:58 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Mon, 03 Aug 2009 17:26:58 +0900 Subject: [Numpy-discussion] ** On entry to ILAENV parameter number 2 had an illegal value In-Reply-To: References: <4A72A656.7040802@ar.media.kyoto-u.ac.jp> <1249030460.336856@nntpgw.ncl.ac.uk> <4A72B40D.7080004@ar.media.kyoto-u.ac.jp> <1249033198.686107@nntpgw.ncl.ac.uk> <1249037111.723816@nntpgw.ncl.ac.uk> <4A72CD4E.7000606@ar.media.kyoto-u.ac.jp> <4A72D363.8090603@ar.media.kyoto-u.ac.jp> <5b8d13220907312114h7106bb8fi1cedc192e980ec2f@mail.gmail.com> <5b8d13220907312115y7af850afva4659559a84f15b8@mail.gmail.com> Message-ID: <4A769F52.7000705@ar.media.kyoto-u.ac.jp> Steven Coutts wrote: > > > Sorry, ignore this, I cleaned out numpy properly, re-installed 1.3.0 and the > tests are all running now. > Do you mean that if you build with debug information, everything else being equal, you cannot reproduce the crashes ? 
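(For reference, the gdb procedure I had in mind goes something along these lines -- the exact test invocation is only an illustration:

$ python setup.py build_ext -g
$ python setup.py install
$ gdb python
(gdb) run -c "import numpy; numpy.test()"
... wait for the segfault ...
(gdb) bt

The bt output is what would give us a usable traceback.)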
cheers, David From stevec at couttsnet.com Mon Aug 3 05:32:03 2009 From: stevec at couttsnet.com (Steven Coutts) Date: Mon, 03 Aug 2009 10:32:03 +0100 Subject: [Numpy-discussion] ** On entry to ILAENV parameter number 2 had an illegal value References: <4A72A656.7040802@ar.media.kyoto-u.ac.jp> <1249030460.336856@nntpgw.ncl.ac.uk> <4A72B40D.7080004@ar.media.kyoto-u.ac.jp> <1249033198.686107@nntpgw.ncl.ac.uk> <1249037111.723816@nntpgw.ncl.ac.uk> <4A72CD4E.7000606@ar.media.kyoto-u.ac.jp> <4A72D363.8090603@ar.media.kyoto-u.ac.jp> <5b8d13220907312114h7106bb8fi1cedc192e980ec2f@mail.gmail.com> <5b8d13220907312115y7af850afva4659559a84f15b8@mail.gmail.com> <4A769F52.7000705@ar.media.kyoto-u.ac.jp> Message-ID: <1249291923.686115@nntpgw.ncl.ac.uk> David Cournapeau wrote: > > Do you mean that if you build with debug information, everything else > being equal, you cannot reproduce the crashes ? > > cheers, > > David That does appear to be the case; SciPy 0.7.0 is now also running fine. Regards From cournape at gmail.com Mon Aug 3 09:25:18 2009 From: cournape at gmail.com (David Cournapeau) Date: Mon, 3 Aug 2009 22:25:18 +0900 Subject: [Numpy-discussion] ** On entry to ILAENV parameter number 2 had an illegal value In-Reply-To: <1249291923.686115@nntpgw.ncl.ac.uk> References: <4A72CD4E.7000606@ar.media.kyoto-u.ac.jp> <4A72D363.8090603@ar.media.kyoto-u.ac.jp> <5b8d13220907312114h7106bb8fi1cedc192e980ec2f@mail.gmail.com> <5b8d13220907312115y7af850afva4659559a84f15b8@mail.gmail.com> <4A769F52.7000705@ar.media.kyoto-u.ac.jp> <1249291923.686115@nntpgw.ncl.ac.uk> Message-ID: <5b8d13220908030625j55a77a3ak6d75582bc734971f@mail.gmail.com> On Mon, Aug 3, 2009 at 6:32 PM, Steven Coutts wrote: > David Cournapeau wrote: > >> >> Do you mean that if you build with debug information, everything else >> being equal, you cannot reproduce the crashes ? >> >> cheers, >> >> David > > That does appear to be the case; SciPy 0.7.0 is now also running fine. It is just getting weirder - the fact that numpy worked with bare BLAS/LAPACK and crashed with atlas led me to think that it was an atlas problem. But now, this smells more like a compiler problem. I would first really check that the only difference between crash vs. no crash is debug vs non debug (both with ATLAS), to avoid chasing wrong hints. Practically, I would advise you to "clone" the numpy sources (one debug, one non debug), and build from scratch with a script to do things in a repeatable manner. Then, if indeed you get a crash only with the non-debug build, I would recommend installing my project numscons, to be able to "play" with flags, and first try building with the exact same flags as a normal build, but adding the -g flag. For example: CFLAGS="-O2 -fno-strict-aliasing -g" python setupscons.py install --prefix=blabla Hopefully, you will be able to reproduce the crash and get a backtrace, cheers, David From afriedle at indiana.edu Mon Aug 3 09:32:57 2009 From: afriedle at indiana.edu (Andrew Friedley) Date: Mon, 03 Aug 2009 09:32:57 -0400 Subject: [Numpy-discussion] strange sin/cos performance Message-ID: <4A76E709.9090100@indiana.edu> While working on GSoC stuff I came across this weird performance behavior for sine and cosine -- using float32 is way slower than float64. On a 2ghz opteron: sin float32 1.12447786331 sin float64 0.133481025696 cos float32 1.14155912399 cos float64 0.131420135498 The times are in seconds, and are best of three runs of ten iterations of numpy.{sin,cos} over a 1000-element array (script attached). 
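For reference, the core of the attached script is just this pattern, repeated for each dtype and function:

import timeit
t = timeit.Timer("numpy.sin(a)",
                 "import numpy\n"
                 "a = numpy.arange(0.0, 1000, (2 * 3.14159) / 1000, dtype=numpy.float32)")
print "sin float32", min(t.repeat(3, 10))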
I've produced similar results on a PS3 system also. The opteron is running Python 2.6.1 and NumPy 1.3.0, while the PS3 has Python 2.5.1 and NumPy 1.1.1. I haven't jumped into the code yet, but does anyone know why sin/cos are ~8.5x slower for 32-bit floats compared to 64-bit doubles? Side question: I see people in emails writing things like 'timeit foo(x)' and having it run some sort of standard benchmark, how exactly do I do that? Is that some environment other than a normal Python? Thanks, Andrew -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: cos.py URL: From afriedle at indiana.edu Mon Aug 3 09:38:46 2009 From: afriedle at indiana.edu (Andrew Friedley) Date: Mon, 03 Aug 2009 09:38:46 -0400 Subject: [Numpy-discussion] Add/multiply reduction confusion In-Reply-To: <20090705185700.GA8888@phare.normalesup.org> References: <4A48D2DB.10508@indiana.edu> <4A50CAAE.5060400@indiana.edu> <9457e7c80907050937u305f3508o1829a00f28d9f8f5@mail.gmail.com> <4A50F536.6040200@indiana.edu> <20090705185700.GA8888@phare.normalesup.org> Message-ID: <4A76E866.90003@indiana.edu> Gael Varoquaux wrote: > On Sun, Jul 05, 2009 at 02:47:18PM -0400, Andrew Friedley wrote: >> Stéfan van der Walt wrote: >>> 2009/7/5 Andrew Friedley : >>>> I found the check that does the type 'upcasting' in >>>> umath_ufunc_object.inc around line 3072 (NumPy 1.3.0). Turns out all I >>>> need to do is make sure my add and multiply ufuncs are actually named >>>> 'add' and 'multiply' and arrays will be upcasted appropriately. > >>> Would you please be so kind as to add your findings here: > >>> http://docs.scipy.org/numpy/docs/numpy-docs/reference/index.rst/#reference-index > >>> I haven't read through that document recently, so it may be in there already. > >> I created an account (afriedle) but looks like I don't have edit >> permissions. > > I have added you to the Editor list. Thanks and sorry about the delay; I went and added the comment I proposed. Andrew From cournape at gmail.com Mon Aug 3 09:44:28 2009 From: cournape at gmail.com (David Cournapeau) Date: Mon, 3 Aug 2009 22:44:28 +0900 Subject: [Numpy-discussion] strange sin/cos performance In-Reply-To: <4A76E709.9090100@indiana.edu> References: <4A76E709.9090100@indiana.edu> Message-ID: <5b8d13220908030644v55366931hffbc80267d998667@mail.gmail.com> On Mon, Aug 3, 2009 at 10:32 PM, Andrew Friedley wrote: > While working on GSoC stuff I came across this weird performance behavior > for sine and cosine -- using float32 is way slower than float64. On a 2ghz > opteron: > > sin float32 1.12447786331 > sin float64 0.133481025696 > cos float32 1.14155912399 > cos float64 0.131420135498 Which OS are you on ? FWIW, on Mac OS X, with recent svn checkout, I get expected results (float32 ~ twice faster). > > The times are in seconds, and are best of three runs of ten iterations of > numpy.{sin,cos} over a 1000-element array (script attached). I've produced > similar results on a PS3 system also. The opteron is running Python 2.6.1 > and NumPy 1.3.0, while the PS3 has Python 2.5.1 and NumPy 1.1.1. > > I haven't jumped into the code yet, but does anyone know why sin/cos are > ~8.5x slower for 32-bit floats compared to 64-bit doubles? My guess would be that you are on a platform where there is no sinf, and our sinf replacement is bad for some reason. > Side question: I see people in emails writing things like 'timeit foo(x)' > and having it run some sort of standard benchmark, how exactly do I do that? 
> Is that some environment other than a normal Python? Yes, that's in ipython. cheers, David From emmanuelle.gouillart at normalesup.org Mon Aug 3 09:45:56 2009 From: emmanuelle.gouillart at normalesup.org (Emmanuelle Gouillart) Date: Mon, 3 Aug 2009 15:45:56 +0200 Subject: [Numpy-discussion] strange sin/cos performance In-Reply-To: <4A76E709.9090100@indiana.edu> References: <4A76E709.9090100@indiana.edu> Message-ID: <20090803134556.GA31036@phare.normalesup.org> Hi Andrew, %timeit is an IPython magic command that uses the timeit module, see http://ipython.scipy.org/doc/stable/html/interactive/reference.html?highlight=timeit for more information about how to use it. So you were right to suppose that it is not a "normal Python". However, I was not able to reproduce your observations. >>> import numpy as np >>> a = np.arange(0.0, 1000, (2 * 3.14159) / 1000, dtype=np.float32) >>> b = np.arange(0.0, 1000, (2 * 3.14159) / 1000, dtype=np.float64) >>> %timeit -n 10 np.sin(a) 10 loops, best of 3: 8.67 ms per loop >>> %timeit -n 10 np.sin(b) 10 loops, best of 3: 9.29 ms per loop Emmanuelle On Mon, Aug 03, 2009 at 09:32:57AM -0400, Andrew Friedley wrote: > While working on GSoC stuff I came across this weird performance > behavior for sine and cosine -- using float32 is way slower than > float64. On a 2ghz opteron: > > sin float32 1.12447786331 > sin float64 0.133481025696 > cos float32 1.14155912399 > cos float64 0.131420135498 > > The times are in seconds, and are best of three runs of ten iterations > of numpy.{sin,cos} over a 1000-element array (script attached). I've > produced similar results on a PS3 system also. The opteron is running > Python 2.6.1 and NumPy 1.3.0, while the PS3 has Python 2.5.1 and NumPy > 1.1.1. > > I haven't jumped into the code yet, but does anyone know why sin/cos are > ~8.5x slower for 32-bit floats compared to 64-bit doubles? > > Side question: I see people in emails writing things like 'timeit > foo(x)' and having it run some sort of standard benchmark, how exactly > do I do that? Is that some environment other than a normal Python? > > Thanks, > > Andrew > import timeit > t = timeit.Timer("numpy.sin(a)", > "import numpy\n" > "a = numpy.arange(0.0, 1000, (2 * 3.14159) / 1000, dtype=numpy.float32)") > print "sin float32", min(t.repeat(3, 10)) > t = timeit.Timer("numpy.sin(a)", > "import numpy\n" > "a = numpy.arange(0.0, 1000, (2 * 3.14159) / 1000, dtype=numpy.float64)") > print "sin float64", min(t.repeat(3, 10)) > t = timeit.Timer("numpy.cos(a)", > "import numpy\n" > "a = numpy.arange(0.0, 1000, (2 * 3.14159) / 1000, dtype=numpy.float32)") > print "cos float32", min(t.repeat(3, 10)) > t = timeit.Timer("numpy.cos(a)", > "import numpy\n" > "a = numpy.arange(0.0, 1000, (2 * 3.14159) / 1000, dtype=numpy.float64)") > print "cos float64", min(t.repeat(3, 10)) > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From afriedle at indiana.edu Mon Aug 3 10:08:42 2009 From: afriedle at indiana.edu (Andrew Friedley) Date: Mon, 03 Aug 2009 10:08:42 -0400 Subject: [Numpy-discussion] strange sin/cos performance In-Reply-To: <5b8d13220908030644v55366931hffbc80267d998667@mail.gmail.com> References: <4A76E709.9090100@indiana.edu> <5b8d13220908030644v55366931hffbc80267d998667@mail.gmail.com> Message-ID: <4A76EF6A.5060400@indiana.edu> Thanks for the quick responses. 
David Cournapeau wrote: > On Mon, Aug 3, 2009 at 10:32 PM, Andrew Friedley wrote: >> While working on GSoC stuff I came across this weird performance behavior >> for sine and cosine -- using float32 is way slower than float64. On a 2ghz >> opteron: >> >> sin float32 1.12447786331 >> sin float64 0.133481025696 >> cos float32 1.14155912399 >> cos float64 0.131420135498 > > Which OS are you on ? FWIW, on Mac OS X, with recent svn checkout, I > get expected results (float32 ~ twice faster). The numbers above are on linux, RHEL 5.2. The PS3 is running Fedora 9 I think. I just ran on a PPC OSX 10.5 system: sin float32 0.111793041229 sin float64 0.0902218818665 cos float32 0.112202882767 cos float64 0.0917768478394 Much more reasonable, but still not what I'd expect or what you seem to expect. >> The times are in seconds, and are best of three runs of ten iterations of >> numpy.{sin,cos} over a 1000-element array (script attached). I've produced >> similar results on a PS3 system also. The opteron is running Python 2.6.1 >> and NumPy 1.3.0, while the PS3 has Python 2.5.1 and NumPy 1.1.1. >> >> I haven't jumped into the code yet, but does anyone know why sin/cos are >> ~8.5x slower for 32-bit floats compared to 64-bit doubles? > > My guess would be that you are on a platform where there is no sinf, > and our sinf replacement is bad for some reason. I think linux has sinf, is there a quick/easy way to check if numpy is using it? >> Side question: I see people in emails writing things like 'timeit foo(x)' >> and having it run some sort of standard benchmark, how exactly do I do that? >> Is that some environment other than a normal Python? > > Yes, that's in ipython. Thanks for the pointer. Andrew From afriedle at indiana.edu Mon Aug 3 10:10:27 2009 From: afriedle at indiana.edu (Andrew Friedley) Date: Mon, 03 Aug 2009 10:10:27 -0400 Subject: [Numpy-discussion] strange sin/cos performance In-Reply-To: <20090803134556.GA31036@phare.normalesup.org> References: <4A76E709.9090100@indiana.edu> <20090803134556.GA31036@phare.normalesup.org> Message-ID: <4A76EFD3.5010508@indiana.edu> Emmanuelle Gouillart wrote: > Hi Andrew, > > %timeit is an IPython magic command that uses the timeit module, > see > http://ipython.scipy.org/doc/stable/html/interactive/reference.html?highlight=timeit > for more information about how to use it. So you were right to suppose > that it is not a "normal Python". Thanks for the pointer, I'm not familiar with IPython at all, will check it out. > However, I was not able to reproduce your observations. > >>>> import numpy as np >>>> a = np.arange(0.0, 1000, (2 * 3.14159) / 1000, dtype=np.float32) >>>> b = np.arange(0.0, 1000, (2 * 3.14159) / 1000, dtype=np.float64) >>>> %timeit -n 10 np.sin(a) > 10 loops, best of 3: 8.67 ms per loop >>>> %timeit -n 10 np.sin(b) > 10 loops, best of 3: 9.29 ms per loop OK, I'm curious, what OS/Python/Numpy are you using? 
Andrew From emmanuelle.gouillart at normalesup.org Mon Aug 3 10:21:12 2009 From: emmanuelle.gouillart at normalesup.org (Emmanuelle Gouillart) Date: Mon, 3 Aug 2009 16:21:12 +0200 Subject: [Numpy-discussion] strange sin/cos performance In-Reply-To: <4A76EFD3.5010508@indiana.edu> References: <4A76E709.9090100@indiana.edu> <20090803134556.GA31036@phare.normalesup.org> <4A76EFD3.5010508@indiana.edu> Message-ID: <20090803142112.GA7495@phare.normalesup.org> > >>>> import numpy as np > >>>> a = np.arange(0.0, 1000, (2 * 3.14159) / 1000, dtype=np.float32) > >>>> b = np.arange(0.0, 1000, (2 * 3.14159) / 1000, dtype=np.float64) > >>>> %timeit -n 10 np.sin(a) > > 10 loops, best of 3: 8.67 ms per loop > >>>> %timeit -n 10 np.sin(b) > > 10 loops, best of 3: 9.29 ms per loop > OK, I'm curious, what OS/Python/Numpy are you using? Sorry, I should have specified this information earlier: OS: Linux Ubuntu 9.04 (running a Dual Core Intel Pentium E5200 @ 2.50GHz) Python: 2.6.2 Numpy: 1.2.1 Emmanuelle From josef.pktd at gmail.com Mon Aug 3 10:44:51 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 3 Aug 2009 10:44:51 -0400 Subject: [Numpy-discussion] strange sin/cos performance In-Reply-To: <20090803142112.GA7495@phare.normalesup.org> References: <4A76E709.9090100@indiana.edu> <20090803134556.GA31036@phare.normalesup.org> <4A76EFD3.5010508@indiana.edu> <20090803142112.GA7495@phare.normalesup.org> Message-ID: <1cd32cbb0908030744x7b924ee6g808cd9f85905660d@mail.gmail.com> On Mon, Aug 3, 2009 at 10:21 AM, Emmanuelle Gouillart wrote: >> >>>> import numpy as np >> >>>> a = np.arange(0.0, 1000, (2 * 3.14159) / 1000, dtype=np.float32) >> >>>> b = np.arange(0.0, 1000, (2 * 3.14159) / 1000, dtype=np.float64) >> >>>> %timeit -n 10 np.sin(a) >> > 10 loops, best of 3: 8.67 ms per loop >> >>>> %timeit -n 10 np.sin(b) >> > 10 loops, best of 3: 9.29 ms per loop > >> OK, I'm curious, what OS/Python/Numpy are you using? > > Sorry, I should have specified this information earlier: > > OS: Linux Ubuntu 9.04 (running a Dual Core Intel Pentium E5200 @ > 2.50GHz) > Python: 2.6.2 > Numpy: 1.2.1 > > Emmanuelle > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > just for reference: on a plain single core WindowsXP (32bit) notebook with official numpy 1.3.0, I get with some variation sin float32 0.0963996820825 sin float64 0.164140135129 cos float32 0.124504371366 cos float64 0.149174266562 Josef From cournape at gmail.com Mon Aug 3 11:13:49 2009 From: cournape at gmail.com (David Cournapeau) Date: Tue, 4 Aug 2009 00:13:49 +0900 Subject: [Numpy-discussion] strange sin/cos performance In-Reply-To: <4A76EF6A.5060400@indiana.edu> References: <4A76E709.9090100@indiana.edu> <5b8d13220908030644v55366931hffbc80267d998667@mail.gmail.com> <4A76EF6A.5060400@indiana.edu> Message-ID: <5b8d13220908030813o7f00a975j43712cb458f728d5@mail.gmail.com> On Mon, Aug 3, 2009 at 11:08 PM, Andrew Friedley wrote: > Thanks for the quick responses. > > David Cournapeau wrote: >> On Mon, Aug 3, 2009 at 10:32 PM, Andrew Friedley wrote: >>> While working on GSoC stuff I came across this weird performance behavior >>> for sine and cosine -- using float32 is way slower than float64. On a 2ghz >>> opteron: >>> >>> sin float32 1.12447786331 >>> sin float64 0.133481025696 >>> cos float32 1.14155912399 >>> cos float64 0.131420135498 >> >> Which OS are you on ? 
FWIW, on Mac OS X, with recent svn checkout, I >> get expected results (float32 ~ twice faster). > > The numbers above are on linux, RHEL 5.2. The PS3 is running Fedora 9 I > think. I know next to nothing about the PS3 hardware, but I know that it is quite different compared to a conventional x86 CPU. Does it even have both 4 and 8 byte native floats ? > > Much more reasonable, but still not what I'd expect or what you seem to > expect. On an x86 system with sinf available in the math lib, I would expect the float32 to be faster than float64. Other than that, the exact ratio depends on too many factors (sse vs x87 usage, cache size, compiler, math library performance). One order of magnitude slower seems very strange in any case. > >>> The times are in seconds, and are best of three runs of ten iterations of >>> numpy.{sin,cos} over a 1000-element array (script attached). I've produced >>> similar results on a PS3 system also. The opteron is running Python 2.6.1 >>> and NumPy 1.3.0, while the PS3 has Python 2.5.1 and NumPy 1.1.1. >>> >>> I haven't jumped into the code yet, but does anyone know why sin/cos are >>> ~8.5x slower for 32-bit floats compared to 64-bit doubles? >> >> My guess would be that you are on a platform where there is no sinf, >> and our sinf replacement is bad for some reason. > > I think linux has sinf, is there a quick/easy way to check if numpy is > using it? You can look at the config.h in numpy/core/include/numpy, and see if there is a HAVE_SINF defined (for numpy >= 1.2.0 at least). cheers, David From kwgoodman at gmail.com Mon Aug 3 11:17:21 2009 From: kwgoodman at gmail.com (Keith Goodman) Date: Mon, 3 Aug 2009 08:17:21 -0700 Subject: [Numpy-discussion] strange sin/cos performance In-Reply-To: <20090803142112.GA7495@phare.normalesup.org> References: <4A76E709.9090100@indiana.edu> <20090803134556.GA31036@phare.normalesup.org> <4A76EFD3.5010508@indiana.edu> <20090803142112.GA7495@phare.normalesup.org> Message-ID: On Mon, Aug 3, 2009 at 7:21 AM, Emmanuelle Gouillart wrote: >> >>>> import numpy as np >> >>>> a = np.arange(0.0, 1000, (2 * 3.14159) / 1000, dtype=np.float32) >> >>>> b = np.arange(0.0, 1000, (2 * 3.14159) / 1000, dtype=np.float64) >> >>>> %timeit -n 10 np.sin(a) >> > 10 loops, best of 3: 8.67 ms per loop >> >>>> %timeit -n 10 np.sin(b) >> > 10 loops, best of 3: 9.29 ms per loop > >> OK, I'm curious, what OS/Python/Numpy are you using? > > Sorry, I should have specified this information earlier: > > OS: Linux Ubuntu 9.04 (running a Dual Core Intel Pentium E5200 @ > 2.50GHz) > Python: 2.6.2 > Numpy: 1.2.1 Why are my times so different from yours? 
>> a = np.arange(0.0, 1000, (2 * 3.14159) / 1000, dtype=np.float32) >> b = np.arange(0.0, 1000, (2 * 3.14159) / 1000, dtype=np.float64) >> timeit -n 10 np.sin(a) 10 loops, best of 3: 46.8 ms per loop >> timeit -n 10 np.sin(b) 10 loops, best of 3: 7.43 ms per loop Ubuntu 9.04 on Core i7 920 (Quad 2.66GHz) Python 2.6.2 Numpy 1.3.0 And even though it is not used for this problem: ATLAS 3.8.3 (single threaded) From sccolbert at gmail.com Mon Aug 3 12:23:13 2009 From: sccolbert at gmail.com (Chris Colbert) Date: Mon, 3 Aug 2009 12:23:13 -0400 Subject: [Numpy-discussion] strange sin/cos performance In-Reply-To: References: <4A76E709.9090100@indiana.edu> <20090803134556.GA31036@phare.normalesup.org> <4A76EFD3.5010508@indiana.edu> <20090803142112.GA7495@phare.normalesup.org> Message-ID: <7f014ea60908030923s30b958a6meb2afbad38269052@mail.gmail.com> I get similar results as the OP: In [1]: import numpy as np In [2]: a = np.arange(0.0, 1000, (2*3.14159) / 1000, dtype=np.float32) In [3]: b = np.arange(0.0, 1000, (2*3.14159) / 1000, dtype=np.float64) In [4]: %timeit -n 10 np.sin(a) 10 loops, best of 3: 63.8 ms per loop In [5]: %timeit -n 10 np.sin(b) 10 loops, best of 3: 10.8 ms per loop In [6]: %timeit np.sin(a) 10 loops, best of 3: 63.6 ms per loop In [7]: %timeit np.sin(b) 100 loops, best of 3: 8.85 ms per loop machine: ubuntu 9.04 AMD64 Intel Qx9300 @ 2.53GHz numpy 1.3 with Atlas 3.8.3 python 2.6.2 On Mon, Aug 3, 2009 at 11:17 AM, Keith Goodman wrote: > On Mon, Aug 3, 2009 at 7:21 AM, Emmanuelle > Gouillart wrote: >>> >>>> import numpy as np >>> >>>> a = np.arange(0.0, 1000, (2 * 3.14159) / 1000, dtype=np.float32) >>> >>>> b = np.arange(0.0, 1000, (2 * 3.14159) / 1000, dtype=np.float64) >>> >>>> %timeit -n 10 np.sin(a) >>> > 10 loops, best of 3: 8.67 ms per loop >>> >>>> %timeit -n 10 np.sin(b) >>> > 10 loops, best of 3: 9.29 ms per loop >> >>> OK, I'm curious, what OS/Python/Numpy are you using? >> >> Sorry, I should have specified this information earlier: >> >> OS: Linux Ubuntu 9.04 (running a Dual Core Intel Pentium E5200 @ >> 2.50GHz) >> Python: 2.6.2 >> Numpy: 1.2.1 > > Why are my times so different from yours? 
> >>> a = np.arange(0.0, 1000, (2 * 3.14159) / 1000, dtype=np.float32) > >>> b = np.arange(0.0, 1000, (2 * 3.14159) / 1000, dtype=np.float64) > >>> timeit -n 10 np.sin(a) > 10 loops, best of 3: 46.8 ms per loop > >>> timeit -n 10 np.sin(b) > 10 loops, best of 3: 7.43 ms per loop > > Ubuntu 9.04 on Core i7 920 (Quad 2.66GHz) > Python 2.6.2 > Numpy 1.3.0 > And even though it is not used for this problem: ATLAS 3.8.3 (single threaded) > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From emmanuelle.gouillart at normalesup.org Mon Aug 3 13:09:32 2009 From: emmanuelle.gouillart at normalesup.org (Emmanuelle Gouillart) Date: Mon, 3 Aug 2009 19:09:32 +0200 Subject: [Numpy-discussion] strange sin/cos performance In-Reply-To: References: <4A76E709.9090100@indiana.edu> <20090803134556.GA31036@phare.normalesup.org> <4A76EFD3.5010508@indiana.edu> <20090803142112.GA7495@phare.normalesup.org> Message-ID: <20090803170932.GA23528@phare.normalesup.org> On Mon, Aug 03, 2009 at 08:17:21AM -0700, Keith Goodman wrote: > On Mon, Aug 3, 2009 at 7:21 AM, Emmanuelle > Gouillart wrote: > >> >>>> import numpy as np > >> >>>> a = np.arange(0.0, 1000, (2 * 3.14159) / 1000, dtype=np.float32) > >> >>>> b = np.arange(0.0, 1000, (2 * 3.14159) / 1000, dtype=np.float64) > >> >>>> %timeit -n 10 np.sin(a) > >> > 10 loops, best of 3: 8.67 ms per loop > >> >>>> %timeit -n 10 np.sin(b) > >> > 10 loops, best of 3: 9.29 ms per loop > >> a = np.arange(0.0, 1000, (2 * 3.14159) / 1000, dtype=np.float32) > >> b = np.arange(0.0, 1000, (2 * 3.14159) / 1000, dtype=np.float64) > >> timeit -n 10 np.sin(a) > 10 loops, best of 3: 46.8 ms per loop > >> timeit -n 10 np.sin(b) > 10 loops, best of 3: 7.43 ms per loop > Why are my times so different from yours? No idea, sorry... All I can say is that I get similar results (around 11 and 12 ms per loop) with my other computer (which has the same Ubuntu/Python/Numpy configuration, and has 2 Intel T5600 @ 1.83GHz CPUs). 
Emmanuelle From charlesr.harris at gmail.com Mon Aug 3 13:38:14 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 3 Aug 2009 11:38:14 -0600 Subject: [Numpy-discussion] strange sin/cos performance In-Reply-To: <7f014ea60908030923s30b958a6meb2afbad38269052@mail.gmail.com> References: <4A76E709.9090100@indiana.edu> <20090803134556.GA31036@phare.normalesup.org> <4A76EFD3.5010508@indiana.edu> <20090803142112.GA7495@phare.normalesup.org> <7f014ea60908030923s30b958a6meb2afbad38269052@mail.gmail.com> Message-ID: On Mon, Aug 3, 2009 at 10:23 AM, Chris Colbert wrote: > I get similar results as the OP: > > > In [1]: import numpy as np > > In [2]: a = np.arange(0.0, 1000, (2*3.14159) / 1000, dtype=np.float32) > > In [3]: b = np.arange(0.0, 1000, (2*3.14159) / 1000, dtype=np.float64) > > In [4]: %timeit -n 10 np.sin(a) > 10 loops, best of 3: 63.8 ms per loop > > In [5]: %timeit -n 10 np.sin(b) > 10 loops, best of 3: 10.8 ms per loop > > In [6]: %timeit np.sin(a) > 10 loops, best of 3: 63.6 ms per loop > > In [7]: %timeit np.sin(b) > 100 loops, best of 3: 8.85 ms per loop > > > machine: > > ubuntu 9.04 AMD64 > Intel Qx9300 @ 2.53 > numpy 1.3 with Atlas 3.8.3 > python 2.6.2 > > On Mon, Aug 3, 2009 at 11:17 AM, Keith Goodman wrote: > > On Mon, Aug 3, 2009 at 7:21 AM, Emmanuelle > > Gouillart wrote: > >>> >>>> import numpy as np > >>> >>>> a = np.arange(0.0, 1000, (2 * 3.14159) / 1000, dtype=np.float32) > >>> >>>> b = np.arange(0.0, 1000, (2 * 3.14159) / 1000, dtype=np.float64) > >>> >>>> %timeit -n 10 np.sin(a) > >>> > 10 loops, best of 3: 8.67 ms per loop > >>> >>>> %timeit -n 10 np.sin(b) > >>> > 10 loops, best of 3: 9.29 ms per loop > >> > >>> OK, I'm curious, what OS/Python/Numpy are you using? > >> > >> Sorry, I should have specified these information earlier: > >> > >> OS: Linux Ubuntu 9.04 (running a Dual Core Intel Pentium E5200 @ > >> 2.50GHz) > >> Python: 2.6.2 > >> Numpy: 1.2.1 > > > > Why are my times so different from yours? > > > >>> a = np.arange(0.0, 1000, (2 * 3.14159) / 1000, dtype=np.float32) > >>> b = np.arange(0.0, 1000, (2 * 3.14159) / 1000, dtype=np.float64) > >>> timeit -n 10 np.sin(a) > > 10 loops, best of 3: 46.8 ms per loop > >>> timeit -n 10 np.sin(b) > > 10 loops, best of 3: 7.43 ms per loop > > > > Ubuntu 9.04 on Core i7 920 (Quad 2.66GHz) > > Python 2.6.2 > > Numpy 1.3.0 > > And even though it is not used for this problem: ATLAS 3.8.3 (single > threaded) > What compiler versions are folks using? In the slow cases, what is the timing for converting to double, computing the sin, then casting back to single? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From afriedle at indiana.edu Mon Aug 3 13:39:39 2009 From: afriedle at indiana.edu (Andrew Friedley) Date: Mon, 03 Aug 2009 13:39:39 -0400 Subject: [Numpy-discussion] strange sin/cos performance In-Reply-To: <5b8d13220908030813o7f00a975j43712cb458f728d5@mail.gmail.com> References: <4A76E709.9090100@indiana.edu> <5b8d13220908030644v55366931hffbc80267d998667@mail.gmail.com> <4A76EF6A.5060400@indiana.edu> <5b8d13220908030813o7f00a975j43712cb458f728d5@mail.gmail.com> Message-ID: <4A7720DB.80005@indiana.edu> David Cournapeau wrote: >> David Cournapeau wrote: >>> On Mon, Aug 3, 2009 at 10:32 PM, Andrew Friedley wrote: >>>> While working on GSoC stuff I came across this weird performance behavior >>>> for sine and cosine -- using float32 is way slower than float64. 
On a 2ghz >>>> opteron: >>>> >>>> sin float32 1.12447786331 >>>> sin float64 0.133481025696 >>>> cos float32 1.14155912399 >>>> cos float64 0.131420135498 >>> Which OS are you on ? FWIW, on Mac OS X, with recent svn checkout, I >>> get expected results (float32 ~ twice faster). >> The numbers above are on linux, RHEL 5.2. The PS3 is running Fedora 9 I >> think. > > I know next to nothing about the PS3 hardware, but I know that it is > quite different compared to a conventional x86 CPU. Does it even have > both 4 and 8 byte native floats ? Yes. As far as this discussion is concerned, the PS3/Cell is just a slow PowerPC. Quite different from x86, but probably not as different as you think :) >> Much more reasonable, but still not what I'd expect or what you seem to >> expect. > > On an x86 system with sinf available in the math lib, I would expect > the float32 to be faster than float64. Other than that, the exact > ratio depends on too many factors (sse vs x87 usage, cache size, > compiler, math library performance). One order of magnitude slower seems > very strange in any case. OK. I'll probably investigate this a bit further, but I don't have anything that really depends on this issue. It does explain a large part of why my cos ufunc was so much faster. Since I'm observing this on both x86 and PPC (PS3), I don't think it's a hardware issue -- something in the software stack. And now there are two people reporting results with only different numpy versions. >>>> The times are in seconds, and are best of three runs of ten iterations of >>>> numpy.{sin,cos} over a 1000-element array (script attached). I've produced >>>> similar results on a PS3 system also. The opteron is running Python 2.6.1 >>>> and NumPy 1.3.0, while the PS3 has Python 2.5.1 and NumPy 1.1.1. >>>> >>>> I haven't jumped into the code yet, but does anyone know why sin/cos are >>>> ~8.5x slower for 32-bit floats compared to 64-bit doubles? >>> My guess would be that you are on a platform where there is no sinf, >>> and our sinf replacement is bad for some reason. >> I think linux has sinf, is there a quick/easy way to check if numpy is >> using it? > > You can look at the config.h in numpy/core/include/numpy, and see if > there is a HAVE_SINF defined (for numpy >= 1.2.0 at least). OK, I see HAVE_SINF (and HAVE_COSF) for my 1.3.0 build on the opteron system. I'm using the distro-provided packages on other systems, so I guess I can't check those. I don't think this matters -- numpy/core/src/npy_math.c just defines sinf as a function calling sin. So if HAVE_SINF wasn't set, I'd expect the performance difference to be very little, with floats still being slightly faster (less mem traffic). Also I just went and wrote a C program to do a similar benchmark, and I am unable to reproduce the issue there. Makes me think the problem is in NumPy, but I have no idea where to look. Suggestions welcome :) Andrew From afriedle at indiana.edu Mon Aug 3 13:51:36 2009 From: afriedle at indiana.edu (Andrew Friedley) Date: Mon, 03 Aug 2009 13:51:36 -0400 Subject: [Numpy-discussion] strange sin/cos performance In-Reply-To: References: <4A76E709.9090100@indiana.edu> <20090803134556.GA31036@phare.normalesup.org> <4A76EFD3.5010508@indiana.edu> <20090803142112.GA7495@phare.normalesup.org> <7f014ea60908030923s30b958a6meb2afbad38269052@mail.gmail.com> Message-ID: <4A7723A8.8010107@indiana.edu> Charles R Harris wrote: > What compiler versions are folks using? 
In the slow cases, what is the > timing for converting to double, computing the sin, then casting back to > single? I did this, is this the right way to do that? t = timeit.Timer("numpy.sin(a.astype(numpy.float64)).astype(numpy.float32)", "import numpy\n" "a = numpy.arange(0.0, 1000, (2 * 3.14159) / 1000, dtype=numpy.float64)") print "sin converted float 32/64", min(t.repeat(3, 10)) Timings on my opteron system (2-socket 2-core 2GHz): sin float32 1.13407707214 sin float64 0.133460998535 sin converted float 32/64 0.18202996254 Not too surprising I guess. gcc --version shows: gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-44) My compile flags for my Python 2.6.1/NumPy 1.3.0 builds: -Os -fomit-frame-pointer -pipe -s -march=k8 -m64 Andrew From charlesr.harris at gmail.com Mon Aug 3 14:09:49 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 3 Aug 2009 12:09:49 -0600 Subject: [Numpy-discussion] strange sin/cos performance In-Reply-To: <4A7723A8.8010107@indiana.edu> References: <4A76E709.9090100@indiana.edu> <20090803134556.GA31036@phare.normalesup.org> <4A76EFD3.5010508@indiana.edu> <20090803142112.GA7495@phare.normalesup.org> <7f014ea60908030923s30b958a6meb2afbad38269052@mail.gmail.com> <4A7723A8.8010107@indiana.edu> Message-ID: On Mon, Aug 3, 2009 at 11:51 AM, Andrew Friedley wrote: > Charles R Harris wrote: > > What compiler versions are folks using? In the slow cases, what is the > > timing for converting to double, computing the sin, then casting back to > > single? > > I did this, is this the right way to do that? > > t = > timeit.Timer("numpy.sin(a.astype(numpy.float64)).astype(numpy.float32)", > "import numpy\n" > "a = numpy.arange(0.0, 1000, (2 * 3.14159) / 1000, > dtype=numpy.float64)") > print "sin converted float 32/64", min(t.repeat(3, 10)) > > Timings on my opteron system (2-socket 2-core 2GHz): > > sin float32 1.13407707214 > sin float64 0.133460998535 > sin converted float 32/64 0.18202996254 > > Not too surprising I guess. > > gcc --version shows: > > gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-44) > > My compile flags for my Python 2.6.1/NumPy 1.3.0 builds: > > -Os -fomit-frame-pointer -pipe -s -march=k8 -m64 > That looks right. When numpy doesn't find a *f version it basically does that conversion. This is beginning to look like a hardware/software implementation problem, maybe compiler related. That is, I suspect the fast times come from using a hardware implementation. What happens if you use -O2 instead of -Os? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsouthey at gmail.com Mon Aug 3 14:19:01 2009 From: bsouthey at gmail.com (Bruce Southey) Date: Mon, 03 Aug 2009 13:19:01 -0500 Subject: [Numpy-discussion] strange sin/cos performance In-Reply-To: <4A7723A8.8010107@indiana.edu> References: <4A76E709.9090100@indiana.edu> <20090803134556.GA31036@phare.normalesup.org> <4A76EFD3.5010508@indiana.edu> <20090803142112.GA7495@phare.normalesup.org> <7f014ea60908030923s30b958a6meb2afbad38269052@mail.gmail.com> <4A7723A8.8010107@indiana.edu> Message-ID: <4A772A15.8090407@gmail.com> On 08/03/2009 12:51 PM, Andrew Friedley wrote: > Charles R Harris wrote: > >> What compiler versions are folks using? In the slow cases, what is the >> timing for converting to double, computing the sin, then casting back to >> single? >> > > I did this, is this the right way to do that? 
> > t = timeit.Timer("numpy.sin(a.astype(numpy.float64)).astype(numpy.float32)", > "import numpy\n" > "a = numpy.arange(0.0, 1000, (2 * 3.14159) / 1000, > dtype=numpy.float64)") > print "sin converted float 32/64", min(t.repeat(3, 10)) > > Timings on my opteron system (2-socket 2-core 2GHz): > > sin float32 1.13407707214 > sin float64 0.133460998535 > sin converted float 32/64 0.18202996254 > > Not too surprising I guess. > > gcc --version shows: > > gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-44) > > My compile flags for my Python 2.6.1/NumPy 1.3.0 builds: > > -Os -fomit-frame-pointer -pipe -s -march=k8 -m64 > > Andrew > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > Hi, Can you try these from the command line: python -m timeit -n 100 -s "import numpy as np; a = np.arange(0.0, 1000, (2*3.14159) / 1000, dtype=np.float32)" python -m timeit -n 100 -s "import numpy as np; a = np.arange(0.0, 1000, (2*3.14159) / 1000, dtype=np.float32); b=np.sin(a)" python -m timeit -n 100 -s "import numpy as np; a = np.arange(0.0, 1000, (2*3.14159) / 1000, dtype=np.float32); np.sin(a)" python -m timeit -n 100 -s "import numpy as np; a = np.arange(0.0, 1000, (2*3.14159) / 1000, dtype=np.float32)" "np.sin(a)" The first should be similar for different dtypes because it is just array creation. The second extends that by storing the sin into another array. I am not sure how to interpret the third but in the Python prompt it would print it to screen. The last causes Python to handle two arguments, which is slow using float32 but not for float64 and float128, suggesting a compiler issue such as not using SSE or similar. Bruce -------------- next part -------------- An HTML attachment was scrubbed... URL: From Chris.Barker at noaa.gov Mon Aug 3 14:47:55 2009 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Mon, 03 Aug 2009 11:47:55 -0700 Subject: [Numpy-discussion] (newbie) How can I use NumPy to wrap my C++ class with 2-dimensional arrays? In-Reply-To: <4A72DAD0.7040601@zonnet.nl> References: <4A7140A8.2040305@zonnet.nl> <4A715F40.1060903@zonnet.nl> <4A7162C4.8030605@zonnet.nl> <4A7173A4.6070608@zonnet.nl> <4A72DAD0.7040601@zonnet.nl> Message-ID: <4A7730DB.3020907@noaa.gov> Raymond de Vries wrote: > Thanks for the explanation. After having looked at the documentation, I > decided to do my own plain Python c-api implementation. That is unlikely to be the best option these days -- it's simply too easy to make a type-checking and/or reference-counting error. If SWIG isn't your cup of tea, take a look at Cython or Ctypes -- lower level and more control, but still handle much of the bookkeeping for you. The Cython team is working on better C++ support, though I don't know where they are at with that. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From cekees at gmail.com Mon Aug 3 15:57:07 2009 From: cekees at gmail.com (Chris Kees) Date: Mon, 3 Aug 2009 14:57:07 -0500 Subject: [Numpy-discussion] PDE BoF at SciPy2009 Message-ID: <1963DA80-8CE5-4033-BCC8-EBEF05352AAB@gmail.com> Is there any interest in a BoF session on implementing numerical methods for partial differential equations using modules like numpy, cython, mpi4py, etc.? 
Regards, Chris From reedev at zonnet.nl Mon Aug 3 16:55:22 2009 From: reedev at zonnet.nl (Raymond de Vries) Date: Mon, 03 Aug 2009 22:55:22 +0200 Subject: [Numpy-discussion] (newbie) How can I use NumPy to wrap my C++ class with 2-dimensional arrays? In-Reply-To: <4A7730DB.3020907@noaa.gov> References: <4A7140A8.2040305@zonnet.nl> <4A715F40.1060903@zonnet.nl> <4A7162C4.8030605@zonnet.nl> <4A7173A4.6070608@zonnet.nl> <4A72DAD0.7040601@zonnet.nl> <4A7730DB.3020907@noaa.gov> Message-ID: <4A774EBA.6040208@zonnet.nl> Hi Chris, >> Thanks for the explanation. After having looked at the documentation, I >> decided to do my own plain Python c-api implementation. >> > > That is unlikely to be the best option these days -- it's simply too > easy to make a type-checking and/or reference-counting error. > > If SWIG isn't your cup of tea, take a look at Cython or Ctypes -- lower > level and more control, but still handle much of the bookkeeping for you. > > The Cython team is working on better C++ support, though I don't know > where they are at with that. > Oops, I guess I didn't express myself clearly enough: I have used the plain Python c-api (in my case a list of lists for my 2-dimensional arrays) for my typemaps. Sorry for being unclear. Actually, that's because NumPy is not my cup of tea... Especially because Matthieu suggested that I should convert my data into a contiguous array. So no matter what I use, either swig, cython, or.. I still have the NumPy issue. Do you, or someone else, see another possibility? regards Raymond > -Chris > > > From d_l_goldsmith at yahoo.com Mon Aug 3 17:26:17 2009 From: d_l_goldsmith at yahoo.com (David Goldsmith) Date: Mon, 3 Aug 2009 14:26:17 -0700 (PDT) Subject: [Numpy-discussion] PDE BoF at SciPy2009 Message-ID: <701667.11074.qm@web52111.mail.re2.yahoo.com> Please remind me: BoF = ? DG --- On Mon, 8/3/09, Chris Kees wrote: > From: Chris Kees > Subject: [Numpy-discussion] PDE BoF at SciPy2009 > To: "Discussion of Numerical Python" > Date: Monday, August 3, 2009, 12:57 PM > Is there any interest in a BoF > session on implementing numerical > methods for partial differential equations using modules > like numpy, > cython, mpi4py, etc.? > > Regards, > Chris > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From gael.varoquaux at normalesup.org Mon Aug 3 17:27:17 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Mon, 3 Aug 2009 23:27:17 +0200 Subject: [Numpy-discussion] PDE BoF at SciPy2009 In-Reply-To: <701667.11074.qm@web52111.mail.re2.yahoo.com> References: <701667.11074.qm@web52111.mail.re2.yahoo.com> Message-ID: <20090803212717.GH32408@phare.normalesup.org> On Mon, Aug 03, 2009 at 02:26:17PM -0700, David Goldsmith wrote: > Please remind me: BoF = ? http://conference.scipy.org/bofs G. From Chris.Barker at noaa.gov Mon Aug 3 19:17:10 2009 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Mon, 03 Aug 2009 16:17:10 -0700 Subject: [Numpy-discussion] (newbie) How can I use NumPy to wrap my C++ class with 2-dimensional arrays? 
In-Reply-To: <4A774EBA.6040208@zonnet.nl> References: <4A7140A8.2040305@zonnet.nl> <4A715F40.1060903@zonnet.nl> <4A7162C4.8030605@zonnet.nl> <4A7173A4.6070608@zonnet.nl> <4A72DAD0.7040601@zonnet.nl> <4A7730DB.3020907@noaa.gov> <4A774EBA.6040208@zonnet.nl> Message-ID: <4A776FF6.6080500@noaa.gov> Raymond de Vries wrote: > Oops, I guess I didn't express myself clearly enough: I have used the plain > Python c-api (in my case a list of lists for my 2-dimensional arrays) > for my typemaps. Sorry for being unclear. Actually, that's because NumPy is > not my cup of tea... Well, for almost any purpose, numpy arrays are a better fit for a 2-d array of numbers than a list of lists, so I'm not sure what kind of tea you like ;-) > Especially because Matthieu suggested that I should > convert my data into a contiguous array. a Python list can't share a pointer with your C++ data type anyway, so you'll have to copy regardless -- why not make a contiguous array? If you need a "ragged" array, then it's a different story, but a list of numpy arrays may be a better fit. > So no matter what I use, either swig, cython, or.. I still have the > NumPy issue. True, it's probably best to decide what sort of representation you want in Python, then decide how to build your wrappers. If you have a big 2-d array in C++, numpy is the obvious choice. Another choice is a wrapper around your C++ class -- don't convert to a python type at all. Depending on what you need to do, it may be OK to lose a lot of the functionality numpy gives you. This is how the SWIG C++ vector wrappers work, for instance. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From matthew.brett at gmail.com Mon Aug 3 19:35:35 2009 From: matthew.brett at gmail.com (Matthew Brett) Date: Mon, 3 Aug 2009 16:35:35 -0700 Subject: [Numpy-discussion] Is this a bug in numpy.distutils ? Message-ID: <1e2af89e0908031635i676135afmde57bd8993a4ca82@mail.gmail.com> Hi, We are using numpy.distutils, and have run into this odd behavior on Windows: I have XP, MinGW, latest numpy SVN, python.org python 2.6. All the commands below I am running from within the 'numpy' root directory (where 'numpy' is a subdirectory). If I run python setup.py build I get the following expected error: ''' No module named msvccompiler in numpy.distutils; trying from distutils error: Unable to find vcvarsall.bat ''' because I don't have MSVC. If I run: python setup.py build -c mingw32 - that works. But running python setup.py build_ext -c mingw32 generates the same error as above. Similarly: python setup.py build_ext -c completely_unknown ignores the attempt to set the 'completely_unknown' compiler, whereas python setup.py build -c completely_unknown raises a sensible error. I conclude that the numpy.distutils build_ext command is ignoring at least the compiler options. Is that correct? Thanks a lot, Matthew From david at ar.media.kyoto-u.ac.jp Mon Aug 3 22:42:17 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Tue, 04 Aug 2009 11:42:17 +0900 Subject: [Numpy-discussion] Funded work on Numpy: proposed improvements and request for feedback Message-ID: <4A77A009.9060104@ar.media.kyoto-u.ac.jp> Hi All, I (David Cournapeau) and the people at Berkeley (Jarrod Millman, Fernando Perez, Matthew Brett) have been in discussion so that I could do some funded work on NumPy/SciPy. 
Although they are obviously interested in improvements that help their own projects, they are willing to make sure the work will impact numpy/scipy as a whole. As such we would like to get some feedback about the proposal. There are several areas we discussed, but the main 'vision' is to make more of the C code in numpy reusable by 3rd parties, in particular purely computational (fft, linear algebra, etc...) code. A first draft of the proposal is pasted below. Comments, requests for details, and objections are welcome. Thank you for your attention, The Berkeley team, Gael Varoquaux and David Cournapeau ================================== Proposal for improvements to numpy ================================== NumPy is a solid foundation for efficient numerical computation with the python programming language. It consists of a set of extensions to add a powerful multi-dimensional array object. SciPy is built upon NumPy to add more high level functionalities such as numerical integration, linear algebra, statistical functions, etc\.\.\. Although the numpy codebase is mature, and can be reused both at the C and Python levels, there are some limitations in the numpy codebase which prevent some functionalities from being reused by third parties. This means that users of numpy either need to reimplement the functionalities, or to use workarounds. The main goal of this proposal is to improve numpy to circumvent those limitations in a general manner. Reusable C libraries ==================== A lot of NumPy and SciPy code is in a compiled language (mostly C and Fortran). For computational code, it is generally advisable to split it into purely computational code and a wrapping part, marshalling python objects/structures back and forth into basic C types. For example, when computing the exponential of the items in an array, most of NumPy's job is to extract the data from the array into one of the basic C types (int, double, etc...), call the C function exp, and marshall the data back into python objects. Making the marshalling and purely computational code separate has several advantages: 1. The code is easier to follow. 2. The purely computational code could be reused by third parties. For example, even for simple C math functions, there is a vast difference in platform/toolchain support. NumPy makes sure that functions to handle special float values (NaN, Inf, etc...) work on every supported platform, in a consistent manner. Making those functions available to third parties would enable developers to reuse these portable functions, and stay consistent even on platforms they don't care about. 3. Working on optimizing the computational code is easier. 4. It would enable easier replacement of the purely computational code at runtime. For example, one could imagine loading SSE-enabled code if the CPU supports SSE extensions. 5. It would also help with py3k porting, as only the marshalling code would need to change. To make purely computational code available to third parties, two things are needed: 1. the code itself needs to make the split explicit. 2. there needs to be support so that reusing those functionalities is as painless as possible, from a build point of view (Note: this is almost done in the upcoming numpy 1.4.0 as long as static linking is OK). A short C sketch at the end of this proposal illustrates the kind of third-party usage this should enable. Splitting the code ------------------ The amount of work is directly proportional to the number of functions to be made available. The most obvious candidates are: 1. C99 math functions: a lot of this has already been done. 
In particular, math constants and special values support is already
   implemented. Almost every real function in numpy has a portable npy_
   implementation in C.
2. C99-like complex support: this naturally extends the previous series.
   The main difficulty is supporting platforms without C99 complex
   support, and the corresponding C99 complex functions.
3. FFT code: there is no support to reuse FFT at the C level at the
   moment.
4. Random: there is no support either.
5. Linalg: idem.

Build support
-------------

Once the code itself is split, some support is needed so that the code
can be reused by third parties. The following issues need to be solved:

1. Compile the computational code into a shared or static library.
2. Once built, make the libraries available to third parties (distutils
   issues). Ideally, it should work for installed builds, in-place
   builds, etc...
3. Versioning, ABI/API compatibility issues.

Iterators
=========

When dealing with multi-dimensional arrays, the best abstraction for
handling indexing in a dimension-independent way is the iterator. NumPy
already has some iterators to walk over every item of an array, or over
every item but one axis. More general iterators are useful for more
complicated cases, when one needs to walk over only a subset of the items
of the array. For example, for image processing, it is often necessary to
walk over a neighborhood of an array. Boundary conditions can be handled
automatically, so that padding is transparent to the user. More elaborate
iterators, e.g. with a mask (for morphological image processing), can be
considered as well.

Several packages in scipy implement those iterators (ndimage), or handle
boundary conditions manually in the algorithmic code (scipy.signal).
Implementing iterators in numpy would enable better code reuse.

Possible iterators
------------------

A neighborhood iterator is already available in numpy. It can handle
zero, one, constant, and mirror padding. Potential improvements can be
considered from a speed POV, in particular by splitting areas which need
boundary handling from the ones which do not.

A masked neighborhood iterator is not available. ITK is one toolkit which
implements such an iterator.

C code coverage and static numpy linking
========================================

The NumPy community has focused a lot on improving the test suite. We
went from a few hundred unit tests in 2006 to more than 2000 unit tests
for numpy 1.4. Although code coverage at the python level is relatively
easy to obtain using some nose plugins, C code coverage is not possible
ATM.

The traditional code coverage tool for C code is gprof, the GNU profiler.
Unfortunately, gprof cannot profile code which is dynamically linked, as
is the case for python extensions. One solution is thus to statically
link numpy to the python interpreter. This poses challenges at both the
build and code levels. Some preliminary work showed that the approach
works, but something which could be integrated upstream, and make numpy
easily linkable to the python interpreter, would be better.

Also, some people have expressed interest in distributing a python
interpreter with numpy statically linked (e.g. for easy distribution).

From david at ar.media.kyoto-u.ac.jp Mon Aug 3 23:11:29 2009
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Tue, 04 Aug 2009 12:11:29 +0900
Subject: [Numpy-discussion] Is this a bug in numpy.distutils ?
In-Reply-To: <1e2af89e0908031635i676135afmde57bd8993a4ca82@mail.gmail.com>
References: <1e2af89e0908031635i676135afmde57bd8993a4ca82@mail.gmail.com>
Message-ID: <4A77A6E1.1050403@ar.media.kyoto-u.ac.jp>

Matthew Brett wrote:
> Hi,
>
> We are using numpy.distutils, and have run into this odd behavior in windows:
>
> I have XP, Mingw, latest numpy SVN, python.org python 2.6. All the
> commands below I am running from within the 'numpy' root directory
> (where 'numpy' is a subdirectory).
>
> If I run
>
> python setup.py build
>
> I get the following expected error:
>
> '''
> No module named msvccompiler in numpy.distutils; trying from distutils
> error: Unable to find vcvarsall.bat
> '''
>
> because I don't have MSVC.
>
> If I run:
>
> python setup.py build -c mingw32
>
> - that works. But running
>
> python setup.py build_ext -c mingw32
>
> generates the same error as above. Similarly:
>
> python setup.py build_ext -c completely_unknown
>
> ignores the attempt to set the 'completely_unknown' compiler, whereas
>
> python setup.py build -c completely_unknown
>
> raises a sensible error. I conclude that the numpy.distutils
> build_ext command is ignoring at least the compiler options.
>
> Is that correct?

Short answer:

I am afraid it cannot work as you want. Basically, when you pass an
option to build_ext, it does not affect other distutils commands, which
are run before build_ext and need the compiler (config in this case, I
think). So you need to pass the -c option to every command affected by
the compiler (build_ext, build_clib and config IIRC).

Long answer:

The reason is linked to the single most annoying "feature" of distutils:
distutils fundamentally works by running some commands, one after the
other. Commands have subcommands. For Numpy, as far as compiled code is
concerned, it goes like this: config - build - build_clib - build_ext
(the build command calls all the build_* subcommands and config).

Now, each command's option set is independent of the others (build_ext
vs. config in this case), but if you pass an option to a command it
affects all its subcommands, I believe.

cheers,

David

From charlesr.harris at gmail.com Tue Aug 4 00:23:47 2009
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Mon, 3 Aug 2009 22:23:47 -0600
Subject: [Numpy-discussion] Funded work on Numpy: proposed improvements and request for feedback
In-Reply-To: <4A77A009.9060104@ar.media.kyoto-u.ac.jp>
References: <4A77A009.9060104@ar.media.kyoto-u.ac.jp>
Message-ID:

On Mon, Aug 3, 2009 at 8:42 PM, David Cournapeau <
david at ar.media.kyoto-u.ac.jp> wrote:

> Hi All,
>
> I (David Cournapeau) and the people at Berkeley (Jarrod Millman,
> Fernando Perez, Matthew Brett) have been in discussion so that I could
> do some funded work on NumPy/SciPy. Although they are obviously
> interested in improvements that help their own projects, they are
> willing to make sure the work will impact numpy/scipy as a whole. As
> such, we would like to get some feedback about the proposal.
>
> There are several areas we discussed, but the main 'vision' is to make
> more of the C code in numpy reusable to 3rd parties, in particular
> purely computational (fft, linear algebra, etc...) code. A first draft
> of the proposal is pasted below.
>
> Comments, requests for details, and objections are welcome,
>
> Thank you for your attention,
>
> The Berkeley team, Gael Varoquaux and David Cournapeau
>
> ==================================
> Proposal for improvements to numpy
> ==================================
>
> NumPy is a solid foundation for efficient numerical computation with the
> python programming language. It consists of a set of extensions to add a
> powerful multi-dimensional array object. SciPy is built upon NumPy to add
> more high level functionalities such as numerical integration, linear
> algebra, statistical functions, etc... Although the numpy codebase is
> mature, and can be reused both at the C and Python levels, there are some
> limitations in the numpy codebase which prevent some functionalities from
> being reused by third parties. This means that users of numpy either need
> to reimplement the functionalities, or use workarounds. The main goal of
> this proposal is to improve numpy to circumvent those limitations in a
> general manner.
>
> Reusable C libraries
> ====================
>
> A lot of NumPy and SciPy code is in a compiled language (mostly C and
> Fortran). For computational code, it is generally advisable to split it
> into a purely computational part and a wrapping part, marshalling python
> objects/structures back and forth into basic C types. For example, when
> computing the exponential of the items in an array, most of NumPy's job
> is to extract the data from the array into one of the basic C types (int,
> double, etc...), call the C function exp, and marshal the data back into
> python objects. Keeping the marshalling code and the purely computational
> code separate has several advantages:
>
> 1. The code is easier to follow.
> 2. The purely computational code could be reused by third parties. For
> example, even for simple C math functions, there is a vast difference in
> platform/toolchain support. NumPy makes sure that functions to handle
> special float values (NaN, Inf, etc...) work on every supported platform,
> in a consistent manner. Making those functions available to third parties
> would enable developers to reuse these portable functions, and stay
> consistent even on platforms they don't care about.
> 3. Working on optimizing the computational code is easier.
> 4. It would enable easier replacement of the purely computational code at
> runtime. For example, one could imagine loading SSE-enabled code if the
> CPU supports SSE extensions.
> 5. It would also help with the py3k port, as only the marshalling code
> would need to change.
>
> To make purely computational code available to third parties, two things
> are needed:
>
> 1. the code itself needs to make the split explicit.
> 2. there needs to be support so that reusing those functionalities is as
> painless as possible, from a build point of view (Note: this is almost
> done in the upcoming numpy 1.4.0, as long as static linking is OK).
>

Ah, it itches. This is certainly a worthy goal, but are there third
parties who have expressed an interest in this? I mean, besides trying to
avoid duplicate bits of code in Scipy.

>
> Splitting the code
> ------------------
>
> The amount of work is directly proportional to the number of functions
> to be made available. The most obvious candidates are:
>
> 1. C99 math functions: a lot of this has already been done. In
> particular, math constants and special values support is already
> implemented.
> Almost every real function in numpy has a portable npy_ implementation
> in C.
> 2. C99-like complex support: this naturally extends the previous series.
> The main difficulty is supporting platforms without C99 complex support,
> and the corresponding C99 complex functions.
> 3. FFT code: there is no support to reuse FFT at the C level at the
> moment.
> 4. Random: there is no support either.
> 5. Linalg: idem.
>

This is good. I think it should go along with code reorganization. The
files are now broken up but I am not convinced that everything is yet
where it should be.

The complex support could be a major effort in its own right if we need
to rewrite all the current functions. That said, it would be nice if the
complex support was separated out like the current real support. Tests to
go along with it would be helpful. This also ties in with having build
support for many platforms.

>
> Build support
> -------------
>
> Once the code itself is split, some support is needed so that the code
> can be reused by third parties. The following issues need to be solved:
>
> 1. Compile the computational code into a shared or static library.
> 2. Once built, make the libraries available to third parties (distutils
> issues). Ideally, it should work for installed builds, in-place builds,
> etc...
> 3. Versioning, ABI/API compatibility issues.
>

Trying to break out the build support itself might be useful. It would be
good to get some feedback here from other projects that might be
interested. But this is a wheel that probably gets reinvented on a
regular basis.

>
> Iterators
> =========
>
> When dealing with multi-dimensional arrays, the best abstraction for
> handling indexing in a dimension-independent way is the iterator. NumPy
> already has some iterators to walk over every item of an array, or over
> every item but one axis. More general iterators are useful for more
> complicated cases, when one needs to walk over only a subset of the items
> of the array. For example, for image processing, it is often necessary to
> walk over a neighborhood of an array. Boundary conditions can be handled
> automatically, so that padding is transparent to the user. More elaborate
> iterators, e.g. with a mask (for morphological image processing), can be
> considered as well.
>
> Several packages in scipy implement those iterators (ndimage), or handle
> boundary conditions manually in the algorithmic code (scipy.signal).
> Implementing iterators in numpy would enable better code reuse.
>
> Possible iterators
> ------------------
>
> A neighborhood iterator is already available in numpy. It can handle
> zero, one, constant, and mirror padding. Potential improvements can be
> considered from a speed POV, in particular by splitting areas which need
> boundary handling from the ones which do not.
>
> A masked neighborhood iterator is not available. ITK is one toolkit
> which implements such an iterator.
>

I think this needs some thought. This would essentially be a C library of
iterator code. C++ is probably an easier language for such things as it
handles the classes and inlining automatically. Which is to say if I had
to deal with a lot of iterators I might choose a different language for
implementation.

>
> C code coverage and static numpy linking
> ========================================
>
> The NumPy community has focused a lot on improving the test suite. We
> went from a few hundred unit tests in 2006 to more than 2000 unit tests
> for numpy 1.4.
> Although code coverage at the python level is relatively easy to obtain
> using some nose plugins, C code coverage is not possible ATM.
>
> The traditional code coverage tool for C code is gprof, the GNU
> profiler. Unfortunately, gprof cannot profile code which is dynamically
> linked, as is the case for python extensions. One solution is thus to
> statically link numpy to the python interpreter. This poses challenges
> at both the build and code levels. Some preliminary work showed that the
> approach works, but something which could be integrated upstream, and
> make numpy easily linkable to the python interpreter, would be better.
>
> Also, some people have expressed interest in distributing a python
> interpreter with numpy statically linked (e.g. for easy distribution).
>

I don't have an opinion here.

As a side issue, it would be nice to have some infrastructure for
documenting the C code. That way, after I have worked my way through one
of the numpy functions, I could document it so that I wouldn't have to
repeat the whole process at some later date.

As to choosing a project, you should pick one that really interests you.
How would you rank your own interest in these various proposals?

Chuck

From dave.hirschfeld at gmail.com Tue Aug 4 03:56:31 2009
From: dave.hirschfeld at gmail.com (Dave)
Date: Tue, 4 Aug 2009 07:56:31 +0000 (UTC)
Subject: [Numpy-discussion] Is this a bug in numpy.distutils ?
References: <1e2af89e0908031635i676135afmde57bd8993a4ca82@mail.gmail.com> <4A77A6E1.1050403@ar.media.kyoto-u.ac.jp>
Message-ID:

David Cournapeau <david at ar.media.kyoto-u.ac.jp> writes:

>
> Matthew Brett wrote:
> > Hi,
> >
> > We are using numpy.distutils, and have run into this odd behavior in windows:
> >
>
> Short answer:
>
> I am afraid it cannot work as you want. Basically, when you pass an
> option to build_ext, it does not affect other distutils commands, which
> are run before build_ext and need the compiler (config in this case, I
> think). So you need to pass the -c option to every command affected by
> the compiler (build_ext, build_clib and config IIRC).
>
> cheers,
>
> David
>

I'm having the same problems! Running Windows XP, Python 2.5.4
(r254:67916, Dec 23 2008, 15:10:54) [MSC v.1310 32 bit (Intel)].

In my distutils.cfg I've got:

[build]
compiler=mingw32

[config]
compiler = mingw32

and previously a python setup.py bdist_wininst would create an .exe
installer; now I get the following error message:

error: Python was built with Visual Studio 2003; extensions must be
built with a compiler than can generate compatible binaries. Visual
Studio 2003 was not found on this system. If you have Cygwin installed,
you can try compiling with MingW32, by passing "-c mingw32" to setup.py.

python setup.py build build_ext --compiler=mingw32 appeared to work
(barring a warning: numpy\core\setup_common.py:81: MismatchCAPIWarning)
but then how do I create a .exe installer afterwards?

python setup.py bdist_wininst fails with the same error message as
before, and python setup.py bdist_wininst --compiler=mingw32 fails with
the message:

error: option --compiler not recognized

Is it still possible to create a .exe installer on Windows and, if so,
what are the commands we need to make it work?

Thanks in advance for any help/workarounds - it would be much
appreciated!
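Based on David's earlier reply I would guess that the per-command
equivalent in distutils.cfg is something like the following, although I
haven't verified that it gets around the error:

[build_ext]
compiler=mingw32

[build_clib]
compiler=mingw32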
Regards,
Dave

From david at ar.media.kyoto-u.ac.jp Tue Aug 4 03:37:06 2009
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Tue, 04 Aug 2009 16:37:06 +0900
Subject: [Numpy-discussion] Funded work on Numpy: proposed improvements and request for feedback
In-Reply-To: 
References: <4A77A009.9060104@ar.media.kyoto-u.ac.jp>
Message-ID: <4A77E522.50300@ar.media.kyoto-u.ac.jp>

Hi Chuck,

Charles R Harris wrote:
>
> To make purely computational code available to third parties, two
> things are needed:
>
> 1. the code itself needs to make the split explicit.
> 2. there needs to be support so that reusing those functionalities is
> as painless as possible, from a build point of view (Note: this is
> almost done in the upcoming numpy 1.4.0, as long as static linking is
> OK).
>
> Ah, it itches. This is certainly a worthy goal, but are there third
> parties who have expressed an interest in this? I mean, besides trying
> to avoid duplicate bits of code in Scipy.

Actually, I think that's what interests the people around the Nipy
project the most. In particular, they need to reuse lapack and random
quite a bit, and for now they just duplicate the code, with all the
problems that brings (duplication, lack of reliability as far as cross
platform support is concerned, etc...).

> Splitting the code
> ------------------
>
> The amount of work is directly proportional to the number of functions
> to be made available. The most obvious candidates are:
>
> 1. C99 math functions: a lot of this has already been done. In
> particular, math constants and special values support is already
> implemented. Almost every real function in numpy has a portable npy_
> implementation in C.
> 2. C99-like complex support: this naturally extends the previous
> series. The main difficulty is supporting platforms without C99 complex
> support, and the corresponding C99 complex functions.
> 3. FFT code: there is no support to reuse FFT at the C level at the
> moment.
> 4. Random: there is no support either.
> 5. Linalg: idem.
>
> This is good. I think it should go along with code reorganization. The
> files are now broken up but I am not convinced that everything is yet
> where it should be.

Oh, definitely agreed. Another thing I would like in that spirit is to
split the numpy headers like in Python itself: ndarrayobject.h would
still pull in everything (for backward compatibility reasons), but
people could include only a few headers if they want to.

The rationale for me is when I work on numpy itself: it is kind of
stupid that every time I change the iterator structures, the whole numpy
core has to be rebuilt. That's quite wasteful and frustrating. Another
rationale is to be able to compile and test a very minimal core numpy
(the array object + a few things). I don't see the py3k port being
possible in the foreseeable future without this.

> The complex support could be a major effort in its own right if we
> need to rewrite all the current functions. That said, it would be nice
> if the complex support was separated out like the current real
> support. Tests to go along with it would be helpful. This also ties in
> with having build support for many platforms.

Pauli has worked on this a little, and I have actually worked on it
quite a bit myself, because I need minimal complex support for windows
64 bits (to fake libgfortran).
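As a rough illustration of the kind of special-value behaviour those
tests check (an untested python-level sketch -- the authoritative values
are the C99 Annex G tables):

import numpy as np

# The branch cut of sqrt runs along the negative real axis, so the sign
# of zero on the imaginary part should select the side of the cut:
print np.sqrt(np.complex128(complex(-1.0, 0.0)))   # expect 1j
print np.sqrt(np.complex128(complex(-1.0, -0.0)))  # expect -1j on a
                                                   # conforming platform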
I have already implemented around 10 core complex functions (cabs,
cangle, creal, cimag, cexp, cpow, csqrt, clog, ccos, csin, ctan), in
such a way that native C99 complex is used on platforms which support
it, and there is a quite thorough test suite which tests every special
value condition (negative zero, inf, nan) as specified in the C99
standard. It still lacks actual-value tests (!), FPU exception and
branch cut tests, and thorough tests on major platforms. And quite a few
other functions would be useful (hyperbolic trig).

>
> Build support
> -------------
>
> Once the code itself is split, some support is needed so that the code
> can be reused by third parties. The following issues need to be solved:
>
> 1. Compile the computational code into a shared or static library.
> 2. Once built, make the libraries available to third parties (distutils
> issues). Ideally, it should work for installed builds, in-place builds,
> etc...
> 3. Versioning, ABI/API compatibility issues.
>
> Trying to break out the build support itself might be useful.

What do you mean by break out, exactly? I have documented the already
implemented support:

http://docs.scipy.org/doc/numpy/reference/distutils.html#building-installable-c-libraries

> I think this needs some thought. This would essentially be a C library
> of iterator code. C++ is probably an easier language for such things as
> it handles the classes and inlining automatically. Which is to say if I
> had to deal with a lot of iterators I might choose a different language
> for implementation.

C++ is not an option for numpy (and if I had to choose another language
compared to C, I would rather take D, or a language which outputs C in
the spirit of vala :) ). I think handling iterators in C is OK: sure, it
is a bit messy, because of the lack of namespaces, templates and
operator overloading, but the increased portability and implementation
simplicity are worth it IMHO. When looking at ITK, I don't find it much
more readable/easy to use than our own. I also need to think more about
this after I finish reading the recent presentation from A. Alexandrescu
("Iterators Must Go"). Maybe there are some bits which could be applied
to the numpy iterator design.

> As to choosing a project, you should pick one that really interests
> you. How would you rank your own interest in these various proposals?

Well, it's not for me to decide what I work on exactly here :) I must
say that almost all of the above are things which are needed for NumPy,
things which I have thought about, and would enjoy working on. Maybe
that's masochism, but I spent so much time understanding the C code in
numpy that I actually enjoy working on it now :)

cheers,

David

From david at ar.media.kyoto-u.ac.jp Tue Aug 4 03:54:40 2009
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Tue, 04 Aug 2009 16:54:40 +0900
Subject: [Numpy-discussion] Is this a bug in numpy.distutils ?
In-Reply-To: 
References: <1e2af89e0908031635i676135afmde57bd8993a4ca82@mail.gmail.com> <4A77A6E1.1050403@ar.media.kyoto-u.ac.jp>
Message-ID: <4A77E940.7070807@ar.media.kyoto-u.ac.jp>

Dave wrote:
> David Cournapeau <david at ar.media.kyoto-u.ac.jp> writes:
>
>> Matthew Brett wrote:
>>
>>> Hi,
>>>
>>> We are using numpy.distutils, and have run into this odd behavior in windows:
>>>
>> Short answer:
>>
>> I am afraid it cannot work as you want.
>> Basically, when you pass an option to build_ext, it does not affect
>> other distutils commands, which are run before build_ext and need the
>> compiler (config in this case, I think). So you need to pass the -c
>> option to every command affected by the compiler (build_ext,
>> build_clib and config IIRC).
>>
>> cheers,
>>
>> David
>>
>
> I'm having the same problems! Running Windows XP, Python 2.5.4
> (r254:67916, Dec 23 2008, 15:10:54) [MSC v.1310 32 bit (Intel)].
>
> In my distutils.cfg I've got:
>
> [build]
> compiler=mingw32
>
> [config]
> compiler = mingw32

Yes, config files are an alternative I did not mention. I never use
them because I prefer controlling the build on a per-package basis, and
the interaction between the command line and config files is not always
clear.

> python setup.py build build_ext --compiler=mingw32 appeared to work
> (barring a warning: numpy\core\setup_common.py:81: MismatchCAPIWarning)

The warning is harmless: it is just a reminder that before releasing
numpy 1.4.0, we will need to raise the C API version (to avoid problems
we had in the past with mismatched numpy versions). There is no point
updating it during dev time, I think.

> but then how do I create a .exe installer afterwards? python setup.py
> bdist_wininst fails with the same error message as before and python
> setup.py bdist_wininst --compiler=mingw32 fails with the message:
> error: option --compiler not recognized

You need to do as follows, if you want to control from the command line:

python setup.py build -c mingw32 bdist_wininst

That's how I build the official binaries.

cheers,

David

From dave.hirschfeld at gmail.com Tue Aug 4 04:34:44 2009
From: dave.hirschfeld at gmail.com (Dave)
Date: Tue, 4 Aug 2009 08:34:44 +0000 (UTC)
Subject: [Numpy-discussion] Is this a bug in numpy.distutils ?
References: <1e2af89e0908031635i676135afmde57bd8993a4ca82@mail.gmail.com> <4A77A6E1.1050403@ar.media.kyoto-u.ac.jp> <4A77E940.7070807@ar.media.kyoto-u.ac.jp>
Message-ID:

David Cournapeau <david at ar.media.kyoto-u.ac.jp> writes:
>
> You need to do as follows, if you want to control from the command line:
>
> python setup.py build -c mingw32 bdist_wininst
>
> That's how I build the official binaries.
>
> cheers,
>
> David
>

Running the command:

C:\dev\src\numpy>python setup.py build -c mingw32 bdist_wininst > build.txt

still gives me the error:

error: Python was built with Visual Studio 2003; extensions must be
built with a compiler than can generate compatible binaries. Visual
Studio 2003 was not found on this system. If you have Cygwin installed,
you can try compiling with MingW32, by passing "-c mingw32" to setup.py.

I tried without a distutils.cfg file and deleted the build directory
both times.

In case it helps, the build log should be available from
http://pastebin.com/m607992ba

Am I doing something wrong?

-Dave

From david at ar.media.kyoto-u.ac.jp Tue Aug 4 04:28:46 2009
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Tue, 04 Aug 2009 17:28:46 +0900
Subject: [Numpy-discussion] Is this a bug in numpy.distutils ?
In-Reply-To: 
References: <1e2af89e0908031635i676135afmde57bd8993a4ca82@mail.gmail.com> <4A77A6E1.1050403@ar.media.kyoto-u.ac.jp> <4A77E940.7070807@ar.media.kyoto-u.ac.jp>
Message-ID: <4A77F13E.2020402@ar.media.kyoto-u.ac.jp>

Dave wrote:
> David Cournapeau <david at ar.media.kyoto-u.ac.jp> writes:
>
>> You need to do as follows, if you want to control from the command line:
>>
>> python setup.py build -c mingw32 bdist_wininst
>>
>> That's how I build the official binaries.
>> cheers,
>>
>> David
>>
>
> Running the command:
>
> C:\dev\src\numpy>python setup.py build -c mingw32 bdist_wininst > build.txt
>
> still gives me the error:
>
> error: Python was built with Visual Studio 2003; extensions must be
> built with a compiler than can generate compatible binaries. Visual
> Studio 2003 was not found on this system. If you have Cygwin installed,
> you can try compiling with MingW32, by passing "-c mingw32" to setup.py.
>
> I tried without a distutils.cfg file and deleted the build directory
> both times.
>
> In case it helps, the build log should be available from
> http://pastebin.com/m607992ba
>
> Am I doing something wrong?

No, I think you and Matthew actually found a bug in recent changes I
have done in distutils. I will fix it right away,

cheers,

David

From cournape at gmail.com Tue Aug 4 05:54:19 2009
From: cournape at gmail.com (David Cournapeau)
Date: Tue, 4 Aug 2009 18:54:19 +0900
Subject: [Numpy-discussion] Is this a bug in numpy.distutils ?
In-Reply-To: <4A77F13E.2020402@ar.media.kyoto-u.ac.jp>
References: <1e2af89e0908031635i676135afmde57bd8993a4ca82@mail.gmail.com> <4A77A6E1.1050403@ar.media.kyoto-u.ac.jp> <4A77E940.7070807@ar.media.kyoto-u.ac.jp> <4A77F13E.2020402@ar.media.kyoto-u.ac.jp>
Message-ID: <5b8d13220908040254n538c1e4flfe7ee2dd9aba96ef@mail.gmail.com>

On Tue, Aug 4, 2009 at 5:28 PM, David Cournapeau
<david at ar.media.kyoto-u.ac.jp> wrote:
>
> No, I think you and Matthew actually found a bug in recent changes I
> have done in distutils. I will fix it right away,

Ok, not right away, but could you check that r7280 fixed it for you?

cheers,

David

From dave.hirschfeld at gmail.com Tue Aug 4 06:03:27 2009
From: dave.hirschfeld at gmail.com (Dave)
Date: Tue, 4 Aug 2009 10:03:27 +0000 (UTC)
Subject: [Numpy-discussion] Is this a bug in numpy.distutils ?
References: <1e2af89e0908031635i676135afmde57bd8993a4ca82@mail.gmail.com> <4A77A6E1.1050403@ar.media.kyoto-u.ac.jp> <4A77E940.7070807@ar.media.kyoto-u.ac.jp> <4A77F13E.2020402@ar.media.kyoto-u.ac.jp> <5b8d13220908040254n538c1e4flfe7ee2dd9aba96ef@mail.gmail.com>
Message-ID:

David Cournapeau <cournape at gmail.com> writes:
>
> On Tue, Aug 4, 2009 at 5:28 PM, David
> Cournapeau <david at ar.media.kyoto-u.ac.jp> wrote:
>
> > No, I think you and Matthew actually found a bug in recent changes I
> > have done in distutils. I will fix it right away,
>
> Ok, not right away, but could you check that r7280 fixed it for you?
>
> cheers,
>
> David
>

Works for me.

adding 'SCRIPTS\f2py.py'
creating dist
removing 'build\bdist.win32\wininst' (and everything under it)

Thanks for the quick fix!

-Dave

From dave.hirschfeld at gmail.com Tue Aug 4 06:23:55 2009
From: dave.hirschfeld at gmail.com (Dave)
Date: Tue, 4 Aug 2009 10:23:55 +0000 (UTC)
Subject: [Numpy-discussion] Is this a bug in numpy.distutils ?
References: <1e2af89e0908031635i676135afmde57bd8993a4ca82@mail.gmail.com> <4A77A6E1.1050403@ar.media.kyoto-u.ac.jp> <4A77E940.7070807@ar.media.kyoto-u.ac.jp> <4A77F13E.2020402@ar.media.kyoto-u.ac.jp> <5b8d13220908040254n538c1e4flfe7ee2dd9aba96ef@mail.gmail.com>
Message-ID:

Dave <dave.hirschfeld at gmail.com> writes:
>
> Works for me.
>
> -Dave
>

Except now when trying to compile the latest scipy I get the following error:

C:\dev\src\scipy>svn up
Fetching external item into 'doc\sphinxext'
External at revision 7280.
At revision 5890.
C:\dev\src\scipy>python setup.py bdist_wininst
Traceback (most recent call last):
  File "setup.py", line 160, in <module>
    setup_package()
  File "setup.py", line 152, in setup_package
    configuration=configuration )
  File "C:\dev\bin\Python25\Lib\site-packages\numpy\distutils\core.py", line 152, in setup
    config = configuration()
  File "setup.py", line 118, in configuration
    config.add_subpackage('scipy')
  File "C:\dev\bin\Python25\Lib\site-packages\numpy\distutils\misc_util.py", line 890, in add_subpackage
    caller_level = 2)
  File "C:\dev\bin\Python25\Lib\site-packages\numpy\distutils\misc_util.py", line 859, in get_subpackage
    caller_level = caller_level + 1)
  File "C:\dev\bin\Python25\Lib\site-packages\numpy\distutils\misc_util.py", line 796, in _get_configuration_from_setup_py
    config = setup_module.configuration(*args)
  File "scipy\setup.py", line 20, in configuration
    config.add_subpackage('special')
  File "C:\dev\bin\Python25\Lib\site-packages\numpy\distutils\misc_util.py", line 890, in add_subpackage
    caller_level = 2)
  File "C:\dev\bin\Python25\Lib\site-packages\numpy\distutils\misc_util.py", line 859, in get_subpackage
    caller_level = caller_level + 1)
  File "C:\dev\bin\Python25\Lib\site-packages\numpy\distutils\misc_util.py", line 796, in _get_configuration_from_setup_py
    config = setup_module.configuration(*args)
  File "scipy\special\setup.py", line 45, in configuration
    extra_info=get_info("npymath")
  File "C:\dev\bin\Python25\Lib\site-packages\numpy\distutils\misc_util.py", line 1954, in get_info
    pkg_info = get_pkg_info(pkgname, dirs)
  File "C:\dev\bin\Python25\Lib\site-packages\numpy\distutils\misc_util.py", line 1921, in get_pkg_info
    return read_config(pkgname, dirs)
  File "C:\dev\bin\Python25\Lib\site-packages\numpy\distutils\npy_pkg_config.py", line 235, in read_config
    v = _read_config_imp(pkg_to_filename(pkgname), dirs)
  File "C:\dev\bin\Python25\Lib\site-packages\numpy\distutils\npy_pkg_config.py", line 221, in _read_config_imp
    meta, vars, sections, reqs = _read_config(filenames)
  File "C:\dev\bin\Python25\Lib\site-packages\numpy\distutils\npy_pkg_config.py", line 205, in _read_config
    meta, vars, sections, reqs = parse_config(f, dirs)
  File "C:\dev\bin\Python25\Lib\site-packages\numpy\distutils\npy_pkg_config.py", line 177, in parse_config
    raise PkgNotFound("Could not find file(s) %s" % str(filenames))
numpy.distutils.npy_pkg_config.PkgNotFound: Could not find file(s)
['C:\\dev\\bin\\Python25\\lib\\site-packages\\numpy\\core\\lib\\npy-pkg-config\\npymath.ini']

In the numpy\core\lib directory there is no npy-pkg-config
sub-directory, only a single file - libnpymath.a

Is this expected - has scipy not yet caught up with the numpy changes,
or is this a numpy issue?

-Dave

From david at ar.media.kyoto-u.ac.jp Tue Aug 4 06:07:45 2009
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Tue, 04 Aug 2009 19:07:45 +0900
Subject: [Numpy-discussion] Is this a bug in numpy.distutils ?
In-Reply-To: 
References: <1e2af89e0908031635i676135afmde57bd8993a4ca82@mail.gmail.com> <4A77A6E1.1050403@ar.media.kyoto-u.ac.jp> <4A77E940.7070807@ar.media.kyoto-u.ac.jp> <4A77F13E.2020402@ar.media.kyoto-u.ac.jp> <5b8d13220908040254n538c1e4flfe7ee2dd9aba96ef@mail.gmail.com>
Message-ID: <4A780871.5070408@ar.media.kyoto-u.ac.jp>

Dave wrote:
> Dave <dave.hirschfeld at gmail.com> writes:
>
>> Works for me.
>>
>> -Dave
>>
>
> Except now when trying to compile the latest scipy I get the following error:
>

Was numpy installed from a bdist_wininst installer, or did you use the
install method directly?
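If you want to check the numpy install quickly, something like this
should either print the npymath info or raise the same PkgNotFound
error (untested, but get_info is the function scipy ends up calling in
your traceback):

python -c "from numpy.distutils.misc_util import get_info; print get_info('npymath')"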
David

From dave.hirschfeld at gmail.com Tue Aug 4 07:20:30 2009
From: dave.hirschfeld at gmail.com (Dave)
Date: Tue, 4 Aug 2009 11:20:30 +0000 (UTC)
Subject: [Numpy-discussion] Is this a bug in numpy.distutils ?
References: <1e2af89e0908031635i676135afmde57bd8993a4ca82@mail.gmail.com> <4A77A6E1.1050403@ar.media.kyoto-u.ac.jp> <4A77E940.7070807@ar.media.kyoto-u.ac.jp> <4A77F13E.2020402@ar.media.kyoto-u.ac.jp> <5b8d13220908040254n538c1e4flfe7ee2dd9aba96ef@mail.gmail.com> <4A780871.5070408@ar.media.kyoto-u.ac.jp>
Message-ID:

David Cournapeau <david at ar.media.kyoto-u.ac.jp> writes:

>
> Dave wrote:
> > Dave <dave.hirschfeld at gmail.com> writes:
> >
> >> Works for me.
> >>
> >> -Dave
> >>
> >
> > Except now when trying to compile the latest scipy I get the following error:
> >
>
> Was numpy installed from a bdist_wininst installer, or did you use the
> install method directly?
>
> David
>

Numpy was installed with the bdist_wininst installer.

In case it's relevant, the installer seemed to create 2 egg-info files:
numpy-1.4.0.dev7277-py2.5.egg-info
numpy-1.4.0.dev7280-py2.5.egg-info

Deleting the numpy directory and the egg-info files and re-installing
from the bdist_wininst installer gave the same result (with the above 2
egg-info files).

Installing numpy with python setup.py install seemed to work (at least
the npymath.ini file was now in the numpy\core\lib\npy-pkg-config
folder).

Compiling scipy got much further now, but still failed with the below
error message:

C:\dev\src\scipy>python setup.py bdist_wininst > build.txt
Warning: No configuration returned, assuming unavailable.
C:\dev\bin\Python25\lib\site-packages\numpy\distutils\command\config.py:394: DeprecationWarning:
+++++++++++++++++++++++++++++++++++++++++++++++++
Usage of get_output is deprecated: please do not
use it anymore, and avoid configuration checks
involving running executable on the target machine.
+++++++++++++++++++++++++++++++++++++++++++++++++
  DeprecationWarning)
C:\dev\bin\Python25\lib\site-packages\numpy\distutils\system_info.py:452: UserWarning:
    UMFPACK sparse solver (http://www.cise.ufl.edu/research/sparse/umfpack/)
    not found. Directories to search for the libraries can be specified in the
    numpy/distutils/site.cfg file (section [umfpack]) or by setting
    the UMFPACK environment variable.
  warnings.warn(self.notfounderror.__doc__)
error: Command "C:\dev\bin\mingw\bin\g77.exe -g -Wall -mno-cygwin -g -Wall
-mno-cygwin -shared build\temp.win32-2.5\Release\scipy\special\_cephesmodule.o
build\temp.win32-2.5\Release\scipy\special\amos_wrappers.o
build\temp.win32-2.5\Release\scipy\special\specfun_wrappers.o
build\temp.win32-2.5\Release\scipy\special\toms_wrappers.o
build\temp.win32-2.5\Release\scipy\special\cdf_wrappers.o
build\temp.win32-2.5\Release\scipy\special\ufunc_extras.o
-LC:\dein\Python25\Lib\site-packages -LC:\dev\bin\mingw\lib
-LC:\dev\bin\mingw\lib\gcc\mingw32\3.4.5 -LC:\dev\bin\Python25\libs
-LC:\dev\bin\Python25\PCBuild -Lbuild\temp.win32-2.5 -lsc_amos -lsc_toms
-lsc_c_misc -lsc_cephes -lsc_mach -lsc_cdf -lsc_specfun -lnpymath
-lpython25 -lg2c -o build\lib.win32-2.5\scipy\special\_cephes.pyd"
failed with exit status 1

The output of the build is available from http://pastebin.com/d3efe5650

Note the strange character on line 4600.
In my terminal window this is displayed as:

compile options: '-D_USE_MATH_DEFINES -D_USE_MATH_DEFINES
-IC:\dein\Python25\Lib\site-packages
-IC:\dev\bin\Python25\lib\site-packages\numpy\core\include
-IC:\dev\bin\Python25\include -IC:\dev\bin\Python25\PC -c'

HTH,
Dave

From markbak at gmail.com Tue Aug 4 07:23:08 2009
From: markbak at gmail.com (Mark Bakker)
Date: Tue, 4 Aug 2009 13:23:08 +0200
Subject: [Numpy-discussion] speed of atleast_1d and friends
Message-ID: <6946b9500908040423n5ed4beawd5c1b0ca21823d06@mail.gmail.com>

Hello all,

I am making a lot of use of atleast_1d and atleast_2d in my routines.

Does anybody know whether this will slow down my code significantly?

Thanks,

Mark

From david at ar.media.kyoto-u.ac.jp Tue Aug 4 07:13:59 2009
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Tue, 04 Aug 2009 20:13:59 +0900
Subject: [Numpy-discussion] Is this a bug in numpy.distutils ?
In-Reply-To: 
References: <1e2af89e0908031635i676135afmde57bd8993a4ca82@mail.gmail.com> <4A77A6E1.1050403@ar.media.kyoto-u.ac.jp> <4A77E940.7070807@ar.media.kyoto-u.ac.jp> <4A77F13E.2020402@ar.media.kyoto-u.ac.jp> <5b8d13220908040254n538c1e4flfe7ee2dd9aba96ef@mail.gmail.com> <4A780871.5070408@ar.media.kyoto-u.ac.jp>
Message-ID: <4A7817F7.3@ar.media.kyoto-u.ac.jp>

Dave wrote:
> David Cournapeau <david at ar.media.kyoto-u.ac.jp> writes:
>
>> Dave wrote:
>>
>>> Dave <dave.hirschfeld at gmail.com> writes:
>>>
>>>> Works for me.
>>>>
>>>> -Dave
>>>>
>>> Except now when trying to compile the latest scipy I get the following error:
>>>
>> Was numpy installed from a bdist_wininst installer, or did you use the
>> install method directly?
>>
>> David
>>
>
> Numpy was installed with the bdist_wininst installer.
>
> In case it's relevant, the installer seemed to create 2 egg-info files:
> numpy-1.4.0.dev7277-py2.5.egg-info
> numpy-1.4.0.dev7280-py2.5.egg-info
>
> Deleting the numpy directory and the egg-info files and re-installing
> from the bdist_wininst installer gave the same result (with the above 2
> egg-info files).
>
> Installing numpy with python setup.py install seemed to work (at least
> the npymath.ini file was now in the numpy\core\lib\npy-pkg-config
> folder).
>

I think I understand the problem. Unfortunately, that looks tricky to
solve... I hate distutils.

David

From emmanuelle.gouillart at normalesup.org Tue Aug 4 09:03:41 2009
From: emmanuelle.gouillart at normalesup.org (Emmanuelle Gouillart)
Date: Tue, 4 Aug 2009 15:03:41 +0200
Subject: [Numpy-discussion] speed of atleast_1d and friends
In-Reply-To: <6946b9500908040423n5ed4beawd5c1b0ca21823d06@mail.gmail.com>
References: <6946b9500908040423n5ed4beawd5c1b0ca21823d06@mail.gmail.com>
Message-ID: <20090804130341.GD9488@phare.normalesup.org>

Hello,

> I am making a lot of use of atleast_1d and atleast_2d in my routines.
> Does anybody know whether this will slow down my code significantly?

If there is no need to make copies (i.e. if you take arrays as
parameters (?)), calls to atleast_1d and atleast_2d should be extremely
fast: it's just a question of creating a different view, I think. Did
you profile your code to check?
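A quick check would be something along these lines (I have not measured
this myself):

python -m timeit -s "import numpy as np; a = np.ones(1000)" "np.atleast_1d(a)"
python -m timeit -s "import numpy as np; a = np.ones(1000)" "np.atleast_2d(a)"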
Cheers,

Emmanuelle

From afriedle at indiana.edu Tue Aug 4 09:39:15 2009
From: afriedle at indiana.edu (Andrew Friedley)
Date: Tue, 04 Aug 2009 09:39:15 -0400
Subject: [Numpy-discussion] strange sin/cos performance
In-Reply-To: <4A772A15.8090407@gmail.com>
References: <4A76E709.9090100@indiana.edu> <20090803134556.GA31036@phare.normalesup.org> <4A76EFD3.5010508@indiana.edu> <20090803142112.GA7495@phare.normalesup.org> <7f014ea60908030923s30b958a6meb2afbad38269052@mail.gmail.com> <4A7723A8.8010107@indiana.edu> <4A772A15.8090407@gmail.com>
Message-ID: <4A783A03.9090709@indiana.edu>

Bruce Southey wrote:
> Hi,
> Can you try these from the command line:
> python -m timeit -n 100 -s "import numpy as np; a = np.arange(0.0, 1000,
> (2*3.14159) / 1000, dtype=np.float32)"
> python -m timeit -n 100 -s "import numpy as np; a = np.arange(0.0, 1000,
> (2*3.14159) / 1000, dtype=np.float32); b=np.sin(a)"
> python -m timeit -n 100 -s "import numpy as np; a = np.arange(0.0, 1000,
> (2*3.14159) / 1000, dtype=np.float32); np.sin(a)"
> python -m timeit -n 100 -s "import numpy as np; a = np.arange(0.0, 1000,
> (2*3.14159) / 1000, dtype=np.float32)" "np.sin(a)"
>
> The first should be similar for different dtypes because it is just
> array creation. The second extends that by storing the sin into another
> array. I am not sure how to interpret the third but in the Python prompt
> it would print it to screen. The last causes Python to handle two
> arguments which is slow using float32 but not for float64 and float128
> suggesting compiler issue such as not using SSE or similar.

Results:

$ python -m timeit -n 100 -s "import numpy as np; a = np.arange(0.0, 1000,
(2*3.14159) / 1000, dtype=np.float32)"
100 loops, best of 3: 0.0811 usec per loop
$ python -m timeit -n 100 -s "import numpy as np; a = np.arange(0.0, 1000,
(2*3.14159) / 1000, dtype=np.float32); b=np.sin(a)"
100 loops, best of 3: 0.11 usec per loop
$ python -m timeit -n 100 -s "import numpy as np; a = np.arange(0.0, 1000,
(2*3.14159) / 1000, dtype=np.float32); np.sin(a)"
100 loops, best of 3: 0.11 usec per loop
$ python -m timeit -n 100 -s "import numpy as np; a = np.arange(0.0, 1000,
(2*3.14159) / 1000, dtype=np.float32)" "np.sin(a)"
100 loops, best of 3: 112 msec per loop
$ python -m timeit -n 100 -s "import numpy as np; a = np.arange(0.0, 1000,
(2*3.14159) / 1000, dtype=np.float64)" "np.sin(a)"
100 loops, best of 3: 13.2 msec per loop

I think the second and third are effectively the same; both create an
array containing the result. The second assigns that array to a value,
while the third does not, so it should get garbage collected. The
fourth one is the only one that actually runs the sin in the timing
loop. I don't understand what you mean by causing Python to handle two
arguments? The fifth run I added uses float64 to compare (and
reproduces the problem).

Andrew

From kwgoodman at gmail.com Tue Aug 4 10:37:03 2009
From: kwgoodman at gmail.com (Keith Goodman)
Date: Tue, 4 Aug 2009 07:37:03 -0700
Subject: [Numpy-discussion] speed of atleast_1d and friends
In-Reply-To: <6946b9500908040423n5ed4beawd5c1b0ca21823d06@mail.gmail.com>
References: <6946b9500908040423n5ed4beawd5c1b0ca21823d06@mail.gmail.com>
Message-ID:

On Tue, Aug 4, 2009 at 4:23 AM, Mark Bakker wrote:
> Hello all,
> I am making a lot of use of atleast_1d and atleast_2d in my routines.
> Does anybody know whether this will slow down my code significantly?
> Thanks,
> Mark

Here's atleast_1d:

def atleast_1d(*arys):
    res = []
    for ary in arys:
        res.append(array(ary,copy=False,subok=True,ndmin=1))
    if len(res) == 1:
        return res[0]
    else:
        return res

If you only pass in one array at a time, that reduces to:

def myatleast_1d(ary):
    return array(ary, copy=False, subok=True, ndmin=1)

That might save some time.

I'm always amazed at the solutions people come up with on this list. So
if you send an example, someone might be able to get rid of the need
for atleast_1d.

From gael.varoquaux at normalesup.org Tue Aug 4 10:41:01 2009
From: gael.varoquaux at normalesup.org (Gael Varoquaux)
Date: Tue, 4 Aug 2009 16:41:01 +0200
Subject: [Numpy-discussion] speed of atleast_1d and friends
In-Reply-To: 
References: <6946b9500908040423n5ed4beawd5c1b0ca21823d06@mail.gmail.com>
Message-ID: <20090804144101.GM17519@phare.normalesup.org>

On Tue, Aug 04, 2009 at 07:37:03AM -0700, Keith Goodman wrote:
> I'm always amazed at the solutions people come up with on this list.
> So if you send an example, someone might be able to get rid of the
> need for atleast_1d.

On the other hand, it costs almost no time, and makes your API more
robust (for instance it can be used with numbers as well as arrays). I
am all for abusive use of np.atleast_1d.

Gaël

From afriedle at indiana.edu Tue Aug 4 11:14:59 2009
From: afriedle at indiana.edu (Andrew Friedley)
Date: Tue, 04 Aug 2009 11:14:59 -0400
Subject: [Numpy-discussion] strange sin/cos performance
In-Reply-To: 
References: <4A76E709.9090100@indiana.edu> <20090803134556.GA31036@phare.normalesup.org> <4A76EFD3.5010508@indiana.edu> <20090803142112.GA7495@phare.normalesup.org> <7f014ea60908030923s30b958a6meb2afbad38269052@mail.gmail.com> <4A7723A8.8010107@indiana.edu>
Message-ID: <4A785073.9060009@indiana.edu>

Charles R Harris wrote:
> On Mon, Aug 3, 2009 at 11:51 AM, Andrew Friedley wrote:
>
>> Charles R Harris wrote:
>>> What compiler versions are folks using? In the slow cases, what is the
>>> timing for converting to double, computing the sin, then casting back to
>>> single?
>> I did this, is this the right way to do that?
>>
>> t = timeit.Timer("numpy.sin(a.astype(numpy.float64)).astype(numpy.float32)",
>>         "import numpy\n"
>>         "a = numpy.arange(0.0, 1000, (2 * 3.14159) / 1000, dtype=numpy.float64)")
>> print "sin converted float 32/64", min(t.repeat(3, 10))
>>
>> Timings on my opteron system (2-socket 2-core 2GHz):
>>
>> sin float32 1.13407707214
>> sin float64 0.133460998535
>> sin converted float 32/64 0.18202996254
>>
>> Not too surprising I guess.
>>
>> gcc --version shows:
>>
>> gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-44)
>>
>> My compile flags for my Python 2.6.1/NumPy 1.3.0 builds:
>>
>> -Os -fomit-frame-pointer -pipe -s -march=k8 -m64
>>
>
> That looks right. When numpy doesn't find a *f version it basically does
> that conversion. This is beginning to look like a hardware/software
> implementation problem, maybe compiler related. That is, I suspect the fast
> times come from using a hardware implementation. What happens if you use -O2
> instead of -Os?

Do you know where this conversion is, in the code? The impression I got
from my quick look at the code was that a wrapper sinf was defined that
just calls sin. I guess the typecast to float in there will do the
conversion, is that what you are referring to, or something at a higher
level?

I recompiled the same versions of Python/NumPy, using the same flags
except -O2 instead of -Os, the behavior is still the same.
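(If it is useful, the cost of the conversion alone can be isolated with
something like this -- I haven't timed it separately yet:

python -m timeit -s "import numpy; a = numpy.arange(0.0, 1000, (2 * 3.14159) / 1000, dtype=numpy.float32)" "a.astype(numpy.float64)"
)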
Andrew

From cournape at gmail.com Tue Aug 4 11:20:13 2009
From: cournape at gmail.com (David Cournapeau)
Date: Wed, 5 Aug 2009 00:20:13 +0900
Subject: [Numpy-discussion] strange sin/cos performance
In-Reply-To: <4A785073.9060009@indiana.edu>
References: <4A76E709.9090100@indiana.edu> <20090803134556.GA31036@phare.normalesup.org> <4A76EFD3.5010508@indiana.edu> <20090803142112.GA7495@phare.normalesup.org> <7f014ea60908030923s30b958a6meb2afbad38269052@mail.gmail.com> <4A7723A8.8010107@indiana.edu> <4A785073.9060009@indiana.edu>
Message-ID: <5b8d13220908040820u79b6d5felc3dd2e338cc55433@mail.gmail.com>

On Wed, Aug 5, 2009 at 12:14 AM, Andrew Friedley wrote:
> Do you know where this conversion is, in the code? The impression I got
> from my quick look at the code was that a wrapper sinf was defined that
> just calls sin. I guess the typecast to float in there will do the
> conversion

Exact. Given your CPU, compared to my macbook, it looks like the
float32 is the problem (i.e. the float64 is not particularly fast). I
really can't see what could cause such a slowdown: the range over which
you evaluate sin should not cause denormal numbers - just to be sure,
could you try the same benchmark but using a simple array of constant
values (say numpy.ones(1000))? Also, you may want to check what happens
if you force raising errors in case of FPU exceptions
(numpy.seterr(all='raise')).

cheers,

David

From cournape at gmail.com Tue Aug 4 12:31:04 2009
From: cournape at gmail.com (David Cournapeau)
Date: Wed, 5 Aug 2009 01:31:04 +0900
Subject: [Numpy-discussion] Is this a bug in numpy.distutils ?
In-Reply-To: <4A7817F7.3@ar.media.kyoto-u.ac.jp>
References: <1e2af89e0908031635i676135afmde57bd8993a4ca82@mail.gmail.com> <4A77E940.7070807@ar.media.kyoto-u.ac.jp> <4A77F13E.2020402@ar.media.kyoto-u.ac.jp> <5b8d13220908040254n538c1e4flfe7ee2dd9aba96ef@mail.gmail.com> <4A780871.5070408@ar.media.kyoto-u.ac.jp> <4A7817F7.3@ar.media.kyoto-u.ac.jp>
Message-ID: <5b8d13220908040931ide8259dn7c07ecf8e82e3099@mail.gmail.com>

On Tue, Aug 4, 2009 at 8:13 PM, David Cournapeau wrote:
> I think I understand the problem. Unfortunately, that looks tricky to
> solve... I hate distutils.

Ok - should be fixed in r7281.

David

From gokhansever at gmail.com Tue Aug 4 12:46:57 2009
From: gokhansever at gmail.com (Gökhan Sever)
Date: Tue, 4 Aug 2009 11:46:57 -0500
Subject: [Numpy-discussion] Why NaN?
Message-ID: <49d6b3500908040946v2a06e615t7f77bffabf22e066@mail.gmail.com>

Hello,

I know this has to have a very simple answer, but I'm stuck at this
very moment and can't get a meaningful result out of np.mean()

In [121]: a = array([NaN, 4, NaN, 12])

In [122]: b = array([NaN, 2, NaN, 3])

In [123]: c = a/b

In [124]: mean(c)
Out[124]: nan

In [125]: mean a
--------> mean(a)
Out[125]: nan

Further when I tried:

In [138]: c
Out[138]: array([ NaN,   2.,  NaN,   4.])

In [139]: np.where(c==NaN)
Out[139]: (array([], dtype=int32),)

In [141]: mask = [c != NaN]

In [142]: mask
Out[142]: [array([ True,  True,  True,  True], dtype=bool)]

Any ideas?

--
Gökhan

From dave.hirschfeld at gmail.com Tue Aug 4 12:51:56 2009
From: dave.hirschfeld at gmail.com (Dave)
Date: Tue, 4 Aug 2009 16:51:56 +0000 (UTC)
Subject: [Numpy-discussion] Is this a bug in numpy.distutils ?
References: <1e2af89e0908031635i676135afmde57bd8993a4ca82@mail.gmail.com> <4A77E940.7070807@ar.media.kyoto-u.ac.jp> <4A77F13E.2020402@ar.media.kyoto-u.ac.jp> <5b8d13220908040254n538c1e4flfe7ee2dd9aba96ef@mail.gmail.com> <4A780871.5070408@ar.media.kyoto-u.ac.jp> <4A7817F7.3@ar.media.kyoto-u.ac.jp> <5b8d13220908040931ide8259dn7c07ecf8e82e3099@mail.gmail.com>
Message-ID:

David Cournapeau <cournape at gmail.com> writes:
>
> On Tue, Aug 4, 2009 at 8:13 PM, David
> Cournapeau <david at ar.media.kyoto-u.ac.jp> wrote:
>
> > I think I understand the problem. Unfortunately, that looks tricky to
> > solve... I hate distutils.
>
> Ok - should be fixed in r7281.
>
> David
>

Well, that seemed to fix the bdist_wininst issue.

The problem compiling scipy remains, but I assume that's probably
something I should take up on the scipy list?

FWIW running the full numpy test suite (verbose=10) I get 7 failures.
The results are available from http://pastebin.com/m5505d4b5

The "errors" seem to be related to the NaN handling.

Thanks for the help today!

-Dave

From robert.kern at gmail.com Tue Aug 4 12:51:56 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Tue, 4 Aug 2009 11:51:56 -0500
Subject: [Numpy-discussion] Why NaN?
In-Reply-To: <49d6b3500908040946v2a06e615t7f77bffabf22e066@mail.gmail.com>
References: <49d6b3500908040946v2a06e615t7f77bffabf22e066@mail.gmail.com>
Message-ID: <3d375d730908040951u2090502cxc3a1704a0e1c906a@mail.gmail.com>

On Tue, Aug 4, 2009 at 11:46, Gökhan Sever wrote:
> Hello,
>
> I know this has to have a very simple answer, but I'm stuck at this
> very moment and can't get a meaningful result out of np.mean()
>
> In [121]: a = array([NaN, 4, NaN, 12])
>
> In [122]: b = array([NaN, 2, NaN, 3])
>
> In [123]: c = a/b
>
> In [124]: mean(c)
> Out[124]: nan
>
> In [125]: mean a
> --------> mean(a)
> Out[125]: nan
>
> Further when I tried:
>
> In [138]: c
> Out[138]: array([ NaN,   2.,  NaN,   4.])
>
> In [139]: np.where(c==NaN)
> Out[139]: (array([], dtype=int32),)
>
> In [141]: mask = [c != NaN]
>
> In [142]: mask
> Out[142]: [array([ True,  True,  True,  True], dtype=bool)]

Yeah, NaN != NaN. It's a feature, not a bug.

Use np.ma.masked_invalid(c).mean().

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

From kwgoodman at gmail.com Tue Aug 4 12:54:15 2009
From: kwgoodman at gmail.com (Keith Goodman)
Date: Tue, 4 Aug 2009 09:54:15 -0700
Subject: [Numpy-discussion] Why NaN?
In-Reply-To: <49d6b3500908040946v2a06e615t7f77bffabf22e066@mail.gmail.com>
References: <49d6b3500908040946v2a06e615t7f77bffabf22e066@mail.gmail.com>
Message-ID:

On Tue, Aug 4, 2009 at 9:46 AM, Gökhan Sever wrote:
> Hello,
>
> I know this has to have a very simple answer, but I'm stuck at this
> very moment and can't get a meaningful result out of np.mean()
>
> In [121]: a = array([NaN, 4, NaN, 12])
>
> In [122]: b = array([NaN, 2, NaN, 3])
>
> In [123]: c = a/b
>
> In [124]: mean(c)
> Out[124]: nan
>
> In [125]: mean a
> --------> mean(a)
> Out[125]: nan
>
> Further when I tried:
>
> In [138]: c
> Out[138]: array([ NaN,   2.,  NaN,   4.])
>
> In [139]: np.where(c==NaN)
> Out[139]: (array([], dtype=int32),)
>
> In [141]: mask = [c != NaN]
>
> In [142]: mask
> Out[142]: [array([ True,  True,  True,  True], dtype=bool)]
>
> Any ideas?
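NaN compares unequal to everything, including itself, so c == NaN can
never match; isnan is the way to locate them, e.g.:

>> np.where(np.isnan(c))
   (array([0, 2]),)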
>> a = array([NaN, 4, NaN, 12])
>> b = array([NaN, 2, NaN, 3])
>> c = a/b
>> from scipy import stats
>> stats.nan [tab]
stats.nanmean    stats.nanmedian  stats.nanstd
>> stats.nanmean(c)
   3.0
>> stats.nanmean(a)
   8.0
>> c[isnan(c)]
   array([ NaN,  NaN])

From perry at stsci.edu Tue Aug 4 12:57:23 2009
From: perry at stsci.edu (Perry Greenfield)
Date: Tue, 4 Aug 2009 12:57:23 -0400
Subject: [Numpy-discussion] Why NaN?
In-Reply-To: <49d6b3500908040946v2a06e615t7f77bffabf22e066@mail.gmail.com>
References: <49d6b3500908040946v2a06e615t7f77bffabf22e066@mail.gmail.com>
Message-ID:

Note that NaN generally contaminates sums and other net results (as it
should). You should filter them out (there is more than one way to do
that). But also note that the IEEE standard for floating point numbers
requires NaN != NaN. Thus any attempt to find NaNs that way is destined
to fail. Use the function isnan() instead to generate a mask.

Perry

On Aug 4, 2009, at 12:46 PM, Gökhan Sever wrote:

> Hello,
>
> I know this has to have a very simple answer, but I'm stuck at this
> very moment and can't get a meaningful result out of np.mean()
>
> In [121]: a = array([NaN, 4, NaN, 12])
>
> In [122]: b = array([NaN, 2, NaN, 3])
>
> In [123]: c = a/b
>
> In [124]: mean(c)
> Out[124]: nan
>
> In [125]: mean a
> --------> mean(a)
> Out[125]: nan
>
> Further when I tried:
>
> In [138]: c
> Out[138]: array([ NaN,   2.,  NaN,   4.])
>
> In [139]: np.where(c==NaN)
> Out[139]: (array([], dtype=int32),)
>
> In [141]: mask = [c != NaN]
>
> In [142]: mask
> Out[142]: [array([ True,  True,  True,  True], dtype=bool)]
>
> Any ideas?
>
> --
> Gökhan

From kwgoodman at gmail.com Tue Aug 4 12:59:06 2009
From: kwgoodman at gmail.com (Keith Goodman)
Date: Tue, 4 Aug 2009 09:59:06 -0700
Subject: [Numpy-discussion] Why NaN?
In-Reply-To: 
References: <49d6b3500908040946v2a06e615t7f77bffabf22e066@mail.gmail.com>
Message-ID:

On Tue, Aug 4, 2009 at 9:54 AM, Keith Goodman wrote:
> On Tue, Aug 4, 2009 at 9:46 AM, Gökhan Sever wrote:
>> Hello,
>>
>> I know this has to have a very simple answer, but I'm stuck at this
>> very moment and can't get a meaningful result out of np.mean()
>>
>> In [121]: a = array([NaN, 4, NaN, 12])
>>
>> In [122]: b = array([NaN, 2, NaN, 3])
>>
>> In [123]: c = a/b
>>
>> In [124]: mean(c)
>> Out[124]: nan
>>
>> In [125]: mean a
>> --------> mean(a)
>> Out[125]: nan
>>
>> Further when I tried:
>>
>> In [138]: c
>> Out[138]: array([ NaN,   2.,  NaN,   4.])
>>
>> In [139]: np.where(c==NaN)
>> Out[139]: (array([], dtype=int32),)
>>
>> In [141]: mask = [c != NaN]
>>
>> In [142]: mask
>> Out[142]: [array([ True,  True,  True,  True], dtype=bool)]
>>
>> Any ideas?
>
>>> a = array([NaN, 4, NaN, 12])
>>> b = array([NaN, 2, NaN, 3])
>>> c = a/b
>>> from scipy import stats
>>> stats.nan [tab]
> stats.nanmean    stats.nanmedian  stats.nanstd
>>> stats.nanmean(c)
>    3.0
>>> stats.nanmean(a)
>    8.0
>>> c[isnan(c)]
>    array([ NaN,  NaN])

One more:

>> c[isfinite(c)].mean()
   3.0

From josef.pktd at gmail.com Tue Aug 4 13:05:01 2009
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Tue, 4 Aug 2009 13:05:01 -0400
Subject: [Numpy-discussion] Why NaN?
From josef.pktd at gmail.com Tue Aug 4 13:05:01 2009
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Tue, 4 Aug 2009 13:05:01 -0400
Subject: [Numpy-discussion] Why NaN?
In-Reply-To: 
References: <49d6b3500908040946v2a06e615t7f77bffabf22e066@mail.gmail.com>
Message-ID: <1cd32cbb0908041005m7405c9f8j5b75619c127c9180@mail.gmail.com>

On Tue, Aug 4, 2009 at 12:59 PM, Keith Goodman wrote:
> On Tue, Aug 4, 2009 at 9:54 AM, Keith Goodman wrote:
>> On Tue, Aug 4, 2009 at 9:46 AM, Gökhan Sever wrote:
>>> Hello,
>>>
>>> I know this has to have a very simple answer, but stuck at this very moment
>>> and can't get a meaningful result out of np.mean()
>>>
>>> In [121]: a = array([NaN, 4, NaN, 12])
>>>
>>> In [122]: b = array([NaN, 2, NaN, 3])
>>>
>>> In [123]: c = a/b
>>>
>>> In [124]: mean(c)
>>> Out[124]: nan
>>>
>>> In [125]: mean a
>>> --------> mean(a)
>>> Out[125]: nan
>>>
>>> Further when I tried:
>>>
>>> In [138]: c
>>> Out[138]: array([ NaN,   2.,  NaN,   4.])
>>>
>>> In [139]: np.where(c==NaN)
>>> Out[139]: (array([], dtype=int32),)
>>>
>>> In [141]: mask = [c != NaN]
>>>
>>> In [142]: mask
>>> Out[142]: [array([ True,  True,  True,  True], dtype=bool)]
>>>
>>> Any ideas?
>>
>>>> a = array([NaN, 4, NaN, 12])
>>>> b = array([NaN, 2, NaN, 3])
>>>> c = a/b
>>>> from scipy import stats
>>>> stats.nan [tab]
>> stats.nanmean    stats.nanmedian  stats.nanstd
>>>> stats.nanmean(c)
>>   3.0
>>>> stats.nanmean(a)
>>   8.0
>>>> c[isnan(c)]
>>   array([ NaN,  NaN])
>
> One more:
>
>>> c[isfinite(c)].mean()
>   3.0

What's going on with the response time here?

I cannot even finish reading the question and start python.

Josef

From robert.kern at gmail.com Tue Aug 4 13:08:36 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Tue, 4 Aug 2009 12:08:36 -0500
Subject: [Numpy-discussion] Why NaN?
In-Reply-To: <1cd32cbb0908041005m7405c9f8j5b75619c127c9180@mail.gmail.com>
References: <49d6b3500908040946v2a06e615t7f77bffabf22e066@mail.gmail.com> <1cd32cbb0908041005m7405c9f8j5b75619c127c9180@mail.gmail.com>
Message-ID: <3d375d730908041008k33d2161ayf9a35b04b0ab85ca@mail.gmail.com>

On Tue, Aug 4, 2009 at 12:05, wrote:

> What's going on with the response time here?
>
> I cannot even finish reading the question and start python.

Practice. :-)

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

From kwgoodman at gmail.com Tue Aug 4 13:11:07 2009
From: kwgoodman at gmail.com (Keith Goodman)
Date: Tue, 4 Aug 2009 10:11:07 -0700
Subject: [Numpy-discussion] Why NaN?
In-Reply-To: <1cd32cbb0908041005m7405c9f8j5b75619c127c9180@mail.gmail.com>
References: <49d6b3500908040946v2a06e615t7f77bffabf22e066@mail.gmail.com> <1cd32cbb0908041005m7405c9f8j5b75619c127c9180@mail.gmail.com>
Message-ID: 

On Tue, Aug 4, 2009 at 10:05 AM, wrote:
> What's going on with the response time here?
>
> I cannot even finish reading the question and start python.

The trick is to not read the entire question. I usually reply after
reading the subj line. Or just auto-reply with "x.sort() returns None"
which seems to be the most common question.
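Since it comes up so often, here is that most common answer as a quick
sketch (np assumed to be numpy; a Python list's sort() method behaves the
same way):

import numpy as np

a = np.array([3, 1, 2])
b = a.sort()      # sorts a in place...
print(a)          # [1 2 3]
print(b)          # None -- the in-place sort returns nothing
c = np.sort(a)    # np.sort() returns a sorted copy instead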
From charlesr.harris at gmail.com Tue Aug 4 13:16:28 2009
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Tue, 4 Aug 2009 11:16:28 -0600
Subject: [Numpy-discussion] Is this a bug in numpy.distutils ?
In-Reply-To: 
References: <1e2af89e0908031635i676135afmde57bd8993a4ca82@mail.gmail.com> <4A77F13E.2020402@ar.media.kyoto-u.ac.jp> <5b8d13220908040254n538c1e4flfe7ee2dd9aba96ef@mail.gmail.com> <4A780871.5070408@ar.media.kyoto-u.ac.jp> <4A7817F7.3@ar.media.kyoto-u.ac.jp> <5b8d13220908040931ide8259dn7c07ecf8e82e3099@mail.gmail.com>
Message-ID: 

On Tue, Aug 4, 2009 at 10:51 AM, Dave wrote:

> David Cournapeau gmail.com> writes:
> >
> > On Tue, Aug 4, 2009 at 8:13 PM, David
> > Cournapeau ar.media.kyoto-u.ac.jp> wrote:
> >
> > > I think I understand the problem. Unfortunately, that's looks tricky to
> > > solve... I hate distutils.
> >
> > Ok - should be fixed in r7281.
> >
> > David
>
> Well, that seemed to fix the bdist_wininst issue.
>
> The problem compiling scipy remains, but I assume that's probably something
> I should take up on the scipy list?
>
> FWIW running the full numpy test suite (verbose=10) I get 7 failures. The
> results are available from http://pastebin.com/m5505d4b5
>
> The "errors" seem to be related to the NaN handling.

The nan problems come from these tests:

    # atan2(+-infinity, -infinity) returns +-3*pi/4.
    yield assert_almost_equal, ncu.arctan2( np.inf, -np.inf),  0.75 * np.pi
    yield assert_almost_equal, ncu.arctan2(-np.inf, -np.inf), -0.75 * np.pi

    # atan2(+-infinity, +infinity) returns +-pi/4.
    yield assert_almost_equal, ncu.arctan2( np.inf, np.inf),  0.25 * np.pi
    yield assert_almost_equal, ncu.arctan2(-np.inf, np.inf), -0.25 * np.pi

So the problem seems to be with the inf handling. Windows arctan2 is known
to be wtf-buggy and I suspect that is what is being tested.

Chuck
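For reference, the special values these tests expect can be checked
directly; on a platform with a conforming atan2 the following holds (a
quick sketch, not part of the original thread):

import numpy as np

print(np.arctan2(np.inf, -np.inf))    #  2.35619449019  ( 3*pi/4)
print(np.arctan2(-np.inf, -np.inf))   # -2.35619449019  (-3*pi/4)
print(np.arctan2(np.inf, np.inf))     #  0.785398163397 ( pi/4)
print(np.arctan2(-np.inf, np.inf))    # -0.785398163397 (-pi/4)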
From afriedle at indiana.edu Tue Aug 4 13:19:22 2009
From: afriedle at indiana.edu (Andrew Friedley)
Date: Tue, 04 Aug 2009 13:19:22 -0400
Subject: [Numpy-discussion] strange sin/cos performance
In-Reply-To: <5b8d13220908040820u79b6d5felc3dd2e338cc55433@mail.gmail.com>
References: <4A76E709.9090100@indiana.edu> <20090803134556.GA31036@phare.normalesup.org> <4A76EFD3.5010508@indiana.edu> <20090803142112.GA7495@phare.normalesup.org> <7f014ea60908030923s30b958a6meb2afbad38269052@mail.gmail.com> <4A7723A8.8010107@indiana.edu> <4A785073.9060009@indiana.edu> <5b8d13220908040820u79b6d5felc3dd2e338cc55433@mail.gmail.com>
Message-ID: <4A786D9A.1040703@indiana.edu>

David Cournapeau wrote:
> On Wed, Aug 5, 2009 at 12:14 AM, Andrew Friedley wrote:
>
>> Do you know where this conversion is, in the code? The impression I got
>> from my quick look at the code was that a wrapper sinf was defined that
>> just calls sin. I guess the typecast to float in there will do the
>> conversion
>
> Exact. Given your CPU, compared to my macbook, it looks like the
> float32 is the problem (i.e. the float64 is not particularly fast). I
> really can't see what could cause such a slowdown: the range over
> which you evaluate sin should not cause denormal numbers - just to be
> sure, could you try the same benchmark but using a simple array of
> constant values (say numpy.ones(1000)) ? Also, you may want to check
> what happens if you force raising errors in case of FPU exceptions
> (numpy.seterr(raise="all")).

OK, have some interesting results. First, my array creation was not
doing what I thought it was. This (what I've been doing) creates an
array of 159161 elements:

numpy.arange(0.0, 1000, (2 * 3.14159) / 1000, dtype=numpy.float32)

Which isn't what I was after (1000 elements ranging from 0 to 2PI). So
the values in that array climb up to 999.999.

Running with numpy.ones() gives a much different timing (I did
numpy.ones(159161) to keep the array lengths the same):

sin float32 0.078202009201
sin float64 0.0767619609833
cos float32 0.0750858783722
cos float64 0.088515996933

Much better, but still a little strange, float32 should be relatively
faster yet. I tried with 1000 elements and got similar results.

So the performance has something to do with the input values. This is
believable, but I don't think it explains why float32 would behave that
way and not float64, unless there's something else I don't understand.

Also I assume you meant seterr(all='raise'). This didn't seem to do
anything, I don't have any exceptions thrown or other output.

Andrew

From robert.kern at gmail.com Tue Aug 4 13:24:15 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Tue, 4 Aug 2009 12:24:15 -0500
Subject: [Numpy-discussion] strange sin/cos performance
In-Reply-To: <4A786D9A.1040703@indiana.edu>
References: <4A76E709.9090100@indiana.edu> <20090803142112.GA7495@phare.normalesup.org> <7f014ea60908030923s30b958a6meb2afbad38269052@mail.gmail.com> <4A7723A8.8010107@indiana.edu> <4A785073.9060009@indiana.edu> <5b8d13220908040820u79b6d5felc3dd2e338cc55433@mail.gmail.com> <4A786D9A.1040703@indiana.edu>
Message-ID: <3d375d730908041024p616fae36k3ce5205b70ed0e48@mail.gmail.com>

On Tue, Aug 4, 2009 at 12:19, Andrew Friedley wrote:

> OK, have some interesting results. First, my array creation was not
> doing what I thought it was. This (what I've been doing) creates an
> array of 159161 elements:
>
> numpy.arange(0.0, 1000, (2 * 3.14159) / 1000, dtype=numpy.float32)
>
> Which isn't what I was after (1000 elements ranging from 0 to 2PI). So
> the values in that array climb up to 999.999.

One uses arange() like so: numpy.arange(start, stop, step), just like
the builtin range(). You want numpy.linspace(0.0, 2*numpy.pi, 1000).

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco
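To make the difference concrete, a short sketch of the two constructions
(values as in the thread):

import numpy as np

# arange(start, stop, step): the 1000 here is a stop value, not a count,
# so this yields the ~159 thousand tiny steps reported above, not 1000
# points between 0 and 2*pi
a = np.arange(0.0, 1000, (2 * 3.14159) / 1000, dtype=np.float32)

# linspace(start, stop, num) yields exactly 1000 points from 0 to 2*pi,
# endpoint included; .astype is used since linspace of this era takes no
# dtype argument
b = np.linspace(0.0, 2 * np.pi, 1000).astype(np.float32)
print(b.shape, b[0], b[-1])   # (1000,) 0.0 6.2831855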
From d_l_goldsmith at yahoo.com Tue Aug 4 13:45:01 2009
From: d_l_goldsmith at yahoo.com (David Goldsmith)
Date: Tue, 4 Aug 2009 10:45:01 -0700 (PDT)
Subject: [Numpy-discussion] Why NaN?
In-Reply-To: <3d375d730908041008k33d2161ayf9a35b04b0ab85ca@mail.gmail.com>
Message-ID: <145439.20090.qm@web52104.mail.re2.yahoo.com>

Actually, Robert's really a robot (indeed, the Kernel of all robot minds)
- no way a biologic is going to beat him. ;-)

DG

--- On Tue, 8/4/09, Robert Kern wrote:

> From: Robert Kern
> Subject: Re: [Numpy-discussion] Why NaN?
> To: "Discussion of Numerical Python"
> Date: Tuesday, August 4, 2009, 10:08 AM
> On Tue, Aug 4, 2009 at 12:05, wrote:
>
> > What's going on with the response time here?
> >
> > I cannot even finish reading the question and start python.
>
> Practice. :-)
>
> --
> Robert Kern
>
> "I have come to believe that the whole world is an enigma, a harmless
> enigma that is made terrible by our own mad attempt to interpret it as
> though it had an underlying truth."
>   -- Umberto Eco

From charlesr.harris at gmail.com Tue Aug 4 13:48:34 2009
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Tue, 4 Aug 2009 11:48:34 -0600
Subject: [Numpy-discussion] strange sin/cos performance
In-Reply-To: <4A786D9A.1040703@indiana.edu>
References: <4A76E709.9090100@indiana.edu> <20090803142112.GA7495@phare.normalesup.org> <7f014ea60908030923s30b958a6meb2afbad38269052@mail.gmail.com> <4A7723A8.8010107@indiana.edu> <4A785073.9060009@indiana.edu> <5b8d13220908040820u79b6d5felc3dd2e338cc55433@mail.gmail.com> <4A786D9A.1040703@indiana.edu>
Message-ID: 

On Tue, Aug 4, 2009 at 11:19 AM, Andrew Friedley wrote:

> David Cournapeau wrote:
> > On Wed, Aug 5, 2009 at 12:14 AM, Andrew Friedley wrote:
> >
> >> Do you know where this conversion is, in the code? The impression I got
> >> from my quick look at the code was that a wrapper sinf was defined that
> >> just calls sin. I guess the typecast to float in there will do the
> >> conversion
> >
> > Exact. Given your CPU, compared to my macbook, it looks like the
> > float32 is the problem (i.e. the float64 is not particularly fast). I
> > really can't see what could cause such a slowdown: the range over
> > which you evaluate sin should not cause denormal numbers - just to be
> > sure, could you try the same benchmark but using a simple array of
> > constant values (say numpy.ones(1000)) ? Also, you may want to check
> > what happens if you force raising errors in case of FPU exceptions
> > (numpy.seterr(raise="all")).
>
> OK, have some interesting results. First, my array creation was not
> doing what I thought it was. This (what I've been doing) creates an
> array of 159161 elements:
>
> numpy.arange(0.0, 1000, (2 * 3.14159) / 1000, dtype=numpy.float32)
>
> Which isn't what I was after (1000 elements ranging from 0 to 2PI). So
> the values in that array climb up to 999.999.
>
> Running with numpy.ones() gives a much different timing (I did
> numpy.ones(159161) to keep the array lengths the same):
>
> sin float32 0.078202009201
> sin float64 0.0767619609833
> cos float32 0.0750858783722
> cos float64 0.088515996933
>
> Much better, but still a little strange, float32 should be relatively
> faster yet. I tried with 1000 elements and got similar results.

Depends on the CPU, FPU and the compiler flags. The computations could very
well be done using double precision internally with conversions on
load/store.

Chuck

From afriedle at indiana.edu Tue Aug 4 13:57:05 2009
From: afriedle at indiana.edu (Andrew Friedley)
Date: Tue, 04 Aug 2009 13:57:05 -0400
Subject: [Numpy-discussion] strange sin/cos performance
In-Reply-To: 
References: <4A76E709.9090100@indiana.edu> <20090803142112.GA7495@phare.normalesup.org> <7f014ea60908030923s30b958a6meb2afbad38269052@mail.gmail.com> <4A7723A8.8010107@indiana.edu> <4A785073.9060009@indiana.edu> <5b8d13220908040820u79b6d5felc3dd2e338cc55433@mail.gmail.com> <4A786D9A.1040703@indiana.edu>
Message-ID: <4A787671.5060501@indiana.edu>

Charles R Harris wrote:
> Depends on the CPU, FPU and the compiler flags. The computations could very
> well be done using double precision internally with conversions on
> load/store.

Sure, but if this is the case, why is the performance blowing up on
larger input values for float32 but not float64? Both should blow up,
not just one or the other. In other words I think they are using
different implementations :) Am I missing something?

Andrew

From josef.pktd at gmail.com Tue Aug 4 14:11:57 2009
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Tue, 4 Aug 2009 14:11:57 -0400
Subject: [Numpy-discussion] Why NaN?
In-Reply-To: <145439.20090.qm@web52104.mail.re2.yahoo.com>
References: <3d375d730908041008k33d2161ayf9a35b04b0ab85ca@mail.gmail.com> <145439.20090.qm@web52104.mail.re2.yahoo.com>
Message-ID: <1cd32cbb0908041111s3d6c167dv2a216d7c245c0045@mail.gmail.com>

On Tue, Aug 4, 2009 at 1:45 PM, David Goldsmith wrote:
>
> Actually, Robert's really a robot (indeed, the Kernel of all robot minds)
> - no way a biologic is going to beat him. ;-)

So, what is the conclusion, do we need more practice, or can we sit
back and let Robert take care of things?

Josef

> DG
>
> --- On Tue, 8/4/09, Robert Kern wrote:
>
>> From: Robert Kern
>> Subject: Re: [Numpy-discussion] Why NaN?
>> To: "Discussion of Numerical Python"
>> Date: Tuesday, August 4, 2009, 10:08 AM
>> On Tue, Aug 4, 2009 at 12:05, wrote:
>>
>> > What's going on with the response time here?
>> >
>> > I cannot even finish reading the question and start python.
>>
>> Practice. :-)
>>
>> --
>> Robert Kern
>>
>> "I have come to believe that the whole world is an enigma, a harmless
>> enigma that is made terrible by our own mad attempt to interpret it as
>> though it had an underlying truth."
>>   -- Umberto Eco
From d_l_goldsmith at yahoo.com Tue Aug 4 14:30:45 2009
From: d_l_goldsmith at yahoo.com (David Goldsmith)
Date: Tue, 4 Aug 2009 11:30:45 -0700 (PDT)
Subject: [Numpy-discussion] Why NaN?
In-Reply-To: <1cd32cbb0908041111s3d6c167dv2a216d7c245c0045@mail.gmail.com>
Message-ID: <348270.25448.qm@web52105.mail.re2.yahoo.com>

Uh-oh, if my joke is going to promote wide-spread complacency, I take it
back, I take it back!

DG

--- On Tue, 8/4/09, josef.pktd at gmail.com wrote:

> From: josef.pktd at gmail.com
> Subject: Re: [Numpy-discussion] Why NaN?
> To: "Discussion of Numerical Python"
> Date: Tuesday, August 4, 2009, 11:11 AM
> On Tue, Aug 4, 2009 at 1:45 PM, David Goldsmith wrote:
> >
> > Actually, Robert's really a robot (indeed, the Kernel
> of all robot minds) - no way a biologic is going to beat
> him. ;-)
>
> So, what is the conclusion, do we need more practice, or
> can we sit back and let Robert take care of things?
>
> Josef
>
> > DG
> >
> > --- On Tue, 8/4/09, Robert Kern wrote:
> >
> >> From: Robert Kern
> >> Subject: Re: [Numpy-discussion] Why NaN?
> >> To: "Discussion of Numerical Python"
> >> Date: Tuesday, August 4, 2009, 10:08 AM
> >> On Tue, Aug 4, 2009 at 12:05, wrote:
> >>
> >> > What's going on with the response time here?
> >> >
> >> > I cannot even finish reading the question and start python.
> >>
> >> Practice. :-)
> >>
> >> --
> >> Robert Kern
> >>
> >> "I have come to believe that the whole world is an enigma,
> >> a harmless enigma that is made terrible by our own mad
> >> attempt to interpret it as though it had an underlying truth."
> >>   -- Umberto Eco

From gael.varoquaux at normalesup.org Tue Aug 4 14:34:52 2009
From: gael.varoquaux at normalesup.org (Gael Varoquaux)
Date: Tue, 4 Aug 2009 20:34:52 +0200
Subject: [Numpy-discussion] Why NaN?
In-Reply-To: <1cd32cbb0908041111s3d6c167dv2a216d7c245c0045@mail.gmail.com>
References: <3d375d730908041008k33d2161ayf9a35b04b0ab85ca@mail.gmail.com> <145439.20090.qm@web52104.mail.re2.yahoo.com> <1cd32cbb0908041111s3d6c167dv2a216d7c245c0045@mail.gmail.com>
Message-ID: <20090804183452.GA11772@phare.normalesup.org>

On Tue, Aug 04, 2009 at 02:11:57PM -0400, josef.pktd at gmail.com wrote:
> On Tue, Aug 4, 2009 at 1:45 PM, David Goldsmith wrote:
> > Actually, Robert's really a robot (indeed, the Kernel of all robot
> > minds) - no way a biologic is going to beat him. ;-)

> So, what is the conclusion, do we need more practice, or can we sit
> back and let Robert take care of things?

No, we need to get the master schematics of Robert and replicate him!

Robert, please?

Gaël

From gokhansever at gmail.com Tue Aug 4 14:40:31 2009
From: gokhansever at gmail.com (Gökhan Sever)
Date: Tue, 4 Aug 2009 13:40:31 -0500
Subject: [Numpy-discussion] Why NaN?
In-Reply-To: <49d6b3500908040946v2a06e615t7f77bffabf22e066@mail.gmail.com>
References: <49d6b3500908040946v2a06e615t7f77bffabf22e066@mail.gmail.com>
Message-ID: <49d6b3500908041140g505e9a5csdffafb420b79b4ca@mail.gmail.com>

This is the loveliest of all solutions:

c[isfinite(c)].mean()

You are all very helpful and funny. I am sure most of you spend more than 16
hours a day in front of or by your screens :)

On Tue, Aug 4, 2009 at 11:46 AM, Gökhan Sever wrote:

> Hello,
>
> I know this has to have a very simple answer, but stuck at this very moment
> and can't get a meaningful result out of np.mean()
>
> In [121]: a = array([NaN, 4, NaN, 12])
>
> In [122]: b = array([NaN, 2, NaN, 3])
>
> In [123]: c = a/b
>
> In [124]: mean(c)
> Out[124]: nan
>
> In [125]: mean a
> --------> mean(a)
> Out[125]: nan
>
> Further when I tried:
>
> In [138]: c
> Out[138]: array([ NaN,   2.,  NaN,   4.])
>
> In [139]: np.where(c==NaN)
> Out[139]: (array([], dtype=int32),)
>
> In [141]: mask = [c != NaN]
>
> In [142]: mask
> Out[142]: [array([ True,  True,  True,  True], dtype=bool)]
>
> Any ideas?
>
> --
> Gökhan

-- 
Gökhan

From robert.kern at gmail.com Tue Aug 4 14:43:54 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Tue, 4 Aug 2009 13:43:54 -0500
Subject: [Numpy-discussion] Why NaN?
In-Reply-To: <49d6b3500908041140g505e9a5csdffafb420b79b4ca@mail.gmail.com>
References: <49d6b3500908040946v2a06e615t7f77bffabf22e066@mail.gmail.com> <49d6b3500908041140g505e9a5csdffafb420b79b4ca@mail.gmail.com>
Message-ID: <3d375d730908041143k5ead2616y34df06748ad57289@mail.gmail.com>

On Tue, Aug 4, 2009 at 13:40, Gökhan Sever wrote:
> This is the loveliest of all solutions:
>
> c[isfinite(c)].mean()

I kind of like c[c == c].mean(), but only because it's a bit mind-blowing. :-)

> You are all very helpful and funny. I am sure most of you spend more than 16
> hours a day in front of or by your screens :)

Hey! I resemble that remark!

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

From meine at informatik.uni-hamburg.de Tue Aug 4 14:46:39 2009
From: meine at informatik.uni-hamburg.de (Hans Meine)
Date: Tue, 4 Aug 2009 20:46:39 +0200
Subject: [Numpy-discussion] strange sin/cos performance
In-Reply-To: <4A786D9A.1040703@indiana.edu>
References: <4A76E709.9090100@indiana.edu> <5b8d13220908040820u79b6d5felc3dd2e338cc55433@mail.gmail.com> <4A786D9A.1040703@indiana.edu>
Message-ID: <200908042046.39426.meine@informatik.uni-hamburg.de>

On Tuesday 04 August 2009 19:19:22 Andrew Friedley wrote:
> OK, have some interesting results. First, my array creation was not
> doing what I thought it was. This (what I've been doing) creates an
> array of 159161 elements:
>
> numpy.arange(0.0, 1000, (2 * 3.14159) / 1000, dtype=numpy.float32)

Aaaaah. And I wondered why taking the sin/cos of 1000 elements took so
long... ;-)
(actually, I would've used larger arrays for benchmarking to begin with)

Indeed, the value range fixes stuff here (Linux, GCC/amd64, Xeon X5450 @
3.00GHz, NumPy 1.3.0), too:

Before:
float64 10 loops, best of 3: 54.2 ms per loop
float32 10 loops, best of 3: 7.62 ms per loop

After:
float64 10 loops, best of 3: 6.03 ms per loop
float32 10 loops, best of 3: 3.81 ms per loop

Best,
  Hans

From gael.varoquaux at normalesup.org Tue Aug 4 14:48:24 2009
From: gael.varoquaux at normalesup.org (Gael Varoquaux)
Date: Tue, 4 Aug 2009 20:48:24 +0200
Subject: [Numpy-discussion] Why NaN?
In-Reply-To: <3d375d730908041143k5ead2616y34df06748ad57289@mail.gmail.com>
References: <49d6b3500908040946v2a06e615t7f77bffabf22e066@mail.gmail.com> <49d6b3500908041140g505e9a5csdffafb420b79b4ca@mail.gmail.com> <3d375d730908041143k5ead2616y34df06748ad57289@mail.gmail.com>
Message-ID: <20090804184824.GB11772@phare.normalesup.org>

On Tue, Aug 04, 2009 at 01:43:54PM -0500, Robert Kern wrote:
> I kind of like c[c == c].mean(), but only because it's a bit mind-blowing. :-)

> > You are all very helpful and funny. I am sure most of you spend more
> > than 16 hours a day in front of or by your screens :)

> Hey! I resemble that remark!

Out of these 16 hours, 14 are spent staring at two terminals: one with
IPython on one side, and another with vim on the other.

Yeah baby!

Gaël

From gokhansever at gmail.com Tue Aug 4 14:54:49 2009
From: gokhansever at gmail.com (Gökhan Sever)
Date: Tue, 4 Aug 2009 13:54:49 -0500
Subject: [Numpy-discussion] Why NaN?
In-Reply-To: <20090804184824.GB11772@phare.normalesup.org>
References: <49d6b3500908040946v2a06e615t7f77bffabf22e066@mail.gmail.com> <49d6b3500908041140g505e9a5csdffafb420b79b4ca@mail.gmail.com> <3d375d730908041143k5ead2616y34df06748ad57289@mail.gmail.com> <20090804184824.GB11772@phare.normalesup.org>
Message-ID: <49d6b3500908041154g7383eed7w7cbd00ddea55f035@mail.gmail.com>

On Tue, Aug 4, 2009 at 1:48 PM, Gael Varoquaux <
gael.varoquaux at normalesup.org> wrote:

> On Tue, Aug 04, 2009 at 01:43:54PM -0500, Robert Kern wrote:
> > I kind of like c[c == c].mean(), but only because it's a bit
> mind-blowing. :-)
>
> > > You are all very helpful and funny. I am sure most of you spend more
> than 16
> > > hours a day in front of or by your screens :)
>
> > Hey! I resemble that remark!
>
> Out of these 16 hours, 14 are spent staring at two terminals: one with
> IPython on one side, and another with vim on the other.
>
> Yeah baby!
>
> Gaël

I see that you should have a browser embedding plugin for IPython which you
don't want to share with us :)

And do you only fix Mayavi issues in that not-included 2 hours?

-- 
Gökhan

From pgmdevlist at gmail.com Tue Aug 4 15:29:47 2009
From: pgmdevlist at gmail.com (Pierre GM)
Date: Tue, 4 Aug 2009 15:29:47 -0400
Subject: [Numpy-discussion] Why NaN?
In-Reply-To: <3d375d730908041143k5ead2616y34df06748ad57289@mail.gmail.com>
References: <49d6b3500908040946v2a06e615t7f77bffabf22e066@mail.gmail.com> <49d6b3500908041140g505e9a5csdffafb420b79b4ca@mail.gmail.com> <3d375d730908041143k5ead2616y34df06748ad57289@mail.gmail.com>
Message-ID: <95747771-B7B5-44F3-B12C-094B944A33CA@gmail.com>

On Aug 4, 2009, at 2:43 PM, Robert Kern wrote:

> On Tue, Aug 4, 2009 at 13:40, Gökhan Sever wrote:
>> This is the loveliest of all solutions:
>>
>> c[isfinite(c)].mean()
>
> I kind of like c[c == c].mean(), but only because it's a bit mind-
> blowing. :-)

But it doesn't give the same result as the previous one when there's
an inf...

From robert.kern at gmail.com Tue Aug 4 15:40:19 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Tue, 4 Aug 2009 14:40:19 -0500
Subject: [Numpy-discussion] Why NaN?
In-Reply-To: <95747771-B7B5-44F3-B12C-094B944A33CA@gmail.com>
References: <49d6b3500908040946v2a06e615t7f77bffabf22e066@mail.gmail.com> <49d6b3500908041140g505e9a5csdffafb420b79b4ca@mail.gmail.com> <3d375d730908041143k5ead2616y34df06748ad57289@mail.gmail.com> <95747771-B7B5-44F3-B12C-094B944A33CA@gmail.com>
Message-ID: <3d375d730908041240g44fe6079obc03c4eae4a9968@mail.gmail.com>

On Tue, Aug 4, 2009 at 14:29, Pierre GM wrote:
>
> On Aug 4, 2009, at 2:43 PM, Robert Kern wrote:
>
>> On Tue, Aug 4, 2009 at 13:40, Gökhan Sever wrote:
>>> This is the loveliest of all solutions:
>>>
>>> c[isfinite(c)].mean()
>>
>> I kind of like c[c == c].mean(), but only because it's a bit mind-
>> blowing. :-)
>
> But it doesn't give the same result as the previous one when there's
> an inf...

NaNs might be markers of missing data, but I see infs as data.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco
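A short sketch of the distinction being made in this exchange (array
values invented for illustration):

import numpy as np

c = np.array([np.nan, 2.0, np.inf, 4.0])
print(c == c)                    # [False  True  True  True] -- only NaN
                                 # fails to equal itself, inf does not
print(c[c == c].mean())          # inf -- NaNs dropped, the inf kept as data
print(c[np.isfinite(c)].mean())  # 3.0 -- NaNs and infs both dropped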
From gael.varoquaux at normalesup.org Tue Aug 4 15:59:36 2009
From: gael.varoquaux at normalesup.org (Gael Varoquaux)
Date: Tue, 4 Aug 2009 21:59:36 +0200
Subject: [Numpy-discussion] Why NaN?
In-Reply-To: <49d6b3500908041154g7383eed7w7cbd00ddea55f035@mail.gmail.com>
References: <49d6b3500908040946v2a06e615t7f77bffabf22e066@mail.gmail.com> <49d6b3500908041140g505e9a5csdffafb420b79b4ca@mail.gmail.com> <3d375d730908041143k5ead2616y34df06748ad57289@mail.gmail.com> <20090804184824.GB11772@phare.normalesup.org> <49d6b3500908041154g7383eed7w7cbd00ddea55f035@mail.gmail.com>
Message-ID: <20090804195936.GF11772@phare.normalesup.org>

On Tue, Aug 04, 2009 at 01:54:49PM -0500, Gökhan Sever wrote:
>    I see that you should have a browser embedding plugin for IPython which you
>    don't want to share with us :)

No, I answer e-mail using vim.

> And do you only fix Mayavi issues in that not-included 2 hours?

No, during the other hours, using IPython and vim, what else?

Gaël

From kwgoodman at gmail.com Tue Aug 4 16:06:38 2009
From: kwgoodman at gmail.com (Keith Goodman)
Date: Tue, 4 Aug 2009 13:06:38 -0700
Subject: [Numpy-discussion] Why NaN?
In-Reply-To: <20090804195936.GF11772@phare.normalesup.org>
References: <49d6b3500908040946v2a06e615t7f77bffabf22e066@mail.gmail.com> <49d6b3500908041140g505e9a5csdffafb420b79b4ca@mail.gmail.com> <3d375d730908041143k5ead2616y34df06748ad57289@mail.gmail.com> <20090804184824.GB11772@phare.normalesup.org> <49d6b3500908041154g7383eed7w7cbd00ddea55f035@mail.gmail.com> <20090804195936.GF11772@phare.normalesup.org>
Message-ID: 

On Tue, Aug 4, 2009 at 12:59 PM, Gael Varoquaux wrote:
> On Tue, Aug 04, 2009 at 01:54:49PM -0500, Gökhan Sever wrote:
>>    I see that you should have a browser embedding plugin for IPython which you
>>    don't want to share with us :)
>
> No, I answer e-mail using vim.

Yeah, I'm trying that right now.

:wq
:q!
:dammit

From matthew.brett at gmail.com Tue Aug 4 16:09:13 2009
From: matthew.brett at gmail.com (Matthew Brett)
Date: Tue, 4 Aug 2009 13:09:13 -0700
Subject: [Numpy-discussion] Is this a bug in numpy.distutils ?
In-Reply-To: <5b8d13220908040931ide8259dn7c07ecf8e82e3099@mail.gmail.com>
References: <1e2af89e0908031635i676135afmde57bd8993a4ca82@mail.gmail.com> <4A77F13E.2020402@ar.media.kyoto-u.ac.jp> <5b8d13220908040254n538c1e4flfe7ee2dd9aba96ef@mail.gmail.com> <4A780871.5070408@ar.media.kyoto-u.ac.jp> <4A7817F7.3@ar.media.kyoto-u.ac.jp> <5b8d13220908040931ide8259dn7c07ecf8e82e3099@mail.gmail.com>
Message-ID: <1e2af89e0908041309s6d96d638o2a46d4526b57202@mail.gmail.com>

Hi,

On Tue, Aug 4, 2009 at 9:31 AM, David Cournapeau wrote:
> On Tue, Aug 4, 2009 at 8:13 PM, David
> Cournapeau wrote:
>
>> I think I understand the problem. Unfortunately, that's looks tricky to
>> solve... I hate distutils.
>
> Ok - should be fixed in r7281.

Just to clarify - it's still true I guess that this:

python setup.py build_ext --compiler=mingw32 --inplace

just can't work - because the --compiler flag does not get passed to
the build step?

I noticed, when I was trying to be fancy:

python setup.py build build_ext --inplace

this error:

  File "/home/mb312/usr/local/lib/python2.5/site-packages/numpy/distutils/command/build_ext.py",
line 74, in run
    self.library_dirs.append(build_clib.build_clib)
UnboundLocalError: local variable 'build_clib' referenced before assignment

because of the check for inplace builds above that, leaving build_clib
undefined. I'm afraid I wasn't quite sure what the right thing to do
was.

Thanks a lot,

Matthew

From robert.kern at gmail.com Tue Aug 4 16:23:36 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Tue, 4 Aug 2009 15:23:36 -0500
Subject: [Numpy-discussion] Is this a bug in numpy.distutils ?
In-Reply-To: <1e2af89e0908041309s6d96d638o2a46d4526b57202@mail.gmail.com>
References: <1e2af89e0908031635i676135afmde57bd8993a4ca82@mail.gmail.com> <4A77F13E.2020402@ar.media.kyoto-u.ac.jp> <5b8d13220908040254n538c1e4flfe7ee2dd9aba96ef@mail.gmail.com> <4A780871.5070408@ar.media.kyoto-u.ac.jp> <4A7817F7.3@ar.media.kyoto-u.ac.jp> <5b8d13220908040931ide8259dn7c07ecf8e82e3099@mail.gmail.com> <1e2af89e0908041309s6d96d638o2a46d4526b57202@mail.gmail.com>
Message-ID: <3d375d730908041323i20a155a3g37718104163db591@mail.gmail.com>

On Tue, Aug 4, 2009 at 15:09, Matthew Brett wrote:

>   File "/home/mb312/usr/local/lib/python2.5/site-packages/numpy/distutils/command/build_ext.py",
> line 74, in run
>     self.library_dirs.append(build_clib.build_clib)
> UnboundLocalError: local variable 'build_clib' referenced before assignment
>
> because of the check for inplace builds above that, leaving build_clib
> undefined. I'm afraid I wasn't quite sure what the right thing to do
> was.

Probably just

build_clib = self.distribution.get_command_obj('build_clib')

after the log.warn().

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco
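To illustrate the failure mode being discussed, a generic sketch of the
pattern (not the actual numpy.distutils code): a name bound on only one
branch raises UnboundLocalError on the other, which is why the suggested
fallback re-binds build_clib unconditionally.

def run(inplace):
    if not inplace:
        build_clib = 'build/temp'   # hypothetical value; bound only here
    return build_clib               # UnboundLocalError when inplace is True

Robert's one-liner, build_clib =
self.distribution.get_command_obj('build_clib'), uses a standard distutils
Distribution method to fetch (or create) the command object, so the name
is always bound before it is used.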
From npuloski at gmail.com Tue Aug 4 16:36:28 2009
From: npuloski at gmail.com (Nanime Puloski)
Date: Tue, 4 Aug 2009 16:36:28 -0400
Subject: [Numpy-discussion] Features in SciPy That are Absent in NumPy
Message-ID: 

What features does SciPy have that are absent in NumPy?

From nmb at wartburg.edu Tue Aug 4 16:40:55 2009
From: nmb at wartburg.edu (Neil Martinsen-Burrell)
Date: Tue, 04 Aug 2009 14:40:55 -0600
Subject: [Numpy-discussion] Features in SciPy That are Absent in NumPy
In-Reply-To: 
References: 
Message-ID: <4A789CD7.4020109@wartburg.edu>

On 2009-08-04 14:36 , Nanime Puloski wrote:
> What features does SciPy have that are absent in NumPy?

Many. SciPy includes algorithms for optimization, solving differential
equations, numerical integration among many others. NumPy primarily
provides a useful n-dimensional array container. While there are some
basic scientific features such as FFTs in NumPy, these appear in more
detail in SciPy. If you can give more specifics on what features you
would be interested in, we can offer more help about which package
contains those features.

-Neil

From bsouthey at gmail.com Tue Aug 4 16:53:00 2009
From: bsouthey at gmail.com (Bruce Southey)
Date: Tue, 4 Aug 2009 15:53:00 -0500
Subject: [Numpy-discussion] Why NaN?
In-Reply-To: <49d6b3500908041140g505e9a5csdffafb420b79b4ca@mail.gmail.com>
References: <49d6b3500908040946v2a06e615t7f77bffabf22e066@mail.gmail.com> <49d6b3500908041140g505e9a5csdffafb420b79b4ca@mail.gmail.com>
Message-ID: 

On Tue, Aug 4, 2009 at 1:40 PM, Gökhan Sever wrote:
> This is the loveliest of all solutions:
>
> c[isfinite(c)].mean()

This handling of nonfinite elements has come up before.
Please remember that this only works for a 1d or flattened array, so it
does not work in general, especially along an axis.

Bruce

From kwgoodman at gmail.com Tue Aug 4 17:05:18 2009
From: kwgoodman at gmail.com (Keith Goodman)
Date: Tue, 4 Aug 2009 14:05:18 -0700
Subject: [Numpy-discussion] Why NaN?
In-Reply-To: 
References: <49d6b3500908040946v2a06e615t7f77bffabf22e066@mail.gmail.com> <49d6b3500908041140g505e9a5csdffafb420b79b4ca@mail.gmail.com>
Message-ID: 

On Tue, Aug 4, 2009 at 1:53 PM, Bruce Southey wrote:
> On Tue, Aug 4, 2009 at 1:40 PM, Gökhan Sever wrote:
>> This is the loveliest of all solutions:
>>
>> c[isfinite(c)].mean()
>
> This handling of nonfinite elements has come up before.
> Please remember that this only works for a 1d or flattened array, so it
> does not work in general, especially along an axis.

If you don't want to use nanmean from scipy.stats you could use:

np.nansum(c, axis=0) / (~np.isnan(c)).sum(axis=0)

or

np.nansum(c, axis=0) / (c == c).sum(axis=0)

But if c contains ints then you'll run into trouble with the division,
so you'll need to protect against that.
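A worked 2-d sketch of Keith's formula (array values invented; the float
cast mirrors the integer-division caveat he mentions):

import numpy as np

c2 = np.array([[np.nan, 2.0],
               [4.0, np.nan],
               [6.0, 8.0]])
# count of non-NaN entries per column, cast so the division stays float
counts = (~np.isnan(c2)).sum(axis=0).astype(float)
print(np.nansum(c2, axis=0) / counts)   # [ 5.  5.] -- per-column mean,
                                        # NaNs ignored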
From bsouthey at gmail.com Tue Aug 4 17:24:35 2009
From: bsouthey at gmail.com (Bruce Southey)
Date: Tue, 4 Aug 2009 16:24:35 -0500
Subject: [Numpy-discussion] Funded work on Numpy: proposed improvements and request for feedback
In-Reply-To: <4A77A009.9060104@ar.media.kyoto-u.ac.jp>
References: <4A77A009.9060104@ar.media.kyoto-u.ac.jp>
Message-ID: 

On Mon, Aug 3, 2009 at 9:42 PM, David Cournapeau wrote:
> Hi All,
>
>     I (David Cournapeau) and the people at Berkeley (Jarrod Millman,
> Fernando Perez, Matthew Brett) have been in discussion so that I could
> do some funded work on NumPy/SciPy. Although they are obviously
> interested in improvements that help their own projects, they are
> willing to make sure the work will impact numpy/scipy as a whole. As
> such we would like to get some feedback about the proposal.
>
> There are several areas we discussed about, but the main 'vision' is to
> make more of the C code in numpy reusable to 3rd parties, in particular
> purely computational (fft, linear algebra, etc...) code. A first draft
> of the proposal is pasted below.
>
> Comments, request for details, objections are welcomed,
>
> Thank you for your attention,
>
> The Berkeley team, Gael Varoquaux and David Cournapeau

[snip]

Almost a year ago Travis sent an email: 'Report from SciPy'
http://mail.scipy.org/pipermail/numpy-discussion/2008-August/036909.html

Of importance was that
" * NumPy 2.0 will be a library and will not automagically import numpy.fft
 * We will suggest that other libraries use from numpy import fft
instead of import numpy as np; np.fft
"

I sort of see that the proposed work could help make numpy a library
as a whole but it is not clear that the work is heading towards that
goal. So if numpy 2.0 is still planned as a library then I would like
to see a clearer statement towards that goal.

Not really understanding the problems of C99, but I know that trying
to cover all the little details can be very time consuming when more
effort could be spent on things.
So if 'C99-like' is going to be the near term future, is there any
point in supporting non-C99 environments with this work?
That is, is the limitation in the compiler, operating system,
processor or some combination of these?

Anyhow, these are only my thoughts and pale in comparison to the work
you are doing so feel free to ignore them.

Thanks
Bruce

From d_l_goldsmith at yahoo.com Tue Aug 4 17:43:12 2009
From: d_l_goldsmith at yahoo.com (David Goldsmith)
Date: Tue, 4 Aug 2009 14:43:12 -0700 (PDT)
Subject: [Numpy-discussion] Features in SciPy That are Absent in NumPy
In-Reply-To: <4A789CD7.4020109@wartburg.edu>
Message-ID: <74401.51536.qm@web52107.mail.re2.yahoo.com>

--- On Tue, 8/4/09, Neil Martinsen-Burrell wrote:

> > What features does SciPy have that are absent in
> > NumPy?
>
> Many.

And that's an understatement!

DG

> SciPy includes algorithms for optimization, solving differential
> equations, numerical integration among many others. NumPy primarily
> provides a useful n-dimensional array container. While there are some
> basic scientific features such as FFTs in NumPy, these appear in more
> detail in SciPy. If you can give more specifics on what features you
> would be interested in, we can offer more help about which package
> contains those features.
>
> -Neil

From charlesr.harris at gmail.com Tue Aug 4 17:43:28 2009
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Tue, 4 Aug 2009 15:43:28 -0600
Subject: [Numpy-discussion] Funded work on Numpy: proposed improvements and request for feedback
In-Reply-To: 
References: <4A77A009.9060104@ar.media.kyoto-u.ac.jp>
Message-ID: 

On Tue, Aug 4, 2009 at 3:24 PM, Bruce Southey wrote:

> On Mon, Aug 3, 2009 at 9:42 PM, David Cournapeau wrote:
> > Hi All,
> >
> >     I (David Cournapeau) and the people at Berkeley (Jarrod Millman,
> > Fernando Perez, Matthew Brett) have been in discussion so that I could
> > do some funded work on NumPy/SciPy. Although they are obviously
> > interested in improvements that help their own projects, they are
> > willing to make sure the work will impact numpy/scipy as a whole. As
> > such we would like to get some feedback about the proposal.
> >
> > There are several areas we discussed about, but the main 'vision' is to
> > make more of the C code in numpy reusable to 3rd parties, in particular
> > purely computational (fft, linear algebra, etc...) code. A first draft
> > of the proposal is pasted below.
> >
> > Comments, request for details, objections are welcomed,
> >
> > Thank you for your attention,
> >
> > The Berkeley team, Gael Varoquaux and David Cournapeau
>
> [snip]
>
> Almost a year ago Travis sent an email: 'Report from SciPy'
> http://mail.scipy.org/pipermail/numpy-discussion/2008-August/036909.html
>
> Of importance was that
> " * NumPy 2.0 will be a library and will not automagically import numpy.fft
>  * We will suggest that other libraries use from numpy import fft
> instead of import numpy as np; np.fft
> "
>
> I sort of see that the proposed work could help make numpy a library
> as a whole but it is not clear that the work is heading towards that
> goal. So if numpy 2.0 is still planned as a library then I would like
> to see a clearer statement towards that goal.
>
> Not really understanding the problems of C99, but I know that trying
> to cover all the little details can be very time consuming when more
> effort could be spent on things.
> So if 'C99-like' is going to be the near term future, is there any
> point in supporting non-C99 environments with this work?

Windows? I don't know the status of the most recent MSVC compilers, but
they haven't been C99 compliant in the past and compliance doesn't seem
to be a priority. Other compilers are a mixed bag. This is the git
conundrum: support isn't sufficiently widespread on all platforms to make
the transition so we are stuck with the lowest common denominator.

Chuck

From d_l_goldsmith at yahoo.com Tue Aug 4 17:49:48 2009
From: d_l_goldsmith at yahoo.com (David Goldsmith)
Date: Tue, 4 Aug 2009 14:49:48 -0700 (PDT)
Subject: [Numpy-discussion] Funded work on Numpy: proposed improvements and request for feedback
In-Reply-To: 
Message-ID: <999200.9547.qm@web52106.mail.re2.yahoo.com>

--- On Tue, 8/4/09, Bruce Southey wrote:

> [snip]
>
> Almost a year ago Travis sent an email: 'Report from SciPy'
> http://mail.scipy.org/pipermail/numpy-discussion/2008-August/036909.html
>
> Of importance was that
> " * NumPy 2.0 will be a library and will not automagically
> import numpy.fft

As someone who tends to think of "modules" as "libraries" (renamed for
Python for "branding" purposes), what's the difference?

DG

> * We will suggest that other libraries use from numpy import fft
> instead of import numpy as np; np.fft
> "
>
> I sort of see that the proposed work could help make numpy a library
> as a whole but it is not clear that the work is heading towards that
> goal. So if numpy 2.0 is still planned as a library then I would like
> to see a clearer statement towards that goal.
>
> Not really understanding the problems of C99, but I know that trying
> to cover all the little details can be very time consuming when more
> effort could be spent on things.
> So if 'C99-like' is going to be the near term future, is there any
> point in supporting non-C99 environments with this work?
> That is, is the limitation in the compiler, operating system,
> processor or some combination of these?
>
> Anyhow, these are only my thoughts and pale in comparison to the work
> you are doing so feel free to ignore them.
>
> Thanks
> Bruce

From robert.kern at gmail.com Tue Aug 4 17:53:37 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Tue, 4 Aug 2009 16:53:37 -0500
Subject: [Numpy-discussion] Funded work on Numpy: proposed improvements and request for feedback
In-Reply-To: <999200.9547.qm@web52106.mail.re2.yahoo.com>
References: <999200.9547.qm@web52106.mail.re2.yahoo.com>
Message-ID: <3d375d730908041453k3ffcd219p50482a5d53d683a@mail.gmail.com>

On Tue, Aug 4, 2009 at 16:49, David Goldsmith wrote:
>
> --- On Tue, 8/4/09, Bruce Southey wrote:
>
>> [snip]
>>
>> Almost a year ago Travis sent an email: 'Report from SciPy'
>> http://mail.scipy.org/pipermail/numpy-discussion/2008-August/036909.html
>>
>> Of importance was that
>> " * NumPy 2.0 will be a library and will not automagically
>> import numpy.fft
>
> As someone who tends to think of "modules" as "libraries" (renamed for
> Python for "branding" purposes), what's the difference?

Poor phrasing. I believe Travis meant something along the lines of
"NumPy 2.0 will be a [well-behaved] library and will not automagically
import numpy.fft." The informative part is the latter point.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

From d_l_goldsmith at yahoo.com Tue Aug 4 17:57:02 2009
From: d_l_goldsmith at yahoo.com (David Goldsmith)
Date: Tue, 4 Aug 2009 14:57:02 -0700 (PDT)
Subject: [Numpy-discussion] Funded work on Numpy: proposed improvements and request for feedback
In-Reply-To: <3d375d730908041453k3ffcd219p50482a5d53d683a@mail.gmail.com>
Message-ID: <233855.28363.qm@web52103.mail.re2.yahoo.com>

Gotchya, thanks!

DG

--- On Tue, 8/4/09, Robert Kern wrote:

> From: Robert Kern
> Subject: Re: [Numpy-discussion] Funded work on Numpy: proposed improvements and request for feedback
> To: "Discussion of Numerical Python"
> Date: Tuesday, August 4, 2009, 2:53 PM
> On Tue, Aug 4, 2009 at 16:49, David Goldsmith wrote:
> >
> > --- On Tue, 8/4/09, Bruce Southey wrote:
> >
> >> [snip]
> >>
> >> Almost a year ago Travis sent an email: 'Report from SciPy'
> >> http://mail.scipy.org/pipermail/numpy-discussion/2008-August/036909.html
> >>
> >> Of importance was that
> >> " * NumPy 2.0 will be a library and will not automagically
> >> import numpy.fft
> >
> > As someone who tends to think of "modules" as "libraries" (renamed for
> > Python for "branding" purposes), what's the difference?
>
> Poor phrasing. I believe Travis meant something along the lines of
> "NumPy 2.0 will be a [well-behaved] library and will not automagically
> import numpy.fft." The informative part is the latter point.
>
> --
> Robert Kern
>
> "I have come to believe that the whole world is an enigma, a harmless
> enigma that is made terrible by our own mad attempt to interpret it as
> though it had an underlying truth."
>   -- Umberto Eco

From liukis at usc.edu Tue Aug 4 18:36:30 2009
From: liukis at usc.edu (Maria Liukis)
Date: Tue, 04 Aug 2009 15:36:30 -0700
Subject: [Numpy-discussion] scipy.stats.poisson.ppf raises "OverflowError: cannot convert float infinity to long"
Message-ID: 

Hello everybody,

I'm using the following versions of scipy and numpy:

>>> scipy.__version__
'0.6.0'
>>> import numpy
>>> numpy.__version__
'1.1.1'

Would anybody happen to know why I get an exception when calling
scipy.stats.poisson.ppf function:

>>> from scipy.stats import *
>>> poisson.ppf(0.9999, 4)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python2.5/site-packages/scipy/stats/distributions.py",
line 3601, in ppf
    place(output,cond2,self.b)
  File "/usr/lib64/python2.5/site-packages/numpy/lib/function_base.py",
line 957, in place
    return _insert(arr, mask, vals)
OverflowError: cannot convert float infinity to long
>>>

Thanks a lot in advance,
Masha
--------------------
liukis at usc.edu

From josef.pktd at gmail.com Tue Aug 4 19:01:03 2009
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Tue, 4 Aug 2009 19:01:03 -0400
Subject: [Numpy-discussion] scipy.stats.poisson.ppf raises "OverflowError: cannot convert float infinity to long"
In-Reply-To: 
References: 
Message-ID: <1cd32cbb0908041601s55216ffesa301443ff104054d@mail.gmail.com>

On Tue, Aug 4, 2009 at 6:36 PM, Maria Liukis wrote:
> Hello everybody,
> I'm using the following versions of scipy and numpy:
>>>> scipy.__version__
> '0.6.0'
>>>> import numpy
>>>> numpy.__version__
> '1.1.1'
> Would anybody happen to know why I get an exception when calling
> scipy.stats.poisson.ppf function:
>>>> from scipy.stats import *
>>>> poisson.ppf(0.9999, 4)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "/usr/lib64/python2.5/site-packages/scipy/stats/distributions.py",
> line 3601, in ppf
>     place(output,cond2,self.b)
>   File "/usr/lib64/python2.5/site-packages/numpy/lib/function_base.py",
> line 957, in place
>     return _insert(arr, mask, vals)
> OverflowError: cannot convert float infinity to long

>>> stats.poisson.ppf(0.9999, 4)
13.0
>>> stats.poisson.cdf(13, 4)
0.99992367158465667

should be fixed since scipy 0.7.0

Josef

> Thanks a lot in advance,
> Masha
> --------------------
> liukis at usc.edu
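For anyone stuck on scipy 0.6.0 where ppf overflows, a possible workaround
is to invert the cdf by hand (a hedged sketch, not from the thread; fine
for moderate mu):

from scipy import stats

def poisson_ppf_fallback(q, mu):
    # smallest k with cdf(k, mu) >= q
    k = 0
    while stats.poisson.cdf(k, mu) < q:
        k += 1
    return k

print(poisson_ppf_fallback(0.9999, 4))   # 13, matching poisson.ppf on
                                         # scipy >= 0.7.0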
URL: From dwf at cs.toronto.edu Tue Aug 4 19:49:28 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Tue, 4 Aug 2009 19:49:28 -0400 Subject: [Numpy-discussion] Why NaN? In-Reply-To: <49d6b3500908041154g7383eed7w7cbd00ddea55f035@mail.gmail.com> References: <49d6b3500908040946v2a06e615t7f77bffabf22e066@mail.gmail.com> <49d6b3500908041140g505e9a5csdffafb420b79b4ca@mail.gmail.com> <3d375d730908041143k5ead2616y34df06748ad57289@mail.gmail.com> <20090804184824.GB11772@phare.normalesup.org> <49d6b3500908041154g7383eed7w7cbd00ddea55f035@mail.gmail.com> Message-ID: On 4-Aug-09, at 2:54 PM, G?khan Sever wrote: > I see that you should have a browser embedding plugin for Ipyhon > which you > don't want to share with us :) Ondrej's well on his way to fixing that: http://pythonnb.appspot.com/ David From josef.pktd at gmail.com Tue Aug 4 20:00:28 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 4 Aug 2009 20:00:28 -0400 Subject: [Numpy-discussion] scipy.stats.poisson.ppf raises "OverflowError: cannot convert float infinity to long" In-Reply-To: <8295F59A-5A15-443F-BB6E-ED31AC7054FF@usc.edu> References: <1cd32cbb0908041601s55216ffesa301443ff104054d@mail.gmail.com> <8295F59A-5A15-443F-BB6E-ED31AC7054FF@usc.edu> Message-ID: <1cd32cbb0908041700w6c041b0h9c0d4bd4fca05557@mail.gmail.com> On Tue, Aug 4, 2009 at 7:03 PM, Maria Liukis wrote: > Josef, > Thanks a bunch! > Masha You're welcome. Josef > -------------------- > liukis at usc.edu > > > On Aug 4, 2009, at 4:01 PM, josef.pktd at gmail.com wrote: > > On Tue, Aug 4, 2009 at 6:36 PM, Maria Liukis wrote: > > Hello everybody, > I'm using the following versions of scipy and numpy: > > scipy.__version__ > > '0.6.0' > > import numpy > numpy.__version__ > > '1.1.1' > Would anybody happen to know why I get an exception when calling > scipy.stats.poisson.ppf function: > > from scipy.stats import * > poisson.ppf(0.9999, 4) > > Traceback (most recent call last): > ??File "", line 1, in > ??File "/usr/lib64/python2.5/site-packages/scipy/stats/distributions.py", > line 3601, in ppf > ?? ?place(output,cond2,self.b) > ??File "/usr/lib64/python2.5/site-packages/numpy/lib/function_base.py", line > 957, in place > ?? ?return _insert(arr, mask, vals) > OverflowError: cannot convert float infinity to long > > stats.poisson.ppf(0.9999, 4) > > 13.0 > > stats.poisson.cdf(13, 4) > > 0.99992367158465667 > should be fixed since scipy 0.7.0 > Josef > > Thanks a lot in advance, > Masha > -------------------- > liukis at usc.edu > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From gokhansever at gmail.com Tue Aug 4 20:03:43 2009 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_Sever?=) Date: Tue, 4 Aug 2009 19:03:43 -0500 Subject: [Numpy-discussion] Why NaN? 
In-Reply-To: References: <49d6b3500908040946v2a06e615t7f77bffabf22e066@mail.gmail.com> <49d6b3500908041140g505e9a5csdffafb420b79b4ca@mail.gmail.com> <3d375d730908041143k5ead2616y34df06748ad57289@mail.gmail.com> <20090804184824.GB11772@phare.normalesup.org> <49d6b3500908041154g7383eed7w7cbd00ddea55f035@mail.gmail.com> Message-ID: <49d6b3500908041703i55782893ice32802a58dcd2e3@mail.gmail.com> On Tue, Aug 4, 2009 at 6:49 PM, David Warde-Farley wrote: > On 4-Aug-09, at 2:54 PM, G?khan Sever wrote: > > > I see that you should have a browser embedding plugin for Ipyhon > > which you > > don't want to share with us :) > > Ondrej's well on his way to fixing that: http://pythonnb.appspot.com/ > > David > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > Hehe :) I would not be surprised if someone brings a real python snake into the conference then :) -- G?khan -------------- next part -------------- An HTML attachment was scrubbed... URL: From cycomanic at gmail.com Tue Aug 4 21:18:33 2009 From: cycomanic at gmail.com (Jochen) Date: Wed, 5 Aug 2009 11:18:33 +1000 Subject: [Numpy-discussion] strange sin/cos performance In-Reply-To: <20090803134556.GA31036@phare.normalesup.org> References: <4A76E709.9090100@indiana.edu> <20090803134556.GA31036@phare.normalesup.org> Message-ID: <20090805111833.645c6d93@cudos0803> Hi all, I see something similar on my system. OK I've just done a test. System is Ubuntu 9.04 AMD64 there seems to be a regression for float32 with high values: In [47]: a=np.random.rand(10000).astype(np.float32) In [48]: b=np.random.rand(10000).astype(np.float64) In [49]: c=1000*np.random.rand(10000).astype(np.float32) In [50]: d=1000*np.random.rand(1000).astype(np.float64) In [51]: %timeit -n 10 np.sin(a) 10 loops, best of 3: 251 ?s per loop In [52]: %timeit -n 10 np.sin(b) 10 loops, best of 3: 395 ?s per loop In [53]: %timeit -n 10 np.sin(c) 10 loops, best of 3: 5.65 ms per loop In [54]: %timeit -n 10 np.sin(d) 10 loops, best of 3: 87.7 ?s per loop In [55]: %timeit -n 10 np.sin(c.astype(np.float64)).astype(np.float32) 10 loops, best of 3: 891 ?s per loop Cheers Jochen On Mon, 3 Aug 2009 15:45:56 +0200 Emmanuelle Gouillart wrote: > Hi Andrew, > > %timeit is an Ipython magic command that uses the timeit > module, see > http://ipython.scipy.org/doc/stable/html/interactive/reference.html?highlight=timeit > for more information about how to use it. So you were right to suppose > that it is not a "normal Python". > > However, I was not able to reproduce your observations. > > >>> import numpy as np > >>> a = np.arange(0.0, 1000, (2 * 3.14159) / 1000, dtype=np.float32) > >>> b = np.arange(0.0, 1000, (2 * 3.14159) / 1000, dtype=np.float64) > >>> %timeit -n 10 np.sin(a) > 10 loops, best of 3: 8.67 ms per loop > >>> %timeit -n 10 np.sin(b) > 10 loops, best of 3: 9.29 ms per loop > > Emmanuelle > > On Mon, Aug 03, 2009 at 09:32:57AM -0400, Andrew Friedley wrote: > > While working on GSoC stuff I came across this weird performance > > behavior for sine and cosine -- using float32 is way slower than > > float64. On a 2ghz opteron: > > > > sin float32 1.12447786331 > > sin float64 0.133481025696 > > cos float32 1.14155912399 > > cos float64 0.131420135498 > > > > The times are in seconds, and are best of three runs of ten > > iterations of numpy.{sin,cos} over a 1000-element array (script > > attached). I've produced similar results on a PS3 system also. 
> > The opteron is running Python 2.6.1 and NumPy 1.3.0, while the PS3
> > has Python 2.5.1 and NumPy 1.1.1.
> >
> > I haven't jumped into the code yet, but does anyone know why
> > sin/cos are ~8.5x slower for 32-bit floats compared to 64-bit
> > doubles?
> >
> > Side question: I see people in emails writing things like 'timeit
> > foo(x)' and having it run some sort of standard benchmark, how
> > exactly do I do that? Is that some environment other than a normal
> > Python?
> >
> > Thanks,
> >
> > Andrew

> > import timeit

> > t = timeit.Timer("numpy.sin(a)",
> >         "import numpy\n"
> >         "a = numpy.arange(0.0, 1000, (2 * 3.14159) / 1000, dtype=numpy.float32)")
> > print "sin float32", min(t.repeat(3, 10))

> > t = timeit.Timer("numpy.sin(a)",
> >         "import numpy\n"
> >         "a = numpy.arange(0.0, 1000, (2 * 3.14159) / 1000, dtype=numpy.float64)")
> > print "sin float64", min(t.repeat(3, 10))

> > t = timeit.Timer("numpy.cos(a)",
> >         "import numpy\n"
> >         "a = numpy.arange(0.0, 1000, (2 * 3.14159) / 1000, dtype=numpy.float32)")
> > print "cos float32", min(t.repeat(3, 10))

> > t = timeit.Timer("numpy.cos(a)",
> >         "import numpy\n"
> >         "a = numpy.arange(0.0, 1000, (2 * 3.14159) / 1000, dtype=numpy.float64)")
> > print "cos float64", min(t.repeat(3, 10))

> > _______________________________________________
> > NumPy-Discussion mailing list
> > NumPy-Discussion at scipy.org
> > http://mail.scipy.org/mailman/listinfo/numpy-discussion

> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion

From charlesr.harris at gmail.com Tue Aug 4 22:42:40 2009
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Tue, 4 Aug 2009 20:42:40 -0600
Subject: [Numpy-discussion] strange sin/cos performance
In-Reply-To: <20090805111833.645c6d93@cudos0803>
References: <4A76E709.9090100@indiana.edu>
	<20090803134556.GA31036@phare.normalesup.org>
	<20090805111833.645c6d93@cudos0803>
Message-ID: 

On Tue, Aug 4, 2009 at 7:18 PM, Jochen wrote:

> Hi all,
> I see something similar on my system.
> OK, I've just done a test. The system is Ubuntu 9.04 AMD64; there seems
> to be a regression for float32 with large input values:
>
> In [47]: a=np.random.rand(10000).astype(np.float32)
>
> In [48]: b=np.random.rand(10000).astype(np.float64)
>
> In [49]: c=1000*np.random.rand(10000).astype(np.float32)
>
> In [50]: d=1000*np.random.rand(1000).astype(np.float64)
>
> In [51]: %timeit -n 10 np.sin(a)
> 10 loops, best of 3: 251 µs per loop
>
> In [52]: %timeit -n 10 np.sin(b)
> 10 loops, best of 3: 395 µs per loop
>
> In [53]: %timeit -n 10 np.sin(c)
> 10 loops, best of 3: 5.65 ms per loop
>
> In [54]: %timeit -n 10 np.sin(d)
> 10 loops, best of 3: 87.7 µs per loop
>
> In [55]: %timeit -n 10 np.sin(c.astype(np.float64)).astype(np.float32)
> 10 loops, best of 3: 891 µs per loop
>

Is anyone with this problem *not* running ubuntu?

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From gael.varoquaux at normalesup.org Wed Aug 5 01:22:40 2009
From: gael.varoquaux at normalesup.org (Gael Varoquaux)
Date: Wed, 5 Aug 2009 07:22:40 +0200
Subject: [Numpy-discussion] Why NaN?
In-Reply-To: <49d6b3500908041703i55782893ice32802a58dcd2e3@mail.gmail.com>
References: <49d6b3500908040946v2a06e615t7f77bffabf22e066@mail.gmail.com>
	<49d6b3500908041140g505e9a5csdffafb420b79b4ca@mail.gmail.com>
	<3d375d730908041143k5ead2616y34df06748ad57289@mail.gmail.com>
	<20090804184824.GB11772@phare.normalesup.org>
	<49d6b3500908041154g7383eed7w7cbd00ddea55f035@mail.gmail.com>
	<49d6b3500908041703i55782893ice32802a58dcd2e3@mail.gmail.com>
Message-ID: <20090805052240.GA6038@phare.normalesup.org>

On Tue, Aug 04, 2009 at 07:03:43PM -0500, Gökhan Sever wrote:
> I would not be surprised if someone brings a real python snake into the
> conference then :)

http://picasaweb.google.com/ziade.tarek/PyconFR#slideshow/5342502528927090354

From fperez.net at gmail.com Wed Aug 5 02:48:07 2009
From: fperez.net at gmail.com (Fernando Perez)
Date: Tue, 4 Aug 2009 23:48:07 -0700
Subject: [Numpy-discussion] [ANN] IPython 0.10 is out.
Message-ID: 

Hi all,

on behalf of the IPython development team, I'm happy to announce that
we've just put out IPython 0.10 final. Many thanks to all those who
contributed ideas, bug reports and code.

You can download it from the usual location:

- http://ipython.scipy.org/moin/Download: direct links to various formats
- http://ipython.scipy.org/dist: all files are stored here.

The official documentation for this release can be found at:

- http://ipython.scipy.org/doc/rel-0.10/html: as HTML pages.
- http://ipython.scipy.org/doc/rel-0.10/ipython.pdf: as a single PDF.

In brief, this release gathers all recent work and in a sense closes a
cycle of the current useful-but-internally-messy structure of the
IPython code. We are now well into the work of a major internal cleanup
that will inevitably change some APIs and will likely take some time to
stabilize, so the 0.10 release should be used for a while until the
dust settles on the development branch.

The 0.10 release fixes many bugs, including some very problematic ones
(a major memory leak with repeated %run is closed), and also brings a
number of new features, stability improvements and improved
documentation. Some highlights:

- Improved WX-based ipythonx and ipython-wx tools, suitable for
embedding into other applications and standalone use.

- Better interactive demos with the IPython.demo module.

- Refactored ipcluster with support for local execution, MPI, PBS and
systems with SSH key access preconfigured.

- Integration with the TextMate editor in the %edit command.

The full release notes are available here with all the details:

http://ipython.scipy.org/doc/rel-0.10/html/changes.html#release-0-10

We hope you enjoy it, please report any problems as usual either on
the mailing list, or by filing a bug report at our Launchpad tracker:

https://bugs.launchpad.net/ipython

Cheers,

The IPython team.

From dave.hirschfeld at gmail.com Wed Aug 5 03:40:04 2009
From: dave.hirschfeld at gmail.com (Dave)
Date: Wed, 5 Aug 2009 07:40:04 +0000 (UTC)
Subject: [Numpy-discussion] strange sin/cos performance
References: <4A76E709.9090100@indiana.edu>
	<20090803134556.GA31036@phare.normalesup.org>
	<20090805111833.645c6d93@cudos0803>
Message-ID: 

Charles R Harris <charlesr.harris at gmail.com> writes:

>
>
> Is anyone with this problem *not* running ubuntu? Chuck
>

All I can say is that it (surprisingly?) doesn't appear to affect my
windoze (XP) box.
Python 2.5.4 (r254:67916, Dec 23 2008, 15:10:54) [MSC v.1310 32 bit (Intel)]

In [2]: a=np.random.rand(10000).astype(np.float32)

In [3]: b=np.random.rand(10000).astype(np.float64)

In [4]: c=1000*np.random.rand(10000).astype(np.float32)

In [5]: d=1000*np.random.rand(1000).astype(np.float64)

In [6]: timeit -n 10 np.sin(a)
10 loops, best of 3: 442 us per loop

In [7]: timeit -n 10 np.sin(b)
10 loops, best of 3: 513 us per loop

In [8]: timeit -n 10 np.sin(c)
10 loops, best of 3: 474 us per loop

In [9]: timeit -n 10 np.sin(d)
10 loops, best of 3: 63.1 us per loop

In [10]: timeit -n 10 np.sin(c.astype(np.float64)).astype(np.float32)
10 loops, best of 3: 587 us per loop

In [11]: !gcc --version
gcc (GCC) 3.4.5 (mingw-vista special r3)

From bsouthey at gmail.com Wed Aug 5 04:40:17 2009
From: bsouthey at gmail.com (Bruce Southey)
Date: Wed, 5 Aug 2009 03:40:17 -0500
Subject: [Numpy-discussion] Why NaN?
In-Reply-To: 
References: <49d6b3500908040946v2a06e615t7f77bffabf22e066@mail.gmail.com>
	<49d6b3500908041140g505e9a5csdffafb420b79b4ca@mail.gmail.com>
Message-ID: 

On Tue, Aug 4, 2009 at 4:05 PM, Keith Goodman wrote:
> On Tue, Aug 4, 2009 at 1:53 PM, Bruce Southey wrote:
>> On Tue, Aug 4, 2009 at 1:40 PM, Gökhan Sever wrote:
>>> This is the loveliest of all solutions:
>>>
>>> c[isfinite(c)].mean()
>>
>> This handling of nonfinite elements has come up before.
>> Please remember that this is only for 1d or flattened arrays, so it
>> does not work in general, especially along an axis.
>
> If you don't want to use nanmean from scipy.stats you could use:
>
> np.nansum(c, axis=0) / (~np.isnan(c)).sum(axis=0)
>
> or
>
> np.nansum(c, axis=0) / (c == c).sum(axis=0)
>
> But if c contains ints then you'll run into trouble with the division,
> so you'll need to protect against that.

That is not a problem because nan and infinity are only defined for
floating point numbers, not integers. So any array that has nonfinite
elements like nans and infinity must have a floating point dtype.

Bruce

From bsouthey at gmail.com Wed Aug 5 05:20:12 2009
From: bsouthey at gmail.com (Bruce Southey)
Date: Wed, 5 Aug 2009 04:20:12 -0500
Subject: [Numpy-discussion] strange sin/cos performance
In-Reply-To: 
References: <4A76E709.9090100@indiana.edu>
	<20090803134556.GA31036@phare.normalesup.org>
	<20090805111833.645c6d93@cudos0803>
Message-ID: 

On Tue, Aug 4, 2009 at 9:42 PM, Charles R Harris wrote:
>
>
> On Tue, Aug 4, 2009 at 7:18 PM, Jochen wrote:
>>
>> Hi all,
>> I see something similar on my system.
>> OK, I've just done a test. The system is Ubuntu 9.04 AMD64; there seems
>> to be a regression for float32 with large input values:
>>
>> In [47]: a=np.random.rand(10000).astype(np.float32)
>>
>> In [48]: b=np.random.rand(10000).astype(np.float64)
>>
>> In [49]: c=1000*np.random.rand(10000).astype(np.float32)
>>
>> In [50]: d=1000*np.random.rand(1000).astype(np.float64)
>>
>> In [51]: %timeit -n 10 np.sin(a)
>> 10 loops, best of 3: 251 µs per loop
>>
>> In [52]: %timeit -n 10 np.sin(b)
>> 10 loops, best of 3: 395 µs per loop
>>
>> In [53]: %timeit -n 10 np.sin(c)
>> 10 loops, best of 3: 5.65 ms per loop
>>
>> In [54]: %timeit -n 10 np.sin(d)
>> 10 loops, best of 3: 87.7 µs per loop
>>
>> In [55]: %timeit -n 10 np.sin(c.astype(np.float64)).astype(np.float32)
>> 10 loops, best of 3: 891 µs per loop
>
> Is anyone with this problem *not* running ubuntu?
>

Yes, but I do not consider it a 'problem'. While I am not an expert in
this, it looks to be related to 64-bit OSes running on 64-bit
processors, probably compiler related, and probably a feature of Python.
As I have tried to show, I do not think these timings are being
performed correctly, because when you pass a single argument to Python
at the command prompt you get comparable timings. The difference in
timing occurs when you pass two arguments to Python.

I do not use IPython so I am only guessing, but you need to include the
array construction in the timed expression. Probably something like:

%timeit -n 10 np.sin(numpy.arange(0.0, 1000, (2 * 3.14159) / 1000, dtype=np.float32))

Note there is most likely a penalty involved in type conversion that
needs to be addressed in any timings.

Bruce

From david at ar.media.kyoto-u.ac.jp Wed Aug 5 07:45:30 2009
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Wed, 05 Aug 2009 20:45:30 +0900
Subject: [Numpy-discussion] Funded work on Numpy: proposed improvements and request for feedback
In-Reply-To: 
References: <4A77A009.9060104@ar.media.kyoto-u.ac.jp>
Message-ID: <4A7970DA.50302@ar.media.kyoto-u.ac.jp>

Bruce Southey wrote:
> So if 'C99-like' is going to be the near term future, is there any
> point in supporting non-C99 environments with this work?
>

There may be a misunderstanding: if the platform supports C99 complex,
then we will use it, and otherwise, we will do as today, that is define
our own type.

The advantages of reusing the C99 complex type if available:
    - if you yourself do not care about portability, you can use the
numpy complex typedef as a C99 complex, using addition, division, etc...
operators.
    - we can reuse the math library.

I also need some sort of proper C99 support for windows 64 (more
exactly, to reimplement a minimal libgfortran buildable by the MS
compiler).

> That is, is the limitation in the compiler, operating system,
> processor or some combination of these?
>

That's purely a compiler issue. Of course, the main culprit is the MS
compiler. MS explicitly stated they did not care about proper C support.

cheers,

David

From jdh2358 at gmail.com Wed Aug 5 09:44:34 2009
From: jdh2358 at gmail.com (John Hunter)
Date: Wed, 5 Aug 2009 08:44:34 -0500
Subject: [Numpy-discussion] yubnub and numpy examples
Message-ID: <88e473830908050644t71825829uc965ad213e652b3@mail.gmail.com>

yubnub is pretty cool -- it's a command line interface for the web.
You can enable it in firefox by typing "about:config" in the URL bar,
scrolling down to "keyword.URL", right-click on the line and choose
modify, and set the value to be

http://www.yubnub.org/parser/parse?default=g2&command=

Then, you can type yubnub commands in the URL bar, e.g., to see all
commands related to python, type "ls python" in the URL bar.

It's easy to create new commands; I just created a new command to load
the docs for a numpy function; just type in the URL bar:

  npfunc convolve

which takes you directly to
http://docs.scipy.org/doc/numpy/reference/generated/numpy.convolve.html

I was hoping to create a similar command for the numpy examples, but
the URL links in http://www.scipy.org/Numpy_Example_List_With_Doc are
some md5 gobbledy-gook. Is it possible to have nice URLs on this
page, so they can be more readily yubnub-ized?
JDH

From daniel.wheeler2 at gmail.com Wed Aug 5 10:20:14 2009
From: daniel.wheeler2 at gmail.com (Daniel Wheeler)
Date: Wed, 5 Aug 2009 10:20:14 -0400
Subject: [Numpy-discussion] PDE BoF at SciPy2009
In-Reply-To: <1963DA80-8CE5-4033-BCC8-EBEF05352AAB@gmail.com>
References: <1963DA80-8CE5-4033-BCC8-EBEF05352AAB@gmail.com>
Message-ID: <80b160a0908050720i11a147d0ibc6e40f4762fb5f3@mail.gmail.com>

On Mon, Aug 3, 2009 at 3:57 PM, Chris Kees wrote:
> Is there any interest in a BoF session on implementing numerical
> methods for partial differential equations using modules like numpy,
> cython, mpi4py, etc.?

Yes! My colleague, Jon Guyer, will be attending the meeting and
speaking on this subject. He isn't on this list. He will be there from
midday on the Wednesday of the conference. Is this BoF still of interest?

-- 
Daniel Wheeler

From kwgoodman at gmail.com Wed Aug 5 10:18:17 2009
From: kwgoodman at gmail.com (Keith Goodman)
Date: Wed, 5 Aug 2009 07:18:17 -0700
Subject: [Numpy-discussion] Why NaN?
In-Reply-To: 
References: <49d6b3500908040946v2a06e615t7f77bffabf22e066@mail.gmail.com>
	<49d6b3500908041140g505e9a5csdffafb420b79b4ca@mail.gmail.com>
Message-ID: 

On Wed, Aug 5, 2009 at 1:40 AM, Bruce Southey wrote:
> On Tue, Aug 4, 2009 at 4:05 PM, Keith Goodman wrote:
>> On Tue, Aug 4, 2009 at 1:53 PM, Bruce Southey wrote:
>>> On Tue, Aug 4, 2009 at 1:40 PM, Gökhan Sever wrote:
>>>> This is the loveliest of all solutions:
>>>>
>>>> c[isfinite(c)].mean()
>>>
>>> This handling of nonfinite elements has come up before.
>>> Please remember that this is only for 1d or flattened arrays, so it
>>> does not work in general, especially along an axis.
>>
>> If you don't want to use nanmean from scipy.stats you could use:
>>
>> np.nansum(c, axis=0) / (~np.isnan(c)).sum(axis=0)
>>
>> or
>>
>> np.nansum(c, axis=0) / (c == c).sum(axis=0)
>>
>> But if c contains ints then you'll run into trouble with the division,
>> so you'll need to protect against that.
>
> That is not a problem because nan and infinity are only defined for
> floating point numbers, not integers. So any array that has nonfinite
> elements like nans and infinity must have a floating point dtype.

That is true. But I was thinking of this case (no nans or infs):

>> c
array([[1, 2, 3],
       [4, 5, 6]])
>> c.mean(0)
array([ 2.5,  3.5,  4.5])   <--- good
>> np.nansum(c, axis=0) / (c == c).sum(axis=0)
array([2, 3, 4])   <--- bad
>> np.nansum(c, axis=0) / (c == c).sum(axis=0, dtype=np.float)
array([ 2.5,  3.5,  4.5])   <--- good

From josef.pktd at gmail.com Wed Aug 5 10:30:55 2009
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Wed, 5 Aug 2009 10:30:55 -0400
Subject: [Numpy-discussion] yubnub and numpy examples
In-Reply-To: <88e473830908050644t71825829uc965ad213e652b3@mail.gmail.com>
References: <88e473830908050644t71825829uc965ad213e652b3@mail.gmail.com>
Message-ID: <1cd32cbb0908050730h1ca6262y8d535f566295af9f@mail.gmail.com>

On Wed, Aug 5, 2009 at 9:44 AM, John Hunter wrote:
> yubnub is pretty cool -- it's a command line interface for the web.
> You can enable it in firefox by typing "about:config" in the URL bar,
> scrolling down to "keyword.URL", right-click on the line and choose
> modify, and set the value to be
>
> http://www.yubnub.org/parser/parse?default=g2&command=
>
> Then, you can type yubnub commands in the URL bar, e.g., to see all
> commands related to python, type "ls python" in the URL bar.
>
> It's easy to create new commands; I just created a new command to load
> the docs for a numpy function; just type in the URL bar:
>
>   npfunc convolve
>
> which takes you directly to
> http://docs.scipy.org/doc/numpy/reference/generated/numpy.convolve.html
>
> I was hoping to create a similar command for the numpy examples, but
> the URL links in http://www.scipy.org/Numpy_Example_List_With_Doc are
> some md5 gobbledy-gook. Is it possible to have nice URLs on this
> page, so they can be more readily yubnub-ized?
>
> JDH
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>

looks pretty good, but I would recommend a safe install instead of
overwriting the keyword default. This requires typing one additional
letter, e.g. "y npfunc convolve" (and avoids invalidating the firefox
warranty, and you can do the same with other search/link shortcuts).

Josef

from http://www.yubnub.org/documentation/describe_installation

"""

Safe Firefox Installation. The safest way to install YubNub is to make
a Firefox keyword for it. If you're using the Firefox web browser:

    * Right-click the input box at the top of the page (the one under
the words "Type in a command")
    * Click "Add a Keyword for this Search"
    * For the Name, enter "YubNub", and for the Keyword, enter "y"
    * Press OK

Now you can use YubNub directly from the address bar. For example, try
typing "y gim porsche 911" into your address bar. Don't forget the "y"
in front!

You may have noticed that I said that this is the "safest way" to
install YubNub. Why safest? Because you must explicitly enter a "y"
before the YubNub command. This prevents "command spoofing".

For example, suppose someone made a "michael" command. If you typed
"michael jordan" into YubNub, intending to do a search, you would
instead go to the site of the person who made the "michael" command.
Rats! But if you installed YubNub into your Firefox address bar as
described above, typing "michael jordan" into your address bar would
do a search for "michael jordan", as you intended. The only way to get
to that other person's site would be to type "y michael".

If you like to live on the edge like me, you can try one of the other
installation methods, many of which do not require an initial keyword
like "y".

"""

From meine at informatik.uni-hamburg.de Wed Aug 5 10:41:21 2009
From: meine at informatik.uni-hamburg.de (Hans Meine)
Date: Wed, 5 Aug 2009 16:41:21 +0200
Subject: [Numpy-discussion] Why NaN?
In-Reply-To: 
References: <49d6b3500908040946v2a06e615t7f77bffabf22e066@mail.gmail.com>
	<20090804195936.GF11772@phare.normalesup.org>
Message-ID: <200908051641.21306.meine@informatik.uni-hamburg.de>

On Tuesday 04 August 2009 22:06:38 Keith Goodman wrote:
> On Tue, Aug 4, 2009 at 12:59 PM, Gael Varoquaux wrote:
> > On Tue, Aug 04, 2009 at 01:54:49PM -0500, Gökhan Sever wrote:
> >> I see that you should have a browser embedding plugin for IPython
> >> which you don't want to share with us :)
> >
> > No, I answer e-mail using vim.
>
> Yeah, I'm trying that right now.
> :wq
> :q!
> :dammit

Vim? Isn't that the editor with the two modes, one which destroys your
text and one that beeps? ;-)

Have a nice day,
  Hans

PS: Yes, it's a free translation of a German chat (IRC/bash?) citation.
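To round off the NaN-handling subthread above, here is a self-contained
version of Keith Goodman's axis-aware mean with the float cast that
guards against integer division. This is an illustrative sketch (the
function name is ours, not from the thread), not scipy.stats.nanmean
itself:

import numpy as np

def mean_ignoring_nans(a, axis=None):
    # Count the non-NaN entries as float64 so the division below cannot
    # fall back to integer division when 'a' has an integer dtype.
    counts = (~np.isnan(a)).sum(axis=axis, dtype=np.float64)
    return np.nansum(a, axis=axis) / counts

c = np.array([[1.0, np.nan, 3.0],
              [4.0, 5.0,    np.nan]])
print mean_ignoring_nans(c, axis=0)   # -> [ 2.5  5.   3. ]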
From ralf.gommers at googlemail.com Wed Aug 5 10:49:43 2009
From: ralf.gommers at googlemail.com (Ralf Gommers)
Date: Wed, 5 Aug 2009 10:49:43 -0400
Subject: [Numpy-discussion] yubnub and numpy examples
In-Reply-To: <88e473830908050644t71825829uc965ad213e652b3@mail.gmail.com>
References: <88e473830908050644t71825829uc965ad213e652b3@mail.gmail.com>
Message-ID: 

On Wed, Aug 5, 2009 at 9:44 AM, John Hunter wrote:

> yubnub is pretty cool -- it's a command line interface for the web.
> You can enable it in firefox by typing "about:config" in the URL bar,
> scrolling down to "keyword.URL", right-click on the line and choose
> modify, and set the value to be
>
> http://www.yubnub.org/parser/parse?default=g2&command=
>
> Then, you can type yubnub commands in the URL bar, e.g., to see all
> commands related to python, type "ls python" in the URL bar.
>
> It's easy to create new commands; I just created a new command to load
> the docs for a numpy function; just type in the URL bar:
>
> npfunc convolve

very cool, thanks!

>
>
> which takes you directly to
> http://docs.scipy.org/doc/numpy/reference/generated/numpy.convolve.html
>
> I was hoping to create a similar command for the numpy examples, but
> the URL links in http://www.scipy.org/Numpy_Example_List_With_Doc are
> some md5 gobbledy-gook. Is it possible to have nice URLs on this
> page, so they can be more readily yubnub-ized?

Most of those examples have been integrated in the docstrings, and many
more have been written in the doc wiki. They also use "from numpy
import *" instead of the np namespace. So instead of spending time
fixing links, it might make more sense to generate a new version of
this page (with more useful links) from the docstrings themselves.

Cheers,
Ralf

>
> JDH
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From josef.pktd at gmail.com Wed Aug 5 10:52:43 2009
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Wed, 5 Aug 2009 10:52:43 -0400
Subject: [Numpy-discussion] yubnub and numpy examples
In-Reply-To: <1cd32cbb0908050730h1ca6262y8d535f566295af9f@mail.gmail.com>
References: <88e473830908050644t71825829uc965ad213e652b3@mail.gmail.com>
	<1cd32cbb0908050730h1ca6262y8d535f566295af9f@mail.gmail.com>
Message-ID: <1cd32cbb0908050752rf8b7bf2n8cd330e345992da9@mail.gmail.com>

On Wed, Aug 5, 2009 at 10:30 AM, wrote:
> On Wed, Aug 5, 2009 at 9:44 AM, John Hunter wrote:
>> yubnub is pretty cool -- it's a command line interface for the web.
>> You can enable it in firefox by typing "about:config" in the URL bar,
>> scrolling down to "keyword.URL", right-click on the line and choose
>> modify, and set the value to be
>>
>> http://www.yubnub.org/parser/parse?default=g2&command=
>>
>> Then, you can type yubnub commands in the URL bar, e.g., to see all
>> commands related to python, type "ls python" in the URL bar.
>>
>> It's easy to create new commands; I just created a new command to load
>> the docs for a numpy function; just type in the URL bar:
>>
>> npfunc convolve
>>
>> which takes you directly to
>> http://docs.scipy.org/doc/numpy/reference/generated/numpy.convolve.html

Still, it is a lot slower than windows htmlhelp, which is available
for numpy and scipy but not for others.
"y mplcodex histogram" takes pretty long to load >> >> I was hoping to create a similar command for the numpy examples, but >> the URL links in http://www.scipy.org/Numpy_Example_List_With_Doc are >> some md5 gobbledy-gook. ?Is it possible to have nice URLs on this >> page, so they can be more readily yubnub-ized? my impression of the example list page: This page is not really maintained anymore, it is still at numpy 1.2.1 and mostly superseded by the new docs, with examples as part of the docstrings. (Also because of it's page size, I think it's more appropriate for browsing than for quick lookups.) Josef >> >> JDH >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > looks pretty good, but I would recommend a safe install, instead of > overwriting the keyword default. This requires typing one additional > letter, e.g "y npfunc convolve", (and avoids invalidating the firefox > warranty, and you can do the same with other search/link shortcuts) > > Josef > > > from http://www.yubnub.org/documentation/describe_installation > > """ > > Safe Firefox Installation. The safest way to install YubNub is to make > a Firefox keyword for it. If you're using the Firefox web browser: > > ? ?* Right-click the input box at the top of the page (the one under > the words "Type in a command") > ? ?* Click "Add a Keyword for this Search" > ? ?* For the Name, enter "YubNub", and for the Keyword, enter "y" > ? ?* Press OK > > Now you can use YubNub directly from the address bar. For example, try > typing "y gim porsche 911" into your address bar. Don't forget the "y" > in front! > You may have noticed that I said that this is the "safest way" to > install YubNub. Why safest? Because you must explicitly enter a "y" > before the YubNub command. This prevents "command spoofing". > > For example, suppose someone made a "michael" command. If you typed > "michael jordan" into YubNub, intending to do a search, you would > instead go to the site of the person who made the "michael" command. > Rats! But if you installed YubNub into your Firefox address bar as > described above, typing "michael jordan" into your address bar would > do a search for "michael jordan", as you intended. The only way to get > to that other person's site would be to type "y michael". > > If you like to live on the edge like me, you can try one of the other > installation methods, many of which do not require an initial keyword > like "y". > > """ > From bsouthey at gmail.com Wed Aug 5 10:04:47 2009 From: bsouthey at gmail.com (Bruce Southey) Date: Wed, 05 Aug 2009 09:04:47 -0500 Subject: [Numpy-discussion] Funded work on Numpy: proposed improvements and request for feedback In-Reply-To: <4A7970DA.50302@ar.media.kyoto-u.ac.jp> References: <4A77A009.9060104@ar.media.kyoto-u.ac.jp> <4A7970DA.50302@ar.media.kyoto-u.ac.jp> Message-ID: <4A79917F.7020108@gmail.com> On 08/05/2009 06:45 AM, David Cournapeau wrote: > Bruce Southey wrote: > >> So if 'C99-like' is going to be the near term future, is there any >> point in supporting non-C99 environments with this work? >> >> > > There may be a misunderstanding: Really ignorance :-) > if the platform support C99 complex, > then we will use it, and otherwise, we will do as today, that is define > our own type. > Actually I did understand that much. 
> The advantages of reusing the C99 complex type if available:
>     - if you yourself do not care about portability, you can use the
> numpy complex typedef as a C99 complex, using addition, division, etc...
> operators.
>     - we can reuse the math library.
> I also need some sort of proper C99 support for windows 64 (more
> exactly, to reimplement a minimal libgfortran buildable by the MS
> compiler).
>
>> That is, is the limitation in the compiler, operating system,
>> processor or some combination of these?
>>
>
> That's purely a compiler issue. Of course, the main culprit is the MS
> compiler. MS explicitly stated they did not care about proper C support.
>

Obviously complicated by the distribution of the official Python MS
compiled binaries.

Ultimately, I am looking at long-term maintenance, for when people have
moved on and the code gets somewhat stale. Definitely your proposal
would help the long-term maintenance of Numpy on C99-supporting
compilers if included. So my concern is avoiding divergence of the code
base between Numpy and the library, so that there is no unnecessary code
duplication, no need to merge code in the future, and fixes (bugs or
enhancements) are made once and apply to both. Provided these aspects
are addressed, I have no problems with the proposal.

Bruce
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From afriedle at indiana.edu Wed Aug 5 11:19:52 2009
From: afriedle at indiana.edu (Andrew Friedley)
Date: Wed, 05 Aug 2009 11:19:52 -0400
Subject: [Numpy-discussion] strange sin/cos performance
In-Reply-To: 
References: <4A76E709.9090100@indiana.edu>
	<20090803134556.GA31036@phare.normalesup.org>
	<20090805111833.645c6d93@cudos0803>
Message-ID: <4A79A318.8050907@indiana.edu>

> Is anyone with this problem *not* running ubuntu?

Me - RHEL 5.2 opteron:

Python 2.6.1 (r261:67515, Jan  5 2009, 10:19:01)
[GCC 4.1.2 20071124 (Red Hat 4.1.2-42)] on linux2

Fedora 9 PS3/PPC:

Python 2.5.1 (r251:54863, Jul 17 2008, 13:25:23)
[GCC 4.3.1 20080708 (Red Hat 4.3.1-4)] on linux2

Actually I now have some interesting results that indicate the issue
isn't in Python or NumPy at all. I just wrote a C program to try to
reproduce the error, and was able to do so (actually the difference is
even larger).

Opteron:

float (32) time in usecs: 179698
double (64) time in usecs: 13795

PS3/PPC:

float (32) time in usecs: 614821
double (64) time in usecs: 37163

I've attached the code for others to review and/or try out. I guess
this is worth showing to the libc people?

Andrew
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: cos.c
URL: 

From bsouthey at gmail.com Wed Aug 5 11:20:43 2009
From: bsouthey at gmail.com (Bruce Southey)
Date: Wed, 05 Aug 2009 10:20:43 -0500
Subject: [Numpy-discussion] Why NaN?
In-Reply-To: 
References: <49d6b3500908040946v2a06e615t7f77bffabf22e066@mail.gmail.com>
	<49d6b3500908041140g505e9a5csdffafb420b79b4ca@mail.gmail.com>
Message-ID: <4A79A34B.8070102@gmail.com>

On 08/05/2009 09:18 AM, Keith Goodman wrote:
> On Wed, Aug 5, 2009 at 1:40 AM, Bruce Southey wrote:
>
>> On Tue, Aug 4, 2009 at 4:05 PM, Keith Goodman wrote:
>>
>>> On Tue, Aug 4, 2009 at 1:53 PM, Bruce Southey wrote:
>>>
>>>> On Tue, Aug 4, 2009 at 1:40 PM, Gökhan Sever wrote:
>>>>
>>>>> This is the loveliest of all solutions:
>>>>>
>>>>> c[isfinite(c)].mean()
>>>>>
>>>> This handling of nonfinite elements has come up before.
>>>> Please remember that this is only for 1d or flattened arrays, so it
>>>> does not work in general, especially along an axis.
>>>>
>>> If you don't want to use nanmean from scipy.stats you could use:
>>>
>>> np.nansum(c, axis=0) / (~np.isnan(c)).sum(axis=0)
>>>
>>> or
>>>
>>> np.nansum(c, axis=0) / (c == c).sum(axis=0)
>>>
>>> But if c contains ints then you'll run into trouble with the division,
>>> so you'll need to protect against that.
>>>
>> That is not a problem because nan and infinity are only defined for
>> floating point numbers, not integers. So any array that has nonfinite
>> elements like nans and infinity must have a floating point dtype.
>>
>
> That is true. But I was thinking of this case (no nans or infs):
>
>>> c
>>>
> array([[1, 2, 3],
>        [4, 5, 6]])
>
>>> c.mean(0)
>>>
> array([ 2.5,  3.5,  4.5])  <--- good
>
>>> np.nansum(c, axis=0) / (c == c).sum(axis=0)
>>>
> array([2, 3, 4])  <--- bad
>
>>> np.nansum(c, axis=0) / (c == c).sum(axis=0, dtype=np.float)
>>>
> array([ 2.5,  3.5,  4.5])  <--- good
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>

Sure, but that is about ints versus floats and not about nans or infs.
Your 'good' examples are really about first converting an int array
into a float array, and your 'bad' example maintains the int dtype (you
get the same result if you cast the arrays from the 'good' approaches
back to an int dtype).

The correct answer depends on what you want the dtype to be. For example,
with floating point division:
np.mean(c/0.0, axis=0)

gives the expected floating point answer:
array([ Inf,  Inf,  Inf])

With integer division:
np.mean(c/0, axis=0)

gives the expected integer answer:
array([ 0.,  0.,  0.])

Note the default action of mean is to convert ints to float64, which is
why the output is a float instead of an int, although the numpy.mean
dtype argument does not appear to work for int dtypes.

Bruce
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From scott.sinclair.za at gmail.com Wed Aug 5 11:26:39 2009
From: scott.sinclair.za at gmail.com (Scott Sinclair)
Date: Wed, 5 Aug 2009 17:26:39 +0200
Subject: [Numpy-discussion] strange sin/cos performance
In-Reply-To: <4A79A318.8050907@indiana.edu>
References: <4A76E709.9090100@indiana.edu>
	<20090803134556.GA31036@phare.normalesup.org>
	<20090805111833.645c6d93@cudos0803>
	<4A79A318.8050907@indiana.edu>
Message-ID: <6a17e9ee0908050826q595364a9pe9b2f53d8bd65482@mail.gmail.com>

> 2009/8/5 Andrew Friedley :
>
>> Is anyone with this problem *not* running ubuntu?
>
> Me - RHEL 5.2 opteron:
>
> Python 2.6.1 (r261:67515, Jan  5 2009, 10:19:01)
> [GCC 4.1.2 20071124 (Red Hat 4.1.2-42)] on linux2
>
> Fedora 9 PS3/PPC:
>
> Python 2.5.1 (r251:54863, Jul 17 2008, 13:25:23)
> [GCC 4.3.1 20080708 (Red Hat 4.3.1-4)] on linux2
>
> Actually I now have some interesting results that indicate the issue
> isn't in Python or NumPy at all. I just wrote a C program to try to
> reproduce the error, and was able to do so (actually the difference is
> even larger).
>
> Opteron:
>
> float (32) time in usecs: 179698
> double (64) time in usecs: 13795
>
> PS3/PPC:
>
> float (32) time in usecs: 614821
> double (64) time in usecs: 37163
>
> I've attached the code for others to review and/or try out. I guess
> this is worth showing to the libc people?

For whatever it's worth, not much difference on my machine
32-bit Ubuntu, GCC 4.3.3.
float (32) time in usecs: 13804
double (64) time in usecs: 15394

Cheers,
Scott

From d_l_goldsmith at yahoo.com Wed Aug 5 13:12:13 2009
From: d_l_goldsmith at yahoo.com (David Goldsmith)
Date: Wed, 5 Aug 2009 10:12:13 -0700 (PDT)
Subject: [Numpy-discussion] PDE BoF at SciPy2009
In-Reply-To: <80b160a0908050720i11a147d0ibc6e40f4762fb5f3@mail.gmail.com>
Message-ID: <487426.84589.qm@web52103.mail.re2.yahoo.com>

I already replied to OP, but I'll say publicly:

"+1", as long as it's not at the same time as the as-yet-potential BoF
on "the Future of SciPy".

DG

--- On Wed, 8/5/09, Daniel Wheeler wrote:

> From: Daniel Wheeler
> Subject: Re: [Numpy-discussion] PDE BoF at SciPy2009
> To: "Discussion of Numerical Python"
> Date: Wednesday, August 5, 2009, 7:20 AM
> On Mon, Aug 3, 2009 at 3:57 PM, Chris Kees wrote:
> > Is there any interest in a BoF session on implementing numerical
> > methods for partial differential equations using modules like numpy,
> > cython, mpi4py, etc.?
>
> Yes! My colleague, Jon Guyer, will be attending the meeting and
> speaking on this subject. He isn't on this list. He will be there from
> midday on the Wednesday of the conference. Is this BoF still of interest?
>
> --
> Daniel Wheeler
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>

From romain.brette at ens.fr Wed Aug 5 06:45:42 2009
From: romain.brette at ens.fr (Romain Brette)
Date: Wed, 5 Aug 2009 10:45:42 +0000 (UTC)
Subject: [Numpy-discussion] GPU Numpy
Message-ID: 

Hi everyone,

I was wondering if you had any plan to incorporate some GPU support
into numpy, or perhaps as a separate module. What I have in mind is
something that would mimic the syntax of numpy arrays, with a new dtype
(gpufloat), like this:

from gpunumpy import *
x=zeros(100,dtype='gpufloat') # Creates an array of 100 elements on the GPU
y=ones(100,dtype='gpufloat')
z=exp(2*x+y) # z is on the GPU, all operations on GPU with no transfer
z_cpu=array(z,dtype='float') # z is copied to the CPU
i=(z>2.3).nonzero()[0] # operation on GPU, returns a CPU integer array

I came across a paper about something like that but couldn't find any
public release:
http://www.tricity.wsu.edu/~bobl/personal/mypubs/2009_gpupy_toms.pdf

There is a library named GPULib (http://www.txcorp.com/products/GPULib/)
that does similar things, but unfortunately they don't support Python
(I think their main Python developer left).

I think this would be very useful for many people. For our project (a
neural network simulator, http://www.briansimulator.org) we use PyCuda
(http://mathema.tician.de/software/pycuda), which is great, but it is
mainly for low-level GPU programming.

Cheers
Romain

From cekees at gmail.com Wed Aug 5 14:23:40 2009
From: cekees at gmail.com (Chris Kees)
Date: Wed, 5 Aug 2009 13:23:40 -0500
Subject: [Numpy-discussion] PDE BoF at SciPy2009
In-Reply-To: <487426.84589.qm@web52103.mail.re2.yahoo.com>
References: <487426.84589.qm@web52103.mail.re2.yahoo.com>
Message-ID: <6B54DD7C-8B1C-4886-9822-C1E8210945CD@gmail.com>

OK. I contacted several attendees who are not on the numpy list, and
it looks like we've got six or seven people interested.

I've never been to the conference or organized a session like this.
Any guidance?

Chris

On Aug 5, 2009, at 12:12 PM, David Goldsmith wrote:

> I already replied to OP, but I'll say publicly:
>
> "+1", as long as it's not at the same time as the as-yet-potential
> BoF on "the Future of SciPy".
>
> DG
>
> --- On Wed, 8/5/09, Daniel Wheeler wrote:
>
>> From: Daniel Wheeler
>> Subject: Re: [Numpy-discussion] PDE BoF at SciPy2009
>> To: "Discussion of Numerical Python"
>> Date: Wednesday, August 5, 2009, 7:20 AM
>> On Mon, Aug 3, 2009 at 3:57 PM, Chris Kees wrote:
>>> Is there any interest in a BoF session on implementing numerical
>>> methods for partial differential equations using modules like numpy,
>>> cython, mpi4py, etc.?
>>
>> Yes! My colleague, Jon Guyer, will be attending the meeting and
>> speaking on this subject. He isn't on this list. He will be there from
>> midday on the Wednesday of the conference. Is this BoF still of
>> interest?
>>
>> --
>> Daniel Wheeler
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at scipy.org
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>
>
>
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion

From gael.varoquaux at normalesup.org Wed Aug 5 14:27:02 2009
From: gael.varoquaux at normalesup.org (Gael Varoquaux)
Date: Wed, 5 Aug 2009 20:27:02 +0200
Subject: [Numpy-discussion] PDE BoF at SciPy2009
In-Reply-To: <6B54DD7C-8B1C-4886-9822-C1E8210945CD@gmail.com>
References: <487426.84589.qm@web52103.mail.re2.yahoo.com>
	<6B54DD7C-8B1C-4886-9822-C1E8210945CD@gmail.com>
Message-ID: <20090805182702.GB26054@phare.normalesup.org>

On Wed, Aug 05, 2009 at 01:23:40PM -0500, Chris Kees wrote:
> OK. I contacted several attendees who are not on the numpy list, and
> it looks like we've got six or seven people interested.

> I've never been to the conference or organized a session like this.
> Any guidance?

Just contact one of the organisers during the conference (as early as
possible) and we'll sort out the room. It will happen on one of the two
evenings, preferably on Thursday.

Gaël

From charlesr.harris at gmail.com Wed Aug 5 14:34:27 2009
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Wed, 5 Aug 2009 12:34:27 -0600
Subject: [Numpy-discussion] GPU Numpy
In-Reply-To: 
References: 
Message-ID: 

On Wed, Aug 5, 2009 at 4:45 AM, Romain Brette wrote:

> Hi everyone,
>
> I was wondering if you had any plan to incorporate some GPU support
> into numpy, or perhaps as a separate module. What I have in mind is
> something that would mimic the syntax of numpy arrays, with a new dtype
> (gpufloat), like this:
>
> from gpunumpy import *
> x=zeros(100,dtype='gpufloat') # Creates an array of 100 elements on the GPU
> y=ones(100,dtype='gpufloat')
> z=exp(2*x+y) # z is on the GPU, all operations on GPU with no transfer
> z_cpu=array(z,dtype='float') # z is copied to the CPU
> i=(z>2.3).nonzero()[0] # operation on GPU, returns a CPU integer array
>
> I came across a paper about something like that but couldn't find any
> public release:
> http://www.tricity.wsu.edu/~bobl/personal/mypubs/2009_gpupy_toms.pdf
>
> There is a library named GPULib (http://www.txcorp.com/products/GPULib/)
> that does similar things, but unfortunately they don't support Python
> (I think their main Python developer left).
> I think this would be very useful for many people. For our project (a
> neural network simulator, http://www.briansimulator.org) we use PyCuda
> (http://mathema.tician.de/software/pycuda), which is great, but it is
> mainly for low-level GPU programming.
>

What sort of functionality are you looking for?
It could be that you could slip in a small mod that would do what you
want. In the larger picture, the use of GPUs has been discussed on the
list several times going back at least a year. The main problems with
using GPUs were that CUDA was only available for nvidia video cards and
there didn't seem to be any hope for a CUDA version of LAPACK.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From d_l_goldsmith at yahoo.com Wed Aug 5 14:37:17 2009
From: d_l_goldsmith at yahoo.com (David Goldsmith)
Date: Wed, 5 Aug 2009 11:37:17 -0700 (PDT)
Subject: [Numpy-discussion] PDE BoF at SciPy2009
In-Reply-To: <6B54DD7C-8B1C-4886-9822-C1E8210945CD@gmail.com>
Message-ID: <802291.90959.qm@web52105.mail.re2.yahoo.com>

Lots of food and alcohol! (Just kidding.)

DG

--- On Wed, 8/5/09, Chris Kees wrote:

> From: Chris Kees
> Subject: Re: [Numpy-discussion] PDE BoF at SciPy2009
> To: "Discussion of Numerical Python"
> Date: Wednesday, August 5, 2009, 11:23 AM
> OK. I contacted several attendees who are not on the numpy list, and
> it looks like we've got six or seven people interested.
>
> I've never been to the conference or organized a session like this.
> Any guidance?
>
> Chris
>
> On Aug 5, 2009, at 12:12 PM, David Goldsmith wrote:
>
> > I already replied to OP, but I'll say publicly:
> >
> > "+1", as long as it's not at the same time as the as-yet-potential
> > BoF on "the Future of SciPy".
> >
> > DG
> >
> > --- On Wed, 8/5/09, Daniel Wheeler wrote:
> >
> >> From: Daniel Wheeler
> >> Subject: Re: [Numpy-discussion] PDE BoF at SciPy2009
> >> To: "Discussion of Numerical Python"
> >> Date: Wednesday, August 5, 2009, 7:20 AM
> >> On Mon, Aug 3, 2009 at 3:57 PM, Chris Kees wrote:
> >>> Is there any interest in a BoF session on implementing numerical
> >>> methods for partial differential equations using modules like
> >>> numpy, cython, mpi4py, etc.?
> >>
> >> Yes! My colleague, Jon Guyer, will be attending the meeting and
> >> speaking on this subject. He isn't on this list. He will be there
> >> from midday on the Wednesday of the conference. Is this BoF still
> >> of interest?
> >>
> >> --
> >> Daniel Wheeler
> >> _______________________________________________
> >> NumPy-Discussion mailing list
> >> NumPy-Discussion at scipy.org
> >> http://mail.scipy.org/mailman/listinfo/numpy-discussion
> >>
> >
> >
> >
> > _______________________________________________
> > NumPy-Discussion mailing list
> > NumPy-Discussion at scipy.org
> > http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>

From charlesr.harris at gmail.com Wed Aug 5 14:47:16 2009
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Wed, 5 Aug 2009 12:47:16 -0600
Subject: [Numpy-discussion] BOF c coders.
Message-ID: 

Hi All,

At the present time David C. and I are doing most of the work in the
numpy c code base. I am wondering if there are more people out there
who might want to get involved in that end of things and if there are
ways we can help them get started. If folks are interested we could
have a BOF meeting at the SciPy conference.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From pgmdevlist at gmail.com Wed Aug 5 15:11:46 2009
From: pgmdevlist at gmail.com (Pierre GM)
Date: Wed, 5 Aug 2009 15:11:46 -0400
Subject: [Numpy-discussion] Why NaN?
In-Reply-To: <4A79A34B.8070102@gmail.com>
References: <49d6b3500908040946v2a06e615t7f77bffabf22e066@mail.gmail.com>
	<49d6b3500908041140g505e9a5csdffafb420b79b4ca@mail.gmail.com>
	<4A79A34B.8070102@gmail.com>
Message-ID: 

And, er... masked arrays anyone?

On Aug 5, 2009, at 11:20 AM, Bruce Southey wrote:

> On 08/05/2009 09:18 AM, Keith Goodman wrote:
>>
>> On Wed, Aug 5, 2009 at 1:40 AM, Bruce Southey wrote:
>>
>>> On Tue, Aug 4, 2009 at 4:05 PM, Keith Goodman wrote:
>>>
>>>> On Tue, Aug 4, 2009 at 1:53 PM, Bruce Southey wrote:
>>>>
>>>>> On Tue, Aug 4, 2009 at 1:40 PM, Gökhan Sever wrote:
>>>>>
>>>>>> This is the loveliest of all solutions:
>>>>>>
>>>>>> c[isfinite(c)].mean()
>>>>>>
>>>>> This handling of nonfinite elements has come up before.
>>>>> Please remember that this is only for 1d or flattened arrays, so
>>>>> it does not work in general, especially along an axis.
>>>>>
>>>> If you don't want to use nanmean from scipy.stats you could use:
>>>>
>>>> np.nansum(c, axis=0) / (~np.isnan(c)).sum(axis=0)
>>>>
>>>> or
>>>>
>>>> np.nansum(c, axis=0) / (c == c).sum(axis=0)
>>>>
>>>> But if c contains ints then you'll run into trouble with the
>>>> division, so you'll need to protect against that.
>>>>
>>> That is not a problem because nan and infinity are only defined for
>>> floating point numbers, not integers. So any array that has
>>> nonfinite elements like nans and infinity must have a floating
>>> point dtype.
>>>
>>
>> That is true. But I was thinking of this case (no nans or infs):
>>
>>>> c
>>>>
>> array([[1, 2, 3],
>>        [4, 5, 6]])
>>
>>>> c.mean(0)
>>>>
>> array([ 2.5,  3.5,  4.5])  <--- good
>>
>>>> np.nansum(c, axis=0) / (c == c).sum(axis=0)
>>>>
>> array([2, 3, 4])  <--- bad
>>
>>>> np.nansum(c, axis=0) / (c == c).sum(axis=0, dtype=np.float)
>>>>
>> array([ 2.5,  3.5,  4.5])  <--- good
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at scipy.org
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>
> Sure, but that is about ints versus floats and not about nans or
> infs. Your 'good' examples are really about first converting an int
> array into a float array, and your 'bad' example maintains the int
> dtype (you get the same result if you cast the arrays from the 'good'
> approaches back to an int dtype).
>
> The correct answer depends on what you want the dtype to be. For
> example, with floating point division:
> np.mean(c/0.0, axis=0)
>
> gives the expected floating point answer:
> array([ Inf,  Inf,  Inf])
>
> With integer division:
> np.mean(c/0, axis=0)
>
> gives the expected integer answer:
> array([ 0.,  0.,  0.])
>
> Note the default action of mean is to convert ints to float64, which
> is why the output is a float instead of an int, although the
> numpy.mean dtype argument does not appear to work for int dtypes.
>
>
> Bruce
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion

From robert.kern at gmail.com Wed Aug 5 15:14:28 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Wed, 5 Aug 2009 14:14:28 -0500
Subject: [Numpy-discussion] Why NaN?
In-Reply-To: 
References: <49d6b3500908040946v2a06e615t7f77bffabf22e066@mail.gmail.com>
	<49d6b3500908041140g505e9a5csdffafb420b79b4ca@mail.gmail.com>
	<4A79A34B.8070102@gmail.com>
Message-ID: <3d375d730908051214x4603fd7bve1864799b2dd4d5d@mail.gmail.com>

On Wed, Aug 5, 2009 at 14:11, Pierre GM wrote:
>
> And, er... masked arrays anyone?

That was what I suggested. The very first response, even.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

From pgmdevlist at gmail.com Wed Aug 5 15:20:22 2009
From: pgmdevlist at gmail.com (Pierre GM)
Date: Wed, 5 Aug 2009 15:20:22 -0400
Subject: [Numpy-discussion] Why NaN?
In-Reply-To: <3d375d730908051214x4603fd7bve1864799b2dd4d5d@mail.gmail.com>
References: <49d6b3500908040946v2a06e615t7f77bffabf22e066@mail.gmail.com>
	<49d6b3500908041140g505e9a5csdffafb420b79b4ca@mail.gmail.com>
	<4A79A34B.8070102@gmail.com>
	<3d375d730908051214x4603fd7bve1864799b2dd4d5d@mail.gmail.com>
Message-ID: 

On Aug 5, 2009, at 3:14 PM, Robert Kern wrote:
> On Wed, Aug 5, 2009 at 14:11, Pierre GM wrote:
>>
>> And, er... masked arrays anyone?
>
> That was what I suggested. The very first response, even.

I know, Robert, and I thank you for that. My comment was intended for
the later posters...

From kwgoodman at gmail.com Wed Aug 5 15:20:31 2009
From: kwgoodman at gmail.com (Keith Goodman)
Date: Wed, 5 Aug 2009 12:20:31 -0700
Subject: [Numpy-discussion] Why NaN?
In-Reply-To: <3d375d730908051214x4603fd7bve1864799b2dd4d5d@mail.gmail.com>
References: <49d6b3500908040946v2a06e615t7f77bffabf22e066@mail.gmail.com>
	<49d6b3500908041140g505e9a5csdffafb420b79b4ca@mail.gmail.com>
	<4A79A34B.8070102@gmail.com>
	<3d375d730908051214x4603fd7bve1864799b2dd4d5d@mail.gmail.com>
Message-ID: 

On Wed, Aug 5, 2009 at 12:14 PM, Robert Kern wrote:
> On Wed, Aug 5, 2009 at 14:11, Pierre GM wrote:
>>
>> And, er... masked arrays anyone?
>
> That was what I suggested. The very first response, even.
>
> --
> Robert Kern
>
> "I have come to believe that the whole world is an enigma, a harmless
> enigma that is made terrible by our own mad attempt to interpret it as
> though it had an underlying truth."
>   -- Umberto Eco

Is the enigma in your sig enough to invoke Godwin's Law on this thread?

From d_l_goldsmith at yahoo.com Wed Aug 5 15:26:32 2009
From: d_l_goldsmith at yahoo.com (David Goldsmith)
Date: Wed, 5 Aug 2009 12:26:32 -0700 (PDT)
Subject: [Numpy-discussion] BOF c coders.
In-Reply-To: 
Message-ID: <890670.91013.qm@web52107.mail.re2.yahoo.com>

So far, no one's proposed a BoF I wouldn't be interested in attending.
:-) (except for the fact that at least some will have to overlap, yes?
:-( ).

DG

--- On Wed, 8/5/09, Charles R Harris wrote:

> From: Charles R Harris
> Subject: [Numpy-discussion] BOF c coders.
> To: "numpy-discussion"
> Date: Wednesday, August 5, 2009, 11:47 AM
> Hi All,
>
> At the present time David C. and I are doing most of the work in the
> numpy c code base. I am wondering if there are more people out there
> who might want to get involved in that end of things and if there are
> ways we can help them get started. If folks are interested we could
> have a BOF meeting at the SciPy conference.
>
> Chuck
>
> -----Inline Attachment Follows-----
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>

From geometrian at gmail.com Wed Aug 5 15:39:49 2009
From: geometrian at gmail.com (Ian Mallett)
Date: Wed, 5 Aug 2009 12:39:49 -0700
Subject: [Numpy-discussion] GPU Numpy
In-Reply-To: 
References: 
Message-ID: 

On Wed, Aug 5, 2009 at 11:34 AM, Charles R Harris wrote:

> It could be that you could slip in a small mod that would do what you
> want.

I'll help, if you want. I'm good with GPUs, and I'd appreciate the
numerical power it would afford.

> The main problems with using GPUs were that CUDA was only available for
> nvidia video cards and there didn't seem to be any hope for a CUDA version
> of LAPACK.

You don't have to use CUDA, although it would make it easier.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From pfeldman at verizon.net Wed Aug 5 15:57:43 2009
From: pfeldman at verizon.net (Dr. Phillip M. Feldman)
Date: Wed, 5 Aug 2009 12:57:43 -0700 (PDT)
Subject: [Numpy-discussion] maximum value and corresponding index
Message-ID: <24834930.post@talk.nabble.com>

With Python/NumPy, is there a way to get the maximum element of an array
and also the index of the element having that value, at a single shot?
(One can do this in Matlab via a statement like the following:
[x_max,ndx] = max(x).)
-- 
View this message in context: http://www.nabble.com/maximum-value-and-corresponding-index-tp24834930p24834930.html
Sent from the Numpy-discussion mailing list archive at Nabble.com.

From robert.kern at gmail.com Wed Aug 5 15:59:44 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Wed, 5 Aug 2009 14:59:44 -0500
Subject: [Numpy-discussion] maximum value and corresponding index
In-Reply-To: <24834930.post@talk.nabble.com>
References: <24834930.post@talk.nabble.com>
Message-ID: <3d375d730908051259u5fa67a68wdf9734f005148519@mail.gmail.com>

On Wed, Aug 5, 2009 at 14:57, Dr. Phillip M. Feldman wrote:
>
> With Python/NumPy, is there a way to get the maximum element of an array
> and also the index of the element having that value, at a single shot?

Not in one shot.

maxi = x.argmax()
maxv = x[maxi]

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

From pfeldman at verizon.net Wed Aug 5 16:01:51 2009
From: pfeldman at verizon.net (Dr. Phillip M. Feldman)
Date: Wed, 5 Aug 2009 13:01:51 -0700 (PDT)
Subject: [Numpy-discussion] improved NumPy support for boolean arrays?
Message-ID: <24835199.post@talk.nabble.com>

Although I've used Matlab for many years and am quite new to Python, I'm
already convinced that the Python/NumPy combination is more powerful and
flexible than the Matlab base, and that it generally takes less Python
code to get the same job done. There is, however, at least one thing that
is much cleaner in Matlab -- operations on boolean arrays. If x and y are
numpy arrays of bools, I'd like to be able to create expressions like the
following:

not x (to invert each element of x)
x and y
x or y
x xor y
(not x) or y

The usual array broadcasting rules should apply. Is there any chance of
getting something like this into NumPy?
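(For reference, the replies that follow point out that elementwise
equivalents for all of these already exist; a minimal sketch, assuming
x and y are boolean numpy arrays:

import numpy as np

x = np.array([True, True, False, False])
y = np.array([True, False, True, False])

~x        # elementwise not -> [False False  True  True]
x & y     # elementwise and -> [ True False False False]
x | y     # elementwise or  -> [ True  True  True False]
x ^ y     # elementwise xor -> [False  True  True False]
(~x) | y  #                 -> [ True False  True  True]

The named ufuncs numpy.logical_not, logical_and, logical_or and
logical_xor behave the same way and broadcast in the usual fashion.)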
-- 
View this message in context: http://www.nabble.com/improved-NumPy-support-for-boolean-arrays--tp24835199p24835199.html
Sent from the Numpy-discussion mailing list archive at Nabble.com.

From robert.kern at gmail.com Wed Aug 5 16:04:10 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Wed, 5 Aug 2009 15:04:10 -0500
Subject: [Numpy-discussion] improved NumPy support for boolean arrays?
In-Reply-To: <24835199.post@talk.nabble.com>
References: <24835199.post@talk.nabble.com>
Message-ID: <3d375d730908051304r3d5a843g3e38005ba3e3824d@mail.gmail.com>

On Wed, Aug 5, 2009 at 15:01, Dr. Phillip M. Feldman wrote:
>
> Although I've used Matlab for many years and am quite new to Python, I'm
> already convinced that the Python/NumPy combination is more powerful and
> flexible than the Matlab base, and that it generally takes less Python
> code to get the same job done. There is, however, at least one thing that
> is much cleaner in Matlab -- operations on boolean arrays. If x and y are
> numpy arrays of bools, I'd like to be able to create expressions like the
> following:
>
> not x (to invert each element of x)

~x

> x and y

x & y

> x or y

x | y

> x xor y

x ^ y

> (not x) or y

(~x) | y

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

From d_l_goldsmith at yahoo.com Wed Aug 5 16:06:03 2009
From: d_l_goldsmith at yahoo.com (David Goldsmith)
Date: Wed, 5 Aug 2009 13:06:03 -0700 (PDT)
Subject: [Numpy-discussion] maximum value and corresponding index
In-Reply-To: <3d375d730908051259u5fa67a68wdf9734f005148519@mail.gmail.com>
Message-ID: <658036.17782.qm@web52108.mail.re2.yahoo.com>

But you can "cheat" and put them on one line (if that's all you're after):

>>> x = np.array([1, 2, 3])
>>> maxi = x.argmax(); maxv = x[maxi]
>>> maxi, maxv
(2, 3)

DG

--- On Wed, 8/5/09, Robert Kern wrote:

> From: Robert Kern
> Subject: Re: [Numpy-discussion] maximum value and corresponding index
> To: "Discussion of Numerical Python"
> Date: Wednesday, August 5, 2009, 12:59 PM
> On Wed, Aug 5, 2009 at 14:57, Dr. Phillip M. Feldman wrote:
> >
> > With Python/NumPy, is there a way to get the maximum element of an
> > array and also the index of the element having that value, at a
> > single shot?
>
> Not in one shot.
>
> maxi = x.argmax()
> maxv = x[maxi]
>
> --
> Robert Kern
>
> "I have come to believe that the whole world is an enigma, a harmless
> enigma that is made terrible by our own mad attempt to interpret it as
> though it had an underlying truth."
>   -- Umberto Eco
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>

From josef.pktd at gmail.com Wed Aug 5 16:09:25 2009
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Wed, 5 Aug 2009 16:09:25 -0400
Subject: [Numpy-discussion] improved NumPy support for boolean arrays?
In-Reply-To: <3d375d730908051304r3d5a843g3e38005ba3e3824d@mail.gmail.com>
References: <24835199.post@talk.nabble.com>
	<3d375d730908051304r3d5a843g3e38005ba3e3824d@mail.gmail.com>
Message-ID: <1cd32cbb0908051309h2c78f746o5fb90e23f85c029b@mail.gmail.com>

On Wed, Aug 5, 2009 at 4:04 PM, Robert Kern wrote:
> On Wed, Aug 5, 2009 at 15:01, Dr. Phillip M.
> Feldman wrote:
>>
>> Although I've used Matlab for many years and am quite new to Python, I'm
>> already convinced that the Python/NumPy combination is more powerful and
>> flexible than the Matlab base, and that it generally takes less Python code
>> to get the same job done. There is, however, at least one thing that is much
>> cleaner in Matlab -- operations on boolean arrays. If x and y are numpy
>> arrays of bools, I'd like to be able to create expressions like the
>> following:
>>
>> not x (to invert each element of x)
>
> ~x
>
>> x and y
>
> x & y
>
>> x or y
>
> x | y
>
>> x xor y
>
> x ^ y
>
>> (not x) or y
>
> (~x) | y

See also logical_and, logical_or, logical_not, logical_xor

Josef

>
> --
> Robert Kern
>
> "I have come to believe that the whole world is an enigma, a harmless
> enigma that is made terrible by our own mad attempt to interpret it as
> though it had an underlying truth."
>  -- Umberto Eco
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>

From sturla at molden.no  Wed Aug  5 17:02:04 2009
From: sturla at molden.no (Sturla Molden)
Date: Wed, 05 Aug 2009 23:02:04 +0200
Subject: [Numpy-discussion] improved NumPy support for boolean arrays?
In-Reply-To: <24835199.post@talk.nabble.com>
References: <24835199.post@talk.nabble.com>
Message-ID: <4A79F34C.6010802@molden.no>

> If x and y are numpy
> arrays of bools, I'd like to be able to create expressions like the
> following:
>
> not x (to invert each element of x)
> x and y
> x or y
> x xor y
> (not x) or y
>
> The usual array broadcasting rules should apply. Is there any chance of
> getting something like this into NumPy?

There is a reason for this related to Python. In Python an object will
often have a boolean truth value. How would you cast an ndarray to bool?
If you write something like (x and y), the Python interpreter expects
this to evaluate to True or False. Thus it cannot evaluate to an ndarray
of booleans. NumPy cannot change the syntax of Python.

Another thing: An empty list evaluates to False in a boolean context,
whereas a non-empty list evaluates to True. ndarrays behave differently.
Why?

Sturla Molden

From robert.kern at gmail.com  Wed Aug  5 17:11:39 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Wed, 5 Aug 2009 16:11:39 -0500
Subject: [Numpy-discussion] improved NumPy support for boolean arrays?
In-Reply-To: <4A79F34C.6010802@molden.no>
References: <24835199.post@talk.nabble.com>
	<4A79F34C.6010802@molden.no>
Message-ID: <3d375d730908051411u3bbf9522u74ef71a287b09615@mail.gmail.com>

On Wed, Aug 5, 2009 at 16:02, Sturla Molden wrote:
>
>> If x and y are numpy
>> arrays of bools, I'd like to be able to create expressions like the
>> following:
>>
>> not x (to invert each element of x)
>> x and y
>> x or y
>> x xor y
>> (not x) or y
>>
>> The usual array broadcasting rules should apply. Is there any chance of
>> getting something like this into NumPy?
> There is a reason for this related to Python. In Python an object will
> often have a boolean truth value. How would you cast an ndarray to bool?
> If you write something like (x and y), the Python interpreter expects
> this to evaluate to True or False. Thus it cannot evaluate to an ndarray
> of booleans. NumPy cannot change the syntax of Python.
>
> Another thing: An empty list evaluates to False in a boolean context,
> whereas a non-empty list evaluates to True. ndarrays behave differently.
> Why?
Numeric used to evaluate bool(some_array) as True if any of the elements
were nonzero and False if all of them were zero. This confused some people
who expected bool(some_array) to be True iff *all* of the elements were
nonzero and False otherwise. People had bugs in their code for years
without realizing it. They would try one example, get their expected
result, and not test the other corner cases that would demonstrate that
their mental model of what was going on was incorrect.

By the time that numarray was being designed, the numarray team decided to
make array always raise an exception instead of returning any truth value.
numpy followed this decision.

There really aren't many use cases for following the list object's
semantics with arrays. Empty arrays aren't nearly as common as empty lists
or even tuples. I know of no case where it is useful to test specifically
for emptiness versus non-emptiness. In any case, directly checking the
.size or .shape attributes would be sufficient and far more clear because
there are other plausible interpretations of bool(some_array) like
Numeric's.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
 -- Umberto Eco

From trevor at notcows.com  Wed Aug  5 17:20:05 2009
From: trevor at notcows.com (Trevor Clarke)
Date: Wed, 5 Aug 2009 17:20:05 -0400
Subject: [Numpy-discussion] GPU Numpy
In-Reply-To:
References:
Message-ID: <7bde5d400908051420i75a29fdexe923b104f428449e@mail.gmail.com>

With OpenCL implementations making their way into the wild, that's
probably a better target than CUDA.

On Wed, Aug 5, 2009 at 3:39 PM, Ian Mallett wrote:
> On Wed, Aug 5, 2009 at 11:34 AM, Charles R Harris <
> charlesr.harris at gmail.com> wrote:
>
>> It could be you could slip in a small mod that would do what you want.
>
> I'll help, if you want. I'm good with GPUs, and I'd appreciate the
> numerical power it would afford.
>
>> The main problems with using GPUs were that CUDA was only available for
>> nvidia video cards and there didn't seem to be any hope for a CUDA version
>> of LAPACK.
>
> You don't have to use CUDA, although it would make it easier.
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From dwf at cs.toronto.edu  Wed Aug  5 18:13:59 2009
From: dwf at cs.toronto.edu (David Warde-Farley)
Date: Wed, 5 Aug 2009 18:13:59 -0400
Subject: [Numpy-discussion] GPU Numpy
In-Reply-To:
References:
Message-ID: <351A1A80-FC2D-4410-95B0-955121A74CA5@cs.toronto.edu>

A friend of mine wrote a simple wrapper around CUBLAS using ctypes that
basically exposes a Python class that keeps a 2D array of single-precision
floats on the GPU for you, and lets you operate on it from Python.
I keep telling him to release it, but he thinks it's too hackish.

It did inspire some of our colleagues in Montreal to create this, though:

http://code.google.com/p/cuda-ndarray/

I gather it is VERY early in development, but I'm sure they'd love
contributions!

David

On 5-Aug-09, at 6:45 AM, Romain Brette wrote:
> Hi everyone,
>
> I was wondering if you had any plan to incorporate some GPU support
> to numpy, or perhaps as a separate module.
> What I have in mind is something that would mimick the syntax of numpy
> arrays, with a new dtype (gpufloat), like this:
>
> from gpunumpy import *
> x=zeros(100,dtype='gpufloat') # Creates an array of 100 elements on the GPU
> y=ones(100,dtype='gpufloat')
> z=exp(2*x+y) # z is on the GPU, all operations on GPU with no transfer
> z_cpu=array(z,dtype='float') # z is copied to the CPU
> i=(z>2.3).nonzero()[0] # operation on GPU, returns a CPU integer array
>
> There is a library named GPULib (http://www.txcorp.com/products/GPULib/)
> that does similar things, but unfortunately they don't support Python (I
> think their main Python developer left).
> I think this would be very useful for many people. For our project (a
> neural network simulator, http://www.briansimulator.org) we use PyCuda
> (http://mathema.tician.de/software/pycuda)

Neat project, though at first I was sure that was a typo :)

"He can't be simulating Brians...."

- David

From fperez.net at gmail.com  Wed Aug  5 18:20:06 2009
From: fperez.net at gmail.com (Fernando Perez)
Date: Wed, 5 Aug 2009 15:20:06 -0700
Subject: [Numpy-discussion] Wiki page: food options at the SciPy'09 conference
Message-ID:

Hi all,

this is a message mostly for those attending the conference who know
Caltech and its surroundings well. We've created a page to list
easy-to-access food options from the campus, but I don't really know
what to put there. Anyone who has some knowledge of local options is
welcome to add to this wiki page, and will earn the gratitude of all
attendees:

http://conference.scipy.org/food

Thanks!

Cheers,

f

From olivier.grisel at ensta.org  Wed Aug  5 19:42:46 2009
From: olivier.grisel at ensta.org (Olivier Grisel)
Date: Thu, 6 Aug 2009 01:42:46 +0200
Subject: [Numpy-discussion] GPU Numpy
In-Reply-To: <351A1A80-FC2D-4410-95B0-955121A74CA5@cs.toronto.edu>
References: <351A1A80-FC2D-4410-95B0-955121A74CA5@cs.toronto.edu>
Message-ID:

OpenCL is definitely the way to go for a cross platform solution, with
both nvidia and AMD having released beta runtimes to their respective
developer networks (free as in beer subscription required for the beta
download pages). Final public releases are to be expected around 2009 Q3.

OpenCL is an open, royalty free, standardized API and runtime
specification for heterogeneous platforms with a mix of CPU and GPU
cores. The nvidia implementation is based on the CUDA runtime, and
programming OpenCL is very similar to programming in C for CUDA. The
developer of PyCUDA is also working on PyOpenCL:

http://pypi.python.org/pypi/pyopencl/

Both nvidia and AMD use llvm to compile the OpenCL cross-platform kernel
sources into device specific binaries loaded at runtime.

Official OpenCL specs:
http://www.khronos.org/registry/cl/specs/opencl-1.0.29.pdf
Wikipedia page: http://en.wikipedia.org/wiki/OpenCL
nvidia runtime: http://www.nvidia.com/object/cuda_opencl.html
AMD runtime (only working with x86 and x86_64 with SSE3 for now):
http://developer.amd.com/GPU/ATISTREAMSDKBETAPROGRAM/Pages/default.aspx

Intel and IBM were also members of the standards committee, so we can
reasonably expect runtimes for their chips in the future (e.g. Larrabee
and Cell BE).
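To give a feel for the programming model, here is roughly what a trivial
elementwise kernel looks like through PyOpenCL. This is an untested sketch
in the style of the project's examples; the exact call signatures may
differ between the beta releases:

import numpy as np
import pyopencl as cl

a = np.random.rand(50000).astype(np.float32)

ctx = cl.create_some_context()   # picks whatever device the runtime offers
queue = cl.CommandQueue(ctx)
mf = cl.mem_flags
a_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=a)
out_buf = cl.Buffer(ctx, mf.WRITE_ONLY, a.nbytes)

# The kernel source is compiled at runtime, for the device at hand.
prg = cl.Program(ctx, """
__kernel void twice(__global const float *a, __global float *out)
{
    int gid = get_global_id(0);
    out[gid] = 2.0f * a[gid];
}
""").build()

prg.twice(queue, a.shape, None, a_buf, out_buf)  # None: runtime picks group size
result = np.empty_like(a)
cl.enqueue_read_buffer(queue, out_buf, result).wait()
print np.allclose(result, 2 * a)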
--
Olivier
http://twitter.com/ogrisel - http://code.oliviergrisel.name

From david at ar.media.kyoto-u.ac.jp  Wed Aug  5 22:32:53 2009
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Thu, 06 Aug 2009 11:32:53 +0900
Subject: [Numpy-discussion] GPU Numpy
In-Reply-To:
References: <351A1A80-FC2D-4410-95B0-955121A74CA5@cs.toronto.edu>
Message-ID: <4A7A40D5.7050308@ar.media.kyoto-u.ac.jp>

Olivier Grisel wrote:
> OpenCL is definitely the way to go for a cross platform solution, with
> both nvidia and AMD having released beta runtimes to their respective
> developer networks (free as in beer subscription required for the beta
> download pages). Final public releases are to be expected around 2009 Q3.
>

What's the status of opencl on windows ? Will MS have its own direct-x
specific implementation ?

cheers,

David

From olivier.grisel at ensta.org  Thu Aug  6 01:44:16 2009
From: olivier.grisel at ensta.org (Olivier Grisel)
Date: Thu, 6 Aug 2009 07:44:16 +0200
Subject: [Numpy-discussion] GPU Numpy
In-Reply-To: <4A7A40D5.7050308@ar.media.kyoto-u.ac.jp>
References: <351A1A80-FC2D-4410-95B0-955121A74CA5@cs.toronto.edu>
	<4A7A40D5.7050308@ar.media.kyoto-u.ac.jp>
Message-ID:

2009/8/6 David Cournapeau :
> Olivier Grisel wrote:
>> OpenCL is definitely the way to go for a cross platform solution, with
>> both nvidia and AMD having released beta runtimes to their respective
>> developer networks (free as in beer subscription required for the beta
>> download pages). Final public releases are to be expected around 2009 Q3.
>>
>
> What's the status of opencl on windows ? Will MS have its own direct-x
> specific implementation ?

As usual, MS reinvents the wheel with DirectX Compute, but vendors such
as AMD and nvidia propose both the OpenCL API + runtime binaries for
windows and their DirectX Compute counterpart, based on mostly the
same underlying implementation, e.g. CUDA in nvidia's case.

--
Olivier
http://twitter.com/ogrisel - http://code.oliviergrisel.name

From sturla at molden.no  Thu Aug  6 03:32:25 2009
From: sturla at molden.no (Sturla Molden)
Date: Thu, 06 Aug 2009 09:32:25 +0200
Subject: [Numpy-discussion] GPU Numpy
In-Reply-To:
References: <351A1A80-FC2D-4410-95B0-955121A74CA5@cs.toronto.edu>
	<4A7A40D5.7050308@ar.media.kyoto-u.ac.jp>
Message-ID: <4A7A8709.1070306@molden.no>

Olivier Grisel wrote:
> As usual, MS reinvents the wheel with DirectX Compute, but vendors such
> as AMD and nvidia propose both the OpenCL API + runtime binaries for
> windows and their DirectX Compute counterpart, based on mostly the
> same underlying implementation, e.g. CUDA in nvidia's case.
>

Here is a DirectX Compute tutorial I found:

http://www.gamedev.net/community/forums/topic.asp?topic_id=516043

It pretty much says all we need to know. I am not investing any of my
time learning that shitty API. Period. Let's just hope OpenCL makes it
to Windows without Microsoft breaking it for "security reasons" (as they
did with OpenGL).
Sturla

From meine at informatik.uni-hamburg.de  Thu Aug  6 04:21:58 2009
From: meine at informatik.uni-hamburg.de (Hans Meine)
Date: Thu, 6 Aug 2009 10:21:58 +0200
Subject: [Numpy-discussion] maximum value and corresponding index
In-Reply-To: <658036.17782.qm@web52108.mail.re2.yahoo.com>
References: <658036.17782.qm@web52108.mail.re2.yahoo.com>
Message-ID: <200908061021.58820.meine@informatik.uni-hamburg.de>

On Wednesday 05 August 2009 22:06:03 David Goldsmith wrote:
> But you can "cheat" and put them on one line (if that's all you're after):
> >>> x = np.array([1, 2, 3])
> >>> maxi = x.argmax(); maxv = x[maxi]

Is there any reason not to put this as a convenience function into numpy?
It is needed so frequently, and it's a shame that the shortest solution
traverses the array twice (the above is usually longer due to variable names,
and/or >1 dimensions).

def give_me_a_good_name(array):
    pos = array.argmax()
    val = array[unravel_index(pos, array.shape)]
    return pos, val

The name should not be too long, maybe "findmax"?  I don't see good
predecessors to learn from ATM.

OTOH, a minmax() would be more pressing, since there is no shortcut yet
AFAICS.

Ciao,
  Hans

From romain.brette at ens.fr  Thu Aug  6 04:32:31 2009
From: romain.brette at ens.fr (Romain Brette)
Date: Thu, 6 Aug 2009 08:32:31 +0000 (UTC)
Subject: [Numpy-discussion] GPU Numpy
References:
Message-ID:

Charles R Harris <charlesr.harris at gmail.com> writes:
>
> What sort of functionality are you looking for? It could be you could slip
> in a small mod that would do what you want. In the larger picture, the use
> of GPUs has been discussed on the list several times going back at least a
> year. The main problems with using GPUs were that CUDA was only available
> for nvidia video cards and there didn't seem to be any hope for a CUDA
> version of LAPACK. Chuck
>

So for our project what we need is:
* element-wise operations on vectors (arithmetical, exp/log, exponentiation)
* same but on views (x[2:7])
* assignment (x[:]=2*y)
* boolean operations on vectors (x>2.5) and the nonzero() method
* possibly, multiplying a N*M matrix by an M*M matrix, where N is large and M
is small (but this could be done with vector operations).
* random number generation would be great too (gpurand(N))

What is very important to me is that the syntax be the same as with normal
arrays, so that you could easily switch the GPU on/off (depending on whether a
GPU was detected).

Cheers
Romain

From romain.brette at ens.fr  Thu Aug  6 04:39:13 2009
From: romain.brette at ens.fr (Romain Brette)
Date: Thu, 6 Aug 2009 08:39:13 +0000 (UTC)
Subject: [Numpy-discussion] GPU Numpy
References:
Message-ID:

Ian Mallett <geometrian at gmail.com> writes:
>
> On Wed, Aug 5, 2009 at 11:34 AM, Charles R Harris
> <charlesr.harris at gmail.com> wrote:
>
> It could be you could slip in a small mod that would do what you want.
> I'll help, if you want. I'm good with GPUs, and I'd appreciate the
> numerical power it would afford.

That would be great actually if we could gather a little team!
Anyone else interested?
As Trevor said, OpenCL could be a better choice than Cuda.
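To make the on/off switching concrete, I picture something as simple as
this at import time (gpunumpy is hypothetical, of course):

try:
    from gpunumpy import *   # hypothetical drop-in module, same interface
    have_gpu = True
except ImportError:
    from numpy import *      # no GPU or no driver: fall back silently
    have_gpu = False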
By the way, there is a Matlab toolbox that seems to have similar functionality: http://www.accelereyes.com/ Cheers Romain From romain.brette at ens.fr Thu Aug 6 04:43:52 2009 From: romain.brette at ens.fr (Romain Brette) Date: Thu, 6 Aug 2009 08:43:52 +0000 (UTC) Subject: [Numpy-discussion] GPU Numpy References: <351A1A80-FC2D-4410-95B0-955121A74CA5@cs.toronto.edu> Message-ID: David Warde-Farley cs.toronto.edu> writes: > It did inspire some of our colleagues in Montreal to create this, > though: > > http://code.google.com/p/cuda-ndarray/ > > I gather it is VERY early in development, but I'm sure they'd love > contributions! > Hi David, That does look quite close to what I imagined, probably a good start then! Romain From robert.kern at gmail.com Thu Aug 6 11:27:51 2009 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 6 Aug 2009 10:27:51 -0500 Subject: [Numpy-discussion] maximum value and corresponding index In-Reply-To: <200908061021.58820.meine@informatik.uni-hamburg.de> References: <658036.17782.qm@web52108.mail.re2.yahoo.com> <200908061021.58820.meine@informatik.uni-hamburg.de> Message-ID: <3d375d730908060827t4fdf5a60h1675d2d70b175535@mail.gmail.com> 2009/8/6 Hans Meine : > On Wednesday 05 August 2009 22:06:03 David Goldsmith wrote: >> But you can "cheat" and put them on one line (if that's all you're after): >> >>> x = np.array([1, 2, 3]) >> >>> maxi = x.argmax(); maxv = x[maxi] > > Is there any reason not to put this as a convenience function into numpy? > It is needed so frequently, and it's a shame that the shortest solution > traverses the array twice (the above is usually longer due to variable names, > and/or >1 dimensions). The array is only traversed once. Indexing is O(1). I'm -1 on adding a function to do this. If you want to keep that convenience function in your own utilities, great. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From josef.pktd at gmail.com Thu Aug 6 11:55:58 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 6 Aug 2009 11:55:58 -0400 Subject: [Numpy-discussion] add axis to results of reduction (mean, min, ...) Message-ID: <1cd32cbb0908060855v3fec4524rfed715d60c741a0b@mail.gmail.com> What's the best way of getting back the correct shape to be able to broadcast, mean, min,.. to the original array, that works for arbitrary dimension and axis? I thought I have seen some helper functions, but I don't find them anymore? Josef >>> a array([[1, 2, 3, 3, 0], [2, 2, 3, 2, 1]]) >>> a-a.max(0) array([[-1, 0, 0, 0, -1], [ 0, 0, 0, -1, 0]]) >>> a-a.max(1) Traceback (most recent call last): File "", line 1, in a-a.max(1) ValueError: shape mismatch: objects cannot be broadcast to a single shape >>> a-a.max(1)[:,None] array([[-2, -1, 0, 0, -3], [-1, -1, 0, -1, -2]]) From kwgoodman at gmail.com Thu Aug 6 12:03:46 2009 From: kwgoodman at gmail.com (Keith Goodman) Date: Thu, 6 Aug 2009 09:03:46 -0700 Subject: [Numpy-discussion] add axis to results of reduction (mean, min, ...) In-Reply-To: <1cd32cbb0908060855v3fec4524rfed715d60c741a0b@mail.gmail.com> References: <1cd32cbb0908060855v3fec4524rfed715d60c741a0b@mail.gmail.com> Message-ID: On Thu, Aug 6, 2009 at 8:55 AM, wrote: > What's the best way of getting back the correct shape to be able to > broadcast, mean, min,.. to the original array, that works for > arbitrary dimension and axis? 
> > I thought I have seen some helper functions, but I don't find them anymore? > > Josef > >>>> a > array([[1, 2, 3, 3, 0], > ? ? ? [2, 2, 3, 2, 1]]) >>>> a-a.max(0) > array([[-1, ?0, ?0, ?0, -1], > ? ? ? [ 0, ?0, ?0, -1, ?0]]) >>>> a-a.max(1) > Traceback (most recent call last): > ?File "", line 1, in > ? ?a-a.max(1) > ValueError: shape mismatch: objects cannot be broadcast to a single shape >>>> a-a.max(1)[:,None] > array([[-2, -1, ?0, ?0, -3], > ? ? ? [-1, -1, ?0, -1, -2]]) Would this do it? >> pylab.demean?? Type: function Base Class: String Form: Namespace: Interactive File: /usr/lib/python2.6/dist-packages/matplotlib/mlab.py Definition: pylab.demean(x, axis=0) Source: def demean(x, axis=0): "Return x minus its mean along the specified axis" x = np.asarray(x) if axis: ind = [slice(None)] * axis ind.append(np.newaxis) return x - x.mean(axis)[ind] return x - x.mean(axis) From meine at informatik.uni-hamburg.de Thu Aug 6 12:05:36 2009 From: meine at informatik.uni-hamburg.de (Hans Meine) Date: Thu, 6 Aug 2009 18:05:36 +0200 Subject: [Numpy-discussion] maximum value and corresponding index In-Reply-To: <3d375d730908060827t4fdf5a60h1675d2d70b175535@mail.gmail.com> References: <658036.17782.qm@web52108.mail.re2.yahoo.com> <200908061021.58820.meine@informatik.uni-hamburg.de> <3d375d730908060827t4fdf5a60h1675d2d70b175535@mail.gmail.com> Message-ID: <200908061805.36420.meine@informatik.uni-hamburg.de> On Thursday 06 August 2009 17:27:51 Robert Kern wrote: > 2009/8/6 Hans Meine : > > On Wednesday 05 August 2009 22:06:03 David Goldsmith wrote: > >> But you can "cheat" and put them on one line (if that's all you're after): > >> >>> x = np.array([1, 2, 3]) > >> >>> maxi = x.argmax(); maxv = x[maxi] > > > > Is there any reason not to put this as a convenience function into numpy? > > It is needed so frequently, and it's a shame that the shortest solution > > traverses the array twice [...] > > The array is only traversed once. Indexing is O(1). Yes, but I wrote "the shortest solution", which is to call .argmax() and .max(), since.. > > the above is usually longer due to variable names, and/or >1 dimensions which requires unravel_index() (note that flattening is also inefficient when the array is strided). Never mind, Hans From robert.kern at gmail.com Thu Aug 6 12:07:14 2009 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 6 Aug 2009 11:07:14 -0500 Subject: [Numpy-discussion] add axis to results of reduction (mean, min, ...) In-Reply-To: References: <1cd32cbb0908060855v3fec4524rfed715d60c741a0b@mail.gmail.com> Message-ID: <3d375d730908060907g10ba8374i73c9747abcc10392@mail.gmail.com> On Thu, Aug 6, 2009 at 11:03, Keith Goodman wrote: > On Thu, Aug 6, 2009 at 8:55 AM, wrote: >> What's the best way of getting back the correct shape to be able to >> broadcast, mean, min,.. to the original array, that works for >> arbitrary dimension and axis? >> >> I thought I have seen some helper functions, but I don't find them anymore? >> >> Josef >> >>>>> a >> array([[1, 2, 3, 3, 0], >> ? ? ? [2, 2, 3, 2, 1]]) >>>>> a-a.max(0) >> array([[-1, ?0, ?0, ?0, -1], >> ? ? ? [ 0, ?0, ?0, -1, ?0]]) >>>>> a-a.max(1) >> Traceback (most recent call last): >> ?File "", line 1, in >> ? ?a-a.max(1) >> ValueError: shape mismatch: objects cannot be broadcast to a single shape >>>>> a-a.max(1)[:,None] >> array([[-2, -1, ?0, ?0, -3], >> ? ? ? [-1, -1, ?0, -1, -2]]) > > Would this do it? > >>> pylab.demean?? > Type: ? ? ? ? ? function > Base Class: ? ? > String Form: ? ? > Namespace: ? ? ?Interactive > File: ? ? ? ? ? 
/usr/lib/python2.6/dist-packages/matplotlib/mlab.py > Definition: ? ? pylab.demean(x, axis=0) > Source: > def demean(x, axis=0): > ? ?"Return x minus its mean along the specified axis" > ? ?x = np.asarray(x) > ? ?if axis: > ? ? ? ?ind = [slice(None)] * axis > ? ? ? ?ind.append(np.newaxis) > ? ? ? ?return x - x.mean(axis)[ind] > ? ?return x - x.mean(axis) Ouch! That doesn't handle axis=-1. if axis != 0: ind = [slice(None)] * x.ndim ind[axis] = np.newaxis -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From kwgoodman at gmail.com Thu Aug 6 12:15:27 2009 From: kwgoodman at gmail.com (Keith Goodman) Date: Thu, 6 Aug 2009 09:15:27 -0700 Subject: [Numpy-discussion] add axis to results of reduction (mean, min, ...) In-Reply-To: <3d375d730908060907g10ba8374i73c9747abcc10392@mail.gmail.com> References: <1cd32cbb0908060855v3fec4524rfed715d60c741a0b@mail.gmail.com> <3d375d730908060907g10ba8374i73c9747abcc10392@mail.gmail.com> Message-ID: thanksOn Thu, Aug 6, 2009 at 9:07 AM, Robert Kern wrote: > On Thu, Aug 6, 2009 at 11:03, Keith Goodman wrote: >> On Thu, Aug 6, 2009 at 8:55 AM, wrote: >>> What's the best way of getting back the correct shape to be able to >>> broadcast, mean, min,.. to the original array, that works for >>> arbitrary dimension and axis? >>> >>> I thought I have seen some helper functions, but I don't find them anymore? >>> >>> Josef >>> >>>>>> a >>> array([[1, 2, 3, 3, 0], >>> ? ? ? [2, 2, 3, 2, 1]]) >>>>>> a-a.max(0) >>> array([[-1, ?0, ?0, ?0, -1], >>> ? ? ? [ 0, ?0, ?0, -1, ?0]]) >>>>>> a-a.max(1) >>> Traceback (most recent call last): >>> ?File "", line 1, in >>> ? ?a-a.max(1) >>> ValueError: shape mismatch: objects cannot be broadcast to a single shape >>>>>> a-a.max(1)[:,None] >>> array([[-2, -1, ?0, ?0, -3], >>> ? ? ? [-1, -1, ?0, -1, -2]]) >> >> Would this do it? >> >>>> pylab.demean?? >> Type: ? ? ? ? ? function >> Base Class: ? ? >> String Form: ? ? >> Namespace: ? ? ?Interactive >> File: ? ? ? ? ? /usr/lib/python2.6/dist-packages/matplotlib/mlab.py >> Definition: ? ? pylab.demean(x, axis=0) >> Source: >> def demean(x, axis=0): >> ? ?"Return x minus its mean along the specified axis" >> ? ?x = np.asarray(x) >> ? ?if axis: >> ? ? ? ?ind = [slice(None)] * axis >> ? ? ? ?ind.append(np.newaxis) >> ? ? ? ?return x - x.mean(axis)[ind] >> ? ?return x - x.mean(axis) > > Ouch! That doesn't handle axis=-1. > > if axis != 0: > ? ?ind = [slice(None)] * x.ndim > ? ?ind[axis] = np.newaxis Hey, didn't you warn us about the dangers of "if arr" the other day? From robert.kern at gmail.com Thu Aug 6 12:18:59 2009 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 6 Aug 2009 11:18:59 -0500 Subject: [Numpy-discussion] add axis to results of reduction (mean, min, ...) In-Reply-To: References: <1cd32cbb0908060855v3fec4524rfed715d60c741a0b@mail.gmail.com> <3d375d730908060907g10ba8374i73c9747abcc10392@mail.gmail.com> Message-ID: <3d375d730908060918h209267bfua88ba499df1d5dcb@mail.gmail.com> On Thu, Aug 6, 2009 at 11:15, Keith Goodman wrote: > thanksOn Thu, Aug 6, 2009 at 9:07 AM, Robert Kern wrote: >> On Thu, Aug 6, 2009 at 11:03, Keith Goodman wrote: >>>>> pylab.demean?? >>> Type: ? ? ? ? ? function >>> Base Class: ? ? >>> String Form: ? ? >>> Namespace: ? ? ?Interactive >>> File: ? ? ? ? ? /usr/lib/python2.6/dist-packages/matplotlib/mlab.py >>> Definition: ? ? pylab.demean(x, axis=0) >>> Source: >>> def demean(x, axis=0): >>> ? 
?"Return x minus its mean along the specified axis" >>> ? ?x = np.asarray(x) >>> ? ?if axis: >>> ? ? ? ?ind = [slice(None)] * axis >>> ? ? ? ?ind.append(np.newaxis) >>> ? ? ? ?return x - x.mean(axis)[ind] >>> ? ?return x - x.mean(axis) >> >> Ouch! That doesn't handle axis=-1. >> >> if axis != 0: >> ? ?ind = [slice(None)] * x.ndim >> ? ?ind[axis] = np.newaxis > > Hey, didn't you warn us about the dangers of "if arr" the other day? Yes, but actually that wasn't quite the problem. "if axis:" would have been fine, if a bit obscure, as long as the body was fixed to not expect axis>0. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From josef.pktd at gmail.com Thu Aug 6 12:21:26 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 6 Aug 2009 12:21:26 -0400 Subject: [Numpy-discussion] add axis to results of reduction (mean, min, ...) In-Reply-To: <3d375d730908060907g10ba8374i73c9747abcc10392@mail.gmail.com> References: <1cd32cbb0908060855v3fec4524rfed715d60c741a0b@mail.gmail.com> <3d375d730908060907g10ba8374i73c9747abcc10392@mail.gmail.com> Message-ID: <1cd32cbb0908060921j57c6006dl789293b8d69ac62d@mail.gmail.com> On Thu, Aug 6, 2009 at 12:07 PM, Robert Kern wrote: > On Thu, Aug 6, 2009 at 11:03, Keith Goodman wrote: >> On Thu, Aug 6, 2009 at 8:55 AM, wrote: >>> What's the best way of getting back the correct shape to be able to >>> broadcast, mean, min,.. to the original array, that works for >>> arbitrary dimension and axis? >>> >>> I thought I have seen some helper functions, but I don't find them anymore? >>> >>> Josef >>> >>>>>> a >>> array([[1, 2, 3, 3, 0], >>> ? ? ? [2, 2, 3, 2, 1]]) >>>>>> a-a.max(0) >>> array([[-1, ?0, ?0, ?0, -1], >>> ? ? ? [ 0, ?0, ?0, -1, ?0]]) >>>>>> a-a.max(1) >>> Traceback (most recent call last): >>> ?File "", line 1, in >>> ? ?a-a.max(1) >>> ValueError: shape mismatch: objects cannot be broadcast to a single shape >>>>>> a-a.max(1)[:,None] >>> array([[-2, -1, ?0, ?0, -3], >>> ? ? ? [-1, -1, ?0, -1, -2]]) >> >> Would this do it? >> >>>> pylab.demean?? >> Type: ? ? ? ? ? function >> Base Class: ? ? >> String Form: ? ? >> Namespace: ? ? ?Interactive >> File: ? ? ? ? ? /usr/lib/python2.6/dist-packages/matplotlib/mlab.py >> Definition: ? ? pylab.demean(x, axis=0) >> Source: >> def demean(x, axis=0): >> ? ?"Return x minus its mean along the specified axis" >> ? ?x = np.asarray(x) >> ? ?if axis: >> ? ? ? ?ind = [slice(None)] * axis >> ? ? ? ?ind.append(np.newaxis) >> ? ? ? ?return x - x.mean(axis)[ind] >> ? ?return x - x.mean(axis) > > Ouch! That doesn't handle axis=-1. > > if axis != 0: > ? ?ind = [slice(None)] * x.ndim > ? ?ind[axis] = np.newaxis > Thanks, that's it. I have seen implementation of helper functions similar to this in other packages, but I thought there is already something in numpy. I think this should be a simple helper function in numpy to avoid mistakes and complicated implementation like the one in stats.nanstd even if it's only a few lines. Josef > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." 
> ?-- Umberto Eco > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From kwgoodman at gmail.com Thu Aug 6 12:22:55 2009 From: kwgoodman at gmail.com (Keith Goodman) Date: Thu, 6 Aug 2009 09:22:55 -0700 Subject: [Numpy-discussion] add axis to results of reduction (mean, min, ...) In-Reply-To: <3d375d730908060918h209267bfua88ba499df1d5dcb@mail.gmail.com> References: <1cd32cbb0908060855v3fec4524rfed715d60c741a0b@mail.gmail.com> <3d375d730908060907g10ba8374i73c9747abcc10392@mail.gmail.com> <3d375d730908060918h209267bfua88ba499df1d5dcb@mail.gmail.com> Message-ID: On Thu, Aug 6, 2009 at 9:18 AM, Robert Kern wrote: > On Thu, Aug 6, 2009 at 11:15, Keith Goodman wrote: >> thanksOn Thu, Aug 6, 2009 at 9:07 AM, Robert Kern wrote: >>> On Thu, Aug 6, 2009 at 11:03, Keith Goodman wrote: > >>>>>> pylab.demean?? >>>> Type: ? ? ? ? ? function >>>> Base Class: ? ? >>>> String Form: ? ? >>>> Namespace: ? ? ?Interactive >>>> File: ? ? ? ? ? /usr/lib/python2.6/dist-packages/matplotlib/mlab.py >>>> Definition: ? ? pylab.demean(x, axis=0) >>>> Source: >>>> def demean(x, axis=0): >>>> ? ?"Return x minus its mean along the specified axis" >>>> ? ?x = np.asarray(x) >>>> ? ?if axis: >>>> ? ? ? ?ind = [slice(None)] * axis >>>> ? ? ? ?ind.append(np.newaxis) >>>> ? ? ? ?return x - x.mean(axis)[ind] >>>> ? ?return x - x.mean(axis) >>> >>> Ouch! That doesn't handle axis=-1. >>> >>> if axis != 0: >>> ? ?ind = [slice(None)] * x.ndim >>> ? ?ind[axis] = np.newaxis >> >> Hey, didn't you warn us about the dangers of "if arr" the other day? > > Yes, but actually that wasn't quite the problem. "if axis:" would have > been fine, if a bit obscure, as long as the body was fixed to not > expect axis>0. Oh, of course. Thanks for pointing that out. Now I can fix a bug in my own code. From pgmdevlist at gmail.com Thu Aug 6 12:27:09 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Thu, 6 Aug 2009 12:27:09 -0400 Subject: [Numpy-discussion] add axis to results of reduction (mean, min, ...) In-Reply-To: References: <1cd32cbb0908060855v3fec4524rfed715d60c741a0b@mail.gmail.com> <3d375d730908060907g10ba8374i73c9747abcc10392@mail.gmail.com> <3d375d730908060918h209267bfua88ba499df1d5dcb@mail.gmail.com> Message-ID: <1939A33F-9AC4-4137-B331-6CBDBB36F072@gmail.com> On Aug 6, 2009, at 12:22 PM, Keith Goodman wrote: > On Thu, Aug 6, 2009 at 9:18 AM, Robert Kern > wrote: >> On Thu, Aug 6, 2009 at 11:15, Keith Goodman >> wrote: >>> thanksOn Thu, Aug 6, 2009 at 9:07 AM, Robert Kern>> > wrote: >>>> On Thu, Aug 6, 2009 at 11:03, Keith Goodman >>>> wrote: >> >>>>>>> pylab.demean?? >>>>> Type: function >>>>> Base Class: >>>>> String Form: >>>>> Namespace: Interactive >>>>> File: /usr/lib/python2.6/dist-packages/matplotlib/ >>>>> mlab.py >>>>> Definition: pylab.demean(x, axis=0) >>>>> Source: >>>>> def demean(x, axis=0): >>>>> "Return x minus its mean along the specified axis" >>>>> x = np.asarray(x) >>>>> if axis: >>>>> ind = [slice(None)] * axis >>>>> ind.append(np.newaxis) >>>>> return x - x.mean(axis)[ind] >>>>> return x - x.mean(axis) FYI, there's a "anom" method for MaskedArrays that does the same thing as demean (if you can't /don't want to import mpl) From robert.kern at gmail.com Thu Aug 6 12:26:51 2009 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 6 Aug 2009 11:26:51 -0500 Subject: [Numpy-discussion] add axis to results of reduction (mean, min, ...) 
In-Reply-To: <1cd32cbb0908060921j57c6006dl789293b8d69ac62d@mail.gmail.com> References: <1cd32cbb0908060855v3fec4524rfed715d60c741a0b@mail.gmail.com> <3d375d730908060907g10ba8374i73c9747abcc10392@mail.gmail.com> <1cd32cbb0908060921j57c6006dl789293b8d69ac62d@mail.gmail.com> Message-ID: <3d375d730908060926x6523df1bl7163bf9a1475b8b9@mail.gmail.com> On Thu, Aug 6, 2009 at 11:21, wrote: > On Thu, Aug 6, 2009 at 12:07 PM, Robert Kern wrote: >> if axis != 0: >> ? ?ind = [slice(None)] * x.ndim >> ? ?ind[axis] = np.newaxis >> > > Thanks, that's it. > > I have seen implementation of helper functions similar to this in > other packages, but I thought there is already something in numpy. ?I > think this should be a simple helper function in numpy to avoid > mistakes and complicated implementation like the one in stats.nanstd > even if it's only a few lines. It would make a good contribution to numpy.lib.index_tricks. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From charlesr.harris at gmail.com Thu Aug 6 12:58:06 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 6 Aug 2009 10:58:06 -0600 Subject: [Numpy-discussion] add axis to results of reduction (mean, min, ...) In-Reply-To: <1cd32cbb0908060855v3fec4524rfed715d60c741a0b@mail.gmail.com> References: <1cd32cbb0908060855v3fec4524rfed715d60c741a0b@mail.gmail.com> Message-ID: On Thu, Aug 6, 2009 at 9:55 AM, wrote: > What's the best way of getting back the correct shape to be able to > broadcast, mean, min,.. to the original array, that works for > arbitrary dimension and axis? > > I thought I have seen some helper functions, but I don't find them anymore? > Adding a keyword to retain the number of dimensions has been mooted. It shouldn't be too difficult to implement and would allow things like: >>> scaled = a/a.max(1, reduce=0) I could do that for 1.4 if folks are interested. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From kwgoodman at gmail.com Thu Aug 6 13:07:13 2009 From: kwgoodman at gmail.com (Keith Goodman) Date: Thu, 6 Aug 2009 10:07:13 -0700 Subject: [Numpy-discussion] add axis to results of reduction (mean, min, ...) In-Reply-To: References: <1cd32cbb0908060855v3fec4524rfed715d60c741a0b@mail.gmail.com> Message-ID: On Thu, Aug 6, 2009 at 9:58 AM, Charles R Harris wrote: > > > On Thu, Aug 6, 2009 at 9:55 AM, wrote: >> >> What's the best way of getting back the correct shape to be able to >> broadcast, mean, min,.. to the original array, that works for >> arbitrary dimension and axis? >> >> I thought I have seen some helper functions, but I don't find them >> anymore? > > Adding a keyword to retain the number of dimensions has been mooted. It > shouldn't be too difficult to implement and would allow things like: > >>>> scaled = a/a.max(1, reduce=0) > > I could do that for 1.4 if folks are interested. I'd use that. It's better than what I usually do: scaled = a / a.max(1).reshape(-1,1) From bergstrj at iro.umontreal.ca Thu Aug 6 13:12:26 2009 From: bergstrj at iro.umontreal.ca (James Bergstra) Date: Thu, 6 Aug 2009 13:12:26 -0400 Subject: [Numpy-discussion] Fwd: GPU Numpy In-Reply-To: References: Message-ID: <7f1eaee30908061012x2bd69e6i1550787f10cd6aaf@mail.gmail.com> >David Warde-Farley cs.toronto.edu> writes: >> It did inspire some of our colleagues in Montreal to create this, >> though: >> >> ? ? 
http://code.google.com/p/cuda-ndarray/
>>
>> I gather it is VERY early in development, but I'm sure they'd love
>> contributions!
>>
>
> Hi David,
> That does look quite close to what I imagined, probably a good start then!
> Romain

Hi, I'm one of the devs for that project. Thanks David for the link. I put
some text on the homepage so it's a little more self-explanatory. We do
welcome contributions.

I feel like I must be reinventing the wheel on this, so I'd really
appreciate it if someone who knows of a similar project would let me
know about it. Otherwise we'll keep plugging away at replicating core
ndarray interface elements (operators, math.h-type functions, array
indexing, etc.)

http://code.google.com/p/cuda-ndarray/

James

From josef.pktd at gmail.com  Thu Aug  6 13:18:52 2009
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Thu, 6 Aug 2009 13:18:52 -0400
Subject: [Numpy-discussion] add axis to results of reduction (mean, min, ...)
In-Reply-To:
References: <1cd32cbb0908060855v3fec4524rfed715d60c741a0b@mail.gmail.com>
Message-ID: <1cd32cbb0908061018g4ec61669gb43cb18f0cb00a9e@mail.gmail.com>

On Thu, Aug 6, 2009 at 1:07 PM, Keith Goodman wrote:
> On Thu, Aug 6, 2009 at 9:58 AM, Charles R Harris wrote:
>>
>> On Thu, Aug 6, 2009 at 9:55 AM, wrote:
>>>
>>> What's the best way of getting back the correct shape to be able to
>>> broadcast, mean, min,.. to the original array, that works for
>>> arbitrary dimension and axis?
>>>
>>> I thought I have seen some helper functions, but I don't find them
>>> anymore?
>>
>> Adding a keyword to retain the number of dimensions has been mooted. It
>> shouldn't be too difficult to implement and would allow things like:
>>
>>>>> scaled = a/a.max(1, reduce=0)
>>
>> I could do that for 1.4 if folks are interested.
>
> I'd use that. It's better than what I usually do:
>
> scaled = a / a.max(1).reshape(-1,1)
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>

An added keyword on the numpy reduce methods would be nice, but a helper
function is still useful for our own reduce operations. Something like
this function with an awful name:

Josef

import numpy as np

def addreducedaxis(x, axis=None):
    '''adds axis so that results of a reduce operation broadcast
    to the original array

    Parameters
    ----------
    x : array
        n-dim array that is the result of a reduce operation, e.g. mean, min
    axis : int
        axis that was removed in the reduce operation

    Returns
    -------
    y : array
        (n+1)-dim array with additional axis
    '''
    if axis != 0 and axis is not None:
        ind = [slice(None)] * (x.ndim+1)
        ind[axis] = np.newaxis
        return x[ind]
    else:
        return x

a = np.array([[1,2,3,3,0],[2,2,3,2,1]])
a3 = np.dstack((a,a))

print np.all((a3-a3.mean(1)[:,None,:]) == (a3 - addreducedaxis(a3.mean(1),1)))
print np.all((a3-a3.mean(-2)[:,None,:]) == (a3 - addreducedaxis(a3.mean(-2),-2)))
print np.all((a3-a3.mean()) == (a3 - addreducedaxis(a3.mean())))
print np.all((a3.ravel()-a3.mean(None)) == (a3.ravel() - addreducedaxis(a3.mean(None),None)))
print np.all((a3-a3.mean(None)) == (a3 - addreducedaxis(a3.mean(None),None)))

#example usage
from numpy.testing import assert_almost_equal
for axis in [None,0,1,2]:
    m = a3.mean(axis)
    v = ((a3 - addreducedaxis(m,axis))**2).mean(axis)
    assert_almost_equal(v, np.var(a3,axis), 15)

#normalize array along one axis
a3n = (a3 - addreducedaxis(np.mean(a3,1),1))/np.sqrt(addreducedaxis(np.var(a3,1),1))
print a3n.mean(1)
print a3n.var(1)

From charlesr.harris at gmail.com  Thu Aug  6 13:19:53 2009
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Thu, 6 Aug 2009 11:19:53 -0600
Subject: [Numpy-discussion] Fwd: GPU Numpy
In-Reply-To: <7f1eaee30908061012x2bd69e6i1550787f10cd6aaf@mail.gmail.com>
References: <7f1eaee30908061012x2bd69e6i1550787f10cd6aaf@mail.gmail.com>
Message-ID:

On Thu, Aug 6, 2009 at 11:12 AM, James Bergstra wrote:
> >David Warde-Farley <dwf at cs.toronto.edu> writes:
> >> It did inspire some of our colleagues in Montreal to create this,
> >> though:
> >>
> >>     http://code.google.com/p/cuda-ndarray/
> >>
> >> I gather it is VERY early in development, but I'm sure they'd love
> >> contributions!
> >>
> >
> >Hi David,
> >That does look quite close to what I imagined, probably a good start then!
> >Romain
>
> Hi, I'm one of the devs for that project. Thanks David for the link.
> I put some text on the homepage so it's a little more
> self-explanatory. We do welcome contributions.
>
> I feel like I must be reinventing the wheel on this, so I'd really
> appreciate it if someone who knows of a similar project would let me
> know about it. Otherwise we'll keep plugging away at replicating core
> ndarray interface elements (operators, math.h-type functions, array
> indexing, etc.)
>
> http://code.google.com/p/cuda-ndarray/
>

It almost looks like you are reimplementing numpy, in c++ no less. Is there
any reason why you aren't working with a numpy branch and just adding
ufuncs? I'm also curious if you have thoughts about how to use the GPU
pipelines in parallel.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From bsouthey at gmail.com  Thu Aug  6 13:25:21 2009
From: bsouthey at gmail.com (Bruce Southey)
Date: Thu, 06 Aug 2009 12:25:21 -0500
Subject: [Numpy-discussion] Bug or documentation omission with dtype
	parameter with numpy.mean
Message-ID: <4A7B1201.80308@gmail.com>

Hi,
Should numpy.mean() (and similar functions) maintain the original dtype
or the dtype used for the dtype option? I do understand the numerical
issues involved, but my question relates to whether this should be a bug
or needs clarification in the documentation.

According to the help, the dtype parameter to numpy.mean() says:
dtype : dtype, optional
    Type to use in computing the mean. For integer inputs, the default
    is float64; for floating point inputs, it is the same as the input
    dtype.
I interpret this as describing the type used in the internal calculations;
technically it does not say what the output dtype should be. But the dtype
option does change the output dtype, as the simple example below shows.
With Python 2.6 and numpy '1.4.0.dev7282' on Linux 64-bit Fedora 11:

>>> import numpy as np
>>> a=np.array([1,2,3])
>>> a.dtype
dtype('int64')
>>> a.mean().dtype
dtype('float64')
>>> a.mean(dtype=np.float32).dtype
dtype('float64')
>>> a.mean(dtype=np.float64).dtype
dtype('float64')
>>> a.mean(dtype=np.float128).dtype
dtype('float128')
>>> a.mean(dtype=np.int).dtype
dtype('float64')

Clearly the output dtype is float64 or higher, as determined by the input
dtype or the dtype parameter.

Bruce

From bergstrj at iro.umontreal.ca  Thu Aug  6 13:41:32 2009
From: bergstrj at iro.umontreal.ca (James Bergstra)
Date: Thu, 6 Aug 2009 13:41:32 -0400
Subject: [Numpy-discussion] Fwd: GPU Numpy
In-Reply-To:
References: <7f1eaee30908061012x2bd69e6i1550787f10cd6aaf@mail.gmail.com>
Message-ID: <7f1eaee30908061041l2cd76f64r96e7f5c7c16a2483@mail.gmail.com>

On Thu, Aug 6, 2009 at 1:19 PM, Charles R Harris wrote:
> It almost looks like you are reimplementing numpy, in c++ no less. Is there
> any reason why you aren't working with a numpy branch and just adding
> ufuncs?

I don't know how that would work. The Ufuncs need a datatype to work
with, and AFAIK, it would break everything if a numpy ndarray pointed
to memory on the GPU. Could you explain what you mean a little more?

> I'm also curious if you have thoughts about how to use the GPU
> pipelines in parallel.

Current thinking for ufunc type computations:
1) divide up the tensors into subtensors whose dimensions have
power-of-two sizes (this permits a fast integer -> ndarray coordinate
computation using bit shifting),
2) launch a kernel for each subtensor in its own stream to use
parallel pipelines.
3) sync and return.

This is a pain to do without automatic code generation though.
Currently we're using macros, but that's not pretty.
C++ has templates, which we don't really use yet, but were planning on
using. These have some power to generate code.
The 'theano' project (www.pylearn.org/theano) for which cuda-ndarray
was created has a more powerful code generation mechanism similar to
weave. This algorithm is used in theano-cuda-ndarray.
Scipy.weave could be very useful for generating code for specific
shapes/ndims on demand, if weave could use nvcc.

James

From erik.tollerud at gmail.com  Thu Aug  6 14:54:52 2009
From: erik.tollerud at gmail.com (Erik Tollerud)
Date: Thu, 6 Aug 2009 11:54:52 -0700
Subject: [Numpy-discussion] Fwd: GPU Numpy
In-Reply-To: <7f1eaee30908061041l2cd76f64r96e7f5c7c16a2483@mail.gmail.com>
References: <7f1eaee30908061012x2bd69e6i1550787f10cd6aaf@mail.gmail.com>
	<7f1eaee30908061041l2cd76f64r96e7f5c7c16a2483@mail.gmail.com>
Message-ID:

Note that this is from a "user" perspective, as I have no particular plan
of developing the details of this implementation, but I've thought for a
long time that GPU support could be great for numpy (I would also vote for
OpenCL support over CUDA, although conceptually they seem quite similar)...

But what exactly would the large-scale plan be? One of the advantages of
GPGPUs is that they are particularly suited to rather complicated
parallelizable algorithms, and the numpy-level basic operations are just
the simple arithmetic operations.
So while I'd love to see it working, it's unclear to me exactly how much is gained at the core numpy level, especially given that it's limited to single-precision on most GPUs. Now linear algebra or FFTs on a GPU would probably be a huge boon, I'll admit - especially if it's in the form of a drop-in replacement for the numpy or scipy versions. By the way, I noticed no one mentioned the GPUArray class in pycuda (and it looks like there's something similar in the pyopencl) - seems like that's already done a fair amount of the work... http://documen.tician.de/pycuda/array.html#pycuda.gpuarray.GPUArray On Thu, Aug 6, 2009 at 10:41 AM, James Bergstra wrote: > On Thu, Aug 6, 2009 at 1:19 PM, Charles R > Harris wrote: > > I almost looks like you are reimplementing numpy, in c++ no less. Is > there > > any reason why you aren't working with a numpy branch and just adding > > ufuncs? > > I don't know how that would work. The Ufuncs need a datatype to work > with, and AFAIK, it would break everything if a numpy ndarray pointed > to memory on the GPU. Could you explain what you mean a little more? > > > I'm also curious if you have thoughts about how to use the GPU > > pipelines in parallel. > > Current thinking for ufunc type computations: > 1) divide up the tensors into subtensors whose dimensions have > power-of-two sizes (this permits a fast integer -> ndarray coordinate > computation using bit shifting), > 2) launch a kernel for each subtensor in it's own stream to use > parallel pipelines. > 3) sync and return. > > This is a pain to do without automatic code generation though. > Currently we're using macros, but that's not pretty. > C++ has templates, which we don't really use yet, but were planning on > using. These have some power to generate code. > The 'theano' project (www.pylearn.org/theano) for which cuda-ndarray > was created has a more powerful code generation mechanism similar to > weave. This algorithm is used in theano-cuda-ndarray. > Scipy.weave could be very useful for generating code for specific > shapes/ndims on demand, if weave could use nvcc. > > James > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthieu.brucher at gmail.com Thu Aug 6 15:03:15 2009 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Thu, 6 Aug 2009 21:03:15 +0200 Subject: [Numpy-discussion] Fwd: GPU Numpy In-Reply-To: References: <7f1eaee30908061012x2bd69e6i1550787f10cd6aaf@mail.gmail.com> <7f1eaee30908061041l2cd76f64r96e7f5c7c16a2483@mail.gmail.com> Message-ID: 2009/8/6 Erik Tollerud : > Note that this is from a "user" perspective, as I have no particular plan of > developing the details of this implementation, but I've thought for a long > time that GPU support could be great for numpy?(I would also vote for OpenCL > support over cuda, although conceptually they seem quite similar)... > But??what exactly would the large-scale plan be? ?One of the advantages of > GPGPUs is that they are particularly suited to rather complicated > paralellizable algorithms, You mean simple parallizable algorithms, I suppose? and the numpy-level basic operations are just the > simple arithmatic operations. ?So while I'd love to see it working, it's > unclear to me exactly how much is gained at the core numpy level, especially > given that it's limited to single-precision on most GPUs. 
> Now linear algebra or FFTs on a GPU would probably be a huge boon, I'll > admit - especially if it's in the form of a drop-in replacement for the > numpy or scipy versions. > By the way, I noticed no one mentioned the GPUArray class in pycuda (and it > looks like there's something similar in the pyopencl) - seems like that's > already done a fair amount of the work... > http://documen.tician.de/pycuda/array.html#pycuda.gpuarray.GPUArray > > > On Thu, Aug 6, 2009 at 10:41 AM, James Bergstra > wrote: >> >> On Thu, Aug 6, 2009 at 1:19 PM, Charles R >> Harris wrote: >> > I almost looks like you are reimplementing numpy, in c++ no less. Is >> > there >> > any reason why you aren't working with a numpy branch and just adding >> > ufuncs? >> >> I don't know how that would work. ?The Ufuncs need a datatype to work >> with, and AFAIK, it would break everything if a numpy ndarray pointed >> to memory on the GPU. ?Could you explain what you mean a little more? >> >> > I'm also curious if you have thoughts about how to use the GPU >> > pipelines in parallel. >> >> Current thinking for ufunc type computations: >> 1) divide up the tensors into subtensors whose dimensions have >> power-of-two sizes (this permits a fast integer -> ndarray coordinate >> computation using bit shifting), >> 2) launch a kernel for each subtensor in it's own stream to use >> parallel pipelines. >> 3) sync and return. >> >> This is a pain to do without automatic code generation though. >> Currently we're using macros, but that's not pretty. >> C++ has templates, which we don't really use yet, but were planning on >> using. ?These have some power to generate code. >> The 'theano' project (www.pylearn.org/theano) for which cuda-ndarray >> was created has a more powerful code generation mechanism similar to >> weave. ? This algorithm is used in theano-cuda-ndarray. >> Scipy.weave could be very useful for generating code for specific >> shapes/ndims on demand, if weave could use nvcc. >> >> James >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- Information System Engineer, Ph.D. Website: http://matthieu-brucher.developpez.com/ Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92 LinkedIn: http://www.linkedin.com/in/matthieubrucher From Chris.Barker at noaa.gov Thu Aug 6 16:16:25 2009 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Thu, 06 Aug 2009 13:16:25 -0700 Subject: [Numpy-discussion] Optimized half-sizing of images? Message-ID: <4A7B3A19.7040800@noaa.gov> (second try on this message. the fist time I included a test PNG that made it too large) Hi folks, We have a need to to generate half-size version of RGB images as quickly as possible. PIL does a pretty good job, but it dawned on me that in the special case of a half-size, one might be able to do it faster with numpy, simply averaging the four pixels in the larger image to create one in the small. I'm doing tiling, and thus reducing 512x512 images to 256x256, so I imagine I'm making good use of cache (it does get pretty pokey with really large images!) 
What I have now is essentially this:

# a is a (h, w, 3) RGB array
a2 = a[0::2, 0::2, :].astype(np.uint16)
a2 += a[0::2, 1::2, :]
a2 += a[1::2, 0::2, :]
a2 += a[1::2, 1::2, :]
a2 /= 4
return a2.astype(np.uint8)

time: 67.2 ms per loop

I can speed it up a bit if I accumulate in a uint8 and divide as I go to
prevent overflow:

a2 = a[0::2, 0::2, :].astype(np.uint8) / 4
a2 += a[0::2, 1::2, :] / 4
a2 += a[1::2, 0::2, :] / 4
a2 += a[1::2, 1::2, :] / 4
return a2

time: 46.6 ms per loop

That does lose a touch of accuracy, I suppose, but nothing I can see.

Method 1 is about twice as slow as PIL's bilinear scaling. Can I do better?
It seems it should be faster if I can avoid so many separate loops through
the array. I figure there may be some way with filter or convolve or
ndimage, but they all seem to return an array the same size. Any ideas?

(Cython is another option, of course)

-Chris

Test code enclosed.

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov
-------------- next part --------------
A non-text attachment was scrubbed...
Name: numpy_resize.py
Type: application/x-python
Size: 3526 bytes
Desc: not available
URL:

From stefan at sun.ac.za  Thu Aug  6 16:23:55 2009
From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=)
Date: Thu, 6 Aug 2009 15:23:55 -0500
Subject: [Numpy-discussion] Optimized half-sizing of images?
In-Reply-To: <4A7B3A19.7040800@noaa.gov>
References: <4A7B3A19.7040800@noaa.gov>
Message-ID: <9457e7c80908061323g5aa82815oa6dc88bf7e80cc9c@mail.gmail.com>

Hi Chris

2009/8/6 Christopher Barker :
> Can I do better? It seems it should be faster if I can avoid so many
> separate loops through the array. I figure there may be some way with
> filter or convolve or ndimage, but they all seem to return an array the
> same size.

Are you willing to depend on SciPy? We've got pretty fast zooming
code in ndimage.

If speed is a big issue, I'd consider using the GPU, which was made
for this sort of down-sampling.

Cheers
Stéfan

From dwf at cs.toronto.edu  Thu Aug  6 16:49:56 2009
From: dwf at cs.toronto.edu (David Warde-Farley)
Date: Thu, 6 Aug 2009 16:49:56 -0400
Subject: [Numpy-discussion] Fwd: GPU Numpy
In-Reply-To:
References: <7f1eaee30908061012x2bd69e6i1550787f10cd6aaf@mail.gmail.com>
	<7f1eaee30908061041l2cd76f64r96e7f5c7c16a2483@mail.gmail.com>
Message-ID: <9E3227F3-CD76-4594-B383-A1C5ED081F82@cs.toronto.edu>

On 6-Aug-09, at 2:54 PM, Erik Tollerud wrote:
> Now linear algebra or FFTs on a GPU would probably be a huge boon,
> I'll admit - especially if it's in the form of a drop-in replacement
> for the numpy or scipy versions.

The word I'm hearing from people in my direct acquaintance who are
using it is that if you have code that even just does lots of matrix
multiplies, never mind solving systems or anything like that, the
speedup is several orders of magnitude. Things that used to take weeks
now take a day or two. If you can deal with the loss of precision it's
really quite worth it.

> By the way, I noticed no one mentioned the GPUArray class in pycuda
> (and it looks like there's something similar in pyopencl) - seems like
> that's already done a fair amount of the work...
> http://documen.tician.de/pycuda/array.html#pycuda.gpuarray.GPUArray

This seems like a great start, I agree. The lack of any documentation
on 'dot' is worrying, though.
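From a quick look, basic usage is something like this (a from-memory
sketch, so treat the details as approximate):

import numpy as np
import pycuda.autoinit            # creates a CUDA context on import
import pycuda.gpuarray as gpuarray

a = gpuarray.to_gpu(np.random.randn(4, 4).astype(np.float32))
b = (2 * a + 1).get()             # arithmetic runs on the GPU; .get() copies back
print np.allclose(b, 2 * a.get() + 1)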
David

From sturla at molden.no  Thu Aug  6 16:57:50 2009
From: sturla at molden.no (Sturla Molden)
Date: Thu, 06 Aug 2009 22:57:50 +0200
Subject: [Numpy-discussion] Fwd: GPU Numpy
In-Reply-To:
References: <7f1eaee30908061012x2bd69e6i1550787f10cd6aaf@mail.gmail.com>
	<7f1eaee30908061041l2cd76f64r96e7f5c7c16a2483@mail.gmail.com>
Message-ID: <4A7B43CE.7050509@molden.no>

> Now linear algebra or FFTs on a GPU would probably be a huge boon,
> I'll admit - especially if it's in the form of a drop-in replacement
> for the numpy or scipy versions.

NumPy generates temporary arrays for expressions involving ndarrays. This
extra allocation and copying often takes more time than the computation.
With GPGPUs, we have to bus the data to and from VRAM as well. D. Knuth
quoted Hoare saying that "premature optimization is the root of all
evil." Optimizing computation when the bottleneck is memory is premature.

In order to improve on this, I think we have to add "lazy evaluation" to
NumPy. That is, an operator should not return a temporary array but a
symbolic expression. So if we have an expression like

   y = a*x + b

it should not evaluate a*x into a temporary array. Rather, the operators
would build up a "parse tree" like

   y = add(multiply(a,x),b)

and evaluate the whole expression later on. This would require two things:
First we need "dynamic code generation", which incidentally is what OpenCL
is all about. I.e. OpenCL is a dynamically invoked compiler; there is a
function clCreateProgramWithSource, which does just what it says. Second,
we need arrays to be immutable. This is very important. If arrays are not
immutable, code like this could fail:

   y = a*x + b
   x[0] = 1235512371235

With lazy evaluation, the memory overhead would be much smaller. The GPGPU
would also get more complex expressions to use as kernels. There should be
an option of running this on the CPU, possibly using OpenMP for
multi-threading. We could either depend on a compiler (C or Fortran) being
installed, or use opcodes for a dedicated virtual machine (cf. what
numexpr does).

In order to reduce the effect of immutable arrays, we could introduce a
context-manager. Inside the with statement, all arrays would be immutable.
Second, the __exit__ method could trigger the code generator and do all
the evaluation. So we would get something like this:

   # normal numpy here

   with numpy.accelerator():

       # arrays become immutable
       # lazy evaluation

   # code generation and evaluation on exit

   # normal numpy continues here

Thus, here is my plan:

1. a special context-manager class
2. immutable arrays inside with statement
3. lazy evaluation: expressions build up a parse tree
4. dynamic code generation
5. evaluation on exit

I guess it is possible to find ways to speed this up as well. If a context
manager would always generate the same OpenCL code, the with statement
would only need to execute once (we could raise an exception on enter to
jump directly to exit).

It is possible to create a superfast NumPy. But just plugging GPGPUs
into the current design would be premature. In NumPy's current state,
with mutable ndarrays and operators generating temporary arrays, there
is not much to gain from introducing GPGPUs. It would only be beneficial
in computationally demanding parts like FFTs and solvers for linear
algebra and differential equations. Ufuncs with transcendental functions
might also benefit. SciPy would certainly benefit more from GPGPUs than
NumPy.
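To make the parse-tree idea concrete, here is a toy sketch of steps 3 and
5 in pure Python (all names are invented; a real implementation would emit
OpenCL or C instead of calling NumPy inside eval):

import numpy as np

class Expr(object):
    """Toy parse-tree node: operators build the tree, eval() walks it."""
    def __init__(self, op, *args):
        self.op, self.args = op, args
    def __add__(self, other):
        return Expr(np.add, self, other)
    __radd__ = __add__
    def __mul__(self, other):
        return Expr(np.multiply, self, other)
    __rmul__ = __mul__
    def eval(self):
        # evaluate children first, then apply this node's operation
        args = [a.eval() if isinstance(a, Expr) else a for a in self.args]
        return self.op(*args)

class Leaf(Expr):
    """Wraps an (immutable) ndarray as a leaf of the tree."""
    def __init__(self, value):
        self.value = value
    def eval(self):
        return self.value

x = Leaf(np.arange(5.0))
y = 2.0*x + 3.0    # builds add(multiply(2.0,x),3.0); nothing is computed yet
print y.eval()     # evaluation happens here, in a single pass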
From Chris.Barker at noaa.gov Thu Aug 6 17:05:17 2009 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Thu, 06 Aug 2009 14:05:17 -0700 Subject: [Numpy-discussion] Optimized half-sizing of images? In-Reply-To: <9457e7c80908061323g5aa82815oa6dc88bf7e80cc9c@mail.gmail.com> References: <4A7B3A19.7040800@noaa.gov> <9457e7c80908061323g5aa82815oa6dc88bf7e80cc9c@mail.gmail.com> Message-ID: <4A7B458D.9040005@noaa.gov>

Stéfan van der Walt wrote:
> Are you willing to depend on SciPy? We've got pretty fast zooming
> code in ndimage.

I looked there, and didn't see it -- didn't think to look for "zoom". Now I do, I'll give it a try. However, my thought is that for the special case of half-sizing, all that spline stuff could be unneeded and slower. I'll see what we get.

thanks,

-Chris

--
Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov

From robert.kern at gmail.com Thu Aug 6 17:17:36 2009 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 6 Aug 2009 16:17:36 -0500 Subject: [Numpy-discussion] Fwd: GPU Numpy In-Reply-To: <4A7B43CE.7050509@molden.no> References: <7f1eaee30908061012x2bd69e6i1550787f10cd6aaf@mail.gmail.com> <7f1eaee30908061041l2cd76f64r96e7f5c7c16a2483@mail.gmail.com> <4A7B43CE.7050509@molden.no> Message-ID: <3d375d730908061417k102177bem860c207272c168e7@mail.gmail.com>

On Thu, Aug 6, 2009 at 15:57, Sturla Molden wrote:
>
>> Now linear algebra or FFTs on a GPU would probably be a huge boon,
>> I'll admit - especially if it's in the form of a drop-in replacement
>> for the numpy or scipy versions.
>
> NumPy generates temporary arrays for expressions involving ndarrays. This
> extra allocation and copying often takes more time than the computation.
> With GPGPUs, we have to bus the data to and from VRAM as well. D. Knuth
> quoted Hoare saying that "premature optimization is the root of all
> evil." Optimizing computation when the bottleneck is memory is premature.
> It is possible to create a superfast NumPy. But just plugging GPGPUs
> into the current design would be premature. In NumPy's current state,
> with mutable ndarrays and operators generating temporary arrays, there
> is not much to gain from introducing GPGPUs. It would only be beneficial
> in computationally demanding parts like FFTs and solvers for linear
> algebra and differential equations.

I believe that is exactly the point that Erik is making. :-)

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

From sturla at molden.no Thu Aug 6 17:23:12 2009 From: sturla at molden.no (Sturla Molden) Date: Thu, 06 Aug 2009 23:23:12 +0200 Subject: [Numpy-discussion] Fwd: GPU Numpy In-Reply-To: <3d375d730908061417k102177bem860c207272c168e7@mail.gmail.com> References: <7f1eaee30908061012x2bd69e6i1550787f10cd6aaf@mail.gmail.com> <7f1eaee30908061041l2cd76f64r96e7f5c7c16a2483@mail.gmail.com> <4A7B43CE.7050509@molden.no> <3d375d730908061417k102177bem860c207272c168e7@mail.gmail.com> Message-ID: <4A7B49C0.6000405@molden.no>

Robert Kern wrote:
> I believe that is exactly the point that Erik is making. :-)
>

I wasn't arguing against him, just suggesting a solution. :-) I have big hopes for lazy evaluation, if we can find a way to do it right.
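For what it's worth, numexpr (mentioned above) already implements a limited form of this today: it compiles a whole expression string for a small virtual machine and evaluates it in one pass, without the intermediate temporaries. A sketch, assuming numexpr is installed:

    import numpy as np
    import numexpr as ne

    n = 1000000
    a = np.random.rand(n)
    x = np.random.rand(n)
    b = np.random.rand(n)

    y1 = a * x + b                  # plain NumPy: a*x allocates a temporary
    y2 = ne.evaluate("a * x + b")   # a single fused pass over the data
    print np.allclose(y1, y2)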
Sturla From bergstrj at iro.umontreal.ca Thu Aug 6 17:29:07 2009 From: bergstrj at iro.umontreal.ca (James Bergstra) Date: Thu, 6 Aug 2009 17:29:07 -0400 Subject: [Numpy-discussion] Fwd: GPU Numpy In-Reply-To: <4A7B43CE.7050509@molden.no> References: <7f1eaee30908061012x2bd69e6i1550787f10cd6aaf@mail.gmail.com> <7f1eaee30908061041l2cd76f64r96e7f5c7c16a2483@mail.gmail.com> <4A7B43CE.7050509@molden.no> Message-ID: <7f1eaee30908061429v5d04ab77v18b37a0a177548cd@mail.gmail.com> On Thu, Aug 6, 2009 at 4:57 PM, Sturla Molden wrote: > >> Now linear algebra or FFTs on a GPU would probably be a huge boon, >> I'll admit - especially if it's in the form of a drop-in replacement >> for the numpy or scipy versions. > > > NumPy generate temporary arrays for expressions involving ndarrays. This > extra allocation and copying often takes more time than the computation. > With GPGPUs, we have to bus the data to and from VRAM as well. D. Knuth > quoted Hoare saying that "premature optimization is the root of all > evil." Optimizing computation when the bottleneck is memory is premature. > > In order to improve on this, I think we have to add "lazy evaluation" to > NumPy. That is, an operator should not return a temporary array but a > symbolic expression. So if we have an expression like > > ? ?y = a*x + b > > it should not evalute a*x into a temporary array. Rather, the operators > would build up a "parse tree" like > > ? ?y = add(multiply(a,x),b) > > and evalute the whole expression ?later on. [snip] > Regards, > Sturla Molden > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > Hi Sturla, The plan you describe is a good one, and Theano (www.pylearn.org/theano) almost exactly implements it. You should check it out. It does not use 'with' syntax at the moment, but it could provide the backend machinery for your mechanism if you want to go forward with that. Theano provides - symbolic expression building for a big subset of what numpy can do (and a few things that it doesn't) - expression optimization (for faster and more accurate computations) - dynamic code generation - cacheing of compiled functions to disk. Also, when you have a symbolic expression graph you can do cute stuff like automatic differentiation. We're currently working on the bridge between theano and cuda so that you declare certain inputs as residing on the GPU instead of the host memory, so you don't have to transfer things to and from host memory as much. James From dalcinl at gmail.com Thu Aug 6 17:48:45 2009 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Thu, 6 Aug 2009 18:48:45 -0300 Subject: [Numpy-discussion] FCompiler and runtime library dirs Message-ID: Hi, folks, Using NumPy 1.3.0 from Fedora 11, though this issue likely applies to current trunk (I've not actually tested, just taken a look at the sources) As numpy.distutils.FCompiler inherits from distutils.ccompiler.CCompiler, the method "runtime_library_dir_option()" fails with NotImplementedError. 
I had to add the monkeypatch pasted below to a setup.py script (full code at http://petsc.cs.iit.edu/petsc4py/petsc4py-dev/file/tip/demo/wrap-f2py/setup.py) in order to get things working (with GCC on Linux):

    from numpy.distutils.fcompiler import FCompiler
    from numpy.distutils.unixccompiler import UnixCCompiler

    FCompiler.runtime_library_dir_option = \
        UnixCCompiler.runtime_library_dir_option.im_func

Do any of you have an idea about how to properly fix this issue in numpy? I'm tempted to re-use the UnixCCompiler implementation in POSIX (including Mac OS X?), just to save some lines and not repeat that code in every FCompiler subclass... Comments?

--
Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594

From npuloski at gmail.com Thu Aug 6 18:01:27 2009 From: npuloski at gmail.com (Nanime Puloski) Date: Thu, 6 Aug 2009 18:01:27 -0400 Subject: [Numpy-discussion] Strange Error with NumPy Message-ID:

Can anyone explain to me why I receive an error (AttributeError) in NumPy when I do numpy.sin(2**64), but not when I do numpy.sin(2.0**64), numpy.sin(float(2**64)) or even numpy.sin(2)?

Where is the root problem in all of this?

-------------- next part -------------- An HTML attachment was scrubbed... URL:

From charlesr.harris at gmail.com Thu Aug 6 18:12:11 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 6 Aug 2009 16:12:11 -0600 Subject: [Numpy-discussion] Fwd: GPU Numpy In-Reply-To: <7f1eaee30908061429v5d04ab77v18b37a0a177548cd@mail.gmail.com> References: <7f1eaee30908061012x2bd69e6i1550787f10cd6aaf@mail.gmail.com> <7f1eaee30908061041l2cd76f64r96e7f5c7c16a2483@mail.gmail.com> <4A7B43CE.7050509@molden.no> <7f1eaee30908061429v5d04ab77v18b37a0a177548cd@mail.gmail.com> Message-ID:

On Thu, Aug 6, 2009 at 3:29 PM, James Bergstra wrote:
> On Thu, Aug 6, 2009 at 4:57 PM, Sturla Molden wrote:
> >
> >> Now linear algebra or FFTs on a GPU would probably be a huge boon,
> >> I'll admit - especially if it's in the form of a drop-in replacement
> >> for the numpy or scipy versions.
> >
> > NumPy generates temporary arrays for expressions involving ndarrays. This
> > extra allocation and copying often takes more time than the computation.
> > With GPGPUs, we have to bus the data to and from VRAM as well. D. Knuth
> > quoted Hoare saying that "premature optimization is the root of all
> > evil." Optimizing computation when the bottleneck is memory is premature.
> >
> > In order to improve on this, I think we have to add "lazy evaluation" to
> > NumPy. That is, an operator should not return a temporary array but a
> > symbolic expression. So if we have an expression like
> >
> >     y = a*x + b
> >
> > it should not evaluate a*x into a temporary array. Rather, the operators
> > would build up a "parse tree" like
> >
> >     y = add(multiply(a,x),b)
> >
> > and evaluate the whole expression later on.
> [snip]
> > Regards,
> > Sturla Molden
> > _______________________________________________
> > NumPy-Discussion mailing list
> > NumPy-Discussion at scipy.org
> > http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
> Hi Sturla,
>
> The plan you describe is a good one, and Theano
> (www.pylearn.org/theano) almost exactly implements it. You should
> check it out.
> It does not use 'with' syntax at the moment, but it
> could provide the backend machinery for your mechanism if you want to
> go forward with that. Theano provides
> - symbolic expression building for a big subset of what numpy can do
> (and a few things that it doesn't)
> - expression optimization (for faster and more accurate computations)
> - dynamic code generation
> - caching of compiled functions to disk.
>
> Also, when you have a symbolic expression graph you can do cute stuff
> like automatic differentiation. We're currently working on the bridge
> between theano and cuda so that you declare certain inputs as residing
> on the GPU instead of the host memory, so you don't have to transfer
> things to and from host memory as much.
>

So what simple things could numpy implement that would help here? It almost sounds like numpy would mostly be an interface to python, and the gpu would execute specialized code written and compiled for specific problems. Whether the code that gets compiled is written using lazy evaluation (ala Sturla), or is expressed some other way seems like an independent issue. It sounds like one important thing would be having arrays that reside on the GPU.

Chuck

-------------- next part -------------- An HTML attachment was scrubbed... URL:

From robert.kern at gmail.com Thu Aug 6 18:15:14 2009 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 6 Aug 2009 17:15:14 -0500 Subject: [Numpy-discussion] Strange Error with NumPy In-Reply-To: References: Message-ID: <3d375d730908061515k25bfcc55n6893eaa56b363b09@mail.gmail.com>

On Thu, Aug 6, 2009 at 17:01, Nanime Puloski wrote:
> Can anyone explain to me why I receive an error (AttributeError) in NumPy
> when I do numpy.sin(2**64), but not when I do numpy.sin(2.0**64),
> numpy.sin(float(2**64)) or even numpy.sin(2)?
>
> Where is the root problem in all of this?

numpy deals with objects that can be natively represented by the C numerical types on your machine. On your machine 2**64 gives you a Python long object which cannot be converted to one of the native C numerical types, so numpy.sin() treats it as a regular Python object. The default implementation of all of the ufuncs when faced with a Python object it doesn't know about is to look for a method on the object of the same name. long.sin() does not exist, so you get an AttributeError.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

From dwf at cs.toronto.edu Thu Aug 6 18:16:57 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Thu, 6 Aug 2009 18:16:57 -0400 Subject: [Numpy-discussion] Strange Error with NumPy In-Reply-To: References: Message-ID:

On 6-Aug-09, at 6:01 PM, Nanime Puloski wrote:
> Can anyone explain to me why I receive an error (AttributeError) in
> NumPy
> when I do numpy.sin(2**64), but not when I do numpy.sin(2.0**64),
> numpy.sin(float(2**64)) or even numpy.sin(2)?

    In [6]: type(2**64)
    Out[6]: <type 'long'>

    In [7]: type(2)
    Out[7]: <type 'int'>

    In [8]: type(2.0**64)
    Out[8]: <type 'float'>

Probably because 2**64 yields a long, which is an arbitrary precision type in Python, and numpy is having trouble casting it to something it can use (it's too big for numpy.int64). Notice that np.sin(2**63 - 1) works fine while np.sin(2**63) doesn't - 2**63 - 1 is the largest value that an int64 can hold.

You're right that it could be approximately represented as a float32 or float64, but numpy won't cast from exact types to inexact types without you telling it to do so. I agree that the error message is less than ideal.

David
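A quick demonstration of the boundary being described, written as a sketch for Python 2.x; the exact behaviour may differ across numpy versions:

    import numpy as np

    print np.sin(2**63 - 1)       # fits in int64, so it converts and works
    try:
        np.sin(2**63)             # too big for int64: treated as an object,
    except AttributeError:        # and long has no .sin() method
        print 'AttributeError, as described above'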
From Chris.Barker at noaa.gov Thu Aug 6 18:34:34 2009 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Thu, 06 Aug 2009 15:34:34 -0700 Subject: [Numpy-discussion] Optimized half-sizing of images? In-Reply-To: <4A7B458D.9040005@noaa.gov> References: <4A7B3A19.7040800@noaa.gov> <9457e7c80908061323g5aa82815oa6dc88bf7e80cc9c@mail.gmail.com> <4A7B458D.9040005@noaa.gov> Message-ID: <4A7B5A7A.5000105@noaa.gov>

Christopher Barker wrote:
> Stéfan van der Walt wrote:
>> Are you willing to depend on SciPy? We've got pretty fast zooming
>> code in ndimage.
> However, my thought is that for the special case of half-sizing, all
> that spline stuff could be unneeded and slower. I'll see what we get.

I've given that a try. The docs are pretty sparse. I couldn't figure out a way to get it to do the whole image at once. Rather I had to do each band separately. It ended up about the same speed as my int accumulator method:

    def test5(a):
        """
        using ndimage

        tested on 512x512 RGB image:
        time: 40 ms per loop
        """
        h = a.shape[0]/2
        w = a.shape[1]/2
        a2 = np.empty((h, w, 3), dtype=np.uint8)
        a2[:,:,0] = ndi.zoom(a[:,:,0], 0.5, order=1)
        a2[:,:,1] = ndi.zoom(a[:,:,1], 0.5, order=1)
        a2[:,:,2] = ndi.zoom(a[:,:,2], 0.5, order=1)
        return a2

Is there a way to get it to do all three colorbands at once?

NOTE: if I didn't set order to 1, it was MUCH slower, as one might expect.

--
Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov

-------------- next part -------------- A non-text attachment was scrubbed... Name: numpy_resize.py Type: application/x-python Size: 4236 bytes Desc: not available URL:

From sturla at molden.no Thu Aug 6 18:36:54 2009 From: sturla at molden.no (Sturla Molden) Date: Fri, 07 Aug 2009 00:36:54 +0200 Subject: [Numpy-discussion] Fwd: GPU Numpy In-Reply-To: References: <7f1eaee30908061012x2bd69e6i1550787f10cd6aaf@mail.gmail.com> <7f1eaee30908061041l2cd76f64r96e7f5c7c16a2483@mail.gmail.com> <4A7B43CE.7050509@molden.no> <7f1eaee30908061429v5d04ab77v18b37a0a177548cd@mail.gmail.com> Message-ID: <4A7B5B06.2080909@molden.no>

Charles R Harris wrote:
> Whether the code that gets compiled is written using lazy evaluation
> (ala Sturla), or is expressed some other way seems like an independent
> issue. It sounds like one important thing would be having arrays that
> reside on the GPU.

Memory management is slow compared to computation. Operations like malloc, free and memcpy are not faster for VRAM than for RAM. There will be no benefit from the GPU if the bottleneck is memory. That is why we need to get rid of the creation of temporary arrays, hence lazy evaluation.

Having arrays reside in VRAM would reduce the communication between RAM and VRAM, but the problem with temporary arrays is still there. Also VRAM tends to be a limited resource.

Sturla
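The temporary-array cost being described can be seen, and partly avoided, with the explicit output argument that ufuncs already take. A small sketch:

    import numpy as np

    n = 1000000
    a, x, b = np.ones(n), np.ones(n), np.ones(n)

    y = a * x + b            # allocates a temporary for a*x, then another for + b

    y = np.empty(n)
    np.multiply(a, x, y)     # write a*x straight into y
    np.add(y, b, y)          # accumulate in place: no extra temporaries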
From sturla at molden.no Thu Aug 6 18:49:32 2009 From: sturla at molden.no (Sturla Molden) Date: Fri, 07 Aug 2009 00:49:32 +0200 Subject: [Numpy-discussion] Fwd: GPU Numpy In-Reply-To: <4A7B5B06.2080909@molden.no> References: <7f1eaee30908061012x2bd69e6i1550787f10cd6aaf@mail.gmail.com> <7f1eaee30908061041l2cd76f64r96e7f5c7c16a2483@mail.gmail.com> <4A7B43CE.7050509@molden.no> <7f1eaee30908061429v5d04ab77v18b37a0a177548cd@mail.gmail.com> <4A7B5B06.2080909@molden.no> Message-ID: <4A7B5DFC.5050308@molden.no>

Sturla Molden wrote:
> Memory management is slow compared to computation. Operations like
> malloc, free and memcpy are not faster for VRAM than for RAM.

Actually it's not VRAM anymore, but whatever you call the memory dedicated to the GPU. It is cheap to put 8 GB of RAM into a computer, but graphics cards with more than 1 GB memory are expensive and uncommon on e.g. laptops. And this memory will be needed for other things as well, e.g. graphics.

Sturla

From charlesr.harris at gmail.com Thu Aug 6 18:50:20 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 6 Aug 2009 16:50:20 -0600 Subject: [Numpy-discussion] Fwd: GPU Numpy In-Reply-To: <4A7B5B06.2080909@molden.no> References: <7f1eaee30908061012x2bd69e6i1550787f10cd6aaf@mail.gmail.com> <7f1eaee30908061041l2cd76f64r96e7f5c7c16a2483@mail.gmail.com> <4A7B43CE.7050509@molden.no> <7f1eaee30908061429v5d04ab77v18b37a0a177548cd@mail.gmail.com> <4A7B5B06.2080909@molden.no> Message-ID:

On Thu, Aug 6, 2009 at 4:36 PM, Sturla Molden wrote:
> Charles R Harris wrote:
> > Whether the code that gets compiled is written using lazy evaluation
> > (ala Sturla), or is expressed some other way seems like an independent
> > issue. It sounds like one important thing would be having arrays that
> > reside on the GPU.
> Memory management is slow compared to computation. Operations like
> malloc, free and memcpy are not faster for VRAM than for RAM. There will
> be no benefit from the GPU if the bottleneck is memory. That is why we
> need to get rid of the creation of temporary arrays, hence lazy evaluation.
>
> Having arrays reside in VRAM would reduce the communication between RAM
> and VRAM, but the problem with temporary arrays is still there.
>

I'm not arguing with that, but I regard it as a separate problem. One could, after all, simply use an expression-to-GPU compiler to generate modules. The question is what simple additions we can make to numpy so that it acts as a convenient io channel. I mean, once the computations are moved elsewhere numpy is basically a convenient way to address memory.

> Also VRAM tends to be a limited resource.
>

But getting less so. These days it comes in gigabytes and there is no reason why it shouldn't soon exceed what many folks have for main memory.

Chuck

-------------- next part -------------- An HTML attachment was scrubbed... URL:

From sturla at molden.no Thu Aug 6 19:10:52 2009 From: sturla at molden.no (Sturla Molden) Date: Fri, 07 Aug 2009 01:10:52 +0200 Subject: [Numpy-discussion] Fwd: GPU Numpy In-Reply-To: References: <7f1eaee30908061012x2bd69e6i1550787f10cd6aaf@mail.gmail.com> <7f1eaee30908061041l2cd76f64r96e7f5c7c16a2483@mail.gmail.com> <4A7B43CE.7050509@molden.no> <7f1eaee30908061429v5d04ab77v18b37a0a177548cd@mail.gmail.com> <4A7B5B06.2080909@molden.no> Message-ID: <4A7B62FC.6090504@molden.no>

Charles R Harris wrote:
> I mean, once the computations are moved elsewhere numpy is basically a
> convenient way to address memory.

That is how I mostly use NumPy, though.
Computations I often do in Fortran 95 or C. NumPy arrays on the GPU memory is an easy task. But then I would have to write the computation in OpenCL's dialect of C99? But I'd rather program everything in Python if I could. Details like GPU and OpenCL should be hidden away. Nice looking Python with NumPy is much easier to read and write. That is why I'd like to see a code generator (i.e. JIT compiler) for NumPy. Sturla From npuloski at gmail.com Thu Aug 6 19:26:10 2009 From: npuloski at gmail.com (Nanime Puloski) Date: Thu, 6 Aug 2009 19:26:10 -0400 Subject: [Numpy-discussion] Strange Error with NumPy Addendum Message-ID: Thank you for your responses so far. What I also do not understand is why sin(2**64) works with the standard Python math module, but fails to do so with NumPy? To Robert Kern: Can't 2^64 be represented in C as a long double? It seems to work well on my machine. -------------- next part -------------- An HTML attachment was scrubbed... URL: From sturla at molden.no Thu Aug 6 19:26:59 2009 From: sturla at molden.no (Sturla Molden) Date: Fri, 07 Aug 2009 01:26:59 +0200 Subject: [Numpy-discussion] Fwd: GPU Numpy In-Reply-To: <7f1eaee30908061429v5d04ab77v18b37a0a177548cd@mail.gmail.com> References: <7f1eaee30908061012x2bd69e6i1550787f10cd6aaf@mail.gmail.com> <7f1eaee30908061041l2cd76f64r96e7f5c7c16a2483@mail.gmail.com> <4A7B43CE.7050509@molden.no> <7f1eaee30908061429v5d04ab77v18b37a0a177548cd@mail.gmail.com> Message-ID: <4A7B66C3.7070605@molden.no> James Bergstra wrote: > The plan you describe is a good one, and Theano > (www.pylearn.org/theano) almost exactly implements it. You should > check it out. It does not use 'with' syntax at the moment, but it > could provide the backend machinery for your mechanism if you want to > go forward with that. Theano provides > - symbolic expression building for a big subset of what numpy can do > (and a few things that it doesn't) > - expression optimization (for faster and more accurate computations) > - dynamic code generation > - cacheing of compiled functions to disk. Thank you James, theano looks great. :-D Sturla From charlesr.harris at gmail.com Thu Aug 6 19:27:29 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 6 Aug 2009 17:27:29 -0600 Subject: [Numpy-discussion] Fwd: GPU Numpy In-Reply-To: <4A7B62FC.6090504@molden.no> References: <7f1eaee30908061041l2cd76f64r96e7f5c7c16a2483@mail.gmail.com> <4A7B43CE.7050509@molden.no> <7f1eaee30908061429v5d04ab77v18b37a0a177548cd@mail.gmail.com> <4A7B5B06.2080909@molden.no> <4A7B62FC.6090504@molden.no> Message-ID: On Thu, Aug 6, 2009 at 5:10 PM, Sturla Molden wrote: > Charles R Harris wrote: > > > I mean, once the computations are moved elsewhere numpy is basically a > > convenient way to address memory. > > That is how I mostly use NumPy, though. Computations I often do in > Fortran 95 or C. > > NumPy arrays on the GPU memory is an easy task. Glad to hear it. So maybe some way to specify and track where the memory is allocated would be helpful. Travis wants to add a dictionary to ndarrays and that might be useful here. But then I would have to > write the computation in OpenCL's dialect of C99? But I'd rather program > everything in Python if I could. Details like GPU and OpenCL should be > hidden away. Nice looking Python with NumPy is much easier to read and > write. That is why I'd like to see a code generator (i.e. JIT compiler) > for NumPy. > Yes, but that is a language/compiler problem. 
I'm thinking of what tools numpy can offer that would help people experimenting with different approaches to using GPUs. At some point we might want to adopt a working approach but now seems a bit early for that.

Chuck

-------------- next part -------------- An HTML attachment was scrubbed... URL:

From robert.kern at gmail.com Thu Aug 6 19:29:41 2009 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 6 Aug 2009 18:29:41 -0500 Subject: [Numpy-discussion] Strange Error with NumPy Addendum In-Reply-To: References: Message-ID: <3d375d730908061629s721052d6tc8246e4f81231d2c@mail.gmail.com>

On Thu, Aug 6, 2009 at 18:26, Nanime Puloski wrote:
> Thank you for your responses so far.
> What I also do not understand is why sin(2**64) works with
> the standard Python math module, but fails to do so with NumPy?

math.sin() always converts the argument to a float. We do not.

> To Robert Kern:
> Can't 2^64 be represented in C as a long double?

For that value, yes, but not for long objects in general. We don't look at the value itself, just the type.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

From fperez.net at gmail.com Thu Aug 6 20:00:20 2009 From: fperez.net at gmail.com (Fernando Perez) Date: Thu, 6 Aug 2009 17:00:20 -0700 Subject: [Numpy-discussion] Fwd: GPU Numpy In-Reply-To: <4A7B43CE.7050509@molden.no> References: <7f1eaee30908061012x2bd69e6i1550787f10cd6aaf@mail.gmail.com> <7f1eaee30908061041l2cd76f64r96e7f5c7c16a2483@mail.gmail.com> <4A7B43CE.7050509@molden.no> Message-ID:

On Thu, Aug 6, 2009 at 1:57 PM, Sturla Molden wrote:
> In order to reduce the effect of immutable arrays, we could introduce a
> context-manager. Inside the with statement, all arrays would be
> immutable. Second, the __exit__ method could trigger the code generator
> and do all the evaluation. So we would get something like this:
>
>     # normal numpy here
>
>     with numpy.accelerator():
>
>         # arrays become immutable
>         # lazy evaluation
>
>         # code generation and evaluation on exit
>
>     # normal numpy continues here
>
> Thus, here is my plan:
>
> 1. a special context-manager class
> 2. immutable arrays inside with statement
> 3. lazy evaluation: expressions build up a parse tree
> 4. dynamic code generation
> 5. evaluation on exit

You will face one issue here: unless you raise a special exception inside the with block, the python interpreter will unconditionally execute that code without your control. I had a long talk about this with Alex Martelli last year at scipy, where I pitched the idea of allowing context managers to have an optional third method, __execute__, which would get the code block in the with statement for execution. He was fairly pessimistic about the possibility of this making its way into python, mostly (if I recall correctly) because of scoping issues: the with statement does not introduce a new scope, so you'd need to pass to this method the code plus the locals/globals of the entire enclosing scope, which felt messy. There was also the thorny question of how to pass the code block. Source? Bytecode? What? In many environments the source may not be available.

Last year I wrote a gross hack to do this, which you can find here:

http://bazaar.launchpad.net/~ipython-dev/ipython/0.10/annotate/head%3A/IPython/kernel/contexts.py

The idea is that it would be used by code like this (note, this doesn't actually work right now):

    def test_simple():

        # XXX - for now, we need a running cluster to be started separately.  The
        # daemon work is almost finished, and will make much of this unnecessary.
        from IPython.kernel import client
        mec = client.MultiEngineClient(('127.0.0.1',10105))
        try:
            mec.get_ids()
        except ConnectionRefusedError:
            import os, time
            os.system('ipcluster -n 2 &')
            time.sleep(2)
            mec = client.MultiEngineClient(('127.0.0.1',10105))

        mec.block = False

        parallel = RemoteMultiEngine(mec)

        mec.pushAll()

        with parallel as pr:
            # A comment
            remote()  # this means the code below only runs remotely
            print 'Hello remote world'
            x = range(10)
            # Comments are OK
            # Even misindented.
            y = x+1

        print pr.x + pr.y

###

The problem with my approach is that I find it brittle and ugly enough that I ultimately abandoned it. I'd love to see if you find a proper solution for this...

Cheers,

f
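A minimal sketch of the limitation under discussion: __enter__ and __exit__ only bracket the block, while the interpreter itself still executes the body, so the context manager never sees the code object. The class name here is invented for illustration:

    class Accelerator(object):
        def __enter__(self):
            # a real version could switch arrays into a lazy, immutable mode
            return self
        def __exit__(self, exc_type, exc_value, tb):
            # a real version could trigger code generation and evaluation
            return False          # do not swallow exceptions

    with Accelerator():
        print 'the body still runs normally, statement by statement'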
From zachary.pincus at yale.edu Thu Aug 6 21:46:03 2009 From: zachary.pincus at yale.edu (Zachary Pincus) Date: Thu, 6 Aug 2009 21:46:03 -0400 Subject: [Numpy-discussion] Optimized half-sizing of images? In-Reply-To: <4A7B3A19.7040800@noaa.gov> References: <4A7B3A19.7040800@noaa.gov> Message-ID: <86638F54-75A9-4DBD-9FA5-0F5C08794CA1@yale.edu>

> We have a need to generate half-size version of RGB images as
> quickly as possible.

How good do these need to look? You could just throw away every other pixel... image[::2, ::2].

Failing that, you could also try using ndimage's convolve routines to run a 2x2 box filter over the image, and then throw away half of the pixels. But this would be slower than optimal, because the kernel would be convolved over every pixel, not just the ones you intend to keep.

Really though, I'd just bite the bullet and write a C extension (or cython, whatever, an extension to work for a defined-dimensionality, defined-dtype array is pretty simple), or as suggested before, do it on the GPU. (Though I find that readback from the GPU can be slow enough that C code can beat it in some cases.)

Zach

From robert.kern at gmail.com Thu Aug 6 22:01:02 2009 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 6 Aug 2009 21:01:02 -0500 Subject: [Numpy-discussion] Fwd: GPU Numpy In-Reply-To: References: <7f1eaee30908061012x2bd69e6i1550787f10cd6aaf@mail.gmail.com> <7f1eaee30908061041l2cd76f64r96e7f5c7c16a2483@mail.gmail.com> <4A7B43CE.7050509@molden.no> Message-ID: <3d375d730908061901k6e5286b0n2b3d88f5ffc54182@mail.gmail.com>

On Thu, Aug 6, 2009 at 19:00, Fernando Perez wrote:
> On Thu, Aug 6, 2009 at 1:57 PM, Sturla Molden wrote:
>> In order to reduce the effect of immutable arrays, we could introduce a
>> context-manager. Inside the with statement, all arrays would be
>> immutable. Second, the __exit__ method could trigger the code generator
>> and do all the evaluation. So we would get something like this:
>>
>>     # normal numpy here
>>
>>     with numpy.accelerator():
>>
>>         # arrays become immutable
>>         # lazy evaluation
>>
>>         # code generation and evaluation on exit
>>
>>     # normal numpy continues here
>>
>> Thus, here is my plan:
>>
>> 1. a special context-manager class
>> 2. immutable arrays inside with statement
>> 3. lazy evaluation: expressions build up a parse tree
>> 4. dynamic code generation
>> 5.
evaluation on exit > > You will face one issue here: unless you raise a special exception > inside the with block, the python interpreter will unconditionally > execute that code without your control. ?I had a long talk about this > with Alex Martelli last year at scipy, where I pitched the idea of > allowing context managers to have an optional third method, > __execute__, which would get the code block in the with statement for > execution. ?He was fairly pessimistic about the possibility of this > making its way into python, mostly (if I recall correctly) because of > scoping issues: the with statement does not introduce a new scope, so > you'd need to pass to this method the code plus the locals/globals of > the entire enclosing scope, which felt messy. Sometimes, I fantasize about writing a python4ply grammar that repurposes the `` quotes to provide expression literals and ``` ``` triple quotes for multiline statement literals. They would be literals for _ast abstract syntax trees. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From charlesr.harris at gmail.com Thu Aug 6 23:32:28 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 6 Aug 2009 21:32:28 -0600 Subject: [Numpy-discussion] datetime code Message-ID: Travis, Robert, Is there any reason not to merge the c code in the datetime branch at this time? If not, I will do it. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From dwf at cs.toronto.edu Fri Aug 7 01:03:07 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Fri, 7 Aug 2009 01:03:07 -0400 Subject: [Numpy-discussion] Strange Error with NumPy Addendum In-Reply-To: <3d375d730908061629s721052d6tc8246e4f81231d2c@mail.gmail.com> References: <3d375d730908061629s721052d6tc8246e4f81231d2c@mail.gmail.com> Message-ID: <3FDEDA59-7E92-44EF-9506-4159A0AFDF3E@cs.toronto.edu> On 6-Aug-09, at 7:29 PM, Robert Kern wrote: > For that value, yes, but not for long objects in general. We don't > look at the value itself, just the type. Err, don't look at the value (of a long), except when it's representable with an integer dtype, right? Hence why 2**63 - 1 works. David From robert.kern at gmail.com Fri Aug 7 01:05:44 2009 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 7 Aug 2009 00:05:44 -0500 Subject: [Numpy-discussion] Strange Error with NumPy Addendum In-Reply-To: <3FDEDA59-7E92-44EF-9506-4159A0AFDF3E@cs.toronto.edu> References: <3d375d730908061629s721052d6tc8246e4f81231d2c@mail.gmail.com> <3FDEDA59-7E92-44EF-9506-4159A0AFDF3E@cs.toronto.edu> Message-ID: <3d375d730908062205v5e9e3a96y1e5074cf25774d26@mail.gmail.com> On Fri, Aug 7, 2009 at 00:03, David Warde-Farley wrote: > On 6-Aug-09, at 7:29 PM, Robert Kern wrote: > >> For that value, yes, but not for long objects in general. We don't >> look at the value itself, just the type. > > Err, don't look at the value (of a long), except when it's > representable with an integer dtype, right? Hence why 2**63 - 1 works. Err, you may be right. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco

From seb.haase at gmail.com Fri Aug 7 03:23:56 2009 From: seb.haase at gmail.com (Sebastian Haase) Date: Fri, 7 Aug 2009 09:23:56 +0200 Subject: [Numpy-discussion] Optimized half-sizing of images? In-Reply-To: <86638F54-75A9-4DBD-9FA5-0F5C08794CA1@yale.edu> References: <4A7B3A19.7040800@noaa.gov> <86638F54-75A9-4DBD-9FA5-0F5C08794CA1@yale.edu> Message-ID:

On Fri, Aug 7, 2009 at 3:46 AM, Zachary Pincus wrote:
>> We have a need to generate half-size version of RGB images as
>> quickly as possible.
>
> How good do these need to look? You could just throw away every other
> pixel... image[::2, ::2].
>
> Failing that, you could also try using ndimage's convolve routines to
> run a 2x2 box filter over the image, and then throw away half of the
> pixels. But this would be slower than optimal, because the kernel
> would be convolved over every pixel, not just the ones you intend to
> keep.
>
> Really though, I'd just bite the bullet and write a C extension (or
> cython, whatever, an extension to work for a defined-dimensionality,
> defined-dtype array is pretty simple), or as suggested before, do it
> on the GPU. (Though I find that readback from the GPU can be slow
> enough that C code can beat it in some cases.)
>
> Zach

Chris,
regarding your concerns of doing too fancy interpolation at the cost of speed, I would guess the overall bottleneck is rather the memory access than the extra CPU cycles needed for interpolation. Regarding ndimage.zoom it should be able to "not zoom" the color-axis but the others in one call.

Cheers,

--
Sebastian Haase

From robertwb at math.washington.edu Fri Aug 7 03:34:17 2009 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Fri, 7 Aug 2009 00:34:17 -0700 Subject: [Numpy-discussion] Optimized half-sizing of images? In-Reply-To: References: <4A7B3A19.7040800@noaa.gov> <86638F54-75A9-4DBD-9FA5-0F5C08794CA1@yale.edu> Message-ID: <12BEE90B-55D5-42ED-B202-B0F7D09040B4@math.washington.edu>

On Aug 7, 2009, at 12:23 AM, Sebastian Haase wrote:
> On Fri, Aug 7, 2009 at 3:46 AM, Zachary Pincus wrote:
>>> We have a need to generate half-size version of RGB images as
>>> quickly as possible.
>>
>> How good do these need to look? You could just throw away every other
>> pixel... image[::2, ::2].
>>
>> Failing that, you could also try using ndimage's convolve routines to
>> run a 2x2 box filter over the image, and then throw away half of the
>> pixels. But this would be slower than optimal, because the kernel
>> would be convolved over every pixel, not just the ones you intend to
>> keep.
>> Really though, I'd just bite the bullet

You say that as if it's painful to do so :)

-------------------------------------
import cython
import numpy as np
cimport numpy as np

@cython.boundscheck(False)
def halfsize_cython(np.ndarray[np.uint8_t, ndim=2, mode="c"] a):
    cdef unsigned int i, j, w, h
    w, h = a.shape[0], a.shape[1]
    cdef np.ndarray[np.uint8_t, ndim=2, mode="c"] a2 = np.ndarray((w/2, h/2), np.uint8)
    for i in range(w/2):
        for j in range(h/2):
            a2[i,j] = (a[2*i,2*j] + a[2*i+1,2*j] + a[2*i,2*j+1] + a[2*i+1,2*j+1])/4
    return a2

def halfsize_slicing(a):
    a2 = a[0::2, 0::2].astype(np.uint8) / 4
    a2 += a[0::2, 1::2] / 4
    a2 += a[1::2, 0::2] / 4
    a2 += a[1::2, 1::2] / 4
    return a2
-------------------------------------

sage: import numpy; from half_size import *
sage: a = numpy.ndarray((512, 512), numpy.uint8)
sage: timeit("halfsize_cython(a)")
625 loops, best of 3: 604 µs per loop
sage: timeit("halfsize_slicing(a)")
5 loops, best of 3: 2.72 ms per loop

>> and write a C extension (or cython, whatever, an extension to work
>> for a defined-dimensionality,
>> defined-dtype array is pretty simple), or as suggested before, do it
>> on the GPU. (Though I find that readback from the GPU can be slow
>> enough that C code can beat it in some cases.)
>>
>> Zach
>
> Chris,
> regarding your concerns of doing too fancy interpolation at the cost of
> speed, I would guess the overall bottleneck is rather the memory
> access than the extra CPU cycles needed for interpolation.
> Regarding ndimage.zoom it should be able to "not zoom" the color-axis
> but the others in one call.

I was about to say the same thing, it's probably the memory, not cycles, that's hurting you. Of course 512x512 is still small enough to fit in L2 of any modern computer.

- Robert

From romain.brette at ens.fr Fri Aug 7 06:06:36 2009 From: romain.brette at ens.fr (Romain Brette) Date: Fri, 07 Aug 2009 12:06:36 +0200 Subject: [Numpy-discussion] Fwd: GPU Numpy In-Reply-To: <4A7B43CE.7050509@molden.no> References: <7f1eaee30908061012x2bd69e6i1550787f10cd6aaf@mail.gmail.com> <7f1eaee30908061041l2cd76f64r96e7f5c7c16a2483@mail.gmail.com> <4A7B43CE.7050509@molden.no> Message-ID:

Sturla Molden a écrit :
> Thus, here is my plan:
>
> 1. a special context-manager class
> 2. immutable arrays inside with statement
> 3. lazy evaluation: expressions build up a parse tree
> 4. dynamic code generation
> 5. evaluation on exit
>

There seems to be some similarity with what we want to do to accelerate our neural simulations (briansimulator.org), as described here:

http://brian.svn.sourceforge.net/viewvc/brian/trunk/dev/BEPs/BEP-9-Automatic%20code%20generation.txt?view=markup

(by the way BEP is "Brian Enhancement Proposal")

The speed-up factor we got in our experimental code with GPU is very substantial when there are many neurons (= large vectors, e.g. 10 000 elements), even when operations are simple.

Romain

From npuloski at gmail.com Fri Aug 7 08:15:02 2009 From: npuloski at gmail.com (Nanime Puloski) Date: Fri, 7 Aug 2009 08:15:02 -0400 Subject: [Numpy-discussion] Strange Error with NumPy Addendum In-Reply-To: <3FDEDA59-7E92-44EF-9506-4159A0AFDF3E@cs.toronto.edu> References: <3d375d730908061629s721052d6tc8246e4f81231d2c@mail.gmail.com> <3FDEDA59-7E92-44EF-9506-4159A0AFDF3E@cs.toronto.edu> Message-ID:

But if it were an unsigned int64, it should be able to hold 2**64 or at least 2**64-1. Am I correct?

On Fri, Aug 7, 2009 at 1:03 AM, David Warde-Farley wrote:
> On 6-Aug-09, at 7:29 PM, Robert Kern wrote:
>
> > For that value, yes, but not for long objects in general. We don't
> > look at the value itself, just the type.
>
> Err, don't look at the value (of a long), except when it's
> representable with an integer dtype, right? Hence why 2**63 - 1 works.
>
> David
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>

-------------- next part -------------- An HTML attachment was scrubbed... URL:
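Checking the range question above with numpy itself (a small sketch; iinfo reports the representable limits):

    import numpy as np

    print np.iinfo(np.uint64).max == 2**64 - 1   # True: the largest uint64
    print np.uint64(2**64 - 1)                   # representable
    # 2**64 itself is one past the top of uint64, so it stays a Python long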
From chanley at stsci.edu Fri Aug 7 10:46:48 2009 From: chanley at stsci.edu (Christopher Hanley) Date: Fri, 07 Aug 2009 10:46:48 -0400 Subject: [Numpy-discussion] Test failures for rev7299 Message-ID: <4A7C3E58.1000705@stsci.edu>

Hi,

I receive the following test errors after building numpy rev7229 from svn:

======================================================================
FAIL: test_simple_circular (test_multiarray.TestStackedNeighborhoodIter)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/chanley/dev/site-packages/lib/python/numpy/core/tests/test_multiarray.py", line 1344, in test_simple_circular
    assert_array_equal(l, r)
  File "/Users/chanley/dev/site-packages/lib/python/numpy/testing/utils.py", line 639, in assert_array_equal
    verbose=verbose, header='Arrays are not equal')
  File "/Users/chanley/dev/site-packages/lib/python/numpy/testing/utils.py", line 571, in assert_array_compare
    raise AssertionError(msg)
AssertionError:
Arrays are not equal

(mismatch 6.66666666667%)
 x: array([[ 0., 1., 2.],
       [ 1., 2., 3.],
       [ 2., 3., 0.],...
 y: array([[ 0., 1., 2.],
       [ 1., 2., 3.],
       [ 2., 3., 0.],...

======================================================================
FAIL: test_simple_mirror (test_multiarray.TestStackedNeighborhoodIter)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/chanley/dev/site-packages/lib/python/numpy/core/tests/test_multiarray.py", line 1296, in test_simple_mirror
    assert_array_equal(l, r)
  File "/Users/chanley/dev/site-packages/lib/python/numpy/testing/utils.py", line 639, in assert_array_equal
    verbose=verbose, header='Arrays are not equal')
  File "/Users/chanley/dev/site-packages/lib/python/numpy/testing/utils.py", line 571, in assert_array_compare
    raise AssertionError(msg)
AssertionError:
Arrays are not equal

(mismatch 6.66666666667%)
 x: array([[ 0., 1., 2.],
       [ 1., 2., 3.],
       [ 2., 3., 0.],...
 y: array([[ 0., 1., 2.],
       [ 1., 2., 3.],
       [ 2., 3., 0.],...

----------------------------------------------------------------------
Ran 2186 tests in 10.671s

FAILED (KNOWNFAIL=1, SKIP=3, failures=2)

I'm running on an Intel MacBook Pro running OS X 10.5.8. I am using Python 2.5.1.

Chris

--
Christopher Hanley Senior Systems Software Engineer Space Telescope Science Institute 3700 San Martin Drive Baltimore MD, 21218 (410) 338-4338
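(For reference, a run like the one above can be reproduced roughly as follows; this sketch assumes the nose test framework is installed.)

    import numpy
    print numpy.__version__
    numpy.test()        # prints the "Ran N tests ... FAILED/OK" summary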
From david at ar.media.kyoto-u.ac.jp Fri Aug 7 10:51:57 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Fri, 07 Aug 2009 23:51:57 +0900 Subject: [Numpy-discussion] Test failures for rev7299 In-Reply-To: <4A7C3E58.1000705@stsci.edu> References: <4A7C3E58.1000705@stsci.edu> Message-ID: <4A7C3F8D.20400@ar.media.kyoto-u.ac.jp>

Christopher Hanley wrote:
> Hi,
>
> I receive the following test errors after building numpy rev7229 from svn:
>

Yep, a bug slipped in the last commit, I am fixing it right now,

David

From cournape at gmail.com Fri Aug 7 11:29:28 2009 From: cournape at gmail.com (David Cournapeau) Date: Sat, 8 Aug 2009 00:29:28 +0900 Subject: [Numpy-discussion] Test failures for rev7299 In-Reply-To: <4A7C3F8D.20400@ar.media.kyoto-u.ac.jp> References: <4A7C3E58.1000705@stsci.edu> <4A7C3F8D.20400@ar.media.kyoto-u.ac.jp> Message-ID: <5b8d13220908070829v1a3c5ebco70e1deb0436c45e6@mail.gmail.com>

On Fri, Aug 7, 2009 at 11:51 PM, David Cournapeau wrote:
> Christopher Hanley wrote:
>> Hi,
>>
>> I receive the following test errors after building numpy rev7229 from svn:
>>
>
> Yep, a bug slipped in the last commit, I am fixing it right now,

Hm, the fix does not look so obvious, so I just reverted the faulty commit. The whole test suite passes (modulo the f2py tests, which have been broken for a while now),

cheers,

David

From Chris.Barker at noaa.gov Fri Aug 7 12:28:53 2009 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Fri, 07 Aug 2009 09:28:53 -0700 Subject: [Numpy-discussion] Optimized half-sizing of images? In-Reply-To: <86638F54-75A9-4DBD-9FA5-0F5C08794CA1@yale.edu> References: <4A7B3A19.7040800@noaa.gov> <86638F54-75A9-4DBD-9FA5-0F5C08794CA1@yale.edu> Message-ID: <4A7C5645.6090204@noaa.gov>

Zachary Pincus wrote:
>> We have a need to generate half-size version of RGB images as
>> quickly as possible.
>
> How good do these need to look? You could just throw away every other
> pixel... image[::2, ::2].

I want as good quality as I can get. Throwing away pixels gets a bit ugly.

> Failing that, you could also try using ndimage's convolve routines to
> run a 2x2 box filter over the image, and then throw away half of the
> pixels. But this would be slower than optimal, because the kernel
> would be convolved over every pixel, not just the ones you intend to
> keep.

yup -- worth a try though.

> Really though, I'd just bite the bullet and write a C extension (or
> cython, whatever, an extension to work for a defined-dimensionality,
> defined-dtype array is pretty simple),

I was going to sit down and do that this morning, but...

> or as suggested before, do it
> on the GPU.

I have no idea how to do that, except maybe pyOpenGL, which is on our list to try.

Sebastian Haase wrote:
> regarding your concerns of doing too fancy interpolation at the cost of
> speed, I would guess the overall bottleneck is rather the memory
> access than the extra CPU cycles needed for interpolation.

well, could be, though I can't really know 'till I try. One example, though, is using ndimage.zoom -- order 1 interpolation is MUCH faster than order 2 or 3.

> Regarding ndimage.zoom it should be able to "not zoom" the color-axis
> but the others in one call.

well, that's what I thought, but I can't figure out how to do it. The docs are a bit sparse. Here's my offer: If someone tells me how to do it, I'll make a docs contribution to the SciPy docs explaining it.
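As it happens, zoom accepts a per-axis zoom factor, which would handle all three bands in one call. A sketch, assuming scipy is installed:

    import numpy as np
    import scipy.ndimage as ndi

    a = np.zeros((512, 512, 3), dtype=np.uint8)
    a2 = ndi.zoom(a, (0.5, 0.5, 1.0), order=1)  # halve rows and cols, keep bands
    print a2.shape                              # (256, 256, 3)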
> You say that as if it's painful to do so :)

wow! Thanks for doing my work for me. I thought this would be a good case to give Cython a try for the first time -- having a working example is great.

> sage: timeit("halfsize_cython(a)")
> 625 loops, best of 3: 604 µs per loop
> sage: timeit("halfsize_slicing(a)")
> 5 loops, best of 3: 2.72 ms per loop

and bingo! a 4.5 times speed-up -- I think that's enough to see in our app.

> I was about to say the same thing, it's probably the memory, not
> cycles, that's hurting you.

sure, but the slicing method pushes that memory around more than it needs to.

> Of course 512x512 is still small enough
> to fit in L2 of any modern computer.

I think so -- I do know that the slicing method slows down a lot with larger images. We're tiling anyway in this case, but if I did want to do a big image, I'd probably break it down into chunks to process it anyway.

thanks, all.

-Chris

--
Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov

From tjhnson at gmail.com Fri Aug 7 13:42:06 2009 From: tjhnson at gmail.com (T J) Date: Fri, 7 Aug 2009 10:42:06 -0700 Subject: [Numpy-discussion] Vectorize ufunc Message-ID:

I was wondering why vectorize doesn't make the ufunc available at the topmost level....

    >>> def a(x,y): return x + y
    >>> b = vectorize(a)
    >>> b.reduce

Instead, the ufunc is stored at b.ufunc.

Also, b.ufunc.reduce() doesn't seem to exist until I *use* the vectorized function at least once. Can this be changed so that it exists right away (and preferably at b.reduce instead of b.ufunc.reduce)?

From alan at ajackson.org Fri Aug 7 13:45:25 2009 From: alan at ajackson.org (alan at ajackson.org) Date: Fri, 7 Aug 2009 12:45:25 -0500 Subject: [Numpy-discussion] Power distribution Message-ID: <20090807124525.65dd3fcf@ajackson.org>

Documenting my way through the statistics modules in numpy, I ran into the Power Distribution.

Anyone know what that is? I Googled for it, and found a lot of stuff on electricity, but no reference for a statistical distribution of that name. Does it have a common alias?

--
-----------------------------------------------------------------------
| Alan K. Jackson    | To see a World in a Grain of Sand       |
| alan at ajackson.org  | And a Heaven in a Wild Flower,          |
| www.ajackson.org   | Hold Infinity in the palm of your hand  |
| Houston, Texas     | And Eternity in an hour.
- Blake |
-----------------------------------------------------------------------

From HAWRYLA at novachem.com Fri Aug 7 13:47:34 2009 From: HAWRYLA at novachem.com (Andrew Hawryluk) Date: Fri, 7 Aug 2009 11:47:34 -0600 Subject: [Numpy-discussion] Power distribution In-Reply-To: <20090807124525.65dd3fcf@ajackson.org> References: <20090807124525.65dd3fcf@ajackson.org> Message-ID: <48C01AE7354EC240A26F19CEB995E943033AF254@CHMAILMBX01.novachem.com>

You might get better results for 'power-law distribution'
http://en.wikipedia.org/wiki/Power_law

Andrew

> -----Original Message-----
> From: numpy-discussion-bounces at scipy.org [mailto:numpy-discussion-
> bounces at scipy.org] On Behalf Of alan at ajackson.org
> Sent: 7 Aug 2009 11:45 AM
> To: Discussion of Numerical Python
> Subject: [Numpy-discussion] Power distribution
>
> Documenting my way through the statistics modules in numpy, I ran into
> the Power Distribution.
>
> Anyone know what that is? I Googled for it, and found a lot of stuff on
> electricity, but no reference for a statistical distribution of that
> name. Does it have a common alias?

From kuiper at jpl.nasa.gov Fri Aug 7 14:05:14 2009 From: kuiper at jpl.nasa.gov (Tom Kuiper) Date: Fri, 7 Aug 2009 11:05:14 -0700 Subject: [Numpy-discussion] memmap capability Message-ID: <4A7C6CDA.9050400@jpl.nasa.gov>

If this appears twice, forgive me. I sent it previously (7:13 am PDT) via a browser interface to JPL's Office Outlook. I have doubts about this system. This time, from Iceweasel through our SMTP server.

There are two things I'd like to do using memmap. I suspect that they are impossible but maybe I'm missing some subtlety.

1) I would like to append rows to a memmap array and have the modified array changed on disk also.

2) I would like to have the memory view of the array on disk change, i.e., modify the offset for an opened array. The only way I can think of involves opening and closing arrays repeatedly.

Regards

Tom Kuiper

-------------- next part -------------- An HTML attachment was scrubbed... URL:

From tjhnson at gmail.com Fri Aug 7 14:54:24 2009 From: tjhnson at gmail.com (T J) Date: Fri, 7 Aug 2009 11:54:24 -0700 Subject: [Numpy-discussion] reduce function of vectorize doesn't respect dtype? Message-ID:

The reduce function of the ufunc of a vectorized function doesn't seem to respect the dtype.

    >>> def a(x,y): return x+y
    >>> b = vectorize(a)
    >>> c = array([1,2])
    >>> b(c, c)  # use once to populate b.ufunc
    >>> d = b.ufunc.reduce(c)
    >>> c.dtype, type(d)
    (dtype('int32'), <type 'int'>)
    >>> c = array([[1,2,3],[4,5,6]])
    >>> b.ufunc.reduce(c)
    array([5, 7, 9], dtype=object)

My goal is to use the output of vectorize() as if it is actually a ufunc. So I'd really like to just type: b.reduce, b.accumulate, etc.
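A sketch of what is happening here: the ufunc that vectorize builds works in object mode, much like one made with np.frompyfunc, so reductions come back as object arrays until they are cast explicitly.

    import numpy as np

    def add(x, y):
        return x + y

    u = np.frompyfunc(add, 2, 1)          # an object-dtype ufunc, like b.ufunc
    c = np.array([[1, 2, 3], [4, 5, 6]])
    r = u.reduce(c)                       # array([5, 7, 9], dtype=object)
    print r.astype(np.int32)              # the cast has to be done by hand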
And I don't want to have to write (after using it once to populate b.ufunc): b.ufunc.reduce(c).astype(numpy.int32) From oliphant at enthought.com Fri Aug 7 15:35:02 2009 From: oliphant at enthought.com (Travis Oliphant) Date: Fri, 7 Aug 2009 13:35:02 -0600 Subject: [Numpy-discussion] Vectorize ufunc In-Reply-To: References: Message-ID: <1FFF112C-F6B0-4C1E-B391-F0710E6C4ADE@enthought.com> The short answer is that it was easier this way. The ufunc is created on the fly and it needs to know several things that are easy to get once the function is called. Sent from my iPhone On Aug 7, 2009, at 11:42 AM, T J wrote: > I was wondering why vectorize doesn't make the ufunc available at the > topmost level.... > >>>> def a(x,y): return x + y >>>> b = vectorize(a) >>>> b.reduce > > Instead, the ufunc is stored at b.ufunc. > > Also, b.ufunc.reduce() doesn't seem to exist until I *use* the > vectorized function at least once. Can this be changed so that it > exists right away (and preferably at b.reduce instead of > b.ufunc.reduce)? > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From alan at ajackson.org Fri Aug 7 16:49:06 2009 From: alan at ajackson.org (alan at ajackson.org) Date: Fri, 7 Aug 2009 15:49:06 -0500 Subject: [Numpy-discussion] Power distribution In-Reply-To: <48C01AE7354EC240A26F19CEB995E943033AF254@CHMAILMBX01.novachem.com> References: <20090807124525.65dd3fcf@ajackson.org> <48C01AE7354EC240A26F19CEB995E943033AF254@CHMAILMBX01.novachem.com> Message-ID: <20090807154906.223fe780@ajackson.org> I don't think that is it, since the one in numpy has a range restricted to the interval 0-1. Try out hist(np.random.power(5, 1000000), bins=100) >You might get better results for 'power-law distribution' >http://en.wikipedia.org/wiki/Power_law > >Andrew > >> -----Original Message----- >> From: numpy-discussion-bounces at scipy.org [mailto:numpy-discussion- >> bounces at scipy.org] On Behalf Of alan at ajackson.org >> Sent: 7 Aug 2009 11:45 AM >> To: Discussion of Numerical Python >> Subject: [Numpy-discussion] Power distribution >> >> Documenting my way through the statistics modules in numpy, I ran into >> the Power Distribution. >> >> Anyone know what that is? I Googled for it, and found a lot of stuff >on >> electricity, but no reference for a statistical distribution of that >> name. Does it have a common alias? >> >> -- >> >----------------------------------------------------------------------- >> | Alan K. Jackson | To see a World in a Grain of Sand >| >> | alan at ajackson.org | And a Heaven in a Wild Flower, >| >> | www.ajackson.org | Hold Infinity in the palm of your hand >| >> | Houston, Texas | And Eternity in an hour. - Blake >| >> >----------------------------------------------------------------------- >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion -- ----------------------------------------------------------------------- | Alan K. Jackson | To see a World in a Grain of Sand | | alan at ajackson.org | And a Heaven in a Wild Flower, | | www.ajackson.org | Hold Infinity in the palm of your hand | | Houston, Texas | And Eternity in an hour. 
- Blake | ----------------------------------------------------------------------- From tjhnson at gmail.com Fri Aug 7 17:19:16 2009 From: tjhnson at gmail.com (T J) Date: Fri, 7 Aug 2009 14:19:16 -0700 Subject: [Numpy-discussion] dot documentation Message-ID: Hi, the documentation for dot says that a value error is raised if: If the last dimension of a is not the same size as the second-to-last dimension of b. (http://docs.scipy.org/doc/numpy/reference/generated/numpy.dot.htm) This doesn't appear to be the case: >>> a = array([[1,2],[3,4]]) >>> b = array([1,2]) >>> dot(a,b) array([5,11]) I can see *how* 5,11 is obtained, but it seems this should have raised a ValueError since the 2 != 1. So the actual code must do something more involved. When I think about broadcasting, it seems that maybe b should have been broadcasted to: --> array([[1,2],[1,2]]) and then the multiplication done as normal (but this would give a 2x2 result). Can someone explain this to me? From tjhnson at gmail.com Fri Aug 7 17:24:24 2009 From: tjhnson at gmail.com (T J) Date: Fri, 7 Aug 2009 14:24:24 -0700 Subject: [Numpy-discussion] dot documentation In-Reply-To: References: Message-ID: Oh. b.shape = (2,). So I suppose the second to last dimension is, in fact, the last dimension...and 2 == 2. nvm On Fri, Aug 7, 2009 at 2:19 PM, T J wrote: > Hi, ?the documentation for dot says that a value error is raised if: > > ? ?If the last dimension of a is not the same size as the > second-to-last dimension of b. > > (http://docs.scipy.org/doc/numpy/reference/generated/numpy.dot.htm) > > This doesn't appear to be the case: > >>>> a = array([[1,2],[3,4]]) >>>> b = array([1,2]) >>>> dot(a,b) > array([5,11]) > > I can see *how* 5,11 is obtained, but it seems this should have raised > a ValueError since the 2 != 1. ?So the actual code must do something > more involved. ?When I think about broadcasting, it seems that maybe b > should have been broadcasted to: > > --> ?array([[1,2],[1,2]]) > > and then the multiplication done as normal (but this would give a 2x2 result). > > Can someone explain this to me? > From HAWRYLA at novachem.com Fri Aug 7 17:25:21 2009 From: HAWRYLA at novachem.com (Andrew Hawryluk) Date: Fri, 7 Aug 2009 15:25:21 -0600 Subject: [Numpy-discussion] Power distribution In-Reply-To: <20090807154906.223fe780@ajackson.org> References: <20090807124525.65dd3fcf@ajackson.org><48C01AE7354EC240A26F19CEB995E943033AF254@CHMAILMBX01.novachem.com> <20090807154906.223fe780@ajackson.org> Message-ID: <48C01AE7354EC240A26F19CEB995E943033AF256@CHMAILMBX01.novachem.com> Hmm ... good point. It appears to give a probability distribution proportional to x**(a-1), but I see no good reason why the domain should be limited to [0,1]. def test(a): nums = plt.hist(np.random.power(a,100000),bins=100,ec='none',fc='#dddddd') x = np.linspace(0,1,200) plt.plot(x,nums[0][-1]*x**(a-1)) Andrew > -----Original Message----- > From: numpy-discussion-bounces at scipy.org [mailto:numpy-discussion- > bounces at scipy.org] On Behalf Of alan at ajackson.org > Sent: 7 Aug 2009 2:49 PM > To: Discussion of Numerical Python > Subject: Re: [Numpy-discussion] Power distribution > > I don't think that is it, since the one in numpy has a range restricted > to the interval 0-1. 
> > Try out hist(np.random.power(5, 1000000), bins=100) > > >You might get better results for 'power-law distribution' > >http://en.wikipedia.org/wiki/Power_law > > > >Andrew > > > >> -----Original Message----- > >> From: numpy-discussion-bounces at scipy.org [mailto:numpy-discussion- > >> bounces at scipy.org] On Behalf Of alan at ajackson.org > >> Sent: 7 Aug 2009 11:45 AM > >> To: Discussion of Numerical Python > >> Subject: [Numpy-discussion] Power distribution > >> > >> Documenting my way through the statistics modules in numpy, I ran > >> into the Power Distribution. > >> > >> Anyone know what that is? I Googled for it, and found a lot of stuff > >on > >> electricity, but no reference for a statistical distribution of that > >> name. Does it have a common alias? > >> > >> -- > >> > >---------------------------------------------------------------------- > - > >> | Alan K. Jackson | To see a World in a Grain of Sand > >| > >> | alan at ajackson.org | And a Heaven in a Wild Flower, > >| > >> | www.ajackson.org | Hold Infinity in the palm of your > hand > >| > >> | Houston, Texas | And Eternity in an hour. - Blake > >| > >> > >---------------------------------------------------------------------- > - > >> _______________________________________________ > >> NumPy-Discussion mailing list > >> NumPy-Discussion at scipy.org > >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > -- > ----------------------------------------------------------------------- > | Alan K. Jackson | To see a World in a Grain of Sand | > | alan at ajackson.org | And a Heaven in a Wild Flower, | > | www.ajackson.org | Hold Infinity in the palm of your hand | > | Houston, Texas | And Eternity in an hour. - Blake | > ----------------------------------------------------------------------- > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From josef.pktd at gmail.com Fri Aug 7 17:42:19 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 7 Aug 2009 17:42:19 -0400 Subject: [Numpy-discussion] Power distribution In-Reply-To: <48C01AE7354EC240A26F19CEB995E943033AF256@CHMAILMBX01.novachem.com> References: <20090807124525.65dd3fcf@ajackson.org> <48C01AE7354EC240A26F19CEB995E943033AF254@CHMAILMBX01.novachem.com> <20090807154906.223fe780@ajackson.org> <48C01AE7354EC240A26F19CEB995E943033AF256@CHMAILMBX01.novachem.com> Message-ID: <1cd32cbb0908071442va8eb05r785b9bde6819ab89@mail.gmail.com> On Fri, Aug 7, 2009 at 5:25 PM, Andrew Hawryluk wrote: > Hmm ... good point. > It appears to give a probability distribution proportional to x**(a-1), > but I see no good reason why the domain should be limited to [0,1]. > > def test(a): > ? ?nums = > plt.hist(np.random.power(a,100000),bins=100,ec='none',fc='#dddddd') > ? ?x = np.linspace(0,1,200) > ? ?plt.plot(x,nums[0][-1]*x**(a-1)) > > Andrew > > > >> -----Original Message----- >> From: numpy-discussion-bounces at scipy.org [mailto:numpy-discussion- >> bounces at scipy.org] On Behalf Of alan at ajackson.org >> Sent: 7 Aug 2009 2:49 PM >> To: Discussion of Numerical Python >> Subject: Re: [Numpy-discussion] Power distribution >> >> I don't think that is it, since the one in numpy has a range > restricted >> to the interval 0-1. 
>> >> Try out hist(np.random.power(5, 1000000), bins=100) >> >> >You might get better results for 'power-law distribution' >> >http://en.wikipedia.org/wiki/Power_law >> > >> >Andrew >> > >> >> -----Original Message----- >> >> From: numpy-discussion-bounces at scipy.org [mailto:numpy-discussion- >> >> bounces at scipy.org] On Behalf Of alan at ajackson.org >> >> Sent: 7 Aug 2009 11:45 AM >> >> To: Discussion of Numerical Python >> >> Subject: [Numpy-discussion] Power distribution >> >> >> >> Documenting my way through the statistics modules in numpy, I ran >> >> into the Power Distribution. >> >> >> >> Anyone know what that is? I Googled for it, and found a lot of > stuff >> >on >> >> electricity, but no reference for a statistical distribution of > that >> >> name. Does it have a common alias? >> >> >> >> -- same is in Travis' notes on the distribution and scipy.stats.distributions domain in [0,1], but I don't know anything about it either ## Power-function distribution ## Special case of beta dist. with d =1.0 class powerlaw_gen(rv_continuous): def _pdf(self, x, a): return a*x**(a-1.0) def _cdf(self, x, a): return x**(a*1.0) def _ppf(self, q, a): return pow(q, 1.0/a) def _stats(self, a): return a/(a+1.0), a*(a+2.0)/(a+1.0)**2, \ 2*(1.0-a)*sqrt((a+2.0)/(a*(a+3.0))), \ 6*polyval([1,-1,-6,2],a)/(a*(a+3.0)*(a+4)) def _entropy(self, a): return 1 - 1.0/a - log(a) powerlaw = powerlaw_gen(a=0.0, b=1.0, name="powerlaw", longname="A power-function", shapes="a", extradoc=""" Power-function distribution powerlaw.pdf(x,a) = a*x**(a-1) for 0 <= x <= 1, a > 0. """ ) From josef.pktd at gmail.com Fri Aug 7 18:13:20 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 7 Aug 2009 18:13:20 -0400 Subject: [Numpy-discussion] Power distribution In-Reply-To: <1cd32cbb0908071442va8eb05r785b9bde6819ab89@mail.gmail.com> References: <20090807124525.65dd3fcf@ajackson.org> <48C01AE7354EC240A26F19CEB995E943033AF254@CHMAILMBX01.novachem.com> <20090807154906.223fe780@ajackson.org> <48C01AE7354EC240A26F19CEB995E943033AF256@CHMAILMBX01.novachem.com> <1cd32cbb0908071442va8eb05r785b9bde6819ab89@mail.gmail.com> Message-ID: <1cd32cbb0908071513l29fa7fbgd35cb7ba428642da@mail.gmail.com> On Fri, Aug 7, 2009 at 5:42 PM, wrote: > On Fri, Aug 7, 2009 at 5:25 PM, Andrew Hawryluk wrote: >> Hmm ... good point. >> It appears to give a probability distribution proportional to x**(a-1), >> but I see no good reason why the domain should be limited to [0,1]. >> >> def test(a): >> ? ?nums = >> plt.hist(np.random.power(a,100000),bins=100,ec='none',fc='#dddddd') >> ? ?x = np.linspace(0,1,200) >> ? ?plt.plot(x,nums[0][-1]*x**(a-1)) >> >> Andrew >> >> >> >>> -----Original Message----- >>> From: numpy-discussion-bounces at scipy.org [mailto:numpy-discussion- >>> bounces at scipy.org] On Behalf Of alan at ajackson.org >>> Sent: 7 Aug 2009 2:49 PM >>> To: Discussion of Numerical Python >>> Subject: Re: [Numpy-discussion] Power distribution >>> >>> I don't think that is it, since the one in numpy has a range >> restricted >>> to the interval 0-1. 
>>> >>> Try out hist(np.random.power(5, 1000000), bins=100) >>> >>> >You might get better results for 'power-law distribution' >>> >http://en.wikipedia.org/wiki/Power_law >>> > >>> >Andrew >>> > >>> >> -----Original Message----- >>> >> From: numpy-discussion-bounces at scipy.org [mailto:numpy-discussion- >>> >> bounces at scipy.org] On Behalf Of alan at ajackson.org >>> >> Sent: 7 Aug 2009 11:45 AM >>> >> To: Discussion of Numerical Python >>> >> Subject: [Numpy-discussion] Power distribution >>> >> >>> >> Documenting my way through the statistics modules in numpy, I ran >>> >> into the Power Distribution. >>> >> >>> >> Anyone know what that is? I Googled for it, and found a lot of >> stuff >>> >on >>> >> electricity, but no reference for a statistical distribution of >> that >>> >> name. Does it have a common alias? >>> >> >>> >> -- > > > same is in Travis' notes on the distribution and scipy.stats.distributions > domain in [0,1], but I don't know anything about it either > > ## Power-function distribution > ## ? Special case of beta dist. with d =1.0 > > class powerlaw_gen(rv_continuous): > ? ?def _pdf(self, x, a): > ? ? ? ?return a*x**(a-1.0) > ? ?def _cdf(self, x, a): > ? ? ? ?return x**(a*1.0) > ? ?def _ppf(self, q, a): > ? ? ? ?return pow(q, 1.0/a) > ? ?def _stats(self, a): > ? ? ? ?return a/(a+1.0), a*(a+2.0)/(a+1.0)**2, \ > ? ? ? ? ? ? ? 2*(1.0-a)*sqrt((a+2.0)/(a*(a+3.0))), \ > ? ? ? ? ? ? ? 6*polyval([1,-1,-6,2],a)/(a*(a+3.0)*(a+4)) > ? ?def _entropy(self, a): > ? ? ? ?return 1 - 1.0/a - log(a) > powerlaw = powerlaw_gen(a=0.0, b=1.0, name="powerlaw", > ? ? ? ? ? ? ? ? ? ? ? ?longname="A power-function", > ? ? ? ? ? ? ? ? ? ? ? ?shapes="a", extradoc=""" > > Power-function distribution > > powerlaw.pdf(x,a) = a*x**(a-1) > for 0 <= x <= 1, a > 0. > """ > ? ? ? ? ? ? ? ? ? ? ? ?) > it looks like it's the same distribution, even though it doesn't use the random numbers from the numpy function high p-values with Kolmogorov-Smirnov, see below I assume it is a truncated version of *a* powerlaw distribution, so that a can be large, which would be impossible in the open domain case. But a quick search, I only found powerlaw applications that refer to the tail behavior. Josef >>> rvs = np.random.power(5, 100000) >>> stats.kstest(rvs,'powerlaw',(5,)) (0.0021079715221341555, 0.76587118275752697) >>> rvs = np.random.power(5, 1000000) >>> stats.kstest(rvs,'powerlaw',(5,)) (0.00063983013407076239, 0.80757958281509501) >>> rvs = np.random.power(0.5, 1000000) >>> stats.kstest(rvs,'powerlaw',(0.5,)) (0.00081823148457027539, 0.51478478398950211) From charlesr.harris at gmail.com Fri Aug 7 18:50:32 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 7 Aug 2009 16:50:32 -0600 Subject: [Numpy-discussion] dot documentation In-Reply-To: References: Message-ID: On Fri, Aug 7, 2009 at 3:24 PM, T J wrote: > Oh. b.shape = (2,). So I suppose the second to last dimension is, in > fact, the last dimension...and 2 == 2. > > nvm > > On Fri, Aug 7, 2009 at 2:19 PM, T J wrote: > > Hi, the documentation for dot says that a value error is raised if: > > > > If the last dimension of a is not the same size as the > > second-to-last dimension of b. > > > > (http://docs.scipy.org/doc/numpy/reference/generated/numpy.dot.htm) > > > > This doesn't appear to be the case: > > > >>>> a = array([[1,2],[3,4]]) > >>>> b = array([1,2]) > >>>> dot(a,b) > > array([5,11]) > > > > I can see *how* 5,11 is obtained, but it seems this should have raised > > a ValueError since the 2 != 1. 
So the actual code must do something > > more involved. When I think about broadcasting, it seems that maybe b > > should have been broadcasted to: > > > > --> array([[1,2],[1,2]]) > > > > and then the multiplication done as normal (but this would give a 2x2 > result). > > > > Can someone explain this to me? > > > It looks like a bug in the documentation. Vectors, i.e., 1D arrays, are multiplied as if they were column/row vectors depending on what side of the product they occur. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Fri Aug 7 18:57:15 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 7 Aug 2009 18:57:15 -0400 Subject: [Numpy-discussion] Power distribution In-Reply-To: <1cd32cbb0908071513l29fa7fbgd35cb7ba428642da@mail.gmail.com> References: <20090807124525.65dd3fcf@ajackson.org> <48C01AE7354EC240A26F19CEB995E943033AF254@CHMAILMBX01.novachem.com> <20090807154906.223fe780@ajackson.org> <48C01AE7354EC240A26F19CEB995E943033AF256@CHMAILMBX01.novachem.com> <1cd32cbb0908071442va8eb05r785b9bde6819ab89@mail.gmail.com> <1cd32cbb0908071513l29fa7fbgd35cb7ba428642da@mail.gmail.com> Message-ID: <1cd32cbb0908071557l605d98aboc3dbf8ec47241077@mail.gmail.com> On Fri, Aug 7, 2009 at 6:13 PM, wrote: > On Fri, Aug 7, 2009 at 5:42 PM, wrote: >> On Fri, Aug 7, 2009 at 5:25 PM, Andrew Hawryluk wrote: >>> Hmm ... good point. >>> It appears to give a probability distribution proportional to x**(a-1), >>> but I see no good reason why the domain should be limited to [0,1]. >>> >>> def test(a): >>> ? ?nums = >>> plt.hist(np.random.power(a,100000),bins=100,ec='none',fc='#dddddd') >>> ? ?x = np.linspace(0,1,200) >>> ? ?plt.plot(x,nums[0][-1]*x**(a-1)) >>> >>> Andrew >>> >>> >>> >>>> -----Original Message----- >>>> From: numpy-discussion-bounces at scipy.org [mailto:numpy-discussion- >>>> bounces at scipy.org] On Behalf Of alan at ajackson.org >>>> Sent: 7 Aug 2009 2:49 PM >>>> To: Discussion of Numerical Python >>>> Subject: Re: [Numpy-discussion] Power distribution >>>> >>>> I don't think that is it, since the one in numpy has a range >>> restricted >>>> to the interval 0-1. >>>> >>>> Try out hist(np.random.power(5, 1000000), bins=100) >>>> >>>> >You might get better results for 'power-law distribution' >>>> >http://en.wikipedia.org/wiki/Power_law >>>> > >>>> >Andrew >>>> > >>>> >> -----Original Message----- >>>> >> From: numpy-discussion-bounces at scipy.org [mailto:numpy-discussion- >>>> >> bounces at scipy.org] On Behalf Of alan at ajackson.org >>>> >> Sent: 7 Aug 2009 11:45 AM >>>> >> To: Discussion of Numerical Python >>>> >> Subject: [Numpy-discussion] Power distribution >>>> >> >>>> >> Documenting my way through the statistics modules in numpy, I ran >>>> >> into the Power Distribution. >>>> >> >>>> >> Anyone know what that is? I Googled for it, and found a lot of >>> stuff >>>> >on >>>> >> electricity, but no reference for a statistical distribution of >>> that >>>> >> name. Does it have a common alias? >>>> >> >>>> >> -- >> >> >> same is in Travis' notes on the distribution and scipy.stats.distributions >> domain in [0,1], but I don't know anything about it either >> >> ## Power-function distribution >> ## ? Special case of beta dist. with d =1.0 >> >> class powerlaw_gen(rv_continuous): >> ? ?def _pdf(self, x, a): >> ? ? ? ?return a*x**(a-1.0) >> ? ?def _cdf(self, x, a): >> ? ? ? ?return x**(a*1.0) >> ? ?def _ppf(self, q, a): >> ? ? ? ?return pow(q, 1.0/a) >> ? ?def _stats(self, a): >> ? ? ? 
?return a/(a+1.0), a*(a+2.0)/(a+1.0)**2, \ >> ? ? ? ? ? ? ? 2*(1.0-a)*sqrt((a+2.0)/(a*(a+3.0))), \ >> ? ? ? ? ? ? ? 6*polyval([1,-1,-6,2],a)/(a*(a+3.0)*(a+4)) >> ? ?def _entropy(self, a): >> ? ? ? ?return 1 - 1.0/a - log(a) >> powerlaw = powerlaw_gen(a=0.0, b=1.0, name="powerlaw", >> ? ? ? ? ? ? ? ? ? ? ? ?longname="A power-function", >> ? ? ? ? ? ? ? ? ? ? ? ?shapes="a", extradoc=""" >> >> Power-function distribution >> >> powerlaw.pdf(x,a) = a*x**(a-1) >> for 0 <= x <= 1, a > 0. >> """ >> ? ? ? ? ? ? ? ? ? ? ? ?) >> > > > it looks like it's the same distribution, even though it doesn't use > the random numbers from the numpy function > > high p-values with Kolmogorov-Smirnov, see below > > I assume it is a truncated version of *a* powerlaw distribution, so > that a can be large, which would be impossible in the open domain > case. But a quick search, I only found powerlaw applications that > refer to the tail behavior. > > Josef > >>>> rvs = np.random.power(5, 100000) >>>> stats.kstest(rvs,'powerlaw',(5,)) > (0.0021079715221341555, 0.76587118275752697) >>>> rvs = np.random.power(5, 1000000) >>>> stats.kstest(rvs,'powerlaw',(5,)) > (0.00063983013407076239, 0.80757958281509501) >>>> rvs = np.random.power(0.5, 1000000) >>>> stats.kstest(rvs,'powerlaw',(0.5,)) > (0.00081823148457027539, 0.51478478398950211) > I found a short reference in Johnson, Kotz, Balakrishnan vol. 1 where it is refered to as the "power-function" distribution. roughly: if X is pareto (which kind) distributed, then Y=X**(-1) is distributed according to the power-function distribution. JKB have an extra parameter in there and is a bit more general then the scipy version, or maybe it is just the scale parameter included in the density function. It is also in NIST data plot, but I didn't find the html reference page, but only the pdf http://docs.google.com/gview?a=v&q=cache%3AEgQ6bRkeJl8J%3Awww.itl.nist.gov%2Fdiv898%2Fsoftware%2Fdataplot%2Frefman2%2Fauxillar%2Fpowpdf.pdf+power-function+distribution&hl=en&gl=ca&pli=1 the pdf-files for powpdf and powcdf are here http://www.itl.nist.gov/div898/software/dataplot/refman2/auxillar/homepage.htm I can look some more a bit later tonight. Josef From Chris.Barker at noaa.gov Fri Aug 7 19:31:59 2009 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Fri, 07 Aug 2009 16:31:59 -0700 Subject: [Numpy-discussion] concatenate and non-contiguous arrays? Message-ID: <4A7CB96F.9090607@noaa.gov> Hi all, I just noticed that np.concatenate does not necessarily produce contiguous arrays. I has figured that it was making a copy, so would produce a C-contiguous array, but not so: In [88]: a = np.arange(60).reshape((4,5,3)) In [89]: b = np.concatenate((a, a[:, -1:, :]), axis=1) In [90]: b.flags Out[90]: C_CONTIGUOUS : False F_CONTIGUOUS : False OWNDATA : False WRITEABLE : True ALIGNED : True UPDATEIFCOPY : False I'm also not sure why OWNDATA is false is it sharing data with somethign else? It doesn't look like it is with a. I'll toss a ascontiguous() in my code to take care of this, but I'd like to understand it. Explanation? -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From Chris.Barker at noaa.gov Fri Aug 7 20:03:25 2009 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Fri, 07 Aug 2009 17:03:25 -0700 Subject: [Numpy-discussion] Optimized half-sizing of images? 
In-Reply-To: <12BEE90B-55D5-42ED-B202-B0F7D09040B4@math.washington.edu> References: <4A7B3A19.7040800@noaa.gov> <86638F54-75A9-4DBD-9FA5-0F5C08794CA1@yale.edu> <12BEE90B-55D5-42ED-B202-B0F7D09040B4@math.washington.edu> Message-ID: <4A7CC0CD.5010500@noaa.gov> To finish off the thread for posterity: Robert Bradshaw wrote: Robert's version operated on a 2-d array, so only one band at a time if you have RGB. So I edited it a bit: import cython import numpy as np cimport numpy as np @cython.boundscheck(False) def halfsize(np.ndarray[np.uint8_t, ndim=3, mode="c"] a): cdef unsigned int i, j, b, w, h, d w, h, d = a.shape[0], a.shape[1], a.shape[2] cdef np.ndarray[np.uint8_t, ndim=3, mode="c"] a2 = np.ndarray((w/2, h/2, 3), np.uint8) for i in range(w/2): for j in range(h/2): for b in range(d): # color band a2[i,j,b] = (a[2*i,2*j,b] + a[2*i+1,2*j,b] + a[2*i,2*j+1,b] + a[2*i+1,2*j+1,b])/4 return a2 This now does the whole RGB image at once, and is pretty snappy. Here are my timings for half-sizing a 512x512 RGB image: slicing, accumulating with a float32 time: 89 ms per loop slicing, accumulating with a uint16 time: 67 ms per loop slicing, all calculations in uint8 time: 47 ms per loop using ndimage, one band at a time, 3rd order spline. time: 280 ms per loop using ndimage, one band at a time, 1st order spline. time: 40 ms per loop using cython, one band at a time time: 11.6 ms per loop using cython, all bands at once time: 2.66 ms per loop using PIL BILNEAR interpolation time: 2.66 ms per loop So a ten times speed up over PIL, and a 17 times speed up over my fastest numpy version. If anyone has any suggestions on how to improve on the Cython version, I'd like to hear it, though I doubt it would make a practical difference. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From josef.pktd at gmail.com Fri Aug 7 20:54:49 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 7 Aug 2009 20:54:49 -0400 Subject: [Numpy-discussion] Power distribution In-Reply-To: <1cd32cbb0908071557l605d98aboc3dbf8ec47241077@mail.gmail.com> References: <20090807124525.65dd3fcf@ajackson.org> <48C01AE7354EC240A26F19CEB995E943033AF254@CHMAILMBX01.novachem.com> <20090807154906.223fe780@ajackson.org> <48C01AE7354EC240A26F19CEB995E943033AF256@CHMAILMBX01.novachem.com> <1cd32cbb0908071442va8eb05r785b9bde6819ab89@mail.gmail.com> <1cd32cbb0908071513l29fa7fbgd35cb7ba428642da@mail.gmail.com> <1cd32cbb0908071557l605d98aboc3dbf8ec47241077@mail.gmail.com> Message-ID: <1cd32cbb0908071754w5406ea83v89d78ecce1a81d7f@mail.gmail.com> On Fri, Aug 7, 2009 at 6:57 PM, wrote: > On Fri, Aug 7, 2009 at 6:13 PM, wrote: >> On Fri, Aug 7, 2009 at 5:42 PM, wrote: >>> On Fri, Aug 7, 2009 at 5:25 PM, Andrew Hawryluk wrote: >>>> Hmm ... good point. >>>> It appears to give a probability distribution proportional to x**(a-1), >>>> but I see no good reason why the domain should be limited to [0,1]. >>>> >>>> def test(a): >>>> ? ?nums = >>>> plt.hist(np.random.power(a,100000),bins=100,ec='none',fc='#dddddd') >>>> ? ?x = np.linspace(0,1,200) >>>> ? 
?plt.plot(x,nums[0][-1]*x**(a-1)) >>>> >>>> Andrew >>>> >>>> >>>> >>>>> -----Original Message----- >>>>> From: numpy-discussion-bounces at scipy.org [mailto:numpy-discussion- >>>>> bounces at scipy.org] On Behalf Of alan at ajackson.org >>>>> Sent: 7 Aug 2009 2:49 PM >>>>> To: Discussion of Numerical Python >>>>> Subject: Re: [Numpy-discussion] Power distribution >>>>> >>>>> I don't think that is it, since the one in numpy has a range >>>> restricted >>>>> to the interval 0-1. >>>>> >>>>> Try out hist(np.random.power(5, 1000000), bins=100) >>>>> >>>>> >You might get better results for 'power-law distribution' >>>>> >http://en.wikipedia.org/wiki/Power_law >>>>> > >>>>> >Andrew >>>>> > >>>>> >> -----Original Message----- >>>>> >> From: numpy-discussion-bounces at scipy.org [mailto:numpy-discussion- >>>>> >> bounces at scipy.org] On Behalf Of alan at ajackson.org >>>>> >> Sent: 7 Aug 2009 11:45 AM >>>>> >> To: Discussion of Numerical Python >>>>> >> Subject: [Numpy-discussion] Power distribution >>>>> >> >>>>> >> Documenting my way through the statistics modules in numpy, I ran >>>>> >> into the Power Distribution. >>>>> >> >>>>> >> Anyone know what that is? I Googled for it, and found a lot of >>>> stuff >>>>> >on >>>>> >> electricity, but no reference for a statistical distribution of >>>> that >>>>> >> name. Does it have a common alias? >>>>> >> >>>>> >> -- >>> >>> >>> same is in Travis' notes on the distribution and scipy.stats.distributions >>> domain in [0,1], but I don't know anything about it either >>> >>> ## Power-function distribution >>> ## ? Special case of beta dist. with d =1.0 >>> >>> class powerlaw_gen(rv_continuous): >>> ? ?def _pdf(self, x, a): >>> ? ? ? ?return a*x**(a-1.0) >>> ? ?def _cdf(self, x, a): >>> ? ? ? ?return x**(a*1.0) >>> ? ?def _ppf(self, q, a): >>> ? ? ? ?return pow(q, 1.0/a) >>> ? ?def _stats(self, a): >>> ? ? ? ?return a/(a+1.0), a*(a+2.0)/(a+1.0)**2, \ >>> ? ? ? ? ? ? ? 2*(1.0-a)*sqrt((a+2.0)/(a*(a+3.0))), \ >>> ? ? ? ? ? ? ? 6*polyval([1,-1,-6,2],a)/(a*(a+3.0)*(a+4)) >>> ? ?def _entropy(self, a): >>> ? ? ? ?return 1 - 1.0/a - log(a) >>> powerlaw = powerlaw_gen(a=0.0, b=1.0, name="powerlaw", >>> ? ? ? ? ? ? ? ? ? ? ? ?longname="A power-function", >>> ? ? ? ? ? ? ? ? ? ? ? ?shapes="a", extradoc=""" >>> >>> Power-function distribution >>> >>> powerlaw.pdf(x,a) = a*x**(a-1) >>> for 0 <= x <= 1, a > 0. >>> """ >>> ? ? ? ? ? ? ? ? ? ? ? ?) >>> >> >> >> it looks like it's the same distribution, even though it doesn't use >> the random numbers from the numpy function >> >> high p-values with Kolmogorov-Smirnov, see below >> >> I assume it is a truncated version of *a* powerlaw distribution, so >> that a can be large, which would be impossible in the open domain >> case. But a quick search, I only found powerlaw applications that >> refer to the tail behavior. >> >> Josef >> >>>>> rvs = np.random.power(5, 100000) >>>>> stats.kstest(rvs,'powerlaw',(5,)) >> (0.0021079715221341555, 0.76587118275752697) >>>>> rvs = np.random.power(5, 1000000) >>>>> stats.kstest(rvs,'powerlaw',(5,)) >> (0.00063983013407076239, 0.80757958281509501) >>>>> rvs = np.random.power(0.5, 1000000) >>>>> stats.kstest(rvs,'powerlaw',(0.5,)) >> (0.00081823148457027539, 0.51478478398950211) >> > > I found a short reference in Johnson, Kotz, Balakrishnan vol. 1 where > it is refered to as the "power-function" distribution. > roughly: if X is pareto (which kind) distributed, then Y=X**(-1) is > distributed according to the power-function distribution. 
JKB have an > extra parameter in there and is a bit more general then the scipy > version, or maybe it is just the scale parameter included in the > density function. > > It is also in NIST data plot, but I didn't find the html reference > page, but only the pdf > > http://docs.google.com/gview?a=v&q=cache%3AEgQ6bRkeJl8J%3Awww.itl.nist.gov%2Fdiv898%2Fsoftware%2Fdataplot%2Frefman2%2Fauxillar%2Fpowpdf.pdf+power-function+distribution&hl=en&gl=ca&pli=1 > > the pdf-files for powpdf and powcdf ?are here > http://www.itl.nist.gov/div898/software/dataplot/refman2/auxillar/homepage.htm > > > I can look some more a bit later tonight. > > Josef > for the relationship to pareto, below are some kstests and graphs a reminder that numpy.random.pareto uses a non-standard 0 bound, instead of 1 ks tests don't show good numbers every once in a while, since they are random I checked the definitions in JKB (page 607) and my previous interpretation was correct. if X has a pareto distribution with lower bound at 1 and shape parameter a>0, then 1/X has a density function p(y) = a*y**(a-1), (0 References: <20090807124525.65dd3fcf@ajackson.org> <48C01AE7354EC240A26F19CEB995E943033AF254@CHMAILMBX01.novachem.com> <20090807154906.223fe780@ajackson.org> <48C01AE7354EC240A26F19CEB995E943033AF256@CHMAILMBX01.novachem.com> <1cd32cbb0908071442va8eb05r785b9bde6819ab89@mail.gmail.com> <1cd32cbb0908071513l29fa7fbgd35cb7ba428642da@mail.gmail.com> <1cd32cbb0908071557l605d98aboc3dbf8ec47241077@mail.gmail.com> <1cd32cbb0908071754w5406ea83v89d78ecce1a81d7f@mail.gmail.com> Message-ID: <1cd32cbb0908071810p40ba9fa1hd08c935372bf65c3@mail.gmail.com> On Fri, Aug 7, 2009 at 8:54 PM, wrote: > On Fri, Aug 7, 2009 at 6:57 PM, wrote: >> On Fri, Aug 7, 2009 at 6:13 PM, wrote: >>> On Fri, Aug 7, 2009 at 5:42 PM, wrote: >>>> On Fri, Aug 7, 2009 at 5:25 PM, Andrew Hawryluk wrote: >>>>> Hmm ... good point. >>>>> It appears to give a probability distribution proportional to x**(a-1), >>>>> but I see no good reason why the domain should be limited to [0,1]. >>>>> >>>>> def test(a): >>>>> ? ?nums = >>>>> plt.hist(np.random.power(a,100000),bins=100,ec='none',fc='#dddddd') >>>>> ? ?x = np.linspace(0,1,200) >>>>> ? ?plt.plot(x,nums[0][-1]*x**(a-1)) >>>>> >>>>> Andrew >>>>> >>>>> >>>>> >>>>>> -----Original Message----- >>>>>> From: numpy-discussion-bounces at scipy.org [mailto:numpy-discussion- >>>>>> bounces at scipy.org] On Behalf Of alan at ajackson.org >>>>>> Sent: 7 Aug 2009 2:49 PM >>>>>> To: Discussion of Numerical Python >>>>>> Subject: Re: [Numpy-discussion] Power distribution >>>>>> >>>>>> I don't think that is it, since the one in numpy has a range >>>>> restricted >>>>>> to the interval 0-1. >>>>>> >>>>>> Try out hist(np.random.power(5, 1000000), bins=100) >>>>>> >>>>>> >You might get better results for 'power-law distribution' >>>>>> >http://en.wikipedia.org/wiki/Power_law >>>>>> > >>>>>> >Andrew >>>>>> > >>>>>> >> -----Original Message----- >>>>>> >> From: numpy-discussion-bounces at scipy.org [mailto:numpy-discussion- >>>>>> >> bounces at scipy.org] On Behalf Of alan at ajackson.org >>>>>> >> Sent: 7 Aug 2009 11:45 AM >>>>>> >> To: Discussion of Numerical Python >>>>>> >> Subject: [Numpy-discussion] Power distribution >>>>>> >> >>>>>> >> Documenting my way through the statistics modules in numpy, I ran >>>>>> >> into the Power Distribution. >>>>>> >> >>>>>> >> Anyone know what that is? 
I Googled for it, and found a lot of >>>>> stuff >>>>>> >on >>>>>> >> electricity, but no reference for a statistical distribution of >>>>> that >>>>>> >> name. Does it have a common alias? >>>>>> >> >>>>>> >> -- >>>> >>>> >>>> same is in Travis' notes on the distribution and scipy.stats.distributions >>>> domain in [0,1], but I don't know anything about it either >>>> >>>> ## Power-function distribution >>>> ## ? Special case of beta dist. with d =1.0 >>>> >>>> class powerlaw_gen(rv_continuous): >>>> ? ?def _pdf(self, x, a): >>>> ? ? ? ?return a*x**(a-1.0) >>>> ? ?def _cdf(self, x, a): >>>> ? ? ? ?return x**(a*1.0) >>>> ? ?def _ppf(self, q, a): >>>> ? ? ? ?return pow(q, 1.0/a) >>>> ? ?def _stats(self, a): >>>> ? ? ? ?return a/(a+1.0), a*(a+2.0)/(a+1.0)**2, \ >>>> ? ? ? ? ? ? ? 2*(1.0-a)*sqrt((a+2.0)/(a*(a+3.0))), \ >>>> ? ? ? ? ? ? ? 6*polyval([1,-1,-6,2],a)/(a*(a+3.0)*(a+4)) >>>> ? ?def _entropy(self, a): >>>> ? ? ? ?return 1 - 1.0/a - log(a) >>>> powerlaw = powerlaw_gen(a=0.0, b=1.0, name="powerlaw", >>>> ? ? ? ? ? ? ? ? ? ? ? ?longname="A power-function", >>>> ? ? ? ? ? ? ? ? ? ? ? ?shapes="a", extradoc=""" >>>> >>>> Power-function distribution >>>> >>>> powerlaw.pdf(x,a) = a*x**(a-1) >>>> for 0 <= x <= 1, a > 0. >>>> """ >>>> ? ? ? ? ? ? ? ? ? ? ? ?) >>>> >>> >>> >>> it looks like it's the same distribution, even though it doesn't use >>> the random numbers from the numpy function >>> >>> high p-values with Kolmogorov-Smirnov, see below >>> >>> I assume it is a truncated version of *a* powerlaw distribution, so >>> that a can be large, which would be impossible in the open domain >>> case. But a quick search, I only found powerlaw applications that >>> refer to the tail behavior. >>> >>> Josef >>> >>>>>> rvs = np.random.power(5, 100000) >>>>>> stats.kstest(rvs,'powerlaw',(5,)) >>> (0.0021079715221341555, 0.76587118275752697) >>>>>> rvs = np.random.power(5, 1000000) >>>>>> stats.kstest(rvs,'powerlaw',(5,)) >>> (0.00063983013407076239, 0.80757958281509501) >>>>>> rvs = np.random.power(0.5, 1000000) >>>>>> stats.kstest(rvs,'powerlaw',(0.5,)) >>> (0.00081823148457027539, 0.51478478398950211) >>> >> >> I found a short reference in Johnson, Kotz, Balakrishnan vol. 1 where >> it is refered to as the "power-function" distribution. >> roughly: if X is pareto (which kind) distributed, then Y=X**(-1) is >> distributed according to the power-function distribution. JKB have an >> extra parameter in there and is a bit more general then the scipy >> version, or maybe it is just the scale parameter included in the >> density function. >> >> It is also in NIST data plot, but I didn't find the html reference >> page, but only the pdf >> >> http://docs.google.com/gview?a=v&q=cache%3AEgQ6bRkeJl8J%3Awww.itl.nist.gov%2Fdiv898%2Fsoftware%2Fdataplot%2Frefman2%2Fauxillar%2Fpowpdf.pdf+power-function+distribution&hl=en&gl=ca&pli=1 >> >> the pdf-files for powpdf and powcdf ?are here >> http://www.itl.nist.gov/div898/software/dataplot/refman2/auxillar/homepage.htm >> >> >> I can look some more a bit later tonight. >> >> Josef >> > > > for the relationship to pareto, below are some kstests and graphs > a reminder that numpy.random.pareto uses a non-standard 0 bound, instead of 1 > ks tests don't show good numbers every once in a while, since they are random > > I checked the definitions in JKB (page 607) and my previous > interpretation was correct. 
> if X has a pareto distribution with lower bound at 1 and shape
> parameter a>0, then 1/X has a density function
> p(y) = a*y**(a-1),  (0 < y <= 1)
> weak inequality in JKB instead of strict as in scipy.stats.powerlaw docstring
> (the actual scipy.stats.powerlaw docstring has a typo, a**x**(a-1),
> which I will correct)
>
> Josef
>
>
> import numpy as np
> from scipy import stats
> import matplotlib.pyplot as plt
>
>
> rvs = np.random.power(5, 1000000)
> rvsp = np.random.pareto(5, 1000000)
> rvsps = stats.pareto.rvs(5, size=100)
>
> print "stats.kstest(1./rvsps,'powerlaw',(5,))"
> print stats.kstest(1./rvsps,'powerlaw',(5,))
>
> print "stats.kstest(1./(1+rvsp),'powerlaw',(5,))"
> print stats.kstest(1./(1+rvsp),'powerlaw',(5,))
>
> print "stats.kstest(rvs,'powerlaw',(5,))"
> print stats.kstest(rvs,'powerlaw',(5,))
>
> print "stats.ks_2samp(rvs,1./(rvsp+1))"
> print stats.ks_2samp(rvs,1./(rvsp+1))
> print "stats.ks_2samp(rvs,1./rvsps)"
> print stats.ks_2samp(rvs,1./rvsps)
> print "stats.ks_2samp(1+rvsp, rvsps)"
> print stats.ks_2samp(1+rvsp, rvsps)

Improvements to graphs, compare with theoretical pdf

Josef

xx = np.linspace(0,1,100)
powpdf = stats.powerlaw.pdf(xx,5)

plt.figure()
plt.hist(rvs, bins=50, normed=True)
plt.plot(xx,powpdf,'r-')
plt.title('np.random.power(5)')
plt.figure()
plt.hist(1./(1.+rvsp), bins=50, normed=True)
plt.plot(xx,powpdf,'r-')
plt.title('inverse of 1 + np.random.pareto(5)')
plt.figure()
plt.hist(1./(1.+rvsp), bins=50, normed=True)
plt.plot(xx,powpdf,'r-')
plt.title('inverse of stats.pareto(5)')

>
> plt.figure()
> plt.hist(rvs, bins=50)
> plt.title('np.random.power(5)')
> plt.figure()
> plt.hist(1./(1.+rvsp), bins=50)
> plt.title('inverse of 1 + np.random.pareto(5)')
> plt.figure()
> plt.hist(1./(1.+rvsp), bins=50)
> plt.title('inverse of stats.pareto(5)')
> #plt.show()
>

From alan at ajackson.org Fri Aug 7 22:17:38 2009
From: alan at ajackson.org (alan at ajackson.org)
Date: Fri, 7 Aug 2009 21:17:38 -0500
Subject: [Numpy-discussion] Power distribution
In-Reply-To: <1cd32cbb0908071810p40ba9fa1hd08c935372bf65c3@mail.gmail.com>
References: <20090807124525.65dd3fcf@ajackson.org>
	<48C01AE7354EC240A26F19CEB995E943033AF254@CHMAILMBX01.novachem.com>
	<20090807154906.223fe780@ajackson.org>
	<48C01AE7354EC240A26F19CEB995E943033AF256@CHMAILMBX01.novachem.com>
	<1cd32cbb0908071442va8eb05r785b9bde6819ab89@mail.gmail.com>
	<1cd32cbb0908071513l29fa7fbgd35cb7ba428642da@mail.gmail.com>
	<1cd32cbb0908071557l605d98aboc3dbf8ec47241077@mail.gmail.com>
	<1cd32cbb0908071754w5406ea83v89d78ecce1a81d7f@mail.gmail.com>
	<1cd32cbb0908071810p40ba9fa1hd08c935372bf65c3@mail.gmail.com>
Message-ID: <20090807211738.6a7ce10c@ajackson.org>

Thanks! That helps a lot.

>On Fri, Aug 7, 2009 at 8:54 PM, wrote:
>> On Fri, Aug 7, 2009 at 6:57 PM, wrote:
>>> On Fri, Aug 7, 2009 at 6:13 PM, wrote:
>>>> On Fri, Aug 7, 2009 at 5:42 PM, wrote:
>>>>> On Fri, Aug 7, 2009 at 5:25 PM, Andrew Hawryluk wrote:
>>>>>> Hmm ... good point.
>>>>>> It appears to give a probability distribution proportional to x**(a-1),
>>>>>> but I see no good reason why the domain should be limited to [0,1].
>>>>>>
>>>>>> def test(a):
>>>>>>     nums =
>>>>>> plt.hist(np.random.power(a,100000),bins=100,ec='none',fc='#dddddd')
>>>>>>     x = np.linspace(0,1,200)
>>>>>>
?plt.plot(x,nums[0][-1]*x**(a-1)) >>>>>> >>>>>> Andrew >>>>>> >>>>>> >>>>>> >>>>>>> -----Original Message----- >>>>>>> From: numpy-discussion-bounces at scipy.org [mailto:numpy-discussion- >>>>>>> bounces at scipy.org] On Behalf Of alan at ajackson.org >>>>>>> Sent: 7 Aug 2009 2:49 PM >>>>>>> To: Discussion of Numerical Python >>>>>>> Subject: Re: [Numpy-discussion] Power distribution >>>>>>> >>>>>>> I don't think that is it, since the one in numpy has a range >>>>>> restricted >>>>>>> to the interval 0-1. >>>>>>> >>>>>>> Try out hist(np.random.power(5, 1000000), bins=100) >>>>>>> >>>>>>> >You might get better results for 'power-law distribution' >>>>>>> >http://en.wikipedia.org/wiki/Power_law >>>>>>> > >>>>>>> >Andrew >>>>>>> > >>>>>>> >> -----Original Message----- >>>>>>> >> From: numpy-discussion-bounces at scipy.org [mailto:numpy-discussion- >>>>>>> >> bounces at scipy.org] On Behalf Of alan at ajackson.org >>>>>>> >> Sent: 7 Aug 2009 11:45 AM >>>>>>> >> To: Discussion of Numerical Python >>>>>>> >> Subject: [Numpy-discussion] Power distribution >>>>>>> >> >>>>>>> >> Documenting my way through the statistics modules in numpy, I ran >>>>>>> >> into the Power Distribution. >>>>>>> >> >>>>>>> >> Anyone know what that is? I Googled for it, and found a lot of >>>>>> stuff >>>>>>> >on >>>>>>> >> electricity, but no reference for a statistical distribution of >>>>>> that >>>>>>> >> name. Does it have a common alias? >>>>>>> >> >>>>>>> >> -- >>>>> >>>>> >>>>> same is in Travis' notes on the distribution and scipy.stats.distributions >>>>> domain in [0,1], but I don't know anything about it either >>>>> >>>>> ## Power-function distribution >>>>> ## ? Special case of beta dist. with d =1.0 >>>>> >>>>> class powerlaw_gen(rv_continuous): >>>>> ? ?def _pdf(self, x, a): >>>>> ? ? ? ?return a*x**(a-1.0) >>>>> ? ?def _cdf(self, x, a): >>>>> ? ? ? ?return x**(a*1.0) >>>>> ? ?def _ppf(self, q, a): >>>>> ? ? ? ?return pow(q, 1.0/a) >>>>> ? ?def _stats(self, a): >>>>> ? ? ? ?return a/(a+1.0), a*(a+2.0)/(a+1.0)**2, \ >>>>> ? ? ? ? ? ? ? 2*(1.0-a)*sqrt((a+2.0)/(a*(a+3.0))), \ >>>>> ? ? ? ? ? ? ? 6*polyval([1,-1,-6,2],a)/(a*(a+3.0)*(a+4)) >>>>> ? ?def _entropy(self, a): >>>>> ? ? ? ?return 1 - 1.0/a - log(a) >>>>> powerlaw = powerlaw_gen(a=0.0, b=1.0, name="powerlaw", >>>>> ? ? ? ? ? ? ? ? ? ? ? ?longname="A power-function", >>>>> ? ? ? ? ? ? ? ? ? ? ? ?shapes="a", extradoc=""" >>>>> >>>>> Power-function distribution >>>>> >>>>> powerlaw.pdf(x,a) = a*x**(a-1) >>>>> for 0 <= x <= 1, a > 0. >>>>> """ >>>>> ? ? ? ? ? ? ? ? ? ? ? ?) >>>>> >>>> >>>> >>>> it looks like it's the same distribution, even though it doesn't use >>>> the random numbers from the numpy function >>>> >>>> high p-values with Kolmogorov-Smirnov, see below >>>> >>>> I assume it is a truncated version of *a* powerlaw distribution, so >>>> that a can be large, which would be impossible in the open domain >>>> case. But a quick search, I only found powerlaw applications that >>>> refer to the tail behavior. >>>> >>>> Josef >>>> >>>>>>> rvs = np.random.power(5, 100000) >>>>>>> stats.kstest(rvs,'powerlaw',(5,)) >>>> (0.0021079715221341555, 0.76587118275752697) >>>>>>> rvs = np.random.power(5, 1000000) >>>>>>> stats.kstest(rvs,'powerlaw',(5,)) >>>> (0.00063983013407076239, 0.80757958281509501) >>>>>>> rvs = np.random.power(0.5, 1000000) >>>>>>> stats.kstest(rvs,'powerlaw',(0.5,)) >>>> (0.00081823148457027539, 0.51478478398950211) >>>> >>> >>> I found a short reference in Johnson, Kotz, Balakrishnan vol. 
1 where >>> it is refered to as the "power-function" distribution. >>> roughly: if X is pareto (which kind) distributed, then Y=X**(-1) is >>> distributed according to the power-function distribution. JKB have an >>> extra parameter in there and is a bit more general then the scipy >>> version, or maybe it is just the scale parameter included in the >>> density function. >>> >>> It is also in NIST data plot, but I didn't find the html reference >>> page, but only the pdf >>> >>> http://docs.google.com/gview?a=v&q=cache%3AEgQ6bRkeJl8J%3Awww.itl.nist.gov%2Fdiv898%2Fsoftware%2Fdataplot%2Frefman2%2Fauxillar%2Fpowpdf.pdf+power-function+distribution&hl=en&gl=ca&pli=1 >>> >>> the pdf-files for powpdf and powcdf ?are here >>> http://www.itl.nist.gov/div898/software/dataplot/refman2/auxillar/homepage.htm >>> >>> >>> I can look some more a bit later tonight. >>> >>> Josef >>> >> >> >> for the relationship to pareto, below are some kstests and graphs >> a reminder that numpy.random.pareto uses a non-standard 0 bound, instead of 1 >> ks tests don't show good numbers every once in a while, since they are random >> >> I checked the definitions in JKB (page 607) and my previous >> interpretation was correct. >> if X has a pareto distribution with lower bound at 1 and shape >> parameter a>0, then 1/X has a density function >> p(y) = a*y**(a-1), ?(0> weak inequality in JKB instead of strict as in scipy.stats.powerlaw docstring >> (the actual scipy.stats.powerlaw docstring ?has a typo, a**x**(a-1), >> which I will correct) >> >> Josef >> >> >> import numpy as np >> from scipy import stats >> import matplotlib.pyplot as plt >> >> >> rvs = np.random.power(5, 1000000) >> rvsp = np.random.pareto(5, 1000000) >> rvsps = stats.pareto.rvs(5, size=100) >> >> print "stats.kstest(1./rvsps,'powerlaw',(5,))" >> print stats.kstest(1./rvsps,'powerlaw',(5,)) >> >> print "stats.kstest(1./(1+rvsp),'powerlaw',(5,))" >> print stats.kstest(1./(1+rvsp),'powerlaw',(5,)) >> >> print "stats.kstest(rvs,'powerlaw',(5,))" >> print stats.kstest(rvs,'powerlaw',(5,)) >> >> print "stats.ks_2samp(rvs,1./(rvsp+1))" >> print stats.ks_2samp(rvs,1./(rvsp+1)) >> print "stats.ks_2samp(rvs,1./rvsps)" >> print stats.ks_2samp(rvs,1./rvsps) >> print "stats.ks_2samp(1+rvsp, rvsps)" >> print stats.ks_2samp(1+rvsp, rvsps) > > >Improvements to graphs, compare with theoretical pdf > >Josef > >xx = np.linspace(0,1,100) >powpdf = stats.powerlaw.pdf(xx,5) > >plt.figure() >plt.hist(rvs, bins=50, normed=True) >plt.plot(xx,powpdf,'r-') >plt.title('np.random.power(5)') >plt.figure() >plt.hist(1./(1.+rvsp), bins=50, normed=True) >plt.plot(xx,powpdf,'r-') >plt.title('inverse of 1 + np.random.pareto(5)') >plt.figure() >plt.hist(1./(1.+rvsp), bins=50, normed=True) >plt.plot(xx,powpdf,'r-') >plt.title('inverse of stats.pareto(5)') > > > >> >> plt.figure() >> plt.hist(rvs, bins=50) >> plt.title('np.random.power(5)') >> plt.figure() >> plt.hist(1./(1.+rvsp), bins=50) >> plt.title('inverse of 1 + np.random.pareto(5)') >> plt.figure() >> plt.hist(1./(1.+rvsp), bins=50) >> plt.title('inverse of stats.pareto(5)') >> #plt.show() >> -- ----------------------------------------------------------------------- | Alan K. Jackson | To see a World in a Grain of Sand | | alan at ajackson.org | And a Heaven in a Wild Flower, | | www.ajackson.org | Hold Infinity in the palm of your hand | | Houston, Texas | And Eternity in an hour. 
- Blake | ----------------------------------------------------------------------- From josef.pktd at gmail.com Fri Aug 7 22:38:04 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 7 Aug 2009 22:38:04 -0400 Subject: [Numpy-discussion] numpy.random.pareto, m equal zero Message-ID: <1cd32cbb0908071938s5ec7e663uddf1a98d2c8d8f40@mail.gmail.com> Does it make any (statistical) sense to have numpy.random.pareto produce random numbers that start at zero? Can we change it to start at 1 which is the usual default? Notation from http://docs.scipy.org/numpy/docs/numpy.random.mtrand.RandomState.pareto/ The probability density for the Pareto distribution is .. math:: p(x) = \\frac{am^a}{x^{a+1}} where :math:`a` is the shape and :math:`m` the location constraints from Johnson, Kotz, Balakrishnan vol1 page 574 m>0, a>0, x>=m 1) as m goes to zero, the pdf goes to zero for every point, (mean, variance go to zero, essentially masspoint at zero) 2) quote from http://www.itl.nist.gov/div898/software/dataplot/refman2/auxillar/parpdf.htm (their `a` is our `m`) " Note that although the a (=m JP) parameter is typically called a location parameter (and it is in the sense that it defines the lower bound), it is not a location parameter in the technical sense that the following relation does not hold: f(x;gamma,a) = f((x-a);gamma,0) For this reason, Dataplot treats a (=m JP) as a shape parameter. In Dataplot, the a (=m JP) shape parameter is optional with a default value of 1. " my conclusion: --------------------- What numpy.random.pareto actually produces, are random numbers from a pareto distribution with lower bound m=1, but location parameter loc=-1, that shifts the distribution to the left. To actually get useful random numbers (that are correct in the usual usage http://en.wikipedia.org/wiki/Pareto_distribution), we need to add 1 to them. stats.distributions doesn't use mtrand.pareto (why?), so I never needed to check this before. rvs_pareto = 1 + numpy.random.pareto(a, size) for correction in some calculation, see the thread on the power distribution. Do we have to live with loc=-1, or can we change it, or am I misinterpreting something (which wouldn't be the first time either)? Josef From charlesr.harris at gmail.com Fri Aug 7 23:23:40 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 7 Aug 2009 21:23:40 -0600 Subject: [Numpy-discussion] Merging datetime, Yes or No Message-ID: I ask again, Datetime is getting really stale and hasn't been touched recently. Do the datetime folks want it merged or not, because it's getting to be a bit of work. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From josef.pktd at gmail.com Fri Aug 7 23:55:45 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 7 Aug 2009 23:55:45 -0400 Subject: [Numpy-discussion] Power distribution In-Reply-To: <20090807211738.6a7ce10c@ajackson.org> References: <20090807124525.65dd3fcf@ajackson.org> <48C01AE7354EC240A26F19CEB995E943033AF254@CHMAILMBX01.novachem.com> <20090807154906.223fe780@ajackson.org> <48C01AE7354EC240A26F19CEB995E943033AF256@CHMAILMBX01.novachem.com> <1cd32cbb0908071442va8eb05r785b9bde6819ab89@mail.gmail.com> <1cd32cbb0908071513l29fa7fbgd35cb7ba428642da@mail.gmail.com> <1cd32cbb0908071557l605d98aboc3dbf8ec47241077@mail.gmail.com> <1cd32cbb0908071754w5406ea83v89d78ecce1a81d7f@mail.gmail.com> <1cd32cbb0908071810p40ba9fa1hd08c935372bf65c3@mail.gmail.com> <20090807211738.6a7ce10c@ajackson.org> Message-ID: <1cd32cbb0908072055t5f45be8ax7aa272eaed7defc5@mail.gmail.com> On Fri, Aug 7, 2009 at 10:17 PM, wrote: > Thanks! That helps a lot. Thanks for improving the docs. > >>On Fri, Aug 7, 2009 at 8:54 PM, wrote: >>> On Fri, Aug 7, 2009 at 6:57 PM, wrote: >>>> On Fri, Aug 7, 2009 at 6:13 PM, wrote: >>>>> On Fri, Aug 7, 2009 at 5:42 PM, wrote: >>>>>> On Fri, Aug 7, 2009 at 5:25 PM, Andrew Hawryluk wrote: >>>>>>> Hmm ... good point. >>>>>>> It appears to give a probability distribution proportional to x**(a-1), >>>>>>> but I see no good reason why the domain should be limited to [0,1]. >>>>>>> >>>>>>> def test(a): >>>>>>> ? ?nums = >>>>>>> plt.hist(np.random.power(a,100000),bins=100,ec='none',fc='#dddddd') >>>>>>> ? ?x = np.linspace(0,1,200) >>>>>>> ? ?plt.plot(x,nums[0][-1]*x**(a-1)) >>>>>>> >>>>>>> Andrew >>>>>>> >>>>>>> >>>>>>> >>>>>>>> -----Original Message----- >>>>>>>> From: numpy-discussion-bounces at scipy.org [mailto:numpy-discussion- >>>>>>>> bounces at scipy.org] On Behalf Of alan at ajackson.org >>>>>>>> Sent: 7 Aug 2009 2:49 PM >>>>>>>> To: Discussion of Numerical Python >>>>>>>> Subject: Re: [Numpy-discussion] Power distribution >>>>>>>> >>>>>>>> I don't think that is it, since the one in numpy has a range >>>>>>> restricted >>>>>>>> to the interval 0-1. >>>>>>>> >>>>>>>> Try out hist(np.random.power(5, 1000000), bins=100) >>>>>>>> >>>>>>>> >You might get better results for 'power-law distribution' >>>>>>>> >http://en.wikipedia.org/wiki/Power_law >>>>>>>> > >>>>>>>> >Andrew >>>>>>>> > >>>>>>>> >> -----Original Message----- >>>>>>>> >> From: numpy-discussion-bounces at scipy.org [mailto:numpy-discussion- >>>>>>>> >> bounces at scipy.org] On Behalf Of alan at ajackson.org >>>>>>>> >> Sent: 7 Aug 2009 11:45 AM >>>>>>>> >> To: Discussion of Numerical Python >>>>>>>> >> Subject: [Numpy-discussion] Power distribution >>>>>>>> >> >>>>>>>> >> Documenting my way through the statistics modules in numpy, I ran >>>>>>>> >> into the Power Distribution. >>>>>>>> >> >>>>>>>> >> Anyone know what that is? I Googled for it, and found a lot of >>>>>>> stuff >>>>>>>> >on >>>>>>>> >> electricity, but no reference for a statistical distribution of >>>>>>> that >>>>>>>> >> name. Does it have a common alias? >>>>>>>> >> >>>>>>>> >> -- >>>>>> >>>>>> >>>>>> same is in Travis' notes on the distribution and scipy.stats.distributions >>>>>> domain in [0,1], but I don't know anything about it either >>>>>> >>>>>> ## Power-function distribution >>>>>> ## ? Special case of beta dist. with d =1.0 >>>>>> >>>>>> class powerlaw_gen(rv_continuous): >>>>>> ? ?def _pdf(self, x, a): >>>>>> ? ? ? ?return a*x**(a-1.0) >>>>>> ? ?def _cdf(self, x, a): >>>>>> ? ? ? ?return x**(a*1.0) >>>>>> ? 
?def _ppf(self, q, a): >>>>>> ? ? ? ?return pow(q, 1.0/a) >>>>>> ? ?def _stats(self, a): >>>>>> ? ? ? ?return a/(a+1.0), a*(a+2.0)/(a+1.0)**2, \ >>>>>> ? ? ? ? ? ? ? 2*(1.0-a)*sqrt((a+2.0)/(a*(a+3.0))), \ >>>>>> ? ? ? ? ? ? ? 6*polyval([1,-1,-6,2],a)/(a*(a+3.0)*(a+4)) >>>>>> ? ?def _entropy(self, a): >>>>>> ? ? ? ?return 1 - 1.0/a - log(a) >>>>>> powerlaw = powerlaw_gen(a=0.0, b=1.0, name="powerlaw", >>>>>> ? ? ? ? ? ? ? ? ? ? ? ?longname="A power-function", >>>>>> ? ? ? ? ? ? ? ? ? ? ? ?shapes="a", extradoc=""" >>>>>> >>>>>> Power-function distribution >>>>>> >>>>>> powerlaw.pdf(x,a) = a*x**(a-1) >>>>>> for 0 <= x <= 1, a > 0. >>>>>> """ >>>>>> ? ? ? ? ? ? ? ? ? ? ? ?) >>>>>> >>>>> >>>>> >>>>> it looks like it's the same distribution, even though it doesn't use >>>>> the random numbers from the numpy function >>>>> >>>>> high p-values with Kolmogorov-Smirnov, see below >>>>> >>>>> I assume it is a truncated version of *a* powerlaw distribution, so >>>>> that a can be large, which would be impossible in the open domain >>>>> case. But a quick search, I only found powerlaw applications that >>>>> refer to the tail behavior. >>>>> >>>>> Josef >>>>> >>>>>>>> rvs = np.random.power(5, 100000) >>>>>>>> stats.kstest(rvs,'powerlaw',(5,)) >>>>> (0.0021079715221341555, 0.76587118275752697) >>>>>>>> rvs = np.random.power(5, 1000000) >>>>>>>> stats.kstest(rvs,'powerlaw',(5,)) >>>>> (0.00063983013407076239, 0.80757958281509501) >>>>>>>> rvs = np.random.power(0.5, 1000000) >>>>>>>> stats.kstest(rvs,'powerlaw',(0.5,)) >>>>> (0.00081823148457027539, 0.51478478398950211) >>>>> >>>> >>>> I found a short reference in Johnson, Kotz, Balakrishnan vol. 1 where >>>> it is refered to as the "power-function" distribution. >>>> roughly: if X is pareto (which kind) distributed, then Y=X**(-1) is >>>> distributed according to the power-function distribution. JKB have an >>>> extra parameter in there and is a bit more general then the scipy >>>> version, or maybe it is just the scale parameter included in the >>>> density function. >>>> >>>> It is also in NIST data plot, but I didn't find the html reference >>>> page, but only the pdf >>>> >>>> http://docs.google.com/gview?a=v&q=cache%3AEgQ6bRkeJl8J%3Awww.itl.nist.gov%2Fdiv898%2Fsoftware%2Fdataplot%2Frefman2%2Fauxillar%2Fpowpdf.pdf+power-function+distribution&hl=en&gl=ca&pli=1 >>>> >>>> the pdf-files for powpdf and powcdf ?are here >>>> http://www.itl.nist.gov/div898/software/dataplot/refman2/auxillar/homepage.htm >>>> >>>> >>>> I can look some more a bit later tonight. >>>> >>>> Josef >>>> >>> >>> >>> for the relationship to pareto, below are some kstests and graphs >>> a reminder that numpy.random.pareto uses a non-standard 0 bound, instead of 1 >>> ks tests don't show good numbers every once in a while, since they are random >>> >>> I checked the definitions in JKB (page 607) and my previous >>> interpretation was correct. 
>>> if X has a pareto distribution with lower bound at 1 and shape
>>> parameter a>0, then 1/X has a density function
>>> p(y) = a*y**(a-1),  (0 < y <= 1)
>>> weak inequality in JKB instead of strict as in scipy.stats.powerlaw docstring
>>> (the actual scipy.stats.powerlaw docstring has a typo, a**x**(a-1),
>>> which I will correct)
>>>
>>> Josef
>>>
>>>
>>> import numpy as np
>>> from scipy import stats
>>> import matplotlib.pyplot as plt
>>>
>>>
>>> rvs = np.random.power(5, 1000000)
>>> rvsp = np.random.pareto(5, 1000000)
>>> rvsps = stats.pareto.rvs(5, size=100)
>>>
>>> print "stats.kstest(1./rvsps,'powerlaw',(5,))"
>>> print stats.kstest(1./rvsps,'powerlaw',(5,))
>>>
>>> print "stats.kstest(1./(1+rvsp),'powerlaw',(5,))"
>>> print stats.kstest(1./(1+rvsp),'powerlaw',(5,))
>>>
>>> print "stats.kstest(rvs,'powerlaw',(5,))"
>>> print stats.kstest(rvs,'powerlaw',(5,))
>>>
>>> print "stats.ks_2samp(rvs,1./(rvsp+1))"
>>> print stats.ks_2samp(rvs,1./(rvsp+1))
>>> print "stats.ks_2samp(rvs,1./rvsps)"
>>> print stats.ks_2samp(rvs,1./rvsps)
>>> print "stats.ks_2samp(1+rvsp, rvsps)"
>>> print stats.ks_2samp(1+rvsp, rvsps)
>>
>>
>>Improvements to graphs, compare with theoretical pdf
>>
>>Josef
>>
>>xx = np.linspace(0,1,100)
>>powpdf = stats.powerlaw.pdf(xx,5)
>>
>>plt.figure()
>>plt.hist(rvs, bins=50, normed=True)
>>plt.plot(xx,powpdf,'r-')
>>plt.title('np.random.power(5)')
>>plt.figure()
>>plt.hist(1./(1.+rvsp), bins=50, normed=True)
>>plt.plot(xx,powpdf,'r-')
>>plt.title('inverse of 1 + np.random.pareto(5)')
>>plt.figure()
>>plt.hist(1./(1.+rvsp), bins=50, normed=True)
>>plt.plot(xx,powpdf,'r-')
>>plt.title('inverse of stats.pareto(5)')

Just a small correction of a copy and paste error, to have the correct
example in the thread:

The last graph is a duplicate and should instead use the scipy.stats
random numbers, rvsps, without the +1 correction, i.e.

plt.figure()
plt.hist(1./rvsps, bins=50, normed=True)
plt.plot(xx,powpdf,'r-')
plt.title('inverse of stats.pareto(5)')

Josef

>
> --
> -----------------------------------------------------------------------
> | Alan K. Jackson            | To see a World in a Grain of Sand      |
> | alan at ajackson.org          | And a Heaven in a Wild Flower,         |
> | www.ajackson.org           | Hold Infinity in the palm of your hand |
> | Houston, Texas             | And Eternity in an hour. - Blake       |
> -----------------------------------------------------------------------
>

From pfeldman at verizon.net Sat Aug 8 00:53:16 2009
From: pfeldman at verizon.net (Dr. Phillip M. Feldman)
Date: Fri, 7 Aug 2009 21:53:16 -0700 (PDT)
Subject: [Numpy-discussion] How to preserve number of array dimensions when taking a slice?
Message-ID: <24875133.post@talk.nabble.com>


I'd like to be able to make a slice of a 3-dimensional array, doing something
like the following:

Y= X[A, B, C]

where A, B, and C are lists of indices. This works, but has an unexpected
side-effect. When A, B, or C is a length-1 list, Y has fewer dimensions than
X. Is there a way to do the slice such that the number of dimensions is
preserved, i.e., I'd like Y to be a 3-dimensional array, even if one or more
dimensions is unity. Is there a way to do this?
--
View this message in context: http://www.nabble.com/How-to-preserve-number-of-array-dimensions-when-taking-a-slice--tp24875133p24875133.html
Sent from the Numpy-discussion mailing list archive at Nabble.com.
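A minimal sketch of the behavior being asked about (the array X and the
length-1 index lists below are invented for illustration, and np.ix_ is
only one possible workaround; the reply that follows confirms that index
lists trigger fancy indexing rather than slicing):

import numpy as np

X = np.arange(24).reshape(2, 3, 4)

# Three index lists are "fancy" indexing: they broadcast against each
# other and yield a 1-d result, so length-1 lists collapse dimensions.
Y = X[[0], [1], [2]]
# Y.shape == (1,) and Y[0] == X[0, 1, 2]

# Length-1 basic slices keep every dimension:
Z = X[0:1, 1:2, 2:3]
# Z.shape == (1, 1, 1)

# np.ix_ builds an open mesh from the index lists, so the result keeps
# one axis per list:
W = X[np.ix_([0], [1], [2])]
# W.shape == (1, 1, 1)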
From dwf at cs.toronto.edu Sat Aug 8 04:46:08 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Sat, 8 Aug 2009 04:46:08 -0400 Subject: [Numpy-discussion] How to preserve number of array dimensions when taking a slice? In-Reply-To: <24875133.post@talk.nabble.com> References: <24875133.post@talk.nabble.com> Message-ID: On 8-Aug-09, at 12:53 AM, Dr. Phillip M. Feldman wrote: > > I'd like to be able to make a slice of a 3-dimensional array, doing > something > like the following: > > Y= X[A, B, C] > > where A, B, and C are lists of indices. This works, but has an > unexpected > side-effect. When A, B, or C is a length-1 list, Y has fewer > dimensions than > X. Is there a way to do the slice such that the number of dimensions > is > preserved, i.e., I'd like Y to be a 3-dimensional array, even if one > or more > dimensions is unity. Is there a way to do this? Err, X[A, B, C] with A, B and C lists should always return a 1D array, I think. Lists of indices count as 'fancy indexing', not slicing. If using slices, you can specify slices that are only 1 long as in X[5:6, :, :] and retain the dimensionality. From peterjeremy at optushome.com.au Sat Aug 8 05:38:22 2009 From: peterjeremy at optushome.com.au (Peter Jeremy) Date: Sat, 8 Aug 2009 19:38:22 +1000 Subject: [Numpy-discussion] ATLAS, NumPy and Threading Message-ID: <20090808093822.GA88083@server.vk2pj.dyndns.org> [Apologies if anyone sees this twice - the first copy appears to have disappeared into a black hole] Should ATLAS be built with or without threading support for use with NumPy? The NumPy documentation just says that ATLAS will be used if found but gives no indication of how ATLAS should be built. I have found that system_info.py explicitly selects threaded versions of libatlas and liblapack on FreeBSD but have been unable to find any rationale behind this in the available SVN logs. -- Peter Jeremy -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 196 bytes Desc: not available URL: From david at ar.media.kyoto-u.ac.jp Sat Aug 8 05:28:50 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Sat, 08 Aug 2009 18:28:50 +0900 Subject: [Numpy-discussion] ATLAS, NumPy and Threading In-Reply-To: <20090808093822.GA88083@server.vk2pj.dyndns.org> References: <20090808093822.GA88083@server.vk2pj.dyndns.org> Message-ID: <4A7D4552.8040002@ar.media.kyoto-u.ac.jp> Peter Jeremy wrote: > [Apologies if anyone sees this twice - the first copy appears to have > disappeared into a black hole] > > Should ATLAS be built with or without threading support for use with > NumPy? The NumPy documentation just says that ATLAS will be used if > found but gives no indication of how ATLAS should be built. > Both threaded and non threaded should be usable, at least on unix and mac os x. I don't know about the situation on windows, though. 
The FreeBSD notice in system_info may be obsolete, cheers, David From emmanuelle.gouillart at normalesup.org Sat Aug 8 06:27:36 2009 From: emmanuelle.gouillart at normalesup.org (Emmanuelle Gouillart) Date: Sat, 8 Aug 2009 12:27:36 +0200 Subject: [Numpy-discussion] Power distribution In-Reply-To: <1cd32cbb0908072055t5f45be8ax7aa272eaed7defc5@mail.gmail.com> References: <48C01AE7354EC240A26F19CEB995E943033AF254@CHMAILMBX01.novachem.com> <20090807154906.223fe780@ajackson.org> <48C01AE7354EC240A26F19CEB995E943033AF256@CHMAILMBX01.novachem.com> <1cd32cbb0908071442va8eb05r785b9bde6819ab89@mail.gmail.com> <1cd32cbb0908071513l29fa7fbgd35cb7ba428642da@mail.gmail.com> <1cd32cbb0908071557l605d98aboc3dbf8ec47241077@mail.gmail.com> <1cd32cbb0908071754w5406ea83v89d78ecce1a81d7f@mail.gmail.com> <1cd32cbb0908071810p40ba9fa1hd08c935372bf65c3@mail.gmail.com> <20090807211738.6a7ce10c@ajackson.org> <1cd32cbb0908072055t5f45be8ax7aa272eaed7defc5@mail.gmail.com> Message-ID: <20090808102736.GB21264@phare.normalesup.org> On Fri, Aug 07, 2009 at 11:55:45PM -0400, josef.pktd at gmail.com wrote: > On Fri, Aug 7, 2009 at 10:17 PM, wrote: > > Thanks! That helps a lot. > Thanks for improving the docs. Many thanks for taking the time of finding out what this distribution really is, and improving the docs. I was also puzzled by this distribution, so I started making some tests and writing a docstring stub, but it is much better now! Emmanuelle From lukshuntim at gmail.com Sat Aug 8 08:38:06 2009 From: lukshuntim at gmail.com (lukshuntim at gmail.com) Date: Sat, 08 Aug 2009 20:38:06 +0800 Subject: [Numpy-discussion] Test failures r7300 Message-ID: <4A7D71AE.2050704@gmail.com> Hi, I got 16 test failures after building r7300 from svn on debian/sid/i386. Seems all related to complex linear algebra modules. Here's the error messages: Running unit tests for numpy NumPy version 1.4.0.dev7300 NumPy is installed in /var/opt/py/lib/python2.5/site-packages/numpy Python version 2.5.4 (r254:67916, Feb 17 2009, 20:16:45) [GCC 4.3.3] nose version 0.11.1 ... 
FAIL: test_cdouble (test_linalg.TestCond2) ---------------------------------------------------------------------- Traceback (most recent call last): File "/var/opt/py/lib/python2.5/site-packages/numpy/linalg/tests/test_linalg.py", line 44, in test_cdouble self.do(a, b) File "/var/opt/py/lib/python2.5/site-packages/numpy/linalg/tests/test_linalg.py", line 114, in do old_assert_almost_equal(s[0]/s[-1], linalg.cond(a,2), decimal=5) File "/var/opt/py/lib/python2.5/site-packages/numpy/testing/utils.py", line 421, in assert_almost_equal raise AssertionError(msg) AssertionError: Arrays are not almost equal ACTUAL: 9.4348091510413177 DESIRED: 22.757141876814547 ====================================================================== FAIL: test_csingle (test_linalg.TestCond2) ---------------------------------------------------------------------- Traceback (most recent call last): File "/var/opt/py/lib/python2.5/site-packages/numpy/linalg/tests/test_linalg.py", line 39, in test_csingle self.do(a, b) File "/var/opt/py/lib/python2.5/site-packages/numpy/linalg/tests/test_linalg.py", line 114, in do old_assert_almost_equal(s[0]/s[-1], linalg.cond(a,2), decimal=5) File "/var/opt/py/lib/python2.5/site-packages/numpy/testing/utils.py", line 421, in assert_almost_equal raise AssertionError(msg) AssertionError: Arrays are not almost equal ACTUAL: 9.4348097 DESIRED: 22.757143 ====================================================================== FAIL: test_cdouble (test_linalg.TestDet) ---------------------------------------------------------------------- Traceback (most recent call last): File "/var/opt/py/lib/python2.5/site-packages/numpy/linalg/tests/test_linalg.py", line 44, in test_cdouble self.do(a, b) File "/var/opt/py/lib/python2.5/site-packages/numpy/linalg/tests/test_linalg.py", line 135, in do assert_almost_equal(d, multiply.reduce(ev)) File "/var/opt/py/lib/python2.5/site-packages/numpy/linalg/tests/test_linalg.py", line 23, in assert_almost_equal old_assert_almost_equal(a, b, decimal=decimal, **kw) File "/var/opt/py/lib/python2.5/site-packages/numpy/testing/utils.py", line 400, in assert_almost_equal "DESIRED: %s\n" % (str(actual), str(desired))) AssertionError: Items are not equal: ACTUAL: (8.881784197e-16-4j) DESIRED: (5.28-11.04j) ====================================================================== FAIL: test_csingle (test_linalg.TestDet) ---------------------------------------------------------------------- Traceback (most recent call last): File "/var/opt/py/lib/python2.5/site-packages/numpy/linalg/tests/test_linalg.py", line 39, in test_csingle self.do(a, b) File "/var/opt/py/lib/python2.5/site-packages/numpy/linalg/tests/test_linalg.py", line 135, in do assert_almost_equal(d, multiply.reduce(ev)) File "/var/opt/py/lib/python2.5/site-packages/numpy/linalg/tests/test_linalg.py", line 23, in assert_almost_equal old_assert_almost_equal(a, b, decimal=decimal, **kw) File "/var/opt/py/lib/python2.5/site-packages/numpy/testing/utils.py", line 400, in assert_almost_equal "DESIRED: %s\n" % (str(actual), str(desired))) AssertionError: Items are not equal: ACTUAL: (8.881784197e-16-4j) DESIRED: (5.28-11.04j) ====================================================================== FAIL: test_cdouble (test_linalg.TestEig) ---------------------------------------------------------------------- Traceback (most recent call last): File "/var/opt/py/lib/python2.5/site-packages/numpy/linalg/tests/test_linalg.py", line 44, in test_cdouble self.do(a, b) File 
"/var/opt/py/lib/python2.5/site-packages/numpy/linalg/tests/test_linalg.py", line 94, in do assert_almost_equal(dot(a, evectors), multiply(evectors, evalues)) File "/var/opt/py/lib/python2.5/site-packages/numpy/linalg/tests/test_linalg.py", line 23, in assert_almost_equal old_assert_almost_equal(a, b, decimal=decimal, **kw) File "/var/opt/py/lib/python2.5/site-packages/numpy/testing/utils.py", line 400, in assert_almost_equal "DESIRED: %s\n" % (str(actual), str(desired))) AssertionError: Items are not equal: ACTUAL: [[ 2.72530404+2.67511327j 1.91551601+1.28537403j] [ 5.95809316+4.79684551j 3.39598770+1.39546789j]] DESIRED: [[ 2.01388405+1.03693361j -1.39855180+1.88751398j] [ 1.78601662+0.01838201j -0.10378837-3.53101635j]] ====================================================================== FAIL: test_csingle (test_linalg.TestEig) ---------------------------------------------------------------------- Traceback (most recent call last): File "/var/opt/py/lib/python2.5/site-packages/numpy/linalg/tests/test_linalg.py", line 39, in test_csingle self.do(a, b) File "/var/opt/py/lib/python2.5/site-packages/numpy/linalg/tests/test_linalg.py", line 94, in do assert_almost_equal(dot(a, evectors), multiply(evectors, evalues)) File "/var/opt/py/lib/python2.5/site-packages/numpy/linalg/tests/test_linalg.py", line 23, in assert_almost_equal old_assert_almost_equal(a, b, decimal=decimal, **kw) File "/var/opt/py/lib/python2.5/site-packages/numpy/testing/utils.py", line 400, in assert_almost_equal "DESIRED: %s\n" % (str(actual), str(desired))) AssertionError: Items are not equal: ACTUAL: [[ 2.72530413+2.6751132j 1.91551602+1.28537405j] [ 5.95809317+4.79684544j 3.39598775+1.39546788j]] DESIRED: [[ 2.01388407+1.03693354j -1.39855182+1.887514j ] [ 1.78601658+0.01838197j -0.10378837-3.53101635j]] ====================================================================== FAIL: test_cdouble (test_linalg.TestEigh) ---------------------------------------------------------------------- Traceback (most recent call last): File "/var/opt/py/lib/python2.5/site-packages/numpy/linalg/tests/test_linalg.py", line 221, in test_cdouble self.do(a) File "/var/opt/py/lib/python2.5/site-packages/numpy/linalg/tests/test_linalg.py", line 259, in do assert_almost_equal(ev, evalues) File "/var/opt/py/lib/python2.5/site-packages/numpy/linalg/tests/test_linalg.py", line 23, in assert_almost_equal old_assert_almost_equal(a, b, decimal=decimal, **kw) File "/var/opt/py/lib/python2.5/site-packages/numpy/testing/utils.py", line 400, in assert_almost_equal "DESIRED: %s\n" % (str(actual), str(desired))) AssertionError: Items are not equal: ACTUAL: [-2.60555128 4.60555128] DESIRED: [-1.71080202-1.00413682j 3.01849433+1.46567528j] ====================================================================== FAIL: test_csingle (test_linalg.TestEigh) ---------------------------------------------------------------------- Traceback (most recent call last): File "/var/opt/py/lib/python2.5/site-packages/numpy/linalg/tests/test_linalg.py", line 217, in test_csingle self.do(a) File "/var/opt/py/lib/python2.5/site-packages/numpy/linalg/tests/test_linalg.py", line 259, in do assert_almost_equal(ev, evalues) File "/var/opt/py/lib/python2.5/site-packages/numpy/linalg/tests/test_linalg.py", line 23, in assert_almost_equal old_assert_almost_equal(a, b, decimal=decimal, **kw) File "/var/opt/py/lib/python2.5/site-packages/numpy/testing/utils.py", line 400, in assert_almost_equal "DESIRED: %s\n" % (str(actual), str(desired))) AssertionError: Items are not equal: ACTUAL: 
[-2.60555124 4.60555124] DESIRED: [-1.71080208-1.0041368j 3.01849437+1.46567523j] ====================================================================== FAIL: test_cdouble (test_linalg.TestEigvalsh) ---------------------------------------------------------------------- Traceback (most recent call last): File "/var/opt/py/lib/python2.5/site-packages/numpy/linalg/tests/test_linalg.py", line 221, in test_cdouble self.do(a) File "/var/opt/py/lib/python2.5/site-packages/numpy/linalg/tests/test_linalg.py", line 249, in do assert_almost_equal(ev, evalues) File "/var/opt/py/lib/python2.5/site-packages/numpy/linalg/tests/test_linalg.py", line 23, in assert_almost_equal old_assert_almost_equal(a, b, decimal=decimal, **kw) File "/var/opt/py/lib/python2.5/site-packages/numpy/testing/utils.py", line 400, in assert_almost_equal "DESIRED: %s\n" % (str(actual), str(desired))) AssertionError: Items are not equal: ACTUAL: [-2.60555128+0.j 4.60555128+0.j] DESIRED: [-1.71080202-1.00413682j 3.01849433+1.46567528j] ====================================================================== FAIL: test_csingle (test_linalg.TestEigvalsh) ---------------------------------------------------------------------- Traceback (most recent call last): File "/var/opt/py/lib/python2.5/site-packages/numpy/linalg/tests/test_linalg.py", line 217, in test_csingle self.do(a) File "/var/opt/py/lib/python2.5/site-packages/numpy/linalg/tests/test_linalg.py", line 249, in do assert_almost_equal(ev, evalues) File "/var/opt/py/lib/python2.5/site-packages/numpy/linalg/tests/test_linalg.py", line 23, in assert_almost_equal old_assert_almost_equal(a, b, decimal=decimal, **kw) File "/var/opt/py/lib/python2.5/site-packages/numpy/testing/utils.py", line 400, in assert_almost_equal "DESIRED: %s\n" % (str(actual), str(desired))) AssertionError: Items are not equal: ACTUAL: [-2.60555124+0.j 4.60555124+0.j] DESIRED: [-1.71080208-1.0041368j 3.01849437+1.46567523j] ====================================================================== FAIL: test_cdouble (test_linalg.TestLstsq) ---------------------------------------------------------------------- Traceback (most recent call last): File "/var/opt/py/lib/python2.5/site-packages/numpy/linalg/tests/test_linalg.py", line 44, in test_cdouble self.do(a, b) File "/var/opt/py/lib/python2.5/site-packages/numpy/linalg/tests/test_linalg.py", line 141, in do assert_almost_equal(b, dot(a, x)) File "/var/opt/py/lib/python2.5/site-packages/numpy/linalg/tests/test_linalg.py", line 23, in assert_almost_equal old_assert_almost_equal(a, b, decimal=decimal, **kw) File "/var/opt/py/lib/python2.5/site-packages/numpy/testing/utils.py", line 400, in assert_almost_equal "DESIRED: %s\n" % (str(actual), str(desired))) AssertionError: Items are not equal: ACTUAL: [ 2.+1.j 1.+2.j] DESIRED: [ 0.95920929+0.98311952j 1.23494444+0.67346351j] ====================================================================== FAIL: test_csingle (test_linalg.TestLstsq) ---------------------------------------------------------------------- Traceback (most recent call last): File "/var/opt/py/lib/python2.5/site-packages/numpy/linalg/tests/test_linalg.py", line 39, in test_csingle self.do(a, b) File "/var/opt/py/lib/python2.5/site-packages/numpy/linalg/tests/test_linalg.py", line 141, in do assert_almost_equal(b, dot(a, x)) File "/var/opt/py/lib/python2.5/site-packages/numpy/linalg/tests/test_linalg.py", line 23, in assert_almost_equal old_assert_almost_equal(a, b, decimal=decimal, **kw) File 
"/var/opt/py/lib/python2.5/site-packages/numpy/testing/utils.py", line 400, in assert_almost_equal "DESIRED: %s\n" % (str(actual), str(desired))) AssertionError: Items are not equal: ACTUAL: [ 2.+1.j 1.+2.j] DESIRED: [ 0.95920926+0.98311943j 1.23494434+0.67346334j] ====================================================================== FAIL: test_cdouble (test_linalg.TestPinv) ---------------------------------------------------------------------- Traceback (most recent call last): File "/var/opt/py/lib/python2.5/site-packages/numpy/linalg/tests/test_linalg.py", line 44, in test_cdouble self.do(a, b) File "/var/opt/py/lib/python2.5/site-packages/numpy/linalg/tests/test_linalg.py", line 124, in do assert_almost_equal(dot(a, a_ginv), identity(asarray(a).shape[0])) File "/var/opt/py/lib/python2.5/site-packages/numpy/linalg/tests/test_linalg.py", line 23, in assert_almost_equal old_assert_almost_equal(a, b, decimal=decimal, **kw) File "/var/opt/py/lib/python2.5/site-packages/numpy/testing/utils.py", line 400, in assert_almost_equal "DESIRED: %s\n" % (str(actual), str(desired))) AssertionError: Items are not equal: ACTUAL: [[ 0.29169056-0.07799046j 0.17767375-0.01332484j] [ 0.04125021-0.38255608j 0.73402869+0.62377356j]] DESIRED: [[ 1. 0.] [ 0. 1.]] ====================================================================== FAIL: test_csingle (test_linalg.TestPinv) ---------------------------------------------------------------------- Traceback (most recent call last): File "/var/opt/py/lib/python2.5/site-packages/numpy/linalg/tests/test_linalg.py", line 39, in test_csingle self.do(a, b) File "/var/opt/py/lib/python2.5/site-packages/numpy/linalg/tests/test_linalg.py", line 124, in do assert_almost_equal(dot(a, a_ginv), identity(asarray(a).shape[0])) File "/var/opt/py/lib/python2.5/site-packages/numpy/linalg/tests/test_linalg.py", line 23, in assert_almost_equal old_assert_almost_equal(a, b, decimal=decimal, **kw) File "/var/opt/py/lib/python2.5/site-packages/numpy/testing/utils.py", line 400, in assert_almost_equal "DESIRED: %s\n" % (str(actual), str(desired))) AssertionError: Items are not equal: ACTUAL: [[ 0.29169053-0.07799049j 0.17767370-0.0133248j ] [ 0.04125014-0.38255614j 0.73402858+0.62377363j]] DESIRED: [[ 1. 0.] [ 0. 
1.]] ====================================================================== FAIL: test_cdouble (test_linalg.TestSVD) ---------------------------------------------------------------------- Traceback (most recent call last): File "/var/opt/py/lib/python2.5/site-packages/numpy/linalg/tests/test_linalg.py", line 44, in test_cdouble self.do(a, b) File "/var/opt/py/lib/python2.5/site-packages/numpy/linalg/tests/test_linalg.py", line 100, in do assert_almost_equal(a, dot(multiply(u, s), vt)) File "/var/opt/py/lib/python2.5/site-packages/numpy/linalg/tests/test_linalg.py", line 23, in assert_almost_equal old_assert_almost_equal(a, b, decimal=decimal, **kw) File "/var/opt/py/lib/python2.5/site-packages/numpy/testing/utils.py", line 400, in assert_almost_equal "DESIRED: %s\n" % (str(actual), str(desired))) AssertionError: Items are not equal: ACTUAL: [[ 1.+2.j 2.+3.j] [ 3.+4.j 4.+5.j]] DESIRED: [[ 1.00000000+2.j 2.36670415+2.98574489j] [ 3.00000000+4.j 2.80882652+6.25521741j]] ====================================================================== FAIL: test_csingle (test_linalg.TestSVD) ---------------------------------------------------------------------- Traceback (most recent call last): File "/var/opt/py/lib/python2.5/site-packages/numpy/linalg/tests/test_linalg.py", line 39, in test_csingle self.do(a, b) File "/var/opt/py/lib/python2.5/site-packages/numpy/linalg/tests/test_linalg.py", line 100, in do assert_almost_equal(a, dot(multiply(u, s), vt)) File "/var/opt/py/lib/python2.5/site-packages/numpy/linalg/tests/test_linalg.py", line 23, in assert_almost_equal old_assert_almost_equal(a, b, decimal=decimal, **kw) File "/var/opt/py/lib/python2.5/site-packages/numpy/testing/utils.py", line 400, in assert_almost_equal "DESIRED: %s\n" % (str(actual), str(desired))) AssertionError: Items are not equal: ACTUAL: [[ 1.+2.j 2.+3.j] [ 3.+4.j 4.+5.j]] DESIRED: [[ 0.99999994+2.j 2.36670423+2.98574495j] [ 3.00000000+4.j 2.80882668+6.25521755j]] ---------------------------------------------------------------------- Ran 2186 tests in 10.594s FAILED (KNOWNFAIL=1, SKIP=11, failures=16) Regards, ST -- From cournape at gmail.com Sat Aug 8 08:59:08 2009 From: cournape at gmail.com (David Cournapeau) Date: Sat, 8 Aug 2009 21:59:08 +0900 Subject: [Numpy-discussion] Test failures r7300 In-Reply-To: <4A7D71AE.2050704@gmail.com> References: <4A7D71AE.2050704@gmail.com> Message-ID: <5b8d13220908080559j4cf98394qc4ee89d21c16300e@mail.gmail.com> On Sat, Aug 8, 2009 at 9:38 PM, wrote: > Hi, > > I got 16 test failures after building r7300 from svn on debian/sid/i386. > Seems all related to complex linear algebra modules. Are you using atlas ? (numpy.show_config() output) If so, did you compile it by yourself ? Did you compile everything with gfortran (do you have g77 installed). Problems related to complex are almost always caused by fortran compilers mismatch, cheers, David From lukshuntim at gmail.com Sat Aug 8 09:33:02 2009 From: lukshuntim at gmail.com (lukshuntim at gmail.com) Date: Sat, 08 Aug 2009 21:33:02 +0800 Subject: [Numpy-discussion] Test failures r7300 In-Reply-To: <5b8d13220908080559j4cf98394qc4ee89d21c16300e@mail.gmail.com> References: <4A7D71AE.2050704@gmail.com> <5b8d13220908080559j4cf98394qc4ee89d21c16300e@mail.gmail.com> Message-ID: <4A7D7E8E.8010902@gmail.com> David Cournapeau wrote: > On Sat, Aug 8, 2009 at 9:38 PM, wrote: >> Hi, >> >> I got 16 test failures after building r7300 from svn on debian/sid/i386. >> Seems all related to complex linear algebra modules. > > Are you using atlas ? 
(numpy.show_config() output) Yes, it's libatlas-sse2 3.6.0-24 debian/sid package. In [3]: numpy.show_config() atlas_threads_info: NOT AVAILABLE blas_opt_info: libraries = ['f77blas', 'cblas', 'atlas'] library_dirs = ['/usr/lib/sse2'] define_macros = [('ATLAS_INFO', '"\\"3.6.0\\""')] language = c atlas_blas_threads_info: NOT AVAILABLE lapack_opt_info: libraries = ['lapack', 'f77blas', 'cblas', 'atlas'] library_dirs = ['/usr/lib/sse2/atlas', '/usr/lib/sse2'] define_macros = [('ATLAS_INFO', '"\\"3.6.0\\""')] language = f77 atlas_info: libraries = ['lapack', 'f77blas', 'cblas', 'atlas'] library_dirs = ['/usr/lib/sse2/atlas', '/usr/lib/sse2'] language = f77 lapack_mkl_info: NOT AVAILABLE blas_mkl_info: NOT AVAILABLE atlas_blas_info: libraries = ['f77blas', 'cblas', 'atlas'] library_dirs = ['/usr/lib/sse2'] language = c mkl_info: NOT AVAILABLE I've set these in my site.cfg [DEFAULT] library_dirs = /usr/lib/sse2 [blas_opt] libraries = f77blas, cblas, atlas [lapack_opt] libraries = lapack_atlas, f77blas, cblas, atlas but it seems not to pick up the liblapack_atlas.so from the debian atlas package. > > If so, did you compile it by yourself ? Did you compile everything > with gfortran (do you have g77 installed). Yes, and I don't have g77 anymore. > > Problems related to complex are almost always caused by fortran > compilers mismatch, Does the numpy.show_config() show that the debian atlas libaries are compiled with f77? Running ldd /usr/lib/sse2/libatlas.so shows linux-gate.so.1 => (0xb7f97000) libgfortran.so.3 => /usr/lib/libgfortran.so.3 (0xb7914000) libm.so.6 => /lib/i686/cmov/libm.so.6 (0xb78ee000) libgcc_s.so.1 => /lib/libgcc_s.so.1 (0xb78c2000) libc.so.6 => /lib/i686/cmov/libc.so.6 (0xb7763000) /lib/ld-linux.so.2 (0xb7f98000) > > cheers, > > David Thanks very much for the help, ST -- From cournape at gmail.com Sat Aug 8 09:45:56 2009 From: cournape at gmail.com (David Cournapeau) Date: Sat, 8 Aug 2009 22:45:56 +0900 Subject: [Numpy-discussion] Test failures r7300 In-Reply-To: <4A7D7E8E.8010902@gmail.com> References: <4A7D71AE.2050704@gmail.com> <5b8d13220908080559j4cf98394qc4ee89d21c16300e@mail.gmail.com> <4A7D7E8E.8010902@gmail.com> Message-ID: <5b8d13220908080645o6c22a081xa36b5ff8d97abefa@mail.gmail.com> On Sat, Aug 8, 2009 at 10:33 PM, wrote: > David Cournapeau wrote: >> On Sat, Aug 8, 2009 at 9:38 PM, wrote: >>> Hi, >>> >>> I got 16 test failures after building r7300 from svn on debian/sid/i386. >>> Seems all related to complex linear algebra modules. >> >> Are you using atlas ? (numpy.show_config() output) > > Yes, it's libatlas-sse2 3.6.0-24 debian/sid package. I wonder if debian atlas package has the same problem as on recent Ubuntu. > [DEFAULT] > library_dirs = /usr/lib/sse2 > [blas_opt] > libraries = f77blas, cblas, atlas > [lapack_opt] > libraries = lapack_atlas, f77blas, cblas, atlas > > but it seems not to pick up the liblapack_atlas.so from the debian atlas > package. there is no need to do this I think: it should work out of the box without any site.cfg (the point is that it would make it easier to change which atlas is loaded at runtime for further debugging of the issue). > > Does the numpy.show_config() show that the debian atlas libaries are > compiled with f77? No, the f77 refers to the fortran 77 dialect, not the compiler. What I would try is first install libatlas-base (or whatever it is called on sid), i.e. the non sse version, and compare test output with both sse2/nosse (e.g. 
using LD_LIBRARY_PATH to point to /usr/lib so that the nosse is
loaded, you can check using ldd which one is loaded by ld).

David

From pgmdevlist at gmail.com Sat Aug 8 12:33:11 2009
From: pgmdevlist at gmail.com (Pierre GM)
Date: Sat, 8 Aug 2009 12:33:11 -0400
Subject: [Numpy-discussion] Merging datetime, Yes or No
In-Reply-To: 
References: 
Message-ID: <747A6551-3A4E-4985-9382-EA32A05AA0D9@gmail.com>

On Aug 7, 2009, at 11:23 PM, Charles R Harris wrote:

> I ask again,
>
> Datetime is getting really stale and hasn't been touched recently.
> Do the datetime folks want it merged or not, because it's getting to
> be a bit of work.

Chuck,
Please check directly w/ Travis O. (and Robert ?), the only
contributor(s) so far to this branch. Marty Fuhry, our GSoC student
working on the same topic, is now trying to integrate his routines
into the sources, and it'd be best if we had some up-to-date sources...
P.

From charlesr.harris at gmail.com Sat Aug 8 13:12:28 2009
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Sat, 8 Aug 2009 11:12:28 -0600
Subject: [Numpy-discussion] Merging datetime, Yes or No
In-Reply-To: <747A6551-3A4E-4985-9382-EA32A05AA0D9@gmail.com>
References: <747A6551-3A4E-4985-9382-EA32A05AA0D9@gmail.com>
Message-ID: 

On Sat, Aug 8, 2009 at 10:33 AM, Pierre GM wrote:

>
> On Aug 7, 2009, at 11:23 PM, Charles R Harris wrote:
>
> > I ask again,
> >
> > Datetime is getting really stale and hasn't been touched recently.
> > Do the datetime folks want it merged or not, because it's getting to
> > be a bit of work.
>
> Chuck,
> Please check directly w/ Travis O. (and Robert ?), the only
> contributor(s) so far to this branch. Marty Fuhry, our GSoC student
> working on the same topic, is now trying to integrate his routines
> into the sources, and it'd be best if we had some up-to-date sources...
> P.
>

I've been waiting for some sort of nod from that direction. I actually
see two parts here: a straightforward part involving the datetime and
timedelta types, and a more complicated bit involving all the units
and such. I think the latter needs to be looked at along with Darren's
work for adding units to decide if it is the best approach. And I
wonder a bit if a subclass using Darren's stuff wouldn't be the way to
go seeing as how the two new types are basically npy_int64. Anyway, I
think I'll just stop worrying about it and go ahead with updates in
the trunk. Merging datetime might be a bit of work but I'll just leave
that to those involved ;)

Chuck

From oliphant at enthought.com Sat Aug 8 14:23:14 2009
From: oliphant at enthought.com (Travis Oliphant)
Date: Sat, 8 Aug 2009 12:23:14 -0600
Subject: [Numpy-discussion] Merging datetime, Yes or No
In-Reply-To: 
References: 
Message-ID: <2B8A03BE-E169-449D-8906-4062430B1CBA@enthought.com>

You are welcome to merge it but I fear it is not stable enough. I'd
like to spend more time with it first.

-Travis

--
(mobile phone of)
Travis Oliphant
Enthought, Inc.
1-512-536-1057
http://www.enthought.com

On Aug 7, 2009, at 9:23 PM, Charles R Harris wrote:

> I ask again,
>
> Datetime is getting really stale and hasn't been touched recently.
> Do the datetime folks want it merged or not, because it's getting to
> be a bit of work.
> > Chuck > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From kuiper at jpl.nasa.gov Sat Aug 8 22:33:22 2009 From: kuiper at jpl.nasa.gov (Tom Kuiper) Date: Sat, 8 Aug 2009 19:33:22 -0700 Subject: [Numpy-discussion] memmap, write through and flush Message-ID: <4A7E3572.9000909@jpl.nasa.gov> There is something curious here. The second flush() fails. Can anyone explain this? Tom ------------------- code snippet ------------------------ ... # create a memmap with dtype and shape that matches the data fp = np.memmap(filename, dtype='float32', mode='w+', shape=(3,4)) print "Initial memory mapped array (mode 'w+'):\n",fp # write data to memmap array fp[:] = data[:] fp.flush() # append a row to the array fp = np.append(fp, [[12,13,14,15]], 0) print "Filled memory mapped array:\n",fp fp.flush() ... ----------------- output ----------------------- Filled memory mapped array: [[ 0. 1. 2. 3.] [ 4. 5. 6. 7.] [ 8. 9. 10. 11.] [ 12. 13. 14. 15.]] --------------------------------------------------------------------------- AttributeError Traceback (most recent call last) ..../memmap_test.py in () 21 fp = np.append(fp, [[12,13,14,15]], 0) 22 print "Filled memory mapped array:\n",fp ---> 23 fp.flush() 24 25 # deletion flushes memory changes to disk before removing the object AttributeError: 'numpy.ndarray' object has no attribute 'flush' WARNING: Failure executing file: From tjhnson at gmail.com Sat Aug 8 23:46:49 2009 From: tjhnson at gmail.com (T J) Date: Sat, 8 Aug 2009 20:46:49 -0700 Subject: [Numpy-discussion] Specifying Index Programmatically Message-ID: I have an array, and I need to index it like so: z[...,x,:] How can I write code which will index z, as above, when x is not known ahead of time. For that matter, the particular dimension I am querying is not known either. In case this is still confusing, I am looking for the NumPy way to do: z[("...",5,":")] z[(":", 3, ":", 5, "...")] z[(1, "...", 5)] Basically, I want to be able to pre-construct what should appear inside the []. The numbers are no problem, but I'm having trouble with the ellipsis and colon. From nmb at wartburg.edu Sat Aug 8 23:54:35 2009 From: nmb at wartburg.edu (Neil Martinsen-Burrell) Date: Sat, 08 Aug 2009 22:54:35 -0500 Subject: [Numpy-discussion] Specifying Index Programmatically In-Reply-To: References: Message-ID: <4A7E487B.70906@wartburg.edu> On 2009-08-08 22:46 , T J wrote: > I have an array, and I need to index it like so: > > z[...,x,:] > > How can I write code which will index z, as above, when x is not known > ahead of time. For that matter, the particular dimension I am querying > is not known either. In case this is still confusing, I am looking > for the NumPy way to do: > > z[("...",5,":")] > > z[(":", 3, ":", 5, "...")] > > z[(1, "...", 5)] > > Basically, I want to be able to pre-construct what should appear > inside the []. The numbers are no problem, but I'm having trouble > with the ellipsis and colon. The ellipsis is a built-in python constant called Ellipsis. The colon is a slice object, again a python built-in, called with None as an argument. So, z[...,2,:] == z[Ellipsis,2,slice(None)]. -Neil From tjhnson at gmail.com Sat Aug 8 23:59:13 2009 From: tjhnson at gmail.com (T J) Date: Sat, 8 Aug 2009 20:59:13 -0700 Subject: [Numpy-discussion] reduce function of vectorize doesn't respect dtype? 
In-Reply-To: 
References: 
Message-ID: 

On Fri, Aug 7, 2009 at 11:54 AM, T J wrote:
> The reduce function of ufunc of a vectorized function doesn't seem to
> respect the dtype.
>
>>>> def a(x,y): return x+y
>>>> b = vectorize(a)
>>>> c = array([1,2])
>>>> b(c, c)  # use once to populate b.ufunc
>>>> d = b.ufunc.reduce(c)
>>>> c.dtype, type(d)
> dtype('int32'), <type 'int'>
>
>>>> c = array([[1,2,3],[4,5,6]])
>>>> b.ufunc.reduce(c)
> array([5, 7, 9], dtype=object)
>

So is this a bug? Or am I doing something wrong? In the second example....

>>> d = b.ufunc.reduce(c)
>>> type(d[0])
<type 'int'>
>>> d.dtype
dtype('object')

From tjhnson at gmail.com Sun Aug 9 00:02:43 2009
From: tjhnson at gmail.com (T J)
Date: Sat, 8 Aug 2009 21:02:43 -0700
Subject: [Numpy-discussion] Specifying Index Programmatically
In-Reply-To: <4A7E487B.70906@wartburg.edu>
References: <4A7E487B.70906@wartburg.edu>
Message-ID: 

On Sat, Aug 8, 2009 at 8:54 PM, Neil Martinsen-Burrell wrote:
>
> The ellipsis is a built-in python constant called Ellipsis. The colon
> is a slice object, again a python built-in, called with None as an
> argument. So, z[...,2,:] == z[Ellipsis,2,slice(None)].
>

Very helpful! Thank you. I didn't run into this information in any of
the indexing tutorials I ran through. If I get some time, I'll try to
add it.

From tjhnson at gmail.com Sun Aug 9 00:36:07 2009
From: tjhnson at gmail.com (T J)
Date: Sat, 8 Aug 2009 21:36:07 -0700
Subject: [Numpy-discussion] Indexing with a list...
Message-ID: 

>>> z = array([1,2,3,4])
>>> z[[1]]
array([2])
>>> z[(1,)]
2

I'm just curious: What is the motivation for this differing behavior?
Is it a necessary consequence of, for example, the following:

>>> z[z<3]
array([1, 2])

From dwf at cs.toronto.edu Sun Aug 9 01:09:25 2009
From: dwf at cs.toronto.edu (David Warde-Farley)
Date: Sun, 9 Aug 2009 01:09:25 -0400
Subject: [Numpy-discussion] Indexing with a list...
In-Reply-To: 
References: 
Message-ID: <5163945D-E393-438B-95DC-2EE43A8C338D@cs.toronto.edu>

On 9-Aug-09, at 12:36 AM, T J wrote:

>>>> z = array([1,2,3,4])
>>>> z[[1]]
> array([2])
>>>> z[(1,)]
> 2
>
> I'm just curious: What is the motivation for this differing behavior?

When you address an element in a 2D array with, say, a[2,3], you are
actually indexing it with a tuple object (2,3). The 'comma' operator
in Python creates a tuple, irrespective of whether you use parens or
not. e.g.

In [192]: z = {}
In [193]: z[2,3] = 5
In [194]: z
Out[194]: {(2, 3): 5}

In the special case of scalar indices they're treated as if they are
length-1 tuples. The behaviour you're seeing is the same as z[1].

David

From tjhnson at gmail.com Sun Aug 9 02:38:01 2009
From: tjhnson at gmail.com (T J)
Date: Sat, 8 Aug 2009 23:38:01 -0700
Subject: [Numpy-discussion] Indexing with a list...
In-Reply-To: <5163945D-E393-438B-95DC-2EE43A8C338D@cs.toronto.edu>
References: <5163945D-E393-438B-95DC-2EE43A8C338D@cs.toronto.edu>
Message-ID: 

On Sat, Aug 8, 2009 at 10:09 PM, David Warde-Farley wrote:
> On 9-Aug-09, at 12:36 AM, T J wrote:
>
>>>>> z = array([1,2,3,4])
>>>>> z[[1]]
>> array([2])
>>>>> z[(1,)]
>> 2
>>
> In the special case of scalar indices they're treated as if they are
> length-1 tuples. The behaviour you're seeing is the same as z[1].
>

Sure, but that wasn't my question.

I was asking about the difference between indexing with a 1-tuple (or
scalar) and with a 1-list. Naively, I guess I didn't expect there to
be a difference. Though, I can see its uses (through the z[z<3]
example).
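To make the z[z<3] connection concrete (reusing the z from above):

>>> z[z<3]
array([1, 2])
>>> z[[0, 1]]   # a list of the positions where z<3 is True: same result
array([1, 2])

whereas z[(0, 1)] raises an IndexError here, since a tuple supplies
one index per axis and z only has one axis.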
The dictionary example is nice in that it really highlights exactly
*how* different arrays are from python dictionaries (aside from the
obvious): since lists are unhashable, you can't index with them at
all. Yet you can index numpy arrays with lists AND the behavior is
different from if you indexed with a tuple!

>>> z = array([1,2,3])
>>> i = [2]
>>> type(z[i])
<type 'numpy.ndarray'>
>>> type(z[tuple(i)])
<type 'numpy.int32'>

From dwf at cs.toronto.edu Sun Aug 9 03:53:43 2009
From: dwf at cs.toronto.edu (David Warde-Farley)
Date: Sun, 9 Aug 2009 03:53:43 -0400
Subject: [Numpy-discussion] Indexing with a list...
In-Reply-To: 
References: <5163945D-E393-438B-95DC-2EE43A8C338D@cs.toronto.edu>
Message-ID: 

On 9-Aug-09, at 2:38 AM, T J wrote:

> Sure, but that wasn't my question.
>
> I was asking about the difference between indexing with a 1-tuple (or
> scalar) and with a 1-list. Naively, I guess I didn't expect there to
> be a difference. Though, I can see its uses (through the z[z<3]
> example).

Ah. I didn't see the relevance of z[z<3], but now I do. z < 3 produces
a boolean *array*, and you're right that arrays and lists are treated
the same. Single element lists and single element tuples are treated
differently because tuples and lists are treated differently in
general; if the behaviour of list indices changed when they were
length 1, you'd have all kinds of corner cases to check for and handle
in cases where you don't know a priori the length of your index list.

Since you can also have a tuple containing lists/arrays, that will
pull out the elements on each axis in a shape-preserving way. And you
can mix and match lists/arrays and slices, i.e. A[:,[4,1,5],[6,9,7]].

David

From fperez.net at gmail.com Sun Aug 9 05:12:25 2009
From: fperez.net at gmail.com (Fernando Perez)
Date: Sun, 9 Aug 2009 02:12:25 -0700
Subject: [Numpy-discussion] Sanity checklist for those attending the
	SciPy'09 tutorials
Message-ID: 

Hi all,

[ sorry for spamming the list, but even though I sent this to all the
email addresses I have on file for tutorial attendees, I know I am
missing a few, so I hope they see this message. ]

In order to make your experience at the scipy tutorials as smooth as
possible, we strongly recommend that you take a little time to install
the necessary tools in advance.

For both introductory and advanced tutorials:

http://conference.scipy.org/intro_tutorial
http://conference.scipy.org/advanced_tutorials

you will find instructions on what to install and where to download it
from. In addition (this is also mentioned on those pages), we
encourage you to run, according to your tutorial of choice, a little
checklist script:

https://cirl.berkeley.edu/fperez/tmp/intro_tut_checklist.py
https://cirl.berkeley.edu/fperez/tmp/adv_tut_checklist.py

This will try to spot any problems early, and we'll do our best to
help you with them before you arrive at the conference.

Best regards,

Dave Peterson and Fernando Perez.

ps - for those of you who may find fixes for the checklist scripts,
the sources are hosted on github:

http://github.com/fperez/scipytut/

From ezindy at gmail.com Sun Aug 9 06:17:38 2009
From: ezindy at gmail.com (Egor Zindy)
Date: Sun, 9 Aug 2009 11:17:38 +0100
Subject: [Numpy-discussion] SWIG, numpy.i and errno: comments?
Message-ID: 

Hello list,

this is my attempt at generating python exceptions in SWIG/C using
the errno mechanism:
http://www.scipy.org/Cookbook/SWIG_NumPy_examples#head-10f49a0f5ea6b313127d2ec5ffa1eaf1c133cb22

Used together with numpy.i, this has been useful for notifying (in a
pythonic way) memory allocation errors or array index problems.
A change in the errno global variable is detected in the %exception part of the SWIG interface file, and Python exceptions are generated after $action depending on the errno error code value. %exception { errno = 0; $action if (errno != 0) { switch(errno) { case EPERM: PyErr_Format(PyExc_IndexError, "Index out of range"); break; case ENOMEM: PyErr_Format(PyExc_MemoryError, "Failed malloc()"); break; default: PyErr_Format(PyExc_Exception, "Unknown exception"); } return NULL; } } If there's a better way of doing this, I'll update the cookbook recipe. Regards, Egor From josef.pktd at gmail.com Sun Aug 9 07:34:59 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sun, 9 Aug 2009 07:34:59 -0400 Subject: [Numpy-discussion] [SciPy-User] Sanity checklist for those attending the SciPy'09 tutorials In-Reply-To: References: Message-ID: <1cd32cbb0908090434m2fe57da1xbadc7e1657b19694@mail.gmail.com> On Sun, Aug 9, 2009 at 5:12 AM, Fernando Perez wrote: > Hi all, > > [ sorry for spamming the list, but even though I sent this to all the > email addresses I have on file for tutorial attendees, I know I am > missing a few, so I hope they see this message. ] > > In order to make your experience at the scipy tutorials as smooth as > possible, we strongly recommend that you take a little time to install > the necessary tools in advance. > > For both introductory and advanced ?tutorials: > > http://conference.scipy.org/intro_tutorial > http://conference.scipy.org/advanced_tutorials > > you will find instructions on what to install and where to download it > from. ?In addition (this is also mentioned on those pages), we > encourage you to run according to your tutorial of choice, a little > checklist script: > > https://cirl.berkeley.edu/fperez/tmp/intro_tut_checklist.py > https://cirl.berkeley.edu/fperez/tmp/adv_tut_checklist.py > > This will try to ?spot any problems early, and we'll do our best ?to > help you with them before you arrive to the conference. > > Best regards, > > Dave Peterson and Fernando Perez. > > ps - for those of you who may find fixes for the checklist scripts, > the sources are hosted on github: > > http://github.com/fperez/scipytut/ > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > is "Availability: recent flavors of Unix. 
" required (python 2.5 help for os.uname) Josef ================== System information ================== os.name : nt os.uname : Traceback (most recent call last): File "C:\Josef\work-oth\adv_tut_checklist.py", line 317, in main() File "C:\Josef\work-oth\adv_tut_checklist.py", line 312, in main sys_info() File "C:\Josef\work-oth\adv_tut_checklist.py", line 43, in sys_info print 'os.uname :',os.uname() AttributeError: 'module' object has no attribute 'uname' From lukshuntim at gmail.com Sun Aug 9 08:53:55 2009 From: lukshuntim at gmail.com (lukshuntim at gmail.com) Date: Sun, 09 Aug 2009 20:53:55 +0800 Subject: [Numpy-discussion] Test failures r7300 In-Reply-To: <5b8d13220908080645o6c22a081xa36b5ff8d97abefa@mail.gmail.com> References: <4A7D71AE.2050704@gmail.com> <5b8d13220908080559j4cf98394qc4ee89d21c16300e@mail.gmail.com> <4A7D7E8E.8010902@gmail.com> <5b8d13220908080645o6c22a081xa36b5ff8d97abefa@mail.gmail.com> Message-ID: <4A7EC6E3.5050706@gmail.com> David Cournapeau wrote: > On Sat, Aug 8, 2009 at 10:33 PM, wrote: >> David Cournapeau wrote: >>> On Sat, Aug 8, 2009 at 9:38 PM, wrote: >>>> Hi, >>>> >>>> I got 16 test failures after building r7300 from svn on debian/sid/i386. >>>> Seems all related to complex linear algebra modules. >>> Are you using atlas ? (numpy.show_config() output) >> Yes, it's libatlas-sse2 3.6.0-24 debian/sid package. > > I wonder if debian atlas package has the same problem as on recent Ubuntu. [snipped] > What I would try is first install libatlas-base (or whatever it is > called on sid), i.e. the non sse version, and compare test output with > both sse2/nosse (e.g. using LD_LIBRARY_PATH to point to /usr/lib so > that the nosse is loaded, you can check using ldd which one is loaded > by ld). Just to clarify, you mean doing a "ldd lapack_lite.so" to check which blas and lapack is used at runtime. Right? I also removed site.cfg when building. With no atlas, and with both libatlas-base and libatlas-sse, the complex linear algebra errors went away and I got only 1 error: FAIL: Test bug in reduceat with structured arrays copied for speed. ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/pymodules/python2.5/nose/case.py", line 183, in runTest self.test(*self.arg) File "/var/opt/py/lib/python2.5/site-packages/numpy/core/tests/test_umath.py", line 818, in test_reduceat assert_array_almost_equal(h1, h2) File "/var/opt/py/lib/python2.5/site-packages/numpy/testing/utils.py", line 726, in assert_array_almost_equal header='Arrays are not almost equal') File "/var/opt/py/lib/python2.5/site-packages/numpy/testing/utils.py", line 571, in assert_array_compare raise AssertionError(msg) AssertionError: Arrays are not almost equal (mismatch 100.0%) x: array([ -5.17543592e+11, -5.17543592e+11, -5.17543592e+11, -5.17543592e+11], dtype=float32) y: array([ 700., 800., 1000., 7500.], dtype=float32) So it appears that it's the sse2 variant that is causing the problem. Regards, ST -- From wfspotz at sandia.gov Sun Aug 9 09:30:59 2009 From: wfspotz at sandia.gov (Bill Spotz) Date: Sun, 9 Aug 2009 09:30:59 -0400 Subject: [Numpy-discussion] SWIG, numpy.i and errno: comments? In-Reply-To: References: Message-ID: Egor, This looks about right. However, it is customary to invoke the SWIG macro "SWIG_fail;" instead of "break;". (This translates into a "goto" to the failure label, and is better in case there is any other cleanup code to execute.) 
On Aug 9, 2009, at 6:17 AM, Egor Zindy wrote: > Hello list, > > this is my attempt at generating python exceptions in SWIG/C using the > errno mechanism: > http://www.scipy.org/Cookbook/SWIG_NumPy_examples#head-10f49a0f5ea6b313127d2ec5ffa1eaf1c133cb22 > > Used together with numpy.i, this has been useful for notifying (in a > pythonic way) memory allocation errors or array index problems. > > A change in the errno global variable is detected in the %exception > part of the SWIG interface file, and Python exceptions are generated > after $action depending on the errno error code value. > > %exception > { > errno = 0; > $action > > if (errno != 0) > { > switch(errno) > { > case EPERM: > PyErr_Format(PyExc_IndexError, "Index out of range"); > break; > case ENOMEM: > PyErr_Format(PyExc_MemoryError, "Failed malloc()"); > break; > default: > PyErr_Format(PyExc_Exception, "Unknown exception"); > } > return NULL; > } > } > > If there's a better way of doing this, I'll update the cookbook > recipe. > > Regards, > Egor ** Bill Spotz ** ** Sandia National Laboratories Voice: (505)845-0170 ** ** P.O. Box 5800 Fax: (505)284-0154 ** ** Albuquerque, NM 87185-0370 Email: wfspotz at sandia.gov ** From aisaac at american.edu Sun Aug 9 09:41:56 2009 From: aisaac at american.edu (Alan G Isaac) Date: Sun, 09 Aug 2009 09:41:56 -0400 Subject: [Numpy-discussion] Indexing with a list... In-Reply-To: References: <5163945D-E393-438B-95DC-2EE43A8C338D@cs.toronto.edu> Message-ID: <4A7ED224.7040308@american.edu> Fancy indexing is discussed in detail in the Guide to NumPy. http://www.tramy.us/guidetoscipy.html Alan Isaac From ezindy at gmail.com Sun Aug 9 12:50:52 2009 From: ezindy at gmail.com (Egor Zindy) Date: Sun, 9 Aug 2009 17:50:52 +0100 Subject: [Numpy-discussion] SWIG, numpy.i and errno: comments? In-Reply-To: References: Message-ID: Bill, thank you for your comment. Would this do instead? (replacing the return NULL with SWIG_fail): %exception { errno = 0; $action if (errno != 0) { switch(errno) { case EPERM: PyErr_Format(PyExc_IndexError, "Index out of range"); break; case ENOMEM: PyErr_Format(PyExc_MemoryError, "failed malloc()"); break; default: PyErr_Format(PyExc_Exception, "Unknown exception"); } SWIG_fail; } } Cheers, Egor On Sun, Aug 9, 2009 at 2:30 PM, Bill Spotz wrote: > Egor, > > This looks about right. ?However, it is customary to invoke the SWIG macro > "SWIG_fail;" instead of "break;". ?(This translates into a "goto" to the > failure label, and is better in case there is any other cleanup code to > execute.) > > On Aug 9, 2009, at 6:17 AM, Egor Zindy wrote: > >> Hello list, >> >> this is my attempt at generating python exceptions in SWIG/C using the >> errno mechanism: >> >> http://www.scipy.org/Cookbook/SWIG_NumPy_examples#head-10f49a0f5ea6b313127d2ec5ffa1eaf1c133cb22 >> >> Used together with numpy.i, this has been useful for notifying (in a >> pythonic way) memory allocation errors or array index problems. >> >> A change in the errno global variable is detected in the %exception >> part of the SWIG interface file, and Python exceptions are generated >> after $action depending on the errno error code value. >> >> %exception >> { >> ? errno = 0; >> ? $action >> >> ? if (errno != 0) >> ? { >> ? ? ? switch(errno) >> ? ? ? { >> ? ? ? ? ? case EPERM: >> ? ? ? ? ? ? ? PyErr_Format(PyExc_IndexError, "Index out of range"); >> ? ? ? ? ? ? ? break; >> ? ? ? ? ? case ENOMEM: >> ? ? ? ? ? ? ? PyErr_Format(PyExc_MemoryError, "Failed malloc()"); >> ? ? ? ? ? ? ? break; >> ? ? ? ? ? default: >> ? ? ? ? ? ? ? 
PyErr_Format(PyExc_Exception, "Unknown exception");
>>       }
>>       return NULL;
>>   }
>> }
>>
>> If there's a better way of doing this, I'll update the cookbook
>> recipe.
>>
>> Regards,
>> Egor
>
> ** Bill Spotz                                              **
> ** Sandia National Laboratories  Voice: (505)845-0170      **
> ** P.O. Box 5800                 Fax:   (505)284-0154      **
> ** Albuquerque, NM 87185-0370    Email: wfspotz at sandia.gov **
>
>

From fperez.net at gmail.com Sun Aug 9 13:46:49 2009
From: fperez.net at gmail.com (Fernando Perez)
Date: Sun, 9 Aug 2009 10:46:49 -0700
Subject: [Numpy-discussion] [SciPy-User] Sanity checklist for those
	attending the SciPy'09 tutorials
In-Reply-To: <1cd32cbb0908090434m2fe57da1xbadc7e1657b19694@mail.gmail.com>
References: <1cd32cbb0908090434m2fe57da1xbadc7e1657b19694@mail.gmail.com>
Message-ID: 

On Sun, Aug 9, 2009 at 4:34 AM, wrote:
> is "Availability: recent flavors of Unix. " required (python 2.5 help
> for os.uname)
>

Thanks for the catch, sorry about that. My unix-isms showing through...

Updated.

f

From giuseppe.aprea at gmail.com Mon Aug 10 06:20:40 2009
From: giuseppe.aprea at gmail.com (Giuseppe Aprea)
Date: Mon, 10 Aug 2009 11:20:40 +0100
Subject: [Numpy-discussion] problem during installation
Message-ID: 

Sorry if I posted this twice. I wonder if anyone can suggest how to
fix this error, which I get during the numpy build:

.........
creating build/temp.linux-i686-2.6/numpy/linalg
compile options: '-DNO_ATLAS_INFO=1 -Inumpy/core/include
-Ibuild/src.linux-i686-2.6/numpy/core/include/numpy -Inumpy/core/src
-Inumpy/core/include -I/home/gaprea/usr/local/include/python2.6 -c'
gcc: numpy/linalg/lapack_litemodule.c
gcc: numpy/linalg/python_xerbla.c
/usr/bin/gfortran -Wall -L/home/gaprea/usr/local/lib
build/temp.linux-i686-2.6/numpy/linalg/lapack_litemodule.o
build/temp.linux-i686-2.6/numpy/linalg/python_xerbla.o
-L/home/gaprea/usr/local/lib -L/usr/lib -L/home/gaprea/usr/local/lib
-Lbuild/temp.linux-i686-2.6 -llapack -lblas -lpython2.6 -lgfortran -o
build/lib.linux-i686-2.6/numpy/linalg/lapack_lite.so
/usr/lib/gcc/i486-linux-gnu/4.2.4/libgfortranbegin.a(fmain.o): In
function `main':
(.text+0x23): undefined reference to `MAIN__'
collect2: ld returned 1 exit status
/usr/lib/gcc/i486-linux-gnu/4.2.4/libgfortranbegin.a(fmain.o): In
function `main':
(.text+0x23): undefined reference to `MAIN__'
collect2: ld returned 1 exit status
error: Command "/usr/bin/gfortran -Wall -L/home/gaprea/usr/local/lib
build/temp.linux-i686-2.6/numpy/linalg/lapack_litemodule.o
build/temp.linux-i686-2.6/numpy/linalg/python_xerbla.o
-L/home/gaprea/usr/local/lib -L/usr/lib -L/home/gaprea/usr/local/lib
-Lbuild/temp.linux-i686-2.6 -llapack -lblas -lpython2.6 -lgfortran -o
build/lib.linux-i686-2.6/numpy/linalg/lapack_lite.so" failed with exit
status 1

It seems that the installation program is using gfortran to link
while it should have used gcc. I am using Kubuntu 8.04 and I have the
following version of gfortran and gcc:

$ gfortran -v
Using built-in specs.
Target: i486-linux-gnu Configured with: ../src/configure -v --enable-languages=c,c++,fortran,objc,obj-c++,treelang --prefix=/usr --enable-shared --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --enable-nls --with-gxx-include-dir=/usr/include/c++/4.2 --program-suffix=-4.2 --enable-clocale=gnu --enable-libstdcxx-debug --enable-objc-gc --enable-mpfr --enable-targets=all --enable-checking=release --build=i486-linux-gnu --host=i486-linux-gnu --target=i486-linux-gnu Thread model: posix gcc version 4.2.4 (Ubuntu 4.2.4-1ubuntu4) $ gcc -v Using built-in specs. Target: i486-linux-gnu Configured with: ../src/configure -v --enable-languages=c,c++,fortran,objc,obj-c++,treelang --prefix=/usr --enable-shared --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --enable-nls --with-gxx-include-dir=/usr/include/c++/4.2 --program-suffix=-4.2 --enable-clocale=gnu --enable-libstdcxx-debug --enable-objc-gc --enable-mpfr --enable-targets=all --enable-checking=release --build=i486-linux-gnu --host=i486-linux-gnu --target=i486-linux-gnu Thread model: posix gcc version 4.2.4 (Ubuntu 4.2.4-1ubuntu4) does anybody have a suggestion? thanks giuseppe From wfspotz at sandia.gov Mon Aug 10 11:14:10 2009 From: wfspotz at sandia.gov (Bill Spotz) Date: Mon, 10 Aug 2009 11:14:10 -0400 Subject: [Numpy-discussion] SWIG, numpy.i and errno: comments? In-Reply-To: References: Message-ID: Sure. On Aug 9, 2009, at 12:50 PM, Egor Zindy wrote: > Bill, > > thank you for your comment. Would this do instead? (replacing the > return NULL with SWIG_fail): > > %exception > { > errno = 0; > $action > > if (errno != 0) > { > switch(errno) > { > case EPERM: > PyErr_Format(PyExc_IndexError, "Index out of range"); > break; > case ENOMEM: > PyErr_Format(PyExc_MemoryError, "failed malloc()"); > break; > default: > PyErr_Format(PyExc_Exception, "Unknown exception"); > } > SWIG_fail; > } > } > > Cheers, > Egor > > On Sun, Aug 9, 2009 at 2:30 PM, Bill Spotz wrote: >> Egor, >> >> This looks about right. However, it is customary to invoke the >> SWIG macro >> "SWIG_fail;" instead of "break;". (This translates into a "goto" >> to the >> failure label, and is better in case there is any other cleanup >> code to >> execute.) >> >> On Aug 9, 2009, at 6:17 AM, Egor Zindy wrote: >> >>> Hello list, >>> >>> this is my attempt at generating python exceptions in SWIG/C using >>> the >>> errno mechanism: >>> >>> http://www.scipy.org/Cookbook/SWIG_NumPy_examples#head-10f49a0f5ea6b313127d2ec5ffa1eaf1c133cb22 >>> >>> Used together with numpy.i, this has been useful for notifying (in a >>> pythonic way) memory allocation errors or array index problems. >>> >>> A change in the errno global variable is detected in the %exception >>> part of the SWIG interface file, and Python exceptions are generated >>> after $action depending on the errno error code value. >>> >>> %exception >>> { >>> errno = 0; >>> $action >>> >>> if (errno != 0) >>> { >>> switch(errno) >>> { >>> case EPERM: >>> PyErr_Format(PyExc_IndexError, "Index out of range"); >>> break; >>> case ENOMEM: >>> PyErr_Format(PyExc_MemoryError, "Failed malloc()"); >>> break; >>> default: >>> PyErr_Format(PyExc_Exception, "Unknown exception"); >>> } >>> return NULL; >>> } >>> } >>> >>> If there's a better way of doing this, I'll update the cookbook >>> recipe. >>> >>> Regards, >>> Egor >> >> ** Bill Spotz ** >> ** Sandia National Laboratories Voice: (505)845-0170 ** >> ** P.O. 
Box 5800                 Fax:   (505)284-0154      **
>> ** Albuquerque, NM 87185-0370    Email: wfspotz at sandia.gov **
>>
>>
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>

** Bill Spotz                                              **
** Sandia National Laboratories  Voice: (505)845-0170      **
** P.O. Box 5800                 Fax:   (505)284-0154      **
** Albuquerque, NM 87185-0370    Email: wfspotz at sandia.gov **

From rjel at ceh.ac.uk Mon Aug 10 11:08:27 2009
From: rjel at ceh.ac.uk (Rich E)
Date: Mon, 10 Aug 2009 15:08:27 +0000 (UTC)
Subject: [Numpy-discussion] Question about bounds checking
Message-ID: 

Dear all,
I am having a few issues with indexing in numpy and wondered if you
could help me out.
If I define an array
a = zeros(( 4))
a
array([ 0.,  0.,  0.,  0.])

Then I try and reference a point beyond the bounds of the array

a[4]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: index out of bounds

but if I use the slicing format to reference the point I get

a[0:4]
array([ 0.,  0.,  0.,  0.])
a[0:10]
array([ 0.,  0.,  0.,  0.])

it returns a[ 0 : 3 ], with no error raised. If I then ask for the
shape of the array, I get
a.shape
(4,)

but if I use

a[0:3].shape
(3,)

which is one less than I would have expected.

a[0:4].shape
(4,)

This is numpy 1.2.1 on python 2.5
Thanks in advance for your help,
Rich

From sccolbert at gmail.com Mon Aug 10 11:44:23 2009
From: sccolbert at gmail.com (Chris Colbert)
Date: Mon, 10 Aug 2009 11:44:23 -0400
Subject: [Numpy-discussion] Question about bounds checking
In-Reply-To: 
References: 
Message-ID: <7f014ea60908100844i3cb94040vcfb575964f175f64@mail.gmail.com>

when you use slice notation, [0:4] returns everything up to but not
including index 4. That is, a[4] is actually the 5th element of the
array (which doesn't exist) because arrays are zero-based in python.

http://docs.scipy.org/doc/numpy-1.3.x/user/basics.indexing.html

On Mon, Aug 10, 2009 at 11:08 AM, Rich E wrote:
> Dear all,
> I am having a few issues with indexing in numpy and wondered if you
> could help me out.
> If I define an array
> a = zeros(( 4))
> a
> array([ 0.,  0.,  0.,  0.])
>
> Then I try and reference a point beyond the bounds of the array
>
> a[4]
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> IndexError: index out of bounds
>
> but if I use the slicing format to reference the point I get
>
> a[0:4]
> array([ 0.,  0.,  0.,  0.])
> a[0:10]
> array([ 0.,  0.,  0.,  0.])
>
> it returns a[ 0 : 3 ], with no error raised. If I then ask for the
> shape of the array, I get
> a.shape
> (4,)
>
> but if I use
>
> a[0:3].shape
> (3,)
>
> which is one less than I would have expected.
>
> a[0:4].shape
> (4,)
>
> This is numpy 1.2.1 on python 2.5
> Thanks in advance for your help,
> Rich
>
>
>
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>

From kwgoodman at gmail.com Mon Aug 10 11:55:22 2009
From: kwgoodman at gmail.com (Keith Goodman)
Date: Mon, 10 Aug 2009 08:55:22 -0700
Subject: [Numpy-discussion] add axis to results of reduction (mean,
	min, ...)
In-Reply-To: <3d375d730908060907g10ba8374i73c9747abcc10392@mail.gmail.com> References: <1cd32cbb0908060855v3fec4524rfed715d60c741a0b@mail.gmail.com> <3d375d730908060907g10ba8374i73c9747abcc10392@mail.gmail.com> Message-ID: On Thu, Aug 6, 2009 at 9:07 AM, Robert Kern wrote: > On Thu, Aug 6, 2009 at 11:03, Keith Goodman wrote: >> On Thu, Aug 6, 2009 at 8:55 AM, wrote: >>> What's the best way of getting back the correct shape to be able to >>> broadcast, mean, min,.. to the original array, that works for >>> arbitrary dimension and axis? >>> >>> I thought I have seen some helper functions, but I don't find them anymore? >>> >>> Josef >>> >>>>>> a >>> array([[1, 2, 3, 3, 0], >>> ? ? ? [2, 2, 3, 2, 1]]) >>>>>> a-a.max(0) >>> array([[-1, ?0, ?0, ?0, -1], >>> ? ? ? [ 0, ?0, ?0, -1, ?0]]) >>>>>> a-a.max(1) >>> Traceback (most recent call last): >>> ?File "", line 1, in >>> ? ?a-a.max(1) >>> ValueError: shape mismatch: objects cannot be broadcast to a single shape >>>>>> a-a.max(1)[:,None] >>> array([[-2, -1, ?0, ?0, -3], >>> ? ? ? [-1, -1, ?0, -1, -2]]) >> >> Would this do it? >> >>>> pylab.demean?? >> Type: ? ? ? ? ? function >> Base Class: ? ? >> String Form: ? ? >> Namespace: ? ? ?Interactive >> File: ? ? ? ? ? /usr/lib/python2.6/dist-packages/matplotlib/mlab.py >> Definition: ? ? pylab.demean(x, axis=0) >> Source: >> def demean(x, axis=0): >> ? ?"Return x minus its mean along the specified axis" >> ? ?x = np.asarray(x) >> ? ?if axis: >> ? ? ? ?ind = [slice(None)] * axis >> ? ? ? ?ind.append(np.newaxis) >> ? ? ? ?return x - x.mean(axis)[ind] >> ? ?return x - x.mean(axis) > > Ouch! That doesn't handle axis=-1. > > if axis != 0: > ? ?ind = [slice(None)] * x.ndim > ? ?ind[axis] = np.newaxis Ouch! That doesn't handle axis=None. if axis: ind = [slice(None)] * x.ndim ind[axis] = np.newaxis From josef.pktd at gmail.com Mon Aug 10 12:10:05 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 10 Aug 2009 12:10:05 -0400 Subject: [Numpy-discussion] add axis to results of reduction (mean, min, ...) In-Reply-To: References: <1cd32cbb0908060855v3fec4524rfed715d60c741a0b@mail.gmail.com> <3d375d730908060907g10ba8374i73c9747abcc10392@mail.gmail.com> Message-ID: <1cd32cbb0908100910k281da2bci8bd2716c04de123e@mail.gmail.com> On Mon, Aug 10, 2009 at 11:55 AM, Keith Goodman wrote: > On Thu, Aug 6, 2009 at 9:07 AM, Robert Kern wrote: >> On Thu, Aug 6, 2009 at 11:03, Keith Goodman wrote: >>> On Thu, Aug 6, 2009 at 8:55 AM, wrote: >>>> What's the best way of getting back the correct shape to be able to >>>> broadcast, mean, min,.. to the original array, that works for >>>> arbitrary dimension and axis? >>>> >>>> I thought I have seen some helper functions, but I don't find them anymore? >>>> >>>> Josef >>>> >>>>>>> a >>>> array([[1, 2, 3, 3, 0], >>>> ? ? ? [2, 2, 3, 2, 1]]) >>>>>>> a-a.max(0) >>>> array([[-1, ?0, ?0, ?0, -1], >>>> ? ? ? [ 0, ?0, ?0, -1, ?0]]) >>>>>>> a-a.max(1) >>>> Traceback (most recent call last): >>>> ?File "", line 1, in >>>> ? ?a-a.max(1) >>>> ValueError: shape mismatch: objects cannot be broadcast to a single shape >>>>>>> a-a.max(1)[:,None] >>>> array([[-2, -1, ?0, ?0, -3], >>>> ? ? ? [-1, -1, ?0, -1, -2]]) >>> >>> Would this do it? >>> >>>>> pylab.demean?? >>> Type: ? ? ? ? ? function >>> Base Class: ? ? >>> String Form: ? ? >>> Namespace: ? ? ?Interactive >>> File: ? ? ? ? ? /usr/lib/python2.6/dist-packages/matplotlib/mlab.py >>> Definition: ? ? pylab.demean(x, axis=0) >>> Source: >>> def demean(x, axis=0): >>> ? ?"Return x minus its mean along the specified axis" >>> ? 
?x = np.asarray(x) >>> ? ?if axis: >>> ? ? ? ?ind = [slice(None)] * axis >>> ? ? ? ?ind.append(np.newaxis) >>> ? ? ? ?return x - x.mean(axis)[ind] >>> ? ?return x - x.mean(axis) >> >> Ouch! That doesn't handle axis=-1. >> >> if axis != 0: >> ? ?ind = [slice(None)] * x.ndim >> ? ?ind[axis] = np.newaxis > > Ouch! That doesn't handle axis=None. > > if axis: > ? ?ind = [slice(None)] * x.ndim > ? ?ind[axis] = np.newaxis that's why I used if axis != 0 and not axis is None: and included a testcase for None. (although my version looks a bit verbose but explicit) Josef > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From kwgoodman at gmail.com Mon Aug 10 12:21:25 2009 From: kwgoodman at gmail.com (Keith Goodman) Date: Mon, 10 Aug 2009 09:21:25 -0700 Subject: [Numpy-discussion] add axis to results of reduction (mean, min, ...) In-Reply-To: <1cd32cbb0908100910k281da2bci8bd2716c04de123e@mail.gmail.com> References: <1cd32cbb0908060855v3fec4524rfed715d60c741a0b@mail.gmail.com> <3d375d730908060907g10ba8374i73c9747abcc10392@mail.gmail.com> <1cd32cbb0908100910k281da2bci8bd2716c04de123e@mail.gmail.com> Message-ID: On Mon, Aug 10, 2009 at 9:10 AM, wrote: > On Mon, Aug 10, 2009 at 11:55 AM, Keith Goodman wrote: >> On Thu, Aug 6, 2009 at 9:07 AM, Robert Kern wrote: >>> On Thu, Aug 6, 2009 at 11:03, Keith Goodman wrote: >>>> On Thu, Aug 6, 2009 at 8:55 AM, wrote: >>>>> What's the best way of getting back the correct shape to be able to >>>>> broadcast, mean, min,.. to the original array, that works for >>>>> arbitrary dimension and axis? >>>>> >>>>> I thought I have seen some helper functions, but I don't find them anymore? >>>>> >>>>> Josef >>>>> >>>>>>>> a >>>>> array([[1, 2, 3, 3, 0], >>>>> ? ? ? [2, 2, 3, 2, 1]]) >>>>>>>> a-a.max(0) >>>>> array([[-1, ?0, ?0, ?0, -1], >>>>> ? ? ? [ 0, ?0, ?0, -1, ?0]]) >>>>>>>> a-a.max(1) >>>>> Traceback (most recent call last): >>>>> ?File "", line 1, in >>>>> ? ?a-a.max(1) >>>>> ValueError: shape mismatch: objects cannot be broadcast to a single shape >>>>>>>> a-a.max(1)[:,None] >>>>> array([[-2, -1, ?0, ?0, -3], >>>>> ? ? ? [-1, -1, ?0, -1, -2]]) >>>> >>>> Would this do it? >>>> >>>>>> pylab.demean?? >>>> Type: ? ? ? ? ? function >>>> Base Class: ? ? >>>> String Form: ? ? >>>> Namespace: ? ? ?Interactive >>>> File: ? ? ? ? ? /usr/lib/python2.6/dist-packages/matplotlib/mlab.py >>>> Definition: ? ? pylab.demean(x, axis=0) >>>> Source: >>>> def demean(x, axis=0): >>>> ? ?"Return x minus its mean along the specified axis" >>>> ? ?x = np.asarray(x) >>>> ? ?if axis: >>>> ? ? ? ?ind = [slice(None)] * axis >>>> ? ? ? ?ind.append(np.newaxis) >>>> ? ? ? ?return x - x.mean(axis)[ind] >>>> ? ?return x - x.mean(axis) >>> >>> Ouch! That doesn't handle axis=-1. >>> >>> if axis != 0: >>> ? ?ind = [slice(None)] * x.ndim >>> ? ?ind[axis] = np.newaxis >> >> Ouch! That doesn't handle axis=None. >> >> if axis: >> ? ?ind = [slice(None)] * x.ndim >> ? ?ind[axis] = np.newaxis > > that's why I used > > ?if axis != 0 and not axis is None: > > and included a testcase for None. (although my version looks a bit > verbose but explicit) I'm getting better. I'm only 3 days behind this time. Yeah, I caught it on a unit test too. 
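For the record, np.expand_dims already does the index bookkeeping
(negative axes included), so a version that also survives axis=None
could look like this -- a lightly-tested sketch:

import numpy as np

def demean(x, axis=0):
    "Return x minus its mean along the specified axis."
    x = np.asarray(x)
    if axis is None:
        return x - x.mean()  # scalar mean broadcasts against any shape
    # re-insert the reduced axis so the mean broadcasts against x
    return x - np.expand_dims(x.mean(axis), axis)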
From david.huard at gmail.com Mon Aug 10 15:08:01 2009 From: david.huard at gmail.com (David Huard) Date: Mon, 10 Aug 2009 15:08:01 -0400 Subject: [Numpy-discussion] vectorize problem with f2py and gfortran 4.3 Message-ID: <91cf711d0908101208s25601d88uf554b2df1ce99adf@mail.gmail.com> Hi all, A user on the pymc user list has reported a problem with f2py wrapped fortran functions compiled with gfortran 4.3, which is the standard Ubuntu Jaunty fortran compiler. I noticed the same bug in some of my own routines. The problem, as far as I can understand, is that vectorize tries to find the number of arguments by calling the function with no arguments and parsing the error message. With numpy 1.3, python 2.6 and gfortran 4.3, the error message is not what numpy expects, and does not contain the expected number of arguments. So I am wondering if there is a reliable way to introspect compiled extensions to provide the number of arguments needed by vectorize ? Thanks, David -------------- next part -------------- An HTML attachment was scrubbed... URL: From liukis at usc.edu Mon Aug 10 15:19:24 2009 From: liukis at usc.edu (Maria Liukis) Date: Mon, 10 Aug 2009 12:19:24 -0700 Subject: [Numpy-discussion] Indexing empty array with empty boolean array causes "IndexError: invalid index exception" Message-ID: <7F7802B9-BD2F-4350-99B6-708D140089C8@usc.edu> Hello everybody, I'm using following versions of Scipy and Numpy packages: >>> scipy.__version__ '0.7.1' >>> np.__version__ '1.3.0' My code uses boolean array to filter 2-dimensional array which sometimes happens to be an empty array. It seems like I have to take special care when dimension I'm filtering is zero, otherwise I'm getting an "IndexError: invalid index" exception: >>> import numpy as np >>> a = np.zeros((2,10)) >>> a array([[ 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.], [ 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.]]) >>> filter_array = np.zeros(2,) >>> filter_array array([False, False], dtype=bool) >>> a[filter_array,:] array([], shape=(0, 10), dtype=float64) >>>>>> Now if filtered dimension is zero: >>> a = np.ones((0,10)) >>> a array([], shape=(0, 10), dtype=float64) >>> filter_array = np.zeros((0,), dtype=bool) >>> filter_array array([], dtype=bool) >>> filter_array.shape (0,) >>> a.shape (0, 10) >>> a[filter_array,:] Traceback (most recent call last): File "", line 1, in IndexError: invalid index >>> Would somebody know if it's an expected behavior, a package bug or am I doing something wrong? Thanks in advance, Masha -------------------- liukis at usc.edu -------------- next part -------------- An HTML attachment was scrubbed... URL: From liukis at usc.edu Mon Aug 10 15:23:41 2009 From: liukis at usc.edu (Maria Liukis) Date: Mon, 10 Aug 2009 12:23:41 -0700 Subject: [Numpy-discussion] Indexing empty array with empty boolean array causes "IndexError: invalid index exception" In-Reply-To: <7F7802B9-BD2F-4350-99B6-708D140089C8@usc.edu> References: <7F7802B9-BD2F-4350-99B6-708D140089C8@usc.edu> Message-ID: <27D03FF9-3531-4BA9-A5A9-A9839C1A1CE0@usc.edu> A correction for the typo below. Thanks, Masha -------------------- liukis at usc.edu On Aug 10, 2009, at 12:19 PM, Maria Liukis wrote: > Hello everybody, > > I'm using following versions of Scipy and Numpy packages: > >>> scipy.__version__ > '0.7.1' > >>> np.__version__ > '1.3.0' > > My code uses boolean array to filter 2-dimensional array which > sometimes happens to be an empty array. 
It seems like I have to > take special care when dimension I'm filtering is zero, otherwise > I'm getting an "IndexError: invalid index" exception: > > >>> import numpy as np > >>> a = np.zeros((2,10)) Sorry, copied wrong line for an example which creates an array: >>> a = np.ones((2,10)) > >>> a > array([[ 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.], > [ 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.]]) > >>> filter_array = np.zeros(2,) > >>> filter_array > array([False, False], dtype=bool) > >>> a[filter_array,:] > array([], shape=(0, 10), dtype=float64) > >>>>>> > > > Now if filtered dimension is zero: > >>> a = np.ones((0,10)) > >>> a > array([], shape=(0, 10), dtype=float64) > >>> filter_array = np.zeros((0,), dtype=bool) > >>> filter_array > array([], dtype=bool) > >>> filter_array.shape > (0,) > >>> a.shape > (0, 10) > >>> a[filter_array,:] > Traceback (most recent call last): > File "", line 1, in > IndexError: invalid index > >>> > > Would somebody know if it's an expected behavior, a package bug or > am I doing something wrong? > > > Thanks in advance, > Masha > -------------------- > liukis at usc.edu > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From brennan.williams at visualreservoir.com Mon Aug 10 20:52:54 2009 From: brennan.williams at visualreservoir.com (Brennan Williams) Date: Tue, 11 Aug 2009 12:52:54 +1200 Subject: [Numpy-discussion] speeding up getting a subset of a data array Message-ID: <4A80C0E6.2090107@visualreservoir.com> Hi No doubt asked many times before so apologies.... I'm pulling a subset array out of a data array where I have a list of the indices I want (could be an array rather than a list actually - I have it in both). Potentially the number of points and the number of times I do this can get very large so any saving in time is good. So, paraphrasing what I've currently got.... say I have... subsetpointerlist=[0,1,2,5,8,15,25...] subsetsize=len(subsetpointerlist) subsetarray=zeros(subsetsize,dtype=float) for index,pos in enumerate(subsetpointerlist): subsetarray[index]=dataarray[pos] How do I speed this up in numpy, i.e. by removing the for loop? Do I set up some sort of a subsetpointerarray as a mask and then somehow apply that to dataarray to get the values into subsetarray? Thanks Brennan From josef.pktd at gmail.com Mon Aug 10 20:58:53 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 10 Aug 2009 20:58:53 -0400 Subject: [Numpy-discussion] speeding up getting a subset of a data array In-Reply-To: <4A80C0E6.2090107@visualreservoir.com> References: <4A80C0E6.2090107@visualreservoir.com> Message-ID: <1cd32cbb0908101758k21ab9a7bs7523c5fa625a8f98@mail.gmail.com> On Mon, Aug 10, 2009 at 8:52 PM, Brennan Williams wrote: > Hi > > No doubt asked many times before so apologies.... > > I'm pulling a subset array out of a data array where I have a list of > the indices I want (could be an array rather than a list actually - I > have it in both). > > Potentially the number of points and the number of times I do this can > get very large so any saving in time is good. > > So, paraphrasing what I've currently got.... say I have... > > subsetpointerlist=[0,1,2,5,8,15,25...] 
> subsetsize=len(subsetpointerlist) > subsetarray=zeros(subsetsize,dtype=float) > for index,pos in enumerate(subsetpointerlist): > ?subsetarray[index]=dataarray[pos] > > How do I speed this up in numpy, i.e. by removing the for loop? > > Do I set up some sort of a subsetpointerarray as a mask and then somehow > apply that to dataarray to get the values into subsetarray? > > Thanks > > Brennan > > looks to me like subsetarray = dataarray[subsetpointerlist] or with type conversion subsetarray = dataarray[subsetpointerlist].astype(float) Josef > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From brennan.williams at visualreservoir.com Mon Aug 10 21:03:03 2009 From: brennan.williams at visualreservoir.com (Brennan Williams) Date: Tue, 11 Aug 2009 13:03:03 +1200 Subject: [Numpy-discussion] speeding up getting a subset of a data array In-Reply-To: <4A80C0E6.2090107@visualreservoir.com> References: <4A80C0E6.2090107@visualreservoir.com> Message-ID: <4A80C347.5080909@visualreservoir.com> Brennan Williams wrote: > Hi > > No doubt asked many times before so apologies.... > > I'm pulling a subset array out of a data array where I have a list of > the indices I want (could be an array rather than a list actually - I > have it in both). > > Potentially the number of points and the number of times I do this can > get very large so any saving in time is good. > > So, paraphrasing what I've currently got.... say I have... > > subsetpointerlist=[0,1,2,5,8,15,25...] > subsetsize=len(subsetpointerlist) > subsetarray=zeros(subsetsize,dtype=float) > for index,pos in enumerate(subsetpointerlist): > subsetarray[index]=dataarray[pos] > > How do I speed this up in numpy, i.e. by removing the for loop? > > It's not as simple as... subsetarray=dataarray[subsetpointerarray] is it? > Do I set up some sort of a subsetpointerarray as a mask and then somehow > apply that to dataarray to get the values into subsetarray? > > Thanks > > Brennan > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From brennan.williams at visualreservoir.com Mon Aug 10 21:10:29 2009 From: brennan.williams at visualreservoir.com (Brennan Williams) Date: Tue, 11 Aug 2009 13:10:29 +1200 Subject: [Numpy-discussion] speeding up getting a subset of a data array In-Reply-To: <1cd32cbb0908101758k21ab9a7bs7523c5fa625a8f98@mail.gmail.com> References: <4A80C0E6.2090107@visualreservoir.com> <1cd32cbb0908101758k21ab9a7bs7523c5fa625a8f98@mail.gmail.com> Message-ID: <4A80C505.7050208@visualreservoir.com> josef.pktd at gmail.com wrote: > On Mon, Aug 10, 2009 at 8:52 PM, Brennan > Williams wrote: > >> Hi >> >> No doubt asked many times before so apologies.... >> >> I'm pulling a subset array out of a data array where I have a list of >> the indices I want (could be an array rather than a list actually - I >> have it in both). >> >> Potentially the number of points and the number of times I do this can >> get very large so any saving in time is good. >> >> So, paraphrasing what I've currently got.... say I have... >> >> subsetpointerlist=[0,1,2,5,8,15,25...] >> subsetsize=len(subsetpointerlist) >> subsetarray=zeros(subsetsize,dtype=float) >> for index,pos in enumerate(subsetpointerlist): >> subsetarray[index]=dataarray[pos] >> >> How do I speed this up in numpy, i.e. by removing the for loop? 
>> >> Do I set up some sort of a subsetpointerarray as a mask and then somehow >> apply that to dataarray to get the values into subsetarray? >> >> Thanks >> >> Brennan >> >> >> > > looks to me like > > subsetarray = dataarray[subsetpointerlist] > > or with type conversion > > subsetarray = dataarray[subsetpointerlist].astype(float) > > Josef > Thanks, with a little bit of googling/rtfm I'm getting there. Think I overdid my thinking on mask based on something else that Robert Kern helped me out with. > >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From fiolj at yahoo.com Mon Aug 10 23:29:14 2009 From: fiolj at yahoo.com (Juan Fiol) Date: Mon, 10 Aug 2009 20:29:14 -0700 (PDT) Subject: [Numpy-discussion] saving incrementally numpy arrays Message-ID: <668426.19955.qm@web52512.mail.re2.yahoo.com> Hi, I am creating numpy arrays in chunks and I want to save the chunks while my program creates them. I tried to use numpy.save but it failed (because it is not intended to append data). I'd like to know what is, in your opinion, the best way to go. I will put a few thousands every time but building up a file of several Gbytes. I do not want to put into memory all previous data each time. Also I cannot wait until the program finishes, I must save partial results periodically. Thanks, any help will be appreciated Juan From dwf at cs.toronto.edu Tue Aug 11 00:16:50 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Tue, 11 Aug 2009 00:16:50 -0400 Subject: [Numpy-discussion] saving incrementally numpy arrays In-Reply-To: <668426.19955.qm@web52512.mail.re2.yahoo.com> References: <668426.19955.qm@web52512.mail.re2.yahoo.com> Message-ID: <16374B35-670B-4C77-95B1-DCDF304FC2D6@cs.toronto.edu> On 10-Aug-09, at 11:29 PM, Juan Fiol wrote: > Hi, I am creating numpy arrays in chunks and I want to save the > chunks while my program creates them. I tried to use numpy.save but > it failed (because it is not intended to append data). I'd like to > know what is, in your opinion, the best way to go. I will put a few > thousands every time but building up a file of several Gbytes. I do > not want to put into memory all previous data each time PyTables sounds like a good way to go. If you need to append to arrays themselves it can do that too, but it can certainly append new arrays to a file. David From slaunger at gmail.com Tue Aug 11 02:57:40 2009 From: slaunger at gmail.com (Kim Hansen) Date: Tue, 11 Aug 2009 08:57:40 +0200 Subject: [Numpy-discussion] saving incrementally numpy arrays In-Reply-To: <668426.19955.qm@web52512.mail.re2.yahoo.com> References: <668426.19955.qm@web52512.mail.re2.yahoo.com> Message-ID: I have had some resembling challenges in my work, and here appending the nympy arrays to HDF5 files using PyTables has been the solution for me - that used in combination with lzo compression/decompression has lead to very high read/write performance in my application with low memory consumption. You may also want to have a look at the h5py package. Kim 2009/8/11 Juan Fiol > Hi, I am creating numpy arrays in chunks and I want to save the chunks > while my program creates them. I tried to use numpy.save but it failed > (because it is not intended to append data). 
I'd like to know what is, in > your opinion, the best way to go. I will put a few thousands every time but > building up a file of several Gbytes. I do not want to put into memory > all previous data each time. Also I cannot wait until the program finishes, > I must save partial results periodically. Thanks, any help will be > appreciated > Juan > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Tue Aug 11 14:05:21 2009 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 11 Aug 2009 13:05:21 -0500 Subject: [Numpy-discussion] saving incrementally numpy arrays In-Reply-To: <668426.19955.qm@web52512.mail.re2.yahoo.com> References: <668426.19955.qm@web52512.mail.re2.yahoo.com> Message-ID: <3d375d730908111105m4c0985f7k9a4a7fc2f6accca9@mail.gmail.com> On Mon, Aug 10, 2009 at 22:29, Juan Fiol wrote: > Hi, I am creating numpy arrays in chunks and I want to save the chunks while my program creates them. I tried to use numpy.save but it failed (because it is not intended to append data). I'd like to know what is, in your opinion, the best way to go. I will put a few thousands every time but building up a file of several Gbytes. I do not want to put into memory > all previous data each time. Also I cannot wait until the program finishes, I must save partial results periodically. Thanks, any help will be appreciated As others mentioned, PyTables is an excellent, complete solution. If you still want to write your own, then you can pass an open file object to numpy.save() in order to append. Just open it with the mode 'a+b' and seek to the end. f = open('myfile.npy', 'a+b') f.seek(0, 2) numpy.save(f, chunk) f.close() -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From kwgoodman at gmail.com Tue Aug 11 14:46:28 2009 From: kwgoodman at gmail.com (Keith Goodman) Date: Tue, 11 Aug 2009 11:46:28 -0700 Subject: [Numpy-discussion] saving incrementally numpy arrays In-Reply-To: <3d375d730908111105m4c0985f7k9a4a7fc2f6accca9@mail.gmail.com> References: <668426.19955.qm@web52512.mail.re2.yahoo.com> <3d375d730908111105m4c0985f7k9a4a7fc2f6accca9@mail.gmail.com> Message-ID: On Tue, Aug 11, 2009 at 11:05 AM, Robert Kern wrote: > On Mon, Aug 10, 2009 at 22:29, Juan Fiol wrote: >> Hi, I am creating numpy arrays in chunks and I want to save the chunks while my program creates them. I tried to use numpy.save but it failed (because it is not intended to append data). I'd like to know what is, in your opinion, the best way to go. I will put a few thousands every time but building up a file of several Gbytes. I do not want to put into memory >> all previous data each time. Also I cannot wait until the program finishes, I must save partial results periodically. Thanks, any help will be appreciated > > As others mentioned, PyTables is an excellent, complete solution. If > you still want to write your own, then you can pass an open file > object to numpy.save() in order to append. Just open it with the mode > 'a+b' and seek to the end. > > ?f = open('myfile.npy', 'a+b') > ?f.seek(0, 2) > ?numpy.save(f, chunk) > ?f.close() That looks nice. What am I doing wrong? 
>> x = np.array([1,2,3]) >> y = np.array([4,5,6]) >> >> f = open('myfile.npy', 'a+b') >> np.save(f, x) >> f.seek(0, 2) >> np.save(f, y) >> f.close() >> >> xy = np.load('myfile.npy') >> xy array([1, 2, 3]) I was expecting something like array([1, 2, 3, 4, 5, 6]). From fiolj at yahoo.com Tue Aug 11 15:28:24 2009 From: fiolj at yahoo.com (Juan Fiol) Date: Tue, 11 Aug 2009 12:28:24 -0700 (PDT) Subject: [Numpy-discussion] saving incrementally numpy arrays In-Reply-To: Message-ID: <391080.28922.qm@web52501.mail.re2.yahoo.com> Hi, thanks for all the answers. I am checking how to use pytables now, though I probably prefer to do it without further dependencies. I tried opening the file as 'append' and then pickle the array (because looking to the numpy.save it looked like what they did), but to retrieve the data then I have to load multiple times and concatenate (numpy.c_[]). I did not tried Robert suggestion yet, but it will probably happen the same and that is what Keith is seeing (though I may be wrong too). If I do not find a suitable solution with only numpy I'll learn how to use pytables. Thanks and Best regards, Juan --- On Tue, 8/11/09, Keith Goodman wrote: > From: Keith Goodman > Subject: Re: [Numpy-discussion] saving incrementally numpy arrays > To: "Discussion of Numerical Python" > Date: Tuesday, August 11, 2009, 7:46 PM > On Tue, Aug 11, 2009 at 11:05 AM, > Robert Kern > wrote: > > On Mon, Aug 10, 2009 at 22:29, Juan Fiol > wrote: > >> Hi, I am creating numpy arrays in chunks and I > want to save the chunks while my program creates them. I > tried to use numpy.save but it failed (because it is not > intended to append data). I'd like to know what is, in your > opinion, the best way to go. I will put a few thousands > every time but building up a file of several Gbytes. I do > not want to put into memory > >> all previous data each time. Also I cannot wait > until the program finishes, I must save partial results > periodically. Thanks, any help will be appreciated > > > > As others mentioned, PyTables is an excellent, > complete solution. If > > you still want to write your own, then you can pass an > open file > > object to numpy.save() in order to append. Just open > it with the mode > > 'a+b' and seek to the end. > > > > ?f = open('myfile.npy', 'a+b') > > ?f.seek(0, 2) > > ?numpy.save(f, chunk) > > ?f.close() > > That looks nice. What am I doing wrong? > > >> x = np.array([1,2,3]) > >> y = np.array([4,5,6]) > >> > >> f = open('myfile.npy', 'a+b') > >> np.save(f, x) > >> f.seek(0, 2) > >> np.save(f, y) > >> f.close() > >> > >> xy = np.load('myfile.npy') > >> xy > ???array([1, 2, 3]) > > I was expecting something like array([1, 2, 3, 4, 5, 6]). > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From fiolj at yahoo.com Tue Aug 11 15:40:35 2009 From: fiolj at yahoo.com (Juan Fiol) Date: Tue, 11 Aug 2009 12:40:35 -0700 (PDT) Subject: [Numpy-discussion] saving incrementally numpy arrays In-Reply-To: <391080.28922.qm@web52501.mail.re2.yahoo.com> Message-ID: <124312.33233.qm@web52510.mail.re2.yahoo.com> Hi, again, I can confirm that you have to load multiple times. Also I do not see differences if using or not the f.seek line The following snippet gives the expected result. The problem is that that way I have to load as many times as I wrote. Besides that, it works. 
Thanks,
Juan

#-----------------------------------------
import numpy as np

x = np.array([[1,2,3],[4,5,6]])
y = np.array([[7,8,9],[10,11,12]])

f = open('myfile1.npy', 'a+b')
np.save(f, x)
# f.seek(0, 2)
np.save(f, y)
f.close()

fi = open('myfile1.npy','rb')
x1 = np.load(fi)
y1 = np.load(fi)
fi.close()
#-----------------------------------------

--- On Tue, 8/11/09, Juan Fiol wrote:

> From: Juan Fiol
> Subject: Re: [Numpy-discussion] saving incrementally numpy arrays
> To: "Discussion of Numerical Python"
> Date: Tuesday, August 11, 2009, 8:28 PM
> Hi, thanks for all the answers. I am
> checking how to use pytables now, though I probably prefer
> to do it without further dependencies. I tried opening the
> file as 'append' and then pickle the array (because looking
> to the numpy.save it looked like what they did), but to
> retrieve the data then I have to load multiple times and
> concatenate (numpy.c_[]). I did not tried Robert suggestion
> yet, but it will probably happen the same and that is what
> Keith is seeing (though I may be wrong too).
> If I do not find a suitable solution with only numpy I'll
> learn how to use pytables. Thanks and Best regards,
> Juan
>
> --- On Tue, 8/11/09, Keith Goodman wrote:
>
> > From: Keith Goodman
> > Subject: Re: [Numpy-discussion] saving incrementally numpy arrays
> > To: "Discussion of Numerical Python"
> > Date: Tuesday, August 11, 2009, 7:46 PM
> > On Tue, Aug 11, 2009 at 11:05 AM,
> > Robert Kern
> > wrote:
> > > On Mon, Aug 10, 2009 at 22:29, Juan Fiol
> > wrote:
> > >> Hi, I am creating numpy arrays in chunks and
> I
> > want to save the chunks while my program creates them. I
> > tried to use numpy.save but it failed (because it is
> not
> > intended to append data). I'd like to know what is, in your
> > opinion, the best way to go. I will put a few
> thousands
> > every time but building up a file of several Gbytes. I
> do
> > not want to put into memory
> > >> all previous data each time. Also I cannot
> wait
> > until the program finishes, I must save partial
> results
> > periodically. Thanks, any help will be appreciated
> > >
> > > As others mentioned, PyTables is an excellent,
> > complete solution. If
> > > you still want to write your own, then you can
> pass an
> > open file
> > > object to numpy.save() in order to append. Just
> open
> > it with the mode
> > > 'a+b' and seek to the end.
> > >
> > >  f = open('myfile.npy', 'a+b')
> > >  f.seek(0, 2)
> > >  numpy.save(f, chunk)
> > >  f.close()
> >
> > That looks nice. What am I doing wrong?
> >
> > >> x = np.array([1,2,3])
> > >> y = np.array([4,5,6])
> > >>
> > >> f = open('myfile.npy', 'a+b')
> > >> np.save(f, x)
> > >> f.seek(0, 2)
> > >> np.save(f, y)
> > >> f.close()
> > >>
> > >> xy = np.load('myfile.npy')
> > >> xy
> >    array([1, 2, 3])
> >
> > I was expecting something like array([1, 2, 3, 4, 5,
> 6]).
> > _______________________________________________
> > NumPy-Discussion mailing list
> > NumPy-Discussion at scipy.org
> > http://mail.scipy.org/mailman/listinfo/numpy-discussion
> >
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
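[Editorial note: the read-back loop Juan describes can be wrapped so the
number of saved chunks need not be known in advance. A sketch, assuming
arrays were np.save()d back-to-back into one file; the file name is
illustrative:]

    import os
    import numpy as np

    def load_chunks(filename):
        """Load every array that was np.save()d back-to-back into one file."""
        size = os.path.getsize(filename)
        chunks = []
        fi = open(filename, 'rb')
        try:
            # each np.load(file_object) consumes exactly one saved record,
            # so keep reading until the file position reaches the end
            while fi.tell() < size:
                chunks.append(np.load(fi))
        finally:
            fi.close()
        return chunks

    # rebuild the full array from the partial saves, e.g.:
    # full = np.concatenate(load_chunks('myfile1.npy'))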
From lciti at essex.ac.uk  Tue Aug 11 16:26:44 2009
From: lciti at essex.ac.uk (Citi, Luca)
Date: Tue, 11 Aug 2009 21:26:44 +0100
Subject: [Numpy-discussion] saving incrementally numpy arrays
References: <124312.33233.qm@web52510.mail.re2.yahoo.com>
Message-ID: <3DA3B328CBC48B4EBB88484B8A5EA19106AF9E63@sernt14.essex.ac.uk>

You can do something a bit tricky but possibly working.
I made the assumption of a C-ordered 1d vector.

import numpy as np
import numpy.lib.format as fmt

# example of chunks
chunks = [np.arange(l) for l in range(5,10)]

# at the beginning
fp = open('myfile.npy', 'wb')
d = dict(
    descr=fmt.dtype_to_descr(chunks[0].dtype),
    fortran_order=False,
    shape=(2**30,),  # some big shape you think you'll never reach
)
fp.write(fmt.magic(1,0))
fmt.write_array_header_1_0(fp, d)
h_len = fp.tell()
l = 0

# ... for each chunk ...
for chunk in chunks:
    l += len(chunk)
    fp.write(chunk.tostring('C'))

# finally
fp.seek(0,0)
fp.write(fmt.magic(1,0))
d['shape'] = (l,)
fmt.write_array_header_1_0(fp, d)
fp.write(' ' * (h_len - fp.tell() - 1))
fp.close()

From danny.handoko at asml.com  Wed Aug 12 03:12:09 2009
From: danny.handoko at asml.com (Danny Handoko)
Date: Wed, 12 Aug 2009 09:12:09 +0200
Subject: [Numpy-discussion] Faulty behavior of numpy.histogram?
Message-ID: 

Dear all,

We try to use numpy.histogram with combination of matplotlib. We are using
numpy 1.3.0, but a somewhat older matplotlib version of 0.91.2.
Matplotlib's axes.hist() function calls the numpy.histogram, passing
through the 'normed' parameter. However, this version of matplotlib uses
'0' as the default value of 'normed' (I see it fixed in higher version).
What I found strange is that if the 'normed' parameter of numpy.histogram is
set with other object than 'True' or 'False', the output becomes None, but
no exceptions are raised. As a result, the matplotlib code that does
something like this:

>>> n, bins = numpy.histogram([1,2,3], 10, range = None, normed = 0)
Traceback (most recent call last):
  File "", line 1, in
TypeError: 'NoneType' object is not iterable

results in the above exception.

Secondly, this matplotlib version also expects both outputs to be of the
same length, which is no longer true with the new histogram semantics. This
can be easily reverted using the parameter 'new = False' in numpy.histogram,
but this parameter is not available for the caller of axes.hist() function
in matplotlib. Is there any way to tell numpy to use the old semantics?

Upgrading to the newer matplotlib is a rather longer term solution, and we
hope to be able to find some workaround/short-term solution

Thank you,

-- 
Danny Handoko
System Architecture and Generics
Room 7G2.003 -- ph: x2968
email: danny.handoko at asml.com

-- The information contained in this communication and any attachments is
confidential and may be privileged, and is for the sole use of the intended
recipient(s). Any unauthorized review, use, disclosure or distribution is
prohibited. Unless explicitly stated otherwise in the body of this
communication or the attachment thereto (if any), the information is
provided on an AS-IS basis without any express or implied warranties or
liabilities. To the extent you are relying on this information, you are
doing so at your own risk.
If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. ASML is neither liable for the proper and complete transmission of the information contained in this communication, nor for any delay in its receipt. -------------- next part -------------- An HTML attachment was scrubbed... URL: From lars.bittrich at googlemail.com Wed Aug 12 04:31:57 2009 From: lars.bittrich at googlemail.com (Lars Bittrich) Date: Wed, 12 Aug 2009 10:31:57 +0200 Subject: [Numpy-discussion] identity Message-ID: <200908121031.57678.lars.bittrich@googlemail.com> Hi, a colleague made me aware of a speed issue with numpy.identity. Since he was using numpy.diag(numpy.ones(N)) before, he expected identity to be at least as fast as diag. But that is not the case. We found that there was a discussion on the list (July, 20th; "My identity" by Keith Goodman). The presented solution was much faster. Someone wondered if the change was already made in the svn. But I got something different: In [1]:import numpy In [2]:numpy.__version__ Out[2]:'1.4.0.dev7301' In [3]:numpy.identity?? [...] def identity(n, dtype=None): """ [...] """ a = array([1]+n*[0],dtype=dtype) b = empty((n,n),dtype=dtype) # Note that this assignment depends on the convention that since the a # array is shorter than the flattened b array, then the a array will # be repeated until it is the appropriate size. Given a's construction, # this nicely sets the diagonal to all ones. b.flat = a return b instead of (mail by Keith Goodman): def myidentity(n, dtype=None): a = zeros((n,n), dtype=dtype) a.flat[::n+1] = 1 return a Did I look at the wrong place or is there a reason to keep the slow version of identity? Lars From jdh2358 at gmail.com Wed Aug 12 07:28:08 2009 From: jdh2358 at gmail.com (John Hunter) Date: Wed, 12 Aug 2009 06:28:08 -0500 Subject: [Numpy-discussion] adaptive sampling of an interval or plane Message-ID: <88e473830908120428u382060c8he45ef631f1bc63c0@mail.gmail.com> We would like to add function plotting to mpl, but to do this right we need to be able to adaptively sample a function evaluated over an interval so that some tolerance condition is satisfied, perhaps with both a relative and absolute error tolerance condition. I am a bit out of my area of competency here, eg I do not know exactly how the tolerance condition should be specified, but I suspect some of you here may be experts on this. Does anyone have some code compatible with the BSD license, preferably based on numpy but we would consider an extension code or scipy solution, for doing this? The functionality we have in mind is provided in matlab with fplot http://www.mathworks.com/access/helpdesk/help/techdoc/index.html?/access/helpdesk/help/techdoc/ref/fplot.html We would like 1D and 2D versions of this ideally. If anyone has some suggestions, let me know. Thanks, JDH From david.huard at gmail.com Wed Aug 12 10:12:59 2009 From: david.huard at gmail.com (David Huard) Date: Wed, 12 Aug 2009 10:12:59 -0400 Subject: [Numpy-discussion] Faulty behavior of numpy.histogram? In-Reply-To: References: Message-ID: <91cf711d0908120712uc19f09eo225d59541f9ff075@mail.gmail.com> On Wed, Aug 12, 2009 at 3:12 AM, Danny Handoko wrote: > Dear all, > > We try to use numpy.histogram with combination of matplotlib. We are using > numpy 1.3.0, but a somewhat older matplotlib version of 0.91.2. > Matplotlib's axes.hist() function calls the numpy.histogram, passing > through the 'normed' parameter. 
However, this version of matplotlib uses > '0' as the default value of 'normed' (I see it fixed in higher version). > What I found strange is that if the 'normed' parameter of numpy.histogram is > set with other object than 'True' or 'False', the output becomes None, but > no exceptions are raised. As a result, the matplotlib code that does > something like this: > > >>> n, bins = numpy.histogram([1,2,3], 10, range = None, normed = 0) > Traceback (most recent call last): > File "", line 1, in > TypeError: 'NoneType' object is not iterable > results in the above exception. > This is now fixed. Thanks. > > Secondly, this matplotlib version also expects both outputs to be of the > same length, which is no longer true with the new histogram semantics. This > can be easily reverted using the parameter 'new = False' in numpy.histogram, > but this parameter is not available for the caller of axes.hist() function > in matplotlib. Is there any way to tell numpy to use the old semantics? > > Could you go in the numpy source code and change the default value for new ? David > Upgrading to the newer matplotlib is a rather longer term solution, and we > hope to be able to find some workaround/short-term solution > > Thank you, > > > -- > > Danny Handoko > > System Architecture and Generics > > Room 7G2.003 -- ph: x2968 > > email: danny.handoko at asml.com > > > -- The information contained in this communication and any attachments is > confidential and may be privileged, and is for the sole use of the intended > recipient(s). Any unauthorized review, use, disclosure or distribution is > prohibited. Unless explicitly stated otherwise in the body of this > communication or the attachment thereto (if any), the information is > provided on an AS-IS basis without any express or implied warranties or > liabilities. To the extent you are relying on this information, you are > doing so at your own risk. If you are not the intended recipient, please > notify the sender immediately by replying to this message and destroy all > copies of this message and any attachments. ASML is neither liable for the > proper and complete transmission of the information contained in this > communication, nor for any delay in its receipt. > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From kwgoodman at gmail.com Wed Aug 12 10:24:36 2009 From: kwgoodman at gmail.com (Keith Goodman) Date: Wed, 12 Aug 2009 07:24:36 -0700 Subject: [Numpy-discussion] identity In-Reply-To: <200908121031.57678.lars.bittrich@googlemail.com> References: <200908121031.57678.lars.bittrich@googlemail.com> Message-ID: On Wed, Aug 12, 2009 at 1:31 AM, Lars Bittrich wrote: > Hi, > > a colleague made me aware of a speed issue with numpy.identity. Since he was > using numpy.diag(numpy.ones(N)) before, he expected identity to be at least as > fast as diag. But that is not the case. > > We found that there was a discussion on the list (July, 20th; "My identity" by > Keith Goodman). The presented solution was much faster. Someone wondered if > the change was already made in the svn. > But I got something different: > > In [1]:import numpy > > In [2]:numpy.__version__ > Out[2]:'1.4.0.dev7301' > > In [3]:numpy.identity?? > [...] > def identity(n, dtype=None): > ? ?""" > ? ?[...] > ? ?""" > ? ?a = array([1]+n*[0],dtype=dtype) > ? 
?b = empty((n,n),dtype=dtype) > > ? ?# Note that this assignment depends on the convention that since the a > ? ?# array is shorter than the flattened b array, then the a array will > ? ?# be repeated until it is the appropriate size. Given a's construction, > ? ?# this nicely sets the diagonal to all ones. > ? ?b.flat = a > ? ?return b > > instead of (mail by Keith Goodman): > > def myidentity(n, dtype=None): > ? ?a = zeros((n,n), dtype=dtype) > ? ?a.flat[::n+1] = 1 > ? ?return a > > > Did I look at the wrong place or is there a reason to keep the slow version of > identity? Things tend to get lost on the mailing list. The next step would be to file a ticket on the numpy trac. (I've never done that) That would increase the chance of someone important taking a look at it. From kwgoodman at gmail.com Wed Aug 12 10:54:25 2009 From: kwgoodman at gmail.com (Keith Goodman) Date: Wed, 12 Aug 2009 07:54:25 -0700 Subject: [Numpy-discussion] identity In-Reply-To: References: <200908121031.57678.lars.bittrich@googlemail.com> Message-ID: On Wed, Aug 12, 2009 at 7:24 AM, Keith Goodman wrote: > On Wed, Aug 12, 2009 at 1:31 AM, Lars > Bittrich wrote: >> Hi, >> >> a colleague made me aware of a speed issue with numpy.identity. Since he was >> using numpy.diag(numpy.ones(N)) before, he expected identity to be at least as >> fast as diag. But that is not the case. >> >> We found that there was a discussion on the list (July, 20th; "My identity" by >> Keith Goodman). The presented solution was much faster. Someone wondered if >> the change was already made in the svn. >> But I got something different: >> >> In [1]:import numpy >> >> In [2]:numpy.__version__ >> Out[2]:'1.4.0.dev7301' >> >> In [3]:numpy.identity?? >> [...] >> def identity(n, dtype=None): >> ? ?""" >> ? ?[...] >> ? ?""" >> ? ?a = array([1]+n*[0],dtype=dtype) >> ? ?b = empty((n,n),dtype=dtype) >> >> ? ?# Note that this assignment depends on the convention that since the a >> ? ?# array is shorter than the flattened b array, then the a array will >> ? ?# be repeated until it is the appropriate size. Given a's construction, >> ? ?# this nicely sets the diagonal to all ones. >> ? ?b.flat = a >> ? ?return b >> >> instead of (mail by Keith Goodman): >> >> def myidentity(n, dtype=None): >> ? ?a = zeros((n,n), dtype=dtype) >> ? ?a.flat[::n+1] = 1 >> ? ?return a >> >> >> Did I look at the wrong place or is there a reason to keep the slow version of >> identity? > > Things tend to get lost on the mailing list. The next step would be to > file a ticket on the numpy trac. (I've never done that) That would > increase the chance of someone important taking a look at it. Here's the ticket: http://projects.scipy.org/numpy/ticket/1193 BTW, a fast eye function is already in svn. But identity, having fewer options, is a tiny bit faster. From ralph at dont-mind.de Wed Aug 12 11:22:53 2009 From: ralph at dont-mind.de (Ralph Heinkel) Date: Wed, 12 Aug 2009 17:22:53 +0200 Subject: [Numpy-discussion] Howto create a record array from arrays without copying their data Message-ID: <4A82DE4D.7090002@dont-mind.de> Hi, I'm creating (actually calculating) a set of very large 1-d arrays (vectors), which I would like to assemble into a record array so I can access the data row-wise. Unfortunately it seems that all data of my original 1-d arrays are getting copied in memory during that process. Is there a way to get around that? 
Basically what I do is: arr1 = numpy.array([1, 4, 5], dtype=int) arr2 = numpy.array([5.5, 6.6, 9.9], dtype=float) recarray = numpy.core.rec.fromarrays([arr1, arr2], names=['col1', 'col2']) When I now make a change in recarray like recarray.col1[0] = 5000 I cannot see the change in arr1. (So that's why I assume that the data is copied). Also in the numy book it tells that would be copied. Thanks, Ralph From rmay31 at gmail.com Wed Aug 12 11:28:55 2009 From: rmay31 at gmail.com (Ryan May) Date: Wed, 12 Aug 2009 10:28:55 -0500 Subject: [Numpy-discussion] Howto create a record array from arrays without copying their data In-Reply-To: <4A82DE4D.7090002@dont-mind.de> References: <4A82DE4D.7090002@dont-mind.de> Message-ID: On Wed, Aug 12, 2009 at 10:22 AM, Ralph Heinkel wrote: > Hi, > > I'm creating (actually calculating) a set of very large 1-d arrays > (vectors), which I would like to assemble into a record array so I can > access the data row-wise. Unfortunately it seems that all data of my > original 1-d arrays are getting copied in memory during that process. > Is there a way to get around that? I don't think so, because fundamentally numpy assumes array elements are packed together in memory. If you know C, record arrays are pretty much arrays of structures. You could try just using a python dictionary to hold the arrays, depending on you motives behind using a record array. Ryan -- Ryan May Graduate Research Assistant School of Meteorology University of Oklahoma Sent from Norman, Oklahoma, United States -------------- next part -------------- An HTML attachment was scrubbed... URL: From scott.sinclair.za at gmail.com Wed Aug 12 11:29:26 2009 From: scott.sinclair.za at gmail.com (Scott Sinclair) Date: Wed, 12 Aug 2009 17:29:26 +0200 Subject: [Numpy-discussion] identity In-Reply-To: References: <200908121031.57678.lars.bittrich@googlemail.com> Message-ID: <6a17e9ee0908120829o7c9f8d56x2d9699a34fe01a20@mail.gmail.com> >2009/8/12 Keith Goodman : > On Wed, Aug 12, 2009 at 7:24 AM, Keith Goodman wrote: >> On Wed, Aug 12, 2009 at 1:31 AM, Lars >> Bittrich wrote: >>> >>> a colleague made me aware of a speed issue with numpy.identity. Since he was >>> using numpy.diag(numpy.ones(N)) before, he expected identity to be at least as >>> fast as diag. But that is not the case. >>> >>> We found that there was a discussion on the list (July, 20th; "My identity" by >>> Keith Goodman). The presented solution was much faster. Someone wondered if >>> the change was already made in the svn. >> >> Things tend to get lost on the mailing list. The next step would be to >> file a ticket on the numpy trac. (I've never done that) That would >> increase the chance of someone important taking a look at it. > > Here's the ticket: > > http://projects.scipy.org/numpy/ticket/1193 > A patch against recent SVN trunk is attached to the ticket. Please review... Cheers, Scott From robert.kern at gmail.com Wed Aug 12 11:45:56 2009 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 12 Aug 2009 10:45:56 -0500 Subject: [Numpy-discussion] reduce function of vectorize doesn't respect dtype? In-Reply-To: References: Message-ID: <3d375d730908120845n25017935p43d02c9cb69408a6@mail.gmail.com> On Fri, Aug 7, 2009 at 13:54, T J wrote: > The reduce function of ufunc of a vectorized function doesn't seem to > respect the dtype. 
>
>>>> def a(x,y): return x+y
>>>> b = vectorize(a)
>>>> c = array([1,2])
>>>> b(c, c)  # use once to populate b.ufunc
>>>> d = b.ufunc.reduce(c)
>>>> c.dtype, type(d)
> dtype('int32'),
>
>>>> c = array([[1,2,3],[4,5,6]])
>>>> b.ufunc.reduce(c)
> array([5, 7, 9], dtype=object)
>
> My goal is to use the output of vectorize() as if it is actually a ufunc.

vectorize()d ufuncs are always object->object ufuncs because Python
functions take objects and return objects, not C ints or other C data
types.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

From charlesr.harris at gmail.com  Wed Aug 12 11:53:59 2009
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Wed, 12 Aug 2009 09:53:59 -0600
Subject: [Numpy-discussion] identity
In-Reply-To: <6a17e9ee0908120829o7c9f8d56x2d9699a34fe01a20@mail.gmail.com>
References: <200908121031.57678.lars.bittrich@googlemail.com>
	<6a17e9ee0908120829o7c9f8d56x2d9699a34fe01a20@mail.gmail.com>
Message-ID: 

On Wed, Aug 12, 2009 at 9:29 AM, Scott Sinclair wrote:

> >2009/8/12 Keith Goodman :
> > On Wed, Aug 12, 2009 at 7:24 AM, Keith Goodman wrote:
> >> On Wed, Aug 12, 2009 at 1:31 AM, Lars
> >> Bittrich wrote:
> >>>
> >>> a colleague made me aware of a speed issue with numpy.identity. Since
> he was
> >>> using numpy.diag(numpy.ones(N)) before, he expected identity to be at
> least as
> >>> fast as diag. But that is not the case.
> >>>
> >>> We found that there was a discussion on the list (July, 20th; "My
> identity" by
> >>> Keith Goodman). The presented solution was much faster. Someone
> wondered if
> >>> the change was already made in the svn.
> >>
> >> Things tend to get lost on the mailing list. The next step would be to
> >> file a ticket on the numpy trac. (I've never done that) That would
> >> increase the chance of someone important taking a look at it.
> >
> > Here's the ticket:
> >
> > http://projects.scipy.org/numpy/ticket/1193
> >
>
> A patch against recent SVN trunk is attached to the ticket. Please
> review...
>

Already done. Thanks.

Chuck

From josef.pktd at gmail.com  Wed Aug 12 12:00:01 2009
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Wed, 12 Aug 2009 12:00:01 -0400
Subject: [Numpy-discussion] Howto create a record array from arrays without
	copying their data
In-Reply-To: 
References: <4A82DE4D.7090002@dont-mind.de>
Message-ID: <1cd32cbb0908120900m17f9266evcaa8d8ef039c449d@mail.gmail.com>

On Wed, Aug 12, 2009 at 11:28 AM, Ryan May wrote:
> On Wed, Aug 12, 2009 at 10:22 AM, Ralph Heinkel wrote:
>>
>> Hi,
>>
>> I'm creating (actually calculating) a set of very large 1-d arrays
>> (vectors), which I would like to assemble into a record array so I can
>> access the data row-wise.  Unfortunately it seems that all data of my
>> original 1-d arrays are getting copied in memory during that process.
>> Is there a way to get around that?
>
> I don't think so, because fundamentally numpy assumes array elements are
> packed together in memory.  If you know C, record arrays are pretty much
> arrays of structures.  You could try just using a python dictionary to hold
> the arrays, depending on you motives behind using a record array.
>
> Ryan
>

Can you preallocate the record array and fill it up as you calculate
the values? You can use the reference to the recarray columns.
I never tried with recarrays.

Josef
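[Editorial note: Josef's preallocate-and-fill idea, written out as a short
sketch. The column names and sizes are made up; whether all copies are
avoided depends on computing the values directly into the views:]

    import numpy as np

    n = 3  # number of rows, known before the calculation starts
    rec = np.empty(n, dtype=[('col1', int), ('col2', float)])

    col1 = rec['col1']  # field access returns views into rec, not copies
    col2 = rec['col2']

    col1[:] = [1, 4, 5]        # fill in place instead of copying afterwards
    col2[:] = [5.5, 6.6, 9.9]

    rec['col1'][0] = 5000      # the change is visible through col1 as well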
From robert.kern at gmail.com  Wed Aug 12 12:02:45 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Wed, 12 Aug 2009 11:02:45 -0500
Subject: [Numpy-discussion] Question about bounds checking
In-Reply-To: 
References: 
Message-ID: <3d375d730908120902i3d70c33frf677108edeeedfa6@mail.gmail.com>

On Mon, Aug 10, 2009 at 10:08, Rich E wrote:
> Dear all,
> I am having a few issues with indexing in numpy and wondered if you could help
> me out.
> If I define an array
> a = zeros(( 4))
> a
> array([ 0.,  0.,  0.,  0.])
>
> Then I try and reference a point beyond the bounds of the array
>
> a[4]
> Traceback (most recent call last):
>   File "", line 1, in
> IndexError: index out of bounds
>
> but if I use the slicing format to reference the point I get
>
> a[0:4]
> array([ 0.,  0.,  0.,  0.])
> a[0:10]
> array([ 0.,  0.,  0.,  0.])

We do not raise an IndexError in the latter case because we follow
Python's behavior for lists and tuples.

In [1]: a = range(4)

In [2]: a[0:10]
Out[2]: [0, 1, 2, 3]

In [3]: a[9]
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)

/Users/rkern/ in ()

IndexError: list index out of range

This is particularly useful in cases where you are iterating over
chunks of the array. You do not have to handle the last chunk, which
may be smaller than the others, as a special case.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

From robert.kern at gmail.com  Wed Aug 12 12:04:58 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Wed, 12 Aug 2009 11:04:58 -0500
Subject: [Numpy-discussion] memmap, write through and flush
In-Reply-To: <4A7E3572.9000909@jpl.nasa.gov>
References: <4A7E3572.9000909@jpl.nasa.gov>
Message-ID: <3d375d730908120904t4c8cfdffq3ae569442821a20a@mail.gmail.com>

On Sat, Aug 8, 2009 at 21:33, Tom Kuiper wrote:
> There is something curious here.  The second flush() fails.  Can anyone
> explain this?

numpy.append() does not append values in-place. It is just a
convenience wrapper for numpy.concatenate().

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

From josef.pktd at gmail.com  Wed Aug 12 12:07:37 2009
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Wed, 12 Aug 2009 12:07:37 -0400
Subject: [Numpy-discussion] is this a chararray decode bug
Message-ID: <1cd32cbb0908120907x6ffaebecv873926af3b097196@mail.gmail.com>

(copied from the lengthy unicode thread in scipy-dev, so it doesn't get lost)

this looks like a bug ? or is it a known limitation that chararrays
cannot be 0-d

>>> b0= np.array(u'\xe9','
>>> print b0.encode('cp1252')
Traceback (most recent call last):
  File "", line 1, in
    print b0.encode('cp1252')
  File "C:\Programs\Python25\Lib\site-packages\numpy\core\defchararray.py",
line 217, in encode
    return self._generalmethod('encode', broadcast(self, encoding, errors))
  File "C:\Programs\Python25\Lib\site-packages\numpy\core\defchararray.py",
line 162, in _generalmethod
    newarr[:] = res
ValueError: cannot slice a 0-d array

Josef

From robert.kern at gmail.com  Wed Aug 12 12:13:01 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Wed, 12 Aug 2009 11:13:01 -0500
Subject: [Numpy-discussion] How to preserve number of array dimensions
	when taking a slice?
In-Reply-To: <24875133.post@talk.nabble.com> References: <24875133.post@talk.nabble.com> Message-ID: <3d375d730908120913h4476734ftd12f2bd622d61406@mail.gmail.com> On Fri, Aug 7, 2009 at 23:53, Dr. Phillip M. Feldman wrote: > > I'd like to be able to make a slice of a 3-dimensional array, doing something > like the following: > > Y= X[A, B, C] > > where A, B, and C are lists of indices. This works, but has an unexpected > side-effect. When A, B, or C is a length-1 list, Y has fewer dimensions than > X. Is there a way to do the slice such that the number of dimensions is > preserved, i.e., I'd like Y to be a 3-dimensional array, even if one or more > dimensions is unity. ?Is there a way to do this? http://docs.scipy.org/doc/numpy/reference/arrays.indexing.html -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From robert.kern at gmail.com Wed Aug 12 12:15:43 2009 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 12 Aug 2009 11:15:43 -0500 Subject: [Numpy-discussion] memmap capability In-Reply-To: <4A7C6CDA.9050400@jpl.nasa.gov> References: <4A7C6CDA.9050400@jpl.nasa.gov> Message-ID: <3d375d730908120915u2fd709a4w51679baf251dbf0f@mail.gmail.com> On Fri, Aug 7, 2009 at 13:05, Tom Kuiper wrote: > If this appears twice, forgive me.? I sent it previously (7:13 am PDT) via a > browser interface to JPL's Office Outlook.? I have doubts about this > system.? This time, from Iceweasel through our SMTP server. > > There are two things I'd like to do using memmap.? I suspect that they are > impossible but maybe I'm missing some subtlety. > 1) I would like to append rows to a memmap array and have the modified array > changed on disk also. > 2) I would like to have the memory view of the array on disk change, i.e., > modify the offset for an opened array. > The only way I can think of involves opening and closing arrays repeatedly. That's the only way to do it. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From robert.kern at gmail.com Wed Aug 12 12:17:39 2009 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 12 Aug 2009 11:17:39 -0500 Subject: [Numpy-discussion] Strange Error with NumPy Addendum In-Reply-To: References: <3d375d730908061629s721052d6tc8246e4f81231d2c@mail.gmail.com> <3FDEDA59-7E92-44EF-9506-4159A0AFDF3E@cs.toronto.edu> Message-ID: <3d375d730908120917h2f4b2d7au165abdc766aa709a@mail.gmail.com> On Fri, Aug 7, 2009 at 07:15, Nanime Puloski wrote: > But if it were an unsigned int64, it should be able to hold 2**64 or at > least 2**64-1. > Am I correct? There is no numpy.sin() implementation for uint64s, just the floating point types. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From robert.kern at gmail.com Wed Aug 12 12:25:20 2009 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 12 Aug 2009 11:25:20 -0500 Subject: [Numpy-discussion] Indexing empty array with empty boolean array causes "IndexError: invalid index exception" In-Reply-To: <7F7802B9-BD2F-4350-99B6-708D140089C8@usc.edu> References: <7F7802B9-BD2F-4350-99B6-708D140089C8@usc.edu> Message-ID: <3d375d730908120925s67c4f23cp643b96f5adc1696c@mail.gmail.com> On Mon, Aug 10, 2009 at 14:19, Maria Liukis wrote: > Hello everybody, > I'm using following versions of Scipy and Numpy packages: >>>> scipy.__version__ > '0.7.1' >>>> np.__version__ > '1.3.0' > My code uses boolean array to filter 2-dimensional array which sometimes > happens to be an empty array. It seems like I have to take special care when > dimension I'm filtering is zero, otherwise I'm getting an "IndexError: > invalid index" exception: >>>> import numpy as np >>>> a ?= np.zeros((2,10)) >>>> a > array([[ 1., ?1., ?1., ?1., ?1., ?1., ?1., ?1., ?1., ?1.], > ?? ? ? [ 1., ?1., ?1., ?1., ?1., ?1., ?1., ?1., ?1., ?1.]]) If that were actually your output from zeros(), that would definitely be a bug. :-) >>>>?filter_array = np.zeros(2,) >>>> filter_array > array([False, False], dtype=bool) >>>> a[filter_array,:] > array([], shape=(0, 10), dtype=float64) >>>>>>> > > Now if filtered dimension is zero: >>>> a ?= np.ones((0,10)) >>>> a > array([], shape=(0, 10), dtype=float64) >>>> filter_array = np.zeros((0,), dtype=bool) >>>> filter_array > array([], dtype=bool) >>>> filter_array.shape > (0,) >>>> a.shape > (0, 10) >>>> a[filter_array,:] > Traceback (most recent call last): > ??File "", line 1, in > IndexError: invalid index >>>> > Would somebody know if it's an expected behavior, a package bug or am I > doing something wrong? I would call it a bug. It's a corner case that we should probably handle gracefully rather than raising an exception. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From naromero at gmail.com Wed Aug 12 12:48:56 2009 From: naromero at gmail.com (Nichols A. Romero) Date: Wed, 12 Aug 2009 11:48:56 -0500 Subject: [Numpy-discussion] NumPy C-API - copying from Fortran to C order Message-ID: <6ac064b60908120948j3d7fde67rd273b734d18229db@mail.gmail.com> Hi, I am working on a *very* simple Python interface to ScaLAPACK using the NumPy C-API. I am not using f2py at all. Simple question: How can I copy a C-order NumPy array into a Fortran-order NumPy array within the C-API? (This is trivial in Python, it is simply A = A.copy("Fortran")) I would like to do this with the minimal amount of memory use. The matrices are 2-d and rectangular, but not square. It looks like I use PyArray_CopyInto or PyArray_MoveInto, but I would need to create another array using PyArray_EMPTY(2, dims, NPY_DOUBLE, NPY_F_CONTIGUOUS) Thanks in advance for your assistance. -- Nichols A. Romero, Ph.D. Argonne Leadership Computing Facility Argonne, IL 60490 (630) 252-3441 (O) (630) 470-0462 (C) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: 

From kwgoodman at gmail.com Wed Aug 12 13:02:43 2009
From: kwgoodman at gmail.com (Keith Goodman)
Date: Wed, 12 Aug 2009 10:02:43 -0700
Subject: [Numpy-discussion] identity
In-Reply-To: 
References: <200908121031.57678.lars.bittrich@googlemail.com>
	<6a17e9ee0908120829o7c9f8d56x2d9699a34fe01a20@mail.gmail.com>
Message-ID: 

On Wed, Aug 12, 2009 at 8:53 AM, Charles R Harris wrote:
>
> On Wed, Aug 12, 2009 at 9:29 AM, Scott Sinclair wrote:
>>
>> 2009/8/12 Keith Goodman :
>> > On Wed, Aug 12, 2009 at 7:24 AM, Keith Goodman wrote:
>> >> On Wed, Aug 12, 2009 at 1:31 AM, Lars Bittrich wrote:
>> >>>
>> >>> a colleague made me aware of a speed issue with numpy.identity.
>> >>> Since he was using numpy.diag(numpy.ones(N)) before, he expected
>> >>> identity to be at least as fast as diag. But that is not the case.
>> >>>
>> >>> We found that there was a discussion on the list (July, 20th; "My
>> >>> identity" by Keith Goodman). The presented solution was much faster.
>> >>> Someone wondered if the change was already made in the svn.
>> >>
>> >> Things tend to get lost on the mailing list. The next step would be
>> >> to file a ticket on the numpy trac. (I've never done that) That
>> >> would increase the chance of someone important taking a look at it.
>> >
>> > Here's the ticket:
>> >
>> > http://projects.scipy.org/numpy/ticket/1193
>>
>> A patch against recent SVN trunk is attached to the ticket. Please
>> review...
>
> Already done. Thanks.

Hey, thanks. Now I know how to do the first two steps:

(1) Extend the work of others (in this case Luca Citi and Robert Kern)
(2) File a ticket
(3) ???
(4) Profit

From sccolbert at gmail.com Wed Aug 12 13:36:39 2009
From: sccolbert at gmail.com (Chris Colbert)
Date: Wed, 12 Aug 2009 13:36:39 -0400
Subject: [Numpy-discussion] identity
In-Reply-To: 
References: <200908121031.57678.lars.bittrich@googlemail.com>
	<6a17e9ee0908120829o7c9f8d56x2d9699a34fe01a20@mail.gmail.com>
Message-ID: <7f014ea60908121036l68e33805u9991b2b181f168fc@mail.gmail.com>

Someone posts on offtopic.com

> (1) Extend the work of others (in this case Luca Citi and Robert Kern)
> (2) File a ticket
> (3) ???
> (4) Profit
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion

From fperez.net at gmail.com Wed Aug 12 14:51:40 2009
From: fperez.net at gmail.com (Fernando Perez)
Date: Wed, 12 Aug 2009 11:51:40 -0700
Subject: [Numpy-discussion] adaptive sampling of an interval or plane
In-Reply-To: <88e473830908120428u382060c8he45ef631f1bc63c0@mail.gmail.com>
References: <88e473830908120428u382060c8he45ef631f1bc63c0@mail.gmail.com>
Message-ID: 

On Wed, Aug 12, 2009 at 4:28 AM, John Hunter wrote:
> We would like to add function plotting to mpl, but to do this right we
> need to be able to adaptively sample a function evaluated over an
> interval so that some tolerance condition is satisfied, perhaps with
> both a relative and absolute error tolerance condition. I am a bit
> out of my area of competency here, eg I do not know exactly how the
> tolerance condition should be specified, but I suspect some of you
> here may be experts on this. Does anyone have some code compatible
> with the BSD license, preferably based on numpy but we would consider
> an extension code or scipy solution, for doing this?
>
> The functionality we have in mind is provided in matlab with fplot
>
>   http://www.mathworks.com/access/helpdesk/help/techdoc/index.html?/access/helpdesk/help/techdoc/ref/fplot.html
>
> We would like 1D and 2D versions of this ideally. If anyone has some
> suggestions, let me know.

In a past life I wrote code to do this in d=1..6, with lots of other
bells and whistles. I'm no longer actively involved with the project,
but a colleague has recently updated it with an eye towards releasing
it, and right now I'm the bottleneck (time) to review the changes.

I just checked out the updated version of the code and it looks a fair
bit simpler than my original machinery, which had accreted lots of
Fortran dependencies for historical but otherwise uninteresting
reasons.

How about we have a look at this next week at the conference? I'll
ping my colleague in the meantime to check on this...

Cheers,

f

From ellisonbg.net at gmail.com Wed Aug 12 15:20:17 2009
From: ellisonbg.net at gmail.com (Brian Granger)
Date: Wed, 12 Aug 2009 12:20:17 -0700
Subject: [Numpy-discussion] adaptive sampling of an interval or plane
In-Reply-To: <88e473830908120428u382060c8he45ef631f1bc63c0@mail.gmail.com>
References: <88e473830908120428u382060c8he45ef631f1bc63c0@mail.gmail.com>
Message-ID: <6ce0ac130908121220t36649058p94950169c15b95aa@mail.gmail.com>

We should also talk to Ondrej about this at SciPy. Both sympy (through
mpmath) and mpmath have matplotlib based function plotting. I don't
think it is adaptive, but I know mpmath can handle singularities.

Also, Ondrej is doing his graduate work with a group that does adaptive
finite elements, so he would also be familiar with such algorithms.

I am sure that sympy and mpmath (Sage as well) would be some of the
main users of function plotting and would love to see this happen.

Cheers,

Brian

On Wed, Aug 12, 2009 at 4:28 AM, John Hunter wrote:
> We would like to add function plotting to mpl, but to do this right we
> need to be able to adaptively sample a function evaluated over an
> interval so that some tolerance condition is satisfied, perhaps with
> both a relative and absolute error tolerance condition. I am a bit
> out of my area of competency here, eg I do not know exactly how the
> tolerance condition should be specified, but I suspect some of you
> here may be experts on this. Does anyone have some code compatible
> with the BSD license, preferably based on numpy but we would consider
> an extension code or scipy solution, for doing this?
>
> The functionality we have in mind is provided in matlab with fplot
>
> http://www.mathworks.com/access/helpdesk/help/techdoc/index.html?/access/helpdesk/help/techdoc/ref/fplot.html
>
> We would like 1D and 2D versions of this ideally. If anyone has some
> suggestions, let me know.
>
> Thanks,
> JDH
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From washakie at gmail.com Wed Aug 12 17:01:52 2009
From: washakie at gmail.com (John [H2O])
Date: Wed, 12 Aug 2009 14:01:52 -0700 (PDT)
Subject: [Numpy-discussion] Masking an array with another array
In-Reply-To: 
References: <49d6b3500904221421x2c32b8a8q136e48d0ce153dd6@mail.gmail.com>
	<20090423051615.GB25215@phare.normalesup.org>
	<49d6b3500904222224u1451e693v98b812fcbad856dc@mail.gmail.com>
	<1cd32cbb0904230608n6a172bcs8716a119bb98a9c2@mail.gmail.com>
Message-ID: <24943419.post@talk.nabble.com>

I suspect I am trying to do something similar... I would like to create
a mask where I have data. In essence, I need to return True where x,y is
equal to lon,lat....

I suppose a setmember solution may somehow be more elegant, but this is
what I've worked up for now... suggestions?

import numpy as np

def genDataMask(x, y, xbounds=(-180,180), ybounds=(-90,90), res=(0.5,0.5)):
    """ generate a data mask
    no data = False
    data = True
    """
    xy = np.column_stack((x, y))
    newx = np.arange(xbounds[0], xbounds[1], res[0])
    newy = np.arange(ybounds[0], ybounds[1], res[1])
    # create datamask, False everywhere until we see data
    dm = np.zeros((len(newx), len(newy)), dtype=bool)
    for _xy in xy:
        dm[np.where(_xy[0] == newx), np.where(_xy[1] == newy)] = True
    return dm

-- 
View this message in context: http://www.nabble.com/Masking-an-array-with-another-array-tp23185887p24943419.html
Sent from the Numpy-discussion mailing list archive at Nabble.com.

From fperez.net at gmail.com Wed Aug 12 18:27:22 2009
From: fperez.net at gmail.com (Fernando Perez)
Date: Wed, 12 Aug 2009 15:27:22 -0700
Subject: [Numpy-discussion] SciPy tutorials and talks will be recorded and posted online!
Message-ID: 

Hi all,

as you may recall, there have been recently a number of requests for
videotaping the conference.

I am very happy to announce that we will indeed have full video
coverage this year of both tutorial tracks as well as the main talks
(minus any specific talk where a speaker may object to being taped,
since we'll respect such objections if they are made).

Jeff Teeters and Kilian Koepsell from UC Berkeley, who have in the
past made recordings like these of one of my workshops and a recent
talk by Gael:

http://www.archive.org/search.php?query=Fernando+Perez+scientific+python
http://www.archive.org/details/ucb_py4science_2009_07_14_Gael_Varoquaux

are going to perform the work.

I'd like to sincerely thank:

- Jeff and Kilian for offering to do this work and providing some of
the recording equipment.

- The Redwood Center for Theoretical Neuroscience, which provided
other equipment.

- Enthought, who are funding this on very short notice!!! (Especially
Dave Peterson and Eric Jones, who tolerated my last minute nags very
graciously). Without Enthought's last-minute support, this would
simply not be happening.

I really appreciate that everyone involved worked on short notice to
make this possible, I hope the entire community will benefit from
these resources being available.

Best regards,

f

From fiolj at yahoo.com Wed Aug 12 19:11:26 2009
From: fiolj at yahoo.com (Juan Fiol)
Date: Wed, 12 Aug 2009 16:11:26 -0700 (PDT)
Subject: [Numpy-discussion] saving incrementally numpy arrays
In-Reply-To: <3DA3B328CBC48B4EBB88484B8A5EA19106AF9E63@sernt14.essex.ac.uk>
Message-ID: <638936.64677.qm@web52506.mail.re2.yahoo.com>

Hi, I finally decided on the pytables approach because it will be easier
later to work with the data. Now, I know this is not the right place, but
maybe I can get some quick pointers. I've calculated a numpy array of
about 20 columns and a few thousand rows at each time.
I'd like to append all the rows without iterating over the numpy array.
Does someone know what would be the "right" approach? I am looking for
something simple; I do not need to keep the piece of table after I put
it into the h5file. Thanks in advance and regards,

Juan

--- On Tue, 8/11/09, Citi, Luca wrote:

> From: Citi, Luca
> Subject: Re: [Numpy-discussion] saving incrementally numpy arrays
> To: "Discussion of Numerical Python"
> Date: Tuesday, August 11, 2009, 9:26 PM
> You can do something a bit tricky but possibly working.
> I made the assumption of a C-ordered 1d vector.
>
> import numpy as np
> import numpy.lib.format as fmt
>
> # example of chunks
> chunks = [np.arange(l) for l in range(5,10)]
>
> # at the beginning
> fp = open('myfile.npy', 'wb')
> d = dict(
>         descr=fmt.dtype_to_descr(chunks[0].dtype),
>         fortran_order=False,
>         shape=(2**30,),  # some big shape you think you'll never reach
>     )
> fp.write(fmt.magic(1,0))
> fmt.write_array_header_1_0(fp, d)
> h_len = fp.tell()
> l = 0
> # ... for each chunk ...
> for chunk in chunks:
>     l += len(chunk)
>     fp.write(chunk.tostring('C'))
> # finally
> fp.seek(0,0)
> fp.write(fmt.magic(1,0))
> d['shape'] = (l,)
> fmt.write_array_header_1_0(fp, d)
> fp.write(' ' * (h_len - fp.tell() - 1))
> fp.close()
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion

From dwf at cs.toronto.edu Wed Aug 12 19:32:17 2009
From: dwf at cs.toronto.edu (David Warde-Farley)
Date: Wed, 12 Aug 2009 19:32:17 -0400
Subject: [Numpy-discussion] saving incrementally numpy arrays
In-Reply-To: <638936.64677.qm@web52506.mail.re2.yahoo.com>
References: <638936.64677.qm@web52506.mail.re2.yahoo.com>
Message-ID: <2A982BC2-76D9-4697-ADC5-18B0705A31A2@cs.toronto.edu>

On 12-Aug-09, at 7:11 PM, Juan Fiol wrote:
> Hi, I finally decided on the pytables approach because it will be
> easier later to work with the data. Now, I know this is not the right
> place, but maybe I can get some quick pointers. I've calculated a
> numpy array of about 20 columns and a few thousand rows at each time.
> I'd like to append all the rows without iterating over the numpy
> array. Does someone know what would be the "right" approach? I am
> looking for something simple; I do not need to keep the piece of
> table after I put it into the h5file. Thanks in advance and regards,
> Juan

You'll probably want the EArray. createEArray() on a new h5file, then
append to it.

http://www.pytables.org/docs/manual/ch04.html#EArrayMethodsDescr

If your chunks are always the same size it might be best to try and do
your work in-place and not allocate a new NumPy array each time. In
theory 'del' ing the object when you're done with it should work but
the garbage collector may not act quickly enough for your liking/the
allocation step may start slowing you down.

What do I mean? Well, you could clear the array when you're done with
it using foo[:] = 0 (or nan, or whatever) and when you're "building it
up" use the inplace augmented assignment operators as much as possible
(+=, /=, -=, *=, %=, etc.).

David

From fiolj at yahoo.com Wed Aug 12 19:48:30 2009
From: fiolj at yahoo.com (Juan Fiol)
Date: Wed, 12 Aug 2009 16:48:30 -0700 (PDT)
Subject: [Numpy-discussion] saving incrementally numpy arrays
Message-ID: <215845.55275.qm@web52503.mail.re2.yahoo.com>

Thanks David, I'll look into it now.
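Just so I am sure I understood, here is roughly what I will try first
(an untested sketch on my side; I am assuming PyTables 2.x, float64
data and 20 columns, and the file/node names are only placeholders):

import numpy as np
import tables

h5file = tables.openFile('results.h5', mode='w')
# extendable along the first axis, fixed at 20 columns per row
earr = h5file.createEArray(h5file.root, 'data',
                           tables.Float64Atom(), shape=(0, 20))

chunk = np.ones((5000, 20))   # stand-in for one batch from the fortran routine
earr.append(chunk)            # appends all rows at once, no Python-level loop
h5file.flush()
h5file.close()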
Regarding the allocation/deallocation times I think that is not an issue
for me. The chunks are generated by a fortran routine that takes several
minutes to run (I am collecting a few thousand points before saving to
disk). They are approximately the same size but not exactly. I want them
to be stored for later retrieval and analysis in a convenient way.
Thanks, regards

-----------------------------------------

You'll probably want the EArray. createEArray() on a new h5file, then
append to it.

http://www.pytables.org/docs/manual/ch04.html#EArrayMethodsDescr

If your chunks are always the same size it might be best to try and do
your work in-place and not allocate a new NumPy array each time. In
theory 'del' ing the object when you're done with it should work but
the garbage collector may not act quickly enough for your liking/the
allocation step may start slowing you down.

What do I mean? Well, you could clear the array when you're done with
it using foo[:] = 0 (or nan, or whatever) and when you're "building it
up" use the inplace augmented assignment operators as much as possible
(+=, /=, -=, *=, %=, etc.).

David

From scott.sinclair.za at gmail.com Thu Aug 13 01:36:00 2009
From: scott.sinclair.za at gmail.com (Scott Sinclair)
Date: Thu, 13 Aug 2009 07:36:00 +0200
Subject: [Numpy-discussion] memmap, write through and flush
In-Reply-To: <3d375d730908120904t4c8cfdffq3ae569442821a20a@mail.gmail.com>
References: <4A7E3572.9000909@jpl.nasa.gov>
	<3d375d730908120904t4c8cfdffq3ae569442821a20a@mail.gmail.com>
Message-ID: <6a17e9ee0908122236i5f34f10dn20fd074661d8cd62@mail.gmail.com>

> 2009/8/12 Robert Kern :
> On Sat, Aug 8, 2009 at 21:33, Tom Kuiper wrote:
>> There is something curious here. The second flush() fails. Can anyone
>> explain this?
>
> numpy.append() does not append values in-place. It is just a
> convenience wrapper for numpy.concatenate().

Meaning that a copy of the data is returned in an ndarray, so when you do

fp = np.append(fp, [[12,13,14,15]], 0)

the name fp is no longer bound to a memmap, hence

AttributeError: 'numpy.ndarray' object has no attribute 'flush'

Cheers,
Scott

From sccolbert at gmail.com Thu Aug 13 02:14:30 2009
From: sccolbert at gmail.com (Chris Colbert)
Date: Thu, 13 Aug 2009 02:14:30 -0400
Subject: [Numpy-discussion] SciPy tutorials and talks will be recorded and posted online!
In-Reply-To: 
References: 
Message-ID: <7f014ea60908122314ia56a363wbbb01e39b1934c77@mail.gmail.com>

Thanks to everyone supporting this. I wish I could attend this year, and
I will be making it a point to attend next year. I am very grateful to
be able to catch the talks at this year's conference.

Thanks!

Chris

On Wed, Aug 12, 2009 at 6:27 PM, Fernando Perez wrote:
> Hi all,
>
> as you may recall, there have been recently a number of requests for
> videotaping the conference.
>
> I am very happy to announce that we will indeed have full video
> coverage this year of both tutorial tracks as well as the main talks
> (minus any specific talk where a speaker may object to being taped,
> since we'll respect such objections if they are made).
>
> Jeff Teeters and Kilian Koepsell from UC Berkeley, who have in the
> past made recordings like these of one of my workshops and a recent
> talk by Gael:
>
> http://www.archive.org/search.php?query=Fernando+Perez+scientific+python
> http://www.archive.org/details/ucb_py4science_2009_07_14_Gael_Varoquaux
>
> are going to perform the work.
>
> I'd like to sincerely thank:
>
> - Jeff and Kilian for offering to do this work and providing some of
> the recording equipment.
>
> - The Redwood Center for Theoretical Neuroscience, which provided
> other equipment.
>
> - Enthought, who are funding this on very short notice!!! (Especially
> Dave Peterson and Eric Jones, who tolerated my last minute nags very
> graciously). Without Enthought's last-minute support, this would
> simply not be happening.
>
> I really appreciate that everyone involved worked on short notice to
> make this possible, I hope the entire community will benefit from
> these resources being available.
>
> Best regards,
>
> f
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion

From jdh2358 at gmail.com Thu Aug 13 12:34:37 2009
From: jdh2358 at gmail.com (John Hunter)
Date: Thu, 13 Aug 2009 11:34:37 -0500
Subject: [Numpy-discussion] adaptive sampling of an interval or plane
In-Reply-To: <88e473830908120428u382060c8he45ef631f1bc63c0@mail.gmail.com>
References: <88e473830908120428u382060c8he45ef631f1bc63c0@mail.gmail.com>
Message-ID: <88e473830908130934o2c571467ub85ab9cb31a348d3@mail.gmail.com>

On Wed, Aug 12, 2009 at 6:28 AM, John Hunter wrote:
> We would like to add function plotting to mpl, but to do this right we
> need to be able to adaptively sample a function evaluated over an
> interval so that some tolerance condition is satisfied, perhaps with
> both a relative and absolute error tolerance condition. I am a bit
> out of my area of competency here, eg I do not know exactly how the
> tolerance condition should be specified, but I suspect some of you
> here may be experts on this. Does anyone have some code compatible
> with the BSD license, preferably based on numpy but we would consider
> an extension code or scipy solution, for doing this?
>
> The functionality we have in mind is provided in matlab with fplot
>
>   http://www.mathworks.com/access/helpdesk/help/techdoc/index.html?/access/helpdesk/help/techdoc/ref/fplot.html
>
> We would like 1D and 2D versions of this ideally. If anyone has some
> suggestions, let me know.

Denis Bzowy has replied to me off list with some adaptive spline
approximation code he is working on. He has documentation and code for
the 1D case, is preparing for the 2D case, and is seeking feedback.
He's having trouble posting to the list, and asked me to forward this,
so please make sure his email, included in this post, is in any replies

http://drop.io/denis_adaspline1

From dario.soto at gmail.com Thu Aug 13 15:59:48 2009
From: dario.soto at gmail.com (Dokuro)
Date: Fri, 14 Aug 2009 15:29:48 +1930
Subject: [Numpy-discussion] documentation translation
Message-ID: 

Hello, we have a python group in Maracaibo, Venezuela and have started a
translation project for as many python documentations as we can find,
from English to Spanish, making the use of python easier to non-English
speakers. So, on behalf of our group, I would like to ask if it won't be
a problem if we make these translations and keep them in our wiki

http://proyectociencia.org/Wiki/index.php/Grupo_de_Usuarios_de_Python

Dokuro.
From dwf at cs.toronto.edu Thu Aug 13 17:20:25 2009
From: dwf at cs.toronto.edu (David Warde-Farley)
Date: Thu, 13 Aug 2009 17:20:25 -0400
Subject: [Numpy-discussion] SciPy2009 BoF Wiki Page
Message-ID: <0636F499-8CBC-4053-ACF9-7BA40E5D58D4@cs.toronto.edu>

I needed a short break from some heavy writing, so on Fernando's
suggestion I took to the task of aggregating together mailing list
traffic about the BoFs next week. So far, 4 have been proposed, and
I've written down under "attendees" the names of anyone who has
expressed interest (except in Perry's case, where I've only heard it
via proxy). The page is at

	http://scipy.org/SciPy2009/BoF

I've created sections below that are hyperlink targets for the topic of
each session; if someone more knowledgeable of that domain can fill in
those sections, please do.

Edit away, and see you next week! (And if someone can forward this to
the Matplotlib list, I'm not currently subscribed)

David

From gael.varoquaux at normalesup.org Thu Aug 13 17:41:08 2009
From: gael.varoquaux at normalesup.org (Gael Varoquaux)
Date: Thu, 13 Aug 2009 23:41:08 +0200
Subject: [Numpy-discussion] [IPython-user] SciPy2009 BoF Wiki Page
In-Reply-To: <0636F499-8CBC-4053-ACF9-7BA40E5D58D4@cs.toronto.edu>
References: <0636F499-8CBC-4053-ACF9-7BA40E5D58D4@cs.toronto.edu>
Message-ID: <20090813214108.GA9220@phare.normalesup.org>

On Thu, Aug 13, 2009 at 05:20:25PM -0400, David Warde-Farley wrote:
> I needed a short break from some heavy writing, so on Fernando's
> suggestion I took to the task of aggregating together mailing list
> traffic about the BoFs next week. So far, 4 have been proposed, and
> I've written down under "attendees" the names of anyone who has
> expressed interest (except in Perry's case, where I've only heard it
> via proxy). The page is at

> 	http://scipy.org/SciPy2009/BoF

Thank you, very useful. I have linked it from the BoF page on the
conference website.
Gaël

From gokhansever at gmail.com Thu Aug 13 18:36:09 2009
From: gokhansever at gmail.com (Gökhan Sever)
Date: Thu, 13 Aug 2009 17:36:09 -0500
Subject: [Numpy-discussion] build_clib error during Enable 3_2_1 installation
Message-ID: <49d6b3500908131536m6e5c4646qa0d1d03a1ea6c8c8@mail.gmail.com>

For some unknown reason, ets develop can't pass the following compilation
point:

g++: enthought/kiva/agg/src/kiva_rect.cpp
ar: adding 8 object files to build/temp.linux-i686-2.6/libkiva_src.a
running build_ext
build_clib already run, it is too late to ensure in-place build of build_clib
Traceback (most recent call last):
  File "setup.py", line 327, in
    **config
  File "/home/gsever/Desktop/python-repo/numpy/numpy/distutils/core.py", line 186, in setup
    return old_setup(**new_attr)
  File "/usr/lib/python2.6/distutils/core.py", line 152, in setup
    dist.run_commands()
  File "/usr/lib/python2.6/distutils/dist.py", line 975, in run_commands
    self.run_command(cmd)
  File "/usr/lib/python2.6/distutils/dist.py", line 995, in run_command
    cmd_obj.run()
  File "/home/gsever/Desktop/python-repo/numpy/numpy/distutils/command/build_ext.py", line 74, in run
    self.library_dirs.append(build_clib.build_clib)
UnboundLocalError: local variable 'build_clib' referenced before assignment
Traceback (most recent call last):
  File "/usr/bin/ets", line 8, in
    load_entry_point('ETSProjectTools==0.6.0.dev-r24434', 'console_scripts', 'ets')()
  File "/usr/lib/python2.6/site-packages/ETSProjectTools-0.5.1-py2.6.egg/enthought/ets/ets.py", line 152, in main
    args.func(args, cfg)
  File "/usr/lib/python2.6/site-packages/ETSProjectTools-0.5.1-py2.6.egg/enthought/ets/develop.py", line 76, in main
    checkouts.perform(command, dry_run=args.dry_run)
  File "/usr/lib/python2.6/site-packages/ETSProjectTools-0.5.1-py2.6.egg/enthought/ets/tools/checkouts.py", line 126, in perform
    '%s' % project)
RuntimeError: Unable to complete command for project:
/home/gsever/Desktop/python-repo/ETS_3.3.1/Enable_3.2.1

Any suggestions?

##################################################################
[gsever at ccn Desktop]$ python -c 'from numpy.f2py.diagnose import run; run()'
##################################################################
------
os.name='posix'
------
sys.platform='linux2'
------
sys.version:
2.6 (r26:66714, Jun 8 2009, 16:07:26)
[GCC 4.4.0 20090506 (Red Hat 4.4.0-4)]
------
sys.prefix:
/usr
------
sys.path=':/usr/lib/python2.6/site-packages/foolscap-0.4.2-py2.6.egg:/usr/lib/python2.6/site-packages/Twisted-8.2.0-py2.6-linux-i686.egg:/home/gsever/Desktop/python-repo/ipython:/home/gsever/Desktop/python-repo/numpy:/home/gsever/Desktop/python-repo/matplotlib/lib:/usr/lib/python2.6/site-packages/Sphinx-0.6.2-py2.6.egg:/usr/lib/python2.6/site-packages/docutils-0.5-py2.6.egg:/usr/lib/python2.6/site-packages/Jinja2-2.1.1-py2.6-linux-i686.egg:/usr/lib/python2.6/site-packages/Pygments-1.0-py2.6.egg:/usr/lib/python2.6/site-packages/xlwt-0.7.2-py2.6.egg:/usr/lib/python2.6/site-packages/spyder-1.0.0beta1-py2.6.egg:/usr/lib/python2.6/site-packages/PyOpenGL-3.0.0c1-py2.6.egg:/home/gsever/Desktop/python-repo/ETS_3.3.1/EnthoughtBase_3.0.4:/home/gsever/Desktop/python-repo/ETS_3.3.1/TraitsBackendWX_3.2.1:/home/gsever/Desktop/python-repo/ETS_3.3.1/ETSProjectTools_0.6.0:/home/gsever/Desktop/python-repo/ETS_3.3.1/Chaco_3.2.1:/home/gsever/Desktop/python-repo/ETS_3.3.1/ETS_3.3.1:/home/gsever/Desktop/python-repo/ETS_3.3.1/TraitsGUI_3.1.1:/home/gsever/Desktop/python-repo/ETS_3.3.1/Traits_3.2.1:/home/gsever/Desktop/python-repo/ETS_3.3.1/BlockCanvas_3.1.1:/usr/lib/python26.zip:/usr/lib/python2.6:/usr/lib/python2.6/plat-linux2:/usr/lib/python2.6/lib-tk:/usr/lib/python2.6/lib-old:/usr/lib/python2.6/lib-dynload:/usr/lib/python2.6/site-packages:/usr/lib/python2.6/site-packages/Numeric:/usr/lib/python2.6/site-packages/PIL:/usr/lib/python2.6/site-packages/gst-0.10:/usr/lib/python2.6/site-packages/gtk-2.0:/usr/lib/python2.6/site-packages:/usr/lib/python2.6/site-packages/wx-2.8-gtk2-unicode'
------
Failed to import numarray: No module named numarray
Found Numeric version '24.2' in /usr/lib/python2.6/site-packages/Numeric/Numeric.pyc
Found new numpy version '1.4.0.dev' in /home/gsever/Desktop/python-repo/numpy/numpy/__init__.pyc
Found f2py2e version '2' in /home/gsever/Desktop/python-repo/numpy/numpy/f2py/f2py2e.pyc
Found numpy.distutils version '0.4.0' in '/home/gsever/Desktop/python-repo/numpy/numpy/distutils/__init__.pyc'
------
Importing numpy.distutils.fcompiler ... ok
------
Checking availability of supported Fortran compilers:
GnuFCompiler instance properties:
  archiver        = ['/usr/bin/g77', '-cr']
  compile_switch  = '-c'
  compiler_f77    = ['/usr/bin/g77', '-g', '-Wall', '-fno-second-underscore', '-fPIC', '-O3', '-funroll-loops']
  compiler_f90    = None
  compiler_fix    = None
  libraries       = ['g2c']
  library_dirs    = []
  linker_exe      = ['/usr/bin/g77', '-g', '-Wall', '-g', '-Wall']
  linker_so       = ['/usr/bin/g77', '-g', '-Wall', '-g', '-Wall', '-shared']
  object_switch   = '-o '
  ranlib          = ['/usr/bin/g77']
  version         = LooseVersion ('3.4.6')
  version_cmd     = ['/usr/bin/g77', '--version']
Gnu95FCompiler instance properties:
  archiver        = ['/usr/bin/gfortran', '-cr']
  compile_switch  = '-c'
  compiler_f77    = ['/usr/bin/gfortran', '-Wall', '-ffixed-form', '-fno-second-underscore', '-fPIC', '-O3', '-funroll-loops']
  compiler_f90    = ['/usr/bin/gfortran', '-Wall', '-fno-second-underscore', '-fPIC', '-O3', '-funroll-loops']
  compiler_fix    = ['/usr/bin/gfortran', '-Wall', '-ffixed-form', '-fno-second-underscore', '-Wall', '-fno-second-underscore', '-fPIC', '-O3', '-funroll-loops']
  libraries       = ['gfortran']
  library_dirs    = []
  linker_exe      = ['/usr/bin/gfortran', '-Wall', '-Wall']
  linker_so       = ['/usr/bin/gfortran', '-Wall', '-Wall', '-shared']
  object_switch   = '-o '
  ranlib          = ['/usr/bin/gfortran']
  version         = LooseVersion ('4.4.0')
  version_cmd     = ['/usr/bin/gfortran', '--version']
Fortran compilers found:
  --fcompiler=gnu    GNU Fortran 77 compiler (3.4.6)
  --fcompiler=gnu95  GNU Fortran 95 compiler (4.4.0)
Compilers available for this platform, but not found:
  --fcompiler=absoft   Absoft Corp Fortran Compiler
  --fcompiler=compaq   Compaq Fortran Compiler
  --fcompiler=g95      G95 Fortran Compiler
  --fcompiler=intel    Intel Fortran Compiler for 32-bit apps
  --fcompiler=intele   Intel Fortran Compiler for Itanium apps
  --fcompiler=intelem  Intel Fortran Compiler for EM64T-based apps
  --fcompiler=lahey    Lahey/Fujitsu Fortran 95 Compiler
  --fcompiler=nag      NAGWare Fortran 95 Compiler
  --fcompiler=pg       Portland Group Fortran Compiler
  --fcompiler=vast     Pacific-Sierra Research Fortran 90 Compiler
Compilers not available on this platform:
  --fcompiler=hpux     HP Fortran 90 Compiler
  --fcompiler=ibm      IBM XL Fortran Compiler
  --fcompiler=intelev  Intel Visual Fortran Compiler for Itanium apps
  --fcompiler=intelv   Intel Visual Fortran Compiler for 32-bit apps
  --fcompiler=mips     MIPSpro Fortran Compiler
  --fcompiler=none     Fake Fortran compiler
  --fcompiler=sun      Sun or Forte Fortran 95 Compiler
For compiler details, run 'config_fc --verbose' setup command.
------
Importing numpy.distutils.cpuinfo ... ok
------
CPU information: CPUInfoBase__get_nbits getNCPUs has_mmx has_sse
has_sse2 has_sse3 has_ssse3 is_32bit is_Intel is_i686
------

-- 
Gökhan

From robert.kern at gmail.com Thu Aug 13 18:36:43 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Thu, 13 Aug 2009 17:36:43 -0500
Subject: [Numpy-discussion] documentation translation
In-Reply-To: 
References: 
Message-ID: <3d375d730908131536t2fa3f8ebjb00f5b3363f60b64@mail.gmail.com>

On Thu, Aug 13, 2009 at 14:59, Dokuro wrote:
> Hello, we have a python group in Maracaibo, Venezuela and have started a
> translation project for as many python documentations as we can find,
> from English to Spanish, making the use of python easier to non-English
> speakers. So, on behalf of our group, I would like to ask if it won't be
> a problem if we make these translations and keep them in our wiki
> http://proyectociencia.org/Wiki/index.php/Grupo_de_Usuarios_de_Python

Please do. Thank you!
-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

From fperez.net at gmail.com Thu Aug 13 18:59:21 2009
From: fperez.net at gmail.com (Fernando Perez)
Date: Thu, 13 Aug 2009 15:59:21 -0700
Subject: [Numpy-discussion] [IPython-user] SciPy2009 BoF Wiki Page
In-Reply-To: <0636F499-8CBC-4053-ACF9-7BA40E5D58D4@cs.toronto.edu>
References: <0636F499-8CBC-4053-ACF9-7BA40E5D58D4@cs.toronto.edu>
Message-ID: 

On Thu, Aug 13, 2009 at 2:20 PM, David Warde-Farley wrote:
> I needed a short break from some heavy writing, so on Fernando's
> suggestion I took to the task of aggregating together mailing list
> traffic about the BoFs next week. So far, 4 have been proposed, and
> I've written down under "attendees" the names of anyone who has
> expressed interest (except in Perry's case, where I've only heard it
> via proxy). The page is at
>
>        http://scipy.org/SciPy2009/BoF
>
> I've created sections below that are hyperlink targets for the topic
> of each session; if someone more knowledgeable of that domain can fill
> in those sections, please do.

Fantastic! Many thanks.

> Edit away, and see you next week! (And if someone can forward this to
> the Matplotlib list, I'm not currently subscribed)

I'll send that now.

Cheers,

f

From jeremy.mayes at gmail.com Fri Aug 14 07:39:21 2009
From: jeremy.mayes at gmail.com (Jeremy Mayes)
Date: Fri, 14 Aug 2009 06:39:21 -0500
Subject: [Numpy-discussion] ndarray subclass causes memory leak
In-Reply-To: 
References: 
Message-ID: <890c2bf00908140439mc0ad63en5277882db506efea@mail.gmail.com>

This simple example causes a massive leak in v1.3.0 which didn't exist
in v1.0.1. What am I doing wrong? If I replace

arr = [Array((2,2)), Array((2,2))]

with

arr = [numpy.ndarray((2,2,)), numpy.ndarray((2,2))]

then I don't have the leak.

import numpy
import gc

class Array(numpy.ndarray):
    def __new__(subtype, shape, dtype=float, buffer=None, offset=0,
                strides=None, order=None, info=None):
        return numpy.ndarray.__new__(subtype, shape, dtype, buffer,
                                     offset, strides, order)

    def __array_finalize__(self, obj):
        print 'called array_finalize'

if __name__=='__main__':
    arr = [Array((2,2)), Array((2,2))]

    nbytesAllocated = 0
    for i in xrange(1000000000):
        a = numpy.array(arr)
        nbytesAllocated += a.nbytes
        if i%1000 == 0:
            print 'allocated %s' % nbytesAllocated
            gc.collect()

-- 
--jlm
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From markbak at gmail.com Fri Aug 14 08:27:23 2009
From: markbak at gmail.com (Mark Bakker)
Date: Fri, 14 Aug 2009 14:27:23 +0200
Subject: [Numpy-discussion] finding range of values below threshold in sorted array
Message-ID: <6946b9500908140527i4c9096adtd87b73b6b1c11171@mail.gmail.com>

Hello List,

I am trying to find a quick way to do the following:

I have a *sorted* array of real numbers, say array A, sorted in
ascending order (but it is easy to store it descending if that would
help). I want to find all numbers below a certain value, say b.

Sure, I can do

A < b

and I will get back an array with a bunch of Trues and then a bunch of
Falses, but all I need is the highest index for which A[i] < b, since A
is sorted.

Does anybody know a quick way to do this? I need to do it a lot, so the
quicker the better.

Thanks,

Mark
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From emmanuelle.gouillart at normalesup.org Fri Aug 14 08:49:04 2009
From: emmanuelle.gouillart at normalesup.org (Emmanuelle Gouillart)
Date: Fri, 14 Aug 2009 14:49:04 +0200
Subject: [Numpy-discussion] finding range of values below threshold in sorted array
In-Reply-To: <6946b9500908140527i4c9096adtd87b73b6b1c11171@mail.gmail.com>
References: <6946b9500908140527i4c9096adtd87b73b6b1c11171@mail.gmail.com>
Message-ID: <20090814124904.GF25831@phare.normalesup.org>

Hi,

ind = np.searchsorted(A, b)
values = A[:ind]

Cheers,
Emmanuelle

On Fri, Aug 14, 2009 at 02:27:23PM +0200, Mark Bakker wrote:
> Hello List,
> I am trying to find a quick way to do the following:
> I have a *sorted* array of real numbers, say array A, sorted in
> ascending order (but it is easy to store it descending if that would
> help). I want to find all numbers below a certain value, say b.
> Sure, I can do
> A < b
> and I will get back an array with a bunch of Trues and then a bunch of
> Falses, but all I need is the highest index for which A[i] < b, since A
> is sorted.
> Does anybody know a quick way to do this? I need to do it a lot, so the
> quicker the better.
> Thanks,
> Mark
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion

From dagss at student.matnat.uio.no Fri Aug 14 09:04:09 2009
From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn)
Date: Fri, 14 Aug 2009 15:04:09 +0200
Subject: [Numpy-discussion] Cython BOF
Message-ID: <4A8560C9.40302@student.matnat.uio.no>

There's been some discussion earlier about how starting to write bigger
parts of the NumPy/SciPy codebase in Cython could potentially lower the
barrier of entry.

Some topics:
 * Move towards PEP3118 as the primary scientific "data container"
rather than ndarray?
 * Cython templates?
 * Native SIMD in Cython -- good or bad? (I don't know myself, to be
honest.)
 * Future direction in general

I'll add a Cython BOF to the wiki if somebody expresses interest.

-- 
Dag Sverre

From dsdale24 at gmail.com Fri Aug 14 09:22:25 2009
From: dsdale24 at gmail.com (Darren Dale)
Date: Fri, 14 Aug 2009 09:22:25 -0400
Subject: [Numpy-discussion] Cython BOF
In-Reply-To: <4A8560C9.40302@student.matnat.uio.no>
References: <4A8560C9.40302@student.matnat.uio.no>
Message-ID: 

On Fri, Aug 14, 2009 at 9:04 AM, Dag Sverre Seljebotn wrote:
> There's been some discussion earlier about how starting to write bigger
> parts of the NumPy/SciPy codebase in Cython could potentially lower the
> barrier of entry.
>
> Some topics:
>  * Move towards PEP3118 as the primary scientific "data container"
> rather than ndarray?
>  * Cython templates?
>  * Native SIMD in Cython -- good or bad? (I don't know myself, to be
> honest.)
>  * Future direction in general
>
> I'll add a Cython BOF to the wiki if somebody expresses interest.

I'm relatively new to contributing to numpy, so I probably would not be
able to provide a lot of input, but I would be interested in attending.

Darren

From kwmsmith at gmail.com Fri Aug 14 10:13:43 2009
From: kwmsmith at gmail.com (Kurt Smith)
Date: Fri, 14 Aug 2009 09:13:43 -0500
Subject: [Numpy-discussion] Cython BOF
In-Reply-To: <4A8560C9.40302@student.matnat.uio.no>
References: <4A8560C9.40302@student.matnat.uio.no>
Message-ID: 

On Fri, Aug 14, 2009 at 8:04 AM, Dag Sverre Seljebotn wrote:
> There's been some discussion earlier about how starting to write bigger
> parts of the NumPy/SciPy codebase in Cython could potentially lower the
> barrier of entry.
>
> Some topics:
>  * Move towards PEP3118 as the primary scientific "data container"
> rather than ndarray?
>  * Cython templates?
>  * Native SIMD in Cython -- good or bad? (I don't know myself, to be
> honest.)
>  * Future direction in general
>
> I'll add a Cython BOF to the wiki if somebody expresses interest.

+1. Definitely interested.

>
> --
> Dag Sverre
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion

From jeremy.mayes at gmail.com Fri Aug 14 12:00:07 2009
From: jeremy.mayes at gmail.com (Jeremy Mayes)
Date: Fri, 14 Aug 2009 11:00:07 -0500
Subject: [Numpy-discussion] ndarray subclass causes memory leak
In-Reply-To: <890c2bf00908140434s64a98fbet5aa87cd0e6a818d3@mail.gmail.com>
References: <890c2bf00908140434s64a98fbet5aa87cd0e6a818d3@mail.gmail.com>
Message-ID: <890c2bf00908140900t5c73091fma462dedd4997d7df@mail.gmail.com>

I've narrowed this down to a change between 1.0.4 and 1.1.0. Valgrind
(of v1.3.0) shows the following result. The change was in
setArrayFromSequence, where PyArray_EnsureArray gets invoked in v1.1.0
where it did not in v1.0.4.

==10132== 4,474,768 (3,197,200 direct, 1,277,568 indirect) bytes in 39,965 blocks are definitely lost in loss record 36 of 36
==10132==    at 0x4905B65: malloc (vg_replace_malloc.c:149)
==10132==    by 0x7FFFDAA: array_alloc (arrayobject.c:7387)
==10132==    by 0x8003465: PyArray_NewFromDescr (arrayobject.c:5900)
==10132==    by 0x802434D: PyArray_EnsureArray (multiarraymodule.c:226)
==10132==    by 0x8025766: setArrayFromSequence (arrayobject.c:7938)
==10132==    by 0x80256B8: setArrayFromSequence (arrayobject.c:7957)
==10132==    by 0x800F5FD: PyArray_FromAny (arrayobject.c:7984)
==10132==    by 0x802CC21: PyArray_CheckFromAny (arrayobject.c:9530)
==10132==    by 0x8037F64: _array_fromobject (multiarraymodule.c:6329)
==10132==    by 0x4AC9501: PyEval_EvalFrameEx (ceval.c:3612)
==10132==    by 0x4ACA894: PyEval_EvalCodeEx (ceval.c:2875)
==10132==    by 0x4ACAA11: PyEval_EvalCode (ceval.c:514)
==10132==    by 0x4AEC98B: PyRun_FileExFlags (pythonrun.c:1273)
==10132==    by 0x4AED612: PyRun_SimpleFileExFlags (pythonrun.c:879)
==10132==    by 0x4AF88C7: Py_Main (main.c:532)
==10132==    by 0x38C821C3FA: (below main) (in /lib64/tls/libc-2.3.4.so)
==10132==

On Fri, Aug 14, 2009 at 6:34 AM, Jeremy Mayes wrote:
> This simple example causes a massive leak in v1.3.0 which didn't exist
> in v1.0.1. What am I doing wrong? If I replace
>
> arr = [Array((2,2)), Array((2,2))]
>
> with
>
> arr = [numpy.ndarray((2,2,)), numpy.ndarray((2,2))]
>
> then I don't have the leak.
>
> import numpy
> import gc
>
> class Array(numpy.ndarray):
>     def __new__(subtype, shape, dtype=float, buffer=None, offset=0,
>                 strides=None, order=None, info=None):
>         return numpy.ndarray.__new__(subtype, shape, dtype, buffer,
>                                      offset, strides, order)
>
>     def __array_finalize__(self, obj):
>         print 'called array_finalize'
>
> if __name__=='__main__':
>     arr = [Array((2,2)), Array((2,2))]
>
>     nbytesAllocated = 0
>     for i in xrange(1000000000):
>         a = numpy.array(arr)
>         nbytesAllocated += a.nbytes
>         if i%1000 == 0:
>             print 'allocated %s' % nbytesAllocated
>             gc.collect()
>
> --
> --jlm

-- 
--jlm
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From jdh2358 at gmail.com Fri Aug 14 14:05:47 2009
From: jdh2358 at gmail.com (John Hunter)
Date: Fri, 14 Aug 2009 13:05:47 -0500
Subject: [Numpy-discussion] masked index surprise
Message-ID: <88e473830908141105i230bfc4cof1f4dc77541bf806@mail.gmail.com>

I just tracked down a subtle bug in my code, which is equivalent to

In [64]: x, y = np.random.rand(2, n)

In [65]: z = np.zeros_like(x)

In [66]: mask = x>0.5

In [67]: z[mask] = x/y

I meant to write

  z[mask] = x[mask]/y[mask]

so I can fix my code, but why is line 67 allowed

  In [68]: z[mask].shape
  Out[68]: (54,)

  In [69]: (x/y).shape
  Out[69]: (100,)

it seems like broadcasting would fail

In [70]: np.__version__
Out[70]: '1.4.0.dev7153'

In [71]:

From robert.kern at gmail.com Fri Aug 14 14:52:30 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Fri, 14 Aug 2009 13:52:30 -0500
Subject: [Numpy-discussion] masked index surprise
In-Reply-To: <88e473830908141105i230bfc4cof1f4dc77541bf806@mail.gmail.com>
References: <88e473830908141105i230bfc4cof1f4dc77541bf806@mail.gmail.com>
Message-ID: <3d375d730908141152g1d304b76u9e5f0a1b8070f8d2@mail.gmail.com>

On Fri, Aug 14, 2009 at 13:05, John Hunter wrote:
> I just tracked down a subtle bug in my code, which is equivalent to
>
> In [64]: x, y = np.random.rand(2, n)
>
> In [65]: z = np.zeros_like(x)
>
> In [66]: mask = x>0.5
>
> In [67]: z[mask] = x/y
>
> I meant to write
>
>  z[mask] = x[mask]/y[mask]
>
> so I can fix my code, but why is line 67 allowed
>
>  In [68]: z[mask].shape
>  Out[68]: (54,)
>
>  In [69]: (x/y).shape
>  Out[69]: (100,)
>
> it seems like broadcasting would fail

Broadcasting doesn't take place with boolean masks. Instead, the
values repeat if there are too few and extra values are ignored.
Boolean indexing derives from Numeric's putmask() implementation,
which had these semantics, rather than other forms of indexing.

You may consider this a wart or a bad design decision (and I would
probably agree), but it is not a bug.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

From kwgoodman at gmail.com Fri Aug 14 15:20:33 2009
From: kwgoodman at gmail.com (Keith Goodman)
Date: Fri, 14 Aug 2009 12:20:33 -0700
Subject: [Numpy-discussion] masked index surprise
In-Reply-To: <3d375d730908141152g1d304b76u9e5f0a1b8070f8d2@mail.gmail.com>
References: <88e473830908141105i230bfc4cof1f4dc77541bf806@mail.gmail.com>
	<3d375d730908141152g1d304b76u9e5f0a1b8070f8d2@mail.gmail.com>
Message-ID: 

On Fri, Aug 14, 2009 at 11:52 AM, Robert Kern wrote:
> On Fri, Aug 14, 2009 at 13:05, John Hunter wrote:
>> I just tracked down a subtle bug in my code, which is equivalent to
>>
>> In [64]: x, y = np.random.rand(2, n)
>>
>> In [65]: z = np.zeros_like(x)
>>
>> In [66]: mask = x>0.5
>>
>> In [67]: z[mask] = x/y
>>
>> I meant to write
>>
>>  z[mask] = x[mask]/y[mask]
>>
>> so I can fix my code, but why is line 67 allowed
>>
>>  In [68]: z[mask].shape
>>  Out[68]: (54,)
>>
>>  In [69]: (x/y).shape
>>  Out[69]: (100,)
>>
>> it seems like broadcasting would fail
>
> Broadcasting doesn't take place with boolean masks. Instead, the
> values repeat if there are too few and extra values are ignored.
> Boolean indexing derives from Numeric's putmask() implementation,
> which had these semantics, rather than other forms of indexing.
>
> You may consider this a wart or a bad design decision (and I would
> probably agree), but it is not a bug.
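If I follow, then an assignment like the one below silently drops the
extra value instead of raising (untested on my side; I am only going by
the putmask semantics you describe):

import numpy as np

z = np.zeros(4)
mask = np.array([True, False, True, False])
# two True positions but three values: the trailing 3.0 is ignored
z[mask] = np.array([1.0, 2.0, 3.0])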
Are the last two, x[[1]] and x[np.array([1])], broadcasting?

>> x = np.array([1,2,3])
>> x[1] = np.array([4,5,6])
ValueError: setting an array element with a sequence.
>> x[(1,)] = np.array([4,5,6])
ValueError: array dimensions are not compatible for copy
>> x[[1]] = np.array([4,5,6])
>> x
   array([1, 4, 3])
>> x[np.array([1])] = np.array([4,5,6])
>> x
   array([1, 4, 3])

From robert.kern at gmail.com Fri Aug 14 15:24:15 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Fri, 14 Aug 2009 14:24:15 -0500
Subject: [Numpy-discussion] masked index surprise
In-Reply-To: 
References: <88e473830908141105i230bfc4cof1f4dc77541bf806@mail.gmail.com>
	<3d375d730908141152g1d304b76u9e5f0a1b8070f8d2@mail.gmail.com>
Message-ID: <3d375d730908141224t786b91e8q85cab49a824837c@mail.gmail.com>

On Fri, Aug 14, 2009 at 14:20, Keith Goodman wrote:
> On Fri, Aug 14, 2009 at 11:52 AM, Robert Kern wrote:
>> On Fri, Aug 14, 2009 at 13:05, John Hunter wrote:
>>> I just tracked down a subtle bug in my code, which is equivalent to
>>>
>>> In [64]: x, y = np.random.rand(2, n)
>>>
>>> In [65]: z = np.zeros_like(x)
>>>
>>> In [66]: mask = x>0.5
>>>
>>> In [67]: z[mask] = x/y
>>>
>>> I meant to write
>>>
>>>  z[mask] = x[mask]/y[mask]
>>>
>>> so I can fix my code, but why is line 67 allowed
>>>
>>>  In [68]: z[mask].shape
>>>  Out[68]: (54,)
>>>
>>>  In [69]: (x/y).shape
>>>  Out[69]: (100,)
>>>
>>> it seems like broadcasting would fail
>>
>> Broadcasting doesn't take place with boolean masks. Instead, the
>> values repeat if there are too few and extra values are ignored.
>> Boolean indexing derives from Numeric's putmask() implementation,
>> which had these semantics, rather than other forms of indexing.
>>
>> You may consider this a wart or a bad design decision (and I would
>> probably agree), but it is not a bug.
>
> Are the last two, x[[1]] and x[np.array([1])], broadcasting?
>
>>> x = np.array([1,2,3])
>>> x[1] = np.array([4,5,6])
> ValueError: setting an array element with a sequence.
>>> x[(1,)] = np.array([4,5,6])
> ValueError: array dimensions are not compatible for copy
>>> x[[1]] = np.array([4,5,6])
>>> x
>    array([1, 4, 3])
>>> x[np.array([1])] = np.array([4,5,6])
>>> x
>    array([1, 4, 3])

I guess I'm just makin' stuff up again. kern_is_right() == False. All
forms repeat, not broadcast, since they derive from put() and
putmask() which both have the repeating/ignoring semantics.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

From kwgoodman at gmail.com Fri Aug 14 15:45:44 2009
From: kwgoodman at gmail.com (Keith Goodman)
Date: Fri, 14 Aug 2009 12:45:44 -0700
Subject: [Numpy-discussion] masked index surprise
In-Reply-To: <3d375d730908141224t786b91e8q85cab49a824837c@mail.gmail.com>
References: <88e473830908141105i230bfc4cof1f4dc77541bf806@mail.gmail.com>
	<3d375d730908141152g1d304b76u9e5f0a1b8070f8d2@mail.gmail.com>
	<3d375d730908141224t786b91e8q85cab49a824837c@mail.gmail.com>
Message-ID: 

On Fri, Aug 14, 2009 at 12:24 PM, Robert Kern wrote:
> On Fri, Aug 14, 2009 at 14:20, Keith Goodman wrote:
>> On Fri, Aug 14, 2009 at 11:52 AM, Robert Kern wrote:
>>> On Fri, Aug 14, 2009 at 13:05, John Hunter wrote:
>>>> I just tracked down a subtle bug in my code, which is equivalent to
>>>>
>>>> In [64]: x, y = np.random.rand(2, n)
>>>>
>>>> In [65]: z = np.zeros_like(x)
>>>>
>>>> In [66]: mask = x>0.5
>>>>
>>>> In [67]: z[mask] = x/y
>>>>
>>>> I meant to write
>>>>
>>>>  z[mask] = x[mask]/y[mask]
>>>>
>>>> so I can fix my code, but why is line 67 allowed
>>>>
>>>>  In [68]: z[mask].shape
>>>>  Out[68]: (54,)
>>>>
>>>>  In [69]: (x/y).shape
>>>>  Out[69]: (100,)
>>>>
>>>> it seems like broadcasting would fail
>>>
>>> Broadcasting doesn't take place with boolean masks. Instead, the
>>> values repeat if there are too few and extra values are ignored.
>>> Boolean indexing derives from Numeric's putmask() implementation,
>>> which had these semantics, rather than other forms of indexing.
>>>
>>> You may consider this a wart or a bad design decision (and I would
>>> probably agree), but it is not a bug.
>>
>> Are the last two, x[[1]] and x[np.array([1])], broadcasting?
>>
>>>> x = np.array([1,2,3])
>>>> x[1] = np.array([4,5,6])
>> ValueError: setting an array element with a sequence.
>>>> x[(1,)] = np.array([4,5,6])
>> ValueError: array dimensions are not compatible for copy
>>>> x[[1]] = np.array([4,5,6])
>>>> x
>>    array([1, 4, 3])
>>>> x[np.array([1])] = np.array([4,5,6])
>>>> x
>>    array([1, 4, 3])
>
> I guess I'm just makin' stuff up again. kern_is_right() == False. All
> forms repeat, not broadcast, since they derive from put() and
> putmask() which both have the repeating/ignoring semantics.

The ignoring scares me. If the dimensions aren't compatible I'd much
rather get a ValueError. Does anyone have a use case for ignoring?
(Besides ignoring my email.)

From dwf at cs.toronto.edu Fri Aug 14 17:09:31 2009
From: dwf at cs.toronto.edu (David Warde-Farley)
Date: Fri, 14 Aug 2009 17:09:31 -0400
Subject: [Numpy-discussion] Cython BOF
In-Reply-To: <4A8560C9.40302@student.matnat.uio.no>
References: <4A8560C9.40302@student.matnat.uio.no>
Message-ID: <0B00D87F-1950-4106-BDBE-30B60B7305A8@cs.toronto.edu>

+1. The topics (especially native SIMD) sound fantastic so far.

David

On 14-Aug-09, at 9:04 AM, Dag Sverre Seljebotn wrote:
> There's been some discussion earlier about how starting to write
> bigger parts of the NumPy/SciPy codebase in Cython could potentially
> lower the barrier of entry.
>
> Some topics:
> * Move towards PEP3118 as the primary scientific "data container"
> rather than ndarray?
> * Cython templates?
> * Native SIMD in Cython -- good or bad? (I don't know myself, to be
> honest.)
> * Future direction in general
>
> I'll add a Cython BOF to the wiki if somebody expresses interest.
>
> --
> Dag Sverre
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion

From gokhansever at gmail.com Fri Aug 14 17:11:57 2009
From: gokhansever at gmail.com (Gökhan Sever)
Date: Fri, 14 Aug 2009 16:11:57 -0500
Subject: [Numpy-discussion] build_clib error during Enable 3_2_1 installation
In-Reply-To: <49d6b3500908131536m6e5c4646qa0d1d03a1ea6c8c8@mail.gmail.com>
References: <49d6b3500908131536m6e5c4646qa0d1d03a1ea6c8c8@mail.gmail.com>
Message-ID: <49d6b3500908141411h65e0f59fue358524f5d9e5ed7@mail.gmail.com>

Hello,

I fixed the scipy installation issue. I usually check out the whole ETS
trunk (ets co ETS) and do an "ets develop". After fulfilling the
requirements it was always building the whole code-stack successfully
(at least always in Fedora 10). In this case it fails at the Enable
compilation step. I don't know whether this error:

"build_clib already run, it is too late to ensure in-place build of
build_clib"

is due to my system files or a conflict with the installed Python tools.

One more thing to note; there is a build_clib.py file under
/usr/lib/python2.6/distutils/command. Might this be due to a conflict
between numpy's distutils and Python's distutils?

On Thu, Aug 13, 2009 at 5:36 PM, Gökhan Sever wrote:
> For some unknown reason, ets develop can't pass the following
> compilation point:
>
> g++: enthought/kiva/agg/src/kiva_rect.cpp
> ar: adding 8 object files to build/temp.linux-i686-2.6/libkiva_src.a
> running build_ext
> build_clib already run, it is too late to ensure in-place build of
> build_clib
> Traceback (most recent call last):
>   File "setup.py", line 327, in
>     **config
>   File "/home/gsever/Desktop/python-repo/numpy/numpy/distutils/core.py", line 186, in setup
>     return old_setup(**new_attr)
>   File "/usr/lib/python2.6/distutils/core.py", line 152, in setup
>     dist.run_commands()
>   File "/usr/lib/python2.6/distutils/dist.py", line 975, in run_commands
>     self.run_command(cmd)
>   File "/usr/lib/python2.6/distutils/dist.py", line 995, in run_command
>     cmd_obj.run()
>   File "/home/gsever/Desktop/python-repo/numpy/numpy/distutils/command/build_ext.py", line 74, in run
>     self.library_dirs.append(build_clib.build_clib)
> UnboundLocalError: local variable 'build_clib' referenced before assignment
> Traceback (most recent call last):
>   File "/usr/bin/ets", line 8, in
>     load_entry_point('ETSProjectTools==0.6.0.dev-r24434', 'console_scripts', 'ets')()
>   File "/usr/lib/python2.6/site-packages/ETSProjectTools-0.5.1-py2.6.egg/enthought/ets/ets.py", line 152, in main
>     args.func(args, cfg)
>   File "/usr/lib/python2.6/site-packages/ETSProjectTools-0.5.1-py2.6.egg/enthought/ets/develop.py", line 76, in main
>     checkouts.perform(command, dry_run=args.dry_run)
>   File "/usr/lib/python2.6/site-packages/ETSProjectTools-0.5.1-py2.6.egg/enthought/ets/tools/checkouts.py", line 126, in perform
>     '%s' % project)
> RuntimeError: Unable to complete command for project:
> /home/gsever/Desktop/python-repo/ETS_3.3.1/Enable_3.2.1
>
> Any suggestions?
>
> ##################################################################
> [gsever at ccn Desktop]$ python -c 'from numpy.f2py.diagnose import run; run()'
> ##################################################################
> ------
> os.name='posix'
> ------
> sys.platform='linux2'
> ------
> sys.version:
> 2.6 (r26:66714, Jun 8 2009, 16:07:26)
> [GCC 4.4.0 20090506 (Red Hat 4.4.0-4)]
> ------
> sys.prefix:
> /usr
> ------
> sys.path=':/usr/lib/python2.6/site-packages/foolscap-0.4.2-py2.6.egg:/usr/lib/python2.6/site-packages/Twisted-8.2.0-py2.6-linux-i686.egg:/home/gsever/Desktop/python-repo/ipython:/home/gsever/Desktop/python-repo/numpy:/home/gsever/Desktop/python-repo/matplotlib/lib:/usr/lib/python2.6/site-packages/Sphinx-0.6.2-py2.6.egg:/usr/lib/python2.6/site-packages/docutils-0.5-py2.6.egg:/usr/lib/python2.6/site-packages/Jinja2-2.1.1-py2.6-linux-i686.egg:/usr/lib/python2.6/site-packages/Pygments-1.0-py2.6.egg:/usr/lib/python2.6/site-packages/xlwt-0.7.2-py2.6.egg:/usr/lib/python2.6/site-packages/spyder-1.0.0beta1-py2.6.egg:/usr/lib/python2.6/site-packages/PyOpenGL-3.0.0c1-py2.6.egg:/home/gsever/Desktop/python-repo/ETS_3.3.1/EnthoughtBase_3.0.4:/home/gsever/Desktop/python-repo/ETS_3.3.1/TraitsBackendWX_3.2.1:/home/gsever/Desktop/python-repo/ETS_3.3.1/ETSProjectTools_0.6.0:/home/gsever/Desktop/python-repo/ETS_3.3.1/Chaco_3.2.1:/home/gsever/Desktop/python-repo/ETS_3.3.1/ETS_3.3.1:/home/gsever/Desktop/python-repo/ETS_3.3.1/TraitsGUI_3.1.1:/home/gsever/Desktop/python-repo/ETS_3.3.1/Traits_3.2.1:/home/gsever/Desktop/python-repo/ETS_3.3.1/BlockCanvas_3.1.1:/usr/lib/python26.zip:/usr/lib/python2.6:/usr/lib/python2.6/plat-linux2:/usr/lib/python2.6/lib-tk:/usr/lib/python2.6/lib-old:/usr/lib/python2.6/lib-dynload:/usr/lib/python2.6/site-packages:/usr/lib/python2.6/site-packages/Numeric:/usr/lib/python2.6/site-packages/PIL:/usr/lib/python2.6/site-packages/gst-0.10:/usr/lib/python2.6/site-packages/gtk-2.0:/usr/lib/python2.6/site-packages:/usr/lib/python2.6/site-packages/wx-2.8-gtk2-unicode'
> ------
> Failed to import numarray: No module named numarray
> Found Numeric version '24.2' in /usr/lib/python2.6/site-packages/Numeric/Numeric.pyc
> Found new numpy version '1.4.0.dev' in /home/gsever/Desktop/python-repo/numpy/numpy/__init__.pyc
> Found f2py2e version '2' in /home/gsever/Desktop/python-repo/numpy/numpy/f2py/f2py2e.pyc
> Found numpy.distutils version '0.4.0' in '/home/gsever/Desktop/python-repo/numpy/numpy/distutils/__init__.pyc'
> ------
> Importing numpy.distutils.fcompiler ...
ok > ------ > Checking availability of supported Fortran compilers: > GnuFCompiler instance properties: > archiver = ['/usr/bin/g77', '-cr'] > compile_switch = '-c' > compiler_f77 = ['/usr/bin/g77', '-g', '-Wall', '-fno-second- > underscore', '-fPIC', '-O3', '-funroll-loops'] > compiler_f90 = None > compiler_fix = None > libraries = ['g2c'] > library_dirs = [] > linker_exe = ['/usr/bin/g77', '-g', '-Wall', '-g', '-Wall'] > linker_so = ['/usr/bin/g77', '-g', '-Wall', '-g', '-Wall', '- > shared'] > object_switch = '-o ' > ranlib = ['/usr/bin/g77'] > version = LooseVersion ('3.4.6') > version_cmd = ['/usr/bin/g77', '--version'] > Gnu95FCompiler instance properties: > archiver = ['/usr/bin/gfortran', '-cr'] > compile_switch = '-c' > compiler_f77 = ['/usr/bin/gfortran', '-Wall', '-ffixed-form', '-fno- > second-underscore', '-fPIC', '-O3', '-funroll-loops'] > compiler_f90 = ['/usr/bin/gfortran', '-Wall', '-fno-second-underscore', > '-fPIC', '-O3', '-funroll-loops'] > compiler_fix = ['/usr/bin/gfortran', '-Wall', '-ffixed-form', '-fno- > second-underscore', '-Wall', '-fno-second-underscore', > '- > fPIC', '-O3', '-funroll-loops'] > libraries = ['gfortran'] > library_dirs = [] > linker_exe = ['/usr/bin/gfortran', '-Wall', '-Wall'] > linker_so = ['/usr/bin/gfortran', '-Wall', '-Wall', '-shared'] > object_switch = '-o ' > ranlib = ['/usr/bin/gfortran'] > version = LooseVersion ('4.4.0') > version_cmd = ['/usr/bin/gfortran', '--version'] > Fortran compilers found: > --fcompiler=gnu GNU Fortran 77 compiler (3.4.6) > --fcompiler=gnu95 GNU Fortran 95 compiler (4.4.0) > Compilers available for this platform, but not found: > --fcompiler=absoft Absoft Corp Fortran Compiler > --fcompiler=compaq Compaq Fortran Compiler > --fcompiler=g95 G95 Fortran Compiler > --fcompiler=intel Intel Fortran Compiler for 32-bit apps > --fcompiler=intele Intel Fortran Compiler for Itanium apps > --fcompiler=intelem Intel Fortran Compiler for EM64T-based apps > --fcompiler=lahey Lahey/Fujitsu Fortran 95 Compiler > --fcompiler=nag NAGWare Fortran 95 Compiler > --fcompiler=pg Portland Group Fortran Compiler > --fcompiler=vast Pacific-Sierra Research Fortran 90 Compiler > Compilers not available on this platform: > --fcompiler=hpux HP Fortran 90 Compiler > --fcompiler=ibm IBM XL Fortran Compiler > --fcompiler=intelev Intel Visual Fortran Compiler for Itanium apps > --fcompiler=intelv Intel Visual Fortran Compiler for 32-bit apps > --fcompiler=mips MIPSpro Fortran Compiler > --fcompiler=none Fake Fortran compiler > --fcompiler=sun Sun or Forte Fortran 95 Compiler > For compiler details, run 'config_fc --verbose' setup command. > ------ > Importing numpy.distutils.cpuinfo ... ok > ------ > CPU information: CPUInfoBase__get_nbits getNCPUs has_mmx has_sse > has_sse2 has_sse3 has_ssse3 is_32bit is_Intel is_i686 ------ > > > > -- > G?khan > -- G?khan -------------- next part -------------- An HTML attachment was scrubbed... URL: From d_l_goldsmith at yahoo.com Fri Aug 14 17:16:15 2009 From: d_l_goldsmith at yahoo.com (David Goldsmith) Date: Fri, 14 Aug 2009 14:16:15 -0700 (PDT) Subject: [Numpy-discussion] passing "import numpy as np" as python command arg in 'doze Message-ID: <740668.69161.qm@web52112.mail.re2.yahoo.com> Hi! Please remind: running python in the Windows Terminal (DOS command prompt), how does one pass the command "import numpy as np"? I tried 'python "import numpy as np"', to no avail. 
DG From robert.kern at gmail.com Fri Aug 14 17:20:08 2009 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 14 Aug 2009 16:20:08 -0500 Subject: [Numpy-discussion] passing "import numpy as np" as python command arg in 'doze In-Reply-To: <740668.69161.qm@web52112.mail.re2.yahoo.com> References: <740668.69161.qm@web52112.mail.re2.yahoo.com> Message-ID: <3d375d730908141420n6f9b9861x322e894712dc86aa@mail.gmail.com> On Fri, Aug 14, 2009 at 16:16, David Goldsmith wrote: > Hi! ?Please remind: running python in the Windows Terminal (DOS command prompt), how does one pass the command "import numpy as np"? ?I tried 'python "import numpy as np"', to no avail. $ python -h usage: /Library/Frameworks/Python.framework/Versions/2.5/Resources/PythonApp.app/Contents/MacOS/PythonApp [option] ... [-c cmd | -m mod | file | -] [arg] ... Options and arguments (and corresponding environment variables): -c cmd : program passed in as string (terminates option list) .... -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From jsseabold at gmail.com Fri Aug 14 17:20:35 2009 From: jsseabold at gmail.com (Skipper Seabold) Date: Fri, 14 Aug 2009 17:20:35 -0400 Subject: [Numpy-discussion] passing "import numpy as np" as python command arg in 'doze In-Reply-To: <740668.69161.qm@web52112.mail.re2.yahoo.com> References: <740668.69161.qm@web52112.mail.re2.yahoo.com> Message-ID: On Fri, Aug 14, 2009 at 5:16 PM, David Goldsmith wrote: > Hi! ?Please remind: running python in the Windows Terminal (DOS command prompt), how does one pass the command "import numpy as np"? ?I tried 'python "import numpy as np"', to no avail. > Is this what you want? In a linux terminal python -c "import numpy as np; print np.ones(5)" http://docs.python.org/using/cmdline.html Skipper From d_l_goldsmith at yahoo.com Fri Aug 14 18:09:31 2009 From: d_l_goldsmith at yahoo.com (David Goldsmith) Date: Fri, 14 Aug 2009 15:09:31 -0700 (PDT) Subject: [Numpy-discussion] passing "import numpy as np" as python command arg in 'doze In-Reply-To: <3d375d730908141420n6f9b9861x322e894712dc86aa@mail.gmail.com> Message-ID: <244613.29168.qm@web52102.mail.re2.yahoo.com> Thanks! DG --- On Fri, 8/14/09, Robert Kern wrote: > From: Robert Kern > Subject: Re: [Numpy-discussion] passing "import numpy as np" as python command arg in 'doze > To: "Discussion of Numerical Python" > Date: Friday, August 14, 2009, 2:20 PM > On Fri, Aug 14, 2009 at 16:16, David > Goldsmith > wrote: > > Hi! ?Please remind: running python in the Windows > Terminal (DOS command prompt), how does one pass the command > "import numpy as np"? ?I tried 'python "import numpy as > np"', to no avail. > > $ python -h > usage: > /Library/Frameworks/Python.framework/Versions/2.5/Resources/PythonApp.app/Contents/MacOS/PythonApp > [option] ... [-c cmd | -m mod | file | -] [arg] ... > Options and arguments (and corresponding environment > variables): > -c cmd : program passed in as string (terminates option > list) > > .... > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, > a harmless > enigma that is made terrible by our own mad attempt to > interpret it as > though it had an underlying truth." > ? 
-- Umberto Eco
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>

__________________________________________________
Do You Yahoo!?
Tired of spam? Yahoo! Mail has the best spam protection around
http://mail.yahoo.com

From d_l_goldsmith at yahoo.com Fri Aug 14 18:18:55 2009
From: d_l_goldsmith at yahoo.com (David Goldsmith)
Date: Fri, 14 Aug 2009 15:18:55 -0700 (PDT)
Subject: [Numpy-discussion] passing "import numpy as np" as python command arg in 'doze
In-Reply-To: 
Message-ID: <546323.15295.qm@web52112.mail.re2.yahoo.com>

Thanks, Skipper & Robert; perhaps I'm misunderstanding what should happen, but this doesn't appear to work in Windoze:

Begin Terminal output:

C:\Python26>python -c "import numpy as np"

C:\Python26>

End Terminal output.

In other words, no error is returned, but python doesn't "stay running".

DG

--- On Fri, 8/14/09, Skipper Seabold wrote:

> From: Skipper Seabold
> Subject: Re: [Numpy-discussion] passing "import numpy as np" as python command arg in 'doze
> To: "Discussion of Numerical Python"
> Date: Friday, August 14, 2009, 2:20 PM
> On Fri, Aug 14, 2009 at 5:16 PM,
> David Goldsmith
> wrote:
> > Hi! Please remind: running python in the Windows
> Terminal (DOS command prompt), how does one pass the command
> "import numpy as np"? I tried 'python "import numpy as
> np"', to no avail.
> >
> 
> Is this what you want?
> 
> In a linux terminal
> 
> python -c "import numpy as np; print np.ones(5)"
> 
> http://docs.python.org/using/cmdline.html
> 
> Skipper
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>

From jsseabold at gmail.com Fri Aug 14 18:21:26 2009
From: jsseabold at gmail.com (Skipper Seabold)
Date: Fri, 14 Aug 2009 18:21:26 -0400
Subject: [Numpy-discussion] passing "import numpy as np" as python command arg in 'doze
In-Reply-To: <546323.15295.qm@web52112.mail.re2.yahoo.com>
References: <546323.15295.qm@web52112.mail.re2.yahoo.com>
Message-ID: 

On Fri, Aug 14, 2009 at 6:18 PM, David Goldsmith wrote:
> Thanks, Skipper & Robert; perhaps I'm misunderstanding what should happen, but this doesn't appear to work in Windoze:
>
> Begin Terminal output:
>
> C:\Python26>python -c "import numpy as np"
>
> C:\Python26>
>
> End Terminal output.
>

Because it exits right after it imports.

python -c "import numpy as np; print np.ones(5)"

Should print an array of ones and then return you to the prompt.

Skipper

From robert.kern at gmail.com Fri Aug 14 18:23:52 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Fri, 14 Aug 2009 17:23:52 -0500
Subject: [Numpy-discussion] passing "import numpy as np" as python command arg in 'doze
In-Reply-To: <546323.15295.qm@web52112.mail.re2.yahoo.com>
References: <546323.15295.qm@web52112.mail.re2.yahoo.com>
Message-ID: <3d375d730908141523m4fb61632m96ff54b1431c6cfd@mail.gmail.com>

On Fri, Aug 14, 2009 at 17:18, David Goldsmith wrote:
> Thanks, Skipper & Robert; perhaps I'm misunderstanding what should happen, but this doesn't appear to work in Windoze:
>
> Begin Terminal output:
>
> C:\Python26>python -c "import numpy as np"
>
> C:\Python26>
>
> End Terminal output.
>
> In other words, no error is returned, but python doesn't "stay running".

It's not supposed to, just like "python script.py" doesn't.
Instead, use

  python -i -c "import numpy as np"

http://docs.python.org/using/cmdline.html

If you just want to execute some things before entering the
interpreter every time, use PYTHONSTARTUP instead. Or IPython.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth." -- Umberto Eco

From d_l_goldsmith at yahoo.com Fri Aug 14 18:27:25 2009
From: d_l_goldsmith at yahoo.com (David Goldsmith)
Date: Fri, 14 Aug 2009 15:27:25 -0700 (PDT)
Subject: [Numpy-discussion] passing "import numpy as np" as python command arg in 'doze
In-Reply-To: <3d375d730908141523m4fb61632m96ff54b1431c6cfd@mail.gmail.com>
Message-ID: <80361.38596.qm@web52102.mail.re2.yahoo.com>

Excellent, thanks!

DG

--- On Fri, 8/14/09, Robert Kern wrote:

> From: Robert Kern
> Subject: Re: [Numpy-discussion] passing "import numpy as np" as python command arg in 'doze
> To: "Discussion of Numerical Python"
> Date: Friday, August 14, 2009, 3:23 PM
> On Fri, Aug 14, 2009 at 17:18, David
> Goldsmith
> wrote:
> > Thanks, Skipper & Robert; perhaps I'm
> misunderstanding what should happen, but this doesn't appear
> to work in Windoze:
> >
> > Begin Terminal output:
> >
> > C:\Python26>python -c "import numpy as np"
> >
> > C:\Python26>
> >
> > End Terminal output.
> >
> > In other words, no error is returned, but python
> doesn't "stay running".
> 
> It's not supposed to, just like "python script.py" doesn't.
> Instead, use
> 
>   python -i -c "import numpy as np"
> 
> http://docs.python.org/using/cmdline.html
> 
> If you just want to execute some things before entering
> the
> interpreter every time, use PYTHONSTARTUP instead. Or
> IPython.
> 
> -- 
> Robert Kern
> 
> "I have come to believe that the whole world is an enigma,
> a harmless
> enigma that is made terrible by our own mad attempt to
> interpret it as
> though it had an underlying truth."
> -- Umberto Eco
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>

__________________________________________________
Do You Yahoo!?
Tired of spam? Yahoo! Mail has the best spam protection around
http://mail.yahoo.com

From gokhansever at gmail.com Fri Aug 14 20:55:50 2009
From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_Sever?=)
Date: Fri, 14 Aug 2009 19:55:50 -0500
Subject: [Numpy-discussion] build_clib error during Enable 3_2_1 installation
In-Reply-To: <49d6b3500908141411h65e0f59fue358524f5d9e5ed7@mail.gmail.com>
References: <49d6b3500908131536m6e5c4646qa0d1d03a1ea6c8c8@mail.gmail.com>
	<49d6b3500908141411h65e0f59fue358524f5d9e5ed7@mail.gmail.com>
Message-ID: <49d6b3500908141755l635ac0c2s53737b392283a828@mail.gmail.com>

Fixed this using the suggestion from Robert Kern in the Nabble thread
"Is this a bug in numpy.distutils?":

On Tue, Aug 4, 2009 at 15:09, Matthew Brett wrote:
> File "/home/mb312/usr/local/lib/python2.5/site-packages/numpy/distutils/command/build_ext.py",
> line 74, in run
> self.library_dirs.append(build_clib.build_clib)
> UnboundLocalError: local variable 'build_clib' referenced before assignment
>
> because of the check for inplace builds above that, leaving build_clib
> undefined. I'm afraid I wasn't quite sure what the right thing to do
> was.

Probably just

build_clib = self.distribution.get_command_obj('build_clib')

after the log.warn().

This worked indeed.
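For anyone replaying that UnboundLocalError outside of numpy.distutils, here is a self-contained Python sketch of the failure mode and of the one-line fix quoted above. FakeDist and run are hypothetical stand-ins, not numpy.distutils internals; only the control-flow pattern is taken from the traceback.

import logging

log = logging.getLogger("build_ext")

class FakeDist(object):
    # Hypothetical stand-in for distutils' Distribution; the real class
    # also provides get_command_obj(), which returns a command object.
    def get_command_obj(self, name):
        class Cmd(object):
            build_clib = "build/temp.linux-i686-2.6"
        return Cmd()

def run(inplace, dist):
    library_dirs = []
    if not inplace:
        # normal path: the name 'build_clib' gets bound here
        build_clib = dist.get_command_obj("build_clib")
    else:
        log.warning("build_clib already run, it is too late to "
                    "ensure in-place build of build_clib")
        # The fix: without the next line, 'build_clib' stays unbound on
        # this branch and the append below raises UnboundLocalError,
        # which is exactly the crash in the traceback above.
        build_clib = dist.get_command_obj("build_clib")
    library_dirs.append(build_clib.build_clib)
    return library_dirs

print(run(inplace=True, dist=FakeDist()))

With the extra assignment in the else branch, the in-place path returns normally instead of raising.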
Thanks :) On Fri, Aug 14, 2009 at 4:11 PM, G?khan Sever wrote: > Hello, > > I fix the scipy installation issue. I usually checkout the whole ETS trunk > (ets co ETS) and do a "ets develop". After fullfilling the requirements it > was always successfully building the whole code-stack (well at least always > in Fedora 10). In this case it fails on Enable compilation step. > > Don't know this error: "build_clib already run, it is too late to ensure > in-place build of build_clib" is due to my system files or a conflict with > the installed Python tools. > > One more thing to note; there is build_clib.py file under > /usr/lib/python2.6/distutils/command. Might these be due to a conflict > between numpy's directive and Python's distutil? > > > On Thu, Aug 13, 2009 at 5:36 PM, G?khan Sever wrote: > >> For some unknown reason, ets develop can't pass the following compilation >> point: >> >> >> g++: enthought/kiva/agg/src/kiva_rect.cpp >> ar: adding 8 object files to build/temp.linux-i686-2.6/libkiva_src.a >> running build_ext >> build_clib already run, it is too late to ensure in-place build of >> build_clib >> Traceback (most recent call last): >> File "setup.py", line 327, in >> **config >> File "/home/gsever/Desktop/python-repo/numpy/numpy/distutils/core.py", >> line 186, in setup >> return old_setup(**new_attr) >> File "/usr/lib/python2.6/distutils/core.py", line 152, in setup >> dist.run_commands() >> File "/usr/lib/python2.6/distutils/dist.py", line 975, in run_commands >> self.run_command(cmd) >> File "/usr/lib/python2.6/distutils/dist.py", line 995, in run_command >> cmd_obj.run() >> File >> "/home/gsever/Desktop/python-repo/numpy/numpy/distutils/command/build_ext.py", >> line 74, in run >> self.library_dirs.append(build_clib.build_clib) >> UnboundLocalError: local variable 'build_clib' referenced before >> assignment >> Traceback (most recent call last): >> File "/usr/bin/ets", line 8, in >> load_entry_point('ETSProjectTools==0.6.0.dev-r24434', >> 'console_scripts', 'ets')() >> File >> "/usr/lib/python2.6/site-packages/ETSProjectTools-0.5.1-py2.6.egg/enthought/ets/ets.py", >> line 152, in main >> args.func(args, cfg) >> File >> "/usr/lib/python2.6/site-packages/ETSProjectTools-0.5.1-py2.6.egg/enthought/ets/develop.py", >> line 76, in main >> checkouts.perform(command, dry_run=args.dry_run) >> File >> "/usr/lib/python2.6/site-packages/ETSProjectTools-0.5.1-py2.6.egg/enthought/ets/tools/checkouts.py", >> line 126, in perform >> '%s' % project) >> RuntimeError: Unable to complete command for project: >> /home/gsever/Desktop/python-repo/ETS_3.3.1/Enable_3.2.1 >> >> >> Any suggestions? 
>> >> >> >> ################################################################## >> [gsever at ccn Desktop]$ python -c 'from numpy.f2py.diagnose import run; >> run()' >> ################################################################## >> ------ >> os.name='posix' >> ------ >> sys.platform='linux2' >> ------ >> sys.version: >> 2.6 (r26:66714, Jun 8 2009, 16:07:26) >> [GCC 4.4.0 20090506 (Red Hat 4.4.0-4)] >> ------ >> sys.prefix: >> /usr >> ------ >> >> sys.path=':/usr/lib/python2.6/site-packages/foolscap-0.4.2-py2.6.egg:/usr/lib/python2.6/site-packages/Twisted-8.2.0-py2.6-linux-i686.egg:/home/gsever/Desktop/python-repo/ipython:/home/gsever/Desktop/python-repo/numpy:/home/gsever/Desktop/python-repo/matplotlib/lib:/usr/lib/python2.6/site-packages/Sphinx-0.6.2-py2.6.egg:/usr/lib/python2.6/site-packages/docutils-0.5-py2.6.egg:/usr/lib/python2.6/site-packages/Jinja2-2.1.1-py2.6-linux-i686.egg:/usr/lib/python2.6/site-packages/Pygments-1.0-py2.6.egg:/usr/lib/python2.6/site-packages/xlwt-0.7.2-py2.6.egg:/usr/lib/python2.6/site-packages/spyder-1.0.0beta1-py2.6.egg:/usr/lib/python2.6/site-packages/PyOpenGL-3.0.0c1-py2.6.egg:/home/gsever/Desktop/python-repo/ETS_3.3.1/EnthoughtBase_3.0.4:/home/gsever/Desktop/python-repo/ETS_3.3.1/TraitsBackendWX_3.2.1:/home/gsever/Desktop/python-repo/ETS_3.3.1/ETSProjectTools_0.6.0:/home/gsever/Desktop/python-repo/ETS_3.3.1/Chaco_3.2.1:/home/gsever/Desktop/python-repo/ETS_3.3.1/ETS_3.3.1:/home/gsever/Desktop/python-repo/ETS_3.3.1/TraitsGUI_3.1.1:/home/gsever/Desktop/python-repo/ETS_3.3.1/Traits_3.2.1:/home/gsever/Desktop/python-repo/ETS_3.3.1/BlockCanvas_3.1.1:/usr/lib/python26.zip:/usr/lib/python2.6:/usr/lib/python2.6/plat-linux2:/usr/lib/python2.6/lib-tk:/usr/lib/python2.6/lib-old:/usr/lib/python2.6/lib-dynload:/usr/lib/python2.6/site-packages:/usr/lib/python2.6/site-packages/Numeric:/usr/lib/python2.6/site-packages/PIL:/usr/lib/python2.6/site-packages/gst-0.10:/usr/lib/python2.6/site-packages/gtk-2.0:/usr/lib/python2.6/site-packages:/usr/lib/python2.6/site-packages/wx-2.8-gtk2-unicode' >> ------ >> Failed to import numarray: No module named numarray >> Found Numeric version '24.2' in >> /usr/lib/python2.6/site-packages/Numeric/Numeric.pyc >> Found new numpy version '1.4.0.dev' in >> /home/gsever/Desktop/python-repo/numpy/numpy/__init__.pyc >> Found f2py2e version '2' in >> /home/gsever/Desktop/python-repo/numpy/numpy/f2py/f2py2e.pyc >> Found numpy.distutils version '0.4.0' in >> '/home/gsever/Desktop/python-repo/numpy/numpy/distutils/__init__.pyc' >> ------ >> Importing numpy.distutils.fcompiler ... 
ok >> ------ >> Checking availability of supported Fortran compilers: >> GnuFCompiler instance properties: >> archiver = ['/usr/bin/g77', '-cr'] >> compile_switch = '-c' >> compiler_f77 = ['/usr/bin/g77', '-g', '-Wall', '-fno-second- >> underscore', '-fPIC', '-O3', '-funroll-loops'] >> compiler_f90 = None >> compiler_fix = None >> libraries = ['g2c'] >> library_dirs = [] >> linker_exe = ['/usr/bin/g77', '-g', '-Wall', '-g', '-Wall'] >> linker_so = ['/usr/bin/g77', '-g', '-Wall', '-g', '-Wall', '- >> shared'] >> object_switch = '-o ' >> ranlib = ['/usr/bin/g77'] >> version = LooseVersion ('3.4.6') >> version_cmd = ['/usr/bin/g77', '--version'] >> Gnu95FCompiler instance properties: >> archiver = ['/usr/bin/gfortran', '-cr'] >> compile_switch = '-c' >> compiler_f77 = ['/usr/bin/gfortran', '-Wall', '-ffixed-form', '-fno- >> second-underscore', '-fPIC', '-O3', '-funroll-loops'] >> compiler_f90 = ['/usr/bin/gfortran', '-Wall', >> '-fno-second-underscore', >> '-fPIC', '-O3', '-funroll-loops'] >> compiler_fix = ['/usr/bin/gfortran', '-Wall', '-ffixed-form', '-fno- >> second-underscore', '-Wall', '-fno-second-underscore', >> '- >> fPIC', '-O3', '-funroll-loops'] >> libraries = ['gfortran'] >> library_dirs = [] >> linker_exe = ['/usr/bin/gfortran', '-Wall', '-Wall'] >> linker_so = ['/usr/bin/gfortran', '-Wall', '-Wall', '-shared'] >> object_switch = '-o ' >> ranlib = ['/usr/bin/gfortran'] >> version = LooseVersion ('4.4.0') >> version_cmd = ['/usr/bin/gfortran', '--version'] >> Fortran compilers found: >> --fcompiler=gnu GNU Fortran 77 compiler (3.4.6) >> --fcompiler=gnu95 GNU Fortran 95 compiler (4.4.0) >> Compilers available for this platform, but not found: >> --fcompiler=absoft Absoft Corp Fortran Compiler >> --fcompiler=compaq Compaq Fortran Compiler >> --fcompiler=g95 G95 Fortran Compiler >> --fcompiler=intel Intel Fortran Compiler for 32-bit apps >> --fcompiler=intele Intel Fortran Compiler for Itanium apps >> --fcompiler=intelem Intel Fortran Compiler for EM64T-based apps >> --fcompiler=lahey Lahey/Fujitsu Fortran 95 Compiler >> --fcompiler=nag NAGWare Fortran 95 Compiler >> --fcompiler=pg Portland Group Fortran Compiler >> --fcompiler=vast Pacific-Sierra Research Fortran 90 Compiler >> Compilers not available on this platform: >> --fcompiler=hpux HP Fortran 90 Compiler >> --fcompiler=ibm IBM XL Fortran Compiler >> --fcompiler=intelev Intel Visual Fortran Compiler for Itanium apps >> --fcompiler=intelv Intel Visual Fortran Compiler for 32-bit apps >> --fcompiler=mips MIPSpro Fortran Compiler >> --fcompiler=none Fake Fortran compiler >> --fcompiler=sun Sun or Forte Fortran 95 Compiler >> For compiler details, run 'config_fc --verbose' setup command. >> ------ >> Importing numpy.distutils.cpuinfo ... ok >> ------ >> CPU information: CPUInfoBase__get_nbits getNCPUs has_mmx has_sse >> has_sse2 has_sse3 has_ssse3 is_32bit is_Intel is_i686 ------ >> >> >> >> -- >> G?khan >> > > > > -- > G?khan > -- G?khan -------------- next part -------------- An HTML attachment was scrubbed... URL: From bbagger at gmail.com Sat Aug 15 10:14:21 2009 From: bbagger at gmail.com (Bent) Date: Sat, 15 Aug 2009 16:14:21 +0200 Subject: [Numpy-discussion] Can't import numpy - problem with lapack? Message-ID: <2e19719f0908150714safaef97k1512dc3d3a0da305@mail.gmail.com> Hi list I want to use Numpy with Gnuradio but I cannot make it work. 
The problem, which turns out to have nothing to do with Gnuradio, is an
undefined symbol as witnessed by these messages:

bent at yosie:~> python
Python 2.6 (r26:66714, Feb 3 2009, 20:52:03)
[GCC 4.3.2 [gcc-4_3-branch revision 141291]] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python2.6/site-packages/numpy/__init__.py", line 138,
in <module>
import linalg
File "/usr/lib/python2.6/site-packages/numpy/linalg/__init__.py",
line 47, in <module>
from linalg import *
File "/usr/lib/python2.6/site-packages/numpy/linalg/linalg.py", line
29, in <module>
from numpy.linalg import lapack_lite
ImportError: /usr/lib/python2.6/site-packages/numpy/linalg/lapack_lite.so:
undefined symbol: zgesdd_
>>> quit()
bent at yosie:~>

My distribution is openSUSE 11.1 and what really bugs me is that the
exact same setup works as expected on another PC also running openSUSE
11.1. As far as I can tell, the installed packages are exactly the
same, version numbers, build dates, etc, all are the same... The only
difference between the two PCs is that the one (the one that fails) is
a new install and the other is an upgrade from an older release.

I have googled high and low but not found anything that could bring me
further. I hope that this (collectively) all-knowledgeable list can
give me some pointers.

Kind regards,
Bent

From charlesr.harris at gmail.com Sat Aug 15 10:42:31 2009
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Sat, 15 Aug 2009 08:42:31 -0600
Subject: [Numpy-discussion] Can't import numpy - problem with lapack?
In-Reply-To: <2e19719f0908150714safaef97k1512dc3d3a0da305@mail.gmail.com>
References: <2e19719f0908150714safaef97k1512dc3d3a0da305@mail.gmail.com>
Message-ID: 

On Sat, Aug 15, 2009 at 8:14 AM, Bent wrote:

> Hi list
>
> I want to use Numpy with Gnuradio but I cannot make it work. The
> problem, which turns out to have nothing to do with Gnuradio, is an
> undefined symbol as witnessed by these messages:
>
> bent at yosie:~> python
> Python 2.6 (r26:66714, Feb 3 2009, 20:52:03)
> [GCC 4.3.2 [gcc-4_3-branch revision 141291]] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import numpy
> Traceback (most recent call last):
> File "<stdin>", line 1, in <module>
> File "/usr/lib/python2.6/site-packages/numpy/__init__.py", line 138,
> in <module>
> import linalg
> File "/usr/lib/python2.6/site-packages/numpy/linalg/__init__.py",
> line 47, in <module>
> from linalg import *
> File "/usr/lib/python2.6/site-packages/numpy/linalg/linalg.py", line
> 29, in <module>
> from numpy.linalg import lapack_lite
> ImportError: /usr/lib/python2.6/site-packages/numpy/linalg/lapack_lite.so:
> undefined symbol: zgesdd_
> >>> quit()
> bent at yosie:~>
>
> My distribution is openSUSE 11.1 and what really bugs me is that the
> exact same setup works as expected on another PC also running openSUSE
> 11.1. As far as I can tell, the installed packages are exactly the
> same, version numbers, build dates, etc, all are the same... The only
> difference between the two PCs is that the one (the one that fails) is
> a new install and the other is an upgrade from an older release.
>
> I have googled high and low but not found anything that could bring me
> further. I hope that this (collectively) all-knowledgeable list can
> give me some pointers.
>

That's probably due to the ATLAS library you have installed, SuSE has a
history of problems in that area. Have you checked the ATLAS versions also?
Is the hardware the same?

Chuck
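An aside for readers debugging the same kind of failure: before swapping packages around, it can help to confirm which shared libraries lapack_lite.so actually resolves against, and whether the missing routine (zgesdd_, LAPACK's complex divide-and-conquer SVD driver) is exported by them. Two standard one-liners, with the extension path taken from the traceback above; the liblapack path is only an example for your own system:

$ ldd /usr/lib/python2.6/site-packages/numpy/linalg/lapack_lite.so
$ nm -D /usr/lib/liblapack.so.3 | grep -i zgesdd

If the LAPACK library that ldd reports does not export zgesdd_, numpy was built against a more complete LAPACK than the one present at run time, consistent with the ATLAS diagnosis above.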
From bbagger at gmail.com Sat Aug 15 14:00:51 2009
From: bbagger at gmail.com (Bent)
Date: Sat, 15 Aug 2009 20:00:51 +0200
Subject: [Numpy-discussion] Can't import numpy - problem with lapack?
Message-ID: <2e19719f0908151100i72fb4f56w1a6b2abe9b89837@mail.gmail.com>

Charles R Harris wrote:
>
> That's probably due to the ATLAS library you have installed, SuSE has a history of problems in that area.
> Have you checked the ATLAS versions also? Is the hardware the same?
>

You certainly got something there. It turned out that the PC on which
import numpy works does not have anything named 'atlas' installed
(apart from a kernel module named atlas_btns) whereas my workstation had
a package named libatlas3-sse installed. When I removed this package,
everything fell into place and import numpy now works.

Thanks for the help
Bent

From stefan at sun.ac.za Sat Aug 15 20:46:10 2009
From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=)
Date: Sat, 15 Aug 2009 17:46:10 -0700
Subject: [Numpy-discussion] Cython BOF
In-Reply-To: <4A8560C9.40302@student.matnat.uio.no>
References: <4A8560C9.40302@student.matnat.uio.no>
Message-ID: <9457e7c80908151746s623273b2jbeb3ef8523dba768@mail.gmail.com>

2009/8/14 Dag Sverre Seljebotn :
> There's been some discussion earlier about how starting to write bigger
> parts of the NumPy/SciPy codebase in Cython could potentially lower the
> barrier of entry.

Also, it could address the 2.x to 3.0 C-API transition problems. I'm
not sure how we'd tackle it otherwise.

Stéfan

From sccolbert at gmail.com Sun Aug 16 20:01:56 2009
From: sccolbert at gmail.com (Chris Colbert)
Date: Sun, 16 Aug 2009 20:01:56 -0400
Subject: [Numpy-discussion] is there a better way to do this array repeat?
Message-ID: <7f014ea60908161701r711c0c79s6f97525fc29a4536@mail.gmail.com>

I don't think np.repeat will do what I want because the order needs to
be preserved.

I have a 1x3 array that I want to repeat n times and form an nx3 array
where each row is a copy of the original array.

So far I have this:

>>> import numpy as np
>>> a = np.arange(3)
>>> b = np.asarray([a]*10)
>>> b
array([[0, 1, 2],
       [0, 1, 2],
       [0, 1, 2],
       [0, 1, 2],
       [0, 1, 2],
       [0, 1, 2],
       [0, 1, 2],
       [0, 1, 2],
       [0, 1, 2],
       [0, 1, 2]])
>>> b.shape
(10, 3)

the issue is that my n may be very large and I'd rather not make that
list if I don't have to.

Cheers,

Chris

From robert.kern at gmail.com Sun Aug 16 20:07:23 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Sun, 16 Aug 2009 19:07:23 -0500
Subject: [Numpy-discussion] is there a better way to do this array repeat?
In-Reply-To: <7f014ea60908161701r711c0c79s6f97525fc29a4536@mail.gmail.com>
References: <7f014ea60908161701r711c0c79s6f97525fc29a4536@mail.gmail.com>
Message-ID: <3d375d730908161707n663ea8b7sa100b1eaff8dc261@mail.gmail.com>

On Sun, Aug 16, 2009 at 19:01, Chris Colbert wrote:
> I don't think np.repeat will do what I want because the order needs to
> be preserved.

"Order"?

> I have a 1x3 array that I want to repeat n times and form an nx3 array
> where each row is a copy of the original array.
>
> So far I have this:
>
>>>> import numpy as np
>>>> a = np.arange(3)
>>>> b = np.asarray([a]*10)
>>>> b
> array([[0, 1, 2],
>        [0, 1, 2],
>        [0, 1, 2],
>        [0, 1, 2],
>        [0, 1, 2],
>        [0, 1, 2],
>        [0, 1, 2],
>        [0, 1, 2],
>        [0, 1, 2],
>       
[0, 1, 2]]) >>>> b.shape > (10, 3) In [5]: a = arange(3) In [6]: repeat(a.reshape([1, -1]), 10, axis=0) Out[6]: array([[0, 1, 2], [0, 1, 2], [0, 1, 2], [0, 1, 2], [0, 1, 2], [0, 1, 2], [0, 1, 2], [0, 1, 2], [0, 1, 2], [0, 1, 2]]) -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From stefan at sun.ac.za Sun Aug 16 20:08:33 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Sun, 16 Aug 2009 17:08:33 -0700 Subject: [Numpy-discussion] is there a better way to do this array repeat? In-Reply-To: <7f014ea60908161701r711c0c79s6f97525fc29a4536@mail.gmail.com> References: <7f014ea60908161701r711c0c79s6f97525fc29a4536@mail.gmail.com> Message-ID: <9457e7c80908161708l53f0c0e3q7a2c37daa22d8a79@mail.gmail.com> 2009/8/16 Chris Colbert : > I have a 1x3 array that I want to repeat n times and form an nx3 array > where each row is a copy of the original array. a = np.arange(3)[None, :] np.repeat(a, 10, axis=0) Regards St?fan From sccolbert at gmail.com Sun Aug 16 20:30:58 2009 From: sccolbert at gmail.com (Chris Colbert) Date: Sun, 16 Aug 2009 20:30:58 -0400 Subject: [Numpy-discussion] is there a better way to do this array repeat? In-Reply-To: <9457e7c80908161708l53f0c0e3q7a2c37daa22d8a79@mail.gmail.com> References: <7f014ea60908161701r711c0c79s6f97525fc29a4536@mail.gmail.com> <9457e7c80908161708l53f0c0e3q7a2c37daa22d8a79@mail.gmail.com> Message-ID: <7f014ea60908161730r5bbe206ew8ac05dbc1619a2e8@mail.gmail.com> great, thanks! by order I meant repeat the array in order rather than repeat each element. On 8/16/09, St?fan van der Walt wrote: > 2009/8/16 Chris Colbert : >> I have a 1x3 array that I want to repeat n times and form an nx3 array >> where each row is a copy of the original array. > > a = np.arange(3)[None, :] > np.repeat(a, 10, axis=0) > > Regards > St?fan > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From kxroberto at googlemail.com Mon Aug 17 02:42:02 2009 From: kxroberto at googlemail.com (Robert) Date: Mon, 17 Aug 2009 08:42:02 +0200 Subject: [Numpy-discussion] memory address of array data? Message-ID: Is there a function to get the memory address (int) of (contigious) ndarray data on Python level - like array.array.buffer_info() ? I'd need it to pass it to a camera function. From stefan at sun.ac.za Mon Aug 17 02:50:44 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Sun, 16 Aug 2009 23:50:44 -0700 Subject: [Numpy-discussion] memory address of array data? In-Reply-To: References: Message-ID: <9457e7c80908162350u2bed6cd6m6a2b5b53f278b9db@mail.gmail.com> 2009/8/16 Robert : > Is there a function to get the memory address (int) of > (contigious) ndarray data on Python level - like > array.array.buffer_info() ? > I'd need it to pass it to a camera function. Have a look at the array interface: x.__array_interface__['data'] Regards St?fan From lciti at essex.ac.uk Mon Aug 17 05:24:48 2009 From: lciti at essex.ac.uk (Citi, Luca) Date: Mon, 17 Aug 2009 10:24:48 +0100 Subject: [Numpy-discussion] is there a better way to do this arrayrepeat? 
References: <7f014ea60908161701r711c0c79s6f97525fc29a4536@mail.gmail.com><9457e7c80908161708l53f0c0e3q7a2c37daa22d8a79@mail.gmail.com> <7f014ea60908161730r5bbe206ew8ac05dbc1619a2e8@mail.gmail.com>
Message-ID: <3DA3B328CBC48B4EBB88484B8A5EA19106AF9E6E@sernt14.essex.ac.uk>

As you stress on "repeat the array ... rather than repeat each element",
you may want to consider tile as well:

>>> np.tile(a, [10,1])
array([[0, 1, 2],
       [0, 1, 2],
       [0, 1, 2],
       [0, 1, 2],
       [0, 1, 2],
       [0, 1, 2],
       [0, 1, 2],
       [0, 1, 2],
       [0, 1, 2],
       [0, 1, 2]])

From gael.varoquaux at normalesup.org Mon Aug 17 10:44:04 2009
From: gael.varoquaux at normalesup.org (Gael Varoquaux)
Date: Mon, 17 Aug 2009 16:44:04 +0200
Subject: [Numpy-discussion] memory address of array data?
In-Reply-To: <9457e7c80908162350u2bed6cd6m6a2b5b53f278b9db@mail.gmail.com>
References: <9457e7c80908162350u2bed6cd6m6a2b5b53f278b9db@mail.gmail.com>
Message-ID: <20090817144404.GE30571@phare.normalesup.org>

On Sun, Aug 16, 2009 at 11:50:44PM -0700, Stéfan van der Walt wrote:
> 2009/8/16 Robert :
> > Is there a function to get the memory address (int) of
> > (contigious) ndarray data on Python level - like
> > array.array.buffer_info() ?
> > I'd need it to pass it to a camera function.

> Have a look at the array interface:

> x.__array_interface__['data']

Also, it might be interesting to have a look at the ctypes support:
http://www.scipy.org/Cookbook/Ctypes

Gaël

From mforbes at physics.ubc.ca Mon Aug 17 10:53:54 2009
From: mforbes at physics.ubc.ca (Michael McNeil Forbes)
Date: Mon, 17 Aug 2009 08:53:54 -0600
Subject: [Numpy-discussion] Specifying Index Programmatically
In-Reply-To: 
References: <4A7E487B.70906@wartburg.edu>
Message-ID: <125661EB-BB17-4546-B02A-CB4DF783219A@physics.ubc.ca>

There is also numpy.s_:

inds = np.s_[...,2,:]
z[inds]

(Though there are some problems with negative indices: see for example
http://www.mail-archive.com/numpy-discussion at scipy.org/msg18245.html)

On 8 Aug 2009, at 10:02 PM, T J wrote:
> On Sat, Aug 8, 2009 at 8:54 PM, Neil Martinsen-Burrell
> wrote:
>>
>> The ellipsis is a built-in python constant called Ellipsis. The
>> colon
>> is a slice object, again a python built-in, called with None as an
>> argument. So, z[...,2,:] == z[Ellipsis,2,slice(None)].

From mforbes at physics.ubc.ca Mon Aug 17 11:12:00 2009
From: mforbes at physics.ubc.ca (Michael McNeil Forbes)
Date: Mon, 17 Aug 2009 09:12:00 -0600
Subject: [Numpy-discussion] IndexExpression bug?
In-Reply-To: <3d375d730906051512r584f253dx864c51ceed6f40cf@mail.gmail.com>
References: <3d375d730906051512r584f253dx864c51ceed6f40cf@mail.gmail.com>
Message-ID: <600DA126-1163-489B-9335-95DE6557EA56@physics.ubc.ca>

Submitted as ticket 1196
http://projects.scipy.org/numpy/ticket/1196

On 5 Jun 2009, at 4:12 PM, Robert Kern wrote:
> On Fri, Jun 5, 2009 at 16:14, Michael McNeil Forbes
> wrote:
>> >>> np.array([0,1,2,3])[1:-1]
>> array([1, 2])
>>
>> but
>>
>> >>> np.array([0,1,2,3])[np.s_[1:-1]]
>> array([1, 2, 3])
>> >>> np.array([0,1,2,3])[np.index_exp[1:-1]]
>> array([1, 2, 3])
...
> I think that getting rid of __getslice__ and __len__ should work
> better. I don't really understand what the logic was behind including
> them in the first place, though. I might be missing something.
...

From sccolbert at gmail.com Mon Aug 17 12:20:14 2009
From: sccolbert at gmail.com (Chris Colbert)
Date: Mon, 17 Aug 2009 12:20:14 -0400
Subject: [Numpy-discussion] is there a better way to do this arrayrepeat?
In-Reply-To: <3DA3B328CBC48B4EBB88484B8A5EA19106AF9E6E@sernt14.essex.ac.uk> References: <7f014ea60908161701r711c0c79s6f97525fc29a4536@mail.gmail.com> <9457e7c80908161708l53f0c0e3q7a2c37daa22d8a79@mail.gmail.com> <7f014ea60908161730r5bbe206ew8ac05dbc1619a2e8@mail.gmail.com> <3DA3B328CBC48B4EBB88484B8A5EA19106AF9E6E@sernt14.essex.ac.uk> Message-ID: <7f014ea60908170920k18bd56bep9484e1d73500f677@mail.gmail.com> That's exactly it. Thanks! On Mon, Aug 17, 2009 at 5:24 AM, Citi, Luca wrote: > As you stress on "repeat the array ... rather than repeat each element", > you may want to consider tile as well: > >>>> np.tile(a, [10,1]) > array([[0, 1, 2], > ? ? ? [0, 1, 2], > ? ? ? [0, 1, 2], > ? ? ? [0, 1, 2], > ? ? ? [0, 1, 2], > ? ? ? [0, 1, 2], > ? ? ? [0, 1, 2], > ? ? ? [0, 1, 2], > ? ? ? [0, 1, 2], > ? ? ? [0, 1, 2]]) > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From jonathan.taylor at utoronto.ca Mon Aug 17 13:42:32 2009 From: jonathan.taylor at utoronto.ca (Jonathan Taylor) Date: Mon, 17 Aug 2009 13:42:32 -0400 Subject: [Numpy-discussion] Strange crash in numpy.linalg.lstsq. Message-ID: <463e11f90908171042u1457b5d6l6c70a46dbbd1418e@mail.gmail.com> Hi, I am getting a strange crash in numpy.linalg.lstsq. I have put the code that causes the crash along with two data files on my website at: http://www.cs.toronto.edu/~jtaylor/crash/ I would be interested to know if this bug can be duplicated and/or if anyone has any suggestions as to why: import numpy as np A = np.load('A.npy') b = np.load('b.npy') rc = np.linalg.lstsq(A,b) produces: *** glibc detected *** /usr/bin/python: free(): invalid next size (normal): 0x091793c0 *** ======= Backtrace: ========= /lib/tls/i686/cmov/libc.so.6[0xb7dc7a85] /lib/tls/i686/cmov/libc.so.6(cfree+0x90)[0xb7dcb4f0] /u/jtaylor/lib/python2.5/site-packages/numpy/core/multiarray.so[0xb795403e] /usr/bin/python[0x811247a] /usr/bin/python(PyEval_EvalCodeEx+0x323)[0x80cae33] /usr/bin/python(PyEval_EvalFrameEx+0x565e)[0x80c93fe] /usr/bin/python(PyEval_EvalCodeEx+0x6e7)[0x80cb1f7] /usr/bin/python(PyEval_EvalCode+0x57)[0x80cb347] /usr/bin/python(PyRun_FileExFlags+0xf8)[0x80ea818] /usr/bin/python[0x80c1f5a] /usr/bin/python(PyObject_Call+0x27)[0x805cb97] /usr/bin/python(PyEval_EvalFrameEx+0x4064)[0x80c7e04] /usr/bin/python(PyEval_EvalCodeEx+0x6e7)[0x80cb1f7] /usr/bin/python[0x8113696] /usr/bin/python(PyObject_Call+0x27)[0x805cb97] /usr/bin/python(PyEval_EvalFrameEx+0x4064)[0x80c7e04] /usr/bin/python(PyEval_EvalCodeEx+0x6e7)[0x80cb1f7] /usr/bin/python(PyEval_EvalFrameEx+0x565e)[0x80c93fe] /usr/bin/python(PyEval_EvalCodeEx+0x6e7)[0x80cb1f7] /usr/bin/python[0x8113696] /usr/bin/python(PyObject_Call+0x27)[0x805cb97] /usr/bin/python[0x8062bfb] /usr/bin/python(PyObject_Call+0x27)[0x805cb97] /usr/bin/python(PyEval_EvalFrameEx+0x3d07)[0x80c7aa7] /usr/bin/python(PyEval_EvalCodeEx+0x6e7)[0x80cb1f7] /usr/bin/python(PyEval_EvalFrameEx+0x565e)[0x80c93fe] /usr/bin/python(PyEval_EvalFrameEx+0x5945)[0x80c96e5] /usr/bin/python(PyEval_EvalCodeEx+0x6e7)[0x80cb1f7] /usr/bin/python(PyEval_EvalFrameEx+0x6d09)[0x80caaa9] /usr/bin/python(PyEval_EvalCodeEx+0x6e7)[0x80cb1f7] /usr/bin/python(PyEval_EvalFrameEx+0x565e)[0x80c93fe] /usr/bin/python(PyEval_EvalCodeEx+0x6e7)[0x80cb1f7] /usr/bin/python(PyEval_EvalFrameEx+0x565e)[0x80c93fe] /usr/bin/python(PyEval_EvalFrameEx+0x5945)[0x80c96e5] /usr/bin/python(PyEval_EvalCodeEx+0x6e7)[0x80cb1f7] 
/usr/bin/python(PyEval_EvalFrameEx+0x565e)[0x80c93fe] /usr/bin/python(PyEval_EvalCodeEx+0x6e7)[0x80cb1f7] /usr/bin/python(PyEval_EvalFrameEx+0x565e)[0x80c93fe] /usr/bin/python(PyEval_EvalCodeEx+0x6e7)[0x80cb1f7] /usr/bin/python(PyEval_EvalFrameEx+0x565e)[0x80c93fe] /usr/bin/python(PyEval_EvalCodeEx+0x6e7)[0x80cb1f7] /usr/bin/python(PyEval_EvalCode+0x57)[0x80cb347] /usr/bin/python(PyRun_FileExFlags+0xf8)[0x80ea818] /usr/bin/python(PyRun_SimpleFileExFlags+0x199)[0x80eaab9] /usr/bin/python(Py_Main+0xa35)[0x8059335] /usr/bin/python(main+0x22)[0x80587f2] /lib/tls/i686/cmov/libc.so.6(__libc_start_main+0xe0)[0xb7d72450] /usr/bin/python[0x8058761] ======= Memory map: ======== 08048000-08140000 r-xp 00000000 08:06 83501 /usr/bin/python2.5 08140000-08165000 rw-p 000f7000 08:06 83501 /usr/bin/python2.5 08165000-0919a000 rw-p 08165000 00:00 0 [heap] b5200000-b5221000 rw-p b5200000 00:00 0 b5221000-b5300000 ---p b5221000 00:00 0 b53fc000-b5499000 r-xp 00000000 00:1a 552170 /h/44/jtaylor/lib/python2.5/site-packages/Cython/Compiler/Parsing.so b5499000-b54a2000 rw-p 0009d000 00:1a 552170 /h/44/jtaylor/lib/python2.5/site-packages/Cython/Compiler/Parsing.so b54a2000-b5624000 rw-p b54a2000 00:00 0 b5624000-b568f000 r-xp 00000000 00:1a 553542 /h/44/jtaylor/build/matplotlib/lib/matplotlib/backends/_backend_agg.so b568f000-b5691000 rw-p 0006a000 00:1a 553542 /h/44/jtaylor/build/matplotlib/lib/matplotlib/backends/_backend_agg.so b5691000-b56f6000 r-xp 00000000 08:06 90831 /usr/lib/python2.5/lib-dynload/unicodedata.so b56f6000-b5705000 rw-p 00065000 08:06 90831 /usr/lib/python2.5/lib-dynload/unicodedata.so b5705000-b5725000 r-xp 00000000 00:1a 553545 /h/44/jtaylor/build/matplotlib/lib/matplotlib/backends/_tkagg.so b5725000-b5726000 rw-p 00020000 00:1a 553545 /h/44/jtaylor/build/matplotlib/lib/matplotlib/backends/_tkagg.so b5726000-b5727000 ---p b5726000 00:00 0 b5727000-b5f27000 rwxp b5727000 00:00 0 b5f27000-b5f3e000 r-xp 00000000 08:06 85532 /usr/lib/libxcb.so.1.0.0 b5f3e000-b5f3f000 rw-p 00016000 08:06 85532 /usr/lib/libxcb.so.1.0.0 b5f3f000-b5f53000 r-xp 00000000 08:06 1187870 /lib/tls/i686/cmov/ libnsl-2.7.so b5f53000-b5f55000 rw-p 00013000 08:06 1187870 /lib/tls/i686/cmov/ libnsl-2.7.so b5f55000-b5f57000 rw-p b5f55000 00:00 0 b5f57000-b603b000 r-xp 00000000 08:06 85536 /usr/lib/libX11.so.6.2.0 b603b000-b603e000 rw-p 000e4000 08:06 85536 /usr/lib/libX11.so.6.2.0 b603e000-b60e7000 r-xp 00000000 08:06 85098 /usr/lib/libtcl8.4.so.0 b60e7000-b60f1000 rw-p 000a8000 08:06 85098 /usr/lib/libtcl8.4.so.0 b60f1000-b60f2000 rw-p b60f1000 00:00 0 b60f2000-b61c4000 r-xp 00000000 08:06 85102 /usr/lib/libtk8.4.so.0 b61c4000-b61cf000 rw-p 000d2000 08:06 85102 /usr/lib/libtk8.4.so.0 b61cf000-b61d0000 rw-p b61cf000 00:00 0 b61d0000-b62a8000 r-xp 00000000 08:06 85103 /usr/lib/libBLT.2.4.so.8.4 b62a8000-b62b9000 rw-p 000d8000 08:06 85103 /usr/lib/libBLT.2.4.so.8.4 b62b9000-b62ba000 rw-p b62b9000 00:00 0 b62ba000-b62dc000 r-xp 00000000 08:06 180469 /usr/lib/libpng12.so.0.15.0 b62dc000-b62dd000 rw-p 00022000 08:06 180469 /usr/lib/libpng12.so.0.15.0 b62f5000-b62f6000 rw-p b62f5000 00:00 0 b62f6000-b631d000 r-xp 00000000 00:1a 553544 /h/44/jtaylor/build/matplotlib/lib/matplotlib/_png.so b631d000-b631e000 rw-p 00027000 00:1a 553544 /h/44/jtaylor/build/matplotlib/lib/matplotlib/_png.so b631e000-b6367000 r-xp 00000000 00:1a 553543 /h/44/jtaylor/build/matplotlib/lib/matplotlib/_image.so b6367000-b6369000 rw-p 00049000 00:1a 553543 /h/44/jtaylor/build/matplotlib/lib/matplotlib/_image.so b6369000-b63d3000 r-xp 00000000 08:06 83795 
/usr/lib/libfreetype.so.6.3.16 b63d3000-b63d6000 rw-p 0006a000 08:06 83795 /usr/lib/libfreetype.so.6.3.16 b63d6000-b6424000 r-xp 00000000 00:1a 553535 /h/44/jtaylor/build/matplotlib/lib/matplotlib/ft2font.so b6424000-b6427000 rw-p 0004e000 00:1a 553535 /h/44/jtaylor/build/matplotlib/lib/matplotlib/ft2font.so b6427000-b650f000 r-xp 00000000 08:06 88506 /usr/lib/libstdc++.so.6.0.9 b650f000-b6512000 r--p 000e8000 08:06 88506 /usr/lib/libstdc++.so.6.0.9 b6512000-b6514000 rw-p 000eb000 08:06 88506 /usr/lib/libstdc++.so.6.0.9 b6514000-b651a000 rw-p b6514000 00:00 0 b651e000-b6528000 r-xp 00000000 08:06 313979 /usr/lib/python2.5/lib-dynload/_tkinter.so b6528000-b6529000 rw-p 0000a000 08:06 313979 /usr/lib/python2.5/lib-dynload/_tkinter.so b6529000-b652d000 r-xp 00000000 08:06 90832 /usr/lib/python2.5/lib-dynload/zlib.so b652d000-b652e000 rw-p 00004000 08:06 90832 /usr/lib/python2.5/lib-dynload/zlib.so b652e000-b6532000 r-xp 00000000 00:1a 553538 /h/44/jtaylor/build/matplotlib/lib/matplotlib/_cntr.so b6532000-b6533000 rw-p 00004000 00:1a 553538 /h/44/jtaylor/build/matplotlib/lib/matplotlib/_cntr.so b6533000-b6577000 r-xp 00000000 00:1a 553541 /h/44/jtaylor/build/matplotlib/lib/matplotlib/_path.so b6577000-b6578000 rw-p 00044000 00:1a 553541 /h/44/jtaylor/build/matplotlib/lib/matplotlib/_path.so b6578000-b6587000 r-xp 00000000 08:06 92419 /usr/lib/python2.5/lib-dynload/datetime.so b6587000-b658a000 rw-p 0000e000 08:06 92419 /usr/lib/python2.5/lib-dynload/datetime.so b658a000-b65b9000 r-xp 00000000 00:1a 532925 /h/44/jtaylor/lib/python2.5/site-packages/numpy/random/mtrand.so b65b9000-b65cb000 rw-p 0002e000 00:1a 532925 /h/44/jtaylor/lib/python2.5/site-packages/numpy/random/mtrand.so b65cb000-b6923000 r-xp 00000000 08:06 517267 /usr/lib/atlas/libblas.so.3.0 b6923000-b6927000 rw-p 00358000 08:06 517267 /usr/lib/atlas/libblas.so.3.0 b6927000-b6e6f000 r-xp 00000000 08:06 517268 /usr/lib/atlas/liblapack.so.3.0 b6e6f000-b6e72000 rw-p 00548000 08:06 517268 /usr/lib/atlas/liblapack.so.3.0 b6e72000-b6f76000 rw-p b6e72000 00:00 0 b6f76000-b6f7a000 r-xp 00000000 08:06 85530 /usr/lib/libXdmcp.so.6.0.0 b6f7a000-b6f7b000 rw-p 00003000 08:06 85530 /usr/lib/libXdmcp.so.6.0.0 b6f7b000-b6f7f000 r-xp 00000000 08:06 92437 /usr/lib/python2.5/lib-dynload/_csv.so b6f7f000-b6f81000 rw-p 00004000 08:06 92437 /usr/lib/python2.5/lib-dynload/_csv.so b6f81000-b6f84000 r-xp 00000000 08:06 92427 /usr/lib/python2.5/lib-dynload/_locale.so b6f84000-b6f85000 rw-p 00003000 08:06 92427 /usr/lib/python2.5/lib-dynload/_locale.so b6f85000-b6f8e000 r-xp 00000000 00:1a 533113 /h/44/jtaylor/lib/python2.5/site-packages/numpy/fft/fftpack_lite.so b6f8e000-b6f8f000 rw-p 00008000 00:1a 533113 /h/44/jtaylor/lib/python2.5/site-packages/numpy/fft/fftpack_lite.so b6f8f000-b6fad000 r-xp 00000000 00:1a 533028 /h/44/jtaylor/lib/python2.5/site-packages/numpy/core/scalarmath.so b6fad000-b6fae000 rw-p 0001e000 00:1a 533028 /h/44/jtaylor/lib/python2.Aborted -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Mon Aug 17 13:55:08 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 17 Aug 2009 13:55:08 -0400 Subject: [Numpy-discussion] Strange crash in numpy.linalg.lstsq. 
In-Reply-To: <463e11f90908171042u1457b5d6l6c70a46dbbd1418e@mail.gmail.com> References: <463e11f90908171042u1457b5d6l6c70a46dbbd1418e@mail.gmail.com> Message-ID: <1cd32cbb0908171055i1023138dx7861f32a9fe88b33@mail.gmail.com> On Mon, Aug 17, 2009 at 1:42 PM, Jonathan Taylor wrote: > Hi, > > I am getting a strange crash in numpy.linalg.lstsq.? I have put the code > that causes the crash along with two data files on my website at: > > http://www.cs.toronto.edu/~jtaylor/crash/ > > I would be interested to know if this bug can be duplicated and/or if anyone > has any suggestions as to why: > > import numpy as np > A = np.load('A.npy') > b = np.load('b.npy') > rc = np.linalg.lstsq(A,b) > > produces: > > *** glibc detected *** /usr/bin/python: free(): invalid next size (normal): > 0x091793c0 *** > ======= Backtrace: ========= > /lib/tls/i686/cmov/libc.so.6[0xb7dc7a85] > /lib/tls/i686/cmov/libc.so.6(cfree+0x90)[0xb7dcb4f0] > /u/jtaylor/lib/python2.5/site-packages/numpy/core/multiarray.so[0xb795403e] > /usr/bin/python[0x811247a] > /usr/bin/python(PyEval_EvalCodeEx+0x323)[0x80cae33] > /usr/bin/python(PyEval_EvalFrameEx+0x565e)[0x80c93fe] > /usr/bin/python(PyEval_EvalCodeEx+0x6e7)[0x80cb1f7] > /usr/bin/python(PyEval_EvalCode+0x57)[0x80cb347] > /usr/bin/python(PyRun_FileExFlags+0xf8)[0x80ea818] > /usr/bin/python[0x80c1f5a] > /usr/bin/python(PyObject_Call+0x27)[0x805cb97] > /usr/bin/python(PyEval_EvalFrameEx+0x4064)[0x80c7e04] > /usr/bin/python(PyEval_EvalCodeEx+0x6e7)[0x80cb1f7] > /usr/bin/python[0x8113696] > /usr/bin/python(PyObject_Call+0x27)[0x805cb97] > /usr/bin/python(PyEval_EvalFrameEx+0x4064)[0x80c7e04] > /usr/bin/python(PyEval_EvalCodeEx+0x6e7)[0x80cb1f7] > /usr/bin/python(PyEval_EvalFrameEx+0x565e)[0x80c93fe] > /usr/bin/python(PyEval_EvalCodeEx+0x6e7)[0x80cb1f7] > /usr/bin/python[0x8113696] > /usr/bin/python(PyObject_Call+0x27)[0x805cb97] > /usr/bin/python[0x8062bfb] > /usr/bin/python(PyObject_Call+0x27)[0x805cb97] > /usr/bin/python(PyEval_EvalFrameEx+0x3d07)[0x80c7aa7] > /usr/bin/python(PyEval_EvalCodeEx+0x6e7)[0x80cb1f7] > /usr/bin/python(PyEval_EvalFrameEx+0x565e)[0x80c93fe] > /usr/bin/python(PyEval_EvalFrameEx+0x5945)[0x80c96e5] > /usr/bin/python(PyEval_EvalCodeEx+0x6e7)[0x80cb1f7] > /usr/bin/python(PyEval_EvalFrameEx+0x6d09)[0x80caaa9] > /usr/bin/python(PyEval_EvalCodeEx+0x6e7)[0x80cb1f7] > /usr/bin/python(PyEval_EvalFrameEx+0x565e)[0x80c93fe] > /usr/bin/python(PyEval_EvalCodeEx+0x6e7)[0x80cb1f7] > /usr/bin/python(PyEval_EvalFrameEx+0x565e)[0x80c93fe] > /usr/bin/python(PyEval_EvalFrameEx+0x5945)[0x80c96e5] > /usr/bin/python(PyEval_EvalCodeEx+0x6e7)[0x80cb1f7] > /usr/bin/python(PyEval_EvalFrameEx+0x565e)[0x80c93fe] > /usr/bin/python(PyEval_EvalCodeEx+0x6e7)[0x80cb1f7] > /usr/bin/python(PyEval_EvalFrameEx+0x565e)[0x80c93fe] > /usr/bin/python(PyEval_EvalCodeEx+0x6e7)[0x80cb1f7] > /usr/bin/python(PyEval_EvalFrameEx+0x565e)[0x80c93fe] > /usr/bin/python(PyEval_EvalCodeEx+0x6e7)[0x80cb1f7] > /usr/bin/python(PyEval_EvalCode+0x57)[0x80cb347] > /usr/bin/python(PyRun_FileExFlags+0xf8)[0x80ea818] > /usr/bin/python(PyRun_SimpleFileExFlags+0x199)[0x80eaab9] > /usr/bin/python(Py_Main+0xa35)[0x8059335] > /usr/bin/python(main+0x22)[0x80587f2] > /lib/tls/i686/cmov/libc.so.6(__libc_start_main+0xe0)[0xb7d72450] > /usr/bin/python[0x8058761] > ======= Memory map: ======== > 08048000-08140000 r-xp 00000000 08:06 83501????? /usr/bin/python2.5 > 08140000-08165000 rw-p 000f7000 08:06 83501????? /usr/bin/python2.5 > 08165000-0919a000 rw-p 08165000 00:00 0????????? 
[heap] > b5200000-b5221000 rw-p b5200000 00:00 0 > b5221000-b5300000 ---p b5221000 00:00 0 > b53fc000-b5499000 r-xp 00000000 00:1a 552170 > /h/44/jtaylor/lib/python2.5/site-packages/Cython/Compiler/Parsing.so > b5499000-b54a2000 rw-p 0009d000 00:1a 552170 > /h/44/jtaylor/lib/python2.5/site-packages/Cython/Compiler/Parsing.so > b54a2000-b5624000 rw-p b54a2000 00:00 0 > b5624000-b568f000 r-xp 00000000 00:1a 553542 > /h/44/jtaylor/build/matplotlib/lib/matplotlib/backends/_backend_agg.so > b568f000-b5691000 rw-p 0006a000 00:1a 553542 > /h/44/jtaylor/build/matplotlib/lib/matplotlib/backends/_backend_agg.so > b5691000-b56f6000 r-xp 00000000 08:06 90831 > /usr/lib/python2.5/lib-dynload/unicodedata.so > b56f6000-b5705000 rw-p 00065000 08:06 90831 > /usr/lib/python2.5/lib-dynload/unicodedata.so > b5705000-b5725000 r-xp 00000000 00:1a 553545 > /h/44/jtaylor/build/matplotlib/lib/matplotlib/backends/_tkagg.so > b5725000-b5726000 rw-p 00020000 00:1a 553545 > /h/44/jtaylor/build/matplotlib/lib/matplotlib/backends/_tkagg.so > b5726000-b5727000 ---p b5726000 00:00 0 > b5727000-b5f27000 rwxp b5727000 00:00 0 > b5f27000-b5f3e000 r-xp 00000000 08:06 85532????? /usr/lib/libxcb.so.1.0.0 > b5f3e000-b5f3f000 rw-p 00016000 08:06 85532????? /usr/lib/libxcb.so.1.0.0 > b5f3f000-b5f53000 r-xp 00000000 08:06 1187870 > /lib/tls/i686/cmov/libnsl-2.7.so > b5f53000-b5f55000 rw-p 00013000 08:06 1187870 > /lib/tls/i686/cmov/libnsl-2.7.so > b5f55000-b5f57000 rw-p b5f55000 00:00 0 > b5f57000-b603b000 r-xp 00000000 08:06 85536????? /usr/lib/libX11.so.6.2.0 > b603b000-b603e000 rw-p 000e4000 08:06 85536????? /usr/lib/libX11.so.6.2.0 > b603e000-b60e7000 r-xp 00000000 08:06 85098????? /usr/lib/libtcl8.4.so.0 > b60e7000-b60f1000 rw-p 000a8000 08:06 85098????? /usr/lib/libtcl8.4.so.0 > b60f1000-b60f2000 rw-p b60f1000 00:00 0 > b60f2000-b61c4000 r-xp 00000000 08:06 85102????? /usr/lib/libtk8.4.so.0 > b61c4000-b61cf000 rw-p 000d2000 08:06 85102????? /usr/lib/libtk8.4.so.0 > b61cf000-b61d0000 rw-p b61cf000 00:00 0 > b61d0000-b62a8000 r-xp 00000000 08:06 85103????? /usr/lib/libBLT.2.4.so.8.4 > b62a8000-b62b9000 rw-p 000d8000 08:06 85103????? /usr/lib/libBLT.2.4.so.8.4 > b62b9000-b62ba000 rw-p b62b9000 00:00 0 > b62ba000-b62dc000 r-xp 00000000 08:06 180469???? /usr/lib/libpng12.so.0.15.0 > b62dc000-b62dd000 rw-p 00022000 08:06 180469???? /usr/lib/libpng12.so.0.15.0 > b62f5000-b62f6000 rw-p b62f5000 00:00 0 > b62f6000-b631d000 r-xp 00000000 00:1a 553544 > /h/44/jtaylor/build/matplotlib/lib/matplotlib/_png.so > b631d000-b631e000 rw-p 00027000 00:1a 553544 > /h/44/jtaylor/build/matplotlib/lib/matplotlib/_png.so > b631e000-b6367000 r-xp 00000000 00:1a 553543 > /h/44/jtaylor/build/matplotlib/lib/matplotlib/_image.so > b6367000-b6369000 rw-p 00049000 00:1a 553543 > /h/44/jtaylor/build/matplotlib/lib/matplotlib/_image.so > b6369000-b63d3000 r-xp 00000000 08:06 83795 > /usr/lib/libfreetype.so.6.3.16 > b63d3000-b63d6000 rw-p 0006a000 08:06 83795 > /usr/lib/libfreetype.so.6.3.16 > b63d6000-b6424000 r-xp 00000000 00:1a 553535 > /h/44/jtaylor/build/matplotlib/lib/matplotlib/ft2font.so > b6424000-b6427000 rw-p 0004e000 00:1a 553535 > /h/44/jtaylor/build/matplotlib/lib/matplotlib/ft2font.so > b6427000-b650f000 r-xp 00000000 08:06 88506????? /usr/lib/libstdc++.so.6.0.9 > b650f000-b6512000 r--p 000e8000 08:06 88506????? /usr/lib/libstdc++.so.6.0.9 > b6512000-b6514000 rw-p 000eb000 08:06 88506????? 
/usr/lib/libstdc++.so.6.0.9 > b6514000-b651a000 rw-p b6514000 00:00 0 > b651e000-b6528000 r-xp 00000000 08:06 313979 > /usr/lib/python2.5/lib-dynload/_tkinter.so > b6528000-b6529000 rw-p 0000a000 08:06 313979 > /usr/lib/python2.5/lib-dynload/_tkinter.so > b6529000-b652d000 r-xp 00000000 08:06 90832 > /usr/lib/python2.5/lib-dynload/zlib.so > b652d000-b652e000 rw-p 00004000 08:06 90832 > /usr/lib/python2.5/lib-dynload/zlib.so > b652e000-b6532000 r-xp 00000000 00:1a 553538 > /h/44/jtaylor/build/matplotlib/lib/matplotlib/_cntr.so > b6532000-b6533000 rw-p 00004000 00:1a 553538 > /h/44/jtaylor/build/matplotlib/lib/matplotlib/_cntr.so > b6533000-b6577000 r-xp 00000000 00:1a 553541 > /h/44/jtaylor/build/matplotlib/lib/matplotlib/_path.so > b6577000-b6578000 rw-p 00044000 00:1a 553541 > /h/44/jtaylor/build/matplotlib/lib/matplotlib/_path.so > b6578000-b6587000 r-xp 00000000 08:06 92419 > /usr/lib/python2.5/lib-dynload/datetime.so > b6587000-b658a000 rw-p 0000e000 08:06 92419 > /usr/lib/python2.5/lib-dynload/datetime.so > b658a000-b65b9000 r-xp 00000000 00:1a 532925 > /h/44/jtaylor/lib/python2.5/site-packages/numpy/random/mtrand.so > b65b9000-b65cb000 rw-p 0002e000 00:1a 532925 > /h/44/jtaylor/lib/python2.5/site-packages/numpy/random/mtrand.so > b65cb000-b6923000 r-xp 00000000 08:06 517267 > /usr/lib/atlas/libblas.so.3.0 > b6923000-b6927000 rw-p 00358000 08:06 517267 > /usr/lib/atlas/libblas.so.3.0 > b6927000-b6e6f000 r-xp 00000000 08:06 517268 > /usr/lib/atlas/liblapack.so.3.0 > b6e6f000-b6e72000 rw-p 00548000 08:06 517268 > /usr/lib/atlas/liblapack.so.3.0 > b6e72000-b6f76000 rw-p b6e72000 00:00 0 > b6f76000-b6f7a000 r-xp 00000000 08:06 85530????? /usr/lib/libXdmcp.so.6.0.0 > b6f7a000-b6f7b000 rw-p 00003000 08:06 85530????? /usr/lib/libXdmcp.so.6.0.0 > b6f7b000-b6f7f000 r-xp 00000000 08:06 92437 > /usr/lib/python2.5/lib-dynload/_csv.so > b6f7f000-b6f81000 rw-p 00004000 08:06 92437 > /usr/lib/python2.5/lib-dynload/_csv.so > b6f81000-b6f84000 r-xp 00000000 08:06 92427 > /usr/lib/python2.5/lib-dynload/_locale.so > b6f84000-b6f85000 rw-p 00003000 08:06 92427 > /usr/lib/python2.5/lib-dynload/_locale.so > b6f85000-b6f8e000 r-xp 00000000 00:1a 533113 > /h/44/jtaylor/lib/python2.5/site-packages/numpy/fft/fftpack_lite.so > b6f8e000-b6f8f000 rw-p 00008000 00:1a 533113 > /h/44/jtaylor/lib/python2.5/site-packages/numpy/fft/fftpack_lite.so > b6f8f000-b6fad000 r-xp 00000000 00:1a 533028 > /h/44/jtaylor/lib/python2.5/site-packages/numpy/core/scalarmath.so > b6fad000-b6fae000 rw-p 0001e000 00:1a 533028 > /h/44/jtaylor/lib/python2.Aborted > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > no problem here, with official Windows numpy Josef >>> np.version.version '1.3.0' >python -i why_crash.py >>> print rc (array([ -5.23462841, -4.85584394, -2.99233015, -7.54676368, -10.15455332, 7.074554 , 8.4043877 , 2.79661679, 3.41336578, 5.29202285, 2.70716181, 15.53449435, 9.34557621, 17.32209602, 18.16994838, -50.98017437, -50.96547959, -17.51283078, 7.68637678, 13.53704022, 20.66278929, -23.64368007, -4.70942583, 18.2568222 , 2.45709374, 12.97156815, 15.47026211, -44.93348725, 7.7558192 , -39.13996758, 1.20214959, 23.61872159, -20.21198664, -7.87137325, -4.20255668, -45.24948722, 12.49507108, 24.22157348, 23.46404032, 18.62294373, -26.31401828, 24.35842929, -37.5578372 , 18.24079679, 28.90693972, -40.40246853, 23.85976491, 11.70965078, 17.38628028, 6.14989021, 0.19683346, 11.57781284, -6.70961655, 
-21.98525308, -11.30257635, 31.16804751, 5.08794164, 0.26279222, -27.78390652, -26.3151511 , 14.89172102, 29.02572416, -10.84227516, 3.20577699, -34.73738042, 24.90588989, 37.92166034, -30.30146211, 37.28852751, -16.03146259, -30.87415056, -33.02832669, -21.63514384, 11.15711455, 10.43855884, -7.08345237, 31.50460928, -28.64336727, -12.32269443, -24.59112645, 41.71351395, -29.85091349, -4.07409268, 0.82708638, 14.67839587, 41.58165228, -29.44030397, 31.13279856, -28.46626932, 31.21863319, -30.50159697, -6.26718832, -26.41654876, -2.42547434, 44.00738912, -10.94028372, -0.65862359, -25.08227995, -26.04263867, 13.25529043, -7.41115206, 36.11891076, 47.22737694, 23.39250661, -16.59126536, 37.75596345, 12.59698144, 9.15952276, -22.0567611 , -27.79573887, -30.57535286, 28.71831817, -21.38243352, 19.30944773, 49.81583705, -19.59172648]), array([ 1063.81 458595]), 116, array([ 10.77032961, 3.02162267, 3.02054405, 3.0010756 , 2.96191492, 2.94807426, 2.94230063, 2.93906657, 2.92832506, 2.91399677, 2.88159001, 2.86294336, 2.85790349, 2.84497487, 2.82744239, 2.81275744, 2.78836986, 2.77119523, 2.76422221, 2.75861982, 2.75015801, 2.72908307, 2.68445243, 2.67800314, 2.666536 , 2.65671856, 2.64826304, 2.63879427, 2.6296631 , 2.60120053, 2.59118748, 2.58256916, 2.57264941, 2.56585886, 2.53898947, 2.53365513, 2.52103196, 2.49959127, 2.47968021, 2.46456052, 2.46068247, 2.44924031, 2.43199483, 2.41963211, 2.41515001, 2.40937849, 2.39016287, 2.3762653 , 2.35560428, 2.34357138, 2.3260469 , 2.30884773, 2.29027418, 2.27944481, 2.27465575, 2.25660949, 2.21410648, 2.20263598, 2.1791073 , 2.15789688, 2.14225592, 2.13043072, 2.09846149, 2.07491627, 2.06112946, 2.04336228, 2.02056257, 1.99107297, 1.98856298, 1.97039638, 1.9575191 , 1.93587212, 1.91997992, 1.85665009, 1.84338407, 1.79610228, 1.79328928, 1.78429932, 1.74123465, 1.7241243 , 1.7010803 , 1.64746663, 1.62765943, 1.62303706, 1.61800823, 1.60531761, 1.52425119, 1.50620662, 1.485018 , 1.45765932, 1.40861388, 1.39268607, 1.3483904 , 1.32025766, 1.31350522, 1.28517948, 1.25950863, 1.23770526, 1.18665953, 1.15504454, 1.14088912, 1.11336858, 1.01682096, 0.9791356 , 0.93161774, 0.90834728, 0.8611552 , 0.82261935, 0.79141265, 0.64055544, 0.60890393, 0.58578707, 0.4948037 , 0.38776132, 0.35580931, 0.20854201])) >>> From matthew.brett at gmail.com Mon Aug 17 14:11:46 2009 From: matthew.brett at gmail.com (Matthew Brett) Date: Mon, 17 Aug 2009 11:11:46 -0700 Subject: [Numpy-discussion] Strange crash in numpy.linalg.lstsq. In-Reply-To: <463e11f90908171042u1457b5d6l6c70a46dbbd1418e@mail.gmail.com> References: <463e11f90908171042u1457b5d6l6c70a46dbbd1418e@mail.gmail.com> Message-ID: <1e2af89e0908171111h608491f8t35aae86f239c5b57@mail.gmail.com> Hi Jonathan, > http://www.cs.toronto.edu/~jtaylor/crash/ > > I would be interested to know if this bug can be duplicated and/or if anyone > has any suggestions as to why: > > import numpy as np > A = np.load('A.npy') > b = np.load('b.npy') > rc = np.linalg.lstsq(A,b) > > produces: > > *** glibc detected *** /usr/bin/python: free(): invalid next size (normal): > 0x091793c0 *** I just tried it on 4 ubuntu machines, and one Fedora 11 machine, in various states of numpy-ness (including recent SVN) with no crash. What versions of stuff do you have over there? See you, Matthew From charlesr.harris at gmail.com Mon Aug 17 15:12:06 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 17 Aug 2009 13:12:06 -0600 Subject: [Numpy-discussion] Strange crash in numpy.linalg.lstsq. 
In-Reply-To: <463e11f90908171042u1457b5d6l6c70a46dbbd1418e@mail.gmail.com> References: <463e11f90908171042u1457b5d6l6c70a46dbbd1418e@mail.gmail.com> Message-ID: On Mon, Aug 17, 2009 at 11:42 AM, Jonathan Taylor < jonathan.taylor at utoronto.ca> wrote: > Hi, > > I am getting a strange crash in numpy.linalg.lstsq. I have put the code > that causes the crash along with two data files on my website at: > > http://www.cs.toronto.edu/~jtaylor/crash/ > > I would be interested to know if this bug can be duplicated and/or if > anyone has any suggestions as to why: > Usually these problems are due to ATLAS. If you are using ATLAS, what is your OS/distribution? What hardware are you running on? Did you build ATLAS yourself? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From jonathan.taylor at utoronto.ca Mon Aug 17 15:43:19 2009 From: jonathan.taylor at utoronto.ca (Jonathan Taylor) Date: Mon, 17 Aug 2009 15:43:19 -0400 Subject: [Numpy-discussion] Strange crash in numpy.linalg.lstsq. In-Reply-To: References: <463e11f90908171042u1457b5d6l6c70a46dbbd1418e@mail.gmail.com> Message-ID: <463e11f90908171243u2da6d368xcb390b6734e75fc6@mail.gmail.com> Hi, I am using a computer that is administered. It is an intel Ubuntu box and came with an ATLAS compiled. I thus compiled my own numpy1.3.0 against that ATLAS. I was thinking about recompiling ATLAS myself. This machine only has g77 and not gfortran on it. Will that still work? Thanks, Jonathan. On Mon, Aug 17, 2009 at 3:12 PM, Charles R Harris wrote: > > > On Mon, Aug 17, 2009 at 11:42 AM, Jonathan Taylor wrote: >> >> Hi, >> >> I am getting a strange crash in numpy.linalg.lstsq.? I have put the code that causes the crash along with two data files on my website at: >> >> http://www.cs.toronto.edu/~jtaylor/crash/ >> >> I would be interested to know if this bug can be duplicated and/or if anyone has any suggestions as to why: > > Usually these problems are due to ATLAS. If you are using ATLAS, what is your OS/distribution? What hardware are you running on? Did you build ATLAS yourself? > > Chuck > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From reckoner at gmail.com Mon Aug 17 16:28:26 2009 From: reckoner at gmail.com (Reckoner) Date: Mon, 17 Aug 2009 13:28:26 -0700 Subject: [Numpy-discussion] ImportError: No module named multiarray Message-ID: Hi, I created a pickled file on my Windows PC, uploaded to a Linux machine and then received the following error: Python 2.5.4 (r254:67916, Feb 5 2009, 19:52:35) [GCC 4.1.2 20071124 (Red Hat 4.1.2-42)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import cPickle >>> cPickle.load(open('tst.pkl')) Traceback (most recent call last): File "", line 1, in ImportError: No module named multiarray Obviously, the pickled file loads fine on the Windows PC. 
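A hedged aside on the multiarray ImportError above: that message means the pickle stream contains a GLOBAL reference to a top-level module named multiarray, which Numeric-era arrays used; nothing in the report confirms that is what tst.pkl contains, so check first. A sketch of the diagnosis and of one workaround that applies only if the bare module name really is what the pickle wants:

# Inspect which modules the pickle asks for (look for GLOBAL opcodes):
import pickletools
pickletools.dis(open('tst.pkl', 'rb').read())

# Workaround sketch -- assumes the pickle references a top-level
# 'multiarray' module and that numpy's module is a compatible stand-in:
import sys
import numpy.core.multiarray
sys.modules['multiarray'] = numpy.core.multiarray

import cPickle
obj = cPickle.load(open('tst.pkl', 'rb'))
print type(obj)

If pickletools shows the pickle referencing numpy.core.multiarray instead, the aliasing step is unnecessary and the problem lies elsewhere.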
the following is the result of numpy.test() >>> numpy.test() Running unit tests for numpy NumPy version 1.2.1 NumPy is installed in /nfs/02/reckoner/Starburst/lib/python2.5/site-packages/numpy Python version 2.5.4 (r254:67916, Feb 5 2009, 19:52:35) [GCC 4.1.2 20071124 (Red Hat 4.1.2-42)] nose version 0.10.3 ..........................................................................................................................................................................................................................................................................................................................................................................................................................................F................K................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................ ====================================================================== FAIL: test_umath.TestComplexFunctions.test_against_cmath ---------------------------------------------------------------------- Traceback (most recent call last): File "/nfs/02/reckoner/Starburst/lib/python2.5/site-packages/nose-0.10.3-py2.5.egg/nose/case.py", line 182, in runTest self.test(*self.arg) File "/nfs/02/reckoner/Starburst/lib/python2.5/site-packages/numpy/core/tests/test_umath.py", line 268, in test_against_cmath assert abs(a - b) < atol, "%s %s: %s; cmath: %s"%(fname,p,a,b) AssertionError: arcsinh -2j: (-1.31695789692-1.57079632679j); cmath: (1.31695789692-1.57079632679j) ---------------------------------------------------------------------- Ran 1740 tests in 10.493s FAILED (KNOWNFAIL=1, failures=1) Any help appreciated. From charlesr.harris at gmail.com Mon Aug 17 16:38:21 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 17 Aug 2009 14:38:21 -0600 Subject: [Numpy-discussion] Strange crash in numpy.linalg.lstsq. In-Reply-To: <463e11f90908171243u2da6d368xcb390b6734e75fc6@mail.gmail.com> References: <463e11f90908171042u1457b5d6l6c70a46dbbd1418e@mail.gmail.com> <463e11f90908171243u2da6d368xcb390b6734e75fc6@mail.gmail.com> Message-ID: On Mon, Aug 17, 2009 at 1:43 PM, Jonathan Taylor < jonathan.taylor at utoronto.ca> wrote: > Hi, > > I am using a computer that is administered. It is an intel Ubuntu box > and came with an ATLAS compiled. 
From charlesr.harris at gmail.com Mon Aug 17 16:38:21 2009
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Mon, 17 Aug 2009 14:38:21 -0600
Subject: [Numpy-discussion] Strange crash in numpy.linalg.lstsq.
In-Reply-To: <463e11f90908171243u2da6d368xcb390b6734e75fc6@mail.gmail.com>
References: <463e11f90908171042u1457b5d6l6c70a46dbbd1418e@mail.gmail.com>
	<463e11f90908171243u2da6d368xcb390b6734e75fc6@mail.gmail.com>
Message-ID: 

On Mon, Aug 17, 2009 at 1:43 PM, Jonathan Taylor
<jonathan.taylor at utoronto.ca> wrote:

> Hi,
>
> I am using a computer that is administered. It is an Intel Ubuntu box
> and came with an ATLAS compiled. I thus compiled my own numpy 1.3.0
> against that ATLAS.
>
> I was thinking about recompiling ATLAS myself. This machine only has
> g77 and not gfortran on it. Will that still work?

As long as everything is consistent it should. Ubuntu has had some issues
with ATLAS and I suspect that is what you are seeing. David Cournapeau
could tell you more.

Chuck

From jonathan.taylor at utoronto.ca Mon Aug 17 16:50:00 2009
From: jonathan.taylor at utoronto.ca (Jonathan Taylor)
Date: Mon, 17 Aug 2009 16:50:00 -0400
Subject: [Numpy-discussion] Strange crash in numpy.linalg.lstsq.
In-Reply-To: 
References: <463e11f90908171042u1457b5d6l6c70a46dbbd1418e@mail.gmail.com>
	<463e11f90908171243u2da6d368xcb390b6734e75fc6@mail.gmail.com>
Message-ID: <463e11f90908171350k619b7b00md3867bf3faa3ad24@mail.gmail.com>

I compiled lapack and atlas from scratch using g77, but now
numpy.test() hangs when I try to use any numpy functionality. I think
I saw someone else write about this. Is this a common problem?

Thanks,
Jonathan.

On Mon, Aug 17, 2009 at 4:38 PM, Charles R Harris wrote:
> [...]
> As long as everything is consistent it should. Ubuntu has had some issues
> with ATLAS and I suspect that is what you are seeing. David Cournapeau
> could tell you more.
>
> Chuck

From kwgoodman at gmail.com Mon Aug 17 17:03:54 2009
From: kwgoodman at gmail.com (Keith Goodman)
Date: Mon, 17 Aug 2009 14:03:54 -0700
Subject: [Numpy-discussion] Strange crash in numpy.linalg.lstsq.
In-Reply-To: <463e11f90908171350k619b7b00md3867bf3faa3ad24@mail.gmail.com>
References: <463e11f90908171042u1457b5d6l6c70a46dbbd1418e@mail.gmail.com>
	<463e11f90908171243u2da6d368xcb390b6734e75fc6@mail.gmail.com>
	<463e11f90908171350k619b7b00md3867bf3faa3ad24@mail.gmail.com>
Message-ID: 

On Mon, Aug 17, 2009 at 1:50 PM, Jonathan Taylor wrote:
> I compiled lapack and atlas from scratch using g77, but now
> numpy.test() hangs when I try to use any numpy functionality. I think
> I saw someone else write about this. Is this a common problem?

Yes, it seems common. I know of 4 recent ATLAS builds (including mine
but not yours) that have failed on 32-bit systems. The recent
successes I have seen (including mine) have been on 64-bit systems.
But maybe 32/64 bit has nothing to do with it. I am sure there are
many 32-bit systems running a self-compiled ATLAS just fine.

Oh, you crash on any numpy functionality? I only crashed on ATLAS-type
problems.

From jonathan.taylor at utoronto.ca Mon Aug 17 17:13:40 2009
From: jonathan.taylor at utoronto.ca (Jonathan Taylor)
Date: Mon, 17 Aug 2009 17:13:40 -0400
Subject: [Numpy-discussion] Strange crash in numpy.linalg.lstsq.
In-Reply-To: 
References: <463e11f90908171042u1457b5d6l6c70a46dbbd1418e@mail.gmail.com>
	<463e11f90908171243u2da6d368xcb390b6734e75fc6@mail.gmail.com>
	<463e11f90908171350k619b7b00md3867bf3faa3ad24@mail.gmail.com>
Message-ID: <463e11f90908171413h427b7aecq5370c9cd6089523@mail.gmail.com>

Yes... ATLAS-type problems like matrix multiplication.

Is there some alternative to get a working numpy going? How might I
go about compiling numpy without ATLAS? I really need to get at least
something working temporarily.

Thanks,
Jon.

On Mon, Aug 17, 2009 at 5:03 PM, Keith Goodman wrote:
> [...]
> Oh, you crash on any numpy functionality? I only crashed on ATLAS-type
> problems.

From kwgoodman at gmail.com Mon Aug 17 17:21:26 2009
From: kwgoodman at gmail.com (Keith Goodman)
Date: Mon, 17 Aug 2009 14:21:26 -0700
Subject: [Numpy-discussion] Strange crash in numpy.linalg.lstsq.
In-Reply-To: <463e11f90908171413h427b7aecq5370c9cd6089523@mail.gmail.com>
References: <463e11f90908171042u1457b5d6l6c70a46dbbd1418e@mail.gmail.com>
	<463e11f90908171243u2da6d368xcb390b6734e75fc6@mail.gmail.com>
	<463e11f90908171350k619b7b00md3867bf3faa3ad24@mail.gmail.com>
	<463e11f90908171413h427b7aecq5370c9cd6089523@mail.gmail.com>
Message-ID: 

On Mon, Aug 17, 2009 at 2:13 PM, Jonathan Taylor wrote:
> Is there some alternative to get a working numpy going? How might I
> go about compiling numpy without ATLAS? I really need to get at least
> something working temporarily.

Just build numpy again but skip the ATLAS steps.

From jonathan.taylor at utoronto.ca Mon Aug 17 17:27:22 2009
From: jonathan.taylor at utoronto.ca (Jonathan Taylor)
Date: Mon, 17 Aug 2009 17:27:22 -0400
Subject: [Numpy-discussion] Strange crash in numpy.linalg.lstsq.
In-Reply-To: 
References: <463e11f90908171042u1457b5d6l6c70a46dbbd1418e@mail.gmail.com>
	<463e11f90908171243u2da6d368xcb390b6734e75fc6@mail.gmail.com>
	<463e11f90908171350k619b7b00md3867bf3faa3ad24@mail.gmail.com>
	<463e11f90908171413h427b7aecq5370c9cd6089523@mail.gmail.com>
Message-ID: <463e11f90908171427g1257ed41yd72a9c6f813c42f2@mail.gmail.com>

It seems to automatically detect it, though. Specifically,
lapack_lite.so always seems to reference libatlas.

On Mon, Aug 17, 2009 at 5:21 PM, Keith Goodman wrote:
> [...]
> Just build numpy again but skip the ATLAS steps.
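For what it's worth, a quick way to see what a given numpy build actually
linked against is to run ldd on the lapack_lite extension (a sketch; the
install path below is illustrative):

cd /usr/lib/python2.5/site-packages/numpy/linalg
ldd lapack_lite.so | grep -i atlas    # no output means no ATLAS linkage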
From kwgoodman at gmail.com Mon Aug 17 17:34:51 2009
From: kwgoodman at gmail.com (Keith Goodman)
Date: Mon, 17 Aug 2009 14:34:51 -0700
Subject: [Numpy-discussion] Strange crash in numpy.linalg.lstsq.
In-Reply-To: <463e11f90908171427g1257ed41yd72a9c6f813c42f2@mail.gmail.com>
References: <463e11f90908171042u1457b5d6l6c70a46dbbd1418e@mail.gmail.com>
	<463e11f90908171243u2da6d368xcb390b6734e75fc6@mail.gmail.com>
	<463e11f90908171350k619b7b00md3867bf3faa3ad24@mail.gmail.com>
	<463e11f90908171413h427b7aecq5370c9cd6089523@mail.gmail.com>
	<463e11f90908171427g1257ed41yd72a9c6f813c42f2@mail.gmail.com>
Message-ID: 

On Mon, Aug 17, 2009 at 2:27 PM, Jonathan Taylor wrote:
> It seems to automatically detect it, though. Specifically,
> lapack_lite.so always seems to reference libatlas.
>
> [...]
>> Just build numpy again but skip the ATLAS steps.

Yes, sorry. The only way I've tried doing it is uninstalling the
Ubuntu ATLAS binary. I don't know how to ignore it if it is installed.

From jonathan.taylor at utoronto.ca Mon Aug 17 18:25:37 2009
From: jonathan.taylor at utoronto.ca (Jonathan Taylor)
Date: Mon, 17 Aug 2009 18:25:37 -0400
Subject: [Numpy-discussion] How to compile numpy without ATLAS support?
Message-ID: <463e11f90908171525r9a2da5v20fa4a675a4491c8@mail.gmail.com>

I am wondering how I might be able to compile numpy without ATLAS on an
Ubuntu machine that has an ATLAS deb installed. It seems that the
numpy build routine automatically detects it.

Thanks for any help,
Jonathan.

From liukis at usc.edu Mon Aug 17 23:13:43 2009
From: liukis at usc.edu (Maria Liukis)
Date: Mon, 17 Aug 2009 20:13:43 -0700
Subject: [Numpy-discussion] Indexing empty array with empty boolean array
	causes "IndexError: invalid index exception"
In-Reply-To: <3d375d730908120925s67c4f23cp643b96f5adc1696c@mail.gmail.com>
References: <7F7802B9-BD2F-4350-99B6-708D140089C8@usc.edu>
	<3d375d730908120925s67c4f23cp643b96f5adc1696c@mail.gmail.com>
Message-ID: 

On Aug 12, 2009, at 9:25 AM, Robert Kern wrote:

> On Mon, Aug 10, 2009 at 14:19, Maria Liukis wrote:
>> Hello everybody,
>> I'm using the following versions of the Scipy and Numpy packages:
>>>>> scipy.__version__
>> '0.7.1'
>>>>> np.__version__
>> '1.3.0'
>> My code uses a boolean array to filter a 2-dimensional array which
>> sometimes happens to be an empty array. It seems like I have to take
>> special care when the dimension I'm filtering is zero, otherwise I'm
>> getting an "IndexError: invalid index" exception:
>>>>> import numpy as np
>>>>> a = np.zeros((2,10))
>>>>> a
>> array([[ 1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.],
>>        [ 1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.]])
>
> If that were actually your output from zeros(), that would definitely
> be a bug. :-)

Sorry, I realized my copy-and-paste mistake a minute after I posted
the message. Obviously, it was too late :)

>>>>> filter_array = np.zeros(2,)
>>>>> filter_array
>> array([False, False], dtype=bool)
>>>>> a[filter_array,:]
>> array([], shape=(0, 10), dtype=float64)
>>
>> Now if the filtered dimension is zero:
>>>>> a = np.ones((0,10))
>>>>> a
>> array([], shape=(0, 10), dtype=float64)
>>>>> filter_array = np.zeros((0,), dtype=bool)
>>>>> filter_array
>> array([], dtype=bool)
>>>>> filter_array.shape
>> (0,)
>>>>> a.shape
>> (0, 10)
>>>>> a[filter_array,:]
>> Traceback (most recent call last):
>>   File "<stdin>", line 1, in <module>
>> IndexError: invalid index
>>
>> Would somebody know if it's expected behavior, a package bug, or am I
>> doing something wrong?
>
> I would call it a bug. It's a corner case that we should probably
> handle gracefully rather than raising an exception.

Thanks, Robert!
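Until that corner case is handled gracefully, one possible workaround is to
route the selection through integer indices, which behave consistently for
empty arrays. A sketch (not tested against every numpy version, so treat it
as such):

import numpy as np

a = np.ones((0, 10))
mask = np.zeros((0,), dtype=bool)
rows = np.flatnonzero(mask)   # empty integer index array
a[rows, :]                    # -> shape (0, 10), no IndexError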
From charlesr.harris at gmail.com Tue Aug 18 00:28:38 2009
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Mon, 17 Aug 2009 22:28:38 -0600
Subject: [Numpy-discussion] How to compile numpy without ATLAS support?
In-Reply-To: <463e11f90908171525r9a2da5v20fa4a675a4491c8@mail.gmail.com>
References: <463e11f90908171525r9a2da5v20fa4a675a4491c8@mail.gmail.com>
Message-ID: 

On Mon, Aug 17, 2009 at 4:25 PM, Jonathan Taylor
<jonathan.taylor at utoronto.ca> wrote:

> I am wondering how I might be able to compile numpy without ATLAS on an
> Ubuntu machine that has an ATLAS deb installed. It seems that the
> numpy build routine automatically detects it.

Try

BLAS=None LAPACK=None ATLAS=None python setup.py ....

I haven't tried it myself but it is rumored to work ;)

Chuck
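Spelled out, a minimal from-scratch rebuild along those lines might look
like this (the source directory and install prefix are illustrative):

cd numpy-1.3.0
rm -rf build                  # make sure no ATLAS-flavored objects are reused
BLAS=None LAPACK=None ATLAS=None python setup.py build
python setup.py install --prefix=$HOME/local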
From liukis at usc.edu Tue Aug 18 00:30:46 2009
From: liukis at usc.edu (Maria Liukis)
Date: Mon, 17 Aug 2009 21:30:46 -0700
Subject: [Numpy-discussion] unique rows of array
Message-ID: 

Hello everybody,

While re-implementing some Matlab code in Python, I've run into the
problem of finding a NumPy function analogous to Matlab's
"unique(array, 'rows')" to get the unique rows of an array. Searching the
web, I found a similar discussion from a couple of years ago with an
example:

############## A SNIPPET FROM THE DISCUSSION
[Numpy-discussion] Finding unique rows in an array [Was: Finding a row
match within a numpy array]

A Tuesday 21 August 2007, Mark.Miller escrigué:
> A slightly related question on this topic...
>
> Is there a good loopless way to identify all of the unique rows in an
> array? Something like numpy.unique() is ideal, but capable of
> extracting unique subarrays along an axis.

You can always do a view of the rows as strings and then use unique().
Here is an example:

In [1]: import numpy
In [2]: a=numpy.arange(12).reshape(4,3)
In [3]: a[2]=(3,4,5)
In [4]: a
Out[4]:
array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 3,  4,  5],
       [ 9, 10, 11]])

now, create the view and select the unique rows:

In [5]: b=numpy.unique(a.view('S%d'%a.itemsize*a.shape[0])).view('i4')

and finally restore the shape:

In [6]: b.reshape((len(b)/a.shape[1], a.shape[1]))
Out[6]:
array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 9, 10, 11]])

If you want to find unique columns instead of rows, do a transpose first
on the initial array.
################ END OF DISCUSSION

The provided example works only because the array elements are row-sorted.
Changing the tested array (in my case, it's 'c'):

>>> c
array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 3,  4,  5],
       [ 9, 10, 11]])
>>> c[0] = (11, 10, 0)
>>> c
array([[11, 10,  0],
       [ 3,  4,  5],
       [ 3,  4,  5],
       [ 9, 10, 11]])
>>> b = np.unique(c.view('S%s' %c.itemsize*c.shape[0]))
>>> b
array(['', '\x03', '\x04', '\x05', '\t', '\n', '\x0b'],
      dtype='|S4')
>>> b.view('i4')
array([ 0,  3,  4,  5,  9, 10, 11])
>>> b.reshape((len(b)/c.shape[1], c.shape[1])).view('i4')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: total size of new array must be unchanged
>>>

since len(b) = 7.

The suggested approach would work if the whole row were converted to a
single string, I guess. But from what I could gather, numpy.array.view()
only changes the display element-wise. Before I start re-inventing the
wheel, I was just wondering whether one could find unique rows in an array
using existing numpy functionality.

Many thanks in advance!
Masha
--------------------
liukis at usc.edu
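For reference, the string-view trick from that old snippet only does what
was intended if each whole row is viewed as a single string, which means
the string width has to be itemsize times the number of columns, and the %
formatting needs parentheses. A sketch (it assumes a C-contiguous array,
and np.unique returns the rows in byte-sort order, not the original order,
so it is not a robust general solution):

import numpy as np

c = np.array([[11, 10,  0],
              [ 3,  4,  5],
              [ 3,  4,  5],
              [ 9, 10, 11]])
b = np.unique(c.view('S%d' % (c.itemsize * c.shape[1])))
b.view(c.dtype).reshape(-1, c.shape[1])   # one row per unique string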
From josef.pktd at gmail.com Tue Aug 18 00:44:37 2009
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Tue, 18 Aug 2009 00:44:37 -0400
Subject: [Numpy-discussion] unique rows of array
In-Reply-To: 
References: 
Message-ID: <1cd32cbb0908172144u7b0c3846rb602bfb8c15ea552@mail.gmail.com>

On Tue, Aug 18, 2009 at 12:30 AM, Maria Liukis wrote:
> Hello everybody,
> While re-implementing some Matlab code in Python, I've run into the
> problem of finding a NumPy function analogous to Matlab's
> "unique(array, 'rows')" to get the unique rows of an array.
> [...]
> Before I start re-inventing the wheel, I was just wondering whether one
> could find unique rows in an array using existing numpy functionality.

One way is to convert to a structured array:

>>> c = np.array([[ 0,  1,  2],
                  [ 3,  4,  5],
                  [ 3,  4,  5],
                  [ 9, 10, 11]])

>>> np.unique1d(c.view([('',c.dtype)]*c.shape[1])).view(c.dtype).reshape(-1,c.shape[1])
array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 9, 10, 11]])

For an explanation, I asked a similar question last December about
"sortrows". (I never remember when I need the last reshape and when not.)

Josef

From charlesr.harris at gmail.com Tue Aug 18 00:51:44 2009
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Mon, 17 Aug 2009 22:51:44 -0600
Subject: [Numpy-discussion] unique rows of array
In-Reply-To: 
References: 
Message-ID: 

On Mon, Aug 17, 2009 at 10:30 PM, Maria Liukis wrote:

> While re-implementing some Matlab code in Python, I've run into the
> problem of finding a NumPy function analogous to Matlab's
> "unique(array, 'rows')" to get the unique rows of an array.
> [...]

Just to be clear, do you mean finding all rows that only occur once in
the array?

Chuck

From liukis at usc.edu Tue Aug 18 00:59:40 2009
From: liukis at usc.edu (Maria Liukis)
Date: Mon, 17 Aug 2009 21:59:40 -0700
Subject: [Numpy-discussion] unique rows of array
In-Reply-To: <1cd32cbb0908172144u7b0c3846rb602bfb8c15ea552@mail.gmail.com>
References: <1cd32cbb0908172144u7b0c3846rb602bfb8c15ea552@mail.gmail.com>
Message-ID: 

Josef,

Thanks, I'll try that and will search for your question from last
December :)

Masha
--------------------
liukis at usc.edu

On Aug 17, 2009, at 9:44 PM, josef.pktd at gmail.com wrote:
> [...]
> One way is to convert to a structured array:
> [...]
> Josef

From liukis at usc.edu Tue Aug 18 00:59:42 2009
From: liukis at usc.edu (Maria Liukis)
Date: Mon, 17 Aug 2009 21:59:42 -0700
Subject: [Numpy-discussion] unique rows of array
In-Reply-To: 
References: 
Message-ID: <93B8B1E1-B25E-47E3-A8B7-4E8D781CDF4C@usc.edu>

On Aug 17, 2009, at 9:51 PM, Charles R Harris wrote:

> Just to be clear, do you mean finding all rows that only occur once in
> the array?

Yes.

From josef.pktd at gmail.com Tue Aug 18 01:03:35 2009
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Tue, 18 Aug 2009 01:03:35 -0400
Subject: [Numpy-discussion] unique rows of array
In-Reply-To: <93B8B1E1-B25E-47E3-A8B7-4E8D781CDF4C@usc.edu>
References: <93B8B1E1-B25E-47E3-A8B7-4E8D781CDF4C@usc.edu>
Message-ID: <1cd32cbb0908172203h4af1f9f9m83af252debe2b700@mail.gmail.com>

On Tue, Aug 18, 2009 at 12:59 AM, Maria Liukis wrote:
> On Aug 17, 2009, at 9:51 PM, Charles R Harris wrote:
>> Just to be clear, do you mean finding all rows that only occur once in
>> the array?
>
> Yes.

I interpreted your question as removing duplicates: it keeps rows that
occur more than once. That's what my example is intended to do.

Josef

From josef.pktd at gmail.com Tue Aug 18 01:25:24 2009
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Tue, 18 Aug 2009 01:25:24 -0400
Subject: [Numpy-discussion] unique rows of array
In-Reply-To: <1cd32cbb0908172203h4af1f9f9m83af252debe2b700@mail.gmail.com>
References: <93B8B1E1-B25E-47E3-A8B7-4E8D781CDF4C@usc.edu>
	<1cd32cbb0908172203h4af1f9f9m83af252debe2b700@mail.gmail.com>
Message-ID: <1cd32cbb0908172225y1183f17et9c33cae6925d3826@mail.gmail.com>

Just a reminder about views on views: I don't think the recommendation to
take the transpose to get unique columns works. We had the discussion some
time ago that views work on the original array data and not on the view,
and in this case the transpose creates a view. Example below.

Also, unique does a sort and doesn't preserve order.

Josef

>>> c=np.array([[ 10, 1, 2],
                [ 3, 4, 5],
                [ 3, 4, 5],
                [ 9, 10, 11]])
>>> cc = c.copy()  # backup
>>> c = cc.T
>>> cc
array([[10,  1,  2],
       [ 3,  4,  5],
       [ 3,  4,  5],
       [ 9, 10, 11]])
>>> np.unique1d(c.view([('',c.dtype)]*c.shape[1])).view(c.dtype).reshape(-1,c.shape[1])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
    np.unique1d(c.view([('',c.dtype)]*c.shape[1])).view(c.dtype).reshape(-1,c.shape[1])
ValueError: new type not compatible with array.
>>> c = cc.T.copy()
>>> c
array([[10,  3,  3,  9],
       [ 1,  4,  4, 10],
       [ 2,  5,  5, 11]])
>>> np.unique1d(c.view([('',c.dtype)]*c.shape[1])).view(c.dtype).reshape(-1,c.shape[1])
array([[ 1,  4,  4, 10],
       [ 2,  5,  5, 11],
       [10,  3,  3,  9]])
>>> c = np.ascontiguousarray(cc.T)
>>> np.unique1d(c.view([('',c.dtype)]*c.shape[1])).view(c.dtype).reshape(-1,c.shape[1])
array([[ 1,  4,  4, 10],
       [ 2,  5,  5, 11],
       [10,  3,  3,  9]])

From liukis at usc.edu Tue Aug 18 01:44:44 2009
From: liukis at usc.edu (Maria Liukis)
Date: Mon, 17 Aug 2009 22:44:44 -0700
Subject: [Numpy-discussion] unique rows of array
In-Reply-To: <1cd32cbb0908172203h4af1f9f9m83af252debe2b700@mail.gmail.com>
References: <93B8B1E1-B25E-47E3-A8B7-4E8D781CDF4C@usc.edu>
	<1cd32cbb0908172203h4af1f9f9m83af252debe2b700@mail.gmail.com>
Message-ID: 

On Aug 17, 2009, at 10:03 PM, josef.pktd at gmail.com wrote:
> [...]
>>> Just to be clear, do you mean finding all rows that only occur once
>>> in the array?

Sorry, I think it shows that I should stop working past 10pm :)

>> Yes.
>
> I interpreted your question as removing duplicates: it keeps rows that
> occur more than once.

Yes, I meant keeping only unique (without duplicates) rows.

From liukis at usc.edu Tue Aug 18 02:01:22 2009
From: liukis at usc.edu (Maria Liukis)
Date: Mon, 17 Aug 2009 23:01:22 -0700
Subject: [Numpy-discussion] unique rows of array
In-Reply-To: <1cd32cbb0908172203h4af1f9f9m83af252debe2b700@mail.gmail.com>
References: <93B8B1E1-B25E-47E3-A8B7-4E8D781CDF4C@usc.edu>
	<1cd32cbb0908172203h4af1f9f9m83af252debe2b700@mail.gmail.com>
Message-ID: 

Josef,

Many thanks for the example! It should become an official NumPy recipe :)

Thanks again,
Masha
--------------------
liukis at usc.edu

From schut at sarvision.nl Tue Aug 18 04:22:03 2009
From: schut at sarvision.nl (Vincent Schut)
Date: Tue, 18 Aug 2009 10:22:03 +0200
Subject: [Numpy-discussion] add axis to results of reduction (mean, min, ...)
In-Reply-To: 
References: <1cd32cbb0908060855v3fec4524rfed715d60c741a0b@mail.gmail.com>
Message-ID: 

Keith Goodman wrote:
> On Thu, Aug 6, 2009 at 9:58 AM, Charles R Harris wrote:
>>
>> On Thu, Aug 6, 2009 at 9:55 AM, wrote:
>>> What's the best way of getting back the correct shape to be able to
>>> broadcast mean, min, ... to the original array, that works for
>>> arbitrary dimension and axis?
>>>
>>> I thought I have seen some helper functions, but I don't find them
>>> anymore?
>> Adding a keyword to retain the number of dimensions has been mooted. It
>> shouldn't be too difficult to implement and would allow things like:
>>
>>>>> scaled = a/a.max(1, reduce=0)
>>
>> I could do that for 1.4 if folks are interested.
>
> I'd use that. It's better than what I usually do:
>
> scaled = a / a.max(1).reshape(-1,1)

To chime in after returning from holidays: I'd use that keyword a great
deal. It would be more than welcome to me. I currently have loads of code
numpy.newaxis-ing the results of min/max/mean operations...

From eadrogue at gmx.net Tue Aug 18 07:01:50 2009
From: eadrogue at gmx.net (Ernest Adrogué)
Date: Tue, 18 Aug 2009 13:01:50 +0200
Subject: [Numpy-discussion] indexing problem
Message-ID: <20090818110150.GA13641@doriath.local>

Hi,

Suppose I have a 3-dimensional array, where one dimension is time. I'm
not particularly interested in selecting specific moments in time, so
most of the time I won't be indexing this dimension.

Intuitively, one would make time the third dimension, but if you do that
you have to specify the time in every index, which is annoying, because
then all indices must start with an empty slice, e.g.,

a[:,1]
a[:,:,3]
a[:,0,1]

etc. On the other hand, the arrays resulting from indexing have all
elements sorted by time, which is a good thing.

Then, if I change it and make time the first dimension, it's handy
because I can omit time in indices, BUT then the sub-arrays produced by
indexing are not sorted by time!

Is it possible to change the way numpy traverses the array, so that it
moves "less" on the first dimension (instead of the default, which is to
move less on the last dimension), so that I get arrays sorted by time
when time is not on the last dimension?

Thanks.

Ernest

From josef.pktd at gmail.com Tue Aug 18 10:00:27 2009
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Tue, 18 Aug 2009 10:00:27 -0400
Subject: [Numpy-discussion] unique rows of array
In-Reply-To: 
References: <93B8B1E1-B25E-47E3-A8B7-4E8D781CDF4C@usc.edu>
	<1cd32cbb0908172203h4af1f9f9m83af252debe2b700@mail.gmail.com>
Message-ID: <1cd32cbb0908180700l141e4c47x12b40ea6da232785@mail.gmail.com>

On Tue, Aug 18, 2009 at 2:01 AM, Maria Liukis wrote:
> Josef,
> Many thanks for the example! It should become an official NumPy recipe :)

Actually, there is also an implementation of unique rows in
scipy.stats._support. It uses loops (and array concatenation in the
loop), but it preserves the order of the rows in the array.

In general, I don't recommend using scipy.stats._support, since many or
most functions are not tested and only some are used in scipy.stats.
These functions are waiting for a rewrite or removal. When I thought
about a rewrite last year, I didn't know much about structured arrays
and views.

Josef

>>> cc
array([[10,  1,  2],
       [ 3,  4,  5],
       [ 3,  4,  5],
       [ 9, 10, 11]])
>>> scipy.stats._support.unique(cc)
array([[10,  1,  2],
       [ 3,  4,  5],
       [ 9, 10, 11]])

unique columns using transpose:

>>> cct = cc.T.copy()
>>> cct
array([[10,  3,  3,  9],
       [ 1,  4,  4, 10],
       [ 2,  5,  5, 11]])
>>> scipy.stats._support.unique(cct.T).T
array([[10,  3,  9],
       [ 1,  4, 10],
       [ 2,  5, 11]])

Josef
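A loop-free variant that also preserves the original row order is possible
by combining the structured view above with unique1d's return_index (a
sketch; it assumes the indices come back as the second element, as they do
in numpy.unique(..., return_index=True) in later versions — check your
numpy, since unique1d's return conventions shifted around this era):

import numpy as np

c = np.array([[10,  1,  2],
              [ 3,  4,  5],
              [ 3,  4,  5],
              [ 9, 10, 11]])
cv = c.view([('', c.dtype)] * c.shape[1]).ravel()
uniq, idx = np.unique1d(cv, return_index=True)  # idx: one index per unique row
c[np.sort(idx)]   # unique rows, in their order of appearance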
From robert.kern at gmail.com Tue Aug 18 10:33:24 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Tue, 18 Aug 2009 07:33:24 -0700
Subject: [Numpy-discussion] indexing problem
In-Reply-To: <20090818110150.GA13641@doriath.local>
References: <20090818110150.GA13641@doriath.local>
Message-ID: <3d375d730908180733wdeb3a71sdc19bfce15b79720@mail.gmail.com>

2009/8/18 Ernest Adrogué:
> Hi,
>
> Suppose I have a 3-dimensional array, where one dimension is time. I'm
> not particularly interested in selecting specific moments in time, so
> most of the time I won't be indexing this dimension.
> [...]
> Then, if I change it and make time the first dimension, it's handy
> because I can omit time in indices, BUT then the sub-arrays produced by
> indexing are not sorted by time!

I do not know what you mean by "not sorted by time". You can keep the
sub-arrays sorted however you like regardless of the index used for time.
Can you show us an example of the problem you are seeing?

> Is it possible to change the way numpy traverses the array, so that it
> moves "less" on the first dimension (instead of the default, which is
> to move less on the last dimension), so that I get arrays sorted by
> time when time is not on the last dimension?

No, but I suspect that the problem you are seeing can be fixed in
another way.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

From eadrogue at gmx.net Tue Aug 18 12:22:28 2009
From: eadrogue at gmx.net (Ernest Adrogué)
Date: Tue, 18 Aug 2009 18:22:28 +0200
Subject: [Numpy-discussion] indexing problem
In-Reply-To: <3d375d730908180733wdeb3a71sdc19bfce15b79720@mail.gmail.com>
References: <20090818110150.GA13641@doriath.local>
	<3d375d730908180733wdeb3a71sdc19bfce15b79720@mail.gmail.com>
Message-ID: <20090818162228.GA13911@doriath.local>

18/08/09 @ 07:33 (-0700), thus spake Robert Kern:
> I do not know what you mean by "not sorted by time". You can keep the
> sub-arrays sorted however you like regardless of the index used for
> time. Can you show us an example of the problem you are seeing?

Sorry for not explaining myself clearly enough.

I'm using masked arrays, and I call the compressed() method on the
sub-arrays resulting from indexing, which gives a 1-d array. It is this
1-d array that isn't sorted the way I'd like.

I'll try to explain this with an example. Here is a 3-d array that
represents a 2-d array at different moments in time (for illustration
purposes all elements in the 2-d array increase by one at each time
point):

In [35]: x=np.zeros((3,2,2))
In [36]: x[0]=1
In [37]: x[1]=2
In [38]: x[2]=3
In [39]: x
Out[39]:
array([[[ 1.,  1.],
        [ 1.,  1.]],

       [[ 2.,  2.],
        [ 2.,  2.]],

       [[ 3.,  3.],
        [ 3.,  3.]]])

Then if I take the elements [:,0] and flatten the resulting array, we
can see that the resulting array has its elements sorted by time:

In [40]: x[:,0].flatten()
Out[40]: array([ 1.,  1.,  2.,  2.,  3.,  3.])

But then I thought that it would be nice to arrange the data
differently, so that the dimension that represents time can be omitted
in the index. Therefore, I re-arrange the data in this way:

In [41]: x=np.zeros((2,2,3))
In [42]: x[:,:,0]=1
In [43]: x[:,:,1]=2
In [44]: x[:,:,2]=3
In [46]: x
Out[46]:
array([[[ 1.,  2.,  3.],
        [ 1.,  2.,  3.]],

       [[ 1.,  2.,  3.],
        [ 1.,  2.,  3.]]])

But then, the flattened arrays I get are no longer in time-ascending
order:

In [45]: x[0].flatten()
Out[45]: array([ 1.,  2.,  3.,  1.,  2.,  3.])

It's a bit difficult to explain, but I hope it's more clear now!

Ernest

From robert.kern at gmail.com Tue Aug 18 12:27:21 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Tue, 18 Aug 2009 09:27:21 -0700
Subject: [Numpy-discussion] indexing problem
In-Reply-To: <20090818162228.GA13911@doriath.local>
References: <20090818110150.GA13641@doriath.local>
	<3d375d730908180733wdeb3a71sdc19bfce15b79720@mail.gmail.com>
	<20090818162228.GA13911@doriath.local>
Message-ID: <3d375d730908180927v6db90fb7id85a4987b31652f3@mail.gmail.com>

2009/8/18 Ernest Adrogué:
> [...]
> But then, the flattened arrays I get are no longer in time-ascending
> order:
>
> In [45]: x[0].flatten()
> Out[45]: array([ 1.,  2.,  3.,  1.,  2.,  3.])
>
> It's a bit difficult to explain, but I hope it's more clear now!

x[0].T.flatten()

--
Robert Kern
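The same idea generalizes past two dimensions: move the time axis to the
front before flattening (or before compressed(), for masked arrays), so
that time becomes the slowest-varying axis. A sketch using rollaxis, which
only creates a view:

import numpy as np

z = np.zeros((2, 2, 3))        # two spatial axes, time last
for t in range(3):
    z[..., t] = t + 1

np.rollaxis(z, 2).flatten()    # time axis moved to the front
# -> array([ 1., 1., 1., 1., 2., 2., 2., 2., 3., 3., 3., 3.])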
From noagbodjivictor at gmail.com Tue Aug 18 13:11:14 2009
From: noagbodjivictor at gmail.com (Victor Noagbodji)
Date: Tue, 18 Aug 2009 13:11:14 -0400
Subject: [Numpy-discussion] how can i optimize this function further more?
Message-ID: 

hello,

i'm fairly new to numpy. i need help with a snow effect done with pygame.
the entire code is below. the performance drops in the snowfall function.
the original c code and a demo can be found here:
http://sol.gfxile.net/gp/ch04.html

as you can see, the original c code went the pixel-by-pixel way, but i
couldn't do that with pygame. i'm asking here because i think there might
be a way numpy could help. the idea of the snowfall function is to move
each white pixel one line down (while avoiding the green ground, thus the
test array1-array2 == white).

thanks a lot in advance for your help.

ps. sorry for the ugly code. this is the result of several optimizations,
plus help from the pygame mailing list.
import numpy
import cProfile

from math import sin, cos
from random import randint
from sys import exit, stdout

import pygame
import pygame.event as event
import pygame.display as display
from pygame.locals import *
from pygame import Surface
from pygame.time import Clock, get_ticks
from pygame.surfarray import pixels2d, blit_array, make_surface

WIDTH = 640
HEIGHT = 480
RESOLUTION = (WIDTH, HEIGHT)

def init(screen_array, green=int(0x007f00)):
    lowest_level = 0
    for i in xrange(WIDTH):
        sins = (sin((i + 3247) * 0.02) * 0.3 +
                sin((i + 2347) * 0.04) * 0.1 +
                sin((i + 4378) * 0.01) * 0.6)
        p = int(sins * 100 + 380)
        lowest_level = max(p, lowest_level)
        for j in range(p, HEIGHT):
            screen_array[i, j] = green
    return lowest_level

def newsnow(screen_array, white=int(0xffffff), density=1):
    new_indices = numpy.array([randint(1, WIDTH-2) for i in xrange(density)])
    screen_array[new_indices, 0] = white

def snowfall(screen_array, white=int(0xffffff), fallrng=xrange(HEIGHT-2, -1, -1)):
    screen_array = numpy.transpose(screen_array)
    snow_in_next_layer = numpy.zeros(WIDTH, dtype=int)
    for j in fallrng:
        array1 = screen_array[j]
        array2 = screen_array[j+1]
        indices_where_snow_moved_down = numpy.where(array1-array2 == white)[0]
        snow_in_next_layer.fill(0)
        snow_in_next_layer[indices_where_snow_moved_down] = white
        screen_array[j  ] = array1 - snow_in_next_layer
        screen_array[j+1] = array2 + snow_in_next_layer
    screen_array = numpy.transpose(screen_array)

def main():
    pygame.init()
    screen_surf = display.set_mode(RESOLUTION)
    screen_rect = screen_surf.get_rect()
    screen_array = pixels2d(screen_surf)
    lowest_level = init(screen_array)
    display.update(screen_rect)
    fallrng = xrange(lowest_level-2, -1, -1)
    white = int(0xffffff)
    black = 0
    c = Clock()
    while True:
        newsnow(screen_array, density=8)
        snowfall(screen_array)
        display.update(screen_rect)
        for e in event.get():
            type = e.type
            if type == QUIT:
                exit()
            elif type == KEYUP and e.key == K_ESCAPE:
                return
        c.tick()
        stdout.write('fps: ~%s\r' % round(c.get_fps()))
        stdout.flush()

if __name__ == '__main__':
    cProfile.runctx('main()', globals(), {'main': main}, 'last_gp3_stats')

--
paul victor noagbodji
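One direction worth trying, as a sketch: replace the per-row Python loop in
snowfall with a single whole-array step. This assumes a (HEIGHT, WIDTH)
integer view of the screen (for example screen_array.T, the same transposed
view snowfall already works on) in which 0 is empty sky; the green ground
then blocks flakes automatically, because its pixel value is neither 0 nor
white. Vertically stacked flakes behave slightly differently than in the
bottom-up loop (the upper flake waits one extra frame), and it is untested
against this exact setup:

import numpy

WHITE = 0xffffff

def snowfall_vec(screen):
    # screen: (HEIGHT, WIDTH) int array; every flake falls one row per call
    above = screen[:-1]                       # rows 0 .. H-2 (views)
    below = screen[1:]                        # rows 1 .. H-1
    falls = (above == WHITE) & (below == 0)   # flake with an empty cell below
    above[falls] = 0                          # the two updates can never touch
    below[falls] = WHITE                      # the same cell, so this is safe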
From jonathan.taylor at utoronto.ca Tue Aug 18 15:14:55 2009
From: jonathan.taylor at utoronto.ca (Jonathan Taylor)
Date: Tue, 18 Aug 2009 15:14:55 -0400
Subject: [Numpy-discussion] Strange crash in numpy.linalg.lstsq.
In-Reply-To: 
References: <463e11f90908171042u1457b5d6l6c70a46dbbd1418e@mail.gmail.com>
	<463e11f90908171243u2da6d368xcb390b6734e75fc6@mail.gmail.com>
	<463e11f90908171350k619b7b00md3867bf3faa3ad24@mail.gmail.com>
	<463e11f90908171413h427b7aecq5370c9cd6089523@mail.gmail.com>
	<463e11f90908171427g1257ed41yd72a9c6f813c42f2@mail.gmail.com>
Message-ID: <463e11f90908181214jc91216co5da23ba79e8276e5@mail.gmail.com>

Right... So I was able to get everything working finally. I am not 100%
sure how or why it works, though, so I am going to outline what I did
here for reference.

I first tried just using LAPACK 3.1.1 (since it seemed set up for g77
instead of gfortran, which I do not have). I compiled this to yield the
associated lapack and blas libraries, using the Fortran compiler settings
detailed on the scipy install web page. This actually gave me a numpy
with the same problems (hanging on numpy.test()). Thus I realized the
problem was LAPACK and not ATLAS.

Eventually I got numpy with LAPACK to work when I used the default
settings in the example config file of LAPACK instead of the suggested
settings on the numpy web site. I then compiled ATLAS and this worked as
well. It still seems a little bit weird that these settings can break the
software, though.

Thanks for the suggestions and I hope this helps someone.

Jonathan.

On Mon, Aug 17, 2009 at 5:34 PM, Keith Goodman wrote:
> [...]
> Yes, sorry. The only way I've tried doing it is uninstalling the
> Ubuntu ATLAS binary. I don't know how to ignore it if it is installed.

From markbak at gmail.com Wed Aug 19 08:25:50 2009
From: markbak at gmail.com (Mark Bakker)
Date: Wed, 19 Aug 2009 14:25:50 +0200
Subject: [Numpy-discussion] why does b[:-0] not work, and is there an
	elegant solution?
Message-ID: <6946b9500908190525p55333accy5fd1d9e29780815e@mail.gmail.com>

Hello list,

I compute the index of the last term in an array that I need and call
the index n.

I can then call the array b as

b[:-n]

If I need all terms in the array, the logical syntax would be:

b[:-0]

but that doesn't work. Any reason why that has not been implemented? Any
elegant workaround?

Thanks, Mark

From sebastian.walter at gmail.com Wed Aug 19 08:48:32 2009
From: sebastian.walter at gmail.com (Sebastian Walter)
Date: Wed, 19 Aug 2009 14:48:32 +0200
Subject: [Numpy-discussion] why does b[:-0] not work, and is there an
	elegant solution?
In-Reply-To: <6946b9500908190525p55333accy5fd1d9e29780815e@mail.gmail.com>
References: <6946b9500908190525p55333accy5fd1d9e29780815e@mail.gmail.com>
Message-ID: 

I'm sure there is a better solution...:

In [1]: x = numpy.array([i for i in range(10)])
In [2]: foo = lambda n: -n if n!=0 else None
In [3]: x[:foo(1)]
Out[3]: array([0, 1, 2, 3, 4, 5, 6, 7, 8])
In [4]: x[:foo(0)]
Out[4]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

From lciti at essex.ac.uk Wed Aug 19 08:47:04 2009
From: lciti at essex.ac.uk (Citi, Luca)
Date: Wed, 19 Aug 2009 13:47:04 +0100
Subject: [Numpy-discussion] why does b[:-0] not work, and is there an
	elegant solution?
References: <6946b9500908190525p55333accy5fd1d9e29780815e@mail.gmail.com>
Message-ID: <3DA3B328CBC48B4EBB88484B8A5EA19106AF9E77@sernt14.essex.ac.uk>

The problem is that n is an integer, and integers do not have different
representations for 0 and -0 (while floats do). Therefore it is
impossible to disambiguate the following two scenarios when n == 0:

>> b[:n]   # take the first n
>> b[:-n]  # take all but the last n

One possible solution (you decide whether it is elegant :-D):

>> b[:len(b)-n]

From nmb at wartburg.edu Wed Aug 19 08:50:46 2009
From: nmb at wartburg.edu (Neil Martinsen-Burrell)
Date: Wed, 19 Aug 2009 07:50:46 -0500
Subject: [Numpy-discussion] why does b[:-0] not work, and is there an
	elegant solution?
In-Reply-To: <6946b9500908190525p55333accy5fd1d9e29780815e@mail.gmail.com>
References: <6946b9500908190525p55333accy5fd1d9e29780815e@mail.gmail.com>
Message-ID: 

On Aug 19, 2009, at 7:25 AM, Mark Bakker wrote:
> [...]
> If I need all terms in the array, the logical syntax would be:
>
> b[:-0]
>
> but that doesn't work. Any reason why that has not been implemented?
> Any elegant workaround?

Because there is no negative zero as an integer:

>>> -0 == 0
True

So when Python parses your request, it sees "-0" and replaces that with
the integer 0. And as you found out, b[:0] gives you an empty slice.
A negative index -n is just syntactic sugar for N-n, where N is the
length of the list, and that works for n=0 as well:

>>> b = [1,2,3,4,5]
>>> b[:0]
[]
>>> b[:len(b)-0]
[1, 2, 3, 4, 5]

-Neil

From lciti at essex.ac.uk Wed Aug 19 08:54:28 2009
From: lciti at essex.ac.uk (Citi, Luca)
Date: Wed, 19 Aug 2009 13:54:28 +0100
Subject: [Numpy-discussion] why does b[:-0] not work, and is there an
	elegant solution?
References: <6946b9500908190525p55333accy5fd1d9e29780815e@mail.gmail.com>
Message-ID: <3DA3B328CBC48B4EBB88484B8A5EA19106AF9E78@sernt14.essex.ac.uk>

Another solution (elegant?? readable??):

>> x[slice(-n or None)]  # with n == 0, 1, ...

From sebastian.walter at gmail.com Wed Aug 19 09:03:38 2009
From: sebastian.walter at gmail.com (Sebastian Walter)
Date: Wed, 19 Aug 2009 15:03:38 +0200
Subject: [Numpy-discussion] why does b[:-0] not work, and is there an
	elegant solution?
In-Reply-To: <3DA3B328CBC48B4EBB88484B8A5EA19106AF9E78@sernt14.essex.ac.uk>
References: <6946b9500908190525p55333accy5fd1d9e29780815e@mail.gmail.com>
	<3DA3B328CBC48B4EBB88484B8A5EA19106AF9E78@sernt14.essex.ac.uk>
Message-ID: 

In [45]: x[: -0 or None]
Out[45]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [46]: x[: -1 or None]
Out[46]: array([0, 1, 2, 3, 4, 5, 6, 7, 8])

works fine without slice()

On Wed, Aug 19, 2009 at 2:54 PM, Citi, Luca wrote:
> Another solution (elegant?? readable??):
>>> x[slice(-n or None)]  # with n == 0, 1, ...

From markbak at gmail.com Wed Aug 19 09:28:06 2009
From: markbak at gmail.com (Mark Bakker)
Date: Wed, 19 Aug 2009 15:28:06 +0200
Subject: [Numpy-discussion] why does b[:-0] not work, and is there an
	elegant solution?
Message-ID: <6946b9500908190628k1220eac2gc1f0c1638dcf9105@mail.gmail.com>

The winner so far:

x[: -n or None]

works fine when n = 0; relatively elegant, even pretty slick I think. And
I expect it to be quick.

Thanks for all the replies,

Mark
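Wrapped up as a tiny helper, with the n == 0 edge case spelled out in a
comment (a small sketch):

def drop_last(b, n):
    # -0 is just 0, which is falsy, so `or` falls through to None,
    # and b[:None] is the whole sequence
    return b[:-n or None]

drop_last([1, 2, 3, 4, 5], 2)   # [1, 2, 3]
drop_last([1, 2, 3, 4, 5], 0)   # [1, 2, 3, 4, 5]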
From Nicolas.Rougier at loria.fr Wed Aug 19 16:45:49 2009
From: Nicolas.Rougier at loria.fr (Nicolas Rougier)
Date: Wed, 19 Aug 2009 22:45:49 +0200
Subject: [Numpy-discussion] Recipe: extract a sub-array using given shape,
	centered on given position
Message-ID: 

Hi,

I've coded a function that allows one to extract a contiguous sub-array
from another array, using a given shape and centered on a given position.
I did not find an equivalent within numpy, so I hope I did not miss it.

The only interest of the function is to guarantee that the resulting
sub-array will have the required shape. If some values are out of bounds,
the result array is padded with a fill value.

Hope it can be useful to someone.

Nicolas

Code:
-----

import numpy

def extract(Z, shape, position, fill=numpy.NaN):
    """ Extract a sub-array from Z using given shape and centered on
        position. If some part of the sub-array is out of Z bounds, the
        result is padded with fill value.

        **Parameters**
            `Z` : array_like
                Input array.
            `shape` : tuple
                Shape of the output array
            `position` : tuple
                Position within Z
            `fill` : scalar
                Fill value

        **Returns**
            `out` : array_like
                Z slice with given shape and center

        **Examples**

        >>> Z = numpy.arange(0,16).reshape((4,4))
        >>> extract(Z, shape=(3,3), position=(0,0))
        [[ NaN  NaN  NaN]
         [ NaN   0.   1.]
         [ NaN   4.   5.]]

        Schema:

            +-----------+
            | 0   0   0 |  =  extract (Z, shape=(3,3), position=(0,0))
            |   +---------------+
            | 0 | 0   1 | 2   3 |  =  Z
            |   |       |       |
            | 0 | 4   5 | 6   7 |
            +---|-------+       |
                | 8   9  10  11 |
                |               |
                | 12 13  14  15 |
                +---------------+

        >>> Z = numpy.arange(0,16).reshape((4,4))
        >>> extract(Z, shape=(3,3), position=(3,3))
        [[ 10.  11.  NaN]
         [ 14.  15.  NaN]
         [ NaN  NaN  NaN]]

        Schema:

            +---------------+
            | 0   1   2   3 |  =  Z
            |               |
            | 4   5   6   7 |
            |       +-----------+
            | 8   9 |10  11 | 0 |  =  extract (Z, shape=(3,3), position=(3,3))
            |       |       |   |
            | 12 13 |14  15 | 0 |
            +-------|-------+   |
                    | 0   0   0 |
                    +-----------+
    """
    # assert(len(position) == len(Z.shape))
    # if len(shape) < len(Z.shape):
    #     shape = shape + Z.shape[len(Z.shape)-len(shape):]

    R = numpy.ones(shape, dtype=Z.dtype)*fill
    P = numpy.array(list(position)).astype(int)
    Rs = numpy.array(list(R.shape)).astype(int)
    Zs = numpy.array(list(Z.shape)).astype(int)

    R_start = numpy.zeros((len(shape),)).astype(int)
    R_stop = numpy.array(list(shape)).astype(int)
    Z_start = (P-Rs//2)
    Z_stop = (P+Rs//2)+Rs%2

    R_start = (R_start - numpy.minimum(Z_start,0)).tolist()
    Z_start = (numpy.maximum(Z_start,0)).tolist()
    R_stop = (R_stop - numpy.maximum(Z_stop-Zs,0)).tolist()
    Z_stop = (numpy.minimum(Z_stop,Zs)).tolist()

    r = [slice(start,stop) for start,stop in zip(R_start,R_stop)]
    z = [slice(start,stop) for start,stop in zip(Z_start,Z_stop)]

    R[r] = Z[z]
    return R

Z = numpy.arange(0,16).reshape((4,4))
print Z
print
print extract(Z, shape=(3,3), position=(0,0))
print
print extract(Z, shape=(3,3), position=(3,3))

From oliphant at enthought.com Wed Aug 19 19:40:25 2009
From: oliphant at enthought.com (Travis Oliphant)
Date: Wed, 19 Aug 2009 18:40:25 -0500
Subject: [Numpy-discussion] why does b[:-0] not work, and is there an
	elegant solution?
In-Reply-To: <6946b9500908190525p55333accy5fd1d9e29780815e@mail.gmail.com>
References: <6946b9500908190525p55333accy5fd1d9e29780815e@mail.gmail.com>
Message-ID: 

Use N = len(b) and then b[:N-n]

-Travis

--
(mobile phone of)
Travis Oliphant
Enthought, Inc.
1-512-536-1057
http://www.enthought.com

On Aug 19, 2009, at 7:25 AM, Mark Bakker wrote:
> Hello list,
>
> I compute the index of the last term in an array that I need and call
> the index n.
>
> I can then call the array b as
>
> b[:-n]
>
> If I need all terms in the array, the logical syntax would be:
>
> b[:-0]
>
> but that doesn't work. Any reason why that has not been implemented?
> Any elegant workaround?
>
> Thanks, Mark

From cournape at gmail.com Wed Aug 19 21:22:48 2009
From: cournape at gmail.com (David Cournapeau)
Date: Wed, 19 Aug 2009 18:22:48 -0700
Subject: [Numpy-discussion] why does b[:-0] not work, and is there an
	elegant solution?
In-Reply-To: 
References: <6946b9500908190525p55333accy5fd1d9e29780815e@mail.gmail.com>
Message-ID: <5b8d13220908191822q288bb9aew71fa8cb713bceb77@mail.gmail.com>

On Wed, Aug 19, 2009 at 5:50 AM, Neil Martinsen-Burrell wrote:
> [...]
> Because there is no negative zero as an integer:
>
>  >>> -0 == 0
> True

Not that it matters for the discussion, but -0.0 == 0.0:

x = np.array(np.PZERO)
y = np.array(np.NZERO)
y == x          # True
1 / x == 1 / y  # False: inf and negative inf

The only way to differentiate the number by itself is signbit,

cheers,

David
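In code, that distinction looks roughly like this (a short sketch; the
divisions may emit a divide-by-zero warning):

import numpy as np

x = np.array(np.PZERO)          # +0.0
y = np.array(np.NZERO)          # -0.0
x == y                          # True: IEEE 754 zeros compare equal
1.0 / x, 1.0 / y                # (inf, -inf): the sign surfaces here
np.signbit(x), np.signbit(y)    # (False, True): signbit tells them apart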
>
> I can then call the array b as
>
> b[:-n]
>
> If I need all terms in the array, the logical syntax would be:
>
> b[:-0]
>
> but that doesn't work. Any reason why that has not been implemented?
> Any elegant workaround?
>
> Thanks, Mark
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion

From cournape at gmail.com  Wed Aug 19 21:22:48 2009
From: cournape at gmail.com (David Cournapeau)
Date: Wed, 19 Aug 2009 18:22:48 -0700
Subject: [Numpy-discussion] why does b[:-0] not work, and is there an elegant solution?
In-Reply-To:
References: <6946b9500908190525p55333accy5fd1d9e29780815e@mail.gmail.com>
Message-ID: <5b8d13220908191822q288bb9aew71fa8cb713bceb77@mail.gmail.com>

On Wed, Aug 19, 2009 at 5:50 AM, Neil Martinsen-Burrell wrote:
> On Aug 19, 2009, at 7:25 AM, Mark Bakker wrote:
>> I compute the index of the last term in an array that I need and
>> call the index n.
>>
>> I can then call the array b as
>>
>> b[:-n]
>>
>> If I need all terms in the array, the logical syntax would be:
>>
>> b[:-0]
>>
>> but that doesn't work. Any reason why that has not been implemented?
>> Any elegant workaround?
>
> Because there is no negative zero as an integer:
>
> >>> -0 == 0
> True

Not that it matters for the discussion, but -0.0 == 0.0:

x = np.array(np.PZERO)
y = np.array(np.NZERO)
y == x # True
1 / x == 1 / y # False: inf and negative inf

The only way to differentiate the number by itself is signbit.

cheers,

David

From erik.tollerud at gmail.com  Thu Aug 20 03:37:07 2009
From: erik.tollerud at gmail.com (Erik Tollerud)
Date: Thu, 20 Aug 2009 00:37:07 -0700
Subject: [Numpy-discussion] Fwd: GPU Numpy
In-Reply-To: <4A7B62FC.6090504@molden.no>
References: <7f1eaee30908061041l2cd76f64r96e7f5c7c16a2483@mail.gmail.com>
	<4A7B43CE.7050509@molden.no>
	<7f1eaee30908061429v5d04ab77v18b37a0a177548cd@mail.gmail.com>
	<4A7B5B06.2080909@molden.no> <4A7B62FC.6090504@molden.no>
Message-ID:

I realize this topic is a bit old, but I couldn't help but add
something I forgot to mention earlier...

>> I mean, once the computations are moved elsewhere numpy is basically a
>> convenient way to address memory.
>
> That is how I mostly use NumPy, though. Computations I often do in
> Fortran 95 or C.
>
> NumPy arrays on the GPU memory is an easy task. But then I would have to
> write the computation in OpenCL's dialect of C99? But I'd rather program
> everything in Python if I could. Details like GPU and OpenCL should be
> hidden away. Nice looking Python with NumPy is much easier to read and
> write. That is why I'd like to see a code generator (i.e. JIT compiler)
> for NumPy.

This is true to some extent, but also probably difficult to do given
the fact that parallelizable algorithms are generally more difficult
to formulate in straightforward ways. In the intermediate-term, I
think there is value in having numpy implement some sort of interface
to OpenCL or cuda - I can easily see an explosion of different
bindings (it's already starting), and having a "canonical" way encoded
in numpy or scipy is probably the best way to mitigate the inevitable
compatibility problems... I'm partial to the way pycuda can do it
(basically, just export numpy arrays to the GPU and let you write the
code from there), but the main point is to just get some basic
compatibility in pretty quickly, as I think this GPGPU is here to
stay...
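To make the pycuda route concrete, a rough sketch (this assumes a
CUDA-capable machine with the pycuda package installed, and mirrors
pycuda's own introductory example; none of it is part of numpy itself):

import numpy as np
import pycuda.autoinit            # sets up a CUDA context on import
import pycuda.gpuarray as gpuarray

a = np.random.randn(4, 4).astype(np.float32)
a_gpu = gpuarray.to_gpu(a)        # copy the NumPy array to GPU memory
a_doubled = (2 * a_gpu).get()     # elementwise work on the device,
                                  # then copy back as a NumPy array

np.allclose(a_doubled, 2 * a)     # True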
From faltet at pytables.org  Thu Aug 20 07:07:23 2009
From: faltet at pytables.org (Francesc Alted)
Date: Thu, 20 Aug 2009 13:07:23 +0200
Subject: [Numpy-discussion] Accelerating NumPy computations [Was: GPU Numpy]
In-Reply-To:
References: <7f1eaee30908061041l2cd76f64r96e7f5c7c16a2483@mail.gmail.com>
	<4A7B43CE.7050509@molden.no>
	<7f1eaee30908061429v5d04ab77v18b37a0a177548cd@mail.gmail.com>
	<4A7B5B06.2080909@molden.no> <4A7B62FC.6090504@molden.no>
Message-ID: <1250766444.5546.40.camel@inspiron>

On Thu, 20 Aug 2009 at 00:37 -0700, Erik Tollerud wrote:
> > NumPy arrays on the GPU memory is an easy task. But then I would have to
> > write the computation in OpenCL's dialect of C99? But I'd rather program
> > everything in Python if I could. Details like GPU and OpenCL should be
> > hidden away. Nice looking Python with NumPy is much easier to read and
> > write. That is why I'd like to see a code generator (i.e. JIT compiler)
> > for NumPy.
>
> This is true to some extent, but also probably difficult to do given
> the fact that parallelizable algorithms are generally more difficult
> to formulate in straightforward ways. In the intermediate-term, I
> think there is value in having numpy implement some sort of interface
> to OpenCL or cuda - I can easily see an explosion of different
> bindings (it's already starting), and having a "canonical" way encoded
> in numpy or scipy is probably the best way to mitigate the inevitable
> compatibility problems... I'm partial to the way pycuda can do it
> (basically, just export numpy arrays to the GPU and let you write the
> code from there), but the main point is to just get some basic
> compatibility in pretty quickly, as I think this GPGPU is here to
> stay...

Maybe. However I think that we should not forget the fact that, as
Sturla pointed out, the main bottleneck for *many* problems nowadays is
memory access, not CPU speed. GPUs may have faster memory, but only a
few % better than mainstream memory.

I'd like to hear from anyone here who has achieved any kind of speed-up
in their calculations by using GPUs instead of CPUs. By looking at
these scenarios we may get an idea of where GPUs can be useful, and
whether driving an effort to support them in NumPy would be worth it.

I personally think that, in general, exposing GPU capabilities directly
to NumPy would provide little service for most NumPy users. I would
rather leave this task to specialized libraries (like PyCUDA, or
special versions of ATLAS, for example) that can be used from NumPy.

Until then, I think that a more direct approach (and one that would
deliver results earlier) for speeding up NumPy is to be aware of the
hierarchical nature of the different memory levels in current CPUs and
make NumPy play nicely with them. In that sense, I think that applying
the blocking technique (see [1] for a brief explanation) to take
advantage of both spatial and temporal locality is the way to go. For
example, most of the speed-up that Numexpr achieves comes from the fact
that it uses blocking during the evaluation of complex expressions.
This works because the temporaries are kept small and can fit in
current CPU caches. Implementing similar algorithms in NumPy should not
be that difficult, especially now that the Numexpr implementation
already exists as a model.
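A small sketch of that Numexpr point (assuming the numexpr package is
installed; evaluate() compiles the expression and processes the
operands in cache-sized blocks, so no array-sized temporaries are
created):

import numpy as np
import numexpr as ne

a = np.random.rand(1000000)
b = np.random.rand(1000000)

# plain NumPy materializes full-size temporaries for 2*a and 3*b
# before the final addition
c1 = 2*a + 3*b

# numexpr evaluates the whole expression block by block, keeping the
# working set inside the CPU cache
c2 = ne.evaluate("2*a + 3*b")

np.allclose(c1, c2)     # True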
And another thing that may further help to fight memory slowness (or
CPU/GPU quickness, as you prefer ;-) in the near future is compression.
Compression already helped bring data faster from disk to CPU over the
last 10 years, and now it is almost time for the same to happen with
memory, not only disk. In [1] I demonstrated that compression can
*already* help in getting data from memory to the CPU. Agreed, right
now this is only true for highly compressible data (which is an
important corner case anyway), but in the near future we will see how
the compression technique will be able to accelerate computations for a
wide variety of datasets, even if they are not very compressible. So,
in my humble opinion, making it possible for NumPy to deal with
compressed buffers in addition to uncompressed ones could be very
interesting in the near future (or even now, in specific situations).

[1] http://www.pytables.org/docs/StarvingCPUs.pdf

Francesc

From ralf.gommers at googlemail.com  Thu Aug 20 12:18:07 2009
From: ralf.gommers at googlemail.com (Ralf Gommers)
Date: Thu, 20 Aug 2009 12:18:07 -0400
Subject: [Numpy-discussion] what to do with chararray
Message-ID:

[this discussion moved here from the SciPy list]

On Thu, Aug 20, 2009 at 9:50 AM, Christopher Hanley wrote:
>
> Hi,
>
> I'd like to respectfully request that we move any discussion of what
> to do with the numpy.char module to the numpy list.
>
> I'm a little concerned about some of the assumptions that are being
> made about the number of users of the module.

My assumption about there being few users of chararray is based on the
absence of questions about it on the list, the lack of docs and the
apparent lack of a good use-case. Plus "who uses this?" has been asked
twice on the list, and in both cases you (Chris) were the only one
replying.

> I would also like to
> better understand the reasons for wanting to dump it. Let me be
> clear. I'm not opposed to change. However breaking other people's
> code just for the sake of change seems like a poor reason and a mean
> thing to do to our customers.

That would be a very poor reason, and I don't think that's the case.
The reason this question about the future of chararray has now been
asked twice is that people are trying to document it, and having
trouble. If it stays, it has to be documented (and the bugs found in
the process fixed), which costs time as well. Its mere presence will
also lead people to try to use it; if it then turns out to be useless
to them, it has wasted their time. This was the case for me.

It would be great if we could find a clear use-case and a reason why
chararray is in NumPy besides backwards compatibility. Otherwise it
should at least be documented as not for new development, and possibly
deprecated.

Finally, deprecation does not mean that the module disappears tomorrow.
It can stick around for years if needed (there are functions in fft,
for example, that have been deprecated for three years), while giving a
clear message to new users not to bother with it.

Best regards,
Ralf

>
> Thank you for your time and help,
> Chris
>
>
> --
> Christopher Hanley
> Senior Systems Software Engineer
> Space Telescope Science Institute
> 3700 San Martin Drive
> Baltimore MD, 21218
> (410) 338-4338
>
> On Aug 20, 2009, at 1:35 AM, Robert Kern wrote:
>
> > On Wed, Aug 19, 2009 at 20:03, David
> > Goldsmith wrote:
> >> I'm going to take it a step further: "breakage" is always the
> >> deterrent to change, and yet "change we must" (i.e., "adapt or
> >> die").
It's certainly not without precedent - even within Numpy, I > >> believe - for things (though perhaps not whole namespaces) to be > >> deemed "to-be-deprecated," have a warning to this effect > >> established in one x.[even #].0 release, and then be removed by the > >> x.[even # + 2 or + 4].0 release. How has deprecation in Numpy > >> worked in the past - by dictum, vote, or consensus? > > > > Consensus or dictum without major objection. Voting is pointless > > except to inform one of those. > > > > -- > > Robert Kern > > > > "I have come to believe that the whole world is an enigma, a harmless > > enigma that is made terrible by our own mad attempt to interpret it as > > though it had an underlying truth." > > -- Umberto Eco > > _______________________________________________ > > Scipy-dev mailing list > > Scipy-dev at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-dev > > _______________________________________________ > Scipy-dev mailing list > Scipy-dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Thu Aug 20 12:23:27 2009 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 20 Aug 2009 09:23:27 -0700 Subject: [Numpy-discussion] what to do with chararray In-Reply-To: References: Message-ID: <3d375d730908200923t27c04c3fn124430dd7440580b@mail.gmail.com> On Thu, Aug 20, 2009 at 09:18, Ralf Gommers wrote: > [this discussion moved here from the SciPy list] > > On Thu, Aug 20, 2009 at 9:50 AM, Christopher Hanley > wrote: >> >> Hi, >> >> I'd like to respectfully request that we move any discussion of what >> to do with the numpy.char module to the numpy list. >> >> I'm a little concerned about some of the assumptions that are being >> made about the number of users of the module. > > My assumption about there being few users of chararray is based on the > absence of questions about it being asked on the list, the lack of docs and > the apparent lack of a good use-case. Plus "who uses this?" has been asked > twice on the list, and in both cases you (Chris) were the only one replying. In particular, Chris, do you know anyone who uses chararray? Do you think you could convince them to write a few docstrings or contribute a couple of examples? Does anyone know anyone who uses chararray? -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From d_l_goldsmith at yahoo.com Thu Aug 20 12:27:45 2009 From: d_l_goldsmith at yahoo.com (David Goldsmith) Date: Thu, 20 Aug 2009 09:27:45 -0700 (PDT) Subject: [Numpy-discussion] Deprecate chararray [was [SciPy-dev] Plea for help] Message-ID: <857977.74958.qm@web52106.mail.re2.yahoo.com> --- On Thu, 8/20/09, Christopher Hanley wrote: > I'd like to respectfully request that we move any > discussion of what? > to do with the numpy.char module to the numpy list. NP, done. > I'm a little concerned about some of the assumptions that > are being? > made about the number of users of the module.? I would > also like to? > better understand the reasons for wanting to dump it.? 
I think Ralf did a pretty good job of synopsizing the reasons for deprecation, but since we're moving the thread, I'll reprint them here: 0) "it gets very little use" (an assumption you presumably dispute); 1) "is pretty much undocumented" (less true than a week ago, but still true for several of the attributes, with another handful or so falling into the category of "poorly documented"); 2) "probably more buggy than most other parts of NumPy" ("probably" being a euphemism, IMO); 3) "there is not a really good use-case for it" (a conjecture, but one that has yet to be challenged by counter-example); 4) it's not the first time its presence in NumPy has been questioned ("as Stefan pointed out when asking this same question last year") 5) NumPy already has a (perhaps superior) alternative ("object arrays would do nicely if one needs this functionality"); to which I'll add: 6) it is, on its face, "counter to the spirit" of NumPy. So far, IIRC, the only reason in favor of its continued inclusion is inertia. > Let me be? > clear.? I'm not opposed to change.? However > breaking other people's? > code just for the sake of change seems like a poor reason So, I don't think we're proposing this "just for the sake of change" > and a mean? > thing to do to our customers. Apologies, but it is not proposed maliciously. The only other things I would add by way of "review" from the scipy-dev thread: a compromise proposal (made by Ralf): "Put clearly in the docs that this module exists for backwards compatibility reasons, and is not recommended for new development" and a clarification of deprecation process (provided by Robert): "[asked by the present author] How has deprecation in Numpy worked in the past - by dictum, vote, or consensus? [Robert's answer] Consensus or dictum without major objection. Voting is pointless except to inform one of those." Thanks for your time and consideration. David Goldsmith > Thank you for your time and help, > Chris > > > -- > Christopher Hanley > Senior Systems Software Engineer > Space Telescope Science Institute > 3700 San Martin Drive > Baltimore MD, 21218 > (410) 338-4338 > > On Aug 20, 2009, at 1:35 AM, Robert Kern wrote: > > > On Wed, Aug 19, 2009 at 20:03, David? > > Goldsmith > wrote: > >> I'm going to take it a step further: "breakage" is > always the? > >> deterrent to change, and yet "change we must" > (i.e., "adapt or? > >> die").? It's certainly not without precedent > - even within Numpy, I? > >> believe - for things (though perhaps not whole > namespaces) to be? > >> deemed "to-be-deprecated," have a warning to this > effect? > >> established in one x.[even #].0 release, and then > be removed by the? > >> x.[even # + 2 or + 4].0 release.? How has > deprecation in Numpy? > >> worked in the past - by dictum, vote, or > consensus? > > > > Consensus or dictum without major objection. Voting is > pointless > > except to inform one of those. > > > > -- > > Robert Kern > > > > "I have come to believe that the whole world is an > enigma, a harmless > > enigma that is made terrible by our own mad attempt to > interpret it as > > though it had an underlying truth." > >? 
-- Umberto Eco > > _______________________________________________ > > Scipy-dev mailing list > > Scipy-dev at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-dev > > _______________________________________________ > Scipy-dev mailing list > Scipy-dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > From chanley at stsci.edu Thu Aug 20 13:32:08 2009 From: chanley at stsci.edu (Christopher Hanley) Date: Thu, 20 Aug 2009 13:32:08 -0400 Subject: [Numpy-discussion] Deprecate chararray [was [SciPy-dev] Plea for help] In-Reply-To: <857977.74958.qm@web52106.mail.re2.yahoo.com> References: <857977.74958.qm@web52106.mail.re2.yahoo.com> Message-ID: <812BBECE-D1E8-4699-980A-BB8FB9657CB9@stsci.edu> Here is what I know about the chararray usage at STScI since first looking into it this morning. It is used in PyFITS and within the COS instrument calibration code. I have not heard back from the other projects yet given most of our developers are away at this time. It appears that the COS code can be changed easily. I am waiting to hear back from PyFITs. Also, I do not know how many people use this particular feature. However I would point out that many people who use numpy are not also on the mailing lists. Most of the STScI do not follow the numpy list. I serve as our point of contact to the numpy community. I'm trying to gather a list of projects that use this feature and specific use cases for you. As I do not use this module myself I cannot counter your arguments at this time. If we decide to deprecate this module would we reverse this decision if we then find out that the assumptions that went into the decision were in error? Another concern is that we told people coming from numarray to use this module. It is my opinion that at this point in the numpy release cycle that an API change needs a very strong justification. Anecdotes about the number of users, a "change or die" philosophy, and an un- articulated notion of "the spirit of numpy" do not in my consideration meet that high bar. If you would like us to provide additional documentation and tests that would be possible. I'll do it myself if that is the only think keeping the module from remaining in numpy. This also raises the question of what else is going to go? Will recarray be removed? What about the numarray c-api compatibility layer? Like I said earlier, I'm not opposed to change. I am just of the opinion that this isn't a simple, cut and dry decision. For those at SciPy 2009 feel free to come yell at me and beat me with sticks. I'm the fat guy in jeans and a blue shirt sitting towards the back middle on the left. Cheers, Chris -- Christopher Hanley Senior Systems Software Engineer Space Telescope Science Institute 3700 San Martin Drive Baltimore MD, 21218 (410) 338-4338 On Aug 20, 2009, at 12:27 PM, David Goldsmith wrote: > --- On Thu, 8/20/09, Christopher Hanley wrote: > >> I'd like to respectfully request that we move any >> discussion of what >> to do with the numpy.char module to the numpy list. > > NP, done. > >> I'm a little concerned about some of the assumptions that >> are being >> made about the number of users of the module. I would >> also like to >> better understand the reasons for wanting to dump it. 
> > I think Ralf did a pretty good job of synopsizing the reasons for > deprecation, but since we're moving the thread, I'll reprint them > here: > > 0) "it gets very little use" (an assumption you presumably dispute); > > 1) "is pretty much undocumented" (less true than a week ago, but > still true for several of the attributes, with another handful or so > falling into the category of "poorly documented"); > > 2) "probably more buggy than most other parts of NumPy" ("probably" > being a euphemism, IMO); > > 3) "there is not a really good use-case for it" (a conjecture, but > one that has yet to be challenged by counter-example); > > 4) it's not the first time its presence in NumPy has been questioned > ("as Stefan pointed out when asking this same question last year") > > 5) NumPy already has a (perhaps superior) alternative ("object > arrays would do nicely if one needs this functionality"); > > to which I'll add: > > 6) it is, on its face, "counter to the spirit" of NumPy. > > So far, IIRC, the only reason in favor of its continued inclusion is > inertia. > >> Let me be >> clear. I'm not opposed to change. However >> breaking other people's >> code just for the sake of change seems like a poor reason > > So, I don't think we're proposing this "just for the sake of change" > >> and a mean >> thing to do to our customers. > > Apologies, but it is not proposed maliciously. > > The only other things I would add by way of "review" from the scipy- > dev thread: > > a compromise proposal (made by Ralf): > > "Put clearly in the docs that this module exists for backwards > compatibility reasons, and is not recommended for new development" > > and a clarification of deprecation process (provided by Robert): > > "[asked by the present author] How has deprecation in Numpy worked > in the past - by dictum, vote, or consensus? > > [Robert's answer] Consensus or dictum without major objection. > Voting is pointless except to inform one of those." > > Thanks for your time and consideration. > > David Goldsmith > >> Thank you for your time and help, >> Chris >> >> >> -- >> Christopher Hanley >> Senior Systems Software Engineer >> Space Telescope Science Institute >> 3700 San Martin Drive >> Baltimore MD, 21218 >> (410) 338-4338 >> >> On Aug 20, 2009, at 1:35 AM, Robert Kern wrote: >> >>> On Wed, Aug 19, 2009 at 20:03, David >>> Goldsmith >> wrote: >>>> I'm going to take it a step further: "breakage" is >> always the >>>> deterrent to change, and yet "change we must" >> (i.e., "adapt or >>>> die"). It's certainly not without precedent >> - even within Numpy, I >>>> believe - for things (though perhaps not whole >> namespaces) to be >>>> deemed "to-be-deprecated," have a warning to this >> effect >>>> established in one x.[even #].0 release, and then >> be removed by the >>>> x.[even # + 2 or + 4].0 release. How has >> deprecation in Numpy >>>> worked in the past - by dictum, vote, or >> consensus? >>> >>> Consensus or dictum without major objection. Voting is >> pointless >>> except to inform one of those. >>> >>> -- >>> Robert Kern >>> >>> "I have come to believe that the whole world is an >> enigma, a harmless >>> enigma that is made terrible by our own mad attempt to >> interpret it as >>> though it had an underlying truth." 
>>> -- Umberto Eco >>> _______________________________________________ >>> Scipy-dev mailing list >>> Scipy-dev at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-dev >> >> _______________________________________________ >> Scipy-dev mailing list >> Scipy-dev at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-dev >> > > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From cournape at gmail.com Thu Aug 20 15:04:10 2009 From: cournape at gmail.com (David Cournapeau) Date: Thu, 20 Aug 2009 12:04:10 -0700 Subject: [Numpy-discussion] [SciPy-dev] Deprecate chararray [was Plea for help] In-Reply-To: <812BBECE-D1E8-4699-980A-BB8FB9657CB9@stsci.edu> References: <857977.74958.qm@web52106.mail.re2.yahoo.com> <812BBECE-D1E8-4699-980A-BB8FB9657CB9@stsci.edu> Message-ID: <5b8d13220908201204s3c74cad1pabccdce47d3a13a1@mail.gmail.com> On Thu, Aug 20, 2009 at 10:32 AM, Christopher Hanley wrote: > > Another concern is that we told people coming from numarray to use > this module. ?It is my opinion that at this point in the numpy release > cycle that an API change needs a very strong justification. ?Anecdotes > about the number of users, a "change or die" philosophy, and an un- > articulated notion of ?"the spirit of numpy" ?do not in my > consideration meet that high bar. I agree those are not strong reasons without more backing. What worries me the most, in both numpy and scipy, is code that nobody knows about, without any test or documentation. When it breaks, we can't fix it. That's unsustainable in the long term, because it takes a lot of time that people could spend somewhere else more useful. Especially when you have C code which does not work on some platforms, with new version of python (python 3k port, for example). I much prefer removing code to having code that barely works and cannot be maintained. Old code that people are ready to maintain, I have nothing against. cheers, David From sccolbert at gmail.com Thu Aug 20 16:52:08 2009 From: sccolbert at gmail.com (Chris Colbert) Date: Thu, 20 Aug 2009 16:52:08 -0400 Subject: [Numpy-discussion] nosetests and permissions Message-ID: <7f014ea60908201352v395a09e8qff6d6f1abd6c743c@mail.gmail.com> when I build numpy from source via: python setup.py build sudo python setup.py install the nosetests fail because of permissions: In [5]: np.test() Running unit tests for numpy NumPy version 1.3.0 NumPy is installed in /usr/local/lib/python2.6/dist-packages/numpy Python version 2.6.2 (release26-maint, Apr 19 2009, 01:58:18) [GCC 4.3.3] nose version 0.10.4 ---------------------------------------------------------------------- Ran 0 tests in 0.007s OK Out[5]: The problem I'm running into is I can't do a blanket chmod 664 *.py on the numpy directory because that breaks things. And since I don't which files are nosetests, it's very difficult to change by hand. Is there a workaround for this, or would it more appropriate for the numpy build script to set the permissions of the test file accordingly? 
Cheers, Chris From robert.kern at gmail.com Thu Aug 20 16:58:48 2009 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 20 Aug 2009 13:58:48 -0700 Subject: [Numpy-discussion] nosetests and permissions In-Reply-To: <7f014ea60908201352v395a09e8qff6d6f1abd6c743c@mail.gmail.com> References: <7f014ea60908201352v395a09e8qff6d6f1abd6c743c@mail.gmail.com> Message-ID: <3d375d730908201358p3e57b8f8y9ba0cfd9946ca9a4@mail.gmail.com> On Thu, Aug 20, 2009 at 13:52, Chris Colbert wrote: > when I build numpy from source via: > > python setup.py build > sudo python setup.py install > > > the nosetests fail because of permissions: What permissions do your files have? If they're not readable for whatever reason, you would be SOL no matter what. The only fixable issue I am aware of is that nosetests does not like to collect tests in executable files (nose 0.11 has an option to permit that). However, I don't know why such a standard installation would do either of those things. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From kwgoodman at gmail.com Thu Aug 20 17:03:38 2009 From: kwgoodman at gmail.com (Keith Goodman) Date: Thu, 20 Aug 2009 14:03:38 -0700 Subject: [Numpy-discussion] nosetests and permissions In-Reply-To: <7f014ea60908201352v395a09e8qff6d6f1abd6c743c@mail.gmail.com> References: <7f014ea60908201352v395a09e8qff6d6f1abd6c743c@mail.gmail.com> Message-ID: On Thu, Aug 20, 2009 at 1:52 PM, Chris Colbert wrote: > when I build numpy from source via: > > python setup.py build > sudo python setup.py install > > > the nosetests fail because of permissions: > > In [5]: np.test() > Running unit tests for numpy > NumPy version 1.3.0 > NumPy is installed in /usr/local/lib/python2.6/dist-packages/numpy > Python version 2.6.2 (release26-maint, Apr 19 2009, 01:58:18) [GCC 4.3.3] > nose version 0.10.4 > > ---------------------------------------------------------------------- > Ran 0 tests in 0.007s > > OK > Out[5]: > > > The problem I'm running into is I can't do a blanket chmod 664 *.py on > the numpy directory because that breaks things. And since I don't > which files are nosetests, it's very difficult to change by hand. > > Is there a workaround for this, or would it more appropriate for the > numpy build script to set the permissions of the test file > accordingly? Works for me. But my numpy is in the site-packages directory. Did you move it to dist-packages? >> np.test() Running unit tests for numpy NumPy version 1.3.0 NumPy is installed in /usr/local/lib/python2.6/site-packages/numpy Python version 2.6.2 (release26-maint, Apr 19 2009, 01:58:18) [GCC 4.3.3] nose version 0.11.1 [snip] ---------------------------------------------------------------------- Ran 2030 tests in 5.033s OK (KNOWNFAIL=1, SKIP=11) From sccolbert at gmail.com Thu Aug 20 17:06:28 2009 From: sccolbert at gmail.com (Chris Colbert) Date: Thu, 20 Aug 2009 17:06:28 -0400 Subject: [Numpy-discussion] nosetests and permissions In-Reply-To: References: <7f014ea60908201352v395a09e8qff6d6f1abd6c743c@mail.gmail.com> Message-ID: <7f014ea60908201406p30003988l383cf0a5c00433ef@mail.gmail.com> the issue is that the files are executable. I have no idea why they are set that way either. This is numpy 1.3.0 built from source. the default install location for setup.py install is the local dist-packages. So that's where it is. 
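One blunt workaround, given Robert's point that nose (before 0.11)
skips executable files: clear the execute bits on the installed .py
files only. A rough sketch, assuming the dist-packages path above (run
it with the same privileges used to install):

import os
import stat

pkgdir = "/usr/local/lib/python2.6/dist-packages/numpy"  # adjust to your install
for root, dirs, files in os.walk(pkgdir):
    for name in files:
        if not name.endswith(".py"):
            continue
        path = os.path.join(root, name)
        mode = os.stat(path).st_mode
        # plain modules never need the execute bit, and nose (< 0.11)
        # refuses to collect tests from executable files
        os.chmod(path, mode & ~(stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH))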
On Thu, Aug 20, 2009 at 5:03 PM, Keith Goodman wrote: > On Thu, Aug 20, 2009 at 1:52 PM, Chris Colbert wrote: >> when I build numpy from source via: >> >> python setup.py build >> sudo python setup.py install >> >> >> the nosetests fail because of permissions: >> >> In [5]: np.test() >> Running unit tests for numpy >> NumPy version 1.3.0 >> NumPy is installed in /usr/local/lib/python2.6/dist-packages/numpy >> Python version 2.6.2 (release26-maint, Apr 19 2009, 01:58:18) [GCC 4.3.3] >> nose version 0.10.4 >> >> ---------------------------------------------------------------------- >> Ran 0 tests in 0.007s >> >> OK >> Out[5]: >> >> >> The problem I'm running into is I can't do a blanket chmod 664 *.py on >> the numpy directory because that breaks things. And since I don't >> which files are nosetests, it's very difficult to change by hand. >> >> Is there a workaround for this, or would it more appropriate for the >> numpy build script to set the permissions of the test file >> accordingly? > > Works for me. But my numpy is in the site-packages directory. Did you > move it to dist-packages? > >>> np.test() > Running unit tests for numpy > NumPy version 1.3.0 > NumPy is installed in /usr/local/lib/python2.6/site-packages/numpy > Python version 2.6.2 (release26-maint, Apr 19 2009, 01:58:18) [GCC 4.3.3] > nose version 0.11.1 > [snip] > ---------------------------------------------------------------------- > Ran 2030 tests in 5.033s > > OK (KNOWNFAIL=1, SKIP=11) > ? > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From sccolbert at gmail.com Thu Aug 20 17:06:51 2009 From: sccolbert at gmail.com (Chris Colbert) Date: Thu, 20 Aug 2009 17:06:51 -0400 Subject: [Numpy-discussion] nosetests and permissions In-Reply-To: <7f014ea60908201406p30003988l383cf0a5c00433ef@mail.gmail.com> References: <7f014ea60908201352v395a09e8qff6d6f1abd6c743c@mail.gmail.com> <7f014ea60908201406p30003988l383cf0a5c00433ef@mail.gmail.com> Message-ID: <7f014ea60908201406q3a6ba727t78e8c6fce6cf4d20@mail.gmail.com> this happens with scipy too... On Thu, Aug 20, 2009 at 5:06 PM, Chris Colbert wrote: > the issue is that the files are executable. I have no idea why they > are set that way either. This is numpy 1.3.0 built from source. > > the default install location for setup.py install is the local > dist-packages. So that's where it is. > > > > On Thu, Aug 20, 2009 at 5:03 PM, Keith Goodman wrote: >> On Thu, Aug 20, 2009 at 1:52 PM, Chris Colbert wrote: >>> when I build numpy from source via: >>> >>> python setup.py build >>> sudo python setup.py install >>> >>> >>> the nosetests fail because of permissions: >>> >>> In [5]: np.test() >>> Running unit tests for numpy >>> NumPy version 1.3.0 >>> NumPy is installed in /usr/local/lib/python2.6/dist-packages/numpy >>> Python version 2.6.2 (release26-maint, Apr 19 2009, 01:58:18) [GCC 4.3.3] >>> nose version 0.10.4 >>> >>> ---------------------------------------------------------------------- >>> Ran 0 tests in 0.007s >>> >>> OK >>> Out[5]: >>> >>> >>> The problem I'm running into is I can't do a blanket chmod 664 *.py on >>> the numpy directory because that breaks things. And since I don't >>> which files are nosetests, it's very difficult to change by hand. >>> >>> Is there a workaround for this, or would it more appropriate for the >>> numpy build script to set the permissions of the test file >>> accordingly? >> >> Works for me. 
But my numpy is in the site-packages directory. Did you >> move it to dist-packages? >> >>>> np.test() >> Running unit tests for numpy >> NumPy version 1.3.0 >> NumPy is installed in /usr/local/lib/python2.6/site-packages/numpy >> Python version 2.6.2 (release26-maint, Apr 19 2009, 01:58:18) [GCC 4.3.3] >> nose version 0.11.1 >> [snip] >> ---------------------------------------------------------------------- >> Ran 2030 tests in 5.033s >> >> OK (KNOWNFAIL=1, SKIP=11) >> ? >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > From robert.kern at gmail.com Thu Aug 20 17:09:04 2009 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 20 Aug 2009 14:09:04 -0700 Subject: [Numpy-discussion] nosetests and permissions In-Reply-To: <7f014ea60908201406p30003988l383cf0a5c00433ef@mail.gmail.com> References: <7f014ea60908201352v395a09e8qff6d6f1abd6c743c@mail.gmail.com> <7f014ea60908201406p30003988l383cf0a5c00433ef@mail.gmail.com> Message-ID: <3d375d730908201409x73d6596r7dd1c6a6ff55dc68@mail.gmail.com> On Thu, Aug 20, 2009 at 14:06, Chris Colbert wrote: > the issue is that the files are executable. I have no idea why they > are set that way either. This is numpy 1.3.0 built from source. Are you sure that those are exactly the commands that you executed? You didn't invoke setuptools in any way? -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From sccolbert at gmail.com Thu Aug 20 17:13:33 2009 From: sccolbert at gmail.com (Chris Colbert) Date: Thu, 20 Aug 2009 17:13:33 -0400 Subject: [Numpy-discussion] nosetests and permissions In-Reply-To: <3d375d730908201409x73d6596r7dd1c6a6ff55dc68@mail.gmail.com> References: <7f014ea60908201352v395a09e8qff6d6f1abd6c743c@mail.gmail.com> <7f014ea60908201406p30003988l383cf0a5c00433ef@mail.gmail.com> <3d375d730908201409x73d6596r7dd1c6a6ff55dc68@mail.gmail.com> Message-ID: <7f014ea60908201413o2a3811a3ode83bfaf08cf8876@mail.gmail.com> nope. I build Atlas, and modified site.cfg to find those libs in /usr/local/lib/atlas/ then i did: python setup.py build sudo python setup.py install that's it. On Thu, Aug 20, 2009 at 5:09 PM, Robert Kern wrote: > On Thu, Aug 20, 2009 at 14:06, Chris Colbert wrote: >> the issue is that the files are executable. I have no idea why they >> are set that way either. This is numpy 1.3.0 built from source. > > Are you sure that those are exactly the commands that you executed? > You didn't invoke setuptools in any way? > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." 
> ?-- Umberto Eco > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From robert.kern at gmail.com Thu Aug 20 17:15:06 2009 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 20 Aug 2009 14:15:06 -0700 Subject: [Numpy-discussion] nosetests and permissions In-Reply-To: <7f014ea60908201413o2a3811a3ode83bfaf08cf8876@mail.gmail.com> References: <7f014ea60908201352v395a09e8qff6d6f1abd6c743c@mail.gmail.com> <7f014ea60908201406p30003988l383cf0a5c00433ef@mail.gmail.com> <3d375d730908201409x73d6596r7dd1c6a6ff55dc68@mail.gmail.com> <7f014ea60908201413o2a3811a3ode83bfaf08cf8876@mail.gmail.com> Message-ID: <3d375d730908201415o140389d0rbdcd5bb69fa11fdd@mail.gmail.com> On Thu, Aug 20, 2009 at 14:13, Chris Colbert wrote: > nope. > > I build Atlas, and modified site.cfg to find those libs in /usr/local/lib/atlas/ > > then i did: > > python setup.py build > sudo python setup.py install > > that's it. Huh. I don't know. Are the source files executable? -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From cournape at gmail.com Thu Aug 20 18:33:26 2009 From: cournape at gmail.com (David Cournapeau) Date: Thu, 20 Aug 2009 15:33:26 -0700 Subject: [Numpy-discussion] nosetests and permissions In-Reply-To: <7f014ea60908201406p30003988l383cf0a5c00433ef@mail.gmail.com> References: <7f014ea60908201352v395a09e8qff6d6f1abd6c743c@mail.gmail.com> <7f014ea60908201406p30003988l383cf0a5c00433ef@mail.gmail.com> Message-ID: <5b8d13220908201533l1db7bfaem9f61874cb76bc73e@mail.gmail.com> On Thu, Aug 20, 2009 at 2:06 PM, Chris Colbert wrote: > the issue is that the files are executable. I have no idea why they > are set that way either. This is numpy 1.3.0 built from source. Which sources are you using ? The tarball on sourceforge, from svn, etc... ? cheers, David From chanley at stsci.edu Thu Aug 20 19:43:27 2009 From: chanley at stsci.edu (Christopher Hanley) Date: Thu, 20 Aug 2009 19:43:27 -0400 Subject: [Numpy-discussion] Removing scipy.stsci was [Re: [SciPy-dev] Deprecate chararray [was Plea for help]] In-Reply-To: <5b8d13220908201204s3c74cad1pabccdce47d3a13a1@mail.gmail.com> References: <857977.74958.qm@web52106.mail.re2.yahoo.com> <812BBECE-D1E8-4699-980A-BB8FB9657CB9@stsci.edu> <5b8d13220908201204s3c74cad1pabccdce47d3a13a1@mail.gmail.com> Message-ID: I agree with David's comments. In that theme I have removed scipy.stsci from scipy. Users get it directly from us at STScI via STSCI_PYTHON. It doesn't have any documentation in the doc system. It isn't by default in the scipy namespace. And as a recent bug report indicates they can't import it anyway. That should clean some code up. If someday a generic image processing library is added to scipy we can consider incorporating our modules back into scipy. Until that time I would rather remove the redundancy. It also help scipy's maintainability and frees me from having to worry about a fork in the code developing. Cheers, Chris -- Christopher Hanley Senior Systems Software Engineer Space Telescope Science Institute 3700 San Martin Drive Baltimore MD, 21218 (410) 338-4338 On Aug 20, 2009, at 3:04 PM, David Cournapeau wrote: > On Thu, Aug 20, 2009 at 10:32 AM, Christopher > Hanley wrote: > > I agree those are not strong reasons without more backing. 
What > worries me the most, in both numpy and scipy, is code that nobody > knows about, without any test or documentation. When it breaks, we > can't fix it. That's unsustainable in the long term, because it takes > a lot of time that people could spend somewhere else more useful. > Especially when you have C code which does not work on some platforms, > with new version of python (python 3k port, for example). > > I much prefer removing code to having code that barely works and > cannot be maintained. Old code that people are ready to maintain, I > have nothing against. > > cheers, > > David > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From stefan at sun.ac.za Thu Aug 20 19:48:21 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Thu, 20 Aug 2009 16:48:21 -0700 Subject: [Numpy-discussion] Removing scipy.stsci was [Re: [SciPy-dev] Deprecate chararray [was Plea for help]] In-Reply-To: References: <857977.74958.qm@web52106.mail.re2.yahoo.com> <812BBECE-D1E8-4699-980A-BB8FB9657CB9@stsci.edu> <5b8d13220908201204s3c74cad1pabccdce47d3a13a1@mail.gmail.com> Message-ID: <9457e7c80908201648r474297fo1d6ad2061c240fa6@mail.gmail.com> Hi Chris 2009/8/20 Christopher Hanley : > That should clean some code up. ?If someday a generic image processing > library is added to scipy we can consider incorporating our modules > back into scipy. ?Until that time I would rather remove the > redundancy. ?It also help scipy's maintainability and frees me from > having to worry about a fork in the code developing. We'll be spriting on an Image Processing Scikit this weekend. If you have any functions you'd like to include, let me know. Regards St?fan From chanley at stsci.edu Thu Aug 20 19:51:45 2009 From: chanley at stsci.edu (Christopher Hanley) Date: Thu, 20 Aug 2009 19:51:45 -0400 Subject: [Numpy-discussion] Removing scipy.stsci was [Re: [SciPy-dev] Deprecate chararray [was Plea for help]] In-Reply-To: <9457e7c80908201648r474297fo1d6ad2061c240fa6@mail.gmail.com> References: <857977.74958.qm@web52106.mail.re2.yahoo.com> <812BBECE-D1E8-4699-980A-BB8FB9657CB9@stsci.edu> <5b8d13220908201204s3c74cad1pabccdce47d3a13a1@mail.gmail.com> <9457e7c80908201648r474297fo1d6ad2061c240fa6@mail.gmail.com> Message-ID: <88B0EA8C-5637-4050-AC4F-023C6D41BB3A@stsci.edu> On Aug 20, 2009, at 7:48 PM, St?fan van der Walt wrote: Hi Stefan, > We'll be spriting on an Image Processing Scikit this weekend. If you > have any functions you'd like to include, let me know. > > Regards > St?fan Will the Image Processing Scikit be dedicated to working with a single image or stacks of images? Cheers, Chris From chanley at stsci.edu Thu Aug 20 19:57:43 2009 From: chanley at stsci.edu (Christopher Hanley) Date: Thu, 20 Aug 2009 19:57:43 -0400 Subject: [Numpy-discussion] Removing scipy.stsci was [Re: [SciPy-dev] Deprecate chararray [was Plea for help]] In-Reply-To: <88B0EA8C-5637-4050-AC4F-023C6D41BB3A@stsci.edu> References: <857977.74958.qm@web52106.mail.re2.yahoo.com> <812BBECE-D1E8-4699-980A-BB8FB9657CB9@stsci.edu> <5b8d13220908201204s3c74cad1pabccdce47d3a13a1@mail.gmail.com> <9457e7c80908201648r474297fo1d6ad2061c240fa6@mail.gmail.com> <88B0EA8C-5637-4050-AC4F-023C6D41BB3A@stsci.edu> Message-ID: Hi Stefan, Never mind. I just found the Sprint website and read the description. I'm sorry I hadn't found this sooner. I would have made plans to stay and help. My apologizes. 
Sorry, Chris -- Christopher Hanley Senior Systems Software Engineer Space Telescope Science Institute 3700 San Martin Drive Baltimore MD, 21218 (410) 338-4338 On Aug 20, 2009, at 7:51 PM, Christopher Hanley wrote: > On Aug 20, 2009, at 7:48 PM, St?fan van der Walt wrote: > > Hi Stefan, > >> We'll be spriting on an Image Processing Scikit this weekend. If you >> have any functions you'd like to include, let me know. >> >> Regards >> St?fan > > Will the Image Processing Scikit be dedicated to working with a > single image or stacks of images? > > Cheers, > Chris > > From jturner at gemini.edu Thu Aug 20 20:04:22 2009 From: jturner at gemini.edu (James Turner) Date: Thu, 20 Aug 2009 20:04:22 -0400 Subject: [Numpy-discussion] Removing scipy.stsci was [Re: [SciPy-dev] Deprecate chararray [was Plea for help]] In-Reply-To: References: <857977.74958.qm@web52106.mail.re2.yahoo.com> <812BBECE-D1E8-4699-980A-BB8FB9657CB9@stsci.edu> <5b8d13220908201204s3c74cad1pabccdce47d3a13a1@mail.gmail.com> <9457e7c80908201648r474297fo1d6ad2061c240fa6@mail.gmail.com> <88B0EA8C-5637-4050-AC4F-023C6D41BB3A@stsci.edu> Message-ID: <4A8DE486.9040806@gemini.edu> Hi Chris & Stefan, I will be around for most of the weekend (as I believe will Perry). I'm not sure I'll be able to contribute a lot to coding, but if there's any stuff you want to co-ordinate between STScI and Stefan's scikit, let me know if I can help. That's probably about the most useful thing I could do. Cheers, James. > Hi Stefan, > > Never mind. I just found the Sprint website and read the > description. I'm sorry I hadn't found this sooner. I would have made > plans to stay and help. My apologizes. > > Sorry, > Chris From oliphant at enthought.com Thu Aug 20 20:06:02 2009 From: oliphant at enthought.com (Travis Oliphant) Date: Thu, 20 Aug 2009 19:06:02 -0500 Subject: [Numpy-discussion] Deprecate chararray [was Plea for help] In-Reply-To: <5b8d13220908201204s3c74cad1pabccdce47d3a13a1@mail.gmail.com> References: <857977.74958.qm@web52106.mail.re2.yahoo.com> <812BBECE-D1E8-4699-980A-BB8FB9657CB9@stsci.edu> <5b8d13220908201204s3c74cad1pabccdce47d3a13a1@mail.gmail.com> Message-ID: <3FE3D509-1893-4A23-84CA-0177FEDC2CD8@enthought.com> On Aug 20, 2009, at 2:04 PM, David Cournapeau wrote: > On Thu, Aug 20, 2009 at 10:32 AM, Christopher > Hanley wrote: > >> >> Another concern is that we told people coming from numarray to use >> this module. It is my opinion that at this point in the numpy >> release >> cycle that an API change needs a very strong justification. >> Anecdotes >> about the number of users, a "change or die" philosophy, and an un- >> articulated notion of "the spirit of numpy" do not in my >> consideration meet that high bar. > > I agree those are not strong reasons without more backing. What > worries me the most, in both numpy and scipy, is code that nobody > knows about, without any test or documentation. When it breaks, we > can't fix it. That's unsustainable in the long term, because it takes > a lot of time that people could spend somewhere else more useful. > Especially when you have C code which does not work on some platforms, > with new version of python (python 3k port, for example). The claim that "chararray" is not understood is not true. I know about the code. It's not that difficult of a piece of code. It's utility can be questioned, but it was and may still be an important part of Numarray compatibilty. I was asked a question that I didn't have time to answer properly (it would have taken more than 5 minutes). 
That lack of answer does not constitute "nobody knows about" the code. It's a great idea to add to the docstring that the code was created for compatibility with numarray. But, I'm not convinced by any of the given reasons to remove the code from NumPy. -Travis -- Travis Oliphant Enthought Inc. 1-512-536-1057 http://www.enthought.com oliphant at enthought.com From cournape at gmail.com Thu Aug 20 20:14:25 2009 From: cournape at gmail.com (David Cournapeau) Date: Thu, 20 Aug 2009 17:14:25 -0700 Subject: [Numpy-discussion] Deprecate chararray [was Plea for help] In-Reply-To: <3FE3D509-1893-4A23-84CA-0177FEDC2CD8@enthought.com> References: <857977.74958.qm@web52106.mail.re2.yahoo.com> <812BBECE-D1E8-4699-980A-BB8FB9657CB9@stsci.edu> <5b8d13220908201204s3c74cad1pabccdce47d3a13a1@mail.gmail.com> <3FE3D509-1893-4A23-84CA-0177FEDC2CD8@enthought.com> Message-ID: <5b8d13220908201714k2750b975qeeaaa959df1c7e17@mail.gmail.com> On Thu, Aug 20, 2009 at 5:06 PM, Travis Oliphant wrote: > > On Aug 20, 2009, at 2:04 PM, David Cournapeau wrote: > >> On Thu, Aug 20, 2009 at 10:32 AM, Christopher >> Hanley wrote: >> >>> >>> Another concern is that we told people coming from numarray to use >>> this module. ?It is my opinion that at this point in the numpy >>> release >>> cycle that an API change needs a very strong justification. >>> Anecdotes >>> about the number of users, a "change or die" philosophy, and an un- >>> articulated notion of ?"the spirit of numpy" ?do not in my >>> consideration meet that high bar. >> >> I agree those are not strong reasons without more backing. ?What >> worries me the most, in both numpy and scipy, is code that nobody >> knows about, without any test or documentation. When it breaks, we >> can't fix it. That's unsustainable in the long term, because it takes >> a lot of time that people could spend somewhere else more useful. >> Especially when you have C code which does not work on some platforms, >> with new version of python (python 3k port, for example). > > > The claim that "chararray" is not understood is not true. ? I know > about the code. ?It's not that difficult of a piece of code. ? It's > utility can be questioned, but it was and may still be an important > part of Numarray compatibilty. I did not want to imply that chararray is unknown, sorry for the confusion. I just wanted to say that old code is not an argument to remove something. Whether someone is willing to maintain it is a much better argument IMO. cheers, David From stefan at sun.ac.za Thu Aug 20 20:14:25 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Thu, 20 Aug 2009 17:14:25 -0700 Subject: [Numpy-discussion] Removing scipy.stsci was [Re: [SciPy-dev] Deprecate chararray [was Plea for help]] In-Reply-To: <88B0EA8C-5637-4050-AC4F-023C6D41BB3A@stsci.edu> References: <857977.74958.qm@web52106.mail.re2.yahoo.com> <812BBECE-D1E8-4699-980A-BB8FB9657CB9@stsci.edu> <5b8d13220908201204s3c74cad1pabccdce47d3a13a1@mail.gmail.com> <9457e7c80908201648r474297fo1d6ad2061c240fa6@mail.gmail.com> <88B0EA8C-5637-4050-AC4F-023C6D41BB3A@stsci.edu> Message-ID: <9457e7c80908201714u3d09395o4982688d0499235a@mail.gmail.com> 2009/8/20 Christopher Hanley : > Will the Image Processing Scikit be dedicated to working with a single > image or stacks of images? Thanks for the reminder -- I have to add ImageCollection to the set of features. 
Fernando started working on something similar in 2006, and I've implemented a cached reader here: http://mentat.za.net/supreme/doc/supreme.misc.io.ImageCollection-class.html To get back to your comment: what kind of operations would you like to see supported on multiple images? Regards St?fan From pfeldman at verizon.net Thu Aug 20 20:46:43 2009 From: pfeldman at verizon.net (Dr. Phillip M. Feldman) Date: Thu, 20 Aug 2009 17:46:43 -0700 (PDT) Subject: [Numpy-discussion] itemsize() doesn't work Message-ID: <25072522.post@talk.nabble.com> I've been reading the online NumPy tutorial at the following URL: http://numpy.scipy.org/numpydoc/numpy-10.html When I try the following example, I get an error message: In [1]: a=arange(10) In [2]: a.itemsize() --------------------------------------------------------------------------- TypeError Traceback (most recent call last) C:\Python\ in () TypeError: 'int' object is not callable -- View this message in context: http://www.nabble.com/itemsize%28%29-doesn%27t-work-tp25072522p25072522.html Sent from the Numpy-discussion mailing list archive at Nabble.com. From stefan at sun.ac.za Thu Aug 20 20:58:14 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Thu, 20 Aug 2009 17:58:14 -0700 Subject: [Numpy-discussion] itemsize() doesn't work In-Reply-To: <25072522.post@talk.nabble.com> References: <25072522.post@talk.nabble.com> Message-ID: <9457e7c80908201758v500c1797s555236d1c2859125@mail.gmail.com> 2009/8/20 Dr. Phillip M. Feldman : > > I've been reading the online NumPy tutorial at the following URL: > > http://numpy.scipy.org/numpydoc/numpy-10.html > > When I try the following example, I get an error message: > > In [1]: a=arange(10) > In [2]: a.itemsize() This is a mistake, and should be "a.itemsize". The latest docs are always on docs.scipy.org, so if this same mistake occurs there, please fix it. Thanks! St?fan From pfeldman at verizon.net Thu Aug 20 21:00:46 2009 From: pfeldman at verizon.net (Dr. Phillip M. Feldman) Date: Thu, 20 Aug 2009 18:00:46 -0700 (PDT) Subject: [Numpy-discussion] how to find array indices at which a condition is satisfied? Message-ID: <25072656.post@talk.nabble.com> I have a 1-D array and would like to generate a list of indices for which a given condition is satisfied. What is the cleanest way to do this? -- View this message in context: http://www.nabble.com/how-to-find-array-indices-at-which-a-condition-is-satisfied--tp25072656p25072656.html Sent from the Numpy-discussion mailing list archive at Nabble.com. From josef.pktd at gmail.com Thu Aug 20 21:05:35 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 20 Aug 2009 21:05:35 -0400 Subject: [Numpy-discussion] itemsize() doesn't work In-Reply-To: <9457e7c80908201758v500c1797s555236d1c2859125@mail.gmail.com> References: <25072522.post@talk.nabble.com> <9457e7c80908201758v500c1797s555236d1c2859125@mail.gmail.com> Message-ID: <1cd32cbb0908201805y5b13cce6p6da9350559e8e0c3@mail.gmail.com> 2009/8/20 St?fan van der Walt : > 2009/8/20 Dr. Phillip M. Feldman : >> >> I've been reading the online NumPy tutorial at the following URL: >> >> http://numpy.scipy.org/numpydoc/numpy-10.html >> >> When I try the following example, I get an error message: >> >> In [1]: a=arange(10) >> In [2]: a.itemsize() > > This is a mistake, and should be "a.itemsize". ?The latest docs are > always on docs.scipy.org, so if this same mistake occurs there, please > fix it. > > Thanks! 
> St?fan > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > itemsize is listed as an attribute in the numpy 1.2 docs. Josef From robert.kern at gmail.com Thu Aug 20 21:05:48 2009 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 20 Aug 2009 18:05:48 -0700 Subject: [Numpy-discussion] how to find array indices at which a condition is satisfied? In-Reply-To: <25072656.post@talk.nabble.com> References: <25072656.post@talk.nabble.com> Message-ID: <3d375d730908201805j4ceeea28rb9824a2645a23a98@mail.gmail.com> On Thu, Aug 20, 2009 at 18:00, Dr. Phillip M. Feldman wrote: > > I have a 1-D array and would like to generate a list of indices for which a > given condition is satisfied. ?What is the cleanest way to do this? numpy.nonzero(condition)[0] -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From stefan at sun.ac.za Thu Aug 20 21:06:30 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Thu, 20 Aug 2009 18:06:30 -0700 Subject: [Numpy-discussion] how to find array indices at which a condition is satisfied? In-Reply-To: <25072656.post@talk.nabble.com> References: <25072656.post@talk.nabble.com> Message-ID: <9457e7c80908201806s37c86fa8pef48a6cd0f5c38d6@mail.gmail.com> 2009/8/20 Dr. Phillip M. Feldman : > > I have a 1-D array and would like to generate a list of indices for which a > given condition is satisfied. ?What is the cleanest way to do this? np.where(x > 0) St?fan From eadrogue at gmx.net Thu Aug 20 21:14:02 2009 From: eadrogue at gmx.net (Ernest =?iso-8859-1?Q?Adrogu=E9?=) Date: Fri, 21 Aug 2009 03:14:02 +0200 Subject: [Numpy-discussion] how to find array indices at which a condition is satisfied? In-Reply-To: <25072656.post@talk.nabble.com> References: <25072656.post@talk.nabble.com> Message-ID: <20090821011402.GA4312@doriath.local> 20/08/09 @ 18:00 (-0700), thus spake Dr. Phillip M. Feldman: > I have a 1-D array and would like to generate a list of indices for which a > given condition is satisfied. What is the cleanest way to do this? you can do something like this: numpy.arange(len(x))[x > 5] it'll give you the indices of x where x is > 5. Ernest From d_l_goldsmith at yahoo.com Thu Aug 20 23:18:31 2009 From: d_l_goldsmith at yahoo.com (David Goldsmith) Date: Thu, 20 Aug 2009 20:18:31 -0700 (PDT) Subject: [Numpy-discussion] itemsize() doesn't work In-Reply-To: <25072522.post@talk.nabble.com> Message-ID: <238384.43034.qm@web52112.mail.re2.yahoo.com> Thanks for the bug report! DG --- On Thu, 8/20/09, Dr. Phillip M. Feldman wrote: > From: Dr. Phillip M. Feldman > Subject: [Numpy-discussion] itemsize() doesn't work > To: numpy-discussion at scipy.org > Date: Thursday, August 20, 2009, 5:46 PM > > I've been reading the online NumPy tutorial at the > following URL: > > http://numpy.scipy.org/numpydoc/numpy-10.html > > When I try the following example, I get an error message: > > In [1]: a=arange(10) > In [2]: a.itemsize() > --------------------------------------------------------------------------- > TypeError? ? ? ? ? ? ? > ? ? ? ? ? ? ? ? 
From d_l_goldsmith at yahoo.com Thu Aug 20 23:18:31 2009
From: d_l_goldsmith at yahoo.com (David Goldsmith)
Date: Thu, 20 Aug 2009 20:18:31 -0700 (PDT)
Subject: [Numpy-discussion] itemsize() doesn't work
In-Reply-To: <25072522.post@talk.nabble.com>
Message-ID: <238384.43034.qm@web52112.mail.re2.yahoo.com>

Thanks for the bug report!

DG

--- On Thu, 8/20/09, Dr. Phillip M. Feldman wrote:

> From: Dr. Phillip M. Feldman
> Subject: [Numpy-discussion] itemsize() doesn't work
> To: numpy-discussion at scipy.org
> Date: Thursday, August 20, 2009, 5:46 PM
>
> I've been reading the online NumPy tutorial at the following URL:
>
> http://numpy.scipy.org/numpydoc/numpy-10.html
>
> When I try the following example, I get an error message:
>
> In [1]: a=arange(10)
> In [2]: a.itemsize()
> ---------------------------------------------------------------------------
> TypeError                                 Traceback (most recent call last)
> C:\Python\ in ()
> TypeError: 'int' object is not callable
> --
> View this message in context: http://www.nabble.com/itemsize%28%29-doesn%27t-work-tp25072522p25072522.html
> Sent from the Numpy-discussion mailing list archive at Nabble.com.
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>

__________________________________________________
Do You Yahoo!?
Tired of spam? Yahoo! Mail has the best spam protection around
http://mail.yahoo.com

From sccolbert at gmail.com Thu Aug 20 23:22:34 2009
From: sccolbert at gmail.com (Chris Colbert)
Date: Thu, 20 Aug 2009 23:22:34 -0400
Subject: [Numpy-discussion] nosetests and permissions
In-Reply-To: <5b8d13220908201533l1db7bfaem9f61874cb76bc73e@mail.gmail.com>
References: <7f014ea60908201352v395a09e8qff6d6f1abd6c743c@mail.gmail.com>
	<7f014ea60908201406p30003988l383cf0a5c00433ef@mail.gmail.com>
	<5b8d13220908201533l1db7bfaem9f61874cb76bc73e@mail.gmail.com>
Message-ID: <7f014ea60908202022i58c88421na6ad313c1d1fae6d@mail.gmail.com>

tarball from sourceforge.

On Thu, Aug 20, 2009 at 6:33 PM, David Cournapeau wrote:
> On Thu, Aug 20, 2009 at 2:06 PM, Chris Colbert wrote:
>> the issue is that the files are executable. I have no idea why they are set that way either. This is numpy 1.3.0 built from source.
>
> Which sources are you using ? The tarball on sourceforge, from svn, etc... ?
>
> cheers,
>
> David
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>

From d_l_goldsmith at yahoo.com Fri Aug 21 00:47:09 2009
From: d_l_goldsmith at yahoo.com (David Goldsmith)
Date: Fri, 21 Aug 2009 04:47:09 +0000 (GMT)
Subject: [Numpy-discussion] itemsize() doesn't work
In-Reply-To: <9457e7c80908201758v500c1797s555236d1c2859125@mail.gmail.com>
Message-ID: <688030.64633.qm@web52108.mail.re2.yahoo.com>

Hi, Stefan. Is this editable through the Wiki? I went to the Docstrings page and searched for "numpydoc" and "tutorial" and got no hits.

DG

--- On Thu, 8/20/09, Stéfan van der Walt wrote:

> From: Stéfan van der Walt
> Subject: Re: [Numpy-discussion] itemsize() doesn't work
> To: "Discussion of Numerical Python"
> Date: Thursday, August 20, 2009, 5:58 PM
> 2009/8/20 Dr. Phillip M. Feldman :
> >
> > I've been reading the online NumPy tutorial at the following URL:
> >
> > http://numpy.scipy.org/numpydoc/numpy-10.html
> >
> > When I try the following example, I get an error message:
> >
> > In [1]: a=arange(10)
> > In [2]: a.itemsize()
>
> This is a mistake, and should be "a.itemsize". The latest docs are always on docs.scipy.org, so if this same mistake occurs there, please fix it.
>
> Thanks!
> Stéfan
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>

__________________________________________________
Do You Yahoo!?
Tired of spam? Yahoo! Mail has the best spam protection around
http://mail.yahoo.com
From ralf.gommers at googlemail.com Fri Aug 21 01:00:31 2009
From: ralf.gommers at googlemail.com (Ralf Gommers)
Date: Fri, 21 Aug 2009 01:00:31 -0400
Subject: [Numpy-discussion] [SciPy-dev] Deprecate chararray [was Plea for help]
In-Reply-To: <812BBECE-D1E8-4699-980A-BB8FB9657CB9@stsci.edu>
References: <857977.74958.qm@web52106.mail.re2.yahoo.com>
	<812BBECE-D1E8-4699-980A-BB8FB9657CB9@stsci.edu>
Message-ID:

On Thu, Aug 20, 2009 at 1:32 PM, Christopher Hanley wrote:

> Also, I do not know how many people use this particular feature. However I would point out that many people who use numpy are not also on the mailing lists. Most of the STScI do not follow the numpy list. I serve as our point of contact to the numpy community. I'm trying to gather a list of projects that use this feature and specific use cases for you.

Great. Even one good use case might change my opinion of chararray.

> As I do not use this module myself I cannot counter your arguments at this time. If we decide to deprecate this module would we reverse this decision if we then find out that the assumptions that went into the decision were in error?

That would make sense.

> Another concern is that we told people coming from numarray to use this module. It is my opinion that at this point in the numpy release cycle that an API change needs a very strong justification. Anecdotes about the number of users, a "change or die" philosophy, and an un-articulated notion of "the spirit of numpy" do not in my consideration meet that high bar.

That is not very fair. I gave you four reasons for assuming there are not many users, other arguments you leave out here are the state of the code, lack of docs, tests and (most importantly) a use case.

> If you would like us to provide additional documentation and tests that would be possible. I'll do it myself if that is the only thing keeping the module from remaining in numpy.

Thanks a lot, that would definitely help.

How about for now we document the module as being there for numarray compatibility and not recommended for new development? Then if you turn up a good use case we add it to the docs, and if you don't we revisit the deprecation issue?

Cheers,
Ralf

From schut at sarvision.nl Fri Aug 21 03:58:34 2009
From: schut at sarvision.nl (Vincent Schut)
Date: Fri, 21 Aug 2009 09:58:34 +0200
Subject: [Numpy-discussion] Removing scipy.stsci was [Re: [SciPy-dev] Deprecate chararray [was Plea for help]]
In-Reply-To: References: <857977.74958.qm@web52106.mail.re2.yahoo.com>
	<812BBECE-D1E8-4699-980A-BB8FB9657CB9@stsci.edu>
	<5b8d13220908201204s3c74cad1pabccdce47d3a13a1@mail.gmail.com>
	<9457e7c80908201648r474297fo1d6ad2061c240fa6@mail.gmail.com>
	<88B0EA8C-5637-4050-AC4F-023C6D41BB3A@stsci.edu>
Message-ID:

Christopher Hanley wrote:
> Hi Stefan,
>
> Never mind. I just found the Sprint website and read the description. I'm sorry I hadn't found this sooner. I would have made plans to stay and help. My apologies.
>

Hi list,

I just saw this too and would like to misuse this thread to suggest another enhancement related to image processing you might consider: make ndimage maskedarray aware. Unfortunately we don't have any money to spend, otherwise I'd love to support this financially too.
But it's a thing I'm running into (ndimage not working with missing data, that is) regularly, and for which it often is pretty hard to work out a workaround. E.g. any of the resampling (ndimage.zoom) or kernel filtering routines choke on arrays with NaNs, and don't recognize masked arrays. Alas, virtually all of the image data I process (satellite imagery) contains missing/bad data...

I know it probably will be a pretty involved task, as ndimage comes from numarray and seems to be largely implemented in C. But I really wanted to raise the issue now the image processing subject turns up once again, and hope some folks with more/better programming skills than me might like the idea...

Oh and I know of course ndimage is scipy, and this list is numpy. But as the image processing subject emerged here, well...

Cheers,
Vincent Schut.

> Sorry,
> Chris
>

From pav+sp at iki.fi Fri Aug 21 04:30:58 2009
From: pav+sp at iki.fi (Pauli Virtanen)
Date: Fri, 21 Aug 2009 08:30:58 +0000 (UTC)
Subject: [Numpy-discussion] itemsize() doesn't work
References: <9457e7c80908201758v500c1797s555236d1c2859125@mail.gmail.com>
	<688030.64633.qm@web52108.mail.re2.yahoo.com>
Message-ID:

Fri, 21 Aug 2009 04:47:09 +0000, David Goldsmith wrote:
[clip]
> > > http://numpy.scipy.org/numpydoc/numpy-10.html
[clip]
> Is this editable through the Wiki? I went to the Docstrings page and searched for "numpydoc" and "tutorial" and got no hits.

This is the old Numeric module documentation. It probably doesn't describe all points of Numpy accurately. Of course, the URL is misleading...

--
Pauli Virtanen

From nouiz at nouiz.org Fri Aug 21 10:01:26 2009
From: nouiz at nouiz.org (=?ISO-8859-1?Q?Fr=E9d=E9ric_Bastien?=)
Date: Fri, 21 Aug 2009 10:01:26 -0400
Subject: [Numpy-discussion] Accelerating NumPy computations [Was: GPU Numpy]
In-Reply-To: <1250766444.5546.40.camel@inspiron>
References: <4A7B43CE.7050509@molden.no>
	<7f1eaee30908061429v5d04ab77v18b37a0a177548cd@mail.gmail.com>
	<4A7B5B06.2080909@molden.no> <4A7B62FC.6090504@molden.no>
	<1250766444.5546.40.camel@inspiron>
Message-ID: <2d1d7fe70908210701k49fb02a1l8818779f1864bce0@mail.gmail.com>

Hi,

On Thu, Aug 20, 2009 at 7:07 AM, Francesc Alted wrote:
> On Thursday 20 August 2009 at 00:37 -0700, Erik Tollerud wrote:
> > > NumPy arrays on the GPU memory is an easy task. But then I would have to write the computation in OpenCL's dialect of C99? But I'd rather program everything in Python if I could. Details like GPU and OpenCL should be hidden away. Nice looking Python with NumPy is much easier to read and write. That is why I'd like to see a code generator (i.e. JIT compiler) for NumPy.
> >
> > This is true to some extent, but also probably difficult to do given the fact that parallelizable algorithms are generally more difficult to formulate in straightforward ways. In the intermediate-term, I think there is value in having numpy implement some sort of interface to OpenCL or cuda - I can easily see an explosion of different bindings (it's already starting), and having a "canonical" way encoded in numpy or scipy is probably the best way to mitigate the inevitable compatibility problems... I'm partial to the way pycuda can do it (basically, just export numpy arrays to the GPU and let you write the code from there), but the main point is to just get some basic compatibility in pretty quickly, as I think this GPGPU is here to stay...
>
> Maybe.
> However I think that we should not forget the fact that, as Sturla pointed out, the main bottleneck for *many* problems nowadays is memory access, not CPU speed. GPUs may have faster memory, but only a few % better than main stream memory. I'd like to hear from anyone here having achieved any kind of speed-up in their calculations by using GPUs instead of CPUs. By looking at these scenarios we may get an idea of where GPUs can be useful, and if driving an effort to give support for them in NumPy would be worth the effort.

I have around 10x speed up in convolution. I compare against my own version on the cpu that is 20-30x faster than the version in scipy... I should backport some of my optimisations (not all possible, as I removed some cases), but I didn't get the time.

The GPU is the most useful when the bottleneck is the cpu, not the memory, and the problem must be highly parallel. In that case speed-ups of 100-200x have been reported. But take those numbers with a grain of salt: many of them don't talk much about the cpu implementation. In that case, they probably compare a highly optimized version on the GPU against a non-optimised version on the CPU. I have seen a case where they don't tell which version of blas they used on the cpu for matrix multiplication. So this can be that they just forgot to tell that they used an optimized one, or that they didn't use one. In the last case, the speed-up doesn't have any meaning...

> I personally think that, in general, exposing GPU capabilities directly to NumPy would provide little service for most NumPy users. I rather see letting this task to specialized libraries (like PyCUDA, or special versions of ATLAS, for example) that can be used from NumPy.

Specialized libraries can be a good start, as currently there is too much uncertainty in the language (opencl vs nvidia api driver (pycuda, but not cublas, cufft, ...) vs c-cuda (cublas, cufft)). One thing that could help all those specialized libraries (I make one with James B., cuda_ndarray) is to have a standardized version of NDarray for the gpu. But I'm not sure it is a good time to do it now.

> Until then, I think that a more direct approach (and one that would deliver results earlier) for speeding-up NumPy is to be aware of the hierarchical nature of the different memory levels in current CPU's and make NumPy play nicely with it. In that sense, I think that applying the blocking technique (see [1] for a brief explanation) for taking advantage of both spatial and temporal localities is the way to go. For example, most of the speed-up that Numexpr achieves comes from the fact that it uses blocking during the evaluation of complex expressions. This is so because the temporaries are kept small and can fit in current CPU caches. Implementing similar algorithms in NumPy should not be that difficult, especially now that the Numexpr implementation already exists as a model.

> And another thing that may further help to fight memory slowness (or CPU/GPU quickness, as you prefer ;-) in the near future is compression. Compression already helped bringing data faster from disk to CPU in the last 10 years, and now it is almost time that this can happen with the memory too, not only disk. In [1] I demonstrated that compression can *already* help transmitting data in memory to CPU. Agreed, right now this is only true for highly compressible data (which is an important corner case anyway), but in the short future we will see how the compression technique would be able to accelerate computations for a high variety of datasets, even if they are not very compressible.

> So, in my humble opinion, implementing the possibility that NumPy can deal with compressed buffers in addition to uncompressed ones could be very interesting in the short future (or even now, in specific situations).

> [1] http://www.pytables.org/docs/StarvingCPUs.pdf

Very interesting. Optimized numpy on the cpu is a good thing, as not all algorithms are well suited for the gpu, and when we make new algorithms, doing it on the cpu is MUCH easier today.

Frederic Bastien
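To make the blocking point concrete, here is a toy comparison using the numexpr package that Francesc mentions (array names and sizes are made up; ne.evaluate processes the expression in cache-sized chunks, so the intermediates of "2*a + 3*b" never materialize as full-size temporaries):

import numpy as np
import numexpr as ne

a = np.random.rand(10**7)
b = np.random.rand(10**7)

c = ne.evaluate("2*a + 3*b")   # blocked evaluation, small temporaries
c_np = 2*a + 3*b               # plain numpy: two full-size temporaries
assert np.allclose(c, c_np)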
From matthieu.brucher at gmail.com Fri Aug 21 10:44:04 2009
From: matthieu.brucher at gmail.com (Matthieu Brucher)
Date: Fri, 21 Aug 2009 16:44:04 +0200
Subject: [Numpy-discussion] Accelerating NumPy computations [Was: GPU Numpy]
In-Reply-To: <2d1d7fe70908210701k49fb02a1l8818779f1864bce0@mail.gmail.com>
References: <4A7B43CE.7050509@molden.no>
	<7f1eaee30908061429v5d04ab77v18b37a0a177548cd@mail.gmail.com>
	<4A7B5B06.2080909@molden.no> <4A7B62FC.6090504@molden.no>
	<1250766444.5546.40.camel@inspiron>
	<2d1d7fe70908210701k49fb02a1l8818779f1864bce0@mail.gmail.com>
Message-ID:

>> I personally think that, in general, exposing GPU capabilities directly to NumPy would provide little service for most NumPy users. I rather see letting this task to specialized libraries (like PyCUDA, or special versions of ATLAS, for example) that can be used from NumPy.
>
> Specialized libraries can be a good start, as currently there is too much uncertainty in the language (opencl vs nvidia api driver (pycuda, but not cublas, cufft, ...) vs c-cuda (cublas, cufft))

Indeed. In the future, if OpenCL is the way to go, it may even be helpful to have Numpy using OpenCL directly, as AMD provides an SDK for OpenCL, and with Larrabee approaching, Intel will surely provide one of its own.

Matthieu
--
Information System Engineer, Ph.D.
Website: http://matthieu-brucher.developpez.com/
Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92
LinkedIn: http://www.linkedin.com/in/matthieubrucher

From mike.ressler at alum.mit.edu Fri Aug 21 11:47:24 2009
From: mike.ressler at alum.mit.edu (Mike Ressler)
Date: Fri, 21 Aug 2009 08:47:24 -0700
Subject: [Numpy-discussion] A better median function?
Message-ID: <268febdf0908210847k227315a6u564b50ffb8edf750@mail.gmail.com>

I presented this during a lightning talk at the scipy conference yesterday, so again, at the risk of painting myself as a flaming idiot:

---------------------
Wanted: A Better/Faster median() Function

numpy implementation uses simple sorting algorithm:
Sort all the data using the .sort() method
Return middle value (or mean of two middle values)

One doesn't have to sort all data - need only the middle value

Nicolas Devillard discusses several algorithms at
http://ndevilla.free.fr/median/median/index.html

Implemented Devillard's version of the Numerical Recipes select() function using ctypes: 2 to 20 times faster on the large (> 10^6 points) arrays I tested
--- Caveat: I don't have all the bells and whistles of the built-in median function (multiple dimensions, non-contiguous, etc.)

Any of the numpy developers interested in pursuing this further?
-----------------------

I got a fairly loud "yes" from the back of the room which a few of us guessed was Robert Kern. I take that as generic interest at least in checking this out.

The background on this is that I am doing some glitch finding algorithms where I call median frequently. I think my ultimate problem is not in median(), but how I loop through the data, but that is a different discussion. What I noticed as I was investigating was what I noted in the slide above. Returning the middle of a sorted vector is not a bad thing to do (admit it, we've all done it at some point), but it does too much work. Things that are lower or higher than the median don't need to be in a perfectly sorted order if all we are after is the median value.

I did some googling and came up with the web page noted above. I used his modified NumRec select() function as an excuse to learn ctypes, and my initial weak attempts were successful. The speed ups depend highly on the length of the data and the randomness - things that are correlated or partially sorted already go quickly. My caveat is that my select-based median is too simple; it must have 1-d contiguous data of a predefined type. It also moves the data in place, affecting the original variable. I have no idea how this will blow up if implemented in a general purpose way.

Anyway, I'm not enough of a C-coder to have any hope of improving this to the point where it can be included in numpy itself. However, if someone is willing to take up the torch, I will volunteer to assist with discussion, prototyping a few routines, and testing (I have lots of real-world data). One could argue that the current median implementation is good enough (and it probably is for 99% of all usage), but I view this as a chance to add an industrial strength routine to the numpy base.

Thanks for listening.

Mike

--
mike.ressler at alum.mit.edu

From sturla at molden.no Fri Aug 21 20:48:24 2009
From: sturla at molden.no (Sturla Molden)
Date: Fri, 21 Aug 2009 17:48:24 -0700
Subject: [Numpy-discussion] PRNGs and multi-threading
Message-ID: <4A8F4058.80203@molden.no>

I am not sure if this is the right place to discuss this issue. However, a problem I keep running into in Monte Carlo simulations is generating pseudorandom numbers with multiple threads. PRNGs such as the Mersenne Twister keep an internal state, which prevents the PRNG from being re-entrant and thread-safe.

Possible solutions:

1. Use multiple instances of PRNG states (numpy.random.RandomState), one for each thread. This should give no contention, but is this mathematically acceptable? I don't know. At least I have not seen any proof that it is.

2. Protect the PRNG internally with a spinlock. In Windows lingo, that is:

#include <windows.h>
static volatile long spinlock = 0;
#define ACQUIRE_SPINLOCK while(InterlockedExchangeAcquire(&spinlock, 1));
#define RELEASE_SPINLOCK InterlockedExchangeAcquire(&spinlock, 0);

Problem: possible contention between threads, idle work if threads spin a lot.

3. Use a conventional mutex object to protect the PRNG (e.g. threading.Lock in Python or CRITICAL_SECTION in Windows). Problem: contention, context shifting, and mutexes tend to be slow. Possibly the worst solution.

4. Put the PRNG in a dedicated thread, fill up rather big arrays with pseudo-random numbers, and write them to a queue. Problem: Context shifting unless a CPU is dedicated to this task. Unless producing random numbers constitutes a major portion of the simulation, this should not lead to much contention.
import threading
import Queue
import numpy
from numpy.random import rand

class PRNG(threading.Thread):
    ''' A thread that generates arrays with random numbers
        and dumps them to a queue. '''

    def __init__(self, nthreads, shape):
        # initialize the Thread base class before anything else
        threading.Thread.__init__(self)
        self.shape = shape
        self.count = numpy.prod(shape)
        self.queue = Queue.Queue(4 * nthreads) # magic number
        self.stop_evt = threading.Event()

    def generate(self, block=True, timeout=None):
        return self.queue.get(block, timeout)

    def join(self, timeout=None):
        self.stop_evt.set()
        super(PRNG, self).join(timeout)

    def run(self):
        # use the self.count/self.shape attributes set in __init__
        tmp = rand(self.count).reshape(self.shape)
        while 1:
            try:
                self.queue.put(tmp, block=True, timeout=2.0)
            except Queue.Full:
                if self.stop_evt.isSet():
                    break
            else:
                tmp = rand(self.count).reshape(self.shape)

Do you have any view on this? Is there any way of creating multiple independent random states that will work correctly? I know of SPRNG (Scalable PRNG), but it is made to work with MPI (which I don't use).

Regards,
Sturla Molden

From robert.kern at gmail.com Fri Aug 21 12:12:06 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Fri, 21 Aug 2009 09:12:06 -0700
Subject: [Numpy-discussion] PRNGs and multi-threading
In-Reply-To: <4A8F4058.80203@molden.no>
References: <4A8F4058.80203@molden.no>
Message-ID: <3d375d730908210912v4abed520vfc515c41858b2af6@mail.gmail.com>

On Fri, Aug 21, 2009 at 17:48, Sturla Molden wrote:
>
> I am not sure if this is the right place to discuss this issue. However, a problem I keep running into in Monte Carlo simulations is generating pseudorandom numbers with multiple threads. PRNGs such as the Mersenne Twister keep an internal state, which prevents the PRNG from being re-entrant and thread-safe.

C extension function calls are always atomic unless they release the GIL. numpy.random does not. You have de facto locks thanks to the GIL.

> Possible solutions:
>
> 1. Use multiple instances of PRNG states (numpy.random.RandomState), one for each thread. This should give no contention, but is this mathematically acceptable? I don't know. At least I have not seen any proof that it is.

As long as you use different seeds, I believe this is fine. The state size of MT is so enormous that almost any reasonable use will not find overlaps.

Although you don't really have re-entrancy issues, you will usually want one PRNG per thread for determinism.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
-- Umberto Eco
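A minimal sketch of the one-PRNG-per-thread approach described above (the worker function, seeds and array size are made up for illustration; distinct seeds keep each run deterministic regardless of how the OS schedules the threads):

import threading
import numpy

def worker(seed, out, i):
    rng = numpy.random.RandomState(seed)   # private state, nothing shared
    out[i] = rng.rand(1000).sum()

out = [None] * 4
threads = [threading.Thread(target=worker, args=(1234 + i, out, i))
           for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print out   # the same four numbers on every run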
From d_l_goldsmith at yahoo.com Fri Aug 21 12:50:00 2009
From: d_l_goldsmith at yahoo.com (David Goldsmith)
Date: Fri, 21 Aug 2009 09:50:00 -0700 (PDT)
Subject: Re: [Numpy-discussion] A better median function?
In-Reply-To: <268febdf0908210847k227315a6u564b50ffb8edf750@mail.gmail.com>
Message-ID: <963316.2405.qm@web52109.mail.re2.yahoo.com>

Not to make you regret your post ;-) but, you having readily furnished your email address, I'm taking the liberty of forwarding you my resume - I'm the guy who introduced himself yesterday by asking if you knew Don Hall - in case you have need of an experienced CCD data reduction programmer who knows Python, numpy, and matplotlib, as well as IDL, matlab, C/C++, and, from the "distant past", FORTRAN (not to mention advanced math and a little advanced physics, to boot). Caveat: I'm not presently in a position to relocate. :-( Thanks for your time and consideration,

David Goldsmith

DAVID GOLDSMITH
2036 Lakemoor Dr. SW
Olympia, WA 98512
360-753-2318
dgoldsmith_89 at alumni.brown.edu

Career Interests: Support of research possessing a strong component of one or more of the following: mathematics, statistics, programming, modeling, physical sciences, engineering, etc.

Desired salary rate: $75,000/yr.

Skills

Computer

Operating Systems: Windows, Macintosh, Unix
Programming/Technical: Python, C/C++, SWIG, numpy, matplotlib, wxmpl, wxWidgets, SPE, Visual Studio .NET 2003, Trac, TortoiseSVN, RapidSVN, WinCVS, LAPACK, Matlab, Scientific Workplace, IDL, FORTRAN, Splus, Django (learning in progress).
Office: MS Word, Excel, PowerPoint, Outlook, Publisher, etc.; Page Maker; etc.
Communications: Firefox, Thunderbird, VPN, MS Explorer, Netscape, NCSA Telnet, Fetch, WS FTP, telnet, ftp, lynx, pine, etc.

Other

Advanced mathematics, statistics, physics, fluid dynamics, engineering, etc.; technical documentation.

Programming Employment

Technical Editor (Research Manager); June, 2009 to present; Planetary Sciences Group, Dept. of Physics, University of Central Florida, Orlando, FL (but working out of Olympia, WA). Write and review a broad range of docstrings for NumPy, the standard Python module for numerical computing, and manage the 2009 NumPy Documentation Summer Marathon, including volunteer recruitment and coordination, project promotion, grant writing for perpetuation of the project, etc.

Programming Mathematical Modeler (Functional Analyst II); June, 2004 through February, 2008; Emergency Response Division, National Oceanic and Atmospheric Administration, Seattle, WA (under contract with General Dynamics Information Technology, Fairfax, VA). Develop 3D enhancements to existing 2D estuarine circulation codes and data visualization and analysis tools in Python and C++, using SWIG, numpy, C/LAPACK, ATLAS, matplotlib, wxmpl, SPE, Visual Studio/Visual C++, wxWidgets, RapidSVN, TortoiseSVN, WinCVS, etc. as development tools; confer regularly with other physical scientists, mathematicians, and programmers about these tools and other issues/projects related to hazardous material emergency response.

Programming Statistician (Research Associate V); May, 1999 to September, 2001; Institute for Astronomy, University of Hawai`i, Hilo. Developed IDL-based software for analysis of data obtained in development of solid-state sensor technology for the Next Generation Space Telescope, and other related computer activities.

Programming Research Assistant; September to December, 1997; Physics Dept., Univ. of Montana, Missoula. Assisted in the development of a FORTRAN computational model for optimization of toroidal plasma confinement.

Programming Research Assistant; June to August, 1997; Physics Dept., Univ. of Montana, Missoula. Assisted in FORTRAN computer modeling of passive scalar transport in the stratosphere.

Programming Research Assistant; June to August, 1997; Mathematical Sciences Dept., Univ. of Montana, Missoula. Developed, in MATLAB, a cellular-automata-based simulation of flow around windmill turbine blades.

Programming Consultant; April, 1995; Earth Justice Legal Defense Fund, Honolulu, Hawai`i. Developed Excel spreadsheet to determine sewage discharge violations from municipal wastewater facility records.

Programming Research Assistant; June to August, 1985 and 1986; Plasma Physics Branch, Naval Research Laboratory, Washington, DC. Assisted in FORTRAN computer modeling of plasma switching devices.

Publications (abridged)
2000, w/ D. Hall (1st author) et al., "Characterization of lambda_c ~ 5 micron Hg:Cd:Te Arrays for Low-Background Astronomy", Optical and IR Telescope Instrumentation and Detectors, Proceedings of SPIE, Vol. 4008, Part 2.

2000, w/ D. Hall (1st author) et al., "Molecular Beam Epitaxial Mercury Cadmium Telluride: A Quiet, Warm FPA For NGST", Astr. Soc. Pacific Conf. Ser., Vol. 207.

1997, w/ A. Ware (1st author) et al., "Stability of Small Aspect Ratio Toroidal Hybrid Devices", American Physical Society, Plasma Physics Section, Semi-annual meeting.

Education (abridged)

Master of Arts, Mathematical Sciences, University of Montana, Missoula, awarded May, 1998. GPA: 4.0.

Master of Science, Aquacultural Engineering, University of Hawai`i, Manoa, awarded August, 1993. GPA: 3.72.

Bachelor of Arts, Mathematics, Brown University, Providence, Rhode Island, awarded May, 1989. GPA: Unreported (Brown does not routinely calculate GPA's; unofficially: 3.83).

References

Prof. Joseph Harrington, Ph.D., Department of Physics, University of Central Florida, 321-696-9914, jh at physics.ucf.edu
Debbie Payton, Branch Chief and Oceanographer, Emergency Response Division, NOAA, 206-526-6320, debbie.payton at noaa.gov
Glen Watabayashi, Operations Manager and Oceanographer, ERD, NOAA, 206-526-6324, glen.watabayashi at noaa.gov
Chris Barker, Ph.D., Oceanographer, ERD, NOAA, 206-526-6959, chris.barker at noaa.gov
Don Hall, Ph.D., Institute for Astronomy, University of Hawai`i, 808-932-2360, hall at ifa.hawaii.edu

--- On Fri, 8/21/09, Mike Ressler wrote:

> From: Mike Ressler
> Subject: [Numpy-discussion] A better median function?
> To: "Discussion of Numerical Python"
> Date: Friday, August 21, 2009, 8:47 AM
>
> I presented this during a lightning talk at the scipy conference yesterday, so again, at the risk of painting myself as a flaming idiot:
>
> ---------------------
> Wanted: A Better/Faster median() Function
>
> numpy implementation uses simple sorting algorithm:
> Sort all the data using the .sort() method
> Return middle value (or mean of two middle values)
>
> One doesn't have to sort all data - need only the middle value
>
> Nicolas Devillard discusses several algorithms at
> http://ndevilla.free.fr/median/median/index.html
>
> Implemented Devillard's version of the Numerical Recipes select() function using ctypes: 2 to 20 times faster on the large (> 10^6 points) arrays I tested
> --- Caveat: I don't have all the bells and whistles of the built-in median function (multiple dimensions, non-contiguous, etc.)
>
> Any of the numpy developers interested in pursuing this further?
> -----------------------
>
> I got a fairly loud "yes" from the back of the room which a few of us guessed was Robert Kern. I take that as generic interest at least in checking this out.
>
> The background on this is that I am doing some glitch finding algorithms where I call median frequently. I think my ultimate problem is not in median(), but how I loop through the data, but that is a different discussion. What I noticed as I was investigating was what I noted in the slide above. Returning the middle of a sorted vector is not a bad thing to do (admit it, we've all done it at some point), but it does too much work. Things that are lower or higher than the median don't need to be in a perfectly sorted order if all we are after is the median value.
>
> I did some googling and came up with the web page noted above.
> I used his modified NumRec select() function as an excuse to learn ctypes, and my initial weak attempts were successful. The speed ups depend highly on the length of the data and the randomness - things that are correlated or partially sorted already go quickly. My caveat is that my select-based median is too simple; it must have 1-d contiguous data of a predefined type. It also moves the data in place, affecting the original variable. I have no idea how this will blow up if implemented in a general purpose way.
>
> Anyway, I'm not enough of a C-coder to have any hope of improving this to the point where it can be included in numpy itself. However, if someone is willing to take up the torch, I will volunteer to assist with discussion, prototyping a few routines, and testing (I have lots of real-world data). One could argue that the current median implementation is good enough (and it probably is for 99% of all usage), but I view this as a chance to add an industrial strength routine to the numpy base.
>
> Thanks for listening.
>
> Mike
>
> --
> mike.ressler at alum.mit.edu
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion

__________________________________________________
Do You Yahoo!?
Tired of spam? Yahoo! Mail has the best spam protection around
http://mail.yahoo.com

From d_l_goldsmith at yahoo.com Fri Aug 21 12:55:48 2009
From: d_l_goldsmith at yahoo.com (David Goldsmith)
Date: Fri, 21 Aug 2009 09:55:48 -0700 (PDT)
Subject: Re: [Numpy-discussion] A better median function?
In-Reply-To: <963316.2405.qm@web52109.mail.re2.yahoo.com>
Message-ID: <676615.2812.qm@web52102.mail.re2.yahoo.com>

Ouch, didn't check my to address first, sorry!!!

DG

--- On Fri, 8/21/09, David Goldsmith wrote:

> From: David Goldsmith
> Subject: Re: [Numpy-discussion] A better median function?
> To: "Discussion of Numerical Python"
> Date: Friday, August 21, 2009, 9:50 AM
> Not to make you regret your post ;-) but, you having readily furnished your email address, I'm taking the liberty of forwarding you my resume [...] Thanks for your time and consideration,
>
> David Goldsmith

From sturla at molden.no Fri Aug 21 22:09:08 2009
From: sturla at molden.no (Sturla Molden)
Date: Fri, 21 Aug 2009 19:09:08 -0700
Subject: [Numpy-discussion] PRNGs and multi-threading
In-Reply-To: <3d375d730908210912v4abed520vfc515c41858b2af6@mail.gmail.com>
References: <4A8F4058.80203@molden.no>
	<3d375d730908210912v4abed520vfc515c41858b2af6@mail.gmail.com>
Message-ID: <4A8F5344.8010205@molden.no>

Robert Kern skrev:
> Although you don't really have re-entrancy issues, you will usually want one PRNG per thread for determinism.

I see... numpy.random.rand does not have re-entrancy issues because of the GIL, but I get indeterminism from the OS scheduling the threads. RandomState might not release the GIL either, but preserves determinism in presence of multiple threads. Thanks. :-)

Regards,
Sturla Molden

From stefan at sun.ac.za Fri Aug 21 13:33:03 2009
From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=)
Date: Fri, 21 Aug 2009 10:33:03 -0700
Subject: [Numpy-discussion] Removing scipy.stsci was [Re: [SciPy-dev] Deprecate chararray [was Plea for help]]
In-Reply-To: References: <857977.74958.qm@web52106.mail.re2.yahoo.com>
	<812BBECE-D1E8-4699-980A-BB8FB9657CB9@stsci.edu>
	<5b8d13220908201204s3c74cad1pabccdce47d3a13a1@mail.gmail.com>
	<9457e7c80908201648r474297fo1d6ad2061c240fa6@mail.gmail.com>
	<88B0EA8C-5637-4050-AC4F-023C6D41BB3A@stsci.edu>
Message-ID: <9457e7c80908211033y41e1e919le24181e277589b25@mail.gmail.com>

Hi Vincent

2009/8/21 Vincent Schut :
> I know it probably will be a pretty involved task, as ndimage comes from numarray and seems to be largely implemented in C. But I really wanted to raise the issue now the image processing subject turns up once again, and hope some folks with more/better programming skills than me might like the idea...

What would you like the behaviour to be? For example, how should ndimage.zoom handle these missing values?

Regards
Stéfan
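In the meantime, one stop-gap for this kind of data is sketched below: fill the missing pixels, zoom the image and a validity mask separately, and renormalise. This is only an illustration of one possible answer to the question, not ndimage behaviour (the array, zoom factor and threshold are made up; only scipy.ndimage.zoom itself is assumed):

import numpy as np
from scipy import ndimage

img = np.random.rand(64, 64)
img[10:20, 30:40] = np.nan             # fake missing data

valid = np.isfinite(img)
filled = np.where(valid, img, 0.0)

num = ndimage.zoom(filled, 2.0, order=1)            # zoomed data
den = ndimage.zoom(valid.astype(float), 2.0, order=1)  # zoomed weights

# renormalise by the interpolated weights; flag mostly-missing output as NaN
out = np.where(den > 0.5, num / np.maximum(den, 1e-12), np.nan)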
From chad.netzer at gmail.com Fri Aug 21 14:10:51 2009
From: chad.netzer at gmail.com (Chad Netzer)
Date: Fri, 21 Aug 2009 11:10:51 -0700
Subject: [Numpy-discussion] A better median function?
In-Reply-To: <268febdf0908210847k227315a6u564b50ffb8edf750@mail.gmail.com>
References: <268febdf0908210847k227315a6u564b50ffb8edf750@mail.gmail.com>
Message-ID:

On Fri, Aug 21, 2009 at 8:47 AM, Mike Ressler wrote:
> I presented this during a lightning talk at the scipy conference yesterday, so again, at the risk of painting myself as a flaming idiot:
>
> ---------------------
> Wanted: A Better/Faster median() Function
>
> numpy implementation uses simple sorting algorithm:
> Sort all the data using the .sort() method
> Return middle value (or mean of two middle values)

Michael and I also discussed this briefly this morning at the SciPy conference. I'll summarize a bit:

scipy.signals has a medianfilter() implemented in C which uses the Hoare selection algorithm:

http://en.wikipedia.org/wiki/Selection_algorithm#Partition-based_general_selection_algorithm

This *may* suit Michael's needs. More generally, C++ std lib has both partial_sort() and nth_element(), both of which would be "nice" functionality to have natively (ie. C speeds) in numpy, imo:

http://www.cplusplus.com/reference/algorithm/partial_sort/
http://www.cplusplus.com/reference/algorithm/nth_element/

Since both can be made from a modified subset of quicksort, it should be straightforward to craft these methods from the existing numpy quicksort code. With these primitives, it is trivial to improve the existing numpy median() implementations. I'll probably attempt to tackle it myself, unless anyone else has done it (or has a better idea).

Certainly, the ability to quickly find the minimal or maximal n elements of a sequence, without having to perform a full sort, would be of use to many numpy users. Has this problem already been solved in numpy?

-Chad

From matthew.brett at gmail.com Fri Aug 21 14:33:43 2009
From: matthew.brett at gmail.com (Matthew Brett)
Date: Fri, 21 Aug 2009 11:33:43 -0700
Subject: [Numpy-discussion] A better median function?
In-Reply-To: <268febdf0908210847k227315a6u564b50ffb8edf750@mail.gmail.com>
References: <268febdf0908210847k227315a6u564b50ffb8edf750@mail.gmail.com>
Message-ID: <1e2af89e0908211133n661e0accsd48021bea6158121@mail.gmail.com>

Hi,

> Nicolas Devillard discusses several algorithms at
> http://ndevilla.free.fr/median/median/index.html

Thanks for this. A loud 'yes' from the back of the internet too. I contacted Nicolas Devillard a year or so ago to ask him if we could include his code in Scipy, and he said 'yes'. I can forward this if that's useful.

Nicolas investigated algorithms that find the lower (or upper) median value. The lower median is the median iff there are an odd number of entries in our list, or the lower of the central values in the sort, when there are an even number of values in the list. So, we need the upper _and_ lower median when there are an even number of entries. I guess the necessity of doing those two related searches may change the relative strengths of the algorithms, but I'm sure this is a well-known problem.

Best,

Matthew
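For readers who want to experiment, here is a pure-Python sketch of the select-based median under discussion. This is not Devillard's code or the Numerical Recipes routine, just an illustrative quickselect; for even-length input the upper median comes from a second select, in the spirit of the min-of-the-upper-partition trick discussed later in this thread:

def quickselect(seq, k):
    # return the k-th smallest element (0-based), average O(n)
    a = list(seq)
    while True:
        pivot = a[len(a) // 2]
        lows = [x for x in a if x < pivot]
        pivs = [x for x in a if x == pivot]
        if k < len(lows):
            a = lows
        elif k < len(lows) + len(pivs):
            return pivot
        else:
            k -= len(lows) + len(pivs)
            a = [x for x in a if x > pivot]

def median_select(data):
    n = len(data)
    lower = quickselect(data, (n - 1) // 2)
    if n % 2:
        return lower    # odd length: the lower median is the median
    upper = quickselect(data, n // 2)
    return 0.5 * (lower + upper)

print median_select([3, 1, 4, 1, 5, 9, 2, 6])   # prints 3.5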
From sturla at molden.no Fri Aug 21 23:37:34 2009
From: sturla at molden.no (Sturla Molden)
Date: Fri, 21 Aug 2009 20:37:34 -0700
Subject: [Numpy-discussion] PRNGs and multi-threading
In-Reply-To: <3d375d730908210912v4abed520vfc515c41858b2af6@mail.gmail.com>
References: <4A8F4058.80203@molden.no>
	<3d375d730908210912v4abed520vfc515c41858b2af6@mail.gmail.com>
Message-ID: <4A8F67FE.7060803@molden.no>

Robert Kern skrev:
> As long as you use different seeds, I believe this is fine. The state size of MT is so enormous that almost any reasonable use will not find overlaps.

It seems there is a special version of the Mersenne Twister for this. The code is LGPL (annoying for SciPy but ok for me).

http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/DC/dc.html

Sturla Molden

From sturla at molden.no Fri Aug 21 23:50:32 2009
From: sturla at molden.no (Sturla Molden)
Date: Fri, 21 Aug 2009 20:50:32 -0700
Subject: [Numpy-discussion] PRNGs and multi-threading
In-Reply-To: <4A8F67FE.7060803@molden.no>
References: <4A8F4058.80203@molden.no>
	<3d375d730908210912v4abed520vfc515c41858b2af6@mail.gmail.com>
	<4A8F67FE.7060803@molden.no>
Message-ID: <4A8F6B08.7040802@molden.no>

Sturla Molden skrev:
> It seems there is a special version of the Mersenne Twister for this. The code is LGPL (annoying for SciPy but ok for me).

http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/DC/dgene.pdf

Basically it encodes the thread-ids in the characteristic polynomial of the MT, producing multiple small-period, independent MTs. That solves it then. Too bad this is LGPL. It would be a very useful enhancement to RandomState.

Sturla Molden

From matthew.brett at gmail.com Fri Aug 21 14:51:50 2009
From: matthew.brett at gmail.com (Matthew Brett)
Date: Fri, 21 Aug 2009 11:51:50 -0700
Subject: [Numpy-discussion] Accelerating NumPy computations [Was: GPU Numpy]
In-Reply-To: References: <7f1eaee30908061429v5d04ab77v18b37a0a177548cd@mail.gmail.com>
	<4A7B5B06.2080909@molden.no> <4A7B62FC.6090504@molden.no>
	<1250766444.5546.40.camel@inspiron>
	<2d1d7fe70908210701k49fb02a1l8818779f1864bce0@mail.gmail.com>
Message-ID: <1e2af89e0908211151s49b20421wbea4d5b15f3f2455@mail.gmail.com>

Hi,

> Indeed. In the future, if OpenCL is the way to go, it may even be helpful to have Numpy using OpenCL directly, as AMD provides an SDK for OpenCL, and with Larrabee approaching, Intel will surely provide one of its own.

I was just in a lecture by one of the Intel people about OpenCL:

http://parlab.eecs.berkeley.edu/bootcampagenda
http://parlab.eecs.berkeley.edu/sites/all/parlab/files/OpenCL_Mattson.pdf

He offered no schedule for an Intel OpenCL implementation, but said that they were committed to it.

The lectures in general were effective in pointing out what a time-consuming effort it can be moving algorithms into the parallel world - including GPUs. The lecture just passed cited the example of a CUDA-based BLAS implementation on the GPU that was slower than the CPU version. Making BLAS go faster required a lot of work to find optimal strategies for blocking, transfer between CPU / GPU shared memory / GPU registers, vector sizes and so on - this on a specific NVIDIA architecture.

I can imagine Numpy being useful for scripting in this C-and-assembler-centric world, making it easier to write automated testers, or even generate C code.

Is anyone out there working on this kind of stuff? I ask only because there seems to be considerable interest here on the Berkeley campus.

Best,

Matthew
From michael at directaid.ca Fri Aug 21 14:34:58 2009
From: michael at directaid.ca (Michael Cooper)
Date: Fri, 21 Aug 2009 12:34:58 -0600
Subject: [Numpy-discussion] Problems distributing NumPy
Message-ID:

Hi all-

I am writing a C++ application with embedded Python scripting. Some of the scripts use NumPy, so I have been working out the best way to distribute NumPy with my software. At the moment, I've got a private folder which I add to the Python path. In this folder, I include all the files which would usually get installed to the "site-packages" folder. To get these files, I've simply unzipped the distutils installer (for NumPy, I am using the "no SSE" version at the moment), and copied the contents of the resulting "PLATLIB" folder to my private folder.

For all the other libraries I am using, this method seems to work fine. However, with NumPy, if I do things this way, when I call "import_array()", it jumps back out of the calling function, skipping the rest of that function, and continues from there. If I install NumPy in the usual way, this does not happen. It seems like the NumPy initialization is failing when I install into the private folder, but not if I use the normal installer.

Many of the users of my software aren't particularly Python savvy, so having them install everything manually is not an option. I would like to avoid having my own installer call external installers, since that's confusing for some users. Finally, if possible, I would like to avoid changing the user's "Python26" folder, since I have no way of knowing what else might be relying on its contents. Lots of searching on how to install third-party libraries has led me to the "PLATLIB" method, so I'm at a bit of a loss as to what else to try.

Does anyone here know what might be going wrong? I am using Python 2.6, Boost 1.38, and NumPy 1.3.0 on a Windows XP system. The embedding program is written in C++, and compiled using Visual Studio 2005.

Thanks,
Michael

From robert.kern at gmail.com Fri Aug 21 15:00:28 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Fri, 21 Aug 2009 12:00:28 -0700
Subject: [Numpy-discussion] PRNGs and multi-threading
In-Reply-To: <4A8F6B08.7040802@molden.no>
References: <4A8F4058.80203@molden.no>
	<3d375d730908210912v4abed520vfc515c41858b2af6@mail.gmail.com>
	<4A8F67FE.7060803@molden.no> <4A8F6B08.7040802@molden.no>
Message-ID: <3d375d730908211200p144412e5ve28b18ca39c62545@mail.gmail.com>

On Fri, Aug 21, 2009 at 20:50, Sturla Molden wrote:
> Sturla Molden skrev:
>> It seems there is a special version of the Mersenne Twister for this. The code is LGPL (annoying for SciPy but ok for me).
>
> http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/DC/dgene.pdf
>
> Basically it encodes the thread-ids in the characteristic polynomial of the MT, producing multiple small-period, independent MTs. That solves it then. Too bad this is LGPL. It would be a very useful enhancement to RandomState.

I agree. It might be possible to re-implement it from the original papers, but it's a chunk of work.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
-- Umberto Eco
From mike.ressler at alum.mit.edu Fri Aug 21 15:08:50 2009
From: mike.ressler at alum.mit.edu (Mike Ressler)
Date: Fri, 21 Aug 2009 12:08:50 -0700
Subject: [Numpy-discussion] A better median function?
In-Reply-To: <1e2af89e0908211133n661e0accsd48021bea6158121@mail.gmail.com>
References: <268febdf0908210847k227315a6u564b50ffb8edf750@mail.gmail.com>
	<1e2af89e0908211133n661e0accsd48021bea6158121@mail.gmail.com>
Message-ID: <268febdf0908211208o4ff73ff1q69146b542c75321d@mail.gmail.com>

Hi,

On Fri, Aug 21, 2009 at 11:33 AM, Matthew Brett wrote:
> Nicolas investigated algorithms that find the lower (or upper) median value. The lower median is the median iff there are an odd number of entries in our list, or the lower of the central values in the sort, when there are an even number of values in the list. So, we need the upper _and_ lower median when there are an even number of entries. I guess the necessity of doing those two related searches may change the relative strengths of the algorithms, but I'm sure this is a well-known problem.

My trivial solution to this was that since the data is now in two "partitions", all one needs to do is quickly scan through the upper partition to find the minimum value. Since you already have the lower median from the select run, this minimum is by definition the upper median. This can be averaged with the lower median and you are done.

Brain dead, perhaps, but it worked for my test. I did not (yet) investigate whether this minimum can be located in some more expedient fashion (i.e. did the select put the minimum in some consistent place where we don't have to scan through half the input array?).

Mike

--
mike.ressler at alum.mit.edu

From matthew.brett at gmail.com Fri Aug 21 15:16:48 2009
From: matthew.brett at gmail.com (Matthew Brett)
Date: Fri, 21 Aug 2009 12:16:48 -0700
Subject: [Numpy-discussion] A better median function?
In-Reply-To: <268febdf0908211208o4ff73ff1q69146b542c75321d@mail.gmail.com>
References: <268febdf0908210847k227315a6u564b50ffb8edf750@mail.gmail.com>
	<1e2af89e0908211133n661e0accsd48021bea6158121@mail.gmail.com>
	<268febdf0908211208o4ff73ff1q69146b542c75321d@mail.gmail.com>
Message-ID: <1e2af89e0908211216u73200022n6fee38d87bc8e3b1@mail.gmail.com>

> On Fri, Aug 21, 2009 at 11:33 AM, Matthew Brett wrote:
>> Nicolas investigated algorithms that find the lower (or upper) median value. [...]
>
> My trivial solution to this was that since the data is now in two "partitions", all one needs to do is quickly scan through the upper partition to find the minimum value.
...
> Brain dead, perhaps, but it worked for my test.

Nice... Your brain, when dead, is working better than mine,

Matthew
Mike

--
mike.ressler at alum.mit.edu

From matthew.brett at gmail.com  Fri Aug 21 15:16:48 2009
From: matthew.brett at gmail.com (Matthew Brett)
Date: Fri, 21 Aug 2009 12:16:48 -0700
Subject: [Numpy-discussion] A better median function?
In-Reply-To: <268febdf0908211208o4ff73ff1q69146b542c75321d@mail.gmail.com>
References: <268febdf0908210847k227315a6u564b50ffb8edf750@mail.gmail.com> <1e2af89e0908211133n661e0accsd48021bea6158121@mail.gmail.com> <268febdf0908211208o4ff73ff1q69146b542c75321d@mail.gmail.com>
Message-ID: <1e2af89e0908211216u73200022n6fee38d87bc8e3b1@mail.gmail.com>

> On Fri, Aug 21, 2009 at 11:33 AM, Matthew Brett wrote:
>> Nicolas investigated algorithms that find the lower (or upper) median
>> value. The lower median is the median iff there are an odd number of
>> entries in our list, or the lower of the central values in the sort,
>> when there are an even number of values in the list. So, we need the
>> upper _and_ lower median when there are an even number of entries. I
>> guess the necessity of doing those two related searches may change the
>> relative strengths of the algorithms, but I'm sure this is a
>> well-known problem.
>
> My trivial solution to this was that since the data is now in two
> "partitions", all one needs to do is quickly scan through the upper
> partition to find the minimum value.
...
> Brain dead, perhaps, but it worked for my test.

Nice... Your brain, when dead, is working better than mine,

Matthew

From saintmlx at apstat.com  Fri Aug 21 15:20:41 2009
From: saintmlx at apstat.com (Xavier Saint-Mleux)
Date: Fri, 21 Aug 2009 15:20:41 -0400
Subject: [Numpy-discussion] PRNGs and multi-threading
In-Reply-To: <3d375d730908211200p144412e5ve28b18ca39c62545@mail.gmail.com>
References: <4A8F4058.80203@molden.no> <3d375d730908210912v4abed520vfc515c41858b2af6@mail.gmail.com> <4A8F67FE.7060803@molden.no> <4A8F6B08.7040802@molden.no> <3d375d730908211200p144412e5ve28b18ca39c62545@mail.gmail.com>
Message-ID: <4A8EF389.2080209@apstat.com>

Robert Kern wrote:
> On Fri, Aug 21, 2009 at 20:50, Sturla Molden wrote:
>
>> Sturla Molden skrev:
>>
>>> It seems there is a special version of the Mersenne Twister for this.
>>> The code is LGPL (annoying for SciPy but ok for me).
>>>
>> http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/DC/dgene.pdf
>>
>> Basically it encodes the thread-ids in the characteristic polynomial of
>> the MT, producing multiple small-period, independent MTs. That solves it
>> then. Too bad this is LGPL. It would be a very useful enhancement to
>> RandomState.
>>
>
> I agree. It might be possible to re-implement it from the original
> papers, but it's a chunk of work.

I use the following PRNG class, derived from RandomState, which allows a PRNG to create multiple different sub-PRNGs in a deterministic way:

http://bazaar.launchpad.net/~piaget-dev/piaget/dev/annotate/head%3A/piaget/math/prng.py

It is written in Python and has an Apache license (BSD-like). There is no mathematical proof that different PRNGs will have states "far enough" from each other, but it works well in practice (I've had bad surprises using Python's random.jumpahead, which is not a real jumpahead).

Of course, the mathematically correct way would be to use a correct jumpahead function, but all the implementations that I know of are GPL. A recent article about this is:

www.iro.umontreal.ca/~lecuyer/myftp/papers/jumpmt.pdf

Xavier Saint-Mleux

From bergstrj at iro.umontreal.ca  Fri Aug 21 15:46:01 2009
From: bergstrj at iro.umontreal.ca (James Bergstra)
Date: Fri, 21 Aug 2009 15:46:01 -0400
Subject: [Numpy-discussion] Accelerating NumPy computations [Was: GPU Numpy]
In-Reply-To: <1e2af89e0908211151s49b20421wbea4d5b15f3f2455@mail.gmail.com>
References: <4A7B5B06.2080909@molden.no> <4A7B62FC.6090504@molden.no> <1250766444.5546.40.camel@inspiron> <2d1d7fe70908210701k49fb02a1l8818779f1864bce0@mail.gmail.com> <1e2af89e0908211151s49b20421wbea4d5b15f3f2455@mail.gmail.com>
Message-ID: <7f1eaee30908211246t54fa676fyefa8d32f7bdf37bd@mail.gmail.com>

On Fri, Aug 21, 2009 at 2:51 PM, Matthew Brett wrote:
> I can imagine Numpy being useful for scripting in this
> C-and-assembler-centric world, making it easier to write automated
> testers, or even generate C code.
>
> Is anyone out there working on this kind of stuff? I ask only because
> there seems to be considerable interest here on the Berkeley campus.
>
> Best,
>
> Matthew

Frederic Bastien and I are working on this sort of thing. We use a project called theano to build symbolic expression graphs. Theano optimizes those graphs like an optimizing compiler, and then it generates C code for those graphs. We haven't put a lot of effort into optimizing the C implementations of most expressions (except for non-separable convolution), but we call fast blas and fftw functions, and our naive implementations are typically faster than equivalent numpy expressions just because they are in C.
(Although congrats to those working at optimizing numpy... it has gotten a lot faster over the last few years!)

We are now writing another backend that generates cuda runtime C++. It is just like you say: even for simple tasks like adding two vectors together or summing the elements of a matrix, there are several possible kernels that can be optimal in different circumstances. The penalty of choosing a sub-optimal kernel can be pretty high. So what ends up happening is that even for simple ufunc-type expressions, we have
- a version for when the arguments are small and everything is c-contiguous
- a general version that is typically orders of magnitude slower than the optimal choice
- versions for when arguments are small and 1D, 2D, 3D, 4D, 5D
- versions for when various of the arguments are broadcasted in different ways
- versions for when there is at least one large contiguous dimension

And the list goes on. We are still in the process of understanding the architecture and the most effective strategies for optimization. I think our design is a good one though from the users' perspective because it supports a completely opaque front-end.. you just program the symbolic graph in python using normal expressions, compile it as a function, and call it. The detail of whether it is evaluated on the CPU or the GPU (or both) is hidden.

If anyone is interested in what we're doing please feel free to send me an email. Links to these projects are

http://www.pylearn.org/theano
http://code.google.com/p/theano-cuda-ndarray/
http://code.google.com/p/cuda-ndarray/

James

--
http://www-etud.iro.umontreal.ca/~bergstrj

From michael at directaid.ca  Fri Aug 21 16:15:26 2009
From: michael at directaid.ca (Michael Cooper)
Date: Fri, 21 Aug 2009 14:15:26 -0600
Subject: [Numpy-discussion] Problems distributing NumPy
In-Reply-To:
References:
Message-ID: <74BD9F19152A44F3923369D830D620FA@polarsun>

Hi again-

I've been working on this problem off and on over the last couple of years, and of course did not find the solution until I finally broke down and posted to the list. It looks like the problem is very simple: Although I was adding my private folder to the Python path, I was not doing so until after NumPy was initialized. Hence, the library could not be found. It is, in fact, as simple as I had hoped it would be to distribute NumPy in a private folder.
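For anyone who hits the same thing, the fix boils down to making sure the path insertion happens before anything can import NumPy. In Python terms (the folder name here is just an example):

import sys
sys.path.insert(0, r"C:\MyApp\private-packages")  # must happen first
import numpy  # now resolves against the private folder

In the embedded case, the equivalent (e.g. running those two path lines via PyRun_SimpleString) just has to happen right after Py_Initialize(), before any script gets a chance to import NumPy.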
Thanks,
Michael

 _____

From: numpy-discussion-bounces at scipy.org [mailto:numpy-discussion-bounces at scipy.org] On Behalf Of Michael Cooper
Sent: August 21, 2009 12:35 PM
To: numpy-discussion at scipy.org
Subject: [Numpy-discussion] Problems distributing NumPy

Hi all-

I am writing a C++ application with embedded Python scripting. Some of the scripts use NumPy, so I have been working out the best way to distribute NumPy with my software.

At the moment, I've got a private folder which I add to the Python path. In this folder, I include all the files which would usually get installed to the "site-packages" folder. To get these files, I've simply unzipped the distutils installer (for NumPy, I am using the "no SSE" version at the moment), and copied the contents of the resulting "PLATLIB" folder to my private folder.

For all the other libraries I am using, this method seems to work fine. However, with NumPy, if I do things this way, when I call "import_array()", it jumps back out of the calling function, skipping the rest of that function, and continues from there. If I install NumPy in the usual way, this does not happen. It seems like the NumPy initialization is failing when I install into the private folder, but not if I use the normal installer.

Many of the users of my software aren't particularly Python savvy, so having them install everything manually is not an option. I would like to avoid having my own installer call external installers, since that's confusing for some users. Finally, if possible, I would like to avoid changing the user's "Python26" folder, since I have no way of knowing what else might be relying on its contents.

Lots of searching on how to install third-party libraries has led me to the "PLATLIB" method, so I'm at a bit of a loss as to what else to try. Does anyone here know what might be going wrong? I am using Python 2.6, Boost 1.38, and NumPy 1.3.0 on a Windows XP system. The embedding program is written in C++, and compiled using Visual Studio 2005.

Thanks,
Michael
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From sturla at molden.no  Sat Aug 22 02:08:28 2009
From: sturla at molden.no (Sturla Molden)
Date: Fri, 21 Aug 2009 23:08:28 -0700
Subject: [Numpy-discussion] PRNGs and multi-threading
In-Reply-To: <4A8EF389.2080209@apstat.com>
References: <4A8F4058.80203@molden.no> <3d375d730908210912v4abed520vfc515c41858b2af6@mail.gmail.com> <4A8F67FE.7060803@molden.no> <4A8F6B08.7040802@molden.no> <3d375d730908211200p144412e5ve28b18ca39c62545@mail.gmail.com> <4A8EF389.2080209@apstat.com>
Message-ID: <4A8F8B5C.3090700@molden.no>

Xavier Saint-Mleux skrev:
> Of course, the mathematically correct way would be to use a correct
> jumpahead function, but all the implementations that I know of are GPL.
> A recent article about this is:
>
> www.iro.umontreal.ca/~lecuyer/myftp/papers/jumpmt.pdf

I know of no efficient "jumpahead" function for MT. Several seconds for 1000 jumps ahead is not impressive -- just generating the deviates is faster!

With DCMT it is easy to create "independent" MTs with smaller periods. Independence here means that the "characteristic polynomials are relatively prime to each other". A "small" period of e.g. 2**521 - 1 means that if we produce 1 billion deviates per minute, it would still take the MT about 10**143 years to cycle. Chances are we will not be around to see that happen.

It also seems that nvidia has endorsed this method:

http://developer.download.nvidia.com/compute/cuda/sdk/website/projects/MersenneTwister/doc/MersenneTwister.pdf

S.M.

From pivanov314 at gmail.com  Fri Aug 21 18:06:50 2009
From: pivanov314 at gmail.com (Paul Ivanov)
Date: Fri, 21 Aug 2009 15:06:50 -0700
Subject: [Numpy-discussion] Accelerating NumPy computations [Was: GPU Numpy]
In-Reply-To: <1e2af89e0908211151s49b20421wbea4d5b15f3f2455@mail.gmail.com>
References: <7f1eaee30908061429v5d04ab77v18b37a0a177548cd@mail.gmail.com> <4A7B5B06.2080909@molden.no> <4A7B62FC.6090504@molden.no> <1250766444.5546.40.camel@inspiron> <2d1d7fe70908210701k49fb02a1l8818779f1864bce0@mail.gmail.com> <1e2af89e0908211151s49b20421wbea4d5b15f3f2455@mail.gmail.com>
Message-ID: <20090821220650.GD8191@ykcyc>

Matthew Brett, on 2009-08-21 11:51, wrote:
> Hi,
>
> > Indeed. In the future, if OpenCL is the way to go, it may even be
> > helpful to have Numpy using OpenCL directly, as AMD provides an SDK
> > for OpenCL, and with Larrabee approaching, Intel will surely provide
> > one of its own.
>
> I was just in a lecture by one of the Intel people about OpenCL:
>
> http://parlab.eecs.berkeley.edu/bootcampagenda
> http://parlab.eecs.berkeley.edu/sites/all/parlab/files/OpenCL_Mattson.pdf
>
> He offered no schedule for an Intel OpenCL implementation, but said
> that they were committed to it.
>
> The lectures in general were effective in pointing out what a
> time-consuming effort it can be moving algorithms into the
> parallel world - including GPUs. The lecture just passed cited the
> example of a CUDA-based BLAS implementation on the GPU that was slower
> than the CPU version. Making BLAS go faster required a lot of work
> to find optimal strategies for blocking, transfer between CPU / GPU
> shared memory / GPU registers, vector sizes and so on - this on a
> specific NVIDIA architecture.
>
> I can imagine Numpy being useful for scripting in this
> C-and-assembler-centric world, making it easier to write automated
> testers, or even generate C code.
>
> Is anyone out there working on this kind of stuff? I ask only because
> there seems to be considerable interest here on the Berkeley campus.

This is exactly the sort of thing you can do with PyCUDA, which makes it so awesome! In particular, see the metaprogramming portion of the docs:

The metaprogramming section of the slides and source code from Nicolas Pinto and Andreas Klöckner's *excellent* SciPy2009 Tutorials is even more thorough:

cheers,
Paul Ivanov

From pinto at mit.edu  Fri Aug 21 20:19:13 2009
From: pinto at mit.edu (Nicolas Pinto)
Date: Fri, 21 Aug 2009 17:19:13 -0700
Subject: [Numpy-discussion] GPU Numpy
In-Reply-To:
References:
Message-ID: <954ae5aa0908211719q5f592775ia2e79000e09b7bad@mail.gmail.com>

Hello

> from gpunumpy import *
> x=zeros(100,dtype='gpufloat') # Creates an array of 100 elements on the GPU
> y=ones(100,dtype='gpufloat')
> z=exp(2*x+y) # z is on the GPU, all operations on GPU with no transfer
> z_cpu=array(z,dtype='float') # z is copied to the CPU
> i=(z>2.3).nonzero()[0] # operation on GPU, returns a CPU integer array

PyCuda already supports this through the gpuarray interface. As soon as Nvidia allows us to combine Driver and Runtime APIs, we'll be able to integrate libraries like CUBLAS, CUFFT, and any other runtime-dependent library. We could probably get access to CUBLAS/CUFFT source code as Nvidia released the 1.1 version in the past:

http://sites.google.com/site/cudaiap2009/materials-1/extras/online-resources#TOC-CUBLAS-and-CUFFT-1.1-Source-Code

but it would be easier to just use the libraries (and 1.1 is outdated now).

For those of you who are interested, we forked python-cuda recently and started to add some numpy "sugar". The goal of python-cuda is to *complement* PyCuda by providing an equivalent to the CUDA Runtime API (understand: not Pythonic) using automatically-generated ctypes bindings. With it you can use CUBLAS, CUFFT and the emulation mode (so you don't need a GPU to develop):

http://github.com/npinto/python-cuda/tree/master

HTH

Best,

--
Nicolas Pinto
Ph.D. Candidate, Brain & Computer Sciences
Massachusetts Institute of Technology, USA
http://web.mit.edu/pinto
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From stefan at sun.ac.za  Fri Aug 21 20:23:28 2009
From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=)
Date: Fri, 21 Aug 2009 17:23:28 -0700
Subject: [Numpy-discussion] GPU Numpy
In-Reply-To: <954ae5aa0908211719q5f592775ia2e79000e09b7bad@mail.gmail.com>
References: <954ae5aa0908211719q5f592775ia2e79000e09b7bad@mail.gmail.com>
Message-ID: <9457e7c80908211723w12da7fbctc0a62c809d46eaca@mail.gmail.com>

2009/8/21 Nicolas Pinto :
> For those of you who are interested, we forked python-cuda recently and
> started to add some numpy "sugar". The goal of python-cuda is to
> *complement* PyCuda by providing an equivalent to the CUDA Runtime API
> (understand: not Pythonic) using automatically-generated ctypes bindings.
> With it you can use CUBLAS, CUFFT and the emulation mode (so you don't need
> a GPU to develop):
> http://github.com/npinto/python-cuda/tree/master

Since you forked the project, it may be worth giving it a new name. PyCuda vs. python-cuda is bound to confuse people horribly!

Cheers
Stéfan

From pinto at mit.edu  Fri Aug 21 20:30:12 2009
From: pinto at mit.edu (Nicolas Pinto)
Date: Fri, 21 Aug 2009 17:30:12 -0700
Subject: [Numpy-discussion] GPU Numpy
In-Reply-To: <9457e7c80908211723w12da7fbctc0a62c809d46eaca@mail.gmail.com>
References: <954ae5aa0908211719q5f592775ia2e79000e09b7bad@mail.gmail.com> <9457e7c80908211723w12da7fbctc0a62c809d46eaca@mail.gmail.com>
Message-ID: <954ae5aa0908211730n7fec8daexaeb42f0ef06b30b4@mail.gmail.com>

Agreed! What would be the best name? Our package will provide non-pythonic bindings to cuda (e.g. import cuda; cuda.cudaMemcpy( ... ) ) and some numpy sugar (e.g. from cuda import sugar; sugar.fft.fftconvolve(ndarray_a, ndarray_b, 'same')). How about cuda-ctypes or ctypes-cuda? Any suggestion?

At the same time we may wait for Nvidia to unlock this Driver/Runtime issue, so we don't need this anymore.

Best,

N

2009/8/21 Stéfan van der Walt

> 2009/8/21 Nicolas Pinto :
> > For those of you who are interested, we forked python-cuda recently and
> > started to add some numpy "sugar". The goal of python-cuda is to
> > *complement* PyCuda by providing an equivalent to the CUDA Runtime API
> > (understand: not Pythonic) using automatically-generated ctypes bindings.
> > With it you can use CUBLAS, CUFFT and the emulation mode (so you don't
> need
> > a GPU to develop):
> > http://github.com/npinto/python-cuda/tree/master
>
> Since you forked the project, it may be worth giving it a new name.
> PyCuda vs. python-cuda is bound to confuse people horribly!
>
> Cheers
> Stéfan
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion

--
Nicolas Pinto
Ph.D. Candidate, Brain & Computer Sciences
Massachusetts Institute of Technology, USA
http://web.mit.edu/pinto
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From sturla at molden.no  Sat Aug 22 05:50:28 2009
From: sturla at molden.no (Sturla Molden)
Date: Sat, 22 Aug 2009 02:50:28 -0700
Subject: [Numpy-discussion] Fwd: GPU Numpy
In-Reply-To:
References: <7f1eaee30908061041l2cd76f64r96e7f5c7c16a2483@mail.gmail.com> <4A7B43CE.7050509@molden.no> <7f1eaee30908061429v5d04ab77v18b37a0a177548cd@mail.gmail.com> <4A7B5B06.2080909@molden.no> <4A7B62FC.6090504@molden.no>
Message-ID: <4A8FBF64.50300@molden.no>

Erik Tollerud skrev:
>> NumPy arrays on the GPU memory is an easy task. But then I would have to
>> write the computation in OpenCL's dialect of C99?
> This is true to some extent, but also probably difficult to do given
> the fact that parallelizable algorithms are generally more difficult
> to formulate in straightforward ways.

Then you have misunderstood me completely. Creating an ndarray that has a buffer in graphics memory is not too difficult, given that graphics memory can be memory mapped. This has nothing to do with parallelizable algorithms or not. It is just memory management. We could make an ndarray subclass that quickly puts its content in a buffer accessible to the GPU. That is not difficult. But then comes the question of what you do with it.

I think many here misunderstand the issue: Teraflops peak performance of modern GPUs is impressive. But NumPy cannot easily benefit from that. In fact, there is little or nothing to gain from optimising in that end. In order for a GPU to help, computation must be the time-limiting factor. It is not. There is no more to say about using GPUs in NumPy right now.

Take a look at the timings here: http://www.scipy.org/PerformancePython It shows that computing with NumPy is more than ten times slower than using plain C. This is despite NumPy being written in C. The NumPy code does not incur 10 times more floating point operations than the C code. The floating point unit does not run in turtle mode when using NumPy. NumPy's relative slowness compared to C has nothing to do with floating point computation. It is due to inferior memory use (temporary buffers, multiple buffer traversals) and memory access being slow. Moving computation to the GPU can only make this worse.

Improved memory usage - e.g. through lazy evaluation and JIT compilation of expressions - can give up to a tenfold increase in performance. That is where we must start optimising to get a faster NumPy. Incidentally, this will also make it easier to leverage on modern GPUs.
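numexpr already demonstrates the point. As a rough sketch of the kind of comparison I mean (exact sizes and speedups depend on the machine):

import numpy as np
import numexpr as ne

a, b, c = (np.random.rand(2**22) for _ in range(3))

d1 = a*b + 2.0*c                  # NumPy: temporaries, several passes over memory
d2 = ne.evaluate("a*b + 2.0*c")   # one pass over memory, no large temporaries

The arithmetic is identical in both lines; the speed difference on large arrays is all memory traffic.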
Sturla Molden

From d_l_goldsmith at yahoo.com  Sat Aug 22 02:44:04 2009
From: d_l_goldsmith at yahoo.com (David Goldsmith)
Date: Sat, 22 Aug 2009 06:44:04 +0000 (GMT)
Subject: [Numpy-discussion] Anyone coming to the sprints have a >= 1GB memory stick or blank CD?
Message-ID: <773064.54595.qm@web52108.mail.re2.yahoo.com>

If so, and I could use it to try to install Kubuntu tomorrow, I'd really appreciate it if you'd bring it w/ you. Thanks!

DG

From eadrogue at gmx.net  Sat Aug 22 07:58:44 2009
From: eadrogue at gmx.net (Ernest =?iso-8859-1?Q?Adrogu=E9?=)
Date: Sat, 22 Aug 2009 13:58:44 +0200
Subject: [Numpy-discussion] masked arrays of structured arrays
Message-ID: <20090822115844.GA6422@doriath.local>

Hi there,

Here is a structured array with 3 fields each of which has 3 fields in turn:

In [3]: desc = [('a',int), ('b',int), ('c',int)]
In [4]: desc = [('x',desc), ('y',desc), ('z',desc)]

With a regular ndarray it works just fine:

In [11]: x = np.zeros(2, dtype=desc)
In [12]: x['x']['b'] = 2
In [13]: x['x']['b']
Out[13]: array([2, 2])

However if I try the same with a masked array, it fails:

In [14]: x = np.ma.masked_all(2, dtype=desc)
In [15]: x['x']['b'] = 2
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)

/home/ernest/ in ()

/usr/lib/python2.5/site-packages/numpy/ma/core.pyc in __setitem__(self, indx, value)
   1574         if self._mask is nomask:
   1575             self._mask = make_mask_none(self.shape, self.dtype)
-> 1576         ndarray.__setitem__(self._mask, indx, getmask(value))
   1577         return
   1578     #........................................

ValueError: field named b not found.

Any idea of what the problem is?

Ernest

From joschu at caltech.edu  Sat Aug 22 11:55:03 2009
From: joschu at caltech.edu (John Schulman)
Date: Sat, 22 Aug 2009 08:55:03 -0700
Subject: [Numpy-discussion] Scipy09 Sprints
Message-ID: <185761440908220855u42bc5834kff74d776e69bb4f1@mail.gmail.com>

For the numpy/scipyers at Caltech now: I'm wondering what is happening with the scipy09 sprints. According to the website, the sprints start at 8AM in Powell Booth, but that building is locked and there's no sign of life.

Best,
John

From denis-bz-py at t-online.de  Sat Aug 22 12:03:43 2009
From: denis-bz-py at t-online.de (denis bzowy)
Date: Sat, 22 Aug 2009 16:03:43 +0000 (UTC)
Subject: [Numpy-discussion] adaptive interpolation on a regular 2d grid
Message-ID:

Folks, here's a simple adaptive interpolator; drop me a line to chat about it

adalin2( func, near, nx=300, ny=150, xstep=32, ystep=16, xrange=(0,1), yrange=(0,1), dtype=np.float, norm=abs )

Purpose: interpolate a function on a regular 2d grid: take func() where it changes rapidly, bilinear interpolate where it's smooth.

Keywords: adaptive interpolation, recursive splitting, bilinear, Python, numpy

Example:
    x,y,z = adalin2( ... )
    fig = pylab.figure()
    ax = mpl_toolkits.mplot3d.Axes3D( fig )
    X, Y = np.meshgrid( x, y )
    ax.plot_wireframe( X, Y, z, rstride=5, cstride=5 )

Out: x,y,z = adalin2( ... )
    x = linspace( xrange[0], xrange[1], nx' )  # nx' = nx + a bit, see below
    y = linspace( yrange[0], yrange[1], ny' )
    z[ny'][nx']  some func(), some interpolated values

In:
    func( x, y ): a scalar or vector function
    nx=300, ny=150: the output array z[][] is this size, plus a bit. For example, with nx=300, ny=150, xstep=32, ystep=16, z will be 161 x 321 (up to z[160][320]) so that 32 x 16 tiles fit exactly in z.
    xstep=32, ystep=16: the size of the initial coarse grid, z[::ystep, ::xstep] = func( x, y ). These must be powers of 2 (so that the recursive splitting works).
    near = .02 * fmax is either a number for absolute error, or a callable function, near( x, y, f(x,y), av4 ) -> True if near enough.
    norm: if func() is vector-valued, supply e.g. norm=np.linalg.norm
    more=1: return [x,y,z, percent func eval, ...]
How it works:
    Initially, sample func() at a coarse xstep x ystep grid: increase nx, ny to nx', ny' if need be
        z = array(( ny', nx' ))
        z[::ystep, ::xstep] = func( x, y )
    If near=infinity, just bilinear-interpolate all the other points,
    else for each xstep x ystep rectangle
        if average func( 4 corners ) is near func( midpoint )
            fill it with bilinear-interpolated values
        else
            split the rectangle on its longer dimension, recurse.

Dependencies: numpy

Notes: One can interpolate (blend, tween) just about anything: colors in a color space, or musical sounds, or curves ...

Song: Sweet Adeline

From nicolas.pinto at gmail.com  Sat Aug 22 12:12:30 2009
From: nicolas.pinto at gmail.com (Nicolas Pinto)
Date: Sat, 22 Aug 2009 09:12:30 -0700
Subject: [Numpy-discussion] Scipy09 Sprints
In-Reply-To: <185761440908220855u42bc5834kff74d776e69bb4f1@mail.gmail.com>
References: <185761440908220855u42bc5834kff74d776e69bb4f1@mail.gmail.com>
Message-ID: <954ae5aa0908220912l49635023m926179a208ac057a@mail.gmail.com>

Gael is leaving now!

On Saturday, August 22, 2009, John Schulman wrote:
> For the numpy/scipyers at Caltech now:
> I'm wondering what is happening with the scipy09 sprints. According to
> the website, the sprints start at 8AM in Powell Booth, but that
> building is locked and there's no sign of life.
> Best,
> John
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion

--
Nicolas Pinto
Ph.D. Candidate, Brain & Computer Sciences
Massachusetts Institute of Technology, USA
http://web.mit.edu/pinto

From jsseabold at gmail.com  Sat Aug 22 15:34:25 2009
From: jsseabold at gmail.com (Skipper Seabold)
Date: Sat, 22 Aug 2009 15:34:25 -0400
Subject: [Numpy-discussion] Bug in NoseTester?
Message-ID:

I'm trying to define a function, so that I don't have to pass the extra_argv=["--exe"] manually to run the tests for my scikits package. To do so, I believe I need to define a function that calls

Tester(package=string_of_fullpath).test(extra_argv=["--exe"])

In nosetester.NoseTester, the docstring says that package can be a string, but if it's a string then package_path gets referenced before assignment, I believe (line 128). There needs to be another explicit test to see if it's a string and then assign?

Skipper

From stefan at sun.ac.za  Sat Aug 22 16:19:51 2009
From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=)
Date: Sat, 22 Aug 2009 13:19:51 -0700
Subject: [Numpy-discussion] Sprinting at SciPy2009 today and tomorrow
Message-ID: <9457e7c80908221319peab26e8we224d5a7a6d3bebf@mail.gmail.com>

Hey everyone,

The SciPy2009 sprints are underway, and you are welcome to take part! Topics include NumPy, SciPy, Mayavi, Traits, IPython, Documentation and the image processing toolbox.

Join us on irc in channel #scipy, server irc.freenode.net. The timezone here is GMT-7, and we'll be around both days from 10:00am till late.

See you there,
Stéfan

From chad.netzer at gmail.com  Sat Aug 22 16:28:23 2009
From: chad.netzer at gmail.com (Chad Netzer)
Date: Sat, 22 Aug 2009 13:28:23 -0700
Subject: [Numpy-discussion] A better median function?
In-Reply-To: <268febdf0908211208o4ff73ff1q69146b542c75321d@mail.gmail.com>
References: <268febdf0908210847k227315a6u564b50ffb8edf750@mail.gmail.com> <1e2af89e0908211133n661e0accsd48021bea6158121@mail.gmail.com> <268febdf0908211208o4ff73ff1q69146b542c75321d@mail.gmail.com>
Message-ID:

The good news is that it was trivial to adapt numpy/core/src/_sortmodule.c.src:quicksort() to do a quickselect(). When I'm back home I'll follow up with discussion on how (if at all) to expose this to numpy.median() or numpy in general.

-Chad

From martyfuhry at gmail.com  Sat Aug 22 21:15:17 2009
From: martyfuhry at gmail.com (Marty Fuhry)
Date: Sat, 22 Aug 2009 21:15:17 -0400
Subject: [Numpy-discussion] ufunc void *extra
Message-ID:

The "Beyond the Basics" manual (http://docs.scipy.org/doc/numpy/user/c-info.beyond-basics.html) indicates that the generic ufunc loop (in this example loop1d) has a void* extra (or void* data) argument that can be used to pass extra data to the ufunc, but it doesn't indicate how to use this argument.

Can anyone tell me how to use this extra argument? I'm trying to get a ufunc to perform a different operation by passing it a second parameter, but I'm not sure how to do this.

>>> some_ufunc(narray, "A")

will perform operation A on object narray

>>> some_ufunc(narray, "B")

will perform operation B on object narray

Is this even a valid operation?

-Marty Fuhry

From robert.kern at gmail.com  Sat Aug 22 21:26:28 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Sat, 22 Aug 2009 18:26:28 -0700
Subject: [Numpy-discussion] ufunc void *extra
In-Reply-To:
References:
Message-ID: <3d375d730908221826i737d2a3aqafc062b07a6f020@mail.gmail.com>

On Sat, Aug 22, 2009 at 18:15, Marty Fuhry wrote:
> The "Beyond the Basics" manual
> (http://docs.scipy.org/doc/numpy/user/c-info.beyond-basics.html)
> indicates that the generic ufunc loop (in this example loop1d) has a
> void* extra (or void* data) argument that can be used to pass extra
> data to the ufunc, but it doesn't indicate how to use this argument.
>
> Can anyone tell me how to use this extra argument? I'm trying to get a
> ufunc to perform a different operation by passing it a second
> parameter, but I'm not sure how to do this.
>
>>>> some_ufunc(narray, "A")
> will perform operation A on object narray
>>>> some_ufunc(narray, "B")
> will perform operation B on object narray
>
> Is this even a valid operation?

No. This is not configurable at call-time and certainly not from Python. It is used only from the C level. In particular, it is used for things like the ufuncs in scipy.special to configure the (otherwise generic) ufuncs with the function pointer that does the actual computation.
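Schematically, the pattern looks like this (a sketch of the idea, not the actual scipy.special source):

static void loop1d_d_d(char **args, npy_intp *dimensions,
                       npy_intp *steps, void *data)
{
    /* 'data' is the extra argument: here, a pointer to the scalar kernel */
    double (*f)(double) = (double (*)(double))data;
    npy_intp i, n = dimensions[0];
    char *in = args[0], *out = args[1];

    for (i = 0; i < n; i++) {
        *(double *)out = f(*(double *)in);
        in += steps[0];
        out += steps[1];
    }
}

The void *data[] array handed to PyUFunc_FromFuncAndData() then holds one such kernel pointer per registered loop, which is how a single generic loop can serve many different functions.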
--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
 -- Umberto Eco

From nicolas.pinto at gmail.com  Sun Aug 23 03:27:28 2009
From: nicolas.pinto at gmail.com (Nicolas Pinto)
Date: Sun, 23 Aug 2009 00:27:28 -0700
Subject: [Numpy-discussion] problem with numpy.distutils and Cython
Message-ID: <954ae5aa0908230027h956568h585e7854bc05b8a7@mail.gmail.com>

Hello,

I'm trying to use numpy.distutils and Cython in a setup.py but I'm running into some problems.

The following code raises an "AttributeError: fcompiler" when I run "python setup.py install" (it runs smoothly with "python setup.py build_ext --inplace"):

from numpy.distutils.core import setup, Extension
from Cython.Distutils import build_ext
ext_modules = [Extension("test", ["test.pyx"])]
setup(cmdclass = {'build_ext': build_ext}, ext_modules = ext_modules)

Whereas the following works in both cases:

from distutils.core import setup, Extension
from Cython.Distutils import build_ext
ext_modules = [Extension("test", ["test.pyx"])]
setup(cmdclass = {'build_ext': build_ext}, ext_modules = ext_modules)

Am I missing something?

Thanks for your help.

Best,

--
Nicolas Pinto
Ph.D. Candidate, Brain & Computer Sciences
Massachusetts Institute of Technology, USA
http://web.mit.edu/pinto
-------------- next part --------------
An HTML attachment was scrubbed...
>> >> The following code raises a "AttributeError: fcompiler" when I run "python >> setup.py install" (it runs smoothly with "python setup.py build_ext >> --inplace"): >> >> from numpy.distutils.core import setup, Extension >> from Cython.Distutils import build_ext >> ext_modules = [Extension("test", ["test.pyx"])] >> setup(cmdclass = {'build_ext': build_ext}, ext_modules = ext_modules) >> >> Whereas the following works in both cases: >> >> from distutils.core import setup, Extension >> from Cython.Distutils import build_ext >> ext_modules = [Extension("test", ["test.pyx"])] >> setup(cmdclass = {'build_ext': build_ext}, ext_modules = ext_modules) >> >> Am I missing something? > > numpy.distutils needs its own build_ext, which you are overriding with > Cython's. You need one build_ext that does both things. > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > ?-- Umberto Eco > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Lisandro Dalc?n --------------- Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) PTLC - G?emes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From schut at sarvision.nl Mon Aug 24 09:09:23 2009 From: schut at sarvision.nl (Vincent Schut) Date: Mon, 24 Aug 2009 15:09:23 +0200 Subject: [Numpy-discussion] Removing scipy.stsci was [Re: [SciPy-dev] Deprecate chararray [was Plea for help]] In-Reply-To: <9457e7c80908211033y41e1e919le24181e277589b25@mail.gmail.com> References: <857977.74958.qm@web52106.mail.re2.yahoo.com> <812BBECE-D1E8-4699-980A-BB8FB9657CB9@stsci.edu> <5b8d13220908201204s3c74cad1pabccdce47d3a13a1@mail.gmail.com> <9457e7c80908201648r474297fo1d6ad2061c240fa6@mail.gmail.com> <88B0EA8C-5637-4050-AC4F-023C6D41BB3A@stsci.edu> <9457e7c80908211033y41e1e919le24181e277589b25@mail.gmail.com> Message-ID: St?fan van der Walt wrote: > Hi Vincent > > 2009/8/21 Vincent Schut : >> I know it probably will be a pretty involved task, as ndimage comes from >> numarray and seems to be largely implemented in C. But I really wanted >> to raise the issue now the image processing subject turns up once again, >> and hope some folks with more/better programming skills than me might >> like the idea... > > What would you like the behaviour to be? For example, how should > ndimage.zoom handle these missing values? Good question :-) I see 2 possibilities, both of them can be usefull in their own situations. Note that I am really not into splines mathematically, so my suggestions and terminology might not apply at all... 1. for any output cell that depends on a missing value in the input, return a missing/masked/NaN value, but (and I think this differs from the current implementation), for any output cell which could be calculated, return a proper output. Currently any array that contains one or more NaNs will give an output array full of NaNs (except for order=0, which shows this exact behaviour already). But maybe that's inherent to splines interpolation? This would at least allow input arrays with missing values (or masked arrays) to be used; this behaviour could be extended to many of the ndimage functions, like the kernel based stuff. 
FFT based convolutions could be another story altogether...

2. In case of zoom&co: only use the non-missing values to calculate the splines, thus effectively inter/extrapolating missing/masked values in the process. This probably raises a lot of new questions about the implementation. It would however be highly useful for me... I don't know if an irregular grid based splines interpolation implementation exists?

What I currently do in a case like this (zooming an array with missing values) is first fill the missing values by using ndimage.generic_filter with a kernel function that averages the non-missing values in the moving window. This works as long as there are not too many missing values next to each other, however it is very slow...

I think that, if an effort like this is to be made, a thorough discussion on the possible behaviours of ndimage functions with missing values should take place on one of the numpy/scipy related mailing lists. I'm sure I'm not the only one with ideas and/or use cases for this, and I'm certainly not someone with a lot of theoretical knowledge in this area.

Regards,
Vincent.

> Regards
> Stéfan

From amcmorl at gmail.com  Mon Aug 24 15:37:24 2009
From: amcmorl at gmail.com (Angus McMorland)
Date: Mon, 24 Aug 2009 15:37:24 -0400
Subject: [Numpy-discussion] Pointer to array data for passing to swig-wrapped C++
Message-ID:

Hi all,

Our lab has an in-house messaging protocol, written in C++, for interfacing the different components of our experimental setups, and allowing programs written in several different languages to talk to each other. I'm currently trying to write a Python interface to this protocol, mainly so I can show people here how good a Traits-based GUI would be for controlling things. I'm not familiar with interfacing C++ and Python code, so this is a little hit and miss, and any suggestions on better approaches would be welcome. I have wrapped the C++ code in swig, and can now call the simple routines from within Python.
-- AJC McMorland Post-doctoral research fellow Neurobiology, University of Pittsburgh From robert.kern at gmail.com Mon Aug 24 15:47:47 2009 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 24 Aug 2009 12:47:47 -0700 Subject: [Numpy-discussion] Pointer to array data for passing to swig-wrapped C++ In-Reply-To: References: Message-ID: <3d375d730908241247v16fabafcl3cfbf4d8c78d7639@mail.gmail.com> On Mon, Aug 24, 2009 at 12:37, Angus McMorland wrote: > Hi all, > > Our lab has an in-house messaging protocol, written in C++, for > interfacing the different components of our experimental setups, and > allowing programs written in several different languages to talk to > each other. I'm currently trying to write a Python interface to this > protocol, mainly so I can show people here how good a Traits-based GUI > would be for controlling things. I'm not familiar with interfacing C++ > and Python code, so this is a little hit and miss, and any suggestions > on better approaches would be welcome. I have wrapped the C++ code in > swig, and can now call the simple routines from within Python. > > The trouble I'm having is constructing new message data, and passing a > reference to that data. Each message consists of a ID code telling us > what type of message it is, a buffer of data, of which some subset is > actually useful, and the number of bytes of the buffer that have been > filled with useful data. I can get message data out of a message > constructed and sent by some other implementation of the protocol by > reading it into a numpy array with a dtype matching the structure of > the data being sent, and calling, for example: > > np.fromstring(msg_data[0:num_bytes], dtype=[('a', int),('b', float)]) > > I can't work out how to do the reverse operation: to populate the C++ > message object with data constructed in Python. The message object has > a SetData function,which is exposed by swig, that requires a 'void *' > pointer to the data (as well as the number of bytes being sent), and I > thought I might be able to do something like: > > msg.SetData(array.data, array.nbytes) > > where array is another ndarray of the desired dtype, but that returns the error: > > TypeError: in method 'CMessage_SetData', argument 2 of type 'void *' > > and using array.tostring() in place of array.data above gives exactly > the same result. Is there a convenient syntax to pass a void * pointer > to the array's data to the swig-wrapped SetData routine, or do I have > to write some extra wrapping code to make this happen, and what might > that code look like roughly? You will need an extra typemap on SetData() to make it accept a char* on the Python side (which can be supplied as a string or preferably a buffer object) and cast it to a void*. Someone with more recent SWIG experience or the SWIG docs will have to tell you how to do that, though. It should be straightforward, though. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
 -- Umberto Eco

From numpy-discussion at maubp.freeserve.co.uk  Mon Aug 24 16:08:33 2009
From: numpy-discussion at maubp.freeserve.co.uk (Peter)
Date: Mon, 24 Aug 2009 21:08:33 +0100
Subject: [Numpy-discussion] Pointer to array data for passing to swig-wrapped C++
In-Reply-To:
References:
Message-ID: <320fb6e00908241308i5d124daew96e9d72f65c5fbcc@mail.gmail.com>

On Mon, Aug 24, 2009 at 8:37 PM, Angus McMorland wrote:
> [...] I can get message data out of a message
> constructed and sent by some other implementation of the protocol by
> reading it into a numpy array with a dtype matching the structure of
> the data being sent, and calling, for example ...

Have you considered using the Python struct module? If your "buffer of data" is a mixture of fields, this might be a better match than using numpy. See http://docs.python.org/library/struct.html

Peter

From chad.netzer at gmail.com  Tue Aug 25 02:24:22 2009
From: chad.netzer at gmail.com (Chad Netzer)
Date: Mon, 24 Aug 2009 23:24:22 -0700
Subject: [Numpy-discussion] A better median function?
In-Reply-To: <268febdf0908211208o4ff73ff1q69146b542c75321d@mail.gmail.com>
References: <268febdf0908210847k227315a6u564b50ffb8edf750@mail.gmail.com> <1e2af89e0908211133n661e0accsd48021bea6158121@mail.gmail.com> <268febdf0908211208o4ff73ff1q69146b542c75321d@mail.gmail.com>
Message-ID:

I've made some progress on this, building the tools for a faster median(). I was able to fairly easily make both nth_element() and partial_sort() types of functions by modifying numpy's quicksort; however, I wasn't that happy with their API from a python/numpy point of view.

My current plan of attack is to deliver a partition() function that basically returns an array such that elements less than the pivot(s) come first, then the pivot(s), then the elements greater than the pivot(s). The nifty trick being that the pivot can be a range of indices, so it can be used to easily implement C++'s nth_element(), partial_sort() and more. This should allow for faster medians, as well as satisfy those wanting the two values of an even-length array's median elements, rather than the average between them.
It will probably work something like this:

$ python
>>> import numpy as np
>>> a=np.array(range(9,-1,-1))
>>> a
array([9, 8, 7, 6, 5, 4, 3, 2, 1, 0])
>>> np.partition(a, 3, 7)   # partition around the indices 3-7
>>> a
array([1, 0, 2, 4, 6, 5, 3, 8, 9, 7])   # The partition is not necessarily sorted
>>> a[3:7].sort()   # But can be sub-sorted after the fact
>>> a
array([1, 0, 2, 3, 4, 5, 6, 8, 9, 7])

This partition operation can usually be expected be to faster than a full sort, depending on the length of the pivot range.

An nth_element() operation then becomes:

>>> np.partition(a, k)   # a single arg means pivot around that index
>>> a[k]

partial_sort() is as above:

>>> np.partition(a, start, end)   # using Python range notation
>>> a[start:end].sort()

odd length median is:

>>> n = a.size//2
>>> np.partition(a, n)
>>> a[n]

and the pair of elements for an even length median would be:

>>> start = a.size//2 - 1
>>> end = a.size//2 + 1
>>> np.partition(a, start, end)
>>> a[start:end].sort()
>>> l, r = a[start:end]

I'll work towards getting some code to look at up later in the week. In the meantime, anyone interested, who has feedback on the above proposal, should please respond with comments, or suggestions.

Note - I already am not sure of this proposed API using [start, end) type ranges to define the pivot range. What then would be the result of:

>>> np.partition(a, start, start)   # an "empty" range?

Probably just a no-op...

Also, it might just make sense to go with partial_sort() directly (i.e. does the partitioning and also sorts the pivot range), since that avoids a separate python call to sort() just to compute the even-median. Maybe I'll just guarantee for partitioning "short" ranges, that the pivot will be sorted after partitioning. It avoids an additional Python sort call, and is trivially fast for insertion sort to do...
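To pin down the semantics, here is a throwaway pure-Python mock-up of the proposed call (it cheats by fully sorting a copy, so it demonstrates the intended behaviour, not the speed, and the name/signature are of course still up for discussion):

import numpy as np

def partition(a, start, end=None):
    # Mock of the proposed in-place partition; pivot range is [start, end).
    if end is None:
        end = start + 1
    s = np.sort(a)
    a[:start] = s[:start]        # elements <= the pivots (real version: in some order)
    a[start:end] = s[start:end]  # the pivot range itself
    a[end:] = s[end:]            # elements >= the pivots (real version: in some order)

a = np.array(range(9, -1, -1))
partition(a, 3, 7)
print a[3:7]     # [3 4 5 6], exactly the sorted middle slice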
So what happens in the multi-files build is that the function are tagged as hidden instead of static, with hidden being __attribute__((hidden)) for gcc, nothing for MS compiler (on windows, you have to tag the exported functions, nothing is exported by default), and will break on other platforms. David From charlesr.harris at gmail.com Tue Aug 25 13:51:04 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 25 Aug 2009 11:51:04 -0600 Subject: [Numpy-discussion] c++ comments in parse_datetime.c Message-ID: Hi Travis, The new parse_datetime.c file contains a lot of c++ style comments that should be fixed. Also, the new test for mirr is failing on all the buildbots. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From jsseabold at gmail.com Tue Aug 25 14:03:51 2009 From: jsseabold at gmail.com (Skipper Seabold) Date: Tue, 25 Aug 2009 14:03:51 -0400 Subject: [Numpy-discussion] c++ comments in parse_datetime.c In-Reply-To: References: Message-ID: On Tue, Aug 25, 2009 at 1:59 PM, Skipper Seabold wrote: > On Tue, Aug 25, 2009 at 1:51 PM, Charles R > Harris wrote: >> Hi Travis, >> >> The new parse_datetime.c file contains a lot of c++ style comments that >> should be fixed. Also, the new test for mirr is failing on all the >> buildbots. >> >> Chuck >> > > Hi, > > For mirr it looks like the lines in the patch > > > pos = values * (values>0) > neg = values * (values<0) > > were copied to the trunk as > > pos = values > 0 > neg = values < 0 > > This is probably my fault for not submitting the patch as a diff. > Oops nevermind I didn't see that npv is now provided pos*values and neg*values. Skipper From giuseppe.aprea at gmail.com Tue Aug 25 14:07:36 2009 From: giuseppe.aprea at gmail.com (Giuseppe Aprea) Date: Tue, 25 Aug 2009 20:07:36 +0200 Subject: [Numpy-discussion] filters for rows or columns Message-ID: Hi list, I wonder if there is any smarter way to apply a filter to a 2 dimensional array than a for loop: a=array(.......) idxList=[] for i in range(0,a.shape[1]): if (some condition on a[:,i]): idxList.append(i) thanks in advance. g From giuseppe.aprea at gmail.com Tue Aug 25 14:07:36 2009 From: giuseppe.aprea at gmail.com (Giuseppe Aprea) Date: Tue, 25 Aug 2009 20:07:36 +0200 Subject: [Numpy-discussion] filters for rows or columns Message-ID: Hi list, I wonder if there is any smarter way to apply a filter to a 2 dimensional array than a for loop: a=array(.......) idxList=[] for i in range(0,a.shape[1]): if (some condition on a[:,i]): idxList.append(i) thanks in advance. g From robert.kern at gmail.com Tue Aug 25 14:13:59 2009 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 25 Aug 2009 11:13:59 -0700 Subject: [Numpy-discussion] filters for rows or columns In-Reply-To: References: Message-ID: <3d375d730908251113n7a021577gac8f254ee010b92f@mail.gmail.com> On Tue, Aug 25, 2009 at 11:07, Giuseppe Aprea wrote: > Hi list, > > > I wonder if there is any smarter way to apply a filter to a 2 dimensional array > than a for loop: > > a=array(.......) > idxList=[] > for i in range(0,a.shape[1]): > ? ? ? if (some condition on a[:,i]): > ? ? ? ? ? ? idxList.append(i) Define a "some condition on a[:,i]" that is of interest to you, and I will show you how to do it. Roughly, you should define a function that takes 'a' and operates on it in bulk in order to get a boolean array of shape (a.shape[0],) evaluating the condition for each column. 
Then use numpy.where() on that boolean array to get indices if you actually need indices; frequently, you can just use the boolean array where you wanted the indices. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From jsseabold at gmail.com Tue Aug 25 13:59:03 2009 From: jsseabold at gmail.com (Skipper Seabold) Date: Tue, 25 Aug 2009 13:59:03 -0400 Subject: [Numpy-discussion] c++ comments in parse_datetime.c In-Reply-To: References: Message-ID: On Tue, Aug 25, 2009 at 1:51 PM, Charles R Harris wrote: > Hi Travis, > > The new parse_datetime.c file contains a lot of c++ style comments that > should be fixed. Also, the new test for mirr is failing on all the > buildbots. > > Chuck > Hi, For mirr it looks like the lines in the patch pos = values * (values>0) neg = values * (values<0) were copied to the trunk as pos = values > 0 neg = values < 0 This is probably my fault for not submitting the patch as a diff. Skipper From charlesr.harris at gmail.com Tue Aug 25 14:38:27 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 25 Aug 2009 12:38:27 -0600 Subject: [Numpy-discussion] c++ comments in parse_datetime.c In-Reply-To: References: Message-ID: On Tue, Aug 25, 2009 at 11:51 AM, Charles R Harris < charlesr.harris at gmail.com> wrote: > Hi Travis, > > The new parse_datetime.c file contains a lot of c++ style comments that > should be fixed. Also, the new test for mirr is failing on all the > buildbots. > Also, 1) There are two macros with gotos, which is a no-no. You could use inline functions instead or just write out the code where needed. 2) Multiline comments like so: /* * blah, blah */ 3) Think twice about using trailing comments. 4) The constant defines should be collected in one spot and documented. 5) Use blank lines between all function definitions. 6) Use {} around all if statement blocks 7) Format else like so: } else if (bug) { blah; } 8) The file contains hard tabs and trailing whitespace. 9) A number of lines are longer than 80 characters. Yours Truly, The Pedant ;) -------------- next part -------------- An HTML attachment was scrubbed... URL: From pgmdevlist at gmail.com Tue Aug 25 15:05:47 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Tue, 25 Aug 2009 15:05:47 -0400 Subject: [Numpy-discussion] c++ comments in parse_datetime.c In-Reply-To: References: Message-ID: <2DF15AC0-1F31-40E9-A55D-B51D75F63093@gmail.com> On Aug 25, 2009, at 1:59 PM, Skipper Seabold wrote: > On Tue, Aug 25, 2009 at 1:51 PM, Charles R > Harris wrote: >> Hi Travis, >> >> The new parse_datetime.c file contains a lot of c++ style comments >> that >> should be fixed. Also, the new test for mirr is failing on all the >> buildbots. Comments sent to Marty who wrote the parse_datetime.c as part of his GSoC: Marty, I guess you have a bit of cleaning up to do. (As a snarky side note, Marty posted on the list a few weeks ago asking just for this kind of comments... But all is well and better late than never.) 
From charlesr.harris at gmail.com Tue Aug 25 15:21:15 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 25 Aug 2009 13:21:15 -0600 Subject: [Numpy-discussion] c++ comments in parse_datetime.c In-Reply-To: <2DF15AC0-1F31-40E9-A55D-B51D75F63093@gmail.com> References: <2DF15AC0-1F31-40E9-A55D-B51D75F63093@gmail.com> Message-ID: On Tue, Aug 25, 2009 at 1:05 PM, Pierre GM wrote: > > On Aug 25, 2009, at 1:59 PM, Skipper Seabold wrote: > > > On Tue, Aug 25, 2009 at 1:51 PM, Charles R > > Harris wrote: > >> Hi Travis, > >> > >> The new parse_datetime.c file contains a lot of c++ style comments > >> that > >> should be fixed. Also, the new test for mirr is failing on all the > >> buildbots. > > Comments sent to Marty who wrote the parse_datetime.c as part of his > GSoC: Marty, I guess you have a bit of cleaning up to do. > (As a snarky side note, Marty posted on the list a few weeks ago > asking just for this kind of comments... But all is well and better > late than never.) My bad, then, I missed it. So let me add 1) Because the default compilation is to include all the files in a master file, the local defines should be undef'ed at the end to avoid namespace pollution. 2) Never do this: if (bug) return -1; or this if (bug) {blah; blah;} do it this way if (bug) { return -1; } The last is more for Travis in the most recent commit ;) Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From sturla at molden.no Tue Aug 25 17:43:27 2009 From: sturla at molden.no (Sturla Molden) Date: Tue, 25 Aug 2009 23:43:27 +0200 Subject: [Numpy-discussion] A better median function? In-Reply-To: References: <268febdf0908210847k227315a6u564b50ffb8edf750@mail.gmail.com> <1e2af89e0908211133n661e0accsd48021bea6158121@mail.gmail.com> <268febdf0908211208o4ff73ff1q69146b542c75321d@mail.gmail.com> Message-ID: <4A945AFF.5060405@molden.no> Chad Netzer skrev: > My current plan of attack is to deliver a partition() function that > basically returns an array such that elements less than the pivot(s) > come first, then the pivot(s), then the elements greater than the > pivot(s). I'm actually trying to write a fast median replacement myself. I was thinking in the same lines, except I don't store those two arrays. I just keep track of counts in them. For the even case, I also keep track the elements closest to the pivot (smaller and bigger). It's incredibly simple actually. So lets see who gets there first :-) Sturla Molden From giuseppe.aprea at gmail.com Tue Aug 25 19:18:11 2009 From: giuseppe.aprea at gmail.com (Giuseppe Aprea) Date: Wed, 26 Aug 2009 01:18:11 +0200 Subject: [Numpy-discussion] filters for rows or columns In-Reply-To: <3d375d730908251113n7a021577gac8f254ee010b92f@mail.gmail.com> References: <3d375d730908251113n7a021577gac8f254ee010b92f@mail.gmail.com> Message-ID: On Tue, Aug 25, 2009 at 8:13 PM, Robert Kern wrote: > On Tue, Aug 25, 2009 at 11:07, Giuseppe Aprea wrote: >> Hi list, >> >> >> I wonder if there is any smarter way to apply a filter to a 2 dimensional array >> than a for loop: >> >> a=array(.......) >> idxList=[] >> for i in range(0,a.shape[1]): >> ? ? ? if (some condition on a[:,i]): >> ? ? ? ? ? ? idxList.append(i) > > Define a "some condition on a[:,i]" that is of interest to you, and I > will show you how to do it. Roughly, you should define a function that > takes 'a' and operates on it in bulk in order to get a boolean array > of shape (a.shape[0],) evaluating the condition for each column. 
> use numpy.where() on that boolean array to get indices if you actually
> need indices; frequently, you can just use the boolean array where you
> wanted the indices.
>
> --
> Robert Kern
>
> "I have come to believe that the whole world is an enigma, a harmless
> enigma that is made terrible by our own mad attempt to interpret it as
> though it had an underlying truth."
>  -- Umberto Eco
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion

Hi, I would like to do something like this

a=array([[1,2,3,4],[5,6,7,8],[4,5,6,0]])
idxList=[]
for i in range(0,a.shape[1]):
    if len(nonzero(a[:,i])[0])==1:   #want to extract column indices
of those columns which only have one non vanishing element
        idxList.append(i)

I already used where on a 1D array but I don't know if there is some
function or some kind of syntax which allows you to evaluate a condition
for each column(row).

regards

g

From robert.kern at gmail.com  Tue Aug 25 19:26:10 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Tue, 25 Aug 2009 16:26:10 -0700
Subject: [Numpy-discussion] filters for rows or columns
In-Reply-To: 
References: <3d375d730908251113n7a021577gac8f254ee010b92f@mail.gmail.com>
Message-ID: <3d375d730908251626v4d88a51fpd6e53ecb041da100@mail.gmail.com>

On Tue, Aug 25, 2009 at 16:18, Giuseppe Aprea wrote:
> Hi, I would like to do something like this
>
> a=array([[1,2,3,4],[5,6,7,8],[4,5,6,0]])
> idxList=[]
> for i in range(0,a.shape[1]):
>     if len(nonzero(a[:,i])[0])==1:   #want to extract column indices
> of those columns which only have one non vanishing element
>         idxList.append(i)
>
> I already used where on a 1D array but I don't know if there is some
> function or some kind of syntax which allows you to evaluate a condition
> for each column(row).

column_mask = ((a != 0).sum(axis=1) == 1)
idxArray = np.nonzero(column_mask)[0]  # if you must

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
 -- Umberto Eco
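For Giuseppe's per-column case the sum runs over the rows, i.e. axis=0
(a rough check of the idiom from a quick session):

>>> import numpy as np
>>> a = np.array([[1,2,3,4],[5,6,7,8],[4,5,6,0]])
>>> (a != 0).sum(axis=0)        # non-vanishing entries in each column
array([3, 3, 3, 2])
>>> idx = np.nonzero((a != 0).sum(axis=0) == 1)[0]
>>> len(idx)                    # no column of a has exactly one nonzero
0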
From charlesr.harris at gmail.com  Tue Aug 25 22:44:41 2009
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Tue, 25 Aug 2009 20:44:41 -0600
Subject: [Numpy-discussion] Quick question on NumPy builds vs. Cython
In-Reply-To: <5b8d13220908251030v5110efccqbfadf616e19b6ff4@mail.gmail.com>
References: <4A941D31.6050103@student.matnat.uio.no>
	<5b8d13220908251030v5110efccqbfadf616e19b6ff4@mail.gmail.com>
Message-ID: 

On Tue, Aug 25, 2009 at 11:30 AM, David Cournapeau wrote:

> Hi Dag,
>
> On Tue, Aug 25, 2009 at 12:19 PM, Dag Sverre
> Seljebotn wrote:
> > [Let me know if this should go to numpy-discuss instead.]
>
> I guess this can be discussed on the ML as well (I CC to the list).
>
> > I see that there are currently two modes, and that it is possible to build
> > NumPy using a master .c-file #include-ing the rest. (Which is much more
> > difficult to support using Cython, though not impossible.)
> >
> > Is there any plans for the one-file build to go away, or is supporting this
> > a requirement?
>
> This is a requirement, as supporting this depends on non standard
> compilers extensions (that's why it is not the default - but it works
> well, I am always using this mode when working on numpy since the
> build/test/debug cycle is so much shorter with numscons and this).
>
> The basic problem is as follows:
>  - On Unix at least, a function is exported in a shared library by default.
>  - The usual way to avoid polluting the namespace is to put static in
> front of it
>  - You can't reuse a static function in another compilation unit
> (there is no "friend static").
>
> So what happens in the multi-files build is that the function are
> tagged as hidden instead of static, with hidden being
> __attribute__((hidden)) for gcc, nothing for MS compiler (on windows,
> you have to tag the exported functions, nothing is exported by
> default), and will break on other platforms.
>

Does the build actually break or is it the case that a lot of extraneous
names become visible, increasing the module size and exposing functions
that we don't want anyone to access?

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From charlesr.harris at gmail.com  Tue Aug 25 23:38:48 2009
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Tue, 25 Aug 2009 21:38:48 -0600
Subject: [Numpy-discussion] mirr test correctly fails for given input.
Message-ID: 

So is it a bug in the test or a bug in the implementation? The problem is
that the slice values[1:], when values = [-120000,39000,30000,21000,37000,46000],
contains no negative number and a nan is returned. This looks like a bug in
the test. The documentation also probably needs fixing.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From josef.pktd at gmail.com  Wed Aug 26 01:45:53 2009
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Wed, 26 Aug 2009 01:45:53 -0400
Subject: [Numpy-discussion] mirr test correctly fails for given input.
In-Reply-To: 
References: 
Message-ID: <1cd32cbb0908252245r273b4ff5w59706c22b1b42edd@mail.gmail.com>

On Tue, Aug 25, 2009 at 11:38 PM, Charles R Harris wrote:
> So is it a bug in the test or a bug in the implementation? The problem is
> that the slice values[1:], when
> values = [-120000,39000,30000,21000,37000,46000], contains no negative
> number and a nan is returned. This looks like a bug in the test. The
> documentation also probably needs fixing.
>
> Chuck

There is a bug in the code, the nan is incorrectly raised. After
correcting the nan (checking on the original, instead of shortened,
values), I got one failing test, which I corrected with the matching
number from Openoffice.

(The main reason the function is more complicated than necessary is
that np.npv doesn't allow the inclusion of the investment in the
initial period.)

This needs reviewing, since it's late here.

Josef


import numpy as np
from numpy.testing import assert_almost_equal, assert_

from numpy import npv

def mirr(values, finance_rate, reinvest_rate):
    """
    Modified internal rate of return.

    Parameters
    ----------
    values : array_like
        Cash flows (must contain at least one positive and one negative value)
        or nan is returned.
    finance_rate : scalar
        Interest rate paid on the cash flows
    reinvest_rate : scalar
        Interest rate received on the cash flows upon reinvestment

    Returns
    -------
    out : float
        Modified internal rate of return

    """

    values = np.asarray(values, dtype=np.double)
    initial = values[0]
    values1 = values[1:]
    n = values1.size
    pos = values1 > 0
    neg = values1 < 0
    if not (np.sum(values[values>0]) > 0 and np.sum(values[values<0]) < 0):
        return np.nan
    numer = np.abs(npv(reinvest_rate, values1*pos))
    denom = np.abs(npv(finance_rate, values1*neg))
    if initial > 0:
        return ((initial + numer) / denom)**(1.0/n)*(1 + reinvest_rate) - 1
    else:
        return ((numer / (-initial + denom)))**(1.0/n)*(1 + reinvest_rate) - 1


#tests from testsuite and Skipper plus isnan test

v1 = [-4500,-800,800,800,600,600,800,800,700,3000]
print mirr(v1,0.08,0.055)
assert_almost_equal(mirr(v1,0.08,0.055),
                    0.0666, 4)

#incorrect test ? corrected
v2 = [-120000,39000,30000,21000,37000,46000]
print mirr(v2,0.10,0.12)
assert_almost_equal(mirr(v2,0.10,0.12), 0.126094, 6)  # corrected from OO

v2 = [39000,30000,21000,37000,46000]
assert_(np.isnan(mirr(v2,0.10,0.12)))

v3 = [100,200,-50,300,-200]
print mirr(v3,0.05,0.06)
assert_almost_equal(mirr(v3,0.05,0.06), 0.3428, 4)

#--------------
print mirr([100, 200, -50, 300, -200], .05, .06)
assert_almost_equal(mirr((100, 200,-50, 300,-200), .05, .06),
                    0.342823387842, 4)

V2 = [-4500,-800,800,800,600,600,800,800,700,3000]
print mirr(V2, 0.08, 0.055)
assert_almost_equal(mirr(V2, 0.08, 0.055), 0.06659718, 4)

From cournape at gmail.com  Wed Aug 26 02:14:21 2009
From: cournape at gmail.com (David Cournapeau)
Date: Wed, 26 Aug 2009 01:14:21 -0500
Subject: [Numpy-discussion] Quick question on NumPy builds vs. Cython
In-Reply-To: 
References: <4A941D31.6050103@student.matnat.uio.no>
	<5b8d13220908251030v5110efccqbfadf616e19b6ff4@mail.gmail.com>
Message-ID: <5b8d13220908252314q5df44a8aj7e66193297bcc956@mail.gmail.com>

On Tue, Aug 25, 2009 at 9:44 PM, Charles R Harris wrote:
> On Tue, Aug 25, 2009 at 11:30 AM, David Cournapeau wrote:
>> This is a requirement, as supporting this depends on non standard
>> compilers extensions (that's why it is not the default - but it works
>> well, I am always using this mode when working on numpy since the
>> build/test/debug cycle is so much shorter with numscons and this).
>> [...]
>> So what happens in the multi-files build is that the function are
>> tagged as hidden instead of static, with hidden being
>> __attribute__((hidden)) for gcc, nothing for MS compiler (on windows,
>> you have to tag the exported functions, nothing is exported by
>> default), and will break on other platforms.
>
> Does the build actually break or is it the case that a lot of extraneous
> names become visible, increasing the module size and exposing functions
> that we don't want anyone to access?

I think the latter. The number of exported symbols is pretty high, though.

cheers,

David

From giuseppe.aprea at gmail.com  Wed Aug 26 03:21:13 2009
From: giuseppe.aprea at gmail.com (Giuseppe Aprea)
Date: Wed, 26 Aug 2009 09:21:13 +0200
Subject: [Numpy-discussion] filters for rows or columns
In-Reply-To: <3d375d730908251626v4d88a51fpd6e53ecb041da100@mail.gmail.com>
References: <3d375d730908251113n7a021577gac8f254ee010b92f@mail.gmail.com>
	<3d375d730908251626v4d88a51fpd6e53ecb041da100@mail.gmail.com>
Message-ID: 

On Wed, Aug 26, 2009 at 1:26 AM, Robert Kern wrote:
> On Tue, Aug 25, 2009 at 16:18, Giuseppe Aprea wrote:
>> Hi, I would like to do something like this
>> [...]
>
> column_mask = ((a != 0).sum(axis=1) == 1)
> idxArray = np.nonzero(column_mask)[0]  # if you must
>
> --
> Robert Kern

That's interesting. Thanks a lot! In my case that becomes:

column_mask = ((a != 0).sum(axis=0) == 1)
idxArray = np.nonzero(column_mask)[0]

cheers

g

From jsseabold at gmail.com  Wed Aug 26 10:08:47 2009
From: jsseabold at gmail.com (Skipper Seabold)
Date: Wed, 26 Aug 2009 10:08:47 -0400
Subject: [Numpy-discussion] mirr test correctly fails for given input.
In-Reply-To: <1cd32cbb0908252245r273b4ff5w59706c22b1b42edd@mail.gmail.com>
References: <1cd32cbb0908252245r273b4ff5w59706c22b1b42edd@mail.gmail.com>
Message-ID: 

On Wed, Aug 26, 2009 at 1:45 AM, wrote:
> There is a bug in the code, the nan is incorrectly raised. After
> correcting the nan (checking on the original, instead of shortened,
> values), I got one failing test, which I corrected with the matching
> number from Openoffice.
> [... full corrected mirr() and tests quoted ...]
>
> #incorrect test ? corrected
> v2 = [-120000,39000,30000,21000,37000,46000]
> print mirr(v2,0.10,0.12)
> assert_almost_equal(mirr(v2,0.10,0.12), 0.126094, 6)  # corrected from OO

Yes, the value in the tests that this v2 tests against is wrong.  It
was the value returned by the old mirr but not excel or oocalc.  This
is the correct one.  I noted it in my patch, but it was hard to catch
since I didn't supply a diff.  Now, I know...

Skipper

From josef.pktd at gmail.com  Wed Aug 26 11:25:35 2009
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Wed, 26 Aug 2009 11:25:35 -0400
Subject: [Numpy-discussion] mirr test correctly fails for given input.
In-Reply-To: 
References: <1cd32cbb0908252245r273b4ff5w59706c22b1b42edd@mail.gmail.com>
Message-ID: <1cd32cbb0908260825o6a93efc7ne2f061e90a4464ca@mail.gmail.com>

On Wed, Aug 26, 2009 at 10:08 AM, Skipper Seabold wrote:
> Yes, the value in the tests that this v2 tests against is wrong.  It
> was the value returned by the old mirr but not excel or oocalc.  This
> is the correct one.
> [...]

Here is a shortened version, that uses Skipper's corrections, but
avoids splitting the values array, by working around npv not starting
with the initial investment. It passes the same tests as the corrected
version.

Josef

def mirr(values, finance_rate, reinvest_rate):
    values = np.asarray(values, dtype=np.double)
    n = values.size
    pos = values > 0
    neg = values < 0
    if not (pos.any() and neg.any()):
        return np.nan

    numer = np.abs(npv(reinvest_rate, values*pos)) * (1 + reinvest_rate)
    denom = np.abs(npv(finance_rate, values*neg)) * (1 + finance_rate)
    return (numer / denom)**(1.0/(n-1)) * (1 + reinvest_rate) - 1
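As a cross-check (a rough sketch, not from the test suite): writing the
textbook MIRR definition out by hand — future value of the inflows at
the reinvestment rate over the present value of the outflows at the
finance rate — reproduces the OpenOffice number for the v2 case:

import numpy as np

values = np.array([-120000, 39000, 30000, 21000, 37000, 46000], dtype=float)
rr, fr = 0.12, 0.10
n = values.size
t = np.arange(n)
pos = values * (values > 0)
neg = values * (values < 0)
fv_pos = np.sum(pos * (1 + rr)**(n - 1 - t))   # compound inflows forward
pv_neg = np.sum(neg / (1 + fr)**t)             # discount outflows back
print (fv_pos / -pv_neg)**(1.0/(n - 1)) - 1    # ~0.126094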
From josef.pktd at gmail.com  Wed Aug 26 11:44:41 2009
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Wed, 26 Aug 2009 11:44:41 -0400
Subject: [Numpy-discussion] mirr test correctly fails for given input.
In-Reply-To: <1cd32cbb0908260825o6a93efc7ne2f061e90a4464ca@mail.gmail.com>
References: <1cd32cbb0908252245r273b4ff5w59706c22b1b42edd@mail.gmail.com>
	<1cd32cbb0908260825o6a93efc7ne2f061e90a4464ca@mail.gmail.com>
Message-ID: <1cd32cbb0908260844i54ea57cbn6e941bed4c167e47@mail.gmail.com>

On Wed, Aug 26, 2009 at 11:25 AM, wrote:
> Here is a shortened version, that uses Skipper's corrections, but
> avoids splitting the values array, by working around npv not starting
> with the initial investment. It passes the same tests as the corrected
> version.
> [... shortened mirr() quoted ...]

a comment on the function

From a theoretical perspective returning nan wouldn't be necessary.
The rate of return would be well defined:

-1: you only pay and get nothing back (you lose 100%)
inf: you only receive and have nothing to pay
0/0 = nan: you don't pay and you get nothing back

for practical purposes, the nan might signal better that the user
might have made a mistake

Josef

From charlesr.harris at gmail.com  Wed Aug 26 16:10:26 2009
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Wed, 26 Aug 2009 14:10:26 -0600
Subject: [Numpy-discussion] mirr test correctly fails for given input.
In-Reply-To: <1cd32cbb0908260844i54ea57cbn6e941bed4c167e47@mail.gmail.com>
References: <1cd32cbb0908252245r273b4ff5w59706c22b1b42edd@mail.gmail.com>
	<1cd32cbb0908260825o6a93efc7ne2f061e90a4464ca@mail.gmail.com>
	<1cd32cbb0908260844i54ea57cbn6e941bed4c167e47@mail.gmail.com>
Message-ID: 

Fixes applied in r7324. Thanks guys.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From mark.wendell at gmail.com  Wed Aug 26 21:48:46 2009
From: mark.wendell at gmail.com (Mark Wendell)
Date: Wed, 26 Aug 2009 19:48:46 -0600
Subject: [Numpy-discussion] array is not writable
Message-ID: 

Hi all - I'm playing with editing image data converted from PIL objects,
and running into a situation where numpy tells me that an 'array is not
writable'. Not sure I understand what that means, or how to get around it.
Here's a sample interactive session:

>>> import Image
>>> import numpy as np
>>> im = Image.open("rgb.0001.jpg")
>>> a = np.asarray(im)
>>> a.shape
(512, 512, 3)
>>> a.dtype
dtype('uint8')
>>> a[0,0,0]
254
>>> a[0,0,0] = 10
Traceback (most recent call last):
  File "", line 1, in
RuntimeError: array is not writeable

Any help appreciated. Thanks,
Mark

PIL 1.1.6
numpy 1.2.1
Ubuntu 9.04

--
--
Mark Wendell
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From mark.wendell at gmail.com  Wed Aug 26 23:49:11 2009
From: mark.wendell at gmail.com (Mark Wendell)
Date: Wed, 26 Aug 2009 21:49:11 -0600
Subject: [Numpy-discussion] array is not writable
In-Reply-To: 
References: 
Message-ID: 

Figured this much out: if I do an np.copy of the original array to a
new array, then I can edit individual 'color' values with impunity. So
I guess the original array from the pil object still shares memory
with that image object somehow, making it unwritable?

thanks
Mark

On Wed, Aug 26, 2009 at 7:48 PM, Mark Wendell wrote:
> Hi all - I'm playing with editing image data converted from PIL objects,
> and running into a situation where numpy tells me that an 'array is not
> writable'.
> [...]

--
--
Mark Wendell

From jsseabold at gmail.com  Thu Aug 27 00:04:00 2009
From: jsseabold at gmail.com (Skipper Seabold)
Date: Thu, 27 Aug 2009 00:04:00 -0400
Subject: [Numpy-discussion] array is not writable
In-Reply-To: 
References: 
Message-ID: 

On Wed, Aug 26, 2009 at 11:49 PM, Mark Wendell wrote:
> Figured this much out: if I do an np.copy of the original array to a
> new array, then I can edit individual 'color' values with impunity. So
> I guess the original array from the pil object still shares memory
> with that image object somehow, making it unwritable?
> [...]

Hi Mark,

I don't really know the specifics of why your array isn't writeable
(someone will have a better answer and will correct me if I'm wrong),
but you could try to check if it's writeable by doing:

>>> a.flags # or a.flags.writeable

And check the value of writeable.  It might be as simple as setting
a.flags.writeable = True, but then again there might be a good reason
why it's unwriteable.  I don't really know.
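Something like this is what I'd expect for the PIL case (a rough sketch
from a similar session, not your exact data):

>>> a = np.asarray(im)
>>> a.flags.writeable
False
>>> b = a.copy()            # same as np.copy(a); owns its own memory
>>> b.flags.writeable
True
>>> b[0,0,0] = 10           # fine on the copy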
Skipper

From robert.kern at gmail.com  Thu Aug 27 00:06:25 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Wed, 26 Aug 2009 21:06:25 -0700
Subject: [Numpy-discussion] array is not writable
In-Reply-To: 
References: 
Message-ID: <3d375d730908262106k6e1a6b8j86df74fafc63a2cf@mail.gmail.com>

On Wed, Aug 26, 2009 at 20:49, Mark Wendell wrote:
> Figured this much out: if I do an np.copy of the original array to a
> new array, then I can edit individual 'color' values with impunity. So
> I guess the original array from the pil object still shares memory
> with that image object somehow, making it unwritable?

Most likely.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
 -- Umberto Eco

From dwf at cs.toronto.edu  Thu Aug 27 00:34:04 2009
From: dwf at cs.toronto.edu (David Warde-Farley)
Date: Thu, 27 Aug 2009 00:34:04 -0400
Subject: [Numpy-discussion] array is not writable
In-Reply-To: 
References: 
Message-ID: 

On 26-Aug-09, at 11:49 PM, Mark Wendell wrote:

> Figured this much out: if I do an np.copy of the original array to a
> new array, then I can edit individual 'color' values with impunity.
> [...]

I'm going to guess that it's because PIL is still responsible for that
memory, not NumPy. I don't really know how this stuff works but
asarray() would just give you a view onto that chunk of memory; since
NumPy didn't allocate it, it probably doesn't want to modify it. Not
sure if you can get away with not making a copy in this situation.

David

From charlesr.harris at gmail.com  Thu Aug 27 00:43:13 2009
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Wed, 26 Aug 2009 22:43:13 -0600
Subject: [Numpy-discussion] array is not writable
In-Reply-To: 
References: 
Message-ID: 

On Wed, Aug 26, 2009 at 10:34 PM, David Warde-Farley wrote:

> I'm going to guess that it's because PIL is still responsible for that
> memory, not NumPy.
> [...]
> Not sure if you can get away with not making a copy in this situation.
>

I expect you can. It's the unreadable arrays that are a problem...

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From timmichelsen at gmx-topmail.de  Thu Aug 27 08:07:12 2009
From: timmichelsen at gmx-topmail.de (Tim Michelsen)
Date: Thu, 27 Aug 2009 12:07:12 +0000 (UTC)
Subject: [Numpy-discussion] histogram: sum up values in each bin
Message-ID: 

Hello,
I need some advice on histograms.
If I interpret the documentation [1, 2] for numpy.histogram correctly, the
result of the function is a count of the occurrences sorted into each bin.

(n, bins) = numpy.histogram(v, bins=50, normed=1)

But how can I apply another function on these values stacked in each bin?
Like summing them up or building averages?
Thanks,
Timmie

[1] http://docs.scipy.org/doc/numpy/reference/generated/numpy.histogram.html
[2]
http://www.scipy.org/Tentative_NumPy_Tutorial#head-aa75ec76530ff51a2e98071adb7224a4b793519e

From baker.alexander at gmail.com  Thu Aug 27 08:23:45 2009
From: baker.alexander at gmail.com (alexander baker)
Date: Thu, 27 Aug 2009 13:23:45 +0100
Subject: [Numpy-discussion] histogram: sum up values in each bin
In-Reply-To: 
References: 
Message-ID: <270620220908270523h268998fcmd617f9557049b9ab@mail.gmail.com>

Here is an example, this does something extra at the end but shows how
the bins can be used.

Regards

Alex Baker.

from scipy.stats import norm
r = norm.rvs(size=10000)

import numpy as np
p, bins = np.histogram(r, bins=50, normed=True)  # 'width' was undefined
                                                 # here; a bin count seems
                                                 # intended
db = bins[1]-bins[0]
cdf = np.cumsum(p*db)

from pylab import figure, show
fig = figure()
ax = fig.add_subplot(111)
ax.bar(bins[:-1], cdf, width=0.8*db)
show()

o = []
rates = []
for r in np.arange(0, max(bins), db):
    G = max(np.cumsum([bin for bin in bins if bin > r]))
    L = min(np.cumsum([bin for bin in bins if bin < r]))
    o.append(abs(G/L))
    rates.append(r)

Mobile: 07788 872118
Blog: www.alexfb.com

--
All science is either physics or stamp collecting.

2009/8/27 Tim Michelsen
> Hello,
> I need some advice on histograms.
> [...]
> But how can I apply another function on these values stacked in each bin?
> Like summing them up or building averages?

From josef.pktd at gmail.com  Thu Aug 27 09:19:15 2009
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Thu, 27 Aug 2009 09:19:15 -0400
Subject: [Numpy-discussion] histogram: sum up values in each bin
In-Reply-To: <270620220908270523h268998fcmd617f9557049b9ab@mail.gmail.com>
References: 
Message-ID: <1cd32cbb0908270619t509c4b16je8427104eace3f82@mail.gmail.com>

On Thu, Aug 27, 2009 at 8:23 AM, alexander baker wrote:
> Here is an example, this does something extra at the end but shows how
> the bins can be used.
> [... example and Tim's original question quoted ...]
Tim, do you mean that you want to apply other functions, e.g. mean or
variance, to the original values but calculated per bin?

If I read the answer of Alex correctly, then it only works with the
bin count.

To calculate e.g. the variance of all values per bin, I think, the
easiest would be to create a label array, with values arange(nbins-1)
for the corresponding original data, and then use np.bincount.

I don't know straight away what the easiest or fastest way is to
create the label array from the histogram bin boundaries.

Josef

From schut at sarvision.nl  Thu Aug 27 09:23:15 2009
From: schut at sarvision.nl (Vincent Schut)
Date: Thu, 27 Aug 2009 15:23:15 +0200
Subject: [Numpy-discussion] histogram: sum up values in each bin
In-Reply-To: 
References: 
Message-ID: 

Tim Michelsen wrote:
> Hello,
> I need some advice on histograms.
> [...]
> But how can I apply another function on these values stacked in each bin?
> Like summing them up or building averages?

Hi Tim,

If you just want to sum and/or average (= sum / count), you can shortcut
this using the weights parameter of numpy.histogram, e.g. something like:

data = numpy.random.random((100,))
countsPerBin, binEdges = numpy.histogram(data)   # histogram returns
sumsPerBin, _ = numpy.histogram(data, weights=data)  # (hist, edges)
averagePerBin = sumsPerBin / countsPerBin

Regards,
Vincent.

From josef.pktd at gmail.com  Thu Aug 27 09:42:36 2009
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Thu, 27 Aug 2009 09:42:36 -0400
Subject: [Numpy-discussion] histogram: sum up values in each bin
In-Reply-To: 
References: 
Message-ID: <1cd32cbb0908270642l1c71e409k13c23afe973b5eb8@mail.gmail.com>

On Thu, Aug 27, 2009 at 9:23 AM, Vincent Schut wrote:
> If you just want to sum and/or average (= sum / count), you can shortcut
> this using the weights parameter of numpy.histogram, e.g. something like:
>
> data = numpy.random.random((100,))
> countsPerBin, binEdges = numpy.histogram(data)
> sumsPerBin, _ = numpy.histogram(data, weights=data)
> averagePerBin = sumsPerBin / countsPerBin
>
> Regards,
> Vincent.

Thanks for the pointer, I didn't realize that histogram also has a
weights argument.

It would be interesting to know what the overhead is for repeated
calls to histogram for larger arrays, if anyone knows.
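For the label array route I mentioned, something like this might work
(a rough sketch, untested; np.digitize is 1-based against the edges,
and empty bins would need a guard against zero division):

import numpy as np

data = np.random.random(1000)
counts, edges = np.histogram(data, bins=10)
# bin index of each original value; clip puts the right edge into the
# last bin
labels = np.clip(np.digitize(data, edges) - 1, 0, len(edges) - 2)
sums = np.bincount(labels, weights=data)
means = sums / np.bincount(labels)   # per-bin averages in a single pass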
Josef

> >> Thanks,
> >> Timmie
> [...]

From timmichelsen at gmx-topmail.de  Thu Aug 27 12:49:23 2009
From: timmichelsen at gmx-topmail.de (Tim Michelsen)
Date: Thu, 27 Aug 2009 16:49:23 +0000 (UTC)
Subject: [Numpy-discussion] histogram: sum up values in each bin
References: <270620220908270523h268998fcmd617f9557049b9ab@mail.gmail.com>
	<1cd32cbb0908270619t509c4b16je8427104eace3f82@mail.gmail.com>
Message-ID: 

> Tim, do you mean that you want to apply other functions, e.g. mean or
> variance, to the original values but calculated per bin?
Sorry that I forgot to add this. Shame.

I would like to apply these mathematical functions on the original values
stacked in the respective bins.

For instance:

The sample data measures the weight of an animal.

1) A histogram gives a count of how many values are in each bin.

I would like to calculate the average weight of all animals
sorted into bin1, bin2 etc.

This is also useful where you have a time component.

In spreadsheets I would use a '=' to reference the original data and then
either sum it up or count it per class.

I hope this is somehow understandable.

Thanks,
Timmie

From jackchungchiehyu at googlemail.com  Thu Aug 27 13:00:25 2009
From: jackchungchiehyu at googlemail.com (Jack Yu)
Date: Thu, 27 Aug 2009 18:00:25 +0100
Subject: [Numpy-discussion] linalg svd illegal instruction
Message-ID: 

Hi all,

I am having trouble using the function numpy.linalg.svd().  It works fine
on my personal computer.  However, when I use it on a cluster at
university, it returns 'Illegal Instruction' when the input matrix is
complex.  Is this function meant to work on a complex array?  If so, what
could be the cause of the illegal instruction message?

The version of numpy is 1.2.1.  I have tried calling it via both
numpy.linalg.svd(), and pylab.svd().

Thanks in advance for any help,
Jack Yu
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From robert.kern at gmail.com  Thu Aug 27 13:03:37 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Thu, 27 Aug 2009 10:03:37 -0700
Subject: [Numpy-discussion] linalg svd illegal instruction
In-Reply-To: 
References: 
Message-ID: <3d375d730908271003u5fa79f2ei6fa568dabac3dce8@mail.gmail.com>

On Thu, Aug 27, 2009 at 10:00, Jack Yu wrote:
> I am having trouble using the function numpy.linalg.svd().  It works fine
> on my personal computer.  However, when I use it on a cluster at
> university, it returns 'Illegal Instruction' when the input matrix is
> complex.  Is this function meant to work on a complex array?

Yes.

> If so, what could be the cause
> of the illegal instruction message?

The numpy installed on your cluster was linked against a build of
ATLAS that was not configured correctly for the CPU it is running on.

> The version of numpy is 1.2.1.  I have tried calling it via both
> numpy.linalg.svd(), and pylab.svd().

The latter just calls the former.
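A rough way to confirm which BLAS/LAPACK/ATLAS libraries a given numpy
build picked up (the exact output depends on the build):

>>> import numpy
>>> numpy.show_config()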
--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
 -- Umberto Eco

From josef.pktd at gmail.com  Thu Aug 27 13:27:29 2009
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Thu, 27 Aug 2009 13:27:29 -0400
Subject: [Numpy-discussion] histogram: sum up values in each bin
In-Reply-To: 
References: <270620220908270523h268998fcmd617f9557049b9ab@mail.gmail.com>
	<1cd32cbb0908270619t509c4b16je8427104eace3f82@mail.gmail.com>
Message-ID: <1cd32cbb0908271027l75916349jf978ae21ab9c43d4@mail.gmail.com>

On Thu, Aug 27, 2009 at 12:49 PM, Tim Michelsen wrote:
> I would like to apply these mathematical functions on the original values
> stacked in the respective bins.
> [...]
> I hope this is somehow understandable.

Yes, it is a quite common use case for descriptive statistics, and I'm
starting to collect different ways of doing it.

In your case, Vincent's way is the easiest. If you need to be faster,
or you want to apply the same classification also to other variables,
e.g. size of the animal, then creating a label array would be a more
flexible solution.

There was a similar thread recently on the scipy-user list for sorted
arrays: "How to average different pieces or an array?"

Josef

From charlesr.harris at gmail.com  Thu Aug 27 14:24:49 2009
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Thu, 27 Aug 2009 12:24:49 -0600
Subject: [Numpy-discussion] What type should / return in python 3k when
	applied to two integer types?
Message-ID: 

I'm thinking double. There is a potential loss of precision for 64 bit
ints but nothing else seems reasonable for a default. Thoughts?

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From nadavh at visionsense.com  Thu Aug 27 14:32:10 2009
From: nadavh at visionsense.com (Nadav Horesh)
Date: Thu, 27 Aug 2009 21:32:10 +0300
Subject: [Numpy-discussion] What type should / return in python 3k when
	applied to two integer types?
References: 
Message-ID: <710F2847B0018641891D9A21602763605AD12E@ex3.envision.co.il>

Double is the natural choice, there is a possibility of long double
(float96 on x86 or float128 on amd64) where there is no precision loss.
Is this option portable?

  Nadav

-----Original Message-----
From: numpy-discussion-bounces at scipy.org on behalf of Charles R Harris
Sent: Thu 27-Aug-09 21:24
To: numpy-discussion
Subject: [Numpy-discussion] What type should / return in python 3k when
	applied to two integer types?

I'm thinking double. There is a potential loss of precision for 64 bit
ints but nothing else seems reasonable for a default. Thoughts?

Chuck
-------------- next part --------------
A non-text attachment was scrubbed...
Name: winmail.dat
Type: application/ms-tnef
Size: 3216 bytes
Desc: not available
URL: 
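(For concreteness, the precision loss in question starts at 2**53, the
first integer a C double cannot represent exactly — a quick sketch:)

>>> import numpy as np
>>> x = 2**53 + 1
>>> int(np.float64(x)) == x
False
>>> int(np.float64(2**53)) == 2**53
True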
From charlesr.harris at gmail.com  Thu Aug 27 14:50:14 2009
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Thu, 27 Aug 2009 12:50:14 -0600
Subject: [Numpy-discussion] What type should / return in python 3k when
	applied to two integer types?
In-Reply-To: <710F2847B0018641891D9A21602763605AD12E@ex3.envision.co.il>
References: <710F2847B0018641891D9A21602763605AD12E@ex3.envision.co.il>
Message-ID: 

2009/8/27 Nadav Horesh

> Double is the natural choice, there is a possibility of long double
> (float96 on x86 or float128 on amd64) where there is no precision loss.
> Is this option portable?

Not really. The long double type can be a bit weird and varies from
architecture to architecture.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From charlesr.harris at gmail.com  Thu Aug 27 14:54:28 2009
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Thu, 27 Aug 2009 12:54:28 -0600
Subject: [Numpy-discussion] What type should / return in python 3k when
	applied to two integer types?
In-Reply-To: 
References: <710F2847B0018641891D9A21602763605AD12E@ex3.envision.co.il>
Message-ID: 

On Thu, Aug 27, 2009 at 12:50 PM, Charles R Harris wrote:

> 2009/8/27 Nadav Horesh
> [...]
>
> Not really. The long double type can be a bit weird and varies from
> architecture to architecture.
>

The real problem is deciding what to do with integer precisions that fit
in float32. At present we have

In [2]: x = ones(1, dtype=int16)

In [3]: true_divide(x,x)
Out[3]: array([ 1.], dtype=float32)

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From aisaac at american.edu  Thu Aug 27 15:12:01 2009
From: aisaac at american.edu (Alan G Isaac)
Date: Thu, 27 Aug 2009 15:12:01 -0400
Subject: [Numpy-discussion] What type should / return in python 3k when
	applied to two integer types?
In-Reply-To: 
References: <710F2847B0018641891D9A21602763605AD12E@ex3.envision.co.il>
Message-ID: <4A96DA81.4050405@american.edu>

Charles R Harris wrote:
> The real problem is deciding what to do with integer precisions that fit
> in float32. At present we have
>
> In [2]: x = ones(1, dtype=int16)
>
> In [3]: true_divide(x,x)
> Out[3]: array([ 1.], dtype=float32)

A user perspective: ambiguous cases should always be resolved to the
default (float64). Users that know what they are doing can always
request another dtype. (Well, at least in principle; currently ufuncs do
not allow a dtype argument, I guess. Is there a reason not to make the
`out` argument a keyword argument and then also alternatively allow a
dtype specification?)

Alan Isaac

From nadavh at visionsense.com  Thu Aug 27 15:13:34 2009
From: nadavh at visionsense.com (Nadav Horesh)
Date: Thu, 27 Aug 2009 22:13:34 +0300
Subject: [Numpy-discussion] What type should / return in python 3k when
	applied to two integer types?
References: <710F2847B0018641891D9A21602763605AD12E@ex3.envision.co.il>
Message-ID: <710F2847B0018641891D9A21602763605AD130@ex3.envision.co.il>

How about making this arch dependent translation:

short int -> float
int -> double
long int -> long double

or adding a flag that would switch between the above translation and the
option that would produce only doubles.

For some computing projects I made I would prefer the first option: there
I used huge arrays, and could not afford having extra precision on
account of memory consumption. Currently I do a lot of (8 and 16 bits)
image processing, memory size is not a problem, and it feels nice not to
worry about precision (think about the bad habits of Matlab users).

  Nadav

-----Original Message-----
From: numpy-discussion-bounces at scipy.org on behalf of Charles R Harris
Sent: Thu 27-Aug-09 21:54
To: Discussion of Numerical Python
Subject: Re: [Numpy-discussion] What type should / return in python 3k
	when applied to two integer types?

[...]

From charlesr.harris at gmail.com  Thu Aug 27 15:25:29 2009
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Thu, 27 Aug 2009 13:25:29 -0600
Subject: [Numpy-discussion] What type should / return in python 3k when
	applied to two integer types?
In-Reply-To: <710F2847B0018641891D9A21602763605AD130@ex3.envision.co.il>
References: <710F2847B0018641891D9A21602763605AD12E@ex3.envision.co.il>
	<710F2847B0018641891D9A21602763605AD130@ex3.envision.co.il>
Message-ID: 

2009/8/27 Nadav Horesh

> How about making this arch dependent translation:
>
> short int -> float
> int -> double
> long int -> long double
> [...]

I really want to avoid long double. My hope is that some day it will
always be quad precision on ieee machines and then it will be a more
universal type. But that day is still in the (far?) future.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From robert.kern at gmail.com  Thu Aug 27 15:27:52 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Thu, 27 Aug 2009 12:27:52 -0700
Subject: [Numpy-discussion] What type should / return in python 3k when
	applied to two integer types?
In-Reply-To: 
References: 
Message-ID: <3d375d730908271227k3cea13b8ledd88a6ff8f21172@mail.gmail.com>

On Thu, Aug 27, 2009 at 11:24, Charles R Harris wrote:
> I'm thinking double. There is a potential loss of precision for 64 bit
> ints but nothing else seems reasonable for a default. Thoughts?

Python int / Python int => Python float

no matter how many decimal places the two ints have. I also say double.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
 -- Umberto Eco
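(The py3k semantics being matched, for reference — true division always
yields a float, floor division stays integral, and Python ints are
unbounded:)

>>> 1 / 2
0.5
>>> 1 // 2
0
>>> type(2**64 // 3)
<class 'int'>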
From charlesr.harris at gmail.com  Thu Aug 27 15:43:33 2009
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Thu, 27 Aug 2009 13:43:33 -0600
Subject: [Numpy-discussion] What type should / return in python 3k when
	applied to two integer types?
In-Reply-To: <3d375d730908271227k3cea13b8ledd88a6ff8f21172@mail.gmail.com>
References: <3d375d730908271227k3cea13b8ledd88a6ff8f21172@mail.gmail.com>
Message-ID: 

On Thu, Aug 27, 2009 at 1:27 PM, Robert Kern wrote:
> On Thu, Aug 27, 2009 at 11:24, Charles R Harris wrote:
> > I'm thinking double. There is a potential loss of precision for 64 bit
> > ints but nothing else seems reasonable for a default. Thoughts?
>
> Python int / Python int => Python float
>
> no matter how many decimal places the two ints have. I also say double.

What about //?

In [1]: x = ones(1, dtype=uint64)

In [2]: y = ones(1, dtype=int64)

In [3]: floor_divide(x,y).dtype
Out[3]: dtype('float64')

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From nadavh at visionsense.com  Thu Aug 27 15:43:31 2009
From: nadavh at visionsense.com (Nadav Horesh)
Date: Thu, 27 Aug 2009 22:43:31 +0300
Subject: [Numpy-discussion] What type should / return in python 3k when
	applied to two integer types?
References: <710F2847B0018641891D9A21602763605AD12E@ex3.envision.co.il>
	<710F2847B0018641891D9A21602763605AD130@ex3.envision.co.il>
Message-ID: <710F2847B0018641891D9A21602763605AD131@ex3.envision.co.il>

I really do not mind avoiding long doubles; in practice I used them only
once or twice. But I assume that short int -> float would be useful for
many of the numpy users. It also may align nicely with (u)int8->float16
on GPUs.

  Nadav

-----Original Message-----
From: numpy-discussion-bounces at scipy.org on behalf of Charles R Harris
Sent: Thu 27-Aug-09 22:25
To: Discussion of Numerical Python
Subject: Re: [Numpy-discussion] What type should / return in python 3k
	when applied to two integer types?

[...]
-------------- next part --------------
A non-text attachment was scrubbed...
Name: winmail.dat
Type: application/ms-tnef
Size: 4106 bytes
Desc: not available
URL: 

From robert.kern at gmail.com  Thu Aug 27 15:46:33 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Thu, 27 Aug 2009 12:46:33 -0700
Subject: [Numpy-discussion] What type should / return in python 3k when
	applied to two integer types?
In-Reply-To: 
References: <3d375d730908271227k3cea13b8ledd88a6ff8f21172@mail.gmail.com>
Message-ID: <3d375d730908271246k509f1404s8c107e7e8cbb74b9@mail.gmail.com>

On Thu, Aug 27, 2009 at 12:43, Charles R Harris wrote:
> On Thu, Aug 27, 2009 at 1:27 PM, Robert Kern wrote:
>> On Thu, Aug 27, 2009 at 11:24, Charles R Harris wrote:
>> > I'm thinking double. There is a potential loss of precision for 64 bit
>> > ints but nothing else seems reasonable for a default. Thoughts?
>>
>> Python int / Python int => Python float
>>
>> no matter how many decimal places the two ints have. I also say double.
> > What about //? > > In [1]: x = ones(1, dtype=uint64) > > In [2]: y = ones(1, dtype=int64) > > In [3]: floor_divide(x,y).dtype > Out[3]: dtype('float64') Ewww. It should be an appropriate integer type. Probably whatever x*y is. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From charlesr.harris at gmail.com Thu Aug 27 15:57:40 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 27 Aug 2009 13:57:40 -0600 Subject: [Numpy-discussion] What type should / return in python 3k when applied to two integer types? In-Reply-To: <3d375d730908271246k509f1404s8c107e7e8cbb74b9@mail.gmail.com> References: <3d375d730908271227k3cea13b8ledd88a6ff8f21172@mail.gmail.com> <3d375d730908271246k509f1404s8c107e7e8cbb74b9@mail.gmail.com> Message-ID: On Thu, Aug 27, 2009 at 1:46 PM, Robert Kern wrote: > On Thu, Aug 27, 2009 at 12:43, Charles R > Harris wrote: > > > > > > On Thu, Aug 27, 2009 at 1:27 PM, Robert Kern > wrote: > >> > >> On Thu, Aug 27, 2009 at 11:24, Charles R > >> Harris wrote: > >> > I'm thinking double. There is a potential loss of precision for 64 bit > >> > ints > >> > but nothing else seems reasonable for a default. Thoughts? > >> > >> Python int / Python int => Python float > >> > >> no matter how many decimal places the two ints have. I also say double. > > > > What about //? > > > > In [1]: x = ones(1, dtype=uint64) > > > > In [2]: y = ones(1, dtype=int64) > > > > In [3]: floor_divide(x,y).dtype > > Out[3]: dtype('float64') > > Ewww. It should be an appropriate integer type. Probably whatever x*y is. > In [5]: (x*y).dtype Out[5]: dtype('float64') ?? The problem is that numpy doesn't have an integer type that can support all the possible return values. Python doesn't have that problem, so mapping Python behaviour to numpy is a square peg round hole sort of thing. Float64 doesn't do the job either. The only two really "correct" options would seem to be raising an error for this combination of types or returning object arrays containing long ints. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Thu Aug 27 16:35:57 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 27 Aug 2009 16:35:57 -0400 Subject: [Numpy-discussion] What type should / return in python 3k when applied to two integer types? In-Reply-To: References: <3d375d730908271227k3cea13b8ledd88a6ff8f21172@mail.gmail.com> <3d375d730908271246k509f1404s8c107e7e8cbb74b9@mail.gmail.com> Message-ID: <1cd32cbb0908271335w17df1158h1fb60d38966ec427@mail.gmail.com> On Thu, Aug 27, 2009 at 3:57 PM, Charles R Harris wrote: > > > On Thu, Aug 27, 2009 at 1:46 PM, Robert Kern wrote: >> >> On Thu, Aug 27, 2009 at 12:43, Charles R >> Harris wrote: >> > >> > >> > On Thu, Aug 27, 2009 at 1:27 PM, Robert Kern >> > wrote: >> >> >> >> On Thu, Aug 27, 2009 at 11:24, Charles R >> >> Harris wrote: >> >> > I'm thinking double. There is a potential loss of precision for 64 >> >> > bit >> >> > ints >> >> > but nothing else seems reasonable for a default. Thoughts? >> >> >> >> Python int / Python int => Python float >> >> >> >> no matter how many decimal places the two ints have. I also say double. >> > >> > What about //? >> > >> > In [1]: x = ones(1, dtype=uint64) >> > >> > In [2]: y = ones(1, dtype=int64) >> > >> > In [3]: floor_divide(x,y).dtype >> > Out[3]: dtype('float64') >> >> Ewww. 
It should be an appropriate integer type. Probably whatever x*y is. > > In [5]: (x*y).dtype > Out[5]: dtype('float64') > ??? > > The problem is that numpy doesn't have an integer type that can support all > the possible return values. Python doesn't have that problem, so mapping > Python behaviour to numpy is a square peg round hole sort of thing. Float64 > doesn't do the job either. The only two really "correct" options would seem > to be raising an error for this combination of types or returning object > arrays containing long ints. > > Chuck > I'm always a bit surprised about integers in numpy and try to avoid calculations with them. So I would be in favor of x/y is correct floating point answer. Josef >>> x = np.ones(1, dtype=np.uint64); y = np.ones(1, dtype=np.int64) >>> np.true_divide((0*x),0) array([ 0.]) >>> np.true_divide((0*x),0).dtype dtype('float64') >>> np.true_divide((0*x),0.) array([ NaN]) >>> np.true_divide((x),0) array([ 0.]) >>> np.true_divide((x),0.) array([ Inf]) floor doesn't return an integer >>> np.floor(x).dtype dtype('float64') From charlesr.harris at gmail.com Thu Aug 27 16:51:51 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 27 Aug 2009 14:51:51 -0600 Subject: [Numpy-discussion] What type should / return in python 3k when applied to two integer types? In-Reply-To: <1cd32cbb0908271335w17df1158h1fb60d38966ec427@mail.gmail.com> References: <3d375d730908271227k3cea13b8ledd88a6ff8f21172@mail.gmail.com> <3d375d730908271246k509f1404s8c107e7e8cbb74b9@mail.gmail.com> <1cd32cbb0908271335w17df1158h1fb60d38966ec427@mail.gmail.com> Message-ID: On Thu, Aug 27, 2009 at 2:35 PM, wrote: > > I'm always a bit surprised about integers in numpy and try to avoid > calculations with them. So I would be in favor of x/y is correct > floating point answer. > > Josef > > >>> x = np.ones(1, dtype=np.uint64); y = np.ones(1, dtype=np.int64) > >>> np.true_divide((0*x),0) > array([ 0.]) > >>> np.true_divide((0*x),0).dtype > dtype('float64') > > >>> np.true_divide((0*x),0.) > array([ NaN]) > >>> np.true_divide((x),0) > array([ 0.]) > >>> np.true_divide((x),0.) > array([ Inf]) > > floor doesn't return an integer > floor_divide is different, it is supposed to correspond to the new python // operator. In [1]: x = ones(1, dtype=int16) In [2]: floor_divide(x,x).dtype Out[2]: dtype('int16') Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From terhorst at gmail.com Thu Aug 27 17:09:40 2009 From: terhorst at gmail.com (Jonathan T) Date: Thu, 27 Aug 2009 21:09:40 +0000 (UTC) Subject: [Numpy-discussion] Efficiently defining a multidimensional array Message-ID: Hi, I want to define a 3-D array as the sum of two 2-D arrays as follows: C[x,y,z] := A[x,y] + B[x,z] My linear algebra is a bit rusty; is there a good way to do this that does not require me to loop over x,y,z? Thanks! Jonathan From Chris.Barker at noaa.gov Thu Aug 27 17:22:39 2009 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Thu, 27 Aug 2009 14:22:39 -0700 Subject: [Numpy-discussion] What type should / return in python 3k when applied to two integer types? In-Reply-To: <3d375d730908271246k509f1404s8c107e7e8cbb74b9@mail.gmail.com> References: <3d375d730908271227k3cea13b8ledd88a6ff8f21172@mail.gmail.com> <3d375d730908271246k509f1404s8c107e7e8cbb74b9@mail.gmail.com> Message-ID: <4A96F91F.20206@noaa.gov> Robert Kern wrote: > On Thu, Aug 27, 2009 at 12:43, Charles R > Harris wrote: >> In [3]: floor_divide(x,y).dtype >> Out[3]: dtype('float64') > > Ewww. 
It should be an appropriate integer type. Probably whatever x*y is. +1 if you are working with integers, you should get integers, because that's probably what you want. -- they can overflow, etc. anyway, so buyer beware! In [7]: x.dtype Out[7]: dtype('int64') In [8]: y.dtype Out[8]: dtype('uint64') In [9]: (x * y).dtype Out[9]: dtype('float64') hmmm -- I thought we had removed this kind of silent upcasting (particularly int-> float), but I guess when you mix two types, numpy has to choose something! In any case, x/y should probably return the same type as x*y. By the way -- is there something about py3k that changes all this? Or is this just an opportunity to perhaps make some backward-incompatible changes to numpy? -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From Chris.Barker at noaa.gov Thu Aug 27 17:27:02 2009 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Thu, 27 Aug 2009 14:27:02 -0700 Subject: [Numpy-discussion] Efficiently defining a multidimensional array In-Reply-To: References: Message-ID: <4A96FA26.3080606@noaa.gov> Jonathan T wrote: > I want to define a 3-D array as the sum of two 2-D arrays as follows: > > C[x,y,z] := A[x,y] + B[x,z] Is this what you mean? In [14]: A = np.arange(6).reshape((2,3,1)) In [15]: B = np.arange(12).reshape((1,3,4)) In [18]: A Out[18]: array([[[0], [1], [2]], [[3], [4], [5]]]) In [19]: B Out[19]: array([[[ 0, 1, 2, 3], [ 4, 5, 6, 7], [ 8, 9, 10, 11]]]) In [20]: A+B Out[20]: array([[[ 0, 1, 2, 3], [ 5, 6, 7, 8], [10, 11, 12, 13]], [[ 3, 4, 5, 6], [ 8, 9, 10, 11], [13, 14, 15, 16]]]) -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From lciti at essex.ac.uk Thu Aug 27 17:24:15 2009 From: lciti at essex.ac.uk (Citi, Luca) Date: Thu, 27 Aug 2009 22:24:15 +0100 Subject: [Numpy-discussion] Efficiently defining a multidimensional array References: Message-ID: <3DA3B328CBC48B4EBB88484B8A5EA19106AF9E87@sernt14.essex.ac.uk> One solution I can think of still requires one loop (instead of three): import numpy as np a = np.arange(12).reshape(3,4) b = np.arange(15).reshape(3,5) z = np.empty(a.shape + (b.shape[-1],)) for i in range(len(z)): z[i] = np.add.outer(a[i], b[i]) From robert.kern at gmail.com Thu Aug 27 17:26:54 2009 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 27 Aug 2009 14:26:54 -0700 Subject: [Numpy-discussion] What type should / return in python 3k when applied to two integer types? In-Reply-To: <4A96F91F.20206@noaa.gov> References: <3d375d730908271227k3cea13b8ledd88a6ff8f21172@mail.gmail.com> <3d375d730908271246k509f1404s8c107e7e8cbb74b9@mail.gmail.com> <4A96F91F.20206@noaa.gov> Message-ID: <3d375d730908271426m7068c25fj10078a1b5546d8a9@mail.gmail.com> On Thu, Aug 27, 2009 at 14:22, Christopher Barker wrote: > By the way -- is there something about py3k that changes all this? Or is > this just an opportunity to perhaps make some backward-incompatible > changes to numpy? Python 3 makes the promised change of int/int => float. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From lciti at essex.ac.uk Thu Aug 27 17:32:31 2009 From: lciti at essex.ac.uk (Citi, Luca) Date: Thu, 27 Aug 2009 22:32:31 +0100 Subject: [Numpy-discussion] Efficiently defining a multidimensional array References: <3DA3B328CBC48B4EBB88484B8A5EA19106AF9E87@sernt14.essex.ac.uk> Message-ID: <3DA3B328CBC48B4EBB88484B8A5EA19106AF9E88@sernt14.essex.ac.uk> Or a[:,:,None] + b[:,None,:] From charlesr.harris at gmail.com Thu Aug 27 17:40:42 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 27 Aug 2009 15:40:42 -0600 Subject: [Numpy-discussion] What type should / return in python 3k when applied to two integer types? In-Reply-To: <3d375d730908271426m7068c25fj10078a1b5546d8a9@mail.gmail.com> References: <3d375d730908271227k3cea13b8ledd88a6ff8f21172@mail.gmail.com> <3d375d730908271246k509f1404s8c107e7e8cbb74b9@mail.gmail.com> <4A96F91F.20206@noaa.gov> <3d375d730908271426m7068c25fj10078a1b5546d8a9@mail.gmail.com> Message-ID: On Thu, Aug 27, 2009 at 3:26 PM, Robert Kern wrote: > On Thu, Aug 27, 2009 at 14:22, Christopher Barker > wrote: > > > By the way -- is there something about py3k that changes all this? Or is > > this just an opportunity to perhaps make some backward-incompatible > > changes to numpy? > > Python 3 makes the promised change of int/int => float. > > -- > Robert Kern > I also intend to make it work with from future import division As a start. Because we only support python >= 2.4, the new division is available and that could help us with porting. I've also considered making that import the default for numpy internally, so we can fix things up, but that may be a bit radical. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From dwf at cs.toronto.edu Thu Aug 27 17:41:25 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Thu, 27 Aug 2009 17:41:25 -0400 Subject: [Numpy-discussion] What type should / return in python 3k when applied to two integer types? In-Reply-To: <3d375d730908271227k3cea13b8ledd88a6ff8f21172@mail.gmail.com> References: <3d375d730908271227k3cea13b8ledd88a6ff8f21172@mail.gmail.com> Message-ID: <4B587A17-0A87-4CDE-B4F2-89474FDE2467@cs.toronto.edu> On 27-Aug-09, at 3:27 PM, Robert Kern wrote: > no matter how many decimal places the two ints have. Er... I must be missing something here. ;) David From fons at kokkinizita.net Thu Aug 27 17:49:56 2009 From: fons at kokkinizita.net (Fons Adriaensen) Date: Thu, 27 Aug 2009 23:49:56 +0200 Subject: [Numpy-discussion] future directions Message-ID: <20090827214955.GG2963@zita2.kokkinizita.net> Some weeks ago there was a post on this list requesting feedback on possible future directions for numpy. As I was quite busy at that time I'll reply to it now. My POV is that of a novice user, who at the same time wants quite badly to use the numpy framework for his numerical work which in this case is related to (some rather advanced) multichannell audio processing. >From that POV, I'd suggest the following: 1. Adopt an object based on Python-3's buffer protocol as the basic array type. It's immensely more powerful than ndarray, while at the same time it's close enough to ndarray to allow a gradual adoption. 2. Adopting that format will make it even more important to clearly define in which cases data gets copied and when not. This should be based on some simple rules that can be evaluated by a code author without requiring a lookup in the reference docs each time. 3. Finally remove all the redundancy and legacy stuff from the world of numerical Python. 
It is *very* confusing to a new user.

4. Ensure that each package deals with one problem area only.
For example a package that (by its name) suggests it provides
plotting facilities should provide only plotting facilities,
and not spectra, averages of all sorts, etc.

5. Ensure some consistency in style. Some numerical Python
packages use two-character function names, some have
veryLongCamelCased names.

Just my two Eurocents of course.

Ciao,

--
FA

Io lo dico sempre: l'Italia è troppo stretta e lunga.

From eads at soe.ucsc.edu  Thu Aug 27 17:52:34 2009
From: eads at soe.ucsc.edu (Damian Eads)
Date: Thu, 27 Aug 2009 14:52:34 -0700
Subject: [Numpy-discussion] Efficiently defining a multidimensional array
In-Reply-To: 
References: 
Message-ID: <91b4b1ab0908271452m8de5aa5m6087447530f1cb31@mail.gmail.com>

Hi Jonathan,

This isn't quite your typical linear algebra. NumPy has a nice feature
called array broadcasting, which enables you to perform element-wise
operations on arrays of different shapes. The number of dimensions of the
arrays must be the same, in your case, all the arrays must have three
dimensions. The newaxis keyword is useful for creating a dimension of
size one.

import numpy as np

m, n, k = 4, 3, 5   # example sizes; the original post left these undefined

A=np.random.rand(m,n)
B=np.random.rand(n,k)

# Line up the axes of size>1 by creating a new axis for each array.
C=A[:,:,np.newaxis] + B[np.newaxis,:,:]

# This is equivalent to the much slower triple for-loop
TC=np.zeros((m,n,k))
for x in xrange(0,m):
    for y in xrange(0,n):
        for z in xrange(0,k):
            TC[x,y,z]=A[x,y]+B[y,z]

# This should be true.
print (TC==C).all()

I hope this helps.

Damian

On Thu, Aug 27, 2009 at 3:09 PM, Jonathan T wrote:
> Hi,
>
> I want to define a 3-D array as the sum of two 2-D arrays as follows:
>
>   C[x,y,z] := A[x,y] + B[x,z]
>
> My linear algebra is a bit rusty; is there a good way to do this that does not
> require me to loop over x,y,z? Thanks!
>
> Jonathan
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>

--
-----------------------------------------------------
Damian Eads                      Ph.D. Candidate
University of California         Computer Science
1156 High Street                 Machine Learning Lab, E2-489
Santa Cruz, CA 95064             http://www.soe.ucsc.edu/~eads

From robert.kern at gmail.com  Thu Aug 27 17:55:51 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Thu, 27 Aug 2009 14:55:51 -0700
Subject: [Numpy-discussion] What type should / return in python 3k when applied to two integer types?
In-Reply-To: <4B587A17-0A87-4CDE-B4F2-89474FDE2467@cs.toronto.edu>
References: <3d375d730908271227k3cea13b8ledd88a6ff8f21172@mail.gmail.com>
	<4B587A17-0A87-4CDE-B4F2-89474FDE2467@cs.toronto.edu>
Message-ID: <3d375d730908271455m4dda8186u8d203b79741c48ac@mail.gmail.com>

On Thu, Aug 27, 2009 at 14:41, David Warde-Farley wrote:
> On 27-Aug-09, at 3:27 PM, Robert Kern wrote:
>
>> no matter how many decimal places the two ints have.
>
> Er... I must be missing something here. ;)

I meant decimal digits.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

From charlesr.harris at gmail.com  Thu Aug 27 17:56:54 2009
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Thu, 27 Aug 2009 15:56:54 -0600
Subject: [Numpy-discussion] What type should / return in python 3k when applied to two integer types?
In-Reply-To: <4A96F91F.20206@noaa.gov>
References: <3d375d730908271227k3cea13b8ledd88a6ff8f21172@mail.gmail.com>
	<3d375d730908271246k509f1404s8c107e7e8cbb74b9@mail.gmail.com>
	<4A96F91F.20206@noaa.gov>
Message-ID: 

On Thu, Aug 27, 2009 at 3:22 PM, Christopher Barker wrote:

> Robert Kern wrote:
> > On Thu, Aug 27, 2009 at 12:43, Charles R Harris wrote:
> >> In [3]: floor_divide(x,y).dtype
> >> Out[3]: dtype('float64')
> >
> > Ewww. It should be an appropriate integer type. Probably whatever x*y is.
>
> +1 if you are working with integers, you should get integers, because
> that's probably what you want. -- they can overflow, etc. anyway, so
> buyer beware!
>
> In [7]: x.dtype
> Out[7]: dtype('int64')
>
> In [8]: y.dtype
> Out[8]: dtype('uint64')
>
> In [9]: (x * y).dtype
> Out[9]: dtype('float64')
>
> hmmm -- I thought we had removed this kind of silent upcasting
> (particularly int-> float), but I guess when you mix two types, numpy
> has to choose something!
>
> In any case, x/y should probably return the same type as x*y.
>

Another possibility is to cast the signed type to unsigned of the same
precision. But then uint64(1)//int64(-1) == 0, which may be too much of a
surprise. Note that int64(x)//uint64(y) always fits in int64, so the order
of the types could be significant also. However, two possibilities for the
return type are likely a complexity too far. The cast from signed to
unsigned type of the same precision is what I would have chosen for numpy
in the first place. Then again, I tend to be a bit "odd" about some things
;)

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From gael.varoquaux at normalesup.org  Thu Aug 27 18:03:34 2009
From: gael.varoquaux at normalesup.org (Gael Varoquaux)
Date: Fri, 28 Aug 2009 00:03:34 +0200
Subject: [Numpy-discussion] Performance testing in unit tests
Message-ID: <20090827220334.GB21256@phare.normalesup.org>

Hi list,

This is slightly off topic, so please pardon me.

I want to do performance testing. To be precise, I have a simple case: I
want to check that 2 operations perform with a similar speed (so I am
abstracted from the machine's performance).

What would be the recommended way of timing the operation in a unit test?
I use nose, if this is of any use. I am more than happy to be pointed to
an example.

Cheers,

Gaël

From Chris.Barker at noaa.gov  Thu Aug 27 18:13:30 2009
From: Chris.Barker at noaa.gov (Christopher Barker)
Date: Thu, 27 Aug 2009 15:13:30 -0700
Subject: [Numpy-discussion] What type should / return in python 3k when applied to two integer types?
In-Reply-To: 
References: <3d375d730908271227k3cea13b8ledd88a6ff8f21172@mail.gmail.com>
	<3d375d730908271246k509f1404s8c107e7e8cbb74b9@mail.gmail.com>
	<4A96F91F.20206@noaa.gov>
	<3d375d730908271426m7068c25fj10078a1b5546d8a9@mail.gmail.com>
Message-ID: <4A97050A.6040505@noaa.gov>

Charles R Harris wrote:
> I also intend to make it work with
>
> from future import division

doesn't already?

In [3]: from __future__ import division

In [5]: 3 / 4
Out[5]: 0.75

In [6]: import numpy as np

In [7]: np.array(3) / np.array(4)
Out[7]: 0.75

In [8]: np.array(3) // np.array(4)
Out[8]: 0

> I've also considered making that import the default for numpy

I'd like that, but it is a bit radical --

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov

From robert.kern at gmail.com  Thu Aug 27 18:21:34 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Thu, 27 Aug 2009 15:21:34 -0700
Subject: [Numpy-discussion] What type should / return in python 3k when applied to two integer types?
In-Reply-To: <4A97050A.6040505@noaa.gov>
References: <3d375d730908271227k3cea13b8ledd88a6ff8f21172@mail.gmail.com>
	<3d375d730908271246k509f1404s8c107e7e8cbb74b9@mail.gmail.com>
	<4A96F91F.20206@noaa.gov>
	<3d375d730908271426m7068c25fj10078a1b5546d8a9@mail.gmail.com>
	<4A97050A.6040505@noaa.gov>
Message-ID: <3d375d730908271521v25cdaa18j1a7dc4a3745c5e2c@mail.gmail.com>

On Thu, Aug 27, 2009 at 15:13, Christopher Barker wrote:
> Charles R Harris wrote:
>> I also intend to make it work with
>>
>> from future import division
>
> doesn't already?
>
> In [3]: from __future__ import division
>
> In [5]: 3 / 4
> Out[5]: 0.75
>
> In [6]: import numpy as np
>
> In [7]: np.array(3) / np.array(4)
> Out[7]: 0.75
>
> In [8]: np.array(3) // np.array(4)
> Out[8]: 0

Yes, the support for that feature is already there.

>> I've also considered making that import the default for numpy
>
> I'd like that, but it is a bit radical --

I don't think so. The policy just affects modules inside numpy, not
users of numpy.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

From robert.kern at gmail.com  Thu Aug 27 18:33:30 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Thu, 27 Aug 2009 15:33:30 -0700
Subject: [Numpy-discussion] Performance testing in unit tests
In-Reply-To: <20090827220334.GB21256@phare.normalesup.org>
References: <20090827220334.GB21256@phare.normalesup.org>
Message-ID: <3d375d730908271533s4166a5f4jef5ed95e7097d1c1@mail.gmail.com>

On Thu, Aug 27, 2009 at 15:03, Gael Varoquaux wrote:
> Hi list,
>
> This is slightly off topic, so please pardon me.
>
> I want to do performance testing. To be precise, I have a simple case: I
> want to check that 2 operations perform with a similar speed (so I am
> abstracted from the machine's performance).
>
> What would be the recommended way of timing the operation in a unit test?
> I use nose, if this is of any use. I am more than happy to be pointed to
> an example.

From my experience, doing performance tests inside of your normal test
suite is entirely unreliable. Performance testing requires rigorous
control over external factors that you cannot do inside of your test
suite. Your tests will fail when run in the entire test suite and pass
when run by themselves, or vice versa.

It can also be hard to run two similar tests serially thanks to any
number of caches that might be in effect, but this is often
manageable.

If you can manage that, then you can probably use nose or some other
framework to conveniently run individually named tests in a reasonably
controlled manner. There is not much unit test-specific to do, though.
You time your two code paths and compare them inside of a
test_function() just like you would do if you are writing an
independent benchmark script.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
-- Umberto Eco From charlesr.harris at gmail.com Thu Aug 27 19:00:35 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 27 Aug 2009 17:00:35 -0600 Subject: [Numpy-discussion] Efficiently defining a multidimensional array In-Reply-To: <3DA3B328CBC48B4EBB88484B8A5EA19106AF9E88@sernt14.essex.ac.uk> References: <3DA3B328CBC48B4EBB88484B8A5EA19106AF9E87@sernt14.essex.ac.uk> <3DA3B328CBC48B4EBB88484B8A5EA19106AF9E88@sernt14.essex.ac.uk> Message-ID: On Thu, Aug 27, 2009 at 3:32 PM, Citi, Luca wrote: > Or > a[:,:,None] + b[:,None,:] I think that is the way to go. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From terhorst at gmail.com Thu Aug 27 19:50:33 2009 From: terhorst at gmail.com (Jonathan T) Date: Thu, 27 Aug 2009 23:50:33 +0000 (UTC) Subject: [Numpy-discussion] Efficiently defining a multidimensional array References: <91b4b1ab0908271452m8de5aa5m6087447530f1cb31@mail.gmail.com> Message-ID: Perfect, that is exactly what I was looking for. Thanks to all who responded. There is one more problem which currently has me stumped. Same idea but slightly different effect: V[p,x,r] := C[p, E[p,x,r], r] This multidimensional array stuff is confusing but the time savings seem to be worth it (my arrays have several million entries.) If anyone has an idea I'd love to hear it. Thanks again! Jonathan From nmb at wartburg.edu Thu Aug 27 20:53:42 2009 From: nmb at wartburg.edu (Neil Martinsen-Burrell) Date: Thu, 27 Aug 2009 19:53:42 -0500 Subject: [Numpy-discussion] Efficiently defining a multidimensional array In-Reply-To: References: Message-ID: <4A972A96.6050006@wartburg.edu> On 2009-08-27 16:09 , Jonathan T wrote: > Hi, > > I want to define a 3-D array as the sum of two 2-D arrays as follows: > > C[x,y,z] := A[x,y] + B[x,z] > > My linear algebra is a bit rusty; is there a good way to do this that does not > require me to loop over x,y,z? Thanks! Numpy's broadcasting is ideal for this. Using None as an index in a slice adds a new axis with length 1 to an array. Then, the operation of addition broadcasts the arrays by repeating them across the singleton dimensions. So the above operation is C[x,y,z] = A[x,y,z] + B[x,y,z] where A is constant across its third dimension and B is constant across its second dimenision. For a concrete example: In [3]: A = np.arange(10).reshape((5,2)) In [4]: B = np.arange(15).reshape((5,3)) In [5]: C = A[:,:,None] + B[:,None,:] In [6]: C Out[6]: array([[[ 0, 1, 2], [ 1, 2, 3]], [[ 5, 6, 7], [ 6, 7, 8]], [[10, 11, 12], [11, 12, 13]], [[15, 16, 17], [16, 17, 18]], [[20, 21, 22], [21, 22, 23]]]) In [7]: C.shape Out[7]: (5, 2, 3) In [8]: Cprime = np.empty((5,2,3)) In [9]: for x in range(5): for y in range(2): for z in range(3): Cprime[x,y,z] = A[x,y] + B[x,z] ....: In [13]: (C == Cprime).all() Out[13]: True -Neil From d_l_goldsmith at yahoo.com Thu Aug 27 20:56:00 2009 From: d_l_goldsmith at yahoo.com (David Goldsmith) Date: Thu, 27 Aug 2009 17:56:00 -0700 (PDT) Subject: [Numpy-discussion] future directions In-Reply-To: <20090827214955.GG2963@zita2.kokkinizita.net> Message-ID: <712515.6455.qm@web52108.mail.re2.yahoo.com> --- On Thu, 8/27/09, Fons Adriaensen wrote: > 2. Adopting that format will make it even more important > to > clearly define in which cases data gets copied and when > not. > This should be based on some simple rules that can be > evaluated > by a code author without requiring a lookup in the > reference > docs each time. 
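A quick aside on Jonathan's follow-up question above, V[p,x,r] :=
C[p, E[p,x,r], r], which goes unanswered in this stretch of the digest: a
hedged sketch using broadcast index arrays (all sizes and names below are
invented for illustration, not taken from the thread):

import numpy as np

P, X, R, Y = 4, 3, 2, 5                 # made-up sizes
C = np.random.rand(P, Y, R)             # C[p, y, r]
E = np.random.randint(0, Y, (P, X, R))  # E[p,x,r] indexes C's middle axis

pi = np.arange(P)[:, None, None]        # shape (P, 1, 1)
ri = np.arange(R)[None, None, :]        # shape (1, 1, R)
V = C[pi, E, ri]                        # fancy indexing broadcasts to (P, X, R)

# check against the explicit triple loop
Vloop = np.empty((P, X, R))
for p in range(P):
    for x in range(X):
        for r in range(R):
            Vloop[p, x, r] = C[p, E[p, x, r], r]
print (V == Vloop).all()                # True

The three index arrays broadcast against each other, so each output element
V[p,x,r] picks C[p, E[p,x,r], r] without any Python-level loop.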
I think this is a _good_ idea (I don't know how easy/difficult it would be
to implement, though; perhaps the most difficult part would be the human
side, i.e., settling on a policy to implement.)

> 3. Finally remove all the redundancy and legacy stuff from
> the world of numerical Python. It is *very* confusing to a new
> user.

I like this also (but I also know that actually trying to achieve it would
ruffle a lot of feathers).

> 4. Ensure that each package deals with one problem area
> only. For example a package that (by its name) suggests it
> provides plotting facilities should provide only plotting
> facilities, and not spectra, averages of all sorts, etc.

I thought #3 was "Finally." ;-) Seriously though, can you be more specific
as to where you see this problem presently in NumPy? For example, NumPy
doesn't presently have a plotting package... (MatPlotLib is not a NumPy
package - it is an independent package that uses NumPy, but it is not part
of NumPy - picking nits, perhaps, but it's a nit I'm sure many would beg
to pick.)

> 5. Ensure some consistency in style. Some numerical Python
> packages use two-character function names, some have
> veryLongCamelCased names.

Again, where exactly do you see this problem presently in NumPy?

DG

> Just my two Eurocents of course.
>
> Ciao,
>
> --
> FA
>
> Io lo dico sempre: l'Italia è troppo stretta e lunga.
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>

From nmb at wartburg.edu  Thu Aug 27 21:28:20 2009
From: nmb at wartburg.edu (Neil Martinsen-Burrell)
Date: Thu, 27 Aug 2009 20:28:20 -0500
Subject: [Numpy-discussion] future directions
In-Reply-To: <712515.6455.qm@web52108.mail.re2.yahoo.com>
References: <712515.6455.qm@web52108.mail.re2.yahoo.com>
Message-ID: <4A9732B4.3030200@wartburg.edu>

On 2009-08-27 19:56 , David Goldsmith wrote:
> --- On Thu, 8/27/09, Fons Adriaensen wrote:
[...]
>> 3. Finally remove all the redundancy and legacy stuff from the world of
>> numerical Python. It is *very* confusing to a new user.
>
> I like this also (but I also know that actually trying to achieve it
> would ruffle a lot of feathers).

I think that feather ruffling is *not* the problem with this change. The
persistence of the idea that removing Numpy's legacy features will only be
an annoyance is inimical to the popularity of the whole Numpy project.
Numpy enjoys some of its ongoing popularity among active scientists
because of its stability and the ease of transition forward from Numeric.
Once scientists have working codes it is more than an annoyance to have
to change those codes. In some cases, it may be the motivation for people
to use other software packages.

I think that as we go forward it is important to balance not confusing
new users (a problem that can be addressed with better documentation and
pointing people to modern ways of doing things) with not alienating
existing users (who are in some cases influential in recruiting those new
users in the first place). For software developers, compatibility-breaking
changes seem like they call for just a few small tweaks to the code. For
scientists who work with software, those same changes may call for never
choosing Numpy again in the future. I think that this is a balance that
we should be aware of when introducing changes.
It makes sense that we will all see this balance differently, but I think
that we need to acknowledge that this is the essential tension in removing
cruft incompatibly.

-Neil

From charlesr.harris at gmail.com  Thu Aug 27 22:11:53 2009
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Thu, 27 Aug 2009 20:11:53 -0600
Subject: [Numpy-discussion] What type should / return in python 3k when applied to two integer types?
In-Reply-To: <3d375d730908271521v25cdaa18j1a7dc4a3745c5e2c@mail.gmail.com>
References: <3d375d730908271227k3cea13b8ledd88a6ff8f21172@mail.gmail.com>
	<3d375d730908271246k509f1404s8c107e7e8cbb74b9@mail.gmail.com>
	<4A96F91F.20206@noaa.gov>
	<3d375d730908271426m7068c25fj10078a1b5546d8a9@mail.gmail.com>
	<4A97050A.6040505@noaa.gov>
	<3d375d730908271521v25cdaa18j1a7dc4a3745c5e2c@mail.gmail.com>
Message-ID: 

On Thu, Aug 27, 2009 at 4:21 PM, Robert Kern wrote:

> On Thu, Aug 27, 2009 at 15:13, Christopher Barker wrote:
> > Charles R Harris wrote:
> >> I also intend to make it work with
> >>
> >> from future import division
> >
> > doesn't already?
> >
> > In [3]: from __future__ import division
> >
> > In [5]: 3 / 4
> > Out[5]: 0.75
> >
> > In [6]: import numpy as np
> >
> > In [7]: np.array(3) / np.array(4)
> > Out[7]: 0.75
> >
> > In [8]: np.array(3) // np.array(4)
> > Out[8]: 0
>
> Yes, the support for that feature is already there.
>
> >> I've also considered making that import the default for numpy
> >
> > I'd like that, but it is a bit radical --
>
> I don't think so. The policy just affects modules inside numpy, not
> users of numpy.
>

If we go to returning doubles we will have a backward compatibility
problem because the current true_divide returns float32 for short ints. I
see three options here.

1) Leave true_divide as is.
2) Leave true_divide as is, introduce slash_divide that always returns
doubles.
3) Change true_divide to always return doubles.

None of these options involve much more than a short edit of the
generate_umath.py module.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From josef.pktd at gmail.com  Thu Aug 27 23:37:02 2009
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Thu, 27 Aug 2009 23:37:02 -0400
Subject: [Numpy-discussion] histogram: sum up values in each bin
In-Reply-To: <1cd32cbb0908271027l75916349jf978ae21ab9c43d4@mail.gmail.com>
References: <270620220908270523h268998fcmd617f9557049b9ab@mail.gmail.com>
	<1cd32cbb0908270619t509c4b16je8427104eace3f82@mail.gmail.com>
	<1cd32cbb0908271027l75916349jf978ae21ab9c43d4@mail.gmail.com>
Message-ID: <1cd32cbb0908272037p418344b2nc91c596be2aff642@mail.gmail.com>

On Thu, Aug 27, 2009 at 1:27 PM, wrote:
> On Thu, Aug 27, 2009 at 12:49 PM, Tim Michelsen wrote:
>>> Tim, do you mean, that you want to apply other functions, e.g. mean or
>>> variance, to the original values but calculated per bin?
>> Sorry that I forgot to add this. Shame.
>>
>> I would like to apply these mathematical functions on the original values
>> stacked in the respective bins.
>>
>> For instance:
>>
>> The sample data measures the weight of an animal.
>>
>> 1) histogram gives a count of how many values are in each bin.
>>
>> I would like to calculate the average weight of all animals
>> sorted in bin1, bin2 etc.
>>
>> This is also useful where you have a time component.
>>
>> In Spreadsheets I would use a '=' to reference the original data and then
>> either sum it up or count it per class.
>>
>> I hope this is somehow understandable.
> Yes, it is a quite common use case for descriptive statistics, and I'm
> starting to collect different ways of doing it.
>
> In your case, Vincent's way is the easiest.
>
> If you need to be faster, or you want to apply the same classification
> also to other variables, e.g. the size of the animal, then creating a
> label array would be a more flexible solution.
>
> There was a similar thread recently on the scipy-user list for sorted
> arrays: "How to average different pieces of an array?"
>
> Josef
>
>> Thanks,
>> Timmie

Here is a version where bincount and histogram produce the same results
for mean and variance per bin if no bins are empty. If a bin is empty then
either some nans or some small arbitrary numbers are returned.

Josef

# incompletely tested if a bin has zero elements, nans or missing in variance

import numpy as np

x = np.random.normal(size=100) #+ 1e5 # + 1e8 to compare precision
c, b = np.histogram(x)

sortind = np.argsort(x)
reverse_sortind = np.argsort(sortind)
xsorted = x[sortind]
bind = np.searchsorted(xsorted,b,'right')

#construct label index
ind2 = np.zeros(x.shape, int)
ind2[bind[1:-1]] = 1 # assumes boundary indices are included in y
ind = ind2.cumsum()
labels = ind[reverse_sortind] # reverse sorting

print '\nmean'
means = np.bincount(ind,xsorted)*1.0/np.bincount(ind)
print means

count = np.bincount(labels)
means = np.bincount(labels,x)*1.0/count
print means

#compare mean with histogram
countsPerBin = np.histogram(x)[0]
sumsPerBin = np.histogram(x, weights=x)[0]
averagePerBin = sumsPerBin / countsPerBin
print averagePerBin

print '\nvariance'
meanarr = means[labels]
var = np.bincount(labels,(x-meanarr)**2)/count
print var

# with histogram
squaresums_perbin = np.histogram(x, weights=x**2)[0]
var_perbin = squaresums_perbin*1.0 / countsPerBin - averagePerBin**2
print var_perbin

print np.array(var) - np.array(var_perbin)

From gael.varoquaux at normalesup.org  Fri Aug 28 01:44:30 2009
From: gael.varoquaux at normalesup.org (Gael Varoquaux)
Date: Fri, 28 Aug 2009 07:44:30 +0200
Subject: [Numpy-discussion] Performance testing in unit tests
In-Reply-To: <3d375d730908271533s4166a5f4jef5ed95e7097d1c1@mail.gmail.com>
References: <20090827220334.GB21256@phare.normalesup.org>
	<3d375d730908271533s4166a5f4jef5ed95e7097d1c1@mail.gmail.com>
Message-ID: <20090828054430.GC16134@phare.normalesup.org>

On Thu, Aug 27, 2009 at 03:33:30PM -0700, Robert Kern wrote:
> From my experience, doing performance tests inside of your normal test
> suite is entirely unreliable. Performance testing requires rigorous
> control over external factors that you cannot do inside of your test
> suite. Your tests will fail when run in the entire test suite and pass
> when run by themselves, or vice versa.

> It can also be hard to run two similar tests serially thanks to any
> number of caches that might be in effect, but this is often
> manageable.

That is why it can be useful to repeat the measure several times, isn't
it?

> If you can manage that, then you can probably use nose or some other
> framework to conveniently run individually named tests in a reasonably
> controlled manner. There is not much unit test-specific to do, though.
> You time your two code paths and compare them inside of a
> test_function() just like you would do if you are writing an
> independent benchmark script.
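A minimal sketch of the kind of controlled comparison Robert describes,
using timeit's Timer.repeat and a best-of-N minimum (the two operations
compared here are stand-ins, not taken from this thread, and the tolerance
is arbitrary):

import timeit

def best_of(stmt, setup, repeat=5, number=10):
    # the minimum over several runs is least sensitive to background load
    return min(timeit.Timer(stmt, setup).repeat(repeat, number))

def test_similar_speed():
    setup = "import numpy as np; a = np.random.random(1000000)"
    t_sum = best_of("a.sum()", setup)
    t_red = best_of("np.add.reduce(a)", setup)
    # nose collects test_* functions; the 50% tolerance is a judgment call
    assert abs(t_sum - t_red) / t_red < 0.5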
OK, the following seems to give quite reproducible results:

import time
import hashlib
import nose.tools
import numpy as np

# timing procedure:
a = np.random.random(1000000)

time_hash = list()
for _ in range(3):
    t1 = time.time()
    hash(a)
    time_hash.append(time.time() - t1)
time_hash = min(time_hash)

time_hashlib = list()
for _ in range(3):
    t1 = time.time()
    hashlib.md5(a).hexdigest()
    time_hashlib.append(time.time() - t1)
time_hashlib = min(time_hashlib)

relative_diff = abs(time_hashlib - time_hash)/time_hashlib
nose.tools.assert_true(relative_diff < 0.05)

Thanks,

Gaël

From robert.kern at gmail.com  Fri Aug 28 01:48:55 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Fri, 28 Aug 2009 00:48:55 -0500
Subject: [Numpy-discussion] Performance testing in unit tests
In-Reply-To: <20090828054430.GC16134@phare.normalesup.org>
References: <20090827220334.GB21256@phare.normalesup.org>
	<3d375d730908271533s4166a5f4jef5ed95e7097d1c1@mail.gmail.com>
	<20090828054430.GC16134@phare.normalesup.org>
Message-ID: <3d375d730908272248g2b67c997xa5028da5896b3b97@mail.gmail.com>

On Fri, Aug 28, 2009 at 00:44, Gael Varoquaux wrote:
> On Thu, Aug 27, 2009 at 03:33:30PM -0700, Robert Kern wrote:
>> From my experience, doing performance tests inside of your normal test
>> suite is entirely unreliable. Performance testing requires rigorous
>> control over external factors that you cannot do inside of your test
>> suite. Your tests will fail when run in the entire test suite and pass
>> when run by themselves, or vice versa.
>
>> It can also be hard to run two similar tests serially thanks to any
>> number of caches that might be in effect, but this is often
>> manageable.
>
> That is why it can be useful to repeat the measure several times, isn't
> it?

If warm-cache performance is what you want to measure, yes.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

From johan.gronqvist at gmail.com  Fri Aug 28 02:28:01 2009
From: johan.gronqvist at gmail.com (Johan Grönqvist)
Date: Fri, 28 Aug 2009 08:28:01 +0200
Subject: [Numpy-discussion] future directions
In-Reply-To: <4A9732B4.3030200@wartburg.edu>
References: <712515.6455.qm@web52108.mail.re2.yahoo.com>
	<4A9732B4.3030200@wartburg.edu>
Message-ID: 

Neil Martinsen-Burrell wrote:
>
> The persistence of the idea that removing Numpy's legacy features will
> only be an annoyance is inimical to the popularity of the whole Numpy
> project. [...] Once scientists have working codes it is more than an
> annoyance to have to change those codes. In some cases, it may be the
> motivation for people to use other software packages.
> [...]
> For software developers,
> compatibility-breaking changes seem like they call for just a few small
> tweaks to the code. For scientists who work with software, those same
> changes may call for never choosing Numpy again in the future.
>

I very much agree, and similar (a bit worse, actually) behaviour in
another product is an important reason why I am trying to switch to numpy
(and I enjoy talking badly about that other product when appropriate).

If the proposed changes seem important, I would appreciate having a
namespace called numpy.legacy or numpy.deprecated or numpy.1dotX, that
retains all the old functions. That would only be a small annoyance (to
me) if importing the right thing could be handled in code when moving
between machines having different versions of numpy.
(something like

    from numpy import version
    if version > x.y:
        import numpy.legacy
    else:
        import numpy
)

All IMHO, my 2 cents etc.

Thanks

/ johan

From d_l_goldsmith at yahoo.com  Fri Aug 28 04:30:07 2009
From: d_l_goldsmith at yahoo.com (David Goldsmith)
Date: Fri, 28 Aug 2009 01:30:07 -0700 (PDT)
Subject: [Numpy-discussion] future directions
In-Reply-To: 
Message-ID: <646356.58574.qm@web52106.mail.re2.yahoo.com>

--- On Thu, 8/27/09, Johan Grönqvist wrote:

> If the proposed changes seem important, I would appreciate having a
> namespace called numpy.legacy or numpy.deprecated or numpy.1dotX, that
> retains all the old functions. That would only be a small annoyance (to
> me) if importing the right thing could be handled in code when moving
> between machines having different versions of numpy.
>
> (something like
>     from numpy import version
>     if version > x.y:
>         import numpy.legacy
>     else:
>         import numpy
> )
>
> All IMHO, my 2 cents etc.

No need to be H about it ;-) it sounds like a pretty good compromise IYAM
(but I won't be surprised by a deluge of explanations as to why it isn't).

DG

> Thanks
>
> / johan
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>

From aisaac at american.edu  Fri Aug 28 09:24:50 2009
From: aisaac at american.edu (Alan G Isaac)
Date: Fri, 28 Aug 2009 09:24:50 -0400
Subject: [Numpy-discussion] future directions
In-Reply-To: 
References: <712515.6455.qm@web52108.mail.re2.yahoo.com>
	<4A9732B4.3030200@wartburg.edu>
Message-ID: <4A97DAA2.9040006@american.edu>

> Neil Martinsen-Burrell wrote:
>> The persistence of the idea that removing Numpy's legacy features will
>> only be an annoyance is inimical to the popularity of the whole Numpy
>> project. [...] Once scientists have working codes it is more than an
>> annoyance to have to change those codes. In some cases, it may be the
>> motivation for people to use other software packages.

On 8/28/2009 2:28 AM Johan Grönqvist apparently wrote:
> I very much agree

I'm just a user but I've read the NumPy and SciPy lists for years.
Although this idea keeps resurfacing, I do not have the sense that there
is anything close to developer unanimity that this is a good idea. Note
that over the years proposals to *add* functionality to NumPy keep
resurfacing too, with similar healthy inertia.

I will speculate that the outcome will eventually be that something like
ndarray will become part of the standard library, satisfying users who
just want that, and that NumPy will continue to provide its current (and
therefore legacy) functionality. That is just speculation by a user, but
the new buffer interface does seem to lay the groundwork.

Alan Isaac

From ndbecker2 at gmail.com  Fri Aug 28 09:46:39 2009
From: ndbecker2 at gmail.com (Neal Becker)
Date: Fri, 28 Aug 2009 09:46:39 -0400
Subject: [Numpy-discussion] What type should / return in python 3k when applied to two integer types?
References: <3d375d730908271227k3cea13b8ledd88a6ff8f21172@mail.gmail.com>
	<3d375d730908271246k509f1404s8c107e7e8cbb74b9@mail.gmail.com>
	<4A96F91F.20206@noaa.gov>
	<3d375d730908271426m7068c25fj10078a1b5546d8a9@mail.gmail.com>
Message-ID: 

Robert Kern wrote:

> On Thu, Aug 27, 2009 at 14:22, Christopher Barker wrote:
>
>> By the way -- is there something about py3k that changes all this? Or is
>> this just an opportunity to perhaps make some backward-incompatible
>> changes to numpy?
> > Python 3 makes the promised change of int/int => float. > Does that mean that we want numpy to do the same? I'm not so sure. Sounds like opening a can of worms (numpy has more types to worry about than just int and float. If we start playing strange games we may regret it.) From david.huard at gmail.com Fri Aug 28 09:50:58 2009 From: david.huard at gmail.com (David Huard) Date: Fri, 28 Aug 2009 09:50:58 -0400 Subject: [Numpy-discussion] Fortran reader for npy files Message-ID: <91cf711d0908280650j35388cb9tba360adf2763085e@mail.gmail.com> Hi, Has someone written a fortran reader for the "npy" binary files numpy.save creates ? Thanks, David -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Fri Aug 28 09:55:19 2009 From: pav at iki.fi (Pauli Virtanen) Date: Fri, 28 Aug 2009 13:55:19 +0000 (UTC) Subject: [Numpy-discussion] What type should / return in python 3k when applied to two integer types? References: <3d375d730908271227k3cea13b8ledd88a6ff8f21172@mail.gmail.com> <3d375d730908271246k509f1404s8c107e7e8cbb74b9@mail.gmail.com> <4A96F91F.20206@noaa.gov> <3d375d730908271426m7068c25fj10078a1b5546d8a9@mail.gmail.com> Message-ID: Fri, 28 Aug 2009 09:46:39 -0400, Neal Becker kirjoitti: > Robert Kern wrote: > >> On Thu, Aug 27, 2009 at 14:22, Christopher >> Barker wrote: >> >>> By the way -- is there something about py3k that changes all this? Or >>> is this just an opportunity to perhaps make some backward-incompatible >>> changes to numpy? >> >> Python 3 makes the promised change of int/int => float. > > Does that mean that we want numpy to do the same? I'm not so sure. > Sounds like opening a can of worms (numpy has more types to worry about > than just int and float. If we start playing strange games we may > regret it.) I believe we want to. This is not really a strange trick: it's just that in Python 3, the operator / is true_division, and // is floor_division. I believe any worms released by this are mostly small and tasty... The main issue is probably just choosing an appropriate float return type, and personally I believe this should be same as numpy's default float. -- Pauli Virtanen From josef.pktd at gmail.com Fri Aug 28 10:08:28 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 28 Aug 2009 10:08:28 -0400 Subject: [Numpy-discussion] What type should / return in python 3k when applied to two integer types? In-Reply-To: References: <3d375d730908271227k3cea13b8ledd88a6ff8f21172@mail.gmail.com> <3d375d730908271246k509f1404s8c107e7e8cbb74b9@mail.gmail.com> <4A96F91F.20206@noaa.gov> <3d375d730908271426m7068c25fj10078a1b5546d8a9@mail.gmail.com> Message-ID: <1cd32cbb0908280708l3501f6ffnc29ead80111e75ca@mail.gmail.com> On Fri, Aug 28, 2009 at 9:55 AM, Pauli Virtanen wrote: > Fri, 28 Aug 2009 09:46:39 -0400, Neal Becker kirjoitti: > >> Robert Kern wrote: >> >>> On Thu, Aug 27, 2009 at 14:22, Christopher >>> Barker wrote: >>> >>>> By the way -- is there something about py3k that changes all this? Or >>>> is this just an opportunity to perhaps make some backward-incompatible >>>> changes to numpy? >>> >>> Python 3 makes the promised change of int/int => float. >> >> Does that mean that we want numpy to do the same? ?I'm not so sure. >> Sounds like opening a can of worms (numpy has more types to worry about >> than just int and float. ?If we start playing strange games we may >> regret it.) > > I believe we want to. 
This is not really a strange trick: it's just that > in Python 3, the operator / is true_division, and // is floor_division. > I believe any worms released by this are mostly small and tasty... > > The main issue is probably just choosing an appropriate float return > type, and personally I believe this should be same as numpy's default > float. and getting the infs and nans as in true float division not as in np.true_divide Josef > > -- > Pauli Virtanen > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From charlesr.harris at gmail.com Fri Aug 28 10:16:09 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 28 Aug 2009 08:16:09 -0600 Subject: [Numpy-discussion] What type should / return in python 3k when applied to two integer types? In-Reply-To: <1cd32cbb0908280708l3501f6ffnc29ead80111e75ca@mail.gmail.com> References: <3d375d730908271227k3cea13b8ledd88a6ff8f21172@mail.gmail.com> <3d375d730908271246k509f1404s8c107e7e8cbb74b9@mail.gmail.com> <4A96F91F.20206@noaa.gov> <3d375d730908271426m7068c25fj10078a1b5546d8a9@mail.gmail.com> <1cd32cbb0908280708l3501f6ffnc29ead80111e75ca@mail.gmail.com> Message-ID: On Fri, Aug 28, 2009 at 8:08 AM, wrote: > On Fri, Aug 28, 2009 at 9:55 AM, Pauli Virtanen wrote: > > Fri, 28 Aug 2009 09:46:39 -0400, Neal Becker kirjoitti: > > > >> Robert Kern wrote: > >> > >>> On Thu, Aug 27, 2009 at 14:22, Christopher > >>> Barker wrote: > >>> > >>>> By the way -- is there something about py3k that changes all this? Or > >>>> is this just an opportunity to perhaps make some backward-incompatible > >>>> changes to numpy? > >>> > >>> Python 3 makes the promised change of int/int => float. > >> > >> Does that mean that we want numpy to do the same? I'm not so sure. > >> Sounds like opening a can of worms (numpy has more types to worry about > >> than just int and float. If we start playing strange games we may > >> regret it.) > > > > I believe we want to. This is not really a strange trick: it's just that > > in Python 3, the operator / is true_division, and // is floor_division. > > I believe any worms released by this are mostly small and tasty... > > > > The main issue is probably just choosing an appropriate float return > > type, and personally I believe this should be same as numpy's default > > float. > > and getting the infs and nans as in true float division not as in > np.true_divide > Umm, good point. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From ndbecker2 at gmail.com Fri Aug 28 10:46:45 2009 From: ndbecker2 at gmail.com (Neal Becker) Date: Fri, 28 Aug 2009 10:46:45 -0400 Subject: [Numpy-discussion] What type should / return in python 3k when applied to two integer types? References: <3d375d730908271227k3cea13b8ledd88a6ff8f21172@mail.gmail.com> <3d375d730908271246k509f1404s8c107e7e8cbb74b9@mail.gmail.com> <4A96F91F.20206@noaa.gov> <3d375d730908271426m7068c25fj10078a1b5546d8a9@mail.gmail.com> <1cd32cbb0908280708l3501f6ffnc29ead80111e75ca@mail.gmail.com> Message-ID: Charles R Harris wrote: > On Fri, Aug 28, 2009 at 8:08 AM, wrote: > >> On Fri, Aug 28, 2009 at 9:55 AM, Pauli Virtanen wrote: >> > Fri, 28 Aug 2009 09:46:39 -0400, Neal Becker kirjoitti: >> > >> >> Robert Kern wrote: >> >> >> >>> On Thu, Aug 27, 2009 at 14:22, Christopher >> >>> Barker wrote: >> >>> >> >>>> By the way -- is there something about py3k that changes all this? 
>> >>>> Or is this just an opportunity to perhaps make some >> >>>> backward-incompatible changes to numpy? >> >>> >> >>> Python 3 makes the promised change of int/int => float. >> >> >> >> Does that mean that we want numpy to do the same? I'm not so sure. >> >> Sounds like opening a can of worms (numpy has more types to worry >> >> about >> >> than just int and float. If we start playing strange games we may >> >> regret it.) >> > >> > I believe we want to. This is not really a strange trick: it's just >> > that in Python 3, the operator / is true_division, and // is >> > floor_division. I believe any worms released by this are mostly small >> > and tasty... >> > >> > The main issue is probably just choosing an appropriate float return >> > type, and personally I believe this should be same as numpy's default >> > float. >> >> and getting the infs and nans as in true float division not as in >> np.true_divide >> > > Umm, good point. > > Chuck explicit is better than implicit. IMO, if I want int/int-> float, I should ask for it explicitly, by casting the ints to float first (in numpy, that would be using astype). From aisaac at american.edu Fri Aug 28 10:53:58 2009 From: aisaac at american.edu (Alan G Isaac) Date: Fri, 28 Aug 2009 10:53:58 -0400 Subject: [Numpy-discussion] What type should / return in python 3k when applied to two integer types? In-Reply-To: References: <3d375d730908271227k3cea13b8ledd88a6ff8f21172@mail.gmail.com> <3d375d730908271246k509f1404s8c107e7e8cbb74b9@mail.gmail.com> <4A96F91F.20206@noaa.gov> <3d375d730908271426m7068c25fj10078a1b5546d8a9@mail.gmail.com> <1cd32cbb0908280708l3501f6ffnc29ead80111e75ca@mail.gmail.com> Message-ID: <4A97EF86.7080201@american.edu> On 8/28/2009 10:46 AM Neal Becker apparently wrote: > explicit is better than implicit. IMO, if I want int/int-> float, I should > ask for it explicitly, by casting the ints to float first (in numpy, that > would be using astype). Aren't you begging the question? Nobody is suggesting int//int -> float. The question is: what is the meaning of `/`. Adopting a Python 3 compatible meaning is forward looking. Once we agree on the meaning, it *is* explicit, as is int//int. Alan Isaac From josef.pktd at gmail.com Fri Aug 28 10:54:25 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 28 Aug 2009 10:54:25 -0400 Subject: [Numpy-discussion] What type should / return in python 3k when applied to two integer types? In-Reply-To: References: <3d375d730908271246k509f1404s8c107e7e8cbb74b9@mail.gmail.com> <4A96F91F.20206@noaa.gov> <3d375d730908271426m7068c25fj10078a1b5546d8a9@mail.gmail.com> <1cd32cbb0908280708l3501f6ffnc29ead80111e75ca@mail.gmail.com> Message-ID: <1cd32cbb0908280754w4a003f3ck13c52132a7fa861b@mail.gmail.com> On Fri, Aug 28, 2009 at 10:46 AM, Neal Becker wrote: > Charles R Harris wrote: > >> On Fri, Aug 28, 2009 at 8:08 AM, wrote: >> >>> On Fri, Aug 28, 2009 at 9:55 AM, Pauli Virtanen wrote: >>> > Fri, 28 Aug 2009 09:46:39 -0400, Neal Becker kirjoitti: >>> > >>> >> Robert Kern wrote: >>> >> >>> >>> On Thu, Aug 27, 2009 at 14:22, Christopher >>> >>> Barker wrote: >>> >>> >>> >>>> By the way -- is there something about py3k that changes all this? >>> >>>> Or is this just an opportunity to perhaps make some >>> >>>> backward-incompatible changes to numpy? >>> >>> >>> >>> Python 3 makes the promised change of int/int => float. >>> >> >>> >> Does that mean that we want numpy to do the same? ?I'm not so sure. 
>>> >> Sounds like opening a can of worms (numpy has more types to worry >>> >> about >>> >> than just int and float. ?If we start playing strange games we may >>> >> regret it.) >>> > >>> > I believe we want to. This is not really a strange trick: it's just >>> > that in Python 3, the operator / is true_division, and // is >>> > floor_division. I believe any worms released by this are mostly small >>> > and tasty... >>> > >>> > The main issue is probably just choosing an appropriate float return >>> > type, and personally I believe this should be same as numpy's default >>> > float. >>> >>> and getting the infs and nans as in true float division not as in >>> np.true_divide >>> >> >> Umm, good point. >> >> Chuck > > explicit is better than implicit. ?IMO, if I want int/int-> float, I should > ask for it explicitly, by casting the ints to float first (in numpy, that > would be using astype). if "/" has a completely different meaning in numpy than in python, it will be a lot of work keeping track of whether you are working with numpy ints or python ints, a/b = ? Josef > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From oliphant at enthought.com Fri Aug 28 11:06:27 2009 From: oliphant at enthought.com (Travis Oliphant) Date: Fri, 28 Aug 2009 10:06:27 -0500 Subject: [Numpy-discussion] Merging datetime branch In-Reply-To: References: <2DF15AC0-1F31-40E9-A55D-B51D75F63093@gmail.com> Message-ID: <0911B4A7-5BC1-4031-9FB9-B83A309350B6@enthought.com> On Aug 25, 2009, at 2:21 PM, Charles R Harris wrote: > > > On Tue, Aug 25, 2009 at 1:05 PM, Pierre GM > wrote: > > On Aug 25, 2009, at 1:59 PM, Skipper Seabold wrote: > > > On Tue, Aug 25, 2009 at 1:51 PM, Charles R > > Harris wrote: > >> Hi Travis, > >> > >> The new parse_datetime.c file contains a lot of c++ style comments > >> that > >> should be fixed. Also, the new test for mirr is failing on all the > >> buildbots. > > Comments sent to Marty who wrote the parse_datetime.c as part of his > GSoC: Marty, I guess you have a bit of cleaning up to do. > (As a snarky side note, Marty posted on the list a few weeks ago > asking just for this kind of comments... But all is well and better > late than never.) > > My bad, then, I missed it. So let me add > > 1) Because the default compilation is to include all the files in a > master file, the local defines should be undef'ed at the end to > avoid namespace pollution. > > 2) Never do this: > > if (bug) return -1; > > or this > > if (bug) {blah; blah;} > > do it this way > > if (bug) { > return -1; > } > > The last is more for Travis in the most recent commit ;) > Thanks for the reminders and the review. I've been busy on the datetime branch (trying to merge Marty's code which is where all the C++ comments come from). I've changed a lot of the stylistic differences in Marty's code (not sure if I've got them all). I doubt I will have time to be pedantic, but will welcome any such changes from others. While there are a couple of features that need to be added (coercion between two date-time datatypes is one big one), and a whole lot of tests that need to be added for the datetime support. I think it's ready to merge back to the mainline trunk so it can be a part of the development toward 1.4 Let me know if anyone has any big changes to trunk that are going to occur today. Thanks, -Travis -------------- next part -------------- An HTML attachment was scrubbed... 
From lists at informa.tiker.net  Fri Aug 28 11:03:44 2009
From: lists at informa.tiker.net (Andreas Klöckner)
Date: Fri, 28 Aug 2009 11:03:44 -0400
Subject: [Numpy-discussion] [ANN] PyOpenCL 0.90 - a Python interface for
	OpenCL
Message-ID: <200908281103.45128.lists@informa.tiker.net>

What is it?
-----------

PyOpenCL makes the industry-standard OpenCL compute abstraction available
from Python. PyOpenCL has been tested to work with AMD's and Nvidia's OpenCL
implementations and allows complete access to all features of the standard,
from a nice, Pythonic interface.

Where can I get it?
-------------------

Homepage: http://mathema.tician.de/software/pyopencl
Download: http://pypi.python.org/pypi/pyopencl
Documentation: http://documen.tician.de/pyopencl
Wiki: http://wiki.tiker.net/PyOpenCL

Main Features
-------------

* Object cleanup tied to lifetime of objects. This idiom, often called RAII
in C++, makes it much easier to write correct, leak- and crash-free code.

* Completeness. PyOpenCL puts the full power of OpenCL's API at your
disposal, if you wish. Every obscure get_info() query and all CL calls are
accessible.

* Automatic Error Checking. All errors are automatically translated into
Python exceptions.

* Speed. PyOpenCL's base layer is written in C++, so all the niceties above
are virtually free.

* Helpful, complete documentation.

If that sounds similar to PyOpenCL's sister project PyCUDA [1], that is not
entirely a coincidence. :)

License
-------

PyOpenCL is open-source under the MIT/X11 license and free for commercial,
academic, and private use.

Andreas

[1] http://mathema.tician.de/software/pycuda
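To give a flavor of the API, a minimal "double every element" sketch is
below. It is written against later PyOpenCL documentation, so treat the
exact names (create_some_context, enqueue_copy) as assumptions -- the
0.90 release may spell some of them differently:

import numpy as np
import pyopencl as cl

ctx = cl.create_some_context()        # pick an OpenCL device
queue = cl.CommandQueue(ctx)

a = np.arange(256, dtype=np.float32)
mf = cl.mem_flags
a_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=a)
out_buf = cl.Buffer(ctx, mf.WRITE_ONLY, a.nbytes)

prg = cl.Program(ctx, """
    __kernel void twice(__global const float *a, __global float *out)
    {
        int i = get_global_id(0);
        out[i] = 2.0f * a[i];
    }
""").build()

prg.twice(queue, a.shape, None, a_buf, out_buf)

result = np.empty_like(a)
cl.enqueue_copy(queue, result, out_buf)  # older releases: enqueue_read_buffer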
From jh at physics.ucf.edu  Fri Aug 28 11:26:14 2009
From: jh at physics.ucf.edu (Joe Harrington)
Date: Fri, 28 Aug 2009 11:26:14 -0400
Subject: [Numpy-discussion] future directions
In-Reply-To:  (numpy-discussion-request@scipy.org)
References: 
Message-ID: 

> ...numpy clean-up...
> ...cruft...
> ...API breakage...
> ...etc....

At the risk of starting a flame war, the cleanest way out of the
legacy API trap is some level of fork, with the old code maintained
for some years while new uses (new users and new code by old users)
get done in the new package, which is named differently.

The reader will note that "fork" is a four-letter word, particularly
in this community.  However, there are two natural forklets coming
up.

The first is Python 3.0, which will necessitate some API changes.  A
numpy 2.0 that used Python 3.0 could have significant API breaks and
might include a cleanup.  Yet, there would still be resistance to
that level of API breakage, and I can't say I'd be on the breaking
side of that debate.  In any event, a Python 2.x branch would need to
be maintained unless the Python 3.x branch was completely backward
compatible, which I am not sure it will be.

I won't claim to be Yoda, but there is another.  There seems to be
consensus that something like ndarray should go into the Python
language.  There is also an increasing amount of talk about putting
numpy itself into the core "eventually", and increasing agreement
from the mainstream Python community, too.  The resistance from that
side seems to be 1) we (the Python community) don't have the
numerical expertise to maintain it, 2) it's not clean enough, and
sometimes 3) it's too big.

So, it may be worthwhile making a cleaned-up successor to numpy, with
a different name, that is intended for inclusion in Python.  That
would answer objection 2.  I think that we have demonstrated the
strength of this community sufficiently to say that we'll maintain
that part of Python as if it were our own, because it would be, and
that such participation would be sufficient.  The size problem, if
there is one, gets helped a lot by removal of all the cruft.
Anything useful from the original package that didn't go into Python
would go into the scipy package.  We'd maintain the old numpy for a
few years at least.

To do any of these right, I think we need to establish a more formal
mechanism for proposing, discussing, and deciding API issues, and
then stick to it like glue.  Failing that, we should not even attempt
a cleanup, as it will just be one more dirty fork.  Following the full
PEP procedure would be appropriate for developing a Python language
candidate package.

--jh--

From charlesr.harris at gmail.com  Fri Aug 28 11:45:22 2009
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Fri, 28 Aug 2009 09:45:22 -0600
Subject: [Numpy-discussion] Merging datetime branch
In-Reply-To: <0911B4A7-5BC1-4031-9FB9-B83A309350B6@enthought.com>
References: <2DF15AC0-1F31-40E9-A55D-B51D75F63093@gmail.com>
	<0911B4A7-5BC1-4031-9FB9-B83A309350B6@enthought.com>
Message-ID: 

On Fri, Aug 28, 2009 at 9:06 AM, Travis Oliphant wrote:

> On Aug 25, 2009, at 2:21 PM, Charles R Harris wrote:
>
> On Tue, Aug 25, 2009 at 1:05 PM, Pierre GM wrote:
>
>> On Aug 25, 2009, at 1:59 PM, Skipper Seabold wrote:
>>
>> > On Tue, Aug 25, 2009 at 1:51 PM, Charles R
>> > Harris wrote:
>> >> Hi Travis,
>> >>
>> >> The new parse_datetime.c file contains a lot of c++ style comments
>> >> that
>> >> should be fixed. Also, the new test for mirr is failing on all the
>> >> buildbots.
>>
>> Comments sent to Marty who wrote the parse_datetime.c as part of his
>> GSoC: Marty, I guess you have a bit of cleaning up to do.
>> (As a snarky side note, Marty posted on the list a few weeks ago
>> asking for exactly this kind of comment... But all is well and better
>> late than never.)
>
> My bad, then, I missed it. So let me add
>
> 1) Because the default compilation is to include all the files in a master
> file, the local defines should be undef'ed at the end to avoid namespace
> pollution.
>
> 2) Never do this:
>
> if (bug) return -1;
>
> or this
>
> if (bug) {blah; blah;}
>
> do it this way
>
> if (bug) {
>     return -1;
> }
>
> The last is more for Travis in the most recent commit ;)
>
> Thanks for the reminders and the review.
>
> I've been busy on the datetime branch (trying to merge Marty's code, which
> is where all the C++ comments come from).  I've changed a lot of the
> stylistic differences in Marty's code (not sure if I've got them all). I
> doubt I will have time to be pedantic, but will welcome any such changes
> from others.
>
> While there are a couple of features that still need to be added (coercion
> between two date-time datatypes is one big one) and a whole lot of tests
> that need to be written for the datetime support, I think it's ready to
> merge back to the mainline trunk so it can be a part of the development
> toward 1.4.
>
> Let me know if anyone has any big changes to trunk that are going to occur
> today.
>

I don't plan on any, but there have been changes since early June... Please
be careful to have the ifdef NPY_PY3K bits in the type object
initializations.
Thanks

Chuck

From lciti at essex.ac.uk  Fri Aug 28 11:47:52 2009
From: lciti at essex.ac.uk (Citi, Luca)
Date: Fri, 28 Aug 2009 16:47:52 +0100
Subject: [Numpy-discussion] What type should / return in python 3k when
	applied to two integer types?
References: <3d375d730908271227k3cea13b8ledd88a6ff8f21172@mail.gmail.com>
	<3d375d730908271246k509f1404s8c107e7e8cbb74b9@mail.gmail.com>
	<4A96F91F.20206@noaa.gov>
	<3d375d730908271426m7068c25fj10078a1b5546d8a9@mail.gmail.com>
Message-ID: <3DA3B328CBC48B4EBB88484B8A5EA19106AF9E8C@sernt14.essex.ac.uk>

> The main issue is probably just choosing an appropriate float return
> type, and personally I believe this should be same as numpy's default
> float.
I completely agree.

Maybe we could let the user decide whether to use a different type.
It is already somehow possible through the "out" argument.
>>> np.true_divide(a, b, np.empty(a.shape, dtype=np.float32))
but clearly a bit clumsy.

Alan Isaac suggested:
"""
Is there a reason not to make the `out` argument a keyword argument
and then also alternatively allow a dtype specification?
"""

As one needs to specify EITHER the output array OR the type, is it
possible to use a type as the "out" argument?  Something like:
>>> np.true_divide(a, b, np.float32)
that, instead of raising "return arrays must be of ArrayType", could,
if "out" is a valid type, create the corresponding array, use it and
return it.
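Luca's idea is easy to prototype in pure Python.  A minimal sketch (the
name true_divide_as and its dispatch rule are illustrative only, not an
existing numpy API):

import numpy as np

def true_divide_as(a, b, out_or_dtype=None):
    # Accept either an output array (numpy's usual `out`) or a type.
    if isinstance(out_or_dtype, (type, np.dtype)):
        shape = np.broadcast(np.asarray(a), np.asarray(b)).shape
        return np.true_divide(a, b, np.empty(shape, dtype=out_or_dtype))
    if out_or_dtype is None:
        return np.true_divide(a, b)
    return np.true_divide(a, b, out_or_dtype)

# e.g. true_divide_as(np.arange(4), 2, np.float32).dtype -> float32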
From Chris.Barker at noaa.gov  Fri Aug 28 12:15:53 2009
From: Chris.Barker at noaa.gov (Christopher Barker)
Date: Fri, 28 Aug 2009 09:15:53 -0700
Subject: [Numpy-discussion] future directions
In-Reply-To: 
References: 
Message-ID: <4A9802B9.3080705@noaa.gov>

Joe Harrington wrote:
> However, there are two natural forklets coming up.
>
> The first is Python 3.0, which will necessitate some API changes.

Absolutely! This seems like a no-brainer. I don't think we are talking
about really major changes to the numpy API anyway, generally clean-up,
and there is no way anyone is going to get their Py2 code working on Py3
without tweaking it anyway, so this is the time to do it.

Like it or not, Python2 and Python3 are both going to be around for a
while, so numpy2 and numpy3 will also, but the broader Python community
made a choice to make a transition, so we should take the opportunity to
do so as well. We'll have plenty to argue about even if we do decide
that backward compatibility is not a goal!

> There seems to be consensus that something like ndarray should go into
> the Python language.

I know I think so, but I think one of the big issues is how much of
numpy goes in. The basic nd data container with slicing and dicing at
least, but what about ufuncs? and ???

> The resistance
> from that side seems to be 1) we (the Python community) don't have the
> numerical expertise to maintain it, 2) it's not clean enough, and
> sometimes 3) it's too big.

and 4) the pace of change in numpy is too great. That last one may be
soluble, as I think that the nd-array object itself is a lot more stable
than the rest of numpy.

> So, it may be worthwhile making a cleaned-up successor to numpy, with
> a different name, that is intended for inclusion in Python.

I think the cleaned-up successor is a great idea with or without this
intention -- but I do like the idea -- it might help guide what really
belongs in numpy, and what in add-on packages.

> Following the full
> PEP procedure

or a parallel NPEP system.

If nothing else, it would be nice to have better documentation for
decisions than the mailing list archive and occasional wiki pages.

long live numpy3k!

-Chris

-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov

From denis-bz-py at t-online.de  Fri Aug 28 12:14:36 2009
From: denis-bz-py at t-online.de (denis bzowy)
Date: Fri, 28 Aug 2009 16:14:36 +0000 (UTC)
Subject: [Numpy-discussion] a[j, k], clipping j k to the edge if they're
	1 off ?
Message-ID: 

Folks,
  I want to index a[j,k], clipping j or k to the edge if they're 1 off --

def aget( a, j, k ):
    """ -> a[j,k] or a[edge] """
    # try:
    #     return a[j,k]  -- nope, -1
    # except IndexError:
    m, n = a.shape
    return a[ min(max(j, 0), m-1), min(max(k, 0), n-1) ]

This works but is both ugly and 5x slower than plain a[j][k].
Is there a better way ?
(Sorry if this is a duplicate, must come up often.)

cheers
  -- denis

From robert.kern at gmail.com  Fri Aug 28 12:18:41 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Fri, 28 Aug 2009 09:18:41 -0700
Subject: [Numpy-discussion] future directions
In-Reply-To: <4A9802B9.3080705@noaa.gov>
References: <4A9802B9.3080705@noaa.gov>
Message-ID: <3d375d730908280918i4ea79b8foee043ed1bc8d0ec@mail.gmail.com>

On Fri, Aug 28, 2009 at 09:15, Christopher Barker wrote:
> Joe Harrington wrote:
>> However, there are two natural forklets coming up.
>>
>> The first is Python 3.0, which will necessitate some API changes.
>
> Absolutely! This seems like a no-brainer. I don't think we are talking
> about really major changes to the numpy API anyway, generally clean-up,
> and there is no way anyone is going to get their Py2 code working on Py3
> without tweaking it anyway, so this is the time to do it.

No, it is the *worst* time to do it. We have been asked by the Python
developer team *not* to use the Python 3 transition to break all kinds
of other backwards compatibility. If we and other libraries do that,
people will simply not transition to Python 3. Or they will transition
to another language that they think is more stable.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
 -- Umberto Eco

From robert.kern at gmail.com  Fri Aug 28 12:28:18 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Fri, 28 Aug 2009 09:28:18 -0700
Subject: [Numpy-discussion] a[j, k], clipping j k to the edge if they're
	1 off ?
In-Reply-To: 
References: 
Message-ID: <3d375d730908280928y13160d1bo2042b2f1e33c7552@mail.gmail.com>

On Fri, Aug 28, 2009 at 09:14, denis bzowy wrote:
> Folks,
>
>  I want to index a[j,k], clipping j or k to the edge if they're 1 off --
>
> def aget( a, j, k ):
>     """ -> a[j,k] or a[edge] """
>     # try:
>     #     return a[j,k]  -- nope, -1
>     # except IndexError:
>     m, n = a.shape
>     return a[ min(max(j, 0), m-1), min(max(k, 0), n-1) ]
>
> This works but is both ugly and 5x slower than plain a[j][k].
> Is there a better way ?

Nope.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
 -- Umberto Eco
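For scalar j, k the min/max clamp is about as direct as it gets, but if
the indices come as arrays the same idea vectorizes with np.clip -- a
small sketch:

import numpy as np

def aget_clipped(a, j, k):
    # Clamp (possibly array-valued) indices to the valid range,
    # then fancy-index; works for scalars too.
    m, n = a.shape
    return a[np.clip(j, 0, m - 1), np.clip(k, 0, n - 1)]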
From charlesr.harris at gmail.com  Fri Aug 28 12:31:44 2009
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Fri, 28 Aug 2009 10:31:44 -0600
Subject: [Numpy-discussion] What type should / return in python 3k when
	applied to two integer types?
In-Reply-To: <3DA3B328CBC48B4EBB88484B8A5EA19106AF9E8C@sernt14.essex.ac.uk>
References: <3d375d730908271227k3cea13b8ledd88a6ff8f21172@mail.gmail.com>
	<3d375d730908271246k509f1404s8c107e7e8cbb74b9@mail.gmail.com>
	<4A96F91F.20206@noaa.gov>
	<3d375d730908271426m7068c25fj10078a1b5546d8a9@mail.gmail.com>
	<3DA3B328CBC48B4EBB88484B8A5EA19106AF9E8C@sernt14.essex.ac.uk>
Message-ID: 

On Fri, Aug 28, 2009 at 9:47 AM, Citi, Luca wrote:

> > The main issue is probably just choosing an appropriate float return
> > type, and personally I believe this should be same as numpy's default
> > float.
> I completely agree.
>
> Maybe we could let the user decide whether to use a different type.
> It is already somehow possible through the "out" argument.
> >>> np.true_divide(a, b, np.empty(a.shape, dtype=np.float32))
> but clearly a bit clumsy.
>

The numpy true_divide function can be changed at runtime through the
python interface; one isn't stuck with the defaults for any of the
python parsed numeric methods. However, changing the defaults might
lead to code portability problems. I think it was a bad idea to have
that facility...

Chuck

From oliphant at enthought.com  Fri Aug 28 12:36:28 2009
From: oliphant at enthought.com (Travis Oliphant)
Date: Fri, 28 Aug 2009 11:36:28 -0500
Subject: [Numpy-discussion] What type should / return in python 3k when
	applied to two integer types?
In-Reply-To: 
References: <3d375d730908271227k3cea13b8ledd88a6ff8f21172@mail.gmail.com>
	<3d375d730908271246k509f1404s8c107e7e8cbb74b9@mail.gmail.com>
	<4A96F91F.20206@noaa.gov>
	<3d375d730908271426m7068c25fj10078a1b5546d8a9@mail.gmail.com>
	<3DA3B328CBC48B4EBB88484B8A5EA19106AF9E8C@sernt14.essex.ac.uk>
Message-ID: <4E47E895-D7CB-4BFA-919F-5584D0C6A4A2@enthought.com>

On Aug 28, 2009, at 11:31 AM, Charles R Harris wrote:

> On Fri, Aug 28, 2009 at 9:47 AM, Citi, Luca wrote:
>
> > The main issue is probably just choosing an appropriate float return
> > type, and personally I believe this should be same as numpy's
> default
> > float.
> I completely agree.
>
> Maybe we could let the user decide whether to use a different type.
> It is already somehow possible through the "out" argument.
> >>> np.true_divide(a, b, np.empty(a.shape, dtype=np.float32))
> but clearly a bit clumsy.
>
> The numpy true_divide function can be changed at runtime through the
> python interface; one isn't stuck with the defaults for any of the
> python parsed numeric methods. However, changing the defaults might
> lead to code portability problems. I think it was a bad idea to have
> that facility...

I see you have not been converted to the power of the with statement to
create local environments... --- probably it's a good thing you haven't.

-Travis
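The with-statement idiom Travis alludes to presumably means context
managers along the lines of np.errstate, which scope numeric-error
handling to a block (np.errstate is real numpy API; the snippet itself
is just a sketch):

import numpy as np

with np.errstate(divide='ignore'):
    # inside the block, float division by zero quietly yields inf
    r = np.array([1.0]) / np.array([0.0])
# on exit, the previous error-handling settings are restored
print r          # -> [ inf]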
From charlesr.harris at gmail.com  Fri Aug 28 12:40:51 2009
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Fri, 28 Aug 2009 10:40:51 -0600
Subject: [Numpy-discussion] What type should / return in python 3k when
	applied to two integer types?
In-Reply-To: <4E47E895-D7CB-4BFA-919F-5584D0C6A4A2@enthought.com>
References: <3d375d730908271246k509f1404s8c107e7e8cbb74b9@mail.gmail.com>
	<4A96F91F.20206@noaa.gov>
	<3d375d730908271426m7068c25fj10078a1b5546d8a9@mail.gmail.com>
	<3DA3B328CBC48B4EBB88484B8A5EA19106AF9E8C@sernt14.essex.ac.uk>
	<4E47E895-D7CB-4BFA-919F-5584D0C6A4A2@enthought.com>
Message-ID: 

On Fri, Aug 28, 2009 at 10:36 AM, Travis Oliphant wrote:

> On Aug 28, 2009, at 11:31 AM, Charles R Harris wrote:
>
> On Fri, Aug 28, 2009 at 9:47 AM, Citi, Luca wrote:
>
>> > The main issue is probably just choosing an appropriate float return
>> > type, and personally I believe this should be same as numpy's default
>> > float.
>> I completely agree.
>>
>> Maybe we could let the user decide whether to use a different type.
>> It is already somehow possible through the "out" argument.
>> >>> np.true_divide(a, b, np.empty(a.shape, dtype=np.float32))
>> but clearly a bit clumsy.
>>
>
> The numpy true_divide function can be changed at runtime through the python
> interface; one isn't stuck with the defaults for any of the python parsed
> numeric methods. However, changing the defaults might lead to code
> portability problems. I think it was a bad idea to have that facility...
>
> I see you have not been converted to the power of the with statement to
> create local environments... --- probably it's a good thing you haven't.
>

One of the problems with extensible languages is that, after extending,
they become different languages. It's the software equivalent of the tower
of Babel.

Chuck

From charlesr.harris at gmail.com  Fri Aug 28 12:46:17 2009
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Fri, 28 Aug 2009 10:46:17 -0600
Subject: [Numpy-discussion] What type should / return in python 3k when
	applied to two integer types?
In-Reply-To: <1cd32cbb0908280708l3501f6ffnc29ead80111e75ca@mail.gmail.com>
References: <3d375d730908271227k3cea13b8ledd88a6ff8f21172@mail.gmail.com>
	<3d375d730908271246k509f1404s8c107e7e8cbb74b9@mail.gmail.com>
	<4A96F91F.20206@noaa.gov>
	<3d375d730908271426m7068c25fj10078a1b5546d8a9@mail.gmail.com>
	<1cd32cbb0908280708l3501f6ffnc29ead80111e75ca@mail.gmail.com>
Message-ID: 

On Fri, Aug 28, 2009 at 8:08 AM, wrote:

> On Fri, Aug 28, 2009 at 9:55 AM, Pauli Virtanen wrote:
> > Fri, 28 Aug 2009 09:46:39 -0400, Neal Becker kirjoitti:
> >
> >> Robert Kern wrote:
> >>
> >>> On Thu, Aug 27, 2009 at 14:22, Christopher
> >>> Barker wrote:
> >>>
> >>>> By the way -- is there something about py3k that changes all this? Or
> >>>> is this just an opportunity to perhaps make some backward-incompatible
> >>>> changes to numpy?
> >>>
> >>> Python 3 makes the promised change of int/int => float.
> >>
> >> Does that mean that we want numpy to do the same? I'm not so sure.
> >> Sounds like opening a can of worms (numpy has more types to worry about
> >> than just int and float. If we start playing strange games we may
> >> regret it.)
> >
> > I believe we want to. This is not really a strange trick: it's just that
> > in Python 3, the operator / is true_division, and // is floor_division.
> > I believe any worms released by this are mostly small and tasty...
> >
> > The main issue is probably just choosing an appropriate float return
> > type, and personally I believe this should be same as numpy's default
> > float.
>
> and getting the infs and nans as in true float division not as in
> np.true_divide
>

Note that currently true_divide returns zeros in these cases and attempts --
unsuccessfully -- to raise a zero division error; that is what python does.
So if we make this change there will be a divergence from python behaviour.
However, arrays are different from scalars and I think we should make this
change.

Chuck

From oliphant at enthought.com  Fri Aug 28 12:47:12 2009
From: oliphant at enthought.com (Travis Oliphant)
Date: Fri, 28 Aug 2009 11:47:12 -0500
Subject: [Numpy-discussion] Merge of date-time branch completed
Message-ID: <461AEC55-EB92-4EE5-9673-F33BB33FBEEF@enthought.com>

Hello folks,

In keeping with the complaint that the pace of NumPy development is
too fast, I've finished the merge of the datetime branch to the
core.  The trunk builds and all the (previous) tests pass for me.

There are several tasks remaining to be done (the current status is
definitely still alpha):

* write many unit tests for the desired behavior (especially for the
many different kinds of dates supported)
* finish coercion between datetimes and timedeltas with different
frequencies
* improve the ufuncs that support datetime and timedelta so that they
look at the frequency information.
* improve the way datetime arrays print
* probably several other things that I haven't listed

Because of the last point, I will spend my next effort on the work of
updating the proposal to more clearly define some of the expected
behaviors and write documentation about the expected behavior of the
new features.

Help, reviews, criticisms, suggestions, fixes, and patches are most
welcome.

Best regards,

-Travis

From dsdale24 at gmail.com  Fri Aug 28 12:53:24 2009
From: dsdale24 at gmail.com (Darren Dale)
Date: Fri, 28 Aug 2009 12:53:24 -0400
Subject: [Numpy-discussion] Merge of date-time branch completed
In-Reply-To: <461AEC55-EB92-4EE5-9673-F33BB33FBEEF@enthought.com>
References: <461AEC55-EB92-4EE5-9673-F33BB33FBEEF@enthought.com>
Message-ID: 

On Fri, Aug 28, 2009 at 12:47 PM, Travis Oliphant wrote:
>
> Hello folks,
>
> In keeping with the complaint that the pace of NumPy development is too
> fast, I've finished the merge of the datetime branch to the core. The
> trunk builds and all the (previous) tests pass for me.
>
> There are several tasks remaining to be done (the current status is
> definitely still alpha):
> * write many unit tests for the desired behavior (especially for the many
> different kinds of dates supported)
> * finish coercion between datetimes and timedeltas with different
> frequencies
> * improve the ufuncs that support datetime and timedelta so that they look
> at the frequency information.

I haven't been following development on datetime. Can you use
__array_prepare__ and __array_wrap__ to do this? __array_prepare__ was
committed to the trunk during the scipy sprints.

Darren
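For context, __array_wrap__ is the subclass hook Darren mentions: ufuncs
call it on the relevant input so that the output comes back wrapped as
the subclass.  A minimal sketch (the hook is real numpy API; the class
here is purely illustrative):

import numpy as np

class Wrapped(np.ndarray):
    def __array_wrap__(self, out_arr, context=None):
        # ufuncs hand us their raw ndarray result (plus an optional
        # (ufunc, args, output-index) context); re-view it as our type
        return out_arr.view(type(self))

w = np.arange(3).view(Wrapped)
print type(np.sin(w))     # -> <class '__main__.Wrapped'>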
From josef.pktd at gmail.com  Fri Aug 28 12:58:18 2009
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Fri, 28 Aug 2009 12:58:18 -0400
Subject: [Numpy-discussion] What type should / return in python 3k when
	applied to two integer types?
In-Reply-To: 
References: <3d375d730908271227k3cea13b8ledd88a6ff8f21172@mail.gmail.com>
	<3d375d730908271246k509f1404s8c107e7e8cbb74b9@mail.gmail.com>
	<4A96F91F.20206@noaa.gov>
	<3d375d730908271426m7068c25fj10078a1b5546d8a9@mail.gmail.com>
	<1cd32cbb0908280708l3501f6ffnc29ead80111e75ca@mail.gmail.com>
Message-ID: <1cd32cbb0908280958h20bccd0cy9a250a54cd5d9bd2@mail.gmail.com>

On Fri, Aug 28, 2009 at 12:46 PM, Charles R Harris wrote:
>
> On Fri, Aug 28, 2009 at 8:08 AM, wrote:
>>
>> On Fri, Aug 28, 2009 at 9:55 AM, Pauli Virtanen wrote:
>> > Fri, 28 Aug 2009 09:46:39 -0400, Neal Becker kirjoitti:
>> >
>> >> Robert Kern wrote:
>> >>
>> >>> On Thu, Aug 27, 2009 at 14:22, Christopher
>> >>> Barker wrote:
>> >>>
>> >>>> By the way -- is there something about py3k that changes all this? Or
>> >>>> is this just an opportunity to perhaps make some
>> >>>> backward-incompatible
>> >>>> changes to numpy?
>> >>>
>> >>> Python 3 makes the promised change of int/int => float.
>> >>
>> >> Does that mean that we want numpy to do the same? I'm not so sure.
>> >> Sounds like opening a can of worms (numpy has more types to worry about
>> >> than just int and float. If we start playing strange games we may
>> >> regret it.)
>> >
>> > I believe we want to. This is not really a strange trick: it's just that
>> > in Python 3, the operator / is true_division, and // is floor_division.
>> > I believe any worms released by this are mostly small and tasty...
>> >
>> > The main issue is probably just choosing an appropriate float return
>> > type, and personally I believe this should be same as numpy's default
>> > float.
>>
>> and getting the infs and nans as in true float division not as in
>> np.true_divide
>
> Note that currently true_divide returns zeros in these cases and attempts --
> unsuccessfully -- to raise a zero division error; that is what python does.
> So if we make this change there will be a divergence from python behaviour.
> However, arrays are different from scalars and I think we should make this
> change.

The difference is already there in the floating point operations.
Since python doesn't know about inf and nans, I was switching
functions to use the numpy version for floating point operations to
have robust results instead of exceptions (in stats.distributions)

>>> 0.**(-1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
    0.**(-1)
ZeroDivisionError: 0.0 cannot be raised to a negative power

>>> np.power(0., -1)
inf
>>> np.array(0.)**(-1)
inf

and I would expect that a numpy "/" follows the numpy floating point
definitions (and not the missing inf and nan behavior of python)

Josef

From charlesr.harris at gmail.com  Fri Aug 28 13:00:18 2009
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Fri, 28 Aug 2009 11:00:18 -0600
Subject: [Numpy-discussion] Merge of date-time branch completed
In-Reply-To: 
References: <461AEC55-EB92-4EE5-9673-F33BB33FBEEF@enthought.com>
Message-ID: 

On Fri, Aug 28, 2009 at 10:53 AM, Darren Dale wrote:

> On Fri, Aug 28, 2009 at 12:47 PM, Travis Oliphant
> wrote:
> >
> > Hello folks,
> >
> > In keeping with the complaint that the pace of NumPy development is too
> > fast, I've finished the merge of the datetime branch to the core. The
> > trunk builds and all the (previous) tests pass for me.
> > There are several tasks remaining to be done (the current status is
> > definitely still alpha):
> > * write many unit tests for the desired behavior (especially for the
> many
> > different kinds of dates supported)
> > * finish coercion between datetimes and timedeltas with different
> > frequencies
> > * improve the ufuncs that support datetime and timedelta so that they
> look
> > at the frequency information.
>
> I haven't been following development on datetime. Can you use
> __array_prepare__ and __array_wrap__ to do this? __array_prepare__ was
> committed to the trunk during the scipy sprints.
>

There looks to be some overlap there. The datetime types are just int64
under the covers and I've wondered if derived types would have sufficed
with the addition of __array_prepare__. That said, I'm not familiar with
the datetime functionality and there are likely other considerations.

Chuck

From jh at physics.ucf.edu  Fri Aug 28 13:13:38 2009
From: jh at physics.ucf.edu (Joe Harrington)
Date: Fri, 28 Aug 2009 13:13:38 -0400
Subject: [Numpy-discussion] future directions
In-Reply-To:  (numpy-discussion-request@scipy.org)
References: 
Message-ID: 

Christopher Barker wrote:
>> Following the full
>> PEP procedure
> or a parallel NPEP system.

Actually, I originally intended just to mean "follow the procedure"
not "do it in their system".  But, in thinking about it, if it's
compatible with their system to develop a whole subpackage in their
procedure space, we should.  Ultimately the decision to include
something in Python is one they will make, and such decisions are
largely social ones.  The more they have seen, been included in, and
had a chance to comment on our stuff, the more bought in they are and
the less chance there is of objection to it.

As for the rapid pace of change, it would be feasible to agree on a
core set of functionality that should go in the main language, to
look at that with our decade of experience, and decide on a design
and implementation that would be stable relative to what we have now.
Mostly the code would just transfer; what people complain about is
fluff, underlying package structure, API inconsistency, and turds.  A
few things might have API changes (like np.median() not long ago).
There would be some significant issues to decide to do or not to do
(like generalized ufuncs not long ago), but not many.

I am thinking small here, though others will differ.  No financials,
nothing now deprecated, no fromnumeric namespace, nothing that is
likely to evolve fast.  Simple, clean, organized.  The rest goes into
scipy.

--jh--

From lasagnadavide at gmail.com  Fri Aug 28 13:13:56 2009
From: lasagnadavide at gmail.com (davide lasagna)
Date: Fri, 28 Aug 2009 19:13:56 +0200
Subject: [Numpy-discussion] Iterate over an array
Message-ID: 

Hi all,

I've got a 2d array and I want to iterate over its columns in a
"pythonic way".  This is what I have in mind: please consider this
snippet:

#################################################
import numpy as np

array = np.random.standard_normal( (10,10) )

for column in array.some_column_method():
    column = do_something()
#################################################

The trivial way to do the for loop is:

#################################################
for i in range( array.shape[1] ):
    array[:, i] = do_something()
#################################################

Is there any way to do what I have in mind?  Can I obtain
"pythonically" a list of column arrays?

Any help is appreciated.
Cheers..

Davide Lasagna
Dip. Ingegneria Aerospaziale
Politecnico di Torino
Italia

From robert.kern at gmail.com  Fri Aug 28 13:17:55 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Fri, 28 Aug 2009 10:17:55 -0700
Subject: [Numpy-discussion] Iterate over an array
In-Reply-To: 
References: 
Message-ID: <3d375d730908281017r11da8eb3rede89c66da4aec8b@mail.gmail.com>

On Fri, Aug 28, 2009 at 10:13, davide lasagna wrote:
> Hi all,
>
> I've got a 2d array and I want to iterate over its columns in a "pythonic
> way".  This is what I have in mind: please consider this snippet:
>
> #################################################
> import numpy as np
>
> array = np.random.standard_normal( (10,10) )
>
> for column in array.some_column_method():
>     column = do_something()
> #################################################
>
> The trivial way to do the for loop is:
>
> #################################################
> for i in range( array.shape[1] ):
>     array[:, i] = do_something()
> #################################################
>
> Is there any way to do what I have in mind?  Can I obtain "pythonically"
> a list of column arrays?

for column in array.transpose():
    column[:] = do_something()

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
 -- Umberto Eco
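It is worth spelling out why the answer assigns through column[:]:
iterating over the transpose yields views into the original array, so
slice assignment writes back into it, while a plain rebinding of the
loop variable would not.  A small check (illustrative session):

>>> import numpy as np
>>> a = np.zeros((2, 3))
>>> for i, col in enumerate(a.transpose()):
...     col[:] = i          # in-place: modifies a through the view
...
>>> a
array([[ 0.,  1.,  2.],
       [ 0.,  1.,  2.]])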
From charlesr.harris at gmail.com  Fri Aug 28 13:39:39 2009
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Fri, 28 Aug 2009 11:39:39 -0600
Subject: [Numpy-discussion] Merge of date-time branch completed
In-Reply-To: <461AEC55-EB92-4EE5-9673-F33BB33FBEEF@enthought.com>
References: <461AEC55-EB92-4EE5-9673-F33BB33FBEEF@enthought.com>
Message-ID: 

On Fri, Aug 28, 2009 at 10:47 AM, Travis Oliphant wrote:

>
> Hello folks,
>
> In keeping with the complaint that the pace of NumPy development is too
> fast, I've finished the merge of the datetime branch to the core. The
> trunk builds and all the (previous) tests pass for me.
>
> There are several tasks remaining to be done (the current status is
> definitely still alpha):
>
> * write many unit tests for the desired behavior (especially for the many
> different kinds of dates supported)
> * finish coercion between datetimes and timedeltas with different
> frequencies
> * improve the ufuncs that support datetime and timedelta so that they look
> at the frequency information.
> * improve the way datetime arrays print
> * probably several other things that I haven't listed
>
> Because of the last point, I will spend my next effort on the work of
> updating the proposal to more clearly define some of the expected behaviors
> and write documentation about the expected behavior of the new features.
>
> Help, reviews, criticisms, suggestions, fixes, and patches are most
> welcome.
>

Umm, replacing the previous code 'M' by '.' in generate_umath is a bit
obscure. Isn't there a better choice than '.' ?

Please make the multiline comments conform to the standard. I spend a lot
of time fixing these up... And you broke some I already fixed. Could you
break up the long lines in the repeats while you are at it? I'm doing that
too, but every bit helps.

What does UFUNC_OBJ_NEEDS_API do?

Things like

    if (fromtype == PyArray_DATETIME || fromtype == PyArray_TIMEDELTA ||
        totype == PyArray_DATETIME || totype == PyArray_TIMEDELTA) {

are more readable if the trailing || is moved to the head of the line:

    if (fromtype == PyArray_DATETIME
        || fromtype == PyArray_TIMEDELTA
        || totype == PyArray_DATETIME
        || totype == PyArray_TIMEDELTA) {

Hmm, "can also have an additional key called "metadata" which can be any
dictionary", is this new functionality? What does it do?

There are a lot of changes like this: !(loop->obj & UFUNC_OBJ_ISOBJECT).
What is the meaning of UFUNC_OBJ_ISOBJECT and why is this test necessary?

Chuck

From dagss at student.matnat.uio.no  Fri Aug 28 13:52:22 2009
From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn)
Date: Fri, 28 Aug 2009 19:52:22 +0200
Subject: [Numpy-discussion] future directions
In-Reply-To: <20090827214955.GG2963@zita2.kokkinizita.net>
References: <20090827214955.GG2963@zita2.kokkinizita.net>
Message-ID: <4A981956.2040607@student.matnat.uio.no>

Fons Adriaensen wrote:
> Some weeks ago there was a post on this list requesting feedback
> on possible future directions for numpy. As I was quite busy at that
> time I'll reply to it now.
>
> My POV is that of a novice user, who at the same time wants quite
> badly to use the numpy framework for his numerical work, which in
> this case is related to (some rather advanced) multichannel audio
> processing.
>
I'm reluctantly joining the discussion... (reluctant because, as
interesting as these discussions may be, (relatively) simple things
that everyone agrees about, like Python 3 compatibility and PEP 3118
support, are still some ways off. Agreeing on things doesn't make it
happen.)

> From that POV, I'd suggest the following:
>
> 1. Adopt an object based on Python-3's buffer protocol as the
> basic array type. It's immensely more powerful than ndarray,
> while at the same time it's close enough to ndarray to allow
> a gradual adoption.
>
It's not immensely more powerful. It allows pointers, that's right, but
that's primarily for exporting data from data providers... For things
like "pointers to images" (which PEP 3118 could be used for), Python
lists usually work better anyway because they can be appended.

I think the whole idea of the protocol is that you can start passing
around data in *various* containers. Adopting a new array type as the
"basic array type" basically defeats this purpose. My way of thinking
of it is: the focus shifts over to the NumPy library providing ufuncs,
not the array container. I think we'll in some years be doing

np.sin(x, out=y)

without x or y being ndarrays at all.

One conclusion: All of this might call for a new library which tries to
focus more and support a wider set of memory layouts. But, well, it's
just a matter of going ahead and doing that! -- but I don't think NumPy
can be turned into it, nor do the NumPy developers likely have time to
spare for that. If you wait a year, such a library might be a 100-liner
in Cython :-) Actually, right now I think the best way of getting such
a library implemented is to help out on Cython's array features, then
export Cython's arrays to Python-space in a library.

Secondly, one BIG gotcha people should be aware of here is that PEP
3118 supports "fancy indexing as views". I.e. with an object based on
PEP 3118's memory model you could potentially do

b = a[a == 2]
b[0] = 3

and have that change a!
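(For contrast, in numpy as it stands boolean fancy indexing returns a
copy, so the same two lines leave a untouched -- a quick check, with
output per a 2009-era numpy:)

>>> import numpy as np
>>> a = np.array([1, 2, 2])
>>> b = a[a == 2]      # a fresh copy under numpy's current semantics
>>> b[0] = 3
>>> a                  # the original is unchanged
array([1, 2, 2])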
I believe these semantics to be superior myself (because you can always
do "b = a[a==2].copy()" to get NumPy's behaviour). But it does raise
some interesting questions about consistency vs. subtle API breakage etc.

> 2. Adopting that format will make it even more important to
> clearly define in which cases data gets copied and when not.
> This should be based on some simple rules that can be evaluated
> by a code author without requiring a lookup in the reference
> docs each time.
>
I think NumPy's already doing quite well here, except for the case of
fancy indexing as mentioned above. Cleaning up various incarnations of
"reshape" etc. to be consistent here would be good too (my vote is for
never doing any automatic copying in methods like reshape, but I
actually haven't checked what the semantics ended up being in the end).

(BTW, I was recently observed saying I might chip in and implement PEP
3118 for NumPy around November. If anyone wants to beat me to it then
I'd be happy of course.)

Dag Sverre

From d_l_goldsmith at yahoo.com  Fri Aug 28 14:15:53 2009
From: d_l_goldsmith at yahoo.com (David Goldsmith)
Date: Fri, 28 Aug 2009 11:15:53 -0700 (PDT)
Subject: [Numpy-discussion] future directions
In-Reply-To: <4A9802B9.3080705@noaa.gov>
Message-ID: <86557.24473.qm@web52106.mail.re2.yahoo.com>

--- On Fri, 8/28/09, Christopher Barker wrote:
> long live numpy3k!
>
> -Chris

Or at least until Py4K makes us "fork" again. ;)

DG

From robert.kern at gmail.com  Fri Aug 28 16:23:20 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Fri, 28 Aug 2009 13:23:20 -0700
Subject: [Numpy-discussion] Merge of date-time branch completed
In-Reply-To: 
References: <461AEC55-EB92-4EE5-9673-F33BB33FBEEF@enthought.com>
Message-ID: <3d375d730908281323jff942cbx8e1ef357584eda6e@mail.gmail.com>

On Fri, Aug 28, 2009 at 10:39, Charles R Harris wrote:
> What does UFUNC_OBJ_NEEDS_API do?

It specifies that the ufunc loops need access to the Python C API, so
the dispatcher should not release the GIL before running the loop.

> Hmm, "can also have an additional key called "metadata" which can be any
> dictionary", is this new functionality? What does it do?

It adds a dictionary to the dtype. This can potentially be used for a
couple of applications, but here it is used to hold the datetime
frequency information.

> There are a lot of changes like this: !(loop->obj & UFUNC_OBJ_ISOBJECT).
> What is the meaning of UFUNC_OBJ_ISOBJECT and why is this test necessary?

Previously, loop->obj was an int, but only took 0 or 1 to specify that
the ufunc loop was for object dtypes. This was used for two distinct
things: reference counting and keeping hold of the GIL. The datetime
loops require the latter but not the former. loop->obj is now a bitset
which can be 0, UFUNC_OBJ_ISOBJECT, UFUNC_OBJ_NEEDS_API, or
UFUNC_OBJ_ISOBJECT|UFUNC_OBJ_NEEDS_API. The tests were modified to
test for the most specific required flag(s) for the particular
operation it was going to do.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
 -- Umberto Eco

From oliphant at enthought.com  Fri Aug 28 16:43:34 2009
From: oliphant at enthought.com (Travis Oliphant)
Date: Fri, 28 Aug 2009 15:43:34 -0500
Subject: [Numpy-discussion] Merge of date-time branch completed
In-Reply-To: 
References: <461AEC55-EB92-4EE5-9673-F33BB33FBEEF@enthought.com>
Message-ID: 

On Aug 28, 2009, at 12:39 PM, Charles R Harris wrote:

> On Fri, Aug 28, 2009 at 10:47 AM, Travis Oliphant
> wrote:
>
> Hello folks,
>
> In keeping with the complaint that the pace of NumPy development is
> too fast, I've finished the merge of the datetime branch to the
> core. The trunk builds and all the (previous) tests pass for me.
>
> There are several tasks remaining to be done (the current status is
> definitely still alpha):
>
> * write many unit tests for the desired behavior (especially for
> the many different kinds of dates supported)
> * finish coercion between datetimes and timedeltas with different
> frequencies
> * improve the ufuncs that support datetime and timedelta so that
> they look at the frequency information.
> * improve the way datetime arrays print
> * probably several other things that I haven't listed
>
> Because of the last point, I will spend my next effort on the work
> of updating the proposal to more clearly define some of the expected
> behaviors and write documentation about the expected behavior of
> the new features.
>
> Help, reviews, criticisms, suggestions, fixes, and patches are
> most welcome.
>
> Umm, replacing the previous code 'M' by '.' in generate_umath is a
> bit obscure. Isn't there a better choice than '.' ?
>
> Please make the multiline comments conform to the standard. I spend
> a lot of time fixing these up... And you broke some I already fixed.

Sorry about that.  Can you remind me what the standard is?

Thanks,

-Travis

--
Travis Oliphant
Enthought Inc.
1-512-536-1057
http://www.enthought.com
oliphant at enthought.com

From charlesr.harris at gmail.com  Fri Aug 28 17:03:28 2009
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Fri, 28 Aug 2009 15:03:28 -0600
Subject: [Numpy-discussion] Merge of date-time branch completed
In-Reply-To: 
References: <461AEC55-EB92-4EE5-9673-F33BB33FBEEF@enthought.com>
Message-ID: 

On Fri, Aug 28, 2009 at 2:43 PM, Travis Oliphant wrote:

> On Aug 28, 2009, at 12:39 PM, Charles R Harris wrote:
>
> On Fri, Aug 28, 2009 at 10:47 AM, Travis Oliphant wrote:
>
>> Hello folks,
>>
>> In keeping with the complaint that the pace of NumPy development is too
>> fast, I've finished the merge of the datetime branch to the core. The
>> trunk builds and all the (previous) tests pass for me.
>>
>> There are several tasks remaining to be done (the current status is
>> definitely still alpha):
>>
>> * write many unit tests for the desired behavior (especially for the many
>> different kinds of dates supported)
>> * finish coercion between datetimes and timedeltas with different
>> frequencies
>> * improve the ufuncs that support datetime and timedelta so that they look
>> at the frequency information.
>> * improve the way datetime arrays print
>> * probably several other things that I haven't listed
>>
>> Because of the last point, I will spend my next effort on the work of
>> updating the proposal to more clearly define some of the expected behaviors
>> and write documentation about the expected behavior of the new features.
>>
>> Help, reviews, criticisms, suggestions, fixes, and patches are most
>> welcome.
>>
> Umm, replacing the previous code 'M' by '.' in generate_umath is a
> bit obscure. Isn't there a better choice than '.' ?
>
> Please make the multiline comments conform to the standard. I spend
> a lot of time fixing these up... And you broke some I already fixed.
>
> Sorry about that.  Can you remind me what the standard is?
>

/*
 * blah, blah
 * blah, blah
 */

It makes the extent of the comment more blatant, especially if it is a
long comment, and separates it from the code. No more looking for that
elusive */. For code reading/maintenance blatant is good.

How about 'P' instead of '.' ? I'll guess that 'M' originally stood for
method and that's gone, but 'P' follows 'O', which isn't any sort of
argument but at least 'P' is easier to see on the page ;)

Chuck

From oliphant at enthought.com  Fri Aug 28 17:09:49 2009
From: oliphant at enthought.com (Travis Oliphant)
Date: Fri, 28 Aug 2009 16:09:49 -0500
Subject: [Numpy-discussion] Merge of date-time branch completed
In-Reply-To: 
References: <461AEC55-EB92-4EE5-9673-F33BB33FBEEF@enthought.com>
Message-ID: 

On Aug 28, 2009, at 4:03 PM, Charles R Harris wrote:

> On Fri, Aug 28, 2009 at 2:43 PM, Travis Oliphant
> wrote:
>
> On Aug 28, 2009, at 12:39 PM, Charles R Harris wrote:
>
>> Umm, replacing the previous code 'M' by '.' in generate_umath is a
>> bit obscure. Isn't there a better choice than '.' ?
>>
>> Please make the multiline comments conform to the standard. I spend
>> a lot of time fixing these up... And you broke some I already fixed.
>
> Sorry about that.  Can you remind me what the standard is?
>
> /*
>  * blah, blah
>  * blah, blah
>  */
>
> It makes the extent of the comment more blatant, especially if it is
> a long comment, and separates it from the code. No more looking for
> that elusive */. For code reading/maintenance blatant is good.
>
> How about 'P' instead of '.' ? I'll guess that 'M' originally stood
> for method and that's gone, but 'P' follows 'O', which isn't any
> sort of argument but at least 'P' is easier to see on the page ;)
>

I like it --- was just trying to think of a better one.  Thought of
'o', but it looks basically the same.

--
Travis Oliphant
Enthought Inc.
1-512-536-1057
http://www.enthought.com
oliphant at enthought.com
From cournape at gmail.com  Sat Aug 29 01:41:45 2009
From: cournape at gmail.com (David Cournapeau)
Date: Sat, 29 Aug 2009 00:41:45 -0500
Subject: [Numpy-discussion] future directions
In-Reply-To: <3d375d730908280918i4ea79b8foee043ed1bc8d0ec@mail.gmail.com>
References: <4A9802B9.3080705@noaa.gov>
	<3d375d730908280918i4ea79b8foee043ed1bc8d0ec@mail.gmail.com>
Message-ID: <5b8d13220908282241n11fbdcfcn688c4ec23bd936d3@mail.gmail.com>

On Fri, Aug 28, 2009 at 11:18 AM, Robert Kern wrote:
> On Fri, Aug 28, 2009 at 09:15, Christopher Barker wrote:
>> Joe Harrington wrote:
>>> However, there are two natural forklets coming up.
>>>
>>> The first is Python 3.0, which will necessitate some API changes.
>>
>> Absolutely! This seems like a no-brainer. I don't think we are talking
>> about really major changes to the numpy API anyway, generally clean-up,
>> and there is no way anyone is going to get their Py2 code working on Py3
>> without tweaking it anyway, so this is the time to do it.
>
> No, it is the *worst* time to do it. We have been asked by the Python
> developer team *not* to use the Python 3 transition to break all kinds
> of other backwards compatibility.

AFAIK, the main argument is that this would allow for an easier
transition, since someone could use 2to3 to make the transition from
numpy for python 2 to numpy for python 3. But is it even possible for
a large package like numpy ? I don't see how the C api for example
could be backward compatible, since the API with PyString and PyInt
would have to be changed.

I guess time will tell, once other packages with a lot of C are
converted. I am curious to see whether this advice from the python dev
community will be followed at all.

cheers,

David

From robert.kern at gmail.com  Sat Aug 29 04:07:22 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Sat, 29 Aug 2009 03:07:22 -0500
Subject: [Numpy-discussion] future directions
In-Reply-To: <5b8d13220908282241n11fbdcfcn688c4ec23bd936d3@mail.gmail.com>
References: <4A9802B9.3080705@noaa.gov>
	<3d375d730908280918i4ea79b8foee043ed1bc8d0ec@mail.gmail.com>
	<5b8d13220908282241n11fbdcfcn688c4ec23bd936d3@mail.gmail.com>
Message-ID: <3d375d730908290107h4e4b33aev22175515b810ac8c@mail.gmail.com>

On Sat, Aug 29, 2009 at 00:41, David Cournapeau wrote:
> On Fri, Aug 28, 2009 at 11:18 AM, Robert Kern wrote:
>> On Fri, Aug 28, 2009 at 09:15, Christopher Barker wrote:
>>> Joe Harrington wrote:
>>>> However, there are two natural forklets coming up.
>>>>
>>>> The first is Python 3.0, which will necessitate some API changes.
>>>
>>> Absolutely! This seems like a no-brainer. I don't think we are talking
>>> about really major changes to the numpy API anyway, generally clean-up,
>>> and there is no way anyone is going to get their Py2 code working on Py3
>>> without tweaking it anyway, so this is the time to do it.
>>
>> No, it is the *worst* time to do it. We have been asked by the Python
>> developer team *not* to use the Python 3 transition to break all kinds
>> of other backwards compatibility.
>
> AFAIK, the main argument is that this would allow for an easier
> transition, since someone could use 2to3 to make the transition from
> numpy for python 2 to numpy for python 3. But is it even possible for
> a large package like numpy ? I don't see how the C api for example
> could be backward compatible, since the API with PyString and PyInt
> would have to be changed.

I'm not talking about that kind of breakage. You can break
compatibility for whatever is *necessary* in order to make the
transition.
What Chris is suggesting we do and what Guido is requesting we not do
is to take the opportunity to break compatibility for stuff entirely
unrelated to the transition just because people will have to port stuff
around that time anyways.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
 -- Umberto Eco

From pgmdevlist at gmail.com  Sun Aug 30 13:19:43 2009
From: pgmdevlist at gmail.com (Pierre GM)
Date: Sun, 30 Aug 2009 13:19:43 -0400
Subject: [Numpy-discussion] masked arrays of structured arrays
In-Reply-To: <20090822115844.GA6422@doriath.local>
References: <20090822115844.GA6422@doriath.local>
Message-ID: <00FAC407-B21B-46BC-8ADA-E4702DBE54A6@gmail.com>

Oops, overlooked this one ...

On Aug 22, 2009, at 7:58 AM, Ernest Adrogué wrote:

> Hi there,
>
> Here is a structured array with 3 fields each of which has 3 fields
> in turn:
>
> However if I try the same with a masked array, it fails:
>
> In [14]: x = np.ma.masked_all(2, dtype=desc)
>
> In [15]: x['x']['b'] = 2
> ---------------------------------------------------------------------------
> ValueError                       Traceback (most recent call last)
>
> /home/ernest/<ipython console> in <module>()
>
> /usr/lib/python2.5/site-packages/numpy/ma/core.pyc in
> __setitem__(self, indx, value)
>    1574         if self._mask is nomask:
>    1575             self._mask = make_mask_none(self.shape, self.dtype)
> -> 1576         ndarray.__setitem__(self._mask, indx, getmask(value))
>    1577         return
>    1578         #........................................
>
> ValueError: field named b not found.
>
> Any idea of what the problem is?

I can't reproduce that with a recent SVN version (r7348). What version
of numpy are you using ?

From dsdale24 at gmail.com  Sun Aug 30 20:28:08 2009
From: dsdale24 at gmail.com (Darren Dale)
Date: Sun, 30 Aug 2009 20:28:08 -0400
Subject: [Numpy-discussion] segfaults when passing ndarray subclass to
	ufunc with out=None
In-Reply-To: <9457e7c80908301553u28fc1ca6ged1f113cc418881c@mail.gmail.com>
References: <9457e7c80908301553u28fc1ca6ged1f113cc418881c@mail.gmail.com>
Message-ID: 

Hi Stefan,

I think Chuck applied the patch after I filed a ticket at the trac
website: http://projects.scipy.org/numpy/ticket/1022 . I just tried
running the script I posted with the most recent checkout and numpy
raised an error instead of segfaulting, so I think this issue is clear.
Thank you for following up.

Darren

2009/8/30 Stéfan van der Walt :
> Hi, Darren
>
> Is this problem still present? If so, we should fix it before 1.4 is
> released.
>
> Regards
> Stéfan
>
>
> ---------- Forwarded message ----------
> From: Darren Dale
> Date: 2009/3/8
> Subject: Re: [Numpy-discussion] segfaults when passing ndarray
> subclass to ufunc with out=None
> To: Discussion of Numerical Python
>
>
> On Sun, Feb 8, 2009 at 12:49 PM, Darren Dale wrote:
>>
>> I am seeing some really strange behavior when I try to pass an ndarray
>> subclass and out=None to numpy's ufuncs. This example will reproduce the
>> problem with svn numpy; the first print statement yields 1 as expected,
>> the second yields "" and the third yields a segmentation fault:
>>
>> import numpy as np
>>
>> class MyArray(np.ndarray):
>>
>>     __array_priority__ = 20
>>
>>     def __new__(cls):
>>         return np.asarray(1).view(cls).copy()
>>
>>     def __repr__(self):
>>         return 'my_array'
>>
>>     __str__ = __repr__
>>
>>     def __mul__(self, other):
>>         return super(MyArray, self).__mul__(other)
>>
>>     def __rmul__(self, other):
>>         return super(MyArray, self).__rmul__(other)
>>
>> mine = MyArray()
>> print np.multiply(1, 1, None)
>> x = np.multiply(mine, mine, None)
>> print type(x)
>> print x

> I think I might have found a fix for this. The following patch allows
> my script to run without a segfault:
>
> $ svn diff
> Index: umath_ufunc_object.inc
> ===================================================================
> --- umath_ufunc_object.inc      (revision 6566)
> +++ umath_ufunc_object.inc      (working copy)
> @@ -3212,13 +3212,10 @@
>          output_wrap[i] = wrap;
>          if (j < nargs) {
>              obj = PyTuple_GET_ITEM(args, j);
> -            if (obj == Py_None) {
> -                continue;
> -            }
>              if (PyArray_CheckExact(obj)) {
>                  output_wrap[i] = Py_None;
>              }
> -            else {
> +            else if (obj != Py_None) {
>                  PyObject *owrap = PyObject_GetAttrString(obj,"__array_wrap__");
>                  incref = 0;
>                  if (!(owrap) || !(PyCallable_Check(owrap))) {
>
> That call to continue skipped this bit of code in the loop, which is
> apparently important:
>
>         if (incref) {
>             Py_XINCREF(output_wrap[i]);
>         }
>
> I've tested the trunk on 64 bit linux, with and without this patch
> applied, and I get the same result in both cases: 1 known failure, 11
> skips. Is there any chance someone could consider applying this patch
> before 1.3 ships?
>
> Darren

--
"In our description of nature, the purpose is not to disclose the real
essence of the phenomena but only to track down, so far as it is
possible, relations between the manifold aspects of our experience"
 - Niels Bohr

"It is a bad habit of physicists to take their most successful
abstractions to be real properties of our world."
 - N. David Mermin

"Once we have granted that any physical theory is essentially only a
model for the world of experience, we must renounce all hope of finding
anything like the correct theory ... simply because the totality of
experience is never accessible to us."
 - Hugh Everett III

From eadrogue at gmx.net  Mon Aug 31 14:33:17 2009
From: eadrogue at gmx.net (Ernest Adrogué)
Date: Mon, 31 Aug 2009 20:33:17 +0200
Subject: [Numpy-discussion] masked arrays of structured arrays
In-Reply-To: <00FAC407-B21B-46BC-8ADA-E4702DBE54A6@gmail.com>
References: <20090822115844.GA6422@doriath.local>
	<00FAC407-B21B-46BC-8ADA-E4702DBE54A6@gmail.com>
Message-ID: <20090831183317.GA7148@doriath.local>

30/08/09 @ 13:19 (-0400), thus spake Pierre GM:
> I can't reproduce that with a recent SVN version (r7348). What version
> of numpy are you using ?

Version 1.2.1

-- 
Ernest
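For anyone trying to reproduce Ernest's report: the dtype `desc` itself
was elided upthread, but a nested layout of the shape he describes
might look like the following (field names other than 'x' and 'b' are
pure guesses; per the thread, the assignment fails on numpy 1.2.1 and
works on later versions):

import numpy as np

inner = [('a', float), ('b', float), ('c', float)]
desc = [('x', inner), ('y', inner), ('z', inner)]

x = np.ma.masked_all(2, dtype=desc)
x['x']['b'] = 2    # ValueError on 1.2.1, per this thread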
From pgmdevlist at gmail.com  Mon Aug 31 14:37:35 2009
From: pgmdevlist at gmail.com (Pierre GM)
Date: Mon, 31 Aug 2009 14:37:35 -0400
Subject: [Numpy-discussion] masked arrays of structured arrays
In-Reply-To: <20090831183317.GA7148@doriath.local>
References: <20090822115844.GA6422@doriath.local>
	<00FAC407-B21B-46BC-8ADA-E4702DBE54A6@gmail.com>
	<20090831183317.GA7148@doriath.local>
Message-ID: <48116BC6-2100-47E6-9B9F-AD3E4465BF43@gmail.com>

On Aug 31, 2009, at 2:33 PM, Ernest Adrogué wrote:
> 30/08/09 @ 13:19 (-0400), thus spake Pierre GM:
>> I can't reproduce that with a recent SVN version (r7348). What
>> version
>> of numpy are you using ?
>
> Version 1.2.1

That must be it. Can you try w/ 1.3 ?

From robert.kern at gmail.com  Mon Aug 31 18:52:48 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Mon, 31 Aug 2009 17:52:48 -0500
Subject: [Numpy-discussion] adaptive interpolation on a regular 2d grid
In-Reply-To: 
References: 
Message-ID: <3d375d730908311552r399ed62ah4f482b14b4e89b88@mail.gmail.com>

On Sat, Aug 22, 2009 at 11:03, denis bzowy wrote:
> Folks,
>
>  here's a simple adaptive interpolator;
> drop me a line to chat about it
>
>     adalin2( func, near, nx=300, ny=150, xstep=32, ystep=16,
>         xrange=(0,1), yrange=(0,1), dtype=np.float, norm=abs )
>
> Purpose:
>     interpolate a function on a regular 2d grid:
>     take func() where it changes rapidly, bilinear interpolate where
>     it's smooth.

Looks good! Where can we get the code? Can this be specialized for 1D
functions?

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
 -- Umberto Eco
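On the 1-D question: the refine-where-linear-interpolation-fails idea
is compact in one dimension.  A rough sketch of the approach (this is
not denis's adalin2, just an illustration):

import numpy as np

def adaptive_sample_1d(func, lo, hi, tol=1e-3, depth=12):
    """Sample func on [lo, hi], subdividing where a straight line errs."""
    mid = 0.5 * (lo + hi)
    flo, fmid, fhi = func(lo), func(mid), func(hi)
    # stop when the midpoint is well predicted by linear interpolation
    if depth <= 0 or abs(fmid - 0.5 * (flo + fhi)) < tol:
        return [(lo, flo), (mid, fmid), (hi, fhi)]
    left = adaptive_sample_1d(func, lo, mid, tol, depth - 1)
    right = adaptive_sample_1d(func, mid, hi, tol, depth - 1)
    return left + right[1:]        # drop the duplicated midpoint

# e.g. pts = np.array(adaptive_sample_1d(np.sin, 0.0, 6.0))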