From contact at pythonxy.com Fri Jan 1 08:43:17 2010
From: contact at pythonxy.com (Pierre Raybaut)
Date: Fri, 1 Jan 2010 14:43:17 +0100
Subject: [Numpy-discussion] Announcing toydist, improving distribution and packaging situation
Message-ID: <629b08a41001010543r193acb2bk3290b6458f97c596@mail.gmail.com>

Hi David,

Following your announcement for the 'toydist' module, I think that your project is very promising: this is certainly a great idea and it will be very controversial, but that's because people's expectations are great on this matter (distutils is so disappointing indeed).

Anyway, if I may be useful, I'll gladly contribute to it. In time, I could change the whole Python(x,y) packaging system (which is currently quite ugly... but easy/quick to manage/maintain) to use/promote this new module.

Happy New Year! and Long Live Scientific Python! ;-)

Cheers,
Pierre

From kwgoodman at gmail.com Fri Jan 1 15:23:17 2010
From: kwgoodman at gmail.com (Keith Goodman)
Date: Fri, 1 Jan 2010 12:23:17 -0800
Subject: [Numpy-discussion] arrays and __eq__
Message-ID:

I have a class that stores some of its data in a numpy array. I can check for equality when myclass is on the left and an array is on the right:

>> m = myclass([1,2,3])
>> a = np.asarray([9,2,3])
>>
>> m == a
myclass([False, True, True], dtype=bool)

But I get the wrong answer when an array is on the left and myclass is on the right:

>> a == m
array([ True, True, True], dtype=bool)

import numpy as np

class myclass(object):

    def __init__(self, arr):
        self.arr = np.asarray(arr)

    def __eq__(self, other):
        if np.isscalar(other) or isinstance(other, np.ndarray):
            x = myclass(self.arr.copy())
            x.arr = x.arr == other
        else:
            raise TypeError, 'This example just tests numpy arrays and scalars.'
        return x

    def __repr__(self):
        return 'myclass' + repr(self.arr).split('array')[1]

I've run into a similar problem with __radd__ but the solution to that problem doesn't work for __eq__:
http://www.mail-archive.com/numpy-discussion at scipy.org/msg09476.html

From eadrogue at gmx.net Fri Jan 1 18:42:10 2010
From: eadrogue at gmx.net (Ernest Adrogué)
Date: Sat, 2 Jan 2010 00:42:10 +0100
Subject: [Numpy-discussion] is it safe to change the dtype without rebuilding the array?
Message-ID: <20100101234210.GA11243@doriath.local>

Hi,

I find myself doing this:

In [244]: x
Out[244]:
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])

In [245]: y=x.copy()

In [251]: y.dtype.char
Out[251]: 'l'

In [252]: dt=np.dtype([('a','l'),('b','l'),('c','l')])

In [254]: y.dtype=dt

Is it okay? The problem is that it's not easy to rebuild the array. I tried with:

y.astype(dt)
np.array(y, dt)
np.array(y.tolist(), dt)

None worked.

Bye.
Ernest

From robert.kern at gmail.com Fri Jan 1 18:49:31 2010
From: robert.kern at gmail.com (Robert Kern)
Date: Fri, 1 Jan 2010 17:49:31 -0600
Subject: [Numpy-discussion] is it safe to change the dtype without rebuilding the array?
In-Reply-To: <20100101234210.GA11243@doriath.local>
References: <20100101234210.GA11243@doriath.local>
Message-ID: <3d375d731001011549x350d5ba9tc381413eb81b6fc5@mail.gmail.com>

2010/1/1 Ernest Adrogué :
> Hi,
>
> I find myself doing this:
>
> In [244]: x
> Out[244]:
> array([[0, 1, 2],
>        [3, 4, 5],
>        [6, 7, 8]])
>
> In [245]: y=x.copy()
>
> In [251]: y.dtype.char
> Out[251]: 'l'
>
> In [252]: dt=np.dtype([('a','l'),('b','l'),('c','l')])
>
> In [254]: y.dtype=dt
>
> Is it okay?
> The problem is that it's not easy to rebuild the array.
> I tried with: > > y.astype(dt) > np.array(y, dt) > np.array(y.tolist(), dt) > > None worked. y.view(dt) -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From cournape at gmail.com Fri Jan 1 21:32:00 2010 From: cournape at gmail.com (David Cournapeau) Date: Sat, 2 Jan 2010 11:32:00 +0900 Subject: [Numpy-discussion] [SciPy-dev] Announcing toydist, improving distribution and packaging situation In-Reply-To: References: <5b8d13220912280603p7221a264o875b0d5e74a5404@mail.gmail.com> <5b8d13220912300816s12c934adh4abdd6d703f8928f@mail.gmail.com> Message-ID: <5b8d13221001011832p5ca4875dj13f04ba7ee61dd41@mail.gmail.com> On Thu, Dec 31, 2009 at 6:06 AM, Darren Dale wrote: > > I should defer to the description of extras in the setuptools > documentation. It is only a few paragraphs long: > > http://peak.telecommunity.com/DevCenter/setuptools#declaring-extras-optional-features-with-their-own-dependencies Ok, so there are two issues related to this feature: - supporting variant at the build stage - supporting different variants of the same package in the dependency graph at install time The first issue is definitely supported - I fixed a bug in toydist to support this correctly, and this will be used when converting setuptools-based setup.py which use the features argument. The second issue is more challenging. It complicates the dependency handling quite a bit, and may cause difficult situations to happen at dependency resolution time. This becomes particularly messy if you mix packages you build yourself with packages grabbed from a repository. I wonder if there is a simpler solution which would give a similar feature set. cheers, David From cournape at gmail.com Sat Jan 2 02:51:38 2010 From: cournape at gmail.com (David Cournapeau) Date: Sat, 2 Jan 2010 16:51:38 +0900 Subject: [Numpy-discussion] [SciPy-User] Announcing toydist, improving distribution and packaging situation In-Reply-To: <629b08a41001010543r193acb2bk3290b6458f97c596@mail.gmail.com> References: <629b08a41001010543r193acb2bk3290b6458f97c596@mail.gmail.com> Message-ID: <5b8d13221001012351w4feda89bj13e67d102318076d@mail.gmail.com> On Fri, Jan 1, 2010 at 10:43 PM, Pierre Raybaut wrote: > Hi David, > > Following your announcement for the 'toydist' module, I think that > your project is very promising: this is certainly a great idea and it > will be very controversial but that's because people expectactions are > great on this matter (distutils is so disappointing indeed). > > Anyway, if I may be useful, I'll gladly contribute to it. > In time, I could change the whole Python(x,y) packaging system (which > is currently quite ugly... but easy/quick to manage/maintain) to > use/promote this new module. That would be a good way to test toydist on a real, complex package. I am not familiar at all with python(x,y) internals. Do you have some explanation I could look at somewhere ? In the meantime, I will try to clean-up the code to have a first experimental release. 
cheers, David From gael.varoquaux at normalesup.org Sat Jan 2 02:58:48 2010 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Sat, 2 Jan 2010 08:58:48 +0100 Subject: [Numpy-discussion] [SciPy-dev] Announcing toydist, improving distribution and packaging situation In-Reply-To: <5b8d13221001011832p5ca4875dj13f04ba7ee61dd41@mail.gmail.com> References: <5b8d13220912280603p7221a264o875b0d5e74a5404@mail.gmail.com> <5b8d13220912300816s12c934adh4abdd6d703f8928f@mail.gmail.com> <5b8d13221001011832p5ca4875dj13f04ba7ee61dd41@mail.gmail.com> Message-ID: <20100102075848.GA17293@phare.normalesup.org> On Sat, Jan 02, 2010 at 11:32:00AM +0900, David Cournapeau wrote: > [snip] > - supporting different variants of the same package in the > dependency graph at install time > [snip] > The second issue is more challenging. It complicates the dependency > handling quite a bit, and may cause difficult situations to happen at > dependency resolution time. This becomes particularly messy if you mix > packages you build yourself with packages grabbed from a repository. I > wonder if there is a simpler solution which would give a similar > feature set. AFAICT, in Debian, the same feature is given via virtual packages: you would have: python-matplotlib python-matplotlib-basemap for instance. It is interesting to note that the same source package may be used to generate both binary, end-user, packages. And happy new year! Ga?l From cournape at gmail.com Sat Jan 2 05:18:34 2010 From: cournape at gmail.com (David Cournapeau) Date: Sat, 2 Jan 2010 19:18:34 +0900 Subject: [Numpy-discussion] [SciPy-dev] Announcing toydist, improving distribution and packaging situation In-Reply-To: <20100102075848.GA17293@phare.normalesup.org> References: <5b8d13220912280603p7221a264o875b0d5e74a5404@mail.gmail.com> <5b8d13220912300816s12c934adh4abdd6d703f8928f@mail.gmail.com> <5b8d13221001011832p5ca4875dj13f04ba7ee61dd41@mail.gmail.com> <20100102075848.GA17293@phare.normalesup.org> Message-ID: <5b8d13221001020218j32eaff29v2c777c513373a43c@mail.gmail.com> On Sat, Jan 2, 2010 at 4:58 PM, Gael Varoquaux wrote: > On Sat, Jan 02, 2010 at 11:32:00AM +0900, David Cournapeau wrote: >> [snip] >> ? - supporting different variants of the same package in the >> dependency graph at install time > >> [snip] > >> The second issue is more challenging. It complicates the dependency >> handling quite a bit, and may cause difficult situations to happen at >> dependency resolution time. This becomes particularly messy if you mix >> packages you build yourself with packages grabbed from a repository. I >> wonder if there is a simpler solution which would give a similar >> feature set. > > > AFAICT, in Debian, the same feature is given via virtual packages: you > would have: I don't think virtual-packages entirely fix the issue. AFAIK, virtual packages have two uses: - handle dependencies where several packages may resolve one particular dependency in an equivalent way (one good example is LAPACK: both liblapack and libatlas provides the lapack feature) - closer to this discussion, you can build several variants of the same package, and each variant would resolve the dependency on a virtual package handling the commonalities. For example, say we have two numpy packages, one built with lapack (python-numpy-full), the other without (python-numpy-core). What happens when a package foo depends on numpy-full, but numpy-core is installed ? 
AFAICS, this can only work as long as the set containing every variant can be ordered (in the conventional set ordering sense), and the dependency can be satisfied by the smallest one. cheers, David From manuel.wittchen at gmail.com Sat Jan 2 05:23:13 2010 From: manuel.wittchen at gmail.com (Manuel Wittchen) Date: Sat, 2 Jan 2010 11:23:13 +0100 Subject: [Numpy-discussion] calculating the difference of an array Message-ID: <209cec441001020223v2260d5c6lf92491b89fc543da@mail.gmail.com> Hi, I want to calculate the difference between the values of a numpy-array. The formula is: deltaT = t_(n+1) - t_(n) My approach to calculate the difference looks like this: for i in len(ARRAY): delta_t[i] = ARRAY[(i+1):] - ARRAY[:(len(ARRAY)-1)] print "result:", delta_t But I get a TypeError: File "./test.py", line 19, in for i in len(ARRAY): TypeError: 'int' object is not iterable Where is the mistake in the code? Regards and a happy new year, Manuel Wittchen From cournape at gmail.com Sat Jan 2 05:31:32 2010 From: cournape at gmail.com (David Cournapeau) Date: Sat, 2 Jan 2010 19:31:32 +0900 Subject: [Numpy-discussion] calculating the difference of an array In-Reply-To: <209cec441001020223v2260d5c6lf92491b89fc543da@mail.gmail.com> References: <209cec441001020223v2260d5c6lf92491b89fc543da@mail.gmail.com> Message-ID: <5b8d13221001020231m10c7b96ch5f62138d49de8953@mail.gmail.com> On Sat, Jan 2, 2010 at 7:23 PM, Manuel Wittchen wrote: > Hi, > > I want to calculate the difference between the values of a > numpy-array. The formula is: > > deltaT = t_(n+1) - t_(n) > > My approach to calculate the difference looks like this: > > for i in len(ARRAY): > ? ? ? ?delta_t[i] = ARRAY[(i+1):] - ARRAY[:(len(ARRAY)-1)] > > print "result:", delta_t > > But I get a TypeError: > File "./test.py", line 19, in > ? ?for i in len(ARRAY): > TypeError: 'int' object is not iterable > > Where is the mistake in the code? There are several mistakes :) Assuming ARRAY is a numpy array, len(ARRAY) will return an int. You would have the same error if ARRAY was any sequence: you should iterate over range(len(ARRAY)). Your formula within the loop is not very clear, and does not seem to match your formula. ARRAY[i+1:] gives you all the items ARRAY[i+1] until the end, ARRAY[:len(ARRAY)-1] gives you every item ARRAY[j] for 0 <= j < len(ARRAY)-1, that is the whole array. I think you want: for i in range(len(ARRAY)-1): delta_i[i] = ARRAY[i+1] - ARRAY[i] Also, using numpy efficiently requires to use vectorization, so actually: delta_t = ARRAY[1:] - ARRAY[:-1] gives you a more efficient version. But really, you should use diff, which implements what you want: import numpy as np delta_t = np.diff(ARRAY) cheers, David From emmanuelle.gouillart at normalesup.org Sat Jan 2 05:34:47 2010 From: emmanuelle.gouillart at normalesup.org (Emmanuelle Gouillart) Date: Sat, 2 Jan 2010 11:34:47 +0100 Subject: [Numpy-discussion] calculating the difference of an array In-Reply-To: <209cec441001020223v2260d5c6lf92491b89fc543da@mail.gmail.com> References: <209cec441001020223v2260d5c6lf92491b89fc543da@mail.gmail.com> Message-ID: <20100102103447.GA30365@phare.normalesup.org> Hello Manuel, the discrete difference of a numpy array can be written in a very natural way, without loops. 
Below are two possible ways to do it: >>> a = np.arange(10)**2 >>> a array([ 0, 1, 4, 9, 16, 25, 36, 49, 64, 81]) >>> a[1:] - a[:-1] array([ 1, 3, 5, 7, 9, 11, 13, 15, 17]) >>> np.diff(a) # another way to calculate the difference array([ 1, 3, 5, 7, 9, 11, 13, 15, 17]) The error in the example you give is due to the fact that you iterate over len(ARRAY), which is an integer, hence not an iterable object. You should write ``for i in range(len(ARRAY))`` instead. Cheers, Emmanuelle On Sat, Jan 02, 2010 at 11:23:13AM +0100, Manuel Wittchen wrote: > Hi, > I want to calculate the difference between the values of a > numpy-array. The formula is: > deltaT = t_(n+1) - t_(n) > My approach to calculate the difference looks like this: > for i in len(ARRAY): > delta_t[i] = ARRAY[(i+1):] - ARRAY[:(len(ARRAY)-1)] > print "result:", delta_t > But I get a TypeError: > File "./test.py", line 19, in > for i in len(ARRAY): > TypeError: 'int' object is not iterable > Where is the mistake in the code? > Regards and a happy new year, > Manuel Wittchen > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From contact at pythonxy.com Sat Jan 2 05:40:16 2010 From: contact at pythonxy.com (Pierre Raybaut) Date: Sat, 2 Jan 2010 11:40:16 +0100 Subject: [Numpy-discussion] [SPAM] Re: [SciPy-User] Announcing toydist, improving distribution and packaging situation In-Reply-To: <5b8d13221001012351w4feda89bj13e67d102318076d@mail.gmail.com> References: <629b08a41001010543r193acb2bk3290b6458f97c596@mail.gmail.com> <5b8d13221001012351w4feda89bj13e67d102318076d@mail.gmail.com> Message-ID: <629b08a41001020240y642518f4r68f4a6a3860a3eee@mail.gmail.com> 2010/1/2 David Cournapeau : > On Fri, Jan 1, 2010 at 10:43 PM, Pierre Raybaut wrote: >> Hi David, >> >> Following your announcement for the 'toydist' module, I think that >> your project is very promising: this is certainly a great idea and it >> will be very controversial but that's because people expectactions are >> great on this matter (distutils is so disappointing indeed). >> >> Anyway, if I may be useful, I'll gladly contribute to it. >> In time, I could change the whole Python(x,y) packaging system (which >> is currently quite ugly... but easy/quick to manage/maintain) to >> use/promote this new module. > > That would be a good way to test toydist on a real, complex package. I > am not familiar at all with python(x,y) internals. Do you have some > explanation I could look at somewhere ? Honestly, let's assume that there is currently no packaging system... it would not be very far from the truth. I did it when I was young and naive regarding Python. Actually I almost did it without having writing any code in Python (approx. two months after earing about the Python language for the first time) : it's an ugly collection of AutoIt, NSIS and PHP scripts -- most of the tasks are automated like updating the generated website pages and so on. So I'm not proud at all, but it was easy and very quick to do as it is, and it's still quite easy to maintain. But, it's not satisfying in terms of code "purity" -- I've been wanting to rewrite all this in Python for a year and a half but since the features are there, there is no real motivation to do the work (in other words, Python(x,y) users would not see the difference, at least at the beginning). 
An other thing: Python(x,y) plugins are not built from source but from existing binaries (it's a pity I know, but it was incredibly faster to do this way). For example, eggs or distutils .exe may be converted in Python(x,y) plugins directly (same internal directory structure). So it may be different from the idea you had in mind (it's not like EPD which is entirely generated from source, AFAIK). > In the meantime, I will try to clean-up the code to have a first > experimental release. > Ok, keep up the good work! Cheers, Pierre From manuel.wittchen at gmail.com Sat Jan 2 06:03:23 2010 From: manuel.wittchen at gmail.com (Manuel Wittchen) Date: Sat, 2 Jan 2010 12:03:23 +0100 Subject: [Numpy-discussion] calculating the difference of an array In-Reply-To: <20100102103447.GA30365@phare.normalesup.org> References: <209cec441001020223v2260d5c6lf92491b89fc543da@mail.gmail.com> <20100102103447.GA30365@phare.normalesup.org> Message-ID: <209cec441001020303l45741e9cseebe8bddd6f078e1@mail.gmail.com> Hi, Thanks for your help. I tried np.diff() before, but the result looked like this: RESULT = [1, 1, 1, 1] So I was thinking that np.diff() doesn't iterate over the values of the array. So I gave the for-loop a try. Now, seeing your code below, I realized that my mistake was that I used ARRAY = [0, 1, 2, 3, 4, 5] for the calculations... Stupid me. 2010/1/2 Emmanuelle Gouillart : > Hello Manuel, > > the discrete difference of a numpy array can be written in a very > natural way, without loops. Below are two possible ways to do it: >>>> a = np.arange(10)**2 >>>> a > array([ 0, ?1, ?4, ?9, 16, 25, 36, 49, 64, 81]) >>>> a[1:] - a[:-1] > array([ 1, ?3, ?5, ?7, ?9, 11, 13, 15, 17]) >>>> np.diff(a) # another way to calculate the difference > array([ 1, ?3, ?5, ?7, ?9, 11, 13, 15, 17]) > > The error in the example you give is due to the fact that you iterate > over len(ARRAY), which is an integer, hence not an iterable object. You > should write ``for i in range(len(ARRAY))`` instead. > > Cheers, > > Emmanuelle > > On Sat, Jan 02, 2010 at 11:23:13AM +0100, Manuel Wittchen wrote: >> Hi, > >> I want to calculate the difference between the values of a >> numpy-array. The formula is: > >> deltaT = t_(n+1) - t_(n) > >> My approach to calculate the difference looks like this: > >> for i in len(ARRAY): >> ? ? ? delta_t[i] = ARRAY[(i+1):] - ARRAY[:(len(ARRAY)-1)] > >> print "result:", delta_t > >> But I get a TypeError: >> File "./test.py", line 19, in >> ? ? for i in len(ARRAY): >> TypeError: 'int' object is not iterable > >> Where is the mistake in the code? > >> Regards and a happy new year, >> Manuel Wittchen >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From tpk at kraussfamily.org Sat Jan 2 15:41:29 2010 From: tpk at kraussfamily.org (Tom K.) Date: Sat, 2 Jan 2010 12:41:29 -0800 (PST) Subject: [Numpy-discussion] ANN: upfirdn 0.2.0 Message-ID: <26996309.post@talk.nabble.com> ANNOUNCEMENT I am pleased to announce a new release of "upfirdn" - version 0.2.0. This package provides an efficient polyphase FIR resampler object (SWIG-ed C++) and some python wrappers. This release greatly improves installation with distutils relative to the initial 0.1.0 release. 0.2.0 includes no functional changes relative to 0.1.0. 
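For readers who have not met the term before: "upfirdn" is shorthand for upsample, FIR filter, downsample. A naive, non-polyphase reference version of the operation takes only a few lines of plain numpy. This is just a sketch of the underlying math, not the package's API, and the function name below is made up:

import numpy as np

def naive_upfirdn(h, x, p, q):
    # upsample by p: insert p - 1 zeros between input samples ("zero stuffing")
    up = np.zeros(len(x) * p)
    up[::p] = x
    # FIR filter with coefficients h, then keep every q-th output sample
    return np.convolve(h, up)[::q]

A polyphase implementation such as the one in the package computes the same thing without ever forming the zero-stuffed signal, which is where the efficiency comes from.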
Also, the source code is now browse-able online through a Google Code site with mercurial repository. https://opensource.motorola.com/sf/projects/upfirdn http://code.google.com/p/upfirdn/ Thanks to Google for providing this hosting service! -- View this message in context: http://old.nabble.com/ANN%3A-upfirdn-0.2.0-tp26996309p26996309.html Sent from the Numpy-discussion mailing list archive at Nabble.com. From njs at pobox.com Sun Jan 3 06:05:54 2010 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 3 Jan 2010 03:05:54 -0800 Subject: [Numpy-discussion] [matplotlib-devel] Announcing toydist, improving distribution and packaging situation In-Reply-To: <5b8d13220912290634u5902a6bag33ddb8a15a93406b@mail.gmail.com> References: <5b8d13220912280603p7221a264o875b0d5e74a5404@mail.gmail.com> <64ddb72c0912290527s1143efc7g3efe93936ca5de5@mail.gmail.com> <5b8d13220912290634u5902a6bag33ddb8a15a93406b@mail.gmail.com> Message-ID: <961fa2b41001030305mddd301fp416a2fe23fc11568@mail.gmail.com> On Tue, Dec 29, 2009 at 6:34 AM, David Cournapeau wrote: > Buildout, virtualenv all work by sandboxing from the system python: > each of them do not see each other, which may be useful for > development, but as a deployment solution to the casual user who may > not be familiar with python, it is useless. A scientist who installs > numpy, scipy, etc... to try things out want to have everything > available in one python interpreter, and does not want to jump to > different virtualenvs and whatnot to try different packages. What I do -- and documented for people in my lab to do -- is set up one virtualenv in my user account, and use it as my default python. (I 'activate' it from my login scripts.) The advantage of this is that easy_install (or pip) just works, without any hassle about permissions etc. This should be easier, but I think the basic approach is sound. "Integration with the package system" is useless; the advantage of distribution packages is that distributions can provide a single coherent system with consistent version numbers across all packages, etc., and the only way to "integrate" with that is to, well, get the packages into the distribution. On another note, I hope toydist will provide a "source prepare" step, that allows arbitrary code to be run on the source tree. (For, e.g., cython->C conversion, ad-hoc template languages, etc.) IME this is a very common pain point with distutils; there is just no good way to do it, and it has to be supported in the distribution utility in order to get everything right. In particular: -- Generated files should never be written to the source tree itself, but only the build directory -- Building from a source checkout should run the "source prepare" step automatically -- Building a source distribution should also run the "source prepare" step, and stash the results in such a way that when later building the source distribution, this step can be skipped. This is a common requirement for user convenience, and necessary if you want to avoid arbitrary code execution during builds. And if you just set up the distribution util so that the only place you can specify arbitrary code execution is in the "source prepare" step, then even people who know nothing about packaging will automatically get all of the above right. 
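To make that concrete, a "source prepare" step for a Cython-using project could be as small as the sketch below: walk the source tree and write the generated C files into a separate build directory, leaving the checkout untouched. This is purely illustrative -- the function name and layout are invented here, not a proposal for toydist's actual API -- and it only assumes the standard cython command-line tool is on the path.

import os
import subprocess

def prepare_sources(src_dir, build_dir):
    # regenerate C sources from Cython files, writing only under build_dir
    for root, dirs, files in os.walk(src_dir):
        for fname in files:
            if not fname.endswith('.pyx'):
                continue
            pyx = os.path.join(root, fname)
            rel = os.path.relpath(pyx, src_dir)
            target = os.path.join(build_dir, os.path.splitext(rel)[0] + '.c')
            if not os.path.isdir(os.path.dirname(target)):
                os.makedirs(os.path.dirname(target))
            subprocess.check_call(['cython', pyx, '-o', target])

Run at sdist time, the generated .c files can then be shipped with the source distribution, so that later builds from that tarball can skip the step entirely.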
Cheers, -- Nathaniel From gael.varoquaux at normalesup.org Sun Jan 3 06:11:53 2010 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Sun, 3 Jan 2010 12:11:53 +0100 Subject: [Numpy-discussion] [matplotlib-devel] Announcing toydist, improving distribution and packaging situation In-Reply-To: <961fa2b41001030305mddd301fp416a2fe23fc11568@mail.gmail.com> References: <5b8d13220912280603p7221a264o875b0d5e74a5404@mail.gmail.com> <64ddb72c0912290527s1143efc7g3efe93936ca5de5@mail.gmail.com> <5b8d13220912290634u5902a6bag33ddb8a15a93406b@mail.gmail.com> <961fa2b41001030305mddd301fp416a2fe23fc11568@mail.gmail.com> Message-ID: <20100103111153.GB24770@phare.normalesup.org> On Sun, Jan 03, 2010 at 03:05:54AM -0800, Nathaniel Smith wrote: > What I do -- and documented for people in my lab to do -- is set up > one virtualenv in my user account, and use it as my default python. (I > 'activate' it from my login scripts.) The advantage of this is that > easy_install (or pip) just works, without any hassle about permissions > etc. This should be easier, but I think the basic approach is sound. > "Integration with the package system" is useless; the advantage of > distribution packages is that distributions can provide a single > coherent system with consistent version numbers across all packages, > etc., and the only way to "integrate" with that is to, well, get the > packages into the distribution. That works because either you use packages that don't have much hard-core compiled dependencies, or these are already installed. Think about installing VTK or ITK this way, even something simpler such as umfpack. I think that you would loose most of your users. In my lab, I do lose users on such packages actually. Beside, what you are describing is possible without package isolation, it is simply the use of a per-user local site-packages, which now semi automatic in python2.6 using the '.local' directory. I do agree that, in a research lab, this is a best practice. Ga?l From cournape at gmail.com Sun Jan 3 06:24:59 2010 From: cournape at gmail.com (David Cournapeau) Date: Sun, 3 Jan 2010 20:24:59 +0900 Subject: [Numpy-discussion] [matplotlib-devel] [SciPy-dev] Announcing toydist, improving distribution and packaging situation In-Reply-To: <4B3F8FF6.7010800@astraw.com> References: <5b8d13220912280603p7221a264o875b0d5e74a5404@mail.gmail.com> <5b8d13220912300816s12c934adh4abdd6d703f8928f@mail.gmail.com> <5b8d13221001011832p5ca4875dj13f04ba7ee61dd41@mail.gmail.com> <20100102075848.GA17293@phare.normalesup.org> <5b8d13221001020218j32eaff29v2c777c513373a43c@mail.gmail.com> <4B3F8FF6.7010800@astraw.com> Message-ID: <5b8d13221001030324i67c3125dh8efa68c04a539659@mail.gmail.com> On Sun, Jan 3, 2010 at 3:27 AM, Andrew Straw wrote: >> > Typically, the dependencies only depend on the smallest subset of what > they require (if they don't need lapack, they'd only depend on > python-numpy-core in your example), but yes, if there's an unsatisfiable > condition, then apt-get will raise an error and abort. In practice, this > system seems to work quite well, IMO. Yes, but: - debian dependency resolution is complex. I think many people don't realize how complex the problem really is (AFAIK, any correct scheme to resolve dependencies in debian requires an algorithm which is NP-complete ) - introducing a lot of variants significantly slow down the whole thing. I think it worths thinking whether our problems warrant such a complexity. 
> > Anyhow, here's the full Debian documentation: > http://www.debian.org/doc/debian-policy/ch-relationships.html This is not the part I am afraid of. This is: http://people.debian.org/~dburrows/model.pdf cheers, David From cournape at gmail.com Sun Jan 3 07:23:14 2010 From: cournape at gmail.com (David Cournapeau) Date: Sun, 3 Jan 2010 21:23:14 +0900 Subject: [Numpy-discussion] [matplotlib-devel] Announcing toydist, improving distribution and packaging situation In-Reply-To: <961fa2b41001030305mddd301fp416a2fe23fc11568@mail.gmail.com> References: <5b8d13220912280603p7221a264o875b0d5e74a5404@mail.gmail.com> <64ddb72c0912290527s1143efc7g3efe93936ca5de5@mail.gmail.com> <5b8d13220912290634u5902a6bag33ddb8a15a93406b@mail.gmail.com> <961fa2b41001030305mddd301fp416a2fe23fc11568@mail.gmail.com> Message-ID: <5b8d13221001030423j96fdb72l832964f6c5df7f97@mail.gmail.com> On Sun, Jan 3, 2010 at 8:05 PM, Nathaniel Smith wrote: > On Tue, Dec 29, 2009 at 6:34 AM, David Cournapeau wrote: >> Buildout, virtualenv all work by sandboxing from the system python: >> each of them do not see each other, which may be useful for >> development, but as a deployment solution to the casual user who may >> not be familiar with python, it is useless. A scientist who installs >> numpy, scipy, etc... to try things out want to have everything >> available in one python interpreter, and does not want to jump to >> different virtualenvs and whatnot to try different packages. > > What I do -- and documented for people in my lab to do -- is set up > one virtualenv in my user account, and use it as my default python. (I > 'activate' it from my login scripts.) The advantage of this is that > easy_install (or pip) just works, without any hassle about permissions > etc. It just works if you happen to be able to build everything from sources. That alone means you ignore the majority of users I intend to target. No other community (except maybe Ruby) push those isolated install solutions as a general deployment solutions. If it were such a great idea, other people would have picked up those solutions. > This should be easier, but I think the basic approach is sound. > "Integration with the package system" is useless; the advantage of > distribution packages is that distributions can provide a single > coherent system with consistent version numbers across all packages, > etc., and the only way to "integrate" with that is to, well, get the > packages into the distribution. Another way is to provide our own repository for a few major distributions, with automatically built packages. This is how most open source providers work. Miguel de Icaza explains this well: http://tirania.org/blog/archive/2007/Jan-26.html I hope we will be able to reuse much of the opensuse build service infrastructure. > > On another note, I hope toydist will provide a "source prepare" step, > that allows arbitrary code to be run on the source tree. (For, e.g., > cython->C conversion, ad-hoc template languages, etc.) IME this is a > very common pain point with distutils; there is just no good way to do > it, and it has to be supported in the distribution utility in order to > get everything right. 
In particular: > ?-- Generated files should never be written to the source tree > itself, but only the build directory > ?-- Building from a source checkout should run the "source prepare" > step automatically > ?-- Building a source distribution should also run the "source > prepare" step, and stash the results in such a way that when later > building the source distribution, this step can be skipped. This is a > common requirement for user convenience, and necessary if you want to > avoid arbitrary code execution during builds. Build directories are hard to implement right. I don't think toydist will support this directly. IMO, those advanced builds warrant a real build tool - one main goal of toydist is to make integration with waf or scons much easier. Both waf and scons have the concept of a build directory, which should do everything you described. David From njs at pobox.com Sun Jan 3 18:42:32 2010 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 3 Jan 2010 15:42:32 -0800 Subject: [Numpy-discussion] [matplotlib-devel] Announcing toydist, improving distribution and packaging situation In-Reply-To: <5b8d13221001030423j96fdb72l832964f6c5df7f97@mail.gmail.com> References: <5b8d13220912280603p7221a264o875b0d5e74a5404@mail.gmail.com> <64ddb72c0912290527s1143efc7g3efe93936ca5de5@mail.gmail.com> <5b8d13220912290634u5902a6bag33ddb8a15a93406b@mail.gmail.com> <961fa2b41001030305mddd301fp416a2fe23fc11568@mail.gmail.com> <5b8d13221001030423j96fdb72l832964f6c5df7f97@mail.gmail.com> Message-ID: <961fa2b41001031542t203e8ef6mee8f590095e54d18@mail.gmail.com> On Sun, Jan 3, 2010 at 4:23 AM, David Cournapeau wrote: > On Sun, Jan 3, 2010 at 8:05 PM, Nathaniel Smith wrote: >> What I do -- and documented for people in my lab to do -- is set up >> one virtualenv in my user account, and use it as my default python. (I >> 'activate' it from my login scripts.) The advantage of this is that >> easy_install (or pip) just works, without any hassle about permissions >> etc. > > It just works if you happen to be able to build everything from > sources. That alone means you ignore the majority of users I intend to > target. > > No other community (except maybe Ruby) push those isolated install > solutions as a general deployment solutions. If it were such a great > idea, other people would have picked up those solutions. AFAICT, R works more-or-less identically (once I convinced it to use a per-user library directory); install.packages() builds from source, and doesn't automatically pull in and build random C library dependencies. I'm not advocating the 'every app in its own world' model that virtualenv's designers had min mind, but virtualenv is very useful to give each user their own world. Normally I only use a fraction of virtualenv's power this way, but sometimes it's handy that they've solved the more general problem -- I can easily move my environment out of the way and rebuild if I've done something stupid, or experiment with new python versions in isolation, or whatever. And when you *do* have to reproduce some old environment -- if only to test that the new improved environment gives the same results -- then it's *really* handy. >> This should be easier, but I think the basic approach is sound. 
>> "Integration with the package system" is useless; the advantage of >> distribution packages is that distributions can provide a single >> coherent system with consistent version numbers across all packages, >> etc., and the only way to "integrate" with that is to, well, get the >> packages into the distribution. > > Another way is to provide our own repository for a few major > distributions, with automatically built packages. This is how most > open source providers work. Miguel de Icaza explains this well: > > http://tirania.org/blog/archive/2007/Jan-26.html > > I hope we will be able to reuse much of the opensuse build service > infrastructure. Sure, I'm aware of the opensuse build service, have built third-party packages for my projects, etc. It's a good attempt, but also has a lot of problems, and when talking about scientific software it's totally useless to me :-). First, I don't have root on our compute cluster. Second, even if I did I'd be very leery about installing third-party packages because there is no guarantee that the version numbering will be consistent between the third-party repo and the real distro repo -- suppose that the distro packages 0.1, then the third party packages 0.2, then the distro packages 0.3, will upgrades be seamless? What if the third party screws up the version numbering at some point? Debian has "epochs" to deal with this, but third-parties can't use them and maintain compatibility. What if the person making the third party packages is not an expert on these random distros that they don't even use? Will bug reporting tools work properly? Distros are complicated. Third, while we shouldn't advocate that people screw up backwards compatibility, version skew is a real issue. If I need one version of a package and my lab-mate needs another and we have submissions due tomorrow, then filing bugs is a great idea but not a solution. Fourth, even if we had expert maintainers taking care of all these third-party packages and all my concerns were answered, I couldn't convince our sysadmin of that; he's the one who'd have to clean up if something went wrong we don't have a big budget for overtime. Let's be honest -- scientists, on the whole, suck at IT infrastructure, and small individual packages are not going to be very expertly put together. IMHO any real solution should take this into account, keep them sandboxed from the rest of the system, and focus on providing the most friendly and seamless sandbox possible. >> On another note, I hope toydist will provide a "source prepare" step, >> that allows arbitrary code to be run on the source tree. (For, e.g., >> cython->C conversion, ad-hoc template languages, etc.) IME this is a >> very common pain point with distutils; there is just no good way to do >> it, and it has to be supported in the distribution utility in order to >> get everything right. In particular: >> ?-- Generated files should never be written to the source tree >> itself, but only the build directory >> ?-- Building from a source checkout should run the "source prepare" >> step automatically >> ?-- Building a source distribution should also run the "source >> prepare" step, and stash the results in such a way that when later >> building the source distribution, this step can be skipped. This is a >> common requirement for user convenience, and necessary if you want to >> avoid arbitrary code execution during builds. > > Build directories are hard to implement right. I don't think toydist > will support this directly. 
IMO, those advanced builds warrant a real > build tool - one main goal of toydist is to make integration with waf > or scons much easier. Both waf and scons have the concept of a build > directory, which should do everything you described. Maybe I was unclear -- proper build directory handling is nice, Cython/Pyrex's distutils integration get it wrong (not their fault, distutils is just impossible to do anything sensible with, as you've said), and I've never found build directories hard to implement (perhaps I'm missing something). But what I'm really talking about is having a "pre-build" step that integrates properly with the source and binary packaging stages, and that's not something waf or scons have any particular support for, AFAIK. -- Nathaniel From robert.kern at gmail.com Sun Jan 3 18:52:04 2010 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 3 Jan 2010 17:52:04 -0600 Subject: [Numpy-discussion] [matplotlib-devel] Announcing toydist, improving distribution and packaging situation In-Reply-To: <961fa2b41001031542t203e8ef6mee8f590095e54d18@mail.gmail.com> References: <5b8d13220912280603p7221a264o875b0d5e74a5404@mail.gmail.com> <64ddb72c0912290527s1143efc7g3efe93936ca5de5@mail.gmail.com> <5b8d13220912290634u5902a6bag33ddb8a15a93406b@mail.gmail.com> <961fa2b41001030305mddd301fp416a2fe23fc11568@mail.gmail.com> <5b8d13221001030423j96fdb72l832964f6c5df7f97@mail.gmail.com> <961fa2b41001031542t203e8ef6mee8f590095e54d18@mail.gmail.com> Message-ID: <3d375d731001031552u6783266t9b035ece83c14927@mail.gmail.com> On Sun, Jan 3, 2010 at 17:42, Nathaniel Smith wrote: > On Sun, Jan 3, 2010 at 4:23 AM, David Cournapeau wrote: >> On Sun, Jan 3, 2010 at 8:05 PM, Nathaniel Smith wrote: >>> What I do -- and documented for people in my lab to do -- is set up >>> one virtualenv in my user account, and use it as my default python. (I >>> 'activate' it from my login scripts.) The advantage of this is that >>> easy_install (or pip) just works, without any hassle about permissions >>> etc. >> >> It just works if you happen to be able to build everything from >> sources. That alone means you ignore the majority of users I intend to >> target. >> >> No other community (except maybe Ruby) push those isolated install >> solutions as a general deployment solutions. If it were such a great >> idea, other people would have picked up those solutions. > > AFAICT, R works more-or-less identically (once I convinced it to use a > per-user library directory); install.packages() builds from source, > and doesn't automatically pull in and build random C library > dependencies. That's not quite the same. That is the R equivalent of Python's recent per-user site-packages feature (every user get's their own sandbox), not virtualenv (every project gets it's own sandbox). The former feature has a long history in the multiuser UNIX world and is not really controversial. http://www.python.org/dev/peps/pep-0370/ -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From cournape at gmail.com Mon Jan 4 02:25:44 2010 From: cournape at gmail.com (David Cournapeau) Date: Mon, 4 Jan 2010 16:25:44 +0900 Subject: [Numpy-discussion] [matplotlib-devel] Announcing toydist, improving distribution and packaging situation In-Reply-To: <961fa2b41001031542t203e8ef6mee8f590095e54d18@mail.gmail.com> References: <5b8d13220912280603p7221a264o875b0d5e74a5404@mail.gmail.com> <64ddb72c0912290527s1143efc7g3efe93936ca5de5@mail.gmail.com> <5b8d13220912290634u5902a6bag33ddb8a15a93406b@mail.gmail.com> <961fa2b41001030305mddd301fp416a2fe23fc11568@mail.gmail.com> <5b8d13221001030423j96fdb72l832964f6c5df7f97@mail.gmail.com> <961fa2b41001031542t203e8ef6mee8f590095e54d18@mail.gmail.com> Message-ID: <5b8d13221001032325o250e4d5ao5f476b384ad6dd17@mail.gmail.com> On Mon, Jan 4, 2010 at 8:42 AM, Nathaniel Smith wrote: > On Sun, Jan 3, 2010 at 4:23 AM, David Cournapeau wrote: >> On Sun, Jan 3, 2010 at 8:05 PM, Nathaniel Smith wrote: >>> What I do -- and documented for people in my lab to do -- is set up >>> one virtualenv in my user account, and use it as my default python. (I >>> 'activate' it from my login scripts.) The advantage of this is that >>> easy_install (or pip) just works, without any hassle about permissions >>> etc. >> >> It just works if you happen to be able to build everything from >> sources. That alone means you ignore the majority of users I intend to >> target. >> >> No other community (except maybe Ruby) push those isolated install >> solutions as a general deployment solutions. If it were such a great >> idea, other people would have picked up those solutions. > > AFAICT, R works more-or-less identically (once I convinced it to use a > per-user library directory); install.packages() builds from source, > and doesn't automatically pull in and build random C library > dependencies. As mentioned by Robert, this is different from the usual virtualenv approach. Per-user app installation is certainly a useful (and uncontroversial) feature. And R does support automatically-built binary installers. > > Sure, I'm aware of the opensuse build service, have built third-party > packages for my projects, etc. It's a good attempt, but also has a lot > of problems, and when talking about scientific software it's totally > useless to me :-). First, I don't have root on our compute cluster. True, non-root install is a problem. Nothing *prevents* dpkg to run in non root environment in principle if the packages itself does not require it, but it is not really supported by the tools ATM. > Second, even if I did I'd be very leery about installing third-party > packages because there is no guarantee that the version numbering will > be consistent between the third-party repo and the real distro repo -- > suppose that the distro packages 0.1, then the third party packages > 0.2, then the distro packages 0.3, will upgrades be seamless? What if > the third party screws up the version numbering at some point? Debian > has "epochs" to deal with this, but third-parties can't use them and > maintain compatibility. Actually, at least with .deb-based distributions, this issue has a solution. As packages has their own version in addition to the upstream version, PPA-built packages have their own versions. 
https://help.launchpad.net/Packaging/PPA/BuildingASourcePackage Of course, this assumes a simple versioning scheme in the first place, instead of the cluster-fck that versioning has became within python packages (again, the scheme used in python is much more complicated than everyone else, and it seems that nobody has ever stopped and thought 5 minutes about the consequences, and whether this complexity was a good idea in the first place). > What if the person making the third party > packages is not an expert on these random distros that they don't even > use? I think simple rules/conventions + build farms would solve most issues. The problem is if you allow total flexibility as input, then automatic and simple solutions become impossible. Certainly, PPA and the build service provides for a much better experience than anything pypi has ever given to me. > Third, while we shouldn't advocate that people screw up backwards > compatibility, version skew is a real issue. If I need one version of > a package and my lab-mate needs another and we have submissions due > tomorrow, then filing bugs is a great idea but not a solution. Nothing prevents you from using virtualenv in that case (I may sound dismissive of those tools, but I am really not. I use them myselves. What I strongly react to is when those are pushed as the de-facto, standard method). > Fourth, > even if we had expert maintainers taking care of all these third-party > packages and all my concerns were answered, I couldn't convince our > sysadmin of that; he's the one who'd have to clean up if something > went wrong we don't have a big budget for overtime. I am not advocating using only packaged, binary installers. I am advocating using them as much as possible where it makes sense - on windows and mac os x in particular. Toydist also aims at making it easier to build, customize installs. Although not yet implemented, --user-like scheme would be quite simple to implement, because toydist installer internally uses autoconf-like directories description (of which --user is a special case). If you need sandboxed installs, customized installs, toydist will not prevent it. It is certainly my intention to make it possible to use virtualenv and co (you already can by building eggs, actually). I hope that by having our own "SciPi", we can actually have a more reliable approach. For example, the static dependency description + mandated metadata would make this much easier and more robust, as there would not be a need to run a setup.py to get the dependencies. If you look at hackageDB (http://hackage.haskell.org/packages/hackage.html), they have a very simple index structure, which makes it easy to download it entirely, and reuse this locally to avoid any internet access. > Let's be honest -- scientists, on the whole, suck at IT > infrastructure, and small individual packages are not going to be very > expertly put together. IMHO any real solution should take this into > account, keep them sandboxed from the rest of the system, and focus on > providing the most friendly and seamless sandbox possible. I agree packages will not always be well put together - but I don't see why this would be worse than the current situation. I also strongly disagree about the sandboxing as the solution of choice. For most users, having only one install of most packages is the typical use-case. Once you start sandboxing, you create artificial barriers between the sandboxes, and this becomes too complicated for most users IMHO. 
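To make the earlier point about a static dependency description concrete: the idea is that the metadata lives in a declarative file that any tool (an installer, a repository indexer, a converter to .deb/.rpm) can read without executing a setup.py. A purely hypothetical sketch -- not toydist's actual syntax -- might look like:

Name: foo
Version: 0.2
Summary: example extension built on top of numpy
Requires:
    numpy >= 1.3,
    scipy
Library:
    Extension: foo._core
        Sources: src/core.c

Because a file like this can be parsed in isolation, a SciPi-style index could be downloaded in one go and dependency resolution done entirely offline, which is essentially what the hackageDB index already allows for Cabal packages.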
> > Maybe I was unclear -- proper build directory handling is nice, > Cython/Pyrex's distutils integration get it wrong (not their fault, > distutils is just impossible to do anything sensible with, as you've > said), and I've never found build directories hard to implement It is simple if you have a good infrastructure in place (node abstraction, etc...), but that infrastructure is hard to get right. > But what I'm really talking about is > having a "pre-build" step that integrates properly with the source and > binary packaging stages, and that's not something waf or scons have > any particular support for, AFAIK. Could you explain with a concrete example what a pre-build stage would look like ? I don't think I understand what you want, cheers, David From dagss at student.matnat.uio.no Mon Jan 4 03:48:43 2010 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Mon, 4 Jan 2010 09:48:43 +0100 Subject: [Numpy-discussion] [matplotlib-devel] Announcing toydist, improving distribution and packaging situation In-Reply-To: <961fa2b41001031542t203e8ef6mee8f590095e54d18@mail.gmail.com> References: <5b8d13220912280603p7221a264o875b0d5e74a5404@mail.gmail.com> <64ddb72c0912290527s1143efc7g3efe93936ca5de5@mail.gmail.com> <5b8d13220912290634u5902a6bag33ddb8a15a93406b@mail.gmail.com> <961fa2b41001030305mddd301fp416a2fe23fc11568@mail.gmail.com> <5b8d13221001030423j96fdb72l832964f6c5df7f97@mail.gmail.com> <961fa2b41001031542t203e8ef6mee8f590095e54d18@mail.gmail.com> Message-ID: <89f11d872447395b3271dbea1fd5cd9d.squirrel@webmail.uio.no> Nathaniel Smith wrote: > On Sun, Jan 3, 2010 at 4:23 AM, David Cournapeau > wrote: >> Another way is to provide our own repository for a few major >> distributions, with automatically built packages. This is how most >> open source providers work. Miguel de Icaza explains this well: >> >> http://tirania.org/blog/archive/2007/Jan-26.html >> >> I hope we will be able to reuse much of the opensuse build service >> infrastructure. > > Sure, I'm aware of the opensuse build service, have built third-party > packages for my projects, etc. It's a good attempt, but also has a lot > of problems, and when talking about scientific software it's totally > useless to me :-). First, I don't have root on our compute cluster. I use Sage for this very reason, and others use EPD or FEMHub or Python(x,y) for the same reasons. Rolling this into the Python package distribution scheme seems backwards though, since a lot of binary packages that have nothing to do with Python are used as well -- the Python stuff is simply thin wrappers around what should ideally be located in /usr/lib or similar (but are nowadays compiled into the Python extension .so because of distribution problems). To solve the exact problem you (and me) have I think the best solution is to integrate the tools mentioned above with what David is planning (SciPI etc.). Or if that isn't good enough, find generic "userland package manager" that has nothing to do with Python (I'm sure a dozen half-finished ones must have been written but didn't look), finish it, and connect it to SciPI. What David does (I think) is seperate the concerns. This makes the task feasible, and also has the advantage of convenience for the people that *do* want to use Ubuntu, Red Hat or whatever to roll out scientific software on hundreds of clients. 
Dag Sverre

From cournape at gmail.com Mon Jan 4 04:11:13 2010
From: cournape at gmail.com (David Cournapeau)
Date: Mon, 4 Jan 2010 18:11:13 +0900
Subject: [Numpy-discussion] [matplotlib-devel] Announcing toydist, improving distribution and packaging situation
In-Reply-To: <89f11d872447395b3271dbea1fd5cd9d.squirrel@webmail.uio.no>
References: <5b8d13220912280603p7221a264o875b0d5e74a5404@mail.gmail.com> <64ddb72c0912290527s1143efc7g3efe93936ca5de5@mail.gmail.com> <5b8d13220912290634u5902a6bag33ddb8a15a93406b@mail.gmail.com> <961fa2b41001030305mddd301fp416a2fe23fc11568@mail.gmail.com> <5b8d13221001030423j96fdb72l832964f6c5df7f97@mail.gmail.com> <961fa2b41001031542t203e8ef6mee8f590095e54d18@mail.gmail.com> <89f11d872447395b3271dbea1fd5cd9d.squirrel@webmail.uio.no>
Message-ID: <5b8d13221001040111x29cd463al7f9559ff93655508@mail.gmail.com>

On Mon, Jan 4, 2010 at 5:48 PM, Dag Sverre Seljebotn wrote:
>
> Rolling this into the Python package distribution scheme seems backwards
> though, since a lot of binary packages that have nothing to do with Python
> are used as well

Yep, exactly.

> To solve the exact problem you (and me) have I think the best solution is
> to integrate the tools mentioned above with what David is planning (SciPI
> etc.). Or if that isn't good enough, find generic "userland package
> manager" that has nothing to do with Python (I'm sure a dozen
> half-finished ones must have been written but didn't look), finish it, and
> connect it to SciPI.

You have 0install, autopackage and klik, to cite the ones I know about. I wish people had looked at those before rolling toy solutions to complex problems.

>
> What David does (I think) is seperate the concerns.

Exactly - you've described this better than I did

David

From Chris.Barker at noaa.gov Mon Jan 4 20:05:30 2010
From: Chris.Barker at noaa.gov (Christopher Barker)
Date: Mon, 04 Jan 2010 17:05:30 -0800
Subject: [Numpy-discussion] fromfile() for reading text (one more time!)
Message-ID: <4B42905A.4080105@noaa.gov>

Hi folks,

I'm taking a look once again at fromfile() for reading text files. I often have the need to read a LOT of numbers from a text file, and it can actually be pretty darn slow to do it the normal python way:

for line in file:
    data = map(float, line.strip().split())

or various other versions that are similar. It really does take longer to read the text, split it up, convert to a number, then put that number into a numpy array, than it does to simply read it straight into the array.

However, as it stands, fromfile() turns out to be next to useless for anything but whitespace separated text. Full set of ideas here:

http://projects.scipy.org/numpy/ticket/909

However, for the moment, I'm digging into the code to address a particular problem -- reading files like this:

123, 65.6, 789
23, 3.2, 34
...

That is comma (or whatever) separated text -- pretty common stuff.

The problem with the current code is that you can't read more than one line at a time with fromfile:

a = np.fromfile(infile, sep=",")

will read until it doesn't find a comma, and thus only one line, as there is no comma after each line. As this is a really typical case, I think it should be supported.

Here is the question:

The work of finding the separator is done in:

multiarray/ctors.c: fromfile_skip_separator()

It looks like it wouldn't be too hard to add some code in there to look for a newline, and consider that a valid separator. However, that would break backward compatibility. So maybe a flag could be passed in, saying you wanted to support newlines.
The problem is that flag would have to get passed all the way through to this function (and also for fromstring). I also notice that it supports separators of arbitrary length, which I wonder how useful that is. But it also does odd things with spaces embedded in the separator: ", $ #" matches all of: ",$#" ", $#" ",$ #" Is it worth trying to fix that? In the longer term, it would be really nice to support comments as well, tough that would require more of a re-factoring of the code, I think (though maybe not -- I suppose a call to fromfile_skip_separator() could look for a comment character, then if it found one, skip to where the comment ends -- hmmm. thanks for any feedback, -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From alan at ajackson.org Mon Jan 4 22:39:42 2010 From: alan at ajackson.org (alan at ajackson.org) Date: Mon, 4 Jan 2010 21:39:42 -0600 Subject: [Numpy-discussion] fromfile() for reading text (one more time!) In-Reply-To: <4B42905A.4080105@noaa.gov> References: <4B42905A.4080105@noaa.gov> Message-ID: <20100104213942.588435c2@ajackson.org> >Hi folks, > >I'm taking a look once again at fromfile() for reading text files. I >often have the need to read a LOT of numbers form a text file, and it >can actually be pretty darn slow do i the normal python way: > >for line in file: > data = map(float, line.strip().split()) > > >or various other versions that are similar. It really does take longer >to read the text, split it up, convert to a number, then put that number >into a numpy array, than it does to simply read it straight into the array. > >However, as it stands, fromfile() turn out to be next to useless for >anything but whitespace separated text. Full set of ideas here: > >http://projects.scipy.org/numpy/ticket/909 > >However, for the moment, I'm digging into the code to address a >particular problem -- reading files like this: > >123, 65.6, 789 >23, 3.2, 34 >... > >That is comma (or whatever) separated text -- pretty common stuff. > >The problem with the current code is that you can't read more than one >line at time with fromfile: > >a = np.fromfile(infile, sep=",") > >will read until it doesn't find a comma, and thus only one line, as >there is no comma after each line. As this is a really typical case, I >think it should be supported. > >Here is the question: > >The work of finding the separator is done in: > >multiarray/ctors.c: fromfile_skip_separator() > >It looks like it wouldn't be too hard to add some code in there to look >for a newline, and consider that a valid separator. However, that would >break backward compatibility. So maybe a flag could be passed in, saying >you wanted to support newlines. The problem is that flag would have to >get passed all the way through to this function (and also for fromstring). > >I also notice that it supports separators of arbitrary length, which I >wonder how useful that is. But it also does odd things with spaces >embedded in the separator: > >", $ #" matches all of: ",$#" ", $#" ",$ #" > >Is it worth trying to fix that? > > >In the longer term, it would be really nice to support comments as well, >tough that would require more of a re-factoring of the code, I think >(though maybe not -- I suppose a call to fromfile_skip_separator() could >look for a comment character, then if it found one, skip to where the >comment ends -- hmmm. 
> >thanks for any feedback, > >-Chris > I agree. I've tried using it, and usually find that it doesn't quite get there. I rather like the R command(s) for reading text files - except then I have to use R which is painful after using python and numpy. Although ggplot2 is awfully nice too ... but that is a later post. read.table(file, header = FALSE, sep = "", quote = "\"'", dec = ".", row.names, col.names, as.is = !stringsAsFactors, na.strings = "NA", colClasses = NA, nrows = -1, skip = 0, check.names = TRUE, fill = !blank.lines.skip, strip.white = FALSE, blank.lines.skip = TRUE, comment.char = "#", allowEscapes = FALSE, flush = FALSE, stringsAsFactors = default.stringsAsFactors(), fileEncoding = "", encoding = "unknown") read.csv(file, header = TRUE, sep = ",", quote="\"", dec=".", fill = TRUE, comment.char="", ...) read.csv2(file, header = TRUE, sep = ";", quote="\"", dec=",", fill = TRUE, comment.char="", ...) read.delim(file, header = TRUE, sep = "\t", quote="\"", dec=".", fill = TRUE, comment.char="", ...) read.delim2(file, header = TRUE, sep = "\t", quote="\"", dec=",", fill = TRUE, comment.char="", ...) There is really only read.table, the others are just aliases with different defaults. But the flexibility is great, as you can see. -- ----------------------------------------------------------------------- | Alan K. Jackson | To see a World in a Grain of Sand | | alan at ajackson.org | And a Heaven in a Wild Flower, | | www.ajackson.org | Hold Infinity in the palm of your hand | | Houston, Texas | And Eternity in an hour. - Blake | ----------------------------------------------------------------------- From josef.pktd at gmail.com Mon Jan 4 22:45:04 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 4 Jan 2010 22:45:04 -0500 Subject: [Numpy-discussion] fromfile() for reading text (one more time!) In-Reply-To: <20100104213942.588435c2@ajackson.org> References: <4B42905A.4080105@noaa.gov> <20100104213942.588435c2@ajackson.org> Message-ID: <1cd32cbb1001041945n24a5c49qcf80ef58430596f2@mail.gmail.com> On Mon, Jan 4, 2010 at 10:39 PM, wrote: >>Hi folks, >> >>I'm taking a look once again at fromfile() for reading text files. I >>often have the need to read a LOT of numbers form a text file, and it >>can actually be pretty darn slow do i the normal python way: >> >>for line in file: >> ? ?data = map(float, line.strip().split()) >> >> >>or various other versions that are similar. It really does take longer >>to read the text, split it up, convert to a number, then put that number >>into a numpy array, than it does to simply read it straight into the array. >> >>However, as it stands, fromfile() turn out to be next to useless for >>anything but whitespace separated text. Full set of ideas here: >> >>http://projects.scipy.org/numpy/ticket/909 >> >>However, for the moment, I'm digging into the code to address a >>particular problem -- reading files like this: >> >>123, 65.6, 789 >>23, ?3.2, ?34 >>... >> >>That is comma (or whatever) separated text -- pretty common stuff. >> >>The problem with the current code is that you can't read more than one >>line at time with fromfile: >> >>a = np.fromfile(infile, sep=",") >> >>will read until it doesn't find a comma, and thus only one line, as >>there is no comma after each line. As this is a really typical case, I >>think it should be supported. 
>> >>Here is the question: >> >>The work of finding the separator is done in: >> >>multiarray/ctors.c: ?fromfile_skip_separator() >> >>It looks like it wouldn't be too hard to add some code in there to look >>for a newline, and consider that a valid separator. However, that would >>break backward compatibility. So maybe a flag could be passed in, saying >>you wanted to support newlines. The problem is that flag would have to >>get passed all the way through to this function (and also for fromstring). >> >>I also notice that it supports separators of arbitrary length, which I >>wonder how useful that is. But it also does odd things with spaces >>embedded in the separator: >> >>", $ #" matches all of: ?",$#" ? ", $#" ?",$ #" >> >>Is it worth trying to fix that? >> >> >>In the longer term, it would be really nice to support comments as well, >>tough that would require more of a re-factoring of the code, I think >>(though maybe not -- I suppose a call to fromfile_skip_separator() could >>look for a comment character, then if it found one, skip to where the >>comment ends -- hmmm. >> >>thanks for any feedback, >> >>-Chris >> > > I agree. I've tried using it, and usually find that it doesn't quite get there. > > I rather like the R command(s) for reading text files - except then I have to > use R which is painful after using python and numpy. Although ggplot2 is > awfully nice too ... but that is a later post. > > ? ? read.table(file, header = FALSE, sep = "", quote = "\"'", > ? ? ? ? ? ? ? ?dec = ".", row.names, col.names, > ? ? ? ? ? ? ? ?as.is = !stringsAsFactors, > ? ? ? ? ? ? ? ?na.strings = "NA", colClasses = NA, nrows = -1, > ? ? ? ? ? ? ? ?skip = 0, check.names = TRUE, fill = !blank.lines.skip, > ? ? ? ? ? ? ? ?strip.white = FALSE, blank.lines.skip = TRUE, > ? ? ? ? ? ? ? ?comment.char = "#", > ? ? ? ? ? ? ? ?allowEscapes = FALSE, flush = FALSE, > ? ? ? ? ? ? ? ?stringsAsFactors = default.stringsAsFactors(), > ? ? ? ? ? ? ? ?fileEncoding = "", encoding = "unknown") > > ? ? read.csv(file, header = TRUE, sep = ",", quote="\"", dec=".", > ? ? ? ? ? ? ?fill = TRUE, comment.char="", ...) > > ? ? read.csv2(file, header = TRUE, sep = ";", quote="\"", dec=",", > ? ? ? ? ? ? ? fill = TRUE, comment.char="", ...) > > ? ? read.delim(file, header = TRUE, sep = "\t", quote="\"", dec=".", > ? ? ? ? ? ? ? ?fill = TRUE, comment.char="", ...) > > ? ? read.delim2(file, header = TRUE, sep = "\t", quote="\"", dec=",", > ? ? ? ? ? ? ? ? fill = TRUE, comment.char="", ...) > > > There is really only read.table, the others are just aliases with different > defaults. ?But the flexibility is great, as you can see. Aren't the newly improved numpy.genfromtxt(fname, dtype=, comments='#', delimiter=None, skiprows=0, converters=None, missing='', missing_values=None, usecols=None, names=None, excludelist=None, deletechars=None, case_sensitive=True, unpack=None, usemask=False, loose=True) and friends indented to handle all this Josef > > -- > ----------------------------------------------------------------------- > | Alan K. Jackson ? ? ? ? ? ?| To see a World in a Grain of Sand ? ? ?| > | alan at ajackson.org ? ? ? ? ?| And a Heaven in a Wild Flower, ? ? ? ? | > | www.ajackson.org ? ? ? ? ? | Hold Infinity in the palm of your hand | > | Houston, Texas ? ? ? ? ? ? | And Eternity in an hour. - Blake ? ? ? 
| > ----------------------------------------------------------------------- > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From pivanov314 at gmail.com Tue Jan 5 03:30:17 2010 From: pivanov314 at gmail.com (Paul Ivanov) Date: Tue, 05 Jan 2010 00:30:17 -0800 Subject: [Numpy-discussion] fromfile() for reading text (one more time!) In-Reply-To: <4B42905A.4080105@noaa.gov> References: <4B42905A.4080105@noaa.gov> Message-ID: <4B42F899.90109@gmail.com> Christopher Barker, on 2010-01-04 17:05, wrote: > Hi folks, > > I'm taking a look once again at fromfile() for reading text files. I > often have the need to read a LOT of numbers form a text file, and it > can actually be pretty darn slow do i the normal python way: > > for line in file: > data = map(float, line.strip().split()) > > > or various other versions that are similar. It really does take longer > to read the text, split it up, convert to a number, then put that number > into a numpy array, than it does to simply read it straight into the array. > > However, as it stands, fromfile() turn out to be next to useless for > anything but whitespace separated text. Full set of ideas here: > > http://projects.scipy.org/numpy/ticket/909 > > However, for the moment, I'm digging into the code to address a > particular problem -- reading files like this: > > 123, 65.6, 789 > 23, 3.2, 34 > ... > > That is comma (or whatever) separated text -- pretty common stuff. > > The problem with the current code is that you can't read more than one > line at time with fromfile: > > a = np.fromfile(infile, sep=",") > > will read until it doesn't find a comma, and thus only one line, as > there is no comma after each line. As this is a really typical case, I > think it should be supported. Just a potshot, but have you tried np.loadtxt? I find it pretty fast. > > Here is the question: > > The work of finding the separator is done in: > > multiarray/ctors.c: fromfile_skip_separator() > > It looks like it wouldn't be too hard to add some code in there to look > for a newline, and consider that a valid separator. However, that would > break backward compatibility. So maybe a flag could be passed in, saying > you wanted to support newlines. The problem is that flag would have to > get passed all the way through to this function (and also for fromstring). > > I also notice that it supports separators of arbitrary length, which I > wonder how useful that is. But it also does odd things with spaces > embedded in the separator: > > ", $ #" matches all of: ",$#" ", $#" ",$ #" > > Is it worth trying to fix that? > > > In the longer term, it would be really nice to support comments as well, > tough that would require more of a re-factoring of the code, I think > (though maybe not -- I suppose a call to fromfile_skip_separator() could > look for a comment character, then if it found one, skip to where the > comment ends -- hmmm. > > thanks for any feedback, > > -Chris > > > > > > > From weisen123 at gmail.com Tue Jan 5 11:35:22 2010 From: weisen123 at gmail.com (neil weisenfeld) Date: Tue, 5 Jan 2010 11:35:22 -0500 Subject: [Numpy-discussion] 1.4.0 installer fails on OSX 10.6.2 Message-ID: <9d5ec4221001050835w5dd4cf97hffd4f2480bf19c3e@mail.gmail.com> Hi all, I'm having an odd problem with the package installer for numpy 1.4.0. It complains: numpy 1.4.0 can't be installed on this disk. numpy requires System Python 2.6 to install. 
I'm running a stock system with a stock python, so I'm not sure why the test is failing. Any ideas how to debug this? Some info: weisen at Neil-Weisenfeld-MacBook-Pro:~ [507]$ which python /usr/bin/python weisen at Neil-Weisenfeld-MacBook-Pro:~ [508]$ python Python 2.6.1 (r261:67515, Jul 7 2009, 23:51:51) [GCC 4.2.1 (Apple Inc. build 5646)] on darwin Type "help", "copyright", "credits" or "license" for more information. >>> weisen at Neil-Weisenfeld-MacBook-Pro:~ [509]$ ls -l /System/Library/Frameworks/Python.framework/Versions/Current lrwxr-xr-x 1 root wheel 3 Sep 22 17:49 /System/Library/Frameworks/Python.framework/Versions/Current@ -> 2.6 Thanks, Neil From charlesr.harris at gmail.com Tue Jan 5 12:20:24 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 5 Jan 2010 10:20:24 -0700 Subject: [Numpy-discussion] finishing ticket #1035 Message-ID: Hi All, I'm looking at ticket #1025with an eye to bringing it to completion but there some issues that need discussion. Currently there are three ways in which nans can be compared: maximum/minimum, fmax/fmin, or the new sort order. The maximum/minimum ufuncs propagate nans, i.e., they will always return a nan if one is present. The fmax/fmin ufuncs don't propagate nans, they ignore nans when possible. The new sort order sorts nans to the end, i.e., nans are treated as larger than any non-nan number; at present there are no ufuncs that correspond to the sort order. The issues I think need resolving are: 1) Should there be ufuncs corresponding to the sort order? 2) What should a.max(), a.argmax(), a.min(), and a.argmin() do? I note that a.argmax() is not consistent with a.max() at the moment: In [9]: a Out[9]: array([ 0., 1., 2., 3., NaN, 5., 6., 7., NaN, NaN]) In [10]: a.argmax() Out[10]: 7 In [11]: a.max() Out[11]: nan Thoughts? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From Chris.Barker at noaa.gov Tue Jan 5 12:32:01 2010 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Tue, 05 Jan 2010 09:32:01 -0800 Subject: [Numpy-discussion] fromfile() for reading text (one more time!) In-Reply-To: <1cd32cbb1001041945n24a5c49qcf80ef58430596f2@mail.gmail.com> References: <4B42905A.4080105@noaa.gov> <20100104213942.588435c2@ajackson.org> <1cd32cbb1001041945n24a5c49qcf80ef58430596f2@mail.gmail.com> Message-ID: <4B437791.7020808@noaa.gov> josef.pktd at gmail.com wrote: > On Mon, Jan 4, 2010 at 10:39 PM, wrote: >> I rather like the R command(s) for reading text files > Aren't the newly improved > > numpy.genfromtxt() ... > and friends indented to handle all this Yes, they are, and they are great, but not really all that fast. If you've got big complicated tables of data to read, then genfromtxt is the way to go -- it's a great tool. However, for the simple stuff, it's not really optimized. I also find I have to read a lot of text files that aren't tables of data, but rather an odd mix of stuff, but still a lot of reading lots of numbers from a file. As far as I can tell, genfromtxt and loadtxt can only load the entire file as a table (a very common situation, of course). Paul Ivanov wrote: > Just a potshot, but have you tried np.loadtxt? > > I find it pretty fast. 
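(For the record, loadtxt does read this kind of comma-separated file directly -- something along the lines of:

import numpy as np
a = np.loadtxt("data.txt", delimiter=",")   # made-up filename

-- so it's the speed, not the parsing, that's at issue here.)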
I guess I should have posted timings in the first place: In [19]: timeit timing.time_genfromtxt() 10 loops, best of 3: 216 ms per loop In [20]: timeit timing.time_loadtxt() 10 loops, best of 3: 166 ms per loop In [21]: timeit timing.time_fromfile() 10 loops, best of 3: 47.1 ms per loop (40,000 doubles from a space-delimted text file) so fromfile() is 3.5 times as fast as loadtxt and 4.5 times as fast as genfromtxt. That does make a difference for me -- the user waiting 4 seconds, rather than one second to load a file matters. I suppose another option might be to see if I can optimize the inner scanning function of genfromtxt with Cython or C, but I'm not sure that's possible, as it's really very flexible, and re-writing all of that without Python would be really painful! -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From pgmdevlist at gmail.com Tue Jan 5 13:51:18 2010 From: pgmdevlist at gmail.com (Pierre GM) Date: Tue, 5 Jan 2010 13:51:18 -0500 Subject: [Numpy-discussion] fromfile() for reading text (one more time!) In-Reply-To: <4B437791.7020808@noaa.gov> References: <4B42905A.4080105@noaa.gov> <20100104213942.588435c2@ajackson.org> <1cd32cbb1001041945n24a5c49qcf80ef58430596f2@mail.gmail.com> <4B437791.7020808@noaa.gov> Message-ID: <07B0901E-5F52-4890-AB1D-F225F0D2E009@gmail.com> On Jan 5, 2010, at 12:32 PM, Christopher Barker wrote: > josef.pktd at gmail.com wrote: >> On Mon, Jan 4, 2010 at 10:39 PM, wrote: >>> I rather like the R command(s) for reading text files > >> Aren't the newly improved >> >> numpy.genfromtxt() > > ... > >> and friends indented to handle all this > > Yes, they are, and they are great, but not really all that fast. If > you've got big complicated tables of data to read, then genfromtxt is > the way to go -- it's a great tool. However, for the simple stuff, it's > not really optimized. genfromtxt is nothing but loadtxt overloaded to deal with undefined dtype and missing entries. It's doomed to be slower, and it shouldn't be used if you know your data is well-defined and well-behaved. Stick to loadtxt > I also find I have to read a lot of text files > that aren't tables of data, but rather an odd mix of stuff, but still a > lot of reading lots of numbers from a file. Well, everything depends on what kind of stuff you have in your mix, I guess... > so fromfile() is 3.5 times as fast as loadtxt and 4.5 times as fast as > genfromtxt. That does make a difference for me -- the user waiting 4 > seconds, rather than one second to load a file matters. Rmmbr that fromfile is C when loadtxt and genfromtxt are Python... > I suppose another option might be to see if I can optimize the inner > scanning function of genfromtxt with Cython or C, but I'm not sure > that's possible, as it's really very flexible, and re-writing all of > that without Python would be really painful! Well, there's room for some optimization for particular cases (dtype!=None), but the generic case will be tricky... From pav at iki.fi Tue Jan 5 15:01:36 2010 From: pav at iki.fi (Pauli Virtanen) Date: Tue, 05 Jan 2010 22:01:36 +0200 Subject: [Numpy-discussion] fromfile() for reading text (one more time!) 
In-Reply-To: <4B42905A.4080105@noaa.gov> References: <4B42905A.4080105@noaa.gov> Message-ID: <1262721695.5107.1.camel@idol> ma, 2010-01-04 kello 17:05 -0800, Christopher Barker kirjoitti: [clip] > I also notice that it supports separators of arbitrary length, which I > wonder how useful that is. But it also does odd things with spaces > embedded in the separator: > > ", $ #" matches all of: ",$#" ", $#" ",$ #" > > Is it worth trying to fix that? That's a documented feature: sep : str Separator between items if file is a text file. Empty ("") separator means the file should be treated as binary. Spaces (" ") in the separator match zero or more whitespace characters. A separator consisting only of spaces must match at least one whitespace. -- Pauli Virtanen From efiring at hawaii.edu Tue Jan 5 15:45:51 2010 From: efiring at hawaii.edu (Eric Firing) Date: Tue, 05 Jan 2010 10:45:51 -1000 Subject: [Numpy-discussion] 1.4.0 installer fails on OSX 10.6.2 In-Reply-To: <9d5ec4221001050835w5dd4cf97hffd4f2480bf19c3e@mail.gmail.com> References: <9d5ec4221001050835w5dd4cf97hffd4f2480bf19c3e@mail.gmail.com> Message-ID: <4B43A4FF.3010604@hawaii.edu> neil weisenfeld wrote: > Hi all, > > I'm having an odd problem with the package installer for numpy 1.4.0. > It complains: > > numpy 1.4.0 can't be installed on this disk. numpy requires System > Python 2.6 to install. I think the problem is that the message is misleading; it should be saying you need python from python.org, *not* the python that comes with OSX. (The two coexist; installing python from python.org does not interfere with OSX's use of its own python. Caveat: I don't use Mac myself, so I am basing all this on second-hand experience--helping with a numpy installation a few minutes ago--and what I remember seeing. I think there is a mailing list thread about this, but I couldn't find it.) Eric > > > I'm running a stock system with a stock python, so I'm not sure why > the test is failing. Any ideas how to debug this? > > Some info: > > weisen at Neil-Weisenfeld-MacBook-Pro:~ > [507]$ which python > /usr/bin/python > > weisen at Neil-Weisenfeld-MacBook-Pro:~ > [508]$ python > Python 2.6.1 (r261:67515, Jul 7 2009, 23:51:51) > [GCC 4.2.1 (Apple Inc. build 5646)] on darwin > Type "help", "copyright", "credits" or "license" for more information. > > weisen at Neil-Weisenfeld-MacBook-Pro:~ > [509]$ ls -l /System/Library/Frameworks/Python.framework/Versions/Current > lrwxr-xr-x 1 root wheel 3 Sep 22 17:49 > /System/Library/Frameworks/Python.framework/Versions/Current@ -> 2.6 > > > > Thanks, > Neil > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From pgmdevlist at gmail.com Tue Jan 5 16:03:10 2010 From: pgmdevlist at gmail.com (Pierre GM) Date: Tue, 5 Jan 2010 16:03:10 -0500 Subject: [Numpy-discussion] 1.4.0 installer fails on OSX 10.6.2 In-Reply-To: <4B43A4FF.3010604@hawaii.edu> References: <9d5ec4221001050835w5dd4cf97hffd4f2480bf19c3e@mail.gmail.com> <4B43A4FF.3010604@hawaii.edu> Message-ID: On Jan 5, 2010, at 3:45 PM, Eric Firing wrote: > neil weisenfeld wrote: >> Hi all, >> >> I'm having an odd problem with the package installer for numpy 1.4.0. >> It complains: >> >> numpy 1.4.0 can't be installed on this disk. numpy requires System >> Python 2.6 to install. > > I think the problem is that the message is misleading; it should be > saying you need python from python.org, *not* the python that comes with > OSX. ??? 
I have several versions of numpy installed on my Macbook (OS X 6), but only one Python (the one that comes with Apple). However, these versions are installed in different virtual environments. Neil, could you give us more info about how you're trying to install it ? Have you tried to use the --user flag (ie, `python setup.py install --user`) ? From manuel.wittchen at gmail.com Tue Jan 5 16:14:54 2010 From: manuel.wittchen at gmail.com (Manuel Wittchen) Date: Tue, 5 Jan 2010 22:14:54 +0100 Subject: [Numpy-discussion] extracting data from ODF files Message-ID: <209cec441001051314k284895bdq6e9a3b8f5fd9197c@mail.gmail.com> Hi, is there a (simple) solution to extract data from OpenDocument files (espacially OpenOffice.org Calc files) into a Numpy Array? At the moment I copy the colums from OO.org Calc manually into a tab-separatet Plaintext file which is quite annoying. Regards, Manuel Wittchen From emmanuelle.gouillart at normalesup.org Tue Jan 5 16:23:08 2010 From: emmanuelle.gouillart at normalesup.org (Emmanuelle Gouillart) Date: Tue, 5 Jan 2010 22:23:08 +0100 Subject: [Numpy-discussion] extracting data from ODF files In-Reply-To: <209cec441001051314k284895bdq6e9a3b8f5fd9197c@mail.gmail.com> References: <209cec441001051314k284895bdq6e9a3b8f5fd9197c@mail.gmail.com> Message-ID: <20100105212307.GA32094@phare.normalesup.org> Hi Manuel, you may save your odf file as a csv (comma separated value) file with OpenOffice, then use np.loadtxt, specifying the 'delimiter' keyword: myarray = np.loadtxt('myfile.csv', delimiter=',') Cheers, Emmanuelle On Tue, Jan 05, 2010 at 10:14:54PM +0100, Manuel Wittchen wrote: > Hi, > is there a (simple) solution to extract data from OpenDocument files > (espacially OpenOffice.org Calc files) into a Numpy Array? At the > moment I copy the colums from OO.org Calc manually into a > tab-separatet Plaintext file which is quite annoying. > Regards, > Manuel Wittchen > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From xavier.gnata at gmail.com Tue Jan 5 16:48:43 2010 From: xavier.gnata at gmail.com (Xavier Gnata) Date: Tue, 05 Jan 2010 22:48:43 +0100 Subject: [Numpy-discussion] Lots of 32bits specific errors Message-ID: <4B43B3BB.4060702@gmail.com> Hi, I have compiled numpy 1.5.0.dev8039 both on a 32 and a 64bits ubuntu machine. 
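(For reference, "the tests" here are just the standard suite, run along the lines of:

import numpy
numpy.test()

on each machine.)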
On the 64bits one, everything is fine: numpy.test get a perfect score: On the 32bits ubuntu, the story is not that nice: ERROR: Test filled w/ mvoid ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/local/lib/python2.6/dist-packages/numpy/ma/tests/test_core.py", line 506, in test_filled_w_mvoid a = mvoid(np.array((1, 2)), mask=[(0, 1)], dtype=ndtype) File "/usr/local/lib/python2.6/dist-packages/numpy/ma/core.py", line 5454, in __new__ _data = ndarray.__new__(self, (), dtype=dtype, buffer=data.data) TypeError: buffer is too small for requested array ====================================================================== FAIL: test_cdouble (test_linalg.TestCond2) ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/local/lib/python2.6/dist-packages/numpy/linalg/tests/test_linalg.py", line 44, in test_cdouble self.do(a, b) File "/usr/local/lib/python2.6/dist-packages/numpy/linalg/tests/test_linalg.py", line 114, in do old_assert_almost_equal(s[0]/s[-1], linalg.cond(a,2), decimal=5) File "/usr/local/lib/python2.6/dist-packages/numpy/testing/utils.py", line 455, in assert_almost_equal raise AssertionError(msg) AssertionError: Arrays are not almost equal ACTUAL: 9.4348091510413177 DESIRED: 22.757141876814547 ====================================================================== FAIL: test_csingle (test_linalg.TestCond2) ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/local/lib/python2.6/dist-packages/numpy/linalg/tests/test_linalg.py", line 39, in test_csingle self.do(a, b) File "/usr/local/lib/python2.6/dist-packages/numpy/linalg/tests/test_linalg.py", line 114, in do old_assert_almost_equal(s[0]/s[-1], linalg.cond(a,2), decimal=5) File "/usr/local/lib/python2.6/dist-packages/numpy/testing/utils.py", line 455, in assert_almost_equal raise AssertionError(msg) AssertionError: Arrays are not almost equal ACTUAL: 9.4348097 DESIRED: 22.757143 ====================================================================== FAIL: test_cdouble (test_linalg.TestDet) ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/local/lib/python2.6/dist-packages/numpy/linalg/tests/test_linalg.py", line 44, in test_cdouble self.do(a, b) File "/usr/local/lib/python2.6/dist-packages/numpy/linalg/tests/test_linalg.py", line 135, in do assert_almost_equal(d, multiply.reduce(ev)) File "/usr/local/lib/python2.6/dist-packages/numpy/linalg/tests/test_linalg.py", line 23, in assert_almost_equal old_assert_almost_equal(a, b, decimal=decimal, **kw) File "/usr/local/lib/python2.6/dist-packages/numpy/testing/utils.py", line 435, in assert_almost_equal raise AssertionError(msg) AssertionError: Arrays are not almost equal ACTUAL: (8.8817841970012523e-16-4j) DESIRED: (5.2800000000000011-11.040000000000004j) ====================================================================== FAIL: test_csingle (test_linalg.TestDet) ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/local/lib/python2.6/dist-packages/numpy/linalg/tests/test_linalg.py", line 39, in test_csingle self.do(a, b) File "/usr/local/lib/python2.6/dist-packages/numpy/linalg/tests/test_linalg.py", line 135, in do assert_almost_equal(d, multiply.reduce(ev)) File "/usr/local/lib/python2.6/dist-packages/numpy/linalg/tests/test_linalg.py", line 23, in 
assert_almost_equal old_assert_almost_equal(a, b, decimal=decimal, **kw) File "/usr/local/lib/python2.6/dist-packages/numpy/testing/utils.py", line 435, in assert_almost_equal raise AssertionError(msg) AssertionError: Arrays are not almost equal ACTUAL: (8.8817841970012523e-16-4j) DESIRED: (5.2800000000000011-11.040000000000004j) ====================================================================== FAIL: test_cdouble (test_linalg.TestEig) ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/local/lib/python2.6/dist-packages/numpy/linalg/tests/test_linalg.py", line 44, in test_cdouble self.do(a, b) File "/usr/local/lib/python2.6/dist-packages/numpy/linalg/tests/test_linalg.py", line 94, in do assert_almost_equal(dot(a, evectors), multiply(evectors, evalues)) File "/usr/local/lib/python2.6/dist-packages/numpy/linalg/tests/test_linalg.py", line 23, in assert_almost_equal old_assert_almost_equal(a, b, decimal=decimal, **kw) File "/usr/local/lib/python2.6/dist-packages/numpy/testing/utils.py", line 435, in assert_almost_equal raise AssertionError(msg) AssertionError: Arrays are not almost equal ACTUAL: array([[ 2.72530404+2.67511327j, 1.92238375+1.30131653j], [ 5.95809316+4.79684551j, 3.41362547+1.42587017j]]) DESIRED: array([[ 2.01388405+1.03693361j, -1.39512658+1.87085135j], [ 1.78601662+0.01838201j, -0.10408816-3.54121552j]]) ====================================================================== FAIL: test_csingle (test_linalg.TestEig) ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/local/lib/python2.6/dist-packages/numpy/linalg/tests/test_linalg.py", line 39, in test_csingle self.do(a, b) File "/usr/local/lib/python2.6/dist-packages/numpy/linalg/tests/test_linalg.py", line 94, in do assert_almost_equal(dot(a, evectors), multiply(evectors, evalues)) File "/usr/local/lib/python2.6/dist-packages/numpy/linalg/tests/test_linalg.py", line 23, in assert_almost_equal old_assert_almost_equal(a, b, decimal=decimal, **kw) File "/usr/local/lib/python2.6/dist-packages/numpy/testing/utils.py", line 435, in assert_almost_equal raise AssertionError(msg) AssertionError: Arrays are not almost equal ACTUAL: array([[ 2.72530413+2.6751132j , 1.92238379+1.3013165j ], [ 5.95809317+4.79684544j, 3.41362548+1.42587018j]], dtype=complex64) DESIRED: array([[ 2.01388407+1.03693354j, -1.39512670+1.8708514j ], [ 1.78601658+0.01838197j, -0.10408816-3.54121566j]], dtype=complex64) ====================================================================== FAIL: test_cdouble (test_linalg.TestEigh) ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/local/lib/python2.6/dist-packages/numpy/linalg/tests/test_linalg.py", line 221, in test_cdouble self.do(a) File "/usr/local/lib/python2.6/dist-packages/numpy/linalg/tests/test_linalg.py", line 259, in do assert_almost_equal(ev, evalues) File "/usr/local/lib/python2.6/dist-packages/numpy/linalg/tests/test_linalg.py", line 23, in assert_almost_equal old_assert_almost_equal(a, b, decimal=decimal, **kw) File "/usr/local/lib/python2.6/dist-packages/numpy/testing/utils.py", line 435, in assert_almost_equal raise AssertionError(msg) AssertionError: Arrays are not almost equal ACTUAL: array([-2.60555128, 4.60555128]) DESIRED: array([-1.71080202-1.00413682j, 3.01849433+1.46567528j]) ====================================================================== FAIL: test_csingle (test_linalg.TestEigh) 
---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/local/lib/python2.6/dist-packages/numpy/linalg/tests/test_linalg.py", line 217, in test_csingle self.do(a) File "/usr/local/lib/python2.6/dist-packages/numpy/linalg/tests/test_linalg.py", line 259, in do assert_almost_equal(ev, evalues) File "/usr/local/lib/python2.6/dist-packages/numpy/linalg/tests/test_linalg.py", line 23, in assert_almost_equal old_assert_almost_equal(a, b, decimal=decimal, **kw) File "/usr/local/lib/python2.6/dist-packages/numpy/testing/utils.py", line 435, in assert_almost_equal raise AssertionError(msg) AssertionError: Arrays are not almost equal ACTUAL: array([-2.60555124, 4.60555124], dtype=float32) DESIRED: array([-1.71080208-1.0041368j , 3.01849437+1.46567523j], dtype=complex64) ====================================================================== FAIL: test_cdouble (test_linalg.TestEigvalsh) ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/local/lib/python2.6/dist-packages/numpy/linalg/tests/test_linalg.py", line 221, in test_cdouble self.do(a) File "/usr/local/lib/python2.6/dist-packages/numpy/linalg/tests/test_linalg.py", line 249, in do assert_almost_equal(ev, evalues) File "/usr/local/lib/python2.6/dist-packages/numpy/linalg/tests/test_linalg.py", line 23, in assert_almost_equal old_assert_almost_equal(a, b, decimal=decimal, **kw) File "/usr/local/lib/python2.6/dist-packages/numpy/testing/utils.py", line 435, in assert_almost_equal raise AssertionError(msg) AssertionError: Arrays are not almost equal ACTUAL: array([-2.60555128+0.j, 4.60555128+0.j]) DESIRED: array([-1.71080202-1.00413682j, 3.01849433+1.46567528j]) ====================================================================== FAIL: test_csingle (test_linalg.TestEigvalsh) ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/local/lib/python2.6/dist-packages/numpy/linalg/tests/test_linalg.py", line 217, in test_csingle self.do(a) File "/usr/local/lib/python2.6/dist-packages/numpy/linalg/tests/test_linalg.py", line 249, in do assert_almost_equal(ev, evalues) File "/usr/local/lib/python2.6/dist-packages/numpy/linalg/tests/test_linalg.py", line 23, in assert_almost_equal old_assert_almost_equal(a, b, decimal=decimal, **kw) File "/usr/local/lib/python2.6/dist-packages/numpy/testing/utils.py", line 435, in assert_almost_equal raise AssertionError(msg) AssertionError: Arrays are not almost equal ACTUAL: array([-2.60555124+0.j, 4.60555124+0.j], dtype=complex64) DESIRED: array([-1.71080208-1.0041368j , 3.01849437+1.46567523j], dtype=complex64) ====================================================================== FAIL: test_cdouble (test_linalg.TestLstsq) ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/local/lib/python2.6/dist-packages/numpy/linalg/tests/test_linalg.py", line 44, in test_cdouble self.do(a, b) File "/usr/local/lib/python2.6/dist-packages/numpy/linalg/tests/test_linalg.py", line 141, in do assert_almost_equal(b, dot(a, x)) File "/usr/local/lib/python2.6/dist-packages/numpy/linalg/tests/test_linalg.py", line 23, in assert_almost_equal old_assert_almost_equal(a, b, decimal=decimal, **kw) File "/usr/local/lib/python2.6/dist-packages/numpy/testing/utils.py", line 435, in assert_almost_equal raise AssertionError(msg) AssertionError: Arrays are not almost equal ACTUAL: array([ 2.+1.j, 
1.+2.j]) DESIRED: array([ 0.95920929+0.98311952j, 1.23494444+0.67346351j]) ====================================================================== FAIL: test_csingle (test_linalg.TestLstsq) ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/local/lib/python2.6/dist-packages/numpy/linalg/tests/test_linalg.py", line 39, in test_csingle self.do(a, b) File "/usr/local/lib/python2.6/dist-packages/numpy/linalg/tests/test_linalg.py", line 141, in do assert_almost_equal(b, dot(a, x)) File "/usr/local/lib/python2.6/dist-packages/numpy/linalg/tests/test_linalg.py", line 23, in assert_almost_equal old_assert_almost_equal(a, b, decimal=decimal, **kw) File "/usr/local/lib/python2.6/dist-packages/numpy/testing/utils.py", line 435, in assert_almost_equal raise AssertionError(msg) AssertionError: Arrays are not almost equal ACTUAL: array([ 2.+1.j, 1.+2.j], dtype=complex64) DESIRED: array([ 0.95920926+0.98311943j, 1.23494434+0.67346334j], dtype=complex64) ====================================================================== FAIL: test_cdouble (test_linalg.TestPinv) ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/local/lib/python2.6/dist-packages/numpy/linalg/tests/test_linalg.py", line 44, in test_cdouble self.do(a, b) File "/usr/local/lib/python2.6/dist-packages/numpy/linalg/tests/test_linalg.py", line 124, in do assert_almost_equal(dot(a, a_ginv), identity(asarray(a).shape[0])) File "/usr/local/lib/python2.6/dist-packages/numpy/linalg/tests/test_linalg.py", line 23, in assert_almost_equal old_assert_almost_equal(a, b, decimal=decimal, **kw) File "/usr/local/lib/python2.6/dist-packages/numpy/testing/utils.py", line 435, in assert_almost_equal raise AssertionError(msg) AssertionError: Arrays are not almost equal ACTUAL: array([[ 0.29169056-0.07799046j, 0.17767375-0.01332484j], [ 0.04125021-0.38255608j, 0.73402869+0.62377356j]]) DESIRED: array([[ 1., 0.], [ 0., 1.]]) ====================================================================== FAIL: test_csingle (test_linalg.TestPinv) ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/local/lib/python2.6/dist-packages/numpy/linalg/tests/test_linalg.py", line 39, in test_csingle self.do(a, b) File "/usr/local/lib/python2.6/dist-packages/numpy/linalg/tests/test_linalg.py", line 124, in do assert_almost_equal(dot(a, a_ginv), identity(asarray(a).shape[0])) File "/usr/local/lib/python2.6/dist-packages/numpy/linalg/tests/test_linalg.py", line 23, in assert_almost_equal old_assert_almost_equal(a, b, decimal=decimal, **kw) File "/usr/local/lib/python2.6/dist-packages/numpy/testing/utils.py", line 435, in assert_almost_equal raise AssertionError(msg) AssertionError: Arrays are not almost equal ACTUAL: array([[ 0.29169053-0.07799049j, 0.17767370-0.0133248j ], [ 0.04125014-0.38255614j, 0.73402858+0.62377363j]], dtype=complex64) DESIRED: array([[ 1., 0.], [ 0., 1.]]) ====================================================================== FAIL: test_cdouble (test_linalg.TestSVD) ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/local/lib/python2.6/dist-packages/numpy/linalg/tests/test_linalg.py", line 44, in test_cdouble self.do(a, b) File "/usr/local/lib/python2.6/dist-packages/numpy/linalg/tests/test_linalg.py", line 100, in do assert_almost_equal(a, dot(multiply(u, s), vt)) File 
"/usr/local/lib/python2.6/dist-packages/numpy/linalg/tests/test_linalg.py", line 23, in assert_almost_equal old_assert_almost_equal(a, b, decimal=decimal, **kw) File "/usr/local/lib/python2.6/dist-packages/numpy/testing/utils.py", line 435, in assert_almost_equal raise AssertionError(msg) AssertionError: Arrays are not almost equal ACTUAL: array([[ 1.+2.j, 2.+3.j], [ 3.+4.j, 4.+5.j]]) DESIRED: array([[ 1.00000000+2.j , 2.36670415+2.98574489j], [ 3.00000000+4.j , 2.80882652+6.25521741j]]) ====================================================================== FAIL: test_csingle (test_linalg.TestSVD) ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/local/lib/python2.6/dist-packages/numpy/linalg/tests/test_linalg.py", line 39, in test_csingle self.do(a, b) File "/usr/local/lib/python2.6/dist-packages/numpy/linalg/tests/test_linalg.py", line 100, in do assert_almost_equal(a, dot(multiply(u, s), vt)) File "/usr/local/lib/python2.6/dist-packages/numpy/linalg/tests/test_linalg.py", line 23, in assert_almost_equal old_assert_almost_equal(a, b, decimal=decimal, **kw) File "/usr/local/lib/python2.6/dist-packages/numpy/testing/utils.py", line 435, in assert_almost_equal raise AssertionError(msg) AssertionError: Arrays are not almost equal ACTUAL: array([[ 1.+2.j, 2.+3.j], [ 3.+4.j, 4.+5.j]], dtype=complex64) DESIRED: array([[ 0.99999994+2.j , 2.36670423+2.98574495j], [ 3.00000000+4.j , 2.80882668+6.25521755j]], dtype=complex64) Xavier From efiring at hawaii.edu Tue Jan 5 16:53:17 2010 From: efiring at hawaii.edu (Eric Firing) Date: Tue, 05 Jan 2010 11:53:17 -1000 Subject: [Numpy-discussion] 1.4.0 installer fails on OSX 10.6.2 In-Reply-To: References: <9d5ec4221001050835w5dd4cf97hffd4f2480bf19c3e@mail.gmail.com> <4B43A4FF.3010604@hawaii.edu> Message-ID: <4B43B4CD.3030701@hawaii.edu> Pierre GM wrote: > On Jan 5, 2010, at 3:45 PM, Eric Firing wrote: >> neil weisenfeld wrote: >>> Hi all, >>> >>> I'm having an odd problem with the package installer for numpy 1.4.0. >>> It complains: >>> >>> numpy 1.4.0 can't be installed on this disk. numpy requires System >>> Python 2.6 to install. >> I think the problem is that the message is misleading; it should be >> saying you need python from python.org, *not* the python that comes with >> OSX. > > ??? > I have several versions of numpy installed on my Macbook (OS X 6), but only one Python (the one that comes with Apple). However, these versions are installed in different virtual environments. > Neil, could you give us more info about how you're trying to install it ? Have you tried to use the --user flag (ie, `python setup.py install --user`) ? > Pierre, He is installing using the binary package installer, not installing from source. Eric > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From Chris.Barker at noaa.gov Tue Jan 5 17:09:14 2010 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Tue, 05 Jan 2010 14:09:14 -0800 Subject: [Numpy-discussion] 1.4.0 installer fails on OSX 10.6.2 In-Reply-To: <4B43A4FF.3010604@hawaii.edu> References: <9d5ec4221001050835w5dd4cf97hffd4f2480bf19c3e@mail.gmail.com> <4B43A4FF.3010604@hawaii.edu> Message-ID: <4B43B88A.6020008@noaa.gov> Eric Firing wrote: >> I'm having an odd problem with the package installer for numpy 1.4.0. >> It complains: >> >> numpy 1.4.0 can't be installed on this disk. numpy requires System >> Python 2.6 to install. 
> > I think the problem is that the message is misleading; it should be > saying you need python from python.org, *not* the python that comes with > OSX. yes, that is it the "system python" referred to in the error message is actually the "python.org Framework build". I recommend it anyway, it's newer, and you won't accidentally mess with anything Apple is doing. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From cournape at gmail.com Tue Jan 5 17:31:55 2010 From: cournape at gmail.com (David Cournapeau) Date: Wed, 6 Jan 2010 07:31:55 +0900 Subject: [Numpy-discussion] Lots of 32bits specific errors In-Reply-To: <4B43B3BB.4060702@gmail.com> References: <4B43B3BB.4060702@gmail.com> Message-ID: <5b8d13221001051431q418df142x5df762c0d065c992@mail.gmail.com> On Wed, Jan 6, 2010 at 6:48 AM, Xavier Gnata wrote: > Hi, > > I have compiled numpy 1.5.0.dev8039 both on a 32 and a 64bits ubuntu > machine. > > On the 64bits one, everything is fine: > numpy.test get a perfect score: > > > On the 32bits ubuntu, the story is not that nice: > Almost all your errors are in linalg - most likely an atlas problem. Atlas with sse2 is buggy on some Ubuntu versions. You should check Ubuntu bug tracker to see if it affects the version you are using. David From pgmdevlist at gmail.com Tue Jan 5 17:38:44 2010 From: pgmdevlist at gmail.com (Pierre GM) Date: Tue, 5 Jan 2010 17:38:44 -0500 Subject: [Numpy-discussion] 1.4.0 installer fails on OSX 10.6.2 In-Reply-To: <4B43B4CD.3030701@hawaii.edu> References: <9d5ec4221001050835w5dd4cf97hffd4f2480bf19c3e@mail.gmail.com> <4B43A4FF.3010604@hawaii.edu> <4B43B4CD.3030701@hawaii.edu> Message-ID: <523E341E-A076-465C-938C-EC902F5F0D16@gmail.com> On Jan 5, 2010, at 4:53 PM, Eric Firing wrote: > Pierre GM wrote: > > Pierre, > > He is installing using the binary package installer, not installing from > source. Ah OK, my bad. Now, why should it be that different ? Why rely on a second Python to install numpy from a dmg ? If it's a matter of framework, couldn't we used the Python in /System and create a framework in ~/Library ? (I'm just asking out of curiosity...)
From Chris.Barker at noaa.gov Tue Jan 5 18:01:38 2010 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Tue, 05 Jan 2010 15:01:38 -0800 Subject: [Numpy-discussion] 1.4.0 installer fails on OSX 10.6.2 In-Reply-To: <523E341E-A076-465C-938C-EC902F5F0D16@gmail.com> References: <9d5ec4221001050835w5dd4cf97hffd4f2480bf19c3e@mail.gmail.com> <4B43A4FF.3010604@hawaii.edu> <4B43B4CD.3030701@hawaii.edu> <523E341E-A076-465C-938C-EC902F5F0D16@gmail.com> Message-ID: <4B43C4D2.5030701@noaa.gov> Pierre GM wrote: > Ah OK, my bad. Now, why should it be that different ? Why rely on a > second Python to install numpy from a dmg? OS-X has a way of hard coding paths, so a given installer is designed to go in one place, and one place only. The python.org python is the best one to support -- Apple has never upgraded a python, has often shipped a broken version, and has provided different versions with each OS-X version. If we support the python.org python for OS-X 10.4, it can work for everyone with 10.4 - 10.6. It's changing a bit with OS-X 10.6 -- for the first time, Apple at least provided an up-to-date python that isn't broken. But it's not really up to date anymore, anyway (2.6.1 when 2.6.4 is out. I know I've been bitten by at least one bug that was fixed between 2.6.1 and 2.6.3). This is a policy followed by other projects as well. > If it's a matter of > framework, couldn't we used the Python in /System and create a > framework in ~/Library ? (I'm just asking out of curiosity...) The Apple python is fine -- it just isn't the same one, installed in the same place. If you want to build a binary installer for it, it's easy -- but it will only work on 10.6, and not with an updated Python. As the 2.6 series is binary compatible, you can build a single installer that will work with both -- Robin Dunn has done this for wxPython. The way he's done it is to put wxPython itself into /usr/local, and then put some *.pth trickery into both of the pythons: /System/... and /Library/... It works fine, and I've suggested it on this list before, but I guess folks think it's too much of a hack -- or just no one has taken the time to do it. If I get a vote of approval for the approach, I suppose I could do it, I'm sure I could find Robin's scripts and hack them for numpy. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From dwf at cs.toronto.edu Tue Jan 5 18:22:23 2010 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Tue, 5 Jan 2010 18:22:23 -0500 Subject: [Numpy-discussion] 1.4.0 installer fails on OSX 10.6.2 In-Reply-To: <4B43C4D2.5030701@noaa.gov> References: <9d5ec4221001050835w5dd4cf97hffd4f2480bf19c3e@mail.gmail.com> <4B43A4FF.3010604@hawaii.edu> <4B43B4CD.3030701@hawaii.edu> <523E341E-A076-465C-938C-EC902F5F0D16@gmail.com> <4B43C4D2.5030701@noaa.gov> Message-ID: On 5-Jan-10, at 6:01 PM, Christopher Barker wrote: > The python.org python is the best one to support -- Apple has never > upgraded a python, has often shipped a broken version, and has > provided > different versions with each OS-X version. If we support the > python.org > python for OS-X 10.4, it can work for everyone with 10.4 - 10.6. > > It's changing a bit with OS-X 10.6 -- for the first time, Apple at > least > provided an up-to-date python that isn't broken. But it's not really > up > to date anymore, anyway (2.6.1 when 2.6.4 is out. 
I know I've been > bitten by at least one bug that was fixed between 2.6.1 and 2.6.3). AFAIK, the System Python in 10.6 is 64-bit capable (but not in the same way as Ron Oussoren's 4-way universal build script does it). Pretty sure the python.org binaries are 32-bit only. I still think it's sensible to prefer the > As the 2.6 series is binary compatible, you can build a single > installer > that will work with both -- Robin Dunn has done this for wxPython. The > way he's done it is to put wxPython itself into /usr/local, and then > put > some *.pth trickery into both of the pythons: /System/... and / > Library/... > > It works fine, and I've suggested it on this list before, but I guess > folks think it's too much of a hack -- or just no one has taken the > time > to do it. +1 on the general approach though it might get a bit more complicated if the two Pythons support different sets of architectures (e.g. i386 and x86_64 in System Python 10.6, i386 and ppc in Python.org Python, or some home-rolled weirdness). With wxPython this doesn't so much matter since wxMac depends on Carbon anyway (I think it still does, at least, unless the Cocoa port's suddenly sped up an incredible amount), which is a 64-bit no-no. I'm not really a fan of packages polluting /usr/local, I'd rather the tree appear /opt/packagename or /usr/local/packagename instead, for ease of removal, but the general approach of "stash somewhere and put a .pth in both site-packages" seems fine to me. David From xavier.gnata at gmail.com Tue Jan 5 18:17:12 2010 From: xavier.gnata at gmail.com (Xavier Gnata) Date: Wed, 06 Jan 2010 00:17:12 +0100 Subject: [Numpy-discussion] Lots of 32bits specific errors In-Reply-To: <5b8d13221001051431q418df142x5df762c0d065c992@mail.gmail.com> References: <4B43B3BB.4060702@gmail.com> <5b8d13221001051431q418df142x5df762c0d065c992@mail.gmail.com> Message-ID: <4B43C878.3060007@gmail.com> > On Wed, Jan 6, 2010 at 6:48 AM, Xavier Gnata wrote: > >> Hi, >> >> I have compiled numpy 1.5.0.dev8039 both on a 32 and a 64bits ubuntu >> machine. >> >> On the 64bits one, everything is fine: >> numpy.test get a perfect score: >> >> >> On the 32bits ubuntu, the story is not that nice: >> >> > Almost all your errors are in linalg - most likely an atlas problem. > Atlas with sse2 is buggy on some Ubuntu versions. You should check > Ubuntu bug tracker to see if it affects the version you are using. > > Thanks, you got it! atlas sse2 in fully buggy in karmic (at least on 32bits machines). without sse. I'm going to try with the sse version of atlas (sse not sse2...) to see if it is as buggy as the sse? version. Xavier From pgmdevlist at gmail.com Tue Jan 5 18:21:24 2010 From: pgmdevlist at gmail.com (Pierre GM) Date: Tue, 5 Jan 2010 18:21:24 -0500 Subject: [Numpy-discussion] 1.4.0 installer fails on OSX 10.6.2 In-Reply-To: References: <9d5ec4221001050835w5dd4cf97hffd4f2480bf19c3e@mail.gmail.com> <4B43A4FF.3010604@hawaii.edu> <4B43B4CD.3030701@hawaii.edu> <523E341E-A076-465C-938C-EC902F5F0D16@gmail.com> <4B43C4D2.5030701@noaa.gov> Message-ID: <01D3DA28-6462-4481-97AE-E4ECEB5DD181@gmail.com> On Jan 5, 2010, at 6:22 PM, David Warde-Farley wrote: > > On 5-Jan-10, at 6:01 PM, Christopher Barker wrote: > >> The python.org python is the best one to support -- Apple has never >> upgraded a python, has often shipped a broken version, and has >> provided >> different versions with each OS-X version. If we support the >> python.org >> python for OS-X 10.4, it can work for everyone with 10.4 - 10.6. 
>> >> It's changing a bit with OS-X 10.6 -- for the first time, Apple at >> least >> provided an up-to-date python that isn't broken. But it's not really >> up >> to date anymore, anyway (2.6.1 when 2.6.4 is out. I know I've been >> bitten by at least one bug that was fixed between 2.6.1 and 2.6.3). > > AFAIK, the System Python in 10.6 is 64-bit capable (but not in the > same way as Ron Oussoren's 4-way universal build script does it). > Pretty sure the python.org binaries are 32-bit only. I still think > it's sensible to prefer the OK, so there's still no 64b Python on python.org ? Gonna stick with Apple's then (I remember using a lot of sleep hours when I upgraded to 10.6...) >> As the 2.6 series is binary compatible, you can build a single >> installer >> that will work with both -- Robin Dunn has done this for wxPython. The >> way he's done it is to put wxPython itself into /usr/local, and then >> put >> some *.pth trickery into both of the pythons: /System/... and / >> Library/... >> >> It works fine, and I've suggested it on this list before, but I guess >> folks think it's too much of a hack -- or just no one has taken the >> time >> to do it. > > +1 on the general approach though it might get a bit more complicated > if the two Pythons support different sets of architectures (e.g. i386 > and x86_64 in System Python 10.6, i386 and ppc in Python.org Python, > or some home-rolled weirdness). With wxPython this doesn't so much > matter since wxMac depends on Carbon anyway (I think it still does, at > least, unless the Cocoa port's suddenly sped up an incredible amount), > which is a 64-bit no-no. > > I'm not really a fan of packages polluting /usr/local, I'd rather the > tree appear /opt/packagename or /usr/local/packagename instead, for > ease of removal, but the general approach of "stash somewhere and put > a .pth in both site-packages" seems fine to me. +1 w/ David as well. Christopher, thanks a lot for the info. I'm glad I don't have to deal w/ packaging issues... From Chris.Barker at noaa.gov Tue Jan 5 19:02:56 2010 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Tue, 05 Jan 2010 16:02:56 -0800 Subject: [Numpy-discussion] 1.4.0 installer fails on OSX 10.6.2 In-Reply-To: References: <9d5ec4221001050835w5dd4cf97hffd4f2480bf19c3e@mail.gmail.com> <4B43A4FF.3010604@hawaii.edu> <4B43B4CD.3030701@hawaii.edu> <523E341E-A076-465C-938C-EC902F5F0D16@gmail.com> <4B43C4D2.5030701@noaa.gov> Message-ID: <4B43D330.7090608@noaa.gov> David Warde-Farley wrote: > AFAIK, the System Python in 10.6 is 64-bit capable (but not in the > same way as Ron Oussoren's 4-way universal build script does it). right -- I'm not sure if it's useful, though, I don't' think there is a 64 bit interpreter, for instance. But maybe that was the one delivered with 10.5. But I'm not the one to ask -- I don't have 10.6, I'm still on an old PPC running 10.4. > Pretty sure the python.org binaries are 32-bit only. I still think > it's sensible to prefer the waiting the rest of this sentence.. ;-) >> As the 2.6 series is binary compatible, you can build a single >> installer that will work with both > +1 on the general approach though it might get a bit more complicated > if the two Pythons support different sets of architectures (e.g. i386 > and x86_64 in System Python 10.6, i386 and ppc in Python.org Python, > or some home-rolled weirdness). Yes, the whole thing is a nightmare, really. 32bit ppc+i386 was bad enough -- with four now, it's really a mess. 
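(Aside: a quick way to check what a given interpreter is actually running as -- nothing OS-X specific assumed, just the stdlib:

import sys, platform
print platform.machine(), platform.architecture()[0]   # e.g. "i386 32bit"
print "64-bit" if sys.maxint > 2**32 else "32-bit"

which at least helps sort out which of those architectures you've really got.)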
> With wxPython this doesn't so much > matter since wxMac depends on Carbon anyway (I think it still does, at > least, unless the Cocoa port's suddenly sped up an incredible amount), > which is a 64-bit no-no. You're right -- still strictly Carbon, and therefor strictly 32 bit. > I'm not really a fan of packages polluting /usr/local, I'd rather the > tree appear /opt/packagename well, /opt has kind of been co-opted by macports. > or /usr/local/packagename instead, for > ease of removal wxPython gets put entirely into: /usr/local/lib/wxPython-unicode-2.10.8 which isn't bad. > but the general approach of "stash somewhere and put > a .pth in both site-packages" seems fine to me. OK -- what about simply punting and doing two builds: one 32 bit, and one 64 bit. I wonder if we need 64bit PPC at all? I know I'm running 64 bit hardware, but never ran a 64 bit OS on it -- I wonder if anyone is? What machines/OS versions are available for building Mac installer with? I could do 10.4, 32 bit, ppc+intel. I'll post on the pythonmac list to see what folks there think. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From Chris.Barker at noaa.gov Tue Jan 5 19:18:46 2010 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Tue, 05 Jan 2010 16:18:46 -0800 Subject: [Numpy-discussion] 1.4.0 installer fails on OSX 10.6.2 In-Reply-To: <4B43D330.7090608@noaa.gov> References: <9d5ec4221001050835w5dd4cf97hffd4f2480bf19c3e@mail.gmail.com> <4B43A4FF.3010604@hawaii.edu> <4B43B4CD.3030701@hawaii.edu> <523E341E-A076-465C-938C-EC902F5F0D16@gmail.com> <4B43C4D2.5030701@noaa.gov> <4B43D330.7090608@noaa.gov> Message-ID: <4B43D6E6.80305@noaa.gov> Christopher Barker wrote: > OK -- what about simply punting and doing two builds: one 32 bit, and > one 64 bit. I wonder if we need 64bit PPC at all? I know I'm running 64 > bit hardware, but never ran a 64 bit OS on it -- I wonder if anyone is? Oh, I think this approach may be completely egg-incompatible.... Maybe we just need ten builds -- arrgg! If distutils/setuptools could identify the python version properly, then binary eggs and easy-install could be a solution -- but that's a mess, too. oh well, -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From cournape at gmail.com Tue Jan 5 19:42:31 2010 From: cournape at gmail.com (David Cournapeau) Date: Wed, 6 Jan 2010 09:42:31 +0900 Subject: [Numpy-discussion] Lots of 32bits specific errors In-Reply-To: <4B43C878.3060007@gmail.com> References: <4B43B3BB.4060702@gmail.com> <5b8d13221001051431q418df142x5df762c0d065c992@mail.gmail.com> <4B43C878.3060007@gmail.com> Message-ID: <5b8d13221001051642i24d0f0f5m82437235f5b1ae21@mail.gmail.com> On Wed, Jan 6, 2010 at 8:17 AM, Xavier Gnata wrote: > >> On Wed, Jan 6, 2010 at 6:48 AM, Xavier Gnata wrote: >> >>> Hi, >>> >>> I have compiled numpy 1.5.0.dev8039 both on a 32 and a 64bits ubuntu >>> machine. >>> >>> On the 64bits one, everything is fine: >>> numpy.test get a perfect score: >>> >>> >>> On the 32bits ubuntu, the story is not that nice: >>> >>> >> Almost all your errors are in linalg - most likely an atlas problem. >> Atlas with sse2 is buggy on some Ubuntu versions. 
You should check >> Ubuntu bug tracker to see if it affects the version you are using. >> >> > Thanks, you got it! atlas sse2 in fully buggy in karmic (at least on > 32bits machines). That's just incompetence at that point. This bug is known for > one year, and they still have not fixed it. The least they could do is removing the package so that people has slower, but accurate version. David From cournape at gmail.com Tue Jan 5 20:07:26 2010 From: cournape at gmail.com (David Cournapeau) Date: Wed, 6 Jan 2010 10:07:26 +0900 Subject: [Numpy-discussion] 1.4.0 installer fails on OSX 10.6.2 In-Reply-To: References: <9d5ec4221001050835w5dd4cf97hffd4f2480bf19c3e@mail.gmail.com> <4B43A4FF.3010604@hawaii.edu> <4B43B4CD.3030701@hawaii.edu> <523E341E-A076-465C-938C-EC902F5F0D16@gmail.com> <4B43C4D2.5030701@noaa.gov> Message-ID: <5b8d13221001051707g28d317f6t8ea07a138c457c38@mail.gmail.com> On Wed, Jan 6, 2010 at 8:22 AM, David Warde-Farley wrote: > > On 5-Jan-10, at 6:01 PM, Christopher Barker wrote: > > >> As the 2.6 series is binary compatible, you can build a single >> installer >> that will work with both I don't think that's true. 2.6.x are compatible with each other iif they are built with the same compiler options. There are too many differences between Apple python and the python.org python (dtrace, 64 bits support, compiler options, etc...) IMHO to make a compatible installer for both versions worthwhile. >> way he's done it is to put wxPython itself into /usr/local, and then >> put >> some *.pth trickery into both of the pythons: /System/... and / >> Library/... >> >> It works fine, and I've suggested it on this list before, but I guess >> folks think it's too much of a hack -- or just no one has taken the >> time >> to do it. I don't think it worths it. .pth files will involve even more point of failures, and it has the potential of breaking things in non obvious ways. I agree that the lack of 64 bits installer is an issue, but building numpy on mac os x is not that difficult, and I think people who need 64 bits often are more knowledgeable. There are also solutions like EPD and the likes, which support 64 bits. David From cournape at gmail.com Tue Jan 5 20:09:19 2010 From: cournape at gmail.com (David Cournapeau) Date: Wed, 6 Jan 2010 10:09:19 +0900 Subject: [Numpy-discussion] 1.4.0 installer fails on OSX 10.6.2 In-Reply-To: <4B43D6E6.80305@noaa.gov> References: <9d5ec4221001050835w5dd4cf97hffd4f2480bf19c3e@mail.gmail.com> <4B43A4FF.3010604@hawaii.edu> <4B43B4CD.3030701@hawaii.edu> <523E341E-A076-465C-938C-EC902F5F0D16@gmail.com> <4B43C4D2.5030701@noaa.gov> <4B43D330.7090608@noaa.gov> <4B43D6E6.80305@noaa.gov> Message-ID: <5b8d13221001051709g58103d1ewf4915d8be38a3277@mail.gmail.com> On Wed, Jan 6, 2010 at 9:18 AM, Christopher Barker wrote: > If distutils/setuptools could identify the python version properly, then > ?binary eggs and easy-install could be a solution -- but that's a mess, > too. It would not solve the problem, really. Two same versions of python does not imply compatible python when C extensions are involved. In current state of affairs, where python does not have a stable ABI, the only workable solution is to target one specific python (or to build your own as in EPD). cheers, David From x.yang at physics.usyd.edu.au Tue Jan 5 22:38:29 2010 From: x.yang at physics.usyd.edu.au (Xue (Sue) Yang) Date: Wed, 6 Jan 2010 14:38:29 +1100 Subject: [Numpy-discussion] performance matrix multiplication vs. 
matlab Message-ID: <001a01ca8e81$b626ec50$2274c4f0$@yang@physics.usyd.edu.au> Hi, I followed what I collected about installation of numpy with lapack and atlas and installed numpy on our desktop with RHEL4 and 4 cores. >uname -a Linux curie.physics.usyd.edu.au 2.6.9-89.0.15.ELsmp #1 SMP Sat Oct 10 05:59:16 EDT 2009 i686 i686 i386 GNU/Linux I successfully installed lapack-3.1.1, atlas3.8.0 with fortran comfiler: gfortran, and numpy-1.3.0 with enthought-python distribution (python2.5). > python >>import numpy >>a = numpy.random.randn(6000, 6000) >>numpy.dot(a, a) Surprisingly, it only uses 2 cores instead of 4 cores. Where and how should I set up the number of threads for numpy? Thanks! Dr. Xue (Sue) Yang School of Physics, University of Sydney Ph: 02 9351 6081 Email: x.yang at physics.usyd.edu.au From david at silveregg.co.jp Tue Jan 5 22:49:28 2010 From: david at silveregg.co.jp (David Cournapeau) Date: Wed, 06 Jan 2010 12:49:28 +0900 Subject: [Numpy-discussion] performance matrix multiplication vs. matlab In-Reply-To: <001a01ca8e81$b626ec50$2274c4f0$@yang@physics.usyd.edu.au> References: <001a01ca8e81$b626ec50$2274c4f0$@yang@physics.usyd.edu.au> Message-ID: <4B440848.2060106@silveregg.co.jp> Xue (Sue) Yang wrote: > Hi, > > I followed what I collected about installation of numpy with lapack and > atlas and installed numpy on our desktop with RHEL4 and 4 cores. > >> uname -a > > Linux curie.physics.usyd.edu.au 2.6.9-89.0.15.ELsmp #1 SMP Sat Oct 10 > 05:59:16 EDT 2009 i686 i686 i386 GNU/Linux > > I successfully installed lapack-3.1.1, atlas3.8.0 with fortran comfiler: > gfortran, and numpy-1.3.0 with enthought-python distribution (python2.5). > >> python >>> import numpy >>> a = numpy.random.randn(6000, 6000) >>> numpy.dot(a, a) > > Surprisingly, it only uses 2 cores instead of 4 cores. Where and how should > I set up the number of threads for numpy? Atlas (at least your version, I don't know about 3.9.* series) does not support setting the number of threads dynamically - it is a compile time option. If the compile time option is indeed 4 threads, it may be that ATLAS decided that using 2 threads instead of 4 was more efficient. You can find this info in atlas_buildinfo.h file (the ATL_NCPU CPP define). Note that you should not use atlas 3.8.0, as it has a number of serious bugs - you should use 3.8.3. David From nadavh at visionsense.com Wed Jan 6 01:13:33 2010 From: nadavh at visionsense.com (Nadav Horesh) Date: Wed, 6 Jan 2010 08:13:33 +0200 Subject: [Numpy-discussion] extracting data from ODF files References: <209cec441001051314k284895bdq6e9a3b8f5fd9197c@mail.gmail.com> Message-ID: <710F2847B0018641891D9A21602763605AD297@ex3.envision.co.il> There is a possibility to export the data to excel format and use xlrd or similar package to read it. Nadav -----Original Message----- From: numpy-discussion-bounces at scipy.org on behalf of Manuel Wittchen Sent: Tue 05-Jan-10 23:14 To: Discussion of Numerical Python Subject: [Numpy-discussion] extracting data from ODF files Hi, is there a (simple) solution to extract data from OpenDocument files (espacially OpenOffice.org Calc files) into a Numpy Array? At the moment I copy the colums from OO.org Calc manually into a tab-separatet Plaintext file which is quite annoying. Regards, Manuel Wittchen _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... 
Name: winmail.dat Type: application/ms-tnef Size: 2927 bytes Desc: not available URL: From Chris.Barker at noaa.gov Wed Jan 6 01:34:45 2010 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Tue, 05 Jan 2010 22:34:45 -0800 Subject: [Numpy-discussion] extracting data from ODF files In-Reply-To: <710F2847B0018641891D9A21602763605AD297@ex3.envision.co.il> References: <209cec441001051314k284895bdq6e9a3b8f5fd9197c@mail.gmail.com> <710F2847B0018641891D9A21602763605AD297@ex3.envision.co.il> Message-ID: <4B442F05.7070400@noaa.gov> Nadav Horesh wrote: > is there a (simple) solution to extract data from OpenDocument files > (espacially OpenOffice.org Calc files) into a Numpy Array? Aren't they XML? you may be able to use an XML parser. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From silva at lma.cnrs-mrs.fr Wed Jan 6 04:50:39 2010 From: silva at lma.cnrs-mrs.fr (Fabricio Silva) Date: Wed, 06 Jan 2010 10:50:39 +0100 Subject: [Numpy-discussion] extracting data from ODF files In-Reply-To: <4B442F05.7070400@noaa.gov> References: <209cec441001051314k284895bdq6e9a3b8f5fd9197c@mail.gmail.com> <710F2847B0018641891D9A21602763605AD297@ex3.envision.co.il> <4B442F05.7070400@noaa.gov> Message-ID: <1262771439.3294.2.camel@PCTerrusse> Le mardi 05 janvier 2010 ? 22:34 -0800, Christopher Barker a ?crit : > Nadav Horesh wrote: > > is there a (simple) solution to extract data from OpenDocument files > > (espacially OpenOffice.org Calc files) into a Numpy Array? > > Aren't they XML? you may be able to use an XML parser. See, e.g., http://wiki.services.openoffice.org/wiki/CalcParser Maybe another solution is to use the python open office interface : python-uno ? -- Fabrice Silva Laboratory of Mechanics and Acoustics (CNRS, UPR 7051) From Chris.Barker at noaa.gov Wed Jan 6 11:22:24 2010 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Wed, 06 Jan 2010 08:22:24 -0800 Subject: [Numpy-discussion] 1.4.0 installer fails on OSX 10.6.2 In-Reply-To: <5b8d13221001051709g58103d1ewf4915d8be38a3277@mail.gmail.com> References: <9d5ec4221001050835w5dd4cf97hffd4f2480bf19c3e@mail.gmail.com> <4B43A4FF.3010604@hawaii.edu> <4B43B4CD.3030701@hawaii.edu> <523E341E-A076-465C-938C-EC902F5F0D16@gmail.com> <4B43C4D2.5030701@noaa.gov> <4B43D330.7090608@noaa.gov> <4B43D6E6.80305@noaa.gov> <5b8d13221001051709g58103d1ewf4915d8be38a3277@mail.gmail.com> Message-ID: <4B44B8C0.5010203@noaa.gov> NOTE: cc-d to the pythonmac list from the numpy list -- this is really a Mac issue. It's a discussion of what/how to produce binaries of numpy for OS-X David Cournapeau wrote: > On Wed, Jan 6, 2010 at 9:18 AM, Christopher Barker > wrote: > >> If distutils/setuptools could identify the python version properly, then >> binary eggs and easy-install could be a solution -- but that's a mess, >> too. > > It would not solve the problem, really. Two same versions of python > does not imply compatible python when C extensions are involved. In > current state of affairs, where python does not have a stable ABI, the > only workable solution is to target one specific python So you are saying that binary eggs are simply impossible altogether. Maybe true, I suppose, but... >>> As the 2.6 series is binary compatible, you can build a single >>> installer >>> that will work with both > > I don't think that's true. 
2.6.x are compatible with each other iif > they are built with the same compiler options. There are too many > differences between Apple python and the python.org python (dtrace, 64 > bits support, compiler options, etc...) IMHO to make a compatible > installer for both versions worthwhile. Well, it was possible once, and it's been working just fine for wxPython for a good while. Things may have changed with OS-X 10.6, tough I think the wxPython binary still works (32 bit only, of course). > I agree that the lack of 64 bits installer is an issue, but building > numpy on mac os x is not that difficult, and I think people who need > 64 bits often are more knowledgeable. I agree -- but what do you get if you install OS-X 10.6, and then type "python" at the prompt -- is that a 32 bit or 64 bit python? > There are also solutions like > EPD and the likes, which support 64 bits. but not PPC anymore, sigh. There is a key problem here -- folks running OS-X expecting it to be another Unix are fine -- they install the compiler, build their own extensions, probably use some combination of fink and macports, etc. This may well apply to many Scientific programmers, and web developers (though it maybe not). However, there is a different type of Mac user -- the type that has traditionally used Macs. Some of these folks are giving a bit of programming a try, and have heard that python is an easy to learn language -- and, cool, OS-X even comes with it installed! But then they soon enough discover that they need additional packages of some sort --- and numpy is a very, very useful package, and not just for the experienced programmer (think Matlab users, for instance). These folks haven't installed the compiler, don't know 64 from 32 bit, and heaven forbid, have no idea how the heck to compile a dependency with the "./configure && make && make install" dance. Some years ago, the community on the pythonmac list made significant efforts to try to support these folks. Primarily what they need are binary installers. We also more or less declared the python.org python as the official python to support, and even had a repository of pre-built packages (http://pythonmac.org/packages/py25-fat/index.html). It was pretty handy -- you could get python itself and all the major packages there, all working together. That repository is not longer maintained, for a couple reasons: 1) Bob Ippolito is doing other things 2) A lot of package developers are providing binaries themselves 3) It's gotten even messier! But there is still a community out there that it would be nice to support. So the question is: how do we do it? That repository appears to be dead, though it could be revived if there is a bit of interest. But even without it, it would be great if there was some sort of consensus among the pythonmac crowd and the major package developers as to what to provide binaries for. We're really in a mess if you can only get a binary for PIL for the Apple Python, and only get a binary for numpy for the python.org python. Personally, I still think the Apple python is dead-end -- Apple has never supported it properly. And, if you go that route you need a different build for people running 10.4 and 10.5, and 10.6, and ... I'm not sure what the 64 bit story is -- I suspect that David is right -- folks running 64 bits are the ones that know what they are doing, so they have less need for binaries. 
So maybe for now this is a good goal: python 2.5: python.org build (32 bit PPC and intel) python 2.6: 32 bit python.org 2.6.4 64 bit python.org build? python 2.7: python.org 3-way build, if that happens. or separate 32 and 64 bit builds python 3.1: python.org build (whatever it ends up being) Darn, that's quite a few to support! NOTE: Ned Deily just posted a good summary of what's out there on teh pythonmac list: http://mail.python.org/pipermail/pythonmac-sig/2010-January/022031.html -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From Chris.Barker at noaa.gov Wed Jan 6 11:35:47 2010 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Wed, 06 Jan 2010 08:35:47 -0800 Subject: [Numpy-discussion] [Pythonmac-SIG] 1.4.0 installer fails on OSX 10.6.2 In-Reply-To: <4B44B8C0.5010203@noaa.gov> References: <9d5ec4221001050835w5dd4cf97hffd4f2480bf19c3e@mail.gmail.com> <4B43A4FF.3010604@hawaii.edu> <4B43B4CD.3030701@hawaii.edu> <523E341E-A076-465C-938C-EC902F5F0D16@gmail.com> <4B43C4D2.5030701@noaa.gov> <4B43D330.7090608@noaa.gov> <4B43D6E6.80305@noaa.gov> <5b8d13221001051709g58103d1ewf4915d8be38a3277@mail.gmail.com> <4B44B8C0.5010203@noaa.gov> Message-ID: <4B44BBE3.9090601@noaa.gov> One more note: An easy improvement to the current situation with binaries is to LABEL THEM WELL: It's worse to have a binary you expect to work fail for you than to not have one available. IN the past, I think folks' have used the default name provided by bdist_mpkg, and those are not always clear. Something like: numpy1.4-osx10.4-python.org2.6-32bit.dmg or something -- even better, with a a bit more text -- would help a lot. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From cournape at gmail.com Wed Jan 6 20:09:42 2010 From: cournape at gmail.com (David Cournapeau) Date: Thu, 7 Jan 2010 10:09:42 +0900 Subject: [Numpy-discussion] 1.4.0 installer fails on OSX 10.6.2 In-Reply-To: <4B44B8C0.5010203@noaa.gov> References: <9d5ec4221001050835w5dd4cf97hffd4f2480bf19c3e@mail.gmail.com> <4B43B4CD.3030701@hawaii.edu> <523E341E-A076-465C-938C-EC902F5F0D16@gmail.com> <4B43C4D2.5030701@noaa.gov> <4B43D330.7090608@noaa.gov> <4B43D6E6.80305@noaa.gov> <5b8d13221001051709g58103d1ewf4915d8be38a3277@mail.gmail.com> <4B44B8C0.5010203@noaa.gov> Message-ID: <5b8d13221001061709u56d72a80m73bacb69d123cd04@mail.gmail.com> On Thu, Jan 7, 2010 at 1:22 AM, Christopher Barker wrote: > NOTE: cc-d to the pythonmac list from the numpy list -- this is really a > Mac issue. It's a discussion of what/how to produce binaries of numpy > for OS-X > > > David Cournapeau wrote: >> On Wed, Jan 6, 2010 at 9:18 AM, Christopher Barker >> wrote: >> >>> If distutils/setuptools could identify the python version properly, then >>> ?binary eggs and easy-install could be a solution -- but that's a mess, >>> too. >> >> It would not solve the problem, really. Two same versions of python >> does not imply compatible python when C extensions are involved. In >> current state of affairs, where python does not have a stable ABI, the >> only workable solution is to target one specific python > > So you are saying that binary eggs are simply impossible altogether. 
More simply, you can't offer a single binary installer which works on binary-incompatible python versions. > I agree -- but what do you get if you install OS-X 10.6, and then type > "python" at the prompt -- is that a 32 bit or 64 bit python? 64 bits, at least by default. All the userland provided by OS-X is 64 bits AFAIK (the only apps still 32 bits on my macbook are vmware and the kernel). There is also the problem that controlling the minimal supported version of OS X is hard to control (another distutils insanity). > However, there is a different type of Mac user -- the type that has > traditionally used Macs. Some of these folks are giving a bit of > programming a try, and have heard that python is an easy to learn > language -- and, cool, OS-X even comes with it installed! > > But then they soon enough discover that they need additional packages of > some sort --- and numpy is a very, very useful package, and not just for > the experienced programmer (think Matlab users, for instance). These > folks haven't installed the compiler, don't know 64 from 32 bit, and > heaven forbid, have no idea how the heck to compile a dependency with > the "./configure && make && make install" dance. Those people already have numpy installed, though. The only solution I can see for a one-click install is to control the whole stack, e.g. like EPD eos. > Some years ago, the community on the pythonmac list made significant > efforts to try to support these folks. Primarily what they need are > binary installers. We also more or less declared the python.org python > as the official python to support, and even had a repository of > pre-built packages (http://pythonmac.org/packages/py25-fat/index.html). > It was pretty handy -- you could get python itself and all the major > packages there, all working together. I hope that our own scientific repository will be able to do this - at least that's one of the stated goal (see the toydist discussion). The only scalable solution I can see is if the packages are automatically built for every version of Mac OS X we wish to support. > > Personally, I still think the Apple python is dead-end -- Apple has > never supported it properly. And, if you go that route you need a > different build for people running 10.4 and 10.5, and 10.6, and ... I am afraid that this is needed anyway once you start depending on "high-level" stuff from Mac OS X API. > Darn, that's quite a few to support! I would say that's insane :) That's hopeless intractable. If the numpy stats are any indication, only supporting the last released python version is enough for most users. IMHO, it is much better to support only one binary installer which works well rather than a myriad which work half the time, and only confuse people anyway. 
David From cournape at gmail.com Wed Jan 6 20:12:08 2010 From: cournape at gmail.com (David Cournapeau) Date: Thu, 7 Jan 2010 10:12:08 +0900 Subject: [Numpy-discussion] [Pythonmac-SIG] 1.4.0 installer fails on OSX 10.6.2 In-Reply-To: <4B44BBE3.9090601@noaa.gov> References: <9d5ec4221001050835w5dd4cf97hffd4f2480bf19c3e@mail.gmail.com> <4B43B4CD.3030701@hawaii.edu> <523E341E-A076-465C-938C-EC902F5F0D16@gmail.com> <4B43C4D2.5030701@noaa.gov> <4B43D330.7090608@noaa.gov> <4B43D6E6.80305@noaa.gov> <5b8d13221001051709g58103d1ewf4915d8be38a3277@mail.gmail.com> <4B44B8C0.5010203@noaa.gov> <4B44BBE3.9090601@noaa.gov> Message-ID: <5b8d13221001061712m3498390cse4b2ae6b37fc19e2@mail.gmail.com> On Thu, Jan 7, 2010 at 1:35 AM, Christopher Barker wrote: > One more note: > > An easy improvement to the current situation with binaries is to LABEL > THEM WELL: > > It's worse to have a binary you expect to work fail for you than to not > have one available. IN the past, I think folks' have used the default > name provided by bdist_mpkg, and those are not always clear. Something like: > > > numpy1.4-osx10.4-python.org2.6-32bit.dmg The 32 bits is redundant - we support all archs supported by the official python binary, so python.org is enough. About osx10.4, I still don't know how to make sure we do work there with distutils. The whole MACOSX_DEPLOYMENT_TARGET confuses me quite a lot. Other than that, the numpy 1.4.0 follows your advice, and contains the python.org part. David From x.yang at physics.usyd.edu.au Wed Jan 6 21:20:33 2010 From: x.yang at physics.usyd.edu.au (Xue (Sue) Yang) Date: Thu, 7 Jan 2010 13:20:33 +1100 Subject: [Numpy-discussion] performance matrix multiplication vs. matlab Message-ID: <002401ca8f3f$fdae2340$f90a69c0$@yang@physics.usyd.edu.au> Hi David, Thank you for the reply which is useful. I also tried to Install numpy with intel mkl 9.1 I still used gfortran for numpy installation as intel mkl 9.1 supports gnu compiler. I only uncomment these lines for site.cfg in site.cfg.example [mkl] library_dirs = /usr/physics/intel/mkl/lib/32 include_dirs = /usr/physics/intel/mkl/include lapack_libs = mkl_lapack then I tested the numpy with > python >>import numpy >>a = numpy.random.randn(6000, 6000) >>numpy.dot(a, a) This time, only one cpu was used. Does it mean that our installed intel mkl 9.1 is not threaded? I don't think so. We have used it for openMP parallelization for quite a while. Thanks! Sue From cournape at gmail.com Wed Jan 6 23:21:07 2010 From: cournape at gmail.com (David Cournapeau) Date: Thu, 7 Jan 2010 13:21:07 +0900 Subject: [Numpy-discussion] performance matrix multiplication vs. matlab In-Reply-To: <-7769099014455669290@unknownmsgid> References: <-7769099014455669290@unknownmsgid> Message-ID: <5b8d13221001062021k7d6dd475r7e895bed70b712f7@mail.gmail.com> On Thu, Jan 7, 2010 at 11:20 AM, Xue (Sue) Yang wrote: > This time, only one cpu was used. ?Does it mean that our installed intel mkl > 9.1 is not threaded? You would have to consult the MKL documentation - I believe you can control how many threads are used from an environment variable. Also, the exact build commands depend on the version of the MKL, as its libraries often change between versions. David From sturla at molden.no Thu Jan 7 06:31:38 2010 From: sturla at molden.no (Sturla Molden) Date: Thu, 7 Jan 2010 12:31:38 +0100 Subject: [Numpy-discussion] performance matrix multiplication vs. 
matlab In-Reply-To: <002401ca8f3f$fdae2340$f90a69c0$@yang@physics.usyd.edu.au> References: <002401ca8f3f$fdae2340$f90a69c0$@yang@physics.usyd.edu.au> Message-ID: <1d4f24dc6f376254aa63963eb6cd5916.squirrel@webmail.uio.no> > I also tried to Install numpy with intel mkl 9.1 > I still used gfortran for numpy installation as intel mkl 9.1 supports gnu > compiler. I would suggest using GotoBLAS instead of ATLAS. It is easier to build then ATLAS (basically no configuration), and has even better performance than MKL. http://www.tacc.utexas.edu/tacc-projects/ S.M. From denis-bz-py at t-online.de Thu Jan 7 09:05:19 2010 From: denis-bz-py at t-online.de (denis) Date: Thu, 07 Jan 2010 15:05:19 +0100 Subject: [Numpy-discussion] Repeated dot products In-Reply-To: References: Message-ID: On 12/12/2009 22:55, T J wrote: > Hi, > > Suppose I have an array of shape: (n, k, k). In this case, I have n > k-by-k matrices. My goal is to compute the product of a (potentially > large) user-specified selection (with replacement) of these matrices. > For example, > > x = [0,1,2,1,3,3,2,1,3,2,1,5,3,2,3,5,2,5,3,2,1,3,5,6] TJ, what are your n, k, len(x) ? _dotblas.dot is fast: dot( 10x10 matrices ) takes ~ 22 usec on my g4 ppc, which is ~ 15 clock cycles (700 MHz) per mem access * +. A hack to find repeated pairs (or triples ...) follows. Your sequence above has only (3,2) 4 times, no win. (Can someone give a probabilistic estimate of the number of non-overlapping pairs in N letters from an alphabet of size A ?) #!/usr/bin/env python # numpy-discuss 2009 12dec TJ repeated dot products from __future__ import division from collections import defaultdict import numpy as np __version__ = "2010 7jan denis" def pairs( s, Len=2 ): """ repeated non-overlapping pairs (substrings, subwords) "abracadabra" -> ab ra [[0 7] [2 9]], not br Len=3: triples, 4 ... """ # bruteforce # grow repeated 2 3 ... ? pairs = defaultdict(list) for j in range(len(s)-Len+1): pairs[ s[j:j+Len] ].append(j) min2 = filter( lambda x: len(x) > 1, pairs.values() ) min2.sort( key = lambda x: len(x), reverse=True ) # remove overlaps -- # (if many, during init scan would be faster) runs = np.zeros( len(s), np.uint8 ) run = np.ones( Len, np.uint8 ) run[0] = Len chains = [] for ovchain in min2: chain = [] for c in ovchain: if not runs[c:c+Len].any(): runs[c:c+Len] = run chain.append(c) if len(chain) > 1: chains.append(chain) return (chains, runs) #............................................................................... if __name__ == "__main__": import sys abra = "abracadabra" alph = 5 randlen = 100 randseed = 1 exec( "\n".join( sys.argv[1:] )) # Test= ... print "pairs( %s ) --" % abra print pairs( abra ) # ab [0, 7], br [2, 9]] print pairs( abra, 3 ) # abr [0, 7] np.random.seed( randseed ) r = np.random.random_integers( 1, alph, randlen ) chains, runs = pairs( tuple(r) ) npair = sum([ len(c) for c in chains ]) print "%d repeated pairs in %d random %d" % (npair, randlen, alph) # 35 repeated pairs in 100 random 5 (prob estimate this ?) # 25 repeated pairs in 100 random 10 From Chris.Barker at noaa.gov Thu Jan 7 12:36:31 2010 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Thu, 07 Jan 2010 09:36:31 -0800 Subject: [Numpy-discussion] performance matrix multiplication vs. 
matlab In-Reply-To: <1d4f24dc6f376254aa63963eb6cd5916.squirrel@webmail.uio.no> References: <002401ca8f3f$fdae2340$f90a69c0$%yang@physics.usyd.edu.au> <1d4f24dc6f376254aa63963eb6cd5916.squirrel@webmail.uio.no> Message-ID: <4B461B9F.5060800@noaa.gov> Sturla Molden wrote: > I would suggest using GotoBLAS instead of ATLAS. > http://www.tacc.utexas.edu/tacc-projects/ That does look promising -- nay idea what the license is? They don't make it clear on the site (maybe it it is you set up a user account and download, but I'd rather know up front). The only reference I could find is from 2006: http://www.utexas.edu/news/2006/04/12/tacc/ and in that, they refer to one of those annoying "free for academic and scientific use" clauses. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From sturla at molden.no Thu Jan 7 12:47:25 2010 From: sturla at molden.no (Sturla Molden) Date: Thu, 7 Jan 2010 18:47:25 +0100 Subject: [Numpy-discussion] performance matrix multiplication vs. matlab In-Reply-To: <4B461B9F.5060800@noaa.gov> References: <002401ca8f3f$fdae2340$f90a69c0$%yang@physics.usyd.edu.au> <1d4f24dc6f376254aa63963eb6cd5916.squirrel@webmail.uio.no> <4B461B9F.5060800@noaa.gov> Message-ID: > Sturla Molden wrote: >> I would suggest using GotoBLAS instead of ATLAS. > >> http://www.tacc.utexas.edu/tacc-projects/ > > That does look promising -- nay idea what the license is? They don't > make it clear on the site UT TACC Research License (Source Code) The Texas Advanced Computing Center of The University of Texas at Austin has developed certain software and documentation that it desires to make available without charge to anyone for academic, research, experimental or personal use. This license is designed to guarantee freedom to use the software for these purposes. If you wish to distribute or make other use of the software, you may purchase a license to do so from the University of Texas. The accompanying source code is made available to you under the terms of this UT TACC Research License (this "UTTRL"). By clicking the "ACCEPT" button, or by installing or using the code, you are consenting to be bound by this UTTRL. If you do not agree to the terms and conditions of this license, do not click the "ACCEPT" button, and do not install or use any part of the code. The terms and conditions in this UTTRL not only apply to the source code made available by UT TACC, but also to any improvements to, or derivative works of, that source code made by you and to any object code compiled from such source code, improvements or derivative works. 1. DEFINITIONS. 1.1 "Commercial Use" shall mean use of Software or Documentation by Licensee for direct or indirect financial, commercial or strategic gain or advantage, including without limitation: (a) bundling or integrating the Software with any hardware product or another software product for transfer, sale or license to a third party (even if distributing the Software on separate media and not charging for the Software); (b) providing customers with a link to the Software or a copy of the Software for use with hardware or another software product purchased by that customer; or (c) use in connection with the performance of services for which Licensee is compensated. 1.2 "Derivative Products" means any improvements to, or other derivative works of, the Software made by Licensee. 
1.3 "Documentation" shall mean all manuals, user documentation, and other related materials pertaining to the Software that are made available to Licensee in connection with the Software. 1.4 "Licensor" shall mean The University of Texas. 1.5 "Licensee" shall mean the person or entity that has agreed to the terms hereof and is exercising rights granted hereunder. 1.6 "Software" shall mean the computer program(s) referred to as GotoBLAS2 made available under this UTTRL in source code form, including any error corrections, bug fixes, patches, updates or other modifications that Licensor may in its sole discretion make available to Licensee from time to time, and any object code compiled from such source code. 2. GRANT OF RIGHTS. Subject to the terms and conditions hereunder, Licensor hereby grants to Licensee a worldwide, non-transferable, non-exclusive license to (a) install, use and reproduce the Software for academic, research, experimental and personal use (but specifically excluding Commercial Use); (b) use and modify the Software to create Derivative Products, subject to Section 3.2; and (c) use the Documentation, if any, solely in connection with Licensee's authorized use of the Software. 3. RESTRICTIONS; COVENANTS. 3.1 Licensee may not: (a) distribute, sub-license or otherwise transfer copies or rights to the Software (or any portion thereof) or the Documentation; (b) use the Software (or any portion thereof) or Documentation for Commercial Use, or for any other use except as described in Section 2; (c) copy the Software or Documentation other than for archival and backup purposes; or (d) remove any product identification, copyright, proprietary notices or labels from the Software and Documentation. This UTTRL confers no rights upon Licensee except those expressly granted herein. 3.2 Licensee hereby agrees that it will provide a copy of all Derivative Products to Licensor and that its use of the Derivative Products will be subject to all of the same terms, conditions, restrictions and limitations on use imposed on the Software under this UTTRL. Licensee hereby grants Licensor a worldwide, non-exclusive, royalty-free license to reproduce, prepare derivative works of, publicly display, publicly perform, sublicense and distribute Derivative Products. Licensee also hereby grants Licensor a worldwide, non-exclusive, royalty-free patent license to make, have made, use, offer to sell, sell, import and otherwise transfer the Derivative Products under those patent claims licensable by Licensee that are necessarily infringed by the Derivative Products. 4. PROTECTION OF SOFTWARE. 4.1 Confidentiality. The Software and Documentation are the confidential and proprietary information of Licensor. Licensee agrees to take adequate steps to protect the Software and Documentation from unauthorized disclosure or use. Licensee agrees that it will not disclose the Software or Documentation to any third party. 4.2 Proprietary Notices. Licensee shall maintain and place on any copy of Software or Documentation that it reproduces for internal use all notices as are authorized and/or required hereunder. Licensee shall include a copy of this UTTRL and the following notice, on each copy of the Software and Documentation. Such license and notice shall be embedded in each copy of the Software, in the video screen display, on the physical medium embodying the Software copy and on any Documentation: Copyright ?? The University of Texas, 2009. All right reserved. 
UNIVERSITY EXPRESSLY DISCLAIMS ANY AND ALL WARRANTIES CONCERNING THIS SOFTWARE AND DOCUMENTATION, INCLUDING ANY WARRANTIES OF MERCHANTABILITY, FITNESS FOR ANY PARTICULAR PURPOSE, NON-INFRINGEMENT AND WARRANTIES OF PERFORMANCE, AND ANY WARRANTY THAT MIGHT OTHERWISE ARISE FROM COURSE OF DEALING OR USAGE OF TRADE. NO WARRANTY IS EITHER EXPRESS OR IMPLIED WITH RESPECT TO THE USE OF THE SOFTWARE OR DOCUMENTATION. Under no circumstances shall University be liable for incidental, special, indirect, direct or consequential damages or loss of profits, interruption of business, or related expenses which may arise from use of Software or Documentation, including but not limited to those resulting from defects in Software and/or Documentation, or loss or inaccuracy of data of any kind. 5. WARRANTIES. 5.1 Disclaimer of Warranties. TO THE EXTENT PERMITTED BY APPLICABLE LAW, THE SOFTWARE AND DOCUMENTATION ARE BEING PROVIDED ON AN "AS IS" BASIS WITHOUT ANY WARRANTIES OF ANY KIND RESPECTING THE SOFTWARE OR DOCUMENTATION, EITHER EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY OF DESIGN, MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, OR NON-INFRINGEMENT. 5.2 Limitation of Liability. UNDER NO CIRCUMSTANCES UNLESS REQUIRED BY APPLICABLE LAW SHALL LICENSOR BE LIABLE FOR INCIDENTAL, SPECIAL, INDIRECT, DIRECT OR CONSEQUENTIAL DAMAGES OR LOSS OF PROFITS, INTERRUPTION OF BUSINESS, OR RELATED EXPENSES WHICH MAY ARISE AS A RESULT OF THIS LICENSE OR OUT OF THE USE OR ATTEMPT OF USE OF SOFTWARE OR DOCUMENTATION INCLUDING BUT NOT LIMITED TO THOSE RESULTING FROM DEFECTS IN SOFTWARE AND/OR DOCUMENTATION, OR LOSS OR INACCURACY OF DATA OF ANY KIND. THE FOREGOING EXCLUSIONS AND LIMITATIONS WILL APPLY TO ALL CLAIMS AND ACTIONS OF ANY KIND, WHETHER BASED ON CONTRACT, TORT (INCLUDING, WITHOUT LIMITATION, NEGLIGENCE), OR ANY OTHER GROUNDS. 6. INDEMNIFICATION. Licensee shall indemnify, defend and hold harmless Licensor, the University of Texas System, their Regents, and their officers, agents and employees from and against any claims, demands, or causes of action whatsoever caused by, or arising out of, or resulting from, the exercise or practice of the license granted hereunder by Licensee, its officers, employees, agents or representatives. 7. TERMINATION. If Licensee breaches this UTTRL, Licensee\'s right to use the Software and Documentation will terminate immediately without notice, but all provisions of this UTTRL except Section 2 will survive termination and continue in effect. Upon termination, Licensee must destroy all copies of the Software and Documentation. 8. GOVERNING LAW; JURISDICTION AND VENUE. The validity, interpretation, construction and performance of this UTTRL shall be governed by the laws of the State of Texas. The Texas state courts of Travis County, Texas (or, if there is exclusive federal jurisdiction, the United States District Court for the Central District of Texas) shall have exclusive jurisdiction and venue over any dispute arising out of this UTTRL, and Licensee consents to the jurisdiction of such courts. Application of the United Nations Convention on Contracts for the International Sale of Goods is expressly excluded. 9. EXPORT CONTROLS. This license is subject to all applicable export restrictions. Licensee must comply with all export and import laws and restrictions and regulations of any United States or foreign agency or authority relating to the Software and its use. 10. U.S. GOVERNMENT END-USERS. The Software is a "commercial item," as that term is defined in 48 C.F.R. 
2.101, consisting of "commercial computer software" and "commercial computer software documentation," as such terms are used in 48 C.F.R. 12.212 (Sept. 1995) and 48 C.F.R. 227.7202 (June 1995). Consistent with 48 C.F.R. 12.212, 48 C.F.R. 27.405(b)(2) (June 1998) and 48 C.F.R. 227.7202, all U.S. Government End Users acquire the Software with only those rights as set forth herein. 11. MISCELLANEOUS. If any provision hereof shall be held illegal, invalid or unenforceable, in whole or in part, such provision shall be modified to the minimum extent necessary to make it legal, valid and enforceable, and the legality, validity and enforceability of all other provisions of this UTTRL shall not be affected thereby. Licensee may not assign this UTTRL in whole or in part, without Licensor's prior written consent. Any attempt to assign this UTTRL without such consent will be null and void. This UTTRL is the complete and exclusive statement between Licensee and Licensor relating to the subject matter hereof and supersedes all prior oral and written and all contemporaneous oral negotiations, commitments and understandings of the parties, if any. Any waiver by either party of any default or breach hereunder shall not constitute a waiver of any provision of this UTTRL or of any subsequent default or breach of the same or a different kind. END OF LICENSE From Nikolas.Tezak at gmx.de Thu Jan 7 12:51:44 2010 From: Nikolas.Tezak at gmx.de (Nikolas Tezak) Date: Thu, 7 Jan 2010 18:51:44 +0100 Subject: [Numpy-discussion] Behaviour of vdot(array2d, array1d) Message-ID: <90626729-8117-4C72-9509-E1BE7D7F7933@gmx.de> Hi, I am new to this list, but I have been using scipy for a couple of months now with great satisfaction. Currently I have a problem: I diagonalize a hermitian complex matrix using the eigh routine from scipy.linalg (this is still a numpy question, see below) This returns the eigenvectors as columns of a 2d array. Now I would like to project a vector onto this new basis. I could either do: inital_state = array(...) #dtype=complex, shape=(dim,) coefficients = zeros( shape=(dim,), dtype=complex) matrix = array(...) #dtype=complex, shape=(dim, dim) eigenvalues, eigenvectors = eigh(matrix) for i in xrange(dim): coefficients[i] = vdot(eigenvalues[:, i], initial_state) But it seems to me after reading the documentation for vdot, that it should also be possible to do this without a loop: initial_state = array(...) #dtype=complex, shape=(dim,) matrix = array(...) #dtype=complex, shape=(dim, dim) eigenvalues, eigenvectors = eigh(matrix) coefficients = vdot( eigenvalues.transpose(), initial_state) However when I do this, vdot raises a ValueError complaining that the "vectors have different lengths". It seems that vdot (as opposed to dot) cannot handle arguments with different shape although the documentation suggests otherwise. I am using numpy version 1.3.0. Is this a bug or am I missing something? 
Regards, Nikolas From Chris.Barker at noaa.gov Thu Jan 7 14:16:40 2010 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Thu, 07 Jan 2010 11:16:40 -0800 Subject: [Numpy-discussion] [Pythonmac-SIG] 1.4.0 installer fails on OSX 10.6.2 In-Reply-To: <5b8d13221001061712m3498390cse4b2ae6b37fc19e2@mail.gmail.com> References: <9d5ec4221001050835w5dd4cf97hffd4f2480bf19c3e@mail.gmail.com> <4B43B4CD.3030701@hawaii.edu> <523E341E-A076-465C-938C-EC902F5F0D16@gmail.com> <4B43C4D2.5030701@noaa.gov> <4B43D330.7090608@noaa.gov> <4B43D6E6.80305@noaa.gov> <5b8d13221001051709g58103d1ewf4915d8be38a3277@mail.gmail.com> <4B44B8C0.5010203@noaa.gov> <4B44BBE3.9090601@noaa.gov> <5b8d13221001061712m3498390cse4b2ae6b37fc19e2@mail.gmail.com> Message-ID: <4B463318.5060902@noaa.gov> David Cournapeau wrote: > On Thu, Jan 7, 2010 at 1:35 AM, Christopher Barker >> In the past, I think folks' have used the default >> name provided by bdist_mpkg, and those are not always clear. Something like: >> >> >> numpy1.4-osx10.4-python.org2.6-32bit.dmg > > The 32 bits is redundant - we support all archs supported by the > official python binary, so python.org is enough. True, though I was anticipating that there may be 32 and 64 bit builds some day. > About osx10.4, As for that -- I put that in 'cause I remembered that in the past it has said "10.5", when, in fact 10.4 was supported. Thinking more, I think it's like 32 bit -- the python.org build supports 10.4, so that's all the information folks need. > still don't know how to make sure we do work there with distutils. The > whole MACOSX_DEPLOYMENT_TARGET confuses me quite a lot. distutils should do it right, and indeed, I just tested the py2.5 and py2.6 binaries on my 10.4 PPC machine ,and most of the tests all pass on both. (though see the note below) I think distutils does do it right, at least if you use the latest version of 2.6 -- a bug was fixed there. What OS/architecture were those built with? > Other than > that, the numpy 1.4.0 follows your advice, and contains the python.org > part. I should have looked first -- thanks, I think that will be helpful. NOTE: When I first installed the binary, I got a whole bunch of errors because "matrix' wasn't found. I recalled this issue from testing, and cleared out the install, then re-installed, and all was fine. I wonder if it's possible to have a mpkg remove anything? Other failed tests: ====================================================================== FAIL: test_umath.test_nextafterl ... return _test_nextafter(np.longdouble) File "/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/numpy/core/tests/test_umath.py", line 852, in _test_nextafter assert np.nextafter(one, two) - one == eps AssertionError ====================================================================== FAIL: test_umath.test_spacingl ---------------------------------------------------------------------- ... Traceback (most recent call last): line 887, in test_spacingl return _test_spacing(np.longdouble) File "/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/numpy/core/tests/test_umath.py", line 873, in _test_spacing assert np.spacing(one) == eps AssertionError I think both of those are known issues, and not a big deal. -Chris -- Christopher Barker, Ph.D. 
Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From Chris.Barker at noaa.gov Thu Jan 7 15:08:23 2010 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Thu, 07 Jan 2010 12:08:23 -0800 Subject: [Numpy-discussion] fromfile() for reading text (one more time!) In-Reply-To: <1262721695.5107.1.camel@idol> References: <4B42905A.4080105@noaa.gov> <1262721695.5107.1.camel@idol> Message-ID: <4B463F37.4010108@noaa.gov> Pauli Virtanen wrote: > ma, 2010-01-04 kello 17:05 -0800, Christopher Barker kirjoitti: > it also does odd things with spaces >> embedded in the separator: >> >> ", $ #" matches all of: ",$#" ", $#" ",$ #" > That's a documented feature: Fair enough. OK, I've written a patch that allows newlines to be interpreted as separators in addition to whatever is specified in sep. In the process of testing, I found again these issues, which are still marked as "needs decision". http://projects.scipy.org/numpy/ticket/883 In short: what to do with missing values? I'd like to address this bug, but I need a decision to do so. My proposal: Raise an ValueError with missing values. Justification: No function should EVER return data that is not there. Period. It is simply asking for hard to find bugs. Therefore: fromstring("3, 4,,5", sep=",") Should never, ever, return: array([ 3., 4., 0., 5.]) Which is what it does now. bad. bad. bad. Alternatives: A) Raising a ValueError is the easiest way to get "proper" behavior. Folks can use a more sophisticated file reader if they want missing values handled. I'm willing to contribute this patch. B) If the dtype is a floating point type, NaN could fill in the missing values -- a fine idea, but you can't use it for integers, and zero is a really bad replacement! C) The user could specify what they want filled in for missing values. This is a fine idea, though I'm not sure I want to take the time to impliment it. Oh, and this is a bug too, with probably the same solution: In [20]: np.fromstring("hjba", sep=',') Out[20]: array([ 0.]) In [26]: np.fromstring("34gytf39", sep=',') Out[26]: array([ 34.]) One more unresolved question: what should: np.fromstring("3, 4, 5,", sep=",") return? it currently returns: array([ 3., 4., 5.]) which seems a bit inconsitent with missing value handling. I also found a bug: In [6]: np.fromstring("3, 4, 5 , ", sep=",") Out[6]: array([ 3., 4., 5., 0.]) so if there is some extra whitespace in there, it does return a missing value. With my proposal, that wouldn't happen, but you might get an exception. I think you should, but it'll be easier to implement my "allow newlines" code if not. so, should I do (A) ? Another question: I've got a patch mostly working (except for the above issues) that will allow fromfile/string to read multiline non-whitespace separated data in one shot: In [15]: str Out[15]: '1, 2, 3, 4\n5, 6, 7, 8\n9, 10, 11, 12' In [16]: np.fromstring(str, sep=',', allow_newlines=True) Out[16]: array([ 1., 2., 3., 4., 5., 6., 7., 8., 9., 10., 11., 12.]) I think this is a very helpful enhancement, and, as it is a new kwarg, backward compatible: 1) Might it be accepted for inclusion? 2) Is the name for the flag OK: "allow_newlines"? It's pretty explicit, but also long -- I used it for the flag name in the C code, too. 3) What C datatype should I use for a boolean flag? I used a char, but I don't know what the numpy standard is. -Chris -- Christopher Barker, Ph.D. 
Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From josef.pktd at gmail.com Thu Jan 7 15:32:55 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 7 Jan 2010 15:32:55 -0500 Subject: [Numpy-discussion] fromfile() for reading text (one more time!) In-Reply-To: <4B463F37.4010108@noaa.gov> References: <4B42905A.4080105@noaa.gov> <1262721695.5107.1.camel@idol> <4B463F37.4010108@noaa.gov> Message-ID: <1cd32cbb1001071232l6c3d3525g4ec4747d62d998ed@mail.gmail.com> On Thu, Jan 7, 2010 at 3:08 PM, Christopher Barker wrote: > Pauli Virtanen wrote: >> ma, 2010-01-04 kello 17:05 -0800, Christopher Barker kirjoitti: >> it also does odd things with spaces >>> embedded in the separator: >>> >>> ", $ #" matches all of: ?",$#" ? ", $#" ?",$ #" > >> That's a documented feature: > > Fair enough. > > OK, I've written a patch that allows newlines to be interpreted as > separators in addition to whatever is specified in sep. > > In the process of testing, I found again these issues, which are still > marked as "needs decision". > > http://projects.scipy.org/numpy/ticket/883 > > In short: what to do with missing values? > > I'd like to address this bug, but I need a decision to do so. > > > My proposal: > > Raise an ValueError with missing values. > > > Justification: > > No function should EVER return data that is not there. Period. It is > simply asking for hard to find bugs. Therefore: > > fromstring("3, 4,,5", sep=",") > > Should never, ever, return: > > array([ 3., ?4., ?0., ?5.]) > > Which is what it does now. bad. bad. bad. > > > > > Alternatives: > > ? A) Raising a ValueError is the easiest way to get "proper" behavior. > Folks can use a more sophisticated file reader if they want missing > values handled. I'm willing to contribute this patch. > > ? B) If the dtype is a floating point type, NaN could fill in the > missing values -- a fine idea, but you can't use it for integers, and > zero is a really bad replacement! > > ? C) The user could specify what they want filled in for missing > values. This is a fine idea, though I'm not sure I want to take the time > to impliment it. > > Oh, and this is a bug too, with probably the same solution: > > In [20]: np.fromstring("hjba", sep=',') > Out[20]: array([ 0.]) > > In [26]: np.fromstring("34gytf39", sep=',') > Out[26]: array([ 34.]) > > > One more unresolved question: > > what should: > > np.fromstring("3, 4, 5,", sep=",") > > return? > > it currently returns: > > array([ 3., ?4., ?5.]) > > which seems a bit inconsitent with missing value handling. I also found > a bug: > > In [6]: np.fromstring("3, 4, 5 , ", sep=",") > Out[6]: array([ 3., ?4., ?5., ?0.]) > > so if there is some extra whitespace in there, it does return a missing > value. With my proposal, that wouldn't happen, but you might get an > exception. I think you should, but it'll be easier to implement my > "allow newlines" code if not. > > > so, should I do (A) ? > > > Another question: > > I've got a patch mostly working (except for the above issues) that will > allow fromfile/string to read multiline non-whitespace separated data in > one shot: > > > In [15]: str > Out[15]: '1, 2, 3, 4\n5, 6, 7, 8\n9, 10, 11, 12' > > In [16]: np.fromstring(str, sep=',', allow_newlines=True) > Out[16]: > array([ ?1., ? 2., ? 3., ? 4., ? 5., ? 6., ? 7., ? 8., ? 9., ?10., ?11., > ? ? ? ? 
12.]) > > > I think this is a very helpful enhancement, and, as it is a new kwarg, > backward compatible: > > 1) Might it be accepted for inclusion? > > 2) Is the name for the flag OK: "allow_newlines"? It's pretty explicit, > but also long -- I used it for the flag name in the C code, too. > > 3) What C datatype should I use for a boolean flag? I used a char, but I > don't know what the numpy standard is. > > > -Chris > > I don't know much about this, just a few more test cases comma and newline str = '1, 2, 3, 4,\n5, 6, 7, 8,\n9, 10, 11, 12' extra comma at end of file str = '1, 2, 3, 4,\n5, 6, 7, 8,\n9, 10, 11, 12,' extra newlines at end of file str = '1, 2, 3, 4\n5, 6, 7, 8\n9, 10, 11, 12\n\n\n' It would be nice if these cases would go through without missing values or exception, but I don't often have files that are clean enough for fromfile(). I'm in favor of nan for missing values with floating point numbers. It would make it easy to read correctly formatted csv files, even if the data is not complete. Josef > > > > > > > > > > -- > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R ? ? ? ? ? ?(206) 526-6959 ? voice > 7600 Sand Point Way NE ? (206) 526-6329 ? fax > Seattle, WA ?98115 ? ? ? (206) 526-6317 ? main reception > > Chris.Barker at noaa.gov > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From bsouthey at gmail.com Thu Jan 7 16:11:01 2010 From: bsouthey at gmail.com (Bruce Southey) Date: Thu, 7 Jan 2010 15:11:01 -0600 Subject: [Numpy-discussion] fromfile() for reading text (one more time!) In-Reply-To: <1cd32cbb1001071232l6c3d3525g4ec4747d62d998ed@mail.gmail.com> References: <4B42905A.4080105@noaa.gov> <1262721695.5107.1.camel@idol> <4B463F37.4010108@noaa.gov> <1cd32cbb1001071232l6c3d3525g4ec4747d62d998ed@mail.gmail.com> Message-ID: On Thu, Jan 7, 2010 at 2:32 PM, wrote: > On Thu, Jan 7, 2010 at 3:08 PM, Christopher Barker > wrote: >> Pauli Virtanen wrote: >>> ma, 2010-01-04 kello 17:05 -0800, Christopher Barker kirjoitti: >>> it also does odd things with spaces >>>> embedded in the separator: >>>> >>>> ", $ #" matches all of: ?",$#" ? ", $#" ?",$ #" >> >>> That's a documented feature: >> >> Fair enough. >> >> OK, I've written a patch that allows newlines to be interpreted as >> separators in addition to whatever is specified in sep. >> >> In the process of testing, I found again these issues, which are still >> marked as "needs decision". >> >> http://projects.scipy.org/numpy/ticket/883 >> >> In short: what to do with missing values? >> >> I'd like to address this bug, but I need a decision to do so. >> >> >> My proposal: >> >> Raise an ValueError with missing values. >> >> >> Justification: >> >> No function should EVER return data that is not there. Period. It is >> simply asking for hard to find bugs. Therefore: >> >> fromstring("3, 4,,5", sep=",") >> >> Should never, ever, return: >> >> array([ 3., ?4., ?0., ?5.]) >> >> Which is what it does now. bad. bad. bad. >> >> >> >> >> Alternatives: >> >> ? A) Raising a ValueError is the easiest way to get "proper" behavior. >> Folks can use a more sophisticated file reader if they want missing >> values handled. I'm willing to contribute this patch. >> >> ? B) If the dtype is a floating point type, NaN could fill in the >> missing values -- a fine idea, but you can't use it for integers, and >> zero is a really bad replacement! >> >> ? 
C) The user could specify what they want filled in for missing >> values. This is a fine idea, though I'm not sure I want to take the time >> to impliment it. >> >> Oh, and this is a bug too, with probably the same solution: >> >> In [20]: np.fromstring("hjba", sep=',') >> Out[20]: array([ 0.]) >> >> In [26]: np.fromstring("34gytf39", sep=',') >> Out[26]: array([ 34.]) >> >> >> One more unresolved question: >> >> what should: >> >> np.fromstring("3, 4, 5,", sep=",") >> >> return? >> >> it currently returns: >> >> array([ 3., ?4., ?5.]) >> >> which seems a bit inconsitent with missing value handling. I also found >> a bug: >> >> In [6]: np.fromstring("3, 4, 5 , ", sep=",") >> Out[6]: array([ 3., ?4., ?5., ?0.]) >> >> so if there is some extra whitespace in there, it does return a missing >> value. With my proposal, that wouldn't happen, but you might get an >> exception. I think you should, but it'll be easier to implement my >> "allow newlines" code if not. >> >> >> so, should I do (A) ? >> >> >> Another question: >> >> I've got a patch mostly working (except for the above issues) that will >> allow fromfile/string to read multiline non-whitespace separated data in >> one shot: >> >> >> In [15]: str >> Out[15]: '1, 2, 3, 4\n5, 6, 7, 8\n9, 10, 11, 12' >> >> In [16]: np.fromstring(str, sep=',', allow_newlines=True) >> Out[16]: >> array([ ?1., ? 2., ? 3., ? 4., ? 5., ? 6., ? 7., ? 8., ? 9., ?10., ?11., >> ? ? ? ? 12.]) >> >> >> I think this is a very helpful enhancement, and, as it is a new kwarg, >> backward compatible: >> >> 1) Might it be accepted for inclusion? >> >> 2) Is the name for the flag OK: "allow_newlines"? It's pretty explicit, >> but also long -- I used it for the flag name in the C code, too. >> >> 3) What C datatype should I use for a boolean flag? I used a char, but I >> don't know what the numpy standard is. >> >> >> -Chris >> >> > > I don't know much about this, just a few more test cases > > comma and newline > str = ?'1, 2, 3, 4,\n5, 6, 7, 8,\n9, 10, 11, 12' > > extra comma at end of file > str = ?'1, 2, 3, 4,\n5, 6, 7, 8,\n9, 10, 11, 12,' > > extra newlines at end of file > str = ?'1, 2, 3, 4\n5, 6, 7, 8\n9, 10, 11, 12\n\n\n' > > It would be nice if these cases would go through without missing > values or exception, but I don't often have files that are clean > enough for fromfile(). > > I'm in favor of nan for missing values with floating point numbers. It > would make it easy to read correctly formatted csv files, even if the > data is not complete. > Using the numpy NaN or similar (noting R's approach to missing values which in turn allows it to have the above functionality) is just a very bad idea for missing values because you always have to check that which NaN is a missing value and which was due to some numerical calculation. It is a very bad idea because we have masked arrays that nicely but slowly handle this situation. >From what I can see is that you expect that fromfile() should only split at the supplied delimiters, optionally(?) strip any whitespace and force a specific dtype. I would agree that the failure of any of one these should create an exception by default rather than making the best guess. So 'missing data' would potentially fail with forcing the specified dtype. Thus, you should either create an exception for invalid data (with appropriate location) or use masked arrays. Your output from this string '1, 2, 3, 4\n5, 6, 7, 8\n9, 10, 11, 12' actually assumes multiple delimiters because there is no comma between 4 and 5 and 8 and 9. 
So I think it would be better if fromfile accepted multiple delimiters. In Josef's last case how many 'missing values should there be? Bruce From oliphant at enthought.com Thu Jan 7 16:11:12 2010 From: oliphant at enthought.com (Travis Oliphant) Date: Thu, 7 Jan 2010 15:11:12 -0600 Subject: [Numpy-discussion] fromfile() for reading text (one more time!) In-Reply-To: <1cd32cbb1001071232l6c3d3525g4ec4747d62d998ed@mail.gmail.com> References: <4B42905A.4080105@noaa.gov> <1262721695.5107.1.camel@idol> <4B463F37.4010108@noaa.gov> <1cd32cbb1001071232l6c3d3525g4ec4747d62d998ed@mail.gmail.com> Message-ID: <1822EAEB-7243-4C83-8FB0-B55DE40503DF@enthought.com> On Jan 7, 2010, at 2:32 PM, josef.pktd at gmail.com wrote: > On Thu, Jan 7, 2010 at 3:08 PM, Christopher Barker > wrote: >> Pauli Virtanen wrote: >>> ma, 2010-01-04 kello 17:05 -0800, Christopher Barker kirjoitti: >>> it also does odd things with spaces >>>> embedded in the separator: >>>> >>>> ", $ #" matches all of: ",$#" ", $#" ",$ #" >> >>> That's a documented feature: >> >> Fair enough. >> >> OK, I've written a patch that allows newlines to be interpreted as >> separators in addition to whatever is specified in sep. >> >> In the process of testing, I found again these issues, which are >> still >> marked as "needs decision". >> >> http://projects.scipy.org/numpy/ticket/883 >> >> In short: what to do with missing values? >> >> I'd like to address this bug, but I need a decision to do so. >> >> >> My proposal: >> >> Raise an ValueError with missing values. >> >> >> Justification: >> >> No function should EVER return data that is not there. Period. It is >> simply asking for hard to find bugs. Therefore: >> >> fromstring("3, 4,,5", sep=",") >> >> Should never, ever, return: >> >> array([ 3., 4., 0., 5.]) >> >> Which is what it does now. bad. bad. bad. >> >> >> >> >> Alternatives: >> >> A) Raising a ValueError is the easiest way to get "proper" >> behavior. >> Folks can use a more sophisticated file reader if they want missing >> values handled. I'm willing to contribute this patch. >> >> B) If the dtype is a floating point type, NaN could fill in the >> missing values -- a fine idea, but you can't use it for integers, and >> zero is a really bad replacement! >> >> C) The user could specify what they want filled in for missing >> values. This is a fine idea, though I'm not sure I want to take the >> time >> to impliment it. >> >> Oh, and this is a bug too, with probably the same solution: >> >> In [20]: np.fromstring("hjba", sep=',') >> Out[20]: array([ 0.]) >> >> In [26]: np.fromstring("34gytf39", sep=',') >> Out[26]: array([ 34.]) >> >> >> One more unresolved question: >> >> what should: >> >> np.fromstring("3, 4, 5,", sep=",") >> >> return? >> >> it currently returns: >> >> array([ 3., 4., 5.]) >> >> which seems a bit inconsitent with missing value handling. I also >> found >> a bug: >> >> In [6]: np.fromstring("3, 4, 5 , ", sep=",") >> Out[6]: array([ 3., 4., 5., 0.]) >> >> so if there is some extra whitespace in there, it does return a >> missing >> value. With my proposal, that wouldn't happen, but you might get an >> exception. I think you should, but it'll be easier to implement my >> "allow newlines" code if not. >> >> >> so, should I do (A) ? 
>> >> >> Another question: >> >> I've got a patch mostly working (except for the above issues) that >> will >> allow fromfile/string to read multiline non-whitespace separated >> data in >> one shot: >> >> >> In [15]: str >> Out[15]: '1, 2, 3, 4\n5, 6, 7, 8\n9, 10, 11, 12' >> >> In [16]: np.fromstring(str, sep=',', allow_newlines=True) >> Out[16]: >> array([ 1., 2., 3., 4., 5., 6., 7., 8., 9., 10., >> 11., >> 12.]) >> >> >> I think this is a very helpful enhancement, and, as it is a new >> kwarg, >> backward compatible: >> >> 1) Might it be accepted for inclusion? >> >> 2) Is the name for the flag OK: "allow_newlines"? It's pretty >> explicit, >> but also long -- I used it for the flag name in the C code, too. >> >> 3) What C datatype should I use for a boolean flag? I used a char, >> but I >> don't know what the numpy standard is. >> >> >> -Chris >> >> > > I don't know much about this, just a few more test cases > > comma and newline > str = '1, 2, 3, 4,\n5, 6, 7, 8,\n9, 10, 11, 12' > > extra comma at end of file > str = '1, 2, 3, 4,\n5, 6, 7, 8,\n9, 10, 11, 12,' > > extra newlines at end of file > str = '1, 2, 3, 4\n5, 6, 7, 8\n9, 10, 11, 12\n\n\n' > > It would be nice if these cases would go through without missing > values or exception, but I don't often have files that are clean > enough for fromfile(). +1 (ignoring new-lines transparently is a nice feature). You can also use sscanf with weave to read most files. > > I'm in favor of nan for missing values with floating point numbers. It > would make it easy to read correctly formatted csv files, even if the > data is not complete. +1 (much preferrable to insert NaN or other user value than raise ValueError in my opinion) -Travis From Chris.Barker at noaa.gov Thu Jan 7 16:45:41 2010 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Thu, 07 Jan 2010 13:45:41 -0800 Subject: [Numpy-discussion] fromfile() for reading text (one more time!) In-Reply-To: References: <4B42905A.4080105@noaa.gov> <1262721695.5107.1.camel@idol> <4B463F37.4010108@noaa.gov> <1cd32cbb1001071232l6c3d3525g4ec4747d62d998ed@mail.gmail.com> Message-ID: <4B465605.3010406@noaa.gov> Bruce Southey wrote: >> wrote: > Using the numpy NaN or similar (noting R's approach to missing values > which in turn allows it to have the above functionality) is just a > very bad idea for missing values because you always have to check that > which NaN is a missing value and which was due to some numerical > calculation. well, this is specific to reading files, so you know where it came from. And the principle of fromfile() is that it is fast and simple, if you want masked arrays, use slower, but more full-featured methods. However, in this case: In [9]: np.fromstring("3, 4, NaN, 5", sep=",") Out[9]: array([ 3., 4., NaN, 5.]) An actual NaN is read from the file, rather than a missing value. Perhaps the user does want the distinction, so maybe it should really only fil it in if the users asks for it, but specifying "missing_value=np.nan" or something. >>From what I can see is that you expect that fromfile() should only > split at the supplied delimiters, optionally(?) strip any whitespace whitespace stripping is not optional. > Your output from this string '1, 2, 3, 4\n5, 6, 7, 8\n9, 10, 11, 12' > actually assumes multiple delimiters because there is no comma between > 4 and 5 and 8 and 9. Yes, that's the point. 
I thought about allowing arbitrary multiple delimiters, but I think '\n' is a special case - for instance, a comma at the end of some numbers might mean missing data, but a '\n' would not. And I couldn't really think of a useful use-case for arbitrary multiple delimiters. > In Josef's last case how many 'missing values' should there be? >> extra newlines at end of file >> str = '1, 2, 3, 4\n5, 6, 7, 8\n9, 10, 11, 12\n\n\n' none -- exactly why I think \n is a special case. What about: >> extra newlines in the middle of the file >> str = '1, 2, 3, 4\n\n5, 6, 7, 8\n9, 10, 11, 12\n' I think they should be ignored, but I hope I'm not making something that is too specific to my personal needs. Travis Oliphant wrote: > +1 (ignoring new-lines transparently is a nice feature). You can also > use sscanf with weave to read most files. right -- but that requires weave. In fact, MATLAB has a fscanf function that allows you to pass in a C format string and it vectorizes it to use the same one over and over again until it's done. It's actually quite powerful and flexible. I once started with that in mind, but didn't have the C chops to do it. I ended up with a tool that only did doubles (come to think of it, MATLAB only does doubles, anyway...) I may some day write a whole new C (or, more likely, Cython) function that does something like that, but for now, I'm just trying to get fromfile to be useful for me. > +1 (much preferable to insert NaN or other user value than raise > ValueError in my opinion) But raise an error for integer types? I guess this is still up in the air -- no consensus yet. Thanks, -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From james.mazer at yale.edu Thu Jan 7 16:54:44 2010 From: james.mazer at yale.edu (James Mazer) Date: Thu, 07 Jan 2010 16:54:44 -0500 Subject: [Numpy-discussion] cPickle/unPickle across archs Message-ID: <4B465824.70009@yale.edu> Hi, I've got some Numeric arrays that were created without an explicit byte size in the initial declaration and pickled. Something like this: >>> cPickle.dump(array(ones((3,3,)), 'f'), open('foo.pic', 'w')) as opposed to: >>> cPickle.dump(array(ones((3,3,)), Float32), open('foo.pic', 'w')) This works as long as the word size doesn't change between the reading and writing machines. The data were generated under a 32bit linux kernel and now I'm trying to read them under a 64bit kernel, so the word size has changed and Numeric assumes that the 'f' type is the NATIVE float (and the 'l' type is the NATIVE long) and dies miserably when the native types don't match the actual types (which defeats the whole point of pickling, to some extent -- I thought that cPickle.dump/load were "ensured" to be invertible...) I've got terabytes of data that need to be read by both 32bit and 64bit machines (and it's not really feasible to scan all the files into new structures with explicit types on a 32bit machine). Anybody have hints for addressing this problem? I found similar questions, but no answers, so I'm not completely alone with this problem.
Thanks, /jamie -- James Mazer Department of Neurobiology Yale School of Medicine phone: 203-737-5853 fax: 203-785-5263 From robert.kern at gmail.com Thu Jan 7 17:30:24 2010 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 7 Jan 2010 16:30:24 -0600 Subject: [Numpy-discussion] cPickle/unPickle across archs In-Reply-To: <4B465824.70009@yale.edu> References: <4B465824.70009@yale.edu> Message-ID: <3d375d731001071430m8f7dbdg56008e5f53606079@mail.gmail.com> On Thu, Jan 7, 2010 at 15:54, James Mazer wrote: > Hi, > > I've got a some Numeric arrays that were created without > an explicit byte size in the initial declaration and pickled. > Something like this: > > ? >>> cPickle.write(array(ones((3,3,)), 'f'), open('foo.pic', 'w')) > > as opposed to: > > ? >>> cPickle.write(array(ones((3,3,)), Float32), open('foo.pic', 'w')) > > This works as long as the word size doesn't change between the > reading and writing machines. > > The data were generated under a 32bit linux kernel and now I'm trying > to read them under a 64bit kernel, so the word size has changed and > Numeric assumes that the 'f' type is the NATIVE float Please note that 'f' is always a 32-bit float on any machine. Only integers may change size. > and 'l' type is > the NATIVE long) and dies miserable when the native types don't match > the actual types (which defeats the whole point of pickling, to some > extent -- I thought that cPickle.save/load were "ensured" to be > invertable...) I don't think cPickle ensures much at all. It's actually rather fragile for persisting data over long times and between different environments. It works better as a wire format for communication between similar codebases when thoroughly tested on both ends. Using a standard scientific file format for storing your important data has always been de rigeur. That said, it is a deficiency in Numeric that it records the native typecode instead of a platform-neutral, explicitly sized typecode. Unfortunately, Numeric has been deprecated for many years now, and is not maintained. Numeric's replacement, numpy, does not have this problem. > I've got terrabytes of data that need to be read by both 32bit and > 64bit machines (and it's not really feasible to scan all the files > into new structures with explict types on a 32bit machine). Anybody > have hints for addressing this problem? ?I found similar questions, > but no answers, so I'm not completely alone iwth this problem. What you can do is monkeypatch the function Numeric.array_constructor() to do "the right thing" for your case when it sees a platform-specific integer typecode. Something like the following (untested; you may need to generalize it to handle the unsigned integer typecodes, too, if you have that kind of data): import Numeric i_size = Numeric.empty(0, 'i').itemsize() def patched_array_constructor(shape, typecode, thestr, Endian=Numeric.LittleEndian): if typecode == "l": # Ensure that the length of the data matches our expectations. size = Numeric.product(shape) itemsize = len(thestr) // size if itemsize == i_size: typecode = 'i' if typecode == "O": x = Numeric.array(thestr,"O") else: x = Numeric.fromstring(thestr, typecode) x.shape = shape if LittleEndian != Endian: return x.byteswapped() else: return x Numeric.array_constructor = patched_array_constructor After you have done that, cPickle.load() will use that patched function to reconstruct the arrays and make sure that the appropriate typecode is used to interpret the data. 
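For example (an untested sketch, to go with the equally untested patch above; 'foo.pic' is just the example filename from your mail), loading one of the old pickles would then just be:

import cPickle
# apply the array_constructor patch above *before* unpickling
f = open('foo.pic', 'rb')
a = cPickle.load(f)    # old 32-bit 'l' data comes back with an 'i' typecode
f.close()
print a.typecode(), a.shape

(One nit: since the patched function lives in your own script rather than inside the Numeric module, you will want to spell the byte-order constant as Numeric.LittleEndian instead of the bare LittleEndian name.)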
-- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From weisen123 at gmail.com Thu Jan 7 17:58:30 2010 From: weisen123 at gmail.com (neil weisenfeld) Date: Thu, 7 Jan 2010 17:58:30 -0500 Subject: [Numpy-discussion] [Pythonmac-SIG] 1.4.0 installer fails on OSX 10.6.2 In-Reply-To: <4B44BBE3.9090601@noaa.gov> References: <9d5ec4221001050835w5dd4cf97hffd4f2480bf19c3e@mail.gmail.com> <4B43B4CD.3030701@hawaii.edu> <523E341E-A076-465C-938C-EC902F5F0D16@gmail.com> <4B43C4D2.5030701@noaa.gov> <4B43D330.7090608@noaa.gov> <4B43D6E6.80305@noaa.gov> <5b8d13221001051709g58103d1ewf4915d8be38a3277@mail.gmail.com> <4B44B8C0.5010203@noaa.gov> <4B44BBE3.9090601@noaa.gov> Message-ID: <9d5ec4221001071458u4255a0d6ne966be82272ea4fd@mail.gmail.com> On Wed, Jan 6, 2010 at 11:35 AM, Christopher Barker wrote: > > It's worse to have a binary you expect to work fail for you than to not > have one available. IN the past, I think folks' have used the default > name provided by bdist_mpkg, and those are not always clear. Something like: > > > numpy1.4-osx10.4-python.org2.6-32bit.dmg > > or something -- even better, with a a bit more text -- would help a lot. > I agree here. Better labeling of the .dmg would indeed help, I think. And thanks to everyone for all of the responses. I joined the mailing list, posted my question, and then went back to dissertation writing for a few days. When I looked up, there were 18 answers. I'll try getting python from python.org and/or building it all from scratch. Thanks again, Neil From josef.pktd at gmail.com Thu Jan 7 18:15:46 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 7 Jan 2010 18:15:46 -0500 Subject: [Numpy-discussion] fromfile() for reading text (one more time!) In-Reply-To: <4B465605.3010406@noaa.gov> References: <4B42905A.4080105@noaa.gov> <1262721695.5107.1.camel@idol> <4B463F37.4010108@noaa.gov> <1cd32cbb1001071232l6c3d3525g4ec4747d62d998ed@mail.gmail.com> <4B465605.3010406@noaa.gov> Message-ID: <1cd32cbb1001071515p498c8746u5dce34453346c97f@mail.gmail.com> On Thu, Jan 7, 2010 at 4:45 PM, Christopher Barker wrote: > Bruce Southey wrote: >>> wrote: > >> Using the numpy NaN or similar (noting R's approach to missing values >> which in turn allows it to have the above functionality) is just a >> very bad idea for missing values because you always have to check that >> which NaN is a missing value and which was due to some numerical >> calculation. > > well, this is specific to reading files, so you know where it came from. > And the principle of fromfile() is that it is fast and simple, if you > want masked arrays, use slower, but more full-featured methods. > > However, in this case: > > In [9]: np.fromstring("3, 4, NaN, 5", sep=",") > Out[9]: array([ ?3., ? 4., ?NaN, ? 5.]) > > > An actual NaN is read from the file, rather than a missing value. > Perhaps the user does want the distinction, so maybe it should really > only fil it in if the users asks for it, but specifying > "missing_value=np.nan" or something. > >>>From what I can see is that you expect that fromfile() should only >> split at the supplied delimiters, optionally(?) strip any whitespace > > whitespace stripping is not optional. > >> Your output from this string '1, 2, 3, 4\n5, 6, 7, 8\n9, 10, 11, 12' >> actually assumes multiple delimiters because there is no comma between >> 4 and 5 and 8 and 9. > > Yes, that's the point. 
I thought about allowing arbitrary multiple > delimiters, but I think '/n' is a special case - for instance, a comma > at the end of some numbers might mean missing data, but a '\n' would not. > > And I couldn't really think of a useful use-case for arbitrary multiple > delimiters. > >> In Josef's last case how many 'missing values should there be? > > ?>> extra newlines at end of file > ?>> str = ?'1, 2, 3, 4\n5, 6, 7, 8\n9, 10, 11, 12\n\n\n' > > none -- exactly why I think \n is a special case. > > What about: > ?>> extra newlines in the middle of the file > ?>> str = ?'1, 2, 3, 4\n\n5, 6, 7, 8\n9, 10, 11, 12\n' > > I think they should be ignored, but I hope I'm not making something that > is too specific to my personal needs. > > Travis Oliphant wrote: >> +1 (ignoring new-lines transparently is a nice feature). ?You can also >> use sscanf with weave to read most files. > > right -- but that requires weave. In fact, MATLAB has a fscanf function > that allows you to pass in a C format string and it vectorizes it to use > the same one over an over again until it's done. It's actually quite > powerful and flexible. I once started with that in mind, but didn't have > the C chops to do it. I ended up with a tool that only did doubles (come > to think of it, MATLAB only does doubles, anyway...) > > I may some day write a whole new C (or, more likely, Cython) function > that does something like that, but for now, I'm jsut trying to get > fromfile to be useful for me. > > >> +1 ? (much preferrable to insert NaN or other user value than raise >> ValueError in my opinion) > > But raise an error for integer types? > > I guess this is still up the air -- no consensus yet. raise an exception, I hate the silent cast of nan to integer zero, too much debugging and useless if there are real zeros. (or use some -999 kind of thing if user defined nan codes are allowed, but I just work with float if I expect nans/missing values.) Josef > > Thanks, > > -Chris > > > > > > > > > > -- > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R ? ? ? ? ? ?(206) 526-6959 ? voice > 7600 Sand Point Way NE ? (206) 526-6329 ? fax > Seattle, WA ?98115 ? ? ? (206) 526-6317 ? main reception > > Chris.Barker at noaa.gov > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From Chris.Barker at noaa.gov Thu Jan 7 18:29:16 2010 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Thu, 07 Jan 2010 15:29:16 -0800 Subject: [Numpy-discussion] fromfile() for reading text (one more time!) In-Reply-To: <1cd32cbb1001071515p498c8746u5dce34453346c97f@mail.gmail.com> References: <4B42905A.4080105@noaa.gov> <1262721695.5107.1.camel@idol> <4B463F37.4010108@noaa.gov> <1cd32cbb1001071232l6c3d3525g4ec4747d62d998ed@mail.gmail.com> <4B465605.3010406@noaa.gov> <1cd32cbb1001071515p498c8746u5dce34453346c97f@mail.gmail.com> Message-ID: <4B466E4C.10504@noaa.gov> josef.pktd at gmail.com wrote: >>> +1 (much preferrable to insert NaN or other user value than raise >>> ValueError in my opinion) >> But raise an error for integer types? >> >> I guess this is still up the air -- no consensus yet. > > raise an exception, I hate the silent cast of nan to integer zero, me too -- I'm sorry, I wasn't clear -- I'm not going to write any code that returns a zero for a missing value. 
These are the options I'd consider: 1) Have the user specify what to use for missing values, otherwise, raise an exception 2) Insert a NaN for floating points types, and raise an exception for integer types. what's not clear is whether (2) is a good idea. As for (1), I just don't know if I'm going to get around to writing the code, and I maybe more kwargs is a bad idea -- though maybe not. Enough talk: I've got ugly C code to wade through... -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From x.yang at physics.usyd.edu.au Thu Jan 7 18:58:03 2010 From: x.yang at physics.usyd.edu.au (Xue (Sue) Yang) Date: Fri, 8 Jan 2010 10:58:03 +1100 Subject: [Numpy-discussion] Numpy & MKL Message-ID: <003301ca8ff5$40534940$c0f9dbc0$@yang@physics.usyd.edu.au> I understand that intel mkl uses openMP parallel model. Therefore I set environment variable >>os.environ['OMP_NUM_THREADS'] = '4' With same test example, however, still one cpu is used. Do I need any specifications when I run numpy with intel MKL (MKL9.1)? numpy developers would be able to answer this question? I changed the name of numpy-discussion thread to "Numpy & MKL" attempting to draw attentions from wide range of readers. Thanks! Sue On Thu, Jan 7, 2010 at 11:20 AM, Xue (Sue) Yang wrote: > This time, only one cpu was used. Does it mean that our installed intel mkl > 9.1 is not threaded? You would have to consult the MKL documentation - I believe you can control how many threads are used from an environment variable. Also, the exact build commands depend on the version of the MKL, as its libraries often change between versions. David > Thank you for the reply which is useful. > > I also tried to Install numpy with intel mkl 9.1 > I still used gfortran for numpy installation as intel mkl 9.1 supports > gnu compiler. > > I only uncomment these lines for site.cfg in site.cfg.example > > [mkl] > library_dirs = /usr/physics/intel/mkl/lib/32 > include_dirs = /usr/physics/intel/mkl/include > lapack_libs = mkl_lapack > > then I tested the numpy with > > > python > >>import numpy > >>a = numpy.random.randn(6000, 6000) > >>numpy.dot(a, a) > > This time, only one cpu was used. Does it mean that our installed > intel mkl 9.1 is not threaded? > I don't think so. We have used it for openMP parallelization for quite > a while. > > Thanks! > > Sue From dwf at cs.toronto.edu Thu Jan 7 19:48:57 2010 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Thu, 7 Jan 2010 19:48:57 -0500 Subject: [Numpy-discussion] Numpy & MKL In-Reply-To: <003301ca8ff5$40534940$c0f9dbc0$@yang@physics.usyd.edu.au> References: <003301ca8ff5$40534940$c0f9dbc0$@yang@physics.usyd.edu.au> Message-ID: On 7-Jan-10, at 6:58 PM, Xue (Sue) Yang wrote: > Do I need any specifications when I run numpy with intel MKL (MKL9.1)? > numpy developers would be able to answer this question? Are you sure you've compiled against MKL properly? What is printed by numpy.show_config()? 
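Also, just to rule out the obvious first (a rough sketch of a sanity check, not MKL-specific advice): set the variable in the shell *before* starting Python, e.g. "export OMP_NUM_THREADS=4", so the OpenMP runtime sees it at startup, and then check from inside the interpreter:

import os
import numpy
print os.environ.get('OMP_NUM_THREADS')   # should print '4'
numpy.show_config()                       # make sure the mkl libraries show up
a = numpy.random.randn(6000, 6000)
b = numpy.dot(a, a)                       # watch the CPUs while this runs

If I remember right, MKL also looks at MKL_NUM_THREADS, which you could set the same way.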
David From x.yang at physics.usyd.edu.au Thu Jan 7 20:13:22 2010 From: x.yang at physics.usyd.edu.au (Xue (Sue) Yang) Date: Fri, 8 Jan 2010 12:13:22 +1100 Subject: [Numpy-discussion] Numpy & MKL Message-ID: <003a01ca8fff$c62582e0$527088a0$@yang@physics.usyd.edu.au> This is what I had (when I built numpy, I chose gnu compilers instead of intel compilers), >>> numpy.show_config() lapack_opt_info: libraries = ['mkl_lapack', 'mkl', 'vml', 'guide', 'pthread'] library_dirs = ['/usr/physics/intel/mkl/lib/32'] define_macros = [('SCIPY_MKL_H', None)] include_dirs = ['/usr/physics/intel/mkl/include'] blas_opt_info: libraries = ['mkl', 'vml', 'guide', 'pthread'] library_dirs = ['/usr/physics/intel/mkl/lib/32'] define_macros = [('SCIPY_MKL_H', None)] include_dirs = ['/usr/physics/intel/mkl/include'] lapack_mkl_info: libraries = ['mkl_lapack', 'mkl', 'vml', 'guide', 'pthread'] library_dirs = ['/usr/physics/intel/mkl/lib/32'] define_macros = [('SCIPY_MKL_H', None)] include_dirs = ['/usr/physics/intel/mkl/include'] blas_mkl_info: libraries = ['mkl', 'vml', 'guide', 'pthread'] library_dirs = ['/usr/physics/intel/mkl/lib/32'] define_macros = [('SCIPY_MKL_H', None)] include_dirs = ['/usr/physics/intel/mkl/include'] mkl_info: libraries = ['mkl', 'vml', 'guide', 'pthread'] library_dirs = ['/usr/physics/intel/mkl/lib/32'] define_macros = [('SCIPY_MKL_H', None)] include_dirs = ['/usr/physics/intel/mkl/include'] Thanks! Sue >> Do I need any specifications when I run numpy with intel MKL (MKL9.1)? >> numpy developers would be able to answer this question? >Are you sure you've compiled against MKL properly? What is printed by >numpy.show_config()? >David From Chris.Barker at noaa.gov Thu Jan 7 20:21:34 2010 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Thu, 07 Jan 2010 17:21:34 -0800 Subject: [Numpy-discussion] fromfile() -- help! In-Reply-To: <4B466E4C.10504@noaa.gov> References: <4B42905A.4080105@noaa.gov> <1262721695.5107.1.camel@idol> <4B463F37.4010108@noaa.gov> <1cd32cbb1001071232l6c3d3525g4ec4747d62d998ed@mail.gmail.com> <4B465605.3010406@noaa.gov> <1cd32cbb1001071515p498c8746u5dce34453346c97f@mail.gmail.com> <4B466E4C.10504@noaa.gov> Message-ID: <4B46889E.3020008@noaa.gov> OK, I'm trying to dig into the code and figure out how to get it to stop putting in zeros for missing data with fromfile()/fromstring() text reading. It looks like the culprit is this, in arraytypes.c.src: @fname at _scan(FILE *fp, @type@ *ip, void *NPY_UNUSED(ignore), PyArray_Descr *NPY_UNUSED(ignored)) { double result; int ret; ret = NumPyOS_ascii_ftolf(fp, &result); *ip = (@type@) result; return ret; } If I'm reading this right, this gets called for the datatype of interest, and it is passed in a pointer to the file that is being read. if I have NumPyOS_ascii_ftolf right, it should return 0 if it doesn't succesfully read a number. However, this looks like it sets the data in *ip, even if the return value is zero. It does pass on that return value, but, from ctors.c: fromfile_next_element(FILE **fp, void *dptr, PyArray_Descr *dtype, void *NPY_UNUSED(stream_data)) { /* the NULL argument is for backwards-compatibility */ return dtype->f->scanfunc(*fp, dptr, NULL, dtype); } just moves it on through. This is called from here: if (next(&stream, dptr, dtype, stream_data) < 0) { break; } which is checking for < 0 , so if a zero is returned, it will just go in its merry way... So, have I got that right? Should this get fixed at that last point? 
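For reference, the symptom I'm chasing is the one from my earlier mail -- a missing field silently turns into a zero:

In [1]: import numpy as np

In [2]: np.fromstring("3, 4,,5", sep=",")
Out[2]: array([ 3.,  4.,  0.,  5.])

That 0. was never in the data, and as far as I can tell it's this scanfunc/next chain that lets it through.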
One more point, this is a bit different for fromfile and fromstring, so I'm getting really confused! -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From david at silveregg.co.jp Thu Jan 7 21:07:03 2010 From: david at silveregg.co.jp (David Cournapeau) Date: Fri, 08 Jan 2010 11:07:03 +0900 Subject: [Numpy-discussion] [Pythonmac-SIG] 1.4.0 installer fails on OSX 10.6.2 In-Reply-To: <4B463318.5060902@noaa.gov> References: <9d5ec4221001050835w5dd4cf97hffd4f2480bf19c3e@mail.gmail.com> <4B43B4CD.3030701@hawaii.edu> <523E341E-A076-465C-938C-EC902F5F0D16@gmail.com> <4B43C4D2.5030701@noaa.gov> <4B43D330.7090608@noaa.gov> <4B43D6E6.80305@noaa.gov> <5b8d13221001051709g58103d1ewf4915d8be38a3277@mail.gmail.com> <4B44B8C0.5010203@noaa.gov> <4B44BBE3.9090601@noaa.gov> <5b8d13221001061712m3498390cse4b2ae6b37fc19e2@mail.gmail.com> <4B463318.5060902@noaa.gov> Message-ID: <4B469347.6000000@silveregg.co.jp> Christopher Barker wrote: > David Cournapeau wrote: >> On Thu, Jan 7, 2010 at 1:35 AM, Christopher Barker >>> In the past, I think folks' have used the default >>> name provided by bdist_mpkg, and those are not always clear. Something like: >>> >>> >>> numpy1.4-osx10.4-python.org2.6-32bit.dmg >> The 32 bits is redundant - we support all archs supported by the >> official python binary, so python.org is enough. > > True, though I was anticipating that there may be 32 and 64 bit builds > some day. I suspect it will be exactly as today, i.e. a universal build with 64 bits. I have not followed closely the discussion on python-dev on that topic, but I believe python 2.7 sill contain 64 bits as an arch. > What OS/architecture were those built with? Snow Leopard. > When I first installed the binary, I got a whole bunch of errors because > "matrix' wasn't found. I recalled this issue from testing, and cleared > out the install, then re-installed, and all was fine. I wonder if it's > possible to have a mpkg remove anything? pkg does not have a uninstaller - I don't think Apple provides one, that's a known limitation of Mac OS X installers (although I believe there are 3rd party ones) > > > I think both of those are known issues, and not a big deal. Maybe the spacing function is wrong on PPC. The underlying is highly architecture dependent. 
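If someone with PPC hardware wants to poke at it, a minimal check would be something like this (just a guess at the relevant comparison, not a real test case):

>>> import numpy as np
>>> np.spacing(1.0)            # distance from 1.0 to the next representable double
>>> np.finfo(np.float64).eps   # should match the line above on a sane build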
David From dwf at cs.toronto.edu Thu Jan 7 21:13:45 2010 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Thu, 7 Jan 2010 21:13:45 -0500 Subject: [Numpy-discussion] Numpy & MKL In-Reply-To: <003a01ca8fff$c62582e0$527088a0$@yang@physics.usyd.edu.au> References: <003a01ca8fff$c62582e0$527088a0$@yang@physics.usyd.edu.au> Message-ID: <717E3E58-3B41-46B3-B432-21B937C8E9CA@cs.toronto.edu> On 7-Jan-10, at 8:13 PM, Xue (Sue) Yang wrote: > This is what I had (when I built numpy, I chose gnu compilers > instead of > intel compilers), > >>>> numpy.show_config() > lapack_opt_info: > libraries = ['mkl_lapack', 'mkl', 'vml', 'guide', 'pthread'] > library_dirs = ['/usr/physics/intel/mkl/lib/32'] > define_macros = [('SCIPY_MKL_H', None)] > include_dirs = ['/usr/physics/intel/mkl/include'] > > blas_opt_info: > libraries = ['mkl', 'vml', 'guide', 'pthread'] > library_dirs = ['/usr/physics/intel/mkl/lib/32'] > define_macros = [('SCIPY_MKL_H', None)] > include_dirs = ['/usr/physics/intel/mkl/include'] > > lapack_mkl_info: > libraries = ['mkl_lapack', 'mkl', 'vml', 'guide', 'pthread'] > library_dirs = ['/usr/physics/intel/mkl/lib/32'] > define_macros = [('SCIPY_MKL_H', None)] > include_dirs = ['/usr/physics/intel/mkl/include'] > > blas_mkl_info: > libraries = ['mkl', 'vml', 'guide', 'pthread'] > library_dirs = ['/usr/physics/intel/mkl/lib/32'] > define_macros = [('SCIPY_MKL_H', None)] > include_dirs = ['/usr/physics/intel/mkl/include'] > > mkl_info: > libraries = ['mkl', 'vml', 'guide', 'pthread'] > library_dirs = ['/usr/physics/intel/mkl/lib/32'] > define_macros = [('SCIPY_MKL_H', None)] > include_dirs = ['/usr/physics/intel/mkl/include'] That looks right to me... And you're sure you've set the environment variable before Python is run and NumPy is loaded? Try running: import os; print os.environ['OMP_NUM_THREADS'] and verify it's the right number. David From dwf at cs.toronto.edu Thu Jan 7 21:24:40 2010 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Thu, 7 Jan 2010 21:24:40 -0500 Subject: [Numpy-discussion] 1.4.0 installer fails on OSX 10.6.2 In-Reply-To: <4B43D6E6.80305@noaa.gov> References: <9d5ec4221001050835w5dd4cf97hffd4f2480bf19c3e@mail.gmail.com> <4B43A4FF.3010604@hawaii.edu> <4B43B4CD.3030701@hawaii.edu> <523E341E-A076-465C-938C-EC902F5F0D16@gmail.com> <4B43C4D2.5030701@noaa.gov> <4B43D330.7090608@noaa.gov> <4B43D6E6.80305@noaa.gov> Message-ID: <876E9DCB-4641-42F4-AD8C-385192F84582@cs.toronto.edu> On 5-Jan-10, at 7:18 PM, Christopher Barker wrote: > If distutils/setuptools could identify the python version properly, > then > binary eggs and easy-install could be a solution -- but that's a > mess, > too. Long live toydist! :) David From dwf at cs.toronto.edu Thu Jan 7 21:29:22 2010 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Thu, 7 Jan 2010 21:29:22 -0500 Subject: [Numpy-discussion] 1.4.0 installer fails on OSX 10.6.2 In-Reply-To: <4B43D330.7090608@noaa.gov> References: <9d5ec4221001050835w5dd4cf97hffd4f2480bf19c3e@mail.gmail.com> <4B43A4FF.3010604@hawaii.edu> <4B43B4CD.3030701@hawaii.edu> <523E341E-A076-465C-938C-EC902F5F0D16@gmail.com> <4B43C4D2.5030701@noaa.gov> <4B43D330.7090608@noaa.gov> Message-ID: On 5-Jan-10, at 7:02 PM, Christopher Barker wrote: >> Pretty sure the python.org binaries are 32-bit only. I still think >> it's sensible to prefer the > > waiting the rest of this sentence.. 
;-) I had meant to say 'sensible to prefer the Python.org version' though in reality I'm a little miffed that Python.org isn't providing Ron's 4- way binaries, since he went to the trouble of adding support for building them. Grumble grumble. >> I'm not really a fan of packages polluting /usr/local, I'd rather the >> tree appear /opt/packagename > > well, /opt has kind of been co-opted by macports. I'd forgotten about that. >> or /usr/local/packagename instead, for >> ease of removal > > wxPython gets put entirely into: > > /usr/local/lib/wxPython-unicode-2.10.8 > > which isn't bad. Ah, yeah, that isn't bad either. >> but the general approach of "stash somewhere and put >> a .pth in both site-packages" seems fine to me. > > OK -- what about simply punting and doing two builds: one 32 bit, and > one 64 bit. I wonder if we need 64bit PPC at all? I know I'm running > 64 > bit hardware, but never ran a 64 bit OS on it -- I wonder if anyone > is? I've built for ppc64 before, and in fact discovered a long-standing bug in the way ppc64 was detected. The fact that nobody found it before me is probably evidence that it is nearly never used. It could be useful in a minority of situations but I don't think it's going to be worth it for most people. David From robert.kern at gmail.com Thu Jan 7 21:34:09 2010 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 7 Jan 2010 20:34:09 -0600 Subject: [Numpy-discussion] 1.4.0 installer fails on OSX 10.6.2 In-Reply-To: References: <9d5ec4221001050835w5dd4cf97hffd4f2480bf19c3e@mail.gmail.com> <4B43A4FF.3010604@hawaii.edu> <4B43B4CD.3030701@hawaii.edu> <523E341E-A076-465C-938C-EC902F5F0D16@gmail.com> <4B43C4D2.5030701@noaa.gov> <4B43D330.7090608@noaa.gov> Message-ID: <3d375d731001071834p69de1764m5f3b8b0cf9eeee53@mail.gmail.com> On 2010-01-07, David Warde-Farley wrote: > On 5-Jan-10, at 7:02 PM, Christopher Barker wrote: >>> I'm not really a fan of packages polluting /usr/local, I'd rather the >>> tree appear /opt/packagename >> >> well, /opt has kind of been co-opted by macports. > > I'd forgotten about that. It's not really true, though. MacPorts took /opt/local/, but /opt// probably hasn't been. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From cournape at gmail.com Thu Jan 7 21:47:47 2010 From: cournape at gmail.com (David Cournapeau) Date: Fri, 8 Jan 2010 11:47:47 +0900 Subject: [Numpy-discussion] 1.4.0 installer fails on OSX 10.6.2 In-Reply-To: <876E9DCB-4641-42F4-AD8C-385192F84582@cs.toronto.edu> References: <9d5ec4221001050835w5dd4cf97hffd4f2480bf19c3e@mail.gmail.com> <4B43A4FF.3010604@hawaii.edu> <4B43B4CD.3030701@hawaii.edu> <523E341E-A076-465C-938C-EC902F5F0D16@gmail.com> <4B43C4D2.5030701@noaa.gov> <4B43D330.7090608@noaa.gov> <4B43D6E6.80305@noaa.gov> <876E9DCB-4641-42F4-AD8C-385192F84582@cs.toronto.edu> Message-ID: <5b8d13221001071847u48cad23bx2c64a2cbf4df4584@mail.gmail.com> On Fri, Jan 8, 2010 at 11:24 AM, David Warde-Farley wrote: > On 5-Jan-10, at 7:18 PM, Christopher Barker wrote: > >> If distutils/setuptools could identify the python version properly, >> then >> ?binary eggs and easy-install could be a solution -- but that's a >> mess, >> too. > > > Long live toydist! :) Toydist will not solve anything here. Versioning info is useless here if it does not translate to compatible ABI. 
What is required is to be able to identify a precise python ABI: python makes that hard, mac os x harder, and universal builds ever harder. Things like PEP 384 may help in the future - As it is written by someone who actually knows about this stuff, it will hopefully be useful. David From cournape at gmail.com Thu Jan 7 22:12:20 2010 From: cournape at gmail.com (David Cournapeau) Date: Fri, 8 Jan 2010 12:12:20 +0900 Subject: [Numpy-discussion] FIY: a (new ?) practical profiling tool on linux Message-ID: <5b8d13221001071912n54ff968dte7244ea56dc3a930@mail.gmail.com> Hi, I don't know if many people are aware of it, but I have recently discovered perf, a tool available from the kernel sources. It is extremely simple to use, and very useful when looking at numpy/scipy perf issues in compiled code. For example, I can get this kind of results for looking at the numpy neighborhood iterator performance in one simple command, without special compilation flags: 44.69% python /home/david/local/stow/scipy.git/lib/python2.6/site-packages/scipy/signal/sigtools.so [.] _imp_correlate_nd_double 39.47% python /home/david/local/stow/numpy-1.4.0/lib/python2.6/site-packages/numpy/core/multiarray.so [.] get_ptr_constant 9.98% python /home/david/local/stow/numpy-1.4.0/lib/python2.6/site-packages/numpy/core/multiarray.so [.] get_ptr_simple 0.65% python /usr/bin/python2.6 [.] 0x0000000012b8a0 0.40% python /usr/bin/python2.6 [.] 0x000000000a6662 0.37% python /usr/bin/python2.6 [.] 0x0000000004c10d 0.32% python /usr/bin/python2.6 [.] PyEval_EvalFrameEx 0.15% python [kernel] [k] __d_lookup 0.14% python /lib/libc-2.10.1.so [.] _int_malloc 0.12% python /usr/bin/python2.6 [.] 0x0000000004f90e 0.10% python [kernel] [k] __link_path_walk 0.09% python /usr/bin/python2.6 [.] PyObject_Malloc 0.09% python /lib/ld-2.10.1.so [.] do_lookup_x 0.09% python /lib/libc-2.10.1.so [.] __GI_memcpy 0.08% python [kernel] [k] __ticket_spin_lock 0.07% python /usr/bin/python2.6 [.] 
PyParser_AddToken And even cooler, annotated sources: ------------------------------------------------ Percent | Source code & Disassembly of multiarray.so ------------------------------------------------ : : : : Disassembly of section .text: : : 000000000001d8a0 : : _coordinates[c] = bd; : : /* set the dataptr from its current coordinates */ : static char* : get_ptr_constant(PyArrayIterObject* _iter, npy_intp *coordinates) : { 15.69 : 1d8a0: 48 81 ec 08 01 00 00 sub $0x108,%rsp : int i; : npy_intp bd, _coordinates[NPY_MAXDIMS]; : PyArrayNeighborhoodIterObject *niter = (PyArrayNeighborhoodIterObject*)_iter; : PyArrayIterObject *p = niter->_internal_iter; : : for(i = 0; i < niter->nd; ++i) { 0.02 : 1d8a7: 48 83 bf 48 0a 00 00 cmpq $0x0,0xa48(%rdi) 0.00 : 1d8ae: 00 : get_ptr_constant(PyArrayIterObject* _iter, npy_intp *coordinates) : { : int i; : npy_intp bd, _coordinates[NPY_MAXDIMS]; : PyArrayNeighborhoodIterObject *niter = (PyArrayNeighborhoodIterObject*)_iter; : PyArrayIterObject *p = niter->_internal_iter; 0.01 : 1d8af: 48 8b 87 50 0b 00 00 mov 0xb50(%rdi),%rax : : for(i = 0; i < niter->nd; ++i) { 7.92 : 1d8b6: 7e 64 jle 1d91c : _INF_SET_PTR(i) 0.01 : 1d8b8: 48 8b 0e mov (%rsi),%rcx 0.00 : 1d8bb: 48 03 48 28 add 0x28(%rax),%rcx 0.03 : 1d8bf: 48 3b 88 40 07 00 00 cmp 0x740(%rax),%rcx 7.97 : 1d8c6: 7c 68 jl 1d930 0.02 : 1d8c8: 45 31 c9 xor %r9d,%r9d 0.00 : 1d8cb: 31 d2 xor %edx,%edx 0.00 : 1d8cd: 48 3b 88 48 07 00 00 cmp 0x748(%rax),%rcx 7.75 : 1d8d4: 7e 32 jle 1d908 0.00 : 1d8d6: eb 58 jmp 1d930 0.00 : 1d8d8: 0f 1f 84 00 00 00 00 nopl 0x0(%rax,%rax,1) 0.00 : 1d8df: 00 7.68 : 1d8e0: 4c 8d 42 74 lea 0x74(%rdx),%r8 0.00 : 1d8e4: 48 8b 0c d6 mov (%rsi,%rdx,8),%rcx 0.00 : 1d8e8: 48 03 4c d0 28 add 0x28(%rax,%rdx,8),%rcx 0.00 : 1d8ed: 49 c1 e0 04 shl $0x4,%r8 7.89 : 1d8f1: 49 3b 0c 00 cmp (%r8,%rax,1),%rcx 0.00 : 1d8f5: 7c 39 jl 1d930 0.01 : 1d8f7: 49 89 d0 mov %rdx,%r8 0.11 : 1d8fa: 49 c1 e0 04 shl $0x4,%r8 7.18 : 1d8fe: 4a 3b 8c 00 48 07 00 cmp 0x748(%rax,%r8,1),%rcx 0.00 : 1d905: 00 0.09 : 1d906: 7f 28 jg 1d930 : int i; : npy_intp bd, _coordinates[NPY_MAXDIMS]; : PyArrayNeighborhoodIterObject *niter = (PyArrayNeighborhoodIterObject*)_iter; : PyArrayIterObject *p = niter->_internal_iter; : It works for C and Fortran, BTW, cheers, David From bsouthey at gmail.com Thu Jan 7 23:10:39 2010 From: bsouthey at gmail.com (Bruce Southey) Date: Thu, 7 Jan 2010 22:10:39 -0600 Subject: [Numpy-discussion] fromfile() for reading text (one more time!) In-Reply-To: <4B465605.3010406@noaa.gov> References: <4B42905A.4080105@noaa.gov> <1262721695.5107.1.camel@idol> <4B463F37.4010108@noaa.gov> <1cd32cbb1001071232l6c3d3525g4ec4747d62d998ed@mail.gmail.com> <4B465605.3010406@noaa.gov> Message-ID: On Thu, Jan 7, 2010 at 3:45 PM, Christopher Barker wrote: > Bruce Southey wrote: >>> wrote: > >> Using the numpy NaN or similar (noting R's approach to missing values >> which in turn allows it to have the above functionality) is just a >> very bad idea for missing values because you always have to check that >> which NaN is a missing value and which was due to some numerical >> calculation. > > well, this is specific to reading files, so you know where it came from. You can only know where it came from when you compare the original array to the transformed one. Also a user has to check for missing values or numpy has to warn a user that missing values are present immediately after reading the data so the appropriate action can be taken (like using functions that handle missing values appropriately). 
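For example, every user ends up writing boilerplate along these lines (a sketch that assumes the proposed behaviour where a blank field comes back as NaN):

import numpy as np
# suppose fromstring() filled the blank field with NaN, as proposed
x = np.fromstring("3, 4,,5", sep=",")
n_missing = np.isnan(x).sum()
if n_missing:
    print '%d NaNs -- but are they missing values or computed NaNs?' % n_missing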
That is my second problem with using codes (NaN, -99999 etc) for missing values. > And the principle of fromfile() is that it is fast and simple, if you > want masked arrays, use slower, but more full-featured methods. So in that case it should fail with missing data. > > However, in this case: > > In [9]: np.fromstring("3, 4, NaN, 5", sep=",") > Out[9]: array([ ?3., ? 4., ?NaN, ? 5.]) > > > An actual NaN is read from the file, rather than a missing value. > Perhaps the user does want the distinction, so maybe it should really > only fil it in if the users asks for it, but specifying > "missing_value=np.nan" or something. Yes, that is my first problem of using predefined codes for missing values as you do not always know what is going to occur in the data. > >>>From what I can see is that you expect that fromfile() should only >> split at the supplied delimiters, optionally(?) strip any whitespace > > whitespace stripping is not optional. > >> Your output from this string '1, 2, 3, 4\n5, 6, 7, 8\n9, 10, 11, 12' >> actually assumes multiple delimiters because there is no comma between >> 4 and 5 and 8 and 9. > > Yes, that's the point. I thought about allowing arbitrary multiple > delimiters, but I think '/n' is a special case - for instance, a comma > at the end of some numbers might mean missing data, but a '\n' would not. > > And I couldn't really think of a useful use-case for arbitrary multiple > delimiters. > >> In Josef's last case how many 'missing values should there be? > > ?>> extra newlines at end of file > ?>> str = ?'1, 2, 3, 4\n5, 6, 7, 8\n9, 10, 11, 12\n\n\n' > > none -- exactly why I think \n is a special case. What about '\r' and '\n\r'? > > What about: > ?>> extra newlines in the middle of the file > ?>> str = ?'1, 2, 3, 4\n\n5, 6, 7, 8\n9, 10, 11, 12\n' > > I think they should be ignored, but I hope I'm not making something that > is too specific to my personal needs. Not really, it is more that I am being somewhat difficult to ensure I understand what you actually need. My problem with this is that you are reading one huge 1-D array (that you can resize later) rather than a 2-D array with rows and columns (which is what I deal with). But I agree that you can have an option to say treat '\n' or '\r' as a delimiter but I think it should be turned off by default. > > Travis Oliphant wrote: >> +1 (ignoring new-lines transparently is a nice feature). ?You can also >> use sscanf with weave to read most files. > > right -- but that requires weave. In fact, MATLAB has a fscanf function > that allows you to pass in a C format string and it vectorizes it to use > the same one over an over again until it's done. It's actually quite > powerful and flexible. I once started with that in mind, but didn't have > the C chops to do it. I ended up with a tool that only did doubles (come > to think of it, MATLAB only does doubles, anyway...) > > I may some day write a whole new C (or, more likely, Cython) function > that does something like that, but for now, I'm jsut trying to get > fromfile to be useful for me. > > >> +1 ? (much preferrable to insert NaN or other user value than raise >> ValueError in my opinion) > > But raise an error for integer types? > > I guess this is still up the air -- no consensus yet. > > Thanks, > > -Chris > You should have a corresponding value for ints because raising an exceptionwould be inconsistent with allowing floats to have a value. 
If you must keep the user defined dtype then, as Josef suggests, just use some code be it -999 or most negative number supported by the OS for the defined dtype or, just convert the ints into floats if the user does not define a missing value code. It would be nice to either return the number of missing values or display a warning indicating how many occurred. Bruce From josef.pktd at gmail.com Fri Jan 8 00:26:41 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 8 Jan 2010 00:26:41 -0500 Subject: [Numpy-discussion] fromfile() for reading text (one more time!) In-Reply-To: References: <4B42905A.4080105@noaa.gov> <1262721695.5107.1.camel@idol> <4B463F37.4010108@noaa.gov> <1cd32cbb1001071232l6c3d3525g4ec4747d62d998ed@mail.gmail.com> <4B465605.3010406@noaa.gov> Message-ID: <1cd32cbb1001072126x1e68f10dv940078d0cf81f35@mail.gmail.com> On Thu, Jan 7, 2010 at 11:10 PM, Bruce Southey wrote: > On Thu, Jan 7, 2010 at 3:45 PM, Christopher Barker > wrote: >> Bruce Southey wrote: >>>> wrote: >> >>> Using the numpy NaN or similar (noting R's approach to missing values >>> which in turn allows it to have the above functionality) is just a >>> very bad idea for missing values because you always have to check that >>> which NaN is a missing value and which was due to some numerical >>> calculation. >> >> well, this is specific to reading files, so you know where it came from. > > You can only know where it came from when you compare the original > array to the transformed one. Also a user has to check for missing > values or numpy has to warn a user that missing values are present > immediately after reading the data so the appropriate action can be > taken (like using functions that handle missing values appropriately). > That is my second problem with using codes (NaN, -99999 etc) ?for > missing values. > > > >> And the principle of fromfile() is that it is fast and simple, if you >> want masked arrays, use slower, but more full-featured methods. > > So in that case it should fail with missing data. > >> >> However, in this case: >> >> In [9]: np.fromstring("3, 4, NaN, 5", sep=",") >> Out[9]: array([ ?3., ? 4., ?NaN, ? 5.]) >> >> >> An actual NaN is read from the file, rather than a missing value. >> Perhaps the user does want the distinction, so maybe it should really >> only fil it in if the users asks for it, but specifying >> "missing_value=np.nan" or something. > > Yes, that is my first problem of using predefined codes for missing > values as you do not always know what is going to occur in the data. > > >> >>>>From what I can see is that you expect that fromfile() should only >>> split at the supplied delimiters, optionally(?) strip any whitespace >> >> whitespace stripping is not optional. >> >>> Your output from this string '1, 2, 3, 4\n5, 6, 7, 8\n9, 10, 11, 12' >>> actually assumes multiple delimiters because there is no comma between >>> 4 and 5 and 8 and 9. >> >> Yes, that's the point. I thought about allowing arbitrary multiple >> delimiters, but I think '/n' is a special case - for instance, a comma >> at the end of some numbers might mean missing data, but a '\n' would not. >> >> And I couldn't really think of a useful use-case for arbitrary multiple >> delimiters. >> >>> In Josef's last case how many 'missing values should there be? >> >> ?>> extra newlines at end of file >> ?>> str = ?'1, 2, 3, 4\n5, 6, 7, 8\n9, 10, 11, 12\n\n\n' >> >> none -- exactly why I think \n is a special case. > > What about '\r' and '\n\r'? 
Yes, I forgot about this, and it will be the most common case for Windows users like myself. I think \r should be stripped automatically, like in non-binary reading of files in python. > >> >> What about: >> ?>> extra newlines in the middle of the file >> ?>> str = ?'1, 2, 3, 4\n\n5, 6, 7, 8\n9, 10, 11, 12\n' >> >> I think they should be ignored, but I hope I'm not making something that >> is too specific to my personal needs. > > Not really, it is more that I am being somewhat difficult to ensure I > understand what you actually need. > > My problem with this is that you are reading one huge 1-D array ?(that > you can resize later) rather than a 2-D array with rows and columns > (which is what I deal with). But I agree that you can have an option > to say treat '\n' or '\r' as a delimiter but I think it should be > turned off by default. > > >> >> Travis Oliphant wrote: >>> +1 (ignoring new-lines transparently is a nice feature). ?You can also >>> use sscanf with weave to read most files. >> >> right -- but that requires weave. In fact, MATLAB has a fscanf function >> that allows you to pass in a C format string and it vectorizes it to use >> the same one over an over again until it's done. It's actually quite >> powerful and flexible. I once started with that in mind, but didn't have >> the C chops to do it. I ended up with a tool that only did doubles (come >> to think of it, MATLAB only does doubles, anyway...) >> >> I may some day write a whole new C (or, more likely, Cython) function >> that does something like that, but for now, I'm jsut trying to get >> fromfile to be useful for me. >> >> >>> +1 ? (much preferrable to insert NaN or other user value than raise >>> ValueError in my opinion) >> >> But raise an error for integer types? >> >> I guess this is still up the air -- no consensus yet. >> >> Thanks, >> >> -Chris >> > > You should have a corresponding value for ints because raising an > exceptionwould be inconsistent with allowing floats to have a value. No, I think different nan/missing value handling between integers and float is a natural distinction. There is no default nan code for integers, but nan (and inf) are valid floating point numbers (even if nan is not a number). And the default treatment of nans in numpy is getting pretty good (e.g. I like the new (nan)sort). > If you must keep the user defined dtype then, as Josef suggests, just > use some code be it -999 or most negative number supported by the OS > for the defined dtype or, just convert the ints into floats if the > user does not define a missing value code. ?It would be nice to either > return the number of missing values or display a warning indicating > how many occurred. A warning would be good, but doing np.any(np.isnan(x)) or np.isnan(x).sum() on the result is always a good idea for a user when missing values are possibility. Josef > > Bruce > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From pav at iki.fi Fri Jan 8 04:22:53 2010 From: pav at iki.fi (Pauli Virtanen) Date: Fri, 08 Jan 2010 11:22:53 +0200 Subject: [Numpy-discussion] fromfile() -- help! 
In-Reply-To: <4B46889E.3020008@noaa.gov> References: <4B42905A.4080105@noaa.gov> <1262721695.5107.1.camel@idol> <4B463F37.4010108@noaa.gov> <1cd32cbb1001071232l6c3d3525g4ec4747d62d998ed@mail.gmail.com> <4B465605.3010406@noaa.gov> <1cd32cbb1001071515p498c8746u5dce34453346c97f@mail.gmail.com> <4B466E4C.10504@noaa.gov> <4B46889E.3020008@noaa.gov> Message-ID: <1262942573.2580.149.camel@talisman> to, 2010-01-07 kello 17:21 -0800, Christopher Barker kirjoitti: [clip] > if I have NumPyOS_ascii_ftolf right, it should return 0 if it doesn't > succesfully read a number. However, this looks like it sets the data in > *ip, even if the return value is zero. It may also return EOF (== -1) when encountering end-of-stream. Of course, I don't think any code should not rely on EOF being -1, and I doubt that relying on it is intended here. > It does pass on that return value, but, from ctors.c: > > fromfile_next_element(FILE **fp, void *dptr, PyArray_Descr *dtype, > void *NPY_UNUSED(stream_data)) > { > /* the NULL argument is for backwards-compatibility */ > return dtype->f->scanfunc(*fp, dptr, NULL, dtype); > } > > just moves it on through. This is called from here: > > if (next(&stream, dptr, dtype, stream_data) < 0) { > break; > } > > which is checking for < 0 , so if a zero is returned, it will just go in > its merry way... Yeah, this is of course wrong; for example a file containing "1,2," results to np.fromfile("filename.txt", sep=",") == [1, 2, -1] where the last value is effectively undefined. Another point to note is that `next` may also be the fromstr_next_element function; when fixing things also its semantics should be corrected. Pauli From pav+sp at iki.fi Fri Jan 8 04:28:33 2010 From: pav+sp at iki.fi (Pauli Virtanen) Date: Fri, 8 Jan 2010 09:28:33 +0000 (UTC) Subject: [Numpy-discussion] fromfile() -- help! References: <4B42905A.4080105@noaa.gov> <1262721695.5107.1.camel@idol> <4B463F37.4010108@noaa.gov> <1cd32cbb1001071232l6c3d3525g4ec4747d62d998ed@mail.gmail.com> <4B465605.3010406@noaa.gov> <1cd32cbb1001071515p498c8746u5dce34453346c97f@mail.gmail.com> <4B466E4C.10504@noaa.gov> <4B46889E.3020008@noaa.gov> Message-ID: Thu, 07 Jan 2010 17:21:34 -0800, Christopher Barker wrote: [clip] > It does pass on that return value, but, from ctors.c: > > fromfile_next_element(FILE **fp, void *dptr, PyArray_Descr *dtype, > void *NPY_UNUSED(stream_data)) > { > /* the NULL argument is for backwards-compatibility */ return > dtype->f->scanfunc(*fp, dptr, NULL, dtype); > } This functions is IMHO where the fix should go; I believe it should do something like return (ret == 0 || ret == EOF) ? -1 : ret; -- Pauli Virtanen From robince at gmail.com Fri Jan 8 07:03:27 2010 From: robince at gmail.com (Robin) Date: Fri, 8 Jan 2010 12:03:27 +0000 Subject: [Numpy-discussion] 1.4.0 installer fails on OSX 10.6.2 In-Reply-To: References: <9d5ec4221001050835w5dd4cf97hffd4f2480bf19c3e@mail.gmail.com> <4B43A4FF.3010604@hawaii.edu> <4B43B4CD.3030701@hawaii.edu> <523E341E-A076-465C-938C-EC902F5F0D16@gmail.com> <4B43C4D2.5030701@noaa.gov> <4B43D330.7090608@noaa.gov> Message-ID: <2d5132a51001080403r62828c3bsae9baf12c6ed5e92@mail.gmail.com> On Fri, Jan 8, 2010 at 2:29 AM, David Warde-Farley wrote: > On 5-Jan-10, at 7:02 PM, Christopher Barker wrote: > >>> Pretty sure the python.org binaries are 32-bit only. I still think >>> it's sensible to prefer the >> >> waiting the rest of this sentence.. 
;-) > > I had meant to say 'sensible to prefer the Python.org version' though > in reality I'm a little miffed that Python.org isn't providing Ron's 4- > way binaries, since he went to the trouble of adding support for > building them. Grumble grumble. My understanding was that 2.6/3.1 will never be buildable as an arch selectable universal binary interpreter (like the apple system python) due to this issue: http://bugs.python.org/issue6834 I think this is only being fixed in 2.7/3.2 so perhaps from then Python will distribute selectable universal builds. (Just mention it in case folks aren't aware of that issue). Cheers Robin From nouiz at nouiz.org Fri Jan 8 09:18:28 2010 From: nouiz at nouiz.org (=?ISO-8859-1?Q?Fr=E9d=E9ric_Bastien?=) Date: Fri, 8 Jan 2010 09:18:28 -0500 Subject: [Numpy-discussion] performance matrix multiplication vs. matlab In-Reply-To: References: <002401ca8f3f$fdae2340$f90a69c0$%yang@physics.usyd.edu.au> <1d4f24dc6f376254aa63963eb6cd5916.squirrel@webmail.uio.no> <4B461B9F.5060800@noaa.gov> Message-ID: <2d1d7fe71001080618od97adf1nbd936f5f22963912@mail.gmail.com> Hi, I while back, someone talked about aigen2(http://eigen.tuxfamily.org/). In their benchmark they give info that they are competitive again mkl and goto on matrix matrix product. They are not better, but that could make a good default implementation for numpy when their is no blas installed. I think the license would allow to include it in numpy directly. I don't have time to do it, and my numpy is linked with goto. So it would be useless for me. But if someone want to make the default version better again other tools, that could be a good approach. Fr?d?ric Bastien On Thu, Jan 7, 2010 at 12:47 PM, Sturla Molden wrote: > > Sturla Molden wrote: > >> I would suggest using GotoBLAS instead of ATLAS. > > > >> http://www.tacc.utexas.edu/tacc-projects/ > > > > That does look promising -- nay idea what the license is? They don't > > make it clear on the site > > > > UT TACC Research License (Source Code) > > > > The Texas Advanced Computing Center of The University of Texas at Austin > has developed certain software and documentation that it desires to make > available without charge to anyone for academic, research, experimental or > personal use. This license is designed to guarantee freedom to use the > software for these purposes. If you wish to distribute or make other use > of the software, you may purchase a license to do so from the University > of Texas. > > The accompanying source code is made available to you under the terms of > this UT TACC Research License (this "UTTRL"). By clicking the "ACCEPT" > button, or by installing or using the code, you are consenting to be bound > by this UTTRL. If you do not agree to the terms and conditions of this > license, do not click the "ACCEPT" button, and do not install or use any > part of the code. > > The terms and conditions in this UTTRL not only apply to the source code > made available by UT TACC, but also to any improvements to, or derivative > works of, that source code made by you and to any object code compiled > from such source code, improvements or derivative works. > > 1. DEFINITIONS. 
> > 1.1 "Commercial Use" shall mean use of Software or Documentation by > Licensee for direct or indirect financial, commercial or strategic gain or > advantage, including without limitation: (a) bundling or integrating the > Software with any hardware product or another software product for > transfer, sale or license to a third party (even if distributing the > Software on separate media and not charging for the Software); (b) > providing customers with a link to the Software or a copy of the Software > for use with hardware or another software product purchased by that > customer; or (c) use in connection with the performance of services for > which Licensee is compensated. > > 1.2 "Derivative Products" means any improvements to, or other derivative > works of, the Software made by Licensee. > > 1.3 "Documentation" shall mean all manuals, user documentation, and other > related materials pertaining to the Software that are made available to > Licensee in connection with the Software. > > 1.4 "Licensor" shall mean The University of Texas. > > 1.5 "Licensee" shall mean the person or entity that has agreed to the > terms hereof and is exercising rights granted hereunder. > > 1.6 "Software" shall mean the computer program(s) referred to as GotoBLAS2 > made available under this UTTRL in source code form, including any error > corrections, bug fixes, patches, updates or other modifications that > Licensor may in its sole discretion make available to Licensee from time > to time, and any object code compiled from such source code. > > 2. GRANT OF RIGHTS. > > Subject to the terms and conditions hereunder, Licensor hereby grants to > Licensee a worldwide, non-transferable, non-exclusive license to (a) > install, use and reproduce the Software for academic, research, > experimental and personal use (but specifically excluding Commercial Use); > (b) use and modify the Software to create Derivative Products, subject to > Section 3.2; and (c) use the Documentation, if any, solely in connection > with Licensee's authorized use of the Software. > > 3. RESTRICTIONS; COVENANTS. > > 3.1 Licensee may not: (a) distribute, sub-license or otherwise transfer > copies or rights to the Software (or any portion thereof) or the > Documentation; (b) use the Software (or any portion thereof) or > Documentation for Commercial Use, or for any other use except as described > in Section 2; (c) copy the Software or Documentation other than for > archival and backup purposes; or (d) remove any product identification, > copyright, proprietary notices or labels from the Software and > Documentation. This UTTRL confers no rights upon Licensee except those > expressly granted herein. > > 3.2 Licensee hereby agrees that it will provide a copy of all Derivative > Products to Licensor and that its use of the Derivative Products will be > subject to all of the same terms, conditions, restrictions and limitations > on use imposed on the Software under this UTTRL. Licensee hereby grants > Licensor a worldwide, non-exclusive, royalty-free license to reproduce, > prepare derivative works of, publicly display, publicly perform, > sublicense and distribute Derivative Products. Licensee also hereby grants > Licensor a worldwide, non-exclusive, royalty-free patent license to make, > have made, use, offer to sell, sell, import and otherwise transfer the > Derivative Products under those patent claims licensable by Licensee that > are necessarily infringed by the Derivative Products. > > 4. PROTECTION OF SOFTWARE. > > 4.1 Confidentiality. 
The Software and Documentation are the confidential > and proprietary information of Licensor. Licensee agrees to take adequate > steps to protect the Software and Documentation from unauthorized > disclosure or use. Licensee agrees that it will not disclose the Software > or Documentation to any third party. > > 4.2 Proprietary Notices. Licensee shall maintain and place on any copy of > Software or Documentation that it reproduces for internal use all notices > as are authorized and/or required hereunder. Licensee shall include a copy > of this UTTRL and the following notice, on each copy of the Software and > Documentation. Such license and notice shall be embedded in each copy of > the Software, in the video screen display, on the physical medium > embodying the Software copy and on any Documentation: > > Copyright ?? The University of Texas, 2009. All right reserved. > UNIVERSITY EXPRESSLY DISCLAIMS ANY AND ALL WARRANTIES CONCERNING THIS > SOFTWARE AND DOCUMENTATION, INCLUDING ANY WARRANTIES OF MERCHANTABILITY, > FITNESS FOR ANY PARTICULAR PURPOSE, NON-INFRINGEMENT AND WARRANTIES OF > PERFORMANCE, AND ANY WARRANTY THAT MIGHT OTHERWISE ARISE FROM COURSE OF > DEALING OR USAGE OF TRADE. NO WARRANTY IS EITHER EXPRESS OR IMPLIED WITH > RESPECT TO THE USE OF THE SOFTWARE OR DOCUMENTATION. Under no > circumstances shall University be liable for incidental, special, > indirect, direct or consequential damages or loss of profits, interruption > of business, or related expenses which may arise from use of Software or > Documentation, including but not limited to those resulting from defects > in Software and/or Documentation, or loss or inaccuracy of data of any > kind. > > 5. WARRANTIES. > > 5.1 Disclaimer of Warranties. TO THE EXTENT PERMITTED BY APPLICABLE LAW, > THE SOFTWARE AND DOCUMENTATION ARE BEING PROVIDED ON AN "AS IS" BASIS > WITHOUT ANY WARRANTIES OF ANY KIND RESPECTING THE SOFTWARE OR > DOCUMENTATION, EITHER EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY > WARRANTY OF DESIGN, MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, OR > NON-INFRINGEMENT. > > 5.2 Limitation of Liability. UNDER NO CIRCUMSTANCES UNLESS REQUIRED BY > APPLICABLE LAW SHALL LICENSOR BE LIABLE FOR INCIDENTAL, SPECIAL, INDIRECT, > DIRECT OR CONSEQUENTIAL DAMAGES OR LOSS OF PROFITS, INTERRUPTION OF > BUSINESS, OR RELATED EXPENSES WHICH MAY ARISE AS A RESULT OF THIS LICENSE > OR OUT OF THE USE OR ATTEMPT OF USE OF SOFTWARE OR DOCUMENTATION INCLUDING > BUT NOT LIMITED TO THOSE RESULTING FROM DEFECTS IN SOFTWARE AND/OR > DOCUMENTATION, OR LOSS OR INACCURACY OF DATA OF ANY KIND. THE FOREGOING > EXCLUSIONS AND LIMITATIONS WILL APPLY TO ALL CLAIMS AND ACTIONS OF ANY > KIND, WHETHER BASED ON CONTRACT, TORT (INCLUDING, WITHOUT LIMITATION, > NEGLIGENCE), OR ANY OTHER GROUNDS. > > 6. INDEMNIFICATION. > > Licensee shall indemnify, defend and hold harmless Licensor, the > University of Texas System, their Regents, and their officers, agents and > employees from and against any claims, demands, or causes of action > whatsoever caused by, or arising out of, or resulting from, the exercise > or practice of the license granted hereunder by Licensee, its officers, > employees, agents or representatives. > > 7. TERMINATION. > > If Licensee breaches this UTTRL, Licensee\'s right to use the Software and > Documentation will terminate immediately without notice, but all > provisions of this UTTRL except Section 2 will survive termination and > continue in effect. 
Upon termination, Licensee must destroy all copies of > the Software and Documentation. > > 8. GOVERNING LAW; JURISDICTION AND VENUE. > > The validity, interpretation, construction and performance of this UTTRL > shall be governed by the laws of the State of Texas. The Texas state > courts of Travis County, Texas (or, if there is exclusive federal > jurisdiction, the United States District Court for the Central District of > Texas) shall have exclusive jurisdiction and venue over any dispute > arising out of this UTTRL, and Licensee consents to the jurisdiction of > such courts. Application of the United Nations Convention on Contracts for > the International Sale of Goods is expressly excluded. > > 9. EXPORT CONTROLS. > > This license is subject to all applicable export restrictions. Licensee > must comply with all export and import laws and restrictions and > regulations of any United States or foreign agency or authority relating > to the Software and its use. > > 10. U.S. GOVERNMENT END-USERS. > > The Software is a "commercial item," as that term is defined in 48 C.F.R. > 2.101, consisting of "commercial computer software" and "commercial > computer software documentation," as such terms are used in 48 C.F.R. > 12.212 (Sept. 1995) and 48 C.F.R. 227.7202 (June 1995). Consistent with 48 > C.F.R. 12.212, 48 C.F.R. 27.405(b)(2) (June 1998) and 48 C.F.R. 227.7202, > all U.S. Government End Users acquire the Software with only those rights > as set forth herein. > > 11. MISCELLANEOUS. > > If any provision hereof shall be held illegal, invalid or unenforceable, > in whole or in part, such provision shall be modified to the minimum > extent necessary to make it legal, valid and enforceable, and the > legality, validity and enforceability of all other provisions of this > UTTRL shall not be affected thereby. Licensee may not assign this UTTRL in > whole or in part, without Licensor's prior written consent. Any attempt to > assign this UTTRL without such consent will be null and void. This UTTRL > is the complete and exclusive statement between Licensee and Licensor > relating to the subject matter hereof and supersedes all prior oral and > written and all contemporaneous oral negotiations, commitments and > understandings of the parties, if any. Any waiver by either party of any > default or breach hereunder shall not constitute a waiver of any provision > of this UTTRL or of any subsequent default or breach of the same or a > different kind. > > > > END OF LICENSE > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Fri Jan 8 10:17:43 2010 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 8 Jan 2010 09:17:43 -0600 Subject: [Numpy-discussion] performance matrix multiplication vs. matlab In-Reply-To: <2d1d7fe71001080618od97adf1nbd936f5f22963912@mail.gmail.com> References: <002401ca8f3f$fdae2340$f90a69c0$%yang@physics.usyd.edu.au> <1d4f24dc6f376254aa63963eb6cd5916.squirrel@webmail.uio.no> <4B461B9F.5060800@noaa.gov> <2d1d7fe71001080618od97adf1nbd936f5f22963912@mail.gmail.com> Message-ID: <3d375d731001080717t7185e9c5w9b64a5d49b9740bc@mail.gmail.com> 2010/1/8 Fr?d?ric Bastien : > Hi, > > I while back, someone talked about aigen2(http://eigen.tuxfamily.org/). 
In > their benchmark they give info that they are competitive again mkl and goto > on matrix matrix product. They are not better, but that could make a good > default implementation for numpy when their is no blas installed. I think > the license would allow to include it in numpy directly. It is licensed under the LGPLv3, so it is not compatible with the numpy license. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From d.l.goldsmith at gmail.com Fri Jan 8 16:13:14 2010 From: d.l.goldsmith at gmail.com (David Goldsmith) Date: Fri, 8 Jan 2010 13:13:14 -0800 Subject: [Numpy-discussion] Stupid question (at least coming from me it is) Message-ID: <45d1ab481001081313w5a9feb62xf1355ed92e43189b@mail.gmail.com> So, to get the new numpy.polynomial "sub-package," one has to update to 1.4 (or is there a 1.3.x that has it)? Thanks! DG PS: my pressing need (another stupid question, at least coming from me): chebyshev.chebdomain = [0,1] or [-1,1]? Thanks again! -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Fri Jan 8 16:40:34 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 8 Jan 2010 14:40:34 -0700 Subject: [Numpy-discussion] Stupid question (at least coming from me it is) In-Reply-To: <45d1ab481001081313w5a9feb62xf1355ed92e43189b@mail.gmail.com> References: <45d1ab481001081313w5a9feb62xf1355ed92e43189b@mail.gmail.com> Message-ID: On Fri, Jan 8, 2010 at 2:13 PM, David Goldsmith wrote: > So, to get the new numpy.polynomial "sub-package," one has to update to 1.4 > (or is there a 1.3.x that has it)? Thanks! > > Yes. > DG > > PS: my pressing need (another stupid question, at least coming from me): > chebyshev.chebdomain = [0,1] or [-1,1]? Thanks again! > > chebyshev.chebdomain is the default chebyshev domain and is [-1,1]. Maybe it needs a bettter name? Note that it is integer; that isn't required, but it makes it compatible with other types like Decimal that don't mix with floats. Another possibility is to make it a function so I can document it, at present it is an ndarray. For normal work you should use the chebyshev.Chebyshev class. Let me know what particular problem you are looking at as it will be useful to start putting some examples together. And I want to see what needs improvement. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From Chris.Barker at noaa.gov Fri Jan 8 18:12:24 2010 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Fri, 08 Jan 2010 15:12:24 -0800 Subject: [Numpy-discussion] fromfile() for reading text (one more time!) In-Reply-To: References: <4B42905A.4080105@noaa.gov> <1262721695.5107.1.camel@idol> <4B463F37.4010108@noaa.gov> <1cd32cbb1001071232l6c3d3525g4ec4747d62d998ed@mail.gmail.com> <4B465605.3010406@noaa.gov> Message-ID: <4B47BBD8.5010906@noaa.gov> Bruce Southey wrote: > Also a user has to check for missing > values or numpy has to warn a user I think warnings are next to useless for all but interactive work -- so I don't want to rely on them > that missing values are present > immediately after reading the data so the appropriate action can be > taken (like using functions that handle missing values appropriately). > That is my second problem with using codes (NaN, -99999 etc) for > missing values. 
But I think you're right -- if someone write code, tests with good input, then later runs it with missing valued import, they are likely to have not ever bothered to test for missing values. So I think missing values should only be replaced by something if the user specifically asks for it. >> And the principle of fromfile() is that it is fast and simple, if you >> want masked arrays, use slower, but more full-featured methods. > > So in that case it should fail with missing data. Well, I'm not so sure -- the point is performance, no reason not to have high performing code that handles missing data. > What about '\r' and '\n\r'? I have thought about that -- I'm hoping that python's text file reading will just take care of it, but as we're working with C file handles here (I think), I guess not. '/n/r' is easy -- the '/r' is just extra whitespace. 'r' is another case to handle. > My problem with this is that you are reading one huge 1-D array (that > you can resize later) rather than a 2-D array with rows and columns > (which is what I deal with). That's because fromfile()) is not designed to be row-oriented at all, and the binary read certainly isn't. I'm just trying to make this easy -- though it's not turning out that way! > But I agree that you can have an option > to say treat '\n' or '\r' as a delimiter but I think it should be > turned off by default. that's what I've done. > You should have a corresponding value for ints because raising an > exceptionwould be inconsistent with allowing floats to have a value. I'm not sure I care, really -- but I think having the user specify the fill value is the best option, anyway. josef.pktd at gmail.com wrote: >>> none -- exactly why I think \n is a special case. >> What about '\r' and '\n\r'? > > Yes, I forgot about this, and it will be the most common case for > Windows users like myself. > > I think \r should be stripped automatically, like in non-binary > reading of files in python. except for folks like me that have old mac files laying around...so I want this like "Universal newlines" support. > A warning would be good, but doing np.any(np.isnan(x)) or > np.isnan(x).sum() on the result is always a good idea for a user when > missing values are possibility. right, but the issue is the user has to know that they are possible, and we all know how carefully we all read docs! Thanks for your input -- I think I know what I'd like to do, but it's proving less than trivial to do it, so we'll see. In short: 1) optionally allow newlines to serve as a delimiter, so large tables can be read. 2) raise an exception for missing values, unless: 3) the user specifies a fill value of their choice (compatible with the chosen data type. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From Chris.Barker at noaa.gov Fri Jan 8 18:16:31 2010 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Fri, 08 Jan 2010 15:16:31 -0800 Subject: [Numpy-discussion] fromfile() -- help! 
In-Reply-To: <1262942573.2580.149.camel@talisman> References: <4B42905A.4080105@noaa.gov> <1262721695.5107.1.camel@idol> <4B463F37.4010108@noaa.gov> <1cd32cbb1001071232l6c3d3525g4ec4747d62d998ed@mail.gmail.com> <4B465605.3010406@noaa.gov> <1cd32cbb1001071515p498c8746u5dce34453346c97f@mail.gmail.com> <4B466E4C.10504@noaa.gov> <4B46889E.3020008@noaa.gov> <1262942573.2580.149.camel@talisman> Message-ID: <4B47BCCF.6060104@noaa.gov> Pauli Virtanen wrote: >> if I have NumPyOS_ascii_ftolf right, it should return 0 if it doesn't >> succesfully read a number. However, this looks like it sets the data in >> *ip, even if the return value is zero. > > It may also return EOF (== -1) when encountering end-of-stream. Of > course, I don't think any code should not rely on EOF being -1, and I > doubt that relying on it is intended here. OK, so it should explicitly check for EOF? >> It does pass on that return value, but, from ctors.c: >> >> fromfile_next_element(FILE **fp, void *dptr, PyArray_Descr *dtype, >> void *NPY_UNUSED(stream_data)) >> { >> /* the NULL argument is for backwards-compatibility */ >> return dtype->f->scanfunc(*fp, dptr, NULL, dtype); >> } >> >> just moves it on through. This is called from here: >> >> if (next(&stream, dptr, dtype, stream_data) < 0) { >> break; >> } >> >> which is checking for < 0 , so if a zero is returned, it will just go in >> its merry way... > > Yeah, this is of course wrong; for example a file containing "1,2," > results to np.fromfile("filename.txt", sep=",") == [1, 2, -1] where the > last value is effectively undefined. I get a zero, but yes, that's what I'm trying to fix > Another point to note is that `next` may also be the > fromstr_next_element function; when fixing things also its semantics > should be corrected. yup -- I know -- great fun! But I;'m writing unit test that ensure that fromstring and fromfile do the same thing, so I should catch it if I miss anything. >> It does pass on that return value, but, from ctors.c: >> >> fromfile_next_element(FILE **fp, void *dptr, PyArray_Descr *dtype, >> void *NPY_UNUSED(stream_data)) >> { >> /* the NULL argument is for backwards-compatibility */ return >> dtype->f->scanfunc(*fp, dptr, NULL, dtype); >> } > > This functions is IMHO where the fix should go; I believe it should do > something like > > return (ret == 0 || ret == EOF) ? -1 : ret; > Thanks -- I think that makes sense -- if nothing else, a change here will only effect fromfile(), so I won't accidentally break anything else. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From d.l.goldsmith at gmail.com Fri Jan 8 19:19:21 2010 From: d.l.goldsmith at gmail.com (David Goldsmith) Date: Fri, 8 Jan 2010 16:19:21 -0800 Subject: [Numpy-discussion] Stupid question (at least coming from me it is) In-Reply-To: References: <45d1ab481001081313w5a9feb62xf1355ed92e43189b@mail.gmail.com> Message-ID: <45d1ab481001081619g3afb1af4t275631e293795e55@mail.gmail.com> On Fri, Jan 8, 2010 at 1:40 PM, Charles R Harris wrote: > > chebyshev.chebdomain is the default chebyshev domain and is [-1,1]. Maybe it > needs a bettter name? Note that it is integer; that isn't required, but it > makes it compatible with other types like Decimal that don't mix with > floats. Another possibility is to make it a function so I can document it, That's "the problem" I'm working on. :-) DG > at present it is an ndarray. 
For normal work you should use the > chebyshev.Chebyshev class. > > Let me know what particular problem you are looking at as it will be useful > to start putting some examples together. And I want to see what needs > improvement. > > Chuck > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From bsouthey at gmail.com Fri Jan 8 20:15:43 2010 From: bsouthey at gmail.com (Bruce Southey) Date: Fri, 8 Jan 2010 19:15:43 -0600 Subject: [Numpy-discussion] fromfile() for reading text (one more time!) In-Reply-To: <4B47BBD8.5010906@noaa.gov> References: <4B42905A.4080105@noaa.gov> <1262721695.5107.1.camel@idol> <4B463F37.4010108@noaa.gov> <1cd32cbb1001071232l6c3d3525g4ec4747d62d998ed@mail.gmail.com> <4B465605.3010406@noaa.gov> <4B47BBD8.5010906@noaa.gov> Message-ID: On Fri, Jan 8, 2010 at 5:12 PM, Christopher Barker wrote: > Bruce Southey wrote: >> Also a user has to check for missing >> values or numpy has to warn a user > > I think warnings are next to useless for all but interactive work -- so > I don't want to rely on them > >> that missing values are present >> immediately after reading the data so the appropriate action can be >> taken (like using functions that handle missing values appropriately). >> That is my second problem with using codes (NaN, -99999 etc) ?for >> missing values. > > But I think you're right -- if someone write code, tests with good > input, then later runs it with missing valued import, they are likely to > have not ever bothered to test for missing values. > > So I think missing values should only be replaced by something if the > user specifically asks for it. > >>> And the principle of fromfile() is that it is fast and simple, if you >>> want masked arrays, use slower, but more full-featured methods. >> >> So in that case it should fail with missing data. > > Well, I'm not so sure -- the point is performance, no reason not to have > high performing code that handles missing data. > >> What about '\r' and '\n\r'? > > I have thought about that -- I'm hoping that python's text file reading > will just take care of it, but as we're working with C file handles here > (I think), I guess not. '/n/r' is easy -- the '/r' is just extra > whitespace. 'r' is another case to handle. > > >> My problem with this is that you are reading one huge 1-D array ?(that >> you can resize later) rather than a 2-D array with rows and columns >> (which is what I deal with). > > That's because fromfile()) is not designed to be row-oriented at all, > and the binary read certainly isn't. I'm just trying to make this easy > -- though it's not turning out that way! > > ?> But I agree that you can have an option >> to say treat '\n' or '\r' as a delimiter but I think it should be >> turned off by default. > > that's what I've done. > >> You should have a corresponding value for ints because raising an >> exceptionwould be inconsistent with allowing floats to have a value. > > I'm not sure I care, really -- but I think having the user specify the > fill value is the best option, anyway. > > josef.pktd at gmail.com wrote: >>>> none -- exactly why I think \n is a special case. >>> What about '\r' and '\n\r'? >> >> Yes, I forgot about this, and it will be the most common case for >> Windows users like myself. >> >> I think \r should be stripped automatically, like in non-binary >> reading of files in python. 
> > except for folks like me that have old mac files laying around...so I > want this like "Universal newlines" support. > >> A warning would be good, but doing np.any(np.isnan(x)) or >> np.isnan(x).sum() on the result is always a good idea for a user when >> missing values are possibility. > > right, but the issue is the user has to know that they are possible, and > we all know how carefully we all read docs! > > Thanks for your input -- I think I know what I'd like to do, but it's > proving less than trivial to do it, so we'll see. > > In short: > > 1) optionally allow newlines to serve as a delimiter, so large tables > can be read. > > 2) raise an exception for missing values, unless: > ? 3) the user specifies a fill value of their choice (compatible with > the chosen data type. > > > -Chris > > I fully agree with your approach! Thanks for considering my thoughts! Bruce From charlesr.harris at gmail.com Fri Jan 8 20:29:46 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 8 Jan 2010 18:29:46 -0700 Subject: [Numpy-discussion] Stupid question (at least coming from me it is) In-Reply-To: <45d1ab481001081619g3afb1af4t275631e293795e55@mail.gmail.com> References: <45d1ab481001081313w5a9feb62xf1355ed92e43189b@mail.gmail.com> <45d1ab481001081619g3afb1af4t275631e293795e55@mail.gmail.com> Message-ID: On Fri, Jan 8, 2010 at 5:19 PM, David Goldsmith wrote: > On Fri, Jan 8, 2010 at 1:40 PM, Charles R Harris > wrote: > > > > chebyshev.chebdomain is the default chebyshev domain and is [-1,1]. Maybe > it > > needs a bettter name? Note that it is integer; that isn't required, but > it > > makes it compatible with other types like Decimal that don't mix with > > floats. Another possibility is to make it a function so I can document > it, > > That's "the problem" I'm working on. :-) > > There are four variables defined # Chebyshev default domain. chebdomain = np.array([-1,1]) # Chebyshev coefficients representing zero. chebzero = np.array([0]) # Chebyshev coefficients representing one. chebone = np.array([1]) # Chebyshev coefficients representing the identity x. chebx = np.array([0,1]) And corresponding ones in the polynomial module. I can make them all functions if that would help, I thought of doing that in the first place... Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From d.l.goldsmith at gmail.com Fri Jan 8 23:37:37 2010 From: d.l.goldsmith at gmail.com (David Goldsmith) Date: Fri, 8 Jan 2010 20:37:37 -0800 Subject: [Numpy-discussion] Stupid question (at least coming from me it is) In-Reply-To: References: <45d1ab481001081313w5a9feb62xf1355ed92e43189b@mail.gmail.com> <45d1ab481001081619g3afb1af4t275631e293795e55@mail.gmail.com> Message-ID: <45d1ab481001082037o4f8bb7dfnf43676c98c4056a3@mail.gmail.com> On Fri, Jan 8, 2010 at 5:29 PM, Charles R Harris wrote: > > On Fri, Jan 8, 2010 at 5:19 PM, David Goldsmith > wrote: >> >> On Fri, Jan 8, 2010 at 1:40 PM, Charles R Harris >> wrote: >> > >> > chebyshev.chebdomain is the default chebyshev domain and is [-1,1]. >> > Maybe it >> > needs a bettter name? Note that it is integer; that isn't required, but >> > it >> > makes it compatible with other types like Decimal that don't mix with >> > floats. Another possibility is to make it a function so I can document >> > it, >> >> That's "the problem" I'm working on. :-) >> > > There are four variables defined > > # Chebyshev default domain. > chebdomain = np.array([-1,1]) > > # Chebyshev coefficients representing zero. 
> chebzero = np.array([0]) > > # Chebyshev coefficients representing one. > chebone = np.array([1]) > > # Chebyshev coefficients representing the identity x. > chebx = np.array([0,1]) > > And corresponding ones in the polynomial module. I can make them all > functions if that would help, I thought of doing that in the first place... Well, I'm documenting them at the module level, which is where some doc on them already exists (I'm just embellishing a little for increased clarity) and what I _think_ "we" agreed on as "what to do" to document constants, so I don't need/want you to ("promote" them, that is), but if you decide that you want to do it, I'm neutral (unless it would hurt performance of course). Thanks for the additional info, DG From pav at iki.fi Sat Jan 9 07:44:03 2010 From: pav at iki.fi (Pauli Virtanen) Date: Sat, 09 Jan 2010 14:44:03 +0200 Subject: [Numpy-discussion] fromfile() for reading text (one more time!) In-Reply-To: <4B47BBD8.5010906@noaa.gov> References: <4B42905A.4080105@noaa.gov> <1262721695.5107.1.camel@idol> <4B463F37.4010108@noaa.gov> <1cd32cbb1001071232l6c3d3525g4ec4747d62d998ed@mail.gmail.com> <4B465605.3010406@noaa.gov> <4B47BBD8.5010906@noaa.gov> Message-ID: <1263041042.4144.3.camel@idol> pe, 2010-01-08 kello 15:12 -0800, Christopher Barker kirjoitti: > 1) optionally allow newlines to serve as a delimiter, so large tables > can be read. I don't really like handling newlines specially. For instance, I could have data like 1, 2, 3; 4, 5, 6; 7, 8, 9; Allowing an "alternative separator" would sound better to me. The above data could then be read like fromfile('foo.txt', sep=' , ', sep2=' ; ') or perhaps fromfile('foo.txt', sep=[' , ', ' ; ']) Since whitespace matches also newlines, this would work. Pauli From efiring at hawaii.edu Sat Jan 9 14:30:48 2010 From: efiring at hawaii.edu (Eric Firing) Date: Sat, 09 Jan 2010 09:30:48 -1000 Subject: [Numpy-discussion] mvoid test error with svn Message-ID: <4B48D968.50104@hawaii.edu> Building numpy from svn and then running numpy.test(), I get the following error: ERROR: Test filled w/ mvoid ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/local/lib/python2.6/dist-packages/numpy/ma/tests/test_core.py", line 506, in test_filled_w_mvoid a = mvoid(np.array((1, 2)), mask=[(0, 1)], dtype=ndtype) File "/usr/local/lib/python2.6/dist-packages/numpy/ma/core.py", line 5453, in __new__ _data = ndarray.__new__(self, (), dtype=dtype, buffer=data.data) TypeError: buffer is too small for requested array ---------------------------------------------------------------------- Ran 2505 tests in 10.478s FAILED (KNOWNFAIL=5, SKIP=4, errors=1) In [6]:numpy.version.version Out[6]:'1.5.0.dev8040' In [7]:!uname -a Linux manini 2.6.31-17-generic #54-Ubuntu SMP Thu Dec 10 16:20:31 UTC 2009 i686 GNU/Linux Eric From charlesr.harris at gmail.com Sat Jan 9 14:57:19 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 9 Jan 2010 12:57:19 -0700 Subject: [Numpy-discussion] mvoid test error with svn In-Reply-To: <4B48D968.50104@hawaii.edu> References: <4B48D968.50104@hawaii.edu> Message-ID: On Sat, Jan 9, 2010 at 12:30 PM, Eric Firing wrote: > Building numpy from svn and then running numpy.test(), I get the > following error: > > ERROR: Test filled w/ mvoid > ---------------------------------------------------------------------- > Traceback (most recent call last): > File > "/usr/local/lib/python2.6/dist-packages/numpy/ma/tests/test_core.py", > line 506, in 
test_filled_w_mvoid > a = mvoid(np.array((1, 2)), mask=[(0, 1)], dtype=ndtype) > File "/usr/local/lib/python2.6/dist-packages/numpy/ma/core.py", line > 5453, in __new__ > _data = ndarray.__new__(self, (), dtype=dtype, buffer=data.data) > TypeError: buffer is too small for requested array > > ---------------------------------------------------------------------- > Ran 2505 tests in 10.478s > > FAILED (KNOWNFAIL=5, SKIP=4, errors=1) > > > In [6]:numpy.version.version > Out[6]:'1.5.0.dev8040' > > In [7]:!uname -a > Linux manini 2.6.31-17-generic #54-Ubuntu SMP Thu Dec 10 16:20:31 UTC > 2009 i686 GNU/Linux > > > There is already a ticket for this: http://projects.scipy.org/numpy/ticket/1346 Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From Chris.Barker at noaa.gov Sat Jan 9 20:32:48 2010 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Sat, 09 Jan 2010 17:32:48 -0800 Subject: [Numpy-discussion] fromfile() for reading text (one more time!) In-Reply-To: <1263041042.4144.3.camel@idol> References: <4B42905A.4080105@noaa.gov> <1262721695.5107.1.camel@idol> <4B463F37.4010108@noaa.gov> <1cd32cbb1001071232l6c3d3525g4ec4747d62d998ed@mail.gmail.com> <4B465605.3010406@noaa.gov> <4B47BBD8.5010906@noaa.gov> <1263041042.4144.3.camel@idol> Message-ID: <4B492E40.6010002@noaa.gov> Pauli Virtanen wrote: > I don't really like handling newlines specially. For instance, I could > have data like > > 1, 2, 3; > 4, 5, 6; > 7, 8, 9; > > Allowing an "alternative separator" would sound better to me. The above > data could then be read like > > fromfile('foo.txt', sep=' , ', sep2=' ; ') > > or perhaps > > fromfile('foo.txt', sep=[' , ', ' ; ']) I like this syntax better, but: 1) Yes you "could" have data like that, but do you? I've never seen it. Maybe other have. 2) if you did, it would probably indicate something the user would want reserved, like the shape of the array. And newlines really are a special case -- they have a special meaning, and they are very, very common (universal, even)! So, it's just more code than I'm probably going to write. If someone does want to write more code than I do, it would probably make sense to do what someone suggested in the ticket: write a optimized version of loadtxt in C. Anyway. I'll think about it when I poke at the code more. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From denis-bz-py at t-online.de Sun Jan 10 11:34:27 2010 From: denis-bz-py at t-online.de (denis) Date: Sun, 10 Jan 2010 17:34:27 +0100 Subject: [Numpy-discussion] Behaviour of vdot(array2d, array1d) In-Reply-To: <90626729-8117-4C72-9509-E1BE7D7F7933@gmx.de> References: <90626729-8117-4C72-9509-E1BE7D7F7933@gmx.de> Message-ID: On 07/01/2010 18:51, Nikolas Tezak wrote: > However when I do this, vdot raises a ValueError complaining that the > "vectors have different lengths". Nikolas, looks like a bug, in numpy 1.4 on mac ppc too. Use dot instead -- import numpy as np x = 1j * np.ones(( 2, 3 )) y = np.ones( 3 ) try: print "vdot:", np.vdot( x, y ) except ValueError, e: print "ValueError:", e print "dot:", np.dot( x.conj(), y ) numpy core/numeric.py has from _dotblas import dot, vdot but I don't know how to use testing/... for blas, nor how to log a ticket -- experts please advise ? 
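If it helps for a ticket, a self-contained check could be something like this (just a sketch, expected value worked out by hand):

import numpy as np
from numpy.testing import assert_array_almost_equal

x = 1j * np.ones(( 2, 3 ))
y = np.ones( 3 )
# dot with an explicit conjugate is the workaround above;
# conj(x) is all -1j, so each row dotted with ones(3) gives -3j
assert_array_almost_equal( np.dot( x.conj(), y ), np.array([ -3j, -3j ]) )
# np.vdot( x, y ) currently raises ValueError: vectors have different lengths
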
cheers -- denis From tjhnson at gmail.com Sun Jan 10 17:11:06 2010 From: tjhnson at gmail.com (T J) Date: Sun, 10 Jan 2010 14:11:06 -0800 Subject: [Numpy-discussion] Uninformative Error Message-ID: When passing in a list of longs and asking that the dtype be a float (yes, losing precision), the error message is uninformative whenever the long is larger than the largest float. >>> x = 181626642333486640664316511479918087634811756599984861278481913634852446858952226941059178462566942027148832976486383692715763966132465634039844094073670028044755150133224694791817752891901042496950233943249209777416692569138779593594686170807571874640682826295728116325492852625325418526603207268018328608840 >>> array(x, dtype=float) OverflowError: long int too large to convert to float >>> array([x], dtype=float) ValueError: setting an array element with a sequence. The first error is informative, but the second is not and will occur anytime one tries to convert a python list containing longs which are too long. Is there a way this error message could be made more helpful? From timmichelsen at gmx-topmail.de Mon Jan 11 05:10:33 2010 From: timmichelsen at gmx-topmail.de (Tim Michelsen) Date: Mon, 11 Jan 2010 10:10:33 +0000 (UTC) Subject: [Numpy-discussion] numpy1.4 dtype issues: scipy.stats & pytables Message-ID: Hello, I experienced the following issue with numpy 1.4: scipy.stats: Python 2.6.2 (r262:71605, Apr 14 2009, 22:40:02) [MSC v.1500 32 bit (Intel)] on win32 import scipy.stats as st Traceback (most recent call last): File "", line 1, in File "C:\Python26\lib\site-packages\scipy\stats\__init__.py", line 7, in from stats import * File "C:\Python26\lib\site-packages\scipy\stats\stats.py", line 203, in from morestats import find_repeats #is only reference to scipy.stats File "C:\Python26\lib\site-packages\scipy\stats\morestats.py", line 7, in import distributions File "C:\Python26\lib\site-packages\scipy\stats\distributions.py", line 27, in import vonmises_cython File "numpy.pxd", line 30, in scipy.stats.vonmises_cython (scipy\stats\vonmises_cython.c:2939) ValueError: numpy.dtype does not appear to be the correct type object pytables: import tables Traceback (most recent call last): File "", line 1, in File "C:\Python26\lib\site-packages\tables\__init__.py", line 56, in from tables.utilsExtension import getPyTablesVersion, getHDF5Version File "definitions.pxd", line 138, in tables.utilsExtension ValueError: numpy.dtype does not appear to be the correct type object Is this an error in numpy or no the other packages require update in the code? Thanks, Timmie From pgmdevlist at gmail.com Mon Jan 11 05:54:23 2010 From: pgmdevlist at gmail.com (Pierre GM) Date: Mon, 11 Jan 2010 05:54:23 -0500 Subject: [Numpy-discussion] numpy1.4 dtype issues: scipy.stats & pytables In-Reply-To: References: Message-ID: <3D845F44-C5F5-4F56-8233-E97C2D5A6CE2@gmail.com> On Jan 11, 2010, at 5:10 AM, Tim Michelsen wrote: > Hello, > I experienced the following issue with numpy 1.4: > ... > > Is this an error in numpy or no the other packages require update in the code? Let me guess, you just recently updated numpy ? I'd bet ybut forgot to recompile scipy and pytables... From ndbecker2 at gmail.com Mon Jan 11 09:35:51 2010 From: ndbecker2 at gmail.com (Neal Becker) Date: Mon, 11 Jan 2010 09:35:51 -0500 Subject: [Numpy-discussion] savetxt only saves real part of complex Message-ID: Is this a bug? I think silently discarding the imaginary part is a bug. 
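A small example of what I mean, plus one possible workaround (write real/imag as two float columns and recombine on load) -- untested sketch:

import numpy as np

z = np.array([ 1+2j, 3+4j, 5+6j ])
np.savetxt( 'z.txt', z )   # the case I mean: only the real parts land in the file

# workaround: two float columns, recombined after loadtxt
np.savetxt( 'z.txt', np.column_stack(( z.real, z.imag )) )
zr, zi = np.loadtxt( 'z.txt', unpack=True )
z2 = zr + 1j*zi
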
From denis-bz-py at t-online.de Mon Jan 11 11:56:55 2010 From: denis-bz-py at t-online.de (denis) Date: Mon, 11 Jan 2010 17:56:55 +0100 Subject: [Numpy-discussion] numpy1.4 dtype issues: scipy.stats & pytables In-Reply-To: <3D845F44-C5F5-4F56-8233-E97C2D5A6CE2@gmail.com> References: <3D845F44-C5F5-4F56-8233-E97C2D5A6CE2@gmail.com> Message-ID: Only 2 of the 21 top-level subpackages draw that warning with numpy-1.4.0-py2.6-python.org.dmg scipy-0.7.1-py2.6-python.org.dmg on my mac 10.4 ppc, python 2.6.4: try: import scipy.cluster except ValueError, e: print "scipy.cluster error", e try: import scipy.constants except ValueError, e: print "scipy.constants error", e ... scipy.cluster error numpy.dtype does not appear to be the correct type object .../linsolve/__init__.py:4: DeprecationWarning: scipy.linsolve has moved to scipy.sparse.linalg.dsolve warn('scipy.linsolve has moved to scipy.sparse.linalg.dsolve', DeprecationWarning) scipy.stats error numpy.dtype does not appear to be the correct type object cheers -- denis From josef.pktd at gmail.com Mon Jan 11 12:10:24 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 11 Jan 2010 12:10:24 -0500 Subject: [Numpy-discussion] numpy1.4 dtype issues: scipy.stats & pytables In-Reply-To: References: <3D845F44-C5F5-4F56-8233-E97C2D5A6CE2@gmail.com> Message-ID: <1cd32cbb1001110910r2cde5bdgac72e7d176ead46a@mail.gmail.com> On Mon, Jan 11, 2010 at 11:56 AM, denis wrote: > Only 2 of the 21 top-level subpackages draw that warning > with numpy-1.4.0-py2.6-python.org.dmg > scipy-0.7.1-py2.6-python.org.dmg > on my mac 10.4 ppc, python 2.6.4: > > try: > ? ? import scipy.cluster > except ValueError, e: > ? ? print "scipy.cluster error", e > try: > ? ? import scipy.constants > except ValueError, e: > ? ? print "scipy.constants error", e > ... > > scipy.cluster error numpy.dtype does not appear to be the correct type object > .../linsolve/__init__.py:4: DeprecationWarning: scipy.linsolve has moved to scipy.sparse.linalg.dsolve > ? warn('scipy.linsolve has moved to scipy.sparse.linalg.dsolve', DeprecationWarning) > scipy.stats error numpy.dtype does not appear to be the correct type object For this problem, it's supposed to be only those packages that have or import cython generated code. Josef > > cheers > ? -- denis > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From andyjian430074 at gmail.com Mon Jan 11 18:44:59 2010 From: andyjian430074 at gmail.com (Jankins) Date: Mon, 11 Jan 2010 17:44:59 -0600 Subject: [Numpy-discussion] TypeError: 'module' object is not callable Message-ID: <4B4BB7FB.6060004@gmail.com> Hello, I want to use scipy.sparse.linalg.eigen function, but it keeps popping out error message: TypeError: 'module' object is not callable "eigen" is a module, but it has "__call__" method. Why couldn't I call scipy.sparse.linalg.eigen(...)? Thanks. 
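For reference, a stripped-down sketch of the call (hypothetical matrix, no networkx involved) looks like:

import scipy.sparse as sp
import scipy.sparse.linalg as linalg

M = sp.eye( 9, 9 )
print linalg.eigen( M )   # this is the call that gives the TypeError
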
Jankins From robert.kern at gmail.com Mon Jan 11 18:49:11 2010 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 11 Jan 2010 17:49:11 -0600 Subject: [Numpy-discussion] TypeError: 'module' object is not callable In-Reply-To: <4B4BB7FB.6060004@gmail.com> References: <4B4BB7FB.6060004@gmail.com> Message-ID: <3d375d731001111549w6f63a4d9h23b4d7ca993fae4f@mail.gmail.com> On Mon, Jan 11, 2010 at 17:44, Jankins wrote: > Hello, > > I want to use scipy.sparse.linalg.eigen function, but it keeps popping > out error message: > TypeError: 'module' object is not callable > > "eigen" is a module, but it has "__call__" method. Why couldn't I call > scipy.sparse.linalg.eigen(...)? Please show the complete code and the complete traceback. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From andyjian430074 at gmail.com Mon Jan 11 19:03:46 2010 From: andyjian430074 at gmail.com (Jankins) Date: Mon, 11 Jan 2010 18:03:46 -0600 Subject: [Numpy-discussion] TypeError: 'module' object is not callable In-Reply-To: <3d375d731001111549w6f63a4d9h23b4d7ca993fae4f@mail.gmail.com> References: <4B4BB7FB.6060004@gmail.com> <3d375d731001111549w6f63a4d9h23b4d7ca993fae4f@mail.gmail.com> Message-ID: <4B4BBC62.6080002@gmail.com> It is very simple code: import networkx as nx import scipy.sparse.linalg as linalg G = nx.Graph() G.add_star(range(9)) M= nx.to_scipy_sparse_matrix(G) print linalg.eigen(M) Thanks. Jankins On 1/11/2010 5:49 PM, Robert Kern wrote: > On Mon, Jan 11, 2010 at 17:44, Jankins wrote: > >> Hello, >> >> I want to use scipy.sparse.linalg.eigen function, but it keeps popping >> out error message: >> TypeError: 'module' object is not callable >> >> "eigen" is a module, but it has "__call__" method. Why couldn't I call >> scipy.sparse.linalg.eigen(...)? >> > Please show the complete code and the complete traceback. > > From Chris.Barker at noaa.gov Mon Jan 11 19:11:26 2010 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Mon, 11 Jan 2010 16:11:26 -0800 Subject: [Numpy-discussion] fromfile() -- aarrgg! In-Reply-To: References: <4B42905A.4080105@noaa.gov> <1262721695.5107.1.camel@idol> <4B463F37.4010108@noaa.gov> <1cd32cbb1001071232l6c3d3525g4ec4747d62d998ed@mail.gmail.com> <4B465605.3010406@noaa.gov> <1cd32cbb1001071515p498c8746u5dce34453346c97f@mail.gmail.com> <4B466E4C.10504@noaa.gov> <4B46889E.3020008@noaa.gov> Message-ID: <4B4BBE2E.9090706@noaa.gov> Pauli Virtanen wrote: > Thu, 07 Jan 2010 17:21:34 -0800, Christopher Barker wrote: > [clip] >> It does pass on that return value, but, from ctors.c: >> >> fromfile_next_element(FILE **fp, void *dptr, PyArray_Descr *dtype, >> void *NPY_UNUSED(stream_data)) >> { >> /* the NULL argument is for backwards-compatibility */ return >> dtype->f->scanfunc(*fp, dptr, NULL, dtype); >> } > > This functions is IMHO where the fix should go; I believe it should do > something like > > return (ret == 0 || ret == EOF) ? -1 : ret; OK, more digging (and printf debugging -- I really need to learn to debug C extensions properly). I've found the deeper issue: NumPyOS_ascii_strtod returns a 0.0 when given invalid input, such as " ,". That's why fromstring is putting in a 0.0 for empty (and invalid) fields. Diggin into NumPyOS_ascii_strtod(), it looks like it is simply a wrapper around PyOS_ascii_strtod(), that checks for NaN and Inf first (and somethign with teh decimal point, I dont' quite get). 
But anyway, if it is a regular old number, it gets passed of to PyOS_ascii_strtod(), which isn't outragiously well documented (no, I havne't gone to the source, yet), but is similar to the C stdlib srtod(), which says: "If no conversion is performed, zero is returned and the value of nptr is stored in the location referenced by endptr." off do do some more testing, but I guess that means that those pointers need to be checked after the call, to see if a conversion was generated. Am I right? -Chris PS: Boy, this is a pain! -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From robert.kern at gmail.com Mon Jan 11 19:12:30 2010 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 11 Jan 2010 18:12:30 -0600 Subject: [Numpy-discussion] TypeError: 'module' object is not callable In-Reply-To: <4B4BBC62.6080002@gmail.com> References: <4B4BB7FB.6060004@gmail.com> <3d375d731001111549w6f63a4d9h23b4d7ca993fae4f@mail.gmail.com> <4B4BBC62.6080002@gmail.com> Message-ID: <3d375d731001111612g71158707sb68a3b133a67473c@mail.gmail.com> On Mon, Jan 11, 2010 at 18:03, Jankins wrote: > It is very simple code: > > import networkx as nx > import scipy.sparse.linalg as linalg > > G = nx.Graph() > G.add_star(range(9)) > M= nx.to_scipy_sparse_matrix(G) > print linalg.eigen(M) > > Thanks. Please post the complete traceback. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From andyjian430074 at gmail.com Mon Jan 11 19:16:12 2010 From: andyjian430074 at gmail.com (Jankins) Date: Mon, 11 Jan 2010 18:16:12 -0600 Subject: [Numpy-discussion] TypeError: 'module' object is not callable In-Reply-To: <3d375d731001111612g71158707sb68a3b133a67473c@mail.gmail.com> References: <4B4BB7FB.6060004@gmail.com> <3d375d731001111549w6f63a4d9h23b4d7ca993fae4f@mail.gmail.com> <4B4BBC62.6080002@gmail.com> <3d375d731001111612g71158707sb68a3b133a67473c@mail.gmail.com> Message-ID: <4B4BBF4C.5050105@gmail.com> I am sorry. My bad. File "C:\test.py", line 7, in print linalg.eigen(M) TypeError: 'module' object is not callable I installed "pythonxy". "pythonxy" has already included the scipy package. On 1/11/2010 6:12 PM, Robert Kern wrote: > On Mon, Jan 11, 2010 at 18:03, Jankins wrote: > >> It is very simple code: >> >> import networkx as nx >> import scipy.sparse.linalg as linalg >> >> G = nx.Graph() >> G.add_star(range(9)) >> M= nx.to_scipy_sparse_matrix(G) >> print linalg.eigen(M) >> >> Thanks. >> > Please post the complete traceback. > > From josef.pktd at gmail.com Mon Jan 11 20:53:30 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 11 Jan 2010 20:53:30 -0500 Subject: [Numpy-discussion] TypeError: 'module' object is not callable In-Reply-To: <4B4BBF4C.5050105@gmail.com> References: <4B4BB7FB.6060004@gmail.com> <3d375d731001111549w6f63a4d9h23b4d7ca993fae4f@mail.gmail.com> <4B4BBC62.6080002@gmail.com> <3d375d731001111612g71158707sb68a3b133a67473c@mail.gmail.com> <4B4BBF4C.5050105@gmail.com> Message-ID: <1cd32cbb1001111753n701e063et3bb85d401babc199@mail.gmail.com> On Mon, Jan 11, 2010 at 7:16 PM, Jankins wrote: > I am sorry. My bad. > > ? File "C:\test.py", line 7, in > ? ? print linalg.eigen(M) > TypeError: 'module' object is not callable > > I installed "pythonxy". 
"pythonxy" has already included the scipy package. > > On 1/11/2010 6:12 PM, Robert Kern wrote: >> On Mon, Jan 11, 2010 at 18:03, Jankins ?wrote: >> >>> It is very simple code: >>> >>> import networkx as nx >>> import scipy.sparse.linalg as linalg >>> >>> G = nx.Graph() >>> G.add_star(range(9)) >>> M= nx.to_scipy_sparse_matrix(G) >>> print linalg.eigen(M) >>> >>> Thanks. >>> >> Please post the complete traceback. eigen is both a function and a module. Normally the function shadows the module >python Python 2.5.2 (r252:60911, Feb 21 2008, 13:11:45) [MSC v.1310 32 bit (Intel)] on win32 Type "help", "copyright", "credits" or "license" for more information. >>> import scipy.sparse.linalg.eigen >>> scipy.sparse.linalg.eigen I'm not able to import the eigen module, so there is either something different with python 2.6 or networkx is doing some magic ? Can you try without networkx, try linalg.eigen.eigen ? Does >>> scipy.sparse.linalg.eigen show the module or the function? Josef PS: I don't like functions shadowing a module >> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From andyjian430074 at gmail.com Mon Jan 11 21:03:35 2010 From: andyjian430074 at gmail.com (Jankins) Date: Mon, 11 Jan 2010 20:03:35 -0600 Subject: [Numpy-discussion] TypeError: 'module' object is not callable In-Reply-To: <1cd32cbb1001111753n701e063et3bb85d401babc199@mail.gmail.com> References: <4B4BB7FB.6060004@gmail.com> <3d375d731001111549w6f63a4d9h23b4d7ca993fae4f@mail.gmail.com> <4B4BBC62.6080002@gmail.com> <3d375d731001111612g71158707sb68a3b133a67473c@mail.gmail.com> <4B4BBF4C.5050105@gmail.com> <1cd32cbb1001111753n701e063et3bb85d401babc199@mail.gmail.com> Message-ID: <4B4BD877.3090509@gmail.com> Here is the command line python: >>> import scipy.sparse.linalg as linalg >>> >>> linalg.eigen() Traceback (most recent call last): File "", line 1, in TypeError: 'module' object is not callable >>> It's really wired. Jankins On 1/11/2010 7:53 PM, josef.pktd at gmail.com wrote: > On Mon, Jan 11, 2010 at 7:16 PM, Jankins wrote: > >> I am sorry. My bad. >> >> File "C:\test.py", line 7, in >> print linalg.eigen(M) >> TypeError: 'module' object is not callable >> >> I installed "pythonxy". "pythonxy" has already included the scipy package. >> >> On 1/11/2010 6:12 PM, Robert Kern wrote: >> >>> On Mon, Jan 11, 2010 at 18:03, Jankins wrote: >>> >>> >>>> It is very simple code: >>>> >>>> import networkx as nx >>>> import scipy.sparse.linalg as linalg >>>> >>>> G = nx.Graph() >>>> G.add_star(range(9)) >>>> M= nx.to_scipy_sparse_matrix(G) >>>> print linalg.eigen(M) >>>> >>>> Thanks. >>>> >>>> >>> Please post the complete traceback. >>> > eigen is both a function and a module. Normally the function shadows the module > > >> python >> > Python 2.5.2 (r252:60911, Feb 21 2008, 13:11:45) [MSC v.1310 32 bit (Intel)] on > win32 > Type "help", "copyright", "credits" or "license" for more information. > >>>> import scipy.sparse.linalg.eigen >>>> scipy.sparse.linalg.eigen >>>> > > > I'm not able to import the eigen module, so there is either something > different with python 2.6 or networkx is doing some magic ? > > Can you try without networkx, try linalg.eigen.eigen ? > > Does > >>>> scipy.sparse.linalg.eigen >>>> > show the module or the function? 
> > Josef > > PS: I don't like functions shadowing a module > >>> >>> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From josef.pktd at gmail.com Mon Jan 11 21:55:29 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 11 Jan 2010 21:55:29 -0500 Subject: [Numpy-discussion] TypeError: 'module' object is not callable In-Reply-To: <4B4BD877.3090509@gmail.com> References: <4B4BB7FB.6060004@gmail.com> <3d375d731001111549w6f63a4d9h23b4d7ca993fae4f@mail.gmail.com> <4B4BBC62.6080002@gmail.com> <3d375d731001111612g71158707sb68a3b133a67473c@mail.gmail.com> <4B4BBF4C.5050105@gmail.com> <1cd32cbb1001111753n701e063et3bb85d401babc199@mail.gmail.com> <4B4BD877.3090509@gmail.com> Message-ID: <1cd32cbb1001111855r2b45ccden3d43030e08fb0d4d@mail.gmail.com> On Mon, Jan 11, 2010 at 9:03 PM, Jankins wrote: > Here is the command line python: > > ?>>> import scipy.sparse.linalg as linalg > ?>>> > ?>>> linalg.eigen() > Traceback (most recent call last): > ? File "", line 1, in > TypeError: 'module' object is not callable > ?>>> linalg.eigen.eigen ? Is your working directory inside scipy ? I have no idea, since I'm not able not to get the function, and your information is a bit minimal. Josef > > It's really wired. > > Jankins > > On 1/11/2010 7:53 PM, josef.pktd at gmail.com wrote: >> On Mon, Jan 11, 2010 at 7:16 PM, Jankins ?wrote: >> >>> I am sorry. My bad. >>> >>> ? ?File "C:\test.py", line 7, in >>> ? ? ?print linalg.eigen(M) >>> TypeError: 'module' object is not callable >>> >>> I installed "pythonxy". "pythonxy" has already included the scipy package. >>> >>> On 1/11/2010 6:12 PM, Robert Kern wrote: >>> >>>> On Mon, Jan 11, 2010 at 18:03, Jankins ? ?wrote: >>>> >>>> >>>>> It is very simple code: >>>>> >>>>> import networkx as nx >>>>> import scipy.sparse.linalg as linalg >>>>> >>>>> G = nx.Graph() >>>>> G.add_star(range(9)) >>>>> M= nx.to_scipy_sparse_matrix(G) >>>>> print linalg.eigen(M) >>>>> >>>>> Thanks. >>>>> >>>>> >>>> Please post the complete traceback. >>>> >> eigen is both a function and a module. Normally the function shadows the module >> >> >>> python >>> >> Python 2.5.2 (r252:60911, Feb 21 2008, 13:11:45) [MSC v.1310 32 bit (Intel)] on >> win32 >> Type "help", "copyright", "credits" or "license" for more information. >> >>>>> import scipy.sparse.linalg.eigen >>>>> scipy.sparse.linalg.eigen >>>>> >> >> >> I'm not able to import the eigen module, so there is either something >> different with python 2.6 or networkx is doing some magic ? >> >> Can you try without networkx, try linalg.eigen.eigen ? >> >> Does >> >>>>> scipy.sparse.linalg.eigen >>>>> >> show the module or the function? 
>> >> Josef >> >> PS: I don't like functions shadowing a module >> >>>> >>>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >>> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From andyjian430074 at gmail.com Mon Jan 11 22:31:47 2010 From: andyjian430074 at gmail.com (Jankins) Date: Mon, 11 Jan 2010 21:31:47 -0600 Subject: [Numpy-discussion] TypeError: 'module' object is not callable In-Reply-To: <1cd32cbb1001111855r2b45ccden3d43030e08fb0d4d@mail.gmail.com> References: <4B4BB7FB.6060004@gmail.com> <3d375d731001111549w6f63a4d9h23b4d7ca993fae4f@mail.gmail.com> <4B4BBC62.6080002@gmail.com> <3d375d731001111612g71158707sb68a3b133a67473c@mail.gmail.com> <4B4BBF4C.5050105@gmail.com> <1cd32cbb1001111753n701e063et3bb85d401babc199@mail.gmail.com> <4B4BD877.3090509@gmail.com> <1cd32cbb1001111855r2b45ccden3d43030e08fb0d4d@mail.gmail.com> Message-ID: <4B4BED23.4080109@gmail.com> linalg has no attribute "eigen". Are you able to use scipy.sparse.linalg.eigen? My working dir is not inside scipy. It is 'C:\\Users\\jankins'. I am using Python 2.6.2 and the latest version of scipy. What should I do? And I couldn't even successfully install scipy in Ubuntu 9.10 neither by "easy_install" or "source compilation". I am so desperate. I planed to use the function to calculate the eigenvalue of a graph.The graph has about 265,214 nodes and 420,045 edges. So it's better to use sparse matrix. Jankins On 1/11/2010 8:55 PM, josef.pktd at gmail.com wrote: > On Mon, Jan 11, 2010 at 9:03 PM, Jankins wrote: > >> Here is the command line python: >> >> >>> import scipy.sparse.linalg as linalg >> >>> >> >>> linalg.eigen() >> Traceback (most recent call last): >> File "", line 1, in >> TypeError: 'module' object is not callable >> >>> >> > linalg.eigen.eigen ? > > Is your working directory inside scipy ? > > I have no idea, since I'm not able not to get the function, and your > information is a bit minimal. > > Josef > > >> It's really wired. >> >> Jankins >> >> On 1/11/2010 7:53 PM, josef.pktd at gmail.com wrote: >> >>> On Mon, Jan 11, 2010 at 7:16 PM, Jankins wrote: >>> >>> >>>> I am sorry. My bad. >>>> >>>> File "C:\test.py", line 7, in >>>> print linalg.eigen(M) >>>> TypeError: 'module' object is not callable >>>> >>>> I installed "pythonxy". "pythonxy" has already included the scipy package. >>>> >>>> On 1/11/2010 6:12 PM, Robert Kern wrote: >>>> >>>> >>>>> On Mon, Jan 11, 2010 at 18:03, Jankins wrote: >>>>> >>>>> >>>>> >>>>>> It is very simple code: >>>>>> >>>>>> import networkx as nx >>>>>> import scipy.sparse.linalg as linalg >>>>>> >>>>>> G = nx.Graph() >>>>>> G.add_star(range(9)) >>>>>> M= nx.to_scipy_sparse_matrix(G) >>>>>> print linalg.eigen(M) >>>>>> >>>>>> Thanks. >>>>>> >>>>>> >>>>>> >>>>> Please post the complete traceback. >>>>> >>>>> >>> eigen is both a function and a module. Normally the function shadows the module >>> >>> >>> >>>> python >>>> >>>> >>> Python 2.5.2 (r252:60911, Feb 21 2008, 13:11:45) [MSC v.1310 32 bit (Intel)] on >>> win32 >>> Type "help", "copyright", "credits" or "license" for more information. 
>>> >>> >>>>>> import scipy.sparse.linalg.eigen >>>>>> scipy.sparse.linalg.eigen >>>>>> >>>>>> >>> >>> >>> I'm not able to import the eigen module, so there is either something >>> different with python 2.6 or networkx is doing some magic ? >>> >>> Can you try without networkx, try linalg.eigen.eigen ? >>> >>> Does >>> >>> >>>>>> scipy.sparse.linalg.eigen >>>>>> >>>>>> >>> show the module or the function? >>> >>> Josef >>> >>> PS: I don't like functions shadowing a module >>> >>> >>>>> >>>>> >>>> _______________________________________________ >>>> NumPy-Discussion mailing list >>>> NumPy-Discussion at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>>> >>>> >>>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >>> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From josef.pktd at gmail.com Mon Jan 11 22:45:29 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 11 Jan 2010 22:45:29 -0500 Subject: [Numpy-discussion] TypeError: 'module' object is not callable In-Reply-To: <4B4BED23.4080109@gmail.com> References: <4B4BB7FB.6060004@gmail.com> <3d375d731001111549w6f63a4d9h23b4d7ca993fae4f@mail.gmail.com> <4B4BBC62.6080002@gmail.com> <3d375d731001111612g71158707sb68a3b133a67473c@mail.gmail.com> <4B4BBF4C.5050105@gmail.com> <1cd32cbb1001111753n701e063et3bb85d401babc199@mail.gmail.com> <4B4BD877.3090509@gmail.com> <1cd32cbb1001111855r2b45ccden3d43030e08fb0d4d@mail.gmail.com> <4B4BED23.4080109@gmail.com> Message-ID: <1cd32cbb1001111945t7f185967p2ccdef6c2cb22b62@mail.gmail.com> On Mon, Jan 11, 2010 at 10:31 PM, Jankins wrote: > linalg has no attribute "eigen". You should post full tracebacks. I don't understand this error, because before eigen seemed to exist. You could run the test suite to see if the installation is ok and sparse is working correctly. >>> import scipy.sparse >>> scipy.sparse.test() which is for me: Ran 442 tests in 139.500s OK (KNOWNFAIL=4, SKIP=11) If there are installation problems, then I have no idea since I'm a (happy) Windows user. Josef > > Are you able to use scipy.sparse.linalg.eigen? > > My working dir is not inside scipy. ?It is 'C:\\Users\\jankins'. > > I am using Python 2.6.2 and the latest version of scipy. > > What should I do? And I couldn't even successfully install scipy in > Ubuntu 9.10 neither by "easy_install" or "source compilation". I am so > desperate. > > I planed to use the function to calculate the eigenvalue of a graph.The > graph has about 265,214 nodes and 420,045 edges. So it's better to use > sparse matrix. > > Jankins > > On 1/11/2010 8:55 PM, josef.pktd at gmail.com wrote: >> On Mon, Jan 11, 2010 at 9:03 PM, Jankins ?wrote: >> >>> Here is the command line python: >>> >>> ? >>> ?import scipy.sparse.linalg as linalg >>> ? >>> >>> ? >>> ?linalg.eigen() >>> Traceback (most recent call last): >>> ? ?File "", line 1, in >>> TypeError: 'module' object is not callable >>> ? >>> >>> >> linalg.eigen.eigen ? ? >> >> Is your working directory inside scipy ? >> >> I have no idea, since I'm not able not to get the function, and your >> information is a bit minimal. 
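A sketch of the kind of self-contained report that keeps these threads short: interpreter, numpy/scipy versions and install locations, plus a run of the sparse test suite (it assumes the nose package is installed, as the test runs quoted above require):

import sys
import numpy
import scipy
import scipy.sparse

print('Python: %s' % sys.version.replace('\n', ' '))
print('numpy %s from %s' % (numpy.__version__, numpy.__file__))
print('scipy %s from %s' % (scipy.__version__, scipy.__file__))

# A clean run here rules out a broken build before chasing API questions.
scipy.sparse.test()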
>> >> Josef >> >> >>> It's really wired. >>> >>> Jankins >>> >>> On 1/11/2010 7:53 PM, josef.pktd at gmail.com wrote: >>> >>>> On Mon, Jan 11, 2010 at 7:16 PM, Jankins ? ?wrote: >>>> >>>> >>>>> I am sorry. My bad. >>>>> >>>>> ? ? File "C:\test.py", line 7, in >>>>> ? ? ? print linalg.eigen(M) >>>>> TypeError: 'module' object is not callable >>>>> >>>>> I installed "pythonxy". "pythonxy" has already included the scipy package. >>>>> >>>>> On 1/11/2010 6:12 PM, Robert Kern wrote: >>>>> >>>>> >>>>>> On Mon, Jan 11, 2010 at 18:03, Jankins ? ? ?wrote: >>>>>> >>>>>> >>>>>> >>>>>>> It is very simple code: >>>>>>> >>>>>>> import networkx as nx >>>>>>> import scipy.sparse.linalg as linalg >>>>>>> >>>>>>> G = nx.Graph() >>>>>>> G.add_star(range(9)) >>>>>>> M= nx.to_scipy_sparse_matrix(G) >>>>>>> print linalg.eigen(M) >>>>>>> >>>>>>> Thanks. >>>>>>> >>>>>>> >>>>>>> >>>>>> Please post the complete traceback. >>>>>> >>>>>> >>>> eigen is both a function and a module. Normally the function shadows the module >>>> >>>> >>>> >>>>> python >>>>> >>>>> >>>> Python 2.5.2 (r252:60911, Feb 21 2008, 13:11:45) [MSC v.1310 32 bit (Intel)] on >>>> win32 >>>> Type "help", "copyright", "credits" or "license" for more information. >>>> >>>> >>>>>>> import scipy.sparse.linalg.eigen >>>>>>> scipy.sparse.linalg.eigen >>>>>>> >>>>>>> >>>> >>>> >>>> I'm not able to import the eigen module, so there is either something >>>> different with python 2.6 or networkx is doing some magic ? >>>> >>>> Can you try without networkx, try linalg.eigen.eigen ? >>>> >>>> Does >>>> >>>> >>>>>>> scipy.sparse.linalg.eigen >>>>>>> >>>>>>> >>>> show the module or the function? >>>> >>>> Josef >>>> >>>> PS: I don't like functions shadowing a module >>>> >>>> >>>>>> >>>>>> >>>>> _______________________________________________ >>>>> NumPy-Discussion mailing list >>>>> NumPy-Discussion at scipy.org >>>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>>>> >>>>> >>>>> >>>> _______________________________________________ >>>> NumPy-Discussion mailing list >>>> NumPy-Discussion at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>>> >>>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >>> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From david at silveregg.co.jp Mon Jan 11 23:33:33 2010 From: david at silveregg.co.jp (David Cournapeau) Date: Tue, 12 Jan 2010 13:33:33 +0900 Subject: [Numpy-discussion] TypeError: 'module' object is not callable In-Reply-To: <4B4BED23.4080109@gmail.com> References: <4B4BB7FB.6060004@gmail.com> <3d375d731001111549w6f63a4d9h23b4d7ca993fae4f@mail.gmail.com> <4B4BBC62.6080002@gmail.com> <3d375d731001111612g71158707sb68a3b133a67473c@mail.gmail.com> <4B4BBF4C.5050105@gmail.com> <1cd32cbb1001111753n701e063et3bb85d401babc199@mail.gmail.com> <4B4BD877.3090509@gmail.com> <1cd32cbb1001111855r2b45ccden3d43030e08fb0d4d@mail.gmail.com> <4B4BED23.4080109@gmail.com> Message-ID: <4B4BFB9D.4010403@silveregg.co.jp> Jankins wrote: > > What should I do? And I couldn't even successfully install scipy in > Ubuntu 9.10 neither by "easy_install" or "source compilation". 
I am so > desperate. Don't use easy_install, and install from sources with python setup.py install, both numpy and scipy, after having installed the following packages: sudo apt-get install gfortran python-dev libatlas-base-dev python-nose Before doing so, you should remove both the build directories (rm -rf build in your source tree) and the previously installed numpy/scipy if any (in /usr/local/lib/python2.6/site-packages/ on Ubuntu 9.10). You should then be able to test your installations doing something like: python -c "import numpy; numpy.test(); import scipy; scipy.test()" cheers, David From andyjian430074 at gmail.com Mon Jan 11 23:53:46 2010 From: andyjian430074 at gmail.com (Jankins) Date: Mon, 11 Jan 2010 22:53:46 -0600 Subject: [Numpy-discussion] TypeError: 'module' object is not callable In-Reply-To: <4B4BFB9D.4010403@silveregg.co.jp> References: <4B4BB7FB.6060004@gmail.com> <3d375d731001111549w6f63a4d9h23b4d7ca993fae4f@mail.gmail.com> <4B4BBC62.6080002@gmail.com> <3d375d731001111612g71158707sb68a3b133a67473c@mail.gmail.com> <4B4BBF4C.5050105@gmail.com> <1cd32cbb1001111753n701e063et3bb85d401babc199@mail.gmail.com> <4B4BD877.3090509@gmail.com> <1cd32cbb1001111855r2b45ccden3d43030e08fb0d4d@mail.gmail.com> <4B4BED23.4080109@gmail.com> <4B4BFB9D.4010403@silveregg.co.jp> Message-ID: <4B4C005A.6030809@gmail.com> Thanks so much. I have successfully installed scipy in Ubuntu 9.10. But I still couldn't use scipy.sparse.linalg.eigen function. The test result is : Ran 3490 tests in 40.268s FAILED (KNOWNFAIL=4, SKIP=28, failures=1) Thanks again. Jankins On 1/11/2010 10:33 PM, David Cournapeau wrote: > Jankins wrote: > > >> What should I do? And I couldn't even successfully install scipy in >> Ubuntu 9.10 neither by "easy_install" or "source compilation". I am so >> desperate. >> > Don't use easy_install, and install from sources with python setup.py > install, both numpy and scipy, after having installed the following > packages: > > sudo apt-get install gfortran python-dev libatlas-base-dev python-nose > > Before doing so, you should remove both the build directories (rm -rf > build in your source tree) and the previously installed numpy/scipy if > any (in /usr/local/lib/python2.6/site-packages/ on Ubuntu 9.10). > > You should then be able to test your installations doing something like: > > python -c "import numpy; numpy.test(); import scipy; scipy.test()" > > cheers, > > David > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From david at silveregg.co.jp Tue Jan 12 00:46:38 2010 From: david at silveregg.co.jp (David Cournapeau) Date: Tue, 12 Jan 2010 14:46:38 +0900 Subject: [Numpy-discussion] TypeError: 'module' object is not callable In-Reply-To: <4B4C005A.6030809@gmail.com> References: <4B4BB7FB.6060004@gmail.com> <3d375d731001111549w6f63a4d9h23b4d7ca993fae4f@mail.gmail.com> <4B4BBC62.6080002@gmail.com> <3d375d731001111612g71158707sb68a3b133a67473c@mail.gmail.com> <4B4BBF4C.5050105@gmail.com> <1cd32cbb1001111753n701e063et3bb85d401babc199@mail.gmail.com> <4B4BD877.3090509@gmail.com> <1cd32cbb1001111855r2b45ccden3d43030e08fb0d4d@mail.gmail.com> <4B4BED23.4080109@gmail.com> <4B4BFB9D.4010403@silveregg.co.jp> <4B4C005A.6030809@gmail.com> Message-ID: <4B4C0CBE.7080306@silveregg.co.jp> Jankins wrote: > Thanks so much. I have successfully installed scipy in Ubuntu 9.10. But > I still couldn't use scipy.sparse.linalg.eigen function. 
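For the use case quoted above -- a few eigenvalues of a sparse matrix built from a graph with roughly 265,000 nodes -- the symmetric ARPACK wrapper is the natural route. A sketch on a small diagonal stand-in with a known spectrum; the import location moved between SciPy releases (eigen_symmetric in 0.7.x, eigsh in later versions), hence the guarded import, and whether ARPACK copes with the full-size matrix is a separate question raised further down:

import numpy as np
import scipy.sparse as sp

try:
    # location used later in this thread (SciPy 0.7.x)
    from scipy.sparse.linalg.eigen.arpack import eigen_symmetric as symeig
except ImportError:
    # newer SciPy exposes the same ARPACK solver as eigsh
    from scipy.sparse.linalg import eigsh as symeig

n = 500
A = sp.csr_matrix(np.diag(np.arange(1.0, n + 1.0)))   # eigenvalues 1..500 by construction

vals, vecs = symeig(A, k=6)      # six eigenvalues of largest magnitude
print(sorted(vals))              # expect roughly [495, 496, 497, 498, 499, 500]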
Please report *exactly* the suite of commands which is failing. For example, the following works for me: import numpy as np from scipy.sparse import csr_matrix from scipy.sparse.linalg.eigen import eigen m = np.random.randn(10, 10) sm = csr_matrix(m) print eigen(sm) # Give the 6 first (largest) eigen values of sm Note that I am not sure eigen will be able to cope with your problem's size. I already had trouble with problems 1 to 2 order of magnitude smaller than that (~ 5e4 x 5e4) cheers, David From andyjian430074 at gmail.com Tue Jan 12 01:35:55 2010 From: andyjian430074 at gmail.com (Jankins) Date: Tue, 12 Jan 2010 00:35:55 -0600 Subject: [Numpy-discussion] TypeError: 'module' object is not callable In-Reply-To: <4B4C0CBE.7080306@silveregg.co.jp> References: <4B4BB7FB.6060004@gmail.com> <3d375d731001111549w6f63a4d9h23b4d7ca993fae4f@mail.gmail.com> <4B4BBC62.6080002@gmail.com> <3d375d731001111612g71158707sb68a3b133a67473c@mail.gmail.com> <4B4BBF4C.5050105@gmail.com> <1cd32cbb1001111753n701e063et3bb85d401babc199@mail.gmail.com> <4B4BD877.3090509@gmail.com> <1cd32cbb1001111855r2b45ccden3d43030e08fb0d4d@mail.gmail.com> <4B4BED23.4080109@gmail.com> <4B4BFB9D.4010403@silveregg.co.jp> <4B4C005A.6030809@gmail.com> <4B4C0CBE.7080306@silveregg.co.jp> Message-ID: <4B4C184B.7010607@gmail.com> Here is the complete command lines in Windows 7: C:\Users\jankins>python Python 2.6.2 (r262:71605, Apr 14 2009, 22:40:02) [MSC v.1500 32 bit (Intel)] on win32 Type "help", "copyright", "credits" or "license" for more information. >>> from scipy.sparse.linalg.eigen import eigen Traceback (most recent call last): File "", line 1, in ImportError: cannot import name eigen >>> Here is the complete command lines in Ubuntu 9.10: Python 2.6.4 (r264:75706, Nov 2 2009, 14:38:03) [GCC 4.4.1] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> from scipy.sparse.linalg.eigen import eigen Traceback (most recent call last): File "", line 1, in ImportError: cannot import name eigen >>> It's so wired. Thanks. On 1/11/2010 11:46 PM, David Cournapeau wrote: > Jankins wrote: > >> Thanks so much. I have successfully installed scipy in Ubuntu 9.10. But >> I still couldn't use scipy.sparse.linalg.eigen function. >> > Please report *exactly* the suite of commands which is failing. For > example, the following works for me: > > import numpy as np > from scipy.sparse import csr_matrix > from scipy.sparse.linalg.eigen import eigen > > m = np.random.randn(10, 10) > sm = csr_matrix(m) > > print eigen(sm) # Give the 6 first (largest) eigen values of sm > > > Note that I am not sure eigen will be able to cope with your problem's > size. I already had trouble with problems 1 to 2 order of magnitude > smaller than that (~ 5e4 x 5e4) > > cheers, > > David > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From pav at iki.fi Tue Jan 12 03:37:57 2010 From: pav at iki.fi (Pauli Virtanen) Date: Tue, 12 Jan 2010 10:37:57 +0200 Subject: [Numpy-discussion] fromfile() -- aarrgg! 
In-Reply-To: <4B4BBE2E.9090706@noaa.gov> References: <4B42905A.4080105@noaa.gov> <1262721695.5107.1.camel@idol> <4B463F37.4010108@noaa.gov> <1cd32cbb1001071232l6c3d3525g4ec4747d62d998ed@mail.gmail.com> <4B465605.3010406@noaa.gov> <1cd32cbb1001071515p498c8746u5dce34453346c97f@mail.gmail.com> <4B466E4C.10504@noaa.gov> <4B46889E.3020008@noaa.gov> <4B4BBE2E.9090706@noaa.gov> Message-ID: <1263285477.7976.10.camel@talisman> ma, 2010-01-11 kello 16:11 -0800, Christopher Barker kirjoitti: [clip] > "If no conversion is performed, zero is returned and the value of nptr > is stored in the location referenced by endptr." > > off do do some more testing, but I guess that means that those pointers > need to be checked after the call, to see if a conversion was generated. > > Am I right? Yes, that's how strtod() is typically used. NumPyOS_ascii_ftolf already checks that, but it seems to me that fromstr_next_element or possibly fromstr does not. > PS: Boy, this is a pain! Welcome to the wonderful world of C ;) Pauli From jonboym2 at yahoo.co.uk Tue Jan 12 08:11:02 2010 From: jonboym2 at yahoo.co.uk (Jon Moore) Date: Tue, 12 Jan 2010 13:11:02 +0000 (GMT) Subject: [Numpy-discussion] Getting Callbacks with arrays to work Message-ID: <832556.93262.qm@web24504.mail.ird.yahoo.com> Hi, I'm trying to build a differential equation integrator and later a stochastic differential equation integrator. I'm having trouble getting f2py to work where the callback itself receives an array from the Fortran routine does some work on it and then passes an array back. ? For the stoachastic integrator I'll need 2 callbacks both dealing with arrays. The idea is the code that never changes (ie the integrator) will be in Fortran and the code that changes (ie the callbacks defining differential equations) will be different for each problem. To test the idea I've written basic code which should pass an array back and forth between Python and Fortran if it works right. Here is some code which doesn't work properly:- SUBROUTINE CallbackTest(dv,v0,Vout,N) ??? !IMPLICIT NONE ?????? ? cF2PY???? intent( hide ):: N ??? INTEGER:: N, ic ?????????????? ? ??? EXTERNAL:: dv?????? ? ??? DOUBLE PRECISION, DIMENSION( N ), INTENT(IN):: v0?????? ? ??? DOUBLE PRECISION, DIMENSION( N ), INTENT(OUT):: Vout ?????????? ? ??? DOUBLE PRECISION, DIMENSION( N ):: Vnow ??? DOUBLE PRECISION, DIMENSION( N )::? temp ?????? ? ??? Vnow = v0 ?????? ? ??? temp = dv(Vnow, N) ??? DO ic = 1, N ??????? Vout( ic ) = temp(ic) ??? END DO?? ? ?????? ? END SUBROUTINE CallbackTest When I test it with this python code I find the code just replicates the first term of the array! from numpy import * import callback as c def dV(v): ??? print 'in Python dV: V is: ',v ??? return v.copy()?? ? arr = array([2.0, 4.0, 6.0, 8.0]) print 'Arr is: ', arr output = c.CallbackTest(dV, arr) print 'Out is: ', output Arr is:? [ 2.? 4.? 6.? 8.] in Python dV: V is:? [ 2.? 4.? 6.? 8.] Out is:? [ 2.? 2.? 2.? 2.] Any ideas how I should do this, and also how do I get the code to work with implicit none not commented out? Thanks Jon -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From pearu.peterson at gmail.com Tue Jan 12 08:44:33 2010 From: pearu.peterson at gmail.com (Pearu Peterson) Date: Tue, 12 Jan 2010 15:44:33 +0200 Subject: [Numpy-discussion] Getting Callbacks with arrays to work In-Reply-To: <832556.93262.qm@web24504.mail.ird.yahoo.com> References: <832556.93262.qm@web24504.mail.ird.yahoo.com> Message-ID: <4B4C7CC1.3010304@cens.ioc.ee> Hi, The problem is that f2py does not support callbacks that return arrays. There is easy workaround to that: provide returnable arrays as arguments to callback functions. Using your example: SUBROUTINE CallbackTest(dv,v0,Vout,N) IMPLICIT NONE !F2PY intent( hide ):: N INTEGER:: N, ic EXTERNAL:: dv DOUBLE PRECISION, DIMENSION( N ), INTENT(IN):: v0 DOUBLE PRECISION, DIMENSION( N ), INTENT(OUT):: Vout DOUBLE PRECISION, DIMENSION( N ):: Vnow DOUBLE PRECISION, DIMENSION( N ):: temp Vnow = v0 !f2py intent (out) temp call dv(temp, Vnow, N) DO ic = 1, N Vout( ic ) = temp(ic) END DO END SUBROUTINE CallbackTest $ f2py -c test.f90 -m t --fcompiler=gnu95 >>> from numpy import * >>> from t import * >>> arr = array([2.0, 4.0, 6.0, 8.0]) >>> def dV(v): print 'in Python dV: V is: ',v ret = v.copy() ret[1] = 100.0 return ret ... >>> output = callbacktest(dV, arr) in Python dV: V is: [ 2. 4. 6. 8.] >>> output array([ 2., 100., 6., 8.]) What problems do you have with implicit none? It works fine here. Check the format of your source code, if it is free then use `.f90` extension, not `.f`. HTH, Pearu Jon Moore wrote: > Hi, > > I'm trying to build a differential equation integrator and later a > stochastic differential equation integrator. > > I'm having trouble getting f2py to work where the callback itself > receives an array from the Fortran routine does some work on it and then > passes an array back. > > For the stoachastic integrator I'll need 2 callbacks both dealing with > arrays. > > The idea is the code that never changes (ie the integrator) will be in > Fortran and the code that changes (ie the callbacks defining > differential equations) will be different for each problem. > > To test the idea I've written basic code which should pass an array back > and forth between Python and Fortran if it works right. > > Here is some code which doesn't work properly:- > > SUBROUTINE CallbackTest(dv,v0,Vout,N) > !IMPLICIT NONE > > cF2PY intent( hide ):: N > INTEGER:: N, ic > > EXTERNAL:: dv > > DOUBLE PRECISION, DIMENSION( N ), INTENT(IN):: v0 > DOUBLE PRECISION, DIMENSION( N ), INTENT(OUT):: Vout > > DOUBLE PRECISION, DIMENSION( N ):: Vnow > DOUBLE PRECISION, DIMENSION( N ):: temp > > Vnow = v0 > > > temp = dv(Vnow, N) > > DO ic = 1, N > Vout( ic ) = temp(ic) > END DO > > END SUBROUTINE CallbackTest > > > > When I test it with this python code I find the code just replicates the > first term of the array! > > > > > from numpy import * > import callback as c > > def dV(v): > print 'in Python dV: V is: ',v > return v.copy() > > arr = array([2.0, 4.0, 6.0, 8.0]) > > print 'Arr is: ', arr > > output = c.CallbackTest(dV, arr) > > print 'Out is: ', output > > > > > Arr is: [ 2. 4. 6. 8.] > > in Python dV: V is: [ 2. 4. 6. 8.] > > Out is: [ 2. 2. 2. 2.] > > > > Any ideas how I should do this, and also how do I get the code to work > with implicit none not commented out? 
> > Thanks > > Jon > > > > ------------------------------------------------------------------------ > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From aisaac at american.edu Tue Jan 12 09:06:33 2010 From: aisaac at american.edu (Alan G Isaac) Date: Tue, 12 Jan 2010 09:06:33 -0500 Subject: [Numpy-discussion] TypeError: 'module' object is not callable In-Reply-To: <4B4BED23.4080109@gmail.com> References: <4B4BB7FB.6060004@gmail.com> <3d375d731001111549w6f63a4d9h23b4d7ca993fae4f@mail.gmail.com> <4B4BBC62.6080002@gmail.com> <3d375d731001111612g71158707sb68a3b133a67473c@mail.gmail.com> <4B4BBF4C.5050105@gmail.com> <1cd32cbb1001111753n701e063et3bb85d401babc199@mail.gmail.com> <4B4BD877.3090509@gmail.com> <1cd32cbb1001111855r2b45ccden3d43030e08fb0d4d@mail.gmail.com> <4B4BED23.4080109@gmail.com> Message-ID: <4B4C81E9.9050504@american.edu> >>> filter(lambda x: x.startswith('eig'),dir(np.linalg)) ['eig', 'eigh', 'eigvals', 'eigvalsh'] >>> import scipy.linalg as spla >>> filter(lambda x: x.startswith('eig'),dir(spla)) ['eig', 'eig_banded', 'eigh', 'eigvals', 'eigvals_banded', 'eigvalsh'] hth, Alan Isaac From aisaac at american.edu Tue Jan 12 09:15:37 2010 From: aisaac at american.edu (Alan G Isaac) Date: Tue, 12 Jan 2010 09:15:37 -0500 Subject: [Numpy-discussion] TypeError: 'module' object is not callable In-Reply-To: <4B4C184B.7010607@gmail.com> References: <4B4BB7FB.6060004@gmail.com> <3d375d731001111549w6f63a4d9h23b4d7ca993fae4f@mail.gmail.com> <4B4BBC62.6080002@gmail.com> <3d375d731001111612g71158707sb68a3b133a67473c@mail.gmail.com> <4B4BBF4C.5050105@gmail.com> <1cd32cbb1001111753n701e063et3bb85d401babc199@mail.gmail.com> <4B4BD877.3090509@gmail.com> <1cd32cbb1001111855r2b45ccden3d43030e08fb0d4d@mail.gmail.com> <4B4BED23.4080109@gmail.com> <4B4BFB9D.4010403@silveregg.co.jp> <4B4C005A.6030809@gmail.com> <4B4C0CBE.7080306@silveregg.co.jp> <4B4C184B.7010607@gmail.com> Message-ID: <4B4C8409.7060400@american.edu> On 1/12/2010 1:35 AM, Jankins wrote: >>>> from scipy.sparse.linalg.eigen import eigen > Traceback (most recent call last): > File "", line 1, in > ImportError: cannot import name eigen Look at David's example: from scipy.sparse.linalg import eigen hth, Alan Isaac From andyjian430074 at gmail.com Tue Jan 12 10:11:57 2010 From: andyjian430074 at gmail.com (Jankins) Date: Tue, 12 Jan 2010 09:11:57 -0600 Subject: [Numpy-discussion] TypeError: 'module' object is not callable In-Reply-To: <4B4C8409.7060400@american.edu> References: <4B4BB7FB.6060004@gmail.com> <3d375d731001111549w6f63a4d9h23b4d7ca993fae4f@mail.gmail.com> <4B4BBC62.6080002@gmail.com> <3d375d731001111612g71158707sb68a3b133a67473c@mail.gmail.com> <4B4BBF4C.5050105@gmail.com> <1cd32cbb1001111753n701e063et3bb85d401babc199@mail.gmail.com> <4B4BD877.3090509@gmail.com> <1cd32cbb1001111855r2b45ccden3d43030e08fb0d4d@mail.gmail.com> <4B4BED23.4080109@gmail.com> <4B4BFB9D.4010403@silveregg.co.jp> <4B4C005A.6030809@gmail.com> <4B4C0CBE.7080306@silveregg.co.jp> <4B4C184B.7010607@gmail.com> <4B4C8409.7060400@american.edu> Message-ID: <4B4C913D.2020109@gmail.com> >>> import scipy.sparse.linalg as linalg >>> dir(linalg) ['LinearOperator', 'Tester', '__all__', '__builtins__', '__doc__', '__file__', ' __name__', '__package__', '__path__', 'aslinearoperator', 'bench', 'bicg', 'bicg stab', 'cg', 'cgs', 'dsolve', 'eigen', 'factorized', 'gmres', 'interface', 'isol ve', 'iterative', 'linsolve', 'lobpcg', 'minres', 'qmr', 'splu', 
'spsolve', 'tes t', 'umfpack', 'use_solver', 'utils'] >>> dir(linalg.eigen) ['Tester', '__all__', '__builtins__', '__doc__', '__file__', '__name__', '__pack age__', '__path__', 'bench', 'lobpcg', 'test'] >>> linalg.eigen.test() Running unit tests for scipy.sparse.linalg.eigen NumPy version 1.3.0 NumPy is installed in C:\Python26\lib\site-packages\numpy SciPy version 0.7.1 SciPy is installed in C:\Python26\lib\site-packages\scipy Python version 2.6.2 (r262:71605, Apr 14 2009, 22:40:02) [MSC v.1500 32 bit (Int el)] nose version 0.11.1 .......... ---------------------------------------------------------------------- Ran 10 tests in 2.240s OK >>> On 1/12/2010 8:15 AM, Alan G Isaac wrote: > On 1/12/2010 1:35 AM, Jankins wrote: > >>>>> from scipy.sparse.linalg.eigen import eigen >>>>> >> Traceback (most recent call last): >> File "", line 1, in >> ImportError: cannot import name eigen >> > > Look at David's example: > from scipy.sparse.linalg import eigen > > hth, > Alan Isaac > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From arnar.flatberg at gmail.com Tue Jan 12 10:19:38 2010 From: arnar.flatberg at gmail.com (Arnar Flatberg) Date: Tue, 12 Jan 2010 16:19:38 +0100 Subject: [Numpy-discussion] TypeError: 'module' object is not callable In-Reply-To: <4B4C913D.2020109@gmail.com> References: <4B4BB7FB.6060004@gmail.com> <4B4BD877.3090509@gmail.com> <1cd32cbb1001111855r2b45ccden3d43030e08fb0d4d@mail.gmail.com> <4B4BED23.4080109@gmail.com> <4B4BFB9D.4010403@silveregg.co.jp> <4B4C005A.6030809@gmail.com> <4B4C0CBE.7080306@silveregg.co.jp> <4B4C184B.7010607@gmail.com> <4B4C8409.7060400@american.edu> <4B4C913D.2020109@gmail.com> Message-ID: <5d3194021001120719y61b80e2en668e9dac8afa922b@mail.gmail.com> On Tue, Jan 12, 2010 at 4:11 PM, Jankins wrote: Hi On my Ubuntu, I would reach the arpack wrapper as follows: from scipy.sparse.linalg.eigen.arpack import eigen However, I'd guess that you deal with a symmetric matrix (Laplacian or adjacency matrix), so the symmetric solver might be the best choice. This might be reached by: In [29]: from scipy.sparse.linalg.eigen.arpack import eigen_symmetric In [30]: scipy.__version__ Out[30]: '0.7.0' Arnar -------------- next part -------------- An HTML attachment was scrubbed... URL: From stephen.pascoe at stfc.ac.uk Tue Jan 12 10:52:22 2010 From: stephen.pascoe at stfc.ac.uk (stephen.pascoe at stfc.ac.uk) Date: Tue, 12 Jan 2010 15:52:22 -0000 Subject: [Numpy-discussion] Numpy 1.4 MaskedArray bug? Message-ID: We have noticed the MaskedArray implementation in numpy-1.4.0 breaks some of our code. For instance we see the following: in 1.3.0: >>> a = numpy.ma.MaskedArray([[1,2,3],[4,5,6]]) >>> numpy.ma.sum(a, 1) masked_array(data = [ 6 15], mask = False, fill_value = 999999) in 1.4.0 >>> a = numpy.ma.MaskedArray([[1,2,3],[4,5,6]]) >>> numpy.ma.sum(a, 1) Traceback (most recent call last): File "", line 1, in File "/usr/lib64/python2.5/site-packages/numpy-1.4.0-py2.5-linux-x86_64.egg/n umpy/ma/core.py", line 5682, in __call__ return method(*args, **params) File "/usr/lib64/python2.5/site-packages/numpy-1.4.0-py2.5-linux-x86_64.egg/n umpy/ma/core.py", line 4357, in sum newmask = _mask.all(axis=axis) ValueError: axis(=1) out of bounds Also note the "Report Bugs" link on http://numpy.scipy.org is broken (http://numpy.scipy.org/bug-report.html) Thanks, Stephen. 
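A self-contained reproduction of the report, together with the two workarounds suggested further down the thread (a negative axis, or forcing an explicit mask at construction); on 1.3.0 all three sums should simply agree:

import numpy
import numpy.ma as ma

print(numpy.__version__)
a = ma.MaskedArray([[1, 2, 3], [4, 5, 6]])

try:
    print(ma.sum(a, 1))                  # raises ValueError on 1.4.0 as reported
except ValueError as exc:
    print('ma.sum(a, 1) failed: %s' % exc)

print(ma.sum(a, -1))                     # workaround 1: negative axis

b = ma.MaskedArray([[1, 2, 3], [4, 5, 6]], mask=False)
print(ma.sum(b, 1))                      # workaround 2: explicit mask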
--- Stephen Pascoe +44 (0)1235 445980 British Atmospheric Data Centre Rutherford Appleton Laboratory -- Scanned by iCritical. From andyjian430074 at gmail.com Tue Jan 12 11:28:25 2010 From: andyjian430074 at gmail.com (Jankins) Date: Tue, 12 Jan 2010 10:28:25 -0600 Subject: [Numpy-discussion] TypeError: 'module' object is not callable In-Reply-To: <5d3194021001120719y61b80e2en668e9dac8afa922b@mail.gmail.com> References: <4B4BB7FB.6060004@gmail.com> <4B4BD877.3090509@gmail.com> <1cd32cbb1001111855r2b45ccden3d43030e08fb0d4d@mail.gmail.com> <4B4BED23.4080109@gmail.com> <4B4BFB9D.4010403@silveregg.co.jp> <4B4C005A.6030809@gmail.com> <4B4C0CBE.7080306@silveregg.co.jp> <4B4C184B.7010607@gmail.com> <4B4C8409.7060400@american.edu> <4B4C913D.2020109@gmail.com> <5d3194021001120719y61b80e2en668e9dac8afa922b@mail.gmail.com> Message-ID: <4B4CA329.4040900@gmail.com> Thanks so so much. Finally, it works. >>> import scipy.sparse.linalg.eigen.arpack as arpack >>> dir(arpack) ['__builtins__', '__doc__', '__file__', '__name__', '__package__', '__path__', ' _arpack', 'arpack', 'aslinearoperator', 'eigen', 'eigen_symmetric', 'np', 'speig s', 'warnings'] >>> But I still didn't get it. Why some of you can directly use scipy.sparse.linalg.eigen as a function, while some of you couldn't use it that way? Anyway, your solution works for me. On 1/12/2010 9:19 AM, Arnar Flatberg wrote: > > > On Tue, Jan 12, 2010 at 4:11 PM, Jankins > wrote: > > Hi > > On my Ubuntu, I would reach the arpack wrapper as follows: > > from scipy.sparse.linalg.eigen.arpack import eigen > > However, I'd guess that you deal with a symmetric matrix (Laplacian or > adjacency matrix), so the symmetric solver might be the best choice. > > This might be reached by: > > In [29]: from scipy.sparse.linalg.eigen.arpack import eigen_symmetric > In [30]: scipy.__version__ > Out[30]: '0.7.0' > > > Arnar > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From denis-bz-py at t-online.de Tue Jan 12 11:33:21 2010 From: denis-bz-py at t-online.de (denis) Date: Tue, 12 Jan 2010 17:33:21 +0100 Subject: [Numpy-discussion] numpy1.4 dtype issues: scipy.stats & pytables In-Reply-To: <1cd32cbb1001110910r2cde5bdgac72e7d176ead46a@mail.gmail.com> References: <3D845F44-C5F5-4F56-8233-E97C2D5A6CE2@gmail.com> <1cd32cbb1001110910r2cde5bdgac72e7d176ead46a@mail.gmail.com> Message-ID: On 11/01/2010 18:10, josef.pktd at gmail.com wrote: > For this problem, it's supposed to be only those packages that have or > import cython generated code. Right; is this a known bug, is there a known fix for mac dmgs ? (Whisper, how'd it get past testing ?) scipy/stats/__init__.py has an apparent patch which doesn't work #remove vonmises_cython from __all__, I don't know why it is included __all__ = filter(lambda s:not (s.startswith('_') or s.endswith('cython')),dir()) but just removing vonmises_cython in distributions.py => import scipy.stats then works. Similarly import scipy.cluster => trace File "numpy.pxd", line 30, in scipy.spatial.ckdtree (scipy/spatial/ckdtree.c:6087) ValueError: numpy.dtype does not appear to be the correct type object I like the naming convention xx_cython.so. 
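A small probe makes the scope of the breakage concrete: report the running numpy, then try importing the compiled scipy subpackages mentioned here. The module list is only a guess based on this thread, and the real fix, as the replies note, is rebuilding scipy against the numpy actually in use:

import numpy
print('numpy %s' % numpy.__version__)

for name in ['scipy.stats', 'scipy.cluster', 'scipy.spatial', 'scipy.io']:
    try:
        __import__(name)
        print('%-16s imports cleanly' % name)
    except Exception as exc:
        print('%-16s FAILED: %s' % (name, exc))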
cheers -- denis From robert.kern at gmail.com Tue Jan 12 11:41:00 2010 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 12 Jan 2010 10:41:00 -0600 Subject: [Numpy-discussion] numpy1.4 dtype issues: scipy.stats & pytables In-Reply-To: References: <3D845F44-C5F5-4F56-8233-E97C2D5A6CE2@gmail.com> <1cd32cbb1001110910r2cde5bdgac72e7d176ead46a@mail.gmail.com> Message-ID: <3d375d731001120841y7bc9bb9cla1b49c6deb2b5232@mail.gmail.com> On Tue, Jan 12, 2010 at 10:33, denis wrote: > On 11/01/2010 18:10, josef.pktd at gmail.com wrote: > >> For this problem, it's supposed to be only those packages that have or >> import cython generated code. > > Right; is this a known bug, is there a known fix ?for mac dmgs ? > (Whisper, how'd it get past testing ?) It's not a bug, but it is a known issue. We tried very hard to keep numpy 1.4 binary compatible; however, Pyrex and Cython impose additional runtime checks above and beyond binary compatibility. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From josef.pktd at gmail.com Tue Jan 12 11:42:00 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 12 Jan 2010 11:42:00 -0500 Subject: [Numpy-discussion] numpy1.4 dtype issues: scipy.stats & pytables In-Reply-To: References: <3D845F44-C5F5-4F56-8233-E97C2D5A6CE2@gmail.com> <1cd32cbb1001110910r2cde5bdgac72e7d176ead46a@mail.gmail.com> Message-ID: <1cd32cbb1001120842l17571babx8a31ca86c7b95d8b@mail.gmail.com> On Tue, Jan 12, 2010 at 11:33 AM, denis wrote: > On 11/01/2010 18:10, josef.pktd at gmail.com wrote: > >> For this problem, it's supposed to be only those packages that have or >> import cython generated code. > > Right; is this a known bug, is there a known fix ?for mac dmgs ? > (Whisper, how'd it get past testing ?) Switching to numpy 1.4 requires recompiling cython code (i.e. scipy), there's a lot of information on the details in the mailing lists. > > scipy/stats/__init__.py has an apparent patch which doesn't work > ? ? #remove vonmises_cython from __all__, I don't know why it is included > ? ? __all__ = filter(lambda s:not (s.startswith('_') or s.endswith('cython')),dir()) No this is unrelated, this is just to reduce namespace pollution in __all__ vonmises_cython is still imported as an internal module and functions in distributions. Josef > > but just removing vonmises_cython in distributions.py > => import scipy.stats then works. Then, I expect you will get an import error or some other exception when you try to use stats.vonmises. > > Similarly import scipy.cluster => trace > ? File "numpy.pxd", line 30, in scipy.spatial.ckdtree (scipy/spatial/ckdtree.c:6087) > ValueError: numpy.dtype does not appear to be the correct type object > > I like the naming convention xx_cython.so. > > cheers > ? -- denis > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From pgmdevlist at gmail.com Tue Jan 12 12:51:08 2010 From: pgmdevlist at gmail.com (Pierre GM) Date: Tue, 12 Jan 2010 12:51:08 -0500 Subject: [Numpy-discussion] Numpy 1.4 MaskedArray bug? In-Reply-To: References: Message-ID: On Jan 12, 2010, at 10:52 AM, wrote: > We have noticed the MaskedArray implementation in numpy-1.4.0 breaks > some of our code. For instance we see the following: My, that's embarrassing. Sorry for the inconvenience. 
> > in 1.3.0: > >>>> a = numpy.ma.MaskedArray([[1,2,3],[4,5,6]]) >>>> numpy.ma.sum(a, 1) > masked_array(data = [ 6 15], > mask = False, > fill_value = 999999) > > in 1.4.0 > >>>> a = numpy.ma.MaskedArray([[1,2,3],[4,5,6]]) >>>> numpy.ma.sum(a, 1) > Traceback (most recent call last): > File "", line 1, in > File > "/usr/lib64/python2.5/site-packages/numpy-1.4.0-py2.5-linux-x86_64.egg/n > umpy/ma/core.py", line 5682, in __call__ > return method(*args, **params) > File > "/usr/lib64/python2.5/site-packages/numpy-1.4.0-py2.5-linux-x86_64.egg/n > umpy/ma/core.py", line 4357, in sum > newmask = _mask.all(axis=axis) > ValueError: axis(=1) out of bounds Confirmed. Before I take full blame for it, can you try the following on both 1.3 and 1.4 ? >>> np.array(False).all().sum(1) Back to your problem: I'll fix that ASAIC, but it'll be on the SVN. Meanwhile, you can: * Use -1 instead of 1 for your axis. * Force the definition of a mask when you define your array with masked_array(...,mask=False) From sebastian.walter at gmail.com Tue Jan 12 13:05:01 2010 From: sebastian.walter at gmail.com (Sebastian Walter) Date: Tue, 12 Jan 2010 19:05:01 +0100 Subject: [Numpy-discussion] wrong casting of augmented assignment statements Message-ID: Hello, I have a question about the augmented assignment statements *=, +=, etc. Apparently, the casting of types is not working correctly. Is this known resp. intended behavior of numpy? (I'm using numpy.__version__ = '1.4.0.dev7039' on this machine but I remember a recent checkout of numpy yielded the same result). The problem is best explained at some examples: wrong casting from float to int:: In [1]: import numpy In [2]: x = numpy.ones(2,dtype=int) In [3]: y = 1.3 * numpy.ones(2,dtype=float) In [4]: z = x * y In [5]: z Out[5]: array([ 1.3, 1.3]) In [6]: x *= y In [7]: x Out[7]: array([1, 1]) In [8]: x.dtype Out[8]: dtype('int32') wrong casting from float to object:: In [1]: import numpy In [2]: import adolc In [3]: x = adolc.adouble(numpy.array([1,2,3],dtype=float)) In [4]: y = numpy.array([4,5,6],dtype=float) In [5]: x Out[5]: array([1(a), 2(a), 3(a)], dtype=object) In [6]: y Out[6]: array([ 4., 5., 6.]) In [7]: x * y Out[7]: array([4(a), 10(a), 18(a)], dtype=object) In [8]: y *= x In [9]: y Out[9]: array([ 4., 5., 6.]) It is inconsistent to the Python behavior:: In [9]: a = 1 In [10]: b = 1.3 In [11]: c = a * b In [12]: c Out[12]: 1.3 In [13]: a *= b In [14]: a Out[14]: 1.3 I would expect that numpy should at least raise an exception in the case of casting object to float. Any thoughts? regards, Sebastian From robert.kern at gmail.com Tue Jan 12 13:09:05 2010 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 12 Jan 2010 12:09:05 -0600 Subject: [Numpy-discussion] wrong casting of augmented assignment statements In-Reply-To: References: Message-ID: <3d375d731001121009p18e01199y2506b8a205d135fa@mail.gmail.com> On Tue, Jan 12, 2010 at 12:05, Sebastian Walter wrote: > Hello, > I have a question about the augmented assignment statements *=, +=, etc. > Apparently, the casting of types is not working correctly. Is this > known resp. intended behavior of numpy? Augmented assignment modifies numpy arrays in-place, so the usual casting rules for assignment into an array apply. Namely, the array being assigned into keeps its dtype. If you do not want in-place modification, do not use augmented assignment. 
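The distinction in two lines, for anyone skimming: augmented assignment writes back into the existing integer array, so the float result is cast back down, while the plain expression allocates a new, upcast array. A minimal illustration:

import numpy as np

x = np.ones(2, dtype=int)
y = 1.3 * np.ones(2)

x_inplace = x.copy()
x_inplace *= y           # in-place: dtype stays int, values truncated back to 1
x_new = x * y            # new array: upcast to float, values are 1.3

print('%s %s' % (x_inplace, x_inplace.dtype))
print('%s %s' % (x_new, x_new.dtype))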
-- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From caldwell19 at llnl.gov Tue Jan 12 13:01:29 2010 From: caldwell19 at llnl.gov (Peter Caldwell) Date: Tue, 12 Jan 2010 10:01:29 -0800 Subject: [Numpy-discussion] sphinx numpydoc fails due to no __init__ for class SignedType Message-ID: <4B4CB8F9.4030208@llnl.gov> I'm trying to use sphinx to build documentation for our project (CDAT) that uses numpy. I'm running into an exception due to numpy.numarray.numerictypes.SignedType not having an __init__ attribute, which causes problems with numpydoc. I'm sure there must be a workaround or I'm doing something wrong since the basic numpy documentation is created with sphinx! Suggestions? I'm using sphinx v1.0, numpy v1.3.0, and numpydoc v0.3.1on Redhat Enterprise 5.x. Big thanks, Peter ps - I'm sending this question to both Numpy-discussion and sphinx-dev at googlegroups because the issue lies at the intersection of these groups. Here's the error: ========================================================= Running Sphinx v1.0 loading pickled environment... not found building [html]: targets for 6835 source files that are out of date updating environment: 6835 added, 0 changed, 0 removed /usr/local/cdat/release/5.2d/lib/python2.5/site-packages/Sphinx-1.0dev_20091202-py2.5.egg/sphinx/ext/docscrape.py:117: UserWarning: Unknown section Unary Ufuncs: warn("Unknown section %s" % key) /usr/local/cdat/release/5.2d/lib/python2.5/site-packages/Sphinx-1.0dev_20091202-py2.5.egg/sphinx/ext/docscrape.py:117: UserWarning: Unknown section Binary Ufuncs: warn("Unknown section %s" % key) /usr/local/cdat/release/5.2d/lib/python2.5/site-packages/Sphinx-1.0dev_20091202-py2.5.egg/sphinx/ext/docscrape.py:117: UserWarning: Unknown section Seealso warn("Unknown section %s" % key) reading sources... [ 3%] output/lev0/numpy.numarray Exception occurred: File "/usr/local/cdat/release/5.2d/lib/python2.5/site-packages/Sphinx-1.0dev_20091202-py2.5.egg/sphinx/ext/numpydoc.py", line 76, in mangle_signature 'initializes x; see ' in pydoc.getdoc(obj.__init__)): AttributeError: class SignedType has no attribute '__init__' The full traceback has been saved in /tmp/sphinx-err-fprbpu.log, if you want to report the issue to the author. Please also report this if it was a user error, so that a better error message can be provided next time. Send reports to sphinx-dev at googlegroups.com. Thanks! 
make: *** [html] Error 1 ===================================================== Here's the full traceback: ------------------------------------------------------------------------------------------------ # Sphinx version: 1.0 # Docutils version: 0.6 release Traceback (most recent call last): File "/usr/local/cdat/release/5.2d/lib/python2.5/site-packages/Sphinx-1.0dev_20091202-py2.5.egg/sphinx/cmdline.py", line 172, in main app.build(all_files, filenames) File "/usr/local/cdat/release/5.2d/lib/python2.5/site-packages/Sphinx-1.0dev_20091202-py2.5.egg/sphinx/application.py", line 130, in build self.builder.build_update() File "/usr/local/cdat/release/5.2d/lib/python2.5/site-packages/Sphinx-1.0dev_20091202-py2.5.egg/sphinx/builders/__init__.py", line 265, in build_update 'out of date' % len(to_build)) File "/usr/local/cdat/release/5.2d/lib/python2.5/site-packages/Sphinx-1.0dev_20091202-py2.5.egg/sphinx/builders/__init__.py", line 285, in build purple, length): File "/usr/local/cdat/release/5.2d/lib/python2.5/site-packages/Sphinx-1.0dev_20091202-py2.5.egg/sphinx/builders/__init__.py", line 131, in status_iterator for item in iterable: File "/usr/local/cdat/release/5.2d/lib/python2.5/site-packages/Sphinx-1.0dev_20091202-py2.5.egg/sphinx/environment.py", line 513, in update_generator self.read_doc(docname, app=app) File "/usr/local/cdat/release/5.2d/lib/python2.5/site-packages/Sphinx-1.0dev_20091202-py2.5.egg/sphinx/environment.py", line 604, in read_doc pub.publish() File "/usr/local/cdat/release/5.2d/lib/python2.5/site-packages/docutils/core.py", line 203, in publish self.settings) File "/usr/local/cdat/release/5.2d/lib/python2.5/site-packages/docutils/readers/__init__.py", line 69, in read self.parse() File "/usr/local/cdat/release/5.2d/lib/python2.5/site-packages/docutils/readers/__init__.py", line 75, in parse self.parser.parse(self.input, document) File "/usr/local/cdat/release/5.2d/lib/python2.5/site-packages/docutils/parsers/rst/__init__.py", line 157, in parse self.statemachine.run(inputlines, document, inliner=self.inliner) File "/usr/local/cdat/release/5.2d/lib/python2.5/site-packages/docutils/parsers/rst/states.py", line 170, in run input_source=document['source']) File "/usr/local/cdat/release/5.2d/lib/python2.5/site-packages/docutils/statemachine.py", line 233, in run context, state, transitions) File "/usr/local/cdat/release/5.2d/lib/python2.5/site-packages/docutils/statemachine.py", line 421, in check_line return method(match, context, next_state) File "/usr/local/cdat/release/5.2d/lib/python2.5/site-packages/docutils/parsers/rst/states.py", line 2678, in underline self.section(title, source, style, lineno - 1, messages) File "/usr/local/cdat/release/5.2d/lib/python2.5/site-packages/docutils/parsers/rst/states.py", line 323, in section self.new_subsection(title, lineno, messages) File "/usr/local/cdat/release/5.2d/lib/python2.5/site-packages/docutils/parsers/rst/states.py", line 391, in new_subsection node=section_node, match_titles=1) File "/usr/local/cdat/release/5.2d/lib/python2.5/site-packages/docutils/parsers/rst/states.py", line 278, in nested_parse node=node, match_titles=match_titles) File "/usr/local/cdat/release/5.2d/lib/python2.5/site-packages/docutils/parsers/rst/states.py", line 195, in run results = StateMachineWS.run(self, input_lines, input_offset) File "/usr/local/cdat/release/5.2d/lib/python2.5/site-packages/docutils/statemachine.py", line 233, in run context, state, transitions) File 
"/usr/local/cdat/release/5.2d/lib/python2.5/site-packages/docutils/statemachine.py", line 421, in check_line return method(match, context, next_state) File "/usr/local/cdat/release/5.2d/lib/python2.5/site-packages/docutils/parsers/rst/states.py", line 2258, in explicit_markup nodelist, blank_finish = self.explicit_construct(match) File "/usr/local/cdat/release/5.2d/lib/python2.5/site-packages/docutils/parsers/rst/states.py", line 2270, in explicit_construct return method(self, expmatch) File "/usr/local/cdat/release/5.2d/lib/python2.5/site-packages/docutils/parsers/rst/states.py", line 2013, in directive directive_class, match, type_name, option_presets) File "/usr/local/cdat/release/5.2d/lib/python2.5/site-packages/docutils/parsers/rst/states.py", line 2062, in run_directive result = directive_instance.run() File "/usr/local/cdat/release/5.2d/lib/python2.5/site-packages/Sphinx-1.0dev_20091202-py2.5.egg/sphinx/ext/autodoc.py", line 1106, in run nested_parse_with_titles(self.state, self.result, node) File "/usr/local/cdat/release/5.2d/lib/python2.5/site-packages/Sphinx-1.0dev_20091202-py2.5.egg/sphinx/util/__init__.py", line 298, in nested_parse_with_titles return state.nested_parse(content, 0, node, match_titles=1) File "/usr/local/cdat/release/5.2d/lib/python2.5/site-packages/docutils/parsers/rst/states.py", line 278, in nested_parse node=node, match_titles=match_titles) File "/usr/local/cdat/release/5.2d/lib/python2.5/site-packages/docutils/parsers/rst/states.py", line 195, in run results = StateMachineWS.run(self, input_lines, input_offset) File "/usr/local/cdat/release/5.2d/lib/python2.5/site-packages/docutils/statemachine.py", line 233, in run context, state, transitions) File "/usr/local/cdat/release/5.2d/lib/python2.5/site-packages/docutils/statemachine.py", line 421, in check_line return method(match, context, next_state) File "/usr/local/cdat/release/5.2d/lib/python2.5/site-packages/docutils/parsers/rst/states.py", line 2260, in explicit_markup self.explicit_list(blank_finish) File "/usr/local/cdat/release/5.2d/lib/python2.5/site-packages/docutils/parsers/rst/states.py", line 2289, in explicit_list match_titles=self.state_machine.match_titles) File "/usr/local/cdat/release/5.2d/lib/python2.5/site-packages/docutils/parsers/rst/states.py", line 315, in nested_list_parse node=node, match_titles=match_titles) File "/usr/local/cdat/release/5.2d/lib/python2.5/site-packages/docutils/parsers/rst/states.py", line 195, in run results = StateMachineWS.run(self, input_lines, input_offset) File "/usr/local/cdat/release/5.2d/lib/python2.5/site-packages/docutils/statemachine.py", line 233, in run context, state, transitions) File "/usr/local/cdat/release/5.2d/lib/python2.5/site-packages/docutils/statemachine.py", line 421, in check_line return method(match, context, next_state) File "/usr/local/cdat/release/5.2d/lib/python2.5/site-packages/docutils/parsers/rst/states.py", line 2562, in explicit_markup nodelist, blank_finish = self.explicit_construct(match) File "/usr/local/cdat/release/5.2d/lib/python2.5/site-packages/docutils/parsers/rst/states.py", line 2270, in explicit_construct return method(self, expmatch) File "/usr/local/cdat/release/5.2d/lib/python2.5/site-packages/docutils/parsers/rst/states.py", line 2013, in directive directive_class, match, type_name, option_presets) File "/usr/local/cdat/release/5.2d/lib/python2.5/site-packages/docutils/parsers/rst/states.py", line 2062, in run_directive result = directive_instance.run() File 
"/usr/local/cdat/release/5.2d/lib/python2.5/site-packages/Sphinx-1.0dev_20091202-py2.5.egg/sphinx/ext/autosummary/__init__.py", line 192, in run items = self.get_items(names) File "/usr/local/cdat/release/5.2d/lib/python2.5/site-packages/Sphinx-1.0dev_20091202-py2.5.egg/sphinx/ext/autosummary/__init__.py", line 265, in get_items sig = documenter.format_signature() File "/usr/local/cdat/release/5.2d/lib/python2.5/site-packages/Sphinx-1.0dev_20091202-py2.5.egg/sphinx/ext/autodoc.py", line 879, in format_signature return ModuleLevelDocumenter.format_signature(self) File "/usr/local/cdat/release/5.2d/lib/python2.5/site-packages/Sphinx-1.0dev_20091202-py2.5.egg/sphinx/ext/autodoc.py", line 384, in format_signature self.object, self.options, args, retann) File "/usr/local/cdat/release/5.2d/lib/python2.5/site-packages/Sphinx-1.0dev_20091202-py2.5.egg/sphinx/application.py", line 226, in emit_firstresult for result in self.emit(event, *args): File "/usr/local/cdat/release/5.2d/lib/python2.5/site-packages/Sphinx-1.0dev_20091202-py2.5.egg/sphinx/application.py", line 222, in emit result.append(callback(self, *args)) File "/usr/local/cdat/release/5.2d/lib/python2.5/site-packages/Sphinx-1.0dev_20091202-py2.5.egg/sphinx/ext/numpydoc.py", line 76, in mangle_signature 'initializes x; see ' in pydoc.getdoc(obj.__init__)): AttributeError: class SignedType has no attribute '__init__' -- Peter Caldwell Program for Climate Model Diagnosis and Intercomparison Lawrence Livermore National Lab PO Box 808, L-103 Livermore, CA 94551-0808 925-422-4197 From josef.pktd at gmail.com Tue Jan 12 13:11:29 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 12 Jan 2010 13:11:29 -0500 Subject: [Numpy-discussion] wrong casting of augmented assignment statements In-Reply-To: References: Message-ID: <1cd32cbb1001121011g429e56bdh7ca2e99a4ca16820@mail.gmail.com> On Tue, Jan 12, 2010 at 1:05 PM, Sebastian Walter wrote: > Hello, > I have a question about the augmented assignment statements *=, +=, etc. > Apparently, the casting of types is not working correctly. Is this > known resp. intended behavior of numpy? > (I'm using numpy.__version__ = '1.4.0.dev7039' on this machine but I > remember a recent checkout of numpy yielded the same result). > > The problem is best explained at some examples: > > wrong casting from float to int:: > > ? ? ? ? ? ?In [1]: import numpy > > ? ? ? ? ? ?In [2]: x = numpy.ones(2,dtype=int) > > ? ? ? ? ? ?In [3]: y = 1.3 * numpy.ones(2,dtype=float) > > ? ? ? ? ? ?In [4]: z = x * y > > ? ? ? ? ? ?In [5]: z > ? ? ? ? ? ?Out[5]: array([ 1.3, ?1.3]) > > ? ? ? ? ? ?In [6]: x *= y > > ? ? ? ? ? ?In [7]: x > ? ? ? ? ? ?Out[7]: array([1, 1]) > > ? ? ? ? ? ?In [8]: x.dtype > ? ? ? ? ? ?Out[8]: dtype('int32') > > ?wrong casting from float to object:: > > ? ? ? ? ? ?In [1]: import numpy > > ? ? ? ? ? ?In [2]: import adolc > > ? ? ? ? ? ?In [3]: x = adolc.adouble(numpy.array([1,2,3],dtype=float)) > > ? ? ? ? ? ?In [4]: y = numpy.array([4,5,6],dtype=float) > > ? ? ? ? ? ?In [5]: x > ? ? ? ? ? ?Out[5]: array([1(a), 2(a), 3(a)], dtype=object) > > ? ? ? ? ? ?In [6]: y > ? ? ? ? ? ?Out[6]: array([ 4., ?5., ?6.]) > > ? ? ? ? ? ?In [7]: x * y > ? ? ? ? ? ?Out[7]: array([4(a), 10(a), 18(a)], dtype=object) > > ? ? ? ? ? ?In [8]: y *= x > > ? ? ? ? ? ?In [9]: y > > ? ? ? ? ? ?Out[9]: array([ 4., ?5., ?6.]) > > > ? ? ? ?It is inconsistent to the Python behavior:: > > ? ? ? ? ? ?In [9]: a = 1 > > ? ? ? ? ? ?In [10]: b = 1.3 > > ? ? ? ? ? ?In [11]: c = a * b > > ? ? ? ? ? ?In [12]: c > ? ? ? ? ? 
?Out[12]: 1.3 > > ? ? ? ? ? ?In [13]: a *= b > > ? ? ? ? ? ?In [14]: a > ? ? ? ? ? ?Out[14]: 1.3 > > > I would expect that numpy should at least raise an exception in the > case of casting object to float. > Any thoughts? You are assigning to an existing array, which implies casting to the dtype of that array. It's the behavior that I would expect. If you want upcasting then don't use inplace *= , ... Josef > > regards, > Sebastian > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From Chris.Barker at noaa.gov Tue Jan 12 13:32:10 2010 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Tue, 12 Jan 2010 10:32:10 -0800 Subject: [Numpy-discussion] fromfile() -- aarrgg! In-Reply-To: <1263285477.7976.10.camel@talisman> References: <4B42905A.4080105@noaa.gov> <1262721695.5107.1.camel@idol> <4B463F37.4010108@noaa.gov> <1cd32cbb1001071232l6c3d3525g4ec4747d62d998ed@mail.gmail.com> <4B465605.3010406@noaa.gov> <1cd32cbb1001071515p498c8746u5dce34453346c97f@mail.gmail.com> <4B466E4C.10504@noaa.gov> <4B46889E.3020008@noaa.gov> <4B4BBE2E.9090706@noaa.gov> <1263285477.7976.10.camel@talisman> Message-ID: <4B4CC02A.9020807@noaa.gov> Pauli Virtanen wrote: > ma, 2010-01-11 kello 16:11 -0800, Christopher Barker kirjoitti: > [clip] >> "If no conversion is performed, zero is returned and the value of nptr >> is stored in the location referenced by endptr." >> >> off do do some more testing, but I guess that means that those pointers >> need to be checked after the call, to see if a conversion was generated. >> >> Am I right? > > Yes, that's how strtod() is typically used. > > NumPyOS_ascii_ftolf already checks that, no, I don't' think it does, but it does pass the nifo through, so its API should be the same as PyOS_ascii_ftolf which is the same as strftolf(), which makes sense. > but it seems to me that > fromstr_next_element or possibly fromstr does not. The problem is fromstr -- it changes the symantics, assigning the value to a pointer passed in, and returning an error code -- except it doesn't actually check for an error -- it always returns 0: static int @fname at _fromstr(char *str, @type@ *ip, char **endptr, PyArray_Descr *NPY_UNUSED(ignore)) { double result; result = NumPyOS_ascii_strtod(str, endptr); *ip = (@type@) result; return 0; } so the errors are getting lost in the shuffle: This implies that fromstring/fromfile are the only things using it -- unless someone has seen similar bad behaviour anywhere else. > Welcome to the wonderful world of C ;) yup -- which is why I haven't worked out a fix yet... Thanks, -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From sebastian.walter at gmail.com Tue Jan 12 13:31:42 2010 From: sebastian.walter at gmail.com (Sebastian Walter) Date: Tue, 12 Jan 2010 19:31:42 +0100 Subject: [Numpy-discussion] wrong casting of augmented assignment statements In-Reply-To: <3d375d731001121009p18e01199y2506b8a205d135fa@mail.gmail.com> References: <3d375d731001121009p18e01199y2506b8a205d135fa@mail.gmail.com> Message-ID: On Tue, Jan 12, 2010 at 7:09 PM, Robert Kern wrote: > On Tue, Jan 12, 2010 at 12:05, Sebastian Walter > wrote: >> Hello, >> I have a question about the augmented assignment statements *=, +=, etc. >> Apparently, the casting of types is not working correctly. 
Is this >> known resp. intended behavior of numpy? > > Augmented assignment modifies numpy arrays in-place, so the usual > casting rules for assignment into an array apply. Namely, the array > being assigned into keeps its dtype. what are the usual casting rules? How does numpy know how to cast an object to a float? > > If you do not want in-place modification, do not use augmented assignment. Normally, I'd be perfectly fine with that. However, this particular problem occurs when you try to automatically differentiate an algorithm by using an Algorithmic Differentiation (AD) tool. E.g. given a function x = numpy.ones(2) def f(x): a = numpy.ones(2) a *= x return numpy.sum(a) one would use an AD tool as follows: x = numpy.array([adouble(1.), adouble(1.)]) y = f(x) but since the casting from object to float is not possible the computed gradient \nabla_x f(x) will be wrong. > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > ?-- Umberto Eco > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From pav at iki.fi Tue Jan 12 13:32:09 2010 From: pav at iki.fi (Pauli Virtanen) Date: Tue, 12 Jan 2010 20:32:09 +0200 Subject: [Numpy-discussion] Numpy 1.4 MaskedArray bug? In-Reply-To: References: Message-ID: <1263321129.7167.12.camel@idol> ti, 2010-01-12 kello 12:51 -0500, Pierre GM kirjoitti: [clip] > >>>> a = numpy.ma.MaskedArray([[1,2,3],[4,5,6]]) > >>>> numpy.ma.sum(a, 1) > > Traceback (most recent call last): > > File "", line 1, in > > File > > "/usr/lib64/python2.5/site-packages/numpy-1.4.0-py2.5-linux-x86_64.egg/n > > umpy/ma/core.py", line 5682, in __call__ > > return method(*args, **params) > > File > > "/usr/lib64/python2.5/site-packages/numpy-1.4.0-py2.5-linux-x86_64.egg/n > > umpy/ma/core.py", line 4357, in sum > > newmask = _mask.all(axis=axis) > > ValueError: axis(=1) out of bounds > > Confirmed. > Before I take full blame for it, can you try the following on both 1.3 and 1.4 ? > >>> np.array(False).all().sum(1) Oh crap, it's mostly my fault: http://projects.scipy.org/numpy/ticket/1286 http://projects.scipy.org/numpy/changeset/7697 http://projects.scipy.org/numpy/browser/trunk/doc/release/1.4.0-notes.rst#deprecations Pretty embarassing, as very simple things break, although the test suite miraculously passes... > Back to your problem: I'll fix that ASAIC, but it'll be on the SVN. Meanwhile, you can: > * Use -1 instead of 1 for your axis. > * Force the definition of a mask when you define your array with masked_array(...,mask=False) Sounds like we need a 1.4.1 out at some point not too far in the future, then. Pauli From robert.kern at gmail.com Tue Jan 12 13:38:51 2010 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 12 Jan 2010 12:38:51 -0600 Subject: [Numpy-discussion] wrong casting of augmented assignment statements In-Reply-To: References: <3d375d731001121009p18e01199y2506b8a205d135fa@mail.gmail.com> Message-ID: <3d375d731001121038t226ed0c6md986851be317402d@mail.gmail.com> On Tue, Jan 12, 2010 at 12:31, Sebastian Walter wrote: > On Tue, Jan 12, 2010 at 7:09 PM, Robert Kern wrote: >> On Tue, Jan 12, 2010 at 12:05, Sebastian Walter >> wrote: >>> Hello, >>> I have a question about the augmented assignment statements *=, +=, etc. >>> Apparently, the casting of types is not working correctly. 
Is this >>> known resp. intended behavior of numpy? >> >> Augmented assignment modifies numpy arrays in-place, so the usual >> casting rules for assignment into an array apply. Namely, the array >> being assigned into keeps its dtype. > > what are the usual casting rules? For assignment into an array, the array keeps its dtype and the data being assigned into it will be cast to that dtype. > How does numpy know how to cast an object to a float? For a general object, numpy will call its __float__ method. >> If you do not want in-place modification, do not use augmented assignment. > > Normally, I'd be perfectly fine with that. > However, this particular problem occurs when you try to automatically > differentiate an algorithm by using an Algorithmic Differentiation > (AD) tool. > E.g. given a function > > x = numpy.ones(2) > def f(x): > ? a = numpy.ones(2) > ? a *= x > ? return numpy.sum(a) > > one would use an AD tool as follows: > x = numpy.array([adouble(1.), adouble(1.)]) > y = f(x) > > but since the casting from object to float is not possible the > computed gradient \nabla_x f(x) will be wrong. Sorry, but that's just a limitation of the AD approach. There are all kinds of numpy constructions that AD can't handle. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From charlesr.harris at gmail.com Tue Jan 12 13:52:17 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 12 Jan 2010 11:52:17 -0700 Subject: [Numpy-discussion] Numpy 1.4 MaskedArray bug? In-Reply-To: <1263321129.7167.12.camel@idol> References: <1263321129.7167.12.camel@idol> Message-ID: On Tue, Jan 12, 2010 at 11:32 AM, Pauli Virtanen wrote: > ti, 2010-01-12 kello 12:51 -0500, Pierre GM kirjoitti: > [clip] > > >>>> a = numpy.ma.MaskedArray([[1,2,3],[4,5,6]]) > > >>>> numpy.ma.sum(a, 1) > > > Traceback (most recent call last): > > > File "", line 1, in > > > File > > > > "/usr/lib64/python2.5/site-packages/numpy-1.4.0-py2.5-linux-x86_64.egg/n > > > umpy/ma/core.py", line 5682, in __call__ > > > return method(*args, **params) > > > File > > > > "/usr/lib64/python2.5/site-packages/numpy-1.4.0-py2.5-linux-x86_64.egg/n > > > umpy/ma/core.py", line 4357, in sum > > > newmask = _mask.all(axis=axis) > > > ValueError: axis(=1) out of bounds > > > > Confirmed. > > Before I take full blame for it, can you try the following on both 1.3 > and 1.4 ? > > >>> np.array(False).all().sum(1) > > Oh crap, it's mostly my fault: > > http://projects.scipy.org/numpy/ticket/1286 > http://projects.scipy.org/numpy/changeset/7697 > > http://projects.scipy.org/numpy/browser/trunk/doc/release/1.4.0-notes.rst#deprecations > > Pretty embarassing, as very simple things break, although the test suite > miraculously passes... > > > Back to your problem: I'll fix that ASAIC, but it'll be on the SVN. > Meanwhile, you can: > > * Use -1 instead of 1 for your axis. > > * Force the definition of a mask when you define your array with > masked_array(...,mask=False) > > Sounds like we need a 1.4.1 out at some point not too far in the future, > then. > > If so, then it should be sooner rather than later in order to sync with the releases of ubuntu and fedora. Both of the upcoming releases still use 1.3.0, but that could change... Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From sebastian.walter at gmail.com Tue Jan 12 14:16:43 2010 From: sebastian.walter at gmail.com (Sebastian Walter) Date: Tue, 12 Jan 2010 20:16:43 +0100 Subject: [Numpy-discussion] wrong casting of augmented assignment statements In-Reply-To: <3d375d731001121038t226ed0c6md986851be317402d@mail.gmail.com> References: <3d375d731001121009p18e01199y2506b8a205d135fa@mail.gmail.com> <3d375d731001121038t226ed0c6md986851be317402d@mail.gmail.com> Message-ID: On Tue, Jan 12, 2010 at 7:38 PM, Robert Kern wrote: > On Tue, Jan 12, 2010 at 12:31, Sebastian Walter > wrote: >> On Tue, Jan 12, 2010 at 7:09 PM, Robert Kern wrote: >>> On Tue, Jan 12, 2010 at 12:05, Sebastian Walter >>> wrote: >>>> Hello, >>>> I have a question about the augmented assignment statements *=, +=, etc. >>>> Apparently, the casting of types is not working correctly. Is this >>>> known resp. intended behavior of numpy? >>> >>> Augmented assignment modifies numpy arrays in-place, so the usual >>> casting rules for assignment into an array apply. Namely, the array >>> being assigned into keeps its dtype. >> >> what are the usual casting rules? > > For assignment into an array, the array keeps its dtype and the data > being assigned into it will be cast to that dtype. > >> How does numpy know how to cast an object to a float? > > For a general object, numpy will call its __float__ method. 1) the object does not have a __float__ method. 2) I've now implemented the __float__ method (to raise an error). However, it doesn't get called. All objects are casted to 1. > >>> If you do not want in-place modification, do not use augmented assignment. >> >> Normally, I'd be perfectly fine with that. >> However, this particular problem occurs when you try to automatically >> differentiate an algorithm by using an Algorithmic Differentiation >> (AD) tool. >> E.g. given a function >> >> x = numpy.ones(2) >> def f(x): >> ? a = numpy.ones(2) >> ? a *= x >> ? return numpy.sum(a) >> >> one would use an AD tool as follows: >> x = numpy.array([adouble(1.), adouble(1.)]) >> y = f(x) >> >> but since the casting from object to float is not possible the >> computed gradient \nabla_x f(x) will be wrong. > > Sorry, but that's just a limitation of the AD approach. There are all > kinds of numpy constructions that AD can't handle. > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > ?-- Umberto Eco > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From Chris.Barker at noaa.gov Tue Jan 12 14:34:11 2010 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Tue, 12 Jan 2010 11:34:11 -0800 Subject: [Numpy-discussion] wrong casting of augmented assignment statements In-Reply-To: References: <3d375d731001121009p18e01199y2506b8a205d135fa@mail.gmail.com> <3d375d731001121038t226ed0c6md986851be317402d@mail.gmail.com> Message-ID: <4B4CCEB3.5030307@noaa.gov> Sebastian Walter wrote: >>> However, this particular problem occurs when you try to automatically >>> differentiate an algorithm by using an Algorithmic Differentiation >>> (AD) tool. >>> E.g. 
given a function >>> >>> x = numpy.ones(2) >>> def f(x): >>> a = numpy.ones(2) >>> a *= x >>> return numpy.sum(a) I don't know anything about AD, but in general, when I write a function that requires a given numpy array type as input, I'll do something like: def f(x): x = np.asarray(a, dtype=np.float) a = np.ones(2) a *= x return np.sum(a) That makes the casting explicit, and forces it to happen at the top of the function, where the error will be more obvious. asarray will just pass through a conforming array, so little performance penalty when you do give it the right type. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From pgmdevlist at gmail.com Tue Jan 12 14:50:43 2010 From: pgmdevlist at gmail.com (Pierre GM) Date: Tue, 12 Jan 2010 14:50:43 -0500 Subject: [Numpy-discussion] Numpy 1.4 MaskedArray bug? In-Reply-To: References: <1263321129.7167.12.camel@idol> Message-ID: <910F1C67-06BB-47C1-8FCA-057131728D47@gmail.com> On Jan 12, 2010, at 1:52 PM, Charles R Harris wrote: > > > > On Tue, Jan 12, 2010 at 11:32 AM, Pauli Virtanen wrote: > ti, 2010-01-12 kello 12:51 -0500, Pierre GM kirjoitti: > [clip] > > >>>> a = numpy.ma.MaskedArray([[1,2,3],[4,5,6]]) > > >>>> numpy.ma.sum(a, 1) > > > Traceback (most recent call last): > > > File "", line 1, in > > > File > > > "/usr/lib64/python2.5/site-packages/numpy-1.4.0-py2.5-linux-x86_64.egg/n > > > umpy/ma/core.py", line 5682, in __call__ > > > return method(*args, **params) > > > File > > > "/usr/lib64/python2.5/site-packages/numpy-1.4.0-py2.5-linux-x86_64.egg/n > > > umpy/ma/core.py", line 4357, in sum > > > newmask = _mask.all(axis=axis) > > > ValueError: axis(=1) out of bounds > > > > Confirmed. > > Before I take full blame for it, can you try the following on both 1.3 and 1.4 ? > > >>> np.array(False).all().sum(1) > > Oh crap, it's mostly my fault: > > http://projects.scipy.org/numpy/ticket/1286 > http://projects.scipy.org/numpy/changeset/7697 > http://projects.scipy.org/numpy/browser/trunk/doc/release/1.4.0-notes.rst#deprecations > > Pretty embarassing, as very simple things break, although the test suite > miraculously passes... > > > Back to your problem: I'll fix that ASAIC, but it'll be on the SVN. Meanwhile, you can: > > * Use -1 instead of 1 for your axis. > > * Force the definition of a mask when you define your array with masked_array(...,mask=False) > > Sounds like we need a 1.4.1 out at some point not too far in the future, > then. > > > If so, then it should be sooner rather than later in order to sync with the releases of ubuntu and fedora. Both of the upcoming releases still use 1.3.0, but that could change... I guess that the easiest would be for me to provide a workaround for the bug (Pauli's modifications make sense, I was relying on a *feature* that wasn't very robust). I'll update both the trunk and the 1.4.x branch From ms at TheBrookhavenGroup.com Tue Jan 12 15:33:02 2010 From: ms at TheBrookhavenGroup.com (Marc Schwarzschild) Date: Tue, 12 Jan 2010 15:33:02 -0500 Subject: [Numpy-discussion] numpy sum table by category Message-ID: <19276.56446.757824.576522@ny.koplon.com> I have a csv file like this: Account, Symbol, Quantity, Price One,SPY,5,119.00 One,SPY,3,120.00 One,SPY,-2,125.00 One,GE,... One,GE,... Two,SPY, ... Three,GE, ... ... The data is much larger, could be 10,000 records. 
I can load it into a numpy array using matplotlib.mlab.csv2rec(). I learned several useful numpy functions and have been reading lots of documentation. However, I have not found a way to create a unique list of symbols and the Sum of their respective Quantity values. I want do various calculations on the data like pull out all the records for a given Account. The actual data has lots more columns and sometimes I may want to sum(Quantity*Price) by Account and Symbol. I'm attracted to numpy for speed but would welcome alternative suggestions. I tried unsuccessfully to install PyTables on my Ubuntu system and abandoned that avenue. Can anyone provide some examples on how to do this or point me to documentation? Much appreciated. _________________________________________________________ Marc Schwarzschild The Brookhaven Group, LLC 1-212-580-1175 Analytics for Hedge Fund Investors Risk it, carefully! From josef.pktd at gmail.com Tue Jan 12 16:08:44 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 12 Jan 2010 16:08:44 -0500 Subject: [Numpy-discussion] numpy sum table by category In-Reply-To: <19276.56446.757824.576522@ny.koplon.com> References: <19276.56446.757824.576522@ny.koplon.com> Message-ID: <1cd32cbb1001121308k6e026be5k5ac79d66038268ed@mail.gmail.com> On Tue, Jan 12, 2010 at 3:33 PM, Marc Schwarzschild wrote: > > > I have a csv file like this: > > ? ?Account, Symbol, Quantity, Price > ? ?One,SPY,5,119.00 > ? ?One,SPY,3,120.00 > ? ?One,SPY,-2,125.00 > ? ?One,GE,... > ? ?One,GE,... > ? ?Two,SPY, ... > ? ?Three,GE, ... > ? ? ... > > The data is much larger, could be 10,000 records. ?I can load it > into a numpy array using matplotlib.mlab.csv2rec(). ?I learned > several useful numpy functions and have been reading lots of > documentation. ?However, I have not found a way to create a > unique list of symbols and the Sum of their respective Quantity > values. ?I want do various calculations on the data like pull out > all the records for a given Account. ?The actual data has lots > more columns and sometimes I may want to sum(Quantity*Price) by > Account and Symbol. > > I'm attracted to numpy for speed but would welcome alternative > suggestions. > > I tried unsuccessfully to install PyTables on my Ubuntu system > and abandoned that avenue. > > Can anyone provide some examples on how to do this or point me to > documentation? If you don't want to do a lot of programming yourself, then I recommend tabular for this, which looks good for this kind of spreadsheet like operations, alternatively pandas. Josef > > Much appreciated. > > _________________________________________________________ > Marc Schwarzschild ? ? ? ? ? ? ?The Brookhaven Group, LLC > 1-212-580-1175 ? ? ? ? Analytics for Hedge Fund Investors > ? ? ? ? ? ? ? ? Risk it, carefully! > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From Chris.Barker at noaa.gov Tue Jan 12 20:19:35 2010 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Tue, 12 Jan 2010 17:19:35 -0800 Subject: [Numpy-discussion] fromfile() -- aarrgg! 
In-Reply-To: <4B4CC02A.9020807@noaa.gov> References: <4B42905A.4080105@noaa.gov> <1262721695.5107.1.camel@idol> <4B463F37.4010108@noaa.gov> <1cd32cbb1001071232l6c3d3525g4ec4747d62d998ed@mail.gmail.com> <4B465605.3010406@noaa.gov> <1cd32cbb1001071515p498c8746u5dce34453346c97f@mail.gmail.com> <4B466E4C.10504@noaa.gov> <4B46889E.3020008@noaa.gov> <4B4BBE2E.9090706@noaa.gov> <1263285477.7976.10.camel@talisman> <4B4CC02A.9020807@noaa.gov> Message-ID: <4B4D1FA7.9070205@noaa.gov> Christopher Barker wrote: > static int > @fname at _fromstr(char *str, @type@ *ip, char **endptr, PyArray_Descr > *NPY_UNUSED(ignore)) > { > double result; > result = NumPyOS_ascii_strtod(str, endptr); > *ip = (@type@) result; > return 0; > } OK, I've done the diagnostics, but not all of the fix. Here's the issue: numpyos.c: NumPyOS_ascii_strtod() Was incrementing the input pointer to strip out whitespace before passing it on to PyOS_ascii_strtod(). So the **endptr getting passed back to @fname at _fromstr didn't match. I've fixed that -- so now it should be possible to check if str and *endptr are the same after the call, to see if a double was actually read -- I"m not suite sure what to do in that case, but a return code is a good start. However, I also took a look at integers. For example: In [39]: np.fromstring("4.5, 3", sep=',', dtype=np.int) Out[39]: array([4]) clearly wrong -- it may be OK to read "4.5" as 4, but then it stops, I guess because there is a ".5" before the next sep. Anyway, not the best solution. However, in this case, the function is here: @fname at _fromstr(char *str, @type@ *ip, char **endptr, PyArray_Descr *NPY_UNUSED(ignore)) { @btype@ result; result = PyOS_strto at func@(str, endptr, 10); *ip = (@type@) result; printf("In int fromstr - result: %i\n", result ); printf("In int fromstr - str: '%s', %p %p\n", str, str, *endptr); return 0; } so it's calling PyOS_strtol(), which when called on "4.5" returns 4 -- which explains the abive behaviou -- but how to know that that wasn't a proper reading? This really is a mess! Since there was just some talk about a 1.4.1 -- I'd like to get some of this fixed before then -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From denis-bz-py at t-online.de Wed Jan 13 05:47:22 2010 From: denis-bz-py at t-online.de (denis) Date: Wed, 13 Jan 2010 11:47:22 +0100 Subject: [Numpy-discussion] numpy1.4 dtype issues: scipy.stats & pytables In-Reply-To: <3d375d731001120841y7bc9bb9cla1b49c6deb2b5232@mail.gmail.com> References: <3D845F44-C5F5-4F56-8233-E97C2D5A6CE2@gmail.com> <1cd32cbb1001110910r2cde5bdgac72e7d176ead46a@mail.gmail.com> <3d375d731001120841y7bc9bb9cla1b49c6deb2b5232@mail.gmail.com> Message-ID: On 12/01/2010 17:41, Robert Kern wrote: > It's not a bug, but it is a known issue. We tried very hard to keep > numpy 1.4 binary compatible; however, Pyrex and Cython impose > additional runtime checks above and beyond binary compatibility. Robert, Josef, are you saying that mac users shouldn't expect numpy-1.4.0-py2.6-python.org.dmg scipy-0.7.1-py2.6-python.org.dmg to "just work" together, download and go ? If not, then the download pages should clearly say "... may not work with ..." (If they weren't tested together, that's imho a problem in the process; I realize that testing is hard work, no glory.) 
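(A smoke test for that would not have to be elaborate -- something along the lines of the one-liner below, run once per pair of installers; this is only an illustration of the idea, not an existing script:

    python -c "import numpy, scipy.stats; print numpy.__version__, scipy.__version__"

an incompatibility between the two binaries would show up right at the import of the compiled scipy subpackage.)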
cheers -- denis From eadrogue at gmx.net Wed Jan 13 06:57:03 2010 From: eadrogue at gmx.net (Ernest =?iso-8859-1?Q?Adrogu=E9?=) Date: Wed, 13 Jan 2010 12:57:03 +0100 Subject: [Numpy-discussion] numpy sum table by category In-Reply-To: <19276.56446.757824.576522@ny.koplon.com> References: <19276.56446.757824.576522@ny.koplon.com> Message-ID: <20100113115702.GA6116@doriath.local> 12/01/10 @ 15:33 (-0500), thus spake Marc Schwarzschild: > > > I have a csv file like this: > > Account, Symbol, Quantity, Price > One,SPY,5,119.00 > One,SPY,3,120.00 > One,SPY,-2,125.00 > One,GE,... > One,GE,... > Two,SPY, ... > Three,GE, ... > ... > > The data is much larger, could be 10,000 records. I can load it > into a numpy array using matplotlib.mlab.csv2rec(). I learned > several useful numpy functions and have been reading lots of > documentation. However, I have not found a way to create a > unique list of symbols and the Sum of their respective Quantity > values. If x is your record array: for sym in set(x['Symbol']): mask = x['Symbol'] == sym print sym, x[mask]['Quantity'].sum() > I want do various calculations on the data like pull out > all the records for a given Account. The actual data has lots > more columns and sometimes I may want to sum(Quantity*Price) by > Account and Symbol. To get a subset of records matching an arbitrary criteria, you use boolean arrays. For example, x['Account'] == 'name' generates a boolean array of the same length as x, with each element being True or False depending on whether in that record the Account field was equal to 'name'. Then such arrays can be used as an index on the original x array, to get the subset of records. This is what the example above does. Cheers. From david at silveregg.co.jp Wed Jan 13 23:41:13 2010 From: david at silveregg.co.jp (David Cournapeau) Date: Thu, 14 Jan 2010 13:41:13 +0900 Subject: [Numpy-discussion] numpy1.4 dtype issues: scipy.stats & pytables In-Reply-To: References: <3D845F44-C5F5-4F56-8233-E97C2D5A6CE2@gmail.com> <1cd32cbb1001110910r2cde5bdgac72e7d176ead46a@mail.gmail.com> <3d375d731001120841y7bc9bb9cla1b49c6deb2b5232@mail.gmail.com> Message-ID: <4B4EA069.8040103@silveregg.co.jp> denis wrote: > On 12/01/2010 17:41, Robert Kern wrote: > >> It's not a bug, but it is a known issue. We tried very hard to keep >> numpy 1.4 binary compatible; however, Pyrex and Cython impose >> additional runtime checks above and beyond binary compatibility. > > Robert, Josef, > are you saying that mac users shouldn't expect > numpy-1.4.0-py2.6-python.org.dmg > scipy-0.7.1-py2.6-python.org.dmg > to "just work" together, download and go ? It would not work for the concerned subpackages, no. > If not, then the download pages should clearly say "... may not work with ..." > (If they weren't tested together, that's imho a problem in the process; > I realize that testing is hard work, no glory.) It is not so much hard-work than time consuming, at least as long as we don't have automated testing of binaries. Unfortunately, the problem was not caught properly during the beta phase, David From cournape at gmail.com Thu Jan 14 00:02:47 2010 From: cournape at gmail.com (David Cournapeau) Date: Thu, 14 Jan 2010 14:02:47 +0900 Subject: [Numpy-discussion] Wanted: new release manager for 1.5 and above Message-ID: <5b8d13221001132102k4f19ee45td6b54bd5df3c8578@mail.gmail.com> Hi, as I already hinted to some people in person, I won't make new releases of numpy (and scipy) as I used to. 
To ease the transition, I think it would be good to have new people with who we could make say the 1.4.1 release together. Making installers is relatively streamlined now, it can be done almost 100 % automatically - the testing is still manual. I still think it would be a good idea to have a different release manager for each release - it may be easier to find someone to do it if it is only for one release cycle. cheers, David From charlesr.harris at gmail.com Thu Jan 14 01:07:16 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 13 Jan 2010 23:07:16 -0700 Subject: [Numpy-discussion] Wanted: new release manager for 1.5 and above In-Reply-To: <5b8d13221001132102k4f19ee45td6b54bd5df3c8578@mail.gmail.com> References: <5b8d13221001132102k4f19ee45td6b54bd5df3c8578@mail.gmail.com> Message-ID: On Wed, Jan 13, 2010 at 10:02 PM, David Cournapeau wrote: > Hi, > > as I already hinted to some people in person, I won't make new > releases of numpy (and scipy) as I used to. To ease the transition, I > think it would be good to have new people with who we could make say > the 1.4.1 release together. Making installers is relatively > streamlined now, it can be done almost 100 % automatically - the > testing is still manual. > > I still think it would be a good idea to have a different release > manager for each release - it may be easier to find someone to do it > if it is only for one release cycle. > > What is the setup one needs to build the installers? It might be well to document that, the dependencies, and the process. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From david at silveregg.co.jp Thu Jan 14 01:34:10 2010 From: david at silveregg.co.jp (David Cournapeau) Date: Thu, 14 Jan 2010 15:34:10 +0900 Subject: [Numpy-discussion] Wanted: new release manager for 1.5 and above In-Reply-To: References: <5b8d13221001132102k4f19ee45td6b54bd5df3c8578@mail.gmail.com> Message-ID: <4B4EBAE2.7040703@silveregg.co.jp> Charles R Harris wrote: > > > What is the setup one needs to build the installers? It might be well to > document that, the dependencies, and the process. Right. The top script is: http://projects.scipy.org/numpy/browser/trunk/release.sh the bulk of the work is in : http://projects.scipy.org/numpy/browser/trunk/pavement.py which describes what is needed to build installers. On mac os x, the release script may be used as is to build every installer + the release notes. David From jonboym2 at yahoo.co.uk Thu Jan 14 03:55:46 2010 From: jonboym2 at yahoo.co.uk (Jon Moore) Date: Thu, 14 Jan 2010 08:55:46 +0000 Subject: [Numpy-discussion] Getting Callbacks with arrays to work In-Reply-To: <4B4C7CC1.3010304@cens.ioc.ee> References: <832556.93262.qm@web24504.mail.ird.yahoo.com> <4B4C7CC1.3010304@cens.ioc.ee> Message-ID: <4B4EDC12.3050409@yahoo.co.uk> Hi, Thanks all works now! The implicit none only didn't work when defining dv as a function now its a subroutine it seems to work. Regards Jon On 12/01/2010 13:44, Pearu Peterson wrote: > Hi, > > The problem is that f2py does not support callbacks that > return arrays. There is easy workaround to that: provide > returnable arrays as arguments to callback functions. 
> Using your example: > > SUBROUTINE CallbackTest(dv,v0,Vout,N) > IMPLICIT NONE > > !F2PY intent( hide ):: N > INTEGER:: N, ic > EXTERNAL:: dv > > DOUBLE PRECISION, DIMENSION( N ), INTENT(IN):: v0 > DOUBLE PRECISION, DIMENSION( N ), INTENT(OUT):: Vout > > DOUBLE PRECISION, DIMENSION( N ):: Vnow > DOUBLE PRECISION, DIMENSION( N ):: temp > > Vnow = v0 > !f2py intent (out) temp > call dv(temp, Vnow, N) > > DO ic = 1, N > Vout( ic ) = temp(ic) > END DO > > END SUBROUTINE CallbackTest > > $ f2py -c test.f90 -m t --fcompiler=gnu95 > >>>> from numpy import * >>>> from t import * >>>> arr = array([2.0, 4.0, 6.0, 8.0]) >>>> def dV(v): > print 'in Python dV: V is: ',v > ret = v.copy() > ret[1] = 100.0 > return ret > ... >>>> output = callbacktest(dV, arr) > in Python dV: V is: [ 2. 4. 6. 8.] >>>> output > array([ 2., 100., 6., 8.]) > > What problems do you have with implicit none? It works > fine here. Check the format of your source code, > if it is free then use `.f90` extension, not `.f`. > > HTH, > Pearu > > Jon Moore wrote: >> Hi, >> >> I'm trying to build a differential equation integrator and later a >> stochastic differential equation integrator. >> >> I'm having trouble getting f2py to work where the callback itself >> receives an array from the Fortran routine does some work on it and then >> passes an array back. >> >> For the stoachastic integrator I'll need 2 callbacks both dealing with >> arrays. >> >> The idea is the code that never changes (ie the integrator) will be in >> Fortran and the code that changes (ie the callbacks defining >> differential equations) will be different for each problem. >> >> To test the idea I've written basic code which should pass an array back >> and forth between Python and Fortran if it works right. >> >> Here is some code which doesn't work properly:- >> >> SUBROUTINE CallbackTest(dv,v0,Vout,N) >> !IMPLICIT NONE >> >> cF2PY intent( hide ):: N >> INTEGER:: N, ic >> >> EXTERNAL:: dv >> >> DOUBLE PRECISION, DIMENSION( N ), INTENT(IN):: v0 >> DOUBLE PRECISION, DIMENSION( N ), INTENT(OUT):: Vout >> >> DOUBLE PRECISION, DIMENSION( N ):: Vnow >> DOUBLE PRECISION, DIMENSION( N ):: temp >> >> Vnow = v0 >> >> >> temp = dv(Vnow, N) >> >> DO ic = 1, N >> Vout( ic ) = temp(ic) >> END DO >> >> END SUBROUTINE CallbackTest >> >> >> >> When I test it with this python code I find the code just replicates the >> first term of the array! >> >> >> >> >> from numpy import * >> import callback as c >> >> def dV(v): >> print 'in Python dV: V is: ',v >> return v.copy() >> >> arr = array([2.0, 4.0, 6.0, 8.0]) >> >> print 'Arr is: ', arr >> >> output = c.CallbackTest(dV, arr) >> >> print 'Out is: ', output >> >> >> >> >> Arr is: [ 2. 4. 6. 8.] >> >> in Python dV: V is: [ 2. 4. 6. 8.] >> >> Out is: [ 2. 2. 2. 2.] >> >> >> >> Any ideas how I should do this, and also how do I get the code to work >> with implicit none not commented out? 
>> >> Thanks >> >> Jon >> >> >> >> ------------------------------------------------------------------------ >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From sebastian.walter at gmail.com Thu Jan 14 04:11:29 2010 From: sebastian.walter at gmail.com (Sebastian Walter) Date: Thu, 14 Jan 2010 10:11:29 +0100 Subject: [Numpy-discussion] wrong casting of augmented assignment statements In-Reply-To: <3d375d731001121038t226ed0c6md986851be317402d@mail.gmail.com> References: <3d375d731001121009p18e01199y2506b8a205d135fa@mail.gmail.com> <3d375d731001121038t226ed0c6md986851be317402d@mail.gmail.com> Message-ID: I've written a self-contained example that shows that numpy indeed tries to call the __float__ method. What is buggy is what happens if calling the __float__ method raises an Exception. Then numpy assumes (in this case wrongly) that the object should be casted to the neutral element. I'd guess that the __float__ method is called somewhere in a try: statement and if an exception is raised it is casted to the neutral element. I've tried to locate the corresponding code in the numpy sources but I got lost. Could someone be so kind and point me to it? -------------------- start code ---------------------- import numpy print 'numpy.__version__ = ',numpy.__version__ class ad1: def __init__(self,x): self.x = x def __mul__(self,other): if not isinstance(other, self.__class__): return self.__class__(self.x * other) return self.__class__(self.x * other.x) def __rmul__(self,other): return self * other def __float__(self): raise Exception('this is not possible') def __str__(self): return str(self.x) print '\nThis example yields buggy behavior:' x1 = numpy.array([ad1(1.), ad1(2.), ad1(3.)]) y1 = numpy.random.rand(3) print 'y1= ',y1 print 'x1= ',x1 z1 = x1 * y1 y1 *= x1 # this should call the __float__ method of ad1 which would raise an Exception print 'z1=x1*y1',z1 print 'y1*=x1 ',y1 class ad2: def __init__(self,x): self.x = x def __mul__(self,other): if not isinstance(other, self.__class__): return self.__class__(self.x * other) return self.__class__(self.x * other.x) def __rmul__(self,other): return self * other def __float__(self): return float(self.x) def __str__(self): return str(self.x) print '\nThis example works fine:' x2 = numpy.array([ad2(1.), ad2(2.), ad2(3.)]) y2 = numpy.random.rand(3) print 'y2= ',y2 print 'x2= ',x2 z2 = x2 * y2 y2 *= x2 # this should call the __float__ method of ad1 which would raise an Exception print 'z2=x2*y2',z2 print 'y2*=x2 ',y2 -------------------- end code ---------------------- -------- output --------- walter at wronski$ python wrong_casting_object_to_float_of_augmented_assignment_statements.py numpy.__version__ = 1.3.0 This example yields buggy behavior: y1= [ 0.15322371 0.47915903 0.81153995] x1= [1.0 2.0 3.0] z1=x1*y1 [0.153223711127 0.958318053803 2.43461983729] y1*=x1 [ 0.15322371 0.47915903 0.81153995] This example works fine: y2= [ 0.49377037 0.60908423 0.79772095] x2= [1.0 2.0 3.0] z2=x2*y2 [0.493770370747 1.21816846399 2.39316283707] y2*=x2 [ 0.49377037 1.21816846 2.39316284] -------- end output --------- On Tue, Jan 12, 2010 at 7:38 PM, Robert Kern wrote: > On Tue, Jan 12, 2010 at 12:31, Sebastian Walter > wrote: >> On Tue, Jan 12, 2010 at 7:09 PM, Robert 
Kern wrote: >>> On Tue, Jan 12, 2010 at 12:05, Sebastian Walter >>> wrote: >>>> Hello, >>>> I have a question about the augmented assignment statements *=, +=, etc. >>>> Apparently, the casting of types is not working correctly. Is this >>>> known resp. intended behavior of numpy? >>> >>> Augmented assignment modifies numpy arrays in-place, so the usual >>> casting rules for assignment into an array apply. Namely, the array >>> being assigned into keeps its dtype. >> >> what are the usual casting rules? > > For assignment into an array, the array keeps its dtype and the data > being assigned into it will be cast to that dtype. > >> How does numpy know how to cast an object to a float? > > For a general object, numpy will call its __float__ method. > >>> If you do not want in-place modification, do not use augmented assignment. >> >> Normally, I'd be perfectly fine with that. >> However, this particular problem occurs when you try to automatically >> differentiate an algorithm by using an Algorithmic Differentiation >> (AD) tool. >> E.g. given a function >> >> x = numpy.ones(2) >> def f(x): >> ? a = numpy.ones(2) >> ? a *= x >> ? return numpy.sum(a) >> >> one would use an AD tool as follows: >> x = numpy.array([adouble(1.), adouble(1.)]) >> y = f(x) >> >> but since the casting from object to float is not possible the >> computed gradient \nabla_x f(x) will be wrong. > > Sorry, but that's just a limitation of the AD approach. There are all > kinds of numpy constructions that AD can't handle. > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > ?-- Umberto Eco > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From cournape at gmail.com Thu Jan 14 04:53:16 2010 From: cournape at gmail.com (David Cournapeau) Date: Thu, 14 Jan 2010 18:53:16 +0900 Subject: [Numpy-discussion] Matrix vs array in ma.minimum Message-ID: <5b8d13221001140153ocd65f2fwc64c4820aa9c7a@mail.gmail.com> Hi, I encountered a problem in matlab which boils down to a surprising behavior of np.ma.minimum: x = np.random.randn(2, 3) mx = np.matrix(x) np.ma.minimum(x) # smallest item of x ret = np.ma.minimum(mx) # flattened version of mx, i.e. ret == mx.flatten() Is this expected ? cheers, David From pgmdevlist at gmail.com Thu Jan 14 07:22:02 2010 From: pgmdevlist at gmail.com (Pierre GM) Date: Thu, 14 Jan 2010 07:22:02 -0500 Subject: [Numpy-discussion] Matrix vs array in ma.minimum In-Reply-To: <5b8d13221001140153ocd65f2fwc64c4820aa9c7a@mail.gmail.com> References: <5b8d13221001140153ocd65f2fwc64c4820aa9c7a@mail.gmail.com> Message-ID: <6C95E82E-B37F-47CC-A80D-995C779331BF@gmail.com> On Jan 14, 2010, at 4:53 AM, David Cournapeau wrote: > Hi, > > I encountered a problem in matlab which boils down to a surprising > behavior of np.ma.minimum: > > x = np.random.randn(2, 3) > mx = np.matrix(x) > > np.ma.minimum(x) # smallest item of x > ret = np.ma.minimum(mx) # flattened version of mx, i.e. ret == mx.flatten() > > Is this expected ? Er, no. np.ma.minimum(a, b) returns the lowest value of a and b element-wsie, or the the lowest element of a is b is None. The behavior is inherited from the very first implementation of maskedarray in numeric. This itself is unexpected, since np.minimum requires at least 2 input arguments. 
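To spell out the two calling conventions on a plain float array (a made-up 2x3 example, not taken from the thread):

    import numpy as np
    a = np.array([[3., 1., 2.],
                  [0., 5., 4.]])
    np.minimum(a, 2.)      # the ufunc: two arguments required, element-wise minimum
    np.ma.minimum(a, 2.)   # two arguments: same element-wise behavior
    np.ma.minimum(a)       # one argument, axis=None: the smallest element of a, here 0.0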
As you observed, the current function breaks down w/ np.matrix objects when only one argument is given (and when the axis is None): we call umath.minimum.reduce on the ravelled matirx, which returns the ravelled matrix. One would expect a scalar, so yes, this behavior is also unexpected. Now, which way should we go ? Keep np.ma.minimum as it is (fixing the bug so that a scalar is returned if the function is called with only 1 argument and an axis None) ? Adapt it to match np.minimum ? From matthew.brett at gmail.com Thu Jan 14 08:01:48 2010 From: matthew.brett at gmail.com (Matthew Brett) Date: Thu, 14 Jan 2010 13:01:48 +0000 Subject: [Numpy-discussion] dtype.isbuiltin changed by .newbyteorder Message-ID: <1e2af89e1001140501m188703d6k46cf68c0ec05cacf@mail.gmail.com> Hi, Over on the scipy list, someone pointed out an oddness in the output of the matlab reader, which revealed this - to me - unexpected behavior in numpy: In [20]: dt = np.dtype('f8') In [21]: dt.isbuiltin Out[21]: 1 In [22]: ndt = dt.newbyteorder('<') In [23]: ndt.isbuiltin Out[23]: 0 I was expecting the 'isbuiltin' attribute to be the same (1) after byte swapping. Does that seem reasonable to y'all? Then, is this a bug? Thanks a lot, Matthew From robert.kern at gmail.com Thu Jan 14 10:53:14 2010 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 14 Jan 2010 09:53:14 -0600 Subject: [Numpy-discussion] dtype.isbuiltin changed by .newbyteorder In-Reply-To: <1e2af89e1001140501m188703d6k46cf68c0ec05cacf@mail.gmail.com> References: <1e2af89e1001140501m188703d6k46cf68c0ec05cacf@mail.gmail.com> Message-ID: <3d375d731001140753o371228a2v160c97af9a06e97b@mail.gmail.com> On Thu, Jan 14, 2010 at 07:01, Matthew Brett wrote: > Hi, > > Over on the scipy list, someone pointed out an oddness in the output > of the matlab reader, which revealed this - to me - unexpected > behavior in numpy: > > In [20]: dt = np.dtype('f8') > > In [21]: dt.isbuiltin > Out[21]: 1 > > In [22]: ndt = dt.newbyteorder('<') > > In [23]: ndt.isbuiltin > Out[23]: 0 > > I was expecting the 'isbuiltin' attribute to be the same (1) after > byte swapping. ? ?Does that seem reasonable to y'all? ?Then, is this a > bug? It is at least undesirable. It may not be a bug per se as I don't think that we guarantee that .isbuiltin is free from false negatives (though we do guarantee that it is free from false positives). The reason is that we would have to search the builtin dtypes for a match every time we create a new dtype object, and that could be more expensive than we care to do for *every* creation of a dtype object. It is possible that we can have a cheaper heuristic (native byte order and the standard typecodes) and that transformations like .newbyteorder() can have just a teeny bit more intelligent logic about how it transforms the .isbuiltin flag. Just for clarity and future googling, the issue is that when a native dtype has .newbyteorder() called on it to make a new dtype that has the *same* native byte order, the .isbuiltin flag incorrectly states that it is not builtin. Using .newbyteorder() to swap the byte order to the non-native byte order should and does cause the resulting dtype to not be considered builtin. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From matthew.brett at gmail.com Thu Jan 14 12:02:43 2010 From: matthew.brett at gmail.com (Matthew Brett) Date: Thu, 14 Jan 2010 17:02:43 +0000 Subject: [Numpy-discussion] dtype.isbuiltin changed by .newbyteorder In-Reply-To: <3d375d731001140753o371228a2v160c97af9a06e97b@mail.gmail.com> References: <1e2af89e1001140501m188703d6k46cf68c0ec05cacf@mail.gmail.com> <3d375d731001140753o371228a2v160c97af9a06e97b@mail.gmail.com> Message-ID: <1e2af89e1001140902p65e2bf72u8657353e654baa8@mail.gmail.com> Hi, > It is at least undesirable. It may not be a bug per se as I don't > think that we guarantee that .isbuiltin is free from false negatives > (though we do guarantee that it is free from false positives). The > reason is that we would have to search the builtin dtypes for a match > every time we create a new dtype object, and that could be more > expensive than we care to do for *every* creation of a dtype object. > It is possible that we can have a cheaper heuristic (native byte order > and the standard typecodes) and that transformations like > .newbyteorder() can have just a teeny bit more intelligent logic about > how it transforms the .isbuiltin flag. Thanks - that's very clear and helpful, and made me realize I didn't understand the builtin attribute. I suppose something like the following: output_dtype.isbuiltin = input_dtype.isbuiltin and new_byteorder == native would at least reduce the false negatives at little cost. Cheers, Matthew From lists at onerussian.com Thu Jan 14 16:50:57 2010 From: lists at onerussian.com (Yaroslav Halchenko) Date: Thu, 14 Jan 2010 16:50:57 -0500 Subject: [Numpy-discussion] comparison operators (e.g. ==) on array with dtype object do not work In-Reply-To: <031.76ccb962b442426fadc5c22295c7d17f@scipy.org> References: <022.87efc5288ac90c7afc65516c6af78b0a@scipy.org> <031.76ccb962b442426fadc5c22295c7d17f@scipy.org> Message-ID: <20100114215057.GX18213@onerussian.com> Dear NumPy People, First I want to apologize if I misbehaved on NumPy Trac by reopening the closed ticket http://projects.scipy.org/numpy/ticket/1362 but I still feel strongly that there is misunderstanding and the bug/defect is valid. I would appreciate if someone would waste more of his time to persuade me that I am wrong but please first read till the end: The issue, as originally reported, is demonstrated with: ,--- | > python -c 'import numpy as N; print N.__version__; a=N.array([1, (0,1)],dtype=object); print a==1; print a == (0,1), a[1] == (0,1)' | 1.5.0.dev | [ True False] | [False False] True `--- whenever I expected the last line to be [False True] True charris (thanks for all the efforts to enlighten me) summarized it as """the result was correct given that the tuple (0,1) was converted to an object array with elements 0 and 1. It is *not* converted to an array containing a tuple. """ and I was trying to argue that it is not the case in my example. 
It is the case in charris's example though whenever both elements are of the same length, or there is just a single tuple, i.e. ,--- | In [1]: array((0,1), dtype=object) | Out[1]: array([0, 1], dtype=object) | | In [2]: array((0,1), dtype=object).shape | Out[2]: (2,) `--- There I would not expect my comparison to be valid indeed. But lets see what happens in my case: ,--- | In [2]: array([1, (0,1)],dtype=object) | Out[2]: array([1, (0, 1)], dtype=object) | | *In [3]: array([1, (0,1)],dtype=object).shape | Out[3]: (2,) | | *In [4]: array([1, (0,1)],dtype=object)[1].shape | --------------------------------------------------------------------------- | AttributeError Traceback (most recent call | last) | | /home/yoh/proj/ in () | | AttributeError: 'tuple' object has no attribute 'shape' `--- So, as far as I see it, the array does contain an object of type tuple, which does not get correctly compared upon __eq__ operation. Am I wrong? Or does numpy internally somehow does convert 1st item (ie tuple) into an array, but casts it back to tuple upon __repr__ or __getitem__? Thanks in advance for feedback On Thu, 14 Jan 2010, NumPy Trac wrote: > #1362: comparison operators (e.g. ==) on array with dtype object do not work > -------------------------+-------------------------------------------------- > Reporter: yarikoptic | Owner: somebody > Type: defect | Status: closed > Priority: normal | Milestone: > Component: Other | Version: > Resolution: invalid | Keywords: > -------------------------+-------------------------------------------------- > Changes (by charris): > * status: reopened => closed > * resolution: => invalid > Old description: > > You can see this better with the '*' operator: > > {{{ > > In [8]: a * (0,2) > > Out[8]: array([0, (0, 1, 0, 1)], dtype=object) > > }}} > > Note how the tuple is concatenated with itself. The reason the original > > instance of a worked was that 1 and (0,1) are of different lengths, so > > the decent into the nested sequence types stopped at one level and a > > tuple is one of the elements. When you do something like ((0,1),(0,1)) > > the decent goes down two levels and you end up with a 2x2 array of > > integer objects. The rule of thumb for object arrays is that you get an > > array with as many indices as possible. Which is why object arrays are > > hard to create. Another example: > > {{{ > > In [10]: array([(1,2,3),(1,2)], dtype=object) > > Out[10]: array([(1, 2, 3), (1, 2)], dtype=object) > > In [11]: array([(1,2),(1,2)], dtype=object) > > Out[11]: > > array([[1, 2], > > [1, 2]], dtype=object) > > }}} > New description: > {{{ > python -c 'import numpy as N; print N.__version__; a=N.array([1, > (0,1)],dtype=object); print a==1; print a == (0,1), a[1] == (0,1)' > }}} > results in > {{{ > 1.5.0.dev > [ True False] > [False False] True > }}} > I expected last line to be > {{{ > [False True] True > }}} > So, it works for int but doesn't work for tuple... I guess it doesn't try > to compare element by element but does smth else. -- Yaroslav O. Halchenko Postdoctoral Fellow, Department of Psychological and Brain Sciences Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755 Phone: +1 (603) 646-9834 Fax: +1 (603) 646-1419 WWW: http://www.linkedin.com/in/yarik From warren.weckesser at enthought.com Thu Jan 14 17:49:09 2010 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Thu, 14 Jan 2010 16:49:09 -0600 Subject: [Numpy-discussion] comparison operators (e.g. 
==) on array with dtype object do not work In-Reply-To: <20100114215057.GX18213@onerussian.com> References: <022.87efc5288ac90c7afc65516c6af78b0a@scipy.org> <031.76ccb962b442426fadc5c22295c7d17f@scipy.org> <20100114215057.GX18213@onerussian.com> Message-ID: <4B4F9F65.2070308@enthought.com> Yaroslav Halchenko wrote: > Dear NumPy People, > > First I want to apologize if I misbehaved on NumPy Trac by reopening the > closed ticket > http://projects.scipy.org/numpy/ticket/1362 > but I still feel strongly that there is misunderstanding > and the bug/defect is valid. I would appreciate if someone would waste > more of his time to persuade me that I am wrong but please first read > till the end: > > The issue, as originally reported, is demonstrated with: > > ,--- > | > python -c 'import numpy as N; print N.__version__; a=N.array([1, (0,1)],dtype=object); print a==1; print a == (0,1), a[1] == (0,1)' > | 1.5.0.dev > | [ True False] > | [False False] True > `--- > > whenever I expected the last line to be > > [False True] True > > charris (thanks for all the efforts to enlighten me) summarized it as > > """the result was correct given that the tuple (0,1) was converted to an > object array with elements 0 and 1. It is *not* converted to an array > containing a tuple. """ > > and I was trying to argue that it is not the case in my example. It is > the case in charris's example though whenever both elements are of > the same length, or there is just a single tuple, i.e. > > The "problem" is that the tuple is converted to an array in the statement that does the comparison, not in the construction of the array. Numpy attempts to convert the right hand side of the == operator into an array. It then does the comparison using the two arrays. One way to get what you want is to create your own array and then do the comparison: In [1]: import numpy as np In [2]: a = np.array([1, (0,1)], dtype='O') In [3]: t = np.empty(1, dtype='O') In [4]: t[0] = (0,1) In [5]: a == t Out[5]: array([False, True], dtype=bool) In the above code, a numpy array 't' of objects with shape (1,) is created, and the single element is assigned the value (0,1). Then the comparison works as expected. More food for thought: In [6]: b = np.array([1, (0,1), "foo"], dtype='O') In [7]: b == 1 Out[7]: array([ True, False, False], dtype=bool) In [8]: b == (0,1) Out[8]: False In [9]: b == "foo" Out[9]: array([False, False, True], dtype=bool) Warren > ,--- > | In [1]: array((0,1), dtype=object) > | Out[1]: array([0, 1], dtype=object) > | > | In [2]: array((0,1), dtype=object).shape > | Out[2]: (2,) > `--- > > There I would not expect my comparison to be valid indeed. But lets see what > happens in my case: > > ,--- > | In [2]: array([1, (0,1)],dtype=object) > | Out[2]: array([1, (0, 1)], dtype=object) > | > | *In [3]: array([1, (0,1)],dtype=object).shape > | Out[3]: (2,) > | > | *In [4]: array([1, (0,1)],dtype=object)[1].shape > | --------------------------------------------------------------------------- > | AttributeError Traceback (most recent call > | last) > | > | /home/yoh/proj/ in () > | > | AttributeError: 'tuple' object has no attribute 'shape' > `--- > > So, as far as I see it, the array does contain an object of type tuple, > which does not get correctly compared upon __eq__ operation. Am I > wrong? Or does numpy internally somehow does convert 1st item (ie > tuple) into an array, but casts it back to tuple upon __repr__ or > __getitem__? 
> > Thanks in advance for feedback > > On Thu, 14 Jan 2010, NumPy Trac wrote: > > >> #1362: comparison operators (e.g. ==) on array with dtype object do not work >> -------------------------+-------------------------------------------------- >> Reporter: yarikoptic | Owner: somebody >> Type: defect | Status: closed >> Priority: normal | Milestone: >> Component: Other | Version: >> Resolution: invalid | Keywords: >> -------------------------+-------------------------------------------------- >> Changes (by charris): >> > > >> * status: reopened => closed >> * resolution: => invalid >> > > > >> Old description: >> > > >>> You can see this better with the '*' operator: >>> > > > >>> {{{ >>> In [8]: a * (0,2) >>> Out[8]: array([0, (0, 1, 0, 1)], dtype=object) >>> }}} >>> > > > >>> Note how the tuple is concatenated with itself. The reason the original >>> instance of a worked was that 1 and (0,1) are of different lengths, so >>> the decent into the nested sequence types stopped at one level and a >>> tuple is one of the elements. When you do something like ((0,1),(0,1)) >>> the decent goes down two levels and you end up with a 2x2 array of >>> integer objects. The rule of thumb for object arrays is that you get an >>> array with as many indices as possible. Which is why object arrays are >>> hard to create. Another example: >>> > > > >>> {{{ >>> In [10]: array([(1,2,3),(1,2)], dtype=object) >>> Out[10]: array([(1, 2, 3), (1, 2)], dtype=object) >>> > > >>> In [11]: array([(1,2),(1,2)], dtype=object) >>> Out[11]: >>> array([[1, 2], >>> [1, 2]], dtype=object) >>> }}} >>> > > >> New description: >> > > >> {{{ >> python -c 'import numpy as N; print N.__version__; a=N.array([1, >> (0,1)],dtype=object); print a==1; print a == (0,1), a[1] == (0,1)' >> }}} >> results in >> {{{ >> 1.5.0.dev >> [ True False] >> [False False] True >> }}} >> I expected last line to be >> {{{ >> [False True] True >> }}} >> So, it works for int but doesn't work for tuple... I guess it doesn't try >> to compare element by element but does smth else. >> From josef.pktd at gmail.com Thu Jan 14 18:40:20 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 14 Jan 2010 18:40:20 -0500 Subject: [Numpy-discussion] comparison operators (e.g. ==) on array with dtype object do not work In-Reply-To: <4B4F9F65.2070308@enthought.com> References: <022.87efc5288ac90c7afc65516c6af78b0a@scipy.org> <031.76ccb962b442426fadc5c22295c7d17f@scipy.org> <20100114215057.GX18213@onerussian.com> <4B4F9F65.2070308@enthought.com> Message-ID: <1cd32cbb1001141540i398089f3xc4cecb684ca3ddfb@mail.gmail.com> On Thu, Jan 14, 2010 at 5:49 PM, Warren Weckesser wrote: > Yaroslav Halchenko wrote: >> Dear NumPy People, >> >> First I want to apologize if I misbehaved on NumPy Trac by reopening the >> closed ticket >> http://projects.scipy.org/numpy/ticket/1362 >> but I still feel strongly that there is misunderstanding >> and the bug/defect is valid. ? 
I would appreciate if someone would waste >> more of his time to persuade me that I am wrong but please first read >> till the end: >> >> The issue, as originally reported, is demonstrated with: >> >> ,--- >> | > python -c 'import numpy as N; print N.__version__; a=N.array([1, (0,1)],dtype=object); print a==1; print a == (0,1), ?a[1] == (0,1)' >> | 1.5.0.dev >> | [ True False] >> | [False False] True >> `--- >> >> whenever I expected the last line to be >> >> [False True] True >> >> charris (thanks for all the efforts to enlighten me) summarized it as >> >> """the result was correct given that the tuple (0,1) was converted to an >> object array with elements 0 and 1. It is *not* converted to an array >> containing a tuple. """ >> >> and I was trying to argue that it is not the case in my example. ?It is >> the case in charris's example though whenever both elements are of >> the same length, or there is just a single tuple, i.e. >> >> > > The "problem" is that the tuple is converted to an array in the > statement that > does the comparison, not in the construction of the array. ?Numpy attempts > to convert the right hand side of the == operator into an array. ?It > then does > the comparison using the two arrays. > > One way to get what you want is to create your own array and then do > the comparison: > > In [1]: import numpy as np > > In [2]: a = np.array([1, (0,1)], dtype='O') > > In [3]: t = np.empty(1, dtype='O') > > In [4]: t[0] = (0,1) > > In [5]: a == t > Out[5]: array([False, ?True], dtype=bool) > > > In the above code, a numpy array 't' of objects with shape (1,) is created, > and the single element is assigned the value (0,1). ?Then the comparison > works as expected. > > More food for thought: > > In [6]: b = np.array([1, (0,1), "foo"], dtype='O') > > In [7]: b == 1 > Out[7]: array([ True, False, False], dtype=bool) > > In [8]: b == (0,1) > Out[8]: False > > In [9]: b == "foo" > Out[9]: array([False, False, ?True], dtype=bool) > It looks difficult to construct an object array with only 1 element, since a tuple is interpreted as different array elements. >>> N.array([(0,1)],dtype=object).shape (1, 2) >>> N.array([(0,1),()],dtype=object).shape (2,) >>> c = N.array([(0,1),()],dtype=object)[:1] >>> c.shape1,) >>> a == c array([False, True], dtype=bool) It looks like some convention is necessary for interpreting a tuple in the array construction, but it doesn't look like a problem with the comparison operator just a consequence. Josef > Warren > >> ,--- >> | In [1]: array((0,1), dtype=object) >> | Out[1]: array([0, 1], dtype=object) >> | >> | In [2]: array((0,1), dtype=object).shape >> | Out[2]: (2,) >> `--- >> >> There I would not expect my comparison to be valid indeed. ?But lets see what >> happens in my case: >> >> ,--- >> | In [2]: array([1, (0,1)],dtype=object) >> | Out[2]: array([1, (0, 1)], dtype=object) >> | >> | *In [3]: array([1, (0,1)],dtype=object).shape >> | Out[3]: (2,) >> | >> | *In [4]: array([1, (0,1)],dtype=object)[1].shape >> | --------------------------------------------------------------------------- >> | AttributeError ? ? ? ? ? ? ? ? ? ? ? ? ? ?Traceback (most recent call >> | last) >> | >> | /home/yoh/proj/ in () >> | >> | AttributeError: 'tuple' object has no attribute 'shape' >> `--- >> >> So, as far as I see it, the array does contain an object of type tuple, >> which does not get correctly compared upon __eq__ operation. ?Am I >> wrong? 
?Or does numpy internally somehow does convert 1st item (ie >> tuple) into an array, but casts it back to tuple upon __repr__ or >> __getitem__? >> >> Thanks in advance for feedback >> >> On Thu, 14 Jan 2010, NumPy Trac wrote: >> >> >>> #1362: comparison operators (e.g. ==) on array with dtype object do not work >>> -------------------------+-------------------------------------------------- >>> ? Reporter: ?yarikoptic ?| ? ? ? Owner: ?somebody >>> ? ? ? Type: ?defect ? ? ?| ? ? ?Status: ?closed >>> ? Priority: ?normal ? ? ?| ? Milestone: >>> ?Component: ?Other ? ? ? | ? ? Version: >>> Resolution: ?invalid ? ? | ? ?Keywords: >>> -------------------------+-------------------------------------------------- >>> Changes (by charris): >>> >> >> >>> ? * status: ?reopened => closed >>> ? * resolution: ?=> invalid >>> >> >> >> >>> Old description: >>> >> >> >>>> You can see this better with the '*' operator: >>>> >> >> >> >>>> {{{ >>>> In [8]: a * (0,2) >>>> Out[8]: array([0, (0, 1, 0, 1)], dtype=object) >>>> }}} >>>> >> >> >> >>>> Note how the tuple is concatenated with itself. The reason the original >>>> instance of a worked was that 1 and (0,1) are of different lengths, so >>>> the decent into the nested sequence types stopped at one level and a >>>> tuple is one of the elements. When you do something like ((0,1),(0,1)) >>>> the decent goes down two levels and you end up with a 2x2 array of >>>> integer objects. The rule of thumb for object arrays is that you get an >>>> array with as many indices as possible. Which is why object arrays are >>>> hard to create. Another example: >>>> >> >> >> >>>> {{{ >>>> In [10]: array([(1,2,3),(1,2)], dtype=object) >>>> Out[10]: array([(1, 2, 3), (1, 2)], dtype=object) >>>> >> >> >>>> In [11]: array([(1,2),(1,2)], dtype=object) >>>> Out[11]: >>>> array([[1, 2], >>>> ? ? ? ?[1, 2]], dtype=object) >>>> }}} >>>> >> >> >>> New description: >>> >> >> >>> ?{{{ >>> ?python -c 'import numpy as N; print N.__version__; a=N.array([1, >>> ?(0,1)],dtype=object); print a==1; print a == (0,1), ?a[1] == (0,1)' >>> ?}}} >>> ?results in >>> ?{{{ >>> ?1.5.0.dev >>> ?[ True False] >>> ?[False False] True >>> ?}}} >>> ?I expected last line to be >>> ?{{{ >>> ?[False True] True >>> ?}}} >>> ?So, it works for int but doesn't work for tuple... I guess it doesn't try >>> ?to compare element by element but does smth else. >>> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From lists at onerussian.com Thu Jan 14 19:05:16 2010 From: lists at onerussian.com (Yaroslav Halchenko) Date: Thu, 14 Jan 2010 19:05:16 -0500 Subject: [Numpy-discussion] comparison operators (e.g. ==) on array with dtype object do not work In-Reply-To: <1cd32cbb1001141540i398089f3xc4cecb684ca3ddfb@mail.gmail.com> References: <022.87efc5288ac90c7afc65516c6af78b0a@scipy.org> <031.76ccb962b442426fadc5c22295c7d17f@scipy.org> <20100114215057.GX18213@onerussian.com> <4B4F9F65.2070308@enthought.com> <1cd32cbb1001141540i398089f3xc4cecb684ca3ddfb@mail.gmail.com> Message-ID: <20100115000515.GB19319@onerussian.com> On Thu, 14 Jan 2010, josef.pktd at gmail.com wrote: > It looks difficult to construct an object array with only 1 element, > since a tuple is interpreted as different array elements. yeap > It looks like some convention is necessary for interpreting a tuple in > the array construction, but it doesn't look like a problem with the > comparison operator just a consequence. 
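(a quick recap of that construction convention, only collecting the shapes already shown in this thread -- numpy imported as N, as in the snippets above:

    N.array((0,1), dtype=object).shape             # the tuple is split into elements -> (2,)
    N.array([(1,2,3),(1,2)], dtype=object).shape   # ragged lengths: descent stops, the tuples survive -> (2,)
    N.array([(1,2),(1,2)], dtype=object).shape     # equal lengths: descent continues -> (2, 2)
    t = N.empty(1, dtype=object)
    t[0] = (0,1)                                   # the explicit route to a 1-element object array holding a tuple
    t.shape                                        # -> (1,)
)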
Well -- there is a reason why we use tuples -- they are immutable ... as well as strings actually ;) Thus, imho, it would be a logical API if immutable datatypes are not coerced magically into mutable arrays at least whenever I am already requesting dtype='O'. Such generic treatment of immutable dtypes would address special treatment of strings but it is too much of a change and debatable anyways ;-) -- Yaroslav O. Halchenko Postdoctoral Fellow, Department of Psychological and Brain Sciences Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755 Phone: +1 (603) 646-9834 Fax: +1 (603) 646-1419 WWW: http://www.linkedin.com/in/yarik From lists at onerussian.com Thu Jan 14 19:05:41 2010 From: lists at onerussian.com (Yaroslav Halchenko) Date: Thu, 14 Jan 2010 19:05:41 -0500 Subject: [Numpy-discussion] comparison operators (e.g. ==) on array with dtype object do not work In-Reply-To: <4B4F9F65.2070308@enthought.com> References: <022.87efc5288ac90c7afc65516c6af78b0a@scipy.org> <031.76ccb962b442426fadc5c22295c7d17f@scipy.org> <20100114215057.GX18213@onerussian.com> <4B4F9F65.2070308@enthought.com> Message-ID: <20100115000540.GF18218@onerussian.com> Hi Warren, > The "problem" is that the tuple is converted to an array in the > statement that does the comparison, not in the construction of the > array. Numpy attempts > to convert the right hand side of the == operator into an array. > It then does the comparison using the two arrays. Thanks for the description! It kinda makes sense now, although, in general, I am not pleased with the API, I would take it as a documented feature from now on ;) > One way to get what you want is to create your own array and then do > the comparison: yeah... I might like to check if lhs has dtype==dtype('object') and then convert that rhs item into object array before comparison (for now I just did list comprehension ;)) > In [8]: b == (0,1) > Out[8]: False yeah -- lengths are different now ;) > In [9]: b == "foo" > Out[9]: array([False, False, True], dtype=bool) yeah -- strings are getting special treatment despite being iterables ;) but that is ok I guess anyways The main confusion seems to come from the feature of numpy in doing smart things -- like deciding either it thinks it needs to do element-wise comparison across lhs and rhs (if lengths match) or mapping comparison across all items. That behavior is quite different from basic Python iterable containers suchas tuples and lists, where it does just global comparison: ,--- | *In [33]: [1,2] == [1,3] | Out[33]: False | | *In [34]: array([1,2]) == array([1,3]) | Out[34]: array([ True, False], dtype=bool) `--- I guess I just need to remember that and what you have described thanks again -- Yaroslav O. Halchenko Postdoctoral Fellow, Department of Psychological and Brain Sciences Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755 Phone: +1 (603) 646-9834 Fax: +1 (603) 646-1419 WWW: http://www.linkedin.com/in/yarik From david at silveregg.co.jp Thu Jan 14 20:52:02 2010 From: david at silveregg.co.jp (David Cournapeau) Date: Fri, 15 Jan 2010 10:52:02 +0900 Subject: [Numpy-discussion] Matrix vs array in ma.minimum In-Reply-To: <6C95E82E-B37F-47CC-A80D-995C779331BF@gmail.com> References: <5b8d13221001140153ocd65f2fwc64c4820aa9c7a@mail.gmail.com> <6C95E82E-B37F-47CC-A80D-995C779331BF@gmail.com> Message-ID: <4B4FCA42.1030205@silveregg.co.jp> Pierre GM wrote: > > Er, no. > np.ma.minimum(a, b) returns the lowest value of a and b element-wsie, or the the lowest element of a is b is None. 
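A short sketch of the two calling conventions just described, assuming the numpy.ma API of this era (the input values are made up for illustration):

import numpy as np
from numpy import ma

a = ma.array([[3., -1.], [7., 2.]])
b = ma.array([[2.,  5.], [0., 4.]])

# Two arguments: the lowest value of a and b element-wise, like np.minimum.
print ma.minimum(a, b)

# One argument: the lowest element of a, returned as a scalar --
# something np.minimum itself cannot do, since it needs two arguments.
print ma.minimum(a)

# With a np.matrix input the reduction works on the ravelled matrix and,
# as reported in this thread, a matrix comes back instead of a scalar.
print ma.minimum(np.matrix([[3., -1.], [7., 2.]]))
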
The behavior is inherited from the very first implementation of maskedarray in numeric. This itself is unexpected, since np.minimum requires at least 2 input arguments. > > As you observed, the current function breaks down w/ np.matrix objects when only one argument is given (and when the axis is None): we call umath.minimum.reduce on the ravelled matirx, which returns the ravelled matrix. One would expect a scalar, so yes, this behavior is also unexpected. > > Now, which way should we go ? Keep np.ma.minimum as it is (fixing the bug so that a scalar is returned if the function is called with only 1 argument and an axis None) ? Adapt it to match np.minimum ? I am not a user of Masked Array, so I don't know what is the most desirable behavior. The problem appears when using pylab.imshow on matrices, because matplotlib (and not matlab :) ) uses masked arrays when normalizing the values. cheers, David From pgmdevlist at gmail.com Thu Jan 14 21:59:53 2010 From: pgmdevlist at gmail.com (Pierre GM) Date: Thu, 14 Jan 2010 21:59:53 -0500 Subject: [Numpy-discussion] Matrix vs array in ma.minimum In-Reply-To: <4B4FCA42.1030205@silveregg.co.jp> References: <5b8d13221001140153ocd65f2fwc64c4820aa9c7a@mail.gmail.com> <6C95E82E-B37F-47CC-A80D-995C779331BF@gmail.com> <4B4FCA42.1030205@silveregg.co.jp> Message-ID: <5CEE736D-90A8-4B59-A5DD-BDDEC71FA6C9@gmail.com> On Jan 14, 2010, at 8:52 PM, David Cournapeau wrote: > Pierre GM wrote: > >> >> Er, no. >> np.ma.minimum(a, b) returns the lowest value of a and b element-wsie, or the the lowest element of a is b is None. The behavior is inherited from the very first implementation of maskedarray in numeric. This itself is unexpected, since np.minimum requires at least 2 input arguments. >> >> As you observed, the current function breaks down w/ np.matrix objects when only one argument is given (and when the axis is None): we call umath.minimum.reduce on the ravelled matirx, which returns the ravelled matrix. One would expect a scalar, so yes, this behavior is also unexpected. >> >> Now, which way should we go ? Keep np.ma.minimum as it is (fixing the bug so that a scalar is returned if the function is called with only 1 argument and an axis None) ? Adapt it to match np.minimum ? > > I am not a user of Masked Array, so I don't know what is the most > desirable behavior. I'm not a regular user of np.minimum. > The problem appears when using pylab.imshow on > matrices, because matplotlib (and not matlab :) ) uses masked arrays > when normalizing the values. David, you mind pointing me to the relevan part of the code and/or give me an example ? In any case, I'd appreciate more feedback on the behavior of np.ma.minimum. From charlesr.harris at gmail.com Thu Jan 14 22:47:54 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 14 Jan 2010 20:47:54 -0700 Subject: [Numpy-discussion] comparison operators (e.g. 
==) on array with dtype object do not work In-Reply-To: <4B4F9F65.2070308@enthought.com> References: <022.87efc5288ac90c7afc65516c6af78b0a@scipy.org> <031.76ccb962b442426fadc5c22295c7d17f@scipy.org> <20100114215057.GX18213@onerussian.com> <4B4F9F65.2070308@enthought.com> Message-ID: On Thu, Jan 14, 2010 at 3:49 PM, Warren Weckesser < warren.weckesser at enthought.com> wrote: > Yaroslav Halchenko wrote: > > Dear NumPy People, > > > > First I want to apologize if I misbehaved on NumPy Trac by reopening the > > closed ticket > > http://projects.scipy.org/numpy/ticket/1362 > > but I still feel strongly that there is misunderstanding > > and the bug/defect is valid. I would appreciate if someone would waste > > more of his time to persuade me that I am wrong but please first read > > till the end: > > > > The issue, as originally reported, is demonstrated with: > > > > ,--- > > | > python -c 'import numpy as N; print N.__version__; a=N.array([1, > (0,1)],dtype=object); print a==1; print a == (0,1), a[1] == (0,1)' > > | 1.5.0.dev > > | [ True False] > > | [False False] True > > `--- > > > > whenever I expected the last line to be > > > > [False True] True > > > > charris (thanks for all the efforts to enlighten me) summarized it as > > > > """the result was correct given that the tuple (0,1) was converted to an > > object array with elements 0 and 1. It is *not* converted to an array > > containing a tuple. """ > > > > and I was trying to argue that it is not the case in my example. It is > > the case in charris's example though whenever both elements are of > > the same length, or there is just a single tuple, i.e. > > > > > > The "problem" is that the tuple is converted to an array in the > statement that > does the comparison, not in the construction of the array. Numpy attempts > to convert the right hand side of the == operator into an array. It > then does > the comparison using the two arrays. > > One way to get what you want is to create your own array and then do > the comparison: > > In [1]: import numpy as np > > In [2]: a = np.array([1, (0,1)], dtype='O') > > In [3]: t = np.empty(1, dtype='O') > > In [4]: t[0] = (0,1) > > In [5]: a == t > Out[5]: array([False, True], dtype=bool) > > > In the above code, a numpy array 't' of objects with shape (1,) is created, > and the single element is assigned the value (0,1). Then the comparison > works as expected. > > More food for thought: > > In [6]: b = np.array([1, (0,1), "foo"], dtype='O') > > In [7]: b == 1 > Out[7]: array([ True, False, False], dtype=bool) > > In [8]: b == (0,1) > Out[8]: False > > Oooh, that last one is strange. Also In [6]: arange(2) == arange(3) Out[6]: False So the comparison isn't element-wise. I rather think a shape mismatch error should be raised in this case. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From cournape at gmail.com Thu Jan 14 23:06:36 2010 From: cournape at gmail.com (David Cournapeau) Date: Fri, 15 Jan 2010 13:06:36 +0900 Subject: [Numpy-discussion] Matrix vs array in ma.minimum In-Reply-To: <5CEE736D-90A8-4B59-A5DD-BDDEC71FA6C9@gmail.com> References: <5b8d13221001140153ocd65f2fwc64c4820aa9c7a@mail.gmail.com> <6C95E82E-B37F-47CC-A80D-995C779331BF@gmail.com> <4B4FCA42.1030205@silveregg.co.jp> <5CEE736D-90A8-4B59-A5DD-BDDEC71FA6C9@gmail.com> Message-ID: <5b8d13221001142006u35c6e810gaf0af4806981a821@mail.gmail.com> On Fri, Jan 15, 2010 at 11:59 AM, Pierre GM wrote: > On Jan 14, 2010, at 8:52 PM, David Cournapeau wrote: >> Pierre GM wrote: >> >>> >>> Er, no. 
>>> np.ma.minimum(a, b) returns the lowest value of a and b element-wsie, or the the lowest element of a is b is None. The behavior is inherited from the very first implementation of maskedarray in numeric. This itself is unexpected, since np.minimum requires at least 2 input arguments. >>> >>> As you observed, the current function breaks down w/ np.matrix objects when only one argument is given (and when the axis is None): we call umath.minimum.reduce on the ravelled matirx, which returns the ravelled matrix. One would expect a scalar, so yes, this behavior is also unexpected. >>> >>> Now, which way should we go ? Keep np.ma.minimum as it is (fixing the bug so that a scalar is returned if the function is called with only 1 argument and an axis ?None) ? Adapt it to match np.minimum ? >> >> I am not a user of Masked Array, so I don't know what is the most >> desirable behavior. > > I'm not a regular user of np.minimum. Damn, I thought I coul > >> The problem appears when using pylab.imshow on >> matrices, because matplotlib (and not matlab :) ) uses masked arrays >> when normalizing the values. > > > David, you mind pointing me to the relevan part of the code and/or give me an example ? Here is a self-contained example reproducing the matplotlib pb: import numpy as np from numpy import ma import matplotlib.cbook as cbook class Normalize: """ Normalize a given value to the 0-1 range """ def __init__(self, vmin=None, vmax=None, clip=False): """ If *vmin* or *vmax* is not given, they are taken from the input's minimum and maximum value respectively. If *clip* is *True* and the given value falls outside the range, the returned value will be 0 or 1, whichever is closer. Returns 0 if:: vmin==vmax Works with scalars or arrays, including masked arrays. If *clip* is *True*, masked values are set to 1; otherwise they remain masked. Clipping silently defeats the purpose of setting the over, under, and masked colors in the colormap, so it is likely to lead to surprises; therefore the default is *clip* = *False*. """ self.vmin = vmin self.vmax = vmax self.clip = clip def __call__(self, value, clip=None): if clip is None: clip = self.clip if cbook.iterable(value): vtype = 'array' val = ma.asarray(value).astype(np.float) else: vtype = 'scalar' val = ma.array([value]).astype(np.float) self.autoscale_None(val) vmin, vmax = self.vmin, self.vmax if vmin > vmax: raise ValueError("minvalue must be less than or equal to maxvalue") elif vmin==vmax: return 0.0 * val else: if clip: mask = ma.getmask(val) val = ma.array(np.clip(val.filled(vmax), vmin, vmax), mask=mask) result = (val-vmin) * (1.0/(vmax-vmin)) if vtype == 'scalar': result = result[0] return result def inverse(self, value): if not self.scaled(): raise ValueError("Not invertible until scaled") vmin, vmax = self.vmin, self.vmax if cbook.iterable(value): val = ma.asarray(value) return vmin + val * (vmax - vmin) else: return vmin + value * (vmax - vmin) def autoscale(self, A): ''' Set *vmin*, *vmax* to min, max of *A*. 
''' self.vmin = ma.minimum(A) self.vmax = ma.maximum(A) def autoscale_None(self, A): ' autoscale only None-valued vmin or vmax' if self.vmin is None: self.vmin = ma.minimum(A) if self.vmax is None: self.vmax = ma.maximum(A) def scaled(self): 'return true if vmin and vmax set' return (self.vmin is not None and self.vmax is not None) if __name__ == "__main__": x = np.random.randn(10, 10) mx = np.matrix(x) print Normalize()(x) print Normalize()(mx) From charlesr.harris at gmail.com Fri Jan 15 00:36:31 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 14 Jan 2010 22:36:31 -0700 Subject: [Numpy-discussion] Wanted: new release manager for 1.5 and above In-Reply-To: <4B4EBAE2.7040703@silveregg.co.jp> References: <5b8d13221001132102k4f19ee45td6b54bd5df3c8578@mail.gmail.com> <4B4EBAE2.7040703@silveregg.co.jp> Message-ID: On Wed, Jan 13, 2010 at 11:34 PM, David Cournapeau wrote: > Charles R Harris wrote: > > > > > > > What is the setup one needs to build the installers? It might be well to > > document that, the dependencies, and the process. > > Right. The top script is: > http://projects.scipy.org/numpy/browser/trunk/release.sh > > the bulk of the work is in : > http://projects.scipy.org/numpy/browser/trunk/pavement.py > > which describes what is needed to build installers. On mac os x, the > release script may be used as is to build every installer + the release > notes. > > Umm, I think it needs some more explanation. There are virtual environments, c compilers, wine, paver, etc. All/ of these might require some installation, version numbers, and setup. This might all seem clear to you, but a newbie coming on to build the packages probably needs more instruction. What sort of setup do you run, what hardware, etc. If code needs to be compiled for the PCC I assume the compiler needs to be to do that. What about c libraries (for numpy) and c++ libraries (for scipy)? Does one need a MAC? etc. I'm probably just ignorant, but I think a careful step by step procedure would be helpful. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From david at silveregg.co.jp Fri Jan 15 00:44:26 2010 From: david at silveregg.co.jp (David Cournapeau) Date: Fri, 15 Jan 2010 14:44:26 +0900 Subject: [Numpy-discussion] Wanted: new release manager for 1.5 and above In-Reply-To: References: <5b8d13221001132102k4f19ee45td6b54bd5df3c8578@mail.gmail.com> <4B4EBAE2.7040703@silveregg.co.jp> Message-ID: <4B5000BA.1070701@silveregg.co.jp> Charles R Harris wrote: > > > On Wed, Jan 13, 2010 at 11:34 PM, David Cournapeau > > wrote: > > Charles R Harris wrote: > > > > > > > What is the setup one needs to build the installers? It might be > well to > > document that, the dependencies, and the process. > > Right. The top script is: > http://projects.scipy.org/numpy/browser/trunk/release.sh > > the bulk of the work is in : > http://projects.scipy.org/numpy/browser/trunk/pavement.py > > which describes what is needed to build installers. On mac os x, the > release script may be used as is to build every installer + the release > notes. > > > Umm, I think it needs some more explanation. There are virtual > environments, c compilers, wine, paver, etc. All/ of these might require > some installation, version numbers, and setup. This might all seem clear > to you, but a newbie coming on to build the packages probably needs more > instruction. I think it is a waste of time to document all this very precisely, because it is continuously changing. 
Documenting everything would boil down to rewrite the paver script in English (and most likely would be much more verbose). That's exactly why I was suggesting to have some volunteers to do 1.4.1 to do the release together, as a way to "pass the knowledge around". cheers, David From pgmdevlist at gmail.com Fri Jan 15 04:10:51 2010 From: pgmdevlist at gmail.com (Pierre GM) Date: Fri, 15 Jan 2010 04:10:51 -0500 Subject: [Numpy-discussion] Matrix vs array in ma.minimum In-Reply-To: <5b8d13221001142006u35c6e810gaf0af4806981a821@mail.gmail.com> References: <5b8d13221001140153ocd65f2fwc64c4820aa9c7a@mail.gmail.com> <6C95E82E-B37F-47CC-A80D-995C779331BF@gmail.com> <4B4FCA42.1030205@silveregg.co.jp> <5CEE736D-90A8-4B59-A5DD-BDDEC71FA6C9@gmail.com> <5b8d13221001142006u35c6e810gaf0af4806981a821@mail.gmail.com> Message-ID: <6C270179-7B32-4907-96A7-0F1477A8044E@gmail.com> On Jan 14, 2010, at 11:06 PM, David Cournapeau wrote: > On Fri, Jan 15, 2010 at 11:59 AM, Pierre GM wrote: >> On Jan 14, 2010, at 8:52 PM, David Cournapeau wrote: >>> Pierre GM wrote: >>> >>>> >>>> Er, no. >>>> np.ma.minimum(a, b) returns the lowest value of a and b element-wsie, or the the lowest element of a is b is None. The behavior is inherited from the very first implementation of maskedarray in numeric. This itself is unexpected, since np.minimum requires at least 2 input arguments. >>>> >>>> As you observed, the current function breaks down w/ np.matrix objects when only one argument is given (and when the axis is None): we call umath.minimum.reduce on the ravelled matirx, which returns the ravelled matrix. One would expect a scalar, so yes, this behavior is also unexpected. >>>> >>>> Now, which way should we go ? Keep np.ma.minimum as it is (fixing the bug so that a scalar is returned if the function is called with only 1 argument and an axis None) ? Adapt it to match np.minimum ? >>> >>> I am not a user of Masked Array, so I don't know what is the most >>> desirable behavior. >> >> I'm not a regular user of np.minimum. > > Damn, I thought I coul >> >>> The problem appears when using pylab.imshow on >>> matrices, because matplotlib (and not matlab :) ) uses masked arrays >>> when normalizing the values. >> >> >> David, you mind pointing me to the relevan part of the code and/or give me an example ? > > Here is a self-contained example reproducing the matplotlib pb: > OK, thx a lot. I'll work on it as soon as I can. From seb.haase at gmail.com Fri Jan 15 04:38:14 2010 From: seb.haase at gmail.com (Sebastian Haase) Date: Fri, 15 Jan 2010 10:38:14 +0100 Subject: [Numpy-discussion] broken links on http://numpy.scipy.org/ Message-ID: Hi, Apparently this very nice looking icons (4 of the 5 icons or so) at http://numpy.scipy.org/ are broken links. Regards, Sebastian Haase From ralf.gommers at googlemail.com Fri Jan 15 09:46:18 2010 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Fri, 15 Jan 2010 22:46:18 +0800 Subject: [Numpy-discussion] Wanted: new release manager for 1.5 and above In-Reply-To: <4B4EBAE2.7040703@silveregg.co.jp> References: <5b8d13221001132102k4f19ee45td6b54bd5df3c8578@mail.gmail.com> <4B4EBAE2.7040703@silveregg.co.jp> Message-ID: Hi David, Here are some questions to get a clearer idea of exactly what's involved in / required for making a release. On Thu, Jan 14, 2010 at 2:34 PM, David Cournapeau wrote: > Charles R Harris wrote: > > > > > > > What is the setup one needs to build the installers? It might be well to > > document that, the dependencies, and the process. > > Right. 
The top script is: > http://projects.scipy.org/numpy/browser/trunk/release.sh > > the bulk of the work is in : > http://projects.scipy.org/numpy/browser/trunk/pavement.py > > which describes what is needed to build installers. On mac os x, the > release script may be used as is to build every installer + the release > notes. > > Is it necessary to have OS X to build the dmg installer, or could you build that from linux with some modifications to the build script? How many combinations do you test manually? All supported Python versions on all platforms? Several Linux flavors? For someone new to packaging, how much time would you estimate it takes to do a single release? Is most of this time spent testing, or fixing the problems you find during testing? Do you have an idea about when to start preparing for the release of 1.4.1? Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From nwagner at iam.uni-stuttgart.de Fri Jan 15 09:50:14 2010 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Fri, 15 Jan 2010 15:50:14 +0100 Subject: [Numpy-discussion] svn log and blank entries Message-ID: Hi all, An svn log > CHANGELOG in svn/numpy yields some blank entries Is that intended ? ------------------------------------------------------------------------ r8055 | ariver | 2010-01-15 03:02:30 +0100 (Fr, 15 Jan 2010) | 1 line _ ------------------------------------------------------------------------ r8054 | ariver | 2010-01-15 02:57:56 +0100 (Fr, 15 Jan 2010) | 1 line _ ------------------------------------------------------------------------ r8053 | ariver | 2010-01-15 02:51:02 +0100 (Fr, 15 Jan 2010) | 1 line _ Nils From josef.pktd at gmail.com Fri Jan 15 09:55:15 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 15 Jan 2010 09:55:15 -0500 Subject: [Numpy-discussion] svn log and blank entries In-Reply-To: References: Message-ID: <1cd32cbb1001150655haa6b7aaie3273b4d95a53402@mail.gmail.com> On Fri, Jan 15, 2010 at 9:50 AM, Nils Wagner wrote: > Hi all, > > An svn log > CHANGELOG in svn/numpy yields some blank > entries > Is that intended ? > > > ------------------------------------------------------------------------ > r8055 | ariver | 2010-01-15 03:02:30 +0100 (Fr, 15 Jan > 2010) | 1 line > > _ > ------------------------------------------------------------------------ > r8054 | ariver | 2010-01-15 02:57:56 +0100 (Fr, 15 Jan > 2010) | 1 line > > _ > ------------------------------------------------------------------------ > r8053 | ariver | 2010-01-15 02:51:02 +0100 (Fr, 15 Jan > 2010) | 1 line > according to Robert this was some checking/testing by the sysadmin Josef > _ > > Nils > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From robert.kern at gmail.com Fri Jan 15 10:28:10 2010 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 15 Jan 2010 09:28:10 -0600 Subject: [Numpy-discussion] Wanted: new release manager for 1.5 and above In-Reply-To: References: <5b8d13221001132102k4f19ee45td6b54bd5df3c8578@mail.gmail.com> <4B4EBAE2.7040703@silveregg.co.jp> Message-ID: <3d375d731001150728t7194ecb6uff28e0a116a36be4@mail.gmail.com> On Fri, Jan 15, 2010 at 08:46, Ralf Gommers wrote: > Is it necessary to have OS X to build the dmg installer, or could you build > that from linux with some modifications to the build script? No, you need OS X to build and package the OS X binaries. 
-- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From josef.pktd at gmail.com Fri Jan 15 10:31:44 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 15 Jan 2010 10:31:44 -0500 Subject: [Numpy-discussion] linalg.eig getting the original matrix back ? Message-ID: <1cd32cbb1001150731i48768f87uf89dd17721a9e980@mail.gmail.com> I had a problem because linal.eig doesn't rebuild the original matrix, linalg.eigh does, see script below Whats the trick with linalg.eig to get the original (or the inverse) back ? None of my variations on the formulas worked. Thanks, Josef import numpy as np import scipy as sp import scipy.linalg omega = np.array([[ 6., 2., 2., 0., 0., 3., 0., 0.], [ 2., 6., 2., 3., 0., 0., 3., 0.], [ 2., 2., 6., 0., 3., 0., 0., 3.], [ 0., 3., 0., 6., 2., 0., 3., 0.], [ 0., 0., 3., 2., 6., 0., 0., 3.], [ 3., 0., 0., 0., 0., 6., 2., 2.], [ 0., 3., 0., 3., 0., 2., 6., 2.], [ 0., 0., 3., 0., 3., 2., 2., 6.]]) for fun in [np.linalg.eig, np.linalg.eigh, sp.linalg.eig, sp.linalg.eigh]: print fun.__module__, fun ev, evec = fun(omega) omegainv = np.dot(evec, (1/ev * evec).T) omegainv2 = np.linalg.inv(omega) omegacomp = np.dot(evec, (ev * evec).T) print 'composition', print np.max(np.abs(omegacomp - omega)) print 'inverse', print np.max(np.abs(omegainv - omegainv2)) this prints: numpy.linalg.linalg composition 0.405241032278 inverse 0.405241032278 numpy.linalg.linalg composition 3.5527136788e-015 inverse 7.21644966006e-016 scipy.linalg.decomp composition 0.238386662463 inverse 0.238386662463 scipy.linalg.decomp composition 3.99680288865e-015 inverse 4.99600361081e-016 From cournape at gmail.com Fri Jan 15 10:56:08 2010 From: cournape at gmail.com (David Cournapeau) Date: Sat, 16 Jan 2010 00:56:08 +0900 Subject: [Numpy-discussion] Wanted: new release manager for 1.5 and above In-Reply-To: References: <5b8d13221001132102k4f19ee45td6b54bd5df3c8578@mail.gmail.com> <4B4EBAE2.7040703@silveregg.co.jp> Message-ID: <5b8d13221001150756g32436cf4i49084207d041125a@mail.gmail.com> On Fri, Jan 15, 2010 at 11:46 PM, Ralf Gommers wrote: > Hi David, > > Here are some questions to get a clearer idea of exactly what's involved in > / required for making a release. > > On Thu, Jan 14, 2010 at 2:34 PM, David Cournapeau > wrote: >> >> Charles R Harris wrote: >> >> > >> > >> > What is the setup one needs to build the installers? It might be well to >> > document that, the dependencies, and the process. >> >> Right. The top script is: >> http://projects.scipy.org/numpy/browser/trunk/release.sh >> >> the bulk of the work is in : >> http://projects.scipy.org/numpy/browser/trunk/pavement.py >> >> which describes what is needed to build installers. On mac os x, the >> release script may be used as is to build every installer + the release >> notes. >> > > Is it necessary to have OS X to build the dmg installer, or could you build > that from linux with some modifications to the build script? You cannot cross compile python extensions, so you have to build installer on each platform. Mac Os X is the most practical because you can build windows installers under wine, so you can build both mac and windows installers from the same machine. The paver script + the shell script can build everything in one step thanks to this. > > How many combinations do you test manually? All supported Python versions on > all platforms? Several Linux flavors? 
I basically assume that linux works once the branch is stabilized, if only because that's what most developers use. It is important to test on the oldest supported python (2.4) and both 32 and 64 bits, though (especially python 2.4 on 64 bits). I never test the installers - this is too much work manually. Ideally, this should be done on a build/test farm. > For someone new to packaging, how much time would you estimate it takes to > do a single release? Is most of this time spent testing, or fixing the > problems you find during testing? Most of the time is spent on fixing build issues which crop up during the beta phase. I found difficult to enforce a strict policy on not changing anything unless critical once in the beta phase. I think we should improve things in that aspect, and go away from the "but this is a small fix" mentality - maybe using something like for the linux kernel, with merge windows, etc... I secretly hope that if we can regularly change release managers, it will give a sense of why this is good policy :) I feel that we have improved things quite a bit since I have started doing releases: the binary installers are more stable, and build are mostly automated now. The next step would be automated testing of the binary installers (in particular testing new numpy against scipy, etc...), but this is quite a bit of work. Having a stricter time-based policy would be good as well. > Do you have an idea about when to start preparing for the release of 1.4.1? The cython thing is the most problematic bug, and should be fixed ASAP. I am still not sure whether that would require both a numpy 1.4.1 and a scipy 0.7.1.1 (built against numpy 1.3.0). cheers, David From sebastian.walter at gmail.com Fri Jan 15 11:32:04 2010 From: sebastian.walter at gmail.com (Sebastian Walter) Date: Fri, 15 Jan 2010 17:32:04 +0100 Subject: [Numpy-discussion] linalg.eig getting the original matrix back ? In-Reply-To: <1cd32cbb1001150731i48768f87uf89dd17721a9e980@mail.gmail.com> References: <1cd32cbb1001150731i48768f87uf89dd17721a9e980@mail.gmail.com> Message-ID: numpy.linalg.eig guarantees to return right eigenvectors. evec is not necessarily an orthonormal matrix when there are eigenvalues with multiplicity >1. For symmetrical matrices you'll have mutually orthogonal eigenspaces but each eigenspace might be spanned by vectors that are not orthogonal to each other. Your omega has eigenvalue 1 with multiplicity 3. On Fri, Jan 15, 2010 at 4:31 PM, wrote: > I had a problem because linal.eig doesn't rebuild the original matrix, > linalg.eigh does, see script below > > Whats the trick with linalg.eig to get the original (or the inverse) > back ? None of my variations on the formulas worked. > > Thanks, > Josef > > > import numpy as np > import scipy as sp > import scipy.linalg > > omega = ?np.array([[ 6., ?2., ?2., ?0., ?0., ?3., ?0., ?0.], > ? ? ? ? ? ? ? ? ? [ 2., ?6., ?2., ?3., ?0., ?0., ?3., ?0.], > ? ? ? ? ? ? ? ? ? [ 2., ?2., ?6., ?0., ?3., ?0., ?0., ?3.], > ? ? ? ? ? ? ? ? ? [ 0., ?3., ?0., ?6., ?2., ?0., ?3., ?0.], > ? ? ? ? ? ? ? ? ? [ 0., ?0., ?3., ?2., ?6., ?0., ?0., ?3.], > ? ? ? ? ? ? ? ? ? [ 3., ?0., ?0., ?0., ?0., ?6., ?2., ?2.], > ? ? ? ? ? ? ? ? ? [ 0., ?3., ?0., ?3., ?0., ?2., ?6., ?2.], > ? ? ? ? ? ? ? ? ? [ 0., ?0., ?3., ?0., ?3., ?2., ?2., ?6.]]) > > for fun in [np.linalg.eig, np.linalg.eigh, sp.linalg.eig, sp.linalg.eigh]: > ? ?print fun.__module__, fun > ? ?ev, evec = fun(omega) > ? ?omegainv = np.dot(evec, (1/ev * evec).T) > ? ?omegainv2 = np.linalg.inv(omega) > ? 
?omegacomp = np.dot(evec, (ev * evec).T) > ? ?print 'composition', > ? ?print np.max(np.abs(omegacomp - omega)) > ? ?print 'inverse', > ? ?print np.max(np.abs(omegainv - omegainv2)) > > this prints: > > numpy.linalg.linalg > composition 0.405241032278 > inverse 0.405241032278 > > numpy.linalg.linalg > composition 3.5527136788e-015 > inverse 7.21644966006e-016 > > scipy.linalg.decomp > composition 0.238386662463 > inverse 0.238386662463 > > scipy.linalg.decomp > composition 3.99680288865e-015 > inverse 4.99600361081e-016 > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From josef.pktd at gmail.com Fri Jan 15 12:07:04 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 15 Jan 2010 12:07:04 -0500 Subject: [Numpy-discussion] linalg.eig getting the original matrix back ? In-Reply-To: References: <1cd32cbb1001150731i48768f87uf89dd17721a9e980@mail.gmail.com> Message-ID: <1cd32cbb1001150907g7ef4e8e1xd8be844a241b748e@mail.gmail.com> On Fri, Jan 15, 2010 at 11:32 AM, Sebastian Walter wrote: > numpy.linalg.eig guarantees to return right eigenvectors. > evec is not necessarily an orthonormal matrix when there are > eigenvalues with multiplicity >1. > For symmetrical matrices you'll have mutually orthogonal eigenspaces > but each eigenspace might be spanned by > vectors that are not orthogonal to each other. > > Your omega has eigenvalue 1 with multiplicity 3. Yes, I thought about the multiplicity. However, even for random symmetric matrices, I don't get the result I change the example matrix to omega0 = np.random.randn(20,8) omega = np.dot(omega0.T, omega0) print np.max(np.abs(omega == omega.T)) I have been playing with left and right eigenvectors, but I cannot figure out how I could compose my original matrix with them either. I checked with wikipedia, to make sure I remember my (basic) linear algebra http://en.wikipedia.org/wiki/Eigendecomposition_(matrix)#Symmetric_matrices The left and right eigenvectors are almost orthogonal ev, evecl, evecr = sp.linalg.eig(omega, left=1, right=1) >>> np.abs(np.dot(evecl.T, evecl) - np.eye(8))>1e-10 >>> np.abs(np.dot(evecr.T, evecr) - np.eye(8))>1e-10 shows three non-orthogonal pairs >>> ev array([ 6.27688862, 8.45055356, 15.03789945, 19.55477818, 20.33315408, 24.58589363, 28.71796764, 42.88603728]) I always thought eigenvectors are always orthogonal, at least in the case without multiple roots I had assumed that eig will treat symmetric matrices in the same way as eigh. Since I'm mostly or always working with symmetric matrices, I will stick to eigh which does what I expect. Still, I'm currently not able to reproduce any of the composition result on the wikipedia page with linalg.eig which is puzzling. Josef > > > > > On Fri, Jan 15, 2010 at 4:31 PM, ? wrote: >> I had a problem because linal.eig doesn't rebuild the original matrix, >> linalg.eigh does, see script below >> >> Whats the trick with linalg.eig to get the original (or the inverse) >> back ? None of my variations on the formulas worked. >> >> Thanks, >> Josef >> >> >> import numpy as np >> import scipy as sp >> import scipy.linalg >> >> omega = ?np.array([[ 6., ?2., ?2., ?0., ?0., ?3., ?0., ?0.], >> ? ? ? ? ? ? ? ? ? [ 2., ?6., ?2., ?3., ?0., ?0., ?3., ?0.], >> ? ? ? ? ? ? ? ? ? [ 2., ?2., ?6., ?0., ?3., ?0., ?0., ?3.], >> ? ? ? ? ? ? ? ? ? [ 0., ?3., ?0., ?6., ?2., ?0., ?3., ?0.], >> ? ? ? ? ? ? ? ? ? [ 0., ?0., ?3., ?2., ?6., ?0., ?0., ?3.], >> ? ? ? ? ? ? 
? ? ? [ 3., ?0., ?0., ?0., ?0., ?6., ?2., ?2.], >> ? ? ? ? ? ? ? ? ? [ 0., ?3., ?0., ?3., ?0., ?2., ?6., ?2.], >> ? ? ? ? ? ? ? ? ? [ 0., ?0., ?3., ?0., ?3., ?2., ?2., ?6.]]) >> >> for fun in [np.linalg.eig, np.linalg.eigh, sp.linalg.eig, sp.linalg.eigh]: >> ? ?print fun.__module__, fun >> ? ?ev, evec = fun(omega) >> ? ?omegainv = np.dot(evec, (1/ev * evec).T) >> ? ?omegainv2 = np.linalg.inv(omega) >> ? ?omegacomp = np.dot(evec, (ev * evec).T) >> ? ?print 'composition', >> ? ?print np.max(np.abs(omegacomp - omega)) >> ? ?print 'inverse', >> ? ?print np.max(np.abs(omegainv - omegainv2)) >> >> this prints: >> >> numpy.linalg.linalg >> composition 0.405241032278 >> inverse 0.405241032278 >> >> numpy.linalg.linalg >> composition 3.5527136788e-015 >> inverse 7.21644966006e-016 >> >> scipy.linalg.decomp >> composition 0.238386662463 >> inverse 0.238386662463 >> >> scipy.linalg.decomp >> composition 3.99680288865e-015 >> inverse 4.99600361081e-016 >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From patrickmarshwx at gmail.com Fri Jan 15 12:11:24 2010 From: patrickmarshwx at gmail.com (Patrick Marsh) Date: Fri, 15 Jan 2010 11:11:24 -0600 Subject: [Numpy-discussion] Wanted: new release manager for 1.5 and above In-Reply-To: <5b8d13221001150756g32436cf4i49084207d041125a@mail.gmail.com> References: <5b8d13221001132102k4f19ee45td6b54bd5df3c8578@mail.gmail.com> <4B4EBAE2.7040703@silveregg.co.jp> <5b8d13221001150756g32436cf4i49084207d041125a@mail.gmail.com> Message-ID: I'm willing to get on board and learn/help with the releases. At present, my main machine is a Windows 7 box running EPD 5.1.1 and EPD 6.0. I also have a relatively old, but still functional, macbook pro that I can tinker with as well. I will state upfront that while I'm not a complete newbie when it comes to this, I'll certainly need a lot of help initially. (But hey, this is how you learn, right?) No offense if my offer isn't accepted. I just thought I'd thought I'd offer to try and give something back. Patrick -- Patrick Marsh Ph.D. Student / Graduate Research Assistant School of Meteorology / University of Oklahoma Cooperative Institute for Mesoscale Meteorological Studies National Severe Storms Laboratory http://www.patricktmarsh.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Fri Jan 15 12:17:58 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 15 Jan 2010 12:17:58 -0500 Subject: [Numpy-discussion] linalg.eig getting the original matrix back ? In-Reply-To: <1cd32cbb1001150907g7ef4e8e1xd8be844a241b748e@mail.gmail.com> References: <1cd32cbb1001150731i48768f87uf89dd17721a9e980@mail.gmail.com> <1cd32cbb1001150907g7ef4e8e1xd8be844a241b748e@mail.gmail.com> Message-ID: <1cd32cbb1001150917y618db92cvaf5b58be6c683d56@mail.gmail.com> On Fri, Jan 15, 2010 at 12:07 PM, wrote: > On Fri, Jan 15, 2010 at 11:32 AM, Sebastian Walter > wrote: >> numpy.linalg.eig guarantees to return right eigenvectors. >> evec is not necessarily an orthonormal matrix when there are >> eigenvalues with multiplicity >1. >> For symmetrical matrices you'll have mutually orthogonal eigenspaces >> but each eigenspace might be spanned by >> vectors that are not orthogonal to each other. 
>> >> Your omega has eigenvalue 1 with multiplicity 3. > > Yes, I thought about the multiplicity. However, even for random > symmetric matrices, I don't get the result > I change the example matrix to > omega0 = np.random.randn(20,8) > omega = np.dot(omega0.T, omega0) > print np.max(np.abs(omega == omega.T)) > > I have been playing with left and right eigenvectors, but I cannot > figure out how I could compose my original matrix with them either. > > I checked with wikipedia, to make sure I remember my (basic) linear algebra > http://en.wikipedia.org/wiki/Eigendecomposition_(matrix)#Symmetric_matrices > > The left and right eigenvectors are almost orthogonal > ev, evecl, evecr = sp.linalg.eig(omega, left=1, right=1) >>>> np.abs(np.dot(evecl.T, evecl) - np.eye(8))>1e-10 >>>> np.abs(np.dot(evecr.T, evecr) - np.eye(8))>1e-10 > > shows three non-orthogonal pairs This doesn't seem to be correct. I think, I had an old omega with multiplicity of eigenvalues in the interpreter. Writing it as a clean script, I get orthogonal left and right eigenvectors. Thanks for the reply, Josef > >>>> ev > array([ ?6.27688862, ? 8.45055356, ?15.03789945, ?19.55477818, > ? ? ? ?20.33315408, ?24.58589363, ?28.71796764, ?42.88603728]) > > > I always thought eigenvectors are always orthogonal, at least in the > case without multiple roots > > I had assumed that eig will treat symmetric matrices in the same way as eigh. > Since I'm mostly or always working with symmetric matrices, I will > stick to eigh which does what I expect. > > Still, I'm currently not able to reproduce any of the composition > result on the wikipedia page with linalg.eig which is puzzling. > > Josef > >> >> >> >> >> On Fri, Jan 15, 2010 at 4:31 PM, ? wrote: >>> I had a problem because linal.eig doesn't rebuild the original matrix, >>> linalg.eigh does, see script below >>> >>> Whats the trick with linalg.eig to get the original (or the inverse) >>> back ? None of my variations on the formulas worked. >>> >>> Thanks, >>> Josef >>> >>> >>> import numpy as np >>> import scipy as sp >>> import scipy.linalg >>> >>> omega = ?np.array([[ 6., ?2., ?2., ?0., ?0., ?3., ?0., ?0.], >>> ? ? ? ? ? ? ? ? ? [ 2., ?6., ?2., ?3., ?0., ?0., ?3., ?0.], >>> ? ? ? ? ? ? ? ? ? [ 2., ?2., ?6., ?0., ?3., ?0., ?0., ?3.], >>> ? ? ? ? ? ? ? ? ? [ 0., ?3., ?0., ?6., ?2., ?0., ?3., ?0.], >>> ? ? ? ? ? ? ? ? ? [ 0., ?0., ?3., ?2., ?6., ?0., ?0., ?3.], >>> ? ? ? ? ? ? ? ? ? [ 3., ?0., ?0., ?0., ?0., ?6., ?2., ?2.], >>> ? ? ? ? ? ? ? ? ? [ 0., ?3., ?0., ?3., ?0., ?2., ?6., ?2.], >>> ? ? ? ? ? ? ? ? ? [ 0., ?0., ?3., ?0., ?3., ?2., ?2., ?6.]]) >>> >>> for fun in [np.linalg.eig, np.linalg.eigh, sp.linalg.eig, sp.linalg.eigh]: >>> ? ?print fun.__module__, fun >>> ? ?ev, evec = fun(omega) >>> ? ?omegainv = np.dot(evec, (1/ev * evec).T) >>> ? ?omegainv2 = np.linalg.inv(omega) >>> ? ?omegacomp = np.dot(evec, (ev * evec).T) >>> ? ?print 'composition', >>> ? ?print np.max(np.abs(omegacomp - omega)) >>> ? ?print 'inverse', >>> ? 
?print np.max(np.abs(omegainv - omegainv2)) >>> >>> this prints: >>> >>> numpy.linalg.linalg >>> composition 0.405241032278 >>> inverse 0.405241032278 >>> >>> numpy.linalg.linalg >>> composition 3.5527136788e-015 >>> inverse 7.21644966006e-016 >>> >>> scipy.linalg.decomp >>> composition 0.238386662463 >>> inverse 0.238386662463 >>> >>> scipy.linalg.decomp >>> composition 3.99680288865e-015 >>> inverse 4.99600361081e-016 >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > From warren.weckesser at enthought.com Fri Jan 15 12:24:10 2010 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Fri, 15 Jan 2010 11:24:10 -0600 Subject: [Numpy-discussion] linalg.eig getting the original matrix back ? In-Reply-To: <1cd32cbb1001150907g7ef4e8e1xd8be844a241b748e@mail.gmail.com> References: <1cd32cbb1001150731i48768f87uf89dd17721a9e980@mail.gmail.com> <1cd32cbb1001150907g7ef4e8e1xd8be844a241b748e@mail.gmail.com> Message-ID: <4B50A4BA.6050709@enthought.com> For the case where all the eigenvalues are simple, this works for me: In [1]: import numpy as np In [2]: a = np.array([[1.0, 2.0, 3.0],[2.0, 3.0, 0.0], [3.0, 0.0, 4.0]]) In [3]: eval, evec = np.linalg.eig(a) In [4]: eval Out[4]: array([-1.51690942, 6.24391817, 3.27299125]) In [5]: a2 = np.dot(evec, eval[:,np.newaxis] * evec.T) In [6]: np.allclose(a, a2) Out[6]: True In [7]: Warren josef.pktd at gmail.com wrote: > On Fri, Jan 15, 2010 at 11:32 AM, Sebastian Walter > wrote: > >> numpy.linalg.eig guarantees to return right eigenvectors. >> evec is not necessarily an orthonormal matrix when there are >> eigenvalues with multiplicity >1. >> For symmetrical matrices you'll have mutually orthogonal eigenspaces >> but each eigenspace might be spanned by >> vectors that are not orthogonal to each other. >> >> Your omega has eigenvalue 1 with multiplicity 3. >> > > Yes, I thought about the multiplicity. However, even for random > symmetric matrices, I don't get the result > I change the example matrix to > omega0 = np.random.randn(20,8) > omega = np.dot(omega0.T, omega0) > print np.max(np.abs(omega == omega.T)) > > I have been playing with left and right eigenvectors, but I cannot > figure out how I could compose my original matrix with them either. > > I checked with wikipedia, to make sure I remember my (basic) linear algebra > http://en.wikipedia.org/wiki/Eigendecomposition_(matrix)#Symmetric_matrices > > The left and right eigenvectors are almost orthogonal > ev, evecl, evecr = sp.linalg.eig(omega, left=1, right=1) > >>>> np.abs(np.dot(evecl.T, evecl) - np.eye(8))>1e-10 >>>> np.abs(np.dot(evecr.T, evecr) - np.eye(8))>1e-10 >>>> > > shows three non-orthogonal pairs > > >>>> ev >>>> > array([ 6.27688862, 8.45055356, 15.03789945, 19.55477818, > 20.33315408, 24.58589363, 28.71796764, 42.88603728]) > > > I always thought eigenvectors are always orthogonal, at least in the > case without multiple roots > > I had assumed that eig will treat symmetric matrices in the same way as eigh. > Since I'm mostly or always working with symmetric matrices, I will > stick to eigh which does what I expect. > > Still, I'm currently not able to reproduce any of the composition > result on the wikipedia page with linalg.eig which is puzzling. 
> > Josef > > >> >> >> On Fri, Jan 15, 2010 at 4:31 PM, wrote: >> >>> I had a problem because linal.eig doesn't rebuild the original matrix, >>> linalg.eigh does, see script below >>> >>> Whats the trick with linalg.eig to get the original (or the inverse) >>> back ? None of my variations on the formulas worked. >>> >>> Thanks, >>> Josef >>> >>> >>> import numpy as np >>> import scipy as sp >>> import scipy.linalg >>> >>> omega = np.array([[ 6., 2., 2., 0., 0., 3., 0., 0.], >>> [ 2., 6., 2., 3., 0., 0., 3., 0.], >>> [ 2., 2., 6., 0., 3., 0., 0., 3.], >>> [ 0., 3., 0., 6., 2., 0., 3., 0.], >>> [ 0., 0., 3., 2., 6., 0., 0., 3.], >>> [ 3., 0., 0., 0., 0., 6., 2., 2.], >>> [ 0., 3., 0., 3., 0., 2., 6., 2.], >>> [ 0., 0., 3., 0., 3., 2., 2., 6.]]) >>> >>> for fun in [np.linalg.eig, np.linalg.eigh, sp.linalg.eig, sp.linalg.eigh]: >>> print fun.__module__, fun >>> ev, evec = fun(omega) >>> omegainv = np.dot(evec, (1/ev * evec).T) >>> omegainv2 = np.linalg.inv(omega) >>> omegacomp = np.dot(evec, (ev * evec).T) >>> print 'composition', >>> print np.max(np.abs(omegacomp - omega)) >>> print 'inverse', >>> print np.max(np.abs(omegainv - omegainv2)) >>> >>> this prints: >>> >>> numpy.linalg.linalg >>> composition 0.405241032278 >>> inverse 0.405241032278 >>> >>> numpy.linalg.linalg >>> composition 3.5527136788e-015 >>> inverse 7.21644966006e-016 >>> >>> scipy.linalg.decomp >>> composition 0.238386662463 >>> inverse 0.238386662463 >>> >>> scipy.linalg.decomp >>> composition 3.99680288865e-015 >>> inverse 4.99600361081e-016 >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >>> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From josef.pktd at gmail.com Fri Jan 15 12:45:18 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 15 Jan 2010 12:45:18 -0500 Subject: [Numpy-discussion] linalg.eig getting the original matrix back ? In-Reply-To: <4B50A4BA.6050709@enthought.com> References: <1cd32cbb1001150731i48768f87uf89dd17721a9e980@mail.gmail.com> <1cd32cbb1001150907g7ef4e8e1xd8be844a241b748e@mail.gmail.com> <4B50A4BA.6050709@enthought.com> Message-ID: <1cd32cbb1001150945q787b68cayb3b2f126aae3d394@mail.gmail.com> On Fri, Jan 15, 2010 at 12:24 PM, Warren Weckesser wrote: > For the case where all the eigenvalues are simple, this works for me: > > In [1]: import numpy as np > > In [2]: a = np.array([[1.0, 2.0, 3.0],[2.0, 3.0, 0.0], [3.0, 0.0, 4.0]]) > > In [3]: eval, evec = np.linalg.eig(a) > > In [4]: eval > Out[4]: array([-1.51690942, ?6.24391817, ?3.27299125]) > > In [5]: a2 = np.dot(evec, eval[:,np.newaxis] * evec.T) > > In [6]: np.allclose(a, a2) > Out[6]: True > Thanks, I thought I had tried similar versions, but I guess not with the matrix without multiplicity of eigenvalues >>> np.max(np.abs(np.dot(evecl, (ev * evecl).T)-omega)) 3.6415315207705135e-014 >>> np.max(np.abs(np.dot(evecr, (ev * evecr).T)-omega)) 2.6256774532384952e-014 >>> np.max(np.abs(np.dot(evecl, np.dot(np.diag(ev), evecl.T))-omega)) 3.6415315207705135e-014 So, the my confusion was just because eig doesn't treat multiple eigenvalues in the same way as eigh. 
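For completeness, a sketch of the reconstruction that also covers the repeated-eigenvalue case: with eig the inverse of the eigenvector matrix has to be used instead of its transpose, because the returned vectors are only guaranteed to be right eigenvectors, not an orthonormal set (omega is the matrix from the first post; the other names are illustrative):

import numpy as np

omega = np.array([[ 6.,  2.,  2.,  0.,  0.,  3.,  0.,  0.],
                  [ 2.,  6.,  2.,  3.,  0.,  0.,  3.,  0.],
                  [ 2.,  2.,  6.,  0.,  3.,  0.,  0.,  3.],
                  [ 0.,  3.,  0.,  6.,  2.,  0.,  3.,  0.],
                  [ 0.,  0.,  3.,  2.,  6.,  0.,  0.,  3.],
                  [ 3.,  0.,  0.,  0.,  0.,  6.,  2.,  2.],
                  [ 0.,  3.,  0.,  3.,  0.,  2.,  6.,  2.],
                  [ 0.,  0.,  3.,  0.,  3.,  2.,  2.,  6.]])

ev, evec = np.linalg.eig(omega)
eveci = np.linalg.inv(evec)

# For any diagonalizable matrix: omega = evec * diag(ev) * inv(evec).
# evec.T is only a substitute for inv(evec) when the eigenvectors are
# orthonormal, which eig does not guarantee for repeated eigenvalues
# (eigh does, for symmetric/Hermitian input).
omegacomp = np.dot(evec, ev[:, np.newaxis] * eveci)
omegainv = np.dot(evec, (1.0 / ev)[:, np.newaxis] * eveci)

print np.max(np.abs(omegacomp - omega))                  # round-off level
print np.max(np.abs(omegainv - np.linalg.inv(omega)))    # round-off level
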
Josef > > > Warren > > > > josef.pktd at gmail.com wrote: >> On Fri, Jan 15, 2010 at 11:32 AM, Sebastian Walter >> wrote: >> >>> numpy.linalg.eig guarantees to return right eigenvectors. >>> evec is not necessarily an orthonormal matrix when there are >>> eigenvalues with multiplicity >1. >>> For symmetrical matrices you'll have mutually orthogonal eigenspaces >>> but each eigenspace might be spanned by >>> vectors that are not orthogonal to each other. >>> >>> Your omega has eigenvalue 1 with multiplicity 3. >>> >> >> Yes, I thought about the multiplicity. However, even for random >> symmetric matrices, I don't get the result >> I change the example matrix to >> omega0 = np.random.randn(20,8) >> omega = np.dot(omega0.T, omega0) >> print np.max(np.abs(omega == omega.T)) >> >> I have been playing with left and right eigenvectors, but I cannot >> figure out how I could compose my original matrix with them either. >> >> I checked with wikipedia, to make sure I remember my (basic) linear algebra >> http://en.wikipedia.org/wiki/Eigendecomposition_(matrix)#Symmetric_matrices >> >> The left and right eigenvectors are almost orthogonal >> ev, evecl, evecr = sp.linalg.eig(omega, left=1, right=1) >> >>>>> np.abs(np.dot(evecl.T, evecl) - np.eye(8))>1e-10 >>>>> np.abs(np.dot(evecr.T, evecr) - np.eye(8))>1e-10 >>>>> >> >> shows three non-orthogonal pairs >> >> >>>>> ev >>>>> >> array([ ?6.27688862, ? 8.45055356, ?15.03789945, ?19.55477818, >> ? ? ? ? 20.33315408, ?24.58589363, ?28.71796764, ?42.88603728]) >> >> >> I always thought eigenvectors are always orthogonal, at least in the >> case without multiple roots >> >> I had assumed that eig will treat symmetric matrices in the same way as eigh. >> Since I'm mostly or always working with symmetric matrices, I will >> stick to eigh which does what I expect. >> >> Still, I'm currently not able to reproduce any of the composition >> result on the wikipedia page with linalg.eig which is puzzling. >> >> Josef >> >> >>> >>> >>> On Fri, Jan 15, 2010 at 4:31 PM, ? wrote: >>> >>>> I had a problem because linal.eig doesn't rebuild the original matrix, >>>> linalg.eigh does, see script below >>>> >>>> Whats the trick with linalg.eig to get the original (or the inverse) >>>> back ? None of my variations on the formulas worked. >>>> >>>> Thanks, >>>> Josef >>>> >>>> >>>> import numpy as np >>>> import scipy as sp >>>> import scipy.linalg >>>> >>>> omega = ?np.array([[ 6., ?2., ?2., ?0., ?0., ?3., ?0., ?0.], >>>> ? ? ? ? ? ? ? ? ? [ 2., ?6., ?2., ?3., ?0., ?0., ?3., ?0.], >>>> ? ? ? ? ? ? ? ? ? [ 2., ?2., ?6., ?0., ?3., ?0., ?0., ?3.], >>>> ? ? ? ? ? ? ? ? ? [ 0., ?3., ?0., ?6., ?2., ?0., ?3., ?0.], >>>> ? ? ? ? ? ? ? ? ? [ 0., ?0., ?3., ?2., ?6., ?0., ?0., ?3.], >>>> ? ? ? ? ? ? ? ? ? [ 3., ?0., ?0., ?0., ?0., ?6., ?2., ?2.], >>>> ? ? ? ? ? ? ? ? ? [ 0., ?3., ?0., ?3., ?0., ?2., ?6., ?2.], >>>> ? ? ? ? ? ? ? ? ? [ 0., ?0., ?3., ?0., ?3., ?2., ?2., ?6.]]) >>>> >>>> for fun in [np.linalg.eig, np.linalg.eigh, sp.linalg.eig, sp.linalg.eigh]: >>>> ? ?print fun.__module__, fun >>>> ? ?ev, evec = fun(omega) >>>> ? ?omegainv = np.dot(evec, (1/ev * evec).T) >>>> ? ?omegainv2 = np.linalg.inv(omega) >>>> ? ?omegacomp = np.dot(evec, (ev * evec).T) >>>> ? ?print 'composition', >>>> ? ?print np.max(np.abs(omegacomp - omega)) >>>> ? ?print 'inverse', >>>> ? 
?print np.max(np.abs(omegainv - omegainv2)) >>>> >>>> this prints: >>>> >>>> numpy.linalg.linalg >>>> composition 0.405241032278 >>>> inverse 0.405241032278 >>>> >>>> numpy.linalg.linalg >>>> composition 3.5527136788e-015 >>>> inverse 7.21644966006e-016 >>>> >>>> scipy.linalg.decomp >>>> composition 0.238386662463 >>>> inverse 0.238386662463 >>>> >>>> scipy.linalg.decomp >>>> composition 3.99680288865e-015 >>>> inverse 4.99600361081e-016 >>>> _______________________________________________ >>>> NumPy-Discussion mailing list >>>> NumPy-Discussion at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>>> >>>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >>> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From nwagner at iam.uni-stuttgart.de Fri Jan 15 13:19:41 2010 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Fri, 15 Jan 2010 19:19:41 +0100 Subject: [Numpy-discussion] ipython Message-ID: Hi all, I tried to install ipython via bzr If I run iypthon I get ipython Traceback (most recent call last): File "/home/nwagner/local/bin/ipython", line 4, in from IPython.core.ipapp import launch_new_instance ImportError: No module named ipapp Any idea ? Nils From pav at iki.fi Fri Jan 15 14:08:28 2010 From: pav at iki.fi (Pauli Virtanen) Date: Fri, 15 Jan 2010 21:08:28 +0200 Subject: [Numpy-discussion] ipython In-Reply-To: References: Message-ID: <1263582507.5902.1.camel@idol> pe, 2010-01-15 kello 19:19 +0100, Nils Wagner kirjoitti: [clip: issue with ipython] > Any idea ? Maybe people on the Ipython mailing list would know best? http://mail.scipy.org/mailman/listinfo/ipython-user http://mail.scipy.org/mailman/listinfo/ipython-dev -- Pauli Virtanen From millman at berkeley.edu Fri Jan 15 14:12:56 2010 From: millman at berkeley.edu (Jarrod Millman) Date: Fri, 15 Jan 2010 11:12:56 -0800 Subject: [Numpy-discussion] broken links on http://numpy.scipy.org/ In-Reply-To: References: Message-ID: On Fri, Jan 15, 2010 at 1:38 AM, Sebastian Haase wrote: > Apparently this very nice looking icons (4 of the 5 icons or so) > at http://numpy.scipy.org/ are broken links. Fixed. Thanks, -- Jarrod Millman Helen Wills Neuroscience Institute 10 Giannini Hall, UC Berkeley http://cirl.berkeley.edu/ From sierra_mtnview at sbcglobal.net Fri Jan 15 18:06:35 2010 From: sierra_mtnview at sbcglobal.net (Wayne Watson) Date: Fri, 15 Jan 2010 15:06:35 -0800 Subject: [Numpy-discussion] Percentiles and Box Plots Message-ID: <4B50F4FB.7000904@sbcglobal.net> I have from about 90 to 600 points of different data sets that I would like to find the 10th and 90th percentile for. Does numpy have a function for that, or any other percentile points? Is there a method for getting at the Box Plot quartiles, and ranges. I think that's the simplest set for Box plots. I don't need to draw anything. -- Wayne Watson (Watson Adventures, Prop., Nevada City, CA) (121.015 Deg. W, 39.262 Deg. N) GMT-8 hr std. time) Obz Site: 39? 15' 7" N, 121? 2' 32" W, 2700 feet "I was thinking about how people seem to read the Bible a whole lot more as they get older; then it dawned on me . . 
they're cramming for their final exam." -- George Carlin Web Page: From robert.kern at gmail.com Fri Jan 15 18:10:21 2010 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 15 Jan 2010 17:10:21 -0600 Subject: [Numpy-discussion] Percentiles and Box Plots In-Reply-To: <4B50F4FB.7000904@sbcglobal.net> References: <4B50F4FB.7000904@sbcglobal.net> Message-ID: <3d375d731001151510v5ff1a3edrbaff28984afcfa0b@mail.gmail.com> On Fri, Jan 15, 2010 at 17:06, Wayne Watson wrote: > I have from about 90 to 600 points of different data sets that I would > like to find the 10th and 90th percentile for. Does numpy have a > function for that, or any other percentile points? ?Is there a method > for getting at the Box Plot quartiles, and ranges. I think that's the > simplest set for Box plots. I don't need to draw anything. Use scipy.stats.scoreatpercentile() -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From sierra_mtnview at sbcglobal.net Fri Jan 15 21:48:16 2010 From: sierra_mtnview at sbcglobal.net (Wayne Watson) Date: Fri, 15 Jan 2010 18:48:16 -0800 Subject: [Numpy-discussion] Percentiles and Box Plots In-Reply-To: <3d375d731001151510v5ff1a3edrbaff28984afcfa0b@mail.gmail.com> References: <4B50F4FB.7000904@sbcglobal.net> <3d375d731001151510v5ff1a3edrbaff28984afcfa0b@mail.gmail.com> Message-ID: <4B5128F0.8060203@sbcglobal.net> An HTML attachment was scrubbed... URL: From kwgoodman at gmail.com Fri Jan 15 21:56:32 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Fri, 15 Jan 2010 18:56:32 -0800 Subject: [Numpy-discussion] Percentiles and Box Plots In-Reply-To: <4B5128F0.8060203@sbcglobal.net> References: <4B50F4FB.7000904@sbcglobal.net> <3d375d731001151510v5ff1a3edrbaff28984afcfa0b@mail.gmail.com> <4B5128F0.8060203@sbcglobal.net> Message-ID: On Fri, Jan 15, 2010 at 6:48 PM, Wayne Watson wrote: > Thanks. I'll give it a try. Is this something fairly new? From http://projects.scipy.org/scipy/search?q=scoreatpercentile it looks like it has been there a few years. But what percentile is a few years? From charlesr.harris at gmail.com Fri Jan 15 23:41:11 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 15 Jan 2010 21:41:11 -0700 Subject: [Numpy-discussion] Buildbot down Message-ID: Hi All, The numpy buildbot has been down for a while now, or maybe I just missed some time when it was up. Is this a known problem? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sat Jan 16 00:12:21 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 15 Jan 2010 22:12:21 -0700 Subject: [Numpy-discussion] Wanted: new release manager for 1.5 and above In-Reply-To: <5b8d13221001150756g32436cf4i49084207d041125a@mail.gmail.com> References: <5b8d13221001132102k4f19ee45td6b54bd5df3c8578@mail.gmail.com> <4B4EBAE2.7040703@silveregg.co.jp> <5b8d13221001150756g32436cf4i49084207d041125a@mail.gmail.com> Message-ID: On Fri, Jan 15, 2010 at 8:56 AM, David Cournapeau wrote: > On Fri, Jan 15, 2010 at 11:46 PM, Ralf Gommers > wrote: > > Hi David, > > > > Here are some questions to get a clearer idea of exactly what's involved > in > > / required for making a release. > > > > On Thu, Jan 14, 2010 at 2:34 PM, David Cournapeau > > > wrote: > >> > >> Charles R Harris wrote: > >> > >> > > >> > > >> > What is the setup one needs to build the installers? 
It might be well > to > >> > document that, the dependencies, and the process. > >> > >> Right. The top script is: > >> http://projects.scipy.org/numpy/browser/trunk/release.sh > >> > >> the bulk of the work is in : > >> http://projects.scipy.org/numpy/browser/trunk/pavement.py > >> > >> which describes what is needed to build installers. On mac os x, the > >> release script may be used as is to build every installer + the release > >> notes. > >> > > > > Is it necessary to have OS X to build the dmg installer, or could you > build > > that from linux with some modifications to the build script? > > You cannot cross compile python extensions, so you have to build > installer on each platform. Mac Os X is the most practical because you > can build windows installers under wine, so you can build both mac and > windows installers from the same machine. > > The paver script + the shell script can build everything in one step > thanks to this. > > > > > How many combinations do you test manually? All supported Python versions > on > > all platforms? Several Linux flavors? > > I basically assume that linux works once the branch is stabilized, if > only because that's what most developers use. It is important to test > on the oldest supported python (2.4) and both 32 and 64 bits, though > (especially python 2.4 on 64 bits). > > I never test the installers - this is too much work manually. Ideally, > this should be done on a build/test farm. > > > For someone new to packaging, how much time would you estimate it takes > to > > do a single release? Is most of this time spent testing, or fixing the > > problems you find during testing? > > Most of the time is spent on fixing build issues which crop up during > the beta phase. I found difficult to enforce a strict policy on not > changing anything unless critical once in the beta phase. I think we > should improve things in that aspect, and go away from the "but this > is a small fix" mentality - maybe using something like for the linux > kernel, with merge windows, etc... I secretly hope that if we can > regularly change release managers, it will give a sense of why this is > good policy :) > > I feel that we have improved things quite a bit since I have started > doing releases: the binary installers are more stable, and build are > mostly automated now. The next step would be automated testing of the > binary installers (in particular testing new numpy against scipy, > etc...), but this is quite a bit of work. > > Having a stricter time-based policy would be good as well. > > > Do you have an idea about when to start preparing for the release of > 1.4.1? > > The cython thing is the most problematic bug, and should be fixed > ASAP. I am still not sure whether that would require both a numpy > 1.4.1 and a scipy 0.7.1.1 (built against numpy 1.3.0). > > Speaking of the cython thing, do you know if the last release of cython (0.12) fixes that problem? Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From dagss at student.matnat.uio.no Sat Jan 16 02:30:26 2010 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Sat, 16 Jan 2010 08:30:26 +0100 Subject: [Numpy-discussion] Wanted: new release manager for 1.5 and above In-Reply-To: References: <5b8d13221001132102k4f19ee45td6b54bd5df3c8578@mail.gmail.com> <4B4EBAE2.7040703@silveregg.co.jp> <5b8d13221001150756g32436cf4i49084207d041125a@mail.gmail.com> Message-ID: <4B516B12.7070803@student.matnat.uio.no> Charles R Harris wrote: > > > On Fri, Jan 15, 2010 at 8:56 AM, David Cournapeau > wrote: > > On Fri, Jan 15, 2010 at 11:46 PM, Ralf Gommers > > > wrote: > > Hi David, > > > > Here are some questions to get a clearer idea of exactly what's > involved in > > / required for making a release. > > > > On Thu, Jan 14, 2010 at 2:34 PM, David Cournapeau > > > > wrote: > >> > >> Charles R Harris wrote: > >> > >> > > >> > > >> > What is the setup one needs to build the installers? It might > be well to > >> > document that, the dependencies, and the process. > >> > >> Right. The top script is: > >> http://projects.scipy.org/numpy/browser/trunk/release.sh > >> > >> the bulk of the work is in : > >> http://projects.scipy.org/numpy/browser/trunk/pavement.py > >> > >> which describes what is needed to build installers. On mac os x, the > >> release script may be used as is to build every installer + the > release > >> notes. > >> > > > > Is it necessary to have OS X to build the dmg installer, or could > you build > > that from linux with some modifications to the build script? > > You cannot cross compile python extensions, so you have to build > installer on each platform. Mac Os X is the most practical because you > can build windows installers under wine, so you can build both mac and > windows installers from the same machine. > > The paver script + the shell script can build everything in one step > thanks to this. > > > > > How many combinations do you test manually? All supported Python > versions on > > all platforms? Several Linux flavors? > > I basically assume that linux works once the branch is stabilized, if > only because that's what most developers use. It is important to test > on the oldest supported python (2.4) and both 32 and 64 bits, though > (especially python 2.4 on 64 bits). > > I never test the installers - this is too much work manually. Ideally, > this should be done on a build/test farm. > > > For someone new to packaging, how much time would you estimate it > takes to > > do a single release? Is most of this time spent testing, or > fixing the > > problems you find during testing? > > Most of the time is spent on fixing build issues which crop up during > the beta phase. I found difficult to enforce a strict policy on not > changing anything unless critical once in the beta phase. I think we > should improve things in that aspect, and go away from the "but this > is a small fix" mentality - maybe using something like for the linux > kernel, with merge windows, etc... I secretly hope that if we can > regularly change release managers, it will give a sense of why this is > good policy :) > > I feel that we have improved things quite a bit since I have started > doing releases: the binary installers are more stable, and build are > mostly automated now. The next step would be automated testing of the > binary installers (in particular testing new numpy against scipy, > etc...), but this is quite a bit of work. > > Having a stricter time-based policy would be good as well. 
> > > Do you have an idea about when to start preparing for the release > of 1.4.1? > > The cython thing is the most problematic bug, and should be fixed > ASAP. I am still not sure whether that would require both a numpy > 1.4.1 and a scipy 0.7.1.1 (built against numpy 1.3.0). > > > Speaking of the cython thing, do you know if the last release of cython > (0.12) fixes that problem? > If you people referred to this "thing" in slightly more detail, I could probably answer that :-) (If you mean the problems with building a binary for more than one version of NumPy at once, then no, I don't believe that is even fixed in trunk yet, though it is trivial to do so its just to remember to do it before 0.12.1.) -- Dag Sverre From ralf.gommers at googlemail.com Sat Jan 16 02:57:10 2010 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Sat, 16 Jan 2010 15:57:10 +0800 Subject: [Numpy-discussion] Wanted: new release manager for 1.5 and above In-Reply-To: <5b8d13221001150756g32436cf4i49084207d041125a@mail.gmail.com> References: <5b8d13221001132102k4f19ee45td6b54bd5df3c8578@mail.gmail.com> <4B4EBAE2.7040703@silveregg.co.jp> <5b8d13221001150756g32436cf4i49084207d041125a@mail.gmail.com> Message-ID: On Fri, Jan 15, 2010 at 11:56 PM, David Cournapeau wrote: > > > How many combinations do you test manually? All supported Python versions > on > > all platforms? Several Linux flavors? > > > I basically assume that linux works once the branch is stabilized, if > only because that's what most developers use. It is important to test > on the oldest supported python (2.4) and both 32 and 64 bits, though > (especially python 2.4 on 64 bits). > > I never test the installers - this is too much work manually. Ideally, > this should be done on a build/test farm. > > > For someone new to packaging, how much time would you estimate it takes > to > > do a single release? Is most of this time spent testing, or fixing the > > problems you find during testing? > > Most of the time is spent on fixing build issues which crop up during > the beta phase. I found difficult to enforce a strict policy on not > changing anything unless critical once in the beta phase. I think we > should improve things in that aspect, and go away from the "but this > is a small fix" mentality - maybe using something like for the linux > kernel, with merge windows, etc... I secretly hope that if we can > regularly change release managers, it will give a sense of why this is > good policy :) > > I feel that we have improved things quite a bit since I have started > doing releases: the binary installers are more stable, and build are > mostly automated now. The next step would be automated testing of the > binary installers (in particular testing new numpy against scipy, > etc...), but this is quite a bit of work. > > Having a stricter time-based policy would be good as well. > > Thanks for the explanations. I volunteer to help as well. From working on the docs and scikits.image I am familiar with most of NumPy/SciPy, but not with the C internals. Please just tell me if you think more experience is needed for this role, or if there are better candidates. Then I'll happily work on other things. I'm using OS X, so no problem there. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From cournape at gmail.com Sat Jan 16 04:01:25 2010 From: cournape at gmail.com (David Cournapeau) Date: Sat, 16 Jan 2010 18:01:25 +0900 Subject: [Numpy-discussion] Wanted: new release manager for 1.5 and above In-Reply-To: <4B516B12.7070803@student.matnat.uio.no> References: <5b8d13221001132102k4f19ee45td6b54bd5df3c8578@mail.gmail.com> <4B4EBAE2.7040703@silveregg.co.jp> <5b8d13221001150756g32436cf4i49084207d041125a@mail.gmail.com> <4B516B12.7070803@student.matnat.uio.no> Message-ID: <5b8d13221001160101l78fd1dc7o488a318d6c53340d@mail.gmail.com> On Sat, Jan 16, 2010 at 4:30 PM, Dag Sverre Seljebotn wrote: > Charles R Harris wrote: >> >> >> On Fri, Jan 15, 2010 at 8:56 AM, David Cournapeau > > wrote: >> >> ? ? On Fri, Jan 15, 2010 at 11:46 PM, Ralf Gommers >> ? ? > >> ? ? wrote: >> ? ? ?> Hi David, >> ? ? ?> >> ? ? ?> Here are some questions to get a clearer idea of exactly what's >> ? ? involved in >> ? ? ?> / required for making a release. >> ? ? ?> >> ? ? ?> On Thu, Jan 14, 2010 at 2:34 PM, David Cournapeau >> ? ? > >> ? ? ?> wrote: >> ? ? ?>> >> ? ? ?>> Charles R Harris wrote: >> ? ? ?>> >> ? ? ?>> > >> ? ? ?>> > >> ? ? ?>> > What is the setup one needs to build the installers? It might >> ? ? be well to >> ? ? ?>> > document that, the dependencies, and the process. >> ? ? ?>> >> ? ? ?>> Right. The top script is: >> ? ? ?>> http://projects.scipy.org/numpy/browser/trunk/release.sh >> ? ? ?>> >> ? ? ?>> the bulk of the work is in : >> ? ? ?>> http://projects.scipy.org/numpy/browser/trunk/pavement.py >> ? ? ?>> >> ? ? ?>> which describes what is needed to build installers. On mac os x, the >> ? ? ?>> release script may be used as is to build every installer + the >> ? ? release >> ? ? ?>> notes. >> ? ? ?>> >> ? ? ?> >> ? ? ?> Is it necessary to have OS X to build the dmg installer, or could >> ? ? you build >> ? ? ?> that from linux with some modifications to the build script? >> >> ? ? You cannot cross compile python extensions, so you have to build >> ? ? installer on each platform. Mac Os X is the most practical because you >> ? ? can build windows installers under wine, so you can build both mac and >> ? ? windows installers from the same machine. >> >> ? ? The paver script + the shell script can build everything in one step >> ? ? thanks to this. >> >> ? ? ?> >> ? ? ?> How many combinations do you test manually? All supported Python >> ? ? versions on >> ? ? ?> all platforms? Several Linux flavors? >> >> ? ? I basically assume that linux works once the branch is stabilized, if >> ? ? only because that's what most developers use. It is important to test >> ? ? on the oldest supported python (2.4) and both 32 and 64 bits, though >> ? ? (especially python 2.4 on 64 bits). >> >> ? ? I never test the installers - this is too much work manually. Ideally, >> ? ? this should be done on a build/test farm. >> >> ? ? ?> For someone new to packaging, how much time would you estimate it >> ? ? takes to >> ? ? ?> do a single release? Is most of this time spent testing, or >> ? ? fixing the >> ? ? ?> problems you find during testing? >> >> ? ? Most of the time is spent on fixing build issues which crop up during >> ? ? the beta phase. I found difficult to enforce a strict policy on not >> ? ? changing anything unless critical once in the beta phase. I think we >> ? ? should improve things in that aspect, and go away from the "but this >> ? ? is a small fix" mentality - maybe using something like for the linux >> ? ? kernel, with merge windows, etc... I secretly hope that if we can >> ? ? 
regularly change release managers, it will give a sense of why this is >> ? ? good policy :) >> >> ? ? I feel that we have improved things quite a bit since I have started >> ? ? doing releases: the binary installers are more stable, and build are >> ? ? mostly automated now. The next step would be automated testing of the >> ? ? binary installers (in particular testing new numpy against scipy, >> ? ? etc...), but this is quite a bit of work. >> >> ? ? Having a stricter time-based policy would be good as well. >> >> ? ? ?> Do you have an idea about when to start preparing for the release >> ? ? of 1.4.1? >> >> ? ? The cython thing is the most problematic bug, and should be fixed >> ? ? ASAP. I am still not sure whether that would require both a numpy >> ? ? 1.4.1 and a scipy 0.7.1.1 (built against numpy 1.3.0). >> >> >> Speaking of the cython thing, do you know if the last release of cython >> (0.12) fixes that problem? >> > > If you people referred to this "thing" in slightly more detail, I could > probably answer that :-) > > (If you mean the problems with building a binary for more than one > version of NumPy at once, then no, I don't believe that is even fixed in > trunk yet, though it is trivial to do so its just to remember to do it > before 0.12.1.) I thought the fix 9d8b2ecef24a (on cython-devel branch) by was supposed to fix it ? I try to apply this to cython-stable, but there were some issues, and did not want to waste time on it since it may be trivial to do for someone familiar with cython internals. David From cournape at gmail.com Sat Jan 16 04:19:13 2010 From: cournape at gmail.com (David Cournapeau) Date: Sat, 16 Jan 2010 18:19:13 +0900 Subject: [Numpy-discussion] Wanted: new release manager for 1.5 and above In-Reply-To: References: <5b8d13221001132102k4f19ee45td6b54bd5df3c8578@mail.gmail.com> <4B4EBAE2.7040703@silveregg.co.jp> <5b8d13221001150756g32436cf4i49084207d041125a@mail.gmail.com> Message-ID: <5b8d13221001160119j3bde1c2fr78aad9c1b3670799@mail.gmail.com> On Sat, Jan 16, 2010 at 4:57 PM, Ralf Gommers wrote: > > > On Fri, Jan 15, 2010 at 11:56 PM, David Cournapeau > wrote: >> >> > How many combinations do you test manually? All supported Python >> > versions on >> > all platforms? Several Linux flavors? >> >> >> I basically assume that linux works once the branch is stabilized, if >> only because that's what most developers use. It is important to test >> on the oldest supported python (2.4) and both 32 and 64 bits, though >> (especially python 2.4 on 64 bits). >> >> I never test the installers - this is too much work manually. Ideally, >> this should be done on a build/test farm. >> >> > For someone new to packaging, how much time would you estimate it takes >> > to >> > do a single release? Is most of this time spent testing, or fixing the >> > problems you find during testing? >> >> Most of the time is spent on fixing build issues which crop up during >> the beta phase. I found difficult to enforce a strict policy on not >> changing anything unless critical once in the beta phase. I think we >> should improve things in that aspect, and go away from the "but this >> is a small fix" mentality - maybe using something like for the linux >> kernel, with merge windows, etc... I secretly hope that if we can >> regularly change release managers, it will give a sense of why this is >> good policy :) >> >> I feel that we have improved things quite a bit since I have started >> doing releases: the binary installers are more stable, and build are >> mostly automated now. 
The next step would be automated testing of the >> binary installers (in particular testing new numpy against scipy, >> etc...), but this is quite a bit of work. >> >> Having a stricter time-based policy would be good as well. >> > > Thanks for the explanations. I volunteer to help as well. great. > From working on > the docs and scikits.image I am familiar with most of NumPy/SciPy, but not > with the C internals. That's not a problem - I was not either when I started doing it. And there are still a lot of areas I am not familiar with. There is no need to know everything about numpy to do a good job. One thing you could start doing is trying to make the mac os x dmg from the paver script, and familiarize yourself with virtualenv if you don't know it (I use virtualenv to install numpy in a temporary directory before building the doc - this guarantees that the doc matches the exact same version of numpy as the one you are packaging). I spent some time cleaning the paver script before the 1.4.0 release, so it should hopefully be readable. cheers, David From dagss at student.matnat.uio.no Sat Jan 16 04:22:49 2010 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Sat, 16 Jan 2010 10:22:49 +0100 Subject: [Numpy-discussion] Wanted: new release manager for 1.5 and above In-Reply-To: <5b8d13221001160101l78fd1dc7o488a318d6c53340d@mail.gmail.com> References: <5b8d13221001132102k4f19ee45td6b54bd5df3c8578@mail.gmail.com> <4B4EBAE2.7040703@silveregg.co.jp> <5b8d13221001150756g32436cf4i49084207d041125a@mail.gmail.com> <4B516B12.7070803@student.matnat.uio.no> <5b8d13221001160101l78fd1dc7o488a318d6c53340d@mail.gmail.com> Message-ID: <4B518569.1060301@student.matnat.uio.no> David Cournapeau wrote: > On Sat, Jan 16, 2010 at 4:30 PM, Dag Sverre Seljebotn > wrote: >> Charles R Harris wrote: >>> >>> On Fri, Jan 15, 2010 at 8:56 AM, David Cournapeau >> > wrote: >>> >>> On Fri, Jan 15, 2010 at 11:46 PM, Ralf Gommers >>> > >>> wrote: >>> > Hi David, >>> > >>> > Here are some questions to get a clearer idea of exactly what's >>> involved in >>> > / required for making a release. >>> > >>> > On Thu, Jan 14, 2010 at 2:34 PM, David Cournapeau >>> > >>> > wrote: >>> >> >>> >> Charles R Harris wrote: >>> >> >>> >> > >>> >> > >>> >> > What is the setup one needs to build the installers? It might >>> be well to >>> >> > document that, the dependencies, and the process. >>> >> >>> >> Right. The top script is: >>> >> http://projects.scipy.org/numpy/browser/trunk/release.sh >>> >> >>> >> the bulk of the work is in : >>> >> http://projects.scipy.org/numpy/browser/trunk/pavement.py >>> >> >>> >> which describes what is needed to build installers. On mac os x, the >>> >> release script may be used as is to build every installer + the >>> release >>> >> notes. >>> >> >>> > >>> > Is it necessary to have OS X to build the dmg installer, or could >>> you build >>> > that from linux with some modifications to the build script? >>> >>> You cannot cross compile python extensions, so you have to build >>> installer on each platform. Mac Os X is the most practical because you >>> can build windows installers under wine, so you can build both mac and >>> windows installers from the same machine. >>> >>> The paver script + the shell script can build everything in one step >>> thanks to this. >>> >>> > >>> > How many combinations do you test manually? All supported Python >>> versions on >>> > all platforms? Several Linux flavors? 
>>> >>> I basically assume that linux works once the branch is stabilized, if >>> only because that's what most developers use. It is important to test >>> on the oldest supported python (2.4) and both 32 and 64 bits, though >>> (especially python 2.4 on 64 bits). >>> >>> I never test the installers - this is too much work manually. Ideally, >>> this should be done on a build/test farm. >>> >>> > For someone new to packaging, how much time would you estimate it >>> takes to >>> > do a single release? Is most of this time spent testing, or >>> fixing the >>> > problems you find during testing? >>> >>> Most of the time is spent on fixing build issues which crop up during >>> the beta phase. I found difficult to enforce a strict policy on not >>> changing anything unless critical once in the beta phase. I think we >>> should improve things in that aspect, and go away from the "but this >>> is a small fix" mentality - maybe using something like for the linux >>> kernel, with merge windows, etc... I secretly hope that if we can >>> regularly change release managers, it will give a sense of why this is >>> good policy :) >>> >>> I feel that we have improved things quite a bit since I have started >>> doing releases: the binary installers are more stable, and build are >>> mostly automated now. The next step would be automated testing of the >>> binary installers (in particular testing new numpy against scipy, >>> etc...), but this is quite a bit of work. >>> >>> Having a stricter time-based policy would be good as well. >>> >>> > Do you have an idea about when to start preparing for the release >>> of 1.4.1? >>> >>> The cython thing is the most problematic bug, and should be fixed >>> ASAP. I am still not sure whether that would require both a numpy >>> 1.4.1 and a scipy 0.7.1.1 (built against numpy 1.3.0). >>> >>> >>> Speaking of the cython thing, do you know if the last release of cython >>> (0.12) fixes that problem? >>> >> If you people referred to this "thing" in slightly more detail, I could >> probably answer that :-) >> >> (If you mean the problems with building a binary for more than one >> version of NumPy at once, then no, I don't believe that is even fixed in >> trunk yet, though it is trivial to do so its just to remember to do it >> before 0.12.1.) > > I thought the fix 9d8b2ecef24a (on cython-devel branch) by was > supposed to fix it ? I try to apply this to cython-stable, but there > were some issues, and did not want to waste time on it since it may be > trivial to do for someone familiar with cython internals. Right, missed that one. Looks like it should fix it, yes. I'm guessing that cython-devel could pretty much be rolled straight into 0.12.1 now if you need it (however every release means spending time testing etc.., and I'm not volunteering at this point, so I'm not raising the point on cython-dev..). -- Dag Sverre From millman at berkeley.edu Sat Jan 16 04:40:15 2010 From: millman at berkeley.edu (Jarrod Millman) Date: Sat, 16 Jan 2010 01:40:15 -0800 Subject: [Numpy-discussion] Wanted: new release manager for 1.5 and above In-Reply-To: References: <5b8d13221001132102k4f19ee45td6b54bd5df3c8578@mail.gmail.com> <4B4EBAE2.7040703@silveregg.co.jp> <5b8d13221001150756g32436cf4i49084207d041125a@mail.gmail.com> Message-ID: First I want to give David Cournapeau a big thank you for all his hard work as release manager for the last few years. 
It is a lot of work and he has done a great job managing the releases (not to mention all the work he has done as one of the primary developers). I also want to thank Patrick Marsh and Ralf Gommers for stepping up to the plate and volunteering to help with the next release. I would like to ask you both to consider committing to managing the next few releases. I believe managing the releases takes some skills, which you will develop over a few releases. It will be much better for the community and for the project if we can have some consistency over the release process. I know David is willing to help you with at least the first release and I am happy to help as well. One of the things that both David and I would really like to see is finally moving to a time-based release. Both of us had moved in the direction of a time-based release and I think David had more success than I did. Ideally I would like to see the two of you commit to two years as release managers. So if we move to a time-based release every 6 months, then you would be responsible for 4 releases. As release managers, you will be responsible for keeping an eye of the trunk and making sure that it stays in a healthy releasable state. It would be great if you could work on improving the testing infrastructure and coverage. You will need to keep a good line of communication with the developers and keep everyone focused on the release date. You will also need to help write the release notes and build the binaries. Thanks, Jarrod From ralf.gommers at googlemail.com Sat Jan 16 07:59:06 2010 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Sat, 16 Jan 2010 20:59:06 +0800 Subject: [Numpy-discussion] Wanted: new release manager for 1.5 and above In-Reply-To: <5b8d13221001160119j3bde1c2fr78aad9c1b3670799@mail.gmail.com> References: <5b8d13221001132102k4f19ee45td6b54bd5df3c8578@mail.gmail.com> <4B4EBAE2.7040703@silveregg.co.jp> <5b8d13221001150756g32436cf4i49084207d041125a@mail.gmail.com> <5b8d13221001160119j3bde1c2fr78aad9c1b3670799@mail.gmail.com> Message-ID: On Sat, Jan 16, 2010 at 5:19 PM, David Cournapeau wrote: > On Sat, Jan 16, 2010 at 4:57 PM, Ralf Gommers > > From working on > > the docs and scikits.image I am familiar with most of NumPy/SciPy, but > not > > with the C internals. > > That's not a problem - I was not either when I started doing it. And > there are still a lot of areas I am not familiar with. There is no > need to know everything about numpy to do a good job. > Good, then I'll start learning. > > One thing you could start doing is trying to make the mac os x dmg > from the paver script, and familiarize yourself with virtualenv if you > don't know it (I use virtualenv to install numpy in a temporary > directory before building the doc - this guarantees that the doc > matches the exact same version of numpy as the one you are packaging). > I spent some time cleaning the paver script before the 1.4.0 release, > so it should hopefully be readable. > > The paver script is indeed very readable. I'm familiar with virtualenv, and just tried building a dmg. So here's my first question: You use virtualenv with the --no-site-packages option. This means a whole bunch of stuff already present on my machine has to be downloaded again when building the docs (sphinx, numpydoc, pygments, jinja, etc). You get Sphinx 0.6.4 when you need 1.0 (I think), and doc generation fails because MPL can not be found. What am I missing here? And what is the problem with using site-packages? 
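The workflow under discussion is roughly the following (only a sketch: the
real steps live in the paver script, and the exact install commands and
package versions here are assumptions, not what the task literally runs):

  # create the isolated 'bootstrap' environment used for the doc build
  virtualenv --no-site-packages bootstrap
  source bootstrap/bin/activate
  # doc-build dependencies are fetched into the environment each time
  easy_install sphinx numpydoc
  # matplotlib is also needed, but --no-site-packages hides a system-wide
  # install, which is the failure described above
  python setup.py install        # the numpy being packaged
  (cd doc && make html)          # build the docs against exactly that numpy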
Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at googlemail.com Sat Jan 16 08:39:45 2010 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Sat, 16 Jan 2010 21:39:45 +0800 Subject: [Numpy-discussion] Wanted: new release manager for 1.5 and above In-Reply-To: References: <5b8d13221001132102k4f19ee45td6b54bd5df3c8578@mail.gmail.com> <4B4EBAE2.7040703@silveregg.co.jp> <5b8d13221001150756g32436cf4i49084207d041125a@mail.gmail.com> Message-ID: On Sat, Jan 16, 2010 at 5:40 PM, Jarrod Millman wrote: > First I want to give David Cournapeau a big thank you for all his hard > work as release manager for the last few years. It is a lot of work > and he has done a great job managing the releases (not to mention all > the work he has done as one of the primary developers). > > I also want to thank Patrick Marsh and Ralf Gommers for stepping up to > the plate and volunteering to help with the next release. I would > like to ask you both to consider committing to managing the next few > releases. I believe managing the releases takes some skills, which > you will develop over a few releases. It will be much better for the > community and for the project if we can have some consistency over the > release process. That makes sense. I expect there's a lot to learn, so I intend to do this for a longer time. I know David is willing to help you with at least > the first release and I am happy to help as well. > That's good to hear. > One of the things that both David and I would really like to see is > finally moving to a time-based release. Both of us had moved in the > direction of a time-based release and I think David had more success > than I did. > > Ideally I would like to see the two of you commit to two years as > release managers. So if we move to a time-based release every 6 > months, then you would be responsible for 4 releases. > I can commit to that. This is 4 increments of X in the 1.X.Y version scheme right? So with bug fix releases there should be more than 4 in two years. Then there's also SciPy to think about of course. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From cournape at gmail.com Sat Jan 16 09:17:38 2010 From: cournape at gmail.com (David Cournapeau) Date: Sat, 16 Jan 2010 23:17:38 +0900 Subject: [Numpy-discussion] Wanted: new release manager for 1.5 and above In-Reply-To: References: <5b8d13221001132102k4f19ee45td6b54bd5df3c8578@mail.gmail.com> <4B4EBAE2.7040703@silveregg.co.jp> <5b8d13221001150756g32436cf4i49084207d041125a@mail.gmail.com> <5b8d13221001160119j3bde1c2fr78aad9c1b3670799@mail.gmail.com> Message-ID: <5b8d13221001160617o6c8bb2b4ye8d382d8408bb6cf@mail.gmail.com> On Sat, Jan 16, 2010 at 9:59 PM, Ralf Gommers wrote: > > > On Sat, Jan 16, 2010 at 5:19 PM, David Cournapeau > wrote: >> >> On Sat, Jan 16, 2010 at 4:57 PM, Ralf Gommers >> > From working on >> > the docs and scikits.image I am familiar with most of NumPy/SciPy, but >> > not >> > with the C internals. >> >> That's not a problem - I was not either when I started doing it. And >> there are still a lot of areas I am not familiar with. There is no >> need to know everything about numpy to do a good job. > > Good, then I'll start learning. 
>> >> One thing you could start doing is trying to make the mac os x dmg >> from the paver script, and familiarize yourself with virtualenv if you >> don't know it (I use virtualenv to install numpy in a temporary >> directory before building the doc - this guarantees that the doc >> matches the exact same version of numpy as the one you are packaging). >> I spent some time cleaning the paver script before the 1.4.0 release, >> so it should hopefully be readable. >> > > The paver script is indeed very readable. I'm familiar with virtualenv, and > just tried building a dmg. So here's my first question: > You use virtualenv with the --no-site-packages option. This means a whole > bunch of stuff already present on my machine has to be downloaded again when > building the docs (sphinx, numpydoc, pygments, jinja, etc). Yes, that's because I want to be sure to get the wanted version for sphinx. It may not matter much anymore, but before, there were a lot of instabilities between sphinx/numpydoc/matplotlib extensions changes. > You get Sphinx > 0.6.4 when you need 1.0 (I think), and doc generation fails because MPL can > not be found. You need matplotlib to build numpy doc I think - at least it used to be the case. Maybe it is not needed anymore. As I have matplotlib installed on my machine anyway, this has never been an issue for me. David From ralf.gommers at googlemail.com Sat Jan 16 09:35:07 2010 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Sat, 16 Jan 2010 22:35:07 +0800 Subject: [Numpy-discussion] Wanted: new release manager for 1.5 and above In-Reply-To: <5b8d13221001160617o6c8bb2b4ye8d382d8408bb6cf@mail.gmail.com> References: <5b8d13221001132102k4f19ee45td6b54bd5df3c8578@mail.gmail.com> <4B4EBAE2.7040703@silveregg.co.jp> <5b8d13221001150756g32436cf4i49084207d041125a@mail.gmail.com> <5b8d13221001160119j3bde1c2fr78aad9c1b3670799@mail.gmail.com> <5b8d13221001160617o6c8bb2b4ye8d382d8408bb6cf@mail.gmail.com> Message-ID: On Sat, Jan 16, 2010 at 10:17 PM, David Cournapeau wrote: > On Sat, Jan 16, 2010 at 9:59 PM, Ralf Gommers > > You get Sphinx > > 0.6.4 when you need 1.0 (I think), and doc generation fails because MPL > can > > not be found. > > You need matplotlib to build numpy doc I think - at least it used to > be the case. Maybe it is not needed anymore. As I have matplotlib > installed on my machine anyway, this has never been an issue for me. > > Yes MPL is needed. I have it installed and can build the docs manually. It is just that inside the 'bootstrap' virtualenv MPL import fails because of the --no-site-packages. Maybe it works for you because your MPL install dir is on your PYTHONPATH? Anyway, I think doc building got more robust, so I'll try to modify the paver script to use my site-packages. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From resurgo at gmail.com Sat Jan 16 09:37:13 2010 From: resurgo at gmail.com (Peter Clarke) Date: Sat, 16 Jan 2010 14:37:13 +0000 Subject: [Numpy-discussion] Python coders for Haiti disaster relief Message-ID: Apologies for off topic posting but I think this in an important project. Python programmers are required immediately for assistance in coding a disaster management framework for the Earthquake in Haiti. 
>From http://wiki.python.org/moin/VolunteerOpportunities: ----------------- URGENT REQUEST, Sahana Disaster Management System, Haiti Earthquake *Job Description*:This is an urgent call for experienced Python programmers to help in the Sahana Disaster Management System immediately - knowledge of Web2Py platform would be best. The Sahana Disaster Management System is used to coordinate relief efforts. Please recruit any available programmers for the Haiti effort as quickly as possible and have them contact me immediately so that I can put them in touch with the correct people. Thank you kindly and I do hope that we can quickly identify some contributors for this monumental effort - they are needed ASAP. http://sahanapy.org/ is the developer site and the demo is http://demo.sahanapy.org/ - *Contact*: Connie White, PhD, Institute for Emergency Preparedness, Jacksonville State University - *E-mail contact*: connie.m.white at gmail.com - *Web*: http://sahanapy.org/ ----------------------------- Please help if you can. -Peter Clarke -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sat Jan 16 10:49:06 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 16 Jan 2010 08:49:06 -0700 Subject: [Numpy-discussion] Wanted: new release manager for 1.5 and above In-Reply-To: References: <5b8d13221001132102k4f19ee45td6b54bd5df3c8578@mail.gmail.com> <4B4EBAE2.7040703@silveregg.co.jp> <5b8d13221001150756g32436cf4i49084207d041125a@mail.gmail.com> <5b8d13221001160119j3bde1c2fr78aad9c1b3670799@mail.gmail.com> <5b8d13221001160617o6c8bb2b4ye8d382d8408bb6cf@mail.gmail.com> Message-ID: On Sat, Jan 16, 2010 at 7:35 AM, Ralf Gommers wrote: > > > On Sat, Jan 16, 2010 at 10:17 PM, David Cournapeau wrote: > >> On Sat, Jan 16, 2010 at 9:59 PM, Ralf Gommers >> > You get Sphinx >> > 0.6.4 when you need 1.0 (I think), and doc generation fails because MPL >> can >> > not be found. >> >> You need matplotlib to build numpy doc I think - at least it used to >> be the case. Maybe it is not needed anymore. As I have matplotlib >> installed on my machine anyway, this has never been an issue for me. >> >> Yes MPL is needed. I have it installed and can build the docs manually. It > is just that inside the 'bootstrap' virtualenv MPL import fails because of > the --no-site-packages. Maybe it works for you because your MPL install dir > is on your PYTHONPATH? > > Anyway, I think doc building got more robust, so I'll try to modify the > paver script to use my site-packages. > > It would be useful to document the build environment and how you set it up, along with any version dependencies you uncover. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From patrickmarshwx at gmail.com Sat Jan 16 11:45:21 2010 From: patrickmarshwx at gmail.com (Patrick Marsh) Date: Sat, 16 Jan 2010 10:45:21 -0600 Subject: [Numpy-discussion] Wanted: new release manager for 1.5 and above In-Reply-To: References: <5b8d13221001132102k4f19ee45td6b54bd5df3c8578@mail.gmail.com> <4B4EBAE2.7040703@silveregg.co.jp> <5b8d13221001150756g32436cf4i49084207d041125a@mail.gmail.com> Message-ID: I think all of this makes perfect sense, and I'm willing to commit to at least the next 2 years. I'm glad to hear there will be some help initially, as I know I will need some spinup time/help. I'm currently out of town and away from my macbook pro, but when I get back I'll try to set up a build environment on it. 
In the mean time I'll work on setting up a build environment on my windows machine as practice. I look forward to working more closely with everyone. It's about time I started giving back! Thanks, Patrick On Sat, Jan 16, 2010 at 3:40 AM, Jarrod Millman wrote: > First I want to give David Cournapeau a big thank you for all his hard > work as release manager for the last few years. It is a lot of work > and he has done a great job managing the releases (not to mention all > the work he has done as one of the primary developers). > > I also want to thank Patrick Marsh and Ralf Gommers for stepping up to > the plate and volunteering to help with the next release. I would > like to ask you both to consider committing to managing the next few > releases. I believe managing the releases takes some skills, which > you will develop over a few releases. It will be much better for the > community and for the project if we can have some consistency over the > release process. I know David is willing to help you with at least > the first release and I am happy to help as well. > > One of the things that both David and I would really like to see is > finally moving to a time-based release. Both of us had moved in the > direction of a time-based release and I think David had more success > than I did. > > Ideally I would like to see the two of you commit to two years as > release managers. So if we move to a time-based release every 6 > months, then you would be responsible for 4 releases. > > As release managers, you will be responsible for keeping an eye of the > trunk and making sure that it stays in a healthy releasable state. It > would be great if you could work on improving the testing > infrastructure and coverage. You will need to keep a good line of > communication with the developers and keep everyone focused on the > release date. You will also need to help write the release notes and > build the binaries. > > Thanks, > Jarrod > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Patrick Marsh Ph.D. Student / Graduate Research Assistant School of Meteorology / University of Oklahoma Cooperative Institute for Mesoscale Meteorological Studies National Severe Storms Laboratory http://www.patricktmarsh.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From kwmsmith at gmail.com Sat Jan 16 14:12:04 2010 From: kwmsmith at gmail.com (Kurt Smith) Date: Sat, 16 Jan 2010 13:12:04 -0600 Subject: [Numpy-discussion] Waf or scons/numscons for a C/Fortran/Cython/Python project -- what's your recommendation? Message-ID: My questions here concern those familiar with configure/build/install systems such as distutils, setuptools, scons/numscons or waf (particularly David Cournapeau). I'm creating a tool known as 'fwrap' that has a component that needs to do essentially what f2py does now -- take fortran source code and compile it into a python extension module. It uses Cython to create the extension module, and the current configure/build/install system is a very kludgy monkeypatched Cython.distutils and numpy.distutils setup.py script. The setup.py script works for testing on my system here, but for going prime time, I dread using it. David has made his critiques of distutils known for scientific software, and I agree. What's the best alternative? 
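For concreteness, the kind of numpy.distutils script involved looks roughly
like the following (module and file names are placeholders, not fwrap's
actual generated code, and the C file is assumed to have been produced from
the .pyx by Cython beforehand):

def configuration(parent_package='', top_path=None):
    from numpy.distutils.misc_util import Configuration
    config = Configuration('fwproj', parent_package, top_path)
    # numpy.distutils knows how to compile and link the Fortran sources
    # together with the Cython-generated C wrapper in a single extension
    config.add_extension('fwproj',
                         sources=['fwproj.c', 'fwproj_fortran.f90'])
    return config

if __name__ == '__main__':
    from numpy.distutils.core import setup
    setup(configuration=configuration)

Running "python setup.py build_ext --inplace" on such a script drops the
extension module next to the sources, which is the usual way to test it.
The kludge mentioned above is in making Cython.distutils and numpy.distutils
cooperate on top of this.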
More specifically: what are the pros/cons between waf and scons/numscons for configure/build/install of a Fortran-C-Cython-Python project? Is scons capable of handling the configure and install stages, or is it only a build system? As I understand it, numscons is called from distutils; distutils handles the configure/install stages. Scons/numscons have more fortran support that waf, from what I can see. The main downside of using scons is that I'd still have to mess around with distutils. It looks like waf has explicit support for all three stages, and could be just what I'm looking for. David has a few threads on the waf-users list about getting fortran working with waf. Has that progressed much? I want to contribute to this, for the benefit of scipy and my project, and to limit duplicated work. From what I gather, the fortran configuration stuff in numscons is separated nicely from the scon-specific stuff :-) Would it be a matter of porting the numscons fortran stuff into waf? Any comments you have on using waf/scons for numerical projects would be welcome! Kurt From matthieu.brucher at gmail.com Sat Jan 16 14:57:57 2010 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Sat, 16 Jan 2010 20:57:57 +0100 Subject: [Numpy-discussion] Waf or scons/numscons for a C/Fortran/Cython/Python project -- what's your recommendation? In-Reply-To: References: Message-ID: Hi, SCons can also do configuration and installation steps. David made it possible to use SCons capabilities from distutils, but you can still make a C/Fortran/Cython/Python project with SCons. Matthieu 2010/1/16 Kurt Smith : > My questions here concern those familiar with configure/build/install > systems such as distutils, setuptools, scons/numscons or waf > (particularly David Cournapeau). > > I'm creating a tool known as 'fwrap' that has a component that needs > to do essentially what f2py does now -- take fortran source code and > compile it into a python extension module. ?It uses Cython to create > the extension module, and the current configure/build/install system > is a very kludgy monkeypatched Cython.distutils and numpy.distutils > setup.py script. ?The setup.py script works for testing on my system > here, but for going prime time, I dread using it. ?David has made his > critiques of distutils known for scientific software, and I agree. > What's the best alternative? > > More specifically: what are the pros/cons between waf and > scons/numscons for configure/build/install of a > Fortran-C-Cython-Python project? > > Is scons capable of handling the configure and install stages, or is > it only a build system? ?As I understand it, numscons is called from > distutils; distutils handles the configure/install stages. > Scons/numscons have more fortran support that waf, from what I can > see. ?The main downside of using scons is that I'd still have to mess > around with distutils. > > It looks like waf has explicit support for all three stages, and could > be just what I'm looking for. ?David has a few threads on the > waf-users list about getting fortran working with waf. ?Has that > progressed much? ?I want to contribute to this, for the benefit of > scipy and my project, and to limit duplicated work. ?From what I > gather, the fortran configuration stuff in numscons is separated > nicely from the scon-specific stuff :-) ?Would it be a matter of > porting the numscons fortran stuff into waf? > > Any comments you have on using waf/scons for numerical projects would > be welcome! 
> > Kurt > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Information System Engineer, Ph.D. Blog: http://matt.eifelle.com LinkedIn: http://www.linkedin.com/in/matthieubrucher From dagss at student.matnat.uio.no Sat Jan 16 15:38:22 2010 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Sat, 16 Jan 2010 21:38:22 +0100 Subject: [Numpy-discussion] Waf or scons/numscons for a C/Fortran/Cython/Python project -- what's your recommendation? In-Reply-To: References: Message-ID: <4B5223BE.7070101@student.matnat.uio.no> Kurt Smith wrote: > My questions here concern those familiar with configure/build/install > systems such as distutils, setuptools, scons/numscons or waf > (particularly David Cournapeau). > > I'm creating a tool known as 'fwrap' that has a component that needs > to do essentially what f2py does now -- take fortran source code and > compile it into a python extension module. It uses Cython to create > the extension module, and the current configure/build/install system > is a very kludgy monkeypatched Cython.distutils and numpy.distutils > setup.py script. The setup.py script works for testing on my system > here, but for going prime time, I dread using it. David has made his > critiques of distutils known for scientific software, and I agree. > What's the best alternative? > > More specifically: what are the pros/cons between waf and > scons/numscons for configure/build/install of a > Fortran-C-Cython-Python project? > > Is scons capable of handling the configure and install stages, or is > it only a build system? As I understand it, numscons is called from > distutils; distutils handles the configure/install stages. > Scons/numscons have more fortran support that waf, from what I can > see. The main downside of using scons is that I'd still have to mess > around with distutils. > Not that I really know anything about it, but note that one of the purposes of David's toydist is to handle the install stage independently of the build system used. That is, it is able to create e.g. Python eggs without using setuptools. The thing is, installing Python software is something of a mess, and every system would want this done differently (making an Ubuntu package, creating a DMG, or creating a Python egg are all different things). So I think it makes sense to decouple this from the build in the tools that are used. Of course, toydist is beta, and I dare say you have enough beta dependencies for fwrap already :-) Dag Sverre > It looks like waf has explicit support for all three stages, and could > be just what I'm looking for. David has a few threads on the > waf-users list about getting fortran working with waf. Has that > progressed much? I want to contribute to this, for the benefit of > scipy and my project, and to limit duplicated work. From what I > gather, the fortran configuration stuff in numscons is separated > nicely from the scon-specific stuff :-) Would it be a matter of > porting the numscons fortran stuff into waf? > > Any comments you have on using waf/scons for numerical projects would > be welcome! 
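For contrast, the coupled status quo drives the build and most packaging
formats through the same setup.py; an illustrative (not exhaustive) list of
the standard distutils/setuptools commands:

  python setup.py build          # compile the package
  python setup.py install        # install directly
  python setup.py bdist_egg      # Python egg (requires setuptools)
  python setup.py bdist_wininst  # Windows installer
  python setup.py bdist_rpm      # RPM package

Each of those targets has to be wired into the build machinery itself, which
is the coupling being criticized here.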
> > Kurt > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From kwmsmith at gmail.com Sat Jan 16 16:28:32 2010 From: kwmsmith at gmail.com (Kurt Smith) Date: Sat, 16 Jan 2010 15:28:32 -0600 Subject: [Numpy-discussion] Waf or scons/numscons for a C/Fortran/Cython/Python project -- what's your recommendation? In-Reply-To: <4B5223BE.7070101@student.matnat.uio.no> References: <4B5223BE.7070101@student.matnat.uio.no> Message-ID: On Sat, Jan 16, 2010 at 2:38 PM, Dag Sverre Seljebotn wrote: > Not that I really know anything about it, but note that one of the > purposes of David's toydist is to handle the install stage independently > of the build system used. That is, it is able to create e.g. Python eggs > without using setuptools. > > The thing is, installing Python software is something of a mess, and > every system would want this done differently (making an Ubuntu package, > creating a DMG, or creating a Python egg are all different things). So I > think it makes sense to decouple this from the build in the tools that > are used. Yep. Good points. I expect once I get the configure/build stages in a working state, I'll have most of what people need. The install stage is less crucial, at least for the first version. Seems like people would like the system to just create a .so file in the current directory, and leave it at that. If I can get that working on all platforms I'll be very happy :-) > > Of course, toydist is beta, and I dare say you have enough beta > dependencies for fwrap already :-) :-) Hopefully that can be remedied that in the coming months, at least from the fparser and memoryview-support-in-Cython side of things. > > Dag Sverre From dagss at student.matnat.uio.no Sat Jan 16 16:43:53 2010 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Sat, 16 Jan 2010 22:43:53 +0100 Subject: [Numpy-discussion] Waf or scons/numscons for a C/Fortran/Cython/Python project -- what's your recommendation? In-Reply-To: References: <4B5223BE.7070101@student.matnat.uio.no> Message-ID: <4B523319.8020603@student.matnat.uio.no> Kurt Smith wrote: > On Sat, Jan 16, 2010 at 2:38 PM, Dag Sverre Seljebotn > wrote: > > >> Not that I really know anything about it, but note that one of the >> purposes of David's toydist is to handle the install stage independently >> of the build system used. That is, it is able to create e.g. Python eggs >> without using setuptools. >> >> The thing is, installing Python software is something of a mess, and >> every system would want this done differently (making an Ubuntu package, >> creating a DMG, or creating a Python egg are all different things). So I >> think it makes sense to decouple this from the build in the tools that >> are used. >> > > Yep. Good points. I expect once I get the configure/build stages in > a working state, I'll have most of what people need. The install > stage is less crucial, at least for the first version. Seems like > people would like the system to just create a .so file in the current > directory, and leave it at that. If I can get that working on all > platforms I'll be very happy :-) > > >> Of course, toydist is beta, and I dare say you have enough beta >> dependencies for fwrap already :-) >> > > :-) > > Hopefully that can be remedied that in the coming months, at least > from the fparser and memoryview-support-in-Cython side of things. > Obviously I didn't get around to that yet... 
As for the build systems, some things to consider (I have no clue myself as to waf vs. scons): - There's already primitive scons support for Cython, but I'm sure it wouldn't be hard to add to waf - Whatever you pick is likely to become the best supported build system for Cython code in the future, I think, due to our interest in working on it - Does waf have infrastructure for parsing files and finding dependencies? I know that in Scons one can plug in a "Cython parser", which checks the dependencies (which pxds are used, basically), so that pyx files are rebuilt automatically when pxds they depend on change. I'm sure waf supports something similar, if not I'd say it disqualifies it. My own hunch is that waf looks better, but scons has a larger mind share and Cython support right now in scientific Python, and that both must be supported eventually, so why not do scons first... *shrug* But like you I'm anxious to hear from more non-Cython devs as well on this matter. Dag Sverre From ndbecker2 at gmail.com Sat Jan 16 18:26:26 2010 From: ndbecker2 at gmail.com (Neal Becker) Date: Sat, 16 Jan 2010 18:26:26 -0500 Subject: [Numpy-discussion] Waf or scons/numscons for a C/Fortran/Cython/Python project -- what's your recommendation? References: Message-ID: Matthieu Brucher wrote: > Hi, > > SCons can also do configuration and installation steps. David made it > possible to use SCons capabilities from distutils, but you can still > make a C/Fortran/Cython/Python project with SCons. > Also, while I think waf looks interesting, I've seen almost 0 projects actually using it. From josef.pktd at gmail.com Sat Jan 16 18:36:02 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 16 Jan 2010 18:36:02 -0500 Subject: [Numpy-discussion] Waf or scons/numscons for a C/Fortran/Cython/Python project -- what's your recommendation? In-Reply-To: <4B523319.8020603@student.matnat.uio.no> References: <4B5223BE.7070101@student.matnat.uio.no> <4B523319.8020603@student.matnat.uio.no> Message-ID: <1cd32cbb1001161536i2696e043j4a8fd06cca8be416@mail.gmail.com> On Sat, Jan 16, 2010 at 4:43 PM, Dag Sverre Seljebotn wrote: > Kurt Smith wrote: >> On Sat, Jan 16, 2010 at 2:38 PM, Dag Sverre Seljebotn >> wrote: >> >> >>> Not that I really know anything about it, but note that one of the >>> purposes of David's toydist is to handle the install stage independently >>> of the build system used. That is, it is able to create e.g. Python eggs >>> without using setuptools. >>> >>> The thing is, installing Python software is something of a mess, and >>> every system would want this done differently (making an Ubuntu package, >>> creating a DMG, or creating a Python egg are all different things). So I >>> think it makes sense to decouple this from the build in the tools that >>> are used. >>> >> >> Yep. ?Good points. ?I expect once I get the configure/build stages in >> a working state, I'll have most of what people need. ?The install >> stage is less crucial, at least for the first version. ?Seems like >> people would like the system to just create a .so file in the current >> directory, and leave it at that. ?If I can get that working on all >> platforms I'll be very happy :-) >> >> >>> Of course, toydist is beta, and I dare say you have enough beta >>> dependencies for fwrap already :-) >>> >> >> :-) >> >> Hopefully that can be remedied that in the coming months, at least >> from the fparser and memoryview-support-in-Cython side of things. >> > > Obviously I didn't get around to that yet... 
> > As for the build systems, some things to consider (I have no clue myself > as to waf vs. scons): > ?- There's already primitive scons support for Cython, but I'm sure it > wouldn't be hard to add to waf > ?- Whatever you pick is likely to become the best supported build system > for Cython code in the future, I think, due to our interest in working on it > ?- Does waf have infrastructure for parsing files and finding > dependencies? I know that in Scons one can plug in a "Cython parser", > which checks the dependencies (which pxds are used, basically), so that > pyx files are rebuilt automatically when pxds they depend on change. I'm > sure waf supports something similar, if not I'd say it disqualifies it. > > My own hunch is that waf looks better, but scons has a larger mind share > and Cython support right now in scientific Python, and that both must be > supported eventually, so why not do scons first... *shrug* > > But like you I'm anxious to hear from more non-Cython devs as well on > this matter. > > Dag Sverre > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > >From a very brief look at the waf book, I don't really understand what the cross-platform capabilities of waf are http://freehackers.org/~tnagy/wafbook/single.html : "Installing Waf on a system is unnecessary and discouraged: " "Operating systems: Waf cannot be installed on Windows (yet)" Josef From cournape at gmail.com Sat Jan 16 23:04:24 2010 From: cournape at gmail.com (David Cournapeau) Date: Sun, 17 Jan 2010 13:04:24 +0900 Subject: [Numpy-discussion] Waf or scons/numscons for a C/Fortran/Cython/Python project -- what's your recommendation? In-Reply-To: References: Message-ID: <5b8d13221001162004w640e4854obe27a78942bb551b@mail.gmail.com> On Sun, Jan 17, 2010 at 4:12 AM, Kurt Smith wrote: > My questions here concern those familiar with configure/build/install > systems such as distutils, setuptools, scons/numscons or waf > (particularly David Cournapeau). > > I'm creating a tool known as 'fwrap' that has a component that needs > to do essentially what f2py does now -- take fortran source code and > compile it into a python extension module. ?It uses Cython to create > the extension module, and the current configure/build/install system > is a very kludgy monkeypatched Cython.distutils and numpy.distutils > setup.py script. ?The setup.py script works for testing on my system > here, but for going prime time, I dread using it. ?David has made his > critiques of distutils known for scientific software, and I agree. > What's the best alternative? The best alternative in the short term is no alternative: making sure everything you need is incorporated in numpy.distutils. Otherwise, you will have to recreate everything that distutils is doing: you will have people who will demand egg, mac os x .mpkg, windows installers, etc.. Basically what I am trying to do with toydist now - I don't mind getting help there, though :) I promised to add decent cython support in numpy.distutils for 1.5.0, maybe we should see what we can do for fwrap at the same time. I am also a bit unclear about what is needed exactly, and what would be the workflow: I don't understand why fwrap should care about packaging/deployment at all, for example. > > More specifically: what are the pros/cons between waf and > scons/numscons for configure/build/install of a > Fortran-C-Cython-Python project? 
waf has no fortran support whatsoever, so you would need to add it. Waf codebase is much better than scons, but there lacks some internal documentation. There were some things that I did not manage to do in waf, because the internal API for scanning/dependency injection was not very clear to me (by scanning I mean the ability to scan the source code to look for dependency, e.g. fortran modules, and by dependency injection, I mean adding new targets to the DAG of dependencies at runtime - again needed for fortran modules). Basic handling of fortran compilation and fortran detection was relatively easy in comparison. The biggest drawback I see with waf is the lack of users: the only significant project I know which uses waf is Ardour. OTOH, I believe scons has deep structural problems, and only a few people can change some significant parts of the code. > > Is scons capable of handling the configure and install stages, or is > it only a build system? ?As I understand it, numscons is called from > distutils; distutils handles the configure/install stages. Distutils only handles the installation - everything done within the build_* command is done by the scons distutils command, and configuration is done by scons as well. Scons configure mechanism is very primitive - it uses a separate framework than the rest of the tool, which means in particular that scons top notch dependency handling does not work well for the configuration stage. waf is much better in that aspect. > Scons/numscons have more fortran support that waf, from what I can > see. ?The main downside of using scons is that I'd still have to mess > around with distutils. My main point should be this: whatever you do, you will end up messing with distutils, unless you reimplement everything that distutils does, be it waf, scons, etc... In the short term, adding things to numpy.distutils is the easiest path. Long term, I hope toydist will be a tool which will enable exactly what you want: using a build system of your choice, and being able to reuse existing code for installation/packaging to avoid recreating it yourself. You will be able to create an exe/egg/pkg from a simple package representation, every package will have a common interface for the user independently of the internal build tool, etc... David From cournape at gmail.com Sat Jan 16 23:08:30 2010 From: cournape at gmail.com (David Cournapeau) Date: Sun, 17 Jan 2010 13:08:30 +0900 Subject: [Numpy-discussion] Waf or scons/numscons for a C/Fortran/Cython/Python project -- what's your recommendation? In-Reply-To: <1cd32cbb1001161536i2696e043j4a8fd06cca8be416@mail.gmail.com> References: <4B5223BE.7070101@student.matnat.uio.no> <4B523319.8020603@student.matnat.uio.no> <1cd32cbb1001161536i2696e043j4a8fd06cca8be416@mail.gmail.com> Message-ID: <5b8d13221001162008w60af8028s526b87f058b43b2f@mail.gmail.com> On Sun, Jan 17, 2010 at 8:36 AM, wrote: > On Sat, Jan 16, 2010 at 4:43 PM, Dag Sverre Seljebotn > wrote: >> Kurt Smith wrote: >>> On Sat, Jan 16, 2010 at 2:38 PM, Dag Sverre Seljebotn >>> wrote: >>> >>> >>>> Not that I really know anything about it, but note that one of the >>>> purposes of David's toydist is to handle the install stage independently >>>> of the build system used. That is, it is able to create e.g. Python eggs >>>> without using setuptools. 
>>>> >>>> The thing is, installing Python software is something of a mess, and >>>> every system would want this done differently (making an Ubuntu package, >>>> creating a DMG, or creating a Python egg are all different things). So I >>>> think it makes sense to decouple this from the build in the tools that >>>> are used. >>>> >>> >>> Yep. ?Good points. ?I expect once I get the configure/build stages in >>> a working state, I'll have most of what people need. ?The install >>> stage is less crucial, at least for the first version. ?Seems like >>> people would like the system to just create a .so file in the current >>> directory, and leave it at that. ?If I can get that working on all >>> platforms I'll be very happy :-) >>> >>> >>>> Of course, toydist is beta, and I dare say you have enough beta >>>> dependencies for fwrap already :-) >>>> >>> >>> :-) >>> >>> Hopefully that can be remedied that in the coming months, at least >>> from the fparser and memoryview-support-in-Cython side of things. >>> >> >> Obviously I didn't get around to that yet... >> >> As for the build systems, some things to consider (I have no clue myself >> as to waf vs. scons): >> ?- There's already primitive scons support for Cython, but I'm sure it >> wouldn't be hard to add to waf >> ?- Whatever you pick is likely to become the best supported build system >> for Cython code in the future, I think, due to our interest in working on it >> ?- Does waf have infrastructure for parsing files and finding >> dependencies? I know that in Scons one can plug in a "Cython parser", >> which checks the dependencies (which pxds are used, basically), so that >> pyx files are rebuilt automatically when pxds they depend on change. I'm >> sure waf supports something similar, if not I'd say it disqualifies it. >> >> My own hunch is that waf looks better, but scons has a larger mind share >> and Cython support right now in scientific Python, and that both must be >> supported eventually, so why not do scons first... *shrug* >> >> But like you I'm anxious to hear from more non-Cython devs as well on >> this matter. >> >> Dag Sverre >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > >From a very brief look at the waf book, I don't really understand what > the cross-platform capabilities of waf are > > http://freehackers.org/~tnagy/wafbook/single.html : > "Installing Waf on a system is unnecessary and discouraged: " The main waf author claims that waf should never be installed, and always included with your package. I think it makes sense for a lot of practical cases (and that's how autotools work, mostly: autoconf/automake are not needed when building something from sources, because you have a gianting shell script called configure). I know there has been some effort toward better windows support for waf - given that waf is written in python, it is hard to see a architectural reason why waf could not work well on windows. But build tools depend a lot spawning processes and the likes both efficiently and reliably, and that's one of the area where windows and unix-like systems are fundamentally different. David From jacob.benoit.1 at gmail.com Sun Jan 17 00:20:39 2010 From: jacob.benoit.1 at gmail.com (Benoit Jacob) Date: Sun, 17 Jan 2010 00:20:39 -0500 Subject: [Numpy-discussion] performance matrix multiplication vs. 
matlab Message-ID: >> Hi, >> >> I while back, someone talked about aigen2(http://eigen.tuxfamily.org/). In >> their benchmark they give info that they are competitive again mkl and goto >> on matrix matrix product. They are not better, but that could make a good >> default implementation for numpy when their is no blas installed. I think >> the license would allow to include it in numpy directly. > >It is licensed under the LGPLv3, so it is not compatible with the numpy license. Hi, I'm one of Eigen's authors. Eigen is indeed LGPL3 licensed. Our intent and understanding is that this makes Eigen usable by virtually any software, whence my disappointment to learn that LGPL3 software can't be used by NumPy. Just for my information, could you tell my why NumPy can't use LGPL3-licensed libraries? I found this page: http://www.scipy.org/License_Compatibility It does say that LGPL-licensed code can't be added to NumPy, but there's a big difference between adding LGPL code directly into NumPy, and just letting NumPy _use_ LGPL code. Couldn't you simply: - either add LGPL-licensed code to a third_party subdirectory not subject to the NumPy license, and just use it? This is common practice, see e.g. how Qt puts a copy of WebKit in a third_party subdirectory. - or use LGPL-licensed code as an external dependency? FYI, several BSD-licensed projects are using Eigen ;) Thanks for your consideration Benoit From cournape at gmail.com Sun Jan 17 00:58:34 2010 From: cournape at gmail.com (David Cournapeau) Date: Sun, 17 Jan 2010 14:58:34 +0900 Subject: [Numpy-discussion] performance matrix multiplication vs. matlab In-Reply-To: References: Message-ID: <5b8d13221001162158l471dbd4k89230c9209f49b6e@mail.gmail.com> On Sun, Jan 17, 2010 at 2:20 PM, Benoit Jacob wrote: > Couldn't you simply: > ?- either add LGPL-licensed code to a third_party subdirectory not > subject to the NumPy license, and just use it? This is common > practice, see e.g. how Qt puts a copy of WebKit in a third_party > subdirectory. > ?- or use LGPL-licensed code as an external dependency? There are several issues with eigen2 for NumPy usage: - using it as a default implementation does not make much sense IMHO, as it would make distributed binaries non 100 % BSD. - to my knowledge, eigen2 does not have a BLAS API, so we would have to write specific wrappers for eigen2, which is undesirable. - eigen2 is C++, and it is a stated goal to make numpy depend only on a C compiler (it may optionally uses fortran to link against blas/lapack, though). As I see it, people would be able to easily use eigen2 if there was a BLAS API for it. We still would not distribute binaries built with eigen2, but it means people who don't care about using GPL code could use it. Independently of NumPy, I think a BLAS API for eigen2 would be very beneficial for eigen2 if you care about the numerical scientific community. David From jacob.benoit.1 at gmail.com Sun Jan 17 09:52:31 2010 From: jacob.benoit.1 at gmail.com (Benoit Jacob) Date: Sun, 17 Jan 2010 09:52:31 -0500 Subject: [Numpy-discussion] performance matrix multiplication vs. matlab In-Reply-To: <5b8d13221001162158l471dbd4k89230c9209f49b6e@mail.gmail.com> References: <5b8d13221001162158l471dbd4k89230c9209f49b6e@mail.gmail.com> Message-ID: 2010/1/17 David Cournapeau : > On Sun, Jan 17, 2010 at 2:20 PM, Benoit Jacob wrote: > >> Couldn't you simply: >> ?- either add LGPL-licensed code to a third_party subdirectory not >> subject to the NumPy license, and just use it? This is common >> practice, see e.g. 
how Qt puts a copy of WebKit in a third_party >> subdirectory. >> ?- or use LGPL-licensed code as an external dependency? > Thanks for the reply! First of all I should say that I was only talking about the raised licensing issue, I'm not saying that you _should_ use eigen from a technical point of view. > There are several issues with eigen2 for NumPy usage: > ?- using it as a default implementation does not make much sense IMHO, > as it would make distributed binaries non 100 % BSD. But the LGPL doesn't impose restrictions on the usage of binaries, so how does it matter? The LGPL and the BSD licenses are similar as far as the binaries are concerned (unless perhaps one starts disassembling them). The big difference between LGPL and BSD is at the level of source code, not binary code: one modifies LGPL-based source code and distributes a binary form of it, then one has to release the modified source code as well. Since NumPy's users are presumably not interested in modifying _Eigen_ itself, I don't think that matters. I understand that they may want to modify NumPy's source code without releasing their modified source code, so the BSD license is important for NumPy, but having Eigen in a third_party directory wouldn't affect that. > ?- to my knowledge, eigen2 does not have a BLAS API, so we would have > to write specific wrappers for eigen2, which is undesirable. That's true. FYI, a BLAS API is coming in Eigen 3, https://bitbucket.org/eigen/eigen/src/tip/blas/ > ?- eigen2 is C++, and it is a stated goal to make numpy depend only on > a C compiler (it may optionally uses fortran to link against > blas/lapack, though). Ah OK. Well, once the Eigen BLAS is implemented, it will be usable by a C compiler. > As I see it, people would be able to easily use eigen2 if there was a > BLAS API for it. We still would not distribute binaries built with > eigen2, but it means people who don't care about using GPL code could > use it. I see. I'd quite like to see this happening! Maybe, just give a look at where Eigen is in 1 year from now, the BLAS should be ready for that. > > Independently of NumPy, I think a BLAS API for eigen2 would be very > beneficial for eigen2 if you care about the numerical scientific > community. So do we, that's why we're doing it ;) see above. Benoit From thomas.robitaille at gmail.com Sun Jan 17 12:14:33 2010 From: thomas.robitaille at gmail.com (Thomas Robitaille) Date: Sun, 17 Jan 2010 12:14:33 -0500 Subject: [Numpy-discussion] Structured array sorting Message-ID: I am having trouble sorting a structured array - in the example below, sorting by the first column (col1) seems to work, but not sorting by the second column (col2). Is this a bug? I am using numpy svn r8071 on MacOS 10.6. Thanks for any help, Thomas Python 2.6.1 (r261:67515, Jul 7 2009, 23:51:51) [GCC 4.2.1 (Apple Inc. build 5646)] on darwin Type "help", "copyright", "credits" or "license" for more information. >>> import numpy as np >>> data = np.array([('a ', 2.), ('b', 4.), ('d', 3.), ('c', 1.)], ... 
dtype=[('col1', '|S5'), ('col2', '>f8')]) >>> >>> data array([('a ', 2.0), ('b', 4.0), ('d', 3.0), ('c', 1.0)], dtype=[('col1', '|S5'), ('col2', '>f8')]) >>> data.sort(order=['col1']) >>> data array([('a ', 2.0), ('b', 4.0), ('c', 1.0), ('d', 3.0)], dtype=[('col1', '|S5'), ('col2', '>f8')]) >>> data.sort(order=['col2']) >>> data array([('a ', 2.0), ('d', 3.0), ('b', 4.0), ('c', 1.0)], dtype=[('col1', '|S5'), ('col2', '>f8')]) From robert.kern at gmail.com Sun Jan 17 12:36:53 2010 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 17 Jan 2010 11:36:53 -0600 Subject: [Numpy-discussion] performance matrix multiplication vs. matlab In-Reply-To: References: <5b8d13221001162158l471dbd4k89230c9209f49b6e@mail.gmail.com> Message-ID: <3d375d731001170936y38dd7538l3cb1ccece61e0ae2@mail.gmail.com> On Sun, Jan 17, 2010 at 08:52, Benoit Jacob wrote: > 2010/1/17 David Cournapeau : >> There are several issues with eigen2 for NumPy usage: >> ?- using it as a default implementation does not make much sense IMHO, >> as it would make distributed binaries non 100 % BSD. > > But the LGPL doesn't impose restrictions on the usage of binaries, so > how does it matter? The LGPL and the BSD licenses are similar as far > as the binaries are concerned (unless perhaps one starts disassembling > them). > > The big difference between LGPL and BSD is at the level of source > code, not binary code: one modifies LGPL-based source code and > distributes a binary form of it, then one has to release the modified > source code as well. This is not true. Binaries that contain LGPLed code must be able to be relinked with a modified version of the LGPLed component. This is technically non-trivial. In addition, binaries containing an LGPLed component must still come with the source of the LGPLed component (or come with a written offer to distribute via the same mechanism ... yada yada yada). These are non-trivial restrictions above and beyond the BSD license that we, as a matter of policy, do not wish to impose on numpy users. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From sierra_mtnview at sbcglobal.net Sun Jan 17 12:57:51 2010 From: sierra_mtnview at sbcglobal.net (Wayne Watson) Date: Sun, 17 Jan 2010 09:57:51 -0800 Subject: [Numpy-discussion] Module Index for numpy? Message-ID: <4B534F9F.4020901@sbcglobal.net> I was just looking at the (Win) Python documentation via the Help on IDLE, and a Global Module Index. Does anything like that exist for numpy, matplotlib, scipy? -- Wayne Watson (Watson Adventures, Prop., Nevada City, CA) (121.015 Deg. W, 39.262 Deg. N) GMT-8 hr std. time) Obz Site: 39? 15' 7" N, 121? 2' 32" W, 2700 feet "I was thinking about how people seem to read the Bible a whole lot more as they get older; then it dawned on me . . they're cramming for their final exam." -- George Carlin Web Page: From jacob.benoit.1 at gmail.com Sun Jan 17 13:11:23 2010 From: jacob.benoit.1 at gmail.com (Benoit Jacob) Date: Sun, 17 Jan 2010 13:11:23 -0500 Subject: [Numpy-discussion] performance matrix multiplication vs. 
matlab In-Reply-To: <3d375d731001170936y38dd7538l3cb1ccece61e0ae2@mail.gmail.com> References: <5b8d13221001162158l471dbd4k89230c9209f49b6e@mail.gmail.com> <3d375d731001170936y38dd7538l3cb1ccece61e0ae2@mail.gmail.com> Message-ID: 2010/1/17 Robert Kern : > On Sun, Jan 17, 2010 at 08:52, Benoit Jacob wrote: >> 2010/1/17 David Cournapeau : > >>> There are several issues with eigen2 for NumPy usage: >>> ?- using it as a default implementation does not make much sense IMHO, >>> as it would make distributed binaries non 100 % BSD. >> >> But the LGPL doesn't impose restrictions on the usage of binaries, so >> how does it matter? The LGPL and the BSD licenses are similar as far >> as the binaries are concerned (unless perhaps one starts disassembling >> them). >> >> The big difference between LGPL and BSD is at the level of source >> code, not binary code: one modifies LGPL-based source code and >> distributes a binary form of it, then one has to release the modified >> source code as well. > > This is not true. Binaries that contain LGPLed code must be able to be > relinked with a modified version of the LGPLed component. This doesn't apply to Eigen which is a header-only pure template library, hence can't be 'linked' to. Actually you seem to be referring to Section 4 of the LGPL3, we have already asked the FSF about this and their reply was that it just doesn't apply in the case of Eigen: http://listengine.tuxfamily.org/lists.tuxfamily.org/eigen/2009/01/msg00083.html In your case, what matters is Section 5. >In addition, binaries containing an LGPLed > component must still come with the source of the LGPLed component (or > come with a written offer to distribute via the same mechanism ... > yada yada yada). Since you would presumably be using vanilla Eigen without changes of your own, it would be enough to just give the link to the Eigen website, that's all. Just one line, and it doesn't have to be in a very prominent place, it just has to be reasonably easy to find for someone looking for it. > These are non-trivial restrictions above and beyond > the BSD license that we, as a matter of policy, do not wish to impose > on numpy users. The only thing you'd be imposing on NumPy users would be that somewhere at the bottom of, say, your README file, there would be a link to Eigen's website. Then who am I to discuss your policies ;) Finally let me just give an example why this is moot. You are using GCC, right? So you use the GNU libc (their standard C library)? It is LGPL ;) It's just that nobody cares to put a link to the GNU libc homepage, which is understandable ;) Cheers, Benoit From warren.weckesser at enthought.com Sun Jan 17 13:18:40 2010 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Sun, 17 Jan 2010 12:18:40 -0600 Subject: [Numpy-discussion] Structured array sorting In-Reply-To: References: Message-ID: <4B535480.3010406@enthought.com> Looks like 'sort' is not handling endianess of the column data correctly. If you change the type of the floating point data to '<f8', the sort works. Here's a simpler demonstration of the problem with a single big-endian column: ----- In [140]: z = np.array([...], dtype=[('num', '>i2')]) In [141]: z.sort(order='num') In [142]: z Out[142]: array([(255,), (0,), (256,), (1,), (258,)], dtype=[('num', '>i2')]) In [143]: np.__version__ Out[143]: '1.3.0' ----- Sorting works as expected with a simple array of short ints: ----- In [152]: w = np.array([0,258, 3, 255], dtype='<i2') In [153]: w.sort() In [154]: w Out[154]: array([  0,   3, 255, 258], dtype=int16) ----- Warren Thomas Robitaille wrote: > I am having trouble sorting a structured array - in the example below, sorting by the first column (col1) seems to work, but not sorting by the second column (col2). Is this a bug? > > I am using numpy svn r8071 on MacOS 10.6. 
> > Thanks for any help, > > Thomas > > Python 2.6.1 (r261:67515, Jul 7 2009, 23:51:51) > [GCC 4.2.1 (Apple Inc. build 5646)] on darwin > Type "help", "copyright", "credits" or "license" for more information. > >>>> import numpy as np >>>> data = np.array([('a ', 2.), ('b', 4.), ('d', 3.), ('c', 1.)], >>>> > ... dtype=[('col1', '|S5'), ('col2', '>f8')]) > >>>> data >>>> > array([('a ', 2.0), ('b', 4.0), ('d', 3.0), ('c', 1.0)], > dtype=[('col1', '|S5'), ('col2', '>f8')]) > >>>> data.sort(order=['col1']) >>>> data >>>> > array([('a ', 2.0), ('b', 4.0), ('c', 1.0), ('d', 3.0)], > dtype=[('col1', '|S5'), ('col2', '>f8')]) > >>>> data.sort(order=['col2']) >>>> data >>>> > array([('a ', 2.0), ('d', 3.0), ('b', 4.0), ('c', 1.0)], > dtype=[('col1', '|S5'), ('col2', '>f8')]) > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From robert.kern at gmail.com Sun Jan 17 13:43:15 2010 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 17 Jan 2010 12:43:15 -0600 Subject: [Numpy-discussion] performance matrix multiplication vs. matlab In-Reply-To: References: <5b8d13221001162158l471dbd4k89230c9209f49b6e@mail.gmail.com> <3d375d731001170936y38dd7538l3cb1ccece61e0ae2@mail.gmail.com> Message-ID: <3d375d731001171043s2e1f90e1t77d6923eb3a6f4cd@mail.gmail.com> On Sun, Jan 17, 2010 at 12:11, Benoit Jacob wrote: > 2010/1/17 Robert Kern : >> On Sun, Jan 17, 2010 at 08:52, Benoit Jacob wrote: >>> 2010/1/17 David Cournapeau : >> >>>> There are several issues with eigen2 for NumPy usage: >>>> ?- using it as a default implementation does not make much sense IMHO, >>>> as it would make distributed binaries non 100 % BSD. >>> >>> But the LGPL doesn't impose restrictions on the usage of binaries, so >>> how does it matter? The LGPL and the BSD licenses are similar as far >>> as the binaries are concerned (unless perhaps one starts disassembling >>> them). >>> >>> The big difference between LGPL and BSD is at the level of source >>> code, not binary code: one modifies LGPL-based source code and >>> distributes a binary form of it, then one has to release the modified >>> source code as well. >> >> This is not true. Binaries that contain LGPLed code must be able to be >> relinked with a modified version of the LGPLed component. > > This doesn't apply to Eigen which is a header-only pure template > library, hence can't be 'linked' to. > > Actually you seem to be referring to Section 4 of the LGPL3, we have > already asked the FSF about this and their reply was that it just > doesn't apply in the case of Eigen: > > http://listengine.tuxfamily.org/lists.tuxfamily.org/eigen/2009/01/msg00083.html > > In your case, what matters is Section 5. You mean Section 3. Good. I admit to being less up on the details of LGPLv3 than I was of LGPLv2 which had a problem with C++ header templates. That said, we will not be using the C++ templates directly in numpy for technical reasons (not least that we do not want to require a C++ compiler for the default build). At best, we would be using a BLAS interface which requires linking of objects, not just header templates. That *would* impose the Section 4 requirements. Furthermore, we would still prefer not to have any LGPL code in the official numpy sources or binaries, regardless of how minimal the real requirements are. Licensing is confusing enough that being able to say "numpy is BSD licensed" without qualification is quite important. 
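Coming back to the structured array sorting thread above: while that byte-order bug is open, one user-side workaround is to sort through a native-byte-order copy of the key column instead of relying on sort(order=...). The sketch below reuses the layout from Thomas's example; it is a workaround illustration only, not the fix that belongs in numpy itself.

import numpy as np

# Same layout as in the report: a string column plus a big-endian float column.
data = np.array([('a ', 2.), ('b', 4.), ('d', 3.), ('c', 1.)],
                dtype=[('col1', '|S5'), ('col2', '>f8')])

# Convert the key column to the native float type (this byte-swaps the
# values correctly), argsort it, and reorder the whole array with the index.
order = np.argsort(data['col2'].astype(np.float64))
data_sorted = data[order]
# data_sorted is now ordered by col2, i.e. rows c, a, d, b.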
-- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From jacob.benoit.1 at gmail.com Sun Jan 17 14:18:42 2010 From: jacob.benoit.1 at gmail.com (Benoit Jacob) Date: Sun, 17 Jan 2010 14:18:42 -0500 Subject: [Numpy-discussion] performance matrix multiplication vs. matlab In-Reply-To: <3d375d731001171043s2e1f90e1t77d6923eb3a6f4cd@mail.gmail.com> References: <5b8d13221001162158l471dbd4k89230c9209f49b6e@mail.gmail.com> <3d375d731001170936y38dd7538l3cb1ccece61e0ae2@mail.gmail.com> <3d375d731001171043s2e1f90e1t77d6923eb3a6f4cd@mail.gmail.com> Message-ID: 2010/1/17 Robert Kern : > On Sun, Jan 17, 2010 at 12:11, Benoit Jacob wrote: >> 2010/1/17 Robert Kern : >>> On Sun, Jan 17, 2010 at 08:52, Benoit Jacob wrote: >>>> 2010/1/17 David Cournapeau : >>> >>>>> There are several issues with eigen2 for NumPy usage: >>>>> ?- using it as a default implementation does not make much sense IMHO, >>>>> as it would make distributed binaries non 100 % BSD. >>>> >>>> But the LGPL doesn't impose restrictions on the usage of binaries, so >>>> how does it matter? The LGPL and the BSD licenses are similar as far >>>> as the binaries are concerned (unless perhaps one starts disassembling >>>> them). >>>> >>>> The big difference between LGPL and BSD is at the level of source >>>> code, not binary code: one modifies LGPL-based source code and >>>> distributes a binary form of it, then one has to release the modified >>>> source code as well. >>> >>> This is not true. Binaries that contain LGPLed code must be able to be >>> relinked with a modified version of the LGPLed component. >> >> This doesn't apply to Eigen which is a header-only pure template >> library, hence can't be 'linked' to. >> >> Actually you seem to be referring to Section 4 of the LGPL3, we have >> already asked the FSF about this and their reply was that it just >> doesn't apply in the case of Eigen: >> >> http://listengine.tuxfamily.org/lists.tuxfamily.org/eigen/2009/01/msg00083.html >> >> In your case, what matters is Section 5. > > You mean Section 3. Good. Section 3 is for using Eigen directly in a C++ program, yes, but I got a bit ahead of myself there: see below > I admit to being less up on the details of > LGPLv3 than I was of LGPLv2 which had a problem with C++ header > templates. Indeed, it did, that's why we don't use it. > > That said, we will not be using the C++ templates directly in numpy > for technical reasons (not least that we do not want to require a C++ > compiler for the default build). At best, we would be using a BLAS > interface which requires linking of objects, not just header > templates. That *would* impose the Section 4 requirements. ... or rather Section 5: that is what I was having in mind: " 5. Combined Libraries. " I have to admit that I don't understand what 5.a) means. > Furthermore, we would still prefer not to have any LGPL code in the > official numpy sources or binaries, regardless of how minimal the real > requirements are. Licensing is confusing enough that being able to say > "numpy is BSD licensed" without qualification is quite important. I hear you, in the same way we definitely care about being able to say "Eigen is LGPL licensed". So it's a hard problem. I think that this is the only real issue here, but I definitely agree that it is a real one. 
Large projects (such as Qt) that have a third_party subdirectory have to find a wording to explain that their license doesn't cover it. Benoit > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > ?-- Umberto Eco > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From d.l.goldsmith at gmail.com Sun Jan 17 14:19:38 2010 From: d.l.goldsmith at gmail.com (David Goldsmith) Date: Sun, 17 Jan 2010 11:19:38 -0800 Subject: [Numpy-discussion] Module Index for numpy? In-Reply-To: <4B534F9F.4020901@sbcglobal.net> References: <4B534F9F.4020901@sbcglobal.net> Message-ID: <45d1ab481001171119u244352efgfa65c2e72965aff1@mail.gmail.com> Hi, Wayne. They're not nearly as structured, but for the time being (indefinitely? unless a volunteer steps forward to build something for us more closely resembling the GMI), you could use the numpy and scipy doc Wiki Milestones pages: http://docs.scipy.org/numpy/Milestones/ http://docs.scipy.org/scipy/Milestones/ in this fashion. DG On Sun, Jan 17, 2010 at 9:57 AM, Wayne Watson wrote: > I was just looking at the (Win) Python documentation via the Help on > IDLE, and a Global Module Index. Does anything like that exist for > numpy, matplotlib, scipy? > > -- > ? ? ? ? ? Wayne Watson (Watson Adventures, Prop., Nevada City, CA) > > ? ? ? ? ? ? (121.015 Deg. W, 39.262 Deg. N) GMT-8 hr std. time) > ? ? ? ? ? ? ?Obz Site: ?39? 15' 7" N, 121? 2' 32" W, 2700 feet > > ? ? ? ? ? "I was thinking about how people seem to read the Bible > ? ? ? ? ? ?a whole lot more as they get older; then it dawned on > ? ? ? ? ? ?me . . they're cramming for their final exam." > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?-- George Carlin > > ? ? ? ? ? ? ? ? ? ?Web Page: > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From josef.pktd at gmail.com Sun Jan 17 14:34:00 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sun, 17 Jan 2010 14:34:00 -0500 Subject: [Numpy-discussion] Module Index for numpy? In-Reply-To: <45d1ab481001171119u244352efgfa65c2e72965aff1@mail.gmail.com> References: <4B534F9F.4020901@sbcglobal.net> <45d1ab481001171119u244352efgfa65c2e72965aff1@mail.gmail.com> Message-ID: <1cd32cbb1001171134n269aa238p7d4ac4030be024e7@mail.gmail.com> On Sun, Jan 17, 2010 at 2:19 PM, David Goldsmith wrote: > Hi, Wayne. > > They're not nearly as structured, but for the time being > (indefinitely? unless a volunteer steps forward to build something for > us more closely resembling the GMI), you could use the numpy and scipy > doc Wiki Milestones pages: > > http://docs.scipy.org/numpy/Milestones/ > > http://docs.scipy.org/scipy/Milestones/ > > in this fashion. > > DG > > On Sun, Jan 17, 2010 at 9:57 AM, Wayne Watson > wrote: >> I was just looking at the (Win) Python documentation via the Help on >> IDLE, and a Global Module Index. Does anything like that exist for >> numpy, matplotlib, scipy? >> >> -- >> ? ? ? ? ? Wayne Watson (Watson Adventures, Prop., Nevada City, CA) >> >> ? ? ? ? ? ? (121.015 Deg. W, 39.262 Deg. N) GMT-8 hr std. time) >> ? ? ? ? ? ? ?Obz Site: ?39? 15' 7" N, 121? 2' 32" W, 2700 feet >> >> ? ? ? ? ? "I was thinking about how people seem to read the Bible >> ? ? ? ? ? 
?a whole lot more as they get older; then it dawned on >> ? ? ? ? ? ?me . . they're cramming for their final exam." >> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?-- George Carlin >> >> ? ? ? ? ? ? ? ? ? ?Web Page: >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > the autogenerated indices are here and also in the htmlhelp http://docs.scipy.org/doc/numpy/modindex.html http://docs.scipy.org/doc/numpy/genindex.html However, because of the package structure of numpy the modindex is not as useful as the one for python. I find the structure of routines, and the index search in the htmlhelp more useful. Josef From robert.kern at gmail.com Sun Jan 17 14:57:20 2010 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 17 Jan 2010 13:57:20 -0600 Subject: [Numpy-discussion] performance matrix multiplication vs. matlab In-Reply-To: References: <5b8d13221001162158l471dbd4k89230c9209f49b6e@mail.gmail.com> <3d375d731001170936y38dd7538l3cb1ccece61e0ae2@mail.gmail.com> <3d375d731001171043s2e1f90e1t77d6923eb3a6f4cd@mail.gmail.com> Message-ID: <3d375d731001171157n1ee58b36ld8c5bca3fa2088f7@mail.gmail.com> On Sun, Jan 17, 2010 at 13:18, Benoit Jacob wrote: > 2010/1/17 Robert Kern : >> On Sun, Jan 17, 2010 at 12:11, Benoit Jacob wrote: >>> 2010/1/17 Robert Kern : >>>> On Sun, Jan 17, 2010 at 08:52, Benoit Jacob wrote: >>>>> 2010/1/17 David Cournapeau : >>>> >>>>>> There are several issues with eigen2 for NumPy usage: >>>>>> ?- using it as a default implementation does not make much sense IMHO, >>>>>> as it would make distributed binaries non 100 % BSD. >>>>> >>>>> But the LGPL doesn't impose restrictions on the usage of binaries, so >>>>> how does it matter? The LGPL and the BSD licenses are similar as far >>>>> as the binaries are concerned (unless perhaps one starts disassembling >>>>> them). >>>>> >>>>> The big difference between LGPL and BSD is at the level of source >>>>> code, not binary code: one modifies LGPL-based source code and >>>>> distributes a binary form of it, then one has to release the modified >>>>> source code as well. >>>> >>>> This is not true. Binaries that contain LGPLed code must be able to be >>>> relinked with a modified version of the LGPLed component. >>> >>> This doesn't apply to Eigen which is a header-only pure template >>> library, hence can't be 'linked' to. >>> >>> Actually you seem to be referring to Section 4 of the LGPL3, we have >>> already asked the FSF about this and their reply was that it just >>> doesn't apply in the case of Eigen: >>> >>> http://listengine.tuxfamily.org/lists.tuxfamily.org/eigen/2009/01/msg00083.html >>> >>> In your case, what matters is Section 5. >> >> You mean Section 3. Good. > > Section 3 is for using Eigen directly in a C++ program, yes, but I got > a bit ahead of myself there: see below > >> I admit to being less up on the details of >> LGPLv3 than I was of LGPLv2 which had a problem with C++ header >> templates. > > Indeed, it did, that's why we don't use it. > >> >> That said, we will not be using the C++ templates directly in numpy >> for technical reasons (not least that we do not want to require a C++ >> compiler for the default build). 
At best, we would be using a BLAS >> interface which requires linking of objects, not just header >> templates. That *would* impose the Section 4 requirements. > > ... or rather Section 5: that is what I was having in mind: > ?" 5. Combined Libraries. " > > I have to admit that I don't understand what 5.a) means. I don't think it applies. Let's say I write some routines that use an LGPLed Library (let's call them Routines A). I can include those routines in a larger library with routines that do not use the LGPLed library (Routines B). The Routines B can be under whatever license you like. However, one must make a library containing only Routines A and the LGPLed Library and release that under the LGPLv3, distribute it along with the combined work, and give notice about how to obtain Routines A+Library separate from Routines B. Basically, it's another exception for needing to be able to relink object code in a particular technical use case. This cannot apply to numpy because we cannot break out numpy.linalg from the rest of numpy. Even if we could, we do not wish to make numpy.linalg itself LGPLed. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From jacob.benoit.1 at gmail.com Sun Jan 17 15:40:02 2010 From: jacob.benoit.1 at gmail.com (Benoit Jacob) Date: Sun, 17 Jan 2010 15:40:02 -0500 Subject: [Numpy-discussion] performance matrix multiplication vs. matlab In-Reply-To: <3d375d731001171157n1ee58b36ld8c5bca3fa2088f7@mail.gmail.com> References: <5b8d13221001162158l471dbd4k89230c9209f49b6e@mail.gmail.com> <3d375d731001170936y38dd7538l3cb1ccece61e0ae2@mail.gmail.com> <3d375d731001171043s2e1f90e1t77d6923eb3a6f4cd@mail.gmail.com> <3d375d731001171157n1ee58b36ld8c5bca3fa2088f7@mail.gmail.com> Message-ID: 2010/1/17 Robert Kern : > On Sun, Jan 17, 2010 at 13:18, Benoit Jacob wrote: >> 2010/1/17 Robert Kern : >>> On Sun, Jan 17, 2010 at 12:11, Benoit Jacob wrote: >>>> 2010/1/17 Robert Kern : >>>>> On Sun, Jan 17, 2010 at 08:52, Benoit Jacob wrote: >>>>>> 2010/1/17 David Cournapeau : >>>>> >>>>>>> There are several issues with eigen2 for NumPy usage: >>>>>>> ?- using it as a default implementation does not make much sense IMHO, >>>>>>> as it would make distributed binaries non 100 % BSD. >>>>>> >>>>>> But the LGPL doesn't impose restrictions on the usage of binaries, so >>>>>> how does it matter? The LGPL and the BSD licenses are similar as far >>>>>> as the binaries are concerned (unless perhaps one starts disassembling >>>>>> them). >>>>>> >>>>>> The big difference between LGPL and BSD is at the level of source >>>>>> code, not binary code: one modifies LGPL-based source code and >>>>>> distributes a binary form of it, then one has to release the modified >>>>>> source code as well. >>>>> >>>>> This is not true. Binaries that contain LGPLed code must be able to be >>>>> relinked with a modified version of the LGPLed component. >>>> >>>> This doesn't apply to Eigen which is a header-only pure template >>>> library, hence can't be 'linked' to. >>>> >>>> Actually you seem to be referring to Section 4 of the LGPL3, we have >>>> already asked the FSF about this and their reply was that it just >>>> doesn't apply in the case of Eigen: >>>> >>>> http://listengine.tuxfamily.org/lists.tuxfamily.org/eigen/2009/01/msg00083.html >>>> >>>> In your case, what matters is Section 5. >>> >>> You mean Section 3. Good. 
>> >> Section 3 is for using Eigen directly in a C++ program, yes, but I got >> a bit ahead of myself there: see below >> >>> I admit to being less up on the details of >>> LGPLv3 than I was of LGPLv2 which had a problem with C++ header >>> templates. >> >> Indeed, it did, that's why we don't use it. >> >>> >>> That said, we will not be using the C++ templates directly in numpy >>> for technical reasons (not least that we do not want to require a C++ >>> compiler for the default build). At best, we would be using a BLAS >>> interface which requires linking of objects, not just header >>> templates. That *would* impose the Section 4 requirements. >> >> ... or rather Section 5: that is what I was having in mind: >> ?" 5. Combined Libraries. " >> >> I have to admit that I don't understand what 5.a) means. > > I don't think it applies. Let's say I write some routines that use an > LGPLed Library (let's call them Routines A). I can include those > routines in a larger library with routines that do not use the LGPLed > library (Routines B). The Routines B can be under whatever license you > like. However, one must make a library containing only Routines A and > the LGPLed Library and release that under the LGPLv3, distribute it > along with the combined work, and give notice about how to obtain > Routines A+Library separate from Routines B. Basically, it's another > exception for needing to be able to relink object code in a particular > technical use case. > > This cannot apply to numpy because we cannot break out numpy.linalg > from the rest of numpy. Even if we could, we do not wish to make > numpy.linalg itself LGPLed. Indeed, that seems very cumbersome. I will ask the FSF about this, as this is definitely not something that we want to impose on Eigen users. Benoit > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > ?-- Umberto Eco > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From kwmsmith at gmail.com Sun Jan 17 15:42:15 2010 From: kwmsmith at gmail.com (Kurt Smith) Date: Sun, 17 Jan 2010 14:42:15 -0600 Subject: [Numpy-discussion] Waf or scons/numscons for a C/Fortran/Cython/Python project -- what's your recommendation? In-Reply-To: <5b8d13221001162004w640e4854obe27a78942bb551b@mail.gmail.com> References: <5b8d13221001162004w640e4854obe27a78942bb551b@mail.gmail.com> Message-ID: On Sat, Jan 16, 2010 at 10:04 PM, David Cournapeau wrote: > On Sun, Jan 17, 2010 at 4:12 AM, Kurt Smith wrote: >> My questions here concern those familiar with configure/build/install >> systems such as distutils, setuptools, scons/numscons or waf >> (particularly David Cournapeau). >> >> I'm creating a tool known as 'fwrap' that has a component that needs >> to do essentially what f2py does now -- take fortran source code and >> compile it into a python extension module. ?It uses Cython to create >> the extension module, and the current configure/build/install system >> is a very kludgy monkeypatched Cython.distutils and numpy.distutils >> setup.py script. ?The setup.py script works for testing on my system >> here, but for going prime time, I dread using it. ?David has made his >> critiques of distutils known for scientific software, and I agree. >> What's the best alternative? 
> > The best alternative in the short term is no alternative: making sure > everything you need is incorporated in numpy.distutils. Otherwise, you > will have to recreate everything that distutils is doing: you will > have people who will demand egg, mac os x .mpkg, windows installers, > etc.. Basically what I am trying to do with toydist now - I don't mind > getting help there, though :) If you ignore the installation phase and focus on just the configure/build phases (see below), what would you say then? How much work needs to go into the install step as compared to the configure/build steps for waf or scons? > > I promised to add decent cython support in numpy.distutils for 1.5.0, > maybe we should see what we can do for fwrap at the same time. > > I am also a bit unclear about what is needed exactly, and what would > be the workflow: I don't understand why fwrap should care about > packaging/deployment at all, for example. I should have emphasized that fwrap primarily needs a good configure/build system **for the projects that fwrap wraps.** The original post probably didn't make this clear. The workflow is this, assuming fwrap is installed appropriately on the platform: * fwrap is called on a bunch of fortran source files. * fwrap generates fortran wrappers, consisting of fortran source files and Cython source files. * If the user wants fwrap to create an extension module then and there: * The configure/build system (waf, scons, toydist, or a setup.py script) kicks in. It... * Configures the build, getting appropriate compilers, making sure Cython is installed, sorting out fortran <-> C type sizes, etc. * Builds the code appropriately, in the right order, etc. (requires fortran, Cython & C compilation and linking, with scanning and dependency injection). * Puts the extension *.so file in the current working directory by default. So the build tool's installation stage is much less important. It seems from these discussions that its the installation step that is particularly nasty. Fwrap would leave the installation of the extension module to the user. >> >> More specifically: what are the pros/cons between waf and >> scons/numscons for configure/build/install of a >> Fortran-C-Cython-Python project? > > waf has no fortran support whatsoever, so you would need to add it. > Waf codebase is much better than scons, but there lacks some internal > documentation. There were some things that I did not manage to do in > waf, because the internal API for scanning/dependency injection was > not very clear to me (by scanning I mean the ability to scan the > source code to look for dependency, e.g. fortran modules, and by > dependency injection, I mean adding new targets to the DAG of > dependencies at runtime - again needed for fortran modules). I'll likely need both scanning and Dependency Injection in fwrap. The complete lack of fortran support is an issue, although not a deal breaker. I like that waf is designed to be small, self-contained and not installed system-wide. I could distribute waf with fwrap and users wouldn't have to have yet another external dependency for fwrap to work. > > Basic handling of fortran compilation and fortran detection was > relatively easy in comparison. > > The biggest drawback I see with waf is the lack of users: the only > significant project I know which uses waf is Ardour. OTOH, I believe > scons has deep structural problems, and only a few people can change > some significant parts of the code. Yes, the lack of adoption is an issue. 
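For readers wondering what the "scanning" being discussed amounts to: the build tool has to discover, from the Fortran source itself, which modules each file uses, so that the file is recompiled whenever the files providing those modules change. The snippet below is a stripped-down sketch of that discovery step in plain Python; it is a hypothetical helper, not fwrap, waf or scons code, and a real scanner also has to cope with continuation lines, INCLUDE files and intrinsic modules.

import re

USE_RE = re.compile(r'^\s*use\s+(\w+)', re.IGNORECASE | re.MULTILINE)
MOD_RE = re.compile(r'^\s*module\s+(?!procedure\b)(\w+)', re.IGNORECASE | re.MULTILINE)

def fortran_deps(source_text):
    # Return (modules this file provides, modules it needs from other files).
    provided = set(name.lower() for name in MOD_RE.findall(source_text))
    used = set(name.lower() for name in USE_RE.findall(source_text))
    return provided, used - provided

The "dependency injection" David mentions is then the build tool taking that second set and adding the corresponding targets to its dependency graph at run time, for instance so that a generated .mod file becomes a prerequisite of every file that uses it.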
Would scons' structural problems affect a project like fwrap, in your view? You have a fair amount of experience dealing with Fortran/C hybrid programming -- is scons still flexible enough to handle it? > >> >> Is scons capable of handling the configure and install stages, or is >> it only a build system? ?As I understand it, numscons is called from >> distutils; distutils handles the configure/install stages. > > Distutils only handles the installation - everything done within the > build_* command is done by the scons distutils command, and > configuration is done by scons as well. Scons configure mechanism is > very primitive - it uses a separate framework than the rest of the > tool, which means in particular that scons top notch dependency > handling does not work well for the configuration stage. waf is much > better in that aspect. > Good to know. Perhaps scons would require a fair amount of work to get configuration working well, while waf would require fortran work in the configure/build stages? >> Scons/numscons have more fortran support that waf, from what I can >> see. ?The main downside of using scons is that I'd still have to mess >> around with distutils. > > My main point should be this: whatever you do, you will end up messing > with distutils, unless you reimplement everything that distutils does, > be it waf, scons, etc... In the short term, adding things to > numpy.distutils is the easiest path. > > Long term, I hope toydist will be a tool which will enable exactly > what you want: using a build system of your choice, and being able to > reuse existing code for installation/packaging to avoid recreating it > yourself. You will be able to create an exe/egg/pkg from a simple > package representation, every package will have a common interface for > the user independently of the internal build tool, etc... So is it correct to say that toydist hands off the build to an external tool, and takes care of the installation/packaging itself? Since the installation is the least essential part of fwrap and I need a robust build tool, waf/scons appear to be the best options over numpy.distutils, even though they will require a good chunk of work. Its clear I need to test-drive waf and scons to see how they compare with each other and my existing numpy.distutils and cython.distutils conglomeration. Thank you very much for the input, very informative. Kurt From totalbull at mac.com Sun Jan 17 18:43:42 2010 From: totalbull at mac.com (totalbull at mac.com) Date: Sun, 17 Jan 2010 23:43:42 +0000 Subject: [Numpy-discussion] Pandas LongPanel/WidePanel for 3d timeseries? In-Reply-To: <6c476c8a1001041921l3b5e1c7uf5b42901c789cfb0@mail.gmail.com> References: <15DD68D1-B0E3-43BF-BC8A-784070842049@mac.com> <3E60CDFB-F0C3-4E80-8A61-C0C0E70F3C04@mac.com> <4B29327A.9020201@enthought.com> <2689639429985454905@unknownmsgid> <1215039009384389849@unknownmsgid> <0254E4B5-4740-4ECC-86FA-A3E1358E15FD@mac.com> <6c476c8a1001041921l3b5e1c7uf5b42901c789cfb0@mail.gmail.com> Message-ID: Hello, I am successfully using the new Pandas library for series and matrix analysis using 2 dimensional arrays. I am using the "fromDict" method which works well, and I am creating 2-dimensional arrays where each axis is indexed by FX currency names. 
So for example pandas DataFrame called aa: >>> aa.columns Index([AUDUSD, EURCHF, EURCZK, EURHUF, EURPLN, EURSEK, EURUSD, GBPUSD, NZDUSD, USDBRL, USDCAD, USDCLP, USDILS, USDJPY, USDKRW, USDMXN, USDRUB, USDSGD, USDTRY, USDTWD, USDZAR], dtype=object) >>> aa.rows Index: 21 entries, AUDUSD to USDZAR Data columns: AUDUSD 20 non-null values EURCHF 20 non-null values EURCZK 20 non-null values EURHUF 20 non-null values EURPLN 20 non-null values EURSEK 20 non-null values EURUSD 20 non-null values GBPUSD 20 non-null values NZDUSD 20 non-null values USDBRL 20 non-null values USDCAD 20 non-null values USDCLP 20 non-null values USDILS 20 non-null values USDJPY 20 non-null values USDKRW 20 non-null values USDMXN 20 non-null values USDRUB 20 non-null values USDSGD 20 non-null values USDTRY 20 non-null values USDTWD 20 non-null values USDZAR 20 non-null values > >>> aa['USDZAR']['USDTWD'] 1.2711725043942563 >>> (each cell contains the number of standard errors of today's prices in the linear regression of the two currency pairs). Now I want to create a 3 dimensional stack of these aa-style matrices, where the z axis is indexed by historical dates. IE one matrix for each date, from today, going back 2 years. What is the best pandas function for doing this? Is it pandas.WidePanel or pandas.LongPanel, and how do I use these functions to construct this 3d stack? (I would ideally like to append each 2d matrix to the 3d stack as I create each one). Thanks for the help...... unfortunately can't find this in the online docs. Tom From totalbull at mac.com Sun Jan 17 18:49:17 2010 From: totalbull at mac.com (totalbull at mac.com) Date: Sun, 17 Jan 2010 23:49:17 +0000 Subject: [Numpy-discussion] Pandas LongPanel/WidePanel for 3d timeseries? In-Reply-To: References: <15DD68D1-B0E3-43BF-BC8A-784070842049@mac.com> <3E60CDFB-F0C3-4E80-8A61-C0C0E70F3C04@mac.com> <4B29327A.9020201@enthought.com> <2689639429985454905@unknownmsgid> <1215039009384389849@unknownmsgid> <0254E4B5-4740-4ECC-86FA-A3E1358E15FD@mac.com> <6c476c8a1001041921l3b5e1c7uf5b42901c789cfb0@mail.gmail.com> Message-ID: Apologies - too quick to ask the list without thoroughly checking the online docs. I have found the answer (fromDict method takes DataFrame objects): http://pandas.sourceforge.net/generated/pandas.WidePanel.html#pandas.WidePanel Still would like to know how to append 2d matrices one-by-one though. On 17 Jan 2010, at 23:43, totalbull at mac.com wrote: > Hello, > > I am successfully using the new Pandas library for series and matrix analysis using 2 dimensional arrays. I am using the "fromDict" method which works well, and I am creating 2-dimensional arrays where each axis is indexed by FX currency names. 
So for example pandas DataFrame called aa: > >>>> aa.columns > Index([AUDUSD, EURCHF, EURCZK, EURHUF, EURPLN, EURSEK, EURUSD, GBPUSD, > NZDUSD, USDBRL, USDCAD, USDCLP, USDILS, USDJPY, USDKRW, USDMXN, > USDRUB, USDSGD, USDTRY, USDTWD, USDZAR], dtype=object) >>>> aa.rows > > Index: 21 entries, AUDUSD to USDZAR > Data columns: > AUDUSD 20 non-null values > EURCHF 20 non-null values > EURCZK 20 non-null values > EURHUF 20 non-null values > EURPLN 20 non-null values > EURSEK 20 non-null values > EURUSD 20 non-null values > GBPUSD 20 non-null values > NZDUSD 20 non-null values > USDBRL 20 non-null values > USDCAD 20 non-null values > USDCLP 20 non-null values > USDILS 20 non-null values > USDJPY 20 non-null values > USDKRW 20 non-null values > USDMXN 20 non-null values > USDRUB 20 non-null values > USDSGD 20 non-null values > USDTRY 20 non-null values > USDTWD 20 non-null values > USDZAR 20 non-null values >> >>>> aa['USDZAR']['USDTWD'] > 1.2711725043942563 >>>> > > (each cell contains the number of standard errors of today's prices in the linear regression of the two currency pairs). Now I want to create a 3 dimensional stack of these aa-style matrices, where the z axis is indexed by historical dates. IE one matrix for each date, from today, going back 2 years. What is the best pandas function for doing this? Is it pandas.WidePanel or pandas.LongPanel, and how do I use these functions to construct this 3d stack? (I would ideally like to append each 2d matrix to the 3d stack as I create each one). > > Thanks for the help...... unfortunately can't find this in the online docs. > > Tom > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From kwgoodman at gmail.com Sun Jan 17 21:21:43 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Sun, 17 Jan 2010 18:21:43 -0800 Subject: [Numpy-discussion] Pandas LongPanel/WidePanel for 3d timeseries? In-Reply-To: References: <15DD68D1-B0E3-43BF-BC8A-784070842049@mac.com> <2689639429985454905@unknownmsgid> <1215039009384389849@unknownmsgid> <0254E4B5-4740-4ECC-86FA-A3E1358E15FD@mac.com> <6c476c8a1001041921l3b5e1c7uf5b42901c789cfb0@mail.gmail.com> Message-ID: On Sun, Jan 17, 2010 at 3:49 PM, wrote: > Apologies - too quick to ask the list without thoroughly checking the online docs. I have found the answer (fromDict method takes DataFrame objects): > > http://pandas.sourceforge.net/generated/pandas.WidePanel.html#pandas.WidePanel > > Still would like to know how to append 2d matrices one-by-one though. Here's the best place to ask questions about pandas: http://groups.google.ca/group/pystatsmodels From wesmckinn at gmail.com Sun Jan 17 21:45:50 2010 From: wesmckinn at gmail.com (Wes McKinney) Date: Sun, 17 Jan 2010 21:45:50 -0500 Subject: [Numpy-discussion] Pandas LongPanel/WidePanel for 3d timeseries? In-Reply-To: References: <15DD68D1-B0E3-43BF-BC8A-784070842049@mac.com> <2689639429985454905@unknownmsgid> <1215039009384389849@unknownmsgid> <0254E4B5-4740-4ECC-86FA-A3E1358E15FD@mac.com> <6c476c8a1001041921l3b5e1c7uf5b42901c789cfb0@mail.gmail.com> Message-ID: <-5358252572861183867@unknownmsgid> On Jan 17, 2010, at 9:21 PM, Keith Goodman wrote: > On Sun, Jan 17, 2010 at 3:49 PM, wrote: >> Apologies - too quick to ask the list without thoroughly checking >> the online docs. 
I have found the answer (fromDict method takes >> DataFrame objects): >> >> http://pandas.sourceforge.net/generated/pandas.WidePanel.html#pandas.WidePanel >> >> Still would like to know how to append 2d matrices one-by-one though. > > Here's the best place to ask questions about pandas: > > http://groups.google.ca/group/pystatsmodels > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion pystatsmodels is the best place to ask questions. You will want to use WidePanel.fromDict. Additional items can be added easily like so: wp[x] = frame I apologize for the lack of documentation-- that should change in the next month or so. From lescroar at usc.edu Sun Jan 17 23:33:21 2010 From: lescroar at usc.edu (Mark Lescroart) Date: Sun, 17 Jan 2010 20:33:21 -0800 Subject: [Numpy-discussion] Numpy.dot segmentation fault Message-ID: Hello, I've encountered a segfault in numpy when trying to compute a dot product for two arrays - see code below. The problem only seems to occur when the arrays reach a certain size. I'm using Numpy version 1.3.0, installed via macports, on a 2.33 GHz Intel Core2 Duo Macbook Pro running Leopard (OSX 10.5.8). I've posted this as a bug on the numpy page, where I was told it might have to do with my ATLAS installation (version 3.8.3_1, also installed via macports). Has anyone run into anything like this before? Cheers, Mark Example code: import numpy as N print 'Demonstration of Numpy Bug:' print 'loading X (random numbers)' SzList = [10,20,30,40,50,60,70,80,90,100] for Sz in SzList: print 'X size = %d,%d'%(300,Sz) X = N.random.rand(300,Sz) Y = N.random.rand(300,3) print 'Attempting dot product of X and Y' N.dot(X.T,Y) print 'Finished without bug.' Result (run through gdb): (There were a number of warnings like this - so many that they went off the top of the screen and I couldn't copy them all. This was typical of the warnings.) Reading symbols for shared libraries warning: Could not find object file "/opt/local/var/macports/ build_opt_local_var_macports_sources_rsync .macports.org_release_ports_lang_python26/work/Python-2.6.4/build/ temp.macosx-10.5-i386-2.6/opt/local/var/macports/ build_opt_local_var_macports_sources_rsync.macports.org_release _ports_lang_python26/work/Python-2.6.4/Modules/_collectionsmodule.o" - no debug information available for "/opt/local/var/macports/build/ _opt_local_var_macports_sources_rsync .macports.org_release_ports_lang_python26/work/Python-2.6.4/Modules/ _collectionsmodule.c". . done Demonstration of Numpy Bug: loading X (random numbers) X size = 300,10 Attempting dot product of X and Y Finished without bug. X size = 300,20 Attempting dot product of X and Y Finished without bug. X size = 300,30 Attempting dot product of X and Y Finished without bug. X size = 300,40 Attempting dot product of X and Y Finished without bug. X size = 300,50 Attempting dot product of X and Y Finished without bug. X size = 300,60 Attempting dot product of X and Y Program received signal EXC_BAD_ACCESS, Could not access memory. Reason: 13 at address: 0x00000000 [Switching to process 56005 thread 0x117] 0x01038884 in ATL_dupMBmm0_2_0_b0 () -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From david at silveregg.co.jp Sun Jan 17 23:38:25 2010 From: david at silveregg.co.jp (David Cournapeau) Date: Mon, 18 Jan 2010 13:38:25 +0900 Subject: [Numpy-discussion] Numpy.dot segmentation fault In-Reply-To: References: Message-ID: <4B53E5C1.4070801@silveregg.co.jp> Mark Lescroart wrote: > Hello, > > I've encountered a segfault in numpy when trying to compute a dot > product for two arrays - see code below. The problem only seems to occur > when the arrays reach a certain size. Your atlas is most likely broken. You will have to double-check how you built it, and maybe run the whole test suite (as indicated in the ATLAS installation notes). Note that you can use the Accelerate framework on mac os x, this is much easier to get numpy working on mac, cheers, David From thomas.robitaille at gmail.com Mon Jan 18 08:39:46 2010 From: thomas.robitaille at gmail.com (Thomas Robitaille) Date: Mon, 18 Jan 2010 05:39:46 -0800 (PST) Subject: [Numpy-discussion] Structured array sorting In-Reply-To: <4B535480.3010406@enthought.com> References: <4B535480.3010406@enthought.com> Message-ID: <27210615.post@talk.nabble.com> Warren Weckesser-3 wrote: > > Looks like 'sort' is not handling endianess of the column data > correctly. If you change the type of the floating point data to ' the sort works. > Thanks for identifying the issue - should I submit a bug report? Thomas -- View this message in context: http://old.nabble.com/Structured-array-sorting-tp27200785p27210615.html Sent from the Numpy-discussion mailing list archive at Nabble.com. From charlesr.harris at gmail.com Mon Jan 18 09:27:16 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 18 Jan 2010 07:27:16 -0700 Subject: [Numpy-discussion] Structured array sorting In-Reply-To: <27210615.post@talk.nabble.com> References: <4B535480.3010406@enthought.com> <27210615.post@talk.nabble.com> Message-ID: On Mon, Jan 18, 2010 at 6:39 AM, Thomas Robitaille < thomas.robitaille at gmail.com> wrote: > > > Warren Weckesser-3 wrote: > > > > Looks like 'sort' is not handling endianess of the column data > > correctly. If you change the type of the floating point data to ' > the sort works. > > > > Thanks for identifying the issue - should I submit a bug report? > > Yes. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From jacob.benoit.1 at gmail.com Mon Jan 18 10:35:51 2010 From: jacob.benoit.1 at gmail.com (Benoit Jacob) Date: Mon, 18 Jan 2010 10:35:51 -0500 Subject: [Numpy-discussion] performance matrix multiplication vs. matlab In-Reply-To: References: <5b8d13221001162158l471dbd4k89230c9209f49b6e@mail.gmail.com> <3d375d731001170936y38dd7538l3cb1ccece61e0ae2@mail.gmail.com> <3d375d731001171043s2e1f90e1t77d6923eb3a6f4cd@mail.gmail.com> <3d375d731001171157n1ee58b36ld8c5bca3fa2088f7@mail.gmail.com> Message-ID: 2010/1/17 Benoit Jacob : > 2010/1/17 Robert Kern : >> On Sun, Jan 17, 2010 at 13:18, Benoit Jacob wrote: >>> 2010/1/17 Robert Kern : >>>> On Sun, Jan 17, 2010 at 12:11, Benoit Jacob wrote: >>>>> 2010/1/17 Robert Kern : >>>>>> On Sun, Jan 17, 2010 at 08:52, Benoit Jacob wrote: >>>>>>> 2010/1/17 David Cournapeau : >>>>>> >>>>>>>> There are several issues with eigen2 for NumPy usage: >>>>>>>> ?- using it as a default implementation does not make much sense IMHO, >>>>>>>> as it would make distributed binaries non 100 % BSD. >>>>>>> >>>>>>> But the LGPL doesn't impose restrictions on the usage of binaries, so >>>>>>> how does it matter? 
The LGPL and the BSD licenses are similar as far >>>>>>> as the binaries are concerned (unless perhaps one starts disassembling >>>>>>> them). >>>>>>> >>>>>>> The big difference between LGPL and BSD is at the level of source >>>>>>> code, not binary code: one modifies LGPL-based source code and >>>>>>> distributes a binary form of it, then one has to release the modified >>>>>>> source code as well. >>>>>> >>>>>> This is not true. Binaries that contain LGPLed code must be able to be >>>>>> relinked with a modified version of the LGPLed component. >>>>> >>>>> This doesn't apply to Eigen which is a header-only pure template >>>>> library, hence can't be 'linked' to. >>>>> >>>>> Actually you seem to be referring to Section 4 of the LGPL3, we have >>>>> already asked the FSF about this and their reply was that it just >>>>> doesn't apply in the case of Eigen: >>>>> >>>>> http://listengine.tuxfamily.org/lists.tuxfamily.org/eigen/2009/01/msg00083.html >>>>> >>>>> In your case, what matters is Section 5. >>>> >>>> You mean Section 3. Good. >>> >>> Section 3 is for using Eigen directly in a C++ program, yes, but I got >>> a bit ahead of myself there: see below >>> >>>> I admit to being less up on the details of >>>> LGPLv3 than I was of LGPLv2 which had a problem with C++ header >>>> templates. >>> >>> Indeed, it did, that's why we don't use it. >>> >>>> >>>> That said, we will not be using the C++ templates directly in numpy >>>> for technical reasons (not least that we do not want to require a C++ >>>> compiler for the default build). At best, we would be using a BLAS >>>> interface which requires linking of objects, not just header >>>> templates. That *would* impose the Section 4 requirements. >>> >>> ... or rather Section 5: that is what I was having in mind: >>> ?" 5. Combined Libraries. " >>> >>> I have to admit that I don't understand what 5.a) means. >> >> I don't think it applies. Let's say I write some routines that use an >> LGPLed Library (let's call them Routines A). I can include those >> routines in a larger library with routines that do not use the LGPLed >> library (Routines B). The Routines B can be under whatever license you >> like. However, one must make a library containing only Routines A and >> the LGPLed Library and release that under the LGPLv3, distribute it >> along with the combined work, and give notice about how to obtain >> Routines A+Library separate from Routines B. Basically, it's another >> exception for needing to be able to relink object code in a particular >> technical use case. >> >> This cannot apply to numpy because we cannot break out numpy.linalg >> from the rest of numpy. Even if we could, we do not wish to make >> numpy.linalg itself LGPLed. > > Indeed, that seems very cumbersome. I will ask the FSF about this, as > this is definitely not something that we want to impose on Eigen > users. > Sorry for continuing the licensing noise on your list --- I though that now that I've started, I should let you know that I think I understand things more clearly now ;) First, Section 5 of the LGPL is horrible indeed, so let's forget about that. If you were using a LGPL-licensed binary library, Section 4 would rather be what you want. It would require you to: 4a) say somewhere ("prominently" is vague, the bottom of a README is OK) that you use the library 4b) distribute copies of the GPL and LGPL licenses text. Pointless, but not a big issue. 
the rest doesn't matter: 4c) not applicable to you 4d1) this is what you would be doing anyway 4e) not applicable to you Finally and this is the important point: you would not be passing any requirement to your own users. Indeed, the LGPL license, contrary to the GPL license, does not propagate through dependency chains. So if NumPy used a LGPL-licensed lib Foo, the conditions of the LGPL must be met when distributing NumPy, but NumPy itself isn't LGPL at all and an application using NumPy does not have to care at all about the LGPL. So there should be no concern at all of "passing on LGPL requirements to users" Again, IANAL. Benoit From robert.kern at gmail.com Mon Jan 18 11:04:15 2010 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 18 Jan 2010 10:04:15 -0600 Subject: [Numpy-discussion] performance matrix multiplication vs. matlab In-Reply-To: References: <5b8d13221001162158l471dbd4k89230c9209f49b6e@mail.gmail.com> <3d375d731001170936y38dd7538l3cb1ccece61e0ae2@mail.gmail.com> <3d375d731001171043s2e1f90e1t77d6923eb3a6f4cd@mail.gmail.com> <3d375d731001171157n1ee58b36ld8c5bca3fa2088f7@mail.gmail.com> Message-ID: <3d375d731001180804w7843c749ucf2b1f3fd46b6fba@mail.gmail.com> On Mon, Jan 18, 2010 at 09:35, Benoit Jacob wrote: > Sorry for continuing the licensing noise on your list --- I though > that now that I've started, I should let you know that I think I > understand things more clearly now ;) No worries. > First, Section 5 of the LGPL is horrible indeed, so let's forget about that. I don't think it's that horrible, honestly. It just applies to a different deployment use case and a different set of technologies. > If you were using a LGPL-licensed binary library, Section 4 would > rather be what you want. It would require you to: > ?4a) say somewhere ("prominently" is vague, the bottom of a README is > OK) that you use the library > ?4b) distribute copies of the GPL and LGPL licenses text. Pointless, > but not a big issue. > > the rest doesn't matter: > ?4c) not applicable to you > ?4d1) this is what you would be doing anyway Possibly, but shared libraries are not easy for a variety of boring, Python-specific, technical reasons. 4d0 would be easier for the official binaries (because we provide official source). But that would still force people building a proprietary application using numpy to rebuild a binary without Eigen2 or else make sure that they allow users to rebuild numpy. For a number of deployment options (py2app, py2exe, bbfreeze, etc.), this is annoying, particularly when combined with the 4e requirement, as I explain below. > ?4e) not applicable to you Yes, it is. The exception where Installation Information is not required is only when installation is impossible, such as embedded devices where the code is in a ROM chip. > Finally and this is the important point: you would not be passing any > requirement to your own users. Indeed, the LGPL license, contrary to > the GPL license, does not propagate through dependency chains. So if > NumPy used a LGPL-licensed lib Foo, the conditions of the LGPL must be > met when distributing NumPy, but NumPy itself isn't LGPL at all and an > application using NumPy does not have to care at all about the LGPL. > So there should be no concern at all of "passing on LGPL requirements > to users" No, not at all. The GPL "propagates" by requiring that the rest of the code be licensed compatibly with the GPL. This is an unusual and particular feature of the GPL. 
The LGPL does not require that rest of the code be licensed in a particular way. However, that doesn't mean that the license of the "outer layer" insulates the downstream user from the LGPL license of the wrapped component. It just means that there is BSD code and LGPL code in the total product. The downstream user must accept and deal with the licenses of *all* of the components simultaneously. This is how most licenses work. I think that the fact that the GPL is particularly "viral" may be obscuring the normal way that licenses work when combined with other licenses. If I had a proprietary application that used an LGPL library, and I gave my customers some limited rights to modify and resell my application, they would still be bound by the LGPL with respect to the library. They could not modify the LGPLed library and sell it under a proprietary license even if I allow them to do that with the application as a whole. For us to use Eigen2 in numpy such that our users could use, modify and redistribute numpy+Eigen2, in its entirety, under the terms of the BSD license, we would have to get permission from you to distribute Eigen2 under the BSD license. It's only polite. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From jacob.benoit.1 at gmail.com Mon Jan 18 11:26:01 2010 From: jacob.benoit.1 at gmail.com (Benoit Jacob) Date: Mon, 18 Jan 2010 11:26:01 -0500 Subject: [Numpy-discussion] performance matrix multiplication vs. matlab In-Reply-To: <3d375d731001180804w7843c749ucf2b1f3fd46b6fba@mail.gmail.com> References: <3d375d731001170936y38dd7538l3cb1ccece61e0ae2@mail.gmail.com> <3d375d731001171043s2e1f90e1t77d6923eb3a6f4cd@mail.gmail.com> <3d375d731001171157n1ee58b36ld8c5bca3fa2088f7@mail.gmail.com> <3d375d731001180804w7843c749ucf2b1f3fd46b6fba@mail.gmail.com> Message-ID: 2010/1/18 Robert Kern : > On Mon, Jan 18, 2010 at 09:35, Benoit Jacob wrote: > >> Sorry for continuing the licensing noise on your list --- I though >> that now that I've started, I should let you know that I think I >> understand things more clearly now ;) > > No worries. > >> First, Section 5 of the LGPL is horrible indeed, so let's forget about that. > > I don't think it's that horrible, honestly. It just applies to a > different deployment use case and a different set of technologies. > >> If you were using a LGPL-licensed binary library, Section 4 would >> rather be what you want. It would require you to: >> ?4a) say somewhere ("prominently" is vague, the bottom of a README is >> OK) that you use the library >> ?4b) distribute copies of the GPL and LGPL licenses text. Pointless, >> but not a big issue. >> >> the rest doesn't matter: >> ?4c) not applicable to you >> ?4d1) this is what you would be doing anyway > > Possibly, but shared libraries are not easy for a variety of boring, > Python-specific, technical reasons. Ah, that I didn't know. >> ?4e) not applicable to you > > Yes, it is. The exception where Installation Information is not > required is only when installation is impossible, such as embedded > devices where the code is in a ROM chip. OK, I didn't understand that. > >> Finally and this is the important point: you would not be passing any >> requirement to your own users. Indeed, the LGPL license, contrary to >> the GPL license, does not propagate through dependency chains. 
So if >> NumPy used a LGPL-licensed lib Foo, the conditions of the LGPL must be >> met when distributing NumPy, but NumPy itself isn't LGPL at all and an >> application using NumPy does not have to care at all about the LGPL. >> So there should be no concern at all of "passing on LGPL requirements >> to users" > > No, not at all. The GPL "propagates" by requiring that the rest of the > code be licensed compatibly with the GPL. This is an unusual and > particular feature of the GPL. The LGPL does not require that rest of > the code be licensed in a particular way. However, that doesn't mean > that the license of the "outer layer" insulates the downstream user > from the LGPL license of the wrapped component. It just means that > there is BSD code and LGPL code in the total product. The downstream > user must accept and deal with the licenses of *all* of the components > simultaneously. This is how most licenses work. I think that the fact > that the GPL is particularly "viral" may be obscuring the normal way > that licenses work when combined with other licenses. > > If I had a proprietary application that used an LGPL library, and I > gave my customers some limited rights to modify and resell my > application, they would still be bound by the LGPL with respect to the > library. They could not modify the LGPLed library and sell it under a > proprietary license even if I allow them to do that with the > application as a whole. For us to use Eigen2 in numpy such that our > users could use, modify and redistribute numpy+Eigen2, in its > entirety, under the terms of the BSD license, we would have to get > permission from you to distribute Eigen2 under the BSD license. It's > only polite. OK, so the Eigen code inside of NumPy would still be protected by the LGPL. But what I meant when I said that the LGPL requirements don't propagate to your users, was that, for example, they don't have to distribute copies of the LGPL text, installation information for Eigen, or links to Eigen's website. The only requirement, if I understand well, is that _if_ a NumPy user wanted to make modifications to Eigen itself, he would have to conform to the LGPL requirements about sharing the modified source code. But is it really a requirement of NumPy that all its dependencies must be free to modify without redistributing the modified source code? Don't you use MKL, for which the source code is not available at all? I am not sure that I understand how that is better than having source code subject to LGPL requirements. Benoit From robert.kern at gmail.com Mon Jan 18 11:37:23 2010 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 18 Jan 2010 10:37:23 -0600 Subject: [Numpy-discussion] performance matrix multiplication vs. matlab In-Reply-To: References: <3d375d731001170936y38dd7538l3cb1ccece61e0ae2@mail.gmail.com> <3d375d731001171043s2e1f90e1t77d6923eb3a6f4cd@mail.gmail.com> <3d375d731001171157n1ee58b36ld8c5bca3fa2088f7@mail.gmail.com> <3d375d731001180804w7843c749ucf2b1f3fd46b6fba@mail.gmail.com> Message-ID: <3d375d731001180837m7ad230aie127acab4abfe184@mail.gmail.com> On Mon, Jan 18, 2010 at 10:26, Benoit Jacob wrote: > 2010/1/18 Robert Kern : >> On Mon, Jan 18, 2010 at 09:35, Benoit Jacob wrote: >> >>> Sorry for continuing the licensing noise on your list --- I though >>> that now that I've started, I should let you know that I think I >>> understand things more clearly now ;) >> >> No worries. >> >>> First, Section 5 of the LGPL is horrible indeed, so let's forget about that. 
>> >> I don't think it's that horrible, honestly. It just applies to a >> different deployment use case and a different set of technologies. >> >>> If you were using a LGPL-licensed binary library, Section 4 would >>> rather be what you want. It would require you to: >>> ?4a) say somewhere ("prominently" is vague, the bottom of a README is >>> OK) that you use the library >>> ?4b) distribute copies of the GPL and LGPL licenses text. Pointless, >>> but not a big issue. >>> >>> the rest doesn't matter: >>> ?4c) not applicable to you >>> ?4d1) this is what you would be doing anyway >> >> Possibly, but shared libraries are not easy for a variety of boring, >> Python-specific, technical reasons. > > Ah, that I didn't know. > >>> ?4e) not applicable to you >> >> Yes, it is. The exception where Installation Information is not >> required is only when installation is impossible, such as embedded >> devices where the code is in a ROM chip. > > OK, I didn't understand that. > >> >>> Finally and this is the important point: you would not be passing any >>> requirement to your own users. Indeed, the LGPL license, contrary to >>> the GPL license, does not propagate through dependency chains. So if >>> NumPy used a LGPL-licensed lib Foo, the conditions of the LGPL must be >>> met when distributing NumPy, but NumPy itself isn't LGPL at all and an >>> application using NumPy does not have to care at all about the LGPL. >>> So there should be no concern at all of "passing on LGPL requirements >>> to users" >> >> No, not at all. The GPL "propagates" by requiring that the rest of the >> code be licensed compatibly with the GPL. This is an unusual and >> particular feature of the GPL. The LGPL does not require that rest of >> the code be licensed in a particular way. However, that doesn't mean >> that the license of the "outer layer" insulates the downstream user >> from the LGPL license of the wrapped component. It just means that >> there is BSD code and LGPL code in the total product. The downstream >> user must accept and deal with the licenses of *all* of the components >> simultaneously. This is how most licenses work. I think that the fact >> that the GPL is particularly "viral" may be obscuring the normal way >> that licenses work when combined with other licenses. >> >> If I had a proprietary application that used an LGPL library, and I >> gave my customers some limited rights to modify and resell my >> application, they would still be bound by the LGPL with respect to the >> library. They could not modify the LGPLed library and sell it under a >> proprietary license even if I allow them to do that with the >> application as a whole. For us to use Eigen2 in numpy such that our >> users could use, modify and redistribute numpy+Eigen2, in its >> entirety, under the terms of the BSD license, we would have to get >> permission from you to distribute Eigen2 under the BSD license. It's >> only polite. > > OK, so the Eigen code inside of NumPy would still be protected by the > LGPL. But what I meant when I said that the LGPL requirements don't > propagate to your users, was that, for example, they don't have to > distribute copies of the LGPL text, installation information for > Eigen, or links to Eigen's website. Yes, they do. They are redistributing Eigen; they must abide by its license in all respects. It doesn't matter how much it is wrapped. 
> The only requirement, if I understand well, is that _if_ a NumPy user > wanted to make modifications to ?Eigen itself, he would have to > conform to the LGPL requirements about sharing the modified source > code. > > But is it really a requirement of NumPy that all its dependencies must > be free to modify without redistributing the modified source code? For the default build and the official binaries, yes. > Don't you use MKL, for which the source code is not available at all? No, we don't. It is a build option. If you were to provide a BLAS interface to Eigen, Eigen would be another option. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From jacob.benoit.1 at gmail.com Mon Jan 18 11:46:02 2010 From: jacob.benoit.1 at gmail.com (Benoit Jacob) Date: Mon, 18 Jan 2010 11:46:02 -0500 Subject: [Numpy-discussion] performance matrix multiplication vs. matlab In-Reply-To: <3d375d731001180837m7ad230aie127acab4abfe184@mail.gmail.com> References: <3d375d731001171043s2e1f90e1t77d6923eb3a6f4cd@mail.gmail.com> <3d375d731001171157n1ee58b36ld8c5bca3fa2088f7@mail.gmail.com> <3d375d731001180804w7843c749ucf2b1f3fd46b6fba@mail.gmail.com> <3d375d731001180837m7ad230aie127acab4abfe184@mail.gmail.com> Message-ID: 2010/1/18 Robert Kern : > On Mon, Jan 18, 2010 at 10:26, Benoit Jacob wrote: >> 2010/1/18 Robert Kern : >>> On Mon, Jan 18, 2010 at 09:35, Benoit Jacob wrote: >>> >>>> Sorry for continuing the licensing noise on your list --- I though >>>> that now that I've started, I should let you know that I think I >>>> understand things more clearly now ;) >>> >>> No worries. >>> >>>> First, Section 5 of the LGPL is horrible indeed, so let's forget about that. >>> >>> I don't think it's that horrible, honestly. It just applies to a >>> different deployment use case and a different set of technologies. >>> >>>> If you were using a LGPL-licensed binary library, Section 4 would >>>> rather be what you want. It would require you to: >>>> ?4a) say somewhere ("prominently" is vague, the bottom of a README is >>>> OK) that you use the library >>>> ?4b) distribute copies of the GPL and LGPL licenses text. Pointless, >>>> but not a big issue. >>>> >>>> the rest doesn't matter: >>>> ?4c) not applicable to you >>>> ?4d1) this is what you would be doing anyway >>> >>> Possibly, but shared libraries are not easy for a variety of boring, >>> Python-specific, technical reasons. >> >> Ah, that I didn't know. >> >>>> ?4e) not applicable to you >>> >>> Yes, it is. The exception where Installation Information is not >>> required is only when installation is impossible, such as embedded >>> devices where the code is in a ROM chip. >> >> OK, I didn't understand that. >> >>> >>>> Finally and this is the important point: you would not be passing any >>>> requirement to your own users. Indeed, the LGPL license, contrary to >>>> the GPL license, does not propagate through dependency chains. So if >>>> NumPy used a LGPL-licensed lib Foo, the conditions of the LGPL must be >>>> met when distributing NumPy, but NumPy itself isn't LGPL at all and an >>>> application using NumPy does not have to care at all about the LGPL. >>>> So there should be no concern at all of "passing on LGPL requirements >>>> to users" >>> >>> No, not at all. The GPL "propagates" by requiring that the rest of the >>> code be licensed compatibly with the GPL. 
This is an unusual and >>> particular feature of the GPL. The LGPL does not require that rest of >>> the code be licensed in a particular way. However, that doesn't mean >>> that the license of the "outer layer" insulates the downstream user >>> from the LGPL license of the wrapped component. It just means that >>> there is BSD code and LGPL code in the total product. The downstream >>> user must accept and deal with the licenses of *all* of the components >>> simultaneously. This is how most licenses work. I think that the fact >>> that the GPL is particularly "viral" may be obscuring the normal way >>> that licenses work when combined with other licenses. >>> >>> If I had a proprietary application that used an LGPL library, and I >>> gave my customers some limited rights to modify and resell my >>> application, they would still be bound by the LGPL with respect to the >>> library. They could not modify the LGPLed library and sell it under a >>> proprietary license even if I allow them to do that with the >>> application as a whole. For us to use Eigen2 in numpy such that our >>> users could use, modify and redistribute numpy+Eigen2, in its >>> entirety, under the terms of the BSD license, we would have to get >>> permission from you to distribute Eigen2 under the BSD license. It's >>> only polite. >> >> OK, so the Eigen code inside of NumPy would still be protected by the >> LGPL. But what I meant when I said that the LGPL requirements don't >> propagate to your users, was that, for example, they don't have to >> distribute copies of the LGPL text, installation information for >> Eigen, or links to Eigen's website. > > Yes, they do. They are redistributing Eigen; they must abide by its > license in all respects. It doesn't matter how much it is wrapped. Well this is where I'm not sure if I agree, I am asking the FSF right now as, if this were the case, I too would find such a clause very inconvenient for users. > >> The only requirement, if I understand well, is that _if_ a NumPy user >> wanted to make modifications to ?Eigen itself, he would have to >> conform to the LGPL requirements about sharing the modified source >> code. >> >> But is it really a requirement of NumPy that all its dependencies must >> be free to modify without redistributing the modified source code? > > For the default build and the official binaries, yes. OK. > >> Don't you use MKL, for which the source code is not available at all? > > No, we don't. It is a build option. If you were to provide a BLAS > interface to Eigen, Eigen would be another option. OK, then I guess that this is what will happen once we release the BLAS library. Thanks for your patience Benoit From amenity at enthought.com Mon Jan 18 11:55:20 2010 From: amenity at enthought.com (Amenity Applewhite) Date: Mon, 18 Jan 2010 10:55:20 -0600 Subject: [Numpy-discussion] EPD 6.0 and IPython Webinar Friday References: <0AE0D056-D7BB-498B-A14D-AAF9A90ED8F2@enthought.com> Message-ID: <94400778-AE3F-46A4-8B92-C86CB6DAD95A@enthought.com> Email not displaying correctly? View it in your browser. Happy 2010! To start the year off, we've released a new version of EPD and lined up a solid set of training options. Scientific Computing with Python Webinar This Friday, Travis Oliphant will then provide an introduction to multiprocessing and iPython.kernal. 
Scientific Computing with Python Webinar Multiprocessing and iPython.kernal Friday, January 22: 1pm CST/7pm UTC Register Enthought Live Training Enthought's intensive training courses are offered in 3-5 day sessions. The Python skills you'll acquire will save you and your organization time and money in 2010. Enthought Open Course February 22-26, Austin, TX ? Python for Scientists and Engineers ? Interfacing with C / C++ and Fortran ? Introduction to UIs and Visualization Enjoy! The Enthought Team EPD 6.0 Released Now available in our repository, EPD 6.0 includes Python 2.6, PiCloud's cloud library, and NumPy 1.4... Not to mention 64-bit support for Windows, OSX, and Linux. Details. Download now. New: Enthought channel on YouTube Short instructional videos straight from the desktops of our developers. Get started with a 4-part series on interpolation with SciPy. Our mailing address is: Enthought, Inc. 515 Congress Ave. Austin, TX 78701 Copyright (C) 2009 Enthought, Inc. All rights reserved. Forward this email to a friend -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsouthey at gmail.com Mon Jan 18 12:12:03 2010 From: bsouthey at gmail.com (Bruce Southey) Date: Mon, 18 Jan 2010 11:12:03 -0600 Subject: [Numpy-discussion] performance matrix multiplication vs. matlab In-Reply-To: References: <3d375d731001171043s2e1f90e1t77d6923eb3a6f4cd@mail.gmail.com> <3d375d731001171157n1ee58b36ld8c5bca3fa2088f7@mail.gmail.com> <3d375d731001180804w7843c749ucf2b1f3fd46b6fba@mail.gmail.com> <3d375d731001180837m7ad230aie127acab4abfe184@mail.gmail.com> Message-ID: <4B549663.5020404@gmail.com> On 01/18/2010 10:46 AM, Benoit Jacob wrote: > 2010/1/18 Robert Kern: > >> On Mon, Jan 18, 2010 at 10:26, Benoit Jacob wrote: >> >>> 2010/1/18 Robert Kern: >>> >>>> On Mon, Jan 18, 2010 at 09:35, Benoit Jacob wrote: >>>> >>>> >>>>> Sorry for continuing the licensing noise on your list --- I though >>>>> that now that I've started, I should let you know that I think I >>>>> understand things more clearly now ;) >>>>> >>>> No worries. >>>> >>>> >>>>> First, Section 5 of the LGPL is horrible indeed, so let's forget about that. >>>>> >>>> I don't think it's that horrible, honestly. It just applies to a >>>> different deployment use case and a different set of technologies. >>>> >>>> >>>>> If you were using a LGPL-licensed binary library, Section 4 would >>>>> rather be what you want. It would require you to: >>>>> 4a) say somewhere ("prominently" is vague, the bottom of a README is >>>>> OK) that you use the library >>>>> 4b) distribute copies of the GPL and LGPL licenses text. Pointless, >>>>> but not a big issue. >>>>> >>>>> the rest doesn't matter: >>>>> 4c) not applicable to you >>>>> 4d1) this is what you would be doing anyway >>>>> >>>> Possibly, but shared libraries are not easy for a variety of boring, >>>> Python-specific, technical reasons. >>>> >>> Ah, that I didn't know. >>> >>> >>>>> 4e) not applicable to you >>>>> >>>> Yes, it is. The exception where Installation Information is not >>>> required is only when installation is impossible, such as embedded >>>> devices where the code is in a ROM chip. >>>> >>> OK, I didn't understand that. >>> >>> >>>> >>>>> Finally and this is the important point: you would not be passing any >>>>> requirement to your own users. Indeed, the LGPL license, contrary to >>>>> the GPL license, does not propagate through dependency chains. 
So if >>>>> NumPy used a LGPL-licensed lib Foo, the conditions of the LGPL must be >>>>> met when distributing NumPy, but NumPy itself isn't LGPL at all and an >>>>> application using NumPy does not have to care at all about the LGPL. >>>>> So there should be no concern at all of "passing on LGPL requirements >>>>> to users" >>>>> >>>> No, not at all. The GPL "propagates" by requiring that the rest of the >>>> code be licensed compatibly with the GPL. This is an unusual and >>>> particular feature of the GPL. The LGPL does not require that rest of >>>> the code be licensed in a particular way. However, that doesn't mean >>>> that the license of the "outer layer" insulates the downstream user >>>> from the LGPL license of the wrapped component. It just means that >>>> there is BSD code and LGPL code in the total product. The downstream >>>> user must accept and deal with the licenses of *all* of the components >>>> simultaneously. This is how most licenses work. I think that the fact >>>> that the GPL is particularly "viral" may be obscuring the normal way >>>> that licenses work when combined with other licenses. >>>> >>>> If I had a proprietary application that used an LGPL library, and I >>>> gave my customers some limited rights to modify and resell my >>>> application, they would still be bound by the LGPL with respect to the >>>> library. They could not modify the LGPLed library and sell it under a >>>> proprietary license even if I allow them to do that with the >>>> application as a whole. For us to use Eigen2 in numpy such that our >>>> users could use, modify and redistribute numpy+Eigen2, in its >>>> entirety, under the terms of the BSD license, we would have to get >>>> permission from you to distribute Eigen2 under the BSD license. It's >>>> only polite. >>>> >>> OK, so the Eigen code inside of NumPy would still be protected by the >>> LGPL. But what I meant when I said that the LGPL requirements don't >>> propagate to your users, was that, for example, they don't have to >>> distribute copies of the LGPL text, installation information for >>> Eigen, or links to Eigen's website. >>> >> Yes, they do. They are redistributing Eigen; they must abide by its >> license in all respects. It doesn't matter how much it is wrapped. >> > Well this is where I'm not sure if I agree, I am asking the FSF right > now as, if this were the case, I too would find such a clause very > inconvenient for users. > > If you obtain the code from any package then you are bound by the terms of that code. So while a user might not be 'inconvenienced' by the LGPL, they are required to meet the terms as required. For some licenses (like the LGPL) these terms do not really apply until you distribute the code but that does not mean that the user is exempt from the licensing terms of that code because they have not distributed their code (yet). Furthermore there are a number of numpy users that download the numpy project for further distribution such as Enthought, packagers for Linux distributions and developers of projects like Python(x,y). Some of these users would be inconvenienced because binary-only distributions would not be permitted in any form. Bruce From eadrogue at gmx.net Mon Jan 18 13:43:39 2010 From: eadrogue at gmx.net (Ernest =?iso-8859-1?Q?Adrogu=E9?=) Date: Mon, 18 Jan 2010 19:43:39 +0100 Subject: [Numpy-discussion] logic problem Message-ID: <20100118184339.GA19364@doriath.local> Hi, This is hard to explain. 
In this code: reduce(np.logical_or, [m1 & m2, m1 & m3, m2 & m3]) where m1, m2 and m3 are boolean arrays, I'm trying to figure out an expression that works with an arbitrary number of arrays, not just 3. Any idea?? Bye. From bsouthey at gmail.com Mon Jan 18 14:10:01 2010 From: bsouthey at gmail.com (Bruce Southey) Date: Mon, 18 Jan 2010 13:10:01 -0600 Subject: [Numpy-discussion] performance matrix multiplication vs. matlab In-Reply-To: <20100118194720.cx3nrid4aogk4k48@160.103.2.152> References: <3d375d731001171043s2e1f90e1t77d6923eb3a6f4cd@mail.gmail.com> <3d375d731001171157n1ee58b36ld8c5bca3fa2088f7@mail.gmail.com> <3d375d731001180804w7843c749ucf2b1f3fd46b6fba@mail.gmail.com> <3d375d731001180837m7ad230aie127acab4abfe184@mail.gmail.com> <4B549663.5020404@gmail.com> <20100118194720.cx3nrid4aogk4k48@160.103.2.152> Message-ID: <4B54B209.6090604@gmail.com> On 01/18/2010 12:47 PM, Vicente Sole wrote: > Quoting Bruce Southey : > >> >> If you obtain the code from any package then you are bound by the terms >> of that code. So while a user might not be 'inconvenienced' by the LGPL, >> they are required to meet the terms as required. For some licenses (like >> the LGPL) these terms do not really apply until you distribute the code >> but that does not mean that the user is exempt from the licensing terms >> of that code because they have not distributed their code (yet). >> >> Furthermore there are a number of numpy users that download the numpy >> project for further distribution such as Enthought, packagers for Linux >> distributions and developers of projects like Python(x,y). Some of these >> users would be inconvenienced because binary-only distributions would >> not be permitted in any form. >> > > I think people are confusing LGPL and GPL... Not at all. > > I can distribute my code in binary form without any restriction when > using an LGPL library UNLESS I have modified the library itself. I do not interpret the LGPL version 3 in this way: A "Combined Work" is a work produced by combining or linking an Application with the Library. So you must apply section 4, in particular, provide the "Minimal Corresponding Source": The "Minimal Corresponding Source" for a Combined Work means the Corresponding Source for the Combined Work, excluding any source code for portions of the Combined Work that, considered in isolation, are based on the Application, and not on the Linked Version. So a binary-only is usually not appropriate. Bruce -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Mon Jan 18 14:17:07 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 18 Jan 2010 14:17:07 -0500 Subject: [Numpy-discussion] logic problem In-Reply-To: <20100118184339.GA19364@doriath.local> References: <20100118184339.GA19364@doriath.local> Message-ID: <1cd32cbb1001181117v78f20b32n2ad121edffa8ffaa@mail.gmail.com> 2010/1/18 Ernest Adrogu? : > Hi, > > This is hard to explain. In this code: > > reduce(np.logical_or, [m1 & m2, m1 & m3, m2 & m3]) > > where m1, m2 and m3 are boolean arrays, I'm trying to figure > out an expression that works with an arbitrary number of > arrays, not just 3. Any idea?? What's the shape of mi (dimension)? fixed or arbitrary number of dimension? a loop is the most memory efficient array broadcasting builds large arrays (and maybe has redundant calculations), but might be a one-liner or something like list comprehension m = [m1, m2, ... 
mn] reduce(np.logical_or, [mi & mj for (i, mi) in enumerate(m) for (j, mj) in enumerate(m) if i>> m = [np.arange(10)<5, np.arange(10)>3, np.arange(10)>8] >>> m [array([ True, True, True, True, True, False, False, False, False, False], dtype=bool), array([False, False, False, False, True, True, True, True, True, True], dtype=bool), array([False, False, False, False, False, False, False, False, False, True], dtype=bool)] >>> reduce(np.logical_or, [mi & mj for (i, mi) in enumerate(m) for (j, mj) in enumerate(m) if i > Bye. > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From warren.weckesser at enthought.com Mon Jan 18 14:18:11 2010 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Mon, 18 Jan 2010 13:18:11 -0600 Subject: [Numpy-discussion] logic problem In-Reply-To: <20100118184339.GA19364@doriath.local> References: <20100118184339.GA19364@doriath.local> Message-ID: <4B54B3F3.6060306@enthought.com> Ernest Adrogu? wrote: > Hi, > > This is hard to explain. In this code: > > reduce(np.logical_or, [m1 & m2, m1 & m3, m2 & m3]) > > where m1, m2 and m3 are boolean arrays, I'm trying to figure > out an expression that works with an arbitrary number of > arrays, not just 3. Any idea?? > > If I understand the problem correctly, you want the result to be True whenever any pair of the corresponding elements of the arrays are True. This could work: reduce(np.add, [m.astype(int) for m in mlist]) > 1 where mlist is a list of the boolean array (e.g. mlist = [m1, m2, m3] in your example). Warren > Bye. > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From josef.pktd at gmail.com Mon Jan 18 14:23:56 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 18 Jan 2010 14:23:56 -0500 Subject: [Numpy-discussion] logic problem In-Reply-To: <4B54B3F3.6060306@enthought.com> References: <20100118184339.GA19364@doriath.local> <4B54B3F3.6060306@enthought.com> Message-ID: <1cd32cbb1001181123mc8de8b5t40c42417df2e0e69@mail.gmail.com> On Mon, Jan 18, 2010 at 2:18 PM, Warren Weckesser wrote: > Ernest Adrogu? wrote: >> Hi, >> >> This is hard to explain. In this code: >> >> reduce(np.logical_or, [m1 & m2, m1 & m3, m2 & m3]) >> >> where m1, m2 and m3 are boolean arrays, I'm trying to figure >> out an expression that works with an arbitrary number of >> arrays, not just 3. Any idea?? >> >> > > If I understand the problem correctly, you want the result to be True > whenever any pair of the corresponding elements of the arrays are True. > > This could work: > > reduce(np.add, [m.astype(int) for m in mlist]) > 1 > > where mlist is a list of the boolean array (e.g. mlist = [m1, m2, m3] in > your example). much nicer than what I came up with. Does iterator instead of intermediate list work (same for my list comprehension)? reduce(np.add, (m.astype(int) for m in mlist)) > 1 Josef > > Warren >> Bye. 
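A self-contained sketch of the counting idea discussed above, with made-up masks (either form below is equivalent to the reduce(np.add, ...) > 1 idiom):

import numpy as np

m1 = np.array([True, False, True, False])
m2 = np.array([True, True, False, False])
m3 = np.array([False, True, True, False])
mlist = [m1, m2, m3]

# True wherever at least two of the masks are True
atleast_two = sum(m.astype(int) for m in mlist) > 1

# same result: stack the masks and count True values along axis 0
atleast_two_alt = np.sum(mlist, axis=0) > 1

# both give array([ True,  True,  True, False], dtype=bool)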
>> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From sole at esrf.fr Mon Jan 18 14:34:28 2010 From: sole at esrf.fr (Vicente Sole) Date: Mon, 18 Jan 2010 20:34:28 +0100 Subject: [Numpy-discussion] performance matrix multiplication vs. matlab In-Reply-To: <4B54B209.6090604@gmail.com> References: <3d375d731001171043s2e1f90e1t77d6923eb3a6f4cd@mail.gmail.com> <3d375d731001171157n1ee58b36ld8c5bca3fa2088f7@mail.gmail.com> <3d375d731001180804w7843c749ucf2b1f3fd46b6fba@mail.gmail.com> <3d375d731001180837m7ad230aie127acab4abfe184@mail.gmail.com> <4B549663.5020404@gmail.com> <20100118194720.cx3nrid4aogk4k48@160.103.2.152> <4B54B209.6090604@gmail.com> Message-ID: <20100118203428.o83tfn6680kc8sc4@160.103.2.152> Quoting Bruce Southey : > On 01/18/2010 12:47 PM, Vicente Sole wrote: >> Quoting Bruce Southey : >> >>> >>> If you obtain the code from any package then you are bound by the terms >>> of that code. So while a user might not be 'inconvenienced' by the LGPL, >>> they are required to meet the terms as required. For some licenses (like >>> the LGPL) these terms do not really apply until you distribute the code >>> but that does not mean that the user is exempt from the licensing terms >>> of that code because they have not distributed their code (yet). >>> >>> Furthermore there are a number of numpy users that download the numpy >>> project for further distribution such as Enthought, packagers for Linux >>> distributions and developers of projects like Python(x,y). Some of these >>> users would be inconvenienced because binary-only distributions would >>> not be permitted in any form. >>> >> >> I think people are confusing LGPL and GPL... > Not at all. > >> >> I can distribute my code in binary form without any restriction >> when using an LGPL library UNLESS I have modified the library itself. > > I do not interpret the LGPL version 3 in this way: > A "Combined Work" is a work produced by combining or linking an > Application with the Library. > So you must apply section 4, in particular, provide the "Minimal > Corresponding Source": > The "Minimal Corresponding Source" for a Combined Work means the > Corresponding Source for the Combined Work, excluding any source code > for portions of the Combined Work that, considered in isolation, are > based on the Application, and not on the Linked Version. > > So a binary-only is usually not appropriate. > You are taking point 4.d)0 while I am taking 4.d)1: """ 1) Use a suitable shared library mechanism for linking with the Library. A suitable mechanism is one that (a) uses at run time a copy of the Library already present on the user's computer system, and (b) will operate properly with a modified version of the Library that is interface-compatible with the Linked Version. """ If you are using the library as a shared library (what you do most of the times in Python), you are quite free. In any case, it seems I am not the only one seeing it like that: http://qt.nokia.com/downloads The key point is if you use the library "as is" or you have modified it. 
Armando From warren.weckesser at enthought.com Mon Jan 18 14:39:30 2010 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Mon, 18 Jan 2010 13:39:30 -0600 Subject: [Numpy-discussion] logic problem In-Reply-To: <1cd32cbb1001181123mc8de8b5t40c42417df2e0e69@mail.gmail.com> References: <20100118184339.GA19364@doriath.local> <4B54B3F3.6060306@enthought.com> <1cd32cbb1001181123mc8de8b5t40c42417df2e0e69@mail.gmail.com> Message-ID: <4B54B8F2.9080903@enthought.com> josef.pktd at gmail.com wrote: > On Mon, Jan 18, 2010 at 2:18 PM, Warren Weckesser > wrote: > >> Ernest Adrogu? wrote: >> >>> Hi, >>> >>> This is hard to explain. In this code: >>> >>> reduce(np.logical_or, [m1 & m2, m1 & m3, m2 & m3]) >>> >>> where m1, m2 and m3 are boolean arrays, I'm trying to figure >>> out an expression that works with an arbitrary number of >>> arrays, not just 3. Any idea?? >>> >>> >>> >> If I understand the problem correctly, you want the result to be True >> whenever any pair of the corresponding elements of the arrays are True. >> >> This could work: >> >> reduce(np.add, [m.astype(int) for m in mlist]) > 1 >> >> where mlist is a list of the boolean array (e.g. mlist = [m1, m2, m3] in >> your example). >> > > much nicer than what I came up with. Does iterator instead of > intermediate list work (same for my list comprehension)? > > reduce(np.add, (m.astype(int) for m in mlist)) > 1 > > Yes, that works and is preferable, especially if the arrays are large or the list is long. Warren > Josef > > > From robert.kern at gmail.com Mon Jan 18 14:39:26 2010 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 18 Jan 2010 13:39:26 -0600 Subject: [Numpy-discussion] performance matrix multiplication vs. matlab In-Reply-To: <20100118203428.o83tfn6680kc8sc4@160.103.2.152> References: <3d375d731001180804w7843c749ucf2b1f3fd46b6fba@mail.gmail.com> <3d375d731001180837m7ad230aie127acab4abfe184@mail.gmail.com> <4B549663.5020404@gmail.com> <20100118194720.cx3nrid4aogk4k48@160.103.2.152> <4B54B209.6090604@gmail.com> <20100118203428.o83tfn6680kc8sc4@160.103.2.152> Message-ID: <3d375d731001181139i22d40823x7cead4fd1c3cf2a6@mail.gmail.com> On Mon, Jan 18, 2010 at 13:34, Vicente Sole wrote: > You are taking point 4.d)0 while I am taking 4.d)1: > > """ > 1) Use a suitable shared library mechanism for linking with the > Library. A suitable mechanism is one that (a) uses at run time a copy > of the Library already present on the user's computer system, and (b) > will operate properly with a modified version of the Library that is > interface-compatible with the Linked Version. > """ > > If you are using the library as a shared library (what you do most of > the times in Python), you are quite free. numpy would not be using Eigen2 as a shared library. It is true that numpy would act as a shared library with respect to some downstream application, but incorporating Eigen2 into numpy would make those numpy binaries be effectively under the LGPL license with respect to the downstream application. > In any case, it seems I am not the only one seeing it like that: > > http://qt.nokia.com/downloads > > The key point is if you use the library "as is" or you have modified it. With respect to numpy and the way that Eigen2 was proposed as being used, no, it is not the key point. We will not incorporate Eigen2 code into numpy, particularly not as the default linear algebra implementation, because we wish to keep numpy's source as being only BSD. This is a policy decision of the numpy team, not a legal incompatibility. 
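A quick way to see which backend a given numpy binary was actually built against is np.show_config(); a minimal sketch (the exact sections printed depend on how that particular binary was built):

import numpy as np

# prints the BLAS/LAPACK build information (ATLAS, MKL, ...) for this install
np.show_config()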
-- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From russel at appliedminds.com Mon Jan 18 14:41:47 2010 From: russel at appliedminds.com (Russel Howe) Date: Mon, 18 Jan 2010 11:41:47 -0800 Subject: [Numpy-discussion] Is this expected behavior? Message-ID: <4B54B97B.1080000@appliedminds.com> This looks like the difference between memmove and memcpy to me, but I am not sure what the expected behavior of numpy should be. The first shift behaves the way I expect, the second is surprising. I know about numpy.roll. I was hoping for something faster, which this would be if it worked. In [1]: a = (np.random.random((10,10))*10).astype('u1') In [2]: a Out[2]: array([[8, 0, 5, 4, 8, 2, 7, 8, 7, 6], [6, 6, 3, 3, 9, 8, 0, 8, 9, 5], [5, 0, 1, 1, 2, 5, 8, 2, 5, 3], [9, 0, 0, 2, 8, 2, 0, 7, 7, 0], [9, 8, 6, 9, 6, 3, 9, 4, 4, 5], [2, 7, 6, 9, 3, 8, 9, 9, 6, 9], [2, 8, 8, 4, 0, 3, 7, 6, 7, 6], [2, 4, 9, 2, 4, 7, 3, 6, 7, 4], [3, 2, 0, 7, 0, 7, 6, 6, 1, 6], [2, 3, 8, 8, 9, 6, 7, 2, 5, 0]], dtype=uint8) In [3]: a[:, :-1] = a[:, 1:] In [4]: a Out[4]: array([[0, 5, 4, 8, 2, 7, 8, 7, 6, 6], [6, 3, 3, 9, 8, 0, 8, 9, 5, 5], [0, 1, 1, 2, 5, 8, 2, 5, 3, 3], [0, 0, 2, 8, 2, 0, 7, 7, 0, 0], [8, 6, 9, 6, 3, 9, 4, 4, 5, 5], [7, 6, 9, 3, 8, 9, 9, 6, 9, 9], [8, 8, 4, 0, 3, 7, 6, 7, 6, 6], [4, 9, 2, 4, 7, 3, 6, 7, 4, 4], [2, 0, 7, 0, 7, 6, 6, 1, 6, 6], [3, 8, 8, 9, 6, 7, 2, 5, 0, 0]], dtype=uint8) In [5]: a[:, 1:] = a[:, :-1] In [6]: a Out[6]: array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [6, 6, 6, 6, 6, 6, 6, 6, 6, 6], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [8, 8, 8, 8, 8, 8, 8, 8, 8, 8], [7, 7, 7, 7, 7, 7, 7, 7, 7, 7], [8, 8, 8, 8, 8, 8, 8, 8, 8, 8], [4, 4, 4, 4, 4, 4, 4, 4, 4, 4], [2, 2, 2, 2, 2, 2, 2, 2, 2, 2], [3, 3, 3, 3, 3, 3, 3, 3, 3, 3]], dtype=uint8) In [7]: np.__version__ Out[7]: '1.3.0' From eadrogue at gmx.net Mon Jan 18 14:48:13 2010 From: eadrogue at gmx.net (Ernest =?iso-8859-1?Q?Adrogu=E9?=) Date: Mon, 18 Jan 2010 20:48:13 +0100 Subject: [Numpy-discussion] logic problem In-Reply-To: <1cd32cbb1001181117v78f20b32n2ad121edffa8ffaa@mail.gmail.com> References: <20100118184339.GA19364@doriath.local> <1cd32cbb1001181117v78f20b32n2ad121edffa8ffaa@mail.gmail.com> Message-ID: <20100118194813.GA19571@doriath.local> 18/01/10 @ 14:17 (-0500), thus spake josef.pktd at gmail.com: > 2010/1/18 Ernest Adrogu? : > > Hi, > > > > This is hard to explain. In this code: > > > > reduce(np.logical_or, [m1 & m2, m1 & m3, m2 & m3]) > > > > where m1, m2 and m3 are boolean arrays, I'm trying to figure > > out an expression that works with an arbitrary number of > > arrays, not just 3. Any idea?? > > What's the shape of mi (dimension)? fixed or arbitrary number of dimension? > a loop is the most memory efficient I forgot to mention, mi are 1-dimensional, all the same length of course. > array broadcasting builds large arrays (and maybe has redundant > calculations), but might be a one-liner > > or something like list comprehension > > m = [m1, m2, ... 
mn] > reduce(np.logical_or, [mi & mj for (i, mi) in enumerate(m) for (j, mj) > in enumerate(m) if i > >>> m = [np.arange(10)<5, np.arange(10)>3, np.arange(10)>8] > >>> m > [array([ True, True, True, True, True, False, False, False, False, > False], dtype=bool), array([False, False, False, False, True, True, > True, True, True, True], dtype=bool), array([False, False, False, > False, False, False, False, False, False, True], dtype=bool)] > > >>> reduce(np.logical_or, [mi & mj for (i, mi) in enumerate(m) for (j, mj) in enumerate(m) if i array([False, False, False, False, True, False, False, False, False, > True], dtype=bool) > > Josef > > > > > > Bye. > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From robert.kern at gmail.com Mon Jan 18 14:47:48 2010 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 18 Jan 2010 13:47:48 -0600 Subject: [Numpy-discussion] Is this expected behavior? In-Reply-To: <4B54B97B.1080000@appliedminds.com> References: <4B54B97B.1080000@appliedminds.com> Message-ID: <3d375d731001181147x3c2ecab6ua82c89ace04ddb83@mail.gmail.com> On Mon, Jan 18, 2010 at 13:41, Russel Howe wrote: > This looks like the difference between memmove and memcpy to me, but I > am not sure what the expected behavior of numpy should be. ?The first > shift behaves the way I expect, the second is surprising. memmove() and memcpy() are not used for these operations (and in general, they can't be). Rather, iterators are created and looped over to do the assignments. Because you are not making copies on the right-hand-side, you are modifying the RHS as the iterators assign to the LHS. > In [3]: a[:, :-1] = a[:, 1:] > > In [4]: a > Out[4]: > array([[0, 5, 4, 8, 2, 7, 8, 7, 6, 6], > ? ? ? ?[6, 3, 3, 9, 8, 0, 8, 9, 5, 5], > ? ? ? ?[0, 1, 1, 2, 5, 8, 2, 5, 3, 3], > ? ? ? ?[0, 0, 2, 8, 2, 0, 7, 7, 0, 0], > ? ? ? ?[8, 6, 9, 6, 3, 9, 4, 4, 5, 5], > ? ? ? ?[7, 6, 9, 3, 8, 9, 9, 6, 9, 9], > ? ? ? ?[8, 8, 4, 0, 3, 7, 6, 7, 6, 6], > ? ? ? ?[4, 9, 2, 4, 7, 3, 6, 7, 4, 4], > ? ? ? ?[2, 0, 7, 0, 7, 6, 6, 1, 6, 6], > ? ? ? ?[3, 8, 8, 9, 6, 7, 2, 5, 0, 0]], dtype=uint8) The first one works because the RHS pointer is always one step ahead of the LHS pointer, thus it always reads pristine data. > In [5]: a[:, 1:] = a[:, :-1] > > In [6]: a > Out[6]: > array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0], > ? ? ? ?[6, 6, 6, 6, 6, 6, 6, 6, 6, 6], > ? ? ? ?[0, 0, 0, 0, 0, 0, 0, 0, 0, 0], > ? ? ? ?[0, 0, 0, 0, 0, 0, 0, 0, 0, 0], > ? ? ? ?[8, 8, 8, 8, 8, 8, 8, 8, 8, 8], > ? ? ? ?[7, 7, 7, 7, 7, 7, 7, 7, 7, 7], > ? ? ? ?[8, 8, 8, 8, 8, 8, 8, 8, 8, 8], > ? ? ? ?[4, 4, 4, 4, 4, 4, 4, 4, 4, 4], > ? ? ? ?[2, 2, 2, 2, 2, 2, 2, 2, 2, 2], > ? ? ? ?[3, 3, 3, 3, 3, 3, 3, 3, 3, 3]], dtype=uint8) The second one fails to work as you expect because the RHS pointer is always one step behind the LHS pointer, thus it always reads the data that just got modified in the previous step. The data you expected it to read has already been wiped out. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From eadrogue at gmx.net Mon Jan 18 14:52:30 2010 From: eadrogue at gmx.net (Ernest =?iso-8859-1?Q?Adrogu=E9?=) Date: Mon, 18 Jan 2010 20:52:30 +0100 Subject: [Numpy-discussion] logic problem In-Reply-To: <4B54B3F3.6060306@enthought.com> References: <20100118184339.GA19364@doriath.local> <4B54B3F3.6060306@enthought.com> Message-ID: <20100118195230.GB19571@doriath.local> 18/01/10 @ 13:18 (-0600), thus spake Warren Weckesser: > Ernest Adrogu? wrote: > > Hi, > > > > This is hard to explain. In this code: > > > > reduce(np.logical_or, [m1 & m2, m1 & m3, m2 & m3]) > > > > where m1, m2 and m3 are boolean arrays, I'm trying to figure > > out an expression that works with an arbitrary number of > > arrays, not just 3. Any idea?? > > > > > > If I understand the problem correctly, you want the result to be True > whenever any pair of the corresponding elements of the arrays are True. Exactly. > This could work: > > reduce(np.add, [m.astype(int) for m in mlist]) > 1 > > where mlist is a list of the boolean array (e.g. mlist = [m1, m2, m3] in > your example). Very clever. Thanks a lot! > Warren > > Bye. > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From russel at appliedminds.com Mon Jan 18 14:51:58 2010 From: russel at appliedminds.com (Russel Howe) Date: Mon, 18 Jan 2010 11:51:58 -0800 Subject: [Numpy-discussion] Is this expected behavior? In-Reply-To: <3d375d731001181147x3c2ecab6ua82c89ace04ddb83@mail.gmail.com> References: <4B54B97B.1080000@appliedminds.com> <3d375d731001181147x3c2ecab6ua82c89ace04ddb83@mail.gmail.com> Message-ID: <4B54BBDE.7080200@appliedminds.com> Since they are iterators, is it possible to check for the second condition and reverse both of them so the behavior I expect happens or does this break something else? Russel Robert Kern wrote: > On Mon, Jan 18, 2010 at 13:41, Russel Howe wrote: >> This looks like the difference between memmove and memcpy to me, but I >> am not sure what the expected behavior of numpy should be. The first >> shift behaves the way I expect, the second is surprising. > > memmove() and memcpy() are not used for these operations (and in > general, they can't be). Rather, iterators are created and looped over > to do the assignments. Because you are not making copies on the > right-hand-side, you are modifying the RHS as the iterators assign to > the LHS. > >> In [3]: a[:, :-1] = a[:, 1:] >> >> In [4]: a >> Out[4]: >> array([[0, 5, 4, 8, 2, 7, 8, 7, 6, 6], >> [6, 3, 3, 9, 8, 0, 8, 9, 5, 5], >> [0, 1, 1, 2, 5, 8, 2, 5, 3, 3], >> [0, 0, 2, 8, 2, 0, 7, 7, 0, 0], >> [8, 6, 9, 6, 3, 9, 4, 4, 5, 5], >> [7, 6, 9, 3, 8, 9, 9, 6, 9, 9], >> [8, 8, 4, 0, 3, 7, 6, 7, 6, 6], >> [4, 9, 2, 4, 7, 3, 6, 7, 4, 4], >> [2, 0, 7, 0, 7, 6, 6, 1, 6, 6], >> [3, 8, 8, 9, 6, 7, 2, 5, 0, 0]], dtype=uint8) > > The first one works because the RHS pointer is always one step ahead > of the LHS pointer, thus it always reads pristine data. 
> >> In [5]: a[:, 1:] = a[:, :-1] >> >> In [6]: a >> Out[6]: >> array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0], >> [6, 6, 6, 6, 6, 6, 6, 6, 6, 6], >> [0, 0, 0, 0, 0, 0, 0, 0, 0, 0], >> [0, 0, 0, 0, 0, 0, 0, 0, 0, 0], >> [8, 8, 8, 8, 8, 8, 8, 8, 8, 8], >> [7, 7, 7, 7, 7, 7, 7, 7, 7, 7], >> [8, 8, 8, 8, 8, 8, 8, 8, 8, 8], >> [4, 4, 4, 4, 4, 4, 4, 4, 4, 4], >> [2, 2, 2, 2, 2, 2, 2, 2, 2, 2], >> [3, 3, 3, 3, 3, 3, 3, 3, 3, 3]], dtype=uint8) > > The second one fails to work as you expect because the RHS pointer is > always one step behind the LHS pointer, thus it always reads the data > that just got modified in the previous step. The data you expected it > to read has already been wiped out. > From robert.kern at gmail.com Mon Jan 18 15:03:21 2010 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 18 Jan 2010 14:03:21 -0600 Subject: [Numpy-discussion] Is this expected behavior? In-Reply-To: <4B54BBDE.7080200@appliedminds.com> References: <4B54B97B.1080000@appliedminds.com> <3d375d731001181147x3c2ecab6ua82c89ace04ddb83@mail.gmail.com> <4B54BBDE.7080200@appliedminds.com> Message-ID: <3d375d731001181203i44da4d04yc6497cf093a913c8@mail.gmail.com> On Mon, Jan 18, 2010 at 13:51, Russel Howe wrote: > Since they are iterators, is it possible to check for the second > condition and reverse both of them so the behavior I expect happens or > does this ?break something else? In general, no I don't think it would be possible. It would create a weird special case in the semantics, and slow down common assignments that don't have the issue. It would be nice to have a fast in-place roll(), though this is not how one should implement it. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From warren.weckesser at enthought.com Mon Jan 18 15:18:30 2010 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Mon, 18 Jan 2010 14:18:30 -0600 Subject: [Numpy-discussion] Is this expected behavior? In-Reply-To: <4B54BBDE.7080200@appliedminds.com> References: <4B54B97B.1080000@appliedminds.com> <3d375d731001181147x3c2ecab6ua82c89ace04ddb83@mail.gmail.com> <4B54BBDE.7080200@appliedminds.com> Message-ID: <4B54C216.40105@enthought.com> Russel Howe wrote: > Since they are iterators, is it possible to check for the second > condition and reverse both of them so the behavior I expect happens or > does this break something else? > You may already know this, but just in case... In the second case, you can accomplish the shift by using reversed slices: a[:, -1:0:-1] = a[:, -2::-1] Warren > Russel > Robert Kern wrote: > >> On Mon, Jan 18, 2010 at 13:41, Russel Howe wrote: >> >>> This looks like the difference between memmove and memcpy to me, but I >>> am not sure what the expected behavior of numpy should be. The first >>> shift behaves the way I expect, the second is surprising. >>> >> memmove() and memcpy() are not used for these operations (and in >> general, they can't be). Rather, iterators are created and looped over >> to do the assignments. Because you are not making copies on the >> right-hand-side, you are modifying the RHS as the iterators assign to >> the LHS. 
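A minimal sketch of the two safe spellings that come out of this thread, on a small array named a rather than the random one from the original post. Recent NumPy releases detect many overlapping assignments and copy internally, so the explicit .copy() is mainly about keeping the intent unambiguous on any version.

import numpy as np

a = np.arange(100, dtype=np.uint8).reshape(10, 10)

# Right-shift the columns portably: copy the right-hand side so the
# reads can never see the writes, whatever order the iterators run in.
b = a.copy()
b[:, 1:] = b[:, :-1].copy()

# Warren's reversed-slice variant: same result without the temporary,
# because the assignment then walks from the last column backwards.
c = a.copy()
c[:, -1:0:-1] = c[:, -2::-1]

print(np.array_equal(b, c))   # True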
>> >> >>> In [3]: a[:, :-1] = a[:, 1:] >>> >>> In [4]: a >>> Out[4]: >>> array([[0, 5, 4, 8, 2, 7, 8, 7, 6, 6], >>> [6, 3, 3, 9, 8, 0, 8, 9, 5, 5], >>> [0, 1, 1, 2, 5, 8, 2, 5, 3, 3], >>> [0, 0, 2, 8, 2, 0, 7, 7, 0, 0], >>> [8, 6, 9, 6, 3, 9, 4, 4, 5, 5], >>> [7, 6, 9, 3, 8, 9, 9, 6, 9, 9], >>> [8, 8, 4, 0, 3, 7, 6, 7, 6, 6], >>> [4, 9, 2, 4, 7, 3, 6, 7, 4, 4], >>> [2, 0, 7, 0, 7, 6, 6, 1, 6, 6], >>> [3, 8, 8, 9, 6, 7, 2, 5, 0, 0]], dtype=uint8) >>> >> The first one works because the RHS pointer is always one step ahead >> of the LHS pointer, thus it always reads pristine data. >> >> >>> In [5]: a[:, 1:] = a[:, :-1] >>> >>> In [6]: a >>> Out[6]: >>> array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0], >>> [6, 6, 6, 6, 6, 6, 6, 6, 6, 6], >>> [0, 0, 0, 0, 0, 0, 0, 0, 0, 0], >>> [0, 0, 0, 0, 0, 0, 0, 0, 0, 0], >>> [8, 8, 8, 8, 8, 8, 8, 8, 8, 8], >>> [7, 7, 7, 7, 7, 7, 7, 7, 7, 7], >>> [8, 8, 8, 8, 8, 8, 8, 8, 8, 8], >>> [4, 4, 4, 4, 4, 4, 4, 4, 4, 4], >>> [2, 2, 2, 2, 2, 2, 2, 2, 2, 2], >>> [3, 3, 3, 3, 3, 3, 3, 3, 3, 3]], dtype=uint8) >>> >> The second one fails to work as you expect because the RHS pointer is >> always one step behind the LHS pointer, thus it always reads the data >> that just got modified in the previous step. The data you expected it >> to read has already been wiped out. >> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From russel at appliedminds.com Mon Jan 18 15:43:23 2010 From: russel at appliedminds.com (Russel Howe) Date: Mon, 18 Jan 2010 12:43:23 -0800 Subject: [Numpy-discussion] Is this expected behavior? In-Reply-To: <4B54C216.40105@enthought.com> References: <4B54B97B.1080000@appliedminds.com> <3d375d731001181147x3c2ecab6ua82c89ace04ddb83@mail.gmail.com> <4B54BBDE.7080200@appliedminds.com> <4B54C216.40105@enthought.com> Message-ID: <4B54C7EB.6080906@appliedminds.com> Oh, of course. I can reverse it myself. Thanks, I did not think of that. Russel Warren Weckesser wrote: > Russel Howe wrote: >> Since they are iterators, is it possible to check for the second >> condition and reverse both of them so the behavior I expect happens or >> does this break something else? >> > > You may already know this, but just in case... > > In the second case, you can accomplish the shift by using reversed slices: > > a[:, -1:0:-1] = a[:, -2::-1] > > > Warren > >> Russel >> Robert Kern wrote: >> >>> On Mon, Jan 18, 2010 at 13:41, Russel Howe wrote: >>> >>>> This looks like the difference between memmove and memcpy to me, but I >>>> am not sure what the expected behavior of numpy should be. The first >>>> shift behaves the way I expect, the second is surprising. >>>> >>> memmove() and memcpy() are not used for these operations (and in >>> general, they can't be). Rather, iterators are created and looped over >>> to do the assignments. Because you are not making copies on the >>> right-hand-side, you are modifying the RHS as the iterators assign to >>> the LHS. 
>>> >>> >>>> In [3]: a[:, :-1] = a[:, 1:] >>>> >>>> In [4]: a >>>> Out[4]: >>>> array([[0, 5, 4, 8, 2, 7, 8, 7, 6, 6], >>>> [6, 3, 3, 9, 8, 0, 8, 9, 5, 5], >>>> [0, 1, 1, 2, 5, 8, 2, 5, 3, 3], >>>> [0, 0, 2, 8, 2, 0, 7, 7, 0, 0], >>>> [8, 6, 9, 6, 3, 9, 4, 4, 5, 5], >>>> [7, 6, 9, 3, 8, 9, 9, 6, 9, 9], >>>> [8, 8, 4, 0, 3, 7, 6, 7, 6, 6], >>>> [4, 9, 2, 4, 7, 3, 6, 7, 4, 4], >>>> [2, 0, 7, 0, 7, 6, 6, 1, 6, 6], >>>> [3, 8, 8, 9, 6, 7, 2, 5, 0, 0]], dtype=uint8) >>>> >>> The first one works because the RHS pointer is always one step ahead >>> of the LHS pointer, thus it always reads pristine data. >>> >>> >>>> In [5]: a[:, 1:] = a[:, :-1] >>>> >>>> In [6]: a >>>> Out[6]: >>>> array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0], >>>> [6, 6, 6, 6, 6, 6, 6, 6, 6, 6], >>>> [0, 0, 0, 0, 0, 0, 0, 0, 0, 0], >>>> [0, 0, 0, 0, 0, 0, 0, 0, 0, 0], >>>> [8, 8, 8, 8, 8, 8, 8, 8, 8, 8], >>>> [7, 7, 7, 7, 7, 7, 7, 7, 7, 7], >>>> [8, 8, 8, 8, 8, 8, 8, 8, 8, 8], >>>> [4, 4, 4, 4, 4, 4, 4, 4, 4, 4], >>>> [2, 2, 2, 2, 2, 2, 2, 2, 2, 2], >>>> [3, 3, 3, 3, 3, 3, 3, 3, 3, 3]], dtype=uint8) >>>> >>> The second one fails to work as you expect because the RHS pointer is >>> always one step behind the LHS pointer, thus it always reads the data >>> that just got modified in the previous step. The data you expected it >>> to read has already been wiped out. >>> >>> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From lescroar at usc.edu Mon Jan 18 20:15:00 2010 From: lescroar at usc.edu (Mark Lescroart) Date: Mon, 18 Jan 2010 17:15:00 -0800 Subject: [Numpy-discussion] Numpy.dot segmentation fault In-Reply-To: <4B53E5C1.4070801@silveregg.co.jp> References: <4B53E5C1.4070801@silveregg.co.jp> Message-ID: <9073C6F4-C9F7-4A84-A0BE-668A65447090@usc.edu> Hi David (et al), Thanks for the reply. The version of ATLAS I was using was v3.8.3_1, installed via MacPorts, compiled and built on my machine with a gcc4.3 compiler. I uninstalled numpy 1.4 and ATLAS, re-installed (i.e., re- compiled) the same version (3.8.3_1), re-installed numpy (for python2.6, version 1.4), and got the same bug. I don't know if this means that there's something fundamentally wrong with the version of ATLAS on MacPorts (probably less likely) or something wrong with the way my system is configured (probably more likely). If anyone can give me any more insight into how to test my installation of ATLAS, I would be much obliged (I read through a fair bit of the ATLAS installation notes on the ATLAS sourceforge page, and could not figure out how to "run the whole test suite" ). If possible, I would like to solve this problem within Macports (and thus not with the Accelerate framework). I am using numpy mostly through the pyMVPA package for fMRI multi-voxel analysis, and the pyMVPA package depends on a number of other libraries, and the mess of dependencies is most easily managed within the framework of Macports. Cheers, Mark On Jan 17, 2010, at 8:38 PM, David Cournapeau wrote: > Mark Lescroart wrote: >> Hello, >> >> I've encountered a segfault in numpy when trying to compute a dot >> product for two arrays - see code below. The problem only seems to >> occur >> when the arrays reach a certain size. > > Your atlas is most likely broken. 
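For anyone hitting the same thing, one way to see which BLAS a NumPy build is linked against and to reproduce the failing call in isolation; the shapes below are only illustrative, not the ones from the original report.

import numpy as np

# Which BLAS/LAPACK is this build linked against (ATLAS, Accelerate, ...)?
np.show_config()

# A matrix product big enough to hit the optimized GEMM path; this is
# the kind of call that segfaulted in the report above.
a = np.random.rand(3000, 1000)
b = np.random.rand(1000, 500)
print(np.dot(a, b).shape)

# NumPy's own test suite exercises the linear algebra bindings as well
# (requires the external test runner to be installed).
np.test()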
You will have to double-check how > you > built it, and maybe run the whole test suite (as indicated in the > ATLAS > installation notes). > > Note that you can use the Accelerate framework on mac os x, this is > much > easier to get numpy working on mac, > > cheers, > > David > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From robince at gmail.com Mon Jan 18 20:31:00 2010 From: robince at gmail.com (Robin) Date: Mon, 18 Jan 2010 20:31:00 -0500 Subject: [Numpy-discussion] Numpy.dot segmentation fault In-Reply-To: <9073C6F4-C9F7-4A84-A0BE-668A65447090@usc.edu> References: <4B53E5C1.4070801@silveregg.co.jp> <9073C6F4-C9F7-4A84-A0BE-668A65447090@usc.edu> Message-ID: <2d5132a51001181731y6bcd03d8m429a0426236984d8@mail.gmail.com> You can build numpy against Accelerate through macports by specifying the +no_atlas variant. Last time I tried I ran into this issue: http://trac.macports.org/ticket/22201 but it looks like it should be fixed now. Cheers Robin On Mon, Jan 18, 2010 at 8:15 PM, Mark Lescroart wrote: > Hi David (et al), > > Thanks for the reply. The version of ATLAS I was using was v3.8.3_1, > installed via MacPorts, compiled and built on my machine with a gcc4.3 > compiler. I uninstalled numpy 1.4 and ATLAS, re-installed (i.e., re- > compiled) the same version (3.8.3_1), re-installed numpy (for > python2.6, version 1.4), and got the same bug. > > I don't know if this means that there's something fundamentally wrong > with the version of ATLAS on MacPorts (probably less likely) or > something wrong with the way my system is configured (probably more > likely). If anyone can give me any more insight into how to test my > installation of ATLAS, I would be much obliged (I read through a fair > bit of the ATLAS installation notes on the ATLAS sourceforge page, and > could not figure out how to "run the whole test suite" ). > > If possible, I would like to solve this problem within Macports (and > thus not with the Accelerate framework). I am using numpy mostly > through the pyMVPA package for fMRI multi-voxel analysis, and the > pyMVPA package depends on a number of other libraries, and the mess of > dependencies is most easily managed within the framework of Macports. > > Cheers, > > Mark > > On Jan 17, 2010, at 8:38 PM, David Cournapeau wrote: > >> Mark Lescroart wrote: >>> Hello, >>> >>> I've encountered a segfault in numpy when trying to compute a dot >>> product for two arrays - see code below. The problem only seems to >>> occur >>> when the arrays reach a certain size. >> >> Your atlas is most likely broken. You will have to double-check how >> you >> built it, and maybe run the whole test suite (as indicated in the >> ATLAS >> installation notes). >> >> Note that you can use the Accelerate framework on mac os x, this is >> much >> easier to get numpy working on mac, >> >> cheers, >> >> David >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From somervi8 at telus.net Tue Jan 19 12:53:36 2010 From: somervi8 at telus.net (robert somerville) Date: Tue, 19 Jan 2010 09:53:36 -0800 Subject: [Numpy-discussion] how to do a proper 2 column sort on a 2 dimensional array ?? 
Message-ID: <2fb4a5011001190953j3fba1134gac050ccf186dea0d@mail.gmail.com> Hi; i am having trouble trying to sort the rows of a 2 dimensional array by the values in the first column .. does anybody know how or have an example of how to do this ??? while leaving the remain columns remain relative to the leading column from numpy import * a=array( [ [4, 4, 3], [4, 5, 2], [3, 1, 1] ] ) i would like to generate the output (or get the output ...) b = [ [3,1,1], [4,4,3], [4,5,2] ] to be specific the primary sort is on the the first column, and within the primary key, i would like to do a seconday sort of the matrix based on 2nd column .. does Numpy have this finctionality, or do part have to be programmed in Python ?? thanks; bob -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Tue Jan 19 13:09:09 2010 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 19 Jan 2010 12:09:09 -0600 Subject: [Numpy-discussion] how to do a proper 2 column sort on a 2 dimensional array ?? In-Reply-To: <2fb4a5011001190953j3fba1134gac050ccf186dea0d@mail.gmail.com> References: <2fb4a5011001190953j3fba1134gac050ccf186dea0d@mail.gmail.com> Message-ID: <3d375d731001191009w42a8459ax9f55f069fa252abc@mail.gmail.com> On Tue, Jan 19, 2010 at 11:53, robert somerville wrote: > Hi; > ?i am having trouble trying to sort the rows of a 2 dimensional array by the > values in the first column .. does anybody know how or have an example of > how to do this ??? while leaving the remain columns remain relative to the > leading column > > from numpy import * > > a=array( [ [4, 4, 3], [4, 5, 2],? [3, 1, 1] ] ) > > i would like to generate the output (or get the output ...) > > b = [ [3,1,1], [4,4,3], [4,5,2] ] > > to be specific the primary sort is on the the first column, and within the > primary key, i would like to do a seconday sort of the matrix based on 2nd > column .. Let's modify your example slightly so I don't make the same mistake I did on comp.lang.python. Let's make sure that the input data is not already partially ordered by the second column. All we need to do is swap the first two rows. In [9]: a = np.array( [ [4, 5, 2], [4, 4, 3], [3, 1, 1] ] ) In [10]: i = np.lexsort((a[:,1], a[:,0])) In [11]: b = a[i] In [12]: b Out[12]: array([[3, 1, 1], [4, 4, 3], [4, 5, 2]]) Note that in the lexsort() call, the second column comes first. You can think of the procedure as "sort by the second column, now sort by the first column; where there are ties in the first column, the order is left alone from the previous sort on the second column". -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From josef.pktd at gmail.com Tue Jan 19 13:15:16 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 19 Jan 2010 13:15:16 -0500 Subject: [Numpy-discussion] how to do a proper 2 column sort on a 2 dimensional array ?? 
In-Reply-To: <3d375d731001191009w42a8459ax9f55f069fa252abc@mail.gmail.com> References: <2fb4a5011001190953j3fba1134gac050ccf186dea0d@mail.gmail.com> <3d375d731001191009w42a8459ax9f55f069fa252abc@mail.gmail.com> Message-ID: <1cd32cbb1001191015u4bf0eb04ga238b91fe87dfbde@mail.gmail.com> On Tue, Jan 19, 2010 at 1:09 PM, Robert Kern wrote: > On Tue, Jan 19, 2010 at 11:53, robert somerville wrote: >> Hi; >> ?i am having trouble trying to sort the rows of a 2 dimensional array by the >> values in the first column .. does anybody know how or have an example of >> how to do this ??? while leaving the remain columns remain relative to the >> leading column >> >> from numpy import * >> >> a=array( [ [4, 4, 3], [4, 5, 2],? [3, 1, 1] ] ) >> >> i would like to generate the output (or get the output ...) >> >> b = [ [3,1,1], [4,4,3], [4,5,2] ] >> >> to be specific the primary sort is on the the first column, and within the >> primary key, i would like to do a seconday sort of the matrix based on 2nd >> column .. > > Let's modify your example slightly so I don't make the same mistake I > did on comp.lang.python. Let's make sure that the input data is not > already partially ordered by the second column. All we need to do is > swap the first two rows. > > In [9]: a = np.array( [ [4, 5, 2], [4, 4, 3], [3, 1, 1] ] ) > > In [10]: i = np.lexsort((a[:,1], a[:,0])) > > In [11]: b = a[i] > > In [12]: b > Out[12]: > array([[3, 1, 1], > ? ? ? [4, 4, 3], > ? ? ? [4, 5, 2]]) > > > Note that in the lexsort() call, the second column comes first. You > can think of the procedure as "sort by the second column, now sort by > the first column; where there are ties in the first column, the order > is left alone from the previous sort on the second column". > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > ?-- Umberto Eco > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > see also numpy discussion 12/21/2008 "is there a sortrows" when I struggled with it Josef From Chris.Barker at noaa.gov Tue Jan 19 13:27:20 2010 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Tue, 19 Jan 2010 10:27:20 -0800 Subject: [Numpy-discussion] Is this expected behavior? In-Reply-To: <4B54C7EB.6080906@appliedminds.com> References: <4B54B97B.1080000@appliedminds.com> <3d375d731001181147x3c2ecab6ua82c89ace04ddb83@mail.gmail.com> <4B54BBDE.7080200@appliedminds.com> <4B54C216.40105@enthought.com> <4B54C7EB.6080906@appliedminds.com> Message-ID: <4B55F988.5030509@noaa.gov> Russel Howe wrote: > Oh, of course. I can reverse it myself. Thanks, I did not think of that. note that you may need to make sure that your arrays are in C-order. -Chris > Russel > > Warren Weckesser wrote: >> Russel Howe wrote: >>> Since they are iterators, is it possible to check for the second >>> condition and reverse both of them so the behavior I expect happens or >>> does this break something else? >>> >> You may already know this, but just in case... 
>> >> In the second case, you can accomplish the shift by using reversed slices: >> >> a[:, -1:0:-1] = a[:, -2::-1] >> >> >> Warren >> >>> Russel >>> Robert Kern wrote: >>> >>>> On Mon, Jan 18, 2010 at 13:41, Russel Howe wrote: >>>> >>>>> This looks like the difference between memmove and memcpy to me, but I >>>>> am not sure what the expected behavior of numpy should be. The first >>>>> shift behaves the way I expect, the second is surprising. >>>>> >>>> memmove() and memcpy() are not used for these operations (and in >>>> general, they can't be). Rather, iterators are created and looped over >>>> to do the assignments. Because you are not making copies on the >>>> right-hand-side, you are modifying the RHS as the iterators assign to >>>> the LHS. >>>> >>>> >>>>> In [3]: a[:, :-1] = a[:, 1:] >>>>> >>>>> In [4]: a >>>>> Out[4]: >>>>> array([[0, 5, 4, 8, 2, 7, 8, 7, 6, 6], >>>>> [6, 3, 3, 9, 8, 0, 8, 9, 5, 5], >>>>> [0, 1, 1, 2, 5, 8, 2, 5, 3, 3], >>>>> [0, 0, 2, 8, 2, 0, 7, 7, 0, 0], >>>>> [8, 6, 9, 6, 3, 9, 4, 4, 5, 5], >>>>> [7, 6, 9, 3, 8, 9, 9, 6, 9, 9], >>>>> [8, 8, 4, 0, 3, 7, 6, 7, 6, 6], >>>>> [4, 9, 2, 4, 7, 3, 6, 7, 4, 4], >>>>> [2, 0, 7, 0, 7, 6, 6, 1, 6, 6], >>>>> [3, 8, 8, 9, 6, 7, 2, 5, 0, 0]], dtype=uint8) >>>>> >>>> The first one works because the RHS pointer is always one step ahead >>>> of the LHS pointer, thus it always reads pristine data. >>>> >>>> >>>>> In [5]: a[:, 1:] = a[:, :-1] >>>>> >>>>> In [6]: a >>>>> Out[6]: >>>>> array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0], >>>>> [6, 6, 6, 6, 6, 6, 6, 6, 6, 6], >>>>> [0, 0, 0, 0, 0, 0, 0, 0, 0, 0], >>>>> [0, 0, 0, 0, 0, 0, 0, 0, 0, 0], >>>>> [8, 8, 8, 8, 8, 8, 8, 8, 8, 8], >>>>> [7, 7, 7, 7, 7, 7, 7, 7, 7, 7], >>>>> [8, 8, 8, 8, 8, 8, 8, 8, 8, 8], >>>>> [4, 4, 4, 4, 4, 4, 4, 4, 4, 4], >>>>> [2, 2, 2, 2, 2, 2, 2, 2, 2, 2], >>>>> [3, 3, 3, 3, 3, 3, 3, 3, 3, 3]], dtype=uint8) >>>>> >>>> The second one fails to work as you expect because the RHS pointer is >>>> always one step behind the LHS pointer, thus it always reads the data >>>> that just got modified in the previous step. The data you expected it >>>> to read has already been wiped out. >>>> >>>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From doutriaux1 at llnl.gov Tue Jan 19 13:32:33 2010 From: doutriaux1 at llnl.gov (=?UTF-8?Q?Charles_=D8=B3=D9=85=D9=8A=D8=B1_Doutriaux?=) Date: Tue, 19 Jan 2010 10:32:33 -0800 Subject: [Numpy-discussion] Numpy 1.4 MaskedArray bug? In-Reply-To: <910F1C67-06BB-47C1-8FCA-057131728D47@gmail.com> References: <1263321129.7167.12.camel@idol> <910F1C67-06BB-47C1-8FCA-057131728D47@gmail.com> Message-ID: Hi Pierre, We didn't move to 1.4 yet. Should we wait for 1.4.1? It seems there's some issues with numpy.ma in 1.4 and we rely heavily on it. C. 
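For reference, the numpy.ma regression being discussed is the ma.sum(..., axis=1) failure quoted in the exchange below; a small sketch of the two stop-gap workarounds suggested there (the real fix went into the trunk and the 1.4.x branch).

import numpy as np

a = np.ma.MaskedArray([[1, 2, 3], [4, 5, 6]])

# Workaround 1: a negative axis avoids the failing code path in 1.4.0.
print(np.ma.sum(a, -1))

# Workaround 2: force a mask to be defined at construction time.
b = np.ma.masked_array([[1, 2, 3], [4, 5, 6]], mask=False)
print(b.sum(axis=1))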
On Jan 12, 2010, at 11:50 AM, Pierre GM wrote: > On Jan 12, 2010, at 1:52 PM, Charles R Harris wrote: >> >> >> >> On Tue, Jan 12, 2010 at 11:32 AM, Pauli Virtanen wrote: >> ti, 2010-01-12 kello 12:51 -0500, Pierre GM kirjoitti: >> [clip] >>>>>>> a = numpy.ma.MaskedArray([[1,2,3],[4,5,6]]) >>>>>>> numpy.ma.sum(a, 1) >>>> Traceback (most recent call last): >>>> File "", line 1, in >>>> File >>>> "/usr/lib64/python2.5/site-packages/numpy-1.4.0-py2.5-linux- >>>> x86_64.egg/n >>>> umpy/ma/core.py", line 5682, in __call__ >>>> return method(*args, **params) >>>> File >>>> "/usr/lib64/python2.5/site-packages/numpy-1.4.0-py2.5-linux- >>>> x86_64.egg/n >>>> umpy/ma/core.py", line 4357, in sum >>>> newmask = _mask.all(axis=axis) >>>> ValueError: axis(=1) out of bounds >>> >>> Confirmed. >>> Before I take full blame for it, can you try the following on both >>> 1.3 and 1.4 ? >>>>>> np.array(False).all().sum(1) >> >> Oh crap, it's mostly my fault: >> >> http://*projects.scipy.org/numpy/ticket/1286 >> http://*projects.scipy.org/numpy/changeset/7697 >> http://*projects.scipy.org/numpy/browser/trunk/doc/release/1.4.0- >> notes.rst#deprecations >> >> Pretty embarassing, as very simple things break, although the test >> suite >> miraculously passes... >> >>> Back to your problem: I'll fix that ASAIC, but it'll be on the >>> SVN. Meanwhile, you can: >>> * Use -1 instead of 1 for your axis. >>> * Force the definition of a mask when you define your array with >>> masked_array(...,mask=False) >> >> Sounds like we need a 1.4.1 out at some point not too far in the >> future, >> then. >> >> >> If so, then it should be sooner rather than later in order to sync >> with the releases of ubuntu and fedora. Both of the upcoming >> releases still use 1.3.0, but that could change... > > I guess that the easiest would be for me to provide a workaround for > the bug (Pauli's modifications make sense, I was relying on a > *feature* that wasn't very robust). > I'll update both the trunk and the 1.4.x branch > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://*mail.scipy.org/mailman/listinfo/numpy-discussion > From Chris.Barker at noaa.gov Tue Jan 19 15:07:46 2010 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Tue, 19 Jan 2010 12:07:46 -0800 Subject: [Numpy-discussion] Waf or scons/numscons for a C/Fortran/Cython/Python project -- what's your recommendation? In-Reply-To: <5b8d13221001162004w640e4854obe27a78942bb551b@mail.gmail.com> References: <5b8d13221001162004w640e4854obe27a78942bb551b@mail.gmail.com> Message-ID: <4B561112.8030600@noaa.gov> David Cournapeau wrote: > Waf codebase is much better than scons, I don't know about waf, but I do know that I tried to add OS-X application bundle support to scons, and it was really, really painful. It sure seemed like it should have been easy to do -- it's just a well-defined directory structure. > The biggest drawback I see with waf is the lack of users: the only > significant project I know which uses waf is Ardour. There was some work done to use it for wxWebKit, though I don't know what's come of that: https://bugs.webkit.org/show_bug.cgi?id=27619 -Chris -- Christopher Barker, Ph.D. 
Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From gael.varoquaux at normalesup.org Tue Jan 19 15:12:53 2010 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Tue, 19 Jan 2010 21:12:53 +0100 Subject: [Numpy-discussion] Efficient orthogonalisation with scipy/numpy Message-ID: <20100119201253.GA2353@phare.normalesup.org> Hi there, Forgive me for turning to the mailing list to do my homework. I am currently optimizing a code, and it turns out that the main bottleneck is the orthogonalisation of a vector 'y' to a set of vectors 'confounds', that I am currently doing with the following code: y = y - np.dot(confounds.T, linalg.lstsq(confounds.T, y)[0]) with np = numpy and linalg = scipy.linalg where scipy calls ATLAS. Most of the time is spent in linalg.lstsq. The length of the vectors is 810, and there are about 10 confounds. Is there a better way of doing this? Cheers, Ga?l From robert.kern at gmail.com Tue Jan 19 15:22:30 2010 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 19 Jan 2010 14:22:30 -0600 Subject: [Numpy-discussion] Efficient orthogonalisation with scipy/numpy In-Reply-To: <20100119201253.GA2353@phare.normalesup.org> References: <20100119201253.GA2353@phare.normalesup.org> Message-ID: <3d375d731001191222g3659c2d5yc35f655fbaa31ceb@mail.gmail.com> On Tue, Jan 19, 2010 at 14:12, Gael Varoquaux wrote: > Hi there, > > Forgive me for turning to the mailing list to do my homework. I am > currently optimizing a code, and it turns out that the main bottleneck is > the orthogonalisation of a vector 'y' to a set of vectors 'confounds', > that I am currently doing with the following code: > > y = y - np.dot(confounds.T, linalg.lstsq(confounds.T, y)[0]) > > with np = numpy and linalg = scipy.linalg where scipy calls ATLAS. For clarification, are you trying to find the components of the y vectors that are perpendicular to the space spanned by the 10 orthonormal vectors in confounds? > Most of the time is spent in linalg.lstsq. The length of the vectors is > 810, and there are about 10 confounds. Exactly what are the shapes? y.shape = (810, N); confounds.shape = (810, 10)? -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From gael.varoquaux at normalesup.org Tue Jan 19 15:47:08 2010 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Tue, 19 Jan 2010 21:47:08 +0100 Subject: [Numpy-discussion] Efficient orthogonalisation with scipy/numpy In-Reply-To: <3d375d731001191222g3659c2d5yc35f655fbaa31ceb@mail.gmail.com> References: <20100119201253.GA2353@phare.normalesup.org> <3d375d731001191222g3659c2d5yc35f655fbaa31ceb@mail.gmail.com> Message-ID: <20100119204708.GB2353@phare.normalesup.org> On Tue, Jan 19, 2010 at 02:22:30PM -0600, Robert Kern wrote: > > y = y - np.dot(confounds.T, linalg.lstsq(confounds.T, y)[0]) > > with np = numpy and linalg = scipy.linalg where scipy calls ATLAS. > For clarification, are you trying to find the components of the y > vectors that are perpendicular to the space spanned by the 10 > orthonormal vectors in confounds? Yes. 
Actually, what I am doing is calculating partial correlation between x and y conditionally to confounds, with the following code: def cond_partial_cor(y, x, confounds=[]): """ Returns the partial correlation of y and x, conditionning on confounds. """ # First orthogonalise y and x relative to confounds if len(confounds): y = y - np.dot(confounds.T, linalg.lstsq(confounds.T, y)[0]) x = x - np.dot(confounds.T, linalg.lstsq(confounds.T, x)[0]) return np.dot(x, y)/sqrt(np.dot(y, y)*np.dot(x, x)) I am not sure that what I am doing is optimal. > > Most of the time is spent in linalg.lstsq. The length of the vectors is > > 810, and there are about 10 confounds. > Exactly what are the shapes? y.shape = (810, N); confounds.shape = (810, 10)? Sorry, I should have been more precise: y.shape = (810, ) confounds.shape = (10, 810) Thanks, Ga?l From josef.pktd at gmail.com Tue Jan 19 15:52:54 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 19 Jan 2010 15:52:54 -0500 Subject: [Numpy-discussion] Efficient orthogonalisation with scipy/numpy In-Reply-To: <20100119204708.GB2353@phare.normalesup.org> References: <20100119201253.GA2353@phare.normalesup.org> <3d375d731001191222g3659c2d5yc35f655fbaa31ceb@mail.gmail.com> <20100119204708.GB2353@phare.normalesup.org> Message-ID: <1cd32cbb1001191252h18650e9ch3af8e8e2ce152d8@mail.gmail.com> On Tue, Jan 19, 2010 at 3:47 PM, Gael Varoquaux wrote: > On Tue, Jan 19, 2010 at 02:22:30PM -0600, Robert Kern wrote: >> > y = y - np.dot(confounds.T, linalg.lstsq(confounds.T, y)[0]) > >> > with np = numpy and linalg = scipy.linalg where scipy calls ATLAS. > >> For clarification, are you trying to find the components of the y >> vectors that are perpendicular to the space spanned by the 10 >> orthonormal vectors in confounds? > > Yes. Actually, what I am doing is calculating partial correlation between > x and y conditionally to confounds, with the following code: > > def cond_partial_cor(y, x, confounds=[]): > ? ?""" Returns the partial correlation of y and x, conditionning on > ? ? ? ?confounds. > ? ?""" > ? ?# First orthogonalise y and x relative to confounds > ? ?if len(confounds): > ? ? ? ?y = y - np.dot(confounds.T, linalg.lstsq(confounds.T, y)[0]) > ? ? ? ?x = x - np.dot(confounds.T, linalg.lstsq(confounds.T, x)[0]) you can combine x, and y for one call to leastsq, if it makes a difference linalg.lstsq(confounds.T, [x,y]) #format? columnstack? I don't see anything else yet Josef > ? ?return np.dot(x, y)/sqrt(np.dot(y, y)*np.dot(x, x)) > > I am not sure that what I am doing is optimal. > >> > Most of the time is spent in linalg.lstsq. The length of the vectors is >> > 810, and there are about 10 confounds. > >> Exactly what are the shapes? y.shape = (810, N); confounds.shape = (810, 10)? 
> > Sorry, I should have been more precise: > > y.shape = (810, ) > confounds.shape = (10, 810) > > Thanks, > > Ga?l > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From robert.kern at gmail.com Tue Jan 19 15:58:32 2010 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 19 Jan 2010 14:58:32 -0600 Subject: [Numpy-discussion] Efficient orthogonalisation with scipy/numpy In-Reply-To: <20100119204708.GB2353@phare.normalesup.org> References: <20100119201253.GA2353@phare.normalesup.org> <3d375d731001191222g3659c2d5yc35f655fbaa31ceb@mail.gmail.com> <20100119204708.GB2353@phare.normalesup.org> Message-ID: <3d375d731001191258m14c75c22o9afa7ec8543d907f@mail.gmail.com> On Tue, Jan 19, 2010 at 14:47, Gael Varoquaux wrote: > On Tue, Jan 19, 2010 at 02:22:30PM -0600, Robert Kern wrote: >> > y = y - np.dot(confounds.T, linalg.lstsq(confounds.T, y)[0]) > >> > with np = numpy and linalg = scipy.linalg where scipy calls ATLAS. > >> For clarification, are you trying to find the components of the y >> vectors that are perpendicular to the space spanned by the 10 >> orthonormal vectors in confounds? > > Yes. Actually, what I am doing is calculating partial correlation between > x and y conditionally to confounds, with the following code: > > def cond_partial_cor(y, x, confounds=[]): > ? ?""" Returns the partial correlation of y and x, conditionning on > ? ? ? ?confounds. > ? ?""" > ? ?# First orthogonalise y and x relative to confounds > ? ?if len(confounds): > ? ? ? ?y = y - np.dot(confounds.T, linalg.lstsq(confounds.T, y)[0]) > ? ? ? ?x = x - np.dot(confounds.T, linalg.lstsq(confounds.T, x)[0]) > ? ?return np.dot(x, y)/sqrt(np.dot(y, y)*np.dot(x, x)) > > I am not sure that what I am doing is optimal. If confounds is orthonormal, then there is no need to use lstsq(). y = y - np.dot(np.dot(confounds, y), confounds) -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From gael.varoquaux at normalesup.org Tue Jan 19 16:11:40 2010 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Tue, 19 Jan 2010 22:11:40 +0100 Subject: [Numpy-discussion] Efficient orthogonalisation with scipy/numpy In-Reply-To: <1cd32cbb1001191252h18650e9ch3af8e8e2ce152d8@mail.gmail.com> References: <20100119201253.GA2353@phare.normalesup.org> <3d375d731001191222g3659c2d5yc35f655fbaa31ceb@mail.gmail.com> <20100119204708.GB2353@phare.normalesup.org> <1cd32cbb1001191252h18650e9ch3af8e8e2ce152d8@mail.gmail.com> Message-ID: <20100119211140.GC2353@phare.normalesup.org> > you can combine x, and y for one call to leastsq, if it makes a difference > linalg.lstsq(confounds.T, [x,y]) #format? columnstack? Indeed! Thank you Joseph. That's a gain of 10% in the total computation time of my algorithm (and 20% on the partial correlation calculation). 
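For completeness, a sketch of what the combined call can look like. The names x, y and confounds follow the shapes given earlier in the thread, the data here is random stand-in data, and column_stack is one way (an assumption, not from the original mail) to get both right-hand sides into the (810, 2) array that lstsq expects.

import numpy as np
from scipy import linalg

n = 810
x, y = np.random.randn(2, n)
confounds = np.random.randn(10, n)

# One lstsq call for both right-hand sides instead of two.
rhs = np.column_stack([x, y])                     # shape (810, 2)
coefs = linalg.lstsq(confounds.T, rhs)[0]         # shape (10, 2)
resid = rhs - np.dot(confounds.T, coefs)          # both orthogonalised at once
x_orth, y_orth = resid.T

corr = np.dot(x_orth, y_orth) / np.sqrt(
    np.dot(x_orth, x_orth) * np.dot(y_orth, y_orth))
print(corr)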
Ga?l From gael.varoquaux at normalesup.org Tue Jan 19 16:12:36 2010 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Tue, 19 Jan 2010 22:12:36 +0100 Subject: [Numpy-discussion] Efficient orthogonalisation with scipy/numpy In-Reply-To: <3d375d731001191258m14c75c22o9afa7ec8543d907f@mail.gmail.com> References: <20100119201253.GA2353@phare.normalesup.org> <3d375d731001191222g3659c2d5yc35f655fbaa31ceb@mail.gmail.com> <20100119204708.GB2353@phare.normalesup.org> <3d375d731001191258m14c75c22o9afa7ec8543d907f@mail.gmail.com> Message-ID: <20100119211236.GD2353@phare.normalesup.org> On Tue, Jan 19, 2010 at 02:58:32PM -0600, Robert Kern wrote: > > I am not sure that what I am doing is optimal. > If confounds is orthonormal, then there is no need to use lstsq(). > y = y - np.dot(np.dot(confounds, y), confounds) Unfortunately, confounds is not orthonormal, and as it is different at each call, I cannot orthogonalise it as a preprocessing. Thanks, Ga?l From robert.kern at gmail.com Tue Jan 19 16:16:59 2010 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 19 Jan 2010 15:16:59 -0600 Subject: [Numpy-discussion] Efficient orthogonalisation with scipy/numpy In-Reply-To: <20100119211236.GD2353@phare.normalesup.org> References: <20100119201253.GA2353@phare.normalesup.org> <3d375d731001191222g3659c2d5yc35f655fbaa31ceb@mail.gmail.com> <20100119204708.GB2353@phare.normalesup.org> <3d375d731001191258m14c75c22o9afa7ec8543d907f@mail.gmail.com> <20100119211236.GD2353@phare.normalesup.org> Message-ID: <3d375d731001191316n3039bbd1k308a02e48505f9f9@mail.gmail.com> On Tue, Jan 19, 2010 at 15:12, Gael Varoquaux wrote: > On Tue, Jan 19, 2010 at 02:58:32PM -0600, Robert Kern wrote: >> > I am not sure that what I am doing is optimal. > >> If confounds is orthonormal, then there is no need to use lstsq(). > >> ? y = y - np.dot(np.dot(confounds, y), confounds) > > Unfortunately, confounds is not orthonormal, and as it is different at > each call, I cannot orthogonalise it as a preprocessing. Ah, then you shouldn't have said "Yes" when I asked if they were orthonormal. :-) However, you can orthonormalize inside the function and reuse that for both x and y. Using the QR decomposition is likely cheaper than the SVD that lstsq() does. ortho_confounds = linalg.qr(confounds.T)[0].T -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From pav at iki.fi Tue Jan 19 16:17:12 2010 From: pav at iki.fi (Pauli Virtanen) Date: Tue, 19 Jan 2010 23:17:12 +0200 Subject: [Numpy-discussion] Efficient orthogonalisation with scipy/numpy In-Reply-To: <20100119211236.GD2353@phare.normalesup.org> References: <20100119201253.GA2353@phare.normalesup.org> <3d375d731001191222g3659c2d5yc35f655fbaa31ceb@mail.gmail.com> <20100119204708.GB2353@phare.normalesup.org> <3d375d731001191258m14c75c22o9afa7ec8543d907f@mail.gmail.com> <20100119211236.GD2353@phare.normalesup.org> Message-ID: <1263935831.7146.23.camel@idol> ti, 2010-01-19 kello 22:12 +0100, Gael Varoquaux kirjoitti: > On Tue, Jan 19, 2010 at 02:58:32PM -0600, Robert Kern wrote: > > > I am not sure that what I am doing is optimal. > > > If confounds is orthonormal, then there is no need to use lstsq(). > > > y = y - np.dot(np.dot(confounds, y), confounds) > > Unfortunately, confounds is not orthonormal, and as it is different at > each call, I cannot orthogonalise it as a preprocessing. You orthonormalize it on each call, then. 
It's quite likely cheaper to do than the SVD that lstsq does. {{{ import numpy as np def gram_schmid(V): """ Gram-Schmid orthonormalization of a set of `M` vectors, in-place. Parameters ---------- V : array, shape (N, M) """ # XXX: speed can be improved by using routines from scipy.lib.blas # XXX: maybe there's an orthonormalization routine in LAPACK, too, # apart from QR. too lazy to check... n = V.shape[1] for k in xrange(n): V[:,k] /= np.linalg.norm(V[:,k]) for j in xrange(k+1, n): V[:,j] -= np.vdot(V[:,j], V[:,k]) * V[:,k] return V def relative_ortho(x, V, V_is_orthonormal=False): """ Relative orthogonalization of vector `x` versus vectors in `V`. """ if not V_is_orthonormal: gram_schmid(V) for k in xrange(V.shape[1]): x -= np.vdot(x, V[:,k])*V[:,k] return x V = np.array([[1,0,1], [1,1,0]], dtype=float).T x = np.array([1,1,1], dtype=float) relative_ortho(x, V) print x print np.dot(x, V[:,0]) print np.dot(x, V[:,1]) }}} -- Pauli Virtanen From josef.pktd at gmail.com Tue Jan 19 16:19:19 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 19 Jan 2010 16:19:19 -0500 Subject: [Numpy-discussion] Efficient orthogonalisation with scipy/numpy In-Reply-To: <20100119211140.GC2353@phare.normalesup.org> References: <20100119201253.GA2353@phare.normalesup.org> <3d375d731001191222g3659c2d5yc35f655fbaa31ceb@mail.gmail.com> <20100119204708.GB2353@phare.normalesup.org> <1cd32cbb1001191252h18650e9ch3af8e8e2ce152d8@mail.gmail.com> <20100119211140.GC2353@phare.normalesup.org> Message-ID: <1cd32cbb1001191319m4aac7e8ar4ed57489bb654c7d@mail.gmail.com> On Tue, Jan 19, 2010 at 4:11 PM, Gael Varoquaux wrote: >> you can ?combine x, and y for one call to leastsq, if it makes a difference >> linalg.lstsq(confounds.T, [x,y]) ?#format? columnstack? > > Indeed! Thank you Joseph. That's a gain of 10% in the total computation > time of my algorithm (and 20% on the partial correlation calculation). if you have z=[x,y] stacked, just one call to dot might also help for the correlation zz = dot(z.T, z) zz/sqrt(zz[0,0]*zz[1,1]) You might be able to do everything in a stacked version/ there is no ph in Josef (unless you talk about my french father-in-law) Josef > > Ga?l > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From charlesr.harris at gmail.com Tue Jan 19 16:29:06 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 19 Jan 2010 14:29:06 -0700 Subject: [Numpy-discussion] Efficient orthogonalisation with scipy/numpy In-Reply-To: <20100119204708.GB2353@phare.normalesup.org> References: <20100119201253.GA2353@phare.normalesup.org> <3d375d731001191222g3659c2d5yc35f655fbaa31ceb@mail.gmail.com> <20100119204708.GB2353@phare.normalesup.org> Message-ID: On Tue, Jan 19, 2010 at 1:47 PM, Gael Varoquaux < gael.varoquaux at normalesup.org> wrote: > On Tue, Jan 19, 2010 at 02:22:30PM -0600, Robert Kern wrote: > > > y = y - np.dot(confounds.T, linalg.lstsq(confounds.T, y)[0]) > > > > with np = numpy and linalg = scipy.linalg where scipy calls ATLAS. > > > For clarification, are you trying to find the components of the y > > vectors that are perpendicular to the space spanned by the 10 > > orthonormal vectors in confounds? > > Yes. 
Actually, what I am doing is calculating partial correlation between > x and y conditionally to confounds, with the following code: > > def cond_partial_cor(y, x, confounds=[]): > """ Returns the partial correlation of y and x, conditionning on > confounds. > """ > # First orthogonalise y and x relative to confounds > if len(confounds): > y = y - np.dot(confounds.T, linalg.lstsq(confounds.T, y)[0]) > x = x - np.dot(confounds.T, linalg.lstsq(confounds.T, x)[0]) > return np.dot(x, y)/sqrt(np.dot(y, y)*np.dot(x, x)) > > I am not sure that what I am doing is optimal. > > > > Most of the time is spent in linalg.lstsq. The length of the vectors is > > > 810, and there are about 10 confounds. > > > Exactly what are the shapes? y.shape = (810, N); confounds.shape = (810, > 10)? > > Sorry, I should have been more precise: > > y.shape = (810, ) > confounds.shape = (10, 810) > > Column stack the bunch so that the last column is y, then do a qr decomposition. The last column of q is the (normalized) orthogonal vector and its amplitude is the last (bottom right) component of r. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Tue Jan 19 16:34:09 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 19 Jan 2010 16:34:09 -0500 Subject: [Numpy-discussion] Efficient orthogonalisation with scipy/numpy In-Reply-To: References: <20100119201253.GA2353@phare.normalesup.org> <3d375d731001191222g3659c2d5yc35f655fbaa31ceb@mail.gmail.com> <20100119204708.GB2353@phare.normalesup.org> Message-ID: <1cd32cbb1001191334o438f0506hf75e3345f3596d7b@mail.gmail.com> On Tue, Jan 19, 2010 at 4:29 PM, Charles R Harris wrote: > > > On Tue, Jan 19, 2010 at 1:47 PM, Gael Varoquaux > wrote: >> >> On Tue, Jan 19, 2010 at 02:22:30PM -0600, Robert Kern wrote: >> > > y = y - np.dot(confounds.T, linalg.lstsq(confounds.T, y)[0]) >> >> > > with np = numpy and linalg = scipy.linalg where scipy calls ATLAS. >> >> > For clarification, are you trying to find the components of the y >> > vectors that are perpendicular to the space spanned by the 10 >> > orthonormal vectors in confounds? >> >> Yes. Actually, what I am doing is calculating partial correlation between >> x and y conditionally to confounds, with the following code: >> >> def cond_partial_cor(y, x, confounds=[]): >> ? ?""" Returns the partial correlation of y and x, conditionning on >> ? ? ? ?confounds. >> ? ?""" >> ? ?# First orthogonalise y and x relative to confounds >> ? ?if len(confounds): >> ? ? ? ?y = y - np.dot(confounds.T, linalg.lstsq(confounds.T, y)[0]) >> ? ? ? ?x = x - np.dot(confounds.T, linalg.lstsq(confounds.T, x)[0]) >> ? ?return np.dot(x, y)/sqrt(np.dot(y, y)*np.dot(x, x)) >> >> I am not sure that what I am doing is optimal. >> >> > > Most of the time is spent in linalg.lstsq. The length of the vectors >> > > is >> > > 810, and there are about 10 confounds. >> >> > Exactly what are the shapes? y.shape = (810, N); confounds.shape = (810, >> > 10)? >> >> Sorry, I should have been more precise: >> >> y.shape = (810, ) >> confounds.shape = (10, 810) >> > > Column stack the bunch so that the last column is y, then do a qr > decomposition. The last column of q is the (normalized) orthogonal vector > and its amplitude is the last (bottom right) component of r. do you have to do qr twice, once with x and once with y in the last column or can this be combined? 
I was trying to do something similar for partial autocorrelation for timeseries but didn't manage or try anything better than repeated leastsq or a variant. > > Chuck > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From gael.varoquaux at normalesup.org Tue Jan 19 16:37:54 2010 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Tue, 19 Jan 2010 22:37:54 +0100 Subject: [Numpy-discussion] Efficient orthogonalisation with scipy/numpy In-Reply-To: <3d375d731001191316n3039bbd1k308a02e48505f9f9@mail.gmail.com> References: <20100119201253.GA2353@phare.normalesup.org> <3d375d731001191222g3659c2d5yc35f655fbaa31ceb@mail.gmail.com> <20100119204708.GB2353@phare.normalesup.org> <3d375d731001191258m14c75c22o9afa7ec8543d907f@mail.gmail.com> <20100119211236.GD2353@phare.normalesup.org> <3d375d731001191316n3039bbd1k308a02e48505f9f9@mail.gmail.com> Message-ID: <20100119213754.GE2353@phare.normalesup.org> On Tue, Jan 19, 2010 at 03:16:59PM -0600, Robert Kern wrote: > On Tue, Jan 19, 2010 at 15:12, Gael Varoquaux > wrote: > > On Tue, Jan 19, 2010 at 02:58:32PM -0600, Robert Kern wrote: > >> > I am not sure that what I am doing is optimal. > >> If confounds is orthonormal, then there is no need to use lstsq(). > >> ? y = y - np.dot(np.dot(confounds, y), confounds) > > Unfortunately, confounds is not orthonormal, and as it is different at > > each call, I cannot orthogonalise it as a preprocessing. > Ah, then you shouldn't have said "Yes" when I asked if they were > orthonormal. :-) > However, you can orthonormalize inside the function and reuse that for > both x and y. Using the QR decomposition is likely cheaper than the > SVD that lstsq() does. > ortho_confounds = linalg.qr(confounds.T)[0].T Indeed! I wasn't aware that lstsq did an SVD. I thought it did the QR. Though calculating the QR once for both vector is anyhow a gain. I got another 20% speed gain in my total run time. Thanks! For the google-completness of this thread, to get a speed gain, one needs to use the 'econ=True' flag to qr. Ga?l From robert.kern at gmail.com Tue Jan 19 16:43:25 2010 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 19 Jan 2010 15:43:25 -0600 Subject: [Numpy-discussion] Efficient orthogonalisation with scipy/numpy In-Reply-To: References: <20100119201253.GA2353@phare.normalesup.org> <3d375d731001191222g3659c2d5yc35f655fbaa31ceb@mail.gmail.com> <20100119204708.GB2353@phare.normalesup.org> Message-ID: <3d375d731001191343x382f9a3en4a791d04aaef6a67@mail.gmail.com> On Tue, Jan 19, 2010 at 15:29, Charles R Harris wrote: > Column stack the bunch so that the last column is y, then do a qr > decomposition. The last column of q is the (normalized) orthogonal vector > and its amplitude is the last (bottom right) component of r. Is the order actually guaranteed? In a quick test, it seems to work. In any case, I suspect that needing to do both x and y will make doing the QR once and some two pairs of dot products a better proposition than two QR decompositons. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From gael.varoquaux at normalesup.org Tue Jan 19 16:45:34 2010 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Tue, 19 Jan 2010 22:45:34 +0100 Subject: [Numpy-discussion] Efficient orthogonalisation with scipy/numpy In-Reply-To: <3d375d731001191343x382f9a3en4a791d04aaef6a67@mail.gmail.com> References: <20100119201253.GA2353@phare.normalesup.org> <3d375d731001191222g3659c2d5yc35f655fbaa31ceb@mail.gmail.com> <20100119204708.GB2353@phare.normalesup.org> <3d375d731001191343x382f9a3en4a791d04aaef6a67@mail.gmail.com> Message-ID: <20100119214534.GF2353@phare.normalesup.org> On Tue, Jan 19, 2010 at 03:43:25PM -0600, Robert Kern wrote: > On Tue, Jan 19, 2010 at 15:29, Charles R Harris > wrote: > > Column stack the bunch so that the last column is y, then do a qr > > decomposition. The last column of q is the (normalized) orthogonal vector > > and its amplitude is the last (bottom right) component of r. > Is the order actually guaranteed? In a quick test, it seems to work. > In any case, I suspect that needing to do both x and y will make doing > the QR once and some two pairs of dot products a better proposition > than two QR decompositons. Yes, that's correct, but my initial question wasn't clear enough, and Chuck was answering my initial question. Thanks a lot Chuck, Ga?l From peridot.faceted at gmail.com Tue Jan 19 17:02:20 2010 From: peridot.faceted at gmail.com (Anne Archibald) Date: Tue, 19 Jan 2010 17:02:20 -0500 Subject: [Numpy-discussion] Efficient orthogonalisation with scipy/numpy In-Reply-To: <20100119213754.GE2353@phare.normalesup.org> References: <20100119201253.GA2353@phare.normalesup.org> <3d375d731001191222g3659c2d5yc35f655fbaa31ceb@mail.gmail.com> <20100119204708.GB2353@phare.normalesup.org> <3d375d731001191258m14c75c22o9afa7ec8543d907f@mail.gmail.com> <20100119211236.GD2353@phare.normalesup.org> <3d375d731001191316n3039bbd1k308a02e48505f9f9@mail.gmail.com> <20100119213754.GE2353@phare.normalesup.org> Message-ID: 2010/1/19 Gael Varoquaux : > For the google-completness of this thread, to get a speed gain, one needs > to use the 'econ=True' flag to qr. Be warned that in some installations (in particular some using ATLAS), supplying econ=True can cause a segfault under certain conditions (I think only when the arrays are misaligned, e.g. coming from unpickling, and even then only with certain data); the bugfix for this may or may not be complete, and involves copying misaligned arrays. Anne From gael.varoquaux at normalesup.org Tue Jan 19 17:05:01 2010 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Tue, 19 Jan 2010 23:05:01 +0100 Subject: [Numpy-discussion] Efficient orthogonalisation with scipy/numpy In-Reply-To: References: <20100119201253.GA2353@phare.normalesup.org> <3d375d731001191222g3659c2d5yc35f655fbaa31ceb@mail.gmail.com> <20100119204708.GB2353@phare.normalesup.org> <3d375d731001191258m14c75c22o9afa7ec8543d907f@mail.gmail.com> <20100119211236.GD2353@phare.normalesup.org> <3d375d731001191316n3039bbd1k308a02e48505f9f9@mail.gmail.com> <20100119213754.GE2353@phare.normalesup.org> Message-ID: <20100119220501.GG2353@phare.normalesup.org> On Tue, Jan 19, 2010 at 05:02:20PM -0500, Anne Archibald wrote: > Be warned that in some installations (in particular some using ATLAS), > supplying econ=True can cause a segfault under certain conditions (I > think only when the arrays are misaligned, e.g. 
coming from > unpickling, and even then only with certain data); the bugfix for this > may or may not be complete, and involves copying misaligned arrays. Thanks for the warning. That's nasty. I'll test this around in my lab. I don't think that arrays can be misaligned in my case, but I find it always hard to be certain. Ga?l From charlesr.harris at gmail.com Tue Jan 19 18:48:44 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 19 Jan 2010 16:48:44 -0700 Subject: [Numpy-discussion] Efficient orthogonalisation with scipy/numpy In-Reply-To: <1cd32cbb1001191334o438f0506hf75e3345f3596d7b@mail.gmail.com> References: <20100119201253.GA2353@phare.normalesup.org> <3d375d731001191222g3659c2d5yc35f655fbaa31ceb@mail.gmail.com> <20100119204708.GB2353@phare.normalesup.org> <1cd32cbb1001191334o438f0506hf75e3345f3596d7b@mail.gmail.com> Message-ID: On Tue, Jan 19, 2010 at 2:34 PM, wrote: > On Tue, Jan 19, 2010 at 4:29 PM, Charles R Harris > wrote: > > > > > > On Tue, Jan 19, 2010 at 1:47 PM, Gael Varoquaux > > wrote: > >> > >> On Tue, Jan 19, 2010 at 02:22:30PM -0600, Robert Kern wrote: > >> > > y = y - np.dot(confounds.T, linalg.lstsq(confounds.T, y)[0]) > >> > >> > > with np = numpy and linalg = scipy.linalg where scipy calls ATLAS. > >> > >> > For clarification, are you trying to find the components of the y > >> > vectors that are perpendicular to the space spanned by the 10 > >> > orthonormal vectors in confounds? > >> > >> Yes. Actually, what I am doing is calculating partial correlation > between > >> x and y conditionally to confounds, with the following code: > >> > >> def cond_partial_cor(y, x, confounds=[]): > >> """ Returns the partial correlation of y and x, conditionning on > >> confounds. > >> """ > >> # First orthogonalise y and x relative to confounds > >> if len(confounds): > >> y = y - np.dot(confounds.T, linalg.lstsq(confounds.T, y)[0]) > >> x = x - np.dot(confounds.T, linalg.lstsq(confounds.T, x)[0]) > >> return np.dot(x, y)/sqrt(np.dot(y, y)*np.dot(x, x)) > >> > >> I am not sure that what I am doing is optimal. > >> > >> > > Most of the time is spent in linalg.lstsq. The length of the vectors > >> > > is > >> > > 810, and there are about 10 confounds. > >> > >> > Exactly what are the shapes? y.shape = (810, N); confounds.shape = > (810, > >> > 10)? > >> > >> Sorry, I should have been more precise: > >> > >> y.shape = (810, ) > >> confounds.shape = (10, 810) > >> > > > > Column stack the bunch so that the last column is y, then do a qr > > decomposition. The last column of q is the (normalized) orthogonal vector > > and its amplitude is the last (bottom right) component of r. > > do you have to do qr twice, once with x and once with y in the last > column or can this be combined? > > I was trying to do something similar for partial autocorrelation for > timeseries but didn't manage or try anything better than repeated > leastsq or a variant. > > Depends on what you want to do. The QR decomposition is essentially Gram-Schmidt on the columns. So if you just want an orthonormal basis for the subspace spanned by a bunch of columns, the columns of Q are they. To get the part of y orthogonal to that subspace you can do y - Q*Q.T*y, which is probably what you want if the x's are fixed and the y's vary. If there is just one y, then putting it as the last column lets the QR algorithm do that last bit of projection. 
Note that if you apply the QR algorithm to a Vandermonde matrix with the columns properly ordered you can get a collection of graded orthogonal polynomials over a given set of points. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From peridot.faceted at gmail.com Tue Jan 19 19:12:38 2010 From: peridot.faceted at gmail.com (Anne Archibald) Date: Tue, 19 Jan 2010 19:12:38 -0500 Subject: [Numpy-discussion] Efficient orthogonalisation with scipy/numpy In-Reply-To: References: <20100119201253.GA2353@phare.normalesup.org> <3d375d731001191222g3659c2d5yc35f655fbaa31ceb@mail.gmail.com> <20100119204708.GB2353@phare.normalesup.org> <1cd32cbb1001191334o438f0506hf75e3345f3596d7b@mail.gmail.com> Message-ID: 2010/1/19 Charles R Harris : > > Note that if you apply the QR algorithm to a Vandermonde matrix with the > columns properly ordered you can get a collection of graded orthogonal > polynomials over a given set of points. Or, if you want the polynomials in some other representation - by values, or in terms of some basis of orthogonal polynomials - you can construct an appropriate Vandermonde-style matrix and use QR. (When I tried this, switching from the power basis to the Chebyshev basis let me go from tens to hundreds of polynomials, and now Chebyshev polynomials are first-class objects.) Anne From josef.pktd at gmail.com Tue Jan 19 20:08:46 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 19 Jan 2010 20:08:46 -0500 Subject: [Numpy-discussion] Efficient orthogonalisation with scipy/numpy In-Reply-To: References: <20100119201253.GA2353@phare.normalesup.org> <3d375d731001191222g3659c2d5yc35f655fbaa31ceb@mail.gmail.com> <20100119204708.GB2353@phare.normalesup.org> <1cd32cbb1001191334o438f0506hf75e3345f3596d7b@mail.gmail.com> Message-ID: <1cd32cbb1001191708n29438ce0q62e9b091981f5bd2@mail.gmail.com> On Tue, Jan 19, 2010 at 6:48 PM, Charles R Harris wrote: > > > On Tue, Jan 19, 2010 at 2:34 PM, wrote: >> >> On Tue, Jan 19, 2010 at 4:29 PM, Charles R Harris >> wrote: >> > >> > >> > On Tue, Jan 19, 2010 at 1:47 PM, Gael Varoquaux >> > wrote: >> >> >> >> On Tue, Jan 19, 2010 at 02:22:30PM -0600, Robert Kern wrote: >> >> > > y = y - np.dot(confounds.T, linalg.lstsq(confounds.T, y)[0]) >> >> >> >> > > with np = numpy and linalg = scipy.linalg where scipy calls ATLAS. >> >> >> >> > For clarification, are you trying to find the components of the y >> >> > vectors that are perpendicular to the space spanned by the 10 >> >> > orthonormal vectors in confounds? >> >> >> >> Yes. Actually, what I am doing is calculating partial correlation >> >> between >> >> x and y conditionally to confounds, with the following code: >> >> >> >> def cond_partial_cor(y, x, confounds=[]): >> >> ? ?""" Returns the partial correlation of y and x, conditionning on >> >> ? ? ? ?confounds. >> >> ? ?""" >> >> ? ?# First orthogonalise y and x relative to confounds >> >> ? ?if len(confounds): >> >> ? ? ? ?y = y - np.dot(confounds.T, linalg.lstsq(confounds.T, y)[0]) >> >> ? ? ? ?x = x - np.dot(confounds.T, linalg.lstsq(confounds.T, x)[0]) >> >> ? ?return np.dot(x, y)/sqrt(np.dot(y, y)*np.dot(x, x)) >> >> >> >> I am not sure that what I am doing is optimal. >> >> >> >> > > Most of the time is spent in linalg.lstsq. The length of the >> >> > > vectors >> >> > > is >> >> > > 810, and there are about 10 confounds. >> >> >> >> > Exactly what are the shapes? y.shape = (810, N); confounds.shape = >> >> > (810, >> >> > 10)? 
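To make the Vandermonde remark above concrete, a small sketch; the sample points and the degree-5 cut-off are arbitrary choices, not from the thread.

import numpy as np

pts = np.linspace(-1, 1, 50)

# Vandermonde matrix with the columns ordered by degree: 1, x, x**2, ...
V = np.vander(pts, 6)[:, ::-1]

# QR is Gram-Schmidt on the columns, so the columns of Q evaluate a
# graded family of polynomials, orthonormal over these sample points.
Q, R = np.linalg.qr(V)

print(np.allclose(np.dot(Q.T, Q), np.eye(6)))     # True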
>> >> >> >> Sorry, I should have been more precise: >> >> >> >> y.shape = (810, ) >> >> confounds.shape = (10, 810) >> >> >> > >> > Column stack the bunch so that the last column is y, then do a qr >> > decomposition. The last column of q is the (normalized) orthogonal >> > vector >> > and its amplitude is the last (bottom right) component of r. >> >> do you have to do qr twice, once with x and once with y in the last >> column or can this be combined? >> >> I was trying to do something similar for partial autocorrelation for >> timeseries but didn't manage or try anything better than repeated >> leastsq or a variant. >> > > Depends on what you want to do. The QR decomposition is essentially > Gram-Schmidt on the columns. So if you just want an orthonormal basis for > the subspace spanned by a bunch of columns, the columns of Q are they. To > get the part of y orthogonal to that subspace you can do y - Q*Q.T*y, which > is probably what you want if the x's are fixed and the y's vary. If there is > just one y, then putting it as the last column lets the QR algorithm do that > last bit of projection. Gram-Schmidt (looking at it for the first time) looks a lot like sequential least squares projection. So, I'm trying to figure out if I can use the partial results up to a specific column as partial least squares and then work my way to the end by including/looking at more columns. But unfortunately I don't have time to play with it long enough to figure out whether and how it works, but I keep this in mind for the future. Thanks, Josef > > Note that if you apply the QR algorithm to a Vandermonde matrix with the > columns properly ordered you can get a collection of graded orthogonal > polynomials over a given set of points. > > Chuck > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From charlesr.harris at gmail.com Tue Jan 19 21:47:04 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 19 Jan 2010 19:47:04 -0700 Subject: [Numpy-discussion] Efficient orthogonalisation with scipy/numpy In-Reply-To: <1cd32cbb1001191708n29438ce0q62e9b091981f5bd2@mail.gmail.com> References: <20100119201253.GA2353@phare.normalesup.org> <3d375d731001191222g3659c2d5yc35f655fbaa31ceb@mail.gmail.com> <20100119204708.GB2353@phare.normalesup.org> <1cd32cbb1001191334o438f0506hf75e3345f3596d7b@mail.gmail.com> <1cd32cbb1001191708n29438ce0q62e9b091981f5bd2@mail.gmail.com> Message-ID: On Tue, Jan 19, 2010 at 6:08 PM, wrote: > On Tue, Jan 19, 2010 at 6:48 PM, Charles R Harris > wrote: > > > > > > On Tue, Jan 19, 2010 at 2:34 PM, wrote: > >> > >> On Tue, Jan 19, 2010 at 4:29 PM, Charles R Harris > >> wrote: > >> > > >> > > >> > On Tue, Jan 19, 2010 at 1:47 PM, Gael Varoquaux > >> > wrote: > >> >> > >> >> On Tue, Jan 19, 2010 at 02:22:30PM -0600, Robert Kern wrote: > >> >> > > y = y - np.dot(confounds.T, linalg.lstsq(confounds.T, y)[0]) > >> >> > >> >> > > with np = numpy and linalg = scipy.linalg where scipy calls > ATLAS. > >> >> > >> >> > For clarification, are you trying to find the components of the y > >> >> > vectors that are perpendicular to the space spanned by the 10 > >> >> > orthonormal vectors in confounds? > >> >> > >> >> Yes. 
Actually, what I am doing is calculating partial correlation > >> >> between > >> >> x and y conditionally to confounds, with the following code: > >> >> > >> >> def cond_partial_cor(y, x, confounds=[]): > >> >> """ Returns the partial correlation of y and x, conditionning on > >> >> confounds. > >> >> """ > >> >> # First orthogonalise y and x relative to confounds > >> >> if len(confounds): > >> >> y = y - np.dot(confounds.T, linalg.lstsq(confounds.T, y)[0]) > >> >> x = x - np.dot(confounds.T, linalg.lstsq(confounds.T, x)[0]) > >> >> return np.dot(x, y)/sqrt(np.dot(y, y)*np.dot(x, x)) > >> >> > >> >> I am not sure that what I am doing is optimal. > >> >> > >> >> > > Most of the time is spent in linalg.lstsq. The length of the > >> >> > > vectors > >> >> > > is > >> >> > > 810, and there are about 10 confounds. > >> >> > >> >> > Exactly what are the shapes? y.shape = (810, N); confounds.shape = > >> >> > (810, > >> >> > 10)? > >> >> > >> >> Sorry, I should have been more precise: > >> >> > >> >> y.shape = (810, ) > >> >> confounds.shape = (10, 810) > >> >> > >> > > >> > Column stack the bunch so that the last column is y, then do a qr > >> > decomposition. The last column of q is the (normalized) orthogonal > >> > vector > >> > and its amplitude is the last (bottom right) component of r. > >> > >> do you have to do qr twice, once with x and once with y in the last > >> column or can this be combined? > >> > >> I was trying to do something similar for partial autocorrelation for > >> timeseries but didn't manage or try anything better than repeated > >> leastsq or a variant. > >> > > > > Depends on what you want to do. The QR decomposition is essentially > > Gram-Schmidt on the columns. So if you just want an orthonormal basis for > > the subspace spanned by a bunch of columns, the columns of Q are they. To > > get the part of y orthogonal to that subspace you can do y - Q*Q.T*y, > which > > is probably what you want if the x's are fixed and the y's vary. If there > is > > just one y, then putting it as the last column lets the QR algorithm do > that > > last bit of projection. > > Gram-Schmidt (looking at it for the first time) looks a lot like > sequential least squares projection. So, I'm trying to figure out if I > can use the partial results up to a specific column as partial least > squares and then work my way to the end by including/looking at more > columns. > > I don't the QR factorization would work for normal PLS. IIRC, one of the algorithms does a svd of the cross correlation matrix. The difference is that in some sense the svd picks out the best linear combination of columns, while the qr factorization without column pivoting just takes them in order. The QR factorization used to be the method of choice for least squares because it is straight forward to compute, no iterating needed as in svd, but these days that advantage is pretty much gone. It is still a common first step in the svd, however. The matrix is factored to Q*R, then the svd of R is computed. > But unfortunately I don't have time to play with it long enough to > figure out whether and how it works, but I keep this in mind for the > future. > > Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
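As a side note to that last remark (QR as a first step of the SVD), a small sketch of how the two factorizations combine; the matrix here is random and purely illustrative:

import numpy as np

rng = np.random.RandomState(1)
a = rng.randn(810, 10)

q, r = np.linalg.qr(a)            # q is (810, 10), r is (10, 10)
u_r, s, vt = np.linalg.svd(r)     # svd of the small triangular factor
u = np.dot(q, u_r)                # u, s, vt is now an svd of a itself

assert np.allclose(np.dot(u * s, vt), a)
assert np.allclose(s, np.linalg.svd(a, compute_uv=False))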
URL: From josef.pktd at gmail.com Tue Jan 19 22:02:37 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 19 Jan 2010 22:02:37 -0500 Subject: [Numpy-discussion] Efficient orthogonalisation with scipy/numpy In-Reply-To: References: <20100119201253.GA2353@phare.normalesup.org> <3d375d731001191222g3659c2d5yc35f655fbaa31ceb@mail.gmail.com> <20100119204708.GB2353@phare.normalesup.org> <1cd32cbb1001191334o438f0506hf75e3345f3596d7b@mail.gmail.com> <1cd32cbb1001191708n29438ce0q62e9b091981f5bd2@mail.gmail.com> Message-ID: <1cd32cbb1001191902q5ccd01ahb6020aa959a3fddf@mail.gmail.com> On Tue, Jan 19, 2010 at 9:47 PM, Charles R Harris wrote: > > > On Tue, Jan 19, 2010 at 6:08 PM, wrote: >> >> On Tue, Jan 19, 2010 at 6:48 PM, Charles R Harris >> wrote: >> > >> > >> > On Tue, Jan 19, 2010 at 2:34 PM, wrote: >> >> >> >> On Tue, Jan 19, 2010 at 4:29 PM, Charles R Harris >> >> wrote: >> >> > >> >> > >> >> > On Tue, Jan 19, 2010 at 1:47 PM, Gael Varoquaux >> >> > wrote: >> >> >> >> >> >> On Tue, Jan 19, 2010 at 02:22:30PM -0600, Robert Kern wrote: >> >> >> > > y = y - np.dot(confounds.T, linalg.lstsq(confounds.T, y)[0]) >> >> >> >> >> >> > > with np = numpy and linalg = scipy.linalg where scipy calls >> >> >> > > ATLAS. >> >> >> >> >> >> > For clarification, are you trying to find the components of the y >> >> >> > vectors that are perpendicular to the space spanned by the 10 >> >> >> > orthonormal vectors in confounds? >> >> >> >> >> >> Yes. Actually, what I am doing is calculating partial correlation >> >> >> between >> >> >> x and y conditionally to confounds, with the following code: >> >> >> >> >> >> def cond_partial_cor(y, x, confounds=[]): >> >> >> ? ?""" Returns the partial correlation of y and x, conditionning on >> >> >> ? ? ? ?confounds. >> >> >> ? ?""" >> >> >> ? ?# First orthogonalise y and x relative to confounds >> >> >> ? ?if len(confounds): >> >> >> ? ? ? ?y = y - np.dot(confounds.T, linalg.lstsq(confounds.T, y)[0]) >> >> >> ? ? ? ?x = x - np.dot(confounds.T, linalg.lstsq(confounds.T, x)[0]) >> >> >> ? ?return np.dot(x, y)/sqrt(np.dot(y, y)*np.dot(x, x)) >> >> >> >> >> >> I am not sure that what I am doing is optimal. >> >> >> >> >> >> > > Most of the time is spent in linalg.lstsq. The length of the >> >> >> > > vectors >> >> >> > > is >> >> >> > > 810, and there are about 10 confounds. >> >> >> >> >> >> > Exactly what are the shapes? y.shape = (810, N); confounds.shape = >> >> >> > (810, >> >> >> > 10)? >> >> >> >> >> >> Sorry, I should have been more precise: >> >> >> >> >> >> y.shape = (810, ) >> >> >> confounds.shape = (10, 810) >> >> >> >> >> > >> >> > Column stack the bunch so that the last column is y, then do a qr >> >> > decomposition. The last column of q is the (normalized) orthogonal >> >> > vector >> >> > and its amplitude is the last (bottom right) component of r. >> >> >> >> do you have to do qr twice, once with x and once with y in the last >> >> column or can this be combined? >> >> >> >> I was trying to do something similar for partial autocorrelation for >> >> timeseries but didn't manage or try anything better than repeated >> >> leastsq or a variant. >> >> >> > >> > Depends on what you want to do. The QR decomposition is essentially >> > Gram-Schmidt on the columns. So if you just want an orthonormal basis >> > for >> > the subspace spanned by a bunch of columns, the columns of Q are they. >> > To >> > get the part of y orthogonal to that subspace you can do y - Q*Q.T*y, >> > which >> > is probably what you want if the x's are fixed and the y's vary. 
If >> > there is >> > just one y, then putting it as the last column lets the QR algorithm do >> > that >> > last bit of projection. >> >> Gram-Schmidt (looking at it for the first time) looks a lot like >> sequential least squares projection. So, I'm trying to figure out if I >> can use the partial results up to a specific column as partial least >> squares and then work my way to the end by including/looking at more >> columns. >> > > I don't the QR factorization would work for normal PLS. IIRC, one of the > algorithms does a svd of the cross correlation matrix. The difference is > that in some sense the svd picks out the best linear combination of columns, > while the qr factorization without column pivoting just takes them in order. > The QR factorization used to be the method of choice for least squares > because it is straight forward to compute, no iterating needed as in svd, > but these days that advantage is pretty much gone. It is still a common > first step in the svd, however. The matrix is factored to Q*R, then the svd > of R is computed. I (finally) figured out svd and eigenvalue decomposition for this purpose. But from your description of QR, I thought specifically of the case where we have a "natural" ordering of the regressors, similar to the polynomial case of you and Anne. In the timeseries case it would be by increasing lags yt on y_{t-1} yt on y_{t-1}, y_{t-2} ... ... yt on y_{t-k} for k= 1,...,K or yt on xt and the lags of xt This is really sequential LS with a predefined sequence, not PLS or PCA/PCR or similar orthogonalization by "importance". The usual procedure for deciding on the appropriate number of lags usually loops over OLS with increasing number of regressors. >From the discussion, I thought there might be a way to "cheat" in this using QR and Gram-Schmidt Thanks, Josef > >> >> But unfortunately I don't have time to play with it long enough to >> figure out whether and how it works, but I keep this in mind for the >> future. 
>> > > Chuck > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From charlesr.harris at gmail.com Tue Jan 19 22:13:37 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 19 Jan 2010 20:13:37 -0700 Subject: [Numpy-discussion] Efficient orthogonalisation with scipy/numpy In-Reply-To: <1cd32cbb1001191902q5ccd01ahb6020aa959a3fddf@mail.gmail.com> References: <20100119201253.GA2353@phare.normalesup.org> <3d375d731001191222g3659c2d5yc35f655fbaa31ceb@mail.gmail.com> <20100119204708.GB2353@phare.normalesup.org> <1cd32cbb1001191334o438f0506hf75e3345f3596d7b@mail.gmail.com> <1cd32cbb1001191708n29438ce0q62e9b091981f5bd2@mail.gmail.com> <1cd32cbb1001191902q5ccd01ahb6020aa959a3fddf@mail.gmail.com> Message-ID: On Tue, Jan 19, 2010 at 8:02 PM, wrote: > On Tue, Jan 19, 2010 at 9:47 PM, Charles R Harris > wrote: > > > > > > On Tue, Jan 19, 2010 at 6:08 PM, wrote: > >> > >> On Tue, Jan 19, 2010 at 6:48 PM, Charles R Harris > >> wrote: > >> > > >> > > >> > On Tue, Jan 19, 2010 at 2:34 PM, wrote: > >> >> > >> >> On Tue, Jan 19, 2010 at 4:29 PM, Charles R Harris > >> >> wrote: > >> >> > > >> >> > > >> >> > On Tue, Jan 19, 2010 at 1:47 PM, Gael Varoquaux > >> >> > wrote: > >> >> >> > >> >> >> On Tue, Jan 19, 2010 at 02:22:30PM -0600, Robert Kern wrote: > >> >> >> > > y = y - np.dot(confounds.T, linalg.lstsq(confounds.T, y)[0]) > >> >> >> > >> >> >> > > with np = numpy and linalg = scipy.linalg where scipy calls > >> >> >> > > ATLAS. > >> >> >> > >> >> >> > For clarification, are you trying to find the components of the > y > >> >> >> > vectors that are perpendicular to the space spanned by the 10 > >> >> >> > orthonormal vectors in confounds? > >> >> >> > >> >> >> Yes. Actually, what I am doing is calculating partial correlation > >> >> >> between > >> >> >> x and y conditionally to confounds, with the following code: > >> >> >> > >> >> >> def cond_partial_cor(y, x, confounds=[]): > >> >> >> """ Returns the partial correlation of y and x, conditionning > on > >> >> >> confounds. > >> >> >> """ > >> >> >> # First orthogonalise y and x relative to confounds > >> >> >> if len(confounds): > >> >> >> y = y - np.dot(confounds.T, linalg.lstsq(confounds.T, > y)[0]) > >> >> >> x = x - np.dot(confounds.T, linalg.lstsq(confounds.T, > x)[0]) > >> >> >> return np.dot(x, y)/sqrt(np.dot(y, y)*np.dot(x, x)) > >> >> >> > >> >> >> I am not sure that what I am doing is optimal. > >> >> >> > >> >> >> > > Most of the time is spent in linalg.lstsq. The length of the > >> >> >> > > vectors > >> >> >> > > is > >> >> >> > > 810, and there are about 10 confounds. > >> >> >> > >> >> >> > Exactly what are the shapes? y.shape = (810, N); confounds.shape > = > >> >> >> > (810, > >> >> >> > 10)? > >> >> >> > >> >> >> Sorry, I should have been more precise: > >> >> >> > >> >> >> y.shape = (810, ) > >> >> >> confounds.shape = (10, 810) > >> >> >> > >> >> > > >> >> > Column stack the bunch so that the last column is y, then do a qr > >> >> > decomposition. The last column of q is the (normalized) orthogonal > >> >> > vector > >> >> > and its amplitude is the last (bottom right) component of r. > >> >> > >> >> do you have to do qr twice, once with x and once with y in the last > >> >> column or can this be combined? > >> >> > >> >> I was trying to do something similar for partial autocorrelation for > >> >> timeseries but didn't manage or try anything better than repeated > >> >> leastsq or a variant. 
> >> >> > >> > > >> > Depends on what you want to do. The QR decomposition is essentially > >> > Gram-Schmidt on the columns. So if you just want an orthonormal basis > >> > for > >> > the subspace spanned by a bunch of columns, the columns of Q are they. > >> > To > >> > get the part of y orthogonal to that subspace you can do y - Q*Q.T*y, > >> > which > >> > is probably what you want if the x's are fixed and the y's vary. If > >> > there is > >> > just one y, then putting it as the last column lets the QR algorithm > do > >> > that > >> > last bit of projection. > >> > >> Gram-Schmidt (looking at it for the first time) looks a lot like > >> sequential least squares projection. So, I'm trying to figure out if I > >> can use the partial results up to a specific column as partial least > >> squares and then work my way to the end by including/looking at more > >> columns. > >> > > > > I don't the QR factorization would work for normal PLS. IIRC, one of the > > algorithms does a svd of the cross correlation matrix. The difference is > > that in some sense the svd picks out the best linear combination of > columns, > > while the qr factorization without column pivoting just takes them in > order. > > The QR factorization used to be the method of choice for least squares > > because it is straight forward to compute, no iterating needed as in svd, > > but these days that advantage is pretty much gone. It is still a common > > first step in the svd, however. The matrix is factored to Q*R, then the > svd > > of R is computed. > > I (finally) figured out svd and eigenvalue decomposition for this purpose. > > But from your description of QR, I thought specifically of the case > where we have a "natural" ordering of the regressors, similar to the > polynomial case of you and Anne. In the timeseries case it would be by > increasing lags > > yt on y_{t-1} > yt on y_{t-1}, y_{t-2} > ... > ... > yt on y_{t-k} for k= 1,...,K > > or yt on xt and the lags of xt > > This is really sequential LS with a predefined sequence, not PLS or > PCA/PCR or similar orthogonalization by "importance". > The usual procedure for deciding on the appropriate number of lags > usually loops over OLS with increasing number of regressors. > >From the discussion, I thought there might be a way to "cheat" in this > using QR and Gram-Schmidt > > Ah, then I think your idea would work. The norms of the residuals at each step would be along the diagonal of the R matrix. They won't necessarily decrease monotonically, however. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
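To make the lag case concrete, a rough sketch of the check described above, with a made-up series; the only point is that |R[-1, -1]| equals the residual norm of the least-squares fit when the current value is stacked as the last column:

import numpy as np

rng = np.random.RandomState(2)
y = rng.randn(203)
nlags = 3

cur = y[nlags:]                                   # y_t
lags = np.column_stack([y[nlags - k:len(y) - k]   # y_{t-1}, ..., y_{t-nlags}
                        for k in range(1, nlags + 1)])

a = np.column_stack([lags, cur])                  # current value last
q, r = np.linalg.qr(a)

# residual of the least-squares fit of cur on the lags
resid = cur - np.dot(lags, np.linalg.lstsq(lags, cur)[0])

# the bottom-right entry of R is (up to sign) the residual norm
assert np.allclose(abs(r[-1, -1]), np.sqrt(np.sum(resid ** 2)))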
URL: From charlesr.harris at gmail.com Tue Jan 19 22:18:46 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 19 Jan 2010 20:18:46 -0700 Subject: [Numpy-discussion] Efficient orthogonalisation with scipy/numpy In-Reply-To: References: <20100119201253.GA2353@phare.normalesup.org> <3d375d731001191222g3659c2d5yc35f655fbaa31ceb@mail.gmail.com> <20100119204708.GB2353@phare.normalesup.org> <1cd32cbb1001191334o438f0506hf75e3345f3596d7b@mail.gmail.com> <1cd32cbb1001191708n29438ce0q62e9b091981f5bd2@mail.gmail.com> <1cd32cbb1001191902q5ccd01ahb6020aa959a3fddf@mail.gmail.com> Message-ID: On Tue, Jan 19, 2010 at 8:13 PM, Charles R Harris wrote: > > > On Tue, Jan 19, 2010 at 8:02 PM, wrote: > >> On Tue, Jan 19, 2010 at 9:47 PM, Charles R Harris >> wrote: >> > >> > >> > On Tue, Jan 19, 2010 at 6:08 PM, wrote: >> >> >> >> On Tue, Jan 19, 2010 at 6:48 PM, Charles R Harris >> >> wrote: >> >> > >> >> > >> >> > On Tue, Jan 19, 2010 at 2:34 PM, wrote: >> >> >> >> >> >> On Tue, Jan 19, 2010 at 4:29 PM, Charles R Harris >> >> >> wrote: >> >> >> > >> >> >> > >> >> >> > On Tue, Jan 19, 2010 at 1:47 PM, Gael Varoquaux >> >> >> > wrote: >> >> >> >> >> >> >> >> On Tue, Jan 19, 2010 at 02:22:30PM -0600, Robert Kern wrote: >> >> >> >> > > y = y - np.dot(confounds.T, linalg.lstsq(confounds.T, y)[0]) >> >> >> >> >> >> >> >> > > with np = numpy and linalg = scipy.linalg where scipy calls >> >> >> >> > > ATLAS. >> >> >> >> >> >> >> >> > For clarification, are you trying to find the components of the >> y >> >> >> >> > vectors that are perpendicular to the space spanned by the 10 >> >> >> >> > orthonormal vectors in confounds? >> >> >> >> >> >> >> >> Yes. Actually, what I am doing is calculating partial correlation >> >> >> >> between >> >> >> >> x and y conditionally to confounds, with the following code: >> >> >> >> >> >> >> >> def cond_partial_cor(y, x, confounds=[]): >> >> >> >> """ Returns the partial correlation of y and x, conditionning >> on >> >> >> >> confounds. >> >> >> >> """ >> >> >> >> # First orthogonalise y and x relative to confounds >> >> >> >> if len(confounds): >> >> >> >> y = y - np.dot(confounds.T, linalg.lstsq(confounds.T, >> y)[0]) >> >> >> >> x = x - np.dot(confounds.T, linalg.lstsq(confounds.T, >> x)[0]) >> >> >> >> return np.dot(x, y)/sqrt(np.dot(y, y)*np.dot(x, x)) >> >> >> >> >> >> >> >> I am not sure that what I am doing is optimal. >> >> >> >> >> >> >> >> > > Most of the time is spent in linalg.lstsq. The length of the >> >> >> >> > > vectors >> >> >> >> > > is >> >> >> >> > > 810, and there are about 10 confounds. >> >> >> >> >> >> >> >> > Exactly what are the shapes? y.shape = (810, N); >> confounds.shape = >> >> >> >> > (810, >> >> >> >> > 10)? >> >> >> >> >> >> >> >> Sorry, I should have been more precise: >> >> >> >> >> >> >> >> y.shape = (810, ) >> >> >> >> confounds.shape = (10, 810) >> >> >> >> >> >> >> > >> >> >> > Column stack the bunch so that the last column is y, then do a qr >> >> >> > decomposition. The last column of q is the (normalized) orthogonal >> >> >> > vector >> >> >> > and its amplitude is the last (bottom right) component of r. >> >> >> >> >> >> do you have to do qr twice, once with x and once with y in the last >> >> >> column or can this be combined? >> >> >> >> >> >> I was trying to do something similar for partial autocorrelation for >> >> >> timeseries but didn't manage or try anything better than repeated >> >> >> leastsq or a variant. >> >> >> >> >> > >> >> > Depends on what you want to do. 
The QR decomposition is essentially >> >> > Gram-Schmidt on the columns. So if you just want an orthonormal basis >> >> > for >> >> > the subspace spanned by a bunch of columns, the columns of Q are >> they. >> >> > To >> >> > get the part of y orthogonal to that subspace you can do y - Q*Q.T*y, >> >> > which >> >> > is probably what you want if the x's are fixed and the y's vary. If >> >> > there is >> >> > just one y, then putting it as the last column lets the QR algorithm >> do >> >> > that >> >> > last bit of projection. >> >> >> >> Gram-Schmidt (looking at it for the first time) looks a lot like >> >> sequential least squares projection. So, I'm trying to figure out if I >> >> can use the partial results up to a specific column as partial least >> >> squares and then work my way to the end by including/looking at more >> >> columns. >> >> >> > >> > I don't the QR factorization would work for normal PLS. IIRC, one of the >> > algorithms does a svd of the cross correlation matrix. The difference is >> > that in some sense the svd picks out the best linear combination of >> columns, >> > while the qr factorization without column pivoting just takes them in >> order. >> > The QR factorization used to be the method of choice for least squares >> > because it is straight forward to compute, no iterating needed as in >> svd, >> > but these days that advantage is pretty much gone. It is still a common >> > first step in the svd, however. The matrix is factored to Q*R, then the >> svd >> > of R is computed. >> >> I (finally) figured out svd and eigenvalue decomposition for this purpose. >> >> But from your description of QR, I thought specifically of the case >> where we have a "natural" ordering of the regressors, similar to the >> polynomial case of you and Anne. In the timeseries case it would be by >> increasing lags >> >> yt on y_{t-1} >> yt on y_{t-1}, y_{t-2} >> ... >> ... >> yt on y_{t-k} for k= 1,...,K >> >> or yt on xt and the lags of xt >> >> This is really sequential LS with a predefined sequence, not PLS or >> PCA/PCR or similar orthogonalization by "importance". >> The usual procedure for deciding on the appropriate number of lags >> usually loops over OLS with increasing number of regressors. >> >From the discussion, I thought there might be a way to "cheat" in this >> using QR and Gram-Schmidt >> >> > Ah, then I think your idea would work. The norms of the residuals at each > step would be along the diagonal of the R matrix. They won't necessarily > decrease monotonically, however. > > Or if you are fitting a quantity with the lags, then Q.T*y gives the component of each orthogonalized lag. The running sum of the squares of the components should approach the variance of y, so I expect the ratio would give the percentage of the variance accounted for by the lags up to that point. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From washakie at gmail.com Wed Jan 20 09:01:21 2010 From: washakie at gmail.com (John [H2O]) Date: Wed, 20 Jan 2010 06:01:21 -0800 (PST) Subject: [Numpy-discussion] dates, np.where finding months Message-ID: <27242195.post@talk.nabble.com> I have an array with the leading column a series of datetime objects. It covers several years. What is the most efficient way to pull out all the 'January' dates? Right now I do this: A = array with column 0 datetime objects January = [i for i in A if i[0].month ==1 ] It works, but I would rather use np.where and get indices and not have to convert my list back into an array. 
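For concreteness, the np.where version being asked about might look like this, assuming the first column holds datetime.date objects:

import datetime
import numpy as np

A = np.array([[datetime.date(2010, 1, 1), 1],
              [datetime.date(2010, 1, 2), 2],
              [datetime.date(2010, 2, 1), 3],
              [datetime.date(2010, 1, 5), 4]], dtype=object)

months = np.array([d.month for d in A[:, 0]])
idx, = np.where(months == 1)    # indices of the January rows
january = A[idx]                # still a 2-d array, no list round-trip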
-- View this message in context: http://old.nabble.com/dates%2C-np.where-finding-months-tp27242195p27242195.html Sent from the Numpy-discussion mailing list archive at Nabble.com. From kwgoodman at gmail.com Wed Jan 20 10:11:30 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Wed, 20 Jan 2010 07:11:30 -0800 Subject: [Numpy-discussion] dates, np.where finding months In-Reply-To: <27242195.post@talk.nabble.com> References: <27242195.post@talk.nabble.com> Message-ID: On Wed, Jan 20, 2010 at 6:01 AM, John [H2O] wrote: > > I have an array with the leading column a series of datetime objects. It > covers several years. What is the most efficient way to pull out all the > 'January' dates? > > Right now I do this: > > A = array with column 0 datetime objects > > January = [i for i in A if i[0].month ==1 ] > > It works, but I would rather use np.where and get indices and not have to > convert my list back into an array. Instead of doing this: >> A array([[2010-01-01, 1], [2010-01-02, 2], [2010-02-01, 3], [2010-01-05, 4]], dtype=object) >> [i for i in A if i[0].month==1] [array([2010-01-01, 1], dtype=object), array([2010-01-02, 2], dtype=object), array([2010-01-05, 4], dtype=object)] You could do this: >> [i for i, date in enumerate(A[:,0]) if date.month==1] [0, 1, 3] From kwgoodman at gmail.com Wed Jan 20 10:21:53 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Wed, 20 Jan 2010 07:21:53 -0800 Subject: [Numpy-discussion] dates, np.where finding months In-Reply-To: References: <27242195.post@talk.nabble.com> Message-ID: On Wed, Jan 20, 2010 at 7:11 AM, Keith Goodman wrote: > On Wed, Jan 20, 2010 at 6:01 AM, John [H2O] wrote: >> >> I have an array with the leading column a series of datetime objects. It >> covers several years. What is the most efficient way to pull out all the >> 'January' dates? >> >> Right now I do this: >> >> A = array with column 0 datetime objects >> >> January = [i for i in A if i[0].month ==1 ] >> >> It works, but I would rather use np.where and get indices and not have to >> convert my list back into an array. > > Instead of doing this: > >>> A > > array([[2010-01-01, 1], > ? ? ? [2010-01-02, 2], > ? ? ? [2010-02-01, 3], > ? ? ? [2010-01-05, 4]], dtype=object) > >>> [i for i in A if i[0].month==1] > > [array([2010-01-01, 1], dtype=object), > ?array([2010-01-02, 2], dtype=object), > ?array([2010-01-05, 4], dtype=object)] > > You could do this: > >>> [i for i, date in enumerate(A[:,0]) if date.month==1] > ? [0, 1, 3] Or maybe this is cleaner: >> [date.month==1 for date in A[:,0]] [True, True, False, True] which can be used like this: >> idx = np.array([date.month==1 for date in A[:,0]]) >> A[idx,:] array([[2010-01-01, 1], [2010-01-02, 2], [2010-01-05, 4]], dtype=object) From kwgoodman at gmail.com Wed Jan 20 10:25:52 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Wed, 20 Jan 2010 07:25:52 -0800 Subject: [Numpy-discussion] dates, np.where finding months In-Reply-To: References: <27242195.post@talk.nabble.com> Message-ID: On Wed, Jan 20, 2010 at 7:21 AM, Keith Goodman wrote: > On Wed, Jan 20, 2010 at 7:11 AM, Keith Goodman wrote: >> On Wed, Jan 20, 2010 at 6:01 AM, John [H2O] wrote: >>> >>> I have an array with the leading column a series of datetime objects. It >>> covers several years. What is the most efficient way to pull out all the >>> 'January' dates? 
>>> >>> Right now I do this: >>> >>> A = array with column 0 datetime objects >>> >>> January = [i for i in A if i[0].month ==1 ] >>> >>> It works, but I would rather use np.where and get indices and not have to >>> convert my list back into an array. >> >> Instead of doing this: >> >>>> A >> >> array([[2010-01-01, 1], >> ? ? ? [2010-01-02, 2], >> ? ? ? [2010-02-01, 3], >> ? ? ? [2010-01-05, 4]], dtype=object) >> >>>> [i for i in A if i[0].month==1] >> >> [array([2010-01-01, 1], dtype=object), >> ?array([2010-01-02, 2], dtype=object), >> ?array([2010-01-05, 4], dtype=object)] >> >> You could do this: >> >>>> [i for i, date in enumerate(A[:,0]) if date.month==1] >> ? [0, 1, 3] > > Or maybe this is cleaner: > >>> [date.month==1 for date in A[:,0]] > ? [True, True, False, True] > > which can be used like this: > >>> idx = np.array([date.month==1 for date in A[:,0]]) >>> A[idx,:] > > array([[2010-01-01, 1], > ? ? ? [2010-01-02, 2], > ? ? ? [2010-01-05, 4]], dtype=object) Last one (I promise). If you don't need to keep the dates: >> A array([[2010-01-01, 1], [2010-01-02, 2], [2010-02-01, 3], [2010-01-05, 4]], dtype=object) >> A[:,0] = [date.month for date in A[:,0]] >> >> A array([[1, 1], [1, 2], [2, 3], [1, 4]], dtype=object) >> >> A[A[:,0]==1,:] array([[1, 1], [1, 2], [1, 4]], dtype=object) From nwagner at iam.uni-stuttgart.de Wed Jan 20 12:26:06 2010 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Wed, 20 Jan 2010 18:26:06 +0100 Subject: [Numpy-discussion] Floating exception Message-ID: Hi all, I found a strange problem when I try to import numpy python -v >>> import numpy ... dlopen("/data/home/nwagner/local/lib/python2.5/site-packages/numpy/core/multiarray.so", 2); Floating exception Any idea ? Nils From washakie at gmail.com Wed Jan 20 16:12:25 2010 From: washakie at gmail.com (John [H2O]) Date: Wed, 20 Jan 2010 13:12:25 -0800 (PST) Subject: [Numpy-discussion] dates, np.where finding months In-Reply-To: References: <27242195.post@talk.nabble.com> Message-ID: <27248729.post@talk.nabble.com> Keith Goodman wrote: > > > > Or maybe this is cleaner: > >>> [date.month==1 for date in A[:,0]] > [True, True, False, True] > > which can be used like this: > >>> idx = np.array([date.month==1 for date in A[:,0]]) >>> A[idx,:] > > array([[2010-01-01, 1], > [2010-01-02, 2], > [2010-01-05, 4]], dtype=object) > _______________________________________________ > NumPy-Discussion mailing list > That's the keeper! Thanks for the responses. -- View this message in context: http://old.nabble.com/dates%2C-np.where-finding-months-tp27242195p27248729.html Sent from the Numpy-discussion mailing list archive at Nabble.com. From eadrogue at gmx.net Wed Jan 20 16:56:40 2010 From: eadrogue at gmx.net (Ernest =?iso-8859-1?Q?Adrogu=E9?=) Date: Wed, 20 Jan 2010 22:56:40 +0100 Subject: [Numpy-discussion] strange divergence in performance Message-ID: <20100120215640.GA10411@doriath.local> Hi, I have a function where an array of integers (1-d) is compared element-wise to an integer using the greater-than operator. I noticed that when the integer is 0 it takes about 75% more time than when it's 1 or 2. Is there an explanation? 
Here is a stripped-down version which does (sort of) show what I say:

def filter_array(array, f1, f2, flag=False):

    if flag:
        k = 1
    else:
        k = 0

    m1 = reduce(np.add, [(array['f1'] == i).astype(int) for i in f1]) > 0
    m2 = reduce(np.add, [(array['f2'] == i).astype(int) for i in f2]) > 0

    mask = reduce(np.add, (i.astype(int) for i in (m1, m2))) > k
    return array[mask]

Now let's create an array with two fields:

a = np.array(zip( np.random.random_integers(0,10,size=5000), np.random.random_integers(0,10,size=5000)), dtype=[('f1',int),('f2',int)])

Now call the function with flag=True and flag=False, and see what happens:

In [29]: %timeit filter_array(a, (6,), (0,), flag=False)
1000 loops, best of 3: 536 us per loop

In [30]: %timeit filter_array(a, (6,), (0,), flag=True)
1000 loops, best of 3: 245 us per loop

In this example the difference seems to be 1:2. In my program it is 1:4. I am at a loss about what causes this. Bye.

From dsdale24 at gmail.com Wed Jan 20 16:57:01 2010
From: dsdale24 at gmail.com (Darren Dale)
Date: Wed, 20 Jan 2010 16:57:01 -0500
Subject: [Numpy-discussion] numpy.test(): invalid value encountered in {isinf, divide, power, ...}
Message-ID:

I haven't been following development on the trunk closely, so I apologize if this is a known issue. I didn't see anything relevant when I searched the list.

I just updated my checkout of the trunk, cleaned out the old installation and build/, and reinstalled. When I run the test suite (without specifying the verbosity), I get a slew of warnings like:

Warning: invalid value encountered in isinf
Warning: invalid value encountered in isfinite

I checked on both OS X 10.6 and gentoo linux, with similar results. The test suite reports "ok" at the end with 5 known failures and 4 skipped tests.

Darren

From robert.kern at gmail.com Wed Jan 20 17:17:26 2010
From: robert.kern at gmail.com (Robert Kern)
Date: Wed, 20 Jan 2010 16:17:26 -0600
Subject: [Numpy-discussion] strange divergence in performance
In-Reply-To: <20100120215640.GA10411@doriath.local>
References: <20100120215640.GA10411@doriath.local>
Message-ID: <3d375d731001201417u4e57172en97f995ff3f475cca@mail.gmail.com>

2010/1/20 Ernest Adrogué :
> Hi,
>
> I have a function where an array of integers (1-d) is compared
> element-wise to an integer using the greater-than operator.
> I noticed that when the integer is 0 it takes about 75% more time
> than when it's 1 or 2. Is there an explanation?
>
> Here is a stripped-down version which does (sort of) show what I say:
>
> def filter_array(array, f1, f2, flag=False):
>
>     if flag:
>         k = 1
>     else:
>         k = 0
>
>     m1 = reduce(np.add, [(array['f1'] == i).astype(int) for i in f1]) > 0
>     m2 = reduce(np.add, [(array['f2'] == i).astype(int) for i in f2]) > 0
>
>     mask = reduce(np.add, (i.astype(int) for i in (m1, m2))) > k
>     return array[mask]
>
> Now let's create an array with two fields:
>
> a = np.array(zip( np.random.random_integers(0,10,size=5000), np.random.random_integers(0,10,size=5000)), dtype=[('f1',int),('f2',int)])
>
> Now call the function with flag=True and flag=False, and see what happens:
>
> In [29]: %timeit filter_array(a, (6,), (0,), flag=False)
> 1000 loops, best of 3: 536 us per loop
>
> In [30]: %timeit filter_array(a, (6,), (0,), flag=True)
> 1000 loops, best of 3: 245 us per loop
>
> In this example the difference seems to be 1:2. In my program
> it is 1:4. I am at a loss about what causes this.

It is not the > operator that exhibits the difference.
In [28]: x = np.random.random_integers(0,10,size=5000) In [29]: %timeit m = x > 0 100000 loops, best of 3: 19.1 us per loop In [30]: %timeit m = x > 1 100000 loops, best of 3: 19.3 us per loop The difference is in the array[mask]. There are necessarily fewer True elements in the mask for >1 than >0. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From pgmdevlist at gmail.com Wed Jan 20 17:23:01 2010 From: pgmdevlist at gmail.com (Pierre GM) Date: Wed, 20 Jan 2010 17:23:01 -0500 Subject: [Numpy-discussion] numpy.test(): invalid value encountered in {isinf, divide, power, ...} In-Reply-To: References: Message-ID: <1B8F39BB-3FB7-420E-8C46-754E29FECDFC@gmail.com> On Jan 20, 2010, at 4:57 PM, Darren Dale wrote: > I haven't been following development on the trunk closely, so I > apologize if this is a known issue. I didn't see anything relevant > when I searched the list. > > I just updated my checkout of the trunk, cleaned out the old > installation and build/, and reinstalled. When I run the test suite > (without specifying the verbosity), I get a slew of warnings like: > > Warning: invalid value encountered in isinf > Warning: invalid value encountered in isfinite > > I checked on both OS X 10.6 and gentoo linux, with similar results. > The test suite reports "ok" at the end with 5 known failures and 4 > skipped tests. That comes from numpy.ma. On the SVN, we got rid of a line in ma.core that forced the warnings to be silent module-wise. Instead, silencing the warnings is done inside each ma function. The issue shows up when you apply a np function on a masked_array: the warnings pop up because there's nothing to silence them, but everything should run smoothly. Should, because I'm pretty sure there's a catch somewhere. I'll have to go and check an alternative approach using your __array_prepare__. So yes, everything's fine (albeit noisy) From eadrogue at gmx.net Wed Jan 20 17:26:01 2010 From: eadrogue at gmx.net (Ernest =?iso-8859-1?Q?Adrogu=E9?=) Date: Wed, 20 Jan 2010 23:26:01 +0100 Subject: [Numpy-discussion] strange divergence in performance In-Reply-To: <3d375d731001201417u4e57172en97f995ff3f475cca@mail.gmail.com> References: <20100120215640.GA10411@doriath.local> <3d375d731001201417u4e57172en97f995ff3f475cca@mail.gmail.com> Message-ID: <20100120222601.GA10517@doriath.local> 20/01/10 @ 16:17 (-0600), thus spake Robert Kern: > 2010/1/20 Ernest Adrogu? : > > Hi, > > > > I have a function where an array of integers (1-d) is compared > > element-wise to an integer using the greater-than operator. > > I noticed that when the integer is 0 it takes about 75% more time > > than when it's 1 or 2. Is there an explanation? > > > > Here is a stripped-down version which does (sort of)show what I say: > > > > def filter_array(array, f1, f2, flag=False): > > > > ? ?if flag: > > ? ? ? ?k = 1 > > ? ?else: > > ? ? ? ?k = 0 > > > > ? ?m1 = reduce(np.add, [(array['f1'] == i).astype(int) for i in f1]) > 0 > > ? ?m2 = reduce(np.add, [(array['f2'] == i).astype(int) for i in f2]) > 0 > > > > ? ?mask = reduce(np.add, (i.astype(int) for i in (m1, m2))) > k > > ? 
?return array[mask] > > > > Now let's create an array with two fields: > > > > a = np.array(zip( np.random.random_integers(0,10,size=5000), np.random.random_integers(0,10,size=5000)), dtype=[('f1',int),('f2',int)]) > > > > Now call the function with flag=True and flag=False, and see what happens: > > > > In [29]: %timeit filter_array(a, (6,), (0,), flag=False) > > 1000 loops, best of 3: 536 us per loop > > > > In [30]: %timeit filter_array(a, (6,), (0,), flag=True) > > 1000 loops, best of 3: 245 us per loop > > > > In this example the difference seems to be 1:2. In my program > > is 1:4. I am at a loss about what causes this. > > It is not the > operator that exhibits the difference. > > In [28]: x = np.random.random_integers(0,10,size=5000) > > In [29]: %timeit m = x > 0 > 100000 loops, best of 3: 19.1 us per loop > > In [30]: %timeit m = x > 1 > 100000 loops, best of 3: 19.3 us per loop > > > The difference is in the array[mask]. There are necessarily fewer True > elements in the mask for >1 than >0. Ahh, I see... seems obvious now. Thanks! From david at silveregg.co.jp Wed Jan 20 20:04:56 2010 From: david at silveregg.co.jp (David Cournapeau) Date: Thu, 21 Jan 2010 10:04:56 +0900 Subject: [Numpy-discussion] Floating exception In-Reply-To: References: Message-ID: <4B57A838.7020607@silveregg.co.jp> Nils Wagner wrote: > Hi all, > > I found a strange problem when I try to import numpy > > python -v >>>> import numpy > ... > dlopen("/data/home/nwagner/local/lib/python2.5/site-packages/numpy/core/multiarray.so", > 2); > Floating exception > > Any idea ? Could you get a traceback (ideally making sure numpy is built with debug symbols - having -g in both CFLAGS and LDFLAGS) ? Having it happening inside the dlopen call is a bit weird, I can't see what could cause it, cheers, David From pav+sp at iki.fi Thu Jan 21 04:23:25 2010 From: pav+sp at iki.fi (Pauli Virtanen) Date: Thu, 21 Jan 2010 09:23:25 +0000 (UTC) Subject: [Numpy-discussion] numpy.test(): invalid value encountered in {isinf, divide, power, ...} References: Message-ID: Wed, 20 Jan 2010 16:57:01 -0500, Darren Dale wrote: [clip] > Warning: invalid value encountered in isinf Warning: invalid value > encountered in isfinite [clip] This is because of changed seterr() default values. IMHO, the 'print' default is slightly worse than the previous 'ignore'. Personally, I don't see great value in the "invalid value encountered" reports that are appear every time a nan is generated... -- Pauli Virtanen From cournape at gmail.com Thu Jan 21 05:03:06 2010 From: cournape at gmail.com (David Cournapeau) Date: Thu, 21 Jan 2010 19:03:06 +0900 Subject: [Numpy-discussion] numpy.test(): invalid value encountered in {isinf, divide, power, ...} In-Reply-To: References: Message-ID: <5b8d13221001210203h69a3a436id6e24a03117a538f@mail.gmail.com> On Thu, Jan 21, 2010 at 6:23 PM, Pauli Virtanen wrote: > Wed, 20 Jan 2010 16:57:01 -0500, Darren Dale wrote: > [clip] >> Warning: invalid value encountered in isinf Warning: invalid value >> encountered in isfinite > [clip] > > This is because of changed seterr() default values. > > IMHO, the 'print' default is slightly worse than the previous 'ignore'. > Personally, I don't see great value in the "invalid value encountered" > reports that are appear every time a nan is generated... I thought it was agreed that the default would be changed to warnings for 1.5.0 ? 
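For reference, whatever the default ends up being, it can always be overridden with np.seterr; a small sketch:

import numpy as np

old = np.seterr(invalid='ignore', divide='ignore')  # or 'warn', 'print', 'raise'
np.log(np.array([-1.0]))     # nan, silent under 'ignore'
np.array([1.0]) / 0.0        # inf, likewise silent
np.seterr(**old)             # restore the previous settings
# np.errstate(...) does the same thing as a context manager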
cheers, David From pav+sp at iki.fi Thu Jan 21 05:17:17 2010 From: pav+sp at iki.fi (Pauli Virtanen) Date: Thu, 21 Jan 2010 10:17:17 +0000 (UTC) Subject: [Numpy-discussion] numpy.test(): invalid value encountered in {isinf, divide, power, ...} References: <5b8d13221001210203h69a3a436id6e24a03117a538f@mail.gmail.com> Message-ID: Thu, 21 Jan 2010 19:03:06 +0900, David Cournapeau wrote: > On Thu, Jan 21, 2010 at 6:23 PM, Pauli Virtanen wrote: >> Wed, 20 Jan 2010 16:57:01 -0500, Darren Dale wrote: [clip] >>> Warning: invalid value encountered in isinf Warning: invalid value >>> encountered in isfinite >> [clip] >> >> This is because of changed seterr() default values. >> >> IMHO, the 'print' default is slightly worse than the previous 'ignore'. >> Personally, I don't see great value in the "invalid value encountered" >> reports that are appear every time a nan is generated... > > I thought it was agreed that the default would be changed to warnings > for 1.5.0? I'm not so sure whether that's the best choice either, although it's better than stderr, but maybe I'm in a minority. OTOH, for instance Matlab and Fortran do not warn about division by zero or invalid values. Pauli From nwagner at iam.uni-stuttgart.de Thu Jan 21 06:36:25 2010 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Thu, 21 Jan 2010 12:36:25 +0100 Subject: [Numpy-discussion] Floating exception In-Reply-To: <4B57A838.7020607@silveregg.co.jp> References: <4B57A838.7020607@silveregg.co.jp> Message-ID: On Thu, 21 Jan 2010 10:04:56 +0900 David Cournapeau wrote: > Nils Wagner wrote: >> Hi all, >> >> I found a strange problem when I try to import numpy >> >> python -v >>>>> import numpy >> ... >> dlopen("/data/home/nwagner/local/lib/python2.5/site-packages/numpy/core/multiarray.so", >> 2); >> Floating exception >> >> Any idea ? > > Could you get a traceback (ideally making sure numpy is >built with debug > symbols - having -g in both CFLAGS and LDFLAGS) ? > > Having it happening inside the dlopen call is a bit >weird, I can't see > what could cause it, > > cheers, > > David Hi David, Thank you for your response. I switched from CentOS 4.2 to CentOS 5.2 Here is the output of gdb python run -v # /data/home/nwagner/local/lib/python2.5/site-packages/site.pyc has bad magic ... What is the meaning of 'bad magic' ? Should I start with a clean /data/home/nwagner/local/lib/python2.5/site-packages ? Cheers, Nils From dagss at student.matnat.uio.no Thu Jan 21 06:57:09 2010 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Thu, 21 Jan 2010 12:57:09 +0100 Subject: [Numpy-discussion] Proposed fix for MKL and dynamic loading Message-ID: <4B584115.9090500@student.matnat.uio.no> (Apologies if this has been fixed in trunk; I base this on 1.4.0 and no related comments of MKL on the mailing list) I finally got the latest version of MKL working. What appears to have changed is that the MKL shared libraries will themselves dynamically load different other libraries, depending on the detected CPU. This is in some ways great news for me, because it means I can avoid worrying about miscompiles when compiling one single version of NumPy/SciPy to use for our heterogenous cluster. So I'd rather *not* link statically [1]. Anyway, after modifying site.cfg [2], things almost work, but not quite. The problem is that Python by default imports shared libs using RTLD_LOCAL. 
With this patch to NumPy it does: Change in numpy/linalg/linalg.py: from numpy.linalg import lapack_lite to: try: import sys import ctypes _old_rtld = sys.getdlopenflags() sys.setdlopenflags(_old_rtld|ctypes.RTLD_GLOBAL) from numpy.linalg import lapack_lite finally: sys.setdlopenflags(_old_rtld) del sys; del ctypes; del _old_rtld Questions: a) Should I submit a patch? b) Negative consequences? Perhaps another Python module can now not load a different BLAS implementation? (That still seems better than not being able to use MKL IMO). c) Should this only be enabled by a flag somewhere? Where? Or can one just do it regardless of BLAS? d) Do I need a "if hasattr" for Windows, or will Windows just ignore it, or does this apply to Windows too? [1] BTW, I could not figure out how to link statically if I wanted -- is "search_static_first = 1" supposed to work? Perhaps MKL will insist on loading some parts dynamically even then *shrug*. Dag Sverre From dagss at student.matnat.uio.no Thu Jan 21 06:59:15 2010 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Thu, 21 Jan 2010 12:59:15 +0100 Subject: [Numpy-discussion] Proposed fix for MKL and dynamic loading In-Reply-To: <4B584115.9090500@student.matnat.uio.no> References: <4B584115.9090500@student.matnat.uio.no> Message-ID: <4B584193.1080309@student.matnat.uio.no> Dag Sverre Seljebotn wrote: > (Apologies if this has been fixed in trunk; I base this on 1.4.0 and no > related comments of MKL on the mailing list) > > I finally got the latest version of MKL working. What appears to have > changed is that the MKL shared libraries will themselves dynamically > load different other libraries, depending on the detected CPU. > > This is in some ways great news for me, because it means I can avoid > worrying about miscompiles when compiling one single version of > NumPy/SciPy to use for our heterogenous cluster. So I'd rather *not* > link statically [1]. > > Anyway, after modifying site.cfg [2], things almost work, but not quite. > The problem is that Python by default imports shared libs using > RTLD_LOCAL. With this patch to NumPy it does: > > Change in numpy/linalg/linalg.py: > > from numpy.linalg import lapack_lite > > to: > > try: > import sys > import ctypes > _old_rtld = sys.getdlopenflags() > sys.setdlopenflags(_old_rtld|ctypes.RTLD_GLOBAL) > from numpy.linalg import lapack_lite > finally: > sys.setdlopenflags(_old_rtld) > del sys; del ctypes; del _old_rtld > > Questions: > > a) Should I submit a patch? > b) Negative consequences? Perhaps another Python module can now not load > a different BLAS implementation? (That still seems better than not being > able to use MKL IMO). > c) Should this only be enabled by a flag somewhere? Where? Or can one > just do it regardless of BLAS? > d) Do I need a "if hasattr" for Windows, or will Windows just ignore it, > or does this apply to Windows too? > > [1] BTW, I could not figure out how to link statically if I wanted -- is > "search_static_first = 1" supposed to work? Perhaps MKL will insist on > loading some parts dynamically even then *shrug*. > Forgot this: [2] Here's my site.cfg: [mkl] library_dirs=/mn/corcaroli/d1/dagss/intel/mkl/10.2.3.029/lib/em64t include_dirs = /mn/corcaroli/d1/dagss/intel/mkl/10.2.3.029/include lapack_libs = mkl_lapack mkl_libs = mkl_intel_lp64, mkl_intel_thread, mkl_core, iomp5 Then I need to set LD_LIBRARY_PATH as well prior to running (which I'm quite OK with). 
Dag Sverre From dagss at student.matnat.uio.no Thu Jan 21 07:01:30 2010 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Thu, 21 Jan 2010 13:01:30 +0100 Subject: [Numpy-discussion] MKL with 64bit crashes In-Reply-To: <68DF70B3485CC648835655773E92314F9208C5@prinsmail02.am.thmulti.com> References: <68DF70B3485CC648835655773E92314F9208C2@prinsmail02.am.thmulti.com> <68DF70B3485CC648835655773E92314F9208C3@prinsmail02.am.thmulti.com> <68DF70B3485CC648835655773E92314F9208C5@prinsmail02.am.thmulti.com> Message-ID: <4B58421A.3020402@student.matnat.uio.no> Kashyap Ashwin wrote: > Matthieu, > I am not sure what exactly you mean. I did pass in "static" to the > link-adviser and this is the new setup.cfg > mkl_libs = mkl_solver_ilp64, mkl_intel_ilp64, mkl_gnu_thread, mkl_core. > > On import, Numpy complains as usual about the mkl_def and mkl_mc. If I > append these libs, then the crashes happen on test() (complains first > about the DGES* functions). > Also, I have made sure that g77 is not installed and only gfortran is > available. > > I also put in the LD_LIBRARY_PATH=/opt/intel/mkl/10.2.2.025/lib/em64t. > > Thanks, > Ashwin > This was an old post, but for Googlability of this thread: I think the major problem here is that one uses "ilp64" rather than "lp64". Even on a 64-bit system, it is common to assume integers are 32-bit for BLAS, and it seems NumPy makes this assumption as well. Just appending mkl_def or mkl_mc seems dangerous as I have a lurking feeling the purpose of those is to load different libraries for different CPUs, determined runtime. (This is not validated by anyone, it is just a guess). There's more problems, but I've detailed those in my recent post on "Proposed fix for MKL and dynamic loading". Dag Sverre > Your message: > Hi, > > You need to use the static libraries, are you sure you currently do? > > Matthieu > > 2009/10/15 Kashyap Ashwin : > >> I followed the advice given by the Intel MKL link adviser >> >> > (http://software.intel.com/en-us/articles/intel-mkl-link-line-advisor/) > >> This is my new site.cfg: >> mkl_libs = mkl_intel_ilp64, mkl_gnu_thread, mkl_core >> >> I also exported CFLAGS="-fopenmp" and built with the >> > --fcompiler=gnu95. > >> Now I get these errors on import: >> Running unit tests for numpy >> NumPy version 1.3.0 >> NumPy is installed in >> /opt/Personalization/lib/python2.5/site-packages/numpy >> Python version 2.5.2 (r252:60911, Jul 22 2009, 15:33:10) [GCC 4.2.4 >> (Ubuntu 4.2.4-1ubuntu3)] >> nose version 0.11.0 >> >> *** libmkl_mc.so *** failed with error : libmkl_mc.so: undefined >> > symbol: > >> mkl_dft_commit_descriptor_s_c2c_md_omp >> *** libmkl_def.so *** failed with error : libmkl_def.so: undefined >> symbol: mkl_dft_commit_descriptor_s_c2c_md_omp >> MKL FATAL ERROR: Cannot load neither libmkl_mc.so nor libmkl_def.so >> >> >> Any hints? >> > > >> Thanks, >> Ashwin >> >> >> >> Your message: >> >> On Thu, Oct 15, 2009 at 8:04 AM, Kashyap Ashwin >> wrote: >> >>> Hello, >>> I compiled numpy-1.3.0 from sources on Ubuntu-hardy, x86-64 (Intel) >>> >> with >> >>> MKL. 
>>> This is my site.cfg: >>> [mkl] >>> # library_dirs = /opt/intel/mkl/10.0.1.014/lib/32/ >>> library_dirs = /opt/intel/mkl/10.2.2.025/lib/em64t >>> include_dirs = /opt/intel/mkl/10.2.2.025/include >>> lapack_libs = mkl_lapack >>> #mkl_libs = mkl_core, guide, mkl_gf_ilp64, mkl_def, mkl_gnu_thread, >>> iomp5, mkl_vml_mc3 >>> mkl_libs = guide, mkl_core, mkl_gnu_thread, iomp5, mkl_gf_ilp64, >>> mkl_mc3, mkl_def >>> >> The order does not look right - I don't know the exact order (each >> version of the MKL changes the libraries), but you should respect the >> order as given in the MKL manual. >> >> >>> MKL ERROR: Parameter 4 was incorrect on entry to DGESV >>> >> This suggests an error when passing argument to MKL - I believe your >> version of MKL uses the gfortran ABI by default, and hardy uses g77 as >> the default fortran compiler. You should either recompile everything >> with gfortran, or regenerate the MKL interface libraries with g77 (as >> indicated in the manual). >> >> cheers, >> >> David >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > > >> -----Original Message----- >> From: Kashyap Ashwin >> Sent: Thursday, October 15, 2009 11:01 AM >> To: 'numpy-discussion at scipy.org' >> Subject: RE: MKL with 64bit crashes >> >> I followed the advice given by the Intel MKL link adviser >> > (http://software.intel.com/en- > >> us/articles/intel-mkl-link-line-advisor/) >> >> This is my new site.cfg: >> mkl_libs = mkl_intel_ilp64, mkl_gnu_thread, mkl_core >> >> I also exported CFLAGS="-fopenmp" and built with the >> > --fcompiler=gnu95. Now I get these errors on > >> import: >> Running unit tests for numpy >> NumPy version 1.3.0 >> NumPy is installed in >> > /opt/Personalization/lib/python2.5/site-packages/numpy > >> Python version 2.5.2 (r252:60911, Jul 22 2009, 15:33:10) [GCC 4.2.4 >> > (Ubuntu 4.2.4-1ubuntu3)] > >> nose version 0.11.0 >> >> *** libmkl_mc.so *** failed with error : libmkl_mc.so: undefined >> > symbol: > >> mkl_dft_commit_descriptor_s_c2c_md_omp >> *** libmkl_def.so *** failed with error : libmkl_def.so: undefined >> > symbol: > >> mkl_dft_commit_descriptor_s_c2c_md_omp >> MKL FATAL ERROR: Cannot load neither libmkl_mc.so nor libmkl_def.so >> >> >> Any hints? >> >> Thanks, >> Ashwin >> >> >> >> Your message: >> >> On Thu, Oct 15, 2009 at 8:04 AM, Kashyap Ashwin >> wrote: >> >>> Hello, >>> I compiled numpy-1.3.0 from sources on Ubuntu-hardy, x86-64 (Intel) >>> > with > >>> MKL. >>> This is my site.cfg: >>> [mkl] >>> # library_dirs = /opt/intel/mkl/10.0.1.014/lib/32/ >>> library_dirs = /opt/intel/mkl/10.2.2.025/lib/em64t >>> include_dirs = /opt/intel/mkl/10.2.2.025/include >>> lapack_libs = mkl_lapack >>> #mkl_libs = mkl_core, guide, mkl_gf_ilp64, mkl_def, mkl_gnu_thread, >>> iomp5, mkl_vml_mc3 >>> mkl_libs = guide, mkl_core, mkl_gnu_thread, iomp5, mkl_gf_ilp64, >>> mkl_mc3, mkl_def >>> >> The order does not look right - I don't know the exact order (each >> version of the MKL changes the libraries), but you should respect the >> order as given in the MKL manual. >> >> >>> MKL ERROR: Parameter 4 was incorrect on entry to DGESV >>> >> This suggests an error when passing argument to MKL - I believe your >> version of MKL uses the gfortran ABI by default, and hardy uses g77 as >> the default fortran compiler. You should either recompile everything >> with gfortran, or regenerate the MKL interface libraries with g77 (as >> indicated in the manual). 
>> >> cheers, >> >> David >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From matthieu.brucher at gmail.com Thu Jan 21 07:17:58 2010 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Thu, 21 Jan 2010 13:17:58 +0100 Subject: [Numpy-discussion] Proposed fix for MKL and dynamic loading In-Reply-To: <4B584115.9090500@student.matnat.uio.no> References: <4B584115.9090500@student.matnat.uio.no> Message-ID: > try: > ? ?import sys > ? ?import ctypes > ? ?_old_rtld = sys.getdlopenflags() > ? ?sys.setdlopenflags(_old_rtld|ctypes.RTLD_GLOBAL) > ? ?from numpy.linalg import lapack_lite > finally: > ? ?sys.setdlopenflags(_old_rtld) > ? ?del sys; del ctypes; del _old_rtld This also applies to scipy code that relies on BLAS as well. Lisandra Dalcin gave me a tip that is close to this one some months ago (http://matt.eifelle.com/2008/11/03/i-used-the-latest-mkl-with-numpy-and.../). The best official solution is to statically link against the MKL with Python. Matthieu -- Information System Engineer, Ph.D. Blog: http://matt.eifelle.com LinkedIn: http://www.linkedin.com/in/matthieubrucher From dagss at student.matnat.uio.no Thu Jan 21 07:29:50 2010 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Thu, 21 Jan 2010 13:29:50 +0100 Subject: [Numpy-discussion] Proposed fix for MKL and dynamic loading In-Reply-To: References: <4B584115.9090500@student.matnat.uio.no> Message-ID: <4B5848BE.7020104@student.matnat.uio.no> Matthieu Brucher wrote: >> try: >> import sys >> import ctypes >> _old_rtld = sys.getdlopenflags() >> sys.setdlopenflags(_old_rtld|ctypes.RTLD_GLOBAL) >> from numpy.linalg import lapack_lite >> finally: >> sys.setdlopenflags(_old_rtld) >> del sys; del ctypes; del _old_rtld >> > > This also applies to scipy code that relies on BLAS as well. Lisandra > Dalcin gave me a tip that is close to this one some months ago > (http://matt.eifelle.com/2008/11/03/i-used-the-latest-mkl-with-numpy-and.../). > The best official solution is to statically link against the MKL with > Python. > > IIUC, it should be enough to load the .so-s in GLOBAL mode once. So it is probably enough to ensure NumPy is patched in a way so that SciPy loads NumPy which loads the .so-s in GLOBAL mode, so that a seperate patch for SciPy is not necesarry. (Remains to be tried, I'm moving on to building SciPy now.) As for static linking, do you mean linking MKL into the Python interpreter itself? Or statically linking with NumPy? In the former case....well, even if the above solution is a not-officially-supported hack, I'd prefer that to messing with the Python build as long as it actually works, which it seems to...requiring custom Python builds for MKL support is not something one should do if one could avoid it. (I build my own Python anyway, but I suppose many potential NumPy/MKL users don't.) Dag Sverre From matthieu.brucher at gmail.com Thu Jan 21 07:40:15 2010 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Thu, 21 Jan 2010 13:40:15 +0100 Subject: [Numpy-discussion] Proposed fix for MKL and dynamic loading In-Reply-To: <4B5848BE.7020104@student.matnat.uio.no> References: <4B584115.9090500@student.matnat.uio.no> <4B5848BE.7020104@student.matnat.uio.no> Message-ID: 2010/1/21 Dag Sverre Seljebotn : > Matthieu Brucher wrote: >>> try: >>> ? ?import sys >>> ? ?import ctypes >>> ? ?_old_rtld = sys.getdlopenflags() >>> ? ?sys.setdlopenflags(_old_rtld|ctypes.RTLD_GLOBAL) >>> ? 
?from numpy.linalg import lapack_lite >>> finally: >>> ? ?sys.setdlopenflags(_old_rtld) >>> ? ?del sys; del ctypes; del _old_rtld >>> >> >> This also applies to scipy code that relies on BLAS as well. Lisandra >> Dalcin gave me a tip that is close to this one some months ago >> (http://matt.eifelle.com/2008/11/03/i-used-the-latest-mkl-with-numpy-and.../). >> The best official solution is to statically link against the MKL with >> Python. >> >> > IIUC, it should be enough to load the .so-s in GLOBAL mode once. So it > is probably enough to ensure NumPy is patched in a way so that SciPy > loads NumPy which loads the .so-s in GLOBAL mode, so that a seperate > patch for SciPy is not necesarry. (Remains to be tried, I'm moving on to > building SciPy now.) Indeed, it should be enough. > As for static linking, do you mean linking MKL into the Python > interpreter itself? Or statically linking with NumPy? statically linking with numpy. This is what was advised to me by Intel. Matthieu -- Information System Engineer, Ph.D. Blog: http://matt.eifelle.com LinkedIn: http://www.linkedin.com/in/matthieubrucher From dagss at student.matnat.uio.no Thu Jan 21 07:44:39 2010 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Thu, 21 Jan 2010 13:44:39 +0100 Subject: [Numpy-discussion] Proposed fix for MKL and dynamic loading In-Reply-To: References: <4B584115.9090500@student.matnat.uio.no> <4B5848BE.7020104@student.matnat.uio.no> Message-ID: <4B584C37.5000400@student.matnat.uio.no> Matthieu Brucher wrote: > 2010/1/21 Dag Sverre Seljebotn : > >> Matthieu Brucher wrote: >> >>>> try: >>>> import sys >>>> import ctypes >>>> _old_rtld = sys.getdlopenflags() >>>> sys.setdlopenflags(_old_rtld|ctypes.RTLD_GLOBAL) >>>> from numpy.linalg import lapack_lite >>>> finally: >>>> sys.setdlopenflags(_old_rtld) >>>> del sys; del ctypes; del _old_rtld >>>> >>>> >>> This also applies to scipy code that relies on BLAS as well. Lisandra >>> Dalcin gave me a tip that is close to this one some months ago >>> (http://matt.eifelle.com/2008/11/03/i-used-the-latest-mkl-with-numpy-and.../). >>> The best official solution is to statically link against the MKL with >>> Python. >>> >>> >>> >> IIUC, it should be enough to load the .so-s in GLOBAL mode once. So it >> is probably enough to ensure NumPy is patched in a way so that SciPy >> loads NumPy which loads the .so-s in GLOBAL mode, so that a seperate >> patch for SciPy is not necesarry. (Remains to be tried, I'm moving on to >> building SciPy now.) >> > > Indeed, it should be enough. > > >> As for static linking, do you mean linking MKL into the Python >> interpreter itself? Or statically linking with NumPy? >> > > statically linking with numpy. This is what was advised to me by Intel. > Somehow I didn't manage to do that. a) search_static_first does not seem to work for me b) moving the .so's out of the way does manage something, but mkl_lapack only exists in .so form. Moving only that back in still didn't work. In the end I stopped playing, even more as RTLD_GLOBAL seems a superior solution, even if Intel isn't willing to directly support it... 
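For reference, the same dlopen-flags trick can be written without the ctypes dependency by using the platform DLFCN constants where they exist. This is only a sketch: the numeric fallback for RTLD_GLOBAL is an assumed glibc value and should be checked against dlfcn.h on the target system.

import sys

try:
    import DLFCN                        # platform dlopen() constants (Linux)
    RTLD_GLOBAL = DLFCN.RTLD_GLOBAL
except ImportError:
    RTLD_GLOBAL = 0x100                 # assumed glibc value; verify locally

_old_flags = sys.getdlopenflags()
try:
    # Import the extension with RTLD_GLOBAL so the MKL libraries it pulls
    # in can see each other's symbols.
    sys.setdlopenflags(_old_flags | RTLD_GLOBAL)
    from numpy.linalg import lapack_lite
finally:
    sys.setdlopenflags(_old_flags)      # always restore the original flags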
Dag Sverre From cournape at gmail.com Thu Jan 21 09:35:29 2010 From: cournape at gmail.com (David Cournapeau) Date: Thu, 21 Jan 2010 23:35:29 +0900 Subject: [Numpy-discussion] Floating exception In-Reply-To: References: <4B57A838.7020607@silveregg.co.jp> Message-ID: <5b8d13221001210635h51fa8762s8d24f36d10a77013@mail.gmail.com> On Thu, Jan 21, 2010 at 8:36 PM, Nils Wagner wrote: > On Thu, 21 Jan 2010 10:04:56 +0900 > ?David Cournapeau wrote: >> Nils Wagner wrote: >>> Hi all, >>> >>> I found a strange problem when I try to import numpy >>> >>> python -v >>>>>> import numpy >>> ... >>> dlopen("/data/home/nwagner/local/lib/python2.5/site-packages/numpy/core/multiarray.so", >>> 2); >>> Floating exception >>> >>> Any idea ? >> >> Could you get a traceback (ideally making sure numpy is >>built with debug >> symbols - having -g in both CFLAGS and LDFLAGS) ? >> >> Having it happening inside the dlopen call is a bit >>weird, I can't see >> what could cause it, >> >> cheers, >> >> David > > Hi David, > > Thank you for your response. > I switched from CentOS 4.2 to CentOS 5.2 > Here is the output of > > gdb python > run -v > # > /data/home/nwagner/local/lib/python2.5/site-packages/site.pyc > has bad magic > ... > > What is the meaning of 'bad magic' ? It seems that the bad magic is coming from python, which would most likely mean the site.pyc bytecode is not compatible with the run python. This is independent of your problem I think, David From charlesr.harris at gmail.com Thu Jan 21 10:06:24 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 21 Jan 2010 08:06:24 -0700 Subject: [Numpy-discussion] numpy.test(): invalid value encountered in {isinf, divide, power, ...} In-Reply-To: <5b8d13221001210203h69a3a436id6e24a03117a538f@mail.gmail.com> References: <5b8d13221001210203h69a3a436id6e24a03117a538f@mail.gmail.com> Message-ID: On Thu, Jan 21, 2010 at 3:03 AM, David Cournapeau wrote: > On Thu, Jan 21, 2010 at 6:23 PM, Pauli Virtanen > > wrote: > > Wed, 20 Jan 2010 16:57:01 -0500, Darren Dale wrote: > > [clip] > >> Warning: invalid value encountered in isinf Warning: invalid value > >> encountered in isfinite > > [clip] > > > > This is because of changed seterr() default values. > > > > IMHO, the 'print' default is slightly worse than the previous 'ignore'. > > Personally, I don't see great value in the "invalid value encountered" > > reports that are appear every time a nan is generated... > > I thought it was agreed that the default would be changed to warnings > for 1.5.0 ? > > It was. Well, it was agreed that the change shouldn't be made in 1.4 anyway. I'm starting to have second thoughts also. Not so much because of the messages emitted by the tests, but because I suspect a lot of users will be shocked, shocked, when their old applications fill the screen with messages. Easily fixed, but still... Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From thomas.robitaille at gmail.com Thu Jan 21 11:37:09 2010 From: thomas.robitaille at gmail.com (Thomas Robitaille) Date: Thu, 21 Jan 2010 11:37:09 -0500 Subject: [Numpy-discussion] Broadcasting and indexing Message-ID: <2A7AEDE8-82DE-4067-8A81-068FBB9E5B37@gmail.com> Hello, I'm trying to understand how array broadcasting can be used for indexing. In the following, I use the term 'row' to refer to the first dimension of a 2D array, and 'column' to the second, just because that's how numpy prints them out. 
If I consider the following example: >>> a = np.random.random((4,5)) >>> b = np.random.random((5,)) >>> a + b array([[ 1.45499556, 0.60633959, 0.48236157, 1.55357393, 1.4339261 ], [ 1.28614593, 1.11265001, 0.63308615, 1.28904227, 1.34070499], [ 1.26988279, 0.84683018, 0.98959466, 0.76388223, 0.79273084], [ 1.27859505, 0.9721984 , 1.02725009, 1.38852061, 1.56065028]]) I understand how this works, because it works as expected as described in http://docs.scipy.org/doc/numpy/reference/ufuncs.html#broadcasting So b gets broadcast to shape (1,5), then because the first dimension is 1, the operation is applied to all rows. Now I am trying to apply this to array indexing. So for example, I want to set specific columns, indicated by a boolean array, to zero, but the following fails: >>> c = np.array([1,0,1,0,1], dtype=bool) >>> a[c] = 0 Traceback (most recent call last): File "", line 1, in IndexError: index (4) out of range (0<=index<3) in dimension 0 However, if I try reducing the size of c to 4, then it works, and sets rows, not columns, equal to zero >>> c = np.array([1,0,1,0], dtype=bool) >>> a[c] = 0 >>> a array([[ 0. , 0. , 0. , 0. , 0. ], [ 0.41526315, 0.7425491 , 0.39872546, 0.56141914, 0.69795153], [ 0. , 0. , 0. , 0. , 0. ], [ 0.40771227, 0.60209749, 0.7928894 , 0.66089748, 0.91789682]]) But I would have thought that the indexing array would have been broadcast in the same way as for a sum, i.e. c would be broadcast to have dimensions (1,5) and then would have been able to set certain columns in all rows to zero. Why is it that for indexing, the broadcasting seems to happen in a different way than when performing operations like additions or multiplications? For background info, I'm trying to write a routine which performs a set of operations on an n-d array, where n is not known in advance, with a 1D array, so I can use broadcasting rules for most operations without knowing the dimensionality of the n-d array, but now that I need to perform indexing, and the convention seems to change, this is a real issue. Thanks in advance for any advice, Thomas From nwagner at iam.uni-stuttgart.de Thu Jan 21 11:37:47 2010 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Thu, 21 Jan 2010 17:37:47 +0100 Subject: [Numpy-discussion] Floating exception In-Reply-To: <5b8d13221001210635h51fa8762s8d24f36d10a77013@mail.gmail.com> References: <4B57A838.7020607@silveregg.co.jp> <5b8d13221001210635h51fa8762s8d24f36d10a77013@mail.gmail.com> Message-ID: On Thu, 21 Jan 2010 23:35:29 +0900 David Cournapeau wrote: > On Thu, Jan 21, 2010 at 8:36 PM, Nils Wagner > wrote: >> On Thu, 21 Jan 2010 10:04:56 +0900 >> ?David Cournapeau wrote: >>> Nils Wagner wrote: >>>> Hi all, >>>> >>>> I found a strange problem when I try to import numpy >>>> >>>> python -v >>>>>>> import numpy >>>> ... >>>> dlopen("/data/home/nwagner/local/lib/python2.5/site-packages/numpy/core/multiarray.so", >>>> 2); >>>> Floating exception >>>> >>>> Any idea ? >>> >>> Could you get a traceback (ideally making sure numpy is >>>built with debug >>> symbols - having -g in both CFLAGS and LDFLAGS) ? >>> >>> Having it happening inside the dlopen call is a bit >>>weird, I can't see >>> what could cause it, >>> >>> cheers, >>> >>> David >> >> Hi David, >> >> Thank you for your response. >> I switched from CentOS 4.2 to CentOS 5.2 >> Here is the output of >> >> gdb python >> run -v >> # >> /data/home/nwagner/local/lib/python2.5/site-packages/site.pyc >> has bad magic >> ... >> >> What is the meaning of 'bad magic' ? 
> > It seems that the bad magic is coming from python, which >would most > likely mean the site.pyc bytecode is not compatible with >the run > python. This is independent of your problem I think, > > David O.k. here is some more information ... # can't create /data/home/nwagner/local/lib/python2.5/site-packages/numpy/core/info.pyc dlopen("/data/home/nwagner/local/lib/python2.5/site-packages/numpy/core/multiarray.so", 2); Program received signal SIGFPE, Arithmetic exception. [Switching to Thread 182894183648 (LWP 22301)] 0x000000350e8074d7 in do_lookup_x () from /lib64/ld-linux-x86-64.so.2 (gdb) bt #0 0x000000350e8074d7 in do_lookup_x () from /lib64/ld-linux-x86-64.so.2 #1 0x000000350e80789e in _dl_lookup_symbol_x () from /lib64/ld-linux-x86-64.so.2 #2 0x000000350e808c70 in _dl_relocate_object () from /lib64/ld-linux-x86-64.so.2 #3 0x000000350f7f7ac8 in dl_open_worker () from /lib64/tls/libc.so.6 #4 0x000000350e80aab0 in _dl_catch_error () from /lib64/ld-linux-x86-64.so.2 #5 0x000000350f7f845a in _dl_open () from /lib64/tls/libc.so.6 #6 0x000000350fc01054 in dlopen_doit () from /lib64/libdl.so.2 Any idea ? Nils From emmanuelle.gouillart at normalesup.org Thu Jan 21 13:03:48 2010 From: emmanuelle.gouillart at normalesup.org (Emmanuelle Gouillart) Date: Thu, 21 Jan 2010 19:03:48 +0100 Subject: [Numpy-discussion] Broadcasting and indexing In-Reply-To: <2A7AEDE8-82DE-4067-8A81-068FBB9E5B37@gmail.com> References: <2A7AEDE8-82DE-4067-8A81-068FBB9E5B37@gmail.com> Message-ID: <20100121180348.GC14058@phare.normalesup.org> Hi Thomas, broadcasting rules are only for ufuncs (and by extension, some numpy functions using ufuncs). Indexing obeys different rules and always starts by the first dimension. However, you don't have to use broadcasting for such indexing operations: >>> a[:, c] = 0 zeroes columns indexed by c. If you want to index along the 3rd dimension, you can use a[:, :, c], etc. If the dimension along which you index is a variable, you can also use the function np.rollaxis that allows to change the order of the dimensions of an array. You may then index along the first dimension (a[c]), then change back the order of the dimensions. Here is an example: >>> a = np.ones((3,4,5,6)) >>> c = np.array([1,0,1,0,1], dtype=bool) >>> tmp_a = np.rollaxis(a, 2, 0) >>> tmp_a.shape (5, 3, 4, 6) >>> tmp_a[c] = 0 >>> a = np.rollaxis(tmp_a, 0, 3) >>> a.shape (3, 4, 5, 6) Hope this helps. Cheers, Emmanuelle On Thu, Jan 21, 2010 at 11:37:09AM -0500, Thomas Robitaille wrote: > Hello, > I'm trying to understand how array broadcasting can be used for indexing. In the following, I use the term 'row' to refer to the first dimension of a 2D array, and 'column' to the second, just because that's how numpy prints them out. > If I consider the following example: > >>> a = np.random.random((4,5)) > >>> b = np.random.random((5,)) > >>> a + b > array([[ 1.45499556, 0.60633959, 0.48236157, 1.55357393, 1.4339261 ], > [ 1.28614593, 1.11265001, 0.63308615, 1.28904227, 1.34070499], > [ 1.26988279, 0.84683018, 0.98959466, 0.76388223, 0.79273084], > [ 1.27859505, 0.9721984 , 1.02725009, 1.38852061, 1.56065028]]) > I understand how this works, because it works as expected as described in > http://docs.scipy.org/doc/numpy/reference/ufuncs.html#broadcasting > So b gets broadcast to shape (1,5), then because the first dimension is 1, the operation is applied to all rows. > Now I am trying to apply this to array indexing. 
So for example, I want to set specific columns, indicated by a boolean array, to zero, but the following fails: > >>> c = np.array([1,0,1,0,1], dtype=bool) > >>> a[c] = 0 > Traceback (most recent call last): > File "", line 1, in > IndexError: index (4) out of range (0<=index<3) in dimension 0 > However, if I try reducing the size of c to 4, then it works, and sets rows, not columns, equal to zero > >>> c = np.array([1,0,1,0], dtype=bool) > >>> a[c] = 0 > >>> a > array([[ 0. , 0. , 0. , 0. , 0. ], > [ 0.41526315, 0.7425491 , 0.39872546, 0.56141914, 0.69795153], > [ 0. , 0. , 0. , 0. , 0. ], > [ 0.40771227, 0.60209749, 0.7928894 , 0.66089748, 0.91789682]]) > But I would have thought that the indexing array would have been broadcast in the same way as for a sum, i.e. c would be broadcast to have dimensions (1,5) and then would have been able to set certain columns in all rows to zero. > Why is it that for indexing, the broadcasting seems to happen in a different way than when performing operations like additions or multiplications? For background info, I'm trying to write a routine which performs a set of operations on an n-d array, where n is not known in advance, with a 1D array, so I can use broadcasting rules for most operations without knowing the dimensionality of the n-d array, but now that I need to perform indexing, and the convention seems to change, this is a real issue. > Thanks in advance for any advice, > Thomas > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From cournape at gmail.com Thu Jan 21 17:14:28 2010 From: cournape at gmail.com (David Cournapeau) Date: Fri, 22 Jan 2010 07:14:28 +0900 Subject: [Numpy-discussion] numpy.test(): invalid value encountered in {isinf, divide, power, ...} In-Reply-To: References: <5b8d13221001210203h69a3a436id6e24a03117a538f@mail.gmail.com> Message-ID: <5b8d13221001211414g7e2a9756w2fdd6837be2e4dd6@mail.gmail.com> On Fri, Jan 22, 2010 at 12:06 AM, Charles R Harris wrote: > > > On Thu, Jan 21, 2010 at 3:03 AM, David Cournapeau > wrote: >> >> On Thu, Jan 21, 2010 at 6:23 PM, Pauli Virtanen wrote: >> > Wed, 20 Jan 2010 16:57:01 -0500, Darren Dale wrote: >> > [clip] >> >> Warning: invalid value encountered in isinf Warning: invalid value >> >> encountered in isfinite >> > [clip] >> > >> > This is because of changed seterr() default values. >> > >> > IMHO, the 'print' default is slightly worse than the previous 'ignore'. >> > Personally, I don't see great value in the "invalid value encountered" >> > reports that are appear every time a nan is generated... >> >> I thought it was agreed that the default would be changed to warnings >> for 1.5.0 ? >> > > It was. Well, it was agreed that the change shouldn't be made in 1.4 anyway. > I'm starting to have second thoughts also. Not so much because of the > messages emitted by the tests, but because I suspect a lot of users will be > shocked, shocked, when their old applications fill the screen with messages. Hence changing to warning: it would appear only once per location, and can be filtered. And filling was what happened before it was changed by accident so that long ago anyway. 
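To make the proposal concrete, here is a rough sketch of what a 'warn' default would mean for users, and how the old silence could be recovered with the standard warnings machinery (illustrative only, not an agreed-upon change):

import warnings
import numpy as np

np.seterr(invalid='warn', divide='warn')   # the behaviour being discussed
np.log(np.zeros(3))    # emits a RuntimeWarning, shown once per location
                       # by the default warnings filter

# Anyone who prefers the old silence can opt out in one line:
warnings.simplefilter('ignore', RuntimeWarning)
np.log(np.zeros(3))    # now silent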
David From cournape at gmail.com Thu Jan 21 17:27:49 2010 From: cournape at gmail.com (David Cournapeau) Date: Fri, 22 Jan 2010 07:27:49 +0900 Subject: [Numpy-discussion] numpy.test(): invalid value encountered in {isinf, divide, power, ...} In-Reply-To: <5b8d13221001211414g7e2a9756w2fdd6837be2e4dd6@mail.gmail.com> References: <5b8d13221001210203h69a3a436id6e24a03117a538f@mail.gmail.com> <5b8d13221001211414g7e2a9756w2fdd6837be2e4dd6@mail.gmail.com> Message-ID: <5b8d13221001211427h182007b3m259f5b894a737164@mail.gmail.com> On Fri, Jan 22, 2010 at 7:14 AM, David Cournapeau wrote: > On Fri, Jan 22, 2010 at 12:06 AM, Charles R Harris > wrote: >> >> >> On Thu, Jan 21, 2010 at 3:03 AM, David Cournapeau >> wrote: >>> >>> On Thu, Jan 21, 2010 at 6:23 PM, Pauli Virtanen wrote: >>> > Wed, 20 Jan 2010 16:57:01 -0500, Darren Dale wrote: >>> > [clip] >>> >> Warning: invalid value encountered in isinf Warning: invalid value >>> >> encountered in isfinite >>> > [clip] >>> > >>> > This is because of changed seterr() default values. >>> > >>> > IMHO, the 'print' default is slightly worse than the previous 'ignore'. >>> > Personally, I don't see great value in the "invalid value encountered" >>> > reports that are appear every time a nan is generated... >>> >>> I thought it was agreed that the default would be changed to warnings >>> for 1.5.0 ? >>> >> >> It was. Well, it was agreed that the change shouldn't be made in 1.4 anyway. >> I'm starting to have second thoughts also. Not so much because of the >> messages emitted by the tests, but because I suspect a lot of users will be >> shocked, shocked, when their old applications fill the screen with messages. > > Hence changing to warning: it would appear only once per location, and > can be filtered. And filling was what happened before it was changed > by accident so that long ago anyway. sorry, I mean it was the default not that long ago before it was changed by accident, David From pgmdevlist at gmail.com Thu Jan 21 18:07:35 2010 From: pgmdevlist at gmail.com (Pierre GM) Date: Thu, 21 Jan 2010 18:07:35 -0500 Subject: [Numpy-discussion] numpy.test(): invalid value encountered in {isinf, divide, power, ...} In-Reply-To: <5b8d13221001211427h182007b3m259f5b894a737164@mail.gmail.com> References: <5b8d13221001210203h69a3a436id6e24a03117a538f@mail.gmail.com> <5b8d13221001211414g7e2a9756w2fdd6837be2e4dd6@mail.gmail.com> <5b8d13221001211427h182007b3m259f5b894a737164@mail.gmail.com> Message-ID: <9ACA278C-46C6-4F29-91AF-EF2DCC0A804C@gmail.com> On Jan 21, 2010, at 5:27 PM, David Cournapeau wrote: > On Fri, Jan 22, 2010 at 7:14 AM, David Cournapeau wrote: >> On Fri, Jan 22, 2010 at 12:06 AM, Charles R Harris >> wrote: >>> >>> >>> On Thu, Jan 21, 2010 at 3:03 AM, David Cournapeau >>> wrote: >>>> >>>> On Thu, Jan 21, 2010 at 6:23 PM, Pauli Virtanen wrote: >>>>> Wed, 20 Jan 2010 16:57:01 -0500, Darren Dale wrote: >>>>> [clip] >>>>>> Warning: invalid value encountered in isinf Warning: invalid value >>>>>> encountered in isfinite >>>>> [clip] >>>>> >>>>> This is because of changed seterr() default values. >>>>> >>>>> IMHO, the 'print' default is slightly worse than the previous 'ignore'. >>>>> Personally, I don't see great value in the "invalid value encountered" >>>>> reports that are appear every time a nan is generated... >>>> >>>> I thought it was agreed that the default would be changed to warnings >>>> for 1.5.0 ? >>>> >>> >>> It was. Well, it was agreed that the change shouldn't be made in 1.4 anyway. >>> I'm starting to have second thoughts also. 
Not so much because of the >>> messages emitted by the tests, but because I suspect a lot of users will be >>> shocked, shocked, when their old applications fill the screen with messages. >> >> Hence changing to warning: it would appear only once per location, and >> can be filtered. And filling was what happened before it was changed >> by accident so that long ago anyway. > > sorry, I mean it was the default not that long ago before it was > changed by accident, At the same time, having a seterr module-wise in numpy.ma wasn't the best idea either. So, we could put the seterr(invalid:ignore) back in numpy.ma.core, that should take care of the warning when using np functions on masked arrays. Nevertheless, I'll still keep the current mechanism in the ma functions (viz, catch the settings, change them locally before an operation, reset them afterwards), that's cleaner. Unless you guys come with a better idea. From dwf at cs.toronto.edu Thu Jan 21 19:13:17 2010 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Thu, 21 Jan 2010 19:13:17 -0500 Subject: [Numpy-discussion] generalized ufunc problem Message-ID: <8A5E87C4-5ACC-47B1-9628-97F9FAB12559@cs.toronto.edu> I decided to take a crack at adding a generalized ufunc for logsumexp, i.e. collapsed an array along the last dimension by subtracting the maximum element E along that dimension, taking the exponential, adding, and then adding back E. Functionally the same logaddexp.reduce() but presumably faster and less prone to error accumulation. I added the following to umath_tests.c.src and got everything to compile, but for some reason it doesn't give me the behaviour I'm looking for. I was expecting a (500, 50) array to be collapsed down to a (500,) array. Is that not what the signature calls for? Thanks, David char *logsumexp_signature = "(i)->()"; /**begin repeat #TYPE=LONG,DOUBLE# #typ=npy_long, npy_double# #EXPFUN=expl, exp# #LOGFUN=logl, log# */ /* * This implements the function * out[n] = sum_i { in1[n, i] * in2[n, i] }. */ static void @TYPE at _logsumexp(char **args, intp *dimensions, intp *steps, void *NPY_UNUSED(func)) { INIT_OUTER_LOOP_3 intp di = dimensions[0]; intp i; intp is1=steps[0]; BEGIN_OUTER_LOOP_3 char *ip1=args[0], *op=args[1]; @typ@ max = (*(@typ@ *)ip1); @typ@ sum = 0; for (i = 0; i < di; i++) { max = max < (*(@typ@ *)ip1) ? (*(@typ@ *)ip1) : max; ip1 += is1; } ip1 = args[0]; for (i = 0; i < di; i++) { sum += @EXPFUN@((*(@typ@ *)ip1) - max); ip1 += is1; } *(@typ@ *)op = @LOGFUN@(sum + max); END_OUTER_LOOP } /**end repeat**/ static PyUFuncGenericFunction logsumexp_functions[] = { LONG_logsumexp, DOUBLE_logsumexp }; static void * logsumexp_data[] = { (void *)NULL, (void *)NULL }; static char logsumexp_signatures[] = { PyArray_LONG, PyArray_LONG, PyArray_DOUBLE, PyArray_DOUBLE }; /* and inside addUfuncs() */ ... f = PyUFunc_FromFuncAndData(logsumexp_functions, logsumexp_data, logsumexp_signatures, 2, 1, 1, PyUFunc_None, "logsumexp", "inner1d with a weight argument \\n""\" \"\n"" \\ \"(i)->()\\\" \\n""", 0); PyDict_SetItemString(dictionary, "logsumexp", f); Py_DECREF(f); .... 
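For comparison, a rough pure-NumPy reference for the reduction the "(i)->()" signature is meant to express (logsumexp_ref is a made-up name used only for illustration, not part of numpy):

import numpy as np

def logsumexp_ref(a):
    # log(sum(exp(a))) over the last axis, shifted by the max for
    # numerical stability -- the same trick as the C kernel above.
    a = np.asarray(a, dtype=float)
    amax = a.max(axis=-1)
    return amax + np.log(np.exp(a - amax[..., np.newaxis]).sum(axis=-1))

print(logsumexp_ref(np.random.rand(500, 50)).shape)   # expected: (500,)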
From warren.weckesser at enthought.com Thu Jan 21 19:30:59 2010 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Thu, 21 Jan 2010 18:30:59 -0600 Subject: [Numpy-discussion] generalized ufunc problem In-Reply-To: <8A5E87C4-5ACC-47B1-9628-97F9FAB12559@cs.toronto.edu> References: <8A5E87C4-5ACC-47B1-9628-97F9FAB12559@cs.toronto.edu> Message-ID: <4B58F1C3.1000609@enthought.com> David, I haven't tried creating a ufunc before, so I can't help you with that, but since you are working on logsumexp, you might be interested in the version I posted here in October: http://mail.scipy.org/pipermail/scipy-user/2009-October/022931.html and the attached tests. Warren David Warde-Farley wrote: > I decided to take a crack at adding a generalized ufunc for logsumexp, > i.e. collapsed an array along the last dimension by subtracting the > maximum element E along that dimension, taking the exponential, > adding, and then adding back E. Functionally the same > logaddexp.reduce() but presumably faster and less prone to error > accumulation. > > I added the following to umath_tests.c.src and got everything to > compile, but for some reason it doesn't give me the behaviour I'm > looking for. I was expecting a (500, 50) array to be collapsed down to > a (500,) array. Is that not what the signature calls for? > > Thanks, > > David > > char *logsumexp_signature = "(i)->()"; > > /**begin repeat > > #TYPE=LONG,DOUBLE# > #typ=npy_long, npy_double# > #EXPFUN=expl, exp# > #LOGFUN=logl, log# > */ > > /* > * This implements the function > * out[n] = sum_i { in1[n, i] * in2[n, i] }. > */ > > static void > @TYPE at _logsumexp(char **args, intp *dimensions, intp *steps, void > *NPY_UNUSED(func)) > { > INIT_OUTER_LOOP_3 > intp di = dimensions[0]; > intp i; > intp is1=steps[0]; > BEGIN_OUTER_LOOP_3 > char *ip1=args[0], *op=args[1]; > @typ@ max = (*(@typ@ *)ip1); > @typ@ sum = 0; > > for (i = 0; i < di; i++) { > max = max < (*(@typ@ *)ip1) ? (*(@typ@ *)ip1) : max; > ip1 += is1; > } > ip1 = args[0]; > for (i = 0; i < di; i++) { > sum += @EXPFUN@((*(@typ@ *)ip1) - max); > ip1 += is1; > } > *(@typ@ *)op = @LOGFUN@(sum + max); > END_OUTER_LOOP > } > > /**end repeat**/ > > > > static PyUFuncGenericFunction logsumexp_functions[] = > { LONG_logsumexp, DOUBLE_logsumexp }; > static void * logsumexp_data[] = { (void *)NULL, (void *)NULL }; > static char logsumexp_signatures[] = { PyArray_LONG, PyArray_LONG, > PyArray_DOUBLE, PyArray_DOUBLE }; > > > /* and inside addUfuncs() */ > ... > > f = PyUFunc_FromFuncAndData(logsumexp_functions, logsumexp_data, > logsumexp_signatures, 2, 1, 1, > PyUFunc_None, "logsumexp", "inner1d > with a weight argument \\n""\" \"\n"" \\ > \"(i)->()\\\" \\n""", 0); PyDict_SetItemString(dictionary, > "logsumexp", f); Py_DECREF(f); > > .... > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... 
Name: my_logsumexp_test.py URL: From david at silveregg.co.jp Thu Jan 21 20:02:57 2010 From: david at silveregg.co.jp (David Cournapeau) Date: Fri, 22 Jan 2010 10:02:57 +0900 Subject: [Numpy-discussion] Floating exception In-Reply-To: References: <4B57A838.7020607@silveregg.co.jp> <5b8d13221001210635h51fa8762s8d24f36d10a77013@mail.gmail.com> Message-ID: <4B58F941.3040108@silveregg.co.jp> Nils Wagner wrote: > # can't create > /data/home/nwagner/local/lib/python2.5/site-packages/numpy/core/info.pyc > dlopen("/data/home/nwagner/local/lib/python2.5/site-packages/numpy/core/multiarray.so", > 2); > > Program received signal SIGFPE, Arithmetic exception. > [Switching to Thread 182894183648 (LWP 22301)] > 0x000000350e8074d7 in do_lookup_x () from > /lib64/ld-linux-x86-64.so.2 > (gdb) bt > #0 0x000000350e8074d7 in do_lookup_x () from > /lib64/ld-linux-x86-64.so.2 > #1 0x000000350e80789e in _dl_lookup_symbol_x () from > /lib64/ld-linux-x86-64.so.2 > #2 0x000000350e808c70 in _dl_relocate_object () from > /lib64/ld-linux-x86-64.so.2 > #3 0x000000350f7f7ac8 in dl_open_worker () from > /lib64/tls/libc.so.6 > #4 0x000000350e80aab0 in _dl_catch_error () from > /lib64/ld-linux-x86-64.so.2 > #5 0x000000350f7f845a in _dl_open () from > /lib64/tls/libc.so.6 > #6 0x000000350fc01054 in dlopen_doit () from > /lib64/libdl.so.2 > > Any idea ? Are you using the python provided by your distribution ? The only reasonable reason I can think of is that some modules change the FPU exception handling without rolling it back before importing multiarray, and dlopen expects the FPU state to be in a certain state. It would be good to make sure you are only loading numpy (if you use easy_install/setuptools, it may cause other packages to import "automatically"). Or maybe multiarray causes a FPU exception in dlopen because multiarray is badly built or some other corner cases in do_lookup_x, although this sounds much less likely - one good way would be load numpy with the debug version of glibc to get the exact line failing inside do_lookup_x. I see only one integer division in the do_lookup_x code, but it would be good to confirm it actually happens there. I don't know how it works on Centos to get your program loaded with the debug version of glibc, cheers, David From david at silveregg.co.jp Thu Jan 21 20:09:14 2010 From: david at silveregg.co.jp (David Cournapeau) Date: Fri, 22 Jan 2010 10:09:14 +0900 Subject: [Numpy-discussion] Proposed fix for MKL and dynamic loading In-Reply-To: <4B584115.9090500@student.matnat.uio.no> References: <4B584115.9090500@student.matnat.uio.no> Message-ID: <4B58FABA.3030207@silveregg.co.jp> Dag Sverre Seljebotn wrote: > Questions: > > a) Should I submit a patch? > b) Negative consequences? Perhaps another Python module can now not load > a different BLAS implementation? (That still seems better than not being > able to use MKL IMO). Besides the problem of ctypes not always being available, I am very wary of those library-specific hacks. Worse, it is version dependent, because it depends on the MKL. > d) Do I need a "if hasattr" for Windows, or will Windows just ignore it, > or does this apply to Windows too? Windows does not have dlopen, and has totally different semantics for dynamic loading. Besides, this is not needed on windows. So it should not be executed at all. > [1] BTW, I could not figure out how to link statically if I wanted -- is > "search_static_first = 1" supposed to work? Perhaps MKL will insist on > loading some parts dynamically even then *shrug*. 
search_static_first is inherently fragile - using the linker to do this is much better (with -WL,-Bshared/-Wl,-Bstatic flags). cheers, David From fperez.net at gmail.com Thu Jan 21 21:07:30 2010 From: fperez.net at gmail.com (Fernando Perez) Date: Thu, 21 Jan 2010 18:07:30 -0800 Subject: [Numpy-discussion] numpy broken by r8077? Message-ID: Hi all, This simple-looking change: http://projects.scipy.org/numpy/changeset/8077 is giving me a wholly broken (unimportable) numpy on ubuntu 9.10, 64-bit: uqbar[junk]> python -c 'import numpy' Traceback (most recent call last): File "", line 1, in File "/home/fperez/usr/opt/lib/python2.6/site-packages/numpy/__init__.py", line 136, in import add_newdocs File "/home/fperez/usr/opt/lib/python2.6/site-packages/numpy/add_newdocs.py", line 9, in from numpy.lib import add_newdoc File "/home/fperez/usr/opt/lib/python2.6/site-packages/numpy/lib/__init__.py", line 4, in from type_check import * File "/home/fperez/usr/opt/lib/python2.6/site-packages/numpy/lib/type_check.py", line 8, in import numpy.core.numeric as _nx File "/home/fperez/usr/opt/lib/python2.6/site-packages/numpy/core/__init__.py", line 5, in import multiarray ImportError: /home/fperez/usr/opt/lib/python2.6/site-packages/numpy/core/multiarray.so: undefined symbol: PyUnicodeUCS2_AsASCIIString I reverted to 8076 and it's ok now: uqbar[junk]> python -c 'import numpy;numpy.test()' Running unit tests for numpy NumPy version 1.5.0.dev8076 NumPy is installed in /home/fperez/usr/opt/lib/python2.6/site-packages/numpy Python version 2.6.4 (r264:75706, Dec 7 2009, 18:43:55) [GCC 4.4.1] nose version 0.11.1 .......................................................................... ---------------------------------------------------------------------- Ran 2506 tests in 7.083s OK (KNOWNFAIL=5, SKIP=4) The change is very small and was logged as fixing another bug, but it seems something is amiss now. If it was committed like that I imagine it works on some systems, so is it a problem on my end? I'm using the system python (2.6) and it's an otherwise fairly standard ubuntu box, were I only run ipython, numpy, scipy and matplotlib from dev trees and everything else is stock from the distro. Any help much appreciated, f From david at silveregg.co.jp Thu Jan 21 21:12:52 2010 From: david at silveregg.co.jp (David Cournapeau) Date: Fri, 22 Jan 2010 11:12:52 +0900 Subject: [Numpy-discussion] numpy broken by r8077? 
In-Reply-To: References: Message-ID: <4B5909A4.8090905@silveregg.co.jp> Fernando Perez wrote: > Hi all, > > This simple-looking change: > > http://projects.scipy.org/numpy/changeset/8077 > > is giving me a wholly broken (unimportable) numpy on ubuntu 9.10, 64-bit: > > uqbar[junk]> python -c 'import numpy' > Traceback (most recent call last): > File "", line 1, in > File "/home/fperez/usr/opt/lib/python2.6/site-packages/numpy/__init__.py", > line 136, in > import add_newdocs > File "/home/fperez/usr/opt/lib/python2.6/site-packages/numpy/add_newdocs.py", > line 9, in > from numpy.lib import add_newdoc > File "/home/fperez/usr/opt/lib/python2.6/site-packages/numpy/lib/__init__.py", > line 4, in > from type_check import * > File "/home/fperez/usr/opt/lib/python2.6/site-packages/numpy/lib/type_check.py", > line 8, in > import numpy.core.numeric as _nx > File "/home/fperez/usr/opt/lib/python2.6/site-packages/numpy/core/__init__.py", > line 5, in > import multiarray > ImportError: /home/fperez/usr/opt/lib/python2.6/site-packages/numpy/core/multiarray.so: > undefined symbol: PyUnicodeUCS2_AsASCIIString This is typically caused by incompatible python (e.g. you build with one python using 4-bytes/char unicode, and numpy is built against a python using 2-bytes/char). Anyway, very unlikely to be caused by r8077. I suspect some mixedup in your build - can you build numpy r8077 from scratch, after having removed installed and build directories ? David From fperez.net at gmail.com Thu Jan 21 21:39:29 2010 From: fperez.net at gmail.com (Fernando Perez) Date: Thu, 21 Jan 2010 18:39:29 -0800 Subject: [Numpy-discussion] numpy broken by r8077? In-Reply-To: <4B5909A4.8090905@silveregg.co.jp> References: <4B5909A4.8090905@silveregg.co.jp> Message-ID: On Thu, Jan 21, 2010 at 6:12 PM, David Cournapeau wrote: > This is typically caused by incompatible python (e.g. you build with one > python using 4-bytes/char unicode, and numpy is built against a python > using 2-bytes/char). Anyway, very unlikely to be caused by r8077. > > I suspect some mixedup in your build - can you build numpy r8077 from > scratch, after having removed installed and build directories ? > Dead-on, thanks! For the record, in case anyone encounters a similar situation: this box is now running ubuntu 9.10, recently upgraded from 8.10. In 8.10, the system python was 2.5, so I had built python2.6 and put it in /usr/local/, that was 2.6.2. I now updated the machine, and it's python2.6 is 2.6.4. But my build script ended up building numpy against the 2.6.2 (likely a UCS2 build), giving me a numpy build that couldn't be imported by the system python. Subtle, but perfectly reasonable in retrospect. Thanks a lot for your hint, David, it was clear enough to let me understand the problem. Sorry for the noise, f From charlesr.harris at gmail.com Thu Jan 21 21:43:29 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 21 Jan 2010 19:43:29 -0700 Subject: [Numpy-discussion] numpy broken by r8077? In-Reply-To: References: Message-ID: On Thu, Jan 21, 2010 at 7:07 PM, Fernando Perez wrote: > Hi all, > > This simple-looking change: > > http://projects.scipy.org/numpy/changeset/8077 > > is giving me a wholly broken (unimportable) numpy on ubuntu 9.10, 64-bit: > > Works fine here on the same distro. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From fperez.net at gmail.com Thu Jan 21 21:56:50 2010 From: fperez.net at gmail.com (Fernando Perez) Date: Thu, 21 Jan 2010 18:56:50 -0800 Subject: [Numpy-discussion] numpy broken by r8077? In-Reply-To: References: Message-ID: On Thu, Jan 21, 2010 at 6:43 PM, Charles R Harris wrote: > > Works fine here on the same distro. > Yup, as David suggested, I was managing to pick up a narrow build of python 2.6 leftover in /usr/local/ from before upgrading to 9.10, which I had needed back when I was running 8.10 here. Sorry for the noise, f From dagss at student.matnat.uio.no Fri Jan 22 04:34:07 2010 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Fri, 22 Jan 2010 10:34:07 +0100 Subject: [Numpy-discussion] Proposed fix for MKL and dynamic loading In-Reply-To: <4B58FABA.3030207@silveregg.co.jp> References: <4B584115.9090500@student.matnat.uio.no> <4B58FABA.3030207@silveregg.co.jp> Message-ID: <4B59710F.2040507@student.matnat.uio.no> David Cournapeau wrote: > Dag Sverre Seljebotn wrote: > > >> Questions: >> >> a) Should I submit a patch? >> b) Negative consequences? Perhaps another Python module can now not load >> a different BLAS implementation? (That still seems better than not being >> able to use MKL IMO). >> > > Besides the problem of ctypes not always being available, I am very wary > of those library-specific hacks. Worse, it is version dependent, because > it depends on the MKL. > I was thinking that this was perhaps a general problem -- that *if* ATLAS started implementing support for dynamically switchable kernels at load time (which is a feature I certainly wish for), it would suffer the same problems. But I don't really know that. DLFCN can be used instead of ctypes. Which I think is not always available either, but "except ImportError: pass" should be fine in this kind of situation -- if you need the workaround you'd typically have it. The only real issue I can see is if it has a significant impact on import times for non-MKL users. But I won't put up a big fight for this kind patch -- I can work around it for my own purposes. I just though it might be nice to make things easier/more transparent for NumPy/MKL users. >> [1] BTW, I could not figure out how to link statically if I wanted -- is >> "search_static_first = 1" supposed to work? Perhaps MKL will insist on >> loading some parts dynamically even then *shrug*. >> > > search_static_first is inherently fragile - using the linker to do this > is much better (with -WL,-Bshared/-Wl,-Bstatic flags). > Thanks! (I'll do that if I get any problems, but I have 3-4 other libs depending on BLAS as well loaded, so shared is better in principle.) Dag Sverre From matthieu.brucher at gmail.com Fri Jan 22 05:21:17 2010 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Fri, 22 Jan 2010 11:21:17 +0100 Subject: [Numpy-discussion] Proposed fix for MKL and dynamic loading In-Reply-To: <4B58FABA.3030207@silveregg.co.jp> References: <4B584115.9090500@student.matnat.uio.no> <4B58FABA.3030207@silveregg.co.jp> Message-ID: >> [1] BTW, I could not figure out how to link statically if I wanted -- is >> "search_static_first = 1" supposed to work? Perhaps MKL will insist on >> loading some parts dynamically even then *shrug*. > > search_static_first is inherently fragile - using the linker to do this > is much better (with -WL,-Bshared/-Wl,-Bstatic flags). How do you write the site.cfg accordingly? Matthieu -- Information System Engineer, Ph.D. 
Blog: http://matt.eifelle.com LinkedIn: http://www.linkedin.com/in/matthieubrucher From josef.pktd at gmail.com Fri Jan 22 09:51:13 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 22 Jan 2010 09:51:13 -0500 Subject: [Numpy-discussion] Broadcasting and indexing In-Reply-To: <20100121180348.GC14058@phare.normalesup.org> References: <2A7AEDE8-82DE-4067-8A81-068FBB9E5B37@gmail.com> <20100121180348.GC14058@phare.normalesup.org> Message-ID: <1cd32cbb1001220651n19d21057ib89126b06aed8264@mail.gmail.com> On Thu, Jan 21, 2010 at 1:03 PM, Emmanuelle Gouillart wrote: > > Hi Thomas, > > broadcasting rules are only for ufuncs (and by extension, some numpy > functions using ufuncs). Indexing obeys different rules and always starts > by the first dimension. Just a clarification: If there are several index arrays, then standard broadcasting rules apply for them. It's a bit messier when arrays and slice objects are mixed. An informative explanation was in the thread March 2009 about "Is this a bug?" and lots of examples are on the mailing list Josef > > However, you don't have to use broadcasting for such indexing operations: >>>> a[:, c] = 0 > zeroes columns indexed by c. > > If you want to index along the 3rd dimension, you can use a[:, :, c], > etc. If the dimension along which you index is a variable, you can also > use the function np.rollaxis that allows to change the order of the > dimensions of an array. You may then index along the first dimension > (a[c]), then change back the order of the dimensions. Here is an example: >>>> a = np.ones((3,4,5,6)) >>>> c = np.array([1,0,1,0,1], dtype=bool) >>>> tmp_a = np.rollaxis(a, 2, 0) >>>> tmp_a.shape > (5, 3, 4, 6) >>>> tmp_a[c] = 0 >>>> a = np.rollaxis(tmp_a, 0, 3) >>>> a.shape > (3, 4, 5, 6) > > Hope this helps. > > Cheers, > > Emmanuelle > > On Thu, Jan 21, 2010 at 11:37:09AM -0500, Thomas Robitaille wrote: >> Hello, > >> I'm trying to understand how array broadcasting can be used for indexing. In the following, I use the term 'row' to refer to the first dimension of a 2D array, and 'column' to the second, just because that's how numpy prints them out. > >> If I consider the following example: > >> >>> a = np.random.random((4,5)) >> >>> b = np.random.random((5,)) >> >>> a + b >> array([[ 1.45499556, ?0.60633959, ?0.48236157, ?1.55357393, ?1.4339261 ], >> ? ? ? ?[ 1.28614593, ?1.11265001, ?0.63308615, ?1.28904227, ?1.34070499], >> ? ? ? ?[ 1.26988279, ?0.84683018, ?0.98959466, ?0.76388223, ?0.79273084], >> ? ? ? ?[ 1.27859505, ?0.9721984 , ?1.02725009, ?1.38852061, ?1.56065028]]) > >> I understand how this works, because it works as expected as described in > >> http://docs.scipy.org/doc/numpy/reference/ufuncs.html#broadcasting > >> So b gets broadcast to shape (1,5), then because the first dimension is 1, the operation is applied to all rows. > >> Now I am trying to apply this to array indexing. So for example, I want to set specific columns, indicated by a boolean array, to zero, but the following fails: > >> >>> c = np.array([1,0,1,0,1], dtype=bool) >> >>> a[c] = 0 >> Traceback (most recent call last): >> ? File "", line 1, in >> IndexError: index (4) out of range (0<=index<3) in dimension 0 > >> However, if I try reducing the size of c to 4, then it works, and sets rows, not columns, equal to zero > >> >>> c = np.array([1,0,1,0], dtype=bool) >> >>> a[c] = 0 >> >>> a >> array([[ 0. ? ? ? ?, ?0. ? ? ? ?, ?0. ? ? ? ?, ?0. ? ? ? ?, ?0. ? ? ? ?], >> ? ? ? ?[ 0.41526315, ?0.7425491 , ?0.39872546, ?0.56141914, ?0.69795153], >> ? ? ? 
?[ 0. ? ? ? ?, ?0. ? ? ? ?, ?0. ? ? ? ?, ?0. ? ? ? ?, ?0. ? ? ? ?], >> ? ? ? ?[ 0.40771227, ?0.60209749, ?0.7928894 , ?0.66089748, ?0.91789682]]) > >> But I would have thought that the indexing array would have been broadcast in the same way as for a sum, i.e. c would be broadcast to have dimensions (1,5) and then would have been able to set certain columns in all rows to zero. > >> Why is it that for indexing, the broadcasting seems to happen in a different way than when performing operations like additions or multiplications? For background info, I'm trying to write a routine which performs a set of operations on an n-d array, where n is not known in advance, with a 1D array, so I can use broadcasting rules for most operations without knowing the dimensionality of the n-d array, but now that I need to perform indexing, and the convention seems to change, this is a real issue. > >> Thanks in advance for any advice, > >> Thomas >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From kwgoodman at gmail.com Fri Jan 22 12:11:14 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Fri, 22 Jan 2010 09:11:14 -0800 Subject: [Numpy-discussion] np.testing.assert_equal Message-ID: Should np.testing.assert_equal(np.array(1), 1) raise an AssertionError? (It currently doesn't.) The use case I have in mind is that scipy.stats.nanamedian incorrectly returns np.array(1.0) for the median of a 1d array while np.median correctly returns 1.0. It would be handy if the assert statement caught the difference. From charlesr.harris at gmail.com Fri Jan 22 12:35:10 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 22 Jan 2010 10:35:10 -0700 Subject: [Numpy-discussion] np.testing.assert_equal In-Reply-To: References: Message-ID: On Fri, Jan 22, 2010 at 10:11 AM, Keith Goodman wrote: > Should > > np.testing.assert_equal(np.array(1), 1) > > raise an AssertionError? (It currently doesn't.) > > The use case I have in mind is that scipy.stats.nanamedian incorrectly > returns np.array(1.0) for the median of a 1d array while np.median > correctly returns 1.0. It would be handy if the assert statement > caught the difference. > _ Such a change would break most of the current tests. The current form was chosen to be convenient for most use cases. If you need to check the return type, then you can write another helper function or use assert_ to combine several tests. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Fri Jan 22 16:02:39 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 22 Jan 2010 14:02:39 -0700 Subject: [Numpy-discussion] void * abuse Message-ID: Example: typedef void (PyArray_VectorUnaryFunc)(void *, void *, npy_intp, void *, void *) The actual functions that implement this type don't match the prototype, an unfortunate fact that is papered over by casting said functions to the type to avoid warnings about initialization from incompatible pointer types. Theoretically real problems could result from this, in practice the main effect is that type checking by the compiler doesn't take place leading to undisciplined implementions. 
The current prototype isn't much better than a simple typedef void (PyArray_VectorUnaryFunc)() So... this can be fixed but the fix could cause some current code in the wild to result in some warnings next time it is compiled. I doubt there will be any binary incompatibility with existing compiled code. Should I fix it? It is really ugly as it stands. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From dwf at cs.toronto.edu Fri Jan 22 22:28:12 2010 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Fri, 22 Jan 2010 22:28:12 -0500 Subject: [Numpy-discussion] generalized ufunc problem In-Reply-To: <4B58F1C3.1000609@enthought.com> References: <8A5E87C4-5ACC-47B1-9628-97F9FAB12559@cs.toronto.edu> <4B58F1C3.1000609@enthought.com> Message-ID: <2C906C96-707B-478B-B7B5-A646E6966910@cs.toronto.edu> Hi Warren, Thanks for the reply. I actually have a very similar version in my own code; I was just hoping to figure out the generalized ufunc architecture. There aren't many examples of actual uses of this capability in NumPy, so I wanted to try and exercise it a bit. logsumexp is kind of a perfect scenario for generalized ufuncs as it requires operations over an entire dimension (the max operation). Cheers, David On 21-Jan-10, at 7:30 PM, Warren Weckesser wrote: > David, > > I haven't tried creating a ufunc before, so I can't help you with > that, but since you are working on logsumexp, you might be > interested in the version I posted here in October: > > http://mail.scipy.org/pipermail/scipy-user/2009-October/022931.html > > and the attached tests. > > > Warren > > > David Warde-Farley wrote: >> I decided to take a crack at adding a generalized ufunc for >> logsumexp, i.e. collapsed an array along the last dimension by >> subtracting the maximum element E along that dimension, taking the >> exponential, adding, and then adding back E. Functionally the >> same logaddexp.reduce() but presumably faster and less prone to >> error accumulation. >> >> I added the following to umath_tests.c.src and got everything to >> compile, but for some reason it doesn't give me the behaviour I'm >> looking for. I was expecting a (500, 50) array to be collapsed down >> to a (500,) array. Is that not what the signature calls for? >> >> Thanks, >> >> David >> >> char *logsumexp_signature = "(i)->()"; >> >> /**begin repeat >> >> #TYPE=LONG,DOUBLE# >> #typ=npy_long, npy_double# >> #EXPFUN=expl, exp# >> #LOGFUN=logl, log# >> */ >> >> /* >> * This implements the function >> * out[n] = sum_i { in1[n, i] * in2[n, i] }. >> */ >> >> static void >> @TYPE at _logsumexp(char **args, intp *dimensions, intp *steps, void >> *NPY_UNUSED(func)) >> { >> INIT_OUTER_LOOP_3 >> intp di = dimensions[0]; >> intp i; >> intp is1=steps[0]; >> BEGIN_OUTER_LOOP_3 >> char *ip1=args[0], *op=args[1]; >> @typ@ max = (*(@typ@ *)ip1); >> @typ@ sum = 0; >> >> for (i = 0; i < di; i++) { >> max = max < (*(@typ@ *)ip1) ? (*(@typ@ *)ip1) : max; >> ip1 += is1; >> } >> ip1 = args[0]; >> for (i = 0; i < di; i++) { >> sum += @EXPFUN@((*(@typ@ *)ip1) - max); >> ip1 += is1; >> } >> *(@typ@ *)op = @LOGFUN@(sum + max); >> END_OUTER_LOOP >> } >> >> /**end repeat**/ >> >> >> >> static PyUFuncGenericFunction logsumexp_functions[] = >> { LONG_logsumexp, DOUBLE_logsumexp }; >> static void * logsumexp_data[] = { (void *)NULL, (void *)NULL }; >> static char logsumexp_signatures[] = { PyArray_LONG, PyArray_LONG, >> PyArray_DOUBLE, PyArray_DOUBLE }; >> >> >> /* and inside addUfuncs() */ >> ... 
>> >> f = PyUFunc_FromFuncAndData(logsumexp_functions, >> logsumexp_data, logsumexp_signatures, >> 2, 1, 1, PyUFunc_None, >> "logsumexp", "inner1d with a >> weight argument \\n""\" \"\n"" \\ \"(i)- >> >()\\\" \\n""", 0); PyDict_SetItemString(dictionary, >> "logsumexp", f); Py_DECREF(f); >> >> .... >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > > > from numpy import * > from scipy.maxentropy import logsumexp > > from my_logsumexp import my_logsumexp > > > if __name__ == "__main__": > > # > #-------------------------------------------- > # 1D tests. > #-------------------------------------------- > x = array([1.0,2.0,3.0]) > p = logsumexp(x) > q = my_logsumexp(x) > assert p == q > > x = array([1.0,2.0,3.0]) > p = logsumexp(x) > q = my_logsumexp(x, axis=0) > assert p == q > > #-------------------------------------------- > # A 2D test. > #-------------------------------------------- > > a = array([[1.0,2.0,3.0],[0.1,0.2,0.3]]) > q = my_logsumexp(a, axis=0) > assert allclose(q[0], logsumexp(a[:,0])) > assert allclose(q[1], logsumexp(a[:,1])) > assert allclose(q[2], logsumexp(a[:,2])) > q = my_logsumexp(a, axis=1) > assert allclose(q[0], logsumexp(a[0])) > assert allclose(q[1], logsumexp(a[1])) > > #-------------------------------------------- > # A 3D test. > #-------------------------------------------- > L = 3 > M = 4 > N = 5 > w = random.random((L, M, N)) > q0 = empty((M,N)) > for j in range(M): > for k in range(N): > q0[j,k] = logsumexp(w[:,j,k]) > q1 = empty((L,N)) > for i in range(L): > for k in range(N): > q1[i,k] = logsumexp(w[i,:,k]) > q2 = empty((L,M)) > for i in range(L): > for j in range(M): > q2[i,j] = logsumexp(w[i,j,:]) > > assert allclose(q0, my_logsumexp(w, axis=0)) > assert allclose(q1, my_logsumexp(w, axis=1)) > assert allclose(q2, my_logsumexp(w, axis=2)) > #-------------------------------------------- > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From alan at ajackson.org Sat Jan 23 13:50:19 2010 From: alan at ajackson.org (alan at ajackson.org) Date: Sat, 23 Jan 2010 12:50:19 -0600 Subject: [Numpy-discussion] fromfile() for reading text (one more time!) In-Reply-To: <1cd32cbb1001041945n24a5c49qcf80ef58430596f2@mail.gmail.com> References: <4B42905A.4080105@noaa.gov> <20100104213942.588435c2@ajackson.org> <1cd32cbb1001041945n24a5c49qcf80ef58430596f2@mail.gmail.com> Message-ID: <20100123125019.74e289cc@ajackson.org> >On Mon, Jan 4, 2010 at 10:39 PM, wrote: >>>Hi folks, >>> >>>I'm taking a look once again at fromfile() for reading text files. I >>>often have the need to read a LOT of numbers form a text file, and it >>>can actually be pretty darn slow do i the normal python way: >>> .....................big snip >> >> I agree. I've tried using it, and usually find that it doesn't quite get there. >> >> I rather like the R command(s) for reading text files - except then I have to >> use R which is painful after using python and numpy. Although ggplot2 is >> awfully nice too ... but that is a later post. >> >> ? ? read.table(file, header = FALSE, sep = "", quote = "\"'", >> ? ? ? ? ? ? ? ?dec = ".", row.names, col.names, >> ? ? ? ? ? ? ? ?as.is = !stringsAsFactors, >> ? ? ? ? ? ? ? ?na.strings = "NA", colClasses = NA, nrows = -1, >> ? ? ? ? ? ? ? ?skip = 0, check.names = TRUE, fill = !blank.lines.skip, >> ? ? ? ? ? 
? ? ?strip.white = FALSE, blank.lines.skip = TRUE, >> ? ? ? ? ? ? ? ?comment.char = "#", >> ? ? ? ? ? ? ? ?allowEscapes = FALSE, flush = FALSE, >> ? ? ? ? ? ? ? ?stringsAsFactors = default.stringsAsFactors(), >> ? ? ? ? ? ? ? ?fileEncoding = "", encoding = "unknown") ....................... big snip > > >Aren't the newly improved > >numpy.genfromtxt(fname, dtype=, comments='#', >delimiter=None, skiprows=0, converters=None, missing='', >missing_values=None, usecols=None, names=None, excludelist=None, >deletechars=None, case_sensitive=True, unpack=None, usemask=False, >loose=True) > >and friends indented to handle all this > >Josef > Reopening an old thread... genfromtxt is a big step forward. Something I'm fiddling with is trying to work through the book "Using R for Data Analysis and Graphics, Introduction, Code, and Commentary" by J H Maindonald (available online), in python. So I am trying to see what it takes in python/numpy to work his examples and problems, sort of a learning exercise for me. So anyway, with that introduction, here is a case that I believe genfromtxt fails on, because it doesn't support the reasonable (IMHO) behavior of treating quote delimited strings in the input file as a single field. Below is the example from the book... So we have 2 issues. The header for the first field is quote-blank-quote, and various values for field one have 1 to 3 blank delimited strings, but encapsulated in quotes. I'm putting something together to read it using shlex.split, since it honors strings protected by quote pairs. I'm not an excel person, but I think it might export data like this in a format similar to what is shown below. " " "distance" "climb" "time" "Greenmantle" 2.5 650 16.083 "Carnethy" 6 2500 48.35 "Craig Dunain" 6 900 33.65 "Ben Rha" 7.5 800 45.6 "Ben Lomond" 8 3070 62.267 "Goatfell" 8 2866 73.217 "Bens of Jura" 16 7500 204.617 "Cairnpapple" 6 800 36.367 "Scolty" 5 800 29.75 "Traprain" 6 650 39.75 "Lairig Ghru" 28 2100 192.667 -- ----------------------------------------------------------------------- | Alan K. Jackson | To see a World in a Grain of Sand | | alan at ajackson.org | And a Heaven in a Wild Flower, | | www.ajackson.org | Hold Infinity in the palm of your hand | | Houston, Texas | And Eternity in an hour. - Blake | ----------------------------------------------------------------------- From Chris.Barker at noaa.gov Sat Jan 23 13:53:55 2010 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Sat, 23 Jan 2010 10:53:55 -0800 Subject: [Numpy-discussion] fromfile() for reading text (one more time!) In-Reply-To: <20100123125019.74e289cc@ajackson.org> References: <4B42905A.4080105@noaa.gov> <20100104213942.588435c2@ajackson.org> <1cd32cbb1001041945n24a5c49qcf80ef58430596f2@mail.gmail.com> <20100123125019.74e289cc@ajackson.org> Message-ID: <4B5B45C3.2030302@noaa.gov> alan at ajackson.org wrote: > it doesn't support the reasonable > (IMHO) behavior of treating quote delimited strings in the input file as a > single field. I'd use the csv module for that. Which makes me wonder if it might make sense to build some of the numpy table-reading stuff on top of it... -Chris -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception From josef.pktd at gmail.com Sat Jan 23 14:04:24 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 23 Jan 2010 14:04:24 -0500 Subject: [Numpy-discussion] fromfile() for reading text (one more time!) 
In-Reply-To: <4B5B45C3.2030302@noaa.gov> References: <4B42905A.4080105@noaa.gov> <20100104213942.588435c2@ajackson.org> <1cd32cbb1001041945n24a5c49qcf80ef58430596f2@mail.gmail.com> <20100123125019.74e289cc@ajackson.org> <4B5B45C3.2030302@noaa.gov> Message-ID: <1cd32cbb1001231104j59d0f8d1n8b8e9ec9ed9cb374@mail.gmail.com> On Sat, Jan 23, 2010 at 1:53 PM, Christopher Barker wrote: > alan at ajackson.org wrote: >> it doesn't support the reasonable >> (IMHO) behavior of treating quote delimited strings in the input file as a >> single field. > > I'd use the csv module for that. > > Which makes me wonder if it might make sense to build some of the numpy > table-reading stuff on top of it... > > -Chris csv was also my standard module for this, it handles csv dialects and unicode (with some detour), but having automatic conversion in genfromtext is nicer. >>> reader = csv.reader(open(r'C:\Josef\work-oth\testdata.csv','rb'), delimiter=' ') >>> for line in reader: ... print line ... ['Greenmantle', '2.5', '650', '16.083'] ['Carnethy', '6', '2500', '48.35'] ['Craig Dunain', '6', '900', '33.65'] ['Ben Rha', '7.5', '800', '45.6'] ['Ben Lomond', '8', '3070', '62.267'] ['Goatfell', '8', '2866', '73.217'] ['Bens of Jura', '16', '7500', '204.617'] ['Cairnpapple', '6', '800', '36.367'] ['Scolty', '5', '800', '29.75'] ['Traprain', '6', '650', '39.75'] ['Lairig Ghru', '28', '2100', '192.667'] Josef > > > -- > Christopher Barker, Ph.D. > Oceanographer > > NOAA/OR&R/HAZMAT ? ? ? ? (206) 526-6959 ? voice > 7600 Sand Point Way NE ? (206) 526-6329 ? fax > Seattle, WA ?98115 ? ? ? (206) 526-6317 ? main reception > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From aisaac at american.edu Sat Jan 23 14:20:32 2010 From: aisaac at american.edu (Alan G Isaac) Date: Sat, 23 Jan 2010 14:20:32 -0500 Subject: [Numpy-discussion] fast duplicate of array Message-ID: <4B5B4C00.8040505@american.edu> Suppose x and y are conformable 2d arrays. I now want x to become a duplicate of y. I could create a new array: x = y.copy() or I could assign the values of y to x: x[:,:] = y As expected the latter is faster (no array creation). Are there better ways? Thanks, Alan Isaac From aisaac at american.edu Sat Jan 23 14:36:35 2010 From: aisaac at american.edu (Alan G Isaac) Date: Sat, 23 Jan 2010 14:36:35 -0500 Subject: [Numpy-discussion] random v. random_state in RandomState Message-ID: <4B5B4FC3.2010309@american.edu> As I understand it, numpy.random provides the function ``random`` as an alias for ``random_state``. Might this be moved into numpy.random.mtrand.RandomState, for interface consistency? Right now if I start with ``prng = np.random.RandomState(seed=myseed)`` I cannot use ``prng.random`` as it does not exist. Note that I can use the convenience function ``prng.rand``, and its documentation makes reference to ``random``! Alan Isaac From peridot.faceted at gmail.com Sat Jan 23 17:01:11 2010 From: peridot.faceted at gmail.com (Anne Archibald) Date: Sat, 23 Jan 2010 17:01:11 -0500 Subject: [Numpy-discussion] fast duplicate of array In-Reply-To: <4B5B4C00.8040505@american.edu> References: <4B5B4C00.8040505@american.edu> Message-ID: 2010/1/23 Alan G Isaac : > Suppose x and y are conformable 2d arrays. > I now want x to become a duplicate of y. > I could create a new array: > x = y.copy() > or I could assign the values of y to x: > x[:,:] = y > > As expected the latter is faster (no array creation). 
> Are there better ways? If both arrays are "C contiguous", or more generally contiguous blocks of memory with the same strided structure, you might get faster copying by flattening them first, so that it can go in a single memcpy(). For really large arrays that use complete pages, some low-level hackery involving memmap() might be able to make a shared copy-on-write copy at almost no cost until you start modifying one array or the other. But both of these tricks are intended for the regime where copying the data is the expensive part, not fabricating the array object; for that, I'm not sure you can accelerate things much. Anne From aisaac at american.edu Sat Jan 23 17:31:25 2010 From: aisaac at american.edu (Alan G Isaac) Date: Sat, 23 Jan 2010 17:31:25 -0500 Subject: [Numpy-discussion] fast duplicate of array In-Reply-To: References: <4B5B4C00.8040505@american.edu> Message-ID: <4B5B78BD.80900@american.edu> On 1/23/2010 5:01 PM, Anne Archibald wrote: > If both arrays are "C contiguous", or more generally contiguous blocks > of memory with the same strided structure, you might get faster > copying by flattening them first, so that it can go in a single > memcpy(). I may misuderstand this. Did you just mean x.flat = y.flat ? If so, I find that to be *much* slower. Thanks, Alan x = np.random.random((1000,1000)) y = x.copy() t0 = time.clock() for t in range(1000): x = y.copy() print(time.clock() - t0) t0 = time.clock() for t in range(1000): x[:,:] = y print(time.clock() - t0) t0 = time.clock() for t in range(1000): x.flat = y.flat print(time.clock() - t0) From kwgoodman at gmail.com Sat Jan 23 18:00:24 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Sat, 23 Jan 2010 15:00:24 -0800 Subject: [Numpy-discussion] fast duplicate of array In-Reply-To: <4B5B78BD.80900@american.edu> References: <4B5B4C00.8040505@american.edu> <4B5B78BD.80900@american.edu> Message-ID: On Sat, Jan 23, 2010 at 2:31 PM, Alan G Isaac wrote: > On 1/23/2010 5:01 PM, Anne Archibald wrote: >> If both arrays are "C contiguous", or more generally contiguous blocks >> of memory with the same strided structure, you might get faster >> copying by flattening them first, so that it can go in a single >> memcpy(). > > I may misuderstand this. ?Did you just mean > x.flat = y.flat > ? > > If so, I find that to be *much* slower. 
> > Thanks, > Alan > > > x = np.random.random((1000,1000)) > y = x.copy() > t0 = time.clock() > for t in range(1000): x = y.copy() > print(time.clock() - t0) > t0 = time.clock() > for t in range(1000): x[:,:] = y > print(time.clock() - t0) > t0 = time.clock() > for t in range(1000): x.flat = y.flat > print(time.clock() - t0) I don't know what a view is, but it is fast: x = y.view() def speed(): import numpy as np import time x = np.random.random((1000,1000)) y = x.copy() t0 = time.clock() for t in range(1000): x = y.copy() print(time.clock() - t0) t0 = time.clock() for t in range(1000): x[:,:] = y print(time.clock() - t0) t0 = time.clock() for t in range(1000): x.flat = y.flat print(time.clock() - t0) t0 = time.clock() for t in range(1000): x = y.view() print(time.clock() - t0) >> speed() 1.3 2.07 15.0 0.01 From charlesr.harris at gmail.com Sat Jan 23 19:08:37 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 23 Jan 2010 17:08:37 -0700 Subject: [Numpy-discussion] fast duplicate of array In-Reply-To: References: <4B5B4C00.8040505@american.edu> <4B5B78BD.80900@american.edu> Message-ID: On Sat, Jan 23, 2010 at 4:00 PM, Keith Goodman wrote: > On Sat, Jan 23, 2010 at 2:31 PM, Alan G Isaac wrote: > > On 1/23/2010 5:01 PM, Anne Archibald wrote: > >> If both arrays are "C contiguous", or more generally contiguous blocks > >> of memory with the same strided structure, you might get faster > >> copying by flattening them first, so that it can go in a single > >> memcpy(). > > > > I may misuderstand this. Did you just mean > > x.flat = y.flat > > ? > > > > If so, I find that to be *much* slower. > > > > Thanks, > > Alan > > > > > > x = np.random.random((1000,1000)) > > y = x.copy() > > t0 = time.clock() > > for t in range(1000): x = y.copy() > > print(time.clock() - t0) > > t0 = time.clock() > > for t in range(1000): x[:,:] = y > > print(time.clock() - t0) > > t0 = time.clock() > > for t in range(1000): x.flat = y.flat > > print(time.clock() - t0) > > I don't know what a view is, but it is fast: > > x = y.view() > > In this case x isn't a copy of y, it is a reference to the same data in memory. It is fast because no copying is done. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From peridot.faceted at gmail.com Sat Jan 23 19:29:42 2010 From: peridot.faceted at gmail.com (Anne Archibald) Date: Sat, 23 Jan 2010 19:29:42 -0500 Subject: [Numpy-discussion] fast duplicate of array In-Reply-To: <4B5B78BD.80900@american.edu> References: <4B5B4C00.8040505@american.edu> <4B5B78BD.80900@american.edu> Message-ID: 2010/1/23 Alan G Isaac : > On 1/23/2010 5:01 PM, Anne Archibald wrote: >> If both arrays are "C contiguous", or more generally contiguous blocks >> of memory with the same strided structure, you might get faster >> copying by flattening them first, so that it can go in a single >> memcpy(). > > I may misuderstand this. ?Did you just mean > x.flat = y.flat > ? No, .flat constructs an iterator that traverses the object as if it were flat. 
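A quick way to see the distinction, as a small sketch (the array here is just a made-up example):

>>> import numpy as np
>>> x = np.arange(6).reshape(2, 3)
>>> type(x.flat).__name__      # .flat is an iterator, walked element by element
'flatiter'
>>> xv = x.view()
>>> xv.shape = (-1,)
>>> xv.base is x               # a flat view onto the very same memory
True

Assigning through the iterator presumably goes element by element, which would explain why the x.flat = y.flat version times so much slower above, while assigning through a flat view can sweep the buffer in one contiguous pass.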
I had in mind accessing the underlying data through views that were flat: In [3]: x = np.random.random((1000,1000)) In [4]: y = np.random.random((1000,1000)) In [5]: xf = x.view() In [6]: xf.shape = (-1,) In [7]: yf = y.view() In [8]: yf.shape = (-1,) In [9]: yf[:] = xf[:] This may still use a loop instead of a memcpy(), in which case you'd want to look for an explicit memcpy()-based implementation, but when manipulating multidimensional arrays you have (in principle, anyway) nested loops which may not be executed in the cache-optimal order. Ideally numpy would automatically notice when operations can be done on flattened versions of arrays and get rid of some of the looping and indexing, but I wouldn't count on it. At one point I remember finding that the loops were reordered not for cache coherence but to make the inner loop over the biggest dimension (to minimize looping overhead). Anne > If so, I find that to be *much* slower. > > Thanks, > Alan > > > x = np.random.random((1000,1000)) > y = x.copy() > t0 = time.clock() > for t in range(1000): x = y.copy() > print(time.clock() - t0) > t0 = time.clock() > for t in range(1000): x[:,:] = y > print(time.clock() - t0) > t0 = time.clock() > for t in range(1000): x.flat = y.flat > print(time.clock() - t0) > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From aisaac at american.edu Sat Jan 23 19:30:38 2010 From: aisaac at american.edu (Alan G Isaac) Date: Sat, 23 Jan 2010 19:30:38 -0500 Subject: [Numpy-discussion] fast duplicate of array In-Reply-To: References: <4B5B4C00.8040505@american.edu> <4B5B78BD.80900@american.edu> Message-ID: <4B5B94AE.7020805@american.edu> On 1/23/2010 6:00 PM, Keith Goodman wrote: > x = y.view() Thanks, but I'm not looking for a view. And I need x to own its data. Alan From aisaac at american.edu Sat Jan 23 19:39:42 2010 From: aisaac at american.edu (Alan G Isaac) Date: Sat, 23 Jan 2010 19:39:42 -0500 Subject: [Numpy-discussion] fast duplicate of array In-Reply-To: References: <4B5B4C00.8040505@american.edu> <4B5B78BD.80900@american.edu> Message-ID: <4B5B96CE.80009@american.edu> On 1/23/2010 7:29 PM, Anne Archibald wrote: > I had in mind accessing the underlying data through views > that were flat: > > In [3]: x = np.random.random((1000,1000)) > > In [4]: y = np.random.random((1000,1000)) > > In [5]: xf = x.view() > > In [6]: xf.shape = (-1,) > > In [7]: yf = y.view() > > In [8]: yf.shape = (-1,) > > In [9]: yf[:] = xf[:] Yup, that's a bit faster. Thanks, Alan From gokhansever at gmail.com Sun Jan 24 12:54:37 2010 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_Sever?=) Date: Sun, 24 Jan 2010 11:54:37 -0600 Subject: [Numpy-discussion] Interact with matplotlib in Sage Message-ID: <49d6b3501001240954l42188cb9m6af15d9e11bf1af3@mail.gmail.com> Hello, I have thought of this might interesting to share. Register at www.sagenb.org or try on your local Sage-notebook and using the following code: # Simple example demonstrating how to interact with matplotlib directly. # Comment plt.clf() to get the plots overlay in each update. 
# Gokhan Sever & Harald Schilly (2010-01-24) from scipy import stats import numpy as np import matplotlib.pyplot as plt @interact def plot_norm(loc=(0,(0,10)), scale=(1,(1,10))): rv = stats.norm(loc, scale) x = np.linspace(-10,10,1000) plt.plot(x,rv.pdf(x)) plt.grid(True) plt.savefig('plt.png') plt.clf() A very easy to use example, also well-suited for learning and demonstration purposes. Posted at: http://wiki.sagemath.org/interact/graphics#Interactwithmatplotlib Have fun ;) -- G?khan -------------- next part -------------- An HTML attachment was scrubbed... URL: From david at silveregg.co.jp Sun Jan 24 19:32:30 2010 From: david at silveregg.co.jp (David Cournapeau) Date: Mon, 25 Jan 2010 09:32:30 +0900 Subject: [Numpy-discussion] Proposed fix for MKL and dynamic loading In-Reply-To: References: <4B584115.9090500@student.matnat.uio.no> <4B58FABA.3030207@silveregg.co.jp> Message-ID: <4B5CE69E.2070604@silveregg.co.jp> Matthieu Brucher wrote: > > How do you write the site.cfg accordingly? I don't think you can do that through site.cfg, David From seb.haase at gmail.com Mon Jan 25 03:55:09 2010 From: seb.haase at gmail.com (Sebastian Haase) Date: Mon, 25 Jan 2010 09:55:09 +0100 Subject: [Numpy-discussion] scipy-tickets restarted emailing on jan17 - how about numpy-tickets ? Message-ID: Hi, long time ago I had subscript to get both scipy-tickets and numpy-tickets emailed. Now scipy-tickets apparently started emailing again on 17th of Januar. Will numpy-tickets also come back "by itself" - or should I resubscribe? Regards, Sebastian Haase From Nicolas.Rougier at loria.fr Mon Jan 25 05:46:05 2010 From: Nicolas.Rougier at loria.fr (Nicolas Rougier) Date: Mon, 25 Jan 2010 11:46:05 +0100 Subject: [Numpy-discussion] glumpy, fast opengl visualization Message-ID: <1264416365.376.19.camel@sulfur> Hello, This is an update about glumpy, a fast-OpenGL based numpy visualization. I modified the code such that the only dependencies are PyOpenGL and IPython (for interactive sessions). You will also need matplotlib and scipy for some demos. Sources: hg clone http://glumpy.googlecode.com/hg/ glumpy No installation required, you can run all demos inplace. Homepage: http://code.google.com/p/glumpy/ Nicolas From rmay31 at gmail.com Mon Jan 25 09:54:16 2010 From: rmay31 at gmail.com (Ryan May) Date: Mon, 25 Jan 2010 08:54:16 -0600 Subject: [Numpy-discussion] scipy-tickets restarted emailing on jan17 - how about numpy-tickets ? In-Reply-To: References: Message-ID: On Mon, Jan 25, 2010 at 2:55 AM, Sebastian Haase wrote: > Hi, > long time ago I had subscript to get both scipy-tickets and > numpy-tickets emailed. > Now scipy-tickets apparently started emailing again on 17th of Januar. > Will numpy-tickets also come back "by itself" - or should I resubscribe? I'm seeing traffic on numpy-tickets since about the time scipy-tickets came back. I'd try resubscribing. Ryan -- Ryan May Graduate Research Assistant School of Meteorology University of Oklahoma Sent from Norman, Oklahoma, United States From denis-bz-py at t-online.de Mon Jan 25 11:28:17 2010 From: denis-bz-py at t-online.de (denis) Date: Mon, 25 Jan 2010 17:28:17 +0100 Subject: [Numpy-discussion] Module Index for numpy? In-Reply-To: <4B534F9F.4020901@sbcglobal.net> References: <4B534F9F.4020901@sbcglobal.net> Message-ID: On 17/01/2010 18:57, Wayne Watson wrote: > I was just looking at the (Win) Python documentation via the Help on > IDLE, and a Global Module Index. Does anything like that exist for > numpy, matplotlib, scipy? 
Wayne, folks, may I second the wish / the need for searching thousands of functions. Fwiw, grep makes a crude but very fast source tree browser: 1) grep class + def + first docstring lines in numpy/...py (re, no import); this looks like -- numpy/compat/setupscons.py def configuration(parent_package='',top_path=None): -- numpy/core/arrayprint.py def product(x, y): return x*y def set_printoptions(precision=None, threshold=None, edgeitems=None, def get_printoptions(): """ Return the current print options. def array2string(a, max_line_width = None, precision = None, ... 2) grep2 that, i.e. grep + previous ^-- line. numpy.defs is 6k lines, 230k, grep time ~ .25 sec. What do we really want -- - a source browser GUI, pyqt or webbrowser - or a better text pydoc - or full-text search -- Robert Kern has suggested his Whoosh ? We could get together a table of existing GUIs and desiderata, sort by sum(features) / time-to-write-a-manual (not time-to-hack). Or, o'er forms of doc let fools contest, what's best written is the best. cheers -- denis From josef.pktd at gmail.com Mon Jan 25 11:59:14 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 25 Jan 2010 11:59:14 -0500 Subject: [Numpy-discussion] Module Index for numpy? In-Reply-To: References: <4B534F9F.4020901@sbcglobal.net> Message-ID: <1cd32cbb1001250859s4867a63ak2bedc8ea6788336c@mail.gmail.com> On Mon, Jan 25, 2010 at 11:28 AM, denis wrote: > On 17/01/2010 18:57, Wayne Watson wrote: >> I was just looking at the (Win) Python documentation via the Help on >> IDLE, and a Global Module Index. Does anything like that exist for >> numpy, matplotlib, scipy? > > Wayne, folks, > > ? may I second the wish / the need for searching thousands of functions. > > Fwiw, grep makes a crude but very fast source tree browser: > 1) grep class + def + first docstring lines in numpy/...py (re, no import); > this looks like > > ? ? ? ?-- numpy/compat/setupscons.py > ? ? ? ?def configuration(parent_package='',top_path=None): > > ? ? ? ?-- numpy/core/arrayprint.py > ? ? ? ?def product(x, y): return x*y > ? ? ? ?def set_printoptions(precision=None, threshold=None, edgeitems=None, > ? ? ? ?def get_printoptions(): > ? ? ? ? ? ?""" Return the current print options. > ? ? ? ?def array2string(a, max_line_width = None, precision = None, > ? ? ? ?... > > 2) grep2 that, i.e. grep + previous ^-- line. > numpy.defs is 6k lines, 230k, grep time ~ .25 sec. > > > What do we really want -- > ?- a source browser GUI, pyqt or webbrowser > ?- or a better text pydoc > ?- or full-text search -- Robert Kern has suggested his Whoosh > ? > We could get together a table of existing GUIs and desiderata, > sort by sum(features) / time-to-write-a-manual (not time-to-hack). > Or, > ? ? ? ?o'er forms of doc let fools contest, > ? ? ? ?what's best written is the best. > > cheers > ? -- denis htmlhelp (of the docs) has all of the above at least on Windows, except for source browsing (I use spyder for functions that are source accessible) Isn't there a Linux equivalent? I haven't use np.lookfor or np.source in a long time. Josef > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From pav at iki.fi Mon Jan 25 12:56:51 2010 From: pav at iki.fi (Pauli Virtanen) Date: Mon, 25 Jan 2010 19:56:51 +0200 Subject: [Numpy-discussion] Module Index for numpy? 
In-Reply-To: <1cd32cbb1001250859s4867a63ak2bedc8ea6788336c@mail.gmail.com> References: <4B534F9F.4020901@sbcglobal.net> <1cd32cbb1001250859s4867a63ak2bedc8ea6788336c@mail.gmail.com> Message-ID: <1264442210.6685.9.camel@idol> ma, 2010-01-25 kello 11:59 -0500, josef.pktd at gmail.com kirjoitti: [clip] > htmlhelp (of the docs) has all of the above at least on Windows, > except for source browsing (I use spyder for functions that are source > accessible) > > Isn't there a Linux equivalent? There's devhelp, qthelp, and also CHM viewers for Linux. Numpy docs can be built for all of them [1], provided a new enough version of Sphinx. No need to invent a new wheel, methinks. Moreover, there is already a listing of functions by feature in the refguide... .. [1] http://sphinx.pocoo.org/latest/builders.html?highlight=devhelp#sphinx.builders.qthelp.QtHelpBuilder -- Pauli Virtanen From curiousjan at gmail.com Mon Jan 25 16:38:48 2010 From: curiousjan at gmail.com (Jan Strube) Date: Mon, 25 Jan 2010 21:38:48 +0000 Subject: [Numpy-discussion] indexing, searchsorting, ... Message-ID: Dear List, I'm trying to speed up a piece of code that selects a subsample based on some criteria: Setup: I have two samples, raw and cut. Cut is a pure subset of raw, all elements in cut are also in raw, and cut is derived from raw by applying some cuts. Now I would like to select a random subsample of raw and find out how many are also in cut. In other words, some of those random events pass the cuts, others don't. So in principle I have randomSample = np.random.random_integers(0, len(raw)-1, size=sampleSize) random_that_pass1 = [r for r in raw[randomSample] if r in cut] This is fine (I hope), but slow. I have seen searchsorted mentioned as a possible way to speed this up. Now it gets complicated. I'm creating a boolean array that contains True, wherever a raw event is in cut. raw_sorted = np.sort(raw) cut_sorted = np.sort(cut) passed = np.searchsorted(raw_sorted, cut_sorted) raw_bool = np.zeros(len(raw), dtype='bool') raw_bool[passed] = True Now I create a second boolean array that is set to True at the random values. The events I care about are the ones that pass the cuts and are selected by the random selection: sample_bool = np.zeros(len(raw), dtype='bool') sample_bool[randomSample] = True random_that_pass2 = raw[np.logical_and(raw_bool, sample_bool)] The problem comes in now: random_that_pass2 and random_that_pass1 have different lengths!!! Sometimes one is longer, sometimes the other. I am completely at a loss to explain this. I tend to believe the slow selection leading to random_that_pass1, because it's only two lines, but I don't understand where the other selection could fail. Unfortunately, the samples that give me trouble are 2.2 MB, so maybe a bit large to mail around, but I can put it somewhere if needed. Thank you for your help, Cheers, Jan From kwgoodman at gmail.com Mon Jan 25 17:16:19 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Mon, 25 Jan 2010 14:16:19 -0800 Subject: [Numpy-discussion] indexing, searchsorting, ... In-Reply-To: References: Message-ID: On Mon, Jan 25, 2010 at 1:38 PM, Jan Strube wrote: > Dear List, > > I'm trying to speed up a piece of code that selects a subsample based on some criteria: > Setup: > I have two samples, raw and cut. Cut is a pure subset of raw, all elements in cut are also in raw, and cut is derived from raw by applying some cuts. > Now I would like to select a random subsample of raw and find out how many are also in cut. 
In other words, some of those random events pass the cuts, others don't. > So in principle I have > > randomSample = np.random.random_integers(0, len(raw)-1, size=sampleSize) > random_that_pass1 = [r for r in raw[randomSample] if r in cut] > > This is fine (I hope), but slow. You could construct raw2 and cut2 where each element placed in cut2 is removed from raw2: idx = np.random.rand(n_in_cut2) > 0.5 # for example raw2 = raw[~idx] cut2 = raw[idx] If you concatenate raw2 and cut2 you get raw (but reordered): raw3 = np.concatenate((raw2, cut2), axis=0) Any element in the subsample with an index of len(raw2) or greater is in cut. That makes counting fast. There is a setup cost. So I guess it all depends on how many subsamples you need from one cut. Not sure any of this works, just an idea. From josef.pktd at gmail.com Mon Jan 25 17:47:47 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 25 Jan 2010 17:47:47 -0500 Subject: [Numpy-discussion] indexing, searchsorting, ... In-Reply-To: References: Message-ID: <1cd32cbb1001251447o253d1325nab7bced6e17dd8f1@mail.gmail.com> On Mon, Jan 25, 2010 at 5:16 PM, Keith Goodman wrote: > On Mon, Jan 25, 2010 at 1:38 PM, Jan Strube wrote: >> Dear List, >> >> I'm trying to speed up a piece of code that selects a subsample based on some criteria: >> Setup: >> I have two samples, raw and cut. Cut is a pure subset of raw, all elements in cut are also in raw, and cut is derived from raw by applying some cuts. >> Now I would like to select a random subsample of raw and find out how many are also in cut. In other words, some of those random events pass the cuts, others don't. >> So in principle I have >> >> randomSample = np.random.random_integers(0, len(raw)-1, size=sampleSize) >> random_that_pass1 = [r for r in raw[randomSample] if r in cut] >> >> This is fine (I hope), but slow. > > You could construct raw2 and cut2 where each element placed in cut2 is > removed from raw2: > > idx = np.random.rand(n_in_cut2) > 0.5 ?# for example > raw2 = raw[~idx] > cut2 = raw[idx] > > If you concatenate raw2 and cut2 you get raw (but reordered): > > raw3 = np.concatenate((raw2, cut2), axis=0) > > Any element in the subsample with an index of len(raw2) or greater is > in cut. That makes counting fast. > > There is a setup cost. So I guess it all depends on how many > subsamples you need from one cut. > > Not sure any of this works, just an idea. > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > in1d or intersect in arraysetops should also work, pure python but well constructed and tested for performance. Josef From fperez.net at gmail.com Mon Jan 25 19:11:29 2010 From: fperez.net at gmail.com (Fernando Perez) Date: Mon, 25 Jan 2010 16:11:29 -0800 Subject: [Numpy-discussion] [IPython-dev] glumpy, fast opengl visualization In-Reply-To: <1264416365.376.19.camel@sulfur> References: <1264416365.376.19.camel@sulfur> Message-ID: Hi Nicolas, On Mon, Jan 25, 2010 at 2:46 AM, Nicolas Rougier wrote: > > > Hello, > > This is an update about glumpy, a fast-OpenGL based numpy visualization. > I modified the code such that the only dependencies are PyOpenGL and > IPython (for interactive sessions). You will also need matplotlib and > scipy for some demos. > > Sources: hg clone http://glumpy.googlecode.com/hg/ glumpy > No installation required, you can run all demos inplace. 
> > Homepage: http://code.google.com/p/glumpy/ This is great, and it would be very cool to have it updated to the new code we're now landing in ipython with a much cleaner internal API (finally :) Have you had a chance to look at the code in my trunk-dev branch? https://code.launchpad.net/~fdo.perez/ipython/trunk-dev Brian finished a large review of it and we just had a chance to go over his feedback directly, so there's now one more round of reviews to do (once he applies the changes from our discussion) and this should become trunk very soon. The apis are much cleaner, this is the big cleanup I told you about last year, and now we're getting to the point where having multiple ipython frontends is a very realistic prospect. Unfortunately we won't be able to use your code directly in IPython as it stands, since the GPL provisions in it would require us to GPL all of IPython to make use of any of it directly in IPython. Your code uses iptyhon, numpy, matplotlib and scipy (in some demos), which amounts to hundreds of thousands of lines of code; here are the sloccount outputs from their respective trunks: IPython Totals grouped by language (dominant language first): python: 47357 (99.24%) lisp: 262 (0.55%) sh: 62 (0.13%) objc: 37 (0.08%) Numpy Totals grouped by language (dominant language first): ansic: 152950 (67.19%) python: 73188 (32.15%) cpp: 828 (0.36%) fortran: 298 (0.13%) sh: 156 (0.07%) pascal: 120 (0.05%) f90: 97 (0.04%) Matplotlib Totals grouped by language (dominant language first): python: 83290 (52.64%) cpp: 68212 (43.11%) objc: 4517 (2.85%) ansic: 2149 (1.36%) sh: 69 (0.04%) Scipy Totals grouped by language (dominant language first): cpp: 220149 (48.35%) fortran: 87240 (19.16%) python: 79164 (17.38%) ansic: 68746 (15.10%) sh: 61 (0.01%) Glumpy: Totals grouped by language (dominant language first): python: 3751 (100.00%) We're looking at ~300.000 lines of python alone in these tools. It's unfortunately not realistic for us to consider GPL-ing them in order to incorporate glumpy into the core set; it would be fantastic if you were willing to consider licensing your code under a license that is compatible with the body of work you are building on top of. You are obviously free to choose your license as you see fit, and end users (myself included) will be always able to use glumpy along with ipython, numpy, matplotlib and scipy. So *users* get all of the benefit of your contribution, and for that I am the first to be delighted and grateful that you've put your code out there. But as it stands, your code builds on close to half a million lines of other code which can not benefit back from your contributions. If you consider licensing glumpy to be compatible with ipython, numpy and matplotlib, it would be possible to incorporate your ideas back into those projects: perhaps in some places the right solution would be to fix our own designs to better provide what glumpy needs, in other cases we may find fixes you've made fit better upstream, etc. But this kind of collaboration will not be possible as long as glumpy can benefit from our tools but our codes are not allowed to benefit from glumpy (without changing licenses, which isn't going to happen). 
I hope you consider this from our perspective and in the most friendly and open manner: I completely respect your right to license your own code as you see fit (I've seen people put out GPL 'projects' that effectively consist of 3 lines that import IPython and make a function call, and that's OK too, and allowed by the license I chose to use). The only reason I ask you is because I think your tool is very interesting, and it would ultimately lead to a much more productive relationship with ipython, numpy and matplotlib if it could be a collaboration instead of a one-way benefit. Best regards, Fernando. From rharderlists at gmail.com Tue Jan 26 01:42:38 2010 From: rharderlists at gmail.com (Ross Harder) Date: Tue, 26 Jan 2010 00:42:38 -0600 Subject: [Numpy-discussion] numpy.i Message-ID: <38aeac3c1001252242u4b16a8e9i19ef94eb1552f65c@mail.gmail.com> I'm struggling with using some of the macros in numpy.i for my own typemap. The problem is that the arrayobject.h include does not end up in the c wrapper code after swig runs. numpy.i has at the beginning: %{ #ifndef SWIG_FILE_WITH_INIT # define NO_IMPORT_ARRAY #endif #include "stdio.h" #include %} I tried explicitly putting %{#include "arrayobject.h"%} into the interface file. It ends up after the fragment code in the wrapper so the compiler is complaining. thanks, ross From aurelien.marsan at turbomeca.fr Tue Jan 26 06:21:44 2010 From: aurelien.marsan at turbomeca.fr (A.MARSAN) Date: Tue, 26 Jan 2010 12:21:44 +0100 Subject: [Numpy-discussion] pgcc-Error-Unknown switch Message-ID: <7B4A9ED217FE41FDAA5FA3DC2F7C4FB2@tmfr1.tm.corp> Dear All, I'm trying to use f2py in order to convert a fortan-file fonctions_f90.f90 I apply the following command line f2py -m fonctions_f90 --fcompiler=pg -c fonctions_f90.f90 Everything seems to be well, until this error appears : Pgcc: /tmp/tmp7RvKeA/src.linux-x86_64-2.5/fortranobject.c Pgcc-Error-Unknown switch: -fno-strict-aliasing Pgcc-Error-Unkown switch: -fwrapv Pgcc-Error-Unkown switch: -Wall Pgcc-Error-Unkown switch: -Wstrict-prototypes Does anyone could explain me what append and how to fix this problem ? Thanks for help. -- Aurelien From eadrogue at gmx.net Tue Jan 26 08:28:15 2010 From: eadrogue at gmx.net (Ernest =?iso-8859-1?Q?Adrogu=E9?=) Date: Tue, 26 Jan 2010 14:28:15 +0100 Subject: [Numpy-discussion] should numpy.equal raise an exception instead of returning NotImplemented? Message-ID: <20100126132815.GA7938@doriath.local> Hi, Do you think it is sensible for np.equal to return a NotImplemented object when is given an array of variable length dtype? Consider this code: x = np.array(['xyz','zyx']) np.where(np.equal(x, 'zyx'), [0,0], [1,1]) the last line returns array([0, 0]) which is wrong. Compare with np.where(x == 'zyx', [0,0], [1,1]) I think in this case raising an exception would be better, because of the way np.equal is often used as an argument to another function. It's not easy to see where the bug is or even that there is a bug in your code. Cheers. Ernest From josef.pktd at gmail.com Tue Jan 26 08:59:05 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 26 Jan 2010 08:59:05 -0500 Subject: [Numpy-discussion] should numpy.equal raise an exception instead of returning NotImplemented? In-Reply-To: <20100126132815.GA7938@doriath.local> References: <20100126132815.GA7938@doriath.local> Message-ID: <1cd32cbb1001260559g1be33a50vdd86ab672bfc2e58@mail.gmail.com> 2010/1/26 Ernest Adrogu? 
: > Hi, > > Do you think it is sensible for np.equal to return a NotImplemented > object when is given an array of variable length dtype? > Consider this code: > > x = np.array(['xyz','zyx']) > np.where(np.equal(x, 'zyx'), [0,0], [1,1]) > > the last line returns array([0, 0]) which is wrong. Compare with > > np.where(x == 'zyx', [0,0], [1,1]) > > I think in this case raising an exception would be better, because > of the way np.equal is often used as an argument to another function. > It's not easy to see where the bug is or even that there is a bug > in your code. there was a thread on this on december 7, but maybe nobody filed a ticket. Josef > Cheers. > > Ernest > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From eadrogue at gmx.net Tue Jan 26 09:17:48 2010 From: eadrogue at gmx.net (Ernest =?iso-8859-1?Q?Adrogu=E9?=) Date: Tue, 26 Jan 2010 15:17:48 +0100 Subject: [Numpy-discussion] should numpy.equal raise an exception instead of returning NotImplemented? In-Reply-To: <1cd32cbb1001260559g1be33a50vdd86ab672bfc2e58@mail.gmail.com> References: <20100126132815.GA7938@doriath.local> <1cd32cbb1001260559g1be33a50vdd86ab672bfc2e58@mail.gmail.com> Message-ID: <20100126141748.GA8878@doriath.local> 26/01/10 @ 08:59 (-0500), thus spake josef.pktd at gmail.com: > 2010/1/26 Ernest Adrogu? : > > Hi, > > > > Do you think it is sensible for np.equal to return a NotImplemented > > object when is given an array of variable length dtype? > > Consider this code: > > > > x = np.array(['xyz','zyx']) > > np.where(np.equal(x, 'zyx'), [0,0], [1,1]) > > > > the last line returns array([0, 0]) which is wrong. Compare with > > > > np.where(x == 'zyx', [0,0], [1,1]) > > > > I think in this case raising an exception would be better, because > > of the way np.equal is often used as an argument to another function. > > It's not easy to see where the bug is or even that there is a bug > > in your code. > > there was a thread on this on december 7, but maybe nobody filed a ticket. This ticket http://projects.scipy.org/numpy/changeset/7878 seems to imply that now ufuncs raise an exception, so maybe it's already been taken care of. Thanks. Ernest From curiousjan at gmail.com Tue Jan 26 11:22:24 2010 From: curiousjan at gmail.com (Jan Strube) Date: Tue, 26 Jan 2010 16:22:24 +0000 Subject: [Numpy-discussion] indexing, searchsorting, ... Message-ID: Dear Josef and Keith, thank you both for your suggestions. I think intersect would be what I want for it makes clean code. I have, however, spotted the problem: I was mistakenly under the assumption that random_integers returns unique entries, which is of course not guaranteed, so that the random sample contained duplicate entries. That's why the numpy methods returned results inconsistent with python 'in'. I'll have to be a bit smarter in the generation of the random sample. Good thing I try to do things in two different ways. (Sometimes it is, anyway...) Thanks again for your quick help. Cheers, Jan -------------- next part -------------- An HTML attachment was scrubbed... URL: From nmb at wartburg.edu Tue Jan 26 11:46:00 2010 From: nmb at wartburg.edu (Neil Martinsen-Burrell) Date: Tue, 26 Jan 2010 10:46:00 -0600 Subject: [Numpy-discussion] indexing, searchsorting, ... In-Reply-To: References: Message-ID: <4B5F1C48.40602@wartburg.edu> On 2010-01-26 10:22 , Jan Strube wrote: > Dear Josef and Keith, > > thank you both for your suggestions. 
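A minimal sketch of the in1d/intersect1d idea, with toy arrays invented here just for illustration (not the real data):

import numpy as np

raw = np.array([11, 22, 33, 44, 55])               # full sample
cut = np.array([22, 44])                           # subset of raw that passed the cuts
sample = raw[np.random.randint(0, len(raw), 3)]    # random subsample, may repeat entries
n_pass = np.in1d(sample, cut).sum()                # count how many sampled events also pass
common = np.intersect1d(sample, cut)               # the passing values themselves (unique)

in1d gives a boolean mask to count with, while intersect1d returns the common values.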
I think intersect would be what I > want for it makes clean code. > I have, however, spotted the problem: > I was mistakenly under the assumption that random_integers returns > unique entries, which is of course not guaranteed, so that the random > sample contained duplicate entries. > That's why the numpy methods returned results inconsistent with python 'in'. > I'll have to be a bit smarter in the generation of the random sample. > Good thing I try to do things in two different ways. (Sometimes it is, > anyway...) You probably know this, but the function sample in Python's random module does sample without replacement. In [1]: import random In [2]: random.sample([1,2,3],2) Out[2]: [2, 3] In [5]: random.sample([1,2,3],3) Out[5]: [1, 2, 3] In [7]: random.sample([1,2,3],4) --------------------------------------------------------------------------- ValueError Traceback (most recent call last) /Users/nmb/ in () /Library/Frameworks/Python.framework/Versions/6.0.0/lib/python2.6/random.pyc in sample(self, population, k) 314 n = len(population) 315 if not 0 <= k <= n: --> 316 raise ValueError, "sample larger than population" 317 random = self.random 318 _int = int ValueError: sample larger than population In [9]: import numpy as np A = In [10]: A = np.arange(1000)**(3/2.) In [11]: A[random.sample(range(A.shape[0]),25)] Out[11]: array([ 12618.24425188, 30538.0882342 , 18361.74109392, 925.94546276, 2935.15331797, 4000.37598233, 21826.1206127 , 2618.9692629 , 868.08467329, 52.38320341, 12063.64687812, 29930.60881439, 12236.06517635, 10221.89370909, 2414.9534157 , 13039.6113439 , 22967.67537214, 15140.04385727, 2639.67251757, 26461.80402013, 3218.73142713, 15963.71209963, 11755.35677893, 11551.31295568, 29142.37675619]) -Neil From curiousjan at gmail.com Tue Jan 26 12:04:03 2010 From: curiousjan at gmail.com (Jan Strube) Date: Tue, 26 Jan 2010 17:04:03 +0000 Subject: [Numpy-discussion] indexing, searchsorting, ... In-Reply-To: References: Message-ID: Hi Neil, sure...I aeh, knew this...of course...[?] I'm using shuffle with a list of indices now... Thanks, Jan -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: 330.gif Type: image/gif Size: 96 bytes Desc: not available URL: From aisaac at american.edu Tue Jan 26 14:00:36 2010 From: aisaac at american.edu (Alan G Isaac) Date: Tue, 26 Jan 2010 14:00:36 -0500 Subject: [Numpy-discussion] is shuffle needlessly slow? Message-ID: <4B5F3BD4.4070109@american.edu> Is this a fair test? I expected shuffle to be much faster (no array creation). Alan Isaac >>> import timeit >>> >>> setup = """ ... import numpy as np ... prng = np.random.RandomState() ... N = 10**5 ... indexes = np.arange(N) ... """ >>> >>> print timeit.timeit('prng.shuffle(indexes)',setup, number=100) 5.69172311006 >>> print timeit.timeit('indexes = prng.random_sample(N).argsort()',setup, number=100) 1.54648202495 From josef.pktd at gmail.com Tue Jan 26 14:20:59 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 26 Jan 2010 14:20:59 -0500 Subject: [Numpy-discussion] is shuffle needlessly slow? In-Reply-To: <4B5F3BD4.4070109@american.edu> References: <4B5F3BD4.4070109@american.edu> Message-ID: <1cd32cbb1001261120w682bbe93ne5e42fb5bbd192dd@mail.gmail.com> On Tue, Jan 26, 2010 at 2:00 PM, Alan G Isaac wrote: > Is this a fair test? > I expected shuffle to be much faster > (no array creation). > Alan Isaac > >>>> import timeit >>>> >>>> setup = """ > ... 
import numpy as np > ... prng = np.random.RandomState() > ... N = 10**5 > ... indexes = np.arange(N) > ... """ >>>> >>>> print timeit.timeit('prng.shuffle(indexes)',setup, number=100) > 5.69172311006 >>>> print timeit.timeit('indexes = prng.random_sample(N).argsort()',setup, number=100) > 1.54648202495 > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > maybe because shuffle works with python objects and random_sample with floats >>> a=['a','bb','cc','c','a'];np.random.shuffle(a);a ['c', 'a', 'bb', 'a', 'cc'] >>> np.random.random_sample(5) array([ 0.02791159, 0.8451104 , 0.51629232, 0.15428393, 0.39491844]) Josef >>> From aisaac at american.edu Tue Jan 26 14:23:24 2010 From: aisaac at american.edu (Alan G Isaac) Date: Tue, 26 Jan 2010 14:23:24 -0500 Subject: [Numpy-discussion] is shuffle needlessly slow? In-Reply-To: <4B5F3BD4.4070109@american.edu> References: <4B5F3BD4.4070109@american.edu> Message-ID: <4B5F412C.6060402@american.edu> On 1/26/2010 2:00 PM, Alan G Isaac wrote: > Is this a fair test? > I expected shuffle to be much faster > (no array creation). > Alan Isaac > >>>> import timeit >>>> >>>> setup = """ > ... import numpy as np > ... prng = np.random.RandomState() > ... N = 10**5 > ... indexes = np.arange(N) > ... """ >>>> >>>> print timeit.timeit('prng.shuffle(indexes)',setup, number=100) > 5.69172311006 >>>> print timeit.timeit('indexes = prng.random_sample(N).argsort()',setup, number=100) > 1.54648202495 I suppose that is not fair. But how about this? >>> print timeit.timeit('indexes[prng.random_sample(N).argsort()]=indexes',setup, number=100) 1.76073257914 Alan Isaac From wfspotz at sandia.gov Tue Jan 26 16:46:16 2010 From: wfspotz at sandia.gov (Bill Spotz) Date: Tue, 26 Jan 2010 16:46:16 -0500 Subject: [Numpy-discussion] numpy.i In-Reply-To: <38aeac3c1001252242u4b16a8e9i19ef94eb1552f65c@mail.gmail.com> References: <38aeac3c1001252242u4b16a8e9i19ef94eb1552f65c@mail.gmail.com> Message-ID: Can you post a simple example of this not working? On Jan 26, 2010, at 1:42 AM, Ross Harder wrote: > I'm struggling with using some of the macros in numpy.i for my own > typemap. > The problem is that the arrayobject.h include does not end up in the c > wrapper code after swig runs. > numpy.i has at the beginning: > %{ > #ifndef SWIG_FILE_WITH_INIT > # define NO_IMPORT_ARRAY > #endif > #include "stdio.h" > #include > %} > > I tried explicitly putting %{#include "arrayobject.h"%} into the > interface file. It ends up after the fragment code in the wrapper so > the compiler is complaining. > > thanks, > ross > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > ** Bill Spotz ** ** Sandia National Laboratories Voice: (505)845-0170 ** ** P.O. Box 5800 Fax: (505)284-0154 ** ** Albuquerque, NM 87185-0370 Email: wfspotz at sandia.gov ** From david at silveregg.co.jp Tue Jan 26 23:24:29 2010 From: david at silveregg.co.jp (David Cournapeau) Date: Wed, 27 Jan 2010 13:24:29 +0900 Subject: [Numpy-discussion] Fixing numpy 1.4.0 ABI breakage, and a plea for self-contained, small commits Message-ID: <4B5FBFFD.3000305@silveregg.co.jp> Hi, I have investigated further the ABI issues reported for numpy 1.4.0. I can confirm that we have broken the ABI for 1.4.0 compared to 1.3.0 (besides the "trivial" cython issue). 
The good news is that I have found the issue, the bad news is that I don't know how to (cleanly) fix it. The problem was caused by the new datetime support, in particular the structure PyArray_ArrFuncs, which has been modified in an ABI-incompatible way (the first member, cast, is bigger because NTYPES is bigger). I don't know how to fix this cleanly - the only solution I can see is to split cast into two parts, the first part the same size as before and the second part at the end of the structure, with a double-case test every time the cast member is accessed inside the relevant functions.... As an aside, I would like to reiterate my advice for *small* commits. It took me nearly 2 hours to find this because the offending commit was > 4000 LOC, and it would have been very easy to find had the code been committed as a set of small self-contained commits. thanks, David From charlesr.harris at gmail.com Tue Jan 26 23:50:46 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 26 Jan 2010 21:50:46 -0700 Subject: [Numpy-discussion] Fixing numpy 1.4.0 ABI breakage, and a plea for self-contained, small commits In-Reply-To: <4B5FBFFD.3000305@silveregg.co.jp> References: <4B5FBFFD.3000305@silveregg.co.jp> Message-ID: On Tue, Jan 26, 2010 at 9:24 PM, David Cournapeau wrote: > Hi, > > I have investigated further the ABI issues reported for numpy 1.4.0. I > can confirm that we have broken the ABI for 1.4.0 compared to 1.3.0 > (besides the "trivial" cython issue). The good news is that I have found > the issue, the bad news is that I don't know how to (cleanly) fix it. > >
>> >> The problem was caused by the new datetime support, in particular the >> structure PyArray_ArrFuncs which has been modified in a ABI-incompatible >> way (the first member cast is bigger because NTYPES is bigger). >> >> > Hmm, nasty. I don't like that structure anyway, it should be a pointer to a > structure, or somehow not there in the first place. Yeah, it's a > catastrophic "solution". Probably the only compatible fixes are: 1) remove > the new function, 2) put it at the end of the enclosing structure, 3) live > with the ABI breakage. The last is the easiest way to go for us, if not for > others. The first solves the problem, but pretty much vitiates the datetime > work. And moving the function leads to all sorts of nasty work arounds and > code fixes. > > Whatever we do, it would be good to figure out some way to avoid this > problem in the future. We could hide access to the array, for instance. But > again, that would require a lot of other code mods. Hmm... > > Thinking a bit more, for 1.4.1 I think we should just remove the function. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From david at silveregg.co.jp Wed Jan 27 00:01:42 2010 From: david at silveregg.co.jp (David Cournapeau) Date: Wed, 27 Jan 2010 14:01:42 +0900 Subject: [Numpy-discussion] Fixing numpy 1.4.0 ABI breakage, and a plea for self-contained, small commits In-Reply-To: References: <4B5FBFFD.3000305@silveregg.co.jp> Message-ID: <4B5FC8B6.6080405@silveregg.co.jp> Charles R Harris wrote: > > > Thinking a bit more, for 1.4.1 I think we should just remove the function. This was rejected last time I suggested it, though :) David From david at silveregg.co.jp Wed Jan 27 00:02:49 2010 From: david at silveregg.co.jp (David Cournapeau) Date: Wed, 27 Jan 2010 14:02:49 +0900 Subject: [Numpy-discussion] Fixing numpy 1.4.0 ABI breakage, and a plea for self-contained, small commits In-Reply-To: References: <4B5FBFFD.3000305@silveregg.co.jp> Message-ID: <4B5FC8F9.6000900@silveregg.co.jp> Charles R Harris wrote: > > Whatever we do, it would be good to figure out some way to avoid this > problem in the future. We could hide access to the array, for instance. > But again, that would require a lot of other code mods. Hmm... That's something that we have to do at some point if we care about ABI (I think we should care - expecting people to recompile all the extensions for a new version of numpy is a big hindrance). Assuming python 1.5 will have py3k support, I was wondering about starting working on NumPy 2.0, with massive changes to the C API so that we can avoid this problem in the future: no more "naked" structures, much cleaner/leaner headers to avoid accidental reliance on specific private binary layouts, etc... David From charlesr.harris at gmail.com Wed Jan 27 00:15:51 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 26 Jan 2010 22:15:51 -0700 Subject: [Numpy-discussion] Fixing numpy 1.4.0 ABI breakage, and a plea for self-contained, small commits In-Reply-To: <4B5FC8F9.6000900@silveregg.co.jp> References: <4B5FBFFD.3000305@silveregg.co.jp> <4B5FC8F9.6000900@silveregg.co.jp> Message-ID: On Tue, Jan 26, 2010 at 10:02 PM, David Cournapeau wrote: > Charles R Harris wrote: > > > > > Whatever we do, it would be good to figure out some way to avoid this > > problem in the future. We could hide access to the array, for instance. > > But again, that would require a lot of other code mods. Hmm... 
> > That's something that we have to do at some point if we care about ABI > (I think we should care - expecting people to recompile all the > extensions for a new version of numpy is a big hindrance). > > Assuming python 1.5 will have py3k support, I was wondering about > starting working on NumPy 2.0, with massive changes to the C API so that > we can avoid this problem in the future: no more "naked" structures, > much cleaner/leaner headers to avoid accidental reliance on specific > private binary layouts, etc... > > NumPy 2.0 is going to be a *lot* of work. And I've been thinking about it lately, mostly because I was looking over the same code where you found this problem. What I didn't know was how public the code was. Good find, BTW. One thought was to start by separating out the ufuncs and their dependency on ndarrays. But then I looked at the new buffer interface and it just won't do as a replacement, no complex numbers, etc. Maybe it can be extended. Anyway, if we make a move it needs to be well planned. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From dagss at student.matnat.uio.no Wed Jan 27 03:48:37 2010 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Wed, 27 Jan 2010 09:48:37 +0100 Subject: [Numpy-discussion] Fixing numpy 1.4.0 ABI breakage, and a plea for self-contained, small commits In-Reply-To: References: <4B5FBFFD.3000305@silveregg.co.jp> <4B5FC8F9.6000900@silveregg.co.jp> Message-ID: <4B5FFDE5.2090502@student.matnat.uio.no> Charles R Harris wrote: > > > On Tue, Jan 26, 2010 at 10:02 PM, David Cournapeau > > wrote: > > Charles R Harris wrote: > > > > > Whatever we do, it would be good to figure out some way to avoid > this > > problem in the future. We could hide access to the array, for > instance. > > But again, that would require a lot of other code mods. Hmm... > > That's something that we have to do at some point if we care about ABI > (I think we should care - expecting people to recompile all the > extensions for a new version of numpy is a big hindrance). > > Assuming python 1.5 will have py3k support, I was wondering about > starting working on NumPy 2.0, with massive changes to the C API > so that > we can avoid this problem in the future: no more "naked" structures, > much cleaner/leaner headers to avoid accidental reliance on specific > private binary layouts, etc... > > > NumPy 2.0 is going to be a *lot* of work. And I've been thinking about > it lately, mostly because I was looking over the same code where you > found this problem. What I didn't know was how public the code was. > Good find, BTW. > > One thought was to start by separating out the ufuncs and their > dependency on ndarrays. But then I looked at the new buffer interface > and it just won't do as a replacement, no complex numbers, etc. Maybe > it can be extended. Anyway, if we make a move it needs to be well planned. Huh? The PEP 3118 buffer format strings "Zf", "Zd", "Zg" are respectively complex float, double, long double. Any other reasons PEP 3118 can't be used? Not saying I believe there isn't, I'm just curious... 
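For what it's worth, once ndarray exports the new buffer interface this is easy to poke at from Python itself. A small sketch (it assumes an interpreter and a NumPy build that already implement PEP 3118):

import numpy as np

z = np.zeros(3, dtype=np.complex128)
mv = memoryview(z)        # only works where ndarray speaks the new buffer protocol
print(mv.format)          # expected 'Zd': 'Z' flags a complex pair, 'd' the double base type
print(mv.itemsize)        # 16, i.e. two C doubles per element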
Dag Sverre From gregor.thalhammer at gmail.com Wed Jan 27 08:57:57 2010 From: gregor.thalhammer at gmail.com (Gregor Thalhammer) Date: Wed, 27 Jan 2010 14:57:57 +0100 Subject: [Numpy-discussion] glumpy, fast opengl visualization In-Reply-To: <1264416365.376.19.camel@sulfur> References: <1264416365.376.19.camel@sulfur> Message-ID: <42de02941001270557y3e107debp415a787dcf84e6fb@mail.gmail.com> 2010/1/25 Nicolas Rougier : > > > Hello, > > This is an update about glumpy, a fast-OpenGL based numpy visualization. > I modified the code such that the only dependencies are PyOpenGL and > IPython (for interactive sessions). You will also need matplotlib and > scipy for some demos. > > Sources: hg clone http://glumpy.googlecode.com/hg/ glumpy > No installation required, you can run all demos inplace. > > Homepage: http://code.google.com/p/glumpy/ > Hi Nicolas, thank you for providing glumpy. I started using it for my own project. The previous, pyglet based version of glumpy worked flawlessly on my system (WinXP). I want to report about problems with the latest version. 1.) On Windows glumpy fails on importing the Python termios module, since it is for Unix platforms only. 2.) Resolving this, the demos start, but are mostly not usable, since mouse scroll events and passive mouse motions events are not created. This might be a problem of the glut implementation on Windows (I am using PyOpenGL 3.0.1b2). Who knows? 3.) On OpenSuse11.1 the demos fail at 'glCreateProgram'. The demos used to work with the previous version. I use the Intel Q45 graphics chipset. In the future I might spend some time on this problems and I would like to contribute to glumpy, even if it's only testing on platforms available to me. Gregor From josef.pktd at gmail.com Wed Jan 27 10:39:50 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 27 Jan 2010 10:39:50 -0500 Subject: [Numpy-discussion] Warning on http://scipy.org/ about binary incompatibility ? Message-ID: <1cd32cbb1001270739h4dc55ec0vd88575e6a9151e06@mail.gmail.com> Can we/someone add a warning on the front page http://scipy.org/ (maybe under news for numpy download) about incompatibility of the binaries on sourceforge of scipy <=0.7.1 with numpy 1.4.0 ? It would avoid that users have to find it out for themselves and reduce questions on the mailing list. An thanks a lot to David for hunting this down. Especially for users like me who have to rely on (some) precompiled binary distributions. Josef From charlesr.harris at gmail.com Wed Jan 27 11:04:32 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 27 Jan 2010 09:04:32 -0700 Subject: [Numpy-discussion] Fixing numpy 1.4.0 ABI breakage, and a plea for self-contained, small commits In-Reply-To: <4B5FFDE5.2090502@student.matnat.uio.no> References: <4B5FBFFD.3000305@silveregg.co.jp> <4B5FC8F9.6000900@silveregg.co.jp> <4B5FFDE5.2090502@student.matnat.uio.no> Message-ID: On Wed, Jan 27, 2010 at 1:48 AM, Dag Sverre Seljebotn < dagss at student.matnat.uio.no> wrote: > Charles R Harris wrote: > > > > > > On Tue, Jan 26, 2010 at 10:02 PM, David Cournapeau > > > wrote: > > > > Charles R Harris wrote: > > > > > > > > Whatever we do, it would be good to figure out some way to avoid > > this > > > problem in the future. We could hide access to the array, for > > instance. > > > But again, that would require a lot of other code mods. Hmm... 
> > > > That's something that we have to do at some point if we care about > > ABI > > (I think we should care - expecting people to recompile all the > > extensions for a new version of numpy is a big hindrance). > > > > Assuming python 1.5 will have py3k support, I was wondering about > > starting working on NumPy 2.0, with massive changes to the C API > > so that > > we can avoid this problem in the future: no more "naked" structures, > > much cleaner/leaner headers to avoid accidental reliance on specific > > private binary layouts, etc... > > > > > > NumPy 2.0 is going to be a *lot* of work. And I've been thinking about > > it lately, mostly because I was looking over the same code where you > > found this problem. What I didn't know was how public the code was. > > Good find, BTW. > > > > One thought was to start by separating out the ufuncs and their > > dependency on ndarrays. But then I looked at the new buffer interface > > and it just won't do as a replacement, no complex numbers, etc. Maybe > > it can be extended. Anyway, if we make a move it needs to be well > planned. > Huh? The PEP 3118 buffer format strings "Zf", "Zd", "Zg" are > respectively complex float, double, long double. > > Any other reasons PEP 3118 can't be used? Not saying I believe there > isn't, I'm just curious... > > I wasn't looking at the PEP, I was looking at the python 3.x documentation which claims that the type strings used the same notation as the struct module:

    const char *format
        A *NULL* terminated string in struct module style syntax giving the contents of the elements available through the buffer. If this is *NULL*, "B" (unsigned bytes) is assumed.

I assumed that the PEP would be more compatible since Travis put it together and that it was changed on the journey to python inclusion. It could also be the case that the python documentation isn't correct ;) But if we go over to a buffer interface we need to use what was in the PEP. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsouthey at gmail.com Wed Jan 27 11:05:05 2010 From: bsouthey at gmail.com (Bruce Southey) Date: Wed, 27 Jan 2010 10:05:05 -0600 Subject: [Numpy-discussion] Fixing numpy 1.4.0 ABI breakage, and a plea for self-contained, small commits In-Reply-To: <4B5FC8F9.6000900@silveregg.co.jp> References: <4B5FBFFD.3000305@silveregg.co.jp> <4B5FC8F9.6000900@silveregg.co.jp> Message-ID: On Tue, Jan 26, 2010 at 11:02 PM, David Cournapeau wrote: > Charles R Harris wrote: > >> >> Whatever we do, it would be good to figure out some way to avoid this >> problem in the future. We could hide access to the array, for instance. >> But again, that would require a lot of other code mods. Hmm... > > That's something that we have to do at some point if we care about ABI > (I think we should care - expecting people to recompile all the > extensions for a new version of numpy is a big hindrance). > > Assuming python 1.5 will have py3k support, I was wondering about > starting working on NumPy 2.0, with massive changes to the C API so that > we can avoid this problem in the future: no more "naked" structures, > much cleaner/leaner headers to avoid accidental reliance on specific > private binary layouts, etc... > > David Numpy 1.5? :-) That was an incredible effort!
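To make the layout problem concrete, here is a toy ctypes sketch. The struct, field names and element counts are all invented stand-ins, only the effect matters: growing an array member at the front of a struct shifts every member behind it, so binaries built against the old header read the later slots at the wrong offsets.

import ctypes

class FuncsOld(ctypes.Structure):
    # stand-in for the old layout: one 'cast' slot per dtype
    _fields_ = [("cast", ctypes.c_void_p * 21),
                ("getitem", ctypes.c_void_p)]

class FuncsNew(ctypes.Structure):
    # the type count grew, so the array at the front grew with it
    _fields_ = [("cast", ctypes.c_void_p * 23),
                ("getitem", ctypes.c_void_p)]

print(FuncsOld.getitem.offset, FuncsNew.getitem.offset)  # the offsets no longer agree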
My understanding is that a minor numpy release should not break the ABI and a major release is required when there is an ABI breakage.Thus, this ABI change must be in numpy 2.0 and not allowed in the numpy 1.x series unless the changes can not be 'easily' made in a way that does not break the 1.x series ABI. Alternatively, just acknowledge the fact as a unintended consequence and move on - which has happened before in numpy for a similar situation (see links below). Recent comments appeared in David's thread 'Going toward time-based release ?' http://article.gmane.org/gmane.comp.python.numeric.general/21368 Especially Robert's and Jarrod's responses in the sub-thread: http://article.gmane.org/gmane.comp.python.numeric.general/21378 Hopefully some users of the numpy ABI can provide some feedback on their needs. Just my 2 cents, Bruce From amcmorl at gmail.com Wed Jan 27 11:21:02 2010 From: amcmorl at gmail.com (Angus McMorland) Date: Wed, 27 Jan 2010 11:21:02 -0500 Subject: [Numpy-discussion] np.ma.apply_along_axis mask propagation Message-ID: Hi all, I'm having an issue with np.ma.apply_along_axis not propagating a mask correctly. This code-snippet should hopefully show what's happening: -------------------- import numpy as np xy = np.random.random(size=(5,2)) mask = np.tile(np.array([True] * 3 + [False] * 2)[:,None], (1,2)) xyma = np.ma.array(xy, mask=mask) def myfunc(vec): x,y = vec return x,y xyma2 = np.ma.apply_along_axis(myfunc, 1, xyma) tst = "np.all(asarray(myfunc(xyma[1])).mask == xyma2[1].mask)" print tst, ":", eval(tst) -> np.all(asarray(myfunc(xyma[1])).mask == xyma2[1].mask) : False -------------------- The point here is not that xyma.mask != xyma2.mask, but that xyma2.mask != the output of myfunc run on the individual rows, which it seems like it should. The following simple change seems to fix this. --- /usr/lib/python2.6/dist-packages/numpy/ma/extras.py 2009-04-05 04:09:20.000000000 -0400 +++ tmp/extras.py 2010-01-27 10:45:10.000000000 -0500 @@ -322,7 +322,7 @@ n -= 1 i.put(indlist, ind) j.put(indlist, ind) - res = func1d(arr[tuple(i.tolist())], *args, **kwargs) + res = asarray(func1d(arr[tuple(i.tolist())], *args, **kwargs)) outarr[tuple(flatten_inplace(j.tolist()))] = res dtypes.append(asarray(res).dtype) k += 1 Does this seem like an improvement? I haven't explored performance issues yet, pending sanity check. Thanks, Angus. -- AJC McMorland Post-doctoral research fellow Neurobiology, University of Pittsburgh From rpg.314 at gmail.com Wed Jan 27 14:50:12 2010 From: rpg.314 at gmail.com (Rohit Garg) Date: Thu, 28 Jan 2010 01:20:12 +0530 Subject: [Numpy-discussion] [IPython-dev] glumpy, fast opengl visualization In-Reply-To: References: <1264416365.376.19.camel@sulfur> Message-ID: <4d5dd8c21001271150g70ea7a10jb46f6786e2715ad8@mail.gmail.com> Looks like the license change is done. http://code.google.com/p/glumpy/source/detail?r=79a5429ef1d5c57c5f97276bb39340ed1b808f9e On Tue, Jan 26, 2010 at 5:41 AM, Fernando Perez wrote: > Hi Nicolas, > > On Mon, Jan 25, 2010 at 2:46 AM, Nicolas Rougier > wrote: >> >> >> Hello, >> >> This is an update about glumpy, a fast-OpenGL based numpy visualization. >> I modified the code such that the only dependencies are PyOpenGL and >> IPython (for interactive sessions). You will also need matplotlib and >> scipy for some demos. >> >> Sources: hg clone http://glumpy.googlecode.com/hg/ glumpy >> No installation required, you can run all demos inplace. 
>> >> Homepage: http://code.google.com/p/glumpy/ > > This is great, and it would be very cool to have it updated to the new > code we're now landing in ipython with a much cleaner internal API > (finally :) ?Have you had a chance to look at the code in my trunk-dev > branch? > > https://code.launchpad.net/~fdo.perez/ipython/trunk-dev > > Brian finished a large review of it and we just had a chance to go > over his feedback directly, so there's now one more round of reviews > to do (once he applies the changes from our discussion) and this > should become trunk very soon. ?The apis are much cleaner, this is the > big cleanup I told you about last year, and now we're getting to the > point where having multiple ipython frontends is a very realistic > prospect. > > Unfortunately we won't be able to use your code directly in IPython as > it stands, since the GPL provisions in it would require us to GPL all > of IPython to make use of any of it directly in IPython. ?Your code > uses iptyhon, numpy, matplotlib and scipy (in some demos), which > amounts to hundreds of thousands of lines of code; here are the > sloccount outputs from their respective trunks: > > IPython > Totals grouped by language (dominant language first): > python: ? ? ? 47357 (99.24%) > lisp: ? ? ? ? ? 262 (0.55%) > sh: ? ? ? ? ? ? ?62 (0.13%) > objc: ? ? ? ? ? ?37 (0.08%) > > > Numpy > Totals grouped by language (dominant language first): > ansic: ? ? ? 152950 (67.19%) > python: ? ? ? 73188 (32.15%) > cpp: ? ? ? ? ? ?828 (0.36%) > fortran: ? ? ? ?298 (0.13%) > sh: ? ? ? ? ? ? 156 (0.07%) > pascal: ? ? ? ? 120 (0.05%) > f90: ? ? ? ? ? ? 97 (0.04%) > > Matplotlib > Totals grouped by language (dominant language first): > python: ? ? ? 83290 (52.64%) > cpp: ? ? ? ? ?68212 (43.11%) > objc: ? ? ? ? ?4517 (2.85%) > ansic: ? ? ? ? 2149 (1.36%) > sh: ? ? ? ? ? ? ?69 (0.04%) > > Scipy > Totals grouped by language (dominant language first): > cpp: ? ? ? ? 220149 (48.35%) > fortran: ? ? ?87240 (19.16%) > python: ? ? ? 79164 (17.38%) > ansic: ? ? ? ?68746 (15.10%) > sh: ? ? ? ? ? ? ?61 (0.01%) > > Glumpy: > Totals grouped by language (dominant language first): > python: ? ? ? ?3751 (100.00%) > > We're looking at ~300.000 lines of python alone in these tools. ?It's > unfortunately not realistic for us to consider GPL-ing them in order > to incorporate glumpy into the core set; it would be fantastic if you > were willing to consider licensing your code under a license that is > compatible with the body of work you are building on top of. > > You are obviously free to choose your license as you see fit, and end > users (myself included) will be always able to use glumpy along with > ipython, numpy, matplotlib and scipy. ?So *users* get all of the > benefit of your contribution, and for that I am the first to be > delighted and grateful that you've put your code out there. > > But as it stands, your code builds on close to half a million lines of > other code which can not benefit back from your contributions. ?If you > consider licensing glumpy to be compatible with ipython, numpy and > matplotlib, it would be possible to incorporate your ideas back into > those projects: perhaps in some places the right solution would be to > fix our own designs to better provide what glumpy needs, in other > cases we may find fixes you've made fit better upstream, etc. 
> > But this kind of collaboration will not be possible as long as glumpy > can benefit from our tools but our codes are not allowed to benefit > from glumpy (without changing licenses, which isn't going to happen). > > I hope you consider this from our perspective and in the most friendly > and open manner: I completely respect your right to license your own > code as you see fit (I've seen people put out GPL 'projects' that > effectively consist of 3 lines that import IPython and make a function > call, and that's OK too, and allowed by the license I chose to use). > The only reason I ask you is because I think your tool is very > interesting, and it would ultimately lead to a much more productive > relationship with ipython, numpy and matplotlib if it could be a > collaboration instead of a one-way benefit. > > Best regards, > > Fernando. > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Rohit Garg http://rpg-314.blogspot.com/ Senior Undergraduate Department of Physics Indian Institute of Technology Bombay From david at silveregg.co.jp Wed Jan 27 19:57:37 2010 From: david at silveregg.co.jp (David Cournapeau) Date: Thu, 28 Jan 2010 09:57:37 +0900 Subject: [Numpy-discussion] Fixing numpy 1.4.0 ABI breakage, and a plea for self-contained, small commits In-Reply-To: References: <4B5FBFFD.3000305@silveregg.co.jp> <4B5FC8F9.6000900@silveregg.co.jp> Message-ID: <4B60E101.2000402@silveregg.co.jp> Bruce Southey wrote: > On Tue, Jan 26, 2010 at 11:02 PM, David Cournapeau > wrote: >> Charles R Harris wrote: >> >>> Whatever we do, it would be good to figure out some way to avoid this >>> problem in the future. We could hide access to the array, for instance. >>> But again, that would require a lot of other code mods. Hmm... >> That's something that we have to do at some point if we care about ABI >> (I think we should care - expecting people to recompile all the >> extensions for a new version of numpy is a big hindrance). >> >> Assuming python 1.5 will have py3k support, I was wondering about >> starting working on NumPy 2.0, with massive changes to the C API so that >> we can avoid this problem in the future: no more "naked" structures, >> much cleaner/leaner headers to avoid accidental reliance on specific >> private binary layouts, etc... >> >> David > > Numpy 1.5? :-) > > That was an incredible effort! > > My understanding is that a minor numpy release should not break the > ABI and a major release is required when there is an ABI > breakage. That's the usual practice, but not in numpy (neither in python BTW). The ABI is regularly broken in minor releases. > Hopefully some users of the numpy ABI can provide some feedback on their needs. Breaking the ABI simply means that every single package using the C API will have to be recompiled. This means for example that every windows/mac os x binary out there is broken (including scipy binaries). It is particularly painful if you need to have some packages with numpy 1.3.0 and some with 1.4.0. The idea is that in the current state, keeping the ABI is incredibly difficult, so we would need to severly change the C code (in backward incompatible ways, i.e. it would require to break the A*P*I as well) to control those issues later. Guido explicitly asked not to break compatibility while staying under py3k, so we should try to do it once numpy has been ported to py3k (e.g. 
if numpy 1.5 still is not py3k compatible, do a 1.6 before a 2.0 - iterate if necessary :) ). David From david at silveregg.co.jp Wed Jan 27 20:20:29 2010 From: david at silveregg.co.jp (David Cournapeau) Date: Thu, 28 Jan 2010 10:20:29 +0900 Subject: [Numpy-discussion] Warning on http://scipy.org/ about binary incompatibility ? In-Reply-To: <1cd32cbb1001270739h4dc55ec0vd88575e6a9151e06@mail.gmail.com> References: <1cd32cbb1001270739h4dc55ec0vd88575e6a9151e06@mail.gmail.com> Message-ID: <4B60E65D.8050804@silveregg.co.jp> josef.pktd at gmail.com wrote: > Can we/someone add a warning on the front page http://scipy.org/ > (maybe under news for numpy download) about incompatibility of the > binaries on sourceforge of scipy <=0.7.1 with numpy 1.4.0 ? It seems that it will be quite difficult to fix the issue without removing something (I tried to use datetime as user types, but this opened a can of worms), so I am (quite reluctantly ) coming to the conclusion we should just bite the bullet and change the ABI number (so that importing anything will fail instead of crashing randomly). Something like numpy 1.4.0.1, which would just have a different ABI number than 1.4.0, without anything else. cheers, David From aisaac at american.edu Wed Jan 27 20:24:32 2010 From: aisaac at american.edu (Alan G Isaac) Date: Wed, 27 Jan 2010 20:24:32 -0500 Subject: [Numpy-discussion] Fixing numpy 1.4.0 ABI breakage, and a plea for self-contained, small commits In-Reply-To: <4B60E101.2000402@silveregg.co.jp> References: <4B5FBFFD.3000305@silveregg.co.jp> <4B5FC8F9.6000900@silveregg.co.jp> <4B60E101.2000402@silveregg.co.jp> Message-ID: <4B60E750.4020708@american.edu> On 1/27/2010 7:57 PM, David Cournapeau wrote: > Guido explicitly asked not to break compatibility while staying under > py3k, so we should try to do it once numpy has been ported to py3k (e.g. > if numpy 1.5 still is not py3k compatible, do a 1.6 before a 2.0 - > iterate if necessary:) ). This sounds very different than http://www.artima.com/weblogs/viewpost.jsp?thread=227041 Can you provide a link? Thanks, Alan Isaac From david at silveregg.co.jp Wed Jan 27 20:28:03 2010 From: david at silveregg.co.jp (David Cournapeau) Date: Thu, 28 Jan 2010 10:28:03 +0900 Subject: [Numpy-discussion] Fixing numpy 1.4.0 ABI breakage, and a plea for self-contained, small commits In-Reply-To: <4B60E750.4020708@american.edu> References: <4B5FBFFD.3000305@silveregg.co.jp> <4B5FC8F9.6000900@silveregg.co.jp> <4B60E101.2000402@silveregg.co.jp> <4B60E750.4020708@american.edu> Message-ID: <4B60E823.9050800@silveregg.co.jp> Alan G Isaac wrote: > On 1/27/2010 7:57 PM, David Cournapeau wrote: >> Guido explicitly asked not to break compatibility while staying under >> py3k, so we should try to do it once numpy has been ported to py3k (e.g. >> if numpy 1.5 still is not py3k compatible, do a 1.6 before a 2.0 - >> iterate if necessary:) ). > > > This sounds very different than > http://www.artima.com/weblogs/viewpost.jsp?thread=227041 Maybe my English is broken, as I meant exactly the same as in Guido's post: do not break API (C API here) while porting to py3k. Making the NumPy C API robust to changes wo constantly breaking the ABI will require heavy changes to C structures and how they are exposed to 3rd parties. It is impossible to do without breaking the C API. 
cheers, David From josef.pktd at gmail.com Wed Jan 27 20:35:05 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 27 Jan 2010 20:35:05 -0500 Subject: [Numpy-discussion] Warning on http://scipy.org/ about binary incompatibility ? In-Reply-To: <4B60E65D.8050804@silveregg.co.jp> References: <1cd32cbb1001270739h4dc55ec0vd88575e6a9151e06@mail.gmail.com> <4B60E65D.8050804@silveregg.co.jp> Message-ID: <1cd32cbb1001271735n5f0c41e8s673782bf8f0c407f@mail.gmail.com> On Wed, Jan 27, 2010 at 8:20 PM, David Cournapeau wrote: > josef.pktd at gmail.com wrote: >> Can we/someone add a warning on the front page http://scipy.org/ >> (maybe under news for numpy download) about incompatibility of the >> binaries on sourceforge of scipy <=0.7.1 with numpy 1.4.0 ? > > It seems that it will be quite difficult to fix the issue without > removing something (I tried to use datetime as user types, but this > opened a can of worms), so I am (quite reluctantly ) coming to the > conclusion we should just bite the bullet and change the ABI number (so > that importing anything will fail instead of crashing randomly). > > Something like numpy 1.4.0.1, which would just have a different ABI > number than 1.4.0, without anything else. If you are also able to provide new scipy binaries, then at least the combination would be usable without intermittent import errors and crashes. Would the change in the ABI numer prevent some other programs that use numpy and are compiled against an older numpy, for me mainly matplotlib, from running? Josef > > cheers, > > David > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From david at silveregg.co.jp Wed Jan 27 20:38:06 2010 From: david at silveregg.co.jp (David Cournapeau) Date: Thu, 28 Jan 2010 10:38:06 +0900 Subject: [Numpy-discussion] Warning on http://scipy.org/ about binary incompatibility ? In-Reply-To: <1cd32cbb1001271735n5f0c41e8s673782bf8f0c407f@mail.gmail.com> References: <1cd32cbb1001270739h4dc55ec0vd88575e6a9151e06@mail.gmail.com> <4B60E65D.8050804@silveregg.co.jp> <1cd32cbb1001271735n5f0c41e8s673782bf8f0c407f@mail.gmail.com> Message-ID: <4B60EA7E.3090208@silveregg.co.jp> josef.pktd at gmail.com wrote: > > If you are also able to provide new scipy binaries, then at least the > combination would be usable without intermittent import errors and > crashes. The problem is that the new scipy would not be usable with older numpy. That's why breaking the ABI is so painful. > > Would the change in the ABI numer prevent some other programs that use > numpy and are compiled against an older numpy, for me mainly > matplotlib, from running? Not some, *all* of them (as long as they use the numpy C extension, that is - pure python are obviously unaffected). David From andyjian430074 at gmail.com Wed Jan 27 20:45:06 2010 From: andyjian430074 at gmail.com (Jankins) Date: Wed, 27 Jan 2010 19:45:06 -0600 Subject: [Numpy-discussion] wired error message in scipy.sparse.eigen function: Segmentation fault Message-ID: <4B60EC22.5070001@gmail.com> Dear all, I am using scipy '0.8.0.dev6120'. And the scipy.sparse.eigen function always produces error message. 
_Description:_ linalg.eigen(A, k=6, M=None, sigma=None, which='LM', v0=None, ncv=None, maxiter=None, tol=0, return_eigenvectors=True)_ Error messages:_ When I use this function in the way : linalg.eigen(A, k=2, return_eigenvectors=False) it produces error : *** glibc detected *** python: double free or corruption (!prev) when I use : linalg.eigen(A, k=4, return_eigenvectors=False) or linalg.eigen(A, k=8, return_eigenvectors=False) it produces error : Segmentation fault _My goal:_ "A" is an _unsymmetrical CSR sparse matrix_. What I am trying to do is : 1. find a node 's' to delete. For edge (u, v) in all the out-edges and in-edges of node 's', I set A[u, v] = 0.0. 2. calculate the largest eigenvalue using linalg.eigen(A, return_eigenvectors=False) 3. repeat 1-2 steps many times. I had used eigen_symmetric function to compute symmetric CSR sparse matrix and it works very well. But for the 'eigen' function, it is not working very well. Could you please help me about it? If it does not work, I have to rewrite my code in MATLAB, which is what I am trying to avoid. Thanks so much. Yours sincerely, Jankins -------------- next part -------------- An HTML attachment was scrubbed... URL: From aisaac at american.edu Wed Jan 27 20:47:33 2010 From: aisaac at american.edu (Alan G Isaac) Date: Wed, 27 Jan 2010 20:47:33 -0500 Subject: [Numpy-discussion] Fixing numpy 1.4.0 ABI breakage, and a plea for self-contained, small commits In-Reply-To: <4B60E823.9050800@silveregg.co.jp> References: <4B5FBFFD.3000305@silveregg.co.jp> <4B5FC8F9.6000900@silveregg.co.jp> <4B60E101.2000402@silveregg.co.jp> <4B60E750.4020708@american.edu> <4B60E823.9050800@silveregg.co.jp> Message-ID: <4B60ECB5.9060502@american.edu> >> On 1/27/2010 7:57 PM, David Cournapeau wrote: >>> Guido explicitly asked not to break compatibility while staying under >>> py3k, so we should try to do it once numpy has been ported to py3k (e.g. >>> if numpy 1.5 still is not py3k compatible, do a 1.6 before a 2.0 - >>> iterate if necessary:) ). > Alan G Isaac wrote: >> This sounds very different than >> http://www.artima.com/weblogs/viewpost.jsp?thread=227041 On 1/27/2010 8:28 PM, David Cournapeau wrote: > Maybe my English is broken, as I meant exactly the same as in Guido's > post: do not break API (C API here) while porting to py3k. Making the > NumPy C API robust to changes wo constantly breaking the ABI will > require heavy changes to C structures and how they are exposed to 3rd > parties. It is impossible to do without breaking the C API. My reading is: do not see py3k as an opportunity for API breakage. So if breakage is know to be necessary, do it now in a forward looking way, so that it will not be necessary after moving to py3k. Quoting from http://www.artima.com/weblogs/viewpost.jsp?thread=227041 : "If you have make API changes, do them before you port to 3.0" I thought you were saying the opposite of that ... ? 
fwiw, Alan From david at silveregg.co.jp Wed Jan 27 20:56:25 2010 From: david at silveregg.co.jp (David Cournapeau) Date: Thu, 28 Jan 2010 10:56:25 +0900 Subject: [Numpy-discussion] Fixing numpy 1.4.0 ABI breakage, and a plea for self-contained, small commits In-Reply-To: <4B60ECB5.9060502@american.edu> References: <4B5FBFFD.3000305@silveregg.co.jp> <4B5FC8F9.6000900@silveregg.co.jp> <4B60E101.2000402@silveregg.co.jp> <4B60E750.4020708@american.edu> <4B60E823.9050800@silveregg.co.jp> <4B60ECB5.9060502@american.edu> Message-ID: <4B60EEC9.2050605@silveregg.co.jp> Alan G Isaac wrote: >>> On 1/27/2010 7:57 PM, David Cournapeau wrote: >>>> Guido explicitly asked not to break compatibility while staying under >>>> py3k, so we should try to do it once numpy has been ported to py3k (e.g. >>>> if numpy 1.5 still is not py3k compatible, do a 1.6 before a 2.0 - >>>> iterate if necessary:) ). > > >> Alan G Isaac wrote: >>> This sounds very different than >>> http://www.artima.com/weblogs/viewpost.jsp?thread=227041 > > > On 1/27/2010 8:28 PM, David Cournapeau wrote: >> Maybe my English is broken, as I meant exactly the same as in Guido's >> post: do not break API (C API here) while porting to py3k. Making the >> NumPy C API robust to changes wo constantly breaking the ABI will >> require heavy changes to C structures and how they are exposed to 3rd >> parties. It is impossible to do without breaking the C API. > > > > My reading is: do not see py3k as > an opportunity for API breakage. Yup. > > So if breakage is know to be necessary, > do it now in a forward looking way, > so that it will not be necessary > after moving to py3k. > > Quoting from http://www.artima.com/weblogs/viewpost.jsp?thread=227041 : > "If you have make API changes, do them before you port to 3.0" Ah, that's the misunderstanding: I think you focus on before vs after, but that's not the most important point. The full quote is " If you have make API changes, do them before you port to 3.0 -- release a version with the new API for Python 2.5, or 2.6 if you must. (Or do it later, after you've released a port to 3.0 without adding new features.)" What matters is not to do it at the same time, so that porting 3rd party code with 2to3 is possible (that's the bolded text). Since the py3k port is already underway, it seems natural to me to first release a py3k compatible release, and then a new numpy with incompatible API. OTOH, one could make the argument that releasing the API would avoid having to port numpy "twice" (first to py3k with say numpy 1.5.0, then to the new API for numpy 2.0). But I am not sure it is a big change in practice ? David From kwgoodman at gmail.com Wed Jan 27 21:10:43 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Wed, 27 Jan 2010 18:10:43 -0800 Subject: [Numpy-discussion] [ANN] New open source project for labeled arrays Message-ID: I recently opened sourced one of my packages. It is a labeled array that I call larry. A two-dimensional larry, for example, contains a 2d NumPy array with labels on each row and column. A larry can have any dimension. Alignment by label is automatic when you add (or subtract, multiply, divide) two larrys. larry has built-in methods such as movingsum, ranking, merge, shuffle, zscore, demean, lag as well as typical NumPy methods like sum, max, std, sign, clip. NaNs are treated as missing data. You can archive larrys in HDF5 format using save and load or using a dictionary-like interface. I'm working towards a 0.1 release. 
In the meantime, comments, suggestions, critiques are all appreciated. To use larry you need Python and NumPy 1.4 or newer. To save and load larrys in HDF5 format, you need h5py with HDF5 1.8. larry currently contains no extensions, just Python code, so there is nothing to compile. Just save the la package and make sure Python can find it. docs http://larry.sourceforge.net code https://launchpad.net/larry From wesmckinn at gmail.com Wed Jan 27 21:33:32 2010 From: wesmckinn at gmail.com (Wes McKinney) Date: Wed, 27 Jan 2010 21:33:32 -0500 Subject: [Numpy-discussion] [ANN] New open source project for labeled arrays In-Reply-To: References: Message-ID: <6c476c8a1001271833m331828a1sfde1c8fe27a67ea6@mail.gmail.com> On Wed, Jan 27, 2010 at 9:10 PM, Keith Goodman wrote: > I recently opened sourced one of my packages. It is a labeled array > that I call larry. > > A two-dimensional larry, for example, contains a 2d NumPy array with > labels on each row and column. A larry can have any dimension. > > Alignment by label is automatic when you add (or subtract, multiply, > divide) two larrys. > > larry has built-in methods such as movingsum, ranking, merge, shuffle, > zscore, demean, lag as well as typical NumPy methods like sum, max, > std, sign, clip. NaNs are treated as missing data. > > You can archive larrys in HDF5 format using save and load or using a > dictionary-like interface. > > I'm working towards a 0.1 release. In the meantime, comments, > suggestions, critiques are all appreciated. > > To use larry you need Python and NumPy 1.4 or newer. To save and load > larrys in HDF5 format, you need h5py with HDF5 1.8. > > larry currently contains no extensions, just Python code, so there is > nothing to compile. Just save the la package and make sure Python can > find it. > > docs ?http://larry.sourceforge.net > code ?https://launchpad.net/larry > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > Cool! Thanks for releasing. Looks like you're solving some similar problems to the ones I built pandas for (http://pandas.sourceforge.net). I'll have to have a closer look at the implementation to see if there are some design commonalities we can benefit from. - Wes From kwgoodman at gmail.com Wed Jan 27 21:57:41 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Wed, 27 Jan 2010 18:57:41 -0800 Subject: [Numpy-discussion] [ANN] New open source project for labeled arrays In-Reply-To: <6c476c8a1001271833m331828a1sfde1c8fe27a67ea6@mail.gmail.com> References: <6c476c8a1001271833m331828a1sfde1c8fe27a67ea6@mail.gmail.com> Message-ID: On Wed, Jan 27, 2010 at 6:33 PM, Wes McKinney wrote: > On Wed, Jan 27, 2010 at 9:10 PM, Keith Goodman wrote: >> I recently opened sourced one of my packages. It is a labeled array >> that I call larry. >> >> A two-dimensional larry, for example, contains a 2d NumPy array with >> labels on each row and column. A larry can have any dimension. >> >> Alignment by label is automatic when you add (or subtract, multiply, >> divide) two larrys. >> >> larry has built-in methods such as movingsum, ranking, merge, shuffle, >> zscore, demean, lag as well as typical NumPy methods like sum, max, >> std, sign, clip. NaNs are treated as missing data. >> >> You can archive larrys in HDF5 format using save and load or using a >> dictionary-like interface. >> >> I'm working towards a 0.1 release. In the meantime, comments, >> suggestions, critiques are all appreciated. 
>> >> To use larry you need Python and NumPy 1.4 or newer. To save and load >> larrys in HDF5 format, you need h5py with HDF5 1.8. >> >> larry currently contains no extensions, just Python code, so there is >> nothing to compile. Just save the la package and make sure Python can >> find it. >> >> docs ?http://larry.sourceforge.net >> code ?https://launchpad.net/larry >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > Cool! Thanks for releasing. > > Looks like you're solving some similar problems to the ones I built > pandas for (http://pandas.sourceforge.net). I'll have to have a closer > look at the implementation to see if there are some design > commonalities we can benefit from. Yes, I hope we have some overlap so that we can share code. As far as design goes, larry contains a Numpy array for the data and a list of lists (one list for each dimension) for the labels. Most of the larry methods have underlying Numpy array functions that could easily be used by other projects. There are also functions for repacking HDF5 archives and for creating intermediate HDF5 Groups when saving a Dataset inside nested Groups. All this is transparent to the user but hopefully useful for other projects. From david at silveregg.co.jp Wed Jan 27 22:06:12 2010 From: david at silveregg.co.jp (David Cournapeau) Date: Thu, 28 Jan 2010 12:06:12 +0900 Subject: [Numpy-discussion] wired error message in scipy.sparse.eigen function: Segmentation fault In-Reply-To: <4B60EC22.5070001@gmail.com> References: <4B60EC22.5070001@gmail.com> Message-ID: <4B60FF24.7040504@silveregg.co.jp> Jankins wrote: > Dear all, > > I am using scipy '0.8.0.dev6120'. And the scipy.sparse.eigen function > always produces error message. > > _Description:_ > linalg.eigen(A, k=6, M=None, sigma=None, which='LM', v0=None, > ncv=None, maxiter=None, tol=0, return_eigenvectors=True)_ Could you provide your platform details (i.e. OS, compiler, 32 vs 64 bits, the output of scipy.show_config()). This is needed to isolate the problem, cheers. David From pgmdevlist at gmail.com Wed Jan 27 22:13:10 2010 From: pgmdevlist at gmail.com (Pierre GM) Date: Wed, 27 Jan 2010 22:13:10 -0500 Subject: [Numpy-discussion] [ANN] New open source project for labeled arrays In-Reply-To: References: Message-ID: On Jan 27, 2010, at 9:10 PM, Keith Goodman wrote: > I recently opened sourced one of my packages. It is a labeled array > that I call larry. > > A two-dimensional larry, for example, contains a 2d NumPy array with > labels on each row and column. A larry can have any dimension. > > Alignment by label is automatic when you add (or subtract, multiply, > divide) two larrys. > > larry has built-in methods such as movingsum, ranking, merge, shuffle, > zscore, demean, lag as well as typical NumPy methods like sum, max, > std, sign, clip. NaNs are treated as missing data. So you can't have an integer larry with missing data ? > You can archive larrys in HDF5 format using save and load or using a > dictionary-like interface. > > I'm working towards a 0.1 release. In the meantime, comments, > suggestions, critiques are all appreciated. > I'll have to check it (hopefully I'll have a bit more time in the next couple of weeks), but what are the main differences/advantages of using your approach compared to pandas or tabular ? 
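(To expand on my first question above: NaN only exists for floating point types, so with NaN as the missing-value marker any integer data has to be upcast as soon as something is marked missing. A tiny sketch of what I mean:

import numpy as np

a = np.arange(5)              # integer array
a_missing = a.astype(float)   # must become float before NaN can be stored
a_missing[2] = np.nan
print(a_missing.dtype)        # float64 -- the original integer dtype is gone

That is exactly the kind of case masked arrays were designed for.)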
From andyjian430074 at gmail.com Wed Jan 27 22:21:26 2010 From: andyjian430074 at gmail.com (Jankins) Date: Wed, 27 Jan 2010 21:21:26 -0600 Subject: [Numpy-discussion] wired error message in scipy.sparse.eigen function: Segmentation fault In-Reply-To: <4B60FF24.7040504@silveregg.co.jp> References: <4B60EC22.5070001@gmail.com> <4B60FF24.7040504@silveregg.co.jp> Message-ID: <4B6102B6.400@gmail.com> I tried on Ubuntu 9.10-32bit, gcc version 4.4.1, . Here is the information of show_config(): In [2]: scipy.show_config() umfpack_info: NOT AVAILABLE atlas_threads_info: NOT AVAILABLE blas_opt_info: libraries = ['f77blas', 'cblas', 'atlas'] library_dirs = ['/usr/lib'] define_macros = [('ATLAS_INFO', '"\\"3.6.0\\""')] language = c include_dirs = ['/usr/include'] atlas_blas_threads_info: NOT AVAILABLE lapack_opt_info: libraries = ['lapack', 'f77blas', 'cblas', 'atlas'] library_dirs = ['/usr/lib/atlas', '/usr/lib'] define_macros = [('ATLAS_INFO', '"\\"3.6.0\\""')] language = f77 include_dirs = ['/usr/include'] atlas_info: libraries = ['lapack', 'f77blas', 'cblas', 'atlas'] library_dirs = ['/usr/lib/atlas', '/usr/lib'] language = f77 include_dirs = ['/usr/include'] lapack_mkl_info: NOT AVAILABLE blas_mkl_info: NOT AVAILABLE atlas_blas_info: libraries = ['f77blas', 'cblas', 'atlas'] library_dirs = ['/usr/lib'] language = c include_dirs = ['/usr/include'] mkl_info: NOT AVAILABLE Thanks so much. On 1/27/2010 9:06 PM, David Cournapeau wrote: > Jankins wrote: > >> Dear all, >> >> I am using scipy '0.8.0.dev6120'. And the scipy.sparse.eigen function >> always produces error message. >> >> _Description:_ >> linalg.eigen(A, k=6, M=None, sigma=None, which='LM', v0=None, >> ncv=None, maxiter=None, tol=0, return_eigenvectors=True)_ >> > Could you provide your platform details (i.e. OS, compiler, 32 vs 64 > bits, the output of scipy.show_config()). This is needed to isolate the > problem, > > cheers. > > David > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From kwgoodman at gmail.com Wed Jan 27 22:24:24 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Wed, 27 Jan 2010 19:24:24 -0800 Subject: [Numpy-discussion] [ANN] New open source project for labeled arrays In-Reply-To: References: Message-ID: On Wed, Jan 27, 2010 at 7:13 PM, Pierre GM wrote: > On Jan 27, 2010, at 9:10 PM, Keith Goodman wrote: >> I recently opened sourced one of my packages. It is a labeled array >> that I call larry. >> >> A two-dimensional larry, for example, contains a 2d NumPy array with >> labels on each row and column. A larry can have any dimension. >> >> Alignment by label is automatic when you add (or subtract, multiply, >> divide) two larrys. >> >> larry has built-in methods such as movingsum, ranking, merge, shuffle, >> zscore, demean, lag as well as typical NumPy methods like sum, max, >> std, sign, clip. NaNs are treated as missing data. > > So you can't have an integer larry with missing data ? No. >> You can archive larrys in HDF5 format using save and load or using a >> dictionary-like interface. >> >> I'm working towards a 0.1 release. In the meantime, comments, >> suggestions, critiques are all appreciated. >> > > I'll have to check it (hopefully I'll have a bit more time in the next couple of weeks), but what are the main differences/advantages of using your approach compared to pandas or tabular ? > I've tried to make larry behave as a numpy array user would expect. 
If, for example, you have a function, myfunc, that works on Numpy arrays and doesn't change the shape or ordering of the array, then you can use it on a larry, y, like this: y.x = myfunc(y.x). The main use case for a larry is when you want to work on the entire array, or a subset of it, all at once. Not so much if you only want to grab one row, for example, at a time. The internal structure of a larry (Numpy array + list) is easy to understand so it is easy to get going and to extend. From david at silveregg.co.jp Wed Jan 27 22:36:05 2010 From: david at silveregg.co.jp (David Cournapeau) Date: Thu, 28 Jan 2010 12:36:05 +0900 Subject: [Numpy-discussion] wired error message in scipy.sparse.eigen function: Segmentation fault In-Reply-To: <4B6102B6.400@gmail.com> References: <4B60EC22.5070001@gmail.com> <4B60FF24.7040504@silveregg.co.jp> <4B6102B6.400@gmail.com> Message-ID: <4B610625.4060303@silveregg.co.jp> Jankins wrote: > I tried on Ubuntu 9.10-32bit, gcc version 4.4.1, . Here is the > information of show_config(): Sorry, I forgot an additional information, the exact atlas you are using. For example, assuming scipy is installed in /usr/local, I would need the output of /usr/local/lib/python2.6/site-packages/scipy/sparse/linalg/eigen/arpack/_arpack.so (you are using linalg.eigen from scipy.sparse, right ?). Ideally, if the matrix is not too big, having the matrix which crashes scipy is most helpful, thanks, David From josef.pktd at gmail.com Wed Jan 27 22:44:45 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 27 Jan 2010 22:44:45 -0500 Subject: [Numpy-discussion] [ANN] New open source project for labeled arrays In-Reply-To: References: Message-ID: <1cd32cbb1001271944u553e0136x8e1d1abbab4f89ac@mail.gmail.com> On Wed, Jan 27, 2010 at 10:13 PM, Pierre GM wrote: > On Jan 27, 2010, at 9:10 PM, Keith Goodman wrote: >> I recently opened sourced one of my packages. It is a labeled array >> that I call larry. >> >> A two-dimensional larry, for example, contains a 2d NumPy array with >> labels on each row and column. A larry can have any dimension. >> >> Alignment by label is automatic when you add (or subtract, multiply, >> divide) two larrys. >> >> larry has built-in methods such as movingsum, ranking, merge, shuffle, >> zscore, demean, lag as well as typical NumPy methods like sum, max, >> std, sign, clip. NaNs are treated as missing data. > > So you can't have an integer larry with missing data ? > >> You can archive larrys in HDF5 format using save and load or using a >> dictionary-like interface. >> >> I'm working towards a 0.1 release. In the meantime, comments, >> suggestions, critiques are all appreciated. >> > > I'll have to check it (hopefully I'll have a bit more time in the next couple of weeks), but what are the main differences/advantages of using your approach compared to pandas or tabular ? 
In a very simplified characterization, my impression is they all try to do the same thing in different ways and with different emphasis pandas is a dictionary (not only), tabular are structured arrays, larry is an nd array by delegation, all of them for generic axis labels as far as I understand, all based on nan for missing values scikits.timeseries has the more elaborate time support and is based on masked arrays getitem and slicing work differently, another version of a labeled array is http://github.com/fperez/datarray/blob/master/datarray.py I'm trying to work on an example how we can move our data between the different implementations, because depending on the task one implementation or another might be more convenient. And there seems to be enough compatibility to do it without loss of information. Josef " http://esciencenews.com/articles/2010/01/19/too.many.choices.new.study.says.more.usually.better " > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From aisaac at american.edu Wed Jan 27 22:46:47 2010 From: aisaac at american.edu (Alan G Isaac) Date: Wed, 27 Jan 2010 22:46:47 -0500 Subject: [Numpy-discussion] Fixing numpy 1.4.0 ABI breakage, and a plea for self-contained, small commits In-Reply-To: <4B60EEC9.2050605@silveregg.co.jp> References: <4B5FBFFD.3000305@silveregg.co.jp> <4B5FC8F9.6000900@silveregg.co.jp> <4B60E101.2000402@silveregg.co.jp> <4B60E750.4020708@american.edu> <4B60E823.9050800@silveregg.co.jp> <4B60ECB5.9060502@american.edu> <4B60EEC9.2050605@silveregg.co.jp> Message-ID: <4B6108A7.8090707@american.edu> On 1/27/2010 8:56 PM, David Cournapeau wrote: > one could make the argument that releasing the API would avoid > having to port numpy "twice" (first to py3k with say numpy 1.5.0, then > to the new API for numpy 2.0). But I am not sure it is a big change in > practice ? OK, I misunderstood: I thought you were proposing to change the API *only* for the py3k NumPy, effectively leaving the earlier Pythons orphaned. Sorry for the mistake. Alan From david at silveregg.co.jp Thu Jan 28 00:03:04 2010 From: david at silveregg.co.jp (David Cournapeau) Date: Thu, 28 Jan 2010 14:03:04 +0900 Subject: [Numpy-discussion] Fixing numpy 1.4.0 ABI breakage, and a plea for self-contained, small commits In-Reply-To: <4B6108A7.8090707@american.edu> References: <4B5FBFFD.3000305@silveregg.co.jp> <4B5FC8F9.6000900@silveregg.co.jp> <4B60E101.2000402@silveregg.co.jp> <4B60E750.4020708@american.edu> <4B60E823.9050800@silveregg.co.jp> <4B60ECB5.9060502@american.edu> <4B60EEC9.2050605@silveregg.co.jp> <4B6108A7.8090707@american.edu> Message-ID: <4B611A88.4000604@silveregg.co.jp> Alan G Isaac wrote: > On 1/27/2010 8:56 PM, David Cournapeau wrote: >> one could make the argument that releasing the API would avoid >> having to port numpy "twice" (first to py3k with say numpy 1.5.0, then >> to the new API for numpy 2.0). But I am not sure it is a big change in >> practice ? > > OK, I misunderstood: I thought you were proposing to change > the API *only* for the py3k NumPy, effectively leaving the > earlier Pythons orphaned. Ah, inddeed. Given the current adoption of py3k for libraries that matter to us, that would be insane :) David From josef.pktd at gmail.com Thu Jan 28 00:18:49 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 28 Jan 2010 00:18:49 -0500 Subject: [Numpy-discussion] Warning on http://scipy.org/ about binary incompatibility ? 
In-Reply-To: <4B60EA7E.3090208@silveregg.co.jp> References: <1cd32cbb1001270739h4dc55ec0vd88575e6a9151e06@mail.gmail.com> <4B60E65D.8050804@silveregg.co.jp> <1cd32cbb1001271735n5f0c41e8s673782bf8f0c407f@mail.gmail.com> <4B60EA7E.3090208@silveregg.co.jp> Message-ID: <1cd32cbb1001272118o71db6386w377fcb2e47b31a5e@mail.gmail.com> On Wed, Jan 27, 2010 at 8:38 PM, David Cournapeau wrote: > josef.pktd at gmail.com wrote: > >> >> If you are also able to provide new scipy binaries, then at least the >> combination would be usable without intermittent import errors and >> crashes. > > The problem is that the new scipy would not be usable with older numpy. > That's why breaking the ABI is so painful. > >> >> Would the change in the ABI numer prevent some other programs that use >> numpy and are compiled against an older numpy, for me mainly >> matplotlib, from running? > > Not some, *all* of them (as long as they use the numpy C extension, that > is - pure python are obviously unaffected). > > David > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From andyjian430074 at gmail.com Thu Jan 28 00:21:50 2010 From: andyjian430074 at gmail.com (Jankins) Date: Wed, 27 Jan 2010 23:21:50 -0600 Subject: [Numpy-discussion] wired error message in scipy.sparse.eigen function: Segmentation fault In-Reply-To: <4B610625.4060303@silveregg.co.jp> References: <4B60EC22.5070001@gmail.com> <4B60FF24.7040504@silveregg.co.jp> <4B6102B6.400@gmail.com> <4B610625.4060303@silveregg.co.jp> Message-ID: <4B611EEE.8040500@gmail.com> Yes. I am using scipy.sparse.linalg.eigen.arpack. The exact output is: /usr/local/lib/python2.6/dist-packages/scipy/sparse/linalg/eigen/arpack/_arpack.so In fact, the matrix is from a directed graph with about 18,000 nodes and 41,000 edges. Actually, this matrix is the smallest one I used. Now I switch to use numpy.linalg.eigvals, but it is slower than scipy.sparse.linalg.eigen.arpack module. Thanks. Jankins On 1/27/2010 9:36 PM, David Cournapeau wrote: > the exact atlas you are > using. For example, assuming scipy is insta From david at silveregg.co.jp Thu Jan 28 01:11:25 2010 From: david at silveregg.co.jp (David Cournapeau) Date: Thu, 28 Jan 2010 15:11:25 +0900 Subject: [Numpy-discussion] wired error message in scipy.sparse.eigen function: Segmentation fault In-Reply-To: <4B611EEE.8040500@gmail.com> References: <4B60EC22.5070001@gmail.com> <4B60FF24.7040504@silveregg.co.jp> <4B6102B6.400@gmail.com> <4B610625.4060303@silveregg.co.jp> <4B611EEE.8040500@gmail.com> Message-ID: <4B612A8D.3060401@silveregg.co.jp> Jankins wrote: > Yes. I am using scipy.sparse.linalg.eigen.arpack. > > The exact output is: > > /usr/local/lib/python2.6/dist-packages/scipy/sparse/linalg/eigen/arpack/_arpack.so I need the output of ldd on this file, actually, i.e the output of "ldd /usr/local/lib/python2.6/dist-packages/scipy/sparse/linalg/eigen/arpack/_arpack.so". It should output the libraries actually loaded by the OS. > In fact, the matrix is from a directed graph with about 18,000 nodes and > 41,000 edges. Actually, this matrix is the smallest one I used. Is it available somewhere ? 41000 edges should make the matrix very sparse. I first thought that your problem may be some buggy ATLAS, but the current arpack interface (the one used by sparse.linalg.eigen) is also quite buggy in my experience, though I could not reproduce it. Having a matrix which consistently reproduce the bug would be very useful. 
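If it is small enough to attach or put somewhere online, dumping it in Matrix Market format would be the easiest way to share it. A rough sketch, assuming A is the CSR matrix that triggers the crash:

from scipy import io, sparse

# save the offending matrix in a portable text format
io.mmwrite('crash_matrix.mtx', A)

# it can then be reloaded on the other end with
A2 = sparse.csr_matrix(io.mmread('crash_matrix.mtx'))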
In the short term, you may want to do without arpack support in scipy. In the longer term, I intend to improve support for sparse matrices linear algebra, as it is needed for my new job. > Now I switch to use numpy.linalg.eigvals, but it is slower than > scipy.sparse.linalg.eigen.arpack module. If you have a reasonable ATLAS install, scipy.linalg.eigvals should actually be quite fast. Sparse eigenvalues solver are much slower than full ones in general as long as: - your matrices are tiny (with tiny defined here as the plain matrix requiring one order of magnitude less memory than the total available memory, so something like matrices with ~ 1e7/1e8 entries on current desktop computers) - you need more than a few eigenvalues, or not just the biggest/smallest ones cheers, David From charlesr.harris at gmail.com Thu Jan 28 01:24:30 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 27 Jan 2010 23:24:30 -0700 Subject: [Numpy-discussion] Warning on http://scipy.org/ about binary incompatibility ? In-Reply-To: <4B60E65D.8050804@silveregg.co.jp> References: <1cd32cbb1001270739h4dc55ec0vd88575e6a9151e06@mail.gmail.com> <4B60E65D.8050804@silveregg.co.jp> Message-ID: On Wed, Jan 27, 2010 at 6:20 PM, David Cournapeau wrote: > josef.pktd at gmail.com wrote: > > Can we/someone add a warning on the front page http://scipy.org/ > > (maybe under news for numpy download) about incompatibility of the > > binaries on sourceforge of scipy <=0.7.1 with numpy 1.4.0 ? > > It seems that it will be quite difficult to fix the issue without > removing something (I tried to use datetime as user types, but this > opened a can of worms), so I am (quite reluctantly ) coming to the > conclusion we should just bite the bullet and change the ABI number (so > that importing anything will fail instead of crashing randomly). > > Something like numpy 1.4.0.1, which would just have a different ABI > number than 1.4.0, without anything else. > > Why do you think it would be better to make this change in 1.4 rather than 1.5? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From david at silveregg.co.jp Thu Jan 28 01:39:57 2010 From: david at silveregg.co.jp (David Cournapeau) Date: Thu, 28 Jan 2010 15:39:57 +0900 Subject: [Numpy-discussion] Warning on http://scipy.org/ about binary incompatibility ? In-Reply-To: References: <1cd32cbb1001270739h4dc55ec0vd88575e6a9151e06@mail.gmail.com> <4B60E65D.8050804@silveregg.co.jp> Message-ID: <4B61313D.7080904@silveregg.co.jp> Charles R Harris wrote: > > > On Wed, Jan 27, 2010 at 6:20 PM, David Cournapeau > wrote: > > josef.pktd at gmail.com wrote: > > Can we/someone add a warning on the front page http://scipy.org/ > > (maybe under news for numpy download) about incompatibility of the > > binaries on sourceforge of scipy <=0.7.1 with numpy 1.4.0 ? > > It seems that it will be quite difficult to fix the issue without > removing something (I tried to use datetime as user types, but this > opened a can of worms), so I am (quite reluctantly ) coming to the > conclusion we should just bite the bullet and change the ABI number (so > that importing anything will fail instead of crashing randomly). > > Something like numpy 1.4.0.1, which would just have a different ABI > number than 1.4.0, without anything else. > > > Why do you think it would be better to make this change in 1.4 rather > than 1.5? Because then any extension fails to import with a clear message instead of crashing as it does now. 
It does not matter much if you know the crash is coming from an incompatible ABI, but it does if you don't :) cheers, David From charlesr.harris at gmail.com Thu Jan 28 02:26:44 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 28 Jan 2010 00:26:44 -0700 Subject: [Numpy-discussion] Warning on http://scipy.org/ about binary incompatibility ? In-Reply-To: <4B61313D.7080904@silveregg.co.jp> References: <1cd32cbb1001270739h4dc55ec0vd88575e6a9151e06@mail.gmail.com> <4B60E65D.8050804@silveregg.co.jp> <4B61313D.7080904@silveregg.co.jp> Message-ID: On Wed, Jan 27, 2010 at 11:39 PM, David Cournapeau wrote: > Charles R Harris wrote: > > > > > > On Wed, Jan 27, 2010 at 6:20 PM, David Cournapeau > > wrote: > > > > josef.pktd at gmail.com wrote: > > > Can we/someone add a warning on the front page http://scipy.org/ > > > (maybe under news for numpy download) about incompatibility of the > > > binaries on sourceforge of scipy <=0.7.1 with numpy 1.4.0 ? > > > > It seems that it will be quite difficult to fix the issue without > > removing something (I tried to use datetime as user types, but this > > opened a can of worms), so I am (quite reluctantly ) coming to the > > conclusion we should just bite the bullet and change the ABI number > (so > > that importing anything will fail instead of crashing randomly). > > > > Something like numpy 1.4.0.1, which would just have a different ABI > > number than 1.4.0, without anything else. > > > > > > Why do you think it would be better to make this change in 1.4 rather > > than 1.5? > > Because then any extension fails to import with a clear message instead > of crashing as it does now. It does not matter much if you know the > crash is coming from an incompatible ABI, but it does if you don't :) > > But why not remove the change? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From david at silveregg.co.jp Thu Jan 28 02:33:18 2010 From: david at silveregg.co.jp (David Cournapeau) Date: Thu, 28 Jan 2010 16:33:18 +0900 Subject: [Numpy-discussion] Warning on http://scipy.org/ about binary incompatibility ? In-Reply-To: References: <1cd32cbb1001270739h4dc55ec0vd88575e6a9151e06@mail.gmail.com> <4B60E65D.8050804@silveregg.co.jp> <4B61313D.7080904@silveregg.co.jp> Message-ID: <4B613DBE.9040509@silveregg.co.jp> Charles R Harris wrote: > > > On Wed, Jan 27, 2010 at 11:39 PM, David Cournapeau > > wrote: > > Charles R Harris wrote: > > > > > > On Wed, Jan 27, 2010 at 6:20 PM, David Cournapeau > > > >> wrote: > > > > josef.pktd at gmail.com > > wrote: > > > Can we/someone add a warning on the front page > http://scipy.org/ > > > (maybe under news for numpy download) about > incompatibility of the > > > binaries on sourceforge of scipy <=0.7.1 with numpy 1.4.0 ? > > > > It seems that it will be quite difficult to fix the issue without > > removing something (I tried to use datetime as user types, > but this > > opened a can of worms), so I am (quite reluctantly ) coming > to the > > conclusion we should just bite the bullet and change the ABI > number (so > > that importing anything will fail instead of crashing randomly). > > > > Something like numpy 1.4.0.1, which would just have a > different ABI > > number than 1.4.0, without anything else. > > > > > > Why do you think it would be better to make this change in 1.4 rather > > than 1.5? > > Because then any extension fails to import with a clear message instead > of crashing as it does now. 
It does not matter much if you know the > crash is coming from an incompatible ABI, but it does if you don't :) > > > But why not remove the change? Because Travis was against it when it was suggested last september or so. And removing in 1.4.x a feature introduced in 1.4.0 is weird. David From robert.kiwanuka at gmail.com Thu Jan 28 07:50:03 2010 From: robert.kiwanuka at gmail.com (Robert Kiwanuka) Date: Thu, 28 Jan 2010 12:50:03 +0000 Subject: [Numpy-discussion] is there any alternative to savefig? In-Reply-To: References: Message-ID: Hi all, I wonder if anyone knows any alternative function in pylab (or otherwise) that could be used to save an image. My problem is as follows: --------------- from pylab import * ... figure(1) fig1 = gca() figure(2) fig2 = gca() figure(3) fig3 = gca() for i,data_file in enumerate(data_file_list): time,x, y,x2, y2 = read_csv_file_4(open (data_file),elements=num_of_ elements) fig1.plot(-x,-y,color=colours[i],label=labellist[i]) fig2.plot(time,-y,color=colours[i],label=labellist[i]) fig3.plot(time,-x,color=colours[i],label=labellist[i]) fig1.legend(loc='best') fig1.set_title("y1 - x1") fig1.set_ylabel("y1") fig1.set_xlabel("x1") #savefig("y1-x1.png") fig2.legend(loc='best') fig2.set_title("y1 - time") fig2.set_ylabel("y1") fig2.set_xlabel("time[s]") #savefig("y1-time.png") fig3.legend(loc='best') fig3.set_title("x1 - time") fig3.set_ylabel("x1") fig3.set_xlabel("time[s]") #savefig("x1-time.png") show() --------------------------- In the above code, I read multiple data files and plot three separate figures. Now I would like to save each of the figures to a file as the commented savefig satements suggest. The trouble is that if I uncomment all those savefig statements, I get three saved images all containing the plot belonging to figure(3), which was the last figure declared. I understand this to be happening because savefig will save the "current" figure, which in this case happens to be the last one declared. If I could do something like fig1.savefig("y1-x1.png") or savefig("y1- x1.png").fig1, this would solve the problem but I'm not aware of any such methods or modules to enable this. This is thus a flaw in the general design/implementation of the savefig function, but is there an alternative function to enable me achieve what I need? Is there perhaps a possible tweak to savefig to make it do the same? Thanks in advance, Robert -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Thu Jan 28 07:57:08 2010 From: pav at iki.fi (Pauli Virtanen) Date: Thu, 28 Jan 2010 14:57:08 +0200 Subject: [Numpy-discussion] is there any alternative to savefig? In-Reply-To: References: Message-ID: <1264683428.16723.18.camel@talisman> to, 2010-01-28 kello 12:50 +0000, Robert Kiwanuka kirjoitti: [clip] > If I could do something like fig1.savefig("y1-x1.png") or savefig("y1- > x1.png").fig1, this would solve the problem but I'm not aware of any > such methods or modules to enable this. This is thus a flaw in the > general design/implementation of the savefig function, but is there an > alternative function to enable me achieve what I need? Is there > perhaps a possible tweak to savefig to make it do the same? Well, you almost had the answer there: use fig1fig = figure(1) ... fig1fig.savefig("y1-x1.png") fig2fig.savefig("y1-time.png") fig3fig.savefig("x1-time.png") to save the respective figures. 
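Spelled out with the loop from your script it would look roughly like this (an untested sketch, reusing your read_csv_file_4, data_file_list, num_of_elements, colours and labellist as they are):

from pylab import figure, show

fig1fig = figure(1)
fig1 = fig1fig.gca()
fig2fig = figure(2)
fig2 = fig2fig.gca()
fig3fig = figure(3)
fig3 = fig3fig.gca()

for i, data_file in enumerate(data_file_list):
    time, x, y, x2, y2 = read_csv_file_4(open(data_file), elements=num_of_elements)
    fig1.plot(-x, -y, color=colours[i], label=labellist[i])
    fig2.plot(time, -y, color=colours[i], label=labellist[i])
    fig3.plot(time, -x, color=colours[i], label=labellist[i])

fig1.set_title("y1 - x1")
fig1fig.savefig("y1-x1.png")   # Figure.savefig saves *this* figure, whatever is "current"
fig2.set_title("y1 - time")
fig2fig.savefig("y1-time.png")
fig3.set_title("x1 - time")
fig3fig.savefig("x1-time.png")
show()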
There is a separate mailing list for Matplotlib-specific questions: http://sourceforge.net/mail/?group_id=80706 -- Pauli Virtanen From dagss at student.matnat.uio.no Thu Jan 28 08:00:25 2010 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Thu, 28 Jan 2010 14:00:25 +0100 Subject: [Numpy-discussion] is there any alternative to savefig? In-Reply-To: References: Message-ID: <4B618A69.5030701@student.matnat.uio.no> Robert Kiwanuka wrote: > Hi all, > > I wonder if anyone knows any alternative function in pylab (or > otherwise) that could be used to save an image. My problem is as > follows: > > --------------- > from pylab import * > ... > > figure(1) > fig1 = gca() > figure(2) > fig2 = gca() > figure(3) > fig3 = gca() You should not use the pylab interface for stuff like this. This is much easier if you get rid of the notion of "current plot". from matplotlib import pyplot as plt fig1 = plt.figure() ax1 = fig1.add_subplot(111) ax1.plot_something... fig1.savefig(...) Etc., see matplotlib docs. Dag Sverre From josef.pktd at gmail.com Thu Jan 28 09:01:13 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 28 Jan 2010 09:01:13 -0500 Subject: [Numpy-discussion] Warning on http://scipy.org/ about binary incompatibility ? In-Reply-To: <4B61313D.7080904@silveregg.co.jp> References: <1cd32cbb1001270739h4dc55ec0vd88575e6a9151e06@mail.gmail.com> <4B60E65D.8050804@silveregg.co.jp> <4B61313D.7080904@silveregg.co.jp> Message-ID: <1cd32cbb1001280601g54957f1aq4cbce3fc6b5c25ec@mail.gmail.com> On Thu, Jan 28, 2010 at 1:39 AM, David Cournapeau wrote: > Charles R Harris wrote: >> >> >> On Wed, Jan 27, 2010 at 6:20 PM, David Cournapeau > > wrote: >> >> ? ? josef.pktd at gmail.com wrote: >> ? ? ?> Can we/someone add a warning on the front page http://scipy.org/ >> ? ? ?> (maybe under news for numpy download) about incompatibility of the >> ? ? ?> binaries on sourceforge of scipy <=0.7.1 with numpy 1.4.0 ? >> >> ? ? It seems that it will be quite difficult to fix the issue without >> ? ? removing something (I tried to use datetime as user types, but this >> ? ? opened a can of worms), so I am (quite reluctantly ) coming to the >> ? ? conclusion we should just bite the bullet and change the ABI number (so >> ? ? that importing anything will fail instead of crashing randomly). >> >> ? ? Something like numpy 1.4.0.1, which would just have a different ABI >> ? ? number than 1.4.0, without anything else. >> >> >> Why do you think it would be better to make this change in 1.4 rather >> than 1.5? > > Because then any extension fails to import with a clear message instead > of crashing as it does now. It does not matter much if you know the > crash is coming from an incompatible ABI, but it does if you don't :) I thought we could get away with a small binary incompatibility, without rebuilding everything. I'm using matplotlib although not extensively and it didn't crash in a while. (I don't remember which version of scipy I used for the last time when I had a crashing script.) I just tried to build h5py which does not import at all with 1.4.0, but I only get compiler errors about using headers only with Visual C++ and conflicting types for 'ssize_t' . Is there a way to find out which extensions use the binary incompatible part. Since it took a long time to confirm the ABI breakage, I would think it's not in a heavily used part. Although, I'm not sure since I had to rebuild for example most scikits with 1.4 either because of the cython issue or because of this. 
Personally, I would remove the ABI breakage for 1.4.1, I rather have a working SciPy than a new "experimental" feature. That's my opinion as a consumer of binary distributions. Josef > > cheers, > > David > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From dagss at student.matnat.uio.no Thu Jan 28 09:09:41 2010 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Thu, 28 Jan 2010 15:09:41 +0100 Subject: [Numpy-discussion] Warning on http://scipy.org/ about binary incompatibility ? In-Reply-To: <1cd32cbb1001280601g54957f1aq4cbce3fc6b5c25ec@mail.gmail.com> References: <1cd32cbb1001270739h4dc55ec0vd88575e6a9151e06@mail.gmail.com> <4B60E65D.8050804@silveregg.co.jp> <4B61313D.7080904@silveregg.co.jp> <1cd32cbb1001280601g54957f1aq4cbce3fc6b5c25ec@mail.gmail.com> Message-ID: <4B619AA5.2090601@student.matnat.uio.no> josef.pktd at gmail.com wrote: > On Thu, Jan 28, 2010 at 1:39 AM, David Cournapeau wrote: > >> Charles R Harris wrote: >> >>> On Wed, Jan 27, 2010 at 6:20 PM, David Cournapeau >> > wrote: >>> >>> josef.pktd at gmail.com wrote: >>> > Can we/someone add a warning on the front page http://scipy.org/ >>> > (maybe under news for numpy download) about incompatibility of the >>> > binaries on sourceforge of scipy <=0.7.1 with numpy 1.4.0 ? >>> >>> It seems that it will be quite difficult to fix the issue without >>> removing something (I tried to use datetime as user types, but this >>> opened a can of worms), so I am (quite reluctantly ) coming to the >>> conclusion we should just bite the bullet and change the ABI number (so >>> that importing anything will fail instead of crashing randomly). >>> >>> Something like numpy 1.4.0.1, which would just have a different ABI >>> number than 1.4.0, without anything else. >>> >>> >>> Why do you think it would be better to make this change in 1.4 rather >>> than 1.5? >>> >> Because then any extension fails to import with a clear message instead >> of crashing as it does now. It does not matter much if you know the >> crash is coming from an incompatible ABI, but it does if you don't :) >> > > I thought we could get away with a small binary incompatibility, > without rebuilding everything. I'm using matplotlib although not > extensively and it didn't crash in a while. (I don't remember which > version of scipy I used for the last time when I had a crashing > script.) > This made my hairs stand up on my back... Even if you check the "widely used" extensions for usecases which are affected by the breakage, you'll never get to check all custom propriotary C code around using NumPy, and their authors might easily miss this thread. In face of this, I actually think the current behaviour of Cython is a lucky accident, as all Cython code refuse to run with upgraded NumPy without being recompiled :-) (Only joking though; the next version of Cython will work across ABI versions). 
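The behaviour being praised here, refusing to run with a clear message instead of crashing later, can be approximated even from pure Python. A hedged sketch (the version string is illustrative only, and a real check would have to compare the C ABI number, which pure Python cannot see):

import numpy

BUILT_AGAINST = "1.3"   # major.minor the binary package was compiled against (example value)

installed = ".".join(numpy.__version__.split(".")[:2])
if installed != BUILT_AGAINST:
    # Fail loudly at import time with an explanation, instead of segfaulting later.
    raise ImportError("built against NumPy %s.x but NumPy %s is installed; "
                      "please rebuild against the installed version"
                      % (BUILT_AGAINST, numpy.__version__))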
Dag Sverre From bsouthey at gmail.com Thu Jan 28 09:46:43 2010 From: bsouthey at gmail.com (Bruce Southey) Date: Thu, 28 Jan 2010 08:46:43 -0600 Subject: [Numpy-discussion] wired error message in scipy.sparse.eigen function: Segmentation fault In-Reply-To: <4B612A8D.3060401@silveregg.co.jp> References: <4B60EC22.5070001@gmail.com> <4B60FF24.7040504@silveregg.co.jp> <4B6102B6.400@gmail.com> <4B610625.4060303@silveregg.co.jp> <4B611EEE.8040500@gmail.com> <4B612A8D.3060401@silveregg.co.jp> Message-ID: On Thu, Jan 28, 2010 at 12:11 AM, David Cournapeau wrote: > Jankins wrote: >> Yes. I am using scipy.sparse.linalg.eigen.arpack. >> >> The exact output is: >> >> /usr/local/lib/python2.6/dist-packages/scipy/sparse/linalg/eigen/arpack/_arpack.so > > I need the output of ldd on this file, actually, i.e the output of "ldd > /usr/local/lib/python2.6/dist-packages/scipy/sparse/linalg/eigen/arpack/_arpack.so". > It should output the libraries actually loaded by the OS. > >> In fact, the matrix is from a directed graph with about 18,000 nodes and >> 41,000 edges. Actually, this matrix is the smallest one I used. > > Is it available somewhere ? 41000 edges should make the matrix very > sparse. I first thought that your problem may be some buggy ATLAS, but > the current arpack interface (the one used by sparse.linalg.eigen) is > also quite buggy in my experience, though I could not reproduce it. > Having a matrix which consistently reproduce the bug would be very useful. > > In the short term, you may want to do without arpack support in scipy. > In the longer term, I intend to improve support for sparse matrices > linear algebra, as it is needed for my new job. > >> Now I switch to use numpy.linalg.eigvals, but it is slower than >> scipy.sparse.linalg.eigen.arpack module. > > If you have a reasonable ATLAS install, scipy.linalg.eigvals should > actually be quite fast. Sparse eigenvalues solver are much slower than > full ones in general as long as: > ? ? ? ?- your matrices are tiny (with tiny defined here as the plain matrix > requiring one order of magnitude less memory than the total available > memory, so something like matrices with ~ 1e7/1e8 entries on current > desktop computers) > ? ? ? ?- you need more than a few eigenvalues, or not just the > biggest/smallest ones > > cheers, > > David You are using Atlas version 3.6, perhaps you should upgrade to a more recent version (3.8.x)? What version of numpy are you using? Where did Atlas etc come from? Did you install both numpy and scipy from scratch (preferably built at the same time against the same library versions)? Sometimes removing everything and then rebuilding or reinstalling everything from scratch can help Perhaps less of a concern, but since your OS is 32-bit, is everything 32-bit and do you have sufficient memory for the system to run your code? After that, the array and code in question is need. Bruce From bsouthey at gmail.com Thu Jan 28 09:53:44 2010 From: bsouthey at gmail.com (Bruce Southey) Date: Thu, 28 Jan 2010 08:53:44 -0600 Subject: [Numpy-discussion] [ANN] New open source project for labeled arrays In-Reply-To: References: Message-ID: On Wed, Jan 27, 2010 at 9:24 PM, Keith Goodman wrote: > On Wed, Jan 27, 2010 at 7:13 PM, Pierre GM wrote: >> On Jan 27, 2010, at 9:10 PM, Keith Goodman wrote: >>> I recently opened sourced one of my packages. It is a labeled array >>> that I call larry. >>> >>> A two-dimensional larry, for example, contains a 2d NumPy array with >>> labels on each row and column. A larry can have any dimension. 
>>> >>> Alignment by label is automatic when you add (or subtract, multiply, >>> divide) two larrys. >>> >>> larry has built-in methods such as movingsum, ranking, merge, shuffle, >>> zscore, demean, lag as well as typical NumPy methods like sum, max, >>> std, sign, clip. NaNs are treated as missing data. >> >> So you can't have an integer larry with missing data ? > > No. > (No means yes??? :-) ) So how do you distinguish between a real NaN and a missing value? (Having to check array before and after an operation is not fun.) This is one of the reasons why masked arrays are superior for missing values. Bruce From kwgoodman at gmail.com Thu Jan 28 10:07:46 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Thu, 28 Jan 2010 07:07:46 -0800 Subject: [Numpy-discussion] [ANN] New open source project for labeled arrays In-Reply-To: References: Message-ID: On Thu, Jan 28, 2010 at 6:53 AM, Bruce Southey wrote: > On Wed, Jan 27, 2010 at 9:24 PM, Keith Goodman wrote: >> On Wed, Jan 27, 2010 at 7:13 PM, Pierre GM wrote: >>> On Jan 27, 2010, at 9:10 PM, Keith Goodman wrote: >>>> I recently opened sourced one of my packages. It is a labeled array >>>> that I call larry. >>>> >>>> A two-dimensional larry, for example, contains a 2d NumPy array with >>>> labels on each row and column. A larry can have any dimension. >>>> >>>> Alignment by label is automatic when you add (or subtract, multiply, >>>> divide) two larrys. >>>> >>>> larry has built-in methods such as movingsum, ranking, merge, shuffle, >>>> zscore, demean, lag as well as typical NumPy methods like sum, max, >>>> std, sign, clip. NaNs are treated as missing data. >>> >>> So you can't have an integer larry with missing data ? >> >> No. >> > > (No means yes??? :-) ) No. > So how do you distinguish between a real NaN and a missing value? > (Having to check array before and after an operation is not fun.) > This is one of the reasons why masked arrays are superior for missing values. Unit test coverage of larry is pretty good, so at some point I could begin porting, function by function, to ma while keeping NaN --> missing. After the porting was complete I could remove NaN --> missing and add the ability to pass in a mask or missing value marker. I don't have any experience with ma. And I have a long todo list. So ma support is not currently planned. From josef.pktd at gmail.com Thu Jan 28 10:08:23 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 28 Jan 2010 10:08:23 -0500 Subject: [Numpy-discussion] Warning on http://scipy.org/ about binary incompatibility ? In-Reply-To: <4B619AA5.2090601@student.matnat.uio.no> References: <1cd32cbb1001270739h4dc55ec0vd88575e6a9151e06@mail.gmail.com> <4B60E65D.8050804@silveregg.co.jp> <4B61313D.7080904@silveregg.co.jp> <1cd32cbb1001280601g54957f1aq4cbce3fc6b5c25ec@mail.gmail.com> <4B619AA5.2090601@student.matnat.uio.no> Message-ID: <1cd32cbb1001280708vc38c800i309f7730b717a4cb@mail.gmail.com> On Thu, Jan 28, 2010 at 9:09 AM, Dag Sverre Seljebotn wrote: > josef.pktd at gmail.com wrote: >> On Thu, Jan 28, 2010 at 1:39 AM, David Cournapeau wrote: >> >>> Charles R Harris wrote: >>> >>>> On Wed, Jan 27, 2010 at 6:20 PM, David Cournapeau >>> > wrote: >>>> >>>> ? ? josef.pktd at gmail.com wrote: >>>> ? ? ?> Can we/someone add a warning on the front page http://scipy.org/ >>>> ? ? ?> (maybe under news for numpy download) about incompatibility of the >>>> ? ? ?> binaries on sourceforge of scipy <=0.7.1 with numpy 1.4.0 ? >>>> >>>> ? ? 
It seems that it will be quite difficult to fix the issue without >>>> ? ? removing something (I tried to use datetime as user types, but this >>>> ? ? opened a can of worms), so I am (quite reluctantly ) coming to the >>>> ? ? conclusion we should just bite the bullet and change the ABI number (so >>>> ? ? that importing anything will fail instead of crashing randomly). >>>> >>>> ? ? Something like numpy 1.4.0.1, which would just have a different ABI >>>> ? ? number than 1.4.0, without anything else. >>>> >>>> >>>> Why do you think it would be better to make this change in 1.4 rather >>>> than 1.5? >>>> >>> Because then any extension fails to import with a clear message instead >>> of crashing as it does now. It does not matter much if you know the >>> crash is coming from an incompatible ABI, but it does if you don't :) >>> >> >> I thought we could get away with a small binary incompatibility, >> without rebuilding everything. I'm using matplotlib although not >> extensively and it didn't crash in a while. (I don't remember which >> version of scipy I used for the last time when I had a crashing >> script.) >> > This made my hairs stand up on my back... Maybe this wasn't well phrased, with "we" I meant users like myself, and for example not the numpy developers. I'm strongly in favor of warnings, but not of enforced not running at all. For example I don't know whether h5py would work with numpy 1.4 if cython (I think) wouldn't prevent me from importing it. If I had the option I would stick with numpy 1.3 until the mess is cleared up and other packages are available in xxx-numpy1.4.x versions. Cheers, Josef > > Even if you check the "widely used" extensions for usecases which are > affected by the breakage, you'll never get to check all custom > propriotary C code around using NumPy, and their authors might easily > miss this thread. > > In face of this, I actually think the current behaviour of Cython is a > lucky accident, as all Cython code refuse to run with upgraded NumPy > without being recompiled :-) > > (Only joking though; the next version of Cython will work across ABI > versions). > > Dag Sverre > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From robert.kiwanuka at gmail.com Thu Jan 28 10:43:29 2010 From: robert.kiwanuka at gmail.com (Robert Kiwanuka) Date: Thu, 28 Jan 2010 15:43:29 +0000 Subject: [Numpy-discussion] NumPy-Discussion Digest, Vol 40, Issue 70 In-Reply-To: References: Message-ID: [snip] > > Message: 2 > Date: Thu, 28 Jan 2010 12:50:03 +0000 > From: Robert Kiwanuka > Subject: [Numpy-discussion] is there any alternative to savefig? > To: numpy-discussion at scipy.org > Message-ID: > > Content-Type: text/plain; charset="iso-8859-1" > > Hi all, > > I wonder if anyone knows any alternative function in pylab (or > otherwise) that could be used to save an image. My problem is as > follows: > > --------------- > from pylab import * > ... 
> > figure(1) > fig1 = gca() > figure(2) > fig2 = gca() > figure(3) > fig3 = gca() > > for i,data_file in enumerate(data_file_list): > time,x, y,x2, y2 = read_csv_file_4(open > (data_file),elements=num_of_ > elements) > fig1.plot(-x,-y,color=colours[i],label=labellist[i]) > fig2.plot(time,-y,color=colours[i],label=labellist[i]) > fig3.plot(time,-x,color=colours[i],label=labellist[i]) > > fig1.legend(loc='best') > fig1.set_title("y1 - x1") > fig1.set_ylabel("y1") > fig1.set_xlabel("x1") > #savefig("y1-x1.png") > > fig2.legend(loc='best') > fig2.set_title("y1 - time") > fig2.set_ylabel("y1") > fig2.set_xlabel("time[s]") > #savefig("y1-time.png") > > fig3.legend(loc='best') > fig3.set_title("x1 - time") > fig3.set_ylabel("x1") > fig3.set_xlabel("time[s]") > #savefig("x1-time.png") > show() > --------------------------- > > In the above code, I read multiple data files and plot three separate > figures. Now I would like to save each of the figures to a file as the > commented savefig satements suggest. The trouble is that if I > uncomment all those savefig statements, I get three saved images all > containing the plot belonging to figure(3), which was the last figure > declared. > > I understand this to be happening because savefig will save the > "current" figure, which in this case happens to be the last one > declared. > > If I could do something like fig1.savefig("y1-x1.png") or savefig("y1- > x1.png").fig1, this would solve the problem but I'm not aware of any > such methods or modules to enable this. This is thus a flaw in the > general design/implementation of the savefig function, but is there an > alternative function to enable me achieve what I need? Is there > perhaps a possible tweak to savefig to make it do the same? > > Thanks in advance, > > Robert > -------------- next part -------------- > An HTML attachment was scrubbed... > URL: > http://mail.scipy.org/pipermail/numpy-discussion/attachments/20100128/057ec695/attachment-0001.html > > ------------------------------ > > Message: 3 > Date: Thu, 28 Jan 2010 14:57:08 +0200 > From: Pauli Virtanen > Subject: Re: [Numpy-discussion] is there any alternative to savefig? > To: Discussion of Numerical Python > Message-ID: <1264683428.16723.18.camel at talisman> > Content-Type: text/plain; charset="UTF-8" > > to, 2010-01-28 kello 12:50 +0000, Robert Kiwanuka kirjoitti: > [clip] > > If I could do something like fig1.savefig("y1-x1.png") or savefig("y1- > > x1.png").fig1, this would solve the problem but I'm not aware of any > > such methods or modules to enable this. This is thus a flaw in the > > general design/implementation of the savefig function, but is there an > > alternative function to enable me achieve what I need? Is there > > perhaps a possible tweak to savefig to make it do the same? > > Well, you almost had the answer there: use > > fig1fig = figure(1) > ... > > fig1fig.savefig("y1-x1.png") > fig2fig.savefig("y1-time.png") > fig3fig.savefig("x1-time.png") > > to save the respective figures. > > There is a separate mailing list for Matplotlib-specific questions: > http://sourceforge.net/mail/?group_id=80706 > > -- > Pauli Virtanen > > > > ------------------------------ > > Message: 4 > Date: Thu, 28 Jan 2010 14:00:25 +0100 > From: Dag Sverre Seljebotn > Subject: Re: [Numpy-discussion] is there any alternative to savefig? 
> To: Discussion of Numerical Python > Message-ID: <4B618A69.5030701 at student.matnat.uio.no> > Content-Type: text/plain; charset=ISO-8859-1; format=flowed > > Robert Kiwanuka wrote: > > Hi all, > > > > I wonder if anyone knows any alternative function in pylab (or > > otherwise) that could be used to save an image. My problem is as > > follows: > > > > --------------- > > from pylab import * > > ... > > > > figure(1) > > fig1 = gca() > > figure(2) > > fig2 = gca() > > figure(3) > > fig3 = gca() > You should not use the pylab interface for stuff like this. This is much > easier if you get rid of the notion of "current plot". > > from matplotlib import pyplot as plt > > fig1 = plt.figure() > ax1 = fig1.add_subplot(111) > > ax1.plot_something... > > fig1.savefig(...) > > Etc., see matplotlib docs. > > Dag Sverre > > [snip] Many thanks to both of you! It is all fine now: the real problem was that, having defined e.g. fig1 = plt.figure(1).gca(), "fig1.savefig" would not work! Instead, I should have used "plt.figure(1).savefig", where the "gca()" part is trimmed off! Regards, Robert From kwgoodman at gmail.com Thu Jan 28 11:31:44 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Thu, 28 Jan 2010 08:31:44 -0800 Subject: [Numpy-discussion] dtype and logical_and Message-ID: I noticed that & (logical and) does not support float dtype:
>> b = np.array([True, False])
>> f = np.array([1.0, 2.0])
>> i = np.array([1, 2])
>> b & b
array([ True, False], dtype=bool)
>> i & i
array([1, 2])
>> i & f
TypeError: unsupported operand type(s) for &: 'int' and 'float'
But this works:
>> np.logical_and(b, f)
array([ True, False], dtype=bool)
and this gives a different result from i & i above:
>> np.logical_and(i, i)
array([ True, True], dtype=bool)
Why are & and np.logical_and different? If I have a class (a labeled array, larry) any suggestions on whether I should use & or np.logical_and on the underlying arrays for the __and__ method? From robert.kern at gmail.com Thu Jan 28 11:39:16 2010 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 28 Jan 2010 10:39:16 -0600 Subject: [Numpy-discussion] dtype and logical_and In-Reply-To: References: Message-ID: <3d375d731001280839i6f7791f7hce4c5a6d4e85b5b2@mail.gmail.com> On Thu, Jan 28, 2010 at 10:31, Keith Goodman wrote: > I noticed that & (logical and) does not support float dtype: & is not logical_and(). It is bitwise_and(). -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From kwgoodman at gmail.com Thu Jan 28 11:42:37 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Thu, 28 Jan 2010 08:42:37 -0800 Subject: [Numpy-discussion] dtype and logical_and In-Reply-To: <3d375d731001280839i6f7791f7hce4c5a6d4e85b5b2@mail.gmail.com> References: <3d375d731001280839i6f7791f7hce4c5a6d4e85b5b2@mail.gmail.com> Message-ID: On Thu, Jan 28, 2010 at 8:39 AM, Robert Kern wrote: > On Thu, Jan 28, 2010 at 10:31, Keith Goodman wrote: >> I noticed that & (logical and) does not support float dtype: > > & is not logical_and(). It is bitwise_and(). That explains it. Thank you. From charlesr.harris at gmail.com Thu Jan 28 16:17:29 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 28 Jan 2010 14:17:29 -0700 Subject: [Numpy-discussion] Warning on http://scipy.org/ about binary incompatibility ?
In-Reply-To: <4B613DBE.9040509@silveregg.co.jp> References: <1cd32cbb1001270739h4dc55ec0vd88575e6a9151e06@mail.gmail.com> <4B60E65D.8050804@silveregg.co.jp> <4B61313D.7080904@silveregg.co.jp> <4B613DBE.9040509@silveregg.co.jp> Message-ID: On Thu, Jan 28, 2010 at 12:33 AM, David Cournapeau wrote: > Charles R Harris wrote: > > > > > > On Wed, Jan 27, 2010 at 11:39 PM, David Cournapeau > > > wrote: > > > > Charles R Harris wrote: > > > > > > > > > On Wed, Jan 27, 2010 at 6:20 PM, David Cournapeau > > > > > >> > wrote: > > > > > > josef.pktd at gmail.com > > > wrote: > > > > Can we/someone add a warning on the front page > > http://scipy.org/ > > > > (maybe under news for numpy download) about > > incompatibility of the > > > > binaries on sourceforge of scipy <=0.7.1 with numpy 1.4.0 ? > > > > > > It seems that it will be quite difficult to fix the issue > without > > > removing something (I tried to use datetime as user types, > > but this > > > opened a can of worms), so I am (quite reluctantly ) coming > > to the > > > conclusion we should just bite the bullet and change the ABI > > number (so > > > that importing anything will fail instead of crashing > randomly). > > > > > > Something like numpy 1.4.0.1, which would just have a > > different ABI > > > number than 1.4.0, without anything else. > > > > > > > > > Why do you think it would be better to make this change in 1.4 > rather > > > than 1.5? > > > > Because then any extension fails to import with a clear message > instead > > of crashing as it does now. It does not matter much if you know the > > crash is coming from an incompatible ABI, but it does if you don't :) > > > > > > But why not remove the change? > > Because Travis was against it when it was suggested last september or > so. And removing in 1.4.x a feature introduced in 1.4.0 is weird. > > But wasn't that decision based on the premiss that the datetime work wouldn't break the ABI? I don't see anything weird about making 1.4 work with existing binaries. If we are going to break the ABI, and it looks like we will, then it would be better if the word went out early so that projects that depend on numpy can be prepared for the change. So my preference would be to remove the incompatibility in 1.4 and introduce it in 1.5. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From david at silveregg.co.jp Thu Jan 28 19:58:31 2010 From: david at silveregg.co.jp (David Cournapeau) Date: Fri, 29 Jan 2010 09:58:31 +0900 Subject: [Numpy-discussion] Warning on http://scipy.org/ about binary incompatibility ? In-Reply-To: References: <1cd32cbb1001270739h4dc55ec0vd88575e6a9151e06@mail.gmail.com> <4B60E65D.8050804@silveregg.co.jp> <4B61313D.7080904@silveregg.co.jp> <4B613DBE.9040509@silveregg.co.jp> Message-ID: <4B6232B7.9060409@silveregg.co.jp> Charles R Harris wrote: > > > On Thu, Jan 28, 2010 at 12:33 AM, David Cournapeau > > wrote: > > Because Travis was against it when it was suggested last september or > so. And removing in 1.4.x a feature introduced in 1.4.0 is weird. > > > But wasn't that decision based on the premiss that the datetime work > wouldn't break the ABI? Well, this and because Travis is the BFDL of NumPy as far as I am concerned :) So I think it should be his decision whether to remove it or not. > I don't see anything weird about making 1.4 work > with existing binaries. 
Indeed, but that's not really what I am saying :) I am saying there is a tradeoff between breaking people's code (for people using datetime) and keeping a compatible ABI. So the decision depends quite a bit on how many people use the datetime code. > If we are going to break the ABI, and it looks > like we will, then it would be better if the word went out early so that > projects that depend on numpy can be prepared for the change. So my > preference would be to remove the incompatibility in 1.4 and introduce > it in 1.5. Assuming not many people depend on datetime in 1.4.0, that would be my preference as well. cheers, David From david.huard at gmail.com Fri Jan 29 11:49:20 2010 From: david.huard at gmail.com (David Huard) Date: Fri, 29 Jan 2010 11:49:20 -0500 Subject: [Numpy-discussion] How to get the shape of an array slice without doing it Message-ID: <91cf711d1001290849r5e2fa56dlcde9f4fea28cc23d@mail.gmail.com> Hi, I have a 4D "array" with a given shape, but the array is never actually created since it is large and distributed over multiple binary files. Typical usage would be to take slices across the 4D array. I'd like to know what the shape of the resulting array would be if I took a slice out of it. That is, let's say my 4D array is A, I'd like to know A[ndindex].shape without actually creating A. ndindex should support all numpy constructions (integer, boolean, array, slice, ...). I am guessing something already exists to do this, but I just can't put my finger on it. Thanks. David From josef.pktd at gmail.com Fri Jan 29 12:10:42 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 29 Jan 2010 12:10:42 -0500 Subject: [Numpy-discussion] How to get the shape of an array slice without doing it In-Reply-To: <91cf711d1001290849r5e2fa56dlcde9f4fea28cc23d@mail.gmail.com> References: <91cf711d1001290849r5e2fa56dlcde9f4fea28cc23d@mail.gmail.com> Message-ID: <1cd32cbb1001290910t351572a5h47a54b8d800a780a@mail.gmail.com> On Fri, Jan 29, 2010 at 11:49 AM, David Huard wrote: > Hi, > > I have a 4D "array" with a given shape, but the array is never > actually created since it is large and distributed over multiple > binary files. Typical usage would be to take slices across the 4D > array. > > I'd like to know what the shape of the resulting array would be if I > took a slice out of it. > That is, let's say my 4D array is A, I'd like to know > > A[ndindex].shape > > without actually creating A. > > ndindex should support all numpy constructions (integer, boolean, > array, slice, ...). I am guessing something already exists to do this, > but I just can't put my finger on it. trying out some things, just because it's a puzzling question >>> indi= (slice(2,5), np.arange(2), np.arange(3)[:,None]) >>> np.broadcast(*indi).shape (3, 2) I don't know if this is ok for all possible cases, (and there are some confusing things with reordering axis, when slices and fancy indexing is mixed) Josef > Thanks. 
> > David > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From david.huard at gmail.com Fri Jan 29 12:32:44 2010 From: david.huard at gmail.com (David Huard) Date: Fri, 29 Jan 2010 12:32:44 -0500 Subject: [Numpy-discussion] How to get the shape of an array slice without doing it In-Reply-To: <1cd32cbb1001290910t351572a5h47a54b8d800a780a@mail.gmail.com> References: <91cf711d1001290849r5e2fa56dlcde9f4fea28cc23d@mail.gmail.com> <1cd32cbb1001290910t351572a5h47a54b8d800a780a@mail.gmail.com> Message-ID: <91cf711d1001290932g22e23727ycf8b71eac4a590bf@mail.gmail.com> On Fri, Jan 29, 2010 at 12:10 PM, wrote: > On Fri, Jan 29, 2010 at 11:49 AM, David Huard wrote: >> Hi, >> >> I have a 4D "array" with a given shape, but the array is never >> actually created since it is large and distributed over multiple >> binary files. Typical usage would be to take slices across the 4D >> array. >> >> I'd like to know what the shape of the resulting array would be if I >> took a slice out of it. >> That is, let's say my 4D array is A, I'd like to know >> >> A[ndindex].shape >> >> without actually creating A. >> >> ndindex should support all numpy constructions (integer, boolean, >> array, slice, ...). I am guessing something already exists to do this, >> but I just can't put my finger on it. > > trying out some things, just because it's a puzzling question > >>>> indi= (slice(2,5), np.arange(2), np.arange(3)[:,None]) >>>> np.broadcast(*indi).shape > (3, 2) > > I don't know if this is ok for all possible cases, (and there are some > confusing things with reordering axis, when slices and fancy indexing > is mixed) > Hi josef, Where then do you specify the shape of the A array ? Maybe an example would be clearer: Let's say A's shape is (10, 1, 5, 20) and the index is [::2, ..., 0] A[::2, ..., 0] shape would be (5, 1, 5) The broadcast idea has potential, I'll toy with it. David > Josef > > > >> Thanks. >> >> David >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From josef.pktd at gmail.com Fri Jan 29 12:48:22 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 29 Jan 2010 12:48:22 -0500 Subject: [Numpy-discussion] How to get the shape of an array slice without doing it In-Reply-To: <91cf711d1001290932g22e23727ycf8b71eac4a590bf@mail.gmail.com> References: <91cf711d1001290849r5e2fa56dlcde9f4fea28cc23d@mail.gmail.com> <1cd32cbb1001290910t351572a5h47a54b8d800a780a@mail.gmail.com> <91cf711d1001290932g22e23727ycf8b71eac4a590bf@mail.gmail.com> Message-ID: <1cd32cbb1001290948o7e67f011h792088611ad966f4@mail.gmail.com> On Fri, Jan 29, 2010 at 12:32 PM, David Huard wrote: > On Fri, Jan 29, 2010 at 12:10 PM, ? wrote: >> On Fri, Jan 29, 2010 at 11:49 AM, David Huard wrote: >>> Hi, >>> >>> I have a 4D "array" with a given shape, but the array is never >>> actually created since it is large and distributed over multiple >>> binary files. Typical usage would be to take slices across the 4D >>> array. >>> >>> I'd like to know what the shape of the resulting array would be if I >>> took a slice out of it. 
>>> That is, let's say my 4D array is A, I'd like to know >>> >>> A[ndindex].shape >>> >>> without actually creating A. >>> >>> ndindex should support all numpy constructions (integer, boolean, >>> array, slice, ...). I am guessing something already exists to do this, >>> but I just can't put my finger on it. >> >> trying out some things, just because it's a puzzling question >> >>>>> indi= (slice(2,5), np.arange(2), np.arange(3)[:,None]) >>>>> np.broadcast(*indi).shape >> (3, 2) >> >> I don't know if this is ok for all possible cases, (and there are some >> confusing things with reordering axis, when slices and fancy indexing >> is mixed) >> > > Hi josef, > > Where then do you specify the shape of the A array ? ?Maybe an example > would be clearer: > > Let's say A's shape is (10, 1, 5, 20) > and the index is [::2, ..., 0] > > A[::2, ..., 0] shape would be (5, 1, 5) > > The broadcast idea has potential, I'll toy with it. > > David maybe this helps: >>> np.broadcast(*indi).shape (3, 2) >>> indi= (slice(slice(8,None,None).indices(10)),np.arange(2), np.arange(3)[:,None]) >>> np.broadcast(*indi).shape (3, 2) >>> slice(8,None,None).indices(10) (8, 10, 1) I'm just doing dir(slice) and look up the docs. I never used any of this. Josef > > > >> Josef >> >> >> >>> Thanks. >>> >>> David >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From josef.pktd at gmail.com Fri Jan 29 12:53:46 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 29 Jan 2010 12:53:46 -0500 Subject: [Numpy-discussion] How to get the shape of an array slice without doing it In-Reply-To: <1cd32cbb1001290948o7e67f011h792088611ad966f4@mail.gmail.com> References: <91cf711d1001290849r5e2fa56dlcde9f4fea28cc23d@mail.gmail.com> <1cd32cbb1001290910t351572a5h47a54b8d800a780a@mail.gmail.com> <91cf711d1001290932g22e23727ycf8b71eac4a590bf@mail.gmail.com> <1cd32cbb1001290948o7e67f011h792088611ad966f4@mail.gmail.com> Message-ID: <1cd32cbb1001290953t364c9c35u6f9a9ad9af5505db@mail.gmail.com> On Fri, Jan 29, 2010 at 12:48 PM, wrote: > On Fri, Jan 29, 2010 at 12:32 PM, David Huard wrote: >> On Fri, Jan 29, 2010 at 12:10 PM, ? wrote: >>> On Fri, Jan 29, 2010 at 11:49 AM, David Huard wrote: >>>> Hi, >>>> >>>> I have a 4D "array" with a given shape, but the array is never >>>> actually created since it is large and distributed over multiple >>>> binary files. Typical usage would be to take slices across the 4D >>>> array. >>>> >>>> I'd like to know what the shape of the resulting array would be if I >>>> took a slice out of it. >>>> That is, let's say my 4D array is A, I'd like to know >>>> >>>> A[ndindex].shape >>>> >>>> without actually creating A. >>>> >>>> ndindex should support all numpy constructions (integer, boolean, >>>> array, slice, ...). I am guessing something already exists to do this, >>>> but I just can't put my finger on it. 
>>> >>> trying out some things, just because it's a puzzling question >>> >>>>>> indi= (slice(2,5), np.arange(2), np.arange(3)[:,None]) >>>>>> np.broadcast(*indi).shape >>> (3, 2) >>> >>> I don't know if this is ok for all possible cases, (and there are some >>> confusing things with reordering axis, when slices and fancy indexing >>> is mixed) >>> >> >> Hi josef, >> >> Where then do you specify the shape of the A array ? ?Maybe an example >> would be clearer: >> >> Let's say A's shape is (10, 1, 5, 20) >> and the index is [::2, ..., 0] >> >> A[::2, ..., 0] shape would be (5, 1, 5) >> >> The broadcast idea has potential, I'll toy with it. >> >> David > > maybe this helps: > >>>> np.broadcast(*indi).shape > (3, 2) >>>> indi= (slice(slice(8,None,None).indices(10)),np.arange(2), np.arange(3)[:,None]) >>>> np.broadcast(*indi).shape > (3, 2) >>>> slice(8,None,None).indices(10) > (8, 10, 1) > > I'm just doing dir(slice) and look up the docs. I never used any of this. > > Josef I forgot about ellipsis, since I never use them, replace ellipsis by [slice(None)]*ndim or something like this I don't know how to access an ellipsis directly, is it even possible to construct an index list that contains an ellipsis? There is an object for it but I never looked at it. Josef > > >> >> >> >>> Josef >>> >>> >>> >>>> Thanks. >>>> >>>> David >>>> _______________________________________________ >>>> NumPy-Discussion mailing list >>>> NumPy-Discussion at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > From kwgoodman at gmail.com Fri Jan 29 13:03:19 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Fri, 29 Jan 2010 10:03:19 -0800 Subject: [Numpy-discussion] How to get the shape of an array slice without doing it In-Reply-To: <1cd32cbb1001290953t364c9c35u6f9a9ad9af5505db@mail.gmail.com> References: <91cf711d1001290849r5e2fa56dlcde9f4fea28cc23d@mail.gmail.com> <1cd32cbb1001290910t351572a5h47a54b8d800a780a@mail.gmail.com> <91cf711d1001290932g22e23727ycf8b71eac4a590bf@mail.gmail.com> <1cd32cbb1001290948o7e67f011h792088611ad966f4@mail.gmail.com> <1cd32cbb1001290953t364c9c35u6f9a9ad9af5505db@mail.gmail.com> Message-ID: On Fri, Jan 29, 2010 at 9:53 AM, wrote: > I forgot about ellipsis, since I never use them, > replace ellipsis by [slice(None)]*ndim or something like this > > I don't know how to access an ellipsis directly, is it even possible > to construct an index list that contains an ellipsis? > There is an object for it but I never looked at it. I haven't been following the discussion and I don't understand your question and in a moment I will accidentally hit send... >> class eli(object): ...: ...: def __init__(self): ...: pass ...: ...: def __getitem__(self, index): ...: print index ...: >> x[...] Ellipsis >> x[...,1] (Ellipsis, 1) Ellipsis is a python class. Built in, no need to import. 
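As a small follow-up sketch (not from the original thread): an index expression containing an Ellipsis can be built by hand or captured with np.index_exp, and spotted with an identity test. Strictly speaking, Ellipsis is the built-in singleton instance of the ellipsis type rather than a class, which is why the identity check works:

import numpy as np

ind = (slice(2, 5), Ellipsis, 0)      # a hand-built index tuple containing an Ellipsis
same = np.index_exp[2:5, ..., 0]      # the same tuple, captured via np.index_exp

print(ind == same)                          # True
print([item is Ellipsis for item in ind])   # [False, True, False]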
From josef.pktd at gmail.com Fri Jan 29 13:16:41 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 29 Jan 2010 13:16:41 -0500 Subject: [Numpy-discussion] How to get the shape of an array slice without doing it In-Reply-To: <91cf711d1001290932g22e23727ycf8b71eac4a590bf@mail.gmail.com> References: <91cf711d1001290849r5e2fa56dlcde9f4fea28cc23d@mail.gmail.com> <1cd32cbb1001290910t351572a5h47a54b8d800a780a@mail.gmail.com> <91cf711d1001290932g22e23727ycf8b71eac4a590bf@mail.gmail.com> Message-ID: <1cd32cbb1001291016i28cf7eeem91aaacdeb42e8536@mail.gmail.com> On Fri, Jan 29, 2010 at 12:32 PM, David Huard wrote: > On Fri, Jan 29, 2010 at 12:10 PM, ? wrote: >> On Fri, Jan 29, 2010 at 11:49 AM, David Huard wrote: >>> Hi, >>> >>> I have a 4D "array" with a given shape, but the array is never >>> actually created since it is large and distributed over multiple >>> binary files. Typical usage would be to take slices across the 4D >>> array. >>> >>> I'd like to know what the shape of the resulting array would be if I >>> took a slice out of it. >>> That is, let's say my 4D array is A, I'd like to know >>> >>> A[ndindex].shape >>> >>> without actually creating A. >>> >>> ndindex should support all numpy constructions (integer, boolean, >>> array, slice, ...). I am guessing something already exists to do this, >>> but I just can't put my finger on it. >> >> trying out some things, just because it's a puzzling question >> >>>>> indi= (slice(2,5), np.arange(2), np.arange(3)[:,None]) >>>>> np.broadcast(*indi).shape >> (3, 2) >> >> I don't know if this is ok for all possible cases, (and there are some >> confusing things with reordering axis, when slices and fancy indexing >> is mixed) >> > > Hi josef, > > Where then do you specify the shape of the A array ? ?Maybe an example > would be clearer: > > Let's say A's shape is (10, 1, 5, 20) > and the index is [::2, ..., 0] > > A[::2, ..., 0] shape would be (5, 1, 5) > > The broadcast idea has potential, I'll toy with it. > > David broadcast doesn't work as easily as I thought, the dimension to broadcast to, would have to be the same as the original array. I just struggled with something similar for my attempted rewrite for stats.nanmedian, and I think it took me a day to get the axis right. Maybe someone has a better idea. Josef > > >> Josef >> >> >> >>> Thanks. 
>>> >>> David >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From josef.pktd at gmail.com Fri Jan 29 13:27:24 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 29 Jan 2010 13:27:24 -0500 Subject: [Numpy-discussion] How to get the shape of an array slice without doing it In-Reply-To: References: <91cf711d1001290849r5e2fa56dlcde9f4fea28cc23d@mail.gmail.com> <1cd32cbb1001290910t351572a5h47a54b8d800a780a@mail.gmail.com> <91cf711d1001290932g22e23727ycf8b71eac4a590bf@mail.gmail.com> <1cd32cbb1001290948o7e67f011h792088611ad966f4@mail.gmail.com> <1cd32cbb1001290953t364c9c35u6f9a9ad9af5505db@mail.gmail.com> Message-ID: <1cd32cbb1001291027o6cb60e37j33942753a229196e@mail.gmail.com> On Fri, Jan 29, 2010 at 1:03 PM, Keith Goodman wrote: > On Fri, Jan 29, 2010 at 9:53 AM, ? wrote: >> I forgot about ellipsis, since I never use them, >> replace ellipsis by [slice(None)]*ndim or something like this >> >> I don't know how to access an ellipsis directly, is it even possible >> to construct an index list that contains an ellipsis? >> There is an object for it but I never looked at it. > > I haven't been following the discussion and I don't understand your > question and in a moment I will accidentally hit send... > >>> class eli(object): > ? ...: > ? ...: ? ? ? ? def __init__(self): > ? ...: ? ? ? ? ? ? pass > ? ...: > ? ...: ? ? def __getitem__(self, index): > ? ...: ? ? ? ? ? ? print index > ? ...: > >>> x[...] > Ellipsis >>> x[...,1] > (Ellipsis, 1) > > Ellipsis is a python class. Built in, no need to import. thanks, this makes it possible to construct index lists with Ellipsis, but it showed that my broadcast idea doesn't work this way Travis explained last year how slices and broadcasting are used for indexing, and it's quite a bit more complicated than this. Sorry for jumping in too fast. Josef >>> indi= (slice(2,5),Ellipsis, np.arange(3)[:,None]) >>> ind2 = [] >>> for i in indi: if not i is Ellipsis: ind2.append(i) else: ind2.extend([slice(None)]*2) >>> ind2 [slice(2, 5, None), slice(None, None, None), slice(None, None, None), array([[0], [1], [2]])] >>> np.broadcast(*ind2).shape (3, 1) > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From david.huard at gmail.com Fri Jan 29 14:42:47 2010 From: david.huard at gmail.com (David Huard) Date: Fri, 29 Jan 2010 14:42:47 -0500 Subject: [Numpy-discussion] Warning on http://scipy.org/ about binary incompatibility ? In-Reply-To: <4B6232B7.9060409@silveregg.co.jp> References: <1cd32cbb1001270739h4dc55ec0vd88575e6a9151e06@mail.gmail.com> <4B60E65D.8050804@silveregg.co.jp> <4B61313D.7080904@silveregg.co.jp> <4B613DBE.9040509@silveregg.co.jp> <4B6232B7.9060409@silveregg.co.jp> Message-ID: <91cf711d1001291142v2b964063m38fa60904aa661d@mail.gmail.com> I'm a heavy user of scikits.timeseries so I am very interested in having native datetime objects in Numpy. However, when I did play with it about a week ago. 
I found inconsistencies between the actual code and the NEP. The "Example of use" section mostly doesn't work. I understand the need to put it out there so it gets used, but for the moment I think potential users are still those who compile from the dev. tree anyway. Thanks for all the hard work that has been put into this, David On Thu, Jan 28, 2010 at 7:58 PM, David Cournapeau wrote: > Charles R Harris wrote: >> >> >> On Thu, Jan 28, 2010 at 12:33 AM, David Cournapeau >> > wrote: > >> >> ? ? Because Travis was against it when it was suggested last september or >> ? ? so. And removing in 1.4.x a feature introduced in 1.4.0 is weird. >> >> >> But wasn't that decision based on the premiss that the datetime work >> wouldn't break the ABI? > > Well, this and because Travis is the BFDL of NumPy as far as I am > concerned :) So I think it should be his decision whether to remove it > or not. > >> I don't see anything weird about making 1.4 work >> with existing binaries. > > Indeed, but that's not really what I am saying :) I am saying there is a > tradeoff between breaking people's code (for people using datetime) and > keeping a compatible ABI. > > So the decision depends quite a bit on how many people use the datetime > code. > >> If we are going to break the ABI, and it looks >> like we will, then it would be better if the word went out early so that >> projects that depend on numpy can be prepared for the change. So my >> preference would be to remove the incompatibility in 1.4 and introduce >> it in 1.5. > > Assuming not many people depend on datetime in 1.4.0, that would be my > preference as well. > > cheers, > > David > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From david.huard at gmail.com Fri Jan 29 14:58:52 2010 From: david.huard at gmail.com (David Huard) Date: Fri, 29 Jan 2010 14:58:52 -0500 Subject: [Numpy-discussion] How to get the shape of an array slice without doing it In-Reply-To: <1cd32cbb1001291027o6cb60e37j33942753a229196e@mail.gmail.com> References: <91cf711d1001290849r5e2fa56dlcde9f4fea28cc23d@mail.gmail.com> <1cd32cbb1001290910t351572a5h47a54b8d800a780a@mail.gmail.com> <91cf711d1001290932g22e23727ycf8b71eac4a590bf@mail.gmail.com> <1cd32cbb1001290948o7e67f011h792088611ad966f4@mail.gmail.com> <1cd32cbb1001290953t364c9c35u6f9a9ad9af5505db@mail.gmail.com> <1cd32cbb1001291027o6cb60e37j33942753a229196e@mail.gmail.com> Message-ID: <91cf711d1001291158k3f5f989ak4ab00229869f2b40@mail.gmail.com> For the record, here is what I came up with. import numpy as np def expand_ellipsis(index, ndim): """Replace the ellipsis, real or implied, of an index expression by slices. Parameters ---------- index : tuple Indexing expression. ndim : int Number of dimensions of the array the index applies to. Return ------ out : tuple An indexing expression of length `ndim` where the Elipsis are replaced by slices. """ n = len(index) index = index + ndim * (slice(None),) newindex = [] for i in index: try: if i == Ellipsis: newindex.extend((ndim - n + 1)*(slice(None),)) else: newindex.append(i) except: newindex.append(i) return newindex[:ndim] def indexedshape(shape, index): """Return the shape of an array sliced by index. Parameters ---------- shape : tuple Shape of the original array. index : tuple Indexing sequence. Return ------ out : tuple If array A has shape `shape`, then out = A[index].shape. 
Example ------- >>> indexedshape((5,4,3,2), (Ellipsis, 0)) (5,4,3) >>> indexedshape((5,4,3,2), (slice(None, None, 2), 2, [1,2], [True, False])) """ index = expand_ellipsis(index, len(shape)) out = [] for s, i in zip(shape,index): if type(i) == slice: start, stop, stride = i.indices(s) out.append(int(np.ceil((stop-start)*1./stride))) elif np.isscalar(i): pass elif getattr(i, 'dtype', None) == np.bool: out.append(i.sum()) else: out.append(len(i)) return tuple(out) def test_indexedshape(): from numpy.testing import assert_equal as eq s = (6,5,4,3) a = np.empty(s) i = np.index_exp[::4, 3:, 0, np.array([True, False, True])] eq(a[i].shape, indexedshape(s, i)) i = np.index_exp[1::4, 3:, np.array([0,1,2]), ::-1] eq(a[i].shape, indexedshape(s, i)) i = (0,) eq(a[i].shape, indexedshape(s, i)) i = (3, Ellipsis, 0) eq(a[i].shape, indexedshape(s, i)) On Fri, Jan 29, 2010 at 1:27 PM, wrote: > On Fri, Jan 29, 2010 at 1:03 PM, Keith Goodman wrote: >> On Fri, Jan 29, 2010 at 9:53 AM, ? wrote: >>> I forgot about ellipsis, since I never use them, >>> replace ellipsis by [slice(None)]*ndim or something like this >>> >>> I don't know how to access an ellipsis directly, is it even possible >>> to construct an index list that contains an ellipsis? >>> There is an object for it but I never looked at it. >> >> I haven't been following the discussion and I don't understand your >> question and in a moment I will accidentally hit send... >> >>>> class eli(object): >> ? ...: >> ? ...: ? ? ? ? def __init__(self): >> ? ...: ? ? ? ? ? ? pass >> ? ...: >> ? ...: ? ? def __getitem__(self, index): >> ? ...: ? ? ? ? ? ? print index >> ? ...: >> >>>> x[...] >> Ellipsis >>>> x[...,1] >> (Ellipsis, 1) >> >> Ellipsis is a python class. Built in, no need to import. > > thanks, this makes it possible to construct index lists with Ellipsis, > but it showed that my broadcast idea doesn't work this way > > Travis explained last year how slices and broadcasting are used for > indexing, and it's quite a bit more complicated than this. > > Sorry for jumping in too fast. > > Josef > >>>> indi= (slice(2,5),Ellipsis, np.arange(3)[:,None]) >>>> ind2 = [] >>>> for i in indi: > ? ? ? ?if not i is Ellipsis: ind2.append(i) > ? ? ? ?else: ind2.extend([slice(None)]*2) > > >>>> ind2 > [slice(2, 5, None), slice(None, None, None), slice(None, None, None), > array([[0], > ? ? ? [1], > ? ? ? 
[2]])] >>>> np.broadcast(*ind2).shape > (3, 1) > > > >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From josef.pktd at gmail.com Fri Jan 29 15:08:14 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 29 Jan 2010 15:08:14 -0500 Subject: [Numpy-discussion] How to get the shape of an array slice without doing it In-Reply-To: <91cf711d1001291158k3f5f989ak4ab00229869f2b40@mail.gmail.com> References: <91cf711d1001290849r5e2fa56dlcde9f4fea28cc23d@mail.gmail.com> <1cd32cbb1001290910t351572a5h47a54b8d800a780a@mail.gmail.com> <91cf711d1001290932g22e23727ycf8b71eac4a590bf@mail.gmail.com> <1cd32cbb1001290948o7e67f011h792088611ad966f4@mail.gmail.com> <1cd32cbb1001290953t364c9c35u6f9a9ad9af5505db@mail.gmail.com> <1cd32cbb1001291027o6cb60e37j33942753a229196e@mail.gmail.com> <91cf711d1001291158k3f5f989ak4ab00229869f2b40@mail.gmail.com> Message-ID: <1cd32cbb1001291208s72fa6daase2b8e6d1e804dd43@mail.gmail.com> On Fri, Jan 29, 2010 at 2:58 PM, David Huard wrote: > For the record, here is what I came up with. > > import numpy as np > > def expand_ellipsis(index, ndim): > ? ?"""Replace the ellipsis, real or implied, of an index expression by slices. > > ? ?Parameters > ? ?---------- > ? ?index : tuple > ? ? ?Indexing expression. > ? ?ndim : int > ? ? ?Number of dimensions of the array the index applies to. > > ? ?Return > ? ?------ > ? ?out : tuple > ? ? ?An indexing expression of length `ndim` where the Elipsis are replaced > ? ? ?by slices. > ? ?""" > ? ?n = len(index) > ? ?index = index + ndim * (slice(None),) > > ? ?newindex = [] > ? ?for i in index: > ? ? ? ?try: > ? ? ? ? ? ?if i == Ellipsis: > ? ? ? ? ? ? ? ?newindex.extend((ndim - n + 1)*(slice(None),)) > ? ? ? ? ? ?else: > ? ? ? ? ? ? ? ?newindex.append(i) > ? ? ? ?except: > ? ? ? ? ? ?newindex.append(i) > > ? ?return newindex[:ndim] > > def indexedshape(shape, index): > ? ?"""Return the shape of an array sliced by index. > > ? ?Parameters > ? ?---------- > ? ?shape : tuple > ? ? ?Shape of the original array. > ? ?index : tuple > ? ? ?Indexing sequence. > > ? ?Return > ? ?------ > ? ?out : tuple > ? ? ?If array A has shape `shape`, then out = A[index].shape. > > ? ?Example > ? ?------- > ? ?>>> indexedshape((5,4,3,2), (Ellipsis, 0)) > ? ?(5,4,3) > ? ?>>> indexedshape((5,4,3,2), (slice(None, None, 2), 2, [1,2], > [True, False])) > ? ?""" > ? ?index = expand_ellipsis(index, len(shape)) > ? ?out = [] > ? ?for s, i in zip(shape,index): > ? ? ? ?if type(i) == slice: > ? ? ? ? ? ?start, stop, stride = i.indices(s) > ? ? ? ? ? ?out.append(int(np.ceil((stop-start)*1./stride))) > ? ? ? ?elif np.isscalar(i): > ? ? ? ? ? ?pass > ? ? ? ?elif getattr(i, 'dtype', None) == np.bool: > ? ? ? ? ? ?out.append(i.sum()) > ? ? ? ?else: > ? ? ? ? ? ?out.append(len(i)) > > ? ?return tuple(out) > > > def test_indexedshape(): > ? ?from numpy.testing import assert_equal as eq > ? ?s = (6,5,4,3) > ? ?a = np.empty(s) > ? ?i = np.index_exp[::4, 3:, 0, np.array([True, False, True])] > ? ?eq(a[i].shape, indexedshape(s, i)) > > ? ?i = np.index_exp[1::4, 3:, np.array([0,1,2]), ::-1] > ? ?eq(a[i].shape, indexedshape(s, i)) > > ? ?i = (0,) > ? ?eq(a[i].shape, indexedshape(s, i)) > > ? ?i = (3, Ellipsis, 0) > ? 
?eq(a[i].shape, indexedshape(s, i)) You did the slice part that I didn't manage, but broadcasting doesn't work correctly ? >>> i = np.index_exp[1::4, 3:, np.array([0,1,2])[:,None], ::-1] >>> i (slice(1, None, 4), slice(3, None, None), array([[0], [1], [2]]), slice(None, None, -1)) >>> a[i].shape, indexedshape(s, i) ((2, 2, 3, 1, 3), (2, 2, 3, 3)) >>> i = np.index_exp[1::4, np.array([0,1,2])[None,None,:], np.array([0,1,2])[:,None], ::-1] >>> i (slice(1, None, 4), array([[[0, 1, 2]]]), array([[0], [1], [2]]), slice(None, None, -1)) >>> a[i].shape, indexedshape(s, i) ((2, 1, 3, 3, 3), (2, 1, 3, 3)) Josef > > On Fri, Jan 29, 2010 at 1:27 PM, ? wrote: >> On Fri, Jan 29, 2010 at 1:03 PM, Keith Goodman wrote: >>> On Fri, Jan 29, 2010 at 9:53 AM, ? wrote: >>>> I forgot about ellipsis, since I never use them, >>>> replace ellipsis by [slice(None)]*ndim or something like this >>>> >>>> I don't know how to access an ellipsis directly, is it even possible >>>> to construct an index list that contains an ellipsis? >>>> There is an object for it but I never looked at it. >>> >>> I haven't been following the discussion and I don't understand your >>> question and in a moment I will accidentally hit send... >>> >>>>> class eli(object): >>> ? ...: >>> ? ...: ? ? ? ? def __init__(self): >>> ? ...: ? ? ? ? ? ? pass >>> ? ...: >>> ? ...: ? ? def __getitem__(self, index): >>> ? ...: ? ? ? ? ? ? print index >>> ? ...: >>> >>>>> x[...] >>> Ellipsis >>>>> x[...,1] >>> (Ellipsis, 1) >>> >>> Ellipsis is a python class. Built in, no need to import. >> >> thanks, this makes it possible to construct index lists with Ellipsis, >> but it showed that my broadcast idea doesn't work this way >> >> Travis explained last year how slices and broadcasting are used for >> indexing, and it's quite a bit more complicated than this. >> >> Sorry for jumping in too fast. >> >> Josef >> >>>>> indi= (slice(2,5),Ellipsis, np.arange(3)[:,None]) >>>>> ind2 = [] >>>>> for i in indi: >> ? ? ? ?if not i is Ellipsis: ind2.append(i) >> ? ? ? ?else: ind2.extend([slice(None)]*2) >> >> >>>>> ind2 >> [slice(2, 5, None), slice(None, None, None), slice(None, None, None), >> array([[0], >> ? ? ? [1], >> ? ? ? 
[2]])] >>>>> np.broadcast(*ind2).shape >> (3, 1) >> >> >> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From josef.pktd at gmail.com Fri Jan 29 16:01:57 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 29 Jan 2010 16:01:57 -0500 Subject: [Numpy-discussion] How to get the shape of an array slice without doing it In-Reply-To: <91cf711d1001291158k3f5f989ak4ab00229869f2b40@mail.gmail.com> References: <91cf711d1001290849r5e2fa56dlcde9f4fea28cc23d@mail.gmail.com> <1cd32cbb1001290910t351572a5h47a54b8d800a780a@mail.gmail.com> <91cf711d1001290932g22e23727ycf8b71eac4a590bf@mail.gmail.com> <1cd32cbb1001290948o7e67f011h792088611ad966f4@mail.gmail.com> <1cd32cbb1001290953t364c9c35u6f9a9ad9af5505db@mail.gmail.com> <1cd32cbb1001291027o6cb60e37j33942753a229196e@mail.gmail.com> <91cf711d1001291158k3f5f989ak4ab00229869f2b40@mail.gmail.com> Message-ID: <1cd32cbb1001291301s5a0b603euccdff64ba8df42a9@mail.gmail.com> On Fri, Jan 29, 2010 at 2:58 PM, David Huard wrote: > For the record, here is what I came up with. > > import numpy as np > > def expand_ellipsis(index, ndim): > ? ?"""Replace the ellipsis, real or implied, of an index expression by slices. > > ? ?Parameters > ? ?---------- > ? ?index : tuple > ? ? ?Indexing expression. > ? ?ndim : int > ? ? ?Number of dimensions of the array the index applies to. > > ? ?Return > ? ?------ > ? ?out : tuple > ? ? ?An indexing expression of length `ndim` where the Elipsis are replaced > ? ? ?by slices. > ? ?""" > ? ?n = len(index) > ? ?index = index + ndim * (slice(None),) > > ? ?newindex = [] > ? ?for i in index: > ? ? ? ?try: > ? ? ? ? ? ?if i == Ellipsis: > ? ? ? ? ? ? ? ?newindex.extend((ndim - n + 1)*(slice(None),)) > ? ? ? ? ? ?else: > ? ? ? ? ? ? ? ?newindex.append(i) > ? ? ? ?except: > ? ? ? ? ? ?newindex.append(i) > > ? ?return newindex[:ndim] > > def indexedshape(shape, index): > ? ?"""Return the shape of an array sliced by index. > > ? ?Parameters > ? ?---------- > ? ?shape : tuple > ? ? ?Shape of the original array. > ? ?index : tuple > ? ? ?Indexing sequence. > > ? ?Return > ? ?------ > ? ?out : tuple > ? ? ?If array A has shape `shape`, then out = A[index].shape. > > ? ?Example > ? ?------- > ? ?>>> indexedshape((5,4,3,2), (Ellipsis, 0)) > ? ?(5,4,3) > ? ?>>> indexedshape((5,4,3,2), (slice(None, None, 2), 2, [1,2], > [True, False])) > ? ?""" > ? ?index = expand_ellipsis(index, len(shape)) > ? ?out = [] > ? ?for s, i in zip(shape,index): > ? ? ? ?if type(i) == slice: > ? ? ? ? ? ?start, stop, stride = i.indices(s) > ? ? ? ? ? ?out.append(int(np.ceil((stop-start)*1./stride))) > ? ? ? ?elif np.isscalar(i): > ? ? ? ? ? ?pass > ? ? ? ?elif getattr(i, 'dtype', None) == np.bool: > ? ? ? ? ? ?out.append(i.sum()) > ? ? ? ?else: > ? ? ? ? ? ?out.append(len(i)) > > ? ?return tuple(out) > > > def test_indexedshape(): > ? ?from numpy.testing import assert_equal as eq > ? ?s = (6,5,4,3) > ? ?a = np.empty(s) > ? ?i = np.index_exp[::4, 3:, 0, np.array([True, False, True])] > ? ?eq(a[i].shape, indexedshape(s, i)) > > ? 
BTW: this will be very useful for understanding slicing and indexing,
which is (for me) hidden in the C code.

A few more test cases. This one is good:

>>> i = np.index_exp[np.array([0,1,2])[:,None], np.array([0,1]),:,:]
>>> a[i].shape, indexedshape(s, i)
((3, 2, 4, 3), (3, 2, 4, 3))

These are the tricky ones, the ones that show the interaction between
broadcasting and slices in more than 2 dimensions:

>>> i = np.index_exp[np.array([0,1,2])[:,None], :, np.array([0,1])]
>>> a[i].shape, indexedshape(s, i)
((3, 2, 5, 3), (3, 5, 2, 3))

>>> i = np.index_exp[np.array([0,1,2])[:,None], ..., np.array([0,1])]
>>> a[i].shape, indexedshape(s, i)
((3, 2, 5, 4), (3, 5, 4, 2))

Thanks,

Josef

>
> On Fri, Jan 29, 2010 at 1:27 PM, ? wrote:
>> On Fri, Jan 29, 2010 at 1:03 PM, Keith Goodman wrote:
>>> On Fri, Jan 29, 2010 at 9:53 AM, ? wrote:
>>>> I forgot about ellipsis, since I never use them,
>>>> replace ellipsis by [slice(None)]*ndim or something like this
>>>>
>>>> I don't know how to access an ellipsis directly, is it even possible
>>>> to construct an index list that contains an ellipsis?
>>>> There is an object for it but I never looked at it.
>>>
>>> I haven't been following the discussion and I don't understand your
>>> question and in a moment I will accidentally hit send...
>>>
>>>>> class eli(object):
>>>   ...:
>>>   ...:         def __init__(self):
>>>   ...:             pass
>>>   ...:
>>>   ...:     def __getitem__(self, index):
>>>   ...:             print index
>>>   ...:
>>>
>>>>> x[...]
>>> Ellipsis
>>>>> x[...,1]
>>> (Ellipsis, 1)
>>>
>>> Ellipsis is a python class. Built in, no need to import.
>>
>> thanks, this makes it possible to construct index lists with Ellipsis,
>> but it showed that my broadcast idea doesn't work this way
>>
>> Travis explained last year how slices and broadcasting are used for
>> indexing, and it's quite a bit more complicated than this.
>>
>> Sorry for jumping in too fast.
>>
>> Josef
>>
>>>>> indi= (slice(2,5),Ellipsis, np.arange(3)[:,None])
>>>>> ind2 = []
>>>>> for i in indi:
>>        if not i is Ellipsis: ind2.append(i)
>>        else: ind2.extend([slice(None)]*2)
>>
>>
>>>>> ind2
>> [slice(2, 5, None), slice(None, None, None), slice(None, None, None),
>> array([[0],
>>        [1],
>>        [2]])]
>>>>> np.broadcast(*ind2).shape
>> (3, 1)
>>

From robert.kern at gmail.com  Fri Jan 29 19:25:09 2010
From: robert.kern at gmail.com (Robert Kern)
Date: Fri, 29 Jan 2010 18:25:09 -0600
Subject: [Numpy-discussion] Distutils issue?
	[migrated]
In-Reply-To: 
References: 
Message-ID: <3d375d731001291625k38d80213r6dbfd5c2f60d0153@mail.gmail.com>

On Fri, Jan 29, 2010 at 18:18, Tom Davis wrote:
> [This thread has been migrated from distutils-sig until such a time that it
> can be determined if this is a distutils or numpy issue]
>
> A quick recap: Basically, trying to
> fix http://projects.scipy.org/numpy/ticket/999 along with some duplicates
> thereof. Robert made some changes to get us past the initial missing lists
> in 8080 and 8081; now things have gotten crazy.
>
> ``export CFLAGS="-I/usr/include/python2.5"`` fixed the missing header issue.
> I've attached the resulting dump with the patched debug output; I wasn't
> able to make a lot of sense out of it. At over 2400 lines, it's a bit
> lengthy.

Don't use CFLAGS. Please show the dump without that flag.

Also, please show me the command line(s) that you are using to try to
install numpy.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

From ralf.gommers at googlemail.com  Sun Jan 31 09:43:24 2010
From: ralf.gommers at googlemail.com (Ralf Gommers)
Date: Sun, 31 Jan 2010 22:43:24 +0800
Subject: [Numpy-discussion] which Python for OS X to build installers?
Message-ID: 

Hi,

With only a few changes (see diff below) to pavement.py I managed to
build a dmg installer. For this I used the Python in the bootstrap
virtualenv however, instead of the one in
/Library/Frameworks/Python.framework/. Does this matter?

I don't have a framework build installed, the 4-way universal build did
not work for me out of the box a while ago, and installer downloads from
python.org are blocked here (thanks Chinese Govt., the Great Firewall is
an impressive productivity killer). So I stuck with the default Apple
Python till now.

For making releases, would I need the framework build? Do I need 32- and
64-bit versions of Python 2.4, 2.5 and 2.6?
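In case it helps to narrow that down, this is how I have been checking
whether a given interpreter is a framework build at all (assuming
PYTHONFRAMEWORK is the right config variable to look at; it prints
'Python' for a framework build and nothing for a non-framework one,
though it does not tell python.org and Apple builds apart):

$ python -c "from distutils import sysconfig; print sysconfig.get_config_var('PYTHONFRAMEWORK')"
Python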
Cheers,
Ralf


diff --git a/pavement.py b/pavement.py
index f6c1433..bc931ec 100644
--- a/pavement.py
+++ b/pavement.py
@@ -88,12 +88,13 @@ SUPERPACK_BUILD = 'build-superpack'
 SUPERPACK_BINDIR = os.path.join(SUPERPACK_BUILD, 'binaries')
 options(bootstrap=Bunch(bootstrap_dir="bootstrap"),
-        virtualenv=Bunch(packages_to_install=["sphinx", "numpydoc"], no_site_packages=True),
+        virtualenv=Bunch(packages_to_install=["sphinx", "numpydoc"],
+                         no_site_packages=False),
         sphinx=Bunch(builddir="build", sourcedir="source", docroot='doc'),
         superpack=Bunch(builddir="build-superpack"),
         installers=Bunch(releasedir="release",
                          installersdir=os.path.join("release", "installers")),
-        doc=Bunch(doc_root="doc", 
+        doc=Bunch(doc_root="doc",
             sdir=os.path.join("doc", "source"),
             bdir=os.path.join("doc", "build"),
             bdir_latex=os.path.join("doc", "build", "latex"),
@@ -106,7 +107,7 @@ options(bootstrap=Bunch(bootstrap_dir="bootstrap"),
 MPKG_PYTHON = {
     "2.5": ["/Library/Frameworks/Python.framework/Versions/2.5/bin/python"],
-    "2.6": ["/Library/Frameworks/Python.framework/Versions/2.6/bin/python"]
+    "2.6": ["python"]
 }
 SSE3_CFG = {'ATLAS': r'C:\local\lib\yop\sse3'}
@@ -206,7 +207,7 @@ def bdist_superpack(options):
     copy_bdist("sse2")
     bdist_wininst_arch(pyver, 'sse3')
     copy_bdist("sse3")
-    
+
     idirs = options.installers.installersdir
     pyver = options.python_version
     prepare_nsis_script(pyver, FULLVERSION)
@@ -273,8 +274,8 @@ def bootstrap(options):
     options.virtualenv.script_name = os.path.join(options.bootstrap_dir,
                                                   bscript)
-    options.virtualenv.no_site_packages = True
-    options.bootstrap.no_site_packages = True
+    options.virtualenv.no_site_packages = False
+    options.bootstrap.no_site_packages = False
     call_task('paver.virtual.bootstrap')
     sh('cd %s; %s %s' % (bdir, sys.executable, bscript))

From cournape at gmail.com  Sun Jan 31 23:33:26 2010
From: cournape at gmail.com (David Cournapeau)
Date: Mon, 1 Feb 2010 13:33:26 +0900
Subject: [Numpy-discussion] which Python for OS X to build installers?
In-Reply-To: 
References: 
Message-ID: <5b8d13221001312033j1d03642aq8a60b5a9654b9bd5@mail.gmail.com>

On Sun, Jan 31, 2010 at 11:43 PM, Ralf Gommers wrote:
> Hi,
>
> With only a few changes (see diff below) to pavement.py I managed to build a
> dmg installer. For this I used the Python in the bootstrap virtualenv
> however, instead of the one in /Library/Frameworks/Python.framework/. Does
> this matter?

Yes it does. The binary installers should target the python from
python.org, nothing else.

> For making releases, would I need the framework build? Do I need 32- and
> 64-bit versions of Python 2.4, 2.5 and 2.6?

The Python from python.org does not support 64 bits (yet), so just
build for ppc/x86. I never bothered with ppc64, and I think we can
actually give up on ppc soon.

cheers,

David
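For reference, targeting the python.org Python amounts to leaving the
MPKG_PYTHON entries in pavement.py pointed at the python.org framework
interpreters (roughly the values the diff above swapped out) rather than
whatever "python" happens to be on the path:

MPKG_PYTHON = {
    "2.5": ["/Library/Frameworks/Python.framework/Versions/2.5/bin/python"],
    "2.6": ["/Library/Frameworks/Python.framework/Versions/2.6/bin/python"]
}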