From charlesr.harris at gmail.com  Fri Dec  1 00:12:03 2006
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Thu, 30 Nov 2006 22:12:03 -0700
Subject: [Numpy-discussion] save a matrix
In-Reply-To:
References:
Message-ID:

On 11/30/06, Keith Goodman wrote:
>
> What's a good way to save matrix objects to file for later use? I just
> need something quick for debugging.
>
> I saw two suggestions on this list from Francesc Altet (2006-05-22):
>
> 1. Use tofile and fromfile and save the meta data yourself.
>
> 2. pytables
>
> Any suggestions for #3?

Is this what you want?

In [14]: a
Out[14]:
matrix([[2, 3],
        [4, 5]])

In [15]: b
Out[15]:
matrix([[2, 3],
        [4, 5]])

In [16]: f = open('dump.pkl','w')

In [17]: pickle.dump(a,f)

In [18]: pickle.dump(b,f)

In [19]: f.close()

In [20]: f = open('dump.pkl','r')

In [21]: x = pickle.load(f)

In [22]: y = pickle.load(f)

In [23]: f.close()

In [24]: x
Out[24]:
matrix([[2, 3],
        [4, 5]])

In [25]: y
Out[25]:
matrix([[2, 3],
        [4, 5]])

Chuck

From kwgoodman at gmail.com  Fri Dec  1 00:17:29 2006
From: kwgoodman at gmail.com (Keith Goodman)
Date: Thu, 30 Nov 2006 21:17:29 -0800
Subject: [Numpy-discussion] save a matrix
In-Reply-To:
References:
Message-ID:

On 11/30/06, Charles R Harris wrote:
>
> On 11/30/06, Keith Goodman wrote:
> > What's a good way to save matrix objects to file for later use? I just
> > need something quick for debugging.
> >
> > I saw two suggestions on this list from Francesc Altet (2006-05-22):
> >
> > 1. Use tofile and fromfile and save the meta data yourself.
> >
> > 2. pytables
> >
> > Any suggestions for #3?
>
> Is this what you want?
>
> In [14]: a
> Out[14]:
> matrix([[2, 3],
>         [4, 5]])
>
> In [15]: b
> Out[15]:
> matrix([[2, 3],
>         [4, 5]])
>
> In [16]: f = open('dump.pkl','w')
>
> In [17]: pickle.dump(a,f)
>
> In [18]: pickle.dump(b,f)
>
> In [19]: f.close()
>
> In [20]: f = open('dump.pkl','r')
>
> In [21]: x = pickle.load(f)
>
> In [22]: y = pickle.load(f)
>
> In [23]: f.close()
>
> In [24]: x
> Out[24]:
> matrix([[2, 3],
>         [4, 5]])
>
> In [25]: y
> Out[25]:
> matrix([[2, 3],
>         [4, 5]])

Yes. That will do very well. You got me out of a pickle.

From charlesr.harris at gmail.com  Fri Dec  1 00:19:24 2006
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Thu, 30 Nov 2006 22:19:24 -0700
Subject: [Numpy-discussion] save a matrix
In-Reply-To:
References:
Message-ID:

On 11/30/06, Charles R Harris wrote:
>
> On 11/30/06, Keith Goodman wrote:
> >
> > What's a good way to save matrix objects to file for later use? I just
> > need something quick for debugging.
> >
> > I saw two suggestions on this list from Francesc Altet (2006-05-22):
> >
> > 1. Use tofile and fromfile and save the meta data yourself.
> >
> > 2. pytables
> >
> > Any suggestions for #3?
>
> Is this what you want?
>
> In [14]: a
> Out[14]:
> matrix([[2, 3],
>         [4, 5]])
>
> In [15]: b
> Out[15]:
> matrix([[2, 3],
>         [4, 5]])
>
> In [16]: f = open('dump.pkl','w')
>
> In [17]: pickle.dump(a,f)
>
> In [18]: pickle.dump(b,f)
>
> In [19]: f.close()
>
> In [20]: f = open('dump.pkl','r')
>
> In [21]: x = pickle.load(f)
>
> In [22]: y = pickle.load(f)
>
> In [23]: f.close()
>
> In [24]: x
> Out[24]:
> matrix([[2, 3],
>         [4, 5]])
>
> In [25]: y
> Out[25]:
> matrix([[2, 3],
>         [4, 5]])
>

It is also possible to put the variables of interest in a dictionary, then
pickle the dictionary. That way you can also store the variable names.
In [27]: f = open('dump.pkl','w')

In [28]: pickle.dump( {'a':a,'b':b}, f)

In [29]: f.close()

In [30]: f = open('dump.pkl','r')

In [31]: mystuff = pickle.load(f)

In [32]: f.close()

In [34]: mystuff
Out[34]:
{'a': matrix([[2, 3],
        [4, 5]]), 'b': matrix([[2, 3],
        [4, 5]])}

I think you can actually pickle the whole environment, but I don't
recall how.

Chuck

From kwgoodman at gmail.com  Fri Dec  1 00:30:25 2006
From: kwgoodman at gmail.com (Keith Goodman)
Date: Thu, 30 Nov 2006 21:30:25 -0800
Subject: [Numpy-discussion] save a matrix
In-Reply-To:
References:
Message-ID:

On 11/30/06, Charles R Harris wrote:
>
> It is also possible to put the variables of interest in a dictionary, then
> pickle the dictionary. That way you can also store the variable names.
>
> In [27]: f = open('dump.pkl','w')
>
> In [28]: pickle.dump( {'a':a,'b':b}, f)
>
> In [29]: f.close()
>
> In [30]: f = open('dump.pkl','r')
>
> In [31]: mystuff = pickle.load(f)
>
> In [32]: f.close()
>
> In [34]: mystuff
> Out[34]:
> {'a': matrix([[2, 3],
>         [4, 5]]), 'b': matrix([[2, 3],
>         [4, 5]])}

I think I could use that to write a function savematrix(filename, a, b,
c,...)

Is there a way to write a loadmatrix(filename) that doesn't return
anything but makes the matrices a, b, c, ... available?

Probably not a good function design. But useful for quick things.

From charlesr.harris at gmail.com  Fri Dec  1 01:33:05 2006
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Thu, 30 Nov 2006 23:33:05 -0700
Subject: [Numpy-discussion] save a matrix
In-Reply-To:
References:
Message-ID:

On 11/30/06, Keith Goodman wrote:
>
> On 11/30/06, Charles R Harris wrote:
> >
> > It is also possible to put the variables of interest in a dictionary,
> then
> > pickle the dictionary. That way you can also store the variable names.
> >
> > In [27]: f = open('dump.pkl','w')
> >
> > In [28]: pickle.dump( {'a':a,'b':b}, f)
> >
> > In [29]: f.close()
> >
> > In [30]: f = open('dump.pkl','r')
> >
> > In [31]: mystuff = pickle.load(f)
> >
> > In [32]: f.close()
> >
> > In [34]: mystuff
> > Out[34]:
> > {'a': matrix([[2, 3],
> >         [4, 5]]), 'b': matrix([[2, 3],
> >         [4, 5]])}
>
> I think I could use that to write a function savematrix(filename, a, b,
> c,...)
>
> Is there a way to write a loadmatrix(filename) that doesn't return
> anything but makes the matrices a, b, c, ... available?

I think there is, that is why I mentioned the saving-the-environment
thingee. IIRC, I saw code for something like that a couple of years back
but I don't recall the details. Maybe something like:

In [80]: globals()['x'] = [1,2]

In [81]: x
Out[81]: [1, 2]

Then you just have to merge the pickled dictionary with globals(). Like
this:

>>> globals().update(mystuff)

where mystuff is the dictionary where you have your stuff. This could
probably also go something like

>>> globals().update(load(f))

where f contains the pickled dictionary.

Chuck

From charlesr.harris at gmail.com  Fri Dec  1 02:02:19 2006
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Fri, 1 Dec 2006 00:02:19 -0700
Subject: [Numpy-discussion] save a matrix
In-Reply-To:
References:
Message-ID:

On 11/30/06, Charles R Harris wrote:
>
> On 11/30/06, Keith Goodman wrote:
> >
> > On 11/30/06, Charles R Harris wrote:
> > >
> > > It is also possible to put the variables of interest in a dictionary,
> > then
> > > pickle the dictionary.
> > > That way you can also store the variable names.
> > >
> > > In [27]: f = open('dump.pkl','w')
> > >
> > > In [28]: pickle.dump( {'a':a,'b':b}, f)
> > >
> > > In [29]: f.close()
> > >
> > > In [30]: f = open('dump.pkl','r')
> > >
> > > In [31]: mystuff = pickle.load(f)
> > >
> > > In [32]: f.close()
> > >
> > > In [34]: mystuff
> > > Out[34]:
> > > {'a': matrix([[2, 3],
> > >         [4, 5]]), 'b': matrix([[2, 3],
> > >         [4, 5]])}
> >
> > I think I could use that to write a function savematrix(filename, a, b,
> > c,...)
> >
> > Is there a way to write a loadmatrix(filename) that doesn't return
> > anything but makes the matrices a, b, c, ... available?
>
> I think there is, that is why I mentioned the saving-the-environment
> thingee. IIRC, I saw code for something like that a couple of years back
> but I don't recall the details. Maybe something like:
>
> In [80]: globals()['x'] = [1,2]
>
> In [81]: x
> Out[81]: [1, 2]
>
> Then you just have to merge the pickled dictionary with globals(). Like
> this:
>
> >>> globals().update(mystuff)
>
> where mystuff is the dictionary where you have your stuff. This could
> probably also go something like
>
> >>> globals().update(load(f))
>
> where f contains the pickled dictionary.
>

You could probably dump the entire environment from a subroutine,
cPickle.dump(globals(), f), which might be a good way to save everything
off, but not very efficient.

Chuck
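Putting the thread's pieces together, a minimal sketch of the two helpers
Keith described (the names savematrix/loadmatrix are his; keyword arguments
stand in for his positional a, b, c so the variable names survive the round
trip, and the namespace argument is the globals()-updating trick Chuck
outlines -- best kept to quick debugging sessions):

    import pickle

    def savematrix(filename, **named):
        # e.g. savematrix('dump.pkl', a=a, b=b)
        f = open(filename, 'wb')
        pickle.dump(named, f)
        f.close()

    def loadmatrix(filename, namespace):
        # e.g. loadmatrix('dump.pkl', globals()) re-creates a, b, ...
        # by merging the pickled dict into the given namespace
        f = open(filename, 'rb')
        namespace.update(pickle.load(f))
        f.close()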
From faltet at carabos.com  Fri Dec  1 04:54:36 2006
From: faltet at carabos.com (Francesc Altet)
Date: Fri, 1 Dec 2006 10:54:36 +0100
Subject: [Numpy-discussion] bare bones numpy extension code
In-Reply-To: <87d574n3v5.fsf@peds-pc311.bsd.uchicago.edu>
References: <87d574n3v5.fsf@peds-pc311.bsd.uchicago.edu>
Message-ID: <200612011054.36567.faltet@carabos.com>

On Thursday 30 November 2006 at 20:14, John Hunter wrote:
> A colleague of mine wants to write some numpy extension code. I
> pointed him to lots of examples in the matplotlib src dir, but the
> build environment is more complicated than he needs with all the
> numpy/numeric/numarray switches, etc. Does someone have the basic
> "hello world" of numpy extensions that includes src code and a basic
> setup.py that I can pass on to him. It might be nice to include
> something like that in a numpy "examples" directory.

Hi,

In case your colleague is going to use Pyrex to do her extensions (which
I do recommend, especially for new users), you can find some simple but
nice examples in the numpy/doc/pyrex/ directory of the NumPy
distribution.

Also interesting for beginners is:

http://www.scipy.org/Cookbook/Pyrex_and_NumPy

HTH,

--
>0,0<   Francesc Altet     http://www.carabos.com/
 V V    Cárabos Coop. V.   Enjoy Data
 "-"

From lroubeyrie at limair.asso.fr  Fri Dec  1 06:19:42 2006
From: lroubeyrie at limair.asso.fr (Lionel Roubeyrie)
Date: Fri, 1 Dec 2006 12:19:42 +0100
Subject: [Numpy-discussion] subclass
Message-ID: <200612011219.42733.lroubeyrie@limair.asso.fr>

Hi all,
is it possible to subclass numpy.array to add extra functionality and
change the behavior of other methods? I can't find any docs on that.
Thanks

--
Lionel Roubeyrie - lroubeyrie at limair.asso.fr
LIMAIR
http://www.limair.asso.fr

From pgmdevlist at gmail.com  Fri Dec  1 06:58:02 2006
From: pgmdevlist at gmail.com (Pierre GM)
Date: Fri, 1 Dec 2006 06:58:02 -0500
Subject: [Numpy-discussion] subclass
In-Reply-To: <200612011219.42733.lroubeyrie@limair.asso.fr>
References: <200612011219.42733.lroubeyrie@limair.asso.fr>
Message-ID: <200612010658.02749.pgmdevlist@gmail.com>

On Friday 01 December 2006 06:19, Lionel Roubeyrie wrote:
> Hi all,
> is it possible to subclass numpy.array to add extra functionality and
> change the behavior of other methods? I can't find any docs on that.

Did you really look ;) ?
http://www.scipy.org/Subclasses

Check also the new implementation of MaskedArray, where masked arrays
are implemented as a subclass of ndarray:
http://projects.scipy.org/scipy/numpy/wiki/MaskedArray.

If you have some particular features in mind, let us know.
P.

From tgrav at mac.com  Fri Dec  1 07:38:38 2006
From: tgrav at mac.com (Tommy Grav)
Date: Fri, 1 Dec 2006 07:38:38 -0500
Subject: [Numpy-discussion] ScipySuperpack for Mac (PowerPC)
Message-ID: <590C35C8-95B4-480C-8B08-92AEC624E27C@mac.com>

I installed the Mac ScipySuperpack (from http://www.scipy.org/Download).
However it seems that the version of matplotlib in there is not
compatible with their version of numpy

[tgrav@******] ch2/pbcd -> python
ActivePython 2.4.3 Build 11 (ActiveState Software Inc.) based on
Python 2.4.3 (#1, Apr 3 2006, 18:07:18)
[GCC 3.3 20030304 (Apple Computer, Inc. build 1666)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy
>>> import pylab
>>> x = range(1,10)
>>> y = range(1,10)
>>> pylab.plot(x,y)
[]
>>> pylab.show()
alloc: invalid block: 0xa08bcd8: a 68 0
Abort
[tgrav@*****] ch2/pbcd ->

Anyone know how to fix this?

Cheers
Tommy

From faltet at carabos.com  Fri Dec  1 07:54:38 2006
From: faltet at carabos.com (Francesc Altet)
Date: Fri, 1 Dec 2006 13:54:38 +0100
Subject: [Numpy-discussion] subclass
In-Reply-To: <200612011219.42733.lroubeyrie@limair.asso.fr>
References: <200612011219.42733.lroubeyrie@limair.asso.fr>
Message-ID: <200612011354.39298.faltet@carabos.com>

On Friday 01 December 2006 at 12:19, Lionel Roubeyrie wrote:
> Hi all,
> is it possible to subclass numpy.array to add extra functionality and
> change the behavior of other methods? I can't find any docs on that.
> Thanks

If what you want is extending the functionality of ndarray at C level,
there is a complete section dedicated to this ('Subtyping the ndarray
in C', chapter 15) in Travis' Guide to NumPy [1].

[1] http://www.tramy.us/

Cheers,

--
>0,0<   Francesc Altet     http://www.carabos.com/
 V V    Cárabos Coop. V.   Enjoy Data
 "-"
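For reference, the Python-level route that the pages above describe comes
down to two special methods; a minimal sketch (the InfoArray name and its
single info attribute are made up for illustration, not Lionel's eventual
implementation):

    import numpy as np

    class InfoArray(np.ndarray):
        """ndarray subclass that carries one extra attribute around."""
        def __new__(cls, input_array, info=None):
            # view the input data as our subclass, then attach the attribute
            obj = np.asarray(input_array).view(cls)
            obj.info = info
            return obj

        def __array_finalize__(self, obj):
            # called for explicit construction, views and slices alike,
            # so the attribute survives operations like a[1:]
            self.info = getattr(obj, 'info', None)

    a = InfoArray([[1, 2, 3], [4, 5, 6]], info='test data')
    b = a[0]        # a view; b.info is 'test data' too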
From lroubeyrie at limair.asso.fr  Fri Dec  1 08:04:35 2006
From: lroubeyrie at limair.asso.fr (Lionel Roubeyrie)
Date: Fri, 1 Dec 2006 14:04:35 +0100
Subject: [Numpy-discussion] subclass
In-Reply-To: <200612010658.02749.pgmdevlist@gmail.com>
References: <200612011219.42733.lroubeyrie@limair.asso.fr>
 <200612010658.02749.pgmdevlist@gmail.com>
Message-ID: <200612011404.35872.lroubeyrie@limair.asso.fr>

Arg! I really didn't see that! thanks

On Friday 01 December 2006 at 12:58, Pierre GM wrote:
> On Friday 01 December 2006 06:19, Lionel Roubeyrie wrote:
> > Hi all,
> > is it possible to subclass numpy.array to add extra functionality and
> > change the behavior of other methods? I can't find any docs on that.
>
> Did you really look ;) ?
> http://www.scipy.org/Subclasses
>
> Check also the new implementation of MaskedArray, where masked arrays
> are implemented as a subclass of ndarray:
> http://projects.scipy.org/scipy/numpy/wiki/MaskedArray.
>
> If you have some particular features in mind, let us know.
> P.

--
Lionel Roubeyrie - lroubeyrie at limair.asso.fr
LIMAIR
http://www.limair.asso.fr

From lroubeyrie at limair.asso.fr  Fri Dec  1 08:20:31 2006
From: lroubeyrie at limair.asso.fr (Lionel Roubeyrie)
Date: Fri, 1 Dec 2006 14:20:31 +0100
Subject: [Numpy-discussion] subclass
In-Reply-To: <200612011354.39298.faltet@carabos.com>
References: <200612011219.42733.lroubeyrie@limair.asso.fr>
 <200612011354.39298.faltet@carabos.com>
Message-ID: <200612011420.31501.lroubeyrie@limair.asso.fr>

I'm looking to handle time series by associating dates with a masked
array, with no computations (sum, max, ...) done directly on the dates,
and with the possibility to search/select data entries by date.

On Friday 01 December 2006 at 13:54, Francesc Altet wrote:
> If what you want is extending the functionality of ndarray at C level,
> there is a complete section dedicated to this ('Subtyping the ndarray
> in C', chapter 15) in Travis' Guide to NumPy [1].
>
> [1] http://www.tramy.us/
>
> Cheers,

--
Lionel Roubeyrie - lroubeyrie at limair.asso.fr
LIMAIR
http://www.limair.asso.fr

From Chris.Barker at noaa.gov  Fri Dec  1 15:22:12 2006
From: Chris.Barker at noaa.gov (Chris Barker)
Date: Fri, 01 Dec 2006 12:22:12 -0800
Subject: [Numpy-discussion] setting the dtype for where...
Message-ID: <45708EF4.1070202@noaa.gov>

Hi all,

I'd like to set the data type for what numpy.where creates. For example:

import numpy as N

N.where(a >= 5, 5, 0)

creates an integer array, which makes sense.

N.where(a >= 5, 5.0, 0)

creates a float64 array, which also makes sense, but I'd like a float32
array, so I tried:

N.where(a >= 5, array(5.0, dtype=N.float32), 0)

but I got a float64 array again.

How can I get a float32 array? where doesn't take a dtype argument --
maybe it should?

numpy version 1.0

thanks,
-Chris

From robert.kern at gmail.com  Fri Dec  1 16:04:34 2006
From: robert.kern at gmail.com (Robert Kern)
Date: Fri, 01 Dec 2006 15:04:34 -0600
Subject: [Numpy-discussion] setting the dtype for where...
In-Reply-To: <45708EF4.1070202@noaa.gov>
References: <45708EF4.1070202@noaa.gov>
Message-ID: <457098E2.6070307@gmail.com>

Chris Barker wrote:
> Hi all,
>
> I'd like to set the data type for what numpy.where creates. For example:
>
> import numpy as N
>
> N.where(a >= 5, 5, 0)
>
> creates an integer array, which makes sense.
>
> N.where(a >= 5, 5.0, 0)
>
> creates a float64 array, which also makes sense, but I'd like a float32
> array, so I tried:
>
> N.where(a >= 5, array(5.0, dtype=N.float32), 0)
>
> but I got a float64 array again.

Well, it's consistent with all of the other coercion rules:

In [6]: (array(5.0, dtype=float32) + 0).dtype
Out[6]: dtype('float64')

float64 is the lowest floating point dtype that can hold the full range
of int32 values (much less int64) without losing precision. Since both
operands ("coercands"?) are scalars, they both get a say in the final
dtype (unlike a full array being coerced together with a scalar; only
the array gets a say).

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco
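Two quick follow-on checks that make the distinction visible, continuing
Robert's session (a sketch; the values are arbitrary):

In [7]: (array([5.0], dtype=float32) + 0).dtype   # 1-d array: the array wins
Out[7]: dtype('float32')

In [8]: (float32(5.0) + float32(0)).dtype         # two float32 scalars
Out[8]: dtype('float32')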
From kwgoodman at gmail.com  Fri Dec  1 16:46:39 2006
From: kwgoodman at gmail.com (Keith Goodman)
Date: Fri, 1 Dec 2006 13:46:39 -0800
Subject: [Numpy-discussion] nan functions convert matrix to array
Message-ID:

The first line of the nan functions (such as nansum, nanmin, nanmax) is

y = array(a)

That leads to matrix in, array out.

Is there some way to make it matrix in, matrix out?

Here, for example, is nansum:

def nansum(a, axis=None):
    """Sum the array over the given axis, treating NaNs as 0.
    """
    y = array(a)
    if not issubclass(y.dtype.type, _nx.integer):
        y[isnan(a)] = 0
    return y.sum(axis)

From mattknox_ca at hotmail.com  Fri Dec  1 17:07:15 2006
From: mattknox_ca at hotmail.com (Matt Knox)
Date: Fri, 1 Dec 2006 17:07:15 -0500
Subject: [Numpy-discussion] efficient way to get first index of first
 masked/non-masked value in a masked array
Message-ID:

all I can come up with is dumb brute force methods by iterating through
all the values. Anyone got any tricks I can use?

Thanks,

- Matt Knox

From robert.kern at gmail.com  Fri Dec  1 17:13:41 2006
From: robert.kern at gmail.com (Robert Kern)
Date: Fri, 01 Dec 2006 16:13:41 -0600
Subject: [Numpy-discussion] efficient way to get first index of first
 masked/non-masked value in a masked array
In-Reply-To:
References:
Message-ID: <4570A915.90206@gmail.com>

Matt Knox wrote:
> all I can come up with is dumb brute force methods by iterating through
> all the values. Anyone got any tricks I can use?

import numpy as np

def first_masked(m):
    idx = np.where(m.mask)[0]
    if len(idx) != 0:
        return idx[0]
    else:
        raise ValueError("no masked data")

first_unmasked() is left as an exercise for the reader.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

From pgmdevlist at gmail.com  Fri Dec  1 17:42:22 2006
From: pgmdevlist at gmail.com (Pierre GM)
Date: Fri, 1 Dec 2006 17:42:22 -0500
Subject: [Numpy-discussion] nan functions convert matrix to array
In-Reply-To:
References:
Message-ID: <200612011742.23145.pgmdevlist@gmail.com>

On Friday 01 December 2006 16:46, Keith Goodman wrote:
> The first line of the nan functions (such as nansum, nanmin, nanmax) is
> Is there some way to make it matrix in, matrix out?

Quick workaround: Overwrite these functions with your own, where 'array'
or 'asarray' in the first line is replaced by 'asanyarray'.
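Spelled out, that workaround looks something like this (a sketch of a
drop-in replacement, not the numpy source; the copy() matters because
asanyarray, unlike array, does not copy its input):

    import numpy as N

    def nansum(a, axis=None):
        """Like numpy's nansum, but matrix in gives matrix out."""
        y = N.asanyarray(a).copy()      # keeps the subclass (e.g. matrix)
        if not issubclass(y.dtype.type, N.integer):
            y[N.isnan(y)] = 0           # zero out the NaNs before summing
        return y.sum(axis)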
From Chris.Barker at noaa.gov  Fri Dec  1 18:55:41 2006
From: Chris.Barker at noaa.gov (Christopher Barker)
Date: Fri, 01 Dec 2006 15:55:41 -0800
Subject: [Numpy-discussion] setting the dtype for where...
In-Reply-To: <457098E2.6070307@gmail.com>
References: <45708EF4.1070202@noaa.gov> <457098E2.6070307@gmail.com>
Message-ID: <4570C0FD.7060003@noaa.gov>

Robert Kern wrote:
> Well, it's consistent with all of the other coercion rules:
>
> In [6]: (array(5.0, dtype=float32) + 0).dtype
> Out[6]: dtype('float64')

duh! of course. If I use a float32 scalar for BOTH the operands, then I
get a float32 array out.

Thanks,
-Chris

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov

From thibault at physics.cornell.edu  Fri Dec  1 23:09:32 2006
From: thibault at physics.cornell.edu (Pierre Thibault)
Date: Fri, 1 Dec 2006 23:09:32 -0500
Subject: [Numpy-discussion] numpy.fft.rfftn -> non-contiguous
Message-ID: <1b1c766f0612012009q3ed87bd5i8b084915a6deda72@mail.gmail.com>

Hello!

I'm a little confused about what rfftn is doing: It seems to me that the
best would be for it to return a C-contiguous array with the first
dimension reduced to half (plus one), so that one can easily obtain the
non-repeated slices. What I get is the following:

In [1]: from numpy import *

In [2]: a = random.rand(8,8,8)

In [3]: fa = fft.rfftn(a)

In [4]: fa.shape
Out[4]: (8, 8, 5)

In [5]: fa.flags
Out[5]:
  CONTIGUOUS : False
  FORTRAN : False
  OWNDATA : False
  WRITEABLE : True
  ALIGNED : True
  UPDATEIFCOPY : False

In [6]: fa.swapaxes(-1,-2).flags
Out[6]:
  CONTIGUOUS : True
  FORTRAN : False
  OWNDATA : False
  WRITEABLE : True
  ALIGNED : True
  UPDATEIFCOPY : False

What I would like to have is

fa.shape: (5,8,8)
fa.flags: CONTIGUOUS : True

So, now fa[0] is a contiguous block containing the data that is not
supposed to appear twice in the complex fft (that is, it would be nice
if fft.rfftn(a)[0] == fft.fftn(a)[0]). I tried playing with the axes
argument in rfftn, to no avail.

I can do without, but at least it looks kind of ugly to me that the
array returned by a built-in function is not contiguous.

By the way, I am using the official debian unstable numpy package. It
doesn't look like it is using fftw, and I don't know if any of this
behaviour would be different if I had compiled numpy by myself - I am
not even sure using fftw is an option with numpy.

Thanks for any answer/comments on that!

Pierre

--
Pierre Thibault
616 Clark Hall, Cornell University
(607) 255-5522

From arildna at stud.ntnu.no  Sun Dec  3 10:56:49 2006
From: arildna at stud.ntnu.no (=?ISO-8859-1?Q?Arild_B._N=E6ss?=)
Date: Sun, 03 Dec 2006 16:56:49 +0100
Subject: [Numpy-discussion] problems installing NumPy on OSX
In-Reply-To:
References: <20061104222219.5v1dj889q8cgsg8c@webmail.ntnu.no>
 <20061105091727.ouibxoltwk40cksg@webmail.ntnu.no>
 <20061105193259.lbxo4gkzk44wsw4c@webmail.ntnu.no>
Message-ID:

On 5 Nov 2006 at 20:26, Steve Lianoglou wrote:

>> I'm sorry, I was a tad too quick typing there. I meant to say "And do
>> I even need to [install Xcode] to run numpy?" Robert pointed out that
>> a lot of things mentioned in the install guide were necessary to run
>> scipy, but that you could run numpy without them.
>>
>> Therefore I was wondering if installing the newest Xcode package was
>> likely to fix the error message I am now getting when trying to
>> install numpy:
>
> I think Robert may have suggested to install the newest XCode because
> it will give you a newer gcc that can have a better chance compiling
> numpy correctly (or at least will remove another "unknown" to help
> find your true problem).
>
> Maybe there'd be some "Universal Binary-aware"ness that the old xcode
> gcc might be missing that you'll get w/ the new one and since Python
> 2.5 is universal, this might be it. Getting the new xcode would be
> the simplest part of the install anyway, so .. why not :-)
>
> -steve

Hi again,

I had to make do without numpy for what I was originally planning to use
it for, and I've been busy for a while, as well as fed up of not getting
this thing to work.
I've realized I'm really going to need it if I am to continue using
python though, so I've installed the new XCode and given it another try.

This gets me further, actually the installation seems to complete.
However, when I type

>> import Numeric

in Python, I get the usual ImportError: No module named Numeric.

>> import numpy

works, but

>>> a= array([[1,2,3],[4,5,6]])

tells me array is not defined. So the installation obviously hasn't
worked.

I'm sure some of you are as tired of hearing about this as I am of
writing about it, but I really have no idea what to do here. The
installation output in the terminal window is quite long, so I have only
copied in the parts that seem to contain some kind of error message (see
below). First it says that g77, f77, gfortran and f95 are missing, then
I've copied in a long part where there are a lot of small errors:
- a series of errors in configtests
- 4 instances of "nothing done with h_files=..."
- some more failing configtests with an "#error No _WIN32"

Hope somebody can help.

regards,
Arild Næss

-------------------------------------------------------------------------------------------

Could not locate executable g77
Could not locate executable f77
Could not locate executable gfortran
Could not locate executable f95

...

compile options: '-Inumpy/core/src -Inumpy/core/include
-I/Library/Frameworks/Python.framework/Versions/2.5/include/python2.5 -c'
gcc: _configtest.c
_configtest.c: In function 'main':
_configtest.c:4: error: 'isnan' undeclared (first use in this function)
_configtest.c:4: error: (Each undeclared identifier is reported only once
_configtest.c:4: error: for each function it appears in.)
_configtest.c: In function 'main':
_configtest.c:4: error: 'isnan' undeclared (first use in this function)
_configtest.c:4: error: (Each undeclared identifier is reported only once
_configtest.c:4: error: for each function it appears in.)
lipo: can't figure out the architecture type of: /var/tmp//cczgPhx0.out
_configtest.c: In function 'main':
_configtest.c:4: error: 'isnan' undeclared (first use in this function)
_configtest.c:4: error: (Each undeclared identifier is reported only once
_configtest.c:4: error: for each function it appears in.)
_configtest.c: In function 'main':
_configtest.c:4: error: 'isnan' undeclared (first use in this function)
_configtest.c:4: error: (Each undeclared identifier is reported only once
_configtest.c:4: error: for each function it appears in.)
lipo: can't figure out the architecture type of: /var/tmp//cczgPhx0.out
failure.
removing: _configtest.c _configtest.o
C compiler: gcc -arch ppc -arch i386 -isysroot
/Developer/SDKs/MacOSX10.4u.sdk -fno-strict-aliasing -Wno-long-double
-no-cpp-precomp -mno-fused-madd -fno-common -dynamic -DNDEBUG -g -O3

compile options: '-Inumpy/core/src -Inumpy/core/include
-I/Library/Frameworks/Python.framework/Versions/2.5/include/python2.5 -c'
gcc: _configtest.c
_configtest.c: In function 'main':
_configtest.c:4: error: 'isinf' undeclared (first use in this function)
_configtest.c:4: error: (Each undeclared identifier is reported only once
_configtest.c:4: error: for each function it appears in.)
_configtest.c: In function 'main':
_configtest.c:4: error: 'isinf' undeclared (first use in this function)
_configtest.c:4: error: (Each undeclared identifier is reported only once
_configtest.c:4: error: for each function it appears in.)
lipo: can't figure out the architecture type of: /var/tmp//ccEAgr9A.out
_configtest.c: In function 'main':
_configtest.c:4: error: 'isinf' undeclared (first use in this function)
_configtest.c:4: error: (Each undeclared identifier is reported only once
_configtest.c:4: error: for each function it appears in.)
_configtest.c: In function 'main':
_configtest.c:4: error: 'isinf' undeclared (first use in this function)
_configtest.c:4: error: (Each undeclared identifier is reported only once
_configtest.c:4: error: for each function it appears in.)
lipo: can't figure out the architecture type of: /var/tmp//ccEAgr9A.out
failure.
removing: _configtest.c _configtest.o
C compiler: gcc -arch ppc -arch i386 -isysroot
/Developer/SDKs/MacOSX10.4u.sdk -fno-strict-aliasing -Wno-long-double
-no-cpp-precomp -mno-fused-madd -fno-common -dynamic -DNDEBUG -g -O3

compile options: '-Inumpy/core/src -Inumpy/core/include
-I/Library/Frameworks/Python.framework/Versions/2.5/include/python2.5 -c'
gcc: _configtest.c
gcc _configtest.o -o _configtest
success!
removing: _configtest.c _configtest.o _configtest
adding 'build/src.macosx-10.3-fat-2.5/numpy/core/config.h' to sources.
executing numpy/core/code_generators/generate_array_api.py
adding 'build/src.macosx-10.3-fat-2.5/numpy/core/__multiarray_api.h' to sources.
creating build/src.macosx-10.3-fat-2.5/numpy/core/src
conv_template:> build/src.macosx-10.3-fat-2.5/numpy/core/src/scalartypes.inc
adding 'build/src.macosx-10.3-fat-2.5/numpy/core/src' to include_dirs.
conv_template:> build/src.macosx-10.3-fat-2.5/numpy/core/src/arraytypes.inc
numpy.core - nothing done with h_files=
['build/src.macosx-10.3-fat-2.5/numpy/core/src/scalartypes.inc',
'build/src.macosx-10.3-fat-2.5/numpy/core/src/arraytypes.inc',
'build/src.macosx-10.3-fat-2.5/numpy/core/config.h',
'build/src.macosx-10.3-fat-2.5/numpy/core/__multiarray_api.h']
building extension "numpy.core.umath" sources
adding 'build/src.macosx-10.3-fat-2.5/numpy/core/config.h' to sources.
executing numpy/core/code_generators/generate_ufunc_api.py
adding 'build/src.macosx-10.3-fat-2.5/numpy/core/__ufunc_api.h' to sources.
conv_template:> build/src.macosx-10.3-fat-2.5/numpy/core/src/umathmodule.c
adding 'build/src.macosx-10.3-fat-2.5/numpy/core/src' to include_dirs.
numpy.core - nothing done with h_files=
['build/src.macosx-10.3-fat-2.5/numpy/core/src/scalartypes.inc',
'build/src.macosx-10.3-fat-2.5/numpy/core/src/arraytypes.inc',
'build/src.macosx-10.3-fat-2.5/numpy/core/config.h',
'build/src.macosx-10.3-fat-2.5/numpy/core/__ufunc_api.h']
building extension "numpy.core._sort" sources
adding 'build/src.macosx-10.3-fat-2.5/numpy/core/config.h' to sources.
executing numpy/core/code_generators/generate_array_api.py
adding 'build/src.macosx-10.3-fat-2.5/numpy/core/__multiarray_api.h' to sources.
conv_template:> build/src.macosx-10.3-fat-2.5/numpy/core/src/_sortmodule.c
numpy.core - nothing done with h_files=
['build/src.macosx-10.3-fat-2.5/numpy/core/config.h',
'build/src.macosx-10.3-fat-2.5/numpy/core/__multiarray_api.h']
building extension "numpy.core.scalarmath" sources
adding 'build/src.macosx-10.3-fat-2.5/numpy/core/config.h' to sources.
executing numpy/core/code_generators/generate_array_api.py
adding 'build/src.macosx-10.3-fat-2.5/numpy/core/__multiarray_api.h' to sources.
executing numpy/core/code_generators/generate_ufunc_api.py
adding 'build/src.macosx-10.3-fat-2.5/numpy/core/__ufunc_api.h' to sources.
conv_template:> build/src.macosx-10.3-fat-2.5/numpy/core/src/scalarmathmodule.c
numpy.core - nothing done with h_files=
['build/src.macosx-10.3-fat-2.5/numpy/core/config.h',
'build/src.macosx-10.3-fat-2.5/numpy/core/__multiarray_api.h',
'build/src.macosx-10.3-fat-2.5/numpy/core/__ufunc_api.h']
building extension "numpy.core._dotblas" sources
adding 'numpy/core/blasdot/_dotblas.c' to sources.
building extension "numpy.lib._compiled_base" sources
building extension "numpy.numarray._capi" sources
building extension "numpy.fft.fftpack_lite" sources
building extension "numpy.linalg.lapack_lite" sources
creating build/src.macosx-10.3-fat-2.5/numpy/linalg
adding 'numpy/linalg/lapack_litemodule.c' to sources.
building extension "numpy.random.mtrand" sources
creating build/src.macosx-10.3-fat-2.5/numpy/random
C compiler: gcc -arch ppc -arch i386 -isysroot
/Developer/SDKs/MacOSX10.4u.sdk -fno-strict-aliasing -Wno-long-double
-no-cpp-precomp -mno-fused-madd -fno-common -dynamic -DNDEBUG -g -O3

compile options: '-Inumpy/core/src -Inumpy/core/include
-I/Library/Frameworks/Python.framework/Versions/2.5/include/python2.5 -c'
gcc: _configtest.c
_configtest.c:7:2: error: #error No _WIN32
_configtest.c:7:2: error: #error No _WIN32
lipo: can't figure out the architecture type of: /var/tmp//ccojlBrt.out
_configtest.c:7:2: error: #error No _WIN32
_configtest.c:7:2: error: #error No _WIN32
lipo: can't figure out the architecture type of: /var/tmp//ccojlBrt.out
failure.
removing: _configtest.c _configtest.o

From gael.varoquaux at normalesup.org  Sun Dec  3 11:00:16 2006
From: gael.varoquaux at normalesup.org (Gael Varoquaux)
Date: Sun, 3 Dec 2006 17:00:16 +0100
Subject: [Numpy-discussion] problems installing NumPy on OSX
In-Reply-To:
References: <20061104222219.5v1dj889q8cgsg8c@webmail.ntnu.no>
 <20061105091727.ouibxoltwk40cksg@webmail.ntnu.no>
 <20061105193259.lbxo4gkzk44wsw4c@webmail.ntnu.no>
Message-ID: <20061203160012.GG20167@clipper.ens.fr>

On Sun, Dec 03, 2006 at 04:56:49PM +0100, Arild B. Næss wrote:
> This gets me further, actually the installation seems to complete.
> However, when I type
> >> import Numeric
> in Python, I get the usual ImportError: No module named Numeric.

That's normal, you installed numpy.

> >> import numpy
> works, but
> >>> a= array([[1,2,3],[4,5,6]])
> tells me array is not defined.

Hmm, you can do either:

from numpy import *
a= array([[1,2,3],[4,5,6]])

or

import numpy
a= numpy.array([[1,2,3],[4,5,6]])

Gaël

From arildna at stud.ntnu.no  Sun Dec  3 11:36:04 2006
From: arildna at stud.ntnu.no (=?ISO-8859-1?Q?Arild_B._N=E6ss?=)
Date: Sun, 03 Dec 2006 17:36:04 +0100
Subject: [Numpy-discussion] problems installing NumPy on OSX
In-Reply-To: <20061203160012.GG20167@clipper.ens.fr>
References: <20061104222219.5v1dj889q8cgsg8c@webmail.ntnu.no>
 <20061105091727.ouibxoltwk40cksg@webmail.ntnu.no>
 <20061105193259.lbxo4gkzk44wsw4c@webmail.ntnu.no>
 <20061203160012.GG20167@clipper.ens.fr>
Message-ID: <53B5F7E9-C152-418C-B872-BD0331030875@stud.ntnu.no>

On 3 Dec 2006 at 17:00, Gael Varoquaux wrote:
> On Sun, Dec 03, 2006 at 04:56:49PM +0100, Arild B. Næss wrote:
>> This gets me further, actually the installation seems to complete.
>> However, when I type
>>>> import Numeric
>> in Python, I get the usual ImportError: No module named Numeric.
>
> That's normal, you installed numpy.

Hm, this doesn't work for you either?
It says on this page that "import Numeric" is a normal test:
http://numpy.scipy.org/numpydoc/numpy-3.html

>>> import numpy
> works, but
>>>> a= array([[1,2,3],[4,5,6]])
> tells me array is not defined.
>
> Hmm, you can do either:
>
> from numpy import *
> a= array([[1,2,3],[4,5,6]])
>
> or
>
> import numpy
> a= numpy.array([[1,2,3],[4,5,6]])

You got my hopes up for a second there, but I can do neither:

Python 2.5 (r25:51918, Sep 19 2006, 08:49:13)
[GCC 4.0.1 (Apple Computer, Inc. build 5341)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from numpy import *
Running from numpy source directory.
>>> a= array([[1,2,3],[4,5,6]])
Traceback (most recent call last):
  File "", line 1, in
NameError: name 'array' is not defined
>>> import numpy
>>> a = numpy.array([[1,2],[3,4]])
Traceback (most recent call last):
  File "", line 1, in
AttributeError: 'module' object has no attribute 'array'

regards,
Arild Næss

From erin.sheldon at gmail.com  Sun Dec  3 11:59:37 2006
From: erin.sheldon at gmail.com (Erin Sheldon)
Date: Sun, 3 Dec 2006 11:59:37 -0500
Subject: [Numpy-discussion] problems installing NumPy on OSX
In-Reply-To: <53B5F7E9-C152-418C-B872-BD0331030875@stud.ntnu.no>
References: <20061104222219.5v1dj889q8cgsg8c@webmail.ntnu.no>
 <20061105091727.ouibxoltwk40cksg@webmail.ntnu.no>
 <20061105193259.lbxo4gkzk44wsw4c@webmail.ntnu.no>
 <20061203160012.GG20167@clipper.ens.fr>
 <53B5F7E9-C152-418C-B872-BD0331030875@stud.ntnu.no>
Message-ID: <331116dc0612030859h1bff2553j646c457368449087@mail.gmail.com>

On 12/3/06, Arild B. Næss wrote:
>
> On 3 Dec 2006 at 17:00, Gael Varoquaux wrote:
> > On Sun, Dec 03, 2006 at 04:56:49PM +0100, Arild B. Næss wrote:
> > This gets me further, actually the installation seems to complete.
> > However, when I type
> > >> import Numeric
> > in Python, I get the usual ImportError: No module named Numeric.
> >
> > That's normal, you installed numpy.
>
> Hm, this doesn't work for you either? It says on this page that "import
> Numeric" is a normal test:
> http://numpy.scipy.org/numpydoc/numpy-3.html

That document looks out of date. If you just install numpy you
won't be able to import Numeric

-SNIP-

> You got my hopes up for a second there, but I can do neither:
>
> Python 2.5 (r25:51918, Sep 19 2006, 08:49:13)
> [GCC 4.0.1 (Apple Computer, Inc. build 5341)] on darwin
> Type "help", "copyright", "credits" or "license" for more information.
> >>> from numpy import *
> Running from numpy source directory.
> >>> a= array([[1,2,3],[4,5,6]])
> Traceback (most recent call last):
>   File "", line 1, in
> NameError: name 'array' is not defined

You will get this error running from the numpy source
directory. cd to somewhere else.

From pgmdevlist at gmail.com  Sun Dec  3 12:07:18 2006
From: pgmdevlist at gmail.com (Pierre GM)
Date: Sun, 3 Dec 2006 12:07:18 -0500
Subject: [Numpy-discussion] nan functions convert matrix to array
In-Reply-To:
References: <200612011742.23145.pgmdevlist@gmail.com>
Message-ID: <200612031207.19054.pgmdevlist@gmail.com>

On Friday 01 December 2006 17:56, Keith Goodman wrote:
...
> Would it break anything to change the first line of the nan functions from
> a = array(a)
> to
> a = asanyarray(a)
> ?

Seeing what the nan functions do, I don't think that would be a problem.
An exception would be raised if the operation could not be performed
anyway (like a N.sum on a record array). But I'm no judge, so I'll let
the powers in place decide of that.
Along the same lines, I'm bumping a post of mine: would it be possible to
get 'asanyarray' in 'apply_along_axis', 'apply_over_axes', 'vectorize'?

From lists.steve at arachnedesign.net  Sun Dec  3 12:23:35 2006
From: lists.steve at arachnedesign.net (Steve Lianoglou)
Date: Sun, 3 Dec 2006 12:23:35 -0500
Subject: [Numpy-discussion] problems installing NumPy on OSX
In-Reply-To: <331116dc0612030859h1bff2553j646c457368449087@mail.gmail.com>
References: <20061104222219.5v1dj889q8cgsg8c@webmail.ntnu.no>
 <20061105091727.ouibxoltwk40cksg@webmail.ntnu.no>
 <20061105193259.lbxo4gkzk44wsw4c@webmail.ntnu.no>
 <20061203160012.GG20167@clipper.ens.fr>
 <53B5F7E9-C152-418C-B872-BD0331030875@stud.ntnu.no>
 <331116dc0612030859h1bff2553j646c457368449087@mail.gmail.com>
Message-ID:

>> You got my hopes up for a second there, but I can do neither:
>>
>> Python 2.5 (r25:51918, Sep 19 2006, 08:49:13)
>> [GCC 4.0.1 (Apple Computer, Inc. build 5341)] on darwin
>> Type "help", "copyright", "credits" or "license" for more
>> information.
>>>>> from numpy import *
>> Running from numpy source directory.
>>>>> a= array([[1,2,3],[4,5,6]])
>> Traceback (most recent call last):
>> File "", line 1, in
>> NameError: name 'array' is not defined
>
> You will get this error running from the numpy source
> directory. cd to somewhere else.

Lastly .. you don't have to "guess" (too much) if numpy installed
correctly.

Once you're not running from the source dir (I just checked here, it
completely doesn't work when I'm in the source dir also), run numpy's
test suite:

[you@/Users/yourhomedir] $ python

In [1]: import numpy
In [2]: numpy.test(1,1)

You should see the tests fly by and finally get something like:

----------------------------------------------------------------------
Ran 517 tests in 0.450s

OK
Out[2]:

-steve

From arildna at stud.ntnu.no  Sun Dec  3 14:19:01 2006
From: arildna at stud.ntnu.no (=?ISO-8859-1?Q?Arild_B._N=E6ss?=)
Date: Sun, 03 Dec 2006 20:19:01 +0100
Subject: [Numpy-discussion] problems installing NumPy on OSX
In-Reply-To:
References: <20061104222219.5v1dj889q8cgsg8c@webmail.ntnu.no>
 <20061105091727.ouibxoltwk40cksg@webmail.ntnu.no>
 <20061105193259.lbxo4gkzk44wsw4c@webmail.ntnu.no>
 <20061203160012.GG20167@clipper.ens.fr>
 <53B5F7E9-C152-418C-B872-BD0331030875@stud.ntnu.no>
 <331116dc0612030859h1bff2553j646c457368449087@mail.gmail.com>
Message-ID: <680A53F2-B763-4A6D-99A1-83005F5CB1DA@stud.ntnu.no>

On 3 Dec 2006 at 18:23, Steve Lianoglou wrote:

>>> You got my hopes up for a second there, but I can do neither:
>>>
>>> Python 2.5 (r25:51918, Sep 19 2006, 08:49:13)
>>> [GCC 4.0.1 (Apple Computer, Inc. build 5341)] on darwin
>>> Type "help", "copyright", "credits" or "license" for more
>>> information.
>>>>>> from numpy import *
>>> Running from numpy source directory.
>>>>>> a= array([[1,2,3],[4,5,6]])
>>> Traceback (most recent call last):
>>> File "", line 1, in
>>> NameError: name 'array' is not defined
>>
>> You will get this error running from the numpy source
>> directory. cd to somewhere else.
>
> Lastly .. you don't have to "guess" (too much) if numpy installed
> correctly.
>
> Once you're not running from the source dir (I just checked here, it
> completely doesn't work when I'm in the source dir also), run numpy's
> test suite:
>
> [you@/Users/yourhomedir] $ python
>
> In [1]: import numpy
> In [2]: numpy.test(1,1)
>
> You should see the tests fly by and finally get something like:
>
> ----------------------------------------------------------------------
> Ran 517 tests in 0.450s
>
> OK
> Out[2]:

It seems running from the source dir has been the main problem all
along. It works fine outside (I guess).

I get one error in the test Steve recommends though. But hey, 519 out
of 520 ain't so bad, is it?

regards,
Arild Næss

======================================================================
FAIL: Ticket #112
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/core/tests/test_regression.py",
line 220, in check_longfloat_repr
    assert(str(a)[1:9] == str(a[0])[:8])
AssertionError

----------------------------------------------------------------------
Ran 520 tests in 8.141s

FAILED (failures=1)

From mforbes at phys.washington.edu  Mon Dec  4 00:27:08 2006
From: mforbes at phys.washington.edu (Michael McNeil Forbes)
Date: Sun, 03 Dec 2006 21:27:08 -0800
Subject: [Numpy-discussion] take semantics (bug?)
Message-ID:

What are the semantics of the "take" function?

I would have expected that the following have the same shape and size:

>>> a = array([1,2,3])
>>> inds = a.nonzero()
>>> a[inds]
array([1, 2, 3])
>>> a.take(inds)
array([[1, 2, 3]])

Is there a bug somewhere here or is this intentional?

Michael.

From robert.kern at gmail.com  Mon Dec  4 00:41:48 2006
From: robert.kern at gmail.com (Robert Kern)
Date: Sun, 03 Dec 2006 23:41:48 -0600
Subject: [Numpy-discussion] take semantics (bug?)
In-Reply-To:
References:
Message-ID: <4573B51C.7050209@gmail.com>

Michael McNeil Forbes wrote:
> What are the semantics of the "take" function?
>
> I would have expected that the following have the same shape and size:
>
>>>> a = array([1,2,3])
>>>> inds = a.nonzero()
>>>> a[inds]
> array([1, 2, 3])
>>>> a.take(inds)
> array([[1, 2, 3]])
>
> Is there a bug somewhere here or is this intentional?

It's a result of a.nonzero() returning a tuple.

In [3]: a.nonzero()
Out[3]: (array([0, 1, 2]),)

__getitem__ interprets tuples specially: a[1,2,3] == a[(1,2,3)], also
a[0,] == a[0]. .take() doesn't; it simply tries to convert its argument
into an array. It can convert (array([0, 1, 2]),) into
array([[0, 1, 2]]), so it does.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco
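A quick check, continuing Michael's session: unwrapping the tuple that
nonzero() returns restores the expected shape.

>>> a.take(inds[0])
array([1, 2, 3])
>>> a.take(inds[0]).shape == a[inds].shape
True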
From Chris.Barker at noaa.gov  Mon Dec  4 12:09:56 2006
From: Chris.Barker at noaa.gov (Christopher Barker)
Date: Mon, 04 Dec 2006 09:09:56 -0800
Subject: [Numpy-discussion] problems installing NumPy on OSX
In-Reply-To: <680A53F2-B763-4A6D-99A1-83005F5CB1DA@stud.ntnu.no>
References: <20061104222219.5v1dj889q8cgsg8c@webmail.ntnu.no>
 <20061105091727.ouibxoltwk40cksg@webmail.ntnu.no>
 <20061105193259.lbxo4gkzk44wsw4c@webmail.ntnu.no>
 <20061203160012.GG20167@clipper.ens.fr>
 <53B5F7E9-C152-418C-B872-BD0331030875@stud.ntnu.no>
 <331116dc0612030859h1bff2553j646c457368449087@mail.gmail.com>
 <680A53F2-B763-4A6D-99A1-83005F5CB1DA@stud.ntnu.no>
Message-ID: <45745664.8050301@noaa.gov>

Arild B. Næss wrote:
> It seems running from the source dir has been the main problem all
> along. It works fine outside (I guess).

I'm glad you got it working.

> FAIL: Ticket #112
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/core/tests/test_regression.py",
> line 220, in check_longfloat_repr
>     assert(str(a)[1:9] == str(a[0])[:8])
> AssertionError

I think that's a known error, that has to do with 64-bit G5 issues, and
IIRC, it's a test error, not a real bug.

Good luck!

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov

From Chris.Barker at noaa.gov  Mon Dec  4 12:18:53 2006
From: Chris.Barker at noaa.gov (Christopher Barker)
Date: Mon, 04 Dec 2006 09:18:53 -0800
Subject: [Numpy-discussion] Forward from PIL list...
Message-ID: <4574587D.7090907@noaa.gov>

Hi all,

PIL 1.1.6 final was just released, which includes support for the numpy
array interface. However, this just came out on the list:

Zachary Pincus wrote:
> - The 'fromarray' command is a bit broken in Image.py:
> Specifically, the following stanza is incorrect --
>     if mode is None:
>         typestr = arr['typestr']
>         if not (typestr[0] == '|' or typestr[0] == _ENDIAN or
>                 typestr[1:] not in ['u1', 'b1', 'i4', 'f4']):
>             raise TypeError("cannot handle data-type")
>         typestr = typestr[:2]
>         if typestr == 'i4':
>             ...
>
> The error is that 'typestr = typestr[:2]' should instead be 'typestr
> = typestr[1:]'

that code was contributed by the numpy developers, but it sure looks
broken: for a typestr like '<f4', typestr[:2] keeps '<f' where the
comparisons that follow need 'f4'. hmm. Maybe someone who knows better
than me can send a note to the PIL list.

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov

From jameseflowers1000 at yahoo.com  Mon Dec  4 18:04:02 2006
From: jameseflowers1000 at yahoo.com (James Flowers)
Date: Mon, 4 Dec 2006 15:04:02 -0800 (PST)
Subject: [Numpy-discussion] Overlapping copy with object_ arrays
Message-ID: <20061204230402.39447.qmail@web37210.mail.mud.yahoo.com>

Hello,

Having a problem with overlapping copies. Memory being freed twice ???
See below:

ActivePython 2.4.3 Build 11 (ActiveState Software Inc.) based on
Python 2.4.3 (#1, Apr 3 2006, 18:07:14)
[GCC 4.0.1 (Apple Computer, Inc. build 5247)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy
>>> print numpy.__version__
1.0.1
>>> x = numpy.zeros(10, numpy.object_)
>>> x[:] = [],          # set the array to empty lists
>>> x[0] is x[1]        # everyone is of course identical
True
>>> x[3:-1] = x[4:]     # overlapping copy
>>> x                   # all is right in the universe
array([[], [], [], [], [], [], [], [], [], []], dtype=object)
>>> for i in range(10): x[i] = []   # set the array with a loop
...
>>> x[0] is x[1]        # everyone is of course different
False
>>> x[3:-1] = x[4:]     # overlapping copy
>>> x   # oops, situation not OK, heap apparently corrupted by overlapping copy
Bus error

Jim
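Until the underlying bug is fixed, one possible way around the crash Jim
reports is to break the overlap by copying the right-hand side first (a
guess, not verified against that exact build):

>>> x[3:-1] = x[4:].copy()   # temporary buffer, so source and
...                          # destination no longer overlap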
From david.bogen at icecube.wisc.edu  Tue Dec  5 11:07:28 2006
From: david.bogen at icecube.wisc.edu (David Bogen)
Date: Tue, 05 Dec 2006 10:07:28 -0600
Subject: [Numpy-discussion] Numpy and Python 2.2 on RHEL3
Message-ID: <45759940.8080000@icecube.wisc.edu>

All:

Is it possible to build Numpy using Python 2.2? I haven't been able to
find anything that explicitly lists the versions of Python with which
Numpy functions so I've been working under the assumption that the two
bits will mesh together somehow.

When I try to build Numpy 1.0.1 on RedHat Enterprise Linux 3 using
Python 2.2.3, I get the following error:

$ /usr/bin/python2.2 setup.py build
Running from numpy source directory.
Traceback (most recent call last):
  File "setup.py", line 89, in ?
    setup_package()
  File "setup.py", line 59, in setup_package
    from numpy.distutils.core import setup
  File "numpy/distutils/__init__.py", line 5, in ?
    import ccompiler
  File "numpy/distutils/ccompiler.py", line 11, in ?
    import log
  File "numpy/distutils/log.py", line 4, in ?
    from distutils.log import *
ImportError: No module named log

Through extensive trial and error I've been able to hack the distutils
files enough to make that error go away, but then I start getting an
error describing an invalid syntax with the directive "yield os.path"
which seems to be a deeper, more complex error to fix.

Am I attempting the impossible here or am I just doing something
fundamentally and obviously wrong?

David

--
David Bogen :: (608) 263-0168
Unix SysAdmin :: IceCube Project
david.bogen at icecube.wisc.edu

From wfspotz at sandia.gov  Tue Dec  5 11:15:33 2006
From: wfspotz at sandia.gov (Bill Spotz)
Date: Tue, 5 Dec 2006 09:15:33 -0700
Subject: [Numpy-discussion] Numpy and Python 2.2 on RHEL3
In-Reply-To: <45759940.8080000@icecube.wisc.edu>
References: <45759940.8080000@icecube.wisc.edu>
Message-ID: <0C449298-28F7-4D04-9966-24487F0E5F61@sandia.gov>

From http://docs.python.org/ref/yield.html

you might try

    from __future__ import generators

On Dec 5, 2006, at 9:07 AM, David Bogen wrote:

> All:
>
> Is it possible to build Numpy using Python 2.2? I haven't been able to
> find anything that explicitly lists the versions of Python with which
> Numpy functions so I've been working under the assumption that the two
> bits will mesh together somehow.
>
> When I try to build Numpy 1.0.1 on RedHat Enterprise Linux 3 using
> Python 2.2.3, I get the following error:
>
> $ /usr/bin/python2.2 setup.py build
> Running from numpy source directory.
> Traceback (most recent call last):
>   File "setup.py", line 89, in ?
>     setup_package()
>   File "setup.py", line 59, in setup_package
>     from numpy.distutils.core import setup
>   File "numpy/distutils/__init__.py", line 5, in ?
>     import ccompiler
>   File "numpy/distutils/ccompiler.py", line 11, in ?
>     import log
>   File "numpy/distutils/log.py", line 4, in ?
>     from distutils.log import *
> ImportError: No module named log
>
> Through extensive trial and error I've been able to hack the distutils
> files enough to make that error go away, but then I start getting an
> error describing an invalid syntax with the directive "yield os.path"
> which seems to be a deeper, more complex error to fix.
>
> Am I attempting the impossible here or am I just doing something
> fundamentally and obviously wrong?
> David
>
> --
> David Bogen :: (608) 263-0168
> Unix SysAdmin :: IceCube Project
> david.bogen at icecube.wisc.edu

** Bill Spotz                                              **
** Sandia National Laboratories  Voice: (505)845-0170      **
** P.O. Box 5800                 Fax:   (505)284-5451      **
** Albuquerque, NM 87185-0370    Email: wfspotz at sandia.gov **

From david.bogen at icecube.wisc.edu  Tue Dec  5 11:23:10 2006
From: david.bogen at icecube.wisc.edu (David Bogen)
Date: Tue, 05 Dec 2006 10:23:10 -0600
Subject: [Numpy-discussion] Numpy and Python 2.2 on RHEL3
In-Reply-To: <0C449298-28F7-4D04-9966-24487F0E5F61@sandia.gov>
References: <45759940.8080000@icecube.wisc.edu>
 <0C449298-28F7-4D04-9966-24487F0E5F61@sandia.gov>
Message-ID: <45759CEE.6070702@icecube.wisc.edu>

Bill Spotz wrote:
>
> you might try
>
>     from __future__ import generators

Some research did turn up that alternative, but then I started getting
this error:

$ /usr/bin/python2.2 setup.py build
Running from numpy source directory.
Traceback (most recent call last):
  File "setup.py", line 90, in ?
    setup_package()
  File "setup.py", line 60, in setup_package
    from numpy.distutils.core import setup
  File "numpy/distutils/__init__.py", line 5, in ?
    import ccompiler
  File "numpy/distutils/ccompiler.py", line 12, in ?
    from exec_command import exec_command
  File "numpy/distutils/exec_command.py", line 56, in ?
    from numpy.distutils.misc_util import is_sequence
  File "numpy/distutils/misc_util.py", line 12, in ?
    from sets import Set as set
ImportError: No module named sets

Given the number of walls I was hitting, it just seemed that I was
traveling down the wrong path.

David

--
David Bogen :: (608) 263-0168
Unix SysAdmin :: IceCube Project
david.bogen at icecube.wisc.edu

From oliphant.travis at ieee.org  Fri Dec  1 15:33:45 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Fri, 01 Dec 2006 13:33:45 -0700
Subject: [Numpy-discussion] setting the dtype for where...
In-Reply-To: <45708EF4.1070202@noaa.gov>
References: <45708EF4.1070202@noaa.gov>
Message-ID: <457091A9.1030501@ieee.org>

Chris Barker wrote:
> Hi all,
>
> I'd like to set the data type for what numpy.where creates. For example:
>
> import numpy as N
>
> N.where(a >= 5, 5, 0)
>
> creates an integer array, which makes sense.
>
> N.where(a >= 5, 5.0, 0)
>
> creates a float64 array, which also makes sense, but I'd like a float32
> array, so I tried:
>
> N.where(a >= 5, array(5.0, dtype=N.float32), 0)
>
> but I got a float64 array again.
>
> How can I get a float32 array? where doesn't take a dtype argument --
> maybe it should?
>

You need to do

N.where(a >= 5, N.float32(5), N.float32(0))

The rules are the same as for ufuncs: The returned array for mixed-type
operations uses the "largest" type unless one is a scalar and one is an
array (then the scalar is ignored unless the "kind" is different). In
this case, you have two scalars (a 0-d array is considered a scalar in
this context).

-Travis

From fsenkel at verizon.net  Sun Dec  3 22:28:00 2006
From: fsenkel at verizon.net (fsenkel at verizon.net)
Date: Sun, 03 Dec 2006 21:28:00 -0600 (CST)
Subject: [Numpy-discussion] How to speed up this function?
Message-ID: <5263831.1901231165202880646.JavaMail.root@vms062.mailsrvcs.net>

Hello,

I'm taking a CFD class, one of the codes I wrote runs very slow. When I
look at hotshot it says the function below is the problem. Since this is
an explicit step, the for loops are only traversed once, so I think it's
caused by memory usage, but I'm not sure if it's the local variables or
the loop? I can vectorize the inner loop, would declaring the data
structures in the calling routine and passing them in be a better idea
than using local storage?

I'm new at python and numpy, I need to look at how to get profiling
information for the lines within a function.

Thank you,

Frank

PS
I tried to post this via google groups, but it didn't seem to go through,
sorry if it ends up as multiple postings

def findw(wnext,wprior,phiprior,uprior,vprior):
    #format here is x[i,j] where i's are rows, j's columns, use flipud()
    #to get the print out consistent with the spacial up-down directions

    #assign local names that are more
    #inline with the class notation
    w = wprior
    phi = phiprior
    u = uprior
    v = vprior

    #three of the BC are known so just set them
    #symetry plane
    wnext[0,0:gcols] = 0.0

    #upper wall
    wnext[gN,0:gcols] = 2.0/gdy**2 * (phi[gN,0:gcols] - phi[gN-1,0:gcols])

    #inlet, off the walls
    wnext[1:grows-1,0] = 0.0

    upos = where(u>0)
    vpos = where(v>0)

    Sx = ones_like(u)
    Sx[upos] = 0.0

    Sy = ones_like(v)
    Sy[vpos] = 0.0

    uw = u*w
    vw = v*w

    #interior nodes
    for j in range(1,gasizej-1):
        for i in range(1,gasizei-1):

            wnext[i,j] =( w[i,j] + gnu*gdt/gdx**2 * (w[i,j-1] - 2.0*w[i,j] + w[i,j+1]) +
                          gnu*gdt/gdy**2 * (w[i-1,j] - 2.0*w[i,j] + w[i+1,j]) -
                          (1.0 - Sx[i,j]) * gdt/gdx * (uw[i,j] - uw[i,j-1]) -
                          Sx[i,j] * gdt/gdx * (uw[i,j+1] - uw[i,j]) -
                          (1.0 - Sy[i,j]) * gdt/gdy * (vw[i,j] - vw[i-1,j]) -
                          Sy[i,j] * gdt/gdy * (vw[i+1,j] - vw[i,j]) )

##            print "***wnext****"
##            print "i: ", i, "j: ", j, "wnext[i,j]: ", wnext[i,j]

    #final BC at outlet, off walls
    wnext[1:grows-1,gM] = wnext[1:grows-1,gM-1]

From oliphant.travis at ieee.org  Sun Dec  3 23:11:53 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Sun, 03 Dec 2006 21:11:53 -0700
Subject: [Numpy-discussion] problems installing NumPy on OSX
In-Reply-To: <680A53F2-B763-4A6D-99A1-83005F5CB1DA@stud.ntnu.no>
References: <20061104222219.5v1dj889q8cgsg8c@webmail.ntnu.no>
 <20061105091727.ouibxoltwk40cksg@webmail.ntnu.no>
 <20061105193259.lbxo4gkzk44wsw4c@webmail.ntnu.no>
 <20061203160012.GG20167@clipper.ens.fr>
 <53B5F7E9-C152-418C-B872-BD0331030875@stud.ntnu.no>
 <331116dc0612030859h1bff2553j646c457368449087@mail.gmail.com>
 <680A53F2-B763-4A6D-99A1-83005F5CB1DA@stud.ntnu.no>
Message-ID: <4573A009.3070505@ieee.org>

> It seems running from the source dir has been the main problem all
> along. It works fine outside (I guess).
>
> I get one error in the test Steve recommends though. But hey, 519 out
> of 520 ain't so bad, is it?
>

Don't worry about the failing test. It's a bad test on your platform.
If you don't use long doubles you won't care.

-Travis

From koara at atlas.cz  Mon Dec  4 15:37:05 2006
From: koara at atlas.cz (koara at atlas.cz)
Date: Mon, 04 Dec 2006 21:37:05 +0100
Subject: [Numpy-discussion] ValueError: dimensions too large.
Message-ID: <18eda9c6e3b5467cb0bc95aea4c7b518@atlas.cz>

Hello, I tried to create a 2d array, but encountered:

ValueError: dimensions too large.

Does this refer to insufficient memory, or is there really a limit on
dimension sizes? Cheers.
From charlesr.harris at gmail.com Tue Dec 5 13:02:17 2006 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 5 Dec 2006 11:02:17 -0700 Subject: [Numpy-discussion] dot operations on multidimensional arrays In-Reply-To: <45655665.7020302@fysik.dtu.dk> References: <45655665.7020302@fysik.dtu.dk> Message-ID: On 11/23/06, Carsten Rostgaard wrote: > > Hi! > I am trying to use the "dot" method on multi-(more than 2)-dimensional > arrays. > > Specifically I do > >> y = dot(a, b) > where a is a 2D array and b is a 3D array. > > using numpy I get the the help: > " > dot(...) > dot(a,v) returns matrix-multiplication between a and b. > The product-sum is over the last dimension of a and the > second-to-last dimension of b. > " > I then expect that > >> y[i, j, k] = sum(a[i, :] * b[j, :, k]) > which is actually what I get. > > The question is then: > 1) Is there any way to change the axis for which the product-sum is > performed. This can of course be done by a swapaxis before and after the > operation, but this makes the array non-contiguous, in which case the > dot operation often makes bugs (at least in Numeric). > 2) For complicated reasons we still use Numeric in our software package, > and in this, "dot" behaves very strangely. > According to the Numeric help: In Numpy tensordot(a, b, axes=2) tensordot returns the product for any (ndim >= 1) arrays. r_{xxx, yyy} = \sum_k a_{xxx,k} b_{k,yyy} where the axes to be summed over are given by the axes argument. the first element of the sequence determines the axis or axes in arr1 to sum over, and the second element in axes argument sequence determines the axis or axes in arr2 to sum over. When there is more than one axis to sum over, the corresponding arguments to axes should be sequences of the same length with the first axis to sum over given first in both sequences, the second axis second, and so forth. If the axes argument is an integer, N, then the last N dimensions of a and first N dimensions of b are summed over. I don't know about numeric. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.hochberg at ieee.org Tue Dec 5 13:16:05 2006 From: tim.hochberg at ieee.org (Tim Hochberg) Date: Tue, 05 Dec 2006 11:16:05 -0700 Subject: [Numpy-discussion] How to speed up this function? In-Reply-To: <5263831.1901231165202880646.JavaMail.root@vms062.mailsrvcs.net> References: <5263831.1901231165202880646.JavaMail.root@vms062.mailsrvcs.net> Message-ID: <4575B765.1040505@ieee.org> fsenkel at verizon.net wrote: > Hello, > > I'm taking a CFD class, one of the codes I wrote runs very slow. When I look at hotshot is says the function below is the problem. Since this is an explicit step, the for loops are only traversed once, so I think it's caused by memory usage, but I'm not sure if it's the local variables or the loop? I can vectorize the inner loop, would declaring the data structures in the calling routine and passing them in be a better idea than using local storage? > > I'm new at python and numpy, I need to look at how to get profiling information for the lines within a function. 
>
> Thank you,
>
> Frank
>
> PS
> I tried to post this via google groups, but it didn't seem to go through, sorry if it ends up as multiple postings
>
> def findw(wnext,wprior,phiprior,uprior,vprior):
>     #format here is x[i,j] where i's are rows, j's columns, use flipud() to get the
>     #print out consistent with the spatial up-down directions
>
>     #assign local names that are more
>     #in line with the class notation
>     w = wprior
>     phi = phiprior
>     u = uprior
>     v = vprior
>
>     #three of the BC are known so just set them
>     #symmetry plane
>     wnext[0,0:gcols] = 0.0
>
>     #upper wall
>     wnext[gN,0:gcols] = 2.0/gdy**2 * (phi[gN,0:gcols] - phi[gN-1,0:gcols])
>
>     #inlet, off the walls
>     wnext[1:grows-1,0] = 0.0
>
>     upos = where(u>0)
>     vpos = where(v>0)
>
>     Sx = ones_like(u)
>     Sx[upos] = 0.0
>
>     Sy = ones_like(v)
>     Sy[vpos] = 0.0
>
>     uw = u*w
>     vw = v*w
>
>     #interior nodes
>     for j in range(1,gasizej-1):
>         for i in range(1,gasizei-1):
>
>             wnext[i,j] = ( w[i,j] + gnu*gdt/gdx**2 * (w[i,j-1] - 2.0*w[i,j] + w[i,j+1]) +
>                            gnu*gdt/gdy**2 * (w[i-1,j] - 2.0*w[i,j] + w[i+1,j]) -
>                            (1.0 - Sx[i,j]) * gdt/gdx * (uw[i,j] - uw[i,j-1]) -
>                            Sx[i,j] * gdt/gdx * (uw[i,j+1] - uw[i,j]) -
>                            (1.0 - Sy[i,j]) * gdt/gdy * (vw[i,j] - vw[i-1,j]) -
>                            Sy[i,j] * gdt/gdy * (vw[i+1,j] - vw[i,j]) )
>
I imagine that this loop is what is killing you. Remove at least the
inner loop, if not both (try removing just the inner loop as well as
both, since sometimes it's faster to remove only the inner one due to
memory usage issues). Removing both will look something like:

wnext[1:-1,1:-1] = ( w[1:-1,1:-1] +
                     gnu*gdt/gdx**2 * (w[1:-1,0:-2] - 2.0*w[1:-1,1:-1] + w[1:-1,2:]) +
                     ... )

etc., etc. When you're done with that, note also that you have the same
array term present multiple times. You could save more time by collapsing
those terms and using different scalar multipliers. Occasionally that is
numerically unwise, but I doubt it in this case.

There are all sorts of other things that you can do, such as using
inplace operations, etc. But try vectorizing the loop first.

-tim

> ##            print "***wnext****"
> ##            print "i: ", i, "j: ", j, "wnext[i,j]: ", wnext[i,j]
>
>     #final BC at outlet, off walls
>     wnext[1:grows-1,gM] = wnext[1:grows-1,gM-1]
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>

From tim.hochberg at ieee.org Tue Dec 5 13:20:29 2006
From: tim.hochberg at ieee.org (Tim Hochberg)
Date: Tue, 05 Dec 2006 11:20:29 -0700
Subject: [Numpy-discussion] Precision in Python
In-Reply-To: 
References: 
Message-ID: <4575B86D.7020600@ieee.org>

Elton Mendes wrote:
> Hi.
> I'm having a precision problem in python
>
> Example:
>
> >>> a = 5.14343434
> >>> b = round(a,1)
> >>> b
> 5.0999999999999996
> >>>
>
> It's possible to round the number exactly to 5.1

Read this:

http://www.python.org/infogami-faq/general/why-are-floating-point-calculations-so-inaccurate/

From charlesr.harris at gmail.com Tue Dec 5 13:18:47 2006
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Tue, 5 Dec 2006 11:18:47 -0700
Subject: [Numpy-discussion] How to speed up this function?
In-Reply-To: <5263831.1901231165202880646.JavaMail.root@vms062.mailsrvcs.net>
References: <5263831.1901231165202880646.JavaMail.root@vms062.mailsrvcs.net>
Message-ID: 

On 12/3/06, fsenkel at verizon.net wrote:
>
> Hello,
>
> I'm taking a CFD class, one of the codes I wrote runs very slow. When I
> look at hotshot it says the function below is the problem. Since this is an
> explicit step, the for loops are only traversed once, so I think it's caused
> by memory usage, but I'm not sure if it's the local variables or the loop. I
> can vectorize the inner loop, would declaring the data structures in the
> calling routine and passing them in be a better idea than using local
> storage?
>
> I'm new at python and numpy, I need to look at how to get profiling
> information for the lines within a function.
>
> Thank you,
>
> Frank
>
> PS
> I tried to post this via google groups, but it didn't seem to go through,
> sorry if it ends up as multiple postings
>
> def findw(wnext,wprior,phiprior,uprior,vprior):
>     #format here is x[i,j] where i's are rows, j's columns, use flipud() to get the
>     #print out consistent with the spatial up-down directions
>
>     #assign local names that are more
>     #in line with the class notation
>     w = wprior
>     phi = phiprior
>     u = uprior
>     v = vprior
>
>     #three of the BC are known so just set them
>     #symmetry plane
>     wnext[0,0:gcols] = 0.0
>
>     #upper wall
>     wnext[gN,0:gcols] = 2.0/gdy**2 * (phi[gN,0:gcols] - phi[gN-1,0:gcols])
>
>     #inlet, off the walls
>     wnext[1:grows-1,0] = 0.0
>
>     upos = where(u>0)
>     vpos = where(v>0)
>
>     Sx = ones_like(u)
>     Sx[upos] = 0.0
>
>     Sy = ones_like(v)
>     Sy[vpos] = 0.0
>
>     uw = u*w
>     vw = v*w
>
>     #interior nodes
>     for j in range(1,gasizej-1):
>         for i in range(1,gasizei-1):
>
>             wnext[i,j] = ( w[i,j] + gnu*gdt/gdx**2 * (w[i,j-1] - 2.0*w[i,j] + w[i,j+1]) +
>                            gnu*gdt/gdy**2 * (w[i-1,j] - 2.0*w[i,j] + w[i+1,j]) -
>                            (1.0 - Sx[i,j]) * gdt/gdx * (uw[i,j] - uw[i,j-1]) -
>                            Sx[i,j] * gdt/gdx * (uw[i,j+1] - uw[i,j]) -
>                            (1.0 - Sy[i,j]) * gdt/gdy * (vw[i,j] - vw[i-1,j]) -
>                            Sy[i,j] * gdt/gdy * (vw[i+1,j] - vw[i,j]) )
>
> ##            print "***wnext****"
> ##            print "i: ", i, "j: ", j, "wnext[i,j]: ", wnext[i,j]
>
>     #final BC at outlet, off walls
>     wnext[1:grows-1,gM] = wnext[1:grows-1,gM-1]
>

Explicit indexing tends to be very slow. I note what looks to be a lot of
differencing in the code, so I suspect what you have here is a PDE. Your
best bet in the short term is to vectorize as many of these operations as
possible, but because the expression is so complicated it is a bit of a
chore to see just how. If your CFD class allows it, there are probably
tools in scipy that are adapted to this sort of problem, and in particular
to CFD. Sandia also puts out PyTrilinos,
http://software.sandia.gov/trilinos/packages/pytrilinos/, which provides
interfaces to distributed and parallel PDE solvers. It's big iron software
for serious problems, so might be a bit of overkill for your applications.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From bpederse at gmail.com Tue Dec 5 13:30:41 2006
From: bpederse at gmail.com (Brent Pedersen)
Date: Tue, 5 Dec 2006 10:30:41 -0800
Subject: [Numpy-discussion] How to speed up this function?
In-Reply-To: <5263831.1901231165202880646.JavaMail.root@vms062.mailsrvcs.net>
References: <5263831.1901231165202880646.JavaMail.root@vms062.mailsrvcs.net>
Message-ID: 

It looks like you could use weave.blitz() without much change to your
code, or weave.inline() if needed. See this page:
http://scipy.org/PerformancePython

On 12/3/06, fsenkel at verizon.net wrote:
>
> Hello,
>
> I'm taking a CFD class, one of the codes I wrote runs very slow. When I
> look at hotshot it says the function below is the problem.
Since this is an > explicit step, the for loops are only traversed once, so I think it's caused > by memory usage, but I'm not sure if it's the local variables or the loop? I > can vectorize the inner loop, would declaring the data structures in the > calling routine and passing them in be a better idea than using local > storage? > > I'm new at python and numpy, I need to look at how to get profiling > information for the lines within a function. > > > Thank you, > > Frank > > > PS > I tried to post this via google groups, but it didn't seem to go through, > sorry if it ends up as multiple postings > > > def findw(wnext,wprior,phiprior,uprior,vprior): > #format here is x[i,j] where i's are rows, j's columns, use flipud() to > get the > #print out consistent with the spacial up-down directions > > #assign local names that are more > #inline with the class notation > w = wprior > phi = phiprior > u = uprior > v = vprior > > > #three of the BC are known so just set them > #symetry plane > wnext[0,0:gcols] = 0.0 > > #upper wall > wnext[gN,0:gcols] = 2.0/gdy**2 * (phi[gN,0:gcols] - phi[gN-1,0:gcols]) > > #inlet, off the walls > wnext[1:grows-1,0] = 0.0 > > > upos = where(u>0) > vpos = where(v>0) > > Sx = ones_like(u) > Sx[upos] = 0.0 > > Sy = ones_like(v) > Sy[vpos] = 0.0 > > uw = u*w > vw = v*w > > #interior nodes > for j in range(1,gasizej-1): > for i in range(1,gasizei-1): > > wnext[i,j] =( w[i,j] + gnu*gdt/gdx**2 * (w[i,j-1] - 2.0*w[i,j] > + w[i,j+1]) + > gnu*gdt/gdy**2 * (w[i-1,j] - 2.0*w[i,j] + > w[i+1,j]) - > (1.0 - Sx[i,j]) * gdt/gdx * (uw[i,j] - uw[i,j-1]) > - > Sx[i,j] * gdt/gdx * (uw[i,j+1] - uw[i,j]) - > (1.0 - Sy[i,j]) * gdt/gdy * (vw[i,j] - vw[i-1,j]) > - > Sy[i,j] * gdt/gdy * (vw[i+1,j] - vw[i,j]) ) > > ## print "***wnext****" > ## print "i: ", i, "j: ", j, "wnext[i,j]: ", wnext[i,j] > > #final BC at outlet, off walls > wnext[1:grows-1,gM] = wnext[1:grows-1,gM-1] > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Tue Dec 5 13:33:30 2006 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 5 Dec 2006 11:33:30 -0700 Subject: [Numpy-discussion] How to speed up this function? In-Reply-To: References: <5263831.1901231165202880646.JavaMail.root@vms062.mailsrvcs.net> Message-ID: On 12/5/06, Charles R Harris wrote: > > > > On 12/3/06, fsenkel at verizon.net wrote: > > > > Hello, > > > > I'm taking a CFD class, one of the codes I wrote runs very slow. When I > > look at hotshot is says the function below is the problem. Since this is an > > explicit step, the for loops are only traversed once, so I think it's caused > > by memory usage, but I'm not sure if it's the local variables or the loop? I > > can vectorize the inner loop, would declaring the data structures in the > > calling routine and passing them in be a better idea than using local > > storage? > > > > I'm new at python and numpy, I need to look at how to get profiling > > information for the lines within a function. > > > > > > Thank you, > > > > Frank > > > > > > PS > > I tried to post this via google groups, but it didn't seem to go > > through, sorry if it ends up as multiple postings > > > Explicit indexing tends to be very slow. I note what looks to be a lot of > differencing in the code, so I suspect what you have here is a PDE. 
Your
> best bet in the short term is to vectorize as many of these operations as
> possible, but because the expression is so complicated it is a bit of a
> chore to see just how. If your CFD class allows it, there are probably
> tools in scipy that are adapted to this sort of problem, and in particular
> to CFD. Sandia also puts out PyTrilinos,
> http://software.sandia.gov/trilinos/packages/pytrilinos/, which provides
> interfaces to distributed and parallel PDE solvers. It's big iron software
> for serious problems, so might be a bit of overkill for your applications.
>

If it is a PDE, you might also want to look into sparse matrices. Other
folks here can tell you more about that.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From charlesr.harris at gmail.com Tue Dec 5 13:39:49 2006
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Tue, 5 Dec 2006 11:39:49 -0700
Subject: [Numpy-discussion] Precision in Python
In-Reply-To: 
References: 
Message-ID: 

On 11/27/06, Elton Mendes wrote:
>
> Hi.
> I'm having a precision problem in python
>
> Example:
>
> >>> a = 5.14343434
> >>> b = round(a,1)
> >>> b
> 5.0999999999999996
> >>>
>
> It's possible to round the number exactly to 5.1

Short answer, no. The number 5.1 can't be exactly represented as a binary
fraction, i.e., it can't be expressed in the form int/power_of_two. If all
you are worried about is appearance, then the print routine will round it
to 5.1 if you restrict the precision of the output.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From robert.kern at gmail.com Tue Dec 5 14:19:27 2006
From: robert.kern at gmail.com (Robert Kern)
Date: Tue, 05 Dec 2006 13:19:27 -0600
Subject: [Numpy-discussion] Numpy and Python 2.2 on RHEL3
In-Reply-To: <45759940.8080000@icecube.wisc.edu>
References: <45759940.8080000@icecube.wisc.edu>
Message-ID: <4575C63F.8050509@gmail.com>

David Bogen wrote:
> All:
>
> Is it possible to build Numpy using Python 2.2? I haven't been able to
> find anything that explicitly lists the versions of Python with which
> Numpy functions so I've been working under the assumption that the two
> bits will mesh together somehow.

numpy requires Python 2.3.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
 -- Umberto Eco

From a.u.r.e.l.i.a.n at gmx.net Tue Dec 5 15:51:06 2006
From: a.u.r.e.l.i.a.n at gmx.net (Johannes Loehnert)
Date: Tue, 5 Dec 2006 21:51:06 +0100
Subject: [Numpy-discussion] dot operations on multidimensional arrays
In-Reply-To: <45655665.7020302@fysik.dtu.dk>
References: <45655665.7020302@fysik.dtu.dk>
Message-ID: <200612052151.07223.a.u.r.e.l.i.a.n@gmx.net>

Hi,

> The question is then:
> 1) Is there any way to change the axis for which the product-sum is
> performed. This can of course be done by a swapaxis before and after the
> operation, but this makes the array non-contiguous, in which case the
> dot operation often makes bugs (at least in Numeric).
> 2) For complicated reasons we still use Numeric in our software package,
> and in this, "dot" behaves very strangely.

The behaviour for >2D arrays has a bug which was fixed for numpy long ago.
(I was the one who found it. :-)) It led exactly to the behaviour you found
(first row is correct, rest is garbage). I do not know if it was fixed in
Numeric; maybe updating to the latest version will help.
Otherwise, maybe the best workaround is to use a for loop and calculate dot
elementwise.

Johannes

From gnchen at cortechs.net Tue Dec 5 16:27:55 2006
From: gnchen at cortechs.net (Gennan Chen)
Date: Tue, 05 Dec 2006 13:27:55 -0800
Subject: [Numpy-discussion] compile scipy by using intel compiler
Message-ID: <1165354075.6742.5.camel@cortechs25.cortechs.net>

Hi! All,

I have a dual Opteron 285 machine with 8G of RAM, running FC6 x86_64. I did
manage to get numpy (from svn) compiled by using icc 9.1.0.45 and mkl 9.0
(I got 3 errors when I ran the test). But no such luck for scipy (from
svn). Below is the error:

Lib/special/cephes/mconf.h(137): remark #193: zero used for undefined preprocessing identifier
  #if WORDS_BIGENDIAN /* Defined in pyconfig.h */
      ^
Lib/special/cephes/const.c(92): error: floating-point operation result is out of range
  double INFINITY = 1.0/0.0; /* 99e999; */
                       ^
Lib/special/cephes/const.c(97): error: floating-point operation result is out of range
  double NAN = 1.0/0.0 - 1.0/0.0;
                  ^
Lib/special/cephes/const.c(97): error: floating-point operation result is out of range
  double NAN = 1.0/0.0 - 1.0/0.0;
                            ^
compilation aborted for Lib/special/cephes/const.c (code 2)
error: Command "icc -O2 -g -fomit-frame-pointer -mcpu=pentium4 -mtune=pentium4 -march=pentium4 -msse3 -axW -Wall -fPIC -c Lib/special/cephes/const.c -o build/temp.linux-x86_64-2.4/Lib/special/cephes/const.o" failed with exit status 2

Does anyone have a solution for this?

BTW, the 3 errors I got from numpy are:

File "/usr/lib64/python2.4/site-packages/numpy/lib/tests/test_ufunclike.py", line 25, in test_ufunclike
Failed example:
    nx.sign(a)
Expected:
    array([ 1., -1.,  0.,  0.,  1., -1.])
Got:
    array([ 1., -1., -1.,  0.,  1., -1.])
**********************************************************************
File "/usr/lib64/python2.4/site-packages/numpy/lib/tests/test_ufunclike.py", line 40, in test_ufunclike
Failed example:
    nx.sign(a, y)
Expected:
    array([True, True, False, False, True, True], dtype=bool)
Got:
    array([True, True, True, False, True, True], dtype=bool)
**********************************************************************
File "/usr/lib64/python2.4/site-packages/numpy/lib/tests/test_ufunclike.py", line 43, in test_ufunclike
Failed example:
    y
Expected:
    array([True, True, False, False, True, True], dtype=bool)
Got:
    array([True, True, True, False, True, True], dtype=bool)

Are these errors serious? Or should I go back to gcc? Has anyone gotten a
good speedup by using icc and mkl?

--
Gen-Nan Chen, PhD
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From kwgoodman at gmail.com Tue Dec 5 16:46:42 2006
From: kwgoodman at gmail.com (Keith Goodman)
Date: Tue, 5 Dec 2006 13:46:42 -0800
Subject: [Numpy-discussion] numpy in debian
Message-ID: 

I'm impressed with how easy it is to compile numpy. I'm even more
impressed with how easy it is to let someone else compile it and just
apt-get it.

Does anyone know the numpy plans for debian? It is currently at 1.0rc1.

I'm afraid to ask the debian numpy maintainers since they have already
done me such a big favor by packaging numpy and I haven't even thanked
them yet.

Oh, yeah. Then there's that other project I should thank. Thank you.
Numpy is a joy to use.

From mforbes at phys.washington.edu Tue Dec 5 21:57:10 2006
From: mforbes at phys.washington.edu (Michael McNeil Forbes)
Date: Tue, 05 Dec 2006 18:57:10 -0800
Subject: [Numpy-discussion] take semantics (bug?)
References: <4573B51C.7050209@gmail.com> Message-ID: Robert Kern wrote: > Michael McNeil Forbes wrote: > > What are the semantics of the "take" function? > > > > I would have expected that the following have the same shape and size: > > > >>>> a = array([1,2,3]) > >>>> inds = a.nonzero() > >>>> a[inds] > > array([1, 2, 3]) > >>>> a.take(inds) > > array([[1, 2, 3]]) > > > > Is there a bug somewhere here or is this intentional? > > It's a result of a.nonzero() returning a tuple. ... > __getitem__ interprets tuples specially: a[1,2,3] == a[(1,2,3)], also a[0,] > == a[0]. > > .take() doesn't; it simply tries to convert its argument into an array. It > can > convert (array([0, 1, 2]),) into array([[0, 1, 2]]), so it does. Okay. I understand why this happens from the code. 1) Is there a design reason for this inconsistent treatment of "indices"? 2) If so, is there some place (perhaps on the Wiki or in some source code I cannot find) that design decisions like this are discussed? (I have several other inconsistencies I would like to address, but would like to find out if they are "intentional" before wasting people's time.) Thanks, Michael. From robert.kern at gmail.com Tue Dec 5 22:19:03 2006 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 05 Dec 2006 21:19:03 -0600 Subject: [Numpy-discussion] take semantics (bug?) In-Reply-To: References: <4573B51C.7050209@gmail.com> Message-ID: <457636A7.6040408@gmail.com> Michael McNeil Forbes wrote: > Robert Kern wrote: >> Michael McNeil Forbes wrote: >>> What are the semantics of the "take" function? >>> >>> I would have expected that the following have the same shape and size: >>> >>>>>> a = array([1,2,3]) >>>>>> inds = a.nonzero() >>>>>> a[inds] >>> array([1, 2, 3]) >>>>>> a.take(inds) >>> array([[1, 2, 3]]) >>> >>> Is there a bug somewhere here or is this intentional? >> It's a result of a.nonzero() returning a tuple. > ... >> __getitem__ interprets tuples specially: a[1,2,3] == a[(1,2,3)], also a[0,] >> == a[0]. >> >> .take() doesn't; it simply tries to convert its argument into an array. It >> can >> convert (array([0, 1, 2]),) into array([[0, 1, 2]]), so it does. > > Okay. I understand why this happens from the code. > > 1) Is there a design reason for this inconsistent treatment of "indices"? Indexing needs to handle tuples of indices separately from other objects in order to support x[i, j] .take() does not support multidimensional indexing, so it shouldn't try to go through the special cases that __getitem__ does. Instead, it follows the rules that nearly every other method uses (i.e. "just turn it into an array"). > 2) If so, is there some place (perhaps on the Wiki or in some source > code I cannot find) that design decisions like this are discussed? (I > have several other inconsistencies I would like to address, but would > like to find out if they are "intentional" before wasting people's time.) If they're recorded outside of the code, _The Guide to NumPy_, or the mailing list, they're here: http://projects.scipy.org/scipy/numpy -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From nadavh at visionsense.com Wed Dec 6 01:42:47 2006 From: nadavh at visionsense.com (Nadav Horesh) Date: Wed, 6 Dec 2006 08:42:47 +0200 Subject: [Numpy-discussion] dot operations on multidimensional arrays Message-ID: <07C6A61102C94148B8104D42DE95F7E8CC1FA9@exchange2k.envision.co.il> Try numpy.tensordot Nadav -----Original Message----- From: numpy-discussion-bounces at scipy.org [mailto:numpy-discussion-bounces at scipy.org] On Behalf Of Carsten Rostgaard Sent: Thursday, November 23, 2006 10:06 To: numpy-discussion at scipy.org Subject: [Numpy-discussion] dot operations on multidimensional arrays Hi! I am trying to use the "dot" method on multi-(more than 2)-dimensional arrays. Specifically I do >> y = dot(a, b) where a is a 2D array and b is a 3D array. using numpy I get the the help: " dot(...) dot(a,v) returns matrix-multiplication between a and b. The product-sum is over the last dimension of a and the second-to-last dimension of b. " I then expect that >> y[i, j, k] = sum(a[i, :] * b[j, :, k]) which is actually what I get. The question is then: 1) Is there any way to change the axis for which the product-sum is performed. This can of course be done by a swapaxis before and after the operation, but this makes the array non-contiguous, in which case the dot operation often makes bugs (at least in Numeric). 2) For complicated reasons we still use Numeric in our software package, and in this, "dot" behaves very strangely. According to the Numeric help: " dot(a, b) dot(a,b) returns matrix-multiplication between a and b. The product-sum is over the last dimension of a and the second-to-last dimension of b. " so I would have expected again that y[i, j, k] = sum(a[i, :] * b[j, :, k]), and the dimensions actually fit, i.e. y.shape = (a.shape[0], b.shape[0], b.shape[2]), but only some rows of the result has these values!! Does anyone know what Numeric.dot(a, b) actually does when b has more than two dimensions? I use the following test script: ---------------------BEGIN SCRIPT----------------------- import Numeric as num # import numpy as num # make 'random' input arrays a = num.zeros((2, 5)) b = num.zeros((3, 5, 4)) a.flat[:] = num.arange(len(a.flat)) - 3 b.flat[:] = num.arange(len(b.flat)) + 5 # built-in dot product y1 = num.dot(a, b) # manual dot product y2 = num.zeros((a.shape[0], b.shape[0], b.shape[2])) for i in range(a.shape[0]): for j in range(b.shape[0]): for k in range(b.shape[2]): y2[i, j, k] = num.sum(a[i,:] * b[j, :, k]) # test for consistency print y1 == y2 ---------------------END SCRIPT------------------------ with the result: [[[1 1 1 1] [0 0 0 0] [0 0 0 0]] [[1 1 1 1] [0 0 0 0] [0 0 0 0]]] thanks a lot, Carsten Rostgaard Carsten.Rostgaard at fysik.dtu.dk _______________________________________________ Numpy-discussion mailing list Numpy-discussion at scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion From nadavh at visionsense.com Wed Dec 6 01:52:35 2006 From: nadavh at visionsense.com (Nadav Horesh) Date: Wed, 6 Dec 2006 08:52:35 +0200 Subject: [Numpy-discussion] Precision in Python Message-ID: <07C6A61102C94148B8104D42DE95F7E8C8F12C@exchange2k.envision.co.il> -----Original Message----- From: numpy-discussion-bounces at scipy.org [mailto:numpy-discussion-bounces at scipy.org] On Behalf Of Elton Mendes Sent: Monday, November 27, 2006 13:57 To: numpy-discussion at scipy.org Subject: [Numpy-discussion] Precision in Python Hi. 
I'm having a precision problem in python

Example:

>>> a = 5.14343434
>>> b = round(a,1)
>>> b
5.0999999999999996
>>>

It's possible to round the number exactly to 5.1

No. 5.1 cannot be represented exactly as a native machine float. The only
way I know to represent this value exactly is to use the decimal module.
Usually you do not want to do this.

Nadav.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From nadavh at visionsense.com Wed Dec 6 02:05:06 2006
From: nadavh at visionsense.com (Nadav Horesh)
Date: Wed, 6 Dec 2006 09:05:06 +0200
Subject: [Numpy-discussion] How to speed up this function?
Message-ID: <07C6A61102C94148B8104D42DE95F7E8C8F12D@exchange2k.envision.co.il>

You can speed it up easily by avoiding the loop. The main idea is to
replace indexing of the type [i+1,j], [i-1,j], [i,j+1], [i,j-1] by the
appropriate slicing. For example

for i in xrange(1,n):
    for j in xrange(1,m):
        a[i,j] = b[i-1,j] + c[i,j+1]

can be replaced by

a[1:n, 1:m] = b[0:n-1, 1:m] + c[1:n, 2:m+1]

Nadav.

-----Original Message-----
From: numpy-discussion-bounces at scipy.org [mailto:numpy-discussion-bounces at scipy.org] On Behalf Of fsenkel at verizon.net
Sent: Monday, December 04, 2006 05:28
To: numpy-discussion at scipy.org
Subject: [Numpy-discussion] How to speed up this function?

Hello,

I'm taking a CFD class, one of the codes I wrote runs very slow. When I
look at hotshot it says the function below is the problem. Since this is an
explicit step, the for loops are only traversed once, so I think it's
caused by memory usage, but I'm not sure if it's the local variables or
the loop. I can vectorize the inner loop, would declaring the data
structures in the calling routine and passing them in be a better idea
than using local storage?

I'm new at python and numpy, I need to look at how to get profiling
information for the lines within a function.
Thank you,

Frank

PS
I tried to post this via google groups, but it didn't seem to go through,
sorry if it ends up as multiple postings

def findw(wnext,wprior,phiprior,uprior,vprior):
    #format here is x[i,j] where i's are rows, j's columns, use flipud() to get the
    #print out consistent with the spatial up-down directions

    #assign local names that are more
    #in line with the class notation
    w = wprior
    phi = phiprior
    u = uprior
    v = vprior

    #three of the BC are known so just set them
    #symmetry plane
    wnext[0,0:gcols] = 0.0

    #upper wall
    wnext[gN,0:gcols] = 2.0/gdy**2 * (phi[gN,0:gcols] - phi[gN-1,0:gcols])

    #inlet, off the walls
    wnext[1:grows-1,0] = 0.0

    upos = where(u>0)
    vpos = where(v>0)

    Sx = ones_like(u)
    Sx[upos] = 0.0

    Sy = ones_like(v)
    Sy[vpos] = 0.0

    uw = u*w
    vw = v*w

    #interior nodes
    for j in range(1,gasizej-1):
        for i in range(1,gasizei-1):

            wnext[i,j] = ( w[i,j] + gnu*gdt/gdx**2 * (w[i,j-1] - 2.0*w[i,j] + w[i,j+1]) +
                           gnu*gdt/gdy**2 * (w[i-1,j] - 2.0*w[i,j] + w[i+1,j]) -
                           (1.0 - Sx[i,j]) * gdt/gdx * (uw[i,j] - uw[i,j-1]) -
                           Sx[i,j] * gdt/gdx * (uw[i,j+1] - uw[i,j]) -
                           (1.0 - Sy[i,j]) * gdt/gdy * (vw[i,j] - vw[i-1,j]) -
                           Sy[i,j] * gdt/gdy * (vw[i+1,j] - vw[i,j]) )

##            print "***wnext****"
##            print "i: ", i, "j: ", j, "wnext[i,j]: ", wnext[i,j]

    #final BC at outlet, off walls
    wnext[1:grows-1,gM] = wnext[1:grows-1,gM-1]
_______________________________________________
Numpy-discussion mailing list
Numpy-discussion at scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion

From giorgio.luciano at chimica.unige.it Wed Dec 6 10:41:23 2006
From: giorgio.luciano at chimica.unige.it (Giorgio Luciano)
Date: Wed, 06 Dec 2006 16:41:23 +0100
Subject: [Numpy-discussion] equivalent to isempty command in matlab (newbie question)
Message-ID: <4576E4A3.7040004@chimica.unige.it>

Today I also posted a question to the scipy list, because I thought I had
found a solution. This works fine:

bar(N, b1[:,0], width, color='r', yerr=binterv)
############
s3=find(sig1[:,arange(ini,c)]<=0.001)
b1=b.flatten()
#if s3!=[]:
for i3 in arange(len(s3)):
    text(s3[i3], b1[s3[i3]+ini],'***')
s2=find(logical_and(sig1[:,arange(ini,c)]>0.001,sig1[:,arange(ini,c)]<=0.01))
for i2 in arange(len(s2)):
    text(s2[i2], b1[s2[i2]+ini],'**')
s1=find(logical_and(sig1[:,arange(ini,c)]>0.01,sig1[:,arange(ini,c)]<=0.05))
for i1 in arange(len(s1)):
    text(s1[i1], b1[s1[i1]+ini],'*')
title('Plot of the coefficients of the model')

but when I uncomment the "if s3!=[]:" part it does not. So in this case I
have solved the problem, but is there an equivalent of Matlab's isempty
command in numpy?

Thanks in advance for the reply

Giorgio

From david.huard at gmail.com Wed Dec 6 10:21:44 2006
From: david.huard at gmail.com (David Huard)
Date: Wed, 6 Dec 2006 10:21:44 -0500
Subject: [Numpy-discussion] Resizing without allocating additional memory
Message-ID: <91cf711d0612060721t3a7c0bc1y8ae72fd72d66f556@mail.gmail.com>

Hi,

I have Fortran subroutines wrapped with f2py that take arrays as
arguments, and I often need to use resize(a, N) to pass an array of copies
of an element. The resize call, however, is becoming the speed bottleneck,
so my question is: is it possible to create a (1xN) array from a scalar
without allocating additional memory for the array, i.e. just return a new
"view" of the array where all elements point to the same scalar?

Thanks,

David
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From christian at marquardt.sc Wed Dec 6 11:28:39 2006 From: christian at marquardt.sc (Christian Marquardt) Date: Wed, 6 Dec 2006 17:28:39 +0100 (CET) Subject: [Numpy-discussion] Indices returned by where() Message-ID: <31940.193.17.11.23.1165422519.squirrel@webmail.marquardt.sc> Dear list, apologies if the answer to my question is obvious... Is the following intentional? $>python Python 2.4 (#1, Mar 22 2005, 21:42:42) [GCC 3.3.5 20050117 (prerelease) (SUSE Linux)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import numpy as np >>> print np.__version__ 1.0 >>> x = np.array([1., 2., 3., 4., 5.]) >>> idx = np.where(x > 6.) >>> print len(idx) 1 The reason is of course that where() returns a tuple of index arrays instead of simply an index array: >>> print idx (array([], dtype=int32),) Does that mean that one always has to explicitely request the first element of the returned tuple in order to check how many matches were found, even for 1d arrays? What's the reason for designing it that way? Many thanks, Christian From robert.kern at gmail.com Wed Dec 6 12:01:00 2006 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 06 Dec 2006 11:01:00 -0600 Subject: [Numpy-discussion] equivalent to isempty command in matlab (newbie question) In-Reply-To: <4576E4A3.7040004@chimica.unige.it> References: <4576E4A3.7040004@chimica.unige.it> Message-ID: <4576F74C.3080700@gmail.com> Giorgio Luciano wrote: > Today I've also posted a question to scipy groups because I've thought > I've found a solution but > > this work good > > bar(N, b1[:,0], width, color='r', yerr=binterv) > ############ > s3=find(sig1[:,arange(ini,c)]<=0.001) > b1=b.flatten() > #if s3!=[]: > for i3 in arange(len(s3)): > text(s3[i3], b1[s3[i3]+ini],'***') > s2=find(logical_and(sig1[:,arange(ini,c)]>0.001,sig1[:,arange(ini,c)]<=0.01)) > for i2 in arange(len(s2)): > text(s2[i2], b1[s2[i2]+ini],'**') > s1=find(logical_and(sig1[:,arange(ini,c)]>0.01,sig1[:,arange(ini,c)]<=0.05)) > for i1 in arange(len(s1)): > text(s1[i1], b1[s1[i1]+ini],'*') > title('Plot of the coefficients of the model') > > and when i uncomment the ifs3!=[] part it does not... > so in this case I've solve the problem.. but is there an equivalent for > isempty matlab command in numpy ? Use (len(s3) != 0) instead of (s3 != []). -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From filip.wasilewski at gmail.com Wed Dec 6 12:09:52 2006 From: filip.wasilewski at gmail.com (Filip Wasilewski) Date: Wed, 6 Dec 2006 18:09:52 +0100 Subject: [Numpy-discussion] equivalent to isempty command in matlab (newbie question) In-Reply-To: <4576E4A3.7040004@chimica.unige.it> References: <4576E4A3.7040004@chimica.unige.it> Message-ID: On 12/6/06, Giorgio Luciano wrote: > Today I've also posted a question to scipy groups because I've thought > I've found a solution but > > this work good > > bar(N, b1[:,0], width, color='r', yerr=binterv) > ############ > s3=find(sig1[:,arange(ini,c)]<=0.001) Just a few tips before I answer your question. Is sig1 a global constant? It is a good practice to write constant names in uppercase. Otherwise consider passing it as a function argument. > b1=b.flatten() > #if s3!=[]: if s3: ... > for i3 in arange(len(s3)): Although this works, a no-surprise way is to use standard xrange: for i3 in xrange(len(s3)): ... or enumerate: for i3, elem in enumerate(s3): ... 
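For example, the first loop above could then drop the explicit indexing
entirely (a sketch reusing your names, untested; the loop index turns out
not to be needed here, since each element of s3 is used both as the
x-coordinate and as the index offset):

for elem in s3:
    text(elem, b1[elem + ini], '***')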
> text(s3[i3], b1[s3[i3]+ini],'***')
> s2=find(logical_and(sig1[:,arange(ini,c)]>0.001,sig1[:,arange(ini,c)]<=0.01))

Boolean operators are also ok; just remember the parentheses and operator
precedence:

(sig1[:,arange(ini,c)]>0.001) & (sig1[:,arange(ini,c)]<=0.01)

> for i2 in arange(len(s2)):
>     text(s2[i2], b1[s2[i2]+ini],'**')
> s1=find(logical_and(sig1[:,arange(ini,c)]>0.01,sig1[:,arange(ini,c)]<=0.05))
> for i1 in arange(len(s1)):
>     text(s1[i1], b1[s1[i1]+ini],'*')
> title('Plot of the coefficients of the model')
>
> but when I uncomment the "if s3!=[]:" part it does not.

I think you have just discovered a bug, or an inconsistency I didn't
know of?

>>> print numpy.array([1,1]) == [], numpy.array([1,1]) != []
False True
>>> print numpy.array([1]) == [], numpy.array([1]) != []
[] []
>>> print numpy.array([]) == [], numpy.array([]) != []
[] []
>>> numpy.__version__
'1.0'

> So in this case I have solved the problem, but is there an equivalent of
> Matlab's isempty command in numpy?

Just like for other Python objects:

if s3:
    print "not empty"

or check whether the .size attribute is positive.

cheers,
fw

From robert.kern at gmail.com Wed Dec 6 12:20:19 2006
From: robert.kern at gmail.com (Robert Kern)
Date: Wed, 06 Dec 2006 11:20:19 -0600
Subject: [Numpy-discussion] equivalent to isempty command in matlab (newbie question)
In-Reply-To: 
References: <4576E4A3.7040004@chimica.unige.it>
Message-ID: <4576FBD3.3090601@gmail.com>

Filip Wasilewski wrote:

> Just like for other Python objects:
>
> if s3:
>     print "not empty"

No, that doesn't work. numpy arrays do not have a truth value. They raise an
error when you try to use them in such a context.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
 -- Umberto Eco

From oliphant at ee.byu.edu Wed Dec 6 13:45:03 2006
From: oliphant at ee.byu.edu (Travis Oliphant)
Date: Wed, 06 Dec 2006 11:45:03 -0700
Subject: [Numpy-discussion] Resizing without allocating additional memory
In-Reply-To: <91cf711d0612060721t3a7c0bc1y8ae72fd72d66f556@mail.gmail.com>
References: <91cf711d0612060721t3a7c0bc1y8ae72fd72d66f556@mail.gmail.com>
Message-ID: <45770FAF.1030009@ee.byu.edu>

David Huard wrote:

> Hi,
>
> I have Fortran subroutines wrapped with f2py that take arrays as
> arguments, and I often need to use resize(a, N) to pass an array of
> copies of an element. The resize call, however, is becoming the speed
> bottleneck, so my question is:
> is it possible to create a (1xN) array from a scalar without
> allocating additional memory for the array, i.e. just return a new
> "view" of the array where all elements point to the same scalar?
>
I don't think this would be possible in Fortran, because Fortran does not
provide a facility for using arbitrary striding (maybe later versions of
Fortran using pointers do, though).

If you can use arbitrary striding in your code, then you can construct
such a view using appropriate strides (i.e. a stride of 0). You can do
this with the ndarray constructor:

a = array(5)
g = ndarray(shape=(1,10), dtype=int, buffer=a, strides=(0,0))

But, notice you will get interesting results using

g += 1

Explain why the result of this is an array of 15 (Hint: look at the
value of a).
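For the record, a quick interactive sketch of what happens (the output
here is inferred from the explanation above, not re-run):

>>> from numpy import array, ndarray
>>> a = array(5)
>>> g = ndarray(shape=(1,10), dtype=int, buffer=a, strides=(0,0))
>>> g += 1    # all 10 zero-stride elements alias the same memory
>>> a
array(15)

Each of the 10 elements of g points at the single integer behind a, so
the in-place add touches that one memory location 10 times: 5 + 10*1 == 15.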
-Travis

From filip.wasilewski at gmail.com Wed Dec 6 14:04:35 2006
From: filip.wasilewski at gmail.com (Filip Wasilewski)
Date: Wed, 6 Dec 2006 20:04:35 +0100
Subject: [Numpy-discussion] equivalent to isempty command in matlab (newbie question)
In-Reply-To: <4576FBD3.3090601@gmail.com>
References: <4576E4A3.7040004@chimica.unige.it> <4576FBD3.3090601@gmail.com>
Message-ID: 

On 12/6/06, Robert Kern wrote:
> Filip Wasilewski wrote:
>
> > Just like for other Python objects:
> >
> > if s3:
> >     print "not empty"
>
> No, that doesn't work. numpy arrays do not have a truth value. They raise an
> error when you try to use them in such a context.

Right! I could swear I had checked this before posting. Evidently I got
bitten by this:

>>> bool(numpy.array([]))
False
>>> bool(numpy.array([1]))
True
>>> bool(numpy.array([0]))
False
>>> bool(numpy.array([1,1]))
Traceback (most recent call last):
  File "", line 1, in -toplevel-
    bool(numpy.array([1,1]))
ValueError: The truth value of an array with more than one element is
ambiguous. Use a.any() or a.all()

So depending on the situation one can use len or size:

>>> len(numpy.array([[],[]]))
2
>>> numpy.array([[],[]]).size
0

And how should one understand the following?

>>> print numpy.array([1,1]) == [], numpy.array([1,1]) != []
False True
>>> print `numpy.array([1]) == []`, `numpy.array([1]) != []`
array([], dtype=bool) array([], dtype=bool)
>>> print bool(numpy.array([1]) == []), bool(numpy.array([1]) != [])
False False

cheers,
fw

From david.huard at gmail.com Wed Dec 6 15:15:04 2006
From: david.huard at gmail.com (David Huard)
Date: Wed, 6 Dec 2006 15:15:04 -0500
Subject: [Numpy-discussion] Resizing without allocating additional memory
In-Reply-To: <45770FAF.1030009@ee.byu.edu>
References: <91cf711d0612060721t3a7c0bc1y8ae72fd72d66f556@mail.gmail.com>
	<45770FAF.1030009@ee.byu.edu>
Message-ID: <91cf711d0612061215j1e0acba9q1bf2223cd265c19@mail.gmail.com>

Thanks Travis,

I guess we'll have to tweak the Fortran subroutines. It would have been
neat though.

David

Answer: Since g += 1 adds one to all N elements of g, the buffer a gets
incremented N times. So

a = array(i)
g = ndarray(shape=(1,N), dtype=int, buffer=a, strides=(0,0))
g += M

returns i + M*N

2006/12/6, Travis Oliphant :
>
> David Huard wrote:
>
> > Hi,
> >
> > I have Fortran subroutines wrapped with f2py that take arrays as
> > arguments, and I often need to use resize(a, N) to pass an array of
> > copies of an element. The resize call, however, is becoming the speed
> > bottleneck, so my question is:
> > is it possible to create a (1xN) array from a scalar without
> > allocating additional memory for the array, i.e. just return a new
> > "view" of the array where all elements point to the same scalar?
> >
> I don't think this would be possible in Fortran, because Fortran does not
> provide a facility for using arbitrary striding (maybe later versions of
> Fortran using pointers do, though).
>
> If you can use arbitrary striding in your code, then you can construct
> such a view using appropriate strides (i.e. a stride of 0). You can do
> this with the ndarray constructor:
>
> a = array(5)
> g = ndarray(shape=(1,10), dtype=int, buffer=a, strides=(0,0))
>
> But, notice you will get interesting results using
>
> g += 1
>
> Explain why the result of this is an array of 15 (Hint: look at the
> value of a).
> -Travis
>
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From oliphant at ee.byu.edu Wed Dec 6 15:27:43 2006
From: oliphant at ee.byu.edu (Travis Oliphant)
Date: Wed, 06 Dec 2006 13:27:43 -0700
Subject: [Numpy-discussion] Overlapping copy with object_ arrays
In-Reply-To: <20061204230402.39447.qmail@web37210.mail.mud.yahoo.com>
References: <20061204230402.39447.qmail@web37210.mail.mud.yahoo.com>
Message-ID: <457727BF.6040601@ee.byu.edu>

James Flowers wrote:

> Hello,
>
> Having a problem with overlapping copies. Memory being freed twice???
> See below:

Thanks for the test. This problem is fixed and will be checked into SVN
as soon as I can figure out why I'm not able to access SVN from my work
machine.

The problem is that object array copies were done by first decrementing
the reference count of all elements of the destination array and then
incrementing the reference count of all elements of the destination
array once the copy was complete. For over-lapping copies (containing
only a single reference to an object), this created a problem, as the
reference count went to 0 before the copy occurred.

I've changed the code so that the reference count of the source is
increased and the reference count of the destination is decreased before
the copy is made. Then, the reference counts are correct after the copy
is completed, even for over-lapping copies.

-Travis

From charlesr.harris at gmail.com Thu Dec 7 01:17:52 2006
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Wed, 6 Dec 2006 23:17:52 -0700
Subject: [Numpy-discussion] Indices returned by where()
In-Reply-To: <31940.193.17.11.23.1165422519.squirrel@webmail.marquardt.sc>
References: <31940.193.17.11.23.1165422519.squirrel@webmail.marquardt.sc>
Message-ID: 

On 12/6/06, Christian Marquardt wrote:
>
> Dear list,
>
> apologies if the answer to my question is obvious...
>
> Is the following intentional?
>
> $>python
>
> Python 2.4 (#1, Mar 22 2005, 21:42:42)
> [GCC 3.3.5 20050117 (prerelease) (SUSE Linux)] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
>
> >>> import numpy as np
> >>> print np.__version__
> 1.0
>
> >>> x = np.array([1., 2., 3., 4., 5.])
>
> >>> idx = np.where(x > 6.)
> >>> print len(idx)
> 1
>
> The reason is of course that where() returns a tuple of index arrays
> instead of simply an index array:
>
> >>> print idx
> (array([], dtype=int32),)
>
> Does that mean that one always has to explicitly request the first
> element of the returned tuple in order to check how many matches were
> found, even for 1d arrays? What's the reason for designing it that way?

Fancy indexing.

In [1]: a = arange(10).reshape(2,5)

In [2]: i = where(a>3)

In [3]: i
Out[3]: (array([0, 1, 1, 1, 1, 1]), array([4, 0, 1, 2, 3, 4]))

In [4]: a[i] = 0

In [5]: a
Out[5]:
array([[0, 1, 2, 3, 0],
       [0, 0, 0, 0, 0]])

If you just want a count, try

In [6]: a = arange(10).reshape(2,5)

In [7]: sum(a>3)
Out[7]: 6

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From olivetti at itc.it Thu Dec 7 05:30:15 2006
From: olivetti at itc.it (Emanuele Olivetti)
Date: Thu, 07 Dec 2006 11:30:15 +0100
Subject: [Numpy-discussion] pickling arrays: numpy 1.0 can't unpickle numpy 1.0.1
Message-ID: <4577ED37.6020208@itc.it>

I'm running numpy 1.0 and 1.0.1 on several hosts, and today I've found
that pickling arrays in 1.0.1 causes problems for 1.0. An example:

--- numpy 1.0.1 ---
import numpy
import pickle
a = numpy.array([1,2,3])
f=open('test1.pickle','w')
pickle.dump(a,f)
f.close()
---

If I unpickle test1.pickle in numpy 1.0 I get:

--- numpy 1.0
>>> import numpy
>>> import pickle
>>> f=open('test1.pickle')
>>> a=pickle.load(f)
Traceback (most recent call last):
  File "", line 1, in 
  File "/hardmnt/virgo0/sra/olivetti/myapps/lib/python2.5/pickle.py", line 1370, in load
    return Unpickler(file).load()
  File "/hardmnt/virgo0/sra/olivetti/myapps/lib/python2.5/pickle.py", line 858, in load
    dispatch[key](self)
  File "/hardmnt/virgo0/sra/olivetti/myapps/lib/python2.5/pickle.py", line 1217, in load_build
    setstate(state)
TypeError: argument 1 must be sequence of length 5, not 8
-----------------

How can I make pickled arrays created with numpy 1.0.1 readable by numpy
1.0? Help!

Thanks in advance,

Emanuele

From giorgio.luciano at chimica.unige.it Thu Dec 7 05:36:03 2006
From: giorgio.luciano at chimica.unige.it (Giorgio Luciano)
Date: Thu, 07 Dec 2006 11:36:03 +0100
Subject: [Numpy-discussion] equivalent to isempty command in matlab (newbie question), (Robert Kern)
In-Reply-To: 
References: 
Message-ID: <4577EE93.4060904@chimica.unige.it>

> Today's Topics:
>
> 1. Re: equivalent to isempty command in matlab (newbie question)
> (Robert Kern)
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Wed, 06 Dec 2006 11:20:19 -0600
> From: Robert Kern
> Subject: Re: [Numpy-discussion] equivalent to isempty command in
> matlab (newbie question)
> To: Discussion of Numerical Python
> Message-ID: <4576FBD3.3090601 at gmail.com>
> Content-Type: text/plain; charset=UTF-8
>
> Filip Wasilewski wrote:
>
>> Just like for other Python objects:
>>
>> if s3:
>>     print "not empty"
>>
>
> No, that doesn't work. numpy arrays do not have a truth value. They raise an
> error when you try to use them in such a context.
>

Is there a workaround to make numpy recognize when an array is empty?

Giorgio

From faltet at carabos.com Thu Dec 7 06:15:49 2006
From: faltet at carabos.com (Francesc Altet)
Date: Thu, 07 Dec 2006 12:15:49 +0100
Subject: [Numpy-discussion] equivalent to isempty command in matlab (newbie question), (Robert Kern)
In-Reply-To: <4577EE93.4060904@chimica.unige.it>
References: <4577EE93.4060904@chimica.unige.it>
Message-ID: <1165490149.2588.28.camel@localhost.localdomain>

On Thu, 2006-12-07 at 11:36 +0100, Giorgio Luciano wrote:
> Is there a workaround to make numpy recognize when an array is empty?
>
> Giorgio

I guess there should be many. One possibility is to use .size:

In [9]:a=numpy.array([])
In [10]:a.size == False
Out[10]:True
In [11]:a=numpy.array([1])
In [12]:a.size == False
Out[12]:False

Cheers,

--
Francesc Altet   | Be careful about using the following code --
Carabos Coop. V. | I've only proven that it works,
www.carabos.com  | I haven't tested it.
 -- Donald Knuth

From alexandre.fayolle at logilab.fr Thu Dec 7 10:50:21 2006
From: alexandre.fayolle at logilab.fr (Alexandre Fayolle)
Date: Thu, 7 Dec 2006 16:50:21 +0100
Subject: [Numpy-discussion] Numeric memory leak when building Numeric.array from numarray.array
Message-ID: <20061207155020.GA18896@crater.logilab.fr>

Hi,

I'm facing a memory leak in an application that has to use both numarray
and Numeric (because of external dependencies).

The problem occurs when building a Numeric array from a numarray array:

import Numeric
import numarray
import sys
atest = numarray.arange(200)
temp = Numeric.array(atest)
print sys.getrefcount(atest) # prints 3
print sys.getrefcount(temp) # prints 2

I'm running numarray 1.5.2 and Numeric 24.2.

I can work around this by using an intermediate string representation:

temp = Numeric.fromstring(atest.tostring(), atest.typecode())
temp.shape = atest.shape

--
Alexandre Fayolle LOGILAB, Paris (France)
Python, Zope, Plone, Debian training courses: http://www.logilab.fr/formations
Custom software development: http://www.logilab.fr/services
Scientific computing: http://www.logilab.fr/science
Takeover and maintenance of CPS sites: http://www.migration-cms.com/
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 481 bytes
Desc: Digital signature
URL: 

From kxroberto at googlemail.com Thu Dec 7 11:11:14 2006
From: kxroberto at googlemail.com (Robert)
Date: Thu, 07 Dec 2006 17:11:14 +0100
Subject: [Numpy-discussion] Future Python 2.3 support ? - Re: Numpy and Python 2.2 on RHEL3
In-Reply-To: <4575C63F.8050509@gmail.com>
References: <45759940.8080000@icecube.wisc.edu> <4575C63F.8050509@gmail.com>
Message-ID: 

Robert Kern wrote:
> David Bogen wrote:
>> All:
>>
>> Is it possible to build Numpy using Python 2.2? I haven't been able to
>> find anything that explicitly lists the versions of Python with which
>> Numpy functions so I've been working under the assumption that the two
>> bits will mesh together somehow.
>
> numpy requires Python 2.3.
>

I hope Python 2.3 support will not be dropped too early. There is not much
cost overall in going from 2.2 to 2.3. Yet from Py2.3 to Py2.4 there is a
tremendous increase in memory footprint, distributable file sizes, load
time, CGI start time, compiler troubles on Windows, etc., not really
balanced by comparable improvements. For most types of practical
applications, Py2.3 is still the "good Python" for me.

Robert

From faltet at carabos.com Thu Dec 7 11:36:22 2006
From: faltet at carabos.com (Francesc Altet)
Date: Thu, 07 Dec 2006 17:36:22 +0100
Subject: [Numpy-discussion] Numeric memory leak when building Numeric.array from numarray.array
In-Reply-To: <20061207155020.GA18896@crater.logilab.fr>
References: <20061207155020.GA18896@crater.logilab.fr>
Message-ID: <1165509382.2588.54.camel@localhost.localdomain>

On Thu, 2006-12-07 at 16:50 +0100, Alexandre Fayolle wrote:
> Hi,
>
> I'm facing a memory leak in an application that has to use both numarray
> and Numeric (because of external dependencies).
>
> The problem occurs when building a Numeric array from a numarray array:
>
> import Numeric
> import numarray
> import sys
> atest = numarray.arange(200)
> temp = Numeric.array(atest)
> print sys.getrefcount(atest) # prints 3
> print sys.getrefcount(temp) # prints 2
>
> I'm running numarray 1.5.2 and Numeric 24.2

Yeah, it seems like the array protocol implementation in Numeric is
leaking.
Unfortunately, as Numeric maintenance has been dropped, there is little
chance that this will be fixed in the future.

> I can work around this by using an intermediate string representation:
>
> temp = Numeric.fromstring(atest.tostring(), atest.typecode())
> temp.shape = atest.shape

Another (faster) workaround would be:

temp2 = Numeric.fromstring(atest._data, typecode=atest.typecode())

which is pretty fast:

In [20]:Timer("Numeric.fromstring(atest._data, typecode=atest.typecode())",
"import numarray, Numeric; atest=numarray.arange(200)").repeat(3,10000)
Out[20]:[0.18092107772827148, 0.13870906829833984, 0.13995194435119629]

i.e. more than 2x faster than your current solution:

In [21]:Timer("Numeric.fromstring(atest.tostring(), typecode=atest.typecode())",
"import numarray, Numeric; atest=numarray.arange(200)").repeat(3,10000)
Out[21]:[0.37756705284118652, 0.32852792739868164, 0.32704305648803711]

and similar in speed to the native .array() and .asarray() based on the
array protocol:

In [22]:Timer("Numeric.array(atest)",
"import numarray, Numeric; atest=numarray.arange(200)").repeat(3,10000)
Out[22]:[0.17277789115905762, 0.12470793724060059, 0.12530016899108887]

In [23]:Timer("Numeric.asarray(atest)",
"import numarray, Numeric; atest=numarray.arange(200)").repeat(3,10000)
Out[23]:[0.20457005500793457, 0.15211081504821777, 0.15212082862854004]

As an aside, and curiously enough, Numeric.array() (where a copy is done)
is faster than Numeric.asarray() (where a copy shouldn't be done) :-/

HTH,

--
Francesc Altet   | Be careful about using the following code --
Carabos Coop. V. | I've only proven that it works,
www.carabos.com  | I haven't tested it.
 -- Donald Knuth

From ddrake at brontes3d.com Thu Dec 7 11:45:05 2006
From: ddrake at brontes3d.com (Daniel Drake)
Date: Thu, 07 Dec 2006 11:45:05 -0500
Subject: [Numpy-discussion] numarray-1.5.2 and Py_NONE refcount crash
Message-ID: <1165509905.26874.30.camel@systems03.lan.brontes3d.com>

Hi,

I know that numarray is outdated now, but it's too big a job to change to
numpy right now. On the off chance that someone can help:

After upgrading from numarray-1.3.1 to numarray-1.5.2, we get occasional
crashes where Python tries to free Py_None.

I'm aware of the NA_FromDimsStridesDescrAndData fix at
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=399440
however this doesn't solve the problem here, and that particular function
doesn't seem to be in our codepath anyway.

Any ideas? Thanks

--
Daniel Drake
Brontes Technologies, A 3M Company

From kwgoodman at gmail.com Thu Dec 7 12:05:31 2006
From: kwgoodman at gmail.com (Keith Goodman)
Date: Thu, 7 Dec 2006 09:05:31 -0800
Subject: [Numpy-discussion] pickling arrays: numpy 1.0 can't unpickle numpy 1.0.1
In-Reply-To: <4577ED37.6020208@itc.it>
References: <4577ED37.6020208@itc.it>
Message-ID: 

On 12/7/06, Emanuele Olivetti wrote:
> How can I make pickled arrays created with numpy 1.0.1 readable by numpy 1.0?

If you pickle in 1.0.1, I bet you can read it in 1.0.

I don't know why the pickle format keeps changing. But I understand
why an old version of software can't always read data generated by a
new version of software.

From kwgoodman at gmail.com Thu Dec 7 12:06:31 2006
From: kwgoodman at gmail.com (Keith Goodman)
Date: Thu, 7 Dec 2006 09:06:31 -0800
Subject: [Numpy-discussion] pickling arrays: numpy 1.0 can't unpickle numpy 1.0.1
In-Reply-To: 
References: <4577ED37.6020208@itc.it>
Message-ID: 

On 12/7/06, Keith Goodman wrote:
> On 12/7/06, Emanuele Olivetti wrote:
> > How can I make pickled arrays created with numpy 1.0.1 readable by numpy 1.0?
> > If you pickle in 1.0.1, I bet you can read it in 1.0. > > I don't know why the pickle format keeps changing. But I understand > why an old version of software can't always read data generated by a > new version of software. Sorry. I meant if you pickle in 1.0, I bet you can read it in 1.0.1. From charlesr.harris at gmail.com Thu Dec 7 12:42:06 2006 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 7 Dec 2006 10:42:06 -0700 Subject: [Numpy-discussion] pickling arrays: numpy 1.0 can't unpickle numpy 1.0.1 In-Reply-To: References: <4577ED37.6020208@itc.it> Message-ID: On 12/7/06, Keith Goodman wrote: > > On 12/7/06, Keith Goodman wrote: > > On 12/7/06, Emanuele Olivetti wrote: > > > How can I let access pickled arrays made in numpy 1.0.1 to numpy 1.0 ? > > > > If you pickle in 1.0.1, I bet you can read it in 1.0. > > > > I don't know why the pickle format keeps changing. But I understand > > why an old version of software can't always read data generated by a > > new version of software. The 1.0.x versions are supposed to be compatible. I don't see any changes to pickle in svn since before the 1.0 release, so there might be another problem here. Are there other differences between the machines? Python version, OS, endianess, 32 vs 64 bit, compiler, etc. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Thu Dec 7 12:49:56 2006 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 7 Dec 2006 10:49:56 -0700 Subject: [Numpy-discussion] Strange numpy behaviour with pickle In-Reply-To: <4565B9C5.1080108@xrce.xerox.com> References: <4565B9C5.1080108@xrce.xerox.com> Message-ID: On 11/23/06, Jerome Fuselier wrote: > > Hello, > I've discovered a small problem when I tried to save a numpy array with > the pickle module. The dumped array is not always equal to the loaded one > and the error is not always here, depending on the way I express matrix > operations. > > I illustrated the error with a small script attached with this message, > the 4th test corresponds to what I did first which didn't do what I was > expecting. > > Am I missing something or is it a bug ? > The tests all work for me. What version of numpy are you using? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Thu Dec 7 13:26:37 2006 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 07 Dec 2006 12:26:37 -0600 Subject: [Numpy-discussion] Future Python 2.3 support ? - Re: Numpy and Python 2.2 on RHEL3 In-Reply-To: References: <45759940.8080000@icecube.wisc.edu> <4575C63F.8050509@gmail.com> Message-ID: <45785CDD.5060004@gmail.com> Robert wrote: > Robert Kern wrote: >> numpy requires Python 2.3 . > > hope Python2.3 support will not be dropped too early. There is not much cost overall when going from 2.2 to 2.3. It won't. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From emanuele at relativita.com Thu Dec 7 16:19:00 2006 From: emanuele at relativita.com (emanuele at relativita.com) Date: Thu, 7 Dec 2006 22:19:00 +0100 (CET) Subject: [Numpy-discussion] pickling arrays: numpy 1.0 can't unpickle numpy 1.0.1 In-Reply-To: References: <4577ED37.6020208@itc.it> Message-ID: <38164.194.242.201.223.1165526340.squirrel@webmail.relativita.com> On Thu, December 7, 2006 6:42 pm, Charles R Harris wrote: > The 1.0.x versions are supposed to be compatible. I don't see any changes > to > pickle in svn since before the 1.0 release, so there might be another > problem here. Are there other differences between the machines? Python > version, OS, endianness, 32 vs 64 bit, compiler, etc. Both hosts are 32bit i386 with python 2.5 and different versions of numpy. Gcc version is pretty similar: - host1 : gcc version 3.4.5 20051201 (Red Hat 3.4.5-2) - host2 : gcc version 3.4.6 20060404 (Red Hat 3.4.6-3) Try my example yourself and tell me if it works for you. Thanks, Emanuele From oliphant.travis at ieee.org Thu Dec 7 19:45:48 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Thu, 07 Dec 2006 17:45:48 -0700 Subject: [Numpy-discussion] pickling arrays: numpy 1.0 can't unpickle numpy 1.0.1 In-Reply-To: <4577ED37.6020208@itc.it> References: <4577ED37.6020208@itc.it> Message-ID: Emanuele Olivetti wrote: > I'm running numpy 1.0 and 1.0.1 on several hosts and > today I've found that pickling arrays in 1.0.1 generates > problems for 1.0. An example: > --- numpy 1.0.1 --- > import numpy > import pickle > a = numpy.array([1,2,3]) > f=open('test1.pickle','w') > pickle.dump(a,f) > f.close() > --- > > If I unpickle test1.pickle in numpy 1.0 I get: > --- numpy 1.0 > >>> import numpy > >>> import pickle > >>> f=open('test1.pickle') > >>> a=pickle.load(f) > Traceback (most recent call last): > File "<stdin>", line 1, in <module> > File "/hardmnt/virgo0/sra/olivetti/myapps/lib/python2.5/pickle.py", line 1370, in load > return Unpickler(file).load() > File "/hardmnt/virgo0/sra/olivetti/myapps/lib/python2.5/pickle.py", line 858, in load > dispatch[key](self) > File "/hardmnt/virgo0/sra/olivetti/myapps/lib/python2.5/pickle.py", line 1217, in load_build > setstate(state) > TypeError: argument 1 must be sequence of length 5, not 8 > ----------------- Please show which version of numpy you are using. There were no changes to pickle from released numpy 1.0 to 1.0.1 (at least that I'm aware of). There might, however, be bugs. -Travis From Jerome.Fuselier at xrce.xerox.com Fri Dec 8 03:33:40 2006 From: Jerome.Fuselier at xrce.xerox.com (Jerome Fuselier) Date: Fri, 08 Dec 2006 09:33:40 +0100 Subject: [Numpy-discussion] Strange numpy behaviour with pickle In-Reply-To: References: <4565B9C5.1080108@xrce.xerox.com> Message-ID: <45792364.5090408@xrce.xerox.com> An HTML attachment was scrubbed... URL: From oliphant.travis at ieee.org Fri Dec 8 03:40:13 2006 From: oliphant.travis at ieee.org (Travis E. Oliphant) Date: Fri, 08 Dec 2006 01:40:13 -0700 Subject: [Numpy-discussion] pickling arrays: numpy 1.0 can't unpickle numpy 1.0.1 In-Reply-To: <4577ED37.6020208@itc.it> References: <4577ED37.6020208@itc.it> Message-ID: Emanuele Olivetti wrote: > I'm running numpy 1.0 and 1.0.1 on several hosts and > today I've found that pickling arrays in 1.0.1 generates > problems for 1.0. An example: I correct my previous statement. Yes, this is true. Pickles generated with 1.0.1 cannot be read by version 1.0. However, pickles generated with 1.0 can be read by 1.0.1.
It is typically not the case that pickles created with newer versions of the code will work with older versions. I obviously didn't think that was something to be concerned about because it slipped my mind. Changeset 3411 is the reason: the hasobject member of the data-type object was given more bits (and therefore needed to be saved). Sorry for the trouble, -Travis From emanuele at relativita.com Fri Dec 8 10:06:41 2006 From: emanuele at relativita.com (Emanuele Olivetti) Date: Fri, 08 Dec 2006 16:06:41 +0100 Subject: [Numpy-discussion] pickling arrays: numpy 1.0 can't unpickle numpy 1.0.1 In-Reply-To: References: <4577ED37.6020208@itc.it> Message-ID: <45797F81.8060204@relativita.com> Travis E. Oliphant wrote: > I correct my previous statement. Yes, this is true. Pickles generated > with 1.0.1 cannot be read by version 1.0. > > However, pickles generated with 1.0 can be read by 1.0.1. It is > typically not the case that pickles created with newer versions of the > code will work with older versions. I obviously didn't think that was > something to be concerned about because it slipped my mind. > > Changeset 3411 is the reason: the hasobject member of the data-type > object was given more bits (and therefore needed to be saved). > Thank you for the detailed explanation. I can easily upgrade the central host that collects results from other hosts in order to be able to read all results (1.0 or 1.0.1). Emanuele From weili at jimmy.harvard.edu Thu Dec 7 21:52:34 2006 From: weili at jimmy.harvard.edu (Wei) Date: Thu, 7 Dec 2006 21:52:34 -0500 Subject: [Numpy-discussion] installation error in cygwin Message-ID: <008401c71a73$ea39b480$a32d349b@WeiLiLaptop> Hi, I just got my new intel core duo laptop. So I downloaded the new cygwin (including everything) but couldn't get the numarray or numpy modules installed correctly. I always got the following error. Can someone help? Many thanks! Wei python setup.py install Using EXTRA_COMPILE_ARGS = [] Using builtin 'lite' BLAS and LAPACK running config Wrote config.h running install running build running build_py copying Lib/numinclude.py -> build/lib.cygwin-1.5.22-i686-2.4/numarray running build_ext building 'numarray.libnumarray' extension gcc -shared -Wl,--enable-auto-image-base build/temp.cygwin-1.5.22-i686-2.4/Src/libnumarraymodule.o -L/usr/lib/python2.4/config -lpython2.4 -o build/lib.cygwin-1.5.22-i686-2.4/numarray/libnumarray.dll -lm -L/lib -lm -lc -lgcc -L/lib/mingw -lmingwex /lib/mingw/libmingwex.a(feclearexcept.o):feclearexcept.c:(.text+0x21): undefined reference to `___cpu_features' /lib/mingw/libmingwex.a(fetestexcept.o):fetestexcept.c:(.text+0x7): undefined reference to `___cpu_features' collect2: ld returned 1 exit status error: command 'gcc' failed with exit status 1 -------------- next part -------------- An HTML attachment was scrubbed... URL: From tomh at kurage.nimh.nih.gov Fri Dec 8 06:47:41 2006 From: tomh at kurage.nimh.nih.gov (Tom Holroyd) Date: Fri, 8 Dec 2006 06:47:41 -0500 (EST) Subject: [Numpy-discussion] shuffle bug Message-ID: This is certainly a bug. It has been mentioned before, but there was no comment. shuffle() doesn't handle multidimensional arrays. >>> from numpy import * >>> from numpy.random import * >>> a = arange(12) >>> a.shape = (6,2) >>> a array([[ 0, 1], [ 2, 3], [ 4, 5], [ 6, 7], [ 8, 9], [10, 11]]) >>> shuffle(a) >>> a array([[ 0, 1], [ 2, 3], [ 2, 3], [ 0, 1], [ 4, 5], [10, 11]]) This is with numpy 1.0. The [0, 1] row was duplicated. That's not right. Tom Holroyd, Ph.D.
We experience the world not as it is, but as we expect it to be. From robert.kern at gmail.com Fri Dec 8 21:07:34 2006 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 08 Dec 2006 20:07:34 -0600 Subject: [Numpy-discussion] shuffle bug In-Reply-To: References: Message-ID: <457A1A66.7090602@gmail.com> Tom Holroyd wrote: > This is certainly a bug. It has been mentioned before, but there > was no comment. Yes, there was. http://projects.scipy.org/pipermail/numpy-discussion/2006-November/024783.html -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From oliphant.travis at ieee.org Fri Dec 8 21:52:33 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Fri, 08 Dec 2006 19:52:33 -0700 Subject: [Numpy-discussion] installation error in cygwin In-Reply-To: <008401c71a73$ea39b480$a32d349b@WeiLiLaptop> References: <008401c71a73$ea39b480$a32d349b@WeiLiLaptop> Message-ID: <457A24F1.1030608@ieee.org> Wei wrote: > > Hi, > > I just got my new intel core duo laptop. So I downloaded the new > cygwin (including everything) but couldn't get the numarray or numpy > modules installed correctly. I always got the following error. Can > someone help? > > Many thanks! > > Wei > > python setup.py install > > Using EXTRA_COMPILE_ARGS = [] > > Using builtin 'lite' BLAS and LAPACK > > running config > > Wrote config.h > > running install > > running build > > running build_py > > copying Lib/numinclude.py -> build/lib.cygwin-1.5.22-i686-2.4/numarray > > running build_ext > > building 'numarray.libnumarray' extension > > gcc -shared -Wl,--enable-auto-image-base > build/temp.cygwin-1.5.22-i686-2.4/Src/libnumarraymodule.o > -L/usr/lib/python2.4/config -lpython2.4 -o > build/lib.cygwin-1.5.22-i686-2.4/numarray/libnumarray.dll -lm -L/lib > -lm -lc -lgcc -L/lib/mingw -lmingwex > > /lib/mingw/libmingwex.a(feclearexcept.o):feclearexcept.c:(.text+0x21): > undefined reference to `___cpu_features' > > /lib/mingw/libmingwex.a(fetestexcept.o):fetestexcept.c:(.text+0x7): > undefined reference to `___cpu_features' > > collect2: ld returned 1 exit status > > error: command 'gcc' failed with exit status 1 > You don't need cygwin to install NumPy. I use the mingw compiler to compile windows binaries. This error looks like a problem with the platform-dependent code, but it's showing up in an odd place. I'm not sure what the issue is. -Travis From charlesr.harris at gmail.com Fri Dec 8 22:58:20 2006 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 8 Dec 2006 20:58:20 -0700 Subject: [Numpy-discussion] installation error in cygwin In-Reply-To: <457A24F1.1030608@ieee.org> References: <008401c71a73$ea39b480$a32d349b@WeiLiLaptop> <457A24F1.1030608@ieee.org> Message-ID: On 12/8/06, Travis Oliphant wrote: > > Wei wrote: > > > > Hi, > > > > I just got my new intel core duo laptop. So I downloaded the new > > cygwin (including everything) but couldn't get the numarray or numpy > > modules installed correctly. I always got the following error. Can > > someone help? > > > > Many thanks!
> > > > Wei > > python setup.py install > > Using EXTRA_COMPILE_ARGS = [] > > Using builtin 'lite' BLAS and LAPACK > > running config > > Wrote config.h > > running install > > running build > > running build_py > > copying Lib/numinclude.py -> build/lib.cygwin-1.5.22-i686-2.4/numarray > > running build_ext > > building 'numarray.libnumarray' extension > > gcc -shared -Wl,--enable-auto-image-base > > build/temp.cygwin-1.5.22-i686-2.4/Src/libnumarraymodule.o > > -L/usr/lib/python2.4/config -lpython2.4 -o > > build/lib.cygwin-1.5.22-i686-2.4/numarray/libnumarray.dll -lm -L/lib > > -lm -lc -lgcc -L/lib/mingw -lmingwex > > > > /lib/mingw/libmingwex.a(feclearexcept.o):feclearexcept.c:(.text+0x21): > > undefined reference to `___cpu_features' > > > > /lib/mingw/libmingwex.a(fetestexcept.o):fetestexcept.c:(.text+0x7): > > undefined reference to `___cpu_features' > > > > collect2: ld returned 1 exit status > > > > error: command 'gcc' failed with exit status 1 > > Looks almost like a mix of mingw and cygwin. The two aren't compatible, so I wonder if the python you have was compiled with cygwin or with vc. If the latter, mingw is the proper compiler to use. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From aisaac at american.edu Sat Dec 9 00:31:47 2006 From: aisaac at american.edu (Alan G Isaac) Date: Sat, 9 Dec 2006 00:31:47 -0500 Subject: [Numpy-discussion] shuffle bug In-Reply-To: <457A1A66.7090602@gmail.com> References: <457A1A66.7090602@gmail.com> Message-ID: > Tom Holroyd wrote: >> there was no comment. On Fri, 08 Dec 2006, Robert Kern apparently wrote: > Yes, there was. > http://projects.scipy.org/pipermail/numpy-discussion/2006-November/024783.html Also see the previous comment: http://projects.scipy.org/pipermail/numpy-discussion/2006-November/024782.html fwiw, Alan Isaac From cssmwbs at gmail.com Sat Dec 9 01:20:31 2006 From: cssmwbs at gmail.com (WB) Date: Fri, 8 Dec 2006 22:20:31 -0800 Subject: [Numpy-discussion] compile scipy by using intel compiler In-Reply-To: <1165354075.6742.5.camel@cortechs25.cortechs.net> References: <1165354075.6742.5.camel@cortechs25.cortechs.net> Message-ID: <7c13686f0612082220w56ddf420xbeb6cc27588739e6@mail.gmail.com> hi gen, have you tried the compiler designed specifically for the opteron rather than the intel? you can download it here: http://developer.amd.com/acml.jsp don't know if it will get rid of any of your errors or if it will help compile scipy, but it may be worth a try anyway. wb On 12/5/06, Gennan Chen <gnchen at cortechs.net> wrote: > > Hi! All, > > I have a dual opteron 285 with 8G ram machine. And I ran FC6 x86_64 on > that. I did manage to get numpy (from svn) compiled by using icc 9.1.0.45 and mkl > 9.0 (got 3 errors when I ran the test). But no such luck for scipy (from > svn).
Below is the error: > > Lib/special/cephes/mconf.h(137): remark #193: zero used for undefined > preprocessing identifier > #if WORDS_BIGENDIAN /* Defined in pyconfig.h */ > ^ > > Lib/special/cephes/const.c(92): error: floating-point operation result is > out of range > double INFINITY = 1.0/0.0; /* 99e999; */ > ^ > > Lib/special/cephes/const.c(97): error: floating-point operation result is > out of range > double NAN = 1.0/0.0 - 1.0/0.0; > ^ > > Lib/special/cephes/const.c(97): error: floating-point operation result is > out of range > double NAN = 1.0/0.0 - 1.0/0.0; > ^ > > compilation aborted for Lib/special/cephes/const.c (code 2) > error: Command "icc -O2 -g -fomit-frame-pointer -mcpu=pentium4 > -mtune=pentium4 -march=pentium4 -msse3 -axW -Wall -fPIC -c > Lib/special/cephes/const.c -o build/temp.linux-x86_64-2.4/Lib/special/cephes/const.o" > failed with exit status 2 > > Does anyone have a solution for this? > > BTW, the 3 errors I got from numpy are: > File > "/usr/lib64/python2.4/site-packages/numpy/lib/tests/test_ufunclike.py", line > 25, in test_ufunclike > Failed example: > nx.sign(a) > Expected: > array([ 1., -1., 0., 0., 1., -1.]) > Got: > array([ 1., -1., -1., 0., 1., -1.]) > ********************************************************************** > File > "/usr/lib64/python2.4/site-packages/numpy/lib/tests/test_ufunclike.py", line > 40, in test_ufunclike > Failed example: > nx.sign(a, y) > Expected: > array([True, True, False, False, True, True], dtype=bool) > Got: > array([True, True, True, False, True, True], dtype=bool) > ********************************************************************** > File > "/usr/lib64/python2.4/site-packages/numpy/lib/tests/test_ufunclike.py", line > 43, in test_ufunclike > Failed example: > y > Expected: > array([True, True, False, False, True, True], dtype=bool) > Got: > array([True, True, True, False, True, True], dtype=bool) > > > Are these errors serious?? > > Or maybe I should get back to gcc? Anyone got a good speedup by using icc > and mkl? > > -- > Gen-Nan Chen, PhD > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From morten.bjerkaas at gmail.com Sun Dec 10 07:40:28 2006 From: morten.bjerkaas at gmail.com (=?ISO-8859-1?Q?Morten_Bjerk=E5s?=) Date: Sun, 10 Dec 2006 13:40:28 +0100 Subject: [Numpy-discussion] griddata in python from x,y,z coordinates Message-ID: Dear list, I wonder if you know of a command in numpy similar to the griddata command in MATLAB. This command makes a grid out of a set of x,y,z coordinates. best regards Morten -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Sun Dec 10 07:45:40 2006 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 10 Dec 2006 06:45:40 -0600 Subject: [Numpy-discussion] griddata in python from x,y,z coordinates In-Reply-To: References: Message-ID: <457C0174.9010207@gmail.com> Morten Bjerkås wrote: > Dear list, > I wonder if you know of a command in numpy similar to the > griddata command in MATLAB. This command makes a grid out of a set of > x,y,z coordinates.
http://www.scipy.org/Cookbook/Matplotlib/Gridding_irregularly_spaced_data -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From pgmdevlist at gmail.com Sun Dec 10 15:06:12 2006 From: pgmdevlist at gmail.com (Pierre GM) Date: Sun, 10 Dec 2006 15:06:12 -0500 Subject: [Numpy-discussion] ANN: An alternative to numpy.core.ma Message-ID: <200612101506.12873.pgmdevlist@gmail.com> All, I just posted on the DeveloperZone of the wiki the latest version of maskedarray, an alternative to numpy.core.ma. You can download it here: http://projects.scipy.org/scipy/numpy/attachment/wiki/MaskedArray/maskedarray-1.00.dev0040.tar.gz The package has three modules: core (with the basic functions of numpy.core.ma), extras (which adds some functions, such as apply_along_axis, or the concatenator mr_), and testutils (which adds maskedarray support to the test functions). It also comes with its test suite (available in the tests subdirectory). For those of you who were not aware of it, the new MaskedArray is a subclass of ndarray, and it accepts any subclass of ndarray as data. You can use it as you would numpy.core.ma.MaskedArray. For those of you who already tested the package, the main modifications are: - the reorganization of the initial module into core+extras. - Data are now shared by default (in other words, the copy flag defaults to False in MaskedArray.__new__), for consistency with the rest of numpy. - An additional boolean flag has been introduced: keep_mask (with a default of True). This flag is useful when trying to mask a masked array: it tells __new__ whether to keep the initial mask (in that case, the new mask will be combined with the old mask) or not (in that case, the new mask replaces the old one). - Some functions/routines that were missing have been added (any/all...) As always, this is a work in progress. In particular, I should really check for the bottlenecks: would anybody have some pointers? If you wanna be on the safe, optimized side, stick to numpy.core.ma. Otherwise, please try this new implementation, and don't forget to give me some feedback! PS: Technical question: how can I delete some files in the DeveloperZone wiki? The maskedarray.py, test_maskedarray.py, test_subclasses.py files are out of date and should be replaced by the package. Thanks a lot in advance! From david at icps.u-strasbg.fr Mon Dec 11 06:32:09 2006 From: david at icps.u-strasbg.fr (R. David) Date: Mon, 11 Dec 2006 12:32:09 +0100 Subject: [Numpy-discussion] lapack_lite dgesv Message-ID: <457D41B9.1030804@icps.u-strasbg.fr> Hello, I am trying to use the lapack_lite dgesv routine. The following sample code: from numpy import * [....] a=zeros((nbrows,nbcols),float,order='C') [....] ipiv=zeros((DIM),int,order='C') [....] linalg.lapack_lite.dgesv(DIM,1,a,DIM,asarray(ipiv),b,DIM,info) leads to the following error message: lapack_lite.LapackError: Parameter ipiv is not of type PyArray_INT in lapack_lite.dgesv I don't understand the type problem for ipiv! Indeed, the type of 'a' is OK, and ipiv is created the same way as a, but something goes wrong. Do you have a clue for this? Regards, Romaric -- -------------------------------------- R. David - david at icps.u-strasbg.fr Tel.
: 03 90 24 45 48 (Fax 45 47) -------------------------------------- From alexandre.fayolle at logilab.fr Mon Dec 11 08:16:21 2006 From: alexandre.fayolle at logilab.fr (Alexandre Fayolle) Date: Mon, 11 Dec 2006 14:16:21 +0100 Subject: [Numpy-discussion] Numeric memory leak when building Numeric.array from numarray.array In-Reply-To: <1165509382.2588.54.camel@localhost.localdomain> References: <20061207155020.GA18896@crater.logilab.fr> <1165509382.2588.54.camel@localhost.localdomain> Message-ID: <20061211131621.GE18685@crater.logilab.fr> On Thu, Dec 07, 2006 at 05:36:22PM +0100, Francesc Altet wrote: > On Thu, 07 Dec 2006 at 16:50 +0100, Alexandre Fayolle wrote: > > Hi, > > > > I'm facing a memory leak on an application that has to use numarray and > > Numeric (because of external dependencies). > > > > The problem occurs when building a Numeric array from a numarray array: > > > > import Numeric > > import numarray > > import sys > > atest = numarray.arange(200) > > temp = Numeric.array(atest) > > print sys.getrefcount(atest) # prints 3 > > print sys.getrefcount(temp) # prints 2 > > > > I'm running numarray 1.5.2 and Numeric 24.2 > > Yeah, it seems like the array protocol implementation in Numeric is > leaking. Unfortunately, as Numeric maintenance has been dropped, there > is little chance that this will be fixed in the future. > > > > > I can work around this by using an intermediate string representation: > > > > temp = Numeric.fromstring(atest.tostring(), atest.typecode()) > > temp.shape = atest.shape > > Another (faster) workaround would be: > > temp2 = Numeric.fromstring(atest._data, typecode=atest.typecode()) Nice! Thanks Francesc. -- Alexandre Fayolle LOGILAB, Paris (France) Python, Zope, Plone, Debian training: http://www.logilab.fr/formations Custom software development: http://www.logilab.fr/services Scientific computing: http://www.logilab.fr/science Takeover and maintenance of CPS sites: http://www.migration-cms.com/ -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 481 bytes Desc: Digital signature URL: From tim.hochberg at ieee.org Mon Dec 11 08:20:02 2006 From: tim.hochberg at ieee.org (Tim Hochberg) Date: Mon, 11 Dec 2006 06:20:02 -0700 Subject: [Numpy-discussion] lapack_lite dgesv In-Reply-To: <457D41B9.1030804@icps.u-strasbg.fr> References: <457D41B9.1030804@icps.u-strasbg.fr> Message-ID: <457D5B02.2000501@ieee.org> R. David wrote: > Hello, > > I am trying to use the lapack_lite dgesv routine. > > The following sample code: > > from numpy import * > [....] > a=zeros((nbrows,nbcols),float,order='C') > [....] > ipiv=zeros((DIM),int,order='C') > [....] > linalg.lapack_lite.dgesv(DIM,1,a,DIM,asarray(ipiv),b,DIM,info) > > leads to the following error message: > lapack_lite.LapackError: Parameter ipiv is not of type PyArray_INT in lapack_lite.dgesv > > I don't understand the type problem for ipiv! > Indeed, the type of 'a' is OK, and ipiv is created the same way as a, but something goes > wrong. > Do you have a clue for this? > The problem is probably your definition of ipiv. "(DIM)" is just a parenthesized scalar; what you probably want is "(DIM,)", which is a one-tuple. Personally, I'd recommend using list notation ("[nbrows, nbcols]", "[DIM]") rather than tuple notation since it's both easier to read and avoids this type of mistake. Regards, -tim From david at icps.u-strasbg.fr Mon Dec 11 08:32:44 2006 From: david at icps.u-strasbg.fr (R.
David) Date: Mon, 11 Dec 2006 14:32:44 +0100 Subject: [Numpy-discussion] lapack_lite dgesv In-Reply-To: <457D5B02.2000501@ieee.org> References: <457D41B9.1030804@icps.u-strasbg.fr> <457D5B02.2000501@ieee.org> Message-ID: <457D5DFC.2060109@icps.u-strasbg.fr> Hello Tim, > > The problem is probably your definition of ipiv. "(DIM)" is just a > parenthesized scalar; what you probably want is "(DIM,)", which is a > one-tuple. Personally, I'd recommend using list notation ("[nbrows, > nbcols]", "[DIM]") rather than tuple notation since it's both easier to > read and avoids this type of mistake. I tried both notations and neither works. In the meantime, I tried extending the ipiv array to a 2-dimensional one (as if I had more than one right-hand side, for instance), but I still get the error message. Romaric From tim.hochberg at ieee.org Mon Dec 11 08:48:55 2006 From: tim.hochberg at ieee.org (Tim Hochberg) Date: Mon, 11 Dec 2006 06:48:55 -0700 Subject: [Numpy-discussion] lapack_lite dgesv In-Reply-To: <457D5DFC.2060109@icps.u-strasbg.fr> References: <457D41B9.1030804@icps.u-strasbg.fr> <457D5B02.2000501@ieee.org> <457D5DFC.2060109@icps.u-strasbg.fr> Message-ID: <457D61C7.8050308@ieee.org> R. David wrote: > Hello Tim, >> The problem is probably your definition of ipiv. "(DIM)" is just a >> parenthesized scalar; what you probably want is "(DIM,)", which is a >> one-tuple. Personally, I'd recommend using list notation ("[nbrows, >> nbcols]", "[DIM]") rather than tuple notation since it's both easier to >> read and avoids this type of mistake. >> > > I tried both notations and neither works. > > In the meantime, I tried extending the ipiv array to a 2-dimensional > one (as if I had more than one right-hand side, for instance), but I still > get the error message. > > Romaric, Try replacing 'int' with intc (or numpy.intc if you are not using 'import *'). The following 'works' for me in the sense that it doesn't throw any errors (although I imagine the results are nonsense): from numpy import * nbrows, nbcols = 10, 10 a=zeros([nbrows,nbcols],float,order='C') b = zeros([nbcols], float, order='C') DIM = nbrows info = 0 ipiv=zeros([DIM],intc,order='C') linalg.lapack_lite.dgesv(DIM,1,a,DIM,ipiv,b,DIM,info) [In the future, could you try including self-contained examples, so others don't have to go back and figure out sensible values for DIMS and info and whatnot? Ideally we'd just be able to throw them into a file and run them and get the same error that you are getting] Hope that solves it. -tim From david at icps.u-strasbg.fr Mon Dec 11 09:00:52 2006 From: david at icps.u-strasbg.fr (R. David) Date: Mon, 11 Dec 2006 15:00:52 +0100 Subject: [Numpy-discussion] lapack_lite dgesv In-Reply-To: <457D61C7.8050308@ieee.org> References: <457D41B9.1030804@icps.u-strasbg.fr> <457D5B02.2000501@ieee.org> <457D5DFC.2060109@icps.u-strasbg.fr> <457D61C7.8050308@ieee.org> Message-ID: <457D6494.3080103@icps.u-strasbg.fr> Hello, > Try replacing 'int' with intc (or numpy.intc if you are not using > 'import *'). The following 'works' for me in the sense that it doesn't > throw any errors (although I imagine the results are nonsense): Thanks, it works now!
Sorry for not including the whole code, I just didn't want to annoy the whole list with bunches of code :-) Regards, Romaric From tim.hochberg at ieee.org Mon Dec 11 09:01:43 2006 From: tim.hochberg at ieee.org (Tim Hochberg) Date: Mon, 11 Dec 2006 07:01:43 -0700 Subject: [Numpy-discussion] lapack_lite dgesv In-Reply-To: <457D6494.3080103@icps.u-strasbg.fr> References: <457D41B9.1030804@icps.u-strasbg.fr> <457D5B02.2000501@ieee.org> <457D5DFC.2060109@icps.u-strasbg.fr> <457D61C7.8050308@ieee.org> <457D6494.3080103@icps.u-strasbg.fr> Message-ID: <457D64C7.6030905@ieee.org> R. David wrote: > Hello, > > > >> Try replacing 'int' with intc (or numpy.intc if you are not using >> 'import *'). The following 'works' for me in the sense that it doesn't >> throw any errors (although I imagine the results are nonsense): >> > Thanks, it works now! > Great. Glad that helped. > Sorry for not including the whole code, I just didn't want to annoy the whole list > with bunches of code :-) > Don't include the whole code -- that would probably annoy somebody. What you included was almost right, except that it should run on its own. So nbrows, nbcols, DIMS and info all needed to be defined. That's all. -tim From faltet at carabos.com Mon Dec 11 10:33:18 2006 From: faltet at carabos.com (Francesc Altet) Date: Mon, 11 Dec 2006 16:33:18 +0100 Subject: [Numpy-discussion] Numeric memory leak when building Numeric.array from numarray.array In-Reply-To: <20061211131621.GE18685@crater.logilab.fr> References: <20061207155020.GA18896@crater.logilab.fr> <1165509382.2588.54.camel@localhost.localdomain> <20061211131621.GE18685@crater.logilab.fr> Message-ID: <1165851198.2847.12.camel@localhost.localdomain> On Mon, 11 Dec 2006 at 14:16 +0100, Alexandre Fayolle wrote: > > > I can work around this by using an intermediate string representation: > > > > > > temp = Numeric.fromstring(atest.tostring(), atest.typecode()) > > > temp.shape = atest.shape > > > > Another (faster) workaround would be: > > > > temp2 = Numeric.fromstring(atest._data, typecode=atest.typecode()) > > Nice! Well, I have to say that this approach only works for contiguous, non-offset arrays, as can be seen in: In [59]:atest = numarray.arange(10) In [60]:Numeric.fromstring(atest[5:]._data, typecode=atest.typecode()) Out[60]:array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9],'i') # wrong! In [61]:Numeric.fromstring(atest[5:].tostring(), atest.typecode()) Out[61]:array([5, 6, 7, 8, 9],'i') # good In [62]:Numeric.fromstring(atest[::2]._data, typecode=atest.typecode()) Out[62]:array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9],'i') # wrong! In [63]:Numeric.fromstring(atest[::2].tostring(), atest.typecode()) Out[63]:array([0, 2, 4, 6, 8],'i') # good So, be careful when using it. I'd rather keep using your approach, which is the faster one that is completely general. -- Francesc Altet | Be careful about using the following code -- Carabos Coop. V. | I've only proven that it works, www.carabos.com | I haven't tested it. -- Donald Knuth From abli at freemail.hu Mon Dec 11 12:51:29 2006 From: abli at freemail.hu (Abel Daniel) Date: Mon, 11 Dec 2006 17:51:29 +0000 (UTC) Subject: [Numpy-discussion] a==b for numpy arrays Message-ID: > Hi! My unittests got broken because 'a==b' for numpy arrays returns an array instead of returning True or False: >>> import numpy >>> a = numpy.array([1, 2]) >>> b = numpy.array([1, 4]) >>> a==b array([True, False], dtype=bool) This means, for example: >>> if a==b: ... print 'equal' ... Traceback (most recent call last): File "<stdin>", line 1, in ?
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all() >>> Now, I think that having a way of getting an element-wise comparison (i.e. getting an array of bools) is great. _But_ why make that the result of a '==' comparison? Is there any actual code that does, for example >>> result_array = a==b or any variant thereof? Thanks in advance, Daniel From david.huard at gmail.com Mon Dec 11 13:18:18 2006 From: david.huard at gmail.com (David Huard) Date: Mon, 11 Dec 2006 13:18:18 -0500 Subject: [Numpy-discussion] a==b for numpy arrays In-Reply-To: References: Message-ID: <91cf711d0612111018tbf35b22h1b0b5f26a2fa1264@mail.gmail.com> Hi Daniel, Just out of curiosity, what's wrong with if all(a==b): ... ? Cheers, David 2006/12/11, Abel Daniel : > > > > Hi! > > My unittests got broken because 'a==b' for numpy arrays returns an > array instead of returning True or False: > > >>> import numpy > >>> a = numpy.array([1, 2]) > >>> b = numpy.array([1, 4]) > >>> a==b > array([True, False], dtype=bool) > > This means, for example: > >>> if a==b: > ... print 'equal' > ... > Traceback (most recent call last): > File "<stdin>", line 1, in ? > ValueError: The truth value of an array with more than one element is > ambiguous. > Use a.any() or a.all() > >>> > > > Now, I think that having a way of getting an element-wise comparison > (i.e. getting an array of bools) is great. _But_ why make that the > result of a '==' comparison? Is there any actual code that does, for > example > >>> result_array = a==b > or any variant thereof? > > Thanks in advance, > Daniel > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Mon Dec 11 14:32:27 2006 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 11 Dec 2006 13:32:27 -0600 Subject: [Numpy-discussion] a==b for numpy arrays In-Reply-To: References: Message-ID: <457DB24B.6020500@gmail.com> Abel Daniel wrote: > Hi! > > My unittests got broken because 'a==b' for numpy arrays returns an > array instead of returning True or False: > >>>> import numpy >>>> a = numpy.array([1, 2]) >>>> b = numpy.array([1, 4]) >>>> a==b > array([True, False], dtype=bool) > > This means, for example: >>>> if a==b: > ... print 'equal' > ... > Traceback (most recent call last): > File "<stdin>", line 1, in ? > ValueError: The truth value of an array with more than one element is ambiguous. > Use a.any() or a.all() > > > Now, I think that having a way of getting an element-wise comparison > (i.e. getting an array of bools) is great. _But_ why make that the > result of a '==' comparison? Is there any actual code that does, for > example >>>> result_array = a==b > or any variant thereof? Yes, a lot. Rich comparisons have been in Numeric for quite a long time now. I'm not sure what version of Numeric you were transitioning from that didn't do this, but it must have been extremely old. I suspect, however, that you were using a relatively recent version of Numeric and simply did not know that the rich comparisons were taking place. Now, what did change from Numeric to numpy (also, from Numeric to numarray) is that arrays can no longer be used as a truth value. It used to be that Numeric arrays' truth value was the same as Numeric.sometrue(arr).
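In numpy, both readings are spelled out explicitly as methods; a quick illustration with the a and b from above:

>>> (a == b).any()   # the old Numeric truth value
True
>>> (a == b).all()
False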
It is likely that your unit tests were expecting (a == b) to be the same as Numeric.alltrue(a == b). Since this is not the case, your unit tests had bugs. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From abli at freemail.hu Mon Dec 11 15:41:05 2006 From: abli at freemail.hu (Abel Daniel) Date: Mon, 11 Dec 2006 20:41:05 +0000 (UTC) Subject: [Numpy-discussion] a==b for numpy arrays References: <457DB24B.6020500@gmail.com> Message-ID: Robert Kern gmail.com> writes: > > Abel Daniel wrote: > > Now, I think that having a way of getting an element-wise comparison > > (i.e. getting an array of bools) is great. _But_ why make that the > > result of a '==' comparison? Is there any actual code that does, for > > example > >>>> result_array = a==b > > or any variant thereof? > > Yes, a lot. > And would it be much more cumbersome to use something like numpy.eq_as_array(a,b) or a.eq_as_array(b) in these cases? Could you show an example so that I can better appreciate the difference? The thing I can't get into my head is that '=' in the mathematical sense has a well-defined meaning for matrices; this seems to be broken by the current behaviour. That is, what "A+B" on a blackboard in a math class means maps nicely to what 'a+b' means with a and b being numpy arrays. But 'A=B' means something completely different than 'a==b'. I tried to dig up something about this "'a==b' returns an array" decision from the discussion surrounding PEP 207 (on comp.lang.python or on python-dev) but I got lost in that thread. -- Daniel From tim.hochberg at ieee.org Mon Dec 11 16:09:50 2006 From: tim.hochberg at ieee.org (Tim Hochberg) Date: Mon, 11 Dec 2006 14:09:50 -0700 Subject: [Numpy-discussion] a==b for numpy arrays In-Reply-To: References: <457DB24B.6020500@gmail.com> Message-ID: <457DC91E.8070509@ieee.org> Abel Daniel wrote: > Robert Kern gmail.com> writes: > > >> Abel Daniel wrote: >> >>> Now, I think that having a way of getting an element-wise comparison >>> (i.e. getting an array of bools) is great. _But_ why make that the >>> result of a '==' comparison? Is there any actual code that does, for >>> example >>> >>>>>> result_array = a==b >>>>>> >>> or any variant thereof? >>> >> Yes, a lot. >> >> > And would it be much more cumbersome to use something like > numpy.eq_as_array(a,b) or a.eq_as_array(b) in these cases? > Yes. > Could you show an example so that I can better appreciate the difference? > # Replace all zeros with something safe so some calculation doesn't go insane. a[a==0] = DELTA Keep in mind also that all of the comparison operators are overloaded. It would be difficult to explain if "a<=0" returned an array, but "a==0" returned a scalar. > The thing I can't get into my head is that '=' in the mathematical sense has a > well-defined meaning for matrices; this seems to be broken by the current > behaviour. Numpy is not really about matrices. Numpy is about arrays, which are different and, for the most part, more powerful. You can use matrices inside numpy if you insist, but I personally think you're better off just learning to use arrays. Tastes vary though. > That is, what "A+B" on a blackboard in a math class means maps nicely > to what 'a+b' means with a and b being numpy arrays. But 'A=B' means something > completely different than 'a==b'.
> One thing to keep in mind is that what you have in mind, which is equivalent to numpy.all(a==b), is almost always a bad idea when using floating point. -tim From robert.kern at gmail.com Mon Dec 11 16:17:39 2006 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 11 Dec 2006 15:17:39 -0600 Subject: [Numpy-discussion] a==b for numpy arrays In-Reply-To: References: <457DB24B.6020500@gmail.com> Message-ID: <457DCAF3.5080106@gmail.com> Abel Daniel wrote: > Robert Kern gmail.com> writes: > >> Abel Daniel wrote: >>> Now, I think that having a way of getting an element-wise comparison >>> (i.e. getting an array of bools) is great. _But_ why make that the >>> result of a '==' comparison? Is there any actual code that does, for >>> example >>>>>> result_array = a==b >>> or any variant thereof? >> Yes, a lot. >> > And would it be much more cumbersome to use something like > numpy.eq_as_array(a,b) or a.eq_as_array(b) in these cases? numpy.equal() is the ufunc corresponding to the == operation. > Could you show an example so that I can better appreciate the difference? a[a < 0] = 0 a[less(a, 0)] = 0 ma.masked_array(crufty_data, mask=(crufty_data==9999)) ma.masked_array(crufty_data, mask=equal(crufty_data, 9999)) (a >= b) & (a <= c) greater_equal(a,b) & less_equal(a,c) null_space = a[s <= eps] null_space = u[less_equal(s, eps)] > The thing I can't get into my head is that '=' in the mathematical sense has a > well-defined meaning for matrices; this seems to be broken by the current > behaviour. That is, what "A+B" on a blackboard in a math class means maps nicely > to what 'a+b' means with a and b being numpy arrays. But 'A=B' means something > completely different than 'a==b'. Well, yes. Computer languages reuse symbols that have other meanings in other contexts. For that matter 'a = b' in Python is definitely not the same thing as 'A = B' on the blackboard. Suffice it to say that a large majority of people felt that rich comparisons (and specifically rich comparisons for Numeric arrays) were enough of an improvement over the use of functions to do the same thing that we got the language changed to support it. Perhaps it is simply a matter of taste as to whether or not one thinks it is an improvement, but enough people think it is that it won't be changing back. > I tried to dig up something about this "'a==b' returns an array" decision from > the discussion surrounding PEP 207 (on comp.lang.python or on python-dev) but I > got lost in that thread. Most of the results of that discussion are in the PEP itself. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From a.u.r.e.l.i.a.n at gmx.net Mon Dec 11 18:18:18 2006 From: a.u.r.e.l.i.a.n at gmx.net (Johannes Loehnert) Date: Tue, 12 Dec 2006 00:18:18 +0100 Subject: [Numpy-discussion] a==b for numpy arrays In-Reply-To: References: <457DB24B.6020500@gmail.com> Message-ID: <200612120018.18739.a.u.r.e.l.i.a.n@gmx.net> Hi, > current behaviour. That is, what "A+B" on a blackboard in a math class > means maps nicely to what 'a+b' means with a and b being numpy arrays. But > 'A=B' means something completely different than 'a==b'. This mapping is dangerous; I think A+B and A-B might be the only cases where it actually works. A*B and A/B (=A*inv(B)) are completely different from a*b in Python. As it is, you only have to remember that every binary operator works element-wise.
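A quick illustration (plain numpy; the values are arbitrary):

>>> import numpy
>>> A = numpy.array([[1., 2.], [3., 4.]])
>>> B = numpy.array([[0., 1.], [1., 0.]])
>>> A * B              # element-wise, not the blackboard matrix product
array([[ 0.,  2.],
       [ 3.,  0.]])
>>> numpy.dot(A, B)    # the blackboard A*B
array([[ 2.,  1.],
       [ 4.,  3.]])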
Reasoning aside, just wrap an all(...) around your if-comparisons and you will be fine. :-) Johannes From hirzel at resonon.com Mon Dec 11 18:26:56 2006 From: hirzel at resonon.com (Tim Hirzel) Date: Mon, 11 Dec 2006 18:26:56 -0500 Subject: [Numpy-discussion] fromfile and tofile access with a tempfile.TemporaryFile() Message-ID: <457DE940.1060807@resonon.com> Hi, Does anyone know how to get fromfile and tofile to work from a tempfile.TemporaryFile? Or if it's not possible? I am getting this: >>> import tempfile >>> f = tempfile.TemporaryFile() >>> f <open file '<fdopen>', mode 'w+b' at 0x01EE1728> >>> a = numpy.arange(10) >>> a.tofile(f) Traceback (most recent call last): File "<stdin>", line 1, in ? IOError: first argument must be a string or open file thanks! tim From lists.steve at arachnedesign.net Mon Dec 11 18:37:24 2006 From: lists.steve at arachnedesign.net (Steve Lianoglou) Date: Mon, 11 Dec 2006 18:37:24 -0500 Subject: [Numpy-discussion] a==b for numpy arrays In-Reply-To: <457DCAF3.5080106@gmail.com> References: <457DB24B.6020500@gmail.com> <457DCAF3.5080106@gmail.com> Message-ID: Hi, It's not relevant to the point of this discussion all that much, but: > a[a < 0] = 0 > a[less(a, 0)] = 0 Instead I've been doing something like: a[where(a < 0)] = 0 I didn't realize you could do it the other way. Is there a difference somewhere between the two, or are they interchangeable? I kind of like the shorter way (sans where clause) better ... Thanks, -steve From lists.steve at arachnedesign.net Mon Dec 11 18:40:40 2006 From: lists.steve at arachnedesign.net (Steve Lianoglou) Date: Mon, 11 Dec 2006 18:40:40 -0500 Subject: [Numpy-discussion] a==b for numpy arrays In-Reply-To: References: <457DB24B.6020500@gmail.com> <457DCAF3.5080106@gmail.com> Message-ID: <0E86BF16-C8C7-40B1-95A3-8AC8E2CC9FF6@arachnedesign.net> > It's not relevant to the point of this discussion all that much, but: > >> a[a < 0] = 0 >> a[less(a, 0)] = 0 > > Instead I've been doing something like: > > a[where(a < 0)] = 0 > > I didn't realize you could do it the other way. Is there a > difference somewhere between the two, or are they interchangeable? Ah ... I see, w/o the where it returns a boolean array. I reckon that's actually better to use than the where clause for cases like this since (for one) it'll take up less memory than arrays of ints. Sorry for talking to myself ... -steve From kwgoodman at gmail.com Mon Dec 11 18:48:41 2006 From: kwgoodman at gmail.com (Keith Goodman) Date: Mon, 11 Dec 2006 15:48:41 -0800 Subject: [Numpy-discussion] a==b for numpy arrays In-Reply-To: <0E86BF16-C8C7-40B1-95A3-8AC8E2CC9FF6@arachnedesign.net> References: <457DB24B.6020500@gmail.com> <457DCAF3.5080106@gmail.com> <0E86BF16-C8C7-40B1-95A3-8AC8E2CC9FF6@arachnedesign.net> Message-ID: On 12/11/06, Steve Lianoglou wrote: > > It's not relevant to the point of this discussion all that much, but: > > > >> a[a < 0] = 0 > >> a[less(a, 0)] = 0 > > > > Instead I've been doing something like: > > > > a[where(a < 0)] = 0 > > > > I didn't realize you could do it the other way. Is there a > > difference somewhere between the two, or are they interchangeable? > > Ah ... I see, w/o the where it returns a boolean array. I reckon that's > actually better to use than the where clause for cases like this > since (for one) it'll take up less memory than arrays of ints. These are different: a[a[:,0] >0, :] a[where(a[:,0].A >0)[0],:] I think it would be great if the former gave the same result as the latter.
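In the meantime I work around it with explicit integer row indices, something like this (just a sketch; a is a matrix here, which is why .A is needed to get at the underlying array):

>>> import numpy
>>> rows = numpy.where(a.A[:, 0] > 0)[0]   # integer indices of the wanted rows
>>> a[rows, :]                             # selects whole rows predictably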
From charlesr.harris at gmail.com Mon Dec 11 18:50:45 2006 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 11 Dec 2006 16:50:45 -0700 Subject: [Numpy-discussion] fromfile and tofile access with a tempfile.TemporaryFile() In-Reply-To: <457DE940.1060807@resonon.com> References: <457DE940.1060807@resonon.com> Message-ID: On 12/11/06, Tim Hirzel wrote: > > Hi, > Does anyone know how to get fromfile and tofile to work from a > tempfile.TemporaryFile? Or if it's not possible? > > I am getting this: > >>> import tempfile > >>> f = tempfile.TemporaryFile() > >>> f > <open file '<fdopen>', mode 'w+b' at 0x01EE1728> > >>> a = numpy.arange(10) > >>> a.tofile(f) > Traceback (most recent call last): > File "<stdin>", line 1, in ? > IOError: first argument must be a string or open file Works for me: In [16]: f = tempfile.TemporaryFile() In [17]: a = ones(10) In [18]: a.tofile(f) In [19]: f.seek(0) In [20]: b = fromfile(f) In [21]: b Out[21]: array([ 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.]) In [22]: f.close() What version of numpy are you running? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From Chris.Barker at noaa.gov Mon Dec 11 19:35:54 2006 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Mon, 11 Dec 2006 16:35:54 -0800 Subject: [Numpy-discussion] a==b for numpy arrays In-Reply-To: <0E86BF16-C8C7-40B1-95A3-8AC8E2CC9FF6@arachnedesign.net> References: <457DB24B.6020500@gmail.com> <457DCAF3.5080106@gmail.com> <0E86BF16-C8C7-40B1-95A3-8AC8E2CC9FF6@arachnedesign.net> Message-ID: <457DF96A.3080602@noaa.gov> Steve Lianoglou wrote: >> a[where(a < 0)] = 0 > Ah ... I see, w/o the where it returns a boolean array. I reckon that's > actually better to use than the where clause for cases like this > since (for one) it'll take up less memory than arrays of ints. Not to mention that you're creating an entire temporary array for no reason when you use where. The above statement creates a boolean array for a < 0, then creates another array with the where statement. Where is very handy when you want a new array, created according to some element-wise condition: b = where(a > 0, 10, 0) -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov
To determine if the comparison returns true for every element, all one has to do is use the 'all' method - not a huge amount of overhead, and now rather ubiquitous (in my experience) throughout the numerical software community (not to mention that rich comparison is _much_ more flexible, and in that, powerful). Oh, and another convenience method of which you should be aware is 'any', which returns true if any of the element-wise comparisons are true. DG > I tried to dig up something about this "'a==b' returns an array" decision from > the discussion surrounding PEP 207 (on comp.lang.python or on python-dev) but I > got lost in that thread. > > From david at ar.media.kyoto-u.ac.jp Tue Dec 12 02:49:30 2006 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Tue, 12 Dec 2006 16:49:30 +0900 Subject: [Numpy-discussion] Definition of correlation, correlate and so on ? Message-ID: <457E5F0A.4080707@ar.media.kyoto-u.ac.jp> Hi, I am polishing some code to compute autocorrelation using fft, and when testing the code against numpy.correlate, I realised that I am not sure about the definition... There are various functions related to correlation as far as numpy/scipy is concerned: numpy.correlate numpy.corrcoef scipy.signal.correlate For me, the correlation between two sequences X and Y at lag t is the sum(X[i] * Y*[i+lag]) where Y* is the complex conjugate of Y. numpy.correlate does not use the conjugate, and neither does scipy.signal.correlate; and I don't understand numpy.corrcoef. I've never seen complex correlation used without the conjugate, so I was curious why this definition was used? It is incompatible with the correlation as a scalar product, for example. Could someone give the definition used by those functions? cheers, David From svetosch at gmx.net Tue Dec 12 06:18:50 2006 From: svetosch at gmx.net (Sven Schreiber) Date: Tue, 12 Dec 2006 12:18:50 +0100 Subject: [Numpy-discussion] numpy book In-Reply-To: <454A1197.1020606@ee.byu.edu> References: <4549E34A.4060507@gmx.net> <454A1197.1020606@ee.byu.edu> Message-ID: <457E901A.1030907@gmx.net> Travis Oliphant schrieb: >> Note that this is not a request to Travis to send me the latest version >> by private email. That would be inefficient and my need is not that >> urgent. Nevertheless I think that issue should be settled. >> >> > There will be an update, soon. I'm currently working on the index, > corrections, and formatting issues. > > The update will be sent in conjunction with the release of 1.0.1 which I > am targeting in 2 weeks. > I don't want to be a PITA, but should I have received something now that 1.0.1 is out? I can also offer help with formatting/editing the LyX source file if that's the problem. cheers, sven From charlesr.harris at gmail.com Tue Dec 12 08:12:16 2006 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 12 Dec 2006 06:12:16 -0700 Subject: [Numpy-discussion] Definition of correlation, correlate and so on ? In-Reply-To: <457E5F0A.4080707@ar.media.kyoto-u.ac.jp> References: <457E5F0A.4080707@ar.media.kyoto-u.ac.jp> Message-ID: On 12/12/06, David Cournapeau wrote: > > Hi, > > I am polishing some code to compute autocorrelation using fft, and > when testing the code against numpy.correlate, I realised that I am not > sure about the definition...
There are various functions related to > correlation as far as numpy/scipy is concerned: > > numpy.correlate > numpy.corrcoef > scipy.signal.correlate > > For me, the correlation between two sequences X and Y at lag t is > the sum(X[i] * Y*[i+lag]) where Y* is the complex conjugate of Y. > numpy.correlate does not use the conjugate, and neither does > scipy.signal.correlate; and I don't understand numpy.corrcoef. I've never seen complex > correlation used without the conjugate, so I was curious why this Neither have I, it is one of those oddities that may have been inherited from Numeric. I wouldn't mind seeing it changed but it is probably a bit late for that. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From david at ar.media.kyoto-u.ac.jp Tue Dec 12 08:17:43 2006 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Tue, 12 Dec 2006 22:17:43 +0900 Subject: [Numpy-discussion] Definition of correlation, correlate and so on ? In-Reply-To: References: <457E5F0A.4080707@ar.media.kyoto-u.ac.jp> Message-ID: <457EABF7.3090302@ar.media.kyoto-u.ac.jp> Charles R Harris wrote: > > > On 12/12/06, *David Cournapeau* > wrote: > > Hi, > > I am polishing some code to compute autocorrelation using fft, and > when testing the code against numpy.correlate, I realised that I > am not > sure about the definition... There are various functions related to > correlation as far as numpy/scipy is concerned: > > numpy.correlate > numpy.corrcoef > scipy.signal.correlate > > For me, the correlation between two sequences X and Y at lag t is > the sum(X[i] * Y*[i+lag]) where Y* is the complex conjugate of Y. > numpy.correlate does not use the conjugate, and neither does > scipy.signal.correlate; and I don't understand numpy.corrcoef. I've never seen complex > correlation used without the conjugate, so I was curious why this > > > Neither have I, it is one of those oddities that may have been > inherited from Numeric. I wouldn't mind seeing it changed but it is > probably a bit late for that. Well, I would myself call this a bug, not a feature, unless at least the doc specifies the behaviour; the point of my question was to get the opinion of others on this point. Anyway, a function that implements the 'real' cross-correlation as defined in signal processing and statistics is a must-have IMHO, David From aisaac at american.edu Tue Dec 12 09:40:02 2006 From: aisaac at american.edu (Alan G Isaac) Date: Tue, 12 Dec 2006 09:40:02 -0500 Subject: [Numpy-discussion] Definition of correlation, correlate and so on ? In-Reply-To: References: <457E5F0A.4080707@ar.media.kyoto-u.ac.jp> Message-ID: > On 12/12/06, David Cournapeau wrote: >> I am polishing some code to compute autocorrelation using >> fft, and when testing the code against numpy.correlate, >> I realised that I am not sure about the definition... >> There are various functions related to correlation as far >> as numpy/scipy is concerned: >> numpy.correlate >> numpy.corrcoef >> scipy.signal.correlate >> For me, the correlation between two sequences X and Y at lag t is >> the sum(X[i] * Y*[i+lag]) where Y* is the complex conjugate of Y. >> numpy.correlate does not use the conjugate, and neither does >> scipy.signal.correlate; and I don't understand numpy.corrcoef. I've never seen complex >> correlation used without the conjugate, so I was curious why this
I wouldn't mind seeing it > changed but it is probably a bit late for that. I hope that "too late" is not a determining argument! I hope the argument will address the following: - was there a justification for the extant behavior? If so, what was it, and does it still seem valid? - is the current definition reasonable; does it match definitions in use in at least some domain? http://mathworld.wolfram.com/Cross-Correlation.html http://en.wikipedia.org/wiki/Cross-correlation - if not, is this behavior so unexpected as to be considered a bug? - are many existing applications depending on it? The worst case is: it is a bug, but many existing users depend on the current behavior. I am not taking a position, but that seems the current view on this list. I hope that *if* that is the assessment, then a transition path will be plotted. For example, a keyword could be added, with a proper default, and a warning emitted when it is not set. Cheers, Alan Isaac From david.huard at gmail.com Tue Dec 12 10:02:12 2006 From: david.huard at gmail.com (David Huard) Date: Tue, 12 Dec 2006 10:02:12 -0500 Subject: [Numpy-discussion] Definition of correlation, correlate and so on ? In-Reply-To: References: <457E5F0A.4080707@ar.media.kyoto-u.ac.jp> Message-ID: <91cf711d0612120702t1e7926e2y6aebd65454accc62@mail.gmail.com> > > - if not, is this behavior so unexpected as to be considered > a bug? > - are many existing applications depending on it? > > The worst case is: > it is a bug, but many existing users depend on the current behavior. > I am not taking a position, but that seems the current view on this list. > I hope that *if* that is the assessment, then a transition > path will be plotted. For example, a keyword could be > added, with a proper default, and a warning emitted when it > is not set. > +1 for a change. I'm not using the current implementation. Since it was undocumented, I prefered coding my own. David -------------- next part -------------- An HTML attachment was scrubbed... URL: From hirzel at resonon.com Tue Dec 12 11:26:33 2006 From: hirzel at resonon.com (Tim Hirzel) Date: Tue, 12 Dec 2006 11:26:33 -0500 Subject: [Numpy-discussion] fromfile and tofile access with a tempfile.TemporaryFile() In-Reply-To: References: <457DE940.1060807@resonon.com> Message-ID: <457ED839.4080604@resonon.com> Hi Chuck, Thanks for checking that. I am running numpy 1.0.1 in python 2.4 on win32 (xp). Are you on linux? I double checked the behavior in 1.0 and 1.0.1, just to be extra sure, and it thows the IOError in both cases. tim Charles R Harris wrote: > > > On 12/11/06, *Tim Hirzel* > wrote: > > Hi, > Does anyone know how to get fromfile and tofile to work from a > tempfile.TemporaryFile? Or if its not possible? > > I am getting this: > >>> import tempfile > >>> f = tempfile.TemporaryFile () > >>> f > ', mode 'w+b' at 0x01EE1728> > >>> a = numpy.arange(10) > >>> a.tofile(f) > Traceback (most recent call last): > File "", line 1, in ? > IOError: first argument must be a string or open file > > > Works for me: > > In [16]: f = tempfile.TemporaryFile() > > In [17]: a = ones(10) > > In [18]: a.tofile(f) > > In [19]: f.seek(0) > > In [20]: b = fromfile(f) > > In [21]: b > Out[21]: array([ 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.]) > > In [22]: f.close() > > What version of numpy are you running? 
>
> Chuck
>
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>

From pgmdevlist at gmail.com Tue Dec 12 11:38:19 2006
From: pgmdevlist at gmail.com (Pierre GM)
Date: Tue, 12 Dec 2006 11:38:19 -0500
Subject: [Numpy-discussion] Definition of correlation, correlate and so on ?
In-Reply-To: <91cf711d0612120702t1e7926e2y6aebd65454accc62@mail.gmail.com>
References: <457E5F0A.4080707@ar.media.kyoto-u.ac.jp> <91cf711d0612120702t1e7926e2y6aebd65454accc62@mail.gmail.com>
Message-ID: <200612121138.19906.pgmdevlist@gmail.com>

> +1 for a change. I'm not using the current implementation. Since it was
> undocumented, I preferred coding my own.

Same case as David. I found it easier to code something with FFTs than
trying to understand what was going on.

From tim.hochberg at ieee.org Tue Dec 12 12:07:24 2006
From: tim.hochberg at ieee.org (Tim Hochberg)
Date: Tue, 12 Dec 2006 10:07:24 -0700
Subject: [Numpy-discussion] Definition of correlation, correlate and so on ?
In-Reply-To: <457EABF7.3090302@ar.media.kyoto-u.ac.jp>
References: <457E5F0A.4080707@ar.media.kyoto-u.ac.jp> <457EABF7.3090302@ar.media.kyoto-u.ac.jp>
Message-ID: <457EE1CC.9080806@ieee.org>

David Cournapeau wrote:
> Charles R Harris wrote:
>
>> On 12/12/06, *David Cournapeau* wrote:
>>
>>     Hi,
>>
>>     I am polishing some code to compute autocorrelation using fft, and
>>     when testing the code against numpy.correlate, I realised that I
>>     am not sure about the definition... There are various functions
>>     related to correlation as far as numpy/scipy is concerned:
>>
>>     numpy.correlate
>>     numpy.corrcoef
>>     scipy.signal.correlate
>>
>>     For me, the correlation between two sequences X and Y at lag t is
>>     the sum(X[i] * Y*[i+lag]) where Y* is the complex conjugate of Y.
>>     numpy.correlate does not use the conjugate, scipy.signal.correlate as
>>     well, and I don't understand numpy.corrcoef. I've never seen complex
>>     correlation used without the conjugate, so I was curious why this
>>
>>
>> Neither have I, it is one of those oddities that may have been
>> inherited from Numeric. I wouldn't mind seeing it changed but it is
>> probably a bit late for that.
>>
> Well, I would myself call this a bug, not a feature, unless at least the
> doc specifies the behaviour; the point of my question was to get the
> opinion of others on this point. Anyway, a function that implements the
> 'real' cross correlation as defined in signal processing and statistics
> is a must have IMHO,
>
It's unfriendly to modify the behavior of a function like this in a
point release. And, this particular type of modification is particularly
unfriendly since any code that depends on the current behavior won't
break cleanly, but will start producing failures, possibly intermittent,
data-dependent failures, which are especially troublesome.

In addition, neither the name correlation nor its docstring is strongly,
cough, correlated with cross-correlation. The docstring claims that it's
the "discrete, linear correlation", which appears to mean nothing in my
far from exhaustive web search.

So rather than "fixing" the function, I would first propose introducing
a function with a more descriptive name and docstring, for example you
could steal the name 'xcorr' from matlab.
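For concreteness, a minimal sketch of the conjugated definition David
gives, sum(X[i] * Y*[i+lag]), under such a name (a hypothetical 'xcorr',
not an existing numpy function; it handles non-negative lags only):

    import numpy

    def xcorr(x, y, lag):
        # cross-correlation at one non-negative lag, using the
        # conjugate of y: sum(x[i] * conj(y[i + lag]))
        x, y = numpy.asarray(x), numpy.asarray(y)
        n = max(0, min(len(x), len(y) - lag))   # overlapping samples
        return (x[:n] * numpy.conj(y[lag:lag + n])).sum()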
Then if in fact the behavior
of correlate is deemed to be an error, deprecate it and start issuing a
warning in the next point release, then remove it in the next major
release.

Even better, IMO, would be if someone who cares about this stuff pulls
together all the related signal processing stuff and moves them to a
submodule so we could actually find what signal processing primitives
are available. At the same time, more informative docstrings would be
great.

-tim

From charlesr.harris at gmail.com Tue Dec 12 13:00:38 2006
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Tue, 12 Dec 2006 11:00:38 -0700
Subject: [Numpy-discussion] fromfile and tofile access with a tempfile.TemporaryFile()
In-Reply-To: <457ED839.4080604@resonon.com>
References: <457DE940.1060807@resonon.com> <457ED839.4080604@resonon.com>
Message-ID:

On 12/12/06, Tim Hirzel wrote:
>
> Hi Chuck,
> Thanks for checking that. I am running numpy 1.0.1 in python 2.4 on
> win32 (xp). Are you on linux? I double checked the behavior in 1.0
> and 1.0.1, just to be extra sure, and it throws the IOError in both cases.
> tim

I'm running linux and the current svn version of numpy. Maybe the
problem is with the tempfile module on windows. Do fromfile and tofile
work for files opened normally?

Chuck

From aisaac at american.edu Tue Dec 12 13:06:03 2006
From: aisaac at american.edu (Alan G Isaac)
Date: Tue, 12 Dec 2006 13:06:03 -0500
Subject: [Numpy-discussion] Definition of correlation, correlate and so on ?
In-Reply-To: <457EE1CC.9080806@ieee.org>
References: <457E5F0A.4080707@ar.media.kyoto-u.ac.jp> <457EABF7.3090302@ar.media.kyoto-u.ac.jp> <457EE1CC.9080806@ieee.org>
Message-ID:

On Tue, 12 Dec 2006, Tim Hochberg apparently wrote:
> So rather than "fixing" the function, I would first
> propose introducing a function with a more descriptive
> name and docstring, for example you could steal the name
> 'xcorr' from matlab. Then if in fact the behavior of
> correlate is deemed to be an error, deprecate it and start
> issuing a warning in the next point release, then remove
> it in the next major release.

This is also a good way forward.
The important thing is to find a way forward.

Cheers,
Alan Isaac

From oliphant.travis at ieee.org Tue Dec 12 13:30:45 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Tue, 12 Dec 2006 11:30:45 -0700
Subject: [Numpy-discussion] Definition of correlation, correlate and so on ?
In-Reply-To:
References: <457E5F0A.4080707@ar.media.kyoto-u.ac.jp>
Message-ID: <457EF555.5010401@ieee.org>

>
> On 12/12/06, *David Cournapeau* wrote:
>
>     Hi,
>
>     I am polishing some code to compute autocorrelation using fft, and
>     when testing the code against numpy.correlate, I realised that I
>     am not sure about the definition... There are various functions
>     related to correlation as far as numpy/scipy is concerned:
>
>     numpy.correlate
>     numpy.corrcoef
>     scipy.signal.correlate
>
>     For me, the correlation between two sequences X and Y at lag t is
>     the sum(X[i] * Y*[i+lag]) where Y* is the complex conjugate of Y.
>     numpy.correlate does not use the conjugate, scipy.signal.correlate as
>     well, and I don't understand numpy.corrcoef. I've never seen complex
>     correlation used without the conjugate, so I was curious why this
>
>
> Neither have I, it is one of those oddities that may have been
> inherited from Numeric. I wouldn't mind seeing it changed but it is
> probably a bit late for that.
It is inherited from Numeric and can't really change. We can move
forward with a different function, however, that uses the conjugate for
complex data.

The non-conjugated version is still well-defined. Convolution, for
example, is defined without the conjugation, and the correlate function
is the basis for that computation. So, it is not a good idea to change
it.

The scipy.signal.correlate function is a generalization to N-D of the
numpy.correlate function, which is 1-d only; the numpy.corrcoef function
is completely different and just computes the correlation coefficients
from the covariance matrix, assuming observations of random vectors.

-Travis

From hirzel at resonon.com Tue Dec 12 14:24:31 2006
From: hirzel at resonon.com (Tim Hirzel)
Date: Tue, 12 Dec 2006 14:24:31 -0500
Subject: [Numpy-discussion] fromfile and tofile access with a tempfile.TemporaryFile()
In-Reply-To:
References: <457DE940.1060807@resonon.com> <457ED839.4080604@resonon.com>
Message-ID: <457F01EF.9040600@resonon.com>

> I'm running linux and the current svn version of numpy. Maybe the
> problem is with the tempfile module on windows. Do fromfile and tofile
> work for files opened normally?
>
> Chuck

fromfile and tofile work fine on regular files. From skimming the code
a bit, it's hard to imagine numpy code is the culprit, since it must be
getting a NULL pointer back from PyFile_AsFile(file)... Perhaps this is
a question for a python dev list? My gut says it's probably something
in the windows tempfile module. But perhaps in the PyFile_AsFile(file)
implementation. Seems one of those isn't playing nice. It's all quite
mysterious to me...

tim

From gamercier at yahoo.com Tue Dec 12 15:16:52 2006
From: gamercier at yahoo.com (Gustavo Mercier)
Date: Tue, 12 Dec 2006 12:16:52 -0800 (PST)
Subject: [Numpy-discussion] Installation - Numpy 1.0.1 Suse Linux 10.1
Message-ID: <261866.81649.qm@web31803.mail.mud.yahoo.com>

Hi!

I am trying to install Numpy 1.0.1 on a Linux box (Suse Linux 10.1; gcc
4.1x). I have done this successfully with previous versions up to 1.0b5.
However, I now run into problems.

It compiles and installs OK, but upon opening python and importing numpy
it hangs with an unresolved reference when loading linalg. The reference
is to a _gfortran...function.

I use Atlas, and this was compiled with gfortran, following the
instructions to combine lapack with atlas. blas was also compiled the
same way. I see that Numpy is being compiled with g77.

I am trying to add the reference to the gfortran library object in
/usr/lib, but I don't see how to do this as an option in distutils. I
also tried to change the compiler from g77 to gfortran, but I failed.
Setting the --fcompiler and --ccompiler options led to failures due to
trying to compile C code with the fortran compiler.

Any suggestions? Thanks for your help!

--
Gustavo A. Mercier, Jr. MD,PhD
gamercier at yahoo.com
469-396-6750 - cell

From Chris.Barker at noaa.gov Tue Dec 12 15:25:29 2006
From: Chris.Barker at noaa.gov (Christopher Barker)
Date: Tue, 12 Dec 2006 12:25:29 -0800
Subject: [Numpy-discussion] fromfile and tofile access with a tempfile.TemporaryFile()
In-Reply-To: <457F01EF.9040600@resonon.com>
References: <457DE940.1060807@resonon.com> <457ED839.4080604@resonon.com> <457F01EF.9040600@resonon.com>
Message-ID: <457F1039.2080508@noaa.gov>

did you try reading and writing to/from that temp file with regular old
python functions?
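A check along those lines might be (a minimal sketch, plain Python 2
file I/O only, no numpy involved; if this round trip succeeds where
tofile fails, the failure is specific to how tofile gets at the
underlying file):

    import tempfile

    f = tempfile.TemporaryFile()
    f.write("0123456789")      # ordinary write, no numpy
    f.seek(0)
    assert f.read() == "0123456789"
    f.close()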
-Chris

Tim Hirzel wrote:
>> I'm running linux and the current svn version of numpy. Maybe the
>> problem is with the tempfile module on windows. Do fromfile and tofile
>> work for files opened normally?
>>
>> Chuck
>
> fromfile and tofile work fine on regular files. From skimming the code
> a bit, it's hard to imagine numpy code is the culprit, since it must be
> getting a NULL pointer back from PyFile_AsFile(file)... Perhaps this is
> a question for a python dev list? My gut says it's probably something
> in the windows tempfile module. But perhaps in the PyFile_AsFile(file)
> implementation. Seems one of those isn't playing nice. It's all quite
> mysterious to me...
>
> tim
>
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov

From hirzel at resonon.com Tue Dec 12 16:56:08 2006
From: hirzel at resonon.com (Tim Hirzel)
Date: Tue, 12 Dec 2006 16:56:08 -0500
Subject: [Numpy-discussion] fromfile and tofile access with a tempfile.TemporaryFile()
In-Reply-To: <457F1039.2080508@noaa.gov>
References: <457DE940.1060807@resonon.com> <457ED839.4080604@resonon.com> <457F01EF.9040600@resonon.com> <457F1039.2080508@noaa.gov>
Message-ID: <457F2578.4040703@resonon.com>

Good thought Chris,
Normal reading and writing does seem to work...
But, my friend Daniel figured out a workaround when I asked him to
confirm this behavior on his windows setup (and it does behave the same
for him). The first clue was this:

>>> f = tempfile.TemporaryFile()
>>> type(f)
<type 'instance'>

>>> g = open("temp","w+b")
>>> type(g)
<type 'file'>

so Daniel did a dir(f) and found the 'file' attribute

so if you do (where 'a' is a numpy array)
>>> a.tofile(f.file)
It works!

writing to the "file" attribute of the TemporaryFile, it works fine! So
that's good, but still a little hinky. Especially since it works on
linux...
on a linux platform, what does type(tempfile.TemporaryFile()) return? I
assume a <type 'file'> as well...

anyway, so at least there is a quick repair for now. Good news is, I
assume using 'f.file' would work on linux too in terms of having a
single cross-platform solution.

cheers,
tim

Christopher Barker wrote:
> did you try reading and writing to/from that temp file with regular old
> python functions?
>
> -Chris
>
>
> Tim Hirzel wrote:
>
>>> I'm running linux and the current svn version of numpy. Maybe the
>>> problem is with the tempfile module on windows. Do fromfile and tofile
>>> work for files opened normally?
>>>
>>> Chuck
>>>
>> fromfile and tofile work fine on regular files. From skimming the code
>> a bit, it's hard to imagine numpy code is the culprit, since it must be
>> getting a NULL pointer back from PyFile_AsFile(file)... Perhaps this is
>> a question for a python dev list? My gut says it's probably something
>> in the windows tempfile module. But perhaps in the PyFile_AsFile(file)
>> implementation. Seems one of those isn't playing nice. It's all quite
>> mysterious to me...
>>
>> tim
>>
>> _______________________________________________
>> Numpy-discussion mailing list
>> Numpy-discussion at scipy.org
>> http://projects.scipy.org/mailman/listinfo/numpy-discussion

From oliphant at ee.byu.edu Tue Dec 12 17:26:44 2006
From: oliphant at ee.byu.edu (Travis Oliphant)
Date: Tue, 12 Dec 2006 15:26:44 -0700
Subject: [Numpy-discussion] fromfile and tofile access with a tempfile.TemporaryFile()
In-Reply-To: <457F2578.4040703@resonon.com>
References: <457DE940.1060807@resonon.com> <457ED839.4080604@resonon.com> <457F01EF.9040600@resonon.com> <457F1039.2080508@noaa.gov> <457F2578.4040703@resonon.com>
Message-ID: <457F2CA4.2050402@ee.byu.edu>

Tim Hirzel wrote:

>Good thought Chris,
>Normal reading and writing does seem to work...
>But, my friend Daniel figured out a workaround when I asked him to
>confirm this behavior on his windows setup (and it does behave the same
>for him). The first clue was this:
>
> >>> f = tempfile.TemporaryFile()
> >>> type(f)
> <type 'instance'>
>
> >>> g = open("temp","w+b")
> >>> type(g)
> <type 'file'>
>
>so Daniel did a dir(f) and found the 'file' attribute
>
>so if you do (where 'a' is a numpy array)
> >>> a.tofile(f.file)
>It works!
>
>writing to the "file" attribute of the TemporaryFile, it works fine! So
>that's good, but still a little hinky. Especially since it works on
>linux...
>on a linux platform, what does type(tempfile.TemporaryFile()) return? I
>assume a <type 'file'> as well...
>
>anyway, so at least there is a quick repair for now. Good news is, I
>assume using 'f.file' would work on linux too in terms of having a
>single cross-platform solution.
>
>
There is no file attribute on Linux. On linux you get

>>> type(f)
<type 'file'>

So, you might have to do something like:

    if not isinstance(f, file):
        f = f.file

before passing f to the tofile method.

It seems to me that the temporary file mechanism on Windows is a little
odd.

-Travis

From dalcinl at gmail.com Tue Dec 12 18:31:33 2006
From: dalcinl at gmail.com (Lisandro Dalcin)
Date: Tue, 12 Dec 2006 20:31:33 -0300
Subject: [Numpy-discussion] fromfile and tofile access with a tempfile.TemporaryFile()
In-Reply-To: <457F2CA4.2050402@ee.byu.edu>
References: <457DE940.1060807@resonon.com> <457ED839.4080604@resonon.com> <457F01EF.9040600@resonon.com> <457F1039.2080508@noaa.gov> <457F2578.4040703@resonon.com> <457F2CA4.2050402@ee.byu.edu>
Message-ID:

> It seems to me that the temporary file mechanism on Windows is a little
> odd.
>

Indeed. Looking at the sources, the posix version uses the mkstemp/unlink
idiom, but on win it uses a bit of hackery: it seems opened files cannot
be unlinked.

if _os.name != 'posix' or _os.sys.platform == 'cygwin':
    # On non-POSIX and Cygwin systems, assume that we cannot unlink a file
    # while it is open.
    TemporaryFile = NamedTemporaryFile
else:
    def TemporaryFile(mode='w+b', bufsize=-1, suffix="",
                      prefix=template, dir=None):
        ............
        (fd, name) = _mkstemp_inner(dir, prefix, suffix, flags)
        try:
            _os.unlink(name)
            return _os.fdopen(fd, mode, bufsize)
        except:
            _os.close(fd)
            raise

--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From charlesr.harris at gmail.com Tue Dec 12 19:02:38 2006
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Tue, 12 Dec 2006 17:02:38 -0700
Subject: [Numpy-discussion] a.T
Message-ID:

Hi all,

I'm curious about the error thrown when I use a.T as the left side of a
multiply assign. In the following, I am multiplying each of the rows of
'a' by the corresponding element of arange(n), i.e., broadcasting from
the left. The result looks fine, but an error is thrown after the
operation is complete.

In [62]: a = arange(12).reshape(3,2,2)

In [63]: a
Out[63]:
array([[[ 0,  1],
        [ 2,  3]],

       [[ 4,  5],
        [ 6,  7]],

       [[ 8,  9],
        [10, 11]]])

In [64]: a.T *= arange(3)
---------------------------------------------------------------------------
exceptions.TypeError                 Traceback (most recent call last)

/home/charris/workspace/microsat/tycho-work/<ipython console>

TypeError: attribute 'T' of 'numpy.ndarray' objects is not writable

In [65]: a
Out[65]:
array([[[ 0,  0],
        [ 0,  0]],

       [[ 4,  5],
        [ 6,  7]],

       [[16, 18],
        [20, 22]]])

Chuck

From charlesr.harris at gmail.com Tue Dec 12 19:09:11 2006
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Tue, 12 Dec 2006 17:09:11 -0700
Subject: [Numpy-discussion] fromfile and tofile access with a tempfile.TemporaryFile()
In-Reply-To: <457F2CA4.2050402@ee.byu.edu>
References: <457DE940.1060807@resonon.com> <457ED839.4080604@resonon.com> <457F01EF.9040600@resonon.com> <457F1039.2080508@noaa.gov> <457F2578.4040703@resonon.com> <457F2CA4.2050402@ee.byu.edu>
Message-ID:

On 12/12/06, Travis Oliphant wrote:
>
> Tim Hirzel wrote:
> [...]
>
> There is no file attribute on Linux. On linux you get
>
> >>> type(f)
> <type 'file'>
>
> So, you might have to do something like:
>
>     if not isinstance(f, file):
>         f = f.file
>
> before passing f to the tofile method.
>
> It seems to me that the temporary file mechanism on Windows is a little
> odd.

Looks like a tempfile bug to me. Python should be cross platform, and
since the file attribute is correctly set on windows, I don't see why
the tempfile can't be made to behave correctly.
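Putting Travis's check into a small helper gives a workable shim in the
meantime (a sketch; 'as_real_file' is a hypothetical name, and 'file'
is the Python 2 builtin type):

    import tempfile

    def as_real_file(f):
        # tofile/fromfile need a real builtin file object; on Windows,
        # tempfile.TemporaryFile() returns a wrapper whose underlying
        # file object lives in its .file attribute
        if isinstance(f, file):
            return f
        return f.file

    # usage sketch: a.tofile(as_real_file(f))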
Chuck

From pgmdevlist at gmail.com Tue Dec 12 19:11:35 2006
From: pgmdevlist at gmail.com (Pierre GM)
Date: Tue, 12 Dec 2006 19:11:35 -0500
Subject: [Numpy-discussion] a.T
In-Reply-To:
References:
Message-ID: <200612121911.36014.pgmdevlist@gmail.com>

On Tuesday 12 December 2006 19:02, Charles R Harris wrote:
> Hi all,
>
> I'm curious about the error thrown when I use a.T as the left side of a
> multiply assign.

Hi Chuck,
If you keep in mind that .T is a shortcut for .transpose(), you'll
understand why you can't assign to a function call. (The in-place
multiply runs through the view first, which is why 'a' was modified; it
is the attempt to rebind the 'T' attribute afterwards that raises, so
binding the view to a name first, at = a.T; at *= arange(3), works.)

From david at ar.media.kyoto-u.ac.jp Tue Dec 12 22:19:20 2006
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Wed, 13 Dec 2006 12:19:20 +0900
Subject: [Numpy-discussion] Definition of correlation, correlate and so on ?
In-Reply-To: <457EE1CC.9080806@ieee.org>
References: <457E5F0A.4080707@ar.media.kyoto-u.ac.jp> <457EABF7.3090302@ar.media.kyoto-u.ac.jp> <457EE1CC.9080806@ieee.org>
Message-ID: <457F7138.9010401@ar.media.kyoto-u.ac.jp>

Tim Hochberg wrote:
>
> So rather than "fixing" the function, I would first propose introducing
> a function with a more descriptive name and docstring, for example you
> could steal the name 'xcorr' from matlab. Then if in fact the behavior
> of correlate is deemed to be an error, deprecate it and start issuing a
> warning in the next point release, then remove it in the next major
> release.

That was my idea too: specify in the docstring that this does not
compute the correlation, and put a new function xcorr (or whatever
name). The good news being this function is already done for rank up
to 2, with basic tests... :)

>
> Even better, IMO, would be if someone who cares about this stuff pulls
> together all the related signal processing stuff and moves them to a
> submodule so we could actually find what signal processing primitives
> are available. At the same time, more informative docstrings would be
> great.
>

Do you mean signal functions in numpy or scipy? For scipy, this is
already done (the scipy.signal module),

David

From cameron.walsh at gmail.com Tue Dec 12 22:27:34 2006
From: cameron.walsh at gmail.com (Cameron Walsh)
Date: Wed, 13 Dec 2006 12:27:34 +0900
Subject: [Numpy-discussion] Histograms of extremely large data sets
Message-ID: <106309950612121927o6b50ee7fj4996c52d5ff2d250@mail.gmail.com>

Hi all,

I'm trying to generate histograms of extremely large datasets. I've
tried a few methods, listed below, all with their own shortcomings.
Mailing-list archive and google searches have not revealed any
solutions.

Method 1:

import numpy
import pylab

data=numpy.empty((489,1000,1000),dtype="uint8")
# Replace this line with actual data samples, but the size and types
# are correct.

histogram = pylab.hist(data, bins=range(0,256))
pylab.xlim(0,256)
pylab.show()

The problem with this method is it appears to never finish. It is,
however, extremely fast for smaller data sets, like 5x1000x1000 (1-2
seconds) instead of 500x1000x1000.

Method 2:

import numpy
import pylab

data=numpy.empty((489,1000,1000),dtype="uint8")
# Replace this line with actual data samples, but the size and types
# are correct.
bins=numpy.zeros((256),dtype="uint32")
for val in data.flat:
    bins[val]+=1
barchart = pylab.bar(xrange(256),bins,align="center")
pylab.xlim(0,256)
pylab.show()

The problem with this method is it is incredibly slow, taking up to 30
seconds for a 1x1000x1000 sample; I have neither the patience nor the
inclination to time a 500x1000x1000 sample.

Method 3:

import numpy

data=numpy.empty((489,1000,1000),dtype="uint8")
# Replace this line with actual data samples, but the size and types
# are correct.

a=numpy.histogram(data,256)

The problem with this one is:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.5/site-packages/numpy/lib/function_base.py",
line 96, in histogram
    n = sort(a).searchsorted(bins)
ValueError: dimensions too large.

It seems that iterating over the entire array and doing it manually is
the slowest possible method, but that the rest are not much better.
Is there a faster method available, or do I have to implement method 2
in C and submit the change as a patch?

Thanks and best regards,

Cameron.

From eric at enthought.com Wed Dec 13 02:42:09 2006
From: eric at enthought.com (eric jones)
Date: Wed, 13 Dec 2006 01:42:09 -0600
Subject: [Numpy-discussion] Histograms of extremely large data sets
In-Reply-To: <106309950612121927o6b50ee7fj4996c52d5ff2d250@mail.gmail.com>
References: <106309950612121927o6b50ee7fj4996c52d5ff2d250@mail.gmail.com>
Message-ID: <457FAED1.6010803@enthought.com>

Hey Cameron,

I wrote a simple weave based histogram function that should work for
your problem. It should work for any array input data type. The needed
files (and a few tests and examples) are attached.

Below is the output from the histogram_speed.py file attached. The test
takes about 10 seconds to bin a uniformly distributed set of data from a
1000x1000x100 uint8 array into 256 bins. It compares Travis' nifty new
iterator based indexing in numpy to raw C indexing of a contiguous
array. The two algorithms give identical results, and the speed
difference is negligible. That's cool, because the iterator based stuff
makes this sort of algorithm quite easy to handle in N dimensions.

Hope that helps,
eric

ps. For those who care, I had to make a minor change to the array type
converters so that they can be used with the iterator interface more
easily. Later this will be folded into weave, but for now I sub-classed
the standard array converter and made the modifications.

# speed test output.
c:\eric\code\histogram> histogram_speed.py type: uint8 millions of elements: 100.0 sec (C indexing based): 9.52776707654 [390141 390352 390598 389706 390985 390856 389785 390262 389929 391024 391854 390243 391255 390723 390525 391751 389842 391612 389601 391210 390799 391674 390693 390381 390460 389839 390185 390909 390215 391271 390934 390818 390528 389990 389982 389667 391035 390317 390616 390916 390191 389771 391448 390325 390556 391333 390148 390894 389611 390511 390614 390999 389646 391255 391284 391214 392106 391067 391480 389991 391091 390271 389801 390044 391459 390644 391309 390450 390200 391537 390907 390160 391117 390738 391638 391200 390815 390611 390355 389925 390939 390932 391569 390287 389987 389545 391140 391280 389773 389794 389559 390085 389991 391372 390189 391010 390863 390432 390743 390959 389271 390210 390967 390999 391177 389777 391748 390623 391597 392009 389308 390557 390213 390930 390449 390327 390600 390626 389985 390816 389671 390187 390595 390973 390921 390599 390167 391196 390381 391345 392166 389709 390656 389886 390646 390355 391273 391342 390234 390751 390515 390048 390455 391122 391069 390968 390488 390708 391027 391179 391110 390453 390632 390825 391369 390844 390001 391487 390778 390788 390609 390254 389907 391803 391508 391414 391012 389987 389284 390699 391094 390658 390463 390291 390848 389616 390894 389561 390971 391165 391378 391698 389434 390591 390027 391088 390787 391165 390169 391212 389799 389829 389764 390435 391158 391834 391206 390041 391537 390237 390253 391025 392336 391081 390005 391057 390226 390240 390197 389906 391164 391157 390639 391501 389125 389922 390961 390012 389832 389650 390018 390461 390695 390140 390939 389089 391094 390076 391123 389518 391340 390039 390786 391751 391133 390675 392305 390667 391243 389889 390103 390438 389215 389805 392180 391351 389923 390932 390136 390556 389684 390324 390152 390982 391355] sec (numpy iteration based): 10.3055525213 [390141 390352 390598 389706 390985 390856 389785 390262 389929 391024 391854 390243 391255 390723 390525 391751 389842 391612 389601 391210 390799 391674 390693 390381 390460 389839 390185 390909 390215 391271 390934 390818 390528 389990 389982 389667 391035 390317 390616 390916 390191 389771 391448 390325 390556 391333 390148 390894 389611 390511 390614 390999 389646 391255 391284 391214 392106 391067 391480 389991 391091 390271 389801 390044 391459 390644 391309 390450 390200 391537 390907 390160 391117 390738 391638 391200 390815 390611 390355 389925 390939 390932 391569 390287 389987 389545 391140 391280 389773 389794 389559 390085 389991 391372 390189 391010 390863 390432 390743 390959 389271 390210 390967 390999 391177 389777 391748 390623 391597 392009 389308 390557 390213 390930 390449 390327 390600 390626 389985 390816 389671 390187 390595 390973 390921 390599 390167 391196 390381 391345 392166 389709 390656 389886 390646 390355 391273 391342 390234 390751 390515 390048 390455 391122 391069 390968 390488 390708 391027 391179 391110 390453 390632 390825 391369 390844 390001 391487 390778 390788 390609 390254 389907 391803 391508 391414 391012 389987 389284 390699 391094 390658 390463 390291 390848 389616 390894 389561 390971 391165 391378 391698 389434 390591 390027 391088 390787 391165 390169 391212 389799 389829 389764 390435 391158 391834 391206 390041 391537 390237 390253 391025 392336 391081 390005 391057 390226 390240 390197 389906 391164 391157 390639 391501 389125 389922 390961 390012 389832 389650 390018 390461 390695 390140 390939 389089 391094 390076 391123 
389518 391340 390039 390786 391751 391133 390675 392305 390667 391243
389889 390103 390438 389215 389805 392180 391351 389923 390932 390136
390556 389684 390324 390152 390982 391355]
0

Cameron Walsh wrote:
> Hi all,
>
> I'm trying to generate histograms of extremely large datasets. I've
> tried a few methods, listed below, all with their own shortcomings.
> Mailing-list archive and google searches have not revealed any
> solutions.
> [...]
>
> Thanks and best regards,
>
> Cameron.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: weave_histogram.py
Type: text/x-python
Size: 2533 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: histogram_speed.py
Type: text/x-python
Size: 702 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: test_weave_histogram.py
Type: text/x-python
Size: 2170 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: typed_array_converter.py
Type: text/x-python
Size: 1582 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: weave_contiguous_histogram.py Type: text/x-python Size: 2388 bytes Desc: not available URL: From cameron.walsh at gmail.com Wed Dec 13 03:27:22 2006 From: cameron.walsh at gmail.com (Cameron Walsh) Date: Wed, 13 Dec 2006 17:27:22 +0900 Subject: [Numpy-discussion] Histograms of extremely large data sets In-Reply-To: <457FAED1.6010803@enthought.com> References: <106309950612121927o6b50ee7fj4996c52d5ff2d250@mail.gmail.com> <457FAED1.6010803@enthought.com> Message-ID: <106309950612130027m7c454b19r1255670114d5cb33@mail.gmail.com> On 13/12/06, eric jones wrote 290 lines of awesome code and a fantastic explanation: > Hey Cameron, > > I wrote a simple weave based histogram function that should work for > your problem. It should work for any array input data type. The needed > files (and a few tests and examples) are attached. Thank you very much, they seem to be exactly what I need. I haven't yet been able to test it all completely, as for some reason I'm missing the zlib module. That might have to wait till tomorrow depending on how the next half hour goes. > > Below is the output from the histogram_speed.py file attached. The test > takes about 10 seconds to bin a uniformly distributed set of data from a > 1000x1000x100 uint8 array into 256 bins. It compares Travis' nifty new > iterator based indexing in numpy to raw C indexing of a contiguous > array. The two algorithms give identical results, and the speed > difference is negligible. That's cool because the iterator based stuff > makes this sort of algorithms quite easy to handle for N-dimensional. If that's the case, assuming our machines are the same, your new code is around 5 times faster. That brings it back to a reasonable time frame. I'll let you know as soon as I can how it all works. > > Hope that helps, > eric It certainly does! Cameron. > > ps. For those who care, I had to make a minor change to the array type > converters so that they can be used with the iterator interface more > easily. Later this will be folded into weave, but for now I sub-classed > the standard array converter and made the modifications. > > # speed test output. 
> c:\eric\code\histogram> histogram_speed.py
> type: uint8
> millions of elements: 100.0
> sec (C indexing based): 9.52776707654
> [256 bin counts, as quoted in full above]
> sec (numpy iteration based): 10.3055525213
> [256 bin counts, identical to the C indexing results]
> 0
>
> Cameron Walsh wrote:
> > Hi all,
> >
> > I'm trying to generate histograms of extremely large datasets. I've
> > tried a few methods, listed below, all with their own shortcomings.
> > Mailing-list archive and google searches have not revealed any
> > solutions.
> > [...]
> >
> > Thanks and best regards,
> >
> > Cameron.

From faltet at carabos.com Wed Dec 13 12:28:01 2006
From: faltet at carabos.com (Francesc Altet)
Date: Wed, 13 Dec 2006 18:28:01 +0100
Subject: [Numpy-discussion] .byteswap() and copy/view dilemma
Message-ID: <1166030881.2846.64.camel@localhost.localdomain>

Hi,

I'm a bit confused about the cases in which the .byteswap() method in
NumPy returns a copy and those in which it returns a view. From the
docstrings:

"""
a.byteswap(False) -> View or copy. Swap the bytes in the array.

Swap the bytes in the array. Return the byteswapped array. If the first
argument is TRUE, byteswap in-place and return a reference to self.
""" >From the above description, it is not clear to me whether the .byteswap(False) would return a copy or a view. However, the official NumPy book seems to explain this clearer: """ Byteswap the elements of the array and return the byteswapped array. If the argument is True, then byteswap in-place and return a reference to self. Otherwise, return a copy of the array with the elements byteswapped. The data-type descriptor is not changed so the array will have changed numbers. """ My experiments also show that .byteswap(False) does do a copy: In [154]:a=numpy.array([1,2,3]) In [155]:b=a.byteswap() In [156]:b Out[156]:array([16777216, 33554432, 50331648]) In [157]:a[1]=0 In [158]:b Out[158]:array([16777216, 33554432, 50331648]) I'm wondering if I'm the only one that finds this docstring confusing. Regards, -- Francesc Altet | Be careful about using the following code -- Carabos Coop. V. | I've only proven that it works, www.carabos.com | I haven't tested it. -- Donald Knuth From pgmdevlist at gmail.com Wed Dec 13 15:22:06 2006 From: pgmdevlist at gmail.com (Pierre GM) Date: Wed, 13 Dec 2006 15:22:06 -0500 Subject: [Numpy-discussion] rollaxis Message-ID: <200612131522.06193.pgmdevlist@gmail.com> All, I have a ND array whose axes I want to reorganize, so that axis "i" is at the end while the others stay in their relative position. What's the easiest ? From peridot.faceted at gmail.com Wed Dec 13 15:29:03 2006 From: peridot.faceted at gmail.com (A. M. Archibald) Date: Wed, 13 Dec 2006 15:29:03 -0500 Subject: [Numpy-discussion] rollaxis In-Reply-To: <200612131522.06193.pgmdevlist@gmail.com> References: <200612131522.06193.pgmdevlist@gmail.com> Message-ID: On 13/12/06, Pierre GM wrote: > All, > I have a ND array whose axes I want to reorganize, so that axis "i" is at the > end while the others stay in their relative position. What's the easiest ? Generate an axis-permutation tuple and use transpose: s = list(A.shape) s.remove(i) s.append(i) B = A.transpose(s) A. M. Archibald From hirzel at resonon.com Wed Dec 13 16:28:02 2006 From: hirzel at resonon.com (Tim Hirzel) Date: Wed, 13 Dec 2006 16:28:02 -0500 Subject: [Numpy-discussion] fromfile and tofile access with a tempfile.TemporaryFile() In-Reply-To: References: <457DE940.1060807@resonon.com> <457ED839.4080604@resonon.com> <457F01EF.9040600@resonon.com> <457F1039.2080508@noaa.gov> <457F2578.4040703@resonon.com> <457F2CA4.2050402@ee.byu.edu> Message-ID: <45807062.3070603@resonon.com> Thanks for everyone's help and thoughts. I agree that this behavior is buggy. I submitted a bug report to the python project at sourceforge, with a link to this thread. Hopefully the report will be helpful. tim Charles R Harris wrote: > > > On 12/12/06, *Travis Oliphant* > wrote: > > Tim Hirzel wrote: > > >Good thought Chris, > >Normal reading and writing does seem to work. .. > >But, my friend Daniel figured out a workaround when I asked to > confirm > >this behavior on his windows setup (and it is does behave the > same for > >him). The first clue was this: > > > > >>> f = tempfile.TemporaryFile() > > >>> type(f) > > > > >>> g = open("temp","w+b") > > >>> type(g) > > > > > >so Daniel did a dir(f) and found the 'file' attribute > > > >so if you do (where 'a' is a numpy array) > > >>> a.tofile(f.file) > >It works! > > > >writing to the "file" attribute of the TemporaryFile, it works > fine! So > >that's good, but still a little hinky. Especially since it works on > >linux... > >on a linux platform, what does type( tempfile.TemporaryFile()) > return? 
I > >assume an as well... > > > >anways, so at least there is a quick repair for now. Good news is, I > >assume using 'f.file' would work on linux too in terms of having a > >single cross-platform solution. > > > > > There is no file attribute on Linux. On linux you get > > >>> type(f) > > > So, you might have to do something like: > > if not isinstance(f, file): > f = f.file > > before passing f to the tofile method. > > It seems to me that the temporary file mechanism on Windows is a > little > odd. > > > Looks like a tempfile bug to me. Python should be cross platform and > since the file attribute is correctly set on windows, I don't see why > the tempfile can't be made to behave correctly. > > Chuck > > > ------------------------------------------------------------------------ > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > From pgmdevlist at gmail.com Wed Dec 13 16:44:35 2006 From: pgmdevlist at gmail.com (Pierre GM) Date: Wed, 13 Dec 2006 16:44:35 -0500 Subject: [Numpy-discussion] rollaxis In-Reply-To: References: <200612131522.06193.pgmdevlist@gmail.com> Message-ID: <200612131644.36536.pgmdevlist@gmail.com> On Wednesday 13 December 2006 15:29, A. M. Archibald wrote: > Generate an axis-permutation tuple and use transpose: Ah OK. It took me a little while to get it running: instead of s=list(A.shape) in your example, one should read s=range(A.ndim) But it does the trick, thanks a lot! And now, double or nothing: Samething, but with rows or columns: For example, put a 5th column in the far right, without modifying the relative positions of the others. Thanks in advance ! From pgmdevlist at gmail.com Wed Dec 13 17:54:55 2006 From: pgmdevlist at gmail.com (Pierre GM) Date: Wed, 13 Dec 2006 17:54:55 -0500 Subject: [Numpy-discussion] rollaxis In-Reply-To: References: <200612131522.06193.pgmdevlist@gmail.com> <200612131644.36536.pgmdevlist@gmail.com> Message-ID: <200612131754.55240.pgmdevlist@gmail.com> > Generate a column-permutation tuple and use fancy indexing: Works like a charm, thanks a lot ! From cameron.walsh at gmail.com Wed Dec 13 20:32:05 2006 From: cameron.walsh at gmail.com (Cameron Walsh) Date: Thu, 14 Dec 2006 10:32:05 +0900 Subject: [Numpy-discussion] Histograms of extremely large data sets In-Reply-To: <106309950612130027m7c454b19r1255670114d5cb33@mail.gmail.com> References: <106309950612121927o6b50ee7fj4996c52d5ff2d250@mail.gmail.com> <457FAED1.6010803@enthought.com> <106309950612130027m7c454b19r1255670114d5cb33@mail.gmail.com> Message-ID: <106309950612131732j1e48dccfnbd60ee1c17b0c6c9@mail.gmail.com> On 13/12/06, Cameron Walsh wrote: > On 13/12/06, eric jones wrote 290 lines of > awesome code and a fantastic explanation: > > > Hey Cameron, > > > > I wrote a simple weave based histogram function that should work for > > your problem. It should work for any array input data type. The needed > > files (and a few tests and examples) are attached. [...] 
Hi Eric,

I've run test_weave_histogram.py and histogram_speed.py, but each one
seems to fail on calling typed_array_converter.declaration_code() as
follows:

Traceback (most recent call last):
  File "histogram_speed.py", line 26, in <module>
    res2 = histogram(data, bins)
  File "/home/cameron/repos/wavesmaker/trunk/code/gui/process_modules/eric_histo/weave_histogram.py",
line 67, in histogram
    compiler='gcc')
  File "/usr/local/lib/python2.5/site-packages/scipy/weave/inline_tools.py",
line 339, in inline
    **kw)
  File "/usr/local/lib/python2.5/site-packages/scipy/weave/inline_tools.py",
line 447, in compile_function
    verbose=verbose, **kw)
  File "/usr/local/lib/python2.5/site-packages/scipy/weave/ext_tools.py",
line 353, in compile
    kw,file = self.build_kw_and_file(location,kw)
  File "/usr/local/lib/python2.5/site-packages/scipy/weave/ext_tools.py",
line 334, in build_kw_and_file
    file = self.generate_file(location=location)
  File "/usr/local/lib/python2.5/site-packages/scipy/weave/ext_tools.py",
line 295, in generate_file
    code = self.module_code()
  File "/usr/local/lib/python2.5/site-packages/scipy/weave/ext_tools.py",
line 203, in module_code
    self.function_code(),
  File "/usr/local/lib/python2.5/site-packages/scipy/weave/ext_tools.py",
line 269, in function_code
    all_function_code += func.function_code()
  File "/usr/local/lib/python2.5/site-packages/scipy/weave/inline_tools.py",
line 83, in function_code
    decl_code = indent(self.arg_declaration_code(),4)
  File "/usr/local/lib/python2.5/site-packages/scipy/weave/inline_tools.py",
line 62, in arg_declaration_code
    arg_strings.append(arg.declaration_code(inline=1))
  File "/home/cameron/repos/wavesmaker/trunk/code/gui/process_modules/eric_histo/typed_array_converter.py",
line 18, in declaration_code
    code = super(typed_array_converter, self).declaration_code(templatize,
TypeError: super() argument 1 must be type, not classobj

I've tried this with Python2.4 and Python2.5 with the same results.

What do I need to change, since it seems to have worked for you but not
for me?

Thanks and best regards,

Cameron.

From wbaxter at gmail.com Wed Dec 13 21:06:30 2006
From: wbaxter at gmail.com (Bill Baxter)
Date: Thu, 14 Dec 2006 11:06:30 +0900
Subject: [Numpy-discussion] linalg.lstsq for complex
In-Reply-To:
References:
Message-ID:

Is this code from linalg.lstsq for the complex case correct?

        lapack_routine = lapack_lite.zgelsd
        lwork = 1
        rwork = zeros((lwork,), real_t)
        work = zeros((lwork,),t)
        results = lapack_routine(m, n, n_rhs, a, m, bstar, ldb, s, rcond,
                                 0, work, -1, rwork, iwork, 0)
        lwork = int(abs(work[0]))
        rwork = zeros((lwork,),real_t)
        a_real = zeros((m,n),real_t)
        bstar_real = zeros((ldb,n_rhs,),real_t)
        results = lapack_lite.dgelsd(m, n, n_rhs, a_real, m, bstar_real,
                                     ldb, s, rcond, 0, rwork, -1, iwork, 0)
        lrwork = int(rwork[0])
        work = zeros((lwork,), t)
        rwork = zeros((lrwork,), real_t)
        results = lapack_routine(m, n, n_rhs, a, m, bstar, ldb, s, rcond,

The middle call to dgelsd looks unnecessary to me. At the very least,
allocating a_real and bstar_real shouldn't be necessary, since they
aren't referenced anywhere else in the lstsq function. The lapack
documentation for zgelsd also doesn't mention any need to call dgelsd
to compute the size of the work array.
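If zgelsd's workspace query behaves as the LAPACK docs describe (with
lwork = -1 it returns the optimal complex workspace in work[0] and the
minimum real and integer workspace sizes in rwork[0] and iwork[0]), the
three calls could in principle collapse into one query, along these
lines (an untested sketch against the code quoted above, with iwork
assumed allocated as in the surrounding function; if some older LAPACKs
failed to fill rwork on the query, that would explain the extra dgelsd
call):

    # one workspace query, then the real solve
    work = zeros((1,), t)
    rwork = zeros((1,), real_t)
    results = lapack_routine(m, n, n_rhs, a, m, bstar, ldb, s, rcond,
                             0, work, -1, rwork, iwork, 0)
    lwork = int(abs(work[0]))    # optimal complex workspace size
    lrwork = int(rwork[0])       # minimum real workspace size
    work = zeros((lwork,), t)
    rwork = zeros((lrwork,), real_t)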
--bb

From eric at enthought.com Wed Dec 13 21:17:26 2006
From: eric at enthought.com (eric jones)
Date: Wed, 13 Dec 2006 20:17:26 -0600
Subject: [Numpy-discussion] Histograms of extremely large data sets
In-Reply-To: <106309950612131732j1e48dccfnbd60ee1c17b0c6c9@mail.gmail.com>
References: <106309950612121927o6b50ee7fj4996c52d5ff2d250@mail.gmail.com> <457FAED1.6010803@enthought.com> <106309950612130027m7c454b19r1255670114d5cb33@mail.gmail.com> <106309950612131732j1e48dccfnbd60ee1c17b0c6c9@mail.gmail.com>
Message-ID: <4580B436.8040203@enthought.com>

Hmmm. Not sure.

Change that line to this instead, which should work as well.

    code = array_converter.declaration_code(self, templatize, inline)

Both work for me.

eric

Cameron Walsh wrote:
> Hi Eric,
>
> I've run test_weave_histogram.py and histogram_speed.py, but each one
> seems to fail on calling typed_array_converter.declaration_code() as
> follows:
> [...]
> TypeError: super() argument 1 must be type, not classobj
>
> I've tried this with Python2.4 and Python2.5 with the same results.
>
> What do I need to change, since it seems to have worked for you but not
> for me?
>
> Thanks and best regards,
>
> Cameron.

From cameron.walsh at gmail.com Wed Dec 13 22:03:50 2006
From: cameron.walsh at gmail.com (Cameron Walsh)
Date: Thu, 14 Dec 2006 12:03:50 +0900
Subject: [Numpy-discussion] Histograms of extremely large data sets
In-Reply-To: <4580B436.8040203@enthought.com>
References: <106309950612121927o6b50ee7fj4996c52d5ff2d250@mail.gmail.com> <457FAED1.6010803@enthought.com> <106309950612130027m7c454b19r1255670114d5cb33@mail.gmail.com> <106309950612131732j1e48dccfnbd60ee1c17b0c6c9@mail.gmail.com> <4580B436.8040203@enthought.com>
Message-ID: <106309950612131903l43b1d67fsaf71aba632725780@mail.gmail.com>

Thanks very much, Eric. That line fixed it for me, although I'm still
not sure why it broke with the last line.

Your weave_histogram works a charm and is around 16 times faster than
any of the other options I've tried. On my laptop it took 30 seconds
to generate a histogram from 500 million numbers, which is fine.

Thanks and best regards,

Cameron.

On 14/12/06, eric jones wrote:
> Hmmm. Not sure.
>
> Change that line to this instead, which should work as well.
>
>     code = array_converter.declaration_code(self, templatize, inline)
>
> Both work for me.
>
> eric
> [...]

From eric at enthought.com Wed Dec 13 23:35:38 2006
From: eric at enthought.com (eric jones)
Date: Wed, 13 Dec 2006 22:35:38 -0600
Subject: [Numpy-discussion] Histograms of extremely large data sets
In-Reply-To: <106309950612131903l43b1d67fsaf71aba632725780@mail.gmail.com>
References: <106309950612121927o6b50ee7fj4996c52d5ff2d250@mail.gmail.com> <457FAED1.6010803@enthought.com> <106309950612130027m7c454b19r1255670114d5cb33@mail.gmail.com> <106309950612131732j1e48dccfnbd60ee1c17b0c6c9@mail.gmail.com> <4580B436.8040203@enthought.com> <106309950612131903l43b1d67fsaf71aba632725780@mail.gmail.com>
Message-ID: <4580D49A.70303@enthought.com>

Glad to hear it worked for you.

see ya,
eric

Cameron Walsh wrote:
> Thanks very much, Eric. That line fixed it for me, although I'm still
> not sure why it broke with the last line.
>
> Your weave_histogram works a charm and is around 16 times faster than
> any of the other options I've tried. On my laptop it took 30 seconds
> to generate a histogram from 500 million numbers, which is fine.
>
> Thanks and best regards,
>
> Cameron.
> [...]

From rlw at stsci.edu Wed Dec 13 23:40:03 2006
From: rlw at stsci.edu (Rick White)
Date: Wed, 13 Dec 2006 23:40:03 -0500
Subject: [Numpy-discussion] Histograms of extremely large data sets
In-Reply-To: <106309950612121927o6b50ee7fj4996c52d5ff2d250@mail.gmail.com>
References: <106309950612121927o6b50ee7fj4996c52d5ff2d250@mail.gmail.com>
Message-ID: <9936E3D8-CF84-4605-BFB9-757D8FCC7912@stsci.edu>

On Dec 12, 2006, at 10:27 PM, Cameron Walsh wrote:

> I'm trying to generate histograms of extremely large datasets. I've
> tried a few methods, listed below, all with their own shortcomings.
> Mailing-list archive and google searches have not revealed any
> solutions.
The numpy.histogram function can be modified to use memory much more
efficiently when the input array is large, and the modification turns
out to be faster even for smallish arrays (in my tests, anyway).
Below is a modified version of the histogram function from
function_base.py. It is almost identical, but it avoids doing the
sort of the entire input array simply by dividing it into blocks.
(It would be even better to avoid the call to ravel too.) The only
other messy detail is that the builtin range function is shadowed by
the 'range' parameter.

In my timing tests this is about the same speed for arrays about the
same size as the block size and is faster than the current version by
30-40% for large arrays. The speed difference increases as the array
size increases.

I haven't compared this to Eric's weave function, but this has the
advantages of being pure Python and of being much simpler. On my
machine (MacBook Pro) it takes about 4 seconds for an array with 100
million elements. The time increases perfectly linearly with array
size for arrays larger than a million elements.
				Rick

from numpy import *

# keep a reference to the builtin range, which is shadowed
# by the 'range' argument of histogram below
lrange = range

def histogram(a, bins=10, range=None, normed=False):
    a = asarray(a).ravel()
    if not iterable(bins):
        if range is None:
            range = (a.min(), a.max())
        mn, mx = [mi+0.0 for mi in range]
        if mn == mx:
            mn -= 0.5
            mx += 0.5
        bins = linspace(mn, mx, bins, endpoint=False)

    # best block size probably depends on processor cache size
    block = 65536
    n = sort(a[:block]).searchsorted(bins)
    for i in lrange(block, len(a), block):
        n += sort(a[i:i+block]).searchsorted(bins)
    n = concatenate([n, [len(a)]])
    n = n[1:] - n[:-1]

    if normed:
        db = bins[1] - bins[0]
        return 1.0/(a.size*db) * n, bins
    else:
        return n, bins

From eric at enthought.com  Thu Dec 14 01:03:45 2006
From: eric at enthought.com (eric jones)
Date: Thu, 14 Dec 2006 00:03:45 -0600
Subject: [Numpy-discussion] Histograms of extremely large data sets
In-Reply-To: <9936E3D8-CF84-4605-BFB9-757D8FCC7912@stsci.edu>
References: <106309950612121927o6b50ee7fj4996c52d5ff2d250@mail.gmail.com>
	<9936E3D8-CF84-4605-BFB9-757D8FCC7912@stsci.edu>
Message-ID: <4580E941.4030805@enthought.com>

Looks to me like Rick's version is simpler and faster. It looks like it
offers a speed-up of about 1.6 on my machine over the weave version. I
believe this is because the sorting approach results in quite a few fewer
compares than the algorithm I used.

Very cool. I vote that his version go into numpy.

eric

Rick White wrote:
> On Dec 12, 2006, at 10:27 PM, Cameron Walsh wrote:
>
>> I'm trying to generate histograms of extremely large datasets. I've
>> tried a few methods, listed below, all with their own shortcomings.
>> Mailing-list archive and google searches have not revealed any
>> solutions.
>>
>
> The numpy.histogram function can be modified to use memory much more
> efficiently when the input array is large, and the modification turns
> out to be faster even for smallish arrays (in my tests, anyway).
> Below is a modified version of the histogram function from
> function_base.py. It is almost identical, but it avoids doing the
> sort of the entire input array simply by dividing it into blocks.
> (It would be even better to avoid the call to ravel too.) The only
> other messy detail is that the builtin range function is shadowed by
> the 'range' parameter.
>
> In my timing tests this is about the same speed for arrays about the
> same size as the block size and is faster than the current version by
> 30-40% for large arrays.
The speed difference increases as the array > size increases. > > I haven't compared this to Eric's weave function, but this has the > advantages of being pure Python and of being much simpler. On my > machine (MacBook Pro) it takes about 4 seconds for an array with 100 > million elements. The time increases perfectly linearly with array > size for arrays larger than a million elements. > Rick > > from numpy import * > > lrange = range > def histogram(a, bins=10, range=None, normed=False): > a = asarray(a).ravel() > if not iterable(bins): > if range is None: > range = (a.min(), a.max()) > mn, mx = [mi+0.0 for mi in range] > if mn == mx: > mn -= 0.5 > mx += 0.5 > bins = linspace(mn, mx, bins, endpoint=False) > > # best block size probably depends on processor cache size > block = 65536 > n = sort(a[:block]).searchsorted(bins) > for i in lrange(block,len(a),block): > n += sort(a[i:i+block]).searchsorted(bins) > n = concatenate([n, [len(a)]]) > n = n[1:]-n[:-1] > > if normed: > db = bins[1] - bins[0] > return 1.0/(a.size*db) * n, bins > else: > return n, bins > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > From cameron.walsh at gmail.com Thu Dec 14 02:56:09 2006 From: cameron.walsh at gmail.com (Cameron Walsh) Date: Thu, 14 Dec 2006 16:56:09 +0900 Subject: [Numpy-discussion] Histograms of extremely large data sets In-Reply-To: <4580E941.4030805@enthought.com> References: <106309950612121927o6b50ee7fj4996c52d5ff2d250@mail.gmail.com> <9936E3D8-CF84-4605-BFB9-757D8FCC7912@stsci.edu> <4580E941.4030805@enthought.com> Message-ID: <106309950612132356r30b01b5ale19a928bc27ce7c6@mail.gmail.com> Hi all, Absolutely gorgeous, I confirm the 1.6x speed-up over the weave version, i.e. a 25x speed-up over the existing version. It would be good if the redefinition of the range function could be changed in the numpy modules, before it goes into subversion, to avoid the need for Rick's line lrange=range before the new histogram function. At some point I might try and test different cache sizes for different data-set sizes and see what the effect is. For now, 65536 seems a good number and I would be happy to see this replace the current numpy.histogram. Thanks very much Eric and Rick, you've both taught me a lot, as well as solving the original problem. I'm sure this will be of use to others in the future, so if there's anything I can do to assist in getting this into the next numpy release, please let me know. Best regards, Cameron. On 14/12/06, eric jones wrote: > Looks to me like Rick's version is simpler and faster.It looks like it > offers a speed-up of about 1.6 on my machine over the weave version. I > believe this is because the sorting approach results in quite a few less > compares than the algorithm I used. > > Very cool. I vote that his version go into numpy. > > eric > > > > Rick White wrote: > > On Dec 12, 2006, at 10:27 PM, Cameron Walsh wrote: > > > > > >> I'm trying to generate histograms of extremely large datasets. I've > >> tried a few methods, listed below, all with their own shortcomings. > >> Mailing-list archive and google searches have not revealed any > >> solutions. > >> > > > > The numpy.histogram function can be modified to use memory much more > > efficiently when the input array is large, and the modification turns > > out to be faster even for smallish arrays (in my tests, anyway). 
> > Below is a modified version of the histogram function from > > function_base.py. It is almost identical, but it avoids doing the > > sort of the entire input array simply by dividing it into blocks. > > (It would be even better to avoid the call to ravel too.) The only > > other messy detail is that the builtin range function is shadowed by > > the 'range' parameter. > > > > In my timing tests this is about the same speed for arrays about the > > same size as the block size and is faster than the current version by > > 30-40% for large arrays. The speed difference increases as the array > > size increases. > > > > I haven't compared this to Eric's weave function, but this has the > > advantages of being pure Python and of being much simpler. On my > > machine (MacBook Pro) it takes about 4 seconds for an array with 100 > > million elements. The time increases perfectly linearly with array > > size for arrays larger than a million elements. > > Rick > > > > from numpy import * > > > > lrange = range > > def histogram(a, bins=10, range=None, normed=False): > > a = asarray(a).ravel() > > if not iterable(bins): > > if range is None: > > range = (a.min(), a.max()) > > mn, mx = [mi+0.0 for mi in range] > > if mn == mx: > > mn -= 0.5 > > mx += 0.5 > > bins = linspace(mn, mx, bins, endpoint=False) > > > > # best block size probably depends on processor cache size > > block = 65536 > > n = sort(a[:block]).searchsorted(bins) > > for i in lrange(block,len(a),block): > > n += sort(a[i:i+block]).searchsorted(bins) > > n = concatenate([n, [len(a)]]) > > n = n[1:]-n[:-1] > > > > if normed: > > db = bins[1] - bins[0] > > return 1.0/(a.size*db) * n, bins > > else: > > return n, bins > > > > _______________________________________________ > > Numpy-discussion mailing list > > Numpy-discussion at scipy.org > > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > > > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > From rlw at stsci.edu Thu Dec 14 08:30:05 2006 From: rlw at stsci.edu (Rick White) Date: Thu, 14 Dec 2006 08:30:05 -0500 Subject: [Numpy-discussion] Histograms of extremely large data sets In-Reply-To: <106309950612132356r30b01b5ale19a928bc27ce7c6@mail.gmail.com> References: <106309950612121927o6b50ee7fj4996c52d5ff2d250@mail.gmail.com> <9936E3D8-CF84-4605-BFB9-757D8FCC7912@stsci.edu> <4580E941.4030805@enthought.com> <106309950612132356r30b01b5ale19a928bc27ce7c6@mail.gmail.com> Message-ID: <6897A9E2-4B52-4B2F-942F-BDF16AA03779@stsci.edu> On Dec 14, 2006, at 2:56 AM, Cameron Walsh wrote: > At some point I might try and test > different cache sizes for different data-set sizes and see what the > effect is. For now, 65536 seems a good number and I would be happy to > see this replace the current numpy.histogram. I experimented a little on my machine and found that 64k was a good size, but it is fairly insensitive to the size over a wide range (16000 to 1e6). I'd be interested to hear how this scales on other machines -- I'm pretty sure that the ideal size will keep the piece of the array being sorted smaller than the on-chip cache. Just so we don't get too smug about the speed, if I do this in IDL on the same machine it is 10 times faster (0.28 seconds instead of 4 seconds). I'm sure the IDL version uses the much faster approach of just sweeping through the array once, incrementing counts in the appropriate bins. 
It only handles equal-sized bins, so it is not as general as the numpy
version -- but equal-sized bins is a very common case. I'd still like
to see a C version of histogram (which I guess would need to be a
ufunc) go into the core numpy.
				Rick

From giorgio.luciano at chimica.unige.it  Thu Dec 14 08:31:47 2006
From: giorgio.luciano at chimica.unige.it (Giorgio Luciano)
Date: Thu, 14 Dec 2006 14:31:47 +0100
Subject: [Numpy-discussion] empty data matrix (are they really empty ?)
In-Reply-To: <4580B436.8040203@enthought.com>
References: <106309950612121927o6b50ee7fj4996c52d5ff2d250@mail.gmail.com>
	<457FAED1.6010803@enthought.com>
	<106309950612130027m7c454b19r1255670114d5cb33@mail.gmail.com>
	<106309950612131732j1e48dccfnbd60ee1c17b0c6c9@mail.gmail.com>
	<4580B436.8040203@enthought.com>
Message-ID: <45815243.9070406@chimica.unige.it>

I was converting a matlab file to my new favorite scientific language,
numpy :)
In the old file I created a matrix on the fly. I know that numpy and
Python cannot do that, so I found a workaround. Here's the code:

lev2=empty((1,h))
ir=1
for j in arange(1,nstep+2):
    #a=gr[[arange(ir-1,ir+nstep)],:]
    a2=gr[arange(ir-1,ir+nstep)]
    #flev=dot(a2,dot(disper,a2.transpose()))
    clev=diag(dot(a2,dot(disper,a2.transpose())))
    lev2=vstack((lev2,clev))
    #print ir
    #print clev
    #print h
    ir=ir+nstep+1
lev=lev2[1:,]
print lev

So:
First I create the empty matrix
Second perform the calculation
Third take the matrix and exclude the first line since it has "dummy"
values, because after I need to plot it with

contour(lev)
H,K = meshgrid(lab,lab)
fig=p.figure()
ax=p3.Axes3D(fig)
ax.plot_wireframe(H,K,lev)
p.show()
p.close

Everything works fine... but is this really necessary? Could not an
empty matrix just be "really empty"?
Thanks for the answers
Cheers
Giorgio

From svetosch at gmx.net  Thu Dec 14 09:19:42 2006
From: svetosch at gmx.net (Sven Schreiber)
Date: Thu, 14 Dec 2006 15:19:42 +0100
Subject: [Numpy-discussion] empty data matrix (are they really empty ?)
In-Reply-To: <45815243.9070406@chimica.unige.it>
References: <106309950612121927o6b50ee7fj4996c52d5ff2d250@mail.gmail.com>
	<457FAED1.6010803@enthought.com>
	<106309950612130027m7c454b19r1255670114d5cb33@mail.gmail.com>
	<106309950612131732j1e48dccfnbd60ee1c17b0c6c9@mail.gmail.com>
	<4580B436.8040203@enthought.com>
	<45815243.9070406@chimica.unige.it>
Message-ID: <45815D7E.3090109@gmx.net>

[you probably should have started a new thread instead of replying to
another one...]

Giorgio Luciano schrieb:
> In the old file I created a matrix on the fly. I know that numpy and
> Python cannot do that, so I found a workaround.

I'm not sure what you mean that numpy cannot do, but...

> Here's the code:
>
> lev2=empty((1,h))
> lev=lev2[1:,]
> So:
> First I create the empty matrix
> Second perform the calculation
> Third take the matrix and exclude the first line since it has "dummy"
...
>
> Everything works fine... but is this really necessary? Could not an
> empty matrix just be "really empty"?
> Thanks for the answers

you can use

lev2 = empty((0,h))

as a starting point for adding rows, it works and then nothing
"dummy"-like will be in lev2

hth,
sven
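[A quick self-contained check of the empty((0,h)) suggestion above; h, nstep
and the stand-in row are made-up values, not Giorgio's real data:]

from numpy import empty, vstack, zeros

h = 4
nstep = 3

# start with zero rows instead of one dummy row
lev2 = empty((0, h))
for j in range(nstep + 1):
    clev = zeros(h) + j           # stand-in for the real diag(...) row
    lev2 = vstack((lev2, clev))   # grows by one row on each pass

print lev2.shape                  # (4, 4): no dummy first row to strip

[Note that vstack copies the whole array on every pass, so for many rows the
preallocation approach in the following reply should be faster.]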
From Chris.Barker at noaa.gov  Thu Dec 14 12:40:43 2006
From: Chris.Barker at noaa.gov (Christopher Barker)
Date: Thu, 14 Dec 2006 09:40:43 -0800
Subject: [Numpy-discussion] empty data matrix (are they really empty ?)
In-Reply-To: <45815D7E.3090109@gmx.net>
References: <106309950612121927o6b50ee7fj4996c52d5ff2d250@mail.gmail.com>
	<457FAED1.6010803@enthought.com>
	<106309950612130027m7c454b19r1255670114d5cb33@mail.gmail.com>
	<106309950612131732j1e48dccfnbd60ee1c17b0c6c9@mail.gmail.com>
	<4580B436.8040203@enthought.com>
	<45815243.9070406@chimica.unige.it>
	<45815D7E.3090109@gmx.net>
Message-ID: <45818C9B.5060404@noaa.gov>

Sven Schreiber wrote:
>> In the old file I created a matrix on the fly. I know that numpy and
>> Python cannot do that, so I found a workaround.

numpy can create matrices on the fly, in fact, you are doing that with
this code! The only thing it doesn't do is have a literal that joins
matrices the way matlab does -- you need to use vstack and the like.

>> First I create the empty matrix

To get better performance, you could create the entire empty matrix, not
just one row -- this is the same as MATLAB -- if you know how big your
matrix is going to be, it's better to create it first with "zeros". In
numpy you can use either zeros or empty - just make sure that if you use
empty, you fill the whole thing later, or you'll get garbage.

Your code:

lev2=empty((1,h))   # you've just created an empty single row
.
.
.
lev2=vstack((lev2,clev))  # now you are creating a whole new array,
                          # with one more row than before.

The alternative:

lev2=empty((nstep+1,h))   # create the whole empty array
ir=1
for j in arange(1,nstep+2):
    a2=gr[arange(ir-1,ir+nstep)]
    clev=diag(dot(a2,dot(disper,a2.transpose())))
    lev2[j-1,:] = clev    # fill in the row you've just calculated
    ir=ir+nstep+1
print lev2

I may have got some of the indexing wrong, but I hope you get the idea.

By the way, if you sent a complete, runnable sample, we can test out
suggestions, and you'll get better answers.

-Chris

--
Christopher Barker, Ph.D.
Oceanographer
Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception
Chris.Barker at noaa.gov

From ellisonbg.net at gmail.com  Thu Dec 14 12:55:00 2006
From: ellisonbg.net at gmail.com (Brian Granger)
Date: Thu, 14 Dec 2006 10:55:00 -0700
Subject: [Numpy-discussion] Histograms of extremely large data sets
In-Reply-To: <9936E3D8-CF84-4605-BFB9-757D8FCC7912@stsci.edu>
References: <106309950612121927o6b50ee7fj4996c52d5ff2d250@mail.gmail.com>
	<9936E3D8-CF84-4605-BFB9-757D8FCC7912@stsci.edu>
Message-ID: <6ce0ac130612140955i6e1c79b9n60b17e169589b173@mail.gmail.com>

This same idea could be used to parallelize the histogram computation.
Then you could really get into large (many Gb/TB/PB) data sets. I might
try to find time to do this with ipython1, but someone else could do
this as well.

Brian

On 12/13/06, Rick White wrote:
> On Dec 12, 2006, at 10:27 PM, Cameron Walsh wrote:
>
> > I'm trying to generate histograms of extremely large datasets. I've
> > tried a few methods, listed below, all with their own shortcomings.
> > Mailing-list archive and google searches have not revealed any
> > solutions.
>
> The numpy.histogram function can be modified to use memory much more
> efficiently when the input array is large, and the modification turns
> out to be faster even for smallish arrays (in my tests, anyway).
> Below is a modified version of the histogram function from
> function_base.py. It is almost identical, but it avoids doing the
> sort of the entire input array simply by dividing it into blocks.
> (It would be even better to avoid the call to ravel too.)
The only > other messy detail is that the builtin range function is shadowed by > the 'range' parameter. > > In my timing tests this is about the same speed for arrays about the > same size as the block size and is faster than the current version by > 30-40% for large arrays. The speed difference increases as the array > size increases. > > I haven't compared this to Eric's weave function, but this has the > advantages of being pure Python and of being much simpler. On my > machine (MacBook Pro) it takes about 4 seconds for an array with 100 > million elements. The time increases perfectly linearly with array > size for arrays larger than a million elements. > Rick > > from numpy import * > > lrange = range > def histogram(a, bins=10, range=None, normed=False): > a = asarray(a).ravel() > if not iterable(bins): > if range is None: > range = (a.min(), a.max()) > mn, mx = [mi+0.0 for mi in range] > if mn == mx: > mn -= 0.5 > mx += 0.5 > bins = linspace(mn, mx, bins, endpoint=False) > > # best block size probably depends on processor cache size > block = 65536 > n = sort(a[:block]).searchsorted(bins) > for i in lrange(block,len(a),block): > n += sort(a[i:i+block]).searchsorted(bins) > n = concatenate([n, [len(a)]]) > n = n[1:]-n[:-1] > > if normed: > db = bins[1] - bins[0] > return 1.0/(a.size*db) * n, bins > else: > return n, bins > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > From tim.hochberg at ieee.org Thu Dec 14 13:21:25 2006 From: tim.hochberg at ieee.org (Tim Hochberg) Date: Thu, 14 Dec 2006 11:21:25 -0700 Subject: [Numpy-discussion] Pyrex and numpy Message-ID: <45819625.8040803@ieee.org> I was just going to try pyrex out with numpy to see how it compares with weave (which is cool but quirky). My first attempt ended in failure: I tried to compile the demo in in numpy/doc/pyrex and got this error: c_numpy.pxd:99:22: Array element cannot be a Python object Does anyone who uses pyrex see this? Does anyone know what it's from? Not that I deleted numpyx.c, since otherwise pyrex isn't invoked at all? -tim From eric at enthought.com Thu Dec 14 14:25:20 2006 From: eric at enthought.com (eric jones) Date: Thu, 14 Dec 2006 13:25:20 -0600 Subject: [Numpy-discussion] Histograms of extremely large data sets In-Reply-To: <6897A9E2-4B52-4B2F-942F-BDF16AA03779@stsci.edu> References: <106309950612121927o6b50ee7fj4996c52d5ff2d250@mail.gmail.com> <9936E3D8-CF84-4605-BFB9-757D8FCC7912@stsci.edu> <4580E941.4030805@enthought.com> <106309950612132356r30b01b5ale19a928bc27ce7c6@mail.gmail.com> <6897A9E2-4B52-4B2F-942F-BDF16AA03779@stsci.edu> Message-ID: <4581A520.4030806@enthought.com> Rick White wrote: > Just so we don't get too smug about the speed, if I do this in IDL on > the same machine it is 10 times faster (0.28 seconds instead of 4 > seconds). I'm sure the IDL version uses the much faster approach of > just sweeping through the array once, incrementing counts in the > appropriate bins. It only handles equal-sized bins, so it is not as > general as the numpy version -- but equal-sized bins is a very common > case. I'd still like to see a C version of histogram (which I guess > would need to be a ufunc) go into the core numpy. > Yes, this gets rid of the search, and indices can just be caluclated from offsets. I've attached a modified weaved histogram that takes this approach. 
Running the snippet below on my machine takes .118 sec for the evenly binned weave algorithm and 0.385 sec for Rick's algorithm on 5 million elements. That is close to 4x faster (but not 10x...), so there is indeed some speed to be gained for the common special case. I don't know if the code I wrote has a 2x gain left in it, but I've spent zero time optimizing it. I'd bet it can be improved substantially. eric ### test_weave_even_histogram.py from numpy import arange, product, sum, zeros, uint8 from numpy.random import randint import weave_even_histogram import time shape = 1000,1000,5 size = product(shape) data = randint(0,256,size).astype(uint8) bins = arange(256+1) print 'type:', data.dtype print 'millions of elements:', size/1e6 bin_start = 0 bin_size = 1 bin_count = 256 t1 = time.clock() res = weave_even_histogram.histogram(data, bin_start, bin_size, bin_count) t2 = time.clock() print 'sec (evenly spaced):', t2-t1, sum(res) print res > Rick > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- A non-text attachment was scrubbed... Name: weave_even_histogram.py Type: text/x-python Size: 1726 bytes Desc: not available URL: From eric at enthought.com Thu Dec 14 14:27:35 2006 From: eric at enthought.com (eric jones) Date: Thu, 14 Dec 2006 13:27:35 -0600 Subject: [Numpy-discussion] Histograms of extremely large data sets In-Reply-To: <4581A520.4030806@enthought.com> References: <106309950612121927o6b50ee7fj4996c52d5ff2d250@mail.gmail.com> <9936E3D8-CF84-4605-BFB9-757D8FCC7912@stsci.edu> <4580E941.4030805@enthought.com> <106309950612132356r30b01b5ale19a928bc27ce7c6@mail.gmail.com> <6897A9E2-4B52-4B2F-942F-BDF16AA03779@stsci.edu> <4581A520.4030806@enthought.com> Message-ID: <4581A5A7.5030406@enthought.com> I just noticed a bug in this code. "PyArray_ITER_NEXT(iter);" should be moved out of the if statement. eric eric jones wrote: > > > Rick White wrote: >> Just so we don't get too smug about the speed, if I do this in IDL >> on the same machine it is 10 times faster (0.28 seconds instead of >> 4 seconds). I'm sure the IDL version uses the much faster approach >> of just sweeping through the array once, incrementing counts in the >> appropriate bins. It only handles equal-sized bins, so it is not as >> general as the numpy version -- but equal-sized bins is a very >> common case. I'd still like to see a C version of histogram (which >> I guess would need to be a ufunc) go into the core numpy. >> > Yes, this gets rid of the search, and indices can just be caluclated > from offsets. I've attached a modified weaved histogram that takes > this approach. Running the snippet below on my machine takes .118 sec > for the evenly binned weave algorithm and 0.385 sec for Rick's > algorithm on 5 million elements. That is close to 4x faster (but not > 10x...), so there is indeed some speed to be gained for the common > special case. I don't know if the code I wrote has a 2x gain left in > it, but I've spent zero time optimizing it. I'd bet it can be > improved substantially. 
> > eric > > ### test_weave_even_histogram.py > > from numpy import arange, product, sum, zeros, uint8 > from numpy.random import randint > > import weave_even_histogram > > import time > > shape = 1000,1000,5 > size = product(shape) > data = randint(0,256,size).astype(uint8) > bins = arange(256+1) > > print 'type:', data.dtype > print 'millions of elements:', size/1e6 > > bin_start = 0 > bin_size = 1 > bin_count = 256 > t1 = time.clock() > res = weave_even_histogram.histogram(data, bin_start, bin_size, > bin_count) > t2 = time.clock() > print 'sec (evenly spaced):', t2-t1, sum(res) > print res > > >> Rick >> _______________________________________________ >> Numpy-discussion mailing list >> Numpy-discussion at scipy.org >> http://projects.scipy.org/mailman/listinfo/numpy-discussion >> >> > > ------------------------------------------------------------------------ > > from numpy import array, zeros, asarray, sort, int32 > from scipy import weave > from typed_array_converter import converters > > def histogram(ary, bin_start, bin_size, bin_count): > > ary = asarray(ary) > > # Create an array to hold the histogram count results. > results = zeros(bin_count,dtype=int32) > > # The C++ code that actually does the histogramming. > code = """ > PyArrayIterObject *iter = (PyArrayIterObject*)PyArray_IterNew(py_ary); > > while(iter->index < iter->size) > { > > ////////////////////////////////////////////////////////// > // binary search > ////////////////////////////////////////////////////////// > > // This requires an update to weave > ary_data_type value = *((ary_data_type*)iter->dataptr); > if (value>=bin_start) > { > int bin_index = (int)((value-bin_start)/bin_size); > > ////////////////////////////////////////////////////////// > // Bin counter increment > ////////////////////////////////////////////////////////// > > // If the value was found, increment the counter for that bin. > if (bin_index < bin_count) > { > results[bin_index]++; > } > PyArray_ITER_NEXT(iter); > } > } > """ > weave.inline(code, ['ary', 'bin_start', 'bin_size','bin_count', 'results'], > type_converters=converters, > compiler='gcc') > > return results > > ------------------------------------------------------------------------ > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > From faltet at carabos.com Thu Dec 14 14:30:47 2006 From: faltet at carabos.com (Francesc Altet) Date: Thu, 14 Dec 2006 20:30:47 +0100 Subject: [Numpy-discussion] Pyrex and numpy In-Reply-To: <45819625.8040803@ieee.org> References: <45819625.8040803@ieee.org> Message-ID: <1166124647.2645.38.camel@localhost.localdomain> El dj 14 de 12 del 2006 a les 11:21 -0700, en/na Tim Hochberg va escriure: > > I was just going to try pyrex out with numpy to see how it compares with > weave (which is cool but quirky). My first attempt ended in failure: I > tried to compile the demo in in numpy/doc/pyrex and got this error: > > c_numpy.pxd:99:22: Array element cannot be a Python object > > > Does anyone who uses pyrex see this? Does anyone know what it's from? > Not that I deleted numpyx.c, since otherwise pyrex isn't invoked at all? > Mmm, I can compile and run the example just fine. That's strange because your Pyrex error seems to tell that NPY_MAXDIMS is a Python object instead of an integer. 
But in my numpy installation, NPY_MAXDIMS is defined in ndarrayobject.h, which should be imported automatically by Pyrex in: cdef extern from "numpy/arrayobject.h": block (which should include ndarrayobject.h). Sorry for not being able to help more, -- Francesc Altet | Be careful about using the following code -- Carabos Coop. V. | I've only proven that it works, www.carabos.com | I haven't tested it. -- Donald Knuth From david.huard at gmail.com Thu Dec 14 14:45:24 2006 From: david.huard at gmail.com (David Huard) Date: Thu, 14 Dec 2006 14:45:24 -0500 Subject: [Numpy-discussion] Histograms of extremely large data sets In-Reply-To: <4581A520.4030806@enthought.com> References: <106309950612121927o6b50ee7fj4996c52d5ff2d250@mail.gmail.com> <9936E3D8-CF84-4605-BFB9-757D8FCC7912@stsci.edu> <4580E941.4030805@enthought.com> <106309950612132356r30b01b5ale19a928bc27ce7c6@mail.gmail.com> <6897A9E2-4B52-4B2F-942F-BDF16AA03779@stsci.edu> <4581A520.4030806@enthought.com> Message-ID: <91cf711d0612141145u2fce5b3bwc09bcc191be5f8f1@mail.gmail.com> Hi, I spent some time a while ago on an histogram function for numpy. It uses digitize and bincount instead of sorting the data. If I remember right, it was significantly faster than numpy's histogram, but I don't know how it will behave with very large data sets. I attached the file if you want to take a look, or if you me the benchmark, I'll add it to it and report the results. Cheers, David 2006/12/14, eric jones : > > > > Rick White wrote: > > Just so we don't get too smug about the speed, if I do this in IDL on > > the same machine it is 10 times faster (0.28 seconds instead of 4 > > seconds). I'm sure the IDL version uses the much faster approach of > > just sweeping through the array once, incrementing counts in the > > appropriate bins. It only handles equal-sized bins, so it is not as > > general as the numpy version -- but equal-sized bins is a very common > > case. I'd still like to see a C version of histogram (which I guess > > would need to be a ufunc) go into the core numpy. > > > Yes, this gets rid of the search, and indices can just be caluclated > from offsets. I've attached a modified weaved histogram that takes this > approach. Running the snippet below on my machine takes .118 sec for > the evenly binned weave algorithm and 0.385 sec for Rick's algorithm on > 5 million elements. That is close to 4x faster (but not 10x...), so > there is indeed some speed to be gained for the common special case. I > don't know if the code I wrote has a 2x gain left in it, but I've spent > zero time optimizing it. I'd bet it can be improved substantially. 
>
> eric
>
> ### test_weave_even_histogram.py
>
> from numpy import arange, product, sum, zeros, uint8
> from numpy.random import randint
>
> import weave_even_histogram
>
> import time
>
> shape = 1000,1000,5
> size = product(shape)
> data = randint(0,256,size).astype(uint8)
> bins = arange(256+1)
>
> print 'type:', data.dtype
> print 'millions of elements:', size/1e6
>
> bin_start = 0
> bin_size = 1
> bin_count = 256
> t1 = time.clock()
> res = weave_even_histogram.histogram(data, bin_start, bin_size, bin_count)
> t2 = time.clock()
> print 'sec (evenly spaced):', t2-t1, sum(res)
> print res
>
>> Rick
>> _______________________________________________
>> Numpy-discussion mailing list
>> Numpy-discussion at scipy.org
>> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion

-------------- next part --------------
An HTML attachment was scrubbed...
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: histogram1d.py
Type: text/x-python
Size: 6826 bytes
Desc: not available
URL:

From eric at enthought.com  Thu Dec 14 15:15:00 2006
From: eric at enthought.com (eric jones)
Date: Thu, 14 Dec 2006 14:15:00 -0600
Subject: [Numpy-discussion] Histograms of extremely large data sets
In-Reply-To: <4581A5A7.5030406@enthought.com>
References: <106309950612121927o6b50ee7fj4996c52d5ff2d250@mail.gmail.com>
	<9936E3D8-CF84-4605-BFB9-757D8FCC7912@stsci.edu>
	<4580E941.4030805@enthought.com>
	<106309950612132356r30b01b5ale19a928bc27ce7c6@mail.gmail.com>
	<6897A9E2-4B52-4B2F-942F-BDF16AA03779@stsci.edu>
	<4581A520.4030806@enthought.com>
	<4581A5A7.5030406@enthought.com>
Message-ID: <4581B0C4.9050605@enthought.com>

I've attached the newest version of my benchmarking code, and added one
more evenly spaced version that works on contiguous arrays. Turns out
that the numpy iterator interface overhead is noticeably slower when we
are using faster algorithms. The fastest (but least flexible) version is
5x faster than Rick's. So, we're still not at the 10x that IDL gives you.

eric

C:\eric\code\histogram>c:\Python24\python.exe histogram_speed.py
type: uint8
millions of elements: 10.0
sec (C indexing based): 0.944789631084 10000000
sec (numpy iteration based): 1.04402933892 10000000
sec (rick's pure python): 0.703124279762 10000000
sec (evenly spaced): 0.231293921434 10000000
sec (evenly spaced): 0.139521643114 10000000

Summary:
case                   sec        speed-up
weave_1d_arbitrary     0.944790   0.744213
weave_nd_arbitrary     1.044029   0.673472
ricks_arbitrary        0.703124   1.000000
weave_nd_even          0.231294   3.039960
weave_1d_even          0.139522   5.039536

eric jones wrote:
> I just noticed a bug in this code. "PyArray_ITER_NEXT(iter);" should be
> moved out of the if statement.
>
> eric
>
> eric jones wrote:
>
>> Rick White wrote:
>>
>>> Just so we don't get too smug about the speed, if I do this in IDL
>>> on the same machine it is 10 times faster (0.28 seconds instead of
>>> 4 seconds). I'm sure the IDL version uses the much faster approach
>>> of just sweeping through the array once, incrementing counts in the
>>> appropriate bins. It only handles equal-sized bins, so it is not as
>>> general as the numpy version -- but equal-sized bins is a very
>>> common case. I'd still like to see a C version of histogram (which
>>> I guess would need to be a ufunc) go into the core numpy.
>>> >>> >> Yes, this gets rid of the search, and indices can just be caluclated >> from offsets. I've attached a modified weaved histogram that takes >> this approach. Running the snippet below on my machine takes .118 sec >> for the evenly binned weave algorithm and 0.385 sec for Rick's >> algorithm on 5 million elements. That is close to 4x faster (but not >> 10x...), so there is indeed some speed to be gained for the common >> special case. I don't know if the code I wrote has a 2x gain left in >> it, but I've spent zero time optimizing it. I'd bet it can be >> improved substantially. >> >> eric >> >> ### test_weave_even_histogram.py >> >> from numpy import arange, product, sum, zeros, uint8 >> from numpy.random import randint >> >> import weave_even_histogram >> >> import time >> >> shape = 1000,1000,5 >> size = product(shape) >> data = randint(0,256,size).astype(uint8) >> bins = arange(256+1) >> >> print 'type:', data.dtype >> print 'millions of elements:', size/1e6 >> >> bin_start = 0 >> bin_size = 1 >> bin_count = 256 >> t1 = time.clock() >> res = weave_even_histogram.histogram(data, bin_start, bin_size, >> bin_count) >> t2 = time.clock() >> print 'sec (evenly spaced):', t2-t1, sum(res) >> print res >> >> >> >>> Rick >>> _______________________________________________ >>> Numpy-discussion mailing list >>> Numpy-discussion at scipy.org >>> http://projects.scipy.org/mailman/listinfo/numpy-discussion >>> >>> >>> >> ------------------------------------------------------------------------ >> >> from numpy import array, zeros, asarray, sort, int32 >> from scipy import weave >> from typed_array_converter import converters >> >> def histogram(ary, bin_start, bin_size, bin_count): >> >> ary = asarray(ary) >> >> # Create an array to hold the histogram count results. >> results = zeros(bin_count,dtype=int32) >> >> # The C++ code that actually does the histogramming. >> code = """ >> PyArrayIterObject *iter = (PyArrayIterObject*)PyArray_IterNew(py_ary); >> >> while(iter->index < iter->size) >> { >> >> ////////////////////////////////////////////////////////// >> // binary search >> ////////////////////////////////////////////////////////// >> >> // This requires an update to weave >> ary_data_type value = *((ary_data_type*)iter->dataptr); >> if (value>=bin_start) >> { >> int bin_index = (int)((value-bin_start)/bin_size); >> >> ////////////////////////////////////////////////////////// >> // Bin counter increment >> ////////////////////////////////////////////////////////// >> >> // If the value was found, increment the counter for that bin. >> if (bin_index < bin_count) >> { >> results[bin_index]++; >> } >> PyArray_ITER_NEXT(iter); >> } >> } >> """ >> weave.inline(code, ['ary', 'bin_start', 'bin_size','bin_count', 'results'], >> type_converters=converters, >> compiler='gcc') >> >> return results >> >> ------------------------------------------------------------------------ >> >> _______________________________________________ >> Numpy-discussion mailing list >> Numpy-discussion at scipy.org >> http://projects.scipy.org/mailman/listinfo/numpy-discussion >> >> > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- A non-text attachment was scrubbed... 
Name: histogram.zip Type: application/x-zip-compressed Size: 5823 bytes Desc: not available URL: From ellisonbg.net at gmail.com Thu Dec 14 20:06:14 2006 From: ellisonbg.net at gmail.com (Brian Granger) Date: Thu, 14 Dec 2006 18:06:14 -0700 Subject: [Numpy-discussion] Can we change how fortran compiler version strings are handled?! Message-ID: <6ce0ac130612141706r147c3208i687de9625051a0e6@mail.gmail.com> Hi, I have been doing quite a bit of numpy evangelism here at my work and slowly people are starting to use it. One of the main things people are interested in is f2py. But, I am finding that there is one persistent problem that keeps coming up when people try to install numpy on various systems: In three cases I have found that numpy failed to find and use a fortran compiler because the version string didn't match what was hardcoded into numpy.distutils. The reality is that version strings are in no way "standardized". In the most recent cases, we had a version of the lahey compiler that had the extra word "Express" and in another case, the xlf version string on a supercomputer was completely different. What is crazy to me is that this simple mismatch prevents numpy from even trying the compiler. Can we please change how Numpy handles the version string of fortran compilers? My suggestion would be to simply print the version string, but to attempt to use the compiler no matter what the version string is. That way, the success or failure of using the fortran compiler will be determined by the actual compiler, not its version string. There could be some other smart way of handling this, but I think it should be dealt with to make the installation process easier. I am willing to work up a patch if there is agreement on what should be done. Oh, the other difficult thing is that in the current arrangement, numpy.distutils doesn't print an error message that is easy to debug. It just silently does find the compiler rather than saying why. Thanks Brian From robert.kern at gmail.com Thu Dec 14 20:14:22 2006 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 14 Dec 2006 17:14:22 -0800 Subject: [Numpy-discussion] Can we change how fortran compiler version strings are handled?! In-Reply-To: <6ce0ac130612141706r147c3208i687de9625051a0e6@mail.gmail.com> References: <6ce0ac130612141706r147c3208i687de9625051a0e6@mail.gmail.com> Message-ID: <4581F6EE.5070006@gmail.com> Brian Granger wrote: > Can we please change how Numpy handles the version string of fortran > compilers? Yes, please. I'll be happy to apply any patch you might provide for this. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
   -- Umberto Eco

From cameron.walsh at gmail.com  Thu Dec 14 22:43:46 2006
From: cameron.walsh at gmail.com (Cameron Walsh)
Date: Fri, 15 Dec 2006 12:43:46 +0900
Subject: [Numpy-discussion] Histograms of extremely large data sets
In-Reply-To: <4581B0C4.9050605@enthought.com>
References: <106309950612121927o6b50ee7fj4996c52d5ff2d250@mail.gmail.com>
	<9936E3D8-CF84-4605-BFB9-757D8FCC7912@stsci.edu>
	<4580E941.4030805@enthought.com>
	<106309950612132356r30b01b5ale19a928bc27ce7c6@mail.gmail.com>
	<6897A9E2-4B52-4B2F-942F-BDF16AA03779@stsci.edu>
	<4581A520.4030806@enthought.com>
	<4581A5A7.5030406@enthought.com>
	<4581B0C4.9050605@enthought.com>
Message-ID: <106309950612141943p405a3e44i975aa7dd99fa90@mail.gmail.com>

Using Eric's latest speed-testing, here are David's results:

cameron at cameron-laptop:~/code_snippets/histogram$ python histogram_speed.py
type: uint8
millions of elements: 100.0
sec (C indexing based): 8.44 100000000
sec (numpy iteration based): 8.91 100000000
sec (rick's pure python): 6.4 100000000
sec (nd evenly spaced): 2.1 100000000
sec (1d evenly spaced): 1.33 100000000
sec (david huard): 35.84 100000000

Summary:
case                   sec         speed-up
weave_1d_arbitrary     8.440000    0.758294
weave_nd_arbitrary     8.910000    0.718294
ricks_arbitrary        6.400000    1.000000
weave_nd_even          2.100000    3.047619
weave_1d_even          1.330000    4.812030
david_huard            35.840000   0.178571

I also tried this on an equal-sized sample of my real-world data: 100
image slices, 8 bits/sample, 1000x1000 pixels per image. The full data
set is 489 image slices, but I was unable to randomly generate 489
million data samples because I ran out of memory and started thrashing
the page file, ruining any results. So I've compared like with like and
got the following results with real-world data:

type: uint8
millions of elements: 100.0
sec (C indexing based): 6.1 100000000
sec (numpy iteration based): 7.07 100000000
sec (rick's pure python): 4.77 100000000
sec (nd evenly spaced): 2.12 100000000
sec (1d evenly spaced): 1.33 100000000
sec (david huard): 16.47 100000000

Summary:
case                   sec         speed-up
weave_1d_arbitrary     6.100000    0.781967
weave_nd_arbitrary     7.070000    0.674682
ricks_arbitrary        4.770000    1.000000
weave_nd_even          2.120000    2.250000
weave_1d_even          1.330000    3.586466
david_huard            16.470000   0.289617

Note how much faster some of the algorithms run on the non-random,
real-world data. I assume this is due to variations in the scaling of
the quick-sort algorithm depending on the starting order of the data?
Scaling with the full data set was similar.

Unfortunately, David's code was not able to load the entire 489 image
slices, throwing the same error as that mentioned in the first email in
this thread.

Later parts of the project I am working on will probably require
iteration over the entire data set, and iteration seems to be slowing
down several of these histogram algorithms, requiring the sort()
approach. I'll have a look at the iterator, and see if there's anything
that can be done there instead. I'm hoping that it will be possible to
use a C-based iterator for a numpy multiarray, as this would allow many
data processing algorithms to run faster, not just the histogram.

Once again, thanks to everyone for all your input. This seems to have
generated more discussion and action than I anticipated, for which I am
very grateful.

Best regards,

Cameron.
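[For the equal-width case benchmarked above, a single-sweep histogram can
also be written in pure numpy with bincount, avoiding both sorting and
compiled code. A minimal sketch, assuming the data fits in memory; the
function name even_histogram is made up here, and this is not the attached
weave version or David's histogram1d.py:]

from numpy import asarray, bincount, floor, zeros, int32

def even_histogram(a, bin_start, bin_size, bin_count):
    """One-pass histogram for equal-width bins (sketch)."""
    a = asarray(a).ravel()
    # map each value to its bin index in a single sweep
    idx = floor((a - bin_start) / float(bin_size)).astype(int32)
    # keep only values that land inside the requested bins
    idx = idx[(idx >= 0) & (idx < bin_count)]
    counts = bincount(idx)
    # bincount's result can be shorter than bin_count; pad it out
    out = zeros(bin_count, dtype=int32)
    out[:len(counts)] = counts
    return out

# e.g. for the uint8 case above: counts = even_histogram(data, 0, 1, 256)

[The temporary idx array costs extra memory, so for very large inputs it
would presumably be combined with the block-at-a-time loop from Rick's
version.]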
From giorgio.luciano at chimica.unige.it Fri Dec 15 03:14:28 2006 From: giorgio.luciano at chimica.unige.it (Giorgio Luciano) Date: Fri, 15 Dec 2006 09:14:28 +0100 Subject: [Numpy-discussion] empty data matrix (are they really empty ?) In-Reply-To: <45815D7E.3090109@gmx.net> References: <106309950612121927o6b50ee7fj4996c52d5ff2d250@mail.gmail.com> <457FAED1.6010803@enthought.com> <106309950612130027m7c454b19r1255670114d5cb33@mail.gmail.com> <106309950612131732j1e48dccfnbd60ee1c17b0c6c9@mail.gmail.com> <4580B436.8040203@enthought.com> <45815243.9070406@chimica.unige.it> <45815D7E.3090109@gmx.net> Message-ID: <45825964.1000702@chimica.unige.it> Ok thanks Swen I will try and sorry for not starting a new thread.. I'm still a newbie with mailing list too :) Giorgio > > > From giorgio.luciano at chimica.unige.it Fri Dec 15 03:59:06 2006 From: giorgio.luciano at chimica.unige.it (Giorgio Luciano) Date: Fri, 15 Dec 2006 09:59:06 +0100 Subject: [Numpy-discussion] empty data matrix (are they really empty ?) In-Reply-To: <45818C9B.5060404@noaa.gov> References: <106309950612121927o6b50ee7fj4996c52d5ff2d250@mail.gmail.com> <457FAED1.6010803@enthought.com> <106309950612130027m7c454b19r1255670114d5cb33@mail.gmail.com> <106309950612131732j1e48dccfnbd60ee1c17b0c6c9@mail.gmail.com> <4580B436.8040203@enthought.com> <45815243.9070406@chimica.unige.it> <45815D7E.3090109@gmx.net> <45818C9B.5060404@noaa.gov> Message-ID: <458263DA.30303@chimica.unige.it> Here's the runnable example Everything work fine with installed python 2.5 matplotlib 0.87.7 numpy 1.01scipy 0.5.2 Cheers Giorgio this module plots leverage for a regression module (only two steps in the grid since it's only a try to compare with a matlab file I have) -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: pyplotlev.py URL: From meesters at uni-mainz.de Fri Dec 15 05:52:26 2006 From: meesters at uni-mainz.de (Christian Meesters) Date: Fri, 15 Dec 2006 11:52:26 +0100 Subject: [Numpy-discussion] migrating to numpy Message-ID: <200612151152.26804.meesters@uni-mainz.de> Hi I was looking for a guide how to migrate old numeric code to numpy and couldn't find anything on the web. Any pointers for me? TIA and sorry for bothering, since this surely was discussed already, Christian From pjssilva at ime.usp.br Fri Dec 15 08:27:21 2006 From: pjssilva at ime.usp.br (Paulo Jose da Silva e Silva) Date: Fri, 15 Dec 2006 11:27:21 -0200 Subject: [Numpy-discussion] Automatic matrices Message-ID: <1166189241.17633.27.camel@localhost.localdomain> Hello, If a numpy user is specially concerned with numerical linear algebra (or more generally with Math), it may find unconvenient the use of the dot function instead of the * operator. This behavior may be specially unpleasant for someone migrating from Matlab. I believe that this is the may reason for the existence of the matrix class within numpy. However, after trying to use the matrix class I have came across a major roadblock: many numpy/scipy functions return an array by default and not matrices. Then, we need then to add many conversion calls to the 'mat' function in our code. This is also unconvenient. I have had the idea of trying to write some code to automatically call the mat function for me. I have a very simple and inefficient prototype now that can be downloaded at: http://www.ime.usp.br/~pjssilva/matrix_import.py The use is simple. Instead of importing a numerical module with import, use the special class MatrixfiedModule from the above file. 
Let me give an example: --- ipython session --- In [2]:import matrix_import as mi In [3]:num = mi.MatrixfiedModule('numpy') Importing numpy In [4]:la = mi.MatrixfiedModule('scipy.linalg') Importing scipy.linalg In [5]:A = num.random.rand(3,4) Importing numpy.random In [6]:Q, R = la.qr(A) In [7]:la.norm(Q*R - A) Out[7]:6.0555516793379748e-16 ----- End session ----- For now the solution is very inefficient: every function call to a MatrixfiedModule function is wrapped on the fly to search for array return values ad convert them to matrix. This can certainly be improved the wrapping all the functions in the original module first. I plan to add this possibility soon. It is also incomplete: The automatic conversion only happens for return values of module function. It doesn't try to deal with special objects like finfo(float).eps or mgrid[0:9.,0:6.]. I am not sure how to deal with this. I can donate the code to scipy if there is any interest. Any comments? Best, Paulo -- Paulo Jos? da Silva e Silva Professor Assistente, Dep. de Ci?ncia da Computa??o (Assistant Professor, Computer Science Dept.) Universidade de S?o Paulo - Brazil e-mail: pjssilva at ime.usp.br Web: http://www.ime.usp.br/~pjssilva Teoria ? o que n?o entendemos o (Theory is something we don't) suficiente para chamar de pr?tica. (understand well enough to call) (practice) From evan.lapisky at gmail.com Fri Dec 15 09:37:26 2006 From: evan.lapisky at gmail.com (Evan Lapisky) Date: Fri, 15 Dec 2006 09:37:26 -0500 Subject: [Numpy-discussion] Pyrex and numpy Message-ID: <4052c5140612150637m126d7ca2u5537f3a7b7a7178d@mail.gmail.com> > I was just going to try pyrex out with numpy to see how it compares with > weave (which is cool but quirky). My first attempt ended in failure: I > tried to compile the demo in in numpy/doc/pyrex and got this error: > > c_numpy.pxd:99:22: Array element cannot be a Python object > > > Does anyone who uses pyrex see this? Does anyone know what it's from? > Not that I deleted numpyx.c, since otherwise pyrex isn't invoked at all? > I had the same problem. I don't know why it didn't work, but the pyrex example from http://scipy.org/PerformancePython worked just fine. -Evan From robert.kern at gmail.com Fri Dec 15 10:05:35 2006 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 15 Dec 2006 07:05:35 -0800 Subject: [Numpy-discussion] migrating to numpy In-Reply-To: <200612151152.26804.meesters@uni-mainz.de> References: <200612151152.26804.meesters@uni-mainz.de> Message-ID: <4582B9BF.2070003@gmail.com> Christian Meesters wrote: > Hi > > I was looking for a guide how to migrate old numeric code to numpy and > couldn't find anything on the web. Any pointers for me? http://www.scipy.org/Converting_from_Numeric Also, chapter 2.6 of the _Guide to NumPy_ in the freely available sample chapters covers this. http://www.tramy.us/numpybooksample.pdf -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From tim.hochberg at ieee.org Fri Dec 15 11:05:05 2006 From: tim.hochberg at ieee.org (Tim Hochberg) Date: Fri, 15 Dec 2006 09:05:05 -0700 Subject: [Numpy-discussion] Pyrex and numpy In-Reply-To: <4052c5140612150637m126d7ca2u5537f3a7b7a7178d@mail.gmail.com> References: <4052c5140612150637m126d7ca2u5537f3a7b7a7178d@mail.gmail.com> Message-ID: <4582C7B1.2090406@ieee.org> Evan Lapisky wrote: >> I was just going to try pyrex out with numpy to see how it compares with >> weave (which is cool but quirky). My first attempt ended in failure: I >> tried to compile the demo in in numpy/doc/pyrex and got this error: >> >> c_numpy.pxd:99:22: Array element cannot be a Python object >> >> >> Does anyone who uses pyrex see this? Does anyone know what it's from? >> Not that I deleted numpyx.c, since otherwise pyrex isn't invoked at all? >> >> > > I had the same problem. I don't know why it didn't work, but the pyrex > example from http://scipy.org/PerformancePython worked just fine. > > Hmmm. Thanks. I'll give it a try. -tim From svetosch at gmx.net Fri Dec 15 17:37:57 2006 From: svetosch at gmx.net (Sven Schreiber) Date: Fri, 15 Dec 2006 23:37:57 +0100 Subject: [Numpy-discussion] Automatic matrices In-Reply-To: <1166189241.17633.27.camel@localhost.localdomain> References: <1166189241.17633.27.camel@localhost.localdomain> Message-ID: <458323C5.7010808@gmx.net> Paulo Jose da Silva e Silva schrieb: > > However, after trying to use the matrix class I have came across a major > roadblock: many numpy/scipy functions return an array by default and not > matrices. Then, we need then to add many conversion calls to the 'mat' > function in our code. This is also unconvenient. > scipy I don't know, but in numpy as a matrix user I'm glad that such behavior has been treated as bugs on the way to 1.0 -- so could you please send a list with the affected numpy functions? -sven From pjssilva at ime.usp.br Fri Dec 15 19:21:37 2006 From: pjssilva at ime.usp.br (Paulo Jose da Silva e Silva) Date: Fri, 15 Dec 2006 22:21:37 -0200 Subject: [Numpy-discussion] Automatic matrices In-Reply-To: <458323C5.7010808@gmx.net> References: <1166189241.17633.27.camel@localhost.localdomain> <458323C5.7010808@gmx.net> Message-ID: <1166228497.17633.61.camel@localhost.localdomain> Em Sex, 2006-12-15 ?s 23:37 +0100, Sven Schreiber escreveu: > Paulo Jose da Silva e Silva schrieb: > > > > > However, after trying to use the matrix class I have came across a major > > roadblock: many numpy/scipy functions return an array by default and not > > matrices. Then, we need then to add many conversion calls to the 'mat' > > function in our code. This is also unconvenient. > > > > scipy I don't know, but in numpy as a matrix user I'm glad that such > behavior has been treated as bugs on the way to 1.0 -- so could you > please send a list with the affected numpy functions? > -sven Ops... I did not try to imply that there are some functions in numpy that return array when receiving matrices. What I meant is that there are functions in numpy that always return arrays. Hence they ask for an explicit conversion to matrices. Good examples is the whole numpy.random sub-module. So if you want a random matrix you need to type: A = mat(numpy.random.rand(4,4)) Hence, a matrix user of numpy module still have to be aware of such conversions. Note that in my code, after importing numpy using the special module, I can write A = num.random.rand(4,4) There is no special case. 
best,

Paulo

P.S.: I remember reading somewhere in the list that we can change the
behavior of numpy to make it return matrices by default, even in calls
to functions like zeros or ones. I don't have the reference now. Anyhow,
I wanted a solution that can make any module play nice with matrices.

From kwgoodman at gmail.com Fri Dec 15 19:44:25 2006
From: kwgoodman at gmail.com (Keith Goodman)
Date: Fri, 15 Dec 2006 16:44:25 -0800
Subject: [Numpy-discussion] Automatic matrices
In-Reply-To: <1166228497.17633.61.camel@localhost.localdomain>
References: <1166189241.17633.27.camel@localhost.localdomain> <458323C5.7010808@gmx.net> <1166228497.17633.61.camel@localhost.localdomain>
Message-ID:

On 12/15/06, Paulo Jose da Silva e Silva wrote:
> I did not try to imply that there are some functions in numpy that
> return arrays when receiving matrices. What I meant is that there are
> functions in numpy that always return arrays, hence they ask for an
> explicit conversion to matrices. A good example is the whole
> numpy.random sub-module. So if you want a random matrix you need to
> type:

There are many numpy functions that will take a matrix as input but
return an array.

The nan functions (nanmin, nanmax, nanargmin, nanargmax, nansum) are an
example. The first line in these functions is

y = array(a)

which converts the matrix input into an array. A more matrix-friendly
alternative would be

y = asanyarray(a)

But are there any unintended consequences of changing from array to
asanyarray?

From robert.kern at gmail.com Fri Dec 15 21:49:49 2006
From: robert.kern at gmail.com (Robert Kern)
Date: Fri, 15 Dec 2006 20:49:49 -0600
Subject: [Numpy-discussion] Automatic matrices
In-Reply-To:
References: <1166189241.17633.27.camel@localhost.localdomain> <458323C5.7010808@gmx.net> <1166228497.17633.61.camel@localhost.localdomain>
Message-ID: <45835ECD.1060703@gmail.com>

Keith Goodman wrote:
> But are there any unintended consequences of changing from array to
> asanyarray?

Not by itself, no. That entails that the implementations cannot rely on
any particular behavior of the arrays. The correct(ish) approach looks
something like the following, I believe:

def foo(input):
    input_arr = asanyarray(input)
    wrapper = input_arr.__array_wrap__
    input_arr = asarray(input_arr)
    # Do stuff on input_arr to get output_arr (an ndarray).
    return wrapper(output_arr)

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
-- Umberto Eco

From svetosch at gmx.net Sat Dec 16 15:24:41 2006
From: svetosch at gmx.net (Sven Schreiber)
Date: Sat, 16 Dec 2006 21:24:41 +0100
Subject: [Numpy-discussion] Automatic matrices
In-Reply-To:
References: <1166189241.17633.27.camel@localhost.localdomain> <458323C5.7010808@gmx.net> <1166228497.17633.61.camel@localhost.localdomain>
Message-ID: <45845609.40705@gmx.net>

Keith Goodman wrote:
>
> There are many numpy functions that will take a matrix as input but
> return an array.
>
> The nan functions (nanmin, nanmax, nanargmin, nanargmax, nansum) are an
> example.

So that would be a bug IMHO and should be filed as a ticket. I will do
that eventually if nobody stops me first...
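For the record, the difference is easy to see interactively (a sketch from
memory, not re-checked against current svn, so take the exact reprs with a
grain of salt):

>>> import numpy
>>> m = numpy.mat(numpy.ones((2,2)))
>>> type(numpy.array(m))        # what the nan functions do today
<type 'numpy.ndarray'>
>>> type(numpy.asanyarray(m))   # the subclass-preserving alternative
<class 'numpy.core.defmatrix.matrix'>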
-sven From svetosch at gmx.net Sat Dec 16 15:27:14 2006 From: svetosch at gmx.net (Sven Schreiber) Date: Sat, 16 Dec 2006 21:27:14 +0100 Subject: [Numpy-discussion] Automatic matrices In-Reply-To: <1166228497.17633.61.camel@localhost.localdomain> References: <1166189241.17633.27.camel@localhost.localdomain> <458323C5.7010808@gmx.net> <1166228497.17633.61.camel@localhost.localdomain> Message-ID: <458456A2.6010900@gmx.net> Paulo Jose da Silva e Silva schrieb: > > Obs: I remember reading somewhere in the list that we can change the > behavior of numpy to make it return matrices as default, even in calls > for functions like zeros or ones. I don't have the reference now. Anyhow > I wanted a solution that can make any module play nice with matrices. > Yes, from numpy.matlib import ones, zeros, empty, rand, eye should cover most cases (at least for me) -sven From cjw at sympatico.ca Sat Dec 16 19:55:46 2006 From: cjw at sympatico.ca (Colin J. Williams) Date: Sat, 16 Dec 2006 19:55:46 -0500 Subject: [Numpy-discussion] Subclasses - use of __finalize__ Message-ID: An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sun Dec 17 00:59:08 2006 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 16 Dec 2006 22:59:08 -0700 Subject: [Numpy-discussion] Automatic matrices In-Reply-To: <458456A2.6010900@gmx.net> References: <1166189241.17633.27.camel@localhost.localdomain> <458323C5.7010808@gmx.net> <1166228497.17633.61.camel@localhost.localdomain> <458456A2.6010900@gmx.net> Message-ID: Testing, please disregard.... -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sun Dec 17 15:25:20 2006 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 17 Dec 2006 13:25:20 -0700 Subject: [Numpy-discussion] Anyone have a "little" shooting-method function to share In-Reply-To: <1163094437.6376.6.camel@localhost.localdomain> References: <45527171.8030903@noaa.gov> <1163094437.6376.6.camel@localhost.localdomain> Message-ID: On 11/9/06, Pauli Virtanen wrote: > > ke, 2006-11-08 kello 16:08 -0800, David L Goldsmith kirjoitti: > > Hi! I tried to send this earlier: it made it into my sent mail folder, > > but does not appear to have made it to the list. > > > > I need to numerically solve: > > (1-t)x" + x' - x = f(t), x(0) = x0, x(1) = x1 > > I've been trying to use (because it's the approach I inherited) an > > elementary finite-difference discretization, but unit tests have shown > > that that approach isn't working. I thought I would try a Chebyshev method on this problem. My solution with order ten (degree 9) Chebyshev polynomials goes as follows. In [111]: import chebyshev as c In [112]: t = c.modified_points(10,0,1) # use 10 sample points In [113]: D = c.modified_derivative(10,0,1) # derivative operator In [114]: op = (1.0 - t)[:,newaxis]*dot(D,D) + D - eye(10) # differential equation In [115]: op[0] = 0 # set up boundary condition y(0) = y0 In [116]: op[0,0] = 1 In [117]: op[9] = 0 # set up boundary condition y(1) = y1 In [118]: op[9,9] = 1 In [119]: opinv = alg.inv(op) # invert the operator In [120]: f = exp(t) # try f(t) = exp(t) In [121]: f[0] = 2 # y0 = 2 In [122]: f[9] = 1 # y1 = 1 In [123]: soln = dot(opinv,f) # solve equation In [124]: plot(t,soln) Out[124]: [] The plot is rather rough with only 10 points. Replot with more. In [125]: tsmp = linspace(0,1) In [126]: interp = c.modified_values(tsmp, 10, 0, 0, 1) In [127]: plot(tsmp, dot(interp, soln)) Out[127]: [] Looks OK here. 
You can save opinv as it doesn't change with f. Likewise, if you always
want to interpolate the result, then save dot(interp, opinv). I've attached
a plot of the solution I got along with the chebyshev module I use.

Chuck
-------------- next part --------------
A non-text attachment was scrubbed...
Name: solution.jpg.zip
Type: application/zip
Size: 16196 bytes
Desc: not available
-------------- next part --------------
A non-text attachment was scrubbed...
Name: chebyshev.py.zip
Type: application/zip
Size: 3997 bytes
Desc: not available

From zhangyunfeng at gmail.com Sun Dec 17 21:26:41 2006
From: zhangyunfeng at gmail.com (zhang yunfeng)
Date: Mon, 18 Dec 2006 10:26:41 +0800
Subject: [Numpy-discussion] sum of two arrays with different shape?
Message-ID: <4ff46d8f0612171826q262231f0kc3d5f2e8070046f9@mail.gmail.com>

Hi, I'm a newbie to Numpy.

When reading the tutorials at http://www.scipy.org/Tentative_NumPy_Tutorial,
I found a snippet about the addition of two arrays with different shapes.
Does it make sense? If the array shapes are not the same, why doesn't it
throw an error? See the code below (taken from the above webpage):
a.shape is (4,) and y.shape is (3,4), and yet a+y works?

-------------------------------------------
>>> y = arange(12)
>>> y
array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11])
>>> y.shape = 3,4 # does not modify the total number of elements
>>> y
array([[ 0, 1, 2, 3],
       [ 4, 5, 6, 7],
       [ 8, 9, 10, 11]])

It is possible to operate with arrays of different dimensions as long as
they fit well.

>>> 3*a # multiply each element of a by 3
array([ 30, 60, 90, 120])
>>> a+y # sum a to each row of y
array([[10, 21, 32, 43],
       [14, 25, 36, 47],
       [18, 29, 40, 51]])
--------------------------------------------

--
http://my.opera.com/zhangyunfeng

From robert.kern at gmail.com Sun Dec 17 21:39:30 2006
From: robert.kern at gmail.com (Robert Kern)
Date: Sun, 17 Dec 2006 20:39:30 -0600
Subject: [Numpy-discussion] sum of two arrays with different shape?
In-Reply-To: <4ff46d8f0612171826q262231f0kc3d5f2e8070046f9@mail.gmail.com>
References: <4ff46d8f0612171826q262231f0kc3d5f2e8070046f9@mail.gmail.com>
Message-ID: <4585FF62.10701@gmail.com>

zhang yunfeng wrote:
> Hi, I'm a newbie to Numpy.
>
> When reading the tutorials at
> http://www.scipy.org/Tentative_NumPy_Tutorial
> I found a snippet about the addition of two arrays with different shapes.
> Does it make sense? If the array shapes are not the same, why doesn't it
> throw an error?

When two arrays of different shapes are operated against each other, numpy
tries to "broadcast" them to a compatible shape according to certain rules.
This is a fairly powerful concept, and it provides quite a lot of
convenience. The following wiki page has an explanation of the broadcasting
rules:

http://www.scipy.org/EricsBroadcastingDoc

It still refers to Numeric, numpy's predecessor, but the concepts still
apply (change "Numeric" to "numpy" and "NewAxis" to "newaxis", and I
believe all of the code examples will be correct).
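For example, a quick sketch of both the compatible and the incompatible
case:

>>> import numpy
>>> a = numpy.array([10, 20, 30, 40])     # shape (4,)
>>> y = numpy.arange(12).reshape(3, 4)    # shape (3,4)
>>> a + y                                 # a is broadcast across each row
array([[10, 21, 32, 43],
       [14, 25, 36, 47],
       [18, 29, 40, 51]])
>>> a + numpy.arange(3)                   # (4,) and (3,) don't fit: error
Traceback (most recent call last):
  ...
ValueError: shape mismatch: objects cannot be broadcast to a single shape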
--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
-- Umberto Eco

From david at ar.media.kyoto-u.ac.jp Mon Dec 18 02:17:08 2006
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Mon, 18 Dec 2006 16:17:08 +0900
Subject: [Numpy-discussion] slow numpy.clip ?
Message-ID: <45864074.6090203@ar.media.kyoto-u.ac.jp>

Hi,

When trying to speed up some matplotlib routines with the matplotlib
dev team, I noticed that numpy.clip is pretty slow: clip(data, m, M) is
slower than a direct numpy implementation (that is, data[data<m] = m;
data[data>M] = M; return data.copy()). My understanding is that the code
does the same thing, right ?

Below, a small script which shows the difference (twice slower for a
8000x256 array on my workstation):

import numpy as N

#==========================
# To benchmark imshow alone
#==========================
def generate_data_2d(fr, nwin, hop, len):
    nframes = 1.0 * fr / hop * len
    return N.random.randn(nframes, nwin)

def bench_clip():
    m = -1.
    M = 1.
    # 2 minutes (120 sec) of sounds @ 8 kHz with 256 samples with 50 % overlap
    data = generate_data_2d(8000, 256, 128, 120)

    def clip1_bench(data, niter):
        for i in range(niter):
            blop = N.clip(data, m, M)
    def clip2_bench(data, niter):
        for i in range(niter):
            data[data<m] = m
            data[data>M] = M
            blop = data.copy()

    clip1_bench(data, 10)
    clip2_bench(data, 10)

if __name__ == '__main__':
    # test clip
    import hotshot, hotshot.stats
    profile_file = 'clip.prof'
    prof = hotshot.Profile(profile_file, lineevents=1)
    prof.runcall(bench_clip)
    p = hotshot.stats.load(profile_file)
    print p.sort_stats('cumulative').print_stats(20)
    prof.close()

cheers,

David

From Mark.Hoffmann at dk.manbw.com Mon Dec 18 02:30:20 2006
From: Mark.Hoffmann at dk.manbw.com (Mark Hoffmann)
Date: Mon, 18 Dec 2006 08:30:20 +0100
Subject: [Numpy-discussion] Unexpected output using numpy.ndarray and __radd__
Message-ID: <1A0F0517C2D6894282F07FAF76FA3612020AD01F@CPH-EXCH-SG4.manbw.dk>

Hi,

The following issue has puzzled me for a while. I want to add a
numpy.ndarray and an instance of my own class. I define this operation
by implementing the methods __add__ and __radd__. My programme
(including output) looks like:

#!/usr/local/bin/python

import numpy

class Cyclehist:
    def __init__(self,vals):
        self.valuearray = numpy.array(vals)

    def __str__(self):
        return 'Cyclehist object: valuearray = '+str(self.valuearray)

    def __add__(self,other):
        print "__add__ : ",self,other
        return self.valuearray + other

    def __radd__(self,other):
        print "__radd__ : ",self,other
        return other + self.valuearray

c = Cyclehist([1.0,-21.2,3.2])
a = numpy.array([-1.0,2.2,-2.2])
print c + a
print a + c

# ---------- OUTPUT ----------
#
# addprob $ addprob.py
# __add__ : Cyclehist object: valuearray = [ 1. -21.2 3.2] [-1. 2.2 -2.2]
# [ 0. -19. 1.]
# __radd__ : Cyclehist object: valuearray = [ 1. -21.2 3.2] -1.0
# __radd__ : Cyclehist object: valuearray = [ 1. -21.2 3.2] 2.2
# __radd__ : Cyclehist object: valuearray = [ 1. -21.2 3.2] -2.2
# [[ 0. -22.2 2.2] [ 3.2 -19. 5.4] [ -1.2 -23.4 1. ]]
# addprob $
#
# ----------------------------

I expected the output of "c+a" and "a+c" to be identical, however, the
output of "a+c" gets nested in an elementwise fashion. Can anybody
explain this? Is it a bug or a feature? I'm using Python 2.4.4c1 and
numpy 1.0. I tried the programme using an older version of Python and
numpy and there the results of "c+a" and "a+c" were identical.

Regards,

Mark Hoffmann

From efiring at hawaii.edu Mon Dec 18 03:24:25 2006
From: efiring at hawaii.edu (Eric Firing)
Date: Sun, 17 Dec 2006 22:24:25 -1000
Subject: [Numpy-discussion] slow numpy.clip ?
In-Reply-To: <45864074.6090203@ar.media.kyoto-u.ac.jp>
References: <45864074.6090203@ar.media.kyoto-u.ac.jp>
Message-ID: <45865039.3070500@hawaii.edu>

David Cournapeau wrote:
> Hi,
>
> When trying to speed up some matplotlib routines with the matplotlib
> dev team, I noticed that numpy.clip is pretty slow: clip(data, m, M) is
> slower than a direct numpy implementation (that is, data[data<m] = m;
> data[data>M] = M; return data.copy()). My understanding is that the code
> does the same thing, right ?
>
> Below, a small script which shows the difference (twice slower for a
> 8000x256 array on my workstation):

I think there was a bug in your clip2_bench that was making it
artificially fast.
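To see why: the second loop modifies data in place, so after the first
iteration nothing is left below m or above M, and the fancy-indexing
assignments in the remaining nine passes have almost no work to do. A
quick way to convince yourself (a sketch; the exact counts will vary from
run to run):

>>> import numpy as N
>>> data = N.random.randn(1000)
>>> (data < -1.).sum(), (data > 1.).sum()   # plenty of values to clip
(158, 166)
>>> data[data < -1.] = -1.
>>> data[data > 1.] = 1.
>>> (data < -1.).sum(), (data > 1.).sum()   # nothing left after one pass
(0, 0)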
Attached is a script that I think gives a fairer comparison, in which
clip1 and clip2 are nearly identical, and includes a third version using
putmask which is faster than either of the others:

15 function calls in 6.450 CPU seconds

Ordered by: cumulative time

ncalls tottime percall cumtime percall filename:lineno(function)
1 0.004 0.004 6.450 6.450 cliptest.py:10(bench_clip)
1 2.302 2.302 2.302 2.302 cliptest.py:19(clip2_bench)
1 0.013 0.013 2.280 2.280 cliptest.py:15(clip1_bench)
10 2.267 0.227 2.267 0.227 /usr/local/lib/python2.4/site-packages/numpy/core/fromnumeric.py:357(clip)
1 1.498 1.498 1.498 1.498 cliptest.py:25(clip3_bench)
1 0.366 0.366 0.366 0.366 cliptest.py:6(generate_data_2d)
0 0.000 0.000 profile:0(profiler)

Eric

> import numpy as N
>
> #==========================
> # To benchmark imshow alone
> #==========================
> def generate_data_2d(fr, nwin, hop, len):
>     nframes = 1.0 * fr / hop * len
>     return N.random.randn(nframes, nwin)
>
> def bench_clip():
>     m = -1.
>     M = 1.
>     # 2 minutes (120 sec) of sounds @ 8 kHz with 256 samples with 50 % overlap
>     data = generate_data_2d(8000, 256, 128, 120)
>
>     def clip1_bench(data, niter):
>         for i in range(niter):
>             blop = N.clip(data, m, M)
>     def clip2_bench(data, niter):
>         for i in range(niter):
>             data[data<m] = m
>             data[data>M] = M
>             blop = data.copy()
>
>     clip1_bench(data, 10)
>     clip2_bench(data, 10)
>
> if __name__ == '__main__':
>     # test clip
>     import hotshot, hotshot.stats
>     profile_file = 'clip.prof'
>     prof = hotshot.Profile(profile_file, lineevents=1)
>     prof.runcall(bench_clip)
>     p = hotshot.stats.load(profile_file)
>     print p.sort_stats('cumulative').print_stats(20)
>     prof.close()
>
> cheers,
>
> David
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
A non-text attachment was scrubbed...
Name: cliptest.py
Type: text/x-python
Size: 1123 bytes
Desc: not available

From stefan at sun.ac.za Mon Dec 18 03:27:56 2006
From: stefan at sun.ac.za (Stefan van der Walt)
Date: Mon, 18 Dec 2006 10:27:56 +0200
Subject: [Numpy-discussion] slow numpy.clip ?
In-Reply-To: <45864074.6090203@ar.media.kyoto-u.ac.jp>
References: <45864074.6090203@ar.media.kyoto-u.ac.jp>
Message-ID: <20061218082756.GV2180@mentat.za.net>

Hi David

The benchmark below isn't quite correct. In clip2_bench the data is
effectively only clipped once. I attach a slightly modified version,
for which the benchmark results look like this:

4 function calls in 4.631 CPU seconds

Ordered by: cumulative time

ncalls tottime percall cumtime percall filename:lineno(function)
1 0.003 0.003 4.631 4.631 clipb.py:10(bench_clip)
1 2.149 2.149 2.149 2.149 clipb.py:16(clip1_bench)
1 2.070 2.070 2.070 2.070 clipb.py:19(clip2_bench)
1 0.409 0.409 0.409 0.409 clipb.py:6(generate_data_2d)
0 0.000 0.000 profile:0(profiler)

The remaining difference is probably a cache effect.
If I change the order, so that clip1_bench is executed last, I see:

4 function calls in 5.250 CPU seconds

Ordered by: cumulative time

ncalls tottime percall cumtime percall filename:lineno(function)
1 0.003 0.003 5.250 5.250 clipb.py:10(bench_clip)
1 2.588 2.588 2.588 2.588 clipb.py:19(clip2_bench)
1 2.148 2.148 2.148 2.148 clipb.py:16(clip1_bench)
1 0.512 0.512 0.512 0.512 clipb.py:6(generate_data_2d)
0 0.000 0.000 profile:0(profiler)

Regards
Stéfan

On Mon, Dec 18, 2006 at 04:17:08PM +0900, David Cournapeau wrote:
> Hi,
>
> When trying to speed up some matplotlib routines with the matplotlib
> dev team, I noticed that numpy.clip is pretty slow: clip(data, m, M) is
> slower than a direct numpy implementation (that is, data[data<m] = m;
> data[data>M] = M; return data.copy()). My understanding is that the code
> does the same thing, right ?
>
> Below, a small script which shows the difference (twice slower for a
> 8000x256 array on my workstation):
> [...]
-------------- next part --------------
A non-text attachment was scrubbed...
Name: clipb.py
Type: text/x-python
Size: 1029 bytes
Desc: not available

From david at ar.media.kyoto-u.ac.jp Mon Dec 18 03:45:09 2006
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Mon, 18 Dec 2006 17:45:09 +0900
Subject: [Numpy-discussion] slow numpy.clip ?
In-Reply-To: <20061218082756.GV2180@mentat.za.net>
References: <45864074.6090203@ar.media.kyoto-u.ac.jp> <20061218082756.GV2180@mentat.za.net>
Message-ID: <45865515.1000208@ar.media.kyoto-u.ac.jp>

Stefan van der Walt wrote:
> Hi David
>
> The benchmark below isn't quite correct. In clip2_bench the data is
> effectively only clipped once. I attach a slightly modified version,
> for which the benchmark results look like this:

Yes, I of course mistyped the < and the copy.
But the function is still > moderately faster on my workstation: > > ncalls tottime percall cumtime percall filename:lineno(function) > 1 0.003 0.003 3.944 3.944 slowclip.py:10(bench_clip) > 1 0.011 0.011 2.001 2.001 slowclip.py:16(clip1_bench) > 10 1.990 0.199 1.990 0.199 > /home/david/local/lib/python2.4/site-packages/numpy/core/fromnumeric.py:372(clip) > 1 1.682 1.682 1.682 1.682 slowclip.py:19(clip2_bench) > 1 0.258 0.258 0.258 0.258 > slowclip.py:6(generate_data_2d) > 0 0.000 0.000 profile:0(profiler) Did you try swapping the order of execution (i.e. clip1 second)? Cheers St?fan From stefan at sun.ac.za Mon Dec 18 04:35:57 2006 From: stefan at sun.ac.za (Stefan van der Walt) Date: Mon, 18 Dec 2006 11:35:57 +0200 Subject: [Numpy-discussion] Unexpected output using numpy.ndarray and __radd__ In-Reply-To: <1A0F0517C2D6894282F07FAF76FA3612020AD01F@CPH-EXCH-SG4.manbw.dk> References: <1A0F0517C2D6894282F07FAF76FA3612020AD01F@CPH-EXCH-SG4.manbw.dk> Message-ID: <20061218093557.GX2180@mentat.za.net> Hi Mark On Mon, Dec 18, 2006 at 08:30:20AM +0100, Mark Hoffmann wrote: > The following issue has puzzled me for a while. I want to add a numpy.ndarray > and an instance of my own class. I define this operation by implementing the > methods __add__ and __radd__. My programme (including output) looks like: > > #!/usr/local/bin/python > > import numpy > > class Cyclehist: > def __init__(self,vals): > self.valuearray = numpy.array(vals) > > def __str__(self): > return 'Cyclehist object: valuearray = '+str(self.valuearray) > > def __add__(self,other): > print "__add__ : ",self,other > return self.valuearray + other > > def __radd__(self,other): > print "__radd__ : ",self,other > return other + self.valuearray > > c = Cyclehist([1.0,-21.2,3.2]) > a = numpy.array([-1.0,2.2,-2.2]) > print c + a > print a + c In the first instance, c.__add__(a) is called, which works fine. In the second, a.__add__(c) is executed, which is your problem, since you rather want c.__radd__(a) to be executed. A documentation snippets: """For instance, to evaluate the expression x-y, where y is an instance of a class that has an __rsub__() method, y.__rsub__(x) is called if x.__sub__(y) returns NotImplemented. Note: If the right operand's type is a subclass of the left operand's type and that subclass provides the reflected method for the operation, this method will be called before the left operand's non-reflected method. This behavior allows subclasses to override their ancestors' operations.""" Since a.__add__ does not return NotImplemented, c.__radd__ is not called where you expect it to be. I am not sure why broadcasting takes place here, maybe someone else on the list can elaborate. To solve your problem, you may want to look into subclassing ndarrays, as described at http://www.scipy.org/Subclasses. Cheers St?fan From david at ar.media.kyoto-u.ac.jp Mon Dec 18 05:09:52 2006 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Mon, 18 Dec 2006 19:09:52 +0900 Subject: [Numpy-discussion] slow numpy.clip ? In-Reply-To: <20061218091710.GW2180@mentat.za.net> References: <45864074.6090203@ar.media.kyoto-u.ac.jp> <20061218082756.GV2180@mentat.za.net> <45865515.1000208@ar.media.kyoto-u.ac.jp> <20061218091710.GW2180@mentat.za.net> Message-ID: <458668F0.90604@ar.media.kyoto-u.ac.jp> Stefan van der Walt wrote: > On Mon, Dec 18, 2006 at 05:45:09PM +0900, David Cournapeau wrote: >> Yes, I of course mistyped the < and the copy. 
But the function is still >> moderately faster on my workstation: >> >> ncalls tottime percall cumtime percall filename:lineno(function) >> 1 0.003 0.003 3.944 3.944 slowclip.py:10(bench_clip) >> 1 0.011 0.011 2.001 2.001 slowclip.py:16(clip1_bench) >> 10 1.990 0.199 1.990 0.199 >> /home/david/local/lib/python2.4/site-packages/numpy/core/fromnumeric.py:372(clip) >> 1 1.682 1.682 1.682 1.682 slowclip.py:19(clip2_bench) >> 1 0.258 0.258 0.258 0.258 >> slowclip.py:6(generate_data_2d) >> 0 0.000 0.000 profile:0(profiler) > > Did you try swapping the order of execution (i.e. clip1 second)? Yes, I tried different orders, etc... and it showed the same pattern. The thing is, this kind of thing is highly CPU dependent in my experience; I don't have the time right now to update numpy.scipy on my laptop, but it happens that profiles results are quite different between my workstation (P4 xeon) and my laptop (pentium m). anyway, contrary to what I thought first, the real problem is the copy, so this is where I should investigate in matplotlib case, David From hermann.ruckerbauer at qimonda.com Mon Dec 18 05:44:35 2006 From: hermann.ruckerbauer at qimonda.com (hermann.ruckerbauer at qimonda.com) Date: Mon, 18 Dec 2006 11:44:35 +0100 Subject: [Numpy-discussion] Python 2.5 on win and Gnuplot and alter_code1 Message-ID: <00314E7F9E92594E84D353E2F112FE4F029C3831@mucse306.eu.infineon.com> Hello *, Sorry for this maybe stupid question, I might still miss some understanding of pyton or other stuff ... I have problems getting Gnuplot running under python 2.5 on Windows. My understanding is, that Gnuplot requires Numeric. Unfortionally there is only a numeric for python 2.4 available on the net. This binary does not install under python 2.5. Now I checked numpy, and it seems to able to act as replacement for the numeric package. I'm not really sure if this is the old.numeric module (just replace each numeric with numpy.old.numeric) or if this requires some more changes on the source files (e. g. exchang numeric with numpy and change some of the function calls). So I tried to convert my Gnuplot .py files with the alter_code1. But somehow I'm too stupid to get it running . First absolutel nothing happende when doing e.g. import numpy.oldnumeric.alter_code1 as noa noa.converttree() No error message, but also no file was touched, altough the command took a while to be executed. As I had no idea how to give the top level I tried a little bit around and copied the alter_code1.py into the gnuplot directory. And this file was touched ... It got the comment: At the beginning of alter_code1.py, but non of the other files have been touched. After some copy and paste trials i added on all .py files in the Gnuplot folder the comment from the beginning of the alter_code1.py. And now magically all files in this folders have been touched by running the converttree. BUT: only the comment "## Automatically adapted for numpy.oldnumeric Dec 15, 2006 by " Was added at the beginning of each file, none of the commands have been touched ... Does anybody have some hints for me to get gnuplot running under python 2.5 for windows ? 
Thanks and regards Hermann From faltet at carabos.com Mon Dec 18 07:15:35 2006 From: faltet at carabos.com (Francesc Altet) Date: Mon, 18 Dec 2006 13:15:35 +0100 Subject: [Numpy-discussion] Python 2.5 on win and Gnuplot and alter_code1 In-Reply-To: <00314E7F9E92594E84D353E2F112FE4F029C3831@mucse306.eu.infineon.com> References: <00314E7F9E92594E84D353E2F112FE4F029C3831@mucse306.eu.infineon.com> Message-ID: <1166444135.2680.7.camel@localhost.localdomain> El dl 18 de 12 del 2006 a les 11:44 +0100, en/na hermann.ruckerbauer at qimonda.com va escriure: > Hello *, > > Sorry for this maybe stupid question, I might still miss some > understanding of pyton or other stuff ... > > I have problems getting Gnuplot running under python 2.5 on Windows. Do you have a special need to use Gnuplot? As it requires Numeric and because it has issues with python2.5 and is not maintained anymore, I don't recommend you using it any longer. Please, have a look at matplotlib (http://matplotlib.sourceforge.net/) and try to compile it against numpy (http://numpy.scipy.org/). Note that both of these are currently maintained and work great with python 2.5. Hope that helps, -- Francesc Altet | Be careful about using the following code -- Carabos Coop. V. | I've only proven that it works, www.carabos.com | I haven't tested it. -- Donald Knuth From hermann.ruckerbauer at qimonda.com Mon Dec 18 07:18:45 2006 From: hermann.ruckerbauer at qimonda.com (hermann.ruckerbauer at qimonda.com) Date: Mon, 18 Dec 2006 13:18:45 +0100 Subject: [Numpy-discussion] Python 2.5 on win and Gnuplot and alter_code1 In-Reply-To: <1166444135.2680.7.camel@localhost.localdomain> Message-ID: <00314E7F9E92594E84D353E2F112FE4F029C394A@mucse306.eu.infineon.com> Thanks for the feedback, The reason why I'm trying to use Gnuplot is, that I got quite some old scripts with Gnuplot that i would like to reuse. Everything I create now is matplotlib based ... This is working fine in my config. Regards Hermann Hermann Ruckerbauer Module Design Engineer QAG PD ICD EDS Qimonda AG Phone: +49 89 234 60088 2021 Fax: +49 89 234 60088 44 2021 e-mail: Hermann.Ruckerbauer at qimonda.com Postal Address: Qimonda AG, P.O. Box 800949, D-81739 Munich Visitor Address: Am Campeon 1 - 12, D-85579 Neubiberg, Room 10.02.358 *** visit our homepage at http://www.qimonda.com *** -----Original Message----- From: numpy-discussion-bounces at scipy.org [mailto:numpy-discussion-bounces at scipy.org] On Behalf Of Francesc Altet Sent: Monday, December 18, 2006 1:16 PM To: Discussion of Numerical Python Subject: Re: [Numpy-discussion] Python 2.5 on win and Gnuplot and alter_code1 El dl 18 de 12 del 2006 a les 11:44 +0100, en/na hermann.ruckerbauer at qimonda.com va escriure: > Hello *, > > Sorry for this maybe stupid question, I might still miss some > understanding of pyton or other stuff ... > > I have problems getting Gnuplot running under python 2.5 on Windows. Do you have a special need to use Gnuplot? As it requires Numeric and because it has issues with python2.5 and is not maintained anymore, I don't recommend you using it any longer. Please, have a look at matplotlib (http://matplotlib.sourceforge.net/) and try to compile it against numpy (http://numpy.scipy.org/). Note that both of these are currently maintained and work great with python 2.5. Hope that helps, -- Francesc Altet | Be careful about using the following code -- Carabos Coop. V. | I've only proven that it works, www.carabos.com | I haven't tested it. 
-- Donald Knuth _______________________________________________ Numpy-discussion mailing list Numpy-discussion at scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion From Mark.Hoffmann at dk.manbw.com Mon Dec 18 07:29:57 2006 From: Mark.Hoffmann at dk.manbw.com (Mark Hoffmann) Date: Mon, 18 Dec 2006 13:29:57 +0100 Subject: [Numpy-discussion] Unexpected output using numpy.ndarray and__radd__ Message-ID: <1A0F0517C2D6894282F07FAF76FA3612020AD023@CPH-EXCH-SG4.manbw.dk> I appreciate the answer and the solution suggestion. I see that it is possible to make a work around by subclassing from ndarray. Still, in the "print a+c" statement, I don't understand why a.__add__(c) doesn't return NotImplemented (because ndarray shouldn't recognize the Cyclehist class) and directly call c.__radd__(a) implemented in my Cyclehist class. I tried the exactly same programme using Python 2.4.1 and Scipy 0.3.2 (based on numeric/numarray) and the result of the "print a+c" didn't get nested as I expect. Regards, Mark -----Original Message----- From: numpy-discussion-bounces at scipy.org [mailto:numpy-discussion-bounces at scipy.org] On Behalf Of Stefan van der Walt Sent: 18. december 2006 10:36 To: Discussion of Numerical Python Subject: Re: [Numpy-discussion] Unexpected output using numpy.ndarray and__radd__ Hi Mark On Mon, Dec 18, 2006 at 08:30:20AM +0100, Mark Hoffmann wrote: > The following issue has puzzled me for a while. I want to add a > numpy.ndarray and an instance of my own class. I define this operation > by implementing the methods __add__ and __radd__. My programme (including output) looks like: > > #!/usr/local/bin/python > > import numpy > > class Cyclehist: > def __init__(self,vals): > self.valuearray = numpy.array(vals) > > def __str__(self): > return 'Cyclehist object: valuearray = '+str(self.valuearray) > > def __add__(self,other): > print "__add__ : ",self,other > return self.valuearray + other > > def __radd__(self,other): > print "__radd__ : ",self,other > return other + self.valuearray > > c = Cyclehist([1.0,-21.2,3.2]) > a = numpy.array([-1.0,2.2,-2.2]) > print c + a > print a + c In the first instance, c.__add__(a) is called, which works fine. In the second, a.__add__(c) is executed, which is your problem, since you rather want c.__radd__(a) to be executed. A documentation snippets: """For instance, to evaluate the expression x-y, where y is an instance of a class that has an __rsub__() method, y.__rsub__(x) is called if x.__sub__(y) returns NotImplemented. Note: If the right operand's type is a subclass of the left operand's type and that subclass provides the reflected method for the operation, this method will be called before the left operand's non-reflected method. This behavior allows subclasses to override their ancestors' operations.""" Since a.__add__ does not return NotImplemented, c.__radd__ is not called where you expect it to be. I am not sure why broadcasting takes place here, maybe someone else on the list can elaborate. To solve your problem, you may want to look into subclassing ndarrays, as described at http://www.scipy.org/Subclasses. 
Cheers St?fan _______________________________________________ Numpy-discussion mailing list Numpy-discussion at scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion From tim.hochberg at ieee.org Mon Dec 18 08:05:34 2006 From: tim.hochberg at ieee.org (Tim Hochberg) Date: Mon, 18 Dec 2006 06:05:34 -0700 Subject: [Numpy-discussion] Unexpected output using numpy.ndarray and__radd__ In-Reply-To: <1A0F0517C2D6894282F07FAF76FA3612020AD023@CPH-EXCH-SG4.manbw.dk> References: <1A0F0517C2D6894282F07FAF76FA3612020AD023@CPH-EXCH-SG4.manbw.dk> Message-ID: <4586921E.1030802@ieee.org> Mark Hoffmann wrote: > I appreciate the answer and the solution suggestion. I see that it is possible to make a work around by subclassing from ndarray. Still, in the "print a+c" statement, I don't understand why a.__add__(c) doesn't return NotImplemented (because ndarray shouldn't recognize the Cyclehist class) and directly call c.__radd__(a) implemented in my Cyclehist class. I tried the exactly same programme using Python 2.4.1 and Scipy 0.3.2 (based on numeric/numarray) and the result of the "print a+c" didn't get nested as I expect. > > Regards, > Mark > I'm not sure what this is doing -- it looks kind of bizzare -- however, you can fix this case without resorting to subclassing to ndarray. Just toss an '__array_priority__ = 10' up at the top of the class definition and it will use your __methods__ in preference to the ndarrays. I don't have time to look into this further right now unfortunately. -tim > -----Original Message----- > From: numpy-discussion-bounces at scipy.org [mailto:numpy-discussion-bounces at scipy.org] On Behalf Of Stefan van der Walt > Sent: 18. december 2006 10:36 > To: Discussion of Numerical Python > Subject: Re: [Numpy-discussion] Unexpected output using numpy.ndarray and__radd__ > > Hi Mark > > On Mon, Dec 18, 2006 at 08:30:20AM +0100, Mark Hoffmann wrote: > >> The following issue has puzzled me for a while. I want to add a >> numpy.ndarray and an instance of my own class. I define this operation >> by implementing the methods __add__ and __radd__. My programme (including output) looks like: >> >> #!/usr/local/bin/python >> >> import numpy >> >> class Cyclehist: >> def __init__(self,vals): >> self.valuearray = numpy.array(vals) >> >> def __str__(self): >> return 'Cyclehist object: valuearray = '+str(self.valuearray) >> >> def __add__(self,other): >> print "__add__ : ",self,other >> return self.valuearray + other >> >> def __radd__(self,other): >> print "__radd__ : ",self,other >> return other + self.valuearray >> >> c = Cyclehist([1.0,-21.2,3.2]) >> a = numpy.array([-1.0,2.2,-2.2]) >> print c + a >> print a + c >> > > In the first instance, c.__add__(a) is called, which works fine. In the second, a.__add__(c) is executed, which is your problem, since you rather want c.__radd__(a) to be executed. A documentation snippets: > > """For instance, to evaluate the expression x-y, where y is an instance of a class that has an __rsub__() method, y.__rsub__(x) is called if x.__sub__(y) returns NotImplemented. > > Note: If the right operand's type is a subclass of the left operand's type and that subclass provides the reflected method for the operation, this method will be called before the left operand's non-reflected method. This behavior allows subclasses to override their ancestors' operations.""" > > Since a.__add__ does not return NotImplemented, c.__radd__ is not called where you expect it to be. 
I am not sure why broadcasting takes place here, maybe someone else on the list can elaborate. > > To solve your problem, you may want to look into subclassing ndarrays, as described at http://www.scipy.org/Subclasses. > > Cheers > St?fan > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > > From Mark.Hoffmann at dk.manbw.com Mon Dec 18 08:20:57 2006 From: Mark.Hoffmann at dk.manbw.com (Mark Hoffmann) Date: Mon, 18 Dec 2006 14:20:57 +0100 Subject: [Numpy-discussion] Unexpected output usingnumpy.ndarray and__radd__ Message-ID: <1A0F0517C2D6894282F07FAF76FA3612020AD024@CPH-EXCH-SG4.manbw.dk> Excellent, thank you - it solved the problem! /Mark -----Original Message----- From: numpy-discussion-bounces at scipy.org [mailto:numpy-discussion-bounces at scipy.org] On Behalf Of Tim Hochberg Sent: 18. december 2006 14:06 To: Discussion of Numerical Python Subject: Re: [Numpy-discussion] Unexpected output usingnumpy.ndarray and__radd__ Mark Hoffmann wrote: > I appreciate the answer and the solution suggestion. I see that it is possible to make a work around by subclassing from ndarray. Still, in the "print a+c" statement, I don't understand why a.__add__(c) doesn't return NotImplemented (because ndarray shouldn't recognize the Cyclehist class) and directly call c.__radd__(a) implemented in my Cyclehist class. I tried the exactly same programme using Python 2.4.1 and Scipy 0.3.2 (based on numeric/numarray) and the result of the "print a+c" didn't get nested as I expect. > > Regards, > Mark > I'm not sure what this is doing -- it looks kind of bizzare -- however, you can fix this case without resorting to subclassing to ndarray. Just toss an '__array_priority__ = 10' up at the top of the class definition and it will use your __methods__ in preference to the ndarrays. I don't have time to look into this further right now unfortunately. -tim > -----Original Message----- > From: numpy-discussion-bounces at scipy.org > [mailto:numpy-discussion-bounces at scipy.org] On Behalf Of Stefan van > der Walt > Sent: 18. december 2006 10:36 > To: Discussion of Numerical Python > Subject: Re: [Numpy-discussion] Unexpected output using numpy.ndarray > and__radd__ > > Hi Mark > > On Mon, Dec 18, 2006 at 08:30:20AM +0100, Mark Hoffmann wrote: > >> The following issue has puzzled me for a while. I want to add a >> numpy.ndarray and an instance of my own class. I define this >> operation by implementing the methods __add__ and __radd__. My programme (including output) looks like: >> >> #!/usr/local/bin/python >> >> import numpy >> >> class Cyclehist: >> def __init__(self,vals): >> self.valuearray = numpy.array(vals) >> >> def __str__(self): >> return 'Cyclehist object: valuearray = '+str(self.valuearray) >> >> def __add__(self,other): >> print "__add__ : ",self,other >> return self.valuearray + other >> >> def __radd__(self,other): >> print "__radd__ : ",self,other >> return other + self.valuearray >> >> c = Cyclehist([1.0,-21.2,3.2]) >> a = numpy.array([-1.0,2.2,-2.2]) >> print c + a >> print a + c >> > > In the first instance, c.__add__(a) is called, which works fine. In the second, a.__add__(c) is executed, which is your problem, since you rather want c.__radd__(a) to be executed. 
A documentation snippets: > > """For instance, to evaluate the expression x-y, where y is an instance of a class that has an __rsub__() method, y.__rsub__(x) is called if x.__sub__(y) returns NotImplemented. > > Note: If the right operand's type is a subclass of the left operand's type and that subclass provides the reflected method for the operation, this method will be called before the left operand's non-reflected method. This behavior allows subclasses to override their ancestors' operations.""" > > Since a.__add__ does not return NotImplemented, c.__radd__ is not called where you expect it to be. I am not sure why broadcasting takes place here, maybe someone else on the list can elaborate. > > To solve your problem, you may want to look into subclassing ndarrays, as described at http://www.scipy.org/Subclasses. > > Cheers > St?fan > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ Numpy-discussion mailing list Numpy-discussion at scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion From aisaac at american.edu Mon Dec 18 10:03:03 2006 From: aisaac at american.edu (Alan G Isaac) Date: Mon, 18 Dec 2006 10:03:03 -0500 Subject: [Numpy-discussion] =?utf-8?q?Python_2=2E5_on_win_and_Gnuplot_and?= =?utf-8?q?=09alter_code1?= In-Reply-To: <1166444135.2680.7.camel@localhost.localdomain> References: <00314E7F9E92594E84D353E2F112FE4F029C3831@mucse306.eu.infineon.com><1166444135.2680.7.camel@localhost.localdomain> Message-ID: On Mon, 18 Dec 2006, Francesc Altet apparently wrote to Hermann: > Do you have a special need to use Gnuplot? As it requires > Numeric and because it has issues with python2.5 and is > not maintained anymore, I don't recommend you using it any > longer. Have you tried SVN? My recollection is that Gnuplot.py now works with numpy. Ask on the mailing list: there was definitely talk about doing this. Cheers, Alan Isaac From pgmdevlist at gmail.com Mon Dec 18 13:12:50 2006 From: pgmdevlist at gmail.com (Pierre GM) Date: Mon, 18 Dec 2006 13:12:50 -0500 Subject: [Numpy-discussion] Subclasses - use of __finalize__ In-Reply-To: References: Message-ID: <200612181312.50883.pgmdevlist@gmail.com> On Saturday 16 December 2006 19:55, Colin J. Williams wrote: Colin, First of all, a disclaimer: I'm a (bad) hydrologist, not a computer scientist. I learned python/numpy by playing around, and really got into subclassing since 3-4 months ago. My explanations might not be completely accurate, I'll ask more experienced users to correct me if I'm wrong. `__new__` is the class constructor method. A call to `__new__(cls,...)` creates a new instance of the class `cls`, but doesn't initialize the instance, that's the role of the `__init__` method. According to the python documentation, If __new__() returns an instance of cls, then the new instance's __init__() method will be invoked like "__init__(self[, ...])", where self is the new instance and the remaining arguments are the same as were passed to __new__(). If __new__() does not return an instance of cls, then the new instance's __init__() method will not be invoked. 
__new__() is intended mainly to allow subclasses of immutable types (like int, str, or tuple) to customize instance creation. It turns out that ndarrays behaves as immutable types, therefore an `__init__` method is never called. How can we initialize the instance, then ? By calling `__array_finalize__`. `__array_finalize__` is called automatically once an instance is created with `__new__`. Moreover, it is called each time a new array is returned by a method, even if the method doesn't specifically call `__new__`. For example, the `__add__`, `__iadd__`, `reshape` return new arrays, so `__array_finalize` is called. Note that these methods do not create a new array from scratch, so there is no call to `__new__`. As another example, we can also modify the shape of the array with `resize`. However, this method works in place, so a new array is NOT created. About the `obj` argument in `__array_finalize__`: The first time a subarray is created, `__array_finalize__` is called with the argument `obj` as a regular ndarray. Afterwards, when a new array is returned without ccall to `__new__`, the `obj` argument is the initial subarray (the one calling the method). The easier is to try and see what happens. Here's a small script that defines a `InfoArray` class: just a ndarray with a tag attached. That's basically the class of the wiki, with messages printed in `__new__` and `__array_finalize__`. I join some doctest to illustrate some of the concepts, I hope it will be explanatory enough. Please let me know whether it helps. If it does, I'll update the wiki page ############################################## """ Let us define a new InfoArray object >>> x = InfoArray(N.arange(10), info={'name':'x'}) __new__ received __new__ sends as __array_finalize__ received __array_finalize__ defined Let's get the first element: >>> x[0] 0 We expect a scalar, we get a scalar, everything's fine. If now we want all the elements, we can use `x[:]`, which calls `__getslice__` and returns a new array. Therefore, we expect `__array_finalize__` to get called: >>> x[:] __array_finalize__ received __array_finalize__ defined InfoArray([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) Let's add 1 to the array: this operation calls the `__add__` method, which returns a new array from `x` >>> x+1 __array_finalize__ received __array_finalize__ defined InfoArray([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]) Let us change the shape of the array from *(10,)* to *(2,5)* with the `reshape` method. The method returns a new array, so we expect a call to `array_finalize`: >>> y = x.reshape((2,5)) __array_finalize__ received __array_finalize__ defined If now we print y, we call the __repr__ method, which in turns defines as many arrays as rows: we expect 2 calls to `__array_finalize__`: >>> print y __array_finalize__ received __array_finalize__ defined __array_finalize__ received __array_finalize__ defined [[0 1 2 3 4] [5 6 7 8 9]] Let's change the shape of `y` back to *(10,)*, but using the `resize` method this time. `resize` works in place, so a new array isn't be created, and `array_finalize` is not called. >>> y.resize((10,)) >>> y.shape (10,) OK, and what about `transpose` ? Well, it returns a new array (1 call), plus as we print it, we have *rows* calls to `array_finalize`, a total of *rows+1* calls >>> y.resize((5,2)) >>> print y.T __array_finalize__ received __array_finalize__ defined __array_finalize__ received __array_finalize__ defined __array_finalize__ received __array_finalize__ defined [[0 1 2 3 4] [5 6 7 8 9]] Now let's create a new array from scratch. 
`__new__` is called, but as the argument is already an InfoArray, the *__new__ sends...* line is bypassed. Moreover, if we don't precise the type, we call `data.astype` which in turn calls `__array_finalize__`. Then, `__array_finalize__` is called a second time, this time to initialize the new object. >>> z = InfoArray(x) __new__ received __new__ saw another dtype. __array_finalize__ received __array_finalize__ defined __array_finalize__ received __array_finalize__ defined Note that if we precise the dtype, we don't have to call `data.astype`, and `__array_finalize`` gets called once: >>> z = InfoArray(x, dtype=x.dtype) __new__ received __new__ saw the same dtype. __array_finalize__ received __array_finalize__ defined """ import numpy as N class InfoArray(N.ndarray): def __new__(subtype, data, info=None, dtype=None, copy=False): # When data is an InfoArray print "__new__ received %s" % type(data) if isinstance(data, InfoArray): if not copy and dtype==data.dtype: print "__new__ saw the same dtype." return data.view(subtype) else: print "__new__ saw another dtype." return data.astype(dtype).view(subtype) subtype._info = info subtype.info = subtype._info print "__new__ sends %s as %s" % (type(N.asarray(data)), subtype) return N.array(data).view(subtype) def __array_finalize__(self,obj): print "__array_finalize__ received %s" % type(obj) if hasattr(obj, "info"): # The object already has an info tag: just use it self.info = obj.info else: # The object has no info tag: use the default self.info = self._info print "__array_finalize__ defined %s" % type(self) def _test(): import doctest doctest.testmod(verbose=True) if __name__ == "__main__": _test() From efiring at hawaii.edu Mon Dec 18 13:53:40 2006 From: efiring at hawaii.edu (Eric Firing) Date: Mon, 18 Dec 2006 08:53:40 -1000 Subject: [Numpy-discussion] slow numpy.clip ? In-Reply-To: <458668F0.90604@ar.media.kyoto-u.ac.jp> References: <45864074.6090203@ar.media.kyoto-u.ac.jp> <20061218082756.GV2180@mentat.za.net> <45865515.1000208@ar.media.kyoto-u.ac.jp> <20061218091710.GW2180@mentat.za.net> <458668F0.90604@ar.media.kyoto-u.ac.jp> Message-ID: <4586E3B4.2040609@hawaii.edu> David, I think my earlier post got lost in the exchange between you and Stefan, so I will reiterate the central point: numpy.clip *is* slow, in that an implementation using putmask is substantially faster: def fastclip(a, vmin, vmax): a = a.copy() putmask(a, a<=vmin, vmin) putmask(a, a>=vmax, vmax) return a Using the equivalent of this in a modification of your benchmark, the time using the native clip on *or* your alternative on my machine was about 2.3 s, versus 1.5 s for the putmask-based equivalent. It seems that putmask is quite a bit faster than boolean indexing. Obviously, the function above could be implemented as a method, and a copy kwarg could be used to make the copy optional--often one does not need a copy. It is also clear that it should be possible to make a much faster native clip function that does everything in one pass with no intermediate arrays at all. Whether this is something numpy devels would want to do, and how much effort it would take, are entirely different questions. I looked at the present code in clip (and part of the way through the chain of functions it invokes) and was quite baffled. Eric David Cournapeau wrote: > Stefan van der Walt wrote: >> On Mon, Dec 18, 2006 at 05:45:09PM +0900, David Cournapeau wrote: >>> Yes, I of course mistyped the < and the copy. 
But the function is still >>> moderately faster on my workstation: >>> >>> ncalls tottime percall cumtime percall filename:lineno(function) >>> 1 0.003 0.003 3.944 3.944 slowclip.py:10(bench_clip) >>> 1 0.011 0.011 2.001 2.001 slowclip.py:16(clip1_bench) >>> 10 1.990 0.199 1.990 0.199 >>> /home/david/local/lib/python2.4/site-packages/numpy/core/fromnumeric.py:372(clip) >>> 1 1.682 1.682 1.682 1.682 slowclip.py:19(clip2_bench) >>> 1 0.258 0.258 0.258 0.258 >>> slowclip.py:6(generate_data_2d) >>> 0 0.000 0.000 profile:0(profiler) >> Did you try swapping the order of execution (i.e. clip1 second)? > Yes, I tried different orders, etc... and it showed the same pattern. > The thing is, this kind of thing is highly CPU dependent in my > experience; I don't have the time right now to update numpy.scipy on my > laptop, but it happens that profiles results are quite different between > my workstation (P4 xeon) and my laptop (pentium m). > > anyway, contrary to what I thought first, the real problem is the copy, > so this is where I should investigate in matplotlib case, > > David > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion From zhangyunfeng at gmail.com Mon Dec 18 20:45:04 2006 From: zhangyunfeng at gmail.com (zhang yunfeng) Date: Tue, 19 Dec 2006 09:45:04 +0800 Subject: [Numpy-discussion] sum of two arrays with different shape? In-Reply-To: <4585FF62.10701@gmail.com> References: <4ff46d8f0612171826q262231f0kc3d5f2e8070046f9@mail.gmail.com> <4585FF62.10701@gmail.com> Message-ID: <4ff46d8f0612181745r7cc11f75ne471e7e8862191ed@mail.gmail.com> 2006/12/18, Robert Kern : > > zhang yunfeng wrote: > > Hi, I'm newbie to Numpy. > > > > When reading tutorials at > > http://www.scipy.org/Tentative_NumPy_Tutorial > > , I found a snippet about > > addition of two arrays with different shape, Does it make sense? If > > array shapes are not same, why it doesn't throw out an error? > > When two arrays of different shapes are operated against each other, numpy > tries > to "broadcast" them to a compatible shape according to certain rules. This > is a > fairly powerful concept, and it provides quite a lot of convenience. The > following wiki page has an explanation of the broadcasting rules: > > http://www.scipy.org/EricsBroadcastingDoc > > Yes, It seems powerful. But If one happened to add two incompatible array by mistake, Does the result make sense? May be the broadcast feature should be limited in a certain range not to mess normal operation. -- http://my.opera.com/zhangyunfeng -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Mon Dec 18 21:03:00 2006 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 18 Dec 2006 20:03:00 -0600 Subject: [Numpy-discussion] sum of two arrays with different shape? In-Reply-To: <4ff46d8f0612181745r7cc11f75ne471e7e8862191ed@mail.gmail.com> References: <4ff46d8f0612171826q262231f0kc3d5f2e8070046f9@mail.gmail.com> <4585FF62.10701@gmail.com> <4ff46d8f0612181745r7cc11f75ne471e7e8862191ed@mail.gmail.com> Message-ID: <45874854.7090702@gmail.com> zhang yunfeng wrote: > Yes, It seems powerful. But If > one happened to add two incompatible array by > mistake, Does the result make sense? It may. The array object can't read your mind and know that you didn't intend it to do what you (accidentally) told it to do. 
> May be the broadcast feature should be limited in a certain range not
> to mess normal operation.

Broadcasting is really a fundamental feature of numpy. It *is* normal
operation as much as anything else is.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
-- Umberto Eco

From david at ar.media.kyoto-u.ac.jp Tue Dec 19 00:10:29 2006
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Tue, 19 Dec 2006 14:10:29 +0900
Subject: [Numpy-discussion] slow numpy.clip ?
In-Reply-To: <4586E3B4.2040609@hawaii.edu>
References: <45864074.6090203@ar.media.kyoto-u.ac.jp> <20061218082756.GV2180@mentat.za.net> <45865515.1000208@ar.media.kyoto-u.ac.jp> <20061218091710.GW2180@mentat.za.net> <458668F0.90604@ar.media.kyoto-u.ac.jp> <4586E3B4.2040609@hawaii.edu>
Message-ID: <45877445.20508@ar.media.kyoto-u.ac.jp>

Eric Firing wrote:
> David,
>
> I think my earlier post got lost in the exchange between you and Stefan,
> so I will reiterate the central point: numpy.clip *is* slow, in that an
> implementation using putmask is substantially faster:
>
> def fastclip(a, vmin, vmax):
>     a = a.copy()
>     putmask(a, a<=vmin, vmin)
>     putmask(a, a>=vmax, vmax)
>     return a
>
> Using the equivalent of this in a modification of your benchmark, the
> time using the native clip on *or* your alternative on my machine was
> about 2.3 s, versus 1.5 s for the putmask-based equivalent. It seems
> that putmask is quite a bit faster than boolean indexing.
>
> Obviously, the function above could be implemented as a method, and a
> copy kwarg could be used to make the copy optional--often one does not
> need a copy.
>
> It is also clear that it should be possible to make a much faster native
> clip function that does everything in one pass with no intermediate
> arrays at all. Whether this is something numpy devels would want to do,
> and how much effort it would take, are entirely different questions. I
> looked at the present code in clip (and part of the way through the
> chain of functions it invokes) and was quite baffled.

Well, this is something I would be willing to try *if* this is the main
bottleneck of imshow/show. I am still unsure about the problem, because
if I change numpy.clip to my function, including a copy, I really get a
big difference myself:

val = ma.array(nx.clip(val.filled(vmax), vmin, vmax), mask=mask)

vs

def myclip(b, m, M):
    a = b.copy()
    a[a<m] = m
    a[a>M] = M
    return a
val = ma.array(myclip(val.filled(vmax), vmin, vmax), mask=mask)

By trying the best result, I get 0.888 ms vs 0.784 for a show() call,
which is already a 10 % improvement, and I get almost a 15 % if I remove
the copy. I am updating numpy/scipy/mpl on my laptop to see if this is
specific to the CPU of my workstation (big cache, high frequency clock,
bi CPU with HT enabled). I would really like to see the imshow/show calls
go in the range of a few hundred ms; for interactive plotting, this
really changes a lot in my opinion.

cheers,

David

From gael.varoquaux at normalesup.org Tue Dec 19 02:13:38 2006
From: gael.varoquaux at normalesup.org (Gael Varoquaux)
Date: Tue, 19 Dec 2006 08:13:38 +0100
Subject: [Numpy-discussion] slow numpy.clip ?
From gael.varoquaux at normalesup.org  Tue Dec 19 02:13:38 2006
From: gael.varoquaux at normalesup.org (Gael Varoquaux)
Date: Tue, 19 Dec 2006 08:13:38 +0100
Subject: [Numpy-discussion] slow numpy.clip ?
In-Reply-To: <45877445.20508@ar.media.kyoto-u.ac.jp>
References: <45864074.6090203@ar.media.kyoto-u.ac.jp>
	<20061218082756.GV2180@mentat.za.net>
	<45865515.1000208@ar.media.kyoto-u.ac.jp>
	<20061218091710.GW2180@mentat.za.net>
	<458668F0.90604@ar.media.kyoto-u.ac.jp>
	<4586E3B4.2040609@hawaii.edu>
	<45877445.20508@ar.media.kyoto-u.ac.jp>
Message-ID: <20061219071338.GA21179@clipper.ens.fr>

On Tue, Dec 19, 2006 at 02:10:29PM +0900, David Cournapeau wrote:
> I would really like to see the imshow/show calls go into the range of a
> few hundred ms; for interactive plotting, this really changes a lot in
> my opinion.

I think this is strongly dependent on some parameters. I did some
interactive plotting on both a pentium 2, linux, WxAgg (thus Gtk behind
Wx), and a pentium 4, windows, WxAgg (thus MFC behind Wx), and there was
a huge difference between the speeds. The speed difference was a few
orders of magnitude. I couldn't explain it, but it was a good surprise,
as the application was developed for the windows box.

Gaël

From david at ar.media.kyoto-u.ac.jp  Tue Dec 19 02:12:34 2006
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Tue, 19 Dec 2006 16:12:34 +0900
Subject: [Numpy-discussion] Profiling numpy ? (parts written in C)
Message-ID: <458790E2.8040607@ar.media.kyoto-u.ac.jp>

Hi,

    Following the discussion on clip and other functions which *may* be
slow in numpy, I would like to know if there is a way to easily profile
numpy, i.e. functions which are written in C.

    For example, I am not sure I understand why a function like
take(a, b), with a a double 256x4 array and b a 8000x256 int array,
takes almost 200 ms on a fairly fast CPU; in the source code, I can see
that numpy uses memmove, and I know memmove to be slower than memcpy.
Is there an easy way to check that this is coming from memmove (case in
which nothing much can be done to improve the situation, I guess), and
not from something else?

cheers,

David
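Before reaching for a C-level profiler, the suspect call can at least be
bracketed from Python; a sketch with timeit, using the array shapes
quoted above (the timings, of course, are machine-dependent):

    import timeit

    setup = """
    import numpy as np
    lut = np.random.randn(256, 4)                  # the double 256x4 array
    idx = np.random.randint(0, 256, (8000, 256))   # the 8000x256 int array
    """

    for stmt in ("np.take(lut, idx, axis=0)", "lut[idx]"):
        t = timeit.Timer(stmt, setup)
        print(stmt, min(t.repeat(repeat=3, number=10)))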
From efiring at hawaii.edu  Tue Dec 19 02:19:21 2006
From: efiring at hawaii.edu (Eric Firing)
Date: Mon, 18 Dec 2006 21:19:21 -1000
Subject: [Numpy-discussion] slow numpy.clip ?
In-Reply-To: <45877445.20508@ar.media.kyoto-u.ac.jp>
References: <45864074.6090203@ar.media.kyoto-u.ac.jp>
	<20061218082756.GV2180@mentat.za.net>
	<45865515.1000208@ar.media.kyoto-u.ac.jp>
	<20061218091710.GW2180@mentat.za.net>
	<458668F0.90604@ar.media.kyoto-u.ac.jp>
	<4586E3B4.2040609@hawaii.edu>
	<45877445.20508@ar.media.kyoto-u.ac.jp>
Message-ID: <45879279.6030707@hawaii.edu>

David Cournapeau wrote:
> Eric Firing wrote:
>> David,
>>
>> I think my earlier post got lost in the exchange between you and
>> Stefan, so I will reiterate the central point: numpy.clip *is* slow,
>> in that an implementation using putmask is substantially faster:
>>
>> def fastclip(a, vmin, vmax):
>>     a = a.copy()
>>     putmask(a, a<=vmin, vmin)
>>     putmask(a, a>=vmax, vmax)
>>     return a
>>
>> Using the equivalent of this in a modification of your benchmark, the
>> time using the native clip *or* your alternative on my machine was
>> about 2.3 s, versus 1.5 s for the putmask-based equivalent. It seems
>> that putmask is quite a bit faster than boolean indexing.
>>
>> Obviously, the function above could be implemented as a method, and a
>> copy kwarg could be used to make the copy optional--often one does not
>> need a copy.
>>
>> It is also clear that it should be possible to make a much faster
>> native clip function that does everything in one pass with no
>> intermediate arrays at all. Whether this is something numpy devels
>> would want to do, and how much effort it would take, are entirely
>> different questions. I looked at the present code in clip (and part
>> of the way through the chain of functions it invokes) and was quite
>> baffled.
> Well, this is something I would be willing to try *if* this is the main
> bottleneck of imshow/show. I am still unsure about the problem, because
> if I change numpy.clip to my function, including a copy, I really get a
> big difference myself:
>
>     val = ma.array(nx.clip(val.filled(vmax), vmin, vmax),
>                    mask=mask)
>
> vs
>
>     def myclip(b, m, M):
>         a = b.copy()
>         a[a < m] = m
>         a[a > M] = M
>         return a
>     val = ma.array(myclip(val.filled(vmax), vmin, vmax), mask=mask)
>
> By taking the best result, I get 0.888 ms vs 0.784 ms for a show()
> call, which is already a 10 % improvement, and I get almost 15 % if I
> remove the copy. I am updating numpy/scipy/mpl on my laptop to see if
> this is specific to the CPU of my workstation (big cache, high
> frequency clock, dual CPU with HT enabled).

Please try the putmask version without the copy on your machines; I
expect it will be quite a bit faster on both machines. The relative
speeds of the versions may differ widely depending on how many values
actually get changed, though.

Eric

From david at ar.media.kyoto-u.ac.jp  Tue Dec 19 03:17:01 2006
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Tue, 19 Dec 2006 17:17:01 +0900
Subject: [Numpy-discussion] slow numpy.clip ?
In-Reply-To: <20061219071338.GA21179@clipper.ens.fr>
References: <45864074.6090203@ar.media.kyoto-u.ac.jp>
	<20061218082756.GV2180@mentat.za.net>
	<45865515.1000208@ar.media.kyoto-u.ac.jp>
	<20061218091710.GW2180@mentat.za.net>
	<458668F0.90604@ar.media.kyoto-u.ac.jp>
	<4586E3B4.2040609@hawaii.edu>
	<45877445.20508@ar.media.kyoto-u.ac.jp>
	<20061219071338.GA21179@clipper.ens.fr>
Message-ID: <45879FFD.8070408@ar.media.kyoto-u.ac.jp>

Gael Varoquaux wrote:
> On Tue, Dec 19, 2006 at 02:10:29PM +0900, David Cournapeau wrote:
>> I would really like to see the imshow/show calls go into the range of
>> a few hundred ms; for interactive plotting, this really changes a lot
>> in my opinion.
>
> I think this is strongly dependent on some parameters. I did some
> interactive plotting on both a pentium 2, linux, WxAgg (thus Gtk behind
> Wx), and a pentium 4, windows, WxAgg (thus MFC behind Wx), and there
> was a huge difference between the speeds. The speed difference was a
> few orders of magnitude. I couldn't explain it, but it was a good
> surprise, as the application was developed for the windows box.

I started to investigate the problem because under matlab, plotting a
spectrogram is negligible compared to computing it, whereas in
matplotlib with the numpy array backend, plotting it takes as much time
as computing it, which didn't make sense to me. Most of the computing
time is spent in code which is independent of the backend, that is,
during the conversion from the rank 2 array to rgba (60 % of the time on
my fast workstation, 85 % of the time on my laptop with a pentium M @
1.2 Ghz), so I don't think the GUI backend makes any difference.

cheers,

David

From david at ar.media.kyoto-u.ac.jp  Tue Dec 19 03:56:06 2006
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Tue, 19 Dec 2006 17:56:06 +0900
Subject: [Numpy-discussion] slow numpy.clip ?
In-Reply-To: <45879279.6030707@hawaii.edu>
References: <45864074.6090203@ar.media.kyoto-u.ac.jp>
	<20061218082756.GV2180@mentat.za.net>
	<45865515.1000208@ar.media.kyoto-u.ac.jp>
	<20061218091710.GW2180@mentat.za.net>
	<458668F0.90604@ar.media.kyoto-u.ac.jp>
	<4586E3B4.2040609@hawaii.edu>
	<45877445.20508@ar.media.kyoto-u.ac.jp>
	<45879279.6030707@hawaii.edu>
Message-ID: <4587A926.7070401@ar.media.kyoto-u.ac.jp>

Eric Firing wrote:
> David Cournapeau wrote:
>> Well, this is something I would be willing to try *if* this is the
>> main bottleneck of imshow/show. I am still unsure about the problem,
>> because if I change numpy.clip to my function, including a copy, I
>> really get a big difference myself:
>>
>>     val = ma.array(nx.clip(val.filled(vmax), vmin, vmax),
>>                    mask=mask)
>>
>> vs
>>
>>     def myclip(b, m, M):
>>         a = b.copy()
>>         a[a < m] = m
>>         a[a > M] = M
>>         return a
>>     val = ma.array(myclip(val.filled(vmax), vmin, vmax), mask=mask)
>>
>> By taking the best result, I get 0.888 ms vs 0.784 ms for a show()
>> call, which is already a 10 % improvement, and I get almost 15 % if I
>> remove the copy. I am updating numpy/scipy/mpl on my laptop to see if
>> this is specific to the CPU of my workstation (big cache, high
>> frequency clock, dual CPU with HT enabled).
>
> Please try the putmask version without the copy on your machines; I
> expect it will be quite a bit faster on both machines. The relative
> speeds of the versions may differ widely depending on how many values
> actually get changed, though.

On my workstation (dual xeon; I ran each corresponding script 5 times
and took the best result):
    - nx.clip takes ~ 170 ms (of 920 ms for the whole show call)
    - your fast clip, with copy: ~ 50 ms (of ~ 820 ms)
    - mine, with copy: ~ 50 ms (of ~ 830 ms)
    - yours without copy: ~ 30 ms (of 830 ms)
    - mine without copy: ~ 40 ms (of 830 ms)

Same on my laptop (pentium M @ 1.2 Ghz):
    - nx.clip takes ~ 230 ms (of 1460 ms)
    - mine with copy: ~ 70 ms (of 1200 ms)
    - mine without copy: ~ 55 ms (of 1300 ms)
    - yours with copy: ~ 80 ms (of 1300 ms)
    - yours without copy: ~ 67 ms (of 1300 ms)

Basically, at least from those figures, both versions are pretty
similar, and not worth improving much anyway for matplotlib. There is
something funny with the numpy version, though.

cheers,

David

From robert.kern at gmail.com  Tue Dec 19 04:37:51 2006
From: robert.kern at gmail.com (Robert Kern)
Date: Tue, 19 Dec 2006 03:37:51 -0600
Subject: [Numpy-discussion] slow numpy.clip ?
""" selector = ufunc.less(m, m_min)+2*ufunc.greater(m, m_max) return choose(selector, (m, m_min, m_max)) Creating that integer selector array is probably the most expensive part. Copying the array, then using putmask() or similar is certainly a better approach, and I can see no drawbacks to it. If anyone is up to translating their faster clip() into C, I'm more than happy to check it in. I might also entertain adding a copy=True keyword argument, but I'm not entirely certain we should be expanding the API during the 1.0.x series. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From david at ar.media.kyoto-u.ac.jp Tue Dec 19 04:41:16 2006 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Tue, 19 Dec 2006 18:41:16 +0900 Subject: [Numpy-discussion] slow numpy.clip ? In-Reply-To: <4587B2EF.7010803@gmail.com> References: <45864074.6090203@ar.media.kyoto-u.ac.jp> <20061218082756.GV2180@mentat.za.net> <45865515.1000208@ar.media.kyoto-u.ac.jp> <20061218091710.GW2180@mentat.za.net> <458668F0.90604@ar.media.kyoto-u.ac.jp> <4586E3B4.2040609@hawaii.edu> <45877445.20508@ar.media.kyoto-u.ac.jp> <45879279.6030707@hawaii.edu> <4587A926.7070401@ar.media.kyoto-u.ac.jp> <4587B2EF.7010803@gmail.com> Message-ID: <4587B3BC.5040802@ar.media.kyoto-u.ac.jp> Robert Kern wrote: > > Looking at the code, it's certainly not surprising that the current > implementation of clip() is slow. It is a direct numpy C API translation of the > following (taken from numarray, but it is the same in Numeric): > > > def clip(m, m_min, m_max): > """clip() returns a new array with every entry in m that is less than m_min > replaced by m_min, and every entry greater than m_max replaced by m_max. > """ > selector = ufunc.less(m, m_min)+2*ufunc.greater(m, m_max) > return choose(selector, (m, m_min, m_max)) > > > Creating that integer selector array is probably the most expensive part. > Copying the array, then using putmask() or similar is certainly a better > approach, and I can see no drawbacks to it. > > If anyone is up to translating their faster clip() into C, I'm more than happy > to check it in. I might also entertain adding a copy=True keyword argument, but > I'm not entirely certain we should be expanding the API during the 1.0.x series. > I would be happy to code the function; for new code to be added to numpy, is there another branch than the current one ? What is the approach for a 1.1.x version of numpy ? For now, putting the function with a copy (the current behaviour ?) would be ok, right ? The copy part is a much smaller problem than the rest of the function anyway, at least from my modest benchmarking, David From robert.kern at gmail.com Tue Dec 19 04:49:05 2006 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 19 Dec 2006 03:49:05 -0600 Subject: [Numpy-discussion] slow numpy.clip ? 
In-Reply-To: <4587B3BC.5040802@ar.media.kyoto-u.ac.jp>
References: <45864074.6090203@ar.media.kyoto-u.ac.jp>
	<20061218082756.GV2180@mentat.za.net>
	<45865515.1000208@ar.media.kyoto-u.ac.jp>
	<20061218091710.GW2180@mentat.za.net>
	<458668F0.90604@ar.media.kyoto-u.ac.jp>
	<4586E3B4.2040609@hawaii.edu>
	<45877445.20508@ar.media.kyoto-u.ac.jp>
	<45879279.6030707@hawaii.edu>
	<4587A926.7070401@ar.media.kyoto-u.ac.jp>
	<4587B2EF.7010803@gmail.com>
	<4587B3BC.5040802@ar.media.kyoto-u.ac.jp>
Message-ID: <4587B591.3060308@gmail.com>

David Cournapeau wrote:
> I would be happy to code the function; for new code to be added to
> numpy, is there another branch than the current one? What is the
> approach for a 1.1.x version of numpy?

I don't think we've decided on one, yet.

> For now, putting the function with a copy (the current behaviour?)
> would be ok, right? The copy part is a much smaller problem than the
> rest of the function anyway, at least from my modest benchmarking,

I'd prefer that you simply modify PyArray_Clip to use a better approach
than to make an entirely new function. In that case, it certainly must
make a copy.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

From faltet at carabos.com  Tue Dec 19 09:03:30 2006
From: faltet at carabos.com (Francesc Altet)
Date: Tue, 19 Dec 2006 15:03:30 +0100
Subject: [Numpy-discussion] Profiling numpy ? (parts written in C)
In-Reply-To: <458790E2.8040607@ar.media.kyoto-u.ac.jp>
References: <458790E2.8040607@ar.media.kyoto-u.ac.jp>
Message-ID: <200612191503.32995.faltet@carabos.com>

On Tuesday 19 December 2006 08:12, David Cournapeau wrote:
> Hi,
>
>     Following the discussion on clip and other functions which *may* be
> slow in numpy, I would like to know if there is a way to easily profile
> numpy, i.e. functions which are written in C.
>     For example, I am not sure I understand why a function like
> take(a, b), with a a double 256x4 array and b a 8000x256 int array,
> takes almost 200 ms on a fairly fast CPU; in the source code, I can see
> that numpy uses memmove, and I know memmove to be slower than memcpy.
> Is there an easy way to check that this is coming from memmove (case in
> which nothing much can be done to improve the situation, I guess), and
> not from something else?

For doing profiles on C extensions, you can use cProfile, which has
been included in Python 2.5. See an example of your benchmark using
cProfile below. I've run it against clip1_bench and clip2_bench. Here
are my results (using a Pentium4 Mobile @ 2 GHz).

For clip1 (i.e. clip from numpy):

17 function calls in 5.131 CPU seconds

   Ordered by: internal time, call count

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
       10    4.638    0.464    4.638    0.464 {method 'clip' of 'numpy.ndarray' objects}
        1    0.453    0.453    0.453    0.453 {method 'randn' of 'mtrand.RandomState' objects}
        1    0.020    0.020    4.658    4.658 clipb2.py:16(clip1_bench)
        1    0.002    0.002    5.113    5.113 clipb2.py:10(bench_clip)
        1    0.002    0.002    5.115    5.115 <string>:1(<module>)
        1    0.000    0.000    0.453    0.453 clipb2.py:6(generate_data_2d)
        1    0.000    0.000    0.000    0.000 {range}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

[you can see the C extensions between curly brackets]
For clip2 (i.e. your hand-made clip equivalent):

17 function calls in 3.371 CPU seconds

   Ordered by: internal time, call count

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    2.485    2.485    2.911    2.911 clipb2.py:19(clip2_bench)
        1    0.456    0.456    0.456    0.456 {method 'randn' of 'mtrand.RandomState' objects}
       10    0.426    0.043    0.426    0.043 {method 'copy' of 'numpy.ndarray' objects}
        1    0.003    0.003    3.369    3.369 clipb2.py:10(bench_clip)
        1    0.002    0.002    3.371    3.371 <string>:1(<module>)
        1    0.000    0.000    0.456    0.456 clipb2.py:6(generate_data_2d)
        1    0.000    0.000    0.000    0.000 {range}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

From these timings, one can see that most of the time in clip1 is
wasted in the .clip() method of numpy (nothing really new). So, cProfile
is only showing where the time is spent for the first-level calls into
extension code. If we want more introspection on the C stack, and you
are running on Linux, oprofile (http://oprofile.sourceforge.net) is a
very nice profiler. Here are the outputs for the above routines on my
machine.

For clip1:

Profiling through timer interrupt
samples  %        image name      symbol name
643      54.6769  libc-2.3.6.so   memmove
151      12.8401  multiarray.so   PyArray_Choose
35        2.9762  umath.so        BYTE_multiply
34        2.8912  umath.so        DOUBLE_greater
32        2.7211  mtrand.so       rk_random
32        2.7211  umath.so        DOUBLE_less
30        2.5510  libc-2.3.6.so   memcpy

For clip2:

Profiling through timer interrupt
samples  %        image name      symbol name
188      24.5111  libc-2.3.6.so   memmove
143      18.6441  multiarray.so   _nonzero_indices
126      16.4276  multiarray.so   PyArray_MapIterNext
37        4.8240  umath.so        DOUBLE_greater
36        4.6936  mtrand.so       rk_gauss
33        4.3025  umath.so        DOUBLE_less
24        3.1291  libc-2.3.6.so   memcpy

So, it seems like you are right: the bottleneck is in calling the
memmove routine. Looking at the code in PyArray_Choose
(multiarraymodule.c), I've replaced the memmove call with a memcpy one.
Here is the patch:

--- numpy/core/src/multiarraymodule.c   (revision 3487)
+++ numpy/core/src/multiarraymodule.c   (working copy)
@@ -2126,7 +2126,7 @@
             }
             offset = i*elsize;
             if (offset >= sizes[mi]) {offset = offset % sizes[mi]; }
-            memmove(ret_data, mps[mi]->data+offset, elsize);
+            memcpy(ret_data, mps[mi]->data+offset, elsize);
             ret_data += elsize;
             self_data++;
         }

With this patch applied, we have, for clip1:

Profiling through timer interrupt
samples  %        image name      symbol name
659      55.2389  libc-2.3.6.so   memcpy
184      15.4233  multiarray.so   PyArray_Choose
46        3.8558  mtrand.so       rk_gauss
37        3.1014  umath.so        BYTE_multiply
34        2.8500  umath.so        DOUBLE_greater
34        2.8500  umath.so        DOUBLE_less
24        2.0117  libm-2.3.6.so   __ieee754_log

So, it seems clear that the use of memcpy hasn't accelerated the
computations at all. This is somewhat striking, because in most
situations memcpy should perform better (see [1] for a practical example
of this). My guess is that the real bottleneck is in calling memmove so
many times (once per element in the array). Perhaps the algorithm can be
changed to do a block copy at the beginning and then modify only the
places on which the clip should act (kind of the same thing that you
have done in Python, but at C level).

[1]

Cheers,

------------------------------------------------------------------
# Example of clip using cProfile
import numpy as N

#==========================
# To benchmark imshow alone
#==========================
def generate_data_2d(fr, nwin, hop, length):
    # note: the frame count must be an integer for randn
    nframes = int(1.0 * fr / hop * length)
    return N.random.randn(nframes, nwin)

def bench_clip():
    m = -1.
    M = 1.
    # 2 minutes (120 sec) of sound @ 8 kHz with 256 samples with 50 % overlap
    data = generate_data_2d(8000, 256, 128, 120)

    def clip1_bench(data, niter):
        for i in range(niter):
            blop = data.clip(m, M)

    def clip2_bench(data, niter):
        for i in range(niter):
            blop = data.copy()
            blop[blop < m] = m
            blop[blop > M] = M

    #clip2_bench(data, 10)
    clip1_bench(data, 10)

if __name__ == '__main__':
    # test clip
    import pstats
    import cProfile as prof

    profile_wanted = False
    if not profile_wanted:
        bench_clip()
    else:
        profile_file = 'clip.prof'
        prof.run('bench_clip()', profile_file)
        stats = pstats.Stats(profile_file)
        stats.strip_dirs()
        stats.sort_stats('time', 'calls')
        stats.print_stats()
-------------------------------------------------------------------

--
>0,0<   Francesc Altet     http://www.carabos.com/
V   V   Cárabos Coop. V.   Enjoy Data
 "-"

From charlesr.harris at gmail.com  Tue Dec 19 09:33:37 2006
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Tue, 19 Dec 2006 07:33:37 -0700
Subject: [Numpy-discussion] Profiling numpy ? (parts written in C)
In-Reply-To: <200612191503.32995.faltet@carabos.com>
References: <458790E2.8040607@ar.media.kyoto-u.ac.jp>
	<200612191503.32995.faltet@carabos.com>
Message-ID:

On 12/19/06, Francesc Altet <faltet at carabos.com> wrote:
>
> On Tuesday 19 December 2006 08:12, David Cournapeau wrote:
> > Hi,
>
> My guess is that the real bottleneck is in calling memmove so many
> times (once per element in the array). Perhaps the algorithm can be
> changed to do a block copy at the beginning and then modify only the
> places on which the clip should act (kind of the same thing that you
> have done in Python, but at C level).

IIRC, doing a simple type-specific assignment is faster than either
memmove or memcpy. If speed is really of the essence it would probably
be worth writing a type-specific version of clip. A special function
combining clip with RGB conversion might do even better.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From asafdav2 at gmail.com  Tue Dec 12 09:06:53 2006
From: asafdav2 at gmail.com (asaf david)
Date: Tue, 12 Dec 2006 16:06:53 +0200
Subject: [Numpy-discussion] A question about argmax and argsort
In-Reply-To:
References:
Message-ID:

Hello

Let's say I have an N sized array, and I want to get the positions of
the K largest items. For K = 1 this is simply argmax. Is there any way
to generalize it for K != 1? Currently I use argsort and take only K
items from it, but I'm paying an additional ~lg(N) factor...

Thanks in advance, asaf
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
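For the record, the argsort-based approach described above, next to the
O(N) selection that numpy only grew much later (argpartition was added
in numpy 1.8, long after this thread):

    import numpy as np

    a = np.random.randn(1000)
    K = 5

    # Full sort: O(N log N) -- the extra ~lg(N) factor mentioned above.
    top_sorted = np.argsort(a)[-K:]

    # Partial selection: O(N); the K indices come back unordered.
    top_part = np.argpartition(a, len(a) - K)[-K:]

    assert set(top_sorted) == set(top_part)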
From Mark.Hoffmann at dk.manbw.com  Fri Dec 15 10:42:24 2006
From: Mark.Hoffmann at dk.manbw.com (Mark Hoffmann)
Date: Fri, 15 Dec 2006 16:42:24 +0100
Subject: [Numpy-discussion] Unexpected output using numpy.ndarray and __radd__
Message-ID: <1A0F0517C2D6894282F07FAF76FA3612020AD01E@CPH-EXCH-SG4.manbw.dk>

Hi,

The following issue has puzzled me for a while. I want to add a
numpy.ndarray and an instance of my own class. I define this operation
by implementing the methods __add__ and __radd__. My programme
(including output) looks like:

#!/usr/local/bin/python

import numpy

class Cyclehist:
    def __init__(self, vals):
        self.valuearray = numpy.array(vals)

    def __str__(self):
        return 'Cyclehist object: valuearray = ' + str(self.valuearray)

    def __add__(self, other):
        print "__add__ : ", self, other
        return self.valuearray + other

    def __radd__(self, other):
        print "__radd__ : ", self, other
        return other + self.valuearray

c = Cyclehist([1.0, -21.2, 3.2])
a = numpy.array([-1.0, 2.2, -2.2])
print c + a
print a + c

# ---------- OUTPUT ----------
#
# addprob $ addprob.py
# __add__ :  Cyclehist object: valuearray = [  1.  -21.2   3.2] [-1.   2.2  -2.2]
# [  0.  -19.    1.]
# __radd__ :  Cyclehist object: valuearray = [  1.  -21.2   3.2] -1.0
# __radd__ :  Cyclehist object: valuearray = [  1.  -21.2   3.2] 2.2
# __radd__ :  Cyclehist object: valuearray = [  1.  -21.2   3.2] -2.2
# [[  0.  -22.2   2.2] [  3.2 -19.    5.4] [ -1.2 -23.4   1. ]]
# addprob $
#
# ----------------------------

I expected the output of "c+a" and "a+c" to be identical; however, the
output of "a+c" gets nested in an elementwise fashion. Can anybody
explain this? Is it a bug or a feature? I'm using Python 2.4.4c1 and
numpy 1.0. I tried the programme using an older version of Python and
numpy, and there the results of "c+a" and "a+c" are identical.

Regards,

Mark Hoffmann
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
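What the output shows is ndarray.__add__ treating the Cyclehist instance
as a scalar-like object and invoking c.__radd__ once per element, which
nests the result. A hedged sketch of one commonly suggested remedy -- an
__array_priority__ attribute asking numpy to defer the whole operation
to the right-hand operand; whether it helps depends on the numpy
version:

    class Cyclehist:
        # assumed mechanism: a high priority makes numpy's binary ops
        # return NotImplemented so Python calls our __radd__ once
        __array_priority__ = 10.0
        ...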
From charlesr.harris at gmail.com  Sun Dec 17 13:31:33 2006
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Sun, 17 Dec 2006 11:31:33 -0700
Subject: [Numpy-discussion] Anyone have a "little" shooting-method function to share
In-Reply-To: <45539769.6030306@noaa.gov>
References: <45527171.8030903@noaa.gov>
	<1163094437.6376.6.camel@localhost.localdomain>
	<45539769.6030306@noaa.gov>
Message-ID:

I apologize if this shows up twice.

On 11/9/06, David L Goldsmith wrote:
>
> Wow, thanks!
>
> DG
>
> Pauli Virtanen wrote:
> > ke, 2006-11-08 kello 16:08 -0800, David L Goldsmith kirjoitti:
> >
> >> Hi! I tried to send this earlier: it made it into my sent mail
> >> folder, but does not appear to have made it to the list.
> >>
> >> I need to numerically solve:
> >>     (1-t)x" + x' - x = f(t),  x(0) = x0,  x(1) = x1
> >> I've been trying to use (because it's the approach I inherited) an
> >> elementary finite-difference discretization, but unit tests have
> >> shown that that approach isn't working.

My solution with order ten (degree 9) Chebyshev polynomials goes as
follows.

In [111]: import chebyshev as c

In [112]: t = c.modified_points(10,0,1)      # use 10 sample points

In [113]: D = c.modified_derivative(10,0,1)  # derivative operator

In [114]: op = (1.0 - t)[:,newaxis]*dot(D,D) + D - eye(10) # differential equation

In [115]: op[0] = 0       # set up boundary condition y(0) = y0

In [116]: op[0,0] = 1

In [117]: op[9] = 0       # set up boundary condition y(1) = y1

In [118]: op[9,9] = 1

In [119]: opinv = alg.inv(op)   # invert the operator

In [120]: f = exp(t)      # try f(t) = exp(t)

In [121]: f[0] = 2        # y0 = 2

In [122]: f[9] = 1        # y1 = 1

In [123]: soln = dot(opinv,f)   # solve equation

In [124]: plot(t,soln)
Out[124]: []

The plot is rather rough with only 10 points. Replot with more.

In [125]: tsmp = linspace(0,1)

In [126]: interp = c.modified_values(tsmp, 10, 0, 0, 1)

In [127]: plot(tsmp, dot(interp, soln))
Out[127]: []

Looks OK here. You can save opinv as it doesn't change with f. Likewise,
if you always want to interpolate the result, then save
dot(interp, opinv).

I've attached a plot of the solution I got along with the chebyshev
module I use.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: solution.zip
Type: application/zip
Size: 30179 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: chebyshev.py
Type: text/x-python
Size: 18573 bytes
Desc: not available
URL:
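The same boundary-value formulation can be sketched with an ordinary
second-order finite-difference discretization using only numpy; this
illustrates the build-the-operator-and-solve idea, not the attached
chebyshev module (note also that the ODE is singular at t = 1, where the
x'' coefficient vanishes, which the boundary row sidesteps):

    import numpy as np

    n = 101
    t = np.linspace(0.0, 1.0, n)
    h = t[1] - t[0]

    # Rows for (1-t)x'' + x' - x = f(t) at the interior points.
    A = np.zeros((n, n))
    for i in range(1, n - 1):
        c = 1.0 - t[i]
        A[i, i - 1] = c / h**2 - 1.0 / (2 * h)
        A[i, i]     = -2.0 * c / h**2 - 1.0
        A[i, i + 1] = c / h**2 + 1.0 / (2 * h)

    # Boundary conditions x(0) = 2, x(1) = 1 as identity rows,
    # mirroring the f[0] = 2, f[9] = 1 assignments above.
    A[0, 0] = A[-1, -1] = 1.0

    f = np.exp(t)
    f[0], f[-1] = 2.0, 1.0

    x = np.linalg.solve(A, f)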
From Derek.Bandler at gs.com  Tue Dec 19 08:24:23 2006
From: Derek.Bandler at gs.com (Bandler, Derek)
Date: Tue, 19 Dec 2006 08:24:23 -0500
Subject: [Numpy-discussion] (no subject)
Message-ID:

Hi,

I would like to get information on the software licenses for numpy &
numeric. On the sourceforge home for the packages, the listed license is
OSI-Approved Open Source. Is it possible to get more information on
this? A copy of the document would be useful. Thank you.

Best regards,
Derek Bandler
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From gregwillden at gmail.com  Tue Dec 19 14:33:40 2006
From: gregwillden at gmail.com (Greg Willden)
Date: Tue, 19 Dec 2006 13:33:40 -0600
Subject: [Numpy-discussion] (no subject)
In-Reply-To:
References:
Message-ID: <903323ff0612191133r224044e8o15ac7cf94fb72050@mail.gmail.com>

Hi Derek,

Like all Free & Open Source Software (FOSS) projects, the license is
distributed with the source code. There is a file called LICENSE.txt in
the numpy tar archive. Here are the contents of that file.

Copyright (c) 2005, NumPy Developers

All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are
met:

    * Redistributions of source code must retain the above copyright
      notice, this list of conditions and the following disclaimer.

    * Redistributions in binary form must reproduce the above copyright
      notice, this list of conditions and the following disclaimer in
      the documentation and/or other materials provided with the
      distribution.

    * Neither the name of the NumPy Developers nor the names of any
      contributors may be used to endorse or promote products derived
      from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS
IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A
PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER
OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

Greg

On 12/19/06, Bandler, Derek <Derek.Bandler at gs.com> wrote:
>
> Hi,
>
> I would like to get information on the software licenses for numpy &
> numeric. On the sourceforge home for the packages, the listed license
> is OSI-Approved Open Source. Is it possible to get more information on
> this? A copy of the document would be useful. Thank you.
>
> Best regards,
> Derek Bandler
>
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>

--
Linux. Because rebooting is for adding hardware.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From robert.kern at gmail.com  Tue Dec 19 14:35:52 2006
From: robert.kern at gmail.com (Robert Kern)
Date: Tue, 19 Dec 2006 13:35:52 -0600
Subject: [Numpy-discussion] (no subject)
In-Reply-To:
References:
Message-ID: <45883F18.3020204@gmail.com>

Bandler, Derek wrote:
> Hi,
>
> I would like to get information on the software licenses for numpy &
> numeric. On the sourceforge home for the packages, the listed license
> is OSI-Approved Open Source. Is it possible to get more information on
> this? A copy of the document would be useful. Thank you.

They are both BSD-like licenses.

http://projects.scipy.org/scipy/numpy/browser/trunk/LICENSE.txt
http://projects.scipy.org/scipy/scipy/browser/trunk/LICENSE.txt

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco
From oliphant at ee.byu.edu  Tue Dec 19 20:18:06 2006
From: oliphant at ee.byu.edu (Travis Oliphant)
Date: Tue, 19 Dec 2006 18:18:06 -0700
Subject: [Numpy-discussion] slow numpy.clip ?
In-Reply-To: <4587B3BC.5040802@ar.media.kyoto-u.ac.jp>
References: <45864074.6090203@ar.media.kyoto-u.ac.jp>
	<20061218082756.GV2180@mentat.za.net>
	<45865515.1000208@ar.media.kyoto-u.ac.jp>
	<20061218091710.GW2180@mentat.za.net>
	<458668F0.90604@ar.media.kyoto-u.ac.jp>
	<4586E3B4.2040609@hawaii.edu>
	<45877445.20508@ar.media.kyoto-u.ac.jp>
	<45879279.6030707@hawaii.edu>
	<4587A926.7070401@ar.media.kyoto-u.ac.jp>
	<4587B2EF.7010803@gmail.com>
	<4587B3BC.5040802@ar.media.kyoto-u.ac.jp>
Message-ID: <45888F4E.7070707@ee.byu.edu>

David Cournapeau wrote:
> Robert Kern wrote:
>> Looking at the code, it's certainly not surprising that the current
>> implementation of clip() is slow. It is a direct numpy C API
>> translation of the following (taken from numarray, but it is the same
>> in Numeric):
>>
>> def clip(m, m_min, m_max):
>>     """clip() returns a new array with every entry in m that is less
>>     than m_min replaced by m_min, and every entry greater than m_max
>>     replaced by m_max.
>>     """
>>     selector = ufunc.less(m, m_min)+2*ufunc.greater(m, m_max)
>>     return choose(selector, (m, m_min, m_max))
>>
> I would be happy to code the function; for new code to be added to
> numpy, is there another branch than the current one? What is the
> approach for a 1.1.x version of numpy?

The idea is to make a 1.0.x branch as soon as the trunk changes the
C-API. The guarantee is that extension modules won't have to be rebuilt
until 1.1. I don't know that we've specified if there will be *no* API
changes. For example, there have already been some backward-compatible
extensions to the 1.0.X series.

I like the idea of being able to add functions to the 1.0.X series but
without breaking compatibility. I also don't mind adding new keywords to
functions (but not to C-API calls, as that would require a re-compile of
extension modules).

-Travis

From oliphant at ee.byu.edu  Tue Dec 19 20:21:31 2006
From: oliphant at ee.byu.edu (Travis Oliphant)
Date: Tue, 19 Dec 2006 18:21:31 -0700
Subject: [Numpy-discussion] slow numpy.clip ?
In-Reply-To: <4587B2EF.7010803@gmail.com>
References: <45864074.6090203@ar.media.kyoto-u.ac.jp>
	<20061218082756.GV2180@mentat.za.net>
	<45865515.1000208@ar.media.kyoto-u.ac.jp>
	<20061218091710.GW2180@mentat.za.net>
	<458668F0.90604@ar.media.kyoto-u.ac.jp>
	<4586E3B4.2040609@hawaii.edu>
	<45877445.20508@ar.media.kyoto-u.ac.jp>
	<45879279.6030707@hawaii.edu>
	<4587A926.7070401@ar.media.kyoto-u.ac.jp>
	<4587B2EF.7010803@gmail.com>
Message-ID: <4588901B.5030904@ee.byu.edu>

Robert Kern wrote:
> David Cournapeau wrote:
>> Basically, at least from those figures, both versions are pretty
>> similar, and not worth improving much anyway for matplotlib. There is
>> something funny with the numpy version, though.
>
> Looking at the code, it's certainly not surprising that the current
> implementation of clip() is slow. It is a direct numpy C API
> translation of the following (taken from numarray, but it is the same
> in Numeric):
>
> def clip(m, m_min, m_max):
>     """clip() returns a new array with every entry in m that is less
>     than m_min replaced by m_min, and every entry greater than m_max
>     replaced by m_max.
>     """
>     selector = ufunc.less(m, m_min)+2*ufunc.greater(m, m_max)
>     return choose(selector, (m, m_min, m_max))

There are a lot of functions that are essentially this. Many things
were done to just get something working. It would seem like a good idea
to re-code many of these to speed them up.

> Creating that integer selector array is probably the most expensive
> part. Copying the array, then using putmask() or similar is certainly
> a better approach, and I can see no drawbacks to it.
>
> If anyone is up to translating their faster clip() into C, I'm more
> than happy to check it in. I might also entertain adding a copy=True
> keyword argument, but I'm not entirely certain we should be expanding
> the API during the 1.0.x series.

The problem with the copy=True keyword is that it would imply needing to
expand the C-API for PyArray_Clip and should not be done until 1.1 IMHO.

We would probably be better off not expanding the keyword arguments to
methods as well until that time.

-Travis

From robert.kern at gmail.com  Tue Dec 19 20:56:15 2006
From: robert.kern at gmail.com (Robert Kern)
Date: Tue, 19 Dec 2006 19:56:15 -0600
Subject: [Numpy-discussion] slow numpy.clip ?
In-Reply-To: <4588901B.5030904@ee.byu.edu>
References: <45864074.6090203@ar.media.kyoto-u.ac.jp>
	<20061218082756.GV2180@mentat.za.net>
	<45865515.1000208@ar.media.kyoto-u.ac.jp>
	<20061218091710.GW2180@mentat.za.net>
	<458668F0.90604@ar.media.kyoto-u.ac.jp>
	<4586E3B4.2040609@hawaii.edu>
	<45877445.20508@ar.media.kyoto-u.ac.jp>
	<45879279.6030707@hawaii.edu>
	<4587A926.7070401@ar.media.kyoto-u.ac.jp>
	<4587B2EF.7010803@gmail.com>
	<4588901B.5030904@ee.byu.edu>
Message-ID: <4588983F.5000902@gmail.com>

Travis Oliphant wrote:
> The problem with the copy=True keyword is that it would imply needing
> to expand the C-API for PyArray_Clip and should not be done until 1.1
> IMHO.

I don't think we have to change the signature of PyArray_Clip() at all.
PyArray_Clip() takes an "out" argument.
Currently, this is only set to something other than NULL if explicitly
provided as a keyword "out=" argument to numpy.ndarray.clip(). All we
have to do is modify the implementation of array_clip() to parse a
"copy=" argument and set "out = self" before calling PyArray_Clip().

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

From robert.kern at gmail.com  Tue Dec 19 20:57:46 2006
From: robert.kern at gmail.com (Robert Kern)
Date: Tue, 19 Dec 2006 19:57:46 -0600
Subject: [Numpy-discussion] slow numpy.clip ?
In-Reply-To: <4588901B.5030904@ee.byu.edu>
References: <45864074.6090203@ar.media.kyoto-u.ac.jp>
	<20061218082756.GV2180@mentat.za.net>
	<45865515.1000208@ar.media.kyoto-u.ac.jp>
	<20061218091710.GW2180@mentat.za.net>
	<458668F0.90604@ar.media.kyoto-u.ac.jp>
	<4586E3B4.2040609@hawaii.edu>
	<45877445.20508@ar.media.kyoto-u.ac.jp>
	<45879279.6030707@hawaii.edu>
	<4587A926.7070401@ar.media.kyoto-u.ac.jp>
	<4587B2EF.7010803@gmail.com>
	<4588901B.5030904@ee.byu.edu>
Message-ID: <4588989A.1040506@gmail.com>

Travis Oliphant wrote:
> There are a lot of functions that are essentially this. Many things
> were done to just get something working. It would seem like a good
> idea to re-code many of these to speed them up.

Off the top of your head, do you have a list of these?

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

From tim.hochberg at ieee.org  Tue Dec 19 21:15:17 2006
From: tim.hochberg at ieee.org (Tim Hochberg)
Date: Tue, 19 Dec 2006 19:15:17 -0700
Subject: [Numpy-discussion] slow numpy.clip ?
In-Reply-To: <4588983F.5000902@gmail.com>
References: <45864074.6090203@ar.media.kyoto-u.ac.jp>
	<20061218082756.GV2180@mentat.za.net>
	<45865515.1000208@ar.media.kyoto-u.ac.jp>
	<20061218091710.GW2180@mentat.za.net>
	<458668F0.90604@ar.media.kyoto-u.ac.jp>
	<4586E3B4.2040609@hawaii.edu>
	<45877445.20508@ar.media.kyoto-u.ac.jp>
	<45879279.6030707@hawaii.edu>
	<4587A926.7070401@ar.media.kyoto-u.ac.jp>
	<4587B2EF.7010803@gmail.com>
	<4588901B.5030904@ee.byu.edu>
	<4588983F.5000902@gmail.com>
Message-ID: <45889CB5.4090604@ieee.org>

Robert Kern wrote:
> Travis Oliphant wrote:
>
>> The problem with the copy=True keyword is that it would imply needing
>> to expand the C-API for PyArray_Clip and should not be done until 1.1
>> IMHO.
>
> I don't think we have to change the signature of PyArray_Clip() at all.
> PyArray_Clip() takes an "out" argument. Currently, this is only set to
> something other than NULL if explicitly provided as a keyword "out="
> argument to numpy.ndarray.clip(). All we have to do is modify the
> implementation of array_clip() to parse a "copy=" argument and set
> "out = self" before calling PyArray_Clip().

I admit to not following the clip discussion very closely, but if
PyArray_Clip already supports 'out', why use a copy parameter at all?
Why not just expose 'out' at the python level? This allows in-place
operations: "clip(m, m_min, m_max, out=m)", it is more flexible than a
copy argument, and it matches the interface of a whole pile of other
functions.

My $0.02

-tim
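The out= form Tim describes is indeed already reachable from Python; a
small usage sketch:

    import numpy as np

    a = np.random.randn(1000)

    b = a.clip(-1.0, 1.0)        # allocates and returns a new array
    a.clip(-1.0, 1.0, out=a)     # clips in place; no copy is made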
From david at ar.media.kyoto-u.ac.jp  Tue Dec 19 21:30:45 2006
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Wed, 20 Dec 2006 11:30:45 +0900
Subject: [Numpy-discussion] slow numpy.clip ?
In-Reply-To: <4588901B.5030904@ee.byu.edu>
References: <45864074.6090203@ar.media.kyoto-u.ac.jp>
	<20061218082756.GV2180@mentat.za.net>
	<45865515.1000208@ar.media.kyoto-u.ac.jp>
	<20061218091710.GW2180@mentat.za.net>
	<458668F0.90604@ar.media.kyoto-u.ac.jp>
	<4586E3B4.2040609@hawaii.edu>
	<45877445.20508@ar.media.kyoto-u.ac.jp>
	<45879279.6030707@hawaii.edu>
	<4587A926.7070401@ar.media.kyoto-u.ac.jp>
	<4587B2EF.7010803@gmail.com>
	<4588901B.5030904@ee.byu.edu>
Message-ID: <4588A055.5070707@ar.media.kyoto-u.ac.jp>

Travis Oliphant wrote:
> Robert Kern wrote:
>> David Cournapeau wrote:
>>> Basically, at least from those figures, both versions are pretty
>>> similar, and not worth improving much anyway for matplotlib. There is
>>> something funny with the numpy version, though.
>>
>> Looking at the code, it's certainly not surprising that the current
>> implementation of clip() is slow. It is a direct numpy C API
>> translation of the following (taken from numarray, but it is the same
>> in Numeric):
>>
>> def clip(m, m_min, m_max):
>>     """clip() returns a new array with every entry in m that is less
>>     than m_min replaced by m_min, and every entry greater than m_max
>>     replaced by m_max.
>>     """
>>     selector = ufunc.less(m, m_min)+2*ufunc.greater(m, m_max)
>>     return choose(selector, (m, m_min, m_max))
>
> There are a lot of functions that are essentially this. Many things
> were done to just get something working. It would seem like a good
> idea to re-code many of these to speed them up.
>
>> Creating that integer selector array is probably the most expensive
>> part. Copying the array, then using putmask() or similar is certainly
>> a better approach, and I can see no drawbacks to it.
>>
>> If anyone is up to translating their faster clip() into C, I'm more
>> than happy to check it in. I might also entertain adding a copy=True
>> keyword argument, but I'm not entirely certain we should be expanding
>> the API during the 1.0.x series.
>
> The problem with the copy=True keyword is that it would imply needing
> to expand the C-API for PyArray_Clip and should not be done until 1.1
> IMHO.
>
> We would probably be better off not expanding the keyword arguments to
> methods as well until that time.

When I went back home, I started taking a close look at the numpy/core C
sources, with the help of the numpy ebook. The huge source files make it
really difficult for me to follow some things: I was wondering if there
is some rationale behind it, or if this is just a remnant of older numpy
development.

The main problem I have with those huge files is that I am confused
about which functions are part of the public API, which ones are there
for backward compatibility, etc... I wanted to extract the
PyArray_TakeFrom function to see where the time is spent, but this is
quite difficult, because of various dependencies.

My question is then: is there any plan to change this? If not, is this
for some reason I don't see, or is this just because of lack of
manpower?

cheers,

David
From robert.kern at gmail.com  Tue Dec 19 21:36:05 2006
From: robert.kern at gmail.com (Robert Kern)
Date: Tue, 19 Dec 2006 20:36:05 -0600
Subject: [Numpy-discussion] slow numpy.clip ?
In-Reply-To: <45889CB5.4090604@ieee.org>
References: <45864074.6090203@ar.media.kyoto-u.ac.jp>
	<20061218082756.GV2180@mentat.za.net>
	<45865515.1000208@ar.media.kyoto-u.ac.jp>
	<20061218091710.GW2180@mentat.za.net>
	<458668F0.90604@ar.media.kyoto-u.ac.jp>
	<4586E3B4.2040609@hawaii.edu>
	<45877445.20508@ar.media.kyoto-u.ac.jp>
	<45879279.6030707@hawaii.edu>
	<4587A926.7070401@ar.media.kyoto-u.ac.jp>
	<4587B2EF.7010803@gmail.com>
	<4588901B.5030904@ee.byu.edu>
	<4588983F.5000902@gmail.com>
	<45889CB5.4090604@ieee.org>
Message-ID: <4588A195.3060809@gmail.com>

Tim Hochberg wrote:
> Robert Kern wrote:
>> Travis Oliphant wrote:
>>
>>> The problem with the copy=True keyword is that it would imply
>>> needing to expand the C-API for PyArray_Clip and should not be done
>>> until 1.1 IMHO.
>>
>> I don't think we have to change the signature of PyArray_Clip() at
>> all. PyArray_Clip() takes an "out" argument. Currently, this is only
>> set to something other than NULL if explicitly provided as a keyword
>> "out=" argument to numpy.ndarray.clip(). All we have to do is modify
>> the implementation of array_clip() to parse a "copy=" argument and
>> set "out = self" before calling PyArray_Clip().
>
> I admit to not following the clip discussion very closely, but if
> PyArray_Clip already supports 'out', why use a copy parameter at all?
> Why not just expose 'out' at the python level? This allows in-place
> operations: "clip(m, m_min, m_max, out=m)", it is more flexible than a
> copy argument, and it matches the interface of a whole pile of other
> functions.

It's already exposed. I just didn't know that before I proposed
copy=True (and when I learned it, my brain was already stuck in that
mode).

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco
From david at ar.media.kyoto-u.ac.jp  Tue Dec 19 21:36:15 2006
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Wed, 20 Dec 2006 11:36:15 +0900
Subject: [Numpy-discussion] Profiling numpy ? (parts written in C)
In-Reply-To: <200612191503.32995.faltet@carabos.com>
References: <458790E2.8040607@ar.media.kyoto-u.ac.jp>
	<200612191503.32995.faltet@carabos.com>
Message-ID: <4588A19F.3050405@ar.media.kyoto-u.ac.jp>

Francesc Altet wrote:
> On Tuesday 19 December 2006 08:12, David Cournapeau wrote:
>> Hi,
>>
>>     Following the discussion on clip and other functions which *may*
>> be slow in numpy, I would like to know if there is a way to easily
>> profile numpy, i.e. functions which are written in C.
>>     For example, I am not sure I understand why a function like
>> take(a, b), with a a double 256x4 array and b a 8000x256 int array,
>> takes almost 200 ms on a fairly fast CPU; in the source code, I can
>> see that numpy uses memmove, and I know memmove to be slower than
>> memcpy. Is there an easy way to check that this is coming from
>> memmove (case in which nothing much can be done to improve the
>> situation, I guess), and not from something else?
>
> For doing profiles on C extensions, you can use cProfile, which has
> been included in Python 2.5. See an example of your benchmark using
> cProfile below.

I haven't used python2.5's cProfile yet, mainly because of a really
annoying bug of ubuntu with python2.5 and ctypes, which makes it
unusable for me. I totally forgot about oprofile, which I tried once
some time ago; I really liked what I saw then. Thank you for the tip!

Concerning the memmove vs memcpy question: the problem I was speaking
about is in another function (numpy take), where the problem is much
bigger speed-wise. I will be on holiday starting from tomorrow, with a
big flight from Osaka to France, so I will have some time in the plane
to investigate those :)

cheers,

David

From david at ar.media.kyoto-u.ac.jp  Tue Dec 19 21:41:02 2006
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Wed, 20 Dec 2006 11:41:02 +0900
Subject: [Numpy-discussion] Profiling numpy ? (parts written in C)
In-Reply-To:
References: <458790E2.8040607@ar.media.kyoto-u.ac.jp>
	<200612191503.32995.faltet@carabos.com>
Message-ID: <4588A2BE.4030801@ar.media.kyoto-u.ac.jp>

Charles R Harris wrote:
>> My guess is that the real bottleneck is in calling memmove so many
>> times (once per element in the array). Perhaps the algorithm can be
>> changed to do a block copy at the beginning and then modify only the
>> places on which the clip should act (kind of the same thing that you
>> have done in Python, but at C level).
>
> IIRC, doing a simple type-specific assignment is faster than either
> memmove or memcpy. If speed is really of the essence it would probably
> be worth writing a type-specific version of clip. A special function
> combining clip with RGB conversion might do even better.

At the end, in the original context (speeding up the drawing of
spectrograms), this is the problem. Even if multiple backends/toolkits
obviously have an impact on performance, I really don't see why a numpy
function to convert an array to an RGB representation should be 10-20
times slower than matlab on the same machine.

I will take into account all those helpful messages, and hopefully come
up with something by the end of the week :)

cheers,

David
From charlesr.harris at gmail.com  Tue Dec 19 22:51:32 2006
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Tue, 19 Dec 2006 20:51:32 -0700
Subject: [Numpy-discussion] slow numpy.clip ?
In-Reply-To: <4588A055.5070707@ar.media.kyoto-u.ac.jp>
References: <45864074.6090203@ar.media.kyoto-u.ac.jp>
	<20061218091710.GW2180@mentat.za.net>
	<458668F0.90604@ar.media.kyoto-u.ac.jp>
	<4586E3B4.2040609@hawaii.edu>
	<45877445.20508@ar.media.kyoto-u.ac.jp>
	<45879279.6030707@hawaii.edu>
	<4587A926.7070401@ar.media.kyoto-u.ac.jp>
	<4587B2EF.7010803@gmail.com>
	<4588901B.5030904@ee.byu.edu>
	<4588A055.5070707@ar.media.kyoto-u.ac.jp>
Message-ID:

On 12/19/06, David Cournapeau <david at ar.media.kyoto-u.ac.jp> wrote:
>
> Travis Oliphant wrote:
> > Robert Kern wrote:
> >> David Cournapeau wrote:
>
> When I went back home, I started taking a close look at the numpy/core
> C sources, with the help of the numpy ebook. The huge source files
> make it really difficult for me to follow some things: I was wondering
> if there is some rationale behind it, or if this is just a remnant of
> older numpy development.
>
> The main problem I have with those huge files is that I am confused
> about which functions are part of the public API, which ones are there
> for backward compatibility, etc... I wanted to extract the
> PyArray_TakeFrom function to see where the time is spent, but this is
> quite difficult, because of various dependencies.
>
> My question is then: is there any plan to change this? If not, is this
> for some reason I don't see, or is this just because of lack of
> manpower?

I raised the possibility of breaking up the files before and Travis was
agreeable to the idea. It is still in the back of my mind but I haven't
got around to doing anything about it. Maybe we should put together a
step by step approach: agree on some file names for the new files, fix
the build so it loads in the new stub files in the correct order, and
then start moving stuff. My own original desire was to break out the
keyword parsers into a separate file, but I think Travis had different
priorities.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
From jdhunter at ace.bsd.uchicago.edu  Tue Dec 19 23:28:58 2006
From: jdhunter at ace.bsd.uchicago.edu (John Hunter)
Date: Tue, 19 Dec 2006 22:28:58 -0600
Subject: [Numpy-discussion] Profiling numpy ? (parts written in C)
In-Reply-To: <4588A2BE.4030801@ar.media.kyoto-u.ac.jp> (David Cournapeau's
	message of "Wed, 20 Dec 2006 11:41:02 +0900")
References: <458790E2.8040607@ar.media.kyoto-u.ac.jp>
	<200612191503.32995.faltet@carabos.com>
	<4588A2BE.4030801@ar.media.kyoto-u.ac.jp>
Message-ID: <87ac1j9omt.fsf@peds-pc311.bsd.uchicago.edu>

>>>>> "David" == David Cournapeau <david at ar.media.kyoto-u.ac.jp> writes:

    David> At the end, in the original context (speeding up the drawing
    David> of spectrograms), this is the problem. Even if multiple
    David> backends/toolkits obviously have an impact on performance, I
    David> really don't see why a numpy function to convert an array to
    David> an RGB representation should be 10-20 times slower than
    David> matlab on the same machine.

This isn't exactly right. When matplotlib converts a 2D grayscale
array to rgba, a lot goes on under the hood. It's all numpy, but it's
far from a single function, and it involves many passes through the
data. In principle, this could be done with one or two passes through
the data. In practice, our normalization and colormapping abstractions
are so abstract that it is difficult (though not impossible) to
special-case and optimize.

The top-level routine is

    def to_rgba(self, x, alpha=1.0):
        '''Return a normalized rgba array corresponding to x.
        If x is already an rgb or rgba array, return it unchanged.
        '''
        if hasattr(x, 'shape') and len(x.shape)>2: return x
        x = ma.asarray(x)
        x = self.norm(x)
        x = self.cmap(x, alpha)
        return x

which implies at a minimum two passes through the data, one for norm
and one for cmap.

In 99% of the use cases, cmap is a LinearSegmentedColormap, though
users can define their own as long as it is callable. My guess is
that the expensive part is Colormap.__call__, the base class for
LinearSegmentedColormap. We could probably write some extension code
that does the following routine in one pass through the data. But it
would be hairy. In a quick look and rough count, I see about 10
passes through the data in the function below.

If you are interested in optimizing colormapping in mpl, I'd start
here. I suspect there may be some low hanging fruit.

    def __call__(self, X, alpha=1.0):
        """
        X is either a scalar or an array (of any dimension).
        If scalar, a tuple of rgba values is returned, otherwise
        an array with the new shape = oldshape+(4,). If the X-values
        are integers, then they are used as indices into the array.
        If they are floating point, then they must be in the
        interval (0.0, 1.0).
        Alpha must be a scalar.
        """
        if not self._isinit: self._init()
        alpha = min(alpha, 1.0) # alpha must be between 0 and 1
        alpha = max(alpha, 0.0)
        self._lut[:-3, -1] = alpha
        mask_bad = None
        if not iterable(X):
            vtype = 'scalar'
            xa = array([X])
        else:
            vtype = 'array'
            xma = ma.asarray(X)
            xa = xma.filled(0)
            mask_bad = ma.getmask(xma)
        if typecode(xa) in typecodes['Float']:
            putmask(xa, xa==1.0, 0.9999999) # Treat 1.0 as slightly less than 1.
            xa = (xa * self.N).astype(Int)
        # Set the over-range indices before the under-range;
        # otherwise the under-range values get converted to over-range.
        putmask(xa, xa>self.N-1, self._i_over)
        putmask(xa, xa<0, self._i_under)
        if mask_bad is not None and mask_bad.shape == xa.shape:
            putmask(xa, mask_bad, self._i_bad)
        rgba = take(self._lut, xa)
        if vtype == 'scalar':
            rgba = tuple(rgba[0,:])
        return rgba

    David> I will take into account all those helpful messages, and
    David> hopefully come up with something by the end of the week :)

    David> cheers

    David> David
    David> _______________________________________________
    David> Numpy-discussion mailing list
    David> Numpy-discussion at scipy.org
    David> http://projects.scipy.org/mailman/listinfo/numpy-discussion
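Stripped of the masking and scalar special cases, the core lookup John
describes reduces to a few whole-array passes; a schematic sketch, not
the matplotlib implementation:

    import numpy as np

    N = 256
    lut = np.random.rand(N, 4)     # stand-in for the colormap's RGBA table

    def apply_cmap(x, lut):
        # x assumed normalized to [0, 1): scale to table indices ...
        xa = (x * len(lut)).astype(int)
        # ... clamp out-of-range indices in place ...
        np.putmask(xa, xa > len(lut) - 1, len(lut) - 1)
        np.putmask(xa, xa < 0, 0)
        # ... and gather the colors -- the take() call that dominates
        # the profiles quoted in this thread.
        return np.take(lut, xa, axis=0)

    img = apply_cmap(np.random.rand(480, 640), lut)   # shape (480, 640, 4)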
> If scalar, a tuple of rgba values is returned, otherwise > an array with the new shape = oldshape+(4,). If the X-values > are integers, then they are used as indices into the array. > If they are floating point, then they must be in the > interval (0.0, 1.0). > Alpha must be a scalar. > """ > if not self._isinit: self._init() > alpha = min(alpha, 1.0) # alpha must be between 0 and 1 > alpha = max(alpha, 0.0) > self._lut[:-3, -1] = alpha > mask_bad = None > if not iterable(X): > vtype = 'scalar' > xa = array([X]) > else: > vtype = 'array' > xma = ma.asarray(X) > xa = xma.filled(0) > mask_bad = ma.getmask(xma) > if typecode(xa) in typecodes['Float']: > putmask(xa, xa==1.0, 0.9999999) #Treat 1.0 as slightly less than 1. > xa = (xa * self.N).astype(Int) > # Set the over-range indices before the under-range; > # otherwise the under-range values get converted to over-range. > putmask(xa, xa>self.N-1, self._i_over) > putmask(xa, xa<0, self._i_under) > if mask_bad is not None and mask_bad.shape == xa.shape: > putmask(xa, mask_bad, self._i_bad) > rgba = take(self._lut, xa) > if vtype == 'scalar': > rgba = tuple(rgba[0,:]) > return rgba > > > > > > David> I will take into account all those helpful messages, and > David> hopefully come with something for the end of the week :), > > David> cheers > > David> David _______________________________________________ > David> Numpy-discussion mailing list Numpy-discussion at scipy.org > David> http://projects.scipy.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion From david at ar.media.kyoto-u.ac.jp Wed Dec 20 01:13:17 2006 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Wed, 20 Dec 2006 15:13:17 +0900 Subject: [Numpy-discussion] Profiling numpy ? (parts written in C) In-Reply-To: <87ac1j9omt.fsf@peds-pc311.bsd.uchicago.edu> References: <458790E2.8040607@ar.media.kyoto-u.ac.jp> <200612191503.32995.faltet@carabos.com> <4588A2BE.4030801@ar.media.kyoto-u.ac.jp> <87ac1j9omt.fsf@peds-pc311.bsd.uchicago.edu> Message-ID: <4588D47D.6080206@ar.media.kyoto-u.ac.jp> John Hunter wrote: >>>>>> "David" == David Cournapeau writes: > David> At the end, in the original context (speeding the drawing > David> of spectrogram), this is the problem. Even if multiple > David> backend/toolkits have obviously an impact in performances, > David> I really don't see why a numpy function to convert an array > David> to a RGB representation should be 10-20 times slower than > David> matlab on the same machine. > > This isn't exactly right. When matplotlib converts a 2D grayscale > array to rgba, a lot goes on under the hood. It's all numpy, but it's > far from single function and it involves many passes through the > data. In principle, this could be done with one or two passes through > the data. In practice, our normalization and colormapping abstractions > are so abstract that it is difficult (though not impossible) to > special case and optimize. > Well, we managed to have more than a 100 % speed increase for to_rgba function already, with the help from Eric Firing. 
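(To make this concrete for readers following the thread: the norm step is
essentially a clip followed by an affine rescale into [0, 1]. A minimal
sketch of the idea -- with made-up names, not mpl's actual code -- would be:

import numpy

def normalize(x, vmin, vmax):
    # One pass over the data for the clip, then one pass each for the
    # subtraction and the division; every operation walks the whole array.
    x = numpy.clip(x, vmin, vmax)
    return (x - vmin) / float(vmax - vmin)

Each of those array operations is a separate pass, which is why the speed
of numpy.clip matters so much here.)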
Before, both functors Normalize and Colormap were expensive; I think we got something like a 2-3x speed increase in normalize (now, the main problem is the clip function, which lead to the discussion about numpy clip), and now, the Colormap functor takes 75 % of the time of to_rgba function with an array of 8000x256 samples: 1 0.000 0.000 0.832 0.832 /home/david/local/lib/python2.4/site-packages/matplotlib/backends/backend_gtk.py:275(expose_event) 1 0.010 0.010 0.831 0.831 /home/david/local/lib/python2.4/site-packages/matplotlib/backends/backend_gtkagg.py:71(_render_figure) 1 0.000 0.000 0.821 0.821 /home/david/local/lib/python2.4/site-packages/matplotlib/backends/backend_agg.py:385(draw) 1 0.000 0.000 0.819 0.819 /home/david/local/lib/python2.4/site-packages/matplotlib/figure.py:511(draw) 1 0.000 0.000 0.817 0.817 /home/david/local/lib/python2.4/site-packages/matplotlib/axes.py:1043(draw) 1 0.000 0.000 0.648 0.648 /home/david/local/lib/python2.4/site-packages/matplotlib/pylab.py:1927(imshow) 3 0.000 0.000 0.573 0.191 /home/david/local/lib/python2.4/site-packages/matplotlib/pylab.py:883(gca) 1 0.000 0.000 0.572 0.572 /home/david/local/lib/python2.4/site-packages/matplotlib/pylab.py:950(ishold) 1 0.007 0.007 0.510 0.510 /home/david/local/lib/python2.4/site-packages/matplotlib/image.py:173(draw) 1 0.109 0.109 0.503 0.503 /home/david/local/lib/python2.4/site-packages/matplotlib/image.py:109(make_image) 4 0.000 0.000 0.491 0.123 /home/david/local/lib/python2.4/site-packages/matplotlib/pylab.py:903(gcf) 1 0.000 0.000 0.491 0.491 /home/david/local/lib/python2.4/site-packages/matplotlib/pylab.py:818(figure) 1 0.000 0.000 0.491 0.491 /home/david/local/lib/python2.4/site-packages/matplotlib/backends/backend_gtkagg.py:36(new_figure_manager) 1 0.024 0.024 0.482 0.482 /home/david/local/lib/python2.4/site-packages/matplotlib/backends/backend_gtk.py:401(__init__) 1 0.000 0.000 0.458 0.458 /home/david/local/lib/python2.4/site-packages/matplotlib/backends/backend_gtkagg.py:25(_get_toolbar) 1 0.001 0.001 0.458 0.458 /home/david/local/lib/python2.4/site-packages/matplotlib/backends/backend_gtk.py:496(__init__) 1 0.000 0.000 0.458 0.458 /home/david/local/lib/python2.4/site-packages/matplotlib/backend_bases.py:1112(__init__) 1 0.010 0.010 0.458 0.458 /home/david/local/lib/python2.4/site-packages/matplotlib/backends/backend_gtk.py:557(_init_toolbar) 1 0.029 0.029 0.448 0.448 /home/david/local/lib/python2.4/site-packages/matplotlib/backends/backend_gtk.py:595(_init_toolbar2_4) 1 0.419 0.419 0.419 0.419 /home/david/local/lib/python2.4/site-packages/matplotlib/backends/backend_gtk.py:967(__init__) 1 0.002 0.002 0.393 0.393 /home/david/local/lib/python2.4/site-packages/matplotlib/cm.py:50(to_rgba) 1 0.111 0.111 0.307 0.307 /home/david/local/lib/python2.4/site-packages/matplotlib/colors.py:570(__call__) Of this 300 ms spent in Colormap functor, 200 ms are taken by the take function: this is the function which I think can be speed up considerably. > The top-level routine is > > def to_rgba(self, x, alpha=1.0): > '''Return a normalized rgba array corresponding to x. > If x is already an rgb or rgba array, return it unchanged. > ''' > if hasattr(x, 'shape') and len(x.shape)>2: return x > x = ma.asarray(x) > x = self.norm(x) > x = self.cmap(x, alpha) > return x > > which implies at a minimum two passes through the data, one for norm > and one for cmap. > > In 99% of the use cases, cmap is a LinearSegmentedColormap though > users can define their own as long as it is callable. 
My guess is
> that the expensive part is Colormap.__call__, the base class for
> LinearSegmentedColormap. We could probably write some extension code
> that does the following routine in one pass through the data. But it
> would be hairy. In a quick look and rough count, I see about 10
> passes through the data in the function below.

So, given the above points, I agree that self.norm and self.cmap are the
slow parts, but I think this can be much improved by improving the
corresponding numpy functions. I think there is room to improve things
without touching matplotlib.

cheers,

David

From david at ar.media.kyoto-u.ac.jp  Wed Dec 20 01:59:50 2006
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Wed, 20 Dec 2006 15:59:50 +0900
Subject: [Numpy-discussion] Profiling numpy ? (parts written in C)
In-Reply-To: <200612191503.32995.faltet@carabos.com>
References: <458790E2.8040607@ar.media.kyoto-u.ac.jp>
	<200612191503.32995.faltet@carabos.com>
Message-ID: <4588DF66.5000005@ar.media.kyoto-u.ac.jp>

Francesc Altet wrote:
>
> So, cProfile is only showing where the time is spent at the
> first-level calls into extensions. If we want more introspection on
> the C stack, and you are running on Linux, oprofile
> (http://oprofile.sourceforge.net) is a very nice profiler. Here are
> the outputs for the above routines on my machine.
>
> For clip1:
>
> Profiling through timer interrupt
> samples  %        image name       symbol name
> 643      54.6769  libc-2.3.6.so    memmove
> 151      12.8401  multiarray.so    PyArray_Choose
> 35        2.9762  umath.so         BYTE_multiply
> 34        2.8912  umath.so         DOUBLE_greater
> 32        2.7211  mtrand.so        rk_random
> 32        2.7211  umath.so         DOUBLE_less
> 30        2.5510  libc-2.3.6.so    memcpy
>
>
> For clip2:
>
> Profiling through timer interrupt
> samples  %        image name       symbol name
> 188      24.5111  libc-2.3.6.so    memmove
> 143      18.6441  multiarray.so    _nonzero_indices
> 126      16.4276  multiarray.so    PyArray_MapIterNext
> 37        4.8240  umath.so         DOUBLE_greater
> 36        4.6936  mtrand.so        rk_gauss
> 33        4.3025  umath.so         DOUBLE_less
> 24        3.1291  libc-2.3.6.so    memcpy

Could you detail a bit how you did the profiling with oprofile? I don't
manage to get the same results as you (that is, on a per-application
basis, when the application is a python script and not a 'binary' program).

Thank you,

David

From faltet at carabos.com  Wed Dec 20 03:48:48 2006
From: faltet at carabos.com (Francesc Altet)
Date: Wed, 20 Dec 2006 09:48:48 +0100
Subject: [Numpy-discussion] Profiling numpy ? (parts written in C)
In-Reply-To: <4588DF66.5000005@ar.media.kyoto-u.ac.jp>
References: <458790E2.8040607@ar.media.kyoto-u.ac.jp>
	<200612191503.32995.faltet@carabos.com>
	<4588DF66.5000005@ar.media.kyoto-u.ac.jp>
Message-ID: <200612200948.48753.faltet@carabos.com>

On Wednesday 20 December 2006 at 07:59, David Cournapeau wrote:
> Could you detail a bit how you did the profiling with oprofile? I don't
> manage to get the same results as you (that is, on a per-application
> basis, when the application is a python script and not a 'binary' program)

Sure. First you need to start the profiler with:

opcontrol --start

then run your application, for example:

python2.5 /tmp/clipb2.py

after this you should instruct oprofile to stop collecting samples:

opcontrol --stop

now, you need to tell oprofile that you want a report on the binary
you have run (i.e. your interpreter):

opreport -l /usr/local/bin/python2.5   # put there your actual path

That's all.
Remember to reset all the samples in oprofile each time you want to start a new run (otherwise the samples will accumulate from run to run): opcontrol --reset You can get more info in: http://oprofile.sourceforge.net/docs/ HTH, -- >0,0< Francesc Altet ? ? http://www.carabos.com/ V V C?rabos Coop. V. ??Enjoy Data "-" From oliphant at ee.byu.edu Wed Dec 20 04:03:16 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Wed, 20 Dec 2006 02:03:16 -0700 Subject: [Numpy-discussion] slow numpy.clip ? In-Reply-To: References: <45864074.6090203@ar.media.kyoto-u.ac.jp> <20061218091710.GW2180@mentat.za.net> <458668F0.90604@ar.media.kyoto-u.ac.jp> <4586E3B4.2040609@hawaii.edu> <45877445.20508@ar.media.kyoto-u.ac.jp> <45879279.6030707@hawaii.edu> <4587A926.7070401@ar.media.kyoto-u.ac.jp> <4587B2EF.7010803@gmail.com> <4588901B.5030904@ee.byu.edu> <4588A055.5070707@ar.media.kyoto-u.ac.jp> Message-ID: <4588FC54.7040106@ee.byu.edu> > My question is then: is there any plan to change this ? If not, is > this > for some reasons I don't see, or is this just because of lack of > manpower ? > > > I raised the possibility of breaking up the files before and Travis > was agreeable to the idea. It is still in the back of my mind but I > haven't got around to doing anything about it. Maybe we should put > together a step by step approach, agree on some file names for the new > files, fix the build so it loads in the new stub files in the correct > order, and then start moving stuff. My own original desire was to > break out the keyword parsers into a separate file but I think Travis > had different priorities. The problem with separate files is (and has always been) the NumPy C-API. I tried to use separate files to some extent (and then use #include to make it all one big file). The C-API is exposed by filling in a table of function pointers. You will notice that when arrayobject.h is included for an extension module, all of the C-API is defined to pull a particular function pointer out of a table that is stored in a Python CObject in the multiarray module extension itself. Basically, NumPy is following the standard Python advice (as Numeric and Numarray did) about how to expose a C-API, but it's just gotten a bit big. Solutions to that problem are always welcome. -Travis From oliphant at ee.byu.edu Wed Dec 20 04:05:21 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Wed, 20 Dec 2006 02:05:21 -0700 Subject: [Numpy-discussion] slow numpy.clip ? In-Reply-To: <4588A055.5070707@ar.media.kyoto-u.ac.jp> References: <45864074.6090203@ar.media.kyoto-u.ac.jp> <20061218082756.GV2180@mentat.za.net> <45865515.1000208@ar.media.kyoto-u.ac.jp> <20061218091710.GW2180@mentat.za.net> <458668F0.90604@ar.media.kyoto-u.ac.jp> <4586E3B4.2040609@hawaii.edu> <45877445.20508@ar.media.kyoto-u.ac.jp> <45879279.6030707@hawaii.edu> <4587A926.7070401@ar.media.kyoto-u.ac.jp> <4587B2EF.7010803@gmail.com> <4588901B.5030904@ee.byu.edu> <4588A055.5070707@ar.media.kyoto-u.ac.jp> Message-ID: <4588FCD1.10800@ee.byu.edu> David Cournapeau wrote: > en I went back to home, I started taking a close look a numpy/core C > sources, with the help of the numpy ebook. The huge source files make it > really difficult for me to follow some things: I was wondering if there > is some rationale behind it, or if this is just a remain of old > developments of numpy. > > The main problem I have with those huge files is that I am confused > between the functions parts of the public API, the one for backward > compatibility, etc... 
I wanted to extract the PyArray_TakeFom function > to see where the time is spent, but this is quite difficult, because of > various dependencies. > > My question is then: is there any plan to change this ? If not, is this > for some reasons I don't see, or is this just because of lack of manpower ? > I'm not sure what you mean by "this". I have no plans to change the infrastructure, but naturally suggestions are always welcome. You just have to understand and figure out the limitations of trying to expose a C-API. -Travis From david at ar.media.kyoto-u.ac.jp Wed Dec 20 04:16:36 2006 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Wed, 20 Dec 2006 18:16:36 +0900 Subject: [Numpy-discussion] Profiling numpy ? (parts written in C) In-Reply-To: <200612200948.48753.faltet@carabos.com> References: <458790E2.8040607@ar.media.kyoto-u.ac.jp> <200612191503.32995.faltet@carabos.com> <4588DF66.5000005@ar.media.kyoto-u.ac.jp> <200612200948.48753.faltet@carabos.com> Message-ID: <4588FF74.5070703@ar.media.kyoto-u.ac.jp> Francesc Altet wrote: > A Dimecres 20 Desembre 2006 07:59, David Cournapeau escrigu?: >> Could you detail a bit how you did the profiling with oprofile ? I don't >> manage to get the same results than you (that is on per application >> basis when the application is a python script and not a 'binary' program) > > Sure. You need first to start the profiler with: > > opcontrol --start > > then run your application, for example: > > python2.5 /tmp/clipb2.py > > after this you should instruct oprofile to stop collecting samples: > > opcontrol --stop > > now, you need to tell oprofile that you want a report on the binary > you have run (i.e. your interpreter): > > opreport -l /usr/local/bin/python2.5 # put there your actual path > Ok, I am a bit stupid, I should have thought about using the python interpreter instead of my script. But if I do this, I have only one line, which corresponds to the time spend to python: opreport -l /usr/bin/python2.5 6520 100.00 (no symbols) I guess the problem is that oprofile has no way to know that code spend into eg umath.so was called by scripts run python2.5. How do you do that ? Do you need a specially compiled interpreter (with -g ?) cheers, David From david at ar.media.kyoto-u.ac.jp Wed Dec 20 04:27:44 2006 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Wed, 20 Dec 2006 18:27:44 +0900 Subject: [Numpy-discussion] slow numpy.clip ? In-Reply-To: <4588FCD1.10800@ee.byu.edu> References: <45864074.6090203@ar.media.kyoto-u.ac.jp> <20061218082756.GV2180@mentat.za.net> <45865515.1000208@ar.media.kyoto-u.ac.jp> <20061218091710.GW2180@mentat.za.net> <458668F0.90604@ar.media.kyoto-u.ac.jp> <4586E3B4.2040609@hawaii.edu> <45877445.20508@ar.media.kyoto-u.ac.jp> <45879279.6030707@hawaii.edu> <4587A926.7070401@ar.media.kyoto-u.ac.jp> <4587B2EF.7010803@gmail.com> <4588901B.5030904@ee.byu.edu> <4588A055.5070707@ar.media.kyoto-u.ac.jp> <4588FCD1.10800@ee.byu.edu> Message-ID: <45890210.7090205@ar.media.kyoto-u.ac.jp> Travis Oliphant wrote: > David Cournapeau wrote: >> en I went back to home, I started taking a close look a numpy/core C >> sources, with the help of the numpy ebook. The huge source files make it >> really difficult for me to follow some things: I was wondering if there >> is some rationale behind it, or if this is just a remain of old >> developments of numpy. >> >> The main problem I have with those huge files is that I am confused >> between the functions parts of the public API, the one for backward >> compatibility, etc... 
I wanted to extract the PyArray_TakeFom function >> to see where the time is spent, but this is quite difficult, because of >> various dependencies. >> >> My question is then: is there any plan to change this ? If not, is this >> for some reasons I don't see, or is this just because of lack of manpower ? >> > > I'm not sure what you mean by "this". I have no plans to change the > infrastructure, but naturally suggestions are always welcome. You just > have to understand and figure out the limitations of trying to expose a > C-API. "this" was just about the big source files, and I was wondering if there was a rationale or not. Your previous email answered this: there is a rationale. I don't have much experience in pure C python modules, and if this is the standard python way of doing things, I guess there is no other easy way of doing things. Thank you for your explanation, David From faltet at carabos.com Wed Dec 20 04:32:07 2006 From: faltet at carabos.com (Francesc Altet) Date: Wed, 20 Dec 2006 10:32:07 +0100 Subject: [Numpy-discussion] Profiling numpy ? (parts written in C) In-Reply-To: <4588FF74.5070703@ar.media.kyoto-u.ac.jp> References: <458790E2.8040607@ar.media.kyoto-u.ac.jp> <200612200948.48753.faltet@carabos.com> <4588FF74.5070703@ar.media.kyoto-u.ac.jp> Message-ID: <200612201032.08487.faltet@carabos.com> A Dimecres 20 Desembre 2006 10:16, David Cournapeau escrigu?: > Francesc Altet wrote: > > A Dimecres 20 Desembre 2006 07:59, David Cournapeau escrigu?: > >> Could you detail a bit how you did the profiling with oprofile ? I don't > >> manage to get the same results than you (that is on per application > >> basis when the application is a python script and not a 'binary' > >> program) > > > > Sure. You need first to start the profiler with: > > > > opcontrol --start > > > > then run your application, for example: > > > > python2.5 /tmp/clipb2.py > > > > after this you should instruct oprofile to stop collecting samples: > > > > opcontrol --stop > > > > now, you need to tell oprofile that you want a report on the binary > > you have run (i.e. your interpreter): > > > > opreport -l /usr/local/bin/python2.5 # put there your actual path > > Ok, I am a bit stupid, I should have thought about using the python > interpreter instead of my script. But if I do this, I have only one > line, which corresponds to the time spend to python: > > opreport -l /usr/bin/python2.5 > > 6520 100.00 (no symbols) > > I guess the problem is that oprofile has no way to know that code spend > into eg umath.so was called by scripts run python2.5. How do you do that > ? Do you need a specially compiled interpreter (with -g ?) No, I don't think so (at least, if you are not going to profile python itself). I think that the only thing you need should be to specify the -g when compiling the libraries that you are going to profile; in this case: NumPy. Cheers, -- >0,0< Francesc Altet ? ? http://www.carabos.com/ V V C?rabos Coop. V. ??Enjoy Data "-" From ivilata at carabos.com Wed Dec 20 07:02:26 2006 From: ivilata at carabos.com (Ivan Vilata i Balaguer) Date: Wed, 20 Dec 2006 13:02:26 +0100 Subject: [Numpy-discussion] Type of 1st argument in Numexpr where() Message-ID: <20061220120226.GC11105@tardis.terramar.selidor.net> Hi all, I noticed that the set of ``where()`` functions defined by Numexpr all have a signature like ``xfxx``, i.e. the first argument is a float and the return, second and third arguments are of the same type (whatever it is). 
Since the first argument effectively represents a condition, wouldn't it make more sense for it to be a boolean? Booleans are already supported by Numexpr, maybe the old signatures are just a legacy from the time when Numexpr didn't support them. I have attached a patch to the latest version of Numexpr which implements this. Cheers, PS: It seems that http://numpy.scipy.org/ still points to the old SourceForge list address. :: Ivan Vilata i Balaguer >qo< http://www.carabos.com/ C?rabos Coop. V. V V Enjoy Data "" -------------- next part -------------- Index: interp_body.c =================================================================== --- interp_body.c (revisi?n: 2439) +++ interp_body.c (copia de trabajo) @@ -155,7 +155,7 @@ case OP_POW_III: VEC_ARG2(i_dest = (i2 < 0) ? (1 / i1) : (long)pow(i1, i2)); case OP_MOD_III: VEC_ARG2(i_dest = i1 % i2); - case OP_WHERE_IFII: VEC_ARG3(i_dest = f1 ? i2 : i3); + case OP_WHERE_IBII: VEC_ARG3(i_dest = b1 ? i2 : i3); case OP_CAST_FB: VEC_ARG1(f_dest = (long)b1); case OP_CAST_FI: VEC_ARG1(f_dest = (double)(i1)); @@ -175,7 +175,7 @@ case OP_SQRT_FF: VEC_ARG1(f_dest = sqrt(f1)); case OP_ARCTAN2_FFF: VEC_ARG2(f_dest = atan2(f1, f2)); - case OP_WHERE_FFFF: VEC_ARG3(f_dest = f1 ? f2 : f3); + case OP_WHERE_FBFF: VEC_ARG3(f_dest = b1 ? f2 : f3); case OP_FUNC_FF: VEC_ARG1(f_dest = functions_f[arg2](f1)); case OP_FUNC_FFF: VEC_ARG2(f_dest = functions_ff[arg3](f1, f2)); @@ -206,8 +206,8 @@ case OP_EQ_BCC: VEC_ARG2(b_dest = (c1r == c2r && c1i == c2i) ? 1 : 0); case OP_NE_BCC: VEC_ARG2(b_dest = (c1r != c2r || c1i != c2i) ? 1 : 0); - case OP_WHERE_CFCC: VEC_ARG3(cr_dest = f1 ? c2r : c3r; - ci_dest = f1 ? c2i : c3i); + case OP_WHERE_CBCC: VEC_ARG3(cr_dest = b1 ? c2r : c3r; + ci_dest = b1 ? c2i : c3i); case OP_FUNC_CC: VEC_ARG1(ca.real = c1r; ca.imag = c1i; functions_cc[arg2](&ca, &ca); Index: tests/test_numexpr.py =================================================================== --- tests/test_numexpr.py (revisi?n: 2439) +++ tests/test_numexpr.py (copia de trabajo) @@ -186,8 +186,8 @@ 'sinh(a)', '2*a + (cos(3)+5)*sinh(cos(b))', '2*a + arctan2(a, b)', - 'where(a, 2, b)', - 'where((a-10).real, a, 2)', + 'where(a != 0.0, 2, b)', + 'where((a-10).real != 0.0, a, 2)', 'cos(1+1)', '1+1', '1', Index: interpreter.c =================================================================== --- interpreter.c (revisi?n: 2439) +++ interpreter.c (copia de trabajo) @@ -45,7 +45,7 @@ OP_DIV_III, OP_POW_III, OP_MOD_III, - OP_WHERE_IFII, + OP_WHERE_IBII, OP_CAST_FB, OP_CAST_FI, @@ -63,7 +63,7 @@ OP_TAN_FF, OP_SQRT_FF, OP_ARCTAN2_FFF, - OP_WHERE_FFFF, + OP_WHERE_FBFF, OP_FUNC_FF, OP_FUNC_FFF, @@ -80,7 +80,7 @@ OP_SUB_CCC, OP_MUL_CCC, OP_DIV_CCC, - OP_WHERE_CFCC, + OP_WHERE_CBCC, OP_FUNC_CC, OP_FUNC_CCC, @@ -148,9 +148,9 @@ case OP_POW_III: if (n == 0 || n == 1 || n == 2) return 'i'; break; - case OP_WHERE_IFII: + case OP_WHERE_IBII: if (n == 0 || n == 2 || n == 3) return 'i'; - if (n == 1) return 'f'; + if (n == 1) return 'b'; break; case OP_CAST_FB: if (n == 0) return 'f'; @@ -178,8 +178,9 @@ case OP_ARCTAN2_FFF: if (n == 0 || n == 1 || n == 2) return 'f'; break; - case OP_WHERE_FFFF: - if (n == 0 || n == 1 || n == 2 || n == 3) return 'f'; + case OP_WHERE_FBFF: + if (n == 0 || n == 2 || n == 3) return 'f'; + if (n == 1) return 'b'; break; case OP_FUNC_FF: if (n == 0 || n == 1) return 'f'; @@ -217,9 +218,9 @@ case OP_DIV_CCC: if (n == 0 || n == 1 || n == 2) return 'c'; break; - case OP_WHERE_CFCC: + case OP_WHERE_CBCC: if (n == 0 || n == 2 || n == 3) return 'c'; - if (n == 1) return 'f'; + 
if (n == 1) return 'b'; break; case OP_FUNC_CC: if (n == 0 || n == 1) return 'c'; @@ -1320,7 +1321,7 @@ add_op("div_iii", OP_DIV_III); add_op("pow_iii", OP_POW_III); add_op("mod_iii", OP_MOD_III); - add_op("where_ifii", OP_WHERE_IFII); + add_op("where_ibii", OP_WHERE_IBII); add_op("cast_fb", OP_CAST_FB); add_op("cast_fi", OP_CAST_FI); @@ -1339,7 +1340,7 @@ add_op("tan_ff", OP_TAN_FF); add_op("sqrt_ff", OP_SQRT_FF); add_op("arctan2_fff", OP_ARCTAN2_FFF); - add_op("where_ffff", OP_WHERE_FFFF); + add_op("where_fbff", OP_WHERE_FBFF); add_op("func_ff", OP_FUNC_FF); add_op("func_fff", OP_FUNC_FFF); @@ -1356,7 +1357,7 @@ add_op("sub_ccc", OP_SUB_CCC); add_op("mul_ccc", OP_MUL_CCC); add_op("div_ccc", OP_DIV_CCC); - add_op("where_cfcc", OP_WHERE_CFCC); + add_op("where_cbcc", OP_WHERE_CBCC); add_op("func_cc", OP_FUNC_CC); add_op("func_ccc", OP_FUNC_CCC); Index: timing.py =================================================================== --- timing.py (revisi?n: 2439) +++ timing.py (copia de trabajo) @@ -88,13 +88,13 @@ """ % ((array_size,)*3) expr5 = 'where(0.1*a > arctan2(a, b), 2*a, arctan2(a,b))' -expr6 = 'where(a, 2, b)' +expr6 = 'where(a != 0.0, 2, b)' -expr7 = 'where(a-10, a, 2)' +expr7 = 'where(a-10 != 0.0, a, 2)' -expr8 = 'where(a%2, b+5, 2)' +expr8 = 'where(a%2 != 0.0, b+5, 2)' -expr9 = 'where(a%2, 2, b+5)' +expr9 = 'where(a%2 != 0.0, 2, b+5)' expr10 = 'a**2 + (b+1)**-2.5' -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 309 bytes Desc: Digital signature URL: From faltet at carabos.com Wed Dec 20 07:08:34 2006 From: faltet at carabos.com (Francesc Altet) Date: Wed, 20 Dec 2006 13:08:34 +0100 Subject: [Numpy-discussion] Profiling numpy ? (parts written in C) In-Reply-To: <4588A19F.3050405@ar.media.kyoto-u.ac.jp> References: <458790E2.8040607@ar.media.kyoto-u.ac.jp> <200612191503.32995.faltet@carabos.com> <4588A19F.3050405@ar.media.kyoto-u.ac.jp> Message-ID: <200612201308.36419.faltet@carabos.com> A Dimecres 20 Desembre 2006 03:36, David Cournapeau escrigu?: > Francesc Altet wrote: > > A Dimarts 19 Desembre 2006 08:12, David Cournapeau escrigu?: > >> Hi, > >> > >> Following the discussion on clip and other functions which *may* be > >> slow in numpy, I would like to know if there is a way to easily profile > >> numpy, ie functions which are written in C. > >> For example, I am not sure to understand why a function like take(a, > >> b) with a a double 256x4 array and b a 8000x256 int array takes almost > >> 200 ms on a fairly fast CPU; in the source code, I can see that numpy > >> uses memmove, and I know memmove to be slower than memcpy. Is there an > >> easy way to check that this is coming from memmove (case in which > >> nothing much can be done to improve the situation I guess), and not from > >> something else ? > > Concerning the memmove vs memcpy: the problem I was speaking about is in > another function (numpy take), where the problem is much bigger speed wise. Ops. Yes, you are right. Out of curiosity, I've looked into this as well, and created a small script to benchmark take(), which is listed at the end of the message. I've used a = double 256x4 and b = int 80x256 (my b is quite smaller than yours, mainly because of the time that it takes to generate with my current approach; by the way, if anybody knows an easy way to do a cartesian product with numpy, please, tell me), but I don't think this is going to influence the timings. 
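Regarding the cartesian product aside above: for the two-array case used
here, a numpy-only approach might do -- repeat() for the first axis and a
cyclic resize() for the second. This is only a sketch (the function name
is made up, and I have only considered 1-d inputs of the same dtype):

import numpy

def cartesian2(a, b):
    # Cartesian product of two 1-d arrays as a (len(a)*len(b), 2) array:
    # column 0 repeats each element of a len(b) times, while column 1
    # cycles through b.
    na, nb = len(a), len(b)
    out = numpy.empty((na * nb, 2), dtype=a.dtype)
    out[:, 0] = numpy.repeat(a, nb)
    out[:, 1] = numpy.resize(b, na * nb)
    return out

For the index generation, cartesian2(iaxis, jaxis) should produce the same
pairs as the cartesian() helper in the script at the end of this message.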
With it, here is the cProfile for take(a,b) (1000 iterations):

 2862 function calls in 6.907 CPU seconds

 ncalls  tottime  percall  cumtime  percall filename:lineno(function)
   1000    6.679    0.007    6.679    0.007 {method 'take' of 'numpy.ndarray' objects}
      1    0.138    0.138    0.138    0.138 {numpy.core.multiarray.array}
      1    0.059    0.059    6.907    6.907 prova.py:30(bench_take)

and here is the output from oprofile for the same benchmark:

Profiling through timer interrupt
samples  %        image name       symbol name
1360     83.0281  libc-2.3.6.so    memmove
167      10.1954  multiarray.so    PyArray_TakeFrom
11        0.6716  multiarray.so    .plt

and, if we replace memmove by memcpy:

Profiling through timer interrupt
samples  %        image name       symbol name
1307     82.0980  libc-2.3.6.so    memcpy
178      11.1809  multiarray.so    PyArray_TakeFrom
13        0.8166  multiarray.so    .plt

So, again, it seems like we have the same pattern as with the .clip()
method: the bottleneck is in calling memmove (the same would go for
memcpy) for every element in the array a that has to be taken.

A workaround for speeding this up (as suggested by Travis in his
excellent book) is to replace:

c = take(a, b)

by

c = a.flat[b]

Here are the cProfile results for the version using fancy indexing:

 862 function calls in 3.455 CPU seconds

 ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      1    3.300    3.300    3.455    3.455 prova.py:30(bench_take)
      1    0.132    0.132    0.132    0.132 {numpy.core.multiarray.array}
    257    0.018    0.000    0.018    0.000 {map}

which is 2x faster than the take approach.

The oprofile output for the fancy indexing approach:

samples  %        image name       symbol name
426      53.2500  multiarray.so    iter_subscript
277      34.6250  multiarray.so    DOUBLE_copyswap
9         1.1250  python2.5        PyString_FromFormatV

seems to tell us that memmove/memcpy are not called at all, but that the
DOUBLE_copyswap function is instead. This is in fact only apparent,
because if we look at the code of DOUBLE_copyswap (found in
arraytypes.inc.src):

@fname@_copyswap (void *dst, void *src, int swap, void *arr)
{

    if (src != NULL) /* copy first if needed */
        memcpy(dst, src, sizeof(@type@));

[where the numpy code generator is replacing @fname@ by DOUBLE]

we see that memcpy is called under the hood (I don't know why oprofile
is not able to detect this call anymore).

After looking at the function, and remembering what Charles Harris
said in a previous message about the convenience of using a simple type
specific assignment, I've ended up replacing the memcpy. Here is the
patch:

--- numpy/core/src/arraytypes.inc.src	(revision 3487)
+++ numpy/core/src/arraytypes.inc.src	(working copy)
@@ -997,11 +997,11 @@
 }

 static void
-@fname@_copyswap (void *dst, void *src, int swap, void *arr)
+@fname@_copyswap (@type@ *dst, @type@ *src, int swap, void *arr)
 {

     if (src != NULL) /* copy first if needed */
-        memcpy(dst, src, sizeof(@type@));
+        *dst = *src;

     if (swap) {
         register char *a, *b, c;

and after this, timings seem to improve a bit. With cProfile:

 862 function calls in 3.251 CPU seconds

 Ordered by: internal time, call count

 ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      1    3.092    3.092    3.251    3.251 prova.py:31(bench_take)
      1    0.135    0.135    0.135    0.135 {numpy.core.multiarray.array}
    257    0.018    0.000    0.018    0.000 {map}

which is around 6% faster. With oprofile:

samples  %        image name       symbol name
525      64.7349  multiarray.so    iter_subscript
186      22.9346  multiarray.so    DOUBLE_copyswap
8         0.9864  python2.5        PyString_FromFormatV

so, DOUBLE_copyswap seems around 50% faster (186 samples vs 277) now
due to the use of the type-specific assignment trick.
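For anybody who wants to reproduce the take() vs. fancy indexing
comparison without setting up a profiler, a plain timeit version along
these lines should do (just a sketch; shapes as above, with the index
generation simplified to randint instead of the cartesian product):

import timeit

setup = """
import numpy
a = numpy.random.randn(256, 4)
b = numpy.random.randint(0, a.size, (80, 256))
"""
# take(a, b) flattens a and copies element by element through copyswap;
# a.flat[b] exercises the iter_subscript path discussed here.
for stmt in ("numpy.take(a, b)", "a.flat[b]"):
    timer = timeit.Timer(stmt, setup)
    print stmt, min(timer.repeat(3, 1000))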
It seems to me that the above patch is safe, and besides, the complete test suite in numpy passes (in fact, it runs around a 6% faster), so perhaps it would be a nice thing to apply it. In this sense, it would be good to do a overhauling of the NumPy code so as to discover other places where this trick can be applied. There is still a long way to get an optimal replacement for the take() method (can iter_subscript be further optimised?), but this may help a bit. Cheers, -------------------------------------------------------------------------- import numpy niter = 1000 d0, d1 = (256, 4) #d0, d1 = (6, 4) def generate_data_2d(d0, d1): return numpy.random.randn(d0, d1) def cartesian(Lists): import operator if Lists: result = map(lambda I: (I,), Lists[0]) for list in Lists[1:]: curr = [] for item in list: new = map(operator.add, result, [(item,)]*len(result)) curr[len(curr):] = new result = curr else: result = [] return result def generate_data_indices(d0, d1, N1, N2): iaxis = numpy.random.randint(0, d0, N1) jaxis = numpy.random.randint(0, d1, N2) return numpy.array(cartesian([iaxis.tolist(), jaxis.tolist()])) def bench_take(): a = generate_data_2d(d0, d1) b = generate_data_indices(d0, d1, 80, 256) #b = generate_data_indices(d0, d1, 8, 2) for i in range(niter): #c = numpy.take(a, b) c = a.flat[b] # equivalent using fancy indexing if __name__ == '__main__': # test take import pstats import cProfile as prof profile_wanted = False if not profile_wanted: bench_take() else: profile_file = 'take.prof' prof.run('bench_take()', profile_file) stats = pstats.Stats(profile_file) stats.strip_dirs() stats.sort_stats('time', 'calls') stats.print_stats() -------------------------------------------------------------------------- -- >0,0< Francesc Altet ? ? http://www.carabos.com/ V V C?rabos Coop. V. ??Enjoy Data "-" From charlesr.harris at gmail.com Wed Dec 20 10:25:13 2006 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 20 Dec 2006 08:25:13 -0700 Subject: [Numpy-discussion] slow numpy.clip ? In-Reply-To: <4588FC54.7040106@ee.byu.edu> References: <45864074.6090203@ar.media.kyoto-u.ac.jp> <4586E3B4.2040609@hawaii.edu> <45877445.20508@ar.media.kyoto-u.ac.jp> <45879279.6030707@hawaii.edu> <4587A926.7070401@ar.media.kyoto-u.ac.jp> <4587B2EF.7010803@gmail.com> <4588901B.5030904@ee.byu.edu> <4588A055.5070707@ar.media.kyoto-u.ac.jp> <4588FC54.7040106@ee.byu.edu> Message-ID: On 12/20/06, Travis Oliphant wrote: > > > > My question is then: is there any plan to change this ? If not, is > > this > > for some reasons I don't see, or is this just because of lack of > > manpower ? > > > > > > I raised the possibility of breaking up the files before and Travis > > was agreeable to the idea. It is still in the back of my mind but I > > haven't got around to doing anything about it. Maybe we should put > > together a step by step approach, agree on some file names for the new > > files, fix the build so it loads in the new stub files in the correct > > order, and then start moving stuff. My own original desire was to > > break out the keyword parsers into a separate file but I think Travis > > had different priorities. > > The problem with separate files is (and has always been) the NumPy > C-API. I tried to use separate files to some extent (and then use > #include to make it all one big file). The C-API is exposed by filling > in a table of function pointers. 
You will notice that when
> arrayobject.h is included for an extension module, all of the C-API is
> defined to pull a particular function pointer out of a table that is
> stored in a Python CObject in the multiarray module extension itself.
> Basically, NumPy is following the standard Python advice (as Numeric and
> Numarray did) about how to expose a C-API, but it's just gotten a bit big.
>
> Solutions to that problem are always welcome.

I've been thinking about that a bit. One solution is to have a small
python program that takes all the pieces and writes one big build file;
I think something like that happens now. Another might be to use
includes in a base file; there is nothing sacred about not including .c
files or not putting code in .h files, it is just a convention, and we
could even choose another extension. I also wonder if we couldn't just
link in object files. The table of function pointers just needs some
addresses and, while the python convention of hiding all the function
names by using static functions is nice, it is probably not required.
Maybe we could use ctypes in some way?

I am not pushing any of these alternatives at the moment, just putting
them down. Maybe there are others?

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From jdhunter at ace.bsd.uchicago.edu  Wed Dec 20 10:40:49 2006
From: jdhunter at ace.bsd.uchicago.edu (John Hunter)
Date: Wed, 20 Dec 2006 09:40:49 -0600
Subject: [Numpy-discussion] Profiling numpy ? (parts written in C)
In-Reply-To: <4588D47D.6080206@ar.media.kyoto-u.ac.jp> (David Cournapeau's
	message of "Wed, 20 Dec 2006 15:13:17 +0900")
References: <458790E2.8040607@ar.media.kyoto-u.ac.jp>
	<200612191503.32995.faltet@carabos.com>
	<4588A2BE.4030801@ar.media.kyoto-u.ac.jp>
	<87ac1j9omt.fsf@peds-pc311.bsd.uchicago.edu>
	<4588D47D.6080206@ar.media.kyoto-u.ac.jp>
Message-ID: <873b7algn2.fsf@peds-pc311.bsd.uchicago.edu>

>>>>> "David" == David Cournapeau writes:

    David> Of this 300 ms spent in Colormap functor, 200 ms are taken
    David> by the take function: this is the function which I think
    David> can be speed up considerably.

Sorry I had missed this in the previous conversations. It is
impressive that take is taking such a big chunk of the __call__ time,
because there is a lot of other stuff going on in that function! You
might want to run this against numarray for comparison -- in a few
instances Travis has been able to find some big wins by borrowing from
numarray.

JDH

From tim.hochberg at ieee.org  Wed Dec 20 11:20:01 2006
From: tim.hochberg at ieee.org (Tim Hochberg)
Date: Wed, 20 Dec 2006 09:20:01 -0700
Subject: [Numpy-discussion] Type of 1st argument in Numexpr where()
In-Reply-To: <20061220120226.GC11105@tardis.terramar.selidor.net>
References: <20061220120226.GC11105@tardis.terramar.selidor.net>
Message-ID: <458962B1.9010005@ieee.org>

Ivan Vilata i Balaguer wrote:
> Hi all,
>
> I noticed that the set of ``where()`` functions defined by Numexpr all
> have a signature like ``xfxx``, i.e. the first argument is a float and
> the return, second and third arguments are of the same type (whatever it
> is).
>
> Since the first argument effectively represents a condition, wouldn't it
> make more sense for it to be a boolean?  Booleans are already supported
> by Numexpr, maybe the old signatures are just a legacy from the time
> when Numexpr didn't support them.
>
>
Actually, this is on purpose. Numpy.where (and most other switching
constructs in Python) will switch on almost anything.
In particular, any
number that is nonzero is considered True, zero is considered False. By
changing the signature, you're restricting where to accepting only
booleans. Since booleans and ints can be freely cast to doubles in
numexpr, always using float for the condition saves us a couple of opcodes.

[I just realized that numpy.where also handles complex conditions, and I
suspect that numexpr.where will refuse those. That should probably be
fixed at some point I suppose]

Anyway, in theory it would be more efficient to supply a separate boolean
version of the opcode in *addition* to the float version (and potentially
an int version as well although that is less compelling), since it would
save a cast. However, I'm always worried that increasing the opcode count
is going to slow down the numexpr interpreter, so I tend to push back on
those unless it's demonstrably a speed win.

regards

-tim

From strawman at astraw.com  Wed Dec 20 13:32:37 2006
From: strawman at astraw.com (Andrew Straw)
Date: Wed, 20 Dec 2006 10:32:37 -0800
Subject: [Numpy-discussion] Profiling numpy ? (parts written in C)
In-Reply-To: <200612201308.36419.faltet@carabos.com>
References: <458790E2.8040607@ar.media.kyoto-u.ac.jp>
	<200612191503.32995.faltet@carabos.com>
	<4588A19F.3050405@ar.media.kyoto-u.ac.jp>
	<200612201308.36419.faltet@carabos.com>
Message-ID: <458981C5.3070700@astraw.com>

I added a ticket for Francesc's enhancement:
http://projects.scipy.org/scipy/numpy/ticket/403

From ivilata at carabos.com  Wed Dec 20 13:51:55 2006
From: ivilata at carabos.com (Ivan Vilata i Balaguer)
Date: Wed, 20 Dec 2006 19:51:55 +0100
Subject: [Numpy-discussion] Type of 1st argument in Numexpr where()
In-Reply-To: <458962B1.9010005@ieee.org>
References: <20061220120226.GC11105@tardis.terramar.selidor.net>
	<458962B1.9010005@ieee.org>
Message-ID: <20061220185155.GE11105@tardis.terramar.selidor.net>

Tim Hochberg (on 2006-12-20 at 09:20:01 -0700) wrote::

> Actually, this is on purpose. Numpy.where (and most other switching
> constructs in Python) will switch on almost anything. In particular, any
> number that is nonzero is considered True, zero is considered False. By
> changing the signature, you're restricting where to accepting only
> booleans. Since booleans and ints can be freely cast to doubles in
> numexpr, always using float for the condition saves us a couple of opcodes.
> [...]

Yes, I understand the reasons you lay out here. Now that you've brought
the topic up, I'm curious about what "always using float for the
condition saves us a couple of opcodes" means. Could you explain this?
Just out of curiosity. :)

::

	Ivan Vilata i Balaguer   >qo<   http://www.carabos.com/
	       Cárabos Coop. V.  V  V   Enjoy Data
	                         ""

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 309 bytes
Desc: Digital signature
URL: 

From oliphant at ee.byu.edu  Wed Dec 20 13:58:23 2006
From: oliphant at ee.byu.edu (Travis Oliphant)
Date: Wed, 20 Dec 2006 11:58:23 -0700
Subject: [Numpy-discussion] Profiling numpy ? (parts written in C)
In-Reply-To: <200612201308.36419.faltet@carabos.com>
References: <458790E2.8040607@ar.media.kyoto-u.ac.jp>
	<200612191503.32995.faltet@carabos.com>
	<4588A19F.3050405@ar.media.kyoto-u.ac.jp>
	<200612201308.36419.faltet@carabos.com>
Message-ID: <458987CF.2050800@ee.byu.edu>

Francesc Altet wrote:

>seems to tell us that memmove/memcopy are not called at all, but
>instead the DOUBLE_copyswap function.
This is in fact an apparence, >because if we look at the code of DOUBLE_copyswap (found in >arraytypes.inc.src): > >@fname at _copyswap (void *dst, void *src, int swap, void *arr) >{ > > if (src != NULL) /* copy first if needed */ > memcpy(dst, src, sizeof(@type@)); > >[where the numpy code generator is replacing @fname@ by DOUBLE] > >we see that memcpy is called under the hood (I don't know why oprofile >is not able to detect this call anymore). > >After looking at the function, and remembering what Charles Harris >said in a previous message about the convenience to use a simple type >specific assignment, I've ended replacing the memcpy. Here it is the >patch: > >--- numpy/core/src/arraytypes.inc.src (revision 3487) >+++ numpy/core/src/arraytypes.inc.src (working copy) >@@ -997,11 +997,11 @@ > } > > static void >- at fname@_copyswap (void *dst, void *src, int swap, void *arr) >+ at fname@_copyswap (@type@ *dst, @type@ *src, int swap, void *arr) > { > > if (src != NULL) /* copy first if needed */ >- memcpy(dst, src, sizeof(@type@)); >+ *dst = *src; > > if (swap) { > register char *a, *b, c; > >and after this, timings seems to improve a bit. With CProfile: > > 862 function calls in 3.251 CPU seconds > > Ordered by: internal time, call count > > ncalls tottime percall cumtime percall filename:lineno(function) > 1 3.092 3.092 3.251 3.251 prova.py:31(bench_take) > 1 0.135 0.135 0.135 0.135 {numpy.core.multiarray.array} > 257 0.018 0.000 0.018 0.000 {map} > >which is around a 6% faster. With oprofile: > >samples % image name symbol name >525 64.7349 multiarray.so iter_subscript >186 22.9346 multiarray.so DOUBLE_copyswap >8 0.9864 python2.5 PyString_FromFormatV > >so, DOUBLE_copyswap seems around a 50% faster (186 samples vs 277) now >due to the use of the type specific assignment trick. > >It seems to me that the above patch is safe, and besides, the complete >test suite in numpy passes (in fact, it runs around a 6% faster), so >perhaps it would be a nice thing to apply it. In this sense, it would >be good to do a overhauling of the NumPy code so as to discover other >places where this trick can be applied. > > This is a good idea. We've used this trick in the general-purpose copying code. Compilers seem to do a better job of handling the direct assignment than using general-purpose memcpy. I suspect we should look at every use of memcpy and see if it can't be improved. -Travis From faltet at carabos.com Wed Dec 20 14:09:01 2006 From: faltet at carabos.com (Francesc Altet) Date: Wed, 20 Dec 2006 20:09:01 +0100 Subject: [Numpy-discussion] Profiling numpy ? (parts written in C) In-Reply-To: <458981C5.3070700@astraw.com> References: <458790E2.8040607@ar.media.kyoto-u.ac.jp> <200612201308.36419.faltet@carabos.com> <458981C5.3070700@astraw.com> Message-ID: <200612202009.02736.faltet@carabos.com> A Dimecres 20 Desembre 2006 19:32, Andrew Straw escrigu?: > I added a ticket for Francesc's enhancement: > http://projects.scipy.org/scipy/numpy/ticket/403 Thanks Andrew, but I realized that my patch is not safe for dealing with unaligned arrays (Sun machines would segfault). After thinking several alternatives, I've ended modifying the iter_subscript_* funtions instead (see the new patch below). For this, I've created a small function named assign_behaved() that only will get called when the arrays source and destination are well behaved (i.e. aligned and in native byteorder), and small enough so that optimizers can easily inline it (this is key so as to achieve the new speed-up). 
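For reference, the "well behaved" condition tested by the
PyArray_ISBEHAVED_RO macro corresponds roughly to what the array flags
expose at the Python level. A tiny sketch of the equivalent check, just
to illustrate what the fast path requires:

import numpy

def is_behaved_ro(a):
    # Aligned data in native byte order; the writeable requirement is
    # only part of PyArray_ISBEHAVED, not of the read-only variant.
    return a.flags.aligned and a.dtype.isnative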
The results are quite good, as one can achieve almost a 2x speedup over
the original functions. Here is the time for a.flat[b] (in original
numpy):

 862 function calls in 3.482 CPU seconds

 Ordered by: internal time, call count

 ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      1    3.325    3.325    3.482    3.482 prova.py:31(bench_take)
      1    0.133    0.133    0.133    0.133 {numpy.core.multiarray.array}
    257    0.017    0.000    0.017    0.000 {map}

and here with the new patch applied:

 862 function calls in 1.815 CPU seconds

 Ordered by: internal time, call count

 ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      1    1.662    1.662    1.815    1.815 prova.py:31(bench_take)
      1    0.131    0.131    0.131    0.131 {numpy.core.multiarray.array}
    257    0.016    0.000    0.016    0.000 {map}

We can compare this against the original take(a, b):

 2862 function calls in 7.030 CPU seconds

 Ordered by: internal time, call count

 ncalls  tottime  percall  cumtime  percall filename:lineno(function)
   1000    6.792    0.007    6.792    0.007 {method 'take' of 'numpy.ndarray' objects}
      1    0.142    0.142    0.142    0.142 {numpy.core.multiarray.array}
      1    0.063    0.063    7.030    7.030 prova.py:31(bench_take)

So, the iterator approach plus the patch is more than 4x faster.

Given these results, the iterator provided by Travis is becoming very
useful for dealing with a wider range of situations without losing
performance (or even gaining quite a lot, as in the above example).

Below is the patch. I've checked that it passes all the tests in numpy,
but still, maybe Travis could see if I forgot something important. Also,
it would be nice to look into other places in the code that can benefit
from the new assign_behaved() function.

Index: numpy/core/src/arrayobject.c
===================================================================
--- numpy/core/src/arrayobject.c	(revision 3487)
+++ numpy/core/src/arrayobject.c	(working copy)
@@ -8988,6 +8988,19 @@
     return self->size;
 }

+/* Specific function that accelerates the copy of some types through
+   assignments */
+static void
+assign_behaved(void *dest, void *src, size_t itemsize)
+{
+    switch (itemsize) {
+    case 1: *((npy_int8 *)dest) = *((npy_int8 *)src); break;
+    case 2: *((npy_int16 *)dest) = *((npy_int16 *)src); break;
+    case 4: *((npy_int32 *)dest) = *((npy_int32 *)src); break;
+    /* npy_float64 is more efficient than npy_int64 in assignments */
+    case 8: *((npy_float64 *)dest) = *((npy_float64 *)src); break;
+    default: memcpy(dest, src, itemsize); break;
+    }
+}

 static PyObject *
 iter_subscript_Bool(PyArrayIterObject *self, PyArrayObject *ind)
@@ -8996,7 +9009,7 @@
     intp count=0;
     char *dptr, *optr;
     PyObject *r;
-    int swap;
+    int swap, isbehaved;
     PyArray_CopySwapFunc *copyswap;
@@ -9038,6 +9051,9 @@
     swap = (PyArray_ISNOTSWAPPED(self->ao) != PyArray_ISNOTSWAPPED(r));
     while(index--) {
         if (*((Bool *)dptr) != 0) {
+            if (isbehaved)
+                assign_behaved(optr, self->dataptr, itemsize);
+            else
                 copyswap(optr, self->dataptr, swap, self->ao);
             optr += itemsize;
         }
@@ -9055,9 +9071,9 @@
     PyObject *r;
     PyArrayIterObject *ind_it;
     int itemsize;
-    int swap;
+    int swap, isbehaved;
     char *optr;
-    int index;  /* shouldn't be intp? 
*/ PyArray_CopySwapFunc *copyswap; itemsize = self->ao->descr->elsize; @@ -9092,6 +9108,7 @@ index = ind_it->size; copyswap = PyArray_DESCR(r)->f->copyswap; swap = (PyArray_ISNOTSWAPPED(r) != PyArray_ISNOTSWAPPED(self->ao)); + isbehaved = PyArray_ISBEHAVED(r) && PyArray_ISBEHAVED_RO(self->ao); while(index--) { num = *((intp *)(ind_it->dataptr)); if (num < 0) num += self->size; @@ -9106,7 +9123,10 @@ return NULL; } PyArray_ITER_GOTO1D(self, num); - copyswap(optr, self->dataptr, swap, r); + if (isbehaved) + assign_behaved(optr, self->dataptr, itemsize); + else + copyswap(optr, self->dataptr, swap, r); optr += itemsize; PyArray_ITER_NEXT(ind_it); } ---------------------------------------------------------------------- Cheers, -- >0,0< Francesc Altet ? ? http://www.carabos.com/ V V C?rabos Coop. V. ??Enjoy Data "-" From tim.hochberg at ieee.org Wed Dec 20 16:29:57 2006 From: tim.hochberg at ieee.org (Tim Hochberg) Date: Wed, 20 Dec 2006 14:29:57 -0700 Subject: [Numpy-discussion] Type of 1st argument in Numexpr where() In-Reply-To: <20061220185155.GE11105@tardis.terramar.selidor.net> References: <20061220120226.GC11105@tardis.terramar.selidor.net> <458962B1.9010005@ieee.org> <20061220185155.GE11105@tardis.terramar.selidor.net> Message-ID: <4589AB55.5080602@ieee.org> Ivan Vilata i Balaguer wrote: > Tim Hochberg (el 2006-12-20 a les 09:20:01 -0700) va dir:: > > >> Actually, this is on purpose. Numpy.where (and most other switching >> constructs in Python) will switch on almost anything. In particular, any >> number that is nonzero is considered True, zero is considered False. By >> changing the signature, you're restricting where to only accepting >> booleans. Since booleans and ints can by freely cast to doubles in >> numexpr, always using float for the condition saves us a couple of opcodes. >> [...] >> > > Yes, I understand the reasons you expose here. Nou you brought the > topic about, I'm curious about what does "always using float for the > condition saves us a couple of opcodes" mean. Could you explain this? > Just for curiosity. :) > Let's look at simpler than where, which is a confusing function. How about *sin*. Also, let's pretend complex numbers don't exist to make things still simpler. There is only a single *sin* function defined in the numexpr interpreter, and it operates on floats. This works because the numexpr compiler is smart enough to insert cast opcodes to convert boolean or integer types to floats before operating on the with the *sin* opcode which strictly works on floats (remember we are pretending complex numbers don't exist). The situation with the first argument to where is analogous. Booleans and ints are automagically promoted to floats. Since the opcode is designed to work on floats everything works great. And, we only need a single opcode to treat bools, ints and float. That is where "saving a couple of opcodes" comes in. However:: 1. Booleans are probably more common than floats as the argument to where. At present floats are the most efficient case; other cases incur some extra overhead due to casting. 2. It doesn't work for complex values. Problem #2 is easily fixable, should we so desire, simply by adding another opcode. Problem #1 is not so easy. It would be possible to adapt your original idea. We could do the following: 1. Add a function boolean() to the numexpr namespace. This would cast it's argument to an array of bools. 2. Tweak the compile (actually, probably where_func in expressions.py) to compile where(x,a,b) as where(bool(x),a,b) 3. 
Change where to take bools as the first argument.

Or, maybe it would be cleaner to instead change the casting rules so
that casting to bool happens automagically. Having cycles in the casting
rules frightens me a bit, but it could probably be made to work.

So, in summary, I think that the general idea you proposed could be made
to work with some more effort. Conceptually, it's cleaner and it could
be made more efficient for the common case. On the downside, this would
require three new opcodes, as opposed to a single new opcode to do the
simple-minded fix. So, I'm still a bit up in the air as to whether it's
a good idea.

-tim

From charlesr.harris at gmail.com  Wed Dec 20 17:22:54 2006
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Wed, 20 Dec 2006 15:22:54 -0700
Subject: [Numpy-discussion] Profiling numpy ? (parts written in C)
In-Reply-To: <200612201308.36419.faltet@carabos.com>
References: <458790E2.8040607@ar.media.kyoto-u.ac.jp>
	<200612191503.32995.faltet@carabos.com>
	<4588A19F.3050405@ar.media.kyoto-u.ac.jp>
	<200612201308.36419.faltet@carabos.com>
Message-ID: 

On 12/20/06, Francesc Altet wrote:
>
> On Wednesday 20 December 2006 at 03:36, David Cournapeau wrote:
> > Francesc Altet wrote:
> > > On Tuesday 19 December 2006 at 08:12, David Cournapeau wrote:
> > >> Hi,
> > >>
>
> @fname@_copyswap (void *dst, void *src, int swap, void *arr)
> {
>
>     if (src != NULL) /* copy first if needed */
>         memcpy(dst, src, sizeof(@type@));
>
> [where the numpy code generator is replacing @fname@ by DOUBLE]
>
> we see that memcpy is called under the hood (I don't know why oprofile
> is not able to detect this call anymore).
>
> After looking at the function, and remembering what Charles Harris
> said in a previous message about the convenience of using a simple type
> specific assignment, I've ended up replacing the memcpy. Here is the
> patch:
>
> --- numpy/core/src/arraytypes.inc.src	(revision 3487)
> +++ numpy/core/src/arraytypes.inc.src	(working copy)
> @@ -997,11 +997,11 @@
>  }
>
>  static void
> -@fname@_copyswap (void *dst, void *src, int swap, void *arr)
> +@fname@_copyswap (@type@ *dst, @type@ *src, int swap, void *arr)
>  {
>
>      if (src != NULL) /* copy first if needed */
> -        memcpy(dst, src, sizeof(@type@));
> +        *dst = *src;
>
>      if (swap) {
>          register char *a, *b, c;

We could get rid of the register keyword too; it is considered obsolete
these days. Also, for most architectures

#if SIZEOF_@fsize@ == 4
        b = a + 3;
        c = *a; *a++ = *b; *b-- = c;
        c = *a; *a++ = *b; *b = c;

will be notably slower than

#if SIZEOF_@fsize@ == 4
        c = a[0]; a[0] = a[3]; a[3] = c;
        c = a[1]; a[1] = a[2]; a[2] = c;

because loading the indexed addresses is a single instruction if a is in
a register. Inlining would also be good, but can be tricky and compiler
dependent. If all the code is in one big chunk, things aren't so bad and
a simple inline directive should do the trick. We would also want to
break the subroutine up into smaller pieces so that the common case was
inlined and the more complicated cases remained function calls.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From mjanikas at esri.com  Wed Dec 20 17:48:06 2006
From: mjanikas at esri.com (Mark Janikas)
Date: Wed, 20 Dec 2006 14:48:06 -0800
Subject: [Numpy-discussion] Newbie Question, Probability
Message-ID: <627102C921CD9745B070C3B10CB8199B010EBF67@hardwire.esri.com>

Hello all,

Is there a way to get probability values for the various families of
distributions in numpy?  I.e.
a la R:

> pnorm(1.96, mean = 0 , sd = 1)
[1] 0.9750021   # for the normal
> pt(1.65, df=100)
[1] 0.9489597   # for student t

Any suggestions would be greatly appreciated.

Mark Janikas
Product Engineer
ESRI, Geoprocessing
380 New York St.
Redlands, CA 92373
909-793-2853 (2563)
mjanikas at esri.com

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From tom.denniston at alum.dartmouth.org  Wed Dec 20 18:02:41 2006
From: tom.denniston at alum.dartmouth.org (Tom Denniston)
Date: Wed, 20 Dec 2006 17:02:41 -0600
Subject: [Numpy-discussion] A question about argmax and argsort
In-Reply-To: 
References: 
Message-ID: 

If you want the n largest items, I would recommend quicksort, but at
each partition you only recurse into the side of the pivot that has the
values you care about. This is easy to determine because you know how
many items are on either side of the pivot and you know that you want
the nth item. This makes it take N + N/2 + N/4 + ... time, which
telescopes to 2N, i.e. O(N), rather than the O(N log N) of sorting
algorithms. I've found empirically this beats the pants off quicksorting
and taking the nth value.

I don't know of a way to do this in numpy. I think it would require
adding a C function to numpy. Perhaps an "argnth" function?

Does anyone else know of an existing mechanism?

On 12/12/06, asaf david wrote:
> Hello
> Let's say i have an N sized array, and i want to get the positions of the K
> largest items. for K = 1 this is simply argmax. is there any way to
> generalize it for k !=1? currently I use argsort and take only K items from
> it, but I'm paying an additional ~lg(N)...
>
> Thanks in advance, asaf
>
>
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>
>
>

From pgmdevlist at gmail.com  Wed Dec 20 18:17:54 2006
From: pgmdevlist at gmail.com (Pierre GM)
Date: Wed, 20 Dec 2006 18:17:54 -0500
Subject: [Numpy-discussion] A question about argmax and argsort
In-Reply-To: 
References: 
Message-ID: <200612201817.54428.pgmdevlist@gmail.com>

On Wednesday 20 December 2006 18:02, Tom Denniston wrote:
> If you want the n largest items, I would recommend quicksort
...
> I don't know of a way to do this in numpy. I think it would require
> adding a C function to numpy. Perhaps an "argnth" function?
>
> Does anyone else know of an existing mechanism?

Is it really needed when you have argsort ?

>>> x=N.array([1,3,5,2,4])
>>> ax=N.argsort(x)
>>> ax
array([0, 3, 1, 4, 2])
>>> x[ax[0]], x[ax[-1]], x[ax[-3]]
(1, 5, 3)

Or am I once again missing the point entirely ?

From robert.kern at gmail.com  Wed Dec 20 18:30:40 2006
From: robert.kern at gmail.com (Robert Kern)
Date: Wed, 20 Dec 2006 17:30:40 -0600
Subject: [Numpy-discussion] Newbie Question, Probability
In-Reply-To: <627102C921CD9745B070C3B10CB8199B010EBF67@hardwire.esri.com>
References: <627102C921CD9745B070C3B10CB8199B010EBF67@hardwire.esri.com>
Message-ID: <4589C7A0.6030008@gmail.com>

Mark Janikas wrote:
> Hello all,
>
> Is there a way to get probability values for the various families of
> distributions in numpy?  I.e. a la R:

We have a full complement of PDFs, CDFs, etc. in scipy.
In [1]: from scipy import stats In [2]: stats.norm.pdf(1.96, loc=0.0, scale=1.0) Out[2]: array(0.058440944333451469) In [3]: stats.norm.cdf(1.96, loc=0.0, scale=1.0) Out[3]: array(0.97500210485177952) -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From robert.kern at gmail.com Wed Dec 20 18:34:32 2006 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 20 Dec 2006 17:34:32 -0600 Subject: [Numpy-discussion] A question about argmax and argsort In-Reply-To: <200612201817.54428.pgmdevlist@gmail.com> References: <200612201817.54428.pgmdevlist@gmail.com> Message-ID: <4589C888.9010604@gmail.com> Pierre GM wrote: > On Wednesday 20 December 2006 18:02, Tom Denniston wrote: >> If you want the n largest item i would recommend quicksort > ... >> I don't know of a way to do this in numpy. I think it would require >> adding a cfunction to numpy. Perhaps an "argnth" function? >> >> Does anyone else know of an existing mechanism? > > Is it really needed when you have argsort ? >>>> x=N.array([1,3,5,2,4]) >>>> ax=N.argsort(x) >>>> ax > array([0, 3, 1, 4, 2]) >>>> x[ax[0]], x[ax[-1]], x[ax-3]] > 1, 5, 3 > > Or am I once again missing the point entirely ? There are algorithms that can be faster if you can ignore the bulk of the irrelevant data. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From gnchen at mac.com Wed Dec 20 19:44:07 2006 From: gnchen at mac.com (Gennan Chen) Date: Wed, 20 Dec 2006 16:44:07 -0800 Subject: [Numpy-discussion] PyArray_DIMS problem Message-ID: <4F7A1316-B04F-4C3C-929B-B2420F97A139@mac.com> Hi! I have problem with this function call under FC6 X86_64 for my own numpy extension printf("\n %d %d %d", PyArray_DIM(imgi,0),PyArray_DIM(imgi, 1),PyArray_DIM(imgi,2)) it gave me 166 256 256 if I tried: int *dim; dim = PyArray_DIMS(imgi) printf("\n %d %d %d", dim[0], dim[1], dim[2]); it gave me 166 0 256 Numpy version: In [2]: numpy.__version__ Out[2]: '1.0.2.dev3487' I did test it under OS X 10.4.8 on MacPro. Those two methods gave me the exact results. So, what happens here ?? Gen-Nan Chen -------------- next part -------------- An HTML attachment was scrubbed... URL: From oliphant at ee.byu.edu Wed Dec 20 20:25:23 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Wed, 20 Dec 2006 18:25:23 -0700 Subject: [Numpy-discussion] PyArray_DIMS problem In-Reply-To: <4F7A1316-B04F-4C3C-929B-B2420F97A139@mac.com> References: <4F7A1316-B04F-4C3C-929B-B2420F97A139@mac.com> Message-ID: <4589E283.7000603@ee.byu.edu> Gennan Chen wrote: > Hi! > > I have problem with this function call under FC6 X86_64 for my own > numpy extension > > printf("\n %d %d %d", > PyArray_DIM(imgi,0),PyArray_DIM(imgi,1),PyArray_DIM(imgi,2)) > > it gave me > > 166 256 256 > > if I tried: > > int *dim; > dim = PyArray_DIMS(imgi) > printf("\n %d %d %d", dim[0], dim[1], dim[2]); > > it gave me 166 0 256 > > Numpy version: > > In [2]: numpy.__version__ > Out[2]: '1.0.2.dev3487' > > I did test it under OS X 10.4.8 on MacPro. Those two methods gave me > the exact results. So, what happens here ?? > No idea. You should try PyArray_DIMS(imgi)[0], PyArray_DIMS(imgi)[1], PyArray_DIMS(imgi)[2] and see what that does. 
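(A likely explanation, sketched from the Python side under the assumption of
a little-endian 64-bit platform like FC6 x86_64: npy_intp is 8 bytes there
while int is 4, so reading the dimensions member through an int* sees each
64-bit value as a pair of 32-bit words:

import numpy as N

# what the C-level dimensions member holds: three npy_intp values
dims = N.array([166, 256, 256], dtype=N.intp)

# the same bytes reinterpreted as 32-bit ints -- each value is
# followed by its zero high word, which is what an int* walks over
print dims.view(N.int32)[:3]    # -> [166   0 256]

This reproduces the reported "166 0 256" exactly, so declaring the pointer
as npy_intp *dim rather than int *dim should make both methods agree.)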
-Travis

From aisaac at american.edu  Wed Dec 20 20:41:29 2006
From: aisaac at american.edu (Alan G Isaac)
Date: Wed, 20 Dec 2006 20:41:29 -0500
Subject: [Numpy-discussion] Newbie Question, Probability
In-Reply-To: <4589C7A0.6030008@gmail.com>
References: <627102C921CD9745B070C3B10CB8199B010EBF67@hardwire.esri.com><4589C7A0.6030008@gmail.com>
Message-ID:

On Wed, 20 Dec 2006, Robert Kern apparently wrote:
> We have a full complement of PDFs, CDFs, etc. in scipy.

This is my "most missed" functionality in NumPy.
(For now I feel I cannot ask students to install SciPy.)
Although it is a slippery slope, and I definitely do not
want NumPy to slide down it, I would certainly not complain
if this basic functionality were moved to NumPy...

Cheers,
Alan Isaac

From cjw at sympatico.ca  Wed Dec 20 18:11:24 2006
From: cjw at sympatico.ca (Colin J. Williams)
Date: Wed, 20 Dec 2006 18:11:24 -0500
Subject: [Numpy-discussion] sum of two arrays with different shape?
In-Reply-To: <4ff46d8f0612171826q262231f0kc3d5f2e8070046f9@mail.gmail.com>
References: <4ff46d8f0612171826q262231f0kc3d5f2e8070046f9@mail.gmail.com>
Message-ID:

zhang yunfeng wrote:
> Hi, I'm a newbie to Numpy.
>
> When reading the tutorial at
> http://www.scipy.org/Tentative_NumPy_Tutorial
> I found a snippet about the addition of two arrays with different
> shapes. Does it make sense? If the array shapes are not the same, why
> doesn't it throw an error?
>
I'm not sure what the rules are but this example throws an error, which
it should.

[Dbg]>>> x= N.array([1, 2, 3])
[Dbg]>>> y= x+[1, 2, 3, 4]
Traceback (most recent call last):
  File "", line 1, in
ValueError: shape mismatch: objects cannot be broadcast to a single shape
[Dbg]>>>

> see the code below (taken from the above webpage)
> array a.shape is (4,) and y.shape is (3,4) and a+y ?
>
> -------------------------------------------
> >>> y = arange(12)
> >>> y
> array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11])
> >>> y.shape = 3,4    # does not modify the total number of elements
> >>> y
> array([[ 0, 1, 2, 3],
>        [ 4, 5, 6, 7],
>        [ 8, 9, 10, 11]])
>
> It is possible to operate with arrays of different dimensions as long
> as they fit well.
> >>> 3*a    # multiply each element of a by 3
> array([ 30, 60, 90, 120])
> >>> a+y    # sum a to each row of y
> array([[10, 21, 32, 43],
>        [14, 25, 36, 47],
>        [18, 29, 40, 51]])
> --------------------------------------------
>
This seems a reasonable operation.

Colin W.

> --
> http://my.opera.com/zhangyunfeng
>
> ------------------------------------------------------------------------
>
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion

From lists.steve at arachnedesign.net  Wed Dec 20 21:35:44 2006
From: lists.steve at arachnedesign.net (Steve Lianoglou)
Date: Wed, 20 Dec 2006 21:35:44 -0500
Subject: [Numpy-discussion] Newbie Question, Probability
In-Reply-To:
References: <627102C921CD9745B070C3B10CB8199B010EBF67@hardwire.esri.com><4589C7A0.6030008@gmail.com>
Message-ID: <3FA28F05-9CCA-4F67-B1EF-2D2AF92D4BBA@arachnedesign.net>

On Dec 20, 2006, at 8:41 PM, Alan G Isaac wrote:
> On Wed, 20 Dec 2006, Robert Kern apparently wrote:
>> We have a full complement of PDFs, CDFs, etc. in scipy.
>
> This is my "most missed" functionality in NumPy.
> (For now I feel I cannot ask students to install SciPy.)

If they're already installing numpy, isn't 98% of the work already
done at that point?
I'm pretty sure you can install scipy w/o the fortran dependency if you're not concerned about speed and what not, right? It should be a pretty easy install. Besides ... what else do they have to do during the first week-and-a- half of the semester anyway, right? ;-) -steve From haase at msg.ucsf.edu Wed Dec 20 22:43:28 2006 From: haase at msg.ucsf.edu (Sebastian Haase) Date: Wed, 20 Dec 2006 19:43:28 -0800 Subject: [Numpy-discussion] PyArray_DIMS problem In-Reply-To: <4F7A1316-B04F-4C3C-929B-B2420F97A139@mac.com> References: <4F7A1316-B04F-4C3C-929B-B2420F97A139@mac.com> Message-ID: On 12/20/06, Gennan Chen wrote: > Hi! > > > I have problem with this function call under FC6 X86_64 for my own numpy > extension > > > printf("\n %d %d %d", > PyArray_DIM(imgi,0),PyArray_DIM(imgi,1),PyArray_DIM(imgi,2)) > > > it gave me > > > 166 256 256 > > > if I tried: > > > int *dim; > dim = PyArray_DIMS(imgi) > printf("\n %d %d %d", dim[0], dim[1], dim[2]); > > > it gave me 166 0 256 > Hi - maybe I'm dense here - but how is this /supposed/ to work ? Is PyArray_DIMS allocating some memory that never gets freed !? I thought "tuples" in C had to always be passed into a function, so that that function could modify it, as in: const int maxNDim = 20; int dim[maxNDim]; PyArray_DIMS(imgi, dim); What am I missing ... ? -Sebastian -------------- next part -------------- An HTML attachment was scrubbed... URL: From gnchen at mac.com Wed Dec 20 23:48:58 2006 From: gnchen at mac.com (Gennan Chen) Date: Wed, 20 Dec 2006 20:48:58 -0800 Subject: [Numpy-discussion] PyArray_DIMS problem In-Reply-To: References: <4F7A1316-B04F-4C3C-929B-B2420F97A139@mac.com> Message-ID: Here is the definition of that call from ndarrayobject.h #define PyArray_DIMS(obj) (((PyArrayObject *)(obj))->dimensions) I believe the memory has been allocated. It just return a pointer. Gen On Dec 20, 2006, at 7:43 PM, Sebastian Haase wrote: > > > On 12/20/06, Gennan Chen wrote: > Hi! > > > I have problem with this function call under FC6 X86_64 for my own > numpy extension > > > printf("\n %d %d %d", PyArray_DIM(imgi,0),PyArray_DIM(imgi, > 1),PyArray_DIM(imgi,2)) > > > it gave me > > > 166 256 256 > > > if I tried: > > > int *dim; > dim = PyArray_DIMS(imgi) > printf("\n %d %d %d", dim[0], dim[1], dim[2]); > > > it gave me 166 0 256 > > > Hi - > maybe I'm dense here - > but how is this /supposed/ to work ? Is PyArray_DIMS allocating > some memory that never gets freed !? > I thought "tuples" in C had to always be passed into a function, > so that that function could modify it, as in: > > const int maxNDim = 20; > int dim[maxNDim]; > PyArray_DIMS(imgi, dim); > > > What am I missing ... ? > -Sebastian > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From peridot.faceted at gmail.com Wed Dec 20 23:59:47 2006 From: peridot.faceted at gmail.com (A. M. Archibald) Date: Wed, 20 Dec 2006 23:59:47 -0500 Subject: [Numpy-discussion] Newbie Question, Probability In-Reply-To: References: <627102C921CD9745B070C3B10CB8199B010EBF67@hardwire.esri.com> <4589C7A0.6030008@gmail.com> Message-ID: On 20/12/06, Alan G Isaac wrote: > On Wed, 20 Dec 2006, Robert Kern apparently wrote: > > We have a full complement of PDFs, CDFs, etc. in scipy. > > This is my "most missed" functionality in NumPy. > (For now I feel cannot ask students to install SciPy.) 
> Although it is a slippery slope, and I definitely do not > want NumPy to slide down it, I would certainly not complain > if this basic functionaltiy were moved to NumPy... This is silly. If it were up to me I would rip out much of the fancy features from numpy and put them in scipy. It's really not very difficult to install, particularly if you don't much care how fast it is, or are using (say) a Linux distribution that packages it. It seems to me that numpy should include only tools for basic calculations on arrays of numbers. The ufuncs, simple wrappers (dot, for example). Anything that requires nontrivial amounts of math (matrix inversion, statistical functions, generating random numbers from exponential distributions, and so on) should go in scipy. If numpy were to satisfy everyone who says, "I like numpy, but I wish it included [their favourite feature from scipy] because I don't want to install scipy", numpy would grow to include everything in scipy. Perhaps an alternative criterion would be "it can go in numpy if it has no external requirements". I think this is a mistake, since it means users have a monstrous headache figuring out what is in which package (for example, some of scipy.integrate depends on external tools and some does not). Moreover it damages the performance of numpy. For example, dot would be faster (for arrays that happen to be matrix-shaped, and possibly in general) if it could use ATLAS' routine from BLAS. Of course, numpy is currently fettered by the need to maintain some sort of compatibility with Numeric and numarray; shortly it will have to worry about compatibility with previous versions of numpy as well. A. M. Archibald From svetosch at gmx.net Thu Dec 21 09:30:31 2006 From: svetosch at gmx.net (Sven Schreiber) Date: Thu, 21 Dec 2006 15:30:31 +0100 Subject: [Numpy-discussion] Automatic matrices In-Reply-To: <45845609.40705@gmx.net> References: <1166189241.17633.27.camel@localhost.localdomain> <458323C5.7010808@gmx.net> <1166228497.17633.61.camel@localhost.localdomain> <45845609.40705@gmx.net> Message-ID: <458A9A87.2000302@gmx.net> Sven Schreiber schrieb: > Keith Goodman schrieb: > >> There are many numpy functions that will take a matrix as input but >> return an array. >> >> The nan functions (nanmin, nanmax, nanargmin, nanargmax, nansum) are an example. >> > > So that would be a bug IMHO and should be filed as a ticket. I will do > that eventually if nobody stops me first... > This is now ticket #405. Keith, you said "example", do you know of any other functions? Maybe you could add a comment to the ticket. cheers, sven From svetosch at gmx.net Thu Dec 21 10:10:17 2006 From: svetosch at gmx.net (Sven Schreiber) Date: Thu, 21 Dec 2006 16:10:17 +0100 Subject: [Numpy-discussion] Newbie Question, Probability In-Reply-To: References: <627102C921CD9745B070C3B10CB8199B010EBF67@hardwire.esri.com> <4589C7A0.6030008@gmail.com> Message-ID: <458AA3D9.6060805@gmx.net> A. M. Archibald schrieb: > On 20/12/06, Alan G Isaac wrote: >> This is my "most missed" functionality in NumPy. >> (For now I feel cannot ask students to install SciPy.) >> Although it is a slippery slope, and I definitely do not >> want NumPy to slide down it, I would certainly not complain >> if this basic functionaltiy were moved to NumPy... ... > If numpy were to satisfy everyone who says, "I like numpy, but I wish > it included [their favourite feature from scipy] because I don't want > to install scipy", numpy would grow to include everything in scipy. 
> Well my package manager just reported something like 800K for numpy and 20M for scipy, so I think we're not quite at the point of numpy taking over everything yet (if those numbers are actually meaningful, probably I'm missing something ?). I would also welcome if some functionality could be moved to numpy if the size requirements are reasonably small. Currently I try to avoid to depend on the scipy package to make my programs more portable, and I'm mostly successful, but not always. The p-value stuff in numpy would be helpful here, as Alan already said. Now I don't know if that stuff passes the size criterion, some expert would know that. But if it does, it would be nice if you could consider moving it over eventually. Of course you need to strike a balance, and the optimum is debatable. But again, if scipy is really more than 20 times the size of numpy, and some frequently used things are not in numpy, is there really an urgent need to freeze numpy's set of functionality? just a user's thought, sven From kwgoodman at gmail.com Thu Dec 21 11:13:54 2006 From: kwgoodman at gmail.com (Keith Goodman) Date: Thu, 21 Dec 2006 08:13:54 -0800 Subject: [Numpy-discussion] Automatic matrices In-Reply-To: <458A9A87.2000302@gmx.net> References: <1166189241.17633.27.camel@localhost.localdomain> <458323C5.7010808@gmx.net> <1166228497.17633.61.camel@localhost.localdomain> <45845609.40705@gmx.net> <458A9A87.2000302@gmx.net> Message-ID: On 12/21/06, Sven Schreiber wrote: > Sven Schreiber schrieb: > > Keith Goodman schrieb: > > > >> There are many numpy functions that will take a matrix as input but > >> return an array. > >> > >> The nan functions (nanmin, nanmax, nanargmin, nanargmax, nansum) are an example. > >> > > > > So that would be a bug IMHO and should be filed as a ticket. I will do > > that eventually if nobody stops me first... > > > > This is now ticket #405. Keith, you said "example", do you know of any > other functions? Maybe you could add a comment to the ticket. I'll look for more examples. How about diag? >> x matrix([[ 0.82553498, 0.89115156], [ 0.106748 , 0.21844565]]) >> M.diag(x) array([ 0.82553498, 0.21844565]) >> M.asmatrix(M.diag(x)).T matrix([[ 0.82553498], [ 0.21844565]]) >> From kwgoodman at gmail.com Thu Dec 21 11:30:17 2006 From: kwgoodman at gmail.com (Keith Goodman) Date: Thu, 21 Dec 2006 08:30:17 -0800 Subject: [Numpy-discussion] Newbie Question, Probability In-Reply-To: References: <627102C921CD9745B070C3B10CB8199B010EBF67@hardwire.esri.com> <4589C7A0.6030008@gmail.com> Message-ID: On 12/20/06, A. M. Archibald wrote: > Moreover it damages the performance of > numpy. For example, dot would be faster (for arrays that happen to be > matrix-shaped, and possibly in general) if it could use ATLAS' routine > from BLAS. I thought numpy uses ATLAS. Matrix multiplication in numpy is about as fast as in Octave. So it must be using ATLAS. From mjanikas at esri.com Thu Dec 21 11:43:31 2006 From: mjanikas at esri.com (Mark Janikas) Date: Thu, 21 Dec 2006 08:43:31 -0800 Subject: [Numpy-discussion] Newbie Question, Probability In-Reply-To: <458AA3D9.6060805@gmx.net> Message-ID: <627102C921CD9745B070C3B10CB8199B010EBF69@hardwire.esri.com> Thanks for all the input so far. The only thing that seems odd about the omission of probability or quantile functions in NumPy is that all the random number generators are present in RandomArray. At any rate, hopefully this bit of functionality will be present in the future, but for now, IMO the library is awesome..... 
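(Those generators do offer a crude scipy-free stopgap, by the way: estimate
the probability by simulation. A rough sketch using only numpy's own random
module -- good to a few digits at best, with error on the order of
1/sqrt(n):

import numpy as N

n = 1000000
z = N.random.standard_normal(n)
t = N.random.standard_t(100, size=n)

print (z < 1.96).mean()    # ~0.975, cf. R's pnorm(1.96)
print (t < 1.65).mean()    # ~0.949, cf. R's pt(1.65, df=100)

No substitute for real CDFs, but it needs nothing beyond numpy.)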
I am used to using R for math routines, and all my sparse matrix stuff is
WAAAAAAY faster using the Python-NumPy Combo!

Thanks to all for their insight,

MJ

From faltet at carabos.com  Thu Dec 21 11:45:02 2006
From: faltet at carabos.com (Francesc Altet)
Date: Thu, 21 Dec 2006 17:45:02 +0100
Subject: [Numpy-discussion] Newbie Question, Probability
In-Reply-To:
References: <627102C921CD9745B070C3B10CB8199B010EBF67@hardwire.esri.com>
Message-ID: <200612211745.03602.faltet@carabos.com>

On Thursday 21 December 2006 05:59, A. M. Archibald wrote:
> On 20/12/06, Alan G Isaac wrote:
> > On Wed, 20 Dec 2006, Robert Kern apparently wrote:
> > > We have a full complement of PDFs, CDFs, etc. in scipy.
> >
> > This is my "most missed" functionality in NumPy.
> > (For now I feel I cannot ask students to install SciPy.)
> > Although it is a slippery slope, and I definitely do not
> > want NumPy to slide down it, I would certainly not complain
> > if this basic functionality were moved to NumPy...
>
> This is silly.
>
> If it were up to me I would rip out much of the fancy features from
> numpy and put them in scipy. It's really not very difficult to
> install, particularly if you don't much care how fast it is, or are
> using (say) a Linux distribution that packages it.
>
> It seems to me that numpy should include only tools for basic
> calculations on arrays of numbers. The ufuncs, simple wrappers (dot,
> for example). Anything that requires nontrivial amounts of math
> (matrix inversion, statistical functions, generating random numbers
> from exponential distributions, and so on) should go in scipy.
>
> If numpy were to satisfy everyone who says, "I like numpy, but I wish
> it included [their favourite feature from scipy] because I don't want
> to install scipy", numpy would grow to include everything in scipy.
>
> Perhaps an alternative criterion would be "it can go in numpy if it
> has no external requirements". I think this is a mistake, since it
> means users have a monstrous headache figuring out what is in which
> package (for example, some of scipy.integrate depends on external
> tools and some does not). Moreover it damages the performance of
> numpy. For example, dot would be faster (for arrays that happen to be
> matrix-shaped, and possibly in general) if it could use ATLAS' routine
> from BLAS.
>
> Of course, numpy is currently fettered by the need to maintain some
> sort of compatibility with Numeric and numarray; shortly it will have
> to worry about compatibility with previous versions of numpy as well.

I agree with most of the arguments above, so +1

--
>0,0<   Francesc Altet   http://www.carabos.com/
V V   Cárabos Coop. V.   Enjoy Data
 "-"

From aisaac at american.edu  Thu Dec 21 12:23:41 2006
From: aisaac at american.edu (Alan G Isaac)
Date: Thu, 21 Dec 2006 12:23:41 -0500
Subject: [Numpy-discussion] Newbie Question, Probability
In-Reply-To: <200612211745.03602.faltet@carabos.com>
References: <627102C921CD9745B070C3B10CB8199B010EBF67@hardwire.esri.com><200612211745.03602.faltet@carabos.com>
Message-ID:

On Thursday 21 December 2006 05:59, A. M. Archibald wrote:
> It seems to me that numpy should include only tools for
> basic calculations on arrays of numbers. The ufuncs,
> simple wrappers (dot, for example). Anything that requires
> nontrivial amounts of math (matrix inversion, statistical
> functions, generating random numbers from exponential
> distributions, and so on) should go in scipy.

As a user, I suggest that this becomes a reasonable goal
when up to date SciPy installers are maintained for all
target platforms. Unless you wish to exclude everyone who
is intimidated when installation is less than trivial...

Until then, I suggest, the question of the proper
functionality bundle with NumPy remains open. Of course as
a user I do not pretend to resolve such a question---recall
that I mentioned the slippery slope in my post---but I do
object to it being dismissed as "silly" when I offered
a straightforward explanation.

It is well understood that the current view of the
developers is that, if anything, too much is already in NumPy.
Any user comments are taking place within that context.

Alan Isaac

PS A question: is it a good thing if more students start
using NumPy *now*? It looks to me like building community
size is an important current goal for NumPy. Strip it down
like you suggest and, aside from Windows users (and Macs are
increasingly popular among my students), you'll have only the
few who are not intimidated by building SciPy (which still
has no installer for Python 2.5).
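(For readers in the same boat, the normal CDF at least is small enough to
carry around in pure Python. A minimal sketch assuming only the standard
library -- norm_cdf is just an illustrative name, and the constants are the
Abramowitz & Stegun 7.1.26 rational approximation of erf, good to about
1.5e-7:

import math

def norm_cdf(x, mean=0.0, sd=1.0):
    # normal CDF via the A&S 7.1.26 approximation of erf
    z = (x - mean) / (sd * math.sqrt(2.0))
    t = 1.0 / (1.0 + 0.3275911 * abs(z))
    poly = t * (0.254829592 + t * (-0.284496736 + t * (1.421413741
           + t * (-1.453152027 + t * 1.061405429))))
    e = 1.0 - poly * math.exp(-z * z)    # erf(|z|)
    if z < 0.0:
        e = -e
    return 0.5 * (1.0 + e)

print norm_cdf(1.96)    # 0.97500..., matching R's pnorm(1.96)

The Student t CDF is harder -- it needs the incomplete beta function, which
is exactly the kind of special-function code that lives in scipy.)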
From faltet at carabos.com Thu Dec 21 12:52:37 2006 From: faltet at carabos.com (Francesc Altet) Date: Thu, 21 Dec 2006 18:52:37 +0100 Subject: [Numpy-discussion] ANN: PyTables 1.4 released Message-ID: <200612211852.38035.faltet@carabos.com> =========================== Announcing PyTables 1.4 =========================== PyTables is a library for managing hierarchical datasets and designed to efficiently cope with extremely large amounts of data with support for full 64-bit file addressing. It is based on the HDF5 library for doing the I/O and leverages the numarray/NumPy/Numeric packages so as to deliver the data to the user in convenient in-memory containers. This is a new major release of PyTables, and probably the last major one of the 1.x series (i.e. with numarray at the core). On it, we have implemented better code to deal with table buffers, enhanced the capability for reading native HDF5 files, enhanced support for 64-bit platforms (but not with Python 2.5: see ``Special Warning`` section below), better support for AIX, optional automatic parent creation and the traditional amount of bug fixes. Go to the PyTables web site for downloading the beast: http://www.pytables.org/ or keep reading for more info about the new features and bugs fixed. Changes more in depth ===================== Improvements: - Table buffers code refactored: now each Row read iterator has its own buffers, completely independent of their table (although write iterators still share a single buffer in the same table). This separation makes the logic of buffering much more clear and less prone to errors (in fact, some of them have been solved). Performance and memory consumption are more or less equal than before. - When flushing the complete file (i.e. when calling File.flush()), only the buffers of those nodes that are alive (i.e. referenced from user code) are actually flushed. This brings much better efficiency (and also stability) to situations where one has to flush (and hence, close) files with many nodes on it. - Better support for AIX by renaming the internal LONLONG_MAX C constant (it was used internally by the xlc compiler). Thanks to Brian Granger for the report. - Added optional automatic parent creation support during node creation, copying and moving operations. See the release notes for more information. - Improved support for Python2.4 and 64-bit platforms (but beware, there are still known issues when using Python2.5 in combination with 64-bit platforms). Thanks to Gerard Vermeulen for his patches for Win64 platforms. - Implemented a workaround for a leak present in numarray --> Numeric conversions when using the array protocol, as can be seen in: http://comments.gmane.org/gmane.comp.python.numeric.general/12563 The workaround can potentially be far slower than the array protocol (because a copy of the arrays is always made), but at least the new code doesn't leak anymore. Bug fixes: - Previously, when the size for memory compounds type was less than the size of the type on disk (for example, when one have padding or aligned fields), PyTables was unable to read info on them. This has been fixed. This allows reading general compound types in HDF5 files written with other tools than PyTables. - When many tables with indexed columns were created simultaneously, a bug make PyTables to crash. This has been fixed (for more info, see bug #26). - Fixed a typo in the code that prevented recognizing complex data in non-PyTables files. - Table.createIndex() now refuses to index complex columns. 
- Now, it is possible to index several nested columns that hangs from the same column parent. Fixes bug #24. - Fixed a typo in nctoh5 utility that prevented using filters properly. Thanks to Lou Wicker for reporting this. - When setting/appending an array in-memory to an Array (or descendant) object and they have mismatched byteorders, the array was set/appended without being byteswapped first. This has been fixed. Thanks to Elias Collas for the report. Deprecated features: - None Backward-incompatible changes: - Please, see ``RELEASE-NOTES.txt`` file. Special Warning for Python 2.5 and 64-bit platforms users ========================================================= Unfortunately, and due to problems with the combination numarray 1.5.2, Python2.5 and 64-bit platforms, PyTables cannot be safely used yet in such scenario. This will be solved either when numarray can address this issue (hopefully with numarray 1.5.3), or when PyTables 2.x series (with NumPy at its core) will be out. Important note for Windows users ================================ If you are willing to use PyTables with Python 2.4 or 2.5 in Windows platforms, you will need to get the HDF5 library compiled for MSVC 7.1, aka .NET 2003. It can be found at: ftp://ftp.ncsa.uiuc.edu/HDF/HDF5/current/bin/windows/5-165-win-net.ZIP Users of Python 2.3 on Windows will have to download the version of HDF5 compiled with MSVC 6.0 available in: ftp://ftp.ncsa.uiuc.edu/HDF/HDF5/current/bin/windows/5-165-win.ZIP Platforms ========= This version has been extensively checked on quite a few platforms, like Linux on Intel32 (Pentium), Win on Intel32 (Pentium), Linux on Intel64 (Itanium2), FreeBSD on AMD64 (Opteron), Linux on PowerPC (and PowerPC64) and MacOSX on PowerPC. For other platforms, chances are that the code can be easily compiled and run without further issues. Please, contact us in case you are experiencing problems. Resources ========= Go to the PyTables web site for more details: http://www.pytables.org About the HDF5 library: http://hdf.ncsa.uiuc.edu/HDF5/ About numarray: http://www.stsci.edu/resources/software_hardware/numarray About NumPy: http://numpy.scipy.org/ To know more about the company behind the PyTables development, see: http://www.carabos.com/ Acknowledgments =============== Thanks to various the users who provided feature improvements, patches, bug reports, support and suggestions. See the ``THANKS`` file in the distribution package for a (incomplete) list of contributors. Many thanks also to SourceForge who have helped to make and distribute this package! And last but not least, a big thank you to Acusim (http://www.acusim.com/) for sponsoring many of the job done for releasing this version of PyTables. Share your experience ===================== Let us know of any bugs, suggestions, gripes, kudos, etc. you may have. ---- **Enjoy data!** -- The PyTables Team From kwgoodman at gmail.com Thu Dec 21 13:28:40 2006 From: kwgoodman at gmail.com (Keith Goodman) Date: Thu, 21 Dec 2006 10:28:40 -0800 Subject: [Numpy-discussion] Assignment when there is nothing to assign Message-ID: I have the following two lines in a for loop: if len(idx): x[idx,i] = x[idx,i-1] where idx is the output of a where statement. I need the 'if len(idx)' line to prevent an error when idx is empty. Would it make sense to allow x[idx,i] = x[idx,i-1] when idx is empty instead of raising an error? 
The error I get is

ValueError: array is not broadcastable to correct shape

From peridot.faceted at gmail.com  Thu Dec 21 13:41:51 2006
From: peridot.faceted at gmail.com (A. M. Archibald)
Date: Thu, 21 Dec 2006 13:41:51 -0500
Subject: [Numpy-discussion] Newbie Question, Probability
In-Reply-To:
References: <627102C921CD9745B070C3B10CB8199B010EBF67@hardwire.esri.com>
	<200612211745.03602.faltet@carabos.com>
Message-ID:

On 21/12/06, Alan G Isaac wrote:
> On Thursday 21 December 2006 05:59, A. M. Archibald wrote:
> > It seems to me that numpy should include only tools for
> > basic calculations on arrays of numbers. The ufuncs,
> > simple wrappers (dot, for example). Anything that requires
> > nontrivial amounts of math (matrix inversion, statistical
> > functions, generating random numbers from exponential
> > distributions, and so on) should go in scipy.
>
> As a user, I suggest that this becomes a reasonable goal
> when up to date SciPy installers are maintained for all
> target platforms. Unless you wish to exclude everyone who
> is intimidated when installation is less than trivial...
>
> Until then, I suggest, the question of the proper
> functionality bundle with NumPy remains open. Of course as
> a user I do not pretend to resolve such a question---recall
> that I mentioned the slippery slope in my post---but I do
> object to it being dismissed as "silly" when I offered
> a straightforward explanation.
>
> It is well understood that the current view of the
> developers is that, if anything, too much is already in NumPy.
> Any user comments are taking place within that context.

Just to be clear: I am not a developer. I am a user who is frustrated
with the difficulty of telling whether to look for a given feature in
numpy or in scipy. (I have also never really had much difficulty
installing scipy either from the packages in one of several linux
distributions or compiling it from scratch.)

I suppose the basic difference of opinions here is that I think numpy
has already taken too many steps down the slippery slope. Also I don't
think 20 megabytes is enough disk space to care about, and I think it
is better in the long term to encourage the scipy developers to get
the installers working than it is to jam all kinds of scientific
functionality into this array package to avoid having to install the
scientific computing package.

> PS A question: is it a good thing if more students start
> using NumPy *now*? It looks to me like building community
> size is an important current goal for NumPy. Strip it down
> like you suggest and, aside from Windows users (and Macs are
> increasingly popular among my students), you'll have only the
> few who are not intimidated by building SciPy (which still
> has no installer for Python 2.5).

I didn't have to build scipy (though I have, it's not hard), and I
don't use Windows. But no, I don't think it can be stripped down yet;
the backward compatibility issue is currently important. I think
moving scientific functionality from scipy to numpy is a step in the
wrong direction, though.

A. M. Archibald

From kwgoodman at gmail.com  Thu Dec 21 14:17:20 2006
From: kwgoodman at gmail.com (Keith Goodman)
Date: Thu, 21 Dec 2006 11:17:20 -0800
Subject: [Numpy-discussion] Assignment when there is nothing to assign
In-Reply-To:
References:
Message-ID:

On 12/21/06, Keith Goodman wrote:
> I have the following two lines in a for loop:
>
> if len(idx):
>     x[idx,i] = x[idx,i-1]
>
> where idx is the output of a where statement.
>
> I need the 'if len(idx)' line to prevent an error when idx is empty.
>
> Would it make sense to allow
>
> x[idx,i] = x[idx,i-1]
>
> when idx is empty instead of raising an error?
>
> The error I get is
>
> ValueError: array is not broadcastable to correct shape

Why doesn't this work?

>> y
matrix([[ 0.93473209,  0.34122426],
        [ 0.68353656,  0.77589206],
        [ 0.50677768,  0.25089722]])
>> idx = M.array([1, 2])
>> y[idx,0]   <------ this works
matrix([[ 0.68353656],
        [ 0.50677768]])
>> y[idx,0] = y[idx,1]   <------ but this doesn't
---------------------------------------------------------------------------
exceptions.ValueError   Traceback (most recent call last)
ValueError: array is not broadcastable to correct shape

But this works:

>> idx = M.asarray(M.asmatrix(idx).T)
>> y[idx,0] = y[idx,1]

From svetosch at gmx.net  Thu Dec 21 15:33:50 2006
From: svetosch at gmx.net (Sven Schreiber)
Date: Thu, 21 Dec 2006 21:33:50 +0100
Subject: [Numpy-discussion] Automatic matrices
In-Reply-To:
References: <1166189241.17633.27.camel@localhost.localdomain>
	<458323C5.7010808@gmx.net> <1166228497.17633.61.camel@localhost.localdomain>
	<45845609.40705@gmx.net> <458A9A87.2000302@gmx.net>
Message-ID: <458AEFAE.4040609@gmx.net>

Keith Goodman schrieb:

> How about diag?

There was a thread about this (in which you participated, I believe);
for matrices you should now use m.diagonal() I think. So diag doesn't
qualify.

-sven

From oliphant at ee.byu.edu  Thu Dec 21 16:10:37 2006
From: oliphant at ee.byu.edu (Travis Oliphant)
Date: Thu, 21 Dec 2006 14:10:37 -0700
Subject: [Numpy-discussion] Newbie Question, Probability
In-Reply-To: <627102C921CD9745B070C3B10CB8199B010EBF69@hardwire.esri.com>
References: <627102C921CD9745B070C3B10CB8199B010EBF69@hardwire.esri.com>
Message-ID: <458AF84D.1060400@ee.byu.edu>

Mark Janikas wrote:
> Thanks for all the input so far. The only thing that seems odd about
> the omission of probability or quantile functions in NumPy is that all
> the random number generators are present in RandomArray.

A big part of the issue is that getting many of those pdfs into NumPy
would require putting many special functions into NumPy (some of which
are actually coded in Fortran).
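(To make the dependency concrete -- a small sketch, assuming scipy is
importable: the scipy.stats distributions are thin layers over compiled
special functions such as these,

from scipy import special

print special.ndtr(1.96)          # standard normal CDF: 0.9750...
print special.stdtr(100, 1.65)    # Student t CDF, df=100: 0.9489...

so moving the distributions over without also moving the special-function
machinery underneath them isn't really an option.)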
I much prefer to make SciPy an easy install for as many people as possible and/or work on breaking up SciPy into modular components that can be installed separately if needed. This was my original intention --- to make NumPy as small as possible. It's current size is driven by backwards compatibility, only. -Travis From Chris.Barker at noaa.gov Thu Dec 21 16:08:15 2006 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Thu, 21 Dec 2006 13:08:15 -0800 Subject: [Numpy-discussion] Newbie Question, Probability In-Reply-To: References: <627102C921CD9745B070C3B10CB8199B010EBF67@hardwire.esri.com> <200612211745.03602.faltet@carabos.com> Message-ID: <458AF7BF.9020109@noaa.gov> A key thing to remember here is that each user has their particular set of "small things" that are all they need from scipy -- put us all together, and you have SciPy -- that's what it is for. > As a user, I suggest that this becomes a reasonable goal > when up to date SciPy installers are maintained for all > target platforms. All it takes is someone to do it. Also, there was talk of "modularizing" scipy so that it would be easy to install only those bits you need -- in particular, the non-Fortran stuff should be trivial to build. > Macs are > increasingly popular among my students It can be a pain to build this kind of thing on OS-X, as Apple has not supported a Fortran compiler yet, but it can (and has) been done. IN fact, the Mac is a great target for pre-built binaries as there is only a small variety of hardware to support, and Apple supplies LAPACK/BLAS libs with the system. As for distributing it, the archive at: pythonmac.org/packages takes submissions from anyone -- just send a note to the pythonmac list -- that list is a great help in figuring out how to build stuff too. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From pgmdevlist at gmail.com Thu Dec 21 16:13:17 2006 From: pgmdevlist at gmail.com (Pierre GM) Date: Thu, 21 Dec 2006 16:13:17 -0500 Subject: [Numpy-discussion] Newbie Question, Probability In-Reply-To: <458AF84D.1060400@ee.byu.edu> References: <627102C921CD9745B070C3B10CB8199B010EBF69@hardwire.esri.com> <458AF84D.1060400@ee.byu.edu> Message-ID: <200612211613.18347.pgmdevlist@gmail.com> On Thursday 21 December 2006 16:10, Travis Oliphant wrote: > I much prefer to make SciPy an easy install for as many people as > possible and/or work on breaking up SciPy into modular components that > can be installed separately if needed. Talking about that, what happened to these projects of modular installation of scipy ? Robert promised us last month to explain what went wrong with his approach, but never had the time... From robert.kern at gmail.com Thu Dec 21 16:43:28 2006 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 21 Dec 2006 15:43:28 -0600 Subject: [Numpy-discussion] Newbie Question, Probability In-Reply-To: <200612211613.18347.pgmdevlist@gmail.com> References: <627102C921CD9745B070C3B10CB8199B010EBF69@hardwire.esri.com> <458AF84D.1060400@ee.byu.edu> <200612211613.18347.pgmdevlist@gmail.com> Message-ID: <458B0000.8070909@gmail.com> Pierre GM wrote: > On Thursday 21 December 2006 16:10, Travis Oliphant wrote: > >> I much prefer to make SciPy an easy install for as many people as >> possible and/or work on breaking up SciPy into modular components that >> can be installed separately if needed. 
> > Talking about that, what happened to these projects of modular installation > of scipy ? Robert promised us last month to explain what went wrong with his > approach, but never had the time... I created a module (scipy_subpackages.py, IIRC) next to setup.py that essentially just served as a global configuration to inform all of the setup.py's what subpackages they were supposed to build (mostly just Lib/setup.py, actually). I then had a script run through the various collections of subpackages that I wanted to build, set the appropriate values in scipy_subpackages, and run setup() with the appropriate parameters to build an egg for each collection. However, build/ apparently needs to be cleaned out between each egg, otherwise you contaminate later eggs. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From kwgoodman at gmail.com Thu Dec 21 17:00:55 2006 From: kwgoodman at gmail.com (Keith Goodman) Date: Thu, 21 Dec 2006 14:00:55 -0800 Subject: [Numpy-discussion] Automatic matrices In-Reply-To: <458AEFAE.4040609@gmx.net> References: <1166189241.17633.27.camel@localhost.localdomain> <458323C5.7010808@gmx.net> <1166228497.17633.61.camel@localhost.localdomain> <45845609.40705@gmx.net> <458A9A87.2000302@gmx.net> <458AEFAE.4040609@gmx.net> Message-ID: On 12/21/06, Sven Schreiber wrote: > Keith Goodman schrieb: > > > How about diag? > > > > There was a thread about this (in which you participated, I believe); > for matrices you should now use m.diagonal() I think. So diag doesn't > qualify. I think the different results returned by x.diagonal and M.diagonal(x) is confusing: >> x matrix([[-0.87175207, 1.57394765], [-1.7135918 , -1.5183181 ]]) >> x.diagonal() matrix([[-0.87175207, -1.5183181 ]]) <-----matrix >> M.diagonal(x) array([-0.87175207, -1.5183181 ]) <-----array 321 def diagonal(a, offset=0, axis1=0, axis2=1): 322 """diagonal(a, offset=0, axis1=0, axis2=1) returns the given diagonals 323 defined by the last two dimensions of the array. 324 """ 325 return asarray(a).diagonal(offset, axis1, axis2) Maybe this asarray could be changed to asanyarray? From svetosch at gmx.net Fri Dec 22 04:47:33 2006 From: svetosch at gmx.net (Sven Schreiber) Date: Fri, 22 Dec 2006 10:47:33 +0100 Subject: [Numpy-discussion] Newbie Question, Probability In-Reply-To: <458B0000.8070909@gmail.com> References: <627102C921CD9745B070C3B10CB8199B010EBF69@hardwire.esri.com> <458AF84D.1060400@ee.byu.edu> <200612211613.18347.pgmdevlist@gmail.com> <458B0000.8070909@gmail.com> Message-ID: <458BA9B5.6050709@gmx.net> Robert Kern schrieb: > Pierre GM wrote: >> Talking about that, what happened to these projects of modular installation >> of scipy ? Robert promised us last month to explain what went wrong with his >> approach, but never had the time... > > I created a module (scipy_subpackages.py, IIRC) next to setup.py that > essentially just served as a global configuration to inform all of the > setup.py's what subpackages they were supposed to build (mostly just > Lib/setup.py, actually). I then had a script run through the various collections > of subpackages that I wanted to build, set the appropriate values in > scipy_subpackages, and run setup() with the appropriate parameters to build an > egg for each collection. > > However, build/ apparently needs to be cleaned out between each egg, otherwise > you contaminate later eggs. 
>

So, to put it "pointedly" (if that's the right word...?):
Numpy should not get small functions from scipy -> because the size of
scipy doesn't matter -> because scipy's modules will be installable as
add-ons separately (and because there will be ready-to-use installers);
however, nobody knows how to actually do that in practice ?

Well that will convince my colleagues!

Please don't feel offended, I just want to make the point (as usual)
that this way numpy is going to be a good library for other software
projects, but not super-attractive for direct users (aka "matlab
converts", although I personally don't come from matlab).

cheers,
sven

From gael.varoquaux at normalesup.org  Fri Dec 22 05:14:12 2006
From: gael.varoquaux at normalesup.org (Gael Varoquaux)
Date: Fri, 22 Dec 2006 11:14:12 +0100
Subject: [Numpy-discussion] Newbie Question, Probability
In-Reply-To: <458BA9B5.6050709@gmx.net>
References: <627102C921CD9745B070C3B10CB8199B010EBF69@hardwire.esri.com>
	<458AF84D.1060400@ee.byu.edu> <200612211613.18347.pgmdevlist@gmail.com>
	<458B0000.8070909@gmail.com> <458BA9B5.6050709@gmx.net>
Message-ID: <20061222101410.GA22958@clipper.ens.fr>

On Fri, Dec 22, 2006 at 10:47:33AM +0100, Sven Schreiber wrote:
> Please don't feel offended, I just want to make the point (as usual)
> that this way numpy is going to be a good library for other software
> projects, but not super-attractive for direct users (aka "matlab
> converts", although I personally don't come from matlab).

I think that the equivalent of MatLab is more scipy than numpy. I think
there is a misunderstanding here.

Gaël

From ivilata at carabos.com  Fri Dec 22 06:27:40 2006
From: ivilata at carabos.com (Ivan Vilata i Balaguer)
Date: Fri, 22 Dec 2006 12:27:40 +0100
Subject: [Numpy-discussion] Type of 1st argument in Numexpr where()
In-Reply-To: <4589AB55.5080602@ieee.org>
References: <20061220120226.GC11105@tardis.terramar.selidor.net>
	<458962B1.9010005@ieee.org> <20061220185155.GE11105@tardis.terramar.selidor.net>
	<4589AB55.5080602@ieee.org>
Message-ID: <20061222112740.GF11105@tardis.terramar.selidor.net>

Tim Hochberg (on 2006-12-20 at 14:29:57 -0700) wrote::

> Let's look at something simpler than where, which is a confusing
> function. How about *sin*.
> [...]

Ok, I think I already get the idea about the need of adding extra
opcodes if ``where()`` only accepted booleans as first arguments.
Thanks for the explanation! :) ::

> It would be possible to adapt your original idea. We could do the following:
>
>    1. Add a function boolean() to the numexpr namespace. This would cast
>       its argument to an array of bools.
>    2. Tweak the compile (actually, probably where_func in
>       expressions.py) to compile where(x,a,b) as where(bool(x),a,b)
>    3. Change where to take bools as the first argument.
>
> Or, maybe it would be cleaner to instead change the casting rules so
> that casting to bool happens automagically. Having cycles in the casting
> rules frightens me a bit, but it could probably be made to work.

I understand that the new ``boolean()`` function and the "downwards"
casting to boolean are functionally equivalent and require the same
number of new opcodes. However, if cycles in casting rules are frowned
upon (though I don't see any problem at first sight), I would opt for
the fully explicit ``boolean()`` function. ::

> So, in summary, I think that the general idea you proposed could be made
> to work with some more effort. Conceptually, it's cleaner and it could
> be made more efficient for the common case.
On the downside, this would > require three new opcodes, as opposed to a single new opcode to do the > simple minded fix. So, I'm still a bit up in the air as to whether it's > a good idea. Well, my previous patch works without adding new opcodes; however, one is no longer able to use ``where(x, a, b)`` with ``x`` being something other than a boolean array, so one should use ``where(x != 0, a, b)``, which is more explicit about its meaning. Nonetheless, I see using non-booleans as boolean conditions as an idiom, and I don't know how frequently that feature will be used in numerical computations. So I'm also in the air about the idea. ;) Cheers, :: Ivan Vilata i Balaguer >qo< http://www.carabos.com/ C?rabos Coop. V. V V Enjoy Data "" -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 309 bytes Desc: Digital signature URL: From Chris.Barker at noaa.gov Fri Dec 22 12:08:59 2006 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Fri, 22 Dec 2006 09:08:59 -0800 Subject: [Numpy-discussion] Newbie Question, Probability In-Reply-To: <458BA9B5.6050709@gmx.net> References: <627102C921CD9745B070C3B10CB8199B010EBF69@hardwire.esri.com> <458AF84D.1060400@ee.byu.edu> <200612211613.18347.pgmdevlist@gmail.com> <458B0000.8070909@gmail.com> <458BA9B5.6050709@gmx.net> Message-ID: <458C112B.4090400@noaa.gov> Sven Schreiber wrote: > So, to put it "pointedly" (if that's the right word...?): > Numpy should not get small functions from scipy -> because the size of > scipy doesn't matter -> because scipy's modules will be installable as > add-ons separately (and because there will be ready-to-use installers); > however, nobody knows how to actually do that in practice ? > I just want to make the point (as usual) > that this way numpy is going to be a good library for other software > projects, but not super-attractive for direct users (aka "matlab > converts" No one is denying that there is still work to do. I also think there are a lot of us (from "just users" to the major contributers), that WOULD like to see an easy to install package that can do everything (and more, and better) that MATLAB can. The only questions are: A) How best to accomplish this (or work toward it anyway)? B) Who's going to do the work? As for (A), I think the consensus is pretty clear -- keep numpy focused on the basic array package, with some extras for backwards compatibility, and work towards a SciPy that has all the bells and whistles, preferably installable as separate packages (like Matlab "toolboxes" I suppose). As for the above comments, if we are looking at the "Matlab converts", or more to the point, people looking for a comprehensive scientific/engineering computation package: -- "The size of SciPy doesn't matter" -- True, after all, how big is MATLAB? -- "Scipy's modules will be installable as add-ons separately" -- this is a good goal, and I think there has been progress there. -- "Nobody knows how to actually do that in practice" -- well, it's not so much that nobody knows how to do it, as that nobody has done it -- it's going to take work, but adding extra stuff to numpy takes work too, it's a matter of where you're going to focus the work. Given the size of disk drives and the speed of Internet connections these days, I'm not sure it's that important to have the "core" part of SciPy very small -- but it does need to have easy installers. 
That approach provides opportunity though -- breaking SciPy down into smaller packages requires expertise and consensus among the developers. However, building an installer requires only one person to take the time to do it. Yes, SciPy is too hard to build and install for an average newbie -- but it's gotten better, and it's not too hard for a savvy user that is willing to put some time it. The kind of person who is willing to put the time in to post to discussions on this group, for instance. Please, rather than complaining that core developers aren't putting your personally desired "small function" into numpy , just take the time to build the installer you need -- we need one for OS-X, build it and put it up on pythonmac -- it's not that hard, and there are a lot of people here and on the scipy and python-mac lists that will help. Now my rant: Please, please, please, could at least a few of the people that build packages take the time to make a simple installer for them and put them up on the web somewhere? Now a question: One of the key difficulties to building SciPy is that parts of it depend on Fortran and LAPACK. We'd all like LAPACK to be built to support our particular hardware for best performance. However, would it be that hard to have Scipy build by default with generic LAPACK (kind of like numpy), and put installers built that way up on the web, along with instructions for those that want to re-build and optimize? For that matter, is it possible (let's say on Windows) to deliver SciPy with a set of multiple dlls for LAPACK/BLAS, and have the appropriate ones chosen at install or run time? -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From kwgoodman at gmail.com Fri Dec 22 12:27:07 2006 From: kwgoodman at gmail.com (Keith Goodman) Date: Fri, 22 Dec 2006 09:27:07 -0800 Subject: [Numpy-discussion] repr of bool matrix Message-ID: Bool matrices and arrays don't line up when you display them because False has 5 letters and True has 4. After several columns it can become impossible to tell which column an element belongs to. Could the repr of bool matrices print 'True ' instead of 'True'? Some truths are more true than others. From robert.kern at gmail.com Fri Dec 22 16:40:39 2006 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 22 Dec 2006 15:40:39 -0600 Subject: [Numpy-discussion] slow numpy.clip ? In-Reply-To: References: <45864074.6090203@ar.media.kyoto-u.ac.jp> <4586E3B4.2040609@hawaii.edu> <45877445.20508@ar.media.kyoto-u.ac.jp> <45879279.6030707@hawaii.edu> <4587A926.7070401@ar.media.kyoto-u.ac.jp> <4587B2EF.7010803@gmail.com> <4588901B.5030904@ee.byu.edu> <4588A055.5070707@ar.media.kyoto-u.ac.jp> <4588FC54.7040106@ee.byu.edu> Message-ID: <458C50D7.7080207@gmail.com> Charles R Harris wrote: > I've been thinking about that a bit. One solution is to have a small > python program that takes all the pieces and writes one big build file, > I think something like that happens now. Another might be to use > includes in a base file; there is nothing sacred about not including .c > files or not putting code in .h files, it is just a convention, we could > even chose another extension. I also wonder if we couldn't just link in > object files. 
The table of function pointers just needs some addresses > and, while the python convention of hiding all the function names by > using static functions is nice, it is probably not required. Maybe we > could use ctypes in some way? > > I am not pushing any of these alternatives at the moment, just putting > them down. Maybe there are others? None that I want to think about. #including separate .c files, leaving the extension alone, is best, IMO. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From robert.kern at gmail.com Fri Dec 22 18:14:58 2006 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 22 Dec 2006 17:14:58 -0600 Subject: [Numpy-discussion] Newbie Question, Probability In-Reply-To: <458BA9B5.6050709@gmx.net> References: <627102C921CD9745B070C3B10CB8199B010EBF69@hardwire.esri.com> <458AF84D.1060400@ee.byu.edu> <200612211613.18347.pgmdevlist@gmail.com> <458B0000.8070909@gmail.com> <458BA9B5.6050709@gmx.net> Message-ID: <458C66F2.9020206@gmail.com> Sven Schreiber wrote: > So, to put it "pointedly" (if that's the right word...?): > Numpy should not get small functions from scipy -> because the size of > scipy doesn't matter -> because scipy's modules will be installable as > add-ons separately (and because there will be ready-to-use installers); > however, nobody knows how to actually do that in practice ? Rather, to put it accurately, numpy should not get large chunks of scipy functionality that require FORTRAN dependencies for reasons that should be obvious from that description. scipy.stats.distributions is just such a chunk. The ancillary point is that I think that, for those who do find the largeness and difficult-to-installness of scipy onerous, the best path forward is to work on the build process of scipy. And it will take *work* not wishes nor complaints nor tags. And honestly, the more I see the latter, the less motivated I am to bother with the former. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From robert.kern at gmail.com Fri Dec 22 18:45:34 2006 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 22 Dec 2006 17:45:34 -0600 Subject: [Numpy-discussion] Newbie Question, Probability In-Reply-To: References: <627102C921CD9745B070C3B10CB8199B010EBF67@hardwire.esri.com><200612211745.03602.faltet@carabos.com> Message-ID: <458C6E1E.7070602@gmail.com> Alan G Isaac wrote: > PS A question: is it a good thing if more students start > using NumPy *now*? It looks to me like building community > size is an important current goal for NumPy. Strip it down > like you suggest and aside from Windows users (and Macs are > increasingly popular among my students) you'll have only the > few that are not intimidated by building SciPy (which still > has no intaller for Python 2.5). You mean a Windows installer? Yes, it does. http://sourceforge.net/project/showfiles.php?group_id=27747 -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco

From aisaac at american.edu Fri Dec 22 19:05:14 2006
From: aisaac at american.edu (Alan G Isaac)
Date: Fri, 22 Dec 2006 19:05:14 -0500
Subject: [Numpy-discussion] Newbie Question, Probability
In-Reply-To: <458C6E1E.7070602@gmail.com> References: <627102C921CD9745B070C3B10CB8199B010EBF67@hardwire.esri.com><200612211745.03602.faltet@carabos.com> <458C6E1E.7070602@gmail.com>
Message-ID:

> Alan G Isaac wrote:
>> Strip it down like you suggest and, aside from Windows users (and Macs are increasingly popular among my students), you'll have only the few that are not intimidated by building SciPy (which still has no installer for Python 2.5).

On Fri, 22 Dec 2006, Robert Kern apparently wrote:
> You mean a Windows installer? Yes, it does.
> http://sourceforge.net/project/showfiles.php?group_id=27747

1. No, I meant a Mac installer. Sorry that was unclear. And let me be clear that I understand that if I really want one for my students, I should learn how to build one. (And if I get a moment to breathe, I'd like to learn how.) My point was only that lack of availability does have implications.

2. Re: the other message, I was not aware that moving scipy.stats.distributions into NumPy would complicate the NumPy build process (which is currently delightfully easy).

Thank you,
Alan Isaac

From oliphant at ee.byu.edu Sat Dec 23 00:12:47 2006
From: oliphant at ee.byu.edu (Travis Oliphant)
Date: Fri, 22 Dec 2006 22:12:47 -0700
Subject: [Numpy-discussion] Newbie Question, Probability
In-Reply-To: <458AF7BF.9020109@noaa.gov> References: <627102C921CD9745B070C3B10CB8199B010EBF67@hardwire.esri.com> <200612211745.03602.faltet@carabos.com> <458AF7BF.9020109@noaa.gov>
Message-ID: <458CBACF.7000508@ee.byu.edu>

Christopher Barker wrote:
> It can be a pain to build this kind of thing on OS-X, as Apple has not supported a Fortran compiler yet, but it can (and has) been done. In fact, the Mac is a great target for pre-built binaries, as there is only a small variety of hardware to support, and Apple supplies LAPACK/BLAS libs with the system. As for distributing it, the archive at:

I'm always confused about how to distribute something like SciPy for the Mac. What exactly should be distributed? Is it possible to use distutils to get it done?

I'd love to provide Mac binaries of SciPy / NumPy. Right now, we rely on people like Chris for that.

-Travis

From oliphant at ee.byu.edu Sat Dec 23 01:26:45 2006
From: oliphant at ee.byu.edu (Travis Oliphant)
Date: Fri, 22 Dec 2006 23:26:45 -0700
Subject: [Numpy-discussion] Newbie Question, Probability
In-Reply-To: <458BA9B5.6050709@gmx.net> References: <627102C921CD9745B070C3B10CB8199B010EBF69@hardwire.esri.com> <458AF84D.1060400@ee.byu.edu> <200612211613.18347.pgmdevlist@gmail.com> <458B0000.8070909@gmail.com> <458BA9B5.6050709@gmx.net>
Message-ID: <458CCC25.7050507@ee.byu.edu>

Sven Schreiber wrote:
> Robert Kern schrieb:
>> Pierre GM wrote:
>
> So, to put it "pointedly" (if that's the right word...?): Numpy should not get small functions from scipy -> because the size of scipy doesn't matter -> because scipy's modules will be installable as add-ons separately (and because there will be ready-to-use installers); however, nobody knows how to actually do that in practice?
> > Please don't feel offended, I just want to make the point (as usual) > that this way numpy is going to be a good library for other software > projects, but not super-attractive for direct users (aka "matlab > converts", although I personally don't come from matlab). > Don't worry about offending, we all recognize the weaknesses. I think you are pointing out something most of us already see. It is the combination of SciPy+NumPy+Matplotlib+IPython (+ perhaps a good IDE) that can succeed at being a MATLAB/IDL replacement for a lot of people. NumPy by itself isn't usually enough, and it's also important to keep NumPy as a library that can be used for other development. What is also needed is a good "package" of it all --- like the Enthon distribution. This requires quite a bit of thankless work. Enthought has done quite a bit in this direction for Windows but they have not had client demand to get it wrapped up for other platforms. I think people are hoping that eggs will help here but it hasn't come to fruition. This is an area that SciPy could really use someone stepping up and taking charge. I like the discussions that have taken place regarding documentation issues, as well as the work that went into making all of SciPy compile with gfortran (though perhaps not bug free...). These are important steps that are much appreciated. Best regards, -Travis From robert.kern at gmail.com Sat Dec 23 01:30:30 2006 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 23 Dec 2006 00:30:30 -0600 Subject: [Numpy-discussion] Newbie Question, Probability In-Reply-To: <458CBACF.7000508@ee.byu.edu> References: <627102C921CD9745B070C3B10CB8199B010EBF67@hardwire.esri.com> <200612211745.03602.faltet@carabos.com> <458AF7BF.9020109@noaa.gov> <458CBACF.7000508@ee.byu.edu> Message-ID: <458CCD06.5040205@gmail.com> Travis Oliphant wrote: > I'm always confused about how to distribute something like SciPy for the > MAC. What exactly should be distributed? Is it possible to use > distutils to get it done? To get a package format that is actually useful (bdist_dumb just doesn't cut it on any platform, really), you need to install something else. I prefer building eggs instead of mpkgs. So this is what I do: Install setuptools. I then create a script (I usually call it ./be) in the numpy/ and scipy/ directories to hold all of the options I use for building: #!/bin/sh pythonw2.5 -c "import setuptools; execfile('setup.py')" build_src build_clib --fcompiler=gnu95 build_ext --fcompiler=gnu95 build "$@" Then, to build an egg: $ ./be bdist_egg You can then upload it to the Package Index (maybe. I had trouble uploading the Windows scipy binary that Gary Pajer sent me. I suspect that the Index rejected it because it was too large). Here are the outstanding issues as I see them: * Using scipy requires that the FORTRAN runtime libraries that you compiled against be installed in the appropriate place, i.e. /usr/local/lib. This is annoying, since there are currently only tarballs available, so the user needs root access to install them. If an enterprising individual wants to make this situation better, he might try to make a framework out of the necessary libraries such that we can simply link against those. Frameworks are easier to install to different locations with less hassle. http://hpc.sourceforge.net/ * g77 does not work with the Universal Python build process, so we are stuck with gfortran. * The GNU FORTRAN compilers that are available are architecture-specific. 
For us, that means that we cannot build Universal scipy binaries. If you build on an Intel Mac, the binaries will only work on Intel Macs; if you build on a PPC Mac, likewise, the resulting binaries only work on PPC Macs.

* In a related problem, I cannot link against ATLAS on Intels, and possibly not on PPCs, either (I haven't tried building with a Universal Python on PPC). The Universal compile flags (notably "-arch ppc -arch intel") are used when compiling the numpy.distutils test programs for discovering ATLAS's version. Using a single-architecture ATLAS library causes those test programs to not link (since they are missing the _ATL_buildinfo symbol for the missing architecture). I've tried using lipo(1) to assemble a Universal ATLAS library from a PPC-built library and an Intel-built library, but this did not change the symptomology. Fortunately, part of ATLAS is already built into the Accelerate.framework provided with the OS and is automatically recognized by numpy.distutils. It's missing the C versions of LAPACK functions, so scipy.linalg.clapack will be empty. Also, I think numpy.distutils won't recognize that it is otherwise an ATLAS (_ATL_buildinfo is also missing from the framework), so it may not try to compile the C BLAS interfaces, either.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
-- Umberto Eco

From oliphant at ee.byu.edu Sat Dec 23 02:13:34 2006
From: oliphant at ee.byu.edu (Travis Oliphant)
Date: Sat, 23 Dec 2006 00:13:34 -0700
Subject: [Numpy-discussion] repr of bool matrix
In-Reply-To: References: Message-ID: <458CD71E.1080107@ee.byu.edu>

Keith Goodman wrote:
> Bool matrices and arrays don't line up when you display them because False has 5 letters and True has 4.
>
> After several columns it can become impossible to tell which column an element belongs to.
>
> Could the repr of bool matrices print 'True ' instead of 'True'?

Done in SVN. True prints as ' True'

-Travis

From svetosch at gmx.net Sun Dec 24 04:54:52 2006
From: svetosch at gmx.net (Sven Schreiber)
Date: Sun, 24 Dec 2006 10:54:52 +0100
Subject: [Numpy-discussion] Newbie Question, Probability
In-Reply-To: <458C66F2.9020206@gmail.com> References: <627102C921CD9745B070C3B10CB8199B010EBF69@hardwire.esri.com> <458AF84D.1060400@ee.byu.edu> <200612211613.18347.pgmdevlist@gmail.com> <458B0000.8070909@gmail.com> <458BA9B5.6050709@gmx.net> <458C66F2.9020206@gmail.com>
Message-ID: <458E4E6C.4000504@gmx.net>

Robert Kern schrieb:
> Rather, to put it accurately, numpy should not get large chunks of scipy functionality that require FORTRAN dependencies for reasons that should be obvious from that description. scipy.stats.distributions is just such a chunk.

I was probably not very clear; I was referring to "small" functions. As you and others have pointed out, for p-value stuff this doesn't apply, apparently. Ok.

> The ancillary point is that I think that, for those who do find the largeness and difficult-to-installness of scipy onerous, the best path forward is to work on the build process of scipy. And it will take *work* not wishes nor complaints nor tags. And honestly, the more I see the latter, the less motivated I am to bother with the former.

My impression of the discussion was that many people said _nothing_ at all should ever be added into numpy, which sounded kind of fundamentalistic to me.
And part of the justification given was features of scipy that don't exist yet (fair enough), and that nobody is even working on. Of course I understand the lack of manpower, but then I think this state of affairs should be properly taken into account when arguing against moving (only small!) features from scipy into numpy. Hence my earlier post.

I also try to contribute to open-source projects where I can, and believe me, it would probably help my career more to just have my faculty pay for matlab and forget about numpy et al. You know, users' time is valuable, too. Unfortunately I don't have the skills to help with modularizing scipy (nor the time to acquire those skills).

Btw, that's why I like the idea of paying for stuff like documentation and other things that open-source projects often forget about because it's not fun for the developers. (Hey, I'm an economist...) So I would be willing to donate money for some of the dull tasks, for example. (I'm fully aware that would not cover the real cost of the work, just like in Travis' case with the numpy guide.)

Ok, that's enough, happy holidays,
Sven

From Norbert.Nemec.list at gmx.de Sun Dec 24 07:57:25 2006
From: Norbert.Nemec.list at gmx.de (Norbert Nemec)
Date: Sun, 24 Dec 2006 13:57:25 +0100
Subject: [Numpy-discussion] Bug in numarray<->numpy interaction
Message-ID: <458E7935.109@gmx.de>

The following snippet demonstrates a problem in the interaction of numarray 1.5.2 with numpy 1.0.1 (and older versions):

-------------------
#!/usr/bin/env python

import numarray, numpy

na = numarray.array(0.)
np = numpy.array(0.)

na[...] = np
-------------------

The last line causes the error "TypeError: NA_setFromPythonScalar: bad value type."

The problem occurred in a perfectly normal piece of code combining PyTables (internally based on numarray) with calculations done in NumPy. AFAICS, it should work but simply is a bug somewhere in numarray or numpy.

Thanks,
Norbert

From charlesr.harris at gmail.com Sun Dec 24 13:14:14 2006
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Sun, 24 Dec 2006 11:14:14 -0700
Subject: [Numpy-discussion] Bug in numarray<->numpy interaction
In-Reply-To: <458E7935.109@gmx.de> References: <458E7935.109@gmx.de>
Message-ID:

On 12/24/06, Norbert Nemec wrote:
>
> The following snippet demonstrates a problem in the interaction of numarray 1.5.2 with numpy 1.0.1 (and older versions):
>
> -------------------
> #!/usr/bin/env python
>
> import numarray, numpy
>
> na = numarray.array(0.)
> np = numpy.array(0.)
>
> na[...] = np
> -------------------
>
> the last line causes the error "TypeError: NA_setFromPythonScalar: bad value type."
>
> The problem occurred in a perfectly normal piece of code combining PyTables (internally based on numarray) with calculations done in NumPy. AFAICS, it should work but simply is a bug somewhere in numarray or numpy.

Recent versions of PyTables work fine with numpy. Unless you have old tables using the numarray flavor, I suggest making the change. It is probably possible to read the numarray data into numpy by specifying a particular flavor in the read statement, but Francesc could probably tell you more about that.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From charlesr.harris at gmail.com Sun Dec 24 13:20:18 2006
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Sun, 24 Dec 2006 11:20:18 -0700
Subject: [Numpy-discussion] Bug in numarray<->numpy interaction
In-Reply-To: <458E7935.109@gmx.de> References: <458E7935.109@gmx.de>
Message-ID:

On 12/24/06, Norbert Nemec wrote:
>
> The following snippet demonstrates a problem in the interaction of numarray 1.5.2 with numpy 1.0.1 (and older versions):
>
> -------------------
> #!/usr/bin/env python
>
> import numarray, numpy
>
> na = numarray.array(0.)
> np = numpy.array(0.)
>
> na[...] = np

As a workaround, do

In [5]: na[...] = int(np)

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From mkg at cs.nyu.edu Sun Dec 24 15:28:50 2006
From: mkg at cs.nyu.edu (Matthew Koichi Grimes)
Date: Sun, 24 Dec 2006 15:28:50 -0500
Subject: [Numpy-discussion] nested recarrays
Message-ID: <458EE302.1000703@cs.nyu.edu>

(Newbie alert.)

I'm having trouble making a nested record array. I'm trying to work from the following example on the scipy.org examples page:

>>> mydescriptor = dtype([('x', 'f4'),('y', 'f4'),  # nested recarray
...                       ('nested', [('i', 'i2'),('j','i2')])])
>>> myarr = array([(1.0, 2.0, (1,2))], dtype=mydescriptor)

... but this isn't really a nested recarray, since you can't refer to fields 'x', 'y', or 'nested' as attributes:

>>> myarr.x
AttributeError: 'numpy.ndarray' object has no attribute 'x'

You have to use the more cumbersome bracket notation:

>>> myarr['x']
array([ 1.], dtype=float32)

When I try modifying the above example by simply replacing the 'array' constructor with 'recarray', I get the following error, which I haven't really grokked yet:

>>> myrecarr = N.recarray([(1.0, 2.0, (1,2))], dtype=dt)
---------------------------------------------------------------------------
exceptions.TypeError            Traceback (most recent call last)

/home/mkg/

/usr/lib/python2.4/site-packages/numpy/core/records.py in __new__(subtype, shape, dtype, buf, offset, strides, formats, names, titles, byteorder, aligned)
    176
    177         if buf is None:
--> 178             self = sb.ndarray.__new__(subtype, shape, (record, descr))
    179         else:
    180             self = sb.ndarray.__new__(subtype, shape, (record, descr),

TypeError: an integer is required

I'm trying to make a record array that stores two record arrays called "states" and "controls", each of which store two float arrays "x" and "dx". The dtype would be something like:

dtype([('states', [('x', 'f4'), ('dx', 'f4')]),
       ('controls', [('x', 'f4'), ('dx', 'f4')])])

It'd be great if I could address elements as:

myarr.states.x[0]

as opposed to

myarr['states']['x'][0]

Any tips would be greatly appreciated. Once I figure this out, I'd be glad to post the solution as an example under the "recarray()" entry in the examples list.

-- Matt
From oliphant at ee.byu.edu Sun Dec 24 20:22:05 2006
From: oliphant at ee.byu.edu (Travis Oliphant)
Date: Sun, 24 Dec 2006 18:22:05 -0700
Subject: [Numpy-discussion] nested recarrays
In-Reply-To: <458EE302.1000703@cs.nyu.edu> References: <458EE302.1000703@cs.nyu.edu>
Message-ID: <458F27BD.1050006@ee.byu.edu>

Matthew Koichi Grimes wrote:
> (Newbie alert.)
>
> I'm having trouble making a nested record array. I'm trying to work from the following example on the scipy.org examples page:
>
> >>> mydescriptor = dtype([('x', 'f4'),('y', 'f4'),  # nested recarray
> ...                       ('nested', [('i', 'i2'),('j','i2')])])
> >>> myarr = array([(1.0, 2.0, (1,2))], dtype=mydescriptor)
>
> ... but this isn't really a nested recarray, since you can't refer to fields 'x', 'y', or 'nested' as attributes:

It is nested, but yes, it's not as convenient as attribute access.

> When I try modifying the above example by simply replacing the 'array' constructor with 'recarray', I get the following error, which I haven't really grokked yet:

The problem is that recarray is not analogous to array. It is analogous to ndarray. Use N.rec.array instead.

-Travis
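For instance, something along these lines should give the attribute access Matt is after (an untested sketch using the field layout from his mail, not code from the thread):

import numpy as N

dt = N.dtype([('states',   [('x', 'f4'), ('dx', 'f4')]),
              ('controls', [('x', 'f4'), ('dx', 'f4')])])

# rec.array returns a recarray, which exposes fields as attributes;
# nested structured fields should come back as recarrays as well.
arr = N.rec.array([((1.0, 2.0), (3.0, 4.0))], dtype=dt)
print arr.states.x    # instead of arr['states']['x']
print arr.controls.dx

The bracket notation still works on the result, so nothing is lost by building the array this way.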
""" if n == 0: return a if n < 0: raise ValueError, 'order must be non-negative but got ' + repr(n) a = asanyarray(a) nd = len(a.shape) slice1 = [slice(None)]*nd slice2 = [slice(None)]*nd slice1[axis] = slice(1, None) slice2[axis] = slice(None, -1) slice1 = tuple(slice1) slice2 = tuple(slice2) if n > 1: return diff(a[slice1]-a[slice2], n-1, axis=axis) else: return a[slice1]-a[slice2] def lag(a, n=1, lag_axis=0, concat_axis=1): """ Calculate the nth order discrete lag along given axis. Note: axis=-1 means 'last dimension'. This is the default for the diff function. However, the first dimension (0) may be preferred for time-series analysis. """ a = asanyarray(a) n = ravel(n) # convert input to an array nmax = n.max() # determines length of array to be returned nd = len(a.shape) # number of dimentions in array a s = [slice(None)]*nd # lag for 1st element in n s[lag_axis] = slice(nmax-n[0],-n[0]) ret_a = a[tuple(s)] # array to be returned # lags for other elements in n for i in n[1:]: s[lag_axis] = slice(nmax-i,-i) ret_a = concatenate((ret_a,a[tuple(s)]), concat_axis) return ret_a # testing lag function # test 1 data = arange(10) print "=" * 30 print "test 1 - data" print data print "\nlag 2" print lag(data,2) print "\nlag 1,2,3" print lag(data,range(1,4)) print "=" * 30 + "\n" # test 2 data = arange(10) data = vstack((data,data)).T print "=" * 30 print "test 2 - data" print data print "\nlag 2" print lag(data,2) print "\nlag 1,2,3" print lag(data,range(1,4)) print "=" * 30 + "\n" # test 3 data = arange(10) data = vstack((data,data)).T data = dstack((data,data)) print "=" * 30 print "test 3 - data" print data print "\nlag 2" print lag(data,2) print "\nlag 1,2,3," print lag(data,range(1,4)) print "=" * 30 + "\n" From cournape at gmail.com Tue Dec 26 03:16:38 2006 From: cournape at gmail.com (David Cournapeau) Date: Tue, 26 Dec 2006 09:16:38 +0100 Subject: [Numpy-discussion] slow numpy.clip ? In-Reply-To: <458C50D7.7080207@gmail.com> References: <45864074.6090203@ar.media.kyoto-u.ac.jp> <45879279.6030707@hawaii.edu> <4587A926.7070401@ar.media.kyoto-u.ac.jp> <4587B2EF.7010803@gmail.com> <4588901B.5030904@ee.byu.edu> <4588A055.5070707@ar.media.kyoto-u.ac.jp> <4588FC54.7040106@ee.byu.edu> <458C50D7.7080207@gmail.com> Message-ID: <5b8d13220612260016y79112a70o5897da68ce53250b@mail.gmail.com> On 12/22/06, Robert Kern wrote: > Charles R Harris wrote: > > > I've been thinking about that a bit. One solution is to have a small > > python program that takes all the pieces and writes one big build file, > > I think something like that happens now. Another might be to use > > includes in a base file; there is nothing sacred about not including .c > > files or not putting code in .h files, it is just a convention, we could > > even chose another extension. I also wonder if we couldn't just link in > > object files. The table of function pointers just needs some addresses > > and, while the python convention of hiding all the function names by > > using static functions is nice, it is probably not required. Maybe we > > could use ctypes in some way? > > > > I am not pushing any of these alternatives at the moment, just putting > > them down. Maybe there are others? > > None that I want to think about. #including separate .c files, leaving the > extension alone, is best, IMO. > > I've studied a bit how exposing C api from python extensions work at the python.org website. 
My understanding is that the problem when splitting into different files is that the C standard has no storage class equivalent to a "shared static", e.g. using a function in several C files of the same shared library without the function being exposed in the shared library. One elegant solution for this is unfortunately not portable: recent gcc versions have this functionality, the new C++ visibility support, which also works for C source files.

http://gcc.gnu.org/wiki/Visibility

This document explains the different ways available for limiting symbols in a DSO:

http://people.redhat.com/drepper/dsohowto.pdf

Having several #includes of C files is the easiest way, and I guess this would be the safest way to start splitting the source files. A better way can always be adopted afterwards anyway, I guess.

The question would then be: how do people think one should split the files? By topic (e.g. one file for array construction/destruction, one file for elementary operations, one file for the C API, etc.)?

I am willing to spend some time on this, if this is considered useful.

cheers,

David

From ivilata at carabos.com Tue Dec 26 04:00:19 2006
From: ivilata at carabos.com (Ivan Vilata i Balaguer)
Date: Tue, 26 Dec 2006 10:00:19 +0100
Subject: [Numpy-discussion] Fixes to Numexpr under 64 bit platforms
Message-ID: <20061226090019.GH11105@tardis.terramar.selidor.net>

Hi all, here you have a patch that fixes some type declaration bugs which cause Numexpr to crash under 64 bit platforms. All of them are confusions between the ``int`` and ``intp`` types, which happen to be the same under 32 bit platforms but not under 64 bit ones, which caused garbage values to be used as shapes and strides.

The errors were easy to spot by looking at the warnings yielded by the compiler. Changes have been tested under a Dual Core AMD Opteron 270 running SuSE 10.0 X86-64 with Python 2.4 and 2.5.

Have nice holidays,

::

	Ivan Vilata i Balaguer   >qo<   http://www.carabos.com/
	Cárabos Coop. V.   V  V   Enjoy Data
	                   ""

-------------- next part --------------
Index: interpreter.c
===================================================================
--- interpreter.c	(revision 2465)
+++ interpreter.c	(working copy)
@@ -704,7 +704,7 @@
     rawmemsize = BLOCK_SIZE1 * (size_from_sig(constsig) + size_from_sig(tempsig));
     mem = PyMem_New(char *, 1 + n_inputs + n_constants + n_temps);
     rawmem = PyMem_New(char, rawmemsize);
-    memsteps = PyMem_New(int, 1 + n_inputs + n_constants + n_temps);
+    memsteps = PyMem_New(intp, 1 + n_inputs + n_constants + n_temps);
     if (!mem || !rawmem || !memsteps) {
         Py_DECREF(constants);
         Py_DECREF(constsig);
@@ -822,8 +822,8 @@
     int count;
     int size;
     int findex;
-    int *shape;
-    int *strides;
+    intp *shape;
+    intp *strides;
     int *index;
     char *buffer;
 };
@@ -956,7 +956,7 @@
     PyObject *output = NULL, *a_inputs = NULL;
     struct index_data *inddata = NULL;
     unsigned int n_inputs, n_dimensions = 0;
-    int shape[MAX_DIMS];
+    intp shape[MAX_DIMS];
     int i, j, size, r, pc_error;
     char **inputs = NULL;
     intp strides[MAX_DIMS]; /* clean up XXX */
@@ -1032,7 +1032,7 @@
     for (i = 0; i < n_inputs; i++) {
         PyObject *a = PyTuple_GET_ITEM(a_inputs, i);
         PyObject *b;
-        int strides[MAX_DIMS];
+        intp strides[MAX_DIMS];
         int delta = n_dimensions - PyArray_NDIM(a);
         if (PyArray_NDIM(a)) {
             for (j = 0; j < n_dimensions; j++)

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 309 bytes
Desc: Digital signature
URL:
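For reference, the Python-level counterpart of the int/intp distinction the patch is about can be seen directly (a small illustrative sketch, not part of the patch):

import numpy

# intp tracks the pointer size, so its width is platform dependent;
# a C int stays 32 bits on common 64-bit (LP64) platforms.
print numpy.dtype(numpy.intp).itemsize   # 8 on an Opteron like the one above
print numpy.dtype(numpy.int32).itemsize  # 4 everywhere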
From faltet at carabos.com Tue Dec 26 04:46:06 2006
From: faltet at carabos.com (Francesc Altet)
Date: Tue, 26 Dec 2006 10:46:06 +0100
Subject: [Numpy-discussion] Fixes to Numexpr under 64 bit platforms
In-Reply-To: <20061226090019.GH11105@tardis.terramar.selidor.net> References: <20061226090019.GH11105@tardis.terramar.selidor.net>
Message-ID: <200612261046.07278.faltet@carabos.com>

Hey! That's bloody brilliant!

On Tuesday 26 December 2006 10:00, Ivan Vilata i Balaguer wrote:
> Hi all, here you have a patch that fixes some type declaration bugs which cause Numexpr to crash under 64 bit platforms. All of them are confusions between the ``int`` and ``intp`` types, which happen to be the same under 32 bit platforms but not under 64 bit ones, which caused garbage values to be used as shapes and strides.
>
> The errors were easy to spot by looking at the warnings yielded by the compiler. Changes have been tested under a Dual Core AMD Opteron 270 running SuSE 10.0 X86-64 with Python 2.4 and 2.5.
>
> Have nice holidays,
>
> Ivan Vilata i Balaguer   >qo<   http://www.carabos.com/
> Cárabos Coop. V.   V  V   Enjoy Data
>                    ""

--
>0,0<   Francesc Altet     http://www.carabos.com/
V   V   Cárabos Coop. V.   Enjoy Data
 "-"

From ivilata at carabos.com Tue Dec 26 06:31:43 2006
From: ivilata at carabos.com (Ivan Vilata i Balaguer)
Date: Tue, 26 Dec 2006 12:31:43 +0100
Subject: [Numpy-discussion] Different numpy.int64 in 64 bit platforms?
Message-ID: <20061226113143.GJ11105@tardis.terramar.selidor.net>

I have come across this strange behaviour of NumPy 1.0.1 under a 64 bit AMD Opteron:

>>> import numpy
>>> numpy.__version__
'1.0.1'
>>> numpy.dtype(int).type
>>> numpy.dtype(int).type is numpy.int64
True
>>> numpy.dtype('int64').type
>>> numpy.dtype('int64').type is numpy.int64
True
>>> numpy.dtype(long).type
>>> numpy.dtype(long).type is numpy.int64  # strange, but ok
False
>>> issubclass(numpy.dtype(long).type, numpy.int64)  # what?
False

I.e. the NumPy type used for ``long`` is not the same as for ``int`` or an explicit ``'int64'``, not even an instance. Is this a bug, or is this kind of type comparison discouraged? Thanks!

::

	Ivan Vilata i Balaguer   >qo<   http://www.carabos.com/
	Cárabos Coop. V.   V  V   Enjoy Data
	                   ""

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 309 bytes
Desc: Digital signature
URL:

From oliphant at ee.byu.edu Tue Dec 26 19:05:20 2006
From: oliphant at ee.byu.edu (Travis Oliphant)
Date: Tue, 26 Dec 2006 17:05:20 -0700
Subject: [Numpy-discussion] Different numpy.int64 in 64 bit platforms?
In-Reply-To: <20061226113143.GJ11105@tardis.terramar.selidor.net> References: <20061226113143.GJ11105@tardis.terramar.selidor.net>
Message-ID: <4591B8C0.9070703@ee.byu.edu>

Ivan Vilata i Balaguer wrote:
> I have come across this strange behaviour of NumPy 1.0.1 under a 64 bit AMD Opteron:
>
> I.e. the NumPy type used for ``long`` is not the same as for ``int`` or an explicit ``'int64'``, not even an instance. Is this a bug, or is this kind of type comparison discouraged? Thanks!

There is a NumPy type for each underlying c-type (i.e. int, long, short, longlong). How many bits these have is platform dependent. Which of these gets mapped to the name numpy.int64 is also platform dependent.

It is rare that you should be doing is-type comparisons on the type of the array scalar.
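For example (an illustrative sketch along the lines of Ivan's session; which C type owns the name numpy.int64 varies by platform):

>>> import numpy
>>> a = numpy.array([1, 2], dtype=long)
>>> a.dtype.type is numpy.int64           # fragile: False on Ivan's Opteron
False
>>> a.dtype == numpy.dtype(numpy.int64)   # robust: compares the data-types
True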
Equality testing on the data-type object (the thing returned by the dtype attribute of the ndarray and the array scalar) returns true if the data-types are compatible. This is a more reliable test.

So, it's not a bug, but it should be a FAQ.

-Travis

From oliphant at ee.byu.edu Tue Dec 26 19:27:45 2006
From: oliphant at ee.byu.edu (Travis Oliphant)
Date: Tue, 26 Dec 2006 17:27:45 -0700
Subject: [Numpy-discussion] slow numpy.clip ?
In-Reply-To: <5b8d13220612260016y79112a70o5897da68ce53250b@mail.gmail.com> References: <45864074.6090203@ar.media.kyoto-u.ac.jp> <45879279.6030707@hawaii.edu> <4587A926.7070401@ar.media.kyoto-u.ac.jp> <4587B2EF.7010803@gmail.com> <4588901B.5030904@ee.byu.edu> <4588A055.5070707@ar.media.kyoto-u.ac.jp> <4588FC54.7040106@ee.byu.edu> <458C50D7.7080207@gmail.com> <5b8d13220612260016y79112a70o5897da68ce53250b@mail.gmail.com>
Message-ID: <4591BE01.2070808@ee.byu.edu>

David Cournapeau wrote:
> On 12/22/06, Robert Kern wrote:
>> Charles R Harris wrote:
>>
>>> I've been thinking about that a bit. One solution is to have a small python program that takes all the pieces and writes one big build file; I think something like that happens now. Another might be to use includes in a base file; there is nothing sacred about not including .c files or not putting code in .h files -- it is just a convention, we could even choose another extension. I also wonder if we couldn't just link in object files. The table of function pointers just needs some addresses and, while the python convention of hiding all the function names by using static functions is nice, it is probably not required. Maybe we could use ctypes in some way?
>>>
>>> I am not pushing any of these alternatives at the moment, just putting them down. Maybe there are others?
>>
>> None that I want to think about. #including separate .c files, leaving the extension alone, is best, IMO.
>
> The question would then be: how do people think one should split the files? By topic (e.g. one file for array construction/destruction, one file for elementary operations, one file for the C API, etc.)?

I think it's useful, but I don't have time to think very much about it. I suspect anything that's semi-coherent that results in smaller files will be beneficial for editing purposes. The only real opinion I have at this point is that I'd like to see multiarraymodule.c contain little more than include statements (of headers and other .c files) and comments.

-Travis

From ivilata at carabos.com Wed Dec 27 03:23:26 2006
From: ivilata at carabos.com (Ivan Vilata i Balaguer)
Date: Wed, 27 Dec 2006 09:23:26 +0100
Subject: [Numpy-discussion] Small fix to Numexpr getType()
Message-ID: <20061227082326.GK11105@tardis.terramar.selidor.net>

Hi all,

According to Travis' advice in a previous thread (see http://www.mail-archive.com/numpy-discussion%40scipy.org/msg00442.html), I have modified the ``compiler.getType()`` function in Numexpr so that it uses the ``dtype.kind`` attribute instead of ``issubclass()``. The patch is attached.

By the way, what is the proper casing of "Numexpr"? :)

::

	Ivan Vilata i Balaguer   >qo<   http://www.carabos.com/
	Cárabos Coop. V.   V  V   Enjoy Data
	                   ""
-------------- next part --------------
Index: compiler.py
===================================================================
--- compiler.py	(revision 2465)
+++ compiler.py	(working copy)
@@ -554,14 +554,14 @@

 def getType(a):
-    t = a.dtype.type
-    if issubclass(t, numpy.bool_):
+    kind = a.dtype.kind
+    if kind == 'b':
         return bool
-    if issubclass(t, numpy.integer):
+    if kind in 'iu':
         return int
-    if issubclass(t, numpy.floating):
+    if kind == 'f':
         return float
-    if issubclass(t, numpy.complexfloating):
+    if kind == 'c':
         return complex
     raise ValueError("unkown type %s" % a.dtype.name)

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 309 bytes
Desc: Digital signature
URL:

From gnchen at mac.com Wed Dec 27 12:06:11 2006
From: gnchen at mac.com (Gennan Chen)
Date: Wed, 27 Dec 2006 09:06:11 -0800
Subject: [Numpy-discussion] which fft I should use
Message-ID: <4FB3C0EC-6C5C-4AE7-A8BF-D9433DB3B279@mac.com>

Hi! all,

There are so many fft routines in Scipy/Numpy. Does anyone know which one should be used officially?

Gen

From robert.kern at gmail.com Wed Dec 27 12:58:24 2006
From: robert.kern at gmail.com (Robert Kern)
Date: Wed, 27 Dec 2006 12:58:24 -0500
Subject: [Numpy-discussion] which fft I should use
In-Reply-To: <4FB3C0EC-6C5C-4AE7-A8BF-D9433DB3B279@mac.com> References: <4FB3C0EC-6C5C-4AE7-A8BF-D9433DB3B279@mac.com>
Message-ID: <4592B440.2010201@gmail.com>

Gennan Chen wrote:
> Hi! all,
>
> There are so many fft routines in Scipy/Numpy. Does anyone know which one should be used officially?

For maximum portability and speed, use numpy.dual.fft() and its friends. That will use the optimized functions in scipy.fftpack if it is available, and numpy.fft otherwise.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
-- Umberto Eco
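For example (a small sketch of the pattern Robert describes; the call is identical with or without scipy installed):

import numpy
from numpy.dual import fft   # resolves to scipy.fftpack if available, else numpy.fft

x = numpy.random.rand(8)
X = fft(x)                   # same call either way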
From Chris.Barker at noaa.gov Wed Dec 27 14:49:06 2006
From: Chris.Barker at noaa.gov (Christopher Barker)
Date: Wed, 27 Dec 2006 11:49:06 -0800
Subject: [Numpy-discussion] Newbie Question, Probability
In-Reply-To: <458CCC25.7050507@ee.byu.edu> References: <627102C921CD9745B070C3B10CB8199B010EBF69@hardwire.esri.com> <458AF84D.1060400@ee.byu.edu> <200612211613.18347.pgmdevlist@gmail.com> <458B0000.8070909@gmail.com> <458BA9B5.6050709@gmx.net> <458CCC25.7050507@ee.byu.edu>
Message-ID: <4592CE32.3060700@noaa.gov>

Travis Oliphant wrote:
> It is the combination of SciPy+NumPy+Matplotlib+IPython (+ perhaps a good IDE) that can succeed at being a MATLAB/IDL replacement for a lot of people.
>
> What is also needed is a good "package" of it all --- like the Enthon distribution. This requires quite a bit of thankless work.

I know Robert put some serious effort into "MacEnthon" a while back, but he is no longer maintaining it, which doesn't surprise me a bit -- that looked like a LOT of work.

However, MacEnthon was much bigger than just the packages Travis listed above, and I think Travis has that right -- those are the key ones to do. Let's "just do it!" -- first we need to solve the Fortran+Universal binary problems, though -- that seems to be the technical sticking point on OS-X.

Also, while the Enthon distribution is fabulous, they do tend to stay behind the bleeding edge a fair bit -- it would be nice to have the core packages with the latest and greatest on Windows and Linux too, all as one easy installer (or rpm or .deb or whatever for Linux).

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959 voice
7600 Sand Point Way NE   (206) 526-6329 fax
Seattle, WA 98115        (206) 526-6317 main reception

Chris.Barker at noaa.gov

From pearu at cens.ioc.ee Wed Dec 27 15:39:18 2006
From: pearu at cens.ioc.ee (pearu at cens.ioc.ee)
Date: Wed, 27 Dec 2006 22:39:18 +0200 (EET)
Subject: [Numpy-discussion] Newbie Question, Probability
In-Reply-To: <4592CE32.3060700@noaa.gov>
Message-ID:

On Wed, 27 Dec 2006, Christopher Barker wrote:
> Travis Oliphant wrote:
>> It is the combination of SciPy+NumPy+Matplotlib+IPython (+ perhaps a good IDE) that can succeed at being a MATLAB/IDL replacement for a lot of people.
>>
>> What is also needed is a good "package" of it all --- like the Enthon distribution. This requires quite a bit of thankless work.
>
> I know Robert put some serious effort into "MacEnthon" a while back, but he is no longer maintaining it, which doesn't surprise me a bit -- that looked like a LOT of work.
>
> However, MacEnthon was much bigger than just the packages Travis listed above, and I think Travis has that right -- those are the key ones to do. Let's "just do it!" -- first we need to solve the Fortran+Universal binary problems, though -- that seems to be the technical sticking point on OS-X.

Let me add a comment on the Fortran problem (which I assume to be the (lack of) Fortran compiler problem, right?).

I have been working on the f2py rewrite to support wrapping Fortran 90 types among other F90 constructs, and as a result we have almost a complete Fortran parser in Python. It is relatively easy to use this parser to automatically convert Fortran 77 codes that we have in scipy to C codes whenever no Fortran compiler is available. Due to lack of funding this work has been frozen for now, but I'd say there is hope of resolving the Fortran compiler issues for any platform in the future.

Pearu

From Chris.Barker at noaa.gov Wed Dec 27 16:09:42 2006
From: Chris.Barker at noaa.gov (Christopher Barker)
Date: Wed, 27 Dec 2006 13:09:42 -0800
Subject: [Numpy-discussion] Newbie Question, Probability
In-Reply-To: References: Message-ID: <4592E116.4080602@noaa.gov>

pearu at cens.ioc.ee wrote:
> I have been working on the f2py rewrite to support wrapping Fortran 90 types among other F90 constructs, and as a result we have almost a complete Fortran parser in Python. It is relatively easy to use this parser to automatically convert Fortran 77 codes that we have in scipy to C codes whenever no Fortran compiler is available.

Cool!

How is this different from/better than the old standby f2c?

One issue with f2c is that it required a pretty good set of libs to support stuff that Fortran had that C didn't -- complex numbers come to mind; I'm not sure what else is in libf2c.

In fact, I've often wondered why scipy doesn't use f2c.

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959 voice
7600 Sand Point Way NE   (206) 526-6329 fax
Seattle, WA 98115        (206) 526-6317 main reception

Chris.Barker at noaa.gov

From robert.kern at gmail.com Wed Dec 27 16:35:31 2006
From: robert.kern at gmail.com (Robert Kern)
Date: Wed, 27 Dec 2006 16:35:31 -0500
Subject: [Numpy-discussion] Newbie Question, Probability
In-Reply-To: <4592E116.4080602@noaa.gov> References: <4592E116.4080602@noaa.gov>
Message-ID: <4592E723.8020600@gmail.com>

Christopher Barker wrote:
> pearu at cens.ioc.ee wrote:
>> I have been working on the f2py rewrite to support wrapping Fortran 90 types among other F90 constructs, and as a result we have almost a complete Fortran parser in Python. It is relatively easy to use this parser to automatically convert Fortran 77 codes that we have in scipy to C codes whenever no Fortran compiler is available.
>
> Cool!
>
> How is this different from/better than the old standby f2c?
>
> One issue with f2c is that it required a pretty good set of libs to support stuff that Fortran had that C didn't -- complex numbers come to mind; I'm not sure what else is in libf2c.
>
> In fact, I've often wondered why scipy doesn't use f2c.

Generally speaking, g77 was always more likely to work on more platforms with less hassle.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
-- Umberto Eco

From svetosch at gmx.net Wed Dec 27 19:17:35 2006
From: svetosch at gmx.net (Sven Schreiber)
Date: Thu, 28 Dec 2006 01:17:35 +0100
Subject: [Numpy-discussion] Time series: lag function
In-Reply-To: References: Message-ID: <45930D1F.2040801@gmx.net>

Vincent Nijs schrieb:
> I am trying to convert some of my time-series code written in Ox to scipy/numpy (e.g., unit root tests, IRFs, cointegration, etc). Two key functions I need for this are 'lag' and 'diff'. 'diff' is available but 'lag' is apparently not.
>
> Below is my attempt at a lag function. I tried to be somewhat consistent with the diff function which is part of numpy (also listed for convenience). It seems to work fine for a 2-d array but not for a 1-d or 3-d array (see tests at bottom of email). I'd appreciate any suggestions you may have.

Great to see somebody converting from Ox to numpy, I see synergies ahead!

> def lag(a, n=1, lag_axis=0, concat_axis=1):
>     """Calculate the nth order discrete lag along given axis.
>
>     Note: axis=-1 means 'last dimension'. This is the default
>     for the diff function. However, the first dimension (0)
>     may be preferred for time-series analysis.
>     """
>     a = asanyarray(a)
>
>     n = ravel(n)  # convert input to an array

Why don't you leave n as an integer? Maybe you're trying to be too clever here. I think it's a good idea to have lag resemble the existing diff function, and then a single number n should be enough.

(And I'm not sure about your concat_axis -- e.g., what does axis=1 mean for a 1-d array?)

Do you get your errors also for integer n?

cheers,
sven

From v-nijs at kellogg.northwestern.edu Wed Dec 27 20:01:55 2006
From: v-nijs at kellogg.northwestern.edu (Vincent Nijs)
Date: Wed, 27 Dec 2006 19:01:55 -0600
Subject: [Numpy-discussion] Time series: lag function
In-Reply-To: <45930D1F.2040801@gmx.net>
Message-ID:

Sven:

I simplified the function to create lags only along axis 0 (see attached). I am using c_ now, which seems to play nice with 1-d and 2-d arrays.
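The core of the approach looks roughly like this (a stripped-down sketch of the idea only -- not the attached lag.py itself):

import numpy as num

def lag(a, n=1):
    """Lag along axis 0; n may be an int or a sequence of lags.
    (Illustrative sketch -- not the attached lag.py.)"""
    a = num.asanyarray(a)
    n = num.ravel(n)   # accept an int or a list of lags
    nmax = n.max()     # the longest lag fixes the output length
    # slice out each lagged copy and glue them column-wise with c_
    return num.c_[tuple(a[nmax - i : len(a) - i] for i in n)]

With that, lag(data, 2) and lag(data, range(1, 4)) both work on 1-d and 2-d input, which is what motivates the ravel(n) discussed below.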
The reason I am using 'n = ravel(n)' in the code is that I want to be able to pass integers as well as lists. For example, I want each of the following to work:

lag(a,2)
lag(a,range(1,3))
lag(a,[1]+range(4,6))

I'd actually also like the following to work:

lag(a,1,3,6,8)

I could do that with *n, but then I don't think I can use range(x,y) in the same function call -- for example, lag(a,1,3,range(6,9)). You probably don't need this flexibility when calling diff() since, at least in TS applications, I only ever need diff(a,1) or diff(a,2).

Thanks,

Vincent

On 12/27/06 6:17 PM, "Sven Schreiber" wrote:

> Vincent Nijs schrieb:
>> I am trying to convert some of my time-series code written in Ox to scipy/numpy (e.g., unit root tests, IRFs, cointegration, etc). Two key functions I need for this are 'lag' and 'diff'. 'diff' is available but 'lag' is apparently not.
>>
>> Below is my attempt at a lag function. I tried to be somewhat consistent with the diff function which is part of numpy (also listed for convenience). It seems to work fine for a 2-d array but not for a 1-d or 3-d array (see tests at bottom of email). I'd appreciate any suggestions you may have.
>
> Great to see somebody converting from Ox to numpy, I see synergies ahead!
>
>> def lag(a, n=1, lag_axis=0, concat_axis=1):
>>     """Calculate the nth order discrete lag along given axis.
>>     ...
>>     """
>>     a = asanyarray(a)
>>
>>     n = ravel(n)  # convert input to an array
>
> Why don't you leave n as an integer? Maybe you're trying to be too clever here. I think it's a good idea to have lag resemble the existing diff function, and then a single number n should be enough.
>
> (And I'm not sure about your concat_axis -- e.g., what does axis=1 mean for a 1-d array?)
>
> Do you get your errors also for integer n?
>
> cheers,
> sven

--
-------------- next part --------------
A non-text attachment was scrubbed...
Name: lag.py
Type: application/octet-stream
Size: 479 bytes
Desc: not available
URL:

From v-nijs at kellogg.northwestern.edu Wed Dec 27 20:10:45 2006
From: v-nijs at kellogg.northwestern.edu (Vincent Nijs)
Date: Wed, 27 Dec 2006 19:10:45 -0600
Subject: [Numpy-discussion] newbie: attempt at data frame
In-Reply-To: Message-ID:

I just started working on a time-series module/class in scipy/numpy, and it seemed useful to have some of the R data-frame functionality (i.e., select columns of data based on variable names). I tried rec-arrays but couldn't get them to work the way I wanted. I also looked at the Dataframe class by Andrew Straw, but at over 400 lines of code that seemed pretty complicated, to me at least.

I searched the mailing-list archives and found a discussion on 'Table like array' (see excerpt below). To get the minimal functionality discussed, I wrote a simple class (see attached) to try and implement X.get('a','c'), where 'a' and 'c' are variable names linked to columns of data in X. I added some test code, so if you run the code in the attachment you will see that it seems to work. However, since this is my first class, I'd appreciate your input on the approach I used and any suggestions on how to improve the class (or use something else).
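The core of such a class can be quite small -- something like this (a bare-bones sketch of the idea, not the attached dbase.py):

import numpy as num

class dbase(object):
    """Bare-bones data frame: columns of a 2-d array addressed by name.
    (Illustrative sketch -- not the attached dbase.py.)"""
    def __init__(self, varnames, data):
        self.varnames = list(varnames)   # one label per column
        self.data = num.asarray(data)    # observations in rows
    def get(self, *names):
        """Return the columns for the given variable names."""
        cols = [self.varnames.index(nm) for nm in names]
        return self.data[:, cols]

With data holding the observations, X = dbase(['a', 'b', 'c'], data) then makes X.get('a', 'c') return the first and third columns.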
I'd like to read the data and variable names directly from a single csv file. I tried this through the python csv module, but it would read all data as strings and I couldn't figure out how to easily separate the variable names and the data.

Thanks,
Vincent

> [Numpy-discussion] Re: [SciPy-user] Table like array
> Paul Barrett pebarrett at gmail.com
> Wed Mar 1 06:45:02 CST 2006
>
> On 3/1/06, Travis Oliphant wrote:
>>
>> How many people would like to see x['f1','f2','f5'] return a new array with a new data-type descriptor constructed from the provided fields?
>
> I'm surprised that it's not already available.
>
> -- Paul

-------------- next part --------------
A non-text attachment was scrubbed...
Name: dbase.py.zip
Type: application/octet-stream
Size: 7219 bytes
Desc: not available
URL:

From nadavh at visionsense.com Thu Dec 28 08:35:03 2006
From: nadavh at visionsense.com (Nadav Horesh)
Date: Thu, 28 Dec 2006 15:35:03 +0200
Subject: [Numpy-discussion] cpuinfo fails to recognize Core2 cpu on linux
Message-ID: <07C6A61102C94148B8104D42DE95F7E8C8F162@exchange2k.envision.co.il>

System: gentoo linux (64 bits) on Core2 Duo.

scipy fails to install since cpuinfo identified the cpu as i686.

Solution: I changed the linux_cpuinfo._is_Nocona() method to:

    def _is_Nocona(self):
        # return self.is_PentiumIV() and self.is_64bit()
        return re.match(r'Intel.*?Core.*\b',
                        self.info[0]['model name']) is not None

I am not sure that this solution is general enough. Here is my "cat /proc/cpuinfo" result:

processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 15
model name      : Intel(R) Core(TM)2 CPU 6600 @ 2.40GHz
stepping        : 6
cpu MHz         : 1596.000
cache size      : 4096 KB
physical id     : 0
siblings        : 2
core id         : 0
cpu cores       : 2
fpu             : yes
fpu_exception   : yes
cpuid level     : 10
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm constant_tsc pni monitor ds_cpl vmx est tm2 cx16 xtpr lahf_lm
bogomips        : 4797.90
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:

processor       : 1
vendor_id       : GenuineIntel
cpu family      : 6
model           : 15
model name      : Intel(R) Core(TM)2 CPU 6600 @ 2.40GHz
stepping        : 6
cpu MHz         : 2394.000
cache size      : 4096 KB
physical id     : 0
siblings        : 2
core id         : 1
cpu cores       : 2
fpu             : yes
fpu_exception   : yes
cpuid level     : 10
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm constant_tsc pni monitor ds_cpl vmx est tm2 cx16 xtpr lahf_lm
bogomips        : 4795.32
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:

Nadav.

From emsellem at obs.univ-lyon1.fr Thu Dec 28 10:33:35 2006
From: emsellem at obs.univ-lyon1.fr (Eric Emsellem)
Date: Thu, 28 Dec 2006 16:33:35 +0100
Subject: [Numpy-discussion] extracting values from an array
Message-ID: <4593E3CF.9060005@obs.univ-lyon1.fr>

Hi,

I have a simple problem: extracting a subarray from a bigger one, for which I don't find an elegant/easy solution. I tried using searchsorted and other tricks, but I always end up with many "if" statements testing all cases (also because searchsorted does not work on arrays which are sorted in decreasing order). There must be an elegant solution, but I cannot find it.

==> I have one array of floats which has, e.g.,
regular steps (as produced by arange) but can either be decreasing or increasing, as in:

x = arange(0.,1.,0.1)    ## or
x = arange(0.,-1.,-0.1)

I also have another "data" y array, which has the same length as x. And I would like to extract from these arrays the subarrays corresponding to the range x1 -- x2, going from x1 to x2. The output array should have its first element starting from the element in x closest to x1 (and in the range defined by x1, x2), and then going towards x2, for all cases (decreasing or increasing order for x, and x1 >= x2 or x1 <= x2).

So here is the output I would like to get in 3 different simple examples:

### Increasing order in x, and x1 <= x2 :
x = arange(0.,1.,0.1)
x1 = 0.1
x2 = 0.55
### the output I would like is simply: array([ 0.1, 0.2, 0.3, 0.4, 0.5])

### decreasing order in x, and x1 <= x2 :
x = arange(0.,-1.,-0.1)
x1 = -0.55
x2 = -0.1
### what I would like is then: array([ -0.5, -0.4, -0.3, -0.2, -0.1])

### decreasing order in x, and x1 >= x2 :
x = arange(0.,-1.,-0.1)
x1 = -0.1
x2 = -0.55
### what I would like is then: array([ -0.1, -0.2, -0.3, -0.4, -0.5])

etc....

And it should work if both x1 and x2 are outside the given range provided in x (the output should then be an empty array).

Note that I also need to extract the corresponding subarray from the data array (same indices as the ones I extract from x).

I hope this is clear. It is a very simple problem, but I cannot see a simple solution without involving lots of stupid "if" statements. It would be great if these "if" statements were hidden in some efficient numpy tricks/functions.

thanks for any input.

Eric

From gregwillden at gmail.com Thu Dec 28 11:31:44 2006
From: gregwillden at gmail.com (Greg Willden)
Date: Thu, 28 Dec 2006 10:31:44 -0600
Subject: [Numpy-discussion] extracting values from an array
In-Reply-To: <4593E3CF.9060005@obs.univ-lyon1.fr> References: <4593E3CF.9060005@obs.univ-lyon1.fr>
Message-ID: <903323ff0612280831x1d6aa833w3c566575e5c40596@mail.gmail.com>

Hi Eric,

Here are ways of doing this, starting with

import numpy as N

On 12/28/06, Eric Emsellem wrote:
>
> ### Increasing order in x, and x1 <= x2 :
> x = arange(0.,1.,0.1)
> x1 = 0.1
> x2 = 0.55
> ### the output I would like is simply: array([ 0.1, 0.2, 0.3, 0.4, 0.5])

How about this?

x = N.arange(0.,1.,0.1)
x[ (x>=0.1) & (x<=0.55) ]

> ### decreasing order in x, and x1 <= x2 :
> x = arange(0.,-1.,-0.1)
> x1 = -0.55
> x2 = -0.1
> ### what I would like is then: array([ -0.5, -0.4, -0.3, -0.2, -0.1])

x = N.arange(0.,-1.,-0.1)
N.sort( x[ (x<=-0.1) & (x>=-0.55) ] )

or

x[(x<=-0.1)&(x>=-0.55)][::-1]

which just reverses the returned array.

> ### decreasing order in x, and x1 >= x2 :
> x = arange(0.,-1.,-0.1)
> x1 = -0.1
> x2 = -0.55
> ### what I would like is then: array([ -0.1, -0.2, -0.3, -0.4, -0.5])

x = N.arange(0.,-1.,-0.1)
x[ (x<=-0.1) & (x>=-0.55) ]

A few comments, because I'm not totally clear on what you want to do:

(x<=-0.1)&(x>=-0.55)

will give you a boolean array of the same length as x;

find((x<=-0.1)&(x>=-0.55))

will return the list of indices where the argument is true.

Regards,
Greg

--
Linux. Because rebooting is for adding hardware.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From emsellem at obs.univ-lyon1.fr Thu Dec 28 11:56:37 2006
From: emsellem at obs.univ-lyon1.fr (Eric Emsellem)
Date: Thu, 28 Dec 2006 17:56:37 +0100
Subject: [Numpy-discussion] extracting values from an array
In-Reply-To: <903323ff0612280831x1d6aa833w3c566575e5c40596@mail.gmail.com> References: <4593E3CF.9060005@obs.univ-lyon1.fr> <903323ff0612280831x1d6aa833w3c566575e5c40596@mail.gmail.com>
Message-ID: <4593F745.5050804@obs.univ-lyon1.fr>

An HTML attachment was scrubbed...
URL:

From gregwillden at gmail.com Thu Dec 28 12:38:23 2006
From: gregwillden at gmail.com (Greg Willden)
Date: Thu, 28 Dec 2006 11:38:23 -0600
Subject: [Numpy-discussion] extracting values from an array
In-Reply-To: <4593F745.5050804@obs.univ-lyon1.fr> References: <4593E3CF.9060005@obs.univ-lyon1.fr> <903323ff0612280831x1d6aa833w3c566575e5c40596@mail.gmail.com> <4593F745.5050804@obs.univ-lyon1.fr>
Message-ID: <903323ff0612280938h750aafefyebafab3608b82c0e@mail.gmail.com>

Hi Eric,

Well, I think that you have the parts that you need. Perhaps something like this is what you want: put x1 and x2 into an array and sort it, then access it from the sorted array.

x = N.arange(0.,-1.,-0.1);
xs = sort(array([-0.1, -0.55]));
sort(x[(x >= xs[0]) & (x <= xs[1])])

returns: [-0.5,-0.4,-0.3,-0.2,-0.1,]

x = N.arange(0.,1.,0.1);
xs = sort(array([0.1, 0.55]));
sort(x[(x >= xs[0]) & (x <= xs[1])])

returns: [ 0.1, 0.2, 0.3, 0.4, 0.5,]

Same code, just different x and different limits going into xs.

Cheers,
Greg

On 12/28/06, Eric Emsellem wrote:
>
> Hi,
> thanks for the answer, but I guess my request was not clear. What I want is something which works in ALL cases, so that function(x, x1, x2) provides the output I mentioned... What you propose (as far as I can see) depends on the values of x1, x2, their order and the order of x (decreasing, increasing)...
>
> if you have a hint on how to do this without TESTING how x is ordered (dec, inc) and which of x1 or x2 is larger...
> thanks
>
> Eric
>
> Greg Willden wrote:
>
> Hi Eric,
> Here are ways of doing this, starting with
> import numpy as N
>
> On 12/28/06, Eric Emsellem wrote:
>>
>> ### Increasing order in x, and x1 <= x2 :
>> x = arange(0.,1.,0.1)
>> x1 = 0.1
>> x2 = 0.55
>> ### the output I would like is simply: array([ 0.1, 0.2, 0.3, 0.4, 0.5])
>
> How about this?
> x = N.arange(0.,1.,0.1)
> x[ (x>=0.1) & (x<=0.55) ]
>
>> ### decreasing order in x, and x1 <= x2 :
>> x = arange(0.,-1.,-0.1)
>> x1 = -0.55
>> x2 = -0.1
>> ### what I would like is then: array([ -0.5, -0.4, -0.3, -0.2, -0.1])
>
> x = N.arange(0.,-1.,-0.1)
> N.sort( x[ (x<=-0.1) & (x>=-0.55) ] )
> or
> x[(x<=-0.1)&(x>=-0.55)][::-1]
> which just reverses the returned array.
>
>> ### decreasing order in x, and x1 >= x2 :
>> x = arange(0.,-1.,-0.1)
>> x1 = -0.1
>> x2 = -0.55
>> ### what I would like is then: array([ -0.1, -0.2, -0.3, -0.4, -0.5])
>
> x = N.arange(0.,-1.,-0.1)
> x[ (x<=-0.1) & (x>=-0.55) ]
>
> A few comments, because I'm not totally clear on what you want to do:
> (x<=-0.1)&(x>=-0.55)
> will give you a boolean array of the same length as x;
> find((x<=-0.1)&(x>=-0.55))
> will return the list of indices where the argument is true.
>
> Regards,
> Greg
>
> --
> Linux. Because rebooting is for adding hardware.
>
> ------------------------------
>
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>
> --
> ====================================================================
> Eric Emsellem emsellem at obs.univ-lyon1.fr
> Centre de Recherche Astrophysique de Lyon
> 9 av. Charles-Andre tel: +33 (0)4 78 86 83 84
> 69561 Saint-Genis Laval Cedex fax: +33 (0)4 78 86 83 86
> France http://www-obs.univ-lyon1.fr/eric.emsellem
> ====================================================================
>
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion

--
Linux. Because rebooting is for adding hardware.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From emsellem at obs.univ-lyon1.fr Thu Dec 28 12:59:01 2006
From: emsellem at obs.univ-lyon1.fr (Eric Emsellem)
Date: Thu, 28 Dec 2006 18:59:01 +0100
Subject: [Numpy-discussion] extracting values from an array
In-Reply-To: <903323ff0612280938h750aafefyebafab3608b82c0e@mail.gmail.com>
References: <4593E3CF.9060005@obs.univ-lyon1.fr> <903323ff0612280831x1d6aa833w3c566575e5c40596@mail.gmail.com> <4593F745.5050804@obs.univ-lyon1.fr> <903323ff0612280938h750aafefyebafab3608b82c0e@mail.gmail.com>
Message-ID: <459405E5.4060303@obs.univ-lyon1.fr>

looks ok, except that I don't want to sort the output but keep the right order depending on x0 and x1, so I then have to add the order I want for the output array, maybe with something like:

import numpy as num

## init
x0 = -0.55
x1 = -0.1
start, stop, step = 0., -1., -0.1
x = num.arange(start, stop, step)

## getting the right output
xs = num.sort(num.array([x0, x1]))
x[(x >= xs[0]) & (x <= xs[1])][::int(num.sign(x1 - x0) * num.sign(step))]

should work. I don't see a simpler way here...
thanks to get me on track!!!

Eric

Greg Willden wrote:
> Hi Eric,
> Well I think that you have the parts that you need.
> Perhaps something like this is what you want. Put x1 and x2 into an array
> and sort it, then access it from the sorted array.
>
> x = N.arange(0.,-1.,-0.1)
> xs = N.sort(N.array([-0.1, -0.55]))
> N.sort(x[(x >= xs[0]) & (x <= xs[1])])
>
> returns: [-0.5,-0.4,-0.3,-0.2,-0.1,]
>
> x = N.arange(0.,1.,0.1)
> xs = N.sort(N.array([0.1, 0.55]))
> N.sort(x[(x >= xs[0]) & (x <= xs[1])])
>
> returns: [ 0.1, 0.2, 0.3, 0.4, 0.5,]

--
====================================================================
Eric Emsellem emsellem at obs.univ-lyon1.fr
Centre de Recherche Astrophysique de Lyon
9 av. Charles-Andre tel: +33 (0)4 78 86 83 84
69561 Saint-Genis Laval Cedex fax: +33 (0)4 78 86 83 86
France http://www-obs.univ-lyon1.fr/eric.emsellem
====================================================================

From eike.welk at gmx.net Thu Dec 28 13:54:55 2006
From: eike.welk at gmx.net (Eike Welk)
Date: Thu, 28 Dec 2006 19:54:55 +0100
Subject: [Numpy-discussion] newbie: attempt at data frame
In-Reply-To: References: Message-ID: <200612281954.56003.eike.welk@gmx.net>

If your main concern is to store scientific data on disk you might try:
http://www.pytables.org/moin

However, it uses numarray internally and a C library, which you have to build from source. (You use a Mac, right?)

Concerning your code:
- Your two file solution seems impractical to me. I think you should just pickle your whole dbase object.
- Maybe you should write 'load' and 'store' methods that create the temporary file, Pickler and Unpickler objects.
- The __init__ method should then construct the object from a list of variable names and an array.
- Of course you need a set method.

More ideas:
- A special variable name 'time'. Then you can implement a getAtTime(varNameList, timePoint) method with interpolation.
- A 'plot' method that works like matplotlib's plot function.
- An extract(varNameList) method, that returns a new dbase object with only the selected variables.
- A companion class that can hold several time series at once to compare different experiments.

Finally, post the code to the mailing list. At least I would like to use such a class :-).

Yours
Eike.
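A minimal sketch of the 'load' and 'store' methods Eike suggests, assuming the dbase object keeps its variable names and its data array in attributes called varnames and data (both attribute names are made up here):

import cPickle

class dbase:
    # ... rest of the class as posted ...

    def store(self, fname):
        # pickle the variable names and the data array together
        f = open(fname, 'wb')
        cPickle.dump((self.varnames, self.data), f, 2)
        f.close()

    def load(self, fname):
        # restore the variable names and the data array
        f = open(fname, 'rb')
        self.varnames, self.data = cPickle.load(f)
        f.close()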
From v-nijs at kellogg.northwestern.edu Thu Dec 28 14:40:12 2006
From: v-nijs at kellogg.northwestern.edu (Vincent Nijs)
Date: Thu, 28 Dec 2006 13:40:12 -0600
Subject: [Numpy-discussion] newbie: attempt at data frame
In-Reply-To: <200612281954.56003.eike.welk@gmx.net>
Message-ID:

Thanks for the input Eike.

I will add load and store methods to Pickle/UnPickle the object. First, however, I have to get the data into the class from an ascii file (txt or csv).

I'd like to read the data and variable names directly from a single csv file. I tried this through the python csv module but it would read all data as strings and I couldn't figure out how to easily separate the variable names and the data. If you have any suggestion on how I might do this please let me know.

Unfortunately I don't know what a 'set' method is or would do :) Could you point to an example perhaps?

I like your ideas for extending the class. I'll look into that when I get the basic class working.

Best,

Vincent

On 12/28/06 12:54 PM, "Eike Welk" wrote:

> If your main concern is to store scientific data on disk you might try:
> http://www.pytables.org/moin
>
> However, it uses numarray internally and a C library, which you have
> to build from source. (You use a Mac, right?)
>
> Concerning your code:
> - Your two file solution seems impractical to me. I think you should
> just pickle your whole dbase object.
> - Maybe you should write 'load' and 'store' methods that create the
> temporary file, Pickler and Unpickler objects.
> - The __init__ method should then construct the object from a list of
> variable names and an array.
> - Of course you need a set method.
>
> More ideas:
> - A special variable name 'time'. Then you can implement a
> getAtTime(varNameList, timePoint) method with interpolation.
> - A 'plot' method that works like matplotlib's plot function.
> - An extract(varNameList) method, that returns a new dbase object with
> only the selected variables.
> - A companion class that can hold several time series at once to
> compare different experiments.
>
> Finally, post the code to the mailing list. At least I would like to
> use such a class :-).
>
> Yours
> Eike.
>
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion

--
Vincent R. Nijs
Assistant Professor of Marketing
Kellogg School of Management, Northwestern University
2001 Sheridan Road, Evanston, IL 60208-2001
Phone: +1-847-491-4574 Fax: +1-847-491-2498
E-mail: v-nijs at kellogg.northwestern.edu
Skype: vincentnijs
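A minimal sketch of one way to read such a file with the csv module, assuming a header row of variable names followed by purely numeric rows (the file name and function name are hypothetical):

import csv
import numpy as N

def loadcsv(fname):
    # first row holds the variable names, the rest is numeric data
    reader = csv.reader(open(fname, 'r'))
    varnames = reader.next()
    data = N.array([[float(v) for v in row] for row in reader])
    return varnames, data

# usage: varnames, data = loadcsv('mydata.csv')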
From v-nijs at kellogg.northwestern.edu Thu Dec 28 18:39:12 2006
From: v-nijs at kellogg.northwestern.edu (Vincent Nijs)
Date: Thu, 28 Dec 2006 17:39:12 -0600
Subject: [Numpy-discussion] newbie: attempt at data frame
In-Reply-To: Message-ID:

Based on Eike's input the dbase class can now also load and dump (simple) csv and pickle files. See the tests at the bottom of the file and the doc-strings.

If there is an easy way to read array data + variable names using the csv module it would be great if that could be added to cookbook/InputOutput. I couldn't figure out how to do it.

Eike:
I think I can figure out how to add a plot method. However, if you have some more suggestions on how to implement the getAtTime, extract, and set methods you mentioned that would be great.

Vincent

On 12/28/06 1:40 PM, "Vincent Nijs" wrote:

> Thanks for the input Eike.
>
> I will add load and store methods to Pickle/UnPickle the object. First,
> however, I have to get the data into the class from an ascii file (txt or
> csv).
>
> I'd like to read the data and variable names directly from a single csv
> file. I tried this through the python csv module but it would read all data
> as strings and I couldn't figure out how to easily separate the variable
> names and the data. If you have any suggestion on how I might do this please
> let me know.
>
> Unfortunately I don't know what a 'set' method is or would do :) Could you
> point to an example perhaps?
>
> I like your ideas for extending the class. I'll look into that when I get
> the basic class working.
>
> Best,
>
> Vincent
>
> On 12/28/06 12:54 PM, "Eike Welk" wrote:
>
>> If your main concern is to store scientific data on disk you might try:
>> http://www.pytables.org/moin
>>
>> However, it uses numarray internally and a C library, which you have
>> to build from source. (You use a Mac, right?)
>>
>> Concerning your code:
>> - Your two file solution seems impractical to me. I think you should
>> just pickle your whole dbase object.
>> - Maybe you should write 'load' and 'store' methods that create the
>> temporary file, Pickler and Unpickler objects.
>> - The __init__ method should then construct the object from a list of
>> variable names and an array.
>> - Of course you need a set method.
>>
>> More ideas:
>> - A special variable name 'time'. Then you can implement a
>> getAtTime(varNameList, timePoint) method with interpolation.
>> - A 'plot' method that works like matplotlib's plot function.
>> - An extract(varNameList) method, that returns a new dbase object with
>> only the selected variables.
>> - A companion class that can hold several time series at once to
>> compare different experiments.
>>
>> Finally, post the code to the mailing list. At least I would like to
>> use such a class :-).
>>
>> Yours
>> Eike.
>>
>> _______________________________________________
>> Numpy-discussion mailing list
>> Numpy-discussion at scipy.org
>> http://projects.scipy.org/mailman/listinfo/numpy-discussion

--
Vincent R. Nijs
Assistant Professor of Marketing
Kellogg School of Management, Northwestern University
2001 Sheridan Road, Evanston, IL 60208-2001
Phone: +1-847-491-4574 Fax: +1-847-491-2498
E-mail: v-nijs at kellogg.northwestern.edu
Skype: vincentnijs
-------------- next part --------------
A non-text attachment was scrubbed...
Name: dbase.py
Type: application/octet-stream
Size: 2800 bytes
Desc: not available
URL:
From v-nijs at kellogg.northwestern.edu Thu Dec 28 20:43:42 2006
From: v-nijs at kellogg.northwestern.edu (Vincent Nijs)
Date: Thu, 28 Dec 2006 19:43:42 -0600
Subject: [Numpy-discussion] newbie: attempt at data frame
In-Reply-To: Message-ID:

Sorry for the extra post. There were a few errors in the previous attachment.

Vincent

On 12/28/06 5:39 PM, "Vincent Nijs" wrote:

> Based on Eike's input the dbase class can now also load and dump (simple)
> csv and pickle files. See the tests at the bottom of the file and the
> doc-strings.
>
> If there is an easy way to read array data + variable names using the csv
> module it would be great if that could be added to cookbook/InputOutput. I
> couldn't figure out how to do it.
>
> Eike:
> I think I can figure out how to add a plot method. However, if you have some
> more suggestions on how to implement the getAtTime, extract, and set methods
> you mentioned that would be great.
>
> Vincent
>
> On 12/28/06 1:40 PM, "Vincent Nijs" wrote:
>
>> Thanks for the input Eike.
>>
>> I will add load and store methods to Pickle/UnPickle the object. First,
>> however, I have to get the data into the class from an ascii file (txt or
>> csv).
>>
>> I'd like to read the data and variable names directly from a single csv
>> file. I tried this through the python csv module but it would read all data
>> as strings and I couldn't figure out how to easily separate the variable
>> names and the data. If you have any suggestion on how I might do this please
>> let me know.
>>
>> Unfortunately I don't know what a 'set' method is or would do :) Could you
>> point to an example perhaps?
>>
>> I like your ideas for extending the class. I'll look into that when I get
>> the basic class working.
>>
>> Best,
>>
>> Vincent
>>
>> On 12/28/06 12:54 PM, "Eike Welk" wrote:
>>
>>> If your main concern is to store scientific data on disk you might try:
>>> http://www.pytables.org/moin
>>>
>>> However, it uses numarray internally and a C library, which you have
>>> to build from source. (You use a Mac, right?)
>>>
>>> Concerning your code:
>>> - Your two file solution seems impractical to me. I think you should
>>> just pickle your whole dbase object.
>>> - Maybe you should write 'load' and 'store' methods that create the
>>> temporary file, Pickler and Unpickler objects.
>>> - The __init__ method should then construct the object from a list of
>>> variable names and an array.
>>> - Of course you need a set method.
>>>
>>> More ideas:
>>> - A special variable name 'time'. Then you can implement a
>>> getAtTime(varNameList, timePoint) method with interpolation.
>>> - A 'plot' method that works like matplotlib's plot function.
>>> - An extract(varNameList) method, that returns a new dbase object with
>>> only the selected variables.
>>> - A companion class that can hold several time series at once to
>>> compare different experiments.
>>>
>>> Finally, post the code to the mailing list. At least I would like to
>>> use such a class :-).
>>>
>>> Yours
>>> Eike.
>>>
>>> _______________________________________________
>>> Numpy-discussion mailing list
>>> Numpy-discussion at scipy.org
>>> http://projects.scipy.org/mailman/listinfo/numpy-discussion

--
Vincent R. Nijs
Assistant Professor of Marketing
Kellogg School of Management, Northwestern University
2001 Sheridan Road, Evanston, IL 60208-2001
Phone: +1-847-491-4574 Fax: +1-847-491-2498
E-mail: v-nijs at kellogg.northwestern.edu
Skype: vincentnijs
-------------- next part --------------
A non-text attachment was scrubbed...
Name: dbase.py
Type: application/octet-stream
Size: 2862 bytes
Desc: not available
URL:

From nadavh at visionsense.com Fri Dec 29 01:02:07 2006
From: nadavh at visionsense.com (Nadav Horesh)
Date: Fri, 29 Dec 2006 08:02:07 +0200
Subject: [Numpy-discussion] optimize.fmin_cg bug
Message-ID: <07C6A61102C94148B8104D42DE95F7E8C8F163@exchange2k.envision.co.il>

optimize.fmin_cg fails when the function to be minimized returns a numpy scalar:

>>> from numpy import *
>>> from scipy import optimize
>>> f = lambda x: exp(x)-x
>>> df = lambda x: exp(x)-1.0
>>> scipy.optimize.fmin(f, [0.2])
Traceback (most recent call last):
  File "", line 1, in
    scipy.optimize.fmin(f, [0.2])
NameError: name 'scipy' is not defined
>>> optimize.fmin(f, [0.2])
Optimization terminated successfully.
         Current function value: 1.000000
         Iterations: 15
         Function evaluations: 30
array([ 3.88578059e-16])
>>> optimize.fmin_cg(f, [0.2], df)
Traceback (most recent call last):
  File "", line 1, in
    optimize.fmin_cg(f, [0.2], df)
  File "C:\Python25\Lib\site-packages\scipy\optimize\optimize.py", line 855, in fmin_cg
    old_fval_backup,old_old_fval_backup)
  File "C:\Python25\Lib\site-packages\scipy\optimize\optimize.py", line 471, in line_search
    phi0, derphi0, c1, c2)
  File "C:\Python25\Lib\site-packages\scipy\optimize\optimize.py", line 359, in zoom
    a_j = _cubicmin(a_lo, phi_lo, derphi_lo, a_hi, phi_hi, a_rec, phi_rec)
  File "C:\Python25\Lib\site-packages\scipy\optimize\optimize.py", line 309, in _cubicmin
    [A,B] = numpy.dot([[dc**2, -db**2],[-dc**3, db**3]],[fb-fa-C*db,fc-fa-C*dc])
ValueError: objects are not aligned

# Here is the cure: Make f return a python float
>>> f = lambda x: float(exp(x)-x)
>>> optimize.fmin_cg(f, [0.2], df)
Optimization terminated successfully.
         Current function value: 1.000000
         Iterations: 2
         Function evaluations: 14
         Gradient evaluations: 8
array([ -8.15285339e-14])

I had this error with scipy 0.5.2 and scipy from svn.

Nadav.

From mail at stevesimmons.com Fri Dec 29 04:05:23 2006
From: mail at stevesimmons.com (Stephen Simmons)
Date: Fri, 29 Dec 2006 03:05:23 -0600
Subject: [Numpy-discussion] Advice please on efficient subtotal function
Message-ID:

Hi,

I'm looking for efficient ways to subtotal a 1-d array onto a 2-D grid. This is more easily explained in code than words, thus:

for n in xrange(len(data)):
    totals[ i[n], j[n] ] += data[n]

data comes from a series of PyTables files with ~200m rows. Each row has ~20 cols, and I use the first three columns (which are 1-3 char strings) to form the indexing functions i[] and j[], then want to calc averages of the remaining 17 numerical cols.

I have tried various indirect ways of doing this with searchsorted and bincount, but intuitively they feel overly complex solutions to what is essentially a very simple problem.

My work involves comparing the subtotals for various different segmentation strategies (the i[] and j[] indexing functions). Efficient solutions are important because I need to make many passes through the 200m rows of data.
Memory usage is the easiest thing for me to adjust by changing how many rows of data to read in for each pass and then reusing the same array data buffers.

Thanks in advance for any suggestions!

Stephen

From gregwillden at gmail.com Fri Dec 29 09:04:00 2006
From: gregwillden at gmail.com (Greg Willden)
Date: Fri, 29 Dec 2006 08:04:00 -0600
Subject: [Numpy-discussion] Advice please on efficient subtotal function
In-Reply-To: References: Message-ID: <903323ff0612290604m62f7193bwb0451f656093d214@mail.gmail.com>

Hi Stephen,
If you want to sum/average down a column or across a row you can use sum(). The optional axis={0,1} parameter determines whether you are summing down a column (the default, or axis=0) or across a row (axis=1).
Greg

On 12/29/06, Stephen Simmons wrote:
>
> Hi,
>
> I'm looking for efficient ways to subtotal a 1-d array onto a 2-D grid.
> This is more easily explained in code than words, thus:
>
> for n in xrange(len(data)):
>     totals[ i[n], j[n] ] += data[n]
>
> data comes from a series of PyTables files with ~200m rows. Each row has
> ~20 cols, and I use the first three columns (which are 1-3 char strings) to
> form the indexing functions i[] and j[], then want to calc averages of the
> remaining 17 numerical cols.
>
> I have tried various indirect ways of doing this with searchsorted and
> bincount, but intuitively they feel overly complex solutions to what is
> essentially a very simple problem.
>
> My work involves comparing the subtotals for various different
> segmentation strategies (the i[] and j[] indexing functions). Efficient
> solutions are important because I need to make many passes through the
> 200m rows of data.
> Memory usage is the easiest thing for me to adjust by changing how many
> rows of data to read in for each pass and then reusing the same array
> data buffers.
>
> Thanks in advance for any suggestions!
>
> Stephen
>
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion

--
Linux. Because rebooting is for adding hardware.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From faltet at carabos.com Fri Dec 29 10:05:28 2006
From: faltet at carabos.com (Francesc Altet)
Date: Fri, 29 Dec 2006 16:05:28 +0100
Subject: [Numpy-discussion] Advice please on efficient subtotal function
In-Reply-To: References: Message-ID: <200612291605.29095.faltet@carabos.com>

On Friday 29 December 2006 10:05, Stephen Simmons wrote:
> Hi,
>
> I'm looking for efficient ways to subtotal a 1-d array onto a 2-D grid.
> This is more easily explained in code than words, thus:
>
> for n in xrange(len(data)):
>     totals[ i[n], j[n] ] += data[n]
>
> data comes from a series of PyTables files with ~200m rows. Each row has
> ~20 cols, and I use the first three columns (which are 1-3 char strings) to
> form the indexing functions i[] and j[], then want to calc averages of the
> remaining 17 numerical cols.
>
> I have tried various indirect ways of doing this with searchsorted and
> bincount, but intuitively they feel overly complex solutions to what is
> essentially a very simple problem.
>
> My work involves comparing the subtotals for various different segmentation
> strategies (the i[] and j[] indexing functions). Efficient solutions are
> important because I need to make many passes through the 200m rows of data.
> Memory usage is the easiest thing for me to adjust by changing how many
> rows of data to read in for each pass and then reusing the same array data
> buffers.

Well, from your words I guess you should already have tested this, but just in case. As PyTables saves data in tables row-wise, it is always faster using the complete row for computations in each iteration than using just a single column. This is shown in the small benchmark that I'm attaching at the end of the message. Here is its output for a table with 1m rows:

time for creating the file--> 12.044
time for using column reads --> 46.407
time for using the row wise iterator--> 73.036
time for using block reads (row wise)--> 5.156

So, using block reads (in case you can use them) is your best bet.

HTH,

--------------------------------------------------------------------------------------
import tables
import numpy
from time import time

nrows = 1000*1000

# Create a table definition with 17 double cols and 3 string cols
coltypes = numpy.dtype("f8,"*17 + "S3,"*3)

t1 = time()
# Create a file with an empty table. Use compression to minimize file size.
f = tables.openFile("/tmp/prova.h5", 'w')
table = f.createTable(f.root, 'table', numpy.empty(0, coltypes),
                      filters=tables.Filters(complevel=1, complib='lzo'))
# Fill the table with default values (empty strings and zeros)
row = table.row
for nrow in xrange(nrows):
    row.append()
f.close()
print "time for creating the file-->", round(time()-t1, 3)

# *********** Start benchmarks **************************
f = tables.openFile("/tmp/prova.h5", 'r')
table = f.root.table
colnames = table.colnames[:-3]  # exclude the string cols

# Loop over the table using column reads
t1 = time(); cum = numpy.zeros(17)
for ncol, colname in enumerate(colnames):
    col = table.read(0, nrows, field=colname)
    cum[ncol] += col.sum()
print "time for using column reads -->", round(time()-t1, 3)

# Loop over the table using its row iterator
t1 = time(); cum = numpy.zeros(17)
for row in table:
    for ncol, colname in enumerate(colnames):
        cum[ncol] += row[colname]
print "time for using the row iterator-->", round(time()-t1, 3)

# Loop over the table using block reads (row wise)
t1 = time(); cum = numpy.zeros(17)
step = 10000
for nrow in xrange(0, nrows, step):
    ra = table[nrow:nrow+step]
    for ncol, colname in enumerate(colnames):
        cum[ncol] += ra[colname].sum()
print "time for using block reads (row wise)-->", round(time()-t1, 3)

f.close()

--
>0,0<   Francesc Altet     http://www.carabos.com/
V   V   Cárabos Coop. V.   ¡¡Enjoy Data
 "-"

From Chris.Barker at noaa.gov Fri Dec 29 13:29:45 2006
From: Chris.Barker at noaa.gov (Christopher Barker)
Date: Fri, 29 Dec 2006 10:29:45 -0800
Subject: [Numpy-discussion] Newbie Question, Probability
In-Reply-To: <4592CE32.3060700@noaa.gov>
References: <627102C921CD9745B070C3B10CB8199B010EBF69@hardwire.esri.com> <458AF84D.1060400@ee.byu.edu> <200612211613.18347.pgmdevlist@gmail.com> <458B0000.8070909@gmail.com> <458BA9B5.6050709@gmx.net> <458CCC25.7050507@ee.byu.edu> <4592CE32.3060700@noaa.gov>
Message-ID: <45955E99.9070802@noaa.gov>

I just discovered the Scipy Superpack for OS X:

http://trichech.us/?page_id=4

Maybe this will help folks looking for an OS X Scipy build.

-Chris

--
Christopher Barker, Ph.D.
Oceanographer
Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception
Chris.Barker at noaa.gov

From mail at stevesimmons.com Fri Dec 29 19:22:16 2006
From: mail at stevesimmons.com (Stephen Simmons)
Date: Fri, 29 Dec 2006 18:22:16 -0600
Subject: [Numpy-discussion] Advice please on efficient subtotal function
In-Reply-To: <200612291605.29095.faltet@carabos.com>
References: <200612291605.29095.faltet@carabos.com>
Message-ID:

Thanks Francesc, but I am already planning to read the data block-wise as you suggest. My question is rather how best to update the subtotals for each block in a parallel way using numpy efficiently, rather than with a simplistic and slow element-by-element loop. I can't use a simple sum(), as in your benchmark example or Greg's reply, because I need to do:

for n in xrange(len(data)):
    totals[ i[n], j[n] ] += data[n]

and not

for n in xrange(len(data)):
    totals[n] += data[n]

My best solution so far is roughly like this:
- read in the next block of 100k or so rows (taking into account the PyTables table's _v_maxTuples and _v_chunksize)
- calculate the subtotal index arrays i and j
- do a lexsort() on [i, j, n]
- partition the sorted [i, j, n] into subsets where the i and j arrays change values. The k_th such subset is thus s_k = [ i_k, j_k, [n_k0, ..., n_kN] ]
- update the subtotals for each subset in the block: totals[i_k, j_k] += sum(data[n_k0, ..., n_kN])

This should be reasonably efficient, but it's messy, and I'm not familiar enough with numpy's indexing tricks to get this right the first time.

Maybe instead I'll have a go at writing a Pyrex function that implements the simple loop at C speed:

subtotal2d(data_array, idx_array, out=None, dtype=None)

where data_array is Nx1, idx_array is NxM and out is M-dimensional.

Incidentally, there's one other function I'd find useful here in forming the index arrays i[] and j[], a fast translate-from-dict function:

arr2 = fromiter((d[a] for a in arr1), dtype)

My initial impression is that a C version would be substantially faster; maybe I should do some benchmarking to see whether a pure Python/numpy approach is actually faster than I expect.

Cheers, and thanks for any further suggestions,

Stephen

Francesc Altet wrote:
> On Friday 29 December 2006 10:05, Stephen Simmons wrote:
> > Hi,
> >
> > I'm looking for efficient ways to subtotal a 1-d array onto a 2-D grid.
> > This is more easily explained in code than words, thus:
> >
> > for n in xrange(len(data)):
> >     totals[ i[n], j[n] ] += data[n]
> >
> > data comes from a series of PyTables files with ~200m rows. Each row has
> > ~20 cols, and I use the first three columns (which are 1-3 char strings) to
> > form the indexing functions i[] and j[], then want to calc averages of the
> > remaining 17 numerical cols.
> >
> > I have tried various indirect ways of doing this with searchsorted and
> > bincount, but intuitively they feel overly complex solutions to what is
> > essentially a very simple problem.
> >
> > My work involves comparing the subtotals for various different segmentation
> > strategies (the i[] and j[] indexing functions). Efficient solutions are
> > important because I need to make many passes through the 200m rows of data.
> > Memory usage is the easiest thing for me to adjust by changing how many
> > rows of data to read in for each pass and then reusing the same array data
> > buffers.
>
> Well, from your words I guess you should already have tested this, but just in
> case. As PyTables saves data in tables row-wise, it is always faster using
> the complete row for computations in each iteration than using just a single
> column. This is shown in the small benchmark that I'm attaching at the end of
> the message. Here is its output for a table with 1m rows:
>
> time for creating the file--> 12.044
> time for using column reads --> 46.407
> time for using the row wise iterator--> 73.036
> time for using block reads (row wise)--> 5.156
>
> So, using block reads (in case you can use them) is your best bet.
>
> HTH,
>
> --------------------------------------------------------------------------------------
> import tables
> import numpy
> from time import time
>
> nrows = 1000*1000
>
> # Create a table definition with 17 double cols and 3 string cols
> coltypes = numpy.dtype("f8,"*17 + "S3,"*3)
>
> t1 = time()
> # Create a file with an empty table. Use compression to minimize file size.
> f = tables.openFile("/tmp/prova.h5", 'w')
> table = f.createTable(f.root, 'table', numpy.empty(0, coltypes),
>                       filters=tables.Filters(complevel=1, complib='lzo'))
> # Fill the table with default values (empty strings and zeros)
> row = table.row
> for nrow in xrange(nrows):
>     row.append()
> f.close()
> print "time for creating the file-->", round(time()-t1, 3)
>
> # *********** Start benchmarks **************************
> f = tables.openFile("/tmp/prova.h5", 'r')
> table = f.root.table
> colnames = table.colnames[:-3]  # exclude the string cols
>
> # Loop over the table using column reads
> t1 = time(); cum = numpy.zeros(17)
> for ncol, colname in enumerate(colnames):
>     col = table.read(0, nrows, field=colname)
>     cum[ncol] += col.sum()
> print "time for using column reads -->", round(time()-t1, 3)
>
> # Loop over the table using its row iterator
> t1 = time(); cum = numpy.zeros(17)
> for row in table:
>     for ncol, colname in enumerate(colnames):
>         cum[ncol] += row[colname]
> print "time for using the row iterator-->", round(time()-t1, 3)
>
> # Loop over the table using block reads (row wise)
> t1 = time(); cum = numpy.zeros(17)
> step = 10000
> for nrow in xrange(0, nrows, step):
>     ra = table[nrow:nrow+step]
>     for ncol, colname in enumerate(colnames):
>         cum[ncol] += ra[colname].sum()
> print "time for using block reads (row wise)-->", round(time()-t1, 3)
>
> f.close()
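One loop-free possibility for the accumulation step Stephen describes is bincount on a flattened cell index. A sketch, assuming i and j are already integer index arrays and the grid shape is known (the function name subtotal2d echoes Stephen's proposal but this numpy version is made up here):

import numpy as N

def subtotal2d(data, i, j, shape):
    # accumulate data[n] into totals[i[n], j[n]] without a Python loop
    flat = i * shape[1] + j                  # combined (row, col) cell index
    sums = N.bincount(flat, weights=data)    # per-cell sums
    totals = N.zeros(shape)
    totals.flat[:len(sums)] = sums           # bincount output may be shorter
    return totals

# the counts needed for averages come the same way:
# counts = N.bincount(flat)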
From eike.welk at gmx.net Fri Dec 29 20:22:24 2006
From: eike.welk at gmx.net (Eike Welk)
Date: Sat, 30 Dec 2006 02:22:24 +0100
Subject: [Numpy-discussion] newbie: attempt at data frame
In-Reply-To: References: Message-ID: <200612300222.25274.eike.welk@gmx.net>

On Friday 29 December 2006 00:39, Vincent Nijs wrote:
> Eike:
> I think I can figure out how to add a plot method. However, if you
> have some more suggestions on how to implement the getAtTime,
> extract, and set methods you mentioned that would be great.

Set method: I thought of a method to change the data. Something like:
myDb.set(varNameList, dataArray)

Extract method: A way to get another dbase object with a subset of variables. Tomorrow I'll propose an implementation. Because your __init__ method wants a file name, it needs to be changed too.

GetAtTime: Maybe your data are samples from some continuous process or function. Then you might want to have values between the stored timepoints. You could compute them through interpolation. The following class from scipy will do the job:
http://www.scipy.org/doc/api_docs/scipy.interpolate.interpolate.interp1d.html

Yours
Eike.
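A minimal sketch of the getAtTime idea with interp1d, assuming the dbase object stores a 'time' variable alongside its data and has a get method returning one variable as an array (the method and attribute names here are hypothetical):

import numpy as N
from scipy.interpolate import interp1d

def getAtTime(db, varNameList, timePoint):
    # interpolate the named variables at an arbitrary time point
    t = db.get('time')                       # sample times, shape (n,)
    rows = [db.get(name) for name in varNameList]
    f = interp1d(t, N.array(rows))           # interpolates along the last axis
    return f(timePoint)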
From bthom at cs.hmc.edu Fri Dec 29 21:04:21 2006
From: bthom at cs.hmc.edu (belinda thom)
Date: Fri, 29 Dec 2006 18:04:21 -0800
Subject: [Numpy-discussion] test issue
Message-ID:

Hello,

I've been going thru Dave Kuhlman's "SciPy Course Outline" (http://www.rexx.com/~dkuhlman/scipy_course_01.html) and found out about test functions -- very cool. Except that on my end, not all tests pass (appended below). Is this a problem for other people? Is it something I should worry about?

Here's my setup: Mac G5 w/OS X 10.4.8, using MacPython 2.4, numpy.__version__ is 1.0, matplotlib.__version__ 0.87.7 and Numeric.__version__ 24.2

Thanks,

--b

==========

In [94]: import numpy

In [95]: numpy.test()
Found 13 tests for numpy.core.umath
Found 9 tests for numpy.lib.arraysetops
Found 3 tests for numpy.fft.helper
Found 1 tests for numpy.lib.ufunclike
Found 4 tests for numpy.ctypeslib
Found 2 tests for numpy.lib.polynomial
Found 8 tests for numpy.core.records
Found 26 tests for numpy.core.numeric
Found 5 tests for numpy.distutils.misc_util
Found 3 tests for numpy.lib.getlimits
Found 31 tests for numpy.core.numerictypes
Found 4 tests for numpy.core.scalarmath
Found 12 tests for numpy.lib.twodim_base
Found 47 tests for numpy.lib.shape_base
Found 4 tests for numpy.lib.index_tricks
Found 32 tests for numpy.linalg.linalg
Found 42 tests for numpy.lib.type_check
Found 184 tests for numpy.core.multiarray
Found 36 tests for numpy.core.ma
Found 10 tests for numpy.core.defmatrix
Found 41 tests for numpy.lib.function_base
Found 0 tests for __main__
........................................................................
........................................................................
........................................................................
........................................................................
........................................................................
F.......................................................................
........................................................................
.............
======================================================================
FAIL: Ticket #112
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/numpy/core/tests/test_regression.py", line 220, in check_longfloat_repr
    assert(str(a)[1:9] == str(a[0])[:8])
AssertionError

----------------------------------------------------------------------
Ran 517 tests in 1.241s

FAILED (failures=1)

From efiring at hawaii.edu Fri Dec 29 21:58:43 2006
From: efiring at hawaii.edu (Eric Firing)
Date: Fri, 29 Dec 2006 16:58:43 -1000
Subject: [Numpy-discussion] test issue
In-Reply-To: References: Message-ID: <4595D5E3.7000002@hawaii.edu>

belinda thom wrote:
> Hello,
>
> I've been going thru Dave Kuhlman's "SciPy Course Outline"
> (http://www.rexx.com/~dkuhlman/scipy_course_01.html) and found out about test
> functions -- very cool. Except that on my end, not all tests pass
> (appended below). Is this a problem for other people? Is it something
> I should worry about?

Not a real problem. The test has been commented out in svn with the notation:

    # Longfloat support is not consistent enough across
    # platforms for this test to be meaningful.
Eric

>
> Here's my setup: Mac G5 w/OS X 10.4.8, using MacPython 2.4,
> numpy.__version__ is 1.0, matplotlib.__version__ 0.87.7 and
> Numeric.__version__ 24.2
>
> Thanks,
>
> --b
>
> ==========
>
> In [94]: import numpy
>
> In [95]: numpy.test()
> Found 13 tests for numpy.core.umath
> Found 9 tests for numpy.lib.arraysetops
> Found 3 tests for numpy.fft.helper
> Found 1 tests for numpy.lib.ufunclike
> Found 4 tests for numpy.ctypeslib
> Found 2 tests for numpy.lib.polynomial
> Found 8 tests for numpy.core.records
> Found 26 tests for numpy.core.numeric
> Found 5 tests for numpy.distutils.misc_util
> Found 3 tests for numpy.lib.getlimits
> Found 31 tests for numpy.core.numerictypes
> Found 4 tests for numpy.core.scalarmath
> Found 12 tests for numpy.lib.twodim_base
> Found 47 tests for numpy.lib.shape_base
> Found 4 tests for numpy.lib.index_tricks
> Found 32 tests for numpy.linalg.linalg
> Found 42 tests for numpy.lib.type_check
> Found 184 tests for numpy.core.multiarray
> Found 36 tests for numpy.core.ma
> Found 10 tests for numpy.core.defmatrix
> Found 41 tests for numpy.lib.function_base
> Found 0 tests for __main__
> ........................................................................
> ........................................................................
> ........................................................................
> ........................................................................
> ........................................................................
> F.......................................................................
> ........................................................................
> .............
> ======================================================================
> FAIL: Ticket #112
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/numpy/core/tests/test_regression.py", line 220, in check_longfloat_repr
>     assert(str(a)[1:9] == str(a[0])[:8])
> AssertionError
>
> ----------------------------------------------------------------------
> Ran 517 tests in 1.241s
>
> FAILED (failures=1)
>
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion

From bthom at cs.hmc.edu Fri Dec 29 23:02:45 2006
From: bthom at cs.hmc.edu (belinda thom)
Date: Fri, 29 Dec 2006 20:02:45 -0800
Subject: [Numpy-discussion] test issue
In-Reply-To: <4595D5E3.7000002@hawaii.edu>
References: <4595D5E3.7000002@hawaii.edu>
Message-ID: <2A589751-77C9-4F8E-8B7F-0112A1876EA5@cs.hmc.edu>

Eric,

Thanks for the well-thought-out answers to some of my recent posts.

I've been using:

http://pythonmac.org/packages/py24-fat/index.html

for installing scipy, numpy, and matplotlib, as I didn't feel as confident installing things manually.

Should I be using svn instead? (Is that what most users do?) And if so, is there a two-minute tutorial on what I'd need to do to get that stuff running on my machine? (The code I end up using needs to be stable enough for classroom use).

Thanks again,

--b

On Dec 29, 2006, at 6:58 PM, Eric Firing wrote:

> belinda thom wrote:
>> Hello,
>>
>> I've been going thru Dave Kuhlman's "SciPy Course Outline"
>> (http://www.rexx.com/~dkuhlman/scipy_course_01.html) and found out about test
>> functions -- very cool. Except that on my end, not all tests pass
>> (appended below).
>> Is this a problem for other people? Is it something
>> I should worry about?
>
> Not a real problem. The test has been commented out in svn with the
> notation:
>
>     # Longfloat support is not consistent enough across
>     # platforms for this test to be meaningful.
>
> Eric
>
>>
>> Here's my setup: Mac G5 w/OS X 10.4.8, using MacPython 2.4,
>> numpy.__version__ is 1.0, matplotlib.__version__ 0.87.7 and
>> Numeric.__version__ 24.2
>>
>> Thanks,
>>
>> --b
>>
>> ==========
>>
>> In [94]: import numpy
>>
>> In [95]: numpy.test()
>> Found 13 tests for numpy.core.umath
>> Found 9 tests for numpy.lib.arraysetops
>> Found 3 tests for numpy.fft.helper
>> Found 1 tests for numpy.lib.ufunclike
>> Found 4 tests for numpy.ctypeslib
>> Found 2 tests for numpy.lib.polynomial
>> Found 8 tests for numpy.core.records
>> Found 26 tests for numpy.core.numeric
>> Found 5 tests for numpy.distutils.misc_util
>> Found 3 tests for numpy.lib.getlimits
>> Found 31 tests for numpy.core.numerictypes
>> Found 4 tests for numpy.core.scalarmath
>> Found 12 tests for numpy.lib.twodim_base
>> Found 47 tests for numpy.lib.shape_base
>> Found 4 tests for numpy.lib.index_tricks
>> Found 32 tests for numpy.linalg.linalg
>> Found 42 tests for numpy.lib.type_check
>> Found 184 tests for numpy.core.multiarray
>> Found 36 tests for numpy.core.ma
>> Found 10 tests for numpy.core.defmatrix
>> Found 41 tests for numpy.lib.function_base
>> Found 0 tests for __main__
>> ........................................................................
>> ........................................................................
>> ........................................................................
>> ........................................................................
>> ........................................................................
>> F.......................................................................
>> ........................................................................
>> .............
>> ======================================================================
>> FAIL: Ticket #112
>> ----------------------------------------------------------------------
>> Traceback (most recent call last):
>>   File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/numpy/core/tests/test_regression.py", line 220, in check_longfloat_repr
>>     assert(str(a)[1:9] == str(a[0])[:8])
>> AssertionError
>>
>> ----------------------------------------------------------------------
>> Ran 517 tests in 1.241s
>>
>> FAILED (failures=1)
>>
>> _______________________________________________
>> Numpy-discussion mailing list
>> Numpy-discussion at scipy.org
>> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion

From efiring at hawaii.edu Fri Dec 29 23:47:56 2006
From: efiring at hawaii.edu (Eric Firing)
Date: Fri, 29 Dec 2006 18:47:56 -1000
Subject: [Numpy-discussion] test issue
In-Reply-To: <2A589751-77C9-4F8E-8B7F-0112A1876EA5@cs.hmc.edu>
References: <4595D5E3.7000002@hawaii.edu> <2A589751-77C9-4F8E-8B7F-0112A1876EA5@cs.hmc.edu>
Message-ID: <4595EF7C.6050508@hawaii.edu>

belinda thom wrote:
> Eric,
>
> Thanks for the well-thought-out answers to some of my recent posts.
> I've been using:
>
> http://pythonmac.org/packages/py24-fat/index.html
>
> for installing scipy, numpy, and matplotlib, as I didn't feel as
> confident installing things manually.
>
> Should I be using svn instead? (Is that what most users do?) And if
> so, is there a two-minute tutorial on what I'd need to do to get that
> stuff running on my machine? (The code I end up using needs to be
> stable enough for classroom use).

Belinda,

I think the great majority of mpl, numpy, and scipy users install from packages, not from svn or tarballs. I am in the minority. I use linux (presently Ubuntu Edgy, previously Mandriva), and in general installation from svn and tarballs is easy with linux for all three of these packages. There is an initial learning curve when one has to get the right libraries and devel packages installed, but once that is done then subsequent updates from svn are not a problem at all.

I do not use Windows or OSX so I do not have personal experience, but based on what I have seen on the mailing lists it seems pretty clear that building from source--any source--on either of these platforms is much more daunting, and very few people do it. As far as I know, your pythonmac package source is a good choice. I'm sure one of the many Mac users on this list can elaborate.

Eric

From bthom at cs.hmc.edu Fri Dec 29 23:54:18 2006
From: bthom at cs.hmc.edu (belinda thom)
Date: Fri, 29 Dec 2006 20:54:18 -0800
Subject: [Numpy-discussion] test issue
In-Reply-To: <4595EF7C.6050508@hawaii.edu>
References: <4595D5E3.7000002@hawaii.edu> <2A589751-77C9-4F8E-8B7F-0112A1876EA5@cs.hmc.edu> <4595EF7C.6050508@hawaii.edu>
Message-ID:

Thanks again for the input. You've been really helpful.

On Dec 29, 2006, at 8:47 PM, Eric Firing wrote:

> As far as I know, your pythonmac package
> source is a good choice. I'm sure one of the many Mac users on this
> list can elaborate.

I won't go into the long tirade of problems I've run into when trying to use these packages w/Mac OS X 10.4. But just in case you're interested in the flavor: the wx package doesn't seem to work w/ matplotlib's WXAgg, and setting matplotlib's numerix to Numeric breaks plotting. The list goes on.

Your comments about installing from svn are like others I've heard. Unfortunately, it seems I really might have to bite the bullet at some point, as the wx / matplotlib problem is likely related to the fact that various pieces were built with different versions of the same compiler. (Sigh).

--b

From bthom at cs.hmc.edu Sat Dec 30 01:24:48 2006
From: bthom at cs.hmc.edu (belinda thom)
Date: Fri, 29 Dec 2006 22:24:48 -0800
Subject: [Numpy-discussion] numpy install on mac os x 10.4
Message-ID: <03A1F20C-B0B0-4BFF-A780-B2C393F76160@cs.hmc.edu>

Hi,

I just used easy_install to get the latest version of numpy. Is this the "preferred" method for installing?

I'm on a G5 w/mac OS X 10.4 and MacPython 2.4.

I'm wondering why the os-related file names that easy_install creates have macosx-10.3 in them (as opposed to 10.4), e.g.

creating /Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/numpy-1.0.1-py2.4-macosx-10.3-fat.egg

Is this something I should be concerned about?
Thanks,

--b

From robert.kern at gmail.com Sat Dec 30 17:25:50 2006
From: robert.kern at gmail.com (Robert Kern)
Date: Sat, 30 Dec 2006 17:25:50 -0500
Subject: [Numpy-discussion] numpy install on mac os x 10.4
In-Reply-To: <03A1F20C-B0B0-4BFF-A780-B2C393F76160@cs.hmc.edu>
References: <03A1F20C-B0B0-4BFF-A780-B2C393F76160@cs.hmc.edu>
Message-ID: <4596E76E.60908@gmail.com>

belinda thom wrote:
> Hi,
>
> I just used easy_install to get the latest version of numpy. Is this
> the "preferred" method for installing?
>
> I'm on a G5 w/mac OS X 10.4 and MacPython 2.4.
>
> I'm wondering why the os-related file names that easy_install creates
> have macosx-10.3 in them (as opposed to 10.4), e.g.
>
> creating /Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/numpy-1.0.1-py2.4-macosx-10.3-fat.egg
>
> Is this something I should be concerned about?

No.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
-- Umberto Eco

From bthom at cs.hmc.edu Sat Dec 30 17:44:33 2006
From: bthom at cs.hmc.edu (belinda thom)
Date: Sat, 30 Dec 2006 14:44:33 -0800
Subject: [Numpy-discussion] numpy install on mac os x 10.4
In-Reply-To: <4596E76E.60908@gmail.com>
References: <03A1F20C-B0B0-4BFF-A780-B2C393F76160@cs.hmc.edu> <4596E76E.60908@gmail.com>
Message-ID: <58A567CA-684C-46E0-8B50-FEE485213A16@cs.hmc.edu>

>> I'm wondering why the os-related file names that easy_install creates
>> have macosx-10.3 in them (as opposed to 10.4), e.g.
>>
>> creating /Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/numpy-1.0.1-py2.4-macosx-10.3-fat.egg
>>
>> Is this something I should be concerned about?
>
> No.

One of the reasons I was dorking around w/numpy again is because of a problem I ran into wrt scipy (this was when I was obtaining both from http://www.macpython.org/packages/py24-fat/index.html). Turns out that doing:

>>> import scipy
>>> scipy.test()

produced:

Python 2.4.4 (#1, Oct 18 2006, 10:34:39)
[GCC 4.0.1 (Apple Computer, Inc. build 5341)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
history mechanism set up
>>> from scipy import *
RuntimeError: module compiled against version 1000002 of C-API but this version of numpy is 1000009
Traceback (most recent call last):
  File "", line 1, in ?
  File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/scipy/io/__init__.py", line 8, in ?
    from numpyio import packbits, unpackbits, bswap, fread, fwrite, \
ImportError: numpy.core.multiarray failed to import
>>>

So I decided to try easy_install on both. Interestingly, easy_install removed a sole (non-important) error when running numpy.test(), but I can't even get scipy to install into the site-packages directory; easy_install fails before it can do that (appended below). I am at my wits' end.

Advice on a painless way to install scipy on my G5 OS X 10.4.8 mac greatly appreciated.
--b

easy_install ~/Download/scipy-0.5.2.tar
Processing scipy-0.5.2.tar
Running scipy-0.5.2/setup.py -q bdist_egg --dist-dir /tmp/easy_install-nO-aHA/scipy-0.5.2/egg-dist-tmp-qKoMO9
non-existing path in '/private/tmp/easy_install-nO-aHA/scipy-0.5.2/Lib/linsolve': 'tests'
/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/numpy-1.0.1-py2.4-macosx-10.3-fat.egg/numpy/distutils/system_info.py:401: UserWarning:
    UMFPACK sparse solver (http://www.cise.ufl.edu/research/sparse/umfpack/)
    not found. Directories to search for the libraries can be specified in the
    numpy/distutils/site.cfg file (section [umfpack]) or by setting
creating build/temp.macosx-10.3-fat-2.4/build
creating build/temp.macosx-10.3-fat-2.4/build/src.macosx-10.3-fat-2.4
creating build/temp.macosx-10.3-fat-2.4/build/src.macosx-10.3-fat-2.4/Lib
creating build/temp.macosx-10.3-fat-2.4/build/src.macosx-10.3-fat-2.4/Lib/fftpack
creating build/temp.macosx-10.3-fat-2.4/private/tmp/easy_install-nO-aHA/scipy-0.5.2/Lib/fftpack
creating build/temp.macosx-10.3-fat-2.4/private/tmp/easy_install-nO-aHA/scipy-0.5.2/Lib/fftpack/src
compile options: '-DSCIPY_FFTW3_H -I/opt/local/include -Ibuild/src.macosx-10.3-fat-2.4 -I/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/numpy-1.0.1-py2.4-macosx-10.3-fat.egg/numpy/core/include -I/Library/Frameworks/Python.framework/Versions/2.4/include/python2.4 -c'
gcc: build/src.macosx-10.3-fat-2.4/fortranobject.c
gcc: /private/tmp/easy_install-nO-aHA/scipy-0.5.2/Lib/fftpack/src/zrfft.c
gcc: build/src.macosx-10.3-fat-2.4/Lib/fftpack/_fftpackmodule.c
gcc: /private/tmp/easy_install-nO-aHA/scipy-0.5.2/Lib/fftpack/src/zfftnd.c
gcc: /private/tmp/easy_install-nO-aHA/scipy-0.5.2/Lib/fftpack/src/drfft.c
gcc: /private/tmp/easy_install-nO-aHA/scipy-0.5.2/Lib/fftpack/src/zfft.c
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/Current/bin/easy_install", line 7, in ?
    sys.exit(
  File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/setuptools-0.6c3-py2.4.egg/setuptools/command/easy_install.py", line 1588, in main
    with_ei_usage(lambda:
  File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/setuptools-0.6c3-py2.4.egg/setuptools/command/easy_install.py", line 1577, in with_ei_usage
    return f()
  File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/setuptools-0.6c3-py2.4.egg/setuptools/command/easy_install.py", line 1592, in
    distclass=DistributionWithoutHelpCommands, **kw
  File "/Library/Frameworks/Python.framework/Versions/2.4//lib/python2.4/distutils/core.py", line 149, in setup
    dist.run_commands()
  File "/Library/Frameworks/Python.framework/Versions/2.4//lib/python2.4/distutils/dist.py", line 946, in run_commands
    self.run_command(cmd)
  File "/Library/Frameworks/Python.framework/Versions/2.4//lib/python2.4/distutils/dist.py", line 966, in run_command
    cmd_obj.run()
  File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/setuptools-0.6c3-py2.4.egg/setuptools/command/easy_install.py", line 211, in run
    self.easy_install(spec, not self.no_deps)
  File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/setuptools-0.6c3-py2.4.egg/setuptools/command/easy_install.py", line 427, in easy_install
    return self.install_item(None, spec, tmpdir, deps, True)
  File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/setuptools-0.6c3-py2.4.egg/setuptools/command/easy_install.py", line 471, in install_item
    dists = self.install_eggs(spec, download, tmpdir)
  File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/setuptools-0.6c3-py2.4.egg/setuptools/command/easy_install.py", line 655, in install_eggs
    return self.build_and_install(setup_script, setup_base)
  File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/setuptools-0.6c3-py2.4.egg/setuptools/command/easy_install.py", line 930, in build_and_install
    self.run_setup(setup_script, setup_base, args)
  File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/setuptools-0.6c3-py2.4.egg/setuptools/command/easy_install.py", line 919, in run_setup
    run_setup(setup_script, args)
  File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/setuptools-0.6c3-py2.4.egg/setuptools/sandbox.py", line 26, in run_setup
    DirectorySandbox(setup_dir).run(
  File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/setuptools-0.6c3-py2.4.egg/setuptools/sandbox.py", line 63, in run
    return func()
  File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/setuptools-0.6c3-py2.4.egg/setuptools/sandbox.py", line 29, in
    {'__file__':setup_script, '__name__':'__main__'}
  File "setup.py", line 55, in ?
File "setup.py", line 47, in setup_package File "/Library/Frameworks/Python.framework/Versions/2.4/lib/ python2.4/site-packages/numpy-1.0.1-py2.4-macosx-10.3-fat.egg/numpy/ distutils/core.py", line 174, in setup return old_setup(**new_attr) File "/Library/Frameworks/Python.framework/Versions/2.4//lib/ python2.4/distutils/core.py", line 149, in setup dist.run_commands() File "/Library/Frameworks/Python.framework/Versions/2.4//lib/ python2.4/distutils/dist.py", line 946, in run_commands self.run_command(cmd) File "/Library/Frameworks/Python.framework/Versions/2.4//lib/ python2.4/distutils/dist.py", line 966, in run_command cmd_obj.run() File "/Library/Frameworks/Python.framework/Versions/2.4/lib/ python2.4/site-packages/setuptools-0.6c3-py2.4.egg/setuptools/command/ bdist_egg.py", line 174, in run cmd = self.call_command('install_lib', warn_dir=0) File "/Library/Frameworks/Python.framework/Versions/2.4/lib/ python2.4/site-packages/setuptools-0.6c3-py2.4.egg/setuptools/command/ bdist_egg.py", line 161, in call_command self.run_command(cmdname) File "/Library/Frameworks/Python.framework/Versions/2.4//lib/ python2.4/distutils/cmd.py", line 333, in run_command self.distribution.run_command(command) File "/Library/Frameworks/Python.framework/Versions/2.4//lib/ python2.4/distutils/dist.py", line 966, in run_command cmd_obj.run() File "/Library/Frameworks/Python.framework/Versions/2.4/lib/ python2.4/site-packages/setuptools-0.6c3-py2.4.egg/setuptools/command/ install_lib.py", line 20, in run self.build() File "/Library/Frameworks/Python.framework/Versions/2.4//lib/ python2.4/distutils/command/install_lib.py", line 110, in build self.run_command('build_ext') File "/Library/Frameworks/Python.framework/Versions/2.4//lib/ python2.4/distutils/cmd.py", line 333, in run_command self.distribution.run_command(command) File "/Library/Frameworks/Python.framework/Versions/2.4//lib/ python2.4/distutils/dist.py", line 966, in run_command cmd_obj.run() File "/Library/Frameworks/Python.framework/Versions/2.4/lib/ python2.4/site-packages/numpy-1.0.1-py2.4-macosx-10.3-fat.egg/numpy/ distutils/command/build_ext.py", line 121, in run self.build_extensions() File "/Library/Frameworks/Python.framework/Versions/2.4//lib/ python2.4/distutils/command/build_ext.py", line 405, in build_extensions self.build_extension(ext) File "/Library/Frameworks/Python.framework/Versions/2.4/lib/ python2.4/site-packages/numpy-1.0.1-py2.4-macosx-10.3-fat.egg/numpy/ distutils/command/build_ext.py", line 312, in build_extension link = self.fcompiler.link_shared_object AttributeError: 'NoneType' object has no attribute 'link_shared_object' From gnata at obs.univ-lyon1.fr Sat Dec 30 19:38:50 2006 From: gnata at obs.univ-lyon1.fr (Xavier Gnata) Date: Sun, 31 Dec 2006 01:38:50 +0100 Subject: [Numpy-discussion] PyArray_FromDims segfault?? Message-ID: <4597069A.4050705@obs.univ-lyon1.fr> Hello, I would like to use PyObject_CallObject to call the imshow function from matplolib from a C code. So, the first step is to create a 2D PyArrayObject from a C array. I have read the Numpy book which is great but now I'm puzzled: My goal is to convert a C array of doubles into a 2D numpy array object. PyArray_SimpleNewFromData(nd, dims, typenum, data) seems to be the function I need but I did not manage to write/find any compiling *and* not segfaulting code performing this convertion :( Ok, maybe PyArray_SimpleNewFromData is to complex to begin with. Let's try with PyArray_FromDims... 
segfault also :(

My code looks so simple... I cannot see what I'm doing wrong :(

#include <Python.h>
#include <numpy/arrayobject.h>

int
main (int argc, char *argv[])
{
    PyArrayObject *array;
    int i, length = 100;
    double* a;
    double x;
    Py_Initialize ();

    array = (PyArrayObject *)
        PyArray_FromDims(1, &length, PyArray_DOUBLE);

    a = new double[100];
    for (i = 0; i < length; i++) {
        x = (double) i/(length-1);
        a[i] = x;
    }
    Py_Finalize ();

    return 0;
}

g++ imshow.cpp -o imshow -lpython2.4 -I/usr/lib/python2.4/site-packages/numpy/core/include/

./imshow -> Segmentation fault

I really need the Py_Initialize/Py_Finalize calls because it is 'embedding python' (and not extending :)).

To sum up, any code taking a double* and 2 ints standing for the dimX and the dimY and returning the corresponding PyObject would be very much appreciated :) I don't care if it has to copy the data. The simpler the better.

In [2]: numpy.__version__
Out[2]: '1.0.2.dev3491'

Xavier.

ps : I really want to use only the C API first of all to learn it and because it really looks like the simple way to fit my needs. Ok, PyObject_CallObject needs quite a lot of additional code but it is always almost the same :)

--
############################################
Xavier Gnata
CRAL - Observatoire de Lyon
9, avenue Charles André
69561 Saint Genis Laval cedex
Phone: +33 4 78 86 85 28
Fax: +33 4 78 86 83 86
E-mail: gnata at obs.univ-lyon1.fr
############################################

From Chris.Barker at noaa.gov Sun Dec 31 02:02:19 2006
From: Chris.Barker at noaa.gov (Christopher Barker)
Date: Sat, 30 Dec 2006 23:02:19 -0800
Subject: [Numpy-discussion] numpy install on mac os x 10.4
In-Reply-To: <58A567CA-684C-46E0-8B50-FEE485213A16@cs.hmc.edu>
References: <03A1F20C-B0B0-4BFF-A780-B2C393F76160@cs.hmc.edu> <4596E76E.60908@gmail.com> <58A567CA-684C-46E0-8B50-FEE485213A16@cs.hmc.edu>
Message-ID: <4597607B.408@noaa.gov>

belinda thom wrote:

> Advice on a painless way to install scipy on my G5 OS X 10.4.8 mac
> greatly appreciated.

Sorry, there isn't one at this point -- I think numpy, SciPy, MPL, and wx are all fairly stable right now, so it's a pretty good time to do it, but it's a challenge because:

1) SciPy, MPL, numpy and wx all have to be compatible -- so it's best if the same person does them all, or at least communicates enough to make sure the packages at pythonmac all match.

2) Building MPL requires the Universal version of a few libs (though libpng may be the only one now -- or is it libjpeg? I don't have my Mac handy), as there MAY be a version of libfreetype that works provided by Apple now.

This is the hardest one:
3) SciPy (or at least parts of it) requires Fortran. Apple has not released a gcc Fortran, and the ones that do exist are not Universal, and require libs in inconvenient places. This makes it hard to build a Universal, easy to install, binary of SciPy -- it's still hard to build one yourself, but if you don't need it universal, it is doable by mere mortals. I'd love to see a Universal one in the pythonmac repository (and I think with the right incantations of lipo, it should be doable), but in the meantime, maybe we should at least have separate PPC and Intel versions -- and is there any chance of either statically linking or putting the libs in the Python tree somewhere?
So, I'd love to have someone:

Start with Python 2.5 from pythonmac (2.4 would be nice too -- but let's focus on 2.5)

1) Get the latest wxPython for OS-X (2.8.*)
2) Build the latest numpy
3) Build MPL against the above (and Numeric and numarray, if possible)
4) Build SciPy for both Intel and PPC (probably separately)
5) Put all that up on pythonmac.

I'd like to do it, but I'm not the least bit sure when I'll be able to -- someone please beat me to it!

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

NOAA/OR&R/HAZMAT         (206) 526-6959 voice
7600 Sand Point Way NE   (206) 526-6329 fax
Seattle, WA 98115        (206) 526-6317 main reception

From bthom at cs.hmc.edu Sun Dec 31 02:16:18 2006
From: bthom at cs.hmc.edu (belinda thom)
Date: Sat, 30 Dec 2006 23:16:18 -0800
Subject: [Numpy-discussion] numpy install on mac os x 10.4
In-Reply-To: <4597607B.408@noaa.gov>
References: <03A1F20C-B0B0-4BFF-A780-B2C393F76160@cs.hmc.edu> <4596E76E.60908@gmail.com> <58A567CA-684C-46E0-8B50-FEE485213A16@cs.hmc.edu> <4597607B.408@noaa.gov>
Message-ID:

Not sure if this helps, but I stumbled upon the following trick.

Do the following:

1) install g77 via the instructions at:

http://www.scipy.org/Installing_SciPy/Mac_OS_X

In particular, download:

http://prdownloads.sf.net/hpc/g77v3.4-bin.tar.gz?download

and then do:

sudo tar -xvf g77v3.4-bin.tar -C /

which installs everything in /usr/local, most importantly creating:

lrwxrwxrwx 1 root wheel 18 Dec 30 15:29 libg2c.0.dylib@ -> libg2c.0.0.0.dylib
lrwxrwxrwx 1 root wheel 18 Dec 30 15:29 libg2c.dylib@ -> libg2c.0.0.0.dylib
lrwxrwxrwx 1 root wheel 18 Dec 30 15:29 libgcc_s.dylib@ -> libgcc_s.1.0.dylib

in /usr/local/lib/.

2) once this was done, magically scipy could work. In particular, these libg2c dylibs need to be on my machine.

Not sure if this is equivalent to "getting the fortran stuff to work", but it at least allowed me to run all of scipy.test().

On Dec 30, 2006, at 11:02 PM, Christopher Barker wrote:

> belinda thom wrote:
>
>> Advice on a painless way to install scipy on my G5 OS X 10.4.8 mac
>> greatly appreciated.
>
> Sorry, there isn't one at this point -- I think numpy, SciPy, MPL, and
> wx are all fairly stable right now, so it's a pretty good time to do it,
> but it's a challenge because:
>
> 1) SciPy, MPL, numpy and wx all have to be compatible -- so it's best
> if the same person builds them all, or at least communicates enough to
> make sure the packages at pythonmac all match.
>
> 2) Building MPL requires the Universal version of a few libs (though
> libpng may be the only one now (or is it libjpeg? -- don't have my Mac
> handy)), as there MAY be a version of libfreetype that works provided
> by Apple now.
>
> This is the hardest one:
> 3) SciPy (or at least parts of it) requires Fortran. Apple has not
> released a gcc Fortran, and the ones that do exist are not Universal,
> and require libs in inconvenient places. This makes it hard to build a
> Universal, easy-to-install binary of SciPy -- it's still hard to build
> one yourself, but if you don't need it Universal, it is doable by mere
> mortals. I'd love to see a Universal one in the pythonmac repository

I think more on this point could really be helpful.

I'm working off of Python 2.4 because some AI-related code crashes on 2.5, so I hope this can still remain a priority.

For me, the following combo seems to work well enough:

1) get matplotlib from www.macpython.org/packages/py24-fat

(The superpack at scipy fails b/c a TkAgg library can't be found)

2) use the superpack for installing numpy and scipy.
(These will only work for me if I've already done the g77 trick.)

3) use ipython gotten via easy_install

(The superpack version is broken b/c it doesn't provide an executable)

> (and I think with the right incantations of lipo, it should be doable),
> but in the meantime, maybe we should at least have separate PPC and
> Intel versions -- and is there any chance of either statically linking
> or putting the libs in the Python tree somewhere?

A "scipy" package that actually works w/ipython, matplotlib, numpy, and scipy is seriously needed. Most of my friends think I'm crazy to have wasted all my time on this stuff. It's ridiculous.

> So, I'd love to have someone:
>
> Start with Python 2.5 from pythonmac (2.4 would be nice too -- but
> let's focus on 2.5)
>
> 1) Get the latest wxPython for OS-X (2.8.*)
> 2) Build the latest numpy
> 3) Build MPL against the above (and Numeric and numarray, if possible)
> 4) Build SciPy for both Intel and PPC (probably separately)
> 5) Put all that up on pythonmac.
>
> I'd like to do it, but I'm not the least bit sure when I'll be able
> to -- someone please beat me to it!

I would like to say I could help, but: i) I'm pretty new to all this, ii) I've wasted so much time getting something running on my machine that I'm running out of time (I have a class I teach that I need to prepare for). Hopefully my comments here are at least helpful :-)

--b

p.s. I'd like to thank you for all the work you've done in this regard. I'm beginning to realize how time consuming this open source stuff can be...

From oliphant at ee.byu.edu Sun Dec 31 02:33:09 2006
From: oliphant at ee.byu.edu (Travis Oliphant)
Date: Sun, 31 Dec 2006 00:33:09 -0700
Subject: [Numpy-discussion] PyArray_FromDims segfault??
In-Reply-To: <4597069A.4050705@obs.univ-lyon1.fr>
References: <4597069A.4050705@obs.univ-lyon1.fr>
Message-ID: <459767B5.8010200@ee.byu.edu>

Xavier Gnata wrote:

> Hello,
>
> I would like to use PyObject_CallObject to call the imshow function
> from matplotlib from C code.
> So, the first step is to create a 2D PyArrayObject from a C array.
>
> I have read the Numpy book, which is great, but now I'm puzzled.
> My goal is to convert a C array of doubles into a 2D numpy array object.
> PyArray_SimpleNewFromData(nd, dims, typenum, data) seems to be the
> function I need, but I did not manage to write/find any compiling *and*
> non-segfaulting code performing this conversion :(
>
> Ok, maybe PyArray_SimpleNewFromData is too complex to begin with.
> Let's try with PyArray_FromDims... segfault also :(
>
> My code looks so simple... I cannot see what I'm doing the wrong way :(
>
> #include <Python.h>
> #include <numpy/arrayobject.h>
>
> int
> main (int argc, char *argv[])
> {
>     PyArrayObject *array;
>     int i, length = 100;
>     double* a;
>     double x;
>
>     Py_Initialize ();
>
>     array = (PyArrayObject *) PyArray_FromDims(1, &length, PyArray_DOUBLE);
>
>     a = new double[100];
>     for (i = 0; i < length; i++) {
>         x = (double) i/(length-1);
>         a[i] = x;
>     }
>
>     Py_Finalize ();
>     return 0;
> }
>
> g++ imshow.cpp -o imshow -lpython2.4 -I/usr/lib/python2.4/site-packages/numpy/core/include/
>
> ./imshow -> Segmentation fault
>
> I really need the Py_Initialize/Py_Finalize calls because it is
> 'embedding python' (and not extending :)).
>
> To sum up, any code taking a double* and 2 ints standing for the dimX
> and the dimY and returning the corresponding PyObject would be very
> much appreciated :) I don't care if it has to copy the data. The simpler
> the better.
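For reference, a minimal corrected sketch of the code above, folding in the fix the reply below identifies (the missing import_array() call) and keeping the buffer alive while the array object is in use. This is an untested illustration, not code from the original thread:

#include <Python.h>
#include <numpy/arrayobject.h>

int
main (int argc, char *argv[])
{
    npy_intp dims[1] = {100};   /* PyArray_SimpleNewFromData takes npy_intp dims */
    int i;
    double *a;
    PyObject *array;

    Py_Initialize ();
    import_array1 (-1);         /* set up the numpy C API; returns -1 from main on failure */

    a = new double[dims[0]];
    for (i = 0; i < dims[0]; i++)
        a[i] = (double) i / (dims[0] - 1);

    /* Wrap the existing buffer: numpy does NOT copy or take ownership,
       so 'a' must stay alive for as long as 'array' is in use. */
    array = PyArray_SimpleNewFromData (1, dims, NPY_DOUBLE, a);

    /* ... hand 'array' to PyObject_CallObject here ... */

    Py_DECREF (array);
    delete [] a;
    Py_Finalize ();
    return 0;
}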
>

You *always* need to use import_array() or one of its variants somewhere in your code. In this case, because you are returning from main, you need to use:

import_array1(-1)

in order to return -1 on import error. The import_array call sets up the C-API so it can be used.

Also, this code is not doing anything with the memory you just created for array. If you want to use pre-existing memory, then PyArray_SimpleNewFromData will work and construct a PyObject * (an ndarray object) where the memory is the pointer you pass in for the data-area to that function. You must be sure that the memory is not released before the returned object is used up or you will get segmentation faults.

-Travis

From erin.sheldon at gmail.com Sun Dec 31 12:27:24 2006
From: erin.sheldon at gmail.com (Erin Sheldon)
Date: Sun, 31 Dec 2006 12:27:24 -0500
Subject: [Numpy-discussion] numpy install on mac os x 10.4
In-Reply-To:
References: <03A1F20C-B0B0-4BFF-A780-B2C393F76160@cs.hmc.edu> <4596E76E.60908@gmail.com> <58A567CA-684C-46E0-8B50-FEE485213A16@cs.hmc.edu> <4597607B.408@noaa.gov>
Message-ID: <331116dc0612310927m36bf776dg1558c30250c85c59@mail.gmail.com>

Hi All -

You can do this quite simply with fink if you have the patience to wait for the compilations to finish. This works on my ppc mac with XCode and fink installed (12/31/2006):

fink install scipy-py24
sudo apt-get install gettext-dev=0.10.40-25 gettext=0.10.40-25
fink install matplotlib-py24

For more details see this page I set up:
http://howdy.physics.nyu.edu/index.php/Numpy_For_Mac_Using_Fink

Erin

On 12/31/06, belinda thom wrote:
> Not sure if this helps, but I stumbled upon the following trick.
>
> Do the following:
>
> 1) install g77 via the instructions at:
>
> http://www.scipy.org/Installing_SciPy/Mac_OS_X
>
> In particular, download:
>
> http://prdownloads.sf.net/hpc/g77v3.4-bin.tar.gz?download
>
> and then do:
>
> sudo tar -xvf g77v3.4-bin.tar -C /
>
> which installs everything in /usr/local, most importantly creating:
>
> lrwxrwxrwx 1 root wheel 18 Dec 30 15:29 libg2c.0.dylib@ -> libg2c.0.0.0.dylib
> lrwxrwxrwx 1 root wheel 18 Dec 30 15:29 libg2c.dylib@ -> libg2c.0.0.0.dylib
> lrwxrwxrwx 1 root wheel 18 Dec 30 15:29 libgcc_s.dylib@ -> libgcc_s.1.0.dylib
>
> in /usr/local/lib/.
>
> 2) once this was done, magically scipy could work. In particular,
> these libg2c dylibs need to be on my machine.
>
> Not sure if this is equivalent to "getting the fortran stuff to
> work", but it at least allowed me to run all of scipy.test().
>
> On Dec 30, 2006, at 11:02 PM, Christopher Barker wrote:
>
> > belinda thom wrote:
> >
> >> Advice on a painless way to install scipy on my G5 OS X 10.4.8 mac
> >> greatly appreciated.
> >
> > Sorry, there isn't one at this point -- I think numpy, SciPy, MPL, and
> > wx are all fairly stable right now, so it's a pretty good time to do
> > it, but it's a challenge because:
> >
> > 1) SciPy, MPL, numpy and wx all have to be compatible -- so it's best
> > if the same person builds them all, or at least communicates enough to
> > make sure the packages at pythonmac all match.
> >
> > 2) Building MPL requires the Universal version of a few libs (though
> > libpng may be the only one now (or is it libjpeg? -- don't have my Mac
> > handy)), as there MAY be a version of libfreetype that works provided
> > by Apple now.
> >
> > This is the hardest one:
> > 3) SciPy (or at least parts of it) requires Fortran.
> > Apple has not released a gcc Fortran, and the ones that do exist are
> > not Universal, and require libs in inconvenient places. This makes it
> > hard to build a Universal, easy-to-install binary of SciPy -- it's
> > still hard to build one yourself, but if you don't need it Universal,
> > it is doable by mere mortals. I'd love to see a Universal one in the
> > pythonmac repository
>
> I think more on this point could really be helpful.
>
> I'm working off of Python 2.4 because some AI-related code crashes on
> 2.5, so I hope this can still remain a priority.
>
> For me, the following combo seems to work well enough:
>
> 1) get matplotlib from www.macpython.org/packages/py24-fat
>
> (The superpack at scipy fails b/c a TkAgg library can't be found)
>
> 2) use the superpack for installing numpy and scipy.
>
> (These will only work for me if I've already done the g77 trick.)
>
> 3) use ipython gotten via easy_install
>
> (The superpack version is broken b/c it doesn't provide an executable)
>
> > (and I think with the right incantations of lipo, it should be
> > doable), but in the meantime, maybe we should at least have separate
> > PPC and Intel versions -- and is there any chance of either statically
> > linking or putting the libs in the Python tree somewhere?
>
> A "scipy" package that actually works w/ipython, matplotlib, numpy,
> and scipy is seriously needed. Most of my friends think I'm crazy to
> have wasted all my time on this stuff. It's ridiculous.
>
> > So, I'd love to have someone:
> >
> > Start with Python 2.5 from pythonmac (2.4 would be nice too -- but
> > let's focus on 2.5)
> >
> > 1) Get the latest wxPython for OS-X (2.8.*)
> > 2) Build the latest numpy
> > 3) Build MPL against the above (and Numeric and numarray, if possible)
> > 4) Build SciPy for both Intel and PPC (probably separately)
> > 5) Put all that up on pythonmac.
> >
> > I'd like to do it, but I'm not the least bit sure when I'll be able
> > to -- someone please beat me to it!
>
> I would like to say I could help, but: i) I'm pretty new to all this,
> ii) I've wasted so much time getting something running on my machine
> that I'm running out of time (I have a class I teach that I need to
> prepare for). Hopefully my comments here are at least helpful :-)
>
> --b
>
> p.s. I'd like to thank you for all the work you've done in this regard.
> I'm beginning to realize how time consuming this open source stuff can
> be...
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>

From jswhit at fastmail.fm Sun Dec 31 13:10:35 2006
From: jswhit at fastmail.fm (Jeff Whitaker)
Date: Sun, 31 Dec 2006 11:10:35 -0700
Subject: [Numpy-discussion] numpy install on mac os x 10.4
In-Reply-To: <331116dc0612310927m36bf776dg1558c30250c85c59@mail.gmail.com>
References: <03A1F20C-B0B0-4BFF-A780-B2C393F76160@cs.hmc.edu> <4596E76E.60908@gmail.com> <58A567CA-684C-46E0-8B50-FEE485213A16@cs.hmc.edu> <4597607B.408@noaa.gov> <331116dc0612310927m36bf776dg1558c30250c85c59@mail.gmail.com>
Message-ID: <4597FD1B.9000504@fastmail.fm>

Erin Sheldon wrote:
> Hi All -
>
> You can do this quite simply with fink if you have the patience to
> wait for the compilations to finish.
> This works on my ppc mac with XCode and fink installed (12/31/2006):
>
> fink install scipy-py24
> sudo apt-get install gettext-dev=0.10.40-25 gettext=0.10.40-25
> fink install matplotlib-py24
>
> For more details see this page I set up:
> http://howdy.physics.nyu.edu/index.php/Numpy_For_Mac_Using_Fink
>
> Erin
>

Erin: Nice tutorial. I recommend one extra step though - right after installing fink, add 'unstable/main' to the 'Trees:' line in /sw/etc/fink.conf, and run 'fink selfupdate'. That way you will get the latest versions of all the packages.

Also, if you want the python 2.5 versions, substitute 'py25' for 'py24'.

-Jeff

--
Jeffrey S. Whitaker         Phone : (303)497-6313
NOAA/OAR/CDC R/PSD1         FAX   : (303)497-6449
325 Broadway                Boulder, CO, USA 80305-3328

From erin.sheldon at gmail.com Sun Dec 31 13:20:55 2006
From: erin.sheldon at gmail.com (Erin Sheldon)
Date: Sun, 31 Dec 2006 13:20:55 -0500
Subject: [Numpy-discussion] numpy install on mac os x 10.4
In-Reply-To: <4597FD1B.9000504@fastmail.fm>
References: <03A1F20C-B0B0-4BFF-A780-B2C393F76160@cs.hmc.edu> <4596E76E.60908@gmail.com> <58A567CA-684C-46E0-8B50-FEE485213A16@cs.hmc.edu> <4597607B.408@noaa.gov> <331116dc0612310927m36bf776dg1558c30250c85c59@mail.gmail.com> <4597FD1B.9000504@fastmail.fm>
Message-ID: <331116dc0612311020m160efb7ejcf1ad2d8c96c5b7e@mail.gmail.com>

On 12/31/06, Jeff Whitaker wrote:
> Erin Sheldon wrote:
> > Hi All -
> >
> > You can do this quite simply with fink if you have the patience to
> > wait for the compilations to finish. This works on my ppc mac with
> > XCode and fink installed (12/31/2006):
> >
> > fink install scipy-py24
> > sudo apt-get install gettext-dev=0.10.40-25 gettext=0.10.40-25
> > fink install matplotlib-py24
> >
> > For more details see this page I set up:
> > http://howdy.physics.nyu.edu/index.php/Numpy_For_Mac_Using_Fink
> >
> > Erin
> >
> Erin: Nice tutorial. I recommend one extra step though - right after
> installing fink, add 'unstable/main' to the 'Trees:' line in
> /sw/etc/fink.conf, and run 'fink selfupdate'. That way you will get the
> latest versions of all the packages.

Right, thanks. That was explained in the tutorial, but I only described how to use FinkCommander to enable the unstable branch. See "Installing NumPy and SciPy". BTW, it is a wiki, so feel free to edit.

>
> Also, if you want the python 2.5 versions, substitute 'py25' for 'py24'.
>
> -Jeff

From Chris.Barker at noaa.gov Sun Dec 31 13:45:21 2006
From: Chris.Barker at noaa.gov (Christopher Barker)
Date: Sun, 31 Dec 2006 10:45:21 -0800
Subject: [Numpy-discussion] numpy install on mac os x 10.4
In-Reply-To: <331116dc0612310927m36bf776dg1558c30250c85c59@mail.gmail.com>
References: <03A1F20C-B0B0-4BFF-A780-B2C393F76160@cs.hmc.edu> <4596E76E.60908@gmail.com> <58A567CA-684C-46E0-8B50-FEE485213A16@cs.hmc.edu> <4597607B.408@noaa.gov> <331116dc0612310927m36bf776dg1558c30250c85c59@mail.gmail.com>
Message-ID: <45980541.5070307@noaa.gov>

Erin Sheldon wrote:
> You can do this quite simply with fink

I've generally stayed away from fink, as it felt like kind of a separate system within OS-X, rather than integrated -- kind of like cygwin.

In particular, if you use Fink Python, can you:

1) Write apps that use the native GUI (not X), in particular, PyObjC, wx-Mac, and TK-aqua.

2) Bundle up apps with Py2App, or otherwise create self-contained application bundles?

3) Universal (PPC+Intel) anything.

Apart from "feel", I think those are the concrete reasons to use MacPython, rather than fink.
Please correct me if I've got a wrong (or outdated) impression.

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

NOAA/OR&R/HAZMAT         (206) 526-6959 voice
7600 Sand Point Way NE   (206) 526-6329 fax
Seattle, WA 98115        (206) 526-6317 main reception

From jswhit at fastmail.fm Sun Dec 31 15:01:36 2006
From: jswhit at fastmail.fm (Jeff Whitaker)
Date: Sun, 31 Dec 2006 13:01:36 -0700
Subject: [Numpy-discussion] numpy install on mac os x 10.4
In-Reply-To: <45980541.5070307@noaa.gov>
References: <03A1F20C-B0B0-4BFF-A780-B2C393F76160@cs.hmc.edu> <4596E76E.60908@gmail.com> <58A567CA-684C-46E0-8B50-FEE485213A16@cs.hmc.edu> <4597607B.408@noaa.gov> <331116dc0612310927m36bf776dg1558c30250c85c59@mail.gmail.com> <45980541.5070307@noaa.gov>
Message-ID: <45981720.90809@fastmail.fm>

Christopher Barker wrote:
> Erin Sheldon wrote:
>
>> You can do this quite simply with fink
>>
>
> I've generally stayed away from fink, as it felt like kind of a
> separate system within OS-X, rather than integrated -- kind of like
> cygwin.
>
> In particular, if you use Fink Python, can you:
>
> 1) Write apps that use the native GUI (not X), in particular, PyObjC,
> wx-Mac, and TK-aqua.
>
> 2) Bundle up apps with Py2App, or otherwise create self-contained
> application bundles?
>
> 3) Universal (PPC+Intel) anything.
>
> Apart from "feel", I think those are the concrete reasons to use
> MacPython, rather than fink. Please correct me if I've got a wrong (or
> outdated) impression.
>
> -Chris
>

Chris: The answer is No for all three. But for some scientists like me, who are used to working on linux/unix workstations, fink works well. I like being able to just run 'fink update scipy-py25 matplotlib-py25' to get the latest versions of everything. Also, being able to run stuff remotely via an ssh X11 tunnel to my office mac, and have the windows display back to my home mac, is a useful feature. It all comes down to what you feel comfortable with. Choice is good.

-Jeff

--
Jeffrey S. Whitaker         Phone : (303)497-6313
NOAA/OAR/CDC R/PSD1         FAX   : (303)497-6449
325 Broadway                Boulder, CO, USA 80305-3328

From erin.sheldon at gmail.com Sun Dec 31 15:17:20 2006
From: erin.sheldon at gmail.com (Erin Sheldon)
Date: Sun, 31 Dec 2006 15:17:20 -0500
Subject: [Numpy-discussion] numpy install on mac os x 10.4
In-Reply-To: <45980541.5070307@noaa.gov>
References: <03A1F20C-B0B0-4BFF-A780-B2C393F76160@cs.hmc.edu> <4596E76E.60908@gmail.com> <58A567CA-684C-46E0-8B50-FEE485213A16@cs.hmc.edu> <4597607B.408@noaa.gov> <331116dc0612310927m36bf776dg1558c30250c85c59@mail.gmail.com> <45980541.5070307@noaa.gov>
Message-ID: <331116dc0612311217r4b9e0a03y629b15172693017@mail.gmail.com>

On 12/31/06, Christopher Barker wrote:
> Erin Sheldon wrote:
> > You can do this quite simply with fink
>
> I've generally stayed away from fink, as it felt like kind of a
> separate system within OS-X, rather than integrated -- kind of like
> cygwin.
>
> In particular, if you use Fink Python, can you:
>
> 1) Write apps that use the native GUI (not X), in particular, PyObjC,
> wx-Mac, and TK-aqua.
>
> 2) Bundle up apps with Py2App, or otherwise create self-contained
> application bundles?
>
> 3) Universal (PPC+Intel) anything.
>
> Apart from "feel", I think those are the concrete reasons to use
> MacPython, rather than fink. Please correct me if I've got a wrong (or
> outdated) impression.

Hi Chris -

I think you are correct. The solution I posted is not a long term solution for the eventual average numpy/scipy user.
It was just a response to Belinda's original need for "Advice on a painless way to install scipy on my G5 OS X 10.4.8 mac". I don't mind waiting for things to compile, so it seems painless to me.

Erin

From bthom at cs.hmc.edu Fri Dec 29 17:29:17 2006
From: bthom at cs.hmc.edu (belinda thom)
Date: Fri, 29 Dec 2006 14:29:17 -0800
Subject: [Numpy-discussion] test issue
Message-ID:

Hello,

I've been going through Dave Kuhlman's "SciPy Course Outline" and found out about test functions -- very cool. Except that on my end, not all tests pass (appended below). Is this a problem for other people? Is it something I should worry about?

Here's my setup: Mac G5 w/OS X 10.4.8, using MacPython 2.4, numpy.__version__ is 1.0, matplotlib.__version__ is 0.87.7 and Numeric.__version__ is 24.2

Thanks,

--b

==========

In [94]: import numpy

In [95]: numpy.test()
Found 13 tests for numpy.core.umath
Found 9 tests for numpy.lib.arraysetops
Found 3 tests for numpy.fft.helper
Found 1 tests for numpy.lib.ufunclike
Found 4 tests for numpy.ctypeslib
Found 2 tests for numpy.lib.polynomial
Found 8 tests for numpy.core.records
Found 26 tests for numpy.core.numeric
Found 5 tests for numpy.distutils.misc_util
Found 3 tests for numpy.lib.getlimits
Found 31 tests for numpy.core.numerictypes
Found 4 tests for numpy.core.scalarmath
Found 12 tests for numpy.lib.twodim_base
Found 47 tests for numpy.lib.shape_base
Found 4 tests for numpy.lib.index_tricks
Found 32 tests for numpy.linalg.linalg
Found 42 tests for numpy.lib.type_check
Found 184 tests for numpy.core.multiarray
Found 36 tests for numpy.core.ma
Found 10 tests for numpy.core.defmatrix
Found 41 tests for numpy.lib.function_base
Found 0 tests for __main__
........................................................................
........................................................................
........................................................................
........................................................................
........................................................................
F.......................................................................
........................................................................
.............
======================================================================
FAIL: Ticket #112
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/numpy/core/tests/test_regression.py", line 220, in check_longfloat_repr
    assert(str(a)[1:9] == str(a[0])[:8])
AssertionError

----------------------------------------------------------------------
Ran 517 tests in 1.241s

FAILED (failures=1)

From frbeaxs at earthlink.net Wed Dec 27 21:51:22 2006
From: frbeaxs at earthlink.net (frbeaxs)
Date: Wed, 27 Dec 2006 18:51:22 -0800
Subject: [Numpy-discussion] Scipy - Numpy incompatibility when calling upon Old Numeric
Message-ID:

I am using Python 2.4 with Numpy 0.9.8. Matplotlib graphs function under these two versions, except when old Numeric must be called to utilize the spline function in contouring the graphs, leading to the error:

Traceback (most recent call last):
  File "C:\File\XXX", line 3, in ?
    from scipy.sandbox.delaunay import *
  File "C:\Python24\lib\site-packages\scipy\__init__.py", line 32, in ?
    from numpy import oldnumeric
  File "C:\Python24\lib\site-packages\numpy\oldnumeric\__init__.py", line 3, in ?
    from compat import *
  File "C:\Python24\lib\site-packages\numpy\oldnumeric\compat.py", line 133, in ?
    from numpy import deprecate
ImportError: cannot import name deprecate

Scipy 0.5.0 seems to be compatible only with Numpy 1.0.1b, not version 0.9.8, and I have to switch versions of Numpy should I need to utilize Scipy; however, certain matplotlib graphs will then not function. Upgrading to Scipy 0.5.2 and Numpy 1.0.1 has no negative effects on the programs I wrote or the matplotlib sample programs, but the following error message replaces the ImportError above:

Matrix = matrix
NameError: name matrix is not defined

Downgrading Numpy to version 0.9.6 produces even more problems. Numpy 0.9.8 used with SciPy 0.5.0 seems to be the best combination, however it prohibits calling upon oldnumeric, which seems to be necessary to use the spline function. Does anyone have any idea how to get around this?

frbeaxs
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From Andreas.Eisele at dfki.de Sat Dec 30 09:45:48 2006
From: Andreas.Eisele at dfki.de (Andreas Eisele)
Date: Sat, 30 Dec 2006 15:45:48 +0100 (CET)
Subject: [Numpy-discussion] Advice please on efficient subtotal function
In-Reply-To:
References:
Message-ID: <6448907.1167489948431.SLOX.WebMail.wwwrun@corp-102>

Hi Stephen,

> I'm looking for efficient ways to subtotal a 1-d array onto a 2-D
> grid. This is more easily explained in code than words, thus:
>
> for n in xrange(len(data)):
>     totals[ i[n], j[n] ] += data[n]
>
> data comes from a series of PyTables files with ~200m rows. Each row
> has ~20 cols, and I use the first three columns (which are 1-3 char
> strings) to form the indexing functions i[] and j[], then want to
> calculate averages of the remaining 17 numerical cols.
>
> I have tried various indirect ways of doing this with searchsorted and
> bincount, but intuitively they feel overly complex solutions to what
> is essentially a very simple problem.
>
> My work involves comparing the subtotals for various different
> segmentation strategies (the i[] and j[] indexing functions).
> Efficient solutions are important because I need to make many passes
> through the 200m rows of data. Memory usage is the easiest thing for
> me to adjust by changing how many rows of data to read in for each
> pass and then reusing the same array data buffers.

It looks as if the values in your i and j columns come from a limited range, so you may consider encoding pairs of (i,j) values into one int using a suitable encoding function (e.g. ij = i+K*j if both i and j are non-negative and K = max(i)+1). You could then use bincount(ij, data) to get the sums per encoded (i,j) pair. This should be efficient, and the complexity is only in the encoding/decoding steps.

Best regards,
Andreas

----
Dr. Andreas Eisele,  Senior Researcher
DFKI GmbH, Language Technology Lab,  eisele at dfki.de
Stuhlsatzenhausweg 3,  tel: +49-681-302-5285
D-66123 Saarbrücken,  fax: +49-681-302-5338
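An untested sketch of the encoding/decoding approach above; the function name, the grid dimensions K and nj, and the chunked-accumulation loop at the bottom are illustrative, not from the original posts:

import numpy

def grid_subtotals(i, j, data, K, nj):
    # Encode each (i, j) pair as one non-negative int, as suggested above.
    ij = i + K * j
    sums = numpy.bincount(ij, weights=data)   # sum of data per encoded pair
    counts = numpy.bincount(ij)               # number of rows per encoded pair
    # bincount only reaches the largest encoded value seen, so pad to the
    # full grid before reshaping.
    full_sums = numpy.zeros(K * nj)
    full_counts = numpy.zeros(K * nj)
    full_sums[:len(sums)] = sums
    full_counts[:len(counts)] = counts
    # Flat index i + K*j is C order for an (nj, K) array; transpose so the
    # result is indexed as [i_value, j_value].
    return full_sums.reshape(nj, K).T, full_counts.reshape(nj, K).T

# Accumulating over chunks read from PyTables, then dividing once at the
# end, gives the averages without a second pass:
#
#     totals = numpy.zeros((K, nj))
#     counts = numpy.zeros((K, nj))
#     for i, j, data in chunks:
#         s, c = grid_subtotals(i, j, data, K, nj)
#         totals += s
#         counts += c
#     averages = totals / numpy.where(counts == 0, 1, counts)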