From sturla at molden.no Tue Sep 1 00:06:50 2009 From: sturla at molden.no (Sturla Molden) Date: Tue, 01 Sep 2009 06:06:50 +0200 Subject: [Numpy-discussion] A faster median (Wirth's method) Message-ID: <4A9C9DDA.9060503@molden.no>

We recently had a discussion regarding an optimization of NumPy's median to average O(n) complexity. After some searching, I found out there is a selection algorithm competitive in speed with Hoare's quick select. It has the advantage of being a lot simpler to implement. In plain Python:

import numpy as np

def wirthselect(array, k):

    """ Niklaus Wirth's selection algorithm """

    a = np.ascontiguousarray(array)
    if (a is array): a = a.copy()

    l = 0
    m = a.shape[0] - 1
    while l < m:
        x = a[k]
        i = l
        j = m
        while 1:
            while a[i] < x: i += 1
            while x < a[j]: j -= 1
            if i <= j:
                tmp = a[i]
                a[i] = a[j]
                a[j] = tmp
                i += 1
                j -= 1
            if i > j: break
        if j < k: l = i
        if k < i: m = j

    return a

Now, the median can be obtained in average O(n) time as:

def median(x):

    """ median in average O(n) time """

    n = x.shape[0]
    k = n >> 1
    s = wirthselect(x, k)
    if n & 1:
        return s[k]
    else:
        return 0.5*(s[k]+s[:k].max())

The beauty of this is that Wirth select is extremely easy to migrate to Cython:

import numpy as np
cimport numpy
ctypedef numpy.double_t T # or whatever

def wirthselect(numpy.ndarray[T, ndim=1] array, int k):

    cdef int i, j, l, m
    cdef T x, tmp
    cdef T *a
    cdef numpy.ndarray _array

    _array = np.ascontiguousarray(array)
    if (_array is array): _array = _array.copy()
    a = <T *> _array.data

    l = 0
    m = _array.shape[0] - 1
    with nogil:
        while l < m:
            x = a[k]
            i = l
            j = m
            while 1:
                while a[i] < x: i += 1
                while x < a[j]: j -= 1
                if i <= j:
                    tmp = a[i]
                    a[i] = a[j]
                    a[j] = tmp
                    i += 1
                    j -= 1
                if i > j: break
            if j < k: l = i
            if k < i: m = j

    return _array

For example, we could have a small script that generates wirthselect for all NumPy dtypes (T as template), and use a dict as jump table. Chad, you can continue to write quick select using NumPy's C quick sort in numpy/core/src/_sortmodule.c.src. When you are done, it might be about 10% faster than this. :-)

Reference: http://ndevilla.free.fr/median/median.pdf

Best regards, Sturla Molden

From sturla at molden.no Tue Sep 1 00:50:30 2009 From: sturla at molden.no (Sturla Molden) Date: Tue, 01 Sep 2009 06:50:30 +0200 Subject: [Numpy-discussion] A faster median (Wirth's method) In-Reply-To: <4A9C9DDA.9060503@molden.no> References: <4A9C9DDA.9060503@molden.no> Message-ID: <4A9CA816.10102@molden.no>

Sturla Molden skrev:
> We recently had a discussion regarding an optimization of NumPy's median
> to average O(n) complexity. After some searching, I found out there is a
> selection algorithm competitive in speed with Hoare's quick select.
>
> Reference:
> http://ndevilla.free.fr/median/median.pdf

After some more googling, I found this text by Wirth himself:

http://www.oberon2005.ru/book/ads2004.pdf

The method is described on page 61 (57 in the PDF) as Hoare's quick select. So it seems it's just a less optimized version than that of Numerical Recipes, and the first reference (Devillard) was confused. Anyhow, it still has the advantage of looking nice in Cython and being very different from the Numerical Recipes code. We can rename wirthselect to quickselect then.

Sorry for the confusion. I should have checked the source better.
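A quick way to sanity-check the selection-based median above (a minimal sketch, assuming the wirthselect and median functions from this post are already defined in the session) is to compare it against numpy.median on a few random inputs:

import numpy as np

# Minimal check, assuming wirthselect() and median() from the post above
# are in scope. wirthselect copies its input when needed, so the caller's
# array is left untouched and np.median sees the same data.
np.random.seed(0)
for n in (1, 2, 3, 10, 101, 1000):
    x = np.random.rand(n)
    assert abs(median(x) - np.median(x)) < 1e-12
print("selection-based median agrees with np.median")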
Sturla Molden From stefano_covino at yahoo.it Tue Sep 1 04:08:48 2009 From: stefano_covino at yahoo.it (Stefano Covino) Date: Tue, 1 Sep 2009 10:08:48 +0200 Subject: [Numpy-discussion] snow leopard and Numeric Message-ID: <8AD04A53-5C75-4159-B224-07B1BB85A212@yahoo.it> Hello everybody, I have just upgraded my Mac laptop to snow leopard. However, I can no more compile Numeric 24.2. Here is my output: [MacBook-Pro-di-Stefano:~/Pacchetti/Numeric-24.2] covino% python setup.py build running build running build_py running build_ext building 'RNG.RNG' extension gcc-4.2 -DNDEBUG -g -fwrapv -Os -Wall -Wstrict-prototypes -arch i386 - arch ppc -arch x86_64 -pipe -IInclude -IPackages/FFT/Include - IPackages/RNG/Include -I/System/Library/Frameworks/Python.framework/ Versions/2.6/include/python2.6 -c Packages/RNG/Src/ranf.c -o build/ temp.macosx-10.6-universal-2.6/Packages/RNG/Src/ranf.o Packages/RNG/Src/ranf.c: In function ?Mixranf?: Packages/RNG/Src/ranf.c:153: error: conflicting types for ?gettimeofday? /usr/include/sys/time.h:210: error: previous declaration of ?gettimeofday? was here Packages/RNG/Src/ranf.c: In function ?Mixranf?: Packages/RNG/Src/ranf.c:153: error: conflicting types for ?gettimeofday? /usr/include/sys/time.h:210: error: previous declaration of ?gettimeofday? was here Packages/RNG/Src/ranf.c: In function ?Mixranf?: Packages/RNG/Src/ranf.c:153: error: conflicting types for ?gettimeofday? /usr/include/sys/time.h:210: error: previous declaration of ?gettimeofday? was here lipo: can't open input file: /var/folders/x4/x4lrvHJWH68+aWExBjO5Gk++ +TI/-Tmp-//ccDCDxtF.out (No such file or directory) error: command 'gcc-4.2' failed with exit status 1 Is there anything I could do? Thanks a lot, Stefano From matthieu.brucher at gmail.com Tue Sep 1 04:27:06 2009 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Tue, 1 Sep 2009 10:27:06 +0200 Subject: [Numpy-discussion] snow leopard and Numeric In-Reply-To: <8AD04A53-5C75-4159-B224-07B1BB85A212@yahoo.it> References: <8AD04A53-5C75-4159-B224-07B1BB85A212@yahoo.it> Message-ID: Use Numpy instead of Numeric (no longer supported I think)? Matthieu 2009/9/1 Stefano Covino : > Hello everybody, > > I have just upgraded my Mac laptop to snow leopard. > However, I can no more compile Numeric 24.2. > > Here is my output: > > [MacBook-Pro-di-Stefano:~/Pacchetti/Numeric-24.2] covino% python > setup.py build > running build > running build_py > running build_ext > building 'RNG.RNG' extension > gcc-4.2 -DNDEBUG -g -fwrapv -Os -Wall -Wstrict-prototypes -arch i386 - > arch ppc -arch x86_64 -pipe -IInclude -IPackages/FFT/Include - > IPackages/RNG/Include -I/System/Library/Frameworks/Python.framework/ > Versions/2.6/include/python2.6 -c Packages/RNG/Src/ranf.c -o build/ > temp.macosx-10.6-universal-2.6/Packages/RNG/Src/ranf.o > Packages/RNG/Src/ranf.c: In function ?Mixranf?: > Packages/RNG/Src/ranf.c:153: error: conflicting types for ?gettimeofday? > /usr/include/sys/time.h:210: error: previous declaration of > ?gettimeofday? was here > Packages/RNG/Src/ranf.c: In function ?Mixranf?: > Packages/RNG/Src/ranf.c:153: error: conflicting types for ?gettimeofday? > /usr/include/sys/time.h:210: error: previous declaration of > ?gettimeofday? was here > Packages/RNG/Src/ranf.c: In function ?Mixranf?: > Packages/RNG/Src/ranf.c:153: error: conflicting types for ?gettimeofday? > /usr/include/sys/time.h:210: error: previous declaration of > ?gettimeofday? 
was here > lipo: can't open input file: /var/folders/x4/x4lrvHJWH68+aWExBjO5Gk++ > +TI/-Tmp-//ccDCDxtF.out (No such file or directory) > error: command 'gcc-4.2' failed with exit status 1 > > > Is there anything I could do? > > > Thanks a lot, > ? ? ? ?Stefano > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Information System Engineer, Ph.D. Website: http://matthieu-brucher.developpez.com/ Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92 LinkedIn: http://www.linkedin.com/in/matthieubrucher From timmichelsen at gmx-topmail.de Tue Sep 1 06:08:28 2009 From: timmichelsen at gmx-topmail.de (Tim Michelsen) Date: Tue, 1 Sep 2009 10:08:28 +0000 (UTC) Subject: [Numpy-discussion] genfromtext advice Message-ID: Hello, I tried to load a ASCII table into a string array. Unfortunately, this table has some empty chells Here it is: http://www.ncdc.noaa.gov/oa/climate/rcsg/cdrom/ismcs/alphanum.html After having converted this into a text file I tried this: $ np.genfromtxt('alphanum_to-text.txt', dtype=np.str_, delimiter='|', skiprows=1, missing='') But I got an error: 1043 dtype = np.dtype(ttype) 1044 # -> 1045 output = np.array(data, dtype) 1046 if usemask: 1047 if dtype.names: ValueError: setting an array element with a sequence I sometimes experience this error message with text read in. Could this message be made more helpful like telling in which line of the input file this occurs? This could really reduce the trial and error. What would be the correct way to read this data in? Thanks, Timmie From pgmdevlist at gmail.com Tue Sep 1 06:37:40 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Tue, 1 Sep 2009 06:37:40 -0400 Subject: [Numpy-discussion] genfromtext advice In-Reply-To: References: Message-ID: On Sep 1, 2009, at 6:08 AM, Tim Michelsen wrote: > Hello, > I tried to load a ASCII table into a string array. Unfortunately, > this table has > some empty chells > > Here it is: > http://www.ncdc.noaa.gov/oa/climate/rcsg/cdrom/ismcs/alphanum.html > > After having converted this into a text file I tried this: > > $ np.genfromtxt('alphanum_to-text.txt', dtype=np.str_, delimiter='|', > skiprows=1, missing='') > > But I got an error: > > 1043 dtype = np.dtype(ttype) > 1044 # > -> 1045 output = np.array(data, dtype) > 1046 if usemask: > 1047 if dtype.names: > > ValueError: setting an array element with a sequence > > I sometimes experience this error message with text read in. Could > this message > be made more helpful like telling in which line of the input file > this occurs? Mmh, perhaps. I'll try to see what I can do. Usually, this message shows up when one of the lines you have read doesn't have the same number of columns as the others. You mentioned that you're skipping one line : try skipping a few more and check. I can't really see what could go wrong in the link you sent. However, you converted it to txt, right ? You're 100% sure that you didn't miss something in the conversion ? Can you post the result (or send a link to?) From timmichelsen at gmx-topmail.de Tue Sep 1 07:18:32 2009 From: timmichelsen at gmx-topmail.de (Tim Michelsen) Date: Tue, 1 Sep 2009 11:18:32 +0000 (UTC) Subject: [Numpy-discussion] genfromtext advice References: Message-ID: > Mmh, perhaps. Thanks for the quick reply. > I'll try to see what I can do. Usually, this message > shows up when one of the lines you have read doesn't have the same > number of columns as the others. 
Could we add this error to the docstring? As I suggested, It would be helpful to get the line number where this occurs. >You mentioned that you're skipping > one line : try skipping a few more and check. Here is an example: http://pastebin.com/m508d1d00 If I skip the first 5512 lines everything goes well. > You're 100% sure that you didn't miss > something in the conversion ? Yes. And just did it again. Just edited the html source and then saved to text. no magic. > Can you post the result (or send a link > to?) May I add a improvement request and append the file there? Thanks, Timmie From lciti at essex.ac.uk Tue Sep 1 07:17:48 2009 From: lciti at essex.ac.uk (Citi, Luca) Date: Tue, 1 Sep 2009 12:17:48 +0100 Subject: [Numpy-discussion] genfromtext advice References: Message-ID: <3DA3B328CBC48B4EBB88484B8A5EA19106AF9E8F@sernt14.essex.ac.uk> I have tried $ awk -F '|' '{if(NF != 12) print NR;}' /tmp/pp.txt and besides the first 23 lines and the last 3 lines of the file, also the following have a number of '|' different from 11: 1635 2851 5538 i.e. BIKIN, BENGUERIR and TERESINA AIRPORT. But I completely agree with you, genfromtxt could print out the line number and the actual line giving problems. From jorgesmbox-ml at yahoo.es Tue Sep 1 08:07:36 2009 From: jorgesmbox-ml at yahoo.es (jorgesmbox-ml at yahoo.es) Date: Tue, 1 Sep 2009 12:07:36 +0000 (GMT) Subject: [Numpy-discussion] Question about np.savez Message-ID: <292629.75203.qm@web27902.mail.ukl.yahoo.com> Hi, I know the documentation states that np.savez saves numpy arrays, so my question relates to misusing it. Before reading the doc in detail, and after reading about pickle and other options to make data persistent, I passed np.savez a list of ndarrays. It didn't complain, but when I loaded the data back, the list had been turned into an ndarray. Is this behaviour expected? It did surprise me. Below there is an example: In [18]: l Out[18]: [array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]), array([ 0, 2, 4, 6, 8, 10, 12, 14, 16, 18]), array([ 0, 3, 6, 9, 12, 15, 18, 21, 24, 27])] In [19]: np.savez('test.npz', l=l) In [20]: data = np.load('test.npz') In [21]: l1 = data['l'] In [22]: l1 Out[22]: array([[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9], [ 0, 2, 4, 6, 8, 10, 12, 14, 16, 18], [ 0, 3, 6, 9, 12, 15, 18, 21, 24, 27]]) jorge From timmichelsen at gmx-topmail.de Tue Sep 1 08:54:15 2009 From: timmichelsen at gmx-topmail.de (Tim Michelsen) Date: Tue, 1 Sep 2009 12:54:15 +0000 (UTC) Subject: [Numpy-discussion] genfromtext advice References: <3DA3B328CBC48B4EBB88484B8A5EA19106AF9E8F@sernt14.essex.ac.uk> Message-ID: > $ awk -F '|' '{if(NF != 12) print NR;}' /tmp/pp.txt > and besides the first 23 lines and the last 3 lines of the file, > also the following have a number of '|' different from 11: > 1635 > 2851 > 5538 > i.e. BIKIN, BENGUERIR and TERESINA AIRPORT. Looks lika some bash magic. I will try to translate this into as python script... From cekees at gmail.com Tue Sep 1 09:03:29 2009 From: cekees at gmail.com (Chris Kees) Date: Tue, 1 Sep 2009 08:03:29 -0500 Subject: [Numpy-discussion] Facing the Multicore-Challenge, Heidelberg 2010 Message-ID: This conference may be of interest given the many discussions at SciPy on python support for parallel programming: http://www.multicore-challenge.org -------------- next part -------------- An HTML attachment was scrubbed... 
URL: 

From pav at iki.fi Tue Sep 1 09:08:32 2009 From: pav at iki.fi (Pauli Virtanen) Date: Tue, 1 Sep 2009 13:08:32 +0000 (UTC) Subject: [Numpy-discussion] Question about np.savez References: <292629.75203.qm@web27902.mail.ukl.yahoo.com> Message-ID: 

Tue, 01 Sep 2009 12:07:36 +0000, jorgesmbox-ml kirjoitti:
> I know the documentation states that np.savez saves numpy arrays, so my
> question relates to misusing it. Before reading the doc in detail, and
> after reading about pickle and other options to make data persistent, I
> passed np.savez a list of ndarrays. It didn't complain, but when I
> loaded the data back, the list had been turned into an ndarray. Is this
> behaviour expected? It did surprise me. Below there is an example:
[clip]

It is expected. savez casts its input to arrays before saving.

-- Pauli Virtanen

From lciti at essex.ac.uk Tue Sep 1 09:13:33 2009 From: lciti at essex.ac.uk (Citi, Luca) Date: Tue, 1 Sep 2009 14:13:33 +0100 Subject: [Numpy-discussion] genfromtext advice References: <3DA3B328CBC48B4EBB88484B8A5EA19106AF9E8F@sernt14.essex.ac.uk> Message-ID: <3DA3B328CBC48B4EBB88484B8A5EA19106AF9E90@sernt14.essex.ac.uk>

import sys
f = open(sys.argv[1], 'rt')
for l in f:
    if len(l.split('|')) != 12:
        print(l)

From timmichelsen at gmx-topmail.de Tue Sep 1 10:55:40 2009 From: timmichelsen at gmx-topmail.de (Tim Michelsen) Date: Tue, 1 Sep 2009 14:55:40 +0000 (UTC) Subject: [Numpy-discussion] genfromtext advice References: <3DA3B328CBC48B4EBB88484B8A5EA19106AF9E8F@sernt14.essex.ac.uk> Message-ID: 

> But I completely agree with you, genfromtxt could print out
> the line number and the actual line giving problems.
Here we go:
http://projects.scipy.org/numpy/ticket/1212

From bsouthey at gmail.com Tue Sep 1 12:49:49 2009 From: bsouthey at gmail.com (Bruce Southey) Date: Tue, 01 Sep 2009 11:49:49 -0500 Subject: [Numpy-discussion] genfromtext advice In-Reply-To: References: <3DA3B328CBC48B4EBB88484B8A5EA19106AF9E8F@sernt14.essex.ac.uk> Message-ID: <4A9D50AD.9080503@gmail.com>

On 09/01/2009 09:55 AM, Tim Michelsen wrote:
>> But I completely agree with you, genfromtxt could print out
>> the line number and the actual line giving problems.
>>
> Here we go:
> http://projects.scipy.org/numpy/ticket/1212
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
Hi, By the way, the error message comes from numpy.array(), not genfromtxt. Please look closer at the data, in particular: Station Name='BIKIN'. You should see something that should be there that is not. Bruce

From timmichelsen at gmx-topmail.de Tue Sep 1 12:50:31 2009 From: timmichelsen at gmx-topmail.de (Tim Michelsen) Date: Tue, 1 Sep 2009 16:50:31 +0000 (UTC) Subject: [Numpy-discussion] np.hist with masked values Message-ID: 

Hello, should creating a histogram with masked data be different than one created with unmasked data? Is np.hist tuned to work with histograms?

In the R project I would do:

# Note: values is my dataset
### masking zeros
values_mask=ifelse(values==0, NA, (values))
# http://stat.ethz.ch/R-manual/R-patched/library/stats/html/ecdf.html
plot(ecdf(values_mask), main="Empirical cumulative distribution function", do.points=F, xlab='Data Values')

There, masking the zeros has an effect; with pure numpy it does not. Thanks in advance for any comment.
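A small self-contained illustration of the behaviour being asked about (a sketch, not part of the original post): np.histogram does not honour the mask of a masked array, so the masked zeros are counted like ordinary data:

import numpy as np
import numpy.ma as ma

# Hypothetical data where zeros stand for missing values.
values = np.array([0.0, 0.0, 1.0, 2.0, 3.0, 4.0])
masked = ma.masked_equal(values, 0.0)

# The mask has no effect on np.histogram: both calls count the zeros.
counts_masked, edges = np.histogram(masked, bins=4)
counts_plain, _ = np.histogram(values, bins=4)
print(counts_masked)   # identical to counts_plain
print(counts_plain)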
Timmie From robert.kern at gmail.com Tue Sep 1 13:34:54 2009 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 1 Sep 2009 12:34:54 -0500 Subject: [Numpy-discussion] np.hist with masked values In-Reply-To: References: Message-ID: <3d375d730909011034x23eafe72n4fffbf1a7d3c6d0@mail.gmail.com> On Tue, Sep 1, 2009 at 11:50, Tim Michelsen wrote: > Hello, > should creating a histogram with masked data be different that one cretated with > unmasked data? Ideally, yes. Patches are welcome. In the meantime, use the .compressed() method on the masked array to get an ndarray with just the unmasked data to pass into np.histogram(). -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From sturla at molden.no Tue Sep 1 15:01:40 2009 From: sturla at molden.no (Sturla Molden) Date: Tue, 01 Sep 2009 21:01:40 +0200 Subject: [Numpy-discussion] ticket for O(n) median function? Message-ID: <4A9D6F94.8050903@molden.no> I could not find any, so I'll ask if it's ok to create one. I have a patch for /numpy/lib/function_base.py that uses any 'select' function to obtain the median. I'll also submit the Cython code for quickselect. Attachment (median.py.gz) contains the suggested implementation of median. I disabled overwrite_input because the median function calls numpy.apply_along_axis. Regards, Sturla Molden -------------- next part -------------- A non-text attachment was scrubbed... Name: median.py.gz Type: application/gzip Size: 1688 bytes Desc: not available URL: From robert.kern at gmail.com Tue Sep 1 15:20:42 2009 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 1 Sep 2009 14:20:42 -0500 Subject: [Numpy-discussion] ticket for O(n) median function? In-Reply-To: <4A9D6F94.8050903@molden.no> References: <4A9D6F94.8050903@molden.no> Message-ID: <3d375d730909011220o2a4b043aj86cda2edeb1475ec@mail.gmail.com> On Tue, Sep 1, 2009 at 14:01, Sturla Molden wrote: > I could not find any, so I'll ask if it's ok to create one. It's always okay to create a ticket. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From pav at iki.fi Tue Sep 1 15:21:06 2009 From: pav at iki.fi (Pauli Virtanen) Date: Tue, 1 Sep 2009 19:21:06 +0000 (UTC) Subject: [Numpy-discussion] ticket for O(n) median function? References: <4A9D6F94.8050903@molden.no> Message-ID: On 2009-09-01, Sturla Molden wrote: [clip] > I could not find any, so I'll ask if it's ok to create one. I have a > patch for /numpy/lib/function_base.py that uses any 'select' function to > obtain the median. I'll also submit the Cython code for quickselect. I'd say that just go ahead and create one -- there's little harm done even if it's a duplicate. -- Pauli Virtanen From dagss at student.matnat.uio.no Tue Sep 1 15:51:58 2009 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Tue, 01 Sep 2009 21:51:58 +0200 Subject: [Numpy-discussion] A faster median (Wirth's method) In-Reply-To: <4A9C9DDA.9060503@molden.no> References: <4A9C9DDA.9060503@molden.no> Message-ID: <4A9D7B5E.6040009@student.matnat.uio.no> Sturla Molden wrote: > We recently has a discussion regarding an optimization of NumPy's median > to average O(n) complexity. After some searching, I found out there is a > selection algorithm competitive in speed with Hoare's quick select. 
It > has the advantage of being a lot simpler to implement. In plain Python: > > import numpy as np > > def wirthselect(array, k): > > """ Niklaus Wirth's selection algortithm """ > > a = np.ascontiguousarray(array) > if (a is array): a = a.copy() > > l = 0 > m = a.shape[0] - 1 > while l < m: > x = a[k] > i = l > j = m > while 1: > while a[i] < x: i += 1 > while x < a[j]: j -= 1 > if i <= j: > tmp = a[i] > a[i] = a[j] > a[j] = tmp > i += 1 > j -= 1 > if i > j: break > if j < k: l = i > if k < i: m = j > > return a > > > Now, the median can be obtained in average O(n) time as: > > > def median(x): > > """ median in average O(n) time """ > > n = x.shape[0] > k = n >> 1 > s = wirthselect(x, k) > if n & 1: > return s[k] > else: > return 0.5*(s[k]+s[:k].max()) > > > The beauty of this is that Wirth select is extremely easy to migrate to > Cython: > > > import numpy > ctypedef numpy.double_t T # or whatever > > def wirthselect(numpy.ndarray[T, ndim=1] array, int k): > > cdef int i, j, l, m > cdef T x, tmp > cdef T *a > > _array = np.ascontiguousarray(array) > if (_array is array): _array = _array.copy() > a = _array.data > > l = 0 > m = a.shape[0] - 1 Nitpick: This will fail on large arrays. I guess numpy.npy_intp is the right type to use in this case? -- Dag Sverre From sturla at molden.no Tue Sep 1 17:19:59 2009 From: sturla at molden.no (Sturla Molden) Date: Tue, 01 Sep 2009 23:19:59 +0200 Subject: [Numpy-discussion] A faster median (Wirth's method) In-Reply-To: <4A9D7B5E.6040009@student.matnat.uio.no> References: <4A9C9DDA.9060503@molden.no> <4A9D7B5E.6040009@student.matnat.uio.no> Message-ID: <4A9D8FFF.6020903@molden.no> Dag Sverre Seljebotn skrev: > Nitpick: This will fail on large arrays. I guess numpy.npy_intp is the > right type to use in this case? > > Yup. You are right. Thanks. Sturla From sturla at molden.no Tue Sep 1 17:37:54 2009 From: sturla at molden.no (Sturla Molden) Date: Tue, 01 Sep 2009 23:37:54 +0200 Subject: [Numpy-discussion] A faster median (Wirth's method) In-Reply-To: <4A9D7B5E.6040009@student.matnat.uio.no> References: <4A9C9DDA.9060503@molden.no> <4A9D7B5E.6040009@student.matnat.uio.no> Message-ID: <4A9D9432.20907@molden.no> Dag Sverre Seljebotn skrev: > > Nitpick: This will fail on large arrays. I guess numpy.npy_intp is the > right type to use in this case? > By the way, here is a more polished version, does it look ok? http://projects.scipy.org/numpy/attachment/ticket/1213/generate_qselect.py http://projects.scipy.org/numpy/attachment/ticket/1213/quickselect.pyx Cython needs something like Java's generics by the way :-) Regards, Sturla Molden From sturla at molden.no Tue Sep 1 17:42:07 2009 From: sturla at molden.no (Sturla Molden) Date: Tue, 01 Sep 2009 23:42:07 +0200 Subject: [Numpy-discussion] A faster median (Wirth's method) In-Reply-To: <4A9D9432.20907@molden.no> References: <4A9C9DDA.9060503@molden.no> <4A9D7B5E.6040009@student.matnat.uio.no> <4A9D9432.20907@molden.no> Message-ID: <4A9D952F.4010202@molden.no> Sturla Molden skrev: > > By the way, here is a more polished version, does it look ok? No it doesn't... Got to keep the GIL for the general case (sorting object arrays). Fixing that. 
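For the dtype-templated code generation mentioned earlier in the thread, here is a toy sketch of how such a generator script could look (only an illustration of the idea, not the actual generate_qselect.py attached to the ticket; the template body and function names are made up):

# Toy template expansion: emit one select function per dtype and a dict
# ("jump table") mapping dtype names to the generated functions. The
# template body is elided; only the generation mechanism is sketched.
template = """
def select_%(name)s(numpy.ndarray[numpy.%(ctype)s, ndim=1] a, int k):
    # ... Wirth/Hoare select with T = numpy.%(ctype)s ...
    pass
"""

dtypes = [("float64", "double_t"),
          ("float32", "float32_t"),
          ("int32", "int32_t"),
          ("int64", "int64_t")]

def generate(dtypes):
    chunks = [template % {"name": name, "ctype": ctype}
              for name, ctype in dtypes]
    jump_table = "_select = {%s}" % ", ".join(
        "'%s': select_%s" % (name, name) for name, _ in dtypes)
    return "\n".join(chunks) + "\n" + jump_table + "\n"

if __name__ == "__main__":
    print(generate(dtypes))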
SM From dwf at cs.toronto.edu Tue Sep 1 19:03:16 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Tue, 1 Sep 2009 19:03:16 -0400 Subject: [Numpy-discussion] snow leopard and Numeric In-Reply-To: <8AD04A53-5C75-4159-B224-07B1BB85A212@yahoo.it> References: <8AD04A53-5C75-4159-B224-07B1BB85A212@yahoo.it> Message-ID: On 1-Sep-09, at 4:08 AM, Stefano Covino wrote: > I have just upgraded my Mac laptop to snow leopard. > However, I can no more compile Numeric 24.2. Do you really need Numeric? NumPy provides all of the functionality of Numeric and then some. David From dwf at cs.toronto.edu Tue Sep 1 20:36:38 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Tue, 1 Sep 2009 20:36:38 -0400 Subject: [Numpy-discussion] Question about np.savez In-Reply-To: References: <292629.75203.qm@web27902.mail.ukl.yahoo.com> Message-ID: <33909AD0-CBBC-4A83-8BC5-A14DE792F75D@cs.toronto.edu> On 1-Sep-09, at 9:08 AM, Pauli Virtanen wrote: > Tue, 01 Sep 2009 12:07:36 +0000, jorgesmbox-ml kirjoitti: >> I know the documentation states that np.savez saves numpy arrays, >> so my >> question relates to misusing it. Before reading the doc in detail, >> and >> after reading about pickle and other options to make data >> persistent, I >> passed np.savez a list of ndarrays. It didn't complain, but when I >> loaded the data back, the list had been turned into an ndarray. Is >> this >> behaviour expected? It did surprise me. Below there is an example: > [clip] > > It is expected. savez casts its input to arrays before saving. If you actually want to save multiple arrays, you can use savez('fname', *[a,b,c]) and they will be accessible under the names arr_0, arr_1, etc. and a list of these names is in the 'files' attribute on the NpzFile object. To retrieve your list of arrays when you load, you can just do mynewlist = [data[arrname] for arrname in data.files] David From jorgesmbox-ml at yahoo.es Tue Sep 1 22:11:09 2009 From: jorgesmbox-ml at yahoo.es (Jorge Scandaliaris) Date: Wed, 2 Sep 2009 02:11:09 +0000 (UTC) Subject: [Numpy-discussion] Question about np.savez References: <292629.75203.qm@web27902.mail.ukl.yahoo.com> <33909AD0-CBBC-4A83-8BC5-A14DE792F75D@cs.toronto.edu> Message-ID: David Warde-Farley cs.toronto.edu> writes: > If you actually want to save multiple arrays, you can use > savez('fname', *[a,b,c]) and they will be accessible under the names > arr_0, arr_1, etc. and a list of these names is in the 'files' > attribute on the NpzFile object. To retrieve your list of arrays when > you load, you can just do > > mynewlist = [data[arrname] for arrname in data.files] > Thanks for the tip. I have realized, though, that I might need some more flexibility than just the ability to save ndarrays. The data I am dealing with is best kept in a hierarchical way (I could represent the structure with ndarrays also, but I think it would be messy and difficult). I am having a look at h5py to see if it fulfill my needs. I know there is pytables, too, but from having a quick look it seems h5py is simpler. Am I right on this?. I also get a nice side-effect, the data would be readable by the de-facto standard software used by most people in my field. Jorge From chanley at stsci.edu Tue Sep 1 22:35:21 2009 From: chanley at stsci.edu (Christopher Hanley) Date: Tue, 01 Sep 2009 22:35:21 -0400 Subject: [Numpy-discussion] numpy rev7353 w/ OS X 10.6 Message-ID: <4A9DD9E9.3000507@stsci.edu> Hi, Upgraded to Snow Leopard, left setup.py and all environment variables the same, tried latest numpy from source. 
This is the build error I receive: Running from numpy source directory. non-existing path in 'numpy/distutils': 'site.cfg' F2PY Version 2_7353 numpy/core/setup_common.py:81: MismatchCAPIWarning: API mismatch detected, the C API version numbers have to be updated. Current C api version is 3, with checksum 4526bc5a07e51d13a2db642715fedca5, but recorded checksum for C API version 3 in codegen_dir/cversions.txt is bf22c0d05b31625d2a7015988d61ce5a. If functions were added in the C API, you have to update C_API_VERSION in numpy/core/setup_common.pyc. MismatchCAPIWarning) blas_opt_info: FOUND: extra_link_args = ['-Wl,-framework', '-Wl,Accelerate'] define_macros = [('NO_ATLAS_INFO', 3)] extra_compile_args = ['-msse3', '-I/System/Library/Frameworks/vecLib.framework/Headers'] lapack_opt_info: FOUND: extra_link_args = ['-Wl,-framework', '-Wl,Accelerate'] define_macros = [('NO_ATLAS_INFO', 3)] extra_compile_args = ['-msse3'] running install running build running config_cc unifing config_cc, config, build_clib, build_ext, build commands --compiler options running config_fc unifing config_fc, config, build_clib, build_ext, build commands --fcompiler options running build_src build_srca building py_modules sources creating build creating build/src.macosx-10.3-i386-2.5 creating build/src.macosx-10.3-i386-2.5/numpy creating build/src.macosx-10.3-i386-2.5/numpy/distutils building library "npymath" sources customize NAGFCompiler Could not locate executable f95 Found executable /usr/local/bin/gfortran nag: no Fortran 90 compiler found Could not locate executable ranlib nag: no Fortran 90 compiler found nag: no Fortran 90 compiler found customize AbsoftFCompiler Could not locate executable f90 absoft: no Fortran 90 compiler found absoft: no Fortran 90 compiler found absoft: no Fortran 90 compiler found absoft: no Fortran 90 compiler found absoft: no Fortran 90 compiler found absoft: no Fortran 90 compiler found customize IBMFCompiler Could not locate executable xlf90 ibm: no Fortran 90 compiler found Could not locate executable xlf95 ibm: no Fortran 90 compiler found customize IntelFCompiler Could not locate executable ifort Could not locate executable ifc intel: no Fortran 90 compiler found intel: no Fortran 90 compiler found customize GnuFCompiler gnu: no Fortran 90 compiler found Found executable /usr/local/bin/g77 gnu: no Fortran 90 compiler found /Users/chanley/dev/numpy/numpy/distutils/fcompiler/gnu.py:131: UserWarning: Env. variable MACOSX_DEPLOYMENT_TARGET set to 10.3 warnings.warn(s) customize Gnu95FCompiler Found executable /usr/local/bin/gfortran customize Gnu95FCompiler customize Gnu95FCompiler using config C compiler: cc -fno-strict-aliasing -Wno-long-double -no-cpp-precomp -mno-fused-madd -DNDEBUG -g -O3 -Wall -Wstrict-prototypes compile options: '-Inumpy/core/src -Inumpy/core/src/npymath -Inumpy/core/src/multiarray -Inumpy/core/src/umath -Inumpy/core/include -I/usr/stsci/pyssgdev/Python-2.5.1/include/python2.5 -c' cc: _configtest.c sh: cc: command not found sh: cc: command not found failure. removing: _configtest.c _configtest.o C compiler: cc -fno-strict-aliasing -Wno-long-double -no-cpp-precomp -mno-fused-madd -DNDEBUG -g -O3 -Wall -Wstrict-prototypes compile options: '-Inumpy/core/src -Inumpy/core/src/npymath -Inumpy/core/src/multiarray -Inumpy/core/src/umath -Inumpy/core/include -I/usr/stsci/pyssgdev/Python-2.5.1/include/python2.5 -c' cc: _configtest.c sh: cc: command not found sh: cc: command not found failure. 
removing: _configtest.c _configtest.o C compiler: cc -fno-strict-aliasing -Wno-long-double -no-cpp-precomp -mno-fused-madd -DNDEBUG -g -O3 -Wall -Wstrict-prototypes compile options: '-Inumpy/core/src -Inumpy/core/src/npymath -Inumpy/core/src/multiarray -Inumpy/core/src/umath -Inumpy/core/include -I/usr/stsci/pyssgdev/Python-2.5.1/include/python2.5 -c' cc: _configtest.c sh: cc: command not found sh: cc: command not found failure. removing: _configtest.c _configtest.o Traceback (most recent call last): File "setup.py", line 186, in setup_package() File "setup.py", line 179, in setup_package configuration=configuration ) File "/Users/chanley/dev/numpy/numpy/distutils/core.py", line 186, in setup return old_setup(**new_attr) File "/usr/stsci/pyssgdev/Python-2.5.1/lib/python2.5/distutils/core.py", line 151, in setup dist.run_commands() File "/usr/stsci/pyssgdev/Python-2.5.1/lib/python2.5/distutils/dist.py", line 974, in run_commands self.run_command(cmd) File "/usr/stsci/pyssgdev/Python-2.5.1/lib/python2.5/distutils/dist.py", line 994, in run_command cmd_obj.run() File "/Users/chanley/dev/numpy/numpy/distutils/command/install.py", line 52, in run r = old_install.run(self) File "/usr/stsci/pyssgdev/Python-2.5.1/lib/python2.5/distutils/command/install.py", line 506, in run self.run_command('build') File "/usr/stsci/pyssgdev/Python-2.5.1/lib/python2.5/distutils/cmd.py", line 333, in run_command self.distribution.run_command(command) File "/usr/stsci/pyssgdev/Python-2.5.1/lib/python2.5/distutils/dist.py", line 994, in run_command cmd_obj.run() File "/Users/chanley/dev/numpy/numpy/distutils/command/build.py", line 37, in run old_build.run(self) File "/usr/stsci/pyssgdev/Python-2.5.1/lib/python2.5/distutils/command/build.py", line 112, in run self.run_command(cmd_name) File "/usr/stsci/pyssgdev/Python-2.5.1/lib/python2.5/distutils/cmd.py", line 333, in run_command self.distribution.run_command(command) File "/usr/stsci/pyssgdev/Python-2.5.1/lib/python2.5/distutils/dist.py", line 994, in run_command cmd_obj.run() File "/Users/chanley/dev/numpy/numpy/distutils/command/build_src.py", line 151, in run self.build_sources() File "/Users/chanley/dev/numpy/numpy/distutils/command/build_src.py", line 162, in build_sources self.build_library_sources(*libname_info) File "/Users/chanley/dev/numpy/numpy/distutils/command/build_src.py", line 297, in build_library_sources sources = self.generate_sources(sources, (lib_name, build_info)) File "/Users/chanley/dev/numpy/numpy/distutils/command/build_src.py", line 384, in generate_sources source = func(extension, build_dir) File "numpy/core/setup.py", line 590, in get_mathlib_info mlibs = check_mathlib(config_cmd) File "numpy/core/setup.py", line 283, in check_mathlib raise EnvironmentError("math library missing; rerun " EnvironmentError: math library missing; rerun setup.py after setting the MATHLIB env variable Does Python need to be rebuild? I was using 2.5 I downloaded from python.org. 
Thanks, Chris -- Christopher Hanley Senior Systems Software Engineer Space Telescope Science Institute 3700 San Martin Drive Baltimore MD, 21218 (410) 338-4338 From robert.kern at gmail.com Tue Sep 1 22:56:33 2009 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 1 Sep 2009 21:56:33 -0500 Subject: [Numpy-discussion] numpy rev7353 w/ OS X 10.6 In-Reply-To: <4A9DD9E9.3000507@stsci.edu> References: <4A9DD9E9.3000507@stsci.edu> Message-ID: <3d375d730909011956x3a7abb09k48de0d50cccf06a6@mail.gmail.com> On Tue, Sep 1, 2009 at 21:35, Christopher Hanley wrote: > Hi, > > Upgraded to Snow Leopard, left setup.py and all environment variables > the same, tried latest numpy from source. ?This is the build error I > receive: > C compiler: cc -fno-strict-aliasing -Wno-long-double -no-cpp-precomp > -mno-fused-madd -DNDEBUG -g -O3 -Wall -Wstrict-prototypes > > compile options: '-Inumpy/core/src -Inumpy/core/src/npymath > -Inumpy/core/src/multiarray -Inumpy/core/src/umath -Inumpy/core/include > -I/usr/stsci/pyssgdev/Python-2.5.1/include/python2.5 -c' > cc: _configtest.c > sh: cc: command not found Do you have a CC environment variable set incorrectly to "cc"? -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From chanley at stsci.edu Tue Sep 1 23:06:44 2009 From: chanley at stsci.edu (Christopher Hanley) Date: Tue, 01 Sep 2009 23:06:44 -0400 Subject: [Numpy-discussion] numpy rev7353 w/ OS X 10.6 In-Reply-To: <3d375d730909011956x3a7abb09k48de0d50cccf06a6@mail.gmail.com> References: <4A9DD9E9.3000507@stsci.edu> <3d375d730909011956x3a7abb09k48de0d50cccf06a6@mail.gmail.com> Message-ID: <4A9DE144.9080807@stsci.edu> Robert Kern wrote: > On Tue, Sep 1, 2009 at 21:35, Christopher Hanley wrote: > >> Hi, >> >> Upgraded to Snow Leopard, left setup.py and all environment variables >> the same, tried latest numpy from source. This is the build error I >> receive: >> > > >> C compiler: cc -fno-strict-aliasing -Wno-long-double -no-cpp-precomp >> -mno-fused-madd -DNDEBUG -g -O3 -Wall -Wstrict-prototypes >> >> compile options: '-Inumpy/core/src -Inumpy/core/src/npymath >> -Inumpy/core/src/multiarray -Inumpy/core/src/umath -Inumpy/core/include >> -I/usr/stsci/pyssgdev/Python-2.5.1/include/python2.5 -c' >> cc: _configtest.c >> sh: cc: command not found >> > > Do you have a CC environment variable set incorrectly to "cc"? > > Yes. But that appears to only be a symptom. It appears that my entire XCode installation has disappeared. Perhaps that is how all that space was freed on my system. ;-) Grr..... OK. Ignore my message. I am going to have to reset and start again. Thanks, Chris -- Christopher Hanley Senior Systems Software Engineer Space Telescope Science Institute 3700 San Martin Drive Baltimore MD, 21218 (410) 338-4338 From dwf at cs.toronto.edu Tue Sep 1 23:39:29 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Tue, 1 Sep 2009 23:39:29 -0400 Subject: [Numpy-discussion] Question about np.savez In-Reply-To: References: <292629.75203.qm@web27902.mail.ukl.yahoo.com> <33909AD0-CBBC-4A83-8BC5-A14DE792F75D@cs.toronto.edu> Message-ID: <9F7FA763-9683-46B2-B19F-56387DCBCCA7@cs.toronto.edu> On 1-Sep-09, at 10:11 PM, Jorge Scandaliaris wrote: > David Warde-Farley cs.toronto.edu> writes: >> If you actually want to save multiple arrays, you can use >> savez('fname', *[a,b,c]) and they will be accessible under the names >> arr_0, arr_1, etc. 
and a list of these names is in the 'files' >> attribute on the NpzFile object. To retrieve your list of arrays when >> you load, you can just do >> >> mynewlist = [data[arrname] for arrname in data.files] >> > > Thanks for the tip. I have realized, though, that I might need some > more > flexibility than just the ability to save ndarrays. The data I am > dealing with > is best kept in a hierarchical way (I could represent the structure > with > ndarrays also, but I think it would be messy and difficult). I am > having a look > at h5py to see if it fulfill my needs. I know there is pytables, > too, but from > having a quick look it seems h5py is simpler. Am I right on this?. I wouldn't say one is 'simpler' or 'more complicated'; they're different in approach. From the h5py FAQ: The two projects have different design goals. PyTables presents a database-like approach to data storage, providing features like indexing and fast "in-kernel" queries on dataset contents. It also has a custom system to represent data types. In contrast, h5py is an attempt to map the HDF5 feature set to NumPy as closely as possible. For example, the high-level type system uses NumPy dtype objects exclusively, and method and attribute naming follows Python and NumPy conventions for dictionary and array access (i.e. ".dtype" and ".shape" attributes for datasets, obj[name] indexing syntax for groups, etc). So, if you have huge amounts of data and you want to do complicated queries on discontiguous subsets of it, PyTables is the clear winner. The types systems are quite similar but there is some extra work involved with PyTables. h5py, on the other hand, provides a nearly complete wrapping of the HDF5 C API, in addition to the NumPy integration. The truth is, both of them/either of them integrate nicely with NumPy. They have overlapping featuresets, just different design philosophies. David From robert.kern at gmail.com Tue Sep 1 23:50:57 2009 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 1 Sep 2009 22:50:57 -0500 Subject: [Numpy-discussion] Question about np.savez In-Reply-To: References: <292629.75203.qm@web27902.mail.ukl.yahoo.com> <33909AD0-CBBC-4A83-8BC5-A14DE792F75D@cs.toronto.edu> Message-ID: <3d375d730909012050m305e1e09q20ff7efdbf027356@mail.gmail.com> On Tue, Sep 1, 2009 at 21:11, Jorge Scandaliaris wrote: > David Warde-Farley cs.toronto.edu> writes: >> If you actually want to save multiple arrays, you can use >> savez('fname', *[a,b,c]) and they will be accessible under the names >> arr_0, arr_1, etc. and a list of these names is in the 'files' >> attribute on the NpzFile object. To retrieve your list of arrays when >> you load, you can just do >> >> mynewlist = [data[arrname] for arrname in data.files] >> > > Thanks for the tip. I have realized, though, that I might need some more > flexibility than just the ability to save ndarrays. The data I am dealing with > is best kept in a hierarchical way (I could represent the structure with > ndarrays also, but I think it would be messy and difficult). I am having a look > at h5py to see if it fulfill my needs. I know there is pytables, too, but from > having a quick look it seems h5py is simpler. Am I right on this?. I also get a > nice side-effect, the data would be readable by the de-facto standard software > used by most people in my field. If there is a particular format that uses HDF5 that you are trying to replicate, h5py is the clear answer. 
However, PyTables will, by and large, make files that are entirely readable by other HDF5 libraries when you just use the subset of features that is supported by HDF5-proper. For example, tables and arrays work just fine. What won't be supported by non-PyTables libraries are things like dataset attributes which are pickled objects. Your non-PyTables HDF5 apps will see some extraneous attributes on the arrays and tables, but those are typically not necessary for interpretation. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From stefano_covino at yahoo.it Wed Sep 2 00:37:11 2009 From: stefano_covino at yahoo.it (Stefano Covino) Date: Wed, 2 Sep 2009 04:37:11 +0000 (UTC) Subject: [Numpy-discussion] snow leopard and Numeric References: <8AD04A53-5C75-4159-B224-07B1BB85A212@yahoo.it> Message-ID: David Warde-Farley cs.toronto.edu> writes: > > On 1-Sep-09, at 4:08 AM, Stefano Covino wrote: > > > I have just upgraded my Mac laptop to snow leopard. > > However, I can no more compile Numeric 24.2. > > Do you really need Numeric? NumPy provides all of the functionality of > Numeric and then some. > > David > Hi, of course you are all right. NumPy is much better. Essentially I was just curious to understand what it is wrong given that Numeric compiled smoothly with the previous Mac OSX version. Cheers, Stefano From robert.kern at gmail.com Wed Sep 2 00:44:34 2009 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 1 Sep 2009 23:44:34 -0500 Subject: [Numpy-discussion] snow leopard and Numeric In-Reply-To: References: <8AD04A53-5C75-4159-B224-07B1BB85A212@yahoo.it> Message-ID: <3d375d730909012144l23d66258q459d5b0966815f45@mail.gmail.com> On Tue, Sep 1, 2009 at 23:37, Stefano Covino wrote: > of course you are all right. NumPy is much better. Essentially I was just > curious to understand what it is wrong given that Numeric compiled smoothly with > the previous Mac OSX version. The 64-bit version of OS X complies to a different UNIX standard than the 32-bit version. gettimeofday(), which is being used to seed the random number generator, is one of the affected functions. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From stefano_covino at yahoo.it Wed Sep 2 00:50:04 2009 From: stefano_covino at yahoo.it (Stefano Covino) Date: Wed, 2 Sep 2009 04:50:04 +0000 (UTC) Subject: [Numpy-discussion] snow leopard and Numeric References: <8AD04A53-5C75-4159-B224-07B1BB85A212@yahoo.it> <3d375d730909012144l23d66258q459d5b0966815f45@mail.gmail.com> Message-ID: > > The 64-bit version of OS X complies to a different UNIX standard than > the 32-bit version. gettimeofday(), which is being used to seed the > random number generator, is one of the affected functions. > Thanks. I guessed something like this. Is there a way to constrain an old-style compilation just to make a code work? I have similar problems with other old pieces of code. 
Cheers, Stefano From robert.kern at gmail.com Wed Sep 2 00:53:05 2009 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 1 Sep 2009 23:53:05 -0500 Subject: [Numpy-discussion] snow leopard and Numeric In-Reply-To: References: <8AD04A53-5C75-4159-B224-07B1BB85A212@yahoo.it> <3d375d730909012144l23d66258q459d5b0966815f45@mail.gmail.com> Message-ID: <3d375d730909012153o2736cee4r430f597dd37c5a3@mail.gmail.com> On Tue, Sep 1, 2009 at 23:50, Stefano Covino wrote: >> >> The 64-bit version of OS X complies to a different UNIX standard than >> the 32-bit version. gettimeofday(), which is being used to seed the >> random number generator, is one of the affected functions. > > Thanks. I guessed something like this. > > Is there a way to constrain an old-style compilation just to make a code > work? I have similar problems with other old pieces of code. Use "-arch i686" in the CFLAGS and LDFLAGS. I think. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From sturla at molden.no Wed Sep 2 01:34:01 2009 From: sturla at molden.no (Sturla Molden) Date: Wed, 02 Sep 2009 07:34:01 +0200 Subject: [Numpy-discussion] A faster median (Wirth's method) In-Reply-To: <4A9D9432.20907@molden.no> References: <4A9C9DDA.9060503@molden.no> <4A9D7B5E.6040009@student.matnat.uio.no> <4A9D9432.20907@molden.no> Message-ID: <4A9E03C9.70003@molden.no> Sturla Molden skrev: > > http://projects.scipy.org/numpy/attachment/ticket/1213/generate_qselect.py > http://projects.scipy.org/numpy/attachment/ticket/1213/quickselect.pyx My suggestion for a new median function is here: http://projects.scipy.org/numpy/attachment/ticket/1213/median.py The quickselect extension module can be compiled from C source: http://projects.scipy.org/numpy/attachment/ticket/1213/quickselect.c.gz Feel free to look at it when you have time. :-) Regards, Sturla Molden From sole at esrf.fr Wed Sep 2 03:40:49 2009 From: sole at esrf.fr (=?ISO-8859-1?Q?=22V=2E_Armando_Sol=E9=22?=) Date: Wed, 02 Sep 2009 09:40:49 +0200 Subject: [Numpy-discussion] How to concatenate two arrays without duplicating memory? Message-ID: <4A9E2181.70702@esrf.fr> Hello, Let's say we have two arrays A and B of shapes (10000, 2000) and (10000, 4000). If I do C=numpy.concatenate((A, B), axis=1), I get a new array of dimension (10000, 6000) with duplication of memory. I am looking for a way to have a non contiguous array C in which the "left" (10000, 2000) elements point to A and the "right" (10000, 4000) elements point to B. Any hint will be appreciated. Thanks, Armando From faltet at pytables.org Wed Sep 2 03:46:44 2009 From: faltet at pytables.org (Francesc Alted) Date: Wed, 2 Sep 2009 09:46:44 +0200 Subject: [Numpy-discussion] Question about np.savez In-Reply-To: <3d375d730909012050m305e1e09q20ff7efdbf027356@mail.gmail.com> References: <292629.75203.qm@web27902.mail.ukl.yahoo.com> <3d375d730909012050m305e1e09q20ff7efdbf027356@mail.gmail.com> Message-ID: <200909020946.47491.faltet@pytables.org> A Wednesday 02 September 2009 05:50:57 Robert Kern escrigu?: > On Tue, Sep 1, 2009 at 21:11, Jorge Scandaliaris wrote: > > David Warde-Farley cs.toronto.edu> writes: > >> If you actually want to save multiple arrays, you can use > >> savez('fname', *[a,b,c]) and they will be accessible under the names > >> arr_0, arr_1, etc. and a list of these names is in the 'files' > >> attribute on the NpzFile object. 
To retrieve your list of arrays when > >> you load, you can just do > >> > >> mynewlist = [data[arrname] for arrname in data.files] > > > > Thanks for the tip. I have realized, though, that I might need some more > > flexibility than just the ability to save ndarrays. The data I am dealing > > with is best kept in a hierarchical way (I could represent the structure > > with ndarrays also, but I think it would be messy and difficult). I am > > having a look at h5py to see if it fulfill my needs. I know there is > > pytables, too, but from having a quick look it seems h5py is simpler. Am > > I right on this?. I also get a nice side-effect, the data would be > > readable by the de-facto standard software used by most people in my > > field. > > If there is a particular format that uses HDF5 that you are trying to > replicate, h5py is the clear answer. However, PyTables will, by and > large, make files that are entirely readable by other HDF5 libraries > when you just use the subset of features that is supported by > HDF5-proper. For example, tables and arrays work just fine. What won't > be supported by non-PyTables libraries are things like dataset > attributes which are pickled objects. Your non-PyTables HDF5 apps will > see some extraneous attributes on the arrays and tables, but those are > typically not necessary for interpretation. Most of these 'extraneous' attributes are derived from the use of the high level HDF5 interface (http://www.hdfgroup.org/HDF5/doc/HL/). If they bother you, you can get rid of them by setting the parameter ``PYTABLES_SYS_ATTRS`` to false (either in tables/parameters.py or passing it to `tables.openFile`). -- Francesc Alted From gael.varoquaux at normalesup.org Wed Sep 2 04:24:35 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Wed, 2 Sep 2009 10:24:35 +0200 Subject: [Numpy-discussion] How to concatenate two arrays without duplicating memory? In-Reply-To: <4A9E2181.70702@esrf.fr> References: <4A9E2181.70702@esrf.fr> Message-ID: <20090902082435.GC27947@phare.normalesup.org> On Wed, Sep 02, 2009 at 09:40:49AM +0200, "V. Armando Sol?" wrote: > Let's say we have two arrays A and B of shapes (10000, 2000) and (10000, > 4000). > If I do C=numpy.concatenate((A, B), axis=1), I get a new array of > dimension (10000, 6000) with duplication of memory. > I am looking for a way to have a non contiguous array C in which the > "left" (10000, 2000) elements point to A and the "right" (10000, 4000) > elements point to B. You cannot in the numpy memory model. The numpy memory model defines an array as something that has regular strides to jump from an element to the next one. Ga?l From lciti at essex.ac.uk Wed Sep 2 04:50:03 2009 From: lciti at essex.ac.uk (Citi, Luca) Date: Wed, 2 Sep 2009 09:50:03 +0100 Subject: [Numpy-discussion] A faster median (Wirth's method) References: <4A9C9DDA.9060503@molden.no> <4A9D7B5E.6040009@student.matnat.uio.no> <4A9D9432.20907@molden.no> <4A9E03C9.70003@molden.no> Message-ID: <3DA3B328CBC48B4EBB88484B8A5EA19106AF9E93@sernt14.essex.ac.uk> Hello Sturla, I had a quick look at your code. Looks fine. A few notes... In "select" you should replace numpy with np. In "_median" how can you, if n==2, use s[] if s is not defined? What if n==1? Also, I think when returning an empty array, it should be of the same type you would get in the other cases. You could replace _median with the following. 
Best,
Luca

def _median(x, inplace):
    assert(x.ndim == 1)
    n = x.shape[0]
    if n > 2:
        k = n >> 1
        s = select(x, k, inplace=inplace)
        if n & 1:
            return s[k]
        else:
            return 0.5*(s[k]+s[:k].max())
    elif n == 0:
        return np.empty(0, dtype=x.dtype)
    elif n == 2:
        return 0.5*(x[0]+x[1])
    else: # n == 1
        return x[0]

From lciti at essex.ac.uk Wed Sep 2 04:57:36 2009 From: lciti at essex.ac.uk (Citi, Luca) Date: Wed, 2 Sep 2009 09:57:36 +0100 Subject: [Numpy-discussion] How to concatenate two arrays without duplicating memory? References: <4A9E2181.70702@esrf.fr> <20090902082435.GC27947@phare.normalesup.org> Message-ID: <3DA3B328CBC48B4EBB88484B8A5EA19106AF9E94@sernt14.essex.ac.uk>

As Gaël pointed out you cannot create A, B and then C as the concatenation of A and B without duplicating the vectors.

> I am looking for a way to have a non contiguous array C in which the
> "left" (10000, 2000) elements point to A and the "right" (10000, 4000)
> elements point to B.
But you can still re-link A to the left elements and B to the right ones afterwards by using views into C.

>>> C=numpy.concatenate((A, B), axis=1)
>>> A,B = C[:,:2000], C[:,2000:]

Best,
Luca

From sole at esrf.fr Wed Sep 2 05:31:06 2009 From: sole at esrf.fr (=?ISO-8859-1?Q?=22V=2E_Armando_Sol=E9=22?=) Date: Wed, 02 Sep 2009 11:31:06 +0200 Subject: [Numpy-discussion] How to concatenate two arrays without duplicating memory?
In-Reply-To: <3DA3B328CBC48B4EBB88484B8A5EA19106AF9E94@sernt14.essex.ac.uk> References: <4A9E2181.70702@esrf.fr> <20090902082435.GC27947@phare.normalesup.org> <3DA3B328CBC48B4EBB88484B8A5EA19106AF9E94@sernt14.essex.ac.uk> Message-ID: <4A9E3B5A.20405@esrf.fr> Citi, Luca wrote: > As Ga?l pointed out you cannot create A, B and then C > as the concatenation of A and B without duplicating > the vectors. > > But you can still re-link A to the left elements > and B to the right ones afterwards by using views into C. > Thanks for the hint. In my case the A array is already present and the contents of the B array can be read from disk. At least I have two workarounds making use of your suggested solution of re-linking: - create the C array, copy the contents of A to it and read the contents of B directly into C with duplication of the memory of A during some time. - save the array A in disk, create the array C, read the contents of A and B into it and re-link A and B with no duplication but ugly. Thanks, Armando From faltet at pytables.org Wed Sep 2 05:47:30 2009 From: faltet at pytables.org (Francesc Alted) Date: Wed, 2 Sep 2009 11:47:30 +0200 Subject: [Numpy-discussion] Question about np.savez In-Reply-To: References: <292629.75203.qm@web27902.mail.ukl.yahoo.com> <9F7FA763-9683-46B2-B19F-56387DCBCCA7@cs.toronto.edu> Message-ID: <200909021147.30868.faltet@pytables.org> A Wednesday 02 September 2009 11:20:55 Jorge Scandaliaris escrigu?: > Thanks David, Robert and Francesc for comments and suggestions. It's nice > having options, but that also means one has to choose ;) > I will have a closer look at pytables. The thing that got me "scared" about > it was the word database. I have close to zero experience using or, even > worst, designing databases. Maybe I am wrong. The way I was considering for > structuring could be considered like a, rudimentary at least, database. Well, I agree that the term 'database' is perhaps a bit scaring and I don't actually like this term to be applied to PyTables --I always like to say that PyTables is not a database competitor, but rather a companion. Just for completeness, here it is my own comparison among PyTables and h5py: http://www.pytables.org/moin/FAQ#HowdoesPyTablescomparewiththeh5pyproject.3F > I have the feeling this is turning into killing a fly with a cannon... Maybe. But if you are going to keep many data on-disk, it can be a nice advantage in the medium term. HTH, -- Francesc Alted From seb.haase at gmail.com Wed Sep 2 05:58:15 2009 From: seb.haase at gmail.com (Sebastian Haase) Date: Wed, 2 Sep 2009 11:58:15 +0200 Subject: [Numpy-discussion] How to concatenate two arrayswithout duplicating memory? In-Reply-To: <4A9E3B5A.20405@esrf.fr> References: <4A9E2181.70702@esrf.fr> <20090902082435.GC27947@phare.normalesup.org> <3DA3B328CBC48B4EBB88484B8A5EA19106AF9E94@sernt14.essex.ac.uk> <4A9E3B5A.20405@esrf.fr> Message-ID: Hi, depending on the needs you have you might be interested in my "minimal implementation" of what I call a mock-ndarray. I needed somthing like this to analyze higher dimensional stacks of 2d images and what I needed was mostly the indexing features of nd-arrays. A mockarray is initialized with a list of nd-arrays. The result is a mock array having one additional dimention "in front". >>> a = N.arange(9) >>> b = N.arange(9) >>> a.shape=3,3 >>> b.shape=3,3 >>> c = F.mockNDarray(a,b) >>> c.shape (2, 3, 3) >>> c[2,2,2] >>> c[1,2,2] 8 No memory copy is done. 
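For the simpler two-block case discussed above in this thread, the re-linking idea can also be applied the other way around: allocate the big array first and make A and B views into it before filling them (a rough sketch with scaled-down, hypothetical shapes; in the original use case B would be read from disk straight into its view):

import numpy as np

# Allocate the full array first, then treat A and B as views into it.
C = np.empty((1000, 600))
A = C[:, :200]   # "left" block, a view, no copy
B = C[:, 200:]   # "right" block, a view, no copy

# Fill the views in place; C sees the data without any concatenation step.
A[...] = np.random.rand(1000, 200)
B[...] = np.random.rand(1000, 400)

assert A.base is C and B.base is C   # both share C's memory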
I put the module file here http://drop.io/kpu4bib/asset/mockndarray-py Otherwise this is part of my (BSD) "Priithon" image analysis framework. Regards Sebastian Haase On Wed, Sep 2, 2009 at 11:31 AM, "V. Armando Sol?" wrote: > Citi, Luca wrote: >> As Ga?l pointed out you cannot create A, B and then C >> as the concatenation of A and B without duplicating >> the vectors. >> >> But you can still re-link A to the left elements >> and B to the right ones afterwards by using views into C. >> > > Thanks for the hint. In my case the A array is already present and the > contents of the B array can be read from disk. > > At least I have two workarounds making use of your suggested solution of > re-linking: > > - create the C array, copy the contents of A to it and read the contents > of B directly into C with duplication of the memory of A during some time. > > - save the array A in disk, create the array C, read the contents of A > and B into it and re-link A and B with no duplication but ugly. > > Thanks, > > Armando > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From stefano_covino at yahoo.it Wed Sep 2 06:00:06 2009 From: stefano_covino at yahoo.it (Stefano Covino) Date: Wed, 2 Sep 2009 10:00:06 +0000 (UTC) Subject: [Numpy-discussion] snow leopard and Numeric References: <8AD04A53-5C75-4159-B224-07B1BB85A212@yahoo.it> <3d375d730909012144l23d66258q459d5b0966815f45@mail.gmail.com> <3d375d730909012153o2736cee4r430f597dd37c5a3@mail.gmail.com> Message-ID: > > > > Is there a way to constrain an old-style compilation just to make a code > > work? I have similar problems with other old pieces of code. > > Use "-arch i686" in the CFLAGS and LDFLAGS. I think. > Unfortunately, it seems not to have any effect. I'll try something else. Thanks anyway. Stefano From romain.brette at ens.fr Wed Sep 2 08:26:31 2009 From: romain.brette at ens.fr (Romain Brette) Date: Wed, 02 Sep 2009 14:26:31 +0200 Subject: [Numpy-discussion] Fwd: GPU Numpy In-Reply-To: References: <7f1eaee30908061012x2bd69e6i1550787f10cd6aaf@mail.gmail.com> <7f1eaee30908061041l2cd76f64r96e7f5c7c16a2483@mail.gmail.com> <4A7B43CE.7050509@molden.no> Message-ID: Hi everyone, In case anyone is interested, I just set up a google group to discuss GPU-based simulation for our Python neural simulator Brian: http://groups.google.fr/group/brian-on-gpu Our simulator relies heavily Numpy. I would be very happy if the GPU experts here would like to share their expertise. Best, Romain Romain Brette a ?crit : > Sturla Molden a ?crit : >> Thus, here is my plan: >> >> 1. a special context-manager class >> 2. immutable arrays inside with statement >> 3. lazy evaluation: expressions build up a parse tree >> 4. dynamic code generation >> 5. evaluation on exit >> > > There seems to be some similarity with what we want to do to accelerate > our neural simulations (briansimulator.org), as described here: > http://brian.svn.sourceforge.net/viewvc/brian/trunk/dev/BEPs/BEP-9-Automatic%20code%20generation.txt?view=markup > (by the way BEP is "Brian Enhancement Proposal") > The speed-up factor we got in our experimental code with GPU is very > substantial when there are many neurons (= large vectors, e.g. 10 000 > elements), even when operations are simple. 
> > Romain From dagss at student.matnat.uio.no Wed Sep 2 08:53:20 2009 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Wed, 02 Sep 2009 14:53:20 +0200 Subject: [Numpy-discussion] A faster median (Wirth's method) In-Reply-To: <4A9D9432.20907@molden.no> References: <4A9C9DDA.9060503@molden.no> <4A9D7B5E.6040009@student.matnat.uio.no> <4A9D9432.20907@molden.no> Message-ID: <4A9E6AC0.1010902@student.matnat.uio.no> Sturla Molden wrote: > Dag Sverre Seljebotn skrev: > >> Nitpick: This will fail on large arrays. I guess numpy.npy_intp is the >> right type to use in this case? >> >> > By the way, here is a more polished version, does it look ok? > > http://projects.scipy.org/numpy/attachment/ticket/1213/generate_qselect.py > http://projects.scipy.org/numpy/attachment/ticket/1213/quickselect.pyx > I didn't look at the algorithm, but the types look OK (except for the gil as you say). Comments: a) Is the cast to numpy.npy_intp really needed? I'm pretty sure shape is defined as numpy.npy_intp*. b) If you want higher performance with contiguous arrays (which occur a lot as inplace=False is default I guess) you can do np.ndarray[T, ndim=1, mode="c"] to tell the compiler the array is contiguous. That doubles the number of function instances though... > Cython needs something like Java's generics by the way :-) > Yes, we all long for that. It will come as soon as somebody volunteers I suppose -- it shouldn't be all that difficult, but I don't think any of the existing devs will be up for it any time soon. Dag Sverre From gokhansever at gmail.com Wed Sep 2 10:38:22 2009 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_Sever?=) Date: Wed, 2 Sep 2009 09:38:22 -0500 Subject: [Numpy-discussion] Fastest way to parsing a specific binay file Message-ID: <49d6b3500909020738g53befd6ey4a6af8e269162510@mail.gmail.com> Hello, I want to be able to parse a binary file which hold information regarding to experiment configuration and data obviously. Both configuration and data sections are variable-length. A chuck this data is shown as below (after a binary read operation) '\x00\x00@\x00$\x00\x02\x00\x12\x00\xff\x00\x00\x00U\xaa\xfa\xffd\x00\x08\x00\x01\x00\x08\x00\xff\x00\x00\x00U\xaa\xfb\xffl\x00\xab\x00\x01\x00\xab\x00\xff\x00\x00\x00U\xaa\xe7\x03\x17\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00U\xaa\xd9\x07\x04\x00\x02\x00\r\x00\x06\x00\x03\x00\x00\x00\x01\x00\x00\x00\xd9\x07\x04\x00\x02\x00\r\x00\x06\x00\x03\x00\x00\x00\x01\x00\x00\x00prj.300\x00; Version = 1\n', 'ProjectName = PME1 2009 King Air N825ST\n', 'FlightId = \n', 'AircraftType = WMI King Air 200\n', 'AircraftId = N825ST\n', 'OperatorName = Weather Modification Inc.\n', 'Comments = \n', '\x00\x00@ In binary form the file is 1.3MB, and when written to a txt file it expands to 3.7MB totalling approximately 4 million characters. When fully processed (with an IDL code) it produces 86 seperate configuration files, and 46 ascii files for data, about 10-15 different instruments and in various combinations plus sampling rates. I attemted to use RE module, however the time it takes parse the file is really longer than I expected. What would be wisest and fastest way to tackle this issue? Upon successful re-construction of the data and metadata, I am planning to use a much modular structure like HDF5 or netCDF4 for an easy data storage and analyses. Thank you. -- G?khan -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From sturla at molden.no Wed Sep 2 10:54:34 2009 From: sturla at molden.no (Sturla Molden) Date: Wed, 02 Sep 2009 16:54:34 +0200 Subject: [Numpy-discussion] A faster median (Wirth's method) In-Reply-To: <4A9E6AC0.1010902@student.matnat.uio.no> References: <4A9C9DDA.9060503@molden.no> <4A9D7B5E.6040009@student.matnat.uio.no> <4A9D9432.20907@molden.no> <4A9E6AC0.1010902@student.matnat.uio.no> Message-ID: <4A9E872A.1010101@molden.no> Dag Sverre Seljebotn skrev: > a) Is the cast to numpy.npy_intp really needed? I'm pretty sure shape is > > defined as numpy.npy_intp*. I don't know Cython internals in detail but you do, I so take your word for it. I thought shape was a tuple of Python ints. > b) If you want higher performance with contiguous arrays (which occur a > lot as inplace=False is default I guess) you can do > > np.ndarray[T, ndim=1, mode="c"] > > to tell the compiler the array is contiguous. That doubles the number of > function instances though... Thanks. I could either double the number of specialized select functions, or I could make a local copy using numpy.ascontiguousarray in the select function. Quickselect touch the discontiguous array on average 2*n times, whereas numpy.ascontiguousarray touch the discontiguous array n times (but in orderly). Then there is the question of cache use: Contiguous arrays are the more friendly case, and numpy.ascontiguousarray is more friendly than quickselect. Also if quickselect is not done inplace (the common case for medians), we always have contigous arrays, so mode="c" is almost always wanted. And when quickselect is done inplace, we usually have a contiguous input. This is also why I used a C pointer instead of your buffer syntax in the first version. Then I changed my mind, not sure why. So I'll try with a local copy first then. I don't think we want close to a megabyte of Cython generated gibberish C just for the median. Sturla Molden From sturla at molden.no Wed Sep 2 11:00:10 2009 From: sturla at molden.no (Sturla Molden) Date: Wed, 02 Sep 2009 17:00:10 +0200 Subject: [Numpy-discussion] A faster median (Wirth's method) In-Reply-To: <3DA3B328CBC48B4EBB88484B8A5EA19106AF9E93@sernt14.essex.ac.uk> References: <4A9C9DDA.9060503@molden.no> <4A9D7B5E.6040009@student.matnat.uio.no> <4A9D9432.20907@molden.no> <4A9E03C9.70003@molden.no> <3DA3B328CBC48B4EBB88484B8A5EA19106AF9E93@sernt14.essex.ac.uk> Message-ID: <4A9E887A.2080105@molden.no> Citi, Luca skrev: > Hello Sturla, > In "_median" how can you, if n==2, use s[] if s is not defined? > What if n==1? > That was a typo. > Also, I think when returning an empty array, it should be of > the same type you would get in the other cases. Currently median returns numpy.nan for empty input arrays. I'll do that instead. I want it to behave exactly as the current implementation, except for the sorting. Sturla From sturla at molden.no Wed Sep 2 11:11:35 2009 From: sturla at molden.no (Sturla Molden) Date: Wed, 02 Sep 2009 17:11:35 +0200 Subject: [Numpy-discussion] How to concatenate two arrays without duplicating memory? In-Reply-To: <4A9E2181.70702@esrf.fr> References: <4A9E2181.70702@esrf.fr> Message-ID: <4A9E8B27.7090705@molden.no> V. Armando Sol? skrev: > I am looking for a way to have a non contiguous array C in which the > "left" (10000, 2000) elements point to A and the "right" (10000, 4000) > elements point to B. > > Any hint will be appreciated. 
If you know in advance that A and B are going to be duplicated, you can use views: C = np.zeros((10000, 6000)) A = C[:,:2000] B = C[:,2000:] Now C is A and B concatenated horizontally. If you can't to this, you could write the data to a temporary file and read it back, but it would be slow. Sturla From robert.kern at gmail.com Wed Sep 2 11:11:35 2009 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 2 Sep 2009 10:11:35 -0500 Subject: [Numpy-discussion] Fastest way to parsing a specific binay file In-Reply-To: <49d6b3500909020738g53befd6ey4a6af8e269162510@mail.gmail.com> References: <49d6b3500909020738g53befd6ey4a6af8e269162510@mail.gmail.com> Message-ID: <3d375d730909020811p5283b605sb2c5361b9019b8bd@mail.gmail.com> On Wed, Sep 2, 2009 at 09:38, G?khan Sever wrote: > Hello, > > I want to be able to parse a binary file which hold information regarding to > experiment configuration and data obviously. Both configuration and data > sections are variable-length. A chuck this data is shown as below (after a > binary read operation) > > '\x00\x00@\x00$\x00\x02\x00\x12\x00\xff\x00\x00\x00U\xaa\xfa\xffd\x00\x08\x00\x01\x00\x08\x00\xff\x00\x00\x00U\xaa\xfb\xffl\x00\xab\x00\x01\x00\xab\x00\xff\x00\x00\x00U\xaa\xe7\x03\x17\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00U\xaa\xd9\x07\x04\x00\x02\x00\r\x00\x06\x00\x03\x00\x00\x00\x01\x00\x00\x00\xd9\x07\x04\x00\x02\x00\r\x00\x06\x00\x03\x00\x00\x00\x01\x00\x00\x00prj.300\x00; > Version = 1\n', 'ProjectName = PME1 2009 King Air N825ST\n', 'FlightId = > \n', 'AircraftType = WMI King Air 200\n', 'AircraftId = N825ST\n', > 'OperatorName = Weather Modification Inc.\n', 'Comments = \n', '\x00\x00@ > > In binary form the file is 1.3MB, and when written to a txt file it expands > to 3.7MB totalling approximately 4 million characters. When fully processed > (with an IDL code) it produces 86 seperate configuration files, and 46 ascii > files for data, about 10-15 different instruments and in various > combinations plus sampling rates. > > I attemted to use RE module, however the time it takes parse the file is > really longer than I expected. What would be wisest and fastest way to > tackle this issue? Upon successful re-construction of the data and metadata, > I am planning to use a much modular structure like HDF5 or netCDF4 for an > easy data storage and analyses. Are there fixed delimiters? Like '\x00\x00@\x00' perhaps? It might be faster to search for those using .find() instead of regexes. Without more information about how the file format gets split up, I'm not sure we can make good suggestions. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From sturla at molden.no Wed Sep 2 11:23:23 2009 From: sturla at molden.no (Sturla Molden) Date: Wed, 02 Sep 2009 17:23:23 +0200 Subject: [Numpy-discussion] How to concatenate two arrayswithout duplicating memory? In-Reply-To: References: <4A9E2181.70702@esrf.fr> <20090902082435.GC27947@phare.normalesup.org> <3DA3B328CBC48B4EBB88484B8A5EA19106AF9E94@sernt14.essex.ac.uk> <4A9E3B5A.20405@esrf.fr> Message-ID: <4A9E8DEB.2030204@molden.no> Sebastian Haase skrev: > A mockarray is initialized with a list of nd-arrays. The result is a > mock array having one additional dimention "in front". This is important, because often in the case of 'concatenation' a real concatenation is not needed. 
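A toy example of mine (made-up arrays, not from Armando's post): if all you need is a
reduction or an elementwise pass over the two blocks, keep them in a Python sequence and
never build the big array at all:

import numpy as np

A = np.random.rand(10000, 2000)
B = np.random.rand(10000, 4000)

# hypothetical global reductions: no concatenated copy is ever made
vmax = max(part.max() for part in (A, B))
total = sum(part.sum() for part in (A, B))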
But then there is a common tool called Matlab, which unlike Python has no concept of lists and make numerical programmers think they do. C = [A, B] is a horizontal concatenation in Matlab. Too much exposure to Matlab cripples the mind easily. Sturla From sturla at molden.no Wed Sep 2 11:34:36 2009 From: sturla at molden.no (Sturla Molden) Date: Wed, 02 Sep 2009 17:34:36 +0200 Subject: [Numpy-discussion] Fastest way to parsing a specific binay file In-Reply-To: <49d6b3500909020738g53befd6ey4a6af8e269162510@mail.gmail.com> References: <49d6b3500909020738g53befd6ey4a6af8e269162510@mail.gmail.com> Message-ID: <4A9E908C.1070205@molden.no> G?khan Sever skrev: > What would be wisest and fastest way to tackle this issue? Get the format, read the binary data directly, skip the ascii/regex part. I sometimes use recarrays with formatted binary data; just constructing a dtype and use numpy.fromfile to read. That works when the binary file store C structs written successively. Sturla Molden From lciti at essex.ac.uk Wed Sep 2 12:11:08 2009 From: lciti at essex.ac.uk (Citi, Luca) Date: Wed, 2 Sep 2009 17:11:08 +0100 Subject: [Numpy-discussion] np.bitwise_and.identity Message-ID: <3DA3B328CBC48B4EBB88484B8A5EA19106AF9E96@sernt14.essex.ac.uk> Hello, I know I am splitting the hair, but should not np.bitwise_and.identity be -1 instead of 1? I mean, something with all the bits set? I am checking whether all elements of a vector 'v' have a certain bit 'b' set: if np.bitwise_and.reduce(v) & (1 << b): # do something If v is empty, the expression is true for b==0 and false otherwise. In fact np.bitwise_and.identity is 1. I like being able to use np.bitwise_and.reduce because it many times faster than (v & (1 << b)).all() (it does not create the temporary vector). Of course there are workarounds but I was wondering if there is a reason for this behaviour. Best, Luca From robert.kern at gmail.com Wed Sep 2 12:20:36 2009 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 2 Sep 2009 11:20:36 -0500 Subject: [Numpy-discussion] np.bitwise_and.identity In-Reply-To: <3DA3B328CBC48B4EBB88484B8A5EA19106AF9E96@sernt14.essex.ac.uk> References: <3DA3B328CBC48B4EBB88484B8A5EA19106AF9E96@sernt14.essex.ac.uk> Message-ID: <3d375d730909020920g12b880cch496362e409424f2@mail.gmail.com> On Wed, Sep 2, 2009 at 11:11, Citi, Luca wrote: > Hello, > I know I am splitting the hair, but should not > np.bitwise_and.identity be -1 instead of 1? > I mean, something with all the bits set? Probably. However, the .identity parts of ufuncs were designed mostly to support multiply and add, so .identity is restricted to 0, 1, or nothing currently. It will take some effort to change that. In the C code, the sentinel value for "no identity" is -1, alas. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From denis-bz-py at t-online.de Wed Sep 2 12:33:08 2009 From: denis-bz-py at t-online.de (denis bzowy) Date: Wed, 2 Sep 2009 16:33:08 +0000 (UTC) Subject: [Numpy-discussion] adaptive interpolation on a regular 2d grid References: <3d375d730908311552r399ed62ah4f482b14b4e89b88@mail.gmail.com> Message-ID: Robert Kern gmail.com> writes: > Looks good! Where can we get the code? Can this be specialized for 1D functions? Re code: sure, I'll be happy to post it if anyone points me to a real test case or two, to help me understand the envelope -- 100^2 -> 500^2 grid ? 
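To show what I mean by the envelope, the kind of driver I would run looks roughly like
this (a made-up smooth test function, with scipy's regular-grid spline as the baseline --
just a sketch, not my adaptive code):

import numpy as np
from scipy.interpolate import RectBivariateSpline

def f(x, y):   # made-up smooth test function, stands in for a real case
    return np.sin(3*x) * np.cos(2*y) + 0.5*x*y

x = np.linspace(0, 1, 100)
y = np.linspace(0, 1, 100)
spl = RectBivariateSpline(x, y, f(x[:, None], y[None, :]))   # fit on the 100^2 grid

xf = np.linspace(0, 1, 500)
yf = np.linspace(0, 1, 500)
err = np.abs(spl(xf, yf) - f(xf[:, None], yf[None, :]))      # check on the 500^2 grid
print err.max(), err.mean()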
(Splines on regular grids are fast and robust, hard to beat.) Re 1d: I have an old version using 2 point - 2 slope splines, overkill, will trim it. (Is there a sandbox or wiki of interpolation testcases, not images ?) From robert.kern at gmail.com Wed Sep 2 12:38:27 2009 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 2 Sep 2009 11:38:27 -0500 Subject: [Numpy-discussion] adaptive interpolation on a regular 2d grid In-Reply-To: References: <3d375d730908311552r399ed62ah4f482b14b4e89b88@mail.gmail.com> Message-ID: <3d375d730909020938p6270876ka26aa01b708a1cc4@mail.gmail.com> On Wed, Sep 2, 2009 at 11:33, denis bzowy wrote: > Robert Kern gmail.com> writes: > >> Looks good! Where can we get the code? Can this be specialized for 1D > functions? > > > > Re code: sure, I'll be happy to post it if anyone points me to a real test > case or two, to help me understand the envelope -- 100^2 -> 500^2 grid ? > (Splines on regular grids are fast and robust, hard to beat.) > > Re 1d: I have an old version using 2 point - 2 slope splines, overkill, > will trim it. > > (Is there a sandbox or wiki of interpolation testcases, not images ?) I have some test cases here: http://svn.scipy.org/svn/scikits/trunk/delaunay/scikits/delaunay/testfuncs.py They are meant to test scattered data interpolation. They aren't going to exercise your adaptive interpolation very hard. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From seb.haase at gmail.com Wed Sep 2 12:48:05 2009 From: seb.haase at gmail.com (Sebastian Haase) Date: Wed, 2 Sep 2009 18:48:05 +0200 Subject: [Numpy-discussion] How to concatenate two arrayswithout duplicating memory? In-Reply-To: <4A9E8DEB.2030204@molden.no> References: <4A9E2181.70702@esrf.fr> <20090902082435.GC27947@phare.normalesup.org> <3DA3B328CBC48B4EBB88484B8A5EA19106AF9E94@sernt14.essex.ac.uk> <4A9E3B5A.20405@esrf.fr> <4A9E8DEB.2030204@molden.no> Message-ID: I forgot to mention I also support transpose. -S. On Wed, Sep 2, 2009 at 5:23 PM, Sturla Molden wrote: > Sebastian Haase skrev: >> A mockarray is initialized with a list of nd-arrays. The result is a >> mock array having one additional dimention "in front". > This is important, because often in the case of ?'concatenation' a real > concatenation is not needed. But then there is a common tool called > Matlab, which unlike Python has no concept of lists and make numerical > programmers think they do. C = [A, B] is a horizontal concatenation in > Matlab. Too much exposure to Matlab cripples the mind easily. 
> > Sturla > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From gokhansever at gmail.com Wed Sep 2 12:52:35 2009 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_Sever?=) Date: Wed, 2 Sep 2009 11:52:35 -0500 Subject: [Numpy-discussion] Fastest way to parsing a specific binay file In-Reply-To: <3d375d730909020811p5283b605sb2c5361b9019b8bd@mail.gmail.com> References: <49d6b3500909020738g53befd6ey4a6af8e269162510@mail.gmail.com> <3d375d730909020811p5283b605sb2c5361b9019b8bd@mail.gmail.com> Message-ID: <49d6b3500909020952l1de2ec0fo98ac4631855e110e@mail.gmail.com> On Wed, Sep 2, 2009 at 10:11 AM, Robert Kern wrote: > On Wed, Sep 2, 2009 at 09:38, G?khan Sever wrote: > > Hello, > > > > I want to be able to parse a binary file which hold information regarding > to > > experiment configuration and data obviously. Both configuration and data > > sections are variable-length. A chuck this data is shown as below (after > a > > binary read operation) > > > > '\x00\x00@ > \x00$\x00\x02\x00\x12\x00\xff\x00\x00\x00U\xaa\xfa\xffd\x00\x08\x00\x01\x00\x08\x00\xff\x00\x00\x00U\xaa\xfb\xffl\x00\xab\x00\x01\x00\xab\x00\xff\x00\x00\x00U\xaa\xe7\x03\x17\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00U\xaa\xd9\x07\x04\x00\x02\x00\r\x00\x06\x00\x03\x00\x00\x00\x01\x00\x00\x00\xd9\x07\x04\x00\x02\x00\r\x00\x06\x00\x03\x00\x00\x00\x01\x00\x00\x00prj.300\x00; > > Version = 1\n', 'ProjectName = PME1 2009 King Air N825ST\n', 'FlightId = > > \n', 'AircraftType = WMI King Air 200\n', 'AircraftId = N825ST\n', > > 'OperatorName = Weather Modification Inc.\n', 'Comments = \n', '\x00\x00@ > > > > In binary form the file is 1.3MB, and when written to a txt file it > expands > > to 3.7MB totalling approximately 4 million characters. When fully > processed > > (with an IDL code) it produces 86 seperate configuration files, and 46 > ascii > > files for data, about 10-15 different instruments and in various > > combinations plus sampling rates. > > > > I attemted to use RE module, however the time it takes parse the file is > > really longer than I expected. What would be wisest and fastest way to > > tackle this issue? Upon successful re-construction of the data and > metadata, > > I am planning to use a much modular structure like HDF5 or netCDF4 for an > > easy data storage and analyses. > > Are there fixed delimiters? Like '\x00\x00@\x00' perhaps? It might be > faster to search for those using .find() instead of regexes. > > Without more information about how the file format gets split up, I'm > not sure we can make good suggestions. > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > -- Umberto Eco > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > Fixed delims... That is what I used to parse metadata with a regex. Something like: r = re.compile("\0;.+?\0\0@\0\$", re.DOTALL) which extracts to portions that I am interested. However I have yet to figure parsing separate data streams. Couldn't find a way find to see which data blocks goes with which device. I put the test binary file I am using at: http://drop.io/1plh5rt -- G?khan -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From gokhansever at gmail.com Wed Sep 2 12:53:52 2009 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_Sever?=) Date: Wed, 2 Sep 2009 11:53:52 -0500 Subject: [Numpy-discussion] Fastest way to parsing a specific binay file In-Reply-To: <4A9E908C.1070205@molden.no> References: <49d6b3500909020738g53befd6ey4a6af8e269162510@mail.gmail.com> <4A9E908C.1070205@molden.no> Message-ID: <49d6b3500909020953v262b832ajaac5fdba02f8fb05@mail.gmail.com> On Wed, Sep 2, 2009 at 10:34 AM, Sturla Molden wrote: > G?khan Sever skrev: > > What would be wisest and fastest way to tackle this issue? > Get the format, read the binary data directly, skip the ascii/regex part. > > I sometimes use recarrays with formatted binary data; just constructing > a dtype and use numpy.fromfile to read. That works when the binary file > store C structs written successively. > > Sturla Molden > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > How to use recarrays with variable-length data fields as well as metadata? Eventually I will record the data with numpy arrays but not sure how to utilize recarrays in the first stage. -- G?khan -------------- next part -------------- An HTML attachment was scrubbed... URL: From robertwb at math.washington.edu Wed Sep 2 12:57:05 2009 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Wed, 2 Sep 2009 09:57:05 -0700 (PDT) Subject: [Numpy-discussion] A faster median (Wirth's method) In-Reply-To: <4A9E6AC0.1010902@student.matnat.uio.no> References: <4A9C9DDA.9060503@molden.no> <4A9D7B5E.6040009@student.matnat.uio.no> <4A9D9432.20907@molden.no> <4A9E6AC0.1010902@student.matnat.uio.no> Message-ID: On Wed, 2 Sep 2009, Dag Sverre Seljebotn wrote: > Sturla Molden wrote: >> Dag Sverre Seljebotn skrev: >> >>> Nitpick: This will fail on large arrays. I guess numpy.npy_intp is the >>> right type to use in this case? >>> >>> >> By the way, here is a more polished version, does it look ok? >> >> http://projects.scipy.org/numpy/attachment/ticket/1213/generate_qselect.py >> http://projects.scipy.org/numpy/attachment/ticket/1213/quickselect.pyx >> > I didn't look at the algorithm, but the types look OK (except for the > gil as you say). Comments: > > a) Is the cast to numpy.npy_intp really needed? I'm pretty sure shape is > defined as numpy.npy_intp*. > b) If you want higher performance with contiguous arrays (which occur a > lot as inplace=False is default I guess) you can do > > np.ndarray[T, ndim=1, mode="c"] > > to tell the compiler the array is contiguous. That doubles the number of > function instances though... > > >> Cython needs something like Java's generics by the way :-) >> > Yes, we all long for that. It will come as soon as somebody volunteers I > suppose -- it shouldn't be all that difficult, but I don't think any of > the existing devs will be up for it any time soon. Danilo's C++ project has some baby steps in that direction, though it'll need to be expanded quite a bit to handle this. 
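Until then the practical route is the one Sturla's generate_qselect.py already takes:
generate the specializations from a template. A toy generator (hypothetical, just to show
the shape of it -- the file name and dtype list are arbitrary) is only a few lines:

TEMPLATE = '''
def select_%(name)s(np.ndarray[np.%(name)s_t, ndim=1] a, int k):
    pass  # identical body for every dtype goes here
'''

def generate(names=('int32', 'int64', 'float32', 'float64')):
    # emit one specialized function per dtype name
    return "\n".join(TEMPLATE % {'name': n} for n in names)

open('quickselect_gen.pyx', 'w').write(generate())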
- Robert From robert.kern at gmail.com Wed Sep 2 13:04:46 2009 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 2 Sep 2009 12:04:46 -0500 Subject: [Numpy-discussion] Fastest way to parsing a specific binay file In-Reply-To: <49d6b3500909020953v262b832ajaac5fdba02f8fb05@mail.gmail.com> References: <49d6b3500909020738g53befd6ey4a6af8e269162510@mail.gmail.com> <4A9E908C.1070205@molden.no> <49d6b3500909020953v262b832ajaac5fdba02f8fb05@mail.gmail.com> Message-ID: <3d375d730909021004r3787accgd239b5a157a8e563@mail.gmail.com> On Wed, Sep 2, 2009 at 11:53, G?khan Sever wrote: > How to use recarrays with variable-length data fields as well as metadata? You don't. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From lciti at essex.ac.uk Wed Sep 2 13:01:52 2009 From: lciti at essex.ac.uk (Citi, Luca) Date: Wed, 2 Sep 2009 18:01:52 +0100 Subject: [Numpy-discussion] Fastest way to parsing a specific binay file References: <49d6b3500909020738g53befd6ey4a6af8e269162510@mail.gmail.com><4A9E908C.1070205@molden.no> <49d6b3500909020953v262b832ajaac5fdba02f8fb05@mail.gmail.com> Message-ID: <3DA3B328CBC48B4EBB88484B8A5EA19106AF9E98@sernt14.essex.ac.uk> If I understand the problem... if you are 100% sure that "', '" only occurs between fields and never within, you can use the 'split' method of the string which could be faster than regexp in this simple case. From lciti at essex.ac.uk Wed Sep 2 13:04:27 2009 From: lciti at essex.ac.uk (Citi, Luca) Date: Wed, 2 Sep 2009 18:04:27 +0100 Subject: [Numpy-discussion] np.bitwise_and.identity References: <3DA3B328CBC48B4EBB88484B8A5EA19106AF9E96@sernt14.essex.ac.uk> <3d375d730909020920g12b880cch496362e409424f2@mail.gmail.com> Message-ID: <3DA3B328CBC48B4EBB88484B8A5EA19106AF9E99@sernt14.essex.ac.uk> Thank you, Robert, for the quick reply. I just saw the line #define PyUFunc_None -1 in the ufuncobject.h file. It is always the same, you choose a sentinel thinking that it doesn't conflict with any possible value and you later find there is one such case. As said it is not a big deal. I wouldn't spend time on it. Best, Luca From gokhansever at gmail.com Wed Sep 2 13:27:45 2009 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_Sever?=) Date: Wed, 2 Sep 2009 12:27:45 -0500 Subject: [Numpy-discussion] Fastest way to parsing a specific binay file In-Reply-To: <3DA3B328CBC48B4EBB88484B8A5EA19106AF9E98@sernt14.essex.ac.uk> References: <49d6b3500909020738g53befd6ey4a6af8e269162510@mail.gmail.com> <4A9E908C.1070205@molden.no> <49d6b3500909020953v262b832ajaac5fdba02f8fb05@mail.gmail.com> <3DA3B328CBC48B4EBB88484B8A5EA19106AF9E98@sernt14.essex.ac.uk> Message-ID: <49d6b3500909021027w29159135tbb5bfe47a48d26c@mail.gmail.com> On Wed, Sep 2, 2009 at 12:01 PM, Citi, Luca wrote: > If I understand the problem... > if you are 100% sure that "', '" only occurs between fields > and never within, you can use the 'split' method of the string > which could be faster than regexp in this simple case. > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > But it is not possible to extract a pattern such as within a field. A construct like in regex starting with a ; till the end of the section. ?? -- G?khan -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From gokhansever at gmail.com Wed Sep 2 13:33:14 2009 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_Sever?=) Date: Wed, 2 Sep 2009 12:33:14 -0500 Subject: [Numpy-discussion] Fastest way to parsing a specific binay file In-Reply-To: <3d375d730909021004r3787accgd239b5a157a8e563@mail.gmail.com> References: <49d6b3500909020738g53befd6ey4a6af8e269162510@mail.gmail.com> <4A9E908C.1070205@molden.no> <49d6b3500909020953v262b832ajaac5fdba02f8fb05@mail.gmail.com> <3d375d730909021004r3787accgd239b5a157a8e563@mail.gmail.com> Message-ID: <49d6b3500909021033r7643158bnbf0db914aa0db03a@mail.gmail.com> On Wed, Sep 2, 2009 at 12:04 PM, Robert Kern wrote: > On Wed, Sep 2, 2009 at 11:53, G?khan Sever wrote: > > > How to use recarrays with variable-length data fields as well as > metadata? > > You don't. > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > -- Umberto Eco > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > I was just confirming my guess :) The data in the binary file was written in a variable-length fashion. Although each chuck has a specific starting indication like \x00\x00\@\x00$\x00\x02 the amount of the in each section varies depends on what was in the written stream. How your find suggestion work? It just returns the location of the first occurrence. -- G?khan -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Wed Sep 2 13:29:42 2009 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 2 Sep 2009 12:29:42 -0500 Subject: [Numpy-discussion] Fastest way to parsing a specific binay file In-Reply-To: <49d6b3500909021027w29159135tbb5bfe47a48d26c@mail.gmail.com> References: <49d6b3500909020738g53befd6ey4a6af8e269162510@mail.gmail.com> <4A9E908C.1070205@molden.no> <49d6b3500909020953v262b832ajaac5fdba02f8fb05@mail.gmail.com> <3DA3B328CBC48B4EBB88484B8A5EA19106AF9E98@sernt14.essex.ac.uk> <49d6b3500909021027w29159135tbb5bfe47a48d26c@mail.gmail.com> Message-ID: <3d375d730909021029l9a3be85wb850421217da5ef9@mail.gmail.com> On Wed, Sep 2, 2009 at 12:27, G?khan Sever wrote: > > On Wed, Sep 2, 2009 at 12:01 PM, Citi, Luca wrote: >> >> If I understand the problem... >> if you are 100% sure that "', '" only occurs between fields >> and never within, you can use the 'split' method of the string >> which could be faster than regexp in this simple case. >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > But it is not possible to extract a pattern such as within a field. A > construct like in regex starting with a ; till the end of the section. ?? I can't parse that sentence. Can you describe the format in a little more detail? Or point to documentation of the format? Or the IDL code that parses it? -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From robert.kern at gmail.com Wed Sep 2 13:46:16 2009 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 2 Sep 2009 12:46:16 -0500 Subject: [Numpy-discussion] Fastest way to parsing a specific binay file In-Reply-To: <49d6b3500909021033r7643158bnbf0db914aa0db03a@mail.gmail.com> References: <49d6b3500909020738g53befd6ey4a6af8e269162510@mail.gmail.com> <4A9E908C.1070205@molden.no> <49d6b3500909020953v262b832ajaac5fdba02f8fb05@mail.gmail.com> <3d375d730909021004r3787accgd239b5a157a8e563@mail.gmail.com> <49d6b3500909021033r7643158bnbf0db914aa0db03a@mail.gmail.com> Message-ID: <3d375d730909021046k276a83b9hcccb346eea7eb4ba@mail.gmail.com> On Wed, Sep 2, 2009 at 12:33, G?khan Sever wrote: > How your find suggestion work? It just returns the location of the first > occurrence. http://docs.python.org/library/stdtypes.html#str.find str.find(sub[, start[, end]]) Return the lowest index in the string where substring sub is found, such that sub is contained in the range [start, end]. Optional arguments start and end are interpreted as in slice notation. Return -1 if sub is not found. But perhaps you should profile your code to see where it is actually taking up the time. Regexes on 1.3 MB of data should be quite fast. In [21]: marker = '\x00\x00\@\x00$\x00\x02' In [22]: block = marker + '\xde\xca\xfb\xad' * ((1024-8) // 4) In [23]: data = int(round(1.3 * 1024)) * block In [24]: import re In [25]: r = re.compile(re.escape(marker)) In [26]: %time r.findall(data) CPU times: user 0.01 s, sys: 0.00 s, total: 0.01 s Wall time: 0.01 s -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From gokhansever at gmail.com Wed Sep 2 14:28:07 2009 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_Sever?=) Date: Wed, 2 Sep 2009 13:28:07 -0500 Subject: [Numpy-discussion] Fastest way to parsing a specific binay file In-Reply-To: <3d375d730909021029l9a3be85wb850421217da5ef9@mail.gmail.com> References: <49d6b3500909020738g53befd6ey4a6af8e269162510@mail.gmail.com> <4A9E908C.1070205@molden.no> <49d6b3500909020953v262b832ajaac5fdba02f8fb05@mail.gmail.com> <3DA3B328CBC48B4EBB88484B8A5EA19106AF9E98@sernt14.essex.ac.uk> <49d6b3500909021027w29159135tbb5bfe47a48d26c@mail.gmail.com> <3d375d730909021029l9a3be85wb850421217da5ef9@mail.gmail.com> Message-ID: <49d6b3500909021128sfe82fd9iaf95886b6263340a@mail.gmail.com> On Wed, Sep 2, 2009 at 12:29 PM, Robert Kern wrote: > On Wed, Sep 2, 2009 at 12:27, G?khan Sever wrote: > > > > On Wed, Sep 2, 2009 at 12:01 PM, Citi, Luca wrote: > >> > >> If I understand the problem... > >> if you are 100% sure that "', '" only occurs between fields > >> and never within, you can use the 'split' method of the string > >> which could be faster than regexp in this simple case. > >> _______________________________________________ > >> NumPy-Discussion mailing list > >> NumPy-Discussion at scipy.org > >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > But it is not possible to extract a pattern such as within a field. A > > construct like in regex starting with a ; till the end of the section. ?? > > I can't parse that sentence. Can you describe the format in a little > more detail? Or point to documentation of the format? Or the IDL code > that parses it? 
> > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > -- Umberto Eco > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > Put the reference manual in: http://drop.io/1plh5rt First few pages describe the data format they use. -- G?khan -------------- next part -------------- An HTML attachment was scrubbed... URL: From gokhansever at gmail.com Wed Sep 2 14:30:46 2009 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_Sever?=) Date: Wed, 2 Sep 2009 13:30:46 -0500 Subject: [Numpy-discussion] Fastest way to parsing a specific binay file In-Reply-To: <3d375d730909021029l9a3be85wb850421217da5ef9@mail.gmail.com> References: <49d6b3500909020738g53befd6ey4a6af8e269162510@mail.gmail.com> <4A9E908C.1070205@molden.no> <49d6b3500909020953v262b832ajaac5fdba02f8fb05@mail.gmail.com> <3DA3B328CBC48B4EBB88484B8A5EA19106AF9E98@sernt14.essex.ac.uk> <49d6b3500909021027w29159135tbb5bfe47a48d26c@mail.gmail.com> <3d375d730909021029l9a3be85wb850421217da5ef9@mail.gmail.com> Message-ID: <49d6b3500909021130k701fbe98ga1b5757d472fb015@mail.gmail.com> On Wed, Sep 2, 2009 at 12:29 PM, Robert Kern wrote: > On Wed, Sep 2, 2009 at 12:27, G?khan Sever wrote: > > > > On Wed, Sep 2, 2009 at 12:01 PM, Citi, Luca wrote: > >> > >> If I understand the problem... > >> if you are 100% sure that "', '" only occurs between fields > >> and never within, you can use the 'split' method of the string > >> which could be faster than regexp in this simple case. > >> _______________________________________________ > >> NumPy-Discussion mailing list > >> NumPy-Discussion at scipy.org > >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > But it is not possible to extract a pattern such as within a field. A > > construct like in regex starting with a ; till the end of the section. ?? > > I can't parse that sentence. Can you describe the format in a little > more detail? Or point to documentation of the format? Or the IDL code > that parses it? > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > -- Umberto Eco > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > IDL processing code is on: http://adpaa.svn.sourceforge.net/viewvc/adpaa/trunk/src/Level1/process_raw/ A part of ADPAA - Aircraft Data Processing and Analysis project. -- G?khan -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From gokhansever at gmail.com Wed Sep 2 14:42:06 2009 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_Sever?=) Date: Wed, 2 Sep 2009 13:42:06 -0500 Subject: [Numpy-discussion] Fastest way to parsing a specific binay file In-Reply-To: <3d375d730909021046k276a83b9hcccb346eea7eb4ba@mail.gmail.com> References: <49d6b3500909020738g53befd6ey4a6af8e269162510@mail.gmail.com> <4A9E908C.1070205@molden.no> <49d6b3500909020953v262b832ajaac5fdba02f8fb05@mail.gmail.com> <3d375d730909021004r3787accgd239b5a157a8e563@mail.gmail.com> <49d6b3500909021033r7643158bnbf0db914aa0db03a@mail.gmail.com> <3d375d730909021046k276a83b9hcccb346eea7eb4ba@mail.gmail.com> Message-ID: <49d6b3500909021142o7f7f858l7992c4446f5950cd@mail.gmail.com> On Wed, Sep 2, 2009 at 12:46 PM, Robert Kern wrote: > On Wed, Sep 2, 2009 at 12:33, G?khan Sever wrote: > > How your find suggestion work? It just returns the location of the first > > occurrence. > > http://docs.python.org/library/stdtypes.html#str.find > > str.find(sub[, start[, end]]) > Return the lowest index in the string where substring sub is > found, such that sub is contained in the range [start, end]. Optional > arguments start and end are interpreted as in slice notation. Return > -1 if sub is not found. > > But perhaps you should profile your code to see where it is actually > taking up the time. Regexes on 1.3 MB of data should be quite fast. > > In [21]: marker = '\x00\x00\@\x00$\x00\x02' > > In [22]: block = marker + '\xde\xca\xfb\xad' * ((1024-8) // 4) > > In [23]: data = int(round(1.3 * 1024)) * block > > In [24]: import re > > In [25]: r = re.compile(re.escape(marker)) > > In [26]: %time r.findall(data) > CPU times: user 0.01 s, sys: 0.00 s, total: 0.01 s > Wall time: 0.01 s > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > -- Umberto Eco > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > This is what I have been using. It's not returning exactly what I want but very close besides its being slow: I[52]: mypattern = re.compile('\0\0\1\0.+?\0\0@\0\$', re.DOTALL) I[53]: res = mypattern.findall(ss) I[54]: len res -----> len(res) O[54]: 95 I[55]: %time mypattern.findall(ss); CPU times: user 9.14 s, sys: 0.00 s, total: 9.14 s Wall time: 9.16 s I[57]: res[0] O[57]: '\x00\x00\x01\x00\x00\x00\xd9\x07\x04\x00\x02\x00\r\x00\x06\x00\x03\x00\x00\x00\x01\x00\x00\x00 *prj.300*\x00; Version = 1\nProjectName = PME1 2009 King Air N825ST\nFlightId = \nAircraftType = WMI King Air 200\nAircraftId = N825ST\nOperatorName = Weather Modification Inc.\nComments = \n\x00\x00@ \x00$' I need the part starting with the bold typed section (prj.300) and till the end of the section. I need the bold part because I can construct file names from that and write the following content in it. Ohh when it works the resulting search should return me 86 occurrence. -- G?khan -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From robert.kern at gmail.com Wed Sep 2 14:58:06 2009 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 2 Sep 2009 13:58:06 -0500 Subject: [Numpy-discussion] Fastest way to parsing a specific binay file In-Reply-To: <49d6b3500909021128sfe82fd9iaf95886b6263340a@mail.gmail.com> References: <49d6b3500909020738g53befd6ey4a6af8e269162510@mail.gmail.com> <4A9E908C.1070205@molden.no> <49d6b3500909020953v262b832ajaac5fdba02f8fb05@mail.gmail.com> <3DA3B328CBC48B4EBB88484B8A5EA19106AF9E98@sernt14.essex.ac.uk> <49d6b3500909021027w29159135tbb5bfe47a48d26c@mail.gmail.com> <3d375d730909021029l9a3be85wb850421217da5ef9@mail.gmail.com> <49d6b3500909021128sfe82fd9iaf95886b6263340a@mail.gmail.com> Message-ID: <3d375d730909021158i79ec0551h3eca28878a0d405c@mail.gmail.com> On Wed, Sep 2, 2009 at 13:28, G?khan Sever wrote: > Put the reference manual in: > > http://drop.io/1plh5rt > > First few pages describe the data format they use. Ah. The fields are *not* delimited by a fixed value. Regexes are no help to you for pulling out the information you need, except perhaps later to parse the text fields. I think you are also getting spurious results because your regex matches things inside data fields. Instead, you have a header containing the length of the data field followed by the data field. Create a structured dtype that corresponds to the DataDir struct on page 15. Note that "unsigned int" there is actually a numpy.uint16, not a uint32. dt = np.dtype([('tagNumber', np.uint16), ('dataOffset', np.uint16), ('numberBytes', np.uint16), ('samples', np.uint16), ('bytesPerSample', np.uint16), ('type', np.uint8), ('param1', np.uint8), ('param2', np.uint8), ('param3', np.uint8), ('address', np.uint16)]) Now read dt.itemsize bytes from the file and use header = fromstring(f.read(dt.itemsize), dt)[0] to get a record object that corresponds to the header. Use the dataOffset and numberBytes fields to extract the actual data bytes from the file. For example, if we go to the second header field: In [28]: f.seek(dt.itemsize,0) In [29]: header = np.fromstring(f.read(dt.itemsize), dt)[0] In [30]: header Out[30]: (65530, 100, 8, 1, 8, 255, 0, 0, 0, 43605) In [31]: f.seek(header['dataOffset'], 0) In [32]: f.read(header['numberBytes']) Out[32]: 'prj.300\x00' There are still some semantic issues you need to work out, still. There are multiple "buffers" per file, and the dataOffsets are relative to the start of the buffer, not the file. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From chad.netzer at gmail.com Wed Sep 2 15:25:12 2009 From: chad.netzer at gmail.com (Chad Netzer) Date: Wed, 2 Sep 2009 12:25:12 -0700 Subject: [Numpy-discussion] A faster median (Wirth's method) In-Reply-To: <4A9C9DDA.9060503@molden.no> References: <4A9C9DDA.9060503@molden.no> Message-ID: On Mon, Aug 31, 2009 at 9:06 PM, Sturla Molden wrote: > > We recently has a discussion regarding an optimization of NumPy's median > to average O(n) complexity. After some searching, I found out there is a > selection algorithm competitive in speed with Hoare's quick select. It > has the advantage of being a lot simpler to implement. In plain Python: > Chad, you can continue to write quick select using NumPy's C quick sort > in numpy/core/src/_sortmodule.c.src. ?When you are done, it might be > about 10% faster than this. 
:-) I was sick for a bit last week, so got stalled on my version, but I'll be working on it this weekend. I'm going for a more general partition function, that could have slightly more general use cases than just a median. Nevertheless, its good to see there could be several options, hopefully at least one of which can be put into numpy. By the way, as far as I can tell, the above algorithm is exactly the same idea as a non-recursive Hoare (ie. quicksort) selection: Do the partition, then only proceed to the sub-partition that must contain the nth element. My version is a bit more general, allowing partitioning on a range of elements rather than just one, but the concept is the same. The numpy quicksort already does non recursive sorting. I'd also like to, if possible, have a specialized 2D version, since image media filtering is one of my interests, and the C version works on 1D (raveled) arrays only. -C From jeremy.mayes at gmail.com Wed Sep 2 18:23:46 2009 From: jeremy.mayes at gmail.com (Jeremy Mayes) Date: Wed, 2 Sep 2009 17:23:46 -0500 Subject: [Numpy-discussion] numpy core dump on linux Message-ID: <890c2bf00909021523q3d0f7cdboccbcb7ca938514f7@mail.gmail.com> This one line causes python to core dump on linux. numpy.lexsort([ numpy.array(['-','-','-','-','-','-','-','-','-','-','-','-','-'])[::-1],numpy.array([732685., 732685., 732685., 732685., 732685., 732685.,732685., 732685., 732685., 732685., 732685., 732685., 732679.])[::-1]]) Here's some version info: python 2.5.4 numpy 1.3.0 error is *** glibc detected *** free(): invalid next size (fast): 0x0000000000526be0 *** Any ideas? -- --jlm -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Wed Sep 2 18:37:00 2009 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 2 Sep 2009 17:37:00 -0500 Subject: [Numpy-discussion] numpy core dump on linux In-Reply-To: <890c2bf00909021523q3d0f7cdboccbcb7ca938514f7@mail.gmail.com> References: <890c2bf00909021523q3d0f7cdboccbcb7ca938514f7@mail.gmail.com> Message-ID: <3d375d730909021537w7ffc2ef7ld7073aedfb536ac4@mail.gmail.com> On Wed, Sep 2, 2009 at 17:23, Jeremy Mayes wrote: > This one line causes python to core dump on linux. > numpy.lexsort([ > numpy.array(['-','-','-','-','-','-','-','-','-','-','-','-','-'])[::-1],numpy.array([732685., > 732685.,? 732685.,? 732685.,? 732685.,? 732685.,732685.,? 732685., > 732685.,? 732685.,? 732685.,? 732685.,? 732679.])[::-1]]) > > Here's some version info: > > python 2.5.4 > numpy 1.3.0 > > error is > *** glibc detected *** free(): invalid next size (fast): 0x0000000000526be0 > *** > > Any ideas? Huh. The line executes for me on OS X, but the interpreter crashes when exiting. Here is my backtrace: Thread 0 Crashed: 0 org.python.python 0x00270760 collect + 288 1 org.python.python 0x002712ea PyGC_Collect + 42 2 org.python.python 0x00260390 Py_Finalize + 208 3 org.python.python 0x0026f750 Py_Main + 2768 4 org.python.python 0x00001f82 0x1000 + 3970 5 org.python.python 0x00001ea9 0x1000 + 3753 Can you show us a gdb backtrace on your machine? -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From celil.rufat at gmail.com Wed Sep 2 18:47:47 2009 From: celil.rufat at gmail.com (Celil Rufat) Date: Wed, 2 Sep 2009 15:47:47 -0700 Subject: [Numpy-discussion] numpy on Snow Leopard Message-ID: <97056b660909021547m658863b1ode828cce916f67f5@mail.gmail.com> I am unable to build numpy on Snow Leopard. The error that I am getting is shown below. It is a linking issue related to the change in the the default behavior of gcc under Snow Leopard. Before it used to compile for the 32 bit i386 architecture, now the default is the 64 bit x86_64 architecture. Has anybody successfully compiled numpy for MACOSX 10.6. If so I would appreciate if you can tell me how you fixed this issue. Regards, Celil ... C compiler: gcc -arch ppc -arch i386 -isysroot /Developer/SDKs/MacOSX10.4u.sdk -fno-strict-aliasing -fno-common -dynamic -DNDEBUG -g -O3 ... gcc: _configtest.c _configtest.c:1: warning: conflicting types for built-in function ?exp? _configtest.c:1: warning: conflicting types for built-in function ?exp? gcc _configtest.o -o _configtest ld: warning: in _configtest.o, missing required architecture x86_64 in file Undefined symbols: "_main", referenced from: start in crt1.10.6.o ld: symbol(s) not found collect2: ld returned 1 exit status ld: warning: in _configtest.o, missing required architecture x86_64 in file Undefined symbols: "_main", referenced from: start in crt1.10.6.o ld: symbol(s) not found collect2: ld returned 1 exit status failure. -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Wed Sep 2 19:14:17 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 2 Sep 2009 17:14:17 -0600 Subject: [Numpy-discussion] numpy core dump on linux In-Reply-To: <3d375d730909021537w7ffc2ef7ld7073aedfb536ac4@mail.gmail.com> References: <890c2bf00909021523q3d0f7cdboccbcb7ca938514f7@mail.gmail.com> <3d375d730909021537w7ffc2ef7ld7073aedfb536ac4@mail.gmail.com> Message-ID: On Wed, Sep 2, 2009 at 4:37 PM, Robert Kern wrote: > On Wed, Sep 2, 2009 at 17:23, Jeremy Mayes wrote: > > This one line causes python to core dump on linux. > > numpy.lexsort([ > > > numpy.array(['-','-','-','-','-','-','-','-','-','-','-','-','-'])[::-1],numpy.array([732685., > > 732685., 732685., 732685., 732685., 732685.,732685., 732685., > > 732685., 732685., 732685., 732685., 732679.])[::-1]]) > > > > Here's some version info: > > > > python 2.5.4 > > numpy 1.3.0 > > > > error is > > *** glibc detected *** free(): invalid next size (fast): > 0x0000000000526be0 > > *** > > > > Any ideas? > > Huh. The line executes for me on OS X, but the interpreter crashes > when exiting. Here is my backtrace: > > > Thread 0 Crashed: > 0 org.python.python 0x00270760 collect + 288 > 1 org.python.python 0x002712ea PyGC_Collect + 42 > 2 org.python.python 0x00260390 Py_Finalize + 208 > 3 org.python.python 0x0026f750 Py_Main + 2768 > 4 org.python.python 0x00001f82 0x1000 + 3970 > 5 org.python.python 0x00001ea9 0x1000 + 3753 > > > Can you show us a gdb backtrace on your machine? > > It's the [::-1] what done it. I suspect a copy is being made and has a bug. 
In [1]: a = np.array(['-']*100) In [2]: b = np.array([1.0]*100) In [3]: i = lexsort((a,b)) In [4]: i = lexsort((a[::-1])) In [5]: i = lexsort((b[::-1])) In [6]: i = lexsort((a,b[::-1])) In [7]: i = lexsort((a[::-1],b)) *Crash* These also work: In [3]: i = lexsort((b[::-1],a)) In [4]: i = lexsort((b[::-1],b[::-1])) In [5]: i = lexsort((a[::-1],a[::-1])) In [6]: i = lexsort((a,b[::-1])) So it seems to be the combination of reversed string a with an array of different type. Looks like a type setting is getting skipped somewhere. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From timmichelsen at gmx-topmail.de Wed Sep 2 19:15:28 2009 From: timmichelsen at gmx-topmail.de (Tim Michelsen) Date: Thu, 03 Sep 2009 01:15:28 +0200 Subject: [Numpy-discussion] help creating a reversed cumulative histogram Message-ID: Hello fellow numy users, I posted some questions on histograms recently [1, 2] but still couldn't find a solution. I am trying to create a inverse cumulative histogram [3] which shall look like [4] but with the higher values at the left. The classification shall follow this exemplary rule: class 1: 0 all values > 0 class 2: 10 all values > 10 class 3: 15 all values > 15 class 4: 20 all values > 20 class 5: 25 all values > 25 [...] I could get this easily in a spreadsheet by creating a matix with conditional statements (if VALUES_COL > CLASS_BOUNDARY; VALUES_COL; '-'). With python (numpy or pylab) I was not successful. The plotted histogram envelope turned out to be just the inverted curve as the one created with the spreadsheet app. I have briely visualised the issue here [5]. I hope that this makes it more understandable. Later I would like to sum and count all values in each bin as discussed in [2]. May someone give me pointer or hint on how to improve my code below to achive the desired histogram? Thanks a lot in advance, Timmie [1]: http://www.nabble.com/np.hist-with-masked-values-to25243905.html [2]: http://www.nabble.com/histogram%3A-sum-up-values-in-each-bin-to25171265.html [3]: http://en.wikipedia.org/wiki/Histogram#Cumulative_histogram [4]: http://addictedtor.free.fr/graphiques/RGraphGallery.php?graph=126 [5]: http://www.scribd.com/doc/19371606/Distribution-Histogram ##### CODE ##### normed = False values # loaded data as array bins = 10 ### sum ## taken from ## http://www.nabble.com/Scipy-and-statistics%3A-probability-density-function-to24683007.html#a24683304 sums = np.histogram(values, weights=values, normed=normed, bins=bins) ecdf_sums = np.hstack([0.0, sums[0].cumsum() ]) ecdf_inv_sums = ecdf_sums[::-1] pylab.plot(sums[1], ecdf_inv_sums) pylab.show() From lciti at essex.ac.uk Wed Sep 2 19:19:13 2009 From: lciti at essex.ac.uk (Citi, Luca) Date: Thu, 3 Sep 2009 00:19:13 +0100 Subject: [Numpy-discussion] numpy core dump on linux References: <890c2bf00909021523q3d0f7cdboccbcb7ca938514f7@mail.gmail.com> <3d375d730909021537w7ffc2ef7ld7073aedfb536ac4@mail.gmail.com> Message-ID: <3DA3B328CBC48B4EBB88484B8A5EA19106AF9EA1@sernt14.essex.ac.uk> I experience the same problem. 
A few more additional test cases: In [1]: import numpy In [2]: numpy.lexsort([numpy.arange(5)[::-1].copy(), numpy.arange(5)]) Out[2]: array([0, 1, 2, 3, 4]) In [3]: numpy.lexsort([numpy.arange(5)[::-1].copy(), numpy.arange(5.)]) Out[3]: array([0, 1, 2, 3, 4]) In [4]: numpy.lexsort([numpy.arange(5), numpy.arange(5)]) Out[4]: array([0, 1, 2, 3, 4]) In [5]: numpy.lexsort([numpy.arange(5), numpy.arange(5.)]) Out[5]: array([0, 1, 2, 3, 4]) In [6]: numpy.lexsort([numpy.arange(5)[::-1], numpy.arange(5)]) Out[6]: array([0, 1, 2, 3, 4]) In [7]: numpy.lexsort([numpy.arange(5)[::-1], numpy.arange(5.)]) *** glibc detected *** /usr/bin/python: free(): invalid next size (fast): 0x09be6eb8 *** It looks like the problem is when the first array is reversed and the second is float. I am not familiar with gdb. If I run "gdb python", run it, and give the commands above, it hangs at the glibc line without returning to gdb unless I hit CTRL-C. In this case, I guess, the backtrace I get is related to the CTRL-C rather than the error. Any hint in how to obtain useful information from gdb? Best, Luca From charlesr.harris at gmail.com Wed Sep 2 19:21:29 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 2 Sep 2009 17:21:29 -0600 Subject: [Numpy-discussion] A faster median (Wirth's method) In-Reply-To: References: <4A9C9DDA.9060503@molden.no> Message-ID: On Wed, Sep 2, 2009 at 1:25 PM, Chad Netzer wrote: > On Mon, Aug 31, 2009 at 9:06 PM, Sturla Molden wrote: > > > > We recently has a discussion regarding an optimization of NumPy's median > > to average O(n) complexity. After some searching, I found out there is a > > selection algorithm competitive in speed with Hoare's quick select. It > > has the advantage of being a lot simpler to implement. In plain Python: > > > Chad, you can continue to write quick select using NumPy's C quick sort > > in numpy/core/src/_sortmodule.c.src. When you are done, it might be > > about 10% faster than this. :-) > > I was sick for a bit last week, so got stalled on my version, but I'll > be working on it this weekend. I'm going for a more general partition > function, that could have slightly more general use cases than just a > median. Nevertheless, its good to see there could be several options, > hopefully at least one of which can be put into numpy. > > By the way, as far as I can tell, the above algorithm is exactly the > same idea as a non-recursive Hoare (ie. quicksort) selection: Do the > partition, then only proceed to the sub-partition that must contain > the nth element. My version is a bit more general, allowing > partitioning on a range of elements rather than just one, but the > concept is the same. The numpy quicksort already does non recursive > sorting. > > I'd also like to, if possible, have a specialized 2D version, since > image media filtering is one of my interests, and the C version works > on 1D (raveled) arrays only. > > There are special hardwired medians for 2,3,5,9 elements, which covers a lot of image processing. They aren't in numpy, though ;) David has implemented a NeighborhoodIter that could help extract the elements if you want to deal with images. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From charlesr.harris at gmail.com Wed Sep 2 19:24:13 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 2 Sep 2009 17:24:13 -0600 Subject: [Numpy-discussion] numpy core dump on linux In-Reply-To: <3DA3B328CBC48B4EBB88484B8A5EA19106AF9EA1@sernt14.essex.ac.uk> References: <890c2bf00909021523q3d0f7cdboccbcb7ca938514f7@mail.gmail.com> <3d375d730909021537w7ffc2ef7ld7073aedfb536ac4@mail.gmail.com> <3DA3B328CBC48B4EBB88484B8A5EA19106AF9EA1@sernt14.essex.ac.uk> Message-ID: On Wed, Sep 2, 2009 at 5:19 PM, Citi, Luca wrote: > I experience the same problem. > A few more additional test cases: > > In [1]: import numpy > > In [2]: numpy.lexsort([numpy.arange(5)[::-1].copy(), numpy.arange(5)]) > Out[2]: array([0, 1, 2, 3, 4]) > > In [3]: numpy.lexsort([numpy.arange(5)[::-1].copy(), numpy.arange(5.)]) > Out[3]: array([0, 1, 2, 3, 4]) > > In [4]: numpy.lexsort([numpy.arange(5), numpy.arange(5)]) > Out[4]: array([0, 1, 2, 3, 4]) > > In [5]: numpy.lexsort([numpy.arange(5), numpy.arange(5.)]) > Out[5]: array([0, 1, 2, 3, 4]) > > In [6]: numpy.lexsort([numpy.arange(5)[::-1], numpy.arange(5)]) > Out[6]: array([0, 1, 2, 3, 4]) > > In [7]: numpy.lexsort([numpy.arange(5)[::-1], numpy.arange(5.)]) > *** glibc detected *** /usr/bin/python: free(): invalid next size (fast): > 0x09be6eb8 *** > > It looks like the problem is when the first array is reversed and the > second is float. > > I am not familiar with gdb. If I run "gdb python", run it, and give the > commands above, > it hangs at the glibc line without returning to gdb unless I hit CTRL-C. In > this case, > I guess, the backtrace I get is related to the CTRL-C rather than the > error. > Any hint in how to obtain useful information from gdb? > > The actual bug is probably not where the crash occurs. I think there is enough info to track it down for anyone who wants to crawl through the relevant code. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Wed Sep 2 19:26:14 2009 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 2 Sep 2009 18:26:14 -0500 Subject: [Numpy-discussion] help creating a reversed cumulative histogram In-Reply-To: References: Message-ID: <3d375d730909021626w3dc3c2dbo161b8b1239272e0d@mail.gmail.com> On Wed, Sep 2, 2009 at 18:15, Tim Michelsen wrote: > Hello fellow numy users, > I posted some questions on histograms recently [1, 2] but still couldn't > find ?a solution. > > I am trying to create a inverse cumulative histogram [3] which shall > look like [4] but with the higher values at the left. Okay. That is completely different from what you've asked before. > The classification shall follow this exemplary rule: > > class 1: 0 > all values > 0 > > class 2: 10 > all values > 10 > > class 3: 15 > all values > 15 > > class 4: 20 > all values > 20 > > class 5: 25 > all values > 25 > > [...] > > I could get this easily in a spreadsheet by creating a matix with > conditional statements (if VALUES_COL > CLASS_BOUNDARY; VALUES_COL; '-'). > > With python (numpy or pylab) I was not successful. The plotted histogram > envelope turned out to be just the inverted curve as the one created > with the spreadsheet app. > sums = np.histogram(values, weights=values, > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? normed=normed, > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? bins=bins) > ecdf_sums = np.hstack([0.0, sums[0].cumsum() ]) > ecdf_inv_sums = ecdf_sums[::-1] This is not the kind of "inversion" that you are looking for. 
You want ecdf_inv_sums = ecdf_sums[-1] - ecdf_sums -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From charlesr.harris at gmail.com Wed Sep 2 19:29:31 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 2 Sep 2009 17:29:31 -0600 Subject: [Numpy-discussion] numpy core dump on linux In-Reply-To: <3DA3B328CBC48B4EBB88484B8A5EA19106AF9EA1@sernt14.essex.ac.uk> References: <890c2bf00909021523q3d0f7cdboccbcb7ca938514f7@mail.gmail.com> <3d375d730909021537w7ffc2ef7ld7073aedfb536ac4@mail.gmail.com> <3DA3B328CBC48B4EBB88484B8A5EA19106AF9EA1@sernt14.essex.ac.uk> Message-ID: On Wed, Sep 2, 2009 at 5:19 PM, Citi, Luca wrote: > I experience the same problem. > A few more additional test cases: > > In [1]: import numpy > > In [2]: numpy.lexsort([numpy.arange(5)[::-1].copy(), numpy.arange(5)]) > Out[2]: array([0, 1, 2, 3, 4]) > > In [3]: numpy.lexsort([numpy.arange(5)[::-1].copy(), numpy.arange(5.)]) > Out[3]: array([0, 1, 2, 3, 4]) > > In [4]: numpy.lexsort([numpy.arange(5), numpy.arange(5)]) > Out[4]: array([0, 1, 2, 3, 4]) > > In [5]: numpy.lexsort([numpy.arange(5), numpy.arange(5.)]) > Out[5]: array([0, 1, 2, 3, 4]) > > In [6]: numpy.lexsort([numpy.arange(5)[::-1], numpy.arange(5)]) > Out[6]: array([0, 1, 2, 3, 4]) > > In [7]: numpy.lexsort([numpy.arange(5)[::-1], numpy.arange(5.)]) > *** glibc detected *** /usr/bin/python: free(): invalid next size (fast): > 0x09be6eb8 *** > > It looks like the problem is when the first array is reversed and the > second is float. > It's mixing types with different bit sizes, small type first. In [6]: a = np.array([1.0]*100, dtype=int16) In [7]: b = np.array([1.0]*100, dtype=int32) In [8]: lexsort((a[::-1],b)) *Crash* Probably the results are incorrect for the reverse order of types that doesn't crash, but different arrays would be needed to check that. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From eadrogue at gmx.net Wed Sep 2 19:44:25 2009 From: eadrogue at gmx.net (Ernest =?iso-8859-1?Q?Adrogu=E9?=) Date: Thu, 3 Sep 2009 01:44:25 +0200 Subject: [Numpy-discussion] masked arrays of structured arrays In-Reply-To: <48116BC6-2100-47E6-9B9F-AD3E4465BF43@gmail.com> References: <20090822115844.GA6422@doriath.local> <00FAC407-B21B-46BC-8ADA-E4702DBE54A6@gmail.com> <20090831183317.GA7148@doriath.local> <48116BC6-2100-47E6-9B9F-AD3E4465BF43@gmail.com> Message-ID: <20090902234425.GA16795@doriath.local> 31/08/09 @ 14:37 (-0400), thus spake Pierre GM: > On Aug 31, 2009, at 2:33 PM, Ernest Adrogu? wrote: > > > 30/08/09 @ 13:19 (-0400), thus spake Pierre GM: > >> I can't reproduce that with a recent SVN version (r7348). What > >> version > >> of numpy are you using ? > > > > Version 1.2.1 > > That must be that. Can you try w/ 1.3 ? Yes, in version 1.3.0 it's fixed. 
-- Ernest From josef.pktd at gmail.com Wed Sep 2 19:55:03 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 2 Sep 2009 19:55:03 -0400 Subject: [Numpy-discussion] help creating a reversed cumulative histogram In-Reply-To: <3d375d730909021626w3dc3c2dbo161b8b1239272e0d@mail.gmail.com> References: <3d375d730909021626w3dc3c2dbo161b8b1239272e0d@mail.gmail.com> Message-ID: <1cd32cbb0909021655l37bc7ae9wf6555818ba6b3e8f@mail.gmail.com> On Wed, Sep 2, 2009 at 7:26 PM, Robert Kern wrote: > On Wed, Sep 2, 2009 at 18:15, Tim Michelsen wrote: >> Hello fellow numy users, >> I posted some questions on histograms recently [1, 2] but still couldn't >> find ?a solution. >> >> I am trying to create a inverse cumulative histogram [3] which shall >> look like [4] but with the higher values at the left. > > Okay. That is completely different from what you've asked before. > >> The classification shall follow this exemplary rule: >> >> class 1: 0 >> all values > 0 >> >> class 2: 10 >> all values > 10 >> >> class 3: 15 >> all values > 15 >> >> class 4: 20 >> all values > 20 >> >> class 5: 25 >> all values > 25 >> >> [...] >> >> I could get this easily in a spreadsheet by creating a matix with >> conditional statements (if VALUES_COL > CLASS_BOUNDARY; VALUES_COL; '-'). >> >> With python (numpy or pylab) I was not successful. The plotted histogram >> envelope turned out to be just the inverted curve as the one created >> with the spreadsheet app. > >> sums = np.histogram(values, weights=values, >> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? normed=normed, >> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? bins=bins) >> ecdf_sums = np.hstack([0.0, sums[0].cumsum() ]) >> ecdf_inv_sums = ecdf_sums[::-1] > > This is not the kind of "inversion" that you are looking for. You want > > ecdf_inv_sums = ecdf_sums[-1] - ecdf_sums and you can plot the histogram with bar eisf_sums = ecdf_sums[-1] - ecdf_sums # empirical inverse survival function of weights width = sums[1][1] - sums[1][0] rects1 = plt.bar(sums[1], eisf_sums, width, color='b') Are you sure you want cumulative weights in the histogram? Josef > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > ?-- Umberto Eco > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From timmichelsen at gmx-topmail.de Wed Sep 2 20:11:10 2009 From: timmichelsen at gmx-topmail.de (Tim Michelsen) Date: Thu, 03 Sep 2009 02:11:10 +0200 Subject: [Numpy-discussion] help creating a reversed cumulative histogram In-Reply-To: <1cd32cbb0909021655l37bc7ae9wf6555818ba6b3e8f@mail.gmail.com> References: <3d375d730909021626w3dc3c2dbo161b8b1239272e0d@mail.gmail.com> <1cd32cbb0909021655l37bc7ae9wf6555818ba6b3e8f@mail.gmail.com> Message-ID: Hello Robert and Josef, thanks for the quick answers! I really appreciate this. >>> I am trying to create a inverse cumulative histogram [3] which shall >>> look like [4] but with the higher values at the left. >> Okay. That is completely different from what you've asked before. You are right. But it's soemtimes hard to decribe a desired and expected output in python terms and pseudocode. I still have to lern more numpy vocabs... I will evalute your answers and give feedback. 
Regards, Timmie From robert.kern at gmail.com Wed Sep 2 21:33:07 2009 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 2 Sep 2009 20:33:07 -0500 Subject: [Numpy-discussion] help creating a reversed cumulative histogram In-Reply-To: References: <3d375d730909021626w3dc3c2dbo161b8b1239272e0d@mail.gmail.com> <1cd32cbb0909021655l37bc7ae9wf6555818ba6b3e8f@mail.gmail.com> Message-ID: <3d375d730909021833y620d1a4fx9b0e6ef1523a2b74@mail.gmail.com> On Wed, Sep 2, 2009 at 19:11, Tim Michelsen wrote: > Hello Robert and Josef, > thanks for the quick answers! I really appreciate this. > >>>> I am trying to create a inverse cumulative histogram [3] which shall >>>> look like [4] but with the higher values at the left. >>> Okay. That is completely different from what you've asked before. > You are right. > But it's soemtimes hard to decribe a desired and expected output in > python terms and pseudocode. > I still have to lern more numpy vocabs... Actually, I apologize. I meant to delete that line before sending the message. It was unnecessary and abusive. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From charlesr.harris at gmail.com Wed Sep 2 23:26:59 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 2 Sep 2009 21:26:59 -0600 Subject: [Numpy-discussion] numpy core dump on linux In-Reply-To: <890c2bf00909021523q3d0f7cdboccbcb7ca938514f7@mail.gmail.com> References: <890c2bf00909021523q3d0f7cdboccbcb7ca938514f7@mail.gmail.com> Message-ID: On Wed, Sep 2, 2009 at 4:23 PM, Jeremy Mayes wrote: > This one line causes python to core dump on linux. > numpy.lexsort([ > numpy.array(['-','-','-','-','-','-','-','-','-','-','-','-','-'])[::-1],numpy.array([732685., > 732685., 732685., 732685., 732685., 732685.,732685., 732685., > 732685., 732685., 732685., 732685., 732679.])[::-1]]) > > Here's some version info: > > python 2.5.4 > numpy 1.3.0 > > error is > *** glibc detected *** free(): invalid next size (fast): 0x0000000000526be0 > *** > > I've opened ticket #1217 for this. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From gokhansever at gmail.com Thu Sep 3 00:59:54 2009 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_Sever?=) Date: Wed, 2 Sep 2009 23:59:54 -0500 Subject: [Numpy-discussion] Fastest way to parsing a specific binay file In-Reply-To: <3d375d730909021158i79ec0551h3eca28878a0d405c@mail.gmail.com> References: <49d6b3500909020738g53befd6ey4a6af8e269162510@mail.gmail.com> <4A9E908C.1070205@molden.no> <49d6b3500909020953v262b832ajaac5fdba02f8fb05@mail.gmail.com> <3DA3B328CBC48B4EBB88484B8A5EA19106AF9E98@sernt14.essex.ac.uk> <49d6b3500909021027w29159135tbb5bfe47a48d26c@mail.gmail.com> <3d375d730909021029l9a3be85wb850421217da5ef9@mail.gmail.com> <49d6b3500909021128sfe82fd9iaf95886b6263340a@mail.gmail.com> <3d375d730909021158i79ec0551h3eca28878a0d405c@mail.gmail.com> Message-ID: <49d6b3500909022159m628262edr3cd3be17c12d9a19@mail.gmail.com> On Wed, Sep 2, 2009 at 1:58 PM, Robert Kern wrote: > On Wed, Sep 2, 2009 at 13:28, G?khan Sever wrote: > > Put the reference manual in: > > > > http://drop.io/1plh5rt > > > > First few pages describe the data format they use. > > Ah. The fields are *not* delimited by a fixed value. Regexes are no > help to you for pulling out the information you need, except perhaps > later to parse the text fields. 
I think you are also getting spurious > results because your regex matches things inside data fields. > > Instead, you have a header containing the length of the data field > followed by the data field. Create a structured dtype that corresponds > to the DataDir struct on page 15. Note that "unsigned int" there is > actually a numpy.uint16, not a uint32. > > dt = np.dtype([('tagNumber', np.uint16), ('dataOffset', np.uint16), > ('numberBytes', np.uint16), ('samples', np.uint16), ('bytesPerSample', > np.uint16), ('type', np.uint8), ('param1', np.uint8), ('param2', > np.uint8), ('param3', np.uint8), ('address', np.uint16)]) > > Now read dt.itemsize bytes from the file and use > > header = fromstring(f.read(dt.itemsize), dt)[0] > > to get a record object that corresponds to the header. Use the > dataOffset and numberBytes fields to extract the actual data bytes > from the file. > > For example, if we go to the second header field: > > In [28]: f.seek(dt.itemsize,0) > > In [29]: header = np.fromstring(f.read(dt.itemsize), dt)[0] > > In [30]: header > Out[30]: (65530, 100, 8, 1, 8, 255, 0, 0, 0, 43605) > > In [31]: f.seek(header['dataOffset'], 0) > > In [32]: f.read(header['numberBytes']) > Out[32]: 'prj.300\x00' > > > There are still some semantic issues you need to work out, still. > There are multiple "buffers" per file, and the dataOffsets are > relative to the start of the buffer, not the file. > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > -- Umberto Eco > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > Robert, You must have thrown a couple RTFM's while replying my emails :) I usually take trial-error approaches initially, and don't give up unless I hit a hurdle so fast, which in this case resulted with the unsuccessful regex approach. However from the good point I have learnt the basics of regular expressions and realized how powerful could they be during a text parsing task. Enough prattle, below is what I am working on: So far I was successfully able to extract the file names and the data associated with those names (with the exception of multiple buffer per file cases). However not reading time increments correctly, I should be seeing 1 sec incremental time ticks from the time segment reading, but all it does is to return the same first time information. Furthermore, I still couldn't figure out how to wrap the main looping suite (range(500) is just a dummy number which will let me process whole binary data) I don't know yet how to make the range input generic which will work any size of similar binary file. 
import numpy as np
import struct

f = open('test.sea', 'rb')

dt = np.dtype([('tagNumber', np.uint16), ('dataOffset', np.uint16),
('numberBytes', np.uint16), ('samples', np.uint16), ('bytesPerSample',
np.uint16), ('type', np.uint8), ('param1', np.uint8), ('param2',
np.uint8), ('param3', np.uint8), ('address', np.uint16)])


start = 0
ct = 0

for i in range(500):

    header = np.fromstring(f.read(dt.itemsize), dt)[0]

    if header['tagNumber'] == 65530:
        loc = f.tell()
        f.seek(start + header['dataOffset'])
        f.read(header['numberBytes'])
        f.seek(loc)
    elif header['tagNumber'] == 65531:
        loc = f.tell()
        f.seek(start + header['dataOffset'])
        f.read(header['numberBytes'])
        start = f.tell()
    elif header['tagNumber'] == 0:
        loc = f.tell()
        f.seek(start + header['dataOffset'])
        print f.tell()
        k = f.read(header['numberBytes'])
        print struct.unpack('9h', k[:18])
        f.seek(loc)
        ct += 1

--
Gökhan
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
From sturla at molden.no  Thu Sep  3 01:09:02 2009
From: sturla at molden.no (Sturla Molden)
Date: Thu, 03 Sep 2009 07:09:02 +0200
Subject: [Numpy-discussion] A faster median (Wirth's method)
In-Reply-To: 
References: <4A9C9DDA.9060503@molden.no>
Message-ID: <4A9F4F6E.7090504@molden.no>

Chad Netzer skrev:
> By the way, as far as I can tell, the above algorithm is exactly the
> same idea as a non-recursive Hoare (ie. quicksort) selection: Do the
> partition, then only proceed to the sub-partition that must contain
> the nth element. My version is a bit more general, allowing
> partitioning on a range of elements rather than just one, but the
> concept is the same. The numpy quicksort already does non recursive
> sorting.
>
> I'd also like to, if possible, have a specialized 2D version, since
> image media filtering is one of my interests, and the C version works
> on 1D (raveled) arrays only.

I agree. NumPy (or SciPy) could have a select module similar to the sort
module. If the select function takes an axis argument similar to the
sort functions, only a small change to the current np.median would be needed.

Take a look at this:

http://projects.scipy.org/numpy/attachment/ticket/1213/_selectmodule.pyx

Here is a select function that takes an axis argument. There are
specialized versions for 1D, 2D, and 3D. Input can be contiguous or not.
For 4D and above, axes are found by recursion on the shape array. Thus
it should be fast regardless of dimensions.

I haven't tested the Cython code /thoroughly/, but at least it does compile.

Sturla Molden

From robert.kern at gmail.com  Thu Sep  3 01:22:18 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Thu, 3 Sep 2009 00:22:18 -0500
Subject: [Numpy-discussion] Fastest way to parsing a specific binay file
In-Reply-To: <49d6b3500909022159m628262edr3cd3be17c12d9a19@mail.gmail.com>
References: <49d6b3500909020738g53befd6ey4a6af8e269162510@mail.gmail.com>
	<4A9E908C.1070205@molden.no>
	<49d6b3500909020953v262b832ajaac5fdba02f8fb05@mail.gmail.com>
	<3DA3B328CBC48B4EBB88484B8A5EA19106AF9E98@sernt14.essex.ac.uk>
	<49d6b3500909021027w29159135tbb5bfe47a48d26c@mail.gmail.com>
	<3d375d730909021029l9a3be85wb850421217da5ef9@mail.gmail.com>
	<49d6b3500909021128sfe82fd9iaf95886b6263340a@mail.gmail.com>
	<3d375d730909021158i79ec0551h3eca28878a0d405c@mail.gmail.com>
	<49d6b3500909022159m628262edr3cd3be17c12d9a19@mail.gmail.com>
Message-ID: <3d375d730909022222t1e9ea6f4q693fcf9e7a60ab10@mail.gmail.com>

On Wed, Sep 2, 2009 at 23:59, Gökhan Sever wrote:
> Robert,
>
> You must have thrown a couple RTFM's while replying my emails :)

Not really.
There's no manual for this. Greg Wilson's _Data Crunching_ may be a good general introduction to how to think about these problems. http://www.pragprog.com/titles/gwd/data-crunching > I usually > take trial-error approaches initially, and don't give up unless I hit a > hurdle so fast, which in this case resulted with the unsuccessful regex > approach. However from the good point I have learnt the basics of regular > expressions and realized how powerful could they be during a text parsing > task. > > Enough prattle, below is what I am working on: > > So far I was successfully able to extract the file names and the data > associated with those names (with the exception of multiple buffer per file > cases). > > However not reading time increments correctly, I should be seeing 1 sec > incremental time ticks from the time segment reading, but all it does is to > return the same first time information. > > Furthermore, I still couldn't figure out how to wrap the main looping suite > (range(500) is just a dummy number which will let me process whole binary > data) I don't know yet how to make the range input generic which will work > any size of similar binary file. while True: ... if no_more_data(): break > import numpy as np > import struct > > f = open('test.sea', 'rb') > > dt = np.dtype([('tagNumber', np.uint16), ('dataOffset', np.uint16), > ('numberBytes', np.uint16), ('samples', np.uint16), ('bytesPerSample', > np.uint16), ('type', np.uint8), ('param1', np.uint8), ('param2', > np.uint8), ('param3', np.uint8), ('address', np.uint16)]) > > > start = 0 > ct = 0 > > for i in range(500): > > ??? header = np.fromstring(f.read(dt.itemsize), dt)[0] > > ??? if header['tagNumber'] == 65530: > ??????? loc = f.tell() > ??????? f.seek(start + header['dataOffset']) > ??????? f.read(header['numberBytes']) Presumably you are doing something with this data, not just discarding it. > ??????? f.seek(loc) This should be f.seek(loc, 0). f.seek(nbytes) is to seek forward from the current position by nbytes. The 0 tells it to start from the beginning. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From robert.kern at gmail.com Thu Sep 3 01:28:19 2009 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 3 Sep 2009 00:28:19 -0500 Subject: [Numpy-discussion] A faster median (Wirth's method) In-Reply-To: <4A9F4F6E.7090504@molden.no> References: <4A9C9DDA.9060503@molden.no> <4A9F4F6E.7090504@molden.no> Message-ID: <3d375d730909022228v446cee1fwca66ad623becb704@mail.gmail.com> On Thu, Sep 3, 2009 at 00:09, Sturla Molden wrote: > Chad Netzer skrev: >> I'd also like to, if possible, have a specialized 2D version, since >> image media filtering is one of my interests, and the C version works >> on 1D (raveled) arrays only. > I agree. NumPy (or SciPy) could have a select module similar to the sort > module. If the select function takes an axis argument similar to the > sort functions, only a small change to the current np.median would needed. > > Take a look at this: > > http://projects.scipy.org/numpy/attachment/ticket/1213/_selectmodule.pyx > > Here is a select function that takes an axis argument. There are > specialized versions for 1D, 2D, and 3D. Input can be contiguous or not. > For 4D and above, axes are found by recursion on the shape array. Thus > it should be fast regardless of dimensions. 
When he is talking about 2D, I believe he is referring to median filtering rather than computing the median along an axis. I.e., replacing each pixel with the median of a specified neighborhood around the pixel. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From chad.netzer at gmail.com Thu Sep 3 01:50:09 2009 From: chad.netzer at gmail.com (Chad Netzer) Date: Wed, 2 Sep 2009 22:50:09 -0700 Subject: [Numpy-discussion] A faster median (Wirth's method) In-Reply-To: <3d375d730909022228v446cee1fwca66ad623becb704@mail.gmail.com> References: <4A9C9DDA.9060503@molden.no> <4A9F4F6E.7090504@molden.no> <3d375d730909022228v446cee1fwca66ad623becb704@mail.gmail.com> Message-ID: On Wed, Sep 2, 2009 at 10:28 PM, Robert Kern wrote: > When he is talking about 2D, I believe he is referring to median > filtering rather than computing the median along an axis. I.e., > replacing each pixel with the median of a specified neighborhood > around the pixel. That's right, Robert. Basically, I meant doing a median on a square (or rectangle) "view" of an array, without first having to ravel(), thus generally saving a copy. But actually, since my selection based median overwrites the source array, it may not save a copy anyway. But Charles Harris's earlier suggestion of some hard coded medians for common filter template sizes (ie 3x3, 5x5, etc.) may be a nice addition to scipy, especially if it can be generalized somewhat to other filters. -C From sturla at molden.no Thu Sep 3 01:55:40 2009 From: sturla at molden.no (Sturla Molden) Date: Thu, 03 Sep 2009 07:55:40 +0200 Subject: [Numpy-discussion] A faster median (Wirth's method) In-Reply-To: <3d375d730909022228v446cee1fwca66ad623becb704@mail.gmail.com> References: <4A9C9DDA.9060503@molden.no> <4A9F4F6E.7090504@molden.no> <3d375d730909022228v446cee1fwca66ad623becb704@mail.gmail.com> Message-ID: <4A9F5A5C.9040209@molden.no> Robert Kern skrev: > When he is talking about 2D, I believe he is referring to median > filtering rather than computing the median along an axis. I.e., > replacing each pixel with the median of a specified neighborhood > around the pixel. > > That's not something numpy's median function should be specialized to do. IMHO, median filtering belongs to scipy. Sturla From wright at esrf.fr Thu Sep 3 01:55:40 2009 From: wright at esrf.fr (Jon Wright) Date: Thu, 03 Sep 2009 07:55:40 +0200 Subject: [Numpy-discussion] A faster median (Wirth's method) In-Reply-To: References: <4A9C9DDA.9060503@molden.no> <4A9F4F6E.7090504@molden.no> <3d375d730909022228v446cee1fwca66ad623becb704@mail.gmail.com> Message-ID: <4A9F5A5C.2000208@esrf.fr> Chad Netzer wrote: > But Charles Harris's earlier suggestion of some hard coded medians for > common filter template sizes (ie 3x3, 5x5, etc.) may be a nice > addition to scipy, especially if it can be generalized somewhat to > other filters. > For 2D images try looking into PIL : ImageFilter.MedianFilter Cheers, Jon From sturla at molden.no Thu Sep 3 02:14:30 2009 From: sturla at molden.no (Sturla Molden) Date: Thu, 03 Sep 2009 08:14:30 +0200 Subject: [Numpy-discussion] A faster median (Wirth's method) In-Reply-To: References: <4A9C9DDA.9060503@molden.no> <4A9F4F6E.7090504@molden.no> <3d375d730909022228v446cee1fwca66ad623becb704@mail.gmail.com> Message-ID: <4A9F5EC6.8090408@molden.no> Chad Netzer skrev: > That's right, Robert. 
Basically, I meant doing a median on a square
> (or rectangle) "view" of an array, without first having to ravel(),
> thus generally saving a copy. But actually, since my selection based
> median overwrites the source array, it may not save a copy anyway.
>
Avoiding copies of tiny buffers is futile optimization.

QuickSelect is overkill for tiny buffers like common filter kernels.
Insertion sort is fine. Getting rid of loops and multiple function
calls in Python helps a lot.

If memory is not an issue, with np.median you can actually create quite
an efficient median filter using a 3D ndarray. For example if you use
an image of 640 x 480 pixels and want a 9 pixel median filter, you can
put shifted images in a 640 x 480 x 9 ndarray, and call median with
axis=2.

Sturla Molden

From stefano_covino at yahoo.it  Thu Sep  3 03:57:34 2009
From: stefano_covino at yahoo.it (Stefano Covino)
Date: Thu, 3 Sep 2009 07:57:34 +0000 (UTC)
Subject: [Numpy-discussion] snow leopard and Numeric
References: <8AD04A53-5C75-4159-B224-07B1BB85A212@yahoo.it>
	<3d375d730909012144l23d66258q459d5b0966815f45@mail.gmail.com>
	<3d375d730909012153o2736cee4r430f597dd37c5a3@mail.gmail.com>
Message-ID: 

Apparently, I have solved the problem.

I say apparently meaning that at least it compiles and installs smoothly.

In file ranf.c (Packages/RNG/Src/ranf.c) insert #include <sys/time.h> at
the beginning and comment out line 153 in the original file, where the
function 'gettimeofday' is re-defined.

Cheers,
Stefano

From timmichelsen at gmx-topmail.de  Thu Sep  3 05:33:08 2009
From: timmichelsen at gmx-topmail.de (Tim Michelsen)
Date: Thu, 3 Sep 2009 09:33:08 +0000 (UTC)
Subject: [Numpy-discussion] help creating a reversed cumulative histogram
References: <3d375d730909021626w3dc3c2dbo161b8b1239272e0d@mail.gmail.com>
	<1cd32cbb0909021655l37bc7ae9wf6555818ba6b3e8f@mail.gmail.com>
	<3d375d730909021833y620d1a4fx9b0e6ef1523a2b74@mail.gmail.com>
Message-ID: 

> >>> Okay. That is completely different from what you've asked before.
> > You are right.
> > But it's soemtimes hard to decribe a desired and expected output in
> > python terms and pseudocode.
> > I still have to lern more numpy vocabs...
>
> Actually, I apologize. I meant to delete that line before sending the
> message. It was unnecessary and abusive.
Don't worry. I got it right the way you meant it initially.
No offence.

Coding and math problems get clearer once you take the effort to
explain and visualise them for others.

You spend quite a lot of time responding here. I appreciate that.

Best regards,
Timmie

From timmichelsen at gmx-topmail.de  Thu Sep  3 09:23:06 2009
From: timmichelsen at gmx-topmail.de (Tim Michelsen)
Date: Thu, 3 Sep 2009 13:23:06 +0000 (UTC)
Subject: [Numpy-discussion] help creating a reversed cumulative histogram
References: <3d375d730909021626w3dc3c2dbo161b8b1239272e0d@mail.gmail.com>
	<1cd32cbb0909021655l37bc7ae9wf6555818ba6b3e8f@mail.gmail.com>
Message-ID: 

>
Hello,
I have checked the snippets you proposed.
It does what I wanted to achieve.
Obviously, I had to subtract the values as Robert
demonstrated. This could also be perceived from
the figure I posted.

I still have to see how I can optimise the code
(c.f. below) or modify it to be less complicated.
It seemed so simple in the spreadsheet...

> eisf_sums = ecdf_sums[-1] - ecdf_sums
> # empirical inverse survival
> function of weights
Can you recommend me a (literature) source where
I can look up this term?
I learned statistics in my mother tongue and seem
to need a refresher on distributions...
I would like to come up with the right terms next time. > Are you sure you want cumulative weights in >the histogram? You mean it doesn't make sense at all? I need: 1) the count of occurrences sorted in each bin counts = np.histogram(values, normed=normed, bins=bins) => here I obtain now the same as in the spreadsheet 2) the sum of all values sorted in each bin sums = np.histogram(values, weights=values, normed=normed, bins=bins) => here I still obtain different values for the first histogram value (eisf_sums[0]): Numpy: eisf_sums 335.50026738, 319.21363636, 266.07724942, 198.10258741, 126.69270396, 67.98125874, 38.47335664, 24.75062937, 13.42121212, 2.48636364, 0. Spreadsheet: 335.2351159, 319.2136364, 266.0772494, 198.1025874, 126.692704, 67.98125874, 38.47335664, 24.75062937, 13.42121212, 2.486363636, 0 Additionally, I would like to see these implemented as convenience functions in numpy or scipy. There should be out of the box functions for all kinds of distributions. Where is the best place to contrubute a final version? The scipy.stats? Thanks again for your input, Timmie ##### below the distilled code ##### ## histogram settings normed = False bins = 10 ## counts: gives expected results counts = np.histogram(values, normed=normed, bins=bins) ecdf_counts = np.hstack([1.0, counts[0].cumsum() ]) ecdf_inv_counts = ecdf_counts[::-1] # empirical inverse survival function of weights eisf_counts = ecdf_counts[-1] - ecdf_counts ### sum: does have deviations sums = np.histogram(values, weights=values, normed=normed, bins=bins) ecdf_sums = np.hstack([1.0, sums[0].cumsum() ]) ecdf_inv_sums = ecdf_sums[::-1] # empirical inverse survival function of weights eisf_sums = ecdf_sums[-1] - ecdf_sums ## # configure plot xlabel = 'Bins' ylabel_left = 'Counts' ylabel_right = 'Sum' fig1 = plt.figure() ax1 = fig1.add_subplot(111) # counts ax1.plot(counts[1], ecdf_inv_counts, 'r-') ax1.set_xlabel(xlabel) ax1.set_ylabel(ylabel_left, color='b') for tl in ax1.get_yticklabels(): tl.set_color('b') # sums ax2 = ax1.twinx() ax2.plot(sums[1], eisf_sums, 'b-') ax2.set_ylabel(ylabel_right, color='r') for tl in ax2.get_yticklabels(): tl.set_color('r') plt.show() From charlesr.harris at gmail.com Thu Sep 3 09:34:29 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 3 Sep 2009 07:34:29 -0600 Subject: [Numpy-discussion] A faster median (Wirth's method) In-Reply-To: <4A9F5EC6.8090408@molden.no> References: <4A9C9DDA.9060503@molden.no> <4A9F4F6E.7090504@molden.no> <3d375d730909022228v446cee1fwca66ad623becb704@mail.gmail.com> <4A9F5EC6.8090408@molden.no> Message-ID: On Thu, Sep 3, 2009 at 12:14 AM, Sturla Molden wrote: > Chad Netzer skrev: > > That's right, Robert. Basically, I meant doing a median on a square > > (or rectangle) "view" of an array, without first having to ravel(), > > thus generally saving a copy. But actually, since my selection based > > median overwrites the source array, it may not save a copy anyway. > > > Avoiding copies of timy buffers is futile optimization. > > QuickSelect is overkill for tiny buffers like common filter kernels. > Insertion sort is fine. > > Shell sort is nice for sizes in the awkward range. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From josef.pktd at gmail.com Thu Sep 3 10:17:14 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 3 Sep 2009 10:17:14 -0400 Subject: [Numpy-discussion] help creating a reversed cumulative histogram In-Reply-To: References: <3d375d730909021626w3dc3c2dbo161b8b1239272e0d@mail.gmail.com> <1cd32cbb0909021655l37bc7ae9wf6555818ba6b3e8f@mail.gmail.com> Message-ID: <1cd32cbb0909030717i4a9e3a8t69f7d129a331337c@mail.gmail.com> On Thu, Sep 3, 2009 at 9:23 AM, Tim Michelsen wrote: >> > Hello, > I have checked the snippets you proposed. > It does what I wanted to achieve. > Obviously, I had to substract the values as Robert > demonstrated. This could also be perceived from > the figure I posted. > > I still have see how I can optimise the code > (c.f. below) or modify to be less complicated. > It seemed so simple in the spreadsheet... > >> eisf_sums = ecdf_sums[-1] - ecdf_sums >> # empirical inverse survival this should have inverse in it, it was a cut and paste error empirical survival function would be just 1-ecdf however, as distributions they would require to be normed to 1, >> function of weights > Can you recommend me a (literature) source where > I can look up this term? > I learned statistics in my mother tongue and seem > to need a refresher on distributions... > I would like to come up with the right terms > next time. My first stop is usually wikipedia: http://en.wikipedia.org/wiki/Survival_function http://de.wikipedia.org/wiki/Verteilungsfunktion#.C3.9Cberlebenswahrscheinlichkeit and the ISI - INTERNATIONAL STATISTICAL INSTITUTE glossary for terms in different languages http://isi.cbs.nl/glossary/bloken83.htm > >> Are you sure you want cumulative weights in >>the histogram? > You mean it doesn't make sense at all? It depends on what you want, ecdf as it is calculated, with the weights argument in the histogram, gives you the cumulative sum of the values, not the count. In the case of the weight of pigs, it would be to cumulative weight of all pigs with a weight less than the given bin boundary weight. If values were income, then it would be the aggregated income of all individual with an income below the bin bin boundary. So it makes sense, given this is what you want (below). > > I need: > 1) the count of occurrences sorted in each bin > ? ?counts = np.histogram(values, > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?normed=normed, > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?bins=bins) > ? ?=> here I obtain now the same as in the > ? ?spreadsheet > > 2) the sum of all values sorted in each bin > ? ?sums = np.histogram(values, weights=values, > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?normed=normed, > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?bins=bins) > > ? ?=> here I still obtain different values for the first > ? ?histogram value (eisf_sums[0]): > ? ?Numpy: eisf_sums > ? ?335.50026738, 319.21363636, 266.07724942, > ? ?198.10258741, 126.69270396, 67.98125874, > ? ?38.47335664, ?24.75062937, 13.42121212, > ? ?2.48636364, 0. > > ? ?Spreadsheet: > ? ?335.2351159, 319.2136364, 266.0772494, > ? ?198.1025874, 126.692704, 67.98125874, > ? ?38.47335664, 24.75062937, 13.42121212, > ? ?2.486363636, 0 there might be a mistake in the treatment of a cell when reversing, when I run your example the highest value is not equal to values.sum() this might match the spreadsheet, but I haven't compared isf = sums[0][::-1].cumsum()[::-1] But I'm not sure yet, what's going on. Josef > > Additionally, I would like to see these implemented > as convenience functions in numpy or scipy. 
> There should be out of the box functions for all kinds > of distributions. > Where is the best place to contrubute a final version? > The scipy.stats? > > Thanks again for your input, > Timmie > > ##### below the distilled code ##### > ## histogram settings > normed = False > bins = 10 > > ## counts: gives expected results > counts = np.histogram(values, > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?normed=normed, > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?bins=bins) > > ecdf_counts = np.hstack([1.0, counts[0].cumsum() ]) > ecdf_inv_counts = ecdf_counts[::-1] > # empirical inverse survival function of weights > eisf_counts = ecdf_counts[-1] - ecdf_counts > > > ### sum: does have deviations > sums = np.histogram(values, weights=values, > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?normed=normed, > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?bins=bins) > ecdf_sums = np.hstack([1.0, sums[0].cumsum() ]) > ecdf_inv_sums = ecdf_sums[::-1] > # empirical inverse survival function of weights > eisf_sums = ecdf_sums[-1] - ecdf_sums > > ## > # configure plot > xlabel = 'Bins' > ylabel_left = 'Counts' > ylabel_right = 'Sum' > > > fig1 = plt.figure() > ax1 = fig1.add_subplot(111) > > # counts > ax1.plot(counts[1], ecdf_inv_counts, 'r-') > ax1.set_xlabel(xlabel) > ax1.set_ylabel(ylabel_left, color='b') > for tl in ax1.get_yticklabels(): > ? ?tl.set_color('b') > > # sums > ax2 = ax1.twinx() > ax2.plot(sums[1], eisf_sums, 'b-') > ax2.set_ylabel(ylabel_right, color='r') > for tl in ax2.get_yticklabels(): > ? ?tl.set_color('r') > plt.show() > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From bsouthey at gmail.com Thu Sep 3 10:24:23 2009 From: bsouthey at gmail.com (Bruce Southey) Date: Thu, 03 Sep 2009 09:24:23 -0500 Subject: [Numpy-discussion] numpy core dump on linux In-Reply-To: References: <890c2bf00909021523q3d0f7cdboccbcb7ca938514f7@mail.gmail.com> Message-ID: <4A9FD197.1000303@gmail.com> On 09/02/2009 10:26 PM, Charles R Harris wrote: > > > On Wed, Sep 2, 2009 at 4:23 PM, Jeremy Mayes > wrote: > > This one line causes python to core dump on linux. > numpy.lexsort([ > numpy.array(['-','-','-','-','-','-','-','-','-','-','-','-','-'])[::-1],numpy.array([732685., > 732685., 732685., 732685., 732685., 732685.,732685., > 732685., 732685., 732685., 732685., 732685., 732679.])[::-1]]) > > Here's some version info: > > python 2.5.4 > numpy 1.3.0 > > error is > *** glibc detected *** free(): invalid next size (fast): > 0x0000000000526be0 *** > > > I've opened ticket #1217 for this. > > Chuck > Hi, It appears to work if you: 1) Use a copied version of the reverse string array using numpy.copy(numpy.array(['-','-','-','-','-','-','-','-','-','-','-','-','-'])[::-1]) 2) Change the dtype of the string array to at least S8 Which may support the suggestion that it is related to using [::-1]. Bruce -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Thu Sep 3 11:49:40 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 3 Sep 2009 09:49:40 -0600 Subject: [Numpy-discussion] numpy core dump on linux In-Reply-To: <890c2bf00909021523q3d0f7cdboccbcb7ca938514f7@mail.gmail.com> References: <890c2bf00909021523q3d0f7cdboccbcb7ca938514f7@mail.gmail.com> Message-ID: On Wed, Sep 2, 2009 at 4:23 PM, Jeremy Mayes wrote: > This one line causes python to core dump on linux. 
> numpy.lexsort([ > numpy.array(['-','-','-','-','-','-','-','-','-','-','-','-','-'])[::-1],numpy.array([732685., > 732685., 732685., 732685., 732685., 732685.,732685., 732685., > 732685., 732685., 732685., 732685., 732679.])[::-1]]) > > Here's some version info: > > python 2.5.4 > numpy 1.3.0 > > error is > *** glibc detected *** free(): invalid next size (fast): 0x0000000000526be0 > *** > > Should be fixed in r7356. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From wkerzendorf at googlemail.com Thu Sep 3 12:02:07 2009 From: wkerzendorf at googlemail.com (Wolfgang Kerzendorf) Date: Thu, 3 Sep 2009 18:02:07 +0200 Subject: [Numpy-discussion] snow leopard issues with numpy Message-ID: <1324DD89-AE68-4B9E-BE82-339006831B86@gmail.com> I just installed numpy and scipy (both svn) on OS X 10.6 and just got scipy to work with Robert Kern's help. Playing around with numpy I got the following segfault: http://pastebin.com/m35220dbf I hope someone can make sense of it. Thanks in advance Wolfgang From charlesr.harris at gmail.com Thu Sep 3 12:13:23 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 3 Sep 2009 10:13:23 -0600 Subject: [Numpy-discussion] snow leopard issues with numpy In-Reply-To: <1324DD89-AE68-4B9E-BE82-339006831B86@gmail.com> References: <1324DD89-AE68-4B9E-BE82-339006831B86@gmail.com> Message-ID: On Thu, Sep 3, 2009 at 10:02 AM, Wolfgang Kerzendorf < wkerzendorf at googlemail.com> wrote: > I just installed numpy and scipy (both svn) on OS X 10.6 and just got > scipy to work with Robert Kern's help. Playing around with numpy I got > the following segfault: > http://pastebin.com/m35220dbf > I hope someone can make sense of it. Thanks in advance > Wolfgang > I'm going to guess it is a python problem. Which version of python do you have and where did it come from? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Thu Sep 3 12:39:19 2009 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 3 Sep 2009 11:39:19 -0500 Subject: [Numpy-discussion] snow leopard issues with numpy In-Reply-To: References: <1324DD89-AE68-4B9E-BE82-339006831B86@gmail.com> Message-ID: <3d375d730909030939g3718ab91w8ee5477903f97a59@mail.gmail.com> On Thu, Sep 3, 2009 at 11:13, Charles R Harris wrote: > > > On Thu, Sep 3, 2009 at 10:02 AM, Wolfgang Kerzendorf > wrote: >> >> I just installed numpy and scipy (both svn) on OS X 10.6 and just got >> scipy to work with Robert Kern's help. Playing around with numpy I got >> the following segfault: >> http://pastebin.com/m35220dbf >> I hope someone can make sense of it. Thanks in advance >> ? ? ?Wolfgang > > I'm going to guess it is a python problem. Which version of python do you > have and where did it come from? Or a matplotlib problem. _path.so is theirs. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco

From charlesr.harris at gmail.com  Thu Sep  3 12:49:10 2009
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Thu, 3 Sep 2009 10:49:10 -0600
Subject: [Numpy-discussion] snow leopard issues with numpy
In-Reply-To: <3d375d730909030939g3718ab91w8ee5477903f97a59@mail.gmail.com>
References: <1324DD89-AE68-4B9E-BE82-339006831B86@gmail.com>
	<3d375d730909030939g3718ab91w8ee5477903f97a59@mail.gmail.com>
Message-ID: 

On Thu, Sep 3, 2009 at 10:39 AM, Robert Kern wrote:

> On Thu, Sep 3, 2009 at 11:13, Charles R Harris
> wrote:
> >
> >
> > On Thu, Sep 3, 2009 at 10:02 AM, Wolfgang Kerzendorf
> > wrote:
> >>
> >> I just installed numpy and scipy (both svn) on OS X 10.6 and just got
> >> scipy to work with Robert Kern's help. Playing around with numpy I got
> >> the following segfault:
> >> http://pastebin.com/m35220dbf
> >> I hope someone can make sense of it. Thanks in advance
> >>      Wolfgang
> >
> > I'm going to guess it is a python problem. Which version of python do you
> > have and where did it come from?
>
> Or a matplotlib problem. _path.so is theirs.
>
Likely, then. I had to recompile matplotlib from svn when I upgraded to
Fedora 11.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
From timmichelsen at gmx-topmail.de  Thu Sep  3 12:58:05 2009
From: timmichelsen at gmx-topmail.de (Tim Michelsen)
Date: Thu, 3 Sep 2009 16:58:05 +0000 (UTC)
Subject: [Numpy-discussion] help creating a reversed cumulative histogram
References: <3d375d730909021626w3dc3c2dbo161b8b1239272e0d@mail.gmail.com>
	<1cd32cbb0909021655l37bc7ae9wf6555818ba6b3e8f@mail.gmail.com>
	<1cd32cbb0909030717i4a9e3a8t69f7d129a331337c@mail.gmail.com>
Message-ID: 

> My first stop is usually wikipedia:
[...]
Thanks.
So if I'd known that I have to call the beast an
"empirical inverse survival function", Robert would
also have found it easier to help.
Anyway, step by step...

> In the case of the weight of pigs, it would be to cumulative weight of
> all pigs with a weight less than the given bin boundary weight.
> If values were income, then it would be the aggregated income of all
> individual with an income below the bin bin boundary.
> So it makes sense, given this is what you want (below).
Exactly!

Or for precipitation:
a) count: number of precipitation events that
   occurred up to a certain limit
b) sum: precipitation total registered up to that limit

> there might be a mistake in the treatment of a cell when
> reversing, when I run your example the highest value is
> not equal to values.sum()
This has made me think again. Small point.

See here:
ecdf_sums = np.hstack([0.0, sums[0].cumsum() ])
ecdf_sums = np.hstack([sums[0].cumsum() ])

I had to adjust the classes in the spreadsheet by
replacing the first class limit by 0.0.
I had modified this yesterday to a different value
(0.265152) as I was testing the code.

from:
0.265152, 0.487273, 0.709394, 0.931515,
1.153636, 1.375758, 1.597879, 1.820000,
2.042121, 2.264242, 2.486364

to:
0.0, 0.487273, 0.709394, 0.931515,
1.153636, 1.375758, 1.597879, 1.820000,
2.042121, 2.264242, 2.486364

Now everything is fine. Results and curves match.

> But I'm not sure yet, what's going on.
1) first I didn't know how to develop the code for an
   "empirical inverse survival function" in numpy
2) I screwed my spreadsheet classes up while
   testing and verifying my numpy code.

Again, would a function for the
"empirical inverse survival function" qualify for
inclusion into numpy or scipy?

Thanks for the help.
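For reference, here is a minimal self-contained sketch of the whole
reversed cumulative computation as I understand it now. The
gamma-distributed `values` array below is just made-up example data;
any 1-D array of measurements would do:

import numpy as np

values = np.random.gamma(2.0, 0.5, size=200)   # made-up example data
bins = 10

# per-bin counts and per-bin sums of the values
counts, edges = np.histogram(values, bins=bins)
sums, _ = np.histogram(values, bins=bins, weights=values)

# cumulative totals up to each bin edge (0.0 prepended for the lowest edge)
cum_counts = np.hstack([0.0, counts.cumsum()])
cum_sums = np.hstack([0.0, sums.cumsum()])

# "survival" version: totals of everything above each bin edge, so the
# curve starts at the grand total on the left and falls to zero
surv_counts = cum_counts[-1] - cum_counts   # len(values) at the lowest edge
surv_sums = cum_sums[-1] - cum_sums         # values.sum() at the lowest edge

print(np.column_stack([edges, surv_counts, surv_sums]))
# pylab.plot(edges, surv_sums) then gives the envelope with the high values at the left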
Best regards, Timmie From josef.pktd at gmail.com Thu Sep 3 13:48:15 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 3 Sep 2009 13:48:15 -0400 Subject: [Numpy-discussion] help creating a reversed cumulative histogram In-Reply-To: References: <3d375d730909021626w3dc3c2dbo161b8b1239272e0d@mail.gmail.com> <1cd32cbb0909021655l37bc7ae9wf6555818ba6b3e8f@mail.gmail.com> <1cd32cbb0909030717i4a9e3a8t69f7d129a331337c@mail.gmail.com> Message-ID: <1cd32cbb0909031048k7e9304x89365ee81372f1f0@mail.gmail.com> On Thu, Sep 3, 2009 at 12:58 PM, Tim Michelsen wrote: >> My first stop is usually wikipedia: > [...] > Thanks. > So I I'known that I have to call the beast a > "empirical inverse survival function", Robert would > also have foundit easier to help. > Anyway, step by step... > >> In the case of the weight of pigs, it would be to cumulative weight of >> all pigs with a weight less than the given bin boundary weight. >> If values were income, then it would be the aggregated income of all >> individual with an income below the bin bin boundary. >> So it makes sense, given this is what you want (below). > Exactly! > > Or for precipitation: > a) count: number of precipitation events that > ? ?ocurred up to a certain limit > b) sum: precipitation total registered up to that limit > >> there might be a mistake in the treatment of a cell when >> reversing, when I run your example the highest value is >> not equal to values.sum() > This has made me think again. Small point. > > See here: > ecdf_sums = np.hstack([0.0, sums[0].cumsum() ]) > ecdf_sums = np.hstack([sums[0].cumsum() ]) > > I had to adjust the classes in the spreadsheet by > replacing the first class limit by 0.0. > I had modifed this yesterday to a different value > (0.265152) as I was testing the code. > > from: > 0.265152, 0.487273, 0.709394, 0.931515, > 1.153636, 1.375758, 1.597879, 1.820000, > 2.042121, 2.264242, 2.486364 > > to: > 0.0, 0.487273, 0.709394, 0.931515, > 1.153636, 1.375758, 1.597879, 1.820000, > 2.042121, 2.264242, 2.486364 > > Now everything is fine. Results and curves match. > >> But I'm not sure yet, what's going on. > 1) first I didn't know how to develop the code for a > ? ?"empirical inverse survival function" in numpy > 2) I screwed my spreadsheet classes up while > ? ?testing and verifying my numpy code. > > Again, would a function for the > "empirical inverse survival function" qualify for the > inclusion into numpy or scipy? Sorry, I'm too distracted, correcting myself a second time "this should *not* have inverse in it, using inverse was a cut and paste error" it's empirical survival function If it's just a one-liner with cumsum, then I don't think its necessary to have a function for it. But following also the previous discussion, it would be useful to have the combination of histogram and empirical cdf, sf, and/or pdf to define an empirical distribution. As interpretation in terms of distribution, normed=True would be necessary, but it could also be an option. One question to your application, in the plot you draw lines and not histograms. Is there a reason to use histograms in the calculation instead of the full ecdf. (i.e. cumsum on original values instead of cumsum on histogrammed values) ? Josef > > Thanks for the help. 
> > Best regards, > Timmie > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From timmichelsen at gmx-topmail.de Thu Sep 3 16:01:15 2009 From: timmichelsen at gmx-topmail.de (Tim Michelsen) Date: Thu, 03 Sep 2009 22:01:15 +0200 Subject: [Numpy-discussion] help creating a reversed cumulative histogram In-Reply-To: <1cd32cbb0909031048k7e9304x89365ee81372f1f0@mail.gmail.com> References: <3d375d730909021626w3dc3c2dbo161b8b1239272e0d@mail.gmail.com> <1cd32cbb0909021655l37bc7ae9wf6555818ba6b3e8f@mail.gmail.com> <1cd32cbb0909030717i4a9e3a8t69f7d129a331337c@mail.gmail.com> <1cd32cbb0909031048k7e9304x89365ee81372f1f0@mail.gmail.com> Message-ID: >> Again, would a function for the >> "empirical inverse survival function" qualify for the >> inclusion into numpy or scipy? > > Sorry, I'm too distracted, correcting myself a second time > "this should *not* have inverse in it, using inverse was a cut and paste error" > it's empirical survival function I think my fault not paying too much attention to the exact terms. The pages you sent on "survial function" are marked as to-read. > If it's just a one-liner with cumsum, then I don't think its necessary > to have a function for it. > > But following also the previous discussion, it would be useful to have > the combination of histogram and empirical cdf, sf, and/or pdf to > define an empirical distribution. As interpretation in terms of > distribution, normed=True would be necessary, but it could also be an > option. And it seems that this is just one call in R. > One question to your application, in the plot you draw lines and not > histograms. Is there a reason to use histograms in the calculation > instead of the full ecdf. (i.e. cumsum on original values instead of > cumsum on histogrammed values) ? Well, I was not aware of cumsum and a way to create ecdf with numpy. I just sereach the list archives for cdf or ecdf. As my inital version was created in a shreadsheet, I first tried to replicate that and get it validated. Can you give an example of a full ecdf? In the end I am interested in the points (x and y coordinates) where the ecdf intersects with a certain threshold value. This is the next task: Get x,y of the cut-point between a vertical or horizontal line and a curve with numpy and matplotlib. Can you point out an example for that? Best regards, Timmie From wkerzendorf at googlemail.com Thu Sep 3 19:02:58 2009 From: wkerzendorf at googlemail.com (Wolfgang Kerzendorf) Date: Fri, 4 Sep 2009 01:02:58 +0200 Subject: [Numpy-discussion] snow leopard issues with numpy In-Reply-To: References: <1324DD89-AE68-4B9E-BE82-339006831B86@gmail.com> Message-ID: <5B4C377A-AA24-431E-854D-C24A836D709E@gmail.com> my version of python is the one that comes with snow leopard: 2.6.1 hope that helps On 03/09/2009, at 18:13 , Charles R Harris wrote: > > > On Thu, Sep 3, 2009 at 10:02 AM, Wolfgang Kerzendorf > wrote: > I just installed numpy and scipy (both svn) on OS X 10.6 and just got > scipy to work with Robert Kern's help. Playing around with numpy I got > the following segfault: > http://pastebin.com/m35220dbf > I hope someone can make sense of it. Thanks in advance > Wolfgang > > I'm going to guess it is a python problem. Which version of python > do you have and where did it come from? 
> > Chuck > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From cournape at gmail.com Thu Sep 3 23:49:48 2009 From: cournape at gmail.com (David Cournapeau) Date: Fri, 4 Sep 2009 12:49:48 +0900 Subject: [Numpy-discussion] snow leopard issues with numpy In-Reply-To: References: <1324DD89-AE68-4B9E-BE82-339006831B86@gmail.com> <3d375d730909030939g3718ab91w8ee5477903f97a59@mail.gmail.com> Message-ID: <5b8d13220909032049n506c69ffrf06a189d7dd9e55d@mail.gmail.com> On Fri, Sep 4, 2009 at 1:49 AM, Charles R Harris wrote: > > > On Thu, Sep 3, 2009 at 10:39 AM, Robert Kern wrote: >> >> On Thu, Sep 3, 2009 at 11:13, Charles R Harris >> wrote: >> > >> > >> > On Thu, Sep 3, 2009 at 10:02 AM, Wolfgang Kerzendorf >> > wrote: >> >> >> >> I just installed numpy and scipy (both svn) on OS X 10.6 and just got >> >> scipy to work with Robert Kern's help. Playing around with numpy I got >> >> the following segfault: >> >> http://pastebin.com/m35220dbf >> >> I hope someone can make sense of it. Thanks in advance >> >> ? ? ?Wolfgang >> > >> > I'm going to guess it is a python problem. Which version of python do >> > you >> > have and where did it come from? >> >> Or a matplotlib problem. _path.so is theirs. >> > > Likely, then. I had to recompile matplotlib from svn when I upgraded to > Fedora 11. Looks like C++ code, and snow leopard has gcc 4.2, which is likely to have some subtle ABI incompatibilities with 4.0 (the one from leopoard). So yes, a matplotlib rebuild would be the first thing to try. cheers, David From chad.netzer at gmail.com Fri Sep 4 02:05:47 2009 From: chad.netzer at gmail.com (Chad Netzer) Date: Thu, 3 Sep 2009 23:05:47 -0700 Subject: [Numpy-discussion] snow leopard issues with numpy In-Reply-To: <5B4C377A-AA24-431E-854D-C24A836D709E@gmail.com> References: <1324DD89-AE68-4B9E-BE82-339006831B86@gmail.com> <5B4C377A-AA24-431E-854D-C24A836D709E@gmail.com> Message-ID: On Thu, Sep 3, 2009 at 4:02 PM, Wolfgang Kerzendorf wrote: > my version of python is the one that comes with snow leopard: 2.6.1 > hope that helps Huh? I upgraded to Snow Leopard over my Leopard system (i.e not a fresh install), and my default is python2.5: $ python Python 2.5 (r25:51918, Sep 19 2006, 08:49:13) [GCC 4.0.1 (Apple Computer, Inc. build 5341)] on darwin Hmmm, also that looks very old... [snooping] Okay, so it seems I probably have an old Mac Python installed in /Library/Frameworks, that perhaps was overriding the new Snow Leopard python? $ type -a python python is /Library/Frameworks/Python.framework/Versions/Current/bin/python python is /usr/bin/python $ ls -l /Library/Frameworks/Python.framework/Versions total 8 drwxr-xr-x 9 root admin 306 Sep 5 2006 2.4/ drwxrwxr-x 10 root admin 340 Dec 11 2006 2.5/ lrwxr-xr-x 1 root admin 3 Apr 5 16:53 Current@ -> 2.5 Geez... I think I was using the MacPorts python, which I deleted after installing SnowLeopard, and that exposed some old still installed frameworks. $ sudo rm /Library/Frameworks/Python.framework/Versions/Current/bin/python $ type -a python python is /usr/bin/python $ hash python $ python Python 2.6.1 (r261:67515, Jul 7 2009, 23:51:51) [GCC 4.2.1 (Apple Inc. build 5646)] on darwin Type "help", "copyright", "credits" or "license" for more information. >>> Yep, got rid of the old framework executable, and now it finds the system python executable. 
Man, there is a lot of old cruft of various sort in my /Library/Frameworks. So, note to those upgrading to Snow Leopard, look in your /Library?Frameworks folder for old just that may still be overriding the newer /System/Library/Frameworks stuff: $ ls -l /System/Library/Frameworks/Python.framework/Versions/ total 8 drwxr-xr-x 8 root wheel 272 Aug 28 14:33 2.3/ drwxr-xr-x 12 root wheel 408 Aug 28 15:03 2.5/ drwxr-xr-x 12 root wheel 408 Aug 28 15:03 2.6/ lrwxr-xr-x 1 root wheel 3 Aug 28 14:33 Current@ -> 2.6 I'd list my whole /Library/Frameworks folder as an example, but its embarrassingly ancient and crufty. :) Hope this helps someone. -C From david at ar.media.kyoto-u.ac.jp Fri Sep 4 01:52:15 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Fri, 04 Sep 2009 14:52:15 +0900 Subject: [Numpy-discussion] snow leopard issues with numpy In-Reply-To: References: <1324DD89-AE68-4B9E-BE82-339006831B86@gmail.com> <5B4C377A-AA24-431E-854D-C24A836D709E@gmail.com> Message-ID: <4AA0AB0F.3040606@ar.media.kyoto-u.ac.jp> Chad Netzer wrote: > On Thu, Sep 3, 2009 at 4:02 PM, Wolfgang > Kerzendorf wrote: > >> my version of python is the one that comes with snow leopard: 2.6.1 >> hope that helps >> > > Huh? I upgraded to Snow Leopard over my Leopard system (i.e not a > fresh install), and my default is python2.5: > > $ python > Python 2.5 (r25:51918, Sep 19 2006, 08:49:13) > [GCC 4.0.1 (Apple Computer, Inc. build 5341)] on darwin > > Hmmm, also that looks very old... > > [snooping] > > Okay, so it seems I probably have an old Mac Python installed in > /Library/Frameworks, that perhaps was overriding the new Snow Leopard > python? > > $ type -a python > python is /Library/Frameworks/Python.framework/Versions/Current/bin/python > python is /usr/bin/python > > $ ls -l /Library/Frameworks/Python.framework/Versions > total 8 > drwxr-xr-x 9 root admin 306 Sep 5 2006 2.4/ > drwxrwxr-x 10 root admin 340 Dec 11 2006 2.5/ > lrwxr-xr-x 1 root admin 3 Apr 5 16:53 Current@ -> 2.5 > > Geez... I think I was using the MacPorts python, which I deleted after > installing SnowLeopard, and that exposed some old still installed > frameworks. > > $ sudo rm /Library/Frameworks/Python.framework/Versions/Current/bin/python > > $ type -a python > python is /usr/bin/python > $ hash python > $ python > Python 2.6.1 (r261:67515, Jul 7 2009, 23:51:51) > [GCC 4.2.1 (Apple Inc. build 5646)] on darwin > Type "help", "copyright", "credits" or "license" for more information. > > > Yep, got rid of the old framework executable, and now it finds the > system python executable. > > Man, there is a lot of old cruft of various sort in my > /Library/Frameworks. So, note to those upgrading to Snow Leopard, > look in your /Library?Frameworks folder for old just that may still be > overriding the newer /System/Library/Frameworks stuff: Concerning python on snow Leopard, two things will cause quite a few issues and I expect a few emails on the ML in the next few weeks :): - the system python is now 64 bits - gcc now targets x86_64 by default (i.e. without -arch). Apple seems to push 64 bits quite strongly in Snow Leopard, with almost every app being 64 bits on my system except the kernel of course. OTOH, this should make numpy/scipy more easily available for 64 bits on mac os x, assuming the changes to python itself will be integrated upstream (I doubt Apple made an effort to send patches upstream themselves). 
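A small sketch of how to tell whether a given interpreter is running in 64-bit mode (and therefore whether extensions need to be built as x86_64); the check is generic rather than specific to Apple's build:

import struct, sys

print struct.calcsize('P') * 8   # pointer width of the running process: 64 -> x86_64, 32 -> i386
print sys.maxint                 # 9223372036854775807 on a 64-bit build, 2147483647 on 32-bit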
cheers, David From wkerzendorf at googlemail.com Fri Sep 4 05:59:07 2009 From: wkerzendorf at googlemail.com (Wolfgang Kerzendorf) Date: Fri, 4 Sep 2009 11:59:07 +0200 Subject: [Numpy-discussion] snow leopard issues with numpy In-Reply-To: <4AA0AB0F.3040606@ar.media.kyoto-u.ac.jp> References: <1324DD89-AE68-4B9E-BE82-339006831B86@gmail.com> <5B4C377A-AA24-431E-854D-C24A836D709E@gmail.com> <4AA0AB0F.3040606@ar.media.kyoto-u.ac.jp> Message-ID: Dear all, A recompile of matplotlib (Svn) did the trick. Thanks for the help. I had issues with building scipy and numpy and Robert Kern helped me a lot there. I think it would be useful in general if numpy and scipy recommend compilers for OS X (perhaps on the download page for numpy and scipy). I use the c compiler from hpc.sourceforge.net and use the gfortran compiler from http://r.research.att.com/tools/ (recommended by Robert Kern) and they seem to work. Thanks again for the suggestions, Wolfgang On 04/09/2009, at 7:52 , David Cournapeau wrote: > Chad Netzer wrote: >> On Thu, Sep 3, 2009 at 4:02 PM, Wolfgang >> Kerzendorf wrote: >> >>> my version of python is the one that comes with snow leopard: 2.6.1 >>> hope that helps >>> >> >> Huh? I upgraded to Snow Leopard over my Leopard system (i.e not a >> fresh install), and my default is python2.5: >> >> $ python >> Python 2.5 (r25:51918, Sep 19 2006, 08:49:13) >> [GCC 4.0.1 (Apple Computer, Inc. build 5341)] on darwin >> >> Hmmm, also that looks very old... >> >> [snooping] >> >> Okay, so it seems I probably have an old Mac Python installed in >> /Library/Frameworks, that perhaps was overriding the new Snow Leopard >> python? >> >> $ type -a python >> python is /Library/Frameworks/Python.framework/Versions/Current/bin/ >> python >> python is /usr/bin/python >> >> $ ls -l /Library/Frameworks/Python.framework/Versions >> total 8 >> drwxr-xr-x 9 root admin 306 Sep 5 2006 2.4/ >> drwxrwxr-x 10 root admin 340 Dec 11 2006 2.5/ >> lrwxr-xr-x 1 root admin 3 Apr 5 16:53 Current@ -> 2.5 >> >> Geez... I think I was using the MacPorts python, which I deleted >> after >> installing SnowLeopard, and that exposed some old still installed >> frameworks. >> >> $ sudo rm /Library/Frameworks/Python.framework/Versions/Current/bin/ >> python >> >> $ type -a python >> python is /usr/bin/python >> $ hash python >> $ python >> Python 2.6.1 (r261:67515, Jul 7 2009, 23:51:51) >> [GCC 4.2.1 (Apple Inc. build 5646)] on darwin >> Type "help", "copyright", "credits" or "license" for more >> information. >> >> >> Yep, got rid of the old framework executable, and now it finds the >> system python executable. >> >> Man, there is a lot of old cruft of various sort in my >> /Library/Frameworks. So, note to those upgrading to Snow Leopard, >> look in your /Library?Frameworks folder for old just that may still >> be >> overriding the newer /System/Library/Frameworks stuff: > > Concerning python on snow Leopard, two things will cause quite a few > issues and I expect a few emails on the ML in the next few weeks :): > - the system python is now 64 bits > - gcc now targets x86_64 by default (i.e. without -arch). > > Apple seems to push 64 bits quite strongly in Snow Leopard, with > almost > every app being 64 bits on my system except the kernel of course. > OTOH, > this should make numpy/scipy more easily available for 64 bits on mac > os x, assuming the changes to python itself will be integrated > upstream > (I doubt Apple made an effort to send patches upstream themselves). 
> > cheers, > > David > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From denis-bz-py at t-online.de Fri Sep 4 11:51:23 2009 From: denis-bz-py at t-online.de (denis bzowy) Date: Fri, 4 Sep 2009 15:51:23 +0000 (UTC) Subject: [Numpy-discussion] adaptive interpolation on a regular 2d grid References: <3d375d730908311552r399ed62ah4f482b14b4e89b88@mail.gmail.com> <3d375d730909020938p6270876ka26aa01b708a1cc4@mail.gmail.com> Message-ID: Robert Kern gmail.com> writes: > http://svn.scipy.org/svn/scikits/trunk/delaunay/scikits/delaunay/testfuncs.py Thank you Robert, that looks nice. I've put 1d adalin1.py in http://drop.io/denis_adalin ; have 2d, but can someone please comment on {content, style, direction} of this simple 1d (150 lines w __doc__) ? The adaptive splitting is now all done first, before interpolation, so that one could feed other interpolators. From dwf at cs.toronto.edu Fri Sep 4 15:36:32 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Fri, 4 Sep 2009 15:36:32 -0400 Subject: [Numpy-discussion] fwrite() failure in PyArray_ToFile Message-ID: A friend of mine is trying to save approx 2GB of float32s with np.save, and it's been failing. I traced it to PyArray_ToFile in core/ src/convert.c: Traceback (most recent call last): File "preprocessTIMIT.py", line 302, in main() File "preprocessTIMIT.py", line 299, in main num.save("/ais/gobi1/gdahl/speech/%s" % k, d[k]) File "/nobackup/murray/bin/pylab-02/x86_64_GenuineIntel_6.06/ Python-2.5.4/lib/python2.5/site-packages/numpy/lib/io.py", line 241, in save format.write_array(fid, arr) File "/nobackup/murray/bin/pylab-02/x86_64_GenuineIntel_6.06/ Python-2.5.4/lib/python2.5/site-packages/numpy/lib/format.py", line 322, in write_array array.tofile(fp) ValueError: 541110272 requested and 2028 written This is writing to a local ext3 filesystem. Apparently fwrite() is giving back values in the low thousands instead of the correct amount. Has anyone encountered this, or know why it would happen? The values are 1004 and 2028 that we seem to keep seeing, which as far as I can tell are not error codes. Thanks, David From dwf at cs.toronto.edu Fri Sep 4 15:56:46 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Fri, 4 Sep 2009 15:56:46 -0400 Subject: [Numpy-discussion] fwrite() failure in PyArray_ToFile In-Reply-To: References: Message-ID: On 4-Sep-09, at 3:36 PM, David Warde-Farley wrote: > A friend of mine is trying to save approx 2GB of float32s with > np.save, and it's been failing. I traced it to PyArray_ToFile in core/ > src/convert.c: That should be core/src/multiarray/convert.c, sorry. The system it's running on is: Linux cluster40 2.6.15-51-amd64-generic #1 SMP PREEMPT Tue Feb 12 16:56:43 UTC 2008 x86_64 GNU/Linux if that helps at all. David From charlesr.harris at gmail.com Fri Sep 4 16:23:45 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 4 Sep 2009 14:23:45 -0600 Subject: [Numpy-discussion] fwrite() failure in PyArray_ToFile In-Reply-To: References: Message-ID: On Fri, Sep 4, 2009 at 1:36 PM, David Warde-Farley wrote: > A friend of mine is trying to save approx 2GB of float32s with > np.save, and it's been failing. 
I traced it to PyArray_ToFile in core/ > src/convert.c: > > Traceback (most recent call last): > File "preprocessTIMIT.py", line 302, in > main() > File "preprocessTIMIT.py", line 299, in main > num.save("/ais/gobi1/gdahl/speech/%s" % k, d[k]) > File "/nobackup/murray/bin/pylab-02/x86_64_GenuineIntel_6.06/ > Python-2.5.4/lib/python2.5/site-packages/numpy/lib/io.py", line 241, > in save > format.write_array(fid, arr) > File "/nobackup/murray/bin/pylab-02/x86_64_GenuineIntel_6.06/ > Python-2.5.4/lib/python2.5/site-packages/numpy/lib/format.py", line > 322, in write_array > array.tofile(fp) > ValueError: 541110272 requested and 2028 written > > This is writing to a local ext3 filesystem. Apparently fwrite() is > giving back values in the low thousands instead of the correct amount. > > Has anyone encountered this, or know why it would happen? The values > are 1004 and 2028 that we seem to keep seeing, which as far as I can > tell are not error codes. > > The odd values might be from the format code in the error message: PyErr_Format(PyExc_ValueError, "%ld requested and %ld written", (long) size, (long) n); The code that is immediately responsible for the write is in lines 79-92 of convert.c. You could do a bit of poking around in there to find out what is happening. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Fri Sep 4 17:01:42 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 4 Sep 2009 15:01:42 -0600 Subject: [Numpy-discussion] fwrite() failure in PyArray_ToFile In-Reply-To: References: Message-ID: On Fri, Sep 4, 2009 at 1:36 PM, David Warde-Farley wrote: > A friend of mine is trying to save approx 2GB of float32s with > np.save, and it's been failing. I traced it to PyArray_ToFile in core/ > src/convert.c: > > Traceback (most recent call last): > File "preprocessTIMIT.py", line 302, in > main() > File "preprocessTIMIT.py", line 299, in main > num.save("/ais/gobi1/gdahl/speech/%s" % k, d[k]) > File "/nobackup/murray/bin/pylab-02/x86_64_GenuineIntel_6.06/ > Python-2.5.4/lib/python2.5/site-packages/numpy/lib/io.py", line 241, > in save > format.write_array(fid, arr) > File "/nobackup/murray/bin/pylab-02/x86_64_GenuineIntel_6.06/ > Python-2.5.4/lib/python2.5/site-packages/numpy/lib/format.py", line > 322, in write_array > array.tofile(fp) > ValueError: 541110272 requested and 2028 written > > This is writing to a local ext3 filesystem. Apparently fwrite() is > giving back values in the low thousands instead of the correct amount. > > Has anyone encountered this, or know why it would happen? The values > are 1004 and 2028 that we seem to keep seeing, which as far as I can > tell are not error codes. > Oh and is the Python python here 64 bits or 32 bits? What does file `which python` say? Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From gokhansever at gmail.com Fri Sep 4 17:07:23 2009 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_Sever?=) Date: Fri, 4 Sep 2009 16:07:23 -0500 Subject: [Numpy-discussion] Fastest way to parsing a specific binay file In-Reply-To: <3d375d730909022222t1e9ea6f4q693fcf9e7a60ab10@mail.gmail.com> References: <49d6b3500909020738g53befd6ey4a6af8e269162510@mail.gmail.com> <4A9E908C.1070205@molden.no> <49d6b3500909020953v262b832ajaac5fdba02f8fb05@mail.gmail.com> <3DA3B328CBC48B4EBB88484B8A5EA19106AF9E98@sernt14.essex.ac.uk> <49d6b3500909021027w29159135tbb5bfe47a48d26c@mail.gmail.com> <3d375d730909021029l9a3be85wb850421217da5ef9@mail.gmail.com> <49d6b3500909021128sfe82fd9iaf95886b6263340a@mail.gmail.com> <3d375d730909021158i79ec0551h3eca28878a0d405c@mail.gmail.com> <49d6b3500909022159m628262edr3cd3be17c12d9a19@mail.gmail.com> <3d375d730909022222t1e9ea6f4q693fcf9e7a60ab10@mail.gmail.com> Message-ID: <49d6b3500909041407k79b4c9fbtee3df2f7320e8779@mail.gmail.com> On Thu, Sep 3, 2009 at 12:22 AM, Robert Kern wrote: > On Wed, Sep 2, 2009 at 23:59, G?khan Sever wrote: > > > Robert, > > > > You must have thrown a couple RTFM's while replying my emails :) > > Not really. There's no manual for this. Greg Wilson's _Data Crunching_ > may be a good general introduction to how to think about these > problems. > > http://www.pragprog.com/titles/gwd/data-crunching > > > I usually > > take trial-error approaches initially, and don't give up unless I hit a > > hurdle so fast, which in this case resulted with the unsuccessful regex > > approach. However from the good point I have learnt the basics of regular > > expressions and realized how powerful could they be during a text parsing > > task. > > > > Enough prattle, below is what I am working on: > > > > So far I was successfully able to extract the file names and the data > > associated with those names (with the exception of multiple buffer per > file > > cases). > > > > However not reading time increments correctly, I should be seeing 1 sec > > incremental time ticks from the time segment reading, but all it does is > to > > return the same first time information. > > > > Furthermore, I still couldn't figure out how to wrap the main looping > suite > > (range(500) is just a dummy number which will let me process whole binary > > data) I don't know yet how to make the range input generic which will > work > > any size of similar binary file. > > while True: > ... > > if no_more_data(): > break > > > import numpy as np > > import struct > > > > f = open('test.sea', 'rb') > > > > dt = np.dtype([('tagNumber', np.uint16), ('dataOffset', np.uint16), > > ('numberBytes', np.uint16), ('samples', np.uint16), ('bytesPerSample', > > np.uint16), ('type', np.uint8), ('param1', np.uint8), ('param2', > > np.uint8), ('param3', np.uint8), ('address', np.uint16)]) > > > > > > start = 0 > > ct = 0 > > > > for i in range(500): > > > > header = np.fromstring(f.read(dt.itemsize), dt)[0] > > > > if header['tagNumber'] == 65530: > > loc = f.tell() > > f.seek(start + header['dataOffset']) > > f.read(header['numberBytes']) > > Presumably you are doing something with this data, not just discarding it. > > > f.seek(loc) > > This should be f.seek(loc, 0). f.seek(nbytes) is to seek forward from > the current position by nbytes. The 0 tells it to start from the > beginning. > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." 
> -- Umberto Eco > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > Thanks for the suggestions, and sorry for the late reply. I was trying to stay offline and read some dissertations. I'm getting somewhere on the code. I fixed the generic reading case with a try-except block on ValueError, and so far it works fine. As seen below, I was able to read the time and a specific data record, in this case the Cloud Condensation Nuclei data recorded on the acquisition system. However, there are still a few oddities in the behaviour. Take the following print-out for example: 1344416 (18000, 84, 110, 1, 256, 37, 10, 0, 0, 61441) 1344468 H,13:09:51,0.59,0.00,28.43,32.65,36.26,26.60,29.54,38.12,27.98,45.01,453.77,426.25,76.25,0.14,9.35,3.34,0.00 It is supposed to print the correct data when I seek the cursor to 1344416+84 and read 110 more characters, but it doesn't work that way. To make it work correctly I have to seek 52 instead of 84 (I compensate for this with (f.tell() - dt.itemsize), which puts me exactly where I want to be). #################################################################################################################################### The other point I discovered while studying the contents of this binary file format and the existing IDL code is that there are many similar lines of code recurring across scripts, as well as outright code replicas. This is because the current postprocessing tool suite has been configured to work on many different field campaigns. That is to say, the existing code base can process binary data from a lab setup, or from a campaign in Mali or Saudi Arabia, by following pre-defined processing scripts. Each campaign has its own specific configuration, kept in the config files I mentioned earlier: which instrument was connected to the system, on what port, the type of communication (serial or analog), the sampling rate, etc. Instead of taking this path, I can parse those config files and build a very generic postprocessing script suite that is independent of the campaign, based on the fact that all of the config files are embedded in the binary file. #################################################################################################################################### I should have mentioned earlier that my primary intention is to unify our ADPAA kit using Python. https://www.ohloh.net/p/adpaa/analyses/latest As ohloh's stats show, it comes to roughly 170k lines of code in 6-8 different languages, with IDL as the master language. Currently all the code is written in a procedural, linear fashion, and most of the tasks are interconnected: in order to create a higher-level analysis file, the lower stages in the processing hierarchy must be executed first. I don't know how much we would gain by writing these parts in an object-oriented way, but it would definitely give us a neater processing suite with less repeated code. I still don't quite get the pros and cons of basing the design on Traits; I might ask your opinions on integrating a Traits-based design into this type of project. As a very early estimate, I am thinking of writing 5 to 10 times less code than what has been written up to now. That said, I have less than a year to finish my degree here, and there are very many field papers and books to read. Good day. 
#!/usr/bin/env python import numpy as np import struct f = open('test.sea', 'rb') dt = np.dtype([('tagNumber', np.uint16), ('dataOffset', np.uint16), ('numberBytes', np.uint16), ('samples', np.uint16), ('bytesPerSample', np.uint16), ('type', np.uint8), ('param1', np.uint8), ('param2', np.uint8), ('param3', np.uint8), ('address', np.uint16)]) start = 0 while True: try: header = np.fromstring(f.read(dt.itemsize), dt)[0] ### Read Time if header['tagNumber'] == 0 and header['address'] == 43605: start = f.tell()-dt.itemsize loc = f.tell() #start = f.tell() print f.tell() print header f.seek(start + header['dataOffset']) print f.tell() print struct.unpack('9H', f.read(18)) print struct.unpack('9H', f.read(18)) f.seek(loc, 0) # Read DMT-CCNC data if header['tagNumber'] == 18000 and header['type'] == 37: start = f.tell()-dt.itemsize*2 loc = f.tell() print f.tell() print header f.seek(start + header['dataOffset']) print f.tell() print f.read(header['numberBytes']) print f.tell() f.seek(loc,0) except ValueError: break -- G?khan -------------- next part -------------- An HTML attachment was scrubbed... URL: From dwf at cs.toronto.edu Fri Sep 4 17:48:26 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Fri, 4 Sep 2009 17:48:26 -0400 Subject: [Numpy-discussion] fwrite() failure in PyArray_ToFile In-Reply-To: References: Message-ID: On 4-Sep-09, at 5:01 PM, Charles R Harris wrote: > Oh and is the Python python here 64 bits or 32 bits? What does > > file `which python` > > say? I'm fairly sure it's a 64-bit python, but I don't have access to the machine to check. I'll double check with the person who does. David From dwf at cs.toronto.edu Fri Sep 4 17:54:47 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Fri, 4 Sep 2009 17:54:47 -0400 Subject: [Numpy-discussion] fwrite() failure in PyArray_ToFile In-Reply-To: References: Message-ID: On 4-Sep-09, at 4:23 PM, Charles R Harris wrote: > The odd values might be from the format code in the error message: > > PyErr_Format(PyExc_ValueError, > "%ld requested and %ld written", > (long) size, (long) n); > > The code that is immediately responsible for the write is in lines > 79-92 of > convert.c. You could do a bit of poking around in there to find out > what is > happening. Yes, I saw that. My C is rusty, but wouldn't the cast take care of it? n is of type size_t, which is pretty big, and a cast to long shouldn't be an issue. And if (hopefully) PyErr_Format is correct... At any rate, I am without an account on those machines, so I will have to try and reproduce it elsewhere, or else do this in an extremely annoying fashion by proxy. I should've mentioned this is with the 1.3.0 release, not that it's changed since. David From charlesr.harris at gmail.com Fri Sep 4 18:13:31 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 4 Sep 2009 16:13:31 -0600 Subject: [Numpy-discussion] fwrite() failure in PyArray_ToFile In-Reply-To: References: Message-ID: On Fri, Sep 4, 2009 at 3:54 PM, David Warde-Farley wrote: > On 4-Sep-09, at 4:23 PM, Charles R Harris wrote: > > > The odd values might be from the format code in the error message: > > > > PyErr_Format(PyExc_ValueError, > > "%ld requested and %ld written", > > (long) size, (long) n); > > > > The code that is immediately responsible for the write is in lines > > 79-92 of > > convert.c. You could do a bit of poking around in there to find out > > what is > > happening. > > > Yes, I saw that. My C is rusty, but wouldn't the cast take care of it? 
> n is of type size_t, which is pretty big, and a cast to long shouldn't > be an issue. And if (hopefully) PyErr_Format is correct... > > Well, I don't have access, but this smells like an overflow problem. It would be nice to know what the actual sizes of the numbers are. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From sturla at molden.no Fri Sep 4 23:22:20 2009 From: sturla at molden.no (Sturla Molden) Date: Sat, 05 Sep 2009 05:22:20 +0200 Subject: [Numpy-discussion] fwrite() failure in PyArray_ToFile In-Reply-To: References: Message-ID: <4AA1D96C.1050801@molden.no> David Warde-Farley skrev: >> The odd values might be from the format code in the error message: >> >> PyErr_Format(PyExc_ValueError, >> "%ld requested and %ld written", >> (long) size, (long) n); >> > > Yes, I saw that. My C is rusty, but wouldn't the cast take care of it? > n is of type size_t, which is pretty big, and a cast to long shouldn't > be an issue. And if (hopefully) PyErr_Format is correct... > A long is (usually) 32 bit, even on 64 bit systems. This means that size is casted to an integer between - 2**31 and 2*31 - 1. As 2**31 bytes are 2 GB, the expression "(long) size" will overflow if a write of 2GB or more failed. Thus you get some bogus numbers in the formatted message. There is thus a bug in the call to PyErr_Format, as it only works reliably on 32 bits systems. S.M. From charlesr.harris at gmail.com Fri Sep 4 23:59:25 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 4 Sep 2009 21:59:25 -0600 Subject: [Numpy-discussion] fwrite() failure in PyArray_ToFile In-Reply-To: <4AA1D96C.1050801@molden.no> References: <4AA1D96C.1050801@molden.no> Message-ID: On Fri, Sep 4, 2009 at 9:22 PM, Sturla Molden wrote: > David Warde-Farley skrev: > >> The odd values might be from the format code in the error message: > >> > >> PyErr_Format(PyExc_ValueError, > >> "%ld requested and %ld written", > >> (long) size, (long) n); > >> > > > > Yes, I saw that. My C is rusty, but wouldn't the cast take care of it? > > n is of type size_t, which is pretty big, and a cast to long shouldn't > > be an issue. And if (hopefully) PyErr_Format is correct... > > > A long is (usually) 32 bit, even on 64 bit systems. This means that size > is casted to an integer between - 2**31 and 2*31 - 1. As 2**31 bytes are > 2 GB, the expression "(long) size" will overflow if a write of 2GB or > more failed. Thus you get some bogus numbers in the formatted message. > There is thus a bug in the call to PyErr_Format, as it only works > reliably on 32 bits systems. > > The size of long depends on the compiler as well as the operating system. On linux x86_64, IIRC, it is 64 bits, on Windows64 I believe it is 32. Ints always seem to be 32 bits. But something funny is definitely going on. It shouldn't be possible to allocate an array bigger than npy_intp, which is a signed number, and the casts to size_t and such should be safe. But in anycase, using long in the print statement is not portable and needs fixing. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From sturla at molden.no Sat Sep 5 00:29:36 2009 From: sturla at molden.no (Sturla Molden) Date: Sat, 05 Sep 2009 06:29:36 +0200 Subject: [Numpy-discussion] fwrite() failure in PyArray_ToFile In-Reply-To: References: <4AA1D96C.1050801@molden.no> Message-ID: <4AA1E930.8030400@molden.no> Charles R Harris skrev: > The size of long depends on the compiler as well as the operating > system. 
On linux x86_64, IIRC, it is 64 bits, on Windows64 I believe > it is 32. Ints always seem to be 32 bits. If I remember the C standard correctly, a long is guaranteed to be at least 32 bits, whereas an int is guaranteed to be at least 16 bits. A short is at least 16 bits and a long long is 64 bits. Then there is intptr_t which is wide enough to store a pointer, but could be wider. A size_t usually has the size of a pointer but not always (on segment and offset architectures they might differ). Considering PEP353, should we be indexing with Py_ssize_t instead of npy_intp? I believe (correct me if I'm wrong) npy_intp is typedef'ed to Py_intptr_t. S.M. From charlesr.harris at gmail.com Sat Sep 5 01:01:30 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 4 Sep 2009 23:01:30 -0600 Subject: [Numpy-discussion] fwrite() failure in PyArray_ToFile In-Reply-To: <4AA1E930.8030400@molden.no> References: <4AA1D96C.1050801@molden.no> <4AA1E930.8030400@molden.no> Message-ID: On Fri, Sep 4, 2009 at 10:29 PM, Sturla Molden wrote: > Charles R Harris skrev: > > The size of long depends on the compiler as well as the operating > > system. On linux x86_64, IIRC, it is 64 bits, on Windows64 I believe > > it is 32. Ints always seem to be 32 bits. > If I remember the C standard correctly, a long is guaranteed to be at > least 32 bits, whereas an int is guaranteed to be at least 16 bits. A > short is at least 16 bits and a long long is 64 bits. Then there is > intptr_t which is wide enough to store a pointer, but could be wider. A > size_t usually has the size of a pointer but not always (on segment and > offset architectures they might differ). Considering PEP353, should we > be indexing with Py_ssize_t instead of npy_intp? I believe (correct me > if I'm wrong) npy_intp is typedef'ed to Py_intptr_t. > > The problem is that Py_ssize_t showed up in Python 2.5 and we support 2.4. For earlier versions folks usually typedef Py_ssize_t to int, which works for python compatibility but would cause problems for us if we used it as the npy_intp type. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From rpg.314 at gmail.com Sat Sep 5 02:14:35 2009 From: rpg.314 at gmail.com (Rohit Garg) Date: Sat, 5 Sep 2009 11:44:35 +0530 Subject: [Numpy-discussion] mixing -fno-exceptions with swig c++ wrappers to python Message-ID: <4d5dd8c20909042314y4ff2d3f7od9820aeed4636ec2@mail.gmail.com> Hi, I am using swig to expose a c++ class to Python. I am wondering if it is safe to use the -fno-exceptions option while compiling the wrappers. I am also using the typemaps present in the numpy.i file that comes with numpy. Thanks, -- Rohit Garg http://rpg-314.blogspot.com/ Senior Undergraduate Department of Physics Indian Institute of Technology Bombay From mark.wendell at gmail.com Sat Sep 5 13:22:39 2009 From: mark.wendell at gmail.com (Mark Wendell) Date: Sat, 5 Sep 2009 11:22:39 -0600 Subject: [Numpy-discussion] object dtype questions Message-ID: When I create an array with a dtype=object, and instance a custom python class to each array member, have I foregone any opportunity to call methods or get/set attributes on those instances with any of numpy's 'group' operations? In other words, it seems like once I've got a custom object as an array member, the only way to get at it is by accessing it individually, by looping through each member of the array one at a time. For example: Say that C is a simple python class with a couple attributes and methods. 
a = np.empty( (5,5), dtype=object) for i in range(5): for j in range(5): a[i,j] = C(var1,var2) First question: is there a quicker way than above to create unique instances of C for each element of a? for i in range(5): for j in range(5): a[i,j].myMethod(var3,var4) print a[i,j].attribute1 Again, is there a quicker way than above to call myMethod or access attribute1? Thanks in advance. I've been searching through the docs for examples of the use of custom python objects, but haven't found much. Mark -- -- Mark Wendell From cournape at gmail.com Sat Sep 5 19:20:53 2009 From: cournape at gmail.com (David Cournapeau) Date: Sun, 6 Sep 2009 08:20:53 +0900 Subject: [Numpy-discussion] fwrite() failure in PyArray_ToFile In-Reply-To: References: <4AA1D96C.1050801@molden.no> <4AA1E930.8030400@molden.no> Message-ID: <5b8d13220909051620r402e979bo154c71b473a179a4@mail.gmail.com> On Sat, Sep 5, 2009 at 2:01 PM, Charles R Harris wrote: > > > On Fri, Sep 4, 2009 at 10:29 PM, Sturla Molden wrote: >> >> Charles R Harris skrev: >> > The size of long depends on the compiler as well as the operating >> > system. On linux x86_64, IIRC, it is 64 bits, on Windows64 I believe >> > it is 32. Ints always seem to be 32 bits. >> If I remember the C standard correctly, a long is guaranteed to be at >> least 32 bits, whereas an int is guaranteed to be at least 16 bits. A >> short is at least 16 bits and a long long is 64 bits. Then there is >> intptr_t which is wide enough to store a pointer, but could be wider. A >> size_t usually has the size of a pointer but not always (on segment and >> offset architectures they might differ). Considering PEP353, should we >> be indexing with Py_ssize_t instead of npy_intp? I believe (correct me >> if I'm wrong) npy_intp is typedef'ed to Py_intptr_t. >> > > The problem is that Py_ssize_t showed up in Python 2.5 and we support 2.4. > For earlier versions folks usually typedef Py_ssize_t to int, which works > for python compatibility but would cause problems for us if we used it as > the npy_intp type. We could use our own npy_ssize_t, though. I find npy_intp for indexing quite confusing. I have started a patch toward this during the scipy09 sprints. There are a lot of bogus loops in numpy and scipy where the indexing loop is an int BTW. For the formatting, we should have our own formatting macro to deal with printing those - I have meant to implement them, but it is a bit of work to do it correctly as a printf format for size_t only appeared in C99 (the usual workaround to use usigned long is wrong on windows 64). cheers, David From dwf at cs.toronto.edu Sat Sep 5 20:40:15 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Sat, 5 Sep 2009 20:40:15 -0400 Subject: [Numpy-discussion] fwrite() failure in PyArray_ToFile In-Reply-To: References: <4AA1D96C.1050801@molden.no> Message-ID: <84380C05-17E2-4AFE-B707-9945CA1AD689@cs.toronto.edu> On 4-Sep-09, at 11:59 PM, Charles R Harris wrote: > The size of long depends on the compiler as well as the operating > system. On linux x86_64, IIRC, it is 64 bits, on Windows64 I believe > it is 32. Ints always seem to be 32 bits. But something funny is > definitely going on. It shouldn't be possible to allocate an array > bigger than npy_intp, which is a signed number, and the casts to > size_t and such should be safe. But in anycase, using long in the > print statement is not portable and needs fixing. I got him to run a check on that machine and you're correct, sizeof(long) == 8. 
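As a small illustration of the wraparound Sturla describes (it only bites on platforms where a long is 32 bits, e.g. 64-bit Windows as Chuck notes; the byte count below is purely hypothetical):

nbytes = 3 * 2**30              # a hypothetical 3 GB request
wrapped = nbytes & 0xFFFFFFFF   # keep only the low 32 bits, as a 32-bit long would
if wrapped >= 2**31:
    wrapped -= 2**32            # reinterpret that bit pattern as a signed value
print nbytes, '->', wrapped     # 3221225472 -> -1073741824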
David From eadrogue at gmx.net Sun Sep 6 07:32:52 2009 From: eadrogue at gmx.net (Ernest =?iso-8859-1?Q?Adrogu=E9?=) Date: Sun, 6 Sep 2009 13:32:52 +0200 Subject: [Numpy-discussion] object dtype questions In-Reply-To: References: Message-ID: <20090906113252.GA23392@doriath.local> 5/09/09 @ 11:22 (-0600), thus spake Mark Wendell: > For example: > > Say that C is a simple python class with a couple attributes and methods. > > a = np.empty( (5,5), dtype=object) > for i in range(5): > for j in range(5): > a[i,j] = C(var1,var2) > > First question: is there a quicker way than above to create unique > instances of C for each element of a? You achieve the same with a=np.array([C(var1, var2) for i in range(25)], dtype=object).reshape((5,5)) but it takes about the same time in my computer. > for i in range(5): > for j in range(5): > a[i,j].myMethod(var3,var4) > print a[i,j].attribute1 > > Again, is there a quicker way than above to call myMethod or access attribute1? I think you can use a ufunc: def foo(x): x.myMethod(var3,var4) print x.attribute1 ufoo = np.frompyfunc(foo, 1, 0) ufoo(a) Don't know if it is any faster though. -- Ernest From sturla at molden.no Sun Sep 6 08:33:52 2009 From: sturla at molden.no (Sturla Molden) Date: Sun, 06 Sep 2009 14:33:52 +0200 Subject: [Numpy-discussion] object dtype questions In-Reply-To: References: Message-ID: <4AA3AC30.402@molden.no> Mark Wendell skrev: > for i in range(5): > for j in range(5): > a[i,j].myMethod(var3,var4) > print a[i,j].attribute1 > > Again, is there a quicker way than above to call myMethod or access attribute1 One option is to look up the name of the method unbound, and then use built-in function map. map( cls.myMethod, a ) is similar to: [aa.myMethod() for aa in a] Using map avoids looking up the name 'myMethod' for each instance, it is done only once. map can be a lot faster or hardly faster at all, depending on the amount of work done by the function. The more work being performed, the less you benefit from map. You can pass in variables (var3,var4) by giving additional sequences to map. S.M. From aisaac at american.edu Sun Sep 6 10:12:07 2009 From: aisaac at american.edu (Alan G Isaac) Date: Sun, 06 Sep 2009 10:12:07 -0400 Subject: [Numpy-discussion] object dtype questions In-Reply-To: <4AA3AC30.402@molden.no> References: <4AA3AC30.402@molden.no> Message-ID: <4AA3C337.8020100@american.edu> On 9/6/2009 8:33 AM, Sturla Molden wrote: > map( cls.myMethod, a ) > > is similar to: > > [aa.myMethod() for aa in a] http://article.gmane.org/gmane.comp.python.general/630847 fwiw, Alan Isaac From sturla at molden.no Sun Sep 6 17:12:31 2009 From: sturla at molden.no (Sturla Molden) Date: Sun, 06 Sep 2009 23:12:31 +0200 Subject: [Numpy-discussion] object dtype questions In-Reply-To: <4AA3C337.8020100@american.edu> References: <4AA3AC30.402@molden.no> <4AA3C337.8020100@american.edu> Message-ID: <4AA425BF.40800@molden.no> Alan G Isaac skrev: > http://article.gmane.org/gmane.comp.python.general/630847 > Yes, but here you still have to look up the name 'f' from locals in each iteration. map is written in C, once it has as PyObject* to the callable it does not need to look up the name anymore. The dictionary lookup(s) to get the callable is done just once. map is also more thread-safe. During the iteration another thread could rebind the name of the callable, but map is impervious to that. S.M. 
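Putting the map suggestion together, a minimal sketch (C, var3 and var4 are stand-ins for the class and arguments from the earlier example) that calls a method on every object element without the explicit double loop:

import numpy as np

class C(object):
    def __init__(self, var1, var2):
        self.attribute1 = var1 + var2
    def myMethod(self, var3, var4):
        self.attribute1 += var3 * var4

a = np.array([C(1, 2) for i in range(25)], dtype=object).reshape(5, 5)

flat = a.ravel()                     # 1-d view over the same 25 objects
n = flat.size
# C.myMethod is looked up once; the extra sequences supply var3 and var4.
map(C.myMethod, flat, [3] * n, [4] * n)
print flat[0].attribute1             # 15 == (1 + 2) + 3 * 4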
From pinto at mit.edu Sun Sep 6 19:38:57 2009 From: pinto at mit.edu (Nicolas Pinto) Date: Sun, 6 Sep 2009 19:38:57 -0400 Subject: [Numpy-discussion] problem with numpy.distutils and Cython In-Reply-To: References: <954ae5aa0908230027h956568h585e7854bc05b8a7@mail.gmail.com> <3d375d730908230034j1b9e8370x7d2dbbe283fd988b@mail.gmail.com> Message-ID: <954ae5aa0909061638r305ce0dfib3b25adde6a7dd4d@mail.gmail.com> Thanks Lisandro, it worked perfectly! On Sun, Aug 23, 2009 at 6:30 PM, Lisandro Dalcin wrote: > The monkeypatching below in your setup.py could work. This way, you > just have to use numpy.distutils, but you will not be able to pass > many options to Cython (like C++ code generation?) > > from numpy.distutils.command import build_src > import Cython > import Cython.Compiler.Main > build_src.Pyrex = Cython > build_src.have_pyrex = True > > > On Sun, Aug 23, 2009 at 4:34 AM, Robert Kern wrote: > > On Sun, Aug 23, 2009 at 00:27, Nicolas Pinto > wrote: > >> Hello, > >> > >> I'm trying to use numpy.distutils and Cython in a setup.py but I'm > running > >> into some problems. > >> > >> The following code raises a "AttributeError: fcompiler" when I run > "python > >> setup.py install" (it runs smoothly with "python setup.py build_ext > >> --inplace"): > >> > >> from numpy.distutils.core import setup, Extension > >> from Cython.Distutils import build_ext > >> ext_modules = [Extension("test", ["test.pyx"])] > >> setup(cmdclass = {'build_ext': build_ext}, ext_modules = ext_modules) > >> > >> Whereas the following works in both cases: > >> > >> from distutils.core import setup, Extension > >> from Cython.Distutils import build_ext > >> ext_modules = [Extension("test", ["test.pyx"])] > >> setup(cmdclass = {'build_ext': build_ext}, ext_modules = ext_modules) > >> > >> Am I missing something? > > > > numpy.distutils needs its own build_ext, which you are overriding with > > Cython's. You need one build_ext that does both things. > > > > -- > > Robert Kern > > > > "I have come to believe that the whole world is an enigma, a harmless > > enigma that is made terrible by our own mad attempt to interpret it as > > though it had an underlying truth." > > -- Umberto Eco > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > -- > Lisandro Dalc?n > --------------- > Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) > Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) > Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) > PTLC - G?emes 3450, (3000) Santa Fe, Argentina > Tel/Fax: +54-(0)342-451.1594 > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Nicolas Pinto Ph.D. Candidate, Brain & Computer Sciences Massachusetts Institute of Technology, USA http://web.mit.edu/pinto -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From david at ar.media.kyoto-u.ac.jp Sun Sep 6 23:35:05 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Mon, 07 Sep 2009 12:35:05 +0900 Subject: [Numpy-discussion] mixing -fno-exceptions with swig c++ wrappers to python In-Reply-To: <4d5dd8c20909042314y4ff2d3f7od9820aeed4636ec2@mail.gmail.com> References: <4d5dd8c20909042314y4ff2d3f7od9820aeed4636ec2@mail.gmail.com> Message-ID: <4AA47F69.4010507@ar.media.kyoto-u.ac.jp> Rohit Garg wrote: > Hi, > > I am using swig to expose a c++ class to Python. I am wondering if it > is safe to use the -fno-exceptions option while compiling the > wrappers. I am also using the typemaps present in the numpy.i file > that comes with numpy. > > It will mostly depend on the code you are wrapping and your toolchain. It should not cause trouble w.r.t numpy, as numpy does not use C++ at all. cheers, David From david at ar.media.kyoto-u.ac.jp Mon Sep 7 03:57:10 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Mon, 07 Sep 2009 16:57:10 +0900 Subject: [Numpy-discussion] datetime-related import slowdown Message-ID: <4AA4BCD6.3040304@ar.media.kyoto-u.ac.jp> Hi, I noticed that numpy import times significantly are significantly worse than it used to be, and those are related to recent datetime related changes: # One month ago time python -c "import numpy" -> 141ms # Now: time python -c "import numpy" -> 202ms Using bzr import profiler, most of the slowdown comes from mx_datetime, and I guess all the regex compilation within (each re.compile takes several ms each in there). Is there a way to make this faster, at least for people not using datetime ? cheers, David From gnurser at googlemail.com Mon Sep 7 05:00:38 2009 From: gnurser at googlemail.com (George Nurser) Date: Mon, 7 Sep 2009 10:00:38 +0100 Subject: [Numpy-discussion] numpy/scipy/matplotlib + 10.6 + Apple python 2.6.1 Message-ID: <1d1e6ea70909070200v760212bj64749b67040302d1@mail.gmail.com> There are some interesting instructions on how to make this work at http://blog.hyperjeff.net/?p=160. However I'm not sure that the recommendation to rename the Apple-supplied version of numpy is consistent with previous advice I've seen on this mailing list. --George Nurser. From cournape at gmail.com Mon Sep 7 05:13:01 2009 From: cournape at gmail.com (David Cournapeau) Date: Mon, 7 Sep 2009 18:13:01 +0900 Subject: [Numpy-discussion] numpy/scipy/matplotlib + 10.6 + Apple python 2.6.1 In-Reply-To: <1d1e6ea70909070200v760212bj64749b67040302d1@mail.gmail.com> References: <1d1e6ea70909070200v760212bj64749b67040302d1@mail.gmail.com> Message-ID: <5b8d13220909070213n42e4da0diebe5c20361928f78@mail.gmail.com> On Mon, Sep 7, 2009 at 6:00 PM, George Nurser wrote: > There are some interesting instructions on how to make this work at > http://blog.hyperjeff.net/?p=160. > However I'm not sure that the recommendation to rename the > Apple-supplied version of numpy is consistent with previous advice > I've seen on this mailing list. I think it is best to avoid touching anything in /System. The better solution is to install things locally, at least if you don't need to share with several users one install. With python 2.6, there is actually a very simple way to do so, using the --user option of python. You install everything using the --user option, and this will get installed into your $HOME, and python will look there first. Also, FFTW is not needed by scipy anymore. 
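For reference, the per-user scheme described here is PEP 370, new in Python 2.6; where things end up can be inspected from the interpreter (the paths shown in the comments are typical for a framework build on Mac OS X), and that directory is searched before the system-wide site-packages:

import site, sys

print sys.version[:5]    # the --user option needs Python >= 2.6
print site.USER_BASE     # typically ~/Library/Python/2.6 on Mac OS X
print site.USER_SITE     # packages installed with 'setup.py install --user' land here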
cheers, David From tjhnson at gmail.com Mon Sep 7 05:34:16 2009 From: tjhnson at gmail.com (T J) Date: Mon, 7 Sep 2009 02:34:16 -0700 Subject: [Numpy-discussion] Row-wise dot product? Message-ID: Is there a better way to achieve the following, perhaps without the python for loop? >>> x.shape (10000,3) >>> y.shape (10000,3) >>> z = empty(len(x)) >>> for i in range(10000): ... z[i] = dot(x[i], y[i]) ... From nadavh at visionsense.com Mon Sep 7 05:56:33 2009 From: nadavh at visionsense.com (Nadav Horesh) Date: Mon, 7 Sep 2009 12:56:33 +0300 Subject: [Numpy-discussion] Row-wise dot product? References: Message-ID: <710F2847B0018641891D9A21602763605AD159@ex3.envision.co.il> (x*y).sum(1) Nadav -----????? ??????----- ???: numpy-discussion-bounces at scipy.org ??? T J ????: ? 07-??????-09 12:34 ??: Discussion of Numerical Python ????: [Numpy-discussion] Row-wise dot product? Is there a better way to achieve the following, perhaps without the python for loop? >>> x.shape (10000,3) >>> y.shape (10000,3) >>> z = empty(len(x)) >>> for i in range(10000): ... z[i] = dot(x[i], y[i]) ... _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion From kxroberto at googlemail.com Mon Sep 7 08:11:59 2009 From: kxroberto at googlemail.com (Robert) Date: Mon, 07 Sep 2009 14:11:59 +0200 Subject: [Numpy-discussion] Why is the truth value of ndarray not simply size>0 ? Message-ID: Is there a reason why ndarray truth tests (except scalars) deviates from the convention of other Python iterables list,array.array,str,dict,... ? Furthermore there is a surprising strange exception for arrays with size 1 (!= scalars). I often run into exceptions and unexpected bahavior like shown below, when developing or transcribing from other types to numpy. Robert --- >>> a=np.array([12,4,5]) >>> if a: print 2 ... Traceback (most recent call last): File "", line 1, in ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all() >>> a=np.array([12]) >>> if a: print 2 ... 2 >>> a=np.array([0]) >>> if a: print 2 ... >>> a=[0] >>> if a: print 2 ... 2 >>> a=[] >>> if a: print 2 ... >>> import array >>> a=array.array('i',[12,1,3]) >>> if a: print 2 ... 2 >>> a=array.array('i',[0]) >>> if a: print 2 ... 2 >>> a=array.array('i',[]) >>> if a: print 2 ... >>> bool(np.array(0)) False From nmb at wartburg.edu Mon Sep 7 09:46:35 2009 From: nmb at wartburg.edu (Neil Martinsen-Burrell) Date: Mon, 07 Sep 2009 08:46:35 -0500 Subject: [Numpy-discussion] Why is the truth value of ndarray not simply size>0 ? In-Reply-To: References: Message-ID: <4AA50EBB.5080402@wartburg.edu> On 2009-09-07 07:11 , Robert wrote: > Is there a reason why ndarray truth tests (except scalars) > deviates from the convention of other Python iterables > list,array.array,str,dict,... ? > > Furthermore there is a surprising strange exception for arrays > with size 1 (!= scalars). Historically, numpy's predecessors used "not equal to zero" as the meaning for truth (consistent with numerical types in Python). However, this introduces an ambiguity as both any(a != 0) and all(a != 0) are reasonable interpretations of the truth value of a sequence of numbers. Numpy refuses to guess and raises the exception shown below. For sequences with a single item, there is no ambiguity and numpy does the (numerically) ordinary thing. The ndarray type available in Numpy is not conceptually an extension of Python's iterables. 
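In practice the ambiguity is resolved by stating explicitly which test is meant; a short sketch of the three common intents:

import numpy as np

a = np.array([12, 4, 5])

if (a != 0).any():    # True if at least one element is nonzero
    print 'some element is nonzero'
if (a != 0).all():    # True only if every element is nonzero
    print 'all elements are nonzero'
if a.size > 0:        # the list-like "is it non-empty?" test
    print 'array is not empty'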
If you'd like to help other Numpy users with this issue, you can edit the documentation in the online documentation editor at http://docs.scipy.org/numpy/docs/numpy-docs/user/index.rst -Neil From denis-bz-py at t-online.de Mon Sep 7 09:26:22 2009 From: denis-bz-py at t-online.de (denis bzowy) Date: Mon, 7 Sep 2009 13:26:22 +0000 (UTC) Subject: [Numpy-discussion] greppable file of all numpy functions ? Message-ID: Does anyone have a program to generate a file with one line per Numpy function / class / method, for local grepping ? It might be useful for any package with thousands of functions too. (Grepping a Pypi summary to see "what the heck is ..." takes < 1 second.) Sorry if this is a duplicate, must exist already ? From engelh at deshaw.com Mon Sep 7 10:09:57 2009 From: engelh at deshaw.com (Hans-Andreas Engel) Date: Mon, 7 Sep 2009 14:09:57 +0000 (UTC) Subject: [Numpy-discussion] Row-wise dot product? References: <710F2847B0018641891D9A21602763605AD159@ex3.envision.co.il> Message-ID: > From: T J gmail.com> > Is there a better way to achieve the following, perhaps without the > python for loop? > > >>> x.shape > (10000,3) > >>> y.shape > (10000,3) > >>> z = empty(len(x)) > >>> for i in range(10000): > ... z[i] = dot(x[i], y[i]) > ... > _______________________________________________ Nadav Horesh visionsense.com> writes: > > (x*y).sum(1) > > Nadav > If you wish to avoid the extra memory allocation implied by `x*y' and get a ~4x speed-up, you can use a generalized ufunc (numpy >= 1.3, stolen from the testcases): z = numpy.core.umath_tests.inner1d(x, y) Best, Hansres From josef.pktd at gmail.com Mon Sep 7 12:07:32 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 7 Sep 2009 12:07:32 -0400 Subject: [Numpy-discussion] Row-wise dot product? In-Reply-To: References: <710F2847B0018641891D9A21602763605AD159@ex3.envision.co.il> Message-ID: <1cd32cbb0909070907j6f439d8cifdc4479db293ec9@mail.gmail.com> On Mon, Sep 7, 2009 at 10:09 AM, Hans-Andreas Engel wrote: >> From: T J gmail.com> >> Is there a better way to achieve the following, perhaps without the >> python for loop? >> >> >>> x.shape >> (10000,3) >> >>> y.shape >> (10000,3) >> >>> z = empty(len(x)) >> >>> for i in range(10000): >> ... ? ?z[i] = dot(x[i], y[i]) >> ... >> _______________________________________________ > > Nadav Horesh visionsense.com> writes: > >> >> (x*y).sum(1) >> >> ? Nadav >> > > > If you wish to avoid the extra memory allocation implied by `x*y' > and get a ~4x speed-up, you can use a generalized ufunc > (numpy >= 1.3, stolen from the testcases): > > ? z = numpy.core.umath_tests.inner1d(x, y) Thanks for the example. There are not many examples of generalized ufuncs on the mailing list and it's easy to forget that they have been added. Initially, I didn't find them in my numpy 1.3.0 install (AttributeErrors). 
It seems they have to be explicitly imported: import numpy.core.umath_tests Josef > > Best, > Hansres > > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From rpg.314 at gmail.com Mon Sep 7 12:16:06 2009 From: rpg.314 at gmail.com (Rohit Garg) Date: Mon, 7 Sep 2009 09:16:06 -0700 Subject: [Numpy-discussion] mixing -fno-exceptions with swig c++ wrappers to python In-Reply-To: <4AA47F69.4010507@ar.media.kyoto-u.ac.jp> References: <4d5dd8c20909042314y4ff2d3f7od9820aeed4636ec2@mail.gmail.com> <4AA47F69.4010507@ar.media.kyoto-u.ac.jp> Message-ID: <4d5dd8c20909070916v5f0a0055nbac2214e9187f9de@mail.gmail.com> On Sun, Sep 6, 2009 at 8:35 PM, David Cournapeau wrote: > Rohit Garg wrote: >> Hi, >> >> I am using swig to expose a c++ class to Python. I am wondering if it >> is safe to use the -fno-exceptions option while compiling the >> wrappers. I am also using the typemaps present in the numpy.i file >> that comes with numpy. >> >> > > It will mostly depend on the code you are wrapping and your toolchain. > It should not cause trouble w.r.t numpy, as numpy does not use C++ at all. Yeah, that's what I meant. If my code does not use exceptions, then is it safe to use -fno-exceptions? > > cheers, > > David > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Rohit Garg http://rpg-314.blogspot.com/ Senior Undergraduate Department of Physics Indian Institute of Technology Bombay From nathanielpeterson08 at gmail.com Mon Sep 7 15:46:35 2009 From: nathanielpeterson08 at gmail.com (nathanielpeterson08 at gmail.com) Date: Mon, 07 Sep 2009 15:46:35 -0400 Subject: [Numpy-discussion] Choosing the lesser value Message-ID: <87pra2v8pw.fsf@farmer.myhome.westell.com> Let a=np.ma.masked_invalid(np.array([-1,np.nan,-2,-3, np.nan])) b=np.ma.masked_invalid(np.array([-2,-3, -1,np.nan,np.nan])) I'd like to choose the lesser element (component-wise) of a and b. If the two elements are comparable, I want the lesser element. If one element is a number and the other is nan, then I want the number. And if both elements are nan, then I'd like resultant element be nan. In other words, I'd like the result to equal np.array([-2,-3,-2,-3,np.nan]) This is what I've tried: #!/usr/bin/env python import numpy as np a=np.ma.masked_invalid(np.array([-1,np.nan,-2,-3,np.nan])) b=np.ma.masked_invalid(np.array([-2,-3, -1,np.nan,np.nan])) very_large_num=10 a=np.ma.filled(a,very_large_num) print(a) # [ -1. 10. -2. -3. 10.] b=np.ma.filled(b,very_large_num) print(b) # [ -2. -3. -1. 10. 10.] result=np.ma.where(np.ma.less(a,b),a,b) print(result) # [ -2. -3. -2. -3. 10.] And this almost works, except that: [*] when both elements are nan, I'm getting my fill value (0) instead of nan. [*] I can guarantee all elements of a and b are either nan or less than some very_large_num, but if there is a solution that works without declaring very_large_num, I'd prefer that. What is the numpy way to choose lesser values in this situation? 
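The fmin ufunc suggested further down the thread does this in a single call (it was added in NumPy 1.3); a minimal sketch:

import numpy as np

a = np.array([-1, np.nan, -2, -3, np.nan])
b = np.array([-2, -3, -1, np.nan, np.nan])

# fmin takes the element-wise minimum and ignores a NaN when the other
# operand is a number; only the NaN/NaN position stays NaN.
print np.fmin(a, b)    # [ -2.  -3.  -2.  -3.  NaN]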
From nathanielpeterson08 at gmail.com Mon Sep 7 16:06:49 2009 From: nathanielpeterson08 at gmail.com (nathanielpeterson08 at gmail.com) Date: Mon, 07 Sep 2009 16:06:49 -0400 Subject: [Numpy-discussion] Choosing the lesser value In-Reply-To: <87pra2v8pw.fsf@farmer.myhome.westell.com> (nathanielpeterson08@gmail.com) References: <87pra2v8pw.fsf@farmer.myhome.westell.com> Message-ID: <87ocpmv7s6.fsf@farmer.myhome.westell.com> Doh, x=np.column_stack([a,b]) r=np.ma.masked_invalid(np.nanmin(x,axis=1)) print(r) # [-2.0 -3.0 -2.0 -3.0 --] From charlesr.harris at gmail.com Mon Sep 7 16:16:32 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 7 Sep 2009 15:16:32 -0500 Subject: [Numpy-discussion] Choosing the lesser value In-Reply-To: <87pra2v8pw.fsf@farmer.myhome.westell.com> References: <87pra2v8pw.fsf@farmer.myhome.westell.com> Message-ID: On Mon, Sep 7, 2009 at 2:46 PM, wrote: > Let > > a=np.ma.masked_invalid(np.array([-1,np.nan,-2,-3, np.nan])) > b=np.ma.masked_invalid(np.array([-2,-3, -1,np.nan,np.nan])) > > I'd like to choose the lesser element (component-wise) of a and b. > > If the two elements are comparable, I want the lesser element. > If one element is a number and the other is nan, then I want the number. > And if both elements are nan, then I'd like resultant element be nan. > > In other words, I'd like the result to equal > > np.array([-2,-3,-2,-3,np.nan]) > > This is what I've tried: > > #!/usr/bin/env python > import numpy as np > a=np.ma.masked_invalid(np.array([-1,np.nan,-2,-3,np.nan])) > b=np.ma.masked_invalid(np.array([-2,-3, -1,np.nan,np.nan])) > very_large_num=10 > a=np.ma.filled(a,very_large_num) > print(a) > # [ -1. 10. -2. -3. 10.] > b=np.ma.filled(b,very_large_num) > print(b) > # [ -2. -3. -1. 10. 10.] > result=np.ma.where(np.ma.less(a,b),a,b) > print(result) > # [ -2. -3. -2. -3. 10.] > > And this almost works, except that: > > [*] when both elements are nan, I'm getting my fill value (0) instead of > nan. > [*] I can guarantee all elements of a and b are either nan or less than > some very_large_num, but if there is a solution that works without declaring > very_large_num, I'd prefer that. > > What is the numpy way to choose lesser values in this situation? > __ The fmin ufunc has the behaviour you want. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From dwf at cs.toronto.edu Mon Sep 7 17:50:41 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Mon, 7 Sep 2009 17:50:41 -0400 Subject: [Numpy-discussion] greppable file of all numpy functions ? In-Reply-To: References: Message-ID: numpy.lookfor does what you're looking for, though I know of no such greppable file. You might be able to generate it thusly (untested): import numpy for key in dir(numpy): print key, if getattr(numpy, key).__doc__: print ':', getattr(numpy, key).__doc__.strip().split('\n')[0] On 7-Sep-09, at 9:26 AM, denis bzowy wrote: > Does anyone have a program to generate a file with one line per > Numpy function > / class / method, for local grepping ? > It might be useful for any package with thousands of functions too. > (Grepping a Pypi summary to see "what the heck is ..." takes < 1 > second.) > > Sorry if this is a duplicate, must exist already ? 
> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From tjhnson at gmail.com Mon Sep 7 18:27:03 2009 From: tjhnson at gmail.com (T J) Date: Mon, 7 Sep 2009 15:27:03 -0700 Subject: [Numpy-discussion] Row-wise dot product? In-Reply-To: References: <710F2847B0018641891D9A21602763605AD159@ex3.envision.co.il> Message-ID: On Mon, Sep 7, 2009 at 7:09 AM, Hans-Andreas Engel wrote: > If you wish to avoid the extra memory allocation implied by `x*y' > and get a ~4x speed-up, you can use a generalized ufunc > (numpy >= 1.3, stolen from the testcases): > > ? z = numpy.core.umath_tests.inner1d(x, y) > This is exactly what I was hoping for. Now, I can also keep an array of vectors and apply a rotation matrix to each vector. Hopefully, these use cases show serve as good proof on why the generalized ufunc machinery is useful. From jsseabold at gmail.com Mon Sep 7 18:36:13 2009 From: jsseabold at gmail.com (Skipper Seabold) Date: Mon, 7 Sep 2009 18:36:13 -0400 Subject: [Numpy-discussion] Behavior from a change in dtype? Message-ID: Hello all, I ran into a problem with some of my older code (since figured out the user error). However, in trying to give a simple example that replicates the problem I was having, I ran into this. In [19]: a = np.array((1.)) In [20]: a Out[20]: array(1.0) # the dtype is 'float64' In [21]: a.dtype=' References: <710F2847B0018641891D9A21602763605AD159@ex3.envision.co.il> Message-ID: On Mon, Sep 7, 2009 at 3:27 PM, T J wrote: > On Mon, Sep 7, 2009 at 7:09 AM, Hans-Andreas Engel wrote: >> If you wish to avoid the extra memory allocation implied by `x*y' >> and get a ~4x speed-up, you can use a generalized ufunc >> (numpy >= 1.3, stolen from the testcases): >> >> ? z = numpy.core.umath_tests.inner1d(x, y) >> > > This is exactly what I was hoping for. ?Now, I can also keep an array > of vectors and apply a rotation matrix to each vector. > I spoke too soon. inner1d will not allow me to rotate each row in the array. Is there another function that will help with this? If I'm understanding the signature for generalized ufuncs, it looks like I need: (i),(i,j)->(i) Or perhaps I am just being dense. From tjhnson at gmail.com Mon Sep 7 18:53:37 2009 From: tjhnson at gmail.com (T J) Date: Mon, 7 Sep 2009 15:53:37 -0700 Subject: [Numpy-discussion] Row-wise dot product? In-Reply-To: References: <710F2847B0018641891D9A21602763605AD159@ex3.envision.co.il> Message-ID: On Mon, Sep 7, 2009 at 3:43 PM, T J wrote: > Or perhaps I am just being dense. > Yes. I just tried to reinvent standard matrix multiplication. From josef.pktd at gmail.com Mon Sep 7 19:35:52 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 7 Sep 2009 19:35:52 -0400 Subject: [Numpy-discussion] Behavior from a change in dtype? In-Reply-To: References: Message-ID: <1cd32cbb0909071635o40b2cd60le7b19ad40b3b2903@mail.gmail.com> On Mon, Sep 7, 2009 at 6:36 PM, Skipper Seabold wrote: > Hello all, > > I ran into a problem with some of my older code (since figured out the > user error). ?However, in trying to give a simple example that > replicates the problem I was having, I ran into this. 
> > In [19]: a = np.array((1.)) > > In [20]: a > Out[20]: array(1.0) > > # the dtype is 'float64' > > In [21]: a.dtype='>> a = np.array((1.)) >>> b = a.astype('>> b array(1L, dtype=int64) Josef > > In [22]: a > Out[22]: array(4607182418800017408) > > I've seen some recent threads about handling changes in types, but I > didn't follow closely, so forgive me if I'm missing something that is > known. ?In general, is it just a bad idea to touch the dtype like > this? > > Best, > Skipper > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From wfspotz at sandia.gov Mon Sep 7 19:49:27 2009 From: wfspotz at sandia.gov (Bill Spotz) Date: Mon, 7 Sep 2009 19:49:27 -0400 Subject: [Numpy-discussion] mixing -fno-exceptions with swig c++ wrappers to python In-Reply-To: <4d5dd8c20909070916v5f0a0055nbac2214e9187f9de@mail.gmail.com> References: <4d5dd8c20909042314y4ff2d3f7od9820aeed4636ec2@mail.gmail.com> <4AA47F69.4010507@ar.media.kyoto-u.ac.jp> <4d5dd8c20909070916v5f0a0055nbac2214e9187f9de@mail.gmail.com> Message-ID: numpy.i is supposed to be C-compatible, so it does not generate any throw or catch statements, and utilizes standard python error handling. Using -fno-exceptions should be OK. On Sep 7, 2009, at 12:16 PM, Rohit Garg wrote: > On Sun, Sep 6, 2009 at 8:35 PM, David > Cournapeau wrote: >> Rohit Garg wrote: >>> Hi, >>> >>> I am using swig to expose a c++ class to Python. I am wondering if >>> it >>> is safe to use the -fno-exceptions option while compiling the >>> wrappers. I am also using the typemaps present in the numpy.i file >>> that comes with numpy. >>> >>> >> >> It will mostly depend on the code you are wrapping and your >> toolchain. >> It should not cause trouble w.r.t numpy, as numpy does not use C++ >> at all. > Yeah, that's what I meant. If my code does not use exceptions, then is > it safe to use -fno-exceptions? > >> >> cheers, >> >> David >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > > > -- > Rohit Garg > > http://rpg-314.blogspot.com/ > > Senior Undergraduate > Department of Physics > Indian Institute of Technology > Bombay > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > ** Bill Spotz ** ** Sandia National Laboratories Voice: (505)845-0170 ** ** P.O. Box 5800 Fax: (505)284-0154 ** ** Albuquerque, NM 87185-0370 Email: wfspotz at sandia.gov ** From jsseabold at gmail.com Mon Sep 7 20:01:18 2009 From: jsseabold at gmail.com (Skipper Seabold) Date: Mon, 7 Sep 2009 20:01:18 -0400 Subject: [Numpy-discussion] Behavior from a change in dtype? In-Reply-To: <1cd32cbb0909071635o40b2cd60le7b19ad40b3b2903@mail.gmail.com> References: <1cd32cbb0909071635o40b2cd60le7b19ad40b3b2903@mail.gmail.com> Message-ID: On Mon, Sep 7, 2009 at 7:35 PM, wrote: > On Mon, Sep 7, 2009 at 6:36 PM, Skipper Seabold wrote: >> Hello all, >> >> I ran into a problem with some of my older code (since figured out the >> user error). ?However, in trying to give a simple example that >> replicates the problem I was having, I ran into this. 
>> >> In [19]: a = np.array((1.)) >> >> In [20]: a >> Out[20]: array(1.0) >> >> # the dtype is 'float64' >> >> In [21]: a.dtype=' > The way I understand it is: > Here you are telling numpy to interpret the existing memory/data in a > different way, which might make sense or not depending on the types, > e.g. I also used this to switch between structured arrays and regular > arrays with compatible memory. However it does not convert the data. > > If you want to convert the data to a different type, numpy needs to > create a new array, e.g. with astype > >>>> a = np.array((1.)) >>>> b = a.astype('>>> b > array(1L, dtype=int64) > > Josef > Hmm, okay, well I came across this in trying to create a recarray like data2 below, so I guess I should just combine the two questions. Is the last example the best way to do what I'm trying to do (taken from an old thread)? I would like to add a few more examples of best practice here , so I don't need to go looking again. import numpy as np data = np.array([[10.75, 1, 1],[10.39, 0, 1],[18.18, 0, 1]]) dt = np.dtype([('var1', ' References: <1cd32cbb0909071635o40b2cd60le7b19ad40b3b2903@mail.gmail.com> Message-ID: <1cd32cbb0909071808w7452cac4y4385f88080b75870@mail.gmail.com> On Mon, Sep 7, 2009 at 8:01 PM, Skipper Seabold wrote: > On Mon, Sep 7, 2009 at 7:35 PM, wrote: >> On Mon, Sep 7, 2009 at 6:36 PM, Skipper Seabold wrote: >>> Hello all, >>> >>> I ran into a problem with some of my older code (since figured out the >>> user error). ?However, in trying to give a simple example that >>> replicates the problem I was having, I ran into this. >>> >>> In [19]: a = np.array((1.)) >>> >>> In [20]: a >>> Out[20]: array(1.0) >>> >>> # the dtype is 'float64' >>> >>> In [21]: a.dtype='> >> The way I understand it is: >> Here you are telling numpy to interpret the existing memory/data in a >> different way, which might make sense or not depending on the types, >> e.g. I also used this to switch between structured arrays and regular >> arrays with compatible memory. However it does not convert the data. >> >> If you want to convert the data to a different type, numpy needs to >> create a new array, e.g. with astype >> >>>>> a = np.array((1.)) >>>>> b = a.astype('>>>> b >> array(1L, dtype=int64) >> >> Josef >> > > Hmm, okay, well I came across this in trying to create a recarray like > data2 below, so I guess I should just combine the two questions. ?Is > the last example the best way to do what I'm trying to do (taken from > an old thread)? ?I would like to add a few more examples of best > practice here , > so I don't need to go looking again. > > import numpy as np > > data = np.array([[10.75, 1, 1],[10.39, 0, 1],[18.18, 0, 1]]) > dt = np.dtype([('var1', ' data2 = data.copy() > data3 = data.copy() > > # Doesn't work, raises TypeError: expected a readable buffer object > data2 = data2.view(np.recarray) > data2.astype(dt) > > # Works without error (?) with unexpected result > data3 = data3.view(np.recarray) > data3.dtype = dt > > # One correct (though IMHO) unintuitive way > data = np.rec.fromarrays(data.swapaxes(1,0), dtype=dt) I'm not able to come up with anything much better. 
For the conversion to structured arrays, numpy seems to expect tuples for the rows: >>> data = np.array([[10.75, 1, 1],[10.39, 0, 1],[18.18, 0, 1]]) >>> np.array(map(tuple, data), dt) array([(10.75, 1L, 1L), (10.390000000000001, 0L, 1L), (18.18, 0L, 1L)], dtype=[('var1', '>> np.array(map(tuple, data), dt).view(np.recarray) rec.array([(10.75, 1L, 1L), (10.390000000000001, 0L, 1L), (18.18, 0L, 1L)], dtype=[('var1', '>> data = np.array([[10.75, 1, 1],[10.39, 0, 1],[18.18, 0, 1]]) >>> data.dtype dtype('float64') >>> dt = np.dtype([('var1', '>> data.astype(dt) Traceback (most recent call last): File "", line 1, in data.astype(dt) TypeError: expected a readable buffer object >>> # doh >>> data_stf = data.view([('',data.dtype)]*data.shape[1]) >>> data_stf.dtype dtype([('f0', '>> data_stf.astype(dt) array([[(10.75, 1L, 1L)], [(10.390000000000001, 0L, 1L)], [(18.18, 0L, 1L)]], dtype=[('var1', '>> data_stf.astype(dt).view(np.recarray) rec.array([[(10.75, 1L, 1L)], [(10.390000000000001, 0L, 1L)], [(18.18, 0L, 1L)]], dtype=[('var1', '>> # Yuhoo > > Skipper > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From david at ar.media.kyoto-u.ac.jp Mon Sep 7 22:10:30 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Tue, 08 Sep 2009 11:10:30 +0900 Subject: [Numpy-discussion] mixing -fno-exceptions with swig c++ wrappers to python In-Reply-To: <4d5dd8c20909070916v5f0a0055nbac2214e9187f9de@mail.gmail.com> References: <4d5dd8c20909042314y4ff2d3f7od9820aeed4636ec2@mail.gmail.com> <4AA47F69.4010507@ar.media.kyoto-u.ac.jp> <4d5dd8c20909070916v5f0a0055nbac2214e9187f9de@mail.gmail.com> Message-ID: <4AA5BD16.8060508@ar.media.kyoto-u.ac.jp> Rohit Garg wrote: > Yeah, that's what I meant. If my code does not use exceptions, then is > it safe to use -fno-exceptions? > You would have to look at g++ documentation - but if it is safe for your code, numpy should not make it "unsafe". I am not sure what not using exception means in C++, though, as new throws exception for example. cheers, David From robert.kern at gmail.com Mon Sep 7 23:46:55 2009 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 7 Sep 2009 22:46:55 -0500 Subject: [Numpy-discussion] greppable file of all numpy functions ? In-Reply-To: References: Message-ID: <3d375d730909072046n3dc793bam32161a7119c2b412@mail.gmail.com> On Mon, Sep 7, 2009 at 08:26, denis bzowy wrote: > Does anyone have a program to generate a file with one line per Numpy function > / class / method, for local grepping ? > It might be useful for any package with thousands of functions too. > (Grepping a Pypi summary to see "what the heck is ..." takes < 1 second.) I'll do you one better and give you full-text search. http://pypi.python.org/pypi/WhooshDoc -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From denis-bz-py at t-online.de Tue Sep 8 10:28:30 2009 From: denis-bz-py at t-online.de (denis bzowy) Date: Tue, 8 Sep 2009 14:28:30 +0000 (UTC) Subject: [Numpy-discussion] greppable file of all numpy functions ? References: Message-ID: denis bzowy t-online.de> writes: > > Does anyone have a program to generate a file with one line per Numpy function > / class / method, for local grepping ? Sorry I wasn't clear: I want just all defs, one per long line, like this: ... 
PyQt4.QtCore.QObject.findChildren(type type, QRegExp regExp) -> list PyQt4.QtCore.QObject.emit(SIGNAL(), ...) PyQt4.QtCore.QObject.objectName() -> QString PyQt4.QtCore.QObject.setObjectName(QString name) PyQt4.QtCore.QObject.isWidgetType() -> bool ... This file (PyQt4.api) is a bit different but you get the idea: egrep kilroy all.defs -> a.b.c.kilroy ... with args -- no __doc__ then pydoc or ipython %whoosh a.b.c.kilroy -> __doc__ is step 2. Sound dumb ? Well, grep is fast, simple, and works even when you don't know enough for tree-structured search. Whooshdoc looks very nice, can it do just all.defs ? (Oops, wdoc -v index numpy.core numpy.lib -> ... File "/opt/local/lib/python2.5/site-packages/epydoc-3.0.1-py2.5.egg/epydoc/doc module_doc.package.submodules.append(module_doc) AttributeError: _Sentinel instance has no attribute 'append' log on the way to enthought-dev From giuseppe.aprea at gmail.com Tue Sep 8 10:29:06 2009 From: giuseppe.aprea at gmail.com (Giuseppe Aprea) Date: Tue, 8 Sep 2009 16:29:06 +0200 Subject: [Numpy-discussion] creating mesh data from xyz data Message-ID: Hi list, I have some files with data stored in columns: x1 y1 z1 x2 y2 z2 x3 y3 z3 x4 y4 z4 x5 y5 z5 ....... and I need to make a contour plot of this data using matplotlib. The problem is that contour plot functions usually handle a different kind of input: X=[[x1,x2,x3,x4,x5,x6], [x1,x2,x3,x4,x5,x6], [x1,x2,x3,x4,x5,x6],... Y=[[y1,y1,y1,y1,y1,y1], [y2,y2,y2,y2,y2,y2], [y3,y3,y3,y3,y3,y3],..... Z=[[z1,z2,z3,z4,z5,z6], [z7,z8,zz9,z10,z11,z12],.... I usually load data using 3 lists: x, y and z; I wonder if there is any function which is able to take these 3 lists and return the right inputs for matplotlib functions. cheers g From Chris.Barker at noaa.gov Tue Sep 8 11:38:35 2009 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Tue, 08 Sep 2009 08:38:35 -0700 Subject: [Numpy-discussion] creating mesh data from xyz data In-Reply-To: References: Message-ID: <4AA67A7B.8090705@noaa.gov> Giuseppe Aprea wrote: > I have some files with data stored in columns: > > x1 y1 z1 > x2 y2 z2 > x3 y3 z3 > x4 y4 z4 > x5 y5 z5 > I usually load data using 3 lists: x, y and z; I wonder if there is > any function which is able to take these 3 lists and return the right > inputs for matplotlib functions. There may b e some MPL utilities that help with this, so you may want to ask there, but: What you want to do depends on the nature of your data. If your data is on a rectangular structured grid, the you should use your knowledge of the data structure to re-create that structure to pass to MPL. If it is unstructured data: i.e. the (x,y) points are at arbitrary positions, then you need some sort of interpolation scheme to get an appropriate rectangular mesh: Here's a good start: http://www.scipy.org/Cookbook/Matplotlib/Gridding_irregularly_spaced_data -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From Chris.Barker at noaa.gov Tue Sep 8 12:53:16 2009 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Tue, 08 Sep 2009 09:53:16 -0700 Subject: [Numpy-discussion] Behavior from a change in dtype? 
In-Reply-To: References: <1cd32cbb0909071635o40b2cd60le7b19ad40b3b2903@mail.gmail.com> Message-ID: <4AA68BFC.7080707@noaa.gov> Skipper Seabold wrote: > Hmm, okay, well I came across this in trying to create a recarray like > data2 below, so I guess I should just combine the two questions. key to understanding this is to understand what is going on under the hood in numpy. Travis O. gave a nice intro in an Enthought webcast a few months ago -- I'm not sure if those are recorded and up on the web, but it's worth a look. It was also discussed in the advanced numpy tutorial at SciPy this year -- and that is up on the web: http://www.archive.org/details/scipy09_advancedTutorialDay1_1 Anyway, here is my minimal attempt to clarify: > import numpy as np > > data = np.array([[10.75, 1, 1],[10.39, 0, 1],[18.18, 0, 1]]) here we are using a standard array constructor -- it will look at the data you are passing in (a mixture of python floats and ints), and decide that they can best be represented by a numpy array of float64s. numpy arrays are essentially a pointer to a block of memory, and a bunch of attributes that describe how the bytes pointed to are to be interpreted. In this case, they are 9 C doubles, representing a 3x3 array of doubles. > dt = np.dtype([('var1', 'f8'), ('var2', '>i8'), ('var3', '>i8')]) This is a data type descriptor that is analogous to a C struct, containing a float64 and two int64s > # Doesn't work, raises TypeError: expected a readable buffer object > data2 = data2.view(np.recarray) > data2.astype(dt) I don't understand that error either, but recarrays are about adding the ability to access parts of a structured array by name, but you still need the dtype to specify the types and names. This does seem to work (though may not be giving the results you expect): In [19]: data2 = data.copy() In [20]: data2 = data2.view(np.recarray) In [21]: data2 = data2.view(dtype=dt) or, indeed in the opposite order: In [24]: data2 = data.copy() In [25]: data2 = data2.view(dtype=dt) In [26]: data2 = data2.view(np.recarray) So you've done two operations, one is to change the dtype -- the interpretation of the bytes in the data buffer, and one is to make this a recarray, which allows you to access the "fields" by name: In [31]: data2['var1'] Out[31]: array([[ 10.75], [ 10.39], [ 18.18]]) > # Works without error (?) with unexpected result > data3 = data3.view(np.recarray) > data3.dtype = dt that all depends on what you expect! I used "view" above, 'cause I think there is less magic, though it's the same thing. I suppose changing the dtype in place like that is a tiny bit more efficient -- if you use .view(), you are creating a new array pointing to the same data, rather than changing the array in place. But anyway, the dtype describes how the bytes in the memory block are to be interpreted; changing it by assigning the attribute or using .view() changes the interpretation, but does not change the bytes themselves at all, so in this case, you are taking the 8 bytes representing a float64 of value 1.0, and interpreting those bytes as an 8 byte int -- which is going to give you garbage, essentially. > # One correct (though IMHO) unintuitive way > data = np.rec.fromarrays(data.swapaxes(1,0), dtype=dt) This is using the np.rec.fromarrays constructor to build a new record array with the dtype you want; the data is being converted and copied, and it won't change the original at all. So the question remains -- is there a way to convert the floats in "data" to ints in place?
This seems to work: In [78]: data = np.array([[10.75, 1, 1],[10.39, 0, 1],[18.18, 0, 1]]) In [79]: data[:,1:3] = data[:,1:3].astype('>i8').view(dtype='>f8') In [80]: data.dtype = dt It is making a copy of the integer data in process -- but I think that is required, as you are changing the value, not just the interpretation of the bytes. I suppose we could have a "astype_inplace" method, but that would only work if the two types were the same size, and I'm not sure it's a common enough use to be worth it. What is your real use case? I suspect that what you really should do here is define your dtype first, then create the array of data: data = np.array([(10.75, 1, 1), (10.39, 0, 1), (18.18, 0, 1)], dtype=dt) which does require that you use tuples, rather than lists to hold the "structs". HTH, - Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From Chris.Barker at noaa.gov Tue Sep 8 12:58:13 2009 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Tue, 08 Sep 2009 09:58:13 -0700 Subject: [Numpy-discussion] numpy/scipy/matplotlib + 10.6 + Apple python 2.6.1 In-Reply-To: <5b8d13220909070213n42e4da0diebe5c20361928f78@mail.gmail.com> References: <1d1e6ea70909070200v760212bj64749b67040302d1@mail.gmail.com> <5b8d13220909070213n42e4da0diebe5c20361928f78@mail.gmail.com> Message-ID: <4AA68D25.8020601@noaa.gov> David Cournapeau wrote: > I think it is best to avoid touching anything in /System. Yes, it is. > The better > solution is to install things locally, at least if you don't need to > share with several users one install. And if you do, you can put it in: /Library/Frameworks (/Library is kind Apple's answer to /usr/local, at least for Frameworks) What that means is that you need to install a new Python, too. I think those notes were for using the Apple-supplied Python. But it's a good idea to build your own Python (or install the python.org one) in /Library anyway -- Apple has never upgraded a Python within an OS-X release, and tends to have a bunch of not-quite-up-to-date pacakges installed. Since you don't know which of those packages are being used by Apple utilities, and Python doesn't provide a package versioning system, and not all package updates are fully backwards compatible, it's best to simply not mess with Apple's python at all. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From nmb at wartburg.edu Tue Sep 8 13:51:10 2009 From: nmb at wartburg.edu (Neil Martinsen-Burrell) Date: Tue, 08 Sep 2009 12:51:10 -0500 Subject: [Numpy-discussion] creating mesh data from xyz data In-Reply-To: <4AA67A7B.8090705@noaa.gov> References: <4AA67A7B.8090705@noaa.gov> Message-ID: <4AA6998E.1030406@wartburg.edu> On 2009-09-08 10:38 , Christopher Barker wrote: > Giuseppe Aprea wrote: >> I have some files with data stored in columns: >> >> x1 y1 z1 >> x2 y2 z2 >> x3 y3 z3 >> x4 y4 z4 >> x5 y5 z5 >> I usually load data using 3 lists: x, y and z; I wonder if there is >> any function which is able to take these 3 lists and return the right >> inputs for matplotlib functions. > > There may b e some MPL utilities that help with this, so you may want to > ask there, but: > > What you want to do depends on the nature of your data. 
If your data is > on a rectangular structured grid, the you should use your knowledge of > the data structure to re-create that structure to pass to MPL. To expand on Chris's very nice explanation, if the data points are in "raster" order where x1 = x2 = ... = xn and so forth, then you can use reshape to get your arrays for matplotlib. Here's an example: >>> x = [1]*5+[2]*5+[3]*5 >>> y = [6,7,8,9,10]*3 >>> z = range(15) >>> x,y,z ([1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3], [6, 7, 8, 9, 10, 6, 7, 8, 9, 10, 6, 7, 8, 9, 10], [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]) >>> plot_x = np.array(x).reshape(3,5) >>> plot_y = np.array(y).reshape(3,5) >>> plot_z = np.array(z).reshape(3,5) >>> plot_x,plot_y,plot_z (array([[1, 1, 1, 1, 1], [2, 2, 2, 2, 2], [3, 3, 3, 3, 3]]), array([[ 6, 7, 8, 9, 10], [ 6, 7, 8, 9, 10], [ 6, 7, 8, 9, 10]]), array([[ 0, 1, 2, 3, 4], [ 5, 6, 7, 8, 9], [10, 11, 12, 13, 14]])) -Neil From dsdale24 at gmail.com Tue Sep 8 15:21:58 2009 From: dsdale24 at gmail.com (Darren Dale) Date: Tue, 8 Sep 2009 15:21:58 -0400 Subject: [Numpy-discussion] question about future support for python-3 Message-ID: I'm not a core numpy developer and don't want to step on anybody's toes here. But I was wondering if anyone had considered approaching the Python Software Foundation about support to help get numpy working with python-3? Thanks, Darren From gdahl at cs.toronto.edu Tue Sep 8 15:19:05 2009 From: gdahl at cs.toronto.edu (George Dahl) Date: Tue, 8 Sep 2009 19:19:05 +0000 (UTC) Subject: [Numpy-discussion] Fwd: GPU Numpy References: <7f1eaee30908061041l2cd76f64r96e7f5c7c16a2483@mail.gmail.com> <4A7B43CE.7050509@molden.no> <7f1eaee30908061429v5d04ab77v18b37a0a177548cd@mail.gmail.com> <4A7B5B06.2080909@molden.no> <4A7B62FC.6090504@molden.no> <4A8FBF64.50300@molden.no> Message-ID: Sturla Molden molden.no> writes: > > Erik Tollerud skrev: > >> NumPy arrays on the GPU memory is an easy task. But then I would have to > >> write the computation in OpenCL's dialect of C99? > > This is true to some extent, but also probably difficult to do given > > the fact that paralellizable algorithms are generally more difficult > > to formulate in striaghtforward ways. > Then you have misunderstood me completely. Creating an ndarray that has > a buffer in graphics memory is not too difficult, given that graphics > memory can be memory mapped. This has nothing to do with parallelizable > algorithms or not. It is just memory management. We could make an > ndarray subclass that quickly puts is content in a buffer accessible to > the GPU. That is not difficult. But then comes the question of what you > do with it. > > I think many here misunderstands the issue here: > > Teraflops peak performance of modern GPUs is impressive. But NumPy > cannot easily benefit from that. In fact, there is little or nothing to > gain from optimising in that end. In order for a GPU to help, > computation must be the time-limiting factor. It is not. There is not > more to say about using GPUs in NumPy right now. > > Take a look at the timings here: http://www.scipy.org/PerformancePython > It shows that computing with NumPy is more than ten times slower than > using plain C. This is despite NumPy being written in C. The NumPy code > does not incur 10 times more floating point operations than the C code. > The floating point unit does not run in turtle mode when using NumPy. > NumPy's relative slowness compared to C has nothing to do with floating > point computation. 
It is due to inferior memory use (temporary buffers, > multiple buffer traversals) and memory access being slow. Moving > computation to the GPU can only make this worse. > > Improved memory usage - e.g. through lazy evaluation and JIT compilation > of expressions - can give up to a tenfold increase in performance. That > is where we must start optimising to get a faster NumPy. Incidentally, > this will also make it easier to leverage on modern GPUs. > > Sturla Molden > I know that for my work, I can get around an order of a 50-fold speedup over numpy using a python wrapper for a simple GPU matrix class. So I might be dealing with a lot of matrix products where I multiply a fixed 512 by 784 matrix by a 784 by 256 matrix that changes between each matrix product, although to really see the largest gains I use a 4096 by 2048 matrix times a bunch of 2048 by 256 matrices. If all I was doing were those matrix products, it would be even faster, but what I actually am doing is a matrix product, then adding a column vector to the result, then applying an elementwise logistic sigmoid function and potentially generating a matrix of pseudorandom numbers the same shape as my result (although not always). When I do these sorts of workloads, my python numpy+GPU matrix class goes so much faster than anything that doesn't use the GPU (be it Matlab, or numpy, or C/C++ whatever) that I don't even bother measuring the speedups precisely. In some cases, my python code isn't making too many temporaries since what it is doing is so simple, but in other cases that is obviously slowing it down a bit. I have relatively complicated jobs that used to take weeks on the CPU and can now take hours or days. Obviously improved memory usage would be more helpful since not everyone has access to the sorts of GPUs I use, but tenfold increases in performance seem like chump change compared to what I see with the sorts of workloads I do. From dwf at cs.toronto.edu Tue Sep 8 15:56:43 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Tue, 8 Sep 2009 15:56:43 -0400 Subject: [Numpy-discussion] question about future support for python-3 In-Reply-To: References: Message-ID: <4EA84C5C-E574-41E9-ABEE-00DCB88726FC@cs.toronto.edu> Hey Darren, On 8-Sep-09, at 3:21 PM, Darren Dale wrote: > I'm not a core numpy developer and don't want to step on anybody's > toes here. But I was wondering if anyone had considered approaching > the Python Software Foundation about support to help get numpy working > with python-3? It's a great idea, but word on the grapevine is they lost a LOT of money on PyCon 2009 due to lower than expected turnout (recession, etc.); worth a try, perhaps, but I wouldn't hold my breath. David From doutriaux1 at llnl.gov Tue Sep 8 15:59:08 2009 From: doutriaux1 at llnl.gov (=?UTF-8?Q?Charles_=D8=B3=D9=85=D9=8A=D8=B1_Doutriaux?=) Date: Tue, 8 Sep 2009 12:59:08 -0700 Subject: [Numpy-discussion] dtype and dtype.char Message-ID: <29F41702-997A-47A6-9240-67D09C0F091A@llnl.gov> Hi, I'm testing our code on 64bit vs 32bit. I just realized that the dtype.char is platform dependent. I guess it's normal. Here is my little test: for t in [numpy.byte, numpy.short, numpy.int, numpy.int32, numpy.float, numpy.float32, numpy.double, numpy.ubyte, numpy.ushort, numpy.uint, numpy.int64, numpy.uint64]: print 'Testing type:',t data = numpy.array([0], dtype=t) print data.dtype.char,data.dtype On 64bit I get for numpy.uint64: Testing type: L uint64 Whereas on 32bit I get Testing type: Q uint64 Is it really normal?
I guess that means I shouldn't expect the dtype.char to be the same on all platform.... Is that right? C. From robert.kern at gmail.com Tue Sep 8 16:02:20 2009 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 8 Sep 2009 15:02:20 -0500 Subject: [Numpy-discussion] dtype and dtype.char In-Reply-To: <29F41702-997A-47A6-9240-67D09C0F091A@llnl.gov> References: <29F41702-997A-47A6-9240-67D09C0F091A@llnl.gov> Message-ID: <3d375d730909081302o67b5b086o3941f99380eb0a2c@mail.gmail.com> 2009/9/8 Charles ???? Doutriaux : > Hi, > > I'm testing our code on 64bit vs 32bit > > I just realized that the dtype.car is platform dependent. > > I guess it's normal > > her emy little test: > for t in > [numpy > .byte > ,numpy > .short > ,numpy > .int > ,numpy > .int32 > ,numpy > .float > ,numpy > .float32 > ,numpy > .double,numpy.ubyte,numpy.ushort,numpy.uint,numpy.int64,numpy.uint64]: > ? ? print 'Testing type:',t > ? ? data = numpy.array([0], dtype=t) > ? ? print data.dtype.char,data.dtype > > > On 64bit I get for numpy.unit64: > Testing type: > L uint64 > > Whereas on 32bit i get > Testing type: > Q uint64 > > Is it really normal? I guess that means I shouldn't expect the > dtype.char to be the same on all platform.... > > Is that right? Yes. dtype.char corresponds more closely to the C type ("L" == "unsigned long" and "Q" == "unsigned long long") which is platform specific. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From doutriaux1 at llnl.gov Tue Sep 8 16:13:45 2009 From: doutriaux1 at llnl.gov (=?UTF-8?Q?Charles_=D8=B3=D9=85=D9=8A=D8=B1_Doutriaux?=) Date: Tue, 8 Sep 2009 13:13:45 -0700 Subject: [Numpy-discussion] dtype and dtype.char In-Reply-To: <3d375d730909081302o67b5b086o3941f99380eb0a2c@mail.gmail.com> References: <29F41702-997A-47A6-9240-67D09C0F091A@llnl.gov> <3d375d730909081302o67b5b086o3941f99380eb0a2c@mail.gmail.com> Message-ID: Hi Robert, Ok we have a section of code that used to be like that: char t; switch(type) { case NPY_CHAR: t = 'c'; break; etc... I now replaced with char t; switch(type) { case NPY_CHAR: t = NPY_CHARLTR; break; But I'm still stuck with numpy.uint64 NPY_UINT64LTR does not seem to exist.... What do you recommend? C. On Sep 8, 2009, at 1:02 PM, Robert Kern wrote: > 2009/9/8 Charles ???? Doutriaux : >> Hi, >> >> I'm testing our code on 64bit vs 32bit >> >> I just realized that the dtype.car is platform dependent. >> >> I guess it's normal >> >> her emy little test: >> for t in >> [numpy >> .byte >> ,numpy >> .short >> ,numpy >> .int >> ,numpy >> .int32 >> ,numpy >> .float >> ,numpy >> .float32 >> ,numpy >> .double >> ,numpy.ubyte,numpy.ushort,numpy.uint,numpy.int64,numpy.uint64]: >> print 'Testing type:',t >> data = numpy.array([0], dtype=t) >> print data.dtype.char,data.dtype >> >> >> On 64bit I get for numpy.unit64: >> Testing type: >> L uint64 >> >> Whereas on 32bit i get >> Testing type: >> Q uint64 >> >> Is it really normal? I guess that means I shouldn't expect the >> dtype.char to be the same on all platform.... >> >> Is that right? > > Yes. dtype.char corresponds more closely to the C type ("L" == > "unsigned long" and "Q" == "unsigned long long") which is platform > specific. > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." 
> -- Umberto Eco > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://*mail.scipy.org/mailman/listinfo/numpy-discussion From dsdale24 at gmail.com Tue Sep 8 16:16:14 2009 From: dsdale24 at gmail.com (Darren Dale) Date: Tue, 8 Sep 2009 16:16:14 -0400 Subject: [Numpy-discussion] question about future support for python-3 In-Reply-To: <4EA84C5C-E574-41E9-ABEE-00DCB88726FC@cs.toronto.edu> References: <4EA84C5C-E574-41E9-ABEE-00DCB88726FC@cs.toronto.edu> Message-ID: Hi David, On Tue, Sep 8, 2009 at 3:56 PM, David Warde-Farley wrote: > Hey Darren, > > On 8-Sep-09, at 3:21 PM, Darren Dale wrote: > >> I'm not a core numpy developer and don't want to step on anybody's >> toes here. But I was wondering if anyone had considered approaching >> the Python Software Foundation about support to help get numpy working >> with python-3? > > It's a great idea, but word on the grapevine is they lost a LOT of > money on PyCon 2009 due to lower than expected turnout (recession, > etc.); worth a try, perhaps, but I wouldn't hold my breath. I'm blissfully ignorant of the grapevine. But if the numpy project could make use of additional resources to speed along the transition, and if the PSF is in a position to help (either now or in the future), both parties could benefit from such an arrangement. Darren From Chris.Barker at noaa.gov Tue Sep 8 17:21:53 2009 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Tue, 08 Sep 2009 14:21:53 -0700 Subject: [Numpy-discussion] Fwd: GPU Numpy In-Reply-To: References: <7f1eaee30908061041l2cd76f64r96e7f5c7c16a2483@mail.gmail.com> <4A7B43CE.7050509@molden.no> <7f1eaee30908061429v5d04ab77v18b37a0a177548cd@mail.gmail.com> <4A7B5B06.2080909@molden.no> <4A7B62FC.6090504@molden.no> <4A8FBF64.50300@molden.no> Message-ID: <4AA6CAF1.9060303@noaa.gov> George Dahl wrote: > Sturla Molden molden.no> writes: >> Teraflops peak performance of modern GPUs is impressive. But NumPy >> cannot easily benefit from that. > I know that for my work, I can get around an order of a 50-fold speedup over > numpy using a python wrapper for a simple GPU matrix class. I think you're talking across each other here. Sturla is referring to making a numpy ndarray gpu-aware and then expecting expressions like: z = a*x**2 + b*x + c to go faster when s, b, c, and x are ndarrays. That's not going to happen. On the other hand, George is talking about moving higher-level operations (like a matrix product) over to GPU code. This is analogous to numpy.linalg and numpy.dot() using LAPACK routines, and yes, that could help those programs that use such operations. So a GPU LAPACK would be nice. This is also analogous to using SWIG, or ctypes or cython or weave, or ??? to move a computationally expensive part of the code over to C. I think anything that makes it easier to write little bits of your code for the GPU would be pretty cool -- a GPU-aware Cython? Also, perhaps a GPU-aware numexpr could be helpful which I think is the kind of thing that Sturla was refering to when she wrote: "Incidentally, this will also make it easier to leverage on modern GPUs." -Chris -- Christopher Barker, Ph.D. 
Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From lists at cheimes.de Tue Sep 8 18:57:08 2009 From: lists at cheimes.de (Christian Heimes) Date: Wed, 09 Sep 2009 00:57:08 +0200 Subject: [Numpy-discussion] question about future support for python-3 In-Reply-To: References: Message-ID: Darren Dale wrote: > I'm not a core numpy developer and don't want to step on anybody's > toes here. But I was wondering if anyone had considered approaching > the Python Software Foundation about support to help get numpy working > with python-3? What kind of support are you talking about? Developers, money, software, PR, test platforms ...? For quite some time we are talking about ways on the PSF list to aid projects. We are trying to figure out what projects need, especially high profile projects and important infrastructure projects. I myself consider NumPy as a great asset for both the scientific community and Python. It's true that Pycon '09 was a major drawback on our financials. But there are other ways beside money to assist projects. For example the snakebite network (http://snakebite.org/) could be very useful for you once it's open. Please don't ask me about details on the status, I don't have an account yet. About a month ago we got 14 MSDN premium subscriptions with full access to MS development tools and all Windows platforms, which is very useful for porting and testing application on Windows. Some core developers may also be interested to assist you directly. The PSF might (!) even donate some money but I'm not in the position to discuss it. I can get you in touch with the PSF if you like. I'm a PSF member and a core developer. Christian From charlesr.harris at gmail.com Tue Sep 8 19:37:02 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 8 Sep 2009 18:37:02 -0500 Subject: [Numpy-discussion] question about future support for python-3 In-Reply-To: References: Message-ID: On Tue, Sep 8, 2009 at 5:57 PM, Christian Heimes wrote: > Darren Dale wrote: > > I'm not a core numpy developer and don't want to step on anybody's > > toes here. But I was wondering if anyone had considered approaching > > the Python Software Foundation about support to help get numpy working > > with python-3? > > What kind of support are you talking about? Developers, money, software, > PR, test platforms ...? For quite some time we are talking about ways on > the PSF list to aid projects. We are trying to figure out what projects > need, especially high profile projects and important infrastructure > projects. I myself consider NumPy as a great asset for both the > scientific community and Python. > > I think a full time developer would do the most to speed up the transition. Having a variety of platforms available for testing is good but I don't think it will speed things up significantly. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From cournape at gmail.com Tue Sep 8 20:08:18 2009 From: cournape at gmail.com (David Cournapeau) Date: Wed, 9 Sep 2009 09:08:18 +0900 Subject: [Numpy-discussion] question about future support for python-3 In-Reply-To: References: Message-ID: <5b8d13220909081708m5ba63ee7sad2c56d16b5eeec1@mail.gmail.com> On Wed, Sep 9, 2009 at 4:21 AM, Darren Dale wrote: > I'm not a core numpy developer and don't want to step on anybody's > toes here. 
But I was wondering if anyone had considered approaching > the Python Software Foundation about support to help get numpy working > with python-3? I already gave my own opinion on py3k, which can be summarized as: - it is a huge effort, and no core numpy/scipy developer has expressed the urge to transition to py3k, since py3k does not bring much for scientific computing. - very few packages with a significant portion of C have been ported to my knowledge, hence very little experience on how to do it. AFAIK, only small packages have been ported. Even big, pure python projects have not been ported. The only big C project to have been ported is python itself, and it broke compatibility and used a different source tree than python 2. - it remains to be seen whether we can do the py3k support in the same source tree as the one use for python >= 2.4. Having two source trees would make the effort even much bigger, well over the current developers capacity IMHO. The only area where I could see the PSF helping is the point 2: more documentation, more stories about 2->3 transition. cheers, David From dsdale24 at gmail.com Tue Sep 8 20:37:38 2009 From: dsdale24 at gmail.com (Darren Dale) Date: Tue, 8 Sep 2009 20:37:38 -0400 Subject: [Numpy-discussion] question about future support for python-3 In-Reply-To: <5b8d13220909081708m5ba63ee7sad2c56d16b5eeec1@mail.gmail.com> References: <5b8d13220909081708m5ba63ee7sad2c56d16b5eeec1@mail.gmail.com> Message-ID: Hi David, On Tue, Sep 8, 2009 at 8:08 PM, David Cournapeau wrote: > On Wed, Sep 9, 2009 at 4:21 AM, Darren Dale wrote: >> I'm not a core numpy developer and don't want to step on anybody's >> toes here. But I was wondering if anyone had considered approaching >> the Python Software Foundation about support to help get numpy working >> with python-3? > > I already gave my own opinion on py3k, which can be summarized as: > ?- it is a huge effort, and no core numpy/scipy developer has > expressed the urge to transition to py3k, since py3k does not bring > much for scientific computing. > ?- very few packages with a significant portion of C have been ported > to my knowledge, hence very little experience on how to do it. AFAIK, > only small packages have been ported. Even big, pure python projects > have not been ported. The only big C project to have been ported is > python itself, and it broke compatibility and used a different source > tree than python 2. > ?- it remains to be seen whether we can do the py3k support in the > same source tree as the one use for python >= 2.4. Having two source > trees would make the effort even much bigger, well over the current > developers capacity IMHO. > > The only area where I could see the PSF helping is the point 2: more > documentation, more stories about 2->3 transition. I'm surprised to hear you say that. I would think additional developer and/or financial resources would be useful, for all of the reasons you listed. Darren From mail.to.daniel.platz at googlemail.com Tue Sep 8 20:30:39 2009 From: mail.to.daniel.platz at googlemail.com (Daniel Platz) Date: Wed, 9 Sep 2009 02:30:39 +0200 Subject: [Numpy-discussion] Huge arrays Message-ID: Hi, I have a numpy newbie question. I want to store a huge amount of data in an array. This data come from a measurement setup and I want to write them to disk later since there is nearly no time for this during the measurement. To put some numbers up: I have 2*256*2000000 int16 numbers which I want to store. 
I tried data1 = numpy.zeros((256,2000000),dtype=int16) data2 = numpy.zeros((256,2000000),dtype=int16) This works for the first array data1. However, it returns with a memory error for array data2. I have read somewhere that there is a 2GB limit for numpy arrays on a 32 bit machine but shouldn't I still be below that? I use Windows XP Pro 32 bit with 3GB of RAM. If someone has an idea to help me I would be very glad. Thanks in advance. Daniel From cournape at gmail.com Tue Sep 8 20:53:01 2009 From: cournape at gmail.com (David Cournapeau) Date: Wed, 9 Sep 2009 09:53:01 +0900 Subject: [Numpy-discussion] Huge arrays In-Reply-To: References: Message-ID: <5b8d13220909081753j112d4040u8c7def0235b83c33@mail.gmail.com> On Wed, Sep 9, 2009 at 9:30 AM, Daniel Platz wrote: > Hi, > > I have a numpy newbie question. I want to store a huge amount of data > in ?an array. This data come from a measurement setup and I want to > write them to disk later since there is nearly no time for this during > the measurement. To put some numbers up: I have 2*256*2000000 int16 > numbers which I want to store. I tried > > data1 = numpy.zeros((256,2000000),dtype=int16) > data2 = numpy.zeros((256,2000000),dtype=int16) > > This works for the first array data1. However, it returns with a > memory error for array data2. I have read somewhere that there is a > 2GB limit for numpy arrays on a 32 bit machine This has nothing to do with numpy per se - that's the fundamental limitation of 32 bits architectures. Each of your array is 1024 Mb, so you won't be able to create two of them. The 2Gb limit is a theoretical upper limit, and in practice, it will always be lower, if only because python itself needs some memory. There is also the memory fragmentation problem, which means allocating one contiguous, almost 2Gb segment will be difficult. > If someone has an idea to help me I would be very glad. If you really need to deal with arrays that big, you should move on 64 bits architecture. That's exactly the problem they are solving. cheers, David From cournape at gmail.com Tue Sep 8 21:02:09 2009 From: cournape at gmail.com (David Cournapeau) Date: Wed, 9 Sep 2009 10:02:09 +0900 Subject: [Numpy-discussion] question about future support for python-3 In-Reply-To: References: <5b8d13220909081708m5ba63ee7sad2c56d16b5eeec1@mail.gmail.com> Message-ID: <5b8d13220909081802j2f8f9decxc2abe0ccc73bbc86@mail.gmail.com> On Wed, Sep 9, 2009 at 9:37 AM, Darren Dale wrote: > Hi David, >> I already gave my own opinion on py3k, which can be summarized as: >> ?- it is a huge effort, and no core numpy/scipy developer has >> expressed the urge to transition to py3k, since py3k does not bring >> much for scientific computing. >> ?- very few packages with a significant portion of C have been ported >> to my knowledge, hence very little experience on how to do it. AFAIK, >> only small packages have been ported. Even big, pure python projects >> have not been ported. The only big C project to have been ported is >> python itself, and it broke compatibility and used a different source >> tree than python 2. >> ?- it remains to be seen whether we can do the py3k support in the >> same source tree as the one use for python >= 2.4. Having two source >> trees would make the effort even much bigger, well over the current >> developers capacity IMHO. >> >> The only area where I could see the PSF helping is the point 2: more >> documentation, more stories about 2->3 transition. > > I'm surprised to hear you say that. 
I would think additional developer > and/or financial resources would be useful, for all of the reasons you > listed. If there was enough resources to pay someone very familiar with numpy codebase for a long time, then yes, it could be useful - but I assume that's out of the question. This would be very expensive as it would requires several full months IMO. The PSF could help for the point 3, by porting other projects to py3k and documenting it. The only example I know so far is pycog2 (http://mail.python.org/pipermail/python-porting/2008-December/000010.html). Paying people to do documentation about porting C code seems like a good way to spend money: it would be useful outside numpy community, and would presumably be less costly. David From fperez.net at gmail.com Tue Sep 8 21:03:25 2009 From: fperez.net at gmail.com (Fernando Perez) Date: Tue, 8 Sep 2009 18:03:25 -0700 Subject: [Numpy-discussion] question about future support for python-3 In-Reply-To: <5b8d13220909081708m5ba63ee7sad2c56d16b5eeec1@mail.gmail.com> References: <5b8d13220909081708m5ba63ee7sad2c56d16b5eeec1@mail.gmail.com> Message-ID: On Tue, Sep 8, 2009 at 5:08 PM, David Cournapeau wrote: > ?- it remains to be seen whether we can do the py3k support in the > same source tree as the one use for python >= 2.4. Having two source > trees would make the effort even much bigger, well over the current > developers capacity IMHO. I know ipython is a very different beast than numpy for this discussion (no C code at all, but extensive, invasive and often obscure use of the stdlib and the language itself). But FWIW, I have convinced myself that we will only really be able to seriously tackle the 3 transition when we can ditch 2.5 compatibility and have a tree that runs for 2.6 only, with all the -3 options turned on. Only at that point does it become feasible to start attacking the 3 transition for us. We simply don't have the manpower to manage multiple source trees that diverge fully and exist separately for 2.x and 3.x. Cheers, f From doutriaux1 at llnl.gov Tue Sep 8 21:03:42 2009 From: doutriaux1 at llnl.gov (=?UTF-8?Q?Charles_=D8=B3=D9=85=D9=8A=D8=B1_Doutriaux?=) Date: Tue, 8 Sep 2009 18:03:42 -0700 Subject: [Numpy-discussion] dtype and dtype.char In-Reply-To: References: <29F41702-997A-47A6-9240-67D09C0F091A@llnl.gov> <3d375d730909081302o67b5b086o3941f99380eb0a2c@mail.gmail.com> Message-ID: Ok I finally got it.... I was going at it backward... Instead of checking for NPY_INT64 and trying to figure out which letter it is different on each platform) I needed to check for NPY_LONGLONG /NPY_LONG/ NPY_INT, etc.. i.e I need to check for the numpy types that have an associated unique letter. Not their aliases since these can be different... It works now. C. On Sep 8, 2009, at 1:13 PM, Charles ???? Doutriaux wrote: > Hi Robert, > > Ok we have a section of code that used to be like that: > > char t; > switch(type) { > case NPY_CHAR: > t = 'c'; > break; > etc... > > I now replaced with > char t; > switch(type) { > case NPY_CHAR: > t = NPY_CHARLTR; > break; > > But I'm still stuck with numpy.uint64 > NPY_UINT64LTR does not seem to exist.... > > What do you recommend? > > C. > > On Sep 8, 2009, at 1:02 PM, Robert Kern wrote: > >> 2009/9/8 Charles ???? Doutriaux : >>> Hi, >>> >>> I'm testing our code on 64bit vs 32bit >>> >>> I just realized that the dtype.car is platform dependent. 
>>> >>> I guess it's normal >>> >>> her emy little test: >>> for t in >>> [numpy >>> .byte >>> ,numpy >>> .short >>> ,numpy >>> .int >>> ,numpy >>> .int32 >>> ,numpy >>> .float >>> ,numpy >>> .float32 >>> ,numpy >>> .double >>> ,numpy.ubyte,numpy.ushort,numpy.uint,numpy.int64,numpy.uint64]: >>> print 'Testing type:',t >>> data = numpy.array([0], dtype=t) >>> print data.dtype.char,data.dtype >>> >>> >>> On 64bit I get for numpy.unit64: >>> Testing type: >>> L uint64 >>> >>> Whereas on 32bit i get >>> Testing type: >>> Q uint64 >>> >>> Is it really normal? I guess that means I shouldn't expect the >>> dtype.char to be the same on all platform.... >>> >>> Is that right? >> >> Yes. dtype.char corresponds more closely to the C type ("L" == >> "unsigned long" and "Q" == "unsigned long long") which is platform >> specific. >> >> -- >> Robert Kern >> >> "I have come to believe that the whole world is an enigma, a harmless >> enigma that is made terrible by our own mad attempt to interpret it >> as >> though it had an underlying truth." >> -- Umberto Eco >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://**mail.scipy.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://*mail.scipy.org/mailman/listinfo/numpy-discussion From charlesr.harris at gmail.com Tue Sep 8 21:41:23 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 8 Sep 2009 20:41:23 -0500 Subject: [Numpy-discussion] Huge arrays In-Reply-To: References: Message-ID: On Tue, Sep 8, 2009 at 7:30 PM, Daniel Platz < mail.to.daniel.platz at googlemail.com> wrote: > Hi, > > I have a numpy newbie question. I want to store a huge amount of data > in an array. This data come from a measurement setup and I want to > write them to disk later since there is nearly no time for this during > the measurement. To put some numbers up: I have 2*256*2000000 int16 > numbers which I want to store. I tried > > data1 = numpy.zeros((256,2000000),dtype=int16) > data2 = numpy.zeros((256,2000000),dtype=int16) > > This works for the first array data1. However, it returns with a > memory error for array data2. I have read somewhere that there is a > 2GB limit for numpy arrays on a 32 bit machine but shouldn't I still > be below that? I use Windows XP Pro 32 bit with 3GB of RAM. > > More precisely, 2GB for windows and 3GB for (non-PAE enabled) linux. The rest of the address space is set aside for the operating system. Note that address space is not the same as physical memory, but it sets a limit on what you can use, whether swap or real memory. Chuck. -------------- next part -------------- An HTML attachment was scrubbed... URL: From sturla at molden.no Wed Sep 9 00:19:15 2009 From: sturla at molden.no (Sturla Molden) Date: Wed, 09 Sep 2009 06:19:15 +0200 Subject: [Numpy-discussion] Huge arrays In-Reply-To: References: Message-ID: <4AA72CC3.5050004@molden.no> Daniel Platz skrev: > data1 = numpy.zeros((256,2000000),dtype=int16) > data2 = numpy.zeros((256,2000000),dtype=int16) > > This works for the first array data1. However, it returns with a > memory error for array data2. I have read somewhere that there is a > 2GB limit for numpy arrays on a 32 bit machine but shouldn't I still > be below that? I use Windows XP Pro 32 bit with 3GB of RAM. There is a 2 GB limit for user space on Win32, this is about 1.9 GB. 
You have other programs running as well, so this is still too much. Also Windows reserves 50% of RAM for itself, so you have less than 1.5 GB to play with. S.M. From seb.haase at gmail.com Wed Sep 9 01:10:58 2009 From: seb.haase at gmail.com (Sebastian Haase) Date: Wed, 9 Sep 2009 07:10:58 +0200 Subject: [Numpy-discussion] Huge arrays In-Reply-To: <4AA72CC3.5050004@molden.no> References: <4AA72CC3.5050004@molden.no> Message-ID: Hi, you can probably use PyTables for this. Even though it's meant to save/load data to/from disk (in HDF5 format) as far as I understand, it can be used to make your task solvable - even on a 32bit system !! It's free (pytables.org) -- so maybe you can try it out and tell me if I'm right .... Or someone else here would know right away... Cheers, Sebastian Haase On Wed, Sep 9, 2009 at 6:19 AM, Sturla Molden wrote: > Daniel Platz skrev: >> data1 = numpy.zeros((256,2000000),dtype=int16) >> data2 = numpy.zeros((256,2000000),dtype=int16) >> >> This works for the first array data1. However, it returns with a >> memory error for array data2. I have read somewhere that there is a >> 2GB limit for numpy arrays on a 32 bit machine but shouldn't I still >> be below that? I use Windows XP Pro 32 bit with 3GB of RAM. > > There is a 2 GB limit for user space on Win32, this is about 1.9 GB. You > have other programs running as well, so this is still too much. Also > Windows reserves 50% of RAM for itself, so you have less than 1.5 GB to > play with. > > S.M. > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From cournape at gmail.com Wed Sep 9 01:22:33 2009 From: cournape at gmail.com (David Cournapeau) Date: Wed, 9 Sep 2009 14:22:33 +0900 Subject: [Numpy-discussion] Huge arrays In-Reply-To: References: <4AA72CC3.5050004@molden.no> Message-ID: <5b8d13220909082222s59ecb06bjb3816ed94572b911@mail.gmail.com> On Wed, Sep 9, 2009 at 2:10 PM, Sebastian Haase wrote: > Hi, > you can probably use PyTables for this. Even though it's meant to > save/load data to/from disk (in HDF5 format) as far as I understand, > it can be used to make your task solvable - even on a 32bit system !! > It's free (pytables.org) -- so maybe you can try it out and tell me if > I'm right .... You still would not be able to load a numpy array > 2 Gb. Numpy memory model needs one contiguously addressable chunk of memory for the data, which is limited under the 32 bits archs. This cannot be overcome in any way AFAIK. You may be able to save data > 2 Gb, by appending several chunks < 2 Gb to disk - maybe pytables supports this if it has large file support (which enables to write files > 2Gb on a 32 bits system). cheers, David From faltet at pytables.org Wed Sep 9 04:48:48 2009 From: faltet at pytables.org (Francesc Alted) Date: Wed, 9 Sep 2009 10:48:48 +0200 Subject: [Numpy-discussion] Huge arrays In-Reply-To: <5b8d13220909082222s59ecb06bjb3816ed94572b911@mail.gmail.com> References: <5b8d13220909082222s59ecb06bjb3816ed94572b911@mail.gmail.com> Message-ID: <200909091048.49439.faltet@pytables.org> A Wednesday 09 September 2009 07:22:33 David Cournapeau escrigu?: > On Wed, Sep 9, 2009 at 2:10 PM, Sebastian Haase wrote: > > Hi, > > you can probably use PyTables for this. Even though it's meant to > > save/load data to/from disk (in HDF5 format) as far as I understand, > > it can be used to make your task solvable - even on a 32bit system !! 
> > It's free (pytables.org) -- so maybe you can try it out and tell me if > > I'm right .... > > You still would not be able to load a numpy array > 2 Gb. Numpy memory > model needs one contiguously addressable chunk of memory for the data, > which is limited under the 32 bits archs. This cannot be overcome in > any way AFAIK. > > You may be able to save data > 2 Gb, by appending several chunks < 2 > Gb to disk - maybe pytables supports this if it has large file support > (which enables to write files > 2Gb on a 32 bits system). Yes, this later is supported in PyTables as long as the underlying filesystem supports files > 2 GB, which is very usual in modern operating systems. This even works on 32-bit systems as the indexing machinery in Python has been completely replaced inside PyTables. However, I think that what Daniel is trying to achieve is to be able to keep all the info in-memory because writing it to disk is too slow. I also agree that your suggestion to use a 64-bit OS (or 32-bit Linux, as it can address the full 3GB right out-of-the-box, as Chuck said) is the way to go. OTOH, having the possibility to manage compressed data buffers transparently in NumPy would help here, but not there yet ;-) -- Francesc Alted From faltet at pytables.org Wed Sep 9 05:18:48 2009 From: faltet at pytables.org (Francesc Alted) Date: Wed, 9 Sep 2009 11:18:48 +0200 Subject: [Numpy-discussion] Fwd: GPU Numpy In-Reply-To: References: <4A8FBF64.50300@molden.no> Message-ID: <200909091118.48456.faltet@pytables.org> A Tuesday 08 September 2009 21:19:05 George Dahl escrigu?: > Sturla Molden molden.no> writes: > > Erik Tollerud skrev: > > >> NumPy arrays on the GPU memory is an easy task. But then I would have > > >> to write the computation in OpenCL's dialect of C99? > > > > > > This is true to some extent, but also probably difficult to do given > > > the fact that paralellizable algorithms are generally more difficult > > > to formulate in striaghtforward ways. > > > > Then you have misunderstood me completely. Creating an ndarray that has > > a buffer in graphics memory is not too difficult, given that graphics > > memory can be memory mapped. This has nothing to do with parallelizable > > algorithms or not. It is just memory management. We could make an > > ndarray subclass that quickly puts is content in a buffer accessible to > > the GPU. That is not difficult. But then comes the question of what you > > do with it. > > > > I think many here misunderstands the issue here: > > > > Teraflops peak performance of modern GPUs is impressive. But NumPy > > cannot easily benefit from that. In fact, there is little or nothing to > > gain from optimising in that end. In order for a GPU to help, > > computation must be the time-limiting factor. It is not. There is not > > more to say about using GPUs in NumPy right now. > > > > Take a look at the timings here: http://www.scipy.org/PerformancePython > > It shows that computing with NumPy is more than ten times slower than > > using plain C. This is despite NumPy being written in C. The NumPy code > > does not incur 10 times more floating point operations than the C code. > > The floating point unit does not run in turtle mode when using NumPy. > > NumPy's relative slowness compared to C has nothing to do with floating > > point computation. It is due to inferior memory use (temporary buffers, > > multiple buffer traversals) and memory access being slow. Moving > > computation to the GPU can only make this worse. > > > > Improved memory usage - e.g. 
through lazy evaluation and JIT compilation
> > of expressions - can give up to a tenfold increase in performance. That
> > is where we must start optimising to get a faster NumPy. Incidentally,
> > this will also make it easier to leverage on modern GPUs.
> >
> > Sturla Molden
>
> I know that for my work, I can get around an order of a 50-fold speedup
> over numpy using a python wrapper for a simple GPU matrix class.  So I
> might be dealing with a lot of matrix products where I multiply a fixed 512
> by 784 matrix by a 784 by 256 matrix that changes between each matrix
> product, although to really see the largest gains I use a 4096 by 2048
> matrix times a bunch of 2048 by 256 matrices.  If all I was doing were
> those matrix products, it would be even faster, but what I actually am
> doing is a matrix product, then adding a column vector to the result, then
> applying an elementwise logistic sigmoid function and potentially
> generating a matrix of pseudorandom numbers the same shape as my result
> (although not always).  When I do these sorts of workloads, my python
> numpy+GPU matrix class goes so much faster than anything that doesn't use
> the GPU (be it Matlab, or numpy, or C/C++ whatever) that I don't even
> bother measuring the speedups precisely.  In some cases, my python code
> isn't making too many temporaries since what it is doing is so simple, but
> in other cases that is obviously slowing it down a bit.  I have relatively
> complicated jobs that used to take weeks on the CPU and can now take hours
> or days.
>
> Obviously improved memory usage would be more helpful since not everyone
> has access to the sorts of GPUs I use, but tenfold increases in performance
> seem like chump change compared to what I see with the sorts of workloads I
> do.

50-fold increases over NumPy+[Atlas|MKL] are really impressive.  However,
the point is that these speed-ups can be achieved only when the ratio of
operations per element is really huge.  Matrix-matrix multiplication (your
example above) is a paradigmatic example of these scenarios: the computation
is O(n**3) (or slightly less, when optimized algorithms are used), while
memory access is O(n**2).  Of course, when the matrices are large, the ratio
operations/elements is larger, allowing much better speed-ups; this is why
GPUs really do a good job here.

The point here is that matrix-matrix multiplications (or, in general,
functions with a large operation/element ratio) are a *tiny* part of all the
possible operations between arrays that NumPy supports.  This is why Sturla
is saying that it is not a good idea to include support of GPUs in all parts
of NumPy.  A much better strategy is to give NumPy the possibility to link
with external packages (à la BLAS, LAPACK, Atlas, MKL) that can leverage the
powerful GPUs for specific problems (e.g. matrix-matrix multiplications).

--
Francesc Alted

From faltet at pytables.org  Wed Sep  9 05:26:06 2009
From: faltet at pytables.org (Francesc Alted)
Date: Wed, 9 Sep 2009 11:26:06 +0200
Subject: [Numpy-discussion] Fwd: GPU Numpy
In-Reply-To: <4AA6CAF1.9060303@noaa.gov>
References: <4AA6CAF1.9060303@noaa.gov>
Message-ID: <200909091126.06554.faltet@pytables.org>

A Tuesday 08 September 2009 23:21:53 Christopher Barker escrigué:
> Also, perhaps a GPU-aware numexpr could be helpful which I think is the
> kind of thing that Sturla was referring to when she wrote:
>
> "Incidentally, this will also make it easier to leverage on modern GPUs."
Numexpr mainly supports functions that are meant to be used element-wise, so the operation/element ratio is normally 1 (or close to 1). In these scenarios is where improved memory access is much more important than CPU (or, for that matter, GPU), and is the reason why numexpr is much more efficient than NumPy when evaluating complex expressions like ``a*b+c*sqrt(d)``. In other words, a GPU-enabled numexpr makes little sense. -- Francesc Alted From faltet at pytables.org Wed Sep 9 05:55:07 2009 From: faltet at pytables.org (Francesc Alted) Date: Wed, 9 Sep 2009 11:55:07 +0200 Subject: [Numpy-discussion] Huge arrays In-Reply-To: <200909091048.49439.faltet@pytables.org> References: <5b8d13220909082222s59ecb06bjb3816ed94572b911@mail.gmail.com> <200909091048.49439.faltet@pytables.org> Message-ID: <200909091155.07508.faltet@pytables.org> A Wednesday 09 September 2009 10:48:48 Francesc Alted escrigu?: > OTOH, having the possibility to manage compressed data buffers > transparently in NumPy would help here, but not there yet ;-) Now that I think about it, in case the data is compressible, Daniel could try to define a PyTables' compressed array or table on-disk and save chunks to it. If data is compressible enough, the filesystem cache will keep it in-memory, until the disk can eventually absorb it. For doing this, I would recommend to use the LZO compressor, as it is one of the fastest I've seen (at least until Blosc would be ready), because it can compress up to 5 times faster than output data to disk (depending on how compressible the data is, and the speed of the disk subsystem). Of course, if data is not compressible at all, then this venue doesn't make a lot of sense. HTH, -- Francesc Alted From sccolbert at gmail.com Wed Sep 9 05:57:36 2009 From: sccolbert at gmail.com (Chris Colbert) Date: Wed, 9 Sep 2009 05:57:36 -0400 Subject: [Numpy-discussion] Row-wise dot product? In-Reply-To: References: <710F2847B0018641891D9A21602763605AD159@ex3.envision.co.il> Message-ID: <7f014ea60909090257q37e1654g9984d60223928244@mail.gmail.com> the way I do my rotations is this: tmat = rotation matrix vec = stack of row vectors rotated_vecs = np.dot(tmat, vec.T).T On Mon, Sep 7, 2009 at 6:53 PM, T J wrote: > On Mon, Sep 7, 2009 at 3:43 PM, T J wrote: >> Or perhaps I am just being dense. >> > > Yes. ?I just tried to reinvent standard matrix multiplication. > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From rsalvador.wk at gmail.com Wed Sep 9 06:28:29 2009 From: rsalvador.wk at gmail.com (Ruben Salvador) Date: Wed, 9 Sep 2009 12:28:29 +0200 Subject: [Numpy-discussion] Adding a 2D with a 1D array... Message-ID: <4fe028e30909090328m4affca0cke983c3944b8a7233@mail.gmail.com> Hi there! I'm sure I'm missing something, but I am not able of doing a simple sum of two arrays with different dimensions. I have a 2D array and a 1D array np.shape(a) (8, 26) np.shape(b) (8,) and I want to sum each *row* of 'a' with the equivalent *row* of 'b' (this is, summing each 1D row array of 'a' with each scalar of 'b' for all rows). It's straight forward doing the sum with a for loop (yes, I learnt C long ago :S ): for i in range(8): c[i] = a[i] + b[i] but, is there a Pythonic way to do this? or even better...which is the pythonic way of doing it? because I'm sure there is one... Thanks, and sorry for such an easy question...but I'm stuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From lciti at essex.ac.uk Wed Sep 9 06:46:59 2009 From: lciti at essex.ac.uk (Citi, Luca) Date: Wed, 9 Sep 2009 11:46:59 +0100 Subject: [Numpy-discussion] Adding a 2D with a 1D array... References: <4fe028e30909090328m4affca0cke983c3944b8a7233@mail.gmail.com> Message-ID: <3DA3B328CBC48B4EBB88484B8A5EA19106AF9EA7@sernt14.essex.ac.uk> Hi Ruben One dimensional arrays can be thought of as rows. If you want a column, you need to append a dimension. >>> d = a + b[:,None] which is equivalent to >>> d = a + b[:,np.newaxis] Best, Luca From rsalvador.wk at gmail.com Wed Sep 9 07:08:22 2009 From: rsalvador.wk at gmail.com (Ruben Salvador) Date: Wed, 9 Sep 2009 13:08:22 +0200 Subject: [Numpy-discussion] Adding a 2D with a 1D array... In-Reply-To: <3DA3B328CBC48B4EBB88484B8A5EA19106AF9EA7@sernt14.essex.ac.uk> References: <4fe028e30909090328m4affca0cke983c3944b8a7233@mail.gmail.com> <3DA3B328CBC48B4EBB88484B8A5EA19106AF9EA7@sernt14.essex.ac.uk> Message-ID: <4fe028e30909090408m6e7b8128u4fe2927edef17dd6@mail.gmail.com> Perfect! Thank you very much :D It's not obvious, though...I think I should read more deeply into Python/NumPy...but for the use I'm giving to it... Anyway, I thought the pythonic way would be faster, but after trying with a size 80000 instead of 8...the for loop is faster! Pythonic time ==> 0.36776400 seconds For loop time ==> 0.31708717 seconds :S On Wed, Sep 9, 2009 at 12:46 PM, Citi, Luca wrote: > Hi Ruben > > One dimensional arrays can be thought of as rows. If you want a column, you > need to append a dimension. > > >>> d = a + b[:,None] > which is equivalent to > >>> d = a + b[:,np.newaxis] > > Best, > Luca > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav+sp at iki.fi Wed Sep 9 07:19:53 2009 From: pav+sp at iki.fi (Pauli Virtanen) Date: Wed, 9 Sep 2009 11:19:53 +0000 (UTC) Subject: [Numpy-discussion] Adding a 2D with a 1D array... References: <4fe028e30909090328m4affca0cke983c3944b8a7233@mail.gmail.com> <3DA3B328CBC48B4EBB88484B8A5EA19106AF9EA7@sernt14.essex.ac.uk> <4fe028e30909090408m6e7b8128u4fe2927edef17dd6@mail.gmail.com> Message-ID: Wed, 09 Sep 2009 13:08:22 +0200, Ruben Salvador wrote: > Perfect! Thank you very much :D > > It's not obvious, though...I think I should read more deeply into > Python/NumPy...but for the use I'm giving to it... > > Anyway, I thought the pythonic way would be faster, but after trying > with a size 80000 instead of 8...the for loop is faster! > > Pythonic time ==> 0.36776400 seconds > For loop time ==> 0.31708717 seconds Doubtful: In [1]: import numpy as np In [2]: a = np.zeros((80000, 26)) In [3]: b = np.zeros((80000,)) In [4]: def foo(): ...: d = np.empty(a.shape, a.dtype) ...: for i in xrange(a.shape[0]): ...: d[i] = a[i] + b[i] ...: In [5]: %timeit d=a+b[:,None] 100 loops, best of 3: 18.3 ms per loop In [6]: %timeit foo() 10 loops, best of 3: 334 ms per loop From washakie at gmail.com Wed Sep 9 07:25:16 2009 From: washakie at gmail.com (John [H2O]) Date: Wed, 9 Sep 2009 04:25:16 -0700 (PDT) Subject: [Numpy-discussion] re loading f2py modules in ipython Message-ID: <25362944.post@talk.nabble.com> Hello, I've started to rely more and more on f2py to create simple modules utilizing Fortran for efficiency. This is a great tool to have within Python! 
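(For reference, the kind of module I mean is built with something along the
lines of

    f2py -c -m flib flib.f90

and then used from Python as

    import flib
    flib.my_subroutine(...)

where "flib" and "my_subroutine" are of course just placeholder names.)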
A problem, however, is that unlike python modules, the reload() function does not seem to update the f2py modules within ipython (which I use extensively for testing). Is there another function to call? Regards! -- View this message in context: http://www.nabble.com/reloading-f2py-modules-in-ipython-tp25362944p25362944.html Sent from the Numpy-discussion mailing list archive at Nabble.com. From lciti at essex.ac.uk Wed Sep 9 07:20:56 2009 From: lciti at essex.ac.uk (Citi, Luca) Date: Wed, 9 Sep 2009 12:20:56 +0100 Subject: [Numpy-discussion] Adding a 2D with a 1D array... References: <4fe028e30909090328m4affca0cke983c3944b8a7233@mail.gmail.com><3DA3B328CBC48B4EBB88484B8A5EA19106AF9EA7@sernt14.essex.ac.uk> <4fe028e30909090408m6e7b8128u4fe2927edef17dd6@mail.gmail.com> Message-ID: <3DA3B328CBC48B4EBB88484B8A5EA19106AF9EA8@sernt14.essex.ac.uk> I am sorry but it doesn't make much sense. How do you measure the performance? Are you sure you include the creation of the "c" output array in the time spent (which is outside the for loop but should be considered anyway)? Here are my results... In [84]: a = np.random.rand(8,26) In [85]: b = np.random.rand(8) In [86]: def o(a,b): ....: c = np.empty_like(a) ....: for i in range(len(a)): ....: c[i] = a[i] + b[i] ....: return c ....: In [87]: d = a + b[:,None] In [88]: (d == o(a,b)).all() Out[88]: True In [89]: %timeit o(a,b) %ti10000 loops, best of 3: 36.8 ?s per loop In [90]: %timeit d = a + b[:,None] 100000 loops, best of 3: 5.17 ?s per loop In [91]: a = np.random.rand(80000,26) In [92]: b = np.random.rand(80000) In [93]: %timeit o(a,b) %ti10 loops, best of 3: 287 ms per loop In [94]: %timeit d = a + b[:,None] 100 loops, best of 3: 15.4 ms per loop From dsdale24 at gmail.com Wed Sep 9 08:15:39 2009 From: dsdale24 at gmail.com (Darren Dale) Date: Wed, 9 Sep 2009 08:15:39 -0400 Subject: [Numpy-discussion] question about future support for python-3 In-Reply-To: <5b8d13220909081802j2f8f9decxc2abe0ccc73bbc86@mail.gmail.com> References: <5b8d13220909081708m5ba63ee7sad2c56d16b5eeec1@mail.gmail.com> <5b8d13220909081802j2f8f9decxc2abe0ccc73bbc86@mail.gmail.com> Message-ID: On Tue, Sep 8, 2009 at 9:02 PM, David Cournapeau wrote: > On Wed, Sep 9, 2009 at 9:37 AM, Darren Dale wrote: >> Hi David, > >>> I already gave my own opinion on py3k, which can be summarized as: >>> ?- it is a huge effort, and no core numpy/scipy developer has >>> expressed the urge to transition to py3k, since py3k does not bring >>> much for scientific computing. >>> ?- very few packages with a significant portion of C have been ported >>> to my knowledge, hence very little experience on how to do it. AFAIK, >>> only small packages have been ported. Even big, pure python projects >>> have not been ported. The only big C project to have been ported is >>> python itself, and it broke compatibility and used a different source >>> tree than python 2. >>> ?- it remains to be seen whether we can do the py3k support in the >>> same source tree as the one use for python >= 2.4. Having two source >>> trees would make the effort even much bigger, well over the current >>> developers capacity IMHO. >>> >>> The only area where I could see the PSF helping is the point 2: more >>> documentation, more stories about 2->3 transition. >> >> I'm surprised to hear you say that. I would think additional developer >> and/or financial resources would be useful, for all of the reasons you >> listed. 
> > If there was enough resources to pay someone very familiar with numpy > codebase for a long time, then yes, it could be useful - but I assume > that's out of the question. This would be very expensive as it would > requires several full months IMO. > > The PSF could help for the point 3, by porting other projects to py3k > and documenting it. The only example I know so far is pycog2 > (http://mail.python.org/pipermail/python-porting/2008-December/000010.html). > > Paying people to do documentation about porting C code seems like a > good way to spend money: it would be useful outside numpy community, > and would presumably be less costly. Another topic concerning documentation is API compatibility. The python devs have requested projects not use the 2-3 transition as an excuse to change their APIs, but numpy is maybe a special case. I'm thinking about PEP3118. Is numpy going to transition to python 3 and then down the road transition again to the new buffer protocol? What is the strategy here? My underinformed impression is that there isn't one, since every time PEP3118 is considered in the context of the 2-3 transition somebody helpfully reminds the list that we aren't supposed to break APIs. Numpy is a critical python library, perhaps the transition presents an opportunity, if the community can yield a little on numpy's C api. For example, in the long run, what would it take to get numpy (or the core thereof) into the standard library, and can we take steps now in that direction? Would the numpy devs be receptive to comments from the python devs on the existing numpy codebase? I'm willing to pitch in and work on the transition, not because I need python-3 right now, but because the transition needs to happen and it would benefit everyone in the long run. But I would like to know that we are making the most of the opportunity, and have considered our options. Darren From rsalvador.wk at gmail.com Wed Sep 9 08:36:21 2009 From: rsalvador.wk at gmail.com (Ruben Salvador) Date: Wed, 9 Sep 2009 14:36:21 +0200 Subject: [Numpy-discussion] Adding a 2D with a 1D array... In-Reply-To: <3DA3B328CBC48B4EBB88484B8A5EA19106AF9EA8@sernt14.essex.ac.uk> References: <4fe028e30909090328m4affca0cke983c3944b8a7233@mail.gmail.com> <3DA3B328CBC48B4EBB88484B8A5EA19106AF9EA7@sernt14.essex.ac.uk> <4fe028e30909090408m6e7b8128u4fe2927edef17dd6@mail.gmail.com> <3DA3B328CBC48B4EBB88484B8A5EA19106AF9EA8@sernt14.essex.ac.uk> Message-ID: <4fe028e30909090536x3e82c032odc09f6b89dd34990@mail.gmail.com> Your results are what I expected...but. This code is called from my main program, and what I have in there (output array already created for both cases) is: print "lambd", lambd print "np.shape(a)", np.shape(a) print "np.shape(r)", np.shape(r) print "np.shape(offspr)", np.shape(offspr) t = clock() for i in range(lambd): offspr[i] = r[i] + a[i] t1 = clock() - t print "For loop time ==> %.8f seconds" % t1 t2 = clock() offspr = r + a[:,None] t3 = clock() - t2 print "Pythonic time ==> %.8f seconds" % t3 The results I obtain are: lambd 80000 np.shape(a) (80000,) np.shape(r) (80000, 26) np.shape(offspr) (80000, 26) For loop time ==> 0.34528804 seconds Pythonic time ==> 0.35956192 seconds Maybe I'm not measuring properly, so, how should I do it? On Wed, Sep 9, 2009 at 1:20 PM, Citi, Luca wrote: > I am sorry but it doesn't make much sense. > How do you measure the performance? > Are you sure you include the creation of the "c" output array in the time > spent (which is outside the for loop but should be considered anyway)? 
> > Here are my results... > > In [84]: a = np.random.rand(8,26) > > In [85]: b = np.random.rand(8) > > In [86]: def o(a,b): > ....: c = np.empty_like(a) > ....: for i in range(len(a)): > ....: c[i] = a[i] + b[i] > ....: return c > ....: > > In [87]: d = a + b[:,None] > > In [88]: (d == o(a,b)).all() > Out[88]: True > > In [89]: %timeit o(a,b) > %ti10000 loops, best of 3: 36.8 ?s per loop > > In [90]: %timeit d = a + b[:,None] > 100000 loops, best of 3: 5.17 ?s per loop > > In [91]: a = np.random.rand(80000,26) > > In [92]: b = np.random.rand(80000) > > In [93]: %timeit o(a,b) > %ti10 loops, best of 3: 287 ms per loop > > In [94]: %timeit d = a + b[:,None] > 100 loops, best of 3: 15.4 ms per loop > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rsalvador.wk at gmail.com Wed Sep 9 08:42:43 2009 From: rsalvador.wk at gmail.com (Ruben Salvador) Date: Wed, 9 Sep 2009 14:42:43 +0200 Subject: [Numpy-discussion] Adding a 2D with a 1D array... In-Reply-To: <4fe028e30909090536x3e82c032odc09f6b89dd34990@mail.gmail.com> References: <4fe028e30909090328m4affca0cke983c3944b8a7233@mail.gmail.com> <3DA3B328CBC48B4EBB88484B8A5EA19106AF9EA7@sernt14.essex.ac.uk> <4fe028e30909090408m6e7b8128u4fe2927edef17dd6@mail.gmail.com> <3DA3B328CBC48B4EBB88484B8A5EA19106AF9EA8@sernt14.essex.ac.uk> <4fe028e30909090536x3e82c032odc09f6b89dd34990@mail.gmail.com> Message-ID: <4fe028e30909090542m52accbcfkdbe070599f112fa1@mail.gmail.com> I forgot...just in case: rsalvador at cactus:~$ python --version Python 2.5.2 python-scipy: version 0.6.0 On Wed, Sep 9, 2009 at 2:36 PM, Ruben Salvador wrote: > Your results are what I expected...but. This code is called from my main > program, and what I have in there (output array already created for both > cases) is: > > print "lambd", lambd > print "np.shape(a)", np.shape(a) > print "np.shape(r)", np.shape(r) > print "np.shape(offspr)", np.shape(offspr) > t = clock() > for i in range(lambd): > offspr[i] = r[i] + a[i] > t1 = clock() - t > print "For loop time ==> %.8f seconds" % t1 > t2 = clock() > offspr = r + a[:,None] > t3 = clock() - t2 > print "Pythonic time ==> %.8f seconds" % t3 > > The results I obtain are: > > lambd 80000 > np.shape(a) (80000,) > np.shape(r) (80000, 26) > np.shape(offspr) (80000, 26) > For loop time ==> 0.34528804 seconds > Pythonic time ==> 0.35956192 seconds > > Maybe I'm not measuring properly, so, how should I do it? > > On Wed, Sep 9, 2009 at 1:20 PM, Citi, Luca wrote: > >> I am sorry but it doesn't make much sense. >> How do you measure the performance? >> Are you sure you include the creation of the "c" output array in the time >> spent (which is outside the for loop but should be considered anyway)? >> >> Here are my results... 
>> >> In [84]: a = np.random.rand(8,26) >> >> In [85]: b = np.random.rand(8) >> >> In [86]: def o(a,b): >> ....: c = np.empty_like(a) >> ....: for i in range(len(a)): >> ....: c[i] = a[i] + b[i] >> ....: return c >> ....: >> >> In [87]: d = a + b[:,None] >> >> In [88]: (d == o(a,b)).all() >> Out[88]: True >> >> In [89]: %timeit o(a,b) >> %ti10000 loops, best of 3: 36.8 ?s per loop >> >> In [90]: %timeit d = a + b[:,None] >> 100000 loops, best of 3: 5.17 ?s per loop >> >> In [91]: a = np.random.rand(80000,26) >> >> In [92]: b = np.random.rand(80000) >> >> In [93]: %timeit o(a,b) >> %ti10 loops, best of 3: 287 ms per loop >> >> In [94]: %timeit d = a + b[:,None] >> 100 loops, best of 3: 15.4 ms per loop >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From lev at columbia.edu Wed Sep 9 09:36:30 2009 From: lev at columbia.edu (Lev Givon) Date: Wed, 9 Sep 2009 09:36:30 -0400 Subject: [Numpy-discussion] Fwd: GPU Numpy In-Reply-To: <200909091118.48456.faltet@pytables.org> References: <4A8FBF64.50300@molden.no> <200909091118.48456.faltet@pytables.org> Message-ID: <20090909133630.GC25357@localhost.columbia.edu> Received from Francesc Alted on Wed, Sep 09, 2009 at 05:18:48AM EDT: (snip) > The point here is that matrix-matrix multiplications (or, in general, > functions with a large operation/element ratio) are a *tiny* part of all the > possible operations between arrays that NumPy supports. This is why Sturla is > saying that it is not a good idea to include support of GPUs in all parts of > NumPy. A much better strategy is to give NumPy the possibility to link with > external packages (? la BLAS, LAPACK, Atlas, MKL) that can leverage the .. and CULA: http://www.culatools.com/ > powerful GPUs for specific problems (e.g. matrix-matrix multiplications). L.G. From faltet at pytables.org Wed Sep 9 10:41:17 2009 From: faltet at pytables.org (Francesc Alted) Date: Wed, 9 Sep 2009 16:41:17 +0200 Subject: [Numpy-discussion] Fwd: GPU Numpy In-Reply-To: <200909091126.06554.faltet@pytables.org> References: <4AA6CAF1.9060303@noaa.gov> <200909091126.06554.faltet@pytables.org> Message-ID: <200909091641.17205.faltet@pytables.org> A Wednesday 09 September 2009 11:26:06 Francesc Alted escrigu?: > A Tuesday 08 September 2009 23:21:53 Christopher Barker escrigu?: > > Also, perhaps a GPU-aware numexpr could be helpful which I think is the > > kind of thing that Sturla was refering to when she wrote: > > > > "Incidentally, this will also make it easier to leverage on modern > > GPUs." > > Numexpr mainly supports functions that are meant to be used element-wise, > so the operation/element ratio is normally 1 (or close to 1). In these > scenarios is where improved memory access is much more important than CPU > (or, for that matter, GPU), and is the reason why numexpr is much more > efficient than NumPy when evaluating complex expressions like > ``a*b+c*sqrt(d)``. > > In other words, a GPU-enabled numexpr makes little sense. Er, I forgot the fact that one exception to operation/element ratio being normally 1 in numexpr is the computation of transcendental functions (trigonometrical, exponential, logarithmic...) where the number of CPU operations per element is much larger than 1 (normally in the 100s). 
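For example, with something like

    import numpy as np
    import numexpr as ne
    a = np.random.rand(1000000)
    b = np.random.rand(1000000)
    c = ne.evaluate("exp(a) + cos(b)")

practically all of the time goes into the exp() and cos() kernels rather
than into moving data through memory, so the usual numexpr advantage
(avoiding temporaries and extra buffer traversals) matters much less.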
Right now, there is support for accelerating them in numexpr via VML (Intel's Vector Math Library), but I suppose that a library making use of a GPU would be very interesting too (and the same applies to numpy). But again, it makes more sense to rely on external packages or libraries (similar to the VML above) for this sort of things. After having a look at CULA (thanks for the pointer, Lev!), my hope is that in short we will see other libraries allowing for efficient evaluation of transcendental functions using GPUs too. -- Francesc Alted From robert.kern at gmail.com Wed Sep 9 11:21:45 2009 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 9 Sep 2009 10:21:45 -0500 Subject: [Numpy-discussion] question about future support for python-3 In-Reply-To: References: <5b8d13220909081708m5ba63ee7sad2c56d16b5eeec1@mail.gmail.com> <5b8d13220909081802j2f8f9decxc2abe0ccc73bbc86@mail.gmail.com> Message-ID: <3d375d730909090821scc9f7fahf55b17acf563ce54@mail.gmail.com> On Wed, Sep 9, 2009 at 07:15, Darren Dale wrote: > Another topic concerning documentation is API compatibility. The > python devs have requested projects not use the 2-3 transition as an > excuse to change their APIs, but numpy is maybe a special case. I'm > thinking about PEP3118. Is numpy going to transition to python 3 and > then down the road transition again to the new buffer protocol? What > is the strategy here? My underinformed impression is that there isn't > one, since every time PEP3118 is considered in the context of the 2-3 > transition somebody helpfully reminds the list that we aren't supposed > to break APIs. We aren't supposed to break APIs that aren't related to the 2-3 transition. PEP3118 is related to the 2-3 transition. Since I'm that somebody that always pipes up about this topic, I'm pretty sure it hasn't been PEP3118-related breakage that has been proposed. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From robert.kern at gmail.com Wed Sep 9 11:22:43 2009 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 9 Sep 2009 10:22:43 -0500 Subject: [Numpy-discussion] re loading f2py modules in ipython In-Reply-To: <25362944.post@talk.nabble.com> References: <25362944.post@talk.nabble.com> Message-ID: <3d375d730909090822q7cb8afcdp4e31f5bcf50c77ba@mail.gmail.com> On Wed, Sep 9, 2009 at 06:25, John [H2O] wrote: > > Hello, > > I've started to rely more and more on f2py to create simple modules > utilizing Fortran for efficiency. This is a great tool to have within > Python! > > A problem, however, is that unlike python modules, the reload() function > does not seem to update the f2py modules within ipython (which I use > extensively for testing). > > Is there another function to call? No. Extension modules, regardless of whether they are built with f2py or pure C, cannot be be reloaded. It's a limitation of CPython. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From charlesr.harris at gmail.com Wed Sep 9 11:25:31 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 9 Sep 2009 10:25:31 -0500 Subject: [Numpy-discussion] question about future support for python-3 In-Reply-To: References: <5b8d13220909081708m5ba63ee7sad2c56d16b5eeec1@mail.gmail.com> <5b8d13220909081802j2f8f9decxc2abe0ccc73bbc86@mail.gmail.com> Message-ID: On Wed, Sep 9, 2009 at 7:15 AM, Darren Dale wrote: > On Tue, Sep 8, 2009 at 9:02 PM, David Cournapeau > wrote: > > On Wed, Sep 9, 2009 at 9:37 AM, Darren Dale wrote: > >> Hi David, > > > >>> I already gave my own opinion on py3k, which can be summarized as: > >>> - it is a huge effort, and no core numpy/scipy developer has > >>> expressed the urge to transition to py3k, since py3k does not bring > >>> much for scientific computing. > >>> - very few packages with a significant portion of C have been ported > >>> to my knowledge, hence very little experience on how to do it. AFAIK, > >>> only small packages have been ported. Even big, pure python projects > >>> have not been ported. The only big C project to have been ported is > >>> python itself, and it broke compatibility and used a different source > >>> tree than python 2. > >>> - it remains to be seen whether we can do the py3k support in the > >>> same source tree as the one use for python >= 2.4. Having two source > >>> trees would make the effort even much bigger, well over the current > >>> developers capacity IMHO. > >>> > >>> The only area where I could see the PSF helping is the point 2: more > >>> documentation, more stories about 2->3 transition. > >> > >> I'm surprised to hear you say that. I would think additional developer > >> and/or financial resources would be useful, for all of the reasons you > >> listed. > > > > If there was enough resources to pay someone very familiar with numpy > > codebase for a long time, then yes, it could be useful - but I assume > > that's out of the question. This would be very expensive as it would > > requires several full months IMO. > > > > The PSF could help for the point 3, by porting other projects to py3k > > and documenting it. The only example I know so far is pycog2 > > ( > http://mail.python.org/pipermail/python-porting/2008-December/000010.html > ). > > > > Paying people to do documentation about porting C code seems like a > > good way to spend money: it would be useful outside numpy community, > > and would presumably be less costly. > > Another topic concerning documentation is API compatibility. The > python devs have requested projects not use the 2-3 transition as an > excuse to change their APIs, but numpy is maybe a special case. I'm > thinking about PEP3118. Is numpy going to transition to python 3 and > then down the road transition again to the new buffer protocol? What > is the strategy here? My underinformed impression is that there isn't > one, since every time PEP3118 is considered in the context of the 2-3 > transition somebody helpfully reminds the list that we aren't supposed > to break APIs. Numpy is a critical python library, perhaps the > transition presents an opportunity, if the community can yield a > little on numpy's C api. For example, in the long run, what would it > take to get numpy (or the core thereof) into the standard library, and > can we take steps now in that direction? Would the numpy devs be > receptive to comments from the python devs on the existing numpy > codebase? 
> > I'm willing to pitch in and work on the transition, not because I need > python-3 right now, but because the transition needs to happen and it > would benefit everyone in the long run. But I would like to know that > we are making the most of the opportunity, and have considered our > options. > > Making numpy more buffer centric is an interesting idea and might be where we want to go with the ufuncs, but the new buffer protocol didn't go in until python 2.6. If there was no rush I'd go with Fernando and wait until we could be all python 2.6 all the time. However, if anyone has the time to work on getting the c-code up to snuff and finding out what the problems are I'm all for that. I have some notes on the transition in the src directory and if you do anything please keep them current. There is a lot of work to be done in the python code also, some of which can be done at this time, i.e., making all exceptions use class ctors, etc. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From bergstrj at iro.umontreal.ca Wed Sep 9 11:27:30 2009 From: bergstrj at iro.umontreal.ca (James Bergstra) Date: Wed, 9 Sep 2009 11:27:30 -0400 Subject: [Numpy-discussion] Fwd: GPU Numpy In-Reply-To: <200909091641.17205.faltet@pytables.org> References: <4AA6CAF1.9060303@noaa.gov> <200909091126.06554.faltet@pytables.org> <200909091641.17205.faltet@pytables.org> Message-ID: <7f1eaee30909090827s34cab32ctb6520bec42422924@mail.gmail.com> On Wed, Sep 9, 2009 at 10:41 AM, Francesc Alted wrote: >> Numexpr mainly supports functions that are meant to be used element-wise, >> so the operation/element ratio is normally 1 (or close to 1). In these >> scenarios is where improved memory access is much more important than CPU >> (or, for that matter, GPU), and is the reason why numexpr is much more >> efficient than NumPy when evaluating complex expressions like >> ``a*b+c*sqrt(d)``. >> >> In other words, a GPU-enabled numexpr makes little sense. There's another way of looking at this, which has been mentioned before in the conversation, but which I think should be mentioned again... The cost of transfer to and from a GPU is very high, compared with most of the sorts of things that we do with ndarrays. So the approach of using libraries to speed up little pieces here and there (i.e. with VML or ATLAS) but basically to let stock numpy take care of the rest does not work. In order to benefit from huge speedups on a GPU, data need to be on the GPU already. It is a good idea to perform low-instruction density functions on the GPU even when the CPU could go just as fast (or even if the CPU is faster!) just to ensure that the data stay on the GPU. Suppose you want to evaluate "dot(a*b+c*sqrt(d), e)". The GPU is great for doing dot(), but if you have to copy the result of the elemwise expression to the GPU before you can start doing dot(), then the performance advantage is ruined. Except for huge matrices, you might as well just leave the data in the system RAM and use a normal BLAS library. So that's why it is a good idea to use the GPU to do some functions even when the CPU would be faster for them (in isolation). All that said, there is a possibility that future devices (and some laptops already?) will use an integrated memory system that might make 'copying to the GPU' a non-issue... but we're not there yet I think... 
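To make the pattern concrete, here is a rough, untested sketch (from memory)
using PyCUDA's gpuarray module as the stand-in for a GPU array class; the
array sizes are arbitrary:

    import numpy as np
    import pycuda.autoinit
    import pycuda.gpuarray as gpuarray
    import pycuda.cumath as cumath

    a = np.random.rand(4096, 2048).astype(np.float32)
    b = np.random.rand(4096, 2048).astype(np.float32)

    a_gpu = gpuarray.to_gpu(a)        # pay the host->device copy once
    b_gpu = gpuarray.to_gpu(b)
    # keep even the cheap elementwise work on the device, so the data
    # never round-trips through host memory between steps
    c_gpu = a_gpu * b_gpu + cumath.sqrt(b_gpu)
    c = c_gpu.get()                   # copy back only the final result

The two to_gpu() calls and the final get() are the only transfers, no matter
how many device-side operations happen in between.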
James -- http://www-etud.iro.umontreal.ca/~bergstrj From dagss at student.matnat.uio.no Wed Sep 9 11:34:03 2009 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Wed, 09 Sep 2009 17:34:03 +0200 Subject: [Numpy-discussion] Fwd: GPU Numpy In-Reply-To: <4AA6CAF1.9060303@noaa.gov> References: <7f1eaee30908061041l2cd76f64r96e7f5c7c16a2483@mail.gmail.com> <4A7B43CE.7050509@molden.no> <7f1eaee30908061429v5d04ab77v18b37a0a177548cd@mail.gmail.com> <4A7B5B06.2080909@molden.no> <4A7B62FC.6090504@molden.no> <4A8FBF64.50300@molden.no> <4AA6CAF1.9060303@noaa.gov> Message-ID: <4AA7CAEB.2090106@student.matnat.uio.no> Christopher Barker wrote: > George Dahl wrote: >> Sturla Molden molden.no> writes: >>> Teraflops peak performance of modern GPUs is impressive. But NumPy >>> cannot easily benefit from that. > >> I know that for my work, I can get around an order of a 50-fold speedup over >> numpy using a python wrapper for a simple GPU matrix class. > > I think you're talking across each other here. Sturla is referring to > making a numpy ndarray gpu-aware and then expecting expressions like: > > z = a*x**2 + b*x + c > > to go faster when s, b, c, and x are ndarrays. > > That's not going to happen. > > On the other hand, George is talking about moving higher-level > operations (like a matrix product) over to GPU code. This is analogous > to numpy.linalg and numpy.dot() using LAPACK routines, and yes, that > could help those programs that use such operations. > > So a GPU LAPACK would be nice. > > This is also analogous to using SWIG, or ctypes or cython or weave, or > ??? to move a computationally expensive part of the code over to C. > > I think anything that makes it easier to write little bits of your code > for the GPU would be pretty cool -- a GPU-aware Cython? Cython is probably open for that if anybody's interested in implementing it/make a student project on it (way too big for GSoC I think, unfortunately). However I'd definitely make it a generic library turning expressions into compiled code (either GPU or CPU w/SSE); that could then be used both at compile-time from Cython, or at run-time using e.g. SymPy or SAGE expressions. Both PyCUDA and CorePy would tend to allow both compile-time operation and run-time operation. -- Dag Sverre From dagss at student.matnat.uio.no Wed Sep 9 12:03:52 2009 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Wed, 09 Sep 2009 18:03:52 +0200 Subject: [Numpy-discussion] question about future support for python-3 In-Reply-To: References: <5b8d13220909081708m5ba63ee7sad2c56d16b5eeec1@mail.gmail.com> <5b8d13220909081802j2f8f9decxc2abe0ccc73bbc86@mail.gmail.com> Message-ID: <4AA7D1E8.7000800@student.matnat.uio.no> Darren Dale wrote: > On Tue, Sep 8, 2009 at 9:02 PM, David Cournapeau wrote: >> On Wed, Sep 9, 2009 at 9:37 AM, Darren Dale wrote: >>> Hi David, >>>> I already gave my own opinion on py3k, which can be summarized as: >>>> - it is a huge effort, and no core numpy/scipy developer has >>>> expressed the urge to transition to py3k, since py3k does not bring >>>> much for scientific computing. >>>> - very few packages with a significant portion of C have been ported >>>> to my knowledge, hence very little experience on how to do it. AFAIK, >>>> only small packages have been ported. Even big, pure python projects >>>> have not been ported. The only big C project to have been ported is >>>> python itself, and it broke compatibility and used a different source >>>> tree than python 2. 
>>>> - it remains to be seen whether we can do the py3k support in the >>>> same source tree as the one use for python >= 2.4. Having two source >>>> trees would make the effort even much bigger, well over the current >>>> developers capacity IMHO. >>>> >>>> The only area where I could see the PSF helping is the point 2: more >>>> documentation, more stories about 2->3 transition. >>> I'm surprised to hear you say that. I would think additional developer >>> and/or financial resources would be useful, for all of the reasons you >>> listed. >> If there was enough resources to pay someone very familiar with numpy >> codebase for a long time, then yes, it could be useful - but I assume >> that's out of the question. This would be very expensive as it would >> requires several full months IMO. >> >> The PSF could help for the point 3, by porting other projects to py3k >> and documenting it. The only example I know so far is pycog2 >> (http://mail.python.org/pipermail/python-porting/2008-December/000010.html). >> >> Paying people to do documentation about porting C code seems like a >> good way to spend money: it would be useful outside numpy community, >> and would presumably be less costly. > > Another topic concerning documentation is API compatibility. The > python devs have requested projects not use the 2-3 transition as an > excuse to change their APIs, but numpy is maybe a special case. I'm > thinking about PEP3118. Is numpy going to transition to python 3 and > then down the road transition again to the new buffer protocol? What > is the strategy here? My underinformed impression is that there isn't > one, since every time PEP3118 is considered in the context of the 2-3 > transition somebody helpfully reminds the list that we aren't supposed > to break APIs. Numpy is a critical python library, perhaps the I'd be surprised if this is the case and if there are any issues. What Robert said applies, plus: In Python 2.6 the ndarray type would support *both* the old and the new buffer protocols, which can be usedin parallel on Python 2.6. There's no real issue on the PEP 3118 at all as I can see, it just needs to be done. I'll try hard to give this a small start (ndarray export its buffer) in November (though when the time comes I might feel that I really should be studying instead...). > transition presents an opportunity, if the community can yield a > little on numpy's C api. For example, in the long run, what would it > take to get numpy (or the core thereof) into the standard library, and > can we take steps now in that direction? Would the numpy devs be > receptive to comments from the python devs on the existing numpy > codebase? I think this one is likely a question of semantics. My feeling is that for instance the slice-returns-a-view on an array type would be hard to swallow on a standard library component? (Seeing as list returns a copy.) Python 3 kind of solved this by calling the type "memoryview", which implies that slicing returns another view. I have a feeling the the best start in this direction might be for somebody to give the memoryview type in Python 3 some love, perhaps set it up as a light-weight ndarray replacement in the standard library. (If anybody implemented fancy indexing on a memoryview I suppose it should return a new view though (through a pointer table), meaning incompatability with NumPy's fancy indexing...) 
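(For what it's worth, plain slicing already has the "view, not copy"
semantics in py3k's memoryview, e.g.:

    >>> buf = bytearray(b"0123456789")
    >>> view = memoryview(buf)[2:5]    # a new view, no copy
    >>> buf[3] = ord("x")
    >>> view.tobytes()
    b'2x4'

so the real gap is fancy indexing, arithmetic and the rest of the ndarray
API.)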
> I'm willing to pitch in and work on the transition, not because I need > python-3 right now, but because the transition needs to happen and it > would benefit everyone in the long run. But I would like to know that > we are making the most of the opportunity, and have considered our > options. Well, something that may belong here: There's been some talk now and then on whether one should to port parts of the NumPy C codebase to Cython (which gives automatic Python 3 compatability, up to string/bytes issues etc.). That could probably take somewhat longer, but perhaps result in a better maintainable code base in the end which more people could work on. -- Dag Sverre From dsdale24 at gmail.com Wed Sep 9 12:06:24 2009 From: dsdale24 at gmail.com (Darren Dale) Date: Wed, 9 Sep 2009 12:06:24 -0400 Subject: [Numpy-discussion] question about future support for python-3 In-Reply-To: References: <5b8d13220909081708m5ba63ee7sad2c56d16b5eeec1@mail.gmail.com> <5b8d13220909081802j2f8f9decxc2abe0ccc73bbc86@mail.gmail.com> Message-ID: On Wed, Sep 9, 2009 at 11:25 AM, Charles R Harris wrote: > > > On Wed, Sep 9, 2009 at 7:15 AM, Darren Dale wrote: >> >> On Tue, Sep 8, 2009 at 9:02 PM, David Cournapeau >> wrote: >> > On Wed, Sep 9, 2009 at 9:37 AM, Darren Dale wrote: >> >> Hi David, >> > >> >>> I already gave my own opinion on py3k, which can be summarized as: >> >>> ?- it is a huge effort, and no core numpy/scipy developer has >> >>> expressed the urge to transition to py3k, since py3k does not bring >> >>> much for scientific computing. >> >>> ?- very few packages with a significant portion of C have been ported >> >>> to my knowledge, hence very little experience on how to do it. AFAIK, >> >>> only small packages have been ported. Even big, pure python projects >> >>> have not been ported. The only big C project to have been ported is >> >>> python itself, and it broke compatibility and used a different source >> >>> tree than python 2. >> >>> ?- it remains to be seen whether we can do the py3k support in the >> >>> same source tree as the one use for python >= 2.4. Having two source >> >>> trees would make the effort even much bigger, well over the current >> >>> developers capacity IMHO. >> >>> >> >>> The only area where I could see the PSF helping is the point 2: more >> >>> documentation, more stories about 2->3 transition. >> >> >> >> I'm surprised to hear you say that. I would think additional developer >> >> and/or financial resources would be useful, for all of the reasons you >> >> listed. >> > >> > If there was enough resources to pay someone very familiar with numpy >> > codebase for a long time, then yes, it could be useful - but I assume >> > that's out of the question. This would be very expensive as it would >> > requires several full months IMO. >> > >> > The PSF could help for the point 3, by porting other projects to py3k >> > and documenting it. The only example I know so far is pycog2 >> > >> > (http://mail.python.org/pipermail/python-porting/2008-December/000010.html). >> > >> > Paying people to do documentation about porting C code seems like a >> > good way to spend money: it would be useful outside numpy community, >> > and would presumably be less costly. >> >> Another topic concerning documentation is API compatibility. The >> python devs have requested projects not use the 2-3 transition as an >> excuse to change their APIs, but numpy is maybe a special case. I'm >> thinking about PEP3118. 
Is numpy going to transition to python 3 and >> then down the road transition again to the new buffer protocol? What >> is the strategy here? My underinformed impression is that there isn't >> one, since every time PEP3118 is considered in the context of the 2-3 >> transition somebody helpfully reminds the list that we aren't supposed >> to break APIs. Numpy is a critical python library, perhaps the >> transition presents an opportunity, if the community can yield a >> little on numpy's C api. For example, in the long run, what would it >> take to get numpy (or the core thereof) into the standard library, and >> can we take steps now in that direction? Would the numpy devs be >> receptive to comments from the python devs on the existing numpy >> codebase? >> >> I'm willing to pitch in and work on the transition, not because I need >> python-3 right now, but because the transition needs to happen and it >> would benefit everyone in the long run. But I would like to know that >> we are making the most of the opportunity, and have considered our >> options. >> > > Making numpy more buffer centric is an interesting idea and might be where > we want to go with the ufuncs, but the new buffer protocol didn't go in > until python 2.6. If there was no rush I'd go with Fernando and wait until > we could be all python 2.6 all the time. I wonder what such a timeframe would look like, what would decide when to require python-2.6 for future releases of packages. Could a maintenance-only branch be created for the numpy-1.4 or 1.5 series, and then future development require 2.6 or 3.1? > However,?if anyone?has the time to work on getting the c-code up to snuff > and finding out what the problems are I'm all for that. I have some notes on > the transition in the src directory and if you do anything please keep them > current. I will have a look, thank you for putting those notes together. Darren From dagss at student.matnat.uio.no Wed Sep 9 12:13:17 2009 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Wed, 09 Sep 2009 18:13:17 +0200 Subject: [Numpy-discussion] question about future support for python-3 In-Reply-To: <4AA7D1E8.7000800@student.matnat.uio.no> References: <5b8d13220909081708m5ba63ee7sad2c56d16b5eeec1@mail.gmail.com> <5b8d13220909081802j2f8f9decxc2abe0ccc73bbc86@mail.gmail.com> <4AA7D1E8.7000800@student.matnat.uio.no> Message-ID: <4AA7D41D.3060102@student.matnat.uio.no> Dag Sverre Seljebotn wrote: > Darren Dale wrote: >> On Tue, Sep 8, 2009 at 9:02 PM, David Cournapeau wrote: >>> On Wed, Sep 9, 2009 at 9:37 AM, Darren Dale wrote: >>>> Hi David, >>>>> I already gave my own opinion on py3k, which can be summarized as: >>>>> - it is a huge effort, and no core numpy/scipy developer has >>>>> expressed the urge to transition to py3k, since py3k does not bring >>>>> much for scientific computing. >>>>> - very few packages with a significant portion of C have been ported >>>>> to my knowledge, hence very little experience on how to do it. AFAIK, >>>>> only small packages have been ported. Even big, pure python projects >>>>> have not been ported. The only big C project to have been ported is >>>>> python itself, and it broke compatibility and used a different source >>>>> tree than python 2. >>>>> - it remains to be seen whether we can do the py3k support in the >>>>> same source tree as the one use for python >= 2.4. Having two source >>>>> trees would make the effort even much bigger, well over the current >>>>> developers capacity IMHO. 
>>>>> >>>>> The only area where I could see the PSF helping is the point 2: more >>>>> documentation, more stories about 2->3 transition. >>>> I'm surprised to hear you say that. I would think additional developer >>>> and/or financial resources would be useful, for all of the reasons you >>>> listed. >>> If there was enough resources to pay someone very familiar with numpy >>> codebase for a long time, then yes, it could be useful - but I assume >>> that's out of the question. This would be very expensive as it would >>> requires several full months IMO. >>> >>> The PSF could help for the point 3, by porting other projects to py3k >>> and documenting it. The only example I know so far is pycog2 >>> (http://mail.python.org/pipermail/python-porting/2008-December/000010.html). >>> >>> Paying people to do documentation about porting C code seems like a >>> good way to spend money: it would be useful outside numpy community, >>> and would presumably be less costly. >> Another topic concerning documentation is API compatibility. The >> python devs have requested projects not use the 2-3 transition as an >> excuse to change their APIs, but numpy is maybe a special case. I'm >> thinking about PEP3118. Is numpy going to transition to python 3 and >> then down the road transition again to the new buffer protocol? What >> is the strategy here? My underinformed impression is that there isn't >> one, since every time PEP3118 is considered in the context of the 2-3 >> transition somebody helpfully reminds the list that we aren't supposed >> to break APIs. Numpy is a critical python library, perhaps the > > I'd be surprised if this is the case and if there are any issues. > > What Robert said applies, plus: In Python 2.6 the ndarray type would > support *both* the old and the new buffer protocols, which can be usedin > parallel on Python 2.6. > > There's no real issue on the PEP 3118 at all as I can see, it just needs > to be done. I'll try hard to give this a small start (ndarray export its > buffer) in November (though when the time comes I might feel that I > really should be studying instead...). > >> transition presents an opportunity, if the community can yield a >> little on numpy's C api. For example, in the long run, what would it >> take to get numpy (or the core thereof) into the standard library, and >> can we take steps now in that direction? Would the numpy devs be >> receptive to comments from the python devs on the existing numpy >> codebase? > > I think this one is likely a question of semantics. My feeling is that > for instance the slice-returns-a-view on an array type would be hard to > swallow on a standard library component? (Seeing as list returns a copy.) > > Python 3 kind of solved this by calling the type "memoryview", which > implies that slicing returns another view. > > I have a feeling the the best start in this direction might be for > somebody to give the memoryview type in Python 3 some love, perhaps set > it up as a light-weight ndarray replacement in the standard library. > > (If anybody implemented fancy indexing on a memoryview I suppose it > should return a new view though (through a pointer table), meaning > incompatability with NumPy's fancy indexing...) > >> I'm willing to pitch in and work on the transition, not because I need >> python-3 right now, but because the transition needs to happen and it >> would benefit everyone in the long run. But I would like to know that >> we are making the most of the opportunity, and have considered our >> options. 
Another note: Perhaps there is an opportunity for replacing NumPy with more buffer-centric cross-library approaches in Python 3 eventually, but current NumPy with the current API really has to be ported to Python 3 just so that people can port their existing programs to Python 3. -- Dag Sverre From Chris.Barker at noaa.gov Wed Sep 9 12:40:47 2009 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Wed, 09 Sep 2009 09:40:47 -0700 Subject: [Numpy-discussion] question about future support for python-3 In-Reply-To: <3d375d730909090821scc9f7fahf55b17acf563ce54@mail.gmail.com> References: <5b8d13220909081708m5ba63ee7sad2c56d16b5eeec1@mail.gmail.com> <5b8d13220909081802j2f8f9decxc2abe0ccc73bbc86@mail.gmail.com> <3d375d730909090821scc9f7fahf55b17acf563ce54@mail.gmail.com> Message-ID: <4AA7DA8F.8050804@noaa.gov> Robert Kern wrote: > On Wed, Sep 9, 2009 at 07:15, Darren Dale wrote: > We aren't supposed to break APIs that aren't related to the 2-3 > transition. PEP3118 is related to the 2-3 transition. Since I'm that > somebody that always pipes up about this topic, I'm pretty sure it > hasn't been PEP3118-related breakage that has been proposed. Is there a difference between changing the C api and the Python API? I'd kind of expect that any C code is going to be broken more than Python code anyway, so maybe that's a distinction worth making. Or maybe not -- I suppose the logic is that the transition of user code form 2->3 should be as easy as possible, so we don't want users to have to deal with Python changes, and numpy changes, and wxPython changes, and ??? all at once. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From robert.kern at gmail.com Wed Sep 9 12:45:30 2009 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 9 Sep 2009 11:45:30 -0500 Subject: [Numpy-discussion] question about future support for python-3 In-Reply-To: <4AA7DA8F.8050804@noaa.gov> References: <5b8d13220909081708m5ba63ee7sad2c56d16b5eeec1@mail.gmail.com> <5b8d13220909081802j2f8f9decxc2abe0ccc73bbc86@mail.gmail.com> <3d375d730909090821scc9f7fahf55b17acf563ce54@mail.gmail.com> <4AA7DA8F.8050804@noaa.gov> Message-ID: <3d375d730909090945t2dfc05a4y91d596d8ecd7f819@mail.gmail.com> On Wed, Sep 9, 2009 at 11:40, Christopher Barker wrote: > Robert Kern wrote: >> On Wed, Sep 9, 2009 at 07:15, Darren Dale wrote: >> We aren't supposed to break APIs that aren't related to the 2-3 >> transition. PEP3118 is related to the 2-3 transition. Since I'm that >> somebody that always pipes up about this topic, I'm pretty sure it >> hasn't been PEP3118-related breakage that has been proposed. > > Is there a difference between changing the C api and the Python API? I'd > kind of expect that any C code is going to be broken more than Python > code anyway, so maybe that's a distinction worth making. > > Or maybe not -- I suppose the logic is that the transition of user code > form 2->3 should be as easy as possible, so we don't want users to have > to deal with Python changes, and numpy changes, and wxPython changes, > and ??? all at once. Yes, that is the logic. If the breakage is actually related to the Python 3 transition (e.g. reworking things to use bytes and unicode instead of str), then it's okay. You do whatever you need to do to convert your code to work with Python 3. 
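A typical transition-related change looks something like this (the function
is purely illustrative, not anything in numpy):

    # Under 2.x, text and bytes were both 'str'; under 3.x an API has to
    # decide which one it accepts and convert at the boundary.
    def parse_header(raw):
        if isinstance(raw, bytes):
            raw = raw.decode('ascii')
        return raw.split(':', 1)

That sort of change is fair game.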
What you shouldn't do is break APIs for reasons that are unrelated to the transition, like cleaning up various warts that have been bugging us over the years. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From dagss at student.matnat.uio.no Wed Sep 9 14:17:20 2009 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Wed, 09 Sep 2009 20:17:20 +0200 Subject: [Numpy-discussion] Adding a 2D with a 1D array... In-Reply-To: <4fe028e30909090536x3e82c032odc09f6b89dd34990@mail.gmail.com> References: <4fe028e30909090328m4affca0cke983c3944b8a7233@mail.gmail.com> <3DA3B328CBC48B4EBB88484B8A5EA19106AF9EA7@sernt14.essex.ac.uk> <4fe028e30909090408m6e7b8128u4fe2927edef17dd6@mail.gmail.com> <3DA3B328CBC48B4EBB88484B8A5EA19106AF9EA8@sernt14.essex.ac.uk> <4fe028e30909090536x3e82c032odc09f6b89dd34990@mail.gmail.com> Message-ID: <4AA7F130.3000805@student.matnat.uio.no> Ruben Salvador wrote: > Your results are what I expected...but. This code is called from my main > program, and what I have in there (output array already created for both > cases) is: > > print "lambd", lambd > print "np.shape(a)", np.shape(a) > print "np.shape(r)", np.shape(r) > print "np.shape(offspr)", np.shape(offspr) > t = clock() > for i in range(lambd): > offspr[i] = r[i] + a[i] > t1 = clock() - t > print "For loop time ==> %.8f seconds" % t1 > t2 = clock() > offspr = r + a[:,None] > t3 = clock() - t2 > print "Pythonic time ==> %.8f seconds" % t3 > > The results I obtain are: > > lambd 80000 > np.shape(a) (80000,) > np.shape(r) (80000, 26) > np.shape(offspr) (80000, 26) > For loop time ==> 0.34528804 seconds > Pythonic time ==> 0.35956192 seconds > > Maybe I'm not measuring properly, so, how should I do it? Like Luca said, you are not including the creation time of offspr in the for-loop version. A fairer comparison would be offspr[...] = r + a[:, None] Even fairer (one less temporary copy): offspr[...] = r offspr += a[:, None] Of course, see how the trend is for larger N as well. Also your timings are a bit crude (though this depends on how many times you ran your script to check :-)). To get better measurements, use the timeit module, or (easier) IPython and the %timeit command. > > On Wed, Sep 9, 2009 at 1:20 PM, Citi, Luca > wrote: > > I am sorry but it doesn't make much sense. > How do you measure the performance? > Are you sure you include the creation of the "c" output array in the > time spent (which is outside the for loop but should be considered > anyway)? > > Here are my results... 
> > In [84]: a = np.random.rand(8,26) > > In [85]: b = np.random.rand(8) > > In [86]: def o(a,b): > ....: c = np.empty_like(a) > ....: for i in range(len(a)): > ....: c[i] = a[i] + b[i] > ....: return c > ....: > > In [87]: d = a + b[:,None] > > In [88]: (d == o(a,b)).all() > Out[88]: True > > In [89]: %timeit o(a,b) > %ti10000 loops, best of 3: 36.8 ?s per loop > > In [90]: %timeit d = a + b[:,None] > 100000 loops, best of 3: 5.17 ?s per loop > > In [91]: a = np.random.rand(80000,26) > > In [92]: b = np.random.rand(80000) > > In [93]: %timeit o(a,b) > %ti10 loops, best of 3: 287 ms per loop > > In [94]: %timeit d = a + b[:,None] > 100 loops, best of 3: 15.4 ms per loop > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > ------------------------------------------------------------------------ > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -- Dag Sverre From dwf at cs.toronto.edu Wed Sep 9 17:52:41 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Wed, 9 Sep 2009 17:52:41 -0400 Subject: [Numpy-discussion] Huge arrays In-Reply-To: <200909091048.49439.faltet@pytables.org> References: <5b8d13220909082222s59ecb06bjb3816ed94572b911@mail.gmail.com> <200909091048.49439.faltet@pytables.org> Message-ID: <269F5A72-FAA0-4C61-BAF2-299C71C31EF4@cs.toronto.edu> On 9-Sep-09, at 4:48 AM, Francesc Alted wrote: > Yes, this later is supported in PyTables as long as the underlying > filesystem > supports files > 2 GB, which is very usual in modern operating > systems. I think the OP said he was on Win32, in which case it should be noted: FAT32 has its upper file size limit at 4GB (minus one byte), so storing both your arrays as one file on a FAT32 partition is a no-no. David From sturla at molden.no Thu Sep 10 00:04:24 2009 From: sturla at molden.no (Sturla Molden) Date: Thu, 10 Sep 2009 06:04:24 +0200 Subject: [Numpy-discussion] Fwd: GPU Numpy In-Reply-To: References: <7f1eaee30908061041l2cd76f64r96e7f5c7c16a2483@mail.gmail.com> <4A7B43CE.7050509@molden.no> <7f1eaee30908061429v5d04ab77v18b37a0a177548cd@mail.gmail.com> <4A7B5B06.2080909@molden.no> <4A7B62FC.6090504@molden.no> <4A8FBF64.50300@molden.no> Message-ID: <4AA87AC8.6070906@molden.no> George Dahl skrev: > I know that for my work, I can get around an order of a 50-fold speedup over > numpy using a python wrapper for a simple GPU matrix class. So I might be > dealing with a lot of matrix products where I multiply a fixed 512 by 784 matrix > by a 784 by 256 matrix that changes between each matrix product, although to > really see the largest gains I use a 4096 by 2048 matrix times a bunch of 2048 > by 256 matrices. Matrix multiplication is at the core of 3D graphics, and the raison d'etre for GPUs. That is specifically what they are designed to do. Matrix multiplication scale O(n**3) with floating point operations and O(n**2) with memory access. That is GPUs gives fast 3D graphics (matrix multiplications) by speeding up floating point operations. GPUs makes sence for certain level-3 BLAS calls, but that really belongs in BLAS, not in NumPy's core. One could e.g. consider linking with a BLAS wrapper that directs these special cases to the GPU and the rest to ATLAS / MKL / netlib BLAS. 
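(A back-of-envelope sketch of why matrix products, unlike element-wise operations, are the natural target for a GPU or an optimized BLAS: the flop count grows as O(n**3) while the data only grows as O(n**2), so the floating point units are not starved by memory.)

import numpy as np

n = 512
a = np.random.rand(n, n)
b = np.random.rand(n, n)
c = np.dot(a, b)                      # dispatched to whatever BLAS NumPy was built against

flops = 2.0 * n ** 3                  # one multiply and one add per inner-loop step
bytes_moved = 3 * n * n * a.itemsize  # read a, read b, write c (ignoring cache effects)
print(flops / bytes_moved)            # ~43 flops per byte at n=512: compute-bound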
Sturla Molden From sturla at molden.no Thu Sep 10 00:47:47 2009 From: sturla at molden.no (Sturla Molden) Date: Thu, 10 Sep 2009 06:47:47 +0200 Subject: [Numpy-discussion] Fwd: GPU Numpy In-Reply-To: <7f1eaee30909090827s34cab32ctb6520bec42422924@mail.gmail.com> References: <4AA6CAF1.9060303@noaa.gov> <200909091126.06554.faltet@pytables.org> <200909091641.17205.faltet@pytables.org> <7f1eaee30909090827s34cab32ctb6520bec42422924@mail.gmail.com> Message-ID: <4AA884F3.50305@molden.no> James Bergstra skrev: > Suppose you want to evaluate "dot(a*b+c*sqrt(d), e)". The GPU is > great for doing dot(), The CPU is equally great (or better?) for doing dot(). In both cases: - memory access scale O(n) for dot producs. - computation scale O(n) for dot producs. - memory is low - computation is fast (faster for GPU) In both cases, the floating point unit is starved. That means it could do a lot more work if memory were faster. For the GPU to be "faster than CPU", you have to have a situation where computation dominates over memory access. Matrix-matrix multiplication is one such example. This is what GPUs are designed to do, as it is the major bootleneck in 3D graphics. The proper way to speed up "dot(a*b+c*sqrt(d), e)" is to get rid of temporary intermediates. That is, in Python pseudo-code: result = 0 for i in range(n): result += (a[i]*b[i] + c[i]*sqrt(d[i])) * e[i] instead of: tmp0 = empty(n) for i in range(n): tmp0[i] = a[i] * b[i] tmp1 = empty(n) for i in range(n): tmp1[i] = sqrt(d[i]) tmp2 = empty(n) for i in range(n): tmp2[i] = c[i] * tmp1[i] tmp3 = empty(n) for i in range(n): tmp3[i] = tmp0[i] + tmp2[i] result = 0 for i in range(n): result += tmp3[i] * e[i] It is this complication that makes NumPy an order of magnitude slower than hand-crafted C (but still much faster than pure Python!) Adding in GPUs will not change this. The amount of computation (flop count) is the same, so it cannot be the source of the slowness. Sturla Molden From dwf at cs.toronto.edu Thu Sep 10 01:29:26 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Thu, 10 Sep 2009 01:29:26 -0400 Subject: [Numpy-discussion] Fwd: GPU Numpy In-Reply-To: <4AA884F3.50305@molden.no> References: <4AA6CAF1.9060303@noaa.gov> <200909091126.06554.faltet@pytables.org> <200909091641.17205.faltet@pytables.org> <7f1eaee30909090827s34cab32ctb6520bec42422924@mail.gmail.com> <4AA884F3.50305@molden.no> Message-ID: On 10-Sep-09, at 12:47 AM, Sturla Molden wrote: > The CPU is equally great (or better?) for doing dot(). In both cases: > > - memory access scale O(n) for dot producs. > - computation scale O(n) for dot producs. > - memory is low > - computation is fast (faster for GPU) You do realize that the throughput from onboard (video) RAM is going to be much higher, right? It's not just the parallelization but the memory bandwidth. And as James pointed out, if you can keep most of your intermediate computation on-card, you stand to benefit immensely, even if doing some operations where the GPU provides no tangible benefit (i.e. the benefit is in aggregate and avoiding copies). FWIW I agree with you that NumPy isn't the place for GPU stuff to happen. In the short to medium term we need a way to make it simpler for naturally expressed computations not go hog wild with temporary allocations (it's a very hard problem given the constraints of the language). 
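(A sketch of what removing the temporaries buys today, using numexpr as the back-end and assuming a numexpr version that supports the sum() reduction: the whole expression is compiled and evaluated in one blocked pass instead of materializing tmp0..tmp3.)

import numpy as np
import numexpr as ne

n = 1000000
a, b, c, d, e = [np.random.rand(n) for _ in range(5)]

r_numpy = np.dot(a * b + c * np.sqrt(d), e)            # allocates several temporary arrays
r_ne = ne.evaluate("sum((a * b + c * sqrt(d)) * e)")   # one fused pass over the operands
print(np.allclose(r_numpy, r_ne))                      # True, up to rounding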
In the long term I envision something with flexible enough machinery to be manipulating objects in GPU memory with the same ease as in main memory, but I think the path to that lies in increasing the generality and flexibility of the interfaces exposed. David From fperez.net at gmail.com Thu Sep 10 01:52:05 2009 From: fperez.net at gmail.com (Fernando Perez) Date: Wed, 9 Sep 2009 22:52:05 -0700 Subject: [Numpy-discussion] Fwd: GPU Numpy In-Reply-To: <4AA884F3.50305@molden.no> References: <4AA6CAF1.9060303@noaa.gov> <200909091126.06554.faltet@pytables.org> <200909091641.17205.faltet@pytables.org> <7f1eaee30909090827s34cab32ctb6520bec42422924@mail.gmail.com> <4AA884F3.50305@molden.no> Message-ID: On Wed, Sep 9, 2009 at 9:47 PM, Sturla Molden wrote: > James Bergstra skrev: >> Suppose you want to evaluate "dot(a*b+c*sqrt(d), e)". ?The GPU is >> great for doing dot(), > The CPU is equally great (or better?) for doing dot(). In both cases: > > - memory access scale O(n) for dot producs. > - computation scale O(n) for dot producs. Remember that we have a little terminology ambiguity here: in numpy, dot(a,b) is used to describe both the vector dot product, an O(n) operation if a and b are n-element vectors, and the matrix product, an O(n**3) operation if a and b are both nxn square matrices. Just a clarification... Cheers, f From mpi at comxnet.dk Thu Sep 10 03:17:00 2009 From: mpi at comxnet.dk (Mads Ipsen) Date: Thu, 10 Sep 2009 09:17:00 +0200 Subject: [Numpy-discussion] error: comma at end of enumerator list Message-ID: <4AA8A7EC.7090108@comxnet.dk> Hey, When I try to compile a swig based interface to NumPy, I get the error: lib/python2.6/site-packages/numpy/core/include/numpy/npy_common.h:11: error: comma at end of enumerator list In npy_common.h, changing /* enums for detected endianness */ enum { NPY_CPU_UNKNOWN_ENDIAN, NPY_CPU_LITTLE, NPY_CPU_BIG, }; to /* enums for detected endianness */ enum { NPY_CPU_UNKNOWN_ENDIAN, NPY_CPU_LITTLE, NPY_CPU_BIG }; fixes the issue. I believe this should be fixed. At least we cannot built our software without the above fix. System info: gcc version 4.3.2 (Ubuntu 4.3.2-1ubuntu12) Ubuntu 8.10 Best regards, Mads -- +------------------------------------------------------------+ | Mads Ipsen, Scientific developer | +------------------------------+-----------------------------+ | QuantumWise A/S | phone: +45-29716388 | | N?rres?gade 27A | www: www.quantumwise.com | | DK-1370 Copenhagen, Denmark | email: mpi at quantumwise.com | +------------------------------+-----------------------------+ From slaunger at gmail.com Thu Sep 10 03:17:04 2009 From: slaunger at gmail.com (Kim Hansen) Date: Thu, 10 Sep 2009 09:17:04 +0200 Subject: [Numpy-discussion] Huge arrays In-Reply-To: <269F5A72-FAA0-4C61-BAF2-299C71C31EF4@cs.toronto.edu> References: <5b8d13220909082222s59ecb06bjb3816ed94572b911@mail.gmail.com> <200909091048.49439.faltet@pytables.org> <269F5A72-FAA0-4C61-BAF2-299C71C31EF4@cs.toronto.edu> Message-ID: > > On 9-Sep-09, at 4:48 AM, Francesc Alted wrote: > > > Yes, this later is supported in PyTables as long as the underlying > > filesystem > > supports files > 2 GB, which is very usual in modern operating > > systems. > > I think the OP said he was on Win32, in which case it should be noted: > FAT32 has its upper file size limit at 4GB (minus one byte), so > storing both your arrays as one file on a FAT32 partition is a no-no. 
> > David > Strange, I work on Win32 systems, and there I have no problems storing data files up to 600 GB (have not tried larger) in size stored on RAID0 disk systems of 2x1TB, I can also open them and seek in them using Python. For those data files, I use Pytables lzo compressed h5 files to create and maintain an index to the large data file Besides some meta data describing chunks of data, the index also conains a data position value stating what the file position of the beginning of each data chunk (payload) is. The index files I work with in h5 format are not larger than 1.5 GB though. It all works very nice and it is very convenient Kim -------------- next part -------------- An HTML attachment was scrubbed... URL: From david at ar.media.kyoto-u.ac.jp Thu Sep 10 03:06:21 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Thu, 10 Sep 2009 16:06:21 +0900 Subject: [Numpy-discussion] Huge arrays In-Reply-To: References: <5b8d13220909082222s59ecb06bjb3816ed94572b911@mail.gmail.com> <200909091048.49439.faltet@pytables.org> <269F5A72-FAA0-4C61-BAF2-299C71C31EF4@cs.toronto.edu> Message-ID: <4AA8A56D.3060508@ar.media.kyoto-u.ac.jp> Kim Hansen wrote: > > On 9-Sep-09, at 4:48 AM, Francesc Alted wrote: > > > Yes, this later is supported in PyTables as long as the underlying > > filesystem > > supports files > 2 GB, which is very usual in modern operating > > systems. > > I think the OP said he was on Win32, in which case it should be noted: > FAT32 has its upper file size limit at 4GB (minus one byte), so > storing both your arrays as one file on a FAT32 partition is a no-no. > > David > > > Strange, I work on Win32 systems, and there I have no problems storing > data files up to 600 GB (have not tried larger) in size stored on > RAID0 disk systems of 2x1TB, I can also open them and seek in them > using Python. It is a FAT32 limitation, not a windows limitation. NTFS should handle large files without much trouble, and I believe the vast majority of windows installations (>= windows xp) use NTFS and not FAT32. I certainly have not seen a windows installed on FAT32 for a very long time. cheers, David From rpg.314 at gmail.com Thu Sep 10 03:45:29 2009 From: rpg.314 at gmail.com (Rohit Garg) Date: Thu, 10 Sep 2009 13:15:29 +0530 Subject: [Numpy-discussion] Fwd: GPU Numpy In-Reply-To: References: <4AA6CAF1.9060303@noaa.gov> <200909091126.06554.faltet@pytables.org> <200909091641.17205.faltet@pytables.org> <7f1eaee30909090827s34cab32ctb6520bec42422924@mail.gmail.com> <4AA884F3.50305@molden.no> Message-ID: <4d5dd8c20909100045r7aa78480vc8626874b2d9af61@mail.gmail.com> > You do realize that the throughput from onboard (video) RAM is going > to be much higher, right? It's not just the parallelization but the > memory bandwidth. And as James pointed out, if you can keep most of > your intermediate computation on-card, you stand to benefit immensely, > even if doing some operations where the GPU provides no tangible > benefit (i.e. the benefit is in aggregate and avoiding copies). Good point made here. GPU's support bandwidth O(100 GBps) (bytes not bits). Upcoming GPU's will likely break the 250 GBps mark. Even if your expressions involve low operation/memory ratios, GPU's are a big win as their memory bandwidth is higher than CPU's L2 and even L1 caches. 
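(For a rough idea of the host-side figure being compared against, this crude measurement streams a memory-bound element-wise operation through main memory; it is a sketch, not a proper STREAM benchmark, and the result depends heavily on the machine.)

import time
import numpy as np

n = 20 * 1000 * 1000                  # 20M doubles per array, ~160 MB each
a = np.random.rand(n)
b = np.random.rand(n)
c = np.empty_like(a)

t0 = time.time()
np.add(a, b, c)                       # read a, read b, write c: three streams
dt = time.time() - t0
print("%.1f GB/s sustained" % (3 * a.nbytes / dt / 1e9))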
Regards, -- Rohit Garg http://rpg-314.blogspot.com/ Senior Undergraduate Department of Physics Indian Institute of Technology Bombay From faltet at pytables.org Thu Sep 10 04:36:27 2009 From: faltet at pytables.org (Francesc Alted) Date: Thu, 10 Sep 2009 10:36:27 +0200 Subject: [Numpy-discussion] Fwd: GPU Numpy In-Reply-To: <4d5dd8c20909100045r7aa78480vc8626874b2d9af61@mail.gmail.com> References: <4d5dd8c20909100045r7aa78480vc8626874b2d9af61@mail.gmail.com> Message-ID: <200909101036.27389.faltet@pytables.org> A Thursday 10 September 2009 09:45:29 Rohit Garg escrigu?: > > You do realize that the throughput from onboard (video) RAM is going > > to be much higher, right? It's not just the parallelization but the > > memory bandwidth. And as James pointed out, if you can keep most of > > your intermediate computation on-card, you stand to benefit immensely, > > even if doing some operations where the GPU provides no tangible > > benefit (i.e. the benefit is in aggregate and avoiding copies). > > Good point made here. GPU's support bandwidth O(100 GBps) (bytes not > bits). Upcoming GPU's will likely break the 250 GBps mark. Even if > your expressions involve low operation/memory ratios, GPU's are a big > win as their memory bandwidth is higher than CPU's L2 and even L1 > caches. Where are you getting this info from? IMO the technology of memory in graphics boards cannot be so different than in commercial motherboards. It could be a *bit* faster (at the expenses of packing less of it), but I'd say not as much as 4x faster (100 GB/s vs 25 GB/s of Intel i7 in sequential access), as you are suggesting. Maybe this is GPU cache bandwidth? -- Francesc Alted -------------- next part -------------- An HTML attachment was scrubbed... URL: From rsalvador.wk at gmail.com Thu Sep 10 04:40:02 2009 From: rsalvador.wk at gmail.com (Ruben Salvador) Date: Thu, 10 Sep 2009 10:40:02 +0200 Subject: [Numpy-discussion] Adding a 2D with a 1D array... In-Reply-To: <4AA7F130.3000805@student.matnat.uio.no> References: <4fe028e30909090328m4affca0cke983c3944b8a7233@mail.gmail.com> <3DA3B328CBC48B4EBB88484B8A5EA19106AF9EA7@sernt14.essex.ac.uk> <4fe028e30909090408m6e7b8128u4fe2927edef17dd6@mail.gmail.com> <3DA3B328CBC48B4EBB88484B8A5EA19106AF9EA8@sernt14.essex.ac.uk> <4fe028e30909090536x3e82c032odc09f6b89dd34990@mail.gmail.com> <4AA7F130.3000805@student.matnat.uio.no> Message-ID: <4fe028e30909100140h601c9ffdra3f73533f1b802b2@mail.gmail.com> OK. I get the idea, but I can't see it. In both cases, as the print statement shows, offspr is already created. I need light :S On Wed, Sep 9, 2009 at 8:17 PM, Dag Sverre Seljebotn < dagss at student.matnat.uio.no> wrote: > Ruben Salvador wrote: > > Your results are what I expected...but. 
This code is called from my main > > program, and what I have in there (output array already created for both > > cases) is: > > > > print "lambd", lambd > > print "np.shape(a)", np.shape(a) > > print "np.shape(r)", np.shape(r) > > print "np.shape(offspr)", np.shape(offspr) > > t = clock() > > for i in range(lambd): > > offspr[i] = r[i] + a[i] > > t1 = clock() - t > > print "For loop time ==> %.8f seconds" % t1 > > t2 = clock() > > offspr = r + a[:,None] > > t3 = clock() - t2 > > print "Pythonic time ==> %.8f seconds" % t3 > > > > The results I obtain are: > > > > lambd 80000 > > np.shape(a) (80000,) > > np.shape(r) (80000, 26) > > np.shape(offspr) (80000, 26) > > For loop time ==> 0.34528804 seconds > > Pythonic time ==> 0.35956192 seconds > > > > Maybe I'm not measuring properly, so, how should I do it? > > Like Luca said, you are not including the creation time of offspr in the > for-loop version. A fairer comparison would be offspr[...] = r + a[:, None] > > Even fairer (one less temporary copy): > > offspr[...] = r > offspr += a[:, None] > > Of course, see how the trend is for larger N as well. > > Also your timings are a bit crude (though this depends on how many times > you ran your script to check :-)). To get better measurements, use the > timeit module, or (easier) IPython and the %timeit command. > > > > > On Wed, Sep 9, 2009 at 1:20 PM, Citi, Luca > > wrote: > > > > I am sorry but it doesn't make much sense. > > How do you measure the performance? > > Are you sure you include the creation of the "c" output array in the > > time spent (which is outside the for loop but should be considered > > anyway)? > > > > Here are my results... > > > > In [84]: a = np.random.rand(8,26) > > > > In [85]: b = np.random.rand(8) > > > > In [86]: def o(a,b): > > ....: c = np.empty_like(a) > > ....: for i in range(len(a)): > > ....: c[i] = a[i] + b[i] > > ....: return c > > ....: > > > > In [87]: d = a + b[:,None] > > > > In [88]: (d == o(a,b)).all() > > Out[88]: True > > > > In [89]: %timeit o(a,b) > > %ti10000 loops, best of 3: 36.8 ?s per loop > > > > In [90]: %timeit d = a + b[:,None] > > 100000 loops, best of 3: 5.17 ?s per loop > > > > In [91]: a = np.random.rand(80000,26) > > > > In [92]: b = np.random.rand(80000) > > > > In [93]: %timeit o(a,b) > > %ti10 loops, best of 3: 287 ms per loop > > > > In [94]: %timeit d = a + b[:,None] > > 100 loops, best of 3: 15.4 ms per loop > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > > ------------------------------------------------------------------------ > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > -- > Dag Sverre > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From lciti at essex.ac.uk Thu Sep 10 04:41:10 2009 From: lciti at essex.ac.uk (Citi, Luca) Date: Thu, 10 Sep 2009 09:41:10 +0100 Subject: [Numpy-discussion] Fwd: GPU Numpy References: <4AA6CAF1.9060303@noaa.gov> <200909091126.06554.faltet@pytables.org> <200909091641.17205.faltet@pytables.org> <7f1eaee30909090827s34cab32ctb6520bec42422924@mail.gmail.com> <4AA884F3.50305@molden.no> Message-ID: <3DA3B328CBC48B4EBB88484B8A5EA19106AF9EAA@sernt14.essex.ac.uk> Hi Sturla, > The proper way to speed up "dot(a*b+c*sqrt(d), e)" is to get rid of > temporary intermediates. I implemented a patch http://projects.scipy.org/numpy/ticket/1153 that reduces the number of temporary intermediates. In your example from 4 to 2. There is a big improvement in terms of memory footprint, and some improvement in terms of speed (especially for large matrices) but not as much as I expected. In your example > result = 0 > for i in range(n): > result += (a[i]*b[i] + c[i]*sqrt(d[i])) * e[i] another big speedup could come from the fact that it makes better use of the cache. That is exactly why numexpr is faster in these cases. I hope one day numpy will be able to perform such optimizations. Best, Luca From lciti at essex.ac.uk Thu Sep 10 04:42:55 2009 From: lciti at essex.ac.uk (Citi, Luca) Date: Thu, 10 Sep 2009 09:42:55 +0100 Subject: [Numpy-discussion] Adding a 2D with a 1D array... References: <4fe028e30909090328m4affca0cke983c3944b8a7233@mail.gmail.com><3DA3B328CBC48B4EBB88484B8A5EA19106AF9EA7@sernt14.essex.ac.uk><4fe028e30909090408m6e7b8128u4fe2927edef17dd6@mail.gmail.com><3DA3B328CBC48B4EBB88484B8A5EA19106AF9EA8@sernt14.essex.ac.uk><4fe028e30909090536x3e82c032odc09f6b89dd34990@mail.gmail.com><4AA7F130.3000805@student.matnat.uio.no> <4fe028e30909100140h601c9ffdra3f73533f1b802b2@mail.gmail.com> Message-ID: <3DA3B328CBC48B4EBB88484B8A5EA19106AF9EAB@sernt14.essex.ac.uk> Hi Ruben, > In both cases, as the print > statement shows, offspr is already created. >>> offspr[...] = r + a[:, None] means "fill the existing object pointed by offspr with r + a[:, None]" while >>> offspr = r + a[:,None] means "create a new array and assign it to the variable offspr (after decref-ing the object previously pointed by offspr)" Best, Luca From faltet at pytables.org Thu Sep 10 04:49:13 2009 From: faltet at pytables.org (Francesc Alted) Date: Thu, 10 Sep 2009 10:49:13 +0200 Subject: [Numpy-discussion] Adding a 2D with a 1D array... In-Reply-To: <4AA7F130.3000805@student.matnat.uio.no> References: <4fe028e30909090328m4affca0cke983c3944b8a7233@mail.gmail.com> <4fe028e30909090536x3e82c032odc09f6b89dd34990@mail.gmail.com> <4AA7F130.3000805@student.matnat.uio.no> Message-ID: <200909101049.13502.faltet@pytables.org> A Wednesday 09 September 2009 20:17:20 Dag Sverre Seljebotn escrigu?: > Ruben Salvador wrote: > > Your results are what I expected...but. 
This code is called from my main > > program, and what I have in there (output array already created for both > > cases) is: > > > > print "lambd", lambd > > print "np.shape(a)", np.shape(a) > > print "np.shape(r)", np.shape(r) > > print "np.shape(offspr)", np.shape(offspr) > > t = clock() > > for i in range(lambd): > > offspr[i] = r[i] + a[i] > > t1 = clock() - t > > print "For loop time ==> %.8f seconds" % t1 > > t2 = clock() > > offspr = r + a[:,None] > > t3 = clock() - t2 > > print "Pythonic time ==> %.8f seconds" % t3 > > > > The results I obtain are: > > > > lambd 80000 > > np.shape(a) (80000,) > > np.shape(r) (80000, 26) > > np.shape(offspr) (80000, 26) > > For loop time ==> 0.34528804 seconds > > Pythonic time ==> 0.35956192 seconds > > > > Maybe I'm not measuring properly, so, how should I do it? > > Like Luca said, you are not including the creation time of offspr in the > for-loop version. A fairer comparison would be > > offspr[...] = r + a[:, None] > > Even fairer (one less temporary copy): > > offspr[...] = r > offspr += a[:, None] > > Of course, see how the trend is for larger N as well. > > Also your timings are a bit crude (though this depends on how many times > you ran your script to check :-)). To get better measurements, use the > timeit module, or (easier) IPython and the %timeit command. Oh well, the art of benchmarking :) The timeit module allows you normally get less jitter in timings because it loops on doing the same operation repeatedly and get a mean. However, this has the drawback of filling your cache with the datasets (or part of them) so, in the end, your measurements with timeit does not take into account the time to transmit the data in main memory into the CPU caches, and that may be not what you want to measure. In the case of Ruben, I think what he is seeing are cache effects. Maybe if he does a loop, he would finally see the difference coming up (although this may be not what he want, of course ;-) -- Francesc Alted -------------- next part -------------- An HTML attachment was scrubbed... URL: From rpg.314 at gmail.com Thu Sep 10 04:58:13 2009 From: rpg.314 at gmail.com (Rohit Garg) Date: Thu, 10 Sep 2009 14:28:13 +0530 Subject: [Numpy-discussion] Fwd: GPU Numpy In-Reply-To: <200909101036.27389.faltet@pytables.org> References: <4d5dd8c20909100045r7aa78480vc8626874b2d9af61@mail.gmail.com> <200909101036.27389.faltet@pytables.org> Message-ID: <4d5dd8c20909100158w1f64ef1fu242cb0c6de493b0a@mail.gmail.com> > Where are you getting this info from? IMO the technology of memory in > graphics boards cannot be so different than in commercial motherboards. It > could be a *bit* faster (at the expenses of packing less of it), but I'd say > not as much as 4x faster (100 GB/s vs 25 GB/s of Intel i7 in sequential > access), as you are suggesting. Maybe this is GPU cache bandwidth? This is publicly documented. You can start off by looking at the wikipedia stuff. For reference, gtx280-->141GBps-->has 1GB ati4870-->115GBps-->has 1GB ati5870-->153GBps (launches sept 22, 2009)-->2GB models will be there too Next gen nv gpu's will *assuredly* have bandwidth in excess of 200 GBps. This is *off chip memory bandwidth* from graphics memory (aka video ram). GPU have (very small) caches but they don't reduce memory latency. 
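(Plugging the two numbers just quoted into the memory-bound case: at the stated peaks, streaming three 100-million-element double arrays for c = a + b takes roughly the times below, so the peak-bandwidth ratio, 141/25 or about 5.6x, is also the most such an operation could possibly gain.)

n_bytes = 3 * 8 * 10 ** 8             # read a, read b, write c
for name, gbps in [("Core i7, main memory", 25.0), ("GTX 280, video memory", 141.0)]:
    print("%-22s %.3f s" % (name, n_bytes / (gbps * 1e9)))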
> > -- > > Francesc Alted > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- Rohit Garg http://rpg-314.blogspot.com/ Senior Undergraduate Department of Physics Indian Institute of Technology Bombay From sturla at molden.no Thu Sep 10 05:11:22 2009 From: sturla at molden.no (Sturla Molden) Date: Thu, 10 Sep 2009 11:11:22 +0200 Subject: [Numpy-discussion] Fwd: GPU Numpy In-Reply-To: <3DA3B328CBC48B4EBB88484B8A5EA19106AF9EAA@sernt14.essex.ac.uk> References: <4AA6CAF1.9060303@noaa.gov> <200909091126.06554.faltet@pytables.org> <200909091641.17205.faltet@pytables.org> <7f1eaee30909090827s34cab32ctb6520bec42422924@mail.gmail.com> <4AA884F3.50305@molden.no> <3DA3B328CBC48B4EBB88484B8A5EA19106AF9EAA@sernt14.essex.ac.uk> Message-ID: <4AA8C2BA.80801@molden.no> Citi, Luca skrev: > That is exactly why numexpr is faster in these cases. > I hope one day numpy will be able to perform such > optimizations. > I think it is going to require lazy evaluation. Whenever possible, an operator would just return a symbolic representation of the operation. This would gradually build up a tree of operators and buffers. When someone tries to read the data from an array, the buffer is created on-demand by flushing procratinated expressions. One must be sure that the buffers referenced in an incomplete expression never change. This would be easiest to ensure with immutable buffers. Numexpr is the kind of back-end a system like this would require. But a lot of the code in numexpr can be omitted because Python creates the parse tree; we would not need the expression parser in numexpr as frontend. Well... this plan is gradually getting closer to a specialized SciPy JIT-compiler. I would be fun to make if I could find time for it. Sturla Molden From sturla at molden.no Thu Sep 10 05:16:18 2009 From: sturla at molden.no (Sturla Molden) Date: Thu, 10 Sep 2009 11:16:18 +0200 Subject: [Numpy-discussion] Fwd: GPU Numpy In-Reply-To: <4d5dd8c20909100158w1f64ef1fu242cb0c6de493b0a@mail.gmail.com> References: <4d5dd8c20909100045r7aa78480vc8626874b2d9af61@mail.gmail.com> <200909101036.27389.faltet@pytables.org> <4d5dd8c20909100158w1f64ef1fu242cb0c6de493b0a@mail.gmail.com> Message-ID: <4AA8C3E2.7030605@molden.no> Rohit Garg skrev: > gtx280-->141GBps-->has 1GB > ati4870-->115GBps-->has 1GB > ati5870-->153GBps (launches sept 22, 2009)-->2GB models will be there too > That is going to help if buffers are kept in graphics memory. But the problem is that graphics memory is a scarse resource. S.M. From faltet at pytables.org Thu Sep 10 05:19:16 2009 From: faltet at pytables.org (Francesc Alted) Date: Thu, 10 Sep 2009 11:19:16 +0200 Subject: [Numpy-discussion] Fwd: GPU Numpy In-Reply-To: <4AA8C2BA.80801@molden.no> References: <3DA3B328CBC48B4EBB88484B8A5EA19106AF9EAA@sernt14.essex.ac.uk> <4AA8C2BA.80801@molden.no> Message-ID: <200909101119.16815.faltet@pytables.org> A Thursday 10 September 2009 11:11:22 Sturla Molden escrigu?: > Citi, Luca skrev: > > That is exactly why numexpr is faster in these cases. > > I hope one day numpy will be able to perform such > > optimizations. > > I think it is going to require lazy evaluation. Whenever possible, an > operator would just return a symbolic representation of the operation. > This would gradually build up a tree of operators and buffers. When > someone tries to read the data from an array, the buffer is created > on-demand by flushing procratinated expressions. 
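(A toy sketch of the idea just described, with hypothetical names: the operators only record a symbolic expression, and numexpr is used as the back-end when the data is finally requested.)

import numpy as np
import numexpr as ne

class Lazy(object):
    """A procrastinated expression: records the formula instead of computing it."""
    def __init__(self, expr, operands):
        self.expr, self.operands = expr, operands
    def __add__(self, other):
        return Lazy("(%s) + (%s)" % (self.expr, other.expr),
                    dict(self.operands, **other.operands))
    def __mul__(self, other):
        return Lazy("(%s) * (%s)" % (self.expr, other.expr),
                    dict(self.operands, **other.operands))
    def flush(self):
        # buffer created on demand, in one numexpr pass
        return ne.evaluate(self.expr, local_dict=self.operands)

def lazy(name, arr):
    return Lazy(name, {name: arr})

a = lazy("a", np.random.rand(1000))
b = lazy("b", np.random.rand(1000))
y = a * b + a                 # no work done yet, just a symbolic tree
print(y.flush()[:3])          # the whole expression is evaluated here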
One must be sure that > the buffers referenced in an incomplete expression never change. This > would be easiest to ensure with immutable buffers. Numexpr is the kind > of back-end a system like this would require. But a lot of the code in > numexpr can be omitted because Python creates the parse tree; we would > not need the expression parser in numexpr as frontend. Well... this plan > is gradually getting closer to a specialized SciPy JIT-compiler. I would > be fun to make if I could find time for it. Numexpr already uses the Python parser, instead of build a new one. However the bytecode emitted after the compilation process is different, of course. Also, I don't see the point in requiring immutable buffers. Could you develop this further? -- Francesc Alted -------------- next part -------------- An HTML attachment was scrubbed... URL: From gael.varoquaux at normalesup.org Thu Sep 10 05:20:21 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Thu, 10 Sep 2009 11:20:21 +0200 Subject: [Numpy-discussion] Fwd: GPU Numpy In-Reply-To: <200909101036.27389.faltet@pytables.org> References: <4d5dd8c20909100045r7aa78480vc8626874b2d9af61@mail.gmail.com> <200909101036.27389.faltet@pytables.org> Message-ID: <20090910092021.GM2482@phare.normalesup.org> On Thu, Sep 10, 2009 at 10:36:27AM +0200, Francesc Alted wrote: > Where are you getting this info from? IMO the technology of memory in > graphics boards cannot be so different than in commercial motherboards. It > could be a *bit* faster (at the expenses of packing less of it), but I'd > say not as much as 4x faster (100 GB/s vs 25 GB/s of Intel i7 in > sequential access), as you are suggesting. Maybe this is GPU cache > bandwidth? I believe this is simply because the transfers is made in parallel to the different processing units of the graphic card. So we are back to importance of embarrassingly parallel problems and specifying things with high-level operations rather than for loop. Ga?l From faltet at pytables.org Thu Sep 10 05:23:17 2009 From: faltet at pytables.org (Francesc Alted) Date: Thu, 10 Sep 2009 11:23:17 +0200 Subject: [Numpy-discussion] Fwd: GPU Numpy In-Reply-To: <4d5dd8c20909100158w1f64ef1fu242cb0c6de493b0a@mail.gmail.com> References: <200909101036.27389.faltet@pytables.org> <4d5dd8c20909100158w1f64ef1fu242cb0c6de493b0a@mail.gmail.com> Message-ID: <200909101123.17643.faltet@pytables.org> A Thursday 10 September 2009 10:58:13 Rohit Garg escrigu?: > > Where are you getting this info from? IMO the technology of memory in > > graphics boards cannot be so different than in commercial motherboards. > > It could be a *bit* faster (at the expenses of packing less of it), but > > I'd say not as much as 4x faster (100 GB/s vs 25 GB/s of Intel i7 in > > sequential access), as you are suggesting. Maybe this is GPU cache > > bandwidth? > > This is publicly documented. You can start off by looking at the > wikipedia stuff. > > For reference, > > gtx280-->141GBps-->has 1GB > ati4870-->115GBps-->has 1GB > ati5870-->153GBps (launches sept 22, 2009)-->2GB models will be there too > > Next gen nv gpu's will *assuredly* have bandwidth in excess of 200 GBps. > > This is *off chip memory bandwidth* from graphics memory (aka video > ram). GPU have (very small) caches but they don't reduce memory > latency. That's nice to see. I think I'll change my mind if someone could perform a vector-vector multiplication (a operation that is typically memory-bounded) in double precision up to 5x times faster on a gtx280 nv card than in a Intel's i7 CPU. 
-- Francesc Alted -------------- next part -------------- An HTML attachment was scrubbed... URL: From faltet at pytables.org Thu Sep 10 05:29:49 2009 From: faltet at pytables.org (Francesc Alted) Date: Thu, 10 Sep 2009 11:29:49 +0200 Subject: [Numpy-discussion] Fwd: GPU Numpy In-Reply-To: <20090910092021.GM2482@phare.normalesup.org> References: <200909101036.27389.faltet@pytables.org> <20090910092021.GM2482@phare.normalesup.org> Message-ID: <200909101129.49984.faltet@pytables.org> A Thursday 10 September 2009 11:20:21 Gael Varoquaux escrigu?: > On Thu, Sep 10, 2009 at 10:36:27AM +0200, Francesc Alted wrote: > > Where are you getting this info from? IMO the technology of memory in > > graphics boards cannot be so different than in commercial > > motherboards. It could be a *bit* faster (at the expenses of packing less > > of it), but I'd say not as much as 4x faster (100 GB/s vs 25 GB/s of > > Intel i7 in sequential access), as you are suggesting. Maybe this is GPU > > cache bandwidth? > > I believe this is simply because the transfers is made in parallel to the > different processing units of the graphic card. So we are back to > importance of embarrassingly parallel problems and specifying things with > high-level operations rather than for loop. Sure. Specially because NumPy is all about embarrasingly parallel problems (after all, this is how an ufunc works, doing operations element-by-element). The point is: are GPUs prepared to compete with a general-purpose CPUs in all- road operations, like evaluating transcendental functions, conditionals all of this with a rich set of data types? I would like to believe that this is the case, but I don't think so (at least not yet). -- Francesc Alted -------------- next part -------------- An HTML attachment was scrubbed... URL: From gael.varoquaux at normalesup.org Thu Sep 10 05:37:24 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Thu, 10 Sep 2009 11:37:24 +0200 Subject: [Numpy-discussion] Fwd: GPU Numpy In-Reply-To: <200909101129.49984.faltet@pytables.org> References: <200909101036.27389.faltet@pytables.org> <20090910092021.GM2482@phare.normalesup.org> <200909101129.49984.faltet@pytables.org> Message-ID: <20090910093724.GN2482@phare.normalesup.org> On Thu, Sep 10, 2009 at 11:29:49AM +0200, Francesc Alted wrote: > The point is: are GPUs prepared to compete with a general-purpose CPUs in > all-road operations, like evaluating transcendental functions, > conditionals all of this with a rich set of data types? I would like to > believe that this is the case, but I don't think so (at least not yet). I believe (this is very foggy) that GPUs can implement non trivial logic on there base processing unit, so that conditionals and transcendental functions are indeed possible. Where it gets hard is when you don't have problems that can be expressed in an embarrassingly parallel manner. There are solutions there to (I believe of the message passing type), after all matrix multiplication is done on GPUs. Ga?l From sturla at molden.no Thu Sep 10 05:40:48 2009 From: sturla at molden.no (Sturla Molden) Date: Thu, 10 Sep 2009 11:40:48 +0200 Subject: [Numpy-discussion] Fwd: GPU Numpy In-Reply-To: <200909101119.16815.faltet@pytables.org> References: <3DA3B328CBC48B4EBB88484B8A5EA19106AF9EAA@sernt14.essex.ac.uk> <4AA8C2BA.80801@molden.no> <200909101119.16815.faltet@pytables.org> Message-ID: <4AA8C9A0.1040902@molden.no> Francesc Alted skrev: > > Numexpr already uses the Python parser, instead of build a new one. 
> However the bytecode emitted after the compilation process is > different, of course. > > Also, I don't see the point in requiring immutable buffers. Could you > develop this further? > If you do lacy evaluation, a function like this could fail without immutable buffers: def foobar(x): y = a*x[:] + b x[0] = 0 # affects y and anything else depending on x return y Immutable buffers are not required, one could document the oddity, but coding would be very error-prone. S.M. From matthieu.brucher at gmail.com Thu Sep 10 05:42:43 2009 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Thu, 10 Sep 2009 11:42:43 +0200 Subject: [Numpy-discussion] Fwd: GPU Numpy In-Reply-To: <200909101129.49984.faltet@pytables.org> References: <200909101036.27389.faltet@pytables.org> <20090910092021.GM2482@phare.normalesup.org> <200909101129.49984.faltet@pytables.org> Message-ID: > Sure. Specially because NumPy is all about embarrasingly parallel problems > (after all, this is how an ufunc works, doing operations > element-by-element). > > The point is: are GPUs prepared to compete with a general-purpose CPUs in > all-road operations, like evaluating transcendental functions, conditionals > all of this with a rich set of data types? I would like to believe that this > is the case, but I don't think so (at least not yet). A lot of nVidia's SDK functions is not done on GPU. There are some functions that they provide where the actual computation is done on the CPU, not on the GPU (I don't have an example here, but nVidia's forum is full of examples ;)) Matthieu -- Information System Engineer, Ph.D. Website: http://matthieu-brucher.developpez.com/ Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92 LinkedIn: http://www.linkedin.com/in/matthieubrucher From rsalvador.wk at gmail.com Thu Sep 10 05:43:44 2009 From: rsalvador.wk at gmail.com (Ruben Salvador) Date: Thu, 10 Sep 2009 11:43:44 +0200 Subject: [Numpy-discussion] Adding a 2D with a 1D array... In-Reply-To: <200909101049.13502.faltet@pytables.org> References: <4fe028e30909090328m4affca0cke983c3944b8a7233@mail.gmail.com> <4fe028e30909090536x3e82c032odc09f6b89dd34990@mail.gmail.com> <4AA7F130.3000805@student.matnat.uio.no> <200909101049.13502.faltet@pytables.org> Message-ID: <4fe028e30909100243y668122e7x1683b9c6fdd5a357@mail.gmail.com> OK. Thanks everybody :D But...what is happening now? When executing this code: print ' ..... object parameters mutation .....' print 'np.shape(offspr)', np.shape(offspr) print 'np.shape(offspr[0])', np.shape(offspr[0]) print "np.shape(r)", np.shape(r) print "np.shape(offspr_sigma)", np.shape(offspr_sigma) a = offspr_sigma * np.random.normal(0, 1, shp_sigma) print "np.shape(a)", np.shape(a) t4 = clock() offspr[...] = r offspr += a[:,None] t5 = clock() - t4 print "Pythonic time (no array creation) ==> %.8f seconds" % t5 t2 = clock() offspr = r + a[:,None] t3 = clock() - t2 print "Pythonic time ==> %.8f seconds" % t3 t = clock() for i in range(lambd): offspr[i] = r[i] + a[i] t1 = clock() - t print "For loop time ==> %.8f seconds" % t1 what I get is ..... object parameters mutation ..... 
np.shape(offspr) (80000, 26) np.shape(offspr[0]) (26,) np.shape(r) (80000, 26) np.shape(offspr_sigma) (80000,) np.shape(a) (80000,) Traceback (most recent call last): File "/home/rsalvador/wavelets/devel/testing/genwave.py", line 660, in main() File "/home/rsalvador/wavelets/devel/testing/genwave.py", line 390, in main mutate_strat, tau_global, tau_params) File "/home/rsalvador/wavelets/devel/testing/genwavelib.py", line 299, in mutate offspr[...] = r TypeError: list indices must be integers WTF? On 9/10/09, Francesc Alted wrote: > > A Wednesday 09 September 2009 20:17:20 Dag Sverre Seljebotn escrigu?: > > > Ruben Salvador wrote: > > > > Your results are what I expected...but. This code is called from my > main > > > > program, and what I have in there (output array already created for > both > > > > cases) is: > > > > > > > > print "lambd", lambd > > > > print "np.shape(a)", np.shape(a) > > > > print "np.shape(r)", np.shape(r) > > > > print "np.shape(offspr)", np.shape(offspr) > > > > t = clock() > > > > for i in range(lambd): > > > > offspr[i] = r[i] + a[i] > > > > t1 = clock() - t > > > > print "For loop time ==> %.8f seconds" % t1 > > > > t2 = clock() > > > > offspr = r + a[:,None] > > > > t3 = clock() - t2 > > > > print "Pythonic time ==> %.8f seconds" % t3 > > > > > > > > The results I obtain are: > > > > > > > > lambd 80000 > > > > np.shape(a) (80000,) > > > > np.shape(r) (80000, 26) > > > > np.shape(offspr) (80000, 26) > > > > For loop time ==> 0.34528804 seconds > > > > Pythonic time ==> 0.35956192 seconds > > > > > > > > Maybe I'm not measuring properly, so, how should I do it? > > > > > > Like Luca said, you are not including the creation time of offspr in the > > > for-loop version. A fairer comparison would be > > > > > > offspr[...] = r + a[:, None] > > > > > > Even fairer (one less temporary copy): > > > > > > offspr[...] = r > > > offspr += a[:, None] > > > > > > Of course, see how the trend is for larger N as well. > > > > > > Also your timings are a bit crude (though this depends on how many times > > > you ran your script to check :-)). To get better measurements, use the > > > timeit module, or (easier) IPython and the %timeit command. > > Oh well, the art of benchmarking :) > > The timeit module allows you normally get less jitter in timings because it > loops on doing the same operation repeatedly and get a mean. However, this > has the drawback of filling your cache with the datasets (or part of them) > so, in the end, your measurements with timeit does not take into account the > time to transmit the data in main memory into the CPU caches, and that may > be not what you want to measure. > > In the case of Ruben, I think what he is seeing are cache effects. Maybe if > he does a loop, he would finally see the difference coming up (although this > may be not what he want, of course ;-) > > -- > > Francesc Alted > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From rpg.314 at gmail.com Thu Sep 10 06:01:29 2009 From: rpg.314 at gmail.com (Rohit Garg) Date: Thu, 10 Sep 2009 15:31:29 +0530 Subject: [Numpy-discussion] Fwd: GPU Numpy In-Reply-To: <200909101129.49984.faltet@pytables.org> References: <200909101036.27389.faltet@pytables.org> <20090910092021.GM2482@phare.normalesup.org> <200909101129.49984.faltet@pytables.org> Message-ID: <4d5dd8c20909100301l41ad33e6w1f43bb8339b65d64@mail.gmail.com> > The point is: are GPUs prepared to compete with a general-purpose CPUs in > all-road operations, like evaluating transcendental functions, conditionals > all of this with a rich set of data types? Yup. -- Rohit Garg http://rpg-314.blogspot.com/ Senior Undergraduate Department of Physics Indian Institute of Technology Bombay From faltet at pytables.org Thu Sep 10 07:31:28 2009 From: faltet at pytables.org (Francesc Alted) Date: Thu, 10 Sep 2009 13:31:28 +0200 Subject: [Numpy-discussion] Adding a 2D with a 1D array... In-Reply-To: <4fe028e30909100243y668122e7x1683b9c6fdd5a357@mail.gmail.com> References: <4fe028e30909090328m4affca0cke983c3944b8a7233@mail.gmail.com> <200909101049.13502.faltet@pytables.org> <4fe028e30909100243y668122e7x1683b9c6fdd5a357@mail.gmail.com> Message-ID: <200909101331.28128.faltet@pytables.org> A Thursday 10 September 2009 11:43:44 Ruben Salvador escrigu?: > OK. Thanks everybody :D > But...what is happening now? When executing this code: > > print ' ..... object parameters mutation .....' > print 'np.shape(offspr)', np.shape(offspr) > print 'np.shape(offspr[0])', np.shape(offspr[0]) > print "np.shape(r)", np.shape(r) > print "np.shape(offspr_sigma)", np.shape(offspr_sigma) > a = offspr_sigma * np.random.normal(0, 1, shp_sigma) > print "np.shape(a)", np.shape(a) > t4 = clock() > offspr[...] = r > offspr += a[:,None] > t5 = clock() - t4 > print "Pythonic time (no array creation) ==> %.8f seconds" % t5 > t2 = clock() > offspr = r + a[:,None] > t3 = clock() - t2 > print "Pythonic time ==> %.8f seconds" % t3 > t = clock() > for i in range(lambd): > offspr[i] = r[i] + a[i] > t1 = clock() - t > print "For loop time ==> %.8f seconds" % t1 > > what I get is [clip] What's your definition for offspr? Please always try to send auto-contained code snippets so that other people can better help you. -- Francesc Alted -------------- next part -------------- An HTML attachment was scrubbed... URL: From dagss at student.matnat.uio.no Thu Sep 10 07:45:10 2009 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Thu, 10 Sep 2009 13:45:10 +0200 Subject: [Numpy-discussion] Adding a 2D with a 1D array... In-Reply-To: <200909101049.13502.faltet@pytables.org> References: <4fe028e30909090328m4affca0cke983c3944b8a7233@mail.gmail.com> <4fe028e30909090536x3e82c032odc09f6b89dd34990@mail.gmail.com> <4AA7F130.3000805@student.matnat.uio.no> <200909101049.13502.faltet@pytables.org> Message-ID: <4AA8E6C6.7080906@student.matnat.uio.no> Francesc Alted wrote: > A Wednesday 09 September 2009 20:17:20 Dag Sverre Seljebotn escrigu?: > > > Ruben Salvador wrote: > > > > Your results are what I expected...but. 
This code is called from my > main > > > > program, and what I have in there (output array already created for > both > > > > cases) is: > > > > > > > > print "lambd", lambd > > > > print "np.shape(a)", np.shape(a) > > > > print "np.shape(r)", np.shape(r) > > > > print "np.shape(offspr)", np.shape(offspr) > > > > t = clock() > > > > for i in range(lambd): > > > > offspr[i] = r[i] + a[i] > > > > t1 = clock() - t > > > > print "For loop time ==> %.8f seconds" % t1 > > > > t2 = clock() > > > > offspr = r + a[:,None] > > > > t3 = clock() - t2 > > > > print "Pythonic time ==> %.8f seconds" % t3 > > > > > > > > The results I obtain are: > > > > > > > > lambd 80000 > > > > np.shape(a) (80000,) > > > > np.shape(r) (80000, 26) > > > > np.shape(offspr) (80000, 26) > > > > For loop time ==> 0.34528804 seconds > > > > Pythonic time ==> 0.35956192 seconds > > > > > > > > Maybe I'm not measuring properly, so, how should I do it? > > > > > > Like Luca said, you are not including the creation time of offspr in the > > > for-loop version. A fairer comparison would be > > > > > > offspr[...] = r + a[:, None] > > > > > > Even fairer (one less temporary copy): > > > > > > offspr[...] = r > > > offspr += a[:, None] > > > > > > Of course, see how the trend is for larger N as well. > > > > > > Also your timings are a bit crude (though this depends on how many times > > > you ran your script to check :-)). To get better measurements, use the > > > timeit module, or (easier) IPython and the %timeit command. > > Oh well, the art of benchmarking :) > > The timeit module allows you normally get less jitter in timings because > it loops on doing the same operation repeatedly and get a mean. However, > this has the drawback of filling your cache with the datasets (or part > of them) so, in the end, your measurements with timeit does not take > into account the time to transmit the data in main memory into the CPU > caches, and that may be not what you want to measure. Do you see any issues with this approach: Add a flag timeit to provide two modes: a) Do an initial run which is always not included in timings (in fact, as it gets "min" and not "mean", I think this is the current behaviour) b) Do something else between every run which should clear out the cache (like, just do another big dummy calculation). (Also a guard in timeit against CPU frequency scaling errors would be great :-) Like simply outputting a warning if frequency scaling is detected). -- Dag Sverre From faltet at pytables.org Thu Sep 10 08:03:51 2009 From: faltet at pytables.org (Francesc Alted) Date: Thu, 10 Sep 2009 14:03:51 +0200 Subject: [Numpy-discussion] Adding a 2D with a 1D array... In-Reply-To: <4AA8E6C6.7080906@student.matnat.uio.no> References: <4fe028e30909090328m4affca0cke983c3944b8a7233@mail.gmail.com> <200909101049.13502.faltet@pytables.org> <4AA8E6C6.7080906@student.matnat.uio.no> Message-ID: <200909101403.51283.faltet@pytables.org> A Thursday 10 September 2009 13:45:10 Dag Sverre Seljebotn escrigu?: > Francesc Alted wrote: > > A Wednesday 09 September 2009 20:17:20 Dag Sverre Seljebotn escrigu?: > > > Ruben Salvador wrote: > > > > Your results are what I expected...but. 
This code is called from my > > > > main > > > > > > program, and what I have in there (output array already created for > > > > both > > > > > > cases) is: > > > > > > > > > > > > > > > > print "lambd", lambd > > > > > > > > print "np.shape(a)", np.shape(a) > > > > > > > > print "np.shape(r)", np.shape(r) > > > > > > > > print "np.shape(offspr)", np.shape(offspr) > > > > > > > > t = clock() > > > > > > > > for i in range(lambd): > > > > > > > > offspr[i] = r[i] + a[i] > > > > > > > > t1 = clock() - t > > > > > > > > print "For loop time ==> %.8f seconds" % t1 > > > > > > > > t2 = clock() > > > > > > > > offspr = r + a[:,None] > > > > > > > > t3 = clock() - t2 > > > > > > > > print "Pythonic time ==> %.8f seconds" % t3 > > > > > > > > > > > > > > > > The results I obtain are: > > > > > > > > > > > > > > > > lambd 80000 > > > > > > > > np.shape(a) (80000,) > > > > > > > > np.shape(r) (80000, 26) > > > > > > > > np.shape(offspr) (80000, 26) > > > > > > > > For loop time ==> 0.34528804 seconds > > > > > > > > Pythonic time ==> 0.35956192 seconds > > > > > > > > > > > > > > > > Maybe I'm not measuring properly, so, how should I do it? > > > > > > Like Luca said, you are not including the creation time of offspr in > > > the > > > > > > for-loop version. A fairer comparison would be > > > > > > > > > > > > offspr[...] = r + a[:, None] > > > > > > > > > > > > Even fairer (one less temporary copy): > > > > > > > > > > > > offspr[...] = r > > > > > > offspr += a[:, None] > > > > > > > > > > > > Of course, see how the trend is for larger N as well. > > > > > > > > > > > > Also your timings are a bit crude (though this depends on how many > > > times > > > > > > you ran your script to check :-)). To get better measurements, use the > > > > > > timeit module, or (easier) IPython and the %timeit command. > > > > Oh well, the art of benchmarking :) > > > > The timeit module allows you normally get less jitter in timings because > > it loops on doing the same operation repeatedly and get a mean. However, > > this has the drawback of filling your cache with the datasets (or part > > of them) so, in the end, your measurements with timeit does not take > > into account the time to transmit the data in main memory into the CPU > > caches, and that may be not what you want to measure. > > Do you see any issues with this approach: Add a flag timeit to provide > two modes: > > a) Do an initial run which is always not included in timings (in fact, > as it gets "min" and not "mean", I think this is the current behaviour) Yup, you are right, it is 'min'. In fact, this is why timeit normally 'forgets' about data transmission times (with a 'mean' the effect is very similar anyways). > b) Do something else between every run which should clear out the cache > (like, just do another big dummy calculation). Yeah. In fact, you can simulate this behaviour by running two instances of timeit: one with your code + big dummy calculation, and the other with just the big dummy calculation. Subtract both numbers and you will have a better guess for non-cached calculations. > > (Also a guard in timeit against CPU frequency scaling errors would be > great :-) Like simply outputting a warning if frequency scaling is > detected). Sorry, I don't get this one. -- Francesc Alted -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From dagss at student.matnat.uio.no Thu Sep 10 08:22:57 2009 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Thu, 10 Sep 2009 14:22:57 +0200 Subject: [Numpy-discussion] Adding a 2D with a 1D array... In-Reply-To: <200909101403.51283.faltet@pytables.org> References: <4fe028e30909090328m4affca0cke983c3944b8a7233@mail.gmail.com> <200909101049.13502.faltet@pytables.org> <4AA8E6C6.7080906@student.matnat.uio.no> <200909101403.51283.faltet@pytables.org> Message-ID: <4AA8EFA1.9030809@student.matnat.uio.no> Francesc Alted wrote: > A Thursday 10 September 2009 13:45:10 Dag Sverre Seljebotn escrigu?: > > Do you see any issues with this approach: Add a flag timeit to provide > > > two modes: > > > > > > a) Do an initial run which is always not included in timings (in fact, > > > as it gets "min" and not "mean", I think this is the current behaviour) > > Yup, you are right, it is 'min'. In fact, this is why timeit normally > 'forgets' about data transmission times (with a 'mean' the effect is > very similar anyways). > > > b) Do something else between every run which should clear out the cache > > > (like, just do another big dummy calculation). > > Yeah. In fact, you can simulate this behaviour by running two instances > of timeit: one with your code + big dummy calculation, and the other > with just the big dummy calculation. Subtract both numbers and you will > have a better guess for non-cached calculations. > > > > > > (Also a guard in timeit against CPU frequency scaling errors would be > > > great :-) Like simply outputting a warning if frequency scaling is > > > detected). > > Sorry, I don't get this one. I had some trouble getting reliable benchmarks on my own computer until I realised that the power-saving capabilities of my CPU down-throttled the clock speed when it was not in use. Thus if I did two calls to timeit right after one another, the second would always report lower runtime, because the first one started at a lower clock speed. Changing a BIOS setting solved this, but it might be a gotcha which e.g. timeit and IPython could report (they could just inspect the CPU information and emit a warning -- or, do something to throttle up the CPU to full speed first). -- Dag Sverre From faltet at pytables.org Thu Sep 10 08:21:08 2009 From: faltet at pytables.org (Francesc Alted) Date: Thu, 10 Sep 2009 14:21:08 +0200 Subject: [Numpy-discussion] Fwd: GPU Numpy In-Reply-To: <4AA8C9A0.1040902@molden.no> References: <200909101119.16815.faltet@pytables.org> <4AA8C9A0.1040902@molden.no> Message-ID: <200909101421.08525.faltet@pytables.org> A Thursday 10 September 2009 11:40:48 Sturla Molden escrigu?: > Francesc Alted skrev: > > Numexpr already uses the Python parser, instead of build a new one. > > However the bytecode emitted after the compilation process is > > different, of course. > > > > Also, I don't see the point in requiring immutable buffers. Could you > > develop this further? > > If you do lacy evaluation, a function like this could fail without > immutable buffers: > > def foobar(x): > y = a*x[:] + b > x[0] = 0 # affects y and anything else depending on x > return y > > Immutable buffers are not required, one could document the oddity, but > coding would be very error-prone. > Mmh, I don't see a problem here if operation's order is kept untouched (and you normally want to do this). But I'm not an expert on 'lazy evaluation', so may want to ignore my comments better ;-) -- Francesc Alted -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From faltet at pytables.org Thu Sep 10 08:28:45 2009 From: faltet at pytables.org (Francesc Alted) Date: Thu, 10 Sep 2009 14:28:45 +0200 Subject: [Numpy-discussion] Fwd: GPU Numpy In-Reply-To: <20090910093724.GN2482@phare.normalesup.org> References: <200909101129.49984.faltet@pytables.org> <20090910093724.GN2482@phare.normalesup.org> Message-ID: <200909101428.45215.faltet@pytables.org> A Thursday 10 September 2009 11:37:24 Gael Varoquaux escrigu?: > On Thu, Sep 10, 2009 at 11:29:49AM +0200, Francesc Alted wrote: > > The point is: are GPUs prepared to compete with a general-purpose CPUs > > in all-road operations, like evaluating transcendental functions, > > conditionals all of this with a rich set of data types? I would like to > > believe that this is the case, but I don't think so (at least not yet). > > I believe (this is very foggy) that GPUs can implement non trivial logic > on there base processing unit, so that conditionals and transcendental > functions are indeed possible. Where it gets hard is when you don't have > problems that can be expressed in an embarrassingly parallel manner. But NumPy is about embarrassingly parallel calculations, right? I mean: a = np.cos(b) where b is a 10000x10000 matrix is *very* embarrassing (in the parallel meaning of the term ;-) Anyone here can say how the above operation can be done with GPUs? (and providing some timings would be really great :) -- Francesc Alted -------------- next part -------------- An HTML attachment was scrubbed... URL: From faltet at pytables.org Thu Sep 10 08:32:34 2009 From: faltet at pytables.org (Francesc Alted) Date: Thu, 10 Sep 2009 14:32:34 +0200 Subject: [Numpy-discussion] Adding a 2D with a 1D array... In-Reply-To: <4AA8EFA1.9030809@student.matnat.uio.no> References: <4fe028e30909090328m4affca0cke983c3944b8a7233@mail.gmail.com> <200909101403.51283.faltet@pytables.org> <4AA8EFA1.9030809@student.matnat.uio.no> Message-ID: <200909101432.34946.faltet@pytables.org> A Thursday 10 September 2009 14:22:57 Dag Sverre Seljebotn escrigu?: > > > (Also a guard in timeit against CPU frequency scaling errors would be > > > > > > great :-) Like simply outputting a warning if frequency scaling is > > > > > > detected). > > > > Sorry, I don't get this one. > > I had some trouble getting reliable benchmarks on my own computer until > I realised that the power-saving capabilities of my CPU down-throttled > the clock speed when it was not in use. Thus if I did two calls to > timeit right after one another, the second would always report lower > runtime, because the first one started at a lower clock speed. :-) Good point > > Changing a BIOS setting solved this, but it might be a gotcha which e.g. > timeit and IPython could report (they could just inspect the CPU > information and emit a warning -- or, do something to throttle up the > CPU to full speed first). -- Francesc Alted -------------- next part -------------- An HTML attachment was scrubbed... 
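For the np.cos(b) question above, one minimal GPU version could look like the sketch below, assuming PyCUDA and a CUDA-capable card are available (untested here; float32 and a smaller matrix are used because many current cards only run single precision at full speed):

import numpy as np
import pycuda.autoinit                  # sets up a context on the default device
import pycuda.gpuarray as gpuarray
import pycuda.cumath as cumath
import time

b = np.random.rand(4000, 4000).astype(np.float32)

t0 = time.time()
b_gpu = gpuarray.to_gpu(b)              # host -> device copy
a_gpu = cumath.cos(b_gpu)               # elementwise cos on the device
a = a_gpu.get()                         # device -> host copy
print "GPU (with transfers): %.3f s" % (time.time() - t0)

t0 = time.time()
a_cpu = np.cos(b)
print "CPU: %.3f s" % (time.time() - t0)

print "max abs diff:", abs(a - a_cpu).max()

For a memory-bound operation like this, the two host/device copies can easily dominate, so the interesting number is really the kernel time on its own.
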
URL: From rpg.314 at gmail.com Thu Sep 10 08:35:26 2009 From: rpg.314 at gmail.com (Rohit Garg) Date: Thu, 10 Sep 2009 18:05:26 +0530 Subject: [Numpy-discussion] Fwd: GPU Numpy In-Reply-To: <200909101428.45215.faltet@pytables.org> References: <200909101129.49984.faltet@pytables.org> <20090910093724.GN2482@phare.normalesup.org> <200909101428.45215.faltet@pytables.org> Message-ID: <4d5dd8c20909100535i1e8ffff9ucf7d0af001c45b1b@mail.gmail.com> > a = np.cos(b) > > where b is a 10000x10000 matrix is *very* embarrassing (in the parallel > meaning of the term ;-) On this operation, gpu's will eat up cpu's like a pack of pirhanas. :) -- Rohit Garg http://rpg-314.blogspot.com/ Senior Undergraduate Department of Physics Indian Institute of Technology Bombay From rpg.314 at gmail.com Thu Sep 10 08:36:16 2009 From: rpg.314 at gmail.com (Rohit Garg) Date: Thu, 10 Sep 2009 18:06:16 +0530 Subject: [Numpy-discussion] Fwd: GPU Numpy In-Reply-To: <200909101123.17643.faltet@pytables.org> References: <200909101036.27389.faltet@pytables.org> <4d5dd8c20909100158w1f64ef1fu242cb0c6de493b0a@mail.gmail.com> <200909101123.17643.faltet@pytables.org> Message-ID: <4d5dd8c20909100536x47f89b9dy4325d4db919c16f5@mail.gmail.com> > That's nice to see. I think I'll change my mind if someone could perform a > vector-vector multiplication (a operation that is typically memory-bounded) You mean a dot product? -- Rohit Garg http://rpg-314.blogspot.com/ Senior Undergraduate Department of Physics Indian Institute of Technology Bombay From faltet at pytables.org Thu Sep 10 08:40:59 2009 From: faltet at pytables.org (Francesc Alted) Date: Thu, 10 Sep 2009 14:40:59 +0200 Subject: [Numpy-discussion] Fwd: GPU Numpy In-Reply-To: <4d5dd8c20909100536x47f89b9dy4325d4db919c16f5@mail.gmail.com> References: <200909101123.17643.faltet@pytables.org> <4d5dd8c20909100536x47f89b9dy4325d4db919c16f5@mail.gmail.com> Message-ID: <200909101440.59119.faltet@pytables.org> A Thursday 10 September 2009 14:36:16 Rohit Garg escrigu?: > > That's nice to see. I think I'll change my mind if someone could perform > > a vector-vector multiplication (a operation that is typically > > memory-bounded) > > You mean a dot product? Whatever, dot product or element-wise product. Both are memory-bounded. -- Francesc Alted -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsouthey at gmail.com Thu Sep 10 09:39:02 2009 From: bsouthey at gmail.com (Bruce Southey) Date: Thu, 10 Sep 2009 08:39:02 -0500 Subject: [Numpy-discussion] Fwd: GPU Numpy In-Reply-To: <200909101440.59119.faltet@pytables.org> References: <200909101123.17643.faltet@pytables.org> <4d5dd8c20909100536x47f89b9dy4325d4db919c16f5@mail.gmail.com> <200909101440.59119.faltet@pytables.org> Message-ID: <4AA90176.1060903@gmail.com> On 09/10/2009 07:40 AM, Francesc Alted wrote: > > A Thursday 10 September 2009 14:36:16 Rohit Garg escrigu?: > > > > That's nice to see. I think I'll change my mind if someone could > perform > > > > a vector-vector multiplication (a operation that is typically > > > > memory-bounded) > > > > > > You mean a dot product? > > Whatever, dot product or element-wise product. Both are memory-bounded. > > -- > > Francesc Alted > > As Francesc previous said, these need to be at least in double precision and really it should also be in all the floating point precisions used by numpy on supported platforms. 
Based on the various boinc project comments, many graphics cards do not natively support double precision so you can get an inflated speedup just because of the difference in precision. Bruce -------------- next part -------------- An HTML attachment was scrubbed... URL: From rsalvador.wk at gmail.com Thu Sep 10 09:47:00 2009 From: rsalvador.wk at gmail.com (Ruben Salvador) Date: Thu, 10 Sep 2009 15:47:00 +0200 Subject: [Numpy-discussion] Adding a 2D with a 1D array... In-Reply-To: <200909101432.34946.faltet@pytables.org> References: <4fe028e30909090328m4affca0cke983c3944b8a7233@mail.gmail.com> <200909101403.51283.faltet@pytables.org> <4AA8EFA1.9030809@student.matnat.uio.no> <200909101432.34946.faltet@pytables.org> Message-ID: <4fe028e30909100647xe9f164elf0ebfefbcbed737d@mail.gmail.com> Well...you are right, sorry, I just thought 'np.shape(offspr)' result would be enough. Obviously, not! offspr wasn't actually a numpy array, but a Python list. I'm sorry for the inconvenience but I didn't realize....I'm just changing my code so that I just use numpy arrays, and forgot to change offspr definition :S It's always better not to hurry and check the changes deeper. I'lll put some time in profiling the code properly later on....now I just need to finish this! Any pointer where to start a rationale 'sane profiling techniques'? I love reading details and explanations, but don't have the time to go through hundreds of pages right now, so...some good trade-off between practical and extensive docs? Thanks everybody! On Thu, Sep 10, 2009 at 2:32 PM, Francesc Alted wrote: > A Thursday 10 September 2009 14:22:57 Dag Sverre Seljebotn escrigu?: > > > > > (Also a guard in timeit against CPU frequency scaling errors would be > > > > > > > > > > great :-) Like simply outputting a warning if frequency scaling is > > > > > > > > > > detected). > > > > > > > > Sorry, I don't get this one. > > > > > > I had some trouble getting reliable benchmarks on my own computer until > > > I realised that the power-saving capabilities of my CPU down-throttled > > > the clock speed when it was not in use. Thus if I did two calls to > > > timeit right after one another, the second would always report lower > > > runtime, because the first one started at a lower clock speed. > > :-) Good point > > > > > > Changing a BIOS setting solved this, but it might be a gotcha which e.g. > > > timeit and IPython could report (they could just inspect the CPU > > > information and emit a warning -- or, do something to throttle up the > > > CPU to full speed first). > > -- > > Francesc Alted > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jsseabold at gmail.com Thu Sep 10 09:48:08 2009 From: jsseabold at gmail.com (Skipper Seabold) Date: Thu, 10 Sep 2009 09:48:08 -0400 Subject: [Numpy-discussion] Behavior from a change in dtype? In-Reply-To: <4AA68BFC.7080707@noaa.gov> References: <1cd32cbb0909071635o40b2cd60le7b19ad40b3b2903@mail.gmail.com> <4AA68BFC.7080707@noaa.gov> Message-ID: On Tue, Sep 8, 2009 at 12:53 PM, Christopher Barker wrote: > Skipper Seabold wrote: >> Hmm, okay, well I came across this in trying to create a recarray like >> data2 below, so I guess I should just combine the two questions. > > key to understanding this is to understand what is going on under the > hood in numpy. Travis O. 
gave a nice intro in an Enthought webcast a few > months ago -- I"m not sure if those are recorded and up on the web, but > it's worth a look. It was also discussed int eh advanced numpy tutorial > at SciPy this year -- and that is up on the web: > > http://www.archive.org/details/scipy09_advancedTutorialDay1_1 > Thanks. I wasn't able to watch the Enthought webcasts on Linux, but I've seen a few of the video tutorials. What a great resource. I'm really glad this came together. > > Anyway, here is my minimal attempt to clarify: > >> import numpy as np >> >> data = np.array([[10.75, 1, 1],[10.39, 0, 1],[18.18, 0, 1]]) > > here we are using a standard array constructor -- it will look at the > data you are passing in (a mixture of python floats and ints), and > decide that they can best be represented by a numpy array of float64s. > > numpy arrays are essentially a pointer to a black of memory, and a bunch > of attributes that describe how the bytes pointed to are to be > interpreted. In this case, they are a 9 C doubles, representing a 3x3 > array of doubles. > >> dt = np.dtype([('var1', ' > (NOTE: I'm on a big-endian machine, so I've used: > dt = np.dtype([('var1', '>f8'), ('var2', '>i8'), ('var3', '>i8')]) > ) > > This is a data type descriptor that is analogous to a C struct, > containing a float64 and two int84s > >> # Doesn't work, raises TypeError: expected a readable buffer object >> data2 = data2.view(np.recarray) >> data2.astype(dt) > > I'm don't understand that error either, but recarrays are about adding > the ability to access parts of a structured array by name, but you still > need the dtype to specify the types and names. This does seem to work > (though may not be giving the results you expect): > > In [19]: data2 = data.copy() > In [20]: data2 = data2.view(np.recarray) > In [21]: data2 = data2.view(dtype=dt) > > or, indeed in the opposite order: > > In [24]: data2 = data.copy() > In [25]: data2 = data2.view(dtype=dt) > In [26]: data2 = data2.view(np.recarray) > > > So you've done two operations, one is to change the dtype -- the > interpretation of the bytes in the data buffer, and one is to make this > a recarray, which allows you to access the "fields" by name: > > In [31]: data2['var1'] > Out[31]: > array([[ 10.75], > ? ? ? ?[ 10.39], > ? ? ? ?[ 18.18]]) > >> # Works without error (?) with unexpected result >> data3 = data3.view(np.recarray) >> data3.dtype = dt > > that all depends what you expect! I used "view" above, 'cause I think > there is less magic, though it's the same thing. I suppose changing the > dtype in place like that is a tiny bit more efficient -- if you use > .view() , you are creating a new array pointing to the same data, rather > than changing the array in place. > > But anyway, the dtype describes how the bytes in the memory black are to > be interpreted, changing it by assigning the attribute or using .view() > changes the interpretation, but does not change the bytes themselves at > all, so in this case, you are taking the 8 bytes representing a float64 > of value: 1.0, and interpreting those bytes as an 8 byte int -- which is > going to give you garbage, essentially. 
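A tiny illustration of that last point -- view() reinterprets the bytes, astype() converts the values:

import numpy as np

x = np.array([1.0, 0.0, 1.0])     # three float64 values
print x.view(np.int64)            # same bytes read as int64:
                                  # [4607182418800017408  0  4607182418800017408]
print x.astype(np.int64)          # value conversion (makes a copy): [1 0 1]
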
> >> # One correct (though IMHO) unintuitive way >> data = np.rec.fromarrays(data.swapaxes(1,0), dtype=dt) > > This is using the np.rec.fromarrays constructor to build a new record > array with the dtype you want, the data is being converted and copied, > it won't change the original at all: > > So the question remains -- is there a way to convert the floats in > "data" to ints in place? > Ah, ok. I understand roughly the above. But, yes, this is my question. > > This seems to work: > In [78]: data = np.array([[10.75, 1, 1],[10.39, 0, 1],[18.18, 0, 1]]) > > In [79]: data[:,1:3] = data[:,1:3].astype('>i8').view(dtype='>f8') > > In [80]: data.dtype = dt > > It is making a copy of the integer data in process -- but I think that > is required, as you are changing the value, not just the interpretation > of the bytes. I suppose we could have a "astype_inplace" method, but > that would only work if the two types were the same size, and I'm not > sure it's a common enough use to be worth it. > > What is your real use case? I suspect that what you really should do > here is define your dtype first, then create the array of data: > I have a function that eventually appends an ndarray of floats that are 0 to 1 to a recarray, and I ran into it trying to debug. Then I was just curious about the modification in place. > data = np.array([(10.75, 1, 1), (10.39, 0, 1), (18.18, 0, 1)], dtype=dt) > > which does require that you use tuples, rather than lists to hold the > "structs". > Ah yes, I have had a bit of trouble extending my same function to structured arrays, but that's another thread if I can't figure it out. Thanks for the help. Cheers, Skipper From rpg.314 at gmail.com Thu Sep 10 09:51:15 2009 From: rpg.314 at gmail.com (Rohit Garg) Date: Thu, 10 Sep 2009 19:21:15 +0530 Subject: [Numpy-discussion] Fwd: GPU Numpy In-Reply-To: <4AA90176.1060903@gmail.com> References: <200909101123.17643.faltet@pytables.org> <4d5dd8c20909100536x47f89b9dy4325d4db919c16f5@mail.gmail.com> <200909101440.59119.faltet@pytables.org> <4AA90176.1060903@gmail.com> Message-ID: <4d5dd8c20909100651q4c0445c9ud4f603ff58ff1765@mail.gmail.com> Apart from float and double, which floating point formats are supported by numpy? On Thu, Sep 10, 2009 at 7:09 PM, Bruce Southey wrote: > On 09/10/2009 07:40 AM, Francesc Alted wrote: > > A Thursday 10 September 2009 14:36:16 Rohit Garg escrigu?: > >> > That's nice to see. I think I'll change my mind if someone could perform > >> > a vector-vector multiplication (a operation that is typically > >> > memory-bounded) > >> > >> You mean a dot product? > > Whatever, dot product or element-wise product. Both are memory-bounded. > > -- > > Francesc Alted > > As Francesc previous said, these need to be at least in double precision and > really it should also be in all the floating point precisions used by numpy > on supported platforms. Based on the various boinc project comments, many > graphics cards do not natively support double precision so? you can get an > inflated speedup just because of the difference in precision. 
> > Bruce > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- Rohit Garg http://rpg-314.blogspot.com/ Senior Undergraduate Department of Physics Indian Institute of Technology Bombay From faltet at pytables.org Thu Sep 10 11:20:13 2009 From: faltet at pytables.org (Francesc Alted) Date: Thu, 10 Sep 2009 17:20:13 +0200 Subject: [Numpy-discussion] Fwd: GPU Numpy In-Reply-To: <4d5dd8c20909100651q4c0445c9ud4f603ff58ff1765@mail.gmail.com> References: <4AA90176.1060903@gmail.com> <4d5dd8c20909100651q4c0445c9ud4f603ff58ff1765@mail.gmail.com> Message-ID: <200909101720.13736.faltet@pytables.org> A Thursday 10 September 2009 15:51:15 Rohit Garg escrigu?: > Apart from float and double, which floating point formats are > supported by numpy? I think whatever supported by the underlying CPU, whenever it is extended double precision (12 bytes) or quad precision (16 bytes). -- Francesc Alted -------------- next part -------------- An HTML attachment was scrubbed... URL: From rpg.314 at gmail.com Thu Sep 10 11:29:27 2009 From: rpg.314 at gmail.com (Rohit Garg) Date: Thu, 10 Sep 2009 20:59:27 +0530 Subject: [Numpy-discussion] Fwd: GPU Numpy In-Reply-To: <200909101720.13736.faltet@pytables.org> References: <4AA90176.1060903@gmail.com> <4d5dd8c20909100651q4c0445c9ud4f603ff58ff1765@mail.gmail.com> <200909101720.13736.faltet@pytables.org> Message-ID: <4d5dd8c20909100829g79b63ae4rffb433c751c45f25@mail.gmail.com> > I think whatever supported by the underlying CPU, whenever it is extended > double precision (12 bytes) or quad precision (16 bytes). classic 64 bit cpu's support neither. > > -- > > Francesc Alted > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- Rohit Garg http://rpg-314.blogspot.com/ Senior Undergraduate Department of Physics Indian Institute of Technology Bombay From robert.kern at gmail.com Thu Sep 10 11:46:25 2009 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 10 Sep 2009 10:46:25 -0500 Subject: [Numpy-discussion] Fwd: GPU Numpy In-Reply-To: <200909101428.45215.faltet@pytables.org> References: <200909101129.49984.faltet@pytables.org> <20090910093724.GN2482@phare.normalesup.org> <200909101428.45215.faltet@pytables.org> Message-ID: <3d375d730909100846x54f22c7g12b28c853c6bd5ce@mail.gmail.com> On Thu, Sep 10, 2009 at 07:28, Francesc Alted wrote: > A Thursday 10 September 2009 11:37:24 Gael Varoquaux escrigu?: > >> On Thu, Sep 10, 2009 at 11:29:49AM +0200, Francesc Alted wrote: > >> > The point is: are GPUs prepared to compete with a general-purpose CPUs > >> > in all-road operations, like evaluating transcendental functions, > >> > conditionals all of this with a rich set of data types? I would like to > >> > believe that this is the case, but I don't think so (at least not yet). > >> > >> I believe (this is very foggy) that GPUs can implement non trivial logic > >> on there base processing unit, so that conditionals and transcendental > >> functions are indeed possible. Where it gets hard is when you don't have > >> problems that can be expressed in an embarrassingly parallel manner. > > But NumPy is about embarrassingly parallel calculations, right? I mean: > > a = np.cos(b) > > where b is a 10000x10000 matrix is *very* embarrassing (in the parallel > meaning of the term ;-) Yes. 
However, it is worth making the distinction between embarrassingly parallel problems and SIMD problems. Not all embarrassingly parallel problems are SIMD-capable. GPUs do SIMD, not generally embarrassing problems. If there are branches, as would be necessary for many special functions, the GPU does not perform as well. Basically, every unit has to do both branches because they all must do the same instruction at the same time, even though the data on each unit only gets processed by one branch. cos() is easy. Or at least is so necessary to graphics computing that it is already a primitive in all (most?) GPU languages. Googling around shows SIMD code for the basic transcendental functions. I believe you have to code them differently than you would on a CPU. Other special functions would simply be hard to do efficiently. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From rpg.314 at gmail.com Thu Sep 10 12:19:28 2009 From: rpg.314 at gmail.com (Rohit Garg) Date: Thu, 10 Sep 2009 21:49:28 +0530 Subject: [Numpy-discussion] Fwd: GPU Numpy In-Reply-To: <3d375d730909100846x54f22c7g12b28c853c6bd5ce@mail.gmail.com> References: <200909101129.49984.faltet@pytables.org> <20090910093724.GN2482@phare.normalesup.org> <200909101428.45215.faltet@pytables.org> <3d375d730909100846x54f22c7g12b28c853c6bd5ce@mail.gmail.com> Message-ID: <4d5dd8c20909100919laf5a790s9cc21f93ba1a8feb@mail.gmail.com> > Yes. However, it is worth making the distinction between > embarrassingly parallel problems and SIMD problems. Not all > embarrassingly parallel problems are SIMD-capable. GPUs do SIMD, not > generally embarrassing problems. GPUs exploit both dimensions of parallelism, both simd (aka vectorization) and parallelization (aka multicore). And yeah, 99.9% of the time branching on GPU should be the least/last of your worries if your problem is data-parallel. There are much worse things than branchings. As for SIMD special functions, branching can certainly be eliminated. I have written/come across some special functions myself, and I do not know any case which is difficult to do efficiently on a gpu. Certainly, I know less than some folks around here. May be you can contribute a counter example to this discussion. Regards, -- Rohit Garg http://rpg-314.blogspot.com/ Senior Undergraduate Department of Physics Indian Institute of Technology Bombay From washakie at gmail.com Thu Sep 10 13:03:20 2009 From: washakie at gmail.com (John [H2O]) Date: Thu, 10 Sep 2009 10:03:20 -0700 (PDT) Subject: [Numpy-discussion] iteration slowing, no increase in memory Message-ID: <25387205.post@talk.nabble.com> Hello, I have a routine that is iterating through a series of directories, loading files, plotting, then moving on... It runs very well for the first few iterations, but then slows tremendously - there is nothing significantly different about the files or directory in which it slows. I've monitored the memory use, and it is not increasing. I've looked at what other possible explanations there may be, but I am at a loss. Does anyone have suggestions for where to start looking. I recognized without the code it is difficult, but I don't know that there is any one 'piece' of code to post, and it's problem not of interest for me to post the entire script here. Thanks! 
-- View this message in context: http://www.nabble.com/iteration-slowing%2C-no-increase-in-memory-tp25387205p25387205.html Sent from the Numpy-discussion mailing list archive at Nabble.com. From robert.kern at gmail.com Thu Sep 10 13:09:52 2009 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 10 Sep 2009 12:09:52 -0500 Subject: [Numpy-discussion] iteration slowing, no increase in memory In-Reply-To: <25387205.post@talk.nabble.com> References: <25387205.post@talk.nabble.com> Message-ID: <3d375d730909101009pc8cebdfn241a8075331a901@mail.gmail.com> On Thu, Sep 10, 2009 at 12:03, John [H2O] wrote: > > Hello, > > I have a routine that is iterating through a series of directories, loading > files, plotting, then moving on... > > It runs very well for the first few iterations, but then slows tremendously > - there is nothing significantly different about the files or directory in > which it slows. One thing you can do to verify this is to change the order of iteration. You will also want to profile your code. Then you can see what is taking up so much time. http://docs.python.org/library/profile -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From chad.netzer at gmail.com Thu Sep 10 20:32:28 2009 From: chad.netzer at gmail.com (Chad Netzer) Date: Thu, 10 Sep 2009 17:32:28 -0700 Subject: [Numpy-discussion] iteration slowing, no increase in memory In-Reply-To: <25387205.post@talk.nabble.com> References: <25387205.post@talk.nabble.com> Message-ID: On Thu, Sep 10, 2009 at 10:03 AM, John [H2O] wrote: > It runs very well for the first few iterations, but then slows tremendously > - there is nothing significantly different about the files or directory in > which it slows. I've monitored the memory use, and it is not increasing. The memory use itself is not a good indicator, as modern operating systems (Linux, Windows, Mac, et al) generally use all available free memory as a disk cache. So the system memory use may remain quite steady while old data is flushed and new data paged in. The first few iterations could be "fast" if they are already in memory, although the behavior should probably change on repeated runs. If you reboot, then immediately run the script, is it slow on all directories? Or if you can't reboot, can you at least remount the filesystem (which should flush all the cached data and metadata)? Or, for recent Linux kernels: http://linux-mm.org/Drop_Caches Are other operations slow/fast for the different directories, such as tar'ing them up, or "du -s"? Can you verify the integrity of the drive with SMART tools? If its Linux, can you get data on the actual disk device I/O (using "iostat" or "vmstat")? Or you could test by iterating over the same directory repeatedly; it should be fast after the first iteration. Then move to a "problem" directory and see if the first iteration only is slow, or if all iterations are slow. -C From chad.netzer at gmail.com Fri Sep 11 03:07:13 2009 From: chad.netzer at gmail.com (Chad Netzer) Date: Fri, 11 Sep 2009 00:07:13 -0700 Subject: [Numpy-discussion] Huge arrays In-Reply-To: References: Message-ID: On Tue, Sep 8, 2009 at 6:41 PM, Charles R Harris wrote: > > More precisely, 2GB for windows and 3GB for (non-PAE enabled) linux. 
And just to further clarify, even with PAE enabled on linux, any individual process has about a 3 GB address limit (there are hacks to raise that to 3.5 or 4GB, but with a performance penalty). But 4 GB is the absolute max addressable RAM for a single 32 bit process (even if the kernel itself can use up to 64GB of physical RAM with PAE). For gory details on Windows address space limits: http://msdn.microsoft.com/en-us/library/bb613473%28VS.85%29.aspx If running 64bit is not an option, I'd consider the "compress in RAM" technique. Delta-compression for most sampled signals should be quite doable. Heck, here's some untested pseudo-code: import numpy import zlib data_row = numpy.zeros(2000000, dtype=numpy.int16) # Fill up data_row compressed_row_strings = [] data_row[1:] = data_row[:-1] - data_row[1:] # quick n dirty delta encoding compressed_row_strings.append(zlib.compress(data_row.tostring()) # Put a loop in there, reuse the row array, and you are almost all set. The delta # encoding is optional, but probably useful for most "real world" 1d signals. # If you don't have the time between samples to compress the whole row, break # it into smaller chunks (see zlib.compressobj()) -C From sole at esrf.fr Fri Sep 11 03:30:30 2009 From: sole at esrf.fr (=?ISO-8859-1?Q?=22V=2E_Armando_Sol=E9=22?=) Date: Fri, 11 Sep 2009 09:30:30 +0200 Subject: [Numpy-discussion] Dot product performance on python 2.6 (windows) Message-ID: <4AA9FC96.8040608@esrf.fr> Hello, I have found performance problems under windows when using python 2.6 In my case, they seem to be related to the dot product. The following simple script: import numpy import time a=numpy.arange(1000000.) a.shape=1000,1000 t0=time.time() b=numpy.dot(a.T,a) print "Elapsed time = ",time.time()-t0 reports an "Elapsed time" of 1.4 seconds under python 2.5 and 15 seconds under python 2.6 Same version of numpy, same machine, official numpy installers for windows (both with nosse flag) Are some libraries missing in the windows superpack for python 2.6? Perhaps the reported problem is already known, but I did not find any information about it. Best regards, Armando From david at ar.media.kyoto-u.ac.jp Fri Sep 11 03:12:27 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Fri, 11 Sep 2009 16:12:27 +0900 Subject: [Numpy-discussion] Dot product performance on python 2.6 (windows) In-Reply-To: <4AA9FC96.8040608@esrf.fr> References: <4AA9FC96.8040608@esrf.fr> Message-ID: <4AA9F85B.5010005@ar.media.kyoto-u.ac.jp> V. Armando Sol? wrote: > Hello, > > I have found performance problems under windows when using python 2.6 > In my case, they seem to be related to the dot product. > > The following simple script: > > import numpy > import time > a=numpy.arange(1000000.) > a.shape=1000,1000 > t0=time.time() > b=numpy.dot(a.T,a) > print "Elapsed time = ",time.time()-t0 > > reports an "Elapsed time" of 1.4 seconds under python 2.5 and 15 seconds > under python 2.6 > > Same version of numpy, same machine, official numpy installers for > windows (both with nosse flag) > Could you confirm this by pasting the output of numpy.show_config() in both versions ? 
cheers, David From sole at esrf.fr Fri Sep 11 03:47:37 2009 From: sole at esrf.fr (=?ISO-8859-1?Q?=22V=2E_Armando_Sol=E9=22?=) Date: Fri, 11 Sep 2009 09:47:37 +0200 Subject: [Numpy-discussion] Dot product performance on python 2.6 (windows) In-Reply-To: <4AA9F85B.5010005@ar.media.kyoto-u.ac.jp> References: <4AA9FC96.8040608@esrf.fr> <4AA9F85B.5010005@ar.media.kyoto-u.ac.jp> Message-ID: <4AAA0099.9020906@esrf.fr> David Cournapeau wrote: > V. Armando Sol? wrote: > >> Hello, >> >> I have found performance problems under windows when using python 2.6 >> In my case, they seem to be related to the dot product. >> >> The following simple script: >> >> import numpy >> import time >> a=numpy.arange(1000000.) >> a.shape=1000,1000 >> t0=time.time() >> b=numpy.dot(a.T,a) >> print "Elapsed time = ",time.time()-t0 >> >> reports an "Elapsed time" of 1.4 seconds under python 2.5 and 15 seconds >> under python 2.6 >> >> Same version of numpy, same machine, official numpy installers for >> windows (both with nosse flag) >> >> > > Could you confirm this by pasting the output of numpy.show_config() in > both versions The output of: python -c "import numpy; import sys; print sys.executable;numpy.show_config()" > python26.txt and python -c "import numpy; import sys; print sys.executable; numpy.show_config()" > python25.txt are identical except for the first line: diff python25.txt python26.txt 1c1 < C:\Python25\python.exe --- > C:\Python26\python.exe I paste the python26.txt because the other one is the same except for the first line: C:\Python26\python.exe blas_info: libraries = ['blas'] library_dirs = ['C:\\local\\lib\\yop\\nosse'] language = f77 lapack_info: libraries = ['lapack'] library_dirs = ['C:\\local\\lib\\yop\\nosse'] language = f77 atlas_threads_info: NOT AVAILABLE blas_opt_info: libraries = ['blas'] library_dirs = ['C:\\local\\lib\\yop\\nosse'] language = f77 define_macros = [('NO_ATLAS_INFO', 1)] atlas_blas_threads_info: NOT AVAILABLE lapack_opt_info: libraries = ['lapack', 'blas'] library_dirs = ['C:\\local\\lib\\yop\\nosse'] language = f77 define_macros = [('NO_ATLAS_INFO', 1)] atlas_info: NOT AVAILABLE lapack_mkl_info: NOT AVAILABLE blas_mkl_info: NOT AVAILABLE atlas_blas_info: NOT AVAILABLE mkl_info: NOT AVAILABLE Any hint? Armando From sole at esrf.fr Fri Sep 11 04:23:37 2009 From: sole at esrf.fr (=?ISO-8859-1?Q?=22V=2E_Armando_Sol=E9=22?=) Date: Fri, 11 Sep 2009 10:23:37 +0200 Subject: [Numpy-discussion] Dot product performance on python 2.6 (windows) In-Reply-To: <4AA9F85B.5010005@ar.media.kyoto-u.ac.jp> References: <4AA9FC96.8040608@esrf.fr> <4AA9F85B.5010005@ar.media.kyoto-u.ac.jp> Message-ID: <4AAA0909.5070709@esrf.fr> Hello, It seems to point towards a packaging problem. In python 2.5, I can do: import numpy.core._dotblas as dotblas dotblas.__file__ and I get: C:\\Python25\\lib\\site-packages\\numpy\\core\\_dotblas.pyd In python 2.6: >>>import numpy.core._dotblas as dotblas ... ImportError: No module named _dotblas and, of course, I cannot find the _dotblas.pyd file in the relevant directories. Best regards, Armando From sturla at molden.no Fri Sep 11 05:05:28 2009 From: sturla at molden.no (Sturla Molden) Date: Fri, 11 Sep 2009 11:05:28 +0200 Subject: [Numpy-discussion] Dot product performance on python 2.6 (windows) In-Reply-To: <4AA9FC96.8040608@esrf.fr> References: <4AA9FC96.8040608@esrf.fr> Message-ID: <4AAA12D8.5080909@molden.no> V. Armando Sol? skrev: > import numpy > import time > a=numpy.arange(1000000.) 
> a.shape=1000,1000 > t0=time.time() > b=numpy.dot(a.T,a) > print "Elapsed time = ",time.time()-t0 > > reports an "Elapsed time" of 1.4 seconds under python 2.5 and 15 seconds > under python 2.6 > My computer reports 0.34 seconds (Pytin 2.6.2, Win32). S.M. From david at ar.media.kyoto-u.ac.jp Fri Sep 11 04:53:13 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Fri, 11 Sep 2009 17:53:13 +0900 Subject: [Numpy-discussion] Dot product performance on python 2.6 (windows) In-Reply-To: <4AAA0909.5070709@esrf.fr> References: <4AA9FC96.8040608@esrf.fr> <4AA9F85B.5010005@ar.media.kyoto-u.ac.jp> <4AAA0909.5070709@esrf.fr> Message-ID: <4AAA0FF9.5040708@ar.media.kyoto-u.ac.jp> V. Armando Sol? wrote: > Hello, > > It seems to point towards a packaging problem. > > In python 2.5, I can do: > > import numpy.core._dotblas as dotblas > dotblas.__file__ > > and I get: > > C:\\Python25\\lib\\site-packages\\numpy\\core\\_dotblas.pyd > That's where the error lies: if you install with nosse, you should not get _dotblas.pyd at all. The 15 second is the 'normal' speed if you don't use ATLAS. I will look into the packaging problem - could you open an issue on numpy trac, so that I don't forget about it ? cheers, David > In python 2.6: > > >>>import numpy.core._dotblas as dotblas > ... > ImportError: No module named _dotblas > > and, of course, I cannot find the _dotblas.pyd file in the relevant > directories. > > Best regards, > > Armando > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > From sturla at molden.no Fri Sep 11 05:14:44 2009 From: sturla at molden.no (Sturla Molden) Date: Fri, 11 Sep 2009 11:14:44 +0200 Subject: [Numpy-discussion] Dot product performance on python 2.6 (windows) In-Reply-To: <4AAA0909.5070709@esrf.fr> References: <4AA9FC96.8040608@esrf.fr> <4AA9F85B.5010005@ar.media.kyoto-u.ac.jp> <4AAA0909.5070709@esrf.fr> Message-ID: <4AAA1504.2050007@molden.no> V. Armando Sol? skrev: > In python 2.6: > > >>>import numpy.core._dotblas as dotblas > ... > ImportError: No module named _dotblas > >>> import numpy.core._dotblas as dotblas >>> dotblas.__file__ 'C:\\Python26\\lib\\site-packages\\numpy\\core\\_dotblas.pyd' From sole at esrf.fr Fri Sep 11 05:22:28 2009 From: sole at esrf.fr (=?ISO-8859-1?Q?=22V=2E_Armando_Sol=E9=22?=) Date: Fri, 11 Sep 2009 11:22:28 +0200 Subject: [Numpy-discussion] Dot product performance on python 2.6 (windows) In-Reply-To: <4AAA1504.2050007@molden.no> References: <4AA9FC96.8040608@esrf.fr> <4AA9F85B.5010005@ar.media.kyoto-u.ac.jp> <4AAA0909.5070709@esrf.fr> <4AAA1504.2050007@molden.no> Message-ID: <4AAA16D4.9090600@esrf.fr> Sturla Molden wrote: > V. Armando Sol? skrev: > >> In python 2.6: >> >> >>>import numpy.core._dotblas as dotblas >> ... >> ImportError: No module named _dotblas >> >> > > >>> import numpy.core._dotblas as dotblas > >>> dotblas.__file__ > 'C:\\Python26\\lib\\site-packages\\numpy\\core\\_dotblas.pyd' > That's because you have installed either the sse2 or the sse3 versions. As I said in my post, the problem affects the nosse version. _dotblas.pyd is missing in the nosse version and that is a problem unless one forgets about supporting Pentium III and Socket A processors when developing code. 
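A quick way to check from Python which situation you are in (this just combines the probes already used earlier in this thread):

import numpy as np

try:
    import numpy.core._dotblas as _dotblas
    print "accelerated dot:", _dotblas.__file__
except ImportError:
    print "_dotblas not built; numpy.dot falls back to the slow generic code"

np.show_config()    # shows which BLAS/LAPACK the installer was linked against
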
Armando From sole at esrf.fr Fri Sep 11 05:27:25 2009 From: sole at esrf.fr (=?ISO-8859-1?Q?=22V=2E_Armando_Sol=E9=22?=) Date: Fri, 11 Sep 2009 11:27:25 +0200 Subject: [Numpy-discussion] Dot product performance on python 2.6 (windows) In-Reply-To: <4AAA0FF9.5040708@ar.media.kyoto-u.ac.jp> References: <4AA9FC96.8040608@esrf.fr> <4AA9F85B.5010005@ar.media.kyoto-u.ac.jp> <4AAA0909.5070709@esrf.fr> <4AAA0FF9.5040708@ar.media.kyoto-u.ac.jp> Message-ID: <4AAA17FD.3070103@esrf.fr> David Cournapeau wrote: > V. Armando Sol? wrote: > >> Hello, >> >> It seems to point towards a packaging problem. >> >> In python 2.5, I can do: >> >> import numpy.core._dotblas as dotblas >> dotblas.__file__ >> >> and I get: >> >> C:\\Python25\\lib\\site-packages\\numpy\\core\\_dotblas.pyd >> >> > > That's where the error lies: if you install with nosse, you should not > get _dotblas.pyd at all. Why? The nosse for python 2.5 has _dotblas.pyd Is it impossible to get it compiled under python 2.6 without using sse2 or sse3? If so it should be somewhere written in the release notes. > I will look into the packaging problem - could you open an issue on > numpy trac, so that I don't forget about it ? > OK. I'll try to do it. Best regards, Armando From david at ar.media.kyoto-u.ac.jp Fri Sep 11 05:25:29 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Fri, 11 Sep 2009 18:25:29 +0900 Subject: [Numpy-discussion] Dot product performance on python 2.6 (windows) In-Reply-To: <4AAA17FD.3070103@esrf.fr> References: <4AA9FC96.8040608@esrf.fr> <4AA9F85B.5010005@ar.media.kyoto-u.ac.jp> <4AAA0909.5070709@esrf.fr> <4AAA0FF9.5040708@ar.media.kyoto-u.ac.jp> <4AAA17FD.3070103@esrf.fr> Message-ID: <4AAA1789.50307@ar.media.kyoto-u.ac.jp> V. Armando Sol? wrote: > David Cournapeau wrote: > >> V. Armando Sol? wrote: >> >> >>> Hello, >>> >>> It seems to point towards a packaging problem. >>> >>> In python 2.5, I can do: >>> >>> import numpy.core._dotblas as dotblas >>> dotblas.__file__ >>> >>> and I get: >>> >>> C:\\Python25\\lib\\site-packages\\numpy\\core\\_dotblas.pyd >>> >>> >>> >> That's where the error lies: if you install with nosse, you should not >> get _dotblas.pyd at all. >> > Why? The nosse for python 2.5 has _dotblas.pyd > Yes, and it should not - because the _dotblas.pyd uses SSE2 instructions. The python 2.6 installer is the correct one, python 2.5 is not. > Is it impossible to get it compiled under python 2.6 without using sse2 > or sse3? > It is possible to compile anything you want if you are willing to go through the hassle of compiling ATLAS on windows. The binary installer only uses ATLAS (and hence build _dotblas) for SSE2 and SSE3. The low availability of machines with only SSE does not worth the hassle to do it anymore (but again, that's only an issue of the official binaries, you can still compile your own atlas). cheers, David From josef.pktd at gmail.com Fri Sep 11 06:44:17 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 11 Sep 2009 06:44:17 -0400 Subject: [Numpy-discussion] Dot product performance on python 2.6 (windows) In-Reply-To: <4AAA1789.50307@ar.media.kyoto-u.ac.jp> References: <4AA9FC96.8040608@esrf.fr> <4AA9F85B.5010005@ar.media.kyoto-u.ac.jp> <4AAA0909.5070709@esrf.fr> <4AAA0FF9.5040708@ar.media.kyoto-u.ac.jp> <4AAA17FD.3070103@esrf.fr> <4AAA1789.50307@ar.media.kyoto-u.ac.jp> Message-ID: <1cd32cbb0909110344x314cc1a3x45893da5af1f690b@mail.gmail.com> On Fri, Sep 11, 2009 at 5:25 AM, David Cournapeau wrote: > V. Armando Sol? wrote: >> David Cournapeau wrote: >> >>> V. 
Armando Sol? wrote: >>> >>> >>>> Hello, >>>> >>>> It seems to point towards a packaging problem. >>>> >>>> In python 2.5, I can do: >>>> >>>> import numpy.core._dotblas as dotblas >>>> dotblas.__file__ >>>> >>>> and I get: >>>> >>>> C:\\Python25\\lib\\site-packages\\numpy\\core\\_dotblas.pyd >>>> >>>> >>>> >>> That's where the error lies: if you install with nosse, you should not >>> get _dotblas.pyd at all. >>> >> Why? The nosse for python 2.5 has _dotblas.pyd >> > > Yes, and it should not - because the _dotblas.pyd uses SSE2 > instructions. ?The python 2.6 installer is the correct one, python 2.5 > is not. >> Is it impossible to get it compiled under python 2.6 without using sse2 >> or sse3? >> > > It is possible to compile anything you want if you are willing to go > through the hassle of compiling ATLAS on windows. The binary installer > only uses ATLAS (and hence build _dotblas) for SSE2 and SSE3. The low > availability of machines with only SSE does not worth the hassle to do > it anymore (but again, that's only an issue of the official binaries, > you can still compile your own atlas). > It's also possible to use the older ATLAS binaries at http://www.scipy.org/Installing_SciPy/Windows#head-cd37d819e333227e327079e4c2a2298daf625624 with MingW, which does not require building ATLAS yourself. I'm using the sse2 from there and don't have any problems. David, If you have updated ATLAS binaries (e.g. also for sse3), is it possible to add them to the webpage? Thanks, Josef > cheers, > > David > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From romain.brette at ens.fr Fri Sep 11 07:52:40 2009 From: romain.brette at ens.fr (Romain Brette) Date: Fri, 11 Sep 2009 11:52:40 +0000 (UTC) Subject: [Numpy-discussion] Speed of class derived from float64 Message-ID: Hi, In our project we define a class derived from numpy.float64 (and we add units) and I noticed that instance creation was very slow. I found out that creating a float64 object is fast, but creating an object from the derived class is almost 10 times slower, even if that class doesn't do anything new. For example: class C(float64): pass x=float64(5.) takes 0.6 ?s x=float64() takes 0.5 ?s x=C(5.) takes 3.9 ?s x=C() takes 0.6 ?s Finally, if C derives from float instead of float64, x=C(5.) takes 0.3 ?s. What I find surprising is that the extra time is only when the object is initialized with a value. Does anybody have an idea about it? (and possibly a solution to make it faster?) Romain From chanley at stsci.edu Fri Sep 11 11:15:39 2009 From: chanley at stsci.edu (Christopher Hanley) Date: Fri, 11 Sep 2009 11:15:39 -0400 Subject: [Numpy-discussion] numpy on OSX 10.6 issues Message-ID: <4AAA699B.2020000@stsci.edu> Hi, I'm looking for some help getting the svn trunk numpy working on Max OS X 10.6. I've installed my own version of Python 2.6 from python.org. I've got the following flags set: setenv MACOSX_DEPLOYMENT_TARGET 10.6 setenv CFLAGS "-arch i386 -arch x86_64" setenv FFLAGS "-arch i386 -arch x86_64" setenv LDFLAGS "-Wall -undefined dynamic_lookup -bundle -arch i386 -arch x86_64" The build seems to complete with no errors. However when I attempt to import numpy I get the following error: Python 2.6.2 (r262:71600, Apr 16 2009, 09:17:39) [GCC 4.0.1 (Apple Computer, Inc. build 5250)] on darwin Type "help", "copyright", "credits" or "license" for more information. 
>>> import numpy Traceback (most recent call last): File "", line 1, in File "/Users/chanley/dev/site-packages/lib/python/numpy/__init__.py", line 130 , in import add_newdocs File "/Users/chanley/dev/site-packages/lib/python/numpy/add_newdocs.py", line 9, in from lib import add_newdoc File "/Users/chanley/dev/site-packages/lib/python/numpy/lib/__init__.py", line 4, in from type_check import * File "/Users/chanley/dev/site-packages/lib/python/numpy/lib/type_check.py", li ne 8, in import numpy.core.numeric as _nx File "/Users/chanley/dev/site-packages/lib/python/numpy/core/__init__.py", lin e 8, in import numerictypes as nt File "/Users/chanley/dev/site-packages/lib/python/numpy/core/numerictypes.py", line 600, in _typestr[key] = empty((1,),key).dtype.str[1:] ValueError: array is too big. >>> Any suggestions? Thank you for your time and help, Chris -- Christopher Hanley Senior Systems Software Engineer Space Telescope Science Institute 3700 San Martin Drive Baltimore MD, 21218 (410) 338-4338 From meine at informatik.uni-hamburg.de Fri Sep 11 11:21:56 2009 From: meine at informatik.uni-hamburg.de (Hans Meine) Date: Fri, 11 Sep 2009 17:21:56 +0200 Subject: [Numpy-discussion] iteration slowing, no increase in memory In-Reply-To: <25387205.post@talk.nabble.com> References: <25387205.post@talk.nabble.com> Message-ID: <200909111721.56309.meine@informatik.uni-hamburg.de> On Thursday 10 September 2009 19:03:20 John [H2O] wrote: > I have a routine that is iterating through a series of directories, loading > files, plotting, then moving on... > > It runs very well for the first few iterations, but then slows tremendously Maybe you "collect" some data into growing data structures and some of your algorithms have non-constant time complexity w.r.t the size of these? HTH, Hans From a.h.jaffe at gmail.com Fri Sep 11 11:24:11 2009 From: a.h.jaffe at gmail.com (Andrew Jaffe) Date: Fri, 11 Sep 2009 08:24:11 -0700 Subject: [Numpy-discussion] searchsorted for exact matches, not preserving order Message-ID: Dear all, I've got two (integer) arrays, and I want to find the indices in the first one that have entries in the second. I.E. I want all idx s.t. there exists a j with a[idx]=b[j]. Here is my current implementation (with a = pixnums, b=surveypix) import numpy as np def matchPix(pixnums, surveypix): spix = np.sort(surveypix) ### returns a list of indices into spix to keep spix sorted when inserting pixnums ss = np.searchsorted(spix, pixnums) ss[ss==len(spix)] = 0 ## if any of the pixnums are > max(spix) ### now need to extract the actual matches idxs = [i for (i,s) in enumerate(ss) if pixnums[i]==spix[s]] return np.asarray(idxs) This works, and is pretty efficient, but to me this actually seems like a more common task than searchsorted itself; is there a simpler, more numpyish way to do this? Andrew From robert.kern at gmail.com Fri Sep 11 11:33:28 2009 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 11 Sep 2009 10:33:28 -0500 Subject: [Numpy-discussion] searchsorted for exact matches, not preserving order In-Reply-To: References: Message-ID: <3d375d730909110833o334deeeanb35e8416651e2eb6@mail.gmail.com> On Fri, Sep 11, 2009 at 10:24, Andrew Jaffe wrote: > Dear all, > > I've got two (integer) arrays, and I want to find the indices in the > first one that have entries in the second. I.E. I want all idx s.t. > there exists a j with a[idx]=b[j]. Here is my current implementation > (with a = pixnums, b=surveypix) numpy.setmember1d() [or numpy.in1d() for the SVN trunk of numpy]. 
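In other words, for the matchPix() example above:

import numpy as np

pixnums = np.array([3, 7, 2, 9, 4])
surveypix = np.array([2, 4, 8, 9])

mask = np.in1d(pixnums, surveypix)   # np.setmember1d(...) on 1.3.x
idxs = np.nonzero(mask)[0]
print idxs                           # [2 3 4] -- these pixnums appear in surveypix
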
-- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From fperez.net at gmail.com Fri Sep 11 13:25:58 2009 From: fperez.net at gmail.com (Fernando Perez) Date: Fri, 11 Sep 2009 10:25:58 -0700 Subject: [Numpy-discussion] iteration slowing, no increase in memory In-Reply-To: <25387205.post@talk.nabble.com> References: <25387205.post@talk.nabble.com> Message-ID: On Thu, Sep 10, 2009 at 10:03 AM, John [H2O] wrote: > I have a routine that is iterating through a series of directories, loading > files, plotting, then moving on... > > It runs very well for the first few iterations, but then slows tremendously You mention plotting. I'd suggest checking that you aren't holding state inside matplotlib, which is exceedingly easy to do without noticing if you only use the pylab/pyplot interface and don't take care to clear things out. As a quick check, disable the plotting by commenting the plot commands out, but leave the rest of the code to run. That will help isolate whether the problem is indeed in your plotting code. Cheers, f From amenity at enthought.com Fri Sep 11 14:17:57 2009 From: amenity at enthought.com (Amenity Applewhite) Date: Fri, 11 Sep 2009 13:17:57 -0500 Subject: [Numpy-discussion] Scientific Computing with Python, September 18, 2009 References: <1183663757.1252692788349.JavaMail.root@p2-ws607.ad.prodcc.net> Message-ID: <4832F195-92FA-412D-9C1C-CEE81851F10B@enthought.com> (HTML version of email) Greetings! September is well upon us and it looks like it's already time for another Scientific Computing with Python webinar. Next week, Travis Oliphant will be hosting a presentation on regression analysis in NumPy and SciPy. As you are probably aware, Travis was the primary developer of NumPy, so we're fortunate to have him presenting these tools. Here's a word on what to expect Friday: A common scientific and engineering need is to find the parameters to a model that best fit a particular data set. A large number of techniques and tools have been created for assisting with this general problem. They vary based on the model (e.g. linear or nonlinear), the characteristics of the errors on the data (e.g. weighted or un- weighted), and the error metric selected (e.g. least-squares, or absolute difference). This webinar will provide an overview of the tools that SciPy and NumPy provide for regression analysis including linear and non-linear least-squares and a brief look at handling other error metrics. We will also demonstrate simple GUI tools that can make some problems easier and provide a quick overview of the new Scikits package statsmodels whose API is maturing in a separate package but should be incorporated into SciPy in the future. Here's the registration information: Scientific Computing with Python Webinar: Regression analysis in NumPy Friday, September 18 1pm CDT/6pm UTC Register at GoToMeeting: https://www1.gotomeeting.com/register/632400424 Forward email http://ui.constantcontact.com/sa/fwtf.jsp?m=1102424111856&ea=leah%40enthought.com&a=1102702114724&id=preview Hope to see you there! -- Amenity Applewhite Enthought, Inc. Scientific Computing Solutions www.enthought.com -------------- next part -------------- An HTML attachment was scrubbed... 
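As a small taste of the kind of problem the webinar will cover, a nonlinear least-squares fit with the existing SciPy machinery might look like this (the exponential model, noise level and starting values are just an illustration):

import numpy as np
from scipy.optimize import leastsq

x = np.linspace(0, 10, 50)
y = 3.0*np.exp(-0.5*x) + 0.1*np.random.randn(50)   # noisy synthetic data

def residuals(p, x, y):
    A, k = p
    return y - A*np.exp(-k*x)

p0 = [1.0, 1.0]                          # initial guess
p_best, ier = leastsq(residuals, p0, args=(x, y))
print p_best                             # should come out near [3.0, 0.5]
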
URL: From nwagner at iam.uni-stuttgart.de Fri Sep 11 14:36:39 2009 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Fri, 11 Sep 2009 20:36:39 +0200 Subject: [Numpy-discussion] Ticket 1216 Message-ID: Hi all, Ticket http://projects.scipy.org/numpy/ticket/1216 can be closed. Cheers, Nils From sebas0 at gmail.com Fri Sep 11 15:22:48 2009 From: sebas0 at gmail.com (Sebastian) Date: Fri, 11 Sep 2009 16:22:48 -0300 Subject: [Numpy-discussion] 64-bit Fedora 9 a=numpy.zeros(0x80000000, dtype='b1') Message-ID: Hello, The folks at stsci (Jim T.) are not able to reproduce this error with 1.4.0.dev7362 so I guess there is something wrong with my numpy installation. I also tried '1.4.0.dev7362' and numpy1.3 (stable) but alas, the same error! My system: [root at siate numpy]# uname -a Linux siate.iate.oac.uncor.edu 2.6.27.25-78.2.56.fc9.x86_64 #1 SMP Thu Jun 18 12:24:37 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux Below is the numpy error and the the output of numpy, build and install as attached: python setup.py build > build.txt python setup.py install > install.txt Can't work out what is going wrong here. I don't think I'm missing some dependency nor mixing compilers, but maybe I'm wrong, any hints? best regards, - Sebastian Gurovich [root at siate soft]# ipython Python 2.5.1 (r251:54863, Jun 15 2008, 18:24:56) Type "copyright", "credits" or "license" for more information. IPython 0.8.3 -- An enhanced Interactive Python. ? -> Introduction and overview of IPython's features. %quickref -> Quick reference. help -> Python's own help system. object? -> Details about 'object'. ?object also works, ?? prints more. In [6]: numpy.__version__ Out[6]: '1.4.0.dev7375' In [7]: a=numpy.zeros(0x80000000,dtype='b1') In [8]: a.data --------------------------------------------------------------------------- ValueError Traceback (most recent call last) /home/sebaguro/Desktop/soft/ in () ValueError: size must be zero or positive -------------- next part -------------- An HTML attachment was scrubbed... 
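For what it's worth, the requested length is exactly one past the largest signed 32-bit integer, so "size must be zero or positive" is what you would expect if some part of the build ends up storing sizes in a 32-bit int -- just a guess, but easy to check:

print 0x80000000        # 2147483648 == 2**31
print 2**31 - 1         # 2147483647, the largest signed 32-bit value
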
URL: 
-------------- next part --------------
non-existing path in 'numpy/distutils': 'site.cfg'
F2PY Version 2_7375
blas_opt_info:
blas_mkl_info:
  libraries mkl,vml,guide not found in /usr/local/lib64, /usr/local/lib, /usr/lib64, /usr/lib
  NOT AVAILABLE
atlas_blas_threads_info:
  Setting PTATLAS=ATLAS
  libraries ptf77blas,ptcblas,atlas not found in /usr/local/lib64, /usr/local/lib, /usr/lib64/atlas, /usr/lib64/sse2, /usr/lib64, /usr/lib
  NOT AVAILABLE
atlas_blas_info:
  libraries f77blas,cblas,atlas not found in the same directories
  NOT AVAILABLE
blas_info:
  libraries blas not found in /usr/local/lib64, /usr/local/lib
  FOUND: libraries = ['blas'], library_dirs = ['/usr/lib64'], language = f77
FOUND: libraries = ['blas'], library_dirs = ['/usr/lib64'], define_macros = [('NO_ATLAS_INFO', 1)], language = f77
lapack_opt_info:
lapack_mkl_info:
mkl_info:
  libraries mkl,vml,guide not found in /usr/local/lib64, /usr/local/lib, /usr/lib64, /usr/lib
  NOT AVAILABLE
  NOT AVAILABLE
atlas_threads_info:
  Setting PTATLAS=ATLAS
  libraries ptf77blas,ptcblas,atlas and lapack_atlas not found in any of the directories above
  numpy.distutils.system_info.atlas_threads_info
  NOT AVAILABLE
atlas_info:
  libraries f77blas,cblas,atlas and lapack_atlas not found in any of the directories above
  numpy.distutils.system_info.atlas_info
  NOT AVAILABLE
lapack_info:
  libraries lapack not found in /usr/local/lib64, /usr/local/lib
  FOUND: libraries = ['lapack'], library_dirs = ['/usr/lib64'], language = f77
FOUND: libraries = ['lapack', 'blas'], library_dirs = ['/usr/lib64'], define_macros = [('NO_ATLAS_INFO', 1)], language = f77
running install
running build
running config_cc
unifing config_cc, config, build_clib, build_ext, build commands --compiler options
running config_fc
unifing config_fc, config, build_clib, build_ext, build commands --fcompiler options
running build_src
build_src
building py_modules sources
building library "npymath" sources
customize GnuFCompiler
Found executable /usr/bin/g77
gnu: no Fortran 90 compiler found
customize GnuFCompiler using config
C compiler: gcc -pthread -fno-strict-aliasing -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -D_GNU_SOURCE -fPIC -fPIC
compile options: '-Inumpy/core/src -Inumpy/core/src/npymath -Inumpy/core/src/multiarray -Inumpy/core/src/umath -Inumpy/core/include -I/usr/include/python2.5 -c'
gcc: _configtest.c
_configtest.c:1: warning: conflicting types for built-in function 'exp'
gcc -pthread _configtest.o -o _configtest
_configtest.o: In function `main': undefined reference to `exp'
collect2: ld returned 1 exit status
failure.
removing: _configtest.c _configtest.o
gcc -pthread _configtest.o -lm -o _configtest
success!
removing: _configtest.c _configtest.o _configtest
building extension "numpy.core._sort" sources
  adding config.h, numpyconfig.h and __multiarray_api.h from build/src.linux-x86_64-2.5/numpy/core/include/numpy to sources
building extension "numpy.core.multiarray" sources
building extension "numpy.core.umath" sources
  adding __ufunc_api.h to sources and build/src.linux-x86_64-2.5/numpy/core/src/umath to include_dirs
building extension "numpy.core.scalarmath" sources
building extension "numpy.core._dotblas" sources
building extension "numpy.core.umath_tests" sources
building extension "numpy.core.multiarray_tests" sources
building extension "numpy.lib._compiled_base" sources
building extension "numpy.numarray._capi" sources
building extension "numpy.fft.fftpack_lite" sources
building extension "numpy.linalg.lapack_lite" sources
  adding 'numpy/linalg/lapack_litemodule.c' and 'numpy/linalg/python_xerbla.c' to sources
building extension "numpy.random.mtrand" sources
gcc: _configtest.c
gcc -pthread _configtest.o -o _configtest
_configtest
failure.
removing: _configtest.c _configtest.o _configtest
building data_files sources
build_src: building npy-pkg config files
running build_py
running build_clib
running build_ext
running scons
running build_scripts
adding 'build/scripts.linux-x86_64-2.5/f2py' to scripts
running install_lib
  copying the built numpy tree into /usr/lib64/python2.5/site-packages/numpy and byte-compiling the copied .py files
running install_scripts
changing mode of /usr/bin/f2py to 755
running install_data
  copying install.txt, the generated API headers (__multiarray_api.h, multiarray_api.txt, __ufunc_api.h, ufunc_api.txt) and the npy-pkg-config files (npymath.ini, mlib.ini) into the installed package
running install_egg_info
Removing /usr/lib64/python2.5/site-packages/numpy-1.4.0.dev7375-py2.5.egg-info
Writing /usr/lib64/python2.5/site-packages/numpy-1.4.0.dev7375-py2.5.egg-info
running install_clib
-------------- next part --------------
non-existing path in 'numpy/distutils': 'site.cfg'
F2PY Version 2_7375
(the blas_opt_info / lapack_opt_info probing repeats exactly as in the attachment above)
running build
running config_cc
running config_fc
running build_src
build_src
building py_modules sources
building library "npymath" sources
(the same GnuFCompiler / gcc configuration tests as above, ending with the -lm success)
building extension "numpy.core._sort" sources
building extension "numpy.core.multiarray" sources
building extension "numpy.core.umath" sources
building extension "numpy.core.scalarmath" sources
building extension "numpy.core._dotblas" sources
building extension "numpy.core.umath_tests" sources
building extension "numpy.core.multiarray_tests" sources
building extension "numpy.lib._compiled_base" sources
building extension "numpy.numarray._capi" sources
building extension "numpy.fft.fftpack_lite" sources
building extension "numpy.linalg.lapack_lite" sources
building extension "numpy.random.mtrand" sources
gcc: _configtest.c
gcc -pthread _configtest.o -o _configtest
_configtest
failure.
removing: _configtest.c _configtest.o _configtest
building data_files sources
build_src: building npy-pkg config files
running build_py
copying numpy/version.py -> build/lib.linux-x86_64-2.5/numpy
copying build/src.linux-x86_64-2.5/numpy/__config__.py -> build/lib.linux-x86_64-2.5/numpy
copying build/src.linux-x86_64-2.5/numpy/distutils/__config__.py -> build/lib.linux-x86_64-2.5/numpy/distutils
copying numpy/f2py/__svn_version__.py -> build/lib.linux-x86_64-2.5/numpy/f2py
copying numpy/core/__svn_version__.py -> build/lib.linux-x86_64-2.5/numpy/core
running build_clib
customize UnixCCompiler
customize UnixCCompiler using build_clib
running build_ext
customize UnixCCompiler
customize UnixCCompiler using build_ext
customize GnuFCompiler
gnu: no Fortran 90 compiler found
customize GnuFCompiler using build_ext
running scons
running build_scripts
adding 'build/scripts.linux-x86_64-2.5/f2py' to scripts

From a.h.jaffe at gmail.com  Fri Sep 11 16:46:40 2009
From: a.h.jaffe at gmail.com (Andrew Jaffe)
Date: Fri, 11 Sep 2009 13:46:40 -0700
Subject: [Numpy-discussion] searchsorted for exact matches, not preserving order
In-Reply-To: <3d375d730909110833o334deeeanb35e8416651e2eb6@mail.gmail.com>
References: <3d375d730909110833o334deeeanb35e8416651e2eb6@mail.gmail.com>
Message-ID: <4AAAB730.7020708@gmail.com>

On 11/09/2009 08:33, Robert Kern wrote:
> On Fri, Sep 11, 2009 at 10:24, Andrew Jaffe wrote:
>> Dear all,
>>
>> I've got two (integer) arrays, and I want to find the indices in the
>> first one that have entries in the second. I.E. I want all idx s.t.
>> there exists a j with a[idx]=b[j]. Here is my current implementation
>> (with a = pixnums, b=surveypix)
>
> numpy.setmember1d() [or numpy.in1d() for the SVN trunk of numpy].

Robert,

Thanks. But in fact this fails for my (possibly corner or edge) case:
when the first array has duplicates that, in fact, are not in the second
array, indices corresponding to those entries get returned. In general,
duplicates are not necessarily treated right by this algorithm, I don't
think.

I can understand that this may be a feature, not a bug, but in fact for
my use-case I want the algorithm to return the indices corresponding to
all entries in ar1 with the same value, if that value appears anywhere
in ar2.

Yours,

Andrew

From robert.kern at gmail.com  Fri Sep 11 16:52:42 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Fri, 11 Sep 2009 15:52:42 -0500
Subject: [Numpy-discussion] searchsorted for exact matches, not preserving order
In-Reply-To: <4AAAB730.7020708@gmail.com>
References: <3d375d730909110833o334deeeanb35e8416651e2eb6@mail.gmail.com> <4AAAB730.7020708@gmail.com>
Message-ID: <3d375d730909111352p41f62da0nbbe2ca78b47e79c1@mail.gmail.com>

On Fri, Sep 11, 2009 at 15:46, Andrew Jaffe wrote:
> On 11/09/2009 08:33, Robert Kern wrote:
>> On Fri, Sep 11, 2009 at 10:24, Andrew Jaffe wrote:
>>> Dear all,
>>>
>>> I've got two (integer) arrays, and I want to find the indices in the
>>> first one that have entries in the second. I.E. I want all idx s.t.
>>> there exists a j with a[idx]=b[j]. Here is my current implementation
>>> (with a = pixnums, b=surveypix)
>>
>> numpy.setmember1d() [or numpy.in1d() for the SVN trunk of numpy].
>>
> Robert,
>
> Thanks.
> But in fact this fails for my (possibly corner or edge) case:
> when the first array has duplicates that, in fact, are not in the second
> array, indices corresponding to those entries get returned. In general,
> duplicates are not necessarily treated right by this algorithm, I don't
> think.
>
> I can understand that this may be a feature, not a bug, but in fact for
> my use-case I want the algorithm to return the indices corresponding to
> all entries in ar1 with the same value, if that value appears anywhere
> in ar2.

That is the main feature that in1d() added over setmember1d(). If you
cannot use the SVN trunk of numpy, you can just copy the implementation.
It is not large.

In [4]: a = np.arange(5)

In [5]: b = np.hstack([a,a])

In [6]: c = np.arange(0, 10, 2)

In [7]: c
Out[7]: array([0, 2, 4, 6, 8])

In [8]: np.in1d(b, c)
Out[8]: array([ True, False,  True, False,  True,  True, False,  True, False,  True], dtype=bool)

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco
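A minimal follow-up sketch of how that boolean mask becomes the indices the
original question asked for; the pixnums/surveypix values below are invented
for illustration, and only in1d() and nonzero() are taken from the thread:

    import numpy as np

    pixnums = np.array([3, 7, 3, 2, 9, 7])      # first array, with duplicates
    surveypix = np.array([2, 3, 5])              # second array

    mask = np.in1d(pixnums, surveypix)           # True where pixnums[i] occurs anywhere in surveypix
    idx = np.nonzero(mask)[0]                    # indices of all matching entries
    # idx -> array([0, 2, 3]): both copies of 3 are kept, while the
    # duplicated 7 (absent from surveypix) stays excluded.

This is the behaviour Andrew describes wanting: a duplicated value in the
first array is reported only when that value really occurs somewhere in the
second array.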
From dwf at cs.toronto.edu  Thu Sep 10 15:39:30 2009
From: dwf at cs.toronto.edu (David Warde-Farley)
Date: Thu, 10 Sep 2009 15:39:30 -0400
Subject: [Numpy-discussion] iteration slowing, no increase in memory
In-Reply-To: <3d375d730909101009pc8cebdfn241a8075331a901@mail.gmail.com>
References: <25387205.post@talk.nabble.com> <3d375d730909101009pc8cebdfn241a8075331a901@mail.gmail.com>
Message-ID: <664F01E5-8507-4B76-AA93-2648EF0FD46E@cs.toronto.edu>

On 10-Sep-09, at 1:09 PM, Robert Kern wrote:

> One thing you can do to verify this is to change the order of
> iteration. You will also want to profile your code. Then you can see
> what is taking up so much time.
>
> http://docs.python.org/library/profile

Because apparently Robert is too modest to mention his own
contribution to profiling:

http://packages.python.org/line_profiler/

David

From robert.kern at gmail.com  Fri Sep 11 17:31:21 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Fri, 11 Sep 2009 16:31:21 -0500
Subject: [Numpy-discussion] iteration slowing, no increase in memory
In-Reply-To: <664F01E5-8507-4B76-AA93-2648EF0FD46E@cs.toronto.edu>
References: <25387205.post@talk.nabble.com> <3d375d730909101009pc8cebdfn241a8075331a901@mail.gmail.com> <664F01E5-8507-4B76-AA93-2648EF0FD46E@cs.toronto.edu>
Message-ID: <3d375d730909111431v87fda0du4135f2ad08d40f3b@mail.gmail.com>

On Thu, Sep 10, 2009 at 14:39, David Warde-Farley wrote:
> On 10-Sep-09, at 1:09 PM, Robert Kern wrote:
>
>> One thing you can do to verify this is to change the order of
>> iteration. You will also want to profile your code. Then you can see
>> what is taking up so much time.
>>
>> http://docs.python.org/library/profile
>
> Because apparently Robert is too modest to mention his own
> contribution to profiling:
>
> http://packages.python.org/line_profiler/

Not at all. It's just not relevant yet. line_profiler is good if you
know that a particular function is taking too long but don't know why.
You have to use cProfile first to figure out which function, if any,
is the bottleneck.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

From ndbecker2 at gmail.com  Fri Sep 11 19:07:20 2009
From: ndbecker2 at gmail.com (Neal Becker)
Date: Fri, 11 Sep 2009 19:07:20 -0400
Subject: [Numpy-discussion] Scientific Computing with Python, September 18, 2009
References: <1183663757.1252692788349.JavaMail.root@p2-ws607.ad.prodcc.net> <4832F195-92FA-412D-9C1C-CEE81851F10B@enthought.com>
Message-ID:

I'd love to participate in these webinars. Problem is, AFAICT, gotomeeting
only supports windows.

From ferrell at diablotech.com  Fri Sep 11 19:26:06 2009
From: ferrell at diablotech.com (Robert Ferrell)
Date: Fri, 11 Sep 2009 17:26:06 -0600
Subject: [Numpy-discussion] Scientific Computing with Python, September 18, 2009
In-Reply-To: 
References: <1183663757.1252692788349.JavaMail.root@p2-ws607.ad.prodcc.net> <4832F195-92FA-412D-9C1C-CEE81851F10B@enthought.com>
Message-ID: <84746922-9382-4B1B-84F1-FA7D7940CFAD@diablotech.com>

On Sep 11, 2009, at 5:07 PM, Neal Becker wrote:

> I'd love to participate in these webinars. Problem is, AFAICT,
> gotomeeting
> only supports windows.

I'm not certain that is correct. I've participated in some of these,
and I'm running OS X (10.5). I think those were gotomeeting, although I
don't honestly recall. Assuming nothing's changed, though, worked
great on OS X.

From lciti at essex.ac.uk  Sat Sep 12 09:19:34 2009
From: lciti at essex.ac.uk (Citi, Luca)
Date: Sat, 12 Sep 2009 14:19:34 +0100
Subject: [Numpy-discussion] 64-bit Fedora 9 a=numpy.zeros(0x80000000, dtype='b1')
In-Reply-To: 
References: 
Message-ID: <271BED32E925E646A1333A56D9C6AFCB31E561A00B@MBOX0.essex.ac.uk>

Hi,
with the standard ubuntu version (1.2.1), I get
>>> a=np.zeros(0x800000,dtype='b1') # OK
>>> a=np.zeros(0x80000000,dtype='b1')
TypeError: data type not understood
>>> a=np.zeros(0x80000000,dtype=bool)
ValueError: negative dimensions are not allowed
while with 1.4.0.dev7375, I get
>>> a=np.zeros(0x80000000,dtype='b1') # (both 'b1' and bool give the same result)
ValueError: Maximum allowed dimension exceeded

I think it might have something to do with the fact that on a 32bit
machine (like mine, what about yours?) the default int is 32bit.
An int32 with value 0x80000000 has the sign bit set and is actually -2**31.
In fact with a different machine (64bit ubuntu, numpy 1.2.1), it works
>>> a=np.zeros(0x80000000,dtype='b1') # OK
In any case, with a 32bit machine you could not address such a big array anyway.

Best,
Luca

From charlesr.harris at gmail.com  Sat Sep 12 09:35:11 2009
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Sat, 12 Sep 2009 08:35:11 -0500
Subject: [Numpy-discussion] 64-bit Fedora 9 a=numpy.zeros(0x80000000, dtype='b1')
In-Reply-To: 
References: 
Message-ID:

On Fri, Sep 11, 2009 at 2:22 PM, Sebastian wrote:

> Hello,
>
> The folks at stsci (Jim T.) are not able to reproduce this error with
> 1.4.0.dev7362 so I guess there is something wrong with my numpy
> installation.
> I also tried '1.4.0.dev7362' and numpy1.3 (stable) but alas, the same
> error!
>
> My system:
> [root at siate numpy]# uname -a
> Linux siate.iate.oac.uncor.edu 2.6.27.25-78.2.56.fc9.x86_64 #1 SMP Thu Jun
> 18 12:24:37 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux
>
> Below is the numpy error and the output of the numpy build and install as
> attached:
> python setup.py build > build.txt
> python setup.py install > install.txt
>
> Can't work out what is going wrong here. I don't think I'm missing some
> dependency nor mixing compilers, but maybe I'm wrong, any hints?
> best regards, > - Sebastian Gurovich > > > [root at siate soft]# ipython > Python 2.5.1 (r251:54863, Jun 15 2008, 18:24:56) > Type "copyright", "credits" or "license" for more information. > > IPython 0.8.3 -- An enhanced Interactive Python. > ? -> Introduction and overview of IPython's features. > %quickref -> Quick reference. > help -> Python's own help system. > object? -> Details about 'object'. ?object also works, ?? prints more. > > In [6]: numpy.__version__ > Out[6]: '1.4.0.dev7375' > > In [7]: a=numpy.zeros(0x80000000,dtype='b1') > > I expect 0x80000000 is interpreted by python as a negative number. Try 2**31 intead. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From lciti at essex.ac.uk Sat Sep 12 10:03:27 2009 From: lciti at essex.ac.uk (Citi, Luca) Date: Sat, 12 Sep 2009 15:03:27 +0100 Subject: [Numpy-discussion] 64-bit Fedora 9 a=numpy.zeros(0x80000000, dtype='b1') In-Reply-To: References: , Message-ID: <271BED32E925E646A1333A56D9C6AFCB31E561A00C@MBOX0.essex.ac.uk> I just realized that Sebastian posted its 'uname -a' and he has a 64bit machine. In this case it should work as mine (the 64bit one) does. Maybe during the compilation some flags prevented a full 64bit code to be compiled? From charlesr.harris at gmail.com Sat Sep 12 10:11:24 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 12 Sep 2009 09:11:24 -0500 Subject: [Numpy-discussion] 64-bit Fedora 9 a=numpy.zeros(0x80000000, dtype='b1') In-Reply-To: <271BED32E925E646A1333A56D9C6AFCB31E561A00C@MBOX0.essex.ac.uk> References: <271BED32E925E646A1333A56D9C6AFCB31E561A00C@MBOX0.essex.ac.uk> Message-ID: On Sat, Sep 12, 2009 at 9:03 AM, Citi, Luca wrote: > I just realized that Sebastian posted its 'uname -a' and he has a 64bit > machine. > In this case it should work as mine (the 64bit one) does. > Maybe during the compilation some flags prevented a full 64bit code to be > compiled? > __ > Ints are still 32 bits on 64 bit machines, but the real question is how python interprets the hex value. I don't have access at the moment, but maybe 0x80000000L would behave differently. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From lciti at essex.ac.uk Sat Sep 12 10:47:44 2009 From: lciti at essex.ac.uk (Citi, Luca) Date: Sat, 12 Sep 2009 15:47:44 +0100 Subject: [Numpy-discussion] 64-bit Fedora 9 a=numpy.zeros(0x80000000, dtype='b1') In-Reply-To: References: <271BED32E925E646A1333A56D9C6AFCB31E561A00C@MBOX0.essex.ac.uk>, Message-ID: <271BED32E925E646A1333A56D9C6AFCB31E561A00D@MBOX0.essex.ac.uk> Python shouldn't be the problem here. Even on a 32bit machine >>> a = 0x80000000 2147483648L >>> a=np.zeros(a, dtype=bool) ValueError: negative dimensions are not allowed From charlesr.harris at gmail.com Sat Sep 12 11:28:03 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 12 Sep 2009 10:28:03 -0500 Subject: [Numpy-discussion] 64-bit Fedora 9 a=numpy.zeros(0x80000000, dtype='b1') In-Reply-To: <271BED32E925E646A1333A56D9C6AFCB31E561A00D@MBOX0.essex.ac.uk> References: <271BED32E925E646A1333A56D9C6AFCB31E561A00C@MBOX0.essex.ac.uk> <271BED32E925E646A1333A56D9C6AFCB31E561A00D@MBOX0.essex.ac.uk> Message-ID: On Sat, Sep 12, 2009 at 9:47 AM, Citi, Luca wrote: > Python shouldn't be the problem here. 
> Even on a 32bit machine > >>> a = 0x80000000 > 2147483648L > >>> a=np.zeros(a, dtype=bool) > ValueError: negative dimensions are not allowed > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mpi at comxnet.dk Sat Sep 12 17:57:02 2009 From: mpi at comxnet.dk (Mads Ipsen) Date: Sat, 12 Sep 2009 23:57:02 +0200 Subject: [Numpy-discussion] Error in header file - wrong mailing list? Message-ID: <4AAC192E.4020706@comxnet.dk> Hey, I recently posted a bug related to a compile error in the header file 'npy_common.h' but have received no responses so far. Am I posting this in the wrong mailing list? Best regards, Mads -- +------------------------------------------------------------+ | Mads Ipsen, Scientific developer | +------------------------------+-----------------------------+ | QuantumWise A/S | phone: +45-29716388 | | N?rres?gade 27A | www: www.quantumwise.com | | DK-1370 Copenhagen, Denmark | email: mpi at quantumwise.com | +------------------------------+-----------------------------+ From pav at iki.fi Sat Sep 12 18:47:34 2009 From: pav at iki.fi (Pauli Virtanen) Date: Sun, 13 Sep 2009 01:47:34 +0300 Subject: [Numpy-discussion] Error in header file - wrong mailing list? In-Reply-To: <4AAC192E.4020706@comxnet.dk> References: <4AAC192E.4020706@comxnet.dk> Message-ID: <1252795653.23272.7.camel@idol> la, 2009-09-12 kello 23:57 +0200, Mads Ipsen kirjoitti: > Hey, > > I recently posted a bug related to a compile error in the header file > 'npy_common.h' but have received no responses so far. > > Am I posting this in the wrong mailing list? It is the correct list, but in general bug reports are better reported in the bug tracker: http://projects.scipy.org/numpy/ Message concerning bugs, especially if they are minor ones, are easily lost in the other traffic on the ML. The error you get from the comma at the end of the enum must be because you have -pedantic -Werror in your CFLAGS. Just unset CFLAGS or so before compilation. Yes, comma at the end of enum list is not strictly valid C89, although it is valid C99 and already passes for most compilers. -- Pauli Virtanen From pav at iki.fi Sat Sep 12 18:55:46 2009 From: pav at iki.fi (Pauli Virtanen) Date: Sun, 13 Sep 2009 01:55:46 +0300 Subject: [Numpy-discussion] Error in header file - wrong mailing list? In-Reply-To: <1252795653.23272.7.camel@idol> References: <4AAC192E.4020706@comxnet.dk> <1252795653.23272.7.camel@idol> Message-ID: <1252796145.23272.11.camel@idol> su, 2009-09-13 kello 01:47 +0300, Pauli Virtanen kirjoitti: [clip] > The error you get from the comma at the end of the enum must be because > you have > > -pedantic -Werror > > in your CFLAGS. Just > > unset CFLAGS > > or so before compilation. Yes, comma at the end of enum list is not > strictly valid C89, although it is valid C99 and already passes for most > compilers. Another possibility is that these flags come from /usr/lib/pythonX.Y/config/Makefile -- in that case it's maybe possible to override one of the the BASECFLAGS etc. variables with environment vars. Also, I see that OPT="-std=c89 -pedantic -Werror" python setup.py won't succeed in current SVN, because apparently the configuration detection code is not strict C89. 
-- Pauli Virtanen From jsseabold at gmail.com Sat Sep 12 20:12:02 2009 From: jsseabold at gmail.com (Skipper Seabold) Date: Sat, 12 Sep 2009 20:12:02 -0400 Subject: [Numpy-discussion] Scientific Computing with Python, September 18, 2009 In-Reply-To: <84746922-9382-4B1B-84F1-FA7D7940CFAD@diablotech.com> References: <1183663757.1252692788349.JavaMail.root@p2-ws607.ad.prodcc.net> <4832F195-92FA-412D-9C1C-CEE81851F10B@enthought.com> <84746922-9382-4B1B-84F1-FA7D7940CFAD@diablotech.com> Message-ID: On Fri, Sep 11, 2009 at 7:26 PM, Robert Ferrell wrote: > > > On Sep 11, 2009, at 5:07 PM, Neal Becker wrote: > >> I'd love to participate in these webinars. ?Problem is, AFAICT, >> gotomeeting >> only supports windows. > > I'm not certain that is correct. ?I've participated in some of these, > and Im' running OS X (10.5). ?I think those were gotomeeting, although > don't honestly recall. ?Assuming nothing's changed, though, worked > great on OS X. > I wasn't able to connect in the past using linux and their web site says Windows or Mac. FWIW, my friend sets these things up for work, and they use commpartners or webex. It looks like webex supports linux (don't know about commpartners), but I don't know much about the details, costs etc. http://support.webex.com/support/system-requirements.html Skipper From david at ar.media.kyoto-u.ac.jp Sun Sep 13 02:48:55 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Sun, 13 Sep 2009 15:48:55 +0900 Subject: [Numpy-discussion] 64-bit Fedora 9 a=numpy.zeros(0x80000000, dtype='b1') In-Reply-To: References: <271BED32E925E646A1333A56D9C6AFCB31E561A00C@MBOX0.essex.ac.uk> Message-ID: <4AAC95D7.3050804@ar.media.kyoto-u.ac.jp> Charles R Harris wrote: > > > On Sat, Sep 12, 2009 at 9:03 AM, Citi, Luca > wrote: > > I just realized that Sebastian posted its 'uname -a' and he has a > 64bit machine. > In this case it should work as mine (the 64bit one) does. > Maybe during the compilation some flags prevented a full 64bit > code to be compiled? > __ > > > Ints are still 32 bits on 64 bit machines, but the real question is > how python interprets the hex value. That's not a python problem: the conversion of the object to a C int/long happens in numpy (in PyArray_IntpFromSequence in this case). I am not sure I understand exactly what the code is doing, though. I don't understand the rationale for #ifdef/#endif in the one item in shape tuple case (line 521 and below), as well as the call to PyNumber_Int, cheers, David From kxroberto at googlemail.com Sun Sep 13 06:30:58 2009 From: kxroberto at googlemail.com (Robert) Date: Sun, 13 Sep 2009 12:30:58 +0200 Subject: [Numpy-discussion] Why is the truth value of ndarray not simply size>0 ? In-Reply-To: <4AA50EBB.5080402@wartburg.edu> References: <4AA50EBB.5080402@wartburg.edu> Message-ID: Neil Martinsen-Burrell wrote: > On 2009-09-07 07:11 , Robert wrote: >> Is there a reason why ndarray truth tests (except scalars) >> deviates from the convention of other Python iterables >> list,array.array,str,dict,... ? >> >> Furthermore there is a surprising strange exception for arrays >> with size 1 (!= scalars). > > Historically, numpy's predecessors used "not equal to zero" as the > meaning for truth (consistent with numerical types in Python). However, > this introduces an ambiguity as both any(a != 0) and all(a != 0) are > reasonable interpretations of the truth value of a sequence of numbers. 
well, I can familiarize with that "not equal to zero" philosophy for a math-centric array type (different from a container / size>0 philosophy) However I don't see that all(a) (or "all(a != 0)") is something which anybody would ever expect with .__nonzero__() / if a: ... . Does anybody? And the current behavior with all those strange exceptions and exceptions from exceptions still seems awkward and unnecessary. The any() interpretion is outstandingly "right" in my opinion, and doesn't need to be guessed: anything/any part non-zero disturbs the clean "zeroness". Zero must be wholly pure zero. This is so everywhere in math and informatics. a number/memory is zero when all bits/bytes are zero. a matrix is a zero matrix when all elements are zero... This way only the test is also seamlessly consistent with a zero length array (while all(zerolengtharray != 0) would be True surprisingly!) This kind of any(a) truth test (only) is also often needed, and it would be also fast executable this way. It would be compatible with None/False init/default variable tests during code evolution in Python style and would behave well everywhere as far as I can see. It would also not break old code. Would a feature request in that direction have any chance? Robert > Numpy refuses to guess and raises the exception shown below. For > sequences with a single item, there is no ambiguity and numpy does the > (numerically) ordinary thing. > > The ndarray type available in Numpy is not conceptually an extension of > Python's iterables. If you'd like to help other Numpy users with this > issue, you can edit the documentation in the online documentation editor > at http://docs.scipy.org/numpy/docs/numpy-docs/user/index.rst > > -Neil From nadavh at visionsense.com Sun Sep 13 06:39:23 2009 From: nadavh at visionsense.com (Nadav Horesh) Date: Sun, 13 Sep 2009 13:39:23 +0300 Subject: [Numpy-discussion] 64-bit Fedora 9 a=numpy.zeros(0x80000000, dtype='b1') References: <271BED32E925E646A1333A56D9C6AFCB31E561A00C@MBOX0.essex.ac.uk> <4AAC95D7.3050804@ar.media.kyoto-u.ac.jp> Message-ID: <710F2847B0018641891D9A21602763605AD167@ex3.envision.co.il> Could it be a problem of python version? I get no error with python2.6.2 (on amd64 gentoo) Nadav -----????? ??????----- ???: numpy-discussion-bounces at scipy.org ??? David Cournapeau ????: ? 13-??????-09 09:48 ??: Discussion of Numerical Python ????: Re: [Numpy-discussion] 64-bit Fedora 9 a=numpy.zeros(0x80000000, dtype='b1') Charles R Harris wrote: > > > On Sat, Sep 12, 2009 at 9:03 AM, Citi, Luca > wrote: > > I just realized that Sebastian posted its 'uname -a' and he has a > 64bit machine. > In this case it should work as mine (the 64bit one) does. > Maybe during the compilation some flags prevented a full 64bit > code to be compiled? > __ > > > Ints are still 32 bits on 64 bit machines, but the real question is > how python interprets the hex value. That's not a python problem: the conversion of the object to a C int/long happens in numpy (in PyArray_IntpFromSequence in this case). I am not sure I understand exactly what the code is doing, though. I don't understand the rationale for #ifdef/#endif in the one item in shape tuple case (line 521 and below), as well as the call to PyNumber_Int, cheers, David _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... 
Name: winmail.dat Type: application/ms-tnef Size: 3965 bytes Desc: not available URL: From kxroberto at googlemail.com Sun Sep 13 07:46:01 2009 From: kxroberto at googlemail.com (Robert) Date: Sun, 13 Sep 2009 13:46:01 +0200 Subject: [Numpy-discussion] Why is the truth value of ndarray not simply size>0 ? In-Reply-To: References: <4AA50EBB.5080402@wartburg.edu> Message-ID: Robert wrote: > Neil Martinsen-Burrell wrote: >> On 2009-09-07 07:11 , Robert wrote: >>> Is there a reason why ndarray truth tests (except scalars) >>> deviates from the convention of other Python iterables >>> list,array.array,str,dict,... ? >>> >>> Furthermore there is a surprising strange exception for arrays >>> with size 1 (!= scalars). >> Historically, numpy's predecessors used "not equal to zero" as the >> meaning for truth (consistent with numerical types in Python). However, >> this introduces an ambiguity as both any(a != 0) and all(a != 0) are >> reasonable interpretations of the truth value of a sequence of numbers. > > > well, I can familiarize with that "not equal to zero" philosophy > for a math-centric array type (different from a container / size>0 > philosophy) > > However I don't see that all(a) (or "all(a != 0)") is something > which anybody would ever expect with .__nonzero__() / if a: ... . > Does anybody? And the current behavior with all those strange > exceptions and exceptions from exceptions still seems awkward and > unnecessary. > > The any() interpretion is outstandingly "right" in my opinion, and > doesn't need to be guessed: anything/any part non-zero disturbs > the clean "zeroness". Zero must be wholly pure zero. This is so > everywhere in math and informatics. a number/memory is zero when > all bits/bytes are zero. a matrix is a zero matrix when all > elements are zero... This way only the test is also seamlessly > consistent with a zero length array (while all(zerolengtharray != > 0) would be True surprisingly!) > This kind of any(a) truth test (only) is also often needed, and it > would be also fast executable this way. It would be compatible > with None/False init/default variable tests during code evolution > in Python style and would behave well everywhere as far as I can > see. It would also not break old code. > Would a feature request in that direction have any chance? > > Robert > > >> Numpy refuses to guess and raises the exception shown below. For >> sequences with a single item, there is no ambiguity and numpy does the >> (numerically) ordinary thing. >> coming to mind another way to see it: I'm not aware of any other python type which doesn't definitely know if it is __nonzero__ or not (unless there is an IOError or so). And everywhere: if there is *any* logical doubt at all, the default is: True! - not an Exception. For example .__nonzero__/.__bool__ for a custom class defaults to True. A behavior where an object throws an exception upon __nonzero__ test just because of principal doubts seems not to fit into the Python world. The non-zero test must definitely go through. Only 2 ways seem to be consistently Pythonic and logical: "size > 0"; or "any(a)" (*); and the later option may be more 'numerical'. 
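A short, made-up illustration of the ambiguity referred to earlier in the
thread -- how an implicit any() truth value makes `if a == b:` and
`if a != b:` agree in surprising ways, and the explicit spellings that
avoid it (none of this code is from the posts themselves):

    import numpy as np

    a = np.array([1, 2, 3])
    b = np.array([1, 2, 4])

    (a == b).all()      # False: "the arrays are equal"
    (a == b).any()      # True:  "some element is equal"
    (a != b).any()      # True:  "the arrays differ somewhere"

    # Under an implicit any() rule, `if a == b:` and `if a != b:` would both
    # take their branch for these two arrays; writing .any()/.all()
    # explicitly (or testing a.size for container-style emptiness) keeps
    # the intent unambiguous.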
Robert * .__nonzero__() and perhaps .any() too should not fail upon flexible types like currently: >>> np.array(["","",""]).any() Traceback (most recent call last): File "", line 1, in TypeError: cannot perform reduce with flexible type >>> From aisaac at american.edu Sun Sep 13 09:05:41 2009 From: aisaac at american.edu (Alan G Isaac) Date: Sun, 13 Sep 2009 09:05:41 -0400 Subject: [Numpy-discussion] Why is the truth value of ndarray not simply size>0 ? In-Reply-To: References: <4AA50EBB.5080402@wartburg.edu> Message-ID: <4AACEE25.7060202@american.edu> On 9/13/2009 7:46 AM, Robert wrote: > 2 ways seem to be consistently Pythonic and logical: "size> > 0"; or "any(a)" (*); and the later option may be more 'numerical'. Well, *there's* the problem. As a user I have felt more than once that a length based test, like other containers, would be natural, so I that I could do the usual test if a: That would certainly prove problematic for the `any` test when a is an array of zeros. And then soon after I want to use if (a>0): and that would certainly prove problematic for the `len` test when (a>0) is an array of zeros. And as for the `any` test, well in this case I usually want `all`. So the ValueError has proved useful. Alan Isaac From charlesr.harris at gmail.com Sun Sep 13 10:11:46 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 13 Sep 2009 08:11:46 -0600 Subject: [Numpy-discussion] 64-bit Fedora 9 a=numpy.zeros(0x80000000, dtype='b1') In-Reply-To: <710F2847B0018641891D9A21602763605AD167@ex3.envision.co.il> References: <271BED32E925E646A1333A56D9C6AFCB31E561A00C@MBOX0.essex.ac.uk> <4AAC95D7.3050804@ar.media.kyoto-u.ac.jp> <710F2847B0018641891D9A21602763605AD167@ex3.envision.co.il> Message-ID: 2009/9/13 Nadav Horesh > > Could it be a problem of python version? I get no error with python2.6.2 > (on amd64 gentoo) > > Nadav > > -----????? ??????----- > ???: numpy-discussion-bounces at scipy.org ??? David Cournapeau > ????: ? 13-??????-09 09:48 > ??: Discussion of Numerical Python > ????: Re: [Numpy-discussion] 64-bit Fedora 9 a=numpy.zeros(0x80000000, > dtype='b1') > > Charles R Harris wrote: > > > > > > On Sat, Sep 12, 2009 at 9:03 AM, Citi, Luca > > wrote: > > > > I just realized that Sebastian posted its 'uname -a' and he has a > > 64bit machine. > > In this case it should work as mine (the 64bit one) does. > > Maybe during the compilation some flags prevented a full 64bit > > code to be compiled? > > __ > > > > > > Ints are still 32 bits on 64 bit machines, but the real question is > > how python interprets the hex value. > > > That's not a python problem: the conversion of the object to a C > int/long happens in numpy (in PyArray_IntpFromSequence in this case). I > am not sure I understand exactly what the code is doing, though. I don't > understand the rationale for #ifdef/#endif in the one item in shape > tuple case (line 521 and below), as well as the call to PyNumber_Int, > > Possibly, I get In [1]: a=numpy.zeros(0x80000000,dtype='b1') --------------------------------------------------------------------------- ValueError Traceback (most recent call last) /home/charris/ in () ValueError: Maximum allowed dimension exceeded This on 32 bit fedora 11 with python 2.6. Hmm, "maximum allowed size exceeded" might be a better message. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From josef.pktd at gmail.com Sun Sep 13 10:55:11 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sun, 13 Sep 2009 10:55:11 -0400 Subject: [Numpy-discussion] Why is the truth value of ndarray not simply size>0 ? In-Reply-To: <4AACEE25.7060202@american.edu> References: <4AA50EBB.5080402@wartburg.edu> <4AACEE25.7060202@american.edu> Message-ID: <1cd32cbb0909130755m71b1e0e1s5591c173fe04f8b6@mail.gmail.com> On Sun, Sep 13, 2009 at 9:05 AM, Alan G Isaac wrote: > On 9/13/2009 7:46 AM, Robert wrote: >> 2 ways seem to be consistently Pythonic and logical: "size> >> 0"; or "any(a)" (*); and the later option may be more 'numerical'. > > Well, *there's* the problem. > > As a user I have felt more than once that a > length based test, like other containers, would > be natural, so I that I could do the usual test > ? ? ? ?if a: > That would certainly prove problematic for > the `any` test when a is an array of zeros. > And ?then soon after I want to use > ? ? ? ?if (a>0): > and that would certainly prove problematic for > the `len` test when (a>0) is an array of zeros. > And as for the `any` test, well in this case I > usually want `all`. > > So the ValueError has proved useful. > > Alan Isaac > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > In a numerical context, I essentially stopped using if a: ... I don't think a single definition is common enough to avoid an ambiguous interpretation, and the ValueError is a good reminder to be more precise. I don't see why zero should evaluate as False, except for booleans, even for scalars and also in python. zeros are useful numbers and throwing them together with None is pretty confusing. But I agree with the initial post that for single numbers the numpy behavior is not completely consistent. >>> a=np.array(0) >>> if not a: print 'not true' not true >>> a=np.array([0]) >>> if not a: print 'not true' not true >>> a=np.array([]) >>> if not a: print 'not true' not true >>> np.array([0]).shape (1,) >>> np.array(0).shape () >>> np.array([]).shape (0,) I don't think np.array([0]) should evaluate to false. But "if a.size==0:" or "if a==0:" or if np.any(a==0):" is much clearer. Josef From dsdale24 at gmail.com Sun Sep 13 13:01:56 2009 From: dsdale24 at gmail.com (Darren Dale) Date: Sun, 13 Sep 2009 13:01:56 -0400 Subject: [Numpy-discussion] suggestion for generalizing numpy functions In-Reply-To: References: <9457e7c80907201433i766eb9a9x47e6a449c10450d7@mail.gmail.com> Message-ID: On Sat, Jul 25, 2009 at 8:33 PM, Darren Dale wrote: > On Thu, Jul 23, 2009 at 12:54 PM, Darren Dale wrote: >> On Tue, Jul 21, 2009 at 10:11 AM, Darren Dale wrote: >>> On Tue, Jul 21, 2009 at 7:44 AM, Darren Dale wrote: >>>> 2009/7/20 St?fan van der Walt : >>>>> Hi Chuck >>>>> >>>>> 2009/7/17 Charles R Harris : >>>>>> PyObject*?PyTuple_GetItem(PyObject *p, Py_ssize_t pos) >>>>>> Return value: Borrowed reference. >>>>>> Return the object at position pos in the tuple pointed to by p. If pos is >>>>>> out of bounds, return NULL and sets an IndexError exception. It's a borrowed >>>>>> reference so you need to call Py_INCREF on it. I find this Python C-API >>>>>> documentation useful. >>>>> >>>>> Have you had a look over the rest of the code? ?I think this would >>>>> make a good addition. ?Travis mentioned Contexts for doing something >>>>> similar, but I don't know enough about that concept to compare the >>>>> two. 
>>>> >>>> I think contexts would be very different from what is already in >>>> place. For now, it would be nice to make this one small improvement to >>>> the existing ufunc infrastructure, and maybe consider contexts (which >>>> I still don't understand) at a later time. I have improved the code >>>> slightly and added a few tests, and will post a new patch later this >>>> morning. I just need to add some documentation. >>> >>> Here is a better patch, which includes a few additional tests and adds >>> some documentation. It also attempts to improve the docstring and >>> sphinx docs for __array_wrap__, which may have been a little bit >>> misleading. There is also some whitespace cleanup in a few places. >>> Would someone please review my work and commit the patch if it is >>> acceptable? Pierre or Travis, would either of you have a chance to >>> look over the implementation and the documentation changes, since you >>> two seem to be most familiar with ufuncs and subclassing ndarray? >> >> It looks like part of my patch has been clobbered by changes >> introduced in svn 7184-7191. What else should I be doing so a patch >> like this can be committed relatively quickly? > > Could I please obtain commit privileges so I can commit this feature > to svn myself? I guess I forgot to follow up here, I committed the patch during the SciPy conference. Thank you to the devs for granting me commit privileges, I'll use them with care. Are the numpy developers familiar with predicative dispatch in general, and PEP 3124 (generic functions) in particular? I've been reading about them all weekend. They seem particularly applicable to numpy, and not just where we currently use __array_prepare__ and __array_wrap__. The PEP seems to have stalled, but there is currently some discussion about it at python-dev. If anyone is interested in commenting on how generic functions could be useful to numpy, commenting in tho thread at python-dev could help establish what features would be desirable in generic functions and motivation for including them in the standard library. Here are some links for anyone who is interested: The PEP: http://ftp.python.org/dev/peps/pep-3124/ A presentation of Eby's implementation in PEAK: http://peak.telecommunity.com/PyCon05Talk/img0.html One of the paper's Eby cites in the presentation: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.47.1167 A Charming Python article an Eby's implementation: http://www.ibm.com/developerworks/library/l-cppeak2/ Guido's musings on the topic http://www.artima.com/weblogs/viewpost.jsp?thread=155123 Darren From jsseabold at gmail.com Sun Sep 13 13:29:27 2009 From: jsseabold at gmail.com (Skipper Seabold) Date: Sun, 13 Sep 2009 13:29:27 -0400 Subject: [Numpy-discussion] `missing` argument in genfromtxt only a string? Message-ID: Is there a reason that the missing argument in genfromtxt only takes a string? For instance, I have a dataset that in most columns has a zero for some observations but in others it was just left blank, which is the equivalent of zero. I would like to set all of the missing to 0 (it defaults to -1 now) when loading in the data. I suppose I could do this with a converter, but I have too many columns for this. Before I try to work on a patch, I'd just like to know if I'm missing something, maybe there's already way to do this (without using a mask)? 
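One way around the string-only `missing` argument that does not require
writing a converter per column by hand is to build the converters dict
programmatically and share a single "blank means zero" converter across all
columns. This is only a sketch: the two-row sample and the column count are
invented, and it assumes a purely numeric, comma-delimited file:

    import numpy as np
    from StringIO import StringIO

    sample = "1, ,3\n4,5, \n"       # blank fields mixed in with real values
    ncols = 3

    blank_to_zero = lambda s: float(s.strip() or 0)
    conv = dict((i, blank_to_zero) for i in range(ncols))

    a = np.genfromtxt(StringIO(sample), delimiter=",", converters=conv)
    # a -> array([[ 1.,  0.,  3.],
    #             [ 4.,  5.,  0.]])

The same dict-building trick works for any number of columns, so the
"too many columns" objection to converters largely goes away.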
-Skipper From robert.kern at gmail.com Sun Sep 13 15:35:48 2009 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 13 Sep 2009 14:35:48 -0500 Subject: [Numpy-discussion] Why is the truth value of ndarray not simply size>0 ? In-Reply-To: References: <4AA50EBB.5080402@wartburg.edu> Message-ID: <3d375d730909131235g6a7e3ac6lc171218e33a3553b@mail.gmail.com> On Sun, Sep 13, 2009 at 05:30, Robert wrote: > Neil Martinsen-Burrell wrote: >> On 2009-09-07 07:11 , Robert wrote: >>> Is there a reason why ndarray truth tests (except scalars) >>> deviates from the convention of other Python iterables >>> list,array.array,str,dict,... ? >>> >>> Furthermore there is a surprising strange exception for arrays >>> with size 1 (!= scalars). >> >> Historically, numpy's predecessors used "not equal to zero" as the >> meaning for truth (consistent with numerical types in Python). ?However, >> this introduces an ambiguity as both any(a != 0) and all(a != 0) are >> reasonable interpretations of the truth value of a sequence of numbers. > > > well, I can familiarize with that "not equal to zero" philosophy > for a math-centric array type (different from a container / size>0 > philosophy) > > However I don't see that all(a) (or "all(a != 0)") is something > which anybody would ever expect with .__nonzero__() / if a: ... . > Does anybody? And the current behavior with all those strange > exceptions and exceptions from exceptions still seems awkward and > unnecessary. > > The any() interpretion is outstandingly "right" in my opinion, and > doesn't need to be guessed: anything/any part non-zero disturbs > the clean "zeroness". Zero must be wholly pure zero. This is so > everywhere in math and informatics. a number/memory is zero when > all bits/bytes are zero. a matrix is a zero matrix when all > elements are zero... This way only the test is also seamlessly > consistent with a zero length array (while all(zerolengtharray != > 0) would be True surprisingly!) > This kind of any(a) truth test (only) is also often needed, and it > would be also fast executable this way. It would be compatible > with None/False init/default variable tests during code evolution > in Python style and would behave well everywhere as far as I can > see. It would also not break old code. > Would a feature request in that direction have any chance? No. Numeric used to use the any() interpretation, and it led to many, many errors in people's code that went undetected for years. For example, people seem to usually want "a == b" to be True iff *all* elements are equal. People also seem to usually want "a != b" to be True if *any* elements are unequal. These desires are inconsistent and cannot be realized at the same time, yet people seem to hold both mental models in their head without thoroughly thinking through the logic or testing it. No amount of documentation or education seemed to help, so we decided to raise an exception instead. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From jsseabold at gmail.com Sun Sep 13 15:51:24 2009 From: jsseabold at gmail.com (Skipper Seabold) Date: Sun, 13 Sep 2009 15:51:24 -0400 Subject: [Numpy-discussion] `missing` argument in genfromtxt only a string? In-Reply-To: References: Message-ID: On Sun, Sep 13, 2009 at 1:29 PM, Skipper Seabold wrote: > Is there a reason that the missing argument in genfromtxt only takes a string? 
> > For instance, I have a dataset that in most columns has a zero for > some observations but in others it was just left blank, which is the > equivalent of zero. ?I would like to set all of the missing to 0 (it > defaults to -1 now) when loading in the data. ?I suppose I could do > this with a converter, but I have too many columns for this. > > Before I try to work on a patch, I'd just like to know if I'm missing > something, maybe there's already way to do this (without using a > mask)? > > -Skipper > To be a little more concrete here are the two problems I am having right now. from StringIO import StringIO import numpy as np s = stringIO('D01N01,10/1/2003 ,1, 1, 0, 400, 600,0, 0, 0,0,0, 0,0,0, 0, 0,0,0, 0,0,0,0,0,0, 0, 0,0, 0,0, 0,0,0,3,0, 50, 80,0, 0,0,0,0,0, 4,0, 3380, 1070, 0, 0, 0,0,0,0,1,0, 600, 900,0, 0, 0,0,0,0, 0,0, 0, 0,0,0, 0,0, 0,0, 0,0, 0, 0,0,0,0, 0,0,0,2,0,1000, 900,0, 0, 0,0,0,0,0,0, 0, 0,0,0, 0,0,0,0,0,0, 0, 0,0,0,0,0,0,0,0,0, 0, 0,0, 0,0,0,0,0,0,0, 0, 0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0, 0, 0,0, 0,0,0,0,0,1,0, 500, 800,0, 0, 0,0,0,0,0,0, 0, 0,0, 0, 0,0,0,0,1, 0, 300, 0,0, 0, 0,0,0,0, 1,0, 1600, 900, 0, 0,0,0, 0,0,0,0, 0, 0,0,0,0,0,0,0,0,0, 0, 0,0,0,0,0,0,0, 0,0, 0, 0, 0,0,0,0,0,0,0,0, 0, 0,0,0,0,0,0,0, 0, 0,0,0,0, 0,0, 0,0,0,0,0, 0,0,0,0,0,0,0,0, 0,0, 0, 0,0, 0,0,0,0,0, 0,0, 0,0,0,0,0,0,0,0,0,0, 0, 0,0,0,0,0,0,0,0,0, 0, 0,0,0,0,0,0,0,0,0, 0,0,0, 0,0,0,0,0, 0,0, 0, 0, 0, 0,0, 0, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0\r\nL24U05,12/25/2003 ,2, , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , ,0,0, 0, 0,0, 0, 0,0,0,0, , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , ,0,0, 0,0,0, 0,0,0,0,0, , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , \r\n') data = np.genfromtxt(s, dtype=None, delimiter=",", names=None) All of the missing values in the second observation are now -1. Also, I'm having trouble defining a converter for my dates. I have the function from datetime import datetime def str2date(date): day,month,year = date.strip().split('/') return datetime(*map(int, [year, month, day])) conv = {1 : lambda s: str2date(s)} s.seek(0) data = np.genfromtxt(s, dtype=None, delimiter=",", names=None, converters=conv) I get /usr/local/lib/python2.6/dist-packages/numpy/lib/io.pyc in genfromtxt(fname, dtype, comments, delimiter, skiprows, converters, missing, missing_values, usecols, names, excludelist, deletechars, case_sensitive, unpack, usemask, loose) 990 if dtype is None: 991 for (converter, item) in zip(converters, values): --> 992 converter.upgrade(item) 993 # Store the values 994 append_to_rows(tuple(values)) /usr/local/lib/python2.6/dist-packages/numpy/lib/_iotools.pyc in upgrade(self, value) 469 # Raise an exception if we locked the converter... 
470 if self._locked: --> 471 raise ValueError("Converter is locked and cannot be upgraded") 472 _statusmax = len(self._mapper) 473 # Complains if we try to upgrade by the maximum ValueError: Converter is locked and cannot be upgraded Does anyone know what I'm doing wrong? Thanks, Skipper From alan at ajackson.org Sun Sep 13 17:31:02 2009 From: alan at ajackson.org (alan at ajackson.org) Date: Sun, 13 Sep 2009 16:31:02 -0500 Subject: [Numpy-discussion] Scientific Computing with Python, September 18, 2009 In-Reply-To: References: <1183663757.1252692788349.JavaMail.root@p2-ws607.ad.prodcc.net> <4832F195-92FA-412D-9C1C-CEE81851F10B@enthought.com> <84746922-9382-4B1B-84F1-FA7D7940CFAD@diablotech.com> Message-ID: <20090913163102.1288081e@ajackson.org> >On Fri, Sep 11, 2009 at 7:26 PM, Robert Ferrell wrote: >> >> >> On Sep 11, 2009, at 5:07 PM, Neal Becker wrote: >> >>> I'd love to participate in these webinars. ?Problem is, AFAICT, >>> gotomeeting >>> only supports windows. >> >> I'm not certain that is correct. ?I've participated in some of these, >> and Im' running OS X (10.5). ?I think those were gotomeeting, although >> don't honestly recall. ?Assuming nothing's changed, though, worked >> great on OS X. >> > >I wasn't able to connect in the past using linux and their web site >says Windows or Mac. > >FWIW, my friend sets these things up for work, and they use >commpartners or webex. It looks like webex supports linux (don't know >about commpartners), but I don't know much about the details, costs >etc. The first 25 connections for Webex are pretty cheap, but after that it starts to get obnoxiously expensive. They have a strange pricing model that actually penalizes you for using more. Kind of like a pusher trying to get someone addicted. A taste is cheap, and once you're hooked... It does work well though. I have done webinars at work where I ran Linux in a vnc session on Windows, and then to Webex, all over the planet. Latency was minimal. -- ----------------------------------------------------------------------- | Alan K. Jackson | To see a World in a Grain of Sand | | alan at ajackson.org | And a Heaven in a Wild Flower, | | www.ajackson.org | Hold Infinity in the palm of your hand | | Houston, Texas | And Eternity in an hour. - Blake | ----------------------------------------------------------------------- From ndbecker2 at gmail.com Sun Sep 13 18:38:48 2009 From: ndbecker2 at gmail.com (Neal Becker) Date: Sun, 13 Sep 2009 18:38:48 -0400 Subject: [Numpy-discussion] Scientific Computing with Python, September 18, 2009 References: <1183663757.1252692788349.JavaMail.root@p2-ws607.ad.prodcc.net> <4832F195-92FA-412D-9C1C-CEE81851F10B@enthought.com> <84746922-9382-4B1B-84F1-FA7D7940CFAD@diablotech.com> <20090913163102.1288081e@ajackson.org> Message-ID: alan at ajackson.org wrote: >>On Fri, Sep 11, 2009 at 7:26 PM, Robert Ferrell >>wrote: >>> >>> >>> On Sep 11, 2009, at 5:07 PM, Neal Becker wrote: >>> >>>> I'd love to participate in these webinars. Problem is, AFAICT, >>>> gotomeeting >>>> only supports windows. >>> >>> I'm not certain that is correct. I've participated in some of these, >>> and Im' running OS X (10.5). I think those were gotomeeting, although >>> don't honestly recall. Assuming nothing's changed, though, worked >>> great on OS X. >>> >> >>I wasn't able to connect in the past using linux and their web site >>says Windows or Mac. >> >>FWIW, my friend sets these things up for work, and they use >>commpartners or webex. 
It looks like webex supports linux (don't know >>about commpartners), but I don't know much about the details, costs >>etc. > > The first 25 connections for Webex are pretty cheap, but after that it > starts to get obnoxiously expensive. They have a strange pricing model > that actually penalizes you for using more. Kind of like a pusher trying > to get someone addicted. A taste is cheap, and once you're hooked... > > It does work well though. I have done webinars at work where I ran Linux > in a vnc session on Windows, and then to Webex, all over the planet. > Latency was minimal. > > I think dimdim looks interesting From ndbecker2 at gmail.com Sun Sep 13 18:41:43 2009 From: ndbecker2 at gmail.com (Neal Becker) Date: Sun, 13 Sep 2009 18:41:43 -0400 Subject: [Numpy-discussion] Scientific Computing with Python, September 18, 2009 References: <1183663757.1252692788349.JavaMail.root@p2-ws607.ad.prodcc.net> <4832F195-92FA-412D-9C1C-CEE81851F10B@enthought.com> <84746922-9382-4B1B-84F1-FA7D7940CFAD@diablotech.com> <20090913163102.1288081e@ajackson.org> Message-ID: alan at ajackson.org wrote: >>On Fri, Sep 11, 2009 at 7:26 PM, Robert Ferrell >>wrote: >>> >>> >>> On Sep 11, 2009, at 5:07 PM, Neal Becker wrote: >>> >>>> I'd love to participate in these webinars. Problem is, AFAICT, >>>> gotomeeting >>>> only supports windows. >>> >>> I'm not certain that is correct. I've participated in some of these, >>> and Im' running OS X (10.5). I think those were gotomeeting, although >>> don't honestly recall. Assuming nothing's changed, though, worked >>> great on OS X. >>> >> >>I wasn't able to connect in the past using linux and their web site >>says Windows or Mac. >> >>FWIW, my friend sets these things up for work, and they use >>commpartners or webex. It looks like webex supports linux (don't know >>about commpartners), but I don't know much about the details, costs >>etc. > > The first 25 connections for Webex are pretty cheap, but after that it > starts to get obnoxiously expensive. They have a strange pricing model > that actually penalizes you for using more. Kind of like a pusher trying > to get someone addicted. A taste is cheap, and once you're hooked... > > It does work well though. I have done webinars at work where I ran Linux > in a vnc session on Windows, and then to Webex, all over the planet. > Latency was minimal. > > Here is a nice review: http://www.webinarcentral.net/2008/11/webinar-central-blog-guide-to-free- webinar-hosting-sites From craig at brechmos.org Sun Sep 13 20:51:03 2009 From: craig at brechmos.org (brechmos) Date: Sun, 13 Sep 2009 17:51:03 -0700 (PDT) Subject: [Numpy-discussion] 3D interpolation of large array Message-ID: <25428880.post@talk.nabble.com> I have a large dataset (e.g., 70 x 500 x 500) and want to interpolate points (for example to double the size). What it seems I want is: []: newx,newy,newz=mgrid[1:70:0.5,1:500:0.5,1:500:0.5] []: coords = array([newz, newy, newx]) []: dout = np.map_coordinates(d, coords) The problem is that mgrid can't do such a large set of values (at least on my setup! MacBook Pro with 4Gig RAM). The error from the mgrid command is "ValueError: dimensions too large." My version of Numpy is numpy-1.3.0n1-py2.5-macosx-10.3 (from the Enthought package). Is there any other way that I could do this? -- View this message in context: http://www.nabble.com/3D-interpolation-of-large-array-tp25428880p25428880.html Sent from the Numpy-discussion mailing list archive at Nabble.com. 
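A minimal sketch of the "break it into smaller pieces" alternative raised later in the thread: interpolate the volume one z-slab at a time, so only a small per-slab coordinate array ever exists in memory instead of the full grid. This assumes map_coordinates is taken from scipy.ndimage (where it actually lives rather than in numpy itself); the function name upsample_in_slabs, the slab size and the toy demo array are made up for illustration.

import numpy as np
from scipy import ndimage   # map_coordinates lives in scipy.ndimage, not in numpy

def upsample_in_slabs(d, factor=2, slab=4):
    """Interpolate d onto a grid `factor` times denser along each axis,
    building the coordinate arrays one z-slab at a time to bound memory."""
    nz, ny, nx = d.shape
    out = np.empty((nz * factor, ny * factor, nx * factor), dtype=d.dtype)
    # The y/x coordinate planes are shared by every slab (a few MB each).
    yy, xx = np.mgrid[0:ny:1.0 / factor, 0:nx:1.0 / factor]
    for z0 in range(0, out.shape[0], slab):
        z = np.arange(z0, min(z0 + slab, out.shape[0])) / float(factor)
        coords = np.empty((3, len(z)) + yy.shape)
        coords[0] = z[:, None, None]   # this slab's z coordinates
        coords[1] = yy                 # broadcast over the slab axis
        coords[2] = xx
        out[z0:z0 + len(z)] = ndimage.map_coordinates(d, coords, order=1)
    return out

small = upsample_in_slabs(np.random.rand(7, 50, 50))   # toy-sized demo

For the 70 x 500 x 500 case with factor 2 this keeps the per-slab coordinate arrays to roughly 100 MB (tunable via slab), though the doubled output volume is itself several hundred MB, so writing slabs to disk or moving to a 64-bit build may still be necessary.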
From dwf at cs.toronto.edu Sun Sep 13 23:55:43 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Sun, 13 Sep 2009 23:55:43 -0400 Subject: [Numpy-discussion] 3D interpolation of large array In-Reply-To: <25428880.post@talk.nabble.com> References: <25428880.post@talk.nabble.com> Message-ID: On 13-Sep-09, at 8:51 PM, brechmos wrote: > > I have a large dataset (e.g., 70 x 500 x 500) and want to > interpolate points > (for example to double the size). What it seems I want is: > > []: newx,newy,newz=mgrid[1:70:0.5,1:500:0.5,1:500:0.5] > []: coords = array([newz, newy, newx]) > []: dout = np.map_coordinates(d, coords) That's going to incur over 3 gigabytes of RAM (about a gig each for newx, newy, and newz), and you're running 32-bit Python. Not only do you likely not have enough physical memory (which can be solved somewhat by swapping), Python doesn't have enough memory address space. So in other words, no. Compiling a 64-bit version of Python and NumPy would solve the problem. David From mpi at comxnet.dk Mon Sep 14 03:08:34 2009 From: mpi at comxnet.dk (Mads Ipsen) Date: Mon, 14 Sep 2009 09:08:34 +0200 Subject: [Numpy-discussion] Error in header file - wrong mailing list? In-Reply-To: <1252796145.23272.11.camel@idol> References: <4AAC192E.4020706@comxnet.dk> <1252795653.23272.7.camel@idol> <1252796145.23272.11.camel@idol> Message-ID: <4AADEBF2.6070501@comxnet.dk> Pauli Virtanen wrote: > su, 2009-09-13 kello 01:47 +0300, Pauli Virtanen kirjoitti: > [clip] > >> The error you get from the comma at the end of the enum must be because >> you have >> >> -pedantic -Werror >> >> in your CFLAGS. Just >> >> unset CFLAGS >> >> or so before compilation. Yes, comma at the end of enum list is not >> strictly valid C89, although it is valid C99 and already passes for most >> compilers. >> > > Another possibility is that these flags come > from /usr/lib/pythonX.Y/config/Makefile -- in that case it's maybe > possible to override one of the the BASECFLAGS etc. variables with > environment vars. > > Also, I see that > > OPT="-std=c89 -pedantic -Werror" python setup.py > > won't succeed in current SVN, because apparently the configuration > detection code is not strict C89. > > Well, I don't know you consider this as a valid argument, but to me its a matter of removing a single comma, which will make the source less sensitive to compilers and compiler flags. Mads -- +------------------------------------------------------------+ | Mads Ipsen, Scientific developer | +------------------------------+-----------------------------+ | QuantumWise A/S | phone: +45-29716388 | | N?rres?gade 27A | www: www.quantumwise.com | | DK-1370 Copenhagen, Denmark | email: mpi at quantumwise.com | +------------------------------+-----------------------------+ From gael.varoquaux at normalesup.org Mon Sep 14 03:21:41 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Mon, 14 Sep 2009 09:21:41 +0200 Subject: [Numpy-discussion] 3D interpolation of large array In-Reply-To: References: <25428880.post@talk.nabble.com> Message-ID: <20090914072141.GA26363@phare.normalesup.org> On Sun, Sep 13, 2009 at 11:55:43PM -0400, David Warde-Farley wrote: > On 13-Sep-09, at 8:51 PM, brechmos wrote: > > I have a large dataset (e.g., 70 x 500 x 500) and want to > > interpolate points > > (for example to double the size). 
What it seems I want is: > > []: newx,newy,newz=mgrid[1:70:0.5,1:500:0.5,1:500:0.5] > > []: coords = array([newz, newy, newx]) > > []: dout = np.map_coordinates(d, coords) > That's going to incur over 3 gigabytes of RAM (about a gig each for > newx, newy, and newz), and you're running 32-bit Python. Not only do > you likely not have enough physical memory (which can be solved > somewhat by swapping), Python doesn't have enough memory address > space. So in other words, no. > Compiling a 64-bit version of Python and NumPy would solve the problem. Breaking the problem in smaller problems might also be an option. Ga?l From pav+sp at iki.fi Mon Sep 14 03:33:11 2009 From: pav+sp at iki.fi (Pauli Virtanen) Date: Mon, 14 Sep 2009 07:33:11 +0000 (UTC) Subject: [Numpy-discussion] Error in header file - wrong mailing list? References: <4AAC192E.4020706@comxnet.dk> <1252795653.23272.7.camel@idol> <1252796145.23272.11.camel@idol> <4AADEBF2.6070501@comxnet.dk> Message-ID: Mon, 14 Sep 2009 09:08:34 +0200, Mads Ipsen wrote: [clip] > Well, I don't know you consider this as a valid argument, but to me its > a matter of removing a single comma, which will make the source less > sensitive to compilers and compiler flags. That's done. -- Pauli Virtanen From rsalvador.wk at gmail.com Mon Sep 14 06:46:50 2009 From: rsalvador.wk at gmail.com (Ruben Salvador) Date: Mon, 14 Sep 2009 12:46:50 +0200 Subject: [Numpy-discussion] Create 2D array from EXISTING 1D array Message-ID: <4fe028e30909140346w64460ab2t23648f9ef3dad50@mail.gmail.com> Hi there! It's a time since I'm asking this question to myself, and still don't know a Pythonic way to solve it. I want to create a 2D array where each *row* is a copy of an already existing 1D array. For example: In [21]: a = np.array[1, 2, 3] In [25]: a Out[25]: array([1, 2, 3]) To create a 2D 'b' array, where each row is 'a', I would do: In [28]: b = np.empty((5,3), np.int) In [29]: for i in range(5): ....: b[i] = a ....: ....: In [30]: b Out[30]: array([[1, 2, 3], [1, 2, 3], [1, 2, 3], [1, 2, 3], [1, 2, 3]]) I can't figure out how to create the same 2D array with np.array(), or np.whatever(). It should be faster, cleaner... Any idea? Thanks -------------- next part -------------- An HTML attachment was scrubbed... URL: From johan.gronqvist at gmail.com Mon Sep 14 07:06:32 2009 From: johan.gronqvist at gmail.com (=?ISO-8859-1?Q?Johan_Gr=F6nqvist?=) Date: Mon, 14 Sep 2009 13:06:32 +0200 Subject: [Numpy-discussion] Create 2D array from EXISTING 1D array In-Reply-To: <4fe028e30909140346w64460ab2t23648f9ef3dad50@mail.gmail.com> References: <4fe028e30909140346w64460ab2t23648f9ef3dad50@mail.gmail.com> Message-ID: Ruben Salvador skrev: > [...] I want to create a 2D array where each > *row* is a copy of an already existing 1D array. For example: > In [25]: a > Out[25]: array([1, 2, 3]) > [...] > In [30]: b > Out[30]: > array([[1, 2, 3], > [1, 2, 3], > [1, 2, 3], > [1, 2, 3], > [1, 2, 3]]) > Without understanding anything, this seems to work: johan at johan-laptop:~$ python Python 2.5.4 (r254:67916, Feb 18 2009, 03:00:47) [GCC 4.3.3] on linux2 Type "help", "copyright", "credits" or "license" for more information. 
>>> import numpy as np >>> a = np.array([1, 2, 3]) >>> b = np.zeros((5, 3)) >>> b[:, :] = a >>> b array([[ 1., 2., 3.], [ 1., 2., 3.], [ 1., 2., 3.], [ 1., 2., 3.], [ 1., 2., 3.]]) Hope it helps / johan From lciti at essex.ac.uk Mon Sep 14 07:07:05 2009 From: lciti at essex.ac.uk (Citi, Luca) Date: Mon, 14 Sep 2009 12:07:05 +0100 Subject: [Numpy-discussion] Create 2D array from EXISTING 1D array In-Reply-To: <4fe028e30909140346w64460ab2t23648f9ef3dad50@mail.gmail.com> References: <4fe028e30909140346w64460ab2t23648f9ef3dad50@mail.gmail.com> Message-ID: <271BED32E925E646A1333A56D9C6AFCB31E561A00E@MBOX0.essex.ac.uk> if I get your question correctly, np.tile could be what you need From rsalvador.wk at gmail.com Mon Sep 14 08:31:14 2009 From: rsalvador.wk at gmail.com (Ruben Salvador) Date: Mon, 14 Sep 2009 14:31:14 +0200 Subject: [Numpy-discussion] Create 2D array from EXISTING 1D array In-Reply-To: <271BED32E925E646A1333A56D9C6AFCB31E561A00E@MBOX0.essex.ac.uk> References: <4fe028e30909140346w64460ab2t23648f9ef3dad50@mail.gmail.com> <271BED32E925E646A1333A56D9C6AFCB31E561A00E@MBOX0.essex.ac.uk> Message-ID: <4fe028e30909140531h37820578r433b6dd2745bf8dd@mail.gmail.com> Perfect, that's exactly what I need! Somehow I missed this routine when checking documentation :S In [58]: a Out[58]: array([1, 2, 3]) In [59]: np.tile(a, (5,1)) Out[59]: array([[1, 2, 3], [1, 2, 3], [1, 2, 3], [1, 2, 3], [1, 2, 3]]) Thanks a lot! On Mon, Sep 14, 2009 at 1:07 PM, Citi, Luca wrote: > if I get your question correctly, np.tile could be what you need > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From denis-bz-py at t-online.de Mon Sep 14 09:01:11 2009 From: denis-bz-py at t-online.de (denis bzowy) Date: Mon, 14 Sep 2009 13:01:11 +0000 (UTC) Subject: [Numpy-discussion] timeit one big vs many little a[ogrid] = f(ogrid) Message-ID: Folks, this simple timeit -> a largish speed ratio that surprised me -- but perhaps I've done something stupid ? """ timeit one big vs many little a[ogrid] = f(ogrid) Consider evaluating a function on an NxN grid, in 2 ways: a) in one shot: y,x = ogrid[0:N, 0:N] a[y,x] = f(x,y) b) piece by piece, covering the NxN with little nxn ogrids. How much faster would you expect "one big" to be than "little", say for N=256, n=8, *roughly* -- factor 2, factor 10, for a trivial f() ? An application: for adaptive interpolation on a 2d grid (adalin2), suppose we fill 8x8 squares either with f(ogrid), or interpolate(). *If* 8x8 piece by piece were 10* slower than 256x256 in one shot doing 10 % of the (256/8)^2 little squares with f() and interpolating 90 % in 0 time would take the same time as f( 256x256 ) -- for trivial f(). In fact 10* is about what I see on one (1) platform, mac ppc => f() must be very expensive for interpolation to pay off. ("All models are wrong, but some are useful.") Bottom line: f( one big ogrid ) is fast, hard to beat. """ from __future__ import division import timeit import numpy as np __date__ = "14sep 2009" N = 256 Ntime = 10 print "# n msec a[ogrid] = f(ogrid) N=%d numpy %s" % (N, np.__version__) n = N while n >= 4: #{ timer = timeit.Timer( setup = """ import numpy as np N = %d n = %d def f(x,y): return (2*x + y) / N a = np.zeros(( N, N )) """ % (N,n), stmt = """ #............................................................................... 
for j in range( 0, N, n ): for k in range( 0, N, n ): y,x = np.ogrid[ j:j+n, k:k+n ] a[y,x] = f(x,y) """ ) msec = timer.timeit( Ntime ) / Ntime * 1000 print "%3d %4.0f" % (n, msec) n //= 2 #} From Michael.Walker at sophia.inria.fr Mon Sep 14 09:21:11 2009 From: Michael.Walker at sophia.inria.fr (Michael.Walker at sophia.inria.fr) Date: Mon, 14 Sep 2009 15:21:11 +0200 (CEST) Subject: [Numpy-discussion] SuperLU failed to solve Message-ID: <58410.194.167.194.160.1252934471.squirrel@imap-sop.inria.fr> Hello, does anyone on this list have any idea what might cause SuperLU to not solve on my computer when the associated code works perfectly well on somebody else's? On my computer the code crashes with the error message 'SuperLU solve failed: info=9' thanks Michael Walker Plant Modelling Group CIRAD, Montpellier 04 67 61 57 27 From charlesr.harris at gmail.com Mon Sep 14 09:31:32 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 14 Sep 2009 07:31:32 -0600 Subject: [Numpy-discussion] Create 2D array from EXISTING 1D array In-Reply-To: <4fe028e30909140346w64460ab2t23648f9ef3dad50@mail.gmail.com> References: <4fe028e30909140346w64460ab2t23648f9ef3dad50@mail.gmail.com> Message-ID: On Mon, Sep 14, 2009 at 4:46 AM, Ruben Salvador wrote: > Hi there! > > It's a time since I'm asking this question to myself, and still don't know > a Pythonic way to solve it. I want to create a 2D array where each *row* is > a copy of an already existing 1D array. For example: > > In [21]: a = np.array[1, 2, 3] > In [25]: a > Out[25]: array([1, 2, 3]) > > To create a 2D 'b' array, where each row is 'a', I would do: > > In [28]: b = np.empty((5,3), np.int) > > You can use a list of 1D arrays, i.e., In [1]: a = array([1,2,3]) In [2]: b = array([a]*5) In [3]: b Out[3]: array([[1, 2, 3], [1, 2, 3], [1, 2, 3], [1, 2, 3], [1, 2, 3]]) Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebas0 at gmail.com Mon Sep 14 10:43:46 2009 From: sebas0 at gmail.com (Sebastian) Date: Mon, 14 Sep 2009 11:43:46 -0300 Subject: [Numpy-discussion] 64-bit Fedora 9 a=numpy.zeros(0x80000000, dtype='b1') In-Reply-To: References: <271BED32E925E646A1333A56D9C6AFCB31E561A00C@MBOX0.essex.ac.uk> <4AAC95D7.3050804@ar.media.kyoto-u.ac.jp> <710F2847B0018641891D9A21602763605AD167@ex3.envision.co.il> Message-ID: Thanks for the help. I think that deleting the old build directory before rebuilding may have been the trick. The output below shows i'm no longer reproducing the error. best wishes, - Sebastian Gurovich In [3]: numpy.__version__ Out[3]: '1.3.0' In [4]: a=numpy.zeros(0x80000000,dtype='b1') In [5]: a.data Out[5]: wrote: > > > 2009/9/13 Nadav Horesh > >> >> Could it be a problem of python version? I get no error with python2.6.2 >> (on amd64 gentoo) >> >> Nadav >> >> -----????? ??????----- >> ???: numpy-discussion-bounces at scipy.org ??? David Cournapeau >> ????: ? 13-??????-09 09:48 >> ??: Discussion of Numerical Python >> ????: Re: [Numpy-discussion] 64-bit Fedora 9 a=numpy.zeros(0x80000000, >> dtype='b1') >> >> Charles R Harris wrote: >> > >> > >> > On Sat, Sep 12, 2009 at 9:03 AM, Citi, Luca > > > wrote: >> > >> > I just realized that Sebastian posted its 'uname -a' and he has a >> > 64bit machine. >> > In this case it should work as mine (the 64bit one) does. >> > Maybe during the compilation some flags prevented a full 64bit >> > code to be compiled? >> > __ >> > >> > >> > Ints are still 32 bits on 64 bit machines, but the real question is >> > how python interprets the hex value. 
>> >> >> That's not a python problem: the conversion of the object to a C >> int/long happens in numpy (in PyArray_IntpFromSequence in this case). I >> am not sure I understand exactly what the code is doing, though. I don't >> understand the rationale for #ifdef/#endif in the one item in shape >> tuple case (line 521 and below), as well as the call to PyNumber_Int, >> >> > Possibly, I get > > In [1]: a=numpy.zeros(0x80000000,dtype='b1') > --------------------------------------------------------------------------- > ValueError Traceback (most recent call last) > > /home/charris/ in () > > ValueError: Maximum allowed dimension exceeded > > This on 32 bit fedora 11 with python 2.6. Hmm, "maximum allowed size > exceeded" might be a better message. > > Chuck > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsouthey at gmail.com Mon Sep 14 10:52:32 2009 From: bsouthey at gmail.com (Bruce Southey) Date: Mon, 14 Sep 2009 09:52:32 -0500 Subject: [Numpy-discussion] How to change the dtype of a structured or record array Message-ID: <4AAE58B0.8040304@gmail.com> Hi, I would like to change the dtype of just one field of a structured or record array without copying the original array. I can not change the creation of the original array because it was created using genfromtxt. For example, r=np.rec.array([(1, 1.0), (1, 1.0), (1, 1.0)],dtype=[('foo', int), ('bar', float)]) # illustrative example r=r.astype([('foo', int), ('bar', int)]) # works this creates as copy Is there alternative way to avoid this extra copying? Thanks Bruce From pgmdevlist at gmail.com Mon Sep 14 21:59:51 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Mon, 14 Sep 2009 21:59:51 -0400 Subject: [Numpy-discussion] `missing` argument in genfromtxt only a string? In-Reply-To: References: Message-ID: On Sep 13, 2009, at 3:51 PM, Skipper Seabold wrote: > On Sun, Sep 13, 2009 at 1:29 PM, Skipper Seabold > wrote: >> Is there a reason that the missing argument in genfromtxt only >> takes a string? Because we check strings. Note that you can specify several characters at once, provided they're separated by a comma, like missing="0,nan,n/a" >> For instance, I have a dataset that in most columns has a zero for >> some observations but in others it was just left blank, which is the >> equivalent of zero. I would like to set all of the missing to 0 (it >> defaults to -1 now) when loading in the data. I suppose I could do >> this with a converter, but I have too many columns for this. OK, I see. Gonna try to find some fix. > All of the missing values in the second observation are now -1. Also, > I'm having trouble defining a converter for my dates. > > I have the function > > from datetime import datetime > > def str2date(date): > day,month,year = date.strip().split('/') > return datetime(*map(int, [year, month, day])) > > conv = {1 : lambda s: str2date(s)} > s.seek(0) > data = np.genfromtxt(s, dtype=None, delimiter=",", names=None, > converters=conv) OK, I see the problem... When no dtype is defined, we try to guess what a converter should return by testing its inputs. At first we check whether the input is a boolean, then whether it's an integer, then a float, and so on. 
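Purely for illustration (this is not numpy's actual code, just the flavor of the guessing described above): each input string is tried against progressively more general types and the first one that succeeds wins. The helper name guess_type and the candidate list are invented for the sketch.

def guess_type(value, candidates=(int, float, complex)):
    # Try progressively more general types; fall back to a string.
    for typ in candidates:
        try:
            typ(value)
            return typ
        except ValueError:
            pass
    return str

# guess_type("1") -> int, guess_type("1.3") -> float, guess_type("abcde") -> str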
When you define explicitly a converter, there's no need for all those checks, so we lock the converter to a particular state, which sets the conversion function and the value to return in case of missing. Except that I messed it up and it fails in that case (the conversion function is set properly, bu the dtype of the output is still undefined). That's a bug, I'll try to fix that once I've tamed my snow kitten. Meanwhile, you can use tsfromtxt (in scikits.timeseries), or even simpler, define a dtype for the output (you know that your first column is a str, your second an object, and the others ints or floats... From jsseabold at gmail.com Mon Sep 14 22:31:39 2009 From: jsseabold at gmail.com (Skipper Seabold) Date: Mon, 14 Sep 2009 22:31:39 -0400 Subject: [Numpy-discussion] `missing` argument in genfromtxt only a string? In-Reply-To: References: Message-ID: On Mon, Sep 14, 2009 at 9:59 PM, Pierre GM wrote: > > On Sep 13, 2009, at 3:51 PM, Skipper Seabold wrote: > >> On Sun, Sep 13, 2009 at 1:29 PM, Skipper Seabold >> wrote: >>> Is there a reason that the missing argument in genfromtxt only >>> takes a string? > > Because we check strings. Note that you can specify several characters > at once, provided they're separated by a comma, like missing="0,nan,n/a" > >>> For instance, I have a dataset that in most columns has a zero for >>> some observations but in others it was just left blank, which is the >>> equivalent of zero. ?I would like to set all of the missing to 0 (it >>> defaults to -1 now) when loading in the data. ?I suppose I could do >>> this with a converter, but I have too many columns for this. > > OK, I see. Gonna try to find some fix. > I actually figured out a workaround with converters, since my missing values are " "," "," " ie., irregular number of spaces and the values aren't stripped of white spaces. I just define {# : lambda s: float(s.strip() or 0)}, and I have a loop build all of the converters, but then I have to go through and drop the ones that are supposed to be strings or dates, which is still pretty tedious, since I have a number of datasets that are like this, but they all contain different data in different orders and there's no (computer) logical order to it that I've discovered yet. >> All of the missing values in the second observation are now -1. ?Also, >> I'm having trouble defining a converter for my dates. >> >> I have the function >> >> from datetime import datetime >> >> def str2date(date): >> ? ?day,month,year = date.strip().split('/') >> ? ?return datetime(*map(int, [year, month, day])) >> >> conv = {1 : lambda s: str2date(s)} >> s.seek(0) >> data = np.genfromtxt(s, dtype=None, delimiter=",", names=None, >> converters=conv) > > OK, I see the problem... > When no dtype is defined, we try to guess what a converter should > return by testing its inputs. At first we check whether the input is a > boolean, then whether it's an integer, then a float, and so on. When > you define explicitly a converter, there's no need for all those > checks, so we lock the converter to a particular state, which sets the > conversion function and the value to return in case of missing. > Except that I messed it up and it fails in that case (the conversion > function is set properly, bu the dtype of the output is still > undefined). That's a bug, I'll try to fix that once I've tamed my snow > kitten. No worries. 
I really like genfromtxt (having recently gotten pretty familiar with it) and would like to help out with extending it towards these kind of cases if there's an interest and this is feasible. I tried another workaround for the dates with my converters defined as conv conv.update({date : lambda s : datetime(*map(int, s.strip().split('/')[-1:]+s.strip().split('/')[:2]))}) Where `date` is the column that contains a date. The problem was that my dates are "mm/dd/yyyy" and datetime needs "yyyy,mm,dd," it worked for a test case if my dates were "dd/mm/yyyy" and I just use reversed, but gave an error about not finding the day in the third position, though that lambda function worked for a test case outside of genfromtxt. > Meanwhile, you can use tsfromtxt (in scikits.timeseries), or even > simpler, define a dtype for the output (you know that your first > column is a str, your second an object, and the others ints or floats... > I started to look at the timeseries for this, but I installed it incorrectly and it gave an error about being compiled with the wrong endianness. I've since fixed that and will take another look when I get a chance. I also tried the new datetime dtype, but I wasn't sure how to do this without defining the whole dtype. I have 500 columns that aren't homogeneous across several datasets, and each one is pretty huge, so this is tedious and takes some time to read the data (not using a test case) and see that it didn't work correctly. Cheers, Skipper From pgmdevlist at gmail.com Mon Sep 14 22:41:28 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Mon, 14 Sep 2009 22:41:28 -0400 Subject: [Numpy-discussion] `missing` argument in genfromtxt only a string? In-Reply-To: References: Message-ID: <5BD5E255-0E1B-41BC-8FDC-2E63063F25CE@gmail.com> On Sep 14, 2009, at 10:31 PM, Skipper Seabold wrote: > > I actually figured out a workaround with converters, since my missing > values are " "," "," " ie., irregular number of spaces and the > values aren't stripped of white spaces. I just define {# : lambda s: > float(s.strip() or 0)}, and I have a loop build all of the converters, > but then I have to go through and drop the ones that are supposed to > be strings or dates, which is still pretty tedious, since I have a > number of datasets that are like this, but they all contain different > data in different orders and there's no (computer) logical order to it > that I've discovered yet. I understand your frustration... We could think about some kind of global default for the missing values... > I tried another workaround for the dates with my converters defined > as conv > > conv.update({date : lambda s : datetime(*map(int, > s.strip().split('/')[-1:]+s.strip().split('/')[:2]))}) > > Where `date` is the column that contains a date. The problem was that > my dates are "mm/dd/yyyy" and datetime needs "yyyy,mm,dd," it worked > for a test case if my dates were "dd/mm/yyyy" and I just use reversed, > but gave an error about not finding the day in the third position, > though that lambda function worked for a test case outside of > genfromtxt. Check the archives of the mailing list, there's an example using dateutil.parser that may be just what you need. From jsseabold at gmail.com Mon Sep 14 22:55:23 2009 From: jsseabold at gmail.com (Skipper Seabold) Date: Mon, 14 Sep 2009 22:55:23 -0400 Subject: [Numpy-discussion] `missing` argument in genfromtxt only a string? 
In-Reply-To: <5BD5E255-0E1B-41BC-8FDC-2E63063F25CE@gmail.com> References: <5BD5E255-0E1B-41BC-8FDC-2E63063F25CE@gmail.com> Message-ID: On Mon, Sep 14, 2009 at 10:41 PM, Pierre GM wrote: > > On Sep 14, 2009, at 10:31 PM, Skipper Seabold wrote: >> >> I actually figured out a workaround with converters, since my missing >> values are " "," ?"," ? " ie., irregular number of spaces and the >> values aren't stripped of white spaces. ?I just define {# : lambda s: >> float(s.strip() or 0)}, and I have a loop build all of the converters, >> but then I have to go through and drop the ones that are supposed to >> be strings or dates, which is still pretty tedious, since I have a >> number of datasets that are like this, but they all contain different >> data in different orders and there's no (computer) logical order to it >> that I've discovered yet. > > I understand your frustration... We could think about some kind of > global default for the missing values... I'm not too frustrated, I'd just like to do this as few times as humanly (or machine-ly, rather) possible in the future... The main thing I'd like right now I think is for whitespace to be stripped, but maybe there is a good reason for this. I didn't realize this was the source of my confusion at first. Also just being able to define missing as a number would be nice. I started a patch for this, but I reverted when I realized I could make the converters as I did. While we're on the subject, the other thing on my wishlist (unless I just don't know how to do this) is being able to define a "column map" for datasets that have no delimiters. At first each observation of my data was just one long string with no gaps or regular breaks but I knew which columns had what. Eg., the first variable was (not zero-indexed) columns 1-6, the second columns 11-15, the third column 16, etc. so I would just say delimiter = [1:6,11:15,16,...]. >> I tried another workaround for the dates with my converters defined >> as conv >> >> conv.update({date : lambda s : datetime(*map(int, >> s.strip().split('/')[-1:]+s.strip().split('/')[:2]))}) >> >> Where `date` is the column that contains a date. ?The problem was that >> my dates are "mm/dd/yyyy" and datetime needs "yyyy,mm,dd," it worked >> for a test case if my dates were "dd/mm/yyyy" and I just use reversed, >> but gave an error about not finding the day in the third position, >> though that lambda function worked for a test case outside of >> genfromtxt. > > Check the archives of the mailing list, there's an example using > dateutil.parser that may be just what you need. > Ah ok. I looked for a bit, but I was sure I missed something. Thanks. Skipper From jsseabold at gmail.com Mon Sep 14 22:56:56 2009 From: jsseabold at gmail.com (Skipper Seabold) Date: Mon, 14 Sep 2009 22:56:56 -0400 Subject: [Numpy-discussion] `missing` argument in genfromtxt only a string? In-Reply-To: References: <5BD5E255-0E1B-41BC-8FDC-2E63063F25CE@gmail.com> Message-ID: On Mon, Sep 14, 2009 at 10:55 PM, Skipper Seabold wrote: > On Mon, Sep 14, 2009 at 10:41 PM, Pierre GM wrote: >> >> On Sep 14, 2009, at 10:31 PM, Skipper Seabold wrote: >>> >>> I actually figured out a workaround with converters, since my missing >>> values are " "," ?"," ? " ie., irregular number of spaces and the >>> values aren't stripped of white spaces. 
?I just define {# : lambda s: >>> float(s.strip() or 0)}, and I have a loop build all of the converters, >>> but then I have to go through and drop the ones that are supposed to >>> be strings or dates, which is still pretty tedious, since I have a >>> number of datasets that are like this, but they all contain different >>> data in different orders and there's no (computer) logical order to it >>> that I've discovered yet. >> >> I understand your frustration... We could think about some kind of >> global default for the missing values... > > I'm not too frustrated, I'd just like to do this as few times as > humanly (or machine-ly, rather) possible in the future... > > The main thing I'd like right now I think is for whitespace to be > stripped, but maybe there is a good reason for this. ?I didn't realize > this was the source of my confusion at first. ?Also just being able to > define missing as a number would be nice. ?I started a patch for this, > but I reverted when I realized I could make the converters as I did. > > While we're on the subject, the other thing on my wishlist (unless I > just don't know how to do this) is being able to define a "column map" > for datasets that have no delimiters. ?At first each observation of my > data was just one long string with no gaps or regular breaks but I > knew which columns had what. ?Eg., the first variable was (not > zero-indexed) columns 1-6, the second columns 11-15, the third column > 16, etc. ?so I would just say delimiter = [1:6,11:15,16,...]. > Err, 1-6, 7-10, 11-15, 16... I need some sleep. >>> I tried another workaround for the dates with my converters defined >>> as conv >>> >>> conv.update({date : lambda s : datetime(*map(int, >>> s.strip().split('/')[-1:]+s.strip().split('/')[:2]))}) >>> >>> Where `date` is the column that contains a date. ?The problem was that >>> my dates are "mm/dd/yyyy" and datetime needs "yyyy,mm,dd," it worked >>> for a test case if my dates were "dd/mm/yyyy" and I just use reversed, >>> but gave an error about not finding the day in the third position, >>> though that lambda function worked for a test case outside of >>> genfromtxt. >> >> Check the archives of the mailing list, there's an example using >> dateutil.parser that may be just what you need. >> > > Ah ok. ?I looked for a bit, but I was sure I missed something. ?Thanks. > > Skipper > From pgmdevlist at gmail.com Mon Sep 14 23:40:39 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Mon, 14 Sep 2009 23:40:39 -0400 Subject: [Numpy-discussion] `missing` argument in genfromtxt only a string? In-Reply-To: References: <5BD5E255-0E1B-41BC-8FDC-2E63063F25CE@gmail.com> Message-ID: <99DB018B-346C-4F02-8D87-DBEF0D8997A7@gmail.com> On Sep 14, 2009, at 10:55 PM, Skipper Seabold wrote: > > While we're on the subject, the other thing on my wishlist (unless I > just don't know how to do this) is being able to define a "column map" > for datasets that have no delimiters. At first each observation of my > data was just one long string with no gaps or regular breaks but I > knew which columns had what. Eg., the first variable was (not > zero-indexed) columns 1-6, the second columns 11-15, the third column > 16, etc. so I would just say delimiter = [1:6,11:15,16,...]. Fixed-width fields should already be supported. Instead of delimiter= [1-6, 7-10, 11-15, 16]..., use delimiter=[6, 4, 4, 1] (that is, just give the widths of the fields). 
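A small self-contained illustration of the width-based delimiter described above; the field widths (6, 4 and 5 characters), the field names and the two sample rows are made up.

from StringIO import StringIO
import numpy as np

# Two rows with three fixed-width fields of widths 6, 4 and 5 characters.
s = StringIO("D01N01  12 31.5\n"
             "L24U05 456 2.25\n")
data = np.genfromtxt(s, delimiter=[6, 4, 5],
                     dtype=[('station', 'S6'), ('count', int), ('value', float)])
# data['station'] -> array(['D01N01', 'L24U05'], dtype='|S6')
# data['count']   -> array([ 12, 456])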
Note that I wouldn't be surprised at all if it failed for some corner cases (eg, if you need to read the name from the first line). From jsseabold at gmail.com Tue Sep 15 00:50:55 2009 From: jsseabold at gmail.com (Skipper Seabold) Date: Tue, 15 Sep 2009 00:50:55 -0400 Subject: [Numpy-discussion] `missing` argument in genfromtxt only a string? In-Reply-To: <99DB018B-346C-4F02-8D87-DBEF0D8997A7@gmail.com> References: <5BD5E255-0E1B-41BC-8FDC-2E63063F25CE@gmail.com> <99DB018B-346C-4F02-8D87-DBEF0D8997A7@gmail.com> Message-ID: On Mon, Sep 14, 2009 at 11:40 PM, Pierre GM wrote: > > On Sep 14, 2009, at 10:55 PM, Skipper Seabold wrote: >> >> While we're on the subject, the other thing on my wishlist (unless I >> just don't know how to do this) is being able to define a "column map" >> for datasets that have no delimiters. ?At first each observation of my >> data was just one long string with no gaps or regular breaks but I >> knew which columns had what. ?Eg., the first variable was (not >> zero-indexed) columns 1-6, the second columns 11-15, the third column >> 16, etc. ?so I would just say delimiter = [1:6,11:15,16,...]. > > Fixed-width fields should already be supported. Instead of delimiter= > [1-6, 7-10, 11-15, 16]..., use delimiter=[6, 4, 4, 1] (that is, just > give the widths of the fields). > Note that I wouldn't be surprised at all if it failed for some corner > cases (eg, if you need to read the name from the first line). > Doh, so it does! The docstring could probably note this unless I just missed it somewhere. Thanks, Skipper From pgmdevlist at gmail.com Tue Sep 15 01:24:38 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Tue, 15 Sep 2009 01:24:38 -0400 Subject: [Numpy-discussion] `missing` argument in genfromtxt only a string? In-Reply-To: References: <5BD5E255-0E1B-41BC-8FDC-2E63063F25CE@gmail.com> <99DB018B-346C-4F02-8D87-DBEF0D8997A7@gmail.com> Message-ID: On Sep 15, 2009, at 12:50 AM, Skipper Seabold wrote: >> >> Fixed-width fields should already be supported. Instead of delimiter= >> [1-6, 7-10, 11-15, 16]..., use delimiter=[6, 4, 4, 1] (that is, just >> give the widths of the fields). >> Note that I wouldn't be surprised at all if it failed for some corner >> cases (eg, if you need to read the name from the first line). >> > > Doh, so it does! The docstring could probably note this unless I just > missed it somewhere. Well, we sure do need some docs and more examples. From nadavh at visionsense.com Tue Sep 15 02:28:35 2009 From: nadavh at visionsense.com (Nadav Horesh) Date: Tue, 15 Sep 2009 09:28:35 +0300 Subject: [Numpy-discussion] Error in numpy 1.4.0 dev 07384 Message-ID: <710F2847B0018641891D9A21602763605AD16A@ex3.envision.co.il> I compiled the recent numpy from svn on gentoo-amd64, since then I often get the error message: RuntimeError: FATAL: module compiled aslittle endian, but detected different endianness at runtime Of course AMD64 is little endian, and array's byteorder is little endian by default. Grepping shows the the error origin is a C code inside generate_numpy_api.py BTW: Until now I did not observe any problem, the program I run seems to produce the correct results. Nadav From gael.varoquaux at normalesup.org Tue Sep 15 03:37:47 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Tue, 15 Sep 2009 09:37:47 +0200 Subject: [Numpy-discussion] `missing` argument in genfromtxt only a string? 
In-Reply-To: References: <5BD5E255-0E1B-41BC-8FDC-2E63063F25CE@gmail.com> <99DB018B-346C-4F02-8D87-DBEF0D8997A7@gmail.com> Message-ID: <20090915073747.GD17789@phare.normalesup.org> On Tue, Sep 15, 2009 at 12:50:55AM -0400, Skipper Seabold wrote: > Doh, so it does! The docstring could probably note this unless I just > missed it somewhere. Hey Skipper, You sent a patch a while ago to fix a docstring. I am not sure it has been applied ( :( ). I just wanted to point out that there is an easy way of making a difference, and making sure that the docstrings get fixed (which is indeed very important). If you go to http://docs.scipy.org/ and register, send your login name on this mailing list, we will add you to the list of editors, and you will be able to edit easily the docstrings of scipy SVN. Cheers, Ga?l From lciti at essex.ac.uk Tue Sep 15 04:32:47 2009 From: lciti at essex.ac.uk (Citi, Luca) Date: Tue, 15 Sep 2009 09:32:47 +0100 Subject: [Numpy-discussion] Error in numpy 1.4.0 dev 07384 In-Reply-To: <710F2847B0018641891D9A21602763605AD16A@ex3.envision.co.il> References: <710F2847B0018641891D9A21602763605AD16A@ex3.envision.co.il> Message-ID: <271BED32E925E646A1333A56D9C6AFCB31E561A00F@MBOX0.essex.ac.uk> I got the same problem when compiling a new svn revision with some intermediate files left from the build of a previous revision. Removing the content of the build folder before compiling the new version solved the issue. From nadavh at visionsense.com Tue Sep 15 06:30:51 2009 From: nadavh at visionsense.com (Nadav Horesh) Date: Tue, 15 Sep 2009 13:30:51 +0300 Subject: [Numpy-discussion] Error in numpy 1.4.0 dev 07384 References: <710F2847B0018641891D9A21602763605AD16A@ex3.envision.co.il> <271BED32E925E646A1333A56D9C6AFCB31E561A00F@MBOX0.essex.ac.uk> Message-ID: <710F2847B0018641891D9A21602763605AD16C@ex3.envision.co.il> That it! Thanks, Nadav -----????? ??????----- ???: numpy-discussion-bounces at scipy.org ??? Citi, Luca ????: ? 15-??????-09 11:32 ??: Discussion of Numerical Python ????: Re: [Numpy-discussion] Error in numpy 1.4.0 dev 07384 I got the same problem when compiling a new svn revision with some intermediate files left from the build of a previous revision. Removing the content of the build folder before compiling the new version solved the issue. _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: winmail.dat Type: application/ms-tnef Size: 2982 bytes Desc: not available URL: From washakie at gmail.com Tue Sep 15 07:04:36 2009 From: washakie at gmail.com (John [H2O]) Date: Tue, 15 Sep 2009 04:04:36 -0700 (PDT) Subject: [Numpy-discussion] exec: bad practice? Message-ID: <25452013.post@talk.nabble.com> Hello, I have a bit of code where I create arrays with meaningful names via: meat = ['beef','lamb','pork'] cut = ['ribs','cutlets'] for m in meat: for c in cut: exec("consumed_%s_%s = np.zeros((numxgrid,numygrid,nummeasured))" % (m,c)) Is this 'pythonic'? Or is it bad practice (and potentially slow) to use the 'exec' statement? -- View this message in context: http://www.nabble.com/exec%3A-bad-practice--tp25452013p25452013.html Sent from the Numpy-discussion mailing list archive at Nabble.com. From seb.binet at gmail.com Tue Sep 15 07:07:56 2009 From: seb.binet at gmail.com (Sebastien Binet) Date: Tue, 15 Sep 2009 13:07:56 +0200 Subject: [Numpy-discussion] exec: bad practice? 
In-Reply-To: <25452013.post@talk.nabble.com> References: <25452013.post@talk.nabble.com> Message-ID: <200909151307.56368.binet@cern.ch> hi John, > I have a bit of code where I create arrays with meaningful names via: > > meat = ['beef','lamb','pork'] > cut = ['ribs','cutlets'] > > for m in meat: > for c in cut: > exec("consumed_%s_%s = np.zeros((numxgrid,numygrid,nummeasured))" > % (m,c)) > > Is this 'pythonic'? Or is it bad practice (and potentially slow) to use the > 'exec' statement? usage of the exec statement is usually frown upon and can be side stepped. e.g: for m in meat: for c in cut: locals()['consumed_%s_%s' % (m,c)] = some_array hth, sebastien. -- ######################################### # Dr. Sebastien Binet # Laboratoire de l'Accelerateur Lineaire # Universite Paris-Sud XI # Batiment 200 # 91898 Orsay ######################################### From aisaac at american.edu Tue Sep 15 08:34:35 2009 From: aisaac at american.edu (Alan G Isaac) Date: Tue, 15 Sep 2009 08:34:35 -0400 Subject: [Numpy-discussion] exec: bad practice? In-Reply-To: <200909151307.56368.binet@cern.ch> References: <25452013.post@talk.nabble.com> <200909151307.56368.binet@cern.ch> Message-ID: <4AAF89DB.9030707@american.edu> On 9/15/2009 7:07 AM, Sebastien Binet wrote: > usage of the exec statement is usually frown upon and can be side stepped. > e.g: > > for m in meat: > for c in cut: > locals()['consumed_%s_%s' % (m,c)] = some_array Additionally, name construction can be pointless. Maybe:: info = dict() for pr in itertools.product(meat, cut): info[pr] = f(pr) Alan Isaac From stefan at sun.ac.za Tue Sep 15 08:46:17 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Tue, 15 Sep 2009 14:46:17 +0200 Subject: [Numpy-discussion] How to change the dtype of a structured or record array In-Reply-To: <4AAE58B0.8040304@gmail.com> References: <4AAE58B0.8040304@gmail.com> Message-ID: <9457e7c80909150546s349e6502y868d9ab1c699bbe9@mail.gmail.com> Hi Bruce 2009/9/14 Bruce Southey : > I would like to change the dtype of just one field of a structured or > record array without copying the original array. I can not change the > creation of the original array because it was created using genfromtxt. You can't do that, unfortunately. You can view the array using any dtype of the same length as the old one, but you can't modify the length of single elements without re-allocation. Regards St?fan From bsouthey at gmail.com Tue Sep 15 09:03:55 2009 From: bsouthey at gmail.com (Bruce Southey) Date: Tue, 15 Sep 2009 08:03:55 -0500 Subject: [Numpy-discussion] How to change the dtype of a structured or record array In-Reply-To: <9457e7c80909150546s349e6502y868d9ab1c699bbe9@mail.gmail.com> References: <4AAE58B0.8040304@gmail.com> <9457e7c80909150546s349e6502y868d9ab1c699bbe9@mail.gmail.com> Message-ID: <4AAF90BB.3050205@gmail.com> On 09/15/2009 07:46 AM, St?fan van der Walt wrote: > Hi Bruce > > 2009/9/14 Bruce Southey: > >> I would like to change the dtype of just one field of a structured or >> record array without copying the original array. I can not change the >> creation of the original array because it was created using genfromtxt. >> > You can't do that, unfortunately. You can view the array using any > dtype of the same length as the old one, but you can't modify the > length of single elements without re-allocation. > > Regards > St?fan > Thanks! 
Bruce From jsseabold at gmail.com Tue Sep 15 09:22:41 2009 From: jsseabold at gmail.com (Skipper Seabold) Date: Tue, 15 Sep 2009 09:22:41 -0400 Subject: [Numpy-discussion] `missing` argument in genfromtxt only a string? In-Reply-To: <20090915073747.GD17789@phare.normalesup.org> References: <5BD5E255-0E1B-41BC-8FDC-2E63063F25CE@gmail.com> <99DB018B-346C-4F02-8D87-DBEF0D8997A7@gmail.com> <20090915073747.GD17789@phare.normalesup.org> Message-ID: On Tue, Sep 15, 2009 at 3:37 AM, Gael Varoquaux wrote: > On Tue, Sep 15, 2009 at 12:50:55AM -0400, Skipper Seabold wrote: >> Doh, so it does! ?The docstring could probably note this unless I just >> missed it somewhere. > > Hey Skipper, > > You sent a patch a while ago to fix a docstring. I am not sure it has > been applied ( :( ). > > I just wanted to point out that there is an easy way of making a > difference, and making sure that the docstrings get fixed (which is > indeed very important). If you go to http://docs.scipy.org/ and register, > send your login name on this mailing list, we will add you to the list of > editors, and you will be able to edit easily the docstrings of scipy SVN. > Yes, of course. I have a login already, thanks. How quickly I forget. I will have a look at the docs and add some examples. Skipper From bsouthey at gmail.com Tue Sep 15 09:43:16 2009 From: bsouthey at gmail.com (Bruce Southey) Date: Tue, 15 Sep 2009 08:43:16 -0500 Subject: [Numpy-discussion] `missing` argument in genfromtxt only a string? In-Reply-To: References: Message-ID: <4AAF99F4.7050101@gmail.com> On 09/14/2009 09:31 PM, Skipper Seabold wrote: > On Mon, Sep 14, 2009 at 9:59 PM, Pierre GM wrote: > [snip] >> OK, I see the problem... >> When no dtype is defined, we try to guess what a converter should >> return by testing its inputs. At first we check whether the input is a >> boolean, then whether it's an integer, then a float, and so on. When >> you define explicitly a converter, there's no need for all those >> checks, so we lock the converter to a particular state, which sets the >> conversion function and the value to return in case of missing. >> Except that I messed it up and it fails in that case (the conversion >> function is set properly, bu the dtype of the output is still >> undefined). That's a bug, I'll try to fix that once I've tamed my snow >> kitten. >> > No worries. I really like genfromtxt (having recently gotten pretty > familiar with it) and would like to help out with extending it towards > these kind of cases if there's an interest and this is feasible. > > I tried another workaround for the dates with my converters defined as conv > > conv.update({date : lambda s : datetime(*map(int, > s.strip().split('/')[-1:]+s.strip().split('/')[:2]))}) > > Where `date` is the column that contains a date. The problem was that > my dates are "mm/dd/yyyy" and datetime needs "yyyy,mm,dd," it worked > for a test case if my dates were "dd/mm/yyyy" and I just use reversed, > but gave an error about not finding the day in the third position, > though that lambda function worked for a test case outside of > genfromtxt. > > >> Meanwhile, you can use tsfromtxt (in scikits.timeseries), >> In SAS there are multiple ways to define formats especially dates: http://support.sas.com/onlinedoc/913/getDoc/en/lrcon.hlp/a002200738.htm It would be nice to accept the common variants (USA vs English dates) as well as two digit vs 4 digit year codes. 
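Along the lines of the dateutil.parser suggestion earlier in the thread, a rough sketch of a converter that absorbs several of those variants at once. The column index, field names, sample rows and the dayfirst choice are only examples, and an explicit output dtype is given because, as noted above, a user converter combined with dtype=None currently trips the converter-locking bug.

from StringIO import StringIO
import numpy as np
from dateutil import parser   # third-party python-dateutil

# dayfirst=False reads "10/1/2003" US-style (October 1st); set it to True
# for day-first files.  Two-digit years such as "03" are also accepted.
conv = {1: lambda s: parser.parse(s.strip(), dayfirst=False)}

s = StringIO("D01N01,10/1/2003,1\nL24U05,25/12/03,2\n")
data = np.genfromtxt(s, delimiter=",", converters=conv,
                     dtype=[('id', 'S6'), ('date', object), ('flag', int)])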
>> or even >> simpler, define a dtype for the output (you know that your first >> column is a str, your second an object, and the others ints or floats... >> >> How do you specify different dtypes in genfromtxt? I could not see the information in the docstring and the dtype argument does not appear to allow multiple dtypes. Bruce From rmay31 at gmail.com Tue Sep 15 10:08:34 2009 From: rmay31 at gmail.com (Ryan May) Date: Tue, 15 Sep 2009 09:08:34 -0500 Subject: [Numpy-discussion] Error in numpy 1.4.0 dev 07384 In-Reply-To: <710F2847B0018641891D9A21602763605AD16C@ex3.envision.co.il> References: <710F2847B0018641891D9A21602763605AD16A@ex3.envision.co.il> <271BED32E925E646A1333A56D9C6AFCB31E561A00F@MBOX0.essex.ac.uk> <710F2847B0018641891D9A21602763605AD16C@ex3.envision.co.il> Message-ID: Keep in mind that you can still have a problem with a conflict between your SVN copy and system copy, if the SVN copy is visible by default (like, say, installed to ~/.local under python 2.6) In my case, there was a problem where a gnome panel applet used a feature in pygtk which called to numpy. I was getting the same RuntimeError because Pygtk was built against the system 1.3 copy, but when I ran the applet, it would first find my 1.4 SVN numpy. Just an FYI (to you and others) as I lost a chunk of time figuring that out. Ryan 2009/9/15 Nadav Horesh > That it! > > Thanks, > > Nadav > > -----????? ??????----- > ???: numpy-discussion-bounces at scipy.org ??? Citi, Luca > ????: ? 15-??????-09 11:32 > ??: Discussion of Numerical Python > ????: Re: [Numpy-discussion] Error in numpy 1.4.0 dev 07384 > > I got the same problem when compiling a new svn revision with some > intermediate files left from the build of a previous revision. > Removing the content of the build folder before compiling the new version > solved the issue. > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- Ryan May Graduate Research Assistant School of Meteorology University of Oklahoma -------------- next part -------------- An HTML attachment was scrubbed... URL: From rlw at stsci.edu Tue Sep 15 10:30:43 2009 From: rlw at stsci.edu (Rick White) Date: Tue, 15 Sep 2009 10:30:43 -0400 Subject: [Numpy-discussion] exec: bad practice? In-Reply-To: References: Message-ID: <5AE24C95-EF0B-4523-AB9F-00761186B5E1@stsci.edu> You're not supposed to write to the locals() dictionary. Sometimes it works, but sometimes it doesn't. From the Python library docs: locals() Update and return a dictionary representing the current local symbol table. Note: The contents of this dictionary should not be modified; changes may not affect the values of local variables used by the interpreter. I think the only way to create a variable with a program-specified name in the local namespace is to use exec (but I'd be happy to be corrected). Cheers, Rick On Sep 15, 2009 Sebastien Binet wrote: > hi John, > >> I have a bit of code where I create arrays with meaningful names via: >> >> meat = ['beef','lamb','pork'] >> cut = ['ribs','cutlets'] >> >> for m in meat: >> for c in cut: >> exec("consumed_%s_%s = np.zeros >> ((numxgrid,numygrid,nummeasured))" >> % (m,c)) >> >> Is this 'pythonic'? Or is it bad practice (and potentially slow) >> to use the >> 'exec' statement? 
> > usage of the exec statement is usually frown upon and can be side > stepped. > e.g: > > for m in meat: > for c in cut: > locals()['consumed_%s_%s' % (m,c)] = some_array > > hth, > sebastien. From jsseabold at gmail.com Tue Sep 15 10:44:16 2009 From: jsseabold at gmail.com (Skipper Seabold) Date: Tue, 15 Sep 2009 10:44:16 -0400 Subject: [Numpy-discussion] `missing` argument in genfromtxt only a string? In-Reply-To: <4AAF99F4.7050101@gmail.com> References: <4AAF99F4.7050101@gmail.com> Message-ID: On Tue, Sep 15, 2009 at 9:43 AM, Bruce Southey wrote: > On 09/14/2009 09:31 PM, Skipper Seabold wrote: >> On Mon, Sep 14, 2009 at 9:59 PM, Pierre GM ?wrote: >> > [snip] >>> OK, I see the problem... >>> When no dtype is defined, we try to guess what a converter should >>> return by testing its inputs. At first we check whether the input is a >>> boolean, then whether it's an integer, then a float, and so on. When >>> you define explicitly a converter, there's no need for all those >>> checks, so we lock the converter to a particular state, which sets the >>> conversion function and the value to return in case of missing. >>> Except that I messed it up and it fails in that case (the conversion >>> function is set properly, bu the dtype of the output is still >>> undefined). That's a bug, I'll try to fix that once I've tamed my snow >>> kitten. >>> >> No worries. ?I really like genfromtxt (having recently gotten pretty >> familiar with it) and would like to help out with extending it towards >> these kind of cases if there's an interest and this is feasible. >> >> I tried another workaround for the dates with my converters defined as conv >> >> conv.update({date : lambda s : datetime(*map(int, >> s.strip().split('/')[-1:]+s.strip().split('/')[:2]))}) >> >> Where `date` is the column that contains a date. ?The problem was that >> my dates are "mm/dd/yyyy" and datetime needs "yyyy,mm,dd," it worked >> for a test case if my dates were "dd/mm/yyyy" and I just use reversed, >> but gave an error about not finding the day in the third position, >> though that lambda function worked for a test case outside of >> genfromtxt. >> >> >>> Meanwhile, you can use tsfromtxt (in scikits.timeseries), >>> > In SAS there are multiple ways to define formats especially dates: > http://support.sas.com/onlinedoc/913/getDoc/en/lrcon.hlp/a002200738.htm > > It would be nice to accept the common variants (USA vs English dates) as > well as two digit vs 4 digit year codes. > This is relevant to what I've been doing. I parsed a SAS input file to get the information to pass to genfromtxt, and it might be useful to have these types defined. Again, I'm wondering about whether the new datetime dtype might eventually be used for something like this. Do you know if SAS publishes the format of its datasets, similar to Stata? http://www.stata.com/help.cgi?dta > > >>> or even >>> simpler, define a dtype for the output (you know that your first >>> column is a str, your second an object, and the others ints or floats... >>> >>> > How do you specify different dtypes in genfromtxt? > I could not see the information in the docstring and the dtype argument > does not appear to allow multiple dtypes. > I have also been struggling with this (and modifying the dtype of field in structured array in place, btw). To give a quick example, here are some of the ways that I expected to work and didn't and a few ways that work. 
from StringIO import StringIO import numpy as np # a few incorrect ones s = StringIO("11.3abcde") data = np.genfromtxt(s, dtype=np.dtype(int, float, str), delimiter=[1,3,5]) In [42]: data Out[42]: array([ 1, 1, -1]) s.seek(0) data = np.genfromtxt(s, dtype=np.dtype(float, int, str), delimiter=[1,3,5]) In [45]: data Out[45]: array([ 1. , 1.3, NaN]) s.seek(0) data = np.genfromtxt(s, dtype=np.dtype(str, float, int), delimiter=[1,3,5]) In [48]: data Out[48]: array(['1', '1.3', 'abcde'], dtype='|S5') # correct few s.seek(0) data = np.genfromtxt(s, dtype=np.dtype([('myint','i8'),('myfloat','f8'),('mystring','a5')]), delimiter=[1,3,5]) In [52]: data Out[52]: array((1, 1.3, 'abcde'), dtype=[('myint', '<271BED32E925E646A1333A56D9C6AFCB31E561A00F@MBOX0.essex.ac.uk><710F2847B0018641891D9A21602763605AD16C@ex3.envision.co.il> Message-ID: <710F2847B0018641891D9A21602763605AD16F@ex3.envision.co.il> I spent much time on similar issues, so my policy became to keep all the packages strictly under the same tree. Nadav -----????? ??????----- ???: numpy-discussion-bounces at scipy.org ??? Ryan May ????: ? 15-??????-09 17:08 ??: Discussion of Numerical Python ????: Re: [Numpy-discussion] Error in numpy 1.4.0 dev 07384 Keep in mind that you can still have a problem with a conflict between your SVN copy and system copy, if the SVN copy is visible by default (like, say, installed to ~/.local under python 2.6) In my case, there was a problem where a gnome panel applet used a feature in pygtk which called to numpy. I was getting the same RuntimeError because Pygtk was built against the system 1.3 copy, but when I ran the applet, it would first find my 1.4 SVN numpy. Just an FYI (to you and others) as I lost a chunk of time figuring that out. Ryan 2009/9/15 Nadav Horesh > That it! > > Thanks, > > Nadav > > -----????? ??????----- > ???: numpy-discussion-bounces at scipy.org ??? Citi, Luca > ????: ? 15-??????-09 11:32 > ??: Discussion of Numerical Python > ????: Re: [Numpy-discussion] Error in numpy 1.4.0 dev 07384 > > I got the same problem when compiling a new svn revision with some > intermediate files left from the build of a previous revision. > Removing the content of the build folder before compiling the new version > solved the issue. > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- Ryan May Graduate Research Assistant School of Meteorology University of Oklahoma -------------- next part -------------- A non-text attachment was scrubbed... Name: winmail.dat Type: application/ms-tnef Size: 3886 bytes Desc: not available URL: From josef.pktd at gmail.com Tue Sep 15 10:57:36 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 15 Sep 2009 10:57:36 -0400 Subject: [Numpy-discussion] `missing` argument in genfromtxt only a string? In-Reply-To: References: <4AAF99F4.7050101@gmail.com> Message-ID: <1cd32cbb0909150757s5cfd632ak582722e6bbfc0ff1@mail.gmail.com> On Tue, Sep 15, 2009 at 10:44 AM, Skipper Seabold wrote: > On Tue, Sep 15, 2009 at 9:43 AM, Bruce Southey wrote: >> On 09/14/2009 09:31 PM, Skipper Seabold wrote: >>> On Mon, Sep 14, 2009 at 9:59 PM, Pierre GM ?wrote: >>> >> [snip] >>>> OK, I see the problem... 
>>>> When no dtype is defined, we try to guess what a converter should >>>> return by testing its inputs. At first we check whether the input is a >>>> boolean, then whether it's an integer, then a float, and so on. When >>>> you define explicitly a converter, there's no need for all those >>>> checks, so we lock the converter to a particular state, which sets the >>>> conversion function and the value to return in case of missing. >>>> Except that I messed it up and it fails in that case (the conversion >>>> function is set properly, bu the dtype of the output is still >>>> undefined). That's a bug, I'll try to fix that once I've tamed my snow >>>> kitten. >>>> >>> No worries. ?I really like genfromtxt (having recently gotten pretty >>> familiar with it) and would like to help out with extending it towards >>> these kind of cases if there's an interest and this is feasible. >>> >>> I tried another workaround for the dates with my converters defined as conv >>> >>> conv.update({date : lambda s : datetime(*map(int, >>> s.strip().split('/')[-1:]+s.strip().split('/')[:2]))}) >>> >>> Where `date` is the column that contains a date. ?The problem was that >>> my dates are "mm/dd/yyyy" and datetime needs "yyyy,mm,dd," it worked >>> for a test case if my dates were "dd/mm/yyyy" and I just use reversed, >>> but gave an error about not finding the day in the third position, >>> though that lambda function worked for a test case outside of >>> genfromtxt. >>> >>> >>>> Meanwhile, you can use tsfromtxt (in scikits.timeseries), >>>> >> In SAS there are multiple ways to define formats especially dates: >> http://support.sas.com/onlinedoc/913/getDoc/en/lrcon.hlp/a002200738.htm >> >> It would be nice to accept the common variants (USA vs English dates) as >> well as two digit vs 4 digit year codes. >> > > This is relevant to what I've been doing. ?I parsed a SAS input file > to get the information to pass to genfromtxt, and it might be useful > to have these types defined. ?Again, I'm wondering about whether the > new datetime dtype might eventually be used for something like this. > > Do you know if SAS publishes the format of its datasets, similar to > Stata? ?http://www.stata.com/help.cgi?dta > >> >> >>>> or even >>>> simpler, define a dtype for the output (you know that your first >>>> column is a str, your second an object, and the others ints or floats... >>>> >>>> >> How do you specify different dtypes in genfromtxt? >> I could not see the information in the docstring and the dtype argument >> does not appear to allow multiple dtypes. >> > > I have also been struggling with this (and modifying the dtype of > field in structured array in place, btw). ?To give a quick example, > here are some of the ways that I expected to work and didn't and a few > ways that work. > > from StringIO import StringIO > import numpy as np > > # a few incorrect ones > > s = StringIO("11.3abcde") > data = np.genfromtxt(s, dtype=np.dtype(int, float, str), delimiter=[1,3,5]) > > In [42]: data > Out[42]: array([ 1, ?1, -1]) > > s.seek(0) > data = np.genfromtxt(s, dtype=np.dtype(float, int, str), delimiter=[1,3,5]) > > In [45]: data > Out[45]: array([ 1. , ?1.3, ?NaN]) > > s.seek(0) > data = np.genfromtxt(s, dtype=np.dtype(str, float, int), delimiter=[1,3,5]) > > In [48]: data > Out[48]: > array(['1', '1.3', 'abcde'], > ? ? ?dtype='|S5') these are not problem of genfromtxt, the dtype construction is not what you think it is. 
What the second and third arguments are, I don't know >>> np.dtype(int,float,str) dtype('int32') >>> np.dtype(float,float,str) dtype('float64') >>> np.dtype(str,float,str) dtype('|S0') I think the versions below are the correct way of specifying a structured dtype. Josef > > # correct few > > s.seek(0) > data = np.genfromtxt(s, > dtype=np.dtype([('myint','i8'),('myfloat','f8'),('mystring','a5')]), > delimiter=[1,3,5]) > > In [52]: data > Out[52]: > array((1, 1.3, 'abcde'), > ? ? ?dtype=[('myint', ' > s.seek(0) > data = np.genfromtxt(s, dtype=None, delimiter=[1,3,5]) > > In [55]: data > Out[55]: > array((1, 1.3, 'abcde'), > ? ? ?dtype=[('f0', ' > # one I expected to work but have probably made an obvious mistake > > s.seek(0) > data = np.genfromtxt(s, dtype=np.dtype('i8','f8','a5'), > names=['myint','myfloat','mystring'], delimiter=[1,3,5]) > > In [64]: data > Out[64]: array([ 1, ?1, -1]) > > # "ugly" way to do this, but it works > > s.seek(0) > data = np.genfromtxt(s, > dtype=np.dtype([('','i8'),('','f8'),('','a5')]), > names=['myint','myfloat','mystring'], delimiter=[1,3,5]) > > In [69]: data > Out[69]: > array((1, 1.3, 'abcde'), > ? ? ?dtype=[('myint', ' > > Skipper > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From jsseabold at gmail.com Tue Sep 15 10:57:17 2009 From: jsseabold at gmail.com (Skipper Seabold) Date: Tue, 15 Sep 2009 10:57:17 -0400 Subject: [Numpy-discussion] `missing` argument in genfromtxt only a string? In-Reply-To: References: <4AAF99F4.7050101@gmail.com> Message-ID: On Tue, Sep 15, 2009 at 10:44 AM, Skipper Seabold wrote: > On Tue, Sep 15, 2009 at 9:43 AM, Bruce Southey wrote: >> On 09/14/2009 09:31 PM, Skipper Seabold wrote: >>> On Mon, Sep 14, 2009 at 9:59 PM, Pierre GM ?wrote: >>> >> [snip] >>>> OK, I see the problem... >>>> When no dtype is defined, we try to guess what a converter should >>>> return by testing its inputs. At first we check whether the input is a >>>> boolean, then whether it's an integer, then a float, and so on. When >>>> you define explicitly a converter, there's no need for all those >>>> checks, so we lock the converter to a particular state, which sets the >>>> conversion function and the value to return in case of missing. >>>> Except that I messed it up and it fails in that case (the conversion >>>> function is set properly, bu the dtype of the output is still >>>> undefined). That's a bug, I'll try to fix that once I've tamed my snow >>>> kitten. >>>> >>> No worries. ?I really like genfromtxt (having recently gotten pretty >>> familiar with it) and would like to help out with extending it towards >>> these kind of cases if there's an interest and this is feasible. >>> >>> I tried another workaround for the dates with my converters defined as conv >>> >>> conv.update({date : lambda s : datetime(*map(int, >>> s.strip().split('/')[-1:]+s.strip().split('/')[:2]))}) >>> >>> Where `date` is the column that contains a date. ?The problem was that >>> my dates are "mm/dd/yyyy" and datetime needs "yyyy,mm,dd," it worked >>> for a test case if my dates were "dd/mm/yyyy" and I just use reversed, >>> but gave an error about not finding the day in the third position, >>> though that lambda function worked for a test case outside of >>> genfromtxt. 
>>> >>> >>>> Meanwhile, you can use tsfromtxt (in scikits.timeseries), >>>> >> In SAS there are multiple ways to define formats especially dates: >> http://support.sas.com/onlinedoc/913/getDoc/en/lrcon.hlp/a002200738.htm >> >> It would be nice to accept the common variants (USA vs English dates) as >> well as two digit vs 4 digit year codes. >> > > This is relevant to what I've been doing. ?I parsed a SAS input file > to get the information to pass to genfromtxt, and it might be useful > to have these types defined. ?Again, I'm wondering about whether the > new datetime dtype might eventually be used for something like this. > > Do you know if SAS publishes the format of its datasets, similar to > Stata? ?http://www.stata.com/help.cgi?dta > >> >> >>>> or even >>>> simpler, define a dtype for the output (you know that your first >>>> column is a str, your second an object, and the others ints or floats... >>>> >>>> >> How do you specify different dtypes in genfromtxt? >> I could not see the information in the docstring and the dtype argument >> does not appear to allow multiple dtypes. >> > > I have also been struggling with this (and modifying the dtype of > field in structured array in place, btw). ?To give a quick example, > here are some of the ways that I expected to work and didn't and a few > ways that work. > > from StringIO import StringIO > import numpy as np > > # a few incorrect ones > > s = StringIO("11.3abcde") > data = np.genfromtxt(s, dtype=np.dtype(int, float, str), delimiter=[1,3,5]) > > In [42]: data > Out[42]: array([ 1, ?1, -1]) > > s.seek(0) > data = np.genfromtxt(s, dtype=np.dtype(float, int, str), delimiter=[1,3,5]) > > In [45]: data > Out[45]: array([ 1. , ?1.3, ?NaN]) > > s.seek(0) > data = np.genfromtxt(s, dtype=np.dtype(str, float, int), delimiter=[1,3,5]) > > In [48]: data > Out[48]: > array(['1', '1.3', 'abcde'], > ? ? ?dtype='|S5') > > # correct few > > s.seek(0) > data = np.genfromtxt(s, > dtype=np.dtype([('myint','i8'),('myfloat','f8'),('mystring','a5')]), > delimiter=[1,3,5]) > > In [52]: data > Out[52]: > array((1, 1.3, 'abcde'), > ? ? ?dtype=[('myint', ' > s.seek(0) > data = np.genfromtxt(s, dtype=None, delimiter=[1,3,5]) > > In [55]: data > Out[55]: > array((1, 1.3, 'abcde'), > ? ? ?dtype=[('f0', ' > # one I expected to work but have probably made an obvious mistake > > s.seek(0) > data = np.genfromtxt(s, dtype=np.dtype('i8','f8','a5'), > names=['myint','myfloat','mystring'], delimiter=[1,3,5]) > > In [64]: data > Out[64]: array([ 1, ?1, -1]) > > # "ugly" way to do this, but it works > > s.seek(0) > data = np.genfromtxt(s, > dtype=np.dtype([('','i8'),('','f8'),('','a5')]), > names=['myint','myfloat','mystring'], delimiter=[1,3,5]) > > In [69]: data > Out[69]: > array((1, 1.3, 'abcde'), > ? ? ?dtype=[('myint', ' Btw, you don't have to pass it as a dtype. It just needs to be able to pass if dtype is not None: dtype = np.dtype(dtype) I would like to see something like this, as it does when dtype is None, but then we would have to have a type argument, maybe rather than a dtype argument. names = ['var1','var2','var3'] type = ['i', 'f', 'str'] dtype = zip(names,type) if dtype is not None: .... Again, while I'm on it...I noticed the argument to specify the autostrip argument that can be provided to _iotools.LineSplitter is always False. If this does, what I think (no time to test yet), it might be nice to be able to specify this in genfromtxt. 
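(A quick sketch of how close the existing machinery already is to that idea: if the per-column types are written as numpy format strings rather than 'i'/'f'/'str', zipping them with the names gives a list of (name, format) tuples that np.dtype already accepts, and genfromtxt appears to simply funnel its dtype argument through np.dtype as noted above. The names and formats below are only illustrative.)

from StringIO import StringIO
import numpy as np

names = ['myint', 'myfloat', 'mystring']
formats = ['i8', 'f8', 'a5']   # numpy format strings rather than 'i'/'f'/'str'

# zip() yields [('myint', 'i8'), ('myfloat', 'f8'), ('mystring', 'a5')],
# which np.dtype() already understands as a structured dtype
dt = np.dtype(zip(names, formats))

s = StringIO("11.3abcde")
data = np.genfromtxt(s, dtype=zip(names, formats), delimiter=[1, 3, 5])
# expected: data['myint'] == 1, data['myfloat'] == 1.3,
#           data['mystring'] == 'abcde'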
Skipper From seb.binet at gmail.com Tue Sep 15 10:59:37 2009 From: seb.binet at gmail.com (Sebastien Binet) Date: Tue, 15 Sep 2009 16:59:37 +0200 Subject: [Numpy-discussion] exec: bad practice? In-Reply-To: <5AE24C95-EF0B-4523-AB9F-00761186B5E1@stsci.edu> References: <5AE24C95-EF0B-4523-AB9F-00761186B5E1@stsci.edu> Message-ID: <200909151659.37907.binet@cern.ch> On Tuesday 15 September 2009 16:30:43 Rick White wrote: > You're not supposed to write to the locals() dictionary. Sometimes > it works, but sometimes it doesn't. From the Python library docs: > > locals() > Update and return a dictionary representing the current local symbol > table. > Note: The contents of this dictionary should not be modified; > changes may not affect the values of local variables used by the > interpreter. ah! I am glad I tried to show off my python shaky python skills :) > I think the only way to create a variable with a program-specified > name in the local namespace is to use exec (but I'd be happy to be > corrected). looks like so. I am sure there is a good reason for being able to programmatically modify the globals() dict content but not the locals() one... cheers, sebastien. -- ######################################### # Dr. Sebastien Binet # Laboratoire de l'Accelerateur Lineaire # Universite Paris-Sud XI # Batiment 200 # 91898 Orsay ######################################### From bsouthey at gmail.com Tue Sep 15 11:57:35 2009 From: bsouthey at gmail.com (Bruce Southey) Date: Tue, 15 Sep 2009 10:57:35 -0500 Subject: [Numpy-discussion] `missing` argument in genfromtxt only a string? In-Reply-To: References: <4AAF99F4.7050101@gmail.com> Message-ID: <4AAFB96F.5080806@gmail.com> On 09/15/2009 09:44 AM, Skipper Seabold wrote: > On Tue, Sep 15, 2009 at 9:43 AM, Bruce Southey wrote: > >> On 09/14/2009 09:31 PM, Skipper Seabold wrote: >> >>> On Mon, Sep 14, 2009 at 9:59 PM, Pierre GM wrote: >>> >>> >> [snip] >> >>>> OK, I see the problem... >>>> When no dtype is defined, we try to guess what a converter should >>>> return by testing its inputs. At first we check whether the input is a >>>> boolean, then whether it's an integer, then a float, and so on. When >>>> you define explicitly a converter, there's no need for all those >>>> checks, so we lock the converter to a particular state, which sets the >>>> conversion function and the value to return in case of missing. >>>> Except that I messed it up and it fails in that case (the conversion >>>> function is set properly, bu the dtype of the output is still >>>> undefined). That's a bug, I'll try to fix that once I've tamed my snow >>>> kitten. >>>> >>>> >>> No worries. I really like genfromtxt (having recently gotten pretty >>> familiar with it) and would like to help out with extending it towards >>> these kind of cases if there's an interest and this is feasible. >>> >>> I tried another workaround for the dates with my converters defined as conv >>> >>> conv.update({date : lambda s : datetime(*map(int, >>> s.strip().split('/')[-1:]+s.strip().split('/')[:2]))}) >>> >>> Where `date` is the column that contains a date. The problem was that >>> my dates are "mm/dd/yyyy" and datetime needs "yyyy,mm,dd," it worked >>> for a test case if my dates were "dd/mm/yyyy" and I just use reversed, >>> but gave an error about not finding the day in the third position, >>> though that lambda function worked for a test case outside of >>> genfromtxt. 
>>> >>> >>> >>>> Meanwhile, you can use tsfromtxt (in scikits.timeseries), >>>> >>>> >> In SAS there are multiple ways to define formats especially dates: >> http://support.sas.com/onlinedoc/913/getDoc/en/lrcon.hlp/a002200738.htm >> >> It would be nice to accept the common variants (USA vs English dates) as >> well as two digit vs 4 digit year codes. >> >> > This is relevant to what I've been doing. I parsed a SAS input file > to get the information to pass to genfromtxt, and it might be useful > to have these types defined. Again, I'm wondering about whether the > new datetime dtype might eventually be used for something like this. > > Do you know if SAS publishes the format of its datasets, similar to > Stata? http://www.stata.com/help.cgi?dta > I am not exactly sure what you mean. Most of type formats are available under the data set informat statement but really you need to address special ones like defining strings with sufficient length and time when reading data. Usually I read dates as strings and then convert back dates as needed since these are not always correct or have the same format in the data. SAS is rather complex as it has multiple ways to create what it calls permanent datasets and these are even incompatible across OS's in the same version. So really these are not very useful outside of the specific version of SAS that is being used. There are many ways to transfer files like using the xport engine that R can read (see read.xport in foreign package - has link to format). However, usually it is just easier to create a new file within SAS. > >> >> >>>> or even >>>> simpler, define a dtype for the output (you know that your first >>>> column is a str, your second an object, and the others ints or floats... >>>> >>>> >>>> >> How do you specify different dtypes in genfromtxt? >> I could not see the information in the docstring and the dtype argument >> does not appear to allow multiple dtypes. >> >> > I have also been struggling with this (and modifying the dtype of > field in structured array in place, btw). To give a quick example, > here are some of the ways that I expected to work and didn't and a few > ways that work. > > from StringIO import StringIO > import numpy as np > > # a few incorrect ones > > s = StringIO("11.3abcde") > data = np.genfromtxt(s, dtype=np.dtype(int, float, str), delimiter=[1,3,5]) > > In [42]: data > Out[42]: array([ 1, 1, -1]) > > s.seek(0) > data = np.genfromtxt(s, dtype=np.dtype(float, int, str), delimiter=[1,3,5]) > > In [45]: data > Out[45]: array([ 1. 
, 1.3, NaN]) > > s.seek(0) > data = np.genfromtxt(s, dtype=np.dtype(str, float, int), delimiter=[1,3,5]) > > In [48]: data > Out[48]: > array(['1', '1.3', 'abcde'], > dtype='|S5') > > # correct few > > s.seek(0) > data = np.genfromtxt(s, > dtype=np.dtype([('myint','i8'),('myfloat','f8'),('mystring','a5')]), > delimiter=[1,3,5]) > > In [52]: data > Out[52]: > array((1, 1.3, 'abcde'), > dtype=[('myint', ' > s.seek(0) > data = np.genfromtxt(s, dtype=None, delimiter=[1,3,5]) > > In [55]: data > Out[55]: > array((1, 1.3, 'abcde'), > dtype=[('f0', ' > # one I expected to work but have probably made an obvious mistake > > s.seek(0) > data = np.genfromtxt(s, dtype=np.dtype('i8','f8','a5'), > names=['myint','myfloat','mystring'], delimiter=[1,3,5]) > > In [64]: data > Out[64]: array([ 1, 1, -1]) > > # "ugly" way to do this, but it works > > s.seek(0) > data = np.genfromtxt(s, > dtype=np.dtype([('','i8'),('','f8'),('','a5')]), > names=['myint','myfloat','mystring'], delimiter=[1,3,5]) > > In [69]: data > Out[69]: > array((1, 1.3, 'abcde'), > dtype=[('myint', ' > > Skipper > Thanks for these examples as these make sense now. I was confused because the display shows the dtype as list not as a single dtype. Bruce From denis-bz-py at t-online.de Tue Sep 15 12:05:50 2009 From: denis-bz-py at t-online.de (denis bzowy) Date: Tue, 15 Sep 2009 16:05:50 +0000 (UTC) Subject: [Numpy-discussion] timeit one big vs many little a[ogrid] = f(ogrid) References: Message-ID: Added: an inline grid y,x = np.ogrid[ j:j+n, k:k+n ] a[ j:j+n, k:k+n ] = f(x,y) is 3* faster than a[y,x] = f(x,y) for 256x256, about the same for little 8x8 squares (on mac ppc.) So ogrids are not "objects" -- you can't g = xxgrid[j:j+n, k:k+n] ... use g, pass it just like inline. (Does anyone know of and use a small clean Grid class in C++ ? Graphics libs have had window-to-viewport since day 1: always there, iron out silly errors, maybe heavy.) From gael.varoquaux at normalesup.org Tue Sep 15 13:07:15 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Tue, 15 Sep 2009 19:07:15 +0200 Subject: [Numpy-discussion] `missing` argument in genfromtxt only a string? In-Reply-To: References: <5BD5E255-0E1B-41BC-8FDC-2E63063F25CE@gmail.com> <99DB018B-346C-4F02-8D87-DBEF0D8997A7@gmail.com> <20090915073747.GD17789@phare.normalesup.org> Message-ID: <20090915170715.GB31154@phare.normalesup.org> On Tue, Sep 15, 2009 at 09:22:41AM -0400, Skipper Seabold wrote: > Yes, of course. I have a login already, thanks. How quickly I > forget. I will have a look at the docs and add some examples. Thanks a lot. Such contributions are very valuable to the community. Ga?l From michael.s.gilbert at gmail.com Tue Sep 15 13:38:40 2009 From: michael.s.gilbert at gmail.com (Michael Gilbert) Date: Tue, 15 Sep 2009 13:38:40 -0400 Subject: [Numpy-discussion] warning or error for non-physical multivariate_normal covariance matrices? Message-ID: <20090915133840.5f21b221.michael.s.gilbert@gmail.com> hi, when using numpy.random.multivariate_normal, would it make sense to warn the user that they have entered a non-physical covariance matrix? i was recently working on a problem and getting very strange results until i finally realized that i had actually entered a bogus covariance matrix. its easy to determine when this is the case -- its when the determinant of the covariance matrix is negative. i.e. the multivariate normal distribution has det(C)^1/2 as part of the normalization factor, so when det(C)<0, you end up with an imaginary probability distribution. 
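A minimal sketch of the kind of sanity check being suggested here; this is not what numpy.random.multivariate_normal currently does, and the function name and tolerance are made up for illustration:

import numpy as np

def check_covariance(cov, tol=1e-8):
    # a valid covariance matrix must be symmetric with no
    # (significantly) negative eigenvalues
    cov = np.asarray(cov, dtype=float)
    if not np.allclose(cov, cov.T):
        raise ValueError("covariance matrix is not symmetric")
    eigvals = np.linalg.eigvalsh(cov)
    if eigvals.min() < -tol * max(abs(eigvals).max(), 1.0):
        raise ValueError("covariance matrix is not positive semi-definite")

# [[1, 2], [2, 1]] is symmetric but has det = -3 (eigenvalues 3 and -1),
# so check_covariance([[1., 2.], [2., 1.]]) raises ValueError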
a warning might be better than an error since there may be cases where the user would intentionally want this type of configuration. mike From charlesr.harris at gmail.com Tue Sep 15 13:50:47 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 15 Sep 2009 11:50:47 -0600 Subject: [Numpy-discussion] warning or error for non-physical multivariate_normal covariance matrices? In-Reply-To: <20090915133840.5f21b221.michael.s.gilbert@gmail.com> References: <20090915133840.5f21b221.michael.s.gilbert@gmail.com> Message-ID: On Tue, Sep 15, 2009 at 11:38 AM, Michael Gilbert < michael.s.gilbert at gmail.com> wrote: > hi, > > when using numpy.random.multivariate_normal, would it make sense to warn > the user that they have entered a non-physical covariance matrix? i was > recently working on a problem and getting very strange results until i > finally realized that i had actually entered a bogus covariance matrix. > > its easy to determine when this is the case -- its when the > determinant of the covariance matrix is negative. i.e. the > multivariate normal distribution has det(C)^1/2 as part of the > normalization factor, so when det(C)<0, you end up with an imaginary > probability distribution. > > Hmm, you mean it isn't implemented using a cholesky decomposition? That would (should) throw an error if the covariance isn't symmetric positive definite. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From pgmdevlist at gmail.com Tue Sep 15 13:56:08 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Tue, 15 Sep 2009 13:56:08 -0400 Subject: [Numpy-discussion] `missing` argument in genfromtxt only a string? In-Reply-To: References: <4AAF99F4.7050101@gmail.com> Message-ID: On Sep 15, 2009, at 10:44 AM, Skipper Seabold wrote: >>>> How do you specify different dtypes in genfromtxt? >> I could not see the information in the docstring and the dtype >> argument >> does not appear to allow multiple dtypes. Just give a regular dtype, or something that could be interpreted as such. Have a look at http://docs.scipy.org/doc/numpy/reference/generated/numpy.dtype.html > # a few incorrect ones > > s = StringIO("11.3abcde") > data = np.genfromtxt(s, dtype=np.dtype(int, float, str), delimiter= > [1,3,5]) Non-legit at all, but a good idea in that case. > > # one I expected to work but have probably made an obvious mistake > > s.seek(0) > data = np.genfromtxt(s, dtype=np.dtype('i8','f8','a5'), > names=['myint','myfloat','mystring'], delimiter=[1,3,5]) But this one works: data=np.genfromtxt(s, dtype=np.dtype("i8,f8,a5"), names= ['myint','myfloat','mystring'], delimiter=[1,3,5]) > > Btw, you don't have to pass it as a dtype. It just needs to be able > to pass > > if dtype is not None: > dtype = np.dtype(dtype) > > I would like to see something like this, as it does when dtype is > None, but then we would have to have a type argument, maybe rather > than a dtype argument. 'k. Gonna see what I can do. > Again, while I'm on it...I noticed the argument to specify the > autostrip argument that can be provided to _iotools.LineSplitter is > always False. If this does, what I think (no time to test yet), it > might be nice to be able to specify this in genfromtxt. Would you mind giving me an example of usage with the corresponding expected output, so that I can work on it ? 
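As a side note on the dtype question above: the comma-separated string Pierre shows is only one of several equivalent spellings of the same structured dtype. A small illustration (not from the original thread):

import numpy as np

# three equivalent spellings of the same structured dtype
dt1 = np.dtype("i8,f8,a5")
dt2 = np.dtype([('f0', 'i8'), ('f1', 'f8'), ('f2', 'a5')])
dt3 = np.dtype({'names': ['f0', 'f1', 'f2'],
                'formats': ['i8', 'f8', 'a5']})
assert dt1 == dt2 == dt3

# genfromtxt then replaces the default 'f0', 'f1', ... field names when an
# explicit names= list is passed, as in the working examples above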
From jsseabold at gmail.com Tue Sep 15 14:15:38 2009 From: jsseabold at gmail.com (Skipper Seabold) Date: Tue, 15 Sep 2009 14:15:38 -0400 Subject: [Numpy-discussion] `missing` argument in genfromtxt only a string? In-Reply-To: References: <4AAF99F4.7050101@gmail.com> Message-ID: On Tue, Sep 15, 2009 at 1:56 PM, Pierre GM wrote: > > On Sep 15, 2009, at 10:44 AM, Skipper Seabold wrote: >>>>> How do you specify different dtypes in genfromtxt? >>> I could not see the information in the docstring and the dtype >>> argument >>> does not appear to allow multiple dtypes. > > Just give a regular dtype, or something that could be interpreted as > such. Have a look at > http://docs.scipy.org/doc/numpy/reference/generated/numpy.dtype.html > >> # a few incorrect ones >> >> s = StringIO("11.3abcde") >> data = np.genfromtxt(s, dtype=np.dtype(int, float, str), delimiter= >> [1,3,5]) > > Non-legit at all, but a good idea in that case. > >> >> # one I expected to work but have probably made an obvious mistake >> >> s.seek(0) >> data = np.genfromtxt(s, dtype=np.dtype('i8','f8','a5'), >> names=['myint','myfloat','mystring'], delimiter=[1,3,5]) > > But this one works: > data=np.genfromtxt(s, dtype=np.dtype("i8,f8,a5"), names= > ['myint','myfloat','mystring'], delimiter=[1,3,5]) > >> >> Btw, you don't have to pass it as a dtype. ?It just needs to be able >> to pass >> >> if dtype is not None: >> ? ?dtype = np.dtype(dtype) >> >> I would like to see something like this, as it does when dtype is >> None, but then we would have to have a type argument, maybe rather >> than a dtype argument. > > 'k. Gonna see what I can do. > Oh, given that this works though, I don't think my gripe is that legitimate. This is essentially the same thing, I just need to read up on declaring a dtype and stick some examples in the docstrings, so I don't forget... data = np.genfromtxt(s, dtype=np.dtype("i8,f8,a5"), names=['myint','myfloat','mystring'], delimiter=[1,3,5]) >> Again, while I'm on it...I noticed the argument to specify the >> autostrip argument that can be provided to _iotools.LineSplitter is >> always False. ?If this does, what I think (no time to test yet), it >> might be nice to be able to specify this in genfromtxt. > > Would you mind giving me an example of usage with the corresponding > expected output, so that I can work on it ? > Sure, I gave a longer example of this in the 2nd email in this thread, where my "missing" fields were " , , , , ,", ie., fixed width white space that I wanted to just strip down to "". Also if you notice that when it reads the date I still have "mm/dd/yyyy " with the trailing whitespace. I don't know how big of a deal this though. I think you can just define an autostrip argument in genfromtxt and then split_line=LineSplitter(..., autostrip=autostrip). I haven't tested this yet though. http://article.gmane.org/gmane.comp.python.numeric.general/32821 Skipper From robert.kern at gmail.com Tue Sep 15 14:26:23 2009 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 15 Sep 2009 13:26:23 -0500 Subject: [Numpy-discussion] warning or error for non-physical multivariate_normal covariance matrices? 
In-Reply-To: References: <20090915133840.5f21b221.michael.s.gilbert@gmail.com> Message-ID: <3d375d730909151126n3b1a9938x9b62af7ced9488b6@mail.gmail.com> On Tue, Sep 15, 2009 at 12:50, Charles R Harris wrote: > > > On Tue, Sep 15, 2009 at 11:38 AM, Michael Gilbert > wrote: >> >> hi, >> >> when using numpy.random.multivariate_normal, would it make sense to warn >> the user that they have entered a non-physical covariance matrix? i was >> recently working on a problem and getting very strange results until i >> finally realized that i had actually entered a bogus covariance matrix. >> >> its easy to determine when this is the case -- its when the >> determinant of the covariance matrix is negative. ?i.e. the >> multivariate normal distribution has det(C)^1/2 as part of the >> normalization factor, so when det(C)<0, you end up with an imaginary >> probability distribution. >> > > Hmm, you mean it isn't implemented using a cholesky decomposition? That > would (should) throw an error if the covariance isn't symmetric positive > definite. We use the SVD to do the matrix square root. I believe I was just following the older code that I was replacing. I have run into nearly degenerate cases where det(C) ~ 0 such that the SVD method gave not unreasonable answers, given the circumstances, while the Cholesky decomposition gave an error "too soon" in my estimation. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From michael.s.gilbert at gmail.com Tue Sep 15 14:57:47 2009 From: michael.s.gilbert at gmail.com (Michael Gilbert) Date: Tue, 15 Sep 2009 14:57:47 -0400 Subject: [Numpy-discussion] warning or error for non-physical multivariate_normal covariance matrices? In-Reply-To: <3d375d730909151126n3b1a9938x9b62af7ced9488b6@mail.gmail.com> References: <20090915133840.5f21b221.michael.s.gilbert@gmail.com> <3d375d730909151126n3b1a9938x9b62af7ced9488b6@mail.gmail.com> Message-ID: <20090915145747.cdf8ef4d.michael.s.gilbert@gmail.com> On Tue, 15 Sep 2009 13:26:23 -0500, Robert Kern wrote: > On Tue, Sep 15, 2009 at 12:50, Charles R > Harris wrote: > > > > > > On Tue, Sep 15, 2009 at 11:38 AM, Michael Gilbert > > wrote: > >> > >> hi, > >> > >> when using numpy.random.multivariate_normal, would it make sense to warn > >> the user that they have entered a non-physical covariance matrix? i was > >> recently working on a problem and getting very strange results until i > >> finally realized that i had actually entered a bogus covariance matrix. > >> > >> its easy to determine when this is the case -- its when the > >> determinant of the covariance matrix is negative. ?i.e. the > >> multivariate normal distribution has det(C)^1/2 as part of the > >> normalization factor, so when det(C)<0, you end up with an imaginary > >> probability distribution. > >> > > > > Hmm, you mean it isn't implemented using a cholesky decomposition? That > > would (should) throw an error if the covariance isn't symmetric positive > > definite. > > We use the SVD to do the matrix square root. I believe I was just > following the older code that I was replacing. I have run into nearly > degenerate cases where det(C) ~ 0 such that the SVD method gave not > unreasonable answers, given the circumstances, while the Cholesky > decomposition gave an error "too soon" in my estimation. i just tried a non-symmetric covariance matrix, which, like you mention is also non-physical. 
there were also no errors for this situation, and the results will obviously be incorrect. regardless of the method for determining the matrix square root, it should be possible to determine whether an error needs to be thrown based on whether or not the result is imaginary, right? mike From doutriaux1 at llnl.gov Tue Sep 15 14:55:21 2009 From: doutriaux1 at llnl.gov (=?UTF-8?Q?Charles_=D8=B3=D9=85=D9=8A=D8=B1_Doutriaux?=) Date: Tue, 15 Sep 2009 11:55:21 -0700 Subject: [Numpy-discussion] numpy 1.3.0 and g95 on Mac Os X Message-ID: <3F7BCB55-3FBA-4EE7-9083-ACFA841E684E@llnl.gov> Hi there, I have a user that failed to install numpy 1.3.0 on her Mac 10.5.8. Turns out she is not using gfortran but g95. Is it a known feature? Is g95 not supposed to work with numpy? She did set FC to g95 before, Here's the log. Thanks, C. -------------- next part -------------- A non-text attachment was scrubbed... Name: numpy.LOG Type: application/octet-stream Size: 6673 bytes Desc: not available URL: -------------- next part -------------- From charlesr.harris at gmail.com Tue Sep 15 15:17:43 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 15 Sep 2009 13:17:43 -0600 Subject: [Numpy-discussion] warning or error for non-physical multivariate_normal covariance matrices? In-Reply-To: <20090915145747.cdf8ef4d.michael.s.gilbert@gmail.com> References: <20090915133840.5f21b221.michael.s.gilbert@gmail.com> <3d375d730909151126n3b1a9938x9b62af7ced9488b6@mail.gmail.com> <20090915145747.cdf8ef4d.michael.s.gilbert@gmail.com> Message-ID: On Tue, Sep 15, 2009 at 12:57 PM, Michael Gilbert < michael.s.gilbert at gmail.com> wrote: > On Tue, 15 Sep 2009 13:26:23 -0500, Robert Kern wrote: > > On Tue, Sep 15, 2009 at 12:50, Charles R > > Harris wrote: > > > > > > > > > On Tue, Sep 15, 2009 at 11:38 AM, Michael Gilbert > > > wrote: > > >> > > >> hi, > > >> > > >> when using numpy.random.multivariate_normal, would it make sense to > warn > > >> the user that they have entered a non-physical covariance matrix? i > was > > >> recently working on a problem and getting very strange results until i > > >> finally realized that i had actually entered a bogus covariance > matrix. > > >> > > >> its easy to determine when this is the case -- its when the > > >> determinant of the covariance matrix is negative. i.e. the > > >> multivariate normal distribution has det(C)^1/2 as part of the > > >> normalization factor, so when det(C)<0, you end up with an imaginary > > >> probability distribution. > > >> > > > > > > Hmm, you mean it isn't implemented using a cholesky decomposition? That > > > would (should) throw an error if the covariance isn't symmetric > positive > > > definite. > > > > We use the SVD to do the matrix square root. I believe I was just > > following the older code that I was replacing. I have run into nearly > > degenerate cases where det(C) ~ 0 such that the SVD method gave not > > unreasonable answers, given the circumstances, while the Cholesky > > decomposition gave an error "too soon" in my estimation. > > i just tried a non-symmetric covariance matrix, which, like you > mention is also non-physical. there were also no errors for this > situation, and the results will obviously be incorrect. > > regardless of the method for determining the matrix square root, it > should be possible to determine whether an error needs to be thrown > based on whether or not the result is imaginary, right? > > The singular values should all be non-negative, if not there is a bug somewhere. 
Can you post the matrix that has the negative singular values? The symmetry can be checked in various ways. I think some sort of check would be appropriate. Open a ticket. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Tue Sep 15 15:24:59 2009 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 15 Sep 2009 14:24:59 -0500 Subject: [Numpy-discussion] numpy 1.3.0 and g95 on Mac Os X In-Reply-To: <3F7BCB55-3FBA-4EE7-9083-ACFA841E684E@llnl.gov> References: <3F7BCB55-3FBA-4EE7-9083-ACFA841E684E@llnl.gov> Message-ID: <3d375d730909151224x13b5779an96ca590354943e1@mail.gmail.com> 2009/9/15 Charles ???? Doutriaux : > Hi there, > > I have a user that failed to install numpy 1.3.0 on her Mac 10.5.8. > > Turns out she is not using gfortran but g95. > > Is it a known feature? Is g95 not supposed to work with numpy? Probably not for Mac. Most likely no one has done the work necessary to get the right flags for Mac builds of g95. Also, she will need to use --fcompiler=g95. I highly recommend using the gfortran builds from this site, though: http://r.research.att.com/tools/ -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From michael.s.gilbert at gmail.com Tue Sep 15 15:28:55 2009 From: michael.s.gilbert at gmail.com (Michael Gilbert) Date: Tue, 15 Sep 2009 15:28:55 -0400 Subject: [Numpy-discussion] warning or error for non-physical multivariate_normal covariance matrices? In-Reply-To: References: <20090915133840.5f21b221.michael.s.gilbert@gmail.com> <3d375d730909151126n3b1a9938x9b62af7ced9488b6@mail.gmail.com> <20090915145747.cdf8ef4d.michael.s.gilbert@gmail.com> Message-ID: <20090915152855.5899eead.michael.s.gilbert@gmail.com> On Tue, 15 Sep 2009 13:17:43 -0600, Charles R Harris wrote: > On Tue, Sep 15, 2009 at 12:57 PM, Michael Gilbert < > michael.s.gilbert at gmail.com> wrote: > > > On Tue, 15 Sep 2009 13:26:23 -0500, Robert Kern wrote: > > > On Tue, Sep 15, 2009 at 12:50, Charles R > > > Harris wrote: > > > > > > > > > > > > On Tue, Sep 15, 2009 at 11:38 AM, Michael Gilbert > > > > wrote: > > > >> > > > >> hi, > > > >> > > > >> when using numpy.random.multivariate_normal, would it make sense to > > warn > > > >> the user that they have entered a non-physical covariance matrix? i > > was > > > >> recently working on a problem and getting very strange results until i > > > >> finally realized that i had actually entered a bogus covariance > > matrix. > > > >> > > > >> its easy to determine when this is the case -- its when the > > > >> determinant of the covariance matrix is negative. i.e. the > > > >> multivariate normal distribution has det(C)^1/2 as part of the > > > >> normalization factor, so when det(C)<0, you end up with an imaginary > > > >> probability distribution. > > > >> > > > > > > > > Hmm, you mean it isn't implemented using a cholesky decomposition? That > > > > would (should) throw an error if the covariance isn't symmetric > > positive > > > > definite. > > > > > > We use the SVD to do the matrix square root. I believe I was just > > > following the older code that I was replacing. I have run into nearly > > > degenerate cases where det(C) ~ 0 such that the SVD method gave not > > > unreasonable answers, given the circumstances, while the Cholesky > > > decomposition gave an error "too soon" in my estimation. 
> > > > i just tried a non-symmetric covariance matrix, which, like you > > mention is also non-physical. there were also no errors for this > > situation, and the results will obviously be incorrect. > > > > regardless of the method for determining the matrix square root, it > > should be possible to determine whether an error needs to be thrown > > based on whether or not the result is imaginary, right? > > > > > The singular values should all be non-negative, if not there is a bug > somewhere. Can you post the matrix that has the negative singular values? > The symmetry can be checked in various ways. I think some sort of check > would be appropriate. Open a ticket. will do. where is your tracker at? mike From doutriaux1 at llnl.gov Tue Sep 15 15:30:02 2009 From: doutriaux1 at llnl.gov (=?UTF-8?Q?Charles_=D8=B3=D9=85=D9=8A=D8=B1_Doutriaux?=) Date: Tue, 15 Sep 2009 12:30:02 -0700 Subject: [Numpy-discussion] numpy 1.3.0 and g95 on Mac Os X In-Reply-To: <3d375d730909151224x13b5779an96ca590354943e1@mail.gmail.com> References: <3F7BCB55-3FBA-4EE7-9083-ACFA841E684E@llnl.gov> <3d375d730909151224x13b5779an96ca590354943e1@mail.gmail.com> Message-ID: Thanks Robert, That's exactly what I recommended her. Except I usually get gfortran from http://hpc.sourceforge.net C. On Sep 15, 2009, at 12:24 PM, Robert Kern wrote: > 2009/9/15 Charles ???? Doutriaux : >> Hi there, >> >> I have a user that failed to install numpy 1.3.0 on her Mac 10.5.8. >> >> Turns out she is not using gfortran but g95. >> >> Is it a known feature? Is g95 not supposed to work with numpy? > > Probably not for Mac. Most likely no one has done the work necessary > to get the right flags for Mac builds of g95. > > Also, she will need to use --fcompiler=g95. > > I highly recommend using the gfortran builds from this site, though: > > http://*r.research.att.com/tools/ > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > -- Umberto Eco > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://*mail.scipy.org/mailman/listinfo/numpy-discussion From charlesr.harris at gmail.com Tue Sep 15 15:31:21 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 15 Sep 2009 13:31:21 -0600 Subject: [Numpy-discussion] warning or error for non-physical multivariate_normal covariance matrices? In-Reply-To: <20090915152855.5899eead.michael.s.gilbert@gmail.com> References: <20090915133840.5f21b221.michael.s.gilbert@gmail.com> <3d375d730909151126n3b1a9938x9b62af7ced9488b6@mail.gmail.com> <20090915145747.cdf8ef4d.michael.s.gilbert@gmail.com> <20090915152855.5899eead.michael.s.gilbert@gmail.com> Message-ID: On Tue, Sep 15, 2009 at 1:28 PM, Michael Gilbert < michael.s.gilbert at gmail.com> wrote: > On Tue, 15 Sep 2009 13:17:43 -0600, Charles R Harris wrote: > > On Tue, Sep 15, 2009 at 12:57 PM, Michael Gilbert < > > michael.s.gilbert at gmail.com> wrote: > > > > > On Tue, 15 Sep 2009 13:26:23 -0500, Robert Kern wrote: > > > > On Tue, Sep 15, 2009 at 12:50, Charles R > > > > Harris wrote: > > > > > > > > > > > > > > > On Tue, Sep 15, 2009 at 11:38 AM, Michael Gilbert > > > > > wrote: > > > > >> > > > > >> hi, > > > > >> > > > > >> when using numpy.random.multivariate_normal, would it make sense > to > > > warn > > > > >> the user that they have entered a non-physical covariance matrix? 
> i > > > was > > > > >> recently working on a problem and getting very strange results > until i > > > > >> finally realized that i had actually entered a bogus covariance > > > matrix. > > > > >> > > > > >> its easy to determine when this is the case -- its when the > > > > >> determinant of the covariance matrix is negative. i.e. the > > > > >> multivariate normal distribution has det(C)^1/2 as part of the > > > > >> normalization factor, so when det(C)<0, you end up with an > imaginary > > > > >> probability distribution. > > > > >> > > > > > > > > > > Hmm, you mean it isn't implemented using a cholesky decomposition? > That > > > > > would (should) throw an error if the covariance isn't symmetric > > > positive > > > > > definite. > > > > > > > > We use the SVD to do the matrix square root. I believe I was just > > > > following the older code that I was replacing. I have run into nearly > > > > degenerate cases where det(C) ~ 0 such that the SVD method gave not > > > > unreasonable answers, given the circumstances, while the Cholesky > > > > decomposition gave an error "too soon" in my estimation. > > > > > > i just tried a non-symmetric covariance matrix, which, like you > > > mention is also non-physical. there were also no errors for this > > > situation, and the results will obviously be incorrect. > > > > > > regardless of the method for determining the matrix square root, it > > > should be possible to determine whether an error needs to be thrown > > > based on whether or not the result is imaginary, right? > > > > > > > > The singular values should all be non-negative, if not there is a bug > > somewhere. Can you post the matrix that has the negative singular values? > > The symmetry can be checked in various ways. I think some sort of check > > would be appropriate. Open a ticket. > > will do. where is your tracker at? > > Go to www.scipy.org, click on the bug icon, and follow directions. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From michael.s.gilbert at gmail.com Tue Sep 15 15:39:15 2009 From: michael.s.gilbert at gmail.com (Michael Gilbert) Date: Tue, 15 Sep 2009 15:39:15 -0400 Subject: [Numpy-discussion] warning or error for non-physical multivariate_normal covariance matrices? In-Reply-To: References: <20090915133840.5f21b221.michael.s.gilbert@gmail.com> <3d375d730909151126n3b1a9938x9b62af7ced9488b6@mail.gmail.com> <20090915145747.cdf8ef4d.michael.s.gilbert@gmail.com> <20090915152855.5899eead.michael.s.gilbert@gmail.com> Message-ID: <20090915153915.d6040484.michael.s.gilbert@gmail.com> On Tue, 15 Sep 2009 13:31:21 -0600, Charles R Harris wrote: > On Tue, Sep 15, 2009 at 1:28 PM, Michael Gilbert < > michael.s.gilbert at gmail.com> wrote: > > > On Tue, 15 Sep 2009 13:17:43 -0600, Charles R Harris wrote: > > > On Tue, Sep 15, 2009 at 12:57 PM, Michael Gilbert < > > > michael.s.gilbert at gmail.com> wrote: > > > > > > > On Tue, 15 Sep 2009 13:26:23 -0500, Robert Kern wrote: > > > > > On Tue, Sep 15, 2009 at 12:50, Charles R > > > > > Harris wrote: > > > > > > > > > > > > > > > > > > On Tue, Sep 15, 2009 at 11:38 AM, Michael Gilbert > > > > > > wrote: > > > > > >> > > > > > >> hi, > > > > > >> > > > > > >> when using numpy.random.multivariate_normal, would it make sense > > to > > > > warn > > > > > >> the user that they have entered a non-physical covariance matrix? 
> > i > > > > was > > > > > >> recently working on a problem and getting very strange results > > until i > > > > > >> finally realized that i had actually entered a bogus covariance > > > > matrix. > > > > > >> > > > > > >> its easy to determine when this is the case -- its when the > > > > > >> determinant of the covariance matrix is negative. i.e. the > > > > > >> multivariate normal distribution has det(C)^1/2 as part of the > > > > > >> normalization factor, so when det(C)<0, you end up with an > > imaginary > > > > > >> probability distribution. > > > > > >> > > > > > > > > > > > > Hmm, you mean it isn't implemented using a cholesky decomposition? > > That > > > > > > would (should) throw an error if the covariance isn't symmetric > > > > positive > > > > > > definite. > > > > > > > > > > We use the SVD to do the matrix square root. I believe I was just > > > > > following the older code that I was replacing. I have run into nearly > > > > > degenerate cases where det(C) ~ 0 such that the SVD method gave not > > > > > unreasonable answers, given the circumstances, while the Cholesky > > > > > decomposition gave an error "too soon" in my estimation. > > > > > > > > i just tried a non-symmetric covariance matrix, which, like you > > > > mention is also non-physical. there were also no errors for this > > > > situation, and the results will obviously be incorrect. > > > > > > > > regardless of the method for determining the matrix square root, it > > > > should be possible to determine whether an error needs to be thrown > > > > based on whether or not the result is imaginary, right? > > > > > > > > > > > The singular values should all be non-negative, if not there is a bug > > > somewhere. Can you post the matrix that has the negative singular values? > > > The symmetry can be checked in various ways. I think some sort of check > > > would be appropriate. Open a ticket. > > > > will do. where is your tracker at? > > > > Go to www.scipy.org, click on the bug icon, and follow directions. ok. i didn't realize numpy bugs were tracked along with scipy bugs. there is no mention of that on the numpy.scipy.org page. mike From robert.kern at gmail.com Tue Sep 15 15:35:38 2009 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 15 Sep 2009 14:35:38 -0500 Subject: [Numpy-discussion] numpy 1.3.0 and g95 on Mac Os X In-Reply-To: References: <3F7BCB55-3FBA-4EE7-9083-ACFA841E684E@llnl.gov> <3d375d730909151224x13b5779an96ca590354943e1@mail.gmail.com> Message-ID: <3d375d730909151235tae6b5cdq9f176210e23dcf11@mail.gmail.com> 2009/9/15 Charles ???? Doutriaux : > Thanks Robert, > > That's exactly what I recommended her. Except I usually get gfortran > from http://hpc.sourceforge.net I cannot recommend and cannot support the compilers on the hpc site. They are often compiled from buggy, unreleased and unofficial versions of gfortran, they frequently do not support all of the Mac flags and architectures, and they are released in unversioned tarballs so that there is no way to identify which release had which problems or to go back to a working release. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From dwf at cs.toronto.edu Tue Sep 15 15:38:44 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Tue, 15 Sep 2009 15:38:44 -0400 Subject: [Numpy-discussion] numpy 1.3.0 and g95 on Mac Os X In-Reply-To: References: <3F7BCB55-3FBA-4EE7-9083-ACFA841E684E@llnl.gov> <3d375d730909151224x13b5779an96ca590354943e1@mail.gmail.com> Message-ID: <79E01AA2-22E4-4F81-A7B2-25846402C6F5@cs.toronto.edu> On 15-Sep-09, at 3:30 PM, Charles ???? Doutriaux wrote: > Thanks Robert, > > That's exactly what I recommended her. Except I usually get gfortran > from http://hpc.sourceforge.net The hpc.sourceforge.net one is known to generate buggy SciPy binaries, apparently: http://projects.scipy.org/scipy/wiki/GetCode#mac-os-x Unsure if this is still the case. David From timmichelsen at gmx-topmail.de Tue Sep 15 15:52:32 2009 From: timmichelsen at gmx-topmail.de (Tim Michelsen) Date: Tue, 15 Sep 2009 21:52:32 +0200 Subject: [Numpy-discussion] online Doc Editor [Re: `missing` argument in genfromtxt only a string?] In-Reply-To: <20090915073747.GD17789@phare.normalesup.org> References: <5BD5E255-0E1B-41BC-8FDC-2E63063F25CE@gmail.com> <99DB018B-346C-4F02-8D87-DBEF0D8997A7@gmail.com> <20090915073747.GD17789@phare.normalesup.org> Message-ID: > I just wanted to point out that there is an easy way of making a > difference, and making sure that the docstrings get fixed (which is > indeed very important). If you go to http://docs.scipy.org/ and register, > send your login name on this mailing list, we will add you to the list of > editors, and you will be able to edit easily the docstrings of scipy SVN. I actually tried to add np.genfromtxt to the docs via that inteface. Unfortunately, I couldn't recover my password. Are there chances that such a functionality can be integrated in the docs editor? Thanks, Timmie From timmichelsen at gmx-topmail.de Tue Sep 15 16:19:15 2009 From: timmichelsen at gmx-topmail.de (Tim Michelsen) Date: Tue, 15 Sep 2009 22:19:15 +0200 Subject: [Numpy-discussion] `missing` argument in genfromtxt only a string? In-Reply-To: <5BD5E255-0E1B-41BC-8FDC-2E63063F25CE@gmail.com> References: <5BD5E255-0E1B-41BC-8FDC-2E63063F25CE@gmail.com> Message-ID: > Check the archives of the mailing list, there's an example using > dateutil.parser that may be just what you need. How is this dateutil.parser used in timeseries? Can it not be used to make the dateconverter obsolte for the most simple cases? From ralf.gommers at googlemail.com Tue Sep 15 16:22:34 2009 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Tue, 15 Sep 2009 16:22:34 -0400 Subject: [Numpy-discussion] online Doc Editor [Re: `missing` argument in genfromtxt only a string?] In-Reply-To: References: <5BD5E255-0E1B-41BC-8FDC-2E63063F25CE@gmail.com> <99DB018B-346C-4F02-8D87-DBEF0D8997A7@gmail.com> <20090915073747.GD17789@phare.normalesup.org> Message-ID: On Tue, Sep 15, 2009 at 3:52 PM, Tim Michelsen wrote: > > I just wanted to point out that there is an easy way of making a > > difference, and making sure that the docstrings get fixed (which is > > indeed very important). If you go to http://docs.scipy.org/ and > register, > > send your login name on this mailing list, we will add you to the list of > > editors, and you will be able to edit easily the docstrings of scipy SVN. > I actually tried to add np.genfromtxt to the docs via that inteface. > It already exists in the docs: http://docs.scipy.org/numpy/docs/numpy.lib.io.genfromtxt/ Did you mean you tried to edit this page? Unfortunately, I couldn't recover my password. 
> > Are there chances that such a functionality can be integrated in the > docs editor? > There's a ticket for this functionality on the pydocweb tracker already. Hopefully it gets implemented at some point. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From pgmdevlist at gmail.com Tue Sep 15 16:28:27 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Tue, 15 Sep 2009 16:28:27 -0400 Subject: [Numpy-discussion] `missing` argument in genfromtxt only a string? In-Reply-To: References: <5BD5E255-0E1B-41BC-8FDC-2E63063F25CE@gmail.com> Message-ID: <2931EB2C-D350-4A21-824D-02E543F641AC@gmail.com> On Sep 15, 2009, at 4:19 PM, Tim Michelsen wrote: >> Check the archives of the mailing list, there's an example using >> dateutil.parser that may be just what you need. > How is this dateutil.parser used in timeseries? It's left in a corner. The use of dateutil.parser comes from matplotlib. (genfromtxt is nothing but an extension of mlab.csv2rec). > Can it not be used to make the dateconverter obsolte for the most > simple > cases? Not really. dateutil.parser outputs a datetime object, when we need Date objects in scikits.timeseries. Now, of course, you don't have to use scikits.timeseries, and then dateutil.parser can be useful. But it's an external module, and we should try to have as few dependancies as possible. From timmichelsen at gmx-topmail.de Tue Sep 15 16:30:35 2009 From: timmichelsen at gmx-topmail.de (Tim Michelsen) Date: Tue, 15 Sep 2009 22:30:35 +0200 Subject: [Numpy-discussion] `missing` argument in genfromtxt only a string? In-Reply-To: References: Message-ID: > I actually figured out a workaround with converters, since my missing > values are " "," "," " ie., irregular number of spaces and the > values aren't stripped of white spaces. I just define {# : lambda s: > float(s.strip() or 0)}, and I have a loop build all of the converters, > but then I have to go through and drop the ones that are supposed to > be strings or dates, which is still pretty tedious, since I have a > number of datasets that are like this, but they all contain different > data in different orders and there's no (computer) logical order to it > that I've discovered yet. Glad that you brought this up. I posted a similar question recently: http://thread.gmane.org/gmane.comp.python.numeric.general/32511 >>> All of the missing values in the second observation are now -1. Also, >>> I'm having trouble defining a converter for my dates. I had a lot of timeseries code developed before Pierre created the marvelous tsfromtxt after the numpy 1.3 upgrade. Now, you do not need the np.loadtxt => ts.time_series again. I only have to adapt the old code some day... I was actually thinking of creating a converter library. When you work with measurement logger data, hardly any data complys with python/numpy expected inputs. And they all think they do it for a reason. As an example, many count fours of the day from 1-24 instead of 0-23. Just to indicate that the values refer to the _end_ or the averaging interval... You can get around this. But it gets difficltier when the data is in 15min. time steps: 0:00:00 0:15:00 [...] 23:45:00 24:00:00 So each data set is individual in this sense. That really bothers me at time. We shall all thank for having genfromtxt and derived! 
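As an illustration of the kind of converter such a library could collect, here is a hedged sketch for the 1-24 hour convention described above; the function name and the ISO date format are assumptions, the only point is the 24:00:00 rollover:

from datetime import datetime, timedelta

def end_of_interval(datestr, timestr):
    # loggers that stamp intervals by their end time count hours 1-24,
    # so "24:00:00" really means 00:00:00 of the following day
    hours, minutes, seconds = [int(v) for v in timestr.split(':')]
    day = datetime.strptime(datestr.strip(), '%Y-%m-%d')
    return day + timedelta(hours=hours, minutes=minutes, seconds=seconds)

# end_of_interval('2009-09-15', '24:00:00') -> datetime(2009, 9, 16, 0, 0)
# end_of_interval('2009-09-15', '00:15:00') -> datetime(2009, 9, 15, 0, 15)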
From pgmdevlist at gmail.com Tue Sep 15 16:33:56 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Tue, 15 Sep 2009 16:33:56 -0400 Subject: [Numpy-discussion] online Doc Editor [Re: `missing` argument in genfromtxt only a string?] In-Reply-To: References: <5BD5E255-0E1B-41BC-8FDC-2E63063F25CE@gmail.com> <99DB018B-346C-4F02-8D87-DBEF0D8997A7@gmail.com> <20090915073747.GD17789@phare.normalesup.org> Message-ID: <4E5845EB-6FFB-4D82-B132-02C9378D3B82@gmail.com> On Sep 15, 2009, at 4:22 PM, Ralf Gommers wrote: > > There's a ticket for this functionality on the pydocweb tracker > already. Hopefully it gets implemented at some point. My bad, sorry. I already always forget to check tickets on the trac site for numpy/scipy, adding yet another site to check seems to be way too much for my caffeine-affected memory. However, there should be an option to allocate tickets to specific people, right ? And then I should received an email if I've been given a ticket, right ? In any case, all my apologies. I'll try to get on it ASAP (but Snow Leopard proves to be trickier to tame than I expected, and I have already couple of full plates waiting for me). From pgmdevlist at gmail.com Tue Sep 15 16:36:32 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Tue, 15 Sep 2009 16:36:32 -0400 Subject: [Numpy-discussion] `missing` argument in genfromtxt only a string? In-Reply-To: References: Message-ID: <9C78ED57-6FE5-4E74-9E83-867608EF2AC0@gmail.com> On Sep 15, 2009, at 4:30 PM, Tim Michelsen wrote: > We shall all thank for having genfromtxt and derived! You should really thank John Hunter, the original writer of mlab.csv2rec (I thnk). I just polished the code and add a few extra functionalities. From ralf.gommers at googlemail.com Tue Sep 15 16:58:35 2009 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Tue, 15 Sep 2009 16:58:35 -0400 Subject: [Numpy-discussion] online Doc Editor [Re: `missing` argument in genfromtxt only a string?] In-Reply-To: <4E5845EB-6FFB-4D82-B132-02C9378D3B82@gmail.com> References: <5BD5E255-0E1B-41BC-8FDC-2E63063F25CE@gmail.com> <99DB018B-346C-4F02-8D87-DBEF0D8997A7@gmail.com> <20090915073747.GD17789@phare.normalesup.org> <4E5845EB-6FFB-4D82-B132-02C9378D3B82@gmail.com> Message-ID: On Tue, Sep 15, 2009 at 4:33 PM, Pierre GM wrote: > > On Sep 15, 2009, at 4:22 PM, Ralf Gommers wrote: > > > > There's a ticket for this functionality on the pydocweb tracker > > already. Hopefully it gets implemented at some point. > > My bad, sorry. I already always forget to check tickets on the trac > site for numpy/scipy, adding yet another site to check seems to be way > too much for my caffeine-affected memory. However, there should be an > option to allocate tickets to specific people, right ? And then I > should received an email if I've been given a ticket, right ? > In any case, all my apologies. I'll try to get on it ASAP (but Snow > Leopard proves to be trickier to tame than I expected, and I have > already couple of full plates waiting for me). > That's the ticket for retrieving login passwords I was talking about. The pydocweb tracker is exclusively for the wiki itself so no need for you to check that. Once you have everything working under Snow Leopard, please share your wisdom! I was planning to start playing with it next weekend. ralf _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From timmichelsen at gmx-topmail.de Tue Sep 15 18:06:48 2009 From: timmichelsen at gmx-topmail.de (Tim Michelsen) Date: Wed, 16 Sep 2009 00:06:48 +0200 Subject: [Numpy-discussion] online Doc Editor [Re: `missing` argument in genfromtxt only a string?] In-Reply-To: References: <5BD5E255-0E1B-41BC-8FDC-2E63063F25CE@gmail.com> <99DB018B-346C-4F02-8D87-DBEF0D8997A7@gmail.com> <20090915073747.GD17789@phare.normalesup.org> Message-ID: > It already exists in the docs: > http://docs.scipy.org/numpy/docs/numpy.lib.io.genfromtxt/ > Did you mean you tried to edit this page? But cannot be found in here: http://docs.scipy.org/doc/numpy/reference/routines.io.html So it just needs to be added in http://docs.scipy.org/doc/numpy/_sources/reference/routines.io.txt From ralf.gommers at googlemail.com Tue Sep 15 18:19:34 2009 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Tue, 15 Sep 2009 18:19:34 -0400 Subject: [Numpy-discussion] online Doc Editor [Re: `missing` argument in genfromtxt only a string?] In-Reply-To: References: <5BD5E255-0E1B-41BC-8FDC-2E63063F25CE@gmail.com> <99DB018B-346C-4F02-8D87-DBEF0D8997A7@gmail.com> <20090915073747.GD17789@phare.normalesup.org> Message-ID: On Tue, Sep 15, 2009 at 6:06 PM, Tim Michelsen wrote: > > > It already exists in the docs: > > http://docs.scipy.org/numpy/docs/numpy.lib.io.genfromtxt/ > > Did you mean you tried to edit this page? > But cannot be found in here: > http://docs.scipy.org/doc/numpy/reference/routines.io.html > > So it just needs to be added in > http://docs.scipy.org/doc/numpy/_sources/reference/routines.io.txt > You're right, I added it here: http://docs.scipy.org/numpy/docs/numpy-docs/reference/routines.io.rst/ Note that it will only show up at the link you gave once the built docs are updated. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From timmichelsen at gmx-topmail.de Tue Sep 15 18:38:20 2009 From: timmichelsen at gmx-topmail.de (Tim Michelsen) Date: Wed, 16 Sep 2009 00:38:20 +0200 Subject: [Numpy-discussion] online Doc Editor [Re: `missing` argument in genfromtxt only a string?] In-Reply-To: <4E5845EB-6FFB-4D82-B132-02C9378D3B82@gmail.com> References: <5BD5E255-0E1B-41BC-8FDC-2E63063F25CE@gmail.com> <99DB018B-346C-4F02-8D87-DBEF0D8997A7@gmail.com> <20090915073747.GD17789@phare.normalesup.org> <4E5845EB-6FFB-4D82-B132-02C9378D3B82@gmail.com> Message-ID: > My bad, sorry. I already always forget to check tickets on the trac > site for numpy/scipy, adding yet another site to check seems to be way > too much for my caffeine-affected memory. However, there should be an > option to allocate tickets to specific people, right ? And then I > should received an email if I've been given a ticket, right ? Other projects use a mailing list where trac sends changes: e.g.: http://www.mail-archive.com/grass-dev at lists.osgeo.org/msg11399.html Maybe an option here? From charlesr.harris at gmail.com Tue Sep 15 23:11:25 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 15 Sep 2009 21:11:25 -0600 Subject: [Numpy-discussion] warning or error for non-physical multivariate_normal covariance matrices? 
In-Reply-To: <3d375d730909151126n3b1a9938x9b62af7ced9488b6@mail.gmail.com> References: <20090915133840.5f21b221.michael.s.gilbert@gmail.com> <3d375d730909151126n3b1a9938x9b62af7ced9488b6@mail.gmail.com> Message-ID: On Tue, Sep 15, 2009 at 12:26 PM, Robert Kern wrote: > On Tue, Sep 15, 2009 at 12:50, Charles R > Harris wrote: > > > > > > On Tue, Sep 15, 2009 at 11:38 AM, Michael Gilbert > > wrote: > >> > >> hi, > >> > >> when using numpy.random.multivariate_normal, would it make sense to warn > >> the user that they have entered a non-physical covariance matrix? i was > >> recently working on a problem and getting very strange results until i > >> finally realized that i had actually entered a bogus covariance matrix. > >> > >> its easy to determine when this is the case -- its when the > >> determinant of the covariance matrix is negative. i.e. the > >> multivariate normal distribution has det(C)^1/2 as part of the > >> normalization factor, so when det(C)<0, you end up with an imaginary > >> probability distribution. > >> > > > > Hmm, you mean it isn't implemented using a cholesky decomposition? That > > would (should) throw an error if the covariance isn't symmetric positive > > definite. > > We use the SVD to do the matrix square root. I believe I was just > following the older code that I was replacing. I have run into nearly > degenerate cases where det(C) ~ 0 such that the SVD method gave not > unreasonable answers, given the circumstances, while the Cholesky > decomposition gave an error "too soon" in my estimation. > > That's a bit dangerous for ill conditioned covariance matrices because the orthogonal matrices aren't guaranteed to be transposes of each other. In particular, if there are negative eigenvalues, due to roundoff error or incorrect user input, the square root is guaranteed to fail because the singular values have to be positive and this failure will pass unnoticed. It's possible that where cholesky failed the svd method was actually giving you wrong results. In any case eigh would be a safer choice if you don't want to use cholesky. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From david at ar.media.kyoto-u.ac.jp Wed Sep 16 02:56:05 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Wed, 16 Sep 2009 15:56:05 +0900 Subject: [Numpy-discussion] Moved matrix class into separate module Message-ID: <4AB08C05.50409@ar.media.kyoto-u.ac.jp> Hi, I just wanted to mention I integrated a patch from some time ago to make numpy.core independent from other numpy modules. This is really useful when working on involved changes at the C level. This meant moving some stuff around, in particular the matrix class and utilities is now into numpy.matrixclass module. The numpy namespace is of course unchanged. cheers, David From josef.pktd at gmail.com Wed Sep 16 04:25:26 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 16 Sep 2009 04:25:26 -0400 Subject: [Numpy-discussion] how can we concatenate structured arrays ? Message-ID: <1cd32cbb0909160125w4ce0eed3s4cee34ed95e1f149@mail.gmail.com> I have two structured arrays of different types. How can I horizontally concatenate the two arrays? Is there a direct way, or do I need to start from scratch? 
nobs = 10 testdata = np.random.randint(3, size=(nobs,4)).view([('a',int),('b',int),('c',int),('d',int)]) testdatacont = np.random.normal( size=(nobs,2)).view([('e',float), ('f',float)]) >>> np.hstack((testdata,testdatacont)) Traceback (most recent call last): File "C:\Programs\Python25\Lib\site-packages\numpy\lib\shape_base.py", line 505, in hstack return _nx.concatenate(map(atleast_1d,tup),1) TypeError: expected a readable buffer object >>> np.column_stack((testdata,testdatacont)) Traceback (most recent call last): File "C:\Programs\Python25\Lib\site-packages\numpy\lib\shape_base.py", line 552, in column_stack return _nx.concatenate(arrays,1) TypeError: expected a readable buffer object the following works, but looks like a big detour for a simple column_stack: >>> import numpy.lib.recfunctions >>> dt2 = numpy.lib.recfunctions.zip_descr((testdata,testdatacont),flatten=True) >>> joinedarr = np.array([tuple(i+j) for i,j in zip(testdata.base.tolist(), testdatacont.base.tolist())], dtype = dt2) >>> joinedarr.dtype dtype([('a', '>> np.column_stack((testdata.base,testdatacont.base)).dtype dtype('float64') Josef From pgmdevlist at gmail.com Wed Sep 16 04:42:09 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Wed, 16 Sep 2009 04:42:09 -0400 Subject: [Numpy-discussion] how can we concatenate structured arrays ? In-Reply-To: <1cd32cbb0909160125w4ce0eed3s4cee34ed95e1f149@mail.gmail.com> References: <1cd32cbb0909160125w4ce0eed3s4cee34ed95e1f149@mail.gmail.com> Message-ID: <534DE730-CBCF-4A57-A73E-03179F0E2BA8@gmail.com> On Sep 16, 2009, at 4:25 AM, josef.pktd at gmail.com wrote: > I have two structured arrays of different types. How can I > horizontally concatenate the two arrays? Is there a direct way, or do > I need to start from scratch? Check numpy.lib.recfunctions, that should get you started. From josef.pktd at gmail.com Wed Sep 16 04:45:30 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 16 Sep 2009 04:45:30 -0400 Subject: [Numpy-discussion] how can we concatenate structured arrays ? In-Reply-To: <1cd32cbb0909160125w4ce0eed3s4cee34ed95e1f149@mail.gmail.com> References: <1cd32cbb0909160125w4ce0eed3s4cee34ed95e1f149@mail.gmail.com> Message-ID: <1cd32cbb0909160145q44861f1fo15b433e8e5a87f8b@mail.gmail.com> On Wed, Sep 16, 2009 at 4:25 AM, wrote: > I have two structured arrays of different types. How can I > horizontally concatenate the two arrays? Is there a direct way, or do > I need to start from scratch? > > nobs = 10 > testdata = np.random.randint(3, > size=(nobs,4)).view([('a',int),('b',int),('c',int),('d',int)]) > testdatacont = np.random.normal( size=(nobs,2)).view([('e',float), ('f',float)]) > >>>> np.hstack((testdata,testdatacont)) > Traceback (most recent call last): > ?File "C:\Programs\Python25\Lib\site-packages\numpy\lib\shape_base.py", > line 505, in hstack > ? ?return _nx.concatenate(map(atleast_1d,tup),1) > TypeError: expected a readable buffer object > >>>> np.column_stack((testdata,testdatacont)) > Traceback (most recent call last): > ?File "C:\Programs\Python25\Lib\site-packages\numpy\lib\shape_base.py", > line 552, in column_stack > ? 
?return _nx.concatenate(arrays,1) > TypeError: expected a readable buffer object > > > the following works, but looks like a big detour for a simple column_stack: > >>>> import numpy.lib.recfunctions >>>> dt2 = numpy.lib.recfunctions.zip_descr((testdata,testdatacont),flatten=True) >>>> joinedarr = np.array([tuple(i+j) for i,j in zip(testdata.base.tolist(), testdatacont.base.tolist())], dtype = dt2) >>>> joinedarr.dtype > dtype([('a', ' ' > > if I want to convert the dtypes to float (which I don't want in this > case), then its easier > >>>> np.column_stack((testdata.base,testdatacont.base)).dtype > dtype('float64') > > > Josef > looping over column also works, this looks more efficient >>> tt = np.empty((10,1), dt2) >>> tt.shape (10, 1) >>> tt['a'].shape (10, 1) >>> testdata['a'].shape # has ndim=2 (10, 1) >>> for n in testdata.dtype.names: tt[n] = testdata[n] ... >>> for n in testdatacont.dtype.names: tt[n] = testdatacont[n] ... >>> tt array([[(2, 0, 1, 1, 0.61282791440084505, 0.29305903681720574)], [(1, 1, 1, 2, -1.5331947180856178, -0.62794592132997662)], [(1, 0, 1, 1, 0.34850521437127446, -0.71435625605096553)], [(2, 1, 2, 1, -0.035021646994300569, 0.14235131301077331)], [(2, 0, 2, 0, -0.072940874291085214, 1.257392635986091)], [(1, 0, 1, 0, 0.19764464613444582, 3.1907154468379528)], [(1, 2, 2, 1, 1.0584100502205742, -1.8249604812902063)], [(1, 1, 0, 0, -0.1580364093187942, 0.0314819593087034)], [(1, 2, 2, 0, -2.0938485304115289, 1.0133998231900494)], [(0, 2, 0, 0, 0.042563869142945909, 1.2643518145105357)]], dtype=[('a', ' References: <1cd32cbb0909160125w4ce0eed3s4cee34ed95e1f149@mail.gmail.com> <534DE730-CBCF-4A57-A73E-03179F0E2BA8@gmail.com> Message-ID: <1cd32cbb0909160210ocd691dbob2adde21e6ceb2a3@mail.gmail.com> On Wed, Sep 16, 2009 at 4:42 AM, Pierre GM wrote: > > On Sep 16, 2009, at 4:25 AM, josef.pktd at gmail.com wrote: > >> I have two structured arrays of different types. How can I >> horizontally concatenate the two arrays? Is there a direct way, or do >> I need to start from scratch? > > Check numpy.lib.recfunctions, that should get you started. Thanks, I was doing that, but without reading through every option of every function, they all seem to do something more complicated, like merge or joinby, append_fields doesn't look very convenient, adds one by one looking some more recursive_fill_fields seems also to do the loop (but also handles nested structured arrays) tt = np.empty((10,1),dt2) numpy.lib.recfunctions.recursive_fill_fields(testdata, tt) numpy.lib.recfunctions.recursive_fill_fields(testdatacont, tt) So, I guess the answer is, loop over columns of structured arrays Josef > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From jsseabold at gmail.com Wed Sep 16 09:58:12 2009 From: jsseabold at gmail.com (Skipper Seabold) Date: Wed, 16 Sep 2009 09:58:12 -0400 Subject: [Numpy-discussion] how can we concatenate structured arrays ? In-Reply-To: <1cd32cbb0909160210ocd691dbob2adde21e6ceb2a3@mail.gmail.com> References: <1cd32cbb0909160125w4ce0eed3s4cee34ed95e1f149@mail.gmail.com> <534DE730-CBCF-4A57-A73E-03179F0E2BA8@gmail.com> <1cd32cbb0909160210ocd691dbob2adde21e6ceb2a3@mail.gmail.com> Message-ID: On Wed, Sep 16, 2009 at 5:10 AM, wrote: > On Wed, Sep 16, 2009 at 4:42 AM, Pierre GM wrote: >> >> On Sep 16, 2009, at 4:25 AM, josef.pktd at gmail.com wrote: >> >>> I have two structured arrays of different types. 
How can I >>> horizontally concatenate the two arrays? Is there a direct way, or do >>> I need to start from scratch? >> >> Check numpy.lib.recfunctions, that should get you started. > > Thanks, I was doing that, but without reading through every option > of every function, they all seem to do something more > complicated, like merge or joinby, > append_fields doesn't look very convenient, adds one by one > > looking some more recursive_fill_fields seems also to do the > loop (but also handles nested structured arrays) > > tt = np.empty((10,1),dt2) > numpy.lib.recfunctions.recursive_fill_fields(testdata, tt) > numpy.lib.recfunctions.recursive_fill_fields(testdatacont, tt) > > So, I guess the answer is, loop over columns of structured arrays > > Josef > I spent a lot of time trying to do this for categorical in statsmodels when we have to append the array of dummies to a mixed dtype structured array. recfunctions.append_fields was the best I could come up with. Skipper From chanley at stsci.edu Wed Sep 16 11:32:37 2009 From: chanley at stsci.edu (Christopher Hanley) Date: Wed, 16 Sep 2009 11:32:37 -0400 Subject: [Numpy-discussion] Moved matrix class into separate module In-Reply-To: <4AB08C05.50409@ar.media.kyoto-u.ac.jp> References: <4AB08C05.50409@ar.media.kyoto-u.ac.jp> Message-ID: <304AD265-29C6-43A9-B9E2-1C60C05AB51B@stsci.edu> Hi, When I try running the tests on a fresh build from the trunk I receive 28 errors. Most of the errors are of the form: "NameError: global name 'matrix' is not defined" It looks like there was some change to the numpy namespace. I can provide a full listing of the unit test errors if desired. This is on my MacBook Pro running Python 2.5 on OS X 10.5. Chris -- Christopher Hanley Senior Systems Software Engineer Space Telescope Science Institute 3700 San Martin Drive Baltimore MD, 21218 (410) 338-4338 On Sep 16, 2009, at 2:56 AM, David Cournapeau wrote: > Hi, > > I just wanted to mention I integrated a patch from some time ago to > make > numpy.core independent from other numpy modules. This is really useful > when working on involved changes at the C level. This meant moving > some > stuff around, in particular the matrix class and utilities is now into > numpy.matrixclass module. The numpy namespace is of course unchanged. > > cheers, > > David > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From cournape at gmail.com Wed Sep 16 11:51:01 2009 From: cournape at gmail.com (David Cournapeau) Date: Thu, 17 Sep 2009 00:51:01 +0900 Subject: [Numpy-discussion] Moved matrix class into separate module In-Reply-To: <304AD265-29C6-43A9-B9E2-1C60C05AB51B@stsci.edu> References: <4AB08C05.50409@ar.media.kyoto-u.ac.jp> <304AD265-29C6-43A9-B9E2-1C60C05AB51B@stsci.edu> Message-ID: <5b8d13220909160851o2d0ce294r967c6b54ba43a6cd@mail.gmail.com> On Thu, Sep 17, 2009 at 12:32 AM, Christopher Hanley wrote: > Hi, > > When I try running the tests on a fresh build from the trunk I receive > 28 errors. ?Most of the errors are of the form: > > "NameError: global name 'matrix' is not defined" > > It looks like ?there was some change to the numpy namespace. ?I can > provide a full listing of the unit test errors if desired. Yes please - I do not see those errors on both mac os x and linux. You should also make sure that your svn checkout, build and install directories are not polluted by old cruft. 
David From chanley at stsci.edu Wed Sep 16 12:07:24 2009 From: chanley at stsci.edu (Christopher Hanley) Date: Wed, 16 Sep 2009 12:07:24 -0400 Subject: [Numpy-discussion] Moved matrix class into separate module In-Reply-To: <5b8d13220909160851o2d0ce294r967c6b54ba43a6cd@mail.gmail.com> References: <4AB08C05.50409@ar.media.kyoto-u.ac.jp> <304AD265-29C6-43A9-B9E2-1C60C05AB51B@stsci.edu> <5b8d13220909160851o2d0ce294r967c6b54ba43a6cd@mail.gmail.com> Message-ID: My apologizes. I had remembered to remove the previous build directory but not the target installation directory. After having removed all traces of the previous numpy installation and do a clean install I receive no new errors. Sorry for the false alarm. Chris -- Christopher Hanley Senior Systems Software Engineer Space Telescope Science Institute 3700 San Martin Drive Baltimore MD, 21218 (410) 338-4338 On Sep 16, 2009, at 11:51 AM, David Cournapeau wrote: > On Thu, Sep 17, 2009 at 12:32 AM, Christopher Hanley > wrote: >> Hi, >> >> When I try running the tests on a fresh build from the trunk I >> receive >> 28 errors. Most of the errors are of the form: >> >> "NameError: global name 'matrix' is not defined" >> >> It looks like there was some change to the numpy namespace. I can >> provide a full listing of the unit test errors if desired. > > Yes please - I do not see those errors on both mac os x and linux. You > should also make sure that your svn checkout, build and install > directories are not polluted by old cruft. > > David > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From nwagner at iam.uni-stuttgart.de Wed Sep 16 12:39:27 2009 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Wed, 16 Sep 2009 18:39:27 +0200 Subject: [Numpy-discussion] NameError: global name 'matrix' is not defined Message-ID: Ran 2235 tests in 25.593s FAILED (KNOWNFAIL=1, errors=28, failures=1) >>> import numpy >>> numpy.__version__ '1.4.0.dev7400' ====================================================================== ERROR: test_basic (test_defmatrix.TestAlgebra) ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/nwagner/local/lib64/python2.6/site-packages/numpy/core/tests/test_defmatrix.py", line 190, in test_basic mA = matrix(A) NameError: global name 'matrix' is not defined From robert.kern at gmail.com Wed Sep 16 12:41:15 2009 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 16 Sep 2009 11:41:15 -0500 Subject: [Numpy-discussion] NameError: global name 'matrix' is not defined In-Reply-To: References: Message-ID: <3d375d730909160941o61c89b71w80743ad90b3d0cce@mail.gmail.com> On Wed, Sep 16, 2009 at 11:39, Nils Wagner wrote: > Ran 2235 tests in 25.593s > > FAILED (KNOWNFAIL=1, errors=28, failures=1) > >>>> import numpy >>>> numpy.__version__ > '1.4.0.dev7400' > > ====================================================================== > ERROR: test_basic (test_defmatrix.TestAlgebra) > ---------------------------------------------------------------------- > Traceback (most recent call last): > ? File > "/home/nwagner/local/lib64/python2.6/site-packages/numpy/core/tests/test_defmatrix.py", > line 190, in test_basic > ? ? mA = matrix(A) > NameError: global name 'matrix' is not defined Clean out old files before reinstalling. 
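A quick sanity check for that kind of stale install is simply to look at which copy gets imported; if the path or version below is not the one just built, an old installation is still shadowing it:

import numpy
print(numpy.__version__)
print(numpy.__file__)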
-- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From nwagner at iam.uni-stuttgart.de Wed Sep 16 12:52:53 2009 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Wed, 16 Sep 2009 18:52:53 +0200 Subject: [Numpy-discussion] NameError: global name 'matrix' is not defined In-Reply-To: <3d375d730909160941o61c89b71w80743ad90b3d0cce@mail.gmail.com> References: <3d375d730909160941o61c89b71w80743ad90b3d0cce@mail.gmail.com> Message-ID: On Wed, 16 Sep 2009 11:41:15 -0500 Robert Kern wrote: > On Wed, Sep 16, 2009 at 11:39, Nils >Wagner wrote: >> Ran 2235 tests in 25.593s >> >> FAILED (KNOWNFAIL=1, errors=28, failures=1) >> >failures=1> >>>>> import numpy >>>>> numpy.__version__ >> '1.4.0.dev7400' >> >> ====================================================================== >> ERROR: test_basic (test_defmatrix.TestAlgebra) >> ---------------------------------------------------------------------- >> Traceback (most recent call last): >> ? File >> "/home/nwagner/local/lib64/python2.6/site-packages/numpy/core/tests/test_defmatrix.py", >> line 190, in test_basic >> ? ? mA = matrix(A) >> NameError: global name 'matrix' is not defined > > Clean out old files before reinstalling. > > -- > Robert Kern Thank you very much. Works for me ... rm -rf /home/nwagner/local/lib64/python2.6/site-packages/numpy Ran 2196 tests in 16.713s OK (KNOWNFAIL=1) Nils From eadrogue at gmx.net Wed Sep 16 17:36:42 2009 From: eadrogue at gmx.net (Ernest =?iso-8859-1?Q?Adrogu=E9?=) Date: Wed, 16 Sep 2009 23:36:42 +0200 Subject: [Numpy-discussion] array multiplication Message-ID: <20090916213642.GA4144@doriath.local> Hi, I have two 1-d arrays (a and b), and I want to create a third 2-d array, whose rows are of the form a[i]*b: c = np.zeros((len(a),b)) c[0] = a[0]*b c[1] = a[1]*b . . . Is there an easy way to do this (e.g, without a loop)? Thanks! -- Ernest From robert.kern at gmail.com Wed Sep 16 17:40:00 2009 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 16 Sep 2009 16:40:00 -0500 Subject: [Numpy-discussion] array multiplication In-Reply-To: <20090916213642.GA4144@doriath.local> References: <20090916213642.GA4144@doriath.local> Message-ID: <3d375d730909161440l371f72eeic5cf061173b0df91@mail.gmail.com> 2009/9/16 Ernest Adrogu? : > Hi, > > I have two 1-d arrays (a and b), and I want to create a > third 2-d array, whose rows are of the form a[i]*b: > > c = np.zeros((len(a),b)) > > c[0] = a[0]*b > c[1] = a[1]*b > . > . > . > > Is there an easy way to do this (e.g, without a loop)? c = a[:,np.newaxis] * b http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From gokhansever at gmail.com Wed Sep 16 20:22:52 2009 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_Sever?=) Date: Wed, 16 Sep 2009 19:22:52 -0500 Subject: [Numpy-discussion] Simple pattern recognition Message-ID: <49d6b3500909161722r6f74cce6j515b756c2b0b78c5@mail.gmail.com> Hello all, I want to be able to count predefined simple rectangle shapes on an image as shown like in this one: http://img7.imageshack.us/img7/2327/particles.png Which is in my case to count all the blue pixels (they are ice-snow flake shadows in reality) in one of the column. 
What is the way to automate this task, which library or technique should I study to tackle it. Thanks. -- G?khan -------------- next part -------------- An HTML attachment was scrubbed... URL: From aisaac at american.edu Wed Sep 16 20:53:03 2009 From: aisaac at american.edu (Alan G Isaac) Date: Wed, 16 Sep 2009 20:53:03 -0400 Subject: [Numpy-discussion] Simple pattern recognition In-Reply-To: <49d6b3500909161722r6f74cce6j515b756c2b0b78c5@mail.gmail.com> References: <49d6b3500909161722r6f74cce6j515b756c2b0b78c5@mail.gmail.com> Message-ID: <4AB1886F.4020501@american.edu> On 9/16/2009 8:22 PM, G?khan Sever wrote: > I want to be able to count predefined simple rectangle shapes on an > image as shown like in this one: > http://img7.imageshack.us/img7/2327/particles.png ch.9 of http://www.amazon.com/Beginning-Python-Visualization-Transformation-Professionals/dp/1430218436 hth, Alan Isaac From dwf at cs.toronto.edu Wed Sep 16 21:43:15 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Wed, 16 Sep 2009 21:43:15 -0400 Subject: [Numpy-discussion] Simple pattern recognition In-Reply-To: <49d6b3500909161722r6f74cce6j515b756c2b0b78c5@mail.gmail.com> References: <49d6b3500909161722r6f74cce6j515b756c2b0b78c5@mail.gmail.com> Message-ID: On 16-Sep-09, at 8:22 PM, G?khan Sever wrote: > Hello all, > > I want to be able to count predefined simple rectangle shapes on an > image as > shown like in this one: http://img7.imageshack.us/img7/2327/particles.png > > Which is in my case to count all the blue pixels (they are ice-snow > flake > shadows in reality) in one of the column. > > What is the way to automate this task, which library or technique > should I > study to tackle it. Hey Gokhan, Well, scipy.ndimage.label() will be handy once you extract those columns one by one. Is one contiguous blue region considered one object? In that case, you'd be done. Once you've run label() you can use scipy.ndimage.find_objects() to get slices into the entire column that contain the contiguous region. If you want to try and count individual rectangles that may overlap, there are likely dynamic programming algorithms that can find the biggest rectangles. The truth is you can probably even do something pretty naive and then compile it with Cython and it'll go blazing fast. If you can be more specific about the kinds of "predefined simple rectangle shapes" we can probably tell you more. David From nadavh at visionsense.com Thu Sep 17 01:39:52 2009 From: nadavh at visionsense.com (Nadav Horesh) Date: Thu, 17 Sep 2009 08:39:52 +0300 Subject: [Numpy-discussion] array multiplication References: <20090916213642.GA4144@doriath.local> <3d375d730909161440l371f72eeic5cf061173b0df91@mail.gmail.com> Message-ID: <710F2847B0018641891D9A21602763605AD172@ex3.envision.co.il> Or np.multiply.outer(a,b) Nadav -----????? ??????----- ???: numpy-discussion-bounces at scipy.org ??? Robert Kern ????: ? 17-??????-09 00:40 ??: Discussion of Numerical Python ????: Re: [Numpy-discussion] array multiplication 2009/9/16 Ernest Adrogu? : > Hi, > > I have two 1-d arrays (a and b), and I want to create a > third 2-d array, whose rows are of the form a[i]*b: > > c = np.zeros((len(a),b)) > > c[0] = a[0]*b > c[1] = a[1]*b > . > . > . > > Is there an easy way to do this (e.g, without a loop)? 
c = a[:,np.newaxis] * b http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: winmail.dat Type: application/ms-tnef Size: 3240 bytes Desc: not available URL: From ckkart at hoc.net Thu Sep 17 02:58:22 2009 From: ckkart at hoc.net (Christian K.) Date: Thu, 17 Sep 2009 06:58:22 +0000 (UTC) Subject: [Numpy-discussion] matlab for numpy users Message-ID: Hi, this is probaby an unusual question here from someone used to numpy who is forced to work with matlab and it is not exactly the right place to ask. Sorry for that. Is there something like broadcasting in matlab? E.g. how can I do something like that: a = ones((50,50), dtype=float) time = linspace(0,1,101) res = a*exp(-time[:,newaxis,newaxis]) Thanks in advance, Christian From david at ar.media.kyoto-u.ac.jp Thu Sep 17 02:41:48 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Thu, 17 Sep 2009 15:41:48 +0900 Subject: [Numpy-discussion] matlab for numpy users In-Reply-To: References: Message-ID: <4AB1DA2C.4040504@ar.media.kyoto-u.ac.jp> Hi Christian, Christian K. wrote: > Hi, > > this is probaby an unusual question here from someone used to numpy who is > forced to work with matlab and it is not exactly the right place to ask. Sorry > for that. > > Is there something like broadcasting in matlab? Not really (except for trivial things like scalar-matrix operations). The usual way to do it in matlab is repmat, which helps you doing 'manual broadcasting'. cheers, David From sturla at molden.no Thu Sep 17 05:58:20 2009 From: sturla at molden.no (Sturla Molden) Date: Thu, 17 Sep 2009 11:58:20 +0200 Subject: [Numpy-discussion] matlab for numpy users In-Reply-To: References: Message-ID: <4AB2083C.1060106@molden.no> Christian K. skrev: > Is there something like broadcasting in matlab? Matlab does not do automatic broadcasting like NumPy and Fortran 95. You have to broadcast manually, mostly using repmat (but there are other ways as well). This should help: http://home.online.no/~pjacklam/matlab/doc/mtt/doc/mtt.pdf Sturla Molden From gokhansever at gmail.com Thu Sep 17 11:14:30 2009 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_Sever?=) Date: Thu, 17 Sep 2009 10:14:30 -0500 Subject: [Numpy-discussion] Simple pattern recognition In-Reply-To: References: <49d6b3500909161722r6f74cce6j515b756c2b0b78c5@mail.gmail.com> Message-ID: <49d6b3500909170814q179b3f83pa54e46eb5b66d4c1@mail.gmail.com> On Wed, Sep 16, 2009 at 8:43 PM, David Warde-Farley wrote: > On 16-Sep-09, at 8:22 PM, G?khan Sever wrote: > > > Hello all, > > > > I want to be able to count predefined simple rectangle shapes on an > > image as > > shown like in this one: > http://img7.imageshack.us/img7/2327/particles.png > > > > Which is in my case to count all the blue pixels (they are ice-snow > > flake > > shadows in reality) in one of the column. > > > > What is the way to automate this task, which library or technique > > should I > > study to tackle it. > > Hey Gokhan, > Hello David, How are you? > > Well, scipy.ndimage.label() will be handy once you extract those > columns one by one. 
Is one contiguous blue region considered one > object? In that case, you'd be done. > > That's true, the blue contiguous regions are the ones that I want to extract from the image. See my new screenshot http://img514.imageshack.us/img514/3416/particles2.png GIMP's fuzzy selection tool (4th one from the top on the toolbox) selects those areas (the background image in the screenshot) perfectly. I copied those regions and put them on a new canvas using a grayscale colorspace. The foreground image zooms in on the top portion of the extracted regions. Using the pen tool and the tiniest size I outlined the area that I am interested in. We were asked to measure the width and height of each similar ice particle shadow (which are just those contiguous extractions). For the sake of simplicity we will just average the width and height to use as a radius to later calculate particle volumes. (Hah, wondering how else it would be possible to calculate such an irregularly shaped particle volume if we weren't making this assumption. Any ideas? Particle re-construction, edge smoothing -- that would be a topic for a master's or PhD thesis, I think.) So it seems like what was done with the GIMP could be automated with the approach you mentioned. I use PIL to read my png file (after cropping the initial image to the column of my interest), like: from PIL import Image myim = Image.open('seccol.png') imdata = np.array(myim.getdata()) From this on, I am not sure what to provide to the structure parameter label(input, structure=None, output=None) Am I correct on the PIL usage, or would a simple binary read work, too? > Once you've run label() you can use scipy.ndimage.find_objects() to > get slices into the entire column that contain the contiguous region. > > If you want to try and count individual rectangles that may overlap, > there are likely dynamic programming algorithms that can find the > biggest rectangles. The truth is you can probably even do something > pretty naive and then compile it with Cython and it'll go blazing fast. > Uhh, this might be a little too complicated for this project. I don't think I will need Cython, but that depends on how far this expands. > > If you can be more specific about the kinds of "predefined simple > rectangle shapes" we can probably tell you more. > Looking once again, they are not predefined. I was thinking of looping through each segment based on the smallest rectangle in the figure. The first paragraph of this reply should make my intention clear. Thanks for your reply. Hope your thesis work is going well or has been successfully completed :) -- Gökhan -------------- next part -------------- An HTML attachment was scrubbed...
URL: From gokhansever at gmail.com Thu Sep 17 11:15:34 2009 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_Sever?=) Date: Thu, 17 Sep 2009 10:15:34 -0500 Subject: [Numpy-discussion] Simple pattern recognition In-Reply-To: <4AB1886F.4020501@american.edu> References: <49d6b3500909161722r6f74cce6j515b756c2b0b78c5@mail.gmail.com> <4AB1886F.4020501@american.edu> Message-ID: <49d6b3500909170815l32446613jfc1ac5e441b7f31f@mail.gmail.com> On Wed, Sep 16, 2009 at 7:53 PM, Alan G Isaac wrote: > On 9/16/2009 8:22 PM, G?khan Sever wrote: > > I want to be able to count predefined simple rectangle shapes on an > > image as shown like in this one: > > http://img7.imageshack.us/img7/2327/particles.png > > ch.9 of > http://www.amazon.com/Beginning-Python-Visualization-Transformation-Professionals/dp/1430218436 > > hth, > Alan Isaac > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > Thanks for the suggestion. I will try to find a copy. -- G?khan -------------- next part -------------- An HTML attachment was scrubbed... URL: From gokhansever at gmail.com Thu Sep 17 11:17:25 2009 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_Sever?=) Date: Thu, 17 Sep 2009 10:17:25 -0500 Subject: [Numpy-discussion] [Matplotlib-users] Simple pattern recognition In-Reply-To: <417707D1-0857-4BA7-824A-E1D6292BC36A@mit.edu> References: <49d6b3500909161722r6f74cce6j515b756c2b0b78c5@mail.gmail.com> <417707D1-0857-4BA7-824A-E1D6292BC36A@mit.edu> Message-ID: <49d6b3500909170817l6ff52c7fye667f24a2e60b951@mail.gmail.com> On Wed, Sep 16, 2009 at 10:46 PM, Tony S Yu wrote: > > On Sep 16, 2009, at 8:22 PM, G?khan Sever wrote: > > Hello all, > > I want to be able to count predefined simple rectangle shapes on an image > as shown like in this one: > http://img7.imageshack.us/img7/2327/particles.png > > Which is in my case to count all the blue pixels (they are ice-snow flake > shadows in reality) in one of the column. > > What is the way to automate this task, which library or technique should I > study to tackle it. > > > You should check out the ndimage subpackage in scipy. > > This tutorial should help you get started: > http://docs.scipy.org/doc/scipy/reference/tutorial/ndimage.html > The section on "segmentation and labeling" will be particularly useful for > you. > > Best, > -Tony > > > Right into the eye. "Segmentation is the process of separating objects of interest from the background." Once I finish this task, I should be able to count the occurrences of the shadowed instances so that I would have an idea regarding to their sizes. -- G?khan -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at googlemail.com Thu Sep 17 11:21:43 2009 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Thu, 17 Sep 2009 11:21:43 -0400 Subject: [Numpy-discussion] defmatrix move - new docstrings disappeared Message-ID: Hi, After the recent move of the matrix module, all the changes to the docstrings have disappeared from the doc wiki. I imagine the changes still live somewhere in the underlying repo but are not visible to the user. Can they be restored? If there is some unforeseen problem with that, I do have a recent patch saved that contains the latest edits (but of course it won't apply cleanly). 
I saved that patch from the wiki because I was worried about the large number of unmerged changes, there are about 500 docstrings with changes right now (close to 50% of all important docstrings). Would it make sense to make sure (maybe with an automatic reminder email) that changes from the wiki get merged if there are enough changes - 100 changed docstrings or changes older than two months, say? I know it is quite a bit of work to do the merge. I remember Pauli saying that most of the work was reviewing all changes to make sure they are an improvement over the current svn version. Is that correct? I can help with that if necessary. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From scott.sinclair.za at gmail.com Thu Sep 17 12:19:30 2009 From: scott.sinclair.za at gmail.com (Scott Sinclair) Date: Thu, 17 Sep 2009 18:19:30 +0200 Subject: [Numpy-discussion] defmatrix move - new docstrings disappeared In-Reply-To: References: Message-ID: <6a17e9ee0909170919j252e292fud7e074d2192a38f4@mail.gmail.com> > 2009/9/17 Ralf Gommers : > After the recent move of the matrix module, all the changes to the docstrings have disappeared from the doc wiki. Hmm.. http://article.gmane.org/gmane.comp.python.scientific.devel/9732 ;-) > I know it is quite a bit of work to do the merge. I remember Pauli saying > that most of the work was reviewing all changes to make sure they are an > improvement over the current svn version. Is that correct? I can help with > that if necessary. It's probably important that the documentation patches should be committed pretty soon after being reviewed for obvious malicious code and marked "OK to Apply". It's possible to edit docstrings that are marked as "OK to apply", without this flag being removed. If none of the current commiters have time to commit doc patches on demand, then perhaps it makes sense to give commit access to someone working actively on the documentation (Ralf, DavidG ?). This could be on the understanding that only doc patches would be commited by this person. It's always going to be a lot of work when we let the trunk and doc-editor get too far out of sync. Cheers, Scott From oliphant at enthought.com Thu Sep 17 12:29:19 2009 From: oliphant at enthought.com (Travis Oliphant) Date: Thu, 17 Sep 2009 11:29:19 -0500 Subject: [Numpy-discussion] datetime-related import slowdown In-Reply-To: <4AA4BCD6.3040304@ar.media.kyoto-u.ac.jp> References: <4AA4BCD6.3040304@ar.media.kyoto-u.ac.jp> Message-ID: On Sep 7, 2009, at 2:57 AM, David Cournapeau wrote: > Hi, > > I noticed that numpy import times significantly are significantly > worse than it used to be, and those are related to recent datetime > related changes: > > # One month ago > time python -c "import numpy" -> 141ms > > # Now: > time python -c "import numpy" -> 202ms > > Using bzr import profiler, most of the slowdown comes from > mx_datetime, > and I guess all the regex compilation within (each re.compile takes > several ms each in there). Is there a way to make this faster, at > least > for people not using datetime ? Yes, I'm sure we can move any imports into functions that need them. -Travis -- Travis Oliphant Enthought Inc. 
1-512-536-1057 http://www.enthought.com oliphant at enthought.com From ralf.gommers at googlemail.com Thu Sep 17 13:07:09 2009 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Thu, 17 Sep 2009 13:07:09 -0400 Subject: [Numpy-discussion] defmatrix move - new docstrings disappeared In-Reply-To: <6a17e9ee0909170919j252e292fud7e074d2192a38f4@mail.gmail.com> References: <6a17e9ee0909170919j252e292fud7e074d2192a38f4@mail.gmail.com> Message-ID: On Thu, Sep 17, 2009 at 12:19 PM, Scott Sinclair wrote: > > 2009/9/17 Ralf Gommers : > > After the recent move of the matrix module, all the changes to the > docstrings have disappeared from the doc wiki. > > Hmm.. http://article.gmane.org/gmane.comp.python.scientific.devel/9732 ;-) > > > > I know it is quite a bit of work to do the merge. I remember Pauli saying > > that most of the work was reviewing all changes to make sure they are an > > improvement over the current svn version. Is that correct? I can help > with > > that if necessary. > > It's probably important that the documentation patches should be > committed pretty soon after being reviewed for obvious malicious code > and marked "OK to Apply". It's possible to edit docstrings that are > marked as "OK to apply", without this flag being removed. > Agreed. However the wiki page for a moved docstring will still lose the history, comments, etc even if patches are committed. So the wiki should give a warning if it sees an object disappearing from svn instead of simply hiding the relevant page. > > If none of the current commiters have time to commit doc patches on > demand, then perhaps it makes sense to give commit access to someone > working actively on the documentation (Ralf, DavidG ?). This could be > on the understanding that only doc patches would be commited by this > person. > It would also be possible for me to publish a git repo with doc changes and ping a developer to pull from there. That way I could do 98% of the work without needing commit rights. Cheers, Ralf > It's always going to be a lot of work when we let the trunk and > doc-editor get too far out of sync. > > Cheers, > Scott > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gokhansever at gmail.com Thu Sep 17 13:30:53 2009 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_Sever?=) Date: Thu, 17 Sep 2009 12:30:53 -0500 Subject: [Numpy-discussion] Simple pattern recognition In-Reply-To: <49d6b3500909170814q179b3f83pa54e46eb5b66d4c1@mail.gmail.com> References: <49d6b3500909161722r6f74cce6j515b756c2b0b78c5@mail.gmail.com> <49d6b3500909170814q179b3f83pa54e46eb5b66d4c1@mail.gmail.com> Message-ID: <49d6b3500909171030o4b3deac7x1d5064fc70248f8@mail.gmail.com> On Thu, Sep 17, 2009 at 10:14 AM, G?khan Sever wrote: > > I use PIL to read my png file (after cropped the initial image to the > column of my interest) Like: > > from PIL import Image > myim = Image('seccol.png) > imdata = np.array(myim.getdata()) > > From this on, I am not sure what to provide to the structure parameter > > label(input, structure=None, output=None) > > I am correct on PIL usage or just a simple binary ready work, too? > > Confirming myself :) This is the right approach to take. However this is only the beginning. 
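A rough sketch of where I want to take it, with the file name, the grayscale threshold and the width/height averaging as assumptions of mine rather than anything fixed:

import numpy as np
from PIL import Image
from scipy import ndimage

# np.asarray keeps the 2-D image shape, unlike getdata() which flattens it.
im = np.asarray(Image.open('seccol.png').convert('L'))

# Assumed threshold: dark (shadowed) pixels become True.
mask = im < 128

# With structure=None, label() uses the default cross-shaped element,
# i.e. 4-connectivity; np.ones((3, 3)) would merge diagonal neighbours too.
labels, nshadows = ndimage.label(mask)

# One (row-slice, col-slice) bounding box per labelled shadow.
sizes = []
for rows, cols in ndimage.find_objects(labels):
    height = rows.stop - rows.start
    width = cols.stop - cols.start
    sizes.append(0.5 * (height + width))   # averaged into a single radius

print("%d shadow regions, mean size %.1f" % (nshadows, np.mean(sizes)))

Whether diagonally touching pixels should count as one shadow or as two is exactly what the structure argument controls.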
> > >> Once you've run label() you can use scipy.ndimage.find_objects() to >> get slices into the entire column that contain the contiguous region. >> >> If you want to try and count individual rectangles that may overlap, >> there are likely dynamic programming algorithms that can find the >> biggest rectangles. The truth is you can probably even do something >> pretty naive and then compile it with Cython and it'll go blazing fast. >> > I used find_object() correctly as well. Again, this query turns to not so simple object counting :) Because I should be able identify complex objects in that labeled array (ie. whether a single shadow or a part of compound object; look up, down, right, left neighbourhoods of each object to see it is by itself or not). And after such identification, I should get the maximum height and widths of each object. This is somewhat easier but the identification seems like a bit challenging for me. (many loops and some intelligent logic needs to applied). I am aware this is a very good algorithm design exercise, however I would be happy to hear if any similar work has been done before, since this is just the tip of my lab report. Thanks again. -- G?khan -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Thu Sep 17 13:48:57 2009 From: pav at iki.fi (Pauli Virtanen) Date: Thu, 17 Sep 2009 20:48:57 +0300 Subject: [Numpy-discussion] defmatrix move - new docstrings disappeared In-Reply-To: References: Message-ID: <1253209737.5581.80.camel@idol> to, 2009-09-17 kello 11:21 -0400, Ralf Gommers kirjoitti: > After the recent move of the matrix module, all the changes to the > docstrings have disappeared from the doc wiki. I imagine the changes > still live somewhere in the underlying repo but are not visible to the > user. Can they be restored? If there is some unforeseen problem with > that, I do have a recent patch saved that contains the latest edits > (but of course it won't apply cleanly). They are not lost (the app is designed so that things are not easily lost), and can be found by writing the old full URL: http://docs.scipy.org/numpy/docs/numpy.core.defmatrix/ The easiest way to make the transition is to copy the docstrings via SQL on the server. I just copied the numpy.core.defmatrix.* docstring revisions and comments to numpy.matrx.defmatrix.*. For reference, it can be done with two SQL statements on the database: insert or ignore into docweb_docstringrevision (docstring_id, text, author, comment, timestamp, review, ok_to_apply) select 'numpy.matrx.defmatrix' || substr(docstring_id, 21, length(docstring_id)), text, author, comment, timestamp, review, ok_to_apply from docweb_docstringrevision where docstring_id like 'numpy.core.defmatrix%'; insert or ignore into docweb_reviewcomment (docstring_id, text, author, timestamp, resolved) select 'numpy.matrx.defmatrix' || substr(docstring_id, 21, length(docstring_id)), text, author, timestamp, resolved from docweb_reviewcomment where docstring_id like 'numpy.core.defmatrix%'; Moving would have been two one-liners, but copying maybe better conforms to the append-only policy. Also, the sqlite version on the server doesn't support a replace() function, yuck. I think everything should be OK now -- please let me know if something is broken. > I saved that patch from the wiki because I was worried about the large > number of unmerged changes, there are about 500 docstrings with > changes right now (close to 50% of all important docstrings). 
Would it > make sense to make sure (maybe with an automatic reminder email) that > changes from the wiki get merged if there are enough changes - 100 > changed docstrings or changes older than two months, say? I'll put the following to my crontab, let's see if it helps... 15 14 * * sun test `wget -q -O- 'http://docs.scipy.org/numpy/patch/' | grep -c ''` -lt 100 || echo "Hey man, lots of stuff is waiting at http://docs.scipy.org/numpy/patch/" | mail -s "Lots o' Stuff waiting in Pydocweb" pauli at localhost Anyone else with SVN commit access is welcome to do similarly :) > I know it is quite a bit of work to do the merge. I remember Pauli > saying that most of the work was reviewing all changes to make sure > they are an improvement over the current svn version. Is that correct? > I can help with that if necessary. You have reviewer rights, so you can go and click the "OK to apply" buttons, to flag those changes that are better than SVN. Also, check that no unwanted or dangerous code (eg. careless handling of temporary files, etc) is added to the examples -- they may eventually be executed after committed to SVN. -- Pauli Virtanen From pav at iki.fi Thu Sep 17 13:56:03 2009 From: pav at iki.fi (Pauli Virtanen) Date: Thu, 17 Sep 2009 20:56:03 +0300 Subject: [Numpy-discussion] defmatrix move - new docstrings disappeared In-Reply-To: <6a17e9ee0909170919j252e292fud7e074d2192a38f4@mail.gmail.com> References: <6a17e9ee0909170919j252e292fud7e074d2192a38f4@mail.gmail.com> Message-ID: <1253210162.5581.89.camel@idol> to, 2009-09-17 kello 18:19 +0200, Scott Sinclair kirjoitti: [clip] > It's probably important that the documentation patches should be > committed pretty soon after being reviewed for obvious malicious code > and marked "OK to Apply". It's possible to edit docstrings that are > marked as "OK to apply", without this flag being removed. If that's possible, then it's a bug. But I don't see how that can happen -- do you have an example how to do this kind of edits that don't reset the ok_to_apply flag? > If none of the current commiters have time to commit doc patches on > demand, then perhaps it makes sense to give commit access to someone > working actively on the documentation (Ralf, DavidG ?). This could be > on the understanding that only doc patches would be commited by this > person. > > It's always going to be a lot of work when we let the trunk and > doc-editor get too far out of sync. The work scales linearly with the size of the diff to SVN, so it's not extremely bad. Of course, it's a bit of a work to go through hundreds of potential docstrings in one sitting. -- Pauli Virtanen From dwf at cs.toronto.edu Thu Sep 17 15:30:55 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Thu, 17 Sep 2009 15:30:55 -0400 Subject: [Numpy-discussion] matlab for numpy users In-Reply-To: <4AB1DA2C.4040504@ar.media.kyoto-u.ac.jp> References: <4AB1DA2C.4040504@ar.media.kyoto-u.ac.jp> Message-ID: <3A0A51CF-4C11-4B3E-A3FB-05C05E90FDD7@cs.toronto.edu> On 17-Sep-09, at 2:41 AM, David Cournapeau wrote: > Not really (except for trivial things like scalar-matrix operations). > The usual way to do it in matlab is repmat, which helps you doing > 'manual broadcasting'. In recent versions there is also 'bsxfun', which is an awkward way of doing broadcasting but uses less memory (I think). 
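For what it's worth, the correspondence in NumPy terms, reusing the shapes from the original post, is roughly the sketch below; repmat materialises by hand what broadcasting (and bsxfun) only implies:

import numpy as np

a = np.ones((50, 50))
time = np.linspace(0, 1, 101)

# Automatic broadcasting: (101, 1, 1) against (50, 50) -> (101, 50, 50)
res = a * np.exp(-time[:, np.newaxis, np.newaxis])

# The repmat-style equivalent: replicate both operands to a common
# (101, 50, 50) shape before multiplying, as Matlab's repmat would.
a3 = np.tile(a, (101, 1, 1))
t3 = np.tile(np.exp(-time)[:, np.newaxis, np.newaxis], (1, 50, 50))
assert np.allclose(res, a3 * t3)

That materialisation is where repmat's extra memory goes, which is presumably why bsxfun is the lighter option.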
David From ralf.gommers at googlemail.com Thu Sep 17 15:58:23 2009 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Thu, 17 Sep 2009 15:58:23 -0400 Subject: [Numpy-discussion] defmatrix move - new docstrings disappeared In-Reply-To: <1253209737.5581.80.camel@idol> References: <1253209737.5581.80.camel@idol> Message-ID: On Thu, Sep 17, 2009 at 1:48 PM, Pauli Virtanen wrote: > to, 2009-09-17 kello 11:21 -0400, Ralf Gommers kirjoitti: > > After the recent move of the matrix module, all the changes to the > > docstrings have disappeared from the doc wiki. I imagine the changes > > still live somewhere in the underlying repo but are not visible to the > > user. Can they be restored? If there is some unforeseen problem with > > that, I do have a recent patch saved that contains the latest edits > > (but of course it won't apply cleanly). > > They are not lost (the app is designed so that things are not easily > lost), and can be found by writing the old full URL: > > http://docs.scipy.org/numpy/docs/numpy.core.defmatrix/ > > That is useful. Copying to the new location indeed seems better than moving. Even though the content is not technically lost, I would like to see warnings for objects that have disappeared from svn. If content gets silently hidden like this it will result in duplicated effort, you can't rely on people noticing this every time. The easiest way to make the transition is to copy the docstrings via SQL > on the server. > > I think everything should be OK now -- please let me know if something > is broken. > That fixed it, thanks. The one thing that did not get copied was the status of docstrings marked Unimportant, they were still in the Needs editing state. I changed those manually. > > > You have reviewer rights, so you can go and click the "OK to apply" > buttons, to flag those changes that are better than SVN. > > Also, check that no unwanted or dangerous code (eg. careless handling of > temporary files, etc) is added to the examples -- they may eventually be > executed after committed to SVN. > I'll work on it this weekend and will let you know when it is done. Cheers, Ralf > > -- > Pauli Virtanen > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From slaunger at gmail.com Thu Sep 17 16:13:58 2009 From: slaunger at gmail.com (Kim Hansen) Date: Thu, 17 Sep 2009 22:13:58 +0200 Subject: [Numpy-discussion] Efficient array equivalent to cmp(x,y) Message-ID: Hi, Is there an array-like function equivalent with the builtin method for the Python single-valued comparison cmp(x,y)? What I would like is a cmp(a, lim), where a is an ndarray and lim is a single value, and then I need an array back of a's shape giving the elementwise comparison array([cmp(a[0], lim), cmp(a[1], lim), ...]) I can do it somewhat ackwardly doing this: In [1]: a = randint(5, size=10); print a [0 2 4 1 3 0 3 4 0 1] In [2]: lim = 2 In [3]: acmp = empty(a.shape, dtype='i1') In [4]: acmp[a < lim] = -1 In [5]: acmp[a == lim] = 0 In [6]: acmp[a > lim] = 1 In [7]: acmp Out[7]: array([-1, 0, 1, -1, 1, -1, 1, 1, -1, -1], dtype=int8) But that is not very elegant and since this is a computational bottleneck I would rather like to avoid all the intermediate creations of three mask arrays for fancy indexing in this example. Is there a simpler/more elegant way to acheive the same result? 
Thanks in advance, Kim -------------- next part -------------- An HTML attachment was scrubbed... URL: From kwgoodman at gmail.com Thu Sep 17 16:19:55 2009 From: kwgoodman at gmail.com (Keith Goodman) Date: Thu, 17 Sep 2009 13:19:55 -0700 Subject: [Numpy-discussion] Efficient array equivalent to cmp(x,y) In-Reply-To: References: Message-ID: On Thu, Sep 17, 2009 at 1:13 PM, Kim Hansen wrote: > Hi, > > Is there an array-like function?equivalent?with the?builtin method for the > Python single-valued comparison cmp(x,y)? > > What?I would like is a cmp(a, lim), where a is an ndarray and lim is a > single value, and then?I need an array back of a's shape giving the > elementwise comparison > array([cmp(a[0], lim), cmp(a[1], lim), ...]) > > I can do it somewhat ackwardly doing this: > > In [1]: a = randint(5, size=10); print a > [0 2 4 1 3 0 3 4 0 1] > In [2]: lim = 2 > In [3]: acmp = empty(a.shape, dtype='i1') > In [4]: acmp[a < lim] = -1 > In [5]: acmp[a == lim] = 0 > In [6]: acmp[a > lim] = 1 > In [7]: acmp > Out[7]: array([-1,? 0,? 1, -1,? 1, -1,? 1,? 1, -1, -1], dtype=int8) > > But that is not very elegant and since this is a computational bottleneck?I > would rather like to avoid all the intermediate creations of three mask > arrays for fancy indexing in this example. If there are no NaNs, you only need to make 2 masks by using ones instead of empty. Not elegent but a little faster. From robert.kern at gmail.com Thu Sep 17 16:25:18 2009 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 17 Sep 2009 15:25:18 -0500 Subject: [Numpy-discussion] Efficient array equivalent to cmp(x,y) In-Reply-To: References: Message-ID: <3d375d730909171325o1daad201n63cf23bfb676a635@mail.gmail.com> [Please pardon the piggybacking. I didn't get the original.] On Thu, Sep 17, 2009 at 15:19, Keith Goodman wrote: > On Thu, Sep 17, 2009 at 1:13 PM, Kim Hansen wrote: >> Hi, >> >> Is there an array-like function?equivalent?with the?builtin method for the >> Python single-valued comparison cmp(x,y)? >> >> What?I would like is a cmp(a, lim), where a is an ndarray and lim is a >> single value, and then?I need an array back of a's shape giving the >> elementwise comparison >> array([cmp(a[0], lim), cmp(a[1], lim), ...]) >> >> I can do it somewhat ackwardly doing this: >> >> In [1]: a = randint(5, size=10); print a >> [0 2 4 1 3 0 3 4 0 1] >> In [2]: lim = 2 >> In [3]: acmp = empty(a.shape, dtype='i1') >> In [4]: acmp[a < lim] = -1 >> In [5]: acmp[a == lim] = 0 >> In [6]: acmp[a > lim] = 1 >> In [7]: acmp >> Out[7]: array([-1,? 0,? 1, -1,? 1, -1,? 1,? 1, -1, -1], dtype=int8) >> >> But that is not very elegant and since this is a computational bottleneck?I >> would rather like to avoid all the intermediate creations of three mask >> arrays for fancy indexing in this example. In [1]: a = np.array([0, 2, 4, 1, 3, 0, 3, 4, 0, 1]) In [2]: lim = 2 In [3]: np.sign(a - lim) Out[3]: array([-1, 0, 1, -1, 1, -1, 1, 1, -1, -1]) -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From slaunger at gmail.com Thu Sep 17 16:30:40 2009 From: slaunger at gmail.com (Kim Hansen) Date: Thu, 17 Sep 2009 22:30:40 +0200 Subject: [Numpy-discussion] Efficient array equivalent to cmp(x,y) In-Reply-To: References: Message-ID: > > > > If there are no NaNs, you only need to make 2 masks by using ones > instead of empty. Not elegent but a little faster. > Good point! Thanks. 
-------------- next part -------------- An HTML attachment was scrubbed... URL: From slaunger at gmail.com Thu Sep 17 16:33:24 2009 From: slaunger at gmail.com (Kim Hansen) Date: Thu, 17 Sep 2009 22:33:24 +0200 Subject: [Numpy-discussion] Efficient array equivalent to cmp(x,y) In-Reply-To: <3d375d730909171325o1daad201n63cf23bfb676a635@mail.gmail.com> References: <3d375d730909171325o1daad201n63cf23bfb676a635@mail.gmail.com> Message-ID: > > > > In [1]: a = np.array([0, 2, 4, 1, 3, 0, 3, 4, 0, 1]) > > In [2]: lim = 2 > > In [3]: np.sign(a - lim) > Out[3]: array([-1, 0, 1, -1, 1, -1, 1, 1, -1, -1]) > Dooh. Facepalm. I should have thought of that myself! Only one intermediate array needs to be created then. Thank you. That was what I was looking for. Cheers, Kim -------------- next part -------------- An HTML attachment was scrubbed... URL: From dwf at cs.toronto.edu Thu Sep 17 18:12:01 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Thu, 17 Sep 2009 18:12:01 -0400 Subject: [Numpy-discussion] dumb structured arrays question Message-ID: If I have a 1-dimensional array with a structured dtype, say str, float, float, float, float.... where all the float columns have their own names, and I just want to extract all the floats in the order they appear into a 2D matrix that disregards the dtype metadata... Is there an easy way to do that? Currently the only thing I can think of is out = [] for name in arr.dtype.names: if name != 'condition': # my single str field out.append(arr['name']) out = np.array(out) Is there already some convenience function for this, to coerce numeric compound dtypes into an extra dimension? It seems conceivable that I might be able to pull some dirty trick with stride_tricks that would even allow me to avoid the copy, since it'd just be a matter of an extra few bytes on the 2nd dimension stride to skip over the string data. Thanks, David From erin.sheldon at gmail.com Thu Sep 17 18:18:29 2009 From: erin.sheldon at gmail.com (Erin Sheldon) Date: Thu, 17 Sep 2009 18:18:29 -0400 Subject: [Numpy-discussion] dumb structured arrays question In-Reply-To: References: Message-ID: <331116dc0909171518r4399894ck20283194d33f7d8e@mail.gmail.com> On Thu, Sep 17, 2009 at 6:12 PM, David Warde-Farley wrote: > If I have a 1-dimensional array with a structured dtype, say str, > float, float, float, float.... where all the float columns have their > own names, and I just want to extract all the floats in the order they > appear into a 2D matrix that disregards the dtype metadata... Is there > an easy way to do that? > > Currently the only thing I can think of is > > out = [] > for name in arr.dtype.names: > ? ? ? ?if name != 'condition': # my single str field > ? ? ? ? ? ? ? ?out.append(arr['name']) > out = np.array(out) > > Is there already some convenience function for this, to coerce numeric > compound dtypes into an extra dimension? > > It seems conceivable that I might be able to pull some dirty trick > with stride_tricks that would even allow me to avoid the copy, since > it'd just be a matter of an extra few bytes on the 2nd dimension > stride to skip over the string data. 
> > Thanks, > > David You can just view it differently: In [4]: x=numpy.zeros(3,dtype=[('field1','S5'),('field2','f4'),('field3','f4'),('field4','f4')]) In [5]: x Out[5]: array([('', 0.0, 0.0, 0.0), ('', 0.0, 0.0, 0.0), ('', 0.0, 0.0, 0.0)], dtype=[('field1', '|S5'), ('field2', ' References: <331116dc0909171518r4399894ck20283194d33f7d8e@mail.gmail.com> Message-ID: <6F2CF41E-F4A6-4EBA-9640-7A57B213CC6F@cs.toronto.edu> On 17-Sep-09, at 6:18 PM, Erin Sheldon wrote: > > You can just view it differently: > > In [4]: x=numpy.zeros(3,dtype=[('field1','S5'),('field2','f4'), > ('field3','f4'),('field4','f4')]) > > In [5]: x > Out[5]: > array([('', 0.0, 0.0, 0.0), ('', 0.0, 0.0, 0.0), ('', 0.0, 0.0, 0.0)], > dtype=[('field1', '|S5'), ('field2', ' ('field4', ' > In [6]: dt=numpy.dtype([('field1','S5'),('compound','3f4')]) > > In [7]: x.view(dt) > Out[7]: > array([('', [0.0, 0.0, 0.0]), ('', [0.0, 0.0, 0.0]), ('', [0.0, 0.0, > 0.0])], > dtype=[('field1', '|S5'), ('compound', ' > In [8]: x.view(dt)['compound'] > Out[8]: > array([[ 0., 0., 0.], > [ 0., 0., 0.], > [ 0., 0., 0.]], dtype=float32) Thanks, that's immensely useful. Didn't know about .view(). David From ralf.gommers at googlemail.com Thu Sep 17 21:16:12 2009 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Thu, 17 Sep 2009 21:16:12 -0400 Subject: [Numpy-discussion] defmatrix move - new docstrings disappeared In-Reply-To: References: <1253209737.5581.80.camel@idol> Message-ID: There is a lot more it turns out: atleast_1d atleast_2d atleast_3d hstack vstack correlate2 linspace logspace finfo iinfo MachAr Based on this I suspect there is quite a bit of work that got lost earlier in the summer. A couple of times I saw the count of "needs editing" in the stats go up by several or even several tens. At the time I thought those were all objects that were new to NumPy, but more likely they got moved around and the doc writers redid the work. ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From scott.sinclair.za at gmail.com Fri Sep 18 01:48:47 2009 From: scott.sinclair.za at gmail.com (Scott Sinclair) Date: Fri, 18 Sep 2009 07:48:47 +0200 Subject: [Numpy-discussion] defmatrix move - new docstrings disappeared In-Reply-To: <1253210162.5581.89.camel@idol> References: <6a17e9ee0909170919j252e292fud7e074d2192a38f4@mail.gmail.com> <1253210162.5581.89.camel@idol> Message-ID: <6a17e9ee0909172248w674282bfpfbd1ef1d3f6de125@mail.gmail.com> > 2009/9/17 Pauli Virtanen : > to, 2009-09-17 kello 18:19 +0200, Scott Sinclair kirjoitti: > [clip] >> It's probably important that the documentation patches should be >> committed pretty soon after being reviewed for obvious malicious code >> and marked "OK to Apply". It's possible to edit docstrings that are >> marked as "OK to apply", without this flag being removed. > > If that's possible, then it's a bug. But I don't see how that can happen > -- do you have an example how to do this kind of edits that don't reset > the ok_to_apply flag? No I don't, I've just tried it and the flag is correctly reset. I am certain that I edited a docstring some time ago without the flag being reset, but can't reproduce that action now. 
Cheers, Scott From pav+sp at iki.fi Fri Sep 18 03:42:08 2009 From: pav+sp at iki.fi (Pauli Virtanen) Date: Fri, 18 Sep 2009 07:42:08 +0000 (UTC) Subject: [Numpy-discussion] defmatrix move - new docstrings disappeared References: <1253209737.5581.80.camel@idol> Message-ID: Thu, 17 Sep 2009 21:16:12 -0400, Ralf Gommers wrote: > There is a lot more it turns out: > > atleast_1d > atleast_2d > atleast_3d > hstack > vstack > correlate2 > linspace > logspace > finfo > iinfo > MachAr So it seems -- David's been moving stuff around lately. > Based on this I suspect there is quite a bit of work that got lost > earlier in the summer. A couple of times I saw the count of "needs > editing" in the stats go up by several or even several tens. At the time > I thought those were all objects that were new to NumPy, but more likely > they got moved around and the doc writers redid the work. I doubt that -- not so much has been moved around. It's easy to see from "git log --stat -M -C" that only shape_base, getlimits, and machar have been moved around since 2007. The main cause for "Needs editing" increasing is that I elevated occasionally some items from "Unimportant" status that were actually important. If you have good ideas how to the "move/delete warnings" should appear in the web UI, and what the application should do to make it easy to fix them, go ahead and tell them. Designing the UI and how it should work is the first part of the work in making any improvements. -- Pauli Virtanen From highegg at gmail.com Fri Sep 18 05:35:17 2009 From: highegg at gmail.com (Jaroslav Hajek) Date: Fri, 18 Sep 2009 11:35:17 +0200 Subject: [Numpy-discussion] string arrays - accessing data from C++ Message-ID: <69d8d540909180235o3ba29b9aqe145be45df393937@mail.gmail.com> hi all, I'm working on Pytave - a Python<->Octave bridge that can either use NumPy or the older Numeric to map onto Octave's arrays. I would like to implement support for NumPy's string arrays, but I'm a little confused about how the data is stored internally. Does PyArrayObject::data point to a single contiguous char[] buffer, like with the old Numeric char arrays, with PyArrayObject::descr->elsize being the maximum length? If yes, how are string lengths determined (I noticed that the strings stored can have differing lengths when displayed from the interpreter)? Finally, is there any way to create an array in NumPy (from within the interpreter) that would have type == PyArray_CHAR? (There will be no need for that when I implement the above, I'm just wondering whether I overlooked something). Many thanks in advance. -- RNDr. Jaroslav Hajek computing expert & GNU Octave developer Aeronautical Research and Test Institute (VZLU) Prague, Czech Republic url: www.highegg.matfyz.cz From renesd at gmail.com Fri Sep 18 08:01:08 2009 From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=) Date: Fri, 18 Sep 2009 13:01:08 +0100 Subject: [Numpy-discussion] I want to help with a numpy python 3.1.x port In-Reply-To: <64ddb72c0909180455q5d30d0e4h4ef5f41807cb2a6c@mail.gmail.com> References: <64ddb72c0909180455q5d30d0e4h4ef5f41807cb2a6c@mail.gmail.com> Message-ID: <64ddb72c0909180501s6a2357dv9e6cbcfd4c62b293@mail.gmail.com> Hello, as a big numpy user, and someone wanting to help with the python 3 migration, I'd like to help with a python 3.1 port of numpy. We(at the pygame project) have mostly completed our port of pygame to python 3.0 and 3.1 so can offer some insight into what it takes with a CPython extension. 
pygame supports python 2.3 through to 3.1, so it should be possible to also keep backwards compatibility with the port for numpy. ?We can also use some of the helper code we used for the pygame port. I haven't asked the other pygame developers if they are interested in helping too... but maybe they will be interested in helping too(since they use numpy too). ?I'd also be interested in other people helping with the effort. ?Once I have put some of the ground work in place, and got a few parts done it should be easier for other people to see what needs changing. If the python 3 port is something you'd want included with numpy, then I'd like to begin this weekend. I'm not super familiar with the numpy code base yet, so I'd like to do it in small changes making small parts compatible with py3k and then having them reviewed/tested. I can either start a branch myself using mecurial, or perhaps you'd like me to do the work somewhere else ( like in a branch in the numpy svn?). Which code should I base the port off? ?trunk ? Or maybe I missed something and a port has started already? In which case, is there a branch somewhere I can help with? cheers, From charlesr.harris at gmail.com Fri Sep 18 10:10:50 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 18 Sep 2009 08:10:50 -0600 Subject: [Numpy-discussion] I want to help with a numpy python 3.1.x port In-Reply-To: <64ddb72c0909180501s6a2357dv9e6cbcfd4c62b293@mail.gmail.com> References: <64ddb72c0909180455q5d30d0e4h4ef5f41807cb2a6c@mail.gmail.com> <64ddb72c0909180501s6a2357dv9e6cbcfd4c62b293@mail.gmail.com> Message-ID: Hi Ren?, On Fri, Sep 18, 2009 at 6:01 AM, Ren? Dudfield wrote: > Hello, > > as a big numpy user, and someone wanting to help with the python 3 > migration, I'd like to help with a python 3.1 port of numpy. > > We(at the pygame project) have mostly completed our port of pygame to > python 3.0 and 3.1 so can offer some insight into what it takes with a > CPython extension. > > pygame supports python 2.3 through to 3.1, so it should be possible to > also keep backwards compatibility with the port for numpy. We can > also use some of the helper code we used for the pygame port. > > I haven't asked the other pygame developers if they are interested in > helping too... but maybe they will be interested in helping too(since > they use numpy too). I'd also be interested in other people helping > with the effort. Once I have put some of the ground work in place, > and got a few parts done it should be easier for other people to see > what needs changing. > > > If the python 3 port is something you'd want included with numpy, then > I'd like to begin this weekend. > > I'm not super familiar with the numpy code base yet, so I'd like to do > it in small changes making small parts compatible with py3k and then > having them reviewed/tested. > > I can either start a branch myself using mecurial, or perhaps you'd > like me to do the work somewhere else ( like in a branch in the numpy > svn?). > > Which code should I base the port off? trunk ? > > Darren Dale and I are just getting started on a port and welcome any help you can offer. Because of the difficulty of maintaining two branches the only route that looks good at this point is to get the python parts of numpy in a state that will allow 2to3 to work and use #ifdef's in the c code. What was your experience with pygames? 
Because the numpy c code is difficult to come to grips with the easiest part for inexperienced c coders and newbies is to start on is probably the python code. There is a fair amount of that so some plan of attack and a check list probably needs to be set up. Any such list should be kept in svn along with any helpful notes about problems/solutions encountered along the way. We also need David C. to commit his build changes for py3k so we can actually build the whole thing when the time comes. (Hint, hint). Also, I'm thinking of skipping 3.0 and starting with 3.1 because of the fixes, particularly the disappearance of cmp, that went in between the versions. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From renesd at gmail.com Fri Sep 18 10:46:14 2009 From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=) Date: Fri, 18 Sep 2009 15:46:14 +0100 Subject: [Numpy-discussion] I want to help with a numpy python 3.1.x port In-Reply-To: References: <64ddb72c0909180455q5d30d0e4h4ef5f41807cb2a6c@mail.gmail.com> <64ddb72c0909180501s6a2357dv9e6cbcfd4c62b293@mail.gmail.com> Message-ID: <64ddb72c0909180746j7e1aafd1g6664d12e0a7f6f79@mail.gmail.com> On Fri, Sep 18, 2009 at 3:10 PM, Charles R Harris wrote: > Hi Ren?, > > On Fri, Sep 18, 2009 at 6:01 AM, Ren? Dudfield wrote: >> >> Hello, >> >> as a big numpy user, and someone wanting to help with the python 3 >> migration, I'd like to help with a python 3.1 port of numpy. >> >> We(at the pygame project) have mostly completed our port of pygame to >> python 3.0 and 3.1 so can offer some insight into what it takes with a >> CPython extension. >> >> pygame supports python 2.3 through to 3.1, so it should be possible to >> also keep backwards compatibility with the port for numpy. ?We can >> also use some of the helper code we used for the pygame port. >> >> I haven't asked the other pygame developers if they are interested in >> helping too... but maybe they will be interested in helping too(since >> they use numpy too). ?I'd also be interested in other people helping >> with the effort. ?Once I have put some of the ground work in place, >> and got a few parts done it should be easier for other people to see >> what needs changing. >> >> >> If the python 3 port is something you'd want included with numpy, then >> I'd like to begin this weekend. >> >> I'm not super familiar with the numpy code base yet, so I'd like to do >> it in small changes making small parts compatible with py3k and then >> having them reviewed/tested. >> >> I can either start a branch myself using mecurial, or perhaps you'd >> like me to do the work somewhere else ( like in a branch in the numpy >> svn?). >> >> Which code should I base the port off? ?trunk ? >> > > Darren Dale and I are just getting started on a port and welcome any help > you can offer.? Because of the difficulty of maintaining two branches the > only route that looks good at this point is to get the python parts of numpy > in a state that will allow 2to3 to work and use #ifdef's in the c code. What > was your experience with pygames? > ah cool! That's great to hear. Where is your work going to be done? Well, we managed to make the code base runnable in python 2.x and 3 without using the 2to3 tool. We didn't need, or want to support different versions of source code. In lots of cases it's possible to support versions of python at once... if you aren't using any new python3 features that the older pythons don't have. 
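To make that concrete, here is a small sketch of the kind of idiom that lets one Python code base run on 2.x and 3.x without 2to3; the geterror name is only an example, not pygame's actual helper code:

import sys

def geterror():
    # Portable way to grab the current exception object.  The old
    # "except ValueError, e" form is a syntax error on Python 3, and
    # "except ValueError as e" is a syntax error on Python 2.5 and
    # older, so a single code base uses sys.exc_info() instead.
    return sys.exc_info()[1]

try:
    value = int("not a number")
except ValueError:
    err = geterror()
    print("could not convert: %s" % err)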
For the C code there's a header which helps with a lot of the backward compat stuff. Had to implement a few things because of missing things... like simple slicing is gone from py3k (sq_slice) etc. Here's a header file we used with some compatibility macros... http://www.seul.org/viewcvs/viewcvs.cgi/trunk/src/pgcompat.h?rev=2049&root=PyGame&sortby=date&view=markup but we also have some other compat stuff in other headers... but I think most of the py3k stuff is in there. You could probably just copy that header file to use it. The module initialisation works slightly differently so we have #ifdef PY3 for those differences. See the bottom of this file for an example module: http://www.seul.org/viewcvs/viewcvs.cgi/trunk/src/rect.c?rev=2617&root=PyGame&sortby=date&view=markup For things like the missing sq_slice, rather than rewriting things we just wrote code to support stuff that py3k removed. > Because the numpy c code is difficult to come to grips with the easiest part > for inexperienced c coders and newbies is to start on is probably the python > code. There is a fair amount of that so some plan of attack and a check list > probably needs to be set up. Any such list should be kept in svn along with > any helpful notes about problems/solutions encountered along the way. > Well plenty of tests are something that's needed for sure... so add that to your checklist. It can be a good idea to get the setup.py, and testing framework ported early, so it's easier to check if code still works. I'm happy to help with the C stuff, and the .py stuff too. > We also need David C. to commit his build changes for py3k so we can > actually build the whole thing when the time comes. (Hint, hint). Also, I'm > thinking of skipping 3.0 and starting with 3.1 because of the fixes, > particularly the disappearance of cmp, that went in between the versions. > > Chuck yeah, skipping 3.0 is a good idea... just go straight to 3.1. We started our port during 3.0 before 3.1... but there weren't all that many differences for us between the two versions. cheers, From renesd at gmail.com Fri Sep 18 10:52:30 2009 From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=) Date: Fri, 18 Sep 2009 15:52:30 +0100 Subject: [Numpy-discussion] I want to help with a numpy python 3.1.x port In-Reply-To: <64ddb72c0909180746j7e1aafd1g6664d12e0a7f6f79@mail.gmail.com> References: <64ddb72c0909180455q5d30d0e4h4ef5f41807cb2a6c@mail.gmail.com> <64ddb72c0909180501s6a2357dv9e6cbcfd4c62b293@mail.gmail.com> <64ddb72c0909180746j7e1aafd1g6664d12e0a7f6f79@mail.gmail.com> Message-ID: <64ddb72c0909180752g3f350f63y8639a2d9ba2c7c67@mail.gmail.com> one more thing... there's also notes about porting to py3k here: http://wiki.python.org/moin/cporting and here: http://wiki.python.org/moin/PortingExtensionModulesToPy3k Which are better than the python.org docs for cporting. That's probably a pretty good page to store notes about porting as we go. cu, From charlesr.harris at gmail.com Fri Sep 18 11:05:36 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 18 Sep 2009 09:05:36 -0600 Subject: [Numpy-discussion] I want to help with a numpy python 3.1.x port In-Reply-To: <64ddb72c0909180752g3f350f63y8639a2d9ba2c7c67@mail.gmail.com> References: <64ddb72c0909180455q5d30d0e4h4ef5f41807cb2a6c@mail.gmail.com> <64ddb72c0909180501s6a2357dv9e6cbcfd4c62b293@mail.gmail.com> <64ddb72c0909180746j7e1aafd1g6664d12e0a7f6f79@mail.gmail.com> <64ddb72c0909180752g3f350f63y8639a2d9ba2c7c67@mail.gmail.com> Message-ID: On Fri, Sep 18, 2009 at 8:52 AM, Ren? 
Dudfield wrote: > one more thing... > > there's also notes about porting to py3k here: > http://wiki.python.org/moin/cporting > and here: > http://wiki.python.org/moin/PortingExtensionModulesToPy3k > > Which are better than the python.org docs for cporting. That's > probably a pretty good page to store notes about porting as we go. > > > Thanks! Numpy defines a lot of extension python types, so that is what I got started on, using NPY_PY3K as the flag. Numpy also exports the numeric operations (don't ask) and I think that needs to be changed so it looks like a reqular c++ in c class with getters and setters, which will make things a bit easier with the major changes that happened there. IIRC, there is an include file that provides the old conversions between python numeric types and c types. Did you use that? We could give you commit privileges for this work, or we could work offline with git, i.e., you could use git svn and I would pull from you to make the commits. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From renesd at gmail.com Fri Sep 18 11:09:46 2009 From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=) Date: Fri, 18 Sep 2009 16:09:46 +0100 Subject: [Numpy-discussion] I want to help with a numpy python 3.1.x port In-Reply-To: References: <64ddb72c0909180455q5d30d0e4h4ef5f41807cb2a6c@mail.gmail.com> <64ddb72c0909180501s6a2357dv9e6cbcfd4c62b293@mail.gmail.com> <64ddb72c0909180746j7e1aafd1g6664d12e0a7f6f79@mail.gmail.com> <64ddb72c0909180752g3f350f63y8639a2d9ba2c7c67@mail.gmail.com> Message-ID: <64ddb72c0909180809q386cdac4n2f541cd0db7f9ce@mail.gmail.com> On Fri, Sep 18, 2009 at 4:05 PM, Charles R Harris wrote: > > > On Fri, Sep 18, 2009 at 8:52 AM, Ren? Dudfield wrote: >> >> one more thing... >> >> there's also notes about porting to py3k here: >> ? ?http://wiki.python.org/moin/cporting >> and here: >> ? ?http://wiki.python.org/moin/PortingExtensionModulesToPy3k >> >> Which are better than the python.org docs for cporting. ?That's >> probably a pretty good page to store notes about porting as we go. >> >> > > Thanks! Numpy defines a lot of extension python types, so that is what I got > started on, using NPY_PY3K as the flag. Numpy also exports the numeric > operations (don't ask) and I think that needs to be changed so it looks like > a reqular c++ in c class with getters and setters, which will make things a > bit easier with the major changes that happened there. > > IIRC, there is an include file that provides the old conversions between > python numeric types and c types. Did you use that? no, I don't know about that. > > We could give you commit privileges for this work, or we could work offline > with git, i.e., you could use git svn and I would pull from you to make the > commits. if that works for you, that sounds good. Should I clone from trunk, or is it going to be in a separate branch? cheers, From ralf.gommers at googlemail.com Fri Sep 18 11:24:25 2009 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Fri, 18 Sep 2009 11:24:25 -0400 Subject: [Numpy-discussion] defmatrix move - new docstrings disappeared In-Reply-To: References: <1253209737.5581.80.camel@idol> Message-ID: > > > Based on this I suspect there is quite a bit of work that got lost > > earlier in the summer. A couple of times I saw the count of "needs > > editing" in the stats go up by several or even several tens. 
At the time > > I thought those were all objects that were new to NumPy, but more likely > > they got moved around and the doc writers redid the work. > > > I doubt that -- not so much has been moved around. It's easy to see from > "git log --stat -M -C" that only shape_base, getlimits, and machar have > been moved around since 2007. > > The main cause for "Needs editing" increasing is that I elevated > occasionally some items from "Unimportant" status that were actually > important. > Great, now I can sleep again:) > > If you have good ideas how to the "move/delete warnings" should appear in > the web UI, and what the application should do to make it easy to fix > them, go ahead and tell them. Designing the UI and how it should work is > the first part of the work in making any improvements. > > One possible solution: have a page with all unhandled moves/deletes, similar to the Merge page. This page can list all relevant docstrings and have buttons to copy over the old page, or to ignore the event. As extra insurance, the newly created page could have a warning added to it in a comment that there may be a problem. It may also be possible to copy over the whole page automatically but that is more risky than clearly asking the user for a manual resolution. Cheers, Ralf -- > Pauli Virtanen > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav+sp at iki.fi Fri Sep 18 11:25:21 2009 From: pav+sp at iki.fi (Pauli Virtanen) Date: Fri, 18 Sep 2009 15:25:21 +0000 (UTC) Subject: [Numpy-discussion] defmatrix move - new docstrings disappeared References: <1253209737.5581.80.camel@idol> Message-ID: Fri, 18 Sep 2009 07:42:08 +0000, Pauli Virtanen wrote: [clip] > I doubt that -- not so much has been moved around. It's easy to see from > "git log --stat -M -C" that only shape_base, getlimits, and machar have > been moved around since 2007. A complete listing obtained from SQL: numpy-docs/reference/generalized_ufuncs.rst Taken care of when the move was made. numpy.core.defmatrix.asmatrix numpy.core.defmatrix.bmat numpy.core.defmatrix.matrix numpy.core.defmatrix.matrix.all numpy.core.defmatrix.matrix.any numpy.core.defmatrix.matrix.argmax numpy.core.defmatrix.matrix.argmin numpy.core.defmatrix.matrix.getA numpy.core.defmatrix.matrix.getA1 numpy.core.defmatrix.matrix.getH numpy.core.defmatrix.matrix.getI numpy.core.defmatrix.matrix.getT numpy.core.defmatrix.matrix.max numpy.core.defmatrix.matrix.mean numpy.core.defmatrix.matrix.min numpy.core.defmatrix.matrix.prod numpy.core.defmatrix.matrix.ptp numpy.core.defmatrix.matrix.std numpy.core.defmatrix.matrix.sum numpy.core.defmatrix.matrix.tolist numpy.core.defmatrix.matrix.var numpy.core.defmatrix.matrix_power Taken care of, these are here since I copied the old strings to numpy.matrixlib. numpy.core.numeric.acorrelate All features in acorrelate were merged to the main correlate function. No way to really avoid lost work here. numpy.float96 Platform issue -- it's just not available on all platforms, and so hits a corner case in the app. numpy.lib.arraysetops.intersect1d_nu numpy.lib.arraysetops.setmember1d numpy.lib.arraysetops.unique1d These were deprecated. Work lost, but no way to really avoid this. numpy.lib.function_base.unique numpy.lib.utils.issubdtype These were committed to SVN before move -- no work lost. 
numpy.ma.core.MaskedArray.torecords The function was merged to "toflex" in SVN. No work lost. numpy.ma.core.get_data numpy.ma.core.get_mask These were already taken care of when the move occurred. numpy.matlib.empty numpy.matlib.eye numpy.matlib.identity numpy.matlib.ones numpy.matlib.rand numpy.matlib.randn numpy.matlib.repmat numpy.matlib.zeros These are hidden by some misconfiguration. No work lost, I'll try to fix this when I find time for it. So, in summary, I'm happy to say that not much has been moved around during two years, and we have caught and fixed all the largest changes in a few days after when they have occurred. Big thanks to Ralf & Scott for noticing these! Reorganization of the codebases is sometimes necessary. The thing to improve here is adding a better move/delete tracking functionality to the web application. There's always a balance between doing things manually vs. writing an automated way -- here, an automated way might be better, as keeping track of and fixing these manually is somewhat fiddly work, even though it boils down to a few SQL statements. -- Pauli Virtanen From charlesr.harris at gmail.com Fri Sep 18 11:29:28 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 18 Sep 2009 09:29:28 -0600 Subject: [Numpy-discussion] I want to help with a numpy python 3.1.x port In-Reply-To: <64ddb72c0909180809q386cdac4n2f541cd0db7f9ce@mail.gmail.com> References: <64ddb72c0909180455q5d30d0e4h4ef5f41807cb2a6c@mail.gmail.com> <64ddb72c0909180501s6a2357dv9e6cbcfd4c62b293@mail.gmail.com> <64ddb72c0909180746j7e1aafd1g6664d12e0a7f6f79@mail.gmail.com> <64ddb72c0909180752g3f350f63y8639a2d9ba2c7c67@mail.gmail.com> <64ddb72c0909180809q386cdac4n2f541cd0db7f9ce@mail.gmail.com> Message-ID: On Fri, Sep 18, 2009 at 9:09 AM, Ren? Dudfield wrote: > On Fri, Sep 18, 2009 at 4:05 PM, Charles R Harris > wrote: > > > > > > On Fri, Sep 18, 2009 at 8:52 AM, Ren? Dudfield wrote: > >> > >> one more thing... > >> > >> there's also notes about porting to py3k here: > >> http://wiki.python.org/moin/cporting > >> and here: > >> http://wiki.python.org/moin/PortingExtensionModulesToPy3k > >> > >> Which are better than the python.org docs for cporting. That's > >> probably a pretty good page to store notes about porting as we go. > >> > >> > > > > Thanks! Numpy defines a lot of extension python types, so that is what I > got > > started on, using NPY_PY3K as the flag. Numpy also exports the numeric > > operations (don't ask) and I think that needs to be changed so it looks > like > > a reqular c++ in c class with getters and setters, which will make things > a > > bit easier with the major changes that happened there. > > > > IIRC, there is an include file that provides the old conversions between > > python numeric types and c types. Did you use that? > > no, I don't know about that. > > > > > We could give you commit privileges for this work, or we could work > offline > > with git, i.e., you could use git svn and I would pull from you to make > the > > commits. > > if that works for you, that sounds good. Should I clone from trunk, > or is it going to be in a separate branch? > > If things work like we want, the changes will have to end up in trunk. That's not to say you can't work in a git branch. One concern I have is pulling between git repos that have been rebased to svn, as I understand it this can lead to problems. OTOH, I think things should work if both git repos are rebased to the same svn revision. 
Hmm, it might be best to work in a branch that is kept rebased to the master which is kept in sync with svn. Maybe someone who knows a bit more about git can weigh in here. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav+sp at iki.fi Fri Sep 18 11:29:39 2009 From: pav+sp at iki.fi (Pauli Virtanen) Date: Fri, 18 Sep 2009 15:29:39 +0000 (UTC) Subject: [Numpy-discussion] defmatrix move - new docstrings disappeared References: <1253209737.5581.80.camel@idol> Message-ID: Fri, 18 Sep 2009 11:24:25 -0400, Ralf Gommers wrote: [clip] >> If you have good ideas how to the "move/delete warnings" should appear >> in the web UI, and what the application should do to make it easy to >> fix them, go ahead and tell them. Designing the UI and how it should >> work is the first part of the work in making any improvements. >> >> > One possible solution: have a page with all unhandled moves/deletes, > similar to the Merge page. This page can list all relevant docstrings > and have buttons to copy over the old page, or to ignore the event. As > extra insurance, the newly created page could have a warning added to it > in a comment that there may be a problem. > > It may also be possible to copy over the whole page automatically but > that is more risky than clearly asking the user for a manual resolution. Sounds good, manual resolution for all new and removed pages is reasonably low-overhead, especially as there is no completely reliable way to detect when a new page is moved, or just new. -- Pauli Virtanen From renesd at gmail.com Fri Sep 18 12:19:40 2009 From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=) Date: Fri, 18 Sep 2009 17:19:40 +0100 Subject: [Numpy-discussion] I want to help with a numpy python 3.1.x port In-Reply-To: References: <64ddb72c0909180455q5d30d0e4h4ef5f41807cb2a6c@mail.gmail.com> <64ddb72c0909180501s6a2357dv9e6cbcfd4c62b293@mail.gmail.com> <64ddb72c0909180746j7e1aafd1g6664d12e0a7f6f79@mail.gmail.com> <64ddb72c0909180752g3f350f63y8639a2d9ba2c7c67@mail.gmail.com> <64ddb72c0909180809q386cdac4n2f541cd0db7f9ce@mail.gmail.com> Message-ID: <64ddb72c0909180919v6a0d7930r90afc3ac72673490@mail.gmail.com> On Fri, Sep 18, 2009 at 4:29 PM, Charles R Harris wrote: > > > On Fri, Sep 18, 2009 at 9:09 AM, Ren? Dudfield wrote: >> >> On Fri, Sep 18, 2009 at 4:05 PM, Charles R Harris >> wrote: >> > >> > >> > On Fri, Sep 18, 2009 at 8:52 AM, Ren? Dudfield wrote: >> >> >> >> one more thing... >> >> >> >> there's also notes about porting to py3k here: >> >> ? ?http://wiki.python.org/moin/cporting >> >> and here: >> >> ? ?http://wiki.python.org/moin/PortingExtensionModulesToPy3k >> >> >> >> Which are better than the python.org docs for cporting. ?That's >> >> probably a pretty good page to store notes about porting as we go. >> >> >> >> >> > >> > Thanks! Numpy defines a lot of extension python types, so that is what I >> > got >> > started on, using NPY_PY3K as the flag. Numpy also exports the numeric >> > operations (don't ask) and I think that needs to be changed so it looks >> > like >> > a reqular c++ in c class with getters and setters, which will make >> > things a >> > bit easier with the major changes that happened there. >> > >> > IIRC, there is an include file that provides the old conversions between >> > python numeric types and c types. Did you use that? >> >> no, I don't know about that. 
>> >> > >> > We could give you commit privileges for this work, or we could work >> > offline >> > with git, i.e., you could use git svn and I would pull from you to make >> > the >> > commits. >> >> if that works for you, that sounds good. ?Should I clone from trunk, >> or is it going to be in a separate branch? >> > > If things work like we want, the changes will have to end up in trunk. > That's not to say you can't work in a git branch. One concern I have is > pulling between git repos that have been rebased to svn, as I understand it > this can lead to problems. OTOH, I think things should work if both git > repos are rebased to the same svn revision. Hmm, it might be best to work in > a branch that is kept rebased to the master which is kept in sync with svn. > Maybe someone who knows a bit more about git can weigh in here. > > Chuck > Well, for now I can just send patches with a svn diff... if you'd be so kind as to apply them after a review :) Integrating working changes into trunk incrementally seems like a good idea (not know numpy dev process very well). cu, From charlesr.harris at gmail.com Fri Sep 18 12:33:24 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 18 Sep 2009 10:33:24 -0600 Subject: [Numpy-discussion] I want to help with a numpy python 3.1.x port In-Reply-To: <64ddb72c0909180919v6a0d7930r90afc3ac72673490@mail.gmail.com> References: <64ddb72c0909180455q5d30d0e4h4ef5f41807cb2a6c@mail.gmail.com> <64ddb72c0909180501s6a2357dv9e6cbcfd4c62b293@mail.gmail.com> <64ddb72c0909180746j7e1aafd1g6664d12e0a7f6f79@mail.gmail.com> <64ddb72c0909180752g3f350f63y8639a2d9ba2c7c67@mail.gmail.com> <64ddb72c0909180809q386cdac4n2f541cd0db7f9ce@mail.gmail.com> <64ddb72c0909180919v6a0d7930r90afc3ac72673490@mail.gmail.com> Message-ID: On Fri, Sep 18, 2009 at 10:19 AM, Ren? Dudfield wrote: > On Fri, Sep 18, 2009 at 4:29 PM, Charles R Harris > wrote: > > > > > > On Fri, Sep 18, 2009 at 9:09 AM, Ren? Dudfield wrote: > >> > >> On Fri, Sep 18, 2009 at 4:05 PM, Charles R Harris > >> wrote: > >> > > >> > > >> > On Fri, Sep 18, 2009 at 8:52 AM, Ren? Dudfield > wrote: > >> >> > >> >> one more thing... > >> >> > >> >> there's also notes about porting to py3k here: > >> >> http://wiki.python.org/moin/cporting > >> >> and here: > >> >> http://wiki.python.org/moin/PortingExtensionModulesToPy3k > >> >> > >> >> Which are better than the python.org docs for cporting. That's > >> >> probably a pretty good page to store notes about porting as we go. > >> >> > >> >> > >> > > >> > Thanks! Numpy defines a lot of extension python types, so that is what > I > >> > got > >> > started on, using NPY_PY3K as the flag. Numpy also exports the numeric > >> > operations (don't ask) and I think that needs to be changed so it > looks > >> > like > >> > a reqular c++ in c class with getters and setters, which will make > >> > things a > >> > bit easier with the major changes that happened there. > >> > > >> > IIRC, there is an include file that provides the old conversions > between > >> > python numeric types and c types. Did you use that? > >> > >> no, I don't know about that. > >> > >> > > >> > We could give you commit privileges for this work, or we could work > >> > offline > >> > with git, i.e., you could use git svn and I would pull from you to > make > >> > the > >> > commits. > >> > >> if that works for you, that sounds good. Should I clone from trunk, > >> or is it going to be in a separate branch? > >> > > > > If things work like we want, the changes will have to end up in trunk. 
> > That's not to say you can't work in a git branch. One concern I have is > > pulling between git repos that have been rebased to svn, as I understand > it > > this can lead to problems. OTOH, I think things should work if both git > > repos are rebased to the same svn revision. Hmm, it might be best to work > in > > a branch that is kept rebased to the master which is kept in sync with > svn. > > Maybe someone who knows a bit more about git can weigh in here. > > > > Chuck > > > > > Well, for now I can just send patches with a svn diff... if you'd be > so kind as to apply them after a review :) > > Integrating working changes into trunk incrementally seems like a good > idea (not know numpy dev process very well). > > Numpy relies on nose for testing. I know that there is a py3k branch for nose but it doesn't look very active and I don't know its current state. Do you know anything about that? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From renesd at gmail.com Fri Sep 18 12:36:24 2009 From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=) Date: Fri, 18 Sep 2009 17:36:24 +0100 Subject: [Numpy-discussion] I want to help with a numpy python 3.1.x port In-Reply-To: References: <64ddb72c0909180455q5d30d0e4h4ef5f41807cb2a6c@mail.gmail.com> <64ddb72c0909180501s6a2357dv9e6cbcfd4c62b293@mail.gmail.com> <64ddb72c0909180746j7e1aafd1g6664d12e0a7f6f79@mail.gmail.com> <64ddb72c0909180752g3f350f63y8639a2d9ba2c7c67@mail.gmail.com> <64ddb72c0909180809q386cdac4n2f541cd0db7f9ce@mail.gmail.com> <64ddb72c0909180919v6a0d7930r90afc3ac72673490@mail.gmail.com> Message-ID: <64ddb72c0909180936j4dc0b927t377a781db6af1ca8@mail.gmail.com> On Fri, Sep 18, 2009 at 5:33 PM, Charles R Harris wrote: > Numpy relies on nose for testing. I know that there is a py3k branch for > nose but it doesn't look very active and I don't know its current state. Do > you know anything about that? > > Chuck ah, bugger. No I don't. I can find out though... I'll write back with what I find. cu, From renesd at gmail.com Fri Sep 18 12:49:40 2009 From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=) Date: Fri, 18 Sep 2009 17:49:40 +0100 Subject: [Numpy-discussion] I want to help with a numpy python 3.1.x port In-Reply-To: <64ddb72c0909180936j4dc0b927t377a781db6af1ca8@mail.gmail.com> References: <64ddb72c0909180455q5d30d0e4h4ef5f41807cb2a6c@mail.gmail.com> <64ddb72c0909180746j7e1aafd1g6664d12e0a7f6f79@mail.gmail.com> <64ddb72c0909180752g3f350f63y8639a2d9ba2c7c67@mail.gmail.com> <64ddb72c0909180809q386cdac4n2f541cd0db7f9ce@mail.gmail.com> <64ddb72c0909180919v6a0d7930r90afc3ac72673490@mail.gmail.com> <64ddb72c0909180936j4dc0b927t377a781db6af1ca8@mail.gmail.com> Message-ID: <64ddb72c0909180949k3b885c75id6b8d383cce731db@mail.gmail.com> On Fri, Sep 18, 2009 at 5:36 PM, Ren? Dudfield wrote: > On Fri, Sep 18, 2009 at 5:33 PM, Charles R Harris > wrote: >> Numpy relies on nose for testing. I know that there is a py3k branch for >> nose but it doesn't look very active and I don't know its current state. Do >> you know anything about that? >> >> Chuck > > ah, bugger. ?No I don't. ?I can find out though... ?I'll write back > with what I find. > > > cu, > ok... so this is the py3k branch here... svn checkout http://python-nose.googlecode.com/svn/branches/py3k nose3 It does look old, as nose has since moved to mecurial from svn. Apparently it works though... but I haven't tried it. 
Here is the mailing list post from april which says it's working: http://groups.google.com/group/nose-users/browse_thread/thread/3463fc48bad31ee8# btw, for python code you can run the test suite with python -3 to get it to print warnings about python3 incompatible things used. I think we could use this to at least start getting python3 incompatible stuff running on python2.6. In fact, that's where I'll start with. cheers, From Chris.Barker at noaa.gov Fri Sep 18 13:08:24 2009 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Fri, 18 Sep 2009 10:08:24 -0700 Subject: [Numpy-discussion] string arrays - accessing data from C++ In-Reply-To: <69d8d540909180235o3ba29b9aqe145be45df393937@mail.gmail.com> References: <69d8d540909180235o3ba29b9aqe145be45df393937@mail.gmail.com> Message-ID: <4AB3BE88.5060700@noaa.gov> Jaroslav Hajek wrote: > Does PyArrayObject::data point to a single contiguous char[] buffer, > like with the old Numeric char arrays, with > PyArrayObject::descr->elsize being the maximum length? yes. > string lengths determined c-style null termination > Finally, is there any way to create an array in NumPy (from within the > interpreter) that would have type == PyArray_CHAR? I think this will get you what you want: a = np.empty((3,4), dtype=np.character) or a = np.empty((3,4), dtype='c') You can learn a lot by experimenting at the command line (even better, ipython): In [27]: a = np.array(('this', 'that','a longer string','s')) In [28]: a Out[28]: array(['this', 'that', 'a longer string', 's'], dtype='|S15') you can see that it is a dtype of '|S15', so each element can be up to 15 bytes. #which you can also fine this way: In [30]: a.itemsize Out[30]: 15 and, for a contiguous block, like this: In [31]: a.strides Out[31]: (15,) # now to look at the bytes themselves: In [37]: b = a.view(dtype=np.uint8).reshape((4,-1)) In [38]: b Out[38]: array([[116, 104, 105, 115, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [116, 104, 97, 116, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [ 97, 32, 108, 111, 110, 103, 101, 114, 32, 115, 116, 114, 105, 110, 103], [115, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]], dtype=uint8) so you can see that it's null-terminated. I find it very cool that you can get at virtually all the c-level info for an array from python. HTH, -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From renesd at gmail.com Fri Sep 18 15:14:56 2009 From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=) Date: Fri, 18 Sep 2009 20:14:56 +0100 Subject: [Numpy-discussion] I want to help with a numpy python 3.1.x port In-Reply-To: <64ddb72c0909180949k3b885c75id6b8d383cce731db@mail.gmail.com> References: <64ddb72c0909180455q5d30d0e4h4ef5f41807cb2a6c@mail.gmail.com> <64ddb72c0909180746j7e1aafd1g6664d12e0a7f6f79@mail.gmail.com> <64ddb72c0909180752g3f350f63y8639a2d9ba2c7c67@mail.gmail.com> <64ddb72c0909180809q386cdac4n2f541cd0db7f9ce@mail.gmail.com> <64ddb72c0909180919v6a0d7930r90afc3ac72673490@mail.gmail.com> <64ddb72c0909180936j4dc0b927t377a781db6af1ca8@mail.gmail.com> <64ddb72c0909180949k3b885c75id6b8d383cce731db@mail.gmail.com> Message-ID: <64ddb72c0909181214kf294732k44b055a0259648b4@mail.gmail.com> On Fri, Sep 18, 2009 at 5:49 PM, Ren? Dudfield wrote: > On Fri, Sep 18, 2009 at 5:36 PM, Ren? Dudfield wrote: >> On Fri, Sep 18, 2009 at 5:33 PM, Charles R Harris >> wrote: >>> Numpy relies on nose for testing. 
I know that there is a py3k branch for >>> nose but it doesn't look very active and I don't know its current state. Do >>> you know anything about that? >>> >>> Chuck >> >> ah, bugger. ?No I don't. ?I can find out though... ?I'll write back >> with what I find. >> >> >> cu, >> > > ok... so this is the py3k branch here... > ? ?svn checkout http://python-nose.googlecode.com/svn/branches/py3k nose3 > > It does look old, as nose has since moved to mecurial from svn. > Apparently it works though... but I haven't tried it. > > Here is the mailing list post from april which says it's working: > ? ?http://groups.google.com/group/nose-users/browse_thread/thread/3463fc48bad31ee8# > > > btw, for python code you can run the test suite with python -3 to get > it to print warnings about python3 incompatible things used. ?I think > we could use this to at least start getting python3 incompatible stuff > running on python2.6. > > In fact, that's where I'll start with. > > > cheers, > Hi, I'll be uploading stuff to github at http://github.com/illume/numpy3k. git clone git://github.com/illume/numpy3k.git It's from numpy trunk... I hope that is ok. Not much there yet of course. Only added a PY3K.txt file with notes from this thread. This will be the file you talked about with the porting notes, and todos/status/plans etc. Will email when there's other stuff to look at. Maybe you could try pulling from there and committing that file to svn as a test that my setup is working? cheers, From highegg at gmail.com Fri Sep 18 15:23:29 2009 From: highegg at gmail.com (Jaroslav Hajek) Date: Fri, 18 Sep 2009 21:23:29 +0200 Subject: [Numpy-discussion] string arrays - accessing data from C++ In-Reply-To: <4AB3BE88.5060700@noaa.gov> References: <69d8d540909180235o3ba29b9aqe145be45df393937@mail.gmail.com> <4AB3BE88.5060700@noaa.gov> Message-ID: <69d8d540909181223j6620152ex2ffbbd6afb7578fe@mail.gmail.com> On Fri, Sep 18, 2009 at 7:08 PM, Christopher Barker wrote: > Jaroslav Hajek wrote: > >> Does PyArrayObject::data point to a single contiguous char[] buffer, >> like with the old Numeric char arrays, with >> PyArrayObject::descr->elsize being the maximum length? > > yes. > >> string lengths determined > > c-style null termination > Hmm, this didn't seem to work for me. But maybe I was doing something else wrong. Thanks. >> Finally, is there any way to create an array in NumPy (from within the >> interpreter) that would have type == PyArray_CHAR? > > I think this will get you what you want: > > a = np.empty((3,4), dtype=np.character) > or > a = np.empty((3,4), dtype='c') > Are you sure? I think this is what I tried (I can't check at this moment), and the result has descr->type equal to PyArray_STRING. Also, note that even in the interpreter, the dtype shows itself as string: >>> numpy.dtype('c') dtype('|S1') > You can learn a lot by experimenting at the command line (even better, > ipython): > > In [27]: a = np.array(('this', 'that','a longer string','s')) > > In [28]: a > Out[28]: > array(['this', 'that', 'a longer string', 's'], > ? ? ? dtype='|S15') > > > you can see that it is a dtype of '|S15', so each element can be up to > 15 bytes. > > #which you can also fine this way: > > In [30]: a.itemsize > Out[30]: 15 > > and, for a contiguous block, like this: > > In [31]: a.strides > Out[31]: (15,) > > # now to look at the bytes themselves: > > In [37]: b = a.view(dtype=np.uint8).reshape((4,-1)) > > In [38]: b > Out[38]: > array([[116, 104, 105, 115, ? 0, ? 0, ? 0, ? 0, ? 0, ? 0, ? 0, ? 0, ? 0, > ? ? ? ? ? 0, ? 0], > ? ? ? 
?[116, 104, ?97, 116, ? 0, ? 0, ? 0, ? 0, ? 0, ? 0, ? 0, ? 0, ? 0, > ? ? ? ? ? 0, ? 0], > ? ? ? ?[ 97, ?32, 108, 111, 110, 103, 101, 114, ?32, 115, 116, 114, 105, > ? ? ? ? 110, 103], > ? ? ? ?[115, ? 0, ? 0, ? 0, ? 0, ? 0, ? 0, ? 0, ? 0, ? 0, ? 0, ? 0, ? 0, > ? ? ? ? ? 0, ? 0]], dtype=uint8) > > > so you can see that it's null-terminated. > Even null-padded, apparently. > I find it very cool that you can get at virtually all the c-level info > for an array from python. > Yes. -- RNDr. Jaroslav Hajek computing expert & GNU Octave developer Aeronautical Research and Test Institute (VZLU) Prague, Czech Republic url: www.highegg.matfyz.cz From renesd at gmail.com Fri Sep 18 15:50:58 2009 From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=) Date: Fri, 18 Sep 2009 20:50:58 +0100 Subject: [Numpy-discussion] I want to help with a numpy python 3.1.x port In-Reply-To: <64ddb72c0909181214kf294732k44b055a0259648b4@mail.gmail.com> References: <64ddb72c0909180455q5d30d0e4h4ef5f41807cb2a6c@mail.gmail.com> <64ddb72c0909180752g3f350f63y8639a2d9ba2c7c67@mail.gmail.com> <64ddb72c0909180809q386cdac4n2f541cd0db7f9ce@mail.gmail.com> <64ddb72c0909180919v6a0d7930r90afc3ac72673490@mail.gmail.com> <64ddb72c0909180936j4dc0b927t377a781db6af1ca8@mail.gmail.com> <64ddb72c0909180949k3b885c75id6b8d383cce731db@mail.gmail.com> <64ddb72c0909181214kf294732k44b055a0259648b4@mail.gmail.com> Message-ID: <64ddb72c0909181250r4d7937a1l715c6d41923f345e@mail.gmail.com> Hi again, I found your numpy/core/src/py3k_notes.txt file. I left the PY3K.txt I made in there for now, so its easy for people to find... but will use the other file for more specific information like is in there now. Or I can move the information across from PY3K.txt into py3k_notes.txt if you prefer. From renesd at gmail.com Fri Sep 18 16:18:14 2009 From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=) Date: Fri, 18 Sep 2009 21:18:14 +0100 Subject: [Numpy-discussion] I want to help with a numpy python 3.1.x port In-Reply-To: <64ddb72c0909181250r4d7937a1l715c6d41923f345e@mail.gmail.com> References: <64ddb72c0909180455q5d30d0e4h4ef5f41807cb2a6c@mail.gmail.com> <64ddb72c0909180809q386cdac4n2f541cd0db7f9ce@mail.gmail.com> <64ddb72c0909180919v6a0d7930r90afc3ac72673490@mail.gmail.com> <64ddb72c0909180936j4dc0b927t377a781db6af1ca8@mail.gmail.com> <64ddb72c0909180949k3b885c75id6b8d383cce731db@mail.gmail.com> <64ddb72c0909181214kf294732k44b055a0259648b4@mail.gmail.com> <64ddb72c0909181250r4d7937a1l715c6d41923f345e@mail.gmail.com> Message-ID: <64ddb72c0909181318p65933213sccd2a722ed72a255@mail.gmail.com> hi, Added a numpy/compat.py file from pygame. This defines these things for compatibility: __all__ = ['geterror', 'long_', 'xrange_', 'ord_', 'unichr_', 'unicode_', 'raw_input_'] geterror() is useful for exceptions compatible between py3k and pyv2... As in py3k you can't do this: except ImportError, e: So with geterror() it's like this: except ImportError: e = compat.geterror() The other ones are just compatibility versions of stuff that's missing. If numpy/compat.py is the wrong place please let me know. cheers, ps. Also... I've asked Lenard, and Marcus from the pygame project if they mind giving their py3k related code from the LGPL pygame to numpy and Marcus has said yes. Marcus doesn't have the time for (yet another) project... but is willing to answer questions if we have any issues. Haven't heard back from Lenard yet... so better wait off before importing the pygame code into numpy until he gives the all clear. pps. 
I was wrong about the -3 flag for python2.6... that only warns about stuff it knows 2to3 can not handle. 2to3 comes with python2.6 as well as python3.0/3.1. From Chris.Barker at noaa.gov Fri Sep 18 16:26:40 2009 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Fri, 18 Sep 2009 13:26:40 -0700 Subject: [Numpy-discussion] string arrays - accessing data from C++ In-Reply-To: <69d8d540909181223j6620152ex2ffbbd6afb7578fe@mail.gmail.com> References: <69d8d540909180235o3ba29b9aqe145be45df393937@mail.gmail.com> <4AB3BE88.5060700@noaa.gov> <69d8d540909181223j6620152ex2ffbbd6afb7578fe@mail.gmail.com> Message-ID: <4AB3ED00.7040202@noaa.gov> Jaroslav Hajek wrote: >>> string lengths determined >> c-style null termination >> > > Hmm, this didn't seem to work for me. But maybe I was doing something > else wrong. Thanks. well, I notice that for a length-n string, if there are n "real' characters, then there is no null, so that may have messed up your code somewhere. >> a = np.empty((3,4), dtype=np.character) > Are you sure? I think this is what I tried (I can't check at this > moment), and the result has descr->type equal to PyArray_STRING. Also, > note that even in the interpreter, the dtype shows itself as string: > >>>> numpy.dtype('c') > dtype('|S1') Good point -- that is a length-one string, not the same thing. Running: for n in dir(np): if type(getattr(np, n)) == type(np.uint8): print n give me what should be all the dtype objects, and these are the ones that look to me like they might be "char": byte character chararray int8 ubyte uint8 but none of those seem to be quite right: In [20]: for dtype in [np.byte, np.character, np.chararray, np.int8, np.ubyte, np.uint8]: ....: a = np.empty((1,1), dtype=dtype); print a.dtype ....: ....: int8 |S1 object int8 uint8 uint8 There was a discussion on the Cython list recently, and apparently "char" is not as clearly defined as I thought -- some compilers use signed, some unsigned.. who knew? So I'm not sure what PyArray_CHAR is. I'm sure someone more familiar with the C side of things can answer this, though. Anyone? > Even null-padded, apparently. let's see: In [24]: a = np.array(['this','that','the other']) In [25]: a.view(np.uint8).reshape((3,-1)) Out[25]: array([[116, 104, 105, 115, 0, 0, 0, 0, 0], [116, 104, 97, 116, 0, 0, 0, 0, 0], [116, 104, 101, 32, 111, 116, 104, 101, 114]], dtype=uint8) In [26]: a[2] = 's' In [27]: a Out[27]: array(['this', 'that', 's'], dtype='|S9') In [28]: a.view(np.uint8).reshape((3,-1)) Out[28]: array([[116, 104, 105, 115, 0, 0, 0, 0, 0], [116, 104, 97, 116, 0, 0, 0, 0, 0], [115, 0, 0, 0, 0, 0, 0, 0, 0]], dtype=uint8) yup-- it looks like the padding is maintained -Chris -- Christopher Barker, Ph.D. 
Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From charlesr.harris at gmail.com Fri Sep 18 17:12:11 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 18 Sep 2009 15:12:11 -0600 Subject: [Numpy-discussion] I want to help with a numpy python 3.1.x port In-Reply-To: <64ddb72c0909181318p65933213sccd2a722ed72a255@mail.gmail.com> References: <64ddb72c0909180455q5d30d0e4h4ef5f41807cb2a6c@mail.gmail.com> <64ddb72c0909180809q386cdac4n2f541cd0db7f9ce@mail.gmail.com> <64ddb72c0909180919v6a0d7930r90afc3ac72673490@mail.gmail.com> <64ddb72c0909180936j4dc0b927t377a781db6af1ca8@mail.gmail.com> <64ddb72c0909180949k3b885c75id6b8d383cce731db@mail.gmail.com> <64ddb72c0909181214kf294732k44b055a0259648b4@mail.gmail.com> <64ddb72c0909181250r4d7937a1l715c6d41923f345e@mail.gmail.com> <64ddb72c0909181318p65933213sccd2a722ed72a255@mail.gmail.com> Message-ID: On Fri, Sep 18, 2009 at 2:18 PM, Ren? Dudfield wrote: > hi, > > Added a numpy/compat.py file from pygame. > > This defines these things for compatibility: > __all__ = ['geterror', 'long_', 'xrange_', 'ord_', 'unichr_', > 'unicode_', 'raw_input_'] > > > geterror() is useful for exceptions compatible between py3k and pyv2... > > As in py3k you can't do this: > except ImportError, e: > > So with geterror() it's like this: > except ImportError: > e = compat.geterror() > > > The other ones are just compatibility versions of stuff that's missing. > > > If numpy/compat.py is the wrong place please let me know. > > > cheers, > > > ps. Also... I've asked Lenard, and Marcus from the pygame project if > they mind giving their py3k related code from the LGPL pygame to numpy > and Marcus has said yes. Marcus doesn't have the time for (yet > another) project... but is willing to answer questions if we have any > issues. Haven't heard back from Lenard yet... so better wait off > before importing the pygame code into numpy until he gives the all > clear. > > pps. I was wrong about the -3 flag for python2.6... that only warns > about stuff it knows 2to3 can not handle. 2to3 comes with python2.6 > as well as python3.0/3.1. > Thanks Rene, I'll take a look at this stuff when I get home. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sat Sep 19 01:41:47 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 18 Sep 2009 23:41:47 -0600 Subject: [Numpy-discussion] I want to help with a numpy python 3.1.x port In-Reply-To: <64ddb72c0909181214kf294732k44b055a0259648b4@mail.gmail.com> References: <64ddb72c0909180455q5d30d0e4h4ef5f41807cb2a6c@mail.gmail.com> <64ddb72c0909180752g3f350f63y8639a2d9ba2c7c67@mail.gmail.com> <64ddb72c0909180809q386cdac4n2f541cd0db7f9ce@mail.gmail.com> <64ddb72c0909180919v6a0d7930r90afc3ac72673490@mail.gmail.com> <64ddb72c0909180936j4dc0b927t377a781db6af1ca8@mail.gmail.com> <64ddb72c0909180949k3b885c75id6b8d383cce731db@mail.gmail.com> <64ddb72c0909181214kf294732k44b055a0259648b4@mail.gmail.com> Message-ID: On Fri, Sep 18, 2009 at 1:14 PM, Ren? Dudfield wrote: > On Fri, Sep 18, 2009 at 5:49 PM, Ren? Dudfield wrote: > > On Fri, Sep 18, 2009 at 5:36 PM, Ren? Dudfield wrote: > >> On Fri, Sep 18, 2009 at 5:33 PM, Charles R Harris > >> wrote: > >>> Numpy relies on nose for testing. I know that there is a py3k branch > for > >>> nose but it doesn't look very active and I don't know its current > state. 
Do > >>> you know anything about that? > >>> > >>> Chuck > >> > >> ah, bugger. No I don't. I can find out though... I'll write back > >> with what I find. > >> > >> > >> cu, > >> > > > > ok... so this is the py3k branch here... > > svn checkout http://python-nose.googlecode.com/svn/branches/py3knose3 > > > > It does look old, as nose has since moved to mecurial from svn. > > Apparently it works though... but I haven't tried it. > > > > Here is the mailing list post from april which says it's working: > > > http://groups.google.com/group/nose-users/browse_thread/thread/3463fc48bad31ee8# > > > > > > btw, for python code you can run the test suite with python -3 to get > > it to print warnings about python3 incompatible things used. I think > > we could use this to at least start getting python3 incompatible stuff > > running on python2.6. > > > > In fact, that's where I'll start with. > > > > > > cheers, > > > > Hi, > > I'll be uploading stuff to github at http://github.com/illume/numpy3k. > git clone git://github.com/illume/numpy3k.git > > Hmm, that doesn't work for me, git finds no commits in common if I pull and, as far as I can tell, essentially clones the whole thing. I quess that's because I'm using git-svn and started in a different place for the initial co and every commit to svn rebases. Diffs might be the easiest way to go. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From david at ar.media.kyoto-u.ac.jp Sat Sep 19 03:53:18 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Sat, 19 Sep 2009 16:53:18 +0900 Subject: [Numpy-discussion] I want to help with a numpy python 3.1.x port In-Reply-To: References: <64ddb72c0909180455q5d30d0e4h4ef5f41807cb2a6c@mail.gmail.com> <64ddb72c0909180501s6a2357dv9e6cbcfd4c62b293@mail.gmail.com> Message-ID: <4AB48DEE.3030106@ar.media.kyoto-u.ac.jp> Charles R Harris wrote: > Hi Ren?, > > On Fri, Sep 18, 2009 at 6:01 AM, Ren? Dudfield > wrote: > > Hello, > > as a big numpy user, and someone wanting to help with the python 3 > migration, I'd like to help with a python 3.1 port of numpy. > > We(at the pygame project) have mostly completed our port of pygame to > python 3.0 and 3.1 so can offer some insight into what it takes with a > CPython extension. > > pygame supports python 2.3 through to 3.1, so it should be possible to > also keep backwards compatibility with the port for numpy. We can > also use some of the helper code we used for the pygame port. > > I haven't asked the other pygame developers if they are interested in > helping too... but maybe they will be interested in helping too(since > they use numpy too). I'd also be interested in other people helping > with the effort. Once I have put some of the ground work in place, > and got a few parts done it should be easier for other people to see > what needs changing. > > > If the python 3 port is something you'd want included with numpy, then > I'd like to begin this weekend. > > I'm not super familiar with the numpy code base yet, so I'd like to do > it in small changes making small parts compatible with py3k and then > having them reviewed/tested. > > I can either start a branch myself using mecurial, or perhaps you'd > like me to do the work somewhere else ( like in a branch in the numpy > svn?). > > Which code should I base the port off? trunk ? > > > Darren Dale and I are just getting started on a port and welcome any > help you can offer. 
Because of the difficulty of maintaining two > branches the only route that looks good at this point is to get the > python parts of numpy in a state that will allow 2to3 to work and use > #ifdef's in the c code. What was your experience with pygames? > > Because the numpy c code is difficult to come to grips with the > easiest part for inexperienced c coders and newbies is to start on is > probably the python code. There is a fair amount of that so some plan > of attack and a check list probably needs to be set up. Any such list > should be kept in svn along with any helpful notes about > problems/solutions encountered along the way. > > We also need David C. to commit his build changes for py3k so we can > actually build the whole thing when the time comes. (Hint, hint). I will try to update it, but I don't have much time ATM to work on python 3 issues - one issue is that we would need some kind of infrastructure so that incompatible distutils changes would be detected automatically (with a buildbot or something). Otherwise, since nobody can use python 3 and numpy ATM, the code will quickly bitrot. Concerning python version, I think 3.0 should not even be considered. It is already considered obsolete, and we can be fairly confident that there are 0 numpy users under python 3.0 :) cheers, David From renesd at gmail.com Sat Sep 19 04:54:51 2009 From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=) Date: Sat, 19 Sep 2009 09:54:51 +0100 Subject: [Numpy-discussion] I want to help with a numpy python 3.1.x port In-Reply-To: References: <64ddb72c0909180455q5d30d0e4h4ef5f41807cb2a6c@mail.gmail.com> <64ddb72c0909180809q386cdac4n2f541cd0db7f9ce@mail.gmail.com> <64ddb72c0909180919v6a0d7930r90afc3ac72673490@mail.gmail.com> <64ddb72c0909180936j4dc0b927t377a781db6af1ca8@mail.gmail.com> <64ddb72c0909180949k3b885c75id6b8d383cce731db@mail.gmail.com> <64ddb72c0909181214kf294732k44b055a0259648b4@mail.gmail.com> Message-ID: <64ddb72c0909190154s3cbbffa1q1bde0d65986947a8@mail.gmail.com> On Sat, Sep 19, 2009 at 6:41 AM, Charles R Harris wrote: >> >> Hi, >> >> I'll be uploading stuff to github at http://github.com/illume/numpy3k. >> ? ?git clone git://github.com/illume/numpy3k.git >> > > Hmm, that doesn't work for me, git finds no commits in common if I pull and, > as far as I can tell, essentially clones the whole thing. I quess that's > because I'm using git-svn and started in a different place for the initial > co and every commit to svn rebases. Diffs might be the easiest way to go. > > Chuck > > ah, oh well. I'll just upload diffs as I go. Which versions of python is numpy to support? 2.5 and greater? Or 2.4 and greater? I can't see if it's written anywhere which versions are supported. Personally, I think at least python2.5 should be supported... as that's still very widely used... but older versions should be able to be supported with my changes if needed. 
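For what it's worth, the geterror() idiom mentioned earlier in the thread can be written so that the same source runs unchanged from 2.4 through 3.1 -- roughly like this (a minimal sketch, not the actual numpy/compat.py code; the module name in the example is made up):

import sys

def geterror():
    # sys.exc_info() behaves the same on python 2.x and 3.x, so this
    # avoids the "except ImportError, e:" syntax that 3.x rejects
    return sys.exc_info()[1]

try:
    import some_py2_only_module   # hypothetical module, for illustration only
except ImportError:
    e = geterror()
    print(e)
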
cheers, From david at ar.media.kyoto-u.ac.jp Sat Sep 19 04:37:34 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Sat, 19 Sep 2009 17:37:34 +0900 Subject: [Numpy-discussion] I want to help with a numpy python 3.1.x port In-Reply-To: <64ddb72c0909190154s3cbbffa1q1bde0d65986947a8@mail.gmail.com> References: <64ddb72c0909180455q5d30d0e4h4ef5f41807cb2a6c@mail.gmail.com> <64ddb72c0909180809q386cdac4n2f541cd0db7f9ce@mail.gmail.com> <64ddb72c0909180919v6a0d7930r90afc3ac72673490@mail.gmail.com> <64ddb72c0909180936j4dc0b927t377a781db6af1ca8@mail.gmail.com> <64ddb72c0909180949k3b885c75id6b8d383cce731db@mail.gmail.com> <64ddb72c0909181214kf294732k44b055a0259648b4@mail.gmail.com> <64ddb72c0909190154s3cbbffa1q1bde0d65986947a8@mail.gmail.com> Message-ID: <4AB4984E.90001@ar.media.kyoto-u.ac.jp> Ren? Dudfield wrote: > ah, oh well. I'll just upload diffs as I go. > > Which versions of python is numpy to support? 2.4 and above, cheers, David From pav at iki.fi Sat Sep 19 05:31:41 2009 From: pav at iki.fi (Pauli Virtanen) Date: Sat, 19 Sep 2009 12:31:41 +0300 Subject: [Numpy-discussion] I want to help with a numpy python 3.1.x port In-Reply-To: References: <64ddb72c0909180455q5d30d0e4h4ef5f41807cb2a6c@mail.gmail.com> <64ddb72c0909180752g3f350f63y8639a2d9ba2c7c67@mail.gmail.com> <64ddb72c0909180809q386cdac4n2f541cd0db7f9ce@mail.gmail.com> <64ddb72c0909180919v6a0d7930r90afc3ac72673490@mail.gmail.com> <64ddb72c0909180936j4dc0b927t377a781db6af1ca8@mail.gmail.com> <64ddb72c0909180949k3b885c75id6b8d383cce731db@mail.gmail.com> <64ddb72c0909181214kf294732k44b055a0259648b4@mail.gmail.com> Message-ID: <1253352700.20568.4.camel@idol> pe, 2009-09-18 kello 23:41 -0600, Charles R Harris kirjoitti: [clip] > Hmm, that doesn't work for me, git finds no commits in common if I > pull and, as far as I can tell, essentially clones the whole thing. I > quess that's because I'm using git-svn and started in a different > place for the initial co and every commit to svn rebases. Diffs might > be the easiest way to go. Use our Git mirror if you need to collaborate: http://projects.scipy.org/numpy/wiki/GitMirror It's there precisely to allow a central tree around which to base your changes, and avoids precisely these problems. Also, avoid touching git-svn as much as possible. If you want to preserve interoperability, "git-svn dcommit" to SVN always from a separate commit branch, to which you cherry-pick changes from the branch you use to collaborate. This is because git-svn rewrites the commits you dcommit to SVN. 
-- Pauli Virtanen From renesd at gmail.com Sat Sep 19 05:55:41 2009 From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=) Date: Sat, 19 Sep 2009 10:55:41 +0100 Subject: [Numpy-discussion] I want to help with a numpy python 3.1.x port In-Reply-To: <1253352700.20568.4.camel@idol> References: <64ddb72c0909180455q5d30d0e4h4ef5f41807cb2a6c@mail.gmail.com> <64ddb72c0909180809q386cdac4n2f541cd0db7f9ce@mail.gmail.com> <64ddb72c0909180919v6a0d7930r90afc3ac72673490@mail.gmail.com> <64ddb72c0909180936j4dc0b927t377a781db6af1ca8@mail.gmail.com> <64ddb72c0909180949k3b885c75id6b8d383cce731db@mail.gmail.com> <64ddb72c0909181214kf294732k44b055a0259648b4@mail.gmail.com> <1253352700.20568.4.camel@idol> Message-ID: <64ddb72c0909190255g47931c74x4d7902364888d41a@mail.gmail.com> On Sat, Sep 19, 2009 at 10:31 AM, Pauli Virtanen wrote: > pe, 2009-09-18 kello 23:41 -0600, Charles R Harris kirjoitti: > > [clip] >> Hmm, that doesn't work for me, git finds no commits in common if I >> pull and, as far as I can tell, essentially clones the whole thing. I >> quess that's because I'm using git-svn and started in a different >> place for the initial co and every commit to svn rebases. Diffs might >> be the easiest way to go. > > Use our Git mirror if you need to collaborate: > > ? ? ? ?http://projects.scipy.org/numpy/wiki/GitMirror > > It's there precisely to allow a central tree around which to base your > changes, and avoids precisely these problems. > > Also, avoid touching git-svn as much as possible. If you want to > preserve interoperability, "git-svn dcommit" to SVN always from a > separate commit branch, to which you cherry-pick changes from the branch > you use to collaborate. This is because git-svn rewrites the commits you > dcommit to SVN. > Cool, I'll follow that guide and set up a new repo... avoiding the git-svn parts of the guide. thanks. From renesd at gmail.com Sat Sep 19 12:23:01 2009 From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=) Date: Sat, 19 Sep 2009 17:23:01 +0100 Subject: [Numpy-discussion] I want to help with a numpy python 3.1.x port In-Reply-To: <64ddb72c0909190255g47931c74x4d7902364888d41a@mail.gmail.com> References: <64ddb72c0909180455q5d30d0e4h4ef5f41807cb2a6c@mail.gmail.com> <64ddb72c0909180919v6a0d7930r90afc3ac72673490@mail.gmail.com> <64ddb72c0909180936j4dc0b927t377a781db6af1ca8@mail.gmail.com> <64ddb72c0909180949k3b885c75id6b8d383cce731db@mail.gmail.com> <64ddb72c0909181214kf294732k44b055a0259648b4@mail.gmail.com> <1253352700.20568.4.camel@idol> <64ddb72c0909190255g47931c74x4d7902364888d41a@mail.gmail.com> Message-ID: <64ddb72c0909190923g34276410rf6f31f66fc202d24@mail.gmail.com> Hi again, I deleted the old git repo there, and I followed the scipy git guide and put the changes here: http://github.com/illume/numpy3k. git clone git://github.com/illume/numpy3k.git So hopefully you'll be able to take those changes from git into svn. If it's still not working... please let me know and I'll still use that git repo, but just upload diffs for you. cheers, From gheorghe.postelnicu at gmail.com Sat Sep 19 17:00:43 2009 From: gheorghe.postelnicu at gmail.com (Gheorghe Postelnicu) Date: Sat, 19 Sep 2009 23:00:43 +0200 Subject: [Numpy-discussion] Error while running installer Message-ID: <8094b1b60909191400t1866b313ke9faeda66b167c67@mail.gmail.com> Hi guys, I just tried to run the 1.3.0 superpack installer and I get the following message box: Executing numpy installer failed. 
The details show the following lines: Output folder: C:\DOCUME~1\Ghighi\LOCALS~1\Temp Install dir for actual installers is C:\DOCUME~1\Ghighi\LOCALS~1\Temp "Target CPU handles SSE2" "Target CPU handles SSE3" "native install (arch value: native)" "Install SSE 3" Extract: numpy-1.3.0-sse3.exe... 100% Execute: "C:\DOCUME~1\Ghighi\LOCALS~1\Temp\numpy-1.3.0-sse3.exe" Completed I just re-installed my Windows box. Are there any pre-requisites other than Python 2.6? I have installed Python 2.6.2. Many thanks in advance for the help, Gheorghe -------------- next part -------------- An HTML attachment was scrubbed... URL: From fperez.net at gmail.com Sat Sep 19 19:25:07 2009 From: fperez.net at gmail.com (Fernando Perez) Date: Sat, 19 Sep 2009 16:25:07 -0700 Subject: [Numpy-discussion] I want to help with a numpy python 3.1.x port In-Reply-To: <64ddb72c0909180752g3f350f63y8639a2d9ba2c7c67@mail.gmail.com> References: <64ddb72c0909180455q5d30d0e4h4ef5f41807cb2a6c@mail.gmail.com> <64ddb72c0909180501s6a2357dv9e6cbcfd4c62b293@mail.gmail.com> <64ddb72c0909180746j7e1aafd1g6664d12e0a7f6f79@mail.gmail.com> <64ddb72c0909180752g3f350f63y8639a2d9ba2c7c67@mail.gmail.com> Message-ID: Hi all, On Fri, Sep 18, 2009 at 7:52 AM, Ren? Dudfield wrote: > one more thing... > > there's also notes about porting to py3k here: > ? ?http://wiki.python.org/moin/cporting > and here: > ? ?http://wiki.python.org/moin/PortingExtensionModulesToPy3k > > Which are better than the python.org docs for cporting. ?That's > probably a pretty good page to store notes about porting as we go. just a note in terms of resources, one I found out recently to be useful is the new version of Dive into Python, that is specifically for py3: http://diveintopython3.org/ in particular it has a handy porting summary of Python-level things: http://diveintopython3.org/porting-code-to-python-3-with-2to3.html Cheers, f From david at ar.media.kyoto-u.ac.jp Sun Sep 20 05:18:39 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Sun, 20 Sep 2009 18:18:39 +0900 Subject: [Numpy-discussion] Error while running installer In-Reply-To: <8094b1b60909191400t1866b313ke9faeda66b167c67@mail.gmail.com> References: <8094b1b60909191400t1866b313ke9faeda66b167c67@mail.gmail.com> Message-ID: <4AB5F36F.9010308@ar.media.kyoto-u.ac.jp> Gheorghe Postelnicu wrote: > Hi guys, > > I just tried to run the 1.3.0 superpack installer and I get the > following message box: > > Executing numpy installer failed. > > The details show the following lines: > > Output folder: C:\DOCUME~1\Ghighi\LOCALS~1\Temp > Install dir for actual installers is C:\DOCUME~1\Ghighi\LOCALS~1\Temp > "Target CPU handles SSE2" > "Target CPU handles SSE3" > "native install (arch value: native)" > "Install SSE 3" > Extract: numpy-1.3.0-sse3.exe... 100% > Execute: "C:\DOCUME~1\Ghighi\LOCALS~1\Temp\numpy-1.3.0-sse3.exe" > Completed > > I just re-installed my Windows box. Are there any pre-requisites other > than Python 2.6? I have installed Python 2.6.2. which version of windows are you using ? Is numpy installed (i.e. is the error message bogus or the installer actually failed ?) 
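A quick way to tell from a Python prompt (just a generic check, nothing installer-specific):

import numpy
print numpy.__version__    # should report 1.3.0 if the superpack actually installed
numpy.test()               # optional: runs the test suite, needs nose installed
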
David From xavier.gnata at gmail.com Sun Sep 20 12:08:38 2009 From: xavier.gnata at gmail.com (Xavier Gnata) Date: Sun, 20 Sep 2009 18:08:38 +0200 Subject: [Numpy-discussion] Best way to insert C code in numpy code In-Reply-To: <64ddb72c0909180501s6a2357dv9e6cbcfd4c62b293@mail.gmail.com> References: <64ddb72c0909180455q5d30d0e4h4ef5f41807cb2a6c@mail.gmail.com> <64ddb72c0909180501s6a2357dv9e6cbcfd4c62b293@mail.gmail.com> Message-ID: <4AB65386.5040904@gmail.com> Hi, I have a large 2D numpy array as input and a 1D array as output. In between, I would like to use C code. C is requirement because it has to be fast and because the algorithm cannot be written in a numpy oriented way :( (no way...really). Which tool should I use to achieve that? waeve.inline? pyrex? What is the baseline? I don't know the size of the 1D array before the end of the computation (if it is relevant in the numpy/C interaction). Xavier From renesd at gmail.com Sun Sep 20 14:13:45 2009 From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=) Date: Sun, 20 Sep 2009 19:13:45 +0100 Subject: [Numpy-discussion] I want to help with a numpy python 3.1.x port In-Reply-To: References: <64ddb72c0909180455q5d30d0e4h4ef5f41807cb2a6c@mail.gmail.com> <64ddb72c0909180501s6a2357dv9e6cbcfd4c62b293@mail.gmail.com> <64ddb72c0909180746j7e1aafd1g6664d12e0a7f6f79@mail.gmail.com> <64ddb72c0909180752g3f350f63y8639a2d9ba2c7c67@mail.gmail.com> Message-ID: <64ddb72c0909201113j6f1426c6vd3cef3cb851df666@mail.gmail.com> Hi again, I noticed numpy includes a copy of distutils. I guess because it's been modified in some way? cheers, From robert.kern at gmail.com Sun Sep 20 14:15:04 2009 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 20 Sep 2009 13:15:04 -0500 Subject: [Numpy-discussion] I want to help with a numpy python 3.1.x port In-Reply-To: <64ddb72c0909201113j6f1426c6vd3cef3cb851df666@mail.gmail.com> References: <64ddb72c0909180455q5d30d0e4h4ef5f41807cb2a6c@mail.gmail.com> <64ddb72c0909180501s6a2357dv9e6cbcfd4c62b293@mail.gmail.com> <64ddb72c0909180746j7e1aafd1g6664d12e0a7f6f79@mail.gmail.com> <64ddb72c0909180752g3f350f63y8639a2d9ba2c7c67@mail.gmail.com> <64ddb72c0909201113j6f1426c6vd3cef3cb851df666@mail.gmail.com> Message-ID: <3d375d730909201115q176ba039x33382a84cf8a4ebb@mail.gmail.com> On Sun, Sep 20, 2009 at 13:13, Ren? Dudfield wrote: > Hi again, > > I noticed numpy includes a copy of distutils. ?I guess because it's > been modified in some way? numpy.distutils is a set of extensions to distutils; it is not a copy of distutils. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From romain.brette at ens.fr Sun Sep 20 14:17:44 2009 From: romain.brette at ens.fr (Romain Brette) Date: Sun, 20 Sep 2009 20:17:44 +0200 Subject: [Numpy-discussion] Numpy question: Best hardware for Numpy? Message-ID: Hi, Would anyone have thoughts about what the best hardware would be for Numpy? In particular, I am wondering about Intel Core i7 vs Xeon. Also, I feel that the limiting factor might be memory speed and cache rather than processor speed. What do you think? 
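One rough way to check that on a given box is to time the same element-wise operation on an array that fits in cache and on one that does not (a quick sketch; the sizes are only illustrative):

import time
import numpy as np

def per_element_time(n, repeats=10):
    x = np.random.rand(n)
    t0 = time.time()
    for i in range(repeats):
        y = x > 0.5          # purely element-wise, memory-bound operation
    return (time.time() - t0) / (repeats * n)

print per_element_time(10**5)   # ~0.8 MB, fits in cache on most CPUs
print per_element_time(10**7)   # ~80 MB, forced out to main memory

If the big array is much slower per element than the small one, memory bandwidth rather than clock speed is the limit for this kind of workload.
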
Best, Romain From ralf.gommers at googlemail.com Sun Sep 20 15:49:18 2009 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Sun, 20 Sep 2009 15:49:18 -0400 Subject: [Numpy-discussion] merging docs from wiki Message-ID: Hi, I'm done reviewing all the improved docstrings for NumPy, they can be merged now from the doc editor Patch page. Maybe I'll get around to doing the SciPy ones as well this week, but I can't promise that. There are a few docstrings on the Patch page I did not mark "Ok to apply": 1. the generic docstrings. Some are marked Ready for review, but they refer mostly to "self" and to "generic" which I don't think is very helpful. It would be great if someone could do just one of those docstrings and make it somewhat informative. There are about 50 that can then be done in the same way. 2. get_numpy_include: the docstring is deleted because the function is deprecated. I don't think that is helpful but I'm not sure. Should this be reverted or applied? Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From dwf at cs.toronto.edu Sun Sep 20 19:33:07 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Sun, 20 Sep 2009 19:33:07 -0400 Subject: [Numpy-discussion] Numpy question: Best hardware for Numpy? In-Reply-To: References: Message-ID: <18AC09A7-EB7F-4787-BD75-2B596AF27B53@cs.toronto.edu> On 20-Sep-09, at 2:17 PM, Romain Brette wrote: > Would anyone have thoughts about what the best hardware would be for > Numpy? In > particular, I am wondering about Intel Core i7 vs Xeon. Also, I feel > that the > limiting factor might be memory speed and cache rather than > processor speed. > What do you think? So, there are several different chips that bear the Xeon brand, you'd have to look at individual benchmarks. But if you're concerned about linear algebra performance, I'd say to go with the desktop version and spend some of the money you save on a license for the Intel Math Kernel Library to link NumPy against: http://software.intel.com/en-us/intel-mkl/ David From aka.tkf at gmail.com Sun Sep 20 20:02:39 2009 From: aka.tkf at gmail.com (Takafumi Arakaki) Date: Mon, 21 Sep 2009 09:02:39 +0900 Subject: [Numpy-discussion] PyArray_AsCArray (cfunction, in Array API) in Numpy User Guide Message-ID: <129768d20909201702gb000326ja35edff6946ceb83@mail.gmail.com> Hi, Is the definition and explanation of PyArray_AsCArray in Numpy User Guide up-to-date? In the guide, it's like this: int PyArray_AsCArray(PyObject** op, void* ptr, npy_intp* dims, int nd, int typenum, int itemsize) (http://docs.scipy.org/doc/numpy/reference/c-api.array.html?highlight=pyarray_ascarray#PyArray_AsCArray) But in the source code: int PyArray_AsCArray(PyObject **op, void *ptr, intp *dims, int nd, PyArray_Descr* typedescr) (http://projects.scipy.org/numpy/browser/tags/1.3.0/numpy/core/src/multiarraymodule.c) I think the code in the guide is from old source code. I found the same code in v0.6.1: (http://projects.scipy.org/numpy/browser/tags/0.6.1/scipy/base/src/multiarraymodule.c) Please tell me how to use PyArray_AsCArray if anyone know. Thanks, Takafumi From ralf.gommers at googlemail.com Sun Sep 20 20:59:57 2009 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Sun, 20 Sep 2009 20:59:57 -0400 Subject: [Numpy-discussion] merging docs from wiki In-Reply-To: References: Message-ID: On Sun, Sep 20, 2009 at 3:49 PM, Ralf Gommers wrote: > Hi, > > I'm done reviewing all the improved docstrings for NumPy, they can be > merged now from the doc editor Patch page. 
Maybe I'll get around to doing > the SciPy ones as well this week, but I can't promise that. > > Actually, scipy was a lot less work. Please merge that too. cheers, ralf > There are a few docstrings on the Patch page I did not mark "Ok to apply": > > 1. the generic docstrings. Some are marked Ready for review, but they refer > mostly to "self" and to "generic" which I don't think is very helpful. It > would be great if someone could do just one of those docstrings and make it > somewhat informative. There are about 50 that can then be done in the same > way. > > 2. get_numpy_include: the docstring is deleted because the function is > deprecated. I don't think that is helpful but I'm not sure. Should this be > reverted or applied? > > Cheers, > Ralf > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From wesmckinn at gmail.com Sun Sep 20 23:11:42 2009 From: wesmckinn at gmail.com (Wes McKinney) Date: Sun, 20 Sep 2009 23:11:42 -0400 Subject: [Numpy-discussion] np.take versus fancy indexing Message-ID: <6c476c8a0909202011g37a5f87cj7dee741a877519dc@mail.gmail.com> Any clue why I'm seeing this behavior? np.take's documentation says it does the "same thing" as fancy indexing, but from this example I'm not so sure: import numpy as np mat = np.random.randn(5000, 1000) selector = np.array(np.arange(5000)[::2]) In [95]: timeit submat = mat[selector] 10 loops, best of 3: 68.4 ms per loop In [96]: timeit submat = np.take(mat, selector, axis=0) 10 loops, best of 3: 21.5 ms per loop indeed the result is the same: In [97]: (mat[selector] == np.take(mat, selector, axis=0)).all() Out[97]: True In [98]: mat[selector].flags Out[98]: C_CONTIGUOUS : True F_CONTIGUOUS : False OWNDATA : True WRITEABLE : True ALIGNED : True UPDATEIFCOPY : False In [99]: np.take(mat, selector, axis=0).flags Out[99]: C_CONTIGUOUS : True F_CONTIGUOUS : False OWNDATA : True WRITEABLE : True ALIGNED : True UPDATEIFCOPY : False What's going on here / am I doing something wrong? Thanks, Wes From aka.tkf at gmail.com Mon Sep 21 03:07:02 2009 From: aka.tkf at gmail.com (Takafumi Arakaki) Date: Mon, 21 Sep 2009 16:07:02 +0900 Subject: [Numpy-discussion] PyArray_AsCArray (cfunction, in Array API) in Numpy User Guide In-Reply-To: <129768d20909201702gb000326ja35edff6946ceb83@mail.gmail.com> References: <129768d20909201702gb000326ja35edff6946ceb83@mail.gmail.com> Message-ID: <129768d20909210007y4c1aca29h3961441acc082695@mail.gmail.com> Hi, I wrote sample code and it works fine. 
This is my code, in case anyone else want to know how to use it: #include #include "structmember.h" #include static PyObject * print_a1(PyObject *dummy, PyObject *args) { npy_intp dims[3]; /* PyArray_AsCArray is for ndim <= 3 */ int typenum; int i, nd; PyObject *o1; double *d1; PyArray_Descr *descr; if (PyArg_ParseTuple(args, "O:print_a1", &o1) < 0) { PyErr_SetString( PyExc_TypeError, "bad arguments"); return NULL; } nd = PyArray_NDIM(o1); typenum = NPY_DOUBLE; descr = PyArray_DescrFromType(typenum); if (PyArray_AsCArray(&o1, (void *)&d1, dims, nd, descr) < 0){ PyErr_SetString( PyExc_TypeError, "error on getting c array"); return NULL; } printf("[%d] ", dims[0]); for (i = 0; i < dims[0]; ++i){ printf("%.2f ", d1[i]); } printf("\n"); printf("if ( ((PyArrayObject *)o1)->data == d1): "); if ( ((PyArrayObject *)o1)->data == (char *)d1){ printf("True\n"); }else{ printf("False\n"); } if (PyArray_Free(o1, (void *)&d1) < 0){ PyErr_SetString( PyExc_TypeError, "PyArray_Free fail"); return NULL; } return Py_BuildValue(""); /* return None */ } static PyMethodDef module_methods[] = { {"print_a1", (PyCFunction)print_a1, METH_VARARGS, ""}, {NULL} /* Sentinel */ }; #ifndef PyMODINIT_FUNC /* declarations for DLL import/export */ #define PyMODINIT_FUNC void #endif PyMODINIT_FUNC initaca(void) { (void) Py_InitModule("aca", module_methods); import_array(); /* required NumPy initialization */ } Thanks, Takafumi From lciti at essex.ac.uk Mon Sep 21 04:35:47 2009 From: lciti at essex.ac.uk (Citi, Luca) Date: Mon, 21 Sep 2009 09:35:47 +0100 Subject: [Numpy-discussion] is ndarray.base the closest base or the ultimate base? Message-ID: <271BED32E925E646A1333A56D9C6AFCB31E561A01D@MBOX0.essex.ac.uk> Hello, I cannot quite understand whether ndarray.base is the closest base, the one from which the view was made or the ultimate base, the one actually containing the data. I think the documentation and the actual behaviour mismatch. In [1]: import numpy as np In [2]: x = np.arange(12) In [3]: y = x[::2] In [4]: z = y[2:] In [5]: x.flags Out[5]: C_CONTIGUOUS : True F_CONTIGUOUS : True OWNDATA : True WRITEABLE : True ALIGNED : True UPDATEIFCOPY : False In [6]: y.flags Out[6]: C_CONTIGUOUS : False F_CONTIGUOUS : False OWNDATA : False WRITEABLE : True ALIGNED : True UPDATEIFCOPY : False In [7]: z.flags Out[7]: C_CONTIGUOUS : False F_CONTIGUOUS : False OWNDATA : False WRITEABLE : True ALIGNED : True UPDATEIFCOPY : False In [8]: z.base Out[8]: array([ 0, 2, 4, 6, 8, 10]) It looks like the "base" of "z" is "y", i.e. its closest base, the array from which the view "z" was created. But the documentation says: base : ndarray If the array is a view on another array, that array is its `base` (unless that array is also a view). The `base` array is where the array data is ultimately stored. and it looks like the "base" should be "x", the array where the data is ultimately stored. I like the second one better. First, because this way I do not have to travel all the bases until I find an array with OWNDATA set. Second, because the current implementation keeps "y" alive because of "z" while in the end "z" only needs "x". In [11]: del y In [12]: z.base Out[12]: array([ 0, 2, 4, 6, 8, 10]) Comments? Best, Luca From pav+sp at iki.fi Mon Sep 21 05:06:53 2009 From: pav+sp at iki.fi (Pauli Virtanen) Date: Mon, 21 Sep 2009 09:06:53 +0000 (UTC) Subject: [Numpy-discussion] is ndarray.base the closest base or the ultimate base? 
References: <271BED32E925E646A1333A56D9C6AFCB31E561A01D@MBOX0.essex.ac.uk> Message-ID: Mon, 21 Sep 2009 09:35:47 +0100, Citi, Luca wrote: > I cannot quite understand whether ndarray.base is the closest base, the > one from which the view was made or the ultimate base, the one actually > containing the data. > I think the documentation and the actual behaviour mismatch. The closest base. If the documentation says the opposite, it's wrong. That it's the closest base also causes crashes when the base chain becomes longer than the stack limit. -- Pauli Virtanen From highegg at gmail.com Mon Sep 21 05:12:34 2009 From: highegg at gmail.com (Jaroslav Hajek) Date: Mon, 21 Sep 2009 11:12:34 +0200 Subject: [Numpy-discussion] string arrays - accessing data from C++ In-Reply-To: <4AB3ED00.7040202@noaa.gov> References: <69d8d540909180235o3ba29b9aqe145be45df393937@mail.gmail.com> <4AB3BE88.5060700@noaa.gov> <69d8d540909181223j6620152ex2ffbbd6afb7578fe@mail.gmail.com> <4AB3ED00.7040202@noaa.gov> Message-ID: <69d8d540909210212i21098df6sde539a4db99a9b39@mail.gmail.com> On Fri, Sep 18, 2009 at 10:26 PM, Christopher Barker wrote: > Jaroslav Hajek wrote: >>>> string lengths determined >>> c-style null termination >>> >> >> Hmm, this didn't seem to work for me. But maybe I was doing something >> else wrong. Thanks. > > well, I notice that for a length-n string, if there are n "real' > characters, then there is no null, so that may have messed up your code > somewhere. > As it happens, the problem was just in my brain :) > >>> a = np.empty((3,4), dtype=np.character) > >> Are you sure? I think this is what I tried (I can't check at this >> moment), and the result has descr->type equal to PyArray_STRING. Also, >> note that even in the interpreter, the dtype shows itself as string: >> >>>>> numpy.dtype('c') >> dtype('|S1') > > Good point -- that is a length-one string, not the same thing. Running: > > for n in dir(np): > ? ?if type(getattr(np, n)) == type(np.uint8): print n > > give me what should be all the dtype objects, and these are the ones > that look to me like they might be "char": > > byte > character > chararray > int8 > ubyte > uint8 > > but none of those seem to be quite right: > > In [20]: for dtype in [np.byte, np.character, np.chararray, np.int8, > np.ubyte, np.uint8]: > ? ?....: ? ? a = np.empty((1,1), dtype=dtype); print a.dtype > ? ?....: > ? ?....: > int8 > |S1 > object > int8 > uint8 > uint8 > > There was a discussion on the Cython list recently, and apparently > "char" is not as clearly defined as I thought -- some compilers use > signed, some unsigned.. who knew? So I'm not sure what PyArray_CHAR is. > This is what I suspected - there is no longer a true "character array" type, and dtype("c") is just an alias for dtype("S1"). Similarly, creating a PyArray_CHAR array from the C API results in dtype("|S1"). > > yup-- it looks like the padding is maintained > That's great, because that's almost exactly the data Octave needs. Only Octave typically uses space as the padding character for compatibility with Matlab, but can cope with nulls as well. NumPy string arrays are supported by Pytave now. Thanks for your help. best regards -- RNDr. Jaroslav Hajek computing expert & GNU Octave developer Aeronautical Research and Test Institute (VZLU) Prague, Czech Republic url: www.highegg.matfyz.cz From lciti at essex.ac.uk Mon Sep 21 05:51:52 2009 From: lciti at essex.ac.uk (Citi, Luca) Date: Mon, 21 Sep 2009 10:51:52 +0100 Subject: [Numpy-discussion] is ndarray.base the closest base or the ultimate base? 
In-Reply-To: References: <271BED32E925E646A1333A56D9C6AFCB31E561A01D@MBOX0.essex.ac.uk>, Message-ID: <271BED32E925E646A1333A56D9C6AFCB31E561A01E@MBOX0.essex.ac.uk> Thanks for your quick answer. Is there a reason for that? Am I wrong or it only makes life harder, such as: while (PyArray_Check(base) && !PyArray_CHKFLAGS(base, NPY_OWNDATA)) { base = PyArray_BASE(base); } plus the possible error you underlined, plus the fact that this keeps a chain of zombies alive without reason. Are there cases where the current behaviour is useful? Best, Luca From pav+sp at iki.fi Mon Sep 21 06:11:43 2009 From: pav+sp at iki.fi (Pauli Virtanen) Date: Mon, 21 Sep 2009 10:11:43 +0000 (UTC) Subject: [Numpy-discussion] is ndarray.base the closest base or the ultimate base? References: <271BED32E925E646A1333A56D9C6AFCB31E561A01D@MBOX0.essex.ac.uk> <271BED32E925E646A1333A56D9C6AFCB31E561A01E@MBOX0.essex.ac.uk> Message-ID: Mon, 21 Sep 2009 10:51:52 +0100, Citi, Luca wrote: > Thanks for your quick answer. > > Is there a reason for that? > Am I wrong or it only makes life harder, such as: > > while (PyArray_Check(base) && !PyArray_CHKFLAGS(base, NPY_OWNDATA)) { > base = PyArray_BASE(base); > } > > plus the possible error you underlined, plus the fact that this keeps a > chain of zombies alive without reason. > > Are there cases where the current behaviour is useful? I don't see real merits in the current behavior over doing the chain up- walk on view creation. I don't know if anything in view creation requires that the immediate base is alive afterwards, but that seems unlikely, so I believe there's no reason not to make this change. Some (bad) code might be broken if this was changed, for example >>> y = x[::-1] >>> y.base is x but in practice this is probably negligible. -- Pauli Virtanen From lciti at essex.ac.uk Mon Sep 21 06:31:27 2009 From: lciti at essex.ac.uk (Citi, Luca) Date: Mon, 21 Sep 2009 11:31:27 +0100 Subject: [Numpy-discussion] is ndarray.base the closest base or the ultimate base? In-Reply-To: References: <271BED32E925E646A1333A56D9C6AFCB31E561A01D@MBOX0.essex.ac.uk> <271BED32E925E646A1333A56D9C6AFCB31E561A01E@MBOX0.essex.ac.uk>, Message-ID: <271BED32E925E646A1333A56D9C6AFCB31E561A01F@MBOX0.essex.ac.uk> I think you do not need to do the chain up walk on view creation. If the assumption is that base is the ultimate base, on view creation you can do something like (pseudo-code): view.base = parent if parent.owndata else parent.base From david at ar.media.kyoto-u.ac.jp Mon Sep 21 06:50:40 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Mon, 21 Sep 2009 19:50:40 +0900 Subject: [Numpy-discussion] Best way to insert C code in numpy code In-Reply-To: <4AB65386.5040904@gmail.com> References: <64ddb72c0909180455q5d30d0e4h4ef5f41807cb2a6c@mail.gmail.com> <64ddb72c0909180501s6a2357dv9e6cbcfd4c62b293@mail.gmail.com> <4AB65386.5040904@gmail.com> Message-ID: <4AB75A80.6020608@ar.media.kyoto-u.ac.jp> Xavier Gnata wrote: > Hi, > > I have a large 2D numpy array as input and a 1D array as output. > In between, I would like to use C code. > C is requirement because it has to be fast and because the algorithm > cannot be written in a numpy oriented way :( (no way...really). > > Which tool should I use to achieve that? waeve.inline? pyrex? What is > the baseline? > That's only a data point, but I almost always use cython in those cases, unless I need 'very advanced' features of the C API in which case I just do it manually. 
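For a rough idea of what that looks like in practice, something along these lines compiles down to a plain C double loop over the buffer (the function name and the per-row computation are only placeholders, and since the output length is not known in advance the results are collected in a list and converted to an array at the end):

# example.pyx -- a minimal sketch, built with the usual cython + distutils recipe
import numpy as np
cimport numpy as np

def process(np.ndarray[np.double_t, ndim=2] data not None):
    cdef Py_ssize_t i, j
    cdef Py_ssize_t nrows = data.shape[0]
    cdef Py_ssize_t ncols = data.shape[1]
    cdef double acc
    out = []                          # output length unknown up front
    for i in range(nrows):
        acc = 0.0
        for j in range(ncols):
            acc += data[i, j]         # placeholder for the real per-row algorithm
        if acc > 0.0:                 # keep only rows passing some condition
            out.append(acc)
    return np.array(out, dtype=np.double)
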
cheers, David From ndbecker2 at gmail.com Mon Sep 21 07:27:33 2009 From: ndbecker2 at gmail.com (Neal Becker) Date: Mon, 21 Sep 2009 07:27:33 -0400 Subject: [Numpy-discussion] something wrong with docs? Message-ID: I'm trying to read about subclassing. When I view http://docs.scipy.org/doc/numpy/user/basics.subclassing.html?highlight=subclass#module- numpy.doc.subclassing It seems the examples show the _outputs_ of tests, but I don't see the actual example code. e.g., the first example renders like this: Simple example - adding an extra attribute to ndarray? Using the object looks like this: From meine at informatik.uni-hamburg.de Mon Sep 21 07:28:41 2009 From: meine at informatik.uni-hamburg.de (Hans Meine) Date: Mon, 21 Sep 2009 13:28:41 +0200 Subject: [Numpy-discussion] =?iso-8859-1?q?is_ndarray=2Ebase_the_closest_b?= =?iso-8859-1?q?ase=09or=09the=09ultimate=09base=3F?= In-Reply-To: <271BED32E925E646A1333A56D9C6AFCB31E561A01F@MBOX0.essex.ac.uk> References: <271BED32E925E646A1333A56D9C6AFCB31E561A01D@MBOX0.essex.ac.uk> <271BED32E925E646A1333A56D9C6AFCB31E561A01F@MBOX0.essex.ac.uk> Message-ID: <200909211328.41980.meine@informatik.uni-hamburg.de> Hi! On Monday 21 September 2009 12:31:27 Citi, Luca wrote: > I think you do not need to do the chain up walk on view creation. > If the assumption is that base is the ultimate base, on view creation > you can do something like (pseudo-code): > view.base = parent if parent.owndata else parent.base Hmm. My impression was that .base was for refcounting purposes *only*. Thus, it is not even guaranteed that the attribute value is an array(-like) object. For example, I might want to allow direct access to some internal buffers of an object of mine in an extension module; then, I'd use .base to bind the lifetime of my object to the array (the lifetime of which I cannot control anymore). Ciao, Hans From romain.brette at ens.fr Mon Sep 21 07:59:39 2009 From: romain.brette at ens.fr (Romain Brette) Date: Mon, 21 Sep 2009 13:59:39 +0200 Subject: [Numpy-discussion] Numpy question: Best hardware for Numpy? In-Reply-To: <18AC09A7-EB7F-4787-BD75-2B596AF27B53@cs.toronto.edu> References: <18AC09A7-EB7F-4787-BD75-2B596AF27B53@cs.toronto.edu> Message-ID: David Warde-Farley a ?crit : > On 20-Sep-09, at 2:17 PM, Romain Brette wrote: > >> Would anyone have thoughts about what the best hardware would be for >> Numpy? In >> particular, I am wondering about Intel Core i7 vs Xeon. Also, I feel >> that the >> limiting factor might be memory speed and cache rather than >> processor speed. >> What do you think? > > > So, there are several different chips that bear the Xeon brand, you'd > have to look at individual benchmarks. But if you're concerned about > linear algebra performance, I'd say to go with the desktop version and > spend some of the money you save on a license for the Intel Math > Kernel Library to link NumPy against: http://software.intel.com/en-us/intel-mkl/ > > David Interesting, I might try Intel MKL. I use mostly element-wise operations (e.g. exp(x) or x>x0, where x is a vector), do you think it would make a big difference? Romain From ndbecker2 at gmail.com Mon Sep 21 08:00:34 2009 From: ndbecker2 at gmail.com (Neal Becker) Date: Mon, 21 Sep 2009 08:00:34 -0400 Subject: [Numpy-discussion] fixed-point arithmetic Message-ID: One thing I'm really missing is something like matlab's fixed-pt toolbox. I'd love to see this added to numpy. A fixed point integer (fpi) type is based on an integer, but keeps track of where the 'binary point' is. 
When created, the fpi has a specified number of fractional bits and integer bits. When assigned to, the fpi will check for overflow. On overflow various actions can be taken, including raise exception and ignore (just wraparound). numpy arrays of fpi should support all numeric operations. Also mixed fpi/integer operations. I'm not sure how to go about implementing this. At first, I was thinking to just subclass numpy array. But, I don't think this provides fpi scalars, and their associated operations. I have code in c++ for a scalar fpi data type (not numpy scalar, just a c++ type) that I think has all the required behavior. It depends on boost::python and other boost code (and unreleased boost constrained_value), so probably would not be interesting to a larger numpy audience. I'm thinking this might all be implemented in cython. I haven't used cython yet, so this could be an opportunity. OTOH, I don't know if cython has all the capabilities to implement a new numpy scalar/array type. Thoughts? Interest? From faltet at pytables.org Mon Sep 21 09:07:58 2009 From: faltet at pytables.org (Francesc Alted) Date: Mon, 21 Sep 2009 15:07:58 +0200 Subject: [Numpy-discussion] Numpy question: Best hardware for Numpy? In-Reply-To: References: <18AC09A7-EB7F-4787-BD75-2B596AF27B53@cs.toronto.edu> Message-ID: <200909211508.04406.faltet@pytables.org> A Monday 21 September 2009 13:59:39 Romain Brette escrigu?: > David Warde-Farley a ?crit : > > On 20-Sep-09, at 2:17 PM, Romain Brette wrote: > >> Would anyone have thoughts about what the best hardware would be for > >> Numpy? In > >> particular, I am wondering about Intel Core i7 vs Xeon. Also, I feel > >> that the > >> limiting factor might be memory speed and cache rather than > >> processor speed. > >> What do you think? > > > > So, there are several different chips that bear the Xeon brand, you'd > > have to look at individual benchmarks. But if you're concerned about > > linear algebra performance, I'd say to go with the desktop version and > > spend some of the money you save on a license for the Intel Math > > Kernel Library to link NumPy against: > > http://software.intel.com/en-us/intel-mkl/ > > > > David > > Interesting, I might try Intel MKL. I use mostly element-wise operations > (e.g. exp(x) or x>x0, where x is a vector), do you think it would make a > big difference? MKL should represent a big advantage for the exp(x) operation. For example, numexpr, that can optionally make use of MKL, gives these figures: In [1]: import numpy as np In [3]: a = np.random.rand(1e7) In [4]: timeit np.exp(a) 10 loops, best of 3: 251 ms per loop In [5]: import numpy as np In [6]: import numexpr as ne In [7]: timeit ne.evaluate("exp(a)") 10 loops, best of 3: 78.7 ms per loop That is, MKL's exp() is around 3x faster than plain C's exp(). You can also set different accuracy modes in MKL: In [8]: ne.set_vml_accuracy_mode('low') Out[8]: 'high' In [9]: timeit ne.evaluate("exp(a)") 10 loops, best of 3: 70.5 ms per loop # 3.5x speedup In [10]: ne.set_vml_accuracy_mode('fast') Out[10]: 'low' In [11]: timeit ne.evaluate("exp(a)") 10 loops, best of 3: 62.8 ms per loop # 4x speedup For the x>x0, you won't get any speed-up from using MKL, as this operation is bounded by memory speed. -- Francesc Alted From renesd at gmail.com Mon Sep 21 09:15:29 2009 From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=) Date: Mon, 21 Sep 2009 14:15:29 +0100 Subject: [Numpy-discussion] Numpy question: Best hardware for Numpy? 
In-Reply-To: <200909211508.04406.faltet@pytables.org> References: <18AC09A7-EB7F-4787-BD75-2B596AF27B53@cs.toronto.edu> <200909211508.04406.faltet@pytables.org> Message-ID: <64ddb72c0909210615q1b96221cre8267f379a67b710@mail.gmail.com> hi, Definitely memory speed is probably the biggest thing to consider. Also using 64bit if you need to do lots of calculations, and cache things. ACML is another alternative... but I've never tried linking it with numpy http://developer.amd.com/cpu/Libraries/acml/Pages/default.aspx AMD Phenom II is their latest chip, but I haven't tried that either. The chips in the latest mac pro run really quick :) Dual 4 core... with lower ghz, but faster memory speed makes my numpy stuff go faster than the higher ghz previous gen mac pro cpus. cu, From cournape at gmail.com Mon Sep 21 10:53:29 2009 From: cournape at gmail.com (David Cournapeau) Date: Mon, 21 Sep 2009 23:53:29 +0900 Subject: [Numpy-discussion] Numpy question: Best hardware for Numpy? In-Reply-To: References: <18AC09A7-EB7F-4787-BD75-2B596AF27B53@cs.toronto.edu> Message-ID: <5b8d13220909210753v7b76965aned20401129088394@mail.gmail.com> On Mon, Sep 21, 2009 at 8:59 PM, Romain Brette wrote: > David Warde-Farley a ?crit : >> On 20-Sep-09, at 2:17 PM, Romain Brette wrote: >> >>> Would anyone have thoughts about what the best hardware would be for >>> Numpy? In >>> particular, I am wondering about Intel Core i7 vs Xeon. Also, I feel >>> that the >>> limiting factor might be memory speed and cache rather than >>> processor speed. >>> What do you think? >> >> >> So, there are several different chips that bear the Xeon brand, you'd >> have to look at individual benchmarks. But if you're concerned about >> linear algebra performance, I'd say to go with the desktop version and >> spend some of the money you save on a license for the Intel Math >> Kernel Library to link NumPy against: http://software.intel.com/en-us/intel-mkl/ >> >> David > > Interesting, I might try Intel MKL. I use mostly element-wise operations > (e.g. exp(x) or x>x0, where x is a vector), do you think it would make a > big difference? It won't make any difference for most operations, at least by default, as we only support the MKL for BLAS/LAPACK. IF the MKL gives a C99 interface to the math library, it may be possible to tweak the build process such as to benefit from them. Concerning the hardware, I have just bought a core i7 (the cheapest model is ~ 200$ now, with 4 cores and 8 Mb of shared cache), and the thing flies for floating point computation. My last computer was a pentium 4 so I don't have a lot of reference, but you can compute ~ 300e6 exp (assuming a contiguous array), and ATLAS 3.8.3 built on it is extremely fast - using the threaded version, the asymptotic peak performances are quite impressive. It takes for example 14s to inverse a 5000x5000 matrix of double. cheers, David From cournape at gmail.com Mon Sep 21 11:03:46 2009 From: cournape at gmail.com (David Cournapeau) Date: Tue, 22 Sep 2009 00:03:46 +0900 Subject: [Numpy-discussion] fixed-point arithmetic In-Reply-To: References: Message-ID: <5b8d13220909210803h36401c9amcd1ff38625d31d1f@mail.gmail.com> On Mon, Sep 21, 2009 at 9:00 PM, Neal Becker wrote: > > numpy arrays of fpi should support all numeric operations. ?Also mixed > fpi/integer operations. > > I'm not sure how to go about implementing this. ?At first, I was thinking to > just subclass numpy array. ?But, I don't think this provides fpi scalars, > and their associated operations. Using dtype seems more straightforward. 
I would first try to see how far you could go using a pure python object as a dtype. For example (on python 2.6): from decimal import Decimal import numpy as np a = np.array([1, 2, 3], Decimal) b = np.array([2, 3, 4], Decimal) a + b works as expected. A lot of things won't work (e.g. most transcendent functions, which would require a specific implementation anyway), but arithmetic, etc... would work. Then, you could think about implementing the class in cython. If speed is an issue, then implementing your own dtype seems the way to go - I don't know exactly what kind of speed increase you could hope from going the object -> dtype, though. cheers, David From sccolbert at gmail.com Mon Sep 21 11:30:04 2009 From: sccolbert at gmail.com (Chris Colbert) Date: Mon, 21 Sep 2009 17:30:04 +0200 Subject: [Numpy-discussion] Numpy question: Best hardware for Numpy? In-Reply-To: <5b8d13220909210753v7b76965aned20401129088394@mail.gmail.com> References: <18AC09A7-EB7F-4787-BD75-2B596AF27B53@cs.toronto.edu> <5b8d13220909210753v7b76965aned20401129088394@mail.gmail.com> Message-ID: <7f014ea60909210830k1c138cfbt3489f5f018de860c@mail.gmail.com> Just because I have a ruler handy.... :) On my laptop with qx9300, I invert that 5000, 5000 double (float64) matrix in 14.67s. Granted my cpu cores were all at about 75 degrees during that process.. Cheers! Chris On Mon, Sep 21, 2009 at 4:53 PM, David Cournapeau wrote: > On Mon, Sep 21, 2009 at 8:59 PM, Romain Brette wrote: >> David Warde-Farley a ?crit : >>> On 20-Sep-09, at 2:17 PM, Romain Brette wrote: >>> >>>> Would anyone have thoughts about what the best hardware would be for >>>> Numpy? In >>>> particular, I am wondering about Intel Core i7 vs Xeon. Also, I feel >>>> that the >>>> limiting factor might be memory speed and cache rather than >>>> processor speed. >>>> What do you think? >>> >>> >>> So, there are several different chips that bear the Xeon brand, you'd >>> have to look at individual benchmarks. But if you're concerned about >>> linear algebra performance, I'd say to go with the desktop version and >>> spend some of the money you save on a license for the Intel Math >>> Kernel Library to link NumPy against: http://software.intel.com/en-us/intel-mkl/ >>> >>> David >> >> Interesting, I might try Intel MKL. I use mostly element-wise operations >> (e.g. exp(x) or x>x0, where x is a vector), do you think it would make a >> big difference? > > It won't make any difference for most operations, at least by default, > as we only support the MKL for BLAS/LAPACK. IF the MKL gives a C99 > interface to the math library, it may be possible to tweak the build > process such as to benefit from them. > > Concerning the hardware, I have just bought a core i7 (the cheapest > model is ~ 200$ now, with 4 cores and 8 Mb of shared cache), and the > thing flies for floating point computation. My last computer was a > pentium 4 so I don't have a lot of reference, but you can compute ~ > 300e6 exp (assuming a contiguous array), and ATLAS 3.8.3 built on it > is extremely fast - using the threaded version, the asymptotic peak > performances are quite impressive. It takes for example 14s to inverse > a 5000x5000 matrix of double. 
> > cheers, > > David > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From ndbecker2 at gmail.com Mon Sep 21 11:57:18 2009 From: ndbecker2 at gmail.com (Neal Becker) Date: Mon, 21 Sep 2009 11:57:18 -0400 Subject: [Numpy-discussion] fixed-point arithmetic References: <5b8d13220909210803h36401c9amcd1ff38625d31d1f@mail.gmail.com> Message-ID: David Cournapeau wrote: > On Mon, Sep 21, 2009 at 9:00 PM, Neal Becker wrote: >> >> numpy arrays of fpi should support all numeric operations. Also mixed >> fpi/integer operations. >> >> I'm not sure how to go about implementing this. At first, I was thinking >> to just subclass numpy array. But, I don't think this provides fpi >> scalars, and their associated operations. > > Using dtype seems more straightforward. I would first try to see how > far you could go using a pure python object as a dtype. For example > (on python 2.6): > > from decimal import Decimal > import numpy as np > a = np.array([1, 2, 3], Decimal) > b = np.array([2, 3, 4], Decimal) > a + b > > works as expected. A lot of things won't work (e.g. most transcendent > functions, which would require a specific implementation anyway), but > arithmetic, etc... would work. > > Then, you could think about implementing the class in cython. If speed > is an issue, then implementing your own dtype seems the way to go - I > don't know exactly what kind of speed increase you could hope from > going the object -> dtype, though. > We don't want to create arrays of fixed-pt objects. That would be very wasteful. What I have in mind is that integer_bits, frac_bits are attributes of the entire arrays, not the individual elements. The array elements are just plain integers. At first I'm thinking that we could subclass numpy array, adding the int_bits and frac_bits attributes. The arithmetic operators would all have to be overloaded. The other aspect is that accessing an element of the array would return a fixed-pt object (not an integer). From eadrogue at gmx.net Mon Sep 21 12:05:19 2009 From: eadrogue at gmx.net (Ernest =?iso-8859-1?Q?Adrogu=E9?=) Date: Mon, 21 Sep 2009 18:05:19 +0200 Subject: [Numpy-discussion] masked arrays as array indices Message-ID: <20090921160519.GA18313@doriath.local> Hello there, Given a masked array such as this one: In [19]: x = np.ma.masked_equal([-1, -1, 0, -1, 2], -1) In [20]: x Out[20]: masked_array(data = [-- -- 0 -- 2], mask = [ True True False True False], fill_value = 999999) When you make an assignemnt in the vein of x[x == 0] = 25 the result can be a bit puzzling: In [21]: x[x == 0] = 25 In [22]: x Out[22]: masked_array(data = [25 25 25 25 2], mask = [False False False False False], fill_value = 999999) Is this the correct result or have I found a bug? Cheers. -- Ernest From rmay31 at gmail.com Mon Sep 21 12:17:05 2009 From: rmay31 at gmail.com (Ryan May) Date: Mon, 21 Sep 2009 11:17:05 -0500 Subject: [Numpy-discussion] masked arrays as array indices In-Reply-To: <20090921160519.GA18313@doriath.local> References: <20090921160519.GA18313@doriath.local> Message-ID: 2009/9/21 Ernest Adrogu? 
> Hello there, > > Given a masked array such as this one: > > In [19]: x = np.ma.masked_equal([-1, -1, 0, -1, 2], -1) > > In [20]: x > Out[20]: > masked_array(data = [-- -- 0 -- 2], > mask = [ True True False True False], > fill_value = 999999) > > When you make an assignemnt in the vein of x[x == 0] = 25 > the result can be a bit puzzling: > > In [21]: x[x == 0] = 25 > > In [22]: x > Out[22]: > masked_array(data = [25 25 25 25 2], > mask = [False False False False False], > fill_value = 999999) > > Is this the correct result or have I found a bug? > I see the same here on 1.4.0.dev7400. Seems pretty odd to me. Then again, it's a bit more complex using masked boolean arrays for indexing since you have True, False, and masked values. Anyone have thoughts on what *should* happen here? Or is this it? Ryan -- Ryan May Graduate Research Assistant School of Meteorology University of Oklahoma -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Mon Sep 21 12:46:49 2009 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 21 Sep 2009 11:46:49 -0500 Subject: [Numpy-discussion] fixed-point arithmetic In-Reply-To: References: <5b8d13220909210803h36401c9amcd1ff38625d31d1f@mail.gmail.com> Message-ID: <3d375d730909210946j51e71bc5ud62dcd7ccb4581f1@mail.gmail.com> On Mon, Sep 21, 2009 at 10:57, Neal Becker wrote: > David Cournapeau wrote: > >> On Mon, Sep 21, 2009 at 9:00 PM, Neal Becker wrote: >>> >>> numpy arrays of fpi should support all numeric operations. ?Also mixed >>> fpi/integer operations. >>> >>> I'm not sure how to go about implementing this. ?At first, I was thinking >>> to just subclass numpy array. ?But, I don't think this provides fpi >>> scalars, and their associated operations. >> >> Using dtype seems more straightforward. I would first try to see how >> far you could go using a pure python object as a dtype. For example >> (on python 2.6): >> >> from decimal import Decimal >> import numpy as np >> a = np.array([1, 2, 3], Decimal) >> b = np.array([2, 3, 4], Decimal) >> a + b >> >> works as expected. A lot of things won't work (e.g. most transcendent >> functions, which would require a specific implementation anyway), but >> arithmetic, etc... would work. >> >> Then, you could think about implementing the class in cython. If speed >> is an issue, then implementing your own dtype seems the way to go - I >> don't know exactly what kind of speed increase you could hope from >> going the object -> dtype, though. >> > > We don't want to create arrays of fixed-pt objects. ?That would be very > wasteful. ?What I have in mind is that integer_bits, frac_bits are > attributes of the entire arrays, not the individual elements. ?The array > elements are just plain integers. > > At first I'm thinking that we could subclass numpy array, adding the > int_bits and frac_bits attributes. ?The arithmetic operators would all have > to be overloaded. > > The other aspect is that accessing an element of the array would return a > fixed-pt object (not an integer). Actually, what you would do is create a new dtype, not a subclass of ndarray. The new datetime dtypes are similar in that they too are "parameterized" dtypes. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From elaine.angelino at gmail.com Mon Sep 21 13:03:34 2009 From: elaine.angelino at gmail.com (Elaine Angelino) Date: Mon, 21 Sep 2009 13:03:34 -0400 Subject: [Numpy-discussion] numpy docstring sphinx pre-processors Message-ID: <901520e20909211003p2ab3e04dk961e8824f013e786@mail.gmail.com> Hi there -- I have been working on a small Python package whose central data object comes from Numpy (the record array object). I would like to produce documentation that looks like Numpy's, and am planning to follow Numpy's docstring standard. Numpy uses Sphinx to generate documentation (e.g. for HTML and LaTeX PDF docs). My understanding is that Numpy has its own pre-processors that modify the docstrings to format them in reStructuredText (reST) before using Sphinx to produce the final output (see http://projects.scipy.org/numpy/wiki/CodingStyleGuidelines#docstring-standard). Are these Numpy pre-processors available to the community? I would love to use them! Thanks very much, Elaine -------------- next part -------------- An HTML attachment was scrubbed... URL: From ndbecker2 at gmail.com Mon Sep 21 13:02:36 2009 From: ndbecker2 at gmail.com (Neal Becker) Date: Mon, 21 Sep 2009 13:02:36 -0400 Subject: [Numpy-discussion] fixed-point arithmetic References: <5b8d13220909210803h36401c9amcd1ff38625d31d1f@mail.gmail.com> <3d375d730909210946j51e71bc5ud62dcd7ccb4581f1@mail.gmail.com> Message-ID: Robert Kern wrote: > On Mon, Sep 21, 2009 at 10:57, Neal Becker wrote: >> David Cournapeau wrote: >> >>> On Mon, Sep 21, 2009 at 9:00 PM, Neal Becker >>> wrote: >>>> >>>> numpy arrays of fpi should support all numeric operations. Also mixed >>>> fpi/integer operations. >>>> >>>> I'm not sure how to go about implementing this. At first, I was >>>> thinking to just subclass numpy array. But, I don't think this >>>> provides fpi scalars, and their associated operations. >>> >>> Using dtype seems more straightforward. I would first try to see how >>> far you could go using a pure python object as a dtype. For example >>> (on python 2.6): >>> >>> from decimal import Decimal >>> import numpy as np >>> a = np.array([1, 2, 3], Decimal) >>> b = np.array([2, 3, 4], Decimal) >>> a + b >>> >>> works as expected. A lot of things won't work (e.g. most transcendent >>> functions, which would require a specific implementation anyway), but >>> arithmetic, etc... would work. >>> >>> Then, you could think about implementing the class in cython. If speed >>> is an issue, then implementing your own dtype seems the way to go - I >>> don't know exactly what kind of speed increase you could hope from >>> going the object -> dtype, though. >>> >> >> We don't want to create arrays of fixed-pt objects. That would be very >> wasteful. What I have in mind is that integer_bits, frac_bits are >> attributes of the entire arrays, not the individual elements. The array >> elements are just plain integers. >> >> At first I'm thinking that we could subclass numpy array, adding the >> int_bits and frac_bits attributes. The arithmetic operators would all >> have to be overloaded. >> >> The other aspect is that accessing an element of the array would return a >> fixed-pt object (not an integer). > > Actually, what you would do is create a new dtype, not a subclass of > ndarray. The new datetime dtypes are similar in that they too are > "parameterized" dtypes. > But doesn't this mean that each array element has it's own int_bits, frac_bits attributes? I don't want that. 
From robert.kern at gmail.com Mon Sep 21 13:08:21 2009 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 21 Sep 2009 12:08:21 -0500 Subject: [Numpy-discussion] numpy docstring sphinx pre-processors In-Reply-To: <901520e20909211003p2ab3e04dk961e8824f013e786@mail.gmail.com> References: <901520e20909211003p2ab3e04dk961e8824f013e786@mail.gmail.com> Message-ID: <3d375d730909211008s23d9ecb0we57d6a0087400ff2@mail.gmail.com> On Mon, Sep 21, 2009 at 12:03, Elaine Angelino wrote: > Hi there -- > > I have been working on a small Python package whose central data object > comes from Numpy (the record array object). > > I would like to produce documentation that looks like Numpy's, and am > planning to follow Numpy's docstring standard. > > Numpy uses Sphinx to generate documentation (e.g. for HTML and LaTeX PDF > docs). > > My understanding is that Numpy has its own pre-processors that modify the > docstrings to format them in reStructuredText (reST) before using Sphinx to > produce the final output (see > http://projects.scipy.org/numpy/wiki/CodingStyleGuidelines#docstring-standard). > > Are these Numpy pre-processors available to the community?? I would love to > use them! http://svn.scipy.org/svn/numpy/trunk/doc/sphinxext/ -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From robert.kern at gmail.com Mon Sep 21 13:09:51 2009 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 21 Sep 2009 12:09:51 -0500 Subject: [Numpy-discussion] fixed-point arithmetic In-Reply-To: References: <5b8d13220909210803h36401c9amcd1ff38625d31d1f@mail.gmail.com> <3d375d730909210946j51e71bc5ud62dcd7ccb4581f1@mail.gmail.com> Message-ID: <3d375d730909211009h5090a648v71142155e0d972aa@mail.gmail.com> On Mon, Sep 21, 2009 at 12:02, Neal Becker wrote: > Robert Kern wrote: > >> On Mon, Sep 21, 2009 at 10:57, Neal Becker wrote: >>> David Cournapeau wrote: >>> >>>> On Mon, Sep 21, 2009 at 9:00 PM, Neal Becker >>>> wrote: >>>>> >>>>> numpy arrays of fpi should support all numeric operations. ?Also mixed >>>>> fpi/integer operations. >>>>> >>>>> I'm not sure how to go about implementing this. ?At first, I was >>>>> thinking to just subclass numpy array. ?But, I don't think this >>>>> provides fpi scalars, and their associated operations. >>>> >>>> Using dtype seems more straightforward. I would first try to see how >>>> far you could go using a pure python object as a dtype. For example >>>> (on python 2.6): >>>> >>>> from decimal import Decimal >>>> import numpy as np >>>> a = np.array([1, 2, 3], Decimal) >>>> b = np.array([2, 3, 4], Decimal) >>>> a + b >>>> >>>> works as expected. A lot of things won't work (e.g. most transcendent >>>> functions, which would require a specific implementation anyway), but >>>> arithmetic, etc... would work. >>>> >>>> Then, you could think about implementing the class in cython. If speed >>>> is an issue, then implementing your own dtype seems the way to go - I >>>> don't know exactly what kind of speed increase you could hope from >>>> going the object -> dtype, though. >>>> >>> >>> We don't want to create arrays of fixed-pt objects. ?That would be very >>> wasteful. ?What I have in mind is that integer_bits, frac_bits are >>> attributes of the entire arrays, not the individual elements. ?The array >>> elements are just plain integers. 
>>> >>> At first I'm thinking that we could subclass numpy array, adding the >>> int_bits and frac_bits attributes. ?The arithmetic operators would all >>> have to be overloaded. >>> >>> The other aspect is that accessing an element of the array would return a >>> fixed-pt object (not an integer). >> >> Actually, what you would do is create a new dtype, not a subclass of >> ndarray. The new datetime dtypes are similar in that they too are >> "parameterized" dtypes. > > But doesn't this mean that each array element has it's own int_bits, > frac_bits attributes? ?I don't want that. No, I'm suggesting that the dtype has the int_bits and frac_bits information just like the new datetime dtypes have their unit information. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From josef.pktd at gmail.com Mon Sep 21 13:15:24 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 21 Sep 2009 13:15:24 -0400 Subject: [Numpy-discussion] numpy docstring sphinx pre-processors In-Reply-To: <3d375d730909211008s23d9ecb0we57d6a0087400ff2@mail.gmail.com> References: <901520e20909211003p2ab3e04dk961e8824f013e786@mail.gmail.com> <3d375d730909211008s23d9ecb0we57d6a0087400ff2@mail.gmail.com> Message-ID: <1cd32cbb0909211015h1e7b7e26l7a37fc3dfd8215cb@mail.gmail.com> On Mon, Sep 21, 2009 at 1:08 PM, Robert Kern wrote: > On Mon, Sep 21, 2009 at 12:03, Elaine Angelino > wrote: >> Hi there -- >> >> I have been working on a small Python package whose central data object >> comes from Numpy (the record array object). >> >> I would like to produce documentation that looks like Numpy's, and am >> planning to follow Numpy's docstring standard. >> >> Numpy uses Sphinx to generate documentation (e.g. for HTML and LaTeX PDF >> docs). >> >> My understanding is that Numpy has its own pre-processors that modify the >> docstrings to format them in reStructuredText (reST) before using Sphinx to >> produce the final output (see >> http://projects.scipy.org/numpy/wiki/CodingStyleGuidelines#docstring-standard). >> >> Are these Numpy pre-processors available to the community?? I would love to >> use them! > > http://svn.scipy.org/svn/numpy/trunk/doc/sphinxext/ > > -- > Robert Kern I just struggled through the same task, which required some adjustments to work on Windows. If you want to compare the versions, the sphinx doc directory of statsmodels is here: http://bazaar.launchpad.net/~scipystats/statsmodels/trunk/files/head%3A/scikits/statsmodels/docs/ This uses the numpy sphinxext, but requires a very recent sphinx and doesn't include any older sphinx compatibility, but works almost out of the box on both windows and linux. Josef > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > ?-- Umberto Eco > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From jsseabold at gmail.com Mon Sep 21 13:15:49 2009 From: jsseabold at gmail.com (Skipper Seabold) Date: Mon, 21 Sep 2009 13:15:49 -0400 Subject: [Numpy-discussion] something wrong with docs? In-Reply-To: References: Message-ID: On Mon, Sep 21, 2009 at 7:27 AM, Neal Becker wrote: > I'm trying to read about subclassing. 
?When I view > > http://docs.scipy.org/doc/numpy/user/basics.subclassing.html?highlight=subclass#module- > numpy.doc.subclassing > > It seems the examples show the _outputs_ of tests, but I don't see the > actual example code. > > e.g., the first example renders like this: > > Simple example - adding an extra attribute to ndarray? > Using the object looks like this: > I'd like to see this sorted as well. The problem is that the `testcode` directive is not recognized. I was recently a bit confused by this, and I went to the rst file to view the code, but that's obviously not a fix for the rendering problem. Skipper From elaine.angelino at gmail.com Mon Sep 21 13:20:29 2009 From: elaine.angelino at gmail.com (Elaine Angelino) Date: Mon, 21 Sep 2009 13:20:29 -0400 Subject: [Numpy-discussion] numpy docstring sphinx pre-processors In-Reply-To: <3d375d730909211008s23d9ecb0we57d6a0087400ff2@mail.gmail.com> References: <901520e20909211003p2ab3e04dk961e8824f013e786@mail.gmail.com> <3d375d730909211008s23d9ecb0we57d6a0087400ff2@mail.gmail.com> Message-ID: <901520e20909211020r71ec62a2ucf23d989894922c8@mail.gmail.com> thanks robert! yes i saw this (http://svn.scipy.org/svn/numpy/trunk/doc/sphinxext/) but is there a good description of how to use this? i'm looking for a "standard recipe" that could be followed by myself and others. e.g. what functions to call and in what order... i would like to emulate what numpy does as closely as possible. thanks again, elaine On Mon, Sep 21, 2009 at 1:08 PM, Robert Kern wrote: > On Mon, Sep 21, 2009 at 12:03, Elaine Angelino > wrote: > > Hi there -- > > > > I have been working on a small Python package whose central data object > > comes from Numpy (the record array object). > > > > I would like to produce documentation that looks like Numpy's, and am > > planning to follow Numpy's docstring standard. > > > > Numpy uses Sphinx to generate documentation (e.g. for HTML and LaTeX PDF > > docs). > > > > My understanding is that Numpy has its own pre-processors that modify the > > docstrings to format them in reStructuredText (reST) before using Sphinx > to > > produce the final output (see > > > http://projects.scipy.org/numpy/wiki/CodingStyleGuidelines#docstring-standard > ). > > > > Are these Numpy pre-processors available to the community? I would love > to > > use them! > > http://svn.scipy.org/svn/numpy/trunk/doc/sphinxext/ > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > -- Umberto Eco > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Mon Sep 21 13:23:52 2009 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 21 Sep 2009 12:23:52 -0500 Subject: [Numpy-discussion] numpy docstring sphinx pre-processors In-Reply-To: <901520e20909211020r71ec62a2ucf23d989894922c8@mail.gmail.com> References: <901520e20909211003p2ab3e04dk961e8824f013e786@mail.gmail.com> <3d375d730909211008s23d9ecb0we57d6a0087400ff2@mail.gmail.com> <901520e20909211020r71ec62a2ucf23d989894922c8@mail.gmail.com> Message-ID: <3d375d730909211023m64c7a4eu5d65f95d8c715649@mail.gmail.com> On Mon, Sep 21, 2009 at 12:20, Elaine Angelino wrote: > thanks robert! 
> > yes i saw this (http://svn.scipy.org/svn/numpy/trunk/doc/sphinxext/) but is > there a good description of how to use this?? i'm looking for a "standard > recipe" that could be followed by myself and others.? e.g. what functions to > call and in what order... i would like to emulate what numpy does as closely > as possible. http://svn.scipy.org/svn/numpy/trunk/doc/source/conf.py -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From efiring at hawaii.edu Mon Sep 21 13:30:38 2009 From: efiring at hawaii.edu (Eric Firing) Date: Mon, 21 Sep 2009 07:30:38 -1000 Subject: [Numpy-discussion] np.take versus fancy indexing In-Reply-To: <6c476c8a0909202011g37a5f87cj7dee741a877519dc@mail.gmail.com> References: <6c476c8a0909202011g37a5f87cj7dee741a877519dc@mail.gmail.com> Message-ID: <4AB7B83E.3020107@hawaii.edu> Wes McKinney wrote: > Any clue why I'm seeing this behavior? np.take's documentation says it > does the "same thing" as fancy indexing, but from this example I'm not > so sure: The code used to implement np.take is not the same as that used in fancy indexing. np.take's mission is simpler, so it uses type-specific code for each numeric type, generated using a template. The same type of optimization was done for putmask and clip. I haven't looked into the code used by fancy indexing. Presumably it could be sped up by using np.take (or the strategy used by np.take) in suitable cases, but I suspect that would be a big job, with plenty of opportunities for introducing bugs. Eric > > import numpy as np > > mat = np.random.randn(5000, 1000) > selector = np.array(np.arange(5000)[::2]) > > In [95]: timeit submat = mat[selector] > 10 loops, best of 3: 68.4 ms per loop > > In [96]: timeit submat = np.take(mat, selector, axis=0) > 10 loops, best of 3: 21.5 ms per loop > > indeed the result is the same: > > In [97]: (mat[selector] == np.take(mat, selector, axis=0)).all() > Out[97]: True > > In [98]: mat[selector].flags > Out[98]: > C_CONTIGUOUS : True > F_CONTIGUOUS : False > OWNDATA : True > WRITEABLE : True > ALIGNED : True > UPDATEIFCOPY : False > > In [99]: np.take(mat, selector, axis=0).flags > Out[99]: > C_CONTIGUOUS : True > F_CONTIGUOUS : False > OWNDATA : True > WRITEABLE : True > ALIGNED : True > UPDATEIFCOPY : False > > What's going on here / am I doing something wrong? > > Thanks, > Wes > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From dwf at cs.toronto.edu Mon Sep 21 13:34:54 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Mon, 21 Sep 2009 13:34:54 -0400 Subject: [Numpy-discussion] numpy docstring sphinx pre-processors In-Reply-To: <901520e20909211020r71ec62a2ucf23d989894922c8@mail.gmail.com> References: <901520e20909211003p2ab3e04dk961e8824f013e786@mail.gmail.com> <3d375d730909211008s23d9ecb0we57d6a0087400ff2@mail.gmail.com> <901520e20909211020r71ec62a2ucf23d989894922c8@mail.gmail.com> Message-ID: <00841381-8136-49F2-BD8A-006A5130DAC9@cs.toronto.edu> On 21-Sep-09, at 1:20 PM, Elaine Angelino wrote: > thanks robert! > > yes i saw this (http://svn.scipy.org/svn/numpy/trunk/doc/sphinxext/) > but is there a good description of how to use this? i'm looking for > a "standard recipe" that could be followed by myself and others. > e.g. what functions to call and in what order... 
i would like to > emulate what numpy does as closely as possible. You should have a look at matplotlib's sampledoc tutorial. It goes over how to use Sphinx extensions. http://matplotlib.sourceforge.net/sampledoc/ David From elaine.angelino at gmail.com Mon Sep 21 13:35:55 2009 From: elaine.angelino at gmail.com (Elaine Angelino) Date: Mon, 21 Sep 2009 13:35:55 -0400 Subject: [Numpy-discussion] numpy docstring sphinx pre-processors In-Reply-To: <3d375d730909211023m64c7a4eu5d65f95d8c715649@mail.gmail.com> References: <901520e20909211003p2ab3e04dk961e8824f013e786@mail.gmail.com> <3d375d730909211008s23d9ecb0we57d6a0087400ff2@mail.gmail.com> <901520e20909211020r71ec62a2ucf23d989894922c8@mail.gmail.com> <3d375d730909211023m64c7a4eu5d65f95d8c715649@mail.gmail.com> Message-ID: <901520e20909211035u195b1c2dx451edb9895062bd1@mail.gmail.com> ok a couple more questions: 1) how does sphinxext relate to numpydoc? sphinxext in scipy source tree -- http://svn.scipy.org/svn/numpy/trunk/doc/sphinxext/ numpydoc on PyPI -- http://pypi.python.org/pypi?%3Aaction=search&term=numpydoc&submit=search 2) what about postprocess.py, should i be using this too? ( http://svn.scipy.org/svn/numpy/trunk/doc/) thanks again elaine On Mon, Sep 21, 2009 at 1:23 PM, Robert Kern wrote: > On Mon, Sep 21, 2009 at 12:20, Elaine Angelino > wrote: > > thanks robert! > > > > yes i saw this (http://svn.scipy.org/svn/numpy/trunk/doc/sphinxext/) but > is > > there a good description of how to use this? i'm looking for a "standard > > recipe" that could be followed by myself and others. e.g. what functions > to > > call and in what order... i would like to emulate what numpy does as > closely > > as possible. > > http://svn.scipy.org/svn/numpy/trunk/doc/source/conf.py > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > -- Umberto Eco > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ndbecker2 at gmail.com Mon Sep 21 13:39:20 2009 From: ndbecker2 at gmail.com (Neal Becker) Date: Mon, 21 Sep 2009 13:39:20 -0400 Subject: [Numpy-discussion] fixed-point arithmetic References: <5b8d13220909210803h36401c9amcd1ff38625d31d1f@mail.gmail.com> <3d375d730909210946j51e71bc5ud62dcd7ccb4581f1@mail.gmail.com> <3d375d730909211009h5090a648v71142155e0d972aa@mail.gmail.com> Message-ID: Robert Kern wrote: > On Mon, Sep 21, 2009 at 12:02, Neal Becker wrote: >> Robert Kern wrote: >> >>> On Mon, Sep 21, 2009 at 10:57, Neal Becker wrote: >>>> David Cournapeau wrote: >>>> >>>>> On Mon, Sep 21, 2009 at 9:00 PM, Neal Becker >>>>> wrote: >>>>>> >>>>>> numpy arrays of fpi should support all numeric operations. Also >>>>>> mixed fpi/integer operations. >>>>>> >>>>>> I'm not sure how to go about implementing this. At first, I was >>>>>> thinking to just subclass numpy array. But, I don't think this >>>>>> provides fpi scalars, and their associated operations. >>>>> >>>>> Using dtype seems more straightforward. I would first try to see how >>>>> far you could go using a pure python object as a dtype. 
For example >>>>> (on python 2.6): >>>>> >>>>> from decimal import Decimal >>>>> import numpy as np >>>>> a = np.array([1, 2, 3], Decimal) >>>>> b = np.array([2, 3, 4], Decimal) >>>>> a + b >>>>> >>>>> works as expected. A lot of things won't work (e.g. most transcendent >>>>> functions, which would require a specific implementation anyway), but >>>>> arithmetic, etc... would work. >>>>> >>>>> Then, you could think about implementing the class in cython. If speed >>>>> is an issue, then implementing your own dtype seems the way to go - I >>>>> don't know exactly what kind of speed increase you could hope from >>>>> going the object -> dtype, though. >>>>> >>>> >>>> We don't want to create arrays of fixed-pt objects. That would be very >>>> wasteful. What I have in mind is that integer_bits, frac_bits are >>>> attributes of the entire arrays, not the individual elements. The >>>> array elements are just plain integers. >>>> >>>> At first I'm thinking that we could subclass numpy array, adding the >>>> int_bits and frac_bits attributes. The arithmetic operators would all >>>> have to be overloaded. >>>> >>>> The other aspect is that accessing an element of the array would return >>>> a fixed-pt object (not an integer). >>> >>> Actually, what you would do is create a new dtype, not a subclass of >>> ndarray. The new datetime dtypes are similar in that they too are >>> "parameterized" dtypes. >> >> But doesn't this mean that each array element has it's own int_bits, >> frac_bits attributes? I don't want that. > > No, I'm suggesting that the dtype has the int_bits and frac_bits > information just like the new datetime dtypes have their unit > information. > 1. Where would I find this new datetime dtype? 2. Don't know exactly what 'parameterized' dtypes are. Does this mean that the dtype for 8.1 format fixed-pt is different from the dtype for 6.2 format, for example? From gokhansever at gmail.com Mon Sep 21 13:45:44 2009 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_Sever?=) Date: Mon, 21 Sep 2009 12:45:44 -0500 Subject: [Numpy-discussion] Simple pattern recognition In-Reply-To: <49d6b3500909161722r6f74cce6j515b756c2b0b78c5@mail.gmail.com> References: <49d6b3500909161722r6f74cce6j515b756c2b0b78c5@mail.gmail.com> Message-ID: <49d6b3500909211045g2913d62ey539171b0668ae7c3@mail.gmail.com> I asked this question at http://stackoverflow.com/questions/1449139/simple-object-recognition and get lots of nice feedback, and finally I have managed to implement what I wanted. What I was looking for is named "connected component labelling or analysis" for my "connected component extraction" I have put the code (lab2.py) and the image (particles.png) under: http://code.google.com/p/ccnworks/source/browse/#svn/trunk/AtSc450/labs What do you think of improving that code and adding into scipy's ndimage library (like connected_components()) ? Comments and suggestions are welcome :) On Wed, Sep 16, 2009 at 7:22 PM, G?khan Sever wrote: > Hello all, > > I want to be able to count predefined simple rectangle shapes on an image > as shown like in this one: > http://img7.imageshack.us/img7/2327/particles.png > > Which is in my case to count all the blue pixels (they are ice-snow flake > shadows in reality) in one of the column. > > What is the way to automate this task, which library or technique should I > study to tackle it. > > Thanks. > > -- > G?khan > -- G?khan -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From zachary.pincus at yale.edu Mon Sep 21 13:57:53 2009 From: zachary.pincus at yale.edu (Zachary Pincus) Date: Mon, 21 Sep 2009 13:57:53 -0400 Subject: [Numpy-discussion] [SciPy-User] Simple pattern recognition In-Reply-To: <49d6b3500909211045g2913d62ey539171b0668ae7c3@mail.gmail.com> References: <49d6b3500909161722r6f74cce6j515b756c2b0b78c5@mail.gmail.com> <49d6b3500909211045g2913d62ey539171b0668ae7c3@mail.gmail.com> Message-ID: I believe that pretty generic connected-component finding is already available with scipy.ndimage.label, as David suggested at the beginning of the thread... This function takes a binary array (e.g. zeros where the background is, non-zero where foreground is) and outputs an array where each connected component of non-background pixels has a unique non-zero "label" value. ndimage.find_objects will then give slices (e.g. bounding boxes) for each labeled object (or a subset of them as specified). There are also a ton of statistics you can calculate based on the labeled objects -- look at the entire ndimage.measurements namespace. Zach On Sep 21, 2009, at 1:45 PM, G?khan Sever wrote: > I asked this question at http://stackoverflow.com/questions/1449139/simple-object-recognition > and get lots of nice feedback, and finally I have managed to > implement what I wanted. > > What I was looking for is named "connected component labelling or > analysis" for my "connected component extraction" > > I have put the code (lab2.py) and the image (particles.png) under: > http://code.google.com/p/ccnworks/source/browse/#svn/trunk/AtSc450/ > labs > > What do you think of improving that code and adding into scipy's > ndimage library (like connected_components()) ? > > Comments and suggestions are welcome :) > > > On Wed, Sep 16, 2009 at 7:22 PM, G?khan Sever > wrote: > Hello all, > > I want to be able to count predefined simple rectangle shapes on an > image as shown like in this one: http://img7.imageshack.us/img7/2327/particles.png > > Which is in my case to count all the blue pixels (they are ice-snow > flake shadows in reality) in one of the column. > > What is the way to automate this task, which library or technique > should I study to tackle it. > > Thanks. > > -- > G?khan > > > > -- > G?khan > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From pav at iki.fi Mon Sep 21 14:04:01 2009 From: pav at iki.fi (Pauli Virtanen) Date: Mon, 21 Sep 2009 21:04:01 +0300 Subject: [Numpy-discussion] numpy docstring sphinx pre-processors In-Reply-To: <901520e20909211035u195b1c2dx451edb9895062bd1@mail.gmail.com> References: <901520e20909211003p2ab3e04dk961e8824f013e786@mail.gmail.com> <3d375d730909211008s23d9ecb0we57d6a0087400ff2@mail.gmail.com> <901520e20909211020r71ec62a2ucf23d989894922c8@mail.gmail.com> <3d375d730909211023m64c7a4eu5d65f95d8c715649@mail.gmail.com> <901520e20909211035u195b1c2dx451edb9895062bd1@mail.gmail.com> Message-ID: <1253556240.4856.7.camel@idol> ma, 2009-09-21 kello 13:35 -0400, Elaine Angelino kirjoitti: > ok a couple more questions: > 1) how does sphinxext relate to numpydoc? > sphinxext in scipy source tree -- > http://svn.scipy.org/svn/numpy/trunk/doc/sphinxext/ > numpydoc on PyPI -- http://pypi.python.org/pypi?% > 3Aaction=search&term=numpydoc&submit=search They are the same. If you want to use easy_install, use numpydoc. > > 2) what about postprocess.py, should i be using this too? 
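A small sketch of the workflow Zachary outlines, using only scipy.ndimage functions. The toy image is made up for illustration; by default label() uses 4-connectivity, so pass ndimage.generate_binary_structure(2, 2) as the structure argument if diagonal neighbours should join a component:

import numpy as np
from scipy import ndimage

# toy binary image: nonzero pixels are the "flake shadows"
img = np.zeros((8, 8), dtype=np.uint8)
img[1:3, 1:4] = 1          # first blob
img[5:7, 5:8] = 1          # second blob

labels, num = ndimage.label(img)       # num == 2 components here

idx = range(1, num + 1)
sizes = ndimage.sum(img, labels, index=idx)          # pixels per component
centers = ndimage.center_of_mass(img, labels, idx)   # centroid of each one
boxes = ndimage.find_objects(labels)                 # bounding-box slices

img[boxes[0]] then gives the first component's sub-image, as in the example David posts later in the thread.
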
> (http://svn.scipy.org/svn/numpy/trunk/doc/) It removes extra section headers from the Latex output. If you want to use it, you'll have to modify to match your module. -- Pauli Virtanen From gokhansever at gmail.com Mon Sep 21 14:04:26 2009 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_Sever?=) Date: Mon, 21 Sep 2009 13:04:26 -0500 Subject: [Numpy-discussion] [SciPy-User] Simple pattern recognition In-Reply-To: References: <49d6b3500909161722r6f74cce6j515b756c2b0b78c5@mail.gmail.com> <49d6b3500909211045g2913d62ey539171b0668ae7c3@mail.gmail.com> Message-ID: <49d6b3500909211104m2ad0646fo6ca8a2d74735e9bc@mail.gmail.com> ndimage.label works differently than what I have done here. Later using find_objects you can get slices for row or column basis. Not possible to construct a dynamical structure to find objects that are in the in both axis. Could you look at the stackoverflow article once again and comment back? Thanks. On Mon, Sep 21, 2009 at 12:57 PM, Zachary Pincus wrote: > I believe that pretty generic connected-component finding is already > available with scipy.ndimage.label, as David suggested at the > beginning of the thread... > > This function takes a binary array (e.g. zeros where the background > is, non-zero where foreground is) and outputs an array where each > connected component of non-background pixels has a unique non-zero > "label" value. > > ndimage.find_objects will then give slices (e.g. bounding boxes) for > each labeled object (or a subset of them as specified). There are also > a ton of statistics you can calculate based on the labeled objects -- > look at the entire ndimage.measurements namespace. > > Zach > > On Sep 21, 2009, at 1:45 PM, G?khan Sever wrote: > > > I asked this question at > http://stackoverflow.com/questions/1449139/simple-object-recognition > > and get lots of nice feedback, and finally I have managed to > > implement what I wanted. > > > > What I was looking for is named "connected component labelling or > > analysis" for my "connected component extraction" > > > > I have put the code (lab2.py) and the image (particles.png) under: > > http://code.google.com/p/ccnworks/source/browse/#svn/trunk/AtSc450/ > > labs > > > > What do you think of improving that code and adding into scipy's > > ndimage library (like connected_components()) ? > > > > Comments and suggestions are welcome :) > > > > > > On Wed, Sep 16, 2009 at 7:22 PM, G?khan Sever > > wrote: > > Hello all, > > > > I want to be able to count predefined simple rectangle shapes on an > > image as shown like in this one: > http://img7.imageshack.us/img7/2327/particles.png > > > > Which is in my case to count all the blue pixels (they are ice-snow > > flake shadows in reality) in one of the column. > > > > What is the way to automate this task, which library or technique > > should I study to tackle it. > > > > Thanks. > > > > -- > > G?khan > > > > > > > > -- > > G?khan > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -- G?khan -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From code at familjenjonsson.org Mon Sep 21 14:05:40 2009 From: code at familjenjonsson.org (Patrik Jonsson) Date: Mon, 21 Sep 2009 11:05:40 -0700 Subject: [Numpy-discussion] Building problem on CentOS 5.3 In-Reply-To: <133c84d10909211041i1a17d9a7l2e1ea1366c2130c3@mail.gmail.com> References: <133c84d10909211041i1a17d9a7l2e1ea1366c2130c3@mail.gmail.com> Message-ID: <133c84d10909211105n4e0e725bt34d484899343b798@mail.gmail.com> Hi all, I've installed python 2.5 on my centos 5.3 x86_64 machine (system standard is 2.4), and now I want to install numpy for it. However, the build fails. The final message is: "EnvironmentError: math library missing; rerun setup.py after setting the MATHLIB env variable" However, from looking earlier in the output, it seems it's looking for a library called libcpml. I'm not sure what this library is (there is no yum package for it), and since numpy works for python 2.4 it seems all libraries should be installed. Any help would be most appreciated. The full build output is attached. Thanks, /Patrik Jonsson -------------- next part -------------- Running from numpy source directory. non-existing path in 'numpy/distutils': 'site.cfg' F2PY Version 2 blas_opt_info: blas_mkl_info: libraries mkl,vml,guide not found in /usr/local/lib libraries mkl,vml,guide not found in /usr/lib NOT AVAILABLE atlas_blas_threads_info: Setting PTATLAS=ATLAS libraries ptf77blas,ptcblas,atlas not found in /usr/local/lib libraries ptf77blas,ptcblas,atlas not found in /usr/lib/sse2 libraries ptf77blas,ptcblas,atlas not found in /usr/lib NOT AVAILABLE atlas_blas_info: libraries f77blas,cblas,atlas not found in /usr/local/lib libraries f77blas,cblas,atlas not found in /usr/lib/sse2 libraries f77blas,cblas,atlas not found in /usr/lib NOT AVAILABLE /home/patrik/system-stuff/numpy-1.3.0/numpy/distutils/system_info.py:1383: UserWarning: Atlas (http://math-atlas.sourceforge.net/) libraries not found. Directories to search for the libraries can be specified in the numpy/distutils/site.cfg file (section [atlas]) or by setting the ATLAS environment variable. warnings.warn(AtlasNotFoundError.__doc__) blas_info: libraries blas not found in /usr/local/lib FOUND: libraries = ['blas'] library_dirs = ['/usr/lib'] language = f77 FOUND: libraries = ['blas'] library_dirs = ['/usr/lib'] define_macros = [('NO_ATLAS_INFO', 1)] language = f77 lapack_opt_info: lapack_mkl_info: mkl_info: libraries mkl,vml,guide not found in /usr/local/lib libraries mkl,vml,guide not found in /usr/lib NOT AVAILABLE NOT AVAILABLE atlas_threads_info: Setting PTATLAS=ATLAS libraries ptf77blas,ptcblas,atlas not found in /usr/local/lib libraries lapack_atlas not found in /usr/local/lib libraries ptf77blas,ptcblas,atlas not found in /usr/lib/sse2 libraries lapack_atlas not found in /usr/lib/sse2 libraries ptf77blas,ptcblas,atlas not found in /usr/lib libraries lapack_atlas not found in /usr/lib numpy.distutils.system_info.atlas_threads_info NOT AVAILABLE atlas_info: libraries f77blas,cblas,atlas not found in /usr/local/lib libraries lapack_atlas not found in /usr/local/lib libraries f77blas,cblas,atlas not found in /usr/lib/sse2 libraries lapack_atlas not found in /usr/lib/sse2 libraries f77blas,cblas,atlas not found in /usr/lib libraries lapack_atlas not found in /usr/lib numpy.distutils.system_info.atlas_info NOT AVAILABLE /home/patrik/system-stuff/numpy-1.3.0/numpy/distutils/system_info.py:1290: UserWarning: Atlas (http://math-atlas.sourceforge.net/) libraries not found. 
Directories to search for the libraries can be specified in the numpy/distutils/site.cfg file (section [atlas]) or by setting the ATLAS environment variable. warnings.warn(AtlasNotFoundError.__doc__) lapack_info: libraries lapack not found in /usr/local/lib FOUND: libraries = ['lapack'] library_dirs = ['/usr/lib'] language = f77 FOUND: libraries = ['lapack', 'blas'] library_dirs = ['/usr/lib'] define_macros = [('NO_ATLAS_INFO', 1)] language = f77 running build running config_cc unifing config_cc, config, build_clib, build_ext, build commands --compiler options running config_fc unifing config_fc, config, build_clib, build_ext, build commands --fcompiler options running build_src building py_modules sources building library "npymath" sources building extension "numpy.core._sort" sources Generating build/src.linux-x86_64-2.5/numpy/core/include/numpy/config.h customize Gnu95FCompiler Found executable /usr/bin/gfortran customize Gnu95FCompiler using config C compiler: gcc -pthread -fno-strict-aliasing -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m32 -march=i386 -mtune=generic -fasynchronous-unwind-tables -D_GNU_SOURCE -fPIC -fPIC compile options: '-Inumpy/core/src -Inumpy/core/include -I/usr/include/python2.5 -c' gcc: _configtest.c success! removing: _configtest.c _configtest.o C compiler: gcc -pthread -fno-strict-aliasing -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m32 -march=i386 -mtune=generic -fasynchronous-unwind-tables -D_GNU_SOURCE -fPIC -fPIC compile options: '-Inumpy/core/src -Inumpy/core/include -I/usr/include/python2.5 -c' gcc: _configtest.c removing: _configtest.c _configtest.o C compiler: gcc -pthread -fno-strict-aliasing -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m32 -march=i386 -mtune=generic -fasynchronous-unwind-tables -D_GNU_SOURCE -fPIC -fPIC compile options: '-Inumpy/core/src -Inumpy/core/include -I/usr/include/python2.5 -c' gcc: _configtest.c removing: _configtest.c _configtest.o C compiler: gcc -pthread -fno-strict-aliasing -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m32 -march=i386 -mtune=generic -fasynchronous-unwind-tables -D_GNU_SOURCE -fPIC -fPIC compile options: '-Inumpy/core/src -Inumpy/core/include -I/usr/include/python2.5 -c' gcc: _configtest.c removing: _configtest.c _configtest.o C compiler: gcc -pthread -fno-strict-aliasing -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m32 -march=i386 -mtune=generic -fasynchronous-unwind-tables -D_GNU_SOURCE -fPIC -fPIC compile options: '-Inumpy/core/src -Inumpy/core/include -I/usr/include/python2.5 -c' gcc: _configtest.c removing: _configtest.c _configtest.o C compiler: gcc -pthread -fno-strict-aliasing -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m32 -march=i386 -mtune=generic -fasynchronous-unwind-tables -D_GNU_SOURCE -fPIC -fPIC compile options: '-Inumpy/core/src -Inumpy/core/include -I/usr/include/python2.5 -c' gcc: _configtest.c removing: _configtest.c _configtest.o C compiler: gcc -pthread -fno-strict-aliasing -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m32 -march=i386 -mtune=generic -fasynchronous-unwind-tables -D_GNU_SOURCE -fPIC -fPIC compile options: 
'-Inumpy/core/src -Inumpy/core/include -I/usr/include/python2.5 -c' gcc: _configtest.c _configtest.c: In function 'main': _configtest.c:5: error: size of array 'test_array' is negative _configtest.c: In function 'main': _configtest.c:5: error: size of array 'test_array' is negative C compiler: gcc -pthread -fno-strict-aliasing -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m32 -march=i386 -mtune=generic -fasynchronous-unwind-tables -D_GNU_SOURCE -fPIC -fPIC compile options: '-Inumpy/core/src -Inumpy/core/include -I/usr/include/python2.5 -c' gcc: _configtest.c removing: _configtest.c _configtest.o _configtest.c _configtest.o C compiler: gcc -pthread -fno-strict-aliasing -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m32 -march=i386 -mtune=generic -fasynchronous-unwind-tables -D_GNU_SOURCE -fPIC -fPIC compile options: '-Inumpy/core/src -Inumpy/core/include -I/usr/include/python2.5 -c' gcc: _configtest.c removing: _configtest.c _configtest.o C compiler: gcc -pthread -fno-strict-aliasing -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m32 -march=i386 -mtune=generic -fasynchronous-unwind-tables -D_GNU_SOURCE -fPIC -fPIC compile options: '-Inumpy/core/src -Inumpy/core/include -I/usr/include/python2.5 -c' gcc: _configtest.c removing: _configtest.c _configtest.o C compiler: gcc -pthread -fno-strict-aliasing -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m32 -march=i386 -mtune=generic -fasynchronous-unwind-tables -D_GNU_SOURCE -fPIC -fPIC compile options: '-Inumpy/core/src -Inumpy/core/include -I/usr/include/python2.5 -c' gcc: _configtest.c removing: _configtest.c _configtest.o C compiler: gcc -pthread -fno-strict-aliasing -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m32 -march=i386 -mtune=generic -fasynchronous-unwind-tables -D_GNU_SOURCE -fPIC -fPIC compile options: '-Inumpy/core/src -Inumpy/core/include -I/usr/include/python2.5 -c' gcc: _configtest.c removing: _configtest.c _configtest.o C compiler: gcc -pthread -fno-strict-aliasing -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m32 -march=i386 -mtune=generic -fasynchronous-unwind-tables -D_GNU_SOURCE -fPIC -fPIC compile options: '-Inumpy/core/src -Inumpy/core/include -I/usr/include/python2.5 -c' gcc: _configtest.c removing: _configtest.c _configtest.o C compiler: gcc -pthread -fno-strict-aliasing -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m32 -march=i386 -mtune=generic -fasynchronous-unwind-tables -D_GNU_SOURCE -fPIC -fPIC compile options: '-Inumpy/core/src -Inumpy/core/include -I/usr/include/python2.5 -c' gcc: _configtest.c _configtest.c: In function 'main': _configtest.c:5: error: size of array 'test_array' is negative _configtest.c: In function 'main': _configtest.c:5: error: size of array 'test_array' is negative C compiler: gcc -pthread -fno-strict-aliasing -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m32 -march=i386 -mtune=generic -fasynchronous-unwind-tables -D_GNU_SOURCE -fPIC -fPIC compile options: '-Inumpy/core/src -Inumpy/core/include -I/usr/include/python2.5 -c' gcc: _configtest.c removing: _configtest.c _configtest.o _configtest.c 
_configtest.o C compiler: gcc -pthread -fno-strict-aliasing -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m32 -march=i386 -mtune=generic -fasynchronous-unwind-tables -D_GNU_SOURCE -fPIC -fPIC compile options: '-Inumpy/core/src -Inumpy/core/include -I/usr/include/python2.5 -c' gcc: _configtest.c removing: _configtest.c _configtest.o C compiler: gcc -pthread -fno-strict-aliasing -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m32 -march=i386 -mtune=generic -fasynchronous-unwind-tables -D_GNU_SOURCE -fPIC -fPIC compile options: '-Inumpy/core/src -Inumpy/core/include -I/usr/include/python2.5 -c' gcc: _configtest.c removing: _configtest.c _configtest.o C compiler: gcc -pthread -fno-strict-aliasing -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m32 -march=i386 -mtune=generic -fasynchronous-unwind-tables -D_GNU_SOURCE -fPIC -fPIC compile options: '-Inumpy/core/src -Inumpy/core/include -I/usr/include/python2.5 -c' gcc: _configtest.c success! removing: _configtest.c _configtest.o C compiler: gcc -pthread -fno-strict-aliasing -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m32 -march=i386 -mtune=generic -fasynchronous-unwind-tables -D_GNU_SOURCE -fPIC -fPIC compile options: '-Inumpy/core/src -Inumpy/core/include -I/usr/include/python2.5 -c' gcc: _configtest.c removing: _configtest.c _configtest.o C compiler: gcc -pthread -fno-strict-aliasing -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m32 -march=i386 -mtune=generic -fasynchronous-unwind-tables -D_GNU_SOURCE -fPIC -fPIC compile options: '-Inumpy/core/src -Inumpy/core/include -I/usr/include/python2.5 -c' gcc: _configtest.c removing: _configtest.c _configtest.o C compiler: gcc -pthread -fno-strict-aliasing -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m32 -march=i386 -mtune=generic -fasynchronous-unwind-tables -D_GNU_SOURCE -fPIC -fPIC compile options: '-Inumpy/core/src -Inumpy/core/include -I/usr/include/python2.5 -c' gcc: _configtest.c removing: _configtest.c _configtest.o C compiler: gcc -pthread -fno-strict-aliasing -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m32 -march=i386 -mtune=generic -fasynchronous-unwind-tables -D_GNU_SOURCE -fPIC -fPIC compile options: '-Inumpy/core/src -Inumpy/core/include -I/usr/include/python2.5 -c' gcc: _configtest.c removing: _configtest.c _configtest.o C compiler: gcc -pthread -fno-strict-aliasing -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m32 -march=i386 -mtune=generic -fasynchronous-unwind-tables -D_GNU_SOURCE -fPIC -fPIC compile options: '-Inumpy/core/src -Inumpy/core/include -I/usr/include/python2.5 -c' gcc: _configtest.c success! removing: _configtest.c _configtest.o /home/patrik/system-stuff/numpy-1.3.0/numpy/distutils/command/config.py:39: DeprecationWarning: +++++++++++++++++++++++++++++++++++++++++++++++++ Usage of try_run is deprecated: please do not use it anymore, and avoid configuration checks involving running executable on the target machine. 
+++++++++++++++++++++++++++++++++++++++++++++++++ DeprecationWarning) C compiler: gcc -pthread -fno-strict-aliasing -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m32 -march=i386 -mtune=generic -fasynchronous-unwind-tables -D_GNU_SOURCE -fPIC -fPIC compile options: '-Inumpy/core/src -Inumpy/core/include -I/usr/include/python2.5 -c' gcc: _configtest.c gcc -pthread _configtest.o -o _configtest /usr/bin/ld: warning: i386 architecture of input file `_configtest.o' is incompatible with i386:x86-64 output _configtest.o: In function `main': /home/patrik/system-stuff/numpy-1.3.0/_configtest.c:5: undefined reference to `exp' collect2: ld returned 1 exit status /usr/bin/ld: warning: i386 architecture of input file `_configtest.o' is incompatible with i386:x86-64 output _configtest.o: In function `main': /home/patrik/system-stuff/numpy-1.3.0/_configtest.c:5: undefined reference to `exp' collect2: ld returned 1 exit status failure. removing: _configtest.c _configtest.o C compiler: gcc -pthread -fno-strict-aliasing -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m32 -march=i386 -mtune=generic -fasynchronous-unwind-tables -D_GNU_SOURCE -fPIC -fPIC compile options: '-Inumpy/core/src -Inumpy/core/include -I/usr/include/python2.5 -c' gcc: _configtest.c gcc -pthread _configtest.o -lm -o _configtest /usr/bin/ld: warning: i386 architecture of input file `_configtest.o' is incompatible with i386:x86-64 output _configtest failure. removing: _configtest.c _configtest.o _configtest C compiler: gcc -pthread -fno-strict-aliasing -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m32 -march=i386 -mtune=generic -fasynchronous-unwind-tables -D_GNU_SOURCE -fPIC -fPIC compile options: '-Inumpy/core/src -Inumpy/core/include -I/usr/include/python2.5 -c' gcc: _configtest.c gcc -pthread _configtest.o -lcpml -o _configtest /usr/bin/ld: cannot find -lcpml collect2: ld returned 1 exit status /usr/bin/ld: cannot find -lcpml collect2: ld returned 1 exit status failure. 
removing: _configtest.c _configtest.o Traceback (most recent call last): File "setup.py", line 172, in setup_package() File "setup.py", line 165, in setup_package configuration=configuration ) File "/home/patrik/system-stuff/numpy-1.3.0/numpy/distutils/core.py", line 184, in setup return old_setup(**new_attr) File "/usr/lib/python2.5/distutils/core.py", line 151, in setup dist.run_commands() File "/usr/lib/python2.5/distutils/dist.py", line 974, in run_commands self.run_command(cmd) File "/usr/lib/python2.5/distutils/dist.py", line 994, in run_command cmd_obj.run() File "/home/patrik/system-stuff/numpy-1.3.0/numpy/distutils/command/build.py", line 37, in run old_build.run(self) File "/usr/lib/python2.5/distutils/command/build.py", line 112, in run self.run_command(cmd_name) File "/usr/lib/python2.5/distutils/cmd.py", line 333, in run_command self.distribution.run_command(command) File "/usr/lib/python2.5/distutils/dist.py", line 994, in run_command cmd_obj.run() File "/home/patrik/system-stuff/numpy-1.3.0/numpy/distutils/command/build_src.py", line 130, in run self.build_sources() File "/home/patrik/system-stuff/numpy-1.3.0/numpy/distutils/command/build_src.py", line 147, in build_sources self.build_extension_sources(ext) File "/home/patrik/system-stuff/numpy-1.3.0/numpy/distutils/command/build_src.py", line 250, in build_extension_sources sources = self.generate_sources(sources, ext) File "/home/patrik/system-stuff/numpy-1.3.0/numpy/distutils/command/build_src.py", line 307, in generate_sources source = func(extension, build_dir) File "numpy/core/setup.py", line 289, in generate_config_h mathlibs = check_mathlib(config_cmd) File "numpy/core/setup.py", line 253, in check_mathlib raise EnvironmentError("math library missing; rerun " EnvironmentError: math library missing; rerun setup.py after setting the MATHLIB env variable From Ashwin.Kashyap at thomson.net Mon Sep 21 13:45:27 2009 From: Ashwin.Kashyap at thomson.net (Kashyap Ashwin) Date: Mon, 21 Sep 2009 13:45:27 -0400 Subject: [Numpy-discussion] Numpy large array bug Message-ID: <68DF70B3485CC648835655773E92314F820154@prinsmail02.am.thmulti.com> Hello, I have downloaded numpy 1.3rc2 sources and compiled it on Ubuntu Hardy Linux x86_64. numpy.test() seems to run ok as well. Here is the bug I can reproduce import numpy as np a=np.zeros((2*1024*1024*1024 + 10000), dtype="uint8") a[:]=1 # returns immediately a.mean() 0.0 print a [0 0 0 ..., 0 0 0] The bug only happens when the nElements > 2G (2^31). So for dtype=uint16/32, the bug happens when size is greater thatn 2^31 as well. Can someone please tell me if I can find a patch for this? I checked the mailing list and trac and I cannot find any related bug. Thanks, Ashwin -------------- next part -------------- An HTML attachment was scrubbed... URL: From renesd at gmail.com Mon Sep 21 14:11:45 2009 From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=) Date: Mon, 21 Sep 2009 19:11:45 +0100 Subject: [Numpy-discussion] Simple pattern recognition In-Reply-To: <49d6b3500909211045g2913d62ey539171b0668ae7c3@mail.gmail.com> References: <49d6b3500909161722r6f74cce6j515b756c2b0b78c5@mail.gmail.com> <49d6b3500909211045g2913d62ey539171b0668ae7c3@mail.gmail.com> Message-ID: <64ddb72c0909211111rd322e1ct8adb40eda4d83d22@mail.gmail.com> On Mon, Sep 21, 2009 at 6:45 PM, G?khan Sever wrote: > I asked this question at > http://stackoverflow.com/questions/1449139/simple-object-recognition and get > lots of nice feedback, and finally I have managed to implement what I > wanted. 
> > What I was looking for is named "connected component labelling or analysis" > for my "connected component extraction" > > I have put the code (lab2.py) and the image (particles.png) under: > http://code.google.com/p/ccnworks/source/browse/#svn/trunk/AtSc450/labs > > What do you think of improving that code and adding into scipy's ndimage > library (like connected_components())? ? > > Comments and suggestions are welcome :) > > > On Wed, Sep 16, 2009 at 7:22 PM, G?khan Sever wrote: >> >> Hello all, >> >> I want to be able to count predefined simple rectangle shapes on an image >> as shown like in this one: http://img7.imageshack.us/img7/2327/particles.png >> >> Which is in my case to count all the blue pixels (they are ice-snow flake >> shadows in reality) in one of the column. >> >> What is the way to automate this task, which library or technique should I >> study to tackle it. >> >> Thanks. >> >> -- >> G?khan > > Hi, cool! I didn't even know there was an ndimage... :) Something similar(connected components) can be found in pygame.transform and pygame.mask. However that code is in C. cheers, From robert.kern at gmail.com Mon Sep 21 14:15:19 2009 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 21 Sep 2009 13:15:19 -0500 Subject: [Numpy-discussion] fixed-point arithmetic In-Reply-To: References: <5b8d13220909210803h36401c9amcd1ff38625d31d1f@mail.gmail.com> <3d375d730909210946j51e71bc5ud62dcd7ccb4581f1@mail.gmail.com> <3d375d730909211009h5090a648v71142155e0d972aa@mail.gmail.com> Message-ID: <3d375d730909211115n627fd5bav6c869e3f40632505@mail.gmail.com> On Mon, Sep 21, 2009 at 12:39, Neal Becker wrote: > 1. Where would I find this new datetime dtype? It's in the SVN trunk. > 2. Don't know exactly what 'parameterized' dtypes are. ?Does this mean that > the dtype for 8.1 format fixed-pt is different from the dtype for 6.2 > format, for example? Yes. The dtype code letter is the same, but the dtype object has metadata attached to it in the form of a dictionary. The ufunc loops get references to the array objects and will look at the dtype metadata in order to figure out exactly what to do. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From faltet at pytables.org Mon Sep 21 14:30:38 2009 From: faltet at pytables.org (Francesc Alted) Date: Mon, 21 Sep 2009 20:30:38 +0200 Subject: [Numpy-discussion] Numpy large array bug In-Reply-To: <68DF70B3485CC648835655773E92314F820154@prinsmail02.am.thmulti.com> References: <68DF70B3485CC648835655773E92314F820154@prinsmail02.am.thmulti.com> Message-ID: <200909212030.39029.faltet@pytables.org> A Monday 21 September 2009 19:45:27 Kashyap Ashwin escrigu?: > Hello, > > I have downloaded numpy 1.3rc2 sources and compiled it on Ubuntu Hardy > Linux x86_64. numpy.test() seems to run ok as well. > > > > Here is the bug I can reproduce > > > > import numpy as np > > a=np.zeros((2*1024*1024*1024 + 10000), dtype="uint8") > > a[:]=1 > > > > # returns immediately > > a.mean() > > 0.0 > > > > print a > > [0 0 0 ..., 0 0 0] > > > > The bug only happens when the nElements > 2G (2^31). So for > dtype=uint16/32, the bug happens when size is greater thatn 2^31 as > well. Yup. I can reproduce your problem with NumPy 1.3.0 (final) and a 64-bit platform. I suppose that you should file a bug better. 
-- Francesc Alted From pav at iki.fi Mon Sep 21 14:32:40 2009 From: pav at iki.fi (Pauli Virtanen) Date: Mon, 21 Sep 2009 21:32:40 +0300 Subject: [Numpy-discussion] something wrong with docs? In-Reply-To: References: Message-ID: <1253557959.3707.4.camel@idol> ma, 2009-09-21 kello 13:15 -0400, Skipper Seabold kirjoitti: > On Mon, Sep 21, 2009 at 7:27 AM, Neal Becker wrote: > > I'm trying to read about subclassing. When I view > > > > http://docs.scipy.org/doc/numpy/user/basics.subclassing.html?highlight=subclass#module- > > numpy.doc.subclassing > > > > It seems the examples show the _outputs_ of tests, but I don't see the > > actual example code. > > > > e.g., the first example renders like this: > > > > Simple example - adding an extra attribute to ndarray? > > Using the object looks like this: > > > > I'd like to see this sorted as well. The problem is that the > `testcode` directive > > is not recognized. I was recently a bit confused by this, and I went > to the rst file to view the code, but that's obviously not a fix for > the rendering problem. The `sphinx.ext.doctest` extension is not enabled, so the testcode:: etc. directives are not available. I'm not sure if it should be enabled -- it would be cleaner to just replace the testcode:: stuff with the ordinary example markup. -- Pauli Virtanen From dwf at cs.toronto.edu Mon Sep 21 14:36:05 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Mon, 21 Sep 2009 14:36:05 -0400 Subject: [Numpy-discussion] [SciPy-User] Simple pattern recognition In-Reply-To: <49d6b3500909211104m2ad0646fo6ca8a2d74735e9bc@mail.gmail.com> References: <49d6b3500909161722r6f74cce6j515b756c2b0b78c5@mail.gmail.com> <49d6b3500909211045g2913d62ey539171b0668ae7c3@mail.gmail.com> <49d6b3500909211104m2ad0646fo6ca8a2d74735e9bc@mail.gmail.com> Message-ID: <3F97EAA3-72C2-43D2-A060-580B8127191C@cs.toronto.edu> I think Zachary is right, ndimage does what you want: In [48]: image = array( [[0,0,0,1,1,0,0], [0,0,0,1,1,1,0], [0,0,0,1,0,0,0], [0,0,0,0,0,0,0], [0,1,0,0,0,0,0], [0,1,1,0,0,0,0], [0,0,0,0,1,1,0], [0,0,0,0,1,1,1]]) In [57]: import scipy.ndimage as ndimage In [58]: labels, num_found = ndimage.label(image) In [59]: object_slices = ndimage.find_objects(labels) In [60]: image[object_slices[0]] Out[60]: array([[1, 1, 0], [1, 1, 1], [1, 0, 0]]) In [61]: image[object_slices[1]] Out[61]: array([[1, 0], [1, 1]]) In [62]: image[object_slices[2]] Out[62]: array([[1, 1, 0], [1, 1, 1]]) David On 21-Sep-09, at 2:04 PM, G?khan Sever wrote: > ndimage.label works differently than what I have done here. > > Later using find_objects you can get slices for row or column basis. > Not > possible to construct a dynamical structure to find objects that are > in the > in both axis. > > Could you look at the stackoverflow article once again and comment > back? > > Thanks. > > On Mon, Sep 21, 2009 at 12:57 PM, Zachary Pincus >wrote: > >> I believe that pretty generic connected-component finding is already >> available with scipy.ndimage.label, as David suggested at the >> beginning of the thread... >> >> This function takes a binary array (e.g. zeros where the background >> is, non-zero where foreground is) and outputs an array where each >> connected component of non-background pixels has a unique non-zero >> "label" value. >> >> ndimage.find_objects will then give slices (e.g. bounding boxes) for >> each labeled object (or a subset of them as specified). 
There are >> also >> a ton of statistics you can calculate based on the labeled objects -- >> look at the entire ndimage.measurements namespace. >> >> Zach >> >> On Sep 21, 2009, at 1:45 PM, G?khan Sever wrote: >> >>> I asked this question at >> http://stackoverflow.com/questions/1449139/simple-object-recognition >>> and get lots of nice feedback, and finally I have managed to >>> implement what I wanted. >>> >>> What I was looking for is named "connected component labelling or >>> analysis" for my "connected component extraction" >>> >>> I have put the code (lab2.py) and the image (particles.png) under: >>> http://code.google.com/p/ccnworks/source/browse/#svn/trunk/AtSc450/ >>> labs >>> >>> What do you think of improving that code and adding into scipy's >>> ndimage library (like connected_components()) ? >>> >>> Comments and suggestions are welcome :) >>> >>> >>> On Wed, Sep 16, 2009 at 7:22 PM, G?khan Sever >>> wrote: >>> Hello all, >>> >>> I want to be able to count predefined simple rectangle shapes on an >>> image as shown like in this one: >> http://img7.imageshack.us/img7/2327/particles.png >>> >>> Which is in my case to count all the blue pixels (they are ice-snow >>> flake shadows in reality) in one of the column. >>> >>> What is the way to automate this task, which library or technique >>> should I study to tackle it. >>> >>> Thanks. >>> >>> -- >>> G?khan >>> >>> >>> >>> -- >>> G?khan >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > > > > -- > G?khan > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From charlesr.harris at gmail.com Mon Sep 21 14:40:08 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 21 Sep 2009 12:40:08 -0600 Subject: [Numpy-discussion] Numpy large array bug In-Reply-To: <200909212030.39029.faltet@pytables.org> References: <68DF70B3485CC648835655773E92314F820154@prinsmail02.am.thmulti.com> <200909212030.39029.faltet@pytables.org> Message-ID: On Mon, Sep 21, 2009 at 12:30 PM, Francesc Alted wrote: > A Monday 21 September 2009 19:45:27 Kashyap Ashwin escrigu?: > > Hello, > > > > I have downloaded numpy 1.3rc2 sources and compiled it on Ubuntu Hardy > > Linux x86_64. numpy.test() seems to run ok as well. > > > > > > > > Here is the bug I can reproduce > > > > > > > > import numpy as np > > > > a=np.zeros((2*1024*1024*1024 + 10000), dtype="uint8") > > > > a[:]=1 > > > > > > > > # returns immediately > > > > a.mean() > > > > 0.0 > > > > > > > > print a > > > > [0 0 0 ..., 0 0 0] > > > > > > > > The bug only happens when the nElements > 2G (2^31). So for > > dtype=uint16/32, the bug happens when size is greater thatn 2^31 as > > well. > > Yup. I can reproduce your problem with NumPy 1.3.0 (final) and a 64-bit > platform. I suppose that you should file a bug better. > > Does is persist for svn? IIRC, there is another ticket for a slicing bug for large arrays. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From pgmdevlist at gmail.com Mon Sep 21 14:43:26 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Mon, 21 Sep 2009 14:43:26 -0400 Subject: [Numpy-discussion] masked arrays as array indices (is a bad idea) In-Reply-To: References: <20090921160519.GA18313@doriath.local> Message-ID: On Sep 21, 2009, at 12:17 PM, Ryan May wrote: > 2009/9/21 Ernest Adrogu? > Hello there, > > Given a masked array such as this one: > > In [19]: x = np.ma.masked_equal([-1, -1, 0, -1, 2], -1) > > In [20]: x > Out[20]: > masked_array(data = [-- -- 0 -- 2], > mask = [ True True False True False], > fill_value = 999999) > > When you make an assignemnt in the vein of x[x == 0] = 25 > the result can be a bit puzzling: > > In [21]: x[x == 0] = 25 > > In [22]: x > Out[22]: > masked_array(data = [25 25 25 25 2], > mask = [False False False False False], > fill_value = 999999) > > Is this the correct result or have I found a bug? > > I see the same here on 1.4.0.dev7400. Seems pretty odd to me. Then > again, it's a bit more complex using masked boolean arrays for > indexing since you have True, False, and masked values. Anyone have > thoughts on what *should* happen here? Or is this it? Using a masked array in fancy indexing is always a bad idea, as there's no way of guessing the behavior one would want for missing values: should they be evaluated as False ? True ? You should really use the `filled` method to control the behavior. >>> x[(x==0).filled(False)] masked_array(data = [0], mask = [False], fill_value = 999999) >>>x[(x==0).filled(True)] masked_array(data = [-- -- 0 --], mask = [ True True False True], fill_value = 999999) P. [If you're really interested: When testing for equality, a masked array is first filled with 0 (that was the behavior of the first implementation of numpy.ma), tested for equality, and the mask of the result set to the mask of the input. When used in fancy indexing, a masked array is viewed as a standard ndarray by dropping the mask. In the current case, the combination is therefore equivalent to (x.filled(0)==0), which explains why the missing values are treated as True... I agree that the prefilling may not be necessary...] From xavier.gnata at gmail.com Mon Sep 21 14:55:00 2009 From: xavier.gnata at gmail.com (Xavier Gnata) Date: Mon, 21 Sep 2009 20:55:00 +0200 Subject: [Numpy-discussion] Best way to insert C code in numpy code In-Reply-To: <4AB75A80.6020608@ar.media.kyoto-u.ac.jp> References: <64ddb72c0909180455q5d30d0e4h4ef5f41807cb2a6c@mail.gmail.com> <64ddb72c0909180501s6a2357dv9e6cbcfd4c62b293@mail.gmail.com> <4AB65386.5040904@gmail.com> <4AB75A80.6020608@ar.media.kyoto-u.ac.jp> Message-ID: <4AB7CC04.8070300@gmail.com> David Cournapeau wrote: > Xavier Gnata wrote: > >> Hi, >> >> I have a large 2D numpy array as input and a 1D array as output. >> In between, I would like to use C code. >> C is requirement because it has to be fast and because the algorithm >> cannot be written in a numpy oriented way :( (no way...really). >> >> Which tool should I use to achieve that? waeve.inline? pyrex? What is >> the baseline? >> >> > > That's only a data point, but I almost always use cython in those cases, > unless I need 'very advanced' features of the C API in which case I just > do it manually. > > cheers, > > David > Ok :) Should I read that to learn you cython and numpy interact? Or is there another best documentation (with examples...)? 
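As a concrete sketch of the tutorial-style typed-buffer approach for the 2D-in/1D-out case discussed in this thread (illustrative only; the inner loop body is just a stand-in for the real non-vectorizable algorithm), a .pyx file could look roughly like this:

# rowreduce.pyx -- illustrative sketch only
import numpy as np
cimport numpy as np

def per_row(np.ndarray[np.double_t, ndim=2] a):
    cdef Py_ssize_t i, j, n = a.shape[0], m = a.shape[1]
    cdef np.ndarray[np.double_t, ndim=1] out = np.zeros(n, dtype=np.double)
    cdef double acc
    for i in range(n):
        acc = 0.0
        for j in range(m):
            acc += a[i, j]        # stand-in for the real iterative update
        out[i] = acc
    return out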
Xavier http://wiki.cython.org/tutorials/numpy From Ashwin.Kashyap at thomson.net Mon Sep 21 14:55:56 2009 From: Ashwin.Kashyap at thomson.net (Kashyap Ashwin) Date: Mon, 21 Sep 2009 14:55:56 -0400 Subject: [Numpy-discussion] Numpy large array bug In-Reply-To: References: Message-ID: <68DF70B3485CC648835655773E92314F82016D@prinsmail02.am.thmulti.com> Yes, it happens for the trunk as well. > > import numpy as np > > > > a=np.zeros((2*1024*1024*1024 + 10000), dtype="uint8") > > > > a[:]=1 > > # returns immediately > > > > a.mean() > > > > 0.0 > > print a > > > > [0 0 0 ..., 0 0 0] > > The bug only happens when the nElements > 2G (2^31). So for > > dtype=uint16/32, the bug happens when size is greater thatn 2^31 as > > well. > > Yup. I can reproduce your problem with NumPy 1.3.0 (final) and a 64-bit > platform. I suppose that you should file a bug better. > > Does is persist for svn? IIRC, there is another ticket for a slicing bug for large arrays. From Chris.Barker at noaa.gov Mon Sep 21 15:09:12 2009 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Mon, 21 Sep 2009 12:09:12 -0700 Subject: [Numpy-discussion] Best way to insert C code in numpy code In-Reply-To: <4AB7CC04.8070300@gmail.com> References: <64ddb72c0909180455q5d30d0e4h4ef5f41807cb2a6c@mail.gmail.com> <64ddb72c0909180501s6a2357dv9e6cbcfd4c62b293@mail.gmail.com> <4AB65386.5040904@gmail.com> <4AB75A80.6020608@ar.media.kyoto-u.ac.jp> <4AB7CC04.8070300@gmail.com> Message-ID: <4AB7CF58.3060601@noaa.gov> Xavier Gnata wrote: > David Cournapeau wrote: >> That's only a data point, but I almost always use cython in those cases, I'm a second data point, but I think there are many more. Judging from the SciPy conference, Cython is the preferred method for new projects. > Should I read that to learn you cython and numpy interact? > http://wiki.cython.org/tutorials/numpy That's probably the best starting point. Also look online for the videos of the presentations at the SciPy2009 conference -- there were a few Cython ones. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From dwf at cs.toronto.edu Mon Sep 21 15:12:32 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Mon, 21 Sep 2009 15:12:32 -0400 Subject: [Numpy-discussion] Best way to insert C code in numpy code In-Reply-To: <4AB7CC04.8070300@gmail.com> References: <64ddb72c0909180455q5d30d0e4h4ef5f41807cb2a6c@mail.gmail.com> <64ddb72c0909180501s6a2357dv9e6cbcfd4c62b293@mail.gmail.com> <4AB65386.5040904@gmail.com> <4AB75A80.6020608@ar.media.kyoto-u.ac.jp> <4AB7CC04.8070300@gmail.com> Message-ID: <222524F0-7DEE-4737-8A80-C0B8A1787945@cs.toronto.edu> On 21-Sep-09, at 2:55 PM, Xavier Gnata wrote: > Should I read that to learn you cython and numpy interact? > Or is there another best documentation (with examples...)? You should have a look at the Bresenham algorithm thread you posted. 
I went to the trouble of converting some Python code for Bresenham's algorithm to Cython, and a pointer to the Cython+NumPy tutorial: http://wiki.cython.org/tutorials/numpy David From gokhansever at gmail.com Mon Sep 21 15:14:27 2009 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_Sever?=) Date: Mon, 21 Sep 2009 14:14:27 -0500 Subject: [Numpy-discussion] [SciPy-User] Simple pattern recognition In-Reply-To: <3F97EAA3-72C2-43D2-A060-580B8127191C@cs.toronto.edu> References: <49d6b3500909161722r6f74cce6j515b756c2b0b78c5@mail.gmail.com> <49d6b3500909211045g2913d62ey539171b0668ae7c3@mail.gmail.com> <49d6b3500909211104m2ad0646fo6ca8a2d74735e9bc@mail.gmail.com> <3F97EAA3-72C2-43D2-A060-580B8127191C@cs.toronto.edu> Message-ID: <49d6b3500909211214o279fbb1fhdbda0d6ba8f84377@mail.gmail.com> Ahh my blindness and apologies :) The nice feeling of reinventing the wheel... Probably I forgot to reshape the image data in the first place before applying into ndimage.label(). However, this was a nice example to understand recursion, and get to know some basics of computer vision and few libraries (OpenCV, pygraph) during my research. Thanks again for all kind replies. On Mon, Sep 21, 2009 at 1:36 PM, David Warde-Farley wrote: > I think Zachary is right, ndimage does what you want: > > In [48]: image = array( > [[0,0,0,1,1,0,0], > [0,0,0,1,1,1,0], > [0,0,0,1,0,0,0], > [0,0,0,0,0,0,0], > [0,1,0,0,0,0,0], > [0,1,1,0,0,0,0], > [0,0,0,0,1,1,0], > [0,0,0,0,1,1,1]]) > > In [57]: import scipy.ndimage as ndimage > > In [58]: labels, num_found = ndimage.label(image) > > In [59]: object_slices = ndimage.find_objects(labels) > > In [60]: image[object_slices[0]] > Out[60]: > array([[1, 1, 0], > [1, 1, 1], > [1, 0, 0]]) > > In [61]: image[object_slices[1]] > Out[61]: > array([[1, 0], > [1, 1]]) > > In [62]: image[object_slices[2]] > Out[62]: > array([[1, 1, 0], > [1, 1, 1]]) > > David > > On 21-Sep-09, at 2:04 PM, G?khan Sever wrote: > > > ndimage.label works differently than what I have done here. > > > > Later using find_objects you can get slices for row or column basis. > > Not > > possible to construct a dynamical structure to find objects that are > > in the > > in both axis. > > > > Could you look at the stackoverflow article once again and comment > > back? > > > > Thanks. > > > > On Mon, Sep 21, 2009 at 12:57 PM, Zachary Pincus < > zachary.pincus at yale.edu > > >wrote: > > > >> I believe that pretty generic connected-component finding is already > >> available with scipy.ndimage.label, as David suggested at the > >> beginning of the thread... > >> > >> This function takes a binary array (e.g. zeros where the background > >> is, non-zero where foreground is) and outputs an array where each > >> connected component of non-background pixels has a unique non-zero > >> "label" value. > >> > >> ndimage.find_objects will then give slices (e.g. bounding boxes) for > >> each labeled object (or a subset of them as specified). There are > >> also > >> a ton of statistics you can calculate based on the labeled objects -- > >> look at the entire ndimage.measurements namespace. > >> > >> Zach > >> > >> On Sep 21, 2009, at 1:45 PM, G?khan Sever wrote: > >> > >>> I asked this question at > >> http://stackoverflow.com/questions/1449139/simple-object-recognition > >>> and get lots of nice feedback, and finally I have managed to > >>> implement what I wanted. 
> >>> > >>> What I was looking for is named "connected component labelling or > >>> analysis" for my "connected component extraction" > >>> > >>> I have put the code (lab2.py) and the image (particles.png) under: > >>> http://code.google.com/p/ccnworks/source/browse/#svn/trunk/AtSc450/ > >>> labs > >>> > >>> What do you think of improving that code and adding into scipy's > >>> ndimage library (like connected_components()) ? > >>> > >>> Comments and suggestions are welcome :) > >>> > >>> > >>> On Wed, Sep 16, 2009 at 7:22 PM, G?khan Sever > >>> wrote: > >>> Hello all, > >>> > >>> I want to be able to count predefined simple rectangle shapes on an > >>> image as shown like in this one: > >> http://img7.imageshack.us/img7/2327/particles.png > >>> > >>> Which is in my case to count all the blue pixels (they are ice-snow > >>> flake shadows in reality) in one of the column. > >>> > >>> What is the way to automate this task, which library or technique > >>> should I study to tackle it. > >>> > >>> Thanks. > >>> > >>> -- > >>> G?khan > >>> > >>> > >>> > >>> -- > >>> G?khan > >>> _______________________________________________ > >>> SciPy-User mailing list > >>> SciPy-User at scipy.org > >>> http://mail.scipy.org/mailman/listinfo/scipy-user > >> > >> _______________________________________________ > >> SciPy-User mailing list > >> SciPy-User at scipy.org > >> http://mail.scipy.org/mailman/listinfo/scipy-user > >> > > > > > > > > -- > > G?khan > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- G?khan -------------- next part -------------- An HTML attachment was scrubbed... URL: From lciti at essex.ac.uk Mon Sep 21 15:23:14 2009 From: lciti at essex.ac.uk (Citi, Luca) Date: Mon, 21 Sep 2009 20:23:14 +0100 Subject: [Numpy-discussion] Numpy large array bug In-Reply-To: References: <68DF70B3485CC648835655773E92314F820154@prinsmail02.am.thmulti.com> <200909212030.39029.faltet@pytables.org>, Message-ID: <271BED32E925E646A1333A56D9C6AFCB31E561A025@MBOX0.essex.ac.uk> I can confirm this bug for the last svn. Also: >>> a.put([2*1024*1024*1024 + 100,], 8) IndexError: index out of range for array in this case, I think the error is that in numpy/core/src/multiarray/item_selection.c in PyArray_PutTo line 209 should be: intp i, chunk, ni, max_item, nv, tmp; instead of: int i, chunk, ni, max_item, nv, tmp; fixing it as suggested: >>> a.put([2*1024*1024*1024 + 100,], 8) >>> a.max() 8 From jonathan.taylor at utoronto.ca Mon Sep 21 15:36:05 2009 From: jonathan.taylor at utoronto.ca (Jonathan Taylor) Date: Mon, 21 Sep 2009 15:36:05 -0400 Subject: [Numpy-discussion] Indexing transposes the array? Message-ID: <463e11f90909211236y1fefae59iea29b4c02786028d@mail.gmail.com> Why does indexing seem to transpose this array? In [14]: x = arange(8).reshape((2,2,2)) In [15]: x[0,:,:] Out[15]: array([[0, 1], [2, 3]]) In [16]: x[0,:,[0,1]] Out[16]: array([[0, 2], [1, 3]]) Thanks, Jonathan. From dwf at cs.toronto.edu Mon Sep 21 15:49:00 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Mon, 21 Sep 2009 15:49:00 -0400 Subject: [Numpy-discussion] Numpy question: Best hardware for Numpy? 
In-Reply-To: <5b8d13220909210753v7b76965aned20401129088394@mail.gmail.com> References: <18AC09A7-EB7F-4787-BD75-2B596AF27B53@cs.toronto.edu> <5b8d13220909210753v7b76965aned20401129088394@mail.gmail.com> Message-ID: On 21-Sep-09, at 10:53 AM, David Cournapeau wrote: > Concerning the hardware, I have just bought a core i7 (the cheapest > model is ~ 200$ now, with 4 cores and 8 Mb of shared cache), and the > thing flies for floating point computation. My last computer was a > pentium 4 so I don't have a lot of reference, but you can compute ~ > 300e6 exp (assuming a contiguous array), and ATLAS 3.8.3 built on it > is extremely fast - using the threaded version, the asymptotic peak > performances are quite impressive. It takes for example 14s to inverse > a 5000x5000 matrix of double. I thought you had a Macbook too? The Core i5 750 seems like a good buy right now as well. A bit cheaper, 4 cores and 8Mb of shared cache though at a slightly lower clock speed. David From eadrogue at gmx.net Mon Sep 21 16:23:42 2009 From: eadrogue at gmx.net (Ernest =?iso-8859-1?Q?Adrogu=E9?=) Date: Mon, 21 Sep 2009 22:23:42 +0200 Subject: [Numpy-discussion] masked arrays as array indices (is a bad idea) In-Reply-To: References: <20090921160519.GA18313@doriath.local> Message-ID: <20090921202342.GA18801@doriath.local> 21/09/09 @ 14:43 (-0400), thus spake Pierre GM: > > > On Sep 21, 2009, at 12:17 PM, Ryan May wrote: > > > 2009/9/21 Ernest Adrogu? > > Hello there, > > > > Given a masked array such as this one: > > > > In [19]: x = np.ma.masked_equal([-1, -1, 0, -1, 2], -1) > > > > In [20]: x > > Out[20]: > > masked_array(data = [-- -- 0 -- 2], > > mask = [ True True False True False], > > fill_value = 999999) > > > > When you make an assignemnt in the vein of x[x == 0] = 25 > > the result can be a bit puzzling: > > > > In [21]: x[x == 0] = 25 > > > > In [22]: x > > Out[22]: > > masked_array(data = [25 25 25 25 2], > > mask = [False False False False False], > > fill_value = 999999) > > > > Is this the correct result or have I found a bug? > > > > I see the same here on 1.4.0.dev7400. Seems pretty odd to me. Then > > again, it's a bit more complex using masked boolean arrays for > > indexing since you have True, False, and masked values. Anyone have > > thoughts on what *should* happen here? Or is this it? > > Using a masked array in fancy indexing is always a bad idea, as > there's no way of guessing the behavior one would want for missing > values: should they be evaluated as False ? True ? You should really > use the `filled` method to control the behavior. > > >>> x[(x==0).filled(False)] > masked_array(data = [0], > mask = [False], > fill_value = 999999) > >>>x[(x==0).filled(True)] > masked_array(data = [-- -- 0 --], > mask = [ True True False True], > fill_value = 999999) > > P. > > [If you're really interested: > When testing for equality, a masked array is first filled with 0 (that > was the behavior of the first implementation of numpy.ma), tested for > equality, and the mask of the result set to the mask of the input. > When used in fancy indexing, a masked array is viewed as a standard > ndarray by dropping the mask. In the current case, the combination is > therefore equivalent to (x.filled(0)==0), which explains why the > missing values are treated as True... I agree that the prefilling may > not be necessary...] This explains why x[x == 3] = 4 works "as expected", whereas x[x == 0] = 4 ruins everything. Basically, any condition that matches 0 will match every masked item as well. 
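A minimal sketch of the `filled` workaround suggested earlier in the thread, using the same x as in the original example (the masked entries are then left untouched):

import numpy as np

x = np.ma.masked_equal([-1, -1, 0, -1, 2], -1)
x[(x == 0).filled(False)] = 25    # fill the comparison result before indexing
print x                           # [-- -- 25 -- 2], mask unchanged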
I don't know, but maybe it would be better to raise an exception when the index is a masked array then. The current behaviour seems a bit confusing to me. -- Ernest From dwf at cs.toronto.edu Mon Sep 21 16:24:58 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Mon, 21 Sep 2009 16:24:58 -0400 Subject: [Numpy-discussion] Indexing transposes the array? In-Reply-To: <463e11f90909211236y1fefae59iea29b4c02786028d@mail.gmail.com> References: <463e11f90909211236y1fefae59iea29b4c02786028d@mail.gmail.com> Message-ID: <1ACBCA80-CE38-43CF-B427-CD7A469BC657@cs.toronto.edu> On 21-Sep-09, at 3:36 PM, Jonathan Taylor wrote: > Why does indexing seem to transpose this array? > > In [14]: x = arange(8).reshape((2,2,2)) > > In [15]: x[0,:,:] > Out[15]: > array([[0, 1], > [2, 3]]) > > In [16]: x[0,:,[0,1]] > Out[16]: > array([[0, 2], > [1, 3]]) The last example in this section (and the explanation) proves instructive: http://docs.scipy.org/doc/numpy/user/basics.indexing.html#indexing-multi-dimensional-arrays Also, notice: In [121]: x[0,:,0] Out[121]: array([0, 2]) In [122]: x[0,:,[0]] Out[122]: array([[0, 2]] The fancy indexing is basically going to look at x[0,:,0], x[0,:,1] and merge them along a new axis. If you used the fancy index along the second dimension, it would pull out the rows, like you want it to. David From Ashwin.Kashyap at thomson.net Mon Sep 21 16:28:58 2009 From: Ashwin.Kashyap at thomson.net (Kashyap Ashwin) Date: Mon, 21 Sep 2009 16:28:58 -0400 Subject: [Numpy-discussion] Numpy large array bug Message-ID: <68DF70B3485CC648835655773E92314F820181@prinsmail02.am.thmulti.com> Also, what about PyArray_PutMask() That function also has a line like "int i, chunk, ni, max_item, nv, tmp;" Should that be changed as well? (Your patch does not fix my original issue.) BTW, in numpy 1.3, that is present in numpy/core/src/multiarraymodule.c. Can someone please give me a temporary patch to test? I am not familiar with numpy codebase! -Ashwin I can confirm this bug for the last svn. Also: >>> a.put([2*1024*1024*1024 + 100,], 8) IndexError: index out of range for array in this case, I think the error is that in numpy/core/src/multiarray/item_selection.c in PyArray_PutTo line 209 should be: intp i, chunk, ni, max_item, nv, tmp; instead of: int i, chunk, ni, max_item, nv, tmp; fixing it as suggested: >>> a.put([2*1024*1024*1024 + 100,], 8) >>> a.max() 8 -------------- next part -------------- An HTML attachment was scrubbed... URL: From pgmdevlist at gmail.com Mon Sep 21 16:35:58 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Mon, 21 Sep 2009 16:35:58 -0400 Subject: [Numpy-discussion] masked arrays as array indices (is a bad idea) In-Reply-To: <20090921202342.GA18801@doriath.local> References: <20090921160519.GA18313@doriath.local> <20090921202342.GA18801@doriath.local> Message-ID: <674ECDA5-02EC-4351-8F50-CD872B59D11B@gmail.com> On Sep 21, 2009, at 4:23 PM, Ernest Adrogu? wrote: > > This explains why x[x == 3] = 4 works "as expected", whereas > x[x == 0] = 4 ruins everything. Basically, any condition that matches > 0 will match every masked item as well. There's room for improvement here indeed. I need to check first whether fixing the comparison methods doesn't break anything. > I don't know, but maybe it would be better to raise an exception when > the index is a masked array then. The current behaviour seems a bit > confusing to me. That'd be modifying ndarray.__getitem__ and I don't see that happening. 
In the meantime, please just fill your masked array with the `filled` method or the corresponding function. Remmber that masked arrays are for convenience. As soon as you try to do some heavy computations, you're better processing data and mask yourself. From lciti at essex.ac.uk Mon Sep 21 18:52:16 2009 From: lciti at essex.ac.uk (Citi, Luca) Date: Mon, 21 Sep 2009 23:52:16 +0100 Subject: [Numpy-discussion] Numpy large array bug In-Reply-To: <271BED32E925E646A1333A56D9C6AFCB31E561A025@MBOX0.essex.ac.uk> References: <68DF70B3485CC648835655773E92314F820154@prinsmail02.am.thmulti.com> <200909212030.39029.faltet@pytables.org>, , <271BED32E925E646A1333A56D9C6AFCB31E561A025@MBOX0.essex.ac.uk> Message-ID: <271BED32E925E646A1333A56D9C6AFCB31E561A026@MBOX0.essex.ac.uk> I think the original bug is due to line 535 of numpy/core/src/multiarray/ctors.c (svn) that should be: intp numcopies, nbytes; instead of: int numcopies, nbytes; To resume: in line 535 of numpy/core/src/multiarray/ctors.c and in line 209 of numpy/core/src/multiarray/item_selection.c int should be replaced with intp. From charlesr.harris at gmail.com Mon Sep 21 18:59:37 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 21 Sep 2009 16:59:37 -0600 Subject: [Numpy-discussion] Numpy large array bug In-Reply-To: <271BED32E925E646A1333A56D9C6AFCB31E561A026@MBOX0.essex.ac.uk> References: <68DF70B3485CC648835655773E92314F820154@prinsmail02.am.thmulti.com> <200909212030.39029.faltet@pytables.org> <271BED32E925E646A1333A56D9C6AFCB31E561A025@MBOX0.essex.ac.uk> <271BED32E925E646A1333A56D9C6AFCB31E561A026@MBOX0.essex.ac.uk> Message-ID: Hi, Luca, On Mon, Sep 21, 2009 at 4:52 PM, Citi, Luca wrote: > I think the original bug is due to > line 535 of numpy/core/src/multiarray/ctors.c (svn) > that should be: > intp numcopies, nbytes; > instead of: > int numcopies, nbytes; > > To resume: > in line 535 of numpy/core/src/multiarray/ctors.c > and > in line 209 of numpy/core/src/multiarray/item_selection.c > int should be replaced with intp. Please open a ticket for this. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From cournape at gmail.com Mon Sep 21 19:11:20 2009 From: cournape at gmail.com (David Cournapeau) Date: Tue, 22 Sep 2009 08:11:20 +0900 Subject: [Numpy-discussion] fixed-point arithmetic In-Reply-To: References: <5b8d13220909210803h36401c9amcd1ff38625d31d1f@mail.gmail.com> Message-ID: <5b8d13220909211611r138dfe79vdeaa4d0749a1a04c@mail.gmail.com> On Tue, Sep 22, 2009 at 12:57 AM, Neal Becker wrote: > David Cournapeau wrote: > >> On Mon, Sep 21, 2009 at 9:00 PM, Neal Becker wrote: >>> >>> numpy arrays of fpi should support all numeric operations. ?Also mixed >>> fpi/integer operations. >>> >>> I'm not sure how to go about implementing this. ?At first, I was thinking >>> to just subclass numpy array. ?But, I don't think this provides fpi >>> scalars, and their associated operations. >> >> Using dtype seems more straightforward. I would first try to see how >> far you could go using a pure python object as a dtype. For example >> (on python 2.6): >> >> from decimal import Decimal >> import numpy as np >> a = np.array([1, 2, 3], Decimal) >> b = np.array([2, 3, 4], Decimal) >> a + b >> >> works as expected. A lot of things won't work (e.g. most transcendent >> functions, which would require a specific implementation anyway), but >> arithmetic, etc... would work. >> >> Then, you could think about implementing the class in cython. 
If speed >> is an issue, then implementing your own dtype seems the way to go - I >> don't know exactly what kind of speed increase you could hope from >> going the object -> dtype, though. >> > > We don't want to create arrays of fixed-pt objects. ?That would be very > wasteful. Maybe, but that would be a good way to prototype the thing. > ?What I have in mind is that integer_bits, frac_bits are > attributes of the entire arrays, not the individual elements. ?The array > elements are just plain integers. That's not really how numpy arrays are designed: type-specific info should be in the dtype, not the array class. As Robert mentioned, the recently added datetime dtype shows an example on how to do it. David From lciti at essex.ac.uk Mon Sep 21 20:16:35 2009 From: lciti at essex.ac.uk (Citi, Luca) Date: Tue, 22 Sep 2009 01:16:35 +0100 Subject: [Numpy-discussion] Numpy large array bug In-Reply-To: References: <68DF70B3485CC648835655773E92314F820154@prinsmail02.am.thmulti.com> <200909212030.39029.faltet@pytables.org> <271BED32E925E646A1333A56D9C6AFCB31E561A025@MBOX0.essex.ac.uk> <271BED32E925E646A1333A56D9C6AFCB31E561A026@MBOX0.essex.ac.uk>, Message-ID: <271BED32E925E646A1333A56D9C6AFCB31E561A027@MBOX0.essex.ac.uk> Here it is... http://projects.scipy.org/numpy/ticket/1229 From fperez.net at gmail.com Mon Sep 21 21:49:47 2009 From: fperez.net at gmail.com (Fernando Perez) Date: Mon, 21 Sep 2009 18:49:47 -0700 Subject: [Numpy-discussion] something wrong with docs? In-Reply-To: <1253557959.3707.4.camel@idol> References: <1253557959.3707.4.camel@idol> Message-ID: On Mon, Sep 21, 2009 at 11:32 AM, Pauli Virtanen wrote: > The `sphinx.ext.doctest` extension is not enabled, so the testcode:: > etc. directives are not available. I'm not sure if it should be enabled > -- it would be cleaner to just replace the testcode:: stuff with the > ordinary example markup. > Why not enable it? It would be nice if we could move gradually towards docs whose examples (at least those marked as such) were always run via sphinx. The more we do this, the higher the chances of non-zero overlap between documentation and reality :) Cheers, f From drife at ucar.edu Tue Sep 22 01:00:46 2009 From: drife at ucar.edu (Daran L. Rife) Date: Mon, 21 Sep 2009 23:00:46 -0600 (MDT) Subject: [Numpy-discussion] Multi-dimensional indexing In-Reply-To: <48400.24.56.170.180.1253592520.squirrel@mail.rap.ucar.edu> References: <48400.24.56.170.180.1253592520.squirrel@mail.rap.ucar.edu> Message-ID: <43907.24.56.170.180.1253595646.squirrel@mail.rap.ucar.edu> I forgot to mention that the second array, which I wish to conditionally select elements from using tmax_idx, has the same dimensions as the "speed" array, That is, (ntimes, nlon, nlat) = U.shape And tmax_idx has dimensions of (nlon, nlat). Daran -- > My apology for the simplemindedness of my question. I've > been a long time user of NumPy and its predecessor Numeric, > but am struggling to understand "fancy indexing" for multi- > dimensional arrays. Here is the problem I am trying to solve. > > Suppose I have an 3-D array, named "speed" whose first dimen- > sion is time, and the second and third dimensions are latitude > and longitude. Further suppose that I wish to find the time > where the values at each point are at their maximum. This can > easily be done with the following code: > >>>> tmax_idx = np.argsort(speed, axis=0) > > I now wish to use this tmax_idx array to conditionally select > the values from a separate array. How can this be done with > fancy indexing? 
I've certainly done this sort of selection > with index arrays in 1D, but I can not wrap my head round the > multi-dimensionl index selection, even after carefully studying > the excellent indexing documentation and examples on-line. I'd > like to learn how to do this, to avoid the brute force looping > solution of: > > mean_u = np.zeros((nlon, nlat), dtype=np.float32) > > for i in xrange(nlon): > for j in xrange(nlat): > mean_u[i,j] = U[max_spd_idx[i,j],i,j] > > As you know, this is reasonably fast for modest-sized arrays, > but is far more expensive for large arrays. > > > Thanks in advance for your help. > > > Sincerely, > > > Daran Rife > > > > From gokhansever at gmail.com Tue Sep 22 01:43:26 2009 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_Sever?=) Date: Tue, 22 Sep 2009 00:43:26 -0500 Subject: [Numpy-discussion] [IPython-dev] Testing matplotlib on IPython trunk In-Reply-To: References: <49d6b3500909081245r763f6292nf75b689f2c7d6ead@mail.gmail.com> <49d6b3500909212116q5fac242dhb4dfa0b02da89eb4@mail.gmail.com> Message-ID: <49d6b3500909212243x1357afc3t190817f39b5b055c@mail.gmail.com> Thanks Fernando for the quick response. Today this is the 3rd time I am hitting an unsupported feature in the Python lands. 1-) No attribute docstrings 2-) Look this question: http://stackoverflow.com/questions/1458203/reading-a-float-from-string and 3rd is this. However I think I influenced to guys in our campus to take a look Python. One using Matlab-Simulink and C on collision-detection system design, the latter uses C to design a small scale embedded acquisition system for UAV platforms. He uses an ARM Cortex A8 processor powered Gumstix board. Xubuntu 9.04 runs on it. I saw Python 2.6.2 installed, however not sure how easy would that be to bring rest of the scipy stack into that machine. Besides, tomorrow there is going to be a Matlab seminar here http://www.mathworks.com/company/events/seminars/seminar39323.html It is about a SciPy advanced tutorial long. Many similar subjects I see there: *Speeding Up MATLAB Applications:Tips and Tricks for Writing Efficient Code *Topics include: ? Understanding preallocation and vectorization ? Addressing bottlenecks ? Efficient indexing and manipulations ? JIT ? Interpreter ? Mex *Brief Introduction to Parallel Computing with MATLAB *? Task parallel applications for faster processing ? Data parallel applications for handling large data sets ? Scheduling your programs to run I hope I will not kick out from the session by keep commenting oh that is possible in Python, oh this is too :) On Tue, Sep 22, 2009 at 12:18 AM, Fernando Perez wrote: > 2009/9/21 G?khan Sever : > > > > It's a very late reply but I am wondering how to make these appear in the > Ipy dev loaded into the session but not visible to a whos listing? > > > > I don't think that's supported quite right now. IPython does one > special thing to support a clean %whos listing: right before opening > up the user mainloop, it checks all keys in the user namespace, and > later on when %whos is run, those variables that were initially > present are not displayed. So for now if you do this interactively, > you will unfortunately pollute %whos. > > This is one thing we'll need to make sure works nicely again when the > dust settles. > > Cheers, > > f > -- G?khan -------------- next part -------------- An HTML attachment was scrubbed... 
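Returning to the multi-dimensional indexing question above: the double loop can be replaced by a single fancy-indexing expression. A sketch, assuming speed and U both have shape (ntimes, nlon, nlat) as in the original post, and using argmax directly instead of a full argsort:

import numpy as np

max_spd_idx = speed.argmax(axis=0)        # shape (nlon, nlat)
ii = np.arange(nlon)[:, np.newaxis]       # shape (nlon, 1)
jj = np.arange(nlat)[np.newaxis, :]       # shape (1, nlat)
mean_u = U[max_spd_idx, ii, jj]           # mean_u[i, j] == U[max_spd_idx[i, j], i, j]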
URL: From seb.haase at gmail.com Tue Sep 22 02:33:07 2009 From: seb.haase at gmail.com (Sebastian Haase) Date: Mon, 21 Sep 2009 22:33:07 -0800 Subject: [Numpy-discussion] numpy and cython in pure python mode Message-ID: Hi, I'm not subscribed to the cython list - hoping enough people would care to justify my post here: I know that cython's numpy is still getting better and better over time, but is it already today possible to have numpy support when using Cython in "pure python" mode? I like the idea of being able to develop and debug code "the python way" -- and then just switching on the cython-overdrive mode. (Otherwise I have very good experience using C/C++ with appropriate typemaps, and I don't mind the C syntax) I only recently learned about the "pure python" mode on the sympy list (and at the EuroScipy2009 workshop). My understanding is that Cython's pure Python mode could be "played" in two ways: a) either not having a .pyx-file at all and putting everything into a py-file (using the "import cython" stuff) or b) putting only cython specific declaration in to a pyx file having the same basename as the py-file next to it. I have not decided which way I would like better ( for "a)": 1 file might be easier to maintain, for "b)" it might be "nicer to look at" not having so many declarators "all over the place") A short example using numpy showing either way would be really nice to have ;-) One more: there is no way on reload cython-modules (yet), right ? I started looking into pyximport, but I had difficulties finding a comfortable write-test-change-test-....-cycle --- so far I'm using SWIG together with a makefile and recompiling and restarting the python session is similar there... Thanks, Sebastian Haase From pav+sp at iki.fi Tue Sep 22 03:02:40 2009 From: pav+sp at iki.fi (Pauli Virtanen) Date: Tue, 22 Sep 2009 07:02:40 +0000 (UTC) Subject: [Numpy-discussion] something wrong with docs? References: <1253557959.3707.4.camel@idol> Message-ID: Mon, 21 Sep 2009 18:49:47 -0700, Fernando Perez wrote: > On Mon, Sep 21, 2009 at 11:32 AM, Pauli Virtanen wrote: >> The `sphinx.ext.doctest` extension is not enabled, so the testcode:: >> etc. directives are not available. I'm not sure if it should be enabled >> -- it would be cleaner to just replace the testcode:: stuff with the >> ordinary example markup. >> >> > Why not enable it? It would be nice if we could move gradually towards > docs whose examples (at least those marked as such) were always run via > sphinx. The more we do this, the higher the chances of non-zero overlap > between documentation and reality :) I think sphinx.ext.doctest is able to also test the ordinary >>> marked- up examples, so there'd be no large need for new directives. But oh well, I suppose enabling it can't hurt much. -- Pauli Virtanen From sturla at molden.no Tue Sep 22 03:12:06 2009 From: sturla at molden.no (Sturla Molden) Date: Tue, 22 Sep 2009 09:12:06 +0200 Subject: [Numpy-discussion] numpy and cython in pure python mode In-Reply-To: References: Message-ID: <4AB878C6.3070505@molden.no> Sebastian Haase skrev: > I know that cython's numpy is still getting better and better over > time, but is it already today possible to have numpy support when > using Cython in "pure python" mode? > I'm not sure. There is this odd memoryview syntax: import cython view = cython.int[:,:](my2darray) print view[3,4] # fast when compiled, according to Sverre S.M. 
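For the one-file variant (a) asked about above, here is a short sketch of what pure-Python mode can look like with a NumPy array. It only uses cython.locals annotations; as far as I know the typed ndarray/buffer syntax of .pyx files is not available from the annotations alone, so the compiled version mainly speeds up the loop bookkeeping:

import cython
import numpy as np

@cython.locals(i=cython.int, n=cython.int, acc=cython.double)
def total(a):
    # runs unchanged as plain Python; when compiled, Cython uses the
    # declared C types for i, n and acc
    n = a.shape[0]
    acc = 0.0
    for i in range(n):
        acc += a[i]
    return acc

print total(np.arange(10.0))   # 45.0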
From sebastian.walter at gmail.com Tue Sep 22 03:42:56 2009 From: sebastian.walter at gmail.com (Sebastian Walter) Date: Tue, 22 Sep 2009 09:42:56 +0200 Subject: [Numpy-discussion] polynomial ring dtype Message-ID: This is somewhat similar to the question about fixed-point arithmetic earlier on this mailing list. I need to do computations on arrays whose elements are truncated polynomials. At the momement, I have implemented the univariate truncated polynomials as objects of a class UTPS. The class basically looks like this: class UTPS: def __init__(self, taylor_coeffs): """ polynomial x(t) = tc[0] + tc[1] t + tc[2] t^2 + tc[3] t^3 + ... """ self.tc = numpy.asarray(taylor_coeffs) def __add__(self, rhs): return UTPS(self.tc + rhs.tc) def sin(self): # numpy.sin(self) apparently automatically calls self.sin() which is very cool etc.... One can create arrays of UTPS instances like this: x = numpy.array( [[UTPS([1,2]), UTPS([3,4])], [UTPS([0,1]), UTPS([4,3])]]) and perform funcs and ufuncs on it y = numpy.sum(x) y = numy.sin(x) y = numpy.dot(numpy.eye(2), x) This works out of the box, which is very nice. my question: Is it possible to speed up the computation by defining a special dtype for truncated polynomials? Especially when the arrays get large, computing on arrays of objects is quite slow. I had a look at the numpy svn trunk but couldn't find any clues. If you are interested, you can have a look at the full pre alpha version code (BSD licence) at http://github.com/b45ch1/algopy . regards, Sebastian From romain.brette at ens.fr Tue Sep 22 03:52:49 2009 From: romain.brette at ens.fr (Romain Brette) Date: Tue, 22 Sep 2009 09:52:49 +0200 Subject: [Numpy-discussion] Numpy question: Best hardware for Numpy? In-Reply-To: References: <18AC09A7-EB7F-4787-BD75-2B596AF27B53@cs.toronto.edu> <5b8d13220909210753v7b76965aned20401129088394@mail.gmail.com> Message-ID: David Warde-Farley a ?crit : > On 21-Sep-09, at 10:53 AM, David Cournapeau wrote: > >> Concerning the hardware, I have just bought a core i7 (the cheapest >> model is ~ 200$ now, with 4 cores and 8 Mb of shared cache), and the >> thing flies for floating point computation. My last computer was a >> pentium 4 so I don't have a lot of reference, but you can compute ~ >> 300e6 exp (assuming a contiguous array), and ATLAS 3.8.3 built on it >> is extremely fast - using the threaded version, the asymptotic peak >> performances are quite impressive. It takes for example 14s to inverse >> a 5000x5000 matrix of double. > > I thought you had a Macbook too? > > The Core i5 750 seems like a good buy right now as well. A bit > cheaper, 4 cores and 8Mb of shared cache though at a slightly lower > clock speed. > > David How about the Core i7 975 (Extreme)? http://www.intel.com/performance/desktop/extreme.htm I am wondering if it is worth the extra money. Best, Romain From hrvoje.niksic at avl.com Tue Sep 22 05:01:37 2009 From: hrvoje.niksic at avl.com (Hrvoje Niksic) Date: Tue, 22 Sep 2009 11:01:37 +0200 Subject: [Numpy-discussion] Deserialization uncouples shared arrays Message-ID: <4AB89271.3020105@avl.com> Is it intended for deserialization to uncouple arrays that share a common base? 
For example: >>> import numpy, cPickle as p >>> a = numpy.array([1, 2, 3]) # base array >>> b = a[:] # view one >>> b array([1, 2, 3]) >>> c = a[::-1] # view two >>> c array([3, 2, 1]) >>> b.base is c.base True Arrays in b and c now share a common base, so changing the contents of one affects the other: >>> b[0] = 10 >>> b array([10, 2, 3]) >>> c array([ 3, 2, 10]) After serialization, the two arrays are effectively uncoupled, creating a different situation than before serialization: >>> d, e = p.loads(p.dumps((b, c), -1)) >>> d array([10, 2, 3]) >>> e array([ 3, 2, 10]) >>> d.base is e.base False >>> d[0] = 11 >>> d array([11, 2, 3]) >>> e array([ 3, 2, 10]) Is this behavior intentional, or is it an artifact of the implementation? Can it be relied upon not to change in a future release? From sccolbert at gmail.com Tue Sep 22 06:17:04 2009 From: sccolbert at gmail.com (Chris Colbert) Date: Tue, 22 Sep 2009 12:17:04 +0200 Subject: [Numpy-discussion] Best way to insert C code in numpy code In-Reply-To: <222524F0-7DEE-4737-8A80-C0B8A1787945@cs.toronto.edu> References: <64ddb72c0909180455q5d30d0e4h4ef5f41807cb2a6c@mail.gmail.com> <64ddb72c0909180501s6a2357dv9e6cbcfd4c62b293@mail.gmail.com> <4AB65386.5040904@gmail.com> <4AB75A80.6020608@ar.media.kyoto-u.ac.jp> <4AB7CC04.8070300@gmail.com> <222524F0-7DEE-4737-8A80-C0B8A1787945@cs.toronto.edu> Message-ID: <7f014ea60909220317l5370d700s7068539fa6d245e4@mail.gmail.com> I give my vote to cython as well. I have a program which uses cython for a portion simply because it was easier using a simple C for-loop to do what i wanted rather than beating numpy into submission. It was an order of magnitude faster as well. Cheers, Chris On Mon, Sep 21, 2009 at 9:12 PM, David Warde-Farley wrote: > On 21-Sep-09, at 2:55 PM, Xavier Gnata wrote: > >> Should I read that to learn you cython and numpy interact? >> Or is there another best documentation (with examples...)? > > You should have a look at the Bresenham algorithm thread you posted. I > went to the trouble of converting some Python code for Bresenham's > algorithm to Cython, and a pointer to the Cython+NumPy tutorial: > > http://wiki.cython.org/tutorials/numpy > > David > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From renesd at gmail.com Tue Sep 22 06:42:23 2009 From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=) Date: Tue, 22 Sep 2009 11:42:23 +0100 Subject: [Numpy-discussion] Best way to insert C code in numpy code In-Reply-To: <222524F0-7DEE-4737-8A80-C0B8A1787945@cs.toronto.edu> References: <64ddb72c0909180455q5d30d0e4h4ef5f41807cb2a6c@mail.gmail.com> <64ddb72c0909180501s6a2357dv9e6cbcfd4c62b293@mail.gmail.com> <4AB65386.5040904@gmail.com> <4AB75A80.6020608@ar.media.kyoto-u.ac.jp> <4AB7CC04.8070300@gmail.com> <222524F0-7DEE-4737-8A80-C0B8A1787945@cs.toronto.edu> Message-ID: <64ddb72c0909220342xeb30896raa7c14fad62d1908@mail.gmail.com> On Mon, Sep 21, 2009 at 8:12 PM, David Warde-Farley wrote: > On 21-Sep-09, at 2:55 PM, Xavier Gnata wrote: > >> Should I read that to learn you cython and numpy interact? >> Or is there another best documentation (with examples...)? > > You should have a look at the Bresenham algorithm thread you posted. I > went to the trouble of converting some Python code for Bresenham's > algorithm to Cython, and a pointer to the Cython+NumPy tutorial: > > http://wiki.cython.org/tutorials/numpy > > David I don't know about the best way... 
but here are two approaches I like... Another way is to make your C function then load it with ctypes(or wrap it with something else) and pass it pointers with array.ctype.data. You can find the shape of the array in python, and pass it to your C function. The benefit is it's just C code, and you can avoid the GIL too if you want. Then if you keep your C code separate from python stuff other people can use your C code in other languages more easily. cinpy is another one(similar to weave) which can be good for fast prototyping... http://www.cs.tut.fi/~ask/cinpy/ or for changing the code to fit your data. cheers, From meine at informatik.uni-hamburg.de Tue Sep 22 06:51:10 2009 From: meine at informatik.uni-hamburg.de (Hans Meine) Date: Tue, 22 Sep 2009 12:51:10 +0200 Subject: [Numpy-discussion] Deserialization uncouples shared arrays In-Reply-To: <4AB89271.3020105@avl.com> References: <4AB89271.3020105@avl.com> Message-ID: <200909221251.11441.meine@informatik.uni-hamburg.de> On Tuesday 22 September 2009 11:01:37 Hrvoje Niksic wrote: > Is it intended for deserialization to uncouple arrays that share a > common base? I think it's not really intended, but it's a limitation by design. AFAIK, it's related to Luca Citi's recent "ultimate base" thread - you simply cannot ensure serialization of arbitrary ndarrays in this way, because they may point to *any* memory, not necessarily an ndarray-allocated one. (Think of internal camera framebuffers, memory allocated by 3rd party libraries, memmapped areas, etc.) HTH, Hans From yogeshkarpate at gmail.com Tue Sep 22 07:11:05 2009 From: yogeshkarpate at gmail.com (yogesh karpate) Date: Tue, 22 Sep 2009 16:41:05 +0530 Subject: [Numpy-discussion] The problem with arrays Message-ID: <703777c60909220411j219600e9k90466e2241f91514@mail.gmail.com> Please kindly go through following code snippet for i in range(a1): data_temp=(bpf[left[0][i]:right[0][i]])# left is an array and right is also an array maxloc=data_temp.argmax() #taking indices of max. value of data segment maxval=data_temp[maxloc] minloc=data_temp.argmin() minval=data_temp[minloc] maxloc = maxloc-1+left # add offset of present location minloc = minloc-1+left # add offset of present location R_index = maxloc R_t = t[maxloc] R_amp = array([maxval]) S_amp = minval#%%% Assuming the S-wave is the lowest #%%% amp in the given window #S_t = t[minloc] R_time=array([R_t[0][i]]) plt.plot(R_time,R_amp,'go'); plt.show() The thing is that I want to plot R_time and R_amp in a single shot.The above code plots R_time and R_amp each time and overwriting previous value as the loop continues,i.e. it displays the many graphs each indicating single point (R_time,R_amp) as long as loop continues.What I want is that all points from R_time and R_amp should be plotted in one go. I tried to take array and store the value ,but it takes only one value at the end of loop. How should I break this loop?Can anybody help me out ???? Thanx in Advance Regards Yogesh -------------- next part -------------- An HTML attachment was scrubbed... 
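One way to get all the points into a single figure is to accumulate them in the loop and call plot once at the end. A sketch reusing the names from the snippet above (a1, R_t, maxval and plt are assumed to be as in the original code):

R_times = []
R_amps = []
for i in range(a1):
    # ... same peak-finding code as in the original loop ...
    R_times.append(R_t[0][i])
    R_amps.append(maxval)

plt.plot(R_times, R_amps, 'go')   # all points in one call
plt.show()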
URL: From hrvoje.niksic at avl.com Tue Sep 22 07:14:55 2009 From: hrvoje.niksic at avl.com (Hrvoje Niksic) Date: Tue, 22 Sep 2009 13:14:55 +0200 Subject: [Numpy-discussion] Deserialization uncouples shared arrays In-Reply-To: <32645411.376352.1253616708601.JavaMail.xicrypt@atgrzls001> References: <4AB89271.3020105@avl.com> <32645411.376352.1253616708601.JavaMail.xicrypt@atgrzls001> Message-ID: <4AB8B1AF.1090700@avl.com> Hans Meine wrote: > On Tuesday 22 September 2009 11:01:37 Hrvoje Niksic wrote: >> Is it intended for deserialization to uncouple arrays that share a >> common base? > > I think it's not really intended, but it's a limitation by design. I wonder why a "base" attribute is even restored, then? If there is no care to restore the shared views, then views could simply be serialized as arrays? > AFAIK, it's related to Luca Citi's recent "ultimate base" thread - you simply > cannot ensure serialization of arbitrary ndarrays in this way, because they > may point to *any* memory, not necessarily an ndarray-allocated one. That's true in general. I wonder if it's possible to restore shared arrays if (by virtue of "base" attribute) we know that the ndarray shares the memory of another ndarray. From nadavh at visionsense.com Tue Sep 22 07:44:59 2009 From: nadavh at visionsense.com (Nadav Horesh) Date: Tue, 22 Sep 2009 14:44:59 +0300 Subject: [Numpy-discussion] The problem with arrays References: <703777c60909220411j219600e9k90466e2241f91514@mail.gmail.com> Message-ID: <710F2847B0018641891D9A21602763605AD17D@ex3.envision.co.il> A quick answer with going into the details of your code: try plt.plot(R_time,R_amp,'go',hold=1) (one line before the last) Nadav -----????? ??????----- ???: numpy-discussion-bounces at scipy.org ??? yogesh karpate ????: ? 22-??????-09 14:11 ??: numpy-discussion at scipy.org ????: [Numpy-discussion] The problem with arrays Please kindly go through following code snippet for i in range(a1): data_temp=(bpf[left[0][i]:right[0][i]])# left is an array and right is also an array maxloc=data_temp.argmax() #taking indices of max. value of data segment maxval=data_temp[maxloc] minloc=data_temp.argmin() minval=data_temp[minloc] maxloc = maxloc-1+left # add offset of present location minloc = minloc-1+left # add offset of present location R_index = maxloc R_t = t[maxloc] R_amp = array([maxval]) S_amp = minval#%%% Assuming the S-wave is the lowest #%%% amp in the given window #S_t = t[minloc] R_time=array([R_t[0][i]]) plt.plot(R_time,R_amp,'go'); plt.show() The thing is that I want to plot R_time and R_amp in a single shot.The above code plots R_time and R_amp each time and overwriting previous value as the loop continues,i.e. it displays the many graphs each indicating single point (R_time,R_amp) as long as loop continues.What I want is that all points from R_time and R_amp should be plotted in one go. I tried to take array and store the value ,but it takes only one value at the end of loop. How should I break this loop?Can anybody help me out ???? Thanx in Advance Regards Yogesh -------------- next part -------------- A non-text attachment was scrubbed... 
Name: winmail.dat Type: application/ms-tnef Size: 3523 bytes Desc: not available URL: From yogeshkarpate at gmail.com Tue Sep 22 08:12:23 2009 From: yogeshkarpate at gmail.com (yogesh karpate) Date: Tue, 22 Sep 2009 17:42:23 +0530 Subject: [Numpy-discussion] The problem with arrays In-Reply-To: <710F2847B0018641891D9A21602763605AD17D@ex3.envision.co.il> References: <703777c60909220411j219600e9k90466e2241f91514@mail.gmail.com> <710F2847B0018641891D9A21602763605AD17D@ex3.envision.co.il> Message-ID: <703777c60909220512g7a067690yfeef8be4cca6d3f9@mail.gmail.com> I just tried your idea but the result is same. it didnt help . 2009/9/22 Nadav Horesh > A quick answer with going into the details of your code: > > try > plt.plot(R_time,R_amp,'go',hold=1) > (one line before the last) > > Nadav > > -----????? ??????----- > ???: numpy-discussion-bounces at scipy.org ??? yogesh karpate > ????: ? 22-??????-09 14:11 > ??: numpy-discussion at scipy.org > ????: [Numpy-discussion] The problem with arrays > > Please kindly go through following code snippet > for i in range(a1): > data_temp=(bpf[left[0][i]:right[0][i]])# left is an array and right > is also an array > maxloc=data_temp.argmax() #taking indices of max. value of > data segment > maxval=data_temp[maxloc] > minloc=data_temp.argmin() > minval=data_temp[minloc] > maxloc = maxloc-1+left # add offset of present location > minloc = minloc-1+left # add offset of present location > R_index = maxloc > R_t = t[maxloc] > R_amp = array([maxval]) > S_amp = minval#%%% Assuming the S-wave is the lowest > #%%% amp in the given window > #S_t = t[minloc] > R_time=array([R_t[0][i]]) > plt.plot(R_time,R_amp,'go'); > plt.show() > The thing is that I want to plot R_time and R_amp in a single shot.The > above > code plots R_time and R_amp each time and overwriting previous value as > the > loop continues,i.e. it displays the many graphs each indicating single > point (R_time,R_amp) as long as loop continues.What I want is that all > points from R_time and R_amp should be plotted in one go. I tried to take > array and store the value ,but it takes only one value at the end of > loop. > How should I break this loop?Can anybody help me out ???? > Thanx in Advance > Regards > Yogesh > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From meine at informatik.uni-hamburg.de Tue Sep 22 08:28:01 2009 From: meine at informatik.uni-hamburg.de (Hans Meine) Date: Tue, 22 Sep 2009 14:28:01 +0200 Subject: [Numpy-discussion] Deserialization uncouples shared arrays In-Reply-To: <4AB8B1AF.1090700@avl.com> References: <4AB89271.3020105@avl.com> <32645411.376352.1253616708601.JavaMail.xicrypt@atgrzls001> <4AB8B1AF.1090700@avl.com> Message-ID: <200909221428.02716.meine@informatik.uni-hamburg.de> On Tuesday 22 September 2009 13:14:55 Hrvoje Niksic wrote: > Hans Meine wrote: > > On Tuesday 22 September 2009 11:01:37 Hrvoje Niksic wrote: > >> Is it intended for deserialization to uncouple arrays that share a > >> common base? > > > > I think it's not really intended, but it's a limitation by design. > > I wonder why a "base" attribute is even restored, then? Oh, is it? Don't know if that makes sense, then. > If there is no > care to restore the shared views, then views could simply be serialized > as arrays? After your posting, I thought they were.. 
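One possible user-level workaround (just a sketch, not a NumPy feature): pickle the base array once, together with the slicing needed to rebuild each view, and reconstruct the views after loading so that they share memory again:

import numpy, cPickle as p

a = numpy.array([1, 2, 3])
specs = {'b': (None, None, None), 'c': (None, None, -1)}   # (start, stop, step)
data = p.dumps((a, specs), -1)

a2, specs2 = p.loads(data)
b2 = a2[slice(*specs2['b'])]
c2 = a2[slice(*specs2['c'])]
b2[0] = 10    # visible through c2 as well, since both are views of a2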
> > AFAIK, it's related to Luca Citi's recent "ultimate base" thread - you > > simply cannot ensure serialization of arbitrary ndarrays in this way, > > because they may point to *any* memory, not necessarily an > > ndarray-allocated one. > > That's true in general. I wonder if it's possible to restore shared > arrays if (by virtue of "base" attribute) we know that the ndarray > shares the memory of another ndarray. For this special case, it looks possible (although slightly dangerous) to me. It's also a common special case, so maybe you should open a ticket? Ciao, Hans From renesd at gmail.com Tue Sep 22 08:41:28 2009 From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=) Date: Tue, 22 Sep 2009 13:41:28 +0100 Subject: [Numpy-discussion] I want to help with a numpy python 3.1.x port In-Reply-To: <3d375d730909201115q176ba039x33382a84cf8a4ebb@mail.gmail.com> References: <64ddb72c0909180455q5d30d0e4h4ef5f41807cb2a6c@mail.gmail.com> <64ddb72c0909180501s6a2357dv9e6cbcfd4c62b293@mail.gmail.com> <64ddb72c0909180746j7e1aafd1g6664d12e0a7f6f79@mail.gmail.com> <64ddb72c0909180752g3f350f63y8639a2d9ba2c7c67@mail.gmail.com> <64ddb72c0909201113j6f1426c6vd3cef3cb851df666@mail.gmail.com> <3d375d730909201115q176ba039x33382a84cf8a4ebb@mail.gmail.com> Message-ID: <64ddb72c0909220541s154f52f1y4d18cd365d0fa801@mail.gmail.com> On Sun, Sep 20, 2009 at 7:15 PM, Robert Kern wrote: > On Sun, Sep 20, 2009 at 13:13, Ren? Dudfield wrote: >> Hi again, >> >> I noticed numpy includes a copy of distutils. ?I guess because it's >> been modified in some way? > > numpy.distutils is a set of extensions to distutils; it is not a copy > of distutils. > cool thanks. btw, my work is in the 'work' branch here: http://github.com/illume/numpy3k/tree/work I probably could have called it something other than 'work'... but I just copy/pasted from the guide... so there you go. Only done a bit more on it so far... will try next weekend to get at least the setup.py build stuff running. I think that will be a good first step for me to post a diff, and then try and get it merged into svn. cu! From silva at lma.cnrs-mrs.fr Tue Sep 22 09:31:35 2009 From: silva at lma.cnrs-mrs.fr (Fabrice Silva) Date: Tue, 22 Sep 2009 15:31:35 +0200 Subject: [Numpy-discussion] The problem with arrays In-Reply-To: <703777c60909220512g7a067690yfeef8be4cca6d3f9@mail.gmail.com> References: <703777c60909220411j219600e9k90466e2241f91514@mail.gmail.com> <710F2847B0018641891D9A21602763605AD17D@ex3.envision.co.il> <703777c60909220512g7a067690yfeef8be4cca6d3f9@mail.gmail.com> Message-ID: <1253626295.18793.47.camel@localhost.localdomain> Le mardi 22 septembre 2009 ? 17:42 +0530, yogesh karpate a ?crit : > I just tried your idea but the result is same. it didnt help . > > 2009/9/22 Nadav Horesh > A quick answer with going into the details of your code: > > try > plt.plot(R_time,R_amp,'go',hold=1) > (one line before the last) > > Nadav You may separate the computation part and the plotting one by storing your results in a R_time and a R_amp array (length a1 arrays). Concerning the plotting issue : are you sure the points you want to be displayed aren't yet? print the values within the loop : >>> print (R_amp, R_time) to check your values. 
You may also inspect your graphs to see how many lines they have : >>> plt.gca().get_children() or >>> plt.gca().get_lines() might help -- Fabrice Silva LMA UPR CNRS 7051 From sturla at molden.no Tue Sep 22 10:09:23 2009 From: sturla at molden.no (Sturla Molden) Date: Tue, 22 Sep 2009 16:09:23 +0200 Subject: [Numpy-discussion] Best way to insert C code in numpy code In-Reply-To: <64ddb72c0909220342xeb30896raa7c14fad62d1908@mail.gmail.com> References: <64ddb72c0909180455q5d30d0e4h4ef5f41807cb2a6c@mail.gmail.com> <64ddb72c0909180501s6a2357dv9e6cbcfd4c62b293@mail.gmail.com> <4AB65386.5040904@gmail.com> <4AB75A80.6020608@ar.media.kyoto-u.ac.jp> <4AB7CC04.8070300@gmail.com> <222524F0-7DEE-4737-8A80-C0B8A1787945@cs.toronto.edu> <64ddb72c0909220342xeb30896raa7c14fad62d1908@mail.gmail.com> Message-ID: <4AB8DA93.1030302@molden.no> Ren? Dudfield skrev: > Another way is to make your C function then load it with ctypes(or > wrap it with something else) and pass it pointers with > array.ctype.data. numpy.ctypeslib.ndpointer is preferred when using ndarrays with ctypes. > You can find the shape of the array in python, and > pass it to your C function. The benefit is it's just C code, and you > can avoid the GIL too if you want. Then if you keep your C code > separate from python stuff other people can use your C code in other > languages more easily. You can do this with Cython as well, just use Cython for the glue code. The important difference is this: Cython is a language for writing C extensions, ctypes is a module for calling DLLs. One important advantage of Cython is deterministic clean-up code. If you put a __dealloc__ method in a "cdef class", it will be called on garbage collection. Another nice way of interfacing C with numpy is f2py. It also works with C, not just Fortran. Yet another way (Windows specific) is to use win32com.client and pass array.ctype.data. That is nice if you have an ActiveX control; for Windows you often get commercial libraries like that. Also if you have .NET or Java objects, you can easily expose them to COM. From sturla at molden.no Tue Sep 22 10:21:05 2009 From: sturla at molden.no (Sturla Molden) Date: Tue, 22 Sep 2009 16:21:05 +0200 Subject: [Numpy-discussion] Best way to insert C code in numpy code In-Reply-To: <64ddb72c0909220342xeb30896raa7c14fad62d1908@mail.gmail.com> References: <64ddb72c0909180455q5d30d0e4h4ef5f41807cb2a6c@mail.gmail.com> <64ddb72c0909180501s6a2357dv9e6cbcfd4c62b293@mail.gmail.com> <4AB65386.5040904@gmail.com> <4AB75A80.6020608@ar.media.kyoto-u.ac.jp> <4AB7CC04.8070300@gmail.com> <222524F0-7DEE-4737-8A80-C0B8A1787945@cs.toronto.edu> <64ddb72c0909220342xeb30896raa7c14fad62d1908@mail.gmail.com> Message-ID: <4AB8DD51.2040809@molden.no> Ren? Dudfield skrev: > Another way is to make your C function then load it with ctypes Also one should beware that ctypes is a stable part of the Python standard library. Cython is still unstable and in rapid development. Pyrex is more stabile than Cython, but interfacing with ndarrays is harder. If you have a requirement on not using experimental code, then Cython is not an option. From bsouthey at gmail.com Tue Sep 22 10:39:15 2009 From: bsouthey at gmail.com (Bruce Southey) Date: Tue, 22 Sep 2009 09:39:15 -0500 Subject: [Numpy-discussion] Numpy question: Best hardware for Numpy? 
In-Reply-To: References: <18AC09A7-EB7F-4787-BD75-2B596AF27B53@cs.toronto.edu> <5b8d13220909210753v7b76965aned20401129088394@mail.gmail.com> Message-ID: <4AB8E193.7090001@gmail.com>
On 09/22/2009 02:52 AM, Romain Brette wrote: > David Warde-Farley a écrit : > >> On 21-Sep-09, at 10:53 AM, David Cournapeau wrote: >> >> >>> Concerning the hardware, I have just bought a core i7 (the cheapest >>> model is ~ 200$ now, with 4 cores and 8 Mb of shared cache), and the >>> thing flies for floating point computation. My last computer was a >>> pentium 4 so I don't have a lot of reference, but you can compute ~ >>> 300e6 exp (assuming a contiguous array), and ATLAS 3.8.3 built on it >>> is extremely fast - using the threaded version, the asymptotic peak >>> performances are quite impressive. It takes for example 14s to invert >>> a 5000x5000 matrix of doubles. >>> >> I thought you had a Macbook too? >> >> The Core i5 750 seems like a good buy right now as well. A bit >> cheaper, 4 cores and 8Mb of shared cache though at a slightly lower >> clock speed. >> >> David >> > How about the Core i7 975 (Extreme)? > http://www.intel.com/performance/desktop/extreme.htm > > I am wondering if it is worth the extra money. > > Best, > Romain > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion >
Hi, Check out the charts and stuff at places like http://www.tomshardware.com or http://www.anandtech.com/ for example: http://www.tomshardware.com/charts/2009-desktop-cpu-charts/benchmarks,60.html http://www.cpubenchmark.net/index.php
As far as I know, if you want dual processors (in addition to the cores and hyperthreads) then you are probably stuck with Xeons. Also, currently the new Xeons tend to have a slightly higher clock speed than the i7 series (the Xeon W5580 is 3.2 GHz) so, without overclocking, they tend to be faster. The story tends to change with overclocking. If you overclock, then the i7 920 appears to be widely recommended, especially given the current US$900 price difference. Really, the i7 975 makes overclocking very easy, but there are many guides on overclocking the i7 920 to 3-4 GHz with air cooling. Overclocking Xeons may be impossible or hard to do.
Bruce
From sturla at molden.no Tue Sep 22 10:45:31 2009 From: sturla at molden.no (Sturla Molden) Date: Tue, 22 Sep 2009 16:45:31 +0200 Subject: [Numpy-discussion] Best way to insert C code in numpy code In-Reply-To: <4AB65386.5040904@gmail.com> References: <64ddb72c0909180455q5d30d0e4h4ef5f41807cb2a6c@mail.gmail.com> <64ddb72c0909180501s6a2357dv9e6cbcfd4c62b293@mail.gmail.com> <4AB65386.5040904@gmail.com> Message-ID: <4AB8E30B.2090007@molden.no>
Xavier Gnata skrev: > I have a large 2D numpy array as input and a 1D array as output. > In between, I would like to use C code. > C is a requirement because it has to be fast and because the algorithm > cannot be written in a numpy oriented way :( (no way...really). >
There are certain algorithms that cannot be vectorized, particularly those that are recursive/iterative. One example is MCMC methods such as the Gibbs sampler. You can get around it by running multiple Markov chains in parallel, and vectorizing this parallelism with NumPy. But you cannot vectorize one long chain. Vectorizing with NumPy only applies to data-parallel problems.
But then there is a nice tool you should know about: Cython in pure Python mode.
You just add some annotations to the Python code, and the .py file can be compiled to efficient C. http://wiki.cython.org/pure This is quite similar in spirit to the optional static typing that makes certain implementations of Common Lisp (CMUCL, SBCL, Franz) so insanely fast.
From renesd at gmail.com Tue Sep 22 11:24:15 2009 From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=) Date: Tue, 22 Sep 2009 16:24:15 +0100 Subject: [Numpy-discussion] Best way to insert C code in numpy code In-Reply-To: <4AB8E30B.2090007@molden.no> References: <64ddb72c0909180455q5d30d0e4h4ef5f41807cb2a6c@mail.gmail.com> <64ddb72c0909180501s6a2357dv9e6cbcfd4c62b293@mail.gmail.com> <4AB65386.5040904@gmail.com> <4AB8E30B.2090007@molden.no> Message-ID: <64ddb72c0909220824h3e8151cg85a768fc14ee1843@mail.gmail.com>
On Tue, Sep 22, 2009 at 3:45 PM, Sturla Molden wrote: > Xavier Gnata skrev: >> I have a large 2D numpy array as input and a 1D array as output. >> In between, I would like to use C code. >> C is a requirement because it has to be fast and because the algorithm >> cannot be written in a numpy oriented way :( (no way...really). >> > There are certain algorithms that cannot be vectorized, particularly > those that are recursive/iterative.
Hi, one thing you can do is guess (predict) what the previous answer is and continue on a path from that guess. Say you have 1000 processing units, you could get the other 999 working on guesses for the answers and go from there. If one of your guess paths is right, you might be able to skip a bunch of steps. That's what cpus do with their 'speculative execution and branch prediction'. Even more OT, you can take advantage of this sometimes by getting the cpu to work out multiple things for you at once by putting in well-placed if/elses, but that's fairly cpu specific. It's also used with servers... you ask say 5 servers to give you a result, and wait for the first one to give you the answer.
That cython pure module looks cool, thanks for pointing it out. I wonder if anyone has tried using that with traces? So common paths in your code record the types, and then can be used by the cython pure module to try and generate the types for you? It could generate a file containing a {callable : @cython.locals} mapping to be used on compilation. Similar to how psyco works, but with a separate run program/compilation step.
cheers,
From jbsnyder at fanplastic.org Tue Sep 22 13:03:14 2009 From: jbsnyder at fanplastic.org (James Snyder) Date: Tue, 22 Sep 2009 12:03:14 -0500 Subject: [Numpy-discussion] Parallelizable Performance Python Example Message-ID: <86BF8CE5-9DFF-42D8-987F-8485318939BF@fanplastic.org>
Hi -
I've recently been trying to adjust the performance python example (http://www.scipy.org/PerformancePython) so that it could be compared under a parallelized version. I've adjusted the Gauss-Seidel 4 point method to a red-black checkerboarded (http://www.cs.colorado.edu/~mcbryan/3656.04/mail/87.htm) version that is much more easily parallelized on a shared memory system.
I've got some examples of this working, but I seem to be having trouble making it anywhere near efficient for the NumPy example (it's around an order of magnitude slower than the non-red-black version). Here's essentially what I'm doing with the NumPy solver:

    def numericTimeStep(self, dt=0.0):
        """
        Takes a time step using a NumPy expression.
        This has been adjusted to use checkerboard style indexing.
        """
        g = self.grid
        dx2, dy2 = np.float32(g.dx**2), np.float32(g.dy**2)
        dnr_inv = np.float32(0.5/(dx2 + dy2))
        u = g.u
        g.old_u = u.copy()  # needed to compute the error.

        if self.count == 0:
            # Precompute Matrix Indexes
            X, Y = np.meshgrid(range(1, u.shape[0]-1), range(1, u.shape[1]-1))
            checker = (X+Y) % 2
            self.idx1 = checker == 1
            self.idx2 = checker == 0

        # The actual iteration
        g.u[1:-1, 1:-1][self.idx1] = ((g.u[0:-2, 1:-1][self.idx1] + g.u[2:, 1:-1][self.idx1])*dy2 +
                                      (g.u[1:-1, 0:-2][self.idx1] + g.u[1:-1, 2:][self.idx1])*dx2)*dnr_inv
        g.u[1:-1, 1:-1][self.idx2] = ((g.u[0:-2, 1:-1][self.idx2] + g.u[2:, 1:-1][self.idx2])*dy2 +
                                      (g.u[1:-1, 0:-2][self.idx2] + g.u[1:-1, 2:][self.idx2])*dx2)*dnr_inv

        return g.computeError()

Any ideas? I presume that the double-indexing is maybe what's killing this, and I could precompute some boolean indexing arrays, but the original version of this solver (plain Gauss-Seidel, 4 point averaging) is rather simple and clean :-):

    def numericTimeStep(self, dt=0.0):
        """Takes a time step using a NumPy expression."""
        g = self.grid
        dx2, dy2 = g.dx**2, g.dy**2
        dnr_inv = 0.5/(dx2 + dy2)
        u = g.u
        g.old_u = u.copy()  # needed to compute the error.

        # The actual iteration
        u[1:-1, 1:-1] = ((u[0:-2, 1:-1] + u[2:, 1:-1])*dy2 +
                         (u[1:-1, 0:-2] + u[1:-1, 2:])*dx2)*dnr_inv

        return g.computeError()

Here's a pure python version of the red-black solver (which is, of course, incredibly slow, but not that much slower than the non-red-black version):

    def slowTimeStep(self, dt=0.0):
        """Takes a time step using straightforward Python loops."""
        g = self.grid
        nx, ny = g.u.shape
        dx2, dy2 = np.float32(g.dx**2), np.float32(g.dy**2)
        dnr_inv = np.float32(0.5/(dx2 + dy2))
        u = g.u

        err = 0.0
        for offset in range(0, 2):
            for i in range(1, nx-1):
                for j in range(1 + (i + offset) % 2, ny-1, 2):
                    tmp = u[j,i]
                    u[j,i] = ((u[j-1, i] + u[j+1, i])*dx2 +
                              (u[j, i-1] + u[j, i+1])*dy2)*dnr_inv
                    diff = u[j,i] - tmp
                    err += diff*diff

        return np.sqrt(err)

--
James Snyder
Biomedical Engineering
Northwestern University
jbsnyder at fanplastic.org
http://fanplastic.org/key.txt
ph: 847.448.0386
-------------- next part --------------
A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 4012 bytes Desc: not available URL:
From xavier.gnata at gmail.com Tue Sep 22 13:13:57 2009 From: xavier.gnata at gmail.com (Xavier Gnata) Date: Tue, 22 Sep 2009 19:13:57 +0200 Subject: [Numpy-discussion] Best way to insert C code in numpy code In-Reply-To: <64ddb72c0909220342xeb30896raa7c14fad62d1908@mail.gmail.com> References: <64ddb72c0909180455q5d30d0e4h4ef5f41807cb2a6c@mail.gmail.com> <64ddb72c0909180501s6a2357dv9e6cbcfd4c62b293@mail.gmail.com> <4AB65386.5040904@gmail.com> <4AB75A80.6020608@ar.media.kyoto-u.ac.jp> <4AB7CC04.8070300@gmail.com> <222524F0-7DEE-4737-8A80-C0B8A1787945@cs.toronto.edu> <64ddb72c0909220342xeb30896raa7c14fad62d1908@mail.gmail.com> Message-ID: <4AB905D5.50408@gmail.com>
René Dudfield wrote: > On Mon, Sep 21, 2009 at 8:12 PM, David Warde-Farley wrote: > >> On 21-Sep-09, at 2:55 PM, Xavier Gnata wrote: >> >>> Should I read that to learn how cython and numpy interact? >>> Or is there another best documentation (with examples...)? >>> >> You should have a look at the Bresenham algorithm thread you posted. I >> went to the trouble of converting some Python code for Bresenham's >> algorithm to Cython, and a pointer to the Cython+NumPy tutorial: >> >> http://wiki.cython.org/tutorials/numpy >> >> David >> > > I don't know about the best way... but here are two approaches I like...
> > Another way is to make your C function then load it with ctypes(or > wrap it with something else) and pass it pointers with > array.ctype.data. You can find the shape of the array in python, and > pass it to your C function. The benefit is it's just C code, and you > can avoid the GIL too if you want. Then if you keep your C code > separate from python stuff other people can use your C code in other > languages more easily. > > cinpy is another one(similar to weave) which can be good for fast > prototyping... http://www.cs.tut.fi/~ask/cinpy/ or for changing the > code to fit your data. > > Well I would have to find/read docs to be able to try that solution. cython looks easy :) cheers From drife at ucar.edu Tue Sep 22 13:16:39 2009 From: drife at ucar.edu (Daran Rife) Date: Tue, 22 Sep 2009 11:16:39 -0600 Subject: [Numpy-discussion] Fancy indexing for Message-ID: Hello list, This didn't seem to get through last time round, and my first version was poorly written. I have a rather pedestrian question about fancy indexing for multi-dimensional arrays. Suppose I have two 3-D arrays, one named "A" and the other "B", where both arrays have identical dimensions of time, longitude, and latitude. I wish to use data from A to conditionally select values from array B. Specifically, I first find the time where the values at each point in A are at their maximum. This is accomplished with: >>> tmax_idx = np.argsort(A, axis=0) I now wish to use this tmax_idx array to conditionally select the values from B. In essence, I want to pick values from B for times where the values at A are at their max. Can this be done with fancy indexing? Or is there a smarter way to do this? I've certainly done this sort of selection before, but the index selection array is 1D. I've carefully studied the excellent indexing documentation and examples on-line, but can't sort out whether what I want to do is even possible, without doing the brute force looping method, similar to: max_B = np.zeros((nlon, nlat), dtype=np.float32) for i in xrange(nlon): for j in xrange(nlat): max_B[i,j] = B[tmax_idx[i,j],i,j] As you know, this is reasonably fast for modest-sized arrays, but is far more expensive for large arrays. Thanks in advance for your help. Sincerely, Daran Rife From yogeshkarpate at gmail.com Tue Sep 22 13:30:17 2009 From: yogeshkarpate at gmail.com (yogesh karpate) Date: Tue, 22 Sep 2009 23:00:17 +0530 Subject: [Numpy-discussion] The problem with arrays In-Reply-To: <1253626295.18793.47.camel@localhost.localdomain> References: <703777c60909220411j219600e9k90466e2241f91514@mail.gmail.com> <710F2847B0018641891D9A21602763605AD17D@ex3.envision.co.il> <703777c60909220512g7a067690yfeef8be4cca6d3f9@mail.gmail.com> <1253626295.18793.47.camel@localhost.localdomain> Message-ID: <703777c60909221030n38975a97x57719f55d5ccf18e@mail.gmail.com> On Tue, Sep 22, 2009 at 7:01 PM, Fabrice Silva wrote: > Le mardi 22 septembre 2009 ? 17:42 +0530, yogesh karpate a ?crit : > > I just tried your idea but the result is same. it didnt help . > > > > 2009/9/22 Nadav Horesh > > A quick answer with going into the details of your code: > > > > try > > plt.plot(R_time,R_amp,'go',hold=1) > > (one line before the last) > > > > Nadav > > You may separate the computation part and the plotting one by storing > your results in a R_time and a R_amp array (length a1 arrays). > > Concerning the plotting issue : are you sure the points you want to be > displayed aren't yet? print the values within the loop : > >>> print (R_amp, R_time) > to check your values. 
> You may also inspect your graphs to see how many lines they have : > >>> plt.gca().get_children() > or > >>> plt.gca().get_lines() > might help. > This is the main thing . When I try to store it in array like R_time=array([R_t[0][i]]). It just stores the final value in that array when loop ends.I cant get out of this For loop.I really have this small problem. I really need help on this guys. > for i in range(a1): > data_temp=(bpf[left[0][i]:right[0][i]])# left is an array and > right is also an array > maxloc=data_temp.argmax() #taking indices of max. value of > data segment > maxval=data_temp[maxloc] > minloc=data_temp.argmin() > minval=data_temp[minloc] > maxloc = maxloc-1+left # add offset of present location > minloc = minloc-1+left # add offset of present location > R_index = maxloc > R_t = t[maxloc] > R_amp = array([maxval]) > S_amp = minval#%%% Assuming the S-wave is the lowest > #%%% amp in the given window > #S_t = t[minloc] > R_time=array([R_t[0][i]]) > plt.plot(R_time,R_amp,'go'); > plt.show() > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From silva at lma.cnrs-mrs.fr Tue Sep 22 13:53:40 2009 From: silva at lma.cnrs-mrs.fr (Fabrice Silva) Date: Tue, 22 Sep 2009 19:53:40 +0200 Subject: [Numpy-discussion] The problem with arrays In-Reply-To: <703777c60909221030n38975a97x57719f55d5ccf18e@mail.gmail.com> References: <703777c60909220411j219600e9k90466e2241f91514@mail.gmail.com> <710F2847B0018641891D9A21602763605AD17D@ex3.envision.co.il> <703777c60909220512g7a067690yfeef8be4cca6d3f9@mail.gmail.com> <1253626295.18793.47.camel@localhost.localdomain> <703777c60909221030n38975a97x57719f55d5ccf18e@mail.gmail.com> Message-ID: <1253642020.18793.54.camel@localhost.localdomain> Le mardi 22 septembre 2009 ? 23:00 +0530, yogesh karpate a ?crit : > This is the main thing . When I try to store it in array like > R_time=array([R_t[0][i]]). It just stores the final value in that > array when loop ends.I cant get out of this For loop.I really have > this small problem. I really need help on this guys. > > for i in range(a1): > data_temp=(bpf[left[0][i]: > right[0][i]])# left is an array and right is also an array > maxloc=data_temp.argmax() #taking indices of > max. value of data segment > maxval=data_temp[maxloc] > minloc=data_temp.argmin() > minval=data_temp[minloc] > maxloc = maxloc-1+left # add offset of present > location > minloc = minloc-1+left # add offset of present > location > R_index = maxloc > R_t = t[maxloc] > R_amp = array([maxval]) > S_amp = minval#%%% Assuming the S-wave is the lowest > #%%% amp in the given window > #S_t = t[minloc] > R_time=array([R_t[0][i]]) > plt.plot(R_time,R_amp,'go'); > plt.show() Two options : - you define an empty list before the loop >>> R_time = [] and you append the computed value while looping >>> for i: >>> ... >>> R_time.append(t[maxloc]) - or you define a preallocated array before the loop >>> R_time = np.empty(a1) and fill it with the computed values >>> for i: >>> ... >>> R_time[i] = t[maxloc] Same thing with R_amp. 
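Put together, here is a minimal self-contained sketch of the second (preallocated array) option; the signal, the window bounds and all the names below are toy stand-ins for illustration, not taken from your real data:

import numpy as np

# toy stand-ins for the band-pass-filtered signal and the detection windows
t = np.linspace(0.0, 1.0, 500)
bpf = np.sin(2 * np.pi * 5.0 * t)
left = np.array([0, 100, 200, 300])
right = np.array([100, 200, 300, 400])
a1 = len(left)

R_time = np.empty(a1)   # preallocated: one slot per window
R_amp = np.empty(a1)
for i in range(a1):
    data_temp = bpf[left[i]:right[i]]
    maxloc = data_temp.argmax() + left[i]   # offset back to the full-signal index
    R_time[i] = t[maxloc]                   # store into slot i instead of overwriting
    R_amp[i] = bpf[maxloc]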
After looping, whatever the solution you choose, you can plot the whole set of (time, value) tuples >>> plt.plot(R_time, R_amp) -- Fabrice Silva LMA UPR CNRS 7051 From mdroe at stsci.edu Tue Sep 22 13:58:23 2009 From: mdroe at stsci.edu (Michael Droettboom) Date: Tue, 22 Sep 2009 13:58:23 -0400 Subject: [Numpy-discussion] [SciPy-dev] Deprecate chararray [was Plea for help] In-Reply-To: References: <857977.74958.qm@web52106.mail.re2.yahoo.com> <812BBECE-D1E8-4699-980A-BB8FB9657CB9@stsci.edu> Message-ID: <4AB9103F.7030207@stsci.edu> Sorry to resurrect a long-dead thread, but I've been continuing Chris Hanley's investigation of chararray at Space Telescope Science Institute (and the broader astronomical community) for a while and have some findings to report back. What I've taken from this thread is that chararray is in need of a maintainer. I am able to spend some time to the cause, but first would like to clarify what it will take to make it's continued inclusion more comfortable. Let me start with the use case. chararrays are extensively returned from pyfits (a tool to handle the standard astronomy data format). pyfits is the basis of many applications, and it would be impossible to audit all of that code. Most authors of those tools do not track numpy-discussion closely, which is why we don't hear from them on this list, but there is a great deal of pyfits-using code. Doing some spot-checking on this code, a common thing I see is SQL-like queries on recarrays of objects. For instance, it is very common to a have a table of objects, with a "Target" column which is a string, and do something like (where c is a chararray of the 'Target' column): subset = array[np.where(c.startswith('NGC'))] Strictly speaking, this is a use case for "vectorized string operations", not necessarily for the chararray class as it presently stands. One could almost as easily do: subset = array[np.where([x.startswith('NGC') for x in c])] ...and the latter is even slightly faster, since chararray currently loops in Python anyway. Even better, though, I have some experimental code to perform the loop in C, and I get 5x speed up on a table with ~120,000 rows. If that were to be included in numpy, that's a strong argument against recommending list comprehensions in user code. The use case suggests the continued existence of vectorized string operations in numpy -- whether that continues to be chararray, or some newer/better interface + chararray for backward compatibility, is an open question. Personally I think a less object-oriented approach and just having a namespace full of vectorized string functions might be cleaner than the current situation of needing to create a view class around an ndarray. I'm suggesting something like the following, using the same example, where {STR} is some namespace we would fill with vectorized string operations: subset = array[np.where(np.{STR}.startswith(c, 'NGC'))] Now on to chararray as it now stands. I view chararray as really two separable pieces of functionality: 1) Convenience to perform vectorized string operations using '.method' syntax, or in some cases infix operators (+, *) 2) Implicit "rstrip"ping of values (Note that raw ndarray's truncate values at the first NULL character, like C strings, but chararrays will strip any and all whitespace characters from the end). Changing (2) just seems to be asking to be the source of subtle bugs. Unfortunately, there's an inconsistency between 1) and 2) in the present implementation. 
For example: In [9]: a = np.char.array(['a ']) In [10]: a Out[10]: chararray(['a'], dtype='|S3') In [11]: a[0] == 'a' Out[11]: True In [12]: a.endswith('a') Out[12]: array([False], dtype=bool) This is *the* design wart of chararray, IMHO, and one that's difficult to fix without breaking compatibility. It might be a worthwhile experiment to remove (2) and see how much we really break, but it would be impossible to know for sure. Now to address the concerns iterated in this thread. Unfortunately, I don't know where this thread began before it landed on the Numpy list, so I may be missing details which would help me address them. > 0) "it gets very little use" (an assumption you presumably dispute); > Certainly not true from where I stand. > 1) "is pretty much undocumented" (less true than a week ago, but still true for several of the attributes, with another handful or so falling into the category of "poorly documented"); > I don't quite understand this one -- 99% of the methods are wrappers around standard Python string methods. I don't think we should redocument those. I agree it needs a better top level docstring about its purpose (see functionalities (1) and (2) above) and its status (for backward compatibility). > 2) "probably more buggy than most other parts of NumPy" ("probably" being a euphemism, IMO); > Trac has these bugs. Any others? http://projects.scipy.org/numpy/ticket/1199 http://projects.scipy.org/numpy/ticket/1200 http://projects.scipy.org/numpy/ticket/856 http://projects.scipy.org/numpy/ticket/855 http://projects.scipy.org/numpy/ticket/1231 > 3) "there is not a really good use-case for it" (a conjecture, but one that has yet to be challenged by counter-example); > See above. > 4) it's not the first time its presence in NumPy has been questioned ("as Stefan pointed out when asking this same question last year") > Hopefully we're addressing that now. > 5) NumPy already has a (perhaps superior) alternative ("object arrays would do nicely if one needs this functionality"); > No -- that gives the problem of even slower Python-looping to do vectorized string operations. > to which I'll add: > > 6) it is, on its face, "counter to the spirit" of NumPy. > I don't quite know what this means -- but I do find the fact that it's a view class with methods a little bit clumsy. Is that what you meant? So here's my TODO list related to all this: 1) Fix bugs in Trac 2) Improve documentation (though probably not in a method-by-method way) 3) Improve unit test coverage 4a) Create C-based vectorized string operations 4b) Refactor chararray in terms of those 4c) Design and create an interface to those methods that will be the "right way" going forward Anything else? Mike -- Michael Droettboom Science Software Branch Operations and Engineering Division Space Telescope Science Institute Operated by AURA for NASA From fperez.net at gmail.com Tue Sep 22 15:13:04 2009 From: fperez.net at gmail.com (Fernando Perez) Date: Tue, 22 Sep 2009 12:13:04 -0700 Subject: [Numpy-discussion] something wrong with docs? In-Reply-To: References: <1253557959.3707.4.camel@idol> Message-ID: On Tue, Sep 22, 2009 at 12:02 AM, Pauli Virtanen wrote: > I think sphinx.ext.doctest is able to also test the ordinary >>> marked- > up examples, so there'd be no large need for new directives. > Well, >>> examples intermix input and output, and are thus very annoying to paste back into new code or interactive sessions (ipython has %doctest_mode that helps some, but you still have to avoid pasting output). 
Furthermore, >>> examples get very unwieldy beyond a few lines, and things like class definitions are hard to do and read in that mode. The nice thing about the sphinx full doctest support is that it scales very well to more complex code blocks. Cheers, f From rowen at uw.edu Tue Sep 22 15:22:50 2009 From: rowen at uw.edu (Russell E. Owen) Date: Tue, 22 Sep 2009 12:22:50 -0700 Subject: [Numpy-discussion] numpy macosx10.5 binaries: compatible with 10.4? Message-ID: All the official numpy 1.3.0 Mac binaries are labelled "macosx10.5". Does anyone know if these are backwards compatible with MacOS X 10.4 or 10.3.9? -- Russell From robert.kern at gmail.com Tue Sep 22 15:32:16 2009 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 22 Sep 2009 14:32:16 -0500 Subject: [Numpy-discussion] Fancy indexing for In-Reply-To: References: Message-ID: <3d375d730909221232x5d8e619dr5b8ce5060dd7c34b@mail.gmail.com> On Tue, Sep 22, 2009 at 12:16, Daran Rife wrote: > Hello list, > > This didn't seem to get through last time round, and my > first version was poorly written. > > I have a rather pedestrian question about fancy indexing > for multi-dimensional arrays. > > Suppose I have two 3-D arrays, one named "A" and the other "B", > where both arrays have identical dimensions of time, longitude, > and latitude. I wish to use data from A to conditionally select > values from array B. Specifically, I first find the time where > the values at each point in A are at their maximum. This is > accomplished with: > > ?>>> tmax_idx = np.argsort(A, axis=0) > > I now wish to use this tmax_idx array to conditionally select > the values from B. In essence, I want to pick values from B for > times where the values at A are at their max. Can this be done > with fancy indexing? Or is there a smarter way to do this? I've > certainly done this sort of selection before, but the index > selection array is 1D. I've carefully studied the excellent > indexing documentation and examples on-line, but can't sort out > whether what I want to do is even possible, without doing the > brute force looping method, similar to: > > max_B = np.zeros((nlon, nlat), dtype=np.float32) > > for i in xrange(nlon): > ? ?for j in xrange(nlat): > ? ? ? ?max_B[i,j] = B[tmax_idx[i,j],i,j] All of the index arrays need to be broadcastable to the same shape. Thus, you want the "i" index array to be a column vector. max_B = B[tmax_idx, np.arange(nlon)[:,np.newaxis], np.arange(nlat)] -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From robert.kern at gmail.com Tue Sep 22 15:34:35 2009 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 22 Sep 2009 14:34:35 -0500 Subject: [Numpy-discussion] numpy and cython in pure python mode In-Reply-To: References: Message-ID: <3d375d730909221234y5a0c8ba3m14aeb331badb09d5@mail.gmail.com> On Tue, Sep 22, 2009 at 01:33, Sebastian Haase wrote: > Hi, > I'm not subscribed to the cython list - hoping enough people would > care to justify my post here: > > I know that cython's numpy is still getting better and better over > time, but is it already today possible to have numpy support when > using Cython in "pure python" mode? > I like the idea of being able to develop and debug code "the python > way" -- and then just switching on the cython-overdrive mode. 
> (Otherwise I have very good experience using C/C++ with appropriate > typemaps, and I don't mind the C syntax) > > I only recently learned about the "pure python" mode on the sympy list > (and at the EuroScipy2009 workshop). > My understanding is that Cython's pure Python mode could be "played" > in two ways: > a) either not having a .pyx-file at all and putting everything into a > py-file (using the "import cython" stuff) > or b) putting only cython specific declaration in to a pyx file having > the same basename as the py-file next to it. I'm pretty sure that you need Cython syntax that is not supported by the pure-Python mode in order to use numpy arrays effectively. > One more: there is no way on reload cython-modules (yet), ?right ? Correct. There is no way to reload any extension module. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From sebastian.walter at gmail.com Tue Sep 22 15:57:15 2009 From: sebastian.walter at gmail.com (Sebastian Walter) Date: Tue, 22 Sep 2009 21:57:15 +0200 Subject: [Numpy-discussion] polynomial ring dtype In-Reply-To: References: Message-ID: sorry if this a duplicate, it seems that my last mail got lost... is there something to take care about when sending a mail to the numpy mailing list? On Tue, Sep 22, 2009 at 9:42 AM, Sebastian Walter wrote: > This is somewhat similar to the question about fixed-point arithmetic > earlier on this mailing list. > > I need to do computations on arrays whose elements are truncated polynomials. > At the momement, I have implemented the univariate truncated > polynomials as objects of a class UTPS. > > The class basically looks like this: > > class UTPS: > ? ?def __init__(self, taylor_coeffs): > ? ? ? ?""" ?polynomial x(t) = ?tc[0] + tc[1] t + tc[2] t^2 + tc[3] > t^3 + ... """ > ? ? ? ?self.tc = numpy.asarray(taylor_coeffs) > > ? ?def __add__(self, rhs): > ? ? ? ?return UTPS(self.tc + rhs.tc) > > ? ?def sin(self): > ? ? ? ?# numpy.sin(self) apparently automatically calls self.sin() > which is very cool > > ? ?etc.... > > One can create arrays of UTPS instances ?like this: > x = numpy.array( [[UTPS([1,2]), UTPS([3,4])], [UTPS([0,1]), UTPS([4,3])]]) > > and perform funcs and ufuncs on it > > y = numpy.sum(x) > y = numy.sin(x) > y = numpy.dot(numpy.eye(2), x) > > This works out of the box, which is very nice. > > my question: > Is it possible to speed up the computation by defining a special dtype > for truncated polynomials? Especially when the arrays get large, > computing on arrays of objects is quite slow. I had a look at the > numpy svn trunk but couldn't find any clues. > > If you are interested, you can have a look at the full pre alpha > version code (BSD licence) at http://github.com/b45ch1/algopy . > > regards, > Sebastian > From Chris.Barker at noaa.gov Tue Sep 22 16:22:28 2009 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Tue, 22 Sep 2009 13:22:28 -0700 Subject: [Numpy-discussion] numpy macosx10.5 binaries: compatible with 10.4? In-Reply-To: References: Message-ID: <4AB93204.60504@noaa.gov> Russell E. Owen wrote: > All the official numpy 1.3.0 Mac binaries are labelled "macosx10.5". > Does anyone know if these are backwards compatible with MacOS X 10.4 I'm pretty sure they are. > 10.3.9? not so sure, but worth a try. I've posted bug reports about the naming scheme, but haven't stepped up to fix it, so what can you do? 
numpy's pretty easy to build on OS-X, too. At least it was the last time I tried it! -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From d.l.goldsmith at gmail.com Tue Sep 22 16:33:15 2009 From: d.l.goldsmith at gmail.com (David Goldsmith) Date: Tue, 22 Sep 2009 13:33:15 -0700 Subject: [Numpy-discussion] merging docs from wiki In-Reply-To: References: Message-ID: <45d1ab480909221333v27ec3b3ft2288110237356022@mail.gmail.com> On Sun, Sep 20, 2009 at 12:49 PM, Ralf Gommers wrote: > Hi, > > I'm done reviewing all the improved docstrings for NumPy, they can be > merged now from the doc editor Patch page. Maybe I'll get around to doing > the SciPy ones as well this week, but I can't promise that. > Thank you very much, Ralf! There are a few docstrings on the Patch page I did not mark "Ok to apply": > > 1. the generic docstrings. Some are marked Ready for review, but they refer > mostly to "self" and to "generic" which I don't think is very helpful. It > would be great if someone could do just one of those docstrings and make it > somewhat informative. I couldn't agree more; unfortunately, I've been trying, highly unsuccessfully, to get someone to do this for a while - I promoted them in the hopes that a reviewer saying they needed an expert's eye would be more authoritative - I guess we're about to find out if that's the case. ;-) DG > There are about 50 that can then be done in the same way. > > 2. get_numpy_include: the docstring is deleted because the function is > deprecated. I don't think that is helpful but I'm not sure. Should this be > reverted or applied? > > Cheers, > Ralf > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From d.l.goldsmith at gmail.com Tue Sep 22 16:41:19 2009 From: d.l.goldsmith at gmail.com (David Goldsmith) Date: Tue, 22 Sep 2009 13:41:19 -0700 Subject: [Numpy-discussion] is ndarray.base the closest base or the ultimate base? In-Reply-To: <200909211328.41980.meine@informatik.uni-hamburg.de> References: <271BED32E925E646A1333A56D9C6AFCB31E561A01D@MBOX0.essex.ac.uk> <271BED32E925E646A1333A56D9C6AFCB31E561A01F@MBOX0.essex.ac.uk> <200909211328.41980.meine@informatik.uni-hamburg.de> Message-ID: <45d1ab480909221341s1ebf05b4u884b8b7cadf75ce6@mail.gmail.com> So, what's the "bottom-line" of this thread: does the doc need to be changed, or the code? DG 2009/9/21 Hans Meine > Hi! > > On Monday 21 September 2009 12:31:27 Citi, Luca wrote: > > I think you do not need to do the chain up walk on view creation. > > If the assumption is that base is the ultimate base, on view creation > > you can do something like (pseudo-code): > > view.base = parent if parent.owndata else parent.base > > Hmm. My impression was that .base was for refcounting purposes *only*. > Thus, > it is not even guaranteed that the attribute value is an array(-like) > object. > > For example, I might want to allow direct access to some internal buffers > of > an object of mine in an extension module; then, I'd use .base to bind the > lifetime of my object to the array (the lifetime of which I cannot control > anymore). 
> > Ciao, > Hans > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From d.l.goldsmith at gmail.com Tue Sep 22 17:22:00 2009 From: d.l.goldsmith at gmail.com (David Goldsmith) Date: Tue, 22 Sep 2009 14:22:00 -0700 Subject: [Numpy-discussion] [SciPy-dev] Deprecate chararray [was Plea for help] In-Reply-To: <4AB9103F.7030207@stsci.edu> References: <857977.74958.qm@web52106.mail.re2.yahoo.com> <812BBECE-D1E8-4699-980A-BB8FB9657CB9@stsci.edu> <4AB9103F.7030207@stsci.edu> Message-ID: <45d1ab480909221422k6dd988f1nbe5397baae8ce7c8@mail.gmail.com> Michael: First, thank you very much for your detailed and thorough analysis and recap of the situation - it sounds to me like chararray is now in good hands! :-) On Tue, Sep 22, 2009 at 10:58 AM, Michael Droettboom wrote: > Sorry to resurrect a long-dead thread, but I've been continuing Chris > IMO, no apology necessary! > Hanley's investigation of chararray at Space Telescope Science Institute > (and the broader astronomical community) for a while and have some > findings to report back. > > What I've taken from this thread is that chararray is in need of a > maintainer. I am able to spend some time to the cause, but first would > Yes, thank you! > like to clarify what it will take to make it's continued inclusion more > comfortable. > > Let me start with the use case. chararrays are extensively returned > from pyfits (a tool to handle the standard astronomy data format). > pyfits is the basis of many applications, and it would be impossible to > audit all of that code. Most authors of those tools do not track > numpy-discussion closely, which is why we don't hear from them on this > list, but there is a great deal of pyfits-using code. > > Doing some spot-checking on this code, a common thing I see is SQL-like > queries on recarrays of objects. For instance, it is very common to a > have a table of objects, with a "Target" column which is a string, and > do something like (where c is a chararray of the 'Target' column): > > subset = array[np.where(c.startswith('NGC'))] > > Strictly speaking, this is a use case for "vectorized string > operations", not necessarily for the chararray class as it presently > stands. One could almost as easily do: > > subset = array[np.where([x.startswith('NGC') for x in c])] > > ...and the latter is even slightly faster, since chararray currently > loops in Python anyway. > > Even better, though, I have some experimental code to perform the loop > in C, and I get 5x speed up on a table with ~120,000 rows. If that were > to be included in numpy, that's a strong argument against recommending > list comprehensions in user code. The use case suggests the continued > existence of vectorized string operations in numpy -- whether that > continues to be chararray, or some newer/better interface + chararray > for backward compatibility, is an open question. Personally I think a > less object-oriented approach and just having a namespace full of > vectorized string functions might be cleaner than the current situation > of needing to create a view class around an ndarray. I'm suggesting > something like the following, using the same example, where {STR} is > some namespace we would fill with vectorized string operations: > > subset = array[np.where(np.{STR}.startswith(c, 'NGC'))] > > Now on to chararray as it now stands. 
I view chararray as really two > separable pieces of functionality: > > 1) Convenience to perform vectorized string operations using > '.method' syntax, or in some cases infix operators (+, *) > 2) Implicit "rstrip"ping of values > > (Note that raw ndarray's truncate values at the first NULL character, > like C strings, but chararrays will strip any and all whitespace > characters from the end). > > Changing (2) just seems to be asking to be the source of subtle bugs. > Unfortunately, there's an inconsistency between 1) and 2) in the present > implementation. For example: > > In [9]: a = np.char.array(['a ']) > > In [10]: a > Out[10]: chararray(['a'], dtype='|S3') > > In [11]: a[0] == 'a' > Out[11]: True > > In [12]: a.endswith('a') > Out[12]: array([False], dtype=bool) > > This is *the* design wart of chararray, IMHO, and one that's difficult > to fix without breaking compatibility. It might be a worthwhile > experiment to remove (2) and see how much we really break, but it would > be impossible to know for sure. > > Now to address the concerns iterated in this thread. Unfortunately, I > don't know where this thread began before it landed on the Numpy list, > so I may be missing details which would help me address them. > > > 0) "it gets very little use" (an assumption you presumably dispute); > > > Certainly not true from where I stand. > I'm convinced. > > 1) "is pretty much undocumented" (less true than a week ago, but still > true for several of the attributes, with another handful or so falling into > the category of "poorly documented"); > > > I don't quite understand this one -- 99% of the methods are wrappers around standard Python string methods. I don't think we should > redocument those. I agree it needs a better top level docstring about > OK, that's what I needed to hear (that I don't believe anyone stated explicitly before - I'm sure I'll be corrected if I'm wrong): in that case, finishing these off is as simple as stating that in the functions' docstrings (albeit in a way compliant w/ the numpy docstring standard, of course; see below). > > 6) it is, on its face, "counter to the spirit" of NumPy. > > > I don't quite know what this means -- but I do find the fact that it's a > view class with methods a little bit clumsy. Is that what you meant? > The rest of the arguments effectively become moot, but I will clarify what I meant by 6), which was simply that as I understood - and understand - it, the central purpose of numpy is to provide a fast (i.e., implemented in C), Python API for a _numerical_ multidimensional array object; it sounds like there is a need for a fast Python API for vectorized string operations, but IMO, numpy is not the place for it (maybe a sub-package in scipy? it could still use numpy "under the hood," of course); that said, my primary concern presently is getting everything that _is_ presently in numpy documented, and now, so it shall be. > So here's my TODO list related to all this: > > 1) Fix bugs in Trac > 2) Improve documentation (though probably not in a method-by-method way) > So, you're volunteering to do this? Great, thanks! 
(Please be sure, of course, to conform to the numpy docstring standard: http://projects.scipy.org/numpy/wiki/CodingStyleGuidelines#docstring-standard with clarification of referral practice, such as it is, at: http://docs.scipy.org/numpy/Questions+Answers/#documenting-equivalent-functions-and-methods ) > 3) Improve unit test coverage > 4a) Create C-based vectorized string operations > 4b) Refactor chararray in terms of those > 4c) Design and create an interface to those methods that will be the > "right way" going forward > > Anything else? > Looks great to me! With much thanks again!!! DG > > Mike > > > -- > Michael Droettboom > Science Software Branch > Operations and Engineering Division > Space Telescope Science Institute > Operated by AURA for NASA > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL:
From lciti at essex.ac.uk Tue Sep 22 18:14:52 2009 From: lciti at essex.ac.uk (Citi, Luca) Date: Tue, 22 Sep 2009 23:14:52 +0100 Subject: [Numpy-discussion] is ndarray.base the closest base or the ultimate base? In-Reply-To: <45d1ab480909221341s1ebf05b4u884b8b7cadf75ce6@mail.gmail.com> References: <271BED32E925E646A1333A56D9C6AFCB31E561A01D@MBOX0.essex.ac.uk> <271BED32E925E646A1333A56D9C6AFCB31E561A01F@MBOX0.essex.ac.uk> <200909211328.41980.meine@informatik.uni-hamburg.de>, <45d1ab480909221341s1ebf05b4u884b8b7cadf75ce6@mail.gmail.com> Message-ID: <271BED32E925E646A1333A56D9C6AFCB3E13D7D3E6@MBOX0.essex.ac.uk>
My vote (if I am entitled to) goes to "change the code". Whether or not the addressee of .base is an array, it should be "the object that has to be kept alive such that the data does not get deallocated" rather than "one object which will keep alive another object, which will keep alive another object, ...., which will keep alive the object with the data". On creation of a new view B of object A, if A has OWNDATA true then B.base = A, else B.base = A.base.
When working on http://projects.scipy.org/numpy/ticket/1085 I had to walk the chain of bases to establish whether any of the inputs and the outputs were views of the same data. If "base" were the ultimate base, one would only need to check whether any of the inputs have the same base as any of the outputs.
I tried to modify the code to change the behaviour. I have opened a ticket for this http://projects.scipy.org/numpy/ticket/1232 and attached a patch but I am not 100% sure. I changed PyArray_View in convert.c and a few places in mapping.c and sequence.c.
But if there is any reason why the current behaviour should be kept, just ignore the ticket.
Luca
From ralf.gommers at googlemail.com Tue Sep 22 19:02:16 2009 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Tue, 22 Sep 2009 19:02:16 -0400 Subject: [Numpy-discussion] [SciPy-dev] Deprecate chararray [was Plea for help] In-Reply-To: <4AB9103F.7030207@stsci.edu> References: <857977.74958.qm@web52106.mail.re2.yahoo.com> <812BBECE-D1E8-4699-980A-BB8FB9657CB9@stsci.edu> <4AB9103F.7030207@stsci.edu> Message-ID: On Tue, Sep 22, 2009 at 1:58 PM, Michael Droettboom wrote: > Sorry to resurrect a long-dead thread, but I've been continuing Chris > Hanley's investigation of chararray at Space Telescope Science Institute > (and the broader astronomical community) for a while and have some > findings to report back.
> Thank you for the thorough investigation, it seems clear to me now that chararray does have a purpose and is in good hands. > > > Now to address the concerns iterated in this thread. Unfortunately, I > don't know where this thread began before it landed on the Numpy list, > so I may be missing details which would help me address them. > The discussion began on the scipy list, but I think you addressed most concerns in enough detail. > > > 0) "it gets very little use" (an assumption you presumably dispute); > > > Certainly not true from where I stand. > > 1) "is pretty much undocumented" (less true than a week ago, but still > true for several of the attributes, with another handful or so falling into > the category of "poorly documented"); > > > I don't quite understand this one -- 99% of the methods are wrappers > around standard Python string methods. I don't think we should > redocument those. I agree it needs a better top level docstring about > its purpose (see functionalities (1) and (2) above) and its status (for > backward compatibility). > Well, then the docstrings should say that. It can be a 4-line standard template that refers to the stdlib docs, but it would be nice to get something other than this in ipython: >>> charar = np.chararray(2) >>> charar.ljust? Docstring: > > 2) "probably more buggy than most other parts of NumPy" ("probably" being > a euphemism, IMO); > > > Trac has these bugs. Any others? > > http://projects.scipy.org/numpy/ticket/1199 > http://projects.scipy.org/numpy/ticket/1200 > http://projects.scipy.org/numpy/ticket/856 > http://projects.scipy.org/numpy/ticket/855 > http://projects.scipy.org/numpy/ticket/1231 > This one: http://article.gmane.org/gmane.comp.python.numeric.general/23638/match=chararray Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From drife at ucar.edu Tue Sep 22 19:36:27 2009 From: drife at ucar.edu (Daran Rife) Date: Tue, 22 Sep 2009 17:36:27 -0600 Subject: [Numpy-discussion] Fancy indexing for Message-ID: Hi Robert, This solution works beautifully! Thanks for sending it along. I need to learn and understand more about fancy indexing for multi-dimensional arrays, especially your clever trick of np.newaxis for broadcasting. Daran -- > Hello list, > > This didn't seem to get through last time round, and my > first version was poorly written. > > I have a rather pedestrian question about fancy indexing > for multi-dimensional arrays. > > Suppose I have two 3-D arrays, one named "A" and the other "B", > where both arrays have identical dimensions of time, longitude, > and latitude. I wish to use data from A to conditionally select > values from array B. Specifically, I first find the time where > the values at each point in A are at their maximum. This is > accomplished with: > > ?>>> tmax_idx = np.argsort(A, axis=0) > > I now wish to use this tmax_idx array to conditionally select > the values from B. In essence, I want to pick values from B for > times where the values at A are at their max. Can this be done > with fancy indexing? Or is there a smarter way to do this? I've > certainly done this sort of selection before, but the index > selection array is 1D. I've carefully studied the excellent > indexing documentation and examples on-line, but can't sort out > whether what I want to do is even possible, without doing the > brute force looping method, similar to: > > max_B = np.zeros((nlon, nlat), dtype=np.float32) > > for i in xrange(nlon): > ? ?for j in xrange(nlat): > ? ? ? 
?max_B[i,j] = B[tmax_idx[i,j],i,j] All of the index arrays need to be broadcastable to the same shape. Thus, you want the "i" index array to be a column vector. max_B = B[tmax_idx, np.arange(nlon)[:,np.newaxis], np.arange(nlat)] -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From d.l.goldsmith at gmail.com Tue Sep 22 22:31:08 2009 From: d.l.goldsmith at gmail.com (David Goldsmith) Date: Tue, 22 Sep 2009 19:31:08 -0700 Subject: [Numpy-discussion] something wrong with docs? In-Reply-To: References: <1253557959.3707.4.camel@idol> Message-ID: <45d1ab480909221931o4bc326f0y11e5b440bcbce0a7@mail.gmail.com> On Mon, Sep 21, 2009 at 6:49 PM, Fernando Perez wrote: > On Mon, Sep 21, 2009 at 11:32 AM, Pauli Virtanen wrote: > > The `sphinx.ext.doctest` extension is not enabled, so the testcode:: > > etc. directives are not available. I'm not sure if it should be enabled > > -- it would be cleaner to just replace the testcode:: stuff with the > > ordinary example markup. > > > > Why not enable it? It would be nice if we could move gradually > towards docs whose examples (at least those marked as such) were > always run via sphinx. The more we do this, the higher the chances of > Later in this thread, Fernando, you make a good case - scalability - for this, which, as someone who's been using only >>>, raises a number of questions in my mind: 0) this isn't applicable to docstrings, only to numpy-docs (i.e., the .rst files), correct; 1) assuming the answer is "yes," is there a "standard" for these ala the docstring standard, or some other extant way to promulgate and "strengthen" your "suggestion" (after proper community vetting, of course); 2) for those of us new to this approach, is there a "standard example" somewhere we can easily reference? Thanks! DG non-zero overlap between documentation and reality :) > > Cheers, > > f > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From d.l.goldsmith at gmail.com Tue Sep 22 22:46:22 2009 From: d.l.goldsmith at gmail.com (David Goldsmith) Date: Tue, 22 Sep 2009 19:46:22 -0700 Subject: [Numpy-discussion] is ndarray.base the closest base or the ultimate base? In-Reply-To: <271BED32E925E646A1333A56D9C6AFCB3E13D7D3E6@MBOX0.essex.ac.uk> References: <271BED32E925E646A1333A56D9C6AFCB31E561A01D@MBOX0.essex.ac.uk> <271BED32E925E646A1333A56D9C6AFCB31E561A01F@MBOX0.essex.ac.uk> <200909211328.41980.meine@informatik.uni-hamburg.de> <45d1ab480909221341s1ebf05b4u884b8b7cadf75ce6@mail.gmail.com> <271BED32E925E646A1333A56D9C6AFCB3E13D7D3E6@MBOX0.essex.ac.uk> Message-ID: <45d1ab480909221946g4ce93bam9d650ecebcea665c@mail.gmail.com> On Tue, Sep 22, 2009 at 3:14 PM, Citi, Luca wrote: > My vote (if I am entitled to) goes to "change the code". > Whether or not the addressee of .base is an array, it should be "the object > that has to be kept alive such that the data does not get deallocated" > rather "one object which will keep alive another object, which will keep > alive another object, ...., which will keep alive the object with the data". > On creation of a new view B of object A, if A has ONWDATA true then B.base > = A, else B.base = A.base. 
> > When working on > http://projects.scipy.org/numpy/ticket/1085 > I had to walk the chain of bases to establish whether any of the inputs and > the outputs were views of the same data. > If "base" were the ultimate base, one would only need to check whether any > of the inputs have the same base of any of the outputs. > > I tried to modify the code to change the behaviour. > I have opened a ticket for this > http://projects.scipy.org/numpy/ticket/1232 > and attached a patch but I am not 100% sure. > I changed PyArray_View in convert.c and a few places in mapping.c and > sequence.c. > > But if there is any reason why the current behaviour should be kept, just > ignore the ticket. > You don't mean that literally, right? A ticket can't just be ignored: it can be changed to "will not fix," with, hopefully, a good explanation as to why, but it has to be resolved and closed in some fashion, not just ignored, or someone somewhere down the line will try to address it substantively. :-) In any event, I think we need a few more "heavyweights" to weigh in on this before code is changed: Robert? Charles? Travis? Anyone? Anyone wanna "block"? DG > > Luca > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From d.l.goldsmith at gmail.com Tue Sep 22 22:51:11 2009 From: d.l.goldsmith at gmail.com (David Goldsmith) Date: Tue, 22 Sep 2009 19:51:11 -0700 Subject: [Numpy-discussion] [SciPy-dev] Deprecate chararray [was Plea for help] In-Reply-To: References: <857977.74958.qm@web52106.mail.re2.yahoo.com> <812BBECE-D1E8-4699-980A-BB8FB9657CB9@stsci.edu> <4AB9103F.7030207@stsci.edu> Message-ID: <45d1ab480909221951h6006deb9u4675e419c9f0b256@mail.gmail.com> On Tue, Sep 22, 2009 at 4:02 PM, Ralf Gommers wrote: > > On Tue, Sep 22, 2009 at 1:58 PM, Michael Droettboom wrote: > > Trac has these bugs. Any others? >> >> http://projects.scipy.org/numpy/ticket/1199 >> http://projects.scipy.org/numpy/ticket/1200 >> http://projects.scipy.org/numpy/ticket/856 >> http://projects.scipy.org/numpy/ticket/855 >> http://projects.scipy.org/numpy/ticket/1231 >> > > This one: > http://article.gmane.org/gmane.comp.python.numeric.general/23638/match=chararray > > Cheers, > Ralf > That last one never got "promoted" to a ticket? DG -------------- next part -------------- An HTML attachment was scrubbed... URL: From fperez.net at gmail.com Wed Sep 23 00:29:11 2009 From: fperez.net at gmail.com (Fernando Perez) Date: Tue, 22 Sep 2009 21:29:11 -0700 Subject: [Numpy-discussion] something wrong with docs? In-Reply-To: <45d1ab480909221931o4bc326f0y11e5b440bcbce0a7@mail.gmail.com> References: <1253557959.3707.4.camel@idol> <45d1ab480909221931o4bc326f0y11e5b440bcbce0a7@mail.gmail.com> Message-ID: On Tue, Sep 22, 2009 at 7:31 PM, David Goldsmith wrote: > Later in this thread, Fernando, you make a good case - scalability - for > this, which, as someone who's been using only >>>, raises a number of > questions in my mind: 0) this isn't applicable to docstrings, only to > numpy-docs (i.e., the .rst files), Yes, and to me this naturally divides things: examples in docstrings should be compact enough that they fit comfortably in a >>> style. If they need an entire page of code, they probably should be in the main docs but not in the docstring. 
So I like very much the sphinx doctest code blocks for longer examples interspersed with text, and the >>> style for short ones that are a good fit for a docstring. correct; 1) assuming the answer is "yes," > is there a "standard" for these ala the docstring standard, or some other > extant way to promulgate and "strengthen" your "suggestion" (after proper > community vetting, of course); I'm not sure what you mean here, sorry. I simply don't understand what you are looking to "strengthen" or what standard there could be: this is regular code that goes into reST blocks. Sorry if I missed your point... 2) for those of us new to this approach, is > there a "standard example" somewhere we can easily reference? Yes, the sphinx docs have a page about the directive, including a brief example: http://sphinx.pocoo.org/ext/doctest.html Cheers, f From yogeshkarpate at gmail.com Wed Sep 23 01:07:15 2009 From: yogeshkarpate at gmail.com (yogesh karpate) Date: Wed, 23 Sep 2009 10:37:15 +0530 Subject: [Numpy-discussion] The problem with arrays In-Reply-To: <1253642020.18793.54.camel@localhost.localdomain> References: <703777c60909220411j219600e9k90466e2241f91514@mail.gmail.com> <710F2847B0018641891D9A21602763605AD17D@ex3.envision.co.il> <703777c60909220512g7a067690yfeef8be4cca6d3f9@mail.gmail.com> <1253626295.18793.47.camel@localhost.localdomain> <703777c60909221030n38975a97x57719f55d5ccf18e@mail.gmail.com> <1253642020.18793.54.camel@localhost.localdomain> Message-ID: <703777c60909222207y369d2760k2cb343051ee2718e@mail.gmail.com> Dear Fabrice Finally your suggestions worked :).....Thanks a lot... soon the code I'm working will be available as a part of Free Software Foundation. Regards Yogesh On Tue, Sep 22, 2009 at 11:23 PM, Fabrice Silva wrote: > Le mardi 22 septembre 2009 ? 23:00 +0530, yogesh karpate a ?crit : > > > This is the main thing . When I try to store it in array like > > R_time=array([R_t[0][i]]). It just stores the final value in that > > array when loop ends.I cant get out of this For loop.I really have > > this small problem. I really need help on this guys. > > > > for i in range(a1): > > data_temp=(bpf[left[0][i]: > > right[0][i]])# left is an array and right is also an array > > maxloc=data_temp.argmax() #taking indices of > > max. value of data segment > > maxval=data_temp[maxloc] > > minloc=data_temp.argmin() > > minval=data_temp[minloc] > > maxloc = maxloc-1+left # add offset of present > > location > > minloc = minloc-1+left # add offset of present > > location > > R_index = maxloc > > R_t = t[maxloc] > > R_amp = array([maxval]) > > S_amp = minval#%%% Assuming the S-wave is the lowest > > #%%% amp in the given window > > #S_t = t[minloc] > > R_time=array([R_t[0][i]]) > > plt.plot(R_time,R_amp,'go'); > > plt.show() > > Two options : > - you define an empty list before the loop > >>> R_time = [] > and you append the computed value while looping > >>> for i: > >>> ... > >>> R_time.append(t[maxloc]) > > - or you define a preallocated array before the loop > >>> R_time = np.empty(a1) > and fill it with the computed values > >>> for i: > >>> ... > >>> R_time[i] = t[maxloc] > > > Same thing with R_amp. 
After looping, whatever the solution you choose, > you can plot the whole set of (time, value) tuples > >>> plt.plot(R_time, R_amp) > > -- > Fabrice Silva > LMA UPR CNRS 7051 > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From d.l.goldsmith at gmail.com Wed Sep 23 02:15:56 2009 From: d.l.goldsmith at gmail.com (David Goldsmith) Date: Tue, 22 Sep 2009 23:15:56 -0700 Subject: [Numpy-discussion] something wrong with docs? In-Reply-To: References: <1253557959.3707.4.camel@idol> <45d1ab480909221931o4bc326f0y11e5b440bcbce0a7@mail.gmail.com> Message-ID: <45d1ab480909222315t75a6c2f3o47c96c90715755f8@mail.gmail.com> On Tue, Sep 22, 2009 at 9:29 PM, Fernando Perez wrote: > On Tue, Sep 22, 2009 at 7:31 PM, David Goldsmith> is there a "standard" for > these ala the docstring standard, or some other > > extant way to promulgate and "strengthen" your "suggestion" (after proper > > community vetting, of course); > > I'm not sure what you mean here, sorry. I simply don't understand > what you are looking to "strengthen" or what standard there could be: > this is regular code that goes into reST blocks. Sorry if I missed > your point... > "It would be nice if we could move gradually towards docs whose examples (at least those marked as such) were always run via sphinx." That's a "suggestion," but given your point, it seems like you'd advocate it being more than that, no? DG -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav+sp at iki.fi Wed Sep 23 03:10:39 2009 From: pav+sp at iki.fi (Pauli Virtanen) Date: Wed, 23 Sep 2009 07:10:39 +0000 (UTC) Subject: [Numpy-discussion] something wrong with docs? References: <1253557959.3707.4.camel@idol> <45d1ab480909221931o4bc326f0y11e5b440bcbce0a7@mail.gmail.com> <45d1ab480909222315t75a6c2f3o47c96c90715755f8@mail.gmail.com> Message-ID: Tue, 22 Sep 2009 23:15:56 -0700, David Goldsmith wrote: [clip] > "It would be nice if we could move gradually towards docs whose examples > (at least those marked as such) were always run via sphinx." Also the >>> examples are doctestable, via numpy.test(doctests=True), or enabling Sphinx's doctest extension and its support for those. What Fernando said about them being more clumsy to write and copy than separate code directives is of course true. I wonder if there's a technical fix that could be made in Sphinx, at least for HTML, to correct this... -- Pauli Virtanen From hrvoje.niksic at avl.com Wed Sep 23 03:15:44 2009 From: hrvoje.niksic at avl.com (Hrvoje Niksic) Date: Wed, 23 Sep 2009 09:15:44 +0200 Subject: [Numpy-discussion] Deserialized arrays with base mutate strings Message-ID: <4AB9CB20.9090809@avl.com> Numpy arrays with the "base" property are deserialized as arrays pointing to a storage contained within a Python string. This is a problem since such arrays are mutable and can mutate existing strings. 
Here is how to create one: >>> import numpy, cPickle as p >>> a = numpy.array([1, 2, 3]) # create an array >>> b = a[::-1] # create a view >>> b array([3, 2, 1]) >>> b.base # view's base is the original array array([1, 2, 3]) >>> c = p.loads(p.dumps(b, -1)) # roundtrip the view through pickle >>> c array([3, 2, 1]) >>> c.base # base is now a simple string: '\x03\x00\x00\x00\x02\x00\x00\x00\x01\x00\x00\x00' >>> s = c.base >>> s '\x03\x00\x00\x00\x02\x00\x00\x00\x01\x00\x00\x00' >>> type(s) >>> c[0] = 4 # when the array is mutated... >>> s # ...the string changes value! '\x04\x00\x00\x00\x02\x00\x00\x00\x01\x00\x00\x00' This is somewhat disconcerting, as Python strings are supposed to be immutable. In this case the string was created by numpy and is probably not shared by anyone, so it doesn't present a problem in practice. But in corner cases it can lead to serious bugs. Python has a cache of one-letter strings, which cannot be turned off. This means that one-byte array views can change existing Python strings used elsewhere in the code. For example: >>> a = numpy.array([65], 'int8') >>> b = a[::-1] >>> c = p.loads(p.dumps(b, -1)) >>> c array([65], dtype=int8) >>> c.base 'A' >>> c[0] = 66 >>> c.base 'B' >>> 'A' 'B' Note how changing a numpy array permanently changed the contents of all 'A' strings in this python instance, rendering python unusable. The fix should be straightforward: use a string subclass (which will skip the one-letter cache), or an entirely separate type for storage of "base" memory referenced by deserialized arrays. From pav+sp at iki.fi Wed Sep 23 03:37:55 2009 From: pav+sp at iki.fi (Pauli Virtanen) Date: Wed, 23 Sep 2009 07:37:55 +0000 (UTC) Subject: [Numpy-discussion] Deserialized arrays with base mutate strings References: <4AB9CB20.9090809@avl.com> Message-ID: Wed, 23 Sep 2009 09:15:44 +0200, Hrvoje Niksic wrote: [clip] > Numpy arrays with the "base" property are deserialized as arrays > pointing to a storage contained within a Python string. This is a > problem since such arrays are mutable and can mutate existing strings. > Here is how to create one: Please file a bug ticket in the Trac, thanks! Here is a simpler way, although one more difficult to accidentally: >>> a = numpy.frombuffer("A", dtype='S1') >>> a.flags.writeable = True >>> b = "A" >>> a[0] = "B" >>> b 'B' -- Pauli Virtanen From hrvoje.niksic at avl.com Wed Sep 23 04:01:11 2009 From: hrvoje.niksic at avl.com (Hrvoje Niksic) Date: Wed, 23 Sep 2009 10:01:11 +0200 Subject: [Numpy-discussion] Deserialized arrays with base mutate strings In-Reply-To: <28809716.622169.1253691526147.JavaMail.xicrypt@atgrzls001> References: <4AB9CB20.9090809@avl.com> <28809716.622169.1253691526147.JavaMail.xicrypt@atgrzls001> Message-ID: <4AB9D5C7.3040206@avl.com> Pauli Virtanen wrote: > Wed, 23 Sep 2009 09:15:44 +0200, Hrvoje Niksic wrote: > [clip] >> Numpy arrays with the "base" property are deserialized as arrays >> pointing to a storage contained within a Python string. This is a >> problem since such arrays are mutable and can mutate existing strings. >> Here is how to create one: > > Please file a bug ticket in the Trac, thanks! Done - ticket #1233. > Here is a simpler way, although one more difficult to accidentally: > >>>> a = numpy.frombuffer("A", dtype='S1') >>>> a.flags.writeable = True >>>> b = "A" >>>> a[0] = "B" >>>> b > 'B' I guess this one could be prevented by verifying that the buffer is writable when setting the "writable" flag. 
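A rough Python-level illustration of this kind of guard (a hypothetical helper with a made-up name; numpy's real check would live in C and inspect the buffer itself rather than its Python type):

import numpy as np

def make_writeable(arr):
    # Walk the chain of bases to find the object that owns the memory.
    owner = arr
    while isinstance(owner, np.ndarray) and owner.base is not None:
        owner = owner.base
    # Refuse when the memory is ultimately held by an immutable Python
    # string, as with the unpickled view and the frombuffer example above.
    if isinstance(owner, str):
        raise ValueError("memory is backed by an immutable str; "
                         "refusing to set the WRITEABLE flag")
    arr.flags.writeable = True
    return arr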
When deserializing arrays, I don't see a reason for the "base" property to even exist - sharing of the buffer between different views is unpreserved anyway, as reported in my other thread. From timmichelsen at gmx-topmail.de Wed Sep 23 04:30:49 2009 From: timmichelsen at gmx-topmail.de (Tim Michelsen) Date: Wed, 23 Sep 2009 08:30:49 +0000 (UTC) Subject: [Numpy-discussion] current stautus of numpy -> Excel Message-ID: FYI: Here is a summary of how one can 1) write numpy arrays to Excel 2) interact with numpy/scipy/... from Excel http://groups.google.com/group/python-excel/msg/3881b7e7ae210cc7 Best regards, Timmie From dagss at student.matnat.uio.no Wed Sep 23 08:55:31 2009 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Wed, 23 Sep 2009 14:55:31 +0200 Subject: [Numpy-discussion] numpy and cython in pure python mode In-Reply-To: <3d375d730909221234y5a0c8ba3m14aeb331badb09d5@mail.gmail.com> References: <3d375d730909221234y5a0c8ba3m14aeb331badb09d5@mail.gmail.com> Message-ID: <4ABA1AC3.9020801@student.matnat.uio.no> Robert Kern wrote: > On Tue, Sep 22, 2009 at 01:33, Sebastian Haase wrote: > >> Hi, >> I'm not subscribed to the cython list - hoping enough people would >> care to justify my post here: >> The post might be justified, but it is a question of available knowledge as well. I nearly missed this post here. The Cython user list is on: http://groups.google.com/group/cython-users >> I know that cython's numpy is still getting better and better over >> time, but is it already today possible to have numpy support when >> using Cython in "pure python" mode? >> I like the idea of being able to develop and debug code "the python >> way" -- and then just switching on the cython-overdrive mode. >> (Otherwise I have very good experience using C/C++ with appropriate >> typemaps, and I don't mind the C syntax) >> >> I only recently learned about the "pure python" mode on the sympy list >> (and at the EuroScipy2009 workshop). >> My understanding is that Cython's pure Python mode could be "played" >> in two ways: >> a) either not having a .pyx-file at all and putting everything into a >> py-file (using the "import cython" stuff) >> or b) putting only cython specific declaration in to a pyx file having >> the same basename as the py-file next to it. >> That should be a pxd file with the same basename. And I think that mode should work. b), that is. Sturla's note on the memory view syntax doesn't apply as that's not in a released version of Cython yet, and won't be until 0.12.1 or 0.13. But that could be made to support Python mode a). Finally there's been some recent discussion on cython-dev about a tool which can take a pyx file as input and output pure Python. > >> One more: there is no way on reload cython-modules (yet), right ? >> > > Correct. There is no way to reload any extension module. > This can be worked around (in most situations that arise in practice) by compiling the module with a new name each time and importing things from it though. Sage already kind of support it (for the %attach feature only), and there are patches around for pyximport in Cython that's just lacking testing and review. Since pyximport lacks a test suite altogether, nobody seems to ever get around to that. 
Dag Sverre From davejwood at gmail.com Wed Sep 23 09:42:43 2009 From: davejwood at gmail.com (davew0000) Date: Wed, 23 Sep 2009 06:42:43 -0700 (PDT) Subject: [Numpy-discussion] Numpy 2D array from a list error Message-ID: <25531145.post@talk.nabble.com> Hi, I've got a fairly large (but not huge, 58mb) tab seperated text file, with approximately 200 columns and 56k rows of numbers and strings. Here's a snippet of my code to create a numpy matrix from the data file... #### data = map(lambda x : x.strip().split('\t'), sys.stdin.readlines()) data = array(data) ### data = array(data) It causes the following error: > ValueError: setting an array element with a sequence If I take the 1st 40,000 lines of the file, it works fine. If I take the last 40,000 lines of the file, it also works fine, so it isn't a problem with the file. I've found a few other posts complaining of the same problem, but none of their fixes work. It seems like a memory problem to me. This was reinforced when I tried to break the dataset into 3 chunks and stack the resulting arrays - I got an error message saying "memory error". I don't really understand why reading in this 57mb txt file is taking up ~2gb's of RAM. Any advice? Thanks in advance Dave -- View this message in context: http://www.nabble.com/Numpy-2D-array-from-a-list-error-tp25531145p25531145.html Sent from the Numpy-discussion mailing list archive at Nabble.com. From ndbecker2 at gmail.com Wed Sep 23 09:48:49 2009 From: ndbecker2 at gmail.com (Neal Becker) Date: Wed, 23 Sep 2009 09:48:49 -0400 Subject: [Numpy-discussion] simple indexing question Message-ID: I have an array: In [12]: a Out[12]: array([[0, 1, 2, 3, 4], [5, 6, 7, 8, 9]]) And a selection array: In [13]: b Out[13]: array([1, 1, 1, 1, 1]) I want a 1-dimensional output, where the array b selects an element from each column of a, where if b[i]=0 select element from 0th row of a and if b[i]=k select element from kth row of a. Easy way to do this? (Not a[b], that gives 5x5 array output) From cimrman3 at ntc.zcu.cz Wed Sep 23 09:59:24 2009 From: cimrman3 at ntc.zcu.cz (Robert Cimrman) Date: Wed, 23 Sep 2009 15:59:24 +0200 Subject: [Numpy-discussion] simple indexing question In-Reply-To: References: Message-ID: <4ABA29BC.9070109@ntc.zcu.cz> Neal Becker wrote: > I have an array: > In [12]: a > Out[12]: > array([[0, 1, 2, 3, 4], > [5, 6, 7, 8, 9]]) > > And a selection array: > In [13]: b > Out[13]: array([1, 1, 1, 1, 1]) > > I want a 1-dimensional output, where the array b selects an element from > each column of a, where if b[i]=0 select element from 0th row of a and if > b[i]=k select element from kth row of a. > > Easy way to do this? (Not a[b], that gives 5x5 array output) It might be stupid, but it works... In [51]: a Out[51]: array([[0, 1, 2, 3, 4], [5, 6, 7, 8, 9]]) In [52]: b = [0,1,0,1,0] In [53]: a.T.flat[a.shape[0]*np.arange(a.shape[1])+b] Out[53]: array([0, 6, 2, 8, 4]) cheers, r. From davejwood at gmail.com Wed Sep 23 10:06:46 2009 From: davejwood at gmail.com (Dave Wood) Date: Wed, 23 Sep 2009 15:06:46 +0100 Subject: [Numpy-discussion] Create numpy array from a list error Message-ID: Hi all, I've got a fairly large (but not huge, 58mb) tab seperated text file, with approximately 200 columns and 56k rows of numbers and strings. Here's a snippet of my code to create a numpy matrix from the data file... 
#### data = map(lambda x : x.strip().split('\t'), sys.stdin.readlines()) data = array(data) ### It causes the following error: data = array(data) ValueError: setting an array element with a sequence If I take the 1st 40,000 lines of the file, it works fine. If I take the last 40,000 lines of the file, it also works fine, so it isn't a problem with the file. I've found a few other posts complaining of the same problem, but none of their fixes work. It seems like a memory problem to me. This was reinforced when I tried to break the dataset into 3 chunks and stack the resulting arrays - I got an error message saying "memory error". Also, I don't really understand why reading in this 57mb txt file is taking up ~2gb's of RAM. Any advice? Thanks in advance Dave -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsouthey at gmail.com Wed Sep 23 10:12:08 2009 From: bsouthey at gmail.com (Bruce Southey) Date: Wed, 23 Sep 2009 09:12:08 -0500 Subject: [Numpy-discussion] Numpy 2D array from a list error In-Reply-To: <25531145.post@talk.nabble.com> References: <25531145.post@talk.nabble.com> Message-ID: <4ABA2CB8.4020406@gmail.com> On 09/23/2009 08:42 AM, davew0000 wrote: > Hi, > > I've got a fairly large (but not huge, 58mb) tab seperated text file, with > approximately 200 columns and 56k rows of numbers and strings. > > Here's a snippet of my code to create a numpy matrix from the data file... > > #### > > data = map(lambda x : x.strip().split('\t'), sys.stdin.readlines()) > data = array(data) > > ### > data = array(data) > It causes the following error: > > >> ValueError: setting an array element with a sequence >> > If I take the 1st 40,000 lines of the file, it works fine. > If I take the last 40,000 lines of the file, it also works fine, so it isn't > a problem with the file. > > I've found a few other posts complaining of the same problem, but none of > their fixes work. > > It seems like a memory problem to me. This was reinforced when I tried to > break the dataset into 3 chunks and stack the resulting arrays - I got an > error message saying "memory error". > I don't really understand why reading in this 57mb txt file is taking up > ~2gb's of RAM. > > Any advice? Thanks in advance > > Dave > If the text file has 'numbers and strings' how is numpy meant to know what dtype to use? Please try genfromtxt especially if columns contain both numbers and strings. What happens if you read a file instead of using stdin? It is possible that one or more rows have multiple sequential delimiters. Please check the row lengths of your 'data' variable after doing: data = map(lambda x : x.strip().split('\t'), sys.stdin.readlines()) Really without the input or system, it is hard to say anything. If you really know your data I would suggest preallocating the array and updating the array one line at a time to avoid the large multiple intermediate objects. Bruce From nissimk at gmail.com Wed Sep 23 10:34:55 2009 From: nissimk at gmail.com (Nissim Karpenstein) Date: Wed, 23 Sep 2009 10:34:55 -0400 Subject: [Numpy-discussion] cummax Message-ID: <11322b7d0909230734w76e836c8k70bfae5a542c76db@mail.gmail.com> Hi, I want a cummax function where given an array inp it returns this: numpy.array([inp[:i].max() for i in xrange(1,len(inp)+1)]). Various python versions equivalent to the above are quite slow (though a single python loop is much faster than a python loop with a nested numpy C loop as shown above). I have numpy 1.3.0 source. 
It looks to me like I could add cummax function by simply adding PyArray_CumMax to multiarraymodule.c which would be the same as PyArray_Max except it would call PyArray_GenericAccumulateFunction instead of PyArray_GenericReduceFunction. Also add array_cummax to arraymethods.c. Is there interest in adding this function to numpy? If so, I will check out the latest code and try to check in these changes. If not, how can I write my own Python module in C that adds this UFunc and still gets to reuse the code in PyArray_GenericReduceFunction? Thanks, -Nissim -------------- next part -------------- An HTML attachment was scrubbed... URL: From davejwood at gmail.com Wed Sep 23 11:00:46 2009 From: davejwood at gmail.com (Dave Wood) Date: Wed, 23 Sep 2009 16:00:46 +0100 Subject: [Numpy-discussion] Numpy 2D array from a list error In-Reply-To: <4ABA2CB8.4020406@gmail.com> References: <25531145.post@talk.nabble.com> <4ABA2CB8.4020406@gmail.com> Message-ID: "If the text file has 'numbers and strings' how is numpy meant to know what dtype to use? Please try genfromtxt especially if columns contain both numbers and strings." Well, I suppose they are all considered to be strings here. I haven't tried to convert the numbers to floats yet. "What happens if you read a file instead of using stdin?" Same problem "It is possible that one or more rows have multiple sequential delimiters. Please check the row lengths of your 'data' variable after doing:" Already done, they all have the same number of rows. The fact that the script works with the first 40k lines, and also with the last 40k lines suggests to me that there is no problem with the file. (I calculate column means and standard deviations later in the script - it's only the first two columns which can't be cast to floating point numbers) "Really without the input or system, it is hard to say anything. If you really know your data I would suggest preallocating the array and updating the array one line at a time to avoid the large multiple intermediate objects." I'm running on linux. My machine is redhat with 2GB RAM, but when memory became an issue I tried running on other Linux machines with much greater RAM capacities. I don't know what distos. I just tried preallocating the array and updating it one line at a time, and that works fine. Thanks very much for the suggestion. :) This doesn't seem like the expected behaviour though and the error message seems wrong. Many thanks, Dave _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Wed Sep 23 11:07:37 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 23 Sep 2009 09:07:37 -0600 Subject: [Numpy-discussion] cummax In-Reply-To: <11322b7d0909230734w76e836c8k70bfae5a542c76db@mail.gmail.com> References: <11322b7d0909230734w76e836c8k70bfae5a542c76db@mail.gmail.com> Message-ID: On Wed, Sep 23, 2009 at 8:34 AM, Nissim Karpenstein wrote: > Hi, > > I want a cummax function where given an array inp it returns this: > > numpy.array([inp[:i].max() for i in xrange(1,len(inp)+1)]). > > Various python versions equivalent to the above are quite slow (though a > single python loop is much faster than a python loop with a nested numpy C > loop as shown above). > > I have numpy 1.3.0 source. 
It looks to me like I could add cummax function > by simply adding PyArray_CumMax to multiarraymodule.c which would be the > same as PyArray_Max except it would call PyArray_GenericAccumulateFunction > instead of PyArray_GenericReduceFunction. Also add array_cummax to > arraymethods.c. > > Is there interest in adding this function to numpy? If so, I will check > out the latest code and try to check in these changes. > If not, how can I write my own Python module in C that adds this UFunc and > still gets to reuse the code in PyArray_GenericReduceFunction? > > It's already available In [5]: a = arange(10) In [6]: a[5:] = 0 In [7]: maximum.accumulate(a) Out[7]: array([0, 1, 2, 3, 4, 4, 4, 4, 4, 4]) PyArray_Max is there because it is an ndarray method. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From ndbecker2 at gmail.com Wed Sep 23 11:12:51 2009 From: ndbecker2 at gmail.com (Neal Becker) Date: Wed, 23 Sep 2009 11:12:51 -0400 Subject: [Numpy-discussion] simple indexing question References: <4ABA29BC.9070109@ntc.zcu.cz> Message-ID: Robert Cimrman wrote: > Neal Becker wrote: >> I have an array: >> In [12]: a >> Out[12]: >> array([[0, 1, 2, 3, 4], >> [5, 6, 7, 8, 9]]) >> >> And a selection array: >> In [13]: b >> Out[13]: array([1, 1, 1, 1, 1]) >> >> I want a 1-dimensional output, where the array b selects an element from >> each column of a, where if b[i]=0 select element from 0th row of a and if >> b[i]=k select element from kth row of a. >> >> Easy way to do this? (Not a[b], that gives 5x5 array output) > > It might be stupid, but it works... > > In [51]: a > Out[51]: > array([[0, 1, 2, 3, 4], > [5, 6, 7, 8, 9]]) > > In [52]: b = [0,1,0,1,0] > > In [53]: a.T.flat[a.shape[0]*np.arange(a.shape[1])+b] > Out[53]: array([0, 6, 2, 8, 4]) > > cheers, > r. Thanks. Is there really no more elegant solution? From josef.pktd at gmail.com Wed Sep 23 11:31:17 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 23 Sep 2009 11:31:17 -0400 Subject: [Numpy-discussion] simple indexing question In-Reply-To: References: <4ABA29BC.9070109@ntc.zcu.cz> Message-ID: <1cd32cbb0909230831g69b6bb32v87ae6d082f8e9027@mail.gmail.com> On Wed, Sep 23, 2009 at 11:12 AM, Neal Becker wrote: > Robert Cimrman wrote: > >> Neal Becker wrote: >>> I have an array: >>> In [12]: a >>> Out[12]: >>> array([[0, 1, 2, 3, 4], >>> ? ? ? ?[5, 6, 7, 8, 9]]) >>> >>> And a selection array: >>> In [13]: b >>> Out[13]: array([1, 1, 1, 1, 1]) >>> >>> I want a 1-dimensional output, where the array b selects an element from >>> each column of a, where if b[i]=0 select element from 0th row of a and if >>> b[i]=k select element from kth row of a. >>> >>> Easy way to do this? ?(Not a[b], that gives 5x5 array output) >> >> It might be stupid, but it works... >> >> In [51]: a >> Out[51]: >> array([[0, 1, 2, 3, 4], >> ? ? ? ? [5, 6, 7, 8, 9]]) >> >> In [52]: b = [0,1,0,1,0] >> >> In [53]: a.T.flat[a.shape[0]*np.arange(a.shape[1])+b] >> Out[53]: array([0, 6, 2, 8, 4]) >> >> cheers, >> r. > > Thanks. ?Is there really no more elegant solution? How about this? 
>>> a array([[0, 1, 2, 3, 4], [5, 6, 7, 8, 9]]) >>> b array([0, 1, 0, 1, 0]) >>> a[b,np.arange(a.shape[1])] array([0, 6, 2, 8, 4]) Josef > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From bsouthey at gmail.com Wed Sep 23 11:34:52 2009 From: bsouthey at gmail.com (Bruce Southey) Date: Wed, 23 Sep 2009 10:34:52 -0500 Subject: [Numpy-discussion] Numpy 2D array from a list error In-Reply-To: References: <25531145.post@talk.nabble.com> <4ABA2CB8.4020406@gmail.com> Message-ID: <4ABA401C.9010203@gmail.com> On 09/23/2009 10:00 AM, Dave Wood wrote: > "If the text file has 'numbers and strings' how is numpy meant to know > what dtype to use? > Please try genfromtxt especially if columns contain both numbers and > strings." > Well, I suppose they are all considered to be strings here. I haven't > tried to convert the numbers to floats yet. > "What happens if you read a file instead of using stdin?" > Same problem > > "It is possible that one or more rows have multiple sequential delimiters. > Please check the row lengths of your 'data' variable after doing:" > Already done, they all have the same number of rows. > The fact that the script works with the first 40k lines, and also with > the last 40k lines suggests to me that there is no problem with the file. > (I calculate column means and standard deviations later in the script > - it's only the first two columns which can't be cast to floating > point numbers) > > "Really without the input or system, it is hard to say anything. > If you really know your data I would suggest preallocating the array > and updating the array one line at a time to avoid the large multiple > intermediate objects." > I'm running on linux. My machine is redhat with 2GB RAM, but when > memory became an issue I tried running on other Linux machines with > much greater RAM capacities. I don't know what distos. > I just tried preallocating the array and updating it one line at a > time, and that works fine. Thanks very much for the suggestion. :) > This doesn't seem like the expected behaviour though and the error > message seems wrong. > Many thanks, > Dave > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > ------------------------------------------------------------------------ > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > Glad it you got a solution. While far from an expert, with 2GB ram you do not have that much free RAM outside the OS and other overheads. With your code, the OS has to read all the data in at least once as well as allocate the storage for the result and any intermediate objects. So it is easy to exhaust memory. I agree that the error message is too vague so you could file a ticket. Use PyTables if memory is a problem for you. For example, see the recent 'np.memmap and memory usage' thread on numpy discussion: http://www.mail-archive.com/numpy-discussion at scipy.org/msg18863.html Especially the post by Francesc Alted: http://www.mail-archive.com/numpy-discussion at scipy.org/msg18868.html Bruce -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jsseabold at gmail.com Wed Sep 23 11:43:51 2009 From: jsseabold at gmail.com (Skipper Seabold) Date: Wed, 23 Sep 2009 11:43:51 -0400 Subject: [Numpy-discussion] Numpy 2D array from a list error In-Reply-To: <25531145.post@talk.nabble.com> References: <25531145.post@talk.nabble.com> Message-ID: On Wed, Sep 23, 2009 at 9:42 AM, davew0000 wrote: > > Hi, > > I've got a fairly large (but not huge, 58mb) tab seperated text file, with > approximately 200 columns and 56k rows of numbers and strings. > > Here's a snippet of my code to create a numpy matrix from the data file... > > #### > > data = map(lambda x : x.strip().split('\t'), sys.stdin.readlines()) > data = array(data) > > ### > data = array(data) > It causes the following error: > >> ValueError: setting an array element with a sequence > > If I take the 1st 40,000 lines of the file, it works fine. > If I take the last 40,000 lines of the file, it also works fine, so it isn't > a problem with the file. > > I've found a few other posts complaining of the same problem, but none of > their fixes work. > > It seems like a memory problem to me. This was reinforced when I tried to > break the dataset into 3 chunks and stack the resulting arrays - I got an > error message saying "memory error". > I don't really understand why reading in this 57mb txt file is taking up > ~2gb's of RAM. > > Any advice? Thanks in advance > Without knowing more, I wouldn't think that there's really a memory error trying to load a 57 MB file or stacking it split into 3. Try using genfromtxt or loadtxt. It should work without a problem unless there is something funny about your file. Skipper From sienkiew at stsci.edu Wed Sep 23 11:52:30 2009 From: sienkiew at stsci.edu (Mark Sienkiewicz) Date: Wed, 23 Sep 2009 11:52:30 -0400 Subject: [Numpy-discussion] Numpy depends on OpenSSL ??? Message-ID: <4ABA443E.3030307@stsci.edu> I have discovered the hard way that numpy depends on openssl. I am building a 64 bit python environment for the macintosh. I currently do not have a 64 bit openssl library installed, so the python interpreter does not have hashlib. (hashlib gets its md5 function from the openssl library.) The problem is in numpy/core/code_generators/genapi.py, where it appears to be trying to make an md5 hash of the declarations of some of the C functions. What is this hash used for? Is there a particular reason that it needs to be cryptographically strong? Mark S. From cimrman3 at ntc.zcu.cz Wed Sep 23 11:58:23 2009 From: cimrman3 at ntc.zcu.cz (Robert Cimrman) Date: Wed, 23 Sep 2009 17:58:23 +0200 Subject: [Numpy-discussion] simple indexing question In-Reply-To: <1cd32cbb0909230831g69b6bb32v87ae6d082f8e9027@mail.gmail.com> References: <4ABA29BC.9070109@ntc.zcu.cz> <1cd32cbb0909230831g69b6bb32v87ae6d082f8e9027@mail.gmail.com> Message-ID: <4ABA459F.9050009@ntc.zcu.cz> josef.pktd at gmail.com wrote: > On Wed, Sep 23, 2009 at 11:12 AM, Neal Becker wrote: >> Robert Cimrman wrote: >> >>> Neal Becker wrote: >>>> I have an array: >>>> In [12]: a >>>> Out[12]: >>>> array([[0, 1, 2, 3, 4], >>>> [5, 6, 7, 8, 9]]) >>>> >>>> And a selection array: >>>> In [13]: b >>>> Out[13]: array([1, 1, 1, 1, 1]) >>>> >>>> I want a 1-dimensional output, where the array b selects an element from >>>> each column of a, where if b[i]=0 select element from 0th row of a and if >>>> b[i]=k select element from kth row of a. >>>> >>>> Easy way to do this? (Not a[b], that gives 5x5 array output) >>> It might be stupid, but it works... 
>>> >>> In [51]: a >>> Out[51]: >>> array([[0, 1, 2, 3, 4], >>> [5, 6, 7, 8, 9]]) >>> >>> In [52]: b = [0,1,0,1,0] >>> >>> In [53]: a.T.flat[a.shape[0]*np.arange(a.shape[1])+b] >>> Out[53]: array([0, 6, 2, 8, 4]) >>> >>> cheers, >>> r. >> Thanks. Is there really no more elegant solution? > > How about this? > >>>> a > array([[0, 1, 2, 3, 4], > [5, 6, 7, 8, 9]]) >>>> b > array([0, 1, 0, 1, 0]) > >>>> a[b,np.arange(a.shape[1])] > array([0, 6, 2, 8, 4]) So it was stupid :) well, time to go home, r. From robert.kern at gmail.com Wed Sep 23 12:15:34 2009 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 23 Sep 2009 11:15:34 -0500 Subject: [Numpy-discussion] Numpy depends on OpenSSL ??? In-Reply-To: <4ABA443E.3030307@stsci.edu> References: <4ABA443E.3030307@stsci.edu> Message-ID: <3d375d730909230915v73d3abcds94d6f6401e2f86d9@mail.gmail.com> On Wed, Sep 23, 2009 at 10:52, Mark Sienkiewicz wrote: > I have discovered the hard way that numpy depends on openssl. > > I am building a 64 bit python environment for the macintosh. ?I > currently do not have a 64 bit openssl library installed, so the python > interpreter does not have hashlib. ?(hashlib gets its md5 function from > the openssl library.) There are builtin implementations that do not depend on OpenSSL. hashlib should be using them for MD5 and the standard SHA variants when OpenSSL is not available. Try "import _md5". But basically, we expect you to have a reasonably complete standard library. > The problem is in numpy/core/code_generators/genapi.py, where it appears > to be trying to make an md5 hash of the declarations of some of the C > functions. > > What is this hash used for? ?Is there a particular reason that it needs > to be cryptographically strong? It is used for checking for changes in the API. While this use case does not require all of the properties that would make a hash cryptographically strong, it needs some of them. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From charlesr.harris at gmail.com Wed Sep 23 12:20:36 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 23 Sep 2009 10:20:36 -0600 Subject: [Numpy-discussion] Numpy depends on OpenSSL ??? In-Reply-To: <4ABA443E.3030307@stsci.edu> References: <4ABA443E.3030307@stsci.edu> Message-ID: On Wed, Sep 23, 2009 at 9:52 AM, Mark Sienkiewicz wrote: > I have discovered the hard way that numpy depends on openssl. > > I am building a 64 bit python environment for the macintosh. I > currently do not have a 64 bit openssl library installed, so the python > interpreter does not have hashlib. (hashlib gets its md5 function from > the openssl library.) > > The problem is in numpy/core/code_generators/genapi.py, where it appears > to be trying to make an md5 hash of the declarations of some of the C > functions. > > What is this hash used for? Is there a particular reason that it needs > to be cryptographically strong? > > The hash is used as a way to check for any API changes. It doesn't have to be cryptographically strong, it just needs to scatter the hashed values effectively and we could probably use something simpler. I tend to regard this problem as a Python bug because the standard python modules should be available on all platforms. In any case, we should find a fix. Please open a ticket. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From gokhansever at gmail.com Wed Sep 23 12:24:51 2009 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_Sever?=) Date: Wed, 23 Sep 2009 11:24:51 -0500 Subject: [Numpy-discussion] Create numpy array from a list error In-Reply-To: References: Message-ID: <49d6b3500909230924n14ad45b3tf6721af17ec21efe@mail.gmail.com> On Wed, Sep 23, 2009 at 9:06 AM, Dave Wood wrote: > Hi all, > > I've got a fairly large (but not huge, 58mb) tab seperated text file, with > approximately 200 columns and 56k rows of numbers and strings. > > Here's a snippet of my code to create a numpy matrix from the data file... > > #### > > data = map(lambda x : x.strip().split('\t'), sys.stdin.readlines()) > data = array(data) > > ### > > It causes the following error: > > data = array(data) > ValueError: setting an array element with a sequence > > If I take the 1st 40,000 lines of the file, it works fine. > If I take the last 40,000 lines of the file, it also works fine, so it > isn't a problem with the file. > > I've found a few other posts complaining of the same problem, but none of > their fixes work. > > It seems like a memory problem to me. This was reinforced when I tried to > break the dataset into 3 chunks and stack the resulting arrays - I got an > error message saying "memory error". > Also, I don't really understand why reading in this 57mb txt file is taking > up ~2gb's of RAM. > > Any advice? Thanks in advance > > Dave > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > Here I use loadtxt to read ~89 MB txt file. Can you use loadtxt and share your results? I[14]: data = np.loadtxt('09_03_18_07_55_33.sau', dtype='float', skiprows=83).T I[15]: len data -----> len(data) O[15]: 66 I[16]: len data[0] -----> len(data[0]) O[16]: 117040 I[17]: whos Variable Type Data/Info -------------------------------- data ndarray 66x117040: 7724640 elems, type `float64`, 61797120 bytes (58 Mb) [gsever at ccn various]$ python sysinfo.py ================================================================================ Platform : Linux-2.6.29.6-217.2.3.fc11.i686.PAE-i686-with-fedora-11-Leonidas Python : ('CPython', 'tags/r26', '66714') IPython : 0.10 NumPy : 1.4.0.dev Matplotlib : 1.0.svn ================================================================================ -- G?khan -------------- next part -------------- An HTML attachment was scrubbed... URL: From dwf at cs.toronto.edu Wed Sep 23 12:30:19 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Wed, 23 Sep 2009 12:30:19 -0400 Subject: [Numpy-discussion] Numpy depends on OpenSSL ??? In-Reply-To: <4ABA443E.3030307@stsci.edu> References: <4ABA443E.3030307@stsci.edu> Message-ID: <9ED5D59E-7156-405B-BA02-5039422F4EAC@cs.toronto.edu> On 23-Sep-09, at 11:52 AM, Mark Sienkiewicz wrote: > I am building a 64 bit python environment for the macintosh. I > currently do not have a 64 bit openssl library installed, so the > python > interpreter does not have hashlib. (hashlib gets its md5 function > from > the openssl library.) If you're interested in remedying this with your Python build, have a look at Mac/BuildScript, there is a bunch of logic there that downloads various optional dependencies and builds them with the selected architectures. It should not be difficult to modify it to also grab and build openssl. 
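On the md5 use itself: since, as noted earlier in this thread, the hash only has to detect changes in the API declarations, a build whose interpreter lacks hashlib could in principle fall back to a plain checksum. A minimal sketch of that idea (hypothetical; this is not the code actually in numpy/core/code_generators/genapi.py):

try:
    from hashlib import md5
    def api_checksum(text):
        # Normal case: md5 of the concatenated C API declarations.
        return md5(text).hexdigest()
except ImportError:
    import zlib
    def api_checksum(text):
        # Degraded case: a non-cryptographic CRC is still enough to
        # notice that the declarations have changed.
        return "%08x" % (zlib.crc32(text) & 0xffffffff)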
David From cournape at gmail.com Wed Sep 23 12:31:50 2009 From: cournape at gmail.com (David Cournapeau) Date: Thu, 24 Sep 2009 01:31:50 +0900 Subject: [Numpy-discussion] Numpy depends on OpenSSL ??? In-Reply-To: References: <4ABA443E.3030307@stsci.edu> Message-ID: <5b8d13220909230931q5d4d5c39n4af55f7c9658449f@mail.gmail.com> On Thu, Sep 24, 2009 at 1:20 AM, Charles R Harris wrote: > In any case, we should find a fix. I don't think we do - we requires a standard python install, and a python without hashlib is crippled. If you can't build python without openssl, I would consider this a python bug. cheers, David From Chris.Barker at noaa.gov Wed Sep 23 12:36:08 2009 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Wed, 23 Sep 2009 09:36:08 -0700 Subject: [Numpy-discussion] Numpy 2D array from a list error In-Reply-To: References: <25531145.post@talk.nabble.com> <4ABA2CB8.4020406@gmail.com> Message-ID: <4ABA4E78.7040503@noaa.gov> Dave Wood wrote: > Well, I suppose they are all considered to be strings here. I haven't > tried to convert the numbers to floats yet. This could be an issue. For strings, numpy creates an array of strings, all of the same length, so each element is as big as the largest one: In [13]: l Out[13]: ['5', '34', 'this is a much longer string'] In [14]: np.array(l) Out[14]: array(['5', '34', 'this is a much longer string'], dtype='|S28') Note that each element is 28 bytes (that's what the S28 means). this means that your array would be much larger than the text file if you have even one long string it in. Also, as mentioned in this thread, in order to figure out how big to make each string element, the array() constructor has to scan through your entire list first, and I don't know how much intermediate memory it may use in that process. This really isn't how numpy is meant to be used -- why would you want a big ol' array of mixed numbers and strings, all stored as strings? structured arrays were meant for this, and np.loadtxt() is the easiest way to get one. > I just tried preallocating the array and updating it one line at a time, > and that works fine. what dtype do you end up with? > This doesn't seem like the expected behaviour though and the error > message seems wrong. yes, not a good error message at all -- it's hard to make sure good errors get triggered every time! HTH, -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From gokhansever at gmail.com Wed Sep 23 12:34:31 2009 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_Sever?=) Date: Wed, 23 Sep 2009 11:34:31 -0500 Subject: [Numpy-discussion] Create numpy array from a list error In-Reply-To: References: Message-ID: <49d6b3500909230934v8afb912vc0f1863547bc915c@mail.gmail.com> On Wed, Sep 23, 2009 at 9:06 AM, Dave Wood wrote: > Hi all, > > I've got a fairly large (but not huge, 58mb) tab seperated text file, with > approximately 200 columns and 56k rows of numbers and strings. > > Here's a snippet of my code to create a numpy matrix from the data file... > > #### > > data = map(lambda x : x.strip().split('\t'), sys.stdin.readlines()) > data = array(data) > > ### > > It causes the following error: > > data = array(data) > ValueError: setting an array element with a sequence > > If I take the 1st 40,000 lines of the file, it works fine. 
> If I take the last 40,000 lines of the file, it also works fine, so it > isn't a problem with the file. > > I've found a few other posts complaining of the same problem, but none of > their fixes work. > > It seems like a memory problem to me. This was reinforced when I tried to > break the dataset into 3 chunks and stack the resulting arrays - I got an > error message saying "memory error". > Also, I don't really understand why reading in this 57mb txt file is taking > up ~2gb's of RAM. > > Any advice? Thanks in advance > > Dave > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > One more reply, You try to read mixed data (strings and numbers) into an array, that might be causing the problem. In my example, after skipping the meta-header all I have is numbers. Additionally, when you are reading chunk of data if one of the column elements truncated or overflows its section NumPy complains with that ValueError. -- G?khan -------------- next part -------------- An HTML attachment was scrubbed... URL: From davejwood at gmail.com Wed Sep 23 12:56:20 2009 From: davejwood at gmail.com (Dave Wood) Date: Wed, 23 Sep 2009 17:56:20 +0100 Subject: [Numpy-discussion] Numpy 2D array from a list error In-Reply-To: <4ABA4E78.7040503@noaa.gov> References: <25531145.post@talk.nabble.com> <4ABA2CB8.4020406@gmail.com> <4ABA4E78.7040503@noaa.gov> Message-ID: Appologies for the multiple posts, people. My posting to the forum was pending for a long time, so I deleted it and tried emailing directly. I didn't think they'd all be sent out. Gokan, thanks for the reply, I hope you get this one. "Here I use loadtxt to read ~89 MB txt file. Can you use loadtxt and share your results? I[14]: data = np.loadtxt('09_03_18_07_55_33.sau', dtype='float', skiprows=83).T I[15]: len data -----> len(data) O[15]: 66 I[16]: len data[0] -----> len(data[0]) O[16]: 117040 I[17]: whos Variable Type Data/Info -------------------------------- data ndarray 66x117040: 7724640 elems, type `float64`, 61797120 bytes (58 Mb) [gsever at ccn various]$ python sysinfo.py ================================================================================ Platform : Linux-2.6.29.6-217.2.3.fc11.i686.PAE-i686-with-fedora-11-Leonidas Python : ('CPython', 'tags/r26', '66714') IPython : 0.10 NumPy : 1.4.0.dev Matplotlib : 1.0.svn ================================================================================ -- G?khan" I tried using loadtxt and got the same error as before (with a little more information). " Traceback (most recent call last): File "/home/dwood/workspace/GeneralScripts/src/test_clab2R.py", line 140, in main() File "/home/dwood/workspace/GeneralScripts/src/test_clab2R.py", line 45, in main data = loadtxt("inputfile.txt",dtype='string') File "/apps/python/2.5.4/rhel4/lib/python2.5/site-packages/numpy/lib/io.py", line 505, in loadtxt X = np.array(X, dtype) ValueError: setting an array element with a sequence " @Christopher Barker Thanks for the information. To fix my problem, I tried taking out the row names (leaving only numerical information), and converting the 2D list to floats. I still had the same problem. On 9/23/09, Christopher Barker wrote: > > Dave Wood wrote: > > Well, I suppose they are all considered to be strings here. I haven't > > tried to convert the numbers to floats yet. > > This could be an issue. 
For strings, numpy creates an array of strings, > all of the same length, so each element is as big as the largest one: > > In [13]: l > Out[13]: ['5', '34', 'this is a much longer string'] > > In [14]: np.array(l) > Out[14]: > array(['5', '34', 'this is a much longer string'], > dtype='|S28') > > > Note that each element is 28 bytes (that's what the S28 means). > > this means that your array would be much larger than the text file if > you have even one long string it in. Also, as mentioned in this thread, > in order to figure out how big to make each string element, the array() > constructor has to scan through your entire list first, and I don't know > how much intermediate memory it may use in that process. > > This really isn't how numpy is meant to be used -- why would you want a > big ol' array of mixed numbers and strings, all stored as strings? > > structured arrays were meant for this, and np.loadtxt() is the easiest > way to get one. > > > I just tried preallocating the array and updating it one line at a time, > > and that works fine. > > what dtype do you end up with? > > > This doesn't seem like the expected behaviour though and the error > > message seems wrong. > > yes, not a good error message at all -- it's hard to make sure good > errors get triggered every time! > > > HTH, > > -Chris > > > > -- > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From davejwood at gmail.com Wed Sep 23 13:03:19 2009 From: davejwood at gmail.com (Dave Wood) Date: Wed, 23 Sep 2009 18:03:19 +0100 Subject: [Numpy-discussion] Numpy 2D array from a list error In-Reply-To: References: <25531145.post@talk.nabble.com> <4ABA2CB8.4020406@gmail.com> <4ABA4E78.7040503@noaa.gov> Message-ID: Ignore that last mail, I hit send instead of save by mistake. Between you you seem to be right, it's a problem with loading the array of strings. There must be some large strings in the first 'rowname' column. If this column is left out, it works fine (even as strings). Many thanks, sorry for all the emails. Dave On 9/23/09, Dave Wood wrote: > > Appologies for the multiple posts, people. My posting to the forum was > pending for a long time, so I deleted it and tried emailing directly. I > didn't think they'd all be sent out. > Gokan, thanks for the reply, I hope you get this one. > > "Here I use loadtxt to read ~89 MB txt file. Can you use loadtxt and share > your results? 
> > I[14]: data = np.loadtxt('09_03_18_07_55_33.sau', dtype='float', > skiprows=83).T > > I[15]: len data > -----> len(data) > O[15]: 66 > > I[16]: len data[0] > -----> len(data[0]) > O[16]: 117040 > > I[17]: whos > Variable Type Data/Info > -------------------------------- > data ndarray 66x117040: 7724640 elems, type `float64`, 61797120 > bytes (58 Mb) > > > > [gsever at ccn various]$ python sysinfo.py > > ================================================================================ > Platform : > Linux-2.6.29.6-217.2.3.fc11.i686.PAE-i686-with-fedora-11-Leonidas > Python : ('CPython', 'tags/r26', '66714') > IPython : 0.10 > NumPy : 1.4.0.dev > Matplotlib : 1.0.svn > > ================================================================================ > > > -- > G?khan" > > > > > I tried using loadtxt and got the same error as before (with a little more > information). > > " > > Traceback (most recent call last): > File "/home/dwood/workspace/GeneralScripts/src/test_clab2R.py", line > 140, in > main() > File "/home/dwood/workspace/GeneralScripts/src/test_clab2R.py", line 45, > in main > data = loadtxt("inputfile.txt",dtype='string') > File > "/apps/python/2.5.4/rhel4/lib/python2.5/site-packages/numpy/lib/io.py", line > 505, in loadtxt > X = np.array(X, dtype) > ValueError: setting an array element with a sequence > " > > @Christopher Barker > Thanks for the information. To fix my problem, I tried taking out the row > names (leaving only numerical information), and converting the 2D list to > floats. I still had the same problem. > > > On 9/23/09, Christopher Barker wrote: >> >> Dave Wood wrote: >> > Well, I suppose they are all considered to be strings here. I haven't >> > tried to convert the numbers to floats yet. >> >> This could be an issue. For strings, numpy creates an array of strings, >> all of the same length, so each element is as big as the largest one: >> >> In [13]: l >> Out[13]: ['5', '34', 'this is a much longer string'] >> >> In [14]: np.array(l) >> Out[14]: >> array(['5', '34', 'this is a much longer string'], >> dtype='|S28') >> >> >> Note that each element is 28 bytes (that's what the S28 means). >> >> this means that your array would be much larger than the text file if >> you have even one long string it in. Also, as mentioned in this thread, >> in order to figure out how big to make each string element, the array() >> constructor has to scan through your entire list first, and I don't know >> how much intermediate memory it may use in that process. >> >> This really isn't how numpy is meant to be used -- why would you want a >> big ol' array of mixed numbers and strings, all stored as strings? >> >> structured arrays were meant for this, and np.loadtxt() is the easiest >> way to get one. >> >> > I just tried preallocating the array and updating it one line at a time, >> > and that works fine. >> >> what dtype do you end up with? >> >> > This doesn't seem like the expected behaviour though and the error >> > message seems wrong. >> >> yes, not a good error message at all -- it's hard to make sure good >> errors get triggered every time! >> >> >> HTH, >> >> -Chris >> >> >> >> -- >> Christopher Barker, Ph.D. 
>> Oceanographer >> >> Emergency Response Division >> NOAA/NOS/OR&R (206) 526-6959 voice >> 7600 Sand Point Way NE (206) 526-6329 fax >> Seattle, WA 98115 (206) 526-6317 main reception >> >> Chris.Barker at noaa.gov >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Wed Sep 23 13:05:01 2009 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 23 Sep 2009 12:05:01 -0500 Subject: [Numpy-discussion] is ndarray.base the closest base or the ultimate base? In-Reply-To: <271BED32E925E646A1333A56D9C6AFCB3E13D7D3E6@MBOX0.essex.ac.uk> References: <271BED32E925E646A1333A56D9C6AFCB31E561A01D@MBOX0.essex.ac.uk> <271BED32E925E646A1333A56D9C6AFCB31E561A01F@MBOX0.essex.ac.uk> <200909211328.41980.meine@informatik.uni-hamburg.de> <45d1ab480909221341s1ebf05b4u884b8b7cadf75ce6@mail.gmail.com> <271BED32E925E646A1333A56D9C6AFCB3E13D7D3E6@MBOX0.essex.ac.uk> Message-ID: <3d375d730909231005n19d83f9an322cb5d248c4ae1a@mail.gmail.com> On Tue, Sep 22, 2009 at 17:14, Citi, Luca wrote: > My vote (if I am entitled to) goes to "change the code". > Whether or not the addressee of .base is an array, it should be "the object that has to be kept alive such that the data does not get deallocated" rather "one object which will keep alive another object, which will keep alive another object, ...., which will keep alive the object with the data". > On creation of a new view B of object A, if A has ONWDATA true then B.base = A, else B.base = A.base. > > When working on > http://projects.scipy.org/numpy/ticket/1085 > I had to walk the chain of bases to establish whether any of the inputs and the outputs were views of the same data. > If "base" were the ultimate base, one would only need to check whether any of the inputs have the same base of any of the outputs. This is not reliable. You need to check memory addresses and extents for overlap (unfortunately, slices complicate this; numpy.may_share_memory() is a good heuristic, though). When interfacing with other systems using __array_interface__ or similar APIs, the other system may have multiple objects that point to the same data. If you create ndarrays from each of these objects, their .base attributes would all be different although they all point to the same memory. > I tried to modify the code to change the behaviour. > I have opened a ticket for this http://projects.scipy.org/numpy/ticket/1232 > and attached a patch but I am not 100% sure. > I changed PyArray_View in convert.c and a few places in mapping.c and sequence.c. > > But if there is any reason why the current behaviour should be kept, just ignore the ticket. Lacking a robust use case, I would prefer to keep the current behavior. It is likely that nothing would break if we changed it, but without a use case, I would prefer to be conservative. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From ndbecker2 at gmail.com Wed Sep 23 13:07:57 2009 From: ndbecker2 at gmail.com (Neal Becker) Date: Wed, 23 Sep 2009 13:07:57 -0400 Subject: [Numpy-discussion] simple indexing question References: <4ABA29BC.9070109@ntc.zcu.cz> <1cd32cbb0909230831g69b6bb32v87ae6d082f8e9027@mail.gmail.com> Message-ID: josef.pktd at gmail.com wrote: > On Wed, Sep 23, 2009 at 11:12 AM, Neal Becker wrote: >> Robert Cimrman wrote: >> >>> Neal Becker wrote: >>>> I have an array: >>>> In [12]: a >>>> Out[12]: >>>> array([[0, 1, 2, 3, 4], >>>> [5, 6, 7, 8, 9]]) >>>> >>>> And a selection array: >>>> In [13]: b >>>> Out[13]: array([1, 1, 1, 1, 1]) >>>> >>>> I want a 1-dimensional output, where the array b selects an element >>>> from each column of a, where if b[i]=0 select element from 0th row of a >>>> and if b[i]=k select element from kth row of a. >>>> >>>> Easy way to do this? (Not a[b], that gives 5x5 array output) >>> >>> It might be stupid, but it works... >>> >>> In [51]: a >>> Out[51]: >>> array([[0, 1, 2, 3, 4], >>> [5, 6, 7, 8, 9]]) >>> >>> In [52]: b = [0,1,0,1,0] >>> >>> In [53]: a.T.flat[a.shape[0]*np.arange(a.shape[1])+b] >>> Out[53]: array([0, 6, 2, 8, 4]) >>> >>> cheers, >>> r. >> >> Thanks. Is there really no more elegant solution? > > How about this? > >>>> a > array([[0, 1, 2, 3, 4], > [5, 6, 7, 8, 9]]) >>>> b > array([0, 1, 0, 1, 0]) > >>>> a[b,np.arange(a.shape[1])] > array([0, 6, 2, 8, 4]) > > Josef > Thanks, that's not bad. I'm a little surprised that given the fancy indexing capabilities of np there isn't a more direct way to do this. I'm still trying to wrap my mind around the fancy indexing stuff. From dwf at cs.toronto.edu Wed Sep 23 13:26:25 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Wed, 23 Sep 2009 13:26:25 -0400 Subject: [Numpy-discussion] Create numpy array from a list error In-Reply-To: References: Message-ID: On 23-Sep-09, at 10:06 AM, Dave Wood wrote: > Hi all, > > I've got a fairly large (but not huge, 58mb) tab seperated text > file, with > approximately 200 columns and 56k rows of numbers and strings. > > Here's a snippet of my code to create a numpy matrix from the data > file... > > #### > > data = map(lambda x : x.strip().split('\t'), sys.stdin.readlines()) > data = array(data) In general I have found that the pattern your using is a bad one, because it's first reading the entire file into memory and then making a complete copy of it when you call map. I would instead use data = [x.strip().split('\t') for x in sys.stdin] or even defer the loop until array() is called, with a generator: data = (x.strip().split('\t') for x in sys.stdin) This difference still shouldn't be resulting in a memory error with only 57 MB of data, but it'll make things go faster at least. David From lciti at essex.ac.uk Wed Sep 23 14:30:16 2009 From: lciti at essex.ac.uk (Citi, Luca) Date: Wed, 23 Sep 2009 19:30:16 +0100 Subject: [Numpy-discussion] is ndarray.base the closest base or the ultimate base? 
In-Reply-To: <3d375d730909231005n19d83f9an322cb5d248c4ae1a@mail.gmail.com> References: <271BED32E925E646A1333A56D9C6AFCB31E561A01D@MBOX0.essex.ac.uk> <271BED32E925E646A1333A56D9C6AFCB31E561A01F@MBOX0.essex.ac.uk> <200909211328.41980.meine@informatik.uni-hamburg.de> <45d1ab480909221341s1ebf05b4u884b8b7cadf75ce6@mail.gmail.com> <271BED32E925E646A1333A56D9C6AFCB3E13D7D3E6@MBOX0.essex.ac.uk>, <3d375d730909231005n19d83f9an322cb5d248c4ae1a@mail.gmail.com> Message-ID: <271BED32E925E646A1333A56D9C6AFCB3E13D7D3ED@MBOX0.essex.ac.uk> > Lacking a robust use case, I would prefer to keep the current > behavior. It is likely that nothing would break if we changed it, but > without a use case, I would prefer to be conservative. Fair enough. >> When working on >> http://projects.scipy.org/numpy/ticket/1085 >> I had to walk the chain of bases to establish whether any of the inputs and the outputs were views of the same data. >> If "base" were the ultimate base, one would only need to check whether any of the inputs have the same base of any of the outputs. > This is not reliable. You need to check memory addresses and extents > for overlap (unfortunately, slices complicate this; > numpy.may_share_memory() is a good heuristic, though). When > nterfacing with other systems using __array_interface__ or similar > APIs, the other system may have multiple objects that point to the > same data. If you create ndarrays from each of these objects, their > .base attributes would all be different although they all point to the > same memory. Lesson learned. I always make the mistake to think of numpy in terms of arrays while instead many users use it to access to external data. >> http://projects.scipy.org/numpy/ticket/1085 But I think in that case it was still an improvement w.r.t. the current implementation which is buggy. At least it shields 95% of users from unexpected results. Using memory addresses and extents might be overkilling (and expensive) in that case. Thanks. Luca From robert.kern at gmail.com Wed Sep 23 14:49:24 2009 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 23 Sep 2009 13:49:24 -0500 Subject: [Numpy-discussion] is ndarray.base the closest base or the ultimate base? In-Reply-To: <271BED32E925E646A1333A56D9C6AFCB3E13D7D3ED@MBOX0.essex.ac.uk> References: <271BED32E925E646A1333A56D9C6AFCB31E561A01D@MBOX0.essex.ac.uk> <271BED32E925E646A1333A56D9C6AFCB31E561A01F@MBOX0.essex.ac.uk> <200909211328.41980.meine@informatik.uni-hamburg.de> <45d1ab480909221341s1ebf05b4u884b8b7cadf75ce6@mail.gmail.com> <271BED32E925E646A1333A56D9C6AFCB3E13D7D3E6@MBOX0.essex.ac.uk> <3d375d730909231005n19d83f9an322cb5d248c4ae1a@mail.gmail.com> <271BED32E925E646A1333A56D9C6AFCB3E13D7D3ED@MBOX0.essex.ac.uk> Message-ID: <3d375d730909231149x44e30b99s6b08403d1b11ecc2@mail.gmail.com> On Wed, Sep 23, 2009 at 13:30, Citi, Luca wrote: >>> http://projects.scipy.org/numpy/ticket/1085 > But I think in that case it was still an improvement w.r.t. the current implementation > which is buggy. At least it shields 95% of users from unexpected results. > Using memory addresses and extents might be overkilling (and expensive) in that case. numpy.may_share_memory() should be pretty cheap. It's just arithmetic. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From pav at iki.fi Wed Sep 23 14:59:08 2009 From: pav at iki.fi (Pauli Virtanen) Date: Wed, 23 Sep 2009 21:59:08 +0300 Subject: [Numpy-discussion] Deserialized arrays with base mutate strings In-Reply-To: <4AB9D5C7.3040206@avl.com> References: <4AB9CB20.9090809@avl.com> <28809716.622169.1253691526147.JavaMail.xicrypt@atgrzls001> <4AB9D5C7.3040206@avl.com> Message-ID: <1253732348.3214.8.camel@idol> ke, 2009-09-23 kello 10:01 +0200, Hrvoje Niksic kirjoitti: [clip] > I guess this one could be prevented by verifying that the buffer is > writable when setting the "writable" flag. When deserializing arrays, I > don't see a reason for the "base" property to even exist - sharing of > the buffer between different views is unpreserved anyway, as reported in > my other thread. IIRC, it avoids one copy: ndarray.__reduce__ pickles the raw data as a string, and so ndarray.__setstate__ receives a Python string back. I don't remember if it's in the end possible to emit raw byte stream to a pickle somehow, not going through strings. If not, then a copy can't be avoided. -- Pauli Virtanen From robert.kern at gmail.com Wed Sep 23 15:02:49 2009 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 23 Sep 2009 14:02:49 -0500 Subject: [Numpy-discussion] Deserialized arrays with base mutate strings In-Reply-To: <1253732348.3214.8.camel@idol> References: <4AB9CB20.9090809@avl.com> <28809716.622169.1253691526147.JavaMail.xicrypt@atgrzls001> <4AB9D5C7.3040206@avl.com> <1253732348.3214.8.camel@idol> Message-ID: <3d375d730909231202n9d3e075y49c63a0121fd610d@mail.gmail.com> On Wed, Sep 23, 2009 at 13:59, Pauli Virtanen wrote: > ke, 2009-09-23 kello 10:01 +0200, Hrvoje Niksic kirjoitti: > [clip] >> I guess this one could be prevented by verifying that the buffer is >> writable when setting the "writable" flag. ?When deserializing arrays, I >> don't see a reason for the "base" property to even exist - sharing of >> the buffer between different views is unpreserved anyway, as reported in >> my other thread. > > IIRC, it avoids one copy: ndarray.__reduce__ pickles the raw data as a > string, and so ndarray.__setstate__ receives a Python string back. Correct, that was the goal. > I don't remember if it's in the end possible to emit raw byte stream to > a pickle somehow, not going through strings. If not, then a copy can't > be avoided. No, I don't think you can. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From mdroe at stsci.edu Wed Sep 23 15:18:24 2009 From: mdroe at stsci.edu (Michael Droettboom) Date: Wed, 23 Sep 2009 15:18:24 -0400 Subject: [Numpy-discussion] Coercing object arrays to string (or unicode) arrays Message-ID: <4ABA7480.5040404@stsci.edu> As I'm looking into fixing a number of bugs in chararray, I'm running into some surprising behavior. One of the things chararray needs to do occasionally is build up an object array of string objects, and then convert that back to a fixed-length string array. This length is sometimes predetermined by a recarray data structure. Unfortunately, I'm not getting what I would expect when coercing or assigning an object array to a string array. Is this a bug, or am I just going about this the wrong way? If a bug, I'm happy to look into it as part of my "fixing chararray" task, but I just wanted to confirm that it is a bug before proceeding. 
In [14]: x = np.array(['abcdefgh', 'ijklmnop'], 'O') # Without specifying the length, it seems to default to sizeof(int)... ??? In [15]: np.array(x, 'S') Out[15]: array(['abcd', 'ijkl'], dtype='|S4') In [21]: np.array(x, np.string_) Out[21]: array(['abcd', 'ijkl'], dtype='|S4') # Specifying a length gives strange results In [16]: np.array(x, 'S8') Out[16]: array(['abcdijkl', 'mnop\xe0\x01\x85\x08'], dtype='|S8') # This is what I expected to happen above, but the cast to a list seems like it should be unnecessary In [17]: np.array(list(x)) Out[17]: array(['abcdefgh', 'ijklmnop'], dtype='|S8') # Assignment also seems broken In [18]: y = np.empty(x.shape, dtype='S8') In [19]: y[:] = x[:] In [20]: y Out[20]: array(['abcdijkl', 'mnop\xc05\xf9\xb7'], dtype='|S8') Cheers, Mike From seb.haase at gmail.com Wed Sep 23 15:32:46 2009 From: seb.haase at gmail.com (Sebastian Haase) Date: Wed, 23 Sep 2009 11:32:46 -0800 Subject: [Numpy-discussion] numpy and cython in pure python mode In-Reply-To: <4ABA1AC3.9020801@student.matnat.uio.no> References: <3d375d730909221234y5a0c8ba3m14aeb331badb09d5@mail.gmail.com> <4ABA1AC3.9020801@student.matnat.uio.no> Message-ID: Thanks for all the replies ! On Wed, Sep 23, 2009 at 4:55 AM, Dag Sverre Seljebotn wrote: > Robert Kern wrote: >> On Tue, Sep 22, 2009 at 01:33, Sebastian Haase wrote: >> >>> Hi, >>> I'm not subscribed to the cython list - hoping enough people would >>> care to justify my post here: >>> > The post might be justified, but it is a question of available knowledge > as well. I nearly missed this post here. The Cython user list is on: > > http://groups.google.com/group/cython-users > >>> I know that cython's numpy is still getting better and better over >>> time, but is it already today possible to have numpy support when >>> using Cython in "pure python" mode? >>> I like the idea of being able to develop and debug code "the python >>> way" -- and then just switching on the cython-overdrive mode. >>> (Otherwise I have very good experience using C/C++ with appropriate >>> typemaps, and I don't mind the C syntax) >>> >>> I only recently learned about the "pure python" mode on the sympy list >>> (and at the EuroScipy2009 workshop). >>> My understanding is that Cython's pure Python mode could be "played" >>> in two ways: >>> a) either not having a .pyx-file at all and putting everything into a >>> py-file (using the "import cython" stuff) >>> or b) putting only cython specific declaration in to a pyx file having >>> the same basename as the py-file next to it. >>> > That should be a pxd file with the same basename. And I think that mode > should work. b), that is. That is really good news ! A short example would be very helpful, since the "pure python mode" documentation is very sparse. I found that one wiki page to be very good - but obviously it's not a complete reference. > > Sturla's note on the memory view syntax doesn't apply as that's not in a > released version of Cython yet, and won't be until 0.12.1 or 0.13. But > that could be made to support Python mode a). > Using an svn version would be fine with me, however the mentioned syntax (thanks Sturla) looks quite awkward/unnatural. > Finally there's been some recent discussion on cython-dev about a tool > which can take a pyx file as input and output pure Python. This would probably only be a second-to-best solution because it would effectively double the source code. 
(However, eventually such a tool could run behind the scene and import the pyx-converted py file on the fly -- in which case it would be transparent ...) >> >>> One more: there is no way on reload cython-modules (yet), ?right ? >>> >> >> Correct. There is no way to reload any extension module. >> > This can be worked around (in most situations that arise in practice) by > compiling the module with a new name each time and importing things from > it though. Sage already kind of support it (for the %attach feature > only), and there are patches around for pyximport in Cython that's just > lacking testing and review. Since pyximport lacks a test suite > altogether, nobody seems to ever get around to that. > I read something like this in the archives, thanks for the update. Will those patches be applied to svn - even if untested so far... ? > Dag Sverre Thanks Dag. -- Sebastian Haase From d.l.goldsmith at gmail.com Wed Sep 23 16:16:19 2009 From: d.l.goldsmith at gmail.com (David Goldsmith) Date: Wed, 23 Sep 2009 13:16:19 -0700 Subject: [Numpy-discussion] is ndarray.base the closest base or the ultimate base? In-Reply-To: <3d375d730909231149x44e30b99s6b08403d1b11ecc2@mail.gmail.com> References: <271BED32E925E646A1333A56D9C6AFCB31E561A01D@MBOX0.essex.ac.uk> <271BED32E925E646A1333A56D9C6AFCB31E561A01F@MBOX0.essex.ac.uk> <200909211328.41980.meine@informatik.uni-hamburg.de> <45d1ab480909221341s1ebf05b4u884b8b7cadf75ce6@mail.gmail.com> <271BED32E925E646A1333A56D9C6AFCB3E13D7D3E6@MBOX0.essex.ac.uk> <3d375d730909231005n19d83f9an322cb5d248c4ae1a@mail.gmail.com> <271BED32E925E646A1333A56D9C6AFCB3E13D7D3ED@MBOX0.essex.ac.uk> <3d375d730909231149x44e30b99s6b08403d1b11ecc2@mail.gmail.com> Message-ID: <45d1ab480909231316i44f54bc7qffda21ce22202783@mail.gmail.com> So the end result is: change the docstring, correct? DG On Wed, Sep 23, 2009 at 11:49 AM, Robert Kern wrote: > On Wed, Sep 23, 2009 at 13:30, Citi, Luca wrote: > > >>> http://projects.scipy.org/numpy/ticket/1085 > > But I think in that case it was still an improvement w.r.t. the current > implementation > > which is buggy. At least it shields 95% of users from unexpected results. > > Using memory addresses and extents might be overkilling (and expensive) > in that case. > > numpy.may_share_memory() should be pretty cheap. It's just arithmetic. > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > -- Umberto Eco > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sienkiew at stsci.edu Wed Sep 23 16:16:24 2009 From: sienkiew at stsci.edu (Mark Sienkiewicz) Date: Wed, 23 Sep 2009 16:16:24 -0400 Subject: [Numpy-discussion] Numpy depends on OpenSSL ??? In-Reply-To: <3d375d730909230915v73d3abcds94d6f6401e2f86d9@mail.gmail.com> References: <4ABA443E.3030307@stsci.edu> <3d375d730909230915v73d3abcds94d6f6401e2f86d9@mail.gmail.com> Message-ID: <4ABA8218.7090504@stsci.edu> Robert Kern wrote: > On Wed, Sep 23, 2009 at 10:52, Mark Sienkiewicz wrote: > >> I have discovered the hard way that numpy depends on openssl. >> >> I am building a 64 bit python environment for the macintosh. I >> currently do not have a 64 bit openssl library installed, so the python >> interpreter does not have hashlib. 
(hashlib gets its md5 function from >> the openssl library.) >> > > There are builtin implementations that do not depend on OpenSSL. > hashlib should be using them for MD5 and the standard SHA variants > when OpenSSL is not available. This is the clue that I needed. Here is where it led: setup.py tries to detect the presence of openssl by looking for the library and the include files. It detects the library that Apple provided in /usr/lib/libssl.dylib and tries to build the openssl version of hashlib. But when it actually builds the module, the link fails because that library file is not for the correct architecture. I am building for x86_64, but the library contains only ppc and i386. The result is that hashlib cannot be imported, so the python installer decides not to install it at all. That certainly appears to indicate that the python developers consider hashlib to be optional, but it _should_ work in most any python installation. So, the problem is really about the python install automatically detecting libraries. If I hack the setup.py that builds all the C modules so that it can't find the openssl library, then it uses the fallbacks that are distributed with python. That gets me a as far as "EnvironmentError: math library missing; rerun setup.py after setting the MATHLIB env variable", which is a big improvement. (The math library is not missing, but this is a different problem entirely.) Thanks, and sorry for the false alarm. Mark S. From lciti at essex.ac.uk Wed Sep 23 16:35:54 2009 From: lciti at essex.ac.uk (Citi, Luca) Date: Wed, 23 Sep 2009 21:35:54 +0100 Subject: [Numpy-discussion] is ndarray.base the closest base or the ultimate base? In-Reply-To: <3d375d730909231149x44e30b99s6b08403d1b11ecc2@mail.gmail.com> References: <271BED32E925E646A1333A56D9C6AFCB31E561A01D@MBOX0.essex.ac.uk> <271BED32E925E646A1333A56D9C6AFCB31E561A01F@MBOX0.essex.ac.uk> <200909211328.41980.meine@informatik.uni-hamburg.de> <45d1ab480909221341s1ebf05b4u884b8b7cadf75ce6@mail.gmail.com> <271BED32E925E646A1333A56D9C6AFCB3E13D7D3E6@MBOX0.essex.ac.uk> <3d375d730909231005n19d83f9an322cb5d248c4ae1a@mail.gmail.com> <271BED32E925E646A1333A56D9C6AFCB3E13D7D3ED@MBOX0.essex.ac.uk>, <3d375d730909231149x44e30b99s6b08403d1b11ecc2@mail.gmail.com> Message-ID: <271BED32E925E646A1333A56D9C6AFCB3E13D7D3EE@MBOX0.essex.ac.uk> > numpy.may_share_memory() should be pretty cheap. It's just arithmetic. True, but it is in python. Not something that should go in construct_arrays of ufunc_object.c, I suppose. But the same approach can be translated to C, probably. I can try if we decide http://projects.scipy.org/numpy/ticket/1085 is worth fixing. Let me know. From fperez.net at gmail.com Wed Sep 23 19:33:43 2009 From: fperez.net at gmail.com (Fernando Perez) Date: Wed, 23 Sep 2009 16:33:43 -0700 Subject: [Numpy-discussion] something wrong with docs? In-Reply-To: <45d1ab480909222315t75a6c2f3o47c96c90715755f8@mail.gmail.com> References: <1253557959.3707.4.camel@idol> <45d1ab480909221931o4bc326f0y11e5b440bcbce0a7@mail.gmail.com> <45d1ab480909222315t75a6c2f3o47c96c90715755f8@mail.gmail.com> Message-ID: On Tue, Sep 22, 2009 at 11:15 PM, David Goldsmith wrote: > "It would be nice if we could move gradually > towards docs whose examples (at least those marked as such) were > always run via sphinx." > > That's a "suggestion," but given your point, it seems like you'd advocate it > being more than that, no? 
> I was simply thinking that if this markup were to be used in the docs for all examples where it makes sense, then one could simply use the sphinx target make doctest to also validate the documentation. Even if users don't run these by default, developers and buildbots would, which helps raise the reliability of the docs and reduces chances of code bitrot in the examples from the main docs (that problem is taken care of for the docstrings by np.test(doctest=True) ). Cheers, f From dwf at cs.toronto.edu Wed Sep 23 19:55:08 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Wed, 23 Sep 2009 19:55:08 -0400 Subject: [Numpy-discussion] dtype '|S0' not understood Message-ID: Howdy, It seems it's possible using e.g. In [25]: dtype([('foo', str)])Out[25]: dtype([('foo', '|S0')]) to get yourself a zero-length string. However dtype('|S0') results in a TypeError: data type not understood. I understand the stupidity of creating a 0-length string field but it's conceivable that it's accidental. For example, it could lead to a situation where you've created that field, are missing all the data you had meant to put in it, serialize with np.save, and upon np.load aren't able to get _any_ of your data back because the dtype descriptor is considered bogus (can you guess why I thought of this scenario?). It seems that either dtype(str) should do something more sensible than zero-length string, or it should be possible to create it with dtype('| S0'). Which should it be? David From markus.proeller at ifm.com Thu Sep 24 03:07:01 2009 From: markus.proeller at ifm.com (markus.proeller at ifm.com) Date: Thu, 24 Sep 2009 09:07:01 +0200 Subject: [Numpy-discussion] Numpy savetxt: change decimal separator Message-ID: Hello everyone, I save data to a file with the following statement: np.savetxt(fileName, transpose((average_dist, std_deviation, maximum_dist, sum_of_dist)), delimiter = ';', fmt='%6.10f') is there a possibility to change the decimal seperator from a point to comma ? And another question I import this file to excel, is there also a possiblity to create a headline for each column, that the file looks like the following example: average; standard deviation; maximum distance; sum of distances 0,26565; 0,65565; 2,353535; 25, 5656 ... ... ... Thanks, Markus -------------- next part -------------- An HTML attachment was scrubbed... URL: From jeremy at jeremysanders.net Thu Sep 24 04:50:22 2009 From: jeremy at jeremysanders.net (Jeremy Sanders) Date: Thu, 24 Sep 2009 09:50:22 +0100 Subject: [Numpy-discussion] Numpy extension writing problems Message-ID: Hi - I've never written a Python extension before, so I apologise in advance for my lack of knowledge. I'm trying to interpret a variable length tuple of variable length numpy arrays, convert then to C double arrays and pass them to a C++ function. I'm using SIP (as I also need to deal with Qt). Here is my rather poor code(I realise that variable length C arrays on the stack are a gcc extension). a0 is the PyObject* for the tuple. It core dumps on PyArray_AsCArray. %MethodCode const Py_ssize_t numitems = PyTuple_Size(a0); double* data[numitems]; int sizes[numitems]; PyObject* objects[numitems]; int status = 0; PyArray_Descr *descr = PyArray_DescrFromType(PyArray_DOUBLE); for(Py_ssize_t i = 0; i != numitems; i++) data[i] = NULL; if( ! 
PyTuple_Check(a0) ) goto exit; for(Py_ssize_t i = 0; i != numitems; i++) { objects[i] = PyTuple_GetItem(a0, i); npy_intp size; int ret = PyArray_AsCArray(&objects[i], (void*)(&data[i]), &size, 1, descr); if(ret < 0) { status = 1; goto exit; } sizes[i] = size; } // this is the actual function to call TestKlass::myfunc(data, sizes, numitems); exit: for(Py_ssize_t i = 0; i != numitems; i++) { if( data[i] != NULL ) { PyArray_Free(objects[i], data[i]); } } if(status != 0) { PyErr_SetString(PyExc_RuntimeError, "conversion error"); sipIsErr = 1; } %End Can someone give me a hint on where I'm going wrong? Can't I pass PyTuple_GetItem objects to PyArray_AsCArray. Is there a numpy built in routine which makes this easier? Thanks Jeremy From gokhansever at gmail.com Thu Sep 24 09:41:34 2009 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_Sever?=) Date: Thu, 24 Sep 2009 08:41:34 -0500 Subject: [Numpy-discussion] Numpy savetxt: change decimal separator In-Reply-To: References: Message-ID: <49d6b3500909240641v612b7883q79e855fa38b1032a@mail.gmail.com> On Thu, Sep 24, 2009 at 2:07 AM, wrote: > > Hello everyone, > > I save data to a file with the following statement: > > np.savetxt(fileName, transpose((average_dist, std_deviation, maximum_dist, > sum_of_dist)), delimiter = ';', fmt='%6.10f') > > is there a possibility to change the decimal seperator from a point to > comma ? > And another question I import this file to excel, is there also a > possiblity to create a headline for each column, that the file looks like > the following example: > I don't know how to accomplish the first task, but for the latter the following lines should work: fid = open(fileName, 'w') fid.write("average; standard deviation; maximum distance; sum of distances") np.savetxt(fid, transpose((average_dist, std_deviation, maximum_dist, sum_of_dist)), delimiter = ';', fmt='%6.10f') fid.close() > > average; standard deviation; maximum distance; sum of distances > 0,26565; 0,65565; 2,353535; 25, 5656 > ... > ... > ... > > Thanks, > > Markus > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- G?khan -------------- next part -------------- An HTML attachment was scrubbed... URL: From invernizzi at cilea.it Thu Sep 24 10:58:57 2009 From: invernizzi at cilea.it (Alice Invernizzi) Date: Thu, 24 Sep 2009 16:58:57 +0200 Subject: [Numpy-discussion] Resize Method for Numpy Array Message-ID: <6CE68FDFFBC04001BAD2D7CD4A0C8E22@pcinvernizzi> Dear all, I have an Hamletic doubt concerning the numpy array data type. A general learned rule concerning the array usage in other high-level programming languages is that array data-type are homogeneous datasets of fixed dimension. Therefore, is not clear to me why in numpy the size of an array can be changed (either with the 'returning-value' resize() function either with the 'in-place' array method resize()). More in detail, if the existence of the first function ('returning-value') might make sense in array computing operation, the existence of the 'in-place' method really make no sense for me. Would you please be so kind to give some explanation for the existence of resize operator for numpy array? If array size can be change, what are the real advantages of using numpy array instead of list object? Thanks in avdance Best Regards -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From sole at esrf.fr Thu Sep 24 11:38:14 2009 From: sole at esrf.fr (=?ISO-8859-1?Q?=22V=2E_Armando_Sol=E9=22?=) Date: Thu, 24 Sep 2009 17:38:14 +0200 Subject: [Numpy-discussion] Resize Method for Numpy Array In-Reply-To: <6CE68FDFFBC04001BAD2D7CD4A0C8E22@pcinvernizzi> References: <6CE68FDFFBC04001BAD2D7CD4A0C8E22@pcinvernizzi> Message-ID: <4ABB9266.8030306@esrf.fr> Alice Invernizzi wrote: > > Dear all, > > I have an Hamletic doubt concerning the numpy array data type. > A general learned rule concerning the array usage in other high-level > programming languages is that array data-type are homogeneous datasets > of fixed dimension. > > Therefore, is not clear to me why in numpy the size of an array can be > changed (either with the 'returning-value' resize() function either with > the 'in-place' array method resize()). > More in detail, if the existence of the first function > ('returning-value') might make sense in array computing operation, the > existence of the 'in-place' method really make no sense for me. > > Would you please be so kind to give some explanation for the existence > of resize operator for numpy array? If array size can be change, > what are the real advantages of using numpy array instead of list object? > Thanks in avdance Just to keep into the same line. import numpy a=numpy.arange(100.) a.shape = 10, 10 b = a * 1 # just to get a copy b.shape = 5, 2, 5, 5 b = (b.sum(axis=3)).sum(axis=1) In that way, on b I have a binned image of a. I would expect a.resize(5, 5) would have given something similar (perhaps there is already something to make a binning). In fact a.resize(5,5) is much closer to a crop than to a resize. I think the resize name is misleading and should be called crop, but that is just my view. Armando From sole at esrf.fr Thu Sep 24 11:41:40 2009 From: sole at esrf.fr (=?ISO-8859-1?Q?=22V=2E_Armando_Sol=E9=22?=) Date: Thu, 24 Sep 2009 17:41:40 +0200 Subject: [Numpy-discussion] Resize Method for Numpy Array In-Reply-To: <4ABB9266.8030306@esrf.fr> References: <6CE68FDFFBC04001BAD2D7CD4A0C8E22@pcinvernizzi> <4ABB9266.8030306@esrf.fr> Message-ID: <4ABB9334.8060700@esrf.fr> V. Armando Sol? wrote: Sorry, there was a bug in the sent code. It should be: > import numpy > a=numpy.arange(100.) > a.shape = 10, 10 > b = a * 1 # just to get a copy > b.shape = 5, 2, 5, 2 > b = (b.sum(axis=3)).sum(axis=1) > > In that way, on b I have a binned image of a. From zhu.146 at gmail.com Thu Sep 24 11:51:50 2009 From: zhu.146 at gmail.com (Junda Zhu) Date: Thu, 24 Sep 2009 11:51:50 -0400 Subject: [Numpy-discussion] Numpy savetxt: change decimal separator In-Reply-To: References: Message-ID: On Sep 24, 2009, at 3:07 AM, markus.proeller at ifm.com wrote: > Hello everyone, > > I save data to a file with the following statement: > > np.savetxt(fileName, transpose((average_dist, std_deviation, > maximum_dist, sum_of_dist)), delimiter = ';', fmt='%6.10f') > > is there a possibility to change the decimal seperator from a point > to comma ? > And another question I import this file to excel, is there also a > possiblity to create a headline for each column, that the file looks > like the following example: > > average; standard deviation; maximum distance; sum of distances > 0,26565; 0,65565; 2,353535; 25, 5656 For the first task, I don't know if there is any direct way in numpy to change the decimal sep, but a little bit awkward trick as follows should work: mem_file = StringIO.StringIO() np.savetxt(mem_file, ... 
) new_data_str = mem_file.getvalue().replace('.', ',') output_file = open(fileName, 'w') output_file.write(new_data_str) output_file.close() Or you can use regex to get better match for the decimal seperator. Thanks, Junda -------------- next part -------------- An HTML attachment was scrubbed... URL: From Chris.Barker at noaa.gov Thu Sep 24 13:02:39 2009 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Thu, 24 Sep 2009 10:02:39 -0700 Subject: [Numpy-discussion] Coercing object arrays to string (or unicode) arrays In-Reply-To: <4ABA7480.5040404@stsci.edu> References: <4ABA7480.5040404@stsci.edu> Message-ID: <4ABBA62F.50608@noaa.gov> Michael Droettboom wrote: > As I'm looking into fixing a number of bugs in chararray, I'm running > into some surprising behavior. > In [14]: x = np.array(['abcdefgh', 'ijklmnop'], 'O') > > # Without specifying the length, it seems to default to sizeof(int)... ??? > In [15]: np.array(x, 'S') > Out[15]: > array(['abcd', 'ijkl'], > dtype='|S4') This sure looks like a bug, and I'm no expert, but I suspect that it's the size of a pointer (you are on a 32 system -- I am), which makes a bit of sense, as Object arrays store a pointer to the python objects. The question is, what should the array constructor do? perhaps the equivalent of: In [41]: np.array(x.tolist()) Out[41]: array(['abcdefgh', 'ijklmnop'], dtype='|S8') which you could use as a work around. Do you need to go through object arrays? could you go straight to a string array: np.array(['abcdefgh', 'ijklmnop'], np.string_) Out[35]: array(['abcdefgh', 'ijklmnop'], dtype='|S8') or just keep the strings in a list. Object arrays are weird, I think there are a lot of corner cases. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From mdroe at stsci.edu Thu Sep 24 13:19:09 2009 From: mdroe at stsci.edu (Michael Droettboom) Date: Thu, 24 Sep 2009 13:19:09 -0400 Subject: [Numpy-discussion] Coercing object arrays to string (or unicode) arrays In-Reply-To: <4ABBA62F.50608@noaa.gov> References: <4ABA7480.5040404@stsci.edu> <4ABBA62F.50608@noaa.gov> Message-ID: <4ABBAA0D.9000305@stsci.edu> On 09/24/2009 01:02 PM, Christopher Barker wrote: > Michael Droettboom wrote: > >> As I'm looking into fixing a number of bugs in chararray, I'm running >> into some surprising behavior. >> In [14]: x = np.array(['abcdefgh', 'ijklmnop'], 'O') >> >> # Without specifying the length, it seems to default to sizeof(int)... ??? >> In [15]: np.array(x, 'S') >> Out[15]: >> array(['abcd', 'ijkl'], >> dtype='|S4') >> > This sure looks like a bug, and I'm no expert, but I suspect that it's > the size of a pointer (you are on a 32 system -- I am), which makes a > bit of sense, as Object arrays store a pointer to the python objects. > That was my guess, too, but I haven't yet delved into the code. I'm on 32-bit as well. > The question is, what should the array constructor do? perhaps the > equivalent of: > > In [41]: np.array(x.tolist()) > Out[41]: > array(['abcdefgh', 'ijklmnop'], > dtype='|S8') > > which you could use as a work around. > Yes, that's the behaviour I was expecting. > Do you need to go through object arrays? could you go straight to a > string array: > > np.array(['abcdefgh', 'ijklmnop'], np.string_) > Out[35]: > array(['abcdefgh', 'ijklmnop'], > dtype='|S8') > > or just keep the strings in a list. 
> The background here is that I'm fixing/resurrecting chararray, which provides vectorized versions of the standard Python string operations, endswith, ljust etc. I was using object arrays when the length of the output string can't be determined ahead of time. For example, the string __mod__ operator. I could probably get away with generating a list of strings instead, but it's a little bit inconsistent with how I'm doing things elsewhere, which is always to generate an array. > Object arrays are weird, I think there are a lot of corner cases. > Yeah, that's been my experience. But it would be nice to try to plug those corner cases up if possible. I'll spend some time investigating this particular one. Cheers, Mike From Chris.Barker at noaa.gov Thu Sep 24 13:24:45 2009 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Thu, 24 Sep 2009 10:24:45 -0700 Subject: [Numpy-discussion] Resize Method for Numpy Array In-Reply-To: <6CE68FDFFBC04001BAD2D7CD4A0C8E22@pcinvernizzi> References: <6CE68FDFFBC04001BAD2D7CD4A0C8E22@pcinvernizzi> Message-ID: <4ABBAB5D.5020305@noaa.gov> Alice Invernizzi wrote: > Therefore, is not clear to me why in numpy the size of an array can be > changed (either with the 'returning-value' resize() function either with > the 'in-place' array method resize()). > Would you please be so kind to give some explanation for the existence > of resize operator for numpy array? I don't find I use it that much, but it can be useful for the same reason that it is for lists, etc. > If array size can be change, > what are the real advantages of using numpy array instead of list object? This I can really answer! the "it's a known size" property of an nd-array is probably the least useful one I can think of. I think the pre-known size aspect of array objects is really an artifact of wanting efficient storage and processing, rather than a feature -- particularly in a dynamic language. As for other advantages: arrays are homogeneous data -- giving far more efficiency in a dynamic language arrays are n-d -- you can slice and dice them as such -- it's easy to extract either a row or a column from a 2-d array, for instance. slices of arrays are arrays with a view on the same data: this makes it easy to manipulate portions of an array with the same code you'd manipulate a full array. array broadcasting easy and efficient interchange of raw data with other data types, from C, C++ , Fortran, etc code. and I could go on! You might actually ask the question the other way -- if you can re-size an array -- why do you ever need to use a list? -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From charlesr.harris at gmail.com Thu Sep 24 13:51:00 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 24 Sep 2009 11:51:00 -0600 Subject: [Numpy-discussion] chebyshev polynomials Message-ID: Hi All, Would it be appropriate to add a class similar to poly but instead using chebyshev polynomials? That is, where we currently have 'poly', 'poly1d', 'polyadd', 'polyder', 'polydiv', 'polyfit', 'polyint', 'polymul', 'polysub', 'polyval', change poly to cheb. 
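For concreteness, here is a rough sketch of what the evaluation routine could
look like, using Clenshaw's recurrence -- only an illustration of the idea,
not a proposed implementation or final API (the name chebval is just a
placeholder):

import numpy as np

def chebval(x, c):
    # Evaluate sum(c[k] * T_k(x)) with Clenshaw's recurrence, which
    # avoids forming explicit powers of x and stays stable for high orders.
    x = np.asarray(x, dtype=float)
    b1 = np.zeros_like(x)        # plays the role of b_{k+1}
    b2 = np.zeros_like(x)        # plays the role of b_{k+2}
    for ck in c[:0:-1]:          # c[n], c[n-1], ..., c[1]
        b1, b2 = ck + 2.0*x*b1 - b2, b1
    return c[0] + x*b1 - b2

# T_0 - 0.5*T_2 = 1.5 - x**2
print chebval(np.array([0.0, 0.5, 1.0]), [1.0, 0.0, -0.5])   # -> 1.5, 1.25, 0.5
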
The rational here is two-fold: first, chebyshev series generally yield better conditioned fits than polynomials and second, I am going to add a more general remez algorithm to scipy for real functions defined on the unit circle in the complex plane and one of the consequences is that it is possible to generate minmax polynomial fits over an interval, but the natural form of the result is a chebyshev series. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From mdroe at stsci.edu Thu Sep 24 14:23:57 2009 From: mdroe at stsci.edu (Michael Droettboom) Date: Thu, 24 Sep 2009 14:23:57 -0400 Subject: [Numpy-discussion] Coercing object arrays to string (or unicode) arrays In-Reply-To: <4ABBAA0D.9000305@stsci.edu> References: <4ABA7480.5040404@stsci.edu> <4ABBA62F.50608@noaa.gov> <4ABBAA0D.9000305@stsci.edu> Message-ID: <4ABBB93D.4000306@stsci.edu> I have filed a bug against this, along with a patch that fixes casting to fixed-size string arrays: http://projects.scipy.org/numpy/ticket/1235 Undefined-sized string arrays is a harder problem, which I'm deferring for later. Mike On 09/24/2009 01:19 PM, Michael Droettboom wrote: > On 09/24/2009 01:02 PM, Christopher Barker wrote: > >> Michael Droettboom wrote: >> >> >>> As I'm looking into fixing a number of bugs in chararray, I'm running >>> into some surprising behavior. >>> In [14]: x = np.array(['abcdefgh', 'ijklmnop'], 'O') >>> >>> # Without specifying the length, it seems to default to sizeof(int)... ??? >>> In [15]: np.array(x, 'S') >>> Out[15]: >>> array(['abcd', 'ijkl'], >>> dtype='|S4') >>> >>> >> This sure looks like a bug, and I'm no expert, but I suspect that it's >> the size of a pointer (you are on a 32 system -- I am), which makes a >> bit of sense, as Object arrays store a pointer to the python objects. >> >> > That was my guess, too, but I haven't yet delved into the code. I'm on > 32-bit as well. > >> The question is, what should the array constructor do? perhaps the >> equivalent of: >> >> In [41]: np.array(x.tolist()) >> Out[41]: >> array(['abcdefgh', 'ijklmnop'], >> dtype='|S8') >> >> which you could use as a work around. >> >> > Yes, that's the behaviour I was expecting. > >> Do you need to go through object arrays? could you go straight to a >> string array: >> >> np.array(['abcdefgh', 'ijklmnop'], np.string_) >> Out[35]: >> array(['abcdefgh', 'ijklmnop'], >> dtype='|S8') >> >> or just keep the strings in a list. >> >> > The background here is that I'm fixing/resurrecting chararray, which > provides vectorized versions of the standard Python string operations, > endswith, ljust etc. > > I was using object arrays when the length of the output string can't be > determined ahead of time. For example, the string __mod__ operator. I > could probably get away with generating a list of strings instead, but > it's a little bit inconsistent with how I'm doing things elsewhere, > which is always to generate an array. > >> Object arrays are weird, I think there are a lot of corner cases. >> >> > Yeah, that's been my experience. But it would be nice to try to plug > those corner cases up if possible. I'll spend some time investigating > this particular one. 
> > Cheers, > Mike > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From pav at iki.fi Thu Sep 24 15:18:06 2009 From: pav at iki.fi (Pauli Virtanen) Date: Thu, 24 Sep 2009 22:18:06 +0300 Subject: [Numpy-discussion] chebyshev polynomials Message-ID: <1253819886.12295.40.camel@idol> to, 2009-09-24 kello 11:51 -0600, Charles R Harris kirjoitti: > Would it be appropriate to add a class similar to poly but instead > using chebyshev polynomials? That is, where we currently have [clip] Yes, I think. scipy.special.orthogonal would be the best place for this, I think. Numpy would probably be a wrong place for stuff like this. Ideally, all the orthogonal polynomial classes in Scipy should be rewritten to use more a stable representation of the polynomials. Currently, they break down at high orders, which is a bit ugly. I started working on something related in the spring. The branch is here: http://github.com/pv/scipy-work/tree/ticket/921-orthogonal but as you can see, it hasn't got far (eg. orthopoly1d.__call__ is effectively a placeholder). Anyway, the idea was to divide the orthopoly1d class to subclasses, each having more stable polynomial-specific evaluation routines. Stability-preserving arithmetic would be supported at least within the polynomial class. As a side note, should the cheby* versions of `polyval`, `polymul` etc. just be dropped to reduce namespace clutter? You can do the same things already within just class methods and arithmetic. Cheers, Pauli From robert.kern at gmail.com Thu Sep 24 15:27:03 2009 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 24 Sep 2009 14:27:03 -0500 Subject: [Numpy-discussion] Resize Method for Numpy Array In-Reply-To: <6CE68FDFFBC04001BAD2D7CD4A0C8E22@pcinvernizzi> References: <6CE68FDFFBC04001BAD2D7CD4A0C8E22@pcinvernizzi> Message-ID: <3d375d730909241227x71cd02beg10b9e3176e5991e@mail.gmail.com> On Thu, Sep 24, 2009 at 09:58, Alice Invernizzi wrote: > > Dear all, > > I?have an Hamletic doubt concerning the numpy array data type. > A general learned rule concerning the array usage in other high-level > programming languages is that array data-type are homogeneous datasets > of? fixed dimension. While this description is basically true of numpy arrays, I would caution you that every language has a different lexicon, and the same word can mean very different things in each. For example, Python lists are *not* linked lists; they are like C++'s std::vectors with a preallocation strategy to make appending cheap on average. > Therefore, is not clear to?me why in numpy the size of an array can be > changed??(either with the 'returning-value' resize() function either with > the 'in-place' array method resize()). > More in detail, if the existence of the first function > ('returning-value') might make sense in array computing operation, the > existence of the 'in-place' method really make no sense for me. > > Would you please be so kind to give some explanation for the existence of > resize operator for numpy array? If array size can be change, what?are the > real advantages of using numpy array instead of list object? The .resize() method is available for very special purposes and should not be used in general. It will only allow itself to be used if there were no views created from it; otherwise, it will raise an exception. The reason is that it must necessarily reallocate memory for the new size and copy the data over. 
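For example, something like this will be refused (a minimal illustration --
run it as a plain script, since an interactive shell keeps extra references
of its own):

import numpy as np

a = np.arange(10)
b = a[2:5]        # b is a view sharing a's buffer
a.resize(20)      # raises ValueError: another object still references the data
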
Any views would be pointing to deallocated data. .resize() can be useful when you are constructing arrays from a stream of data of unknown length and you haven't let any other code see the array, yet. It is not really a defining feature of numpy arrays like the other features that Chris Barker has listed for you. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From robert.kern at gmail.com Thu Sep 24 15:31:01 2009 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 24 Sep 2009 14:31:01 -0500 Subject: [Numpy-discussion] chebyshev polynomials In-Reply-To: <1253819886.12295.40.camel@idol> References: <1253819886.12295.40.camel@idol> Message-ID: <3d375d730909241231g7a2c4a8ej26454fe1725cf412@mail.gmail.com> On Thu, Sep 24, 2009 at 14:18, Pauli Virtanen wrote: > As a side note, should the cheby* versions of `polyval`, `polymul` etc. > just be dropped to reduce namespace clutter? You can do the same things > already within just class methods and arithmetic. Just to clarify, you mean having classmethods that work on plain arrays of Chebyshev coefficients? I'm +1 on that. I'm -1 on only having a ChebyPoly class with instance methods, although it would be useful to have as an adjunct to the plain routines. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From pav at iki.fi Thu Sep 24 15:51:34 2009 From: pav at iki.fi (Pauli Virtanen) Date: Thu, 24 Sep 2009 22:51:34 +0300 Subject: [Numpy-discussion] chebyshev polynomials In-Reply-To: <3d375d730909241231g7a2c4a8ej26454fe1725cf412@mail.gmail.com> References: <1253819886.12295.40.camel@idol> <3d375d730909241231g7a2c4a8ej26454fe1725cf412@mail.gmail.com> Message-ID: <1253821893.12295.49.camel@idol> to, 2009-09-24 kello 14:31 -0500, Robert Kern kirjoitti: > On Thu, Sep 24, 2009 at 14:18, Pauli Virtanen wrote: > > As a side note, should the cheby* versions of `polyval`, `polymul` etc. > > just be dropped to reduce namespace clutter? You can do the same things > > already within just class methods and arithmetic. > > Just to clarify, you mean having classmethods that work on plain > arrays of Chebyshev coefficients? I'm +1 on that. I'm -1 on only > having a ChebyPoly class with instance methods, although it would be > useful to have as an adjunct to the plain routines. I meant only having a ChebyPoly class with instance methods. Personally, I've always used poly1d instances instead of the poly* routines apart from polyfit. But perhaps this is not how everyone uses them. Using class methods is an interesting idea, though. -- Pauli Virtanen From charlesr.harris at gmail.com Thu Sep 24 15:53:32 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 24 Sep 2009 13:53:32 -0600 Subject: [Numpy-discussion] chebyshev polynomials In-Reply-To: <1253819886.12295.40.camel@idol> References: <1253819886.12295.40.camel@idol> Message-ID: On Thu, Sep 24, 2009 at 1:18 PM, Pauli Virtanen wrote: > to, 2009-09-24 kello 11:51 -0600, Charles R Harris kirjoitti: > > Would it be appropriate to add a class similar to poly but instead > > using chebyshev polynomials? That is, where we currently have > [clip] > > Yes, I think. scipy.special.orthogonal would be the best place for this, > I think. Numpy would probably be a wrong place for stuff like this. 
> > Ideally, all the orthogonal polynomial classes in Scipy should be > rewritten to use more a stable representation of the polynomials. > Currently, they break down at high orders, which is a bit ugly. > > I started working on something related in the spring. The branch is > here: > > http://github.com/pv/scipy-work/tree/ticket/921-orthogonal > > but as you can see, it hasn't got far (eg. orthopoly1d.__call__ is > effectively a placeholder). Anyway, the idea was to divide the > orthopoly1d class to subclasses, each having more stable > polynomial-specific evaluation routines. Stability-preserving arithmetic > would be supported at least within the polynomial class. > > I was thinking of storing the chebyshev internally as the values at the chebyschev points. This makes multiplication, differentiation and such quite easy (resample and multiply/divide appropriatately). Its equivalent to working in the fourier domain for convolution and differentiation. The transform back and forth is likewise othogonal, so stable. The intepolation also becomes simple using the barycentric version. > As a side note, should the cheby* versions of `polyval`, `polymul` etc. > just be dropped to reduce namespace clutter? You can do the same things > already within just class methods and arithmetic. > > What do you mean? The evaluation can use various stable methods appropriate to the chebyshev series. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Thu Sep 24 15:58:36 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 24 Sep 2009 13:58:36 -0600 Subject: [Numpy-discussion] chebyshev polynomials In-Reply-To: <3d375d730909241231g7a2c4a8ej26454fe1725cf412@mail.gmail.com> References: <1253819886.12295.40.camel@idol> <3d375d730909241231g7a2c4a8ej26454fe1725cf412@mail.gmail.com> Message-ID: On Thu, Sep 24, 2009 at 1:31 PM, Robert Kern wrote: > On Thu, Sep 24, 2009 at 14:18, Pauli Virtanen wrote: > > > As a side note, should the cheby* versions of `polyval`, `polymul` etc. > > just be dropped to reduce namespace clutter? You can do the same things > > already within just class methods and arithmetic. > > Just to clarify, you mean having classmethods that work on plain > arrays of Chebyshev coefficients? I'm +1 on that. I'm -1 on only > having a ChebyPoly class with instance methods, although it would be > useful to have as an adjunct to the plain routines. > > I have a set of functions that does the first (works on multidimensional arrays of coefficients, actually), but I am open to ideas of what such a chebyschev class with these methods should look like. An interval of definition should probably be part of the ctor. Thoughts? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Thu Sep 24 16:34:50 2009 From: pav at iki.fi (Pauli Virtanen) Date: Thu, 24 Sep 2009 23:34:50 +0300 Subject: [Numpy-discussion] chebyshev polynomials In-Reply-To: References: <1253819886.12295.40.camel@idol> Message-ID: <1253824490.12295.80.camel@idol> to, 2009-09-24 kello 13:53 -0600, Charles R Harris kirjoitti: [clip] > I was thinking of storing the chebyshev internally as the values at > the chebyschev points. This makes multiplication, differentiation and > such quite easy (resample and multiply/divide appropriatately). Its > equivalent to working in the fourier domain for convolution and > differentiation. The transform back and forth is likewise othogonal, > so stable. 
The intepolation also becomes simple using the barycentric > version. Sounds like you know this stuff well :) The internal representation of each orthogonal polynomial type can probably be whatever works best for each case. It should be no problem to sugar ChebyPoly up after the main work has been done. > As a side note, should the cheby* versions of `polyval`, > `polymul` etc. just be dropped to reduce namespace clutter? > You can do the same things already within just class methods > and arithmetic. > > What do you mean? The evaluation can use various stable methods > appropriate to the chebyshev series. This comment was just on the API -- the implementation of course should be appropriate. > I have a set of functions that does the first (works on > multidimensional arrays of coefficients, actually), but I am open to > ideas of what such a chebyschev class with these methods should look > like. An interval of definition should probably be part of the ctor. > Thoughts? Having the following features could be useful: - __call__, .roots, .order: as in poly1d - .data -> whatever is the internal representation - .coef -> Chebyshev coefficients? - .limits -> The interval - arithmetic: chebyshev chebyshev -> chebyshev - arithmetic: scalar chebyshev -> chebyshev - arithmetic: poly1d chebyshev -> chebyshev/poly1d (??) I'm not sure what __getitem__ and __array__ ought to return. Alternatives seem either to return the poly1d coefficients, or not to define these methods -- otherwise there will be confusion, especially if we want to use these in scipy.special.orthogonal. Pauli From timmichelsen at gmx-topmail.de Thu Sep 24 16:41:15 2009 From: timmichelsen at gmx-topmail.de (Tim Michelsen) Date: Thu, 24 Sep 2009 22:41:15 +0200 Subject: [Numpy-discussion] Numpy savetxt: change decimal separator In-Reply-To: References: Message-ID: >> And another question I import this file to excel, is there also a >> possiblity to create a headline for each column, that the file looks >> like the following example: >> >> average; standard deviation; maximum distance; sum of distances >> 0,26565; 0,65565; 2,353535; 25, 5656 I was fiddeling with the same problem here: http://thread.gmane.org/gmane.comp.python.numeric.general/23418 So far, one can only open the file and prepend the header line. I just files an enhancement request for this: proposal: add a header and footer function to numpy.savetxt http://projects.scipy.org/numpy/ticket/1236 Regards, Timmie From charlesr.harris at gmail.com Thu Sep 24 16:57:31 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 24 Sep 2009 14:57:31 -0600 Subject: [Numpy-discussion] chebyshev polynomials In-Reply-To: <1253824490.12295.80.camel@idol> References: <1253819886.12295.40.camel@idol> <1253824490.12295.80.camel@idol> Message-ID: On Thu, Sep 24, 2009 at 2:34 PM, Pauli Virtanen wrote: > to, 2009-09-24 kello 13:53 -0600, Charles R Harris kirjoitti: > > [clip] > > I was thinking of storing the chebyshev internally as the values at > > the chebyschev points. This makes multiplication, differentiation and > > such quite easy (resample and multiply/divide appropriatately). Its > > equivalent to working in the fourier domain for convolution and > > differentiation. The transform back and forth is likewise othogonal, > > so stable. The intepolation also becomes simple using the barycentric > > version. > > Sounds like you know this stuff well :) > > The internal representation of each orthogonal polynomial type can > probably be whatever works best for each case. 
It should be no problem > to sugar ChebyPoly up after the main work has been done. > > > As a side note, should the cheby* versions of `polyval`, > > `polymul` etc. just be dropped to reduce namespace clutter? > > You can do the same things already within just class methods > > and arithmetic. > > > > What do you mean? The evaluation can use various stable methods > > appropriate to the chebyshev series. > > This comment was just on the API -- the implementation of course should > be appropriate. > > > I have a set of functions that does the first (works on > > multidimensional arrays of coefficients, actually), but I am open to > > ideas of what such a chebyschev class with these methods should look > > like. An interval of definition should probably be part of the ctor. > > Thoughts? > > Having the following features could be useful: > > - __call__, .roots, .order: as in poly1d > - .data -> whatever is the internal representation > - .coef -> Chebyshev coefficients? > - .limits -> The interval > - arithmetic: chebyshev chebyshev -> chebyshev > - arithmetic: scalar chebyshev -> chebyshev > - arithmetic: poly1d chebyshev -> chebyshev/poly1d (??) > > Multiplying by poly1d should be easy, just interpolate at the chebyshev points and multiply. Going the other way is a bit trickier. I'm wondering if having support for complex would be justified? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From lciti at essex.ac.uk Thu Sep 24 17:43:41 2009 From: lciti at essex.ac.uk (Citi, Luca) Date: Thu, 24 Sep 2009 22:43:41 +0100 Subject: [Numpy-discussion] np.any and np.all short-circuiting Message-ID: <271BED32E925E646A1333A56D9C6AFCB3E13D7D3F0@MBOX0.essex.ac.uk> Hello I noticed that python's "any" can be faster than numpy's "any" (and the similarly for "all"). Then I wondered why. I realized that numpy implements "any" as logical_or.reduce (and "all" as logical_and.reduce). This means that numpy cannot take advantage of short-circuiting. Looking at the timings confirmed my suspects. I think python fetches one element at the time from the array and as soon as any of them is true it returns true. Instead, numpy goes on until the end of the array even if the very first element is already true. Looking at the code I think I found a way to fix it. I don't see a reason why it should not work. It seems to work. But you never know. I wanted to run the test suite. I am unable to run the test on the svn version, neither from .../build/lib... nor form a different folder using sys.path.insert(0, '.../build/lib...'). In the first case I get "NameError: name 'numeric' is not defined" while in the second case zero tests are successfully performed :-) What is the correct way of running the tests (without installing the development version in the system)? Is there some expert of the inner numpy core able to tell whether the approach is correct and won't break something? I opened a ticket for this: http://projects.scipy.org/numpy/ticket/1237 Best, Luca In the following table any(x) is python's version, np.any(x) is numpy's, while *np.any(x)* is mine. 
'1.4.0.dev7417'

x = np.zeros(100000, dtype=bool)
x[i] = True
%timeit any(x)
%timeit np.any(x)

x = np.ones(100000, dtype=bool)
x[i] = False
%timeit all(x)
%timeit np.all(x)

ANY
    i     any(x)    np.any(x)   *np.any(x)*
   //    6.84 ms      831 µs       189 µs
50000    3.41 ms      832 µs        98 µs
10000     683 µs      831 µs      24.7 µs
 1000    68.9 µs      859 µs      8.41 µs
  100    7.92 µs      888 µs       6.9 µs
   10    1.42 µs      832 µs      6.68 µs
    0     712 ns      831 µs      6.65 µs

ALL
    i     all(x)    np.all(x)   *np.all(x)*
   //    6.65 ms      676 µs       300 µs
50000    3.32 ms      677 µs       154 µs
10000     666 µs      676 µs      36.4 µs
 1000    67.9 µs      686 µs      9.86 µs
  100    7.53 µs      677 µs      7.26 µs
   10    1.39 µs      676 µs      7.06 µs
    0     716 ns      678 µs      6.96 µs

From charlesr.harris at gmail.com Thu Sep 24 18:07:18 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 24 Sep 2009 16:07:18 -0600 Subject: [Numpy-discussion] np.any and np.all short-circuiting In-Reply-To: <271BED32E925E646A1333A56D9C6AFCB3E13D7D3F0@MBOX0.essex.ac.uk> References: <271BED32E925E646A1333A56D9C6AFCB3E13D7D3F0@MBOX0.essex.ac.uk> Message-ID: On Thu, Sep 24, 2009 at 3:43 PM, Citi, Luca wrote: > Hello > I noticed that python's "any" can be faster than numpy's "any" (and the > similarly for "all"). > Then I wondered why. > I realized that numpy implements "any" as logical_or.reduce (and "all" as > logical_and.reduce). > This means that numpy cannot take advantage of short-circuiting. > Looking at the timings confirmed my suspects. > I think python fetches one element at the time from the array and as soon > as any of them is true it returns true. > Instead, numpy goes on until the end of the array even if the very first > element is already true. > Looking at the code I think I found a way to fix it. > I don't see a reason why it should not work. It seems to work. But you > never know. > I wanted to run the test suite. > I am unable to run the test on the svn version, neither from > .../build/lib... nor form a different folder using sys.path.insert(0, > '.../build/lib...'). > In the first case I get "NameError: name 'numeric' is not defined" > while in the second case zero tests are successfully performed :-) > What is the correct way of running the tests (without installing the > development version in the system)? > Is there some expert of the inner numpy core able to tell whether the > approach is correct and won't break something? > I opened a ticket for this: > http://projects.scipy.org/numpy/ticket/1237 > Best, > Luca > > Did you delete your build directory first? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Thu Sep 24 18:11:11 2009 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 24 Sep 2009 17:11:11 -0500 Subject: [Numpy-discussion] np.any and np.all short-circuiting In-Reply-To: <271BED32E925E646A1333A56D9C6AFCB3E13D7D3F0@MBOX0.essex.ac.uk> References: <271BED32E925E646A1333A56D9C6AFCB3E13D7D3F0@MBOX0.essex.ac.uk> Message-ID: <3d375d730909241511y4774d11i7808627fbd023135@mail.gmail.com> On Thu, Sep 24, 2009 at 16:43, Citi, Luca wrote: > What is the correct way of running the tests (without installing the development version in the system)? Build inplace: $ python setup.py build_src --inplace build_ext --inplace -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
-- Umberto Eco From lciti at essex.ac.uk Thu Sep 24 18:29:16 2009 From: lciti at essex.ac.uk (Citi, Luca) Date: Thu, 24 Sep 2009 23:29:16 +0100 Subject: [Numpy-discussion] np.any and np.all short-circuiting In-Reply-To: References: <271BED32E925E646A1333A56D9C6AFCB3E13D7D3F0@MBOX0.essex.ac.uk>, Message-ID: <271BED32E925E646A1333A56D9C6AFCB3E13D7D3F1@MBOX0.essex.ac.uk> Thank you for your instantaneuos reply! This is what I usually do: from the numpy folder I run (emptying the build folder if I just fetched svn updates) $ python setup build.py $ cd build/lib-... $ ipython In [1]: import numpy as np In [2]: np.__version__ Out[2]: '1.4.0.dev7417' Everything works except for the np.test() which gives "NameError: name 'numeric' is not defined" Otherwise I move into a diffeent folder, say /tmp and run ipython In [1]: import sys In [2]: sys.path.insert(0, '~/numpy/build/lib.linux-i686-2.6/') In [3]: import numpy as np In [4]: np.__version__ Out[4]: '1.4.0.dev7417' In [5]: np.test() Running unit tests for numpy NumPy version 1.4.0.dev7417 NumPy is installed in /_space_/Temp/numpy/build/lib.linux-i686-2.6/numpy Python version 2.6.2 (release26-maint, Apr 19 2009, 01:56:41) [GCC 4.3.3] nose version 0.10.4 ---------------------------------------------------------------------- Ran 0 tests in 0.002s OK Out[5]: What should I do instead? Thanks, Luca From lciti at essex.ac.uk Thu Sep 24 18:30:16 2009 From: lciti at essex.ac.uk (Citi, Luca) Date: Thu, 24 Sep 2009 23:30:16 +0100 Subject: [Numpy-discussion] np.any and np.all short-circuiting In-Reply-To: <3d375d730909241511y4774d11i7808627fbd023135@mail.gmail.com> References: <271BED32E925E646A1333A56D9C6AFCB3E13D7D3F0@MBOX0.essex.ac.uk>, <3d375d730909241511y4774d11i7808627fbd023135@mail.gmail.com> Message-ID: <271BED32E925E646A1333A56D9C6AFCB3E13D7D3F2@MBOX0.essex.ac.uk> Thank you both for your help! $ python setup.py build_src --inplace build_ext --inplace I'll give it a try. From sturla at molden.no Thu Sep 24 18:32:09 2009 From: sturla at molden.no (Sturla Molden) Date: Fri, 25 Sep 2009 00:32:09 +0200 Subject: [Numpy-discussion] Resize Method for Numpy Array In-Reply-To: <3d375d730909241227x71cd02beg10b9e3176e5991e@mail.gmail.com> References: <6CE68FDFFBC04001BAD2D7CD4A0C8E22@pcinvernizzi> <3d375d730909241227x71cd02beg10b9e3176e5991e@mail.gmail.com> Message-ID: <4ABBF369.70202@molden.no> Robert Kern skrev: > While this description is basically true of numpy arrays, I would > caution you that every language has a different lexicon, and the same > word can mean very different things in each. For example, Python lists > are *not* linked lists; they are like C++'s std::vectors with a > preallocation strategy to make appending cheap on average. > In Java and .NET jargon, Python lists are array lists, not linked lists. It is sad there is no "cons" or "llist" built-in type, something like: mycons = cons(car, cdr) mylist = llist(iterable) Of course we can write [car, cdr] or (car, cdr) for making linked lists in pure Python (without having to define class types), but both have issues.The first is storage inefficient, the latter is not mutable. Yes I know Guido left out linked lists for a purpose, so there is probably no use complaining on the Python ideas of Python dev lists... S.M. 
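(A throwaway sketch of the kind of thing described above -- a cons cell that
is mutable yet lighter than a list thanks to __slots__; purely illustrative,
and the cons/llist names are just the ones used in the message:)

class Cons(object):
    __slots__ = ('car', 'cdr')     # no per-instance dict, so fairly compact
    def __init__(self, car, cdr=None):
        self.car = car
        self.cdr = cdr

def llist(iterable):
    # build a singly linked list and return its head (None if empty)
    head = None
    for item in reversed(list(iterable)):
        head = Cons(item, head)
    return head

lst = llist([1, 2, 3])
assert lst.car == 1 and lst.cdr.car == 2 and lst.cdr.cdr.cdr is None
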
From robert.kern at gmail.com Thu Sep 24 18:36:54 2009 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 24 Sep 2009 17:36:54 -0500 Subject: [Numpy-discussion] Resize Method for Numpy Array In-Reply-To: <4ABBF369.70202@molden.no> References: <6CE68FDFFBC04001BAD2D7CD4A0C8E22@pcinvernizzi> <3d375d730909241227x71cd02beg10b9e3176e5991e@mail.gmail.com> <4ABBF369.70202@molden.no> Message-ID: <3d375d730909241536q6d6b7db6y5dbd5eed9ea1c2f1@mail.gmail.com> On Thu, Sep 24, 2009 at 17:32, Sturla Molden wrote: > Robert Kern skrev: >> While this description is basically true of numpy arrays, I would >> caution you that every language has a different lexicon, and the same >> word can mean very different things in each. For example, Python lists >> are *not* linked lists; they are like C++'s std::vectors with a >> preallocation strategy to make appending cheap on average. >> > In Java and .NET jargon, Python lists are array lists, not linked lists. > > It is sad there is no "cons" or "llist" built-in type, something like: > > ? mycons = cons(car, cdr) > ? mylist = llist(iterable) > > > Of course we can write [car, cdr] or (car, cdr) for making linked lists > in pure Python (without having to define class types), but both have > issues.The first is storage inefficient, the latter is not mutable. > > Yes I know Guido left out linked lists for a purpose, so there is > probably no use complaining on the Python ideas of Python dev lists... collections.deque() is a linked list of 64-item chunks. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From sturla at molden.no Thu Sep 24 19:05:14 2009 From: sturla at molden.no (Sturla Molden) Date: Fri, 25 Sep 2009 01:05:14 +0200 Subject: [Numpy-discussion] Resize Method for Numpy Array In-Reply-To: <3d375d730909241536q6d6b7db6y5dbd5eed9ea1c2f1@mail.gmail.com> References: <6CE68FDFFBC04001BAD2D7CD4A0C8E22@pcinvernizzi> <3d375d730909241227x71cd02beg10b9e3176e5991e@mail.gmail.com> <4ABBF369.70202@molden.no> <3d375d730909241536q6d6b7db6y5dbd5eed9ea1c2f1@mail.gmail.com> Message-ID: <4ABBFB2A.30304@molden.no> Robert Kern skrev: > collections.deque() is a linked list of 64-item chunks. > Thanks for that useful information. :-) But it would not help much for a binary tree... Since we are on the NumPy list... One could image making linked lists using NumPy arrays with dtype=object. They are storage efficient like tuples, and mutable like lists. def cons(a,b): return np.array((a,b),dtype=object) But I guess the best way is to implement a real linked extension type in Cython. S.M. From robert.kern at gmail.com Thu Sep 24 19:09:50 2009 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 24 Sep 2009 18:09:50 -0500 Subject: [Numpy-discussion] Resize Method for Numpy Array In-Reply-To: <4ABBFB2A.30304@molden.no> References: <6CE68FDFFBC04001BAD2D7CD4A0C8E22@pcinvernizzi> <3d375d730909241227x71cd02beg10b9e3176e5991e@mail.gmail.com> <4ABBF369.70202@molden.no> <3d375d730909241536q6d6b7db6y5dbd5eed9ea1c2f1@mail.gmail.com> <4ABBFB2A.30304@molden.no> Message-ID: <3d375d730909241609o5993d694s359d09f7d25791d3@mail.gmail.com> On Thu, Sep 24, 2009 at 18:05, Sturla Molden wrote: > Robert Kern skrev: >> collections.deque() is a linked list of 64-item chunks. >> > Thanks for that useful information. :-) But it would not help much for a > binary tree... > > Since we are on the NumPy list... 
One could image making linked lists > using NumPy ?arrays with dtype=object. They are storage efficient like > tuples, and mutable like lists. > > def cons(a,b): > ? ?return np.array((a,b),dtype=object) > > But I guess the best way is to implement a real linked extension type in > Cython. Yup! -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From dwf at cs.toronto.edu Thu Sep 24 20:03:27 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Thu, 24 Sep 2009 20:03:27 -0400 Subject: [Numpy-discussion] dtype '|S0' not understood In-Reply-To: References: Message-ID: <113F0316-56E1-4403-B967-EC06847B8251@cs.toronto.edu> On 23-Sep-09, at 7:55 PM, David Warde-Farley wrote: > It seems that either dtype(str) should do something more sensible than > zero-length string, or it should be possible to create it with > dtype('| > S0'). Which should it be? Since there wasn't any response I went ahead and fixed it by making str and unicode dtypes allow a size of 0 when constructed with protocol type codes. Either S0 and U0 should be constructable with typecodes or they shouldn't be allowed at all; I opted for the latter since a) it was simple and b) I don't know what a sensible default for dtype(str) would be (length 1? length 10?). Patch is at: http://projects.scipy.org/numpy/ticket/1239 Review away! David From lciti at essex.ac.uk Thu Sep 24 20:50:29 2009 From: lciti at essex.ac.uk (Citi, Luca) Date: Fri, 25 Sep 2009 01:50:29 +0100 Subject: [Numpy-discussion] np.any and np.all short-circuiting In-Reply-To: <271BED32E925E646A1333A56D9C6AFCB3E13D7D3F2@MBOX0.essex.ac.uk> References: <271BED32E925E646A1333A56D9C6AFCB3E13D7D3F0@MBOX0.essex.ac.uk>, <3d375d730909241511y4774d11i7808627fbd023135@mail.gmail.com>, <271BED32E925E646A1333A56D9C6AFCB3E13D7D3F2@MBOX0.essex.ac.uk> Message-ID: <271BED32E925E646A1333A56D9C6AFCB3E13D7D3F3@MBOX0.essex.ac.uk> I am sorry. I followed your suggestion. I re-checked out the svn folder and then compiled with $ python setup.py build_src --inplace build_ext --inplace but I get the same behaviour. If I am inside I get the NameError, if I am outside and use path.insert, it successfully performs zero tests. I have tried with numpy-1.3 with sys.path.insert and it works. I re-implemented the same patch for 1.3 and it passes all 2030 tests. http://projects.scipy.org/numpy/ticket/1237 I think the speed improvement is impressive. Thanks. I still wonder why I am unable to make the tests work with the svn version. Best, Luca From jlewi at intellisis.com Thu Sep 24 20:42:32 2009 From: jlewi at intellisis.com (Jeremy Lewi) Date: Thu, 24 Sep 2009 17:42:32 -0700 Subject: [Numpy-discussion] Assigning an array to the field of a structure doesn't work Message-ID: <000401ca3d79$13c56080$3b502180$@com> Hi I'm trying to understand the following code: import numpy as np dt=np.dtype([("c",np.int32,(2))]) data=np.ndarray([2],dtype=dt) x=np.array([0,10]) #the following line doesn't set data[0]["c"] = x #only data[0]["c"][0] changes data[0]["c"]=x #the following does set data[1]["c"][:]=x data[1]["c"][:]=x What I don't understand is why does, the assignment of data[0]["c"]=x not set c=x but the second one does? Thanks Jeremy -------------- next part -------------- An HTML attachment was scrubbed... 
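For Jeremy's structured-array question above, one way to get the intended
assignment is to index the field first: data["c"] is then an ordinary
(2, 2) int32 view of all the records, and assigning into one of its rows
updates both values of that record. This is only an illustrative sketch of
that workaround, using np.zeros instead of np.ndarray so the example starts
from initialized memory.

    import numpy as np

    dt = np.dtype([("c", np.int32, (2,))])
    data = np.zeros(2, dtype=dt)
    x = np.array([0, 10])

    # Field first, then row: assigns both values of record 0.
    data["c"][0] = x
    print(data["c"][0])      # [ 0 10]

    # The form from the end of the original message also works, because the
    # trailing [:] assigns through the sub-array view.
    data[1]["c"][:] = x
    print(data[1]["c"])      # [ 0 10]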
URL: From cournape at gmail.com Thu Sep 24 21:41:23 2009 From: cournape at gmail.com (David Cournapeau) Date: Fri, 25 Sep 2009 10:41:23 +0900 Subject: [Numpy-discussion] np.any and np.all short-circuiting In-Reply-To: <271BED32E925E646A1333A56D9C6AFCB3E13D7D3F3@MBOX0.essex.ac.uk> References: <271BED32E925E646A1333A56D9C6AFCB3E13D7D3F0@MBOX0.essex.ac.uk> <3d375d730909241511y4774d11i7808627fbd023135@mail.gmail.com> <271BED32E925E646A1333A56D9C6AFCB3E13D7D3F2@MBOX0.essex.ac.uk> <271BED32E925E646A1333A56D9C6AFCB3E13D7D3F3@MBOX0.essex.ac.uk> Message-ID: <5b8d13220909241841o647d5858ocadc274ad9982f20@mail.gmail.com> On Fri, Sep 25, 2009 at 9:50 AM, Citi, Luca wrote: > I am sorry. > I followed your suggestion. > I re-checked out the svn folder and then compiled with > $ python setup.py build_src --inplace build_ext --inplace > but I get the same behaviour. > If I am inside I get the NameError, if I am outside and use path.insert, it successfully performs zero tests. There is a problem with numpy tests when you import numpy with the install being in the current directory. It happens when you build in place and launch python in the source directory, but it also happens if you happen to be in site-packages, and import numpy installed there. When the issue was last discussed, Robert suggested it was a nose issue. cheers, David From charlesr.harris at gmail.com Thu Sep 24 22:00:17 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 24 Sep 2009 20:00:17 -0600 Subject: [Numpy-discussion] chebyshev polynomials In-Reply-To: <3d375d730909241231g7a2c4a8ej26454fe1725cf412@mail.gmail.com> References: <1253819886.12295.40.camel@idol> <3d375d730909241231g7a2c4a8ej26454fe1725cf412@mail.gmail.com> Message-ID: On Thu, Sep 24, 2009 at 1:31 PM, Robert Kern wrote: > On Thu, Sep 24, 2009 at 14:18, Pauli Virtanen wrote: > > > As a side note, should the cheby* versions of `polyval`, `polymul` etc. > > just be dropped to reduce namespace clutter? You can do the same things > > already within just class methods and arithmetic. > > Just to clarify, you mean having classmethods that work on plain > arrays of Chebyshev coefficients? I'm +1 on that. I'm -1 on only > having a ChebyPoly class with instance methods, although it would be > useful to have as an adjunct to the plain routines. > > Let me see if I understand this correctly. You like the idea of a class with class methods, avoiding namespace polution, but you aren't so hot on having a chebyshev class like poly1d that contains the series info and overloads some of the operators? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Thu Sep 24 22:14:09 2009 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 24 Sep 2009 21:14:09 -0500 Subject: [Numpy-discussion] chebyshev polynomials In-Reply-To: References: <1253819886.12295.40.camel@idol> <3d375d730909241231g7a2c4a8ej26454fe1725cf412@mail.gmail.com> Message-ID: <3d375d730909241914g79f60b3duc7f42af174860063@mail.gmail.com> On Thu, Sep 24, 2009 at 21:00, Charles R Harris wrote: > > > On Thu, Sep 24, 2009 at 1:31 PM, Robert Kern wrote: >> >> On Thu, Sep 24, 2009 at 14:18, Pauli Virtanen wrote: >> >> > As a side note, should the cheby* versions of `polyval`, `polymul` etc. >> > just be dropped to reduce namespace clutter? You can do the same things >> > already within just class methods and arithmetic. >> >> Just to clarify, you mean having classmethods that work on plain >> arrays of Chebyshev coefficients? I'm +1 on that. 
I'm -1 on only >> having a ChebyPoly class with instance methods, although it would be >> useful to have as an adjunct to the plain routines. >> > > Let me see if I understand this correctly. You like the idea of a class with > class methods, avoiding namespace polution, but you aren't so hot on having > a chebyshev class like poly1d that contains the series info and overloads > some of the operators? I'm not so hot on *only* having a chebyshev class like poly1d. As I said, it would be useful to have one, but I still want routines that work on plain arrays. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From charlesr.harris at gmail.com Thu Sep 24 22:39:26 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 24 Sep 2009 20:39:26 -0600 Subject: [Numpy-discussion] chebyshev polynomials In-Reply-To: <3d375d730909241914g79f60b3duc7f42af174860063@mail.gmail.com> References: <1253819886.12295.40.camel@idol> <3d375d730909241231g7a2c4a8ej26454fe1725cf412@mail.gmail.com> <3d375d730909241914g79f60b3duc7f42af174860063@mail.gmail.com> Message-ID: On Thu, Sep 24, 2009 at 8:14 PM, Robert Kern wrote: > On Thu, Sep 24, 2009 at 21:00, Charles R Harris > wrote: > > > > > > On Thu, Sep 24, 2009 at 1:31 PM, Robert Kern > wrote: > >> > >> On Thu, Sep 24, 2009 at 14:18, Pauli Virtanen wrote: > >> > >> > As a side note, should the cheby* versions of `polyval`, `polymul` > etc. > >> > just be dropped to reduce namespace clutter? You can do the same > things > >> > already within just class methods and arithmetic. > >> > >> Just to clarify, you mean having classmethods that work on plain > >> arrays of Chebyshev coefficients? I'm +1 on that. I'm -1 on only > >> having a ChebyPoly class with instance methods, although it would be > >> useful to have as an adjunct to the plain routines. > >> > > > > Let me see if I understand this correctly. You like the idea of a class > with > > class methods, avoiding namespace polution, but you aren't so hot on > having > > a chebyshev class like poly1d that contains the series info and overloads > > some of the operators? > > I'm not so hot on *only* having a chebyshev class like poly1d. As I > said, it would be useful to have one, but I still want routines that > work on plain arrays. > > So basically 'chebadd', 'chebder', 'chebdiv', 'chebfit', 'chebint', 'chebmul', 'chebsub', 'chebval', All just taking 1d arrays with an assumed interval of [-1,1] except for chebfit, which needs an interval, and maybe cheb{der,int,val} too also taking intervals. Hmm, before I just had these things as a keyword variable that defaulted to [-1,1]. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From markus.proeller at ifm.com Fri Sep 25 01:31:19 2009 From: markus.proeller at ifm.com (markus.proeller at ifm.com) Date: Fri, 25 Sep 2009 07:31:19 +0200 Subject: [Numpy-discussion] Antwort: Re: Numpy savetxt: change decimal separator In-Reply-To: Message-ID: >> I was fiddeling with the same problem here: >> http://thread.gmane.org/gmane.comp.python.numeric.general/23418 >> >> So far, one can only open the file and prepend the header line. 
>>
>> I just files an enhancement request for this:
>> proposal: add a header and footer function to numpy.savetxt
>> http://projects.scipy.org/numpy/ticket/1236
>>
>> Regards,
>> Timmie

Hi Timmie,

thanks for that, this would be a very good first step. There is still the
problem that the local representation of the decimal point is not covered...
Of course the problem can be handled by a further file parser and a
remove('.',',') method, but it loses a bit of the "straight forward" way.

Markus
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From david at ar.media.kyoto-u.ac.jp Fri Sep 25 03:06:58 2009
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Fri, 25 Sep 2009 16:06:58 +0900
Subject: [Numpy-discussion] datetime functionality: questions and requests
Message-ID: <4ABC6C12.4010003@ar.media.kyoto-u.ac.jp>

Hi there, Hi Travis,

I have started looking at the new datetime code, with the idea that
we should soon fix (for real) a release date for numpy 1.4.0. I have a
few requests and questions:
- since npy_datetime is conditionally defined to be 64 bits, why not
typedefing it to npy_int64 ?
- there are a few issues w.r.t int/long/npy_datetime used
interchangeably. That's bound to cause trouble on 64 bits architectures.
- For 1.4.0, there needs to be some minimal documentation with some
examples + a few unit tests. In particular, I would like some unit tests
which check for 64 bits issues.

I was thinking about pushing for 1.4.0 in November, which means that
those changes should hopefully be included by mid-October. Is this
possible ?

cheers,

David

From charlesr.harris at gmail.com Fri Sep 25 03:43:04 2009
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Fri, 25 Sep 2009 01:43:04 -0600
Subject: [Numpy-discussion] datetime functionality: questions and requests
In-Reply-To: <4ABC6C12.4010003@ar.media.kyoto-u.ac.jp>
References: <4ABC6C12.4010003@ar.media.kyoto-u.ac.jp>
Message-ID: 

On Fri, Sep 25, 2009 at 1:06 AM, David Cournapeau <
david at ar.media.kyoto-u.ac.jp> wrote:

> Hi there, Hi Travis,
>
> I have started looking at the new datetime code, with the idea that
> we should soon fix (for real) a release date for numpy 1.4.0. I have a
> few requests and questions:
> - since npy_datetime is conditionally defined to be 64 bits, why not
> typedefing it to npy_int64 ?
> - there are a few issues w.r.t int/long/npy_datetime used
> interchangeably. That's bound to cause trouble on 64 bits architectures.
> - For 1.4.0, there needs to be some minimal documentation with some
> examples + a few unit tests. In particular, I would like some unit tests
> which check for 64 bits issues.
>
> I was thinking about pushing for 1.4.0 in November, which means that
> those changes should hopefully be included by mid-October. Is this
> possible ?
>
>
Re: November, I was thinking the same. Sometime around the middle of the
month perhaps.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
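Coming back to Markus's decimal-separator question earlier in this thread:
one crude workaround, sketched below with an arbitrary helper name, is to do
the numeric formatting in Python and swap the decimal point for a comma
before the text reaches the file, instead of post-processing a finished
savetxt file. A blanket '.' to ',' replacement would of course also touch
any header or comment text containing periods.

    import numpy as np

    def savetxt_decimal_comma(fname, arr, fmt="%.6f", delimiter=" "):
        # Format each row ourselves, then replace the decimal point with a
        # comma; this avoids any locale handling inside np.savetxt.
        with open(fname, "w") as f:
            for row in np.atleast_2d(arr):
                line = delimiter.join(fmt % val for val in row)
                f.write(line.replace(".", ",") + "\n")

    savetxt_decimal_comma("demo.txt", np.array([[1.25, 2.5], [3.75, 4.0]]))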
URL: From lciti at essex.ac.uk Fri Sep 25 05:50:19 2009 From: lciti at essex.ac.uk (Citi, Luca) Date: Fri, 25 Sep 2009 10:50:19 +0100 Subject: [Numpy-discussion] np.any and np.all short-circuiting In-Reply-To: <5b8d13220909241841o647d5858ocadc274ad9982f20@mail.gmail.com> References: <271BED32E925E646A1333A56D9C6AFCB3E13D7D3F0@MBOX0.essex.ac.uk> <3d375d730909241511y4774d11i7808627fbd023135@mail.gmail.com> <271BED32E925E646A1333A56D9C6AFCB3E13D7D3F2@MBOX0.essex.ac.uk> <271BED32E925E646A1333A56D9C6AFCB3E13D7D3F3@MBOX0.essex.ac.uk>, <5b8d13220909241841o647d5858ocadc274ad9982f20@mail.gmail.com> Message-ID: <271BED32E925E646A1333A56D9C6AFCB3E13D7D3F6@MBOX0.essex.ac.uk> Thanks for your reply. So, what is the correct way to test a numpy development version without installing it in /usr/lib/... or /usr/local/lib/.. ? What do you guys do? From cournape at gmail.com Fri Sep 25 06:12:46 2009 From: cournape at gmail.com (David Cournapeau) Date: Fri, 25 Sep 2009 19:12:46 +0900 Subject: [Numpy-discussion] np.any and np.all short-circuiting In-Reply-To: <271BED32E925E646A1333A56D9C6AFCB3E13D7D3F6@MBOX0.essex.ac.uk> References: <271BED32E925E646A1333A56D9C6AFCB3E13D7D3F0@MBOX0.essex.ac.uk> <3d375d730909241511y4774d11i7808627fbd023135@mail.gmail.com> <271BED32E925E646A1333A56D9C6AFCB3E13D7D3F2@MBOX0.essex.ac.uk> <271BED32E925E646A1333A56D9C6AFCB3E13D7D3F3@MBOX0.essex.ac.uk> <5b8d13220909241841o647d5858ocadc274ad9982f20@mail.gmail.com> <271BED32E925E646A1333A56D9C6AFCB3E13D7D3F6@MBOX0.essex.ac.uk> Message-ID: <5b8d13220909250312i51ac8e5bnadc15bcd8bcee42a@mail.gmail.com> On Fri, Sep 25, 2009 at 6:50 PM, Citi, Luca wrote: > Thanks for your reply. > So, what is the correct way to test a numpy development version without installing it in /usr/lib/... or /usr/local/lib/.. ? > What do you guys do? Build in place, but test from outside the source tree. I for example have a makefile which does the build + test dance, but anything could do. cheers, David From gnurser at googlemail.com Fri Sep 25 06:55:57 2009 From: gnurser at googlemail.com (George Nurser) Date: Fri, 25 Sep 2009 11:55:57 +0100 Subject: [Numpy-discussion] MFDatasets and NetCDF4 In-Reply-To: <1d1e6ea70909250239u34027854l2dde58e46b0934d4@mail.gmail.com> References: <1d1e6ea70909250239u34027854l2dde58e46b0934d4@mail.gmail.com> Message-ID: <1d1e6ea70909250355h2a036d21w799e8d87a92715d4@mail.gmail.com> Hi, I hope this is the right place to ask this. I've found the MFDataset works well in reading NetCDF3 files, but it appears that it doesn't work at present for NetCDF4 files. Is this an inherent problem with the NetCDF4 file structure, or would it be possible to implement the MFDataset for NetCDF4 files sometime? It would be very useful. --George Nurser. From mdroe at stsci.edu Fri Sep 25 09:07:45 2009 From: mdroe at stsci.edu (Michael Droettboom) Date: Fri, 25 Sep 2009 09:07:45 -0400 Subject: [Numpy-discussion] [SciPy-dev] Deprecate chararray [was Plea for help] In-Reply-To: <45d1ab480909221951h6006deb9u4675e419c9f0b256@mail.gmail.com> References: <857977.74958.qm@web52106.mail.re2.yahoo.com> <812BBECE-D1E8-4699-980A-BB8FB9657CB9@stsci.edu> <4AB9103F.7030207@stsci.edu> <45d1ab480909221951h6006deb9u4675e419c9f0b256@mail.gmail.com> Message-ID: <4ABCC0A1.30402@stsci.edu> David Goldsmith wrote: > On Tue, Sep 22, 2009 at 4:02 PM, Ralf Gommers > > wrote: > > > On Tue, Sep 22, 2009 at 1:58 PM, Michael Droettboom > > wrote: > > Trac has these bugs. Any others? 
> > http://projects.scipy.org/numpy/ticket/1199 > http://projects.scipy.org/numpy/ticket/1200 > http://projects.scipy.org/numpy/ticket/856 > http://projects.scipy.org/numpy/ticket/855 > http://projects.scipy.org/numpy/ticket/1231 > > > This one: > http://article.gmane.org/gmane.comp.python.numeric.general/23638/match=chararray > > Cheers, > Ralf > > > That last one never got "promoted" to a ticket? It's a symptom of this bug, that I created and produced a patch for yesterday: http://projects.scipy.org/numpy/ticket/1235 Mike -- Michael Droettboom Science Software Branch Operations and Engineering Division Space Telescope Science Institute Operated by AURA for NASA From david.huard at gmail.com Fri Sep 25 09:14:31 2009 From: david.huard at gmail.com (David Huard) Date: Fri, 25 Sep 2009 09:14:31 -0400 Subject: [Numpy-discussion] MFDatasets and NetCDF4 In-Reply-To: <1d1e6ea70909250355h2a036d21w799e8d87a92715d4@mail.gmail.com> References: <1d1e6ea70909250239u34027854l2dde58e46b0934d4@mail.gmail.com> <1d1e6ea70909250355h2a036d21w799e8d87a92715d4@mail.gmail.com> Message-ID: <91cf711d0909250614o158e121ej40399a5b92278884@mail.gmail.com> Hi George, On Fri, Sep 25, 2009 at 6:55 AM, George Nurser wrote: > Hi, > I hope this is the right place to ask this. > I've found the MFDataset works well in reading NetCDF3 files, but it > appears that it doesn't work at present for NetCDF4 files. > > It works on my side for netCDF4 files. What error are you getting ? > Is this an inherent problem with the NetCDF4 file structure, or would > it be possible to implement the MFDataset for NetCDF4 files sometime? > It would be very useful. > > >From the docstring: Datasets must be in C{NETCDF4_CLASSIC, NETCDF3_CLASSIC or NETCDF3_64BIT} format (C{NETCDF4} Datasets won't work). I suspect your files are not in CLASSIC mode. NETCDF4 datasets are allowed to have a more complex hierarchy than the CLASSIC mode, and I think this is what makes concatenation difficult to implement. That is, there would be no simple rule to determine which fields should be concatenated. David > --George Nurser. > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gnurser at googlemail.com Fri Sep 25 09:59:53 2009 From: gnurser at googlemail.com (George Nurser) Date: Fri, 25 Sep 2009 14:59:53 +0100 Subject: [Numpy-discussion] MFDatasets and NetCDF4 In-Reply-To: <91cf711d0909250614o158e121ej40399a5b92278884@mail.gmail.com> References: <1d1e6ea70909250239u34027854l2dde58e46b0934d4@mail.gmail.com> <1d1e6ea70909250355h2a036d21w799e8d87a92715d4@mail.gmail.com> <91cf711d0909250614o158e121ej40399a5b92278884@mail.gmail.com> Message-ID: <1d1e6ea70909250659q706c394fg470d5278fb97549d@mail.gmail.com> 2009/9/25 David Huard : > Hi George, > > On Fri, Sep 25, 2009 at 6:55 AM, George Nurser > wrote: >> >> Hi, >> I hope this is the right place to ask this. >> I've found the MFDataset works well in reading NetCDF3 files, but it >> appears that it doesn't work at present for NetCDF4 files. >> > > It works on my side for netCDF4 files. What error are you getting ? > >> >> Is this an inherent problem with the NetCDF4 file structure, or would >> it be possible to implement the MFDataset for NetCDF4 files sometime? >> It would be very useful. >> > > From the docstring: > > ?Datasets must be in C{NETCDF4_CLASSIC, NETCDF3_CLASSIC or NETCDF3_64BIT} > ??? 
format (C{NETCDF4} Datasets won't work). > > I suspect your files are not in CLASSIC mode. I haven't tried it yet:) I was just trusting the documentation (which I've found to be very good). Doing ncdump -k on my files gives 'netCDF-4' so I guess the files are indeed not in CLASSIC mode. > NETCDF4 datasets are allowed > to have a more complex hierarchy than the CLASSIC mode, and I think this is > what makes concatenation difficult to implement. That is, there would be no > simple rule to determine which fields should be concatenated. > > David Ok. I suspected it was something like that. If the netCDF-4 CLASSIC mode allows data compression and chunking, which I think it is supposed to do, perhaps I can persuade the guys running the model to produce files in netCDF-4 CLASSIC mode. Many thanks. George. From sccolbert at gmail.com Fri Sep 25 10:01:55 2009 From: sccolbert at gmail.com (Chris Colbert) Date: Fri, 25 Sep 2009 10:01:55 -0400 Subject: [Numpy-discussion] numpy install script changes permissions? Message-ID: <7f014ea60909250701xd2b5cdcra976183cf92e06e4@mail.gmail.com> Sorry to bring up an old topic again, but I still haven't managed to resolve this issue concerning numpy and nose tests... On a fresh numpy 1.3.0 build from source tarball on sourceforge: On ubuntu 9.04 x64 I issue the following commands: cd numpy-1.3.0 python setup.py sudo python setup.py install In the source build folder, all numpy test scripts have the correct permissions and are not marked as executable, but in the install directory (/usr/local/lib/python2.6/dist-packages/numpy/), the test test scripts have completely different permissions, and are all marked as executable. Thus, nose wont run the tests. This happens with scipy too. Is this a bug in the numpy/scipy build script, or just my ineptness at installing things through the command line? Cheers, Chris From jsseabold at gmail.com Fri Sep 25 10:10:34 2009 From: jsseabold at gmail.com (Skipper Seabold) Date: Fri, 25 Sep 2009 10:10:34 -0400 Subject: [Numpy-discussion] numpy install script changes permissions? In-Reply-To: <7f014ea60909250701xd2b5cdcra976183cf92e06e4@mail.gmail.com> References: <7f014ea60909250701xd2b5cdcra976183cf92e06e4@mail.gmail.com> Message-ID: On Fri, Sep 25, 2009 at 10:01 AM, Chris Colbert wrote: > Sorry to bring up an old topic again, but I still haven't managed to > resolve ?this issue concerning numpy and nose tests... > > On a fresh numpy 1.3.0 build from source tarball on sourceforge: > > On ubuntu 9.04 x64 I issue the following commands: > > cd numpy-1.3.0 > python setup.py > sudo python setup.py install > > > In the source build folder, all numpy test scripts have the correct > permissions and are not marked as executable, > > but in the install directory > (/usr/local/lib/python2.6/dist-packages/numpy/), the test test scripts > have completely different permissions, and are all marked as > executable. Thus, nose wont run the tests. > > This happens with scipy too. Is this a bug in the numpy/scipy build > script, or just my ineptness at installing things through the command > line? > > Cheers, > > Chris Hmm, I don't have this problem from svn. You can pass extra_argv=['--exe'] to the test command to get them to run I believe, but I don't know if that's the solution you're looking for. Skipper From sccolbert at gmail.com Fri Sep 25 10:14:52 2009 From: sccolbert at gmail.com (Chris Colbert) Date: Fri, 25 Sep 2009 10:14:52 -0400 Subject: [Numpy-discussion] numpy install script changes permissions? 
In-Reply-To: References: <7f014ea60909250701xd2b5cdcra976183cf92e06e4@mail.gmail.com> Message-ID: <7f014ea60909250714t2442809cod840f94704174b42@mail.gmail.com> i have no clue why i'm having this problem. This is a fresh install of ubuntu as well as numpy and scipy... During the write to std out during build and install I do notice things like "changing permission for f2py from 640 to 755" or something like that. So i'm wondering if other permissions are being changed behind the scenes. On Fri, Sep 25, 2009 at 10:10 AM, Skipper Seabold wrote: > On Fri, Sep 25, 2009 at 10:01 AM, Chris Colbert wrote: >> Sorry to bring up an old topic again, but I still haven't managed to >> resolve ?this issue concerning numpy and nose tests... >> >> On a fresh numpy 1.3.0 build from source tarball on sourceforge: >> >> On ubuntu 9.04 x64 I issue the following commands: >> >> cd numpy-1.3.0 >> python setup.py >> sudo python setup.py install >> >> >> In the source build folder, all numpy test scripts have the correct >> permissions and are not marked as executable, >> >> but in the install directory >> (/usr/local/lib/python2.6/dist-packages/numpy/), the test test scripts >> have completely different permissions, and are all marked as >> executable. Thus, nose wont run the tests. >> >> This happens with scipy too. Is this a bug in the numpy/scipy build >> script, or just my ineptness at installing things through the command >> line? >> >> Cheers, >> >> Chris > > Hmm, I don't have this problem from svn. ?You can pass > extra_argv=['--exe'] to the test command to get them to run I believe, > but I don't know if that's the solution you're looking for. > > Skipper > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From sienkiew at stsci.edu Fri Sep 25 10:37:13 2009 From: sienkiew at stsci.edu (Mark Sienkiewicz) Date: Fri, 25 Sep 2009 10:37:13 -0400 Subject: [Numpy-discussion] numpy install script changes permissions? In-Reply-To: <7f014ea60909250701xd2b5cdcra976183cf92e06e4@mail.gmail.com> References: <7f014ea60909250701xd2b5cdcra976183cf92e06e4@mail.gmail.com> Message-ID: <4ABCD599.7050705@stsci.edu> > In the source build folder, all numpy test scripts have the correct > permissions and are not marked as executable, > > but in the install directory > (/usr/local/lib/python2.6/dist-packages/numpy/), the test test scripts > have completely different permissions, and are all marked as > executable. Thus, nose wont run the tests. > It works ok for me with python 2.5.1 on a mac and python 2.6.1 on linux. That doesn't help you, but it may be a clue at some point. Is it only the test scripts that are executable, or is it everything that gets installed? Does it affect packages other than numpy / scipy? If so, we can suspect distutils and/or ubuntu, rather than numpy. Mark S. From arthbous at indiana.edu Fri Sep 25 11:19:58 2009 From: arthbous at indiana.edu (Arthur Bousquet) Date: Fri, 25 Sep 2009 11:19:58 -0400 Subject: [Numpy-discussion] F2PY error : ... on platform 'posix' with 'gcc' compiler Message-ID: <6574ea980909250819q365a1951n3e6f21d8bb045a35@mail.gmail.com> Hello, I have a Mac with snow leopard 10.6. I compiled Python2.6.2, numpy1.3 and matplotlib from the sources. 
I tried to compiled a library (named libsw), that I have made in fortran, with f2py but I got the following error : running build_ext error: don't know how to compile C/C++ code on platform 'posix' with 'gcc' compiler make[1]: *** [libsw] Error 1 make: *** [libsw] Error 2 Noteworthy, my library is correct because I can compile it with f2py on my Ubuntu desktop. Does anyone knows where this error comes from ? Thank you. Best regards, Arthur -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Fri Sep 25 11:24:19 2009 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 25 Sep 2009 10:24:19 -0500 Subject: [Numpy-discussion] F2PY error : ... on platform 'posix' with 'gcc' compiler In-Reply-To: <6574ea980909250819q365a1951n3e6f21d8bb045a35@mail.gmail.com> References: <6574ea980909250819q365a1951n3e6f21d8bb045a35@mail.gmail.com> Message-ID: <3d375d730909250824u6c7e6b8t5376ce2faef9feef@mail.gmail.com> On Fri, Sep 25, 2009 at 10:19, Arthur Bousquet wrote: > Hello, > > I have a Mac with snow leopard 10.6. I compiled Python2.6.2, numpy1.3 and > matplotlib from the sources. > I tried to compiled a library (named libsw), that I have made in fortran, > with f2py but I got the following error : > > running build_ext > error: don't know how to compile C/C++ code on platform 'posix' with 'gcc' > compiler > make[1]: *** [libsw] Error 1 > make: *** [libsw] Error 2 > > > Noteworthy, my library is correct because I can compile it with f2py on my > Ubuntu desktop. > Does anyone knows where this error comes from ? Did you use --compiler=gcc? That is not one of the options. Please show the exact command line you used and any configuration files. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From mdroe at stsci.edu Fri Sep 25 11:33:17 2009 From: mdroe at stsci.edu (Michael Droettboom) Date: Fri, 25 Sep 2009 11:33:17 -0400 Subject: [Numpy-discussion] Getting number of *characters* in dtype='U' array Message-ID: <4ABCE2BD.4020601@stsci.edu> Is there a way to get the number of characters in a fixed-size 'U' array? I can, of course, parse dtype.str, or divide dtype.itemsize by the size of a unicode character, but neither seems terribly elegant or future proof. Does numpy provide (to Python) a method for getting this that I'm just missing? In [7]: x = np.array(u'1234') In [8]: x.dtype Out[8]: dtype(' References: <6574ea980909250819q365a1951n3e6f21d8bb045a35@mail.gmail.com> <3d375d730909250824u6c7e6b8t5376ce2faef9feef@mail.gmail.com> Message-ID: <6574ea980909250848g98df329he3fc398fc23d3aaa@mail.gmail.com> I used compiler=gcc, here is the command : f2py2.6 --fcompiler=gfortran --compiler=gcc -DF2PY_REPORT_ON_ARRAY_COPY=1 -m libsw -c file1.f90 file2.f90 Ok so you said that "--compiler=gcc" is not a command ? 
So I removed the line "--compiler=gcc", and I got a new error : error: Command "gcc -arch ppc -arch i386 -isysroot /Developer/SDKs/MacOSX10.4u.sdk -fno-strict-aliasing -fPIC -fno-common -dynamic -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -DF2PY_REPORT_ON_ARRAY_COPY=1 -I/var/folders/N2/N2Z4lrkgHb0iyoQobv1DcE+++TI/-Tmp-/tmpJRch8F/src.macosx-10.3-fat-2.6 -I/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/numpy/core/include -I/Library/Frameworks/Python.framework/Versions/2.6/include/python2.6 -c /var/folders/N2/N2Z4lrkgHb0iyoQobv1DcE+++TI/-Tmp-/tmpJRch8F/src.macosx-10.3-fat-2.6/fortranobject.c -o /var/folders/N2/N2Z4lrkgHb0iyoQobv1DcE+++TI/-Tmp-/tmpJRch8F/var/folders/N2/N2Z4lrkgHb0iyoQobv1DcE+++TI/-Tmp-/tmpJRch8F/src.macosx-10.3-fat-2.6/fortranobject.o" failed with exit status 1 make[1]: *** [libsw] Error 1 make: *** [libsw] Error 2 Why some macs-10.3 appeare but if I do the "export MACOSX_DEPLOYMENT_TARGET=10.6" they all become "10.6" but I still have the same error. Thanks for helping me. Arthur On Fri, Sep 25, 2009 at 11:24 AM, Robert Kern wrote: > On Fri, Sep 25, 2009 at 10:19, Arthur Bousquet > wrote: > > Hello, > > > > I have a Mac with snow leopard 10.6. I compiled Python2.6.2, numpy1.3 and > > matplotlib from the sources. > > I tried to compiled a library (named libsw), that I have made in fortran, > > with f2py but I got the following error : > > > > running build_ext > > error: don't know how to compile C/C++ code on platform 'posix' with > 'gcc' > > compiler > > make[1]: *** [libsw] Error 1 > > make: *** [libsw] Error 2 > > > > > > Noteworthy, my library is correct because I can compile it with f2py on > my > > Ubuntu desktop. > > Does anyone knows where this error comes from ? > > Did you use --compiler=gcc? That is not one of the options. Please > show the exact command line you used and any configuration files. > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > -- Umberto Eco > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mdroe at stsci.edu Fri Sep 25 11:52:14 2009 From: mdroe at stsci.edu (Michael Droettboom) Date: Fri, 25 Sep 2009 11:52:14 -0400 Subject: [Numpy-discussion] Getting number of *characters* in dtype='U' array In-Reply-To: <4ABCE2BD.4020601@stsci.edu> References: <4ABCE2BD.4020601@stsci.edu> Message-ID: <4ABCE72E.2040309@stsci.edu> Ah, I missed the fact that Numpy unicode characters are always 4-bytes, unlike Python unicode objects. Divide by 4 is easy enough. Sorry for the noise. Mike Michael Droettboom wrote: > Is there a way to get the number of characters in a fixed-size 'U' > array? I can, of course, parse dtype.str, or divide dtype.itemsize by > the size of a unicode character, but neither seems terribly elegant or > future proof. Does numpy provide (to Python) a method for getting this > that I'm just missing? 
> > In [7]: x = np.array(u'1234') > > In [8]: x.dtype > Out[8]: dtype(' > In [9]: x.dtype.str > Out[9]: ' > In [10]: x.dtype.itemsize > Out[10]: 16 > > Cheers, > Mike > > -- Michael Droettboom Science Software Branch Operations and Engineering Division Space Telescope Science Institute Operated by AURA for NASA From robert.kern at gmail.com Fri Sep 25 11:59:02 2009 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 25 Sep 2009 10:59:02 -0500 Subject: [Numpy-discussion] F2PY error : ... on platform 'posix' with 'gcc' compiler In-Reply-To: <6574ea980909250848g98df329he3fc398fc23d3aaa@mail.gmail.com> References: <6574ea980909250819q365a1951n3e6f21d8bb045a35@mail.gmail.com> <3d375d730909250824u6c7e6b8t5376ce2faef9feef@mail.gmail.com> <6574ea980909250848g98df329he3fc398fc23d3aaa@mail.gmail.com> Message-ID: <3d375d730909250859g18b591ees47511bc9ce69dcb0@mail.gmail.com> On Fri, Sep 25, 2009 at 10:48, Arthur Bousquet wrote: > I used compiler=gcc, here is the command : > > f2py2.6 --fcompiler=gfortran --compiler=gcc -DF2PY_REPORT_ON_ARRAY_COPY=1 -m > libsw -c file1.f90 file2.f90 > > Ok so you said that "--compiler=gcc" is not a command ? Correct. --compiler= does not tell distutils what executable to use, but the internal name given to the class that implements the logic of creating the command lines to compile things. $ python setup.py build_ext --help-compiler Running from numpy source directory. List of available compilers: --compiler=bcpp Borland C++ Compiler --compiler=cygwin Cygwin port of GNU C Compiler for Win32 --compiler=emx EMX port of GNU C Compiler for OS/2 --compiler=intel Intel C Compiler for 32-bit applications --compiler=intele Intel C Itanium Compiler for Itanium-based applications --compiler=mingw32 Mingw32 port of GNU C Compiler for Win32 --compiler=msvc Microsoft Visual C++ --compiler=mwerks MetroWerks CodeWarrior --compiler=unix standard UNIX-style compiler > So I removed the line "--compiler=gcc", and I got a new error : > > error: Command "gcc -arch ppc -arch i386 -isysroot > /Developer/SDKs/MacOSX10.4u.sdk -fno-strict-aliasing -fPIC -fno-common > -dynamic -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes > -DF2PY_REPORT_ON_ARRAY_COPY=1 > -I/var/folders/N2/N2Z4lrkgHb0iyoQobv1DcE+++TI/-Tmp-/tmpJRch8F/src.macosx-10.3-fat-2.6 > -I/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/numpy/core/include > -I/Library/Frameworks/Python.framework/Versions/2.6/include/python2.6 -c > /var/folders/N2/N2Z4lrkgHb0iyoQobv1DcE+++TI/-Tmp-/tmpJRch8F/src.macosx-10.3-fat-2.6/fortranobject.c > -o > /var/folders/N2/N2Z4lrkgHb0iyoQobv1DcE+++TI/-Tmp-/tmpJRch8F/var/folders/N2/N2Z4lrkgHb0iyoQobv1DcE+++TI/-Tmp-/tmpJRch8F/src.macosx-10.3-fat-2.6/fortranobject.o" > failed with exit status 1 > make[1]: *** [libsw] Error 1 > make: *** [libsw] Error 2 > > > Why some macs-10.3 appeare but if I do the "export > MACOSX_DEPLOYMENT_TARGET=10.6" they all become "10.6" but I still have the > same error. Please show the full output, not snippets. There should be some output just before the "error:" line that shows gcc's error messages. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From sccolbert at gmail.com Fri Sep 25 11:59:33 2009 From: sccolbert at gmail.com (Chris Colbert) Date: Fri, 25 Sep 2009 11:59:33 -0400 Subject: [Numpy-discussion] numpy install script changes permissions? 
In-Reply-To: <4ABCD599.7050705@stsci.edu> References: <7f014ea60909250701xd2b5cdcra976183cf92e06e4@mail.gmail.com> <4ABCD599.7050705@stsci.edu> Message-ID: <7f014ea60909250859j2731a9fdq3cf5743cb715b124@mail.gmail.com> for numpy and scipy, only the tests have executable permissions. It's as if the tests were specifically targeted and had their permissions changed. And these are the only two python packages i've built from source and installed in this manner, other i've gotten via easy_install, or in the case of the enthought tool suite, I have as an svn install in my home folder. The ETS install doesnt have these problems, but many of the easy_installed packages are marked as executable, though they could have come that way from the source.... Chris On Fri, Sep 25, 2009 at 10:37 AM, Mark Sienkiewicz wrote: > >> In the source build folder, all numpy test scripts have the correct >> permissions and are not marked as executable, >> >> but in the install directory >> (/usr/local/lib/python2.6/dist-packages/numpy/), the test test scripts >> have completely different permissions, and are all marked as >> executable. Thus, nose wont run the tests. >> > > It works ok for me with python 2.5.1 on a mac and python 2.6.1 on > linux. ?That doesn't help you, but it may be a clue at some point. > > Is it only the test scripts that are executable, or is it everything > that gets installed? > > Does it affect packages other than numpy / scipy? ?If so, we can suspect > distutils and/or ubuntu, rather than numpy. > > Mark S. > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From robert.kern at gmail.com Fri Sep 25 12:01:29 2009 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 25 Sep 2009 11:01:29 -0500 Subject: [Numpy-discussion] numpy install script changes permissions? In-Reply-To: <7f014ea60909250859j2731a9fdq3cf5743cb715b124@mail.gmail.com> References: <7f014ea60909250701xd2b5cdcra976183cf92e06e4@mail.gmail.com> <4ABCD599.7050705@stsci.edu> <7f014ea60909250859j2731a9fdq3cf5743cb715b124@mail.gmail.com> Message-ID: <3d375d730909250901m576eb237v5f0fbd83a19cca32@mail.gmail.com> On Fri, Sep 25, 2009 at 10:59, Chris Colbert wrote: > for numpy and scipy, only the tests have executable permissions. It's > as if the tests were specifically targeted and had their permissions > changed. > > And these are the only two python packages i've built from source and > installed in this manner, other i've gotten via easy_install, or in > the case of the enthought tool suite, I have as an svn install in my > home folder. The ETS install doesnt have these problems, but many of > the easy_installed packages are marked as executable, though they > could have come that way from the source.... easy_install marks everything executable for somewhat inscrutable reasons. I don't know why a regular install would mark the tests as executable. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From charlesr.harris at gmail.com Fri Sep 25 12:09:02 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 25 Sep 2009 10:09:02 -0600 Subject: [Numpy-discussion] numpy install script changes permissions? 
In-Reply-To: <7f014ea60909250859j2731a9fdq3cf5743cb715b124@mail.gmail.com> References: <7f014ea60909250701xd2b5cdcra976183cf92e06e4@mail.gmail.com> <4ABCD599.7050705@stsci.edu> <7f014ea60909250859j2731a9fdq3cf5743cb715b124@mail.gmail.com> Message-ID: On Fri, Sep 25, 2009 at 9:59 AM, Chris Colbert wrote: > for numpy and scipy, only the tests have executable permissions. It's > as if the tests were specifically targeted and had their permissions > changed. > > And these are the only two python packages i've built from source and > installed in this manner, other i've gotten via easy_install, or in > the case of the enthought tool suite, I have as an svn install in my > home folder. The ETS install doesnt have these problems, but many of > the easy_installed packages are marked as executable, though they > could have come that way from the source.... > > Chris > > On Fri, Sep 25, 2009 at 10:37 AM, Mark Sienkiewicz > wrote: > > > >> In the source build folder, all numpy test scripts have the correct > >> permissions and are not marked as executable, > >> > >> but in the install directory > >> (/usr/local/lib/python2.6/dist-packages/numpy/), the test test scripts > >> have completely different permissions, and are all marked as > >> executable. Thus, nose wont run the tests. > >> > > > > It works ok for me with python 2.5.1 on a mac and python 2.6.1 on > > linux. That doesn't help you, but it may be a clue at some point. > > > > Is it only the test scripts that are executable, or is it everything > > that gets installed? > > > > Does it affect packages other than numpy / scipy? If so, we can suspect > > distutils and/or ubuntu, rather than numpy. > > > > Mark S. > > > What do the permissions look like in the source? In the build directory? What happens if you just copy a test script into the directory? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Fri Sep 25 12:10:28 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 25 Sep 2009 10:10:28 -0600 Subject: [Numpy-discussion] numpy install script changes permissions? In-Reply-To: References: <7f014ea60909250701xd2b5cdcra976183cf92e06e4@mail.gmail.com> <4ABCD599.7050705@stsci.edu> <7f014ea60909250859j2731a9fdq3cf5743cb715b124@mail.gmail.com> Message-ID: On Fri, Sep 25, 2009 at 10:09 AM, Charles R Harris < charlesr.harris at gmail.com> wrote: > > > On Fri, Sep 25, 2009 at 9:59 AM, Chris Colbert wrote: > >> for numpy and scipy, only the tests have executable permissions. It's >> as if the tests were specifically targeted and had their permissions >> changed. >> >> And these are the only two python packages i've built from source and >> installed in this manner, other i've gotten via easy_install, or in >> the case of the enthought tool suite, I have as an svn install in my >> home folder. The ETS install doesnt have these problems, but many of >> the easy_installed packages are marked as executable, though they >> could have come that way from the source.... >> >> Chris >> >> On Fri, Sep 25, 2009 at 10:37 AM, Mark Sienkiewicz >> wrote: >> > >> >> In the source build folder, all numpy test scripts have the correct >> >> permissions and are not marked as executable, >> >> >> >> but in the install directory >> >> (/usr/local/lib/python2.6/dist-packages/numpy/), the test test scripts >> >> have completely different permissions, and are all marked as >> >> executable. Thus, nose wont run the tests. 
>> >> >> > >> > It works ok for me with python 2.5.1 on a mac and python 2.6.1 on >> > linux. That doesn't help you, but it may be a clue at some point. >> > >> > Is it only the test scripts that are executable, or is it everything >> > that gets installed? >> > >> > Does it affect packages other than numpy / scipy? If so, we can suspect >> > distutils and/or ubuntu, rather than numpy. >> > >> > Mark S. >> > >> > > What do the permissions look like in the source? In the build directory? > What happens if you just copy a test script into the directory? > > Oh, and what happens if you delete the site-packages/numpy directory first? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From arthbous at indiana.edu Fri Sep 25 12:19:39 2009 From: arthbous at indiana.edu (Arthur Bousquet) Date: Fri, 25 Sep 2009 12:19:39 -0400 Subject: [Numpy-discussion] F2PY error : ... on platform 'posix' with 'gcc' compiler In-Reply-To: <3d375d730909250859g18b591ees47511bc9ce69dcb0@mail.gmail.com> References: <6574ea980909250819q365a1951n3e6f21d8bb045a35@mail.gmail.com> <3d375d730909250824u6c7e6b8t5376ce2faef9feef@mail.gmail.com> <6574ea980909250848g98df329he3fc398fc23d3aaa@mail.gmail.com> <3d375d730909250859g18b591ees47511bc9ce69dcb0@mail.gmail.com> Message-ID: <6574ea980909250919n52de3151j328d3b4fad70d240@mail.gmail.com> Here my full error : http://dl.getdropbox.com/u/541149/error_f2py.txt Thank you, Arthur On Fri, Sep 25, 2009 at 11:59 AM, Robert Kern wrote: > On Fri, Sep 25, 2009 at 10:48, Arthur Bousquet > wrote: > > I used compiler=gcc, here is the command : > > > > f2py2.6 --fcompiler=gfortran --compiler=gcc -DF2PY_REPORT_ON_ARRAY_COPY=1 > -m > > libsw -c file1.f90 file2.f90 > > > > Ok so you said that "--compiler=gcc" is not a command ? > > Correct. --compiler= does not tell distutils what executable to use, > but the internal name given to the class that implements the logic of > creating the command lines to compile things. > > $ python setup.py build_ext --help-compiler > Running from numpy source directory. 
> List of available compilers: > --compiler=bcpp Borland C++ Compiler > --compiler=cygwin Cygwin port of GNU C Compiler for Win32 > --compiler=emx EMX port of GNU C Compiler for OS/2 > --compiler=intel Intel C Compiler for 32-bit applications > --compiler=intele Intel C Itanium Compiler for Itanium-based > applications > --compiler=mingw32 Mingw32 port of GNU C Compiler for Win32 > --compiler=msvc Microsoft Visual C++ > --compiler=mwerks MetroWerks CodeWarrior > --compiler=unix standard UNIX-style compiler > > > So I removed the line "--compiler=gcc", and I got a new error : > > > > error: Command "gcc -arch ppc -arch i386 -isysroot > > /Developer/SDKs/MacOSX10.4u.sdk -fno-strict-aliasing -fPIC -fno-common > > -dynamic -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes > > -DF2PY_REPORT_ON_ARRAY_COPY=1 > > > -I/var/folders/N2/N2Z4lrkgHb0iyoQobv1DcE+++TI/-Tmp-/tmpJRch8F/src.macosx-10.3-fat-2.6 > > > -I/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/numpy/core/include > > -I/Library/Frameworks/Python.framework/Versions/2.6/include/python2.6 -c > > > /var/folders/N2/N2Z4lrkgHb0iyoQobv1DcE+++TI/-Tmp-/tmpJRch8F/src.macosx-10.3-fat-2.6/fortranobject.c > > -o > > > /var/folders/N2/N2Z4lrkgHb0iyoQobv1DcE+++TI/-Tmp-/tmpJRch8F/var/folders/N2/N2Z4lrkgHb0iyoQobv1DcE+++TI/-Tmp-/tmpJRch8F/src.macosx-10.3-fat-2.6/fortranobject.o" > > failed with exit status 1 > > make[1]: *** [libsw] Error 1 > > make: *** [libsw] Error 2 > > > > > > Why some macs-10.3 appeare but if I do the "export > > MACOSX_DEPLOYMENT_TARGET=10.6" they all become "10.6" but I still have > the > > same error. > > Please show the full output, not snippets. There should be some output > just before the "error:" line that shows gcc's error messages. > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > -- Umberto Eco > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sccolbert at gmail.com Fri Sep 25 12:21:20 2009 From: sccolbert at gmail.com (Chris Colbert) Date: Fri, 25 Sep 2009 12:21:20 -0400 Subject: [Numpy-discussion] numpy install script changes permissions? 
In-Reply-To: References: <7f014ea60909250701xd2b5cdcra976183cf92e06e4@mail.gmail.com> <4ABCD599.7050705@stsci.edu> <7f014ea60909250859j2731a9fdq3cf5743cb715b124@mail.gmail.com> Message-ID: <7f014ea60909250921sb2d4166l545b1f52fa3cccd6@mail.gmail.com> here's an example from numpy/core/tests/ from the source directory: brucewayne at broo:~/builds/numpy-1.3.0/numpy/core/tests$ ls -l total 244 drwxr-xr-x 2 brucewayne brucewayne 4096 2009-04-05 04:29 data -rw-r--r-- 1 brucewayne brucewayne 465 2009-03-29 07:24 test_blasdot.py -rw-r--r-- 1 brucewayne brucewayne 2975 2009-04-05 04:09 test_defchararray.py -rw-r--r-- 1 brucewayne brucewayne 10861 2009-04-05 04:09 test_defmatrix.py -rw-r--r-- 1 brucewayne brucewayne 3306 2009-04-05 04:09 test_dtype.py -rw-r--r-- 1 brucewayne brucewayne 1769 2009-03-29 07:24 test_errstate.py -rw-r--r-- 1 brucewayne brucewayne 2101 2009-03-29 07:24 test_memmap.py -rw-r--r-- 1 brucewayne brucewayne 37911 2009-04-05 04:09 test_multiarray.py -rw-r--r-- 1 brucewayne brucewayne 28164 2009-03-29 07:24 test_numeric.py -rw-r--r-- 1 brucewayne brucewayne 14194 2009-03-29 07:24 test_numerictypes.py -rw-r--r-- 1 brucewayne brucewayne 8611 2009-04-05 04:09 test_print.py -rw-r--r-- 1 brucewayne brucewayne 4591 2009-03-29 07:24 test_records.py -rw-r--r-- 1 brucewayne brucewayne 42700 2009-04-05 04:19 test_regression.py -rw-r--r-- 1 brucewayne brucewayne 3768 2009-04-05 04:09 test_scalarmath.py -rw-r--r-- 1 brucewayne brucewayne 16944 2009-04-05 04:09 test_ufunc.py -rw-r--r-- 1 brucewayne brucewayne 24276 2009-04-05 04:09 test_umath.py -rw-r--r-- 1 brucewayne brucewayne 11255 2009-04-05 04:09 test_unicode.py /* Tests dont exist in the build directory */ from the install directory: brucewayne at broo:/usr/local/lib/python2.6/dist-packages/numpy/core/tests$ ls -l total 244 drwxr-sr-x 2 root staff 4096 2009-08-20 16:19 data -rwxrwxrwx 1 root staff 465 2009-08-17 18:01 test_blasdot.py -rwxrwxrwx 1 root staff 2975 2009-08-17 18:01 test_defchararray.py -rwxrwxrwx 1 root staff 10861 2009-08-17 18:01 test_defmatrix.py -rwxrwxrwx 1 root staff 3306 2009-08-17 18:01 test_dtype.py -rwxrwxrwx 1 root staff 1769 2009-08-17 18:01 test_errstate.py -rwxrwxrwx 1 root staff 2101 2009-08-17 18:01 test_memmap.py -rwxrwxrwx 1 root staff 37911 2009-08-17 18:01 test_multiarray.py -rwxrwxrwx 1 root staff 28164 2009-08-17 18:01 test_numeric.py -rwxrwxrwx 1 root staff 14194 2009-08-17 18:01 test_numerictypes.py -rwxrwxrwx 1 root staff 8611 2009-08-17 18:01 test_print.py -rwxrwxrwx 1 root staff 4591 2009-08-17 18:01 test_records.py -rwxrwxrwx 1 root staff 42700 2009-08-17 18:01 test_regression.py -rwxrwxrwx 1 root staff 3768 2009-08-17 18:01 test_scalarmath.py -rwxrwxrwx 1 root staff 16944 2009-08-17 18:01 test_ufunc.py -rwxrwxrwx 1 root staff 24276 2009-08-17 18:01 test_umath.py -rwxrwxrwx 1 root staff 11255 2009-08-17 18:01 test_unicode.py and after a fresh removal of the install and build dirs, and a rebuild and reinstall: brucewayne at broo:/usr/local/lib/python2.6/dist-packages/numpy/core/tests$ ls -l total 244 drwxr-sr-x 2 root staff 4096 2009-09-25 12:20 data -rw-r--r-- 1 root staff 465 2009-03-29 07:24 test_blasdot.py -rw-r--r-- 1 root staff 2975 2009-04-05 04:09 test_defchararray.py -rw-r--r-- 1 root staff 10861 2009-04-05 04:09 test_defmatrix.py -rw-r--r-- 1 root staff 3306 2009-04-05 04:09 test_dtype.py -rw-r--r-- 1 root staff 1769 2009-03-29 07:24 test_errstate.py -rw-r--r-- 1 root staff 2101 2009-03-29 07:24 test_memmap.py -rw-r--r-- 1 root staff 37911 2009-04-05 04:09 test_multiarray.py -rw-r--r-- 
1 root staff 28164 2009-03-29 07:24 test_numeric.py -rw-r--r-- 1 root staff 14194 2009-03-29 07:24 test_numerictypes.py -rw-r--r-- 1 root staff 8611 2009-04-05 04:09 test_print.py -rw-r--r-- 1 root staff 4591 2009-03-29 07:24 test_records.py -rw-r--r-- 1 root staff 42700 2009-04-05 04:19 test_regression.py -rw-r--r-- 1 root staff 3768 2009-04-05 04:09 test_scalarmath.py -rw-r--r-- 1 root staff 16944 2009-04-05 04:09 test_ufunc.py -rw-r--r-- 1 root staff 24276 2009-04-05 04:09 test_umath.py -rw-r--r-- 1 root staff 11255 2009-04-05 04:09 test_unicode.py so, something went amok probably a few installs back. This seems to have cleared it up. Thanks Chuck! Chris On Fri, Sep 25, 2009 at 12:10 PM, Charles R Harris wrote: > > > On Fri, Sep 25, 2009 at 10:09 AM, Charles R Harris > wrote: >> >> >> On Fri, Sep 25, 2009 at 9:59 AM, Chris Colbert >> wrote: >>> >>> for numpy and scipy, only the tests have executable permissions. It's >>> as if the tests were specifically targeted and had their permissions >>> changed. >>> >>> And these are the only two python packages i've built from source and >>> installed in this manner, other i've gotten via easy_install, or in >>> the case of the enthought tool suite, I have as an svn install in my >>> home folder. The ETS install doesnt have these problems, but many of >>> the easy_installed packages are marked as executable, though they >>> could have come that way from the source.... >>> >>> Chris >>> >>> On Fri, Sep 25, 2009 at 10:37 AM, Mark Sienkiewicz >>> wrote: >>> > >>> >> In the source build folder, all numpy test scripts have the correct >>> >> permissions and are not marked as executable, >>> >> >>> >> but in the install directory >>> >> (/usr/local/lib/python2.6/dist-packages/numpy/), the test test scripts >>> >> have completely different permissions, and are all marked as >>> >> executable. Thus, nose wont run the tests. >>> >> >>> > >>> > It works ok for me with python 2.5.1 on a mac and python 2.6.1 on >>> > linux. ?That doesn't help you, but it may be a clue at some point. >>> > >>> > Is it only the test scripts that are executable, or is it everything >>> > that gets installed? >>> > >>> > Does it affect packages other than numpy / scipy? ?If so, we can >>> > suspect >>> > distutils and/or ubuntu, rather than numpy. >>> > >>> > Mark S. >>> > >> >> What do the permissions look like in the source? In the build directory? >> What happens if you just copy a test script into the directory? >> > > Oh, and what happens if you delete the site-packages/numpy directory first? > > Chuck > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From d.l.goldsmith at gmail.com Fri Sep 25 12:23:11 2009 From: d.l.goldsmith at gmail.com (David Goldsmith) Date: Fri, 25 Sep 2009 09:23:11 -0700 Subject: [Numpy-discussion] [SciPy-dev] Deprecate chararray [was Plea for help] In-Reply-To: <4ABCC0A1.30402@stsci.edu> References: <857977.74958.qm@web52106.mail.re2.yahoo.com> <812BBECE-D1E8-4699-980A-BB8FB9657CB9@stsci.edu> <4AB9103F.7030207@stsci.edu> <45d1ab480909221951h6006deb9u4675e419c9f0b256@mail.gmail.com> <4ABCC0A1.30402@stsci.edu> Message-ID: <45d1ab480909250923t2dd716a2s502967fb231dd889@mail.gmail.com> Great, thanks! 
DG On Fri, Sep 25, 2009 at 6:07 AM, Michael Droettboom wrote: > David Goldsmith wrote: > > On Tue, Sep 22, 2009 at 4:02 PM, Ralf Gommers > > > > wrote: > > > > > > On Tue, Sep 22, 2009 at 1:58 PM, Michael Droettboom > > > wrote: > > > > Trac has these bugs. Any others? > > > > http://projects.scipy.org/numpy/ticket/1199 > > http://projects.scipy.org/numpy/ticket/1200 > > http://projects.scipy.org/numpy/ticket/856 > > http://projects.scipy.org/numpy/ticket/855 > > http://projects.scipy.org/numpy/ticket/1231 > > > > > > This one: > > > http://article.gmane.org/gmane.comp.python.numeric.general/23638/match=chararray > > > > Cheers, > > Ralf > > > > > > That last one never got "promoted" to a ticket? > It's a symptom of this bug, that I created and produced a patch for > yesterday: > > http://projects.scipy.org/numpy/ticket/1235 > > Mike > > > -- > Michael Droettboom > Science Software Branch > Operations and Engineering Division > Space Telescope Science Institute > Operated by AURA for NASA > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sccolbert at gmail.com Fri Sep 25 12:23:26 2009 From: sccolbert at gmail.com (Chris Colbert) Date: Fri, 25 Sep 2009 12:23:26 -0400 Subject: [Numpy-discussion] numpy install script changes permissions? In-Reply-To: <7f014ea60909250921sb2d4166l545b1f52fa3cccd6@mail.gmail.com> References: <7f014ea60909250701xd2b5cdcra976183cf92e06e4@mail.gmail.com> <4ABCD599.7050705@stsci.edu> <7f014ea60909250859j2731a9fdq3cf5743cb715b124@mail.gmail.com> <7f014ea60909250921sb2d4166l545b1f52fa3cccd6@mail.gmail.com> Message-ID: <7f014ea60909250923rdbf1e89m38752a1706efd54c@mail.gmail.com> Oh, and sorry if calling you Chuck was offensive, that's out of habit from a friend of mine named Charles. My apologies... On Fri, Sep 25, 2009 at 12:21 PM, Chris Colbert wrote: > here's an example from numpy/core/tests/ > > from the source directory: > > brucewayne at broo:~/builds/numpy-1.3.0/numpy/core/tests$ ls -l > total 244 > drwxr-xr-x 2 brucewayne brucewayne ?4096 2009-04-05 04:29 data > -rw-r--r-- 1 brucewayne brucewayne ? 
465 2009-03-29 07:24 test_blasdot.py > -rw-r--r-- 1 brucewayne brucewayne ?2975 2009-04-05 04:09 test_defchararray.py > -rw-r--r-- 1 brucewayne brucewayne 10861 2009-04-05 04:09 test_defmatrix.py > -rw-r--r-- 1 brucewayne brucewayne ?3306 2009-04-05 04:09 test_dtype.py > -rw-r--r-- 1 brucewayne brucewayne ?1769 2009-03-29 07:24 test_errstate.py > -rw-r--r-- 1 brucewayne brucewayne ?2101 2009-03-29 07:24 test_memmap.py > -rw-r--r-- 1 brucewayne brucewayne 37911 2009-04-05 04:09 test_multiarray.py > -rw-r--r-- 1 brucewayne brucewayne 28164 2009-03-29 07:24 test_numeric.py > -rw-r--r-- 1 brucewayne brucewayne 14194 2009-03-29 07:24 test_numerictypes.py > -rw-r--r-- 1 brucewayne brucewayne ?8611 2009-04-05 04:09 test_print.py > -rw-r--r-- 1 brucewayne brucewayne ?4591 2009-03-29 07:24 test_records.py > -rw-r--r-- 1 brucewayne brucewayne 42700 2009-04-05 04:19 test_regression.py > -rw-r--r-- 1 brucewayne brucewayne ?3768 2009-04-05 04:09 test_scalarmath.py > -rw-r--r-- 1 brucewayne brucewayne 16944 2009-04-05 04:09 test_ufunc.py > -rw-r--r-- 1 brucewayne brucewayne 24276 2009-04-05 04:09 test_umath.py > -rw-r--r-- 1 brucewayne brucewayne 11255 2009-04-05 04:09 test_unicode.py > > > /* Tests dont exist in the build directory */ > > from the install directory: > > brucewayne at broo:/usr/local/lib/python2.6/dist-packages/numpy/core/tests$ ls -l > total 244 > drwxr-sr-x 2 root staff ?4096 2009-08-20 16:19 data > -rwxrwxrwx 1 root staff ? 465 2009-08-17 18:01 test_blasdot.py > -rwxrwxrwx 1 root staff ?2975 2009-08-17 18:01 test_defchararray.py > -rwxrwxrwx 1 root staff 10861 2009-08-17 18:01 test_defmatrix.py > -rwxrwxrwx 1 root staff ?3306 2009-08-17 18:01 test_dtype.py > -rwxrwxrwx 1 root staff ?1769 2009-08-17 18:01 test_errstate.py > -rwxrwxrwx 1 root staff ?2101 2009-08-17 18:01 test_memmap.py > -rwxrwxrwx 1 root staff 37911 2009-08-17 18:01 test_multiarray.py > -rwxrwxrwx 1 root staff 28164 2009-08-17 18:01 test_numeric.py > -rwxrwxrwx 1 root staff 14194 2009-08-17 18:01 test_numerictypes.py > -rwxrwxrwx 1 root staff ?8611 2009-08-17 18:01 test_print.py > -rwxrwxrwx 1 root staff ?4591 2009-08-17 18:01 test_records.py > -rwxrwxrwx 1 root staff 42700 2009-08-17 18:01 test_regression.py > -rwxrwxrwx 1 root staff ?3768 2009-08-17 18:01 test_scalarmath.py > -rwxrwxrwx 1 root staff 16944 2009-08-17 18:01 test_ufunc.py > -rwxrwxrwx 1 root staff 24276 2009-08-17 18:01 test_umath.py > -rwxrwxrwx 1 root staff 11255 2009-08-17 18:01 test_unicode.py > > > and after a fresh removal of the install and build dirs, and a rebuild > and reinstall: > > brucewayne at broo:/usr/local/lib/python2.6/dist-packages/numpy/core/tests$ ls -l > total 244 > drwxr-sr-x 2 root staff ?4096 2009-09-25 12:20 data > -rw-r--r-- 1 root staff ? 
465 2009-03-29 07:24 test_blasdot.py > -rw-r--r-- 1 root staff ?2975 2009-04-05 04:09 test_defchararray.py > -rw-r--r-- 1 root staff 10861 2009-04-05 04:09 test_defmatrix.py > -rw-r--r-- 1 root staff ?3306 2009-04-05 04:09 test_dtype.py > -rw-r--r-- 1 root staff ?1769 2009-03-29 07:24 test_errstate.py > -rw-r--r-- 1 root staff ?2101 2009-03-29 07:24 test_memmap.py > -rw-r--r-- 1 root staff 37911 2009-04-05 04:09 test_multiarray.py > -rw-r--r-- 1 root staff 28164 2009-03-29 07:24 test_numeric.py > -rw-r--r-- 1 root staff 14194 2009-03-29 07:24 test_numerictypes.py > -rw-r--r-- 1 root staff ?8611 2009-04-05 04:09 test_print.py > -rw-r--r-- 1 root staff ?4591 2009-03-29 07:24 test_records.py > -rw-r--r-- 1 root staff 42700 2009-04-05 04:19 test_regression.py > -rw-r--r-- 1 root staff ?3768 2009-04-05 04:09 test_scalarmath.py > -rw-r--r-- 1 root staff 16944 2009-04-05 04:09 test_ufunc.py > -rw-r--r-- 1 root staff 24276 2009-04-05 04:09 test_umath.py > -rw-r--r-- 1 root staff 11255 2009-04-05 04:09 test_unicode.py > > > > so, something went amok probably a few installs back. This seems to > have cleared it up. > > Thanks Chuck! > > Chris > > > > > > > > > On Fri, Sep 25, 2009 at 12:10 PM, Charles R Harris > wrote: >> >> >> On Fri, Sep 25, 2009 at 10:09 AM, Charles R Harris >> wrote: >>> >>> >>> On Fri, Sep 25, 2009 at 9:59 AM, Chris Colbert >>> wrote: >>>> >>>> for numpy and scipy, only the tests have executable permissions. It's >>>> as if the tests were specifically targeted and had their permissions >>>> changed. >>>> >>>> And these are the only two python packages i've built from source and >>>> installed in this manner, other i've gotten via easy_install, or in >>>> the case of the enthought tool suite, I have as an svn install in my >>>> home folder. The ETS install doesnt have these problems, but many of >>>> the easy_installed packages are marked as executable, though they >>>> could have come that way from the source.... >>>> >>>> Chris >>>> >>>> On Fri, Sep 25, 2009 at 10:37 AM, Mark Sienkiewicz >>>> wrote: >>>> > >>>> >> In the source build folder, all numpy test scripts have the correct >>>> >> permissions and are not marked as executable, >>>> >> >>>> >> but in the install directory >>>> >> (/usr/local/lib/python2.6/dist-packages/numpy/), the test test scripts >>>> >> have completely different permissions, and are all marked as >>>> >> executable. Thus, nose wont run the tests. >>>> >> >>>> > >>>> > It works ok for me with python 2.5.1 on a mac and python 2.6.1 on >>>> > linux. ?That doesn't help you, but it may be a clue at some point. >>>> > >>>> > Is it only the test scripts that are executable, or is it everything >>>> > that gets installed? >>>> > >>>> > Does it affect packages other than numpy / scipy? ?If so, we can >>>> > suspect >>>> > distutils and/or ubuntu, rather than numpy. >>>> > >>>> > Mark S. >>>> > >>> >>> What do the permissions look like in the source? In the build directory? >>> What happens if you just copy a test script into the directory? >>> >> >> Oh, and what happens if you delete the site-packages/numpy directory first? >> >> Chuck >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > From robert.kern at gmail.com Fri Sep 25 12:27:19 2009 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 25 Sep 2009 11:27:19 -0500 Subject: [Numpy-discussion] F2PY error : ... 
on platform 'posix' with 'gcc' compiler In-Reply-To: <6574ea980909250919n52de3151j328d3b4fad70d240@mail.gmail.com> References: <6574ea980909250819q365a1951n3e6f21d8bb045a35@mail.gmail.com> <3d375d730909250824u6c7e6b8t5376ce2faef9feef@mail.gmail.com> <6574ea980909250848g98df329he3fc398fc23d3aaa@mail.gmail.com> <3d375d730909250859g18b591ees47511bc9ce69dcb0@mail.gmail.com> <6574ea980909250919n52de3151j328d3b4fad70d240@mail.gmail.com> Message-ID: <3d375d730909250927jae9cbb6re3739d8a8b6f84da@mail.gmail.com> On Fri, Sep 25, 2009 at 11:19, Arthur Bousquet wrote: > Here my full error : http://dl.getdropbox.com/u/541149/error_f2py.txt /Developer/SDKs/MacOSX10.4u.sdk/usr/include/stdarg.h:4:25: error: stdarg.h: No such file or directory I didn't think that MacOSX10.4u.sdk existed on Snow Leopard. The "-isysroot /Developer/SDKs/MacOSX10.4u.sdk" flag should be coming from Python itself. I don't know how you managed to build Python with those flags. Check your configuration like so: $ grep isysroot /Library/Frameworks/Python.framework/Versions/Current/lib/python2.5/config/Makefile BASECFLAGS= -arch ppc -arch i386 -isysroot /Developer/SDKs/MacOSX10.4u.sdk -fno-strict-aliasing -Wno-long-double -no-cpp-precomp -mno-fused-madd -fno-common -dynamic LDFLAGS= -arch i386 -arch ppc -isysroot /Developer/SDKs/MacOSX10.4u.sdk -g -isysroot "${UNIVERSALSDK}" \ -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From robert.kern at gmail.com Fri Sep 25 12:29:08 2009 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 25 Sep 2009 11:29:08 -0500 Subject: [Numpy-discussion] numpy install script changes permissions? In-Reply-To: <7f014ea60909250923rdbf1e89m38752a1706efd54c@mail.gmail.com> References: <7f014ea60909250701xd2b5cdcra976183cf92e06e4@mail.gmail.com> <4ABCD599.7050705@stsci.edu> <7f014ea60909250859j2731a9fdq3cf5743cb715b124@mail.gmail.com> <7f014ea60909250921sb2d4166l545b1f52fa3cccd6@mail.gmail.com> <7f014ea60909250923rdbf1e89m38752a1706efd54c@mail.gmail.com> Message-ID: <3d375d730909250929k5aaa5087q876cbc3af1ebf9b1@mail.gmail.com> On Fri, Sep 25, 2009 at 11:23, Chris Colbert wrote: > Oh, and sorry if calling you Chuck was offensive, that's out of habit > from a friend of mine named Charles. Since he signs his name "Chuck", I don't think he could reasonably get offended by that. :-) -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From mdroe at stsci.edu Fri Sep 25 12:29:30 2009 From: mdroe at stsci.edu (Michael Droettboom) Date: Fri, 25 Sep 2009 12:29:30 -0400 Subject: [Numpy-discussion] Error building docs Message-ID: <4ABCEFEA.7010002@stsci.edu> Anybody know why I might be seeing this? Cheers, Mike [~/builds/numpy_clean/doc] > make html mkdir -p build touch build/generate-stamp mkdir -p build/html build/doctrees LANG=C sphinx-build -b html -d build/doctrees source build/html Running Sphinx v1.0 (hg) 1.4.dev7413 1.4.0.dev7413 /home/mdroe/usr/lib/python2.5/site-packages/Sphinx-1.0dev_20090915-py2.5.egg/sphinx/application.py:166: FutureWarning: A plot_directive module is also available under matplotlib.sphinxext; expect this numpydoc.plot_directive module to be deprecated after relevant features have been integrated there. 
mod = __import__(extension, None, None, ['setup']) loading pickled environment... not found [autosummary] generating autosummary for: reference/arrays.classes.rst, reference/arrays.dtypes.rst, reference/arrays.indexing.rst, reference/arrays.interface.rst, reference/arrays.ndarray.rst, reference/arrays.rst, reference/arrays.scalars.rst, reference/c-api.array.rst, reference/c-api.config.rst, reference/c-api.coremath.rst, reference/c-api.dtype.rst, reference/c-api.generalized-ufuncs.rst, reference/c-api.rst, reference/c-api.types-and-structures.rst, reference/c-api.ufunc.rst, reference/distutils.rst, reference/index.rst, reference/internals.code-explanations.rst, reference/internals.rst, reference/maskedarray.baseclass.rst, reference/maskedarray.generic.rst, reference/maskedarray.rst, reference/routines.array-creation.rst, reference/routines.array-manipulation.rst, reference/routines.bitwise.rst, reference/routines.ctypeslib.rst, reference/routines.dtype.rst, reference/routines.dual.rst, reference/routines.emath.rst, reference/routines.err.rst, reference/routines.fft.rst, reference/routines.financial.rst, reference/routines.functional.rst, reference/routines.help.rst, reference/routines.indexing.rst, reference/routines.io.rst, reference/routines.linalg.rst, reference/routines.logic.rst, reference/routines.ma.rst, reference/routines.math.rst, reference/routines.matlib.rst, reference/routines.numarray.rst, reference/routines.oldnumeric.rst, reference/routines.other.rst, reference/routines.poly.rst, reference/routines.random.rst, reference/routines.rst, reference/routines.set.rst, reference/routines.sort.rst, reference/routines.statistics.rst, reference/routines.testing.rst, reference/routines.window.rst, reference/ufuncs.rst Failed to import 'numpy.matlib': no module named numpy.matlib WARNING: [autosummary] failed to import 'numpy.__array_priority__': no module named numpy.__array_priority__ WARNING: [autosummary] failed to import 'numpy.acorrelate': no module named numpy.acorrelate WARNING: [autosummary] failed to import 'numpy.distutils.misc_util.get_numarray_include_dirs': no module named numpy.distutils.misc_util.get_numarray_include_dirs WARNING: [autosummary] failed to import 'numpy.generic.__squeeze__': no module named numpy.generic.__squeeze__ building [html]: targets for 76 source files that are out of date updating environment: 996 added, 0 changed, 0 removed Exception occurred:[ 0%] reference/arrays.classeslass File "/home/mdroe/usr/lib/python2.5/site-packages/docutils/nodes.py", line 471, in __getitem__ return self.attributes[key] KeyError: 'numbered' The full traceback has been saved in /tmp/sphinx-err-QYFjBP.log, if you want to report the issue to the author. Please also report this if it was a user error, so that a better error message can be provided next time. Send reports to sphinx-dev at googlegroups.com. Thanks! make: *** [html] Error 1 -- Michael Droettboom Science Software Branch Operations and Engineering Division Space Telescope Science Institute Operated by AURA for NASA From sccolbert at gmail.com Fri Sep 25 12:35:50 2009 From: sccolbert at gmail.com (Chris Colbert) Date: Fri, 25 Sep 2009 12:35:50 -0400 Subject: [Numpy-discussion] numpy install script changes permissions? 
In-Reply-To: <3d375d730909250929k5aaa5087q876cbc3af1ebf9b1@mail.gmail.com> References: <7f014ea60909250701xd2b5cdcra976183cf92e06e4@mail.gmail.com> <4ABCD599.7050705@stsci.edu> <7f014ea60909250859j2731a9fdq3cf5743cb715b124@mail.gmail.com> <7f014ea60909250921sb2d4166l545b1f52fa3cccd6@mail.gmail.com> <7f014ea60909250923rdbf1e89m38752a1706efd54c@mail.gmail.com> <3d375d730909250929k5aaa5087q876cbc3af1ebf9b1@mail.gmail.com> Message-ID: <7f014ea60909250935j3ee0e1dcsd011f29ba2b99208@mail.gmail.com> you sir, have a very good point :) On Fri, Sep 25, 2009 at 12:29 PM, Robert Kern wrote: > On Fri, Sep 25, 2009 at 11:23, Chris Colbert wrote: >> Oh, and sorry if calling you Chuck was offensive, that's out of habit >> from a friend of mine named Charles. > > Since he signs his name "Chuck", I don't think he could reasonably get > offended by that. :-) > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > ?-- Umberto Eco > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From charlesr.harris at gmail.com Fri Sep 25 12:40:39 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 25 Sep 2009 10:40:39 -0600 Subject: [Numpy-discussion] numpy install script changes permissions? In-Reply-To: <3d375d730909250929k5aaa5087q876cbc3af1ebf9b1@mail.gmail.com> References: <7f014ea60909250701xd2b5cdcra976183cf92e06e4@mail.gmail.com> <4ABCD599.7050705@stsci.edu> <7f014ea60909250859j2731a9fdq3cf5743cb715b124@mail.gmail.com> <7f014ea60909250921sb2d4166l545b1f52fa3cccd6@mail.gmail.com> <7f014ea60909250923rdbf1e89m38752a1706efd54c@mail.gmail.com> <3d375d730909250929k5aaa5087q876cbc3af1ebf9b1@mail.gmail.com> Message-ID: On Fri, Sep 25, 2009 at 10:29 AM, Robert Kern wrote: > On Fri, Sep 25, 2009 at 11:23, Chris Colbert wrote: > > Oh, and sorry if calling you Chuck was offensive, that's out of habit > > from a friend of mine named Charles. > > Since he signs his name "Chuck", I don't think he could reasonably get > offended by that. :-) > > Yeah, but I'm trying really really hard to think of something unreasonable. It isn't as easy as it sounds. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From jsseabold at gmail.com Fri Sep 25 13:00:07 2009 From: jsseabold at gmail.com (Skipper Seabold) Date: Fri, 25 Sep 2009 13:00:07 -0400 Subject: [Numpy-discussion] Question about improving genfromtxt errors Message-ID: There have been some recent attempts to improve the error reporting in genfromtxt , which is great, because hunting down the problems reading in big and messy files is not fun. I am working on a patch that keeps up with the line number and column number of where you are in parsing the file, so that this can be reported in the error. Is there a way to catch a raised error and add to it? For instance, I have a problem in my file which leads to this error being raised from np.lib._iotools.StringCoverter.upgrade ValueError: Converter is locked and cannot be upgraded I added this into np.lib.io.genfromtxt around line 995. linenum = 0 [...] 
if dtype is None:
    try:
        colnum = 0
        for (converter, item) in zip(converters, values):
            converter.upgrade(item)
            colnum += 1
    except:
        raise ValueError, "I don't report the error from _iotools.StringConverter.upgrade, but I do know that there is a problem trying to convert a value at line %s and column %s" % (linenum,colnum)
[...]
linenum += 1

I'd like to add line and column number information to original error
from _iotools. Any suggestions?

Cheers,
Skipper

From arthbous at indiana.edu Fri Sep 25 13:17:00 2009
From: arthbous at indiana.edu (Arthur Bousquet)
Date: Fri, 25 Sep 2009 13:17:00 -0400
Subject: [Numpy-discussion] F2PY error : ... on platform 'posix' with 'gcc' compiler
In-Reply-To: <3d375d730909250927jae9cbb6re3739d8a8b6f84da@mail.gmail.com>
References: <6574ea980909250819q365a1951n3e6f21d8bb045a35@mail.gmail.com> <3d375d730909250824u6c7e6b8t5376ce2faef9feef@mail.gmail.com> <6574ea980909250848g98df329he3fc398fc23d3aaa@mail.gmail.com> <3d375d730909250859g18b591ees47511bc9ce69dcb0@mail.gmail.com> <6574ea980909250919n52de3151j328d3b4fad70d240@mail.gmail.com> <3d375d730909250927jae9cbb6re3739d8a8b6f84da@mail.gmail.com>
Message-ID: <6574ea980909251017xe531081rd43407b8d7978db9@mail.gmail.com>

$ grep isysroot

does not work on my computer.

Arthur

On Fri, Sep 25, 2009 at 12:27 PM, Robert Kern wrote: > On Fri, Sep 25, 2009 at 11:19, Arthur Bousquet > wrote: > > Here my full error : http://dl.getdropbox.com/u/541149/error_f2py.txt > > /Developer/SDKs/MacOSX10.4u.sdk/usr/include/stdarg.h:4:25: error: > stdarg.h: No such file or directory > > > I didn't think that MacOSX10.4u.sdk existed on Snow Leopard. The > "-isysroot /Developer/SDKs/MacOSX10.4u.sdk" flag should be coming from > Python itself. I don't know how you managed to build Python with those > flags. Check your configuration like so: > > $ grep isysroot > > /Library/Frameworks/Python.framework/Versions/Current/lib/python2.5/config/Makefile > BASECFLAGS= -arch ppc -arch i386 -isysroot > /Developer/SDKs/MacOSX10.4u.sdk -fno-strict-aliasing -Wno-long-double > -no-cpp-precomp -mno-fused-madd -fno-common -dynamic > LDFLAGS= -arch i386 -arch ppc -isysroot > /Developer/SDKs/MacOSX10.4u.sdk -g > -isysroot "${UNIVERSALSDK}" \ > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > -- Umberto Eco > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL:
From robert.kern at gmail.com Fri Sep 25 13:21:00 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Fri, 25 Sep 2009 12:21:00 -0500
Subject: [Numpy-discussion] F2PY error : ...
on platform 'posix' with 'gcc' compiler In-Reply-To: <6574ea980909251017xe531081rd43407b8d7978db9@mail.gmail.com> References: <6574ea980909250819q365a1951n3e6f21d8bb045a35@mail.gmail.com> <3d375d730909250824u6c7e6b8t5376ce2faef9feef@mail.gmail.com> <6574ea980909250848g98df329he3fc398fc23d3aaa@mail.gmail.com> <3d375d730909250859g18b591ees47511bc9ce69dcb0@mail.gmail.com> <6574ea980909250919n52de3151j328d3b4fad70d240@mail.gmail.com> <3d375d730909250927jae9cbb6re3739d8a8b6f84da@mail.gmail.com> <6574ea980909251017xe531081rd43407b8d7978db9@mail.gmail.com> Message-ID: <3d375d730909251021o747fc4cem31ccd5e0d5174007@mail.gmail.com> On Fri, Sep 25, 2009 at 12:17, Arthur Bousquet wrote: > $ grep isysroot > > does not work on my computer. What does that mean? Exactly what did you type and exactly what response did you get? Please, never just say that something "does not work". -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From arthbous at indiana.edu Fri Sep 25 13:24:17 2009 From: arthbous at indiana.edu (Arthur Bousquet) Date: Fri, 25 Sep 2009 13:24:17 -0400 Subject: [Numpy-discussion] F2PY error : ... on platform 'posix' with 'gcc' compiler In-Reply-To: <3d375d730909251021o747fc4cem31ccd5e0d5174007@mail.gmail.com> References: <6574ea980909250819q365a1951n3e6f21d8bb045a35@mail.gmail.com> <3d375d730909250824u6c7e6b8t5376ce2faef9feef@mail.gmail.com> <6574ea980909250848g98df329he3fc398fc23d3aaa@mail.gmail.com> <3d375d730909250859g18b591ees47511bc9ce69dcb0@mail.gmail.com> <6574ea980909250919n52de3151j328d3b4fad70d240@mail.gmail.com> <3d375d730909250927jae9cbb6re3739d8a8b6f84da@mail.gmail.com> <6574ea980909251017xe531081rd43407b8d7978db9@mail.gmail.com> <3d375d730909251021o747fc4cem31ccd5e0d5174007@mail.gmail.com> Message-ID: <6574ea980909251024h7cda6ed7rb2dbe499371cb111@mail.gmail.com> Sorry about that. I typed in the Terminal "grep isysroot" and it is like nothing is happening. The terminal is still working but does not give any response. Arthur On Fri, Sep 25, 2009 at 1:21 PM, Robert Kern wrote: > On Fri, Sep 25, 2009 at 12:17, Arthur Bousquet > wrote: > > $ grep isysroot > > > > does not work on my computer. > > What does that mean? Exactly what did you type and exactly what > response did you get? Please, never just say that something "does not > work". > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > -- Umberto Eco > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Fri Sep 25 13:27:26 2009 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 25 Sep 2009 12:27:26 -0500 Subject: [Numpy-discussion] F2PY error : ... 
on platform 'posix' with 'gcc' compiler In-Reply-To: <6574ea980909251024h7cda6ed7rb2dbe499371cb111@mail.gmail.com> References: <6574ea980909250819q365a1951n3e6f21d8bb045a35@mail.gmail.com> <3d375d730909250824u6c7e6b8t5376ce2faef9feef@mail.gmail.com> <6574ea980909250848g98df329he3fc398fc23d3aaa@mail.gmail.com> <3d375d730909250859g18b591ees47511bc9ce69dcb0@mail.gmail.com> <6574ea980909250919n52de3151j328d3b4fad70d240@mail.gmail.com> <3d375d730909250927jae9cbb6re3739d8a8b6f84da@mail.gmail.com> <6574ea980909251017xe531081rd43407b8d7978db9@mail.gmail.com> <3d375d730909251021o747fc4cem31ccd5e0d5174007@mail.gmail.com> <6574ea980909251024h7cda6ed7rb2dbe499371cb111@mail.gmail.com> Message-ID: <3d375d730909251027k316573e9l2a695852607fe3f4@mail.gmail.com> On Fri, Sep 25, 2009 at 12:24, Arthur Bousquet wrote: > Sorry about that. > > I typed in the Terminal "grep isysroot" and it is like nothing is happening. > The terminal is still working but does not give any response. Ah, sorry about that. The command got wordwrapped. Cancel that command using Ctrl-C. Type the following, all on one line: grep isysroot /Library/Frameworks/Python.framework/Versions/Current/lib/python2.5/config/Makefile -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From arthbous at indiana.edu Fri Sep 25 13:35:15 2009 From: arthbous at indiana.edu (Arthur Bousquet) Date: Fri, 25 Sep 2009 13:35:15 -0400 Subject: [Numpy-discussion] F2PY error : ... on platform 'posix' with 'gcc' compiler In-Reply-To: <3d375d730909251027k316573e9l2a695852607fe3f4@mail.gmail.com> References: <6574ea980909250819q365a1951n3e6f21d8bb045a35@mail.gmail.com> <3d375d730909250824u6c7e6b8t5376ce2faef9feef@mail.gmail.com> <6574ea980909250848g98df329he3fc398fc23d3aaa@mail.gmail.com> <3d375d730909250859g18b591ees47511bc9ce69dcb0@mail.gmail.com> <6574ea980909250919n52de3151j328d3b4fad70d240@mail.gmail.com> <3d375d730909250927jae9cbb6re3739d8a8b6f84da@mail.gmail.com> <6574ea980909251017xe531081rd43407b8d7978db9@mail.gmail.com> <3d375d730909251021o747fc4cem31ccd5e0d5174007@mail.gmail.com> <6574ea980909251024h7cda6ed7rb2dbe499371cb111@mail.gmail.com> <3d375d730909251027k316573e9l2a695852607fe3f4@mail.gmail.com> Message-ID: <6574ea980909251035m3924ca8diea3db84799189204@mail.gmail.com> Thank you. So here the command : Arths-MacBook-Pro:~ arthbous$ grep isysroot /Library/Frameworks/Python.framework/Versions/Current/lib/python2.6/config/Makefile BASECFLAGS= -arch ppc -arch i386 -isysroot /Developer/SDKs/MacOSX10.4u.sdk -fno-strict-aliasing -fPIC -fno-common -dynamic LDFLAGS= -arch ppc -arch i386 -isysroot /Developer/SDKs/MacOSX10.4u.sdk -isysroot "${UNIVERSALSDK}" \ But I have 2 pythons installed, the 2.6.1 from xcode of apple and the one I compiled myselft 2.6.2 in /usr/local/. Which one is in this Library ? Arthur On Fri, Sep 25, 2009 at 1:27 PM, Robert Kern wrote: > On Fri, Sep 25, 2009 at 12:24, Arthur Bousquet > wrote: > > Sorry about that. > > > > I typed in the Terminal "grep isysroot" and it is like nothing is > happening. > > The terminal is still working but does not give any response. > > Ah, sorry about that. The command got wordwrapped. Cancel that command > using Ctrl-C. 
Type the following, all on one line: > > grep isysroot > /Library/Frameworks/Python.framework/Versions/Current/lib/python2.5/config/Makefile > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > -- Umberto Eco > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mpi at comxnet.dk Fri Sep 25 13:45:35 2009 From: mpi at comxnet.dk (Mads Ipsen) Date: Fri, 25 Sep 2009 19:45:35 +0200 Subject: [Numpy-discussion] Tuple outer product? Message-ID: <4ABD01BF.5080405@comxnet.dk> Is there a numpy operation on two arrays, say [1,2,3] and [4,5,6], that will yield: [[(1,4),(1,5),(1,6)],[(2,4),(2,5),(2,6)],[(3,4),(3,5),(3,6)]] Any suggestions are most welcome. Mads -- +------------------------------------------------------------+ | Mads Ipsen, Scientific developer | +------------------------------+-----------------------------+ | QuantumWise A/S | phone: +45-29716388 | | N?rres?gade 27A | www: www.quantumwise.com | | DK-1370 Copenhagen, Denmark | email: mpi at quantumwise.com | +------------------------------+-----------------------------+ From robert.kern at gmail.com Fri Sep 25 13:51:37 2009 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 25 Sep 2009 12:51:37 -0500 Subject: [Numpy-discussion] F2PY error : ... on platform 'posix' with 'gcc' compiler In-Reply-To: <6574ea980909251035m3924ca8diea3db84799189204@mail.gmail.com> References: <6574ea980909250819q365a1951n3e6f21d8bb045a35@mail.gmail.com> <6574ea980909250848g98df329he3fc398fc23d3aaa@mail.gmail.com> <3d375d730909250859g18b591ees47511bc9ce69dcb0@mail.gmail.com> <6574ea980909250919n52de3151j328d3b4fad70d240@mail.gmail.com> <3d375d730909250927jae9cbb6re3739d8a8b6f84da@mail.gmail.com> <6574ea980909251017xe531081rd43407b8d7978db9@mail.gmail.com> <3d375d730909251021o747fc4cem31ccd5e0d5174007@mail.gmail.com> <6574ea980909251024h7cda6ed7rb2dbe499371cb111@mail.gmail.com> <3d375d730909251027k316573e9l2a695852607fe3f4@mail.gmail.com> <6574ea980909251035m3924ca8diea3db84799189204@mail.gmail.com> Message-ID: <3d375d730909251051j61a9ee90t871c6bba37027d90@mail.gmail.com> On Fri, Sep 25, 2009 at 12:35, Arthur Bousquet wrote: > Thank you. So here the command : > > Arths-MacBook-Pro:~ arthbous$ grep isysroot > /Library/Frameworks/Python.framework/Versions/Current/lib/python2.6/config/Makefile > BASECFLAGS=??? -arch ppc -arch i386 -isysroot > /Developer/SDKs/MacOSX10.4u.sdk? -fno-strict-aliasing -fPIC -fno-common > -dynamic > LDFLAGS=??? -arch ppc -arch i386 -isysroot /Developer/SDKs/MacOSX10.4u.sdk > ??? ??? ??? -isysroot "${UNIVERSALSDK}" \ > > But I have 2 pythons installed, the 2.6.1 from xcode of apple and the one I > compiled myselft 2.6.2 in /usr/local/. Which one is in this Library ? Not the latter. I think you may be confused about the former. I don't believe that Python comes from Xcode. The one from OS X is in /System/Library/Frameworks/.... This one looks like a third one installed from some other source, like binaries from www.python.org. Exactly which python executable is f2py using? Check the first line in the f2py2.6 script. 
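For instance (the path below is only a placeholder; point it at wherever your
f2py2.6 script actually lives), a quick check from Python would be:

    import sys

    print sys.executable                          # the interpreter you are running right now
    print open('/path/to/f2py2.6').readline()     # the #! line, i.e. the interpreter f2py asks for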
-- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From arthbous at indiana.edu Fri Sep 25 13:55:41 2009 From: arthbous at indiana.edu (Arthur Bousquet) Date: Fri, 25 Sep 2009 13:55:41 -0400 Subject: [Numpy-discussion] F2PY error : ... on platform 'posix' with 'gcc' compiler In-Reply-To: <3d375d730909251051j61a9ee90t871c6bba37027d90@mail.gmail.com> References: <6574ea980909250819q365a1951n3e6f21d8bb045a35@mail.gmail.com> <3d375d730909250859g18b591ees47511bc9ce69dcb0@mail.gmail.com> <6574ea980909250919n52de3151j328d3b4fad70d240@mail.gmail.com> <3d375d730909250927jae9cbb6re3739d8a8b6f84da@mail.gmail.com> <6574ea980909251017xe531081rd43407b8d7978db9@mail.gmail.com> <3d375d730909251021o747fc4cem31ccd5e0d5174007@mail.gmail.com> <6574ea980909251024h7cda6ed7rb2dbe499371cb111@mail.gmail.com> <3d375d730909251027k316573e9l2a695852607fe3f4@mail.gmail.com> <6574ea980909251035m3924ca8diea3db84799189204@mail.gmail.com> <3d375d730909251051j61a9ee90t871c6bba37027d90@mail.gmail.com> Message-ID: <6574ea980909251055s4c2bca5dw7da2ddaebcb4cfd@mail.gmail.com> The first line of /usr/local/f2py2.6 is : #!/usr/bin/env python2.6 # See http://cens.ioc.ee/projects/f2py2e/ - Arthur On Fri, Sep 25, 2009 at 1:51 PM, Robert Kern wrote: > On Fri, Sep 25, 2009 at 12:35, Arthur Bousquet > wrote: > > Thank you. So here the command : > > > > Arths-MacBook-Pro:~ arthbous$ grep isysroot > > > /Library/Frameworks/Python.framework/Versions/Current/lib/python2.6/config/Makefile > > BASECFLAGS= -arch ppc -arch i386 -isysroot > > /Developer/SDKs/MacOSX10.4u.sdk -fno-strict-aliasing -fPIC -fno-common > > -dynamic > > LDFLAGS= -arch ppc -arch i386 -isysroot > /Developer/SDKs/MacOSX10.4u.sdk > > -isysroot "${UNIVERSALSDK}" \ > > > > But I have 2 pythons installed, the 2.6.1 from xcode of apple and the one > I > > compiled myselft 2.6.2 in /usr/local/. Which one is in this Library ? > > Not the latter. I think you may be confused about the former. I don't > believe that Python comes from Xcode. The one from OS X is in > /System/Library/Frameworks/.... This one looks like a third one > installed from some other source, like binaries from www.python.org. > > Exactly which python executable is f2py using? Check the first line in > the f2py2.6 script. > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > -- Umberto Eco > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Fri Sep 25 13:57:55 2009 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 25 Sep 2009 12:57:55 -0500 Subject: [Numpy-discussion] F2PY error : ... 
on platform 'posix' with 'gcc' compiler In-Reply-To: <6574ea980909251055s4c2bca5dw7da2ddaebcb4cfd@mail.gmail.com> References: <6574ea980909250819q365a1951n3e6f21d8bb045a35@mail.gmail.com> <6574ea980909250919n52de3151j328d3b4fad70d240@mail.gmail.com> <3d375d730909250927jae9cbb6re3739d8a8b6f84da@mail.gmail.com> <6574ea980909251017xe531081rd43407b8d7978db9@mail.gmail.com> <3d375d730909251021o747fc4cem31ccd5e0d5174007@mail.gmail.com> <6574ea980909251024h7cda6ed7rb2dbe499371cb111@mail.gmail.com> <3d375d730909251027k316573e9l2a695852607fe3f4@mail.gmail.com> <6574ea980909251035m3924ca8diea3db84799189204@mail.gmail.com> <3d375d730909251051j61a9ee90t871c6bba37027d90@mail.gmail.com> <6574ea980909251055s4c2bca5dw7da2ddaebcb4cfd@mail.gmail.com> Message-ID: <3d375d730909251057w46a36fe0v88a5169e322dd037@mail.gmail.com> On Fri, Sep 25, 2009 at 12:55, Arthur Bousquet wrote: > The first line of /usr/local/f2py2.6 is : > > #!/usr/bin/env python2.6 > # See http://cens.ioc.ee/projects/f2py2e/ And exactly which python2.6 executable do you get when you type $ python 2.6 ? Try printing sys.executable: $ python Python 2.5.4 (r254:67916, Apr 23 2009, 14:49:51) [GCC 4.0.1 (Apple Inc. build 5465)] on darwin Type "help", "copyright", "credits" or "license" for more information. >>> import sys >>> sys.executable '/Library/Frameworks/Python.framework/Versions/2.5/Resources/PythonApp.app/Contents/MacOS/PythonApp' -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From gokhansever at gmail.com Fri Sep 25 13:58:37 2009 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_Sever?=) Date: Fri, 25 Sep 2009 12:58:37 -0500 Subject: [Numpy-discussion] Tuple outer product? In-Reply-To: <4ABD01BF.5080405@comxnet.dk> References: <4ABD01BF.5080405@comxnet.dk> Message-ID: <49d6b3500909251058h2c77094bp5cb1875a27863ff5@mail.gmail.com> On Fri, Sep 25, 2009 at 12:45 PM, Mads Ipsen wrote: > Is there a numpy operation on two arrays, say [1,2,3] and [4,5,6], that > will yield: > > [[(1,4),(1,5),(1,6)],[(2,4),(2,5),(2,6)],[(3,4),(3,5),(3,6)]] > > Any suggestions are most welcome. > > Mads > > I don't know if there is a function in numpy library, but it is a simple one isn't it? ipython --pylab I[1]: a = array([1,2,3]) I[2]: b = array([4,5,6]) I[3]: [zip(ones(a.size, dtype=int)*a[i], b) for i in range(len(a))] O[3]: [[(1, 4), (1, 5), (1, 6)], [(2, 4), (2, 5), (2, 6)], [(3, 4), (3, 5), (3, 6)]] > > -- > +------------------------------------------------------------+ > | Mads Ipsen, Scientific developer | > +------------------------------+-----------------------------+ > | QuantumWise A/S | phone: +45-29716388 | > | N?rres?gade 27A | www: www.quantumwise.com | > | DK-1370 Copenhagen, Denmark | email: mpi at quantumwise.com | > +------------------------------+-----------------------------+ > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- G?khan -------------- next part -------------- An HTML attachment was scrubbed... URL: From hetland at tamu.edu Fri Sep 25 13:48:49 2009 From: hetland at tamu.edu (Rob Hetland) Date: Fri, 25 Sep 2009 12:48:49 -0500 Subject: [Numpy-discussion] how to specify cflags from within setup.py? Message-ID: <762EC506-EE38-4F73-AC3E-306B18797DDF@tamu.edu> I have an external module that I am trying to compile as part of a package. 
It does not like to be compiled with -O3 optimization, and requires -O2 at most. I have tried the few things I know how to do, like setting the extra_compile_args, and trying to change the CFLAGS, but these do not seem to work. Is there a way to explicitly specify compiler flags from within setup.py? -Rob ---- Rob Hetland, Associate Professor Dept. of Oceanography, Texas A&M University http://pong.tamu.edu/~rob phone: 979-458-0096, fax: 979-845-6331 From ralf.gommers at googlemail.com Fri Sep 25 14:00:25 2009 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Fri, 25 Sep 2009 14:00:25 -0400 Subject: [Numpy-discussion] Question about improving genfromtxt errors In-Reply-To: References: Message-ID: On Fri, Sep 25, 2009 at 1:00 PM, Skipper Seabold wrote: > There have been some recent attempts to improve the error reporting in > genfromtxt , which is > great, because hunting down the problems reading in big and messy > files is not fun. > > I am working on a patch that keeps up with the line number and column > number of where you are in parsing the file, so that this can be > reported in the error. Is there a way to catch a raised error and add > to it? > > For instance, I have a problem in my file which leads to this error > being raised from np.lib._iotools.StringCoverter.upgrade > > ValueError: Converter is locked and cannot be upgraded > > I added this into np.lib.io.genfromtxt around line 995. > > linenum = 0 > [...] > if dtype is None: > try: > colnum = 0 > for (converter, item) in zip(converters, values): > converter.upgrade(item) > colnum += 1 > except: > raise ValueError, "I don't report the error from > _iotools.StringConverter.upgrade, but I do know that there is a > problem trying to convert a value at line %s and column %s" % > (linenum,colnum) > [...] > linenum += 1 > > I'd like to add line and column number information to original error > from _iotools. Any suggestions? > There is no good way to edit the message of the original exception instance, as explained here: http://blog.ianbicking.org/2007/09/12/re-raising-exceptions/ Probably the easiest for your purpose is this: def divbyzero(): return 1/0 try: a = divbyzero() except ZeroDivisionError as err: print 'problem occurred at line X' raise err or if you want to catch any error: try: yourcode() except: print 'problem occurred at line X' raise Maybe better to use a logger instead of print, but you get the idea. Cheers, Ralf > Cheers, > Skipper > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsouthey at gmail.com Fri Sep 25 14:03:02 2009 From: bsouthey at gmail.com (Bruce Southey) Date: Fri, 25 Sep 2009 13:03:02 -0500 Subject: [Numpy-discussion] Question about improving genfromtxt errors In-Reply-To: References: Message-ID: <4ABD05D6.6020706@gmail.com> On 09/25/2009 12:00 PM, Skipper Seabold wrote: > There have been some recent attempts to improve the error reporting in > genfromtxt, which is > great, because hunting down the problems reading in big and messy > files is not fun. > > I am working on a patch that keeps up with the line number and column > number of where you are in parsing the file, so that this can be > reported in the error. Is there a way to catch a raised error and add > to it? 
> > For instance, I have a problem in my file which leads to this error > being raised from np.lib._iotools.StringCoverter.upgrade > > ValueError: Converter is locked and cannot be upgraded > > I added this into np.lib.io.genfromtxt around line 995. > > linenum = 0 > [...] > if dtype is None: > try: > colnum = 0 > for (converter, item) in zip(converters, values): > converter.upgrade(item) > colnum += 1 > except: > raise ValueError, "I don't report the error from > _iotools.StringConverter.upgrade, but I do know that there is a > problem trying to convert a value at line %s and column %s" % > (linenum,colnum) > [...] > linenum += 1 > > I'd like to add line and column number information to original error > from _iotools. Any suggestions? > > Cheers, > Skipper > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > Hi, I am guessing that the converter is most likely causing the error. So presumably the file is read correctly without using the converter. If not then you should address that first. If it is an input file, then what is it and the how is genfromtxt called? A question regarding the ticket than this, why do you want to raise an exception? The reason I did not do it was that it was helpful to identify all of the lines that have a specific problem. You can not assume that a user will fix all lines with this problem let alone fix all lines with similar problems. Bruce From aisaac at american.edu Fri Sep 25 14:02:55 2009 From: aisaac at american.edu (Alan G Isaac) Date: Fri, 25 Sep 2009 14:02:55 -0400 Subject: [Numpy-discussion] Tuple outer product? In-Reply-To: <4ABD01BF.5080405@comxnet.dk> References: <4ABD01BF.5080405@comxnet.dk> Message-ID: <4ABD05CF.8060102@american.edu> On 9/25/2009 1:45 PM, Mads Ipsen wrote: > Is there a numpy operation on two arrays, say [1,2,3] and [4,5,6], that > will yield: > > [[(1,4),(1,5),(1,6)],[(2,4),(2,5),(2,6)],[(3,4),(3,5),(3,6)]] >>> from itertools import product >>> list(product([1,2,3],[4,5,6])) [(1, 4), (1, 5), (1, 6), (2, 4), (2, 5), (2, 6), (3, 4), (3, 5), (3, 6)] Alan Isaac From arthbous at indiana.edu Fri Sep 25 14:02:53 2009 From: arthbous at indiana.edu (Arthur Bousquet) Date: Fri, 25 Sep 2009 14:02:53 -0400 Subject: [Numpy-discussion] F2PY error : ... on platform 'posix' with 'gcc' compiler In-Reply-To: <3d375d730909251057w46a36fe0v88a5169e322dd037@mail.gmail.com> References: <6574ea980909250819q365a1951n3e6f21d8bb045a35@mail.gmail.com> <3d375d730909250927jae9cbb6re3739d8a8b6f84da@mail.gmail.com> <6574ea980909251017xe531081rd43407b8d7978db9@mail.gmail.com> <3d375d730909251021o747fc4cem31ccd5e0d5174007@mail.gmail.com> <6574ea980909251024h7cda6ed7rb2dbe499371cb111@mail.gmail.com> <3d375d730909251027k316573e9l2a695852607fe3f4@mail.gmail.com> <6574ea980909251035m3924ca8diea3db84799189204@mail.gmail.com> <3d375d730909251051j61a9ee90t871c6bba37027d90@mail.gmail.com> <6574ea980909251055s4c2bca5dw7da2ddaebcb4cfd@mail.gmail.com> <3d375d730909251057w46a36fe0v88a5169e322dd037@mail.gmail.com> Message-ID: <6574ea980909251102g39944ca9j9ac76fa53d50af95@mail.gmail.com> I got : Arths-MacBook-Pro:~ arthbous$ python2.6 ActivePython 2.6.2.2 (ActiveState Software Inc.) based on Python 2.6.2 (r262:71600, Apr 24 2009, 21:40:46) [GCC 4.0.1 (Apple Computer, Inc. build 5250)] on darwin Type "help", "copyright", "credits" or "license" for more information. 
>>> import sys >>> sys.executable '/Library/Frameworks/Python.framework/Versions/2.6/Resources/Python.app/Contents/MacOS/Python' This is weird because when I compiled numpy, I used "/usr/local/bin/python2.6 setup.py build" and "sudo /usr/local/bin/python2.6 setup.py install". How can I get rid of all the other python ? - Arthur On Fri, Sep 25, 2009 at 1:57 PM, Robert Kern wrote: > On Fri, Sep 25, 2009 at 12:55, Arthur Bousquet > wrote: > > The first line of /usr/local/f2py2.6 is : > > > > #!/usr/bin/env python2.6 > > # See http://cens.ioc.ee/projects/f2py2e/ > > And exactly which python2.6 executable do you get when you type > > $ python 2.6 > > ? Try printing sys.executable: > > $ python > Python 2.5.4 (r254:67916, Apr 23 2009, 14:49:51) > [GCC 4.0.1 (Apple Inc. build 5465)] on darwin > Type "help", "copyright", "credits" or "license" for more information. > >>> import sys > >>> sys.executable > > '/Library/Frameworks/Python.framework/Versions/2.5/Resources/PythonApp.app/Contents/MacOS/PythonApp' > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > -- Umberto Eco > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mpi at comxnet.dk Fri Sep 25 14:05:04 2009 From: mpi at comxnet.dk (Mads Ipsen) Date: Fri, 25 Sep 2009 20:05:04 +0200 Subject: [Numpy-discussion] Tuple outer product? In-Reply-To: <49d6b3500909251058h2c77094bp5cb1875a27863ff5@mail.gmail.com> References: <4ABD01BF.5080405@comxnet.dk> <49d6b3500909251058h2c77094bp5cb1875a27863ff5@mail.gmail.com> Message-ID: <4ABD0650.4040109@comxnet.dk> G?khan Sever wrote: > > > On Fri, Sep 25, 2009 at 12:45 PM, Mads Ipsen > wrote: > > Is there a numpy operation on two arrays, say [1,2,3] and [4,5,6], > that > will yield: > > [[(1,4),(1,5),(1,6)],[(2,4),(2,5),(2,6)],[(3,4),(3,5),(3,6)]] > > Any suggestions are most welcome. > > Mads > > > > I don't know if there is a function in numpy library, but it is a > simple one isn't it? > > ipython --pylab > > I[1]: a = array([1,2,3]) > > I[2]: b = array([4,5,6]) > > I[3]: [zip(ones(a.size, dtype=int)*a[i], b) for i in range(len(a))] > O[3]: [[(1, 4), (1, 5), (1, 6)], [(2, 4), (2, 5), (2, 6)], [(3, 4), > (3, 5), (3, 6)]] > > > > > -- > +------------------------------------------------------------+ > | Mads Ipsen, Scientific developer | > +------------------------------+-----------------------------+ > | QuantumWise A/S | phone: +45-29716388 | > | N?rres?gade 27A | www: www.quantumwise.com > | > | DK-1370 Copenhagen, Denmark | email: mpi at quantumwise.com > | > +------------------------------+-----------------------------+ > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > -- > G?khan > ------------------------------------------------------------------------ > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > Sure, and you could do the same thing using two nested for loops. But its bound to be slow when the arrays become large (and they will). The question is whether it can be done in pure numpy. 
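Maybe something along these lines would do it with plain broadcasting (just a
sketch, I have not checked it against large arrays):

    import numpy as np

    a = np.array([1, 2, 3])
    b = np.array([4, 5, 6])

    # Build a (len(a), len(b), 2) array where pairs[i, j] == [a[i], b[j]]
    pairs = np.empty((len(a), len(b), 2), dtype=a.dtype)
    pairs[..., 0] = a[:, np.newaxis]   # first member of pair (i, j) is a[i]
    pairs[..., 1] = b[np.newaxis, :]   # second member of pair (i, j) is b[j]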
An output like [[[1,4],[1,5],[1,6]],[[2,4],[2,5],[2,6]],[[3,4],[3,5],[3,6]]] is also OK. Mads -- +------------------------------------------------------------+ | Mads Ipsen, Scientific developer | +------------------------------+-----------------------------+ | QuantumWise A/S | phone: +45-29716388 | | N?rres?gade 27A | www: www.quantumwise.com | | DK-1370 Copenhagen, Denmark | email: mpi at quantumwise.com | +------------------------------+-----------------------------+ From jsseabold at gmail.com Fri Sep 25 14:08:00 2009 From: jsseabold at gmail.com (Skipper Seabold) Date: Fri, 25 Sep 2009 14:08:00 -0400 Subject: [Numpy-discussion] Question about improving genfromtxt errors In-Reply-To: References: Message-ID: On Fri, Sep 25, 2009 at 2:00 PM, Ralf Gommers wrote: > > > On Fri, Sep 25, 2009 at 1:00 PM, Skipper Seabold > wrote: >> >> There have been some recent attempts to improve the error reporting in >> genfromtxt , which is >> great, because hunting down the problems reading in big and messy >> files is not fun. >> >> I am working on a patch that keeps up with the line number and column >> number of where you are in parsing the file, so that this can be >> reported in the error. ?Is there a way to catch a raised error and add >> to it? >> >> For instance, I have a problem in my file which leads to this error >> being raised from np.lib._iotools.StringCoverter.upgrade >> >> ValueError: Converter is locked and cannot be upgraded >> >> I added this into np.lib.io.genfromtxt around line 995. >> >> linenum = 0 >> [...] >> if dtype is None: >> ? ? ? ? ? ?try: >> ? ? ? ? ? ? ? ?colnum = 0 >> ? ? ? ? ? ? ? ?for (converter, item) in zip(converters, values): >> ? ? ? ? ? ? ? ? ? ?converter.upgrade(item) >> ? ? ? ? ? ? ? ? ? ?colnum += 1 >> ? ? ? ? ? ?except: >> ? ? ? ? ? ? ? ?raise ValueError, "I don't report the error from >> _iotools.StringConverter.upgrade, but I do know that there is a >> problem trying to convert a value at line %s and column %s" % >> (linenum,colnum) >> [...] >> linenum += 1 >> >> I'd like to add line and column number information to original error >> from _iotools. ?Any suggestions? > > There is no good way to edit the message of the original exception instance, > as explained here: > http://blog.ianbicking.org/2007/09/12/re-raising-exceptions/ > > Probably the easiest for your purpose is this: > > def divbyzero(): > ??? return 1/0 > > try: > ??? a = divbyzero() > except ZeroDivisionError as err: > ??? print 'problem occurred at line X' > ??? raise err > > or if you want to catch any error: > > try: > ??? yourcode() > except: > ??? print 'problem occurred at line X' > ??? raise > > > Maybe better to use a logger instead of print, but you get the idea. > Ok thanks. Using a logger might be a good idea. Skipper From jsseabold at gmail.com Fri Sep 25 14:16:15 2009 From: jsseabold at gmail.com (Skipper Seabold) Date: Fri, 25 Sep 2009 14:16:15 -0400 Subject: [Numpy-discussion] Question about improving genfromtxt errors In-Reply-To: <4ABD05D6.6020706@gmail.com> References: <4ABD05D6.6020706@gmail.com> Message-ID: On Fri, Sep 25, 2009 at 2:03 PM, Bruce Southey wrote: > On 09/25/2009 12:00 PM, Skipper Seabold wrote: >> There have been some recent attempts to improve the error reporting in >> genfromtxt, which is >> great, because hunting down the problems reading in big and messy >> files is not fun. >> >> I am working on a patch that keeps up with the line number and column >> number of where you are in parsing the file, so that this can be >> reported in the error. 
?Is there a way to catch a raised error and add >> to it? >> >> For instance, I have a problem in my file which leads to this error >> being raised from np.lib._iotools.StringCoverter.upgrade >> >> ValueError: Converter is locked and cannot be upgraded >> >> I added this into np.lib.io.genfromtxt around line 995. >> >> linenum = 0 >> [...] >> if dtype is None: >> ? ? ? ? ? ? ?try: >> ? ? ? ? ? ? ? ? ?colnum = 0 >> ? ? ? ? ? ? ? ? ?for (converter, item) in zip(converters, values): >> ? ? ? ? ? ? ? ? ? ? ?converter.upgrade(item) >> ? ? ? ? ? ? ? ? ? ? ?colnum += 1 >> ? ? ? ? ? ? ?except: >> ? ? ? ? ? ? ? ? ?raise ValueError, "I don't report the error from >> _iotools.StringConverter.upgrade, but I do know that there is a >> problem trying to convert a value at line %s and column %s" % >> (linenum,colnum) >> [...] >> linenum += 1 >> >> I'd like to add line and column number information to original error >> from _iotools. ?Any suggestions? >> >> Cheers, >> Skipper >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > Hi, > I am guessing that the converter is most likely causing the error. So > presumably the file is read correctly without using the converter. If > not then you should address that first. > > If it is an input file, then what is it and the how is genfromtxt called? > The converter is certainly causing the error, I just need to know where so then I can change it as appropriate. I've had to define 100s of converters for qualitative data/survey responses (loops doing most of the work), so I'd rather just know where the converter is failing and then I make the changes with the column number that is spit out in the error. Using your changes and my added exception has saved me so much time that you have no idea compared with what I was doing. > > A question regarding the ticket than this, why do you want to raise an > exception? > > The reason I did not do it was that it was helpful to identify all of > the lines that have a specific problem. You can not assume that a user > will fix all lines with this problem let alone fix all lines with > similar problems. Good point. This would save me even more time. Perhaps then a logger is the way to go rather than print. My understanding is that print statements aren't allowed in numpy. Is this correct? Once I'm done, I will post my changes to the ticket and we can discuss some more. Skipper From pgmdevlist at gmail.com Fri Sep 25 14:17:58 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Fri, 25 Sep 2009 14:17:58 -0400 Subject: [Numpy-discussion] Question about improving genfromtxt errors In-Reply-To: <4ABD05D6.6020706@gmail.com> References: <4ABD05D6.6020706@gmail.com> Message-ID: <4D357173-C408-4AF4-8FFD-C77363915D13@gmail.com> Sorry all, I haven't been as respondent as I wished lately... * About the patch: I don't like the idea of adding yet some other tests in the main loop. I was more into letting things like they are, but calling some error function if some 'setting an array element with a sequence' exception is raised. This function would take 'rows' as an input and would check the length of each row. That way, we don't slow things down when everything works, but just add some delay when they don't. I'll try to come up w/ something soon (in the next couple of weeks). * About the converter error: there's indeed a bug in StringConverter.upgrade, I need to write some unittests to make sure I get it covered. 
If you could get me some sample code, that'd be great. From robert.kern at gmail.com Fri Sep 25 14:22:25 2009 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 25 Sep 2009 13:22:25 -0500 Subject: [Numpy-discussion] F2PY error : ... on platform 'posix' with 'gcc' compiler In-Reply-To: <6574ea980909251102g39944ca9j9ac76fa53d50af95@mail.gmail.com> References: <6574ea980909250819q365a1951n3e6f21d8bb045a35@mail.gmail.com> <6574ea980909251017xe531081rd43407b8d7978db9@mail.gmail.com> <3d375d730909251021o747fc4cem31ccd5e0d5174007@mail.gmail.com> <6574ea980909251024h7cda6ed7rb2dbe499371cb111@mail.gmail.com> <3d375d730909251027k316573e9l2a695852607fe3f4@mail.gmail.com> <6574ea980909251035m3924ca8diea3db84799189204@mail.gmail.com> <3d375d730909251051j61a9ee90t871c6bba37027d90@mail.gmail.com> <6574ea980909251055s4c2bca5dw7da2ddaebcb4cfd@mail.gmail.com> <3d375d730909251057w46a36fe0v88a5169e322dd037@mail.gmail.com> <6574ea980909251102g39944ca9j9ac76fa53d50af95@mail.gmail.com> Message-ID: <3d375d730909251122w5991aacid2ddd8097a3a2c49@mail.gmail.com> On Fri, Sep 25, 2009 at 13:02, Arthur Bousquet wrote: > I got : > > Arths-MacBook-Pro:~ arthbous$ python2.6 > ActivePython 2.6.2.2 (ActiveState Software Inc.) based on > Python 2.6.2 (r262:71600, Apr 24 2009, 21:40:46) > [GCC 4.0.1 (Apple Computer, Inc. build 5250)] on darwin > Type "help", "copyright", "credits" or "license" for more information. >>>> import sys >>>> sys.executable > '/Library/Frameworks/Python.framework/Versions/2.6/Resources/Python.app/Contents/MacOS/Python' > > This is weird because when I compiled numpy, I used > "/usr/local/bin/python2.6 setup.py build" and > "sudo /usr/local/bin/python2.6 setup.py install". > How can I get rid of all the other python ? Be aware that the framework version of Python will make the symbolic link /usr/local/bin/python2.6 to point to its own executable. If you are sure that you do not need ActiveState's Python, delete the directory /Library/Frameworks/Python.framework/ and delete /usr/local/bin/python and /usr/local/bin/python2.6 and re-install your version into /usr/local again. However, I do recommend using a framework build of Python in order to be the most compatible with everything. In that case, delete the things above and also delete the Python files from /usr/local/ and build your Python using --enable-framework. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From jsseabold at gmail.com Fri Sep 25 14:25:22 2009 From: jsseabold at gmail.com (Skipper Seabold) Date: Fri, 25 Sep 2009 14:25:22 -0400 Subject: [Numpy-discussion] Question about improving genfromtxt errors In-Reply-To: <4D357173-C408-4AF4-8FFD-C77363915D13@gmail.com> References: <4ABD05D6.6020706@gmail.com> <4D357173-C408-4AF4-8FFD-C77363915D13@gmail.com> Message-ID: On Fri, Sep 25, 2009 at 2:17 PM, Pierre GM wrote: > Sorry all, I haven't been as respondent as I wished lately... > * About the patch: I don't like the idea of adding yet some other > tests in the main loop. I was more into letting things like they are, > but calling some error function if some 'setting an array element with > a sequence' exception is raised. This function would take 'rows' as an > input and would check the length of each row. That way, we don't slow > things down when everything works, but just add some delay when they > don't. I'll try to come up w/ something soon (in the next couple of > weeks). 
Ok. > * About the converter error: there's indeed a bug in > StringConverter.upgrade, I need to write some unittests to make sure I > get it covered. If you could get me some sample code, that'd be great. Hmm, I'm not sure that the error I'm seeing is the same as the bug we had previously discussed. In this case, the converters are wrong and I need to know about it. I will try to post an example of the two times I've seen this error raised when I get a minute. Skipper From gokhansever at gmail.com Fri Sep 25 14:36:02 2009 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_Sever?=) Date: Fri, 25 Sep 2009 13:36:02 -0500 Subject: [Numpy-discussion] Tuple outer product? In-Reply-To: <4ABD0650.4040109@comxnet.dk> References: <4ABD01BF.5080405@comxnet.dk> <49d6b3500909251058h2c77094bp5cb1875a27863ff5@mail.gmail.com> <4ABD0650.4040109@comxnet.dk> Message-ID: <49d6b3500909251136u29ad067eoaec427bb6d1be2b9@mail.gmail.com> On Fri, Sep 25, 2009 at 1:05 PM, Mads Ipsen wrote: > G?khan Sever wrote: > > > > > > On Fri, Sep 25, 2009 at 12:45 PM, Mads Ipsen > > wrote: > > > > Is there a numpy operation on two arrays, say [1,2,3] and [4,5,6], > > that > > will yield: > > > > [[(1,4),(1,5),(1,6)],[(2,4),(2,5),(2,6)],[(3,4),(3,5),(3,6)]] > > > > Any suggestions are most welcome. > > > > Mads > > > > > > > > I don't know if there is a function in numpy library, but it is a > > simple one isn't it? > > > > ipython --pylab > > > > I[1]: a = array([1,2,3]) > > > > I[2]: b = array([4,5,6]) > > > > I[3]: [zip(ones(a.size, dtype=int)*a[i], b) for i in range(len(a))] > > O[3]: [[(1, 4), (1, 5), (1, 6)], [(2, 4), (2, 5), (2, 6)], [(3, 4), > > (3, 5), (3, 6)]] > > > > > > > > > > -- > > +------------------------------------------------------------+ > > | Mads Ipsen, Scientific developer | > > +------------------------------+-----------------------------+ > > | QuantumWise A/S | phone: +45-29716388 | > > | N?rres?gade 27A | www: www.quantumwise.com > > | > > | DK-1370 Copenhagen, Denmark | email: mpi at quantumwise.com > > | > > +------------------------------+-----------------------------+ > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > > > > -- > > G?khan > > ------------------------------------------------------------------------ > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > Sure, and you could do the same thing using two nested for loops. But > its bound to be slow when the arrays become large (and they will). The > question is whether it can be done in pure numpy. An output like > > [[[1,4],[1,5],[1,6]],[[2,4],[2,5],[2,6]],[[3,4],[3,5],[3,6]]] > > is also OK. 
> > Mads > > You are right, it takes a lot of time when the arrays get bigger: I[1]: a = arange(1000) I[2]: b = arange(1000) I[3]: %time [zip(ones(a.size, dtype=int)*a[i], b) for i in range(len(a))]; CPU times: user 1.73 s, sys: 0.06 s, total: 1.79 s Wall time: 1.86 s I[5]: b = arange(2000) I[6]: a = arange(2000) I[7]: %time [zip(ones(a.size, dtype=int)*a[i], b) for i in range(len(a))]; CPU times: user 12.45 s, sys: 0.15 s, total: 12.61 s Wall time: 13.23 s I[9]: b = arange(3000) I[10]: a = arange(3000) I[11]: %time [zip(ones(a.size, dtype=int)*a[i], b) for i in range(len(a))]; CPU times: user 49.29 s, sys: 0.32 s, total: 49.60 s Wall time: 51.25 s I[13]: from itertools import product I[14]: %time list(product(a,b)); CPU times: user 45.74 s, sys: 0.11 s, total: 45.84 s Wall time: 48.26 s To me, this is a very nice case to include Cython in your code. However you might need to seek further advice in this :) > > -- > +------------------------------------------------------------+ > | Mads Ipsen, Scientific developer | > +------------------------------+-----------------------------+ > | QuantumWise A/S | phone: +45-29716388 | > | N?rres?gade 27A | www: www.quantumwise.com | > | DK-1370 Copenhagen, Denmark | email: mpi at quantumwise.com | > +------------------------------+-----------------------------+ > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- G?khan -------------- next part -------------- An HTML attachment was scrubbed... URL: From arthbous at indiana.edu Fri Sep 25 14:43:21 2009 From: arthbous at indiana.edu (Arthur Bousquet) Date: Fri, 25 Sep 2009 14:43:21 -0400 Subject: [Numpy-discussion] F2PY error : ... on platform 'posix' with 'gcc' compiler In-Reply-To: <3d375d730909251122w5991aacid2ddd8097a3a2c49@mail.gmail.com> References: <6574ea980909250819q365a1951n3e6f21d8bb045a35@mail.gmail.com> <3d375d730909251021o747fc4cem31ccd5e0d5174007@mail.gmail.com> <6574ea980909251024h7cda6ed7rb2dbe499371cb111@mail.gmail.com> <3d375d730909251027k316573e9l2a695852607fe3f4@mail.gmail.com> <6574ea980909251035m3924ca8diea3db84799189204@mail.gmail.com> <3d375d730909251051j61a9ee90t871c6bba37027d90@mail.gmail.com> <6574ea980909251055s4c2bca5dw7da2ddaebcb4cfd@mail.gmail.com> <3d375d730909251057w46a36fe0v88a5169e322dd037@mail.gmail.com> <6574ea980909251102g39944ca9j9ac76fa53d50af95@mail.gmail.com> <3d375d730909251122w5991aacid2ddd8097a3a2c49@mail.gmail.com> Message-ID: <6574ea980909251143w917e25bob0563265ff41030f@mail.gmail.com> Thank you a lot for your help. So I deleted as you said, but then I do with --enable-framework for python, at the make there is an error : ln -fsn 2.6 Python.framework/Versions/Current ln -fsn Versions/Current/Python Python.framework/Python ln -fsn Versions/Current/Headers Python.framework/Headers ln -fsn Versions/Current/Resources Python.framework/Resources gcc -u _PyMac_Error Python.framework/Versions/2.6/Python -o python.exe \ Modules/python.o \ -ldl ld: warning: in Python.framework/Versions/2.6/Python, file is not of required architecture Undefined symbols: "_PyMac_Error", referenced from: "_Py_Main", referenced from: _main in python.o ld: symbol(s) not found collect2: ld returned 1 exit status make: *** [python.exe] Error 1 Do you know why I have this ? And also does "export MACOSX_DEPLOYMENT_TARGET=10.6" is necessary ? 
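A generic diagnostic that often helps with this kind of problem, quite apart
from the framework build itself, is to check from inside the interpreter which
architecture the Python you are actually running was built for -- this is just
the pointer width, nothing Snow-Leopard-specific:

import struct, platform

print platform.platform()          # OS / machine description
print struct.calcsize('P') * 8     # pointer width of this interpreter: 32 or 64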
- Arthur On Fri, Sep 25, 2009 at 2:22 PM, Robert Kern wrote: > On Fri, Sep 25, 2009 at 13:02, Arthur Bousquet > wrote: > > I got : > > > > Arths-MacBook-Pro:~ arthbous$ python2.6 > > ActivePython 2.6.2.2 (ActiveState Software Inc.) based on > > Python 2.6.2 (r262:71600, Apr 24 2009, 21:40:46) > > [GCC 4.0.1 (Apple Computer, Inc. build 5250)] on darwin > > Type "help", "copyright", "credits" or "license" for more information. > >>>> import sys > >>>> sys.executable > > > '/Library/Frameworks/Python.framework/Versions/2.6/Resources/Python.app/Contents/MacOS/Python' > > > > This is weird because when I compiled numpy, I used > > "/usr/local/bin/python2.6 setup.py build" and > > "sudo /usr/local/bin/python2.6 setup.py install". > > How can I get rid of all the other python ? > > Be aware that the framework version of Python will make the symbolic > link /usr/local/bin/python2.6 to point to its own executable. If you > are sure that you do not need ActiveState's Python, delete the > directory /Library/Frameworks/Python.framework/ and delete > /usr/local/bin/python and /usr/local/bin/python2.6 and re-install your > version into /usr/local again. > > However, I do recommend using a framework build of Python in order to > be the most compatible with everything. In that case, delete the > things above and also delete the Python files from /usr/local/ and > build your Python using --enable-framework. > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > -- Umberto Eco > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From kwgoodman at gmail.com Fri Sep 25 14:53:39 2009 From: kwgoodman at gmail.com (Keith Goodman) Date: Fri, 25 Sep 2009 11:53:39 -0700 Subject: [Numpy-discussion] Tuple outer product? In-Reply-To: <4ABD0650.4040109@comxnet.dk> References: <4ABD01BF.5080405@comxnet.dk> <49d6b3500909251058h2c77094bp5cb1875a27863ff5@mail.gmail.com> <4ABD0650.4040109@comxnet.dk> Message-ID: On Fri, Sep 25, 2009 at 11:05 AM, Mads Ipsen wrote: > Sure, and you could do the same thing using two nested for loops. But > its bound to be slow when the arrays become large (and they will). The > question is whether it can be done in pure numpy. An output like > > [[[1,4],[1,5],[1,6]],[[2,4],[2,5],[2,6]],[[3,4],[3,5],[3,6]]] > > is also OK. This doesn't work. But maybe someone can make it work: >> x = np.array([1,2,3]) >> y = np.array([4,5,6]) >> >> a = np.ones((3,3,2), dtype=np.int) >> a[:,:,0] = x >> a[:,:,1] = y >> a.tolist() [[[1, 4], [2, 5], [3, 6]], [[1, 4], [2, 5], [3, 6]], [[1, 4], [2, 5], [3, 6]]] From robert.kern at gmail.com Fri Sep 25 14:53:37 2009 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 25 Sep 2009 13:53:37 -0500 Subject: [Numpy-discussion] F2PY error : ... 
on platform 'posix' with 'gcc' compiler In-Reply-To: <6574ea980909251143w917e25bob0563265ff41030f@mail.gmail.com> References: <6574ea980909250819q365a1951n3e6f21d8bb045a35@mail.gmail.com> <6574ea980909251024h7cda6ed7rb2dbe499371cb111@mail.gmail.com> <3d375d730909251027k316573e9l2a695852607fe3f4@mail.gmail.com> <6574ea980909251035m3924ca8diea3db84799189204@mail.gmail.com> <3d375d730909251051j61a9ee90t871c6bba37027d90@mail.gmail.com> <6574ea980909251055s4c2bca5dw7da2ddaebcb4cfd@mail.gmail.com> <3d375d730909251057w46a36fe0v88a5169e322dd037@mail.gmail.com> <6574ea980909251102g39944ca9j9ac76fa53d50af95@mail.gmail.com> <3d375d730909251122w5991aacid2ddd8097a3a2c49@mail.gmail.com> <6574ea980909251143w917e25bob0563265ff41030f@mail.gmail.com> Message-ID: <3d375d730909251153t219b38c3oeb3b1ccaf6cc6bd9@mail.gmail.com> On Fri, Sep 25, 2009 at 13:43, Arthur Bousquet wrote: > Thank you a lot for your help. So I deleted as you said, but then I do with > --enable-framework for python, at the make there is an error : > > ln -fsn 2.6 Python.framework/Versions/Current > ln -fsn Versions/Current/Python Python.framework/Python > ln -fsn Versions/Current/Headers Python.framework/Headers > ln -fsn Versions/Current/Resources Python.framework/Resources > gcc? -u _PyMac_Error Python.framework/Versions/2.6/Python -o python.exe \ > ??? ??? ??? Modules/python.o \ > ??? ??? ??? ?-ldl > ld: warning: in Python.framework/Versions/2.6/Python, file is not of > required architecture > Undefined symbols: > ? "_PyMac_Error", referenced from: > ? "_Py_Main", referenced from: > ????? _main in python.o > ld: symbol(s) not found > collect2: ld returned 1 exit status > make: *** [python.exe] Error 1 > > Do you know why I have this ? You may need some patches from Apple in order to compile Python for the (now-default) x86_64 CPU architecture. There may be configure flags to only use x86, though. You will have to check the --help and read the documentation. I'm afraid that is about all I know on the subject; I have not upgraded to Snow Leopard, yet, or attempted to build Python on it. > And also does "export > MACOSX_DEPLOYMENT_TARGET=10.6" is necessary ? Quite possibly. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From ndbecker2 at gmail.com Fri Sep 25 14:57:15 2009 From: ndbecker2 at gmail.com (Neal Becker) Date: Fri, 25 Sep 2009 14:57:15 -0400 Subject: [Numpy-discussion] fixed-point arithmetic References: <5b8d13220909210803h36401c9amcd1ff38625d31d1f@mail.gmail.com> <3d375d730909210946j51e71bc5ud62dcd7ccb4581f1@mail.gmail.com> <3d375d730909211009h5090a648v71142155e0d972aa@mail.gmail.com> <3d375d730909211115n627fd5bav6c869e3f40632505@mail.gmail.com> Message-ID: Robert Kern wrote: > On Mon, Sep 21, 2009 at 12:39, Neal Becker wrote: > >> 1. Where would I find this new datetime dtype? > > It's in the SVN trunk. > >> 2. Don't know exactly what 'parameterized' dtypes are. Does this mean >> that the dtype for 8.1 format fixed-pt is different from the dtype for >> 6.2 format, for example? > > Yes. The dtype code letter is the same, but the dtype object has > metadata attached to it in the form of a dictionary. The ufunc loops > get references to the array objects and will look at the dtype > metadata in order to figure out exactly what to do. > I'm still a bit confused. The dtype is a PyTypeObject? 
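As a rough picture of what "metadata attached to the dtype" means at the
Python level: in later NumPy releases (at the time of this thread the
machinery was only in the SVN trunk) the dtype constructor accepts a metadata
dict, so two fixed-point formats can share the same storage type and code
letter while carrying different parameters. A sketch with made-up keys:

import numpy as np

# Two hypothetical fixed-point formats: same int16 storage, different metadata.
q8_1 = np.dtype(np.int16, metadata={'int_bits': 8, 'frac_bits': 1})
q6_2 = np.dtype(np.int16, metadata={'int_bits': 6, 'frac_bits': 2})

print q8_1.str        # '<i2' on a little-endian machine, for both formats
print q8_1.metadata   # the parameters attached to this particular dtype
print q6_2.metadata

NumPy only attaches the dict; a fixed-point ufunc loop would have to look the
values up itself, as described above.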
It almost sounds like you're saying that the dtype is an PyObject instance, rather than a TypeObject. But IIUC, you say it is a PyTypeObject, but one with an extract dict attached (which kindof acts like an instance). From josef.pktd at gmail.com Fri Sep 25 14:59:10 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 25 Sep 2009 14:59:10 -0400 Subject: [Numpy-discussion] Tuple outer product? In-Reply-To: References: <4ABD01BF.5080405@comxnet.dk> <49d6b3500909251058h2c77094bp5cb1875a27863ff5@mail.gmail.com> <4ABD0650.4040109@comxnet.dk> Message-ID: <1cd32cbb0909251159m1754aabelb3a9f363efbcb609@mail.gmail.com> On Fri, Sep 25, 2009 at 2:53 PM, Keith Goodman wrote: > On Fri, Sep 25, 2009 at 11:05 AM, Mads Ipsen wrote: >> Sure, and you could do the same thing using two nested for loops. But >> its bound to be slow when the arrays become large (and they will). The >> question is whether it can be done in pure numpy. An output like >> >> [[[1,4],[1,5],[1,6]],[[2,4],[2,5],[2,6]],[[3,4],[3,5],[3,6]]] >> >> is also OK. > > This doesn't work. But maybe someone can make it work: > >>> x = np.array([1,2,3]) >>> y = np.array([4,5,6]) >>> >>> a = np.ones((3,3,2), dtype=np.int) >>> a[:,:,0] = x >>> a[:,:,1] = y >>> a.tolist() > ? [[[1, 4], [2, 5], [3, 6]], [[1, 4], [2, 5], [3, 6]], [[1, 4], [2, > 5], [3, 6]]] and if you really want tuples something like this might work >>> np.dstack(np.meshgrid([1,2,3],[4,5,6])).T.reshape(2,-1).T.copy().view([('',np.int32)]*2).reshape(3,-1).tolist() [[(1, 4), (1, 5), (1, 6)], [(2, 4), (2, 5), (2, 6)], [(3, 4), (3, 5), (3, 6)]] by trial and error, I'm not sure if every step is necessary and works in general Josef > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From kwgoodman at gmail.com Fri Sep 25 14:59:11 2009 From: kwgoodman at gmail.com (Keith Goodman) Date: Fri, 25 Sep 2009 11:59:11 -0700 Subject: [Numpy-discussion] Tuple outer product? In-Reply-To: References: <4ABD01BF.5080405@comxnet.dk> <49d6b3500909251058h2c77094bp5cb1875a27863ff5@mail.gmail.com> <4ABD0650.4040109@comxnet.dk> Message-ID: On Fri, Sep 25, 2009 at 11:53 AM, Keith Goodman wrote: > On Fri, Sep 25, 2009 at 11:05 AM, Mads Ipsen wrote: >> Sure, and you could do the same thing using two nested for loops. But >> its bound to be slow when the arrays become large (and they will). The >> question is whether it can be done in pure numpy. An output like >> >> [[[1,4],[1,5],[1,6]],[[2,4],[2,5],[2,6]],[[3,4],[3,5],[3,6]]] >> >> is also OK. > > This doesn't work. But maybe someone can make it work: > >>> x = np.array([1,2,3]) >>> y = np.array([4,5,6]) >>> >>> a = np.ones((3,3,2), dtype=np.int) >>> a[:,:,0] = x >>> a[:,:,1] = y >>> a.tolist() > ? [[[1, 4], [2, 5], [3, 6]], [[1, 4], [2, 5], [3, 6]], [[1, 4], [2, > 5], [3, 6]]] How about this? 
>> x = np.array([1,2,3]) >> y = np.array([4,5,6]) >> >> a = np.ones((3,3,2), dtype=np.int) >> a[:,:,0] = x >> a.T[1] = y >> a.tolist() [[[1, 4], [2, 4], [3, 4]], [[1, 5], [2, 5], [3, 5]], [[1, 6], [2, 6], [3, 6]]] From robert.kern at gmail.com Fri Sep 25 15:04:42 2009 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 25 Sep 2009 14:04:42 -0500 Subject: [Numpy-discussion] fixed-point arithmetic In-Reply-To: References: <5b8d13220909210803h36401c9amcd1ff38625d31d1f@mail.gmail.com> <3d375d730909210946j51e71bc5ud62dcd7ccb4581f1@mail.gmail.com> <3d375d730909211009h5090a648v71142155e0d972aa@mail.gmail.com> <3d375d730909211115n627fd5bav6c869e3f40632505@mail.gmail.com> Message-ID: <3d375d730909251204lc0e0389r1823da946fa0f194@mail.gmail.com> On Fri, Sep 25, 2009 at 13:57, Neal Becker wrote: > I'm still a bit confused. ?The dtype is a PyTypeObject? No, dtypes are PyArrayDescr's in C. >?It almost sounds > like you're saying that the dtype is an PyObject instance, rather than a > TypeObject. Yup. >?But IIUC, you say it is a PyTypeObject, but one with an extract > dict attached (which kindof acts like an instance). In order to create the scalar objects, you would make a PyTypeObject that will create the appropriate PyArrayDescr with the metadata dict. The scalar types, like np.uint8, are not dtypes themselves. The constructor for your PyTypeObject will have to be a little different than the others, since it will also have to accept the metadata that it will use to create the dtype object. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From Chris.Barker at noaa.gov Fri Sep 25 15:08:29 2009 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Fri, 25 Sep 2009 12:08:29 -0700 Subject: [Numpy-discussion] Question about improving genfromtxt errors In-Reply-To: References: Message-ID: <4ABD152D.9010506@noaa.gov> Ralf Gommers wrote: > Probably the easiest for your purpose is this: > > def divbyzero(): > return 1/0 > > try: > a = divbyzero() > except ZeroDivisionError as err: > print 'problem occurred at line X' > raise err I get an error with this syntax -- is thin 2.6 only? In [10]: run error.py error.py:9: Warning: 'as' will become a reserved keyword in Python 2.6 ------------------------------------------------------------ File "error.py", line 9 except ZeroDivisionError as err: ^ SyntaxError: invalid syntax > print 'problem occurred at line X' > raise > > Maybe better to use a logger instead of print, but you get the idea. definitely don't print! a lib function should never print (unless maybe with a debug flag or something set). I don't know if there is a standard logging approach you could use. I'd rather see info added to the Exception, or a new Exception raised with info. Now, another option. It seems in this case that you know what Exception(s) you are trying to catch, and you want to add some information to the message. If you don't need to keep the old traceback, you can do something like: try: 4/0 except ZeroDivisionError, err: raise Exception("A new error with old message also: %s"%err.message) It doesn't appear to be possible(well, at least not easy!) to add or change the message on an existing exception and then re-raise it (kind of makes me want mutable strings) I suspect that for this use, this would suffice, what the user really wants to know is where in their file the error occurred, not where in the converting it occurred. 
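For what it's worth, the original traceback can be kept while still adding
context by using Python 2's three-argument raise; a small sketch of the
pattern (the line/column numbers are placeholders):

import sys

try:
    float("N/A")
except ValueError, err:
    tb = sys.exc_info()[2]
    # Re-raise the same exception class with an augmented message, but keep
    # the traceback that points at the original failure.
    raise ValueError, "%s (line #45, column #10)" % err, tb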
This assumes that the converting code puts useful messages in the errors. Otherwise, there is info in the traceback module, so you can get more by doing this: import traceback try: 4/0 except ZeroDivisionError, err: line, col = 45, 10 raise Exception(traceback.format_exc()+"\error took place at line: %i, column %i\n"%(line, col)) -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From arthbous at indiana.edu Fri Sep 25 15:08:47 2009 From: arthbous at indiana.edu (Arthur Bousquet) Date: Fri, 25 Sep 2009 15:08:47 -0400 Subject: [Numpy-discussion] F2PY error : ... on platform 'posix' with 'gcc' compiler In-Reply-To: <3d375d730909251153t219b38c3oeb3b1ccaf6cc6bd9@mail.gmail.com> References: <6574ea980909250819q365a1951n3e6f21d8bb045a35@mail.gmail.com> <3d375d730909251027k316573e9l2a695852607fe3f4@mail.gmail.com> <6574ea980909251035m3924ca8diea3db84799189204@mail.gmail.com> <3d375d730909251051j61a9ee90t871c6bba37027d90@mail.gmail.com> <6574ea980909251055s4c2bca5dw7da2ddaebcb4cfd@mail.gmail.com> <3d375d730909251057w46a36fe0v88a5169e322dd037@mail.gmail.com> <6574ea980909251102g39944ca9j9ac76fa53d50af95@mail.gmail.com> <3d375d730909251122w5991aacid2ddd8097a3a2c49@mail.gmail.com> <6574ea980909251143w917e25bob0563265ff41030f@mail.gmail.com> <3d375d730909251153t219b38c3oeb3b1ccaf6cc6bd9@mail.gmail.com> Message-ID: <6574ea980909251208u56702fd5we960b3b7edde1cdb@mail.gmail.com> Ok, so I re-compiled python (without framework), numpy,... So I can compile my lib (libsw) now, bu I can't called it from python. I got this error : Arths-MacBook-Pro:run_1 arthbous$ ./run.sh Traceback (most recent call last): File "../main.py", line 23, in import libsw as lsw ImportError: dlopen(/Users/arthbous/Documents/SW/sw-v2.3/libsw.so, 2): no suitable image found. Did find: /Users/arthbous/Documents/SW/sw-v2.3/libsw.so: mach-o, but wrong architecture It seems to compile for the wrong architecture. Do you know about this ? Thank you. - Arthur On Fri, Sep 25, 2009 at 2:53 PM, Robert Kern wrote: > On Fri, Sep 25, 2009 at 13:43, Arthur Bousquet > wrote: > > Thank you a lot for your help. So I deleted as you said, but then I do > with > > --enable-framework for python, at the make there is an error : > > > > ln -fsn 2.6 Python.framework/Versions/Current > > ln -fsn Versions/Current/Python Python.framework/Python > > ln -fsn Versions/Current/Headers Python.framework/Headers > > ln -fsn Versions/Current/Resources Python.framework/Resources > > gcc -u _PyMac_Error Python.framework/Versions/2.6/Python -o python.exe \ > > Modules/python.o \ > > -ldl > > ld: warning: in Python.framework/Versions/2.6/Python, file is not of > > required architecture > > Undefined symbols: > > "_PyMac_Error", referenced from: > > "_Py_Main", referenced from: > > _main in python.o > > ld: symbol(s) not found > > collect2: ld returned 1 exit status > > make: *** [python.exe] Error 1 > > > > Do you know why I have this ? > > You may need some patches from Apple in order to compile Python for > the (now-default) x86_64 CPU architecture. There may be configure > flags to only use x86, though. You will have to check the --help and > read the documentation. I'm afraid that is about all I know on the > subject; I have not upgraded to Snow Leopard, yet, or attempted to > build Python on it. > > > And also does "export > > MACOSX_DEPLOYMENT_TARGET=10.6" is necessary ? > > Quite possibly. 
> > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > -- Umberto Eco > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Chris.Barker at noaa.gov Fri Sep 25 15:12:14 2009 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Fri, 25 Sep 2009 12:12:14 -0700 Subject: [Numpy-discussion] Question about improving genfromtxt errors In-Reply-To: <4D357173-C408-4AF4-8FFD-C77363915D13@gmail.com> References: <4ABD05D6.6020706@gmail.com> <4D357173-C408-4AF4-8FFD-C77363915D13@gmail.com> Message-ID: <4ABD160E.1060503@noaa.gov> Pierre GM wrote: > That way, we don't slow > things down when everything works, but just add some delay when they > don't. good goal, but if you don't keep track of where you are, wouldn't you need to re-parse the whole file to figure it out again? Maybe a "debug" mode that the user could turn on and off would fit the need. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From aisaac at american.edu Fri Sep 25 15:19:55 2009 From: aisaac at american.edu (Alan G Isaac) Date: Fri, 25 Sep 2009 15:19:55 -0400 Subject: [Numpy-discussion] Tuple outer product? In-Reply-To: References: <4ABD01BF.5080405@comxnet.dk> <49d6b3500909251058h2c77094bp5cb1875a27863ff5@mail.gmail.com> <4ABD0650.4040109@comxnet.dk> Message-ID: <4ABD17DB.7000801@american.edu> I do not see what is wrong with itertools.product, but if you hate it, you can use numpy.meshgrid: >>> np.array(np.meshgrid([1,2,3],[4,5,6])).transpose() array([[[1, 4], [1, 5], [1, 6]], [[2, 4], [2, 5], [2, 6]], [[3, 4], [3, 5], [3, 6]]]) Alan Isaac From ralf.gommers at googlemail.com Fri Sep 25 15:28:19 2009 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Fri, 25 Sep 2009 15:28:19 -0400 Subject: [Numpy-discussion] Question about improving genfromtxt errors In-Reply-To: <4ABD152D.9010506@noaa.gov> References: <4ABD152D.9010506@noaa.gov> Message-ID: On Fri, Sep 25, 2009 at 3:08 PM, Christopher Barker wrote: > Ralf Gommers wrote: > > Probably the easiest for your purpose is this: > > > > def divbyzero(): > > return 1/0 > > > > try: > > a = divbyzero() > > except ZeroDivisionError as err: > > print 'problem occurred at line X' > > raise err > > I get an error with this syntax -- is thin 2.6 only? > > Yes sorry, that's the 2.6/3.0 syntax. It should be except ZeroDivisionError, err: Anyway, the instance is not needed in my example because it can't be usefully modified and can be re-raised with a bare "raise". > > > > Maybe better to use a logger instead of print, but you get the idea. > > definitely don't print! a lib function should never print (unless maybe > with a debug flag or something set). I don't know if there is a standard > logging approach you could use. > > I'd rather see info added to the Exception, or a new Exception raised > with info. > The former is not possible, which motivated my example. The latter loses the original traceback. Logging seems to be the way to go if you want all the info. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... 
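In case it is useful, the standard library logging module makes the "no print
in library code" rule easy to follow: the library only emits records, and the
application decides whether and where they show up. A minimal sketch (the
logger name is made up for illustration; nothing in numpy currently sets one
up like this):

import logging

logger = logging.getLogger("genfromtxt")

def report_bad_row(linenum, nbcols, expected):
    # A library-side warning record instead of a print statement.
    logger.warning("line #%i has %i columns instead of %i",
                   linenum, nbcols, expected)

# On the application side:
logging.basicConfig(level=logging.WARNING)
report_bad_row(12, 3, 5)
# -> WARNING:genfromtxt:line #12 has 3 columns instead of 5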
URL: From robert.kern at gmail.com Fri Sep 25 15:30:15 2009 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 25 Sep 2009 14:30:15 -0500 Subject: [Numpy-discussion] F2PY error : ... on platform 'posix' with 'gcc' compiler In-Reply-To: <6574ea980909251208u56702fd5we960b3b7edde1cdb@mail.gmail.com> References: <6574ea980909250819q365a1951n3e6f21d8bb045a35@mail.gmail.com> <6574ea980909251035m3924ca8diea3db84799189204@mail.gmail.com> <3d375d730909251051j61a9ee90t871c6bba37027d90@mail.gmail.com> <6574ea980909251055s4c2bca5dw7da2ddaebcb4cfd@mail.gmail.com> <3d375d730909251057w46a36fe0v88a5169e322dd037@mail.gmail.com> <6574ea980909251102g39944ca9j9ac76fa53d50af95@mail.gmail.com> <3d375d730909251122w5991aacid2ddd8097a3a2c49@mail.gmail.com> <6574ea980909251143w917e25bob0563265ff41030f@mail.gmail.com> <3d375d730909251153t219b38c3oeb3b1ccaf6cc6bd9@mail.gmail.com> <6574ea980909251208u56702fd5we960b3b7edde1cdb@mail.gmail.com> Message-ID: <3d375d730909251230g74da35c4vd70ae9adb2b523c8@mail.gmail.com> On Fri, Sep 25, 2009 at 14:08, Arthur Bousquet wrote: > Ok, so I re-compiled python (without framework), numpy,... So I can compile > my lib (libsw) now, bu I can't called it from python. I got this error : > > Arths-MacBook-Pro:run_1 arthbous$ ./run.sh > Traceback (most recent call last): > ? File "../main.py", line 23, in > ??? import libsw as lsw > ImportError: dlopen(/Users/arthbous/Documents/SW/sw-v2.3/libsw.so, 2): no > suitable image found.? Did find: > ??? /Users/arthbous/Documents/SW/sw-v2.3/libsw.so: mach-o, but wrong > architecture > > It seems to compile for the wrong architecture. Do you know about this ? Find the architecture of libsw.so using file: $ file /Users/arthbous/Documents/SW/sw-v2.3/libsw.so It is likely that it is only x86 while your Python is running using the x86_64 architecture. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From robert.kern at gmail.com Fri Sep 25 15:33:21 2009 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 25 Sep 2009 14:33:21 -0500 Subject: [Numpy-discussion] Tuple outer product? In-Reply-To: <4ABD17DB.7000801@american.edu> References: <4ABD01BF.5080405@comxnet.dk> <49d6b3500909251058h2c77094bp5cb1875a27863ff5@mail.gmail.com> <4ABD0650.4040109@comxnet.dk> <4ABD17DB.7000801@american.edu> Message-ID: <3d375d730909251233o78291a3ah1b6ebcd1a9ba003c@mail.gmail.com> On Fri, Sep 25, 2009 at 14:19, Alan G Isaac wrote: > I do not see what is wrong with itertools.product, > but if you hate it, you can use numpy.meshgrid: > >>>> np.array(np.meshgrid([1,2,3],[4,5,6])).transpose() > array([[[1, 4], > ? ? ? ? [1, 5], > ? ? ? ? [1, 6]], > > ? ? ? ?[[2, 4], > ? ? ? ? [2, 5], > ? ? ? ? [2, 6]], > > ? ? ? ?[[3, 4], > ? ? ? ? [3, 5], > ? ? ? ? [3, 6]]]) If you need more than two item sets, or are using Python 2.5: import numpy as np def cartesian_product(*items): items = map(np.asarray, items) lengths = map(len, items) n = np.arange(np.product(lengths)) results = [] for i in range(-1, -len(items)-1, -1): j = n % lengths[i] results.insert(0, items[i][j]) n -= j n //= lengths[i] results = np.column_stack(results) results.shape = tuple(lengths + [len(items)]) return results The final shape manipulations are, of course, optional. 
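A quick sanity check of the function above on the arrays from this thread
(doctest-style; the logic is plain indexing so it should behave the same
across numpy versions, but treat it as a sketch):

>>> cartesian_product([1, 2, 3], [4, 5, 6]).tolist()
[[[1, 4], [1, 5], [1, 6]], [[2, 4], [2, 5], [2, 6]], [[3, 4], [3, 5], [3, 6]]]
>>> cartesian_product([1, 2], [3, 4], [5, 6]).shape  # more than two item sets
(2, 2, 2, 3)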
-- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From bsouthey at gmail.com Fri Sep 25 15:34:45 2009 From: bsouthey at gmail.com (Bruce Southey) Date: Fri, 25 Sep 2009 14:34:45 -0500 Subject: [Numpy-discussion] Question about improving genfromtxt errors In-Reply-To: References: <4ABD05D6.6020706@gmail.com> <4D357173-C408-4AF4-8FFD-C77363915D13@gmail.com> Message-ID: <4ABD1B55.7050705@gmail.com> On 09/25/2009 01:25 PM, Skipper Seabold wrote: > On Fri, Sep 25, 2009 at 2:17 PM, Pierre GM wrote: > >> Sorry all, I haven't been as respondent as I wished lately... >> * About the patch: I don't like the idea of adding yet some other >> tests in the main loop. I was more into letting things like they are, >> but calling some error function if some 'setting an array element with >> a sequence' exception is raised. This function would take 'rows' as an >> input and would check the length of each row. That way, we don't slow >> things down when everything works, but just add some delay when they >> don't. I'll try to come up w/ something soon (in the next couple of >> weeks). >> > Ok. > I tend to agree but I think that the actual array() function give a more meaningful error about mismatched data such as indicating the row. I think that it would be too late to go back to the data and try to figure out why the exception occurred. If you wait until array() is called then you have not used at least two opportunities to check the whole data.. The data is parsed at least twice, the first is the itertools.chain loop and the second is the subsequent enumeration over rows - lines 981 and 1006 of the unpatched io.py). Really it is a question of how useful the messages are and if (or when) genfromtxt should stop on an error. For a huge data set I can see that stopping on an error is useful because it avoids parsing all the data. But listing all the errors is also useful especially when you can fix all the errors at once. >> * About the converter error: there's indeed a bug in >> StringConverter.upgrade, I need to write some unittests to make sure I >> get it covered. If you could get me some sample code, that'd be great. >> > Hmm, I'm not sure that the error I'm seeing is the same as the bug we > had previously discussed. In this case, the converters are wrong and > I need to know about it. I will try to post an example of the two > times I've seen this error raised when I get a minute. > > Skipper > _______________________________________________ > Please! Samples of using it would be great. Bruce From pgmdevlist at gmail.com Fri Sep 25 15:39:56 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Fri, 25 Sep 2009 15:39:56 -0400 Subject: [Numpy-discussion] Question about improving genfromtxt errors In-Reply-To: <4ABD160E.1060503@noaa.gov> References: <4ABD05D6.6020706@gmail.com> <4D357173-C408-4AF4-8FFD-C77363915D13@gmail.com> <4ABD160E.1060503@noaa.gov> Message-ID: <57D14F44-04C0-4A50-B49D-E9CA2094F165@gmail.com> On Sep 25, 2009, at 3:12 PM, Christopher Barker wrote: > Pierre GM wrote: >> That way, we don't slow >> things down when everything works, but just add some delay when they >> don't. > > good goal, but if you don't keep track of where you are, wouldn't you > need to re-parse the whole file to figure it out again? Indeed. But does it really matter ? We're in a case where there's a problem already... 
> Maybe a "debug" mode that the user could turn on and off would fit > the need. Not a bad idea. Another option would be to give the user the possibility to skip the offending lines: * we already know what number of columns to expect (nbcols) * we check whether the current row has the correct nb of columns * if it doesn't match, we skip or raise an exception with the corresponding line number. But even if we skip, we need to log the the line number to tell the user that there was a problem (issuing a warning ?) From jsseabold at gmail.com Fri Sep 25 15:42:45 2009 From: jsseabold at gmail.com (Skipper Seabold) Date: Fri, 25 Sep 2009 15:42:45 -0400 Subject: [Numpy-discussion] Question about improving genfromtxt errors In-Reply-To: <4ABD1B55.7050705@gmail.com> References: <4ABD05D6.6020706@gmail.com> <4D357173-C408-4AF4-8FFD-C77363915D13@gmail.com> <4ABD1B55.7050705@gmail.com> Message-ID: On Fri, Sep 25, 2009 at 3:34 PM, Bruce Southey wrote: >>> * About the converter error: there's indeed a bug in >>> StringConverter.upgrade, I need to write some unittests to make sure I >>> get it covered. If you could get me some sample code, that'd be great. >>> >> Hmm, I'm not sure that the error I'm seeing is the same as the bug we >> had previously discussed. ?In this case, the converters are wrong and >> I need to know about it. ?I will try to post an example of the two >> times I've seen this error raised when I get a minute. >> >> Skipper >> _______________________________________________ >> > Please! > Samples of using it would be great. > As far as this goes, I added some examples to the docs wiki, but I think that genfromtxt and related would be best served by having their own wiki page that could maybe go here Thoughts? I can work on it as I find time. Also while I'm thinking about it, I filed an enhancement ticket and patch to use the autostrip keyword to get rid of whitespace in strings Skipper From arthbous at indiana.edu Fri Sep 25 15:44:24 2009 From: arthbous at indiana.edu (Arthur Bousquet) Date: Fri, 25 Sep 2009 15:44:24 -0400 Subject: [Numpy-discussion] F2PY error : ... on platform 'posix' with 'gcc' compiler In-Reply-To: <3d375d730909251230g74da35c4vd70ae9adb2b523c8@mail.gmail.com> References: <6574ea980909250819q365a1951n3e6f21d8bb045a35@mail.gmail.com> <3d375d730909251051j61a9ee90t871c6bba37027d90@mail.gmail.com> <6574ea980909251055s4c2bca5dw7da2ddaebcb4cfd@mail.gmail.com> <3d375d730909251057w46a36fe0v88a5169e322dd037@mail.gmail.com> <6574ea980909251102g39944ca9j9ac76fa53d50af95@mail.gmail.com> <3d375d730909251122w5991aacid2ddd8097a3a2c49@mail.gmail.com> <6574ea980909251143w917e25bob0563265ff41030f@mail.gmail.com> <3d375d730909251153t219b38c3oeb3b1ccaf6cc6bd9@mail.gmail.com> <6574ea980909251208u56702fd5we960b3b7edde1cdb@mail.gmail.com> <3d375d730909251230g74da35c4vd70ae9adb2b523c8@mail.gmail.com> Message-ID: <6574ea980909251244m6cb642e4q1c8655598eead51e@mail.gmail.com> $ file libsw.so gives : libsw.so: Mach-O bundle i386 How can I change it ? Is it because of my gfortran ? Thanks. - Arthur On Fri, Sep 25, 2009 at 3:30 PM, Robert Kern wrote: > On Fri, Sep 25, 2009 at 14:08, Arthur Bousquet > wrote: > > Ok, so I re-compiled python (without framework), numpy,... So I can > compile > > my lib (libsw) now, bu I can't called it from python. 
I got this error : > > > > Arths-MacBook-Pro:run_1 arthbous$ ./run.sh > > Traceback (most recent call last): > > File "../main.py", line 23, in > > import libsw as lsw > > ImportError: dlopen(/Users/arthbous/Documents/SW/sw-v2.3/libsw.so, 2): no > > suitable image found. Did find: > > /Users/arthbous/Documents/SW/sw-v2.3/libsw.so: mach-o, but wrong > > architecture > > > > It seems to compile for the wrong architecture. Do you know about this ? > > Find the architecture of libsw.so using file: > > $ file /Users/arthbous/Documents/SW/sw-v2.3/libsw.so > > It is likely that it is only x86 while your Python is running using > the x86_64 architecture. > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > -- Umberto Eco > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Fri Sep 25 15:46:28 2009 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 25 Sep 2009 14:46:28 -0500 Subject: [Numpy-discussion] F2PY error : ... on platform 'posix' with 'gcc' compiler In-Reply-To: <6574ea980909251244m6cb642e4q1c8655598eead51e@mail.gmail.com> References: <6574ea980909250819q365a1951n3e6f21d8bb045a35@mail.gmail.com> <6574ea980909251055s4c2bca5dw7da2ddaebcb4cfd@mail.gmail.com> <3d375d730909251057w46a36fe0v88a5169e322dd037@mail.gmail.com> <6574ea980909251102g39944ca9j9ac76fa53d50af95@mail.gmail.com> <3d375d730909251122w5991aacid2ddd8097a3a2c49@mail.gmail.com> <6574ea980909251143w917e25bob0563265ff41030f@mail.gmail.com> <3d375d730909251153t219b38c3oeb3b1ccaf6cc6bd9@mail.gmail.com> <6574ea980909251208u56702fd5we960b3b7edde1cdb@mail.gmail.com> <3d375d730909251230g74da35c4vd70ae9adb2b523c8@mail.gmail.com> <6574ea980909251244m6cb642e4q1c8655598eead51e@mail.gmail.com> Message-ID: <3d375d730909251246n127b1d20pb92b0b5f1a86db6d@mail.gmail.com> On Fri, Sep 25, 2009 at 14:44, Arthur Bousquet wrote: > $ file libsw.so gives : > > libsw.so: Mach-O bundle i386 > > How can I change it ? Is it because of my gfortran ? Your gfortran is probably still configured to make x86 executables by default. Try recompiling libsw.so with the flags -arch x86 -arch x86_64 -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From pgmdevlist at gmail.com Fri Sep 25 15:47:42 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Fri, 25 Sep 2009 15:47:42 -0400 Subject: [Numpy-discussion] Question about improving genfromtxt errors In-Reply-To: References: <4ABD05D6.6020706@gmail.com> <4D357173-C408-4AF4-8FFD-C77363915D13@gmail.com> <4ABD1B55.7050705@gmail.com> Message-ID: On Sep 25, 2009, at 3:42 PM, Skipper Seabold wrote: > > As far as this goes, I added some examples to the docs wiki, but I > think that genfromtxt and related would be best served by having their > own wiki page that could maybe go here > > > Thoughts? I can work on it as I find time. +1 > Also while I'm thinking about it, I filed an enhancement ticket and > patch to use the autostrip keyword to get rid of whitespace in strings > While you're at it, can you ask for adding the possibility to process a dtype like (int,int,float) ? 
That was what I was working on before I started installing Snow Leopard... From Chris.Barker at noaa.gov Fri Sep 25 15:51:23 2009 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Fri, 25 Sep 2009 12:51:23 -0700 Subject: [Numpy-discussion] Question about improving genfromtxt errors In-Reply-To: <57D14F44-04C0-4A50-B49D-E9CA2094F165@gmail.com> References: <4ABD05D6.6020706@gmail.com> <4D357173-C408-4AF4-8FFD-C77363915D13@gmail.com> <4ABD160E.1060503@noaa.gov> <57D14F44-04C0-4A50-B49D-E9CA2094F165@gmail.com> Message-ID: <4ABD1F3B.9040507@noaa.gov> One more thought: Pierre GM wrote: >>> That way, we don't slow >>> things down when everything works, how long can it really take to increment an integer as each line is parsed? I'd suspect no one would even notice! >>if you don't keep track of where you are, wouldn't you >> need to re-parse the whole file to figure it out again? > > Indeed. But does it really matter ? We're in a case where there's a > problem already... no, it doesn't. >> Maybe a "debug" mode that the user could turn on and off would fit >> the need. > Not a bad idea. Another option would be to give the user the > possibility to skip the offending lines: In either case, I think I'd tend to use it something like this: try: LoadTheFile() except GenFromTxtException: LoadTheFile(debug=True) But I suppose that block could be built in if debug was on -- without debug, it would simply raise an error when one was hit, with debug, it would go back and figure out more about the error and report it. or you could have multiple error modes: error_mode is one of: "fast": does it as fast as possible, and craps out with not much info on error "first_error": stops on first error, and gives you some info about it. "all_errors": keeps going after an error, and logs them all and reports back at the end. "ignore_errors": skips any line with errors, loading the rest of the data -- I think I'd still want the error report, though. but who's going to write that code? -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From jsseabold at gmail.com Fri Sep 25 15:51:31 2009 From: jsseabold at gmail.com (Skipper Seabold) Date: Fri, 25 Sep 2009 15:51:31 -0400 Subject: [Numpy-discussion] Question about improving genfromtxt errors In-Reply-To: References: <4ABD05D6.6020706@gmail.com> <4D357173-C408-4AF4-8FFD-C77363915D13@gmail.com> <4ABD1B55.7050705@gmail.com> Message-ID: On Fri, Sep 25, 2009 at 3:47 PM, Pierre GM wrote: > > On Sep 25, 2009, at 3:42 PM, Skipper Seabold wrote: >> >> As far as this goes, I added some examples to the docs wiki, but I >> think that genfromtxt and related would be best served by having their >> own wiki page that could maybe go here >> >> >> Thoughts? ?I can work on it as I find time. > > +1 > >> Also while I'm thinking about it, I filed an enhancement ticket and >> patch to use the autostrip keyword to get rid of whitespace in strings >> > > While you're at it, can you ask for adding the possibility to process > a dtype like (int,int,float) ? That was what I was working on before I > started installing Snow Leopard... Sure. Should it be another keyword though `type` maybe? dtype implies that it's a legal dtype, and I don't think (int,int,float) works does it? 
From pgmdevlist at gmail.com Fri Sep 25 15:55:54 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Fri, 25 Sep 2009 15:55:54 -0400 Subject: [Numpy-discussion] Question about improving genfromtxt errors In-Reply-To: References: <4ABD05D6.6020706@gmail.com> <4D357173-C408-4AF4-8FFD-C77363915D13@gmail.com> <4ABD1B55.7050705@gmail.com> Message-ID: <4715C2A5-B40F-4631-89E1-3234C18A1E17@gmail.com> On Sep 25, 2009, at 3:51 PM, Skipper Seabold wrote: >> >> While you're at it, can you ask for adding the possibility to process >> a dtype like (int,int,float) ? That was what I was working on >> before I >> started installing Snow Leopard... > > Sure. Should it be another keyword though `type` maybe? dtype > implies that it's a legal dtype, and I don't think (int,int,float) > works does it? `type` would call for troubles. And no, (int,int,float) is not a valid dtype, but it can be easily processed as such. From ralf.gommers at googlemail.com Fri Sep 25 15:56:28 2009 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Fri, 25 Sep 2009 15:56:28 -0400 Subject: [Numpy-discussion] Question about improving genfromtxt errors In-Reply-To: References: <4ABD05D6.6020706@gmail.com> <4D357173-C408-4AF4-8FFD-C77363915D13@gmail.com> <4ABD1B55.7050705@gmail.com> Message-ID: On Fri, Sep 25, 2009 at 3:47 PM, Pierre GM wrote: > > On Sep 25, 2009, at 3:42 PM, Skipper Seabold wrote: > > > > As far as this goes, I added some examples to the docs wiki, but I > > think that genfromtxt and related would be best served by having their > > own wiki page that could maybe go here > > > > > > Thoughts? I can work on it as I find time. > > +1 > The examples you put in the docstring are good I think. One more example demonstrating missing values would be useful. And +1 to a page in the user guide for anything else. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From pgmdevlist at gmail.com Fri Sep 25 15:59:06 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Fri, 25 Sep 2009 15:59:06 -0400 Subject: [Numpy-discussion] Question about improving genfromtxt errors In-Reply-To: References: <4ABD05D6.6020706@gmail.com> <4D357173-C408-4AF4-8FFD-C77363915D13@gmail.com> <4ABD1B55.7050705@gmail.com> Message-ID: <1C0DB852-5C0D-45A7-B2CF-778A2878DEE7@gmail.com> On Sep 25, 2009, at 3:56 PM, Ralf Gommers wrote: > > The examples you put in the docstring are good I think. One more > example demonstrating missing values would be useful. And +1 to a > page in the user guide for anything else. Check also what's done in tests/test_io.py, that gives an idea of what can be done and what cannot. From mpi at comxnet.dk Fri Sep 25 16:01:07 2009 From: mpi at comxnet.dk (Mads Ipsen) Date: Fri, 25 Sep 2009 22:01:07 +0200 Subject: [Numpy-discussion] Tuple outer product? 
In-Reply-To: <3d375d730909251233o78291a3ah1b6ebcd1a9ba003c@mail.gmail.com> References: <4ABD01BF.5080405@comxnet.dk> <49d6b3500909251058h2c77094bp5cb1875a27863ff5@mail.gmail.com> <4ABD0650.4040109@comxnet.dk> <4ABD17DB.7000801@american.edu> <3d375d730909251233o78291a3ah1b6ebcd1a9ba003c@mail.gmail.com> Message-ID: <4ABD2183.9040903@comxnet.dk> Robert Kern wrote: > On Fri, Sep 25, 2009 at 14:19, Alan G Isaac wrote: > >> I do not see what is wrong with itertools.product, >> but if you hate it, you can use numpy.meshgrid: >> >> >>>>> np.array(np.meshgrid([1,2,3],[4,5,6])).transpose() >>>>> >> array([[[1, 4], >> [1, 5], >> [1, 6]], >> >> [[2, 4], >> [2, 5], >> [2, 6]], >> >> [[3, 4], >> [3, 5], >> [3, 6]]]) >> > > If you need more than two item sets, or are using Python 2.5: > > import numpy as np > > def cartesian_product(*items): > items = map(np.asarray, items) > lengths = map(len, items) > n = np.arange(np.product(lengths)) > results = [] > for i in range(-1, -len(items)-1, -1): > j = n % lengths[i] > results.insert(0, items[i][j]) > n -= j > n //= lengths[i] > results = np.column_stack(results) > results.shape = tuple(lengths + [len(items)]) > return results > > > The final shape manipulations are, of course, optional. > > Thanks for all the suggestions. Came up with this, which I think I'll stick with a = numpy.array([1,2,3]) b = numpy.array([4,5,6]) (n,m) = (a.shape[0],b.shape[0]) a = numpy.repeat(a,m).reshape(n,m) b = numpy.repeat(b,n).reshape(m,n).transpose() ab = numpy.dstack((a,b)) print ab.tolist() [[[1, 4], [1, 5], [1, 6]], [[2, 4], [2, 5], [2, 6]], [[3, 4], [3, 5], [3, 6]]] -- +------------------------------------------------------------+ | Mads Ipsen, Scientific developer | +------------------------------+-----------------------------+ | QuantumWise A/S | phone: +45-29716388 | | N?rres?gade 27A | www: www.quantumwise.com | | DK-1370 Copenhagen, Denmark | email: mpi at quantumwise.com | +------------------------------+-----------------------------+ From arthbous at indiana.edu Fri Sep 25 16:36:40 2009 From: arthbous at indiana.edu (Arthur Bousquet) Date: Fri, 25 Sep 2009 16:36:40 -0400 Subject: [Numpy-discussion] F2PY error : ... on platform 'posix' with 'gcc' compiler In-Reply-To: <6574ea980909251249i32542160i817a76aa48b965e7@mail.gmail.com> References: <6574ea980909250819q365a1951n3e6f21d8bb045a35@mail.gmail.com> <6574ea980909251102g39944ca9j9ac76fa53d50af95@mail.gmail.com> <3d375d730909251122w5991aacid2ddd8097a3a2c49@mail.gmail.com> <6574ea980909251143w917e25bob0563265ff41030f@mail.gmail.com> <3d375d730909251153t219b38c3oeb3b1ccaf6cc6bd9@mail.gmail.com> <6574ea980909251208u56702fd5we960b3b7edde1cdb@mail.gmail.com> <3d375d730909251230g74da35c4vd70ae9adb2b523c8@mail.gmail.com> <6574ea980909251244m6cb642e4q1c8655598eead51e@mail.gmail.com> <3d375d730909251246n127b1d20pb92b0b5f1a86db6d@mail.gmail.com> <6574ea980909251249i32542160i817a76aa48b965e7@mail.gmail.com> Message-ID: <6574ea980909251336t49e24ce5y26fc08c67c5092c@mail.gmail.com> Thanks a lot for you help. Now it is working, I downloaded the gfortran from http://hpc.sourceforge.net/. Best regards, Arthur On Fri, Sep 25, 2009 at 3:49 PM, Arthur Bousquet wrote: > How can I add the flags ? > > Thank > > Arthur > > > On Fri, Sep 25, 2009 at 3:46 PM, Robert Kern wrote: > >> On Fri, Sep 25, 2009 at 14:44, Arthur Bousquet >> wrote: >> > $ file libsw.so gives : >> > >> > libsw.so: Mach-O bundle i386 >> > >> > How can I change it ? Is it because of my gfortran ? 
>> >> Your gfortran is probably still configured to make x86 executables by >> default. Try recompiling libsw.so with the flags >> >> -arch x86 -arch x86_64 >> >> -- >> Robert Kern >> >> "I have come to believe that the whole world is an enigma, a harmless >> enigma that is made terrible by our own mad attempt to interpret it as >> though it had an underlying truth." >> -- Umberto Eco >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gokhansever at gmail.com Fri Sep 25 16:37:59 2009 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_Sever?=) Date: Fri, 25 Sep 2009 15:37:59 -0500 Subject: [Numpy-discussion] Tuple outer product? In-Reply-To: <4ABD2183.9040903@comxnet.dk> References: <4ABD01BF.5080405@comxnet.dk> <49d6b3500909251058h2c77094bp5cb1875a27863ff5@mail.gmail.com> <4ABD0650.4040109@comxnet.dk> <4ABD17DB.7000801@american.edu> <3d375d730909251233o78291a3ah1b6ebcd1a9ba003c@mail.gmail.com> <4ABD2183.9040903@comxnet.dk> Message-ID: <49d6b3500909251337xd4fd49dxa733f11f79c1d0c4@mail.gmail.com> On Fri, Sep 25, 2009 at 3:01 PM, Mads Ipsen wrote: > Robert Kern wrote: > > On Fri, Sep 25, 2009 at 14:19, Alan G Isaac wrote: > > > >> I do not see what is wrong with itertools.product, > >> but if you hate it, you can use numpy.meshgrid: > >> > >> > >>>>> np.array(np.meshgrid([1,2,3],[4,5,6])).transpose() > >>>>> > >> array([[[1, 4], > >> [1, 5], > >> [1, 6]], > >> > >> [[2, 4], > >> [2, 5], > >> [2, 6]], > >> > >> [[3, 4], > >> [3, 5], > >> [3, 6]]]) > >> > > > > If you need more than two item sets, or are using Python 2.5: > > > > import numpy as np > > > > def cartesian_product(*items): > > items = map(np.asarray, items) > > lengths = map(len, items) > > n = np.arange(np.product(lengths)) > > results = [] > > for i in range(-1, -len(items)-1, -1): > > j = n % lengths[i] > > results.insert(0, items[i][j]) > > n -= j > > n //= lengths[i] > > results = np.column_stack(results) > > results.shape = tuple(lengths + [len(items)]) > > return results > > > > > > The final shape manipulations are, of course, optional. > > > > > Thanks for all the suggestions. 
Came up with this, which I think I'll > stick with > > a = numpy.array([1,2,3]) > b = numpy.array([4,5,6]) > > (n,m) = (a.shape[0],b.shape[0]) > a = numpy.repeat(a,m).reshape(n,m) > b = numpy.repeat(b,n).reshape(m,n).transpose() > ab = numpy.dstack((a,b)) > > print ab.tolist() > > [[[1, 4], [1, 5], [1, 6]], [[2, 4], [2, 5], [2, 6]], [[3, 4], [3, 5], > [3, 6]]] > > > Yes this is super fast indeed :) with 10000x10000 I[12]: %time run test.py CPU times: user 6.14 s, sys: 1.83 s, total: 7.97 s Wall time: 8.25 s #test.py import numpy a = numpy.arange(10000) b = numpy.arange(10000) (n,m) = (a.shape[0],b.shape[0]) a = numpy.repeat(a,m).reshape(n,m) b = numpy.repeat(b,n).reshape(m,n).transpose() ab = numpy.dstack((a,b)) > -- > +------------------------------------------------------------+ > | Mads Ipsen, Scientific developer | > +------------------------------+-----------------------------+ > | QuantumWise A/S | phone: +45-29716388 | > | N?rres?gade 27A | www: www.quantumwise.com | > | DK-1370 Copenhagen, Denmark | email: mpi at quantumwise.com | > +------------------------------+-----------------------------+ > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- G?khan -------------- next part -------------- An HTML attachment was scrubbed... URL: From aisaac at american.edu Fri Sep 25 16:44:52 2009 From: aisaac at american.edu (Alan G Isaac) Date: Fri, 25 Sep 2009 16:44:52 -0400 Subject: [Numpy-discussion] Tuple outer product? In-Reply-To: <4ABD2183.9040903@comxnet.dk> References: <4ABD01BF.5080405@comxnet.dk> <49d6b3500909251058h2c77094bp5cb1875a27863ff5@mail.gmail.com> <4ABD0650.4040109@comxnet.dk> <4ABD17DB.7000801@american.edu> <3d375d730909251233o78291a3ah1b6ebcd1a9ba003c@mail.gmail.com> <4ABD2183.9040903@comxnet.dk> Message-ID: <4ABD2BC4.2030205@american.edu> On 9/25/2009 4:01 PM, Mads Ipsen wrote: > a = numpy.array([1,2,3]) > b = numpy.array([4,5,6]) > > (n,m) = (a.shape[0],b.shape[0]) > a = numpy.repeat(a,m).reshape(n,m) > b = numpy.repeat(b,n).reshape(m,n).transpose() > ab = numpy.dstack((a,b)) > > print ab.tolist() That's just a slow implementation of meshgrid: np.meshgrid(a,b).transpose().tolist() Gives you the same thing. Alan Isaac From nmb at wartburg.edu Fri Sep 25 16:46:07 2009 From: nmb at wartburg.edu (Neil Martinsen-Burrell) Date: Fri, 25 Sep 2009 15:46:07 -0500 Subject: [Numpy-discussion] F2PY error : ... on platform 'posix' with 'gcc' compiler In-Reply-To: <6574ea980909251336t49e24ce5y26fc08c67c5092c@mail.gmail.com> References: <6574ea980909250819q365a1951n3e6f21d8bb045a35@mail.gmail.com> <6574ea980909251102g39944ca9j9ac76fa53d50af95@mail.gmail.com> <3d375d730909251122w5991aacid2ddd8097a3a2c49@mail.gmail.com> <6574ea980909251143w917e25bob0563265ff41030f@mail.gmail.com> <3d375d730909251153t219b38c3oeb3b1ccaf6cc6bd9@mail.gmail.com> <6574ea980909251208u56702fd5we960b3b7edde1cdb@mail.gmail.com> <3d375d730909251230g74da35c4vd70ae9adb2b523c8@mail.gmail.com> <6574ea980909251244m6cb642e4q1c8655598eead51e@mail.gmail.com> <3d375d730909251246n127b1d20pb92b0b5f1a86db6d@mail.gmail.com> <6574ea980909251249i32542160i817a76aa48b965e7@mail.gmail.com> <6574ea980909251336t49e24ce5y26fc08c67c5092c@mail.gmail.com> Message-ID: <4ABD2C0F.6090706@wartburg.edu> On 2009-09-25 15:36 , Arthur Bousquet wrote: > Thanks a lot for you help. Now it is working, I downloaded the gfortran > from http://hpc.sourceforge.net/. 
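As a side note, the same pair array can be filled by broadcasting into a
preallocated array, which avoids the two large intermediate repeat() results;
this is essentially the empty-plus-assignment idea also being tried elsewhere
in this thread:

import numpy

n = 10000
a = numpy.arange(n)
b = numpy.arange(n)

ab = numpy.empty((n, n, 2), dtype=int)
ab[:,:,0] = a[:,None]   # first coordinate, constant along each row
ab[:,:,1] = b[None,:]   # second coordinate, constant along each column

For n = 10000 the array alone is already on the order of 1.5 GB with 64-bit
integers, so any tolist() conversion will dominate both time and memory.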
I will take the opportunity to steal Robert's line and say that the gfortran available from that site has had numerous problems in the past. The one from http://r.research.att.com/tools/ is much better and is the recommended one for SciPy. -Neil From kwgoodman at gmail.com Fri Sep 25 17:26:41 2009 From: kwgoodman at gmail.com (Keith Goodman) Date: Fri, 25 Sep 2009 14:26:41 -0700 Subject: [Numpy-discussion] Tuple outer product? In-Reply-To: <49d6b3500909251337xd4fd49dxa733f11f79c1d0c4@mail.gmail.com> References: <4ABD01BF.5080405@comxnet.dk> <49d6b3500909251058h2c77094bp5cb1875a27863ff5@mail.gmail.com> <4ABD0650.4040109@comxnet.dk> <4ABD17DB.7000801@american.edu> <3d375d730909251233o78291a3ah1b6ebcd1a9ba003c@mail.gmail.com> <4ABD2183.9040903@comxnet.dk> <49d6b3500909251337xd4fd49dxa733f11f79c1d0c4@mail.gmail.com> Message-ID: On Fri, Sep 25, 2009 at 1:37 PM, G?khan Sever wrote: > Yes this is super fast indeed :) > > with 10000x10000 > > I[12]: %time run test.py > CPU times: user 6.14 s, sys: 1.83 s, total: 7.97 s > Wall time: 8.25 s > > #test.py > > import numpy > > a = numpy.arange(10000) > b = numpy.arange(10000) > > (n,m) = (a.shape[0],b.shape[0]) > a = numpy.repeat(a,m).reshape(n,m) > b = numpy.repeat(b,n).reshape(m,n).transpose() > ab = numpy.dstack((a,b)) No numpy-discussion thread can have too many timeit replies. Or at least that is what I'll test here. To cut the time in half I think we'll need a netflix prize. def mesh1(a, b): (n,m) = (a.shape[0],b.shape[0]) a = np.repeat(a,m).reshape(n,m) b = np.repeat(b,n).reshape(m,n).transpose() ab = np.dstack((a,b)) return ab.tolist() def mesh2(a, b): np.array(np.meshgrid(a,b)).transpose().tolist() def mesh3(a, b): ab = np.empty((a.shape[0], b.shape[0], 2), dtype=np.int) ab.T[0] = a ab[:,:,1] = b return ab.tolist() >> a = np.array([1,2,3]) >> b = np.array([4,5,6]) >> >> >> timeit mesh1(a,b) 10000 loops, best of 3: 31.4 ?s per loop >> timeit mesh2(a,b) 10000 loops, best of 3: 24.8 ?s per loop >> timeit mesh3(a,b) 100000 loops, best of 3: 12.3 ?s per loop >> >> >> a = np.arange(1000) >> b = np.arange(1000) >> >> timeit mesh1(a,b) 10 loops, best of 3: 655 ms per loop >> timeit mesh2(a,b) 10 loops, best of 3: 773 ms per loop >> timeit mesh3(a,b) 10 loops, best of 3: 587 ms per loop From timmichelsen at gmx-topmail.de Fri Sep 25 17:30:45 2009 From: timmichelsen at gmx-topmail.de (Timmie) Date: Fri, 25 Sep 2009 21:30:45 +0000 (UTC) Subject: [Numpy-discussion] genfromtxt to structured array Message-ID: Hello, this may be a easier question. I want to load data into an structured array with getting the names from the column header (names=True). The data looks like: ;month;day;hour;value 1995;1;1;01;0 but loading only works only if changed to: year;month;day;hour;value 1995;1;1;01;0 How do I read in the original data? Thanks, Timmie From rmay31 at gmail.com Fri Sep 25 17:40:17 2009 From: rmay31 at gmail.com (Ryan May) Date: Fri, 25 Sep 2009 16:40:17 -0500 Subject: [Numpy-discussion] genfromtxt to structured array In-Reply-To: References: Message-ID: On Fri, Sep 25, 2009 at 4:30 PM, Timmie wrote: > Hello, > this may be a easier question. > > I want to load data into an structured array with getting the names from the > column header (names=True). > > The data looks like: > > ? ?;month;day;hour;value > ? ?1995;1;1;01;0 > > > but loading only works only if changed to: > > ? ?year;month;day;hour;value > ? ?1995;1;1;01;0 > > > How do I read in the original data? There's an assumption that the number of names is the same as the number of columns. 
You can just specify the names and skip reading the names from the file: numpy.genfromtxt(filename, delimiter=';', skiprows=1, names=['year', 'month', 'day', 'hour', 'value']) Ryan -- Ryan May Graduate Research Assistant School of Meteorology University of Oklahoma Sent from Norman, Oklahoma, United States From mpi at comxnet.dk Fri Sep 25 18:38:53 2009 From: mpi at comxnet.dk (Mads Ipsen) Date: Sat, 26 Sep 2009 00:38:53 +0200 Subject: [Numpy-discussion] Tuple outer product? In-Reply-To: <4ABD2BC4.2030205@american.edu> References: <4ABD01BF.5080405@comxnet.dk> <49d6b3500909251058h2c77094bp5cb1875a27863ff5@mail.gmail.com> <4ABD0650.4040109@comxnet.dk> <4ABD17DB.7000801@american.edu> <3d375d730909251233o78291a3ah1b6ebcd1a9ba003c@mail.gmail.com> <4ABD2183.9040903@comxnet.dk> <4ABD2BC4.2030205@american.edu> Message-ID: <4ABD467D.2070101@comxnet.dk> Alan G Isaac wrote: > On 9/25/2009 4:01 PM, Mads Ipsen wrote: > >> a = numpy.array([1,2,3]) >> b = numpy.array([4,5,6]) >> >> (n,m) = (a.shape[0],b.shape[0]) >> a = numpy.repeat(a,m).reshape(n,m) >> b = numpy.repeat(b,n).reshape(m,n).transpose() >> ab = numpy.dstack((a,b)) >> >> print ab.tolist() >> > > > That's just a slow implementation of meshgrid: > np.meshgrid(a,b).transpose().tolist() > Gives you the same thing. > > Alan Isaac > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > Yes, but it should also work for [2.1,3.2,4.5] combined with [4.6,-2.3,5.6] - forgot to tell that. Mads -- +------------------------------------------------------------+ | Mads Ipsen, Scientific developer | +------------------------------+-----------------------------+ | QuantumWise A/S | phone: +45-29716388 | | Nørresøgade 27A | www: www.quantumwise.com | | DK-1370 Copenhagen, Denmark | email: mpi at quantumwise.com | +------------------------------+-----------------------------+ From robert.kern at gmail.com Fri Sep 25 18:42:15 2009 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 25 Sep 2009 17:42:15 -0500 Subject: [Numpy-discussion] Tuple outer product? In-Reply-To: <4ABD467D.2070101@comxnet.dk> References: <4ABD01BF.5080405@comxnet.dk> <49d6b3500909251058h2c77094bp5cb1875a27863ff5@mail.gmail.com> <4ABD0650.4040109@comxnet.dk> <4ABD17DB.7000801@american.edu> <3d375d730909251233o78291a3ah1b6ebcd1a9ba003c@mail.gmail.com> <4ABD2183.9040903@comxnet.dk> <4ABD2BC4.2030205@american.edu> <4ABD467D.2070101@comxnet.dk> Message-ID: <3d375d730909251542p74ad7c2ch199503260de1afb6@mail.gmail.com> On Fri, Sep 25, 2009 at 17:38, Mads Ipsen wrote: > Yes, but it should also work for [2.1,3.2,4.5] combined with > [4.6,-2.3,5.6] - forgot to tell that. In [5]: np.transpose(np.meshgrid([2.1,3.2,4.5], [4.6,-2.3,5.6])) Out[5]: array([[[ 2.1, 4.6], [ 2.1, -2.3], [ 2.1, 5.6]], [[ 3.2, 4.6], [ 3.2, -2.3], [ 3.2, 5.6]], [[ 4.5, 4.6], [ 4.5, -2.3], [ 4.5, 5.6]]]) -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
-- Umberto Eco From jsseabold at gmail.com Fri Sep 25 19:30:41 2009 From: jsseabold at gmail.com (Skipper Seabold) Date: Fri, 25 Sep 2009 19:30:41 -0400 Subject: [Numpy-discussion] Question about improving genfromtxt errors In-Reply-To: <4ABD1F3B.9040507@noaa.gov> References: <4ABD05D6.6020706@gmail.com> <4D357173-C408-4AF4-8FFD-C77363915D13@gmail.com> <4ABD160E.1060503@noaa.gov> <57D14F44-04C0-4A50-B49D-E9CA2094F165@gmail.com> <4ABD1F3B.9040507@noaa.gov> Message-ID: On Fri, Sep 25, 2009 at 3:51 PM, Christopher Barker wrote: > One more thought: > > Pierre GM wrote: >>>> That way, we don't slow >>>> things down when everything works, > > how long can it really take to increment an integer as each line is > parsed? I'd suspect no one would even notice! > A 1000 converters later... FWIW, I have a script that creates and savez arrays from several text files in total about 1.5 GB of text. without the incrementing in genfromtxt Run time: 122.043943 seconds with the incrementing in genfromtxt Run time: 131.698873 seconds If we just want to always keep track of things, I would be willing to take a poorly measured 8 % slowdown, because the info that I took from the errors is the only thing that made what I was doing at all feasible. Skipper From aisaac at american.edu Fri Sep 25 19:46:22 2009 From: aisaac at american.edu (Alan G Isaac) Date: Fri, 25 Sep 2009 19:46:22 -0400 Subject: [Numpy-discussion] Tuple outer product? In-Reply-To: References: <4ABD01BF.5080405@comxnet.dk> <49d6b3500909251058h2c77094bp5cb1875a27863ff5@mail.gmail.com> <4ABD0650.4040109@comxnet.dk> <4ABD17DB.7000801@american.edu> <3d375d730909251233o78291a3ah1b6ebcd1a9ba003c@mail.gmail.com> <4ABD2183.9040903@comxnet.dk> <49d6b3500909251337xd4fd49dxa733f11f79c1d0c4@mail.gmail.com> Message-ID: <4ABD564E.2000401@american.edu> OK, sure, for large arrays meshgrid will look bad. (It creates a large array twice.) If the example really involves sequential integers, then mgrid could be used instead to save on this. Even so, it is just implausible that duplicating meshgrid functionality will be faster than using meshgrid. So let's avoid the transpose of a big matrix by using dstack, and voila, it's competitive again. But with large arrays, itertools.product will look *even* better. It is by far the fastest, and of course even faster yet if you do not convert to list (since it returns a generator). Not to mention, it returns the pairs (in contrast to the nesting in the other solutions). Alan #script from numpy import arange, array, dstack, empty, meshgrid, repeat from itertools import product import timeit def mesh1(a, b): (n,m) = (a.shape[0],b.shape[0]) a = repeat(a,m).reshape(n,m) b = repeat(b,n).reshape(m,n).transpose() ab = dstack((a,b)) return ab.tolist() def mesh2(a, b): return array(meshgrid(a,b)).transpose().tolist() def mesh3(a, b): ab = empty((a.shape[0], b.shape[0], 2), dtype=int) ab.T[0] = a ab[:,:,1] = b return ab.tolist() def mesh4(a, b): return dstack(meshgrid(a,b)).tolist() def mesh5(a,b): return list(product(a,b)) a = arange(1000) b = arange(1000) print for f in mesh1, mesh2, mesh3, mesh4, mesh5: print timeit.timeit('f(a,b)',setup='from __main__ import a,b,f', number=10) #results 12.3801448229 16.1959892249 11.7581712899 12.3436149706 1.24762749176 From aisaac at american.edu Fri Sep 25 19:50:05 2009 From: aisaac at american.edu (Alan G Isaac) Date: Fri, 25 Sep 2009 19:50:05 -0400 Subject: [Numpy-discussion] Tuple outer product? 
In-Reply-To: <4ABD467D.2070101@comxnet.dk> References: <4ABD01BF.5080405@comxnet.dk> <49d6b3500909251058h2c77094bp5cb1875a27863ff5@mail.gmail.com> <4ABD0650.4040109@comxnet.dk> <4ABD17DB.7000801@american.edu> <3d375d730909251233o78291a3ah1b6ebcd1a9ba003c@mail.gmail.com> <4ABD2183.9040903@comxnet.dk> <4ABD2BC4.2030205@american.edu> <4ABD467D.2070101@comxnet.dk> Message-ID: <4ABD572D.2060206@american.edu> > Alan G Isaac wrote: >> That's just a slow implementation of meshgrid: >> np.meshgrid(a,b).transpose().tolist() >> Gives you the same thing. On 9/25/2009 6:38 PM, Mads Ipsen wrote: > Yes, but it should also work for [2.1,3.2,4.5] combined with > [4.6,-2.3,5.6] - forgot to tell that. No problem: a meshgrid is not an mgrid. But again, itertools.product will be **much** faster. Alan Isaac From dwf at cs.toronto.edu Fri Sep 25 21:40:15 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Fri, 25 Sep 2009 21:40:15 -0400 Subject: [Numpy-discussion] Getting number of *characters* in dtype='U' array In-Reply-To: <4ABCE2BD.4020601@stsci.edu> References: <4ABCE2BD.4020601@stsci.edu> Message-ID: <20090926014015.GA32290@rodimus> On Fri, Sep 25, 2009 at 11:33:17AM -0400, Michael Droettboom wrote: > Is there a way to get the number of characters in a fixed-size 'U' > array? I can, of course, parse dtype.str, or divide dtype.itemsize by > the size of a unicode character, but neither seems terribly elegant or > future proof. Does numpy provide (to Python) a method for getting this > that I'm just missing? > > In [7]: x = np.array(u'1234') > > In [8]: x.dtype > Out[8]: dtype(' > In [9]: x.dtype.str > Out[9]: ' > In [10]: x.dtype.itemsize > Out[10]: 16 I could be misleading you but I believe x.dtype.alignment is the divisor for itemsize that you're looking for? It looks like it's 1 for string arrays and 4 for Unicode arrays, which would make sense... David From robert.kern at gmail.com Fri Sep 25 22:15:49 2009 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 25 Sep 2009 21:15:49 -0500 Subject: [Numpy-discussion] Getting number of *characters* in dtype='U' array In-Reply-To: <20090926014015.GA32290@rodimus> References: <4ABCE2BD.4020601@stsci.edu> <20090926014015.GA32290@rodimus> Message-ID: <3d375d730909251915m501c97f2s55a163ba5e5bb3ae@mail.gmail.com> On Fri, Sep 25, 2009 at 20:40, David Warde-Farley wrote: > On Fri, Sep 25, 2009 at 11:33:17AM -0400, Michael Droettboom wrote: >> Is there a way to get the number of characters in a fixed-size 'U' >> array? ?I can, of course, parse dtype.str, or divide dtype.itemsize by >> the size of a unicode character, but neither seems terribly elegant or >> future proof. ?Does numpy provide (to Python) a method for getting this >> that I'm just missing? >> >> In [7]: x = np.array(u'1234') >> >> In [8]: x.dtype >> Out[8]: dtype('> >> In [9]: x.dtype.str >> Out[9]: '> >> In [10]: x.dtype.itemsize >> Out[10]: 16 > > I could be misleading you but I believe x.dtype.alignment is the divisor for itemsize that you're looking for? Only by happenstance. It is not really related. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From dwf at cs.toronto.edu Fri Sep 25 21:57:27 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Fri, 25 Sep 2009 21:57:27 -0400 Subject: [Numpy-discussion] Getting number of *characters* in dtype='U' array In-Reply-To: <3d375d730909251915m501c97f2s55a163ba5e5bb3ae@mail.gmail.com> References: <4ABCE2BD.4020601@stsci.edu> <20090926014015.GA32290@rodimus> <3d375d730909251915m501c97f2s55a163ba5e5bb3ae@mail.gmail.com> Message-ID: <20090926015727.GA32392@rodimus> On Fri, Sep 25, 2009 at 09:15:49PM -0500, Robert Kern wrote: > > I could be misleading you but I believe x.dtype.alignment is the divisor for itemsize that you're looking for? > > Only by happenstance. It is not really related. Yikes, my guess was quite wrong indeed. http://docs.scipy.org/doc/numpy/reference/c-api.types-and-structures.html#alignment I'd say go with int(dt.str.split(dt.char)[-1]). That's not _that_ brittle, I guess. I was just looking at the C code for this the other day, and the way it's implemented is that if you've specified unicode the elsize is left-shifted by 2. I don't think the 'atom size' is stored anywhere in the process. David From mpi at comxnet.dk Sat Sep 26 04:50:54 2009 From: mpi at comxnet.dk (Mads Ipsen) Date: Sat, 26 Sep 2009 10:50:54 +0200 Subject: [Numpy-discussion] Tuple outer product? In-Reply-To: <3d375d730909251542p74ad7c2ch199503260de1afb6@mail.gmail.com> References: <4ABD01BF.5080405@comxnet.dk> <49d6b3500909251058h2c77094bp5cb1875a27863ff5@mail.gmail.com> <4ABD0650.4040109@comxnet.dk> <4ABD17DB.7000801@american.edu> <3d375d730909251233o78291a3ah1b6ebcd1a9ba003c@mail.gmail.com> <4ABD2183.9040903@comxnet.dk> <4ABD2BC4.2030205@american.edu> <4ABD467D.2070101@comxnet.dk> <3d375d730909251542p74ad7c2ch199503260de1afb6@mail.gmail.com> Message-ID: <4ABDD5EE.4020507@comxnet.dk> Robert Kern wrote: > On Fri, Sep 25, 2009 at 17:38, Mads Ipsen wrote: > >> Yes, but it should also work for [2.1,3.2,4.5] combined with >> [4.6,-2.3,5.6] - forgot to tell that. >> > > In [5]: np.transpose(np.meshgrid([2.1,3.2,4.5], [4.6,-2.3,5.6])) > Out[5]: > array([[[ 2.1, 4.6], > [ 2.1, -2.3], > [ 2.1, 5.6]], > > [[ 3.2, 4.6], > [ 3.2, -2.3], > [ 3.2, 5.6]], > > [[ 4.5, 4.6], > [ 4.5, -2.3], > [ 4.5, 5.6]]]) > > Point taken :-) -- +------------------------------------------------------------+ | Mads Ipsen, Scientific developer | +------------------------------+-----------------------------+ | QuantumWise A/S | phone: +45-29716388 | | N?rres?gade 27A | www: www.quantumwise.com | | DK-1370 Copenhagen, Denmark | email: mpi at quantumwise.com | +------------------------------+-----------------------------+ From thomas.robitaille at gmail.com Sat Sep 26 09:33:28 2009 From: thomas.robitaille at gmail.com (Thomas Robitaille) Date: Sat, 26 Sep 2009 09:33:28 -0400 Subject: [Numpy-discussion] unpacking bytes directly in numpy Message-ID: <2D96D1A8-2324-41F9-AF11-A71497BD0D04@gmail.com> Hi, To convert some bytes to e.g. a 32-bit int, I can do bytes = f.read(4) i = struct.unpack('>i', bytes)[0] and the convert it to np.int32 with i = np.int32(i) However, is there a more direct way of directly transforming bytes into a np.int32 type without the intermediate 'struct.unpack' step? 
Thanks for any help, Tom From cournape at gmail.com Sat Sep 26 10:09:36 2009 From: cournape at gmail.com (David Cournapeau) Date: Sat, 26 Sep 2009 23:09:36 +0900 Subject: [Numpy-discussion] unpacking bytes directly in numpy In-Reply-To: <2D96D1A8-2324-41F9-AF11-A71497BD0D04@gmail.com> References: <2D96D1A8-2324-41F9-AF11-A71497BD0D04@gmail.com> Message-ID: <5b8d13220909260709p3b8d7099n7e933109159bdd56@mail.gmail.com> On Sat, Sep 26, 2009 at 10:33 PM, Thomas Robitaille wrote: > Hi, > > To convert some bytes to e.g. a 32-bit int, I can do > > bytes = f.read(4) > i = struct.unpack('>i', bytes)[0] > > and the convert it to np.int32 with > > i = np.int32(i) > > However, is there a more direct way of directly transforming bytes > into a np.int32 type without the intermediate 'struct.unpack' step? Assuming you have an array of bytes, you could just use view: # x is an array of bytes, whose length is a multiple of 4 x.view(np.int32) cheers, David From lciti at essex.ac.uk Sat Sep 26 17:05:35 2009 From: lciti at essex.ac.uk (Citi, Luca) Date: Sat, 26 Sep 2009 22:05:35 +0100 Subject: [Numpy-discussion] np.any and np.all short-circuiting In-Reply-To: <5b8d13220909250312i51ac8e5bnadc15bcd8bcee42a@mail.gmail.com> References: <271BED32E925E646A1333A56D9C6AFCB3E13D7D3F0@MBOX0.essex.ac.uk> <3d375d730909241511y4774d11i7808627fbd023135@mail.gmail.com> <271BED32E925E646A1333A56D9C6AFCB3E13D7D3F2@MBOX0.essex.ac.uk> <271BED32E925E646A1333A56D9C6AFCB3E13D7D3F3@MBOX0.essex.ac.uk> <5b8d13220909241841o647d5858ocadc274ad9982f20@mail.gmail.com> <271BED32E925E646A1333A56D9C6AFCB3E13D7D3F6@MBOX0.essex.ac.uk>, <5b8d13220909250312i51ac8e5bnadc15bcd8bcee42a@mail.gmail.com> Message-ID: <271BED32E925E646A1333A56D9C6AFCB3E13D7D3F8@MBOX0.essex.ac.uk> Hello David, thank you. I followed your suggestion but I was unable to make it work. I surprisingly found that with numpy in a different folder, it worked. I am afraid it is due to the fact that the first one is not a linux filesystem and cannot deal with permission and ownership. This would make sense and agree with a recent post about nose having problems with certain permissions or ownerships. Problem solved. I checked out the svn version into a linux filesystem and it works. Thanks again to all those that offered help. Best, Luca From erik.tollerud at gmail.com Sat Sep 26 18:17:37 2009 From: erik.tollerud at gmail.com (Erik Tollerud) Date: Sat, 26 Sep 2009 15:17:37 -0700 Subject: [Numpy-discussion] python reduce vs numpy reduce for outer product Message-ID: I'm encountering behavior that I think makes sense, but I'm not sure if there's some numpy function I'm unaware of that might speed up this operation. I have a (potentially very long) sequence of vectors, but for examples' sake, I'll stick with three: [A,B,C] with lengths na,nb, and nc. To get the result I want, I first reshape them to (na,1,1) , (1,nb,1) and (1,1,nc) and do: >>>reduce(np.multiply,[A,B,C]) and the result is what I want... The curious thing is that >>>np.prod.reduce([A,B,C]) throws ValueError: setting an array element with a sequence. Presumably this is because np.prod.reduce is trying to operate elemnt-wise without broadcasting. But is there a way to make the ufunc broadcast faster than doing the python-level reduce? (I tried np.prod(broadcast_arrays([A,B,C]),axis=0), but that seemed slower, presumably because it needs to allocate the full array for all three instead of just once). 
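To make the reshape-and-reduce approach above concrete, a small sketch with made-up sizes (the names A3/B3/C3 and the toy shapes are just for illustration):

import numpy as np

A = np.arange(1, 4)            # length na = 3
B = np.arange(1, 5)            # length nb = 4
C = np.arange(1, 6)            # length nc = 5

# reshape so the three vectors broadcast against each other
A3 = A.reshape(-1, 1, 1)       # shape (na, 1, 1)
B3 = B.reshape(1, -1, 1)       # shape (1, nb, 1)
C3 = C.reshape(1, 1, -1)       # shape (1, 1, nc)

# built-in reduce, as in the snippet above: one broadcast multiply at a time
out = reduce(np.multiply, [A3, B3, C3])
assert out.shape == (3, 4, 5)  # full outer product over the three vectors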
Or, if there's a better way to just start with the first 3 1d vectorsand jump straight to the broadcast product (basically, an outer product over arbitrary number of dimensions...)? From robert.kern at gmail.com Sat Sep 26 18:29:04 2009 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 26 Sep 2009 17:29:04 -0500 Subject: [Numpy-discussion] python reduce vs numpy reduce for outer product In-Reply-To: References: Message-ID: <3d375d730909261529sbe89ahe795b362e77d48df@mail.gmail.com> On Sat, Sep 26, 2009 at 17:17, Erik Tollerud wrote: > I'm encountering behavior that I think makes sense, but I'm not sure > if there's some numpy function I'm unaware of that might speed up this > operation. > > I have a (potentially very long) sequence of vectors, but for > examples' sake, I'll stick with three: [A,B,C] with lengths na,nb, and > nc. ?To get the result I want, I first reshape them to (na,1,1) , > (1,nb,1) and (1,1,nc) and do: > >>>>reduce(np.multiply,[A,B,C]) > > and the result is what I want... The curious thing is that > >>>>np.prod.reduce([A,B,C]) I'm sure you mean np.multiply.reduce(). > throws > > ValueError: setting an array element with a sequence. > > Presumably this is because np.prod.reduce is trying to operate > elemnt-wise without broadcasting. No. np.multiply.reduce() is trying to coerce its argument into an array. You have given it a list with three arrays that do not have the compatible shapes. In [1]: a = arange(5).reshape([5,1,1]) In [2]: b = arange(6).reshape([1,6,1]) In [4]: c = arange(7).reshape([1,1,7]) In [5]: array([a,b,c]) --------------------------------------------------------------------------- ValueError Traceback (most recent call last) /Users/rkern/Downloads/ in () ValueError: setting an array element with a sequence. > ?But is there a way to make the > ufunc broadcast faster than doing the python-level reduce? ?(I tried > np.prod(broadcast_arrays([A,B,C]),axis=0), but that seemed slower, > presumably because it needs to allocate the full array for all three > instead of just once). Basically yes because it is computing np.array(np.broadcast_arrays([A,B,C])). > Or, if there's a better way to just start with the first 3 1d > vectorsand jump straight to the broadcast product (basically, an outer > product over arbitrary number of dimensions...)? Well, numpy doesn't support arbitrary numbers of dimensions, nor will your memory. You won't be able to do more than a handful of dimensions practically. Exactly what are you trying to do? Specifics, please, not toy examples. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From erik.tollerud at gmail.com Sat Sep 26 19:17:05 2009 From: erik.tollerud at gmail.com (Erik Tollerud) Date: Sat, 26 Sep 2009 16:17:05 -0700 Subject: [Numpy-discussion] python reduce vs numpy reduce for outer product In-Reply-To: <3d375d730909261529sbe89ahe795b362e77d48df@mail.gmail.com> References: <3d375d730909261529sbe89ahe795b362e77d48df@mail.gmail.com> Message-ID: > I'm sure you mean np.multiply.reduce(). Yes, sorry - typo. >> Or, if there's a better way to just start with the first 3 1d >> vectorsand jump straight to the broadcast product (basically, an outer >> product over arbitrary number of dimensions...)? > > Well, numpy doesn't support arbitrary numbers of dimensions, nor will > your memory. You won't be able to do more than a handful of dimensions > practically. 
Exactly what are you trying to do? Specifics, please, not > toy examples. Well, I'm not sure how to get too much more specific than what I just described. I am computing moments of n-d input arrays given a particular axis ... I want to take a sequence of 1D arrays, and get an output has as many dimensions as the input sequence's length, with each dimension's size matching the corresponding vector. Symbolically, A[i,j,k,...] = v0[i]*v1[j]*v2[k]*... A is then multiplied by the input n-d array (same shape as A), and that is the output. And yes, practically, this will only work until I run out of memory, but the reduce method works for the n=1,2, 3, and 4 cases, and potentially in the future it will be needed for with higher (up to maybe 8) dimensions that are small enough that they won't overwhelm the memory. So it seems like a bad idea to write custom versions for each potential dimensionality. From robert.kern at gmail.com Sat Sep 26 19:22:03 2009 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 26 Sep 2009 18:22:03 -0500 Subject: [Numpy-discussion] python reduce vs numpy reduce for outer product In-Reply-To: References: <3d375d730909261529sbe89ahe795b362e77d48df@mail.gmail.com> Message-ID: <3d375d730909261622x77da31ddv7d57adc8d1ced28b@mail.gmail.com> On Sat, Sep 26, 2009 at 18:17, Erik Tollerud wrote: >> I'm sure you mean np.multiply.reduce(). > Yes, sorry - typo. > >>> Or, if there's a better way to just start with the first 3 1d >>> vectorsand jump straight to the broadcast product (basically, an outer >>> product over arbitrary number of dimensions...)? >> >> Well, numpy doesn't support arbitrary numbers of dimensions, nor will >> your memory. You won't be able to do more than a handful of dimensions >> practically. Exactly what are you trying to do? Specifics, please, not >> toy examples. > > Well, I'm not sure how to get too much more specific than what I just > described. I am computing moments of n-d input arrays given a > particular axis ... I want to take a sequence of 1D arrays, and get an > output has as many dimensions as the input sequence's length, with > each dimension's size matching the corresponding vector. > Symbolically, A[i,j,k,...] = v0[i]*v1[j]*v2[k]*... ?A is then > multiplied by the input n-d array (same shape as A), and that is the > output. > > And yes, practically, this will only work until I run out of memory, > but the reduce method works for the n=1,2, 3, and 4 cases, and > potentially in the future it will be needed for with higher (up to > maybe 8) dimensions that are small enough that they won't overwhelm > the memory. So it seems like a bad idea to write custom versions for > each potential dimensionality. Okay, that's the key fact I needed. When you said that you would have a long list of vectors, I was worried that you wanted a dimension for each of them. You probably aren't going to be able to beat reduce(np.multiply, ...). -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From pav+sp at iki.fi Mon Sep 28 03:11:40 2009 From: pav+sp at iki.fi (Pauli Virtanen) Date: Mon, 28 Sep 2009 07:11:40 +0000 (UTC) Subject: [Numpy-discussion] Error building docs References: <4ABCEFEA.7010002@stsci.edu> Message-ID: Fri, 25 Sep 2009 12:29:30 -0400, Michael Droettboom wrote: > Anybody know why I might be seeing this? 
[clip] > Exception occurred:[ 0%] reference/arrays.classeslass > File "/home/mdroe/usr/lib/python2.5/site-packages/docutils/nodes.py", > line 471, in __getitem__ > return self.attributes[key] > KeyError: 'numbered' No ideas. I think I've seen KeyErrors before, but not from inside docutils. My only guess would be to remove the build directory and try again. If that does not help, I'd look into if there's some known issue with the current version of Sphinx's hg branch. -- Pauli Virtanen From Michael.Walker at sophia.inria.fr Mon Sep 28 04:07:47 2009 From: Michael.Walker at sophia.inria.fr (Michael.Walker at sophia.inria.fr) Date: Mon, 28 Sep 2009 10:07:47 +0200 (CEST) Subject: [Numpy-discussion] numpy.numarray.transpose() In-Reply-To: <1d1e6ea70909250659q706c394fg470d5278fb97549d@mail.gmail.com> References: <1d1e6ea70909250239u34027854l2dde58e46b0934d4@mail.gmail.com> <1d1e6ea70909250355h2a036d21w799e8d87a92715d4@mail.gmail.com> <91cf711d0909250614o158e121ej40399a5b92278884@mail.gmail.com> <1d1e6ea70909250659q706c394fg470d5278fb97549d@mail.gmail.com> Message-ID: <57516.194.167.194.27.1254125267.squirrel@imap-sop.inria.fr> Hello list, I just thought I'd point out a difference between 'import numarray' and 'import numpy.numarray' . Consider the following In [1]: from numpy.numarray import * In [2]: d = array((1,2,3,4)) In [3]: f = reshape(d,(2,2)) In [4]: print f [[1 2] [3 4]] In [5]: f.transpose() Out[5]: array([[1, 3], [2, 4]]) In [6]: print f [[1 2] [3 4]] Now in pure numarray, f would have changed to the transposed form, so that the output from [5] and [6] would match, and be different from that of [4]. (I don't have numarray installed myself but a workmates computer, and examples on the web have the usage f.transpose() . Now, In [7]: f = f.transpose() In [8]: print f [[1 3] [2 4]] as expected. I mention this because I think that it is worth knowing having lost a LOT of time to it. Is it worth filing as a bug report? Michael Walker Plant Modelling Group CIRAD, Montpellier 04 67 61 57 27 From pav+sp at iki.fi Mon Sep 28 04:15:33 2009 From: pav+sp at iki.fi (Pauli Virtanen) Date: Mon, 28 Sep 2009 08:15:33 +0000 (UTC) Subject: [Numpy-discussion] numpy.numarray.transpose() References: <1d1e6ea70909250239u34027854l2dde58e46b0934d4@mail.gmail.com> <1d1e6ea70909250355h2a036d21w799e8d87a92715d4@mail.gmail.com> <91cf711d0909250614o158e121ej40399a5b92278884@mail.gmail.com> <1d1e6ea70909250659q706c394fg470d5278fb97549d@mail.gmail.com> <57516.194.167.194.27.1254125267.squirrel@imap-sop.inria.fr> Message-ID: Mon, 28 Sep 2009 10:07:47 +0200, Michael.Walker wrote: [clip] > In [7]: f = f.transpose() > > In [8]: print f > [[1 3] > [2 4]] > > as expected. I mention this because I think that it is worth knowing > having lost a LOT of time to it. Is it worth filing as a bug report? Yes. It indeed seems that in numarray, transpose() transposes the array in-place. This could maybe be fixed by a new numarray-emulating ndarray subclass. The tricky problem then is that some functions don't, IIRC, preserve subclasses, which may lead to surprises. (Anyway, these should be fixed at some point...) At the least, we should write a well-visible "differences to numarray" document that explains all differences and known bugs. -- Pauli Virtanen From ndbecker2 at gmail.com Mon Sep 28 07:23:20 2009 From: ndbecker2 at gmail.com (Neal Becker) Date: Mon, 28 Sep 2009 07:23:20 -0400 Subject: [Numpy-discussion] new array type in cython? Message-ID: Has anyone attempted a new array type in cython? Any hints? 
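(Back on the numarray transpose() question above: a very rough sketch of what a numarray-emulating subclass along the lines Pauli suggests might look like. The class name is made up, it only handles contiguous arrays, and it is a sketch rather than a proposed implementation.)

import numpy as np

class NumarrayishArray(np.ndarray):
    """Hypothetical sketch: transpose() rearranges the data in place,
    roughly mimicking the old numarray behaviour."""
    def transpose(self, *axes):
        # compute the transposed data first, then write it back into self
        result = np.ndarray.transpose(self, *axes).copy()
        self.shape = result.shape   # only valid for contiguous arrays
        self[...] = result
        return self

f = np.arange(1, 5).reshape(2, 2).view(NumarrayishArray)
f.transpose()
print f    # now holds [[1 3], [2 4]]; f itself was modified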
From Nicolas.Rougier at loria.fr Mon Sep 28 10:06:19 2009 From: Nicolas.Rougier at loria.fr (Nicolas Rougier) Date: Mon, 28 Sep 2009 16:06:19 +0200 Subject: [Numpy-discussion] glumpy: fast OpenGL numpy visualization + matplotlib integration Message-ID: Hi all, glumpy is a fast OpenGL visualization tool for numpy arrays coded on top of pyglet (http://www.pyglet.org/). The package contains many demos showing basic usage as well as integration with matplotlib. As a reference, the animation script available from matplotlib distribution runs at around 500 fps using glumpy instead of 30 fps on my machine. Package/screenshots/explanations at: http://www.loria.fr/~rougier/coding/glumpy.html (it does not require installation so you can run demos from within the glumpy directory). Nicolas From bsouthey at gmail.com Mon Sep 28 10:29:30 2009 From: bsouthey at gmail.com (Bruce Southey) Date: Mon, 28 Sep 2009 09:29:30 -0500 Subject: [Numpy-discussion] numpy.numarray.transpose() In-Reply-To: References: <1d1e6ea70909250239u34027854l2dde58e46b0934d4@mail.gmail.com> <1d1e6ea70909250355h2a036d21w799e8d87a92715d4@mail.gmail.com> <91cf711d0909250614o158e121ej40399a5b92278884@mail.gmail.com> <1d1e6ea70909250659q706c394fg470d5278fb97549d@mail.gmail.com> <57516.194.167.194.27.1254125267.squirrel@imap-sop.inria.fr> Message-ID: <4AC0C84A.7030601@gmail.com> On 09/28/2009 03:15 AM, Pauli Virtanen wrote: > Mon, 28 Sep 2009 10:07:47 +0200, Michael.Walker wrote: > [clip] > >> In [7]: f = f.transpose() >> >> In [8]: print f >> [[1 3] >> [2 4]] >> >> as expected. I mention this because I think that it is worth knowing >> having lost a LOT of time to it. Is it worth filing as a bug report? >> > Yes. It indeed seems that in numarray, transpose() transposes the array > in-place. > > This could maybe be fixed by a new numarray-emulating ndarray subclass. > The tricky problem then is that some functions don't, IIRC, preserve > subclasses, which may lead to surprises. (Anyway, these should be fixed > at some point...) > > At the least, we should write a well-visible "differences to numarray" > document that explains all differences and known bugs. > > This is not a bug! This specific difference between numpy and numarray is documented on the 'converting from numarray' page: http://www.scipy.org/Converting_from_numarray What actually is incorrect is that the numpy.numarray.transpose has the same docstring as numpy.transpose. So it would be very helpful to first correct the numpy.array.transpose documentation. A larger goal would be to correctly document all the numpy.numarray and numpy.numeric functions as these should not be linked to the similar numpy functions. If these are identical then it should state that, what differences exist and then refer to equivalent numpy page for example, numpy.numarray.matrixmultiply and numpy.dot. Also, the documentation for these numpy.numarray and numpy.numeric functions should state that these are mainly included for compatibility reasons and may be removed at a future date. 
Bruce From pav+sp at iki.fi Mon Sep 28 11:08:52 2009 From: pav+sp at iki.fi (Pauli Virtanen) Date: Mon, 28 Sep 2009 15:08:52 +0000 (UTC) Subject: [Numpy-discussion] numpy.numarray.transpose() References: <1d1e6ea70909250239u34027854l2dde58e46b0934d4@mail.gmail.com> <1d1e6ea70909250355h2a036d21w799e8d87a92715d4@mail.gmail.com> <91cf711d0909250614o158e121ej40399a5b92278884@mail.gmail.com> <1d1e6ea70909250659q706c394fg470d5278fb97549d@mail.gmail.com> <57516.194.167.194.27.1254125267.squirrel@imap-sop.inria.fr> <4AC0C84A.7030601@gmail.com> Message-ID: Mon, 28 Sep 2009 09:29:30 -0500, Bruce Southey wrote: [clip] > This is not a bug! This specific difference between numpy and numarray > is documented on the 'converting from numarray' page: > http://www.scipy.org/Converting_from_numarray Oh. I completely missed that page. Now, it should just be transferred to the main documentation. Also, it might be possible to make numpy.numarray.ndarray different from numpy.ndarray. But I doubt this is high priority -- it may be more efficient just to document the fact. > What actually is incorrect is that the numpy.numarray.transpose has the > same docstring as numpy.transpose. So it would be very helpful to first > correct the numpy.array.transpose documentation. numpy.numarray.transpose is numpy.transpose, so fixing this would involve implementing the numarray-style transpose, too. -- Pauli Virtanen From robert.kern at gmail.com Mon Sep 28 11:12:44 2009 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 28 Sep 2009 10:12:44 -0500 Subject: [Numpy-discussion] new array type in cython? In-Reply-To: References: Message-ID: <3d375d730909280812o2b9b343erf72b3ce28a8ba025@mail.gmail.com> On Mon, Sep 28, 2009 at 06:23, Neal Becker wrote: > Has anyone attempted a new array type in cython? ?Any hints? Are you having problems? -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From ndbecker2 at gmail.com Mon Sep 28 11:36:24 2009 From: ndbecker2 at gmail.com (Neal Becker) Date: Mon, 28 Sep 2009 11:36:24 -0400 Subject: [Numpy-discussion] new array type in cython? References: <3d375d730909280812o2b9b343erf72b3ce28a8ba025@mail.gmail.com> Message-ID: Robert Kern wrote: > On Mon, Sep 28, 2009 at 06:23, Neal Becker wrote: >> Has anyone attempted a new array type in cython? Any hints? > > Are you having problems? > No, haven't tried using cython for this yet. Wondering if there are any examples. So far my experiences have been with boost::python. From gokhansever at gmail.com Mon Sep 28 12:05:48 2009 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_Sever?=) Date: Mon, 28 Sep 2009 11:05:48 -0500 Subject: [Numpy-discussion] glumpy: fast OpenGL numpy visualization + matplotlib integration In-Reply-To: References: Message-ID: <49d6b3500909280905se2029fcmbfb4bf54f1351cc3@mail.gmail.com> On Mon, Sep 28, 2009 at 9:06 AM, Nicolas Rougier wrote: > > Hi all, > > glumpy is a fast OpenGL visualization tool for numpy arrays coded on > top of pyglet (http://www.pyglet.org/). The package contains many > demos showing basic usage as well as integration with matplotlib. As a > reference, the animation script available from matplotlib distribution > runs at around 500 fps using glumpy instead of 30 fps on my machine. 
> > Package/screenshots/explanations at: > http://www.loria.fr/~rougier/coding/glumpy.html > (it does not require installation so you can run demos from within the > glumpy directory). > > > Nicolas > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > Hi Nicolas, This is technically called OpenGL backend, isn't it? It is nice that integrates with matplotlib, however 300 hundred lines of code indeed a lot of lines for an ordinary user. Do you think this could be further integrated into matplotlib with a wrapper to simplify its usage? -- G?khan -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Mon Sep 28 12:31:44 2009 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 28 Sep 2009 11:31:44 -0500 Subject: [Numpy-discussion] new array type in cython? In-Reply-To: References: <3d375d730909280812o2b9b343erf72b3ce28a8ba025@mail.gmail.com> Message-ID: <3d375d730909280931r7f2ac69cp835a304875d37ddb@mail.gmail.com> On Mon, Sep 28, 2009 at 10:36, Neal Becker wrote: > Robert Kern wrote: > >> On Mon, Sep 28, 2009 at 06:23, Neal Becker wrote: >>> Has anyone attempted a new array type in cython? ?Any hints? >> >> Are you having problems? > > No, haven't tried using cython for this yet. ?Wondering if there are any > examples. Have you read the documentation? http://wiki.cython.org/tutorials/numpy -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From Nicolas.Rougier at loria.fr Mon Sep 28 12:37:39 2009 From: Nicolas.Rougier at loria.fr (Nicolas Rougier) Date: Mon, 28 Sep 2009 18:37:39 +0200 Subject: [Numpy-discussion] [Matplotlib-users] glumpy: fast OpenGL numpy visualization + matplotlib integration In-Reply-To: <49d6b3500909280905se2029fcmbfb4bf54f1351cc3@mail.gmail.com> References: <49d6b3500909280905se2029fcmbfb4bf54f1351cc3@mail.gmail.com> Message-ID: <4DCD757A-B2A3-4BC4-AA41-ECE355EDE1ED@loria.fr> Well, I've been starting working on a pyglet backend but it is currently painfully slow mainly because I do not know enough of the matplotlib internal machinery to really benefit from it. In the case of glumpy, the use of texture object for representing 2d arrays is a real speed boost since interpolation/colormap/heightmap is made on the GPU. Concerning matplotlib examples, the use of glumpy should be actually two lines of code: from pylab import * from glumpy import imshow, show but I did not package it this way yet (that is easy however). I guess the main question is whether people are interested in glumpy to have a quick & dirty "debug" tool on top of matplotlib or whether they prefer a full fledged and fast pyglet/OpenGL backend (which is really harder). Nicolas On 28 Sep, 2009, at 18:05 , G?khan Sever wrote: > > > On Mon, Sep 28, 2009 at 9:06 AM, Nicolas Rougier > wrote: > > Hi all, > > glumpy is a fast OpenGL visualization tool for numpy arrays coded on > top of pyglet (http://www.pyglet.org/). The package contains many > demos showing basic usage as well as integration with matplotlib. As a > reference, the animation script available from matplotlib distribution > runs at around 500 fps using glumpy instead of 30 fps on my machine. 
> > Package/screenshots/explanations at: http://www.loria.fr/~rougier/coding/glumpy.html > (it does not require installation so you can run demos from within the > glumpy directory). > > > Nicolas > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > Hi Nicolas, > > This is technically called OpenGL backend, isn't it? It is nice that > integrates with matplotlib, however 300 hundred lines of code indeed > a lot of lines for an ordinary user. Do you think this could be > further integrated into matplotlib with a wrapper to simplify its > usage? > > > -- > G?khan > ------------------------------------------------------------------------------ > Come build with us! The BlackBerry® Developer Conference in SF, CA > is the only developer event you need to attend this year. Jumpstart > your > developing skills, take BlackBerry mobile applications to market and > stay > ahead of the curve. Join us from November 9-12, 2009. Register > now! > http://p.sf.net/sfu/devconf_______________________________________________ > Matplotlib-users mailing list > Matplotlib-users at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/matplotlib-users -------------- next part -------------- An HTML attachment was scrubbed... URL: From Chris.Barker at noaa.gov Mon Sep 28 12:40:16 2009 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Mon, 28 Sep 2009 09:40:16 -0700 Subject: [Numpy-discussion] unpacking bytes directly in numpy In-Reply-To: <5b8d13220909260709p3b8d7099n7e933109159bdd56@mail.gmail.com> References: <2D96D1A8-2324-41F9-AF11-A71497BD0D04@gmail.com> <5b8d13220909260709p3b8d7099n7e933109159bdd56@mail.gmail.com> Message-ID: <4AC0E6F0.1000007@noaa.gov> David Cournapeau wrote: >> However, is there a more direct way of directly transforming bytes >> into a np.int32 type without the intermediate 'struct.unpack' step? > > Assuming you have an array of bytes, you could just use view: > > # x is an array of bytes, whose length is a multiple of 4 > x.view(np.int32) and if you don't have an array, you can use one of: np.fromstring np.frombuffer np.fromfile they all take a dtype as a parameter. For your example: > bytes = f.read(4) > i = struct.unpack('>i', bytes)[0] i = np.fromfile(f, dtype=np.int32, count=1) and, of course, you cold read a lot more than one number in at once. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From Chris.Barker at noaa.gov Mon Sep 28 12:41:36 2009 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Mon, 28 Sep 2009 09:41:36 -0700 Subject: [Numpy-discussion] Question about improving genfromtxt errors In-Reply-To: References: <4ABD05D6.6020706@gmail.com> <4D357173-C408-4AF4-8FFD-C77363915D13@gmail.com> <4ABD160E.1060503@noaa.gov> <57D14F44-04C0-4A50-B49D-E9CA2094F165@gmail.com> <4ABD1F3B.9040507@noaa.gov> Message-ID: <4AC0E740.60309@noaa.gov> Skipper Seabold wrote: > FWIW, I have a script that creates and savez arrays from several text > files in total about 1.5 GB of text. 
> > without the incrementing in genfromtxt > > Run time: 122.043943 seconds > > with the incrementing in genfromtxt > > Run time: 131.698873 seconds > > If we just want to always keep track of things, I would be willing to > take a poorly measured 8 % slowdown, I also think 8% is worth it, but I'm still surprised it's that much. What addition code is inside the inner loop? (or , I guess, the each line loop...) -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From robert.kern at gmail.com Mon Sep 28 12:47:28 2009 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 28 Sep 2009 11:47:28 -0500 Subject: [Numpy-discussion] new array type in cython? In-Reply-To: References: <3d375d730909280812o2b9b343erf72b3ce28a8ba025@mail.gmail.com> Message-ID: <3d375d730909280947v57ae4459j530ff42202ca929a@mail.gmail.com> On Mon, Sep 28, 2009 at 10:36, Neal Becker wrote: > Robert Kern wrote: > >> On Mon, Sep 28, 2009 at 06:23, Neal Becker wrote: >>> Has anyone attempted a new array type in cython? ?Any hints? >> >> Are you having problems? >> > > No, haven't tried using cython for this yet. ?Wondering if there are any > examples. I'm sorry. I misunderstood your question. You meant adding a new dtype in Cython. No, I don't think anyone has attempted this with Cython, yet. There is a C example in docs/newdtype_example/. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From ndbecker2 at gmail.com Mon Sep 28 12:47:45 2009 From: ndbecker2 at gmail.com (Neal Becker) Date: Mon, 28 Sep 2009 12:47:45 -0400 Subject: [Numpy-discussion] new array type in cython? References: <3d375d730909280812o2b9b343erf72b3ce28a8ba025@mail.gmail.com> <3d375d730909280931r7f2ac69cp835a304875d37ddb@mail.gmail.com> Message-ID: Robert Kern wrote: > On Mon, Sep 28, 2009 at 10:36, Neal Becker wrote: >> Robert Kern wrote: >> >>> On Mon, Sep 28, 2009 at 06:23, Neal Becker wrote: >>>> Has anyone attempted a new array type in cython? Any hints? >>> >>> Are you having problems? >> >> No, haven't tried using cython for this yet. Wondering if there are any >> examples. > > Have you read the documentation? > > http://wiki.cython.org/tutorials/numpy > Yes, I didn't notice anything about adding user-defined datatypes to numpy, did I miss something? From jsseabold at gmail.com Mon Sep 28 12:51:39 2009 From: jsseabold at gmail.com (Skipper Seabold) Date: Mon, 28 Sep 2009 12:51:39 -0400 Subject: [Numpy-discussion] Question about improving genfromtxt errors In-Reply-To: <4AC0E740.60309@noaa.gov> References: <4ABD05D6.6020706@gmail.com> <4D357173-C408-4AF4-8FFD-C77363915D13@gmail.com> <4ABD160E.1060503@noaa.gov> <57D14F44-04C0-4A50-B49D-E9CA2094F165@gmail.com> <4ABD1F3B.9040507@noaa.gov> <4AC0E740.60309@noaa.gov> Message-ID: On Mon, Sep 28, 2009 at 12:41 PM, Christopher Barker wrote: > Skipper Seabold wrote: >> FWIW, I have a script that creates and savez arrays from several text >> files in total about 1.5 GB of text. 
>> >> without the incrementing in genfromtxt >> >> Run time: 122.043943 seconds >> >> with the incrementing in genfromtxt >> >> Run time: 131.698873 seconds >> >> If we just want to always keep track of things, I would be willing to >> take a poorly measured 8 % slowdown, > > I also think 8% is worth it, but I'm still surprised it's that much. > What addition code is inside the inner loop? (or , I guess, the each > line loop...) > > -Chris > This was probably due to the way that I timed it, honestly. I only did it once. The only differences I made for that part were in the first post of the thread. Two incremented scalars for line numbers and column numbers and a try/except block. I'm really not against a debug mode if someone wants to do it, and it's deemed necessary. If it could be made to log all of the errors that would be extremely helpful. I still need to post some of my use cases though. Anything to help make data cleaning less of a chore... Skipper From rpg.314 at gmail.com Mon Sep 28 13:16:15 2009 From: rpg.314 at gmail.com (Rohit Garg) Date: Mon, 28 Sep 2009 22:46:15 +0530 Subject: [Numpy-discussion] [Matplotlib-users] glumpy: fast OpenGL numpy visualization + matplotlib integration In-Reply-To: <4DCD757A-B2A3-4BC4-AA41-ECE355EDE1ED@loria.fr> References: <49d6b3500909280905se2029fcmbfb4bf54f1351cc3@mail.gmail.com> <4DCD757A-B2A3-4BC4-AA41-ECE355EDE1ED@loria.fr> Message-ID: <4d5dd8c20909281016u583fea3di2d4e15cc7f7a2784@mail.gmail.com> This is good. I have been looking forward to seeing something like this for a while. I'd be cool however, to dump a *real* python function into a vertex shader and let it do real mesh deformations. I know, it would be hard to validate if it wasn;t doing some crazy stuff. Of course, with new (ie soon-to be-introduced) tesselation extensions to opengl, the mesh itself could be generated on the gpu itself. On Mon, Sep 28, 2009 at 10:07 PM, Nicolas Rougier wrote: > > > Well, I've been starting working on a pyglet backend but it is currently > painfully slow mainly because I do not know enough of the matplotlib > internal machinery to really benefit from it. In the case of glumpy, the use > of texture object for representing 2d arrays is a real speed boost since > interpolation/colormap/heightmap is made on the GPU. > Concerning matplotlib examples, the use of glumpy should be actually two > lines of code: > from pylab import * > from glumpy import imshow, show > but I did not package it this way yet (that is easy however). > I guess the main question is whether people are interested in glumpy to have > a quick & dirty "debug" tool on top of matplotlib or whether they prefer a > full fledged and fast pyglet/OpenGL backend (which is really harder). > Nicolas > > > On 28 Sep, 2009, at 18:05 , G?khan Sever wrote: > > > On Mon, Sep 28, 2009 at 9:06 AM, Nicolas Rougier > wrote: >> >> Hi all, >> >> glumpy is a fast OpenGL visualization tool for numpy arrays coded on >> top of pyglet (http://www.pyglet.org/). The package contains many >> demos showing basic usage as well as integration with matplotlib. As a >> reference, the animation script available from matplotlib distribution >> runs at around 500 fps using glumpy instead of 30 fps on my machine. >> >> Package/screenshots/explanations at: >> http://www.loria.fr/~rougier/coding/glumpy.html >> (it does not require installation so you can run demos from within the >> glumpy directory). 
>> >> >> Nicolas >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > Hi Nicolas, > > This is technically called OpenGL backend, isn't it? It is nice that > integrates with matplotlib, however 300 hundred lines of code indeed a lot > of lines for an ordinary user. Do you think this could be further integrated > into matplotlib with a wrapper to simplify its usage? > > > -- > G?khan > ------------------------------------------------------------------------------ > Come build with us! The BlackBerry® Developer Conference in SF, CA > is the only developer event you need to attend this year. Jumpstart your > developing skills, take BlackBerry mobile applications to market and stay > ahead of the curve. Join us from November 9-12, 2009. Register now! > http://p.sf.net/sfu/devconf_______________________________________________ > Matplotlib-users mailing list > Matplotlib-users at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/matplotlib-users > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- Rohit Garg http://rpg-314.blogspot.com/ Senior Undergraduate Department of Physics Indian Institute of Technology Bombay From pgmdevlist at gmail.com Mon Sep 28 13:36:07 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Mon, 28 Sep 2009 13:36:07 -0400 Subject: [Numpy-discussion] Question about improving genfromtxt errors In-Reply-To: References: <4ABD05D6.6020706@gmail.com> <4D357173-C408-4AF4-8FFD-C77363915D13@gmail.com> <4ABD160E.1060503@noaa.gov> <57D14F44-04C0-4A50-B49D-E9CA2094F165@gmail.com> <4ABD1F3B.9040507@noaa.gov> <4AC0E740.60309@noaa.gov> Message-ID: On Sep 28, 2009, at 12:51 PM, Skipper Seabold wrote: > This was probably due to the way that I timed it, honestly. I only > did it once. The only differences I made for that part were in the > first post of the thread. Two incremented scalars for line numbers > and column numbers and a try/except block. > > I'm really not against a debug mode if someone wants to do it, and > it's deemed necessary. If it could be made to log all of the errors > that would be extremely helpful. I still need to post some of my use > cases though. Anything to help make data cleaning less of a chore... I was thinking about something this week-end: we could create a second list when looping on the rows, where we would store the length of each splitted row. After the loop, we can find if these values don't match the expected number of columns `nbcols` and where. Then, we can decide to strip the `rows` list of its invalid values (that corresponds to skipping) or raise an exception, but in both cases we know where the problem is. My only concern is that we'd be creating yet another list of integers, which would increase memory usage. Would it be a problem ? In other news, I should eventually be able to tackle that this week... 
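For what it's worth, the row-length bookkeeping described above could look roughly like this as a standalone sketch (made-up names, not the actual genfromtxt code):

def check_row_lengths(rows, nbcols):
    # record the length of each split row while looping over them ...
    lengths = [len(row) for row in rows]
    # ... then locate every offending row in a single pass after the loop
    bad = [(i, n) for (i, n) in enumerate(lengths) if n != nbcols]
    if bad:
        msg = "; ".join("row #%d has %d columns instead of %d" % (i, n, nbcols)
                        for (i, n) in bad)
        # here we could also choose to drop the bad rows instead of raising
        raise ValueError(msg)

rows = [['1995', '1', '1', '01', '0'],
        ['1995', '1', '1'],                      # too short
        ['1995', '1', '1', '02', '0', 'extra']]  # too long
check_row_lengths(rows, 5)
# ValueError: row #1 has 3 columns instead of 5; row #2 has 6 columns instead of 5

The extra cost is one integer per row, which is exactly the memory concern mentioned above.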
From jsseabold at gmail.com Mon Sep 28 13:54:54 2009 From: jsseabold at gmail.com (Skipper Seabold) Date: Mon, 28 Sep 2009 13:54:54 -0400 Subject: [Numpy-discussion] Question about improving genfromtxt errors In-Reply-To: References: <4ABD05D6.6020706@gmail.com> <4D357173-C408-4AF4-8FFD-C77363915D13@gmail.com> <4ABD160E.1060503@noaa.gov> <57D14F44-04C0-4A50-B49D-E9CA2094F165@gmail.com> <4ABD1F3B.9040507@noaa.gov> <4AC0E740.60309@noaa.gov> Message-ID: On Mon, Sep 28, 2009 at 1:36 PM, Pierre GM wrote: > > On Sep 28, 2009, at 12:51 PM, Skipper Seabold wrote: > >> This was probably due to the way that I timed it, honestly. ?I only >> did it once. ?The only differences I made for that part were in the >> first post of the thread. ?Two incremented scalars for line numbers >> and column numbers and a try/except block. >> >> I'm really not against a debug mode if someone wants to do it, and >> it's deemed necessary. ?If it could be made to log all of the errors >> that would be extremely helpful. ?I still need to post some of my use >> cases though. ?Anything to help make data cleaning less of a chore... > > I was thinking about something this week-end: we could create a second > list when looping on the rows, where we would store the length of each > splitted row. After the loop, we can find if these values don't match > the expected number of columns `nbcols` and where. Then, we can decide > to strip the `rows` list of its invalid values (that corresponds to > skipping) or raise an exception, but in both cases we know where the > problem is. > My only concern is that we'd be creating yet another list of integers, > which would increase memory usage. Would it be a problem ? > In other news, I should eventually be able to tackle that this week... > I don't think it would be prohibitively large. One of the datasets I was working with was about a million lines with about 500 columns in each. So...if this is how you actually do this then you have. L = [500] * 1201798 import sys print sys.getsizeof(L)/(1000000.), "MB" # (9.6144560000000006, 'MB') I can't think of a case where I would want to just skip bad rows. Also, I'd definitely like to know about each line that had problems in an error log if we're going to go through the whole file anyway. No hurry on this, just getting my thoughts out there after my experience. I will post some test cases tonight probably. Skipper From charlesr.harris at gmail.com Mon Sep 28 15:05:27 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 28 Sep 2009 13:05:27 -0600 Subject: [Numpy-discussion] new array type in cython? In-Reply-To: References: <3d375d730909280812o2b9b343erf72b3ce28a8ba025@mail.gmail.com> <3d375d730909280931r7f2ac69cp835a304875d37ddb@mail.gmail.com> Message-ID: On Mon, Sep 28, 2009 at 10:47 AM, Neal Becker wrote: > Robert Kern wrote: > > > On Mon, Sep 28, 2009 at 10:36, Neal Becker wrote: > >> Robert Kern wrote: > >> > >>> On Mon, Sep 28, 2009 at 06:23, Neal Becker > wrote: > >>>> Has anyone attempted a new array type in cython? Any hints? > >>> > >>> Are you having problems? > >> > >> No, haven't tried using cython for this yet. Wondering if there are any > >> examples. > > > > Have you read the documentation? > > > > http://wiki.cython.org/tutorials/numpy > > > > Yes, I didn't notice anything about adding user-defined datatypes to numpy, > did I miss something? > > This is for a fixed point type, no? Why not write a class based around something like int32, override some of the methods, and use object arrays? 
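For instance (purely illustrative; the Fixed class below and its scaling details are invented just to show the object-array pattern):

import numpy as np

class Fixed(object):
    """Toy fixed-point value (hypothetical), stored as a scaled integer."""
    def __init__(self, value, frac_bits=8):
        self.frac_bits = frac_bits
        self.raw = int(round(value * (1 << frac_bits)))
    def __float__(self):
        return self.raw / float(1 << self.frac_bits)
    def __add__(self, other):
        out = Fixed(0, self.frac_bits)
        out.raw = self.raw + other.raw
        return out
    def __mul__(self, other):
        out = Fixed(0, self.frac_bits)
        out.raw = (self.raw * other.raw) >> self.frac_bits
        return out
    def __repr__(self):
        return "Fixed(%g)" % float(self)

# object arrays defer to the Python-level operators element-wise
a = np.array([Fixed(1.5), Fixed(2.25)], dtype=object)
b = np.array([Fixed(0.5), Fixed(4.0)], dtype=object)
print a + b    # [Fixed(2) Fixed(6.25)]
print a * b    # [Fixed(0.75) Fixed(9)]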
It's quick an easy and unless you really, really need speed it should do the trick. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From pengyu.ut at gmail.com Mon Sep 28 15:31:18 2009 From: pengyu.ut at gmail.com (Peng Yu) Date: Mon, 28 Sep 2009 14:31:18 -0500 Subject: [Numpy-discussion] setup.py does not use the correct library Message-ID: <366c6f340909281231o2b99c498r462a44fa1e30d61c@mail.gmail.com> I use the following command to build numpy-1.3.0rc2. But it seems not able to find the appropriate library files. Can somebody let me know how to make it use the correct ones? #####command export LD_LBRARY_PATH= export CPPFLAGS="-I$HOME/utility/linux/opt/Python-2.6.2/include/python2.6" export LDFLAGS="-L$HOME/utility/linux/opt/Python-2.6.2/lib -lpython2.6" export PATH=$HOME/utility/linux/opt/Python-2.6.2/bin:/usr/local/bin:/usr/bin:/bin python setup.py build --fcompiler=gnu95 #######error /home/pengy/download/linux/python/numpy-1.3.0rc2/build/src.linux-x86_64-2.6/numpy/core/include/numpy/__multiarray_api.h:996: undefined reference to `PyErr_Format' build/temp.linux-x86_64-2.6/numpy/linalg/lapack_litemodule.o: In function `initlapack_lite': /home/pengy/download/linux/python/numpy-1.3.0rc2/numpy/linalg/lapack_litemodule.c:833: undefined reference to `PyDict_SetItemString' /home/pengy/download/linux/python/numpy-1.3.0rc2/numpy/linalg/lapack_litemodule.c:830: undefined reference to `PyErr_SetString' build/temp.linux-x86_64-2.6/numpy/linalg/python_xerbla.o: In function `xerbla_': /home/pengy/download/linux/python/numpy-1.3.0rc2/numpy/linalg/python_xerbla.c:35: undefined reference to `PyExc_ValueError' /home/pengy/download/linux/python/numpy-1.3.0rc2/numpy/linalg/python_xerbla.c:35: undefined reference to `PyErr_SetString' /usr/lib/gcc/x86_64-redhat-linux/4.1.2/libgfortranbegin.a(fmain.o): In function `main': (.text+0xa): undefined reference to `MAIN__' collect2: ld returned 1 exit status From robert.kern at gmail.com Mon Sep 28 15:53:01 2009 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 28 Sep 2009 14:53:01 -0500 Subject: [Numpy-discussion] setup.py does not use the correct library In-Reply-To: <366c6f340909281231o2b99c498r462a44fa1e30d61c@mail.gmail.com> References: <366c6f340909281231o2b99c498r462a44fa1e30d61c@mail.gmail.com> Message-ID: <3d375d730909281253v7534453cxf1962c0fa783051@mail.gmail.com> On Mon, Sep 28, 2009 at 14:31, Peng Yu wrote: > I use the following command to build numpy-1.3.0rc2. But it seems not > able to find the appropriate library files. Can somebody let me know > how to make it use the correct ones? > > #####command > export LD_LBRARY_PATH= > export CPPFLAGS="-I$HOME/utility/linux/opt/Python-2.6.2/include/python2.6" > export LDFLAGS="-L$HOME/utility/linux/opt/Python-2.6.2/lib -lpython2.6" When compiling Fortran extensions, $LDFLAGS replaces every linker flag, including things like -shared. Be sure you know what you are doing when using this. But you really shouldn't have to be using $LDFLAGS or $CPPFLAGS like this. Python should know where its libraries and include directories are. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From fonnesbeck at gmail.com Mon Sep 28 16:39:28 2009 From: fonnesbeck at gmail.com (Chris) Date: Mon, 28 Sep 2009 20:39:28 +0000 (UTC) Subject: [Numpy-discussion] numpy.reshape() bug? 
Message-ID: I am trying to "collapse" two dimensions of a 3-D array, using reshape: (Pdb) dims = np.shape(trace) (Pdb) dims Out[2]: (1000, 4, 3) (Pdb) newdims = (dims[0], sum(dims[1:])) (Pdb) newdims Out[2]: (1000, 7) However, reshape seems to think I am missing something: (Pdb) np.reshape(trace, newdims) *** ValueError: total size of new array must be unchanged Clearly the total size of the new array *is* unchanged. From charlesr.harris at gmail.com Mon Sep 28 16:46:32 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 28 Sep 2009 14:46:32 -0600 Subject: [Numpy-discussion] Deprecate poly1d and replace with Poly1d ? Message-ID: Because poly1d exports the __array__ interface, a design error IMHO that makes it play badly with the prospective chebyshev module. For example the following should convert from a Chebyshev series to a power series chebval([1,0,0], poly1d(1,0)) and it does if I make sure to pass the poly1d as object in an array, but to do that I have to check that it is an instance of poly1d and take special measures. I shouldn't have to do that, it violates duck typing. The more basic problem here is making poly1d look like an array, which it isn't. The array bit is an implementation detail and would be private in C++. with an as_array method to retrieve the details if wanted. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From nmb at wartburg.edu Mon Sep 28 16:50:32 2009 From: nmb at wartburg.edu (Neil Martinsen-Burrell) Date: Mon, 28 Sep 2009 15:50:32 -0500 Subject: [Numpy-discussion] numpy.reshape() bug? In-Reply-To: References: Message-ID: <4AC12198.50607@wartburg.edu> On 2009-09-28 15:39 , Chris wrote: > I am trying to "collapse" two dimensions of a 3-D array, using reshape: > > (Pdb) dims = np.shape(trace) > (Pdb) dims > Out[2]: (1000, 4, 3) > (Pdb) newdims = (dims[0], sum(dims[1:])) > (Pdb) newdims > Out[2]: (1000, 7) > > However, reshape seems to think I am missing something: > > (Pdb) np.reshape(trace, newdims) > *** ValueError: total size of new array must be unchanged > > Clearly the total size of the new array *is* unchanged. I think you meant prod(dims[1:]). A 4 x 3 sub-array has 12 elements, not 7. (Whence the curse of dimensionality...) -Neil From robert.kern at gmail.com Mon Sep 28 16:59:23 2009 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 28 Sep 2009 15:59:23 -0500 Subject: [Numpy-discussion] Deprecate poly1d and replace with Poly1d ? In-Reply-To: References: Message-ID: <3d375d730909281359k405ce4b1ubca0f186f70200d0@mail.gmail.com> On Mon, Sep 28, 2009 at 15:46, Charles R Harris wrote: > The more basic problem here is making poly1d look like an array, which it > isn't. The array bit is an implementation detail and would be private in > C++. with an as_array method to retrieve the details if wanted. I'm pretty sure that it is an intentional public API and not an implementation detail. The __array__() method is not "making poly1d look like an array"; it is the standard name for such as_array() conversion methods. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From yogeshkarpate at gmail.com Mon Sep 28 17:02:48 2009 From: yogeshkarpate at gmail.com (yogesh karpate) Date: Tue, 29 Sep 2009 02:32:48 +0530 Subject: [Numpy-discussion] The problem with zero dimesnsional array Message-ID: <703777c60909281402t5526b6c5k6cf6156ca5fcb945@mail.gmail.com> Dear All, I'm facing a bog problem in following . the code snippet is as follows #################### % Compute the area indicator################################### for kT in range(leftbound,rightbound): # Here the left bound and rightbound both are indexing array is cutlevel = sum(s[(kT-ptwin):(kT+ptwin)],0)/(ptwin*2+1) corsig = s[(kT-swin+1):kT]-cutlevel areavalue1 =sum((corsig),0) #print areavalue.size print leftbound, rightbound Tval=areavalue1[leftbound:rightbound] Everything works fine till areavalue1, then whenever I try to access the Tval=areavalue1[leftbound:rightbound] it says IndexError: invalid index to scalar variable.. When i try to access areavalue1[0] it gives me entire array but for areavalue1[2:8]..it gives the same error . Thanx in advance.. Regards ymk -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Mon Sep 28 17:05:28 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 28 Sep 2009 15:05:28 -0600 Subject: [Numpy-discussion] Deprecate poly1d and replace with Poly1d ? In-Reply-To: <3d375d730909281359k405ce4b1ubca0f186f70200d0@mail.gmail.com> References: <3d375d730909281359k405ce4b1ubca0f186f70200d0@mail.gmail.com> Message-ID: On Mon, Sep 28, 2009 at 2:59 PM, Robert Kern wrote: > On Mon, Sep 28, 2009 at 15:46, Charles R Harris > wrote: > > > The more basic problem here is making poly1d look like an array, which it > > isn't. The array bit is an implementation detail and would be private in > > C++. with an as_array method to retrieve the details if wanted. > > I'm pretty sure that it is an intentional public API and not an > implementation detail. The __array__() method is not "making poly1d > look like an array"; it is the standard name for such as_array() > conversion methods. > > Exactly, and that is why it is a design decision error. It *shouldn't* work with as_array unless it is *an array*, which it isn't. Really In [19]: sin(poly1d([1,2,3])) Out[19]: array([ 0.84147098, 0.90929743, 0.14112001]) That makes no sense. On the other hand, it is difficult to make arrays of poly1d, which does make sense because the polynomials are a commutative ring. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From pengyu.ut at gmail.com Mon Sep 28 17:27:06 2009 From: pengyu.ut at gmail.com (Peng Yu) Date: Mon, 28 Sep 2009 16:27:06 -0500 Subject: [Numpy-discussion] setup.py does not use the correct library In-Reply-To: <3d375d730909281253v7534453cxf1962c0fa783051@mail.gmail.com> References: <366c6f340909281231o2b99c498r462a44fa1e30d61c@mail.gmail.com> <3d375d730909281253v7534453cxf1962c0fa783051@mail.gmail.com> Message-ID: <366c6f340909281427l3ba4fdcfq97406f56b9b3b16c@mail.gmail.com> On Mon, Sep 28, 2009 at 2:53 PM, Robert Kern wrote: > On Mon, Sep 28, 2009 at 14:31, Peng Yu wrote: >> I use the following command to build numpy-1.3.0rc2. But it seems not >> able to find the appropriate library files. Can somebody let me know >> how to make it use the correct ones? 
>> >> #####command >> export LD_LBRARY_PATH= >> export CPPFLAGS="-I$HOME/utility/linux/opt/Python-2.6.2/include/python2.6" >> export LDFLAGS="-L$HOME/utility/linux/opt/Python-2.6.2/lib -lpython2.6" > > When compiling Fortran extensions, $LDFLAGS replaces every linker > flag, including things like -shared. Be sure you know what you are > doing when using this. > > But you really shouldn't have to be using $LDFLAGS or $CPPFLAGS like > this. Python should know where its libraries and include directories > are. My python is compiled with gcc-3.4.4. I used gcc-4.1.2 which generate the error in my previous email. Do I have to use the same compiler to compile numpy? Regards, Peng From robert.kern at gmail.com Mon Sep 28 17:29:59 2009 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 28 Sep 2009 16:29:59 -0500 Subject: [Numpy-discussion] setup.py does not use the correct library In-Reply-To: <366c6f340909281427l3ba4fdcfq97406f56b9b3b16c@mail.gmail.com> References: <366c6f340909281231o2b99c498r462a44fa1e30d61c@mail.gmail.com> <3d375d730909281253v7534453cxf1962c0fa783051@mail.gmail.com> <366c6f340909281427l3ba4fdcfq97406f56b9b3b16c@mail.gmail.com> Message-ID: <3d375d730909281429q5eeca4e2r161f76a69d0f215e@mail.gmail.com> On Mon, Sep 28, 2009 at 16:27, Peng Yu wrote: > On Mon, Sep 28, 2009 at 2:53 PM, Robert Kern wrote: >> On Mon, Sep 28, 2009 at 14:31, Peng Yu wrote: >>> I use the following command to build numpy-1.3.0rc2. But it seems not >>> able to find the appropriate library files. Can somebody let me know >>> how to make it use the correct ones? >>> >>> #####command >>> export LD_LBRARY_PATH= >>> export CPPFLAGS="-I$HOME/utility/linux/opt/Python-2.6.2/include/python2.6" >>> export LDFLAGS="-L$HOME/utility/linux/opt/Python-2.6.2/lib -lpython2.6" >> >> When compiling Fortran extensions, $LDFLAGS replaces every linker >> flag, including things like -shared. Be sure you know what you are >> doing when using this. >> >> But you really shouldn't have to be using $LDFLAGS or $CPPFLAGS like >> this. Python should know where its libraries and include directories >> are. > > My python is compiled with gcc-3.4.4. I used gcc-4.1.2 which generate > the error in my previous email. Do I have to use the same compiler to > compile numpy? It's a good idea, particularly with the jump from 3.4 to 4.1. However, the source of the error you saw is that you have defined $LDFLAGS. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From robert.kern at gmail.com Mon Sep 28 17:44:54 2009 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 28 Sep 2009 16:44:54 -0500 Subject: [Numpy-discussion] setup.py does not use the correct library In-Reply-To: <366c6f340909281440p22581237h6cf05b1ccfff23a9@mail.gmail.com> References: <366c6f340909281231o2b99c498r462a44fa1e30d61c@mail.gmail.com> <3d375d730909281253v7534453cxf1962c0fa783051@mail.gmail.com> <366c6f340909281427l3ba4fdcfq97406f56b9b3b16c@mail.gmail.com> <3d375d730909281429q5eeca4e2r161f76a69d0f215e@mail.gmail.com> <366c6f340909281440p22581237h6cf05b1ccfff23a9@mail.gmail.com> Message-ID: <3d375d730909281444k26078932s556c23ed50ea45b@mail.gmail.com> On Mon, Sep 28, 2009 at 16:40, Peng Yu wrote: > I attached the script that I run for build and the build output. I > think that setup.py doesn't use the correct python library. But I'm > not sure why. Would you please help me figure out what the problem is? 
Setting $LDFLAGS to be empty is also incorrect. Simply do not set $LDFLAGS or $CPPFLAGS at all. [And please do not Cc: me. I read the list.] -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From jah.mailinglist at gmail.com Mon Sep 28 19:19:24 2009 From: jah.mailinglist at gmail.com (jah) Date: Mon, 28 Sep 2009 16:19:24 -0700 Subject: [Numpy-discussion] Convert data into rectangular grid Message-ID: Hi, Suppose I have a set of x,y,c data (something useful for matplotlib.pyplot.plot() ). Generally, this data is not rectangular at all. Does there exist a numpy function (or set of functions) which will take this data and construct the smallest two-dimensional arrays X,Y,C ( suitable for matplotlib.pyplot.contour() ). Essentially, I want to pass in the data and a grid step size in the x- and y-directions. The function would average the c-values for all points which land in any particular square. Optionally, I'd like to be able to specify a value to use when there are no points in x,y which are in the square. Hope this makes sense. -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Mon Sep 28 19:48:46 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 28 Sep 2009 19:48:46 -0400 Subject: [Numpy-discussion] Convert data into rectangular grid In-Reply-To: References: Message-ID: <1cd32cbb0909281648h4ea0e7b7wb099700f589d92e0@mail.gmail.com> On Mon, Sep 28, 2009 at 7:19 PM, jah wrote: > Hi, > > Suppose I have a set of x,y,c data (something useful for > matplotlib.pyplot.plot() ).? Generally, this data is not rectangular at > all.? Does there exist a numpy function (or set of functions) which will > take this data and construct the smallest two-dimensional arrays X,Y,C ( > suitable for matplotlib.pyplot.contour() ). > > Essentially, I want to pass in the data and a grid step size in the x- and > y-directions.? The function would average the c-values for all points which > land in any particular square.? Optionally, I'd like to be able to specify a > value to use when there are no points in x,y which are in the square. > > Hope this makes sense. If I understand correctly numpy.histogram2d(x, y, ..., weights=c) might do what you want. There was a recent thread on its usage. 
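Something like this, roughly (untested sketch; the bin edges and the NaN fill value are just placeholders to adapt):

import numpy as np

x = np.random.uniform(0, 10, 1000)        # scattered points
y = np.random.uniform(0, 5, 1000)
c = np.sin(x) + y                         # value at each point

xedges = np.linspace(0, 10, 21)           # x step of 0.5
yedges = np.linspace(0, 5, 21)            # y step of 0.25

sums, xe, ye = np.histogram2d(x, y, bins=(xedges, yedges), weights=c)
counts, xe, ye = np.histogram2d(x, y, bins=(xedges, yedges))

C = sums / np.maximum(counts, 1)          # average of c per cell
C[counts == 0] = np.nan                   # cells with no points

xcenters = 0.5 * (xedges[:-1] + xedges[1:])
ycenters = 0.5 * (yedges[:-1] + yedges[1:])
# histogram2d puts x along axis 0, so pass C.T to contour(xcenters, ycenters, ...)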
Josef > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From pengyu.ut at gmail.com Mon Sep 28 20:35:39 2009 From: pengyu.ut at gmail.com (Peng Yu) Date: Mon, 28 Sep 2009 19:35:39 -0500 Subject: [Numpy-discussion] setup.py does not use the correct library In-Reply-To: <366c6f340909281505h3717ac56r1e0e31b521ef5ed2@mail.gmail.com> References: <366c6f340909281231o2b99c498r462a44fa1e30d61c@mail.gmail.com> <3d375d730909281253v7534453cxf1962c0fa783051@mail.gmail.com> <366c6f340909281427l3ba4fdcfq97406f56b9b3b16c@mail.gmail.com> <3d375d730909281429q5eeca4e2r161f76a69d0f215e@mail.gmail.com> <366c6f340909281440p22581237h6cf05b1ccfff23a9@mail.gmail.com> <3d375d730909281444k26078932s556c23ed50ea45b@mail.gmail.com> <366c6f340909281502s360954cdp32f2494aef86fd24@mail.gmail.com> <366c6f340909281505h3717ac56r1e0e31b521ef5ed2@mail.gmail.com> Message-ID: <366c6f340909281735j480081bbvd178ebd8550ac8ed@mail.gmail.com> On Mon, Sep 28, 2009 at 4:44 PM, Robert Kern wrote: > On Mon, Sep 28, 2009 at 16:40, Peng Yu wrote: >> I attached the script that I run for build and the build output. I >> think that setup.py doesn't use the correct python library. But I'm >> not sure why. Would you please help me figure out what the problem is? > > Setting $LDFLAGS to be empty is also incorrect. Simply do not set > $LDFLAGS or $CPPFLAGS at all. > > [And please do not Cc: me. I read the list.] Even if I don't set $LDFLAGS in my build script, I still have LDFLAGS set in my ~/.bash_profile. Here is the my build script and the build output. It still gives me errors. What is wrong with it? $ cat ~/notes/install/numpy/build #!/usr/bin/env bash #export LD_LBRARY_PATH= #export CPPFLAGS="-I$HOME/utility/linux/opt/Python-2.6.2/include/python2.6" #export LDFLAGS="-L$HOME/utility/linux/opt/Python-2.6.2/lib -lpython2.6" #export CPPFLAGS= #export LDFLAGS= export PATH=$HOME/.bin/misc:$HOME/utility/linux/opt/Python-2.6.2/bin:$HOME/utility/linux/opt/gcc-4.3.4/bin:/usr/local/bin:/usr/bin:/bin #echo $PATH gcc --version python --version python setup.py build --fcompiler=gnu95 | tee build_output ; sendmail.sh build_output $ tail -20 build_output /home/pengy/download/linux/python/numpy-1.3.0rc2/build/src.linux-x86_64-2.6/numpy/core/include/numpy/__multiarray_api.h:1007: undefined reference to `PyExc_RuntimeError' /home/pengy/download/linux/python/numpy-1.3.0rc2/build/src.linux-x86_64-2.6/numpy/core/include/numpy/__multiarray_api.h:1007: undefined reference to `PyErr_Format' build/temp.linux-x86_64-2.6/numpy/linalg/lapack_litemodule.o: In function `initlapack_lite': /home/pengy/download/linux/python/numpy-1.3.0rc2/numpy/linalg/lapack_litemodule.c:830: undefined reference to `PyErr_Print' /home/pengy/download/linux/python/numpy-1.3.0rc2/numpy/linalg/lapack_litemodule.c:830: undefined reference to `PyExc_ImportError' build/temp.linux-x86_64-2.6/numpy/linalg/lapack_litemodule.o: In function `_import_array': /home/pengy/download/linux/python/numpy-1.3.0rc2/build/src.linux-x86_64-2.6/numpy/core/include/numpy/__multiarray_api.h:977: undefined reference to `PyCObject_AsVoidPtr' /home/pengy/download/linux/python/numpy-1.3.0rc2/build/src.linux-x86_64-2.6/numpy/core/include/numpy/__multiarray_api.h:984: undefined reference to `PyExc_RuntimeError' /home/pengy/download/linux/python/numpy-1.3.0rc2/build/src.linux-x86_64-2.6/numpy/core/include/numpy/__multiarray_api.h:984: undefined reference to `PyErr_Format' 
/home/pengy/download/linux/python/numpy-1.3.0rc2/build/src.linux-x86_64-2.6/numpy/core/include/numpy/__multiarray_api.h:996: undefined reference to `PyExc_RuntimeError' /home/pengy/download/linux/python/numpy-1.3.0rc2/build/src.linux-x86_64-2.6/numpy/core/include/numpy/__multiarray_api.h:996: undefined reference to `PyErr_Format' build/temp.linux-x86_64-2.6/numpy/linalg/lapack_litemodule.o: In function `initlapack_lite': /home/pengy/download/linux/python/numpy-1.3.0rc2/numpy/linalg/lapack_litemodule.c:833: undefined reference to `PyDict_SetItemString' /home/pengy/download/linux/python/numpy-1.3.0rc2/numpy/linalg/lapack_litemodule.c:830: undefined reference to `PyErr_SetString' build/temp.linux-x86_64-2.6/numpy/linalg/python_xerbla.o: In function `xerbla_': /home/pengy/download/linux/python/numpy-1.3.0rc2/numpy/linalg/python_xerbla.c:35: undefined reference to `PyExc_ValueError' /home/pengy/download/linux/python/numpy-1.3.0rc2/numpy/linalg/python_xerbla.c:35: undefined reference to `PyErr_SetString' /home/pengy/utility/linux/opt/gcc-4.3.4/lib/gcc/x86_64-unknown-linux-gnu/4.3.4/libgfortranbegin.a(fmain.o): In function `main': /home/pengy/download/linux/gcc-4.3.4-build-20090905-220719/x86_64-unknown-linux-gnu/libgfortran/../../../gcc-4.3.4/libgfortran/fmain.c:21: undefined reference to `MAIN__' collect2: ld returned 1 exit status Regards, Peng From robert.kern at gmail.com Mon Sep 28 20:38:40 2009 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 28 Sep 2009 19:38:40 -0500 Subject: [Numpy-discussion] setup.py does not use the correct library In-Reply-To: <366c6f340909281735j480081bbvd178ebd8550ac8ed@mail.gmail.com> References: <366c6f340909281231o2b99c498r462a44fa1e30d61c@mail.gmail.com> <3d375d730909281253v7534453cxf1962c0fa783051@mail.gmail.com> <366c6f340909281427l3ba4fdcfq97406f56b9b3b16c@mail.gmail.com> <3d375d730909281429q5eeca4e2r161f76a69d0f215e@mail.gmail.com> <366c6f340909281440p22581237h6cf05b1ccfff23a9@mail.gmail.com> <3d375d730909281444k26078932s556c23ed50ea45b@mail.gmail.com> <366c6f340909281502s360954cdp32f2494aef86fd24@mail.gmail.com> <366c6f340909281505h3717ac56r1e0e31b521ef5ed2@mail.gmail.com> <366c6f340909281735j480081bbvd178ebd8550ac8ed@mail.gmail.com> Message-ID: <3d375d730909281738ka2b645ap9cf5f7be076d022@mail.gmail.com> On Mon, Sep 28, 2009 at 19:35, Peng Yu wrote: > On Mon, Sep 28, 2009 at 4:44 PM, Robert Kern wrote: >> On Mon, Sep 28, 2009 at 16:40, Peng Yu wrote: >>> I attached the script that I run for build and the build output. I >>> think that setup.py doesn't use the correct python library. But I'm >>> not sure why. Would you please help me figure out what the problem is? >> >> Setting $LDFLAGS to be empty is also incorrect. Simply do not set >> $LDFLAGS or $CPPFLAGS at all. >> >> [And please do not Cc: me. I read the list.] > > Even if I don't set $LDFLAGS in my build script, I still have LDFLAGS > set in my ~/.bash_profile. Then unset it: unset CPPFLAGS unset LDFLAGS -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From jah.mailinglist at gmail.com Mon Sep 28 20:45:15 2009 From: jah.mailinglist at gmail.com (jah) Date: Mon, 28 Sep 2009 17:45:15 -0700 Subject: [Numpy-discussion] Convert data into rectangular grid In-Reply-To: <1cd32cbb0909281648h4ea0e7b7wb099700f589d92e0@mail.gmail.com> References: <1cd32cbb0909281648h4ea0e7b7wb099700f589d92e0@mail.gmail.com> Message-ID: On Mon, Sep 28, 2009 at 4:48 PM, wrote: > On Mon, Sep 28, 2009 at 7:19 PM, jah wrote: > > Hi, > > > > Suppose I have a set of x,y,c data (something useful for > > matplotlib.pyplot.plot() ). Generally, this data is not rectangular at > > all. Does there exist a numpy function (or set of functions) which will > > take this data and construct the smallest two-dimensional arrays X,Y,C ( > > suitable for matplotlib.pyplot.contour() ). > > > > Essentially, I want to pass in the data and a grid step size in the x- > and > > y-directions. The function would average the c-values for all points > which > > land in any particular square. Optionally, I'd like to be able to > specify a > > value to use when there are no points in x,y which are in the square. > > > > Hope this makes sense. > > If I understand correctly numpy.histogram2d(x, y, ..., weights=c) might do > what you want. > > There was a recent thread on its usage. > It is very close, but it normed=True, will first normalize the weights (undesirably) and then it will normalize the normalized weights by dividing by the cell area. Instead, what I want is the cell value to be the average off all the points that were placed in the cell. This seems like a common use case, so I'm guessing this functionality is present already. So if 3 points with weights [10,20,30] were placed in cell (i,j), then the cell should have value 20 (the arithmetic mean of the points placed in the cell). Here is the desired use case: I have a set of x,y,c values that I could pass into matplotlib's scatter() or hexbin(). I'd like to take this same set of points and transform them so that I can pass them into matplotlib's contour() function. Perhaps matplotlib has a function which does this. -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Mon Sep 28 20:49:21 2009 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 28 Sep 2009 19:49:21 -0500 Subject: [Numpy-discussion] Convert data into rectangular grid In-Reply-To: References: <1cd32cbb0909281648h4ea0e7b7wb099700f589d92e0@mail.gmail.com> Message-ID: <3d375d730909281749v5a4e0432ne734e70dce5f1e07@mail.gmail.com> On Mon, Sep 28, 2009 at 19:45, jah wrote: > Here is the desired use case:? I have a set of x,y,c values that I could > pass into matplotlib's scatter() or hexbin().?? I'd like to take this same > set of points and transform them so that I can pass them into matplotlib's > contour() function.? Perhaps matplotlib has a function which does this. http://matplotlib.sourceforge.net/api/mlab_api.html#matplotlib.mlab.griddata -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From Michael.Walker at sophia.inria.fr Tue Sep 29 05:08:42 2009 From: Michael.Walker at sophia.inria.fr (Michael.Walker at sophia.inria.fr) Date: Tue, 29 Sep 2009 11:08:42 +0200 (CEST) Subject: [Numpy-discussion] numpy.numarray.transpose() In-Reply-To: <4AC0C84A.7030601@gmail.com> References: <1d1e6ea70909250239u34027854l2dde58e46b0934d4@mail.gmail.com> <1d1e6ea70909250355h2a036d21w799e8d87a92715d4@mail.gmail.com> <91cf711d0909250614o158e121ej40399a5b92278884@mail.gmail.com> <1d1e6ea70909250659q706c394fg470d5278fb97549d@mail.gmail.com> <57516.194.167.194.27.1254125267.squirrel@imap-sop.inria.fr> <4AC0C84A.7030601@gmail.com> Message-ID: <45618.194.167.194.27.1254215322.squirrel@imap-sop.inria.fr> > On 09/28/2009 03:15 AM, Pauli Virtanen wrote: >> Mon, 28 Sep 2009 10:07:47 +0200, Michael.Walker wrote: >> [clip] >> >>> In [7]: f = f.transpose() >>> >>> In [8]: print f >>> [[1 3] >>> [2 4]] >>> >>> as expected. I mention this because I think that it is worth knowing >>> having lost a LOT of time to it. Is it worth filing as a bug report? >>> > This is not a bug! This specific difference between numpy and numarray > is documented on the 'converting from numarray' page: > http://www.scipy.org/Converting_from_numarray I am referring to the behaviour of numpy.numarray.transpose() being that of numpy.transpose() instead of numarray.transpose. One expects that numpy.numarray would function as numarray, for the purpose of backwards compatability. So, is this worth filing a bug report for? Michael Walker Plant Modelling Group CIRAD, Montpellier 04 67 61 57 27 From pav+sp at iki.fi Tue Sep 29 05:25:04 2009 From: pav+sp at iki.fi (Pauli Virtanen) Date: Tue, 29 Sep 2009 09:25:04 +0000 (UTC) Subject: [Numpy-discussion] numpy.numarray.transpose() References: <1d1e6ea70909250239u34027854l2dde58e46b0934d4@mail.gmail.com> <1d1e6ea70909250355h2a036d21w799e8d87a92715d4@mail.gmail.com> <91cf711d0909250614o158e121ej40399a5b92278884@mail.gmail.com> <1d1e6ea70909250659q706c394fg470d5278fb97549d@mail.gmail.com> <57516.194.167.194.27.1254125267.squirrel@imap-sop.inria.fr> <4AC0C84A.7030601@gmail.com> <45618.194.167.194.27.1254215322.squirrel@imap-sop.inria.fr> Message-ID: Tue, 29 Sep 2009 11:08:42 +0200, Michael.Walker wrote: [clip] > I am referring to the behaviour of numpy.numarray.transpose() being that > of numpy.transpose() instead of numarray.transpose. One expects that You probably mean the transpose methods numpy.numarray.ndarray.transpose and numarray.ndarray.transpose. The transpose functions function identically. > numpy.numarray would function as numarray, for the purpose of backwards > compatability. So, is this worth filing a bug report for? There is no harm in creating a bug ticket. This is at least a documentation issue even in the case the main problem becomes a wontfix. -- Pauli Virtanen From silva at lma.cnrs-mrs.fr Tue Sep 29 07:47:33 2009 From: silva at lma.cnrs-mrs.fr (Fabrice Silva) Date: Tue, 29 Sep 2009 13:47:33 +0200 Subject: [Numpy-discussion] The problem with zero dimesnsional array In-Reply-To: <703777c60909281402t5526b6c5k6cf6156ca5fcb945@mail.gmail.com> References: <703777c60909281402t5526b6c5k6cf6156ca5fcb945@mail.gmail.com> Message-ID: <1254224853.7612.6.camel@localhost.localdomain> Le mardi 29 septembre 2009 ? 02:32 +0530, yogesh karpate a ?crit : > Dear All, > I'm facing a bog problem in following . 
the code > snippet is as follows > #################### % Compute the area > indicator################################### > for kT in range(leftbound,rightbound): > # Here the left bound and rightbound both are indexing array is > cutlevel = sum(s[(kT-ptwin):(kT+ptwin)],0)/(ptwin*2+1) > corsig = s[(kT-swin+1):kT]-cutlevel > areavalue1 =sum((corsig),0) > #print areavalue.size > print leftbound, rightbound > Tval=areavalue1[leftbound:rightbound] > Everything works fine till areavalue1, then whenever I try to access > the Tval=areavalue1[leftbound:rightbound] > it says IndexError: invalid index to scalar variable.. > When i try to access areavalue1[0] it gives me entire array but for > areavalue1[2:8]..it gives the same error . > Thanx in advance.. > Regards Could you please check the shape of areavalue : >>> print type(areavalue) >>> print areavalue.shape Make sure areavalue is not a list or a tuple with the array you are interested in as the first element. If it is such a case, first extract the array or replace areavalue by areavalue[0] in your snipplet. -- Fabrice Silva LMA UPR CNRS 7051 PS : you need not to personally email your request. I saw your message on the mailing list, but I had no time to answer... From ndbecker2 at gmail.com Tue Sep 29 08:52:55 2009 From: ndbecker2 at gmail.com (Neal Becker) Date: Tue, 29 Sep 2009 08:52:55 -0400 Subject: [Numpy-discussion] fixed_pt some progress and a question Message-ID: I'm starting with a pure python implementation and have some progress. AFAICT, the only approach is to subclass ndarray and add the properties and behaviors I need. I ran into one issue though. In my function 'as_double', I need to get to the underlying 'int' array to pass to ldexp. I tried using view, but it silently fails: In [40]: obj Out[40]: fixed_pt_array([ 0, 32, 64, 96, 128]) In [41]: obj.view (int) Out[41]: fixed_pt_array([ 0, 32, 64, 96, 128]) How can I get at the underlying int array to pass to ldexp? Here is the prototype code (far from complete!) 
import numpy as np def rnd (x, frac_bits, _max): x1 = x >> (frac_bits-1) if (x1 == _max): return x1 >> 1 else: return (x1+1) >> 1 def shift_left (x, bits): return x << bits def shift_right (x, bits): return x >> bits def shiftup_or_rnddn (x, bits, _max, rnd_policy): if (bits > 0): return shift_left (x, bits) elif (bits < 0): return rnd_policy (x, -bits, _max) else: return x def clip (x, _min, _max): if x > _max: return _max elif x < _min: return _min else: return x class fixed_pt (object): def get_max(self): if self.is_signed: return (~(self.base_type(-1) << (self.total_bits-1))) else: return (~(self.base_type (-1) << self.total_bits)) def get_min(self): if self.is_signed: return ((self.base_type(-1) << (self.total_bits-1))) else: return 0 def __init__ (self, int_bits, frac_bits, val, scale=True, base_type=int, rnd_policy=rnd, overflow_policy=clip, is_signed=True): self.is_signed = is_signed self.int_bits = int_bits self.frac_bits = frac_bits self.base_type = base_type self.total_bits = int_bits + frac_bits self.rnd_policy = rnd_policy self._max = self.get_max () self._min = self.get_min () self.overflow_policy = overflow_policy if scale: self.val = self.overflow_policy (self.base_type (shiftup_or_rnddn (val, frac_bits, self._max, self.rnd_policy)), self._min, self._max) def as_double (self): return np.ldexp (self.val, -self.frac_bits) def as_base (self): return shiftup_or_rnddn (self.val, -self.frac_bits, self._max, self.rnd_policy) def __repr__(self): return "[%s <%s,%s>]" % (self.val, self.int_bits, self.frac_bits) def get_max(is_signed, base_type, total_bits): if is_signed: return (~(base_type(-1) << (total_bits-1))) else: return (~(base_type (-1) << total_bits)) def get_min(is_signed, base_type, total_bits): if is_signed: return ((base_type(-1) << (total_bits-1))) else: return 0 class fixed_pt_array(np.ndarray): def __new__(cls, input_array, int_bits, frac_bits, scale=True, base_type=int, rnd_policy=rnd, overflow_policy=clip, is_signed=True): # Input array is an already formed ndarray instance # We first cast to be our class type obj = np.asarray(input_array, dtype=base_type).view(cls) # add the new attribute to the created instance obj.int_bits = int_bits obj.frac_bits = frac_bits obj.rnd_policy = rnd_policy obj.overflow_policy = overflow_policy obj.is_signed = is_signed obj.scale = scale obj.base_type = base_type obj.total_bits = int_bits + frac_bits obj._max = get_max(is_signed, base_type, obj.total_bits) obj._min = get_min(is_signed, base_type, obj.total_bits) if scale: def _scale (val): return overflow_policy (base_type (shiftup_or_rnddn (val, frac_bits, obj._max, rnd_policy)), obj._min, obj._max) vecfunc = np.vectorize (_scale) obj = vecfunc (obj) # Finally, we must return the newly created object: return obj def __array_finalize__(self,obj): # reset the attribute from passed original object if hasattr (obj, 'int_bits'): self.int_bits = obj.int_bits self.frac_bits = obj.frac_bits self.is_signed = obj.is_signed self.base_type = obj.base_type self.total_bits = obj.total_bits self._max = obj._max self._min = obj._min ## self._max = get_max(self.is_signed, self.base_type, self.total_bits) ## self._min = get_min(self.is_signed, self.base_type, self.total_bits) # We do not need to return anything ## def __getitem__ (self, index): ## return fp.fixed_pt_int64_clip (self.int_bits, self.frac_bits, int(np.ndarray.__getitem__(self, index))) def as_double (self): def _as_double (x): print type(x) return np.ldexp (x, -self.frac_bits) return _as_double (self.view (self.base_type)) fp = 
fixed_pt (5, 5, 1) arr = np.arange(5,dtype=int) obj = fixed_pt_array(arr, int_bits=5, frac_bits=5) print type(obj) From ndbecker2 at gmail.com Tue Sep 29 10:22:56 2009 From: ndbecker2 at gmail.com (Neal Becker) Date: Tue, 29 Sep 2009 10:22:56 -0400 Subject: [Numpy-discussion] fixed_pt some progress and a question References: Message-ID: This doesn't work either: def as_double (self): import math def _as_double_1 (x): return math.ldexp (x, -self.frac_bits) vecfunc = np.vectorize (_as_double_1, otypes=[np.float]) return vecfunc (self) In [49]: obj.as_double() Out[49]: fixed_pt_array([ 0., 1., 2., 3., 4.]) The values are correct, but I wanted a float array, not fixed_pt_array for output. From josef.pktd at gmail.com Tue Sep 29 10:51:00 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 29 Sep 2009 10:51:00 -0400 Subject: [Numpy-discussion] fixed_pt some progress and a question In-Reply-To: References: Message-ID: <1cd32cbb0909290751v41511f27h5347d40126c3e044@mail.gmail.com> On Tue, Sep 29, 2009 at 10:22 AM, Neal Becker wrote: > This doesn't work either: > > ? ?def as_double (self): > ? ? ? ?import math > ? ? ? ?def _as_double_1 (x): > ? ? ? ? ? ?return math.ldexp (x, -self.frac_bits) > ? ? ? ?vecfunc = np.vectorize (_as_double_1, otypes=[np.float]) > ? ? ? ?return vecfunc (self) > > In [49]: obj.as_double() > Out[49]: fixed_pt_array([ 0., ?1., ?2., ?3., ?4.]) > > The values are correct, but I wanted a float array, not fixed_pt_array for > output. > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > I don't understand much, but if you just want to convert to a regular float array, you can just create a new array with np.array in as_double, or not? Josef >>> np.array(np.ldexp(obj, -obj.frac_bits)) array([ 0., 1., 2., 3., 4.]) >>> np.array(np.ldexp(obj, -obj.frac_bits)).dtype dtype('float64') >>> type(np.array(np.ldexp(obj, -obj.frac_bits))) From ndbecker2 at gmail.com Tue Sep 29 11:10:09 2009 From: ndbecker2 at gmail.com (Neal Becker) Date: Tue, 29 Sep 2009 11:10:09 -0400 Subject: [Numpy-discussion] fixed_pt some progress and a question References: <1cd32cbb0909290751v41511f27h5347d40126c3e044@mail.gmail.com> Message-ID: josef.pktd at gmail.com wrote: > On Tue, Sep 29, 2009 at 10:22 AM, Neal Becker wrote: >> This doesn't work either: >> >> def as_double (self): >> import math >> def _as_double_1 (x): >> return math.ldexp (x, -self.frac_bits) >> vecfunc = np.vectorize (_as_double_1, otypes=[np.float]) >> return vecfunc (self) >> >> In [49]: obj.as_double() >> Out[49]: fixed_pt_array([ 0., 1., 2., 3., 4.]) >> >> The values are correct, but I wanted a float array, not fixed_pt_array >> for output. >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > I don't understand much, but if you just want to convert to a regular > float array, you can just create a new array with np.array in > as_double, or not? > I could force an additional conversion using np.array (xxx, dtype=float). Seems wasteful. The bigger question I have is, if I've subclassed an array, how can I get at the underlying array type? In this example, fixed_pt is really an 'int64' array. But I can't find any way to get that. Any function such as 'view' just silently does nothing. It always returns the fixed_pt_array class. 
Consider the fixed_pt_array method as_base(). It should return the underlying int array. How could I do this? From robert.kern at gmail.com Tue Sep 29 11:22:39 2009 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 29 Sep 2009 10:22:39 -0500 Subject: [Numpy-discussion] fixed_pt some progress and a question In-Reply-To: References: <1cd32cbb0909290751v41511f27h5347d40126c3e044@mail.gmail.com> Message-ID: <3d375d730909290822n6244ee0er1dea74e2c8cbca22@mail.gmail.com> On Tue, Sep 29, 2009 at 10:10, Neal Becker wrote: > I could force an additional conversion using np.array (xxx, dtype=float). > Seems wasteful. np.asarray() will not be wasteful. > The bigger question I have is, if I've subclassed an array, how can I get at > the underlying array type? x.view(np.ndarray) The .view() method serves two purposes: to create a view onto the same memory with a new dtype but also to create a view using a different ndarray subclass. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From Chris.Barker at noaa.gov Tue Sep 29 12:37:14 2009 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Tue, 29 Sep 2009 09:37:14 -0700 Subject: [Numpy-discussion] Question about improving genfromtxt errors In-Reply-To: References: <4ABD05D6.6020706@gmail.com> <4D357173-C408-4AF4-8FFD-C77363915D13@gmail.com> <4ABD160E.1060503@noaa.gov> <57D14F44-04C0-4A50-B49D-E9CA2094F165@gmail.com> <4ABD1F3B.9040507@noaa.gov> <4AC0E740.60309@noaa.gov> Message-ID: <4AC237BA.9050104@noaa.gov> Pierre GM wrote: > I was thinking about something this week-end: we could create a second > list when looping on the rows, where we would store the length of each > splitted row. After the loop, we can find if these values don't match > the expected number of columns `nbcols` and where. Then, we can decide > to strip the `rows` list of its invalid values (that corresponds to > skipping) or raise an exception, but in both cases we know where the > problem is. > My only concern is that we'd be creating yet another list of integers, > which would increase memory usage. Would it be a problem ? I doubt it would be that big deal, however... Skipper Seabold wrote: > One of the datasets I > was working with was about a million lines with about 500 columns in > each. In this use case, it's clearly not a big deal, but it's probably pretty common for folks to have data sets with a smaller number of columns, maybe even two or so (I know I do sometimes). In that case, I suppose we're increasing memory usage by 50% or s, which may be an issue. Another idea: only store the indexes of the rows that have the "wrong" number of columns -- if that's a large number, then then user has bigger problems than memory usage! > I can't think of a case where I would want to just skip bad rows. I can't either, but someone suggested it. It certainly shouldn't happen by default or without a big ol' message of some sort to the user's code. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From sccolbert at gmail.com Tue Sep 29 12:47:05 2009 From: sccolbert at gmail.com (Chris Colbert) Date: Tue, 29 Sep 2009 18:47:05 +0200 Subject: [Numpy-discussion] where does numpy get its pow function? 
Message-ID: <7f014ea60909290947r4e4b302chdb8d2c8ab61e75db@mail.gmail.com> Does numpy use pow from math.h or something else? I seem to be having a problem with slow pow under gcc when building an extension, but it's not affecting numpy. So if numpy uses that, then there is something else i'm missing. Cheers! Chris From Chris.Barker at noaa.gov Tue Sep 29 12:53:40 2009 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Tue, 29 Sep 2009 09:53:40 -0700 Subject: [Numpy-discussion] making the distinction between -0.0 and 0.0.. Message-ID: <4AC23B94.5010604@noaa.gov> Hi folks, This isn't really a numpy question, and I'm doing this with regular old python, but I figure you are the folks that would know this: How do I get python to make a distinction between -0.0 and 0.0? IN this case, I'm starting with user input, so: In [3]: float("-0.0") Out[3]: -0.0 so python seems to preserve the "-". But: In [12]: float("-0.0") == float("0.0") Out[12]: True In [13]: float("-0.0") < float("0.0") Out[13]: False In [14]: float("0.0") > float("-0.0") Out[14]: False It doesn't seem to make the distinction between -0.0 and 0.0 in any of the comparisons. How can I identify -0.0? NOTE: numpy behaves the same way, which I think it should, but still... My back-up plan is to process the string first, looking for the minus sign, but that will require more changes than I'd like to the rest of my code... thanks, -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From robert.kern at gmail.com Tue Sep 29 12:54:46 2009 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 29 Sep 2009 11:54:46 -0500 Subject: [Numpy-discussion] where does numpy get its pow function? In-Reply-To: <7f014ea60909290947r4e4b302chdb8d2c8ab61e75db@mail.gmail.com> References: <7f014ea60909290947r4e4b302chdb8d2c8ab61e75db@mail.gmail.com> Message-ID: <3d375d730909290954u12244e40wf9cbe5820ec0e76f@mail.gmail.com> On Tue, Sep 29, 2009 at 11:47, Chris Colbert wrote: > Does numpy use pow from math.h or something else? Yes. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From pav+sp at iki.fi Tue Sep 29 12:56:54 2009 From: pav+sp at iki.fi (Pauli Virtanen) Date: Tue, 29 Sep 2009 16:56:54 +0000 (UTC) Subject: [Numpy-discussion] making the distinction between -0.0 and 0.0.. References: <4AC23B94.5010604@noaa.gov> Message-ID: Tue, 29 Sep 2009 09:53:40 -0700, Christopher Barker wrote: [clip] > How can I identify -0.0? signbit -- Pauli Virtanen From ndbecker2 at gmail.com Tue Sep 29 13:00:34 2009 From: ndbecker2 at gmail.com (Neal Becker) Date: Tue, 29 Sep 2009 13:00:34 -0400 Subject: [Numpy-discussion] __array_wrap__ Message-ID: fixed_pt arrays need to apply the overflow_policy after operations (overflow_policy could be clip, or throw exception). I thought __array_wrap__ would work for this, but it seems to not be called when I need it. For example: In [13]: obj Out[13]: fixed_pt_array([ 0, 32, 64, 96, 128]) In [14]: obj*100 < this should overflow enter: [ 0 32 64 96 128] << on entry into __array_wrap enter: [0 32 64 96 128] exit: [ 0 32 64 96 128] Out[14]: fixed_pt_array([ 0, 3200, 6400, 9600, 12800]) Apparantly, obj*100 is never passed to array_wrap. Is there another way I can do this? 
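For reference, the usual shape of a clipping __array_wrap__ is something like the sketch below (untested, with made-up names and a made-up 10-bit range; whether it actually fires for every operation you care about is exactly the question here):

import numpy as np

class FPArray(np.ndarray):
    """Toy subclass that clips ufunc results into a fixed range."""
    _min, _max = -(1 << 9), (1 << 9) - 1      # pretend 10-bit signed range

    def __array_wrap__(self, out_arr, context=None):
        # clip the raw result in place through a plain-ndarray view,
        # then let the base class re-wrap it as FPArray
        base = out_arr.view(np.ndarray)
        np.clip(base, self._min, self._max, out=base)
        return np.ndarray.__array_wrap__(self, out_arr, context)

obj = np.arange(5, dtype=np.int64).view(FPArray) * 32
print obj         # -> [  0  32  64  96 128]
print obj * 100   # -> [  0 511 511 511 511] instead of 3200, 6400, ...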
From sccolbert at gmail.com Tue Sep 29 13:01:50 2009 From: sccolbert at gmail.com (Chris Colbert) Date: Tue, 29 Sep 2009 19:01:50 +0200 Subject: [Numpy-discussion] where does numpy get its pow function? In-Reply-To: <3d375d730909290954u12244e40wf9cbe5820ec0e76f@mail.gmail.com> References: <7f014ea60909290947r4e4b302chdb8d2c8ab61e75db@mail.gmail.com> <3d375d730909290954u12244e40wf9cbe5820ec0e76f@mail.gmail.com> Message-ID: <7f014ea60909291001g6993c2dbg8429f1b540eec2aa@mail.gmail.com> are there any particular optimization flags issued when building numpy aside from the following? -fwrapv -O2 On Tue, Sep 29, 2009 at 6:54 PM, Robert Kern wrote: > On Tue, Sep 29, 2009 at 11:47, Chris Colbert wrote: >> Does numpy use pow from math.h or something else? > > Yes. > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > ?-- Umberto Eco > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From Chris.Barker at noaa.gov Tue Sep 29 13:04:09 2009 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Tue, 29 Sep 2009 10:04:09 -0700 Subject: [Numpy-discussion] making the distinction between -0.0 and 0.0.. In-Reply-To: References: <4AC23B94.5010604@noaa.gov> Message-ID: <4AC23E09.2040302@noaa.gov> Pauli Virtanen wrote: > Tue, 29 Sep 2009 09:53:40 -0700, Christopher Barker wrote: > [clip] >> How can I identify -0.0? > > signbit > perfect for numpy, but at this point I don't have a numpy dependency (very unusual for my code!). Anyone know a pure-python way to get it? It seems I should be able to do something like: struct.pack("d",-3.4)[0] & Something but I'm not sure what "Something" is, and it would be endian-dependent, wouldn't it? thanks, -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From pgmdevlist at gmail.com Tue Sep 29 13:05:13 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Tue, 29 Sep 2009 13:05:13 -0400 Subject: [Numpy-discussion] Question about improving genfromtxt errors In-Reply-To: <4AC237BA.9050104@noaa.gov> References: <4ABD05D6.6020706@gmail.com> <4D357173-C408-4AF4-8FFD-C77363915D13@gmail.com> <4ABD160E.1060503@noaa.gov> <57D14F44-04C0-4A50-B49D-E9CA2094F165@gmail.com> <4ABD1F3B.9040507@noaa.gov> <4AC0E740.60309@noaa.gov> <4AC237BA.9050104@noaa.gov> Message-ID: On Sep 29, 2009, at 12:37 PM, Christopher Barker wrote: > Pierre GM wrote: > Another idea: only store the indexes of the rows that have the "wrong" > number of columns -- if that's a large number, then then user has > bigger > problems than memory usage! That was my first idea, but then it adds tests in the inside loop (which is what I'm trying to avoid)... > >> I can't think of a case where I would want to just skip bad rows. > > I can't either, but someone suggested it. It certainly shouldn't > happen > by default or without a big ol' message of some sort to the user's > code. That was my intention. OK, I should be able to start working on that in the next few days. Meanwhile, it'd be great if y'all could send me some test cases (so that I can find which method works best). Cheers P. 
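For what it's worth, the post-loop check being discussed boils down to something like this (illustrative only; `rows` and `nbcols` stand in for the corresponding variables inside genfromtxt, not the actual internals):

# rows: the already-split lines; nbcols: the expected number of columns
rows = [['1', '2', '3'], ['4', '5'], ['6', '7', '8'], ['9']]
nbcols = 3

lengths = [len(row) for row in rows]
bad = [i for (i, n) in enumerate(lengths) if n != nbcols]
if bad:
    errmsg = "\n".join("row #%d has %d columns instead of %d"
                       % (i, lengths[i], nbcols) for i in bad)
    # either raise, or drop the offending rows and keep going:
    # rows = [row for (i, row) in enumerate(rows) if i not in set(bad)]
    raise ValueError("Inconsistent number of columns:\n" + errmsg)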
From gokhansever at gmail.com Tue Sep 29 13:08:17 2009 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_Sever?=) Date: Tue, 29 Sep 2009 12:08:17 -0500 Subject: [Numpy-discussion] making the distinction between -0.0 and 0.0.. In-Reply-To: <4AC23B94.5010604@noaa.gov> References: <4AC23B94.5010604@noaa.gov> Message-ID: <49d6b3500909291008q69a096f1l64f153c247bd2e10@mail.gmail.com> On Tue, Sep 29, 2009 at 11:53 AM, Christopher Barker wrote: > Hi folks, > > This isn't really a numpy question, and I'm doing this with regular old > python, but I figure you are the folks that would know this: > > How do I get python to make a distinction between -0.0 and 0.0? IN this > case, I'm starting with user input, so: > > In [3]: float("-0.0") > Out[3]: -0.0 > > so python seems to preserve the "-". But: > > In [12]: float("-0.0") == float("0.0") > Out[12]: True > > In [13]: float("-0.0") < float("0.0") > Out[13]: False > > In [14]: float("0.0") > float("-0.0") > Out[14]: False > > It doesn't seem to make the distinction between -0.0 and 0.0 in any of > the comparisons. How can I identify -0.0? > > NOTE: numpy behaves the same way, which I think it should, but still... > > My back-up plan is to process the string first, looking for the minus > sign, but that will require more changes than I'd like to the rest of my > code... > > thanks, > -Chris > > > > -- > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > No help from the decimal module. Although its documentation ( http://docs.python.org/library/decimal.html) says: *A decimal number is immutable. It has a sign, coefficient digits, and an exponent. To preserve significance, the coefficient digits do not truncate trailing zeros. Decimals also include special values such as Infinity, -Infinity, and NaN. The standard also differentiates -0 from +0.* When I try: from decimal import * a = Decimal('+0.0') b = Decimal('-0.0') a == b True No help for you either I guess :) -- G?khan -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsouthey at gmail.com Tue Sep 29 13:17:34 2009 From: bsouthey at gmail.com (Bruce Southey) Date: Tue, 29 Sep 2009 12:17:34 -0500 Subject: [Numpy-discussion] making the distinction between -0.0 and 0.0.. In-Reply-To: <49d6b3500909291008q69a096f1l64f153c247bd2e10@mail.gmail.com> References: <4AC23B94.5010604@noaa.gov> <49d6b3500909291008q69a096f1l64f153c247bd2e10@mail.gmail.com> Message-ID: <4AC2412E.7040309@gmail.com> On 09/29/2009 12:08 PM, G?khan Sever wrote: > > > On Tue, Sep 29, 2009 at 11:53 AM, Christopher Barker > > wrote: > > Hi folks, > > This isn't really a numpy question, and I'm doing this with > regular old > python, but I figure you are the folks that would know this: > > How do I get python to make a distinction between -0.0 and 0.0? IN > this > case, I'm starting with user input, so: > > In [3]: float("-0.0") > Out[3]: -0.0 > > so python seems to preserve the "-". 
But: > > In [12]: float("-0.0") == float("0.0") > Out[12]: True > > In [13]: float("-0.0") < float("0.0") > Out[13]: False > > In [14]: float("0.0") > float("-0.0") > Out[14]: False > > It doesn't seem to make the distinction between -0.0 and 0.0 in any of > the comparisons. How can I identify -0.0? > > NOTE: numpy behaves the same way, which I think it should, but > still... > > My back-up plan is to process the string first, looking for the minus > sign, but that will require more changes than I'd like to the rest > of my > code... > > thanks, > -Chris > > > > -- > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > No help from the decimal module. Although its documentation > (http://docs.python.org/library/decimal.html) says: > > /A decimal number is immutable. It has a sign, coefficient digits, and > an exponent. To preserve significance, the coefficient digits do not > truncate trailing zeros. Decimals also include special values such as > Infinity, -Infinity, and NaN. The standard also differentiates -0 from > +0./ > > When I try: > > from decimal import * > a = Decimal('+0.0') > b = Decimal('-0.0') > > a == b > True > > No help for you either I guess :) > > -- > G?khan > ------------------------------------------------------------------------ > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > Numpy support IEEE floating point standard so you have definitions for negative zero (numpy.NZERO) and positive zero (numpy.PZERO): http://docs.scipy.org/numpy/docs/numpy.NZERO/ http://docs.scipy.org/numpy/docs/numpy.PZERO/ Bruce -------------- next part -------------- An HTML attachment was scrubbed... URL: From jkington at wisc.edu Tue Sep 29 13:19:53 2009 From: jkington at wisc.edu (Joe Kington) Date: Tue, 29 Sep 2009 12:19:53 -0500 Subject: [Numpy-discussion] making the distinction between -0.0 and 0.0.. In-Reply-To: <4AC23E09.2040302@noaa.gov> References: <4AC23B94.5010604@noaa.gov> <4AC23E09.2040302@noaa.gov> Message-ID: Well, this is messy, and nearly unreadable, but it should work and is pure python(and I think even be endian-independent). struct.unpack('b',struct.pack('>d', X)[0])[0] >= 0 (where X is the variable you want to test) In [54]: struct.unpack('b',struct.pack('>d',0.0)[0])[0] >= 0 Out[54]: True In [55]: struct.unpack('b',struct.pack('>d',-0.0)[0])[0] >= 0 Out[55]: False In [56]: struct.unpack('b',struct.pack('>d',-0.0000001)[0])[0] >= 0 Out[56]: False In [57]: struct.unpack('b',struct.pack('>d',0.0000001)[0])[0] >= 0 Out[57]: True In [58]: struct.unpack('b',struct.pack('>d',3999564.8763)[0])[0] >= 0 Out[58]: True In [59]: struct.unpack('b',struct.pack('>d',-3999564.8763)[0])[0] >= 0 Out[59]: False Hope that helps, anyway -Joe On Tue, Sep 29, 2009 at 12:04 PM, Christopher Barker wrote: > Pauli Virtanen wrote: > > Tue, 29 Sep 2009 09:53:40 -0700, Christopher Barker wrote: > > [clip] > >> How can I identify -0.0? > > > > signbit > > > > perfect for numpy, but at this point I don't have a numpy dependency > (very unusual for my code!). Anyone know a pure-python way to get it? 
> > It seems I should be able to do something like: > > struct.pack("d",-3.4)[0] & Something > > but I'm not sure what "Something" is, and it would be endian-dependent, > wouldn't it? > > thanks, > -Chris > > > -- > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From saintmlx at apstat.com Tue Sep 29 13:40:56 2009 From: saintmlx at apstat.com (Xavier Saint-Mleux) Date: Tue, 29 Sep 2009 13:40:56 -0400 Subject: [Numpy-discussion] making the distinction between -0.0 and 0.0.. In-Reply-To: <4AC23E09.2040302@noaa.gov> References: <4AC23B94.5010604@noaa.gov> <4AC23E09.2040302@noaa.gov> Message-ID: <4AC246A8.9060306@apstat.com> Christopher Barker wrote: > Pauli Virtanen wrote: > >> Tue, 29 Sep 2009 09:53:40 -0700, Christopher Barker wrote: >> [clip] >> >>> How can I identify -0.0? >>> >> signbit >> >> > > perfect for numpy, but at this point I don't have a numpy dependency > (very unusual for my code!). Anyone know a pure-python way to get it? > Starting with Python 2.6, there is a copysign function in the math module: >>> import math >>> math.copysign(1, -0.) -1.0 >>> math.copysign(1, 0.) 1.0 http://docs.python.org/library/math.html HTH, Xavier From charlesr.harris at gmail.com Tue Sep 29 13:50:43 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 29 Sep 2009 11:50:43 -0600 Subject: [Numpy-discussion] making the distinction between -0.0 and 0.0.. In-Reply-To: <4AC23B94.5010604@noaa.gov> References: <4AC23B94.5010604@noaa.gov> Message-ID: On Tue, Sep 29, 2009 at 10:53 AM, Christopher Barker wrote: > Hi folks, > > This isn't really a numpy question, and I'm doing this with regular old > python, but I figure you are the folks that would know this: > > How do I get python to make a distinction between -0.0 and 0.0? IN this > case, I'm starting with user input, so: > > In [3]: float("-0.0") > Out[3]: -0.0 > > so python seems to preserve the "-". But: > > In [12]: float("-0.0") == float("0.0") > Out[12]: True > > In [13]: float("-0.0") < float("0.0") > Out[13]: False > > In [14]: float("0.0") > float("-0.0") > Out[14]: False > > IIRC, the numbers compare equal in C, so looking at the signbit is about the only thing you can do. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Tue Sep 29 13:53:46 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 29 Sep 2009 11:53:46 -0600 Subject: [Numpy-discussion] where does numpy get its pow function? In-Reply-To: <7f014ea60909291001g6993c2dbg8429f1b540eec2aa@mail.gmail.com> References: <7f014ea60909290947r4e4b302chdb8d2c8ab61e75db@mail.gmail.com> <3d375d730909290954u12244e40wf9cbe5820ec0e76f@mail.gmail.com> <7f014ea60909291001g6993c2dbg8429f1b540eec2aa@mail.gmail.com> Message-ID: On Tue, Sep 29, 2009 at 11:01 AM, Chris Colbert wrote: > are there any particular optimization flags issued when building numpy > aside from the following? > > -fwrapv -O2 > > Numpy optimizes small integer powers using multiplication. What sort of numbers are you looking at? 
Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From mdroe at stsci.edu Tue Sep 29 13:55:24 2009 From: mdroe at stsci.edu (Michael Droettboom) Date: Tue, 29 Sep 2009 13:55:24 -0400 Subject: [Numpy-discussion] [SciPy-dev] Deprecate chararray [was Plea for help] In-Reply-To: <45d1ab480909250923t2dd716a2s502967fb231dd889@mail.gmail.com> References: <857977.74958.qm@web52106.mail.re2.yahoo.com> <812BBECE-D1E8-4699-980A-BB8FB9657CB9@stsci.edu> <4AB9103F.7030207@stsci.edu> <45d1ab480909221951h6006deb9u4675e419c9f0b256@mail.gmail.com> <4ABCC0A1.30402@stsci.edu> <45d1ab480909250923t2dd716a2s502967fb231dd889@mail.gmail.com> Message-ID: <4AC24A0C.4090005@stsci.edu> I now have a rather large patch ready which addresses the following issues with chararrays. Would it be possible to get SVN commit priviledges, or would you prefer a patch file? 1) Fix bugs in Trac http://projects.scipy.org/numpy/ticket/1199 (chararray.expandtabs broken) http://projects.scipy.org/numpy/ticket/856 (chararray __mod__ error) http://projects.scipy.org/numpy/ticket/855 (chararray __mul__ error) http://projects.scipy.org/numpy/ticket/1231 (chararray methods ignore all arguments following the first argument that evaluates to False) http://projects.scipy.org/numpy/ticket/1235 (Coercing object arrays to string arrays has surprising behaviour) http://projects.scipy.org/numpy/ticket/1240 (Casting from Unicode to String array ignores exception) http://projects.scipy.org/numpy/ticket/1241 (Array constructed with mixture of str and unicode objects fails length detection) I can provide small individual patches for some of these if necessary, but some are interrelated and can only be fixed by the "whole enchilada". 2) Improve documentation Every method now has a docstring, and a new page of routines has been added to the Sphinx tree. 3) Improve unit test coverage Full line-by-line coverage of defchararray.py, as well as lots of hairy Unicode side cases. 4a) Create C-based vectorized string operations This is benchmarking about 5x faster than the old Python-based looping on a large database of around 20k astronomical objects 4b) Refactor chararray class in terms of those 4c) Design and create an interface to those methods that will be the "right way" going forward All vectorized string operations are now available as regular functions in the numpy.char namespace. Usage of the chararray view class is only recommended for numarray backward compatibility. A few side notes: http://projects.scipy.org/numpy/ticket/1200 (chararray.rstrip inconsistency) This bug I believe should be marked as "won't fix". The inconsistent handling of trailing whitespace inconsistency is an unfortunate "feature" of the chararray class, and I am wary that fixing it may break backward compatibility. However, the new free functions in numpy.char do not have this inconsistency, so they should be recommended for new code. http://projects.scipy.org/numpy/ticket/1240 (Casting from Unicode to String array ignores exception) This bug probably needs review by someone deeply familiar with the low-level internals, as it affects more than just string and unicode arrays. It doesn't break any of the unit tests, for what it's worth ;) Cheers, Mike David Goldsmith wrote: > Great, thanks! 
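A quick way to compare the cases being discussed (a rough benchmark sketch; the statements and array size are arbitrary, and the absolute timings will vary with platform, compiler and numpy version):

import timeit
import numpy as np

x = np.random.rand(1000000)

for stmt in ("x ** 2", "x * x", "x ** 2.0", "x ** 0.5", "np.sqrt(x)"):
    t = timeit.Timer(stmt, "from __main__ import np, x").timeit(number=20)
    print "%-12s %6.3f s" % (stmt, t)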
> > DG > > On Fri, Sep 25, 2009 at 6:07 AM, Michael Droettboom > wrote: > > David Goldsmith wrote: > > On Tue, Sep 22, 2009 at 4:02 PM, Ralf Gommers > > > >> wrote: > > > > > > On Tue, Sep 22, 2009 at 1:58 PM, Michael Droettboom > > > >> wrote: > > > > Trac has these bugs. Any others? > > > > http://projects.scipy.org/numpy/ticket/1199 > > http://projects.scipy.org/numpy/ticket/1200 > > http://projects.scipy.org/numpy/ticket/856 > > http://projects.scipy.org/numpy/ticket/855 > > http://projects.scipy.org/numpy/ticket/1231 > > > > > > This one: > > > http://article.gmane.org/gmane.comp.python.numeric.general/23638/match=chararray > > > > Cheers, > > Ralf > > > > > > That last one never got "promoted" to a ticket? > It's a symptom of this bug, that I created and produced a patch for > yesterday: > > http://projects.scipy.org/numpy/ticket/1235 > > Mike > > > -- > Michael Droettboom > Science Software Branch > Operations and Engineering Division > Space Telescope Science Institute > Operated by AURA for NASA > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > ------------------------------------------------------------------------ > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Michael Droettboom Science Software Branch Operations and Engineering Division Space Telescope Science Institute Operated by AURA for NASA From bsouthey at gmail.com Tue Sep 29 13:57:19 2009 From: bsouthey at gmail.com (Bruce Southey) Date: Tue, 29 Sep 2009 12:57:19 -0500 Subject: [Numpy-discussion] Question about improving genfromtxt errors In-Reply-To: <4AC237BA.9050104@noaa.gov> References: <4ABD05D6.6020706@gmail.com> <4D357173-C408-4AF4-8FFD-C77363915D13@gmail.com> <4ABD160E.1060503@noaa.gov> <57D14F44-04C0-4A50-B49D-E9CA2094F165@gmail.com> <4ABD1F3B.9040507@noaa.gov> <4AC0E740.60309@noaa.gov> <4AC237BA.9050104@noaa.gov> Message-ID: <4AC24A7F.4020904@gmail.com> On 09/29/2009 11:37 AM, Christopher Barker wrote: > Pierre GM wrote: > >> I was thinking about something this week-end: we could create a second >> list when looping on the rows, where we would store the length of each >> splitted row. After the loop, we can find if these values don't match >> the expected number of columns `nbcols` and where. Then, we can decide >> to strip the `rows` list of its invalid values (that corresponds to >> skipping) or raise an exception, but in both cases we know where the >> problem is. >> My only concern is that we'd be creating yet another list of integers, >> which would increase memory usage. Would it be a problem ? >> > I doubt it would be that big deal, however... > Probably more than memory is the execution time involved in printing these problem rows. There are already two loops over the data where you can measure the number of elements in the row but the first may be more appropriate. So a simple solution is that in the first loop you could append the 'bad' rows to one list and append to a 'good' rows to a exist row list or just store the row number that is bad. 
Untested code for corresponding part of io.py: row_bad=[] # store bad rows bad_row_numbers=[] # store just the row number row_number=0 #simple row counter that probably should be the first data row not first line of the file for line in itertools.chain([first_line,], fhd): values = split_line(line) # Skip an empty line if len(values) == 0: continue # Select only the columns we need if usecols: values = [values[_] for _ in usecols] # Check whether we need to update the converter if dtype is None: for (converter, item) in zip(converters, values): converter.upgrade(item) if len(values) != nbcols: row_bad.append(line) # store bad row so the user can search for that line bad_row_numbers.append(row_number) # store just the bad row number so user can go to the appropriate line(s) in file else: append_to_rows(tuple(values)) row_number=row_number+1 # increment row counter Note I assume that nbcols is the expected number of columns but I seem to be one off with my counting. Then if len(rows_bad) is greater than zero you could raise or print out a warning and the rows then raise an exception or continue. The problem with continuing is that a user may not be aware that there is a warning. Bruce From charlesr.harris at gmail.com Tue Sep 29 14:09:55 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 29 Sep 2009 12:09:55 -0600 Subject: [Numpy-discussion] __array_wrap__ In-Reply-To: References: Message-ID: On Tue, Sep 29, 2009 at 11:00 AM, Neal Becker wrote: > fixed_pt arrays need to apply the overflow_policy after operations > (overflow_policy could be clip, or throw exception). > > I thought __array_wrap__ would work for this, but it seems to not be called > when I need it. For example: > > In [13]: obj > Out[13]: fixed_pt_array([ 0, 32, 64, 96, 128]) > > In [14]: obj*100 < this should overflow > enter: [ 0 32 64 96 128] << on entry into __array_wrap > enter: [0 32 64 96 128] > exit: [ 0 32 64 96 128] > Out[14]: fixed_pt_array([ 0, 3200, 6400, 9600, 12800]) > > Apparantly, obj*100 is never passed to array_wrap. > > Is there another way I can do this? > > I believe array wrap has to be explicitly called after the fact. The problem is that you derived from ndarray, but fixed point isn't an ndarray because ndarray doesn't *have* fixed point. If you were to implement *using* ndarray you could catch everything in the operator calls. As a rule of thumb, inheritance is always the wrong thing to do ;) Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From sccolbert at gmail.com Tue Sep 29 14:14:57 2009 From: sccolbert at gmail.com (Chris Colbert) Date: Tue, 29 Sep 2009 20:14:57 +0200 Subject: [Numpy-discussion] where does numpy get its pow function? In-Reply-To: References: <7f014ea60909290947r4e4b302chdb8d2c8ab61e75db@mail.gmail.com> <3d375d730909290954u12244e40wf9cbe5820ec0e76f@mail.gmail.com> <7f014ea60909291001g6993c2dbg8429f1b540eec2aa@mail.gmail.com> Message-ID: <7f014ea60909291114j3be5f829reff6cadc57fbecf1@mail.gmail.com> my powers are typically doubles I traced the problem down to the pow function in math.h just being slow... Thanks! On Tue, Sep 29, 2009 at 7:53 PM, Charles R Harris wrote: > > > On Tue, Sep 29, 2009 at 11:01 AM, Chris Colbert wrote: >> >> are there any particular optimization flags issued when building numpy >> aside from the following? >> >> ?-fwrapv -O2 >> > > Numpy optimizes small integer powers? using multiplication. What sort of > numbers are you looking at? 
> > Chuck > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From kwgoodman at gmail.com Tue Sep 29 14:19:22 2009 From: kwgoodman at gmail.com (Keith Goodman) Date: Tue, 29 Sep 2009 11:19:22 -0700 Subject: [Numpy-discussion] subclassing Message-ID: I ran across a problem while using numpy. But the problem is more of python problem. I hope I am not too far off topic. I have a class and a subclass: class myclass: def __init__(self, x): self.x = x def __getitem__(self, index): if type(index) is slice: x = self.x[index] return myclass(x) else: raise IndexError, 'Only slicing is allowed in this example.' def calc(self): print 'myclass' def __repr__(self): return self.x.__repr__() class mysubclass(myclass): def calc(self): print 'mySUBclass' I can make an instance of mysubclass: >> sub = mysubclass(np.array([1,2,3])) >> sub array([1, 2, 3]) >> sub.calc() mySUBclass The problem occurs when I slice: >> sub_slice = sub[1:] >> sub_slice.calc() myclass # <--- Oops. I'd like this to be mySUBclass Slicing causes sub to change from mysubclass to myclass. It does so because __getitem__ returns myclass(x). I could make a __setitem__ method in mysubclass that returns mysubclass(x), but I have many such methods with the same problem (not shown in the code above). Is there another solution? From charlesr.harris at gmail.com Tue Sep 29 14:19:39 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 29 Sep 2009 12:19:39 -0600 Subject: [Numpy-discussion] [SciPy-dev] Deprecate chararray [was Plea for help] In-Reply-To: <4AC24A0C.4090005@stsci.edu> References: <857977.74958.qm@web52106.mail.re2.yahoo.com> <812BBECE-D1E8-4699-980A-BB8FB9657CB9@stsci.edu> <4AB9103F.7030207@stsci.edu> <45d1ab480909221951h6006deb9u4675e419c9f0b256@mail.gmail.com> <4ABCC0A1.30402@stsci.edu> <45d1ab480909250923t2dd716a2s502967fb231dd889@mail.gmail.com> <4AC24A0C.4090005@stsci.edu> Message-ID: On Tue, Sep 29, 2009 at 11:55 AM, Michael Droettboom wrote: > I now have a rather large patch ready which addresses the following > issues with chararrays. Would it be possible to get SVN commit > priviledges, or would you prefer a patch file? > > If you are going to maintain this part of numpy, I think you should go for the privileges. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Tue Sep 29 14:23:14 2009 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 29 Sep 2009 13:23:14 -0500 Subject: [Numpy-discussion] __array_wrap__ In-Reply-To: References: Message-ID: <3d375d730909291123w31a2add3r9804b97dec3ba445@mail.gmail.com> On Tue, Sep 29, 2009 at 13:09, Charles R Harris wrote: > > > On Tue, Sep 29, 2009 at 11:00 AM, Neal Becker wrote: >> >> fixed_pt arrays need to apply the overflow_policy after operations >> (overflow_policy could be clip, or throw exception). >> >> I thought __array_wrap__ would work for this, but it seems to not be >> called >> when I need it. ?For example: >> >> In [13]: obj >> Out[13]: fixed_pt_array([ ?0, ?32, ?64, ?96, 128]) >> >> In [14]: obj*100 < this should overflow >> enter: [ ?0 ?32 ?64 ?96 128] << on entry into __array_wrap >> enter: [0 32 64 96 128] >> exit: [ ?0 ?32 ?64 ?96 128] >> Out[14]: fixed_pt_array([ ? ?0, ?3200, ?6400, ?9600, 12800]) >> >> Apparantly, obj*100 is never passed to array_wrap. >> >> Is there another way I can do this? 
>> > I believe array wrap has to be explicitly called after the fact. Ufuncs call __array_wrap__ implicitly. In [22]: class myarray(np.ndarray): ....: def __array_wrap__(self, *args): ....: print 'myarray.__array_wrap__%r' % (args,) ....: return super(myarray, self).__array_wrap__(*args) ....: ....: In [25]: m = np.arange(10).view(myarray) In [26]: m * 100 myarray.__array_wrap__(myarray([ 0, 100, 200, 300, 400, 500, 600, 700, 800, 900]), (, (myarray([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]), 100), 0)) Out[26]: myarray([ 0, 100, 200, 300, 400, 500, 600, 700, 800, 900]) -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From robert.kern at gmail.com Tue Sep 29 14:25:06 2009 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 29 Sep 2009 13:25:06 -0500 Subject: [Numpy-discussion] subclassing In-Reply-To: References: Message-ID: <3d375d730909291125w35645632y38b308c56f4bef1b@mail.gmail.com> On Tue, Sep 29, 2009 at 13:19, Keith Goodman wrote: > I ran across a problem while using numpy. But the problem is more of > python problem. I hope I am not too far off topic. > > I have a class and a subclass: > > class myclass: > > ? ?def __init__(self, x): > ? ? ? ?self.x = x > > ? ?def __getitem__(self, index): > ? ? ? ?if type(index) is slice: > ? ? ? ? ? ?x = self.x[index] > ? ? ? ? ? ?return myclass(x) > ? ? ? ?else: > ? ? ? ? ? ?raise IndexError, 'Only slicing is allowed in this example.' > > ? ?def calc(self): > ? ? ? ?print 'myclass' > > ? ?def __repr__(self): > ? ? ? ?return self.x.__repr__() > > class mysubclass(myclass): > > ? ?def calc(self): > ? ? ? ?print 'mySUBclass' > > > I can make an instance of mysubclass: > >>> sub = mysubclass(np.array([1,2,3])) >>> sub > ? array([1, 2, 3]) >>> sub.calc() > mySUBclass > > The problem occurs when I slice: > >>> sub_slice = sub[1:] >>> sub_slice.calc() > myclass ?# <--- Oops. I'd like this to be mySUBclass > > Slicing causes sub to change from mysubclass to myclass. It does so > because __getitem__ returns myclass(x). I could make a __setitem__ > method in mysubclass that returns mysubclass(x), but I have many such > methods with the same problem (not shown in the code above). Is there > another solution? def __getitem__(self, index): if type(index) is slice: x = self.x[index] return type(self)(x) # <---------- else: raise IndexError, 'Only slicing is allowed in this example.' -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From jkington at wisc.edu Tue Sep 29 14:29:26 2009 From: jkington at wisc.edu (Joe Kington) Date: Tue, 29 Sep 2009 13:29:26 -0500 Subject: [Numpy-discussion] making the distinction between -0.0 and 0.0.. In-Reply-To: References: <4AC23B94.5010604@noaa.gov> <4AC23E09.2040302@noaa.gov> Message-ID: I just realized that what I'm doing won't work on older versions of python, anyway... Things work fine on 2.6 Python 2.6.2 (r262:71600, Sep 3 2009, 09:36:43) [GCC 3.4.6 20060404 (Red Hat 3.4.6-9)] on linux2 Type "help", "copyright", "credits" or "license" for more information. 
>>> import struct >>> struct.pack('f', -0.0) '\x00\x00\x00\x80' <-- Has sign bit >>> struct.pack('>f', -0.0) '\x80\x00\x00\x00' <-- Has sign bit >>> struct.pack('>> struct.pack('=f', -0.0) '\x00\x00\x00\x80' >>> But on python 2.3 it only works for the native endian case Python 2.3.4 (#1, Jan 9 2007, 16:40:09) [GCC 3.4.6 20060404 (Red Hat 3.4.6-3)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import struct >>> struct.pack('f', -0.0) '\x00\x00\x00\x80' <-- Correct, has sign bit and unpacks to -0.0 >>> struct.pack('>f', -0.0) '\x00\x00\x00\x00' <-- (big-endian) Lost the sign bit! Unpacks to 0.0 >>> struct.pack('>> struct.pack('=f', -0.0) '\x00\x00\x00\x00' <-- (whichever is native) Lost the sign bit! >>> So I guess prior to 2.6 (which as xavier pointed out, has copysign) trying to use struct won't be endian-independent. Of course, you could always do x.__repr__().startswith('-'), but it's slow, ugly, and you already said you wanted to avoid it. On the other hand, it works everywhere with any version of python. At any rate, I'm doubt I'm helping that much, -Joe On Tue, Sep 29, 2009 at 12:19 PM, Joe Kington wrote: > Well, this is messy, and nearly unreadable, but it should work and is pure > python(and I think even be endian-independent). > > struct.unpack('b',struct.pack('>d', X)[0])[0] >= 0 > (where X is the variable you want to test) > > In [54]: struct.unpack('b',struct.pack('>d',0.0)[0])[0] >= 0 > Out[54]: True > > In [55]: struct.unpack('b',struct.pack('>d',-0.0)[0])[0] >= 0 > Out[55]: False > > In [56]: struct.unpack('b',struct.pack('>d',-0.0000001)[0])[0] >= 0 > Out[56]: False > > In [57]: struct.unpack('b',struct.pack('>d',0.0000001)[0])[0] >= 0 > Out[57]: True > > In [58]: struct.unpack('b',struct.pack('>d',3999564.8763)[0])[0] >= 0 > Out[58]: True > > In [59]: struct.unpack('b',struct.pack('>d',-3999564.8763)[0])[0] >= 0 > Out[59]: False > > Hope that helps, anyway > -Joe > > > On Tue, Sep 29, 2009 at 12:04 PM, Christopher Barker < > Chris.Barker at noaa.gov> wrote: > >> Pauli Virtanen wrote: >> > Tue, 29 Sep 2009 09:53:40 -0700, Christopher Barker wrote: >> > [clip] >> >> How can I identify -0.0? >> > >> > signbit >> > >> >> perfect for numpy, but at this point I don't have a numpy dependency >> (very unusual for my code!). Anyone know a pure-python way to get it? >> >> It seems I should be able to do something like: >> >> struct.pack("d",-3.4)[0] & Something >> >> but I'm not sure what "Something" is, and it would be endian-dependent, >> wouldn't it? >> >> thanks, >> -Chris >> >> >> -- >> Christopher Barker, Ph.D. >> Oceanographer >> >> Emergency Response Division >> NOAA/NOS/OR&R (206) 526-6959 voice >> 7600 Sand Point Way NE (206) 526-6329 fax >> Seattle, WA 98115 (206) 526-6317 main reception >> >> Chris.Barker at noaa.gov >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From pgmdevlist at gmail.com Tue Sep 29 14:30:38 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Tue, 29 Sep 2009 14:30:38 -0400 Subject: [Numpy-discussion] Question about improving genfromtxt errors In-Reply-To: <4AC24A7F.4020904@gmail.com> References: <4ABD05D6.6020706@gmail.com> <4D357173-C408-4AF4-8FFD-C77363915D13@gmail.com> <4ABD160E.1060503@noaa.gov> <57D14F44-04C0-4A50-B49D-E9CA2094F165@gmail.com> <4ABD1F3B.9040507@noaa.gov> <4AC0E740.60309@noaa.gov> <4AC237BA.9050104@noaa.gov> <4AC24A7F.4020904@gmail.com> Message-ID: <00CD2A47-A721-4A46-8D81-203BE795A6E4@gmail.com> On Sep 29, 2009, at 1:57 PM, Bruce Southey wrote: > On 09/29/2009 11:37 AM, Christopher Barker wrote: >> Pierre GM wrote: >> > Probably more than memory is the execution time involved in printing > these problem rows. The rows with problems will be printed outside the loop (with at least an associated warning or possibly raising an exception). My concern is to whether store only the tuples (index of the row, nb of columns) for the invalid rows, or just create a list of nb of columns that I'd parse afterwards. The first solution requires an extra test in the loop, the second may waste some memory space. Bah, I'll figure it out. Please send me some test cases so that I can time/test the best option. > From charlesr.harris at gmail.com Tue Sep 29 14:35:36 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 29 Sep 2009 12:35:36 -0600 Subject: [Numpy-discussion] __array_wrap__ In-Reply-To: <3d375d730909291123w31a2add3r9804b97dec3ba445@mail.gmail.com> References: <3d375d730909291123w31a2add3r9804b97dec3ba445@mail.gmail.com> Message-ID: On Tue, Sep 29, 2009 at 12:23 PM, Robert Kern wrote: > On Tue, Sep 29, 2009 at 13:09, Charles R Harris > wrote: > > > > > > On Tue, Sep 29, 2009 at 11:00 AM, Neal Becker > wrote: > >> > >> fixed_pt arrays need to apply the overflow_policy after operations > >> (overflow_policy could be clip, or throw exception). > >> > >> I thought __array_wrap__ would work for this, but it seems to not be > >> called > >> when I need it. For example: > >> > >> In [13]: obj > >> Out[13]: fixed_pt_array([ 0, 32, 64, 96, 128]) > >> > >> In [14]: obj*100 < this should overflow > >> enter: [ 0 32 64 96 128] << on entry into __array_wrap > >> enter: [0 32 64 96 128] > >> exit: [ 0 32 64 96 128] > >> Out[14]: fixed_pt_array([ 0, 3200, 6400, 9600, 12800]) > >> > >> Apparantly, obj*100 is never passed to array_wrap. > >> > >> Is there another way I can do this? > >> > > I believe array wrap has to be explicitly called after the fact. > > Ufuncs call __array_wrap__ implicitly. > > Thanks for the info. How do they decide which one to call? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From lou_boog2000 at yahoo.com Tue Sep 29 14:31:00 2009 From: lou_boog2000 at yahoo.com (Lou Pecora) Date: Tue, 29 Sep 2009 11:31:00 -0700 (PDT) Subject: [Numpy-discussion] where does numpy get its pow function? In-Reply-To: <3d375d730909290954u12244e40wf9cbe5820ec0e76f@mail.gmail.com> References: <7f014ea60909290947r4e4b302chdb8d2c8ab61e75db@mail.gmail.com> <3d375d730909290954u12244e40wf9cbe5820ec0e76f@mail.gmail.com> Message-ID: <148958.24547.qm@web34408.mail.mud.yahoo.com> ----- Original Message ---- From: Robert Kern To: Discussion of Numerical Python Sent: Tuesday, September 29, 2009 12:54:46 PM Subject: Re: [Numpy-discussion] where does numpy get its pow function? On Tue, Sep 29, 2009 at 11:47, Chris Colbert wrote: > Does numpy use pow from math.h or something else? 
Yes. -- Robert Kern HAHAHA! Reminds me of when my son was little and we asked, "Do you want vanilla or chocolate ice cream." He would answer, "Yes." Robert, did you mean it used math.h OR that it used something else? :-) -- Lou Pecora, my views are my own. From charlesr.harris at gmail.com Tue Sep 29 14:39:09 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 29 Sep 2009 12:39:09 -0600 Subject: [Numpy-discussion] __array_wrap__ In-Reply-To: References: <3d375d730909291123w31a2add3r9804b97dec3ba445@mail.gmail.com> Message-ID: On Tue, Sep 29, 2009 at 12:35 PM, Charles R Harris < charlesr.harris at gmail.com> wrote: > > > On Tue, Sep 29, 2009 at 12:23 PM, Robert Kern wrote: > >> On Tue, Sep 29, 2009 at 13:09, Charles R Harris >> wrote: >> > >> > >> > On Tue, Sep 29, 2009 at 11:00 AM, Neal Becker >> wrote: >> >> >> >> fixed_pt arrays need to apply the overflow_policy after operations >> >> (overflow_policy could be clip, or throw exception). >> >> >> >> I thought __array_wrap__ would work for this, but it seems to not be >> >> called >> >> when I need it. For example: >> >> >> >> In [13]: obj >> >> Out[13]: fixed_pt_array([ 0, 32, 64, 96, 128]) >> >> >> >> In [14]: obj*100 < this should overflow >> >> enter: [ 0 32 64 96 128] << on entry into __array_wrap >> >> enter: [0 32 64 96 128] >> >> exit: [ 0 32 64 96 128] >> >> Out[14]: fixed_pt_array([ 0, 3200, 6400, 9600, 12800]) >> >> >> >> Apparantly, obj*100 is never passed to array_wrap. >> >> >> >> Is there another way I can do this? >> >> >> > I believe array wrap has to be explicitly called after the fact. >> >> Ufuncs call __array_wrap__ implicitly. >> >> > Thanks for the info. How do they decide which one to call? > > This also suggests trying an explicit call to multiply(obj, 100). Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Tue Sep 29 14:48:14 2009 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 29 Sep 2009 13:48:14 -0500 Subject: [Numpy-discussion] __array_wrap__ In-Reply-To: References: <3d375d730909291123w31a2add3r9804b97dec3ba445@mail.gmail.com> Message-ID: <3d375d730909291148w14e9df41n18719629541f3186@mail.gmail.com> On Tue, Sep 29, 2009 at 13:35, Charles R Harris wrote: > > On Tue, Sep 29, 2009 at 12:23 PM, Robert Kern wrote: >> >> On Tue, Sep 29, 2009 at 13:09, Charles R Harris >> wrote: >> > >> > >> > On Tue, Sep 29, 2009 at 11:00 AM, Neal Becker >> > wrote: >> >> >> >> fixed_pt arrays need to apply the overflow_policy after operations >> >> (overflow_policy could be clip, or throw exception). >> >> >> >> I thought __array_wrap__ would work for this, but it seems to not be >> >> called >> >> when I need it. ?For example: >> >> >> >> In [13]: obj >> >> Out[13]: fixed_pt_array([ ?0, ?32, ?64, ?96, 128]) >> >> >> >> In [14]: obj*100 < this should overflow >> >> enter: [ ?0 ?32 ?64 ?96 128] << on entry into __array_wrap >> >> enter: [0 32 64 96 128] >> >> exit: [ ?0 ?32 ?64 ?96 128] >> >> Out[14]: fixed_pt_array([ ? ?0, ?3200, ?6400, ?9600, 12800]) >> >> >> >> Apparantly, obj*100 is never passed to array_wrap. >> >> >> >> Is there another way I can do this? >> >> >> > I believe array wrap has to be explicitly called after the fact. >> >> Ufuncs call __array_wrap__ implicitly. >> > > Thanks for the info. How do they decide which one to call? The .__array_priority__ attribute. 
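In case a concrete illustration helps, here is a small sketch (class names are made up, and the priorities are set explicitly just so the choice is visible):

import numpy as np

class low_pri(np.ndarray):
    __array_priority__ = 1.0
    def __array_wrap__(self, obj, context=None):
        print 'low_pri.__array_wrap__'
        return np.ndarray.__array_wrap__(self, obj, context)

class high_pri(np.ndarray):
    __array_priority__ = 10.0
    def __array_wrap__(self, obj, context=None):
        print 'high_pri.__array_wrap__'
        return np.ndarray.__array_wrap__(self, obj, context)

a = np.arange(5).view(low_pri)
b = np.arange(5).view(high_pri)
c = a + b        # prints 'high_pri.__array_wrap__'
print type(c)    # high_pri

The input with the highest __array_priority__ supplies the __array_wrap__ that the ufunc calls, and therefore the type of the result.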
-- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From robert.kern at gmail.com Tue Sep 29 14:50:25 2009 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 29 Sep 2009 13:50:25 -0500 Subject: [Numpy-discussion] where does numpy get its pow function? In-Reply-To: <148958.24547.qm@web34408.mail.mud.yahoo.com> References: <7f014ea60909290947r4e4b302chdb8d2c8ab61e75db@mail.gmail.com> <3d375d730909290954u12244e40wf9cbe5820ec0e76f@mail.gmail.com> <148958.24547.qm@web34408.mail.mud.yahoo.com> Message-ID: <3d375d730909291150k43f5f54bl5b2bcbf2b4395783@mail.gmail.com> On Tue, Sep 29, 2009 at 13:31, Lou Pecora wrote: > ----- Original Message ---- > From: Robert Kern > To: Discussion of Numerical Python > Sent: Tuesday, September 29, 2009 12:54:46 PM > Subject: Re: [Numpy-discussion] where does numpy get its pow function? > > On Tue, Sep 29, 2009 at 11:47, Chris Colbert wrote: >> Does numpy use pow from math.h or something else? > > Yes. > > -- > Robert Kern > > > HAHAHA! ?Reminds me of when my son was little and we asked, "Do you want vanilla or chocolate ice cream." ?He would answer, "Yes." > > Robert, did you mean it used math.h ?OR ?that it used something else? ?:-) math.h. Sorry, when one option is specific and one is "something else", I will often respond "yes" or "no" referring to the specific option without thinking about it. People usually know what I mean. :-) When given two specific options, I only say "yes" or "no" when I want to be annoying. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From d.l.goldsmith at gmail.com Tue Sep 29 14:55:53 2009 From: d.l.goldsmith at gmail.com (David Goldsmith) Date: Tue, 29 Sep 2009 11:55:53 -0700 Subject: [Numpy-discussion] [SciPy-dev] Deprecate chararray [was Plea for help] In-Reply-To: <4AC24A0C.4090005@stsci.edu> References: <857977.74958.qm@web52106.mail.re2.yahoo.com> <812BBECE-D1E8-4699-980A-BB8FB9657CB9@stsci.edu> <4AB9103F.7030207@stsci.edu> <45d1ab480909221951h6006deb9u4675e419c9f0b256@mail.gmail.com> <4ABCC0A1.30402@stsci.edu> <45d1ab480909250923t2dd716a2s502967fb231dd889@mail.gmail.com> <4AC24A0C.4090005@stsci.edu> Message-ID: <45d1ab480909291155q582e07fdhf5b6a30f2c0f89c3@mail.gmail.com> Michael: Thanks so much, this is genuinely awesome! Don't forget to email Joe Harrington for your T-shirt - you more than deserve it! ;-) A few specific comments below On Tue, Sep 29, 2009 at 10:55 AM, Michael Droettboom wrote: > I now have a rather large patch ready which addresses the following > issues with chararrays. Would it be possible to get SVN commit > priviledges, or would you prefer a patch file? > I hope you're granted privileges (but I can't do that - don't even have 'em myself); I'm pretty sure everyone on scipy-dev monitors this list as well, but if you have to wait more than a few days for an affirmative reply one way or the other, you should try posting there. > 2) Improve documentation > > Every method now has a docstring, and a new page of routines has been > added to the Sphinx tree. > Awesome, thanks! > 4c) Design and create an interface to those methods that will be the > "right way" going forward > > All vectorized string operations are now available as regular functions > in the numpy.char namespace. 
Usage of the chararray view class is only > recommended for numarray backward compatibility. > Again, thanks for this. But I wonder aloud (asking long-time developers): do we have any systematic ways to steer people in such directions? (This should probably be a new thread...) > A few side notes: > > http://projects.scipy.org/numpy/ticket/1200 (chararray.rstrip > inconsistency) > > This bug I believe should be marked as "won't fix". The inconsistent > handling of trailing whitespace inconsistency is an unfortunate > "feature" of the chararray class, and I am wary that fixing it may break > backward compatibility. However, the new free functions in numpy.char > do not have this inconsistency, so they should be recommended for new code. > OK, sounds "acceptable." And three more times: thank you, thank you, thank you!!! DG > Cheers, > Mike > > David Goldsmith wrote: > > Great, thanks! > > > > DG > > > > On Fri, Sep 25, 2009 at 6:07 AM, Michael Droettboom > > wrote: > > > > David Goldsmith wrote: > > > On Tue, Sep 22, 2009 at 4:02 PM, Ralf Gommers > > > > > > > >> wrote: > > > > > > > > > On Tue, Sep 22, 2009 at 1:58 PM, Michael Droettboom > > > > > >> wrote: > > > > > > Trac has these bugs. Any others? > > > > > > http://projects.scipy.org/numpy/ticket/1199 > > > http://projects.scipy.org/numpy/ticket/1200 > > > http://projects.scipy.org/numpy/ticket/856 > > > http://projects.scipy.org/numpy/ticket/855 > > > http://projects.scipy.org/numpy/ticket/1231 > > > > > > > > > This one: > > > > > > http://article.gmane.org/gmane.comp.python.numeric.general/23638/match=chararray > > > > > > Cheers, > > > Ralf > > > > > > > > > That last one never got "promoted" to a ticket? > > It's a symptom of this bug, that I created and produced a patch for > > yesterday: > > > > http://projects.scipy.org/numpy/ticket/1235 > > > > Mike > > > > > > -- > > Michael Droettboom > > Science Software Branch > > Operations and Engineering Division > > Space Telescope Science Institute > > Operated by AURA for NASA > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > ------------------------------------------------------------------------ > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > -- > Michael Droettboom > Science Software Branch > Operations and Engineering Division > Space Telescope Science Institute > Operated by AURA for NASA > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ndbecker2 at gmail.com Tue Sep 29 14:55:44 2009 From: ndbecker2 at gmail.com (Neal Becker) Date: Tue, 29 Sep 2009 14:55:44 -0400 Subject: [Numpy-discussion] __array_wrap__ References: <3d375d730909291123w31a2add3r9804b97dec3ba445@mail.gmail.com> Message-ID: This seems to work now, but I'm wondering if Charles is correct, that inheritance isn't such a great idea here. The advantage of inheritance is I don't have to implement forwarding all the functions, a pretty big advantage. (I wonder if there is some way to do some of these as a generic 'mixin'?) 
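Purely as an untested sketch of what the composition route could look like (the class name is invented and the overflow handling is just a stub), the plain attribute forwarding can be pushed into __getattr__:

import numpy as np

class fixed_pt_wrapper(object):
    """ toy sketch: hold an ndarray instead of inheriting from it """
    def __init__(self, data):
        self._data = np.asarray(data)
    def _rewrap(self, result):
        # the overflow policy (clip or raise) would be applied here
        return fixed_pt_wrapper(result)
    def __mul__(self, other):
        other = getattr(other, '_data', other)
        return self._rewrap(self._data * other)
    __rmul__ = __mul__
    def __getattr__(self, name):
        # anything not defined here is forwarded to the wrapped array
        return getattr(self._data, name)
    def __repr__(self):
        return 'fixed_pt_wrapper(%r)' % (self._data,)

The catch is that special methods are not looked up through __getattr__ on new-style classes, so each operator still has to be written out (or generated programmatically), which is exactly the forwarding cost mentioned above.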
I was concerned that using a vectorize function in __array_wrap__ would result in a recursive call to __array_wrap__ (and in some earlier version I did get infinite recursion). This version doesn't seem to have a problem, although I don't know why. Any thoughts? Code attached. -------------- next part -------------- A non-text attachment was scrubbed... Name: test_fixed_numpy.py Type: text/x-python Size: 5909 bytes Desc: not available URL: From kwgoodman at gmail.com Tue Sep 29 14:56:47 2009 From: kwgoodman at gmail.com (Keith Goodman) Date: Tue, 29 Sep 2009 11:56:47 -0700 Subject: [Numpy-discussion] subclassing In-Reply-To: <3d375d730909291125w35645632y38b308c56f4bef1b@mail.gmail.com> References: <3d375d730909291125w35645632y38b308c56f4bef1b@mail.gmail.com> Message-ID: On Tue, Sep 29, 2009 at 11:25 AM, Robert Kern wrote: > On Tue, Sep 29, 2009 at 13:19, Keith Goodman wrote: >> I ran across a problem while using numpy. But the problem is more of >> python problem. I hope I am not too far off topic. >> >> I have a class and a subclass: >> >> class myclass: >> >> ? ?def __init__(self, x): >> ? ? ? ?self.x = x >> >> ? ?def __getitem__(self, index): >> ? ? ? ?if type(index) is slice: >> ? ? ? ? ? ?x = self.x[index] >> ? ? ? ? ? ?return myclass(x) >> ? ? ? ?else: >> ? ? ? ? ? ?raise IndexError, 'Only slicing is allowed in this example.' >> >> ? ?def calc(self): >> ? ? ? ?print 'myclass' >> >> ? ?def __repr__(self): >> ? ? ? ?return self.x.__repr__() >> >> class mysubclass(myclass): >> >> ? ?def calc(self): >> ? ? ? ?print 'mySUBclass' >> >> >> I can make an instance of mysubclass: >> >>>> sub = mysubclass(np.array([1,2,3])) >>>> sub >> ? array([1, 2, 3]) >>>> sub.calc() >> mySUBclass >> >> The problem occurs when I slice: >> >>>> sub_slice = sub[1:] >>>> sub_slice.calc() >> myclass ?# <--- Oops. I'd like this to be mySUBclass >> >> Slicing causes sub to change from mysubclass to myclass. It does so >> because __getitem__ returns myclass(x). I could make a __setitem__ >> method in mysubclass that returns mysubclass(x), but I have many such >> methods with the same problem (not shown in the code above). Is there >> another solution? > > ? def __getitem__(self, index): > ? ? ? if type(index) is slice: > ? ? ? ? ? x = self.x[index] > ? ? ? ? ? return type(self)(x) ? # <---------- > ? ? ? else: > ? ? ? ? ? raise IndexError, 'Only slicing is allowed in this example.' Thank you! I wonder if I'll ever come across a problem that you cannot solve, Robert. From d.l.goldsmith at gmail.com Tue Sep 29 15:07:04 2009 From: d.l.goldsmith at gmail.com (David Goldsmith) Date: Tue, 29 Sep 2009 12:07:04 -0700 Subject: [Numpy-discussion] where does numpy get its pow function? In-Reply-To: <3d375d730909291150k43f5f54bl5b2bcbf2b4395783@mail.gmail.com> References: <7f014ea60909290947r4e4b302chdb8d2c8ab61e75db@mail.gmail.com> <3d375d730909290954u12244e40wf9cbe5820ec0e76f@mail.gmail.com> <148958.24547.qm@web34408.mail.mud.yahoo.com> <3d375d730909291150k43f5f54bl5b2bcbf2b4395783@mail.gmail.com> Message-ID: <45d1ab480909291207v11ad2d1ej178c8c6aba52df45@mail.gmail.com> On Tue, Sep 29, 2009 at 11:50 AM, Robert Kern wrote: > When given two specific options, I only say "yes" or "no" when I want > to be annoying. > > Hey, Robert, didn't you want to put emphasis markers around "want"? ;-) Annoyingly yours, DG -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From charlesr.harris at gmail.com Tue Sep 29 15:07:49 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 29 Sep 2009 13:07:49 -0600 Subject: [Numpy-discussion] __array_wrap__ In-Reply-To: References: <3d375d730909291123w31a2add3r9804b97dec3ba445@mail.gmail.com> Message-ID: On Tue, Sep 29, 2009 at 12:55 PM, Neal Becker wrote: > This seems to work now, but I'm wondering if Charles is correct, that > inheritance isn't such a great idea here. > > The advantage of inheritance is I don't have to implement forwarding all > the > functions, a pretty big advantage. (I wonder if there is some way to do > some > of these as a generic 'mixin'?) > > Using inheritance for implementation is not the thing to do. The usual bit of wisdom is to ask the questions "is a?" and "is implemented with?". If the answer to the first question is no, then don't use inheritance. Sometimes private inheritance is used in C++, but even that isn't considered best practice. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Tue Sep 29 15:08:32 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 29 Sep 2009 13:08:32 -0600 Subject: [Numpy-discussion] __array_wrap__ In-Reply-To: <3d375d730909291148w14e9df41n18719629541f3186@mail.gmail.com> References: <3d375d730909291123w31a2add3r9804b97dec3ba445@mail.gmail.com> <3d375d730909291148w14e9df41n18719629541f3186@mail.gmail.com> Message-ID: On Tue, Sep 29, 2009 at 12:48 PM, Robert Kern wrote: > On Tue, Sep 29, 2009 at 13:35, Charles R Harris > wrote: > > > > On Tue, Sep 29, 2009 at 12:23 PM, Robert Kern > wrote: > >> > >> On Tue, Sep 29, 2009 at 13:09, Charles R Harris > >> wrote: > >> > > >> > > >> > On Tue, Sep 29, 2009 at 11:00 AM, Neal Becker > >> > wrote: > >> >> > >> >> fixed_pt arrays need to apply the overflow_policy after operations > >> >> (overflow_policy could be clip, or throw exception). > >> >> > >> >> I thought __array_wrap__ would work for this, but it seems to not be > >> >> called > >> >> when I need it. For example: > >> >> > >> >> In [13]: obj > >> >> Out[13]: fixed_pt_array([ 0, 32, 64, 96, 128]) > >> >> > >> >> In [14]: obj*100 < this should overflow > >> >> enter: [ 0 32 64 96 128] << on entry into __array_wrap > >> >> enter: [0 32 64 96 128] > >> >> exit: [ 0 32 64 96 128] > >> >> Out[14]: fixed_pt_array([ 0, 3200, 6400, 9600, 12800]) > >> >> > >> >> Apparantly, obj*100 is never passed to array_wrap. > >> >> > >> >> Is there another way I can do this? > >> >> > >> > I believe array wrap has to be explicitly called after the fact. > >> > >> Ufuncs call __array_wrap__ implicitly. > >> > > > > Thanks for the info. How do they decide which one to call? > > The .__array_priority__ attribute. > > What if they are the same? Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From robert.kern at gmail.com Tue Sep 29 15:12:35 2009 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 29 Sep 2009 14:12:35 -0500 Subject: [Numpy-discussion] __array_wrap__ In-Reply-To: References: <3d375d730909291123w31a2add3r9804b97dec3ba445@mail.gmail.com> <3d375d730909291148w14e9df41n18719629541f3186@mail.gmail.com> Message-ID: <3d375d730909291212r360dc095ib4fc01b7f7657d53@mail.gmail.com> On Tue, Sep 29, 2009 at 14:08, Charles R Harris wrote: > > > On Tue, Sep 29, 2009 at 12:48 PM, Robert Kern wrote: >> >> On Tue, Sep 29, 2009 at 13:35, Charles R Harris >> wrote: >> > >> > On Tue, Sep 29, 2009 at 12:23 PM, Robert Kern >> > wrote: >> >> >> >> On Tue, Sep 29, 2009 at 13:09, Charles R Harris >> >> wrote: >> >> > >> >> > >> >> > On Tue, Sep 29, 2009 at 11:00 AM, Neal Becker >> >> > wrote: >> >> >> >> >> >> fixed_pt arrays need to apply the overflow_policy after operations >> >> >> (overflow_policy could be clip, or throw exception). >> >> >> >> >> >> I thought __array_wrap__ would work for this, but it seems to not be >> >> >> called >> >> >> when I need it. ?For example: >> >> >> >> >> >> In [13]: obj >> >> >> Out[13]: fixed_pt_array([ ?0, ?32, ?64, ?96, 128]) >> >> >> >> >> >> In [14]: obj*100 < this should overflow >> >> >> enter: [ ?0 ?32 ?64 ?96 128] << on entry into __array_wrap >> >> >> enter: [0 32 64 96 128] >> >> >> exit: [ ?0 ?32 ?64 ?96 128] >> >> >> Out[14]: fixed_pt_array([ ? ?0, ?3200, ?6400, ?9600, 12800]) >> >> >> >> >> >> Apparantly, obj*100 is never passed to array_wrap. >> >> >> >> >> >> Is there another way I can do this? >> >> >> >> >> > I believe array wrap has to be explicitly called after the fact. >> >> >> >> Ufuncs call __array_wrap__ implicitly. >> >> >> > >> > Thanks for the info. How do they decide which one to call? >> >> The .__array_priority__ attribute. > > What if they are the same? http://docs.scipy.org/doc/numpy/user/c-info.beyond-basics.html?highlight=__array_priority__#ndarray.__array_priority__ -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From charlesr.harris at gmail.com Tue Sep 29 15:17:38 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 29 Sep 2009 13:17:38 -0600 Subject: [Numpy-discussion] __array_wrap__ In-Reply-To: <3d375d730909291212r360dc095ib4fc01b7f7657d53@mail.gmail.com> References: <3d375d730909291123w31a2add3r9804b97dec3ba445@mail.gmail.com> <3d375d730909291148w14e9df41n18719629541f3186@mail.gmail.com> <3d375d730909291212r360dc095ib4fc01b7f7657d53@mail.gmail.com> Message-ID: On Tue, Sep 29, 2009 at 1:12 PM, Robert Kern wrote: > On Tue, Sep 29, 2009 at 14:08, Charles R Harris > wrote: > > > > > > On Tue, Sep 29, 2009 at 12:48 PM, Robert Kern > wrote: > >> > >> On Tue, Sep 29, 2009 at 13:35, Charles R Harris > >> wrote: > >> > > >> > On Tue, Sep 29, 2009 at 12:23 PM, Robert Kern > >> > wrote: > >> >> > >> >> On Tue, Sep 29, 2009 at 13:09, Charles R Harris > >> >> wrote: > >> >> > > >> >> > > >> >> > On Tue, Sep 29, 2009 at 11:00 AM, Neal Becker > > >> >> > wrote: > >> >> >> > >> >> >> fixed_pt arrays need to apply the overflow_policy after operations > >> >> >> (overflow_policy could be clip, or throw exception). > >> >> >> > >> >> >> I thought __array_wrap__ would work for this, but it seems to not > be > >> >> >> called > >> >> >> when I need it. 
For example: > >> >> >> > >> >> >> In [13]: obj > >> >> >> Out[13]: fixed_pt_array([ 0, 32, 64, 96, 128]) > >> >> >> > >> >> >> In [14]: obj*100 < this should overflow > >> >> >> enter: [ 0 32 64 96 128] << on entry into __array_wrap > >> >> >> enter: [0 32 64 96 128] > >> >> >> exit: [ 0 32 64 96 128] > >> >> >> Out[14]: fixed_pt_array([ 0, 3200, 6400, 9600, 12800]) > >> >> >> > >> >> >> Apparantly, obj*100 is never passed to array_wrap. > >> >> >> > >> >> >> Is there another way I can do this? > >> >> >> > >> >> > I believe array wrap has to be explicitly called after the fact. > >> >> > >> >> Ufuncs call __array_wrap__ implicitly. > >> >> > >> > > >> > Thanks for the info. How do they decide which one to call? > >> > >> The .__array_priority__ attribute. > > > > What if they are the same? > > > http://docs.scipy.org/doc/numpy/user/c-info.beyond-basics.html?highlight=__array_priority__#ndarray.__array_priority__ > > And that is why it is a bad idea. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From Chris.Barker at noaa.gov Tue Sep 29 15:28:26 2009 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Tue, 29 Sep 2009 12:28:26 -0700 Subject: [Numpy-discussion] Question about improving genfromtxt errors In-Reply-To: References: <4ABD05D6.6020706@gmail.com> <4D357173-C408-4AF4-8FFD-C77363915D13@gmail.com> <4ABD160E.1060503@noaa.gov> <57D14F44-04C0-4A50-B49D-E9CA2094F165@gmail.com> <4ABD1F3B.9040507@noaa.gov> <4AC0E740.60309@noaa.gov> <4AC237BA.9050104@noaa.gov> Message-ID: <4AC25FDA.1040406@noaa.gov> Pierre GM wrote: >> Another idea: only store the indexes of the rows that have the "wrong" >> number of columns -- if that's a large number, then then user has >> bigger >> problems than memory usage! > > That was my first idea, but then it adds tests in the inside loop > (which is what I'm trying to avoid)... well, how does one test compare to: read the line from the file split the line into tokens parse each token I can't imagine it's significant, but I guess you only know with profiling. How does it handle the wrong number of tokens now? if an exception is raised somewhere, then that's the only place you'd need to anything extra anyway. > OK, I should be able to start working on that in the next few days. cool! -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From pgmdevlist at gmail.com Tue Sep 29 15:35:19 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Tue, 29 Sep 2009 15:35:19 -0400 Subject: [Numpy-discussion] Question about improving genfromtxt errors In-Reply-To: <4AC25FDA.1040406@noaa.gov> References: <4ABD05D6.6020706@gmail.com> <4D357173-C408-4AF4-8FFD-C77363915D13@gmail.com> <4ABD160E.1060503@noaa.gov> <57D14F44-04C0-4A50-B49D-E9CA2094F165@gmail.com> <4ABD1F3B.9040507@noaa.gov> <4AC0E740.60309@noaa.gov> <4AC237BA.9050104@noaa.gov> <4AC25FDA.1040406@noaa.gov> Message-ID: <7E0FAF52-C5C9-4CE2-BD62-065DC4D8EA25@gmail.com> On Sep 29, 2009, at 3:28 PM, Christopher Barker wrote: > > well, how does one test compare to: > > read the line from the file > split the line into tokens > parse each token > > I can't imagine it's significant, but I guess you only know with > profiling. That's on the parsing part. I'd like to keep it as light as possible. > How does it handle the wrong number of tokens now? 
if an exception is > raised somewhere, then that's the only place you'd need to anything > extra anyway. It silently fails outside the loop, when the list of splitted rows is converted into an array: if one row has a different length than the others, a "Creating array from a sequence" error occurs but we can't tell where the problem is (because np.array does not tell us). From d.l.goldsmith at gmail.com Tue Sep 29 15:48:07 2009 From: d.l.goldsmith at gmail.com (David Goldsmith) Date: Tue, 29 Sep 2009 12:48:07 -0700 Subject: [Numpy-discussion] [SciPy-dev] Deprecate chararray [was Plea for help] In-Reply-To: <4AC24A0C.4090005@stsci.edu> References: <857977.74958.qm@web52106.mail.re2.yahoo.com> <812BBECE-D1E8-4699-980A-BB8FB9657CB9@stsci.edu> <4AB9103F.7030207@stsci.edu> <45d1ab480909221951h6006deb9u4675e419c9f0b256@mail.gmail.com> <4ABCC0A1.30402@stsci.edu> <45d1ab480909250923t2dd716a2s502967fb231dd889@mail.gmail.com> <4AC24A0C.4090005@stsci.edu> Message-ID: <45d1ab480909291248j107047fld5da82eb607d5614@mail.gmail.com> On Tue, Sep 29, 2009 at 10:55 AM, Michael Droettboom wrote: > 2) Improve documentation > > Every method now has a docstring, and a new page of routines has been > added to the Sphinx tree. > > Um, where did you do this, 'cause it's not showing up in the doc wiki. DG -------------- next part -------------- An HTML attachment was scrubbed... URL: From dwf at cs.toronto.edu Tue Sep 29 16:32:28 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Tue, 29 Sep 2009 16:32:28 -0400 Subject: [Numpy-discussion] Another dumb structured array question Message-ID: Is there an easy way to get multiple subdtypes out? e.g. if I have a dtype dtype([('foo', 'i4'), ('bar', 'i8'), ('baz', 'S100')]) and an array with that dtype, is there a way to only get the 'foo' and 'bar'? arr[('foo','bar')] doesn't seem to work. David From Chris.Barker at noaa.gov Tue Sep 29 16:34:16 2009 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Tue, 29 Sep 2009 13:34:16 -0700 Subject: [Numpy-discussion] making the distinction between -0.0 and 0.0.. In-Reply-To: References: <4AC23B94.5010604@noaa.gov> <4AC23E09.2040302@noaa.gov> Message-ID: <4AC26F48.3020304@noaa.gov> Joe Kington wrote: > I just realized that what I'm doing won't work on older versions of > python, anyway... What I was looking for was which actual bit the sign bit is, as expressed as a native integer, so I can do a bitwise_and. But now that I think about it, I only need to test zero, not all numbers, so I should be able to do: def signbit(x): if x < 0.0: return True elif x == 0.0: if struct.pack('d',x) == struct.pack('d',-0.0): return True else return False Fortunately, this isn't performance critical. > Of course, you could always do x.__repr__().startswith('-'), but it's > slow, ugly, and you already said you wanted to avoid it. well, that's not what I wanted to avoid -- what I wanted to avoid was parsing the original input string. This would be fine! I wonder if it's better or worse than using struct as above? -Chris -- Christopher Barker, Ph.D. 
Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From bsouthey at gmail.com Tue Sep 29 16:36:59 2009 From: bsouthey at gmail.com (Bruce Southey) Date: Tue, 29 Sep 2009 15:36:59 -0500 Subject: [Numpy-discussion] Question about improving genfromtxt errors In-Reply-To: <00CD2A47-A721-4A46-8D81-203BE795A6E4@gmail.com> References: <4ABD05D6.6020706@gmail.com> <4D357173-C408-4AF4-8FFD-C77363915D13@gmail.com> <4ABD160E.1060503@noaa.gov> <57D14F44-04C0-4A50-B49D-E9CA2094F165@gmail.com> <4ABD1F3B.9040507@noaa.gov> <4AC0E740.60309@noaa.gov> <4AC237BA.9050104@noaa.gov> <4AC24A7F.4020904@gmail.com> <00CD2A47-A721-4A46-8D81-203BE795A6E4@gmail.com> Message-ID: <4AC26FEB.60401@gmail.com> On 09/29/2009 01:30 PM, Pierre GM wrote: > On Sep 29, 2009, at 1:57 PM, Bruce Southey wrote: > > >> On 09/29/2009 11:37 AM, Christopher Barker wrote: >> >>> Pierre GM wrote: >>> >>> >> Probably more than memory is the execution time involved in printing >> these problem rows. >> > The rows with problems will be printed outside the loop (with at least > an associated warning or possibly raising an exception). My concern is > to whether store only the tuples (index of the row, nb of columns) for > the invalid rows, or just create a list of nb of columns that I'd > parse afterwards. The first solution requires an extra test in the > loop, the second may waste some memory space. > Bah, I'll figure it out. Please send me some test cases so that I can > time/test the best option. > >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > Hi, The first case just has to handle a missing delimiter - actually I expect that most of my cases would relate this. So here is simple Python code to generate arbitrary large list with the occasional missing delimiter. I set it so it reads the desired number of rows and frequency of bad rows from the linux command line. $time python tbig.py 1000000 100000 If I comment out the extra prints in io.py that I put in, it takes about 22 seconds to finish if the delimiters are correct. If I have the missing delimiter it takes 20.5 seconds to crash. Bruce -------------- next part -------------- A non-text attachment was scrubbed... Name: tbig.py Type: text/x-python Size: 530 bytes Desc: not available URL: From robert.kern at gmail.com Tue Sep 29 16:46:09 2009 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 29 Sep 2009 15:46:09 -0500 Subject: [Numpy-discussion] Another dumb structured array question In-Reply-To: References: Message-ID: <3d375d730909291346v578b20fevd17e10d777fea86f@mail.gmail.com> On Tue, Sep 29, 2009 at 15:32, David Warde-Farley wrote: > Is there an easy way to get multiple subdtypes out? e.g. if I have a > dtype > > dtype([('foo', 'i4'), ('bar', 'i8'), ('baz', 'S100')]) > > and an array with that dtype, is there a way to only get the 'foo' and > 'bar'? > > arr[('foo','bar')] doesn't seem to work. A little bit of logic wrapped around numpy.lib.recfunctions.drop_fields() should do the trick. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From robert.kern at gmail.com Tue Sep 29 16:48:10 2009 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 29 Sep 2009 15:48:10 -0500 Subject: [Numpy-discussion] making the distinction between -0.0 and 0.0.. In-Reply-To: <4AC26F48.3020304@noaa.gov> References: <4AC23B94.5010604@noaa.gov> <4AC23E09.2040302@noaa.gov> <4AC26F48.3020304@noaa.gov> Message-ID: <3d375d730909291348v51df360he743931ff2583821@mail.gmail.com> On Tue, Sep 29, 2009 at 15:34, Christopher Barker wrote: > Joe Kington wrote: >> I just realized that what I'm doing won't work on older versions of >> python, anyway... > > What I was looking for was which actual bit the sign bit is, as > expressed as a native integer, so I can do a bitwise_and. > > But now that I think about it, I only need to test zero, not all > numbers, so I should be able to do: > > def signbit(x): > ? ? if x < 0.0: > ? ? ? ? return True > ? ? elif x == 0.0: > ? ? ? ? if struct.pack('d',x) == struct.pack('d',-0.0): > ? ? ? ? ? ? return True > ? ? else > ? ? ? ? return False > > > Fortunately, this isn't performance critical. To speed it up some, precalculate the struct.pack('d', -0.0) constant outside of the function. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From ndbecker2 at gmail.com Tue Sep 29 16:52:36 2009 From: ndbecker2 at gmail.com (Neal Becker) Date: Tue, 29 Sep 2009 16:52:36 -0400 Subject: [Numpy-discussion] max value of np scalars Message-ID: I need the max value of an np scalar type. I had used this code: def get_max(is_signed, base_type, total_bits): print 'get_max:', is_signed, base_type, total_bits if is_signed: return (~(base_type(-1) << (total_bits-1))) else: print type(base_type (-1) << total_bits) return (~(base_type (-1) << total_bits)) This doesn't work for e.g., np.uint64. As the 'print' shows, get_max: False 10 The type of np.uint64 (-1) << 10 is not np.uint64, but long. This seems very strange to me. So, 2 questions. 1) Is this expected behavior? 2) How can I correctly implement get_max? From lists at cheimes.de Tue Sep 29 16:53:52 2009 From: lists at cheimes.de (Christian Heimes) Date: Tue, 29 Sep 2009 22:53:52 +0200 Subject: [Numpy-discussion] making the distinction between -0.0 and 0.0.. In-Reply-To: <4AC23B94.5010604@noaa.gov> References: <4AC23B94.5010604@noaa.gov> Message-ID: Christopher Barker wrote: > Hi folks, > > This isn't really a numpy question, and I'm doing this with regular old > python, but I figure you are the folks that would know this: > > How do I get python to make a distinction between -0.0 and 0.0? IN this > case, I'm starting with user input, so: How about using atan2()? :) >>> from math import atan2 >>> atan2(-0., -1.) -3.1415926535897931 >>> atan2(0., -1.) 3.1415926535897931 from math import atan2 def sign(x): if x > 0.: return +1 if x < 0.: return -1 # x == 0. or -0. if atan2(x, -1.) > 0: return +1 else: return -1 From robert.kern at gmail.com Tue Sep 29 17:10:22 2009 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 29 Sep 2009 16:10:22 -0500 Subject: [Numpy-discussion] max value of np scalars In-Reply-To: References: Message-ID: <3d375d730909291410p31e0e3acu2a377ca47b36f357@mail.gmail.com> On Tue, Sep 29, 2009 at 15:52, Neal Becker wrote: > I need the max value of an np scalar type. ?I had used this code: > > def get_max(is_signed, base_type, total_bits): > ? ?print 'get_max:', is_signed, base_type, total_bits > ? 
?if is_signed: > ? ? ? ?return (~(base_type(-1) << (total_bits-1))) > ? ?else: > ? ? ? ?print type(base_type (-1) << total_bits) > ? ? ? ?return (~(base_type (-1) << total_bits)) > > This doesn't work for e.g., np.uint64. ?As the 'print' shows, > ?get_max: False 10 > > > The type of np.uint64 (-1) << 10 is not np.uint64, but long. ?This seems > very strange to me. > > So, 2 questions. > > 1) Is this expected behavior? Could be. I'm not entirely sure why it would be doing this, but the code does fall back to generic object implementations under certain conditions. Of course, np.uint64(-1) is the correct answer for unsigned integer types since we implement wraparound. > 2) How can I correctly implement get_max? np.iinfo() for integer types and np.finfo() for floating point types. In [1]: np.iinfo(np.uint64).max Out[1]: 18446744073709551615L In [2]: np.finfo(np.float32).max Out[2]: 3.4028235e+38 -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From charlesr.harris at gmail.com Tue Sep 29 17:14:29 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 29 Sep 2009 15:14:29 -0600 Subject: [Numpy-discussion] max value of np scalars In-Reply-To: References: Message-ID: On Tue, Sep 29, 2009 at 2:52 PM, Neal Becker wrote: > I need the max value of an np scalar type. I had used this code: > > def get_max(is_signed, base_type, total_bits): > print 'get_max:', is_signed, base_type, total_bits > if is_signed: > return (~(base_type(-1) << (total_bits-1))) > else: > print type(base_type (-1) << total_bits) > return (~(base_type (-1) << total_bits)) > > This doesn't work for e.g., np.uint64. As the 'print' shows, > get_max: False 10 > > > The type of np.uint64 (-1) << 10 is not np.uint64, but long. This seems > very strange to me. > > So, 2 questions. > > 1) Is this expected behavior? > > 2) How can I correctly implement get_max? > > Some odd behavior here: In [24]: left_shift(uint64(-1), 1) Out[24]: 36893488147419103230L In [25]: type(left_shift(uint64(-1), 1)) Out[25]: In [26]: type(left_shift(uint32(-1), 1)) Out[26]: In [27]: type(uint32(-1)) Out[27]: In [28]: type(left_shift(uint32(-1), 1)) Out[28]: In [29]: type(uint64(-1)) Out[29]: I don't think the arguments should be promoted for what should(?) be bitwise operations. Needs some discussion. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From Chris.Barker at noaa.gov Tue Sep 29 17:24:50 2009 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Tue, 29 Sep 2009 14:24:50 -0700 Subject: [Numpy-discussion] making the distinction between -0.0 and 0.0.. In-Reply-To: References: <4AC23B94.5010604@noaa.gov> Message-ID: <4AC27B22.5000101@noaa.gov> Christian Heimes wrote: > How about using atan2()? :) unless atan2 shortcuts for the easy ones, that doesn't strike me as efficient (though with python function call overhead, maybe!). Anyway, of course, some googling that I should have done in the first place, revealed "double.py", from Martin Jansche: http://symptotic.com/mj/code.html (MIT license). double.py provides a full(?) set of IEEE functions for doubles in Python. His solution to the problem at hand is: def signbit(value): """ Test whether the sign bit of the given floating-point value is set. If it is set, this generally means the given value is negative. However, this is not the same as comparing the value to C{0.0}. 
For example: >>> NEGATIVE_ZERO < 0.0 False since negative zero is numerically equal to positive zero. But the sign bit of negative zero is indeed set: >>> signbit(NEGATIVE_ZERO) True >>> signbit(0.0) False @type value: float @param value: a Python (double-precision) float value @rtype: bool @return: C{True} if the sign bit of C{value} is set; C{False} if it is not set. """ return (doubleToRawLongBits(value) >> 63) == 1 where: def doubleToRawLongBits(value): """ @type value: float @param value: a Python (double-precision) float value @rtype: long @return: the IEEE 754 bit representation (64 bits as a long integer) of the given double-precision floating-point value. """ # pack double into 64 bits, then unpack as long int return _struct.unpack('Q', _struct.pack('d', value))[0] Which is pretty much what I was looking for, though I can't say I've profiled the various options at hand! -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From Chris.Barker at noaa.gov Tue Sep 29 17:27:23 2009 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Tue, 29 Sep 2009 14:27:23 -0700 Subject: [Numpy-discussion] Question about improving genfromtxt errors In-Reply-To: <7E0FAF52-C5C9-4CE2-BD62-065DC4D8EA25@gmail.com> References: <4ABD05D6.6020706@gmail.com> <4D357173-C408-4AF4-8FFD-C77363915D13@gmail.com> <4ABD160E.1060503@noaa.gov> <57D14F44-04C0-4A50-B49D-E9CA2094F165@gmail.com> <4ABD1F3B.9040507@noaa.gov> <4AC0E740.60309@noaa.gov> <4AC237BA.9050104@noaa.gov> <4AC25FDA.1040406@noaa.gov> <7E0FAF52-C5C9-4CE2-BD62-065DC4D8EA25@gmail.com> Message-ID: <4AC27BBB.3060200@noaa.gov> Pierre GM wrote: >> How does it handle the wrong number of tokens now? if an exception is >> raised somewhere, then that's the only place you'd need to anything >> extra anyway. > > It silently fails outside the loop, when the list of splitted rows is > converted into an array: if one row has a different length than the > others, a "Creating array from a sequence" error occurs but we can't > tell where the problem is (because np.array does not tell us). Which brings up a good point -- maybe some of this error reporting should go into np.array? It would be nice to know at least when the failure happened. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From jkington at wisc.edu Tue Sep 29 17:37:55 2009 From: jkington at wisc.edu (Joe Kington) Date: Tue, 29 Sep 2009 16:37:55 -0500 Subject: [Numpy-discussion] making the distinction between -0.0 and 0.0.. In-Reply-To: <4AC27B22.5000101@noaa.gov> References: <4AC23B94.5010604@noaa.gov> <4AC27B22.5000101@noaa.gov> Message-ID: I know it's a bit pointless profiling these, but just so I can avoid doing real work for a bit... 
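The three variants I'm timing are the struct-based check, a string-repr check, and the atan2 check Christian suggested. Written out as plain functions they look roughly like this (a sketch -- the endianness handling in the struct one is the fiddly part and may not match exactly what I ran):

import math
import struct

def comp_struct(x):
    # Pack as a big-endian double and test the sign bit of the first byte.
    sign_byte = struct.pack('>d', x)[:1]
    return (ord(sign_byte) >> 7) == 1

def comp_string(x):
    # Negative zero reprs as '-0.0', so just look for the leading minus.
    return repr(x).startswith('-')

def comp_atan(x):
    # atan2 tells -0.0 from 0.0: atan2(0., -1.) is +pi, atan2(-0., -1.) is -pi.
    if x > 0:
        return False
    elif x < 0:
        return True
    return math.atan2(x, -1.0) < 0

The session below just times each of them on -0.0.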
In [1]: import sys, struct, math In [2]: def comp_struct(x): ...: # Get the first or last byte, depending on endianness ...: # (using '>f' or ' 0: return False ....: elif x < 0: return True ....: elif math.atan2(x, -1.0) > 0: ....: return False ....: else: ....: return True ....: In [7]: comp_atan(0.00); comp_atan(-0.00) Out[7]: False Out[7]: True In [8]: %timeit comp_struct(-0.0) 1000000 loops, best of 3: 971 ns per loop In [9]: %timeit comp_string(-0.0) 1000000 loops, best of 3: 1.43 us per loop In [10]: %timeit comp_atan(-0.0) 1000000 loops, best of 3: 502 ns per loop And just to compare to what was just posted: In [45]: %timeit signbit(-0.0) 1000000 loops, best of 3: 995 ns per loop So, if speed matters, apparently checking things through atan2 is the fastest. (wouldn't have guessed that!) On the other hand I'm being a bit ridiculous profiling these... Use whichever is the most readable, I guess. This is more fun than real work, though! -Joe On Tue, Sep 29, 2009 at 4:24 PM, Christopher Barker wrote: > Christian Heimes wrote: > > How about using atan2()? :) > > unless atan2 shortcuts for the easy ones, that doesn't strike me as > efficient (though with python function call overhead, maybe!). > > Anyway, of course, some googling that I should have done in the first > place, revealed "double.py", from Martin Jansche: > > http://symptotic.com/mj/code.html > (MIT license). > > > double.py provides a full(?) set of IEEE functions for doubles in > Python. His solution to the problem at hand is: > > def signbit(value): > """ > Test whether the sign bit of the given floating-point value is > set. If it is set, this generally means the given value is > negative. However, this is not the same as comparing the value > to C{0.0}. For example: > > >>> NEGATIVE_ZERO < 0.0 > False > > since negative zero is numerically equal to positive zero. But > the sign bit of negative zero is indeed set: > > >>> signbit(NEGATIVE_ZERO) > True > >>> signbit(0.0) > False > > @type value: float > @param value: a Python (double-precision) float value > > @rtype: bool > @return: C{True} if the sign bit of C{value} is set; > C{False} if it is not set. > """ > return (doubleToRawLongBits(value) >> 63) == 1 > > where: > > def doubleToRawLongBits(value): > """ > @type value: float > @param value: a Python (double-precision) float value > > @rtype: long > @return: the IEEE 754 bit representation (64 bits as a long integer) > of the given double-precision floating-point value. > """ > # pack double into 64 bits, then unpack as long int > return _struct.unpack('Q', _struct.pack('d', value))[0] > > > Which is pretty much what I was looking for, though I can't say I've > profiled the various options at hand! > > -Chris > > > > -- > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Tue Sep 29 17:39:43 2009 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 29 Sep 2009 16:39:43 -0500 Subject: [Numpy-discussion] making the distinction between -0.0 and 0.0.. 
In-Reply-To: References: <4AC23B94.5010604@noaa.gov> <4AC27B22.5000101@noaa.gov> Message-ID: <3d375d730909291439u57815640gb81fc51d19eef2a5@mail.gmail.com> On Tue, Sep 29, 2009 at 16:37, Joe Kington wrote: > I know it's a bit pointless profiling these, but just so I can avoid doing > real work for a bit... > > In [1]: import sys, struct, math > > In [2]: def comp_struct(x): > ?? ...:???? # Get the first or last byte, depending on endianness > ?? ...:???? # (using '>f' or ' python's) Did you try 'd'? I wonder if the extra conversion step from a C double to a C float is causing this issue. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From charlesr.harris at gmail.com Tue Sep 29 17:40:13 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 29 Sep 2009 15:40:13 -0600 Subject: [Numpy-discussion] chebyshev module Message-ID: Hi all, I'm at the polishing stage on this module and at this point would like some input on the names. Yeah, a bit late ;) As it stands the module emulates the polynomial module in most things with the substitution of cheb for poly and the poly1d equivalent is Cheb1d. There are also a few deviations, instead of cheb there is chebfromroots, roots is chebroots, there are converters to,from polynomials, the ctor for Cheb1d only takes coefficients (not roots or other instances), and Cheb1d doesn't have the __array__ method. The supported coefficients arrays all have to be 1d, but the argument passed to chebval can be any array. Any suggestions are welcome at this time. I'll post the code soonish for further comment. Oh, and is it advisable to have a __copy__ (or copy) method? Any other functions? I don't want to do a complete workup, that would seem to belong to scipy. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Tue Sep 29 17:43:55 2009 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 29 Sep 2009 16:43:55 -0500 Subject: [Numpy-discussion] chebyshev module In-Reply-To: References: Message-ID: <3d375d730909291443k7d035fa4sdf8cf79bd3fc8807@mail.gmail.com> On Tue, Sep 29, 2009 at 16:40, Charles R Harris wrote: > Oh, and is it advisable to have a __copy__ (or copy) method? Implement __getstate__ and __setstate__. Both the pickle module and the copy module will use those functions. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From jkington at wisc.edu Tue Sep 29 17:49:05 2009 From: jkington at wisc.edu (Joe Kington) Date: Tue, 29 Sep 2009 16:49:05 -0500 Subject: [Numpy-discussion] making the distinction between -0.0 and 0.0.. In-Reply-To: <3d375d730909291439u57815640gb81fc51d19eef2a5@mail.gmail.com> References: <4AC23B94.5010604@noaa.gov> <4AC27B22.5000101@noaa.gov> <3d375d730909291439u57815640gb81fc51d19eef2a5@mail.gmail.com> Message-ID: Using 'd' rather than 'f' doesn't fix the problem... Python 2.3.4 (#1, Jan 9 2007, 16:40:09) [GCC 3.4.6 20060404 (Red Hat 3.4.6-3)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import struct >>> struct.pack('d', -0.0) '\x00\x00\x00\x00\x00\x00\x00\x80' <-- Correct signbit >>> struct.pack('>d', -0.0) '\x00\x00\x00\x00\x00\x00\x00\x00' <-- No signbit. 
Unpacks to 0.0 >>> struct.pack('>> In python 2.6 it works fine. -0.0 packs to '\x00\x00\x00\x00\x00\x00\x00\x80' or '\x80\x00...' as it should. I'm assuming it's a bug that was fixed somewhere in between? On Tue, Sep 29, 2009 at 4:39 PM, Robert Kern wrote: > On Tue, Sep 29, 2009 at 16:37, Joe Kington wrote: > > I know it's a bit pointless profiling these, but just so I can avoid > doing > > real work for a bit... > > > > In [1]: import sys, struct, math > > > > In [2]: def comp_struct(x): > > ...: # Get the first or last byte, depending on endianness > > ...: # (using '>f' or ' > python's) > > Did you try 'd'? I wonder if the extra conversion step from a C double > to a C float is causing this issue. > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > -- Umberto Eco > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Tue Sep 29 17:57:20 2009 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 29 Sep 2009 16:57:20 -0500 Subject: [Numpy-discussion] making the distinction between -0.0 and 0.0.. In-Reply-To: References: <4AC23B94.5010604@noaa.gov> <4AC27B22.5000101@noaa.gov> <3d375d730909291439u57815640gb81fc51d19eef2a5@mail.gmail.com> Message-ID: <3d375d730909291457p1d040c4al4ec2bf246fb916af@mail.gmail.com> On Tue, Sep 29, 2009 at 16:49, Joe Kington wrote: > Using 'd' rather than 'f' doesn't fix the problem... > > Python 2.3.4 (#1, Jan? 9 2007, 16:40:09) > [GCC 3.4.6 20060404 (Red Hat 3.4.6-3)] on linux2 > Type "help", "copyright", "credits" or "license" for more information. >>>> import struct >>>> struct.pack('d', -0.0) > '\x00\x00\x00\x00\x00\x00\x00\x80'?? <-- Correct signbit >>>> struct.pack('>d', -0.0) > '\x00\x00\x00\x00\x00\x00\x00\x00'?? <-- No signbit. Unpacks to 0.0 >>>> struct.pack(' '\x00\x00\x00\x00\x00\x00\x00\x00'?? <-- No signbit. >>>> > > In python 2.6 it works fine. -0.0 packs to > '\x00\x00\x00\x00\x00\x00\x00\x80' or '\x80\x00...' as it should. > > I'm assuming it's a bug that was fixed somewhere in between? Probably. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From Chris.Barker at noaa.gov Tue Sep 29 18:48:59 2009 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Tue, 29 Sep 2009 15:48:59 -0700 Subject: [Numpy-discussion] making the distinction between -0.0 and 0.0.. In-Reply-To: <3d375d730909291457p1d040c4al4ec2bf246fb916af@mail.gmail.com> References: <4AC23B94.5010604@noaa.gov> <4AC27B22.5000101@noaa.gov> <3d375d730909291439u57815640gb81fc51d19eef2a5@mail.gmail.com> <3d375d730909291457p1d040c4al4ec2bf246fb916af@mail.gmail.com> Message-ID: <4AC28EDB.40008@noaa.gov> >> I'm assuming it's a bug that was fixed somewhere in between? It works on my 2.5, on a PPC: In [10]: struct.pack('>d', -0.0) Out[10]: '\x80\x00\x00\x00\x00\x00\x00\x00' In [11]: struct.pack('>> struct.pack('>d', -0.0) '\x00\x00\x00\x00\x00\x00\x00\x00' >>> struct.pack(' I have a prototype for fixed_pt without using inheritance. I think I like it. Any thoughts? -------------- next part -------------- A non-text attachment was scrubbed... 
Name: test_fixed_numpy.py Type: text/x-python Size: 8199 bytes Desc: not available URL: From josef.pktd at gmail.com Tue Sep 29 19:17:16 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 29 Sep 2009 19:17:16 -0400 Subject: [Numpy-discussion] making the distinction between -0.0 and 0.0.. In-Reply-To: <4AC28EDB.40008@noaa.gov> References: <4AC23B94.5010604@noaa.gov> <4AC27B22.5000101@noaa.gov> <3d375d730909291439u57815640gb81fc51d19eef2a5@mail.gmail.com> <3d375d730909291457p1d040c4al4ec2bf246fb916af@mail.gmail.com> <4AC28EDB.40008@noaa.gov> Message-ID: <1cd32cbb0909291617s15d4f20dy39b3783ba8bcf2dd@mail.gmail.com> On Tue, Sep 29, 2009 at 6:48 PM, Christopher Barker wrote: >>> I'm assuming it's a bug that was fixed somewhere in between? > > It works on my 2.5, on a PPC: > > In [10]: struct.pack('>d', -0.0) > Out[10]: '\x80\x00\x00\x00\x00\x00\x00\x00' > > In [11]: struct.pack(' Out[11]: '\x00\x00\x00\x00\x00\x00\x00\x80' > > But not on 2.3.5 on the same PPC (big endian, yes?) > > Python 2.3.5 (#1, Jan 12 2009, 14:43:55) > [GCC 3.3 20030304 (Apple Computer, Inc. build 1819)] on darwin > > ?>>> struct.pack('>d', -0.0) > '\x00\x00\x00\x00\x00\x00\x00\x00' > ?>>> struct.pack(' '\x00\x00\x00\x00\x00\x00\x00\x00' > > I wonder if this is a gcc difference, rather than a python one? 2.3.5 > was compiled with gcc3.3, and 2.5 with gcc 4.0.1. > > I suppose I could test on Windows if I cared.. > WindowsXP: Python 2.5.2 (r252:60911, Feb 21 2008, 13:11:45) [MSC v.1310 32 bit (Intel)] on win32 >>> struct.pack('>d', -0.0) '\x80\x00\x00\x00\x00\x00\x00\x00' >>> struct.pack('>> struct.pack('>d', -0.0) '\x00\x00\x00\x00\x00\x00\x00\x00' >>> struct.pack(' -Chris > > > -- > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R ? ? ? ? ? ?(206) 526-6959 ? voice > 7600 Sand Point Way NE ? (206) 526-6329 ? fax > Seattle, WA ?98115 ? ? ? (206) 526-6317 ? main reception > > Chris.Barker at noaa.gov > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From Chris.Barker at noaa.gov Tue Sep 29 19:29:39 2009 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Tue, 29 Sep 2009 16:29:39 -0700 Subject: [Numpy-discussion] making the distinction between -0.0 and 0.0.. In-Reply-To: <1cd32cbb0909291617s15d4f20dy39b3783ba8bcf2dd@mail.gmail.com> References: <4AC23B94.5010604@noaa.gov> <4AC27B22.5000101@noaa.gov> <3d375d730909291439u57815640gb81fc51d19eef2a5@mail.gmail.com> <3d375d730909291457p1d040c4al4ec2bf246fb916af@mail.gmail.com> <4AC28EDB.40008@noaa.gov> <1cd32cbb0909291617s15d4f20dy39b3783ba8bcf2dd@mail.gmail.com> Message-ID: <4AC29863.4020606@noaa.gov> josef.pktd at gmail.com wrote: > WindowsXP: > > Python 2.5.2 (r252:60911, Feb 21 2008, 13:11:45) [MSC v.1310 32 bit > (Intel)] on win32 >>>> struct.pack('>d', -0.0) > '\x80\x00\x00\x00\x00\x00\x00\x00' >>>> struct.pack(' '\x00\x00\x00\x00\x00\x00\x00\x80' > > Python 2.4.3 (#69, Mar 29 2006, 17:35:34) [MSC v.1310 32 bit (Intel)] on win32 >>>> struct.pack('>d', -0.0) > '\x00\x00\x00\x00\x00\x00\x00\x00' >>>> struct.pack(' '\x00\x00\x00\x00\x00\x00\x00\x00' > > whatever that means, It means this is a Python, not a compiler issue, and it was fixed between versions 2.4.3 and 2.5.2. Water under the bridge... -Chris -- Christopher Barker, Ph.D. 
Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From charlesr.harris at gmail.com Tue Sep 29 20:56:17 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 29 Sep 2009 18:56:17 -0600 Subject: [Numpy-discussion] fixed_pt prototype using aggregation In-Reply-To: References: Message-ID: On Tue, Sep 29, 2009 at 4:50 PM, Neal Becker wrote: > I have a prototype for fixed_pt without using inheritance. I think I like > it. Any thoughts? > > There is a line 177 characters long ;) Looks like a step in the right direction, though. If you add the various operations of interest -- mul, div, add, sub -- to the fixed_pt type you can make object arrays of them pretty easily. Also, FixedPoint would be a more conformant way of naming the class. The Decimal class in python 2.6 might be worth looking at, it is in ./Lib/decimal.py if you download the source. A description is at http://docs.python.org/library/decimal.html . Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From david.huard at gmail.com Wed Sep 30 00:08:20 2009 From: david.huard at gmail.com (David Huard) Date: Wed, 30 Sep 2009 00:08:20 -0400 Subject: [Numpy-discussion] Convert data into rectangular grid In-Reply-To: References: <1cd32cbb0909281648h4ea0e7b7wb099700f589d92e0@mail.gmail.com> Message-ID: <91cf711d0909292108q16b66c5j8ef3160df77635fb@mail.gmail.com> On Mon, Sep 28, 2009 at 8:45 PM, jah wrote: > On Mon, Sep 28, 2009 at 4:48 PM, wrote: > >> On Mon, Sep 28, 2009 at 7:19 PM, jah wrote: >> > Hi, >> > >> > Suppose I have a set of x,y,c data (something useful for >> > matplotlib.pyplot.plot() ). Generally, this data is not rectangular at >> > all. Does there exist a numpy function (or set of functions) which will >> > take this data and construct the smallest two-dimensional arrays X,Y,C ( >> > suitable for matplotlib.pyplot.contour() ). >> > >> > Essentially, I want to pass in the data and a grid step size in the x- >> and >> > y-directions. The function would average the c-values for all points >> which >> > land in any particular square. Optionally, I'd like to be able to >> specify a >> > value to use when there are no points in x,y which are in the square. >> > >> > Hope this makes sense. >> >> If I understand correctly numpy.histogram2d(x, y, ..., weights=c) might >> do >> what you want. >> >> There was a recent thread on its usage. >> > > It is very close, but it normed=True, will first normalize the weights > (undesirably) and then it will normalize the normalized weights by dividing > by the cell area. Instead, what I want is the cell value to be the average > off all the points that were placed in the cell. This seems like a common > use case, so I'm guessing this functionality is present already. So if 3 > points with weights [10,20,30] were placed in cell (i,j), then the cell > should have value 20 (the arithmetic mean of the points placed in the cell). > > Would this work for you ? >>> s = histogram2d(x,y,weights=c) # Not normalized, so you get the sum of the weights >>> n = histogram2d(x,y) # Now you have the number of elements in each bin >>> mean = s/n David > Here is the desired use case: I have a set of x,y,c values that I could > pass into matplotlib's scatter() or hexbin(). I'd like to take this same > set of points and transform them so that I can pass them into matplotlib's > contour() function. 
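Spelling the averaging step out a little more, a minimal sketch (the random test data, the 20x20 binning and the NaN fill for empty cells are all arbitrary choices here):

import numpy as np

# scattered samples: a value c measured at each (x, y) position
x = np.random.uniform(0., 10., 500)
y = np.random.uniform(0., 10., 500)
c = np.sin(x) + np.cos(y)

bins = (20, 20)
s, xedges, yedges = np.histogram2d(x, y, bins=bins, weights=c)  # sum of c per cell
n, _, _ = np.histogram2d(x, y, bins=bins)                       # points per cell

counts = np.where(n > 0, n, 1)               # dodge 0/0 in empty cells
mean = np.where(n > 0, s / counts, np.nan)   # NaN marks cells with no data

# cell centres, e.g. for contour(xc, yc, mean.T)
xc = 0.5 * (xedges[:-1] + xedges[1:])
yc = 0.5 * (yedges[:-1] + yedges[1:])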
Perhaps matplotlib has a function which does this. > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav+sp at iki.fi Wed Sep 30 02:57:08 2009 From: pav+sp at iki.fi (Pauli Virtanen) Date: Wed, 30 Sep 2009 06:57:08 +0000 (UTC) Subject: [Numpy-discussion] __array_wrap__ References: <3d375d730909291123w31a2add3r9804b97dec3ba445@mail.gmail.com> Message-ID: Tue, 29 Sep 2009 14:55:44 -0400, Neal Becker wrote: > This seems to work now, but I'm wondering if Charles is correct, that > inheritance isn't such a great idea here. > > The advantage of inheritance is I don't have to implement forwarding all > the functions, a pretty big advantage. (I wonder if there is some way to > do some of these as a generic 'mixin'?) The usual approach is to use __getattr__, to forward many routines with little extra work. -- Pauli Virtanen From dsdale24 at gmail.com Wed Sep 30 07:30:30 2009 From: dsdale24 at gmail.com (Darren Dale) Date: Wed, 30 Sep 2009 07:30:30 -0400 Subject: [Numpy-discussion] __array_wrap__ In-Reply-To: References: <3d375d730909291123w31a2add3r9804b97dec3ba445@mail.gmail.com> Message-ID: On Wed, Sep 30, 2009 at 2:57 AM, Pauli Virtanen wrote: > Tue, 29 Sep 2009 14:55:44 -0400, Neal Becker wrote: > >> This seems to work now, but I'm wondering if Charles is correct, that >> inheritance isn't such a great idea here. >> >> The advantage of inheritance is I don't have to implement forwarding all >> the functions, a pretty big advantage. (I wonder if there is some way to >> do some of these as a generic 'mixin'?) > > The usual approach is to use __getattr__, to forward many routines with > little extra work. ... with a side effect of making the API opaque and breaking tab completion in ipython. Darren From sccolbert at gmail.com Wed Sep 30 07:43:05 2009 From: sccolbert at gmail.com (Chris Colbert) Date: Wed, 30 Sep 2009 13:43:05 +0200 Subject: [Numpy-discussion] how does numpy handle views and garbage collection? Message-ID: <7f014ea60909300443n4ff9765fp123a8ffdf236e604@mail.gmail.com> Lets say I have function that applies a homogeneous transformation matrix to an Nx3 array of points using np.dot. since the matrix is 4x4 I have to add a 4 column of ones to the array so the function looks something like this: def foo(): <--snip--> pts = np.column_stack((Xquad, Yquad, Zquad, np.ones(Zquad.shape))) transpts = np.dot(transmat, pts.T).T return transpts[:,:3] Since i'm returning just the view of the array, I imagine python doesnt garbage collect transpts once the function returns and falls out of scope (because numpy has increfed it in the view operation?). So in essence, I still have that whole column of ones hanging around wasting memory, is that about right? 
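A self-contained version of what I mean, with made-up sizes, plus how I've been poking at it:

import numpy as np

pts = np.column_stack((np.ones(1000), np.ones(1000),
                       np.ones(1000), np.ones(1000)))
view = pts[:, :3]            # a view: no data copied

print view.base is pts       # True  -> the full (N, 4) buffer stays alive
print view.flags.owndata     # False

copy = pts[:, :3].copy()     # an independent (N, 3) array
print copy.base is None      # True  -> nothing keeps the (N, 4) buffer around

If slicing really does keep the whole buffer alive, I guess the fix is just to return a copy instead of the view.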
Cheers, Chris From mdroe at stsci.edu Wed Sep 30 09:03:27 2009 From: mdroe at stsci.edu (Michael Droettboom) Date: Wed, 30 Sep 2009 09:03:27 -0400 Subject: [Numpy-discussion] [SciPy-dev] Deprecate chararray [was Plea for help] In-Reply-To: <45d1ab480909291248j107047fld5da82eb607d5614@mail.gmail.com> References: <857977.74958.qm@web52106.mail.re2.yahoo.com> <812BBECE-D1E8-4699-980A-BB8FB9657CB9@stsci.edu> <4AB9103F.7030207@stsci.edu> <45d1ab480909221951h6006deb9u4675e419c9f0b256@mail.gmail.com> <4ABCC0A1.30402@stsci.edu> <45d1ab480909250923t2dd716a2s502967fb231dd889@mail.gmail.com> <4AC24A0C.4090005@stsci.edu> <45d1ab480909291248j107047fld5da82eb607d5614@mail.gmail.com> Message-ID: <4AC3571F.6020901@stsci.edu> In the source in my working copy. Is that going to cause problems? I wasn't sure if it was possible to document methods that didn't yet exist in the code in the wiki. Mike David Goldsmith wrote: > On Tue, Sep 29, 2009 at 10:55 AM, Michael Droettboom > wrote: > > 2) Improve documentation > > Every method now has a docstring, and a new page of routines has been > added to the Sphinx tree. > > > Um, where did you do this, 'cause it's not showing up in the doc wiki. > > DG > ------------------------------------------------------------------------ > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Michael Droettboom Science Software Branch Operations and Engineering Division Space Telescope Science Institute Operated by AURA for NASA From ndbecker2 at gmail.com Wed Sep 30 09:23:21 2009 From: ndbecker2 at gmail.com (Neal Becker) Date: Wed, 30 Sep 2009 09:23:21 -0400 Subject: [Numpy-discussion] fixed_pt prototype using aggregation References: Message-ID: I have implemented lots more fixed_pt operations, code attached (sorry for long lines, hope that's not a problem) -------------- next part -------------- A non-text attachment was scrubbed... Name: test_fixed_numpy.py Type: text/x-python Size: 13278 bytes Desc: not available URL: From ralf.gommers at googlemail.com Wed Sep 30 09:26:04 2009 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Wed, 30 Sep 2009 09:26:04 -0400 Subject: [Numpy-discussion] [SciPy-dev] Deprecate chararray [was Plea for help] In-Reply-To: <4AC3571F.6020901@stsci.edu> References: <857977.74958.qm@web52106.mail.re2.yahoo.com> <4AB9103F.7030207@stsci.edu> <45d1ab480909221951h6006deb9u4675e419c9f0b256@mail.gmail.com> <4ABCC0A1.30402@stsci.edu> <45d1ab480909250923t2dd716a2s502967fb231dd889@mail.gmail.com> <4AC24A0C.4090005@stsci.edu> <45d1ab480909291248j107047fld5da82eb607d5614@mail.gmail.com> <4AC3571F.6020901@stsci.edu> Message-ID: On Wed, Sep 30, 2009 at 9:03 AM, Michael Droettboom wrote: > In the source in my working copy. Is that going to cause problems? I > wasn't sure if it was possible to document methods that didn't yet exist > in the code in the wiki. > > That is fine. New functions will automatically show up in the wiki. It would be helpful though if you could mark them ready for review in the wiki (if they are) after they show up. Could take up to 24 hours for svn changes to propagate. Only if you moved functions around it would be useful if you pinged Pauli after you committed them. This is a temporary problem, right now the wiki creates a new page for a moved object, and the old content (if any) has to be copied over to the new page. 
Cheers, Ralf Mike > > David Goldsmith wrote: > > On Tue, Sep 29, 2009 at 10:55 AM, Michael Droettboom > > wrote: > > > > 2) Improve documentation > > > > Every method now has a docstring, and a new page of routines has been > > added to the Sphinx tree. > > > > > > Um, where did you do this, 'cause it's not showing up in the doc wiki. > > > > DG > > ------------------------------------------------------------------------ > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > -- > Michael Droettboom > Science Software Branch > Operations and Engineering Division > Space Telescope Science Institute > Operated by AURA for NASA > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From oliphant at enthought.com Wed Sep 30 09:51:56 2009 From: oliphant at enthought.com (Travis Oliphant) Date: Wed, 30 Sep 2009 08:51:56 -0500 Subject: [Numpy-discussion] Another dumb structured array question In-Reply-To: References: Message-ID: On Sep 29, 2009, at 3:32 PM, David Warde-Farley wrote: > Is there an easy way to get multiple subdtypes out? e.g. if I have a > dtype > > dtype([('foo', 'i4'), ('bar', 'i8'), ('baz', 'S100')]) > > and an array with that dtype, is there a way to only get the 'foo' and > 'bar'? > > arr[('foo','bar')] doesn't seem to work. Try (with a later version of NumPy --- possibly trunk): arr[['foo', 'bar']] (i.e. with a list instead of a tuple) -Travis From oliphant at enthought.com Wed Sep 30 09:54:49 2009 From: oliphant at enthought.com (Travis Oliphant) Date: Wed, 30 Sep 2009 08:54:49 -0500 Subject: [Numpy-discussion] max value of np scalars In-Reply-To: References: Message-ID: On Sep 29, 2009, at 4:14 PM, Charles R Harris wrote: > > > On Tue, Sep 29, 2009 at 2:52 PM, Neal Becker > wrote: > I need the max value of an np scalar type. I had used this code: > > def get_max(is_signed, base_type, total_bits): > print 'get_max:', is_signed, base_type, total_bits > if is_signed: > return (~(base_type(-1) << (total_bits-1))) > else: > print type(base_type (-1) << total_bits) > return (~(base_type (-1) << total_bits)) > > This doesn't work for e.g., np.uint64. As the 'print' shows, > get_max: False 10 > > > The type of np.uint64 (-1) << 10 is not np.uint64, but long. This > seems > very strange to me. > > So, 2 questions. > > 1) Is this expected behavior? > > 2) How can I correctly implement get_max? > > Some odd behavior here: > > In [24]: left_shift(uint64(-1), 1) > Out[24]: 36893488147419103230L > > In [25]: type(left_shift(uint64(-1), 1)) > Out[25]: > > In [26]: type(left_shift(uint32(-1), 1)) > Out[26]: > > In [27]: type(uint32(-1)) > Out[27]: > > In [28]: type(left_shift(uint32(-1), 1)) > Out[28]: > > In [29]: type(uint64(-1)) > Out[29]: > > I don't think the arguments should be promoted for what should(?) be > bitwise operations. Needs some discussion. Yes, this is a dusty corner that needs some cleaning. -Travis -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From seb.haase at gmail.com Wed Sep 30 10:08:13 2009 From: seb.haase at gmail.com (Sebastian Haase) Date: Wed, 30 Sep 2009 16:08:13 +0200 Subject: [Numpy-discussion] MemoryError with fancy indexing Message-ID: Hi, Maybe someone could explain to me what is going on here!? >>> N.who() Name Shape Bytes Type ============================================================ SLmap 2048 x 2048 16777216 int32 SLmap_fast 2048 x 2048 16777216 int32 SLmap_slow 2048 x 2048 16777216 int32 >>> SLmap[:]=0 >>> SLmap[SLmap_fast]|=2 Traceback (most recent call last): File "", line 1, in MemoryError >>> SLmap[SLmap_fast]=2 >>> SLmap[SLmap_slow]+=1 Traceback (most recent call last): File "", line 1, in MemoryError >>> SLmap[SLmap_fast.astype(N.bool)]+=2 >>> SLmap[:]=0 >>> SLmap[SLmap_fast.astype(bool)]|=2 >>> SLmap[SLmap_slow.astype(bool)]|=1 >>> Why do I run into memory problems using an int32 array as index - while not having any problem when converting it first to dtype=bool ? (I'm on 64-bit Linux, having many GBs physical ram) Thanks, Sebastian Haase From oliphant at enthought.com Wed Sep 30 10:10:58 2009 From: oliphant at enthought.com (Travis Oliphant) Date: Wed, 30 Sep 2009 09:10:58 -0500 Subject: [Numpy-discussion] how does numpy handle views and garbage collection? In-Reply-To: <7f014ea60909300443n4ff9765fp123a8ffdf236e604@mail.gmail.com> References: <7f014ea60909300443n4ff9765fp123a8ffdf236e604@mail.gmail.com> Message-ID: <57230FBF-E716-4D6E-86BC-FE898A551706@enthought.com> On Sep 30, 2009, at 6:43 AM, Chris Colbert wrote: > Lets say I have function that applies a homogeneous transformation > matrix to an Nx3 array of points using np.dot. > > since the matrix is 4x4 I have to add a 4 column of ones to the array > so the function looks something like this: > > def foo(): > <--snip--> > pts = np.column_stack((Xquad, Yquad, Zquad, np.ones(Zquad.shape))) > > transpts = np.dot(transmat, pts.T).T > > return transpts[:,:3] > > Since i'm returning just the view of the array, I imagine python > doesnt garbage collect transpts once the function returns and falls > out of scope (because numpy has increfed it in the view operation?). > > So in essence, I still have that whole column of ones hanging around > wasting memory, is that about right? Yes. You will have the entire underlying array sitting there until the last view on it is deleted. You can make return a copy explicitly using: return transpts[:,:3].copy() Then, the transpts array will be removed when the function returns. -Travis -------------- next part -------------- An HTML attachment was scrubbed... URL: From mdroe at stsci.edu Wed Sep 30 10:20:20 2009 From: mdroe at stsci.edu (Michael Droettboom) Date: Wed, 30 Sep 2009 10:20:20 -0400 Subject: [Numpy-discussion] [SciPy-dev] Deprecate chararray [was Plea for help] In-Reply-To: References: <857977.74958.qm@web52106.mail.re2.yahoo.com> <4AB9103F.7030207@stsci.edu> <45d1ab480909221951h6006deb9u4675e419c9f0b256@mail.gmail.com> <4ABCC0A1.30402@stsci.edu> <45d1ab480909250923t2dd716a2s502967fb231dd889@mail.gmail.com> <4AC24A0C.4090005@stsci.edu> <45d1ab480909291248j107047fld5da82eb607d5614@mail.gmail.com> <4AC3571F.6020901@stsci.edu> Message-ID: <4AC36924.7020605@stsci.edu> Ralf Gommers wrote: > > > On Wed, Sep 30, 2009 at 9:03 AM, Michael Droettboom > wrote: > > In the source in my working copy. Is that going to cause problems? I > wasn't sure if it was possible to document methods that didn't yet > exist > in the code in the wiki. > > That is fine. New functions will automatically show up in the wiki. 
It > would be helpful though if you could mark them ready for review in the > wiki (if they are) after they show up. Could take up to 24 hours for > svn changes to propagate. Thanks. Will do. > > Only if you moved functions around it would be useful if you pinged > Pauli after you committed them. This is a temporary problem, right now > the wiki creates a new page for a moved object, and the old content > (if any) has to be copied over to the new page. All of the functions that were moved were previously without docstrings in SVN, though some had docstrings (that I just now discovered) in the wiki. This may cause some hiccups, I suppose, so I'll be sure to announce when these things get committed to SVN so I know how to help straighten these things out. Mike > > Cheers, > Ralf > > > Mike > > David Goldsmith wrote: > > On Tue, Sep 29, 2009 at 10:55 AM, Michael Droettboom > > > >> wrote: > > > > 2) Improve documentation > > > > Every method now has a docstring, and a new page of routines > has been > > added to the Sphinx tree. > > > > > > Um, where did you do this, 'cause it's not showing up in the doc > wiki. > > > > DG > > > ------------------------------------------------------------------------ > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > -- > Michael Droettboom > Science Software Branch > Operations and Engineering Division > Space Telescope Science Institute > Operated by AURA for NASA > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > ------------------------------------------------------------------------ > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Michael Droettboom Science Software Branch Operations and Engineering Division Space Telescope Science Institute Operated by AURA for NASA From jsseabold at gmail.com Wed Sep 30 10:24:30 2009 From: jsseabold at gmail.com (Skipper Seabold) Date: Wed, 30 Sep 2009 10:24:30 -0400 Subject: [Numpy-discussion] Another dumb structured array question In-Reply-To: References: Message-ID: On Wed, Sep 30, 2009 at 9:51 AM, Travis Oliphant wrote: > > On Sep 29, 2009, at 3:32 PM, David Warde-Farley wrote: > >> Is there an easy way to get multiple subdtypes out? e.g. if I have a >> dtype >> >> dtype([('foo', 'i4'), ('bar', 'i8'), ('baz', 'S100')]) >> >> and an array with that dtype, is there a way to only get the 'foo' and >> 'bar'? >> >> arr[('foo','bar')] doesn't seem to work. > > Try ?(with a later version of NumPy --- possibly trunk): > > arr[['foo', 'bar']] > > (i.e. with a list instead of a tuple) > That's really helpful to know. Is it too early to document this on the wiki ? From dagss at student.matnat.uio.no Wed Sep 30 10:34:33 2009 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Wed, 30 Sep 2009 16:34:33 +0200 Subject: [Numpy-discussion] ufunc and errors Message-ID: <4AC36C79.1030202@student.matnat.uio.no> I looked and looked in the docs, but couldn't find an answer to this: When writing a ufunc, is it possible somehow to raise a Python exception (by acquiring the GIL first to raise it, set a flag and a callback which will be called with the GIL, or otherwise?). 
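For concreteness, the calling-side behaviour I am after looks roughly like this, with np.sqrt of a negative number standing in for my ufunc hitting a nonsensical input:

import numpy as np

x = np.array([1.0, -1.0, 4.0])   # -1.0 plays the role of an invalid argument

y = np.sqrt(x)                   # quietly gives [ 1., nan, 2.]
                                 # (possibly with a warning, depending on seterr defaults)

with np.errstate(invalid='raise'):
    try:
        np.sqrt(x)
    except FloatingPointError as err:
        print 'caught: %s' % err # invalid value encountered in sqrt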
Or should one always use NaN even if the input does not make any sense (like, herhm, passing anything but integers or half-integers to a Wigner 3j symbol). I know how I'd to it manually in a wrapper w/ passed in context if not, but wanted to see. Also, will the arguments always be named x1, x2, x3, ..., or can I somehow give them custom names? Dag Sverre From ralf.gommers at googlemail.com Wed Sep 30 11:01:34 2009 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Wed, 30 Sep 2009 11:01:34 -0400 Subject: [Numpy-discussion] Another dumb structured array question In-Reply-To: References: Message-ID: On Wed, Sep 30, 2009 at 10:24 AM, Skipper Seabold wrote: > On Wed, Sep 30, 2009 at 9:51 AM, Travis Oliphant > wrote: > > > > On Sep 29, 2009, at 3:32 PM, David Warde-Farley wrote: > > > >> Is there an easy way to get multiple subdtypes out? e.g. if I have a > >> dtype > >> > >> dtype([('foo', 'i4'), ('bar', 'i8'), ('baz', 'S100')]) > >> > >> and an array with that dtype, is there a way to only get the 'foo' and > >> 'bar'? > >> > >> arr[('foo','bar')] doesn't seem to work. > > > > Try (with a later version of NumPy --- possibly trunk): > > > > arr[['foo', 'bar']] > > > > (i.e. with a list instead of a tuple) > > > > That's really helpful to know. Is it too early to document this on > the wiki ? > Not at all. If it works in trunk it can be documented here: http://docs.scipy.org/numpy/docs/numpy-docs/user/basics.rec.rst/ Cheers, Ralf > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jsseabold at gmail.com Wed Sep 30 11:22:39 2009 From: jsseabold at gmail.com (Skipper Seabold) Date: Wed, 30 Sep 2009 11:22:39 -0400 Subject: [Numpy-discussion] Question about improving genfromtxt errors In-Reply-To: <4AC26FEB.60401@gmail.com> References: <4ABD1F3B.9040507@noaa.gov> <4AC0E740.60309@noaa.gov> <4AC237BA.9050104@noaa.gov> <4AC24A7F.4020904@gmail.com> <00CD2A47-A721-4A46-8D81-203BE795A6E4@gmail.com> <4AC26FEB.60401@gmail.com> Message-ID: On Tue, Sep 29, 2009 at 4:36 PM, Bruce Southey wrote: > > Hi, > The first case just has to handle a missing delimiter - actually I expect > that most of my cases would relate this. So here is simple Python code to > generate arbitrary large list with the occasional missing delimiter. > > I set it so it reads the desired number of rows and frequency of bad rows > from the linux command line. > $time python tbig.py 1000000 100000 > > If I comment out the extra prints in io.py that I put in, it takes about 22 > seconds to finish if the delimiters are correct. If I have the missing > delimiter it takes 20.5 seconds to crash. > > > Bruce > I think this would actually cover most of the problems I was running into. The only other one I can think of is when I used a converter that I thought would work, but it got unexpected data. 
For example, from StringIO import StringIO import numpy as np strip_rand = lambda x : float(('r' in x.lower() and x.split()[-1]) or (not 'r' in x.lower() and x.strip() or 0.0)) # Example usage strip_rand('R 40') strip_rand(' ') strip_rand('') strip_rand('40') strip_per = lambda x : float(('%' in x.lower() and x.split()[0]) or (not '%' in x.lower() and x.strip() or 0.0)) # Example usage strip_per('7 %') strip_per('7') strip_per(' ') strip_per('') # Unexpected usage strip_per('R 1') s = StringIO('D01N01,10/1/2003 ,1 %,R 75,400,600\r\nL24U05,12/5/2003\ ,2 %,1,300, 150.5\r\nD02N03,10/10/2004 ,R 1,,7,145.55') data = np.genfromtxt(s, converters = {2 : strip_per, 3 : strip_rand}, delimiter=",", dtype=None) I don't have a clean install right now, but I think this returned a converter is locked for upgrading error. I would just like to know where the problem occured (line and column, preferably not zero-indexed), so I can go and have a look at my data. One more note, being able to autostrip whitespace turned out to be very helpful. I didn't realize how much memory strings of spaces could take up, and as soon as I turned this on, I was able to process an array with a lot of whitespace without filling up my memory. So I think maybe autostrip should be turned on by default? I will post anything else if it occurs to me. Skipper From robert.kern at gmail.com Wed Sep 30 11:33:46 2009 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 30 Sep 2009 10:33:46 -0500 Subject: [Numpy-discussion] ufunc and errors In-Reply-To: <4AC36C79.1030202@student.matnat.uio.no> References: <4AC36C79.1030202@student.matnat.uio.no> Message-ID: <3d375d730909300833o7b2db961m45d9ad26575955cf@mail.gmail.com> On Wed, Sep 30, 2009 at 09:34, Dag Sverre Seljebotn wrote: > I looked and looked in the docs, but couldn't find an answer to this: > When writing a ufunc, is it possible somehow to raise a Python exception > (by acquiring the GIL first to raise it, set a flag and a callback which > will be called with the GIL, or otherwise?). You cannot acquire the GIL inside the loop. In order to do so, you would have to have access to the saved PyGILState_STATE which you don't. > Or should one always use > NaN even if the input does not make any sense (like, herhm, passing > anything but integers or half-integers to a Wigner 3j symbol). You should use a NaN and ideally set the fpstatus to INVALID (creating the NaN may or may not do this; you will have to experiment). This will allow people to handle the issue as they wish using numpy.seterr(). An exception for just one value out of thousands is often undesirable. > I know how I'd to it manually in a wrapper w/ passed in context if not, > but wanted to see. > > Also, will the arguments always be named x1, x2, x3, ..., or can I > somehow give them custom names? The only place where names appear is in the docstring. Write whatever text you like. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From denis-bz-py at t-online.de Wed Sep 30 11:42:23 2009 From: denis-bz-py at t-online.de (denis bzowy) Date: Wed, 30 Sep 2009 15:42:23 +0000 (UTC) Subject: [Numpy-discussion] =?utf-8?q?numpy_macosx10=2E5_binaries=3A_compa?= =?utf-8?q?tible_with=0910=2E4=3F?= References: <4AB93204.60504@noaa.gov> Message-ID: > Russell E. Owen wrote: > > All the official numpy 1.3.0 Mac binaries are labelled "macosx10.5". 
> > Does anyone know if these are backwards compatible with MacOS X 10.4 numpy-1.3.0-py2.5-macosx10.5.dmg works fine on macosx 10.4.11 ppc (with Python 2.5.1) -- denis From jsseabold at gmail.com Wed Sep 30 11:42:55 2009 From: jsseabold at gmail.com (Skipper Seabold) Date: Wed, 30 Sep 2009 11:42:55 -0400 Subject: [Numpy-discussion] Question about improving genfromtxt errors In-Reply-To: <4AC26FEB.60401@gmail.com> References: <4ABD1F3B.9040507@noaa.gov> <4AC0E740.60309@noaa.gov> <4AC237BA.9050104@noaa.gov> <4AC24A7F.4020904@gmail.com> <00CD2A47-A721-4A46-8D81-203BE795A6E4@gmail.com> <4AC26FEB.60401@gmail.com> Message-ID: On Tue, Sep 29, 2009 at 4:36 PM, Bruce Southey wrote: > > Hi, > The first case just has to handle a missing delimiter - actually I expect > that most of my cases would relate this. So here is simple Python code to > generate arbitrary large list with the occasional missing delimiter. > > I set it so it reads the desired number of rows and frequency of bad rows > from the linux command line. > $time python tbig.py 1000000 100000 > > If I comment out the extra prints in io.py that I put in, it takes about 22 > seconds to finish if the delimiters are correct. If I have the missing > delimiter it takes 20.5 seconds to crash. > One other point that perhaps goes without saying is that we want to detect missing and extra delimiters (eg., commas for 1000s). Skipper From jsseabold at gmail.com Wed Sep 30 11:47:13 2009 From: jsseabold at gmail.com (Skipper Seabold) Date: Wed, 30 Sep 2009 11:47:13 -0400 Subject: [Numpy-discussion] Another dumb structured array question In-Reply-To: References: Message-ID: On Wed, Sep 30, 2009 at 11:01 AM, Ralf Gommers wrote: > > > On Wed, Sep 30, 2009 at 10:24 AM, Skipper Seabold > wrote: >> >> On Wed, Sep 30, 2009 at 9:51 AM, Travis Oliphant >> wrote: >> > >> > On Sep 29, 2009, at 3:32 PM, David Warde-Farley wrote: >> > >> >> Is there an easy way to get multiple subdtypes out? e.g. if I have a >> >> dtype >> >> >> >> dtype([('foo', 'i4'), ('bar', 'i8'), ('baz', 'S100')]) >> >> >> >> and an array with that dtype, is there a way to only get the 'foo' and >> >> 'bar'? >> >> >> >> arr[('foo','bar')] doesn't seem to work. >> > >> > Try ?(with a later version of NumPy --- possibly trunk): >> > >> > arr[['foo', 'bar']] >> > >> > (i.e. with a list instead of a tuple) >> > >> >> That's really helpful to know. ?Is it too early to document this on >> the wiki ? > > Not at all. If it works in trunk it can be documented here: > http://docs.scipy.org/numpy/docs/numpy-docs/user/basics.rec.rst/ > Is the only way to edit these that use the automodule directive to submit a patch to, say, ? Skipper From denis-bz-py at t-online.de Wed Sep 30 11:57:44 2009 From: denis-bz-py at t-online.de (denis bzowy) Date: Wed, 30 Sep 2009 15:57:44 +0000 (UTC) Subject: [Numpy-discussion] Convert data into rectangular grid References: Message-ID: jah gmail.com> writes: > > Hi,Suppose I have a set of x,y,c data ... matplotlib.pyplot.contour() ). > JAH, is griddata() working and fast enough for you ? How many points are you contouring ? From ralf.gommers at googlemail.com Wed Sep 30 12:00:28 2009 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Wed, 30 Sep 2009 12:00:28 -0400 Subject: [Numpy-discussion] Another dumb structured array question In-Reply-To: References: Message-ID: On Wed, Sep 30, 2009 at 11:47 AM, Skipper Seabold wrote: > On Wed, Sep 30, 2009 at 11:01 AM, Ralf Gommers > wrote: > >> That's really helpful to know. Is it too early to document this on > >> the wiki ? 
> > > > Not at all. If it works in trunk it can be documented here: > > http://docs.scipy.org/numpy/docs/numpy-docs/user/basics.rec.rst/ > > > > Is the only way to edit these that use the automodule directive to > submit a patch to, say, > ? > No, all the docs should be in the wiki. The page corresponding to the link you gave is http://docs.scipy.org/numpy/docs/numpy.doc.structured_arrays/ And if you use "diff to svn" on that page you can see that it already contains improvements over svn, so I'd say editing on the wiki is preferable. Cheers, Ralf > > Skipper > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jsseabold at gmail.com Wed Sep 30 12:04:35 2009 From: jsseabold at gmail.com (Skipper Seabold) Date: Wed, 30 Sep 2009 12:04:35 -0400 Subject: [Numpy-discussion] Another dumb structured array question In-Reply-To: References: Message-ID: On Wed, Sep 30, 2009 at 12:00 PM, Ralf Gommers wrote: > No, all the docs should be in the wiki. The page corresponding to the link > you gave is http://docs.scipy.org/numpy/docs/numpy.doc.structured_arrays/ > > And if you use "diff to svn" on that page you can see that it already > contains improvements over svn, so I'd say editing on the wiki is > preferable. Ah, ok. Thanks, Skipper From bsouthey at gmail.com Wed Sep 30 12:56:52 2009 From: bsouthey at gmail.com (Bruce Southey) Date: Wed, 30 Sep 2009 11:56:52 -0500 Subject: [Numpy-discussion] Question about improving genfromtxt errors In-Reply-To: References: <4ABD1F3B.9040507@noaa.gov> <4AC0E740.60309@noaa.gov> <4AC237BA.9050104@noaa.gov> <4AC24A7F.4020904@gmail.com> <00CD2A47-A721-4A46-8D81-203BE795A6E4@gmail.com> <4AC26FEB.60401@gmail.com> Message-ID: <4AC38DD4.9000609@gmail.com> On 09/30/2009 10:22 AM, Skipper Seabold wrote: > On Tue, Sep 29, 2009 at 4:36 PM, Bruce Southey wrote: > > >> Hi, >> The first case just has to handle a missing delimiter - actually I expect >> that most of my cases would relate this. So here is simple Python code to >> generate arbitrary large list with the occasional missing delimiter. >> >> I set it so it reads the desired number of rows and frequency of bad rows >> from the linux command line. >> $time python tbig.py 1000000 100000 >> >> If I comment out the extra prints in io.py that I put in, it takes about 22 >> seconds to finish if the delimiters are correct. If I have the missing >> delimiter it takes 20.5 seconds to crash. >> >> >> Bruce >> >> > I think this would actually cover most of the problems I was running > into. The only other one I can think of is when I used a converter > that I thought would work, but it got unexpected data. For example, > > from StringIO import StringIO > import numpy as np > > strip_rand = lambda x : float(('r' in x.lower() and x.split()[-1]) or > (not 'r' in x.lower() and x.strip() or 0.0)) > > # Example usage > strip_rand('R 40') > strip_rand(' ') > strip_rand('') > strip_rand('40') > > strip_per = lambda x : float(('%' in x.lower() and x.split()[0]) or > (not '%' in x.lower() and x.strip() or 0.0)) > > # Example usage > strip_per('7 %') > strip_per('7') > strip_per(' ') > strip_per('') > > # Unexpected usage > strip_per('R 1') > Does this work for you? 
I get an: ValueError: invalid literal for float(): R 1 > s = StringIO('D01N01,10/1/2003 ,1 %,R 75,400,600\r\nL24U05,12/5/2003\ > ,2 %,1,300, 150.5\r\nD02N03,10/10/2004 ,R 1,,7,145.55') > Can you provide the correct line before the bad line? It just makes it easy to understand why a line is bad. > data = np.genfromtxt(s, converters = {2 : strip_per, 3 : strip_rand}, > delimiter=",", dtype=None) > > I don't have a clean install right now, but I think this returned a > converter is locked for upgrading error. I would just like to know > where the problem occured (line and column, preferably not > zero-indexed), so I can go and have a look at my data. > I rather limited understanding here. I think the problem is that Python is raising a ValueError because your strip_per() is wrong. It is not informative to you because _iotools.py is not aware that an invalid converter will raise a ValueError. Therefore there needs to be some way to test that the converter is correct or not. This this case I think it is the delimiter so checking the column numbers should occur before the application of the converter to that row. Bruce From jsseabold at gmail.com Wed Sep 30 13:44:21 2009 From: jsseabold at gmail.com (Skipper Seabold) Date: Wed, 30 Sep 2009 13:44:21 -0400 Subject: [Numpy-discussion] Question about improving genfromtxt errors In-Reply-To: <4AC38DD4.9000609@gmail.com> References: <4AC0E740.60309@noaa.gov> <4AC237BA.9050104@noaa.gov> <4AC24A7F.4020904@gmail.com> <00CD2A47-A721-4A46-8D81-203BE795A6E4@gmail.com> <4AC26FEB.60401@gmail.com> <4AC38DD4.9000609@gmail.com> Message-ID: On Wed, Sep 30, 2009 at 12:56 PM, Bruce Southey wrote: > On 09/30/2009 10:22 AM, Skipper Seabold wrote: >> On Tue, Sep 29, 2009 at 4:36 PM, Bruce Southey ?wrote: >> >> >>> Hi, >>> The first case just has to handle a missing delimiter - actually I expect >>> that most of my cases would relate this. So here is simple Python code to >>> generate arbitrary large list with the occasional missing delimiter. >>> >>> I set it so it reads the desired number of rows and frequency of bad rows >>> from the linux command line. >>> $time python tbig.py 1000000 100000 >>> >>> If I comment out the extra prints in io.py that I put in, it takes about 22 >>> seconds to finish if the delimiters are correct. If I have the missing >>> delimiter it takes 20.5 seconds to crash. >>> >>> >>> Bruce >>> >>> >> I think this would actually cover most of the problems I was running >> into. ?The only other one I can think of is when I used a converter >> that I thought would work, but it got unexpected data. ?For example, >> >> from StringIO import StringIO >> import numpy as np >> >> strip_rand = lambda x : float(('r' in x.lower() and x.split()[-1]) or >> (not 'r' in x.lower() and x.strip() or 0.0)) >> >> # Example usage >> strip_rand('R 40') >> strip_rand(' ?') >> strip_rand('') >> strip_rand('40') >> >> strip_per = lambda x : float(('%' in x.lower() and x.split()[0]) or >> (not '%' in x.lower() and x.strip() or 0.0)) >> >> # Example usage >> strip_per('7 %') >> strip_per('7') >> strip_per(' ') >> strip_per('') >> >> # Unexpected usage >> strip_per('R 1') >> > Does this work for you? > I get an: > ValueError: invalid literal for float(): R 1 > No, that's the idea. Sorry this was a bit opaque. > >> s = StringIO('D01N01,10/1/2003 ,1 %,R 75,400,600\r\nL24U05,12/5/2003\ >> ,2 %,1,300, 150.5\r\nD02N03,10/10/2004 ,R 1,,7,145.55') >> > Can you provide the correct line before the bad line? > It just makes it easy to understand why a line is bad. 
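As a stopgap, the quickest thing I've found is a pre-flight pass that applies the converters by hand and reports the line and column before genfromtxt ever sees the file -- a rough sketch, reusing the lambdas from my earlier mail:

from StringIO import StringIO

strip_rand = lambda x : float(('r' in x.lower() and x.split()[-1]) or
                              (not 'r' in x.lower() and x.strip() or 0.0))
strip_per = lambda x : float(('%' in x.lower() and x.split()[0]) or
                             (not '%' in x.lower() and x.strip() or 0.0))
converters = {2 : strip_per, 3 : strip_rand}

s = StringIO('D01N01,10/1/2003 ,1 %,R 75,400,600\r\n'
             'L24U05,12/5/2003 ,2 %,1,300, 150.5\r\n'
             'D02N03,10/10/2004 ,R 1,,7,145.55')

for i, line in enumerate(s):
    fields = line.strip().split(',')
    for col, conv in converters.items():
        try:
            conv(fields[col])
        except ValueError:
            print 'line %d, column %d: bad value %r' % (i + 1, col + 1, fields[col])

That at least tells me it is line 3, column 3 choking on 'R 1' before I go digging.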
> The idea is that I have a column, which I expect to be percentages, but these are coded in by different data collectors, so some code a 0 for 0, some just leave it missing which could just as well be 0, some use the %. What I didn't expect was that some put in a money amount, hence the 'R 7', which my converter doesn't catch. >> data = np.genfromtxt(s, converters = {2 : strip_per, 3 : strip_rand}, >> delimiter=",", dtype=None) >> >> I don't have a clean install right now, but I think this returned a >> converter is locked for upgrading error. ?I would just like to know >> where the problem occured (line and column, preferably not >> zero-indexed), so I can go and have a look at my data. >> > I rather limited understanding here. I think the problem is that Python > is raising a ValueError because your strip_per() is wrong. It is not > informative to you because _iotools.py is not aware that an invalid > converter will raise a ValueError. Therefore there needs to be some way > to test that the converter is correct or not. > _iotools does catch this I believe, though I don't understand the upgrading and locking properly. The kludgy fix that I provided in the first post "I do not report the error from _iotools.StringConverter...", catches that an error is raised from _iotools and tells me exactly where the converter fails, so I can go to, say line 750,000 column 250 (and converter with key 249) instead of not knowing anything except that one of my ~500 converters failed somewhere in a 1 million line data file. If you still want to keep the error messages from _iotools.StringConverter, then they maybe they could have a (%s, %s) added and then this can be filled in in genfromtxt when you know (line, column) or something similar as was kind of suggested in a post in this thread I believe. Then again, this might not be possible. I haven't tried. > This this case I think it is the delimiter so checking the column > numbers should occur before the application of the converter to that row. > Sometimes it was the case where I had an extra comma in a number 1,000 say and then the converter tried to work on the wrong column, and sometimes it was because my converter didn't cover every use case, because I didn't know it yet. Either way, I just needed a gentle nudge in the right direction. If that doesn't clear up what I was after, I can try to provide a more detailed code sample. Skipper From gokhansever at gmail.com Wed Sep 30 14:27:20 2009 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_Sever?=) Date: Wed, 30 Sep 2009 13:27:20 -0500 Subject: [Numpy-discussion] Compound conditional indexing Message-ID: <49d6b3500909301127r5696ad36xce9cebcfb5d00867@mail.gmail.com> Hello, How to conditionally index an array as shown below : a = arange(10) a[5 From jkington at wisc.edu Wed Sep 30 14:32:18 2009 From: jkington at wisc.edu (Joe Kington) Date: Wed, 30 Sep 2009 13:32:18 -0500 Subject: [Numpy-discussion] Compound conditional indexing In-Reply-To: <49d6b3500909301127r5696ad36xce9cebcfb5d00867@mail.gmail.com> References: <49d6b3500909301127r5696ad36xce9cebcfb5d00867@mail.gmail.com> Message-ID: There may be a more elegant way, but: In [2]: a = np.arange(10) In [3]: a[(a>5) & (a<8)] Out[3]: array([6, 7]) On Wed, Sep 30, 2009 at 1:27 PM, G?khan Sever wrote: > Hello, > > How to conditionally index an array as shown below : > > a = arange(10) > a[5 > to get > array([6,7]) > > I can't do this with where either. > > What is the cure for this? > > Thanks. 
> > -- > G?khan > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Chris.Barker at noaa.gov Wed Sep 30 14:41:50 2009 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Wed, 30 Sep 2009 11:41:50 -0700 Subject: [Numpy-discussion] Compound conditional indexing In-Reply-To: <49d6b3500909301127r5696ad36xce9cebcfb5d00867@mail.gmail.com> References: <49d6b3500909301127r5696ad36xce9cebcfb5d00867@mail.gmail.com> Message-ID: <4AC3A66E.6070209@noaa.gov> G?khan Sever wrote: > How to conditionally index an array as shown below : > > a = arange(10) > a[5 > to get > array([6,7]) In [56]: a[(5 References: <49d6b3500909301127r5696ad36xce9cebcfb5d00867@mail.gmail.com> Message-ID: <49d6b3500909301240l7cc8cdfctce8b2101c93db0e4@mail.gmail.com> Thanks this works. My second question how to access a second array using this condition? I am trying slice another array using a compound condition on the reference array. say: a = 1,2,3,4,5, b = 20,30,40,50,60 I want to get elements of a only when a = 3,4. I know I need indices but how ? On Wed, Sep 30, 2009 at 1:32 PM, Joe Kington wrote: > There may be a more elegant way, but: > > In [2]: a = np.arange(10) > > In [3]: a[(a>5) & (a<8)] > Out[3]: array([6, 7]) > > > On Wed, Sep 30, 2009 at 1:27 PM, G?khan Sever wrote: > >> Hello, >> >> How to conditionally index an array as shown below : >> >> a = arange(10) >> a[5> >> to get >> array([6,7]) >> >> I can't do this with where either. >> >> What is the cure for this? >> >> Thanks. >> >> -- >> G?khan >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- G?khan -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Wed Sep 30 15:45:53 2009 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 30 Sep 2009 14:45:53 -0500 Subject: [Numpy-discussion] Compound conditional indexing In-Reply-To: <49d6b3500909301240l7cc8cdfctce8b2101c93db0e4@mail.gmail.com> References: <49d6b3500909301127r5696ad36xce9cebcfb5d00867@mail.gmail.com> <49d6b3500909301240l7cc8cdfctce8b2101c93db0e4@mail.gmail.com> Message-ID: <3d375d730909301245jac8ea68v14e151998a38384a@mail.gmail.com> On Wed, Sep 30, 2009 at 14:40, G?khan Sever wrote: > Thanks this works. > > My second question how to access a second array using this condition? > > I am trying slice another array using a compound condition on the reference > array. > > say: > > a = 1,2,3,4,5, > b = 20,30,40,50,60 > > I want to get elements of a only when a = 3,4. I know I need indices but how > ? Did you mean "elements of b only where a = 3,4"? b[(a==3) | (a==4)] -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From gokhansever at gmail.com Wed Sep 30 15:51:05 2009 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_Sever?=) Date: Wed, 30 Sep 2009 14:51:05 -0500 Subject: [Numpy-discussion] Compound conditional indexing In-Reply-To: <3d375d730909301245jac8ea68v14e151998a38384a@mail.gmail.com> References: <49d6b3500909301127r5696ad36xce9cebcfb5d00867@mail.gmail.com> <49d6b3500909301240l7cc8cdfctce8b2101c93db0e4@mail.gmail.com> <3d375d730909301245jac8ea68v14e151998a38384a@mail.gmail.com> Message-ID: <49d6b3500909301251p10fead1dl5b6b12e158b5dc4a@mail.gmail.com> On Wed, Sep 30, 2009 at 2:45 PM, Robert Kern wrote: > On Wed, Sep 30, 2009 at 14:40, G?khan Sever wrote: > > Thanks this works. > > > > My second question how to access a second array using this condition? > > > > I am trying slice another array using a compound condition on the > reference > > array. > > > > say: > > > > a = 1,2,3,4,5, > > b = 20,30,40,50,60 > > > > I want to get elements of a only when a = 3,4. I know I need indices but > how > > ? > > Did you mean "elements of b only where a = 3,4"? > > b[(a==3) | (a==4)] > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > -- Umberto Eco > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > Ok, ok got it. Thanks Robert :) Following is what I have been looking for exactly. I[1]: a = arange(5) I[2]: b = arange(5)*10 I[3]: a O[3]: array([0, 1, 2, 3, 4]) I[4]: b O[4]: array([ 0, 10, 20, 30, 40]) I[5]: b[(a>1) & (a<3)] O[5]: array([20]) I was forgetting the parenthesis, and consequently I was being bitten by ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all() I am glad that it is not my understanding of logic but the usage of NumPy. -- G?khan -------------- next part -------------- An HTML attachment was scrubbed... URL: From d.l.goldsmith at gmail.com Wed Sep 30 16:37:59 2009 From: d.l.goldsmith at gmail.com (David Goldsmith) Date: Wed, 30 Sep 2009 13:37:59 -0700 Subject: [Numpy-discussion] [SciPy-dev] Deprecate chararray [was Plea for help] In-Reply-To: <4AC36924.7020605@stsci.edu> References: <857977.74958.qm@web52106.mail.re2.yahoo.com> <45d1ab480909221951h6006deb9u4675e419c9f0b256@mail.gmail.com> <4ABCC0A1.30402@stsci.edu> <45d1ab480909250923t2dd716a2s502967fb231dd889@mail.gmail.com> <4AC24A0C.4090005@stsci.edu> <45d1ab480909291248j107047fld5da82eb607d5614@mail.gmail.com> <4AC3571F.6020901@stsci.edu> <4AC36924.7020605@stsci.edu> Message-ID: <45d1ab480909301337s3fef8564uce77ebbbba8978da@mail.gmail.com> So, Ralf (or anyone), how, if at all, should we modify the status of the existing chararray objects/methods in the wiki? Assuming you have no problem sharing them with me, Michael, I could add those docstrings you created for the existing methods, and we can promote them to "Ready for Review"; they can be "demoted" to "Unimportant" (by someone with the appropriate privileges); or I can promote them to "Being written" and add a comment that the docstring has been written, it just resides elsewhere for the time being (however, IMO they should not be left in "Needs editing" status). 
DG On Wed, Sep 30, 2009 at 7:20 AM, Michael Droettboom wrote: > Ralf Gommers wrote: > > > > > > On Wed, Sep 30, 2009 at 9:03 AM, Michael Droettboom > > wrote: > > > > In the source in my working copy. Is that going to cause problems? > I > > wasn't sure if it was possible to document methods that didn't yet > > exist > > in the code in the wiki. > > > > That is fine. New functions will automatically show up in the wiki. It > > would be helpful though if you could mark them ready for review in the > > wiki (if they are) after they show up. Could take up to 24 hours for > > svn changes to propagate. > Thanks. Will do. > > > > Only if you moved functions around it would be useful if you pinged > > Pauli after you committed them. This is a temporary problem, right now > > the wiki creates a new page for a moved object, and the old content > > (if any) has to be copied over to the new page. > All of the functions that were moved were previously without docstrings > in SVN, though some had docstrings (that I just now discovered) in the > wiki. This may cause some hiccups, I suppose, so I'll be sure to > announce when these things get committed to SVN so I know how to help > straighten these things out. > > Mike > > > > Cheers, > > Ralf > > > > > > Mike > > > > David Goldsmith wrote: > > > On Tue, Sep 29, 2009 at 10:55 AM, Michael Droettboom > > > > > >> wrote: > > > > > > 2) Improve documentation > > > > > > Every method now has a docstring, and a new page of routines > > has been > > > added to the Sphinx tree. > > > > > > > > > Um, where did you do this, 'cause it's not showing up in the doc > > wiki. > > > > > > DG > > > > > > ------------------------------------------------------------------------ > > > > > > _______________________________________________ > > > NumPy-Discussion mailing list > > > NumPy-Discussion at scipy.org > > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > -- > > Michael Droettboom > > Science Software Branch > > Operations and Engineering Division > > Space Telescope Science Institute > > Operated by AURA for NASA > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > ------------------------------------------------------------------------ > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > -- > Michael Droettboom > Science Software Branch > Operations and Engineering Division > Space Telescope Science Institute > Operated by AURA for NASA > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dwf at cs.toronto.edu Wed Sep 30 16:51:59 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Wed, 30 Sep 2009 16:51:59 -0400 Subject: [Numpy-discussion] scipy.reddit.com Message-ID: <2F5DD634-70CB-4A5B-ADD8-F3FFCDF41A1B@cs.toronto.edu> In the spirit of the 'advice' site, and given that we're thinking of moving scipy.org to more static content (once I have some free time on my hands again, which should be soon!), I set up a 'subreddit' on reddit.com for Python-in-Science related links. I even came up with a somewhat spiffy logo for it. 
Think of it as a communal, collaboratively filtered (via up/down votes, using the arrows next to each submission) bookmarks folder/news site/etc. I'd encourage people to use it and add to it if they feel it might be of use to the community. The address is http://scipy.reddit.com/ , or equivalently http://www.reddit.com/r/scipy David From ralf.gommers at googlemail.com Wed Sep 30 17:09:20 2009 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Wed, 30 Sep 2009 17:09:20 -0400 Subject: [Numpy-discussion] [SciPy-dev] Deprecate chararray [was Plea for help] In-Reply-To: <45d1ab480909301337s3fef8564uce77ebbbba8978da@mail.gmail.com> References: <857977.74958.qm@web52106.mail.re2.yahoo.com> <45d1ab480909221951h6006deb9u4675e419c9f0b256@mail.gmail.com> <4ABCC0A1.30402@stsci.edu> <45d1ab480909250923t2dd716a2s502967fb231dd889@mail.gmail.com> <4AC24A0C.4090005@stsci.edu> <45d1ab480909291248j107047fld5da82eb607d5614@mail.gmail.com> <4AC3571F.6020901@stsci.edu> <4AC36924.7020605@stsci.edu> <45d1ab480909301337s3fef8564uce77ebbbba8978da@mail.gmail.com> Message-ID: On Wed, Sep 30, 2009 at 4:37 PM, David Goldsmith wrote: > So, Ralf (or anyone), how, if at all, should we modify the status of the > existing chararray objects/methods in the wiki? Nothing has to be done until *after* Mike has committed his changes to svn. Please see my previous email for what has to happen at that point. Since Mike wrote the new docstrings it would be best if he updated the status of the wiki pages then. > Assuming you have no problem sharing them with me, Michael, I could add > those docstrings you created for the existing methods, > They will show up in the wiki when they get committed to svn (presumably within a few days), so this is needless effort for the most part. If there are different changes in the wiki and svn, that will show up in the "merge" page. The ony thing that requires manual effort is if there are changes in the wiki and the object got moved in svn. Cheers, Ralf > DG > > > On Wed, Sep 30, 2009 at 7:20 AM, Michael Droettboom wrote: > >> Ralf Gommers wrote: >> > >> > >> > On Wed, Sep 30, 2009 at 9:03 AM, Michael Droettboom > > > wrote: >> > >> > In the source in my working copy. Is that going to cause problems? >> I >> > wasn't sure if it was possible to document methods that didn't yet >> > exist >> > in the code in the wiki. >> > >> > That is fine. New functions will automatically show up in the wiki. It >> > would be helpful though if you could mark them ready for review in the >> > wiki (if they are) after they show up. Could take up to 24 hours for >> > svn changes to propagate. >> Thanks. Will do. >> > >> > Only if you moved functions around it would be useful if you pinged >> > Pauli after you committed them. This is a temporary problem, right now >> > the wiki creates a new page for a moved object, and the old content >> > (if any) has to be copied over to the new page. >> All of the functions that were moved were previously without docstrings >> in SVN, though some had docstrings (that I just now discovered) in the >> wiki. This may cause some hiccups, I suppose, so I'll be sure to >> announce when these things get committed to SVN so I know how to help >> straighten these things out. 
>> >> Mike >> > >> > Cheers, >> > Ralf >> > >> > >> > Mike >> > >> > David Goldsmith wrote: >> > > On Tue, Sep 29, 2009 at 10:55 AM, Michael Droettboom >> > >> > > >> wrote: >> > > >> > > 2) Improve documentation >> > > >> > > Every method now has a docstring, and a new page of routines >> > has been >> > > added to the Sphinx tree. >> > > >> > > >> > > Um, where did you do this, 'cause it's not showing up in the doc >> > wiki. >> > > >> > > DG >> > > >> > >> ------------------------------------------------------------------------ >> > > >> > > _______________________________________________ >> > > NumPy-Discussion mailing list >> > > NumPy-Discussion at scipy.org >> > > http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > >> > >> > -- >> > Michael Droettboom >> > Science Software Branch >> > Operations and Engineering Division >> > Space Telescope Science Institute >> > Operated by AURA for NASA >> > >> > _______________________________________________ >> > NumPy-Discussion mailing list >> > NumPy-Discussion at scipy.org >> > http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > >> > >> > ------------------------------------------------------------------------ >> > >> > _______________________________________________ >> > NumPy-Discussion mailing list >> > NumPy-Discussion at scipy.org >> > http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > >> >> -- >> Michael Droettboom >> Science Software Branch >> Operations and Engineering Division >> Space Telescope Science Institute >> Operated by AURA for NASA >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From d.l.goldsmith at gmail.com Wed Sep 30 17:33:00 2009 From: d.l.goldsmith at gmail.com (David Goldsmith) Date: Wed, 30 Sep 2009 14:33:00 -0700 Subject: [Numpy-discussion] [SciPy-dev] Deprecate chararray [was Plea for help] In-Reply-To: References: <857977.74958.qm@web52106.mail.re2.yahoo.com> <4ABCC0A1.30402@stsci.edu> <45d1ab480909250923t2dd716a2s502967fb231dd889@mail.gmail.com> <4AC24A0C.4090005@stsci.edu> <45d1ab480909291248j107047fld5da82eb607d5614@mail.gmail.com> <4AC3571F.6020901@stsci.edu> <4AC36924.7020605@stsci.edu> <45d1ab480909301337s3fef8564uce77ebbbba8978da@mail.gmail.com> Message-ID: <45d1ab480909301433q5ff65872o1f2edc12047d0425@mail.gmail.com> On Wed, Sep 30, 2009 at 2:09 PM, Ralf Gommers wrote: > > On Wed, Sep 30, 2009 at 4:37 PM, David Goldsmith wrote: > >> So, Ralf (or anyone), how, if at all, should we modify the status of the >> existing chararray objects/methods in the wiki? > > > Nothing has to be done until *after* Mike has committed his changes to svn. > Please see my previous email for what has to happen at that point. Since > Mike wrote the new docstrings it would be best if he updated the status of > the wiki pages then. 
> OK; Mike: hopefully it will be clear what you have to do to update the status (it's pretty trivial) but of course don't hesitate to email (you can do so off-list if you prefer) w/ any questions; unfortunately, AFAIK, there's no way to update the status of many docstrings all at once - you'll have to do them each individually (if you like, let me know when you've committed them and I can help - it sounds like there will be a lot); the main "silly" thing to remember is that the option to change the "Review status" only appears if you're logged in. :-) > Assuming you have no problem sharing them with me, Michael, I could add >> those docstrings you created for the existing methods, >> > > They will show up in the wiki when they get committed to svn (presumably > within a few days), so this is needless effort for the most part. If there > are different changes in the wiki and svn, that will show up in the "merge" > page. > > The ony thing that requires manual effort is if there are changes in the > wiki and the object got moved in svn. > And, as above, updating the status in the Wiki. :-) DG > Cheers, > Ralf > > > >> DG >> >> >> On Wed, Sep 30, 2009 at 7:20 AM, Michael Droettboom wrote: >> >>> Ralf Gommers wrote: >>> > >>> > >>> > On Wed, Sep 30, 2009 at 9:03 AM, Michael Droettboom >> > > wrote: >>> > >>> > In the source in my working copy. Is that going to cause problems? >>> I >>> > wasn't sure if it was possible to document methods that didn't yet >>> > exist >>> > in the code in the wiki. >>> > >>> > That is fine. New functions will automatically show up in the wiki. It >>> > would be helpful though if you could mark them ready for review in the >>> > wiki (if they are) after they show up. Could take up to 24 hours for >>> > svn changes to propagate. >>> Thanks. Will do. >>> > >>> > Only if you moved functions around it would be useful if you pinged >>> > Pauli after you committed them. This is a temporary problem, right now >>> > the wiki creates a new page for a moved object, and the old content >>> > (if any) has to be copied over to the new page. >>> All of the functions that were moved were previously without docstrings >>> in SVN, though some had docstrings (that I just now discovered) in the >>> wiki. This may cause some hiccups, I suppose, so I'll be sure to >>> announce when these things get committed to SVN so I know how to help >>> straighten these things out. >>> >>> Mike >>> > >>> > Cheers, >>> > Ralf >>> > >>> > >>> > Mike >>> > >>> > David Goldsmith wrote: >>> > > On Tue, Sep 29, 2009 at 10:55 AM, Michael Droettboom >>> > >>> > > >> wrote: >>> > > >>> > > 2) Improve documentation >>> > > >>> > > Every method now has a docstring, and a new page of routines >>> > has been >>> > > added to the Sphinx tree. >>> > > >>> > > >>> > > Um, where did you do this, 'cause it's not showing up in the doc >>> > wiki. 
>>> > > >>> > > DG >>> > > >>> > >>> ------------------------------------------------------------------------ >>> > > >>> > > _______________________________________________ >>> > > NumPy-Discussion mailing list >>> > > NumPy-Discussion at scipy.org >>> > > http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> > > >>> > >>> > -- >>> > Michael Droettboom >>> > Science Software Branch >>> > Operations and Engineering Division >>> > Space Telescope Science Institute >>> > Operated by AURA for NASA >>> > >>> > _______________________________________________ >>> > NumPy-Discussion mailing list >>> > NumPy-Discussion at scipy.org >>> > http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> > >>> > >>> > >>> ------------------------------------------------------------------------ >>> > >>> > _______________________________________________ >>> > NumPy-Discussion mailing list >>> > NumPy-Discussion at scipy.org >>> > http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> > >>> >>> -- >>> Michael Droettboom >>> Science Software Branch >>> Operations and Engineering Division >>> Space Telescope Science Institute >>> Operated by AURA for NASA >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dwf at cs.toronto.edu Wed Sep 30 17:46:18 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Wed, 30 Sep 2009 17:46:18 -0400 Subject: [Numpy-discussion] Another dumb structured array question In-Reply-To: References: Message-ID: <23501DA5-BC7B-4D8E-AA7E-B3CF927DD197@cs.toronto.edu> On 30-Sep-09, at 9:51 AM, Travis Oliphant wrote: > Try (with a later version of NumPy --- possibly trunk): > > arr[['foo', 'bar']] > > (i.e. with a list instead of a tuple) Aha! Thanks Travis. I guess tuples for multi-dimensional indexing _and_ multi-field indexing would be semantically awkward, come to think of it. David From jah.mailinglist at gmail.com Wed Sep 30 20:56:40 2009 From: jah.mailinglist at gmail.com (jah) Date: Wed, 30 Sep 2009 17:56:40 -0700 Subject: [Numpy-discussion] Convert data into rectangular grid In-Reply-To: References: Message-ID: On Wed, Sep 30, 2009 at 8:57 AM, denis bzowy wrote: > jah gmail.com> writes: > > > > > Hi,Suppose I have a set of x,y,c data ... matplotlib.pyplot.contour() ). > > > > JAH, is griddata() working and fast enough for you ? > How many points are you contouring ? > > > Thanks all. Robert, griddata is exactly what I was looking for. David, I think that should work too. And Denis, griddata is sufficiently fast that I am not complaining---contouring about 1e6 or 1e7 points typically. -------------- next part -------------- An HTML attachment was scrubbed... 
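The thread does not say which griddata is meant (both matplotlib.mlab and
later SciPy releases ship one), so the sketch below uses
scipy.interpolate.griddata on synthetic data purely to illustrate the
grid-then-contour step being discussed:

import numpy as np
import matplotlib.pyplot as plt
from scipy.interpolate import griddata

# Scattered x, y samples with a value c at each point (synthetic data here).
np.random.seed(0)
x = np.random.uniform(-2.0, 2.0, 100000)
y = np.random.uniform(-2.0, 2.0, 100000)
c = np.exp(-(x**2 + y**2))

# Interpolate the scattered values onto a regular grid, then contour the grid.
xi = np.linspace(x.min(), x.max(), 200)
yi = np.linspace(y.min(), y.max(), 200)
X, Y = np.meshgrid(xi, yi)
C = griddata((x, y), c, (X, Y), method='linear')

plt.contour(X, Y, C, 15)
plt.show()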
From charlesr.harris at gmail.com  Wed Sep 30 22:45:11 2009
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Wed, 30 Sep 2009 20:45:11 -0600
Subject: [Numpy-discussion] repr and object arrays
Message-ID:

Hi All,

It seems that repr applied to an object array does not provide the info
needed to recreate it:

In [22]: y = array([Decimal(1)]*2)

In [23]: repr(y)
Out[23]: 'array([1, 1], dtype=object)'

And of course, there is going to be a problem with arrays of more than one
dimension anyway. But I wonder if this should be fixed?

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From robert.kern at gmail.com  Wed Sep 30 22:52:47 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Wed, 30 Sep 2009 21:52:47 -0500
Subject: [Numpy-discussion] repr and object arrays
In-Reply-To:
References:
Message-ID: <3d375d730909301952u35fd0475pab12f21ca4baf720@mail.gmail.com>

On Wed, Sep 30, 2009 at 21:45, Charles R Harris wrote:
> Hi All,
>
> It seems that repr applied to an object array does not provide the info
> needed to recreate it:
>
> In [22]: y = array([Decimal(1)]*2)
>
> In [23]: repr(y)
> Out[23]: 'array([1, 1], dtype=object)'
>
> And of course, there is going to be a problem with arrays of more than one
> dimension anyway. But I wonder if this should be fixed?

Using repr() instead of str() for the items would probably be wise.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
-- Umberto Eco
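For comparison, this is roughly what per-item repr() buys from the user's
side. The string building below is only an illustration, not how numpy
formats object arrays; newer NumPy releases do format object elements with
repr(), so recent versions already print the Decimal form.

from decimal import Decimal
import numpy as np

y = np.array([Decimal(1)] * 2)

print(repr(y))   # array([1, 1], dtype=object) on the NumPy of this thread;
                 # newer releases already show the elements via repr()

# Hand-built comparison using repr() of each item, as suggested above:
items = ', '.join(repr(v) for v in y)
print('array([%s], dtype=object)' % items)
# array([Decimal('1'), Decimal('1')], dtype=object)

With that form, eval()-ing the repr (with array and Decimal in scope) would
rebuild the 1-D array, which is the round trip being asked about here.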