From wnbell at gmail.com Mon Jun 1 00:26:59 2009 From: wnbell at gmail.com (Nathan Bell) Date: Mon, 1 Jun 2009 00:26:59 -0400 Subject: [Numpy-discussion] Rasterizing points onto an array In-Reply-To: <96F50D68-C7F4-4F27-92B4-4CFE7ED01CBA@gmail.com> References: <96F50D68-C7F4-4F27-92B4-4CFE7ED01CBA@gmail.com> Message-ID: On Sun, May 31, 2009 at 8:39 PM, Thomas Robitaille wrote: > Hi, > > I have a set of n points with real coordinates between 0 and 1, given > by two numpy arrays x and y, with a value at each point represented by > a third array z. I am trying to then rasterize the points onto a grid > of size npix*npix. So I can start by converting x and y to integer > pixel coordinates ix and iy. But my question is, is there an efficient > way to add z[i] to the pixel given by (xi[i],yi[i])? Below is what I > am doing at the moment, but the for loop becomes very inefficient for > large n. I would imagine that there is a way to do this without using > a loop? > > --- > > import numpy as np > > n = 10000000 > x = np.random.random(n) > y = np.random.random(n) > z = np.random.random(n) > > npix = 100 > ix = np.array(x*float(npix),int) > iy = np.array(y*float(npix),int) > > image = np.zeros((npix,npix)) > for i in range(len(ix)): > ? ? image[ix[i],iy[i]] = image[ix[i],iy[i]] + z[i] > There might be a better way, but histogram2d() seems like a good fit: import numpy as np n = 1000000 x = np.random.random(n) y = np.random.random(n) z = np.random.random(n) npix = 100 bins = np.linspace(0, 1.0, npix + 1) image = np.histogram2d(x, y, bins=bins, weights=z)[0] -- Nathan Bell wnbell at gmail.com http://www.wnbell.com/ From charlesr.harris at gmail.com Mon Jun 1 00:48:36 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 31 May 2009 22:48:36 -0600 Subject: [Numpy-discussion] Problem with correlate In-Reply-To: <4A234643.6040606@ar.media.kyoto-u.ac.jp> References: <871697.92371.qm@web86003.mail.ird.yahoo.com> <4A232C67.2080206@ar.media.kyoto-u.ac.jp> <4A234643.6040606@ar.media.kyoto-u.ac.jp> Message-ID: On Sun, May 31, 2009 at 9:08 PM, David Cournapeau < david at ar.media.kyoto-u.ac.jp> wrote: > Charles R Harris wrote: > > > > > > On Sun, May 31, 2009 at 7:18 PM, David Cournapeau > > > > > wrote: > > > > Charles R Harris wrote: > > > > > > > > > On Sun, May 31, 2009 at 11:54 AM, rob steed > > > > >> wrote: > > > > > > > > > Hi, > > > After my previous email, I have opened a ticket #1117 > (correlate > > > not order dependent) > > > > > > I have found that the correlate function is defined in > > > multiarraymodule.c and > > > that inputs are being swapped using the following code > > > > > > n1 = ap1->dimensions[0]; > > > n2 = ap2->dimensions[0]; > > > if (n1 < n2) { > > > ret = ap1; > > > ap1 = ap2; > > > ap2 = ret; > > > ret = NULL; > > > i = n1; > > > n1 = n2; > > > n2 = i; > > > } > > > > > > I do not know the code well enough to see whether this could > > just > > > be removed (I don't know c either). > > > Maybe the algorithmn requires the inputs to be length ordered? > I > > > will try to work it out. > > > > > > > > > If the correlation algorithm doesn't use an fft and is done > > > explicitly, then the maximum overlap for any shift is the length of > > > the shortest input. Swapping the arrays makes that logic easier to > > > implement, but it isn't necessary. > > > > But this logic is also wrong if the swapping is not taken into > > account - > > as the OP mentioned, correlate(a, b) is not equal to correlate(b, > > a) in > > the general case. 
The output is reversed in the second case > > compared to > > the first case. > > > > > > I didn't say it was *correctly* implemented ;) > > :) So I gave it a shot > > http://github.com/cournape/numpy/commits/fix_correlate > > (It took me a while to realize that PyArray_ISFLEXIBLE returns false for > array object. Is this expected ? The documentation concerning copyswap > says that it is necessary for flexible arrays, but I think it is > necessary for object arrays as well). > Don't know. PyArray_ISFLEXIBLE looks like a macro... #define PyArray_ISFLEXIBLE(obj) PyTypeNum_ISFLEXIBLE(PyArray_TYPE(obj)) #define PyTypeNum_ISFLEXIBLE(type) (((type) >=NPY_STRING) && \ ((type) <=NPY_VOID)) And the typecodes are '?bhilqpBHILQPfdgFDGSUVO'. So 'SUV' are flexible and O is not. I'm not clear on how correlate should apply to any of 'SUV' but it might be worth having it work for objects. > It still bothers me that correlate does not conjugate the second > argument for complex arrays... > It bothers me also... Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From david at ar.media.kyoto-u.ac.jp Mon Jun 1 01:05:23 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Mon, 01 Jun 2009 14:05:23 +0900 Subject: [Numpy-discussion] Problem with correlate In-Reply-To: References: <871697.92371.qm@web86003.mail.ird.yahoo.com> <4A232C67.2080206@ar.media.kyoto-u.ac.jp> <4A234643.6040606@ar.media.kyoto-u.ac.jp> Message-ID: <4A236193.7030004@ar.media.kyoto-u.ac.jp> Charles R Harris wrote: > > > On Sun, May 31, 2009 at 9:08 PM, David Cournapeau > > > wrote: > > Charles R Harris wrote: > > > > > > On Sun, May 31, 2009 at 7:18 PM, David Cournapeau > > > >> > > wrote: > > > > Charles R Harris wrote: > > > > > > > > > On Sun, May 31, 2009 at 11:54 AM, rob steed > > > > > > > > >>> wrote: > > > > > > > > > Hi, > > > After my previous email, I have opened a ticket #1117 > (correlate > > > not order dependent) > > > > > > I have found that the correlate function is defined in > > > multiarraymodule.c and > > > that inputs are being swapped using the following code > > > > > > n1 = ap1->dimensions[0]; > > > n2 = ap2->dimensions[0]; > > > if (n1 < n2) { > > > ret = ap1; > > > ap1 = ap2; > > > ap2 = ret; > > > ret = NULL; > > > i = n1; > > > n1 = n2; > > > n2 = i; > > > } > > > > > > I do not know the code well enough to see whether this > could > > just > > > be removed (I don't know c either). > > > Maybe the algorithmn requires the inputs to be length > ordered? I > > > will try to work it out. > > > > > > > > > If the correlation algorithm doesn't use an fft and is done > > > explicitly, then the maximum overlap for any shift is the > length of > > > the shortest input. Swapping the arrays makes that logic > easier to > > > implement, but it isn't necessary. > > > > But this logic is also wrong if the swapping is not taken into > > account - > > as the OP mentioned, correlate(a, b) is not equal to > correlate(b, > > a) in > > the general case. The output is reversed in the second case > > compared to > > the first case. > > > > > > I didn't say it was *correctly* implemented ;) > > :) So I gave it a shot > > http://github.com/cournape/numpy/commits/fix_correlate > > (It took me a while to realize that PyArray_ISFLEXIBLE returns > false for > array object. Is this expected ? The documentation concerning copyswap > says that it is necessary for flexible arrays, but I think it is > necessary for object arrays as well). > > > Don't know. PyArray_ISFLEXIBLE looks like a macro... 
> > #define PyArray_ISFLEXIBLE(obj) PyTypeNum_ISFLEXIBLE(PyArray_TYPE(obj)) > > #define PyTypeNum_ISFLEXIBLE(type) (((type) >=NPY_STRING) && \ > ((type) <=NPY_VOID)) > > And the typecodes are '?bhilqpBHILQPfdgFDGSUVO'. So 'SUV' are flexible > and O is not. I re-read the copyswap documentation, and realized I did not read it correctly. Now, I am not sure when to use copyswap vs memcpy (memcpy should be much faster on basic types, as memcpy should be inlined generally, whereas I doubt copyswap can). > I'm not clear on how correlate should apply to any of 'SUV' but it > might be worth having it work for objects. It already does (I added a couple of unit tests in the branch, since there were no test for correlate, and one is for Decimal object arrays). > > > It still bothers me that correlate does not conjugate the second > argument for complex arrays... > > > It bothers me also... I think we should just fix it to use conjugate - I will do this in the branch, and I will integrate it in the trunk later unless someone stands up vehemently against the change. I opened up a ticket to track this, though, cheers, David From faltet at pytables.org Mon Jun 1 12:22:03 2009 From: faltet at pytables.org (Francesc Alted) Date: Mon, 1 Jun 2009 18:22:03 +0200 Subject: [Numpy-discussion] Single precision equivalents of missing C99 functions Message-ID: <200906011822.03735.faltet@pytables.org> Hi, In the process of adding single precision support to Numexpr, I'm experimenting a divergence between Numexpr and NumPy computations. It all boils down to the fact that my implementation defined single precision functions completely. As for one, consider my version of expm1f: inline static float expm1f(float x) { float u = expf(x); if (u == 1.0) { return x; } else if (u-1.0 == -1.0) { return -1; } else { return (u-1.0) * x/logf(u); } } while NumPy seems to declare expm1f as: static float expm1f(float x) { return (float) expm1((double)x); } This leads to different results on Windows when computing expm1(x) for large values of x (like 99.), where my approach returns a 'nan', while NumPy returns an 'inf'. Curiously, on Linux both approaches returns 'inf'. I suppose that the NumPy crew already experimented this divergence and finally used the cast approach for computing the single precision functions. However, this is effectively preventing the use of optimized functions for single precision (i.e. double precision 'exp' and 'log' are used instead of single precision specific 'expf' and 'logf'), which could perform potentially better. So, I'm wondering if it would not be better to use a native implementation instead. Thoughts? Thanks, -- Francesc Alted From robert.kern at gmail.com Mon Jun 1 12:35:11 2009 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 1 Jun 2009 11:35:11 -0500 Subject: [Numpy-discussion] Problem with correlate In-Reply-To: <4A236193.7030004@ar.media.kyoto-u.ac.jp> References: <871697.92371.qm@web86003.mail.ird.yahoo.com> <4A232C67.2080206@ar.media.kyoto-u.ac.jp> <4A234643.6040606@ar.media.kyoto-u.ac.jp> <4A236193.7030004@ar.media.kyoto-u.ac.jp> Message-ID: <3d375d730906010935n6c42717au8f224df393b1a7b1@mail.gmail.com> On Mon, Jun 1, 2009 at 00:05, David Cournapeau wrote: > I think we should just fix it to use conjugate - I will do this in the > branch, and I will integrate it in the trunk later unless someone stands > up vehemently against the change. I opened up a ticket to track this, > though, It breaks everyone's code that works around the current behavior. 
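A minimal sketch of the semantics in question, with made-up arrays, assuming the conjugating definition c_av[k] = sum_n a[n+k] * conj(v[n]) that the proposed fix implements (the complaint in ticket #1117 is that the internal length-swap makes both call orders return the same, unreversed result):

import numpy as np

a = np.array([1.0, 2.0, 3.0, 4.0])
v = np.array([1.0, 0.5])

c_av = np.correlate(a, v, mode='full')
c_va = np.correlate(v, a, mode='full')

# With the swap handled correctly, exchanging the inputs reverses the
# output; the buggy swap returns the same array for both call orders.
print(np.allclose(c_av, c_va[::-1]))

# For complex data, a correlate() that does not conjugate its second
# argument can be worked around by conjugating it manually:
z = np.array([1+1j, 2-1j, 0+3j])
w = np.array([1j, 1.0+0j])
c_zw = np.correlate(z, np.conj(w), mode='full')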
-- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From mhearne at usgs.gov Mon Jun 1 12:55:43 2009 From: mhearne at usgs.gov (Michael Hearne) Date: Mon, 1 Jun 2009 10:55:43 -0600 Subject: [Numpy-discussion] numpy.nansum() behavior in 1.3.0 Message-ID: A question (and possibly a bug): What should be returned when I do: numpy.nansum([]) In my copy of numpy 1.1.1, I get 0.0. This is what I would expect to see. However, this behavior seems to have changed in 1.3.0, in which I get nan. Thanks, Mike From kwgoodman at gmail.com Mon Jun 1 13:43:19 2009 From: kwgoodman at gmail.com (Keith Goodman) Date: Mon, 1 Jun 2009 10:43:19 -0700 Subject: [Numpy-discussion] numpy.nansum() behavior in 1.3.0 In-Reply-To: References: Message-ID: On Mon, Jun 1, 2009 at 9:55 AM, Michael Hearne wrote: > A question (and possibly a bug): > > What should be returned when I do: > > numpy.nansum([]) > > In my copy of numpy 1.1.1, I get 0.0. ?This is what I would expect to > see. > However, this behavior seems to have changed in 1.3.0, in which I get > nan. Here's a weird one. This is in numpy 1.2: >> np.sum(9) 9 >> np.sum(9.0) 9.0 >> np.nansum(9) 9 >> np.nansum(9.0) --------------------------------------------------------------------------- IndexError: 0-d arrays can't be indexed. From neilcrighton at gmail.com Mon Jun 1 13:45:57 2009 From: neilcrighton at gmail.com (Neil Crighton) Date: Mon, 1 Jun 2009 17:45:57 +0000 (UTC) Subject: [Numpy-discussion] =?utf-8?q?setmember1d=5Fnu?= References: <49B12DCC.4040307@ntc.zcu.cz> <49C22252.1010506@ntc.zcu.cz> Message-ID: Robert Cimrman ntc.zcu.cz> writes: > Re-hi! > > Robert Cimrman wrote: > > Hi all, > > > > I have added to the ticket [1] a script that compares the proposed > > setmember1d_nu() implementations of Neil and Kim. Comments are welcome! > > > > [1] http://projects.scipy.org/numpy/ticket/1036 > > I have attached a patch incorporating the solution that the involved > people agreed on, so review, please. > > best regards, > r. > Hi all, I'd really like to see the setmember1d_nu function in ticket 1036 get into numpy. There's a patch waiting for review that includes tests for the new function: http://projects.scipy.org/numpy/ticket/1036 Is there anything I can do to help get it applied? Neil From charlesr.harris at gmail.com Mon Jun 1 13:48:35 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 1 Jun 2009 11:48:35 -0600 Subject: [Numpy-discussion] Problem with correlate In-Reply-To: <3d375d730906010935n6c42717au8f224df393b1a7b1@mail.gmail.com> References: <871697.92371.qm@web86003.mail.ird.yahoo.com> <4A232C67.2080206@ar.media.kyoto-u.ac.jp> <4A234643.6040606@ar.media.kyoto-u.ac.jp> <4A236193.7030004@ar.media.kyoto-u.ac.jp> <3d375d730906010935n6c42717au8f224df393b1a7b1@mail.gmail.com> Message-ID: On Mon, Jun 1, 2009 at 10:35 AM, Robert Kern wrote: > On Mon, Jun 1, 2009 at 00:05, David Cournapeau > wrote: > > > I think we should just fix it to use conjugate - I will do this in the > > branch, and I will integrate it in the trunk later unless someone stands > > up vehemently against the change. I opened up a ticket to track this, > > though, > > It breaks everyone's code that works around the current behavior. > Maybe we need a new function. But what to call it? Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From josef.pktd at gmail.com Mon Jun 1 14:16:51 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 1 Jun 2009 14:16:51 -0400 Subject: [Numpy-discussion] numpy.nansum() behavior in 1.3.0 In-Reply-To: References: Message-ID: <1cd32cbb0906011116n4c59949fof00a32d0b6f2510e@mail.gmail.com> On Mon, Jun 1, 2009 at 1:43 PM, Keith Goodman wrote: > On Mon, Jun 1, 2009 at 9:55 AM, Michael Hearne wrote: >> A question (and possibly a bug): >> >> What should be returned when I do: >> >> numpy.nansum([]) >> >> In my copy of numpy 1.1.1, I get 0.0. ?This is what I would expect to >> see. >> However, this behavior seems to have changed in 1.3.0, in which I get >> nan. > > Here's a weird one. This is in numpy 1.2: > >>> np.sum(9) > ? 9 >>> np.sum(9.0) > ? 9.0 >>> np.nansum(9) > ? 9 >>> np.nansum(9.0) > --------------------------------------------------------------------------- > IndexError: 0-d arrays can't be indexed. wrong argument in isnan, I think Josef In file: C:\Programs\Python25\Lib\site-packages\numpy\lib\function_base.py def _nanop(op, fill, a, axis=None): """ ... """ y = array(a,subok=True) - mask = isnan(a) + mask = isnan(y) if mask.all(): return np.nan if not issubclass(y.dtype.type, np.integer): y[mask] = fill return op(y, axis=axis) From charlesr.harris at gmail.com Mon Jun 1 14:26:27 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 1 Jun 2009 12:26:27 -0600 Subject: [Numpy-discussion] Single precision equivalents of missing C99 functions In-Reply-To: <200906011822.03735.faltet@pytables.org> References: <200906011822.03735.faltet@pytables.org> Message-ID: On Mon, Jun 1, 2009 at 10:22 AM, Francesc Alted wrote: > Hi, > > In the process of adding single precision support to Numexpr, I'm > experimenting a divergence between Numexpr and NumPy computations. It all > boils down to the fact that my implementation defined single precision > functions completely. As for one, consider my version of expm1f: > > inline static float expm1f(float x) > { > float u = expf(x); > if (u == 1.0) { > return x; > } else if (u-1.0 == -1.0) { > return -1; > } else { > return (u-1.0) * x/logf(u); > } > } > > while NumPy seems to declare expm1f as: > > static float expm1f(float x) > { > return (float) expm1((double)x); > } > > This leads to different results on Windows when computing expm1(x) for > large > values of x (like 99.), where my approach returns a 'nan', while NumPy > returns > an 'inf'. Curiously, on Linux both approaches returns 'inf'. > > I suppose that the NumPy crew already experimented this divergence and > finally > used the cast approach for computing the single precision functions. It was inherited and was no doubt the simplest approach at the time. It has always bothered me a bit, however, and if you have good single/long double routines we should look at including them. It will affect the build so David needs to weigh in here. However, > this is effectively preventing the use of optimized functions for single > precision (i.e. double precision 'exp' and 'log' are used instead of single > precision specific 'expf' and 'logf'), which could perform potentially > better. That depends on the architecture and how fast single vs double computations are. I don't know how the timings compare on current machines. > > So, I'm wondering if it would not be better to use a native implementation > instead. Thoughts? > Some benchmarks would be interesting. Could this be part of the corepy GSOC project? 
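One quick way to probe the divergence from Python itself, as a sketch: the float32 intermediates below stand in for the expf/logf code path, and exact results depend on the platform libm, as the Linux/Windows difference above shows.

import numpy as np

def expm1f_native(x):
    # Python port of the quoted single-precision algorithm.
    x = np.float32(x)
    u = np.exp(x)                                    # stands in for expf
    if u == np.float32(1.0):
        return x
    elif u - np.float32(1.0) == np.float32(-1.0):
        return np.float32(-1.0)
    else:
        return (u - np.float32(1.0)) * x / np.log(u) # stands in for logf

def expm1f_cast(x):
    # The cast-through-double approach described for NumPy.
    return np.float32(np.expm1(np.float64(x)))

for x in (1e-8, 1.0, 99.0):
    print(x, expm1f_native(x), expm1f_cast(x))
# At x = 99 the native path overflows expf to inf and inf/inf gives nan,
# while the cast path computes a finite double (~9.9e42) that overflows
# to inf only on the cast back to float32.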
Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From kwgoodman at gmail.com Mon Jun 1 14:26:37 2009 From: kwgoodman at gmail.com (Keith Goodman) Date: Mon, 1 Jun 2009 11:26:37 -0700 Subject: [Numpy-discussion] numpy.nansum() behavior in 1.3.0 In-Reply-To: <1cd32cbb0906011116n4c59949fof00a32d0b6f2510e@mail.gmail.com> References: <1cd32cbb0906011116n4c59949fof00a32d0b6f2510e@mail.gmail.com> Message-ID: On Mon, Jun 1, 2009 at 11:16 AM, wrote: > On Mon, Jun 1, 2009 at 1:43 PM, Keith Goodman wrote: >> On Mon, Jun 1, 2009 at 9:55 AM, Michael Hearne wrote: >>> A question (and possibly a bug): >>> >>> What should be returned when I do: >>> >>> numpy.nansum([]) >>> >>> In my copy of numpy 1.1.1, I get 0.0. ?This is what I would expect to >>> see. >>> However, this behavior seems to have changed in 1.3.0, in which I get >>> nan. >> >> Here's a weird one. This is in numpy 1.2: >> >>>> np.sum(9) >> ? 9 >>>> np.sum(9.0) >> ? 9.0 >>>> np.nansum(9) >> ? 9 >>>> np.nansum(9.0) >> --------------------------------------------------------------------------- >> IndexError: 0-d arrays can't be indexed. > > wrong argument in isnan, I think > > Josef > > In file: C:\Programs\Python25\Lib\site-packages\numpy\lib\function_base.py > > def _nanop(op, fill, a, axis=None): > ? ?""" > ... > > ? ?""" > ? ?y = array(a,subok=True) > - ? ?mask = isnan(a) > + ? mask = isnan(y) > ? ?if mask.all(): > ? ? ? ?return np.nan > > ? ?if not issubclass(y.dtype.type, np.integer): > ? ? ? ?y[mask] = fill > > ? ?return op(y, axis=axis) The problem I came across, np.nansum(float), is caused by this line y[mask] = fill when y is 0-d. From josef.pktd at gmail.com Mon Jun 1 14:34:49 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 1 Jun 2009 14:34:49 -0400 Subject: [Numpy-discussion] numpy.nansum() behavior in 1.3.0 In-Reply-To: References: <1cd32cbb0906011116n4c59949fof00a32d0b6f2510e@mail.gmail.com> Message-ID: <1cd32cbb0906011134p4cc13f11ra0fecdc81c9f7c1c@mail.gmail.com> On Mon, Jun 1, 2009 at 2:26 PM, Keith Goodman wrote: > On Mon, Jun 1, 2009 at 11:16 AM, ? wrote: >> On Mon, Jun 1, 2009 at 1:43 PM, Keith Goodman wrote: >>> On Mon, Jun 1, 2009 at 9:55 AM, Michael Hearne wrote: >>>> A question (and possibly a bug): >>>> >>>> What should be returned when I do: >>>> >>>> numpy.nansum([]) >>>> >>>> In my copy of numpy 1.1.1, I get 0.0. ?This is what I would expect to >>>> see. >>>> However, this behavior seems to have changed in 1.3.0, in which I get >>>> nan. >>> >>> Here's a weird one. This is in numpy 1.2: >>> >>>>> np.sum(9) >>> ? 9 >>>>> np.sum(9.0) >>> ? 9.0 >>>>> np.nansum(9) >>> ? 9 >>>>> np.nansum(9.0) >>> --------------------------------------------------------------------------- >>> IndexError: 0-d arrays can't be indexed. >> >> wrong argument in isnan, I think >> >> Josef >> >> In file: C:\Programs\Python25\Lib\site-packages\numpy\lib\function_base.py >> >> def _nanop(op, fill, a, axis=None): >> ? ?""" >> ... >> >> ? ?""" >> ? ?y = array(a,subok=True) >> - ? ?mask = isnan(a) >> + ? mask = isnan(y) >> ? ?if mask.all(): >> ? ? ? ?return np.nan >> >> ? ?if not issubclass(y.dtype.type, np.integer): >> ? ? ? ?y[mask] = fill >> >> ? ?return op(y, axis=axis) > > The problem I came across, np.nansum(float), is caused by this line > > y[mask] = fill > > when y is 0-d. if mask is an array then this works, my initial solution was mask = np.array(mask). then I thought it works also the other way, but it doesn't. 
I shouldn't have changed my mind >>> y array(9.0) >>> y[np.array(np.isnan(9))] = 0 >>> y array(9.0) >>> y[np.isnan(np.array(9))] = 0 Traceback (most recent call last): File "", line 1, in y[np.isnan(np.array(9))] = 0 IndexError: 0-d arrays can't be indexed. Josef From charlesr.harris at gmail.com Mon Jun 1 14:44:16 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 1 Jun 2009 12:44:16 -0600 Subject: [Numpy-discussion] Problem with correlate In-Reply-To: References: <871697.92371.qm@web86003.mail.ird.yahoo.com> <4A232C67.2080206@ar.media.kyoto-u.ac.jp> <4A234643.6040606@ar.media.kyoto-u.ac.jp> <4A236193.7030004@ar.media.kyoto-u.ac.jp> <3d375d730906010935n6c42717au8f224df393b1a7b1@mail.gmail.com> Message-ID: On Mon, Jun 1, 2009 at 11:48 AM, Charles R Harris wrote: > > > On Mon, Jun 1, 2009 at 10:35 AM, Robert Kern wrote: > >> On Mon, Jun 1, 2009 at 00:05, David Cournapeau >> wrote: >> >> > I think we should just fix it to use conjugate - I will do this in the >> > branch, and I will integrate it in the trunk later unless someone stands >> > up vehemently against the change. I opened up a ticket to track this, >> > though, >> >> It breaks everyone's code that works around the current behavior. >> > > Maybe we need a new function. But what to call it? > How about introducing acorrelate and deprecating the old version? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From kwgoodman at gmail.com Mon Jun 1 14:45:06 2009 From: kwgoodman at gmail.com (Keith Goodman) Date: Mon, 1 Jun 2009 11:45:06 -0700 Subject: [Numpy-discussion] numpy.nansum() behavior in 1.3.0 In-Reply-To: <1cd32cbb0906011134p4cc13f11ra0fecdc81c9f7c1c@mail.gmail.com> References: <1cd32cbb0906011116n4c59949fof00a32d0b6f2510e@mail.gmail.com> <1cd32cbb0906011134p4cc13f11ra0fecdc81c9f7c1c@mail.gmail.com> Message-ID: On Mon, Jun 1, 2009 at 11:34 AM, wrote: > On Mon, Jun 1, 2009 at 2:26 PM, Keith Goodman wrote: >> On Mon, Jun 1, 2009 at 11:16 AM, ? wrote: >>> On Mon, Jun 1, 2009 at 1:43 PM, Keith Goodman wrote: >>>> On Mon, Jun 1, 2009 at 9:55 AM, Michael Hearne wrote: >>>>> A question (and possibly a bug): >>>>> >>>>> What should be returned when I do: >>>>> >>>>> numpy.nansum([]) >>>>> >>>>> In my copy of numpy 1.1.1, I get 0.0. ?This is what I would expect to >>>>> see. >>>>> However, this behavior seems to have changed in 1.3.0, in which I get >>>>> nan. >>>> >>>> Here's a weird one. This is in numpy 1.2: >>>> >>>>>> np.sum(9) >>>> ? 9 >>>>>> np.sum(9.0) >>>> ? 9.0 >>>>>> np.nansum(9) >>>> ? 9 >>>>>> np.nansum(9.0) >>>> --------------------------------------------------------------------------- >>>> IndexError: 0-d arrays can't be indexed. >>> >>> wrong argument in isnan, I think >>> >>> Josef >>> >>> In file: C:\Programs\Python25\Lib\site-packages\numpy\lib\function_base.py >>> >>> def _nanop(op, fill, a, axis=None): >>> ? ?""" >>> ... >>> >>> ? ?""" >>> ? ?y = array(a,subok=True) >>> - ? ?mask = isnan(a) >>> + ? mask = isnan(y) >>> ? ?if mask.all(): >>> ? ? ? ?return np.nan >>> >>> ? ?if not issubclass(y.dtype.type, np.integer): >>> ? ? ? ?y[mask] = fill >>> >>> ? ?return op(y, axis=axis) >> >> The problem I came across, np.nansum(float), is caused by this line >> >> y[mask] = fill >> >> when y is 0-d. > > if mask is an array then this works, my initial solution was mask = > np.array(mask). ?then I thought it works also the other way, but it > doesn't. 
I shouldn't have changed my mind > >>>> y > array(9.0) >>>> y[np.array(np.isnan(9))] = 0 >>>> y > array(9.0) >>>> y[np.isnan(np.array(9))] = 0 > Traceback (most recent call last): > ?File "", line 1, in > ? ?y[np.isnan(np.array(9))] = 0 > IndexError: 0-d arrays can't be indexed. Nice! Are you going to make the change? From thomas.robitaille at gmail.com Mon Jun 1 15:02:38 2009 From: thomas.robitaille at gmail.com (Thomas Robitaille) Date: Mon, 1 Jun 2009 12:02:38 -0700 (PDT) Subject: [Numpy-discussion] Rasterizing points onto an array In-Reply-To: References: <96F50D68-C7F4-4F27-92B4-4CFE7ED01CBA@gmail.com> Message-ID: <23820216.post@talk.nabble.com> Nathan Bell-4 wrote: > > image = np.histogram2d(x, y, bins=bins, weights=z)[0] > This works great - thanks! Thomas -- View this message in context: http://www.nabble.com/Rasterizing-points-onto-an-array-tp23808494p23820216.html Sent from the Numpy-discussion mailing list archive at Nabble.com. From robert.kern at gmail.com Mon Jun 1 15:07:49 2009 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 1 Jun 2009 14:07:49 -0500 Subject: [Numpy-discussion] Problem with correlate In-Reply-To: References: <4A232C67.2080206@ar.media.kyoto-u.ac.jp> <4A234643.6040606@ar.media.kyoto-u.ac.jp> <4A236193.7030004@ar.media.kyoto-u.ac.jp> <3d375d730906010935n6c42717au8f224df393b1a7b1@mail.gmail.com> Message-ID: <3d375d730906011207m35c3464ahe87e24850813e206@mail.gmail.com> On Mon, Jun 1, 2009 at 13:44, Charles R Harris wrote: > > > On Mon, Jun 1, 2009 at 11:48 AM, Charles R Harris > wrote: >> >> >> On Mon, Jun 1, 2009 at 10:35 AM, Robert Kern >> wrote: >>> >>> On Mon, Jun 1, 2009 at 00:05, David Cournapeau >>> wrote: >>> >>> > I think we should just fix it to use conjugate - I will do this in the >>> > branch, and I will integrate it in the trunk later unless someone >>> > stands >>> > up vehemently against the change. I opened up a ticket to track this, >>> > though, >>> >>> It breaks everyone's code that works around the current behavior. >> >> Maybe we need a new function. But what to call it? > > How about introducing acorrelate and deprecating the old version? Sure. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From josef.pktd at gmail.com Mon Jun 1 15:38:07 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 1 Jun 2009 15:38:07 -0400 Subject: [Numpy-discussion] numpy.nansum() behavior in 1.3.0 In-Reply-To: References: <1cd32cbb0906011116n4c59949fof00a32d0b6f2510e@mail.gmail.com> <1cd32cbb0906011134p4cc13f11ra0fecdc81c9f7c1c@mail.gmail.com> Message-ID: <1cd32cbb0906011238o94e5eb3lcfbe52d346872bbb@mail.gmail.com> On Mon, Jun 1, 2009 at 2:45 PM, Keith Goodman wrote: > On Mon, Jun 1, 2009 at 11:34 AM, ? wrote: >> On Mon, Jun 1, 2009 at 2:26 PM, Keith Goodman wrote: >>> On Mon, Jun 1, 2009 at 11:16 AM, ? wrote: >>>> On Mon, Jun 1, 2009 at 1:43 PM, Keith Goodman wrote: >>>>> On Mon, Jun 1, 2009 at 9:55 AM, Michael Hearne wrote: >>>>>> A question (and possibly a bug): >>>>>> >>>>>> What should be returned when I do: >>>>>> >>>>>> numpy.nansum([]) >>>>>> >>>>>> In my copy of numpy 1.1.1, I get 0.0. ?This is what I would expect to >>>>>> see. >>>>>> However, this behavior seems to have changed in 1.3.0, in which I get >>>>>> nan. >>>>> >>>>> Here's a weird one. This is in numpy 1.2: >>>>> >>>>>>> np.sum(9) >>>>> ? 9 >>>>>>> np.sum(9.0) >>>>> ? 9.0 >>>>>>> np.nansum(9) >>>>> ? 
9 >>>>>>> np.nansum(9.0) >>>>> --------------------------------------------------------------------------- >>>>> IndexError: 0-d arrays can't be indexed. >>>> >>>> wrong argument in isnan, I think >>>> >>>> Josef >>>> >>>> In file: C:\Programs\Python25\Lib\site-packages\numpy\lib\function_base.py >>>> >>>> def _nanop(op, fill, a, axis=None): >>>> ? ?""" >>>> ... >>>> >>>> ? ?""" >>>> ? ?y = array(a,subok=True) >>>> - ? ?mask = isnan(a) >>>> + ? mask = isnan(y) >>>> ? ?if mask.all(): >>>> ? ? ? ?return np.nan >>>> >>>> ? ?if not issubclass(y.dtype.type, np.integer): >>>> ? ? ? ?y[mask] = fill >>>> >>>> ? ?return op(y, axis=axis) >>> >>> The problem I came across, np.nansum(float), is caused by this line >>> >>> y[mask] = fill >>> >>> when y is 0-d. >> >> if mask is an array then this works, my initial solution was mask = >> np.array(mask). ?then I thought it works also the other way, but it >> doesn't. I shouldn't have changed my mind >> >>>>> y >> array(9.0) >>>>> y[np.array(np.isnan(9))] = 0 >>>>> y >> array(9.0) >>>>> y[np.isnan(np.array(9))] = 0 >> Traceback (most recent call last): >> ?File "", line 1, in >> ? ?y[np.isnan(np.array(9))] = 0 >> IndexError: 0-d arrays can't be indexed. > > Nice! > > Are you going to make the change? Not me, the original problem was more difficult to track down >>> np.nansum(np.array([])) 1.#QNAN Here's a good one: >>> np.isnan([]).all() True >>> np.isnan([]).any() False Josef From aisaac at american.edu Mon Jun 1 16:06:42 2009 From: aisaac at american.edu (Alan G Isaac) Date: Mon, 01 Jun 2009 16:06:42 -0400 Subject: [Numpy-discussion] numpy.nansum() behavior in 1.3.0 In-Reply-To: <1cd32cbb0906011238o94e5eb3lcfbe52d346872bbb@mail.gmail.com> References: <1cd32cbb0906011116n4c59949fof00a32d0b6f2510e@mail.gmail.com> <1cd32cbb0906011134p4cc13f11ra0fecdc81c9f7c1c@mail.gmail.com> <1cd32cbb0906011238o94e5eb3lcfbe52d346872bbb@mail.gmail.com> Message-ID: <4A2434D2.2040801@american.edu> On 6/1/2009 3:38 PM josef.pktd at gmail.com apparently wrote: > Here's a good one: > >>>> np.isnan([]).all() > True >>>> np.isnan([]).any() > False >>> all([]) True >>> any([]) False Cheers, Alan Isaac From josef.pktd at gmail.com Mon Jun 1 16:31:52 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 1 Jun 2009 16:31:52 -0400 Subject: [Numpy-discussion] numpy.nansum() behavior in 1.3.0 In-Reply-To: <4A2434D2.2040801@american.edu> References: <1cd32cbb0906011116n4c59949fof00a32d0b6f2510e@mail.gmail.com> <1cd32cbb0906011134p4cc13f11ra0fecdc81c9f7c1c@mail.gmail.com> <1cd32cbb0906011238o94e5eb3lcfbe52d346872bbb@mail.gmail.com> <4A2434D2.2040801@american.edu> Message-ID: <1cd32cbb0906011331s6e532b27g7e1f42d4b45c600e@mail.gmail.com> On Mon, Jun 1, 2009 at 4:06 PM, Alan G Isaac wrote: > On 6/1/2009 3:38 PM josef.pktd at gmail.com apparently wrote: >> Here's a good one: >> >>>>> np.isnan([]).all() >> True >>>>> np.isnan([]).any() >> False > > > ?>>> all([]) > True > ?>>> any([]) > False also: >>> y array([], dtype=float64) >>> (y>0).all() True >>> (y>0).any() False >>> ((y>0)>0).sum() 0 I don't know what's the logic, but it causes the bug in np.nansum. Josef From sccolbert at gmail.com Mon Jun 1 16:37:54 2009 From: sccolbert at gmail.com (Chris Colbert) Date: Mon, 1 Jun 2009 16:37:54 -0400 Subject: [Numpy-discussion] When building with --prefix /usr/local numpy not detected by python Message-ID: <7f014ea60906011337m39bbd588l26947c10e1e58e5f@mail.gmail.com> On 64-bit ubuntu 9.04 and Python 2.6, I built numpy from source against atlas and lapack (everything 64bit). 
To install, I used: sudo python setup.py install --prefix /usr/local but then python doesnt find the numpy module, even though it exists in /usr/local/lib/python2.6/site-packages Do I need to add a .pth file somewhere to tell python about numpy? I thought that would be done during the install command Cheers! Chris -------------- next part -------------- An HTML attachment was scrubbed... URL: From jsseabold at gmail.com Mon Jun 1 16:47:03 2009 From: jsseabold at gmail.com (Skipper Seabold) Date: Mon, 1 Jun 2009 16:47:03 -0400 Subject: [Numpy-discussion] When building with --prefix /usr/local numpy not detected by python In-Reply-To: <7f014ea60906011337m39bbd588l26947c10e1e58e5f@mail.gmail.com> References: <7f014ea60906011337m39bbd588l26947c10e1e58e5f@mail.gmail.com> Message-ID: On Mon, Jun 1, 2009 at 4:37 PM, Chris Colbert wrote: > On 64-bit ubuntu 9.04 and Python 2.6, I built numpy from source against > atlas and lapack (everything 64bit). > > To install, I used:?? sudo python setup.py install --prefix /usr/local > > but then python doesnt find the numpy module, even though it exists in > /usr/local/lib/python2.6/site-packages > > > Do I need to add a .pth file somewhere to tell python about numpy? I thought > that would be done during the install command > > Cheers! > > Chris > I have two similar setups (kubuntu 64 bit and ubuntu 32 bit though) and am able to build without the prefix flag and everything seems to work okay. Install goes into the dist-packages directory and not the site-packages. I'm not sure why though to be honest. Skipper From sccolbert at gmail.com Mon Jun 1 16:48:32 2009 From: sccolbert at gmail.com (Chris Colbert) Date: Mon, 1 Jun 2009 16:48:32 -0400 Subject: [Numpy-discussion] When building with --prefix /usr/local numpy not detected by python In-Reply-To: References: <7f014ea60906011337m39bbd588l26947c10e1e58e5f@mail.gmail.com> Message-ID: <7f014ea60906011348l6c5199b0s971aac4357b6dda0@mail.gmail.com> building without the prefix flag works for me as well, just wondering why this doesnt... Chris On Mon, Jun 1, 2009 at 4:47 PM, Skipper Seabold wrote: > On Mon, Jun 1, 2009 at 4:37 PM, Chris Colbert wrote: > > On 64-bit ubuntu 9.04 and Python 2.6, I built numpy from source against > > atlas and lapack (everything 64bit). > > > > To install, I used: sudo python setup.py install --prefix /usr/local > > > > but then python doesnt find the numpy module, even though it exists in > > /usr/local/lib/python2.6/site-packages > > > > > > Do I need to add a .pth file somewhere to tell python about numpy? I > thought > > that would be done during the install command > > > > Cheers! > > > > Chris > > > > I have two similar setups (kubuntu 64 bit and ubuntu 32 bit though) > and am able to build without the prefix flag and everything seems to > work okay. Install goes into the dist-packages directory and not the > site-packages. I'm not sure why though to be honest. > > Skipper > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From robert.kern at gmail.com Mon Jun 1 17:12:56 2009 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 1 Jun 2009 16:12:56 -0500 Subject: [Numpy-discussion] When building with --prefix /usr/local numpy not detected by python In-Reply-To: <7f014ea60906011337m39bbd588l26947c10e1e58e5f@mail.gmail.com> References: <7f014ea60906011337m39bbd588l26947c10e1e58e5f@mail.gmail.com> Message-ID: <3d375d730906011412q6db17f1du52816bf8026b8893@mail.gmail.com> On Mon, Jun 1, 2009 at 15:37, Chris Colbert wrote: > On 64-bit ubuntu 9.04 and Python 2.6, I built numpy from source against > atlas and lapack (everything 64bit). > > To install, I used:?? sudo python setup.py install --prefix /usr/local > > but then python doesnt find the numpy module, even though it exists in > /usr/local/lib/python2.6/site-packages > > > Do I need to add a .pth file somewhere to tell python about numpy? No. Double-check that /usr/local/lib/python2.6/site-packages/ is on your sys.path. Ubuntu used to do this, but I don't know if they've changed policy in 9.04. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From sccolbert at gmail.com Mon Jun 1 17:35:43 2009 From: sccolbert at gmail.com (Chris Colbert) Date: Mon, 1 Jun 2009 17:35:43 -0400 Subject: [Numpy-discussion] When building with --prefix /usr/local numpy not detected by python In-Reply-To: <3d375d730906011412q6db17f1du52816bf8026b8893@mail.gmail.com> References: <7f014ea60906011337m39bbd588l26947c10e1e58e5f@mail.gmail.com> <3d375d730906011412q6db17f1du52816bf8026b8893@mail.gmail.com> Message-ID: <7f014ea60906011435y38be5214h3799d58691509ce0@mail.gmail.com> thanks Robert, the directory indeed wasnt in the $PATH variable. Cheers, Chris On Mon, Jun 1, 2009 at 5:12 PM, Robert Kern wrote: > On Mon, Jun 1, 2009 at 15:37, Chris Colbert wrote: > > On 64-bit ubuntu 9.04 and Python 2.6, I built numpy from source against > > atlas and lapack (everything 64bit). > > > > To install, I used: sudo python setup.py install --prefix /usr/local > > > > but then python doesnt find the numpy module, even though it exists in > > /usr/local/lib/python2.6/site-packages > > > > > > Do I need to add a .pth file somewhere to tell python about numpy? > > No. Double-check that /usr/local/lib/python2.6/site-packages/ is on > your sys.path. Ubuntu used to do this, but I don't know if they've > changed policy in 9.04. > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > -- Umberto Eco > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From robert.kern at gmail.com Mon Jun 1 17:37:57 2009 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 1 Jun 2009 16:37:57 -0500 Subject: [Numpy-discussion] When building with --prefix /usr/local numpy not detected by python In-Reply-To: <7f014ea60906011435y38be5214h3799d58691509ce0@mail.gmail.com> References: <7f014ea60906011337m39bbd588l26947c10e1e58e5f@mail.gmail.com> <3d375d730906011412q6db17f1du52816bf8026b8893@mail.gmail.com> <7f014ea60906011435y38be5214h3799d58691509ce0@mail.gmail.com> Message-ID: <3d375d730906011437qbc78407xd14c8b8b37c43d23@mail.gmail.com> On Mon, Jun 1, 2009 at 16:35, Chris Colbert wrote: > thanks Robert, > > the directory indeed wasnt in the $PATH variable. No, not the environment variable $PATH, but sys.path. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From sccolbert at gmail.com Mon Jun 1 17:44:32 2009 From: sccolbert at gmail.com (Chris Colbert) Date: Mon, 1 Jun 2009 17:44:32 -0400 Subject: [Numpy-discussion] When building with --prefix /usr/local numpy not detected by python In-Reply-To: <3d375d730906011437qbc78407xd14c8b8b37c43d23@mail.gmail.com> References: <7f014ea60906011337m39bbd588l26947c10e1e58e5f@mail.gmail.com> <3d375d730906011412q6db17f1du52816bf8026b8893@mail.gmail.com> <7f014ea60906011435y38be5214h3799d58691509ce0@mail.gmail.com> <3d375d730906011437qbc78407xd14c8b8b37c43d23@mail.gmail.com> Message-ID: <7f014ea60906011444g64208880g1acf57c844ccd38@mail.gmail.com> yeah, I came back here just now to call myself an idiot, but I'm too late :) Chris On Mon, Jun 1, 2009 at 5:37 PM, Robert Kern wrote: > On Mon, Jun 1, 2009 at 16:35, Chris Colbert wrote: > > thanks Robert, > > > > the directory indeed wasnt in the $PATH variable. > > No, not the environment variable $PATH, but sys.path. > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > -- Umberto Eco > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sccolbert at gmail.com Mon Jun 1 17:54:29 2009 From: sccolbert at gmail.com (Chris Colbert) Date: Mon, 1 Jun 2009 17:54:29 -0400 Subject: [Numpy-discussion] When building with --prefix /usr/local numpy not detected by python In-Reply-To: <7f014ea60906011444g64208880g1acf57c844ccd38@mail.gmail.com> References: <7f014ea60906011337m39bbd588l26947c10e1e58e5f@mail.gmail.com> <3d375d730906011412q6db17f1du52816bf8026b8893@mail.gmail.com> <7f014ea60906011435y38be5214h3799d58691509ce0@mail.gmail.com> <3d375d730906011437qbc78407xd14c8b8b37c43d23@mail.gmail.com> <7f014ea60906011444g64208880g1acf57c844ccd38@mail.gmail.com> Message-ID: <7f014ea60906011454h66ab9d83ta890e429e44fabb5@mail.gmail.com> the directory wasn't on the python path either. I added a site-packages.pth file to /usr/local/lib/python2.6/dist-packages with the line "/usr/local/lib/python2.6/site-packages" Not elegant, but it worked. 
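For anyone hitting the same problem, the check that matters is sys.path rather than the shell's $PATH; a short sketch, using the prefix from this thread:

import sys
print([p for p in sys.path if 'packages' in p])  # is the install dir listed?

# One-off alternative to hand-writing a .pth file: site.addsitedir()
# appends the directory to sys.path for the current session and
# processes any .pth files found there.
import site
site.addsitedir('/usr/local/lib/python2.6/site-packages')
import numpy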
Chris On Mon, Jun 1, 2009 at 5:44 PM, Chris Colbert wrote: > yeah, I came back here just now to call myself an idiot, but I'm too late > :) > > Chris > > > On Mon, Jun 1, 2009 at 5:37 PM, Robert Kern wrote: > >> On Mon, Jun 1, 2009 at 16:35, Chris Colbert wrote: >> > thanks Robert, >> > >> > the directory indeed wasnt in the $PATH variable. >> >> No, not the environment variable $PATH, but sys.path. >> >> -- >> Robert Kern >> >> "I have come to believe that the whole world is an enigma, a harmless >> enigma that is made terrible by our own mad attempt to interpret it as >> though it had an underlying truth." >> -- Umberto Eco >> _______________________________________________ >> Numpy-discussion mailing list >> Numpy-discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ferrell at diablotech.com Mon Jun 1 18:32:05 2009 From: ferrell at diablotech.com (Robert Ferrell) Date: Mon, 1 Jun 2009 16:32:05 -0600 Subject: [Numpy-discussion] Slices of structured arrays Message-ID: Is there a way to get slices of a structured array and keep the field names? For instance, I've got dtype=[('x','f4'),('y','f4'), ('z','f4')] and I want to get just the x & y slices into a new array with dtype=[('x','f4'),('y','f4')]. I can just make a new dtype, and extract what I need, but I'm wondering if there's some simple way to do this that I haven't found. Here's what I know works: # Make a len 10 array with 3 fields, 'x', 'y', 'z' In [647]: xyz = np.array(zip(*np.random.random_integers(low=10, size=(3,10))), dtype=[('x', 'f4'), ('y', 'f4'), ('z', 'f4')]) # Get just the 'x' and 'y' fields In [648]: xy = np.array( zip(xyz['x'], xyz['y'] ), dtype=[('x','f4'), ('y', 'f4')]) In [649]: xyz['x'] Out[649]: array([ 4., 1., 1., 5., 1., 2., 9., 8., 1., 9.], dtype=float32) In [650]: xy['x'] Out[650]: array([ 4., 1., 1., 5., 1., 2., 9., 8., 1., 9.], dtype=float32) That works, but just feels like there's probably an elegant solution I don't know. I couldn't find anything in the docs, but I may not have been using the right search words. thanks, -robert From robert.kern at gmail.com Mon Jun 1 18:41:49 2009 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 1 Jun 2009 17:41:49 -0500 Subject: [Numpy-discussion] Slices of structured arrays In-Reply-To: References: Message-ID: <3d375d730906011541q14355b98v462066db17113c98@mail.gmail.com> On Mon, Jun 1, 2009 at 17:32, Robert Ferrell wrote: > Is there a way to get slices of a structured array and keep the field > names? ?For instance, I've got dtype=[('x','f4'),('y','f4'), > ('z','f4')] and I want to get just the x & y slices into a new array > with dtype=[('x','f4'),('y','f4')]. > > I can just make a new dtype, and extract what I need, but I'm > wondering if there's some simple way to do this that I haven't found. > > Here's what I know works: > > # Make a len 10 array with 3 fields, 'x', 'y', 'z' > In [647]: xyz = np.array(zip(*np.random.random_integers(low=10, > size=(3,10))), dtype=[('x', 'f4'), ('y', 'f4'), ('z', 'f4')]) > > # Get just the 'x' and 'y' fields > In [648]: xy = np.array( zip(xyz['x'], xyz['y'] ), dtype=[('x','f4'), > ('y', 'f4')]) > > In [649]: xyz['x'] > Out[649]: array([ 4., ?1., ?1., ?5., ?1., ?2., ?9., ?8., ?1., ?9.], > dtype=float32) > > In [650]: xy['x'] > Out[650]: array([ 4., ?1., ?1., ?5., ?1., ?2., ?9., ?8., ?1., ?9.], > dtype=float32) > > That works, but just feels like there's probably an elegant solution I > don't know. 
?I couldn't find anything in the docs, but I may not have > been using the right search words. In numpy 1.4, there will be a function that does this, numpy.lib.recfunctions.drop_fields(). In the meantime, you can copy-and-paste it into your own code: http://svn.scipy.org/svn/numpy/trunk/numpy/lib/recfunctions.py Or use it from it's original source, matplotlib.mlab.rec_drop_fields(), if you have matplotlib. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From ferrell at diablotech.com Mon Jun 1 19:09:15 2009 From: ferrell at diablotech.com (Robert Ferrell) Date: Mon, 1 Jun 2009 17:09:15 -0600 Subject: [Numpy-discussion] Slices of structured arrays In-Reply-To: <3d375d730906011541q14355b98v462066db17113c98@mail.gmail.com> References: <3d375d730906011541q14355b98v462066db17113c98@mail.gmail.com> Message-ID: On Jun 1, 2009, at 4:41 PM, Robert Kern wrote: > On Mon, Jun 1, 2009 at 17:32, Robert Ferrell > wrote: >> Is there a way to get slices of a structured array and keep the field >> names? For instance, I've got dtype=[('x','f4'),('y','f4'), >> ('z','f4')] and I want to get just the x & y slices into a new array >> with dtype=[('x','f4'),('y','f4')]. >> >> I can just make a new dtype, and extract what I need, but I'm >> wondering if there's some simple way to do this that I haven't found. >> >> Here's what I know works: >> >> # Make a len 10 array with 3 fields, 'x', 'y', 'z' >> In [647]: xyz = np.array(zip(*np.random.random_integers(low=10, >> size=(3,10))), dtype=[('x', 'f4'), ('y', 'f4'), ('z', 'f4')]) >> >> # Get just the 'x' and 'y' fields >> In [648]: xy = np.array( zip(xyz['x'], xyz['y'] ), dtype=[('x','f4'), >> ('y', 'f4')]) >> >> In [649]: xyz['x'] >> Out[649]: array([ 4., 1., 1., 5., 1., 2., 9., 8., 1., 9.], >> dtype=float32) >> >> In [650]: xy['x'] >> Out[650]: array([ 4., 1., 1., 5., 1., 2., 9., 8., 1., 9.], >> dtype=float32) >> >> That works, but just feels like there's probably an elegant >> solution I >> don't know. I couldn't find anything in the docs, but I may not have >> been using the right search words. > > In numpy 1.4, there will be a function that does this, > numpy.lib.recfunctions.drop_fields(). In the meantime, you can > copy-and-paste it into your own code: > > http://svn.scipy.org/svn/numpy/trunk/numpy/lib/recfunctions.py > > Or use it from it's original source, > matplotlib.mlab.rec_drop_fields(), if you have matplotlib. That's perfect. I've got matplotlib, so I'll use that for now. 
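For reference, a sketch of roughly what such field-dropping helpers boil down to (made-up array and field names, not the library code): build the reduced dtype from the kept field names and copy field by field, which avoids the zip() round-trip shown earlier.

import numpy as np

xyz = np.zeros(10, dtype=[('x', 'f4'), ('y', 'f4'), ('z', 'f4')])
keep = ['x', 'y']

# Reduced dtype assembled from the retained fields, then a plain copy.
xy = np.empty(xyz.shape, dtype=[(name, xyz.dtype[name]) for name in keep])
for name in keep:
    xy[name] = xyz[name]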
thanks,
-robert

From robert.kern at gmail.com Mon Jun 1 19:30:22 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Mon, 1 Jun 2009 18:30:22 -0500
Subject: [Numpy-discussion] numpy.nansum() behavior in 1.3.0
In-Reply-To: <1cd32cbb0906011331s6e532b27g7e1f42d4b45c600e@mail.gmail.com>
References: <1cd32cbb0906011116n4c59949fof00a32d0b6f2510e@mail.gmail.com>
	<1cd32cbb0906011134p4cc13f11ra0fecdc81c9f7c1c@mail.gmail.com>
	<1cd32cbb0906011238o94e5eb3lcfbe52d346872bbb@mail.gmail.com>
	<4A2434D2.2040801@american.edu>
	<1cd32cbb0906011331s6e532b27g7e1f42d4b45c600e@mail.gmail.com>
Message-ID: <3d375d730906011630t1eb79dc2ge038031d7b658d94@mail.gmail.com>

On Mon, Jun 1, 2009 at 15:31, wrote:
> On Mon, Jun 1, 2009 at 4:06 PM, Alan G Isaac wrote:
>> On 6/1/2009 3:38 PM josef.pktd at gmail.com apparently wrote:
>>> Here's a good one:
>>>
>>>>>> np.isnan([]).all()
>>> True
>>>>>> np.isnan([]).any()
>>> False
>>
>>
>> >>> all([])
>> True
>> >>> any([])
>> False
>
> also:
>
>>>> y
> array([], dtype=float64)
>>>>> (y>0).all()
> True
>>>>> (y>0).any()
> False
>>>>> ((y>0)>0).sum()
> 0
>
> I don't know what's the logic, but it causes the bug in np.nansum.

You will have to special-case empty arrays, then.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
 -- Umberto Eco

From josef.pktd at gmail.com Mon Jun 1 19:43:32 2009
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Mon, 1 Jun 2009 19:43:32 -0400
Subject: [Numpy-discussion] numpy.nansum() behavior in 1.3.0
In-Reply-To: <3d375d730906011630t1eb79dc2ge038031d7b658d94@mail.gmail.com>
References: <1cd32cbb0906011116n4c59949fof00a32d0b6f2510e@mail.gmail.com>
	<1cd32cbb0906011134p4cc13f11ra0fecdc81c9f7c1c@mail.gmail.com>
	<1cd32cbb0906011238o94e5eb3lcfbe52d346872bbb@mail.gmail.com>
	<4A2434D2.2040801@american.edu>
	<1cd32cbb0906011331s6e532b27g7e1f42d4b45c600e@mail.gmail.com>
	<3d375d730906011630t1eb79dc2ge038031d7b658d94@mail.gmail.com>
Message-ID: <1cd32cbb0906011643xd391e36t7788d795b344809@mail.gmail.com>

On Mon, Jun 1, 2009 at 7:30 PM, Robert Kern wrote:
> On Mon, Jun 1, 2009 at 15:31, wrote:
>> On Mon, Jun 1, 2009 at 4:06 PM, Alan G Isaac wrote:
>>> On 6/1/2009 3:38 PM josef.pktd at gmail.com apparently wrote:
>>>> Here's a good one:
>>>>
>>>>>>> np.isnan([]).all()
>>>> True
>>>>>>> np.isnan([]).any()
>>>> False
>>>
>>>
>>> >>> all([])
>>> True
>>> >>> any([])
>>> False
>>
>> also:
>>
>>>>> y
>> array([], dtype=float64)
>>>>>> (y>0).all()
>> True
>>>>>> (y>0).any()
>> False
>>>>>> ((y>0)>0).sum()
>> 0
>>
>> I don't know what's the logic, but it causes the bug in np.nansum.
>
> You will have to special-case empty arrays, then.

is np.size the right check for non-empty array, including subtypes?

i.e.
if y.size and mask.all(): return np.nan or more explicit if y.size > 0 and mask.all(): return np.nan Josef From josef.pktd at gmail.com Mon Jun 1 19:50:53 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 1 Jun 2009 19:50:53 -0400 Subject: [Numpy-discussion] numpy.nansum() behavior in 1.3.0 In-Reply-To: <1cd32cbb0906011643xd391e36t7788d795b344809@mail.gmail.com> References: <1cd32cbb0906011116n4c59949fof00a32d0b6f2510e@mail.gmail.com> <1cd32cbb0906011134p4cc13f11ra0fecdc81c9f7c1c@mail.gmail.com> <1cd32cbb0906011238o94e5eb3lcfbe52d346872bbb@mail.gmail.com> <4A2434D2.2040801@american.edu> <1cd32cbb0906011331s6e532b27g7e1f42d4b45c600e@mail.gmail.com> <3d375d730906011630t1eb79dc2ge038031d7b658d94@mail.gmail.com> <1cd32cbb0906011643xd391e36t7788d795b344809@mail.gmail.com> Message-ID: <1cd32cbb0906011650m7501838esc03ad68e6801fe2a@mail.gmail.com> On Mon, Jun 1, 2009 at 7:43 PM, wrote: > On Mon, Jun 1, 2009 at 7:30 PM, Robert Kern wrote: >> On Mon, Jun 1, 2009 at 15:31, ? wrote: >>> On Mon, Jun 1, 2009 at 4:06 PM, Alan G Isaac wrote: >>>> On 6/1/2009 3:38 PM josef.pktd at gmail.com apparently wrote: >>>>> Here's a good one: >>>>> >>>>>>>> np.isnan([]).all() >>>>> True >>>>>>>> np.isnan([]).any() >>>>> False >>>> >>>> >>>> ?>>> all([]) >>>> True >>>> ?>>> any([]) >>>> False >>> >>> also: >>> >>>>>> y >>> array([], dtype=float64) >>>>>> (y>0).all() >>> True >>>>>> (y>0).any() >>> False >>>>>> ((y>0)>0).sum() >>> 0 >>> >>> I don't know what's the logic, but it causes the bug in np.nansum. >> >> You will have to special-case empty arrays, then. >> > > is np.size the right check for non-empty array, including subtypes? > > i.e. > > if y.size and mask.all(): > ? ? ? ?return np.nan > > or more explicit > if y.size > 0 and mask.all(): > ? ? ? ?return np.nan > Actually, now I think this is the wrong behavior, nansum should never return nan. >>> np.nansum([np.nan, np.nan]) 1.#QNAN shouldn't this be zero Josef From kwgoodman at gmail.com Mon Jun 1 19:58:26 2009 From: kwgoodman at gmail.com (Keith Goodman) Date: Mon, 1 Jun 2009 16:58:26 -0700 Subject: [Numpy-discussion] numpy.nansum() behavior in 1.3.0 In-Reply-To: <1cd32cbb0906011650m7501838esc03ad68e6801fe2a@mail.gmail.com> References: <1cd32cbb0906011134p4cc13f11ra0fecdc81c9f7c1c@mail.gmail.com> <1cd32cbb0906011238o94e5eb3lcfbe52d346872bbb@mail.gmail.com> <4A2434D2.2040801@american.edu> <1cd32cbb0906011331s6e532b27g7e1f42d4b45c600e@mail.gmail.com> <3d375d730906011630t1eb79dc2ge038031d7b658d94@mail.gmail.com> <1cd32cbb0906011643xd391e36t7788d795b344809@mail.gmail.com> <1cd32cbb0906011650m7501838esc03ad68e6801fe2a@mail.gmail.com> Message-ID: On Mon, Jun 1, 2009 at 4:50 PM, wrote: > On Mon, Jun 1, 2009 at 7:43 PM, ? wrote: >> On Mon, Jun 1, 2009 at 7:30 PM, Robert Kern wrote: >>> On Mon, Jun 1, 2009 at 15:31, ? wrote: >>>> On Mon, Jun 1, 2009 at 4:06 PM, Alan G Isaac wrote: >>>>> On 6/1/2009 3:38 PM josef.pktd at gmail.com apparently wrote: >>>>>> Here's a good one: >>>>>> >>>>>>>>> np.isnan([]).all() >>>>>> True >>>>>>>>> np.isnan([]).any() >>>>>> False >>>>> >>>>> >>>>> ?>>> all([]) >>>>> True >>>>> ?>>> any([]) >>>>> False >>>> >>>> also: >>>> >>>>>>> y >>>> array([], dtype=float64) >>>>>>> (y>0).all() >>>> True >>>>>>> (y>0).any() >>>> False >>>>>>> ((y>0)>0).sum() >>>> 0 >>>> >>>> I don't know what's the logic, but it causes the bug in np.nansum. >>> >>> You will have to special-case empty arrays, then. >>> >> >> is np.size the right check for non-empty array, including subtypes? >> >> i.e. >> >> if y.size and mask.all(): >> ? ? ? 
?return np.nan >> >> or more explicit >> if y.size > 0 and mask.all(): >> ? ? ? ?return np.nan >> > > Actually, now I think this is the wrong behavior, nansum should never > return nan. > >>>> np.nansum([np.nan, np.nan]) > 1.#QNAN > > shouldn't this be zero The doc string says it is zero: "Return the sum of array elements over a given axis treating Not a Numbers (NaNs) as zero." Treating NaNs differently in different cases is harder to explain. From josef.pktd at gmail.com Mon Jun 1 20:03:39 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 1 Jun 2009 20:03:39 -0400 Subject: [Numpy-discussion] numpy.nansum() behavior in 1.3.0 In-Reply-To: References: <1cd32cbb0906011134p4cc13f11ra0fecdc81c9f7c1c@mail.gmail.com> <1cd32cbb0906011238o94e5eb3lcfbe52d346872bbb@mail.gmail.com> <4A2434D2.2040801@american.edu> <1cd32cbb0906011331s6e532b27g7e1f42d4b45c600e@mail.gmail.com> <3d375d730906011630t1eb79dc2ge038031d7b658d94@mail.gmail.com> <1cd32cbb0906011643xd391e36t7788d795b344809@mail.gmail.com> <1cd32cbb0906011650m7501838esc03ad68e6801fe2a@mail.gmail.com> Message-ID: <1cd32cbb0906011703m793636d5j4556af2f61a04aab@mail.gmail.com> On Mon, Jun 1, 2009 at 7:58 PM, Keith Goodman wrote: > On Mon, Jun 1, 2009 at 4:50 PM, ? wrote: >> On Mon, Jun 1, 2009 at 7:43 PM, ? wrote: >>> On Mon, Jun 1, 2009 at 7:30 PM, Robert Kern wrote: >>>> On Mon, Jun 1, 2009 at 15:31, ? wrote: >>>>> On Mon, Jun 1, 2009 at 4:06 PM, Alan G Isaac wrote: >>>>>> On 6/1/2009 3:38 PM josef.pktd at gmail.com apparently wrote: >>>>>>> Here's a good one: >>>>>>> >>>>>>>>>> np.isnan([]).all() >>>>>>> True >>>>>>>>>> np.isnan([]).any() >>>>>>> False >>>>>> >>>>>> >>>>>> ?>>> all([]) >>>>>> True >>>>>> ?>>> any([]) >>>>>> False >>>>> >>>>> also: >>>>> >>>>>>>> y >>>>> array([], dtype=float64) >>>>>>>> (y>0).all() >>>>> True >>>>>>>> (y>0).any() >>>>> False >>>>>>>> ((y>0)>0).sum() >>>>> 0 >>>>> >>>>> I don't know what's the logic, but it causes the bug in np.nansum. >>>> >>>> You will have to special-case empty arrays, then. >>>> >>> >>> is np.size the right check for non-empty array, including subtypes? >>> >>> i.e. >>> >>> if y.size and mask.all(): >>> ? ? ? ?return np.nan >>> >>> or more explicit >>> if y.size > 0 and mask.all(): >>> ? ? ? ?return np.nan >>> >> >> Actually, now I think this is the wrong behavior, nansum should never >> return nan. >> >>>>> np.nansum([np.nan, np.nan]) >> 1.#QNAN >> >> shouldn't this be zero > > The doc string says it is zero: "Return the sum of array elements over > a given axis treating Not a Numbers (NaNs) as zero." Treating NaNs > differently in different cases is harder to explain. http://projects.scipy.org/numpy/ticket/1123 open for review Josef From robert.kern at gmail.com Mon Jun 1 20:30:02 2009 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 1 Jun 2009 19:30:02 -0500 Subject: [Numpy-discussion] numpy.nansum() behavior in 1.3.0 In-Reply-To: <1cd32cbb0906011650m7501838esc03ad68e6801fe2a@mail.gmail.com> References: <1cd32cbb0906011134p4cc13f11ra0fecdc81c9f7c1c@mail.gmail.com> <1cd32cbb0906011238o94e5eb3lcfbe52d346872bbb@mail.gmail.com> <4A2434D2.2040801@american.edu> <1cd32cbb0906011331s6e532b27g7e1f42d4b45c600e@mail.gmail.com> <3d375d730906011630t1eb79dc2ge038031d7b658d94@mail.gmail.com> <1cd32cbb0906011643xd391e36t7788d795b344809@mail.gmail.com> <1cd32cbb0906011650m7501838esc03ad68e6801fe2a@mail.gmail.com> Message-ID: <3d375d730906011730g2608e1a4p6e6f608f50326bac@mail.gmail.com> On Mon, Jun 1, 2009 at 18:50, wrote: > On Mon, Jun 1, 2009 at 7:43 PM, ? 
wrote: >> is np.size the right check for non-empty array, including subtypes? Yes. >> i.e. >> >> if y.size and mask.all(): >> ? ? ? ?return np.nan >> >> or more explicit >> if y.size > 0 and mask.all(): >> ? ? ? ?return np.nan >> > > Actually, now I think this is the wrong behavior, nansum should never > return nan. > >>>> np.nansum([np.nan, np.nan]) > 1.#QNAN > > shouldn't this be zero I agree. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From charlesr.harris at gmail.com Mon Jun 1 21:09:35 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 1 Jun 2009 19:09:35 -0600 Subject: [Numpy-discussion] numpy.nansum() behavior in 1.3.0 In-Reply-To: <3d375d730906011730g2608e1a4p6e6f608f50326bac@mail.gmail.com> References: <1cd32cbb0906011134p4cc13f11ra0fecdc81c9f7c1c@mail.gmail.com> <1cd32cbb0906011238o94e5eb3lcfbe52d346872bbb@mail.gmail.com> <4A2434D2.2040801@american.edu> <1cd32cbb0906011331s6e532b27g7e1f42d4b45c600e@mail.gmail.com> <3d375d730906011630t1eb79dc2ge038031d7b658d94@mail.gmail.com> <1cd32cbb0906011643xd391e36t7788d795b344809@mail.gmail.com> <1cd32cbb0906011650m7501838esc03ad68e6801fe2a@mail.gmail.com> <3d375d730906011730g2608e1a4p6e6f608f50326bac@mail.gmail.com> Message-ID: On Mon, Jun 1, 2009 at 6:30 PM, Robert Kern wrote: > On Mon, Jun 1, 2009 at 18:50, wrote: > > On Mon, Jun 1, 2009 at 7:43 PM, wrote: > > >> is np.size the right check for non-empty array, including subtypes? > > Yes. > > >> i.e. > >> > >> if y.size and mask.all(): > >> return np.nan > >> > >> or more explicit > >> if y.size > 0 and mask.all(): > >> return np.nan > >> > > > > Actually, now I think this is the wrong behavior, nansum should never > > return nan. > > > >>>> np.nansum([np.nan, np.nan]) > > 1.#QNAN > > > > shouldn't this be zero > > I agree. > Would anyone be interested in ufuncs fadd/fsub that treated nans like zeros? Note the fmax.reduce can be used the implement nanmax. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Mon Jun 1 21:15:05 2009 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 1 Jun 2009 20:15:05 -0500 Subject: [Numpy-discussion] numpy.nansum() behavior in 1.3.0 In-Reply-To: References: <1cd32cbb0906011238o94e5eb3lcfbe52d346872bbb@mail.gmail.com> <4A2434D2.2040801@american.edu> <1cd32cbb0906011331s6e532b27g7e1f42d4b45c600e@mail.gmail.com> <3d375d730906011630t1eb79dc2ge038031d7b658d94@mail.gmail.com> <1cd32cbb0906011643xd391e36t7788d795b344809@mail.gmail.com> <1cd32cbb0906011650m7501838esc03ad68e6801fe2a@mail.gmail.com> <3d375d730906011730g2608e1a4p6e6f608f50326bac@mail.gmail.com> Message-ID: <3d375d730906011815t42f98bb4v6682aa0883079ea9@mail.gmail.com> On Mon, Jun 1, 2009 at 20:09, Charles R Harris wrote: > > On Mon, Jun 1, 2009 at 6:30 PM, Robert Kern wrote: >> >> On Mon, Jun 1, 2009 at 18:50, ? wrote: >> > On Mon, Jun 1, 2009 at 7:43 PM, ? wrote: >> >> >> is np.size the right check for non-empty array, including subtypes? >> >> Yes. >> >> >> i.e. >> >> >> >> if y.size and mask.all(): >> >> ? ? ? ?return np.nan >> >> >> >> or more explicit >> >> if y.size > 0 and mask.all(): >> >> ? ? ? ?return np.nan >> >> >> > >> > Actually, now I think this is the wrong behavior, nansum should never >> > return nan. >> > >> >>>> np.nansum([np.nan, np.nan]) >> > 1.#QNAN >> > >> > shouldn't this be zero >> >> I agree. 
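To make the agreed-on semantics concrete, here is a minimal sketch of a
nansum that can never return nan. This is only an illustration of the
behavior being converged on here, not the actual patch attached to
ticket #1123:

import numpy as np

def nansum_sketch(y, axis=None):
    # Per the docstring, NaNs are treated as zero, so an empty or
    # all-NaN input sums to 0.0 instead of propagating NaN.
    y = np.asanyarray(y).astype(float)
    y = np.where(np.isnan(y), 0.0, y)
    return y.sum(axis=axis)

# nansum_sketch([np.nan, np.nan]) -> 0.0
# nansum_sketch([])               -> 0.0

With these semantics the empty-array special case disappears, because the
sum of an empty array is already 0.0.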
> > Would anyone be interested in ufuncs fadd/fsub that treated nans like zeros? > Note the fmax.reduce can be used the implement nanmax. Just please don't call them fadd/fsub. The fmin and fmax names came from C99. The fact that they ignore NaNs has nothing to do with the naming; that's just the way C99 designed those particular functions. Better, to my mind, would be to make a new module with NaN-ignoring (or maybe just -aware) semantics. The ufuncs would then be named add/subtract/etc. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From josef.pktd at gmail.com Mon Jun 1 21:28:19 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 1 Jun 2009 21:28:19 -0400 Subject: [Numpy-discussion] numpy.nansum() behavior in 1.3.0 In-Reply-To: <3d375d730906011815t42f98bb4v6682aa0883079ea9@mail.gmail.com> References: <1cd32cbb0906011238o94e5eb3lcfbe52d346872bbb@mail.gmail.com> <4A2434D2.2040801@american.edu> <1cd32cbb0906011331s6e532b27g7e1f42d4b45c600e@mail.gmail.com> <3d375d730906011630t1eb79dc2ge038031d7b658d94@mail.gmail.com> <1cd32cbb0906011643xd391e36t7788d795b344809@mail.gmail.com> <1cd32cbb0906011650m7501838esc03ad68e6801fe2a@mail.gmail.com> <3d375d730906011730g2608e1a4p6e6f608f50326bac@mail.gmail.com> <3d375d730906011815t42f98bb4v6682aa0883079ea9@mail.gmail.com> Message-ID: <1cd32cbb0906011828lf4a1690k2f6f12e7c905f8ff@mail.gmail.com> On Mon, Jun 1, 2009 at 9:15 PM, Robert Kern wrote: > On Mon, Jun 1, 2009 at 20:09, Charles R Harris > wrote: >> >> On Mon, Jun 1, 2009 at 6:30 PM, Robert Kern wrote: >>> >>> On Mon, Jun 1, 2009 at 18:50, ? wrote: >>> > On Mon, Jun 1, 2009 at 7:43 PM, ? wrote: >>> >>> >> is np.size the right check for non-empty array, including subtypes? >>> >>> Yes. >>> >>> >> i.e. >>> >> >>> >> if y.size and mask.all(): >>> >> ? ? ? ?return np.nan >>> >> >>> >> or more explicit >>> >> if y.size > 0 and mask.all(): >>> >> ? ? ? ?return np.nan >>> >> >>> > >>> > Actually, now I think this is the wrong behavior, nansum should never >>> > return nan. >>> > >>> >>>> np.nansum([np.nan, np.nan]) >>> > 1.#QNAN >>> > >>> > shouldn't this be zero >>> >>> I agree. >> >> Would anyone be interested in ufuncs fadd/fsub that treated nans like zeros? >> Note the fmax.reduce can be used the implement nanmax. > > Just please don't call them fadd/fsub. The fmin and fmax names came > from C99. The fact that they ignore NaNs has nothing to do with the > naming; that's just the way C99 designed those particular functions. > Better, to my mind, would be to make a new module with NaN-ignoring > (or maybe just -aware) semantics. The ufuncs would then be named > add/subtract/etc. > my proposed fix for nansum also affects nanmax and similar, Is there a clear definition what the result should be for empty or all nan arrays? Josef >>> np.min([]) Traceback (most recent call last): ... ValueError: zero-size array to ufunc.reduce without identity >>> np.nanmin(np.nan) 1.#QNAN >>> nanmin(np.nan) # after changes to _nanop inf >>> np.nanargmin(np.nan) 1.#QNAN >>> nanargmin(np.nan) # after changes to _nanop 0 >>> nanmax(np.nan) # after changes to _nanop inf >>> np.nanmax(np.nan) 1.#QNAN >>> nanmin([]) # after changes to _nanop Traceback (most recent call last): ... 
ValueError: zero-size array to ufunc.reduce without identity >>> np.nanmin([]) 1.#QNAN >>> np.fmax.reduce(np.array(5)) Traceback (most recent call last): File "", line 1, in np.fmax.reduce(np.array(5)) TypeError: cannot reduce on a scalar >>> np.fmax.reduce([np.nan]) nan From josef.pktd at gmail.com Mon Jun 1 22:37:19 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 1 Jun 2009 22:37:19 -0400 Subject: [Numpy-discussion] how can one catch a multiarray.error Message-ID: <1cd32cbb0906011937p7df6ef97r19833b6ade7eac29@mail.gmail.com> how do we catch a multiarray.error in a try except clause? e.g. >>> np.argmin([]) Traceback (most recent call last): File "", line 1, in np.argmin([]) File "C:\Programs\Python25\Lib\site-packages\numpy\core\fromnumeric.py", line 631, in argmin return _wrapit(a, 'argmin', axis) File "C:\Programs\Python25\Lib\site-packages\numpy\core\fromnumeric.py", line 37, in _wrapit result = getattr(asarray(obj),method)(*args, **kwds) multiarray.error: attempt to get argmax/argmin of an empty sequence Josef From robert.kern at gmail.com Mon Jun 1 22:43:59 2009 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 1 Jun 2009 21:43:59 -0500 Subject: [Numpy-discussion] how can one catch a multiarray.error In-Reply-To: <1cd32cbb0906011937p7df6ef97r19833b6ade7eac29@mail.gmail.com> References: <1cd32cbb0906011937p7df6ef97r19833b6ade7eac29@mail.gmail.com> Message-ID: <3d375d730906011943o7a3be1f7h543fe9cfc7f3c196@mail.gmail.com> On Mon, Jun 1, 2009 at 21:37, wrote: > how do we catch a multiarray.error in a try except clause? > > e.g. >>>> np.argmin([]) > Traceback (most recent call last): > ?File "", line 1, in > ? ?np.argmin([]) > ?File "C:\Programs\Python25\Lib\site-packages\numpy\core\fromnumeric.py", > line 631, in argmin > ? ?return _wrapit(a, 'argmin', axis) > ?File "C:\Programs\Python25\Lib\site-packages\numpy\core\fromnumeric.py", > line 37, in _wrapit > ? ?result = getattr(asarray(obj),method)(*args, **kwds) > multiarray.error: attempt to get argmax/argmin of an empty sequence try: ... except numpy.core.multiarray.error: ... Unfortunately, that is still a string exception. We should change that. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From charlesr.harris at gmail.com Mon Jun 1 23:06:05 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 1 Jun 2009 21:06:05 -0600 Subject: [Numpy-discussion] how can one catch a multiarray.error In-Reply-To: <1cd32cbb0906011937p7df6ef97r19833b6ade7eac29@mail.gmail.com> References: <1cd32cbb0906011937p7df6ef97r19833b6ade7eac29@mail.gmail.com> Message-ID: On Mon, Jun 1, 2009 at 8:37 PM, wrote: > how do we catch a multiarray.error in a try except clause? > > e.g. > >>> np.argmin([]) > Traceback (most recent call last): > File "", line 1, in > np.argmin([]) > File "C:\Programs\Python25\Lib\site-packages\numpy\core\fromnumeric.py", > line 631, in argmin > return _wrapit(a, 'argmin', axis) > File "C:\Programs\Python25\Lib\site-packages\numpy\core\fromnumeric.py", > line 37, in _wrapit > result = getattr(asarray(obj),method)(*args, **kwds) > multiarray.error: attempt to get argmax/argmin of an empty sequence > What numpy version are you using? 
A ValueError is raised in recent versions: In [1]: np.argmin([]) --------------------------------------------------------------------------- ValueError Traceback (most recent call last) /home/charris/ in () /usr/lib/python2.5/site-packages/numpy/core/fromnumeric.pyc in argmin(a, axis) 629 argmin = a.argmin 630 except AttributeError: --> 631 return _wrapit(a, 'argmin', axis) 632 return argmin(axis) 633 /usr/lib/python2.5/site-packages/numpy/core/fromnumeric.pyc in _wrapit(obj, method, *args, **kwds) 35 except AttributeError: 36 wrap = None ---> 37 result = getattr(asarray(obj),method)(*args, **kwds) 38 if wrap: 39 if not isinstance(result, mu.ndarray): ValueError: attempt to get argmax/argmin of an empty sequence Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Mon Jun 1 23:27:30 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 1 Jun 2009 21:27:30 -0600 Subject: [Numpy-discussion] how can one catch a multiarray.error In-Reply-To: <3d375d730906011943o7a3be1f7h543fe9cfc7f3c196@mail.gmail.com> References: <1cd32cbb0906011937p7df6ef97r19833b6ade7eac29@mail.gmail.com> <3d375d730906011943o7a3be1f7h543fe9cfc7f3c196@mail.gmail.com> Message-ID: On Mon, Jun 1, 2009 at 8:43 PM, Robert Kern wrote: > On Mon, Jun 1, 2009 at 21:37, wrote: > > how do we catch a multiarray.error in a try except clause? > > > > e.g. > >>>> np.argmin([]) > > Traceback (most recent call last): > > File "", line 1, in > > np.argmin([]) > > File "C:\Programs\Python25\Lib\site-packages\numpy\core\fromnumeric.py", > > line 631, in argmin > > return _wrapit(a, 'argmin', axis) > > File "C:\Programs\Python25\Lib\site-packages\numpy\core\fromnumeric.py", > > line 37, in _wrapit > > result = getattr(asarray(obj),method)(*args, **kwds) > > multiarray.error: attempt to get argmax/argmin of an empty sequence > > try: > ... > except numpy.core.multiarray.error: > ... > > Unfortunately, that is still a string exception. We should change that. > All the string exception in the core/src are gone. Some remain in lib/src/_compiled_base.c numarray/_capi.c fft/fftpack_litemodule.c f2py (tons of them) I will open a ticket. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Mon Jun 1 23:31:22 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 1 Jun 2009 23:31:22 -0400 Subject: [Numpy-discussion] how can one catch a multiarray.error In-Reply-To: References: <1cd32cbb0906011937p7df6ef97r19833b6ade7eac29@mail.gmail.com> <3d375d730906011943o7a3be1f7h543fe9cfc7f3c196@mail.gmail.com> Message-ID: <1cd32cbb0906012031p4c385797x677eb4c1bc9eaa43@mail.gmail.com> On Mon, Jun 1, 2009 at 11:27 PM, Charles R Harris wrote: > > > On Mon, Jun 1, 2009 at 8:43 PM, Robert Kern wrote: >> >> On Mon, Jun 1, 2009 at 21:37, ? wrote: >> > how do we catch a multiarray.error in a try except clause? >> > >> > e.g. >> >>>> np.argmin([]) >> > Traceback (most recent call last): >> > ?File "", line 1, in >> > ? ?np.argmin([]) >> > ?File >> > "C:\Programs\Python25\Lib\site-packages\numpy\core\fromnumeric.py", >> > line 631, in argmin >> > ? ?return _wrapit(a, 'argmin', axis) >> > ?File >> > "C:\Programs\Python25\Lib\site-packages\numpy\core\fromnumeric.py", >> > line 37, in _wrapit >> > ? ?result = getattr(asarray(obj),method)(*args, **kwds) >> > multiarray.error: attempt to get argmax/argmin of an empty sequence >> >> try: >> ? ... >> except numpy.core.multiarray.error: >> ? ... 
>> >> Unfortunately, that is still a string exception. We should change that. > > All the string exception in the core/src are gone. Some remain in > > lib/src/_compiled_base.c > numarray/_capi.c > fft/fftpack_litemodule.c > f2py (tons of them) > > I will open a ticket. > I'm using the official windows installer (I think) >>> np.version.version '1.3.0' I was just surprised that "except Exception" didn't catch the multiarray.error, and I didn't find a reference/location for it. Josef From david at ar.media.kyoto-u.ac.jp Mon Jun 1 23:33:27 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Tue, 02 Jun 2009 12:33:27 +0900 Subject: [Numpy-discussion] Problem with correlate In-Reply-To: References: <871697.92371.qm@web86003.mail.ird.yahoo.com> <4A232C67.2080206@ar.media.kyoto-u.ac.jp> <4A234643.6040606@ar.media.kyoto-u.ac.jp> <4A236193.7030004@ar.media.kyoto-u.ac.jp> <3d375d730906010935n6c42717au8f224df393b1a7b1@mail.gmail.com> Message-ID: <4A249D87.20207@ar.media.kyoto-u.ac.jp> Charles R Harris wrote: > > > On Mon, Jun 1, 2009 at 11:48 AM, Charles R Harris > > wrote: > > > > On Mon, Jun 1, 2009 at 10:35 AM, Robert Kern > > wrote: > > On Mon, Jun 1, 2009 at 00:05, David Cournapeau > > wrote: > > > I think we should just fix it to use conjugate - I will do > this in the > > branch, and I will integrate it in the trunk later unless > someone stands > > up vehemently against the change. I opened up a ticket to > track this, > > though, > > It breaks everyone's code that works around the current behavior. > > > Maybe we need a new function. But what to call it? > > > How about introducing acorrelate and deprecating the old version? This does not solve the C function problem (PyArray_Correlate). The easy solution would be to keep the current C version, deal with the problem in python for acorrelate for the time being, and replace the old C function with the 'correct' one once we remove the deprecated correlate ? David From robert.kern at gmail.com Mon Jun 1 23:54:29 2009 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 1 Jun 2009 22:54:29 -0500 Subject: [Numpy-discussion] Problem with correlate In-Reply-To: <4A249D87.20207@ar.media.kyoto-u.ac.jp> References: <4A232C67.2080206@ar.media.kyoto-u.ac.jp> <4A234643.6040606@ar.media.kyoto-u.ac.jp> <4A236193.7030004@ar.media.kyoto-u.ac.jp> <3d375d730906010935n6c42717au8f224df393b1a7b1@mail.gmail.com> <4A249D87.20207@ar.media.kyoto-u.ac.jp> Message-ID: <3d375d730906012054h66d749eat395f86452aa1bfcd@mail.gmail.com> On Mon, Jun 1, 2009 at 22:33, David Cournapeau wrote: > Charles R Harris wrote: >> >> >> On Mon, Jun 1, 2009 at 11:48 AM, Charles R Harris >> > wrote: >> >> >> >> ? ? On Mon, Jun 1, 2009 at 10:35 AM, Robert Kern >> ? ? > wrote: >> >> ? ? ? ? On Mon, Jun 1, 2009 at 00:05, David Cournapeau >> ? ? ? ? > ? ? ? ? > wrote: >> >> ? ? ? ? > I think we should just fix it to use conjugate - I will do >> ? ? ? ? this in the >> ? ? ? ? > branch, and I will integrate it in the trunk later unless >> ? ? ? ? someone stands >> ? ? ? ? > up vehemently against the change. I opened up a ticket to >> ? ? ? ? track this, >> ? ? ? ? > though, >> >> ? ? ? ? It breaks everyone's code that works around the current behavior. >> >> >> ? ? Maybe we need a new function. But what to call it? >> >> >> How about introducing acorrelate and deprecating the old version? > > This does not solve the C function problem (PyArray_Correlate). 
The easy > solution would be to keep the current C version, deal with the problem > in python for acorrelate for the time being, and replace the old C > function with the 'correct' one once we remove the deprecated correlate ? No, you do the same thing at the C level. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From david at ar.media.kyoto-u.ac.jp Mon Jun 1 23:37:38 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Tue, 02 Jun 2009 12:37:38 +0900 Subject: [Numpy-discussion] Problem with correlate In-Reply-To: <3d375d730906012054h66d749eat395f86452aa1bfcd@mail.gmail.com> References: <4A232C67.2080206@ar.media.kyoto-u.ac.jp> <4A234643.6040606@ar.media.kyoto-u.ac.jp> <4A236193.7030004@ar.media.kyoto-u.ac.jp> <3d375d730906010935n6c42717au8f224df393b1a7b1@mail.gmail.com> <4A249D87.20207@ar.media.kyoto-u.ac.jp> <3d375d730906012054h66d749eat395f86452aa1bfcd@mail.gmail.com> Message-ID: <4A249E82.4030300@ar.media.kyoto-u.ac.jp> Robert Kern wrote: >> This does not solve the C function problem (PyArray_Correlate). The easy >> solution would be to keep the current C version, deal with the problem >> in python for acorrelate for the time being, and replace the old C >> function with the 'correct' one once we remove the deprecated correlate ? >> > > No, you do the same thing at the C level. > Ok, David From fperez.net at gmail.com Tue Jun 2 01:21:03 2009 From: fperez.net at gmail.com (Fernando Perez) Date: Mon, 1 Jun 2009 22:21:03 -0700 Subject: [Numpy-discussion] Tutorial topics for SciPy'09 Conference Message-ID: Hi all, The time for the Scipy'09 conference is rapidly approaching, and we would like to both announce the plan for tutorials and solicit feedback from everyone on topics of interest. Broadly speaking, the plan is something along the lines of what we had last year: one continuous 2-day tutorial aimed at introductory users, starting from the very basics, and in parallel a set of 'advanced' tutorials, consisting of a series of 2-hour sessions on specific topics. We will request that the presenters for the advanced tutorials keep the 'tutorial' word very much in mind, so that the sessions really contain hands-on learning work and not simply a 2-hour long slide presentation. We will thus require that all the tutorials will be based on tools that the attendees can install at least 2 weeks in advance on all platforms (no "I released it last night" software). With that in mind, we'd like feedback from all of you on possible topics for the advanced tutorials. We have space for 8 slots total, and here are in no particular order some possible topics. At this point there are no guarantees yet that we can get presentations for these, but we'd like to establish a first list of preferred topics to try and secure the presentations as soon as possible. 
This is simply a list of candidate topics that various people have
informally suggested so far:

- Mayavi/TVTK
- Advanced topics in matplotlib
- Statistics with Scipy
- The TimeSeries scikit
- Designing scientific interfaces with Traits
- Advanced numpy
- Sparse Linear Algebra with Scipy
- Structured and record arrays in numpy
- Cython
- Sage - general tutorial
- Sage - specific topics, suggestions welcome
- Using GPUs with PyCUDA
- Testing strategies for scientific codes
- Parallel processing and mpi4py
- Graph theory with Networkx
- Design patterns for efficient iterator-based scientific codes.
- Symbolic computing with sympy

We'd like to hear any ideas on other possible topics of interest, and
we'll then run a doodle poll to gather quantitative feedback with the
final list of candidates.

Many thanks,

f

From david at ar.media.kyoto-u.ac.jp Tue Jun 2 01:08:26 2009
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Tue, 02 Jun 2009 14:08:26 +0900
Subject: [Numpy-discussion] numpy.int32, type inheritance and tp_flags
Message-ID: <4A24B3CA.2080206@ar.media.kyoto-u.ac.jp>

Hi,

I have a question related to #1121
(http://projects.scipy.org/numpy/ticket/1121). With python 2.6,
PyInt_Check(a) if a is an instance of numpy.int32 does not work anymore.
I think this is related to the python issue 2263
(http://bugs.python.org/issue2263), where the tp_flags has been changed
for the python int object, a change which influences PyInt_Check behavior.

What should we do about it ? Right now, it looks like the bitfields are
hardcoded in scalar types - shouldn't we inherit them from the original
python types (in a field per field manner) instead ?

cheers,

David

From charlesr.harris at gmail.com Tue Jun 2 01:49:23 2009
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Mon, 1 Jun 2009 23:49:23 -0600
Subject: [Numpy-discussion] numpy.int32, type inheritance and tp_flags
In-Reply-To: <4A24B3CA.2080206@ar.media.kyoto-u.ac.jp>
References: <4A24B3CA.2080206@ar.media.kyoto-u.ac.jp>
Message-ID:

On Mon, Jun 1, 2009 at 11:08 PM, David Cournapeau <
david at ar.media.kyoto-u.ac.jp> wrote:

> Hi,
>
> I have a question related to #1121
> (http://projects.scipy.org/numpy/ticket/1121). With python 2.6,
> PyInt_Check(a) if a is an instance of numpy.int32 does not work anymore.
> I think this is related to the python issue 2263
> (http://bugs.python.org/issue2263), where the tp_flags has been changed
> for the python int object, a change which influences PyInt_Check behavior.
>
> What should we do about it ? Right now, it looks like the bitfields are
> hardcoded in scalar types - shouldn't we inherit them from the original
> python types (in a field per field manner) instead ?
>

Maybe, but why should it work for int32 anyway? IIRC, the python int type
has different lengths on windows and linux 64 bit systems. And what about
3.0?

I think we probably need to do something here, but I'm not sure what. The
different behavior of the numpy double and integer types corresponding to
the python types as opposed to the rest of the scalar types is an issue
that has annoyed me since forever.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From david at ar.media.kyoto-u.ac.jp Tue Jun 2 01:50:53 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Tue, 02 Jun 2009 14:50:53 +0900 Subject: [Numpy-discussion] numpy.int32, type inheritance and tp_flags In-Reply-To: References: <4A24B3CA.2080206@ar.media.kyoto-u.ac.jp> Message-ID: <4A24BDBD.3080706@ar.media.kyoto-u.ac.jp> Charles R Harris wrote: > > > On Mon, Jun 1, 2009 at 11:08 PM, David Cournapeau > > > wrote: > > Hi, > > I have a question related to #1121 > (http://projects.scipy.org/numpy/ticket/1121). With python 2.6, > PyInt_Check(a) if a is an instance of numpy.int32 does not work > anymore. > It think this is related to the python issue 2263 > (http://bugs.python.org/issue2263), where the tp_flags has been > changed > for the python int object, change which influences PyInt_Check > behavior. > > What should we do about it ? Right now, it looks like the > bitfields are > harcoded in scalar types - shouldn't we inherit them from the original > python types (in a field per field manner) instead ? > > > Maybe, but why should it work for int32 anyway? Because it does at the Python level ? issubclass(np.int32, int) # True And some code depends on this (I noticed the problem while tracking down some issues for scipy 0.7.x on python 2.6), although the code could be modified to not depend on it anymore I guess. > IIRC, the python int type has different lengths on windows and linux > 64 bit systems. Yes, because the underlying C type is a long (at least for python 2.5.4 as I read it from Include/intobject.h in the sources). Windows (with MS compilers at least) reserves 4 bytes only for long on 64 bits. But is numpy.int32 really a subclass of int on 64 bits ? I played a bit with numpy on python 2.4 64 bits (Linux): import numpy as np int(2**33) # Returns the right value, of type 'int' np.int32(2**33) # Oups, 0 On 32 bits: import numpy as np int(2*33) # Returns the right value, of type 'long' np.int32(2**33) # 66 ... > And what about 3.0? There is not python 2.* int anymore, only python 2.* long object (which becomes the sole int object on py3k). The PyInt_* apis are gone too, starting from 3.1. > > I think we probably need to do something here, but I'm not sure what. > The different behavior of the numpy double and integer types > corresponding to the python types as opposed to the rest of the scalar > types is an issue that has annoyed me since forever. I think for now I will just add a workaround in scipy. I don't understand much about scalar types, so I don't have a clue about what to do - I feel that it will be one dark area for 3.* porting, though :) cheers, David From charlesr.harris at gmail.com Tue Jun 2 02:11:00 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 2 Jun 2009 00:11:00 -0600 Subject: [Numpy-discussion] numpy.int32, type inheritance and tp_flags In-Reply-To: <4A24B3CA.2080206@ar.media.kyoto-u.ac.jp> References: <4A24B3CA.2080206@ar.media.kyoto-u.ac.jp> Message-ID: On Mon, Jun 1, 2009 at 11:08 PM, David Cournapeau < david at ar.media.kyoto-u.ac.jp> wrote: > Hi, > > I have a question related to #1121 > (http://projects.scipy.org/numpy/ticket/1121). With python 2.6, > PyInt_Check(a) if a is an instance of numpy.int32 does not work anymore. > It think this is related to the python issue 2263 > (http://bugs.python.org/issue2263), where the tp_flags has been changed > for the python int object, change which influences PyInt_Check behavior. 
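For reference, the check that changed reduces to a single flags test.
Roughly, paraphrasing the CPython 2.6 headers (this is CPython code, not
anything in numpy):

/* object.h / intobject.h, paraphrased */
#define PyType_HasFeature(t, f)   (((t)->tp_flags & (f)) != 0)
#define PyType_FastSubclass(t, f) PyType_HasFeature(t, f)
#define PyInt_Check(op) \
    PyType_FastSubclass((op)->ob_type, Py_TPFLAGS_INT_SUBCLASS)

So a subclass defined in C only passes PyInt_Check if it copies
Py_TPFLAGS_INT_SUBCLASS into its own tp_flags; nothing walks the base
classes anymore, which is why the hardcoded bitfields in the scalar types
suddenly matter.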
> It would be nice if the python folks would document Py_TPFLAGS_INT_SUBCLASS so we knew what it did. I also wonder if the problem with struct and the related bug with timeseries aren't python bugs. Shouldn't python be checking for conversion calls rather than an integer subclass? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Tue Jun 2 02:17:06 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 2 Jun 2009 00:17:06 -0600 Subject: [Numpy-discussion] numpy.int32, type inheritance and tp_flags In-Reply-To: <4A24BDBD.3080706@ar.media.kyoto-u.ac.jp> References: <4A24B3CA.2080206@ar.media.kyoto-u.ac.jp> <4A24BDBD.3080706@ar.media.kyoto-u.ac.jp> Message-ID: On Mon, Jun 1, 2009 at 11:50 PM, David Cournapeau < david at ar.media.kyoto-u.ac.jp> wrote: > Charles R Harris wrote: > > > > > > On Mon, Jun 1, 2009 at 11:08 PM, David Cournapeau > > > > > wrote: > > > > Hi, > > > > I have a question related to #1121 > > (http://projects.scipy.org/numpy/ticket/1121). With python 2.6, > > PyInt_Check(a) if a is an instance of numpy.int32 does not work > > anymore. > > It think this is related to the python issue 2263 > > (http://bugs.python.org/issue2263), where the tp_flags has been > > changed > > for the python int object, change which influences PyInt_Check > > behavior. > > > > What should we do about it ? Right now, it looks like the > > bitfields are > > harcoded in scalar types - shouldn't we inherit them from the > original > > python types (in a field per field manner) instead ? > > > > > > Maybe, but why should it work for int32 anyway? > > Because it does at the Python level ? > > issubclass(np.int32, int) # True > > And some code depends on this (I noticed the problem while tracking down > some issues for scipy 0.7.x on python 2.6), although the code could be > modified to not depend on it anymore I guess. > > > IIRC, the python int type has different lengths on windows and linux > > 64 bit systems. > > Yes, because the underlying C type is a long (at least for python 2.5.4 > as I read it from Include/intobject.h in the sources). Windows (with MS > compilers at least) reserves 4 bytes only for long on 64 bits. > > But is numpy.int32 really a subclass of int on 64 bits ? I played a bit > with numpy on python 2.4 64 bits (Linux): > No, IIRC, int64 is. You can see this in the different behavior, i.e., it doesn't act like the other numpy scalars. > > import numpy as np > int(2**33) # Returns the right value, of type 'int' > np.int32(2**33) # Oups, 0 > > On 32 bits: > > import numpy as np > int(2*33) # Returns the right value, of type 'long' > np.int32(2**33) # 66 ... > > And what about 3.0? > > There is not python 2.* int anymore, only python 2.* long object (which > becomes the sole int object on py3k). The PyInt_* apis are gone too, > starting from 3.1. > Exactly, and that's going to hurt. Macro time ;) > > > > > I think we probably need to do something here, but I'm not sure what. > > The different behavior of the numpy double and integer types > > corresponding to the python types as opposed to the rest of the scalar > > types is an issue that has annoyed me since forever. > > I think for now I will just add a workaround in scipy. That's probably the safest thing at the moment. > I don't > understand much about scalar types, so I don't have a clue about what to > do - I feel that it will be one dark area for 3.* porting, though :) > Me too, on all counts. 
Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From david at ar.media.kyoto-u.ac.jp Tue Jun 2 01:58:38 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Tue, 02 Jun 2009 14:58:38 +0900 Subject: [Numpy-discussion] numpy.int32, type inheritance and tp_flags In-Reply-To: References: <4A24B3CA.2080206@ar.media.kyoto-u.ac.jp> Message-ID: <4A24BF8E.2040307@ar.media.kyoto-u.ac.jp> Charles R Harris wrote: > > > On Mon, Jun 1, 2009 at 11:08 PM, David Cournapeau > > > wrote: > > Hi, > > I have a question related to #1121 > (http://projects.scipy.org/numpy/ticket/1121). With python 2.6, > PyInt_Check(a) if a is an instance of numpy.int32 does not work > anymore. > It think this is related to the python issue 2263 > > > (http://bugs.python.org/issue2263), where the tp_flags has been > changed > for the python int object, change which influences PyInt_Check > behavior. > > > It would be nice if the python folks would document > Py_TPFLAGS_INT_SUBCLASS so we knew what it did. I also wonder if the > problem with struct and the related bug with timeseries aren't python > bugs. Shouldn't python be checking for conversion calls rather than an > integer subclass? I found this while walking through the python hg log: http://www.mail-archive.com/python-dev at python.org/msg18140.html As I understand it, that's basically an optimization for fast subclass testing, and is indeed not documented. But instead of hard-coding the additional flag for types which support this in numpy, I think it would be better to have something which will not break again when another flag is added to some types. Specially since related bugs are quite hard to track. I don't know how to do it, though, as the python doc says that inheriting tp_flags is tricky... David From charlesr.harris at gmail.com Tue Jun 2 02:32:43 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 2 Jun 2009 00:32:43 -0600 Subject: [Numpy-discussion] numpy.int32, type inheritance and tp_flags In-Reply-To: <4A24BF8E.2040307@ar.media.kyoto-u.ac.jp> References: <4A24B3CA.2080206@ar.media.kyoto-u.ac.jp> <4A24BF8E.2040307@ar.media.kyoto-u.ac.jp> Message-ID: On Mon, Jun 1, 2009 at 11:58 PM, David Cournapeau < david at ar.media.kyoto-u.ac.jp> wrote: > Charles R Harris wrote: > > > > > > On Mon, Jun 1, 2009 at 11:08 PM, David Cournapeau > > > > > wrote: > > > > Hi, > > > > I have a question related to #1121 > > (http://projects.scipy.org/numpy/ticket/1121). With python 2.6, > > PyInt_Check(a) if a is an instance of numpy.int32 does not work > > anymore. > > It think this is related to the python issue 2263 > > > > > > (http://bugs.python.org/issue2263), where the tp_flags has been > > changed > > for the python int object, change which influences PyInt_Check > > behavior. > > > > > > It would be nice if the python folks would document > > Py_TPFLAGS_INT_SUBCLASS so we knew what it did. I also wonder if the > > problem with struct and the related bug with timeseries aren't python > > bugs. Shouldn't python be checking for conversion calls rather than an > > integer subclass? > > I found this while walking through the python hg log: > > http://www.mail-archive.com/python-dev at python.org/msg18140.html > Hmm, makes me think even more that the python code is the buggy one here. Why should it depend on inheritance from the int type? I mean, isn't that kind of limiting? All they need to know is that it can be *converted* to a python integer. It's like duck typing is being replaced by strict typing. 
Because it's faster to interpret (duh). Of course, I may have no idea what I'm talking about. > > As I understand it, that's basically an optimization for fast subclass > testing, and is indeed not documented. But instead of hard-coding the > additional flag for types which support this in numpy, I think it would > be better to have something which will not break again when another flag > is added to some types. There speaks the build guy ;) > Specially since related bugs are quite hard to > track. I don't know how to do it, though, as the python doc says that > inheriting tp_flags is tricky... > Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian.walter at gmail.com Tue Jun 2 04:42:22 2009 From: sebastian.walter at gmail.com (Sebastian Walter) Date: Tue, 2 Jun 2009 10:42:22 +0200 Subject: [Numpy-discussion] Multiplying Python float to numpy.array of objects works but fails with a numpy.float64, numpy Bug? Message-ID: Hello, Multiplying a Python float to a numpy.array of objects works flawlessly but not with a numpy.float64 . I tried numpy version '1.0.4' on a 32 bit Linux and '1.2.1' on a 64 bit Linux: both raise the same exception. Is this a (known) bug? ---------------------- test.py ------------------------------------ from numpy import * class adouble: def __init__(self,x): self.x = x def __mul__(self,rhs): if isinstance(rhs,adouble): return adouble(self.x * rhs.x) else: return adouble(self.x * rhs) def __str__(self): return str(self.x) x = adouble(3.) y = adouble(2.) u = array([adouble(3.), adouble(5.)]) v = array([adouble(2.), adouble(7.)]) z = array([2.,3.]) print x * y # ok print u * v # ok print u * z # ok print u * 3. # ok print u * z[0] # _NOT_ OK! print u * float64(3.) # _NOT_ OK! ---------------------- output --------------------------------- walter at wronski$ python test.py 6.0 [6.0 35.0] [6.0 15.0] [9.0 15.0] Traceback (most recent call last): File "test.py", line 24, in print u * z[0] # _NOT_ OK! TypeError: unsupported operand type(s) for *: 'numpy.ndarray' and 'numpy.float64' regards, Sebastian From faltet at pytables.org Tue Jun 2 04:58:01 2009 From: faltet at pytables.org (Francesc Alted) Date: Tue, 2 Jun 2009 10:58:01 +0200 Subject: [Numpy-discussion] Single precision equivalents of missing C99 functions In-Reply-To: References: <200906011822.03735.faltet@pytables.org> Message-ID: <200906021058.02083.faltet@pytables.org> A Monday 01 June 2009 20:26:27 Charles R Harris escrigu?: > > I suppose that the NumPy crew already experimented this divergence and > > finally > > used the cast approach for computing the single precision functions. > > It was inherited and was no doubt the simplest approach at the time. It has > always bothered me a bit, however, and if you have good single/long double > routines we should look at including them. It will affect the build so > David needs to weigh in here. Well, writing those routines is a matter of copy&paste and replace double by 'float' and 'long double'. That's all. > However, > > > this is effectively preventing the use of optimized functions for single > > precision (i.e. double precision 'exp' and 'log' are used instead of > > single precision specific 'expf' and 'logf'), which could perform > > potentially better. > > That depends on the architecture and how fast single vs double computations > are. I don't know how the timings compare on current machines. 
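One way to get a feel for the single vs. double timings on a given machine
is a quick probe along these lines (a rough sketch; the absolute and
relative numbers vary a lot with the compiler, libm and CPU):

import timeit

setup = "import numpy as np; x = np.random.rand(1000000).astype('%s')"
for dtype in ('float32', 'float64'):
    # best of 3 runs of 20 evaluations each
    t = min(timeit.Timer('np.exp(x)', setup % dtype).repeat(3, 20))
    print dtype, t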
I've conducted some benchmarks myself (using `expm1f()`), and the speedup
for using the native (but naive) single precision implementation is a mere
6% on Linux. However, I don't expect any speed-up at all on Windows on
Intel processors, as the single precision functions in this scenario are
simply defined as a macro that does the appropriate cast on double
precision ones (i.e. the current NumPy approach). By looking at the math.h
header file for MSVC 9, it seems that some architectures like AMD64 may
have an advantage here (the single precision functions are not simply
#define wrappers), but I don't have access to this architecture/OS
combination.

> > So, I'm wondering if it would not be better to use a native
> > implementation instead. Thoughts?
>
> Some benchmarks would be interesting. Could this be part of the corepy GSOC
> project?

From a performance point of view and provided that the speed-ups are not
too noticeable (at least on my tested architecture, namely Intel Core2), I
don't think this would be too interesting. A much better venue for people
really wanting high speed is to link against Intel MKL or AMD ACML. As a
matter of comparison, the MKL implementation for expm1f() takes just around
33 cycles/item and is 2x faster than the Linux/GCC implementation, and 5x
faster than the simple (and naive :) implementation that NumPy uses on
non-POSIX platforms that do not have an `expm1()` function (like
Windows/MSVC 9).

All in all, I don't think that bothering about this would be worth the
effort. So, I'll let Numexpr behave exactly as NumPy for this matter then
(if users need speed on Intel platforms they can always link it with MKL).

Thanks for the feedback anyway,

--
Francesc Alted

From cournape at gmail.com Tue Jun 2 06:36:25 2009
From: cournape at gmail.com (David Cournapeau)
Date: Tue, 2 Jun 2009 19:36:25 +0900
Subject: [Numpy-discussion] Problem with correlate
In-Reply-To: <4A249E82.4030300@ar.media.kyoto-u.ac.jp>
References: <4A234643.6040606@ar.media.kyoto-u.ac.jp>
	<4A236193.7030004@ar.media.kyoto-u.ac.jp>
	<3d375d730906010935n6c42717au8f224df393b1a7b1@mail.gmail.com>
	<4A249D87.20207@ar.media.kyoto-u.ac.jp>
	<3d375d730906012054h66d749eat395f86452aa1bfcd@mail.gmail.com>
	<4A249E82.4030300@ar.media.kyoto-u.ac.jp>
Message-ID: <5b8d13220906020336r58878a6fx298c9b7a900fcfc9@mail.gmail.com>

On Tue, Jun 2, 2009 at 12:37 PM, David Cournapeau wrote:
> Robert Kern wrote:
>>> This does not solve the C function problem (PyArray_Correlate). The easy
>>> solution would be to keep the current C version, deal with the problem
>>> in python for acorrelate for the time being, and replace the old C
>>> function with the 'correct' one once we remove the deprecated correlate ?
>>>
>>
>> No, you do the same thing at the C level.
>>

Done in r7031 - correlate/PyArray_Correlate should be unchanged, and
acorrelate/PyArray_Acorrelate implement the conventional definitions,

David

From robince at gmail.com Tue Jun 2 06:51:30 2009
From: robince at gmail.com (Robin)
Date: Tue, 2 Jun 2009 11:51:30 +0100
Subject: [Numpy-discussion] Problem with correlate
In-Reply-To: <5b8d13220906020336r58878a6fx298c9b7a900fcfc9@mail.gmail.com>
References: <4A236193.7030004@ar.media.kyoto-u.ac.jp>
	<3d375d730906010935n6c42717au8f224df393b1a7b1@mail.gmail.com>
	<4A249D87.20207@ar.media.kyoto-u.ac.jp>
	<3d375d730906012054h66d749eat395f86452aa1bfcd@mail.gmail.com>
	<4A249E82.4030300@ar.media.kyoto-u.ac.jp>
	<5b8d13220906020336r58878a6fx298c9b7a900fcfc9@mail.gmail.com>
Message-ID:

On Tue, Jun 2, 2009 at 11:36 AM, David Cournapeau wrote:
>
> Done in r7031 - correlate/PyArray_Correlate should be unchanged, and
> acorrelate/PyArray_Acorrelate implement the conventional definitions,

I don't know if it's been discussed before but while people are
thinking about/changing correlate I thought I'd like to request as a
user a matlab style xcorr function (basically with the functionality
of the matlab version).

I don't know if this is a deliberate omission, but it is often one of
the first things my colleagues try when I get them using Python, and
as far as I know there isn't really a good answer. There is xcorr in
pylab, but it isn't vectorised like xcorr from matlab...

Cheers

Robin

From david at ar.media.kyoto-u.ac.jp Tue Jun 2 06:59:07 2009
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Tue, 02 Jun 2009 19:59:07 +0900
Subject: [Numpy-discussion] Problem with correlate
In-Reply-To:
References: <4A236193.7030004@ar.media.kyoto-u.ac.jp>
	<3d375d730906010935n6c42717au8f224df393b1a7b1@mail.gmail.com>
	<4A249D87.20207@ar.media.kyoto-u.ac.jp>
	<3d375d730906012054h66d749eat395f86452aa1bfcd@mail.gmail.com>
	<4A249E82.4030300@ar.media.kyoto-u.ac.jp>
	<5b8d13220906020336r58878a6fx298c9b7a900fcfc9@mail.gmail.com>
Message-ID: <4A2505FB.9010602@ar.media.kyoto-u.ac.jp>

Robin wrote:
> On Tue, Jun 2, 2009 at 11:36 AM, David Cournapeau wrote:
>
>> Done in r7031 - correlate/PyArray_Correlate should be unchanged, and
>> acorrelate/PyArray_Acorrelate implement the conventional definitions,
>>
>
> I don't know if it's been discussed before but while people are
> thinking about/changing correlate I thought I'd like to request as a
> user a matlab style xcorr function (basically with the functionality
> of the matlab version).
>
> I don't know if this is a deliberate omission, but it is often one of
> the first things my colleagues try when I get them using Python, and
> as far as I know there isn't really a good answer. There is xcorr in
> pylab, but it isn't vectorised like xcorr from matlab...
>

There is one in the talkbox scikit:

http://github.com/cournape/talkbox/blob/202135a9d848931ebd036b97302f1e10d7488c63/scikits/talkbox/tools/correlations.py

It uses the fft, and as a bonus, the file is independent of the rest of
the toolbox. There is another version which uses a direct implementation
(this is faster if you need only a few lags, and it takes less memory
too).
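For the curious, the fft approach fits in a few lines. A rough sketch of a
full-lag, conjugated cross-correlation (this is not the talkbox code, just
the idea behind it):

import numpy as np

def xcorr_fft_sketch(a, v):
    # Zero-pad to the full linear length, multiply by the conjugate
    # spectrum, and reorder so lags run from -(len(v)-1) to len(a)-1.
    # For real inputs, take .real of the result.
    n = len(a) + len(v) - 1
    c = np.fft.ifft(np.fft.fft(a, n) * np.conj(np.fft.fft(v, n)))
    return np.roll(c, len(v) - 1)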
David From rmay31 at gmail.com Tue Jun 2 09:56:08 2009 From: rmay31 at gmail.com (Ryan May) Date: Tue, 2 Jun 2009 08:56:08 -0500 Subject: [Numpy-discussion] Problem with correlate In-Reply-To: <4A2505FB.9010602@ar.media.kyoto-u.ac.jp> References: <3d375d730906010935n6c42717au8f224df393b1a7b1@mail.gmail.com> <4A249D87.20207@ar.media.kyoto-u.ac.jp> <3d375d730906012054h66d749eat395f86452aa1bfcd@mail.gmail.com> <4A249E82.4030300@ar.media.kyoto-u.ac.jp> <5b8d13220906020336r58878a6fx298c9b7a900fcfc9@mail.gmail.com> <4A2505FB.9010602@ar.media.kyoto-u.ac.jp> Message-ID: On Tue, Jun 2, 2009 at 5:59 AM, David Cournapeau < david at ar.media.kyoto-u.ac.jp> wrote: > Robin wrote: > > On Tue, Jun 2, 2009 at 11:36 AM, David Cournapeau > wrote: > > > >> Done in r7031 - correlate/PyArray_Correlate should be unchanged, and > >> acorrelate/PyArray_Acorrelate implement the conventional definitions, > >> > > > > I don't know if it's been discussed before but while people are > > thinking about/changing correlate I thought I'd like to request as a > > user a matlab style xcorr function (basically with the functionality > > of the matlab version). > > > > I don't know if this is a deliberate emission, but it is often one of > > the first things my colleagues try when I get them using Python, and > > as far as I know there isn't really a good answer. There is xcorr in > > pylab, but it isn't vectorised like xcorr from matlab... > > > > There is one in the talkbox scikit: > > > http://github.com/cournape/talkbox/blob/202135a9d848931ebd036b97302f1e10d7488c63/scikits/talkbox/tools/correlations.py > > It uses the fft, and bonus point, the file is independent of the rest of > toolbox. There is another version which uses direct implementation (this > is faster if you need only a few lags, and it takes less memory too). I'd be +1 on including something like this (provided it expanded to include complex-valued data). I think it's a real need, since everyone seems to keep rolling their own. I had to write my own just so that I can calculate a few lags in a vectorized fashion. Ryan -- Ryan May Graduate Research Assistant School of Meteorology University of Oklahoma -------------- next part -------------- An HTML attachment was scrubbed... URL: From kwgoodman at gmail.com Tue Jun 2 10:09:31 2009 From: kwgoodman at gmail.com (Keith Goodman) Date: Tue, 2 Jun 2009 07:09:31 -0700 Subject: [Numpy-discussion] Multiplying Python float to numpy.array of objects works but fails with a numpy.float64, numpy Bug? In-Reply-To: References: Message-ID: On Tue, Jun 2, 2009 at 1:42 AM, Sebastian Walter wrote: > Hello, > Multiplying a Python float to a numpy.array of objects works flawlessly > but not with a numpy.float64 . > I tried ?numpy version '1.0.4' on a 32 bit Linux and ?'1.2.1' on a 64 > bit Linux: both raise the same exception. > > Is this a (known) bug? > > ---------------------- test.py ------------------------------------ > from numpy import * > > class adouble: > ? ? ? ?def __init__(self,x): > ? ? ? ? ? ? ? ?self.x = x > ? ? ? ?def __mul__(self,rhs): > ? ? ? ? ? ? ? ?if isinstance(rhs,adouble): > ? ? ? ? ? ? ? ? ? ? ? ?return adouble(self.x * rhs.x) > ? ? ? ? ? ? ? ?else: > ? ? ? ? ? ? ? ? ? ? ? ?return adouble(self.x * rhs) > ? ? ? ?def __str__(self): > ? ? ? ? ? ? ? ?return str(self.x) > > x = adouble(3.) > y = adouble(2.) > u = array([adouble(3.), adouble(5.)]) > v = array([adouble(2.), adouble(7.)]) > z = array([2.,3.]) > > print x * y ? ? ? ? ? ? ?# ok > print u * v ? ? ? ? ? ? ?# ok > print u * z ? ? ? ? ? ? 
?# ok > print u * 3. ? ? ? ? ? ? # ok > print u * z[0] ? ? ? ? ? # _NOT_ OK! > print u * float64(3.) ? ?# _NOT_ OK! > > > > ---------------------- output ? --------------------------------- > walter at wronski$ python test.py > 6.0 > [6.0 35.0] > [6.0 15.0] > [9.0 15.0] > Traceback (most recent call last): > ?File "test.py", line 24, in > ? ?print u * z[0] ? # _NOT_ OK! > TypeError: unsupported operand type(s) for *: 'numpy.ndarray' and > 'numpy.float64' Try adding __rmul__ = __mul__ like below: from numpy import * class adouble: def __init__(self,x): self.x = x def __mul__(self,rhs): if isinstance(rhs,adouble): return adouble(self.x * rhs.x) else: return adouble(self.x * rhs) __rmul__ = __mul__ def __str__(self): return str(self.x) def test(): x = adouble(3.) print 3 * x Output: >> test.test() 9.0 From dsdale24 at gmail.com Tue Jun 2 10:18:30 2009 From: dsdale24 at gmail.com (Darren Dale) Date: Tue, 2 Jun 2009 10:18:30 -0400 Subject: [Numpy-discussion] Multiplying Python float to numpy.array of objects works but fails with a numpy.float64, numpy Bug? In-Reply-To: References: Message-ID: On Tue, Jun 2, 2009 at 10:09 AM, Keith Goodman wrote: > On Tue, Jun 2, 2009 at 1:42 AM, Sebastian Walter > wrote: > > Hello, > > Multiplying a Python float to a numpy.array of objects works flawlessly > > but not with a numpy.float64 . > > I tried numpy version '1.0.4' on a 32 bit Linux and '1.2.1' on a 64 > > bit Linux: both raise the same exception. > > > > Is this a (known) bug? > Yes, it was fixed in numpy-1.3. -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian.walter at gmail.com Tue Jun 2 10:59:25 2009 From: sebastian.walter at gmail.com (Sebastian Walter) Date: Tue, 2 Jun 2009 16:59:25 +0200 Subject: [Numpy-discussion] Multiplying Python float to numpy.array of objects works but fails with a numpy.float64, numpy Bug? In-Reply-To: References: Message-ID: On Tue, Jun 2, 2009 at 4:18 PM, Darren Dale wrote: > > > On Tue, Jun 2, 2009 at 10:09 AM, Keith Goodman wrote: >> >> On Tue, Jun 2, 2009 at 1:42 AM, Sebastian Walter >> wrote: >> > Hello, >> > Multiplying a Python float to a numpy.array of objects works flawlessly >> > but not with a numpy.float64 . >> > I tried numpy version '1.0.4' on a 32 bit Linux and '1.2.1' on a 64 >> > bit Linux: both raise the same exception. >> > >> > Is this a (known) bug? > > Yes, it was fixed in numpy-1.3. ok, cool thanks for the info! > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From faltet at pytables.org Tue Jun 2 13:32:35 2009 From: faltet at pytables.org (Francesc Alted) Date: Tue, 2 Jun 2009 19:32:35 +0200 Subject: [Numpy-discussion] ANN: Numexpr 1.3 released Message-ID: <200906021932.35932.faltet@pytables.org> ======================== Announcing Numexpr 1.3 ======================== Numexpr is a fast numerical expression evaluator for NumPy. With it, expressions that operate on arrays (like "3*a+4*b") are accelerated and use less memory than doing the same calculation in Python. On this release, and due to popular demand, support for single precision floating point types has been added. This allows for both improved performance and optimal usage of memory for the single precision computations. Of course, support for single precision in combination with Intel's VML is there too :) However, caveat emptor: the casting rules for floating point types slightly differs from those of NumPy. 
See the ``Casting rules`` section at: http://code.google.com/p/numexpr/wiki/Overview or the README.txt file for more info on this issue. In case you want to know more in detail what has changed in this version, see: http://code.google.com/p/numexpr/wiki/ReleaseNotes or have a look at RELEASE_NOTES.txt in the tarball. Where I can find Numexpr? ========================= The project is hosted at Google code in: http://code.google.com/p/numexpr/ And you can get the packages from PyPI as well: http://pypi.python.org/pypi How it works? ============= See: http://code.google.com/p/numexpr/wiki/Overview for a detailed description by the original author (David M. Cooke). Share your experience ===================== Let us know of any bugs, suggestions, gripes, kudos, etc. you may have. Enjoy! -- Francesc Alted From Todd.Turner at wpafb.af.mil Tue Jun 2 14:19:13 2009 From: Todd.Turner at wpafb.af.mil (Turner, Todd J Civ USAF AFMC AFRL/RXLMP) Date: Tue, 2 Jun 2009 14:19:13 -0400 Subject: [Numpy-discussion] numpy issues at startup Message-ID: <1CDC952F2D441A479BCF66B7718987EB046BF5CD@VFOHMLAO03.Enterprise.afmc.ds.af.mil> I'm having some numpy problems when using the package with Python. My admin installed numpy for me and we think it's installed right. When I'm in python and type 'import numpy' I get the "Running numpy from source directory". It doesn't matter which directory I launch python out of, it always gives me the same errors and then can't do anything simple like numpy.ones() or any other routine. Any ideas? Do I have a path variable set wrong? Thanks for the help. Tj Turner -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Tue Jun 2 14:24:05 2009 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 2 Jun 2009 13:24:05 -0500 Subject: [Numpy-discussion] numpy issues at startup In-Reply-To: <1CDC952F2D441A479BCF66B7718987EB046BF5CD@VFOHMLAO03.Enterprise.afmc.ds.af.mil> References: <1CDC952F2D441A479BCF66B7718987EB046BF5CD@VFOHMLAO03.Enterprise.afmc.ds.af.mil> Message-ID: <3d375d730906021124v1e6d4dl8d0116b6dcbd8d0@mail.gmail.com> On Tue, Jun 2, 2009 at 13:19, Turner, Todd J Civ USAF AFMC AFRL/RXLMP wrote: > I?m having some numpy problems when using the package with Python.? My admin > installed numpy for me and we think it?s installed right.? When I?m in > python and type ?import numpy? I get the ?Running numpy from source > directory?. It doesn?t matter which directory I launch python out of, it > always gives me the same errors and then can?t do anything simple like > numpy.ones() or any other routine.? Any ideas?? Do I have a path variable > set wrong? Please do the following: $ python >>> import numpy >>> print numpy.__file__ That should show you where the numpy on your sys.path is. Double-check that it is in the place you think it is. If so, move to that directory (e.g. /usr/lib/python2.5/site-packages/numpy/) and show us what is in there: $ cd /usr/lib/python2.5/site-packages/numpy/ $ ls .... -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From Todd.Turner at wpafb.af.mil Tue Jun 2 14:35:47 2009 From: Todd.Turner at wpafb.af.mil (Turner, Todd J Civ USAF AFMC AFRL/RXLMP) Date: Tue, 2 Jun 2009 14:35:47 -0400 Subject: [Numpy-discussion] numpy issues at startup In-Reply-To: <3d375d730906021124v1e6d4dl8d0116b6dcbd8d0@mail.gmail.com> References: <1CDC952F2D441A479BCF66B7718987EB046BF5CD@VFOHMLAO03.Enterprise.afmc.ds.af.mil> <3d375d730906021124v1e6d4dl8d0116b6dcbd8d0@mail.gmail.com> Message-ID: <1CDC952F2D441A479BCF66B7718987EB046BF5F4@VFOHMLAO03.Enterprise.afmc.ds.af.mil> All right, here goes... >>> print numpy.__file__ /home/d1/turnertj/Python-2.5.4/lib/python2.5/site-packages/numpy/numpy/__init__.pyc [turnertj at daneel numpy]$ ls add_newdocs.py ctypeslib.pyc dual.py _import_tools.py lib numarray setup.py tests core distutils f2py __init__.py linalg oldnumeric setup.pyc version.py ctypeslib.py doc fft __init__.pyc matlib.py random testing version.pyc -----Original Message----- From: numpy-discussion-bounces at scipy.org [mailto:numpy-discussion-bounces at scipy.org] On Behalf Of Robert Kern Sent: Tuesday, June 02, 2009 2:24 PM To: Discussion of Numerical Python Subject: Re: [Numpy-discussion] numpy issues at startup On Tue, Jun 2, 2009 at 13:19, Turner, Todd J Civ USAF AFMC AFRL/RXLMP wrote: > I?m having some numpy problems when using the package with Python.? My admin > installed numpy for me and we think it?s installed right.? When I?m in > python and type ?import numpy? I get the ?Running numpy from source > directory?. It doesn?t matter which directory I launch python out of, it > always gives me the same errors and then can?t do anything simple like > numpy.ones() or any other routine.? Any ideas?? Do I have a path variable > set wrong? Please do the following: $ python >>> import numpy >>> print numpy.__file__ That should show you where the numpy on your sys.path is. Double-check that it is in the place you think it is. If so, move to that directory (e.g. /usr/lib/python2.5/site-packages/numpy/) and show us what is in there: $ cd /usr/lib/python2.5/site-packages/numpy/ $ ls .... -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco _______________________________________________ Numpy-discussion mailing list Numpy-discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion From robert.kern at gmail.com Tue Jun 2 14:41:43 2009 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 2 Jun 2009 13:41:43 -0500 Subject: [Numpy-discussion] numpy issues at startup In-Reply-To: <1CDC952F2D441A479BCF66B7718987EB046BF5F4@VFOHMLAO03.Enterprise.afmc.ds.af.mil> References: <1CDC952F2D441A479BCF66B7718987EB046BF5CD@VFOHMLAO03.Enterprise.afmc.ds.af.mil> <3d375d730906021124v1e6d4dl8d0116b6dcbd8d0@mail.gmail.com> <1CDC952F2D441A479BCF66B7718987EB046BF5F4@VFOHMLAO03.Enterprise.afmc.ds.af.mil> Message-ID: <3d375d730906021141v70191a86qdeae23169c9e6d06@mail.gmail.com> On Tue, Jun 2, 2009 at 13:35, Turner, Todd J Civ USAF AFMC AFRL/RXLMP wrote: > All right, here goes... > >>>> print numpy.__file__ > /home/d1/turnertj/Python-2.5.4/lib/python2.5/site-packages/numpy/numpy/__init__.pyc That looks like it was definitely installed incorrectly, then. It looks like the source package simply unpacked into site-packages. The installation instructions are given in INSTALL.txt. 
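The usual repair, sketched with illustrative paths (double-check them
before deleting anything): remove the mis-unpacked tree from
site-packages, then build and install from a scratch directory as
INSTALL.txt describes:

$ rm -rf ~/Python-2.5.4/lib/python2.5/site-packages/numpy   # the stray source tree
$ cd /tmp && tar xzf numpy-1.3.0.tar.gz && cd numpy-1.3.0
$ python setup.py install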
-- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From dwf at cs.toronto.edu Tue Jun 2 14:55:02 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Tue, 2 Jun 2009 14:55:02 -0400 Subject: [Numpy-discussion] Making NpzFiles behave more like dictionaries. Message-ID: Hi, It's occasionally annoyed me that NpzFiles can't be swapped in transparently for an in-memory dictionary since getting at the keys requires an attribute access. Below is a patch that implements some more of the dictionary interface for the NpzFile class. Any comments as to whether this is a good/bad idea, or about the specific implementation? Regards, David Index: io.py =================================================================== --- io.py (revision 7031) +++ io.py (working copy) @@ -118,6 +118,25 @@ else: raise KeyError, "%s is not a file in the archive" % key + def __iter__(self): + return iter(self.files) + + def items(self): + return [(f, self[f]) for f in self.files] + + def iteritems(self): + return ((f, self[f]) for f in self.files) + + def keys(self): + return self.files + + def iterkeys(self): + return self.__iter__() + + def __contains__(self, key): + return self.files.__contains__(key) + + def load(file, mmap_mode=None): """ Load a pickled, ``.npy``, or ``.npz`` binary file. From pav at iki.fi Tue Jun 2 15:06:51 2009 From: pav at iki.fi (Pauli Virtanen) Date: Tue, 2 Jun 2009 19:06:51 +0000 (UTC) Subject: [Numpy-discussion] Making NpzFiles behave more like dictionaries. References: Message-ID: Tue, 02 Jun 2009 14:55:02 -0400, David Warde-Farley wrote: > It's occasionally annoyed me that NpzFiles can't be swapped in > transparently for an in-memory dictionary since getting at the keys > requires an attribute access. Below is a patch that implements some more > of the dictionary interface for the NpzFile class. Any comments as to > whether this is a good/bad idea, or about the specific implementation? +0 I don't see any drawbacks, and the implementation looks good. -- Pauli Virtanen From rjsteed at talk21.com Tue Jun 2 15:16:44 2009 From: rjsteed at talk21.com (rob steed) Date: Tue, 2 Jun 2009 19:16:44 +0000 (GMT) Subject: [Numpy-discussion] Problem with correlate Message-ID: <241418.58749.qm@web86009.mail.ird.yahoo.com> I also think that the conjugate should be taken. I spent the last few weeks using correlate to experiment with signal processing and I got strange results until I realised that I had to manually take the conjugate. It would also be good if the function did it since applying the conjugate to the wrong sequence yields the complex conjugate of the correlation. Who would want to use the correlation without the conjugate, if someone is only using real values it won't affect them, if they are using complex values they probably want to conjugate. One function that does depend on correlate though is convolution! Changes made to correlate will affect it! but I have understand that a new function acorrelate is being created instead of changing correlate? Otherwise I've never used matlab but it does seem like xcorr has some good features. The modes 'same' and 'valid' were initially quite confusing especially as the default is 'valid', meaning that autocorrelations lead to a single value by default! I also think that having weighting options would be good. 
I now understand the complexities of the various weightings that can be applied to the correlation i.e. biased vs unbiased but I think that having correlate include these options might prompt users to investigate which one they really needed. Correlate seemed so simple when I first used it but it took me ages to realise that these are choices to be made. regards Rob From charlesr.harris at gmail.com Tue Jun 2 22:16:40 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 2 Jun 2009 20:16:40 -0600 Subject: [Numpy-discussion] Problem with correlate In-Reply-To: <241418.58749.qm@web86009.mail.ird.yahoo.com> References: <241418.58749.qm@web86009.mail.ird.yahoo.com> Message-ID: On Tue, Jun 2, 2009 at 1:16 PM, rob steed wrote: > > I also think that the conjugate should be taken. I spent the last few weeks > using correlate to experiment with > signal processing and I got strange results until I realised that I had to > manually take the conjugate. It > would also be good if the function did it since applying the conjugate to > the wrong sequence yields the > complex conjugate of the correlation. > > Who would want to use the correlation without the conjugate, if someone is > only using real values it won't > affect them, if they are using complex values they probably want to > conjugate. > > One function that does depend on correlate though is convolution! Changes > made to correlate will > affect it! but I have understand that a new function acorrelate is being > created instead of changing > correlate? > > Otherwise I've never used matlab but it does seem like xcorr has some good > features. The modes > 'same' and 'valid' were initially quite confusing especially as the default > is 'valid', meaning that autocorrelations > lead to a single value by default! > > I also think that having weighting options would be good. I now understand > the complexities of the various > weightings that can be applied to the correlation i.e. biased vs unbiased > but I think that having correlate > include these options might prompt users to investigate which one they > really needed. Correlate seemed > so simple when I first used it but it took me ages to realise that these > are choices to be made. > I wonder if xcorrelate would be a better name than acorrelate? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Tue Jun 2 22:20:39 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 2 Jun 2009 20:20:39 -0600 Subject: [Numpy-discussion] how can one catch a multiarray.error In-Reply-To: <3d375d730906011943o7a3be1f7h543fe9cfc7f3c196@mail.gmail.com> References: <1cd32cbb0906011937p7df6ef97r19833b6ade7eac29@mail.gmail.com> <3d375d730906011943o7a3be1f7h543fe9cfc7f3c196@mail.gmail.com> Message-ID: On Mon, Jun 1, 2009 at 8:43 PM, Robert Kern wrote: > On Mon, Jun 1, 2009 at 21:37, wrote: > > how do we catch a multiarray.error in a try except clause? > > > > e.g. > >>>> np.argmin([]) > > Traceback (most recent call last): > > File "", line 1, in > > np.argmin([]) > > File "C:\Programs\Python25\Lib\site-packages\numpy\core\fromnumeric.py", > > line 631, in argmin > > return _wrapit(a, 'argmin', axis) > > File "C:\Programs\Python25\Lib\site-packages\numpy\core\fromnumeric.py", > > line 37, in _wrapit > > result = getattr(asarray(obj),method)(*args, **kwds) > > multiarray.error: attempt to get argmax/argmin of an empty sequence > > try: > ... > except numpy.core.multiarray.error: > ... 
> > Unfortunately, that is still a string exception. We should change that. > I'm fixing these, but doesn't that constitute an abi change? Code that used to catch the exceptions won't anymore. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From david at ar.media.kyoto-u.ac.jp Tue Jun 2 22:03:02 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Wed, 03 Jun 2009 11:03:02 +0900 Subject: [Numpy-discussion] Problem with correlate In-Reply-To: References: <241418.58749.qm@web86009.mail.ird.yahoo.com> Message-ID: <4A25D9D6.3020602@ar.media.kyoto-u.ac.jp> Charles R Harris wrote: > > > > I also think that having weighting options would be good. I now > understand the complexities of the various > weightings that can be applied to the correlation i.e. biased vs > unbiased but I think that having correlate > include these options might prompt users to investigate which one > they really needed. Correlate seemed > so simple when I first used it but it took me ages to realise that > these are choices to be made. > > > I wonder if xcorrelate would be a better name than acorrelate? It may be confusing, because what acorrelate currently does is not what xcorr does under matlab. Under matlab, xcorr computes the 1d cross correlation column-wise (would be axis wise under numpy). If I get the time today, I will try to implement a first version using generalized ufunc - that would be a good occasion to learn generalized ufunc. If that works well, maybe there is no need for the current acorrelate at all (but we should still keep PyArray_Acorrelate, I think). cheers, David From robert.kern at gmail.com Tue Jun 2 22:25:10 2009 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 2 Jun 2009 21:25:10 -0500 Subject: [Numpy-discussion] how can one catch a multiarray.error In-Reply-To: References: <1cd32cbb0906011937p7df6ef97r19833b6ade7eac29@mail.gmail.com> <3d375d730906011943o7a3be1f7h543fe9cfc7f3c196@mail.gmail.com> Message-ID: <3d375d730906021925p699ad31eh778fa46bdf7ce0cf@mail.gmail.com> On Tue, Jun 2, 2009 at 21:20, Charles R Harris wrote: > > > On Mon, Jun 1, 2009 at 8:43 PM, Robert Kern wrote: >> >> On Mon, Jun 1, 2009 at 21:37, ? wrote: >> > how do we catch a multiarray.error in a try except clause? >> > >> > e.g. >> >>>> np.argmin([]) >> > Traceback (most recent call last): >> > ?File "", line 1, in >> > ? ?np.argmin([]) >> > ?File >> > "C:\Programs\Python25\Lib\site-packages\numpy\core\fromnumeric.py", >> > line 631, in argmin >> > ? ?return _wrapit(a, 'argmin', axis) >> > ?File >> > "C:\Programs\Python25\Lib\site-packages\numpy\core\fromnumeric.py", >> > line 37, in _wrapit >> > ? ?result = getattr(asarray(obj),method)(*args, **kwds) >> > multiarray.error: attempt to get argmax/argmin of an empty sequence >> >> try: >> ? ... >> except numpy.core.multiarray.error: >> ? ... >> >> Unfortunately, that is still a string exception. We should change that. > > I'm fixing these, but doesn't that constitute an abi change? Code that used > to catch the exceptions won't anymore. If they are catching numpy.core.multiarray.error, then it's not a problem. They never should have been catching "multiarray.error". -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From josef.pktd at gmail.com Tue Jun 2 22:36:58 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 2 Jun 2009 22:36:58 -0400 Subject: [Numpy-discussion] how can one catch a multiarray.error In-Reply-To: <3d375d730906021925p699ad31eh778fa46bdf7ce0cf@mail.gmail.com> References: <1cd32cbb0906011937p7df6ef97r19833b6ade7eac29@mail.gmail.com> <3d375d730906011943o7a3be1f7h543fe9cfc7f3c196@mail.gmail.com> <3d375d730906021925p699ad31eh778fa46bdf7ce0cf@mail.gmail.com> Message-ID: <1cd32cbb0906021936g79af2787qc30a1d8200c35713@mail.gmail.com> On Tue, Jun 2, 2009 at 10:25 PM, Robert Kern wrote: > On Tue, Jun 2, 2009 at 21:20, Charles R Harris > wrote: >> >> >> On Mon, Jun 1, 2009 at 8:43 PM, Robert Kern wrote: >>> >>> On Mon, Jun 1, 2009 at 21:37, ? wrote: >>> > how do we catch a multiarray.error in a try except clause? >>> > >>> > e.g. >>> >>>> np.argmin([]) >>> > Traceback (most recent call last): >>> > ?File "", line 1, in >>> > ? ?np.argmin([]) >>> > ?File >>> > "C:\Programs\Python25\Lib\site-packages\numpy\core\fromnumeric.py", >>> > line 631, in argmin >>> > ? ?return _wrapit(a, 'argmin', axis) >>> > ?File >>> > "C:\Programs\Python25\Lib\site-packages\numpy\core\fromnumeric.py", >>> > line 37, in _wrapit >>> > ? ?result = getattr(asarray(obj),method)(*args, **kwds) >>> > multiarray.error: attempt to get argmax/argmin of an empty sequence >>> >>> try: >>> ? ... >>> except numpy.core.multiarray.error: >>> ? ... >>> >>> Unfortunately, that is still a string exception. We should change that. >> >> I'm fixing these, but doesn't that constitute an abi change? Code that used >> to catch the exceptions won't anymore. > > If they are catching numpy.core.multiarray.error, then it's not a > problem. They never should have been catching "multiarray.error". > But in my example it was the only way that I found to catch this error, except with an empty except clause. So someone might have also used it this way. I guess this would be an API change in my example. Josef From charlesr.harris at gmail.com Tue Jun 2 22:41:22 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 2 Jun 2009 20:41:22 -0600 Subject: [Numpy-discussion] how can one catch a multiarray.error In-Reply-To: <3d375d730906021925p699ad31eh778fa46bdf7ce0cf@mail.gmail.com> References: <1cd32cbb0906011937p7df6ef97r19833b6ade7eac29@mail.gmail.com> <3d375d730906011943o7a3be1f7h543fe9cfc7f3c196@mail.gmail.com> <3d375d730906021925p699ad31eh778fa46bdf7ce0cf@mail.gmail.com> Message-ID: On Tue, Jun 2, 2009 at 8:25 PM, Robert Kern wrote: > On Tue, Jun 2, 2009 at 21:20, Charles R Harris > wrote: > > > > > > On Mon, Jun 1, 2009 at 8:43 PM, Robert Kern > wrote: > >> > >> On Mon, Jun 1, 2009 at 21:37, wrote: > >> > how do we catch a multiarray.error in a try except clause? > >> > > >> > e.g. > >> >>>> np.argmin([]) > >> > Traceback (most recent call last): > >> > File "", line 1, in > >> > np.argmin([]) > >> > File > >> > "C:\Programs\Python25\Lib\site-packages\numpy\core\fromnumeric.py", > >> > line 631, in argmin > >> > return _wrapit(a, 'argmin', axis) > >> > File > >> > "C:\Programs\Python25\Lib\site-packages\numpy\core\fromnumeric.py", > >> > line 37, in _wrapit > >> > result = getattr(asarray(obj),method)(*args, **kwds) > >> > multiarray.error: attempt to get argmax/argmin of an empty sequence > >> > >> try: > >> ... > >> except numpy.core.multiarray.error: > >> ... > >> > >> Unfortunately, that is still a string exception. We should change that. 
> > > > I'm fixing these, but doesn't that constitute an abi change? Code that > used > > to catch the exceptions won't anymore. > > If they are catching numpy.core.multiarray.error, then it's not a > problem. They never should have been catching "multiarray.error". > Well, I just removed them from lib/src/_compiled_base.c also. I suppose catching string errors from any of numpy's routines could be considered an error, but it was the only way folks could do some of these things before. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Tue Jun 2 22:59:15 2009 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 2 Jun 2009 21:59:15 -0500 Subject: [Numpy-discussion] how can one catch a multiarray.error In-Reply-To: References: <1cd32cbb0906011937p7df6ef97r19833b6ade7eac29@mail.gmail.com> <3d375d730906011943o7a3be1f7h543fe9cfc7f3c196@mail.gmail.com> <3d375d730906021925p699ad31eh778fa46bdf7ce0cf@mail.gmail.com> Message-ID: <3d375d730906021959p41556e40qf38efbdade19a75a@mail.gmail.com> On Tue, Jun 2, 2009 at 21:41, Charles R Harris wrote: > > > On Tue, Jun 2, 2009 at 8:25 PM, Robert Kern wrote: >> >> On Tue, Jun 2, 2009 at 21:20, Charles R Harris >> wrote: >> > >> > >> > On Mon, Jun 1, 2009 at 8:43 PM, Robert Kern >> > wrote: >> >> >> >> On Mon, Jun 1, 2009 at 21:37, ? wrote: >> >> > how do we catch a multiarray.error in a try except clause? >> >> > >> >> > e.g. >> >> >>>> np.argmin([]) >> >> > Traceback (most recent call last): >> >> > ?File "", line 1, in >> >> > ? ?np.argmin([]) >> >> > ?File >> >> > "C:\Programs\Python25\Lib\site-packages\numpy\core\fromnumeric.py", >> >> > line 631, in argmin >> >> > ? ?return _wrapit(a, 'argmin', axis) >> >> > ?File >> >> > "C:\Programs\Python25\Lib\site-packages\numpy\core\fromnumeric.py", >> >> > line 37, in _wrapit >> >> > ? ?result = getattr(asarray(obj),method)(*args, **kwds) >> >> > multiarray.error: attempt to get argmax/argmin of an empty sequence >> >> >> >> try: >> >> ? ... >> >> except numpy.core.multiarray.error: >> >> ? ... >> >> >> >> Unfortunately, that is still a string exception. We should change that. >> > >> > I'm fixing these, but doesn't that constitute an abi change? Code that >> > used >> > to catch the exceptions won't anymore. >> >> If they are catching numpy.core.multiarray.error, then it's not a >> problem. They never should have been catching "multiarray.error". > > Well, I just removed them from? lib/src/_compiled_base.c also. I suppose > catching string errors from any of numpy's routines could be considered an > error, but it was the only way folks could do some of these things before. For multiarray.error and _compiled_base.error, these strings were exposed in the modules. The one in numpy.lib.function_base.histogramdd() is a little more problematic, but the f2py cases aren't. f2py is not used as a library often enough; I doubt anyone is actually trying to capture those long, human-readable exception strings. In any case, we need to do this ASAP. String exceptions are long-deprecated and are now causing DeprecationWarnings in the interpreter. Changing exceptions is generally considered an API change, not an ABI change. We should document it, of course, but I see no reason not to do it. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From charlesr.harris at gmail.com Tue Jun 2 23:21:03 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 2 Jun 2009 21:21:03 -0600 Subject: [Numpy-discussion] how can one catch a multiarray.error In-Reply-To: <3d375d730906021959p41556e40qf38efbdade19a75a@mail.gmail.com> References: <1cd32cbb0906011937p7df6ef97r19833b6ade7eac29@mail.gmail.com> <3d375d730906011943o7a3be1f7h543fe9cfc7f3c196@mail.gmail.com> <3d375d730906021925p699ad31eh778fa46bdf7ce0cf@mail.gmail.com> <3d375d730906021959p41556e40qf38efbdade19a75a@mail.gmail.com> Message-ID: On Tue, Jun 2, 2009 at 8:59 PM, Robert Kern wrote: > On Tue, Jun 2, 2009 at 21:41, Charles R Harris > wrote: > > > > > > On Tue, Jun 2, 2009 at 8:25 PM, Robert Kern > wrote: > >> > >> On Tue, Jun 2, 2009 at 21:20, Charles R Harris > >> wrote: > >> > > >> > > >> > On Mon, Jun 1, 2009 at 8:43 PM, Robert Kern > >> > wrote: > >> >> > >> >> On Mon, Jun 1, 2009 at 21:37, wrote: > >> >> > how do we catch a multiarray.error in a try except clause? > >> >> > > >> >> > e.g. > >> >> >>>> np.argmin([]) > >> >> > Traceback (most recent call last): > >> >> > File "", line 1, in > >> >> > np.argmin([]) > >> >> > File > >> >> > "C:\Programs\Python25\Lib\site-packages\numpy\core\fromnumeric.py", > >> >> > line 631, in argmin > >> >> > return _wrapit(a, 'argmin', axis) > >> >> > File > >> >> > "C:\Programs\Python25\Lib\site-packages\numpy\core\fromnumeric.py", > >> >> > line 37, in _wrapit > >> >> > result = getattr(asarray(obj),method)(*args, **kwds) > >> >> > multiarray.error: attempt to get argmax/argmin of an empty sequence > >> >> > >> >> try: > >> >> ... > >> >> except numpy.core.multiarray.error: > >> >> ... > >> >> > >> >> Unfortunately, that is still a string exception. We should change > that. > >> > > >> > I'm fixing these, but doesn't that constitute an abi change? Code that > >> > used > >> > to catch the exceptions won't anymore. > >> > >> If they are catching numpy.core.multiarray.error, then it's not a > >> problem. They never should have been catching "multiarray.error". > > > > Well, I just removed them from lib/src/_compiled_base.c also. I suppose > > catching string errors from any of numpy's routines could be considered > an > > error, but it was the only way folks could do some of these things > before. > > For multiarray.error and _compiled_base.error, these strings were > exposed in the modules. The one in > numpy.lib.function_base.histogramdd() is a little more problematic, > but the f2py cases aren't. f2py is not used as a library often enough; > I doubt anyone is actually trying to capture those long, > human-readable exception strings. > > In any case, we need to do this ASAP. String exceptions are > long-deprecated and are now causing DeprecationWarnings in the > interpreter. Changing exceptions is generally considered an API > change, not an ABI change. We should document it, of course, but I see > no reason not to do it. > OK. I left the strings exposed in the modules with a fixme notes. Do you think they should be removed while we are at it? Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From robert.kern at gmail.com Tue Jun 2 23:22:54 2009 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 2 Jun 2009 22:22:54 -0500 Subject: [Numpy-discussion] how can one catch a multiarray.error In-Reply-To: References: <1cd32cbb0906011937p7df6ef97r19833b6ade7eac29@mail.gmail.com> <3d375d730906011943o7a3be1f7h543fe9cfc7f3c196@mail.gmail.com> <3d375d730906021925p699ad31eh778fa46bdf7ce0cf@mail.gmail.com> <3d375d730906021959p41556e40qf38efbdade19a75a@mail.gmail.com> Message-ID: <3d375d730906022022n626c032je68aab0ebf593552@mail.gmail.com> On Tue, Jun 2, 2009 at 22:21, Charles R Harris wrote: > OK. I left the strings exposed in the modules with a fixme notes. Do you > think they should be removed while we are at it? I think they should be replaced with the Exception classes that are being raised in their place. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From charlesr.harris at gmail.com Tue Jun 2 23:30:08 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 2 Jun 2009 21:30:08 -0600 Subject: [Numpy-discussion] how can one catch a multiarray.error In-Reply-To: <3d375d730906022022n626c032je68aab0ebf593552@mail.gmail.com> References: <1cd32cbb0906011937p7df6ef97r19833b6ade7eac29@mail.gmail.com> <3d375d730906011943o7a3be1f7h543fe9cfc7f3c196@mail.gmail.com> <3d375d730906021925p699ad31eh778fa46bdf7ce0cf@mail.gmail.com> <3d375d730906021959p41556e40qf38efbdade19a75a@mail.gmail.com> <3d375d730906022022n626c032je68aab0ebf593552@mail.gmail.com> Message-ID: On Tue, Jun 2, 2009 at 9:22 PM, Robert Kern wrote: > On Tue, Jun 2, 2009 at 22:21, Charles R Harris > wrote: > > > OK. I left the strings exposed in the modules with a fixme notes. Do you > > think they should be removed while we are at it? > > I think they should be replaced with the Exception classes that are > being raised in their place. > Hmm, all the replacement exceptions are standard python exceptions and for the most part already were in multiarray. How about replacing the strings with PyExc_Exception. I think that should catch everything. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From rjsteed at talk21.com Wed Jun 3 08:08:48 2009 From: rjsteed at talk21.com (rob steed) Date: Wed, 3 Jun 2009 12:08:48 +0000 (GMT) Subject: [Numpy-discussion] Problem with correlate In-Reply-To: References: Message-ID: <659535.63467.qm@web86003.mail.ird.yahoo.com> > I wonder if xcorrelate would be a better name than acorrelate? I think it would. From j.m.girven at warwick.ac.uk Wed Jun 3 11:06:12 2009 From: j.m.girven at warwick.ac.uk (D2Hitman) Date: Wed, 3 Jun 2009 08:06:12 -0700 (PDT) Subject: [Numpy-discussion] field names on numpy arrays Message-ID: <23852413.post@talk.nabble.com> Hi, I would like to have an object/class that acts like array of floats such as: a_array = numpy.array([[0.,1.,2.,3.,4.],[1.,2.,3.,4.,5.]]) but i would like to be able to slice this array by some header dictionary: header_dict = {'a':0,'b':1,'c':2,'d':3,'e':4} such that i could use a_array['a'], which would get slice=header_dict['a'], slices a_array[:,slice] and return it. I understand record arrays such as: a_array = np.array([(0.,1.,2.,3.,4.),(1.,2.,3.,4.,5.)],dtype=[('a','f'),('b','f'),('c','f'),('d','f'),('e','f')]) do this with field names. 
a_array['a'] = array([ 0., 1.], dtype=float32) however i seem to lose simple operations such as multiplication (a_array*2) or powers (a_array**2). Is there something that does this? Or how would i go about creating an object/class that inherits all properties from numpy.array, but adds in a header to select columns? a_array = MyArray([(0.,1.,2.,3.,4.),(1.,2.,3.,4.,5.)], header=['a','b','c','d','e']) Thanks, Jon. -- View this message in context: http://www.nabble.com/field-names-on-numpy-arrays-tp23852413p23852413.html Sent from the Numpy-discussion mailing list archive at Nabble.com. From pgmdevlist at gmail.com Wed Jun 3 11:46:08 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Wed, 3 Jun 2009 11:46:08 -0400 Subject: [Numpy-discussion] field names on numpy arrays In-Reply-To: <23852413.post@talk.nabble.com> References: <23852413.post@talk.nabble.com> Message-ID: On Jun 3, 2009, at 11:06 AM, D2Hitman wrote: > > Hi, > > I would like to have an object/class that acts like array of floats > such as: > a_array = numpy.array([[0.,1.,2.,3.,4.],[1.,2.,3.,4.,5.]]) > but i would like to be able to slice this array by some header > dictionary: > header_dict = {'a':0,'b':1,'c':2,'d':3,'e':4} > such that i could use a_array['a'], > which would get slice=header_dict['a'], > slices a_array[:,slice] > and return it. > > I understand record arrays such as: > ... > however i seem to lose simple operations such as multiplication > (a_array*2) > or powers (a_array**2). Indeed. Structured arrays don't support many operations (because they're fairly general. For example, one field could be a string, how do you define * for strings ?) > Is there something that does this? Or how would i go about creating an > object/class that inherits all properties from numpy.array, but adds > in a > header to select columns? Quick shot (not tested): Subclass ndarray, define headdict as an attribute, overwrite __getitem__ w/ your own, where you test on the input: * if the input item is a string then take the corresponding column from your headdict * if the input item is anything else, use ndarray.__getitem__ Info on subclassing: http://docs.scipy.org/doc/numpy/user/basics.subclassing.html Note that you may want to put some fences here and there to prevent reshaping (else your headdict will become useless). As a side-note, I think this subject has been talked about on this mailing list (or scipy's). Sorry, no particular, but you may want to check the archives. From stefan at sun.ac.za Wed Jun 3 14:51:09 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Wed, 3 Jun 2009 20:51:09 +0200 Subject: [Numpy-discussion] field names on numpy arrays In-Reply-To: <23852413.post@talk.nabble.com> References: <23852413.post@talk.nabble.com> Message-ID: <9457e7c80906031151r5d007a3dye89f38482be41a4c@mail.gmail.com> Hi Jon 2009/6/3 D2Hitman : > I understand record arrays such as: > a_array = > np.array([(0.,1.,2.,3.,4.),(1.,2.,3.,4.,5.)],dtype=[('a','f'),('b','f'),('c','f'),('d','f'),('e','f')]) > do this with field names. > a_array['a'] = array([ 0., ?1.], dtype=float32) > however i seem to lose simple operations such as multiplication (a_array*2) > or powers (a_array**2). As a workaround, you can have two views on your data: n [39]: x Out[39]: array([(0.0, 1.0, 2.0, 3.0, 4.0), (1.0, 2.0, 3.0, 4.0, 5.0)], dtype=[('a', ' Hi all, I posted this message couple of days ago, but gmane grouped it with an old thread and it hasn't shown up on the front page. So here it is again... 
I'd really like to see the setmember1d_nu function in ticket 1036 get into numpy. There's a patch waiting for review that includes tests: http://projects.scipy.org/numpy/ticket/1036 Is there anything I can do to help get it applied? Neil

From dwf at cs.toronto.edu Wed Jun 3 16:05:51 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Wed, 3 Jun 2009 16:05:51 -0400 Subject: [Numpy-discussion] [RFR] NpzFile tweaks (Re: Making NpzFiles behave more like dictionaries.) In-Reply-To: References: Message-ID: <53EAD5CA-15B7-4DB2-91F1-C7795894ED12@cs.toronto.edu>

On 2-Jun-09, at 3:06 PM, Pauli Virtanen wrote:
> +0
>
> I don't see any drawbacks, and the implementation looks good.

Thanks Pauli. I realized I was missing values() and itervalues() (though I can't conceive of a scenario where I'd use them myself, I guess some code might expect them). Also I should probably make a copy of self.files in keys() to prevent it from being mutated. I've filed a ticket with the updated patch: http://projects.scipy.org/numpy/ticket/1125 David

From josef.pktd at gmail.com Wed Jun 3 16:23:46 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 3 Jun 2009 16:23:46 -0400 Subject: [Numpy-discussion] another view puzzle Message-ID: <1cd32cbb0906031323g3749a823vec426e235b65efb8@mail.gmail.com>

>>> import numpy as np
>>> x = np.array([(0.0, 1.0, 2.0, 3.0, 4.0), (1.0, 2.0, 3.0, 4.0, 5.0)],
        dtype=[('a', '<f4'), ('b', '<f4'), ('c', '<f4'), ('d', '<f4'), ('e', '<f4')])
>>> xvm = x.view(np.matrix)
>>> xvm
matrix([[(0.0, 1.0, 2.0, 3.0, 4.0), (1.0, 2.0, 3.0, 4.0, 5.0)]],
       dtype=[('a', '<f4'), ('b', '<f4'), ('c', '<f4'), ('d', '<f4'), ('e', '<f4')])
>>> xvm*2
matrix([[(0.0, 1.0, 2.0, 3.0, 4.0, 0.0, 1.0, 2.0, 3.0, 4.0),
         (1.0, 2.0, 3.0, 4.0, 5.0, 1.0, 2.0, 3.0, 4.0, 5.0)]], dtype=object)
>>>

What am I doing wrong?

Josef

From josef.pktd at gmail.com Wed Jun 3 16:26:00 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 3 Jun 2009 16:26:00 -0400 Subject: [Numpy-discussion] field names on numpy arrays In-Reply-To: <9457e7c80906031151r5d007a3dye89f38482be41a4c@mail.gmail.com> References: <23852413.post@talk.nabble.com> <9457e7c80906031151r5d007a3dye89f38482be41a4c@mail.gmail.com> Message-ID: <1cd32cbb0906031326p75f4fcaew746f4eb39553731a@mail.gmail.com>

2009/6/3 Stéfan van der Walt <stefan at sun.ac.za>:
> Hi Jon
>
> 2009/6/3 D2Hitman <j.m.girven at warwick.ac.uk>:
>> I understand record arrays such as:
>> a_array =
>> np.array([(0.,1.,2.,3.,4.),(1.,2.,3.,4.,5.)],dtype=[('a','f'),('b','f'),('c','f'),('d','f'),('e','f')])
>> do this with field names.
>> a_array['a'] = array([ 0.,  1.], dtype=float32)
>> however I seem to lose simple operations such as multiplication (a_array*2)
>> or powers (a_array**2).
>
> As a workaround, you can have two views on your data:
>
> In [39]: x_dict
> Out[39]:
> array([(0.0, 1.0, 2.0, 3.0, 4.0), (1.0, 2.0, 3.0, 4.0, 5.0)],
>       dtype=[('a', '<f4'), ('b', '<f4'), ('c', '<f4'), ('d', '<f4'), ('e', '<f4')])
>
> In [40]: x = x_dict.view(np.float32)
>
> In [41]: x**2
> Out[41]: array([  0.,   1.,   4.,   9.,  16.,   1.,   4.,   9.,  16.,
> 25.], dtype=float32)
>
> Then you can manipulate the same data using two different "interfaces".

Why does it not preserve "shape", to do e.g. np.mean by axis?
> > Regards > St?fan From Chris.Barker at noaa.gov Wed Jun 3 16:46:52 2009 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Wed, 03 Jun 2009 13:46:52 -0700 Subject: [Numpy-discussion] another view puzzle In-Reply-To: <1cd32cbb0906031323g3749a823vec426e235b65efb8@mail.gmail.com> References: <1cd32cbb0906031323g3749a823vec426e235b65efb8@mail.gmail.com> Message-ID: <4A26E13C.7010502@noaa.gov> josef.pktd at gmail.com wrote: >>>> import numpy as np >>>> x = np.array([(0.0, 1.0, 2.0, 3.0, 4.0), (1.0, 2.0, 3.0, 4.0, 5.0)], > dtype=[('a', ' ('e', ' >>>> xvm = x.view(np.matrix) >>>> xvm > matrix([[(0.0, 1.0, 2.0, 3.0, 4.0), (1.0, 2.0, 3.0, 4.0, 5.0)]], > dtype=[('a', ' ('e', ' References: <1cd32cbb0906031323g3749a823vec426e235b65efb8@mail.gmail.com> Message-ID: <3d375d730906031358w2743225anb2a51742877cfa5b@mail.gmail.com> On Wed, Jun 3, 2009 at 15:23, wrote: >>>> import numpy as np >>>> x = np.array([(0.0, 1.0, 2.0, 3.0, 4.0), (1.0, 2.0, 3.0, 4.0, 5.0)], > ? ? dtype=[('a', ' ('e', ' >>>> xvm = x.view(np.matrix) >>>> xvm > matrix([[(0.0, 1.0, 2.0, 3.0, 4.0), (1.0, 2.0, 3.0, 4.0, 5.0)]], > ? ? ? dtype=[('a', ' ('e', '>>> xvm*2 > matrix([[(0.0, 1.0, 2.0, 3.0, 4.0, 0.0, 1.0, 2.0, 3.0, 4.0), > ? ? ? ? (1.0, 2.0, 3.0, 4.0, 5.0, 1.0, 2.0, 3.0, 4.0, 5.0)]], dtype=object) >>>> > > What am I doing wrong? You simply can't do numerical operations on structured arrays. matrix shows this behavior because it replaces * with dot(), and dot(x, 2) upcasts the array to an object array, which happens to represent records as tuples. This has nothing to do with views. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From pav at iki.fi Wed Jun 3 17:01:58 2009 From: pav at iki.fi (Pauli Virtanen) Date: Wed, 3 Jun 2009 21:01:58 +0000 (UTC) Subject: [Numpy-discussion] [RFR] NpzFile tweaks (Re: Making NpzFiles behave more like dictionaries.) References: <53EAD5CA-15B7-4DB2-91F1-C7795894ED12@cs.toronto.edu> Message-ID: Wed, 03 Jun 2009 16:05:51 -0400, David Warde-Farley wrote: > On 2-Jun-09, at 3:06 PM, Pauli Virtanen wrote: > >> +0 >> >> I don't see any drawbacks, and the implementation looks good. > > Thanks Pauli. I realized I was missing values() and itervalues() (though > I can't conceive of a scenario where I'd use them myself, I guess some > code might expect them). Also I should probably make a copy of > self.files in keys() to prevent it from being mutated. > > I've filed a ticket with the updated patch: > > http://projects.scipy.org/numpy/ticket/1125 Btw, are you able to change the status of the ticket to "needs_review"? I think this should be possible for everyone, and not restricted to admins, but I'm not 100% sure... 
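For anyone trying the patch from ticket #1125, the intended behaviour is roughly this (a sketch; 'data.npz' and the array names are just examples):

import numpy as np

np.savez('data.npz', a=np.arange(3), b=np.eye(2))
f = np.load('data.npz')

# with the patch applied, the NpzFile acts like a read-only dict:
for name in f:                  # __iter__ over the archived names
    print name, f[name].shape
print 'a' in f                  # __contains__
d = dict(f.iteritems())         # pull everything into a real dict

That is exactly the point of the patch: code written against a plain dict of arrays should accept an NpzFile unchanged.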
-- Pauli Virtanen From robert.kern at gmail.com Wed Jun 3 17:03:49 2009 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 3 Jun 2009 16:03:49 -0500 Subject: [Numpy-discussion] field names on numpy arrays In-Reply-To: <1cd32cbb0906031326p75f4fcaew746f4eb39553731a@mail.gmail.com> References: <23852413.post@talk.nabble.com> <9457e7c80906031151r5d007a3dye89f38482be41a4c@mail.gmail.com> <1cd32cbb0906031326p75f4fcaew746f4eb39553731a@mail.gmail.com> Message-ID: <3d375d730906031403x54041b7bkc3d10905957a856@mail.gmail.com> On Wed, Jun 3, 2009 at 15:26, wrote: > 2009/6/3 St?fan van der Walt : >> Hi Jon >> >> 2009/6/3 D2Hitman : >>> I understand record arrays such as: >>> a_array = >>> np.array([(0.,1.,2.,3.,4.),(1.,2.,3.,4.,5.)],dtype=[('a','f'),('b','f'),('c','f'),('d','f'),('e','f')]) >>> do this with field names. >>> a_array['a'] = array([ 0., ?1.], dtype=float32) >>> however i seem to lose simple operations such as multiplication (a_array*2) >>> or powers (a_array**2). >> >> As a workaround, you can have two views on your data: >> >> n [39]: x >> Out[39]: >> array([(0.0, 1.0, 2.0, 3.0, 4.0), (1.0, 2.0, 3.0, 4.0, 5.0)], >> ? ? ?dtype=[('a', '> ('e', '> >> In [40]: x = x_dict.view(np.float32) >> >> In [41]: x**2 >> Out[41]: array([ ?0., ? 1., ? 4., ? 9., ?16., ? 1., ? 4., ? 9., ?16., >> 25.], dtype=float32) >> >> Then you can manipulate the same data using two different "interfaces". > > Why does it not preserve "shape", to do e.g. np.mean by axis? It does preserve the shape. The input and output are both 1D. If you need a different shape (e.g. re-interpreting the record as another axis), you need to reshape it yourself. numpy can't guess what you want. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From josef.pktd at gmail.com Wed Jun 3 17:06:59 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 3 Jun 2009 17:06:59 -0400 Subject: [Numpy-discussion] another view puzzle In-Reply-To: <3d375d730906031358w2743225anb2a51742877cfa5b@mail.gmail.com> References: <1cd32cbb0906031323g3749a823vec426e235b65efb8@mail.gmail.com> <3d375d730906031358w2743225anb2a51742877cfa5b@mail.gmail.com> Message-ID: <1cd32cbb0906031406i519422cexc0c2393956551e0a@mail.gmail.com> On Wed, Jun 3, 2009 at 4:58 PM, Robert Kern wrote: > On Wed, Jun 3, 2009 at 15:23, ? wrote: >>>>> import numpy as np >>>>> x = np.array([(0.0, 1.0, 2.0, 3.0, 4.0), (1.0, 2.0, 3.0, 4.0, 5.0)], >> ? ? dtype=[('a', '> ('e', '> >>>>> xvm = x.view(np.matrix) >>>>> xvm >> matrix([[(0.0, 1.0, 2.0, 3.0, 4.0), (1.0, 2.0, 3.0, 4.0, 5.0)]], >> ? ? ? dtype=[('a', '> ('e', '>>>> xvm*2 >> matrix([[(0.0, 1.0, 2.0, 3.0, 4.0, 0.0, 1.0, 2.0, 3.0, 4.0), >> ? ? ? ? (1.0, 2.0, 3.0, 4.0, 5.0, 1.0, 2.0, 3.0, 4.0, 5.0)]], dtype=object) >>>>> >> >> What am I doing wrong? > > You simply can't do numerical operations on structured arrays. matrix > shows this behavior because it replaces * with dot(), and dot(x, 2) > upcasts the array to an object array, which happens to represent > records as tuples. > > This has nothing to do with views. Ok, I didn't know numpy can have structured matrices, I thought matrices are simple 2 dimensional animals. But I haven't looked at them much. 
Josef > -- > Robert Kern From robert.kern at gmail.com Wed Jun 3 17:09:14 2009 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 3 Jun 2009 16:09:14 -0500 Subject: [Numpy-discussion] another view puzzle In-Reply-To: <1cd32cbb0906031406i519422cexc0c2393956551e0a@mail.gmail.com> References: <1cd32cbb0906031323g3749a823vec426e235b65efb8@mail.gmail.com> <3d375d730906031358w2743225anb2a51742877cfa5b@mail.gmail.com> <1cd32cbb0906031406i519422cexc0c2393956551e0a@mail.gmail.com> Message-ID: <3d375d730906031409o28da05c5w62987b873cf92341@mail.gmail.com> On Wed, Jun 3, 2009 at 16:06, wrote: > On Wed, Jun 3, 2009 at 4:58 PM, Robert Kern wrote: >> On Wed, Jun 3, 2009 at 15:23, ? wrote: >>>>>> import numpy as np >>>>>> x = np.array([(0.0, 1.0, 2.0, 3.0, 4.0), (1.0, 2.0, 3.0, 4.0, 5.0)], >>> ? ? dtype=[('a', '>> ('e', '>> >>>>>> xvm = x.view(np.matrix) >>>>>> xvm >>> matrix([[(0.0, 1.0, 2.0, 3.0, 4.0), (1.0, 2.0, 3.0, 4.0, 5.0)]], >>> ? ? ? dtype=[('a', '>> ('e', '>>>>> xvm*2 >>> matrix([[(0.0, 1.0, 2.0, 3.0, 4.0, 0.0, 1.0, 2.0, 3.0, 4.0), >>> ? ? ? ? (1.0, 2.0, 3.0, 4.0, 5.0, 1.0, 2.0, 3.0, 4.0, 5.0)]], dtype=object) >>>>>> >>> >>> What am I doing wrong? >> >> You simply can't do numerical operations on structured arrays. matrix >> shows this behavior because it replaces * with dot(), and dot(x, 2) >> upcasts the array to an object array, which happens to represent >> records as tuples. >> >> This has nothing to do with views. > > Ok, I didn't know numpy can have structured matrices, I thought > matrices are simple 2 dimensional animals. They *are*. Records are atomic items. They do not form an axis. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From Chris.Barker at noaa.gov Wed Jun 3 17:18:05 2009 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Wed, 03 Jun 2009 14:18:05 -0700 Subject: [Numpy-discussion] another view puzzle In-Reply-To: <1cd32cbb0906031406i519422cexc0c2393956551e0a@mail.gmail.com> References: <1cd32cbb0906031323g3749a823vec426e235b65efb8@mail.gmail.com> <3d375d730906031358w2743225anb2a51742877cfa5b@mail.gmail.com> <1cd32cbb0906031406i519422cexc0c2393956551e0a@mail.gmail.com> Message-ID: <4A26E88D.3080505@noaa.gov> josef.pktd at gmail.com wrote: > Ok, I didn't know numpy can have structured matrices, well, matrices are a subclass of nd-arrays, so they support it, but it's probably not the least bit useful. See my earlier post to see how to do what I think you want. You may not want a matrix anyway -- a 2-d array may be a better bet. the only thing matrices buy you is convenient linear algebra operations. -Chris -- Christopher Barker, Ph.D. 
Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From josef.pktd at gmail.com Wed Jun 3 17:31:32 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 3 Jun 2009 17:31:32 -0400 Subject: [Numpy-discussion] another view puzzle In-Reply-To: <4A26E88D.3080505@noaa.gov> References: <1cd32cbb0906031323g3749a823vec426e235b65efb8@mail.gmail.com> <3d375d730906031358w2743225anb2a51742877cfa5b@mail.gmail.com> <1cd32cbb0906031406i519422cexc0c2393956551e0a@mail.gmail.com> <4A26E88D.3080505@noaa.gov> Message-ID: <1cd32cbb0906031431w29ec5941sd604f21d4833f8c2@mail.gmail.com> On Wed, Jun 3, 2009 at 5:18 PM, Christopher Barker wrote: > josef.pktd at gmail.com wrote: >> Ok, I didn't know numpy can have structured matrices, > > well, matrices are a subclass of nd-arrays, so they support it, but it's > probably not the least bit useful. > > See my earlier post to see how to do what I think you want. > > You may not want a matrix anyway -- a 2-d array may be a better bet. the > only thing matrices buy you is convenient linear algebra operations. I'm very happy with plain numpy arrays, but to handle different data types in scipy.stats, I'm still trying to figure out how views and structured arrays work. And I'm still confused. >From the use for data handling in for example matplotlib and the recarray functions, I thought of structured arrays (and recarrays) as columns of data. Instead the analogy to database records and (1d) arrays of structs as in matlab might be better. The numpy help and documentation is not exactly rich in examples how to do convert structured arrays to something that can be used for calculation, except for dictionary access and row iteration. And using views to access them is not as foolproof as I thought. Josef > -Chris From Chris.Barker at noaa.gov Wed Jun 3 17:57:29 2009 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Wed, 03 Jun 2009 14:57:29 -0700 Subject: [Numpy-discussion] another view puzzle In-Reply-To: <1cd32cbb0906031431w29ec5941sd604f21d4833f8c2@mail.gmail.com> References: <1cd32cbb0906031323g3749a823vec426e235b65efb8@mail.gmail.com> <3d375d730906031358w2743225anb2a51742877cfa5b@mail.gmail.com> <1cd32cbb0906031406i519422cexc0c2393956551e0a@mail.gmail.com> <4A26E88D.3080505@noaa.gov> <1cd32cbb0906031431w29ec5941sd604f21d4833f8c2@mail.gmail.com> Message-ID: <4A26F1C9.2020207@noaa.gov> josef.pktd at gmail.com wrote: > I'm very happy with plain numpy arrays, but to handle different data > types in scipy.stats, I'm still trying to figure out how views and > structured arrays work. And I'm still confused. OK, I'd stay away from matrix then, no need to add that confusion >>From the use for data handling in for example matplotlib and the > recarray functions, I thought of structured arrays (and recarrays) as > columns of data. Instead the analogy to database records and (1d) > arrays of structs as in matlab might be better. they are a bit of a mixture -- I think the record style access: arr['x'] means that there is no "rows" or "columns", just data accessed by name. > The numpy help and documentation is not exactly rich in examples how > to do convert structured arrays to something that can be used for > calculation, except for dictionary access and row iteration. And using > views to access them is not as foolproof as I thought. 
views are kind of a low-level trick -- what views do is let you make more than one numpy array that share the same memory data block. Doing this required a bit of knowledge about how data is stored in memory. For the common use case, what I do is use struct arrays to store and mass data around, and simple pull out the data into a regular array to manipulate it: In [45]: x Out[45]: array([(0.0, 1.0, 2.0, 12.0, 4.0), (1.0, 2.0, 3.0, 45.0, 5.0)], dtype=[('a', ' What's a numpy.void type? I thought this would be a tuple, or a numpy scalar of that dtype. It can be indexed either way, though: In [70]: x[0][2] Out[70]: 2.0 In [72]: x[0]['c'] Out[72]: 2.0 cool. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From robert.kern at gmail.com Wed Jun 3 17:55:42 2009 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 3 Jun 2009 16:55:42 -0500 Subject: [Numpy-discussion] another view puzzle In-Reply-To: <1cd32cbb0906031431w29ec5941sd604f21d4833f8c2@mail.gmail.com> References: <1cd32cbb0906031323g3749a823vec426e235b65efb8@mail.gmail.com> <3d375d730906031358w2743225anb2a51742877cfa5b@mail.gmail.com> <1cd32cbb0906031406i519422cexc0c2393956551e0a@mail.gmail.com> <4A26E88D.3080505@noaa.gov> <1cd32cbb0906031431w29ec5941sd604f21d4833f8c2@mail.gmail.com> Message-ID: <3d375d730906031455w30a6637dv80e3de0bfd805a31@mail.gmail.com> On Wed, Jun 3, 2009 at 16:31, wrote: > On Wed, Jun 3, 2009 at 5:18 PM, Christopher Barker > wrote: >> josef.pktd at gmail.com wrote: >>> Ok, I didn't know numpy can have structured matrices, >> >> well, matrices are a subclass of nd-arrays, so they support it, but it's >> probably not the least bit useful. >> >> See my earlier post to see how to do what I think you want. >> >> You may not want a matrix anyway -- a 2-d array may be a better bet. the >> only thing matrices buy you is convenient linear algebra operations. > > I'm very happy with plain numpy arrays, but to handle different data > types in scipy.stats, I'm still trying to figure out how views and > structured arrays work. And I'm still confused. .view() is used two different ways, and I think that is confusing you. .view(some_dtype) constructs a view of the array's memory with a different dtype. This can cause a reinterpretation of the bytes of memory. .view(ndarray_subclass) just returns an instance of ndarray_subclass that looks at the same array (same shape, dtype, etc.). This does not cause a reinterpretation of the memory. These are two completely different things, unfortunately conflated into the same method. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
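To spell out that distinction with a small sketch (float64/int64 itemsizes assumed equal):

import numpy as np

a = np.arange(4.0)          # four float64 values

b = a.view(np.int64)        # dtype view: the same bytes reinterpreted,
print b                     # so the floats show up as garbage-looking ints

m = a.view(np.matrix)       # subclass view: same dtype, same values;
print m * 2                 # only the type (and hence * behaviour) changes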
-- Umberto Eco From josef.pktd at gmail.com Wed Jun 3 18:53:55 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 3 Jun 2009 18:53:55 -0400 Subject: [Numpy-discussion] another view puzzle In-Reply-To: <4A26F1C9.2020207@noaa.gov> References: <1cd32cbb0906031323g3749a823vec426e235b65efb8@mail.gmail.com> <3d375d730906031358w2743225anb2a51742877cfa5b@mail.gmail.com> <1cd32cbb0906031406i519422cexc0c2393956551e0a@mail.gmail.com> <4A26E88D.3080505@noaa.gov> <1cd32cbb0906031431w29ec5941sd604f21d4833f8c2@mail.gmail.com> <4A26F1C9.2020207@noaa.gov> Message-ID: <1cd32cbb0906031553p38ed0ae6ld6925b685fbb4849@mail.gmail.com> On Wed, Jun 3, 2009 at 5:57 PM, Christopher Barker wrote: > josef.pktd at gmail.com wrote: >> I'm very happy with plain numpy arrays, but to handle different data >> types in scipy.stats, I'm still trying to figure out how views and >> structured arrays work. And I'm still confused. > > OK, I'd stay away from matrix then, no need to add that confusion > >>>From the use for data handling in for example matplotlib and the >> recarray functions, I thought of structured arrays (and recarrays) as >> columns of data. Instead the analogy to database records and (1d) >> arrays of structs as in matlab might be better. > > they are a bit of a mixture -- I think the record style access: > > arr['x'] > > means that there is no "rows" or "columns", just data accessed by name. > >> The numpy help and documentation is not exactly rich in examples how >> to do convert structured arrays to something that can be used for >> calculation, except for dictionary access and row iteration. And using >> views to access them is not as foolproof as I thought. > > views are kind of a low-level trick -- what views do is let you make > more than one numpy array that share the same memory data block. Doing > this required a bit of knowledge about how data is stored in memory. > > For the common use case, what I do is use struct arrays to store and > mass data around, and simple pull out the data into a regular array to > manipulate it: > > In [45]: x > Out[45]: > array([(0.0, 1.0, 2.0, 12.0, 4.0), (1.0, 2.0, 3.0, 45.0, 5.0)], > ? ? ? dtype=[('a', ' ('e', ' > In [46]: > > In [47]: e = x['e'] > > In [48]: e > Out[48]: array([ 4., ?5.], dtype=float32) > > note that this is still a "view" into the original array: > > In [49]: e *= 5 > > In [50]: x > Out[50]: > array([(0.0, 1.0, 2.0, 12.0, 20.0), (1.0, 2.0, 3.0, 45.0, 25.0)], > ? ? ? dtype=[('a', ' ('e', ' > #( see how the e field changed ) > > This is interesting: > In [51]: x[0] > Out[51]: (0.0, 1.0, 2.0, 12.0, 20.0) > > In [52]: type(x[0]) > Out[52]: > > What's a numpy.void type? I thought this would be a tuple, or a numpy > scalar of that dtype. It can be indexed either way, though: > > In [70]: x[0][2] > Out[70]: 2.0 > > In [72]: x[0]['c'] > Out[72]: 2.0 > > cool. 
The access of rows, one column or individual entries looks good, but to get slices out of the structured array takes more effort, and it took me more time to figure this out:

>>> z
array([[(0.0, 1.0, 2.0, 3.0, 4.0)],
       [(1.0, 2.0, 3.0, 4.0, 5.0)]],
      dtype=[('a', '<f8'), ('b', '<f8'), ('c', '<f8'), ('d', '<f8'), ('e', '<f8')])
>>> z.view(float).reshape(-1,len(z.dtype))
array([[ 0.,  1.,  2.,  3.,  4.],
       [ 1.,  2.,  3.,  4.,  5.]])
>>> z.view(float).reshape(-1,len(z.dtype)).mean(1)
array([ 2.,  3.])
>>> z.view(float).reshape(-1,len(z.dtype))[:,2:4].sum(0)
array([ 5.,  7.])

If not all items in the structured array have the same dtype, I didn't find anything better than

>>> np.hstack([z[i] for i in z.dtype.names[2:4]])
array([[ 2.,  3.],
       [ 3.,  4.]])
>>> np.dot(np.hstack([z[i] for i in z.dtype.names[2:4]]), np.ones(len(z)))
array([ 5.,  7.])

Is len(z.dtype) > 0 the best way to find out whether an array has a structured dtype?

Josef

From stefan at sun.ac.za Wed Jun 3 18:55:12 2009 From: stefan at sun.ac.za (Stéfan van der Walt) Date: Thu, 4 Jun 2009 00:55:12 +0200 Subject: [Numpy-discussion] another view puzzle In-Reply-To: <3d375d730906031455w30a6637dv80e3de0bfd805a31@mail.gmail.com> References: <1cd32cbb0906031323g3749a823vec426e235b65efb8@mail.gmail.com> <3d375d730906031358w2743225anb2a51742877cfa5b@mail.gmail.com> <1cd32cbb0906031406i519422cexc0c2393956551e0a@mail.gmail.com> <4A26E88D.3080505@noaa.gov> <1cd32cbb0906031431w29ec5941sd604f21d4833f8c2@mail.gmail.com> <3d375d730906031455w30a6637dv80e3de0bfd805a31@mail.gmail.com> Message-ID: <9457e7c80906031555x13ac3a51t67bfc82cab8cbb66@mail.gmail.com>

2009/6/3 Robert Kern <robert.kern at gmail.com>:
> On Wed, Jun 3, 2009 at 16:31,  <josef.pktd at gmail.com> wrote:
>> I'm very happy with plain numpy arrays, but to handle different data
>> types in scipy.stats, I'm still trying to figure out how views and
>> structured arrays work. And I'm still confused.
>
> .view() is used two different ways, and I think that is confusing you.
> .view(some_dtype) constructs a view of the array's memory with a
> different dtype. This can cause a reinterpretation of the bytes of
> memory. .view(ndarray_subclass) just returns an instance of
> ndarray_subclass that looks at the same array (same shape, dtype,
> etc.). This does not cause a reinterpretation of the memory.
>
> These are two completely different things, unfortunately conflated
> into the same method.

One way to distinguish the difference in code is to use

x.view(dtype=float)

or

x.view(type=matrix)

A person can even do both:

x.view(dtype=float, type=matrix)

Not ideal, but maybe a little bit better.

Cheers
Stéfan

From robert.kern at gmail.com Wed Jun 3 18:56:58 2009 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 3 Jun 2009 17:56:58 -0500 Subject: [Numpy-discussion] another view puzzle In-Reply-To: <1cd32cbb0906031553p38ed0ae6ld6925b685fbb4849@mail.gmail.com> References: <1cd32cbb0906031323g3749a823vec426e235b65efb8@mail.gmail.com> <3d375d730906031358w2743225anb2a51742877cfa5b@mail.gmail.com> <1cd32cbb0906031406i519422cexc0c2393956551e0a@mail.gmail.com> <4A26E88D.3080505@noaa.gov> <1cd32cbb0906031431w29ec5941sd604f21d4833f8c2@mail.gmail.com> <4A26F1C9.2020207@noaa.gov> <1cd32cbb0906031553p38ed0ae6ld6925b685fbb4849@mail.gmail.com> Message-ID: <3d375d730906031556u47d0a934y3a02c670a673bd7d@mail.gmail.com>

On Wed, Jun 3, 2009 at 17:53,  <josef.pktd at gmail.com> wrote:
> Is len(z.dtype) > 0 the best way to find out whether an array has a
> structured dtype?

(z.dtype.names is not None) is better.
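For instance (a small sketch; the helper name is made up):

import numpy as np

def is_structured(arr):
    # plain dtypes have .names == None; structured dtypes have a tuple
    return arr.dtype.names is not None

print is_structured(np.zeros(3))                        # False
print is_structured(np.zeros(3, dtype=[('a', '<f4')]))  # True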
-- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From josef.pktd at gmail.com Wed Jun 3 18:58:13 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 3 Jun 2009 18:58:13 -0400 Subject: [Numpy-discussion] another view puzzle In-Reply-To: <3d375d730906031455w30a6637dv80e3de0bfd805a31@mail.gmail.com> References: <1cd32cbb0906031323g3749a823vec426e235b65efb8@mail.gmail.com> <3d375d730906031358w2743225anb2a51742877cfa5b@mail.gmail.com> <1cd32cbb0906031406i519422cexc0c2393956551e0a@mail.gmail.com> <4A26E88D.3080505@noaa.gov> <1cd32cbb0906031431w29ec5941sd604f21d4833f8c2@mail.gmail.com> <3d375d730906031455w30a6637dv80e3de0bfd805a31@mail.gmail.com> Message-ID: <1cd32cbb0906031558kdf3594dwb416fa956e4deb5a@mail.gmail.com> On Wed, Jun 3, 2009 at 5:55 PM, Robert Kern wrote: > On Wed, Jun 3, 2009 at 16:31, ? wrote: >> On Wed, Jun 3, 2009 at 5:18 PM, Christopher Barker >> wrote: >>> josef.pktd at gmail.com wrote: >>>> Ok, I didn't know numpy can have structured matrices, >>> >>> well, matrices are a subclass of nd-arrays, so they support it, but it's >>> probably not the least bit useful. >>> >>> See my earlier post to see how to do what I think you want. >>> >>> You may not want a matrix anyway -- a 2-d array may be a better bet. the >>> only thing matrices buy you is convenient linear algebra operations. >> >> I'm very happy with plain numpy arrays, but to handle different data >> types in scipy.stats, I'm still trying to figure out how views and >> structured arrays work. And I'm still confused. > > .view() is used two different ways, and I think that is confusing you. > .view(some_dtype) constructs a view of the array's memory with a > different dtype. This can cause a reinterpretation of the bytes of > memory. .view(ndarray_subclass) just returns an instance of > ndarray_subclass that looks at the same array (same shape, dtype, > etc.). This does not cause a reinterpretation of the memory. > > These are two completely different things, unfortunately conflated > into the same method. Thanks, this makes it much clearer than the current docstring for np.view(). I didn't even know about .view(ndarray_subclass) until Pierre mentioned it today. Do you have an opinion about whether .view(ndarray_subclass) or __array_wrap__ is the more appropriate return wrapper for function such as the ones in stats? Josef > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." 
> ?-- Umberto Eco > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From robert.kern at gmail.com Wed Jun 3 19:00:31 2009 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 3 Jun 2009 18:00:31 -0500 Subject: [Numpy-discussion] another view puzzle In-Reply-To: <1cd32cbb0906031558kdf3594dwb416fa956e4deb5a@mail.gmail.com> References: <1cd32cbb0906031323g3749a823vec426e235b65efb8@mail.gmail.com> <3d375d730906031358w2743225anb2a51742877cfa5b@mail.gmail.com> <1cd32cbb0906031406i519422cexc0c2393956551e0a@mail.gmail.com> <4A26E88D.3080505@noaa.gov> <1cd32cbb0906031431w29ec5941sd604f21d4833f8c2@mail.gmail.com> <3d375d730906031455w30a6637dv80e3de0bfd805a31@mail.gmail.com> <1cd32cbb0906031558kdf3594dwb416fa956e4deb5a@mail.gmail.com> Message-ID: <3d375d730906031600w689e03cagbb082f2c3dfa2761@mail.gmail.com> On Wed, Jun 3, 2009 at 17:58, wrote: > Do you have an opinion about whether ?.view(ndarray_subclass) or > __array_wrap__ is the more appropriate return wrapper for function > such as the ones in stats? __array_wrap__ would be more appropriate. It's what ufuncs use. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From pgmdevlist at gmail.com Wed Jun 3 19:20:43 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Wed, 3 Jun 2009 19:20:43 -0400 Subject: [Numpy-discussion] field names on numpy arrays In-Reply-To: <3d375d730906031403x54041b7bkc3d10905957a856@mail.gmail.com> References: <23852413.post@talk.nabble.com> <9457e7c80906031151r5d007a3dye89f38482be41a4c@mail.gmail.com> <1cd32cbb0906031326p75f4fcaew746f4eb39553731a@mail.gmail.com> <3d375d730906031403x54041b7bkc3d10905957a856@mail.gmail.com> Message-ID: <2849AC9C-7EE0-407D-8D60-198109DF74ED@gmail.com> On Jun 3, 2009, at 5:03 PM, Robert Kern wrote: > On Wed, Jun 3, 2009 at 15:26, wrote: >> 2009/6/3 St?fan van der Walt : >>> Hi Jon >>> >>> 2009/6/3 D2Hitman : >>>> I understand record arrays such as: >>>> a_array = >>>> np.array([(0.,1.,2.,3.,4.),(1.,2.,3.,4.,5.)],dtype=[('a','f'), >>>> ('b','f'),('c','f'),('d','f'),('e','f')]) >>>> do this with field names. >>>> a_array['a'] = array([ 0., 1.], dtype=float32) >>>> however i seem to lose simple operations such as multiplication >>>> (a_array*2) >>>> or powers (a_array**2). >> Why does it not preserve "shape", to do e.g. np.mean by axis? > > It does preserve the shape. The input and output are both 1D. If you > need a different shape (e.g. re-interpreting the record as another > axis), you need to reshape it yourself. numpy can't guess what you > want. Or, as all fields have the same dtype: >>> a_array.view(dtype=('f',len(a_array.dtype))) array([[ 0., 1., 2., 3., 4.], [ 1., 2., 3., 4., 5.]], dtype=float32) Ain't it fun ? 
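Tying this back to Robert's __array_wrap__ answer above, a bare-bones sketch of the mechanism (class and attribute names invented for illustration):

import numpy as np

class Tagged(np.ndarray):
    # toy subclass that carries one extra attribute around
    def __array_finalize__(self, obj):
        self.tag = getattr(obj, 'tag', None)
    def __array_wrap__(self, out_arr, context=None):
        # ufuncs hand their raw output to this hook, so results come
        # back as Tagged (with the tag intact) instead of plain ndarray
        return np.ndarray.__array_wrap__(self, out_arr, context)

t = np.arange(4.0).view(Tagged)
t.tag = 'metres'
r = np.add(t, 1)
print type(r), r.tag    # stays a Tagged and keeps 'metres'

With a bare .view(Tagged) instead, the re-wrapping would have to be redone by hand after every operation.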
From pgmdevlist at gmail.com Wed Jun 3 19:21:57 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Wed, 3 Jun 2009 19:21:57 -0400 Subject: [Numpy-discussion] another view puzzle In-Reply-To: <3d375d730906031600w689e03cagbb082f2c3dfa2761@mail.gmail.com> References: <1cd32cbb0906031323g3749a823vec426e235b65efb8@mail.gmail.com> <3d375d730906031358w2743225anb2a51742877cfa5b@mail.gmail.com> <1cd32cbb0906031406i519422cexc0c2393956551e0a@mail.gmail.com> <4A26E88D.3080505@noaa.gov> <1cd32cbb0906031431w29ec5941sd604f21d4833f8c2@mail.gmail.com> <3d375d730906031455w30a6637dv80e3de0bfd805a31@mail.gmail.com> <1cd32cbb0906031558kdf3594dwb416fa956e4deb5a@mail.gmail.com> <3d375d730906031600w689e03cagbb082f2c3dfa2761@mail.gmail.com> Message-ID: <221281F6-CEE8-469C-BDC5-710B4779FBD5@gmail.com> On Jun 3, 2009, at 7:00 PM, Robert Kern wrote: > On Wed, Jun 3, 2009 at 17:58, wrote: >> Do you have an opinion about whether .view(ndarray_subclass) or >> __array_wrap__ is the more appropriate return wrapper for function >> such as the ones in stats? > > __array_wrap__ would be more appropriate. It's what ufuncs use. And it would work w/ MaskedArrays (using a simple .view(MaskedArray) would wipe out some information, while MaskedArray.__array_wrap__ keeps it) From robert.kern at gmail.com Wed Jun 3 19:23:06 2009 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 3 Jun 2009 18:23:06 -0500 Subject: [Numpy-discussion] field names on numpy arrays In-Reply-To: <2849AC9C-7EE0-407D-8D60-198109DF74ED@gmail.com> References: <23852413.post@talk.nabble.com> <9457e7c80906031151r5d007a3dye89f38482be41a4c@mail.gmail.com> <1cd32cbb0906031326p75f4fcaew746f4eb39553731a@mail.gmail.com> <3d375d730906031403x54041b7bkc3d10905957a856@mail.gmail.com> <2849AC9C-7EE0-407D-8D60-198109DF74ED@gmail.com> Message-ID: <3d375d730906031623u6305a3e3vf08b4f3695fd12a5@mail.gmail.com> On Wed, Jun 3, 2009 at 18:20, Pierre GM wrote: > > On Jun 3, 2009, at 5:03 PM, Robert Kern wrote: > >> On Wed, Jun 3, 2009 at 15:26, ? wrote: >>> 2009/6/3 St?fan van der Walt : >>>> Hi Jon >>>> >>>> 2009/6/3 D2Hitman : >>>>> I understand record arrays such as: >>>>> a_array = >>>>> np.array([(0.,1.,2.,3.,4.),(1.,2.,3.,4.,5.)],dtype=[('a','f'), >>>>> ('b','f'),('c','f'),('d','f'),('e','f')]) >>>>> do this with field names. >>>>> a_array['a'] = array([ 0., ?1.], dtype=float32) >>>>> however i seem to lose simple operations such as multiplication >>>>> (a_array*2) >>>>> or powers (a_array**2). >>> Why does it not preserve "shape", to do e.g. np.mean by axis? >> >> It does preserve the shape. The input and output are both 1D. If you >> need a different shape (e.g. re-interpreting the record as another >> axis), you need to reshape it yourself. numpy can't guess what you >> want. > > Or, as all fields have the same dtype: > > ?>>> a_array.view(dtype=('f',len(a_array.dtype))) > array([[ 0., ?1., ?2., ?3., ?4.], > ? ? ? ?[ 1., ?2., ?3., ?4., ?5.]], dtype=float32) > > Ain't it fun ? Ah, yes, there is that niggle, too. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco

From josef.pktd at gmail.com  Wed Jun  3 19:25:08 2009
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Wed, 3 Jun 2009 19:25:08 -0400
Subject: [Numpy-discussion] another view puzzle
In-Reply-To: <221281F6-CEE8-469C-BDC5-710B4779FBD5@gmail.com>
References: <1cd32cbb0906031323g3749a823vec426e235b65efb8@mail.gmail.com>
	<3d375d730906031358w2743225anb2a51742877cfa5b@mail.gmail.com>
	<1cd32cbb0906031406i519422cexc0c2393956551e0a@mail.gmail.com>
	<4A26E88D.3080505@noaa.gov>
	<1cd32cbb0906031431w29ec5941sd604f21d4833f8c2@mail.gmail.com>
	<3d375d730906031455w30a6637dv80e3de0bfd805a31@mail.gmail.com>
	<1cd32cbb0906031558kdf3594dwb416fa956e4deb5a@mail.gmail.com>
	<3d375d730906031600w689e03cagbb082f2c3dfa2761@mail.gmail.com>
	<221281F6-CEE8-469C-BDC5-710B4779FBD5@gmail.com>
Message-ID: <1cd32cbb0906031625l4869ba97m31671fd3c15083dc@mail.gmail.com>

On Wed, Jun 3, 2009 at 7:21 PM, Pierre GM wrote:
>
> On Jun 3, 2009, at 7:00 PM, Robert Kern wrote:
>
>> On Wed, Jun 3, 2009 at 17:58,  wrote:
>>> Do you have an opinion about whether .view(ndarray_subclass) or
>>> __array_wrap__ is the more appropriate return wrapper for function
>>> such as the ones in stats?
>>
>> __array_wrap__ would be more appropriate. It's what ufuncs use.
>
> And it would work w/ MaskedArrays (using a simple .view(MaskedArray)
> would wipe out some information, while MaskedArray.__array_wrap__
> keeps it)

Thanks to all, this has been very informative. I added Robert's
comments to the np.ndarray.view docstring and will add examples (to
structured arrays) as I find time.

Josef

From pgmdevlist at gmail.com  Wed Jun  3 19:33:05 2009
From: pgmdevlist at gmail.com (Pierre GM)
Date: Wed, 3 Jun 2009 19:33:05 -0400
Subject: [Numpy-discussion] field names on numpy arrays
In-Reply-To: <3d375d730906031623u6305a3e3vf08b4f3695fd12a5@mail.gmail.com>
References: <23852413.post@talk.nabble.com>
	<9457e7c80906031151r5d007a3dye89f38482be41a4c@mail.gmail.com>
	<1cd32cbb0906031326p75f4fcaew746f4eb39553731a@mail.gmail.com>
	<3d375d730906031403x54041b7bkc3d10905957a856@mail.gmail.com>
	<2849AC9C-7EE0-407D-8D60-198109DF74ED@gmail.com>
	<3d375d730906031623u6305a3e3vf08b4f3695fd12a5@mail.gmail.com>
Message-ID: <18749028-CAC0-4C49-B053-EB19EB34E58D@gmail.com>

On Jun 3, 2009, at 7:23 PM, Robert Kern wrote:
> On Wed, Jun 3, 2009 at 18:20, Pierre GM wrote:
>>
>>
>> Or, as all fields have the same dtype:
>>
>>  >>> a_array.view(dtype=('f',len(a_array.dtype)))
>> array([[ 0.,  1.,  2.,  3.,  4.],
>>        [ 1.,  2.,  3.,  4.,  5.]], dtype=float32)
>>
>> Ain't it fun ?
>
> Ah, yes, there is that niggle, too.

Except that I always get bitten by that:

 >>> backandforth =
a_array.view(dtype=('f',len(a_array.dtype))).view(a_array.dtype)
 >>> backandforth
array([[(0.0, 1.0, 2.0, 3.0, 4.0)],
       [(1.0, 2.0, 3.0, 4.0, 5.0)]],
      dtype=[('a', '<f4'), ('b', '<f4'), ('c', '<f4'), ('d', '<f4'),
('e', '<f4')])
 >>> backandforth.shape
(2,1)

We gained a dimension !
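A minimal sketch of one way to collapse that extra dimension again
(this assumes the same float32 field layout as above; reshape(-1), or
a plain squeeze(), removes the trailing axis):

>>> import numpy as np
>>> dt5 = np.dtype([('a', '<f4'), ('b', '<f4'), ('c', '<f4'), ('d', '<f4'), ('e', '<f4')])
>>> a_array = np.array([(0., 1., 2., 3., 4.), (1., 2., 3., 4., 5.)], dtype=dt5)
>>> a_array.view(('<f4', 5)).view(dt5).shape        # the round trip grows a trailing axis
(2, 1)
>>> a_array.view(('<f4', 5)).view(dt5).reshape(-1).shape
(2,)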
From josef.pktd at gmail.com  Wed Jun  3 19:56:07 2009
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Wed, 3 Jun 2009 19:56:07 -0400
Subject: [Numpy-discussion] field names on numpy arrays
In-Reply-To: <18749028-CAC0-4C49-B053-EB19EB34E58D@gmail.com>
References: <23852413.post@talk.nabble.com>
	<9457e7c80906031151r5d007a3dye89f38482be41a4c@mail.gmail.com>
	<1cd32cbb0906031326p75f4fcaew746f4eb39553731a@mail.gmail.com>
	<3d375d730906031403x54041b7bkc3d10905957a856@mail.gmail.com>
	<2849AC9C-7EE0-407D-8D60-198109DF74ED@gmail.com>
	<3d375d730906031623u6305a3e3vf08b4f3695fd12a5@mail.gmail.com>
	<18749028-CAC0-4C49-B053-EB19EB34E58D@gmail.com>
Message-ID: <1cd32cbb0906031656j68398c56vfddd507f79a02cde@mail.gmail.com>

On Wed, Jun 3, 2009 at 7:33 PM, Pierre GM wrote:
>
> On Jun 3, 2009, at 7:23 PM, Robert Kern wrote:
>
>> On Wed, Jun 3, 2009 at 18:20, Pierre GM wrote:
>>>
>>>
>>> Or, as all fields have the same dtype:
>>>
>>>  >>> a_array.view(dtype=('f',len(a_array.dtype)))
>>> array([[ 0.,  1.,  2.,  3.,  4.],
>>>        [ 1.,  2.,  3.,  4.,  5.]], dtype=float32)
>>>
>>> Ain't it fun ?
>>
>> Ah, yes, there is that niggle, too.
>
>
>
> Except that I always get bitten by that:
>
>  >>> backandforth =
> a_array.view(dtype=('f',len(a_array.dtype))).view(a_array.dtype)
>  >>> backandforth
> array([[(0.0, 1.0, 2.0, 3.0, 4.0)],
>        [(1.0, 2.0, 3.0, 4.0, 5.0)]],
>       dtype=[('a', '<f4'), ('b', '<f4'), ('c', '<f4'), ('d', '<f4'),
> ('e', '<f4')])
>  >>> backandforth.shape
> (2,1)
>
> We gained a dimension !
>

I looked at the archives to my first discovery of views, for sorting
rows proposed by Pierre. In this case reshape was not necessary.

>>> np.sort(np.array([[4.0, 1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0, 5.0]]).view(dt),0).view(float)
array([[ 1.,  2.,  3.,  4.,  5.],
       [ 4.,  1.,  2.,  3.,  4.]])

>>> dt
[('a', '<f8'), ('b', '<f8'), ('c', '<f8'), ('d', '<f8'), ('e', '<f8')]

looking closer, the extra dimension helps to maintain shape:

direct construction of structured array

>>> np.array([(0.0, 1.0, 2.0, 3.0, 4.0), (1.0, 2.0, 3.0, 4.0, 5.0)],dt)
array([(0.0, 1.0, 2.0, 3.0, 4.0), (1.0, 2.0, 3.0, 4.0, 5.0)],
      dtype=[('a', '<f8'), ('b', '<f8'), ('c', '<f8'), ('d', '<f8'),
('e', '<f8')])
>>> np.array([(0.0, 1.0, 2.0, 3.0, 4.0), (1.0, 2.0, 3.0, 4.0, 5.0)],dt).shape
(2,)

structured view on existing array is 2d
>>> np.array([(0.0, 1.0, 2.0, 3.0, 4.0), (1.0, 2.0, 3.0, 4.0, 5.0)]).view(dt).shape
(2, 1)

view on view returns original shape,
>>> np.array([(0.0, 1.0, 2.0, 3.0, 4.0), (1.0, 2.0, 3.0, 4.0, 5.0)]).view(dt).view(float).shape
(2, 5)

But sorting in between the two views also preserved original shape.
This was the source about my initial confusion about the necessity of
reshape.
Josef

From josef.pktd at gmail.com  Wed Jun  3 20:25:14 2009
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Wed, 3 Jun 2009 20:25:14 -0400
Subject: [Numpy-discussion] field names on numpy arrays
In-Reply-To: <1cd32cbb0906031656j68398c56vfddd507f79a02cde@mail.gmail.com>
References: <23852413.post@talk.nabble.com>
	<9457e7c80906031151r5d007a3dye89f38482be41a4c@mail.gmail.com>
	<1cd32cbb0906031326p75f4fcaew746f4eb39553731a@mail.gmail.com>
	<3d375d730906031403x54041b7bkc3d10905957a856@mail.gmail.com>
	<2849AC9C-7EE0-407D-8D60-198109DF74ED@gmail.com>
	<3d375d730906031623u6305a3e3vf08b4f3695fd12a5@mail.gmail.com>
	<1cd32cbb0906031656j68398c56vfddd507f79a02cde@mail.gmail.com>
Message-ID: <1cd32cbb0906031725y2d4a2b7dn40f6ee9418bf386c@mail.gmail.com>

On Wed, Jun 3, 2009 at 7:56 PM,  wrote:
> On Wed, Jun 3, 2009 at 7:33 PM, Pierre GM wrote:
>>
>> On Jun 3, 2009, at 7:23 PM, Robert Kern wrote:
>>
>>> On Wed, Jun 3, 2009 at 18:20, Pierre GM wrote:
>>>>
>>>>
>>>> Or, as all fields have the same dtype:
>>>>
>>>>  >>> a_array.view(dtype=('f',len(a_array.dtype)))
>>>> array([[ 0.,  1.,  2.,  3.,  4.],
>>>>        [ 1.,  2.,  3.,  4.,  5.]], dtype=float32)
>>>>
>>>> Ain't it fun ?
>>>
>>> Ah, yes, there is that niggle, too.
>>
>>
>>
>> Except that I always get bitten by that:
>>
>>  >>> backandforth =
>> a_array.view(dtype=('f',len(a_array.dtype))).view(a_array.dtype)
>>  >>> backandforth
>> array([[(0.0, 1.0, 2.0, 3.0, 4.0)],
>>        [(1.0, 2.0, 3.0, 4.0, 5.0)]],
>>       dtype=[('a', '<f4'), ('b', '<f4'), ('c', '<f4'), ('d', '<f4'),
>> ('e', '<f4')])
>>  >>> backandforth.shape
>> (2,1)
>>
>> We gained a dimension !
>>
>
> I looked at the archives to my first discovery of views, for sorting
> rows proposed by Pierre. In this case reshape was not necessary.
>
>>>> np.sort(np.array([[4.0, 1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0, 5.0]]).view(dt),0).view(float)
> array([[ 1.,  2.,  3.,  4.,  5.],
>        [ 4.,  1.,  2.,  3.,  4.]])
>
>>>> dt
> [('a', '<f8'), ('b', '<f8'), ('c', '<f8'), ('d', '<f8'), ('e', '<f8')]
>
> looking closer, the extra dimension helps to maintain shape:
>
> direct construction of structured array
>
>>>> np.array([(0.0, 1.0, 2.0, 3.0, 4.0), (1.0, 2.0, 3.0, 4.0, 5.0)],dt)
> array([(0.0, 1.0, 2.0, 3.0, 4.0), (1.0, 2.0, 3.0, 4.0, 5.0)],
>      dtype=[('a', '<f8'), ('b', '<f8'), ('c', '<f8'), ('d', '<f8'),
> ('e', '<f8')])
>>>> np.array([(0.0, 1.0, 2.0, 3.0, 4.0), (1.0, 2.0, 3.0, 4.0, 5.0)],dt).shape
> (2,)
>
> structured view on existing array is 2d
>>>> np.array([(0.0, 1.0, 2.0, 3.0, 4.0), (1.0, 2.0, 3.0, 4.0, 5.0)]).view(dt).shape
> (2, 1)
>
> view on view returns original shape,
>>>> np.array([(0.0, 1.0, 2.0, 3.0, 4.0), (1.0, 2.0, 3.0, 4.0, 5.0)]).view(dt).view(float).shape
> (2, 5)
>
> But sorting in between the two views also preserved original shape.
> This was the source about my initial confusion about the necessity of
> reshape.
>

here is a minimal example for 2d structured array:

>>> dt = dtype=[('a', '<f8'), ('b', '<f8'), ('c', '<f8'), ('d', '<f8'), ('e', '<f8')]
>>> ys = np.array([(0.0, 1.0, 2.0, 3.0, 4.0), (1.0, 2.0, 3.0, 4.0, 5.0)],dt)
>>> ys.shape
(2,)
>>> ys.view(float)
array([ 0.,  1.,  2.,  3.,  4.,  1.,  2.,  3.,  4.,  5.])
>>> ys = ys.reshape((len(ys),1))
>>> ys.shape
(2, 1)
>>> ys.view(float)
array([[ 0.,  1.,  2.,  3.,  4.],
       [ 1.,  2.,  3.,  4.,  5.]])

Josef

From ningsean at gmail.com  Wed Jun  3 20:29:31 2009
From: ningsean at gmail.com (Ning Sean)
Date: Wed, 3 Jun 2009 19:29:31 -0500
Subject: [Numpy-discussion] extract elements of an array that are
	contained in another array?
Message-ID: 

Hi, I want to extract elements of an array (say, a) that are contained in
another array (say, b). That is, if a=array([1,1,2,3,3,4]), b=array([1,4]),
then I want array([1,1,4]).
I did the following but the speed is very slow (maybe because a is very
long):

c=array([])
for x in b:
    c=append(c,a[a==x])

any way to speed it up?

Thanks!
-Ning
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From josef.pktd at gmail.com  Wed Jun  3 20:36:21 2009
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Wed, 3 Jun 2009 20:36:21 -0400
Subject: [Numpy-discussion] field names on numpy arrays
In-Reply-To: <1cd32cbb0906031725y2d4a2b7dn40f6ee9418bf386c@mail.gmail.com>
References: <23852413.post@talk.nabble.com>
	<9457e7c80906031151r5d007a3dye89f38482be41a4c@mail.gmail.com>
	<1cd32cbb0906031326p75f4fcaew746f4eb39553731a@mail.gmail.com>
	<3d375d730906031403x54041b7bkc3d10905957a856@mail.gmail.com>
	<2849AC9C-7EE0-407D-8D60-198109DF74ED@gmail.com>
	<3d375d730906031623u6305a3e3vf08b4f3695fd12a5@mail.gmail.com>
	<1cd32cbb0906031656j68398c56vfddd507f79a02cde@mail.gmail.com>
	<1cd32cbb0906031725y2d4a2b7dn40f6ee9418bf386c@mail.gmail.com>
Message-ID: <1cd32cbb0906031736m1714779fq6cd7a94aa23eecb2@mail.gmail.com>

On Wed, Jun 3, 2009 at 8:25 PM,  wrote:
> On Wed, Jun 3, 2009 at 7:56 PM,  wrote:
>> On Wed, Jun 3, 2009 at 7:33 PM, Pierre GM wrote:
>>>
>>> On Jun 3, 2009, at 7:23 PM, Robert Kern wrote:
>>>
>>>> On Wed, Jun 3, 2009 at 18:20, Pierre GM wrote:
>>>>>
>>>>>
>>>>> Or, as all fields have the same dtype:
>>>>>
>>>>>  >>> a_array.view(dtype=('f',len(a_array.dtype)))
>>>>> array([[ 0.,  1.,  2.,  3.,  4.],
>>>>>        [ 1.,  2.,  3.,  4.,  5.]], dtype=float32)
>>>>>
>>>>> Ain't it fun ?
>>>>
>>>> Ah, yes, there is that niggle, too.
>>>
>>>
>>>
>>> Except that I always get bitten by that:
>>>
>>>  >>> backandforth =
>>> a_array.view(dtype=('f',len(a_array.dtype))).view(a_array.dtype)
>>>  >>> backandforth
>>> array([[(0.0, 1.0, 2.0, 3.0, 4.0)],
>>>        [(1.0, 2.0, 3.0, 4.0, 5.0)]],
>>>       dtype=[('a', '<f4'), ('b', '<f4'), ('c', '<f4'), ('d', '<f4'),
>>> ('e', '<f4')])
>>>  >>> backandforth.shape
>>> (2,1)
>>>
>>> We gained a dimension !
>>>
>>
>> I looked at the archives to my first discovery of views, for sorting
>> rows proposed by Pierre. In this case reshape was not necessary.
>>
>>>>> np.sort(np.array([[4.0, 1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0, 5.0]]).view(dt),0).view(float)
>> array([[ 1.,  2.,  3.,  4.,  5.],
>>        [ 4.,  1.,  2.,  3.,  4.]])
>>
>>>>> dt
>> [('a', '<f8'), ('b', '<f8'), ('c', '<f8'), ('d', '<f8'), ('e', '<f8')]
>>
>> looking closer, the extra dimension helps to maintain shape:
>>
>> direct construction of structured array
>>
>>>>> np.array([(0.0, 1.0, 2.0, 3.0, 4.0), (1.0, 2.0, 3.0, 4.0, 5.0)],dt)
>> array([(0.0, 1.0, 2.0, 3.0, 4.0), (1.0, 2.0, 3.0, 4.0, 5.0)],
>>      dtype=[('a', '<f8'), ('b', '<f8'), ('c', '<f8'), ('d', '<f8'),
>> ('e', '<f8')])
>>>>> np.array([(0.0, 1.0, 2.0, 3.0, 4.0), (1.0, 2.0, 3.0, 4.0, 5.0)],dt).shape
>> (2,)
>>
>> structured view on existing array is 2d
>>>>> np.array([(0.0, 1.0, 2.0, 3.0, 4.0), (1.0, 2.0, 3.0, 4.0, 5.0)]).view(dt).shape
>> (2, 1)
>>
>> view on view returns original shape,
>>>>> np.array([(0.0, 1.0, 2.0, 3.0, 4.0), (1.0, 2.0, 3.0, 4.0, 5.0)]).view(dt).view(float).shape
>> (2, 5)
>>
>> But sorting in between the two views also preserved original shape.
>> This was the source about my initial confusion about the necessity of
>> reshape.
>>
>
> here is a minimal example for 2d structured array:
>
>>>> dt = dtype=[('a', '<f8'), ('b', '<f8'), ('c', '<f8'), ('d', '<f8'), ('e', '<f8')]
>>>> ys = np.array([(0.0, 1.0, 2.0, 3.0, 4.0), (1.0, 2.0, 3.0, 4.0, 5.0)],dt)
>>>> ys.shape
> (2,)
>>>> ys.view(float)
> array([ 0.,  1.,  2.,  3.,  4.,  1.,  2.,  3.,  4.,  5.])
>>>> ys = ys.reshape((len(ys),1))
>>>> ys.shape
> (2, 1)
>>>> ys.view(float)
> array([[ 0.,  1.,  2.,  3.,  4.],
>        [ 1.,  2.,  3.,  4.,  5.]])
>

and one more as summary: reshape, change dtype, change array type:

>>> ys = np.array([(0.0, 1.0, 2.0, 3.0, 4.0), (1.0, 2.0, 3.0, 4.0, 5.0)],dt)
>>> ys.view(float, np.matrix)
matrix([[ 0.,  1.,  2.,  3.,  4.,  1.,  2.,  3.,  4.,  5.]])
>>> ys.view(float, np.matrix).mean(0)
matrix([[ 0.,  1.,  2.,  3.,  4.,  1.,  2.,  3.,  4.,  5.]])
>>> ys.reshape(-1,1).view(float, np.matrix)
matrix([[ 0.,  1.,  2.,  3.,  4.],
        [ 1.,  2.,  3.,  4.,  5.]])
>>> ys.reshape(-1,1).view(float, np.matrix).mean(0)
matrix([[ 0.5,  1.5,  2.5,  3.5,  4.5]])
>>> ys.view(float, np.matrix).reshape(-1,len(ys.dtype)).mean(0)
matrix([[ 0.5,  1.5,  2.5,  3.5,  4.5]])

The End

Josef

From josef.pktd at gmail.com  Wed Jun  3 20:45:10 2009
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Wed, 3 Jun 2009 20:45:10 -0400
Subject: [Numpy-discussion] extract elements of an array that are
	contained in another array?
In-Reply-To: 
References: 
Message-ID: <1cd32cbb0906031745m3a0fc5fbga61240f7d5c90161@mail.gmail.com>

On Wed, Jun 3, 2009 at 8:29 PM, Ning Sean wrote:
> Hi, I want to extract elements of an array (say, a) that are contained in
> another array (say, b). That is, if a=array([1,1,2,3,3,4]), b=array([1,4]),
> then I want array([1,1,4]).
>
> I did the following but the speed is very slow (maybe because a is very
> long):
>
> c=array([])
> for x in b:
>    c=append(c,a[a==x])
>
> any way to speed it up?
>
> Thanks!
> -Ning
>

It's waiting in Trac for inclusion in numpy
http://projects.scipy.org/numpy/ticket/1036
The current version only handles arrays with unique elements.

You can copy the ticket attachment, the version there is very fast.
> > Josef > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From cimrman3 at ntc.zcu.cz Thu Jun 4 05:48:50 2009 From: cimrman3 at ntc.zcu.cz (Robert Cimrman) Date: Thu, 04 Jun 2009 11:48:50 +0200 Subject: [Numpy-discussion] setmember1d_nu In-Reply-To: References: Message-ID: <4A279882.5090700@ntc.zcu.cz> Hi Neil, Neil Crighton wrote: > Hi all, > > I posted this message couple of days ago, but gmane grouped it with an old > thread and it hasn't shown up on the front page. So here it is again... > > I'd really like to see the setmember1d_nu function in ticket 1036 get into > numpy. There's a patch waiting for review that including tests: > > http://projects.scipy.org/numpy/ticket/1036 > > Is there anything I can do to help get it applied? I guess I could commit it, if you review the patch and it works for you. Obviously, I cannot review it myself, but my SVN access may still work :) r. From cournape at gmail.com Thu Jun 4 06:14:35 2009 From: cournape at gmail.com (David Cournapeau) Date: Thu, 4 Jun 2009 19:14:35 +0900 Subject: [Numpy-discussion] Problem with correlate In-Reply-To: References: <4A249D87.20207@ar.media.kyoto-u.ac.jp> <3d375d730906012054h66d749eat395f86452aa1bfcd@mail.gmail.com> <4A249E82.4030300@ar.media.kyoto-u.ac.jp> <5b8d13220906020336r58878a6fx298c9b7a900fcfc9@mail.gmail.com> <4A2505FB.9010602@ar.media.kyoto-u.ac.jp> Message-ID: <5b8d13220906040314o6bc5e4eeyc74f876f3440623b@mail.gmail.com> On Tue, Jun 2, 2009 at 10:56 PM, Ryan May wrote: > On Tue, Jun 2, 2009 at 5:59 AM, David Cournapeau > wrote: >> >> Robin wrote: >> > On Tue, Jun 2, 2009 at 11:36 AM, David Cournapeau >> > wrote: >> > >> >> Done in r7031 - correlate/PyArray_Correlate should be unchanged, and >> >> acorrelate/PyArray_Acorrelate implement the conventional definitions, >> >> >> > >> > I don't know if it's been discussed before but while people are >> > thinking about/changing correlate I thought I'd like to request as a >> > user a matlab style xcorr function (basically with the functionality >> > of the matlab version). >> > >> > I don't know if this is a deliberate emission, but it is often one of >> > the first things my colleagues try when I get them using Python, and >> > as far as I know there isn't really a good answer. There is xcorr in >> > pylab, but it isn't vectorised like xcorr from matlab... >> > >> >> There is one in the talkbox scikit: >> >> >> http://github.com/cournape/talkbox/blob/202135a9d848931ebd036b97302f1e10d7488c63/scikits/talkbox/tools/correlations.py >> >> It uses the fft, and bonus point, the file is independent of the rest of >> toolbox. There is another version which uses direct implementation (this >> is faster if you need only a few lags, and it takes less memory too). > > I'd be +1 on including something like this (provided it expanded to include > complex-valued data).? I think it's a real need, since everyone seems to > keep rolling their own.? I had to write my own just so that I can calculate > a few lags in a vectorized fashion. The code in talkbox is not good enough for scipy. I made an attempt for scipy.signal here: http://github.com/cournape/scipy3/blob/b004d17d824f1c03921d9663207ee40adadc5762/scipy/signal/correlations.py It is reasonably fast when only a few lags are needed, both double and complex double are supported, and it works on arbitrary axis and lags. 
Other precisions should be easy to add, but I think I need to extend the numpy code generators to support cython sources to avoid code duplication. Does that fill your need ? cheers, David From david at ar.media.kyoto-u.ac.jp Thu Jun 4 06:24:32 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Thu, 04 Jun 2009 19:24:32 +0900 Subject: [Numpy-discussion] Scipy 0.7.1rc1 released Message-ID: <4A27A0E0.1060903@ar.media.kyoto-u.ac.jp> Hi, The RC1 for 0.7.1 scipy release has just been tagged. This is a bug-only release, see below for the release notes. More information can also be found on the trac website: http://projects.scipy.org/scipy/milestone/0.7.1 Please test it ! The scipy developers -- ========================= SciPy 0.7.1 Release Notes ========================= .. contents:: SciPy 0.7.1 is a bug-fix release with no new features compared to 0.7.0. scipy.signal ============ Several memory leaks in lfilter have been fixed, and the support for array object has been fixed as well. scipy.sparse ============ scipy.io ======== Some performance regressions in 0.7.0 have been fixed in 0.7.1 (#882 and #885). Windows binaries for python 2.6 =============================== python 2.6 binaries for windows are now included. Python 2.6 binaries require numpy 1.3.0 or above, other binaries require numpy 1.2.0 or above. Universal build for scipy ========================= Mac OS X binary installer is now a universal build, and does not require gfortran to be installed. The binary requires numpy 1.2.0 or above and the python found on python.org. Checksums ========= 1632c340651ad097967921dd7539f0f0 release/installers/scipy-0.7.1rc1.zip 66d18c5557014ba0a839ca3c22c0f191 release/installers/scipy-0.7.1rc1-win32-superpack-python2.5.exe 64a007f88619c2ce75d8a2a7e558b4d4 release/installers/scipy-0.7.1rc1-win32-superpack-python2.6.exe be5697925454f2b5c9da0dd092fdcd03 release/installers/scipy-0.7.1rc1.tar.gz From dwf at cs.toronto.edu Thu Jun 4 07:19:59 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Thu, 4 Jun 2009 07:19:59 -0400 Subject: [Numpy-discussion] [RFR] NpzFile tweaks (Re: Making NpzFiles behave more like dictionaries.) In-Reply-To: References: <53EAD5CA-15B7-4DB2-91F1-C7795894ED12@cs.toronto.edu> Message-ID: <024DB614-9AF3-46EC-9A61-DC97A78CBB2C@cs.toronto.edu> On 3-Jun-09, at 5:01 PM, Pauli Virtanen wrote: > > Btw, are you able to change the status of the ticket to > "needs_review"? > I think this should be possible for everyone, and not restricted to > admins, but I'm not 100% sure... Sorry Pauli, seems I _don't_ have permission on the numpy trac to change ticket status. The radio button shows up but then it gives me a "Warning: No permission to change ticket fields." David From wierob83 at googlemail.com Thu Jun 4 08:19:33 2009 From: wierob83 at googlemail.com (wierob) Date: Thu, 04 Jun 2009 14:19:33 +0200 Subject: [Numpy-discussion] BigInteger equivalent in numpy Message-ID: <4A27BBD5.5010306@googlemail.com> Hi, is there a BigInteger equivalent in numpy? The largest integer type I wound was dtype int64. I'm using stats.linregress to perform a regression analysis. The return stderr was nan because stas.ss(...) returned a negative number due to an overflow. Setting dtype to int64 for my input data seems to fix this. But what if my data does not fit in int64? Since Python's long type can hold large data I tried to convert my input to long but it gets converted to int64 in numpy. 
kind regards
robert

From aisaac at american.edu  Thu Jun  4 08:23:43 2009
From: aisaac at american.edu (Alan G Isaac)
Date: Thu, 04 Jun 2009 08:23:43 -0400
Subject: [Numpy-discussion] extract elements of an array that are
	contained in another array?
In-Reply-To: <1cd32cbb0906031745m3a0fc5fbga61240f7d5c90161@mail.gmail.com>
References: <1cd32cbb0906031745m3a0fc5fbga61240f7d5c90161@mail.gmail.com>
Message-ID: <4A27BCCF.4090608@american.edu>

a[(a==b[:,None]).sum(axis=0,dtype=bool)]

hth,
Alan Isaac

From josef.pktd at gmail.com  Thu Jun  4 08:35:05 2009
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Thu, 4 Jun 2009 08:35:05 -0400
Subject: [Numpy-discussion] extract elements of an array that are
	contained in another array?
In-Reply-To: <4A27BCCF.4090608@american.edu>
References: <1cd32cbb0906031745m3a0fc5fbga61240f7d5c90161@mail.gmail.com>
	<4A27BCCF.4090608@american.edu>
Message-ID: <1cd32cbb0906040535m235a1e90oc52f21806238d5d7@mail.gmail.com>

On Thu, Jun 4, 2009 at 8:23 AM, Alan G Isaac wrote:
> a[(a==b[:,None]).sum(axis=0,dtype=bool)]

this is my preferred way when b is small and has unique elements.
if the elements in b are not unique, then b can be replaced by
np.unique(b)

If b is large this creates a huge intermediate array

The advantage of the new setmember1d_nu is that it handles large b
very efficiently. My try on it was more than 10 times slower than the
proposed solution for larger arrays.

Josef

> hth,
> Alan Isaac

From josef.pktd at gmail.com  Thu Jun  4 08:55:08 2009
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Thu, 4 Jun 2009 08:55:08 -0400
Subject: [Numpy-discussion] BigInteger equivalent in numpy
In-Reply-To: <4A27BBD5.5010306@googlemail.com>
References: <4A27BBD5.5010306@googlemail.com>
Message-ID: <1cd32cbb0906040555p6ad34e6i406b95a3626301c9@mail.gmail.com>

On Thu, Jun 4, 2009 at 8:19 AM, wierob wrote:
> Hi,
>
> is there a BigInteger equivalent in numpy? The largest integer type I
> found was dtype int64.
>
> I'm using stats.linregress to perform a regression analysis. The returned
> stderr was nan because stats.ss(...) returned a negative number due to an
> overflow. Setting dtype to int64 for my input data seems to fix this.
> But what if my data does not fit in int64?
>
> Since Python's long type can hold large data I tried to convert my input
> to long but it gets converted to int64 in numpy.
>

You could try to use floats. stats.ss does the calculation in the same
type as the input. If you convert your input data to floating point
you will not get an overflow, but floating point precision instead.

Note during the last bugfix, I also changed the implementation of
stats.linregress and now (0.7.1 and later) it doesn't use stats.ss
anymore, instead it uses np.cov which always uses floats.

Also, if you are using an older version there was a mistake in the
stderr calculations,
http://projects.scipy.org/scipy/ticket/874

Josef

> kind regards
> robert
In-Reply-To: <024DB614-9AF3-46EC-9A61-DC97A78CBB2C@cs.toronto.edu>
References: <53EAD5CA-15B7-4DB2-91F1-C7795894ED12@cs.toronto.edu>
	<024DB614-9AF3-46EC-9A61-DC97A78CBB2C@cs.toronto.edu>
Message-ID: <9457e7c80906040628u28783e33n8878325550662230@mail.gmail.com>

2009/6/4 David Warde-Farley :
> Sorry Pauli, seems I _don't_ have permission on the numpy trac to
> change ticket status. The radio button shows up but then it gives me a
> "Warning: No permission to change ticket fields."

Should be fixed.

Cheers
Stéfan

From D.P.Reichert at sms.ed.ac.uk  Thu Jun  4 09:36:38 2009
From: D.P.Reichert at sms.ed.ac.uk (David Paul Reichert)
Date: Thu, 04 Jun 2009 14:36:38 +0100
Subject: [Numpy-discussion] performance matrix multiplication vs. matlab
Message-ID: <20090604143638.xrqzx8w8v4ggckkk@www.sms.ed.ac.uk>

Hi all,

I would be glad if someone could help me with
the following issue:

From what I've read on the web it appears to me
that numpy should be about as fast as matlab. However,
when I do simple matrix multiplication, it consistently
appears to be about 5 times slower. I tested this using

A = 0.9 * numpy.matlib.ones((500,100))
B = 0.8 * numpy.matlib.ones((500,100))

def test():
    for i in range(1000):
        A*B.T

I also used ten times larger matrices with ten times fewer
iterations, used xrange instead of range, arrays instead
of matrices, and tested it on two different machines,
and the result always seems to be the same.

Any idea what could go wrong? I'm using ipython and
matlab R2008b.

Thanks,

David

-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.

From sebastian.walter at gmail.com  Thu Jun  4 10:02:03 2009
From: sebastian.walter at gmail.com (Sebastian Walter)
Date: Thu, 4 Jun 2009 16:02:03 +0200
Subject: [Numpy-discussion] performance matrix multiplication vs. matlab
In-Reply-To: <20090604143638.xrqzx8w8v4ggckkk@www.sms.ed.ac.uk>
References: <20090604143638.xrqzx8w8v4ggckkk@www.sms.ed.ac.uk>
Message-ID: 

Have a look at this thread:
http://www.mail-archive.com/numpy-discussion at scipy.org/msg13085.html

The speed difference is probably due to the fact that the matrix
multiplication does not call an optimized blas routine, e.g. the
ATLAS blas.

Sebastian

On Thu, Jun 4, 2009 at 3:36 PM, David Paul Reichert wrote:
> Hi all,
>
> I would be glad if someone could help me with
> the following issue:
>
> From what I've read on the web it appears to me
> that numpy should be about as fast as matlab. However,
> when I do simple matrix multiplication, it consistently
> appears to be about 5 times slower. I tested this using
>
> A = 0.9 * numpy.matlib.ones((500,100))
> B = 0.8 * numpy.matlib.ones((500,100))
>
> def test():
>     for i in range(1000):
>         A*B.T
>
> I also used ten times larger matrices with ten times fewer
> iterations, used xrange instead of range, arrays instead
> of matrices, and tested it on two different machines,
> and the result always seems to be the same.
>
> Any idea what could go wrong? I'm using ipython and
> matlab R2008b.
>
> Thanks,
>
> David
>
>
> --
> The University of Edinburgh is a charitable body, registered in
> Scotland, with registration number SC005336.
> > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From aisaac at american.edu Thu Jun 4 10:13:19 2009 From: aisaac at american.edu (Alan G Isaac) Date: Thu, 04 Jun 2009 10:13:19 -0400 Subject: [Numpy-discussion] extract elements of an array that are contained in another array? In-Reply-To: <1cd32cbb0906040535m235a1e90oc52f21806238d5d7@mail.gmail.com> References: <1cd32cbb0906031745m3a0fc5fbga61240f7d5c90161@mail.gmail.com> <4A27BCCF.4090608@american.edu> <1cd32cbb0906040535m235a1e90oc52f21806238d5d7@mail.gmail.com> Message-ID: <4A27D67F.3@american.edu> > On Thu, Jun 4, 2009 at 8:23 AM, Alan G Isaac wrote: >> a[(a==b[:,None]).sum(axis=0,dtype=bool)] On 6/4/2009 8:35 AM josef.pktd at gmail.com apparently wrote: > If b is large this creates a huge intermediate array True enough, but one could then use fromiter: setb = set(b) itr = (ai for ai in a if ai in setb) out = np.fromiter(itr, dtype=a.dtype) I suspect (?) that b would have to be pretty big relative to a for the repeated testing to be more costly than sorting a. Or if a stable order is not important (I don't recall if the OP specified), one could just np.intersect1d(a, np.unique(b)) On a different note, I think a name change is needed for your function. (Compare intersect1d_nu to see the potential confusion. And btw, what is the use case for intersect1d, which gives neither a set intersection nor a multiset intersection?) Cheers, Alan Isaac From rmay31 at gmail.com Thu Jun 4 10:25:16 2009 From: rmay31 at gmail.com (Ryan May) Date: Thu, 4 Jun 2009 09:25:16 -0500 Subject: [Numpy-discussion] Problem with correlate In-Reply-To: <5b8d13220906040314o6bc5e4eeyc74f876f3440623b@mail.gmail.com> References: <4A249D87.20207@ar.media.kyoto-u.ac.jp> <3d375d730906012054h66d749eat395f86452aa1bfcd@mail.gmail.com> <4A249E82.4030300@ar.media.kyoto-u.ac.jp> <5b8d13220906020336r58878a6fx298c9b7a900fcfc9@mail.gmail.com> <4A2505FB.9010602@ar.media.kyoto-u.ac.jp> <5b8d13220906040314o6bc5e4eeyc74f876f3440623b@mail.gmail.com> Message-ID: On Thu, Jun 4, 2009 at 5:14 AM, David Cournapeau wrote: > On Tue, Jun 2, 2009 at 10:56 PM, Ryan May wrote: > > On Tue, Jun 2, 2009 at 5:59 AM, David Cournapeau > > wrote: > >> > >> Robin wrote: > >> > On Tue, Jun 2, 2009 at 11:36 AM, David Cournapeau > > >> > wrote: > >> > > >> >> Done in r7031 - correlate/PyArray_Correlate should be unchanged, and > >> >> acorrelate/PyArray_Acorrelate implement the conventional definitions, > >> >> > >> > > >> > I don't know if it's been discussed before but while people are > >> > thinking about/changing correlate I thought I'd like to request as a > >> > user a matlab style xcorr function (basically with the functionality > >> > of the matlab version). > >> > > >> > I don't know if this is a deliberate emission, but it is often one of > >> > the first things my colleagues try when I get them using Python, and > >> > as far as I know there isn't really a good answer. There is xcorr in > >> > pylab, but it isn't vectorised like xcorr from matlab... > >> > > >> > >> There is one in the talkbox scikit: > >> > >> > >> > http://github.com/cournape/talkbox/blob/202135a9d848931ebd036b97302f1e10d7488c63/scikits/talkbox/tools/correlations.py > >> > >> It uses the fft, and bonus point, the file is independent of the rest of > >> toolbox. There is another version which uses direct implementation (this > >> is faster if you need only a few lags, and it takes less memory too). 
> > > > I'd be +1 on including something like this (provided it expanded to > include > > complex-valued data). I think it's a real need, since everyone seems to > > keep rolling their own. I had to write my own just so that I can > calculate > > a few lags in a vectorized fashion. > > The code in talkbox is not good enough for scipy. I made an attempt > for scipy.signal here: > > > http://github.com/cournape/scipy3/blob/b004d17d824f1c03921d9663207ee40adadc5762/scipy/signal/correlations.py > > It is reasonably fast when only a few lags are needed, both double and > complex double are supported, and it works on arbitrary axis and lags. > Other precisions should be easy to add, but I think I need to extend > the numpy code generators to support cython sources to avoid code > duplication. > > Does that fill your need ? It would fill mine. Would it make sense to make y default to x, so that you can use xcorr to do the autocorrelation as: xcorr(x) ? Ryan -- Ryan May Graduate Research Assistant School of Meteorology University of Oklahoma -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Thu Jun 4 10:50:18 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 4 Jun 2009 10:50:18 -0400 Subject: [Numpy-discussion] extract elements of an array that are contained in another array? In-Reply-To: <4A27D67F.3@american.edu> References: <1cd32cbb0906031745m3a0fc5fbga61240f7d5c90161@mail.gmail.com> <4A27BCCF.4090608@american.edu> <1cd32cbb0906040535m235a1e90oc52f21806238d5d7@mail.gmail.com> <4A27D67F.3@american.edu> Message-ID: <1cd32cbb0906040750x120abbe8g397eb57e734bdeb8@mail.gmail.com> On Thu, Jun 4, 2009 at 10:13 AM, Alan G Isaac wrote: >> On Thu, Jun 4, 2009 at 8:23 AM, Alan G Isaac wrote: >>> a[(a==b[:,None]).sum(axis=0,dtype=bool)] > > > On 6/4/2009 8:35 AM josef.pktd at gmail.com apparently wrote: >> If b is large this creates a huge intermediate array > > > True enough, but one could then use fromiter: > setb = set(b) > itr = (ai for ai in a if ai in setb) > out = np.fromiter(itr, dtype=a.dtype) > > I suspect (?) that b would have to be pretty > big relative to a for the repeated testing > to be more costly than sorting a. I didn't look at this case very closely for speed, setmember1d and setmember1d_nu return a boolean array, that can be used for indexing, not the actual elements. Your iterator is in python and could be pretty slow, but I only ran the performance script attached to the ticket and the speed differences for different ways of doing it were pretty big for large arrays. > > Or if a stable order is not important (I don't > recall if the OP specified), one could just > np.intersect1d(a, np.unique(b)) This requires that also `a` has only unique elements. intersect1d_nu doesn't require unique elements. > > On a different note, I think a name change > is needed for your function. (Compare > intersect1d_nu to see the potential > confusion. And btw, what is the use case > for intersect1d, which gives neither a > set intersection nor a multiset intersection?) intersect1d gives set intersection if both arrays have only unique elements (i.e. are sets). 
I thought the naming is pretty clear: intersect1d(a,b) set intersection if a and b with unique elements intersect1d_nu(a,b) set intersection if a and b with non-unique elements setmember1d(a,b) boolean index array for a of set intersection if a and b with unique elements setmember1d_nu(a,b) boolean index array for a of set intersection if a and b with non-unique elements The new docs http://docs.scipy.org/numpy/docs/numpy.lib.arraysetops.intersect1d/ are a bit clearer. However, I haven't used either of these functions much, and non of them are *my* functions. Of the arraysetops functions, I use unique1d most (because of the return index). I just keep track of these functions because of the use for categorical and dummy variables. Josef > > Cheers, > Alan Isaac > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From jrennie at gmail.com Thu Jun 4 11:06:28 2009 From: jrennie at gmail.com (Jason Rennie) Date: Thu, 4 Jun 2009 11:06:28 -0400 Subject: [Numpy-discussion] matrix default to column vector? In-Reply-To: References: <75c31b2a0905210610x1a321264r5b6f93d327ef2b36@mail.gmail.com> <23693116.post@talk.nabble.com> Message-ID: <75c31b2a0906040806r7a03bffl78012a1f33051f21@mail.gmail.com> Thanks for the responses. I did not realize that dot() would do matrix multiplication which was the main reason I was looking for a matrix-like class. Like you and Tom suggested, I think it's best to stick to arrays. Cheers, Jason On Sun, May 24, 2009 at 6:45 PM, David Warde-Farley wrote: > On 24-May-09, at 8:32 AM, Tom K. wrote: > > > Maybe my reluctance to work with matrices stems from this kind of > > inconsistency. It seems like your code has to be all matrix, or all > > array - > > and if you mix them, you need to be very careful about which is which. > > Also, functions called on things of type matrix may not return a > matrix as expected, but rather an array. > > Anecdotally, it seems to me that lots of people (myself included) seem > to go through a phase early in their use of NumPy where they try to > use matrix(), but most seem to end up switching to using 2D arrays for > all the aforementioned reasons. > > David > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Jason Rennie Research Scientist, ITA Software 617-714-2645 http://www.itasoftware.com/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Thu Jun 4 11:11:26 2009 From: matthew.brett at gmail.com (Matthew Brett) Date: Thu, 4 Jun 2009 11:11:26 -0400 Subject: [Numpy-discussion] Scipy 0.7.1rc1 released In-Reply-To: <4A27A0E0.1060903@ar.media.kyoto-u.ac.jp> References: <4A27A0E0.1060903@ar.media.kyoto-u.ac.jp> Message-ID: <1e2af89e0906040811g2671142j987922ac69ccd909@mail.gmail.com> Hi, > ? ?The RC1 for 0.7.1 scipy release has just been tagged. This is a > bug-only release I feel (y)our pain, but don't you mean 'bug-fix only release'? ;-) Matthew From aisaac at american.edu Thu Jun 4 11:12:17 2009 From: aisaac at american.edu (Alan G Isaac) Date: Thu, 04 Jun 2009 11:12:17 -0400 Subject: [Numpy-discussion] extract elements of an array that are contained in another array? 
In-Reply-To: <1cd32cbb0906040750x120abbe8g397eb57e734bdeb8@mail.gmail.com> References: <1cd32cbb0906031745m3a0fc5fbga61240f7d5c90161@mail.gmail.com> <4A27BCCF.4090608@american.edu> <1cd32cbb0906040535m235a1e90oc52f21806238d5d7@mail.gmail.com> <4A27D67F.3@american.edu> <1cd32cbb0906040750x120abbe8g397eb57e734bdeb8@mail.gmail.com> Message-ID: <4A27E451.1080007@american.edu> > On Thu, Jun 4, 2009 at 10:13 AM, Alan G Isaac wrote: >> Or if a stable order is not important (I don't >> recall if the OP specified), one could just >> np.intersect1d(a, np.unique(b)) On 6/4/2009 10:50 AM josef.pktd at gmail.com apparently wrote: > This requires that also `a` has only unique elements. > intersect1d_nu doesn't require unique elements. >>> a array([1, 1, 2, 3, 3, 4]) >>> b array([1, 4]) >>> np.intersect1d(a, np.unique(b)) array([1, 1, 3, 4]) (And thus my question about intersect1d...) Cheers, Alan From josef.pktd at gmail.com Thu Jun 4 11:19:19 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 4 Jun 2009 11:19:19 -0400 Subject: [Numpy-discussion] extract elements of an array that are contained in another array? In-Reply-To: <4A27E451.1080007@american.edu> References: <1cd32cbb0906031745m3a0fc5fbga61240f7d5c90161@mail.gmail.com> <4A27BCCF.4090608@american.edu> <1cd32cbb0906040535m235a1e90oc52f21806238d5d7@mail.gmail.com> <4A27D67F.3@american.edu> <1cd32cbb0906040750x120abbe8g397eb57e734bdeb8@mail.gmail.com> <4A27E451.1080007@american.edu> Message-ID: <1cd32cbb0906040819r5b11f1fes76d3c504f1d708e8@mail.gmail.com> On Thu, Jun 4, 2009 at 11:12 AM, Alan G Isaac wrote: >> On Thu, Jun 4, 2009 at 10:13 AM, Alan G Isaac wrote: >>> Or if a stable order is not important (I don't >>> recall if the OP specified), one could just >>> np.intersect1d(a, np.unique(b)) > > On 6/4/2009 10:50 AM josef.pktd at gmail.com apparently wrote: >> This requires that also `a` has only unique elements. >> intersect1d_nu doesn't require unique elements. > > >>>> a > array([1, 1, 2, 3, 3, 4]) >>>> b > array([1, 4]) >>>> np.intersect1d(a, np.unique(b)) > array([1, 1, 3, 4]) > > (And thus my question about intersect1d...) Yes, I know, and in my current numpy help file this is the only example there is, which is very misleading for its intended use. >>> a = np.array([1, 1, 2, 3, 3, 4]) >>> b = np.array([1, 4, 5]) >>> np.intersect1d(np.unique(a), np.unique(b)) array([1, 4]) >>> np.intersect1d_nu(a,b) array([1, 4]) Josef > > Cheers, > Alan > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From aisaac at american.edu Thu Jun 4 11:19:49 2009 From: aisaac at american.edu (Alan G Isaac) Date: Thu, 04 Jun 2009 11:19:49 -0400 Subject: [Numpy-discussion] extract elements of an array that are contained in another array? In-Reply-To: <1cd32cbb0906040750x120abbe8g397eb57e734bdeb8@mail.gmail.com> References: <1cd32cbb0906031745m3a0fc5fbga61240f7d5c90161@mail.gmail.com> <4A27BCCF.4090608@american.edu> <1cd32cbb0906040535m235a1e90oc52f21806238d5d7@mail.gmail.com> <4A27D67F.3@american.edu> <1cd32cbb0906040750x120abbe8g397eb57e734bdeb8@mail.gmail.com> Message-ID: <4A27E615.70604@american.edu> On 6/4/2009 10:50 AM josef.pktd at gmail.com apparently wrote: > intersect1d gives set intersection if both arrays have > only unique elements (i.e. are sets). 
I thought the > naming is pretty clear: > intersect1d(a,b) set intersection if a and b with unique elements > intersect1d_nu(a,b) set intersection if a and b with non-unique elements > setmember1d(a,b) boolean index array for a of set intersection if a > and b with unique elements > setmember1d_nu(a,b) boolean index array for a of set intersection if > a and b with non-unique elements >>> a array([1, 1, 2, 3, 3, 4]) >>> b array([1, 4, 4, 4]) >>> np.intersect1d_nu(a,b) array([1, 4]) That is, intersect1d_nu is the actual set intersection function. (I.e., intersect1d and intersect1d_nu would most naturally have swapped names.) That is why the appended _nu will not communicate what was intended. (I.e., setmember1d_nu will not be a match for intersect1d_nu.) Cheers, Alan Isaac From cimrman3 at ntc.zcu.cz Thu Jun 4 11:27:11 2009 From: cimrman3 at ntc.zcu.cz (Robert Cimrman) Date: Thu, 04 Jun 2009 17:27:11 +0200 Subject: [Numpy-discussion] extract elements of an array that are contained in another array? In-Reply-To: <4A27E615.70604@american.edu> References: <1cd32cbb0906031745m3a0fc5fbga61240f7d5c90161@mail.gmail.com> <4A27BCCF.4090608@american.edu> <1cd32cbb0906040535m235a1e90oc52f21806238d5d7@mail.gmail.com> <4A27D67F.3@american.edu> <1cd32cbb0906040750x120abbe8g397eb57e734bdeb8@mail.gmail.com> <4A27E615.70604@american.edu> Message-ID: <4A27E7CF.1080005@ntc.zcu.cz> Alan G Isaac wrote: > On 6/4/2009 10:50 AM josef.pktd at gmail.com apparently wrote: >> intersect1d gives set intersection if both arrays have >> only unique elements (i.e. are sets). I thought the >> naming is pretty clear: > >> intersect1d(a,b) set intersection if a and b with unique elements >> intersect1d_nu(a,b) set intersection if a and b with non-unique elements >> setmember1d(a,b) boolean index array for a of set intersection if a >> and b with unique elements >> setmember1d_nu(a,b) boolean index array for a of set intersection if >> a and b with non-unique elements > > >>>> a > array([1, 1, 2, 3, 3, 4]) >>>> b > array([1, 4, 4, 4]) >>>> np.intersect1d_nu(a,b) > array([1, 4]) > > That is, intersect1d_nu is the actual set intersection > function. (I.e., intersect1d and intersect1d_nu would most > naturally have swapped names.) That is why the appended _nu > will not communicate what was intended. (I.e., > setmember1d_nu will not be a match for intersect1d_nu.) The naming should express this: intersect1d expects its arguments are sets, intersect1d_nu does not. A set has unique elements by definition. cheers, r. From josef.pktd at gmail.com Thu Jun 4 11:29:56 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 4 Jun 2009 11:29:56 -0400 Subject: [Numpy-discussion] extract elements of an array that are contained in another array? In-Reply-To: <4A27E615.70604@american.edu> References: <1cd32cbb0906031745m3a0fc5fbga61240f7d5c90161@mail.gmail.com> <4A27BCCF.4090608@american.edu> <1cd32cbb0906040535m235a1e90oc52f21806238d5d7@mail.gmail.com> <4A27D67F.3@american.edu> <1cd32cbb0906040750x120abbe8g397eb57e734bdeb8@mail.gmail.com> <4A27E615.70604@american.edu> Message-ID: <1cd32cbb0906040829n74f01778s5ef37894e1b5e081@mail.gmail.com> On Thu, Jun 4, 2009 at 11:19 AM, Alan G Isaac wrote: > On 6/4/2009 10:50 AM josef.pktd at gmail.com apparently wrote: >> intersect1d gives set intersection if both arrays have >> only unique elements (i.e. are sets). ?I thought the >> naming is pretty clear: > >> intersect1d(a,b) ? set intersection if a and b with unique elements >> intersect1d_nu(a,b) ? 
set intersection if a and b with non-unique elements
>> setmember1d(a,b)  boolean index array for a of set intersection if a
>> and b with unique elements
>> setmember1d_nu(a,b)  boolean index array for a of set intersection if
>> a and b with non-unique elements
>
>
>>>> a
> array([1, 1, 2, 3, 3, 4])
>>>> b
> array([1, 4, 4, 4])
>>>> np.intersect1d_nu(a,b)
> array([1, 4])
>
> That is, intersect1d_nu is the actual set intersection
> function.  (I.e., intersect1d and intersect1d_nu would most
> naturally have swapped names.)  That is why the appended _nu
> will not communicate what was intended.  (I.e.,
> setmember1d_nu will not be a match for intersect1d_nu.)

intersect1d is the intersection between sets (which are stored as
arrays); just like in the mathematical definition, the two sets only
have unique elements.

intersect1d_nu is the intersection between two arrays which can have
repeated elements. The result is a set, i.e. unique elements, stored
as an array.

same for setmember1d, setmember1d_nu

so postfix `_nu` only means that this function also works if the two
arrays are not really sets, i.e. are not required to have unique
elements to make sense.

intersect1d should throw a domain error if you give it arrays with
non-unique elements, which is not done for speed reasons.

> Cheers,
> Alan Isaac

From josef.pktd at gmail.com  Thu Jun  4 11:30:30 2009
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Thu, 4 Jun 2009 11:30:30 -0400
Subject: [Numpy-discussion] Calculations with mixed type structured arrays
Message-ID: <1cd32cbb0906040830q37678538h507d017883b1dcc3@mail.gmail.com>

After yesterday's discussion, I wanted to see if views of structured
arrays with mixed type can be easily used. Is the following useful for
the numpy user guide?

Josef

Calculations with mixed type structured arrays
----------------------------------------------

>>> import numpy as np

The following array has two integer and three float columns

>>> dt = np.dtype([('a', '<i4'), ('b', '<i4'), ('x', '<f8'), ('y', '<f8'), ('z', '<f8')])
>>> xs = np.ones(3,dt)
>>> print xs.shape
(3,)
>>> print repr(xs)
array([(1, 1, 1.0, 1.0, 1.0), (1, 1, 1.0, 1.0, 1.0),
       (1, 1, 1.0, 1.0, 1.0)],
      dtype=[('a', '<i4'), ('b', '<i4'), ('x', '<f8'), ('y', '<f8'), ('z', '<f8')])
>>> print xs.view(float)
[  2.12199579e-314   1.00000000e+000   1.00000000e+000   1.00000000e+000
   2.12199579e-314   1.00000000e+000   1.00000000e+000   1.00000000e+000
   2.12199579e-314   1.00000000e+000   1.00000000e+000   1.00000000e+000]
>>> print xs.view(float).shape
(12,)
>>> dt0 = np.dtype([('a', '<i4'), ('x', '<f8'), ('y', '<f8'), ('z', '<f8')])
>>> np.ones(3,dt0).view(float)
Traceback (most recent call last):
ValueError: new type not compatible with array.

However, we can construct a new dtype that creates views on the
integer part and the float part separately

>>> dt2 = np.dtype([('A', '<i4', 2), ('B', '<f8', 3)])
>>> print repr(xs.view(dt2))
array([([1, 1], [1.0, 1.0, 1.0]), ([1, 1], [1.0, 1.0, 1.0]),
       ([1, 1], [1.0, 1.0, 1.0])],
      dtype=[('A', '<i4', 2), ('B', '<f8', 3)])
>>> print xs.view(dt2)['B'].mean(0)
[ 1.  1.  1.]
>>> print xs.view(dt2)['A'].mean(0)
[ 1.  1.]

We can also assign new names to the two views and calculate (almost)
as if they were regular arrays. The new variables are still only a
view on the original memory. If we change them, then the original
structured array changes as well:

>>> xva = xs.view(dt2)['A']
>>> xvb = xs.view(dt2)['B']
>>> xva *= range(1,3)
>>> xvb[:,:] = xvb*range(1,4)
>>> print xs
[(1, 2, 1.0, 2.0, 3.0) (1, 2, 1.0, 2.0, 3.0) (1, 2, 1.0, 2.0, 3.0)]
>>> print xva.mean(0)
[ 1.  2.]
>>> print xvb.mean(0)
[ 1.  2.  3.]
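As a quick sanity check that these really are views and not copies,
here is a minimal sketch (it assumes the dt and dt2 definitions above,
with 'x' as the first float field; note that np.may_share_memory only
gives a conservative answer):

>>> import numpy as np
>>> dt = np.dtype([('a', '<i4'), ('b', '<i4'), ('x', '<f8'), ('y', '<f8'), ('z', '<f8')])
>>> dt2 = np.dtype([('A', '<i4', 2), ('B', '<f8', 3)])
>>> xs = np.ones(3, dt)
>>> xvb = xs.view(dt2)['B']        # the float part, shape (3, 3)
>>> np.may_share_memory(xs, xvb)
True
>>> xvb[0, 0] = 99.0               # writing through the view...
>>> xs['x'][0]                     # ...changes the structured array
99.0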
From stefan at sun.ac.za Thu Jun 4 11:34:59 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Thu, 4 Jun 2009 17:34:59 +0200 Subject: [Numpy-discussion] Scipy 0.7.1rc1 released In-Reply-To: <1e2af89e0906040811g2671142j987922ac69ccd909@mail.gmail.com> References: <4A27A0E0.1060903@ar.media.kyoto-u.ac.jp> <1e2af89e0906040811g2671142j987922ac69ccd909@mail.gmail.com> Message-ID: <9457e7c80906040834q71b8ea8lacf9aa25ce1ce6de@mail.gmail.com> 2009/6/4 Matthew Brett : >> ? ?The RC1 for 0.7.1 scipy release has just been tagged. This is a >> bug-only release > > I feel (y)our pain, but don't you mean 'bug-fix only release'? ;-) Thanks, guys! You made my weekend :-) Cheers St?fan From dwf at cs.toronto.edu Thu Jun 4 11:45:35 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Thu, 4 Jun 2009 11:45:35 -0400 Subject: [Numpy-discussion] [RFR] NpzFile tweaks (Re: Making NpzFiles behave more like dictionaries.) In-Reply-To: <9457e7c80906040628u28783e33n8878325550662230@mail.gmail.com> References: <53EAD5CA-15B7-4DB2-91F1-C7795894ED12@cs.toronto.edu> <024DB614-9AF3-46EC-9A61-DC97A78CBB2C@cs.toronto.edu> <9457e7c80906040628u28783e33n8878325550662230@mail.gmail.com> Message-ID: <7F77CB01-5A91-4762-8978-1CA37324583C@cs.toronto.edu> On 4-Jun-09, at 9:28 AM, St?fan van der Walt wrote: > 2009/6/4 David Warde-Farley : >> Sorry Pauli, seems I _don't_ have permission on the numpy trac to >> change ticket status. The radio button shows up but then it gives >> me a >> "Warning: No permission to change ticket fields." > > Should be fixed. Thanks Stefan. David From kwgoodman at gmail.com Thu Jun 4 11:46:06 2009 From: kwgoodman at gmail.com (Keith Goodman) Date: Thu, 4 Jun 2009 08:46:06 -0700 Subject: [Numpy-discussion] matrix default to column vector? In-Reply-To: References: <75c31b2a0905210610x1a321264r5b6f93d327ef2b36@mail.gmail.com> <23693116.post@talk.nabble.com> Message-ID: On Sun, May 24, 2009 at 3:45 PM, David Warde-Farley wrote: > Anecdotally, it seems to me that lots of people (myself included) seem > to go through a phase early in their use of NumPy where they try to > use matrix(), but most seem to end up switching to using 2D arrays for > all the aforementioned reasons. Maybe announcing that numpy will drop support for matrices in a future version (3.0, ...) would save a lot of pain in the long run. From aisaac at american.edu Thu Jun 4 12:01:49 2009 From: aisaac at american.edu (Alan G Isaac) Date: Thu, 04 Jun 2009 12:01:49 -0400 Subject: [Numpy-discussion] matrix default to column vector? In-Reply-To: References: <75c31b2a0905210610x1a321264r5b6f93d327ef2b36@mail.gmail.com> <23693116.post@talk.nabble.com> Message-ID: <4A27EFED.4010508@american.edu> > On Sun, May 24, 2009 at 3:45 PM, David Warde-Farley wrote: >> Anecdotally, it seems to me that lots of people (myself included) seem >> to go through a phase early in their use of NumPy where they try to >> use matrix(), but most seem to end up switching to using 2D arrays for >> all the aforementioned reasons. On 6/4/2009 11:46 AM Keith Goodman apparently wrote: > Maybe announcing that numpy will drop support for matrices in a future > version (3.0, ...) would save a lot of pain in the long run. Only if you want NumPy to be unusable when teaching basic linear algebra. I believe that would damage the speed at which NumPy use spreads. 
Alan Isaac From zelbier at gmail.com Thu Jun 4 12:08:02 2009 From: zelbier at gmail.com (Olivier Verdier) Date: Thu, 4 Jun 2009 18:08:02 +0200 Subject: [Numpy-discussion] matrix default to column vector? In-Reply-To: <4A27EFED.4010508@american.edu> References: <75c31b2a0905210610x1a321264r5b6f93d327ef2b36@mail.gmail.com> <23693116.post@talk.nabble.com> <4A27EFED.4010508@american.edu> Message-ID: I really don't see any advantage of matrices over arrays for teaching. I prefer to teach linear algebra with arrays. I would also like matrices to disappear from numpy. But then one would need a new implementation of scipy.sparse, which is (very unfortunately) matrix-based at the moment. == Olivier 2009/6/4 Alan G Isaac > > On Sun, May 24, 2009 at 3:45 PM, David Warde-Farley > wrote: > >> Anecdotally, it seems to me that lots of people (myself included) seem > >> to go through a phase early in their use of NumPy where they try to > >> use matrix(), but most seem to end up switching to using 2D arrays for > >> all the aforementioned reasons. > > > On 6/4/2009 11:46 AM Keith Goodman apparently wrote: > > Maybe announcing that numpy will drop support for matrices in a future > > version (3.0, ...) would save a lot of pain in the long run. > > > Only if you want NumPy to be unusable when teaching > basic linear algebra. I believe that would damage > the speed at which NumPy use spreads. > > Alan Isaac > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From aisaac at american.edu Thu Jun 4 12:32:33 2009 From: aisaac at american.edu (Alan G Isaac) Date: Thu, 04 Jun 2009 12:32:33 -0400 Subject: [Numpy-discussion] extract elements of an array that are contained in another array? In-Reply-To: <1cd32cbb0906040829n74f01778s5ef37894e1b5e081@mail.gmail.com> References: <1cd32cbb0906031745m3a0fc5fbga61240f7d5c90161@mail.gmail.com> <4A27BCCF.4090608@american.edu> <1cd32cbb0906040535m235a1e90oc52f21806238d5d7@mail.gmail.com> <4A27D67F.3@american.edu> <1cd32cbb0906040750x120abbe8g397eb57e734bdeb8@mail.gmail.com> <4A27E615.70604@american.edu> <1cd32cbb0906040829n74f01778s5ef37894e1b5e081@mail.gmail.com> Message-ID: <4A27F721.2020306@american.edu> On 6/4/2009 11:29 AM josef.pktd at gmail.com apparently wrote: > intersect1d is the intersection between sets (which are stored as > arrays), just like in the mathematical definition the two sets only > have unique elements Hmmm. OK, I see you and Robert believe this. But it does not match the documentation. But indeed, I see that the documentation is incorrect. E.g., >>> np.intersect1d([1,1,2,3,3,4],[1,4]) array([1, 1, 3, 4]) Is this a bug or a documentation bug? > intersect1d_nu is the intersection between two arrays which can have > repeated elements. The result is a set, i.e. unique elements, stored > as an array > same for setmember1d, setmember1d_nu I cannot understand this. Following your proposed reasoning, I expect a[setmember1d_nu(a,b)] to return the same as intersect1d_nu(a, b). It does not. > so postfix `_nu` only means that this function also works > if the two arrays are not really sets But that just begs the question: what does 'works' mean? See my previous comment (above). 
> > intersect1d should throw a domain error if you give it arrays with
> > non-unique elements, which is not done for speed reasons

*If* intersect1d behaved *exactly* as documented,
the example
intersect1d(a, np.unique(b))
shows that the documented behavior can be useful.
And indeed, this would be the match to
a[setmember1d_nu(a,b)]

Cheers,
Alan Isaac

From aisaac at american.edu Thu Jun 4 12:42:10 2009
From: aisaac at american.edu (Alan G Isaac)
Date: Thu, 04 Jun 2009 12:42:10 -0400
Subject: [Numpy-discussion] matrix default to column vector?
In-Reply-To:
References: <75c31b2a0905210610x1a321264r5b6f93d327ef2b36@mail.gmail.com> <23693116.post@talk.nabble.com> <4A27EFED.4010508@american.edu>
Message-ID: <4A27F962.2040201@american.edu>

On 6/4/2009 12:08 PM Olivier Verdier apparently wrote:
> I really don't see any advantage of matrices over arrays for teaching. I
> prefer to teach linear algebra with arrays.

beta = (X.T*X).I * X.T * Y

beta = np.dot(np.dot(la.inv(np.dot(X.T,X)),X.T),Y)

I rest my case.
I would have to switch back to GAUSS for teaching
if NumPy discarded matrices.

Cheers,
Alan Isaac

From ndbecker2 at gmail.com Thu Jun 4 12:58:44 2009
From: ndbecker2 at gmail.com (Neal Becker)
Date: Thu, 04 Jun 2009 12:58:44 -0400
Subject: [Numpy-discussion] Speaking of fft code
Message-ID:

Has this been considered as a candidate for our fft?

http://sourceforge.net/projects/kissfft

From josef.pktd at gmail.com Thu Jun 4 13:27:25 2009
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Thu, 4 Jun 2009 13:27:25 -0400
Subject: [Numpy-discussion] extract elements of an array that are contained in another array?
In-Reply-To: <4A27F721.2020306@american.edu>
References: <1cd32cbb0906031745m3a0fc5fbga61240f7d5c90161@mail.gmail.com> <4A27BCCF.4090608@american.edu> <1cd32cbb0906040535m235a1e90oc52f21806238d5d7@mail.gmail.com> <4A27D67F.3@american.edu> <1cd32cbb0906040750x120abbe8g397eb57e734bdeb8@mail.gmail.com> <4A27E615.70604@american.edu> <1cd32cbb0906040829n74f01778s5ef37894e1b5e081@mail.gmail.com> <4A27F721.2020306@american.edu>
Message-ID: <1cd32cbb0906041027q1fbe6122p4cc6eef14cb6d107@mail.gmail.com>

On Thu, Jun 4, 2009 at 12:32 PM, Alan G Isaac wrote:
> On 6/4/2009 11:29 AM josef.pktd at gmail.com apparently wrote:
>> intersect1d is the intersection between sets (which are stored as
>> arrays), just like in the mathematical definition the two sets only
>> have unique elements
>
> Hmmm. OK, I see you and Robert believe this.
> But it does not match the documentation.
> But indeed, I see that the documentation is incorrect.
> E.g.,
>
>>>> np.intersect1d([1,1,2,3,3,4],[1,4])
> array([1, 1, 3, 4])
>
> Is this a bug or a documentation bug?
>
>> intersect1d_nu is the intersection between two arrays which can have
>> repeated elements. The result is a set, i.e. unique elements, stored
>> as an array
>
>> same for setmember1d, setmember1d_nu
>
> I cannot understand this.
> Following your proposed reasoning,
> I expect a[setmember1d_nu(a,b)]
> to return the same as
> intersect1d_nu(a, b).
> It does not.

I don't have setmember1d_nu available right now, but from my reading
we should have

intersect1d_nu(a, b) == np.unique(a[setmember1d_nu(a,b)])

>> so postfix `_nu` only means that this function also works
>> if the two arrays are not really sets
>
> But that just begs the question: what does 'works' mean?
> See my previous comment (above).
>> intersect1d should throw a domain error if you give it arrays with
>> non-unique elements, which is not done for speed reasons
>
> *If* intersect1d behaved *exactly* as documented,
> the example
> intersect1d(a, np.unique(b))
> shows that the documented behavior can be useful.
> And indeed, this would be the match to
> a[setmember1d_nu(a,b)]

I don't know if anyone looked at the behavior for "unintended" usage

intersect1d rearranges, sorts

>>> np.intersect1d([4,1,3,3],[3,4])
array([3, 3, 4])

but it gives you the correct multiplicity

>>> np.intersect1d([4,4,4,1,3,3],np.unique([3,4,3,0]))
array([3, 3, 4, 4, 4])

so I guess, we have

np.intersect1d([4,4,4,1,3,3], np.unique([3,4,3,0])) == np.sort(a[setmember1d_nu(a,b)])

for the example from the help file I don't find any meaningful interpretation

>>> np.intersect1d([1,3,3],[3,1,1])
array([1, 1, 3, 3])

wrong answer

>>> np.setmember1d([4,1,1,3,3],[3,4])
array([ True, True, False, True, True], dtype=bool)

Note: there are two versions of the docs for np.intersect1d, the
currently published docs which describe the actual behavior (for the
non-unique case), and the new docs on the doc editor
http://docs.scipy.org/numpy/docs/numpy.lib.arraysetops.intersect1d/
that describe the "intended" usage of the functions, which also
corresponds closer to the original source docstring
(http://docs.scipy.org/numpy/docs/numpy.lib.arraysetops.intersect1d/?revision=-227).
that's my interpretation

If you think that functions make sense also for the "unintended"
usage, then you could add an example to the new docs.

Josef

From robert.kern at gmail.com Thu Jun 4 13:35:24 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Thu, 4 Jun 2009 12:35:24 -0500
Subject: [Numpy-discussion] Speaking of fft code
In-Reply-To:
References:
Message-ID: <3d375d730906041035o3f33436dq3ae4ca80c6ef8e2c@mail.gmail.com>

On Thu, Jun 4, 2009 at 11:58, Neal Becker wrote:
> Has this been considered as a candidate for our fft?
>
> http://sourceforge.net/projects/kissfft

No. What would be the advantage of moving to Kiss FFT to offset the cost?

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
-- Umberto Eco

From arokem at berkeley.edu Thu Jun 4 14:03:36 2009
From: arokem at berkeley.edu (Ariel Rokem)
Date: Thu, 4 Jun 2009 11:03:36 -0700
Subject: [Numpy-discussion] ANNOUNCE: ETS 3.2.0 Released
In-Reply-To: <49C7F530.5020700@enthought.com>
References: <49C7F530.5020700@enthought.com>
Message-ID: <43958ee60906041103i28c66328g4bdc4ee4dd1327c3@mail.gmail.com>

Hi - thanks for all the good work on this! I have been using an older
version of the ETS, which I got when I installed the EPD (4.0.30001) and
now I have finally gotten around to trying to update my ETS to this
version.

I have a question - how do I go about uninstalling my previous version of
the ETS? A more general question to anyone - what's the right way of
uninstalling any old python package? In the past, I have been advised to
go in and run "rm -rf" on the directory inside my site-packages directory
(and remove stuff from "easy-install.pth"?), but in the case of ETS, I am
not even sure which directories in there belong to ETS. Is there some
elegant way of uninstalling python packages?

TIA -- Ariel

On Mon, Mar 23, 2009 at 1:46 PM, Dave Peterson wrote:

> Hello,
>
> I'm pleased to announce that Enthought Tool Suite (ETS) version 3.2.0
> has been tagged and released!
> Source distributions (.tar.gz) have been uploaded to PyPi, and Windows
> binaries will follow shortly. A full install of ETS can be done using
> Setuptools via a command like:
>
> easy_install -U "ets[nonets] >= 3.2.0"
>
> NOTE 1: Users of an old ETS release will need to first uninstall prior
> to installing the new ETS.
>
> NOTE 2: If you get a 'SandboxViolation' error, simply re-run the command
> again -- it may take multiple invocations to get everything installed.
> (This error appears to be a long-standing incompatibility between
> numpy.distutils and setuptools.)
>
> Please see below for a list of what's new in this release.
>
>
> What Is ETS?
> ===========
>
> The Enthought Tool Suite (ETS) is a collection of components developed
> by Enthought and the open-source community, which we use every day to
> construct custom scientific applications. It includes a wide variety of
> components, including:
> * an extensible application framework
> * application building blocks
> * 2-D and 3-D graphics libraries
> * scientific and math libraries
> * developer tools
> The cornerstone on which these tools rest is the Traits package, which
> provides explicit type declarations in Python; its features include
> initialization, validation, delegation, notification, and visualization
> of typed attributes.
>
> More information on ETS is available from the development home page:
> http://code.enthought.com/projects/index.php
>
>
> Changelog
> =========
>
> ETS 3.2.0 is a feature-added update to ETS 3.1.0, including numerous
> bug-fixes. Some of the notable changes include:
>
> Chaco
> -----
>
> * Domain limits - Mappers now can declare the "limits" of their valid
> domain. PanTool and ZoomTool respect these limits. (pwang)
>
> * Adding "hide_grids" parameter to Plot.img_plot() and
> Plot.contour_plot() so users can override the default behavior of hiding
> grids. (pwang)
>
> * Refactored examples to declare a Demo object so they can be run
> with the demo.py example launcher. (vibha)
>
> * Adding chaco.overlays package with some canned SVG overlays. (bhendrix)
>
> * DragZoom now can scale both X and Y axes independently corresponding
> to the mouse cursor motion along the X and Y axes (similar to the zoom
> behavior in Matplotlib). (pwang)
>
> * New Examples:
> * world map (bhendrix)
> * more financial plots (pwang)
> * scatter_toggle (pwang)
> * stacked_axis (pwang)
>
> * Fixing the chaco.scales TimeFormatter to use the built-in localtime()
> instead of the one in the safetime.py module due to Daylight Savings
> Time issues with timedelta. (r23231, pwang)
>
> * Improved behavior of ScatterPlot when it doesn't get the type of
> metadata it expects in its "selections" and "selection_masks" metadata
> keys (r23121, pwang)
>
> * Setting the .range2d attribute on GridMapper now properly sets the two
> DataRange1D instances of its sub-mappers.
> (r23119, pwang)
>
> * ScatterPlot.map_index() now respects the index_only flag (r23060, pwang)
>
> * Fixed occasional traceback/bug in LinePlot that occurred when data was
> completely outside the visible range (r23059, pwang)
>
> * Implementing is_in() on legends to account for padding and alignment
> (caused by tools that move the legend) (r23052, bhendrix)
>
> * Legend behaves properly when there are no plots to display (r23012,
> judah)
>
> * Fixed LogScale in the chaco.scales package to correctly handle the
> case when the length of the interval is less than a decade (r22907,
> warren.weckesser)
>
> * Fixed traceback when calling copy_traits() on a DataView (r22894, vibha)
>
> * Scatter plots generated by Plot.plot() now properly use the "auto"
> coloring feature of Plot. (r22727, pwang)
>
> * Reduced the size of screenshots in the user manual. (r22720, rkern)
>
>
> Mayavi
> ------
>
> * 17, 18 March, 2009 (PR):
> * NEW: A simple example to show how one can use TVTK's visual module
> with mlab. [23250]
> * BUG: The size trait was being overridden and was different from the
> parent causing a bug with resizing the viewer. [23243]
>
> * 15 March, 2009 (GV):
> * ENH: Add a volume factory to mlab that knows how to set color, vmin
> and vmax for the volume module [23221].
>
> * 14 March, 2009 (PR):
> * API/TEST: Added a new testing entry point: 'mayavi -t' now runs tests
> in separate process, for isolation. Added enthought.mayavi.api.test to
> allow for simple testing from the interpreter [23195]...[23200],
> [23213], [23214], [23223].
> * BUG: The volume module was directly importing the wx_gradient_editor
> leading to an import error when no wxPython is available. This has been
> tested and fixed. Thanks to Christoph Bohme for reporting this issue.
> [23191]
>
> * 14 March, 2009 (GV):
> * BUG: [mlab]: fix positioning for titles [23194], and opacity for
> titles and text [23193].
> * ENH: Add the mlab_source attribute on all objects created by mlab,
> when possible [23201], [23209].
> * ENH: Add a message to help the first-time user, using the new banner
> feature of the IPython shell view [23208].
>
> * 13 March, 2009 (PR):
> * NEW/API: Adding a powerful TCP/UDP server for scripting mayavi via the
> network. This is available in enthought.mayavi.tools.server and is fully
> documented. It uses twisted and currently only works with wxPython. It
> is completely insecure though since it allows a remote user to do
> practically anything from mayavi.
>
> * 13 March, 2009 (GV)
> * API: rename mlab.orientationaxes to mlab.orientation_axes [23184]
>
> * 11 March, 2009 (GV)
> * API: Expose 'traverse' in mlab.pipeline [23181]
>
> * 10 March, 2009 (PR)
> * BUG: Fixed a subtle bug that affected the ImagePlaneWidget. This
> happened because the scalar_type of the output data from the
> VTKDataSource was not being set correctly. Getting the range of any
> input scalars also seems to silence warnings from VTK. This should
> hopefully fix issues with the use of the IPW with multiple scalars. I've
> added two tests for this, one is an integration test since those errors
> really show up only when the display is used. The other is a traditional
> unittest. [23166]
>
> * 08 March, 2009 (GV)
> * ENH: Raises an error when the user passes to mlab an array with
> infinite values [23150]
>
> * 07 March, 2009 (PR)
> * BUG: A subtle bug with a really gross error in the GridPlane
> component, I was using the extents when I should really have been
> looking at the dimensions.
> The extract grid filter was also not flushing
> the data changes downstream leading to errors that are also fixed now.
> These errors would manifest when you use an ExtractGrid to select a VOI
> or a sample rate and then used a grid plane down stream causing very
> weird and incorrect rendering of the grid plane (thanks to conflation of
> extents and dimensions). This bug was seen at NAL for a while and also
> reported by Fred with a nice CME. The CME was then converted to a nice
> unittest by Suyog and then improved. Thanks to them all. [23146]
>
> * 28 February, 2009 (PR)
> * BUG: Fixed some issues reported by Ondrej Certik regarding the use of
> mlab.options.offscreen, mlab.options.backend = 'test', removed cruft
> from earlier 'null' backend, fixed bug with incorrect imports,
> add_dataset set no longer adds one new null engine each time
> figure=False is passed, added test case for the options.backend test.
> [23088]
>
> * 23 February, 2009 (PR)
> * ENH: Updating show so that it supports a stop keyword argument that
> pops up a little UI that lets the user stop the mainloop temporarily and
> continue using Python [23049]
>
> * 21 February, 2009 (GV)
> * ENH: Add a richer view for the pipeline to the MayaviScene [23035]
> * ENH: Add safeguards to capture wrong triangle array sizes in
> mlab.triangular_mesh_source. [23037]
>
> * 21 February, 2009 (PR)
> * ENH: Making the transform data filter recordable. [23033]
> * NEW: A simple animator class to make it relatively easy to create
> animations. [23036] [23039]
>
> * 20 February, 2009 (PR)
> * ENH: Added readers for various image file formats, poly data readers
> and unstructured grid readers. These include DICOM, GESigna, DEM,
> MetaImage (mha,mhd) MINC, AVSucd, GAMBIT, Exodus, STL, Points, Particle,
> PLY, PDB, SLC, OBJ, Facet and BYU files. Also added several tests for
> most of this functionality along with small data files. These are
> additions from PR's project staff, Suyog Jain and Sreekanth Ravindran.
> [23013]
> * ENH: We now change the default so the ImagePlaneWidget does not
> control the LUT. Also made the IPW recordable. [23011]
>
> * 18 February, 2009 (GV)
> * ENH: Add a preference manager view for editing preferences outside
> envisage [22998]
>
> * 08 February, 2009 (GV)
> * ENH: Center the glyphs created by barchart on the data points, as
> mentioned by Rauli Ruohonen [22906]
>
> * 29 January, 2009 (GV)
> * ENH: Make it possible to avoid redraws with mlab by using
> mlab.gcf().scene.disable_render = True [22869]
>
> * 28 January, 2009 (PR and GV)
> * ENH: Make the mlab.pipeline.user_defined factory function usable to
> add arbitrary filters on the pipeline. [22867], [22865]
>
> * 11 January, 2009 (GV)
> * ENH: Make mlab.imshow use the ImageActor. Enhance the ImageActor to
> map scalars to colors when needed. [22816]
>
>
> Traits
> ------
>
> * Fixed a bug whereby faulty error handling in the PyProtocols Pyrex
> speedup code keeps references to tracebacks that have been handled. In
> so doing, clean up the same code such that it can be used with a modern
> Pyrex release (a bare raise can no longer be used outside of an except:
> clause).
>
> * RangeEditor factory now supports a 'logslider' mode: Thanks to Matthew
> Turk for the patch
>
> * TabularEditor factory now supports editing of all columns: Thanks to
> Didrik Pinte for the patch
>
> * DateEditor factory in 'custom' style now supports multi-select feature.
>
> * DateEditor and TimeEditor now support the 'readonly' style.
> * Fixed a bug in the ArrayEditor factory that was causing multiple trait
> change events to get fired when the underlying array is changed
> externally to the editor: Thanks to Matthew Turk for the patch.
>
> * Fixed a circular import error in Color, Font and RGBColor traits
>
> * Fixed a bug in the factory for ArrayViewEditor so it now calls the
> toolkit backend-specific editor
>
>
> TraitsBackendWX
> ---------------
>
> * RangeEditor now supports a 'logslider' mode: Thanks to Matthew Turk
> for the patch
>
> * TabularEditor now supports editing of all columns: Thanks to Didrik
> Pinte for the patch
>
> * DateEditor in 'custom' style now supports multi-select feature.
>
> * DateEditor and TimeEditor now support the 'readonly' style.
>
> * Added a trait to the wx pyface workbench View to indicate if the view
> dock window should be closeable.
>
> * Fixed the DirectoryEditor to popup the correct file dialog (thanks to
> Luca Fasano and Phil Thompson)
>
> * Fixed a circular import error in Color, Font and RGBColor traits
>
> * Fixed a bug in the ColorEditor that was causing the revert action to
> not work correctly.
>
> * Fixed a bug that caused a traceback when trying to undock a pyface
> dock window
>
> * Fixed a bug in the 'livemodal' view that caused the UI to become
> unresponsive if the 'updated' event was fired on the contained view.
>
> * Fixed bugs in ListEditor (notebook style) that caused a loss of sync
> between the 'selected' trait and the activated dock window.
>
>
> TraitsBackendQt
> ---------------
>
> * RangeEditor now supports a 'logslider' mode: Thanks to Matthew Turk
> for the patch
>
> * Fixed the DirectoryEditor to popup the correct file dialog (thanks to
> Luca Fasano and Phil Thompson)
>
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>

--
Ariel Rokem
Helen Wills Neuroscience Institute
University of California, Berkeley
http://argentum.ucbso.berkeley.edu/ariel
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ndbecker2 at gmail.com Thu Jun 4 14:30:02 2009
From: ndbecker2 at gmail.com (Neal Becker)
Date: Thu, 04 Jun 2009 14:30:02 -0400
Subject: [Numpy-discussion] Speaking of fft code
References: <3d375d730906041035o3f33436dq3ae4ca80c6ef8e2c@mail.gmail.com>
Message-ID:

Robert Kern wrote:

> On Thu, Jun 4, 2009 at 11:58, Neal Becker wrote:
>> Has this been considered as a candidate for our fft?
>>
>> http://sourceforge.net/projects/kissfft
>
> No. What would be the advantage of moving to Kiss FFT to offset the cost?

I was reading this:

http://listengine.tuxfamily.org/lists.tuxfamily.org/eigen/2009/05/msg00181.html

From aisaac at american.edu Thu Jun 4 14:58:32 2009
From: aisaac at american.edu (Alan G Isaac)
Date: Thu, 04 Jun 2009 14:58:32 -0400
Subject: [Numpy-discussion] extract elements of an array that are contained in another array?
In-Reply-To: <1cd32cbb0906041027q1fbe6122p4cc6eef14cb6d107@mail.gmail.com>
References: <1cd32cbb0906031745m3a0fc5fbga61240f7d5c90161@mail.gmail.com> <4A27BCCF.4090608@american.edu> <1cd32cbb0906040535m235a1e90oc52f21806238d5d7@mail.gmail.com> <4A27D67F.3@american.edu> <1cd32cbb0906040750x120abbe8g397eb57e734bdeb8@mail.gmail.com> <4A27E615.70604@american.edu> <1cd32cbb0906040829n74f01778s5ef37894e1b5e081@mail.gmail.com> <4A27F721.2020306@american.edu> <1cd32cbb0906041027q1fbe6122p4cc6eef14cb6d107@mail.gmail.com>
Message-ID: <4A281958.6060208@american.edu>

On 6/4/2009 1:27 PM josef.pktd at gmail.com apparently wrote:
> Note: there are two versions of the docs for np.intersect1d, the
> currently published docs which describe the actual behavior (for the
> non-unique case), and the new docs on the doc editor
> http://docs.scipy.org/numpy/docs/numpy.lib.arraysetops.intersect1d/
> that describe the "intended" usage of the functions, which also
> corresponds closer to the original source docstring
> (http://docs.scipy.org/numpy/docs/numpy.lib.arraysetops.intersect1d/?revision=-227
> ). that's my interpretation

Again, the distributed docs do *not* describe the actual
behavior for the non-unique case. E.g.,

>>> np.intersect1d([1,1,2,3,3,4], [1,4])
array([1, 1, 3, 4])

Might this be a better example of
failure than the one in the doc editor?

However the doc editor version states that the function
fails for the non-unique case, so it seems there was a
documentation bug that is in the process of being fixed.

Thanks,
Alan

From fperez.net at gmail.com Thu Jun 4 15:12:40 2009
From: fperez.net at gmail.com (Fernando Perez)
Date: Thu, 4 Jun 2009 12:12:40 -0700
Subject: [Numpy-discussion] field names on numpy arrays
In-Reply-To: <9457e7c80906031151r5d007a3dye89f38482be41a4c@mail.gmail.com>
References: <23852413.post@talk.nabble.com> <9457e7c80906031151r5d007a3dye89f38482be41a4c@mail.gmail.com>
Message-ID:

Howdy,

2009/6/3 Stéfan van der Walt :
>> however i seem to lose simple operations such as multiplication (a_array*2)
>> or powers (a_array**2).
>
> As a workaround, you can have two views on your data:

I was thinking about this yesterday, because I'm dealing with exactly
this same problem in a local project. How hard would it be to allow
structured arrays to support ufuncs/arithmetic for the case where
their dtype is actually a composite of the same 'native' ones?

Basically, for cases where one could manually make a flattened view
using the 'base' dtype, do the operation, and repack the output into
the structured dtype, perhaps numpy could support it natively?

I can see there being a small cost to this, but the check could be
very fast, because the information on whether a given dtype is in fact
a composite of identical native ones or not can be stored statically,
at construction time, in the dtype object itself. That way, at
runtime, the ufunc/arithmetic machinery only needs to check this
static flag, and if it's true, it can take the (slightly slower) path
of (flatten, operate, repack) instead of bailing out with an error as it
does today.

It seems to me that this would make structured arrays even more
useful and powerful than they already are, at minimal performance
cost (and without knowing for sure, my intuition tells me that the
implementation isn't all that bad, though I could be wrong).
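To make the flatten-operate-repack idea concrete, here is a minimal sketch of the manual version that already works today, assuming a structured dtype whose fields are all float64 (the field names and sizes below are invented for the example):

import numpy as np

dt = np.dtype([('x', np.float64), ('y', np.float64), ('z', np.float64)])
a = np.zeros(4, dtype=dt)
a['x'] = 1.0

# flatten: view the same memory as a plain (4, 3) float array
flat = a.view(np.float64).reshape(len(a), -1)
# operate: ordinary ufuncs/arithmetic work on the flat view
flat *= 2.0
# repack: nothing to do -- flat is a view, so a already sees the result
print a['x']   # [ 2.  2.  2.  2.]

What the proposal above would add is essentially this same round trip, done implicitly by the ufunc machinery whenever the dtype's fields are homogeneous.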
Cheers, f

From robert.kern at gmail.com Thu Jun 4 15:14:21 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Thu, 4 Jun 2009 14:14:21 -0500
Subject: [Numpy-discussion] Speaking of fft code
In-Reply-To:
References: <3d375d730906041035o3f33436dq3ae4ca80c6ef8e2c@mail.gmail.com>
Message-ID: <3d375d730906041214w2a9597ddob90de12adea87707@mail.gmail.com>

On Thu, Jun 4, 2009 at 13:30, Neal Becker wrote:
> Robert Kern wrote:
>
>> On Thu, Jun 4, 2009 at 11:58, Neal Becker wrote:
>>> Has this been considered as a candidate for our fft?
>>>
>>> http://sourceforge.net/projects/kissfft
>>
>> No. What would be the advantage of moving to Kiss FFT to offset the cost?
>
> I was reading this:
>
> http://listengine.tuxfamily.org/lists.tuxfamily.org/eigen/2009/05/msg00181.html

I don't see anything there that's particularly compelling over the
FFTPACK implementation we currently use.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
-- Umberto Eco

From pgmdevlist at gmail.com Thu Jun 4 15:28:46 2009
From: pgmdevlist at gmail.com (Pierre GM)
Date: Thu, 4 Jun 2009 15:28:46 -0400
Subject: [Numpy-discussion] field names on numpy arrays
In-Reply-To:
References: <23852413.post@talk.nabble.com> <9457e7c80906031151r5d007a3dye89f38482be41a4c@mail.gmail.com>
Message-ID: <9C0536CB-B4C4-4644-BE70-FE0EDCF4F55E@gmail.com>

On Jun 4, 2009, at 3:12 PM, Fernando Perez wrote:
> Howdy,
> I was thinking about this yesterday, because I'm dealing with exactly
> this same problem in a local project. How hard would it be to allow
> structured arrays to support ufuncs/arithmetic for the case where
> their dtype is actually a composite of the same 'native' ones?
> Basically, for cases where one could manually make a flattened view
> using the 'base' dtype, do the operation, and repack the output into
> the structured dtype, perhaps numpy could support it natively?

I foresee serious disturbance in the force... When I use structured
arrays, each field usually represents a different variable, and I may
not be keen on having the same operation applied to all variables. At
least, the current behavior (raise an exception) forces me to think
twice.

Now, I'm not saying that it could not be useful in some cases. For
example, I may want to have a (n,12) array of monthly data, with one
field for each month. Then, having the possibility to apply the same
operation to all fields would be great. I still think it's a
particular case.

> I can see there being a small cost to this, but the check could be
> very fast, because the information on whether a given dtype is in fact
> a composite of identical native ones or not can be stored statically,
> at construction time, in the dtype object itself. That way, at
> runtime, the ufunc/arithmetic machinery only needs to check this
> static flag, and if it's true, it can take the (slightly slower) path
> of (flatten, operate, repack) instead of bailing out with an error as it
> does today.

What about the case where you multiply a 1D structured array with an
nD array? What should you have?

From josef.pktd at gmail.com Thu Jun 4 16:14:59 2009
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Thu, 4 Jun 2009 16:14:59 -0400
Subject: [Numpy-discussion] extract elements of an array that are contained in another array?
In-Reply-To: <4A281958.6060208@american.edu>
References: <4A27BCCF.4090608@american.edu> <1cd32cbb0906040535m235a1e90oc52f21806238d5d7@mail.gmail.com> <4A27D67F.3@american.edu> <1cd32cbb0906040750x120abbe8g397eb57e734bdeb8@mail.gmail.com> <4A27E615.70604@american.edu> <1cd32cbb0906040829n74f01778s5ef37894e1b5e081@mail.gmail.com> <4A27F721.2020306@american.edu> <1cd32cbb0906041027q1fbe6122p4cc6eef14cb6d107@mail.gmail.com> <4A281958.6060208@american.edu>
Message-ID: <1cd32cbb0906041314t44a09438sbd1dc7add2b63892@mail.gmail.com>

On Thu, Jun 4, 2009 at 2:58 PM, Alan G Isaac wrote:
> On 6/4/2009 1:27 PM josef.pktd at gmail.com apparently wrote:
>> Note: there are two versions of the docs for np.intersect1d, the
>> currently published docs which describe the actual behavior (for the
>> non-unique case), and the new docs on the doc editor
>> http://docs.scipy.org/numpy/docs/numpy.lib.arraysetops.intersect1d/
>> that describe the "intended" usage of the functions, which also
>> corresponds closer to the original source docstring
>> (http://docs.scipy.org/numpy/docs/numpy.lib.arraysetops.intersect1d/?revision=-227
>> ). that's my interpretation
>
> Again, the distributed docs do *not* describe the actual
> behavior for the non-unique case. E.g.,
>
>>>> np.intersect1d([1,1,2,3,3,4], [1,4])
> array([1, 1, 3, 4])
>
> Might this be a better example of
> failure than the one in the doc editor?

Thanks, that's a very clear example of a wrong answer,
and it removes the question whether the function makes any sense for
the non-unique case.
I changed the example in the doc editor to this one.

It will hopefully be merged with the source at the next update.

Josef

>
> However the doc editor version states that the function
> fails for the non-unique case, so it seems there was a
> documentation bug that is in the process of being fixed.

Yes

>
> Thanks,
> Alan
>
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>

From gael.varoquaux at normalesup.org Thu Jun 4 16:16:34 2009
From: gael.varoquaux at normalesup.org (Gael Varoquaux)
Date: Thu, 4 Jun 2009 22:16:34 +0200
Subject: [Numpy-discussion] ANNOUNCE: ETS 3.2.0 Released
In-Reply-To: <43958ee60906041103i28c66328g4bdc4ee4dd1327c3@mail.gmail.com>
References: <49C7F530.5020700@enthought.com> <43958ee60906041103i28c66328g4bdc4ee4dd1327c3@mail.gmail.com>
Message-ID: <20090604201634.GB17131@phare.normalesup.org>

On Thu, Jun 04, 2009 at 11:03:36AM -0700, Ariel Rokem wrote:
> I have a question - how do I go about uninstalling my previous version of
> the ETS? A more general question to anyone - what's the right way of
> uninstalling any old python package? In the past, I have been advised to
> go in and run "rm -rf" on the directory inside my site-packages directory
> (and remove stuff from "easy-install.pth"?),

That's the right way to do it.

> but in the case of ETS, I am not even sure which directories in
> there belong to ETS.

This is a real problem that has not been addressed so far.

> Is there some elegant way of uninstalling python packages?

Not when they have been installed with setuptools, AFAIK. You need to
keep track of what dependency installed what. A proposal is out to
implement uninstall:
http://tarekziade.wordpress.com/2009/05/10/distutils-state/
However, I am not sure how this will address the concern of installing
a chain of dependencies brought by one package.
Gaël

From pav at iki.fi Thu Jun 4 16:17:42 2009
From: pav at iki.fi (Pauli Virtanen)
Date: Thu, 4 Jun 2009 20:17:42 +0000 (UTC)
Subject: [Numpy-discussion] Scipy 0.7.1rc1 released
References: <4A27A0E0.1060903@ar.media.kyoto-u.ac.jp>
Message-ID:

Thu, 04 Jun 2009 19:24:32 +0900, David Cournapeau wrote:
[clip]
> =========================
> SciPy 0.7.1 Release Notes
> =========================
>
> .. contents::
>
> SciPy 0.7.1 is a bug-fix release with no new features compared to 0.7.0.

scipy.special
=============

Several bugs of varying severity were fixed in the special functions:

- #503, #640: iv: problems at large arguments fixed by new implementation
- #623: jv: fix errors at large arguments
- #679: struve: fix wrong output for v < 0
- #803: pbdv produces invalid output
- #804: lqmn: fix crashes on some input
- #823: betainc: fix documentation
- #834: exp1 strange behavior near negative integer values
- #852: jn_zeros: more accurate results for large s, also in jnp/yn/ynp_zeros
- #853: jv, yv, iv: invalid results for non-integer v < 0, complex x
- #854: jv, yv, iv, kv: return nan more consistently when out-of-domain
- #927: ellipj: fix segfault on Windows
- ive, jve, yve, kv, kve: with real-valued input, return nan for
  out-of-domain instead of returning only the real part of the result.

Also, when ``scipy.special.errprint(1)`` has been enabled, warning
messages are now issued as Python warnings instead of printing them to
stderr.

***

I added this to 0.7.1-notes.rst.

-- Pauli Virtanen

From slaunger at gmail.com Thu Jun 4 16:27:11 2009
From: slaunger at gmail.com (Kim Hansen)
Date: Thu, 4 Jun 2009 22:27:11 +0200
Subject: [Numpy-discussion] extract elements of an array that are contained in another array?
In-Reply-To: <4A281958.6060208@american.edu>
References: <4A27BCCF.4090608@american.edu> <1cd32cbb0906040535m235a1e90oc52f21806238d5d7@mail.gmail.com> <4A27D67F.3@american.edu> <1cd32cbb0906040750x120abbe8g397eb57e734bdeb8@mail.gmail.com> <4A27E615.70604@american.edu> <1cd32cbb0906040829n74f01778s5ef37894e1b5e081@mail.gmail.com> <4A27F721.2020306@american.edu> <1cd32cbb0906041027q1fbe6122p4cc6eef14cb6d107@mail.gmail.com> <4A281958.6060208@american.edu>
Message-ID:

Concerning the name setmember1d_nu, I personally find it quite verbose
and not the name I would expect as a non-insider coming to numpy and
not knowing all the names of the more special hidden-away functions
and not being a python-wiz either.

I think ain(a,b) would be the name I had expected as an array
equivalent of "a in b" (just as arange is the array version of range),
or I would have anticipated that an ndarray object would have an
"in(b)" or "in_iterable(b)" method, such that you could do a.in(b)
which would return a boolean array of the same shape as a with
elements true if the equivalent a members were members in the iterable
b.

When I had a problem where I needed this function, I could not find
anything near that, and after looking around and also asking here I
got some hints to use the ....1d functions, which gave me the idea to
implement the few-line, very simple proposal for "a in b", which is
now the proposal under review as the new function setmember1d_nu(a,b).

Whereas I see this function name is in line with the existing
functions, I really think the names are non-intuitive.
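A rough sketch of what such an ain(a, b) could look like, built only from pieces that already exist in numpy (the name ain and the searchsorted-based implementation are purely illustrative, not the setmember1d_nu code under review):

import numpy as np

def ain(a, b):
    # boolean array, same shape as a: True where the element occurs in b
    a = np.asarray(a)
    b = np.unique(b)                     # sorted unique values of b
    if b.size == 0:
        return np.zeros(a.shape, dtype=bool)
    idx = np.searchsorted(b, a.ravel())  # where each element would land in b
    idx[idx == b.size] = 0               # clamp indices that fall off the end
    return (b[idx] == a.ravel()).reshape(a.shape)

For the non-unique inputs that came up earlier in the thread, this gives
the expected mask:

>>> ain([4, 1, 1, 3, 3], [3, 4])
array([ True, False, False,  True,  True], dtype=bool)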
I would therefore propose that it also be aliased to a more intuitive
name such as ain(a,b) or perhaps better a.in(b)

Again, I am probably missing some important points here as a
non-experienced Python programmer and numpy user, I am just trying to
give some input from the beginner's point-of-view, if that can be of
any help.

Thank you,

Kim

From gael.varoquaux at normalesup.org Thu Jun 4 16:30:19 2009
From: gael.varoquaux at normalesup.org (Gael Varoquaux)
Date: Thu, 4 Jun 2009 22:30:19 +0200
Subject: [Numpy-discussion] extract elements of an array that are contained in another array?
In-Reply-To:
References: <4A27BCCF.4090608@american.edu> <1cd32cbb0906040535m235a1e90oc52f21806238d5d7@mail.gmail.com> <4A27D67F.3@american.edu> <1cd32cbb0906040750x120abbe8g397eb57e734bdeb8@mail.gmail.com> <4A27E615.70604@american.edu> <1cd32cbb0906040829n74f01778s5ef37894e1b5e081@mail.gmail.com> <4A27F721.2020306@american.edu> <1cd32cbb0906041027q1fbe6122p4cc6eef14cb6d107@mail.gmail.com> <4A281958.6060208@american.edu>
Message-ID: <20090604203019.GC17131@phare.normalesup.org>

On Thu, Jun 04, 2009 at 10:27:11PM +0200, Kim Hansen wrote:
> "in(b)" or "in_iterable(b)" method, such that you could do a.in(b)
> which would return a boolean array of the same shape as a with
> elements true if the equivalent a members were members in the iterable
> b.

That would really be what I would be looking for.

Gaël

From peridot.faceted at gmail.com Thu Jun 4 16:38:40 2009
From: peridot.faceted at gmail.com (Anne Archibald)
Date: Thu, 4 Jun 2009 16:38:40 -0400
Subject: [Numpy-discussion] extract elements of an array that are contained in another array?
In-Reply-To: <1cd32cbb0906040829n74f01778s5ef37894e1b5e081@mail.gmail.com>
References: <1cd32cbb0906031745m3a0fc5fbga61240f7d5c90161@mail.gmail.com> <4A27BCCF.4090608@american.edu> <1cd32cbb0906040535m235a1e90oc52f21806238d5d7@mail.gmail.com> <4A27D67F.3@american.edu> <1cd32cbb0906040750x120abbe8g397eb57e734bdeb8@mail.gmail.com> <4A27E615.70604@american.edu> <1cd32cbb0906040829n74f01778s5ef37894e1b5e081@mail.gmail.com>
Message-ID:

2009/6/4 :

> intersect1d should throw a domain error if you give it arrays with
> non-unique elements, which is not done for speed reasons

It seems to me that this is the basic source of the problem. Perhaps
this can be addressed? I realize maintaining compatibility with the
current behaviour is necessary, so how about a multistage deprecation:

1. add a keyword argument to intersect1d "assume_unique"; if it is not
present, check for uniqueness and emit a warning if not unique
2. change the warning to an exception
Optionally:
3. change the meaning of the function to that of intersect1d_nu if the
keyword argument is not present

One could do something similar with setmember1d.

This would remove the pitfall of the 1d assumption and the wart of the
_nu names without hampering performance for people who know they have
unique arrays and are in a hurry.

Anne

From josef.pktd at gmail.com Thu Jun 4 16:43:39 2009
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Thu, 4 Jun 2009 16:43:39 -0400
Subject: [Numpy-discussion] extract elements of an array that are contained in another array?
In-Reply-To: <20090604203019.GC17131@phare.normalesup.org>
References: <4A27BCCF.4090608@american.edu> <4A27D67F.3@american.edu> <1cd32cbb0906040750x120abbe8g397eb57e734bdeb8@mail.gmail.com> <4A27E615.70604@american.edu> <1cd32cbb0906040829n74f01778s5ef37894e1b5e081@mail.gmail.com> <4A27F721.2020306@american.edu> <1cd32cbb0906041027q1fbe6122p4cc6eef14cb6d107@mail.gmail.com> <4A281958.6060208@american.edu> <20090604203019.GC17131@phare.normalesup.org>
Message-ID: <1cd32cbb0906041343x42a0a438mef44bef1ef87f612@mail.gmail.com>

On Thu, Jun 4, 2009 at 4:30 PM, Gael Varoquaux wrote:
> On Thu, Jun 04, 2009 at 10:27:11PM +0200, Kim Hansen wrote:
>> "in(b)" or "in_iterable(b)" method, such that you could do a.in(b)
>> which would return a boolean array of the same shape as a with
>> elements true if the equivalent a members were members in the iterable
>> b.
>
> That would really be what I would be looking for.

Just using "in" might promise more than it does, e.g. it works only for
one dimensional arrays, maybe "in1d". With "in", I would expect a
generic function as in python that works with many array types and
dimensions. (But I haven't checked whether it would work with a 1d
structured array or object array.)

I found arraysetops because of unique1d, but I didn't figure out what
the subpackage really does, because I was reading "arrayse-tops"
instead of "array-set-ops".

BTW, for the docs, I haven't found a counter example where
np.setdiff1d gives the wrong answer for non-unique arrays.

Josef

From gael.varoquaux at normalesup.org Thu Jun 4 16:52:00 2009
From: gael.varoquaux at normalesup.org (Gael Varoquaux)
Date: Thu, 4 Jun 2009 22:52:00 +0200
Subject: [Numpy-discussion] extract elements of an array that are contained in another array?
In-Reply-To: <1cd32cbb0906041343x42a0a438mef44bef1ef87f612@mail.gmail.com>
References: <4A27D67F.3@american.edu> <1cd32cbb0906040750x120abbe8g397eb57e734bdeb8@mail.gmail.com> <4A27E615.70604@american.edu> <1cd32cbb0906040829n74f01778s5ef37894e1b5e081@mail.gmail.com> <4A27F721.2020306@american.edu> <1cd32cbb0906041027q1fbe6122p4cc6eef14cb6d107@mail.gmail.com> <4A281958.6060208@american.edu> <20090604203019.GC17131@phare.normalesup.org> <1cd32cbb0906041343x42a0a438mef44bef1ef87f612@mail.gmail.com>
Message-ID: <20090604205200.GD17131@phare.normalesup.org>

On Thu, Jun 04, 2009 at 04:43:39PM -0400, josef.pktd at gmail.com wrote:
> Just using "in" might promise more than it does, e.g. it works only for
> one dimensional arrays, maybe "in1d". With "in",

Then 'in_1d'

> I found arraysetops because of unique1d, but I didn't figure out what
> the subpackage really does, because I was reading "arrayse-tops"
> instead of "array-set-ops"

That's why I push people to use more underscores. IMHO PEP8 lacks a push
for underscores.

Gaël

From sccolbert at gmail.com Thu Jun 4 16:54:03 2009
From: sccolbert at gmail.com (Chris Colbert)
Date: Thu, 4 Jun 2009 16:54:03 -0400
Subject: [Numpy-discussion] performance matrix multiplication vs. matlab
In-Reply-To:
References: <20090604143638.xrqzx8w8v4ggckkk@www.sms.ed.ac.uk>
Message-ID: <7f014ea60906041354h6b7a3949n4c45731172483c6d@mail.gmail.com>

Sebastian is right.

Since Matlab r2007 (I think that's the version) it has included support for
multi-core architecture. On my core2 Quad here at the office, r2008b has no
problem utilizing 100% cpu for large matrix multiplications.
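A quick way to check what a given numpy build is actually linked against, and to time the multiplication itself, is sketched below (np.show_config() and timeit are standard library pieces; the 500x100 shapes just mirror the test from the original post in this thread):

import numpy as np
np.show_config()   # prints the BLAS/LAPACK sections numpy was compiled against

import timeit
setup = "import numpy as np; A = 0.9*np.ones((500,100)); B = 0.8*np.ones((500,100))"
print timeit.Timer("np.dot(A, B.T)", setup).timeit(number=100)

If the show_config() output lists ATLAS (or MKL) libraries, dot() goes through the optimized BLAS; if not, it falls back to numpy's unoptimized built-in routine.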
If you download and build atlas and lapack from source and enable parallel
threads in atlas, then compile numpy against these libraries, you should
achieve similar if not better performance (since the atlas routines will be
tuned to your system).

If you're on Windows, you need to do some trickery to get threading to work
(the instructions are on the atlas website).

Chris

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From sccolbert at gmail.com Thu Jun 4 16:56:48 2009
From: sccolbert at gmail.com (Chris Colbert)
Date: Thu, 4 Jun 2009 16:56:48 -0400
Subject: [Numpy-discussion] performance matrix multiplication vs. matlab
In-Reply-To: <7f014ea60906041354h6b7a3949n4c45731172483c6d@mail.gmail.com>
References: <20090604143638.xrqzx8w8v4ggckkk@www.sms.ed.ac.uk> <7f014ea60906041354h6b7a3949n4c45731172483c6d@mail.gmail.com>
Message-ID: <7f014ea60906041356p19a53c2bi4258d6a6b93ef367@mail.gmail.com>

I should update after reading the thread Sebastian linked:

The current 1.3 version of numpy (don't know about previous versions) uses
the optimized Atlas BLAS routines for numpy.dot() if numpy was compiled
with these libraries. I've verified this on linux only, though it shouldn't
be any different on windows AFAIK.

chris

On Thu, Jun 4, 2009 at 4:54 PM, Chris Colbert wrote:

> Sebastian is right.
>
> Since Matlab r2007 (I think that's the version) it has included support for
> multi-core architecture. On my core2 Quad here at the office, r2008b has no
> problem utilizing 100% cpu for large matrix multiplications.
>
>
> If you download and build atlas and lapack from source and enable parallel
> threads in atlas, then compile numpy against these libraries, you should
> achieve similar if not better performance (since the atlas routines will be
> tuned to your system).
>
> If you're on Windows, you need to do some trickery to get threading to work
> (the instructions are on the atlas website).
>
> Chris
>
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From peridot.faceted at gmail.com Thu Jun 4 17:03:25 2009
From: peridot.faceted at gmail.com (Anne Archibald)
Date: Thu, 4 Jun 2009 17:03:25 -0400
Subject: [Numpy-discussion] performance matrix multiplication vs. matlab
In-Reply-To: <20090604143638.xrqzx8w8v4ggckkk@www.sms.ed.ac.uk>
References: <20090604143638.xrqzx8w8v4ggckkk@www.sms.ed.ac.uk>
Message-ID:

2009/6/4 David Paul Reichert :
> Hi all,
>
> I would be glad if someone could help me with
> the following issue:
>
> From what I've read on the web it appears to me
> that numpy should be about as fast as matlab. However,
> when I do simple matrix multiplication, it consistently
> appears to be about 5 times slower. I tested this using
>
> A = 0.9 * numpy.matlib.ones((500,100))
> B = 0.8 * numpy.matlib.ones((500,100))
>
> def test():
>     for i in range(1000):
>         A*B.T
>
> I also used ten times larger matrices with ten times fewer
> iterations, used xrange instead of range, arrays instead
> of matrices, and tested it on two different machines,
> and the result always seems to be the same.
>
> Any idea what could go wrong? I'm using ipython and
> matlab R2008b.

Apart from the implementation issues people have chimed in about
already, it's worth noting that the speed of matrix multiplication
depends on the memory layout of the matrices. So generating B instead
directly as a 100 by 500 matrix might affect the speed substantially
(I'm not sure in which direction).
If MATLAB's matrices have a different memory order, that might be a
factor as well.

Anne

> Thanks,
>
> David
>
>
> --
> The University of Edinburgh is a charitable body, registered in
> Scotland, with registration number SC005336.
>
>
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>

From Chris.Barker at noaa.gov Thu Jun 4 17:25:41 2009
From: Chris.Barker at noaa.gov (Christopher Barker)
Date: Thu, 04 Jun 2009 14:25:41 -0700
Subject: [Numpy-discussion] matrix default to column vector?
In-Reply-To:
References: <75c31b2a0905210610x1a321264r5b6f93d327ef2b36@mail.gmail.com> <23693116.post@talk.nabble.com>
Message-ID: <4A283BD5.4080405@noaa.gov>

Keith Goodman wrote:
> Maybe announcing that numpy will drop support for matrices in a future
> version (3.0, ...) would save a lot of pain in the long run.

Or make them better. There was a pretty good discussion of this a while
back on this list. We all had a lot of opinions, and there were some
good ideas in that thread. However, no one stepped up to implement any
of it.

I think the reason is that none of the core numpy developers use
them/want them. In fact, many of those contributing to the discussion
(myself included), didn't think it likely that they'd use them, even
with improvements.

Someone that thinks they are important needs to step up and really make
them work.

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception

Chris.Barker at noaa.gov

From tgrav at mac.com Thu Jun 4 17:27:47 2009
From: tgrav at mac.com (Tommy Grav)
Date: Thu, 04 Jun 2009 17:27:47 -0400
Subject: [Numpy-discussion] matrix default to column vector?
In-Reply-To: <4A283BD5.4080405@noaa.gov>
References: <75c31b2a0905210610x1a321264r5b6f93d327ef2b36@mail.gmail.com> <23693116.post@talk.nabble.com> <4A283BD5.4080405@noaa.gov>
Message-ID: <3699E128-2BCA-45E1-97A1-B0DE3B8E8B74@mac.com>

On Jun 4, 2009, at 5:25 PM, Christopher Barker wrote:
> Keith Goodman wrote:
>> Maybe announcing that numpy will drop support for matrices in a
>> future
>> version (3.0, ...) would save a lot of pain in the long run.
>
> Or make them better. There was a pretty good discussion of this a
> while
> back on this list. We all had a lot of opinions, and there were some
> good ideas in that thread. However, no one stepped up to implement
> any
> of it.
>
> I think the reason is that none of the core numpy developers use
> them/want them. In fact, many of those contributing to the discussion
> (myself included), didn't think it likely that they'd use them, even
> with improvements.
>
> Someone that thinks they are important needs to step up and really
> make
> them work.

Or the core development team split the matrices out of numpy and make it
a separate package that the people that use them could pick up and
run with.

Cheers
Tommy

From aisaac at american.edu Thu Jun 4 17:41:42 2009
From: aisaac at american.edu (Alan G Isaac)
Date: Thu, 04 Jun 2009 17:41:42 -0400
Subject: [Numpy-discussion] matrix default to column vector?
In-Reply-To: <3699E128-2BCA-45E1-97A1-B0DE3B8E8B74@mac.com>
References: <75c31b2a0905210610x1a321264r5b6f93d327ef2b36@mail.gmail.com> <23693116.post@talk.nabble.com> <4A283BD5.4080405@noaa.gov> <3699E128-2BCA-45E1-97A1-B0DE3B8E8B74@mac.com>
Message-ID: <4A283F96.1050103@american.edu>

On 6/4/2009 5:27 PM Tommy Grav apparently wrote:
> Or the core development team split the matrices out of numpy and make it
> a separate package that the people that use them could pick up and
> run with.

This too would be a mistake, I believe.
But it depends on whether a goal is to
have more people use NumPy. I believe
the community will gain from growth.

My core concern here is keeping NumPy very friendly for teaching.
This will mean keeping a matrix object in NumPy.
For this purpose, I have found the existing matrix object to be
adequate. (I am teaching economics students, who generally do not
have prior programming experience.)
I believe that this is crucial for "recruiting" new users, who might
otherwise choose less powerful tools that appear to be more friendly
on first encounter.

In sum, my argument is this:
Keeping a matrix object in NumPy has substantial
benefits in encouraging growth of the NumPy
community, and as far as I can tell, it is
imposing few costs. Therefore I think there is
a very substantial burden on people who propose
removing the matrix object to demonstrate
just how the NumPy community will benefit from
this change.

Alan Isaac

From tgrav at mac.com Thu Jun 4 17:54:38 2009
From: tgrav at mac.com (Tommy Grav)
Date: Thu, 04 Jun 2009 17:54:38 -0400
Subject: [Numpy-discussion] matrix default to column vector?
In-Reply-To: <4A283F96.1050103@american.edu>
References: <75c31b2a0905210610x1a321264r5b6f93d327ef2b36@mail.gmail.com> <23693116.post@talk.nabble.com> <4A283BD5.4080405@noaa.gov> <3699E128-2BCA-45E1-97A1-B0DE3B8E8B74@mac.com> <4A283F96.1050103@american.edu>
Message-ID: <513936A2-C309-435B-B9AD-1B8B71D57B06@mac.com>

On Jun 4, 2009, at 5:41 PM, Alan G Isaac wrote:
> On 6/4/2009 5:27 PM Tommy Grav apparently wrote:
>> Or the core development team split the matrices out of numpy and
>> make it
>> a separate package that the people that use them could pick up and
>> run with.
>
> This too would be a mistake, I believe.
> But it depends on whether a goal is to
> have more people use NumPy. I believe
> the community will gain from growth.
>
> In sum, my argument is this:
> Keeping a matrix object in NumPy has substantial
> benefits in encouraging growth of the NumPy
> community, and as far as I can tell, it is
> imposing few costs. Therefore I think there is
> a very substantial burden on people who propose
> removing the matrix object to demonstrate
> just how the NumPy community will benefit from
> this change.

This is a perfectly valid argument. I am actually quite happy with the
numpy package as it is (I work in astronomy), I was just pointing out
that if there are few of the core numpy people interested in maintaining
or upgrading the matrix class, one solution might be to make it a
scipy-like package that easily can be installed on top of numpy, but
where the code base might be more accessible to those that are
interested in matrices, but feel that numpy is a daunting beast to
tackle. Some sense of ownership of a matrixpy package might encourage
more people to contribute.
Just an idea ;-)

Tommy

From david at ar.media.kyoto-u.ac.jp Thu Jun 4 19:42:39 2009
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Fri, 05 Jun 2009 08:42:39 +0900
Subject: [Numpy-discussion] [SciPy-user] Scipy 0.7.1rc1 released
In-Reply-To: <1e2af89e0906040811g2671142j987922ac69ccd909@mail.gmail.com>
References: <4A27A0E0.1060903@ar.media.kyoto-u.ac.jp> <1e2af89e0906040811g2671142j987922ac69ccd909@mail.gmail.com>
Message-ID: <4A285BEF.6040502@ar.media.kyoto-u.ac.jp>

Matthew Brett wrote:
> Hi,
>
>> The RC1 for 0.7.1 scipy release has just been tagged. This is a
>> bug-only release
>
> I feel (y)our pain, but don't you mean 'bug-fix only release'? ;-)

Actually, there is one big bug on python 2.6 for mac os x, so maybe the
bug-only is appropriate :)

cheers,

David

From david at ar.media.kyoto-u.ac.jp Thu Jun 4 20:15:11 2009
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Fri, 05 Jun 2009 09:15:11 +0900
Subject: [Numpy-discussion] Speaking of fft code
In-Reply-To:
References:
Message-ID: <4A28638F.7010803@ar.media.kyoto-u.ac.jp>

Neal Becker wrote:
> Has this been considered as a candidate for our fft?
>
> http://sourceforge.net/projects/kissfft

I looked at it when I was looking for a BSD-compatible FFT with support
for prime factors (which fftpack does not handle). As Robert mentioned,
I did not see any compelling reason to bother - the main limitation of
fftpack as used in numpy is support for prime numbers (because you then
get N**2 performance), and kissfft did not support this last time I
checked.

cheers,

David

From dwf at cs.toronto.edu Thu Jun 4 20:37:02 2009
From: dwf at cs.toronto.edu (David Warde-Farley)
Date: Thu, 4 Jun 2009 20:37:02 -0400
Subject: [Numpy-discussion] performance matrix multiplication vs. matlab
In-Reply-To:
References: <20090604143638.xrqzx8w8v4ggckkk@www.sms.ed.ac.uk>
Message-ID:

On 4-Jun-09, at 5:03 PM, Anne Archibald wrote:

> Apart from the implementation issues people have chimed in about
> already, it's worth noting that the speed of matrix multiplication
> depends on the memory layout of the matrices. So generating B instead
> directly as a 100 by 500 matrix might affect the speed substantially
> (I'm not sure in which direction). If MATLAB's matrices have a
> different memory order, that might be a factor as well.

AFAIK Matlab matrices are always Fortran ordered.

Does anyone know if the defaults on Mac OS X (vecLib/Accelerate)
support multicore? Is there any sense in compiling ATLAS on OS X (I
know it can be done)?

David

From david at ar.media.kyoto-u.ac.jp Thu Jun 4 20:26:30 2009
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Fri, 05 Jun 2009 09:26:30 +0900
Subject: [Numpy-discussion] performance matrix multiplication vs. matlab
In-Reply-To:
References: <20090604143638.xrqzx8w8v4ggckkk@www.sms.ed.ac.uk>
Message-ID: <4A286636.3040700@ar.media.kyoto-u.ac.jp>

David Warde-Farley wrote:
> On 4-Jun-09, at 5:03 PM, Anne Archibald wrote:
>
>> Apart from the implementation issues people have chimed in about
>> already, it's worth noting that the speed of matrix multiplication
>> depends on the memory layout of the matrices. So generating B instead
>> directly as a 100 by 500 matrix might affect the speed substantially
>> (I'm not sure in which direction). If MATLAB's matrices have a
>> different memory order, that might be a factor as well.
>
> AFAIK Matlab matrices are always Fortran ordered.
>
> Does anyone know if the defaults on Mac OS X (vecLib/Accelerate)
> support multicore?
> Is there any sense in compiling ATLAS on OS X (I
> know it can be done)?

It may be worthwhile if you use a recent gcc and recent ATLAS.
Multithread support is supposed to be much better in 3.9.* compared to
3.6.* (which is likely the version used on vecLib/Accelerate). The main
issue I could foresee is clashes between vecLib/Accelerate and Atlas if
you mix software which uses one or the other together.

For the OP question: recent matlab versions use the MKL, which is likely
to give higher performance than ATLAS, specially on windows (compilers
on that platform are ancient, as building atlas with native compilers on
windows requires super-human patience).

David

From josef.pktd at gmail.com Thu Jun 4 20:49:13 2009
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Thu, 4 Jun 2009 20:49:13 -0400
Subject: [Numpy-discussion] extract elements of an array that are contained in another array?
In-Reply-To: <20090604205200.GD17131@phare.normalesup.org>
References: <4A27D67F.3@american.edu> <4A27E615.70604@american.edu> <1cd32cbb0906040829n74f01778s5ef37894e1b5e081@mail.gmail.com> <4A27F721.2020306@american.edu> <1cd32cbb0906041027q1fbe6122p4cc6eef14cb6d107@mail.gmail.com> <4A281958.6060208@american.edu> <20090604203019.GC17131@phare.normalesup.org> <1cd32cbb0906041343x42a0a438mef44bef1ef87f612@mail.gmail.com> <20090604205200.GD17131@phare.normalesup.org>
Message-ID: <1cd32cbb0906041749x464445e5h50d27d880c03716c@mail.gmail.com>

On Thu, Jun 4, 2009 at 4:52 PM, Gael Varoquaux wrote:
> On Thu, Jun 04, 2009 at 04:43:39PM -0400, josef.pktd at gmail.com wrote:
>> Just using "in" might promise more than it does, e.g. it works only for
>> one dimensional arrays, maybe "in1d". With "in",
>
> Then 'in_1d'

No, if the breaks in a name are obvious, I still prefer names without
underscores. I don't think `1d` or `2d` needs to be separated from the
word, "in1d"

I always remember how to spell unique1d, but I usually have to check
how to spell at_least_2d, or maybe atleast_2d or even atleast2d.

how about

def setmember1d_nu(a, b):
    ...

#aliases
set_member_1d_but_it_does_not_really_have_to_be_a_set = setmember1d_nu
in1d = setmember1d_nu

Josef

>>> [f for f in dir(np) if f[-2:]=='1d' or f[-2:]=='2d']
['atleast_1d', 'atleast_2d', 'ediff1d', 'histogram2d', 'intersect1d',
'poly1d', 'setdiff1d', 'setmember1d', 'setxor1d', 'union1d', 'unique1d']

>>> [f for f in dir(scipy.signal) if f[-2:]=='1d' or f[-2:]=='2d']
['atleast_1d', 'atleast_2d', 'convolve2d', 'correlate2d', 'cspline1d',
'cspline2d', 'medfilt2d', 'qspline1d', 'qspline2d', 'sepfir2d']

>>> [f for f in dir(scipy.stats) if f[-2:]=='1d' or f[-2:]=='2d']
[]

>>> [f for f in dir(scipy.ndimage) if f[-2:]=='1d' or f[-2:]=='2d']
['convolve1d', 'correlate1d', 'gaussian_filter1d', 'generic_filter1d',
'maximum_filter1d', 'minimum_filter1d', 'spline_filter1d', 'uniform_filter1d']

>
>> I found arraysetops because of unique1d, but I didn't figure out what
>> the subpackage really does, because I was reading "arrayse-tops"
>> instead of "array-set-ops"
>
> That's why I push people to use more underscores. IMHO PEP8 lacks a push
> for underscores.
>
> Gaël
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion

From cimrman3 at ntc.zcu.cz Fri Jun 5 01:22:35 2009
From: cimrman3 at ntc.zcu.cz (Robert Cimrman)
Date: Fri, 05 Jun 2009 07:22:35 +0200
Subject: [Numpy-discussion] extract elements of an array that are contained in another array?
In-Reply-To: <1cd32cbb0906041314t44a09438sbd1dc7add2b63892@mail.gmail.com> References: <4A27BCCF.4090608@american.edu> <1cd32cbb0906040535m235a1e90oc52f21806238d5d7@mail.gmail.com> <4A27D67F.3@american.edu> <1cd32cbb0906040750x120abbe8g397eb57e734bdeb8@mail.gmail.com> <4A27E615.70604@american.edu> <1cd32cbb0906040829n74f01778s5ef37894e1b5e081@mail.gmail.com> <4A27F721.2020306@american.edu> <1cd32cbb0906041027q1fbe6122p4cc6eef14cb6d107@mail.gmail.com> <4A281958.6060208@american.edu> <1cd32cbb0906041314t44a09438sbd1dc7add2b63892@mail.gmail.com> Message-ID: <4A28AB9B.2090208@ntc.zcu.cz> josef.pktd at gmail.com wrote: > On Thu, Jun 4, 2009 at 2:58 PM, Alan G Isaac wrote: >> On 6/4/2009 1:27 PM josef.pktd at gmail.com apparently wrote: >>> Note: there are two versions of the docs for np.intersect1d, the >>> currently published docs which describe the actual behavior (for the >>> non-unique case), and the new docs on the doc editor >>> http://docs.scipy.org/numpy/docs/numpy.lib.arraysetops.intersect1d/ >>> that describe the "intended" usage of the functions, which also >>> corresponds closer to the original source docstring >>> (http://docs.scipy.org/numpy/docs/numpy.lib.arraysetops.intersect1d/?revision=-227 >>> ). that's my interpretation >> >> Again, the distributed docs do *not* describe the actual >> behavior for the non-unique case. E.g., >> >>>>> np.intersect1d([1,1,2,3,3,4], [1,4]) >> array([1, 1, 3, 4]) >> >> Might this is a better example of >> failure than the one in the doc editor? > > Thanks, that's a very clear example of a wrong answer, > and it removes the question whether the function makes any sense for > the non-unique case. > I changed the example in the doc editor to this one. > > It will hopefully merged with the source at the next update. Thank you Josef! r. From cimrman3 at ntc.zcu.cz Fri Jun 5 01:27:16 2009 From: cimrman3 at ntc.zcu.cz (Robert Cimrman) Date: Fri, 05 Jun 2009 07:27:16 +0200 Subject: [Numpy-discussion] extract elements of an array that are contained in another array? In-Reply-To: References: <4A27BCCF.4090608@american.edu> <1cd32cbb0906040535m235a1e90oc52f21806238d5d7@mail.gmail.com> <4A27D67F.3@american.edu> <1cd32cbb0906040750x120abbe8g397eb57e734bdeb8@mail.gmail.com> <4A27E615.70604@american.edu> <1cd32cbb0906040829n74f01778s5ef37894e1b5e081@mail.gmail.com> <4A27F721.2020306@american.edu> <1cd32cbb0906041027q1fbe6122p4cc6eef14cb6d107@mail.gmail.com> <4A281958.6060208@american.edu> Message-ID: <4A28ACB4.9060104@ntc.zcu.cz> Kim Hansen wrote: > Concerning the name setmember1d_nu, I personally find it quite verbose > and not the name I would expect as a non-insider coming to numpy and > not knowing all the names of the more special hidden-away functions > and not being a python-wiz either. To explain the naming: those names are used in matlab for functions of similar functionality. If better names are found, I am not against. What I particularly do not like is the _nu suffix (yes, blame me). r. From cimrman3 at ntc.zcu.cz Fri Jun 5 01:35:46 2009 From: cimrman3 at ntc.zcu.cz (Robert Cimrman) Date: Fri, 05 Jun 2009 07:35:46 +0200 Subject: [Numpy-discussion] extract elements of an array that are contained in another array? 
In-Reply-To: References: <1cd32cbb0906031745m3a0fc5fbga61240f7d5c90161@mail.gmail.com> <4A27BCCF.4090608@american.edu> <1cd32cbb0906040535m235a1e90oc52f21806238d5d7@mail.gmail.com> <4A27D67F.3@american.edu> <1cd32cbb0906040750x120abbe8g397eb57e734bdeb8@mail.gmail.com> <4A27E615.70604@american.edu> <1cd32cbb0906040829n74f01778s5ef37894e1b5e081@mail.gmail.com> Message-ID: <4A28AEB2.5050209@ntc.zcu.cz> Anne Archibald wrote: > 2009/6/4 : > >> intersect1d should throw a domain error if you give it arrays with >> non-unique elements, which is not done for speed reasons > > It seems to me that this is the basic source of the problem. Perhaps > this can be addressed? I realize maintaining compatibility with the > current behaviour is necessary, so how about a multistage deprecation: > > 1. add a keyword argument to intersect1d "assume_unique"; if it is not > present, check for uniqueness and emit a warning if not unique > 2. change the warning to an exception > Optionally: > 3. change the meaning of the function to that of intersect1d_nu if the > keyword argument is not present > > One could do something similar with setmember1d. > > This would remove the pitfall of the 1d assumption and the wart of the > _nu names without hampering performance for people who know they have > unique arrays and are in a hurry. You mean something like:

def intersect1d(ar1, ar2, assume_unique=False):
    if not assume_unique:
        return intersect1d_nu(ar1, ar2)
    else:
        ...  # the current code

intersect1d_nu could be still exported to numpy namespace, or not. I like this. I do not understand, however, what you mean by "remove the pitfall of the 1d assumption"? cheers, r. From cimrman3 at ntc.zcu.cz Fri Jun 5 01:48:37 2009 From: cimrman3 at ntc.zcu.cz (Robert Cimrman) Date: Fri, 05 Jun 2009 07:48:37 +0200 Subject: [Numpy-discussion] extract elements of an array that are contained in another array? In-Reply-To: <1cd32cbb0906041343x42a0a438mef44bef1ef87f612@mail.gmail.com> References: <4A27BCCF.4090608@american.edu> <4A27D67F.3@american.edu> <1cd32cbb0906040750x120abbe8g397eb57e734bdeb8@mail.gmail.com> <4A27E615.70604@american.edu> <1cd32cbb0906040829n74f01778s5ef37894e1b5e081@mail.gmail.com> <4A27F721.2020306@american.edu> <1cd32cbb0906041027q1fbe6122p4cc6eef14cb6d107@mail.gmail.com> <4A281958.6060208@american.edu> <20090604203019.GC17131@phare.normalesup.org> <1cd32cbb0906041343x42a0a438mef44bef1ef87f612@mail.gmail.com> Message-ID: <4A28B1B5.2030200@ntc.zcu.cz> josef.pktd at gmail.com wrote: > On Thu, Jun 4, 2009 at 4:30 PM, Gael Varoquaux > wrote: >> On Thu, Jun 04, 2009 at 10:27:11PM +0200, Kim Hansen wrote: >>> "in(b)" or "in_iterable(b)" method, such that you could do a.in(b) >>> which would return a boolean array of the same shape as a with >>> elements true if the equivalent a members were members in the iterable >>> b. >> That would really be what I would be looking for. >> > > Just using "in" might promise more than it does, eg. it works only for > one dimensional arrays, maybe "in1d". With "in", I would expect a > generic function as in python that works with many array types and > dimensions. (But I haven't checked whether it would work with a 1d > structured array or object array.) > > I found arraysetops because of unique1d, but I didn't figure out what > the subpackage really does, because I was reading "arrayse-tops" > instead of array-set-ops" I am bad in choosing names, but note that numpy sub-modules usually do not use underscores, so array_set_ops would not fit well.
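For readers following this thread, here is a minimal sketch of the kind of sort-based membership test being discussed -- an illustration of what an in1d-style function that tolerates non-unique inputs could look like, not the actual arraysetops code (the name in1d_sketch is hypothetical):

import numpy as np

def in1d_sketch(ar1, ar2):
    # Boolean mask over ar1: True where the element occurs anywhere
    # in ar2; duplicates in either input are allowed.
    ar1 = np.asarray(ar1)
    ar2 = np.unique(ar2)               # sorted, unique values of ar2
    idx = np.searchsorted(ar2, ar1)    # insertion point of each ar1 value
    idx[idx == len(ar2)] = 0           # clamp values beyond ar2's maximum
    return ar2[idx] == ar1

in1d_sketch([1, 1, 2, 4, 2], [3, 2, 4])
# array([False, False,  True,  True,  True])

Unlike the setmember1d output shown just below, this returns False for the repeated 1s, which is the behaviour the _nu variant is meant to provide.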
> BTW, for the docs, I haven't found a counter example where > np.setdiff1d gives the wrong answer for non-unique arrays. In [4]: np.setmember1d( [1, 1, 2, 4, 2], [3, 2, 4] ) Out[4]: array([ True, False, True, True, True], dtype=bool) r. From josef.pktd at gmail.com Fri Jun 5 01:56:14 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 5 Jun 2009 01:56:14 -0400 Subject: [Numpy-discussion] extract elements of an array that are contained in another array? In-Reply-To: <4A28B1B5.2030200@ntc.zcu.cz> References: <4A27BCCF.4090608@american.edu> <4A27E615.70604@american.edu> <1cd32cbb0906040829n74f01778s5ef37894e1b5e081@mail.gmail.com> <4A27F721.2020306@american.edu> <1cd32cbb0906041027q1fbe6122p4cc6eef14cb6d107@mail.gmail.com> <4A281958.6060208@american.edu> <20090604203019.GC17131@phare.normalesup.org> <1cd32cbb0906041343x42a0a438mef44bef1ef87f612@mail.gmail.com> <4A28B1B5.2030200@ntc.zcu.cz> Message-ID: <1cd32cbb0906042256u32e5e6bgee7eec6c78e78db6@mail.gmail.com> On Fri, Jun 5, 2009 at 1:48 AM, Robert Cimrman wrote: > josef.pktd at gmail.com wrote: >> On Thu, Jun 4, 2009 at 4:30 PM, Gael Varoquaux >> wrote: >>> On Thu, Jun 04, 2009 at 10:27:11PM +0200, Kim Hansen wrote: >>>> "in(b)" or "in_iterable(b)" method, such that you could do a.in(b) >>>> which would return a boolean array of the same shape as a with >>>> elements true if the equivalent a members were members in the iterable >>>> b. >>> That would really by what I would be looking for. >>> >> >> Just using "in" might promise more than it does, eg. it works only for >> one dimensional arrays, maybe "in1d". With "in", I would expect a >> generic function as in python that works with many array types and >> dimensions. (But I haven't checked whether it would work with a 1d >> structured array or object array.) >> >> I found arraysetops because of unique1d, but I didn't figure out what >> the subpackage really does, because I was reading "arrayse-tops" >> instead of array-set-ops" > > I am bad in choosing names, but note that numpy sub-modules usually do > not use underscores, so array_set_ops would not fit well. I would have chosen something like setfun. Since this is in numpy that sets refers to arrays should be implied. > >> BTW, for the docs, I haven't found a counter example where >> np.setdiff1d gives the wrong answer for non-unique arrays. > > In [4]: np.setmember1d( [1, 1, 2, 4, 2], [3, 2, 4] ) > Out[4]: array([ True, False, ?True, ?True, ?True], dtype=bool) setdiff1d diff not member Looking at the source, I think setdiff always works even if for non-unique arrays. Josef > > r. > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From cimrman3 at ntc.zcu.cz Fri Jun 5 02:04:26 2009 From: cimrman3 at ntc.zcu.cz (Robert Cimrman) Date: Fri, 05 Jun 2009 08:04:26 +0200 Subject: [Numpy-discussion] extract elements of an array that are contained in another array? 
In-Reply-To: <1cd32cbb0906042256u32e5e6bgee7eec6c78e78db6@mail.gmail.com> References: <4A27BCCF.4090608@american.edu> <4A27E615.70604@american.edu> <1cd32cbb0906040829n74f01778s5ef37894e1b5e081@mail.gmail.com> <4A27F721.2020306@american.edu> <1cd32cbb0906041027q1fbe6122p4cc6eef14cb6d107@mail.gmail.com> <4A281958.6060208@american.edu> <20090604203019.GC17131@phare.normalesup.org> <1cd32cbb0906041343x42a0a438mef44bef1ef87f612@mail.gmail.com> <4A28B1B5.2030200@ntc.zcu.cz> <1cd32cbb0906042256u32e5e6bgee7eec6c78e78db6@mail.gmail.com> Message-ID: <4A28B56A.3080202@ntc.zcu.cz> josef.pktd at gmail.com wrote: > On Fri, Jun 5, 2009 at 1:48 AM, Robert Cimrman wrote: >> josef.pktd at gmail.com wrote: >>> On Thu, Jun 4, 2009 at 4:30 PM, Gael Varoquaux >>> wrote: >>>> On Thu, Jun 04, 2009 at 10:27:11PM +0200, Kim Hansen wrote: >>>>> "in(b)" or "in_iterable(b)" method, such that you could do a.in(b) >>>>> which would return a boolean array of the same shape as a with >>>>> elements true if the equivalent a members were members in the iterable >>>>> b. >>>> That would really by what I would be looking for. >>>> >>> Just using "in" might promise more than it does, eg. it works only for >>> one dimensional arrays, maybe "in1d". With "in", I would expect a >>> generic function as in python that works with many array types and >>> dimensions. (But I haven't checked whether it would work with a 1d >>> structured array or object array.) >>> >>> I found arraysetops because of unique1d, but I didn't figure out what >>> the subpackage really does, because I was reading "arrayse-tops" >>> instead of array-set-ops" >> I am bad in choosing names, but note that numpy sub-modules usually do >> not use underscores, so array_set_ops would not fit well. > > I would have chosen something like setfun. Since this is in numpy > that sets refers to arrays should be implied. Yes, good idea. I am not sure how to proceed, if people agree (name contest is open!) What about making an alias name setfun, and deprecate the name arraysetops? >>> BTW, for the docs, I haven't found a counter example where >>> np.setdiff1d gives the wrong answer for non-unique arrays. >> In [4]: np.setmember1d( [1, 1, 2, 4, 2], [3, 2, 4] ) >> Out[4]: array([ True, False, True, True, True], dtype=bool) > > setdiff1d diff not member > Looking at the source, I think setdiff always works even if for > non-unique arrays. Whoops, sorry. setdiff1d seems really to work for non-unique arrays - it relies on the behaviour above though :) - there is always one correct False even for repeated entries in the first array. r. From dwf at cs.toronto.edu Fri Jun 5 03:10:59 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Fri, 5 Jun 2009 03:10:59 -0400 Subject: [Numpy-discussion] extract elements of an array that are contained in another array? In-Reply-To: References: <1cd32cbb0906031745m3a0fc5fbga61240f7d5c90161@mail.gmail.com> <4A27BCCF.4090608@american.edu> <1cd32cbb0906040535m235a1e90oc52f21806238d5d7@mail.gmail.com> <4A27D67F.3@american.edu> <1cd32cbb0906040750x120abbe8g397eb57e734bdeb8@mail.gmail.com> <4A27E615.70604@american.edu> <1cd32cbb0906040829n74f01778s5ef37894e1b5e081@mail.gmail.com> Message-ID: <57AAFB1D-C911-442D-A212-86173ADA7692@cs.toronto.edu> On 4-Jun-09, at 4:38 PM, Anne Archibald wrote: > It seems to me that this is the basic source of the problem. Perhaps > this can be addressed? I realize maintaining compatibility with the > current behaviour is necessary, so how about a multistage deprecation: > > 1. 
add a keyword argument to intersect1d "assume_unique"; if it is not > present, check for uniqueness and emit a warning if not unique > 2. change the warning to an exception > Optionally: > 3. change the meaning of the function to that of intersect1d_nu if the > keyword argument is not present > > One could do something similar with setmember1d. +1 on this idea. I've been bitten by the non-unique stuff in the past, especially with setmember1d, not realizing that both need to be unique. David From D.P.Reichert at sms.ed.ac.uk Fri Jun 5 05:44:05 2009 From: D.P.Reichert at sms.ed.ac.uk (David Paul Reichert) Date: Fri, 05 Jun 2009 10:44:05 +0100 Subject: [Numpy-discussion] performance matrix multiplication vs. matlab In-Reply-To: <4A286636.3040700@ar.media.kyoto-u.ac.jp> References: <20090604143638.xrqzx8w8v4ggckkk@www.sms.ed.ac.uk> <4A286636.3040700@ar.media.kyoto-u.ac.jp> Message-ID: <20090605104405.zsxnhtdb4kow4w8o@www.sms.ed.ac.uk> Thanks for the replies so far. I had already tested using an already transposed matrix in the loop, it didn't make any difference. Oh and btw, I'm on (Scientific) Linux. I used the Enthought distribution, but I guess I'll have to get my hands dirty and try to get that Atlas thing working (I'm not a Linux expert though). My simulations pretty much consist of matrix multiplications, so if I don't get rid of that factor 5, I pretty much have to get back to Matlab. When you said Atlas is going to be optimized for my system, does that mean I should compile everything on each machine separately? I.e. I have a not-so-great desktop machine and one of those bigger multicore things available... Cheers David Quoting David Cournapeau : > David Warde-Farley wrote: >> On 4-Jun-09, at 5:03 PM, Anne Archibald wrote: >> >> >>> Apart from the implementation issues people have chimed in about >>> already, it's worth noting that the speed of matrix multiplication >>> depends on the memory layout of the matrices. So generating B instead >>> directly as a 100 by 500 matrix might affect the speed substantially >>> (I'm not sure in which direction). If MATLAB's matrices have a >>> different memory order, that might be a factor as well. >>> >> >> AFAIK Matlab matrices are always Fortran ordered. >> >> Does anyone know if the defaults on Mac OS X (vecLib/Accelerate) >> support multicore? Is there any sense in compiling ATLAS on OS X (I >> know it can be done)? >> > > It may be worthwhile if you use a recent gcc and recent ATLAS. > Multithread support is supposed to be much better in 3.9.* compared to > 3.6.* (which is likely the version used on vecLib/Accelerate). The main > issue I could foresee is clashes between vecLib/Accelerate and Atlas if > you mix softwares which use one or the other together. > > For the OP question: recent matlab versions use the MKL, which is likely > to give higher performances than ATLAS, specially on windows (compilers > on that platform are ancient, as building atlas with native compilers on > windows requires super-human patience). > > David > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336. From sebastian.walter at gmail.com Fri Jun 5 06:03:15 2009 From: sebastian.walter at gmail.com (Sebastian Walter) Date: Fri, 5 Jun 2009 12:03:15 +0200 Subject: [Numpy-discussion] performance matrix multiplication vs. 
matlab In-Reply-To: <7f014ea60906041356p19a53c2bi4258d6a6b93ef367@mail.gmail.com> References: <20090604143638.xrqzx8w8v4ggckkk@www.sms.ed.ac.uk> <7f014ea60906041354h6b7a3949n4c45731172483c6d@mail.gmail.com> <7f014ea60906041356p19a53c2bi4258d6a6b93ef367@mail.gmail.com> Message-ID: On Thu, Jun 4, 2009 at 10:56 PM, Chris Colbert wrote: > I should update after reading the thread Sebastian linked: > > The current 1.3 version of numpy (don't know about previous versions) uses > the optimized Atlas BLAS routines for numpy.dot() if numpy was compiled with > these libraries. I've verified this on linux only, thought it shouldnt be > any different on windows AFAIK. in the best of all possible worlds this would be done by a package maintainer.... > > chris > > On Thu, Jun 4, 2009 at 4:54 PM, Chris Colbert wrote: >> >> Sebastian is right. >> >> Since Matlab r2007 (i think that's the version) it has included support >> for multi-core architecture. On my core2 Quad here at the office, r2008b has >> no problem utilizing 100% cpu for large matrix multiplications. >> >> >> If you download and build atlas and lapack from source and enable >> parrallel threads in atlas, then compile numpy against these libraries, you >> should achieve similar if not better performance (since the atlas routines >> will be tuned to your system). >> >> If you're on Windows, you need to do some trickery to get threading to >> work (the instructions are on the atlas website). >> >> Chris >> >> > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From david at ar.media.kyoto-u.ac.jp Fri Jun 5 05:58:15 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Fri, 05 Jun 2009 18:58:15 +0900 Subject: [Numpy-discussion] performance matrix multiplication vs. matlab In-Reply-To: References: <20090604143638.xrqzx8w8v4ggckkk@www.sms.ed.ac.uk> <7f014ea60906041354h6b7a3949n4c45731172483c6d@mail.gmail.com> <7f014ea60906041356p19a53c2bi4258d6a6b93ef367@mail.gmail.com> Message-ID: <4A28EC37.10205@ar.media.kyoto-u.ac.jp> Sebastian Walter wrote: > On Thu, Jun 4, 2009 at 10:56 PM, Chris Colbert wrote: > >> I should update after reading the thread Sebastian linked: >> >> The current 1.3 version of numpy (don't know about previous versions) uses >> the optimized Atlas BLAS routines for numpy.dot() if numpy was compiled with >> these libraries. I've verified this on linux only, thought it shouldnt be >> any different on windows AFAIK. >> > > in the best of all possible worlds this would be done by a package > maintainer.... > Numpy packages on windows do use ATLAS, so I am not sure what you are referring to ? On a side note, correctly packaging ATLAS is almost inherently impossible, since the build method of ATLAS can never produce the same binary (even on the same machine), and the binary is optimized for the machine it was built on. So if you want the best speed, you should build atlas by yourself - which is painful on windows (you need cygwin). On windows, if you really care about speed, you should try linking against the Intel MKL. That's what Matlab uses internally on recent versions, so you would get the same speed. But that's rather involved. cheers, David From sebastian.walter at gmail.com Fri Jun 5 06:27:57 2009 From: sebastian.walter at gmail.com (Sebastian Walter) Date: Fri, 5 Jun 2009 12:27:57 +0200 Subject: [Numpy-discussion] performance matrix multiplication vs. 
matlab In-Reply-To: <4A28EC37.10205@ar.media.kyoto-u.ac.jp> References: <20090604143638.xrqzx8w8v4ggckkk@www.sms.ed.ac.uk> <7f014ea60906041354h6b7a3949n4c45731172483c6d@mail.gmail.com> <7f014ea60906041356p19a53c2bi4258d6a6b93ef367@mail.gmail.com> <4A28EC37.10205@ar.media.kyoto-u.ac.jp> Message-ID: On Fri, Jun 5, 2009 at 11:58 AM, David Cournapeau wrote: > Sebastian Walter wrote: >> On Thu, Jun 4, 2009 at 10:56 PM, Chris Colbert wrote: >> >>> I should update after reading the thread Sebastian linked: >>> >>> The current 1.3 version of numpy (don't know about previous versions) uses >>> the optimized Atlas BLAS routines for numpy.dot() if numpy was compiled with >>> these libraries. I've verified this on linux only, thought it shouldnt be >>> any different on windows AFAIK. >>> >> >> in the best of all possible worlds this would be done by a package >> maintainer.... >> > > Numpy packages on windows do use ATLAS, so I am not sure what you are > referring to ? I'm on debian unstable and my numpy (version 1.2.1) uses an unoptimized blas. I had the impression that most ppl that use numpy are on linux. But apparently this is a misconception. >On a side note, correctly packaging ATLAS is almost > inherently impossible, since the build method of ATLAS can never produce > the same binary (even on the same machine), and the binary is optimized > for the machine it was built on. So if you want the best speed, you > should build atlas by yourself - which is painful on windows (you need > cygwin). in the debian repositories there are different builds of atlas so there could be different builds for numpy, too. But there aren't.... > > On windows, if you really care about speed, you should try linking > against the Intel MKL. That's what Matlab uses internally on recent > versions, so you would get the same speed. But that's rather involved. How much faster is MKL than ATLAS? > > cheers, > > David > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From david at ar.media.kyoto-u.ac.jp Fri Jun 5 06:20:28 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Fri, 05 Jun 2009 19:20:28 +0900 Subject: [Numpy-discussion] performance matrix multiplication vs. matlab In-Reply-To: References: <20090604143638.xrqzx8w8v4ggckkk@www.sms.ed.ac.uk> <7f014ea60906041354h6b7a3949n4c45731172483c6d@mail.gmail.com> <7f014ea60906041356p19a53c2bi4258d6a6b93ef367@mail.gmail.com> <4A28EC37.10205@ar.media.kyoto-u.ac.jp> Message-ID: <4A28F16C.3070607@ar.media.kyoto-u.ac.jp> Sebastian Walter wrote: > On Fri, Jun 5, 2009 at 11:58 AM, David > Cournapeau wrote: > >> Sebastian Walter wrote: >> >>> On Thu, Jun 4, 2009 at 10:56 PM, Chris Colbert wrote: >>> >>> >>>> I should update after reading the thread Sebastian linked: >>>> >>>> The current 1.3 version of numpy (don't know about previous versions) uses >>>> the optimized Atlas BLAS routines for numpy.dot() if numpy was compiled with >>>> these libraries. I've verified this on linux only, thought it shouldnt be >>>> any different on windows AFAIK. >>>> >>>> >>> in the best of all possible worlds this would be done by a package >>> maintainer.... >>> >>> >> Numpy packages on windows do use ATLAS, so I am not sure what you are >> referring to ? >> > I'm on debian unstable and my numpy (version 1.2.1) uses an unoptimized blas. 
> Yes, it is because the package on Linux are not well done in that respect (to their defense, numpy build is far from being packaging friendly, and is both fragile and obscure). > I had the impression that most ppl that use numpy are on linux. Sourceforge numbers tell a different story at least. I think most users on the ML use linux, and certainly almost every developer use linux or mac os x. But ML already filter most windows users - only geeks read ML :) I am pretty sure a vast majority of numpy users never even bother to look for the ML. >> On a side note, correctly packaging ATLAS is almost >> inherently impossible, since the build method of ATLAS can never produce >> the same binary (even on the same machine), and the binary is optimized >> for the machine it was built on. So if you want the best speed, you >> should build atlas by yourself - which is painful on windows (you need >> cygwin). >> > in the debian repositories there are different builds of atlas so > there could be different builds for numpy, too. > But there aren't.... > There are several problems: - packagers (rightfully) hate to have many versions of the same software - as for now, if ATLAS is detected, numpy is built differently than if it is linked against conventional blas/lapack - numpy on debian is not built with atlas support But there is certainly no need to build one numpy version for every atlas: the linux loader can load the most appropriate library depending on your architecture, the so called hwcap flag. If your CPU supports SSE2, and you have ATLAS installed for SSE2, then the loader will automatically load the libraries there instead of the one in /usr/lib by default. But because ATLAS is such a pain to support in a binary form, only ancient versions of ATLAS are packaged anyways (3.6.*). So if you care so much, you should build your own. >> On windows, if you really care about speed, you should try linking >> against the Intel MKL. That's what Matlab uses internally on recent >> versions, so you would get the same speed. But that's rather involved. >> > > It really depends on the CPU, compiler, how atlas was compiled, etc... it can be slightly faster to 10 times faster (if you use a very poorly optimized ATLAS). For some recent benchmarks: http://eigen.tuxfamily.org/index.php?title=Benchmark cheers, David From cournape at gmail.com Fri Jun 5 07:09:45 2009 From: cournape at gmail.com (David Cournapeau) Date: Fri, 5 Jun 2009 20:09:45 +0900 Subject: [Numpy-discussion] scipy 0.7.1rc2 released Message-ID: <5b8d13220906050409u30286931w7bd9aac1e01b9ebf@mail.gmail.com> Hi, The RC2 for 0.7.1 scipy release has just been tagged. This is a bug-fixes only release, see below for the release notes. More information can also be found on the trac website: http://projects.scipy.org/scipy/milestone/0.7.1 The only code change compared to the RC1 is one fix which is essential for mac os x/python 2.6 combination. Tarballs, windows and mac os x binaries are available. Please test it ! I am particularly interested in results for scipy binaries on mac os x (do they work on ppc). The scipy developers -- ========================= SciPy 0.7.1 Release Notes ========================= .. contents:: SciPy 0.7.1 is a bug-fix release with no new features compared to 0.7.0. 
scipy.io ======== Bugs fixed: - Several fixes in Matlab file IO scipy.odr ========= Bugs fixed: - Work around a failure with Python 2.6 scipy.signal ============ Memory leak in lfilter have been fixed, as well as support for array object Bugs fixed: - #880, #925: lfilter fixes - #871: bicgstab fails on Win32 scipy.sparse ============ Bugs fixed: - #883: scipy.io.mmread with scipy.sparse.lil_matrix broken - lil_matrix and csc_matrix reject now unexpected sequences, cf. http://thread.gmane.org/gmane.comp.python.scientific.user/19996 scipy.special ============= Several bugs of varying severity were fixed in the special functions: - #503, #640: iv: problems at large arguments fixed by new implementation - #623: jv: fix errors at large arguments - #679: struve: fix wrong output for v < 0 - #803: pbdv produces invalid output - #804: lqmn: fix crashes on some input - #823: betainc: fix documentation - #834: exp1 strange behavior near negative integer values - #852: jn_zeros: more accurate results for large s, also in jnp/yn/ynp_zeros - #853: jv, yv, iv: invalid results for non-integer v < 0, complex x - #854: jv, yv, iv, kv: return nan more consistently when out-of-domain - #927: ellipj: fix segfault on Windows - #946: ellpj: fix segfault on Mac OS X/python 2.6 combination. - ive, jve, yve, kv, kve: with real-valued input, return nan for out-of-domain instead of returning only the real part of the result. Also, when ``scipy.special.errprint(1)`` has been enabled, warning messages are now issued as Python warnings instead of printing them to stderr. scipy.stats =========== - linregress, mannwhitneyu, describe: errors fixed - kstwobign, norm, expon, exponweib, exponpow, frechet, genexpon, rdist, truncexpon, planck: improvements to numerical accuracy in distributions Windows binaries for python 2.6 =============================== python 2.6 binaries for windows are now included. The binary for python 2.5 requires numpy 1.2.0 or above, and and the one for python 2.6 requires numpy 1.3.0 or above. Universal build for scipy ========================= Mac OS X binary installer is now a proper universal build, and does not depend on gfortran anymore (libgfortran is statically linked). The python 2.5 version of scipy requires numpy 1.2.0 or above, the python 2.6 version requires numpy 1.3.0 or above. Checksums ========= 08cdf8d344535fcb5407dafd9f120b9b release/installers/scipy-0.7.1rc2.tar.gz 93595ca9f0b5690a6592c9fc43e9253d release/installers/scipy-0.7.1rc2-py2.6-macosx10.5.dmg fc8f434a9b4d76f1b38b7025f425127b release/installers/scipy-0.7.1rc2.zip 8cdc2472f3282f08a703cdcca5c92952 release/installers/scipy-0.7.1rc2-win32-superpack-python2.5.exe 15c4c45de931bd7f13e4ce24bd59579e release/installers/scipy-0.7.1rc2-win32-superpack-python2.6.exe e42853e39b3b4f590824e3a262863ef6 release/installers/scipy-0.7.1rc2-py2.5-macosx10.5.dmg From jrennie at gmail.com Fri Jun 5 09:05:31 2009 From: jrennie at gmail.com (Jason Rennie) Date: Fri, 5 Jun 2009 09:05:31 -0400 Subject: [Numpy-discussion] performance matrix multiplication vs. matlab In-Reply-To: <20090605104405.zsxnhtdb4kow4w8o@www.sms.ed.ac.uk> References: <20090604143638.xrqzx8w8v4ggckkk@www.sms.ed.ac.uk> <4A286636.3040700@ar.media.kyoto-u.ac.jp> <20090605104405.zsxnhtdb4kow4w8o@www.sms.ed.ac.uk> Message-ID: <75c31b2a0906050605v4804579dkc112c7a71b24a7f4@mail.gmail.com> Hi David, Let me suggest that you try the latest version of Ubuntu (9.04/Jaunty), which was released two months ago. 
It sounds like you are effectively using release 5 of RedHat Linux which was originally released May 2007. There have been updates (5.1, 5.2, 5.3), but, if my memory serves me correctly, RedHat updates are more focused on fixing bugs and security issues rather than improving functionality. Ubuntu does a full, new release every 6 months so you don't have to wait as long to see improvements. Ubuntu also has a tremendously better package management system. You generally shouldn't be installing packages by hand as it sounds like you are doing. This post suggests that the latest version of Ubuntu is up-to-date wrt ATLAS: http://www.mail-archive.com/numpy-discussion at scipy.org/msg13102.html Jason On Fri, Jun 5, 2009 at 5:44 AM, David Paul Reichert < D.P.Reichert at sms.ed.ac.uk> wrote: > Thanks for the replies so far. > > I had already tested using an already transposed matrix in the loop, > it didn't make any difference. Oh and btw, I'm on (Scientific) Linux. > > I used the Enthought distribution, but I guess I'll have to get > my hands dirty and try to get that Atlas thing working (I'm not > a Linux expert though). My simulations pretty much consist of > matrix multiplications, so if I don't get rid of that factor 5, > I pretty much have to get back to Matlab. > > When you said Atlas is going to be optimized for my system, does > that mean I should compile everything on each machine separately? > I.e. I have a not-so-great desktop machine and one of those bigger > multicore things available... > > Cheers > > David > -- Jason Rennie Research Scientist, ITA Software 617-714-2645 http://www.itasoftware.com/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From Fadhley.Salim at uk.calyon.com Fri Jun 5 09:25:34 2009 From: Fadhley.Salim at uk.calyon.com (Fadhley Salim) Date: Fri, 5 Jun 2009 14:25:34 +0100 Subject: [Numpy-discussion] Import fails of scipy.factorial when installed as a zipped egg In-Reply-To: <75c31b2a0906050605v4804579dkc112c7a71b24a7f4@mail.gmail.com> References: <20090604143638.xrqzx8w8v4ggckkk@www.sms.ed.ac.uk><4A286636.3040700@ar.media.kyoto-u.ac.jp><20090605104405.zsxnhtdb4kow4w8o@www.sms.ed.ac.uk> <75c31b2a0906050605v4804579dkc112c7a71b24a7f4@mail.gmail.com> Message-ID: I've noticed that with scipy 0.7.0 + numpy 1.2.1, importing the factorial function from the scipy module always seems to fail when scipy is installed as a zipped ".egg" file. When the project is installed as an unzipped directory it works fine. Is there any reason why this function should not be egg-safe? My test to verify this was pretty simple: I just installed my scipy egg (made by extracting the Windows, Python 2.4 Superpack) with the easy_install command. Whenever I install it with the "-Z" option (to uncompress) it works fine. With the "-z" option it always fails. Thanks! Sal -------------- next part -------------- An HTML attachment was scrubbed... URL:
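The reply that follows traces this to zip safety. As a generic illustration (not numpy's actual loader code; the package name mypackage and the file data.txt are hypothetical), the failing and working patterns look like this:

import os

# Zip-unsafe: inside a zipped egg, __file__ points into the archive,
# so filesystem paths built from it do not exist on disk.
here = os.path.dirname(__file__)
data = open(os.path.join(here, 'data.txt')).read()

# Zip-safe: let setuptools read the resource out of the egg instead.
import pkg_resources
data = pkg_resources.resource_string('mypackage', 'data.txt')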
From josef.pktd at gmail.com Fri Jun 5 10:32:36 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 5 Jun 2009 10:32:36 -0400 Subject: [Numpy-discussion] Import fails of scipy.factorial when installed as a zipped egg In-Reply-To: References: <20090604143638.xrqzx8w8v4ggckkk@www.sms.ed.ac.uk> <4A286636.3040700@ar.media.kyoto-u.ac.jp> <20090605104405.zsxnhtdb4kow4w8o@www.sms.ed.ac.uk> <75c31b2a0906050605v4804579dkc112c7a71b24a7f4@mail.gmail.com> Message-ID: <1cd32cbb0906050732v3bf40facq2e8bb2f126d482d0@mail.gmail.com> On Fri, Jun 5, 2009 at 9:25 AM, Fadhley Salim wrote: > I've noticed that with scipy 0.7.0 + numpy 1.2.1, importing the factorial > function from the scipy module always seems to fail when scipy is installed > as a zipped ".egg" file. When the project is installed as an unzipped > directory it works fine. > > Is there any reason why this function should not be egg-safe? > > My test to verify this was pretty simple: I just installed my scipy egg > (made by extracting the Windows, Python 2.4 Superpack) with the easy_install > command. Whenever I install it with the "-Z" option (to uncompress) it works > fine. With the "-z" option it always fails. > I don't think numpy/scipy are zip safe, the numpy packageloader uses os.path to find files. I would expect that you are not able to import anything, factorial might be just the first function that is loaded. easy_install usually does a check for zipsafe, and if it unpacks the egg it usually means it's not zipsafe, for example because of the use of __file__. That's my guess, Josef From D.P.Reichert at sms.ed.ac.uk Fri Jun 5 10:39:23 2009 From: D.P.Reichert at sms.ed.ac.uk (David Paul Reichert) Date: Fri, 05 Jun 2009 15:39:23 +0100 Subject: [Numpy-discussion] performance matrix multiplication vs.
matlab In-Reply-To: <75c31b2a0906050605v4804579dkc112c7a71b24a7f4@mail.gmail.com> References: <20090604143638.xrqzx8w8v4ggckkk@www.sms.ed.ac.uk> <4A286636.3040700@ar.media.kyoto-u.ac.jp> <20090605104405.zsxnhtdb4kow4w8o@www.sms.ed.ac.uk> <75c31b2a0906050605v4804579dkc112c7a71b24a7f4@mail.gmail.com> Message-ID: <20090605153923.st7uoctkgskwsksc@www.sms.ed.ac.uk> Hi, Thanks for the suggestion. Unfortunately I'm using university managed machines here, so I have no control over the distribution, not even root access. However, I just downloaded the latest Enthought distribution, which uses numpy 1.3, and now numpy is only 30% to 60% slower than matlab, instead of 5 times slower. I can live with that. (whether it uses atlas now or not, I don't know). Cheers David Quoting Jason Rennie : > Hi David, > > Let me suggest that you try the latest version of Ubuntu (9.04/Jaunty), > which was released two months ago. It sounds like you are effectively using > release 5 of RedHat Linux which was originally released May 2007. There > have been updates (5.1, 5.2, 5.3), but, if my memory serves me correctly, > RedHat updates are more focused on fixing bugs and security issues rather > than improving functionality. Ubuntu does a full, new release every 6 > months so you don't have to wait as long to see improvements. Ubuntu also > has a tremendously better package management system. You generally > shouldn't be installing packages by hand as it sounds like you are doing. > > This post suggests that the latest version of Ubuntu is up-to-date wrt > ATLAS: > > http://www.mail-archive.com/numpy-discussion at scipy.org/msg13102.html > > Jason > > On Fri, Jun 5, 2009 at 5:44 AM, David Paul Reichert < > D.P.Reichert at sms.ed.ac.uk> wrote: > >> Thanks for the replies so far. >> >> I had already tested using an already transposed matrix in the loop, >> it didn't make any difference. Oh and btw, I'm on (Scientific) Linux. >> >> I used the Enthought distribution, but I guess I'll have to get >> my hands dirty and try to get that Atlas thing working (I'm not >> a Linux expert though). My simulations pretty much consist of >> matrix multiplications, so if I don't get rid of that factor 5, >> I pretty much have to get back to Matlab. >> >> When you said Atlas is going to be optimized for my system, does >> that mean I should compile everything on each machine separately? >> I.e. I have a not-so-great desktop machine and one of those bigger >> multicore things available... >> >> Cheers >> >> David >> > > -- > Jason Rennie > Research Scientist, ITA Software > 617-714-2645 > http://www.itasoftware.com/ > -- The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336. From zelbier at gmail.com Fri Jun 5 11:38:40 2009 From: zelbier at gmail.com (Olivier Verdier) Date: Fri, 5 Jun 2009 17:38:40 +0200 Subject: [Numpy-discussion] matrix default to column vector? In-Reply-To: <513936A2-C309-435B-B9AD-1B8B71D57B06@mac.com> References: <75c31b2a0905210610x1a321264r5b6f93d327ef2b36@mail.gmail.com> <23693116.post@talk.nabble.com> <4A283BD5.4080405@noaa.gov> <3699E128-2BCA-45E1-97A1-B0DE3B8E8B74@mac.com> <4A283F96.1050103@american.edu> <513936A2-C309-435B-B9AD-1B8B71D57B06@mac.com> Message-ID: I agree. It would be a good idea to have matrices out of numpy as a standalone package. Indeed, having matrices in the numpy core comes at a pedagogical cost. Newcomers (as I once was) do not know which to use. Matrix or array? 
It turns out that the vast majority of numpy/scipy modules use arrays, so arrays are the preferred way to go. It would thus be clearer to have arrays in numpy and matrices available as an external package. Besides, I think matrices can be pretty tricky when used for teaching. For instance, you have to explain that all the operators work component-wise, except the multiplication! Another caveat is that since matrices are always 2D, the "scalar product" of two column vectors computed as " x.T * y" will not be a scalar, but a 1x1 matrix. There is also the fact that you must cast all your vectors to column/row matrices (as in matlab). For all these reasons, I prefer to use arrays and dot for teaching, and I have never had any complaints. == Olivier 2009/6/4 Tommy Grav > > On Jun 4, 2009, at 5:41 PM, Alan G Isaac wrote: > > On 6/4/2009 5:27 PM Tommy Grav apparently wrote: > >> Or the core development team split the matrices out of numpy and > >> make it > >> as separate package that the people that use them could pick up and > >> run with. > > > > > > This too would be a mistake, I believe. > > But it depends on whether a goal is to > > have more people use NumPy. I believe > > the community will gain from growth. > > > > In sum, my argument is this: > > Keeping a matrix object in NumPy has substantial > > benefits in encouraging growth of the NumPy > > community, and as far as I can tell, it is > > imposing few costs. Therefore I think there is > > a very substantial burden on people who propose > > removing the matrix object to demonstrate > > just how the NumPy community will benefit from > > this change. > > This is a perfectly valid argument. I am actually quite happy with the > numpy package as it is (I work in astronomy), I was just pointing out > that if there are few of the core numpy people interested in maintaining > or upgrading the matrix class one solution might be to make it a > scipy-like package that easily can be installed on top of numpy, but > where the code base might be more accessible to those that are > interested in matrices, but feel that numpy is a daunting beast to > tackle. > Some sense of ownership of a matrixpy package might encourage more > people to contribute. > > Just an idea ;-) > Tommy > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Fadhley.Salim at uk.calyon.com Fri Jun 5 12:59:19 2009 From: Fadhley.Salim at uk.calyon.com (Fadhley Salim) Date: Fri, 5 Jun 2009 17:59:19 +0100 Subject: [Numpy-discussion] Import fails of scipy.factorial when installed as a zipped egg In-Reply-To: <1cd32cbb0906050732v3bf40facq2e8bb2f126d482d0@mail.gmail.com> References: <20090604143638.xrqzx8w8v4ggckkk@www.sms.ed.ac.uk><4A286636.3040700@ar.media.kyoto-u.ac.jp><20090605104405.zsxnhtdb4kow4w8o@www.sms.ed.ac.uk><75c31b2a0906050605v4804579dkc112c7a71b24a7f4@mail.gmail.com> <1cd32cbb0906050732v3bf40facq2e8bb2f126d482d0@mail.gmail.com> Message-ID: > I don't think numpy/scipy are zip safe, the numpy packageloader uses os.path to find files. Evidently! :-) But the strange thing is that all this worked fine with Scipy 0.6.0 - it's only since 0.7.0 was released that this started going wrong.
Sal From josef.pktd at gmail.com Fri Jun 5 13:18:56 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 5 Jun 2009 13:18:56 -0400 Subject: [Numpy-discussion] matrix default to column vector? In-Reply-To: References: <75c31b2a0905210610x1a321264r5b6f93d327ef2b36@mail.gmail.com> <23693116.post@talk.nabble.com> <4A283BD5.4080405@noaa.gov> <3699E128-2BCA-45E1-97A1-B0DE3B8E8B74@mac.com> <4A283F96.1050103@american.edu> <513936A2-C309-435B-B9AD-1B8B71D57B06@mac.com> Message-ID: <1cd32cbb0906051018l6277976flcf3bace25fcb9f5c@mail.gmail.com> On Fri, Jun 5, 2009 at 11:38 AM, Olivier Verdier wrote: > I agree. It would be a good idea to have matrices out of numpy as a > standalone package. > Indeed, having matrices in the numpy core comes at a pedagogical cost. > Newcomers (as I once was) do not know which to use. Matrix or array? It > turns out that the vast majority of numpy/scipy modules use arrays, so > arrays are the preferred way to go. > It would thus be clearer to have arrays in numpy and matrices available as > an external package. > Besides, I think matrices can be pretty tricky when used for teaching. For > instance, you have to explain that all the operators work component-wise, > except the multiplication! Another caveat is that since matrices are always > 2D, the "scalar product" of two column vectors computed as " x.T * y" will > not be a scalar, but a 1x1 matrix. There is also the fact that you must cast > all your vectors to column/row matrices (as in matlab).
For all these > reasons, I prefer to use arrays and dot for teaching, and I have never had > any complaints. For anyone with an econometrics background in gauss or matlab, numpy arrays have a huge adjustment cost. I'm only using arrays for consistency, but my econometrics code is filled with arr[np.newaxis,:]. I wouldn't recommend my friends switching from gauss to numpy/scipy when they want to write or develop some econometrics algorithm. gauss feels like copying the formula from the notes or a book. With numpy arrays I always have to recover one dimension, and I have to verify the shape of any array more often. In econometrics the default vector is often a 2d vector so x.T*x and x*x.T have very different meanings. python/numpy/scipy have a lot of other advantages compared to gauss or matlab, but it's not the linear algebra syntax. So, I don't see any sense in removing matrices, you can just ignore them and tell your students to do so. It increases the initial switching cost, even if users that get more into it then switch to arrays. Josef (BTW: I think scipy.stats will soon become matrix friendly)

>>> X = np.matrix([[1,0],[1,1]])
>>> X.mean()
0.75
>>> X.mean(0)
matrix([[ 1. ,  0.5]])
>>> X-X.mean(0)
matrix([[ 0. , -0.5],
        [ 0. ,  0.5]])
>>> Y = np.array([[1,0],[1,1]])
>>> Y.mean(0).shape
(2,)
>>> Y - Y.mean(0)[np.newaxis,:]
array([[ 0. , -0.5],
       [ 0. ,  0.5]])
>>> np.matrix([[1,0],[1,1]])**2
matrix([[1, 0],
        [2, 1]])
>>> np.array([[1,0],[1,1]])**2
array([[1, 0],
       [1, 1]])
>>> np.matrix([[1,0],[1,1]])**(-1)
matrix([[ 1.,  0.],
        [-1.,  1.]])
>>> np.array([[1,0],[1,1]])**(-1)
array([[ 1, -2147483648],
       [ 1, 1]])
>>> x = np.matrix([[1],[0]])
>>> x.T*x
matrix([[1]])
>>> (x.T*x).shape
(1, 1)
>>> (x*x.T)
matrix([[1, 0],
        [0, 0]])
>>> (x*x.T).shape
(2, 2)
>>> y = np.array([1,0])
>>> np.dot(y,y)
1
>>> np.dot(y,y.T)
1
>>> np.dot(y[:,np.newaxis], y[np.newaxis,:])
array([[1, 0],
       [0, 0]])

From efiring at hawaii.edu Fri Jun 5 13:29:25 2009 From: efiring at hawaii.edu (Eric Firing) Date: Fri, 05 Jun 2009 07:29:25 -1000 Subject: [Numpy-discussion] performance matrix multiplication vs. matlab In-Reply-To: <4A28F16C.3070607@ar.media.kyoto-u.ac.jp> References: <20090604143638.xrqzx8w8v4ggckkk@www.sms.ed.ac.uk> <7f014ea60906041354h6b7a3949n4c45731172483c6d@mail.gmail.com> <7f014ea60906041356p19a53c2bi4258d6a6b93ef367@mail.gmail.com> <4A28EC37.10205@ar.media.kyoto-u.ac.jp> <4A28F16C.3070607@ar.media.kyoto-u.ac.jp> Message-ID: <4A2955F5.9030006@hawaii.edu> David Cournapeau wrote: > > It really depends on the CPU, compiler, how atlas was compiled, etc... > it can be slightly faster to 10 times faster (if you use a very poorly > optimized ATLAS). > > For some recent benchmarks: > > http://eigen.tuxfamily.org/index.php?title=Benchmark > David, The eigen web site indicates that eigen achieves high performance without all the compilation difficulty of atlas. Does eigen have enough functionality to replace atlas in numpy? Presumably it would need C compatibility wrappers to emulate the blas functions. Would that kill its performance? Or be very difficult? (I'm asking from curiosity combined with complete ignorance. Until yesterday I had never even heard of eigen.) Eric > cheers, > > David
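A practical aside for this thread: numpy.show_config() prints the BLAS/LAPACK libraries a numpy build was compiled against, and timing a large dot product gives a rough hint of whether an optimized BLAS is in use. A minimal sketch (timings are machine-dependent; the comparison is only indicative):

import time
import numpy as np

np.show_config()          # BLAS/LAPACK detected when numpy was built

n = 1000
a = np.random.rand(n, n)
b = np.random.rand(n, n)

t0 = time.time()
np.dot(a, b)
print "1000x1000 dot: %.2f s" % (time.time() - t0)
# An optimized BLAS (ATLAS, MKL, Accelerate) typically runs this several
# times faster than the unoptimized reference implementation.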
From aisaac at american.edu Fri Jun 5 13:30:05 2009 From: aisaac at american.edu (Alan G Isaac) Date: Fri, 05 Jun 2009 13:30:05 -0400 Subject: [Numpy-discussion] matrix default to column vector? In-Reply-To: References: <75c31b2a0905210610x1a321264r5b6f93d327ef2b36@mail.gmail.com> <23693116.post@talk.nabble.com> <4A283BD5.4080405@noaa.gov> <3699E128-2BCA-45E1-97A1-B0DE3B8E8B74@mac.com> <4A283F96.1050103@american.edu> <513936A2-C309-435B-B9AD-1B8B71D57B06@mac.com> Message-ID: <4A29561D.5070806@american.edu> On 6/5/2009 11:38 AM Olivier Verdier apparently wrote: > I think matrices can be pretty tricky when used for > teaching. For instance, you have to explain that all the > operators work component-wise, except the multiplication! > Another caveat is that since matrices are always 2D, the > "scalar product" of two column vectors computed as " x.T > * y" will not be a scalar, but a 1x1 matrix. There is also > the fact that you must cast all your vectors to column/row > matrices (as in matlab). For all these reasons, I prefer > to use arrays and dot for teaching, and I have never had > any complaints. I do not understand this "argument". You should take it very seriously when someone reports to you that the matrix object is crucial to them, e.g., as a teaching tool. Even if you do not find personally persuasive an example like http://mail.scipy.org/pipermail/numpy-discussion/2009-June/043001.html I have told you: this is important for my students. Reporting that your students do not complain about using arrays instead of matrices does not change this one bit. Student backgrounds differ by domain of application. In economics, matrices are in *very* wide use, and multidimensional arrays get almost no use. Textbooks in econometrics (a huge and important field, even outside of economics) are full of proofs using matrix algebra. A close match to what the students see is crucial. When working with multiplication or exponentiation, matrices do what they expect, and 2d arrays do not. One more point. As Python users we get used to installing a package here and a package there to add functionality. But this is not how most people looking for a matrix language see the world. Removing the matrix object from NumPy will raise the barrier to adoption by social scientists, and there should be a strongly persuasive reason before taking such a step. Separately from all that, does anyone doubt that there is code that depends on the matrix object? The core objection to a past proposal for useful change was that it could break extant code. I would hope that nobody who took that position would subsequently propose removing the matrix object altogether. Cheers, Alan Isaac PS If x and y are "column vectors" (i.e., matrices), then x.T * y *should* be a 1x1 matrix. Since the * operator is doing matrix multiplication, this is the correct result, not an anomaly. From gely at usc.edu Fri Jun 5 13:33:31 2009 From: gely at usc.edu (Geoffrey Ely) Date: Fri, 5 Jun 2009 10:33:31 -0700 Subject: [Numpy-discussion] matrix default to column vector? In-Reply-To: <1cd32cbb0906051018l6277976flcf3bace25fcb9f5c@mail.gmail.com> References: <75c31b2a0905210610x1a321264r5b6f93d327ef2b36@mail.gmail.com> <23693116.post@talk.nabble.com> <4A283BD5.4080405@noaa.gov> <3699E128-2BCA-45E1-97A1-B0DE3B8E8B74@mac.com> <4A283F96.1050103@american.edu> <513936A2-C309-435B-B9AD-1B8B71D57B06@mac.com> <1cd32cbb0906051018l6277976flcf3bace25fcb9f5c@mail.gmail.com> Message-ID: On Jun 5, 2009, at 10:18 AM, josef.pktd at gmail.com wrote: > I'm only using arrays for consistency, but my econometrics code is > filled with arr[np.newaxis,:] . arr[None,:] is a lot cleaner in my opinion, especially when using "numpy" as the namespace. -Geoff
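A small sketch of the equivalence behind these two spellings (np.newaxis is defined as None, so they are the same operation):

import numpy as np

a = np.arange(3)
print np.newaxis is None        # True: newaxis is just an alias for None
print a[None, :].shape          # (1, 3)
print a[:, np.newaxis].shape    # (3, 1) -- same mechanism, other axis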
From josef.pktd at gmail.com Fri Jun 5 13:44:18 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 5 Jun 2009 13:44:18 -0400 Subject: [Numpy-discussion] matrix default to column vector? In-Reply-To: References: <75c31b2a0905210610x1a321264r5b6f93d327ef2b36@mail.gmail.com> <4A283BD5.4080405@noaa.gov> <3699E128-2BCA-45E1-97A1-B0DE3B8E8B74@mac.com> <4A283F96.1050103@american.edu> <513936A2-C309-435B-B9AD-1B8B71D57B06@mac.com> <1cd32cbb0906051018l6277976flcf3bace25fcb9f5c@mail.gmail.com> Message-ID: <1cd32cbb0906051044u64685529tde2d6f3a46f29b7@mail.gmail.com> On Fri, Jun 5, 2009 at 1:33 PM, Geoffrey Ely wrote: > On Jun 5, 2009, at 10:18 AM, josef.pktd at gmail.com wrote: >> I'm only using arrays for consistency, but my econometrics code is >> filled with arr[np.newaxis,:] . > > > arr[None,:] is a lot cleaner in my opinion, especially when using > "numpy" as the namespace. It took me a long time to figure out that None and np.newaxis do the same thing. I find newaxis more descriptive and mostly stick to it. When I started looking at numpy, I was always wondering what these "None"s were doing in the code. just one more useful trick to preserve dimension, using slices instead of index avoids arr[None,:], but requires more thinking. Josef

>>> X
matrix([[1, 0],
        [1, 1]])
>>> Y
array([[1, 0],
       [1, 1]])
>>> X[0,:].shape == Y[0:1,:].shape
True
>>> X[0,:] == Y[0:1,:]
matrix([[ True,  True]], dtype=bool)

> -Geoff From david at ar.media.kyoto-u.ac.jp Fri Jun 5 13:48:45 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Sat, 06 Jun 2009 02:48:45 +0900 Subject: [Numpy-discussion] performance matrix multiplication vs. matlab In-Reply-To: <4A2955F5.9030006@hawaii.edu> References: <20090604143638.xrqzx8w8v4ggckkk@www.sms.ed.ac.uk> <7f014ea60906041354h6b7a3949n4c45731172483c6d@mail.gmail.com> <7f014ea60906041356p19a53c2bi4258d6a6b93ef367@mail.gmail.com> <4A28EC37.10205@ar.media.kyoto-u.ac.jp> <4A28F16C.3070607@ar.media.kyoto-u.ac.jp> <4A2955F5.9030006@hawaii.edu> Message-ID: <4A295A7D.1040407@ar.media.kyoto-u.ac.jp> Eric Firing wrote: > > David, > > The eigen web site indicates that eigen achieves high performance > without all the compilation difficulty of atlas. Does eigen have enough > functionality to replace atlas in numpy? No, eigen does not provide a (complete) BLAS/LAPACK interface. I don't know if that's even a goal of eigen (it started as a project for KDE, to support high performance core computations for things like spreadsheet and co). But even then, it would be a huge undertaking. For all its flaws, LAPACK is old, tested code, with a very stable language (F77). Eigen is:
- not mature.
- heavily expression-template-based C++, meaning compilation takes ages + esoteric, impossible to decypher compilation errors. We have enough build problems already :)
- SSE dependency harcoded, since it is setup at build time. That's going backward IMHO - I would rather see a numpy/scipy which can load the optimized code at runtime.
cheers, David From matthieu.brucher at gmail.com Fri Jun 5 14:46:37 2009 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Fri, 5 Jun 2009 20:46:37 +0200 Subject: [Numpy-discussion] performance matrix multiplication vs.
matlab In-Reply-To: <4A295A7D.1040407@ar.media.kyoto-u.ac.jp> References: <20090604143638.xrqzx8w8v4ggckkk@www.sms.ed.ac.uk> <7f014ea60906041354h6b7a3949n4c45731172483c6d@mail.gmail.com> <7f014ea60906041356p19a53c2bi4258d6a6b93ef367@mail.gmail.com> <4A28EC37.10205@ar.media.kyoto-u.ac.jp> <4A28F16C.3070607@ar.media.kyoto-u.ac.jp> <4A2955F5.9030006@hawaii.edu> <4A295A7D.1040407@ar.media.kyoto-u.ac.jp> Message-ID: 2009/6/5 David Cournapeau : > Eric Firing wrote: >> >> David, >> >> The eigen web site indicates that eigen achieves high performance >> without all the compilation difficulty of atlas. ?Does eigen have enough >> functionality to replace atlas in numpy? > > No, eigen does not provide a (complete) BLAS/LAPACK interface. I don't > know if that's even a goal of eigen (it started as a project for KDE, to > support high performance core computations for things like spreadsheet > and co). > > But even then, it would be a huge undertaking. For all its flaws, LAPACK > is old, tested code, with a very stable language (F77). Eigen is: > ? ?- not mature. > ? ?- heavily expression-template-based C++, meaning compilation takes > ages + esoteric, impossible to decypher compilation errors. We have > enough build problems already :) > ? ?- SSE dependency harcoded, since it is setup at build time. That's > going backward IMHO - I would rather see a numpy/scipy which can load > the optimized code at runtime. I would add that it relies on C++ compiler extensions (the restrict keyword) as does blitz. You unfortunately can't expect every compiler to support it unless the C++ committee finally adds it to the standard. Matthieu -- Information System Engineer, Ph.D. Website: http://matthieu-brucher.developpez.com/ Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92 LinkedIn: http://www.linkedin.com/in/matthieubrucher From jrennie at gmail.com Fri Jun 5 15:02:06 2009 From: jrennie at gmail.com (Jason Rennie) Date: Fri, 5 Jun 2009 15:02:06 -0400 Subject: [Numpy-discussion] matrix default to column vector? In-Reply-To: <1cd32cbb0906051044u64685529tde2d6f3a46f29b7@mail.gmail.com> References: <75c31b2a0905210610x1a321264r5b6f93d327ef2b36@mail.gmail.com> <4A283BD5.4080405@noaa.gov> <3699E128-2BCA-45E1-97A1-B0DE3B8E8B74@mac.com> <4A283F96.1050103@american.edu> <513936A2-C309-435B-B9AD-1B8B71D57B06@mac.com> <1cd32cbb0906051018l6277976flcf3bace25fcb9f5c@mail.gmail.com> <1cd32cbb0906051044u64685529tde2d6f3a46f29b7@mail.gmail.com> Message-ID: <75c31b2a0906051202t28e51d95m1ecc0ff41034a0ea@mail.gmail.com> As someone who is very used to thinking in terms of matrices and who just went through the adjustment of translating matlab-like code to numpy, I found the current matrix module to be confusing. It's poor integration with the rest of numpy/scipy (in particular, scipy.optimize.fmin_cg) made it more difficult to use than it was worth. I'd rather have "matrix" and/or "matrix multiplication" sections of the documentation explain how to do typical, basic matrix operations with nparray, dot, T, and arr[None,:]. I think a matrix class would still be worthwhile for findability, but it should simply serve as documentation for how to do matrix stuff with nparray. Cheers, Jason -- Jason Rennie Research Scientist, ITA Software 617-714-2645 http://www.itasoftware.com/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From bblais at bryant.edu Fri Jun 5 14:07:42 2009 From: bblais at bryant.edu (Brian Blais) Date: Fri, 05 Jun 2009 14:07:42 -0400 Subject: [Numpy-discussion] vectorizing Message-ID: <17511C60-2781-4A07-87BE-051495D9FE37@bryant.edu>

Hello,

I have a vectorizing problem that I don't see an obvious way to solve. What I have is a vector like:

obs=array([1,2,3,4,3,2,1,2,1,2,1,5,4,3,2])

and a matrix

T=zeros((6,6))

and what I want in T is a count of all of the transitions in obs, e.g. T[1,2]=3 because the sequence 1-2 happens 3 times, T[3,4]=1 because the sequence 3-4 only happens once, etc... I can do it unvectorized like:

for o1,o2 in zip(obs[:-1],obs[1:]):
    T[o1,o2]+=1

which gives the correct answer from above, which is:

array([[ 0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  3.,  0.,  0.,  1.],
       [ 0.,  3.,  0.,  1.,  0.,  0.],
       [ 0.,  0.,  2.,  0.,  1.,  0.],
       [ 0.,  0.,  0.,  2.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  1.,  0.]])

but I thought there would be a better way. I tried:

o1=obs[:-1]
o2=obs[1:]
T[o1,o2]+=1

but this doesn't give a count, it just yields 1's at the transition points, like:

array([[ 0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  1.,  0.,  0.,  1.],
       [ 0.,  1.,  0.,  1.,  0.,  0.],
       [ 0.,  0.,  1.,  0.,  1.,  0.],
       [ 0.,  0.,  0.,  1.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  1.,  0.]])

Is there a clever way to do this? I could write a quick Cython solution, but I wanted to keep this as an all-numpy implementation if I can.

thanks,

Brian Blais

--
Brian Blais
bblais at bryant.edu
http://web.bryant.edu/~bblais

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From stefan at sun.ac.za Fri Jun 5 15:49:47 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Fri, 5 Jun 2009 21:49:47 +0200 Subject: [Numpy-discussion] Maturing the Matrix class in NumPy Message-ID: <9457e7c80906051249x5851f280x11d5ae1c2a97aaba@mail.gmail.com>

Hi Alan

2009/6/5 Alan G Isaac :
> You should take it very seriously when someone
> reports to you that the matrix object is crucial to them,
> e.g., as a teaching tool. Even if you do not find
> personally persuasive an example like
> http://mail.scipy.org/pipermail/numpy-discussion/2009-June/043001.html

You make a good point and the example you gave earlier shows how the Matrix object can be used to write expressions that correspond clearly to those in a textbook. Christopher is right that most of the developers don't use the Matrix class much, but that is no excuse to neglect it as we are currently doing. If the Matrix class is to remain, we need to take the steps necessary to integrate it into NumPy properly.

To get going we'll need a list of changes required (i.e. "in an ideal world, how would matrices work?"). There should be a set protocol for all numpy functions that guarantees compatibility with ndarrays, matrices and other derived classes. This also ties in with a suggestion by Darren Dale on ufuncs that hasn't been getting the attention it deserves. Indexing issues need to be cleared up, with a chapter on matrix use written for the guide.

Being one of the most vocal proponents of the Matrix class, would you be prepared to develop your Matrix Proposal at http://scipy.org/NewMatrixSpec further? We need to get to a point where we can assign tasks, and start writing code. The keyphrase here should be "rough consensus" to avoid the bikeshedding we've run into in the past.
Thanks for following this through,
Regards
Stéfan

From kwgoodman at gmail.com Fri Jun 5 15:52:25 2009 From: kwgoodman at gmail.com (Keith Goodman) Date: Fri, 5 Jun 2009 12:52:25 -0700 Subject: [Numpy-discussion] vectorizing In-Reply-To: <17511C60-2781-4A07-87BE-051495D9FE37@bryant.edu> References: <17511C60-2781-4A07-87BE-051495D9FE37@bryant.edu> Message-ID:

On Fri, Jun 5, 2009 at 11:07 AM, Brian Blais wrote:
> Hello,
> I have a vectorizing problem that I don't see an obvious way to solve. What
> I have is a vector like:
> obs=array([1,2,3,4,3,2,1,2,1,2,1,5,4,3,2])
> [...]
> Is there a clever way to do this? I could write a quick Cython solution,
> but I wanted to keep this as an all-numpy implementation if I can.

It's a little faster (8.5% for me when obs is length 10000) if you do

T = np.zeros((6,6), dtype=np.int)

But it is more than 5 times faster if you use lists for T and obs. You're just storing information here, so there is no reason to pay for the overhead of arrays.

import random
import numpy as np

T = [[0,0,0,0,0,0],
     [0,0,0,0,0,0],
     [0,0,0,0,0,0],
     [0,0,0,0,0,0],
     [0,0,0,0,0,0],
     [0,0,0,0,0,0]]

obs = [random.randint(0, 5) for z in range(10000)]

def test(obs, T):
    for o1,o2 in zip(obs[:-1],obs[1:]):
        T[o1][o2] += 1
    return T

From josef.pktd at gmail.com Fri Jun 5 15:53:17 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 5 Jun 2009 15:53:17 -0400 Subject: [Numpy-discussion] vectorizing In-Reply-To: <17511C60-2781-4A07-87BE-051495D9FE37@bryant.edu> References: <17511C60-2781-4A07-87BE-051495D9FE37@bryant.edu> Message-ID: <1cd32cbb0906051253g6998c612yb4568c549624080d@mail.gmail.com>

On Fri, Jun 5, 2009 at 2:07 PM, Brian Blais wrote:
> [...]
> Is there a clever way to do this? I could write a quick Cython solution,
> but I wanted to keep this as an all-numpy implementation if I can.

histogram2d or its imitation, there was a discussion on histogram2d a short time ago

>>> obs=np.array([1,2,3,4,3,2,1,2,1,2,1,5,4,3,2])
>>> obs2 = obs - 1
>>> trans = np.hstack((0,np.bincount(obs2[:-1]*6+6+obs2[1:]),0)).reshape(6,6)
>>> re = np.array([[ 0.,  0.,  0.,  0.,  0.,  0.],
...                [ 0.,  0.,  3.,  0.,  0.,  1.],
...                [ 0.,  3.,  0.,  1.,  0.,  0.],
...                [ 0.,  0.,  2.,  0.,  1.,  0.],
...                [ 0.,  0.,  0.,  2.,  0.,  0.],
...                [ 0.,  0.,  0.,  0.,  1.,  0.]])
>>> np.all(re == trans)
True
>>> trans
array([[0, 0, 0, 0, 0, 0],
       [0, 0, 3, 0, 0, 1],
       [0, 3, 0, 1, 0, 0],
       [0, 0, 2, 0, 1, 0],
       [0, 0, 0, 2, 0, 0],
       [0, 0, 0, 0, 1, 0]])

or

>>> h, e1, e2 = np.histogram2d(obs[:-1], obs[1:], bins=6, range=[[0,5],[0,5]])
>>> h
array([[ 0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  3.,  0.,  0.,  1.],
       [ 0.,  3.,  0.,  1.,  0.,  0.],
       [ 0.,  0.,  2.,  0.,  1.,  0.],
       [ 0.,  0.,  0.,  2.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  1.,  0.]])
>>> np.all(re == h)
True
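As an aside on why the original vectorized attempt cannot work: T[o1,o2] += 1 expands into a read of T[o1,o2], an add, and a single write back, so repeated (o1,o2) pairs all write through the same temporary and the count collapses to 1. A minimal demonstration (not from the thread):

>>> import numpy as np
>>> a = np.zeros(4)
>>> idx = np.array([2, 2, 2])
>>> a[idx] += 1
>>> a
array([ 0.,  0.,  1.,  0.])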
From kwgoodman at gmail.com Fri Jun 5 16:01:09 2009 From: kwgoodman at gmail.com (Keith Goodman) Date: Fri, 5 Jun 2009 13:01:09 -0700 Subject: [Numpy-discussion] vectorizing In-Reply-To: <1cd32cbb0906051253g6998c612yb4568c549624080d@mail.gmail.com> References: <17511C60-2781-4A07-87BE-051495D9FE37@bryant.edu> <1cd32cbb0906051253g6998c612yb4568c549624080d@mail.gmail.com> Message-ID:

On Fri, Jun 5, 2009 at 12:53 PM, wrote:
> histogram2d or its imitation, there was a discussion on histogram2d a
> short time ago
> [...]

There's no way my list method can beat that. But by adding

import psyco
psyco.full()

I get a total speed up of a factor of 15 when obs is length 10000.

From josef.pktd at gmail.com Fri Jun 5 16:06:40 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 5 Jun 2009 16:06:40 -0400 Subject: [Numpy-discussion] vectorizing In-Reply-To: <1cd32cbb0906051253g6998c612yb4568c549624080d@mail.gmail.com> References: <17511C60-2781-4A07-87BE-051495D9FE37@bryant.edu> <1cd32cbb0906051253g6998c612yb4568c549624080d@mail.gmail.com> Message-ID: <1cd32cbb0906051306ke6c3698p281fca642cc03aa8@mail.gmail.com>

On Fri, Jun 5, 2009 at 3:53 PM, wrote:
> [...]
>>>> trans = np.hstack((0,np.bincount(obs2[:-1]*6+6+obs2[1:]),0)).reshape(6,6)

I don't think this is correct, there are problems for the first and last position if (1,1) and (5,5) are possible. I don't remember how I did it before. The histogram2d version still looks correct for this case. histogram2d uses bincount under the hood in a similar way.

Josef
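One way to guard the end positions josef describes (a sketch, not posted in the thread; the helper name is made up): pad the bincount result out to the full table instead of stacking zeros by hand, so codes that never occur, including the corner transitions, still get a slot.

import numpy as np

def transition_counts(obs, nstates=6):
    # encode each (from, to) pair as a single code in [0, nstates**2)
    obs = np.asarray(obs)
    code = obs[:-1] * nstates + obs[1:]
    counts = np.bincount(code)
    # bincount stops at the largest code actually seen, so pad the tail
    full = np.zeros(nstates * nstates, dtype=int)
    full[:len(counts)] = counts
    return full.reshape(nstates, nstates)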
From kwgoodman at gmail.com Fri Jun 5 16:05:12 2009 From: kwgoodman at gmail.com (Keith Goodman) Date: Fri, 5 Jun 2009 13:05:12 -0700 Subject: [Numpy-discussion] vectorizing In-Reply-To: References: <17511C60-2781-4A07-87BE-051495D9FE37@bryant.edu> <1cd32cbb0906051253g6998c612yb4568c549624080d@mail.gmail.com> Message-ID:

On Fri, Jun 5, 2009 at 1:01 PM, Keith Goodman wrote:
> There's no way my list method can beat that. But by adding
> import psyco
> psyco.full()
> I get a total speed up of a factor of 15 when obs is length 10000.

Actually, it is faster:

histogram:

>> h, e1, e2 = np.histogram2d(obs[:-1], obs[1:], bins=6, range=[[0,5],[0,5]])
>> timeit h, e1, e2 = np.histogram2d(obs[:-1], obs[1:], bins=6, range=[[0,5],[0,5]])
100 loops, best of 3: 4.14 ms per loop

lists:

>> timeit test(obs3, T3)
1000 loops, best of 3: 1.32 ms per loop

From bpederse at gmail.com Fri Jun 5 16:22:14 2009 From: bpederse at gmail.com (Brent Pedersen) Date: Fri, 5 Jun 2009 13:22:14 -0700 Subject: [Numpy-discussion] vectorizing In-Reply-To: References: <17511C60-2781-4A07-87BE-051495D9FE37@bryant.edu> <1cd32cbb0906051253g6998c612yb4568c549624080d@mail.gmail.com> Message-ID:

On Fri, Jun 5, 2009 at 1:05 PM, Keith Goodman wrote:
> [...]
>>> timeit test(obs3, T3)
> 1000 loops, best of 3: 1.32 ms per loop

here's a go:

import numpy as np
import random
from itertools import groupby

def test1(obs, T):
    for o1,o2 in zip(obs[:-1],obs[1:]):
        T[o1][o2] += 1
    return T

def test2(obs, T):
    s = zip(obs[:-1], obs[1:])
    for idx, g in groupby(sorted(s)):
        T[idx] = len(list(g))
    return T

obs = [random.randint(0, 5) for z in range(10000)]

print test2(obs, np.zeros((6, 6)))
print test1(obs, np.zeros((6, 6)))

##############

In [10]: timeit test1(obs, np.zeros((6, 6)))
100 loops, best of 3: 18.8 ms per loop

In [11]: timeit test2(obs, np.zeros((6, 6)))
100 loops, best of 3: 6.91 ms per loop

From kwgoodman at gmail.com Fri Jun 5 16:27:12 2009 From: kwgoodman at gmail.com (Keith Goodman) Date: Fri, 5 Jun 2009 13:27:12 -0700 Subject: [Numpy-discussion] vectorizing In-Reply-To: References: <17511C60-2781-4A07-87BE-051495D9FE37@bryant.edu> <1cd32cbb0906051253g6998c612yb4568c549624080d@mail.gmail.com> Message-ID:

On Fri, Jun 5, 2009 at 1:22 PM, Brent Pedersen wrote:
> [...]
> In [10]: timeit test1(obs, np.zeros((6, 6)))
> 100 loops, best of 3: 18.8 ms per loop
>
> In [11]: timeit test2(obs, np.zeros((6, 6)))
> 100 loops, best of 3: 6.91 ms per loop

Nice!

Try adding

import psyco
psyco.full()

to test1. Or is that cheating?
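For completeness, a dictionary-based pure-Python variant in the same spirit (a sketch, not posted in the thread; the names are made up). It needs neither the sort in test2 nor a preallocated table, which helps when the state space is large and sparse:

from collections import defaultdict

def count_pairs(obs):
    # accumulate a count for each (from, to) pair directly
    counts = defaultdict(int)
    for o1, o2 in zip(obs[:-1], obs[1:]):
        counts[(o1, o2)] += 1
    return counts

# to fill a table afterwards:
# T = np.zeros((6, 6))
# for (o1, o2), n in count_pairs(obs).items():
#     T[o1, o2] = n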
From josef.pktd at gmail.com Fri Jun 5 16:31:24 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 5 Jun 2009 16:31:24 -0400 Subject: [Numpy-discussion] vectorizing In-Reply-To: References: <17511C60-2781-4A07-87BE-051495D9FE37@bryant.edu> <1cd32cbb0906051253g6998c612yb4568c549624080d@mail.gmail.com> Message-ID: <1cd32cbb0906051331l66895c6ar76f5684834ba8447@mail.gmail.com>

On Fri, Jun 5, 2009 at 4:27 PM, Keith Goodman wrote:
> [...]
> Try adding
> import psyco
> psyco.full()
> to test1. Or is that cheating?

how about

from scipy import ndimage
ndimage.histogram((obs[:-1])*6 +obs[1:], 0, 35, 36 ).reshape(6,6)

(bincount doesn't take the limits into account if they don't show up in the data, so first and last position would need special casing)

Josef

From bpederse at gmail.com Fri Jun 5 16:31:54 2009 From: bpederse at gmail.com (Brent Pedersen) Date: Fri, 5 Jun 2009 13:31:54 -0700 Subject: [Numpy-discussion] vectorizing In-Reply-To: References: <17511C60-2781-4A07-87BE-051495D9FE37@bryant.edu> <1cd32cbb0906051253g6998c612yb4568c549624080d@mail.gmail.com> Message-ID:

On Fri, Jun 5, 2009 at 1:27 PM, Keith Goodman wrote:
> Try adding
>
> import psyco
> psyco.full()
>
> to test1. Or is that cheating?

it is if you're running 64bit. :-)
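A quick check of the encoding that josef's one-liner relies on (illustrative only): (obs[:-1])*6 + obs[1:] is just row-major flattening, so entry (i, j) of a 6x6 table lives at position i*6 + j of the raveled array, and counting the codes then reshaping recovers the table.

>>> import numpy as np
>>> T = np.arange(36).reshape(6, 6)
>>> T[3, 4] == T.ravel()[3*6 + 4]
True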
From kwgoodman at gmail.com Fri Jun 5 16:35:53 2009 From: kwgoodman at gmail.com (Keith Goodman) Date: Fri, 5 Jun 2009 13:35:53 -0700 Subject: [Numpy-discussion] vectorizing In-Reply-To: <1cd32cbb0906051331l66895c6ar76f5684834ba8447@mail.gmail.com> References: <17511C60-2781-4A07-87BE-051495D9FE37@bryant.edu> <1cd32cbb0906051253g6998c612yb4568c549624080d@mail.gmail.com> <1cd32cbb0906051331l66895c6ar76f5684834ba8447@mail.gmail.com> Message-ID:

On Fri, Jun 5, 2009 at 1:31 PM, wrote:
> how about
> from scipy import ndimage
> ndimage.histogram((obs[:-1])*6 +obs[1:], 0, 35, 36 ).reshape(6,6)
>
> (bincount doesn't take the limits into account if they don't show up
> in the data, so first and last position would need special casing)

Game over:

>> from scipy.ndimage import histogram
>> timeit histogram((obs[:-1])*6 +obs[1:], 0, 35, 36 ).reshape(6,6)
1000 loops, best of 3: 366 us per loop

From josef.pktd at gmail.com Fri Jun 5 16:41:45 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 5 Jun 2009 16:41:45 -0400 Subject: [Numpy-discussion] vectorizing In-Reply-To: References: <17511C60-2781-4A07-87BE-051495D9FE37@bryant.edu> <1cd32cbb0906051253g6998c612yb4568c549624080d@mail.gmail.com> <1cd32cbb0906051331l66895c6ar76f5684834ba8447@mail.gmail.com> Message-ID: <1cd32cbb0906051341x5976d577jbbff067ad654be9@mail.gmail.com>

On Fri, Jun 5, 2009 at 4:35 PM, Keith Goodman wrote:
> Game over:
>
>>> from scipy.ndimage import histogram
>>> timeit histogram((obs[:-1])*6 +obs[1:], 0, 35, 36 ).reshape(6,6)
> 1000 loops, best of 3: 366 us per loop

Game over: maybe not, ndimage is very fast but crash prone (try feeding the wrong type) and will eventually be rewritten.

Josef
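Given that warning, a defensive wrapper is cheap (a sketch with made-up names; the assumption, not stated in the thread, is that coercing to a contiguous integer array before the call avoids the wrong-dtype crashes):

import numpy as np
from scipy import ndimage

def safe_transition_hist(obs, nstates=6):
    # coerce to a contiguous int array before handing it to ndimage
    obs = np.ascontiguousarray(obs, dtype=np.int32)
    code = obs[:-1] * nstates + obs[1:]
    nbins = nstates * nstates
    return ndimage.histogram(code, 0, nbins - 1, nbins).reshape(nstates, nstates)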
>>>>> >>>> >>>> histogram2d or its imitation, there was a discussion on histogram2d a >>>> short time ago >>>> >>>>>>> obs=np.array([1,2,3,4,3,2,1,2,1,2,1,5,4,3,2]) >>>>>>> obs2 = obs - 1 >>>>>>> trans = np.hstack((0,np.bincount(obs2[:-1]*6+6+obs2[1:]),0)).reshape(6,6) >>>>>>> re = np.array([[ 0., ?0., ?0., ?0., ?0., ?0.], >>>> ... ? ? ? ? [ 0., ?0., ?3., ?0., ?0., ?1.], >>>> ... ? ? ? ? [ 0., ?3., ?0., ?1., ?0., ?0.], >>>> ... ? ? ? ? [ 0., ?0., ?2., ?0., ?1., ?0.], >>>> ... ? ? ? ? [ 0., ?0., ?0., ?2., ?0., ?0.], >>>> ... ? ? ? ? [ 0., ?0., ?0., ?0., ?1., ?0.]]) >>>>>>> np.all(re == trans) >>>> True >>>> >>>>>>> trans >>>> array([[0, 0, 0, 0, 0, 0], >>>> ? ? ? [0, 0, 3, 0, 0, 1], >>>> ? ? ? [0, 3, 0, 1, 0, 0], >>>> ? ? ? [0, 0, 2, 0, 1, 0], >>>> ? ? ? [0, 0, 0, 2, 0, 0], >>>> ? ? ? [0, 0, 0, 0, 1, 0]]) >>>> >>>> >>>> or >>>> >>>>>>> h, e1, e2 = np.histogram2d(obs[:-1], obs[1:], bins=6, range=[[0,5],[0,5]]) >>>>>>> re >>>> array([[ 0., ?0., ?0., ?0., ?0., ?0.], >>>> ? ? ? [ 0., ?0., ?3., ?0., ?0., ?1.], >>>> ? ? ? [ 0., ?3., ?0., ?1., ?0., ?0.], >>>> ? ? ? [ 0., ?0., ?2., ?0., ?1., ?0.], >>>> ? ? ? [ 0., ?0., ?0., ?2., ?0., ?0.], >>>> ? ? ? [ 0., ?0., ?0., ?0., ?1., ?0.]]) >>>>>>> h >>>> array([[ 0., ?0., ?0., ?0., ?0., ?0.], >>>> ? ? ? [ 0., ?0., ?3., ?0., ?0., ?1.], >>>> ? ? ? [ 0., ?3., ?0., ?1., ?0., ?0.], >>>> ? ? ? [ 0., ?0., ?2., ?0., ?1., ?0.], >>>> ? ? ? [ 0., ?0., ?0., ?2., ?0., ?0.], >>>> ? ? ? [ 0., ?0., ?0., ?0., ?1., ?0.]]) >>>> >>>>>>> np.all(re == h) >>>> True >>> >>> There's no way my list method can beat that. But by adding >>> >>> import psyco >>> psyco.full() >>> >>> I get a total speed up of a factor of 15 when obs is length 10000. >> >> Actually, it is faster: >> >> histogram: >> >>>> h, e1, e2 = np.histogram2d(obs[:-1], obs[1:], bins=6, range=[[0,5],[0,5]]) >>>> timeit h, e1, e2 = np.histogram2d(obs[:-1], obs[1:], bins=6, range=[[0,5],[0,5]]) >> 100 loops, best of 3: 4.14 ms per loop >> >> lists: >> >>>> timeit test(obs3, T3) >> 1000 loops, best of 3: 1.32 ms per loop >> _______________________________________________ >> Numpy-discussion mailing list >> Numpy-discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > > here's a go: > > import numpy as np > import random > from itertools import groupby > > def test1(obs, T): > ? for o1,o2 in zip(obs[:-1],obs[1:]): > ? ? ? T[o1][o2] += 1 > ? return T > > > def test2(obs, T): > ? ?s = zip(obs[:-1], obs[1:]) > ? ?for idx, g in groupby(sorted(s)): > ? ? ? ?T[idx] = len(list(g)) > ? ?return T > > obs = [random.randint(0, 5) for z in range(10000)] > > print test2(obs, np.zeros((6, 6))) > print test1(obs, np.zeros((6, 6))) > > > ############## > > In [10]: timeit test1(obs, np.zeros((6, 6))) > 100 loops, best of 3: 18.8 ms per loop > > In [11]: timeit test2(obs, np.zeros((6, 6))) > 100 loops, best of 3: 6.91 ms per loop Wait, you tested the list method with an array. Try timeit test1(obs, np.zeros((6, 6)).tolist()) Probably best to move the array/list creation out of the timeit loop. 
Then my method won't have to pay the cost of converting to a list :)

From aisaac at american.edu  Fri Jun 5 17:14:39 2009
From: aisaac at american.edu (Alan G Isaac)
Date: Fri, 05 Jun 2009 17:14:39 -0400
Subject: [Numpy-discussion] Maturing the Matrix class in NumPy
In-Reply-To: <9457e7c80906051249x5851f280x11d5ae1c2a97aaba@mail.gmail.com>
References: <9457e7c80906051249x5851f280x11d5ae1c2a97aaba@mail.gmail.com>
Message-ID: <4A298ABF.3050103@american.edu>

On 6/5/2009 3:49 PM Stéfan van der Walt apparently wrote:
> If the Matrix class is to remain, we need to take the steps
> necessary to integrate it into NumPy properly.

I think this requires a list of current problems.
Many of the problems for NumPy have been addressed over time.
I believe the remaining problems center more on SciPy than on NumPy.
This requires that users report difficulties.

For example, Jason Rennie says he ran into problems with
scipy.optimize.fmin_cg, although I do not recall him reporting
these (I do recall an optimization problem he reported using
ndarrays). Has he filed a bug report detailing his problem?

> To get going we'll need a list of changes required (i.e. "in an ideal
> world, how would matrices work?").

The key anomaly concerning matrices comes with indexing.
See the introduction here: http://www.scipy.org/MatrixIndexing
However, changing this for the current matrix object was rejected
in the last (exhausting) go-round.

> There should be a set protocol for
> all numpy functions that guarantees compatibility with ndarrays,
> matrices and other derived classes.

My impression was that this was resolved as follows:
handle all ndarray-based objects as arrays (using asarray)
in any NumPy function, but return the subclass when possible.
(E.g., using asmatrix, return a matrix output for a matrix input.)
This seems fine to me.

> Being one of the most vocal proponents of the Matrix class, would you
> be prepared to develop your Matrix Proposal at
> http://scipy.org/NewMatrixSpec further?

I consider my proposal to have the following status: rejected.
I consider the core reason to be: it creates a backwards incompatibility.
That was a very long and exhausting discussion that was productive
in laying out the issues, but I do not think we can progress in that
direction.

The existing matrix object is very usable.
Its primary problem is some indexing anomalies,
http://www.scipy.org/MatrixIndexing
and not everyone saw those as problems.
In terms of NumPy functions, I think the asarray/asmatrix
protocol fits the bill. (Although perhaps I am overlooking
something as a user that is obvious to a developer.)

Cheers,
Alan

From sccolbert at gmail.com  Fri Jun 5 17:24:45 2009
From: sccolbert at gmail.com (Chris Colbert)
Date: Fri, 5 Jun 2009 17:24:45 -0400
Subject: [Numpy-discussion] Maturing the Matrix class in NumPy
In-Reply-To: <4A298ABF.3050103@american.edu>
References: <9457e7c80906051249x5851f280x11d5ae1c2a97aaba@mail.gmail.com>
	<4A298ABF.3050103@american.edu>
Message-ID: <7f014ea60906051424q12e6594dhe90243c7fe5ada1c@mail.gmail.com>

How about just introducing a slightly different syntax which tells numpy
to handle the array like a matrix:

Something along the lines of this:

A = array[[..]]
B = array[[..]]

elementwise multiplication (as it currently is):

C = A * B

matrix multiplication:

C = {A} * {B}

or

C = [A] * [B]

or any other brace we decide on

Essentially, the brace tells numpy to handle the array objects like
matrices, but with no need for a specific matrix type.
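Braces like {A} are not something Python's grammar can express, but the closest legal spelling of the idea is a thin wrapper whose * means matrix multiplication. A rough sketch, with the class name M and everything else invented here for illustration:

import numpy as np

class M(object):
    # hypothetical wrapper: '*' becomes np.dot, while plain
    # ndarrays keep their elementwise '*'
    def __init__(self, a):
        self.a = np.asarray(a)
    def __mul__(self, other):
        return M(np.dot(self.a, other.a))

A = M([[1, 2], [3, 4]])
B = M([[5, 6], [7, 8]])
print (A * B).a   # [[19 22] [43 50]], the matrix product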
Since textbook math typically has some kind of bracket around matrix
variables, I think this would jibe well. Just my .02

Chris

On Fri, Jun 5, 2009 at 5:14 PM, Alan G Isaac wrote:

> On 6/5/2009 3:49 PM Stéfan van der Walt apparently wrote:
> > If the Matrix class is to remain, we need to take the steps
> > necessary to integrate it into NumPy properly.
>
> I think this requires a list of current problems.
> Many of the problems for NumPy have been addressed over time.
> I believe the remaining problems center more on SciPy than on NumPy.
> This requires that users report difficulties.
>
> For example, Jason Rennie says he ran into problems with
> scipy.optimize.fmin_cg, although I do not recall him reporting
> these (I do recall an optimization problem he reported using
> ndarrays). Has he filed a bug report detailing his problem?
>
> > To get going we'll need a list of changes required (i.e. "in an ideal
> > world, how would matrices work?").
>
> The key anomaly concerning matrices comes with indexing.
> See the introduction here: http://www.scipy.org/MatrixIndexing
> However, changing this for the current matrix object was rejected
> in the last (exhausting) go-round.
>
> > There should be a set protocol for
> > all numpy functions that guarantees compatibility with ndarrays,
> > matrices and other derived classes.
>
> My impression was that this was resolved as follows:
> handle all ndarray-based objects as arrays (using asarray)
> in any NumPy function, but return the subclass when possible.
> (E.g., using asmatrix, return a matrix output for a matrix input.)
> This seems fine to me.
>
> > Being one of the most vocal proponents of the Matrix class, would you
> > be prepared to develop your Matrix Proposal at
> > http://scipy.org/NewMatrixSpec further?
>
> I consider my proposal to have the following status: rejected.
> I consider the core reason to be: it creates a backwards incompatibility.
> That was a very long and exhausting discussion that was productive
> in laying out the issues, but I do not think we can progress in that
> direction.
>
> The existing matrix object is very usable.
> Its primary problem is some indexing anomalies,
> http://www.scipy.org/MatrixIndexing
> and not everyone saw those as problems.
> In terms of NumPy functions, I think the asarray/asmatrix
> protocol fits the bill. (Although perhaps I am overlooking
> something as a user that is obvious to a developer.)
>
> Cheers,
> Alan
>
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From robert.kern at gmail.com  Fri Jun 5 17:28:40 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Fri, 5 Jun 2009 16:28:40 -0500
Subject: [Numpy-discussion] Maturing the Matrix class in NumPy
In-Reply-To: <7f014ea60906051424q12e6594dhe90243c7fe5ada1c@mail.gmail.com>
References: <9457e7c80906051249x5851f280x11d5ae1c2a97aaba@mail.gmail.com>
	<4A298ABF.3050103@american.edu>
	<7f014ea60906051424q12e6594dhe90243c7fe5ada1c@mail.gmail.com>
Message-ID: <3d375d730906051428i4d99fc72wb76e3d4561206715@mail.gmail.com>

On Fri, Jun 5, 2009 at 16:24, Chris Colbert wrote:
> How about just introducing a slightly different syntax which tells numpy
> to handle the array like a matrix:
>
> Something along the lines of this:
>
> A = array[[..]]
> B = array[[..]]
>
> elementwise multiplication (as it currently is):
>
> C = A * B
>
> matrix multiplication:
>
> C = {A} * {B}
>
> or
>
> C = [A] * [B]
>
> or any other brace we decide on
>
> Essentially, the brace tells numpy to handle the array objects like
> matrices, but with no need for a specific matrix type.

We don't control the Python language. We cannot make these kinds of
changes to the syntax.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

From mforbes at physics.ubc.ca  Fri Jun 5 17:14:57 2009
From: mforbes at physics.ubc.ca (Michael McNeil Forbes)
Date: Fri, 5 Jun 2009 15:14:57 -0600
Subject: [Numpy-discussion] IndexExpression bug?
Message-ID: 

>>> np.array([0,1,2,3])[1:-1]
array([1, 2])

but

>>> np.array([0,1,2,3])[np.s_[1:-1]]
array([1, 2, 3])
>>> np.array([0,1,2,3])[np.index_exp[1:-1]]
array([1, 2, 3])

Possible fix:

class IndexExpression(object):
    ...
    def __len__(self):
        return 0

(Presently this returns sys.maxint.)

Does this break anything (I can't find any coverage tests)?  If not, I
will submit a ticket.

Michael.

From sccolbert at gmail.com  Fri Jun 5 17:37:49 2009
From: sccolbert at gmail.com (Chris Colbert)
Date: Fri, 5 Jun 2009 17:37:49 -0400
Subject: [Numpy-discussion] performance matrix multiplication vs. matlab
In-Reply-To: 
References: <20090604143638.xrqzx8w8v4ggckkk@www.sms.ed.ac.uk>
	<7f014ea60906041354h6b7a3949n4c45731172483c6d@mail.gmail.com>
	<7f014ea60906041356p19a53c2bi4258d6a6b93ef367@mail.gmail.com>
	<4A28EC37.10205@ar.media.kyoto-u.ac.jp>
	<4A28F16C.3070607@ar.media.kyoto-u.ac.jp>
	<4A2955F5.9030006@hawaii.edu>
	<4A295A7D.1040407@ar.media.kyoto-u.ac.jp>
Message-ID: <7f014ea60906051437p6002c6f7jb2d77374a220d2aa@mail.gmail.com>

I'll caution anyone from using Atlas from the repos in Ubuntu 9.04 as the
package is broken:

https://bugs.launchpad.net/ubuntu/+source/atlas/+bug/363510

just build Atlas yourself, you get better performance AND threading.
Building it is not the nightmare it sounds like. I think I've done it a
total of four times now, both 32-bit and 64-bit builds.

If you need help with it, just email me off list.

Cheers,

Chris

On Fri, Jun 5, 2009 at 2:46 PM, Matthieu Brucher wrote:
> 2009/6/5 David Cournapeau :
> > Eric Firing wrote:
> >>
> >> David,
> >>
> >> The eigen web site indicates that eigen achieves high performance
> >> without all the compilation difficulty of atlas. Does eigen have enough
> >> functionality to replace atlas in numpy?
> >
> > No, eigen does not provide a (complete) BLAS/LAPACK interface. I don't
> > know if that's even a goal of eigen (it started as a project for KDE, to
> > support high-performance core computations for things like spreadsheets
> > and co).
> >
> > But even then, it would be a huge undertaking. For all its flaws, LAPACK
> > is old, tested code, with a very stable language (F77). Eigen is:
> >    - not mature.
> >    - heavily expression-template-based C++, meaning compilation takes
> > ages + esoteric, impossible-to-decipher compilation errors. We have
> > enough build problems already :)
> >    - SSE dependency hardcoded, since it is set up at build time. That's
> > going backward IMHO - I would rather see a numpy/scipy which can load
> > the optimized code at runtime.
>
> I would add that it relies on C++ compiler extensions (the restrict
> keyword) as does blitz. You unfortunately can't expect every compiler
> to support it unless the C++ committee finally adds it to the
> standard.
>
> Matthieu
> --
> Information System Engineer, Ph.D.
> Website: http://matthieu-brucher.developpez.com/
> Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92
> LinkedIn: http://www.linkedin.com/in/matthieubrucher
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From sccolbert at gmail.com  Fri Jun 5 17:41:54 2009
From: sccolbert at gmail.com (Chris Colbert)
Date: Fri, 5 Jun 2009 17:41:54 -0400
Subject: [Numpy-discussion] Maturing the Matrix class in NumPy
In-Reply-To: <3d375d730906051428i4d99fc72wb76e3d4561206715@mail.gmail.com>
References: <9457e7c80906051249x5851f280x11d5ae1c2a97aaba@mail.gmail.com>
	<4A298ABF.3050103@american.edu>
	<7f014ea60906051424q12e6594dhe90243c7fe5ada1c@mail.gmail.com>
	<3d375d730906051428i4d99fc72wb76e3d4561206715@mail.gmail.com>
Message-ID: <7f014ea60906051441t4bc73554j9da0b29ef2d9ffd6@mail.gmail.com>

well, it sounded like a good idea. Oh, well.

On Fri, Jun 5, 2009 at 5:28 PM, Robert Kern wrote:
> On Fri, Jun 5, 2009 at 16:24, Chris Colbert wrote:
> > How about just introducing a slightly different syntax which tells
> > numpy to handle the array like a matrix:
> >
> > Something along the lines of this:
> >
> > A = array[[..]]
> > B = array[[..]]
> >
> > elementwise multiplication (as it currently is):
> >
> > C = A * B
> >
> > matrix multiplication:
> >
> > C = {A} * {B}
> >
> > or
> >
> > C = [A] * [B]
> >
> > or any other brace we decide on
> >
> > Essentially, the brace tells numpy to handle the array objects like
> > matrices, but with no need for a specific matrix type.
>
> We don't control the Python language. We cannot make these kinds of
> changes to the syntax.
>
> --
> Robert Kern
>
> "I have come to believe that the whole world is an enigma, a harmless
> enigma that is made terrible by our own mad attempt to interpret it as
> though it had an underlying truth."
> -- Umberto Eco
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From bpederse at gmail.com Fri Jun 5 17:59:09 2009 From: bpederse at gmail.com (Brent Pedersen) Date: Fri, 5 Jun 2009 14:59:09 -0700 Subject: [Numpy-discussion] vectorizing In-Reply-To: References: <17511C60-2781-4A07-87BE-051495D9FE37@bryant.edu> <1cd32cbb0906051253g6998c612yb4568c549624080d@mail.gmail.com> Message-ID: On Fri, Jun 5, 2009 at 2:01 PM, Keith Goodman wrote: > On Fri, Jun 5, 2009 at 1:22 PM, Brent Pedersen wrote: >> On Fri, Jun 5, 2009 at 1:05 PM, Keith Goodman wrote: >>> On Fri, Jun 5, 2009 at 1:01 PM, Keith Goodman wrote: >>>> On Fri, Jun 5, 2009 at 12:53 PM, ? wrote: >>>>> On Fri, Jun 5, 2009 at 2:07 PM, Brian Blais wrote: >>>>>> Hello, >>>>>> I have a vectorizing problem that I don't see an obvious way to solve. ?What >>>>>> I have is a vector like: >>>>>> obs=array([1,2,3,4,3,2,1,2,1,2,1,5,4,3,2]) >>>>>> and a matrix >>>>>> T=zeros((6,6)) >>>>>> and what I want in T is a count of all of the transitions in obs, e.g. >>>>>> T[1,2]=3 because the sequence 1-2 happens 3 times, ?T[3,4]=1 because the >>>>>> sequence 3-4 only happens once, etc... ?I can do it unvectorized like: >>>>>> for o1,o2 in zip(obs[:-1],obs[1:]): >>>>>> ?? ?T[o1,o2]+=1 >>>>>> >>>>>> which gives the correct answer from above, which is: >>>>>> array([[ 0., ?0., ?0., ?0., ?0., ?0.], >>>>>> ?? ? ? [ 0., ?0., ?3., ?0., ?0., ?1.], >>>>>> ?? ? ? [ 0., ?3., ?0., ?1., ?0., ?0.], >>>>>> ?? ? ? [ 0., ?0., ?2., ?0., ?1., ?0.], >>>>>> ?? ? ? [ 0., ?0., ?0., ?2., ?0., ?0.], >>>>>> ?? ? ? [ 0., ?0., ?0., ?0., ?1., ?0.]]) >>>>>> >>>>>> >>>>>> but I thought there would be a better way. ?I tried: >>>>>> o1=obs[:-1] >>>>>> o2=obs[1:] >>>>>> T[o1,o2]+=1 >>>>>> but this doesn't give a count, it just yields 1's at the transition points, >>>>>> like: >>>>>> array([[ 0., ?0., ?0., ?0., ?0., ?0.], >>>>>> ?? ? ? [ 0., ?0., ?1., ?0., ?0., ?1.], >>>>>> ?? ? ? [ 0., ?1., ?0., ?1., ?0., ?0.], >>>>>> ?? ? ? [ 0., ?0., ?1., ?0., ?1., ?0.], >>>>>> ?? ? ? [ 0., ?0., ?0., ?1., ?0., ?0.], >>>>>> ?? ? ? [ 0., ?0., ?0., ?0., ?1., ?0.]]) >>>>>> >>>>>> Is there a clever way to do this? ?I could write a quick Cython solution, >>>>>> but I wanted to keep this as an all-numpy implementation if I can. >>>>>> >>>>> >>>>> histogram2d or its imitation, there was a discussion on histogram2d a >>>>> short time ago >>>>> >>>>>>>> obs=np.array([1,2,3,4,3,2,1,2,1,2,1,5,4,3,2]) >>>>>>>> obs2 = obs - 1 >>>>>>>> trans = np.hstack((0,np.bincount(obs2[:-1]*6+6+obs2[1:]),0)).reshape(6,6) >>>>>>>> re = np.array([[ 0., ?0., ?0., ?0., ?0., ?0.], >>>>> ... ? ? ? ? [ 0., ?0., ?3., ?0., ?0., ?1.], >>>>> ... ? ? ? ? [ 0., ?3., ?0., ?1., ?0., ?0.], >>>>> ... ? ? ? ? [ 0., ?0., ?2., ?0., ?1., ?0.], >>>>> ... ? ? ? ? [ 0., ?0., ?0., ?2., ?0., ?0.], >>>>> ... ? ? ? ? [ 0., ?0., ?0., ?0., ?1., ?0.]]) >>>>>>>> np.all(re == trans) >>>>> True >>>>> >>>>>>>> trans >>>>> array([[0, 0, 0, 0, 0, 0], >>>>> ? ? ? [0, 0, 3, 0, 0, 1], >>>>> ? ? ? [0, 3, 0, 1, 0, 0], >>>>> ? ? ? [0, 0, 2, 0, 1, 0], >>>>> ? ? ? [0, 0, 0, 2, 0, 0], >>>>> ? ? ? [0, 0, 0, 0, 1, 0]]) >>>>> >>>>> >>>>> or >>>>> >>>>>>>> h, e1, e2 = np.histogram2d(obs[:-1], obs[1:], bins=6, range=[[0,5],[0,5]]) >>>>>>>> re >>>>> array([[ 0., ?0., ?0., ?0., ?0., ?0.], >>>>> ? ? ? [ 0., ?0., ?3., ?0., ?0., ?1.], >>>>> ? ? ? [ 0., ?3., ?0., ?1., ?0., ?0.], >>>>> ? ? ? [ 0., ?0., ?2., ?0., ?1., ?0.], >>>>> ? ? ? [ 0., ?0., ?0., ?2., ?0., ?0.], >>>>> ? ? ? [ 0., ?0., ?0., ?0., ?1., ?0.]]) >>>>>>>> h >>>>> array([[ 0., ?0., ?0., ?0., ?0., ?0.], >>>>> ? ? ? [ 0., ?0., ?3., ?0., ?0., ?1.], >>>>> ? ? ? 
[ 0., ?3., ?0., ?1., ?0., ?0.], >>>>> ? ? ? [ 0., ?0., ?2., ?0., ?1., ?0.], >>>>> ? ? ? [ 0., ?0., ?0., ?2., ?0., ?0.], >>>>> ? ? ? [ 0., ?0., ?0., ?0., ?1., ?0.]]) >>>>> >>>>>>>> np.all(re == h) >>>>> True >>>> >>>> There's no way my list method can beat that. But by adding >>>> >>>> import psyco >>>> psyco.full() >>>> >>>> I get a total speed up of a factor of 15 when obs is length 10000. >>> >>> Actually, it is faster: >>> >>> histogram: >>> >>>>> h, e1, e2 = np.histogram2d(obs[:-1], obs[1:], bins=6, range=[[0,5],[0,5]]) >>>>> timeit h, e1, e2 = np.histogram2d(obs[:-1], obs[1:], bins=6, range=[[0,5],[0,5]]) >>> 100 loops, best of 3: 4.14 ms per loop >>> >>> lists: >>> >>>>> timeit test(obs3, T3) >>> 1000 loops, best of 3: 1.32 ms per loop >>> _______________________________________________ >>> Numpy-discussion mailing list >>> Numpy-discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >> >> >> here's a go: >> >> import numpy as np >> import random >> from itertools import groupby >> >> def test1(obs, T): >> ? for o1,o2 in zip(obs[:-1],obs[1:]): >> ? ? ? T[o1][o2] += 1 >> ? return T >> >> >> def test2(obs, T): >> ? ?s = zip(obs[:-1], obs[1:]) >> ? ?for idx, g in groupby(sorted(s)): >> ? ? ? ?T[idx] = len(list(g)) >> ? ?return T >> >> obs = [random.randint(0, 5) for z in range(10000)] >> >> print test2(obs, np.zeros((6, 6))) >> print test1(obs, np.zeros((6, 6))) >> >> >> ############## >> >> In [10]: timeit test1(obs, np.zeros((6, 6))) >> 100 loops, best of 3: 18.8 ms per loop >> >> In [11]: timeit test2(obs, np.zeros((6, 6))) >> 100 loops, best of 3: 6.91 ms per loop > > Wait, you tested the list method with an array. Try > > timeit test1(obs, np.zeros((6, 6)).tolist()) > > Probably best to move the array/list creation out of the timeit loop. > Then my method won't have to pay the cost of converting to a list :) > ah right, your test1 is faster than test2 that way. i'm just stoked to find out about ndimage.historgram. using this: >>> histogram((obs[:-1])*6 +obs[1:], 0, 36, 36).reshape(6,6) it gives the exact result as test1/test2. -b _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From aisaac at american.edu Fri Jun 5 18:02:09 2009 From: aisaac at american.edu (Alan G Isaac) Date: Fri, 05 Jun 2009 18:02:09 -0400 Subject: [Numpy-discussion] matrix multiplication In-Reply-To: <7f014ea60906051441t4bc73554j9da0b29ef2d9ffd6@mail.gmail.com> References: <9457e7c80906051249x5851f280x11d5ae1c2a97aaba@mail.gmail.com> <4A298ABF.3050103@american.edu> <7f014ea60906051424q12e6594dhe90243c7fe5ada1c@mail.gmail.com> <3d375d730906051428i4d99fc72wb76e3d4561206715@mail.gmail.com> <7f014ea60906051441t4bc73554j9da0b29ef2d9ffd6@mail.gmail.com> Message-ID: <4A2995E1.1090601@american.edu> On 6/5/2009 5:41 PM Chris Colbert apparently wrote: > well, it sounded like a good idea. I think something close to this would be possible: add dot as an array method. A .dot(B) .dot(C) is not as pretty as A * B * C but it is much better than np.dot(np.dot(A,B),C) In fact it is so much better, that it might (?) be worth considering separately from the entire matrix discussion. And it does not provide a matrix exponential, of course. 
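A rough sketch of how the proposal could be prototyped today as an ndarray subclass (the DArray name is invented here; plain ndarrays currently have no dot method):

import numpy as np

class DArray(np.ndarray):
    # minimal sketch: a dot() method so chained products read left to right
    def dot(self, other):
        return np.dot(self, other).view(DArray)

A = np.arange(4.).reshape(2, 2).view(DArray)
B = np.eye(2).view(DArray)
C = np.ones((2, 2)).view(DArray)
print A.dot(B).dot(C)   # same result as np.dot(np.dot(A, B), C)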
Alan Isaac From matthew.brett at gmail.com Fri Jun 5 18:09:12 2009 From: matthew.brett at gmail.com (Matthew Brett) Date: Fri, 5 Jun 2009 18:09:12 -0400 Subject: [Numpy-discussion] matrix multiplication In-Reply-To: <4A2995E1.1090601@american.edu> References: <9457e7c80906051249x5851f280x11d5ae1c2a97aaba@mail.gmail.com> <4A298ABF.3050103@american.edu> <7f014ea60906051424q12e6594dhe90243c7fe5ada1c@mail.gmail.com> <3d375d730906051428i4d99fc72wb76e3d4561206715@mail.gmail.com> <7f014ea60906051441t4bc73554j9da0b29ef2d9ffd6@mail.gmail.com> <4A2995E1.1090601@american.edu> Message-ID: <1e2af89e0906051509q7b7fc50aye13bf34ebc45725a@mail.gmail.com> > I think something close to this would be possible: > add dot as an array method. > ? ? ? ?A .dot(B) .dot(C) > is not as pretty as > ? ? ? ?A * B * C > but it is much better than > ? ? ? ?np.dot(np.dot(A,B),C) That is much better. Matthew From robert.kern at gmail.com Fri Jun 5 18:12:02 2009 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 5 Jun 2009 17:12:02 -0500 Subject: [Numpy-discussion] IndexExpression bug? In-Reply-To: References: Message-ID: <3d375d730906051512r584f253dx864c51ceed6f40cf@mail.gmail.com> On Fri, Jun 5, 2009 at 16:14, Michael McNeil Forbes wrote: > ?>>> np.array([0,1,2,3])[1:-1] > array([1, 2]) > > but > > ?>>> np.array([0,1,2,3])[np.s_[1:-1]] > array([1, 2, 3]) > ?>>> np.array([0,1,2,3])[np.index_exp[1:-1]] > array([1, 2, 3]) > > Possible fix: > class IndexExpression(object): > ? ? ... > ? ? def __len__(self): > ? ? ? ? return 0 > > (Presently this returns sys.maxint) > > Does this break anything (I can't find any coverage tests)? ?If not, I > will submit a ticket. I think that getting rid of __getslice__ and __len__ should work better. I don't really understand what the logic was behind including them in the first place, though. I might be missing something. In [21]: %cpaste Pasting code; enter '--' alone on the line to stop. :class IndexExpression(object): : """ : A nicer way to build up index tuples for arrays. : : For any index combination, including slicing and axis insertion, : 'a[indices]' is the same as 'a[index_exp[indices]]' for any : array 'a'. However, 'index_exp[indices]' can be used anywhere : in Python code and returns a tuple of slice objects that can be : used in the construction of complex index expressions. : """ : : def __init__(self, maketuple): : self.maketuple = maketuple : : def __getitem__(self, item): : if self.maketuple and type(item) is not tuple: : return (item,) : else: : return item :-- In [22]: s2 = IndexExpression(False) In [23]: s2[1:-1] Out[23]: slice(1, -1, None) -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From charlesr.harris at gmail.com Fri Jun 5 18:30:45 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 5 Jun 2009 16:30:45 -0600 Subject: [Numpy-discussion] matrix multiplication In-Reply-To: <4A2995E1.1090601@american.edu> References: <9457e7c80906051249x5851f280x11d5ae1c2a97aaba@mail.gmail.com> <4A298ABF.3050103@american.edu> <7f014ea60906051424q12e6594dhe90243c7fe5ada1c@mail.gmail.com> <3d375d730906051428i4d99fc72wb76e3d4561206715@mail.gmail.com> <7f014ea60906051441t4bc73554j9da0b29ef2d9ffd6@mail.gmail.com> <4A2995E1.1090601@american.edu> Message-ID: On Fri, Jun 5, 2009 at 4:02 PM, Alan G Isaac wrote: > On 6/5/2009 5:41 PM Chris Colbert apparently wrote: > > well, it sounded like a good idea. 
> >
> I think something close to this would be possible:
> add dot as an array method.
>         A .dot(B) .dot(C)
> is not as pretty as
>         A * B * C
> but it is much better than
>         np.dot(np.dot(A,B),C)
>

I prefer using the evaluation operator, so that becomes

A(B)(C)

or, changing the multiplication order a bit,

A(B(C))

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From kwgoodman at gmail.com  Fri Jun 5 18:54:20 2009
From: kwgoodman at gmail.com (Keith Goodman)
Date: Fri, 5 Jun 2009 15:54:20 -0700
Subject: [Numpy-discussion] matrix multiplication
In-Reply-To: <4A2995E1.1090601@american.edu>
References: <9457e7c80906051249x5851f280x11d5ae1c2a97aaba@mail.gmail.com>
	<4A298ABF.3050103@american.edu>
	<7f014ea60906051424q12e6594dhe90243c7fe5ada1c@mail.gmail.com>
	<3d375d730906051428i4d99fc72wb76e3d4561206715@mail.gmail.com>
	<7f014ea60906051441t4bc73554j9da0b29ef2d9ffd6@mail.gmail.com>
	<4A2995E1.1090601@american.edu>
Message-ID: 

On Fri, Jun 5, 2009 at 3:02 PM, Alan G Isaac wrote:
> I think something close to this would be possible:
> add dot as an array method.
>         A .dot(B) .dot(C)
> is not as pretty as
>         A * B * C
> but it is much better than
>         np.dot(np.dot(A,B),C)

I've noticed that x.sum() is faster than sum(x)

>> x = np.array([1,2,3])
>> timeit x.sum()
100000 loops, best of 3: 3.01 µs per loop
>> from numpy import sum
>> timeit sum(x)
100000 loops, best of 3: 4.84 µs per loop

Would the same be true of dot? That is, would x.dot(y) be faster than
dot(x,y)? Or is it just that np.sum() has to go through some extra
python code before it hits the C code?

In general, if I'm trying to speed up an inner loop, I try to replace
func(x) with x.func(). But I don't really understand the general
principle at work here.

From robert.kern at gmail.com  Fri Jun 5 19:00:10 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Fri, 5 Jun 2009 18:00:10 -0500
Subject: [Numpy-discussion] matrix multiplication
In-Reply-To: 
References: <9457e7c80906051249x5851f280x11d5ae1c2a97aaba@mail.gmail.com>
	<4A298ABF.3050103@american.edu>
	<7f014ea60906051424q12e6594dhe90243c7fe5ada1c@mail.gmail.com>
	<3d375d730906051428i4d99fc72wb76e3d4561206715@mail.gmail.com>
	<7f014ea60906051441t4bc73554j9da0b29ef2d9ffd6@mail.gmail.com>
	<4A2995E1.1090601@american.edu>
Message-ID: <3d375d730906051600h7141c27bk2544d94e9a907bee@mail.gmail.com>

On Fri, Jun 5, 2009 at 17:54, Keith Goodman wrote:
> On Fri, Jun 5, 2009 at 3:02 PM, Alan G Isaac wrote:
>> I think something close to this would be possible:
>> add dot as an array method.
>>         A .dot(B) .dot(C)
>> is not as pretty as
>>         A * B * C
>> but it is much better than
>>         np.dot(np.dot(A,B),C)
>
> I've noticed that x.sum() is faster than sum(x)
>
>>> x = np.array([1,2,3])
>>> timeit x.sum()
> 100000 loops, best of 3: 3.01 µs per loop
>>> from numpy import sum
>>> timeit sum(x)
> 100000 loops, best of 3: 4.84 µs per loop
>
> Would the same be true of dot? That is, would x.dot(y) be faster than
> dot(x,y)? Or is it just that np.sum() has to go through some extra
> python code before it hits the C code?

No and yes, respectively.

> In general, if I'm trying to speed up an inner loop, I try to replace
> func(x) with x.func(). But I don't really understand the general
> principle at work here.

Most of the functions that mirror methods, including numpy.sum(), are
Python functions that have a little bit of code to convert the input
to an array if necessary and to dispatch to the method. numpy.dot()
is already implemented in C.
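Very roughly, the kind of thin dispatching wrapper being described looks like this (a simplified sketch, not numpy's actual source):

import numpy as np

def sum_(a, axis=None):
    # try the method first; fall back to converting the input to an array
    try:
        method = a.sum
    except AttributeError:
        return np.asarray(a).sum(axis)
    return method(axis)

print sum_([1, 2, 3])      # list input pays the asarray() conversion
print sum_(np.arange(10))  # ndarray input dispatches straight to the method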
--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

From Chris.Barker at noaa.gov  Fri Jun 5 20:19:16 2009
From: Chris.Barker at noaa.gov (Christopher Barker)
Date: Fri, 05 Jun 2009 17:19:16 -0700
Subject: [Numpy-discussion] matrix multiplication
In-Reply-To: <3d375d730906051600h7141c27bk2544d94e9a907bee@mail.gmail.com>
References: <9457e7c80906051249x5851f280x11d5ae1c2a97aaba@mail.gmail.com>
	<4A298ABF.3050103@american.edu>
	<7f014ea60906051424q12e6594dhe90243c7fe5ada1c@mail.gmail.com>
	<3d375d730906051428i4d99fc72wb76e3d4561206715@mail.gmail.com>
	<7f014ea60906051441t4bc73554j9da0b29ef2d9ffd6@mail.gmail.com>
	<4A2995E1.1090601@american.edu>
	<3d375d730906051600h7141c27bk2544d94e9a907bee@mail.gmail.com>
Message-ID: <4A29B604.1040106@noaa.gov>

Robert Kern wrote:
>>>> x = np.array([1,2,3])
>>>> timeit x.sum()
>> 100000 loops, best of 3: 3.01 µs per loop
>>>> from numpy import sum
>>>> timeit sum(x)
>> 100000 loops, best of 3: 4.84 µs per loop

that is a VERY short array, so one extra function call overhead could
make the difference. Is it really your use case to have such tiny sums
inside a big loop, and is there no way to vectorize that?

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov

From kwgoodman at gmail.com  Fri Jun 5 20:36:07 2009
From: kwgoodman at gmail.com (Keith Goodman)
Date: Fri, 5 Jun 2009 17:36:07 -0700
Subject: [Numpy-discussion] matrix multiplication
In-Reply-To: <4A29B604.1040106@noaa.gov>
References: <9457e7c80906051249x5851f280x11d5ae1c2a97aaba@mail.gmail.com>
	<4A298ABF.3050103@american.edu>
	<7f014ea60906051424q12e6594dhe90243c7fe5ada1c@mail.gmail.com>
	<3d375d730906051428i4d99fc72wb76e3d4561206715@mail.gmail.com>
	<7f014ea60906051441t4bc73554j9da0b29ef2d9ffd6@mail.gmail.com>
	<4A2995E1.1090601@american.edu>
	<3d375d730906051600h7141c27bk2544d94e9a907bee@mail.gmail.com>
	<4A29B604.1040106@noaa.gov>
Message-ID: 

On Fri, Jun 5, 2009 at 5:19 PM, Christopher Barker wrote:
> Robert Kern wrote:
>>>>> x = np.array([1,2,3])
>>>>> timeit x.sum()
>>> 100000 loops, best of 3: 3.01 µs per loop
>>>>> from numpy import sum
>>>>> timeit sum(x)
>>> 100000 loops, best of 3: 4.84 µs per loop
>
> that is a VERY short array, so one extra function call overhead could
> make the difference. Is it really your use case to have such tiny sums
> inside a big loop, and is there no way to vectorize that?

I was trying to make the timeit difference large. It is the overhead
that I was interested in. But it is still noticeable when x is a
"typical" size:

>> x = np.arange(1000)
>> timeit x.sum()
100000 loops, best of 3: 5.46 µs per loop
>> from numpy import sum
>> timeit sum(x)
100000 loops, best of 3: 7.31 µs per loop

>> x = np.random.rand(1000)
>> timeit x.sum()
100000 loops, best of 3: 6.81 µs per loop
>> timeit sum(x)
100000 loops, best of 3: 8.36 µs per loop

That reminds me of a big difference between arrays and matrices.
Matrices have the overhead of going through python code (the matrix
class) to get to the core C array code. I converted an iterative
optimization function from matrices to arrays and got a speed-up of a
factor of 3.5.
Array size was around (500, 10) and (500,) From josef.pktd at gmail.com Fri Jun 5 21:09:14 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 5 Jun 2009 21:09:14 -0400 Subject: [Numpy-discussion] vectorizing In-Reply-To: References: <17511C60-2781-4A07-87BE-051495D9FE37@bryant.edu> <1cd32cbb0906051253g6998c612yb4568c549624080d@mail.gmail.com> Message-ID: <1cd32cbb0906051809m5f3ba82fm45223bf76b3483dc@mail.gmail.com> On Fri, Jun 5, 2009 at 5:59 PM, Brent Pedersen wrote: > On Fri, Jun 5, 2009 at 2:01 PM, Keith Goodman wrote: >> On Fri, Jun 5, 2009 at 1:22 PM, Brent Pedersen wrote: >>> On Fri, Jun 5, 2009 at 1:05 PM, Keith Goodman wrote: >>>> On Fri, Jun 5, 2009 at 1:01 PM, Keith Goodman wrote: >>>>> On Fri, Jun 5, 2009 at 12:53 PM, ? wrote: >>>>>> On Fri, Jun 5, 2009 at 2:07 PM, Brian Blais wrote: >>>>>>> Hello, >>>>>>> I have a vectorizing problem that I don't see an obvious way to solve. ?What >>>>>>> I have is a vector like: >>>>>>> obs=array([1,2,3,4,3,2,1,2,1,2,1,5,4,3,2]) >>>>>>> and a matrix >>>>>>> T=zeros((6,6)) >>>>>>> and what I want in T is a count of all of the transitions in obs, e.g. >>>>>>> T[1,2]=3 because the sequence 1-2 happens 3 times, ?T[3,4]=1 because the >>>>>>> sequence 3-4 only happens once, etc... ?I can do it unvectorized like: >>>>>>> for o1,o2 in zip(obs[:-1],obs[1:]): >>>>>>> ?? ?T[o1,o2]+=1 >>>>>>> >>>>>>> which gives the correct answer from above, which is: >>>>>>> array([[ 0., ?0., ?0., ?0., ?0., ?0.], >>>>>>> ?? ? ? [ 0., ?0., ?3., ?0., ?0., ?1.], >>>>>>> ?? ? ? [ 0., ?3., ?0., ?1., ?0., ?0.], >>>>>>> ?? ? ? [ 0., ?0., ?2., ?0., ?1., ?0.], >>>>>>> ?? ? ? [ 0., ?0., ?0., ?2., ?0., ?0.], >>>>>>> ?? ? ? [ 0., ?0., ?0., ?0., ?1., ?0.]]) >>>>>>> >>>>>>> >>>>>>> but I thought there would be a better way. ?I tried: >>>>>>> o1=obs[:-1] >>>>>>> o2=obs[1:] >>>>>>> T[o1,o2]+=1 >>>>>>> but this doesn't give a count, it just yields 1's at the transition points, >>>>>>> like: >>>>>>> array([[ 0., ?0., ?0., ?0., ?0., ?0.], >>>>>>> ?? ? ? [ 0., ?0., ?1., ?0., ?0., ?1.], >>>>>>> ?? ? ? [ 0., ?1., ?0., ?1., ?0., ?0.], >>>>>>> ?? ? ? [ 0., ?0., ?1., ?0., ?1., ?0.], >>>>>>> ?? ? ? [ 0., ?0., ?0., ?1., ?0., ?0.], >>>>>>> ?? ? ? [ 0., ?0., ?0., ?0., ?1., ?0.]]) >>>>>>> >>>>>>> Is there a clever way to do this? ?I could write a quick Cython solution, >>>>>>> but I wanted to keep this as an all-numpy implementation if I can. >>>>>>> >>>>>> >>>>>> histogram2d or its imitation, there was a discussion on histogram2d a >>>>>> short time ago >>>>>> >>>>>>>>> obs=np.array([1,2,3,4,3,2,1,2,1,2,1,5,4,3,2]) >>>>>>>>> obs2 = obs - 1 >>>>>>>>> trans = np.hstack((0,np.bincount(obs2[:-1]*6+6+obs2[1:]),0)).reshape(6,6) >>>>>>>>> re = np.array([[ 0., ?0., ?0., ?0., ?0., ?0.], >>>>>> ... ? ? ? ? [ 0., ?0., ?3., ?0., ?0., ?1.], >>>>>> ... ? ? ? ? [ 0., ?3., ?0., ?1., ?0., ?0.], >>>>>> ... ? ? ? ? [ 0., ?0., ?2., ?0., ?1., ?0.], >>>>>> ... ? ? ? ? [ 0., ?0., ?0., ?2., ?0., ?0.], >>>>>> ... ? ? ? ? [ 0., ?0., ?0., ?0., ?1., ?0.]]) >>>>>>>>> np.all(re == trans) >>>>>> True >>>>>> >>>>>>>>> trans >>>>>> array([[0, 0, 0, 0, 0, 0], >>>>>> ? ? ? [0, 0, 3, 0, 0, 1], >>>>>> ? ? ? [0, 3, 0, 1, 0, 0], >>>>>> ? ? ? [0, 0, 2, 0, 1, 0], >>>>>> ? ? ? [0, 0, 0, 2, 0, 0], >>>>>> ? ? ? [0, 0, 0, 0, 1, 0]]) >>>>>> >>>>>> >>>>>> or >>>>>> >>>>>>>>> h, e1, e2 = np.histogram2d(obs[:-1], obs[1:], bins=6, range=[[0,5],[0,5]]) >>>>>>>>> re >>>>>> array([[ 0., ?0., ?0., ?0., ?0., ?0.], >>>>>> ? ? ? [ 0., ?0., ?3., ?0., ?0., ?1.], >>>>>> ? ? ? [ 0., ?3., ?0., ?1., ?0., ?0.], >>>>>> ? ? ? 
[ 0., ?0., ?2., ?0., ?1., ?0.], >>>>>> ? ? ? [ 0., ?0., ?0., ?2., ?0., ?0.], >>>>>> ? ? ? [ 0., ?0., ?0., ?0., ?1., ?0.]]) >>>>>>>>> h >>>>>> array([[ 0., ?0., ?0., ?0., ?0., ?0.], >>>>>> ? ? ? [ 0., ?0., ?3., ?0., ?0., ?1.], >>>>>> ? ? ? [ 0., ?3., ?0., ?1., ?0., ?0.], >>>>>> ? ? ? [ 0., ?0., ?2., ?0., ?1., ?0.], >>>>>> ? ? ? [ 0., ?0., ?0., ?2., ?0., ?0.], >>>>>> ? ? ? [ 0., ?0., ?0., ?0., ?1., ?0.]]) >>>>>> >>>>>>>>> np.all(re == h) >>>>>> True >>>>> >>>>> There's no way my list method can beat that. But by adding >>>>> >>>>> import psyco >>>>> psyco.full() >>>>> >>>>> I get a total speed up of a factor of 15 when obs is length 10000. >>>> >>>> Actually, it is faster: >>>> >>>> histogram: >>>> >>>>>> h, e1, e2 = np.histogram2d(obs[:-1], obs[1:], bins=6, range=[[0,5],[0,5]]) >>>>>> timeit h, e1, e2 = np.histogram2d(obs[:-1], obs[1:], bins=6, range=[[0,5],[0,5]]) >>>> 100 loops, best of 3: 4.14 ms per loop >>>> >>>> lists: >>>> >>>>>> timeit test(obs3, T3) >>>> 1000 loops, best of 3: 1.32 ms per loop >>>> _______________________________________________ >>>> Numpy-discussion mailing list >>>> Numpy-discussion at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>>> >>> >>> >>> here's a go: >>> >>> import numpy as np >>> import random >>> from itertools import groupby >>> >>> def test1(obs, T): >>> ? for o1,o2 in zip(obs[:-1],obs[1:]): >>> ? ? ? T[o1][o2] += 1 >>> ? return T >>> >>> >>> def test2(obs, T): >>> ? ?s = zip(obs[:-1], obs[1:]) >>> ? ?for idx, g in groupby(sorted(s)): >>> ? ? ? ?T[idx] = len(list(g)) >>> ? ?return T >>> >>> obs = [random.randint(0, 5) for z in range(10000)] >>> >>> print test2(obs, np.zeros((6, 6))) >>> print test1(obs, np.zeros((6, 6))) >>> >>> >>> ############## >>> >>> In [10]: timeit test1(obs, np.zeros((6, 6))) >>> 100 loops, best of 3: 18.8 ms per loop >>> >>> In [11]: timeit test2(obs, np.zeros((6, 6))) >>> 100 loops, best of 3: 6.91 ms per loop >> >> Wait, you tested the list method with an array. Try >> >> timeit test1(obs, np.zeros((6, 6)).tolist()) >> >> Probably best to move the array/list creation out of the timeit loop. >> Then my method won't have to pay the cost of converting to a list :) >> > > ah right, your test1 is faster than test2 that way. > i'm just stoked to find out about ndimage.historgram. > > using this: >>>> histogram((obs[:-1])*6 +obs[1:], 0, 36, 36).reshape(6,6) > > it gives the exact result as test1/test2. 
> > -b > the cleaned up bincount version looks competitive, this time with tests to avoid index errors Josef import numpy as np from scipy import ndimage def countndi(obs): return ndimage.histogram((obs[:-1])*6 +obs[1:], 0, 36, 36 ).reshape(6,6) def countbc(obs): tr = np.zeros(len(obs)+1,int) tr[2:] = (obs[:-1])*6 + obs[1:] tr[1] = 35 trans = np.bincount(tr) trans[0] -= 1 trans[-1] -= 1 return trans.reshape(6,6) obs=np.array([1,2,3,4,3,2,1,2,1,2,1,5,4,3,2]) re = np.array([[ 0., 0., 0., 0., 0., 0.], [ 0., 0., 3., 0., 0., 1.], [ 0., 3., 0., 1., 0., 0.], [ 0., 0., 2., 0., 1., 0.], [ 0., 0., 0., 2., 0., 0.], [ 0., 0., 0., 0., 1., 0.]]) trans = countndi(obs) print np.all(re == trans) trans = countbc(obs) print np.all(re == trans) obs2=np.array([1,1,3,4,3,2,1,2,1,2,1,5,4,3,2]) transnd = countndi(obs2) transbc = countbc(obs2) print np.all(transnd == transbc) obs3=np.array([1,2,3,4,3,2,1,2,1,2,1,5,5,3,2]) transnd = countndi(obs3) transbc = countbc(obs3) print np.all(transnd == transbc) import time nit = 20000 t = time.time() for i in xrange(nit): countndi(obs) print time.time() - t t2 = time.time() for i in xrange(nit): countbc(obs) print time.time() - t2 From charlesr.harris at gmail.com Sat Jun 6 00:41:57 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 5 Jun 2009 22:41:57 -0600 Subject: [Numpy-discussion] matrix default to column vector? In-Reply-To: <4A29561D.5070806@american.edu> References: <75c31b2a0905210610x1a321264r5b6f93d327ef2b36@mail.gmail.com> <23693116.post@talk.nabble.com> <4A283BD5.4080405@noaa.gov> <3699E128-2BCA-45E1-97A1-B0DE3B8E8B74@mac.com> <4A283F96.1050103@american.edu> <513936A2-C309-435B-B9AD-1B8B71D57B06@mac.com> <4A29561D.5070806@american.edu> Message-ID: On Fri, Jun 5, 2009 at 11:30 AM, Alan G Isaac wrote: > On 6/5/2009 11:38 AM Olivier Verdier apparently wrote: > > I think matrices can be pretty tricky when used for > > teaching. For instance, you have to explain that all the > > operators work component-wise, except the multiplication! > > Another caveat is that since matrices are always 2x2, the > > "scalar product" of two column vectors computed as " x.T > > * y" will not be a scalar, but a 2x2 matrix. There is also > > the fact that you must cast all your vectors to column/raw > > matrices (as in matlab). For all these reasons, I prefer > > to use arrays and dot for teaching, and I have never had > > any complaints. > > > I do not understand this "argument". > You should take it very seriously when someone > reports to you that the matrix object is a crucial to them, > e.g., as a teaching tool. Even if you do not find > personally persuasive an example like > http://mail.scipy.org/pipermail/numpy-discussion/2009-June/043001.html > I have told you: this is important for my students. > Reporting that your students do not complain about using > arrays instead of matrices does not change this one bit. > > Student backgrounds differ by domain of application. In > economics, matrices are in *very* wide use, and > multidimensional arrays get almost no use. Textbooks in > econometrics (a huge and important field, even outside of > economics) are full of proofs using matrix algebra. > A close match to what the students see is crucial. > When working with multiplication or exponentiation, > matrices do what they expect, and 2d arrays do not. > > One more point. As Python users we get used to installing > a package here and a package there to add functionality. > But this is not how most people looking for a matrix > language see the world. 
Removing the matrix object from > NumPy will raise the barrier to adoption by social > scientists, and there should be a strongly persuasive reason > before taking such a step. > > Separately from all that, does anyone doubt that there is > code that depends on the matrix object? The core objection > to a past proposal for useful change was that it could break > extant code. I would hope that nobody who took that > position would subsequently propose removing the matrix > object altogether. > > Cheers, > Alan Isaac > > PS If x and y are "column vectors" (i.e., matrices), then > x.T * y *should* be a 1?1 matrix. > Since the * operator is doing matrix multiplication, > this is the correct result, not an anomaly. > Well, one could argue that. The x.T is a member of the dual, hence maps vectors to the reals. Usually the reals aren't represented by 1x1 matrices. Just my [.02] cents. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From fperez.net at gmail.com Sat Jun 6 03:01:45 2009 From: fperez.net at gmail.com (Fernando Perez) Date: Sat, 6 Jun 2009 00:01:45 -0700 Subject: [Numpy-discussion] Functions for indexing into certain parts of an array (2d) Message-ID: Howdy, I'm finding myself often having to index *into* arrays to set values. As best I can see, in numpy functions/methods like diag or tri{u,l} provide for the extraction of values from arrays, but I haven't found their counterparts for generating the equivalent indices. Their implementations are actually quite trivial once you think of it, but I wonder if there's any benefit in having this type of machinery in numpy itself. I had to think of how to do it more than once, so I ended up writing up a few utilities for it. Below are my naive versions I have for internal use. If there's any interest, I'm happy to document them to compliance as a patch. Cheers, f #### def mask_indices(n,mask_func,k=0): """Return the indices for an array, given a masking function like tri{u,l}.""" m = np.ones((n,n),int) a = mask_func(m,k) return np.where(a != 0) def diag_indices(n,ndim=2): """Return the indices to index into a diagonal. Examples -------- >>> a = np.array([[1,2,3,4],[5,6,7,8],[9,10,11,12],[13,14,15,16]]) >>> a array([[ 1, 2, 3, 4], [ 5, 6, 7, 8], [ 9, 10, 11, 12], [13, 14, 15, 16]]) >>> di = diag_indices(4) >>> a[di] = 100 >>> a array([[100, 2, 3, 4], [ 5, 100, 7, 8], [ 9, 10, 100, 12], [ 13, 14, 15, 100]]) """ idx = np.arange(n) return (idx,)*ndim def tril_indices(n,k=0): """Return the indices for the lower-triangle of an (n,n) array. Examples -------- >>> a = np.array([[1,2,3,4],[5,6,7,8],[9,10,11,12],[13,14,15,16]]) >>> a array([[ 1, 2, 3, 4], [ 5, 6, 7, 8], [ 9, 10, 11, 12], [13, 14, 15, 16]]) >>> dl = tril_indices(4) >>> a[dl] = -1 >>> a array([[-1, 2, 3, 4], [-1, -1, 7, 8], [-1, -1, -1, 12], [-1, -1, -1, -1]]) >>> dl = tril_indices(4,2) >>> a[dl] = -10 >>> a array([[-10, -10, -10, 4], [-10, -10, -10, -10], [-10, -10, -10, -10], [-10, -10, -10, -10]]) """ return mask_indices(n,np.tril,k) def triu_indices(n,k=0): """Return the indices for the upper-triangle of an (n,n) array. 
Examples -------- >>> a = np.array([[1,2,3,4],[5,6,7,8],[9,10,11,12],[13,14,15,16]]) >>> a array([[ 1, 2, 3, 4], [ 5, 6, 7, 8], [ 9, 10, 11, 12], [13, 14, 15, 16]]) >>> du = triu_indices(4) >>> a[du] = -1 >>> a array([[-1, -1, -1, -1], [ 5, -1, -1, -1], [ 9, 10, -1, -1], [13, 14, 15, -1]]) >>> du = triu_indices(4,2) >>> a[du] = -10 >>> a array([[ -1, -1, -10, -10], [ 5, -1, -1, -10], [ 9, 10, -1, -1], [ 13, 14, 15, -1]]) """ return mask_indices(n,np.triu,k) From robert.kern at gmail.com Sat Jun 6 03:09:47 2009 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 6 Jun 2009 02:09:47 -0500 Subject: [Numpy-discussion] Functions for indexing into certain parts of an array (2d) In-Reply-To: References: Message-ID: <3d375d730906060009r6efc1b70y6778408541fe6ce3@mail.gmail.com> On Sat, Jun 6, 2009 at 02:01, Fernando Perez wrote: > Howdy, > > I'm finding myself often having to index *into* arrays to set values. > As best I can see, in numpy functions/methods like diag or tri{u,l} > provide for the extraction of values from arrays, but I haven't found > their counterparts for generating the equivalent indices. > > Their implementations are actually quite trivial once you think of it, > but I wonder if there's any benefit in having this type of machinery > in numpy itself. I had to think of how to do it more than once, so I > ended up writing up a few utilities for it. > > Below are my naive versions I have for internal use. ?If there's any > interest, I'm happy to document them to compliance as a patch. +1 diag_indices() can be made more efficient, but these are fine. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From gael.varoquaux at normalesup.org Sat Jun 6 03:57:01 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Sat, 6 Jun 2009 09:57:01 +0200 Subject: [Numpy-discussion] matrix multiplication In-Reply-To: <4A2995E1.1090601@american.edu> References: <9457e7c80906051249x5851f280x11d5ae1c2a97aaba@mail.gmail.com> <4A298ABF.3050103@american.edu> <7f014ea60906051424q12e6594dhe90243c7fe5ada1c@mail.gmail.com> <3d375d730906051428i4d99fc72wb76e3d4561206715@mail.gmail.com> <7f014ea60906051441t4bc73554j9da0b29ef2d9ffd6@mail.gmail.com> <4A2995E1.1090601@american.edu> Message-ID: <20090606075701.GA27190@phare.normalesup.org> On Fri, Jun 05, 2009 at 06:02:09PM -0400, Alan G Isaac wrote: > I think something close to this would be possible: > add dot as an array method. > A .dot(B) .dot(C) > is not as pretty as > A * B * C > but it is much better than > np.dot(np.dot(A,B),C) > In fact it is so much better, that it > might (?) be worth considering separately > from the entire matrix discussion. I am +1e4 on that proposition. My 1e-2 euros. Ga?l From neilcrighton at gmail.com Sat Jun 6 04:42:58 2009 From: neilcrighton at gmail.com (Neil Crighton) Date: Sat, 6 Jun 2009 08:42:58 +0000 (UTC) Subject: [Numpy-discussion] extract elements of an array that are contained in another array? References: <1cd32cbb0906031745m3a0fc5fbga61240f7d5c90161@mail.gmail.com> <4A27BCCF.4090608@american.edu> <1cd32cbb0906040535m235a1e90oc52f21806238d5d7@mail.gmail.com> <4A27D67F.3@american.edu> <1cd32cbb0906040750x120abbe8g397eb57e734bdeb8@mail.gmail.com> <4A27E615.70604@american.edu> <1cd32cbb0906040829n74f01778s5ef37894e1b5e081@mail.gmail.com> <4A28AEB2.5050209@ntc.zcu.cz> Message-ID: Robert Cimrman ntc.zcu.cz> writes: > Anne Archibald wrote: > > > 1. 
add a keyword argument to intersect1d "assume_unique"; if it is not > > present, check for uniqueness and emit a warning if not unique > > 2. change the warning to an exception > > Optionally: > > 3. change the meaning of the function to that of intersect1d_nu if the > > keyword argument is not present > > > You mean something like: > > def intersect1d(ar1, ar2, assume_unique=False): > if not assume_unique: > return intersect1d_nu(ar1, ar2) > else: > ... # the current code > > intersect1d_nu could be still exported to numpy namespace, or not. > +1 - from the user's point of view there should just be intersect1d and setmember1d (i.e. no '_nu' versions). The assume_unique keyword Robert suggests can be used if speed is a problem. I really like in1d (no underscore) as a new name for setmember1d_nu. inarray is another possibility. I don't like 'ain'; 'a' in front of 'in' detracts from readability, unlike the extra a in arange. Can we summarise the discussion in this thread and write up a short proposal about what we'd like to change in arraysetops, and how to make the changes? Then it's easy for other people to give their opinion on any changes. I can do this if no one else has time. Neil From josef.pktd at gmail.com Sat Jun 6 07:41:32 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 6 Jun 2009 07:41:32 -0400 Subject: [Numpy-discussion] extract elements of an array that are contained in another array? In-Reply-To: References: <4A27BCCF.4090608@american.edu> <1cd32cbb0906040535m235a1e90oc52f21806238d5d7@mail.gmail.com> <4A27D67F.3@american.edu> <1cd32cbb0906040750x120abbe8g397eb57e734bdeb8@mail.gmail.com> <4A27E615.70604@american.edu> <1cd32cbb0906040829n74f01778s5ef37894e1b5e081@mail.gmail.com> <4A28AEB2.5050209@ntc.zcu.cz> Message-ID: <1cd32cbb0906060441l119f274frc9108d1444cfe638@mail.gmail.com> On Sat, Jun 6, 2009 at 4:42 AM, Neil Crighton wrote: > Robert Cimrman ntc.zcu.cz> writes: > >> Anne Archibald wrote: >> >> > 1. add a keyword argument to intersect1d "assume_unique"; if it is not >> > present, check for uniqueness and emit a warning if not unique >> > 2. change the warning to an exception >> > Optionally: >> > 3. change the meaning of the function to that of intersect1d_nu if the >> > keyword argument is not present >> > 1. merge _nu version into one function ------------------------------------------------------- >> You mean something like: >> >> def intersect1d(ar1, ar2, assume_unique=False): >> ? ? ?if not assume_unique: >> ? ? ? ? ?return intersect1d_nu(ar1, ar2) >> ? ? ?else: >> ? ? ? ? ?... # the current code >> >> intersect1d_nu could be still exported to numpy namespace, or not. >> > > +1 - from the user's point of view there should just be intersect1d and > setmember1d (i.e. no '_nu' versions). The assume_unique keyword Robert suggests > can be used if speed is a problem. + 1 on rolling the _nu versions this way into the plain version, this would avoid a lot of the confusion. It would not be a code breaking API change for existing correct usage (but some speed regression without adding keyword) depreciate intersect1d_nu ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > intersect1d_nu could be still exported to numpy namespace, or not. I would say not, if they are the default branch of the non _nu version +1 on depreciation 2. alias as "in" --------------------- > > I really like in1d (no underscore) as a new name for setmember1d_nu. inarray is > another possibility. I don't like 'ain'; 'a' in front of 'in' detracts from > readability, unlike the extra a in arange. 
I don't like the extra "a"s either, ones name spaces are commonly used alias setmember1d_nu as `in1d` or `isin1d`, because the function is a "in" and not a set operation +1 > > Can we summarise the discussion in this thread and write up a short proposal > about what we'd like to change in arraysetops, and how to make the changes? > Then it's easy for other people to give their opinion on any changes. I can do > this if no one else has time. > other points 3. behavior of other set functions ----------------------------------------------- guarantee that setdiff1d works for non-unique arrays (even when implementation changes), and change documentation +1 need to check other functions ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ union1d: works for non-unique arrays, obvious from source setxor1d: requires unique arrays >>> np.setxor1d([1,2,3,3,4,5], [0,0,1,2,2,6]) array([2, 4, 5, 6]) >>> np.setxor1d(np.unique([1,2,3,3,4,5]), np.unique([0,0,1,2,2,6])) array([0, 3, 4, 5, 6]) setxor: add keyword option and call unique by default +1 for symmetry ediff1d and unique1d are defined for non-unique arrays 4. name of keyword ---------------------------- intersect1d(ar1, ar2, assume_unique=False) alternative isunique=False or just unique=False +1 less to write 5. module name ----------------------- rename arraysetops to something easier to read like setfun. I think it would only affect internal changes since all functions are exported to the main numpy name space +1e-4 (I got used to arrayse_tops) 5. keep docs in sync with correct usage --------------------------------------------------------- obvious That's my summary and opinions Josef > > Neil > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From kwgoodman at gmail.com Sat Jun 6 10:42:37 2009 From: kwgoodman at gmail.com (Keith Goodman) Date: Sat, 6 Jun 2009 07:42:37 -0700 Subject: [Numpy-discussion] performance matrix multiplication vs. matlab In-Reply-To: <7f014ea60906051437p6002c6f7jb2d77374a220d2aa@mail.gmail.com> References: <20090604143638.xrqzx8w8v4ggckkk@www.sms.ed.ac.uk> <7f014ea60906041356p19a53c2bi4258d6a6b93ef367@mail.gmail.com> <4A28EC37.10205@ar.media.kyoto-u.ac.jp> <4A28F16C.3070607@ar.media.kyoto-u.ac.jp> <4A2955F5.9030006@hawaii.edu> <4A295A7D.1040407@ar.media.kyoto-u.ac.jp> <7f014ea60906051437p6002c6f7jb2d77374a220d2aa@mail.gmail.com> Message-ID: On Fri, Jun 5, 2009 at 2:37 PM, Chris Colbert wrote: > I'll caution anyone from using Atlas from the repos in Ubuntu 9.04? as the > package is broken: > > https://bugs.launchpad.net/ubuntu/+source/atlas/+bug/363510 > > > just build Atlas yourself, you get better performance AND threading. > Building it is not the nightmare it sounds like. I think i've done it a > total of four times now, both 32-bit and 64-bit builds. > > If you need help with it,? just email me off list. That's a nice offer. I tried building ATLAS on Debian a year or two ago and got stuck. Clear out your inbox! From kwgoodman at gmail.com Sat Jun 6 11:27:29 2009 From: kwgoodman at gmail.com (Keith Goodman) Date: Sat, 6 Jun 2009 08:27:29 -0700 Subject: [Numpy-discussion] Functions for indexing into certain parts of an array (2d) In-Reply-To: References: Message-ID: On Sat, Jun 6, 2009 at 12:01 AM, Fernando Perez wrote: > def diag_indices(n,ndim=2): > ? ?"""Return the indices to index into a diagonal. > > ? ?Examples > ? ?-------- > ? ?>>> a = np.array([[1,2,3,4],[5,6,7,8],[9,10,11,12],[13,14,15,16]]) > ? 
>    >>> a
>    array([[ 1,  2,  3,  4],
>           [ 5,  6,  7,  8],
>           [ 9, 10, 11, 12],
>           [13, 14, 15, 16]])
>    >>> di = diag_indices(4)
>    >>> a[di] = 100
>    >>> a
>    array([[100,   2,   3,   4],
>           [  5, 100,   7,   8],
>           [  9,  10, 100,  12],
>           [ 13,  14,  15, 100]])
>    """
>    idx = np.arange(n)
>    return (idx,)*ndim

I often set the diagonal to zero. Now I can make a fill_diag function.

What do you think of passing in the array a instead of n and ndim
(diag_indices_list_2 below)?

from numpy import arange

def diag_indices(n, ndim=2):
    idx = arange(n)
    return (idx,)*ndim

def diag_indices_list(n, ndim=2):
    idx = range(n)
    return (idx,)*ndim

def diag_indices_list_2(a):
    idx = range(a.shape[0])
    return (idx,) * a.ndim

>> a = np.array([[1,2,3,4],[5,6,7,8],[9,10,11,12],[13,14,15,16]])
>> n = 4
>> ndim = 2
>>
>> timeit diag_indices(n, ndim)
1000000 loops, best of 3: 1.76 µs per loop
>>
>> timeit diag_indices_list(n, ndim)
1000000 loops, best of 3: 1.03 µs per loop
>>
>> timeit diag_indices_list_2(a)
1000000 loops, best of 3: 1.21 µs per loop

From aisaac at american.edu  Sat Jun 6 11:29:20 2009
From: aisaac at american.edu (Alan G Isaac)
Date: Sat, 06 Jun 2009 11:29:20 -0400
Subject: [Numpy-discussion] matrix default to column vector?
In-Reply-To: 
References: <75c31b2a0905210610x1a321264r5b6f93d327ef2b36@mail.gmail.com>
	<23693116.post@talk.nabble.com>
	<4A283BD5.4080405@noaa.gov>
	<3699E128-2BCA-45E1-97A1-B0DE3B8E8B74@mac.com>
	<4A283F96.1050103@american.edu>
	<513936A2-C309-435B-B9AD-1B8B71D57B06@mac.com>
	<4A29561D.5070806@american.edu>
Message-ID: <4A2A8B50.9030904@american.edu>

On 6/6/2009 12:41 AM Charles R Harris apparently wrote:
> Well, one could argue that. The x.T is a member of the dual, hence maps
> vectors to the reals. Usually the reals aren't represented by 1x1
> matrices. Just my [.02] cents.

Of course that same perspective could lead you to argue that
an M×N matrix is for mapping N vectors to M vectors, not for
doing matrix multiplication. Matrix multiplication produces
a matrix result **by definition**. Treating 1×1 matrices as
equivalent to scalars is just a convenient anomaly in certain
popular matrix-oriented languages.

Cheers,
Alan Isaac

From sccolbert at gmail.com  Sat Jun 6 12:59:25 2009
From: sccolbert at gmail.com (Chris Colbert)
Date: Sat, 6 Jun 2009 12:59:25 -0400
Subject: [Numpy-discussion] performance matrix multiplication vs. matlab
In-Reply-To: 
References: <20090604143638.xrqzx8w8v4ggckkk@www.sms.ed.ac.uk>
	<4A28EC37.10205@ar.media.kyoto-u.ac.jp>
	<4A28F16C.3070607@ar.media.kyoto-u.ac.jp>
	<4A2955F5.9030006@hawaii.edu>
	<4A295A7D.1040407@ar.media.kyoto-u.ac.jp>
	<7f014ea60906051437p6002c6f7jb2d77374a220d2aa@mail.gmail.com>
Message-ID: <7f014ea60906060959s6570cc32l277c5ab423f0b9ed@mail.gmail.com>

since there is demand, and someone already emailed me, I'll put what I
did in this post.
It pretty much follows what's on the scipy website, with a couple other
things I gleaned from reading the ATLAS install guide:

and here it goes, this is valid for Ubuntu 9.04 64-bit  (# starts a
comment when working in the terminal)


download lapack 3.2.1 http://www.netlib.org/lapack/lapack.tgz
download atlas 3.8.3
http://sourceforge.net/project/downloading.php?group_id=23725&filename=atlas3.8.3.tar.bz2&a=65663372

create folder  /home/your-user-name/build/atlas   #this is where we build
create folder /home/your-user-name/build/lapack  #atlas and lapack

extract the folder lapack-3.2.1 to /home/your-user-name/build/lapack
extract the contents of atlas to /home/your-user-name/build/atlas



now in the terminal:

# remove g77 and get stuff we need
sudo apt-get remove g77
sudo apt-get install gfortran
sudo apt-get install build-essential
sudo apt-get install python-dev
sudo apt-get install python-setuptools
sudo easy_install nose


# build lapack
cd /home/your-user-name/build/lapack/lapack-3.2.1
cp INSTALL/make.inc.gfortran make.inc

gedit make.inc
#################
#in the make.inc file make sure the line   OPTS = -O2 -fPIC -m64
#and    NOOPTS = -O0 -fPIC -m64
#the -m64 flags build 64-bit code, if you want 32-bit, simply leave
#the -m64 flags out
#################

cd SRC

#this should build lapack without error
make



# build atlas

cd /home/your-user-name/build/atlas

#this is simply where we will build the atlas
#libs, you can name it what you want
mkdir Linux_X64SSE2

cd Linux_X64SSE2

#need to turn off cpu-throttling
sudo cpufreq-selector -g performance

#if you don't want 64bit code remove the -b 64 flag. replace the
#number 2400 with your CPU frequency in MHZ
#i.e. my cpu is 2.53 GHZ so i put 2530
../configure -b 64 -D c -DPentiumCPS=2400 -Fa alg -fPIC
--with-netlib-lapack=/home/your-user-name/build/lapack/lapack-3.2.1/Lapack_LINUX.a

#the configure step takes a bit, and should end without errors

#this takes a long time, go get some coffee, it should end without error
make build

#this will verify the build, also long running
make check

#this will test the performance of your build and give you feedback on
#it. your numbers should be close to the test numbers at the end
make time

cd lib

#builds single threaded .so's
make shared

#builds multithreaded .so's
make ptshared

#copies all of the atlas libs (and the lapack lib built with atlas)
#to our lib dir
sudo cp  *.so  /usr/local/lib/



#now we need to get and build numpy

download numpy 1.3.0
http://sourceforge.net/project/downloading.php?group_id=1369&filename=numpy-1.3.0.tar.gz&a=93506515

extract the folder numpy-1.3.0 to /home/your-user-name/build

#in the terminal

cd /home/your-user-name/build/numpy-1.3.0
cp site.cfg.example site.cfg

gedit site.cfg
###############################################
# in site.cfg uncomment the following lines and make them look like these
[DEFAULT]
library_dirs = /usr/local/lib
include_dirs = /usr/local/include

[blas_opt]
libraries = ptf77blas, ptcblas, atlas

[lapack_opt]
libraries = lapack, ptf77blas, ptcblas, atlas
###################################################
#if you want single threaded libs, uncomment those lines instead


#build numpy- should end without error
python setup.py build

#install numpy
python setup.py install

cd /home

sudo ldconfig

python
>>import numpy
>>numpy.test()   #this should run with no errors (skipped tests and
known-fails are ok)
>>a = numpy.random.randn(6000, 6000)
>>numpy.dot(a, a)     # look at your cpu monitor and verify all cpu cores
are at 100% if you built with threads


Celebrate with a beer!


Cheers!
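P.S. one extra sanity check that is not part of the recipe above -- just a
sketch of what I'd look at myself, so treat it as a suggestion rather than
an official step: after installing, ask numpy which BLAS/LAPACK it was
actually built against.

python
>>import numpy
>># show_config() prints the blas_opt_info / lapack_opt_info sections that
>># were detected at build time; with the site.cfg above you should see
>># atlas, ptf77blas and ptcblas together with /usr/local/lib listed there
>>numpy.show_config()

If those sections come up empty, numpy quietly fell back to its bundled
(slow) lapack_lite routines, so it's worth looking before celebrating.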
Chris On Sat, Jun 6, 2009 at 10:42 AM, Keith Goodman wrote: > On Fri, Jun 5, 2009 at 2:37 PM, Chris Colbert wrote: >> I'll caution anyone from using Atlas from the repos in Ubuntu 9.04? as the >> package is broken: >> >> https://bugs.launchpad.net/ubuntu/+source/atlas/+bug/363510 >> >> >> just build Atlas yourself, you get better performance AND threading. >> Building it is not the nightmare it sounds like. I think i've done it a >> total of four times now, both 32-bit and 64-bit builds. >> >> If you need help with it,? just email me off list. > > That's a nice offer. I tried building ATLAS on Debian a year or two > ago and got stuck. > > Clear out your inbox! > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From llewelr at gmail.com Sat Jun 6 13:02:01 2009 From: llewelr at gmail.com (Richard Llewellyn) Date: Sat, 6 Jun 2009 10:02:01 -0700 Subject: [Numpy-discussion] is my numpy installation using custom blas/lapack? Message-ID: <28e83ea0906061002k456fc263t9a5e216835dee187@mail.gmail.com> Hello, I've managed a build of lapack and atlas on Fedora 10 on a quad core, 64, and now (...) have a numpy I can import that runs tests ok. :] I am puzzled, however, that numpy builds and imports lapack_lite. Does this mean I have a problem with the build(s)? Upon building numpy, I see the troubling output: ######################## C compiler: gcc -pthread -fno-strict-aliasing -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protecto r --param=ssp-buffer-size=4 -m64 -mtune=generic -D_GNU_SOURCE -fPIC -fPIC compile options: '-c' gcc: _configtest.c gcc -pthread _configtest.o -L/usr/local/rich/src/scipy_build/lib -llapack -lptf77blas -lptcblas -latlas -o _configtest /usr/bin/ld: _configtest: hidden symbol `__powidf2' in /usr/lib/gcc/x86_64-redhat-linux/4.3.2/libgcc.a(_powidf2.o) is reference d by DSO /usr/bin/ld: final link failed: Nonrepresentable section on output collect2: ld returned 1 exit status /usr/bin/ld: _configtest: hidden symbol `__powidf2' in /usr/lib/gcc/x86_64-redhat-linux/4.3.2/libgcc.a(_powidf2.o) is reference d by DSO /usr/bin/ld: final link failed: Nonrepresentable section on output collect2: ld returned 1 exit status failure. removing: _configtest.c _configtest.o Status: 255 Output: FOUND: libraries = ['lapack', 'ptf77blas', 'ptcblas', 'atlas'] library_dirs = ['/usr/local/rich/src/scipy_build/lib'] language = f77 define_macros = [('NO_ATLAS_INFO', 2)] ########################## I don't have root on this machine, but could pester admins for eventual temporary access. Thanks much for any help, Rich -------------- next part -------------- An HTML attachment was scrubbed... URL: From sccolbert at gmail.com Sat Jun 6 13:25:36 2009 From: sccolbert at gmail.com (Chris Colbert) Date: Sat, 6 Jun 2009 13:25:36 -0400 Subject: [Numpy-discussion] is my numpy installation using custom blas/lapack? In-Reply-To: <28e83ea0906061002k456fc263t9a5e216835dee187@mail.gmail.com> References: <28e83ea0906061002k456fc263t9a5e216835dee187@mail.gmail.com> Message-ID: <7f014ea60906061025o17416dc4m97bd2fd0d15701b5@mail.gmail.com> when you build numpy, did you use site.cfg to tell it where to find your atlas libs? On Sat, Jun 6, 2009 at 1:02 PM, Richard Llewellyn wrote: > Hello, > > I've managed a build of lapack and atlas on Fedora 10 on a quad core, 64, > and now (...) have a numpy I can import that runs tests ok. :]??? 
I am > puzzled, however, that numpy builds and imports lapack_lite.? Does this mean > I have a problem with the build(s)? > Upon building numpy, I see the troubling output: > > ######################## > > C compiler: gcc -pthread -fno-strict-aliasing -DNDEBUG -O2 -g -pipe -Wall > -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protecto > r --param=ssp-buffer-size=4 -m64 -mtune=generic -D_GNU_SOURCE -fPIC -fPIC > > compile options: '-c' > gcc: _configtest.c > gcc -pthread _configtest.o -L/usr/local/rich/src/scipy_build/lib -llapack > -lptf77blas -lptcblas -latlas -o _configtest > /usr/bin/ld: _configtest: hidden symbol `__powidf2' in > /usr/lib/gcc/x86_64-redhat-linux/4.3.2/libgcc.a(_powidf2.o) is reference > d by DSO > /usr/bin/ld: final link failed: Nonrepresentable section on output > collect2: ld returned 1 exit status > /usr/bin/ld: _configtest: hidden symbol `__powidf2' in > /usr/lib/gcc/x86_64-redhat-linux/4.3.2/libgcc.a(_powidf2.o) is reference > d by DSO > /usr/bin/ld: final link failed: Nonrepresentable section on output > collect2: ld returned 1 exit status > failure. > removing: _configtest.c _configtest.o > Status: 255 > Output: > ? FOUND: > ??? libraries = ['lapack', 'ptf77blas', 'ptcblas', 'atlas'] > ??? library_dirs = ['/usr/local/rich/src/scipy_build/lib'] > ??? language = f77 > ??? define_macros = [('NO_ATLAS_INFO', 2)] > > ########################## > > I don't have root on this machine, but could pester admins for eventual > temporary access. > > Thanks much for any help, > Rich > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From llewelr at gmail.com Sat Jun 6 13:37:58 2009 From: llewelr at gmail.com (Richard Llewellyn) Date: Sat, 6 Jun 2009 10:37:58 -0700 Subject: [Numpy-discussion] is my numpy installation using custom blas/lapack? In-Reply-To: <7f014ea60906061025o17416dc4m97bd2fd0d15701b5@mail.gmail.com> References: <28e83ea0906061002k456fc263t9a5e216835dee187@mail.gmail.com> <7f014ea60906061025o17416dc4m97bd2fd0d15701b5@mail.gmail.com> Message-ID: <28e83ea0906061037h3ba96628qf5e57519e70dca12@mail.gmail.com> Hi Chris, thanks much for posting those installation instructions. Seems similar to what I pieced together. I gather ATLAS not found. Oops, drank that beer too early. I copied Atlas libs to /usr/local/rich/src/scipy_build/lib. This is my site.cfg. Out of desperation I tried search_static_first = 1, but probably of no use. [DEFAULT] library_dirs = /usr/local/rich/src/scipy_build/lib:$HOME/usr/galois/lib include_dirs = /usr/local/rich/src/scipy_build/lib/include:$HOME/usr/galois/include search_static_first = 1 [blas_opt] libraries = f77blas, cblas, atlas [lapack_opt] libraries = lapack, f77blas, cblas, atlas [amd] amd_libs = amd [umfpack] umfpack_libs = umfpack, gfortran [fftw] libraries = fftw3 Rich On Sat, Jun 6, 2009 at 10:25 AM, Chris Colbert wrote: > when you build numpy, did you use site.cfg to tell it where to find > your atlas libs? > > On Sat, Jun 6, 2009 at 1:02 PM, Richard Llewellyn > wrote: > > Hello, > > > > I've managed a build of lapack and atlas on Fedora 10 on a quad core, 64, > > and now (...) have a numpy I can import that runs tests ok. :] I am > > puzzled, however, that numpy builds and imports lapack_lite. Does this > mean > > I have a problem with the build(s)? 
> > Upon building numpy, I see the troubling output:
> >
> > ########################
> >
> > C compiler: gcc -pthread -fno-strict-aliasing -DNDEBUG -O2 -g -pipe -Wall
> > -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protecto
> > r --param=ssp-buffer-size=4 -m64 -mtune=generic -D_GNU_SOURCE -fPIC -fPIC
> >
> > compile options: '-c'
> > gcc: _configtest.c
> > gcc -pthread _configtest.o -L/usr/local/rich/src/scipy_build/lib -llapack
> > -lptf77blas -lptcblas -latlas -o _configtest
> > /usr/bin/ld: _configtest: hidden symbol `__powidf2' in
> > /usr/lib/gcc/x86_64-redhat-linux/4.3.2/libgcc.a(_powidf2.o) is reference
> > d by DSO
> > /usr/bin/ld: final link failed: Nonrepresentable section on output
> > collect2: ld returned 1 exit status
> > /usr/bin/ld: _configtest: hidden symbol `__powidf2' in
> > /usr/lib/gcc/x86_64-redhat-linux/4.3.2/libgcc.a(_powidf2.o) is reference
> > d by DSO
> > /usr/bin/ld: final link failed: Nonrepresentable section on output
> > collect2: ld returned 1 exit status
> > failure.
> > removing: _configtest.c _configtest.o
> > Status: 255
> > Output:
> >   FOUND:
> >     libraries = ['lapack', 'ptf77blas', 'ptcblas', 'atlas']
> >     library_dirs = ['/usr/local/rich/src/scipy_build/lib']
> >     language = f77
> >     define_macros = [('NO_ATLAS_INFO', 2)]
> >
> > ##########################
> >
> > I don't have root on this machine, but could pester admins for eventual
> > temporary access.
> >
> > Thanks much for any help,
> > Rich
> >
> > _______________________________________________
> > Numpy-discussion mailing list
> > Numpy-discussion at scipy.org
> > http://mail.scipy.org/mailman/listinfo/numpy-discussion
> >
> >
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>

From sccolbert at gmail.com  Sat Jun  6 13:42:22 2009
From: sccolbert at gmail.com (Chris Colbert)
Date: Sat, 6 Jun 2009 13:42:22 -0400
Subject: [Numpy-discussion] is my numpy installation using custom
	blas/lapack?
In-Reply-To: <28e83ea0906061037h3ba96628qf5e57519e70dca12@mail.gmail.com>
References: <28e83ea0906061002k456fc263t9a5e216835dee187@mail.gmail.com>
	<7f014ea60906061025o17416dc4m97bd2fd0d15701b5@mail.gmail.com>
	<28e83ea0906061037h3ba96628qf5e57519e70dca12@mail.gmail.com>
Message-ID: <7f014ea60906061042h3260df64je034a38de1894871@mail.gmail.com>

can you run this and post the build.log to pastebin.com:

assuming your numpy build directory is /home/numpy-1.3.0:

cd /home/numpy-1.3.0
rm -rf build
python setup.py build &> build.log


Chris


On Sat, Jun 6, 2009 at 1:37 PM, Richard Llewellyn wrote:
>> Hi Chris,
>> thanks much for posting those installation instructions. Seems similar to
>> what I pieced together.
>>
>> I gather ATLAS not found. Oops, drank that beer too early.
>>
>> I copied Atlas libs to /usr/local/rich/src/scipy_build/lib.
>>
>> This is my site.cfg. Out of desperation I tried search_static_first = 1,
>> but probably of no use.
> > [DEFAULT] > library_dirs = /usr/local/rich/src/scipy_build/lib:$HOME/usr/galois/lib > include_dirs = > /usr/local/rich/src/scipy_build/lib/include:$HOME/usr/galois/include > search_static_first = 1 > > [blas_opt] > libraries = f77blas, cblas, atlas > > [lapack_opt] > libraries = lapack, f77blas, cblas, atlas > > [amd] > amd_libs = amd > > [umfpack] > umfpack_libs = umfpack, gfortran > > [fftw] > libraries = fftw3 > > > Rich > > > > > On Sat, Jun 6, 2009 at 10:25 AM, Chris Colbert wrote: >> >> when you build numpy, did you use site.cfg to tell it where to find >> your atlas libs? >> >> On Sat, Jun 6, 2009 at 1:02 PM, Richard Llewellyn >> wrote: >> > Hello, >> > >> > I've managed a build of lapack and atlas on Fedora 10 on a quad core, >> > 64, >> > and now (...) have a numpy I can import that runs tests ok. :]??? I am >> > puzzled, however, that numpy builds and imports lapack_lite.? Does this >> > mean >> > I have a problem with the build(s)? >> > Upon building numpy, I see the troubling output: >> > >> > ######################## >> > >> > C compiler: gcc -pthread -fno-strict-aliasing -DNDEBUG -O2 -g -pipe >> > -Wall >> > -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protecto >> > r --param=ssp-buffer-size=4 -m64 -mtune=generic -D_GNU_SOURCE -fPIC >> > -fPIC >> > >> > compile options: '-c' >> > gcc: _configtest.c >> > gcc -pthread _configtest.o -L/usr/local/rich/src/scipy_build/lib >> > -llapack >> > -lptf77blas -lptcblas -latlas -o _configtest >> > /usr/bin/ld: _configtest: hidden symbol `__powidf2' in >> > /usr/lib/gcc/x86_64-redhat-linux/4.3.2/libgcc.a(_powidf2.o) is reference >> > d by DSO >> > /usr/bin/ld: final link failed: Nonrepresentable section on output >> > collect2: ld returned 1 exit status >> > /usr/bin/ld: _configtest: hidden symbol `__powidf2' in >> > /usr/lib/gcc/x86_64-redhat-linux/4.3.2/libgcc.a(_powidf2.o) is reference >> > d by DSO >> > /usr/bin/ld: final link failed: Nonrepresentable section on output >> > collect2: ld returned 1 exit status >> > failure. >> > removing: _configtest.c _configtest.o >> > Status: 255 >> > Output: >> > ? FOUND: >> > ??? libraries = ['lapack', 'ptf77blas', 'ptcblas', 'atlas'] >> > ??? library_dirs = ['/usr/local/rich/src/scipy_build/lib'] >> > ??? language = f77 >> > ??? define_macros = [('NO_ATLAS_INFO', 2)] >> > >> > ########################## >> > >> > I don't have root on this machine, but could pester admins for eventual >> > temporary access. >> > >> > Thanks much for any help, >> > Rich >> > >> > _______________________________________________ >> > Numpy-discussion mailing list >> > Numpy-discussion at scipy.org >> > http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > >> > >> _______________________________________________ >> Numpy-discussion mailing list >> Numpy-discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From sccolbert at gmail.com Sat Jun 6 13:46:27 2009 From: sccolbert at gmail.com (Chris Colbert) Date: Sat, 6 Jun 2009 13:46:27 -0400 Subject: [Numpy-discussion] is my numpy installation using custom blas/lapack? 
In-Reply-To: <7f014ea60906061042h3260df64je034a38de1894871@mail.gmail.com> References: <28e83ea0906061002k456fc263t9a5e216835dee187@mail.gmail.com> <7f014ea60906061025o17416dc4m97bd2fd0d15701b5@mail.gmail.com> <28e83ea0906061037h3ba96628qf5e57519e70dca12@mail.gmail.com> <7f014ea60906061042h3260df64je034a38de1894871@mail.gmail.com> Message-ID: <7f014ea60906061046g650e71e4j4fb4f064ddc49030@mail.gmail.com> and where exactly are you seeing atlas not found? during the build process, are when import numpy in python? if its the latter, you need to add a .conf file in /etc/ld.so.conf.d/ with the line /usr/local/rich/src/scipy_build/lib and then run sudo ldconfig Chris On Sat, Jun 6, 2009 at 1:42 PM, Chris Colbert wrote: > can you run this and post the build.log to pastebin.com: > > assuming your numpy build directory is /home/numpy-1.3.0: > > cd /home/numpy-1.3.0 > rm -rf build > python setup.py build &&> build.log > > > Chris > > > On Sat, Jun 6, 2009 at 1:37 PM, Richard Llewellyn wrote: >> Hi Chris, >> ?thanks much for posting those installation instructions.? Seems similar to >> what I pieced together. >> >> I gather ATLAS not found.? Oops, drank that beer too early. >> >> I copied Atlas libs to /usr/local/rich/src/scipy_build/lib. >> >> This is my site.cfg.? Out of desperation I tried search_static_first = 1, >> but probably of no use. >> >> [DEFAULT] >> library_dirs = /usr/local/rich/src/scipy_build/lib:$HOME/usr/galois/lib >> include_dirs = >> /usr/local/rich/src/scipy_build/lib/include:$HOME/usr/galois/include >> search_static_first = 1 >> >> [blas_opt] >> libraries = f77blas, cblas, atlas >> >> [lapack_opt] >> libraries = lapack, f77blas, cblas, atlas >> >> [amd] >> amd_libs = amd >> >> [umfpack] >> umfpack_libs = umfpack, gfortran >> >> [fftw] >> libraries = fftw3 >> >> >> Rich >> >> >> >> >> On Sat, Jun 6, 2009 at 10:25 AM, Chris Colbert wrote: >>> >>> when you build numpy, did you use site.cfg to tell it where to find >>> your atlas libs? >>> >>> On Sat, Jun 6, 2009 at 1:02 PM, Richard Llewellyn >>> wrote: >>> > Hello, >>> > >>> > I've managed a build of lapack and atlas on Fedora 10 on a quad core, >>> > 64, >>> > and now (...) have a numpy I can import that runs tests ok. :]??? I am >>> > puzzled, however, that numpy builds and imports lapack_lite.? Does this >>> > mean >>> > I have a problem with the build(s)? >>> > Upon building numpy, I see the troubling output: >>> > >>> > ######################## >>> > >>> > C compiler: gcc -pthread -fno-strict-aliasing -DNDEBUG -O2 -g -pipe >>> > -Wall >>> > -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protecto >>> > r --param=ssp-buffer-size=4 -m64 -mtune=generic -D_GNU_SOURCE -fPIC >>> > -fPIC >>> > >>> > compile options: '-c' >>> > gcc: _configtest.c >>> > gcc -pthread _configtest.o -L/usr/local/rich/src/scipy_build/lib >>> > -llapack >>> > -lptf77blas -lptcblas -latlas -o _configtest >>> > /usr/bin/ld: _configtest: hidden symbol `__powidf2' in >>> > /usr/lib/gcc/x86_64-redhat-linux/4.3.2/libgcc.a(_powidf2.o) is reference >>> > d by DSO >>> > /usr/bin/ld: final link failed: Nonrepresentable section on output >>> > collect2: ld returned 1 exit status >>> > /usr/bin/ld: _configtest: hidden symbol `__powidf2' in >>> > /usr/lib/gcc/x86_64-redhat-linux/4.3.2/libgcc.a(_powidf2.o) is reference >>> > d by DSO >>> > /usr/bin/ld: final link failed: Nonrepresentable section on output >>> > collect2: ld returned 1 exit status >>> > failure. >>> > removing: _configtest.c _configtest.o >>> > Status: 255 >>> > Output: >>> > ? FOUND: >>> > ??? 
libraries = ['lapack', 'ptf77blas', 'ptcblas', 'atlas']
>>> >     library_dirs = ['/usr/local/rich/src/scipy_build/lib']
>>> >     language = f77
>>> >     define_macros = [('NO_ATLAS_INFO', 2)]
>>> >
>>> > ##########################
>>> >
>>> > I don't have root on this machine, but could pester admins for eventual
>>> > temporary access.
>>> >
>>> > Thanks much for any help,
>>> > Rich
>>> >
>>> > _______________________________________________
>>> > Numpy-discussion mailing list
>>> > Numpy-discussion at scipy.org
>>> > http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>> >
>>> >
>>> _______________________________________________
>>> Numpy-discussion mailing list
>>> Numpy-discussion at scipy.org
>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>
>>
>> _______________________________________________
>> Numpy-discussion mailing list
>> Numpy-discussion at scipy.org
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>
>>
>

From charlesr.harris at gmail.com  Sat Jun  6 14:03:38 2009
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Sat, 6 Jun 2009 12:03:38 -0600
Subject: [Numpy-discussion] matrix default to column vector?
In-Reply-To: <4A2A8B50.9030904@american.edu>
References: <75c31b2a0905210610x1a321264r5b6f93d327ef2b36@mail.gmail.com>
	<4A283BD5.4080405@noaa.gov>
	<3699E128-2BCA-45E1-97A1-B0DE3B8E8B74@mac.com>
	<4A283F96.1050103@american.edu>
	<513936A2-C309-435B-B9AD-1B8B71D57B06@mac.com>
	<4A29561D.5070806@american.edu> <4A2A8B50.9030904@american.edu>
Message-ID: 

On Sat, Jun 6, 2009 at 9:29 AM, Alan G Isaac wrote:

> On 6/6/2009 12:41 AM Charles R Harris apparently wrote:
> > Well, one could argue that. The x.T is a member of the dual, hence maps
> > vectors to the reals. Usually the reals aren't represented by 1x1
> > matrices. Just my [.02] cents.
>
> Of course that same perspective could
> lead you to argue that a M×N matrix
> is for mapping N vectors to M vectors,
> not for doing matrix multiplication.
>
> Matrix multiplication produces a
> matrix result **by definition**.
> Treating 1×1 matrices as equivalent
> to scalars is just a convenient anomaly
> in certain popular matrix-oriented
> languages.
>

So is eye(3)*(v.T*v) valid? If (v.T*v) is 1x1 you have incompatible
dimensions for the multiplication, whereas if it is a scalar you can
multiply eye(3) by it. The usual matrix algebra gets a bit confused here
because it isn't clear about the distinction between inner products and the
expression v.T*v which is typically used in its place. I think the only
consistent way around this is to treat 1x1 matrices as scalars, which I
believe matlab does, but then the expression eye(3)*(v.T*v) isn't
associative and we lose some checks.

I don't think we can change the current matrix class, to do so would break
too much code. It would be nice to extend it with an explicit inner
product, but I can't think of any simple notation for it that python would
parse.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From fperez.net at gmail.com  Sat Jun  6 14:30:37 2009
From: fperez.net at gmail.com (Fernando Perez)
Date: Sat, 6 Jun 2009 11:30:37 -0700
Subject: [Numpy-discussion] Functions for indexing into certain parts of
	an array (2d)
In-Reply-To: <3d375d730906060009r6efc1b70y6778408541fe6ce3@mail.gmail.com>
References: <3d375d730906060009r6efc1b70y6778408541fe6ce3@mail.gmail.com>
Message-ID: 

On Sat, Jun 6, 2009 at 12:09 AM, Robert Kern wrote:

> +1

OK, thanks.  I'll try to get it ready.
> diag_indices() can be made more efficient, but these are fine.

Suggestion?  Right now it's not obvious to me...

A few more questions:

- Are doctests considered enough testing for numpy, or are separate
tests also required?

- Where should these go?

- Any interest in also having the stuff below?  I'm needing to build
structured random arrays a lot (symmetric, anti-symmetric, symmetric
with a particular diagonal, etc), and these are coming in handy.  If
you want them, I'll put the whole thing together (these use the
indexing utilities from the previous suggestion).

Thanks!

f

#### Other suggested utilities. Not fully commented yet, but if they
are wanted for numpy, will be submitted in final form.

def structured_rand_arr(size, sample_func=np.random.random,
                        ltfac=None, utfac=None, fill_diag=None):
    """Make a structured random 2-d array of shape (size,size).

    Parameters
    ----------

    size : int
      Determines the shape of the output array: (size,size).

    sample_func : function, optional.
      Must be a function which when called with a 2-tuple of ints, returns a
      2-d array of that shape.  By default, np.random.random is used, but any
      other sampling function can be used as long as matches this API.

    utfac : float, optional
      Multiplicative factor for the upper triangular part of the matrix.

    ltfac : float, optional
      Multiplicative factor for the lower triangular part of the matrix.

    fill_diag : float, optional
      If given, use this value to fill in the diagonal.  Otherwise the diagonal
      will contain random elements.
    """
    # Make a random array from the given sampling function
    mat0 = sample_func((size,size))
    # And the empty one we'll then fill in to return
    mat = np.empty_like(mat0)
    # Extract indices for upper-triangle, lower-triangle and diagonal
    uidx = triu_indices(size,1)
    lidx = tril_indices(size,-1)
    didx = diag_indices(size)
    # Extract each part from the original and copy it to the output, possibly
    # applying multiplicative factors.  We check the factors instead of
    # defaulting to 1.0 to avoid unnecessary floating point multiplications
    # which could be noticeable for very large sizes.
    if utfac:
        mat[uidx] = utfac * mat0[uidx]
    else:
        mat[uidx] = mat0[uidx]
    if ltfac:
        mat[lidx] = ltfac * mat0.T[lidx]
    else:
        mat[lidx] = mat0.T[lidx]
    # If fill_diag was provided, use it; otherwise take the values in the
    # diagonal from the original random array.  Test against None so that
    # an explicit fill_diag of 0.0 is honored.
    if fill_diag is not None:
        mat[didx] = fill_diag
    else:
        mat[didx] = mat0[didx]

    return mat

def symm_rand_arr(size,sample_func=np.random.random,fill_diag=None):
    """Make a symmetric random 2-d array of shape (size,size).

    Parameters
    ----------

    n : int
      Size of the output array.

    fill_diag : float, optional
      If given, use this value to fill in the diagonal.  Useful for
    """
    return structured_rand_arr(size,sample_func,fill_diag=fill_diag)

def antisymm_rand_arr(size,sample_func=np.random.random,fill_diag=None):
    """Make an anti-symmetric random 2-d array of shape (size,size).

    Parameters
    ----------

    n : int
      Size of the output array.
    """
    return structured_rand_arr(size,sample_func,ltfac=-1.0,fill_diag=fill_diag)

From fperez.net at gmail.com  Sat Jun  6 14:31:35 2009
From: fperez.net at gmail.com (Fernando Perez)
Date: Sat, 6 Jun 2009 11:31:35 -0700
Subject: [Numpy-discussion] Functions for indexing into certain parts of
	an array (2d)
In-Reply-To: 
References: 
Message-ID: 
Would people prefer both, or rather a flexible interface that tries to introspect the inputs and do both in one call? Cheers, f From llewelr at gmail.com Sat Jun 6 14:32:24 2009 From: llewelr at gmail.com (Richard Llewellyn) Date: Sat, 6 Jun 2009 11:32:24 -0700 Subject: [Numpy-discussion] is my numpy installation using custom blas/lapack? In-Reply-To: <7f014ea60906061046g650e71e4j4fb4f064ddc49030@mail.gmail.com> References: <28e83ea0906061002k456fc263t9a5e216835dee187@mail.gmail.com> <7f014ea60906061025o17416dc4m97bd2fd0d15701b5@mail.gmail.com> <28e83ea0906061037h3ba96628qf5e57519e70dca12@mail.gmail.com> <7f014ea60906061042h3260df64je034a38de1894871@mail.gmail.com> <7f014ea60906061046g650e71e4j4fb4f064ddc49030@mail.gmail.com> Message-ID: <28e83ea0906061132u6b506449xadfdd1ca944dfde@mail.gmail.com> I posted the setup.py build output to pastebin.com, though missed the uninteresting stderr (forgot tcsh command to redirect both). Also, used setup.py build --fcompiler=gnu95. To be clear, I am not certain that my ATLAS libraries are not found. But during the build starting at line 95 (pastebin.com) I see a compilation failure, and then NO_ATLAS_INFO, 2. I don't think I can use ldconfig without root, but have set LD_LIBRARY_PATH to point to the scipy_build/lib until I put them somewhere else. importing numpy works, though lapack_lite is also imported. I wonder if this is normal even if my ATLAS was used. Thanks, Rich On Sat, Jun 6, 2009 at 10:46 AM, Chris Colbert wrote: > and where exactly are you seeing atlas not found? during the build > process, are when import numpy in python? > > if its the latter, you need to add a .conf file in /etc/ld.so.conf.d/ > with the line /usr/local/rich/src/scipy_build/lib and then run sudo > ldconfig > > Chris > > > On Sat, Jun 6, 2009 at 1:42 PM, Chris Colbert wrote: > > can you run this and post the build.log to pastebin.com: > > > > assuming your numpy build directory is /home/numpy-1.3.0: > > > > cd /home/numpy-1.3.0 > > rm -rf build > > python setup.py build &&> build.log > > > > > > Chris > > > > > > On Sat, Jun 6, 2009 at 1:37 PM, Richard Llewellyn > wrote: > >> Hi Chris, > >> thanks much for posting those installation instructions. Seems similar > to > >> what I pieced together. > >> > >> I gather ATLAS not found. Oops, drank that beer too early. > >> > >> I copied Atlas libs to /usr/local/rich/src/scipy_build/lib. > >> > >> This is my site.cfg. Out of desperation I tried search_static_first = > 1, > >> but probably of no use. > >> > >> [DEFAULT] > >> library_dirs = /usr/local/rich/src/scipy_build/lib:$HOME/usr/galois/lib > >> include_dirs = > >> /usr/local/rich/src/scipy_build/lib/include:$HOME/usr/galois/include > >> search_static_first = 1 > >> > >> [blas_opt] > >> libraries = f77blas, cblas, atlas > >> > >> [lapack_opt] > >> libraries = lapack, f77blas, cblas, atlas > >> > >> [amd] > >> amd_libs = amd > >> > >> [umfpack] > >> umfpack_libs = umfpack, gfortran > >> > >> [fftw] > >> libraries = fftw3 > >> > >> > >> Rich > >> > >> > >> > >> > >> On Sat, Jun 6, 2009 at 10:25 AM, Chris Colbert > wrote: > >>> > >>> when you build numpy, did you use site.cfg to tell it where to find > >>> your atlas libs? > >>> > >>> On Sat, Jun 6, 2009 at 1:02 PM, Richard Llewellyn > >>> wrote: > >>> > Hello, > >>> > > >>> > I've managed a build of lapack and atlas on Fedora 10 on a quad core, > >>> > 64, > >>> > and now (...) have a numpy I can import that runs tests ok. :] I > am > >>> > puzzled, however, that numpy builds and imports lapack_lite. 
Does > this > >>> > mean > >>> > I have a problem with the build(s)? > >>> > Upon building numpy, I see the troubling output: > >>> > > >>> > ######################## > >>> > > >>> > C compiler: gcc -pthread -fno-strict-aliasing -DNDEBUG -O2 -g -pipe > >>> > -Wall > >>> > -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protecto > >>> > r --param=ssp-buffer-size=4 -m64 -mtune=generic -D_GNU_SOURCE -fPIC > >>> > -fPIC > >>> > > >>> > compile options: '-c' > >>> > gcc: _configtest.c > >>> > gcc -pthread _configtest.o -L/usr/local/rich/src/scipy_build/lib > >>> > -llapack > >>> > -lptf77blas -lptcblas -latlas -o _configtest > >>> > /usr/bin/ld: _configtest: hidden symbol `__powidf2' in > >>> > /usr/lib/gcc/x86_64-redhat-linux/4.3.2/libgcc.a(_powidf2.o) is > reference > >>> > d by DSO > >>> > /usr/bin/ld: final link failed: Nonrepresentable section on output > >>> > collect2: ld returned 1 exit status > >>> > /usr/bin/ld: _configtest: hidden symbol `__powidf2' in > >>> > /usr/lib/gcc/x86_64-redhat-linux/4.3.2/libgcc.a(_powidf2.o) is > reference > >>> > d by DSO > >>> > /usr/bin/ld: final link failed: Nonrepresentable section on output > >>> > collect2: ld returned 1 exit status > >>> > failure. > >>> > removing: _configtest.c _configtest.o > >>> > Status: 255 > >>> > Output: > >>> > FOUND: > >>> > libraries = ['lapack', 'ptf77blas', 'ptcblas', 'atlas'] > >>> > library_dirs = ['/usr/local/rich/src/scipy_build/lib'] > >>> > language = f77 > >>> > define_macros = [('NO_ATLAS_INFO', 2)] > >>> > > >>> > ########################## > >>> > > >>> > I don't have root on this machine, but could pester admins for > eventual > >>> > temporary access. > >>> > > >>> > Thanks much for any help, > >>> > Rich > >>> > > >>> > _______________________________________________ > >>> > Numpy-discussion mailing list > >>> > Numpy-discussion at scipy.org > >>> > http://mail.scipy.org/mailman/listinfo/numpy-discussion > >>> > > >>> > > >>> _______________________________________________ > >>> Numpy-discussion mailing list > >>> Numpy-discussion at scipy.org > >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion > >> > >> > >> _______________________________________________ > >> Numpy-discussion mailing list > >> Numpy-discussion at scipy.org > >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > >> > >> > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From zelbier at gmail.com Sat Jun 6 14:34:34 2009 From: zelbier at gmail.com (Olivier Verdier) Date: Sat, 6 Jun 2009 20:34:34 +0200 Subject: [Numpy-discussion] matrix default to column vector? In-Reply-To: <4A29561D.5070806@american.edu> References: <75c31b2a0905210610x1a321264r5b6f93d327ef2b36@mail.gmail.com> <23693116.post@talk.nabble.com> <4A283BD5.4080405@noaa.gov> <3699E128-2BCA-45E1-97A1-B0DE3B8E8B74@mac.com> <4A283F96.1050103@american.edu> <513936A2-C309-435B-B9AD-1B8B71D57B06@mac.com> <4A29561D.5070806@american.edu> Message-ID: I took that very seriously when you said that matrices were important to you. Far from me the idea of forbidding numpy users to use matrices. My point was the fact that newcomers are confused by the presence of both matrices and arrays. I think that there should be only one matrix/vector/tensor object in numpy. Therefore I would advocate the removal of matrices from numpy. 
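(To make the newcomer confusion concrete, here is a small sketch of my own
-- my illustration, not anyone's official example -- showing the two
multiplication semantics side by side:

import numpy as np

a = np.array([[1., 2.], [3., 4.]])
m = np.matrix(a)

a * a   # elementwise:     array([[  1.,   4.], [  9.,  16.]])
m * m   # matrix product:  matrix([[  7.,  10.], [ 15.,  22.]])

Same symbol, two objects that print almost identically, two different
mathematical operations -- this is exactly what trips students up.)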
*But* why not have matrices in a different component? Maybe as a part of
scipy? or somewhere else? You would be more than welcome to use them
anywhere. Note that I use components outside numpy for my teaching (scipy,
sympy, mayavi, nosetest) and I don't have any problems with that.

With my "argument" I endeavoured to explain the potential complications of
using matrices instead of arrays when teaching. Perhaps the strongest
argument against matrices is that you cannot use vectors. I've taught
enough matlab courses to realise the pain that this represents for
students. But I realise also that somebody else would have a different
experience.

Of course x.T*y should be a 1x1 matrix, this is not an anomaly, but it is
confusing for students, because they expect a scalar. That is why I prefer
to teach with dot. Then the relation matrix/vector/scalar is crystal clear.

== Olivier

2009/6/5 Alan G Isaac

> On 6/5/2009 11:38 AM Olivier Verdier apparently wrote:
> > I think matrices can be pretty tricky when used for
> > teaching. For instance, you have to explain that all the
> > operators work component-wise, except the multiplication!
> > Another caveat is that since matrices are always 2-d, the
> > "scalar product" of two column vectors computed as " x.T
> > * y" will not be a scalar, but a 1x1 matrix. There is also
> > the fact that you must cast all your vectors to column/row
> > matrices (as in matlab). For all these reasons, I prefer
> > to use arrays and dot for teaching, and I have never had
> > any complaints.
>
> I do not understand this "argument".
> You should take it very seriously when someone
> reports to you that the matrix object is crucial to them,
> e.g., as a teaching tool.  Even if you do not find
> personally persuasive an example like
> http://mail.scipy.org/pipermail/numpy-discussion/2009-June/043001.html
> I have told you: this is important for my students.
> Reporting that your students do not complain about using
> arrays instead of matrices does not change this one bit.
>
> Student backgrounds differ by domain of application.  In
> economics, matrices are in *very* wide use, and
> multidimensional arrays get almost no use.  Textbooks in
> econometrics (a huge and important field, even outside of
> economics) are full of proofs using matrix algebra.
> A close match to what the students see is crucial.
> When working with multiplication or exponentiation,
> matrices do what they expect, and 2d arrays do not.
>
> One more point.  As Python users we get used to installing
> a package here and a package there to add functionality.
> But this is not how most people looking for a matrix
> language see the world.  Removing the matrix object from
> NumPy will raise the barrier to adoption by social
> scientists, and there should be a strongly persuasive reason
> before taking such a step.
>
> Separately from all that, does anyone doubt that there is
> code that depends on the matrix object?  The core objection
> to a past proposal for useful change was that it could break
> extant code.  I would hope that nobody who took that
> position would subsequently propose removing the matrix
> object altogether.
>
> Cheers,
> Alan Isaac
>
> PS If x and y are "column vectors" (i.e., matrices), then
> x.T * y *should* be a 1×1 matrix.
> Since the * operator is doing matrix multiplication,
> this is the correct result, not an anomaly.
>
>
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From gael.varoquaux at normalesup.org  Sat Jun  6 14:35:33 2009
From: gael.varoquaux at normalesup.org (Gael Varoquaux)
Date: Sat, 6 Jun 2009 20:35:33 +0200
Subject: [Numpy-discussion] Functions for indexing into certain parts of
	an array (2d)
In-Reply-To: 
References: <3d375d730906060009r6efc1b70y6778408541fe6ce3@mail.gmail.com>
	
Message-ID: <20090606183533.GF19405@phare.normalesup.org>

On Sat, Jun 06, 2009 at 11:30:37AM -0700, Fernando Perez wrote:
> On Sat, Jun 6, 2009 at 12:09 AM, Robert Kern wrote:

> - Any interest in also having the stuff below?  I'm needing to build
> structured random arrays a lot (symmetric, anti-symmetric, symmetric
> with a particular diagonal, etc), and these are coming in handy.  If
> you want them, I'll put the whole thing together (these use the
> indexing utilities from the previous suggestion).

I think they need examples. Right now, it is not clear at all to me what
they do.

Gaël

From fperez.net at gmail.com  Sat Jun  6 14:36:20 2009
From: fperez.net at gmail.com (Fernando Perez)
Date: Sat, 6 Jun 2009 11:36:20 -0700
Subject: [Numpy-discussion] matrix default to column vector?
In-Reply-To: 
References: <75c31b2a0905210610x1a321264r5b6f93d327ef2b36@mail.gmail.com>
	<4A283BD5.4080405@noaa.gov>
	<3699E128-2BCA-45E1-97A1-B0DE3B8E8B74@mac.com>
	<4A283F96.1050103@american.edu>
	<513936A2-C309-435B-B9AD-1B8B71D57B06@mac.com>
	<4A29561D.5070806@american.edu> <4A2A8B50.9030904@american.edu>
Message-ID: 

On Sat, Jun 6, 2009 at 11:03 AM, Charles R Harris wrote:

> I don't think we can change the current matrix class, to do so would break
> too much code. It would be nice to extend it with an explicit inner product,
> but I can't think of any simple notation for it that python would parse.

Maybe it's time to make another push on python-dev for the pep-225
stuff for other operators?

https://cirl.berkeley.edu/fperez/static/numpy-pep225/

Last year I got pretty much zero interest from python-dev on this,
but they were very very busy with 3.0 on the horizon.  Perhaps once
they put 3.1 out would be a good time to champion this again.

It's slightly independent of the matrix class debate, but perhaps
having special operators for real matrix multiplication could ease
some of the bottlenecks of this discussion.

It would be great if someone could champion that discussion on
python-dev though, I don't see myself finding the time for it another
time around...

Cheers,

f

From kwgoodman at gmail.com  Sat Jun  6 14:46:52 2009
From: kwgoodman at gmail.com (Keith Goodman)
Date: Sat, 6 Jun 2009 11:46:52 -0700
Subject: [Numpy-discussion] Functions for indexing into certain parts of
	an array (2d)
In-Reply-To: 
References: <3d375d730906060009r6efc1b70y6778408541fe6ce3@mail.gmail.com>
	
Message-ID: 

On Sat, Jun 6, 2009 at 11:30 AM, Fernando Perez wrote:
> On Sat, Jun 6, 2009 at 12:09 AM, Robert Kern wrote:
>> diag_indices() can be made more efficient, but these are fine.
>
> Suggestion?  Right now it's not obvious to me...

I'm interested in a more efficient way too. Here's how I plan to adapt
the code for my personal use:

def fill_diag(arr, value):
    if arr.ndim != 2:
        raise ValueError, "Input must be 2-d."
    idx = range(arr.shape[0])
    arr[(idx,) * 2] = value
    return arr

>> a = np.array([[1,2,3],[4,5,6],[7,8,9]])
>> a = fill_diag(a, 0)
>> a

array([[0, 2, 3],
       [4, 0, 6],
       [7, 8, 0]])

>> a = fill_diag(a, np.array([1,2,3]))
>> a

array([[1, 2, 3],
       [4, 2, 6],
       [7, 8, 3]])

From kwgoodman at gmail.com  Sat Jun  6 14:50:14 2009
From: kwgoodman at gmail.com (Keith Goodman)
Date: Sat, 6 Jun 2009 11:50:14 -0700
Subject: [Numpy-discussion] Functions for indexing into certain parts of
	an array (2d)
In-Reply-To: 
References: <3d375d730906060009r6efc1b70y6778408541fe6ce3@mail.gmail.com>
	
Message-ID: 

On Sat, Jun 6, 2009 at 11:46 AM, Keith Goodman wrote:
> On Sat, Jun 6, 2009 at 11:30 AM, Fernando Perez wrote:
>> On Sat, Jun 6, 2009 at 12:09 AM, Robert Kern wrote:
>>> diag_indices() can be made more efficient, but these are fine.
>>
>> Suggestion?  Right now it's not obvious to me...
>
> I'm interested in a more efficient way too. Here's how I plan to adapt
> the code for my personal use:
>
> def fill_diag(arr, value):
>    if arr.ndim != 2:
>        raise ValueError, "Input must be 2-d."
>    idx = range(arr.shape[0])
>    arr[(idx,) * 2] = value
>    return arr

Maybe it is confusing to return the array since the operation is in
place. So:

def fill_diag(arr, value):
    if arr.ndim != 2:
        raise ValueError, "Input must be 2-d."
    idx = range(arr.shape[0])
    arr[(idx,) * 2] = value

From aisaac at american.edu  Sat Jun  6 14:50:46 2009
From: aisaac at american.edu (Alan G Isaac)
Date: Sat, 06 Jun 2009 14:50:46 -0400
Subject: [Numpy-discussion] matrix default to column vector?
In-Reply-To: 
References: <75c31b2a0905210610x1a321264r5b6f93d327ef2b36@mail.gmail.com>
	<4A283BD5.4080405@noaa.gov>
	<3699E128-2BCA-45E1-97A1-B0DE3B8E8B74@mac.com>
	<4A283F96.1050103@american.edu>
	<513936A2-C309-435B-B9AD-1B8B71D57B06@mac.com>
	<4A29561D.5070806@american.edu> <4A2A8B50.9030904@american.edu>
Message-ID: <4A2ABA86.1030907@american.edu>

On 6/6/2009 2:03 PM Charles R Harris apparently wrote:
> So is eye(3)*(v.T*v) valid? If (v.T*v) is 1x1 you have incompatible
> dimensions for the multiplication

Exactly.  So it is not valid.
As you point out, to make it valid implies
a loss of the associativity of matrix
multiplication.  Not a good idea!

Cheers,
Alan

From fperez.net at gmail.com  Sat Jun  6 14:45:09 2009
From: fperez.net at gmail.com (Fernando Perez)
Date: Sat, 6 Jun 2009 11:45:09 -0700
Subject: [Numpy-discussion] Functions for indexing into certain parts of
	an array (2d)
In-Reply-To: <20090606183533.GF19405@phare.normalesup.org>
References: <3d375d730906060009r6efc1b70y6778408541fe6ce3@mail.gmail.com>
	<20090606183533.GF19405@phare.normalesup.org>
Message-ID: 

On Sat, Jun 6, 2009 at 11:35 AM, Gael Varoquaux wrote:

> I think they need examples. Right now, it is not clear at all to me what
> they do.

Cheers,

f

# With doctests, set to be repeatable by seeding the rng.

def structured_rand_arr(size, sample_func=np.random.random,
                        ltfac=None, utfac=None, fill_diag=None):
    """Make a structured random 2-d array of shape (size,size).

    If no optional arguments are given, a symmetric array is returned.

    Parameters
    ----------

    size : int
      Determines the shape of the output array: (size,size).

    sample_func : function, optional.
      Must be a function which when called with a 2-tuple of ints, returns a
      2-d array of that shape.  By default, np.random.random is used, but any
      other sampling function can be used as long as matches this API.

    utfac : float, optional
      Multiplicative factor for the upper triangular part of the matrix.
ltfac : float, optional Multiplicative factor for the lower triangular part of the matrix. fill_diag : float, optional If given, use this value to fill in the diagonal. Otherwise the diagonal will contain random elements. Examples -------- >>> np.random.seed(0) >>> structured_rand_arr(4) array([[ 0.5488, 0.7152, 0.6028, 0.5449], [ 0.7152, 0.6459, 0.4376, 0.8918], [ 0.6028, 0.4376, 0.7917, 0.5289], [ 0.5449, 0.8918, 0.5289, 0.0871]]) >>> structured_rand_arr(4,ltfac=-10,utfac=10,fill_diag=0.5) array([[ 0.5 , 8.3262, 7.7816, 8.7001], [-8.3262, 0.5 , 4.6148, 7.8053], [-7.7816, -4.6148, 0.5 , 9.4467], [-8.7001, -7.8053, -9.4467, 0.5 ]]) """ # Make a random array from the given sampling function mat0 = sample_func((size,size)) # And the empty one we'll then fill in to return mat = np.empty_like(mat0) # Extract indices for upper-triangle, lower-triangle and diagonal uidx = triu_indices(size,1) lidx = tril_indices(size,-1) didx = diag_indices(size) # Extract each part from the original and copy it to the output, possibly # applying multiplicative factors. We check the factors instead of # defaulting to 1.0 to avoid unnecessary floating point multiplications # which could be noticeable for very large sizes. if utfac: mat[uidx] = utfac * mat0[uidx] else: mat[uidx] = mat0[uidx] if ltfac: mat[lidx] = ltfac * mat0.T[lidx] else: mat[lidx] = mat0.T[lidx] # If fill_diag was provided, use it; otherwise take the values in the # diagonal from the original random array. if fill_diag is not None: mat[didx] = fill_diag else: mat[didx] = mat0[didx] return mat def symm_rand_arr(size,sample_func=np.random.random,fill_diag=None): """Make a symmetric random 2-d array of shape (size,size). Parameters ---------- n : int Size of the output array. fill_diag : float, optional If given, use this value to fill in the diagonal. Useful for Examples -------- >>> np.random.seed(0) >>> symm_rand_arr(4) array([[ 0.5488, 0.7152, 0.6028, 0.5449], [ 0.7152, 0.6459, 0.4376, 0.8918], [ 0.6028, 0.4376, 0.7917, 0.5289], [ 0.5449, 0.8918, 0.5289, 0.0871]]) >>> symm_rand_arr(4,fill_diag=4) array([[ 4. , 0.8326, 0.7782, 0.87 ], [ 0.8326, 4. , 0.4615, 0.7805], [ 0.7782, 0.4615, 4. , 0.9447], [ 0.87 , 0.7805, 0.9447, 4. ]]) """ return structured_rand_arr(size,sample_func,fill_diag=fill_diag) def antisymm_rand_arr(size,sample_func=np.random.random): """Make an anti-symmetric random 2-d array of shape (size,size). Parameters ---------- n : int Size of the output array. Examples -------- >>> np.random.seed(0) >>> antisymm_rand_arr(4) array([[ 0. , 0.7152, 0.6028, 0.5449], [-0.7152, 0. , 0.4376, 0.8918], [-0.6028, -0.4376, 0. , 0.5289], [-0.5449, -0.8918, -0.5289, 0. ]]) """ return structured_rand_arr(size,sample_func,ltfac=-1.0,fill_diag=0) From charlesr.harris at gmail.com Sat Jun 6 14:58:09 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 6 Jun 2009 12:58:09 -0600 Subject: [Numpy-discussion] matrix default to column vector? In-Reply-To: References: <75c31b2a0905210610x1a321264r5b6f93d327ef2b36@mail.gmail.com> <4A283BD5.4080405@noaa.gov> <3699E128-2BCA-45E1-97A1-B0DE3B8E8B74@mac.com> <4A283F96.1050103@american.edu> <513936A2-C309-435B-B9AD-1B8B71D57B06@mac.com> <4A29561D.5070806@american.edu> Message-ID: On Sat, Jun 6, 2009 at 12:34 PM, Olivier Verdier wrote: > I took that very seriously when you said that matrices were important to > you. Far from me the idea of forbidding numpy users to use matrices. > My point was the fact that newcomers are confused by the presence of both > matrices and arrays. 
I think that there should be only one
> matrix/vector/tensor object in numpy. Therefore I would advocate the removal
> of matrices from numpy.
>
> *But* why not have matrices in a different component? Maybe as a part of
> scipy? or somewhere else? You would be more than welcome to use them
> anywhere. Note that I use components outside numpy for my teaching (scipy,
> sympy, mayavi, nosetest) and I don't have any problems with that.
>
> With my "argument" I endeavoured to explain the potential complications of
> using matrices instead of arrays when teaching. Perhaps the strongest
> argument against matrices is that you cannot use vectors. I've taught enough
> matlab courses to realise the pain that this represents for students. But I
> realise also that somebody else would have a different experience.
>
> Of course x.T*y should be a 1x1 matrix, this is not an anomaly, but it is
> confusing for students, because they expect a scalar. That is why I prefer
> to teach with dot. Then the relation matrix/vector/scalar is crystal clear.
>

How about the common expression

exp((v.T*A*v)/2)

do you expect a matrix exponential here? Or should the students write

exp(<v, A*v>/2)

where <...> is the inner product?

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From sccolbert at gmail.com  Sat Jun  6 15:15:44 2009
From: sccolbert at gmail.com (Chris Colbert)
Date: Sat, 6 Jun 2009 15:15:44 -0400
Subject: [Numpy-discussion] is my numpy installation using custom
	blas/lapack?
In-Reply-To: <28e83ea0906061132u6b506449xadfdd1ca944dfde@mail.gmail.com>
References: <28e83ea0906061002k456fc263t9a5e216835dee187@mail.gmail.com>
	<7f014ea60906061025o17416dc4m97bd2fd0d15701b5@mail.gmail.com>
	<28e83ea0906061037h3ba96628qf5e57519e70dca12@mail.gmail.com>
	<7f014ea60906061042h3260df64je034a38de1894871@mail.gmail.com>
	<7f014ea60906061046g650e71e4j4fb4f064ddc49030@mail.gmail.com>
	<28e83ea0906061132u6b506449xadfdd1ca944dfde@mail.gmail.com>
Message-ID: <7f014ea60906061215q15bf70b9i56c1e5d9fd5ca7b0@mail.gmail.com>

I need the full link to pastebin.com in order to view your post.

It will be something like http://pastebin.com/m6b09f05c

chris

On Sat, Jun 6, 2009 at 2:32 PM, Richard Llewellyn wrote:
> I posted the setup.py build output to pastebin.com, though missed the
> uninteresting stderr (forgot tcsh command to redirect both).
> Also, used setup.py build --fcompiler=gnu95.
>
>
> To be clear, I am not certain that my ATLAS libraries are not found. But
> during the build starting at line 95 (pastebin.com) I see a compilation
> failure, and then NO_ATLAS_INFO, 2.
>
> I don't think I can use ldconfig without root, but have set LD_LIBRARY_PATH
> to point to the scipy_build/lib until I put them somewhere else.
>
> importing numpy works, though lapack_lite is also imported. I wonder if this
> is normal even if my ATLAS was used.
>
> Thanks,
> Rich
>
> On Sat, Jun 6, 2009 at 10:46 AM, Chris Colbert wrote:
>>
>> and where exactly are you seeing atlas not found? during the build
>> process, or when you import numpy in python?
>> >> if its the latter, you need to add a .conf file ?in /etc/ld.so.conf.d/ >> ?with the line /usr/local/rich/src/scipy_build/lib ?and then run ?sudo >> ldconfig >> >> Chris >> >> >> On Sat, Jun 6, 2009 at 1:42 PM, Chris Colbert wrote: >> > can you run this and post the build.log to pastebin.com: >> > >> > assuming your numpy build directory is /home/numpy-1.3.0: >> > >> > cd /home/numpy-1.3.0 >> > rm -rf build >> > python setup.py build &&> build.log >> > >> > >> > Chris >> > >> > >> > On Sat, Jun 6, 2009 at 1:37 PM, Richard Llewellyn >> > wrote: >> >> Hi Chris, >> >> ?thanks much for posting those installation instructions.? Seems >> >> similar to >> >> what I pieced together. >> >> >> >> I gather ATLAS not found.? Oops, drank that beer too early. >> >> >> >> I copied Atlas libs to /usr/local/rich/src/scipy_build/lib. >> >> >> >> This is my site.cfg.? Out of desperation I tried search_static_first = >> >> 1, >> >> but probably of no use. >> >> >> >> [DEFAULT] >> >> library_dirs = /usr/local/rich/src/scipy_build/lib:$HOME/usr/galois/lib >> >> include_dirs = >> >> /usr/local/rich/src/scipy_build/lib/include:$HOME/usr/galois/include >> >> search_static_first = 1 >> >> >> >> [blas_opt] >> >> libraries = f77blas, cblas, atlas >> >> >> >> [lapack_opt] >> >> libraries = lapack, f77blas, cblas, atlas >> >> >> >> [amd] >> >> amd_libs = amd >> >> >> >> [umfpack] >> >> umfpack_libs = umfpack, gfortran >> >> >> >> [fftw] >> >> libraries = fftw3 >> >> >> >> >> >> Rich >> >> >> >> >> >> >> >> >> >> On Sat, Jun 6, 2009 at 10:25 AM, Chris Colbert >> >> wrote: >> >>> >> >>> when you build numpy, did you use site.cfg to tell it where to find >> >>> your atlas libs? >> >>> >> >>> On Sat, Jun 6, 2009 at 1:02 PM, Richard Llewellyn >> >>> wrote: >> >>> > Hello, >> >>> > >> >>> > I've managed a build of lapack and atlas on Fedora 10 on a quad >> >>> > core, >> >>> > 64, >> >>> > and now (...) have a numpy I can import that runs tests ok. :]??? I >> >>> > am >> >>> > puzzled, however, that numpy builds and imports lapack_lite.? Does >> >>> > this >> >>> > mean >> >>> > I have a problem with the build(s)? >> >>> > Upon building numpy, I see the troubling output: >> >>> > >> >>> > ######################## >> >>> > >> >>> > C compiler: gcc -pthread -fno-strict-aliasing -DNDEBUG -O2 -g -pipe >> >>> > -Wall >> >>> > -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protecto >> >>> > r --param=ssp-buffer-size=4 -m64 -mtune=generic -D_GNU_SOURCE -fPIC >> >>> > -fPIC >> >>> > >> >>> > compile options: '-c' >> >>> > gcc: _configtest.c >> >>> > gcc -pthread _configtest.o -L/usr/local/rich/src/scipy_build/lib >> >>> > -llapack >> >>> > -lptf77blas -lptcblas -latlas -o _configtest >> >>> > /usr/bin/ld: _configtest: hidden symbol `__powidf2' in >> >>> > /usr/lib/gcc/x86_64-redhat-linux/4.3.2/libgcc.a(_powidf2.o) is >> >>> > reference >> >>> > d by DSO >> >>> > /usr/bin/ld: final link failed: Nonrepresentable section on output >> >>> > collect2: ld returned 1 exit status >> >>> > /usr/bin/ld: _configtest: hidden symbol `__powidf2' in >> >>> > /usr/lib/gcc/x86_64-redhat-linux/4.3.2/libgcc.a(_powidf2.o) is >> >>> > reference >> >>> > d by DSO >> >>> > /usr/bin/ld: final link failed: Nonrepresentable section on output >> >>> > collect2: ld returned 1 exit status >> >>> > failure. >> >>> > removing: _configtest.c _configtest.o >> >>> > Status: 255 >> >>> > Output: >> >>> > ? FOUND: >> >>> > ??? libraries = ['lapack', 'ptf77blas', 'ptcblas', 'atlas'] >> >>> > ??? library_dirs = ['/usr/local/rich/src/scipy_build/lib'] >> >>> > ??? 
language = f77 >> >>> > ??? define_macros = [('NO_ATLAS_INFO', 2)] >> >>> > >> >>> > ########################## >> >>> > >> >>> > I don't have root on this machine, but could pester admins for >> >>> > eventual >> >>> > temporary access. >> >>> > >> >>> > Thanks much for any help, >> >>> > Rich >> >>> > >> >>> > _______________________________________________ >> >>> > Numpy-discussion mailing list >> >>> > Numpy-discussion at scipy.org >> >>> > http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >>> > >> >>> > >> >>> _______________________________________________ >> >>> Numpy-discussion mailing list >> >>> Numpy-discussion at scipy.org >> >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> >> >> >> >> _______________________________________________ >> >> Numpy-discussion mailing list >> >> Numpy-discussion at scipy.org >> >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> >> >> >> > >> _______________________________________________ >> Numpy-discussion mailing list >> Numpy-discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From the.minjae at gmail.com Sat Jun 6 15:55:33 2009 From: the.minjae at gmail.com (Minjae Kim) Date: Sat, 6 Jun 2009 14:55:33 -0500 Subject: [Numpy-discussion] performance matrix multiplication vs. matlab In-Reply-To: <7f014ea60906060959s6570cc32l277c5ab423f0b9ed@mail.gmail.com> References: <20090604143638.xrqzx8w8v4ggckkk@www.sms.ed.ac.uk> <4A28EC37.10205@ar.media.kyoto-u.ac.jp> <4A28F16C.3070607@ar.media.kyoto-u.ac.jp> <4A2955F5.9030006@hawaii.edu> <4A295A7D.1040407@ar.media.kyoto-u.ac.jp> <7f014ea60906051437p6002c6f7jb2d77374a220d2aa@mail.gmail.com> <7f014ea60906060959s6570cc32l277c5ab423f0b9ed@mail.gmail.com> Message-ID: Thanks for this excellent recipe. I have not tried it out myself yet, but I will follow the instruction on clean Ubuntu 9.04 64-bit. Best, Minjae On Sat, Jun 6, 2009 at 11:59 AM, Chris Colbert wrote: > since there is demand, and someone already emailed me, I'll put what I > did in this post. 
> It pretty much follows what's on the scipy website,
> with a couple other things I gleaned from reading the ATLAS install
> guide:
>
> and here it goes, this is valid for Ubuntu 9.04 64-bit (# starts a
> comment when working in the terminal)
>
>
> download lapack 3.2.1 http://www.netlib.org/lapack/lapack.tgz
> download atlas 3.8.3
>
> http://sourceforge.net/project/downloading.php?group_id=23725&filename=atlas3.8.3.tar.bz2&a=65663372
>
> create folder /home/your-user-name/build/atlas   #this is where we build
> create folder /home/your-user-name/build/lapack  #atlas and lapack
>
> extract the folder lapack-3.2.1 to /home/your-user-name/build/lapack
> extract the contents of atlas to /home/your-user-name/build/atlas
>
>
>
> now in the terminal:
>
> # remove g77 and get stuff we need
> sudo apt-get remove g77
> sudo apt-get install gfortran
> sudo apt-get install build-essential
> sudo apt-get install python-dev
> sudo apt-get install python-setuptools
> sudo easy_install nose
>
>
> # build lapack
> cd /home/your-user-name/build/lapack/lapack-3.2.1
> cp INSTALL/make.inc.gfortran make.inc
>
> gedit make.inc
> #################
> #in the make.inc file make sure the line   OPTS = -O2 -fPIC -m64
> #and    NOOPTS = -O0 -fPIC -m64
> #the -m64 flags build 64-bit code, if you want 32-bit, simply leave
> #the -m64 flags out
> #################
>
> cd SRC
>
> #this should build lapack without error
> make
>
>
>
> # build atlas
>
> cd /home/your-user-name/build/atlas
>
> #this is simply where we will build the atlas
> #libs, you can name it what you want
> mkdir Linux_X64SSE2
>
> cd Linux_X64SSE2
>
> #need to turn off cpu-throttling
> sudo cpufreq-selector -g performance
>
> #if you don't want 64bit code remove the -b 64 flag. replace the
> #number 2400 with your CPU frequency in MHz
> #i.e. my cpu is 2.53 GHz so i put 2530
> ../configure -b 64 -D c -DPentiumCPS=2400 -Fa alg -fPIC
> --with-netlib-lapack=/home/your-user-name/build/lapack/lapack-3.2.1/lapack_LINUX.a
>
> #the configure step takes a bit, and should end without errors
>
> #this takes a long time, go get some coffee, it should end without error
> make build
>
> #this will verify the build, also long running
> make check
>
> #this will test the performance of your build and give you feedback on
> #it.
your numbers should be close to the test numbers at the end > make time > > cd lib > > #builds single threaded .so's > make shared > > #builds multithreaded .so's > make ptshared > > #copies all of the atlas libs (and the lapack lib built with atlas) > #to our lib dir > sudo cp *.so /usr/local/lib/ > > > > #now we need to get and build numpy > > download numpy 1.3.0 > > http://sourceforge.net/project/downloading.php?group_id=1369&filename=numpy-1.3.0.tar.gz&a=93506515 > > extract the folder numpy-1.3.0 to /home/your-user-name/build > > #in the terminal > > cd /home/your-user-name/build/numpy-1.3.0 > cp site.cfg.example site.cfg > > gedit site.cfg > ############################################### > # in site.cfg uncomment the following lines and make them look like these > [DEFAULT] > library_dirs = /usr/local/lib > include_dirs = /usr/local/include > > [blas_opt] > libraries = ptf77blas, ptcblas, atlas > > [lapack_opt] > libraries = lapack, ptf77blas, ptcblas, atlas > ################################################### > #if you want single threaded libs, uncomment those lines instead > > > #build numpy- should end without error > python setup.py build > > #install numpy > python setup.py install > > cd /home > > sudo ldconfig > > python > >>import numpy > >>numpy.test() #this should run with no errors (skipped tests and > known-fails are ok) > >>a = numpy.random.randn(6000, 6000) > >>numpy.dot(a, a) # look at your cpu monitor and verify all cpu cores > are at 100% if you built with threads > > > Celebrate with a beer! > > > Cheers! > > Chris > > > > > > On Sat, Jun 6, 2009 at 10:42 AM, Keith Goodman wrote: > > On Fri, Jun 5, 2009 at 2:37 PM, Chris Colbert > wrote: > >> I'll caution anyone from using Atlas from the repos in Ubuntu 9.04 as > the > >> package is broken: > >> > >> https://bugs.launchpad.net/ubuntu/+source/atlas/+bug/363510 > >> > >> > >> just build Atlas yourself, you get better performance AND threading. > >> Building it is not the nightmare it sounds like. I think i've done it a > >> total of four times now, both 32-bit and 64-bit builds. > >> > >> If you need help with it, just email me off list. > > > > That's a nice offer. I tried building ATLAS on Debian a year or two > > ago and got stuck. > > > > Clear out your inbox! > > _______________________________________________ > > Numpy-discussion mailing list > > Numpy-discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From aisaac at american.edu Sat Jun 6 15:59:01 2009 From: aisaac at american.edu (Alan G Isaac) Date: Sat, 06 Jun 2009 15:59:01 -0400 Subject: [Numpy-discussion] matrix default to column vector? In-Reply-To: References: <75c31b2a0905210610x1a321264r5b6f93d327ef2b36@mail.gmail.com> <4A283BD5.4080405@noaa.gov> <3699E128-2BCA-45E1-97A1-B0DE3B8E8B74@mac.com> <4A283F96.1050103@american.edu> <513936A2-C309-435B-B9AD-1B8B71D57B06@mac.com> <4A29561D.5070806@american.edu> Message-ID: <4A2ACA85.2000304@american.edu> On 6/6/2009 2:58 PM Charles R Harris apparently wrote: > How about the common expression > exp((v.t*A*v)/2) > do you expect a matrix exponential here? I take your point that there are conveniences to treating a 1 by 1 matrix as a scalar. Most matrix programming languages do this, I think. For sure GAUSS does. 
The result of x' * A * x
is a "matrix" (it has one row and one column) but
it functions like a scalar (and even more,
since right multiplication by it is also allowed).

While I think this is "wrong", especially in a
language that readily distinguishes scalars
and matrices, I recognize that many others have
found the behavior useful. And I confess that
when I talk about quadratic forms, I do treat
x.T * A * x as if it were scalar.

But just to be clear, how are you proposing
to implement that behavior, if you are?

Alan

From robert.kern at gmail.com Sat Jun 6 16:30:38 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Sat, 6 Jun 2009 15:30:38 -0500
Subject: [Numpy-discussion] matrix default to column vector?
In-Reply-To: <4A2ACA85.2000304@american.edu>
References: <75c31b2a0905210610x1a321264r5b6f93d327ef2b36@mail.gmail.com>
	<4A283BD5.4080405@noaa.gov>
	<3699E128-2BCA-45E1-97A1-B0DE3B8E8B74@mac.com>
	<4A283F96.1050103@american.edu>
	<513936A2-C309-435B-B9AD-1B8B71D57B06@mac.com>
	<4A29561D.5070806@american.edu>
	<4A2ACA85.2000304@american.edu>
Message-ID: <3d375d730906061330x5135781em7cdf6fc1c403bc16@mail.gmail.com>

On Sat, Jun 6, 2009 at 14:59, Alan G Isaac wrote:
> On 6/6/2009 2:58 PM Charles R Harris apparently wrote:
>> How about the common expression
>> exp((v.t*A*v)/2)
>> do you expect a matrix exponential here?
>
>
> I take your point that there are conveniences
> to treating a 1 by 1 matrix as a scalar.
> Most matrix programming languages do this, I think.
> For sure GAUSS does. The result of x' * A * x
> is a "matrix" (it has one row and one column) but
> it functions like a scalar (and even more,
> since right multiplication by it is also allowed).
>
> While I think this is "wrong", especially in a
> language that readily distinguishes scalars
> and matrices, I recognize that many others have
> found the behavior useful. And I confess that
> when I talk about quadratic forms, I do treat
> x.T * A * x as if it were scalar.

The old idea of introducing RowVector and ColumnVector would help
here. If x were a ColumnVector and A a Matrix, then you can introduce
the following rules:

x.T is a RowVector
RowVector * ColumnVector is a scalar
RowVector * Matrix is a RowVector
Matrix * ColumnVector is a ColumnVector

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
 -- Umberto Eco

From bsouthey at gmail.com Sat Jun 6 16:31:27 2009
From: bsouthey at gmail.com (Bruce Southey)
Date: Sat, 6 Jun 2009 15:31:27 -0500
Subject: [Numpy-discussion] Functions for indexing into certain parts of
	an array (2d)
In-Reply-To:
References:
Message-ID:

On Sat, Jun 6, 2009 at 2:01 AM, Fernando Perez wrote:
[snip]
> ####
>
> def mask_indices(n,mask_func,k=0):
>     """Return the indices for an array, given a masking function like
> tri{u,l}."""
>     m = np.ones((n,n),int)
>     a = mask_func(m,k)
>     return np.where(a != 0)
>
>
> def diag_indices(n,ndim=2):
>     """Return the indices to index into a diagonal.
>
>     Examples
>     --------
>     >>> a = np.array([[1,2,3,4],[5,6,7,8],[9,10,11,12],[13,14,15,16]])
>     >>> a
>     array([[ 1,  2,  3,  4],
>            [ 5,  6,  7,  8],
>            [ 9, 10, 11, 12],
>            [13, 14, 15, 16]])
>     >>> di = diag_indices(4)
>     >>> a[di] = 100
>     >>> a
>     array([[100,   2,   3,   4],
>            [  5, 100,   7,   8],
>            [  9,  10, 100,  12],
>            [ 13,  14,  15, 100]])
>     """
>     idx = np.arange(n)
>     return (idx,)*ndim
>
>

While not trying to be negative, this raises important questions that
need to be covered, because the user should not have to resort to trial
and error to find out what actually works and what does not. While
certain features can be fixed within Numpy, API changes should be
avoided.

Please explain the argument 'n'. Since you seem to be fixing it to the
length of the main diagonal, it is redundant. Otherwise, why the first
'n' diagonal elements and not the last 'n'? If it is meant to allow
different diagonal elements, then it would need adjustment to indicate
the starting and stopping locations.

What happens when the shape of 'a' is different from 'n'? I would think
that this means that diag_indices should be an array method or require
passing 'a' (or the shape of 'a') to diag_indices. What happens if the
array is not square? If 'a' is 4 by 2 then passing n=4 will be wrong.

What about off-diagonals? That is, it should be clear that you are
referring to the main diagonal.

How does this address non-contiguous memory, Fortran ordered arrays, or
arrays with more than 2 dimensions?

How does this handle record and masked arrays as well as the matrix
subclass that are supported by Numpy? Presumably it does not, so if it
is not an array method, then the type of input would need to be checked.

There are probably similar issues with the other functions you propose.

Bruce

From charlesr.harris at gmail.com Sat Jun 6 16:32:46 2009
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Sat, 6 Jun 2009 14:32:46 -0600
Subject: [Numpy-discussion] matrix default to column vector?
In-Reply-To: <4A2ACA85.2000304@american.edu>
References: <75c31b2a0905210610x1a321264r5b6f93d327ef2b36@mail.gmail.com>
	<4A283BD5.4080405@noaa.gov>
	<3699E128-2BCA-45E1-97A1-B0DE3B8E8B74@mac.com>
	<4A283F96.1050103@american.edu>
	<513936A2-C309-435B-B9AD-1B8B71D57B06@mac.com>
	<4A29561D.5070806@american.edu>
	<4A2ACA85.2000304@american.edu>
Message-ID:

On Sat, Jun 6, 2009 at 1:59 PM, Alan G Isaac wrote:

> On 6/6/2009 2:58 PM Charles R Harris apparently wrote:
> > How about the common expression
> > exp((v.t*A*v)/2)
> > do you expect a matrix exponential here?
>
>
> I take your point that there are conveniences
> to treating a 1 by 1 matrix as a scalar.
> Most matrix programming languages do this, I think.
> For sure GAUSS does. The result of x' * A * x
> is a "matrix" (it has one row and one column) but
> it functions like a scalar (and even more,
> since right multiplication by it is also allowed).
>

It's actually an inner product and the notation would be technically
correct. More generally, it is a bilinear function of two vectors. But
the correct notation is a bit cumbersome for a student struggling with
plain old matrices ;) Ndarrays are actually closer to the tensor ideal
in that M*v would be a contraction removing two indices from a three
index tensor product. The "dot" function, aka *, then functions as a
contraction. In this case x.T*A*x works just fine because A*x is 1D and
x.T=x, so the final result is a scalar (0D array). So making vectors 1D
arrays would solve some problems. There remains the construction v*v.T,
which should really be treated as a tensor product, or bivector.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From neilcrighton at gmail.com Sat Jun 6 16:37:35 2009
From: neilcrighton at gmail.com (Neil Crighton)
Date: Sat, 6 Jun 2009 20:37:35 +0000 (UTC)
Subject: [Numpy-discussion] Changes to arraysetops
References: <4A27BCCF.4090608@american.edu>
	<1cd32cbb0906040535m235a1e90oc52f21806238d5d7@mail.gmail.com>
	<4A27D67F.3@american.edu>
	<1cd32cbb0906040750x120abbe8g397eb57e734bdeb8@mail.gmail.com>
	<4A27E615.70604@american.edu>
	<1cd32cbb0906040829n74f01778s5ef37894e1b5e081@mail.gmail.com>
	<4A28AEB2.5050209@ntc.zcu.cz>
	<1cd32cbb0906060441l119f274frc9108d1444cfe638@mail.gmail.com>
Message-ID:

Thanks for the summary! I'm +1 on points 1, 2 and 3. +0 for points 4 and 5
(assume_unique keyword and renaming arraysetops).

Neil

PS. I think you mean deprecate, not depreciate :)

From robert.kern at gmail.com Sat Jun 6 16:46:42 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Sat, 6 Jun 2009 15:46:42 -0500
Subject: [Numpy-discussion] Functions for indexing into certain parts of
	an array (2d)
In-Reply-To:
References:
Message-ID: <3d375d730906061346h30dff1d3sb0ba7c8e46662e36@mail.gmail.com>

On Sat, Jun 6, 2009 at 15:31, Bruce Southey wrote:
> While not trying to be negative, this raises important questions that
> need to be covered because the user should not have to do trial and
> error to find what actually works and what that does not. While
> certain features can be fixed within Numpy, API changes should be
> avoided.

He's proposing additional functions, not changes to existing functions.

> How does this address non-contiguous memory, Fortran ordered arrays

They just work. These functions create index arrays to use with fancy
indexing which works on all of these because of numpy's abstractions.
Please read the code. This is obvious.

> or arrays with more than 2 dimensions?

diag_indices(n, ndim=3), etc.

> How does this handle record and masked arrays as well as the matrix
> subclass that are supported by Numpy?

Again, these functions work via fancy indexing. We don't need to
repeat all of the documentation on fancy indexing in each of these
functions.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
 -- Umberto Eco

From charlesr.harris at gmail.com Sat Jun 6 16:50:33 2009
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Sat, 6 Jun 2009 14:50:33 -0600
Subject: [Numpy-discussion] matrix default to column vector?
In-Reply-To: <3d375d730906061330x5135781em7cdf6fc1c403bc16@mail.gmail.com>
References: <75c31b2a0905210610x1a321264r5b6f93d327ef2b36@mail.gmail.com>
	<3699E128-2BCA-45E1-97A1-B0DE3B8E8B74@mac.com>
	<4A283F96.1050103@american.edu>
	<513936A2-C309-435B-B9AD-1B8B71D57B06@mac.com>
	<4A29561D.5070806@american.edu>
	<4A2ACA85.2000304@american.edu>
	<3d375d730906061330x5135781em7cdf6fc1c403bc16@mail.gmail.com>
Message-ID:

On Sat, Jun 6, 2009 at 2:30 PM, Robert Kern wrote:

> On Sat, Jun 6, 2009 at 14:59, Alan G Isaac wrote:
> > On 6/6/2009 2:58 PM Charles R Harris apparently wrote:
> >> How about the common expression
> >> exp((v.t*A*v)/2)
> >> do you expect a matrix exponential here?
> >
> >
> > I take your point that there are conveniences
> > to treating a 1 by 1 matrix as a scalar.
> > Most matrix programming languages do this, I think.
> > For sure GAUSS does. The result of x' * A * x
> > is a "matrix" (it has one row and one column) but
> > it functions like a scalar (and even more,
> > since right multiplication by it is also allowed).
> >
> > While I think this is "wrong", especially in a
> > language that readily distinguishes scalars
> > and matrices, I recognize that many others have
> > found the behavior useful. And I confess that
> > when I talk about quadratic forms, I do treat
> > x.T * A * x as if it were scalar.
>
> The old idea of introducing RowVector and ColumnVector would help
> here. If x were a ColumnVector and A a Matrix, then you can introduce
> the following rules:
>
> x.T is a RowVector
> RowVector * ColumnVector is a scalar
> RowVector * Matrix is a RowVector
> Matrix * ColumnVector is a ColumnVector
>

Yes, that is another good solution. In tensor notation, RowVectors have
signature r_i, ColumnVectors c^i, and matrices M^i_j. The '*' operator
is then a contraction on adjacent indices, a result with no indices is
a scalar, and the only problem that remains is the tensor product
usually achieved by x*y.T. But making the exception that col * row is
the tensor product producing a matrix would solve that and still raise
an error for such things as col*row*row. Or we could simply require
something like bivector(x,y)

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From aisaac at american.edu Sat Jun 6 17:00:44 2009
From: aisaac at american.edu (Alan G Isaac)
Date: Sat, 06 Jun 2009 17:00:44 -0400
Subject: [Numpy-discussion] matrix default to column vector?
In-Reply-To: <3d375d730906061330x5135781em7cdf6fc1c403bc16@mail.gmail.com>
References: <75c31b2a0905210610x1a321264r5b6f93d327ef2b36@mail.gmail.com>
	<4A283BD5.4080405@noaa.gov>
	<3699E128-2BCA-45E1-97A1-B0DE3B8E8B74@mac.com>
	<4A283F96.1050103@american.edu>
	<513936A2-C309-435B-B9AD-1B8B71D57B06@mac.com>
	<4A29561D.5070806@american.edu>
	<4A2ACA85.2000304@american.edu>
	<3d375d730906061330x5135781em7cdf6fc1c403bc16@mail.gmail.com>
Message-ID: <4A2AD8FC.2080602@american.edu>

On 6/6/2009 4:30 PM Robert Kern apparently wrote:
> The old idea of introducing RowVector and ColumnVector would help
> here. If x were a ColumnVector and A a Matrix, then you can introduce
> the following rules:
>
> x.T is a RowVector
> RowVector * ColumnVector is a scalar
> RowVector * Matrix is a RowVector
> Matrix * ColumnVector is a ColumnVector

To me, a "row vector" is just a matrix with a single row,
and a "column vector" is just a matrix with a single column.
Calling them "vectors" is rather redundant, since matrices
are also vectors (i.e., belong to a vector space).

I think the core of the row-vector/column-vector proposal
is really the idea that we could have 1d objects that
also have an "orientation" for the purposes of certain
operations. But then why not just use matrices, which
automatically provide that "orientation"?

Here are the 3 reasons I see:

- to allow iteration over matrices to produce a less
surprising result (*if* you find it surprising that
a matrix is a container of matrices, as I do)
- to allow 1d indexing of these "vectors"
- to introduce a scalar product

I rather doubt (?) that these justify the added complexity
of an additional array subclass.

Alan
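For concreteness, a minimal interactive sketch of the behaviors weighed
in this exchange, using only current numpy (an illustration, not anyone's
proposed API): with the matrix class a quadratic form stays a 1x1 matrix,
while 1d arrays plus dot() already give a true scalar, and np.outer
covers the v*v.T tensor-product case.

>>> import numpy as np
>>> A = np.matrix([[2., 0.], [0., 3.]])
>>> v = np.matrix([[1.], [2.]])
>>> v.T * A * v                # a 1x1 matrix, not a scalar
matrix([[ 14.]])
>>> float(v.T * A * v)         # explicit conversion is needed
14.0
>>> x = np.array([1., 2.])     # the 1d-array route
>>> np.dot(x, np.dot(np.asarray(A), x))
14.0
>>> np.outer(x, x)             # the tensor product / "bivector"
array([[ 1.,  2.],
       [ 2.,  4.]])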
From robert.kern at gmail.com Sat Jun 6 17:01:31 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Sat, 6 Jun 2009 16:01:31 -0500
Subject: [Numpy-discussion] Functions for indexing into certain parts of
	an array (2d)
In-Reply-To:
References: <3d375d730906060009r6efc1b70y6778408541fe6ce3@mail.gmail.com>
Message-ID: <3d375d730906061401i611e2fe5j1d09c24cf8e49682@mail.gmail.com>

On Sat, Jun 6, 2009 at 13:30, Fernando Perez wrote:
> On Sat, Jun 6, 2009 at 12:09 AM, Robert Kern wrote:
>
>> +1
>
> OK, thanks. I'll try to get it ready.
>
>> diag_indices() can be made more efficient, but these are fine.
>
> Suggestion? Right now it's not obvious to me...

Oops! Never mind. I thought it was using mask_indices like the others.
There is a neat trick for accessing the diagonal of an existing array
(a.flat[::a.shape[1]+1]), but it won't work to implement
diag_indices().

> A few more questions:
>
> - Are doctests considered enough testing for numpy, or are separate
> tests also required?

I don't think we have the testing machinery hooked up to test the
docstrings on the functions themselves (we made the decision to keep
examples as clean and pedagogical as possible rather than complete
tests). You can use doctests in the test directories, though.

> - Where should these go?

numpy/lib/twodim_base.py to go with their existing counterparts, I
would think.

> - Any interest in also having the stuff below? I'm needing to build
> structured random arrays a lot (symmetric, anti-symmetric, symmetric
> with a particular diagonal, etc), and these are coming in handy. If
> you want them, I'll put the whole thing together (these use the
> indexing utilities from the previous suggestion).

I wouldn't mind having a little gallery of matrix generators in numpy,
but someone else has already made a much more complete collection:

http://pypi.python.org/pypi/rogues

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
 -- Umberto Eco

From robert.kern at gmail.com Sat Jun 6 17:06:11 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Sat, 6 Jun 2009 16:06:11 -0500
Subject: [Numpy-discussion] matrix default to column vector?
In-Reply-To: <4A2AD8FC.2080602@american.edu>
References: <75c31b2a0905210610x1a321264r5b6f93d327ef2b36@mail.gmail.com>
	<4A283F96.1050103@american.edu>
	<513936A2-C309-435B-B9AD-1B8B71D57B06@mac.com>
	<4A29561D.5070806@american.edu>
	<4A2ACA85.2000304@american.edu>
	<3d375d730906061330x5135781em7cdf6fc1c403bc16@mail.gmail.com>
	<4A2AD8FC.2080602@american.edu>
Message-ID: <3d375d730906061406u6d3be15bj19a9240807cbf645@mail.gmail.com>

On Sat, Jun 6, 2009 at 16:00, Alan G Isaac wrote:
> On 6/6/2009 4:30 PM Robert Kern apparently wrote:
>> The old idea of introducing RowVector and ColumnVector would help
>> here. If x were a ColumnVector and A a Matrix, then you can introduce
>> the following rules:
>>
>> x.T is a RowVector
>> RowVector * ColumnVector is a scalar
>> RowVector * Matrix is a RowVector
>> Matrix * ColumnVector is a ColumnVector
>
>
> To me, a "row vector" is just a matrix with a single row,
> and a "column vector" is just a matrix with a single column.
> Calling them "vectors" is rather redundant, since matrices
> are also vectors (i.e., belong to a vector space).
>
> I think the core of the row-vector/column-vector proposal
> is really the idea that we could have 1d objects that
> also have an "orientation" for the purposes of certain
> operations.
> But then why not just use matrices, which
> automatically provide that "orientation"?

Because (x.T * x) where x is an (n,1) matrix and * is matrix
multiplication (i.e. MM(n,1) -> MM(1,1)) is not the same thing as the
inner product of a vector (RR^n -> RR). Please see the post I was
responding to for the motivation.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
 -- Umberto Eco

From aisaac at american.edu Sat Jun 6 17:08:04 2009
From: aisaac at american.edu (Alan G Isaac)
Date: Sat, 06 Jun 2009 17:08:04 -0400
Subject: [Numpy-discussion] matrix default to column vector?
In-Reply-To:
References: <75c31b2a0905210610x1a321264r5b6f93d327ef2b36@mail.gmail.com>
	<4A283BD5.4080405@noaa.gov>
	<3699E128-2BCA-45E1-97A1-B0DE3B8E8B74@mac.com>
	<4A283F96.1050103@american.edu>
	<513936A2-C309-435B-B9AD-1B8B71D57B06@mac.com>
	<4A29561D.5070806@american.edu>
	<4A2ACA85.2000304@american.edu>
Message-ID: <4A2ADAB4.4040005@american.edu>

> On Sat, Jun 6, 2009 at 1:59 PM, Alan G Isaac wrote:
> For sure GAUSS does. The result of x' * A * x
> is a "matrix" (it has one row and one column) but
> it functions like a scalar (and even more,
> since right multiplication by it is also allowed).

On 6/6/2009 4:32 PM Charles R Harris apparently wrote:
> It's actually an inner product

Sorry for the confusion: I was just reporting
how GAUSS treats the expression x' * A * x.

Alan

From charlesr.harris at gmail.com Sat Jun 6 17:17:11 2009
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Sat, 6 Jun 2009 15:17:11 -0600
Subject: [Numpy-discussion] matrix default to column vector?
In-Reply-To: <4A2AD8FC.2080602@american.edu>
References: <75c31b2a0905210610x1a321264r5b6f93d327ef2b36@mail.gmail.com>
	<4A283F96.1050103@american.edu>
	<513936A2-C309-435B-B9AD-1B8B71D57B06@mac.com>
	<4A29561D.5070806@american.edu>
	<4A2ACA85.2000304@american.edu>
	<3d375d730906061330x5135781em7cdf6fc1c403bc16@mail.gmail.com>
	<4A2AD8FC.2080602@american.edu>
Message-ID:

On Sat, Jun 6, 2009 at 3:00 PM, Alan G Isaac wrote:

> On 6/6/2009 4:30 PM Robert Kern apparently wrote:
> > The old idea of introducing RowVector and ColumnVector would help
> > here. If x were a ColumnVector and A a Matrix, then you can introduce
> > the following rules:
> >
> > x.T is a RowVector
> > RowVector * ColumnVector is a scalar
> > RowVector * Matrix is a RowVector
> > Matrix * ColumnVector is a ColumnVector
>
> To me, a "row vector" is just a matrix with a single row,
> and a "column vector" is just a matrix with a single column.
> Calling them "vectors" is rather redundant, since matrices
> are also vectors (i.e., belong to a vector space).
>

Well, yes, linear mappings between vector spaces are also vector spaces,
but it is useful to make the distinction. Likewise, L(x,L(y,z)) is a
multilinear map that factors through the tensor product of x,y,z. So on
and so forth. At some point all these constructions are useful. But I
think it is pernicious for a first course in matrix algebra to not
distinguish between matrices and vectors. The abstraction to general
linear spaces can come later.

> I think the core of the row-vector/column-vector proposal
> is really the idea that we could have 1d objects that
> also have an "orientation" for the purposes of certain
> operations. But then why not just use matrices, which
> automatically provide that "orientation"?
>

Because at some point you want scalars.
In matrix algebra matrices are
generally considered maps between vector spaces. Covariance matrices
don't fit that paradigm, but that is skimmed over. It's kind of a mess,
actually.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From kwgoodman at gmail.com Sat Jun 6 18:02:44 2009
From: kwgoodman at gmail.com (Keith Goodman)
Date: Sat, 6 Jun 2009 15:02:44 -0700
Subject: [Numpy-discussion] Functions for indexing into certain parts of
	an array (2d)
In-Reply-To: <3d375d730906061401i611e2fe5j1d09c24cf8e49682@mail.gmail.com>
References: <3d375d730906060009r6efc1b70y6778408541fe6ce3@mail.gmail.com>
	<3d375d730906061401i611e2fe5j1d09c24cf8e49682@mail.gmail.com>
Message-ID:

On Sat, Jun 6, 2009 at 2:01 PM, Robert Kern wrote:
> There is a neat trick for accessing the diagonal of an existing array
> (a.flat[::a.shape[1]+1]), but it won't work to implement
> diag_indices().

Perfect. That's 3x faster.

def fill_diag(arr, value):
    if arr.ndim != 2:
        raise ValueError, "Input must be 2-d."
    if arr.shape[0] != arr.shape[1]:
        raise ValueError, 'Input must be square.'
    arr.flat[::arr.shape[1]+1] = value

From aisaac at american.edu Sat Jun 6 18:34:22 2009
From: aisaac at american.edu (Alan G Isaac)
Date: Sat, 06 Jun 2009 18:34:22 -0400
Subject: [Numpy-discussion] Functions for indexing into certain parts of
	an array (2d)
In-Reply-To:
References: <3d375d730906060009r6efc1b70y6778408541fe6ce3@mail.gmail.com>
	<3d375d730906061401i611e2fe5j1d09c24cf8e49682@mail.gmail.com>
Message-ID: <4A2AEEEE.5020602@american.edu>

On 6/6/2009 6:02 PM Keith Goodman apparently wrote:
>> def fill_diag(arr, value):
>     if arr.ndim != 2:
>         raise ValueError, "Input must be 2-d."
>     if arr.shape[0] != arr.shape[1]:
>         raise ValueError, 'Input must be square.'
>     arr.flat[::arr.shape[1]+1] = value

You might want to check for contiguity.
See diagrv in pyGAUSS.py:
http://code.google.com/p/econpy/source/browse/trunk/pytrix/pyGAUSS.py

Alan Isaac

From robert.kern at gmail.com Sat Jun 6 18:39:45 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Sat, 6 Jun 2009 17:39:45 -0500
Subject: [Numpy-discussion] Functions for indexing into certain parts of
	an array (2d)
In-Reply-To: <4A2AEEEE.5020602@american.edu>
References: <3d375d730906060009r6efc1b70y6778408541fe6ce3@mail.gmail.com>
	<3d375d730906061401i611e2fe5j1d09c24cf8e49682@mail.gmail.com>
	<4A2AEEEE.5020602@american.edu>
Message-ID: <3d375d730906061539i51805fddm62541edbb1ca2ad7@mail.gmail.com>

On Sat, Jun 6, 2009 at 17:34, Alan G Isaac wrote:
> On 6/6/2009 6:02 PM Keith Goodman apparently wrote:
>>> def fill_diag(arr, value):
>>     if arr.ndim != 2:
>>         raise ValueError, "Input must be 2-d."
>>     if arr.shape[0] != arr.shape[1]:
>>         raise ValueError, 'Input must be square.'
>>     arr.flat[::arr.shape[1]+1] = value
>
> You might want to check for contiguity.
> See diagrv in pyGAUSS.py:
> http://code.google.com/p/econpy/source/browse/trunk/pytrix/pyGAUSS.py

Ah, that's the beauty of .flat; it takes care of that for you. .flat
is not a view onto the memory directly. It is a not-quite-a-view onto
what the memory *would* be if the array were contiguous and the memory
directly reflected the layout as seen by Python.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
 -- Umberto Eco

From gnurser at googlemail.com Sat Jun 6 19:07:02 2009
From: gnurser at googlemail.com (George Nurser)
Date: Sun, 7 Jun 2009 00:07:02 +0100
Subject: [Numpy-discussion] svn numpy not building on osx 10.5.6,
	python.org python 2.5.2
Message-ID: <1d1e6ea70906061607h1dd822c6w21f66580925309a7@mail.gmail.com>

Hi,
the current svn version 7039 isn't compiling for me. Clean checkout, old
numpy directories removed from site-packages. Same command did work for
svn r 6329

[george-nursers-macbook-pro-15:~/src/numpy] agn% python setup.py
config_fc --fcompiler=gnu95 build_clib --fcompiler=gnu95 build_ext
--fcompiler=gnu95 install
Running from numpy source directory.
F2PY Version 2_7039
numpy/core/setup_common.py:81: MismatchCAPIWarning: API mismatch
detected, the C API version numbers have to be updated. Current C api
version is 3, with checksum c80bc716a6f035470a6f3f448406d9d5, but
recorded checksum for C API version 3 in codegen_dir/cversions.txt is
bf22c0d05b31625d2a7015988d61ce5a. If functions were added in the C API,
you have to update C_API_VERSION in numpy/core/setup_common.pyc.
  MismatchCAPIWarning)
blas_opt_info:
  FOUND:
    extra_link_args = ['-Wl,-framework', '-Wl,Accelerate']
    define_macros = [('NO_ATLAS_INFO', 3)]
    extra_compile_args = ['-msse3',
'-I/System/Library/Frameworks/vecLib.framework/Headers']

lapack_opt_info:
  FOUND:
    extra_link_args = ['-Wl,-framework', '-Wl,Accelerate']
    define_macros = [('NO_ATLAS_INFO', 3)]
    extra_compile_args = ['-msse3']

running config_fc
unifing config_fc, config, build_clib, build_ext, build commands
--fcompiler options
running build_clib
customize UnixCCompiler
customize UnixCCompiler using build_clib
building 'npymath' library
compiling C sources
C compiler: gcc -arch ppc -arch i386 -isysroot
/Developer/SDKs/MacOSX10.4u.sdk -fno-strict-aliasing -Wno-long-double
-no-cpp-precomp -mno-fused-madd -fno-common -dynamic -DNDEBUG -g -O3

error: unknown file type '.src' (from 'numpy/core/src/npy_math.c.src')

--George.

From aisaac at american.edu Sat Jun 6 19:34:14 2009
From: aisaac at american.edu (Alan G Isaac)
Date: Sat, 06 Jun 2009 19:34:14 -0400
Subject: [Numpy-discussion] Functions for indexing into certain parts of
	an array (2d)
In-Reply-To: <3d375d730906061539i51805fddm62541edbb1ca2ad7@mail.gmail.com>
References: <3d375d730906060009r6efc1b70y6778408541fe6ce3@mail.gmail.com>
	<3d375d730906061401i611e2fe5j1d09c24cf8e49682@mail.gmail.com>
	<4A2AEEEE.5020602@american.edu>
	<3d375d730906061539i51805fddm62541edbb1ca2ad7@mail.gmail.com>
Message-ID: <4A2AFCF6.4060505@american.edu>

On 6/6/2009 6:39 PM Robert Kern apparently wrote:
> Ah, that's the beauty of .flat; it takes care of that for you. .flat
> is not a view onto the memory directly. It is a not-quite-a-view onto
> what the memory *would* be if the array were contiguous and the memory
> directly reflected the layout as seen by Python.

Aha. Thanks!

Alan
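To make the quoted point concrete, a small interactive check (a sketch
assuming nothing beyond numpy itself): the .flat trick hits the logical
diagonal even for a Fortran-ordered, non-C-contiguous array, because
flatiter always walks the array in C order.

>>> import numpy as np
>>> a = np.asfortranarray(np.zeros((3, 3)))
>>> a.flags['C_CONTIGUOUS']
False
>>> a.flat[::a.shape[1] + 1] = 7   # no contiguity check needed
>>> a
array([[ 7.,  0.,  0.],
       [ 0.,  7.,  0.],
       [ 0.,  0.,  7.]])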
From robince at gmail.com Sat Jun 6 19:53:21 2009
From: robince at gmail.com (Robin)
Date: Sun, 7 Jun 2009 00:53:21 +0100
Subject: [Numpy-discussion] NameError: name 'numeric' is not defined
Message-ID:

Hi,

I just updated to latest numpy svn:
In [10]: numpy.__version__
Out[10]: '1.4.0.dev7039'

It seemed to build fine, but I am getting a lot of errors testing it:

----------------------------------------------------------------------
Ran 178 tests in 0.655s

FAILED (errors=138)
Out[8]:

Almost all the errors look the same:

======================================================================
ERROR: test_shape (test_ctypeslib.TestNdpointer)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/tests/test_ctypeslib.py",
line 83, in test_shape
    self.assert_(p.from_param(np.array([[1,2]])))
  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ctypeslib.py",
line 171, in from_param
    return obj.ctypes
  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/core/__init__.py",
line 27, in <module>
    __all__ += numeric.__all__
NameError: name 'numeric' is not defined

I haven't seen this before - is it something wrong with my build or
the current svn state? I am using macports python 2.5.4 on os x 10.5.7

Cheers

Robin

From robince at gmail.com Sat Jun 6 20:28:38 2009
From: robince at gmail.com (Robin)
Date: Sun, 7 Jun 2009 01:28:38 +0100
Subject: [Numpy-discussion] NameError: name 'numeric' is not defined
In-Reply-To:
References:
Message-ID:

On Sun, Jun 7, 2009 at 12:53 AM, Robin wrote:
> I haven't seen this before - is it something wrong with my build or
> the current svn state? I am using macports python 2.5.4 on os x 10.5.7

Hmmm... after rebuilding from the same version the problem seems to
have gone away... sorry for the noise...

Robin

From tpk at kraussfamily.org Sat Jun 6 21:57:06 2009
From: tpk at kraussfamily.org (Tom K.)
Date: Sat, 6 Jun 2009 18:57:06 -0700 (PDT)
Subject: [Numpy-discussion] matrix default to column vector?
In-Reply-To:
References: <75c31b2a0905210610x1a321264r5b6f93d327ef2b36@mail.gmail.com>
	<23693116.post@talk.nabble.com>
	<4A283BD5.4080405@noaa.gov>
	<3699E128-2BCA-45E1-97A1-B0DE3B8E8B74@mac.com>
	<4A283F96.1050103@american.edu>
	<513936A2-C309-435B-B9AD-1B8B71D57B06@mac.com>
	<4A29561D.5070806@american.edu>
	<4A2A8B50.9030904@american.edu>
Message-ID: <23907204.post@talk.nabble.com>

Fernando Perez wrote:
>
> On Sat, Jun 6, 2009 at 11:03 AM, Charles R
> Harris wrote:
>
>> I don't think we can change the current matrix class, to do so would
>> break
>> too much code. It would be nice to extend it with an explicit inner
>> product,
>> but I can't think of any simple notation for it that python would parse.
>
> Maybe it's time to make another push on python-dev for the pep-225
> stuff for other operators?
>
> https://cirl.berkeley.edu/fperez/static/numpy-pep225/
>
> Last year I got pretty much zero interest from python-dev on this, but
> they were very very busy with 3.0 on the horizon. Perhaps once they
> put 3.1 out would be a good time to champion this again.
>
> It's slightly independent of the matrix class debate, but perhaps
> having special operators for real matrix multiplication could ease
> some of the bottlenecks of this discussion.
>
> It would be great if someone could champion that discussion on
> python-dev though, I don't see myself finding the time for it another
> time around...
>

How about pep 211? http://www.python.org/dev/peps/pep-0211/

PEP 211 proposes a single new operator (@) that could be used for matrix
multiplication. MATLAB has elementwise versions of multiply,
exponentiation, and left and right division using a preceding "." for the
usual matrix versions (* ^ \ /). PEP 225 proposes "tilde" versions of
+ - * / % **. While PEP 225 would allow a matrix exponentiation and right
divide, I think these things are much less common than matrix multiply.
Plus, I think following through with the PEP 225 implementation would
create a Frankenstein of a language that would be hard to read.

So, I would argue for pushing for a single new operator that can then be
used to implement "dot" with a binary infix operator. We can resurrect
PEP 211 or start a new PEP or whatever, the main thing is to have a
proposal that makes sense.

Actually, what do you all think of this:
 @ --> matrix multiply
 @@ --> matrix exponentiation

and we leave it at that - let's not get too greedy and try for matrix
inverse via @/ or something.

For the nd array operator, I would propose taking the last dimension of
the left array and "collapsing" it with the first dimension of the right
array, so shape
 (a0, ..., aL-1, k) @ (k, b0, ..., bM-1) --> (a0, ..., aL-1, b0, ..., bM-1)
Does that make sense?

With this proposal, matrices go away and all our lives are sane again. :-)
Long live the numpy ndarray!

Thanks to the creators for all your hard work BTW - I love this stuff!

- Tom K.
--
View this message in context: http://www.nabble.com/matrix-default-to-column-vector--tp23652920p23907204.html
Sent from the Numpy-discussion mailing list archive at Nabble.com.

From ramercer at gmail.com Sat Jun 6 22:57:36 2009
From: ramercer at gmail.com (Adam Mercer)
Date: Sat, 6 Jun 2009 21:57:36 -0500
Subject: [Numpy-discussion] scipy 0.7.1rc2 released
In-Reply-To: <5b8d13220906050409u30286931w7bd9aac1e01b9ebf@mail.gmail.com>
References: <5b8d13220906050409u30286931w7bd9aac1e01b9ebf@mail.gmail.com>
Message-ID: <799406d60906061957w2b0bd6c9n33fb898a7fc16e28@mail.gmail.com>

On Fri, Jun 5, 2009 at 06:09, David Cournapeau wrote:

> Please test it ! I am particularly interested in results for scipy
> binaries on mac os x (do they work on ppc).

Test suite passes on Intel Mac OS X (10.5.7) built from source:

OK (KNOWNFAIL=6, SKIP=21)

Cheers

Adam

From ralf.gommers at googlemail.com Sat Jun 6 23:49:36 2009
From: ralf.gommers at googlemail.com (Ralf Gommers)
Date: Sat, 6 Jun 2009 23:49:36 -0400
Subject: [Numpy-discussion] reciprocal(0)
Message-ID:

Hi,

I expect `reciprocal(x)` to calculate 1/x, and for input 0 to either
follow the python rules or give the np.divide(1, 0) result. However the
result returned (with numpy trunk) is:

>>> np.reciprocal(0)
-2147483648

>>> np.divide(1, 0)
0
>>> 1/0
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ZeroDivisionError: integer division or modulo by zero

The result for a zero float argument is inf as expected. I want to
document the correct behavior for integers, what should it be?

Cheers,
Ralf
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From llewelr at gmail.com Sun Jun 7 00:11:10 2009
From: llewelr at gmail.com (llewelr at gmail.com)
Date: Sun, 07 Jun 2009 04:11:10 +0000
Subject: [Numpy-discussion] is my numpy installation using custom blas/lapack?
In-Reply-To: <7f014ea60906061511k7f58d501yf90a29932ce696b8@mail.gmail.com>
Message-ID: <00221532c9dcd16924046bba504e@google.com>

Hi,

On Jun 6, 2009 3:11pm, Chris Colbert wrote:
> it definitely found your threaded atlas libraries. How do you know
> numpy is using lapack_lite?

I don't, actually. But it is importing it. With python -v, this is the
error I get if I don't set LD_LIBRARY_PATH to my scipy_build directory:

import numpy.linalg.linalg # precompiled from
/data10/users/rich/usr/galois/lib64/python/numpy/linalg/linalg.pyc
dlopen("/data10/users/rich/usr/galois/lib64/python/numpy/linalg/lapack_lite.so", 2);
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/data10/users/rich/usr/galois//lib64/python/numpy/__init__.py",
line 130, in <module>
    import add_newdocs
  File "/data10/users/rich/usr/galois//lib64/python/numpy/add_newdocs.py",
line 9, in <module>
    from lib import add_newdoc
  File "/data10/users/rich/usr/galois//lib64/python/numpy/lib/__init__.py",
line 13, in <module>
    from polynomial import *
  File "/data10/users/rich/usr/galois//lib64/python/numpy/lib/polynomial.py",
line 18, in <module>
    from numpy.linalg import eigvals, lstsq
  File "/data10/users/rich/usr/galois//lib64/python/numpy/linalg/__init__.py",
line 47, in <module>
    from linalg import *
  File "/data10/users/rich/usr/galois//lib64/python/numpy/linalg/linalg.py",
line 22, in <module>
    from numpy.linalg import lapack_lite
ImportError: liblapack.so: cannot open shared object file: No such file
or directory

Here blas_opt_info seems to be missing the ATLAS version.

>>> numpy.show_config()
atlas_threads_info:
    libraries = ['lapack', 'lapack', 'f77blas', 'cblas', 'atlas']
    library_dirs = ['/usr/local/rich/src/scipy_build/lib']
    language = f77

blas_opt_info:
    libraries = ['lapack', 'f77blas', 'cblas', 'atlas']
    library_dirs = ['/usr/local/rich/src/scipy_build/lib']
    define_macros = [('NO_ATLAS_INFO', 2)]
    language = c

atlas_blas_threads_info:
    libraries = ['lapack', 'f77blas', 'cblas', 'atlas']
    library_dirs = ['/usr/local/rich/src/scipy_build/lib']
    language = c

lapack_opt_info:
    libraries = ['lapack', 'lapack', 'f77blas', 'cblas', 'atlas']
    library_dirs = ['/usr/local/rich/src/scipy_build/lib']
    define_macros = [('NO_ATLAS_INFO', 2)]
    language = f77

lapack_mkl_info:
  NOT AVAILABLE

blas_mkl_info:
  NOT AVAILABLE

mkl_info:
  NOT AVAILABLE

> when I do:
> python
> >>import numpy
> >>numpy.show_config()
> atlas_threads_info:
>     libraries = ['lapack', 'ptf77blas', 'ptcblas', 'atlas']
>     library_dirs = ['/usr/local/lib']
>     language = f77
> blas_opt_info:
>     libraries = ['ptf77blas', 'ptcblas', 'atlas']
>     library_dirs = ['/usr/local/lib']
>     define_macros = [('ATLAS_INFO', '"\\"3.8.3\\""')]
>     language = c
> atlas_blas_threads_info:
>     libraries = ['ptf77blas', 'ptcblas', 'atlas']
>     library_dirs = ['/usr/local/lib']
>     language = c
> lapack_opt_info:
>     libraries = ['lapack', 'ptf77blas', 'ptcblas', 'atlas']
>     library_dirs = ['/usr/local/lib']
>     define_macros = [('NO_ATLAS_INFO', 2)]
>     language = f77
> lapack_mkl_info:
>   NOT AVAILABLE
> blas_mkl_info:
>   NOT AVAILABLE
> mkl_info:
>   NOT AVAILABLE

> also try:
> >>> a = numpy.random.randn(6000, 6000)
> >>> numpy.dot(a,a)
> and make sure all your cpu cores peg at 100%

Unfortunately only one cpu. What does that mean? Threaded libraries not
used? from top:

Cpu0 :100.0%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu1 : 0.0%us, 0.2%sy, 0.0%ni, 99.4%id, 0.0%wa, 0.2%hi, 0.2%si, 0.0%st
Cpu2 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu3 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st

Thanks much for the help.

Rich
> On Sat, Jun 6, 2009 at 3:35 PM, llewelr at gmail.com> wrote:
> > Oops. Thanks, that makes more sense:
> >
> > http://pastebin.com/m7067709b
> >
> > On Jun 6, 2009 12:15pm, Chris Colbert sccolbert at gmail.com> wrote:
> >> i need the full link to pastebin.com in order to view your post.
> >>
> >> It will be something like http://pastebin.com/m6b09f05c
> >>
> >> chris
> >>
> >> On Sat, Jun 6, 2009 at 2:32 PM, Richard Llewellynllewelr at gmail.com> wrote:
> >> > I posted the setup.py build output to pastebin.com, though missed the
> >> > uninteresting stderr (forgot tcsh command to redirect both).
> >> > Also, used setup.py build --fcompiler=gnu95.
> >> >
> >> > To be clear, I am not certain that my ATLAS libraries are not found. But
> >> > during the build starting at line 95 (pastebin.com) I see a compilation
> >> > failure, and then NO_ATLAS_INFO, 2.
> >> >
> >> > I don't think I can use ldconfig without root, but have set LD_LIBRARY_PATH
> >> > to point to the scipy_build/lib until I put them somewhere else.
> >> >
> >> > importing numpy works, though lapack_lite is also imported. I wonder if this
> >> > is normal even if my ATLAS was used.
> >> >
> >> > Thanks,
> >> > Rich
> >> >
> >> > On Sat, Jun 6, 2009 at 10:46 AM, Chris Colbert sccolbert at gmail.com> wrote:
> >> >>
> >> >> and where exactly are you seeing atlas not found? during the build
> >> >> process, or when you import numpy in python?
> >> >>
> >> >> if it's the latter, you need to add a .conf file in /etc/ld.so.conf.d/
> >> >> with the line /usr/local/rich/src/scipy_build/lib and then run sudo
> >> >> ldconfig
> >> >>
> >> >> Chris
> >> >>
> >> >> On Sat, Jun 6, 2009 at 1:42 PM, Chris Colbertsccolbert at gmail.com> wrote:
> >> >> > can you run this and post the build.log to pastebin.com:
> >> >> >
> >> >> > assuming your numpy build directory is /home/numpy-1.3.0:
> >> >> >
> >> >> > cd /home/numpy-1.3.0
> >> >> > rm -rf build
> >> >> > python setup.py build >& build.log
> >> >> >
> >> >> > Chris
> >> >> >
> >> >> > On Sat, Jun 6, 2009 at 1:37 PM, Richard Llewellynllewelr at gmail.com>
> >> >> > wrote:
> >> >> >> Hi Chris,
> >> >> >> thanks much for posting those installation instructions. Seems
> >> >> >> similar to what I pieced together.
> >> >> >>
> >> >> >> I gather ATLAS not found. Oops, drank that beer too early.
> >> >> >>
> >> >> >> I copied Atlas libs to /usr/local/rich/src/scipy_build/lib.
> >> >> >>
> >> >> >> This is my site.cfg. Out of desperation I tried search_static_first = 1,
> >> >> >> but probably of no use.
> >> >> >>
> >> >> >> [DEFAULT]
> >> >> >> library_dirs = /usr/local/rich/src/scipy_build/lib:$HOME/usr/galois/lib
> >> >> >> include_dirs = /usr/local/rich/src/scipy_build/lib/include:$HOME/usr/galois/include
> >> >> >> search_static_first = 1
> >> >> >>
> >> >> >> [blas_opt]
> >> >> >> libraries = f77blas, cblas, atlas
> >> >> >>
> >> >> >> [lapack_opt]
> >> >> >> libraries = lapack, f77blas, cblas, atlas
> >> >> >>
> >> >> >> [amd]
> >> >> >> amd_libs = amd
> >> >> >>
> >> >> >> [umfpack]
> >> >> >> umfpack_libs = umfpack, gfortran
> >> >> >>
> >> >> >> [fftw]
> >> >> >> libraries = fftw3
> >> >> >>
> >> >> >> Rich
> >> >> >>
> >> >> >> On Sat, Jun 6, 2009 at 10:25 AM, Chris Colbert sccolbert at gmail.com> wrote:
> >> >> >>>
> >> >> >>> when you build numpy, did you use site.cfg to tell it where to find
> >> >> >>> your atlas libs?
> >> >> >>>
> >> >> >>> On Sat, Jun 6, 2009 at 1:02 PM, Richard Llewellynllewelr at gmail.com> wrote:
> >> >> >>> > Hello,
> >> >> >>> >
> >> >> >>> > I've managed a build of lapack and atlas on Fedora 10 on a quad core, 64,
> >> >> >>> > and now (...) have a numpy I can import that runs tests ok. :] I am
> >> >> >>> > puzzled, however, that numpy builds and imports lapack_lite. Does this mean
> >> >> >>> > I have a problem with the build(s)?
> >> >> >>> > Upon building numpy, I see the troubling output:
> >> >> >>> >
> >> >> >>> > ########################
> >> >> >>> >
> >> >> >>> > C compiler: gcc -pthread -fno-strict-aliasing -DNDEBUG -O2 -g -pipe -Wall
> >> >> >>> > -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector
> >> >> >>> > --param=ssp-buffer-size=4 -m64 -mtune=generic -D_GNU_SOURCE -fPIC -fPIC
> >> >> >>> >
> >> >> >>> > compile options: '-c'
> >> >> >>> > gcc: _configtest.c
> >> >> >>> > gcc -pthread _configtest.o -L/usr/local/rich/src/scipy_build/lib -llapack
> >> >> >>> > -lptf77blas -lptcblas -latlas -o _configtest
> >> >> >>> > /usr/bin/ld: _configtest: hidden symbol `__powidf2' in
> >> >> >>> > /usr/lib/gcc/x86_64-redhat-linux/4.3.2/libgcc.a(_powidf2.o) is referenced by DSO
> >> >> >>> > /usr/bin/ld: final link failed: Nonrepresentable section on output
> >> >> >>> > collect2: ld returned 1 exit status
> >> >> >>> > /usr/bin/ld: _configtest: hidden symbol `__powidf2' in
> >> >> >>> > /usr/lib/gcc/x86_64-redhat-linux/4.3.2/libgcc.a(_powidf2.o) is referenced by DSO
> >> >> >>> > /usr/bin/ld: final link failed: Nonrepresentable section on output
> >> >> >>> > collect2: ld returned 1 exit status
> >> >> >>> > failure.
> >> >> >>> > removing: _configtest.c _configtest.o
> >> >> >>> > Status: 255
> >> >> >>> > Output:
> >> >> >>> >   FOUND:
> >> >> >>> >     libraries = ['lapack', 'ptf77blas', 'ptcblas', 'atlas']
> >> >> >>> >     library_dirs = ['/usr/local/rich/src/scipy_build/lib']
> >> >> >>> >     language = f77
> >> >> >>> >     define_macros = [('NO_ATLAS_INFO', 2)]
> >> >> >>> >
> >> >> >>> > ##########################
> >> >> >>> >
> >> >> >>> > I don't have root on this machine, but could pester admins for eventual
> >> >> >>> > temporary access.
> >> >> >>> >
> >> >> >>> > Thanks much for any help,
> >> >> >>> > Rich
> >> >> >>> >
> >> >> >>> > _______________________________________________
> >> >> >>> > Numpy-discussion mailing list
> >> >> >>> > Numpy-discussion at scipy.org
> >> >> >>> > http://mail.scipy.org/mailman/listinfo/numpy-discussion
> >> >> >>> >
> >> >> >>> _______________________________________________
> >> >> >>> Numpy-discussion mailing list
> >> >> >>> Numpy-discussion at scipy.org
> >> >> >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
> >> >> >>
> >> >> >> _______________________________________________
> >> >> >> Numpy-discussion mailing list
> >> >> >> Numpy-discussion at scipy.org
> >> >> >> http://mail.scipy.org/mailman/listinfo/numpy-discussion
> >> >>
> >> >> _______________________________________________
> >> >> Numpy-discussion mailing list
> >> >> Numpy-discussion at scipy.org
> >> >> http://mail.scipy.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
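One more cheap check for the lapack_lite puzzle in this thread (a sketch
based on numpy 1.3's build behavior: the optimized _dotblas extension is
compiled only when a usable CBLAS is found at build time):

>>> import numpy.core._dotblas  # present only if numpy was built against a CBLAS

An ImportError there means numpy.dot is using the unoptimized fallback.
Importing lapack_lite by itself is normal either way: numpy always builds
a module with that name, and links it against the system LAPACK/ATLAS
only when one was detected at build time.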
With python -v, this is the error I get if I don't set LD_LIBRARY_PATH to my scipy_build directory import numpy.linalg.linalg # precompiled from /data10/users/rich/usr/galois/lib64/python/numpy/linalg/linalg.pyc dlopen("/data10/users/rich/usr/galois/lib64/python/numpy/linalg/lapack_lite.so", 2); Traceback (most recent call last): File "", line 1, in File "/data10/users/rich/usr/galois//lib64/python/numpy/__init__.py", line 130, in import add_newdocs File "/data10/users/rich/usr/galois//lib64/python/numpy/add_newdocs.py", line 9, in from lib import add_newdoc File "/data10/users/rich/usr/galois//lib64/python/numpy/lib/__init__.py", line 13, in from polynomial import * File "/data10/users/rich/usr/galois//lib64/python/numpy/lib/polynomial.py", line 18, in from numpy.linalg import eigvals, lstsq File "/data10/users/rich/usr/galois//lib64/python/numpy/linalg/__init__.py", line 47, in from linalg import * File "/data10/users/rich/usr/galois//lib64/python/numpy/linalg/linalg.py", line 22, in from numpy.linalg import lapack_lite ImportError: liblapack.so: cannot open shared object file: No such file or directory Here blas_opt_info seems to be missing ATLAS version. >>> numpy.show_config() atlas_threads_info: libraries = ['lapack', 'lapack', 'f77blas', 'cblas', 'atlas'] library_dirs = ['/usr/local/rich/src/scipy_build/lib'] language = f77 blas_opt_info: libraries = ['lapack', 'f77blas', 'cblas', 'atlas'] library_dirs = ['/usr/local/rich/src/scipy_build/lib'] define_macros = [('NO_ATLAS_INFO', 2)] language = c atlas_blas_threads_info: libraries = ['lapack', 'f77blas', 'cblas', 'atlas'] library_dirs = ['/usr/local/rich/src/scipy_build/lib'] language = c lapack_opt_info: libraries = ['lapack', 'lapack', 'f77blas', 'cblas', 'atlas'] library_dirs = ['/usr/local/rich/src/scipy_build/lib'] define_macros = [('NO_ATLAS_INFO', 2)] language = f77 lapack_mkl_info: NOT AVAILABLE blas_mkl_info: NOT AVAILABLE mkl_info: NOT AVAILABLE > when I do: > python > >>import numpy > >>numpy.show_config() > atlas_threads_info: > libraries = ['lapack', 'ptf77blas', 'ptcblas', 'atlas'] > library_dirs = ['/usr/local/lib'] > language = f77 > blas_opt_info: > libraries = ['ptf77blas', 'ptcblas', 'atlas'] > library_dirs = ['/usr/local/lib'] > define_macros = [('ATLAS_INFO', '"\\"3.8.3\\""')] > language = c > atlas_blas_threads_info: > libraries = ['ptf77blas', 'ptcblas', 'atlas'] > library_dirs = ['/usr/local/lib'] > language = c > lapack_opt_info: > libraries = ['lapack', 'ptf77blas', 'ptcblas', 'atlas'] > library_dirs = ['/usr/local/lib'] > define_macros = [('NO_ATLAS_INFO', 2)] > language = f77 > lapack_mkl_info: > NOT AVAILABLE > blas_mkl_info: > NOT AVAILABLE > mkl_info: > NOT AVAILABLE > also try: > >>> a = numpy.random.randn(6000, 6000) > >>> numpy.dot(a,a) > and make sure all your cpu cores peg at 100% Unfortunately only one cpu. What does that mean? Threaded libraries not used? from top: Cpu0 :100.0%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Cpu1 : 0.0%us, 0.2%sy, 0.0%ni, 99.4%id, 0.0%wa, 0.2%hi, 0.2%si, 0.0%st Cpu2 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Cpu3 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Thanks much for the help. Rich > On Sat, Jun 6, 2009 at 3:35 PM, llewelr at gmail.com> wrote: > > Oops. Thanks, that makes more sense: > > > > http://pastebin.com/m7067709b > > > > On Jun 6, 2009 12:15pm, Chris Colbert sccolbert at gmail.com> wrote: > >> i need the full link to pastebin.com in order to view your post. 
> >> > >> > >> > >> It will be something like http://pastebin.com/m6b09f05c > >> > >> > >> > >> > >> > >> chris > >> > >> > >> > >> > >> > >> On Sat, Jun 6, 2009 at 2:32 PM, Richard Llewellynllewelr at gmail.com> > wrote: > >> > >> > I posted the setup.py build output to pastebin.com, though missed the > >> > >> > uninteresting stderr (forgot tcsh command to redirect both). > >> > >> > Also, used setup.py build --fcompiler=gnu95. > >> > >> > > >> > >> > > >> > >> > To be clear, I am not certain that my ATLAS libraries are not found. > But > >> > >> > during the build starting at line 95 (pastebin.com) I see a > compilation > >> > >> > failure, and then NO_ATLAS_INFO, 2. > >> > >> > > >> > >> > I don't think I can use ldconfig without root, but have set > >> > LD_LIBRARY_PATH > >> > >> > to point to the scipy_build/lib until I put them somewhere else. > >> > >> > > >> > >> > importing numpy works, though lapack_lite is also imported. I wonder > if > >> > this > >> > >> > is normal even if my ATLAS was used. > >> > >> > > >> > >> > Thanks, > >> > >> > Rich > >> > >> > > >> > >> > On Sat, Jun 6, 2009 at 10:46 AM, Chris Colbert sccolbert at gmail.com> > >> > wrote: > >> > >> >> > >> > >> >> and where exactly are you seeing atlas not found? during the build > >> > >> >> process, are when import numpy in python? > >> > >> >> > >> > >> >> if its the latter, you need to add a .conf file in > /etc/ld.so.conf.d/ > >> > >> >> with the line /usr/local/rich/src/scipy_build/lib and then run sudo > >> > >> >> ldconfig > >> > >> >> > >> > >> >> Chris > >> > >> >> > >> > >> >> > >> > >> >> On Sat, Jun 6, 2009 at 1:42 PM, Chris Colbertsccolbert at gmail.com> > >> >> wrote: > >> > >> >> > can you run this and post the build.log to pastebin.com: > >> > >> >> > > >> > >> >> > assuming your numpy build directory is /home/numpy-1.3.0: > >> > >> >> > > >> > >> >> > cd /home/numpy-1.3.0 > >> > >> >> > rm -rf build > >> > >> >> > python setup.py build &&> build.log > >> > >> >> > > >> > >> >> > > >> > >> >> > Chris > >> > >> >> > > >> > >> >> > > >> > >> >> > On Sat, Jun 6, 2009 at 1:37 PM, Richard > Llewellynllewelr at gmail.com> > >> > >> >> > wrote: > >> > >> >> >> Hi Chris, > >> > >> >> >> thanks much for posting those installation instructions. Seems > >> > >> >> >> similar to > >> > >> >> >> what I pieced together. > >> > >> >> >> > >> > >> >> >> I gather ATLAS not found. Oops, drank that beer too early. > >> > >> >> >> > >> > >> >> >> I copied Atlas libs to /usr/local/rich/src/scipy_build/lib. > >> > >> >> >> > >> > >> >> >> This is my site.cfg. Out of desperation I tried > search_static_first > >> >> >> = > >> > >> >> >> 1, > >> > >> >> >> but probably of no use. 
> >> > >> >> >> > >> > >> >> >> [DEFAULT] > >> > >> >> >> library_dirs = > >> >> >> /usr/local/rich/src/scipy_build/lib:$HOME/usr/galois/lib > >> > >> >> >> include_dirs = > >> > >> >> >> > /usr/local/rich/src/scipy_build/lib/include:$HOME/usr/galois/include > >> > >> >> >> search_static_first = 1 > >> > >> >> >> > >> > >> >> >> [blas_opt] > >> > >> >> >> libraries = f77blas, cblas, atlas > >> > >> >> >> > >> > >> >> >> [lapack_opt] > >> > >> >> >> libraries = lapack, f77blas, cblas, atlas > >> > >> >> >> > >> > >> >> >> [amd] > >> > >> >> >> amd_libs = amd > >> > >> >> >> > >> > >> >> >> [umfpack] > >> > >> >> >> umfpack_libs = umfpack, gfortran > >> > >> >> >> > >> > >> >> >> [fftw] > >> > >> >> >> libraries = fftw3 > >> > >> >> >> > >> > >> >> >> > >> > >> >> >> Rich > >> > >> >> >> > >> > >> >> >> > >> > >> >> >> > >> > >> >> >> > >> > >> >> >> On Sat, Jun 6, 2009 at 10:25 AM, Chris Colbert > sccolbert at gmail.com> > >> > >> >> >> wrote: > >> > >> >> >>> > >> > >> >> >>> when you build numpy, did you use site.cfg to tell it where to > find > >> > >> >> >>> your atlas libs? > >> > >> >> >>> > >> > >> >> >>> On Sat, Jun 6, 2009 at 1:02 PM, Richard > Llewellynllewelr at gmail.com> > >> > >> >> >>> wrote: > >> > >> >> >>> > Hello, > >> > >> >> >>> > > >> > >> >> >>> > I've managed a build of lapack and atlas on Fedora 10 on a > quad > >> > >> >> >>> > core, > >> > >> >> >>> > 64, > >> > >> >> >>> > and now (...) have a numpy I can import that runs tests ok. :] > >> >> >>> > I > >> > >> >> >>> > am > >> > >> >> >>> > puzzled, however, that numpy builds and imports lapack_lite. > >> >> >>> > Does > >> > >> >> >>> > this > >> > >> >> >>> > mean > >> > >> >> >>> > I have a problem with the build(s)? > >> > >> >> >>> > Upon building numpy, I see the troubling output: > >> > >> >> >>> > > >> > >> >> >>> > ######################## > >> > >> >> >>> > > >> > >> >> >>> > C compiler: gcc -pthread -fno-strict-aliasing -DNDEBUG -O2 -g > >> >> >>> > -pipe > >> > >> >> >>> > -Wall > >> > >> >> >>> > -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protecto > >> > >> >> >>> > r --param=ssp-buffer-size=4 -m64 -mtune=generic -D_GNU_SOURCE > >> >> >>> > -fPIC > >> > >> >> >>> > -fPIC > >> > >> >> >>> > > >> > >> >> >>> > compile options: '-c' > >> > >> >> >>> > gcc: _configtest.c > >> > >> >> >>> > gcc -pthread _configtest.o > -L/usr/local/rich/src/scipy_build/lib > >> > >> >> >>> > -llapack > >> > >> >> >>> > -lptf77blas -lptcblas -latlas -o _configtest > >> > >> >> >>> > /usr/bin/ld: _configtest: hidden symbol `__powidf2' in > >> > >> >> >>> > /usr/lib/gcc/x86_64-redhat-linux/4.3.2/libgcc.a(_powidf2.o) is > >> > >> >> >>> > reference > >> > >> >> >>> > d by DSO > >> > >> >> >>> > /usr/bin/ld: final link failed: Nonrepresentable section on > >> >> >>> > output > >> > >> >> >>> > collect2: ld returned 1 exit status > >> > >> >> >>> > /usr/bin/ld: _configtest: hidden symbol `__powidf2' in > >> > >> >> >>> > /usr/lib/gcc/x86_64-redhat-linux/4.3.2/libgcc.a(_powidf2.o) is > >> > >> >> >>> > reference > >> > >> >> >>> > d by DSO > >> > >> >> >>> > /usr/bin/ld: final link failed: Nonrepresentable section on > >> >> >>> > output > >> > >> >> >>> > collect2: ld returned 1 exit status > >> > >> >> >>> > failure. 
> >> > >> >> >>> > removing: _configtest.c _configtest.o > >> > >> >> >>> > Status: 255 > >> > >> >> >>> > Output: > >> > >> >> >>> > FOUND: > >> > >> >> >>> > libraries = ['lapack', 'ptf77blas', 'ptcblas', 'atlas'] > >> > >> >> >>> > library_dirs = ['/usr/local/rich/src/scipy_build/lib'] > >> > >> >> >>> > language = f77 > >> > >> >> >>> > define_macros = [('NO_ATLAS_INFO', 2)] > >> > >> >> >>> > > >> > >> >> >>> > ########################## > >> > >> >> >>> > > >> > >> >> >>> > I don't have root on this machine, but could pester admins for > >> > >> >> >>> > eventual > >> > >> >> >>> > temporary access. > >> > >> >> >>> > > >> > >> >> >>> > Thanks much for any help, > >> > >> >> >>> > Rich > >> > >> >> >>> > > >> > >> >> >>> > _______________________________________________ > >> > >> >> >>> > Numpy-discussion mailing list > >> > >> >> >>> > Numpy-discussion at scipy.org > >> > >> >> >>> > http://mail.scipy.org/mailman/listinfo/numpy-discussion > >> > >> >> >>> > > >> > >> >> >>> > > >> > >> >> >>> _______________________________________________ > >> > >> >> >>> Numpy-discussion mailing list > >> > >> >> >>> Numpy-discussion at scipy.org > >> > >> >> >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion > >> > >> >> >> > >> > >> >> >> > >> > >> >> >> _______________________________________________ > >> > >> >> >> Numpy-discussion mailing list > >> > >> >> >> Numpy-discussion at scipy.org > >> > >> >> >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > >> > >> >> >> > >> > >> >> >> > >> > >> >> > > >> > >> >> _______________________________________________ > >> > >> >> Numpy-discussion mailing list > >> > >> >> Numpy-discussion at scipy.org > >> > >> >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > >> > % -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Sun Jun 7 00:22:34 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sun, 7 Jun 2009 00:22:34 -0400 Subject: [Numpy-discussion] reciprocal(0) In-Reply-To: References: Message-ID: <1cd32cbb0906062122o1a9ffe0awa3cbff5603a51afb@mail.gmail.com> On Sat, Jun 6, 2009 at 11:49 PM, Ralf Gommers wrote: > Hi, > > I expect `reciprocal(x)` to calculate 1/x, and for input 0 to either follow > the python rules or give the np.divide(1, 0) result. However the result > returned (with numpy trunk) is: > >>>> np.reciprocal(0) > -2147483648 > >>>> np.divide(1, 0) > 0 >>>> 1/0 > Traceback (most recent call last): > ? File "", line 1, in > ZeroDivisionError: integer division or modulo by zero > > The result for a zero float argument is inf as expected. I want to document > the correct behavior for integers, what should it be? > > Cheers, > Ralf Add a warning not to use integers, if a nan or inf is possible in the code, because the behavior in numpy is not very predictable. overflow looks ok, but I really don't like the casting of nans to zero. Josef >>> x = np.array([0,1],dtype=int) >>> x[1] = np.nan >>> x array([0, 0]) >>> x[1]= np.inf Traceback (most recent call last): OverflowError: cannot convert float infinity to long >>> np.array([np.nan, 1],dtype=int) array([0, 1]) >>> np.array([0, np.inf],dtype=int) Traceback (most recent call last): ValueError: setting an array element with a sequence. 
>>> np.array([np.nan, np.inf]).astype(int) array([-2147483648, -2147483648]) and now yours looks like an inf cast to zero >>> x = np.array([0,1],dtype=int) >>> x/x[0] array([0, 0]) Masked Arrays look good for this >>> x = np.ma.array([0,1],dtype=int) >>> x masked_array(data = [0 1], mask = False, fill_value = 999999) >>> x/x[0] masked_array(data = [-- --], mask = [ True True], fill_value = 999999) From charlesr.harris at gmail.com Sun Jun 7 00:58:10 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 6 Jun 2009 22:58:10 -0600 Subject: [Numpy-discussion] svn numpy not building on osx 10.5.6, python.org python 2.5.2 In-Reply-To: <1d1e6ea70906061607h1dd822c6w21f66580925309a7@mail.gmail.com> References: <1d1e6ea70906061607h1dd822c6w21f66580925309a7@mail.gmail.com> Message-ID: On Sat, Jun 6, 2009 at 5:07 PM, George Nurser wrote: > Hi, > the current svn version 7039 isn't compiling for me. > Clean checkout, old numpy directories removed from site-packages.. > Same command did work for svn r 6329 > > [george-nursers-macbook-pro-15:~/src/numpy] agn% python setup.py > config_fc --fcompiler=gnu95 build_clib --fcompiler=gnu95 build_ext > --fcompiler=gnu95 install > Running from numpy source directory. > F2PY Version 2_7039 > numpy/core/setup_common.py:81: MismatchCAPIWarning: API mismatch > detected, the C API version numbers have to be updated. Current C api > version is 3, with checksum c80bc716a6f035470a6f3f448406d9d5, but > recorded checksum for C API version 3 in codegen_dir/cversions.txt is > bf22c0d05b31625d2a7015988d61ce5a. If functions were added in the C > API, you have to update C_API_VERSION in numpy/core/setup_common.pyc. > MismatchCAPIWarning) > blas_opt_info: > FOUND: > extra_link_args = ['-Wl,-framework', '-Wl,Accelerate'] > define_macros = [('NO_ATLAS_INFO', 3)] > extra_compile_args = ['-msse3', > '-I/System/Library/Frameworks/vecLib.framework/Headers'] > > lapack_opt_info: > FOUND: > extra_link_args = ['-Wl,-framework', '-Wl,Accelerate'] > define_macros = [('NO_ATLAS_INFO', 3)] > extra_compile_args = ['-msse3'] > > running config_fc > unifing config_fc, config, build_clib, build_ext, build commands > --fcompiler options > running build_clib > customize UnixCCompiler > customize UnixCCompiler using build_clib > building 'npymath' library > compiling C sources > C compiler: gcc -arch ppc -arch i386 -isysroot > /Developer/SDKs/MacOSX10.4u.sdk -fno-strict-aliasing -Wno-long-double > -no-cpp-precomp -mno-fused-madd -fno-common -dynamic -DNDEBUG -g -O3 > > error: unknown file type '.src' (from 'numpy/core/src/npy_math.c.src') > I've saw that once long ago, but I forget the cause and the cure. Try removing the build directory first. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at googlemail.com Sun Jun 7 01:50:19 2009 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Sun, 7 Jun 2009 01:50:19 -0400 Subject: [Numpy-discussion] reciprocal(0) In-Reply-To: <1cd32cbb0906062122o1a9ffe0awa3cbff5603a51afb@mail.gmail.com> References: <1cd32cbb0906062122o1a9ffe0awa3cbff5603a51afb@mail.gmail.com> Message-ID: You're right, that's a little inconsistent. I would also prefer to get an overflow for divide by 0 rather than casting to zero. - ralf On Sun, Jun 7, 2009 at 12:22 AM, wrote: > On Sat, Jun 6, 2009 at 11:49 PM, Ralf Gommers > wrote: > > Hi, > > > > I expect `reciprocal(x)` to calculate 1/x, and for input 0 to either > follow > > the python rules or give the np.divide(1, 0) result. 
However the result > > returned (with numpy trunk) is: > > > >>>> np.reciprocal(0) > > -2147483648 > > > >>>> np.divide(1, 0) > > 0 > >>>> 1/0 > > Traceback (most recent call last): > > File "", line 1, in > > ZeroDivisionError: integer division or modulo by zero > > > > The result for a zero float argument is inf as expected. I want to > document > > the correct behavior for integers, what should it be? > > > > Cheers, > > Ralf > > Add a warning not to use integers, if a nan or inf is possible in the > code, because the behavior in numpy is not very predictable. > overflow looks ok, but I really don't like the casting of nans to zero. > > Josef > > >>> x = np.array([0,1],dtype=int) > > >>> x[1] = np.nan > >>> x > array([0, 0]) > > >>> x[1]= np.inf > Traceback (most recent call last): > OverflowError: cannot convert float infinity to long > > >>> np.array([np.nan, 1],dtype=int) > array([0, 1]) > > >>> np.array([0, np.inf],dtype=int) > Traceback (most recent call last): > ValueError: setting an array element with a sequence. > > >>> np.array([np.nan, np.inf]).astype(int) > array([-2147483648, -2147483648]) > > > and now yours looks like an inf cast to zero > > >>> x = np.array([0,1],dtype=int) > >>> x/x[0] > array([0, 0]) > > Masked Arrays look good for this > > >>> x = np.ma.array([0,1],dtype=int) > >>> x > masked_array(data = [0 1], > mask = False, > fill_value = 999999) > > >>> x/x[0] > masked_array(data = [-- --], > mask = [ True True], > fill_value = 999999) > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From david at ar.media.kyoto-u.ac.jp Sun Jun 7 01:56:34 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Sun, 07 Jun 2009 14:56:34 +0900 Subject: [Numpy-discussion] svn numpy not building on osx 10.5.6, python.org python 2.5.2 In-Reply-To: <1d1e6ea70906061607h1dd822c6w21f66580925309a7@mail.gmail.com> References: <1d1e6ea70906061607h1dd822c6w21f66580925309a7@mail.gmail.com> Message-ID: <4A2B5692.4010904@ar.media.kyoto-u.ac.jp> George Nurser wrote: > running config_fc > unifing config_fc, config, build_clib, build_ext, build commands > --fcompiler options > running build_clib > customize UnixCCompiler > customize UnixCCompiler using build_clib > building 'npymath' library > compiling C sources > C compiler: gcc -arch ppc -arch i386 -isysroot > /Developer/SDKs/MacOSX10.4u.sdk -fno-strict-aliasing -Wno-long-double > -no-cpp-precomp -mno-fused-madd -fno-common -dynamic -DNDEBUG -g -O3 > > error: unknown file type '.src' (from 'numpy/core/src/npy_math.c.src') > Remove your build directory before building again, David From zelbier at gmail.com Sun Jun 7 03:43:33 2009 From: zelbier at gmail.com (Olivier Verdier) Date: Sun, 7 Jun 2009 09:43:33 +0200 Subject: [Numpy-discussion] matrix default to column vector? In-Reply-To: <23907204.post@talk.nabble.com> References: <75c31b2a0905210610x1a321264r5b6f93d327ef2b36@mail.gmail.com> <4A283F96.1050103@american.edu> <513936A2-C309-435B-B9AD-1B8B71D57B06@mac.com> <4A29561D.5070806@american.edu> <4A2A8B50.9030904@american.edu> <23907204.post@talk.nabble.com> Message-ID: There would be a much simpler solution than allowing a new operator. Just allow the numpy function dot to take more than two arguments. Then A*B*C in matrix notation would simply be: dot(A,B,C) with arrays. Wouldn't that make everybody happy? 
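Such a variadic dot is tiny to prototype in user code. A rough sketch (the name mdot is invented here; it simply chains left to right):

import numpy as np

# reduce is a builtin here (python 2); in python 3 it lives in functools
def mdot(*arrays):
    # mdot(A, B, C) == np.dot(np.dot(A, B), C)
    return reduce(np.dot, arrays)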
Plus it does not break backward compatibility. Am I missing something? == Olivier 2009/6/7 Tom K. > > > Fernando Perez wrote: > > > > On Sat, Jun 6, 2009 at 11:03 AM, Charles R > > Harris wrote: > > > >> I don't think we can change the current matrix class, to do so would > >> break > >> too much code. It would be nice to extend it with an explicit inner > >> product, > >> but I can't think of any simple notation for it that python would parse. > > > > Maybe it's time to make another push on python-dev for the pep-225 > > stuff for other operators? > > > > https://cirl.berkeley.edu/fperez/static/numpy-pep225/ > > > > Last year I got pretty much zero interest from python-dev on this, but > > they were very very busy with 3.0 on the horizon. Perhaps once they > > put 3.1 out would be a good time to champion this again. > > > > It's slightly independent of the matrix class debate, but perhaps > > having special operators for real matrix multiplication could ease > > some of the bottlenecks of this discussion. > > > > It would be great if someone could champion that discussion on > > python-dev though, I don't see myself finding the time for it another > > time around... > > > > How about pep 211? > http://www.python.org/dev/peps/pep-0211/ > > PEP 211 proposes a single new operator (@) that could be used for matrix > multiplication. > MATLAB has elementwise versions of multiply, exponentiation, and left and > right division using a preceding "." for the usual matrix versions (* ^ \ > /). > PEP 225 proposes "tilde" versions of + - * / % **. > > While PEP 225 would allow a matrix exponentiation and right divide, I think > these things are much less common than matrix multiply. Plus, I think > following through with the PEP 225 implementation would create a > frankenstein of a language that would be hard to read. > > So, I would argue for pushing for a single new operator that can then be > used to implement "dot" with a binary infix operator. We can resurrect PEP > 211 or start a new PEP or whatever, the main thing is to have a proposal > that makes sense. Actually, what do you all think of this: > @ --> matrix multiply > @@ --> matrix exponentiation > and we leave it at that - let's not get too greedy and try for matrix > inverse via @/ or something. > > For the nd array operator, I would propose taking the last dimension of the > left array and "collapsing" it with the first dimension of the right array, > so > shape (a0, ..., aL-1,k) @ (k, b0, ..., bM-1) --> (a0, ..., aL-1, b0, ..., > bM-1) > Does that make sense? > > With this proposal, matrices go away and all our lives are sane again. :-) > Long live the numpy ndarray! Thanks to the creators for all your hard work > BTW - I love this stuff! > > - Tom K. > -- > View this message in context: > http://www.nabble.com/matrix-default-to-column-vector--tp23652920p23907204.html > Sent from the Numpy-discussion mailing list archive at Nabble.com. > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Sun Jun 7 03:56:16 2009 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 7 Jun 2009 02:56:16 -0500 Subject: [Numpy-discussion] matrix default to column vector? 
In-Reply-To: References: <75c31b2a0905210610x1a321264r5b6f93d327ef2b36@mail.gmail.com> <513936A2-C309-435B-B9AD-1B8B71D57B06@mac.com> <4A29561D.5070806@american.edu> <4A2A8B50.9030904@american.edu> <23907204.post@talk.nabble.com> Message-ID: <3d375d730906070056j43b6e16fp818011701dfa6e9e@mail.gmail.com> On Sun, Jun 7, 2009 at 02:43, Olivier Verdier wrote: > There would be a much simpler solution than allowing a new operator. Just > allow the numpy function dot to take more than two arguments. Then A*B*C in > matrix notation would simply be: > dot(A,B,C) > with arrays. Wouldn't that make everybody happy? Plus it does not break > backward compatibility. Am I missing something? We've discussed it before. Search the archives. Although matrix multiplication is mathematically associative, there are performance and precision implications to the order the multiplications happen. No satisfactory implementation was found. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From beckers at orn.mpg.de Sun Jun 7 04:20:40 2009 From: beckers at orn.mpg.de (Gabriel Beckers) Date: Sun, 07 Jun 2009 10:20:40 +0200 Subject: [Numpy-discussion] performance matrix multiplication vs. matlab In-Reply-To: <7f014ea60906060959s6570cc32l277c5ab423f0b9ed@mail.gmail.com> References: <20090604143638.xrqzx8w8v4ggckkk@www.sms.ed.ac.uk> <4A28EC37.10205@ar.media.kyoto-u.ac.jp> <4A28F16C.3070607@ar.media.kyoto-u.ac.jp> <4A2955F5.9030006@hawaii.edu> <4A295A7D.1040407@ar.media.kyoto-u.ac.jp> <7f014ea60906051437p6002c6f7jb2d77374a220d2aa@mail.gmail.com> <7f014ea60906060959s6570cc32l277c5ab423f0b9ed@mail.gmail.com> Message-ID: <1244362840.4377.10.camel@gabriel-desktop> On Sat, 2009-06-06 at 12:59 -0400, Chris Colbert wrote: > ../configure -b 64 -D c -DPentiumCPS=2400 -Fa -alg -fPIC > --with-netlib-lapack=/home/your-user-name/build/lapack/lapack-3.2.1/Lapack_LINUX.a Many thanks Chris, I succeeded in building it. The configure command above contained two problems that I had to correct to get it to work though. In case other people are trying this, I used: ../configure -b 32 -D c -DPentiumCPS=1800 -Fa alg -fPIC --with-netlib-lapack=/home/your-user-name/build/lapack/lapack-3.2.1/lapack_LINUX.a That is (in addition to the different -b switch for my 32-bit machine and the different processor speed): the dash before "alg" should be removed, and "Lapack_LINUX.a" should be "lapack_LINUX.a". Gabriel From fperez.net at gmail.com Sun Jun 7 04:28:51 2009 From: fperez.net at gmail.com (Fernando Perez) Date: Sun, 7 Jun 2009 01:28:51 -0700 Subject: [Numpy-discussion] Functions for indexing into certain parts of an array (2d) In-Reply-To: <3d375d730906061401i611e2fe5j1d09c24cf8e49682@mail.gmail.com> References: <3d375d730906060009r6efc1b70y6778408541fe6ce3@mail.gmail.com> <3d375d730906061401i611e2fe5j1d09c24cf8e49682@mail.gmail.com> Message-ID: On Sat, Jun 6, 2009 at 2:01 PM, Robert Kern wrote: > On Sat, Jun 6, 2009 at 13:30, Fernando Perez wrote: > Oops! Never mind. I thought it was using mask_indices like the others. > ?There is a neat trick for accessing the diagonal of an existing array > (a.flat[::a.shape[1]+1]), but it won't work to implement > diag_indices(). Neat! 
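For the record, the trick in action, as a two-line sketch:

import numpy as np
a = np.arange(12.).reshape(3, 4)
a.flat[::a.shape[1] + 1] = 0.0  # writes the main diagonal in place, rows 0..2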
A version valid for all dimensionalities (always writing the main diagonal) is: if a.ndim == 2: # Explicit, fast formula for the common case step = a.shape[1] + 1 else: step = np.cumprod((1,)+a.shape[:-1]).sum() a.flat[::step] = val Do you want this as part of the patch? If so, where should it go (it's not 2-d only)? If you want it, should I add a check for equal dimensions? (I'd be inclined to allow non-square in the 2-d case but to avoid it in other cases, where the formula completely breaks down. In 2-d it can be useful to fill the diagonal of rectangular arrays.) >> - Are doctests considered enough testing for numpy, or are separate >> tests also required? > > I don't think we have the testing machinery hooked up to test the > docstrings on the functions themselves (we made the decision to keep > examples as clean and pedagogical as possible rather than complete > tests). You can use doctests in the test directories, though. Got it. >> - Where should these go? > > numpy/lib/twodim_base.py to go with their existing counterparts, I would think. OK. Will send it in when I know whether you'd want the fill_diagonal one, and where that should go. I'll make a ticket with the patch attached. >> - Any interest in also having the stuff below? ?I'm needing to build >> structured random arrays a lot (symmetric, anti-symmetric, symmetric >> with ?a particular diagonal, etc), and these are coming in handy. ?If >> you want them, I'll put the whole thing together (these use the >> indexing utilities from the previous suggestion). > > I wouldn't mind having a little gallery of matrix generators in numpy, > but someone else has already made a much more complete collection: > > ?http://pypi.python.org/pypi/rogues Ah, great! This stuff would be really nice to have in numpy/scipy, actually. A lot more than my 15-minute hack :) OK, I'll keep mine for symmetric/antisymmetric random matrices, since that's what I need now, but it's great to know about that resource. Cheers, f From fperez.net at gmail.com Sun Jun 7 04:37:37 2009 From: fperez.net at gmail.com (Fernando Perez) Date: Sun, 7 Jun 2009 01:37:37 -0700 Subject: [Numpy-discussion] Functions for indexing into certain parts of an array (2d) In-Reply-To: References: <3d375d730906060009r6efc1b70y6778408541fe6ce3@mail.gmail.com> <3d375d730906061401i611e2fe5j1d09c24cf8e49682@mail.gmail.com> Message-ID: On Sun, Jun 7, 2009 at 1:28 AM, Fernando Perez wrote: > > OK. ?Will send it in when I know whether you'd want the fill_diagonal > one, and where that should go. > One more question. For these *_indices() functions, would you want an interface that accepts *either* diag_indices(size,ndim) or diag_indices(anarray) Both can be useful depending on the case, but it means leaving the first argument as an untyped placeholder and typechecking on it. I don't know if numpy has a policy on avoiding that kind of shenanigans. Cheers, f From robert.kern at gmail.com Sun Jun 7 04:40:06 2009 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 7 Jun 2009 03:40:06 -0500 Subject: [Numpy-discussion] Functions for indexing into certain parts of an array (2d) In-Reply-To: References: <3d375d730906060009r6efc1b70y6778408541fe6ce3@mail.gmail.com> <3d375d730906061401i611e2fe5j1d09c24cf8e49682@mail.gmail.com> Message-ID: <3d375d730906070140y115a1a07of8804706749a5f3b@mail.gmail.com> On Sun, Jun 7, 2009 at 03:37, Fernando Perez wrote: > On Sun, Jun 7, 2009 at 1:28 AM, Fernando Perez wrote: >> >> OK. 
?Will send it in when I know whether you'd want the fill_diagonal >> one, and where that should go. >> > > One more question. ?For these *_indices() functions, would you want an > interface that accepts *either* > > diag_indices(size,ndim) > > or > > diag_indices(anarray) > > Both can be useful depending on the case, but it ?means leaving the > first argument as an untyped placeholder and ?typechecking on it. ?I > don't know if numpy has a policy on avoiding that kind of shenanigans. *I* do. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From fperez.net at gmail.com Sun Jun 7 04:49:12 2009 From: fperez.net at gmail.com (Fernando Perez) Date: Sun, 7 Jun 2009 01:49:12 -0700 Subject: [Numpy-discussion] Functions for indexing into certain parts of an array (2d) In-Reply-To: <3d375d730906070140y115a1a07of8804706749a5f3b@mail.gmail.com> References: <3d375d730906060009r6efc1b70y6778408541fe6ce3@mail.gmail.com> <3d375d730906061401i611e2fe5j1d09c24cf8e49682@mail.gmail.com> <3d375d730906070140y115a1a07of8804706749a5f3b@mail.gmail.com> Message-ID: On Sun, Jun 7, 2009 at 1:40 AM, Robert Kern wrote: >> Both can be useful depending on the case, but it ?means leaving the >> first argument as an untyped placeholder and ?typechecking on it. ?I >> don't know if numpy has a policy on avoiding that kind of shenanigans. > > *I* do. OK. I'll go for the first form then, since that one can be used when no array is present at that point. If someone complains on patch review, we can always add a helper (with a name like diag_indices_from(arr) ) for the second call form. Thanks, f From giorgio.luciano at inwind.it Sun Jun 7 04:56:04 2009 From: giorgio.luciano at inwind.it (giorgio.luciano at inwind.it) Date: Sun, 7 Jun 2009 10:56:04 +0200 Subject: [Numpy-discussion] Jcamp dx format Message-ID: Sorry for cross posting Hello to all, I've done a script for importing all spectra files in a directory and merge all them in one matrix. The file imported are dx files. the bad part is that the file is in matlab and it requite a function from bioinformatic toolbox (jcamp read). And now I just wnat to do the same in python. I guess I will have no problem for translating the script but I think I dont' have the time (and capabilities) to rewrite something like jcampread. Since jcamp dx format it's quite common among scientist. Does anyone can share some script/function for importing them in python (I guess that also a r routine can do the trick but I will prefer to use python). Thanks in advance to all Giorgio From fperez.net at gmail.com Sun Jun 7 05:22:19 2009 From: fperez.net at gmail.com (Fernando Perez) Date: Sun, 7 Jun 2009 02:22:19 -0700 Subject: [Numpy-discussion] field names on numpy arrays In-Reply-To: <9C0536CB-B4C4-4644-BE70-FE0EDCF4F55E@gmail.com> References: <23852413.post@talk.nabble.com> <9457e7c80906031151r5d007a3dye89f38482be41a4c@mail.gmail.com> <9C0536CB-B4C4-4644-BE70-FE0EDCF4F55E@gmail.com> Message-ID: On Thu, Jun 4, 2009 at 12:28 PM, Pierre GM wrote: > I foresee serious disturbance in the force... > When I use structured arrays, each field usually represents a > different variable, and I may not be keen on having a same operation > applied to all variables. At least, the current behavior (raise an > exception) forces me to think twice. 
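For the cases where one operation across all fields really is wanted, the fallback is a small field-by-field helper. A minimal sketch of the subtract(a, b) idea mentioned below, assuming both inputs share the same dtype:

import numpy as np

def subtract(a, b):
    # the 'unpack, subtract, repack' dance, one field at a time
    out = np.empty_like(a)
    for name in a.dtype.names:
        out[name] = a[name] - b[name]
    return out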
My main use case is really arithmetic: being able to take differences of structured arrays that contain similar data would make some of our code here clearer. But if this doesn't fly, I can always have a little subtract(a,b) helper that does the 'unpack, subtract,repack' dance for 'a-b' and similar for any other needed operations. I realize it's not the most generic case, but we're using structured arrays a lot for putting data in more manageable/comprehensible structures, and some basic arithmetic support on the whole array would be nice sometimes. The other alternative is to do field-by-field extractions and operations, which is unnecessary when all the fields happen to have the exact same kind of data (numerically speaking, not conceptually). > What about the case where you multiply a 1D structured array with a nD > array ? What should you have ? I'd punt. I think it's OK for structured arrays not to support the full range of binary operations and ufuncs, I was only thinking of allowing the very simplest: binary ops between absolutely identical (shape, dtype) arrays whose dtype is an aggregation of native ones for which the 'unpack, operate, repack' pattern works. It seems to me like an improvement in the functionality of structured arrays, but it's not a big deal. Cheers, f From wierob83 at googlemail.com Sun Jun 7 05:40:43 2009 From: wierob83 at googlemail.com (wierob) Date: Sun, 07 Jun 2009 11:40:43 +0200 Subject: [Numpy-discussion] BigInteger equivalent in numpy In-Reply-To: <1cd32cbb0906040555p6ad34e6i406b95a3626301c9@mail.gmail.com> References: <4A27BBD5.5010306@googlemail.com> <1cd32cbb0906040555p6ad34e6i406b95a3626301c9@mail.gmail.com> Message-ID: <4A2B8B1B.6020701@googlemail.com> Hi, int64 and float seem to work for the stderr calculation. Now, the calculation of the p-value causes an underflow. File "C:\Python26\lib\site-packages\scipy\stats\distributions.py", line 2829,in _cdf return special.stdtr(df, x) FloatingPointError: underflow encountered in stdtr It seems that the error occurs in scipy.special.stdtr(df, x) if df = array([13412]) and x = array([61.88071696]). >>> from scipy import special >>> import numpy >>> df = numpy.array([13412]) >>> x = numpy,array([61.88071696]) >>> special.stdtr(df,x) array([ 1.]) >>> numpy.seterr(all="raise") >>> special.stdtr(df,x) Traceback (most recent call last): File "", line 1, in FloatingPointError: underflow encountered in stdtr So, is there another function or datatype that can handle this? Besides, the overlow in stderr calculation caused nan as result value whereas the underflow in p-value calculation eventually leads to 0.0 (returned by linregress) which is somewhat inconsistent. And in the latter case, tricky to identify as an error. kind regards robert josef.pktd at gmail.com schrieb: > On Thu, Jun 4, 2009 at 8:19 AM, wierob wrote: > >> Hi, >> >> is there a BigInteger equivalent in numpy? The largest integer type I >> wound was dtype int64. >> >> I'm using stats.linregress to perform a regression analysis. The return >> stderr was nan because stas.ss(...) returned a negative number due to an >> overflow. Setting dtype to int64 for my input data seems to fix this. >> But what if my data does not fit in int64? >> >> Since Python's long type can hold large data I tried to convert my input >> to long but it gets converted to int64 in numpy. >> >> > > you could try to use floats. stats.ss does the calculation in the same > type as the input. 
> If you convert your input data to floating point you will not get an > overflow, but floating point precision instead. > > Note during the last bugfix, I also changed the implementation of > stats.linregress and now (0.7.1 and later) it doesn't use stats.ss > anymore, instead it uses np.cov which always uses floats. > Also, if you are using an older version there was a mistake in the > stderr calculations, http://projects.scipy.org/scipy/ticket/874 > > Josef > > >> kind regards >> robert >> > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From zelbier at gmail.com Sun Jun 7 05:44:19 2009 From: zelbier at gmail.com (Olivier Verdier) Date: Sun, 7 Jun 2009 11:44:19 +0200 Subject: [Numpy-discussion] matrix default to column vector? In-Reply-To: <3d375d730906070056j43b6e16fp818011701dfa6e9e@mail.gmail.com> References: <75c31b2a0905210610x1a321264r5b6f93d327ef2b36@mail.gmail.com> <4A29561D.5070806@american.edu> <4A2A8B50.9030904@american.edu> <23907204.post@talk.nabble.com> <3d375d730906070056j43b6e16fp818011701dfa6e9e@mail.gmail.com> Message-ID: Yes, I found the thread you are referring to: http://mail.python.org/pipermail/python-dev/2008-July/081554.html However, since A*B*C exists for matrices and actually computes (A*B)*C, why not do the same with dot? I.e. why not decide that dot(A,B,C) does what would A*B*C do, i.e., dot(dot(A,B),C)? The performance and precision problems are the responsability of the user, just as with the formula A*B*C. == Olivier 2009/6/7 Robert Kern > On Sun, Jun 7, 2009 at 02:43, Olivier Verdier wrote: > > There would be a much simpler solution than allowing a new operator. Just > > allow the numpy function dot to take more than two arguments. Then A*B*C > in > > matrix notation would simply be: > > dot(A,B,C) > > with arrays. Wouldn't that make everybody happy? Plus it does not break > > backward compatibility. Am I missing something? > > We've discussed it before. Search the archives. Although matrix > multiplication is mathematically associative, there are performance > and precision implications to the order the multiplications happen. No > satisfactory implementation was found. > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > -- Umberto Eco > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From beckers at orn.mpg.de Sun Jun 7 05:52:09 2009 From: beckers at orn.mpg.de (Gabriel Beckers) Date: Sun, 07 Jun 2009 11:52:09 +0200 Subject: [Numpy-discussion] performance matrix multiplication vs. matlab In-Reply-To: <1244362840.4377.10.camel@gabriel-desktop> References: <20090604143638.xrqzx8w8v4ggckkk@www.sms.ed.ac.uk> <4A28EC37.10205@ar.media.kyoto-u.ac.jp> <4A28F16C.3070607@ar.media.kyoto-u.ac.jp> <4A2955F5.9030006@hawaii.edu> <4A295A7D.1040407@ar.media.kyoto-u.ac.jp> <7f014ea60906051437p6002c6f7jb2d77374a220d2aa@mail.gmail.com> <7f014ea60906060959s6570cc32l277c5ab423f0b9ed@mail.gmail.com> <1244362840.4377.10.camel@gabriel-desktop> Message-ID: <1244368329.4377.18.camel@gabriel-desktop> OK, perhaps I drank that beer too soon... 
Now, numpy.test() hangs at: test_pinv (test_defmatrix.TestProperties) ... So perhaps something is wrong with ATLAS, even though the building went fine, and "make check" and "make ptcheck" reported no errors. Gabriel On Sun, 2009-06-07 at 10:20 +0200, Gabriel Beckers wrote: > On Sat, 2009-06-06 at 12:59 -0400, Chris Colbert wrote: > > ../configure -b 64 -D c -DPentiumCPS=2400 -Fa -alg -fPIC > > --with-netlib-lapack=/home/your-user-name/build/lapack/lapack-3.2.1/Lapack_LINUX.a > > Many thanks Chris, I succeeded in building it. > > The configure command above contained two problems that I had to correct > to get it to work though. > > In case other people are trying this, I used: > > ../configure -b 32 -D c -DPentiumCPS=1800 -Fa alg -fPIC > --with-netlib-lapack=/home/your-user-name/build/lapack/lapack-3.2.1/lapack_LINUX.a > > That is (in addition to the different -b switch for my 32-bit machine > and the different processor speed): the dash before "alg" should be > removed, and "Lapack_LINUX.a" should be "lapack_LINUX.a". > > Gabriel > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From david at ar.media.kyoto-u.ac.jp Sun Jun 7 05:37:21 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Sun, 07 Jun 2009 18:37:21 +0900 Subject: [Numpy-discussion] performance matrix multiplication vs. matlab In-Reply-To: <1244368329.4377.18.camel@gabriel-desktop> References: <20090604143638.xrqzx8w8v4ggckkk@www.sms.ed.ac.uk> <4A28EC37.10205@ar.media.kyoto-u.ac.jp> <4A28F16C.3070607@ar.media.kyoto-u.ac.jp> <4A2955F5.9030006@hawaii.edu> <4A295A7D.1040407@ar.media.kyoto-u.ac.jp> <7f014ea60906051437p6002c6f7jb2d77374a220d2aa@mail.gmail.com> <7f014ea60906060959s6570cc32l277c5ab423f0b9ed@mail.gmail.com> <1244362840.4377.10.camel@gabriel-desktop> <1244368329.4377.18.camel@gabriel-desktop> Message-ID: <4A2B8A51.3070409@ar.media.kyoto-u.ac.jp> Gabriel Beckers wrote: > OK, perhaps I drank that beer too soon... > > Now, numpy.test() hangs at: > > test_pinv (test_defmatrix.TestProperties) ... > > So perhaps something is wrong with ATLAS, even though the building went > fine, and "make check" and "make ptcheck" reported no errors. > Maybe you did not use the same fortran compiler with atlas and numpy, or maybe something else. make check/make ptchek do not test anything useful to avoid problems with numpy, in my experience. That's why compiling atlas by yourself is hard, and I generally advise against it: there is nothing intrinsically hard about it, but you need to know a lot of small details and platform oddities to get it right every time. That's just a waste of time in most cases IMHO, unless all you do with numpy is inverting big matrices, cheers, David From sebastian.walter at gmail.com Sun Jun 7 06:09:40 2009 From: sebastian.walter at gmail.com (Sebastian Walter) Date: Sun, 7 Jun 2009 12:09:40 +0200 Subject: [Numpy-discussion] Multiplying Python float to numpy.array of objects works but fails with a numpy.float64, numpy Bug? 
In-Reply-To: References: Message-ID: Apparently the bugfix has been undone in the current bleeding edge version of numpy: ----------------------- numpy_float64_issue.py ---------------------------------- from numpy import * import numpy print 'numpy.__version__=',numpy.__version__ class adouble: def __init__(self,x): self.x = x def __mul__(self,rhs): if isinstance(rhs,adouble): return adouble(self.x * rhs.x) else: return adouble(self.x * rhs) def __str__(self): return str(self.x) x = adouble(3.) y = adouble(2.) u = array([adouble(3.), adouble(5.)]) v = array([adouble(2.), adouble(7.)]) z = array([2.,3.]) print x * y # ok print u * v # ok print u * z # ok print u * 3. # ok print u * z[0] # _NOT_ OK! print u * float64(3.) # _NOT_ OK! ---------------- end numpy_float64_issue.py -------------------------- OUTPUT: basti at shlp:~/tmp$ python numpy_float64_issue.py numpy.__version__= 1.4.0.dev7039 6.0 [6.0 35.0] [6.0 15.0] [9.0 15.0] Traceback (most recent call last): File "numpy_float64_issue.py", line 26, in print u * z[0] # _NOT_ OK! TypeError: unsupported operand type(s) for *: 'numpy.ndarray' and 'numpy.float64' Should I open a ticket for that? Sebastian On Tue, Jun 2, 2009 at 4:18 PM, Darren Dale wrote: > > > On Tue, Jun 2, 2009 at 10:09 AM, Keith Goodman wrote: >> >> On Tue, Jun 2, 2009 at 1:42 AM, Sebastian Walter >> wrote: >> > Hello, >> > Multiplying a Python float to a numpy.array of objects works flawlessly >> > but not with a numpy.float64 . >> > I tried ?numpy version '1.0.4' on a 32 bit Linux and ?'1.2.1' on a 64 >> > bit Linux: both raise the same exception. >> > >> > Is this a (known) bug? > > Yes, it was fixed in numpy-1.3. > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From gnurser at googlemail.com Sun Jun 7 06:11:28 2009 From: gnurser at googlemail.com (George Nurser) Date: Sun, 7 Jun 2009 11:11:28 +0100 Subject: [Numpy-discussion] svn numpy not building on osx 10.5.6, python.org python 2.5.2 In-Reply-To: <4A2B5692.4010904@ar.media.kyoto-u.ac.jp> References: <1d1e6ea70906061607h1dd822c6w21f66580925309a7@mail.gmail.com> <4A2B5692.4010904@ar.media.kyoto-u.ac.jp> Message-ID: <1d1e6ea70906070311m7f0326cdoc0283f4941f8a9af@mail.gmail.com> Sorry, I should have said that I'd always deleted the build directories. I now have a better idea about what the problem is. python setup.py config_fc --fcompiler=gnu95 build_clib --fcompiler=gnu95 build_ext --fcompiler=gnu95 install works OK for svn versions < 6481 (where coremath was merged). It fails for r 6481 and later. However, simply, python setup.py install works OK for r 6481 and current (r 7039). I'm using the old system gcc 4.0.1. Perhaps the problem is that my gfortran (v4.3.2) is from MacPorts and was compiled to produce intel-only binaries, rather than dual architecture binaries? But I'm puzzled as to why specifying the fortran compiler should make any difference -- I understood it isn't used to compile numpy. --George. 
2009/6/7 David Cournapeau : > George Nurser wrote: >> running config_fc >> unifing config_fc, config, build_clib, build_ext, build commands >> --fcompiler options >> running build_clib >> customize UnixCCompiler >> customize UnixCCompiler using build_clib >> building 'npymath' library >> compiling C sources >> C compiler: gcc -arch ppc -arch i386 -isysroot >> /Developer/SDKs/MacOSX10.4u.sdk -fno-strict-aliasing -Wno-long-double >> -no-cpp-precomp -mno-fused-madd -fno-common -dynamic -DNDEBUG -g -O3 >> >> error: unknown file type '.src' (from 'numpy/core/src/npy_math.c.src') >> > > Remove your build directory before building again, > > David > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From gael.varoquaux at normalesup.org Sun Jun 7 06:12:04 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Sun, 7 Jun 2009 12:12:04 +0200 Subject: [Numpy-discussion] performance matrix multiplication vs. matlab In-Reply-To: <4A2B8A51.3070409@ar.media.kyoto-u.ac.jp> References: <4A28F16C.3070607@ar.media.kyoto-u.ac.jp> <4A2955F5.9030006@hawaii.edu> <4A295A7D.1040407@ar.media.kyoto-u.ac.jp> <7f014ea60906051437p6002c6f7jb2d77374a220d2aa@mail.gmail.com> <7f014ea60906060959s6570cc32l277c5ab423f0b9ed@mail.gmail.com> <1244362840.4377.10.camel@gabriel-desktop> <1244368329.4377.18.camel@gabriel-desktop> <4A2B8A51.3070409@ar.media.kyoto-u.ac.jp> Message-ID: <20090607101204.GB20612@phare.normalesup.org> On Sun, Jun 07, 2009 at 06:37:21PM +0900, David Cournapeau wrote: > That's why compiling atlas by yourself is hard, and I generally advise > against it: there is nothing intrinsically hard about it, but you need > to know a lot of small details and platform oddities to get it right > every time. That's just a waste of time in most cases IMHO, unless all > you do with numpy is inverting big matrices, Well, I do bootstrapping of PCAs, that is SVDs. I can tell you, it makes a big difference, especially since I have 8 cores. Ga?l From david at ar.media.kyoto-u.ac.jp Sun Jun 7 06:00:52 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Sun, 07 Jun 2009 19:00:52 +0900 Subject: [Numpy-discussion] performance matrix multiplication vs. matlab In-Reply-To: <20090607101204.GB20612@phare.normalesup.org> References: <4A28F16C.3070607@ar.media.kyoto-u.ac.jp> <4A2955F5.9030006@hawaii.edu> <4A295A7D.1040407@ar.media.kyoto-u.ac.jp> <7f014ea60906051437p6002c6f7jb2d77374a220d2aa@mail.gmail.com> <7f014ea60906060959s6570cc32l277c5ab423f0b9ed@mail.gmail.com> <1244362840.4377.10.camel@gabriel-desktop> <1244368329.4377.18.camel@gabriel-desktop> <4A2B8A51.3070409@ar.media.kyoto-u.ac.jp> <20090607101204.GB20612@phare.normalesup.org> Message-ID: <4A2B8FD4.1000602@ar.media.kyoto-u.ac.jp> Gael Varoquaux wrote: > On Sun, Jun 07, 2009 at 06:37:21PM +0900, David Cournapeau wrote: > >> That's why compiling atlas by yourself is hard, and I generally advise >> against it: there is nothing intrinsically hard about it, but you need >> to know a lot of small details and platform oddities to get it right >> every time. That's just a waste of time in most cases IMHO, unless all >> you do with numpy is inverting big matrices, >> > > Well, I do bootstrapping of PCAs, that is SVDs. I can tell you, it makes > a big difference, especially since I have 8 cores. > hence *most* :) I doubt most numpy users need to do PCA on high-dimensional data. 
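For those who do, the hot spot is a single LAPACK-backed call, which is exactly where an optimized BLAS/LAPACK pays off. A bare-bones sketch on made-up data:

import numpy as np
x = np.random.randn(1000, 50)           # observations by variables
x -= x.mean(axis=0)                     # center each variable
u, s, vt = np.linalg.svd(x, full_matrices=False)
components = vt                         # principal axes, one per row
variances = s ** 2 / (x.shape[0] - 1)   # variance captured by each axis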
cheers, David From beckers at orn.mpg.de Sun Jun 7 06:21:14 2009 From: beckers at orn.mpg.de (Gabriel Beckers) Date: Sun, 07 Jun 2009 12:21:14 +0200 Subject: [Numpy-discussion] performance matrix multiplication vs. matlab In-Reply-To: <4A2B8A51.3070409@ar.media.kyoto-u.ac.jp> References: <20090604143638.xrqzx8w8v4ggckkk@www.sms.ed.ac.uk> <4A28EC37.10205@ar.media.kyoto-u.ac.jp> <4A28F16C.3070607@ar.media.kyoto-u.ac.jp> <4A2955F5.9030006@hawaii.edu> <4A295A7D.1040407@ar.media.kyoto-u.ac.jp> <7f014ea60906051437p6002c6f7jb2d77374a220d2aa@mail.gmail.com> <7f014ea60906060959s6570cc32l277c5ab423f0b9ed@mail.gmail.com> <1244362840.4377.10.camel@gabriel-desktop> <1244368329.4377.18.camel@gabriel-desktop> <4A2B8A51.3070409@ar.media.kyoto-u.ac.jp> Message-ID: <1244370074.4377.36.camel@gabriel-desktop> On Sun, 2009-06-07 at 18:37 +0900, David Cournapeau wrote: > Maybe you did not use the same fortran compiler with atlas and numpy, > or > maybe something else. make check/make ptchek do not test anything > useful > to avoid problems with numpy, in my experience. > > That's why compiling atlas by yourself is hard, and I generally advise > against it: there is nothing intrinsically hard about it, but you need > to know a lot of small details and platform oddities to get it right > every time. That's just a waste of time in most cases IMHO, unless all > you do with numpy is inverting big matrices, > > cheers, > > David Hi David, I did: sudo apt-get remove g77 sudo apt-get install gfortran before starting the whole thing, so I assume that should take care of it. I am not sure how much I actually depend on Atlas for what I do, so your advice is well taken. One thing I can think of is PCA and ICA (of *big* matrices of float32 data), using the MDP toolbox mostly. I should find out in how far Atlas is crucial specifically for that. All the best, Gabriel From david at ar.media.kyoto-u.ac.jp Sun Jun 7 06:05:36 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Sun, 07 Jun 2009 19:05:36 +0900 Subject: [Numpy-discussion] BigInteger equivalent in numpy In-Reply-To: <4A2B8B1B.6020701@googlemail.com> References: <4A27BBD5.5010306@googlemail.com> <1cd32cbb0906040555p6ad34e6i406b95a3626301c9@mail.gmail.com> <4A2B8B1B.6020701@googlemail.com> Message-ID: <4A2B90F0.8020503@ar.media.kyoto-u.ac.jp> (Please do not send twice your message to numpy/scipy ML, thank you) wierob wrote: > Hi, > > int64 and float seem to work for the stderr calculation. Now, the > calculation of the p-value causes an underflow. > > File "C:\Python26\lib\site-packages\scipy\stats\distributions.py", line 2829,in _cdf > return special.stdtr(df, x) > FloatingPointError: underflow encountered in stdtr > > It seems that the error occurs in scipy.special.stdtr(df, x) if df = > array([13412]) and x = array([61.88071696]). > The error seems to happen in the cephes library (the library we use for many functions in scipy.special, including this one). I don't know exactly what triggers it (your df is quite high, though, and the cdf is already quite past the 1-eps at your point). 
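For tail probabilities that are small but still representable, asking cephes for the tail directly beats computing 1 - cdf, which rounds to zero as soon as the cdf rounds to one. A sketch using the normal and its symmetry:

from scipy import special
p_tail = special.ndtr(-10.0)        # ~7.6e-24, computed accurately via erfc
p_naive = 1.0 - special.ndtr(10.0)  # 0.0 -- the cdf already rounded to 1.0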
cheers, David From david at ar.media.kyoto-u.ac.jp Sun Jun 7 06:10:59 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Sun, 07 Jun 2009 19:10:59 +0900 Subject: [Numpy-discussion] svn numpy not building on osx 10.5.6, python.org python 2.5.2 In-Reply-To: <1d1e6ea70906070311m7f0326cdoc0283f4941f8a9af@mail.gmail.com> References: <1d1e6ea70906061607h1dd822c6w21f66580925309a7@mail.gmail.com> <4A2B5692.4010904@ar.media.kyoto-u.ac.jp> <1d1e6ea70906070311m7f0326cdoc0283f4941f8a9af@mail.gmail.com> Message-ID: <4A2B9233.4050200@ar.media.kyoto-u.ac.jp> George Nurser wrote: > Sorry, I should have said that I'd always deleted the build directories. > I now have a better idea about what the problem is. > > python setup.py config_fc --fcompiler=gnu95 build_clib > --fcompiler=gnu95 build_ext --fcompiler=gnu95 install > > works OK for svn versions < 6481 (where coremath was merged). > It fails for r 6481 and later. > Ok, I fixed the bug in numpy.distutils - it has nothing to do with your fortran compiler :) (see #1131). David From beckers at orn.mpg.de Sun Jun 7 06:31:16 2009 From: beckers at orn.mpg.de (Gabriel Beckers) Date: Sun, 07 Jun 2009 12:31:16 +0200 Subject: [Numpy-discussion] performance matrix multiplication vs. matlab In-Reply-To: <4A2B8FD4.1000602@ar.media.kyoto-u.ac.jp> References: <4A28F16C.3070607@ar.media.kyoto-u.ac.jp> <4A2955F5.9030006@hawaii.edu> <4A295A7D.1040407@ar.media.kyoto-u.ac.jp> <7f014ea60906051437p6002c6f7jb2d77374a220d2aa@mail.gmail.com> <7f014ea60906060959s6570cc32l277c5ab423f0b9ed@mail.gmail.com> <1244362840.4377.10.camel@gabriel-desktop> <1244368329.4377.18.camel@gabriel-desktop> <4A2B8A51.3070409@ar.media.kyoto-u.ac.jp> <20090607101204.GB20612@phare.normalesup.org> <4A2B8FD4.1000602@ar.media.kyoto-u.ac.jp> Message-ID: <1244370676.4377.39.camel@gabriel-desktop> On Sun, 2009-06-07 at 19:00 +0900, David Cournapeau wrote: > hence *most* :) I doubt most numpy users need to do PCA on > high-dimensional data. OK a quick look on the MDP website learns that I am one of the exceptions (as Ga?l's email already suggested). Gabriel From gnurser at googlemail.com Sun Jun 7 06:38:32 2009 From: gnurser at googlemail.com (George Nurser) Date: Sun, 7 Jun 2009 11:38:32 +0100 Subject: [Numpy-discussion] svn numpy not building on osx 10.5.6, python.org python 2.5.2 In-Reply-To: <4A2B9233.4050200@ar.media.kyoto-u.ac.jp> References: <1d1e6ea70906061607h1dd822c6w21f66580925309a7@mail.gmail.com> <4A2B5692.4010904@ar.media.kyoto-u.ac.jp> <1d1e6ea70906070311m7f0326cdoc0283f4941f8a9af@mail.gmail.com> <4A2B9233.4050200@ar.media.kyoto-u.ac.jp> Message-ID: <1d1e6ea70906070338w7f156076qb524236a53c81b1a@mail.gmail.com> Thanks for the quick fix. 2009/6/7 David Cournapeau : > George Nurser wrote: >> Sorry, I should have said that I'd always deleted the build directories. >> I now have a better idea about what the problem is. >> >> python setup.py config_fc --fcompiler=gnu95 build_clib >> --fcompiler=gnu95 build_ext --fcompiler=gnu95 install >> >> works OK for svn versions < 6481 (where coremath was merged). >> It fails for r 6481 and later. >> > > Ok, I fixed the bug in numpy.distutils - it has nothing to do with your > fortran compiler :) (see #1131). > > David > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From tpk at kraussfamily.org Sun Jun 7 08:20:17 2009 From: tpk at kraussfamily.org (Tom K.) 
Date: Sun, 7 Jun 2009 05:20:17 -0700 (PDT) Subject: [Numpy-discussion] matrix default to column vector? In-Reply-To: References: <75c31b2a0905210610x1a321264r5b6f93d327ef2b36@mail.gmail.com> <23693116.post@talk.nabble.com> <4A283BD5.4080405@noaa.gov> <3699E128-2BCA-45E1-97A1-B0DE3B8E8B74@mac.com> <4A283F96.1050103@american.edu> <513936A2-C309-435B-B9AD-1B8B71D57B06@mac.com> <4A29561D.5070806@american.edu> <4A2A8B50.9030904@american.edu> <23907204.post@talk.nabble.com> Message-ID: <23910425.post@talk.nabble.com> Olivier Verdier-2 wrote: > > There would be a much simpler solution than allowing a new operator. Just > allow the numpy function dot to take more than two arguments. Then A*B*C > in > matrix notation would simply be: > dot(A,B,C) > > with arrays. Wouldn't that make everybody happy? Plus it does not break > backward compatibility. Am I missing something? > That wouldn't make me happy because it is not the same syntax as a binary infix operator. Introducing a new operator for matrix multiply (and possibly matrix exponentiation) does not break backward compatibility - how could it, given that the python language does not yet support the new operator? Going back to Alan Isaac's example: 1) beta = (X.T*X).I * X.T * Y 2) beta = np.dot(np.dot(la.inv(np.dot(X.T,X)),X.T),Y) With a multiple arguments to dot, 2) becomes: 3) beta = np.dot(la.inv(np.dot(X.T, X)), X.T, Y) This is somewhat better than 2) but not as nice as 1) IMO. Seeing 1) with @'s would take some getting used but I think we would adjust. For ".I" I would propose that ".I" be added to nd-arrays that inverts each matrix of the last two dimensions, so for example if X is 3D then X.I is the same as np.array([inv(Xi) for Xi in X]). This is also backwards compatible. With this behavior and the one I proposed for @, by adding preceding dimensions we are allowing doing matrix algebra on collections of matrices (although it looks like we might need a new .T that just swaps the last two dimensions to really pull that off). But a ".I" attribute and its behavior needn't be bundled with whatever proposal we wish to make to the python community for a new operator of course. Regards, Tom K. -- View this message in context: http://www.nabble.com/matrix-default-to-column-vector--tp23652920p23910425.html Sent from the Numpy-discussion mailing list archive at Nabble.com. From beckers at orn.mpg.de Sun Jun 7 08:43:44 2009 From: beckers at orn.mpg.de (Gabriel Beckers) Date: Sun, 07 Jun 2009 14:43:44 +0200 Subject: [Numpy-discussion] performance matrix multiplication vs. matlab In-Reply-To: <4A2B8A51.3070409@ar.media.kyoto-u.ac.jp> References: <20090604143638.xrqzx8w8v4ggckkk@www.sms.ed.ac.uk> <4A28EC37.10205@ar.media.kyoto-u.ac.jp> <4A28F16C.3070607@ar.media.kyoto-u.ac.jp> <4A2955F5.9030006@hawaii.edu> <4A295A7D.1040407@ar.media.kyoto-u.ac.jp> <7f014ea60906051437p6002c6f7jb2d77374a220d2aa@mail.gmail.com> <7f014ea60906060959s6570cc32l277c5ab423f0b9ed@mail.gmail.com> <1244362840.4377.10.camel@gabriel-desktop> <1244368329.4377.18.camel@gabriel-desktop> <4A2B8A51.3070409@ar.media.kyoto-u.ac.jp> Message-ID: <1244378624.4377.51.camel@gabriel-desktop> On Sun, 2009-06-07 at 18:37 +0900, David Cournapeau wrote: > That's why compiling atlas by yourself is hard, and I generally advise > against it: there is nothing intrinsically hard about it, but you need > to know a lot of small details and platform oddities to get it right > every time. 
That's just a waste of time in most cases IMHO, unless all > you do with numpy is inverting big matrices, I have been trying intel mkl and icc compiler instead, with no luck. I run into the same problem during setup as reported here: http://www.mail-archive.com/numpy-discussion at scipy.org/msg16595.html Sigh. I guess I should not get into these matters anyway; I am just a simple and humble user... As far as I understand the Ubuntu atlas problems have been found for complex types, which I don't use except for fft. I guess I'll continue to use the ubuntu libraries then and hope for better days in the future. Best, Gabriel From josef.pktd at gmail.com Sun Jun 7 08:50:00 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sun, 7 Jun 2009 08:50:00 -0400 Subject: [Numpy-discussion] BigInteger equivalent in numpy In-Reply-To: <4A2B90F0.8020503@ar.media.kyoto-u.ac.jp> References: <4A27BBD5.5010306@googlemail.com> <1cd32cbb0906040555p6ad34e6i406b95a3626301c9@mail.gmail.com> <4A2B8B1B.6020701@googlemail.com> <4A2B90F0.8020503@ar.media.kyoto-u.ac.jp> Message-ID: <1cd32cbb0906070550p26684553n3eee14e54b496821@mail.gmail.com> On Sun, Jun 7, 2009 at 6:05 AM, David Cournapeau wrote: > (Please do not send twice your message to numpy/scipy ML, thank you) > > wierob wrote: >> Hi, >> >> int64 and float seem to work for the stderr calculation. Now, the >> calculation of the p-value causes an underflow. >> >> File "C:\Python26\lib\site-packages\scipy\stats\distributions.py", line 2829,in _cdf >> ? ? return special.stdtr(df, x) >> FloatingPointError: underflow encountered in stdtr >> >> It seems that the error occurs in scipy.special.stdtr(df, x) if df = >> array([13412]) and x = array([61.88071696]). >> > > The error seems to happen in the cephes library (the library we use for > many functions in scipy.special, including this one). I don't know > exactly what triggers it (your df is quite high, though, and the cdf is > already quite past the 1-eps at your point). > The result is correct, and we don't get the exception at the default warning/error level, so this looks ok to me. >>> stats.t._cdf(61.88071696, 13412) 1.0 >>> stats.norm._cdf(61.88071696) 1.0 >>> stats.norm._sf(61.88071696) 0.0 >>> stats.norm._ppf(1-1e-16) 8.2095361516013874 Using a double precision library won't help it find out whether the answer is (1 - 1-e30) or (1 - 1-e80) I made comments on a similar case yesterday where the overflow actually shows up in the result without any warning or exception. http://projects.scipy.org/scipy/ticket/962 But I only know about floating point precision what I learned in these news groups, and playing with limiting cases in stats.distributions. 
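Concretely, doubles simply cannot tell those two candidates apart. A two-line check:

import numpy as np
print np.finfo(np.float64).eps  # ~2.22e-16, the spacing of doubles near 1.0
print (1.0 - 1e-30) == 1.0      # True: anything closer to 1 rounds to exactly 1.0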
using error level to raise doesn't seem useful if you want for example to work with include floating point inf >>> np.seterr(all="raise") {'over': 'ignore', 'divide': 'ignore', 'invalid': 'ignore', 'under': 'ignore'} >>> x = np.array([0,1]) >>> x array([0, 1]) >>> x/x[0] Traceback (most recent call last): File "", line 1, in x/x[0] FloatingPointError: divide by zero encountered in divide >>> x = np.array([0.,1.]) >>> x/x[0] Traceback (most recent call last): File "", line 1, in x/x[0] FloatingPointError: divide by zero encountered in divide >>> np.seterr(all="ignore") {'over': 'raise', 'divide': 'raise', 'invalid': 'raise', 'under': 'raise'} >>> x/x[0] array([ NaN, Inf]) Josef From bsouthey at gmail.com Sun Jun 7 09:19:29 2009 From: bsouthey at gmail.com (Bruce Southey) Date: Sun, 7 Jun 2009 08:19:29 -0500 Subject: [Numpy-discussion] Functions for indexing into certain parts of an array (2d) In-Reply-To: References: <3d375d730906060009r6efc1b70y6778408541fe6ce3@mail.gmail.com> <3d375d730906061401i611e2fe5j1d09c24cf8e49682@mail.gmail.com> Message-ID: On Sun, Jun 7, 2009 at 3:37 AM, Fernando Perez wrote: > On Sun, Jun 7, 2009 at 1:28 AM, Fernando Perez wrote: >> >> OK. ?Will send it in when I know whether you'd want the fill_diagonal >> one, and where that should go. >> > > One more question. ?For these *_indices() functions, would you want an > interface that accepts *either* > > diag_indices(size,ndim) As I indicated above, this is unacceptable for the apparent usage. I do not understand what is expected with the ndim argument. If it is the indices of an array elements of the form: [0][0][0], [1][1][1], ... [k][k][k] where k=min(a.shape) for some array a then an ndim args is total redundant (although using shape is not correct for 1-d arrays). This is different than the diagonals of two 2-d arrays from an shape of 2 by 3 by 4 or some other expectation. > > or > > diag_indices(anarray) > +1 Bruce From zelbier at gmail.com Sun Jun 7 09:51:55 2009 From: zelbier at gmail.com (Olivier Verdier) Date: Sun, 7 Jun 2009 15:51:55 +0200 Subject: [Numpy-discussion] matrix default to column vector? In-Reply-To: <23910425.post@talk.nabble.com> References: <75c31b2a0905210610x1a321264r5b6f93d327ef2b36@mail.gmail.com> <4A29561D.5070806@american.edu> <4A2A8B50.9030904@american.edu> <23907204.post@talk.nabble.com> <23910425.post@talk.nabble.com> Message-ID: There are two solutions to the A*B*C problem that are not quite comparable, and are not mutually exclusive either. 1) allow dot(A,B,C): this would be a great improvement over dot(dot(A,B),C), and it could virtually be done within a day. It is easy to implement, does not require a new syntax, and does not break BC 2) another solution, not incompatible with the first one, is to introduce a new operator in the python language. In the case that it be accepted by the python community at large (which is very unlikely, IMHO), be prepared to a very long time before it is actually implemented. We are talking about several years. I think that solution 1) is much more realistic than 2) (and again, they are not mutually exclusive, so implementing 1) does not preclude for a future implementation of 2)). Implementation of 1) would be quite nice when multiplication of several matrices is concerned. == Olivier 2009/6/7 Tom K. > > > Olivier Verdier-2 wrote: > > > > There would be a much simpler solution than allowing a new operator. Just > > allow the numpy function dot to take more than two arguments. 
> > Then A*B*C in matrix notation would simply be:
> >
> > dot(A,B,C)
> >
> > with arrays. Wouldn't that make everybody happy? Plus it does not break
> > backward compatibility. Am I missing something?
> >
>
> That wouldn't make me happy because it is not the same syntax as a binary
> infix operator. Introducing a new operator for matrix multiply (and
> possibly matrix exponentiation) does not break backward compatibility - how
> could it, given that the python language does not yet support the new
> operator?
>
> Going back to Alan Isaac's example:
> 1) beta = (X.T*X).I * X.T * Y
> 2) beta = np.dot(np.dot(la.inv(np.dot(X.T,X)),X.T),Y)
>
> With multiple arguments to dot, 2) becomes:
> 3) beta = np.dot(la.inv(np.dot(X.T, X)), X.T, Y)
>
> This is somewhat better than 2) but not as nice as 1) IMO.
>
> Seeing 1) with @'s would take some getting used to, but I think we would
> adjust.
>
> For ".I" I would propose that ".I" be added to nd-arrays that inverts each
> matrix of the last two dimensions, so for example if X is 3D then X.I is
> the same as np.array([inv(Xi) for Xi in X]). This is also backwards
> compatible. With this behavior and the one I proposed for @, by adding
> preceding dimensions we are allowing matrix algebra on collections of
> matrices (although it looks like we might need a new .T that just swaps
> the last two dimensions to really pull that off). But a ".I" attribute
> and its behavior needn't be bundled with whatever proposal we wish to
> make to the python community for a new operator, of course.
>
> Regards,
> Tom K.
> --
> View this message in context:
> http://www.nabble.com/matrix-default-to-column-vector--tp23652920p23910425.html
> Sent from the Numpy-discussion mailing list archive at Nabble.com.
>
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From sccolbert at gmail.com  Sun Jun  7 12:49:48 2009
From: sccolbert at gmail.com (Chris Colbert)
Date: Sun, 7 Jun 2009 12:49:48 -0400
Subject: [Numpy-discussion] is my numpy installation using custom blas/lapack?
In-Reply-To: <000325575efe8e92b6046bba51db@google.com>
References: <7f014ea60906061511k7f58d501yf90a29932ce696b8@mail.gmail.com> <000325575efe8e92b6046bba51db@google.com>
Message-ID: <7f014ea60906070949p4e167ac5y9d2077e3d3fc991d@mail.gmail.com>

Based on the site.cfg you posted earlier, your install seems to be
working fine. Your site.cfg doesn't have threaded libraries enabled. If
you want to add threads (assuming you built the threaded atlas .so's),
your site.cfg should look like this (assuming the threaded libs are in
the same directory as the non-threaded libs):

##### site.cfg ###########
[DEFAULT]
library_dirs = /usr/local/rich/src/scipy_build/lib:$HOME/usr/galois/lib
include_dirs = /usr/local/rich/src/scipy_build/lib/include:$HOME/usr/galois/include

[blas_opt]
libraries = ptf77blas, ptcblas, atlas

[lapack_opt]
libraries = lapack, ptf77blas, ptcblas, atlas

[amd]
amd_libs = amd

[umfpack]
umfpack_libs = umfpack, gfortran

[fftw]
libraries = fftw3

Notice the pt prefix on the atlas and blas libs. When you ran
numpy.show_config(), it showed that it is only using the single-threaded
libs, which is what you told it to do in site.cfg when you built it.
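One more quick check along the same lines (a sketch; the module below is
an internal detail of numpy builds from this era, so treat the exact name
as an assumption): numpy only compiles its optimized dot when a CBLAS
such as ATLAS is found at build time, so whether the import succeeds
tells you whether your ATLAS was actually picked up.

try:
    import numpy.core._dotblas   # only built when a CBLAS (e.g. ATLAS) was found
    print "np.dot is using the optimized BLAS"
except ImportError:
    print "np.dot fell back to the unoptimized default implementation"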
Change your site.cfg rebuild & reinstall and you should be fine Chris On Sun, Jun 7, 2009 at 12:11 AM, wrote: > Hi, > > On Jun 6, 2009 3:11pm, Chris Colbert wrote: >> it definately found your threaded atlas libraries. How do you know >> >> it's numpy is using lapack_lite? > > I don't, actually. But it is importing it. With python -v, this is the error > I get if I don't set LD_LIBRARY_PATH to my scipy_build directory > > > import numpy.linalg.linalg # precompiled from > /data10/users/rich/usr/galois/lib64/python/numpy/linalg/linalg.pyc > dlopen("/data10/users/rich/usr/galois/lib64/python/numpy/linalg/lapack_lite.so", > 2); > Traceback (most recent call last): > File "", line 1, in > File "/data10/users/rich/usr/galois//lib64/python/numpy/__init__.py", line > 130, in > import add_newdocs > File "/data10/users/rich/usr/galois//lib64/python/numpy/add_newdocs.py", > line 9, in > from lib import add_newdoc > File "/data10/users/rich/usr/galois//lib64/python/numpy/lib/__init__.py", > line 13, in > from polynomial import * > File "/data10/users/rich/usr/galois//lib64/python/numpy/lib/polynomial.py", > line 18, in > from numpy.linalg import eigvals, lstsq > File "/data10/users/rich/usr/galois//lib64/python/numpy/linalg/__init__.py", > line 47, in > from linalg import * > File "/data10/users/rich/usr/galois//lib64/python/numpy/linalg/linalg.py", > line 22, in > from numpy.linalg import lapack_lite > ImportError: liblapack.so: cannot open shared object file: No such file or > directory >>>> > > > Here blas_opt_info seems to be missing ATLAS version. > >>>> numpy.show_config() > atlas_threads_info: > libraries = ['lapack', 'lapack', 'f77blas', 'cblas', 'atlas'] > library_dirs = ['/usr/local/rich/src/scipy_build/lib'] > language = f77 > > blas_opt_info: > libraries = ['lapack', 'f77blas', 'cblas', 'atlas'] > library_dirs = ['/usr/local/rich/src/scipy_build/lib'] > define_macros = [('NO_ATLAS_INFO', 2)] > language = c > > atlas_blas_threads_info: > libraries = ['lapack', 'f77blas', 'cblas', 'atlas'] > library_dirs = ['/usr/local/rich/src/scipy_build/lib'] > language = c > > lapack_opt_info: > libraries = ['lapack', 'lapack', 'f77blas', 'cblas', 'atlas'] > library_dirs = ['/usr/local/rich/src/scipy_build/lib'] > define_macros = [('NO_ATLAS_INFO', 2)] > language = f77 > > lapack_mkl_info: > NOT AVAILABLE > > blas_mkl_info: > NOT AVAILABLE > > mkl_info: > NOT AVAILABLE > > >> >> >> >> when I do: >> >> >> >> python >> >> >>import numpy >> >> >>numpy.show_config() >> >> atlas_threads_info: >> >> ? ?libraries = ['lapack', 'ptf77blas', 'ptcblas', 'atlas'] >> >> ? ?library_dirs = ['/usr/local/lib'] >> >> ? ?language = f77 >> >> >> >> blas_opt_info: >> >> ? ?libraries = ['ptf77blas', 'ptcblas', 'atlas'] >> >> ? ?library_dirs = ['/usr/local/lib'] >> >> ? ?define_macros = [('ATLAS_INFO', '"\\"3.8.3\\""')] >> >> ? ?language = c >> >> >> >> atlas_blas_threads_info: >> >> ? ?libraries = ['ptf77blas', 'ptcblas', 'atlas'] >> >> ? ?library_dirs = ['/usr/local/lib'] >> >> ? ?language = c >> >> >> >> lapack_opt_info: >> >> ? ?libraries = ['lapack', 'ptf77blas', 'ptcblas', 'atlas'] >> >> ? ?library_dirs = ['/usr/local/lib'] >> >> ? ?define_macros = [('NO_ATLAS_INFO', 2)] >> >> ? 
?language = f77 >> >> >> >> lapack_mkl_info: >> >> ?NOT AVAILABLE >> >> >> >> blas_mkl_info: >> >> ?NOT AVAILABLE >> >> >> >> mkl_info: >> >> ?NOT AVAILABLE >> >> >> >> >> >> also try: >> >> >>> a = numpy.random.randn(6000, 6000) >> >> >>> numpy.dot(a,a) >> >> >> >> and make sure all your cpu cores peg at 100% >> >> >> > > Unfortunately only one cpu. What does that mean? Threaded libraries not > used? > > from top: > > Cpu0 :100.0%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st > Cpu1 : 0.0%us, 0.2%sy, 0.0%ni, 99.4%id, 0.0%wa, 0.2%hi, 0.2%si, 0.0%st > Cpu2 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st > Cpu3 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st > > Thanks much for the help. > > Rich > >> >> >> On Sat, Jun 6, 2009 at 3:35 PM, llewelr at gmail.com> wrote: >> >> > Oops. Thanks, that makes more sense: >> >> > >> >> > http://pastebin.com/m7067709b >> >> > >> >> > On Jun 6, 2009 12:15pm, Chris Colbert sccolbert at gmail.com> wrote: >> >> >> i need the full link to pastebin.com in order to view your post. >> >> >> >> >> >> >> >> >> >> >> >> It will be something like http://pastebin.com/m6b09f05c >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> chris >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> On Sat, Jun 6, 2009 at 2:32 PM, Richard Llewellynllewelr at gmail.com> >> >> wrote: >> >> >> >> >> >> > I posted the setup.py build output to pastebin.com, though missed the >> >> >> >> >> >> > uninteresting stderr (forgot tcsh command to redirect both). >> >> >> >> >> >> > Also, used setup.py build --fcompiler=gnu95. >> >> >> >> >> >> > >> >> >> >> >> >> > >> >> >> >> >> >> > To be clear, I am not certain that my ATLAS libraries are not found. >> >> > But >> >> >> >> >> >> > during the build starting at line 95 (pastebin.com) I see a >> >> > compilation >> >> >> >> >> >> > failure, and then NO_ATLAS_INFO, 2. >> >> >> >> >> >> > >> >> >> >> >> >> > I don't think I can use ldconfig without root, but have set >> >> >> > LD_LIBRARY_PATH >> >> >> >> >> >> > to point to the scipy_build/lib until I put them somewhere else. >> >> >> >> >> >> > >> >> >> >> >> >> > importing numpy works, though lapack_lite is also imported. I wonder >> >> > if >> >> >> > this >> >> >> >> >> >> > is normal even if my ATLAS was used. >> >> >> >> >> >> > >> >> >> >> >> >> > Thanks, >> >> >> >> >> >> > Rich >> >> >> >> >> >> > >> >> >> >> >> >> > On Sat, Jun 6, 2009 at 10:46 AM, Chris Colbert sccolbert at gmail.com> >> >> >> > wrote: >> >> >> >> >> >> >> >> >> >> >> >> >> >> and where exactly are you seeing atlas not found? during the build >> >> >> >> >> >> >> process, are when import numpy in python? 
>> >> >> >> >> >> >> >> >> >> >> >> >> >> if its the latter, you need to add a .conf file ?in >> >> >> /etc/ld.so.conf.d/ >> >> >> >> >> >> >> ?with the line /usr/local/rich/src/scipy_build/lib ?and then run >> >> >> ?sudo >> >> >> >> >> >> >> ldconfig >> >> >> >> >> >> >> >> >> >> >> >> >> >> Chris >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> On Sat, Jun 6, 2009 at 1:42 PM, Chris Colbertsccolbert at gmail.com> >> >> >> >> wrote: >> >> >> >> >> >> >> > can you run this and post the build.log to pastebin.com: >> >> >> >> >> >> >> > >> >> >> >> >> >> >> > assuming your numpy build directory is /home/numpy-1.3.0: >> >> >> >> >> >> >> > >> >> >> >> >> >> >> > cd /home/numpy-1.3.0 >> >> >> >> >> >> >> > rm -rf build >> >> >> >> >> >> >> > python setup.py build &&> build.log >> >> >> >> >> >> >> > >> >> >> >> >> >> >> > >> >> >> >> >> >> >> > Chris >> >> >> >> >> >> >> > >> >> >> >> >> >> >> > >> >> >> >> >> >> >> > On Sat, Jun 6, 2009 at 1:37 PM, Richard >> >> >> > Llewellynllewelr at gmail.com> >> >> >> >> >> >> >> > wrote: >> >> >> >> >> >> >> >> Hi Chris, >> >> >> >> >> >> >> >> ?thanks much for posting those installation instructions.? Seems >> >> >> >> >> >> >> >> similar to >> >> >> >> >> >> >> >> what I pieced together. >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> I gather ATLAS not found.? Oops, drank that beer too early. >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> I copied Atlas libs to /usr/local/rich/src/scipy_build/lib. >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> This is my site.cfg.? Out of desperation I tried >> >> >> >> search_static_first >> >> >> >> >> = >> >> >> >> >> >> >> >> 1, >> >> >> >> >> >> >> >> but probably of no use. >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> [DEFAULT] >> >> >> >> >> >> >> >> library_dirs = >> >> >> >> >> /usr/local/rich/src/scipy_build/lib:$HOME/usr/galois/lib >> >> >> >> >> >> >> >> include_dirs = >> >> >> >> >> >> >> >> >> >> >> >> /usr/local/rich/src/scipy_build/lib/include:$HOME/usr/galois/include >> >> >> >> >> >> >> >> search_static_first = 1 >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> [blas_opt] >> >> >> >> >> >> >> >> libraries = f77blas, cblas, atlas >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> [lapack_opt] >> >> >> >> >> >> >> >> libraries = lapack, f77blas, cblas, atlas >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> [amd] >> >> >> >> >> >> >> >> amd_libs = amd >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> [umfpack] >> >> >> >> >> >> >> >> umfpack_libs = umfpack, gfortran >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> [fftw] >> >> >> >> >> >> >> >> libraries = fftw3 >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> Rich >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> On Sat, Jun 6, 2009 at 10:25 AM, Chris Colbert >> >> >> >> sccolbert at gmail.com> >> >> >> >> >> >> >> >> wrote: >> >> >> >> >> >> >> >>> >> >> >> >> >> >> >> >>> when you build numpy, did you use site.cfg to tell it where to >> >> >> >>> find >> >> >> >> >> >> >> >>> your atlas libs? >> >> >> >> >> >> >> >>> >> >> >> >> >> >> >> >>> On Sat, Jun 6, 2009 at 1:02 PM, Richard >> >> >> >>> Llewellynllewelr at gmail.com> >> >> >> >> >> >> >> >>> wrote: >> >> >> >> >> >> >> >>> > Hello, >> >> >> >> >> >> >> >>> > >> >> >> >> >> >> >> >>> > I've managed a build of lapack and atlas on Fedora 10 on a >> >> >> >>> > quad >> >> >> >> >> >> >> >>> > core, >> >> >> >> >> >> >> >>> > 64, >> >> >> >> >> >> >> >>> > and now (...) 
have a numpy I can import that runs tests ok. :] >> >> >> >> >>> > I >> >> >> >> >> >> >> >>> > am >> >> >> >> >> >> >> >>> > puzzled, however, that numpy builds and imports lapack_lite. >> >> >> >> >>> > Does >> >> >> >> >> >> >> >>> > this >> >> >> >> >> >> >> >>> > mean >> >> >> >> >> >> >> >>> > I have a problem with the build(s)? >> >> >> >> >> >> >> >>> > Upon building numpy, I see the troubling output: >> >> >> >> >> >> >> >>> > >> >> >> >> >> >> >> >>> > ######################## >> >> >> >> >> >> >> >>> > >> >> >> >> >> >> >> >>> > C compiler: gcc -pthread -fno-strict-aliasing -DNDEBUG -O2 -g >> >> >> >> >>> > -pipe >> >> >> >> >> >> >> >>> > -Wall >> >> >> >> >> >> >> >>> > -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protecto >> >> >> >> >> >> >> >>> > r --param=ssp-buffer-size=4 -m64 -mtune=generic -D_GNU_SOURCE >> >> >> >> >>> > -fPIC >> >> >> >> >> >> >> >>> > -fPIC >> >> >> >> >> >> >> >>> > >> >> >> >> >> >> >> >>> > compile options: '-c' >> >> >> >> >> >> >> >>> > gcc: _configtest.c >> >> >> >> >> >> >> >>> > gcc -pthread _configtest.o >> >> >> >>> > -L/usr/local/rich/src/scipy_build/lib >> >> >> >> >> >> >> >>> > -llapack >> >> >> >> >> >> >> >>> > -lptf77blas -lptcblas -latlas -o _configtest >> >> >> >> >> >> >> >>> > /usr/bin/ld: _configtest: hidden symbol `__powidf2' in >> >> >> >> >> >> >> >>> > /usr/lib/gcc/x86_64-redhat-linux/4.3.2/libgcc.a(_powidf2.o) is >> >> >> >> >> >> >> >>> > reference >> >> >> >> >> >> >> >>> > d by DSO >> >> >> >> >> >> >> >>> > /usr/bin/ld: final link failed: Nonrepresentable section on >> >> >> >> >>> > output >> >> >> >> >> >> >> >>> > collect2: ld returned 1 exit status >> >> >> >> >> >> >> >>> > /usr/bin/ld: _configtest: hidden symbol `__powidf2' in >> >> >> >> >> >> >> >>> > /usr/lib/gcc/x86_64-redhat-linux/4.3.2/libgcc.a(_powidf2.o) is >> >> >> >> >> >> >> >>> > reference >> >> >> >> >> >> >> >>> > d by DSO >> >> >> >> >> >> >> >>> > /usr/bin/ld: final link failed: Nonrepresentable section on >> >> >> >> >>> > output >> >> >> >> >> >> >> >>> > collect2: ld returned 1 exit status >> >> >> >> >> >> >> >>> > failure. >> >> >> >> >> >> >> >>> > removing: _configtest.c _configtest.o >> >> >> >> >> >> >> >>> > Status: 255 >> >> >> >> >> >> >> >>> > Output: >> >> >> >> >> >> >> >>> > ? FOUND: >> >> >> >> >> >> >> >>> > ??? libraries = ['lapack', 'ptf77blas', 'ptcblas', 'atlas'] >> >> >> >> >> >> >> >>> > ??? library_dirs = ['/usr/local/rich/src/scipy_build/lib'] >> >> >> >> >> >> >> >>> > ??? language = f77 >> >> >> >> >> >> >> >>> > ??? define_macros = [('NO_ATLAS_INFO', 2)] >> >> >> >> >> >> >> >>> > >> >> >> >> >> >> >> >>> > ########################## >> >> >> >> >> >> >> >>> > >> >> >> >> >> >> >> >>> > I don't have root on this machine, but could pester admins for >> >> >> >> >> >> >> >>> > eventual >> >> >> >> >> >> >> >>> > temporary access. 
>> >> >> >> >> >> >> >>> > >> >> >> >> >> >> >> >>> > Thanks much for any help, >> >> >> >> >> >> >> >>> > Rich >> >> >> >> >> >> >> >>> > >> >> >> >> >> >> >> >>> > _______________________________________________ >> >> >> >> >> >> >> >>> > Numpy-discussion mailing list >> >> >> >> >> >> >> >>> > Numpy-discussion at scipy.org >> >> >> >> >> >> >> >>> > http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> >> >> >> >> >> >>> > >> >> >> >> >> >> >> >>> > >> >> >> >> >> >> >> >>> _______________________________________________ >> >> >> >> >> >> >> >>> Numpy-discussion mailing list >> >> >> >> >> >> >> >>> Numpy-discussion at scipy.org >> >> >> >> >> >> >> >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> _______________________________________________ >> >> >> >> >> >> >> >> Numpy-discussion mailing list >> >> >> >> >> >> >> >> Numpy-discussion at scipy.org >> >> >> >> >> >> >> >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> > >> >> >> >> >> >> >> _______________________________________________ >> >> >> >> >> >> >> Numpy-discussion mailing list >> >> >> >> >> >> >> Numpy-discussion at scipy.org >> >> >> >> >> >> >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> >> >>% > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From sccolbert at gmail.com Sun Jun 7 14:44:27 2009 From: sccolbert at gmail.com (Chris Colbert) Date: Sun, 7 Jun 2009 14:44:27 -0400 Subject: [Numpy-discussion] performance matrix multiplication vs. matlab In-Reply-To: <1244362840.4377.10.camel@gabriel-desktop> References: <20090604143638.xrqzx8w8v4ggckkk@www.sms.ed.ac.uk> <4A28F16C.3070607@ar.media.kyoto-u.ac.jp> <4A2955F5.9030006@hawaii.edu> <4A295A7D.1040407@ar.media.kyoto-u.ac.jp> <7f014ea60906051437p6002c6f7jb2d77374a220d2aa@mail.gmail.com> <7f014ea60906060959s6570cc32l277c5ab423f0b9ed@mail.gmail.com> <1244362840.4377.10.camel@gabriel-desktop> Message-ID: <7f014ea60906071144m60580f58yac5e18d1d1e5bca9@mail.gmail.com> thanks for catching the typos! Chris On Sun, Jun 7, 2009 at 4:20 AM, Gabriel Beckers wrote: > On Sat, 2009-06-06 at 12:59 -0400, Chris Colbert wrote: >> ../configure -b 64 -D c -DPentiumCPS=2400 -Fa ?-alg -fPIC >> --with-netlib-lapack=/home/your-user-name/build/lapack/lapack-3.2.1/Lapack_LINUX.a > > Many thanks Chris, I succeeded in building it. > > The configure command above contained two problems that I had to correct > to get it to work though. > > In case other people are trying this, I used: > > ../configure -b 32 -D c -DPentiumCPS=1800 -Fa alg -fPIC > --with-netlib-lapack=/home/your-user-name/build/lapack/lapack-3.2.1/lapack_LINUX.a > > That is (in addition to the different -b switch for my 32-bit machine > and the different processor speed): the dash before "alg" should be > removed, and "Lapack_LINUX.a" should be "lapack_LINUX.a". > > Gabriel > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From sccolbert at gmail.com Sun Jun 7 14:47:13 2009 From: sccolbert at gmail.com (Chris Colbert) Date: Sun, 7 Jun 2009 14:47:13 -0400 Subject: [Numpy-discussion] performance matrix multiplication vs. 
matlab In-Reply-To: <1244368329.4377.18.camel@gabriel-desktop> References: <20090604143638.xrqzx8w8v4ggckkk@www.sms.ed.ac.uk> <4A28F16C.3070607@ar.media.kyoto-u.ac.jp> <4A2955F5.9030006@hawaii.edu> <4A295A7D.1040407@ar.media.kyoto-u.ac.jp> <7f014ea60906051437p6002c6f7jb2d77374a220d2aa@mail.gmail.com> <7f014ea60906060959s6570cc32l277c5ab423f0b9ed@mail.gmail.com> <1244362840.4377.10.camel@gabriel-desktop> <1244368329.4377.18.camel@gabriel-desktop> Message-ID: <7f014ea60906071147k652420b5qb55d6f231340d89c@mail.gmail.com> when i had problems building atlas in the past (i.e. numpy.test() failed) it was a problem with my lapack build, not atlas. The netlib website gives instructions for building the lapack test suite. I suggest you do that and run the tests on lapack and make sure everything is kosher. Chris On Sun, Jun 7, 2009 at 5:52 AM, Gabriel Beckers wrote: > OK, perhaps I drank that beer too soon... > > Now, numpy.test() hangs at: > > test_pinv (test_defmatrix.TestProperties) ... > > So perhaps something is wrong with ATLAS, even though the building went > fine, and "make check" and "make ptcheck" reported no errors. > > Gabriel > > On Sun, 2009-06-07 at 10:20 +0200, Gabriel Beckers wrote: >> On Sat, 2009-06-06 at 12:59 -0400, Chris Colbert wrote: >> > ../configure -b 64 -D c -DPentiumCPS=2400 -Fa ?-alg -fPIC >> > --with-netlib-lapack=/home/your-user-name/build/lapack/lapack-3.2.1/Lapack_LINUX.a >> >> Many thanks Chris, I succeeded in building it. >> >> The configure command above contained two problems that I had to correct >> to get it to work though. >> >> In case other people are trying this, I used: >> >> ../configure -b 32 -D c -DPentiumCPS=1800 -Fa alg -fPIC >> --with-netlib-lapack=/home/your-user-name/build/lapack/lapack-3.2.1/lapack_LINUX.a >> >> That is (in addition to the different -b switch for my 32-bit machine >> and the different processor speed): the dash before "alg" should be >> removed, and "Lapack_LINUX.a" should be "lapack_LINUX.a". >> >> Gabriel >> >> _______________________________________________ >> Numpy-discussion mailing list >> Numpy-discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From robert.kern at gmail.com Sun Jun 7 15:01:53 2009 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 7 Jun 2009 14:01:53 -0500 Subject: [Numpy-discussion] matrix default to column vector? In-Reply-To: References: <75c31b2a0905210610x1a321264r5b6f93d327ef2b36@mail.gmail.com> <4A29561D.5070806@american.edu> <4A2A8B50.9030904@american.edu> <23907204.post@talk.nabble.com> <3d375d730906070056j43b6e16fp818011701dfa6e9e@mail.gmail.com> Message-ID: <3d375d730906071201k6ce4cd37y4afd814d9a551aa0@mail.gmail.com> On Sun, Jun 7, 2009 at 04:44, Olivier Verdier wrote: > Yes, I found the thread you are referring > to:?http://mail.python.org/pipermail/python-dev/2008-July/081554.html > However, since A*B*C exists for matrices and actually computes (A*B)*C, why > not do the same with dot? I.e. why not decide that dot(A,B,C) does what > would A*B*C do, i.e., dot(dot(A,B),C)? > The performance and precision problems are the responsability of the user, > just as with the formula A*B*C. I'm happy to make the user responsible for performance and precision problems if he has the tools to handle them. The operator gives the user the easy ability to decide the precedence with parentheses. 
The function does not. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From robert.kern at gmail.com Sun Jun 7 15:08:29 2009 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 7 Jun 2009 14:08:29 -0500 Subject: [Numpy-discussion] matrix default to column vector? In-Reply-To: <23910425.post@talk.nabble.com> References: <75c31b2a0905210610x1a321264r5b6f93d327ef2b36@mail.gmail.com> <4A29561D.5070806@american.edu> <4A2A8B50.9030904@american.edu> <23907204.post@talk.nabble.com> <23910425.post@talk.nabble.com> Message-ID: <3d375d730906071208y7e3437d7xeeee9e321d0f78dc@mail.gmail.com> On Sun, Jun 7, 2009 at 07:20, Tom K. wrote: > > > Olivier Verdier-2 wrote: >> >> There would be a much simpler solution than allowing a new operator. Just >> allow the numpy function dot to take more than two arguments. Then A*B*C >> in >> matrix notation would simply be: >> dot(A,B,C) >> >> with arrays. Wouldn't that make everybody happy? Plus it does not break >> backward compatibility. Am I missing something? >> > > That wouldn't make me happy because it is not the same syntax as a binary > infix operator. ?Introducing a new operator for matrix multiply (and > possibly matrix exponentiation) does not break backward compatibility - how > could it, given that the python language does not yet support the new > operator? > > Going back to Alan Isaac's example: > 1) ?beta = (X.T*X).I * X.T * Y > 2) ?beta = np.dot(np.dot(la.inv(np.dot(X.T,X)),X.T),Y) > > With a multiple arguments to dot, 2) becomes: > 3) ?beta = np.dot(la.inv(np.dot(X.T, X)), X.T, Y) > > This is somewhat better than 2) but not as nice as 1) IMO. 4) beta = la.lstsq(X, Y)[0] I really hate that example. > Seeing 1) with @'s would take some getting used but I think we would adjust. > > For ".I" I would propose that ".I" be added to nd-arrays that inverts each > matrix of the last two dimensions, so for example if X is 3D then X.I is the > same as np.array([inv(Xi) for Xi in X]). ?This is also backwards compatible. > With this behavior and the one I proposed for @, by adding preceding > dimensions we are allowing doing matrix algebra on collections of matrices > (although it looks like we might need a new .T that just swaps the last two > dimensions to really pull that off). ?But a ".I" attribute and its behavior > needn't be bundled with whatever proposal we wish to make to the python > community for a new operator of course. I am vehemently against adding .I to ndarray. I want to *discourage* the formation of explicit inverses. It is almost always a very wrong thing to do. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From tpk at kraussfamily.org Sun Jun 7 15:13:10 2009 From: tpk at kraussfamily.org (Tom K.) Date: Sun, 7 Jun 2009 12:13:10 -0700 (PDT) Subject: [Numpy-discussion] matrix default to column vector? 
In-Reply-To: References: <75c31b2a0905210610x1a321264r5b6f93d327ef2b36@mail.gmail.com> <23693116.post@talk.nabble.com> <4A283BD5.4080405@noaa.gov> <3699E128-2BCA-45E1-97A1-B0DE3B8E8B74@mac.com> <4A283F96.1050103@american.edu> <513936A2-C309-435B-B9AD-1B8B71D57B06@mac.com> <4A29561D.5070806@american.edu> <4A2A8B50.9030904@american.edu> <23907204.post@talk.nabble.com> <23910425.post@talk.nabble.com> Message-ID: <23914277.post@talk.nabble.com> Olivier Verdier-2 wrote: > > There are two solutions to the A*B*C problem that are not quite > comparable, > and are not mutually exclusive either. > 1) allow dot(A,B,C): this would be a great improvement over > dot(dot(A,B),C), > and it could virtually be done within a day. It is easy to implement, does > not require a new syntax, and does not break BC > > 2) another solution, not incompatible with the first one, is to introduce > a > new operator in the python language. In the case that it be accepted by > the > python community at large (which is very unlikely, IMHO), be prepared to a > very long time before it is actually implemented. We are talking about > several years. > > I think that solution 1) is much more realistic than 2) (and again, they > are > not mutually exclusive, so implementing 1) does not preclude for a future > implementation of 2)). > > Implementation of 1) would be quite nice when multiplication of several > matrices is concerned. > I agree these are not mutually exclusive. In other words, they are separate issues. So sure while I would not be *as* happy with a multi-input dot as I would with a new operator that could overload dot, I don't mean to discourage the multi-input dot from being pursued. By all means, it seems like a worthwhile addition if done correctly. I just don't think it solves the problem, that being how do we improve the semantics of numpy to be more "matrix" based. It is a *requirement* that the package support * (or some binary infix operator) for matrix mutliplication. We do this with the matrix type. But, almost all experienced users drift away from matrix toward array as they find the matrix class too limiting or strange - it seems only applicable for new users and pedagogical purposes. My own experience is that having two ways to do things makes software confusing and overly complex. It would be preferable to have a single class that supports matrix multiplication syntax. If that takes a year or two until it "hits the street" as a numpy release, so be it... we've got time. I think your time estimate is correct, although the actual time to implement and test the new python syntax would probably be much shorter - it is not so much a technical issue as one of socialization. What do we want to do? How can we convince the larger python community that this is a good idea for them too? Cheers, Tom K. -- View this message in context: http://www.nabble.com/matrix-default-to-column-vector--tp23652920p23914277.html Sent from the Numpy-discussion mailing list archive at Nabble.com. From tpk at kraussfamily.org Sun Jun 7 15:29:55 2009 From: tpk at kraussfamily.org (Tom K.) Date: Sun, 7 Jun 2009 12:29:55 -0700 (PDT) Subject: [Numpy-discussion] matrix default to column vector? 
In-Reply-To: <3d375d730906071208y7e3437d7xeeee9e321d0f78dc@mail.gmail.com> References: <75c31b2a0905210610x1a321264r5b6f93d327ef2b36@mail.gmail.com> <23693116.post@talk.nabble.com> <4A283BD5.4080405@noaa.gov> <3699E128-2BCA-45E1-97A1-B0DE3B8E8B74@mac.com> <4A283F96.1050103@american.edu> <513936A2-C309-435B-B9AD-1B8B71D57B06@mac.com> <4A29561D.5070806@american.edu> <4A2A8B50.9030904@american.edu> <23907204.post@talk.nabble.com> <23910425.post@talk.nabble.com> <3d375d730906071208y7e3437d7xeeee9e321d0f78dc@mail.gmail.com> Message-ID: <23914438.post@talk.nabble.com> Robert Kern-2 wrote: > > On Sun, Jun 7, 2009 at 07:20, Tom K. wrote: >> Going back to Alan Isaac's example: >> 1) ?beta = (X.T*X).I * X.T * Y > ... > 4) beta = la.lstsq(X, Y)[0] > > I really hate that example. > Understood. Maybe propose a different one? Robert Kern-2 wrote: > > >> Seeing 1) with @'s would take some getting used but I think we would >> adjust. >> >> For ".I" I would propose that ".I" be added to nd-arrays that inverts >> each >> matrix of the last two dimensions, so for example if X is 3D then X.I is >> the >> same as np.array([inv(Xi) for Xi in X]). ?This is also backwards >> compatible. >> With this behavior and the one I proposed for @, by adding preceding >> dimensions we are allowing doing matrix algebra on collections of >> matrices >> (although it looks like we might need a new .T that just swaps the last >> two >> dimensions to really pull that off). ?But a ".I" attribute and its >> behavior >> needn't be bundled with whatever proposal we wish to make to the python >> community for a new operator of course. > > I am vehemently against adding .I to ndarray. I want to *discourage* > the formation of explicit inverses. It is almost always a very wrong > thing to do. > You sound like Cleve Moler: always concerned about numeric fidelity. Point taken. - Tom K. -- View this message in context: http://www.nabble.com/matrix-default-to-column-vector--tp23652920p23914438.html Sent from the Numpy-discussion mailing list archive at Nabble.com. From mmueller at python-academy.de Sun Jun 7 18:25:51 2009 From: mmueller at python-academy.de (=?windows-1252?Q?Mike_M=FCller?=) Date: Mon, 08 Jun 2009 00:25:51 +0200 Subject: [Numpy-discussion] [ANN] EuroSciPy 2009 - Presentation Schedule Published Message-ID: <4A2C3E6F.7090305@python-academy.de> EuroSciPy 2009 Presentation Schedule Published ============================================== The schedule of presentations for the EuroSciPy conference is online: http://www.euroscipy.org/presentations/schedule.html We have 16 talks from a variety of scientific fields. All about using Python for scientific work. EuroSciPy 2009 ============== We're pleased to announce the EuroSciPy 2009 Conference to be held in Leipzig, Germany on July 25-26, 2009. http://www.euroscipy.org This is the second conference after the successful conference last year. Again, EuroSciPy will be a venue for the European community of users of the Python programming language in science. Registration ------------ Registration is open. The registration fee is 100.00 ? for early registrants and will increase to 150.00 ? for late registration after June 15, 2009. Registration will include breakfast, snacks and lunch for Saturday and Sunday. 
Please register here: http://www.euroscipy.org/registration.html Important Dates --------------- March 21 Registration opens May 8 Abstract submission deadline May 15 Acceptance of presentations May 30 Announcement of conference program June 15 Early bird registration deadline July 15 Slides submission deadline July 20 - 24 Pre-Conference courses July 25/26 Conference August 15 Paper submission deadline Venue ----- mediencampus Poetenweg 28 04155 Leipzig Germany See http://www.euroscipy.org/venue.html for details. Help Welcome ------------ You like to help make the EuroSciPy 2009 a success? Here are some ways you can get involved: * attend the conference * submit an abstract for a presentation * give a lightning talk * make EuroSciPy known: - distribute the press release (http://www.euroscipy.org/media.html) to scientific magazines or other relevant media - write about it on your website - in your blog - talk to friends about it - post to local e-mail lists - post to related forums - spread flyers and posters in your institution - make entries in relevant event calendars - anything you can think of * inform potential sponsors about the event * become a sponsor If you're interested in volunteering to help organize things or have some other idea that can help the conference, please email us at mmueller at python-academy dot de. Sponsorship ----------- Do you like to sponsor the conference? There are several options available: http://www.euroscipy.org/sponsors/become_a_sponsor.html Pre-Conference Courses ---------------------- Would you like to learn Python or about some of the most used scientific libraries in Python? Then the "Python Summer Course" [1] might be for you. There are two parts to this course: * a two-day course "Introduction to Python" [2] for people with programming experience in other languages and * a three-day course "Python for Scientists and Engineers" [3] that introduces some of the most used Python tools for scientists and engineers such as NumPy, PyTables, and matplotlib Both courses can be booked individually [4]. Of course, you can attend the courses without registering for EuroSciPy. [1] http://www.python-academy.com/courses/python_summer_course.html [2] http://www.python-academy.com/courses/python_course_programmers.html [3] http://www.python-academy.com/courses/python_course_scientists.html [4] http://www.python-academy.com/courses/dates.html From dwf at cs.toronto.edu Sun Jun 7 19:31:19 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Sun, 7 Jun 2009 19:31:19 -0400 Subject: [Numpy-discussion] Tuples vs. lists when defining recarrays with array() Message-ID: <64A74CFE-E8F9-4475-9910-DA8A4A96F3A5@cs.toronto.edu> A question was raised on the #scipy IRC earlier today, about the behaviour of array() with structured dtypes. After some educated guessing I figured out that for record arrays, tuples (rather than lists) must be used to indicate atomic elements. What I wondered is whether this behaviour is documented anywhere, and does it belong in the array() docstring, for example? The docstring currently reads "... or any (nested) sequence." In [57]: desc0 Out[57]: dtype([('id', '|O4'), ('val', '|O4'), ('date', '|O4')]) In [58]: values0 Out[58]: [9L, 1L, datetime.date(2009, 6, 7)] In [59]: arr = array(values0, dtype=desc0) --------------------------------------------------------------------------- ValueError Traceback (most recent call last) /Users/dwf/ in () ValueError: tried to set void-array with object members using buffer. 
In [60]: arr = array(tuple(values0), dtype=desc0)

In [61]: arr
Out[61]:
array((9L, 1L, datetime.date(2009, 6, 7)),
dtype=[('id', '|O4'), ('val', '|O4'), ('date', '|O4')])

- David

From efiring at hawaii.edu  Sun Jun  7 19:44:13 2009
From: efiring at hawaii.edu (Eric Firing)
Date: Sun, 07 Jun 2009 13:44:13 -1000
Subject: [Numpy-discussion] matrix default to column vector?
In-Reply-To: <3d375d730906071201k6ce4cd37y4afd814d9a551aa0@mail.gmail.com>
References: <75c31b2a0905210610x1a321264r5b6f93d327ef2b36@mail.gmail.com> <4A29561D.5070806@american.edu> <4A2A8B50.9030904@american.edu> <23907204.post@talk.nabble.com> <3d375d730906070056j43b6e16fp818011701dfa6e9e@mail.gmail.com> <3d375d730906071201k6ce4cd37y4afd814d9a551aa0@mail.gmail.com>
Message-ID: <4A2C50CD.206@hawaii.edu>

Robert Kern wrote:
> On Sun, Jun 7, 2009 at 04:44, Olivier Verdier wrote:
>> Yes, I found the thread you are referring
>> to: http://mail.python.org/pipermail/python-dev/2008-July/081554.html
>> However, since A*B*C exists for matrices and actually computes (A*B)*C, why
>> not do the same with dot? I.e. why not decide that dot(A,B,C) does what
>> A*B*C would do, i.e., dot(dot(A,B),C)?
>> The performance and precision problems are the responsibility of the user,
>> just as with the formula A*B*C.
>
> I'm happy to make the user responsible for performance and precision
> problems if he has the tools to handle them. The operator gives the
> user the easy ability to decide the precedence with parentheses. The
> function does not.
>

The function could, with suitable parsing of the argument(s):

(A*B)*C => dot( ((A,B),C) )   or  dot( (A,B), C )
A*(B*C) => dot( (A, (B,C)) )  or  dot( A, (B,C) )

Effectively, the comma is becoming the operator.

Eric

From aisaac at american.edu  Sun Jun  7 20:12:42 2009
From: aisaac at american.edu (Alan G Isaac)
Date: Sun, 07 Jun 2009 20:12:42 -0400
Subject: [Numpy-discussion] matrix default to column vector?
In-Reply-To: <4A2C50CD.206@hawaii.edu>
References: <75c31b2a0905210610x1a321264r5b6f93d327ef2b36@mail.gmail.com> <4A29561D.5070806@american.edu> <4A2A8B50.9030904@american.edu> <23907204.post@talk.nabble.com> <3d375d730906070056j43b6e16fp818011701dfa6e9e@mail.gmail.com> <3d375d730906071201k6ce4cd37y4afd814d9a551aa0@mail.gmail.com> <4A2C50CD.206@hawaii.edu>
Message-ID: <4A2C577A.5040006@american.edu>

>> On Sun, Jun 7, 2009 at 04:44, Olivier Verdier wrote:
>>> Yes, I found the thread you are referring
>>> to: http://mail.python.org/pipermail/python-dev/2008-July/081554.html
>>> However, since A*B*C exists for matrices and actually computes (A*B)*C, why
>>> not do the same with dot? I.e. why not decide that dot(A,B,C) does what
>>> A*B*C would do, i.e., dot(dot(A,B),C)?
>>> The performance and precision problems are the responsibility of the user,
>>> just as with the formula A*B*C.

> Robert Kern wrote:
>> I'm happy to make the user responsible for performance and precision
>> problems if he has the tools to handle them. The operator gives the
>> user the easy ability to decide the precedence with parentheses. The
>> function does not.

On 6/7/2009 7:44 PM Eric Firing apparently wrote:
> The function could, with suitable parsing of the argument(s):
> (A*B)*C => dot( ((A,B),C) )   or  dot( (A,B), C )
> A*(B*C) => dot( (A, (B,C)) )  or  dot( A, (B,C) )
> Effectively, the comma is becoming the operator.

Horribly implicit and hard to read! If something needs to be done to
make ``dot`` approach the convenience of ``*``, I think that adding
``dot`` as an array method looks attractive.
(A*B)*C => A.dot(B).dot(C)
A*(B*C) => A.dot( B.dot(C) )

But no matter how you slice it, the left hand expression is more compact
and easier to read.

fwiw,
Alan Isaac

From robert.kern at gmail.com  Sun Jun  7 20:52:28 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Sun, 7 Jun 2009 19:52:28 -0500
Subject: [Numpy-discussion] matrix default to column vector?
In-Reply-To: <4A2C50CD.206@hawaii.edu>
References: <75c31b2a0905210610x1a321264r5b6f93d327ef2b36@mail.gmail.com> <4A2A8B50.9030904@american.edu> <23907204.post@talk.nabble.com> <3d375d730906070056j43b6e16fp818011701dfa6e9e@mail.gmail.com> <3d375d730906071201k6ce4cd37y4afd814d9a551aa0@mail.gmail.com> <4A2C50CD.206@hawaii.edu>
Message-ID: <3d375d730906071752v586f3a8p1b33604a4a8087b0@mail.gmail.com>

On Sun, Jun 7, 2009 at 18:44, Eric Firing wrote:
> Robert Kern wrote:
>> On Sun, Jun 7, 2009 at 04:44, Olivier Verdier wrote:
>>> Yes, I found the thread you are referring
>>> to: http://mail.python.org/pipermail/python-dev/2008-July/081554.html
>>> However, since A*B*C exists for matrices and actually computes (A*B)*C, why
>>> not do the same with dot? I.e. why not decide that dot(A,B,C) does what
>>> A*B*C would do, i.e., dot(dot(A,B),C)?
>>> The performance and precision problems are the responsibility of the user,
>>> just as with the formula A*B*C.
>>
>> I'm happy to make the user responsible for performance and precision
>> problems if he has the tools to handle them. The operator gives the
>> user the easy ability to decide the precedence with parentheses. The
>> function does not.
>>
>
> The function could, with suitable parsing of the argument(s):
>
> (A*B)*C => dot( ((A,B),C) )   or  dot( (A,B), C )
> A*(B*C) => dot( (A, (B,C)) )  or  dot( A, (B,C) )
>
> Effectively, the comma is becoming the operator.

Already discussed on the old thread.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
-- Umberto Eco

From fperez.net at gmail.com  Sun Jun  7 21:44:54 2009
From: fperez.net at gmail.com (Fernando Perez)
Date: Sun, 7 Jun 2009 18:44:54 -0700
Subject: [Numpy-discussion] Tuples vs. lists when defining recarrays with array()
In-Reply-To: <64A74CFE-E8F9-4475-9910-DA8A4A96F3A5@cs.toronto.edu>
References: <64A74CFE-E8F9-4475-9910-DA8A4A96F3A5@cs.toronto.edu>
Message-ID:

On Sun, Jun 7, 2009 at 4:31 PM, David Warde-Farley wrote:
> A question was raised on the #scipy IRC earlier today, about the
> behaviour of array() with structured dtypes. After some educated
> guessing I figured out that for record arrays, tuples (rather than
> lists) must be used to indicate atomic elements. What I wondered is
> whether this behaviour is documented anywhere, and does it belong in
> the array() docstring, for example? The docstring currently reads "...
> or any (nested) sequence."

+1 for a clear indication of this fact, as it's rather unusual that a
tuple is OK where a list is not (for typical python APIs) and the
error is *very* obscure. I've been bitten enough times by this that
by now I'm used to it, but I distinctly remember much head scratching
and looking in the wrong places the first time I was hit by this
behavior.

I don't know if there's a good reason why lists aren't accepted
though, so that instead of documenting an oddity it could just be
cleaned up. Is it not possible for the constructor to duck-type here
a list for a tuple? Maybe there's a good reason, but it feels like a
bit of a wart.
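For anyone hitting this in practice, a small workaround sketch (the
dtype and sample values here are made up, not from David's session):
converting the inner lists to tuples before calling array() is enough.

import numpy as np
import datetime

desc = np.dtype([('id', object), ('val', object), ('date', object)])
rows = [[9, 1, datetime.date(2009, 6, 7)],    # plain lists, e.g. straight from a DB cursor
        [10, 2, datetime.date(2009, 6, 8)]]

# tuples mark each record as an atomic element of the structured array
arr = np.array([tuple(r) for r in rows], dtype=desc)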
In any case, whether fixing or documenting it, +1 from this side. Cheers, f From david at ar.media.kyoto-u.ac.jp Sun Jun 7 22:53:32 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Mon, 08 Jun 2009 11:53:32 +0900 Subject: [Numpy-discussion] performance matrix multiplication vs. matlab In-Reply-To: <1244378624.4377.51.camel@gabriel-desktop> References: <20090604143638.xrqzx8w8v4ggckkk@www.sms.ed.ac.uk> <4A28EC37.10205@ar.media.kyoto-u.ac.jp> <4A28F16C.3070607@ar.media.kyoto-u.ac.jp> <4A2955F5.9030006@hawaii.edu> <4A295A7D.1040407@ar.media.kyoto-u.ac.jp> <7f014ea60906051437p6002c6f7jb2d77374a220d2aa@mail.gmail.com> <7f014ea60906060959s6570cc32l277c5ab423f0b9ed@mail.gmail.com> <1244362840.4377.10.camel@gabriel-desktop> <1244368329.4377.18.camel@gabriel-desktop> <4A2B8A51.3070409@ar.media.kyoto-u.ac.jp> <1244378624.4377.51.camel@gabriel-desktop> Message-ID: <4A2C7D2C.3020903@ar.media.kyoto-u.ac.jp> Gabriel Beckers wrote: > On Sun, 2009-06-07 at 18:37 +0900, David Cournapeau wrote: > >> That's why compiling atlas by yourself is hard, and I generally advise >> against it: there is nothing intrinsically hard about it, but you need >> to know a lot of small details and platform oddities to get it right >> every time. That's just a waste of time in most cases IMHO, unless all >> you do with numpy is inverting big matrices, >> > > I have been trying intel mkl and icc compiler instead, with no luck. I > run into the same problem during setup as reported here: > > http://www.mail-archive.com/numpy-discussion at scipy.org/msg16595.html > See #1131 on numpy tracker - it has nothing to do with icc/mkl per-se. cheers, David From robert.kern at gmail.com Sun Jun 7 23:15:42 2009 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 7 Jun 2009 22:15:42 -0500 Subject: [Numpy-discussion] Tuples vs. lists when defining recarrays with array() In-Reply-To: References: <64A74CFE-E8F9-4475-9910-DA8A4A96F3A5@cs.toronto.edu> Message-ID: <3d375d730906072015g5f9d6d47pf7a60d829bf5d1b8@mail.gmail.com> On Sun, Jun 7, 2009 at 20:44, Fernando Perez wrote: > On Sun, Jun 7, 2009 at 4:31 PM, David Warde-Farley wrote: >> A question was raised on the #scipy IRC earlier today, about the >> behaviour of array() with structured dtypes. After some educated >> guessing I figured out that for record arrays, tuples (rather than >> lists) must be used to indicate atomic elements. What I wondered is >> whether this behaviour is documented anywhere, and does it belong in >> the array() docstring, for example? The docstring currently reads "... >> or any (nested) sequence." > > +1 for a clear indication of ?this fact, as it's rather unusual that a > tuple is OK where a list is not (for typical pythyon APIs) and the > error is *very* obscure. ?I've been bitten enough times by this that > by now I'm used to it, but I distinctly remember much head scratching > and looking in the wrong places the first time I was hit by this > behavior. > > I don't know if there's a good reason why lists aren't accepted > though, so that instead of ?documenting an oddity it could just be > cleaned up. ?Is it not possible for the constructor to duck-type here > a list for a tuple? It may be *possible*, but it's certainly easier this way, and that is the reason for it. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco

From dwf at cs.toronto.edu  Mon Jun  8 00:29:08 2009
From: dwf at cs.toronto.edu (David Warde-Farley)
Date: Mon, 8 Jun 2009 00:29:08 -0400
Subject: [Numpy-discussion] performance matrix multiplication vs. matlab
In-Reply-To: <20090607101204.GB20612@phare.normalesup.org>
References: <4A28F16C.3070607@ar.media.kyoto-u.ac.jp> <4A2955F5.9030006@hawaii.edu> <4A295A7D.1040407@ar.media.kyoto-u.ac.jp> <7f014ea60906051437p6002c6f7jb2d77374a220d2aa@mail.gmail.com> <7f014ea60906060959s6570cc32l277c5ab423f0b9ed@mail.gmail.com> <1244362840.4377.10.camel@gabriel-desktop> <1244368329.4377.18.camel@gabriel-desktop> <4A2B8A51.3070409@ar.media.kyoto-u.ac.jp> <20090607101204.GB20612@phare.normalesup.org>
Message-ID:

On 7-Jun-09, at 6:12 AM, Gael Varoquaux wrote:
> Well, I do bootstrapping of PCAs, that is SVDs. I can tell you, it
> makes
> a big difference, especially since I have 8 cores.

Just curious Gael: how many PC's are you retaining? Have you tried
iterative methods (i.e. the EM algorithm for PCA)?

David

From gael.varoquaux at normalesup.org  Mon Jun  8 01:32:10 2009
From: gael.varoquaux at normalesup.org (Gael Varoquaux)
Date: Mon, 8 Jun 2009 07:32:10 +0200
Subject: [Numpy-discussion] performance matrix multiplication vs. matlab
In-Reply-To: References: <4A295A7D.1040407@ar.media.kyoto-u.ac.jp> <7f014ea60906051437p6002c6f7jb2d77374a220d2aa@mail.gmail.com> <7f014ea60906060959s6570cc32l277c5ab423f0b9ed@mail.gmail.com> <1244362840.4377.10.camel@gabriel-desktop> <1244368329.4377.18.camel@gabriel-desktop> <4A2B8A51.3070409@ar.media.kyoto-u.ac.jp> <20090607101204.GB20612@phare.normalesup.org>
Message-ID: <20090608053210.GC5032@phare.normalesup.org>

On Mon, Jun 08, 2009 at 12:29:08AM -0400, David Warde-Farley wrote:
> On 7-Jun-09, at 6:12 AM, Gael Varoquaux wrote:
> > Well, I do bootstrapping of PCAs, that is SVDs. I can tell you, it
> > makes
> > a big difference, especially since I have 8 cores.

> Just curious Gael: how many PC's are you retaining? Have you tried
> iterative methods (i.e. the EM algorithm for PCA)?

I am using the heuristic exposed in
http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=4562996

We have very noisy and long time series. My experience is that most
model-based heuristics for choosing the number of PCs retained give us
way too much on this problem (they simply keep diverging if I add noise
at the end of the time series). The algorithm we use gives us ~50
interesting PCs (each composed of 50 000 dimensions). That happens to be
quite right based on our experience with the signal. However, being
fairly new to statistics, I am not aware of the EM algorithm that you
mention. I'd be interested in a reference, to see if I can use that
algorithm. The PCA bootstrap is time-consuming.

Thanks,
Gaël

From david at ar.media.kyoto-u.ac.jp  Mon Jun  8 01:17:45 2009
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Mon, 08 Jun 2009 14:17:45 +0900
Subject: [Numpy-discussion] performance matrix multiplication vs.
	matlab
In-Reply-To: <20090608053210.GC5032@phare.normalesup.org>
References: <4A295A7D.1040407@ar.media.kyoto-u.ac.jp> <7f014ea60906051437p6002c6f7jb2d77374a220d2aa@mail.gmail.com> <7f014ea60906060959s6570cc32l277c5ab423f0b9ed@mail.gmail.com> <1244362840.4377.10.camel@gabriel-desktop> <1244368329.4377.18.camel@gabriel-desktop> <4A2B8A51.3070409@ar.media.kyoto-u.ac.jp> <20090607101204.GB20612@phare.normalesup.org> <20090608053210.GC5032@phare.normalesup.org>
Message-ID: <4A2C9EF9.7030302@ar.media.kyoto-u.ac.jp>

Gael Varoquaux wrote:
> I am using the heuristic exposed in
> http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=4562996
>
> We have very noisy and long time series. My experience is that most
> model-based heuristics for choosing the number of PCs retained give us
> way too much on this problem (they simply keep diverging if I add noise
> at the end of the time series). The algorithm we use gives us ~50
> interesting PCs (each composed of 50 000 dimensions). That happens to be
> quite right based on our experience with the signal. However, being
> fairly new to statistics, I am not aware of the EM algorithm that you
> mention. I'd be interested in a reference, to see if I can use that
> algorithm.

I would not be surprised if David had this paper in mind :)

http://www.cs.toronto.edu/~roweis/papers/empca.pdf

cheers,

David

From gael.varoquaux at normalesup.org  Mon Jun  8 01:38:47 2009
From: gael.varoquaux at normalesup.org (Gael Varoquaux)
Date: Mon, 8 Jun 2009 07:38:47 +0200
Subject: [Numpy-discussion] performance matrix multiplication vs. matlab
In-Reply-To: <4A2C9EF9.7030302@ar.media.kyoto-u.ac.jp>
References: <7f014ea60906051437p6002c6f7jb2d77374a220d2aa@mail.gmail.com> <7f014ea60906060959s6570cc32l277c5ab423f0b9ed@mail.gmail.com> <1244362840.4377.10.camel@gabriel-desktop> <1244368329.4377.18.camel@gabriel-desktop> <4A2B8A51.3070409@ar.media.kyoto-u.ac.jp> <20090607101204.GB20612@phare.normalesup.org> <20090608053210.GC5032@phare.normalesup.org> <4A2C9EF9.7030302@ar.media.kyoto-u.ac.jp>
Message-ID: <20090608053847.GD5032@phare.normalesup.org>

On Mon, Jun 08, 2009 at 02:17:45PM +0900, David Cournapeau wrote:
> > However, being fairly new to statistics, I am not aware of the EM
> > algorithm that you mention. I'd be interested in a reference, to see
> > if I can use that algorithm.

> I would not be surprised if David had this paper in mind :)

> http://www.cs.toronto.edu/~roweis/papers/empca.pdf

Excellent. Thanks to the Davids. I'll read that through.

Gaël

From matthieu.brucher at gmail.com  Mon Jun  8 02:58:29 2009
From: matthieu.brucher at gmail.com (Matthieu Brucher)
Date: Mon, 8 Jun 2009 08:58:29 +0200
Subject: [Numpy-discussion] performance matrix multiplication vs. matlab
In-Reply-To: <20090608053210.GC5032@phare.normalesup.org>
References: <4A295A7D.1040407@ar.media.kyoto-u.ac.jp> <7f014ea60906051437p6002c6f7jb2d77374a220d2aa@mail.gmail.com> <7f014ea60906060959s6570cc32l277c5ab423f0b9ed@mail.gmail.com> <1244362840.4377.10.camel@gabriel-desktop> <1244368329.4377.18.camel@gabriel-desktop> <4A2B8A51.3070409@ar.media.kyoto-u.ac.jp> <20090607101204.GB20612@phare.normalesup.org> <20090608053210.GC5032@phare.normalesup.org>
Message-ID:

2009/6/8 Gael Varoquaux :
> On Mon, Jun 08, 2009 at 12:29:08AM -0400, David Warde-Farley wrote:
>> On 7-Jun-09, at 6:12 AM, Gael Varoquaux wrote:
>
>> > Well, I do bootstrapping of PCAs, that is SVDs. I can tell you, it
>> > makes
>> > a big difference, especially since I have 8 cores.
>
>> Just curious Gael: how many PC's are you retaining? Have you tried
>> iterative methods (i.e. the EM algorithm for PCA)?
>
> I am using the heuristic exposed in
> http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=4562996
>
> We have very noisy and long time series. My experience is that most
> model-based heuristics for choosing the number of PCs retained give us
> way too much on this problem (they simply keep diverging if I add noise
> at the end of the time series). The algorithm we use gives us ~50
> interesting PCs (each composed of 50 000 dimensions). That happens to be
> quite right based on our experience with the signal. However, being
> fairly new to statistics, I am not aware of the EM algorithm that you
> mention. I'd be interested in a reference, to see if I can use that
> algorithm. The PCA bootstrap is time-consuming.

Hi,

Given the number of PCs, I think you may just be measuring noise. As
said in several manifold reduction publications (such as the ones by
Torbjorn Vik, who published on robust PCA for medical imaging), you
cannot expect to have more than 4 or 5 meaningful PCs, due to the curse
of dimensionality. If you want 50 PCs, you have to have at least...
10^50 samples, which is quite a lot, let's say it this way. According to
the literature, a usual manifold can be described by 4 or 5 variables.
If you have more, it may mean that you are violating your hypothesis,
here the linearity of your data (and as it is medical imaging, you know
from the beginning that this hypothesis is wrong). So if you really want
to find something meaningful and/or physical, you should use a real
dimensionality reduction, preferably a non-linear one.

Just my 2 cents ;)

Matthieu
--
Information System Engineer, Ph.D.
Website: http://matthieu-brucher.developpez.com/
Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92
LinkedIn: http://www.linkedin.com/in/matthieubrucher

From fperez.net at gmail.com  Mon Jun  8 03:21:43 2009
From: fperez.net at gmail.com (Fernando Perez)
Date: Mon, 8 Jun 2009 00:21:43 -0700
Subject: [Numpy-discussion] Functions for indexing into certain parts of an array (2d)
In-Reply-To: References: <3d375d730906060009r6efc1b70y6778408541fe6ce3@mail.gmail.com> <3d375d730906061401i611e2fe5j1d09c24cf8e49682@mail.gmail.com>
Message-ID:

On Sun, Jun 7, 2009 at 6:19 AM, Bruce Southey wrote:
> On Sun, Jun 7, 2009 at 3:37 AM, Fernando Perez wrote:
>> One more question.  For these *_indices() functions, would you want an
>> interface that accepts *either*
>>
>> diag_indices(size, ndim)
>
> As I indicated above, this is unacceptable for the apparent usage.

Relax, nobody is trying to sneak past the Committee for the Prevention
of Unacceptable Things. It's all now in a patch attached to this ticket:

http://projects.scipy.org/numpy/ticket/1132

for regular review. I added the functions, with docstrings and tests.

By the way, despite being labeled unacceptable above, I still see value
in being able to create these indexing structures without an actual
array, so the implementation contains both versions, but with different
names (to avoid the shenanigans that Robert rightfully has a policy of
avoiding).

> I do not understand what is expected with the ndim argument. If it is
> the indices of array elements of the form [0][0][0], [1][1][1], ...
> [k][k][k], where k = min(a.shape) for some array a, then an ndim
> argument is totally redundant (although using shape is not correct for
> 1-d arrays). This is different from the diagonals of two 2-d arrays
> from a shape of 2 by 3 by 4, or some other expectation.
For an n-dimensional array, which probably comes closest to what we
think of as a tensor in (multi)linear algebra, the notion of a diagonal
as the list of entries with indices A[i,i,...,i] for i in [0...N] is a
very natural one.

>> diag_indices(anarray)
>
> +1

These were also added in this form, with the name _from(A), indicating
that size/shape information should be taken from the input A. So both
versions exist.

Feel free to provide further feedback on the patch.

Cheers,
f

From dwf at cs.toronto.edu  Mon Jun  8 03:27:11 2009
From: dwf at cs.toronto.edu (David Warde-Farley)
Date: Mon, 8 Jun 2009 03:27:11 -0400
Subject: [Numpy-discussion] performance matrix multiplication vs. matlab
In-Reply-To: <4A2C9EF9.7030302@ar.media.kyoto-u.ac.jp>
References: <4A295A7D.1040407@ar.media.kyoto-u.ac.jp> <7f014ea60906051437p6002c6f7jb2d77374a220d2aa@mail.gmail.com> <7f014ea60906060959s6570cc32l277c5ab423f0b9ed@mail.gmail.com> <1244362840.4377.10.camel@gabriel-desktop> <1244368329.4377.18.camel@gabriel-desktop> <4A2B8A51.3070409@ar.media.kyoto-u.ac.jp> <20090607101204.GB20612@phare.normalesup.org> <20090608053210.GC5032@phare.normalesup.org> <4A2C9EF9.7030302@ar.media.kyoto-u.ac.jp>
Message-ID: <922E60B1-40B1-484A-B87C-AFFD7DEE6391@cs.toronto.edu>

On 8-Jun-09, at 1:17 AM, David Cournapeau wrote:
> I would not be surprised if David had this paper in mind :)
>
> http://www.cs.toronto.edu/~roweis/papers/empca.pdf

Right you are :)

There is a slight trick to it, though, in that it won't produce an
orthogonal basis on its own, just something that spans that principal
subspace. So you typically have to at least extract the first PC
independently to uniquely orient your basis. You can then either
subtract off the projection of the data on the 1st PC and find the
next one, one at a time, or extract a spanning set all at once and
orthogonalize with respect to the first PC.

David

From gael.varoquaux at normalesup.org  Mon Jun  8 03:29:39 2009
From: gael.varoquaux at normalesup.org (Gael Varoquaux)
Date: Mon, 8 Jun 2009 09:29:39 +0200
Subject: [Numpy-discussion] performance matrix multiplication vs. matlab
In-Reply-To: References: <7f014ea60906051437p6002c6f7jb2d77374a220d2aa@mail.gmail.com> <7f014ea60906060959s6570cc32l277c5ab423f0b9ed@mail.gmail.com> <1244362840.4377.10.camel@gabriel-desktop> <1244368329.4377.18.camel@gabriel-desktop> <4A2B8A51.3070409@ar.media.kyoto-u.ac.jp> <20090607101204.GB20612@phare.normalesup.org> <20090608053210.GC5032@phare.normalesup.org>
Message-ID: <20090608072939.GA23146@phare.normalesup.org>

On Mon, Jun 08, 2009 at 08:58:29AM +0200, Matthieu Brucher wrote:
> Given the number of PCs, I think you may just be measuring noise.
> As said in several manifold reduction publications (such as the ones by
> Torbjorn Vik, who published on robust PCA for medical imaging), you
> cannot expect to have more than 4 or 5 meaningful PCs, due to the curse
> of dimensionality. If you want 50 PCs, you have to have at least...
> 10^50 samples, which is quite a lot, let's say it this way.
> According to the literature, a usual manifold can be described by 4
> or 5 variables. If you have more, it may mean that you are violating
> your hypothesis, here the linearity of your data (and as it is medical
> imaging, you know from the beginning that this hypothesis is wrong).
> So if you really want to find something meaningful and/or physical,
> you should use a real dimensionality reduction, preferably a
> non-linear one.

I am not sure I am following you: I have time-varying signals. I am not
From gael.varoquaux at normalesup.org Mon Jun 8 03:29:39 2009
From: gael.varoquaux at normalesup.org (Gael Varoquaux)
Date: Mon, 8 Jun 2009 09:29:39 +0200
Subject: [Numpy-discussion] performance matrix multiplication vs. matlab
In-Reply-To:
References: <7f014ea60906051437p6002c6f7jb2d77374a220d2aa@mail.gmail.com> <7f014ea60906060959s6570cc32l277c5ab423f0b9ed@mail.gmail.com> <1244362840.4377.10.camel@gabriel-desktop> <1244368329.4377.18.camel@gabriel-desktop> <4A2B8A51.3070409@ar.media.kyoto-u.ac.jp> <20090607101204.GB20612@phare.normalesup.org> <20090608053210.GC5032@phare.normalesup.org>
Message-ID: <20090608072939.GA23146@phare.normalesup.org>

On Mon, Jun 08, 2009 at 08:58:29AM +0200, Matthieu Brucher wrote:
> Given the number of PCs, I think you may just be measuring noise.
> As said in several manifold reduction publications (such as the ones by
> Torbjorn Vik, who published on robust PCA for medical imaging), you
> cannot expect to have more than 4 or 5 meaningful PCs, due to the
> curse of dimensionality. If you want 50 PCs, you have to have at least...
> 10^50 samples, which is quite a lot, let's say it this way.
> According to the literature, a usual manifold can be described by 4
> or 5 variables. If you have more, it may be that you are violating
> your hypothesis, here the linearity of your data (and as it is medical
> imaging, you know from the beginning that this hypothesis is wrong).
> So if you really want to find something meaningful and/or physical,
> you should use a real dimensionality reduction, preferably a
> non-linear one.

I am not sure I am following you: I have time-varying signals. I am not taking a shot of the same process over and over again. My intuition tells me that I have more than 5 meaningful patterns.

Anyhow, I do some more analysis behind that (ICA actually), and I do find more than 5 patterns of interest that are not noise.

So maybe I should be using some non-linear dimensionality reduction, but what I am doing works, and I can write a generative model of it. Most importantly, it is actually quite computationally simple.

However, if you can point me to methods that you believe are better (and tell me why you believe so), I am all ears.

Gaël

From matthieu.brucher at gmail.com Mon Jun 8 07:07:38 2009
From: matthieu.brucher at gmail.com (Matthieu Brucher)
Date: Mon, 8 Jun 2009 13:07:38 +0200
Subject: [Numpy-discussion] performance matrix multiplication vs. matlab
In-Reply-To: <20090608072939.GA23146@phare.normalesup.org>
References: <7f014ea60906051437p6002c6f7jb2d77374a220d2aa@mail.gmail.com> <7f014ea60906060959s6570cc32l277c5ab423f0b9ed@mail.gmail.com> <1244362840.4377.10.camel@gabriel-desktop> <1244368329.4377.18.camel@gabriel-desktop> <4A2B8A51.3070409@ar.media.kyoto-u.ac.jp> <20090607101204.GB20612@phare.normalesup.org> <20090608053210.GC5032@phare.normalesup.org> <20090608072939.GA23146@phare.normalesup.org>
Message-ID:

2009/6/8 Gael Varoquaux :
> On Mon, Jun 08, 2009 at 08:58:29AM +0200, Matthieu Brucher wrote:
>> Given the number of PCs, I think you may just be measuring noise.
>> [...]
>> So if you really want to find something meaningful and/or physical,
>> you should use a real dimensionality reduction, preferably a
>> non-linear one.
>
> I am not sure I am following you: I have time-varying signals. I am not
> taking a shot of the same process over and over again. My intuition tells
> me that I have more than 5 meaningful patterns.

How many samples do you have? 10000? a million? a billion? The problem with 50 PCs is that your search space is mostly empty, "thanks" to the curse of dimensionality. This means that you *should* not try to get a meaning for the 10th and following PCs.

> Anyhow, I do some more analysis behind that (ICA actually), and I do find
> more than 5 patterns of interest that are not noise.

ICA suffers from the same problems as PCA. And I'm not even talking about the linearity hypothesis that is never respected.

> So maybe I should be using some non-linear dimensionality reduction, but
> what I am doing works, and I can write a generative model of it. Most
> importantly, it is actually quite computationally simple.

Thanks to linearity ;) The problem is that you will have a lot of confounds this way (your 50 PCs can in fact be the effect of 5 variables that are nonlinear).

> However, if you can point me to methods that you believe are better (and
> tell me why you believe so), I am all ears.
My thesis was on nonlinear dimensionality reduction (this is why I believe so, especially in the medical imaging field), but it always needs some adaptation. It depends on what you want to do, the time you can use to process data, ... Suffice it to say, we started with PCA some years ago and we were switching to nonlinear reduction because of the emptiness of the search space and because of the nonlinearity of the brain space (no idea what my former lab is doing now, but it is used for DTI at least). You should check some books on it, and you surely have to read something about the curse of dimensionality (at least if you want to get published, as people know about this issue in the medical field), even if you do not use nonlinear techniques.

Matthieu
--
Information System Engineer, Ph.D.
Website: http://matthieu-brucher.developpez.com/
Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92
LinkedIn: http://www.linkedin.com/in/matthieubrucher

From matthieu.brucher at gmail.com Mon Jun 8 04:08:22 2009
From: matthieu.brucher at gmail.com (Matthieu Brucher)
Date: Mon, 8 Jun 2009 10:08:22 +0200
Subject: [Numpy-discussion] scipy 0.7.1rc2 released
In-Reply-To:
References: <5b8d13220906050409u30286931w7bd9aac1e01b9ebf@mail.gmail.com> <799406d60906061957w2b0bd6c9n33fb898a7fc16e28@mail.gmail.com>
Message-ID:

2009/6/8 Matthieu Brucher :
> I'm trying to compile it with ICC 10.1.018, and it fails :|
>
> icc: scipy/special/cephes/const.c
> scipy/special/cephes/const.c(94): error: floating-point operation result is out of range
>   double INFINITY = 1.0/0.0;  /* 99e999; */
>                        ^
>
> scipy/special/cephes/const.c(99): error: floating-point operation result is out of range
>   double NAN = 1.0/0.0 - 1.0/0.0;
>                   ^
>
> scipy/special/cephes/const.c(99): error: floating-point operation result is out of range
>   double NAN = 1.0/0.0 - 1.0/0.0;
>                            ^
>
> compilation aborted for scipy/special/cephes/const.c (code 2)
> scipy/special/cephes/const.c(94): error: floating-point operation result is out of range
>   double INFINITY = 1.0/0.0;  /* 99e999; */
>                        ^
>
> scipy/special/cephes/const.c(99): error: floating-point operation result is out of range
>   double NAN = 1.0/0.0 - 1.0/0.0;
>                   ^
>
> scipy/special/cephes/const.c(99): error: floating-point operation result is out of range
>   double NAN = 1.0/0.0 - 1.0/0.0;
>
> At least, it seems to pick up the Fortran compiler correctly (which
> 0.7.0 didn't seem to do ;))

I manually fixed the files (mconf.h, as well as yn.c, which uses NAN and can end up with it undefined when NANS is defined, which is the case here for ICC), but I ran into another error (one of the reasons I tried numscons before):

/appli/intel/10.1.018/intel64/fce/bin/ifort -shared -shared -nofor_main build/temp.linux-x86_64-2.5/build/src.linux-x86_64-2.5/scipy/fftpack/_fftpackmodule.o build/temp.linux-x86_64-2.5/scipy/fftpack/src/zfft.o build/temp.linux-x86_64-2.5/scipy/fftpack/src/drfft.o build/temp.linux-x86_64-2.5/scipy/fftpack/src/zrfft.o build/temp.linux-x86_64-2.5/scipy/fftpack/src/zfftnd.o build/temp.linux-x86_64-2.5/build/src.linux-x86_64-2.5/fortranobject.o -Lbuild/temp.linux-x86_64-2.5 -ldfftpack -o build/lib.linux-x86_64-2.5/scipy/fftpack/_fftpack.so
ld: build/temp.linux-x86_64-2.5/libdfftpack.a(dffti1.o): relocation R_X86_64_32S against `a local symbol' can not be used when making a shared object; recompile with -fPIC
build/temp.linux-x86_64-2.5/libdfftpack.a: could not read symbols: Bad value
ld: build/temp.linux-x86_64-2.5/libdfftpack.a(dffti1.o): relocation R_X86_64_32S against `a local symbol' can not be used when making a shared object; recompile with -fPIC
build/temp.linux-x86_64-2.5/libdfftpack.a: could not read symbols: Bad value

It seems that the library is not compiled with -fPIC (perhaps because it is a static library?). My compiler options are:

Fortran f77 compiler: ifort -FI -w90 -w95 -xW -axP -O3 -unroll
Fortran f90 compiler: ifort -FR -xW -axP -O3 -unroll
Fortran fix compiler: ifort -FI -xW -axP -O3 -unroll

Matthieu
--
Information System Engineer, Ph.D.
Website: http://matthieu-brucher.developpez.com/
Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92
LinkedIn: http://www.linkedin.com/in/matthieubrucher

From matthieu.brucher at gmail.com Mon Jun 8 03:53:04 2009
From: matthieu.brucher at gmail.com (Matthieu Brucher)
Date: Mon, 8 Jun 2009 09:53:04 +0200
Subject: [Numpy-discussion] scipy 0.7.1rc2 released
In-Reply-To: <799406d60906061957w2b0bd6c9n33fb898a7fc16e28@mail.gmail.com>
References: <5b8d13220906050409u30286931w7bd9aac1e01b9ebf@mail.gmail.com> <799406d60906061957w2b0bd6c9n33fb898a7fc16e28@mail.gmail.com>
Message-ID:

I'm trying to compile it with ICC 10.1.018, and it fails :|

icc: scipy/special/cephes/const.c
scipy/special/cephes/const.c(94): error: floating-point operation result is out of range
  double INFINITY = 1.0/0.0;  /* 99e999; */
                       ^

scipy/special/cephes/const.c(99): error: floating-point operation result is out of range
  double NAN = 1.0/0.0 - 1.0/0.0;
                  ^

scipy/special/cephes/const.c(99): error: floating-point operation result is out of range
  double NAN = 1.0/0.0 - 1.0/0.0;
                           ^

compilation aborted for scipy/special/cephes/const.c (code 2)
scipy/special/cephes/const.c(94): error: floating-point operation result is out of range
  double INFINITY = 1.0/0.0;  /* 99e999; */
                       ^

scipy/special/cephes/const.c(99): error: floating-point operation result is out of range
  double NAN = 1.0/0.0 - 1.0/0.0;
                  ^

scipy/special/cephes/const.c(99): error: floating-point operation result is out of range
  double NAN = 1.0/0.0 - 1.0/0.0;

At least, it seems to pick up the Fortran compiler correctly (which 0.7.0 didn't seem to do ;))

Matthieu

2009/6/7 Adam Mercer :
> On Fri, Jun 5, 2009 at 06:09, David Cournapeau wrote:
>
>> Please test it ! I am particularly interested in results for scipy
>> binaries on mac os x (do they work on ppc).
>
> Test suite passes on Intel Mac OS X (10.5.7) built from source:
>
> OK (KNOWNFAIL=6, SKIP=21)
>
> Cheers
>
> Adam
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion

--
Information System Engineer, Ph.D.
Website: http://matthieu-brucher.developpez.com/
Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92
LinkedIn: http://www.linkedin.com/in/matthieubrucher

From david at ar.media.kyoto-u.ac.jp Mon Jun 8 07:15:47 2009
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Mon, 08 Jun 2009 20:15:47 +0900
Subject: [Numpy-discussion] scipy 0.7.1rc2 released
In-Reply-To:
References: <5b8d13220906050409u30286931w7bd9aac1e01b9ebf@mail.gmail.com> <799406d60906061957w2b0bd6c9n33fb898a7fc16e28@mail.gmail.com>
Message-ID: <4A2CF2E3.1080805@ar.media.kyoto-u.ac.jp>

Matthieu Brucher wrote:
> I'm trying to compile it with ICC 10.1.018, and it fails :|
>
> icc: scipy/special/cephes/const.c
> scipy/special/cephes/const.c(94): error: floating-point operation result is out of range
>   double INFINITY = 1.0/0.0;  /* 99e999; */
>                        ^
>
> scipy/special/cephes/const.c(99): error: floating-point operation result is out of range
>   double NAN = 1.0/0.0 - 1.0/0.0;
>                   ^
>
> scipy/special/cephes/const.c(99): error: floating-point operation result is out of range
>   double NAN = 1.0/0.0 - 1.0/0.0;
>                            ^
>
> compilation aborted for scipy/special/cephes/const.c (code 2)
>
> At least, it seems to pick up the Fortran compiler correctly (which
> 0.7.0 didn't seem to do ;))

This code makes me cry... I know Visual Studio won't like it either. Cephes is a constant source of problems. As I mentioned a couple of months ago, I think the only solution is to rewrite most of scipy.special, at least the parts using cephes, using for example boost algorithms and unit tests. But I have not started anything concrete - Pauli did most of the work on scipy.special recently (kudos to Pauli for consistently improving scipy.special, BTW)

cheers,

David

From matthieu.brucher at gmail.com Mon Jun 8 07:45:29 2009
From: matthieu.brucher at gmail.com (Matthieu Brucher)
Date: Mon, 8 Jun 2009 13:45:29 +0200
Subject: [Numpy-discussion] scipy 0.7.1rc2 released
In-Reply-To: <4A2CF2E3.1080805@ar.media.kyoto-u.ac.jp>
References: <5b8d13220906050409u30286931w7bd9aac1e01b9ebf@mail.gmail.com> <799406d60906061957w2b0bd6c9n33fb898a7fc16e28@mail.gmail.com> <4A2CF2E3.1080805@ar.media.kyoto-u.ac.jp>
Message-ID:

2009/6/8 David Cournapeau :
> Matthieu Brucher wrote:
>> I'm trying to compile it with ICC 10.1.018, and it fails :|
>>
>> icc: scipy/special/cephes/const.c
>> scipy/special/cephes/const.c(94): error: floating-point operation result is out of range
>>   double INFINITY = 1.0/0.0;  /* 99e999; */
>>                        ^
>>
>> scipy/special/cephes/const.c(99): error: floating-point operation result is out of range
>>   double NAN = 1.0/0.0 - 1.0/0.0;
>>                   ^
>>
>> scipy/special/cephes/const.c(99): error: floating-point operation result is out of range
>>   double NAN = 1.0/0.0 - 1.0/0.0;
>>                            ^
>>
>> At least, it seems to pick up the Fortran compiler correctly (which
>> 0.7.0 didn't seem to do ;))
>
> This code makes me cry... I know Visual Studio won't like it either.
> Cephes is a constant source of problems. As I mentioned a couple of
> months ago, I think the only solution is to rewrite most of
> scipy.special, at least the parts using cephes, using for example boost
> algorithms and unit tests. But I have not started anything concrete -
> Pauli did most of the work on scipy.special recently (kudos to Pauli for
> consistently improving scipy.special, BTW)
>
> cheers,

It could be simply enhanced by refactoring only mconf.h with proper compiler flags, and fixing yn.c to remove the NAN detection (as it should be in mconf.h). Unfortunately, I have no time for this at the moment (besides the fact that it is on my workstation, not at home).

Matthieu
--
Information System Engineer, Ph.D.
Website: http://matthieu-brucher.developpez.com/
Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92
LinkedIn: http://www.linkedin.com/in/matthieubrucher

From cimrman3 at ntc.zcu.cz Mon Jun 8 07:51:26 2009
From: cimrman3 at ntc.zcu.cz (Robert Cimrman)
Date: Mon, 08 Jun 2009 13:51:26 +0200
Subject: [Numpy-discussion] extract elements of an array that are contained in another array?
In-Reply-To: <1cd32cbb0906060441l119f274frc9108d1444cfe638@mail.gmail.com>
References: <4A27BCCF.4090608@american.edu> <1cd32cbb0906040535m235a1e90oc52f21806238d5d7@mail.gmail.com> <4A27D67F.3@american.edu> <1cd32cbb0906040750x120abbe8g397eb57e734bdeb8@mail.gmail.com> <4A27E615.70604@american.edu> <1cd32cbb0906040829n74f01778s5ef37894e1b5e081@mail.gmail.com> <4A28AEB2.5050209@ntc.zcu.cz> <1cd32cbb0906060441l119f274frc9108d1444cfe638@mail.gmail.com>
Message-ID: <4A2CFB3E.3080007@ntc.zcu.cz>

Hi Josef,

thanks for the summary! I am responding below; later I will make an enhancement ticket.

josef.pktd at gmail.com wrote:
> On Sat, Jun 6, 2009 at 4:42 AM, Neil Crighton wrote:
>> Robert Cimrman ntc.zcu.cz> writes:
>>
>>> Anne Archibald wrote:
>>>
>>>> 1. add a keyword argument to intersect1d "assume_unique"; if it is not
>>>> present, check for uniqueness and emit a warning if not unique
>>>> 2. change the warning to an exception
>>>> Optionally:
>>>> 3. change the meaning of the function to that of intersect1d_nu if the
>>>> keyword argument is not present
>
> 1. merge _nu version into one function
> -------------------------------------------------------
>
>>> You mean something like:
>>>
>>> def intersect1d(ar1, ar2, assume_unique=False):
>>>     if not assume_unique:
>>>         return intersect1d_nu(ar1, ar2)
>>>     else:
>>>         ...  # the current code
>>>
>>> intersect1d_nu could still be exported to the numpy namespace, or not.
>>
>> +1 - from the user's point of view there should just be intersect1d and
>> setmember1d (i.e. no '_nu' versions). The assume_unique keyword Robert suggests
>> can be used if speed is a problem.
>
> +1 on rolling the _nu versions this way into the plain version; this
> would avoid a lot of the confusion.
> It would not be a code-breaking API change for existing correct usage
> (but some speed regression without adding the keyword)

+1

> deprecate intersect1d_nu
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>> intersect1d_nu could still be exported to the numpy namespace, or not.
> I would say not, if they are the default branch of the non-_nu version
>
> +1 on deprecation

+0

> 2. alias as "in"
> ---------------------
>> I really like in1d (no underscore) as a new name for setmember1d_nu. inarray is
>> another possibility. I don't like 'ain'; 'a' in front of 'in' detracts from
>> readability, unlike the extra a in arange.
> I don't like the extra "a"s either, once namespaces are commonly used
>
> alias setmember1d_nu as `in1d` or `isin1d`, because the function is an
> "in" and not a set operation
> +1

+1

> 3. behavior of other set functions
> -----------------------------------------------
>
> guarantee that setdiff1d works for non-unique arrays (even when
> implementation changes), and change documentation
> +1

+1, it is useful for non-unique arrays.

> need to check other functions
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> union1d: works for non-unique arrays, obvious from source

Yes.

> setxor1d: requires unique arrays
>>>> np.setxor1d([1,2,3,3,4,5], [0,0,1,2,2,6])
> array([2, 4, 5, 6])
>>>> np.setxor1d(np.unique([1,2,3,3,4,5]), np.unique([0,0,1,2,2,6]))
> array([0, 3, 4, 5, 6])
>
> setxor: add keyword option and call unique by default
> +1 for symmetry

+1 - you mean np.setxor1d(np.unique(a), np.unique(b)) to become np.setxor1d(a, b, assume_unique=False), right?

> ediff1d and unique1d are defined for non-unique arrays

yes

> 4. name of keyword
> ----------------------------
>
> intersect1d(ar1, ar2, assume_unique=False)
>
> alternative isunique=False or just unique=False
> +1 less to write

We should look at other functions in numpy (and/or scipy) for a common scheme here. -1e-1 to the proposed names, as isunique is singular only, and unique=False does not clearly show the intent to me. What about ar1_unique=False, ar2_unique=False - to address each argument specifically?

> 5. module name
> -----------------------
>
> rename arraysetops to something easier to read like setfun. I think it
> would only affect internal changes since all functions are exported to
> the main numpy namespace
> +1e-4 (I got used to arrayse_tops)

+0 (internal change only). Other numpy/scipy submodules containing a bunch of functions are called *pack (fftpack, arpack, lapack), *alg (linalg), *utils. *fun is commonly used in the matlab world.

> 6. keep docs in sync with correct usage
> ---------------------------------------------------------
>
> obvious

+1

thanks,
r.
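Spelled out, the merged functions under discussion would look roughly like this -- a sketch following the proposal above, not the code that actually went into arraysetops:

import numpy as np

def intersect1d(ar1, ar2, assume_unique=False):
    # uniquify by default; skip the extra work when the caller
    # guarantees unique inputs
    if not assume_unique:
        ar1 = np.unique(ar1)
        ar2 = np.unique(ar2)
    aux = np.concatenate((ar1, ar2))
    aux.sort()
    # values appearing in both (now unique) arrays show up twice
    return aux[1:][aux[1:] == aux[:-1]]

def in1d(ar1, ar2):
    # the proposed spelling of setmember1d_nu: True where elements
    # of ar1 are contained in ar2, repeated entries allowed
    ar1 = np.asarray(ar1)
    ar2 = np.unique(ar2)
    idx = np.clip(np.searchsorted(ar2, ar1), 0, len(ar2) - 1)
    return ar2[idx] == ar1

print intersect1d([1, 2, 3, 3], [2, 3, 4])   # -> [2 3]
print in1d([0, 2, 5], [2, 3, 4])             # -> [False  True False]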
From cournape at gmail.com Mon Jun 8 07:54:59 2009
From: cournape at gmail.com (David Cournapeau)
Date: Mon, 8 Jun 2009 20:54:59 +0900
Subject: [Numpy-discussion] scipy 0.7.1rc2 released
In-Reply-To:
References: <5b8d13220906050409u30286931w7bd9aac1e01b9ebf@mail.gmail.com> <799406d60906061957w2b0bd6c9n33fb898a7fc16e28@mail.gmail.com> <4A2CF2E3.1080805@ar.media.kyoto-u.ac.jp>
Message-ID: <5b8d13220906080454md10f9e6o74c924196f9f7bc@mail.gmail.com>

On Mon, Jun 8, 2009 at 8:45 PM, Matthieu Brucher wrote:
> 2009/6/8 David Cournapeau :
>> Matthieu Brucher wrote:
>>> I'm trying to compile it with ICC 10.1.018, and it fails :|
>>>
>>> icc: scipy/special/cephes/const.c
>>> scipy/special/cephes/const.c(94): error: floating-point operation result is out of range
>>>   double INFINITY = 1.0/0.0;  /* 99e999; */
>>>                        ^
>>>
>>> scipy/special/cephes/const.c(99): error: floating-point operation result is out of range
>>>   double NAN = 1.0/0.0 - 1.0/0.0;
>>>                   ^
>>>
>>> scipy/special/cephes/const.c(99): error: floating-point operation result is out of range
>>>   double NAN = 1.0/0.0 - 1.0/0.0;
>>>                            ^
>>>
>>> compilation aborted for scipy/special/cephes/const.c (code 2)
>>>
>>> At least, it seems to pick up the Fortran compiler correctly (which
>>> 0.7.0 didn't seem to do ;))
>>
>> This code makes me cry... I know Visual Studio won't like it either.
>> Cephes is a constant source of problems. As I mentioned a couple of
>> months ago, I think the only solution is to rewrite most of
>> scipy.special, at least the parts using cephes, using for example boost
>> algorithms and unit tests. But I have not started anything concrete -
>> Pauli did most of the work on scipy.special recently (kudos to Pauli for
>> consistently improving scipy.special, BTW)
>>
>> cheers,
>
> It could be simply enhanced by refactoring only mconf.h with proper
> compiler flags, and fixing yn.c to remove the NAN detection (as it
> should be in mconf.h).

NAN and co. definitions should be dealt with through the portable definitions we now have in numpy - I just have to find a way to reuse the corresponding code outside numpy (distutils currently does not handle proper installation of libraries built through build_clib); it is on my TODO list for scipy 0.8.

Unfortunately, this is only the tip of the iceberg. A lot of code in cephes uses #ifdef on platform specificities, and let's not forget it is pre-ANSI C code (K&R declarations), with a lot of hidden bugs.

cheers,

David
_______________________________________________
Numpy-discussion mailing list
Numpy-discussion at scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

--
Information System Engineer, Ph.D.
matlab In-Reply-To: <4A2C9EF9.7030302@ar.media.kyoto-u.ac.jp> References: <4A295A7D.1040407@ar.media.kyoto-u.ac.jp> <7f014ea60906060959s6570cc32l277c5ab423f0b9ed@mail.gmail.com> <1244362840.4377.10.camel@gabriel-desktop> <1244368329.4377.18.camel@gabriel-desktop> <4A2B8A51.3070409@ar.media.kyoto-u.ac.jp> <20090607101204.GB20612@phare.normalesup.org> <20090608053210.GC5032@phare.normalesup.org> <4A2C9EF9.7030302@ar.media.kyoto-u.ac.jp> Message-ID: <75c31b2a0906080533t29af5e2k6aef04136a3a5a5e@mail.gmail.com> Note that EM can be very slow to converge: http://www.cs.toronto.edu/~roweis/papers/emecgicml03.pdf EM is great for churning-out papers, not so great for getting real work done. Conjugate gradient is a much better tool, at least in my (and Salakhutdinov's) experience. Btw, have you considered how much the Gaussianity assumption is hurting you? Jason On Mon, Jun 8, 2009 at 1:17 AM, David Cournapeau < david at ar.media.kyoto-u.ac.jp> wrote: > Gael Varoquaux wrote: > > I am using the heuristic exposed in > > http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=4562996 > > > > We have very noisy and long time series. My experience is that most > > model-based heuristics for choosing the number of PCs retained give us > > way too much on this problem (they simply keep diverging if I add noise > > at the end of the time series). The algorithm we use gives us ~50 > > interesting PCs (each composed of 50 000 dimensions). That happens to be > > quite right based on our experience with the signal. However, being > > fairly new to statistics, I am not aware of the EM algorithm that you > > mention. I'd be interested in a reference, to see if I can use that > > algorithm. > > I would not be surprised if David had this paper in mind :) > > http://www.cs.toronto.edu/~roweis/papers/empca.pdf > > cheers, > > David > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Jason Rennie Research Scientist, ITA Software 617-714-2645 http://www.itasoftware.com/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From gael.varoquaux at normalesup.org Mon Jun 8 09:00:52 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Mon, 8 Jun 2009 15:00:52 +0200 Subject: [Numpy-discussion] performance matrix multiplication vs. matlab In-Reply-To: <75c31b2a0906080533t29af5e2k6aef04136a3a5a5e@mail.gmail.com> References: <7f014ea60906060959s6570cc32l277c5ab423f0b9ed@mail.gmail.com> <1244362840.4377.10.camel@gabriel-desktop> <1244368329.4377.18.camel@gabriel-desktop> <4A2B8A51.3070409@ar.media.kyoto-u.ac.jp> <20090607101204.GB20612@phare.normalesup.org> <20090608053210.GC5032@phare.normalesup.org> <4A2C9EF9.7030302@ar.media.kyoto-u.ac.jp> <75c31b2a0906080533t29af5e2k6aef04136a3a5a5e@mail.gmail.com> Message-ID: <20090608130052.GE22703@phare.normalesup.org> On Mon, Jun 08, 2009 at 08:33:11AM -0400, Jason Rennie wrote: > EM is great for churning-out papers, not so great for getting real work > done.? That's just what I thought. > Btw, have you considered how much the Gaussianity assumption is > hurting you? I have. And the answer is: not much. But then, my order-selection method is just about selecting the non-gaussian components. And the non-orthogonality of the interessing 'indedpendant' signals is small, in that subspace. 
Ga?l From josef.pktd at gmail.com Mon Jun 8 09:02:12 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 8 Jun 2009 09:02:12 -0400 Subject: [Numpy-discussion] performance matrix multiplication vs. matlab In-Reply-To: <20090608072939.GA23146@phare.normalesup.org> References: <7f014ea60906051437p6002c6f7jb2d77374a220d2aa@mail.gmail.com> <7f014ea60906060959s6570cc32l277c5ab423f0b9ed@mail.gmail.com> <1244362840.4377.10.camel@gabriel-desktop> <1244368329.4377.18.camel@gabriel-desktop> <4A2B8A51.3070409@ar.media.kyoto-u.ac.jp> <20090607101204.GB20612@phare.normalesup.org> <20090608053210.GC5032@phare.normalesup.org> <20090608072939.GA23146@phare.normalesup.org> Message-ID: <1cd32cbb0906080602r26553b2egc14b98e1874b86ce@mail.gmail.com> On Mon, Jun 8, 2009 at 3:29 AM, Gael Varoquaux wrote: > On Mon, Jun 08, 2009 at 08:58:29AM +0200, Matthieu Brucher wrote: >> Given the number of PCs, I think you may just be measuring noise. >> As said in several manifold reduction publications (as the ones by >> Torbjorn Vik who published on robust PCA for medical imaging), you >> cannot expect to have more than 4 or 5 meaningful PCs, due to the >> dimensionality curse. If you want 50 PCs, you have to have at least... >> 10^50 samples, which is quite a lot, let's say it this way. >> According to the litterature, a usual manifold can be described by 4 >> or 5 variables. If you have more, it is that you may be infringing >> your hypothesis, here the linearity of your data (and as it is medical >> imaging, you know from the beginning that this hypothesis is wrong). >> So if you really want to find something meaningful and/or physical, >> you should use a real dimensionality reduction, preferably a >> non-linear one. > > I am not sure I am following you: I have time-varying signals. I am not > taking a shot of the same process over and over again. My intuition tells > me that I have more than 5 meaningful patterns. > > Anyhow, I do some more analysis behind that (ICA actually), and I do find > more than 5 patterns of interest that I not noise. Just curious: whats the actual shape of the array/data you run your PCA on. Number of time periods, size of cross section at point in time? Josef From david at ar.media.kyoto-u.ac.jp Mon Jun 8 08:55:23 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Mon, 08 Jun 2009 21:55:23 +0900 Subject: [Numpy-discussion] performance matrix multiplication vs. matlab In-Reply-To: <75c31b2a0906080533t29af5e2k6aef04136a3a5a5e@mail.gmail.com> References: <4A295A7D.1040407@ar.media.kyoto-u.ac.jp> <7f014ea60906060959s6570cc32l277c5ab423f0b9ed@mail.gmail.com> <1244362840.4377.10.camel@gabriel-desktop> <1244368329.4377.18.camel@gabriel-desktop> <4A2B8A51.3070409@ar.media.kyoto-u.ac.jp> <20090607101204.GB20612@phare.normalesup.org> <20090608053210.GC5032@phare.normalesup.org> <4A2C9EF9.7030302@ar.media.kyoto-u.ac.jp> <75c31b2a0906080533t29af5e2k6aef04136a3a5a5e@mail.gmail.com> Message-ID: <4A2D0A3B.9030105@ar.media.kyoto-u.ac.jp> Jason Rennie wrote: > Note that EM can be very slow to converge: > > http://www.cs.toronto.edu/~roweis/papers/emecgicml03.pdf > > > EM is great for churning-out papers, not so great for getting real > work done. I think it depends on what you are doing - EM is used for 'real' work too, after all :) > Conjugate gradient is a much better tool, at least in my (and > Salakhutdinov's) experience. Thanks for the link, I was not aware of this work. What is the difference between the ECG method and the method proposed by Lange in [1] ? 
To avoid 'local trapping' of the parameter in EM methods, recursive EM [2] may also be a promising method, also it seems to me that it has not been used so much, but I may well be wrong (I have seen several people using a simplified version of it without much theoretical consideration in speech processing). cheers, David [1] "A gradient algorithm locally equivalent to the EM algorithm", in Journal of the Royal Statistical Society. Series B. Methodological, 1995, vol. 57, n^o 2, pp. 425-437 [2] "Online EM Algorithm for Latent Data Models", by: Olivier Cappe;, Eric Moulines, in the Journal of the Royal Statistical Society Series B (February 2009). From gael.varoquaux at normalesup.org Mon Jun 8 09:17:04 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Mon, 8 Jun 2009 15:17:04 +0200 Subject: [Numpy-discussion] performance matrix multiplication vs. matlab In-Reply-To: <1cd32cbb0906080602r26553b2egc14b98e1874b86ce@mail.gmail.com> References: <7f014ea60906060959s6570cc32l277c5ab423f0b9ed@mail.gmail.com> <1244362840.4377.10.camel@gabriel-desktop> <1244368329.4377.18.camel@gabriel-desktop> <4A2B8A51.3070409@ar.media.kyoto-u.ac.jp> <20090607101204.GB20612@phare.normalesup.org> <20090608053210.GC5032@phare.normalesup.org> <20090608072939.GA23146@phare.normalesup.org> <1cd32cbb0906080602r26553b2egc14b98e1874b86ce@mail.gmail.com> Message-ID: <20090608131704.GF22703@phare.normalesup.org> On Mon, Jun 08, 2009 at 09:02:12AM -0400, josef.pktd at gmail.com wrote: > whats the actual shape of the array/data you run your PCA on. 50 000 dimensions, 820 datapoints. > Number of time periods, size of cross section at point in time? I am not sure what the question means. The data is sampled at 0.5Hz. Ga?l From matthieu.brucher at gmail.com Mon Jun 8 09:25:17 2009 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Mon, 8 Jun 2009 15:25:17 +0200 Subject: [Numpy-discussion] performance matrix multiplication vs. matlab In-Reply-To: <20090608131704.GF22703@phare.normalesup.org> References: <7f014ea60906060959s6570cc32l277c5ab423f0b9ed@mail.gmail.com> <1244368329.4377.18.camel@gabriel-desktop> <4A2B8A51.3070409@ar.media.kyoto-u.ac.jp> <20090607101204.GB20612@phare.normalesup.org> <20090608053210.GC5032@phare.normalesup.org> <20090608072939.GA23146@phare.normalesup.org> <1cd32cbb0906080602r26553b2egc14b98e1874b86ce@mail.gmail.com> <20090608131704.GF22703@phare.normalesup.org> Message-ID: 2009/6/8 Gael Varoquaux : > On Mon, Jun 08, 2009 at 09:02:12AM -0400, josef.pktd at gmail.com wrote: >> whats the actual shape of the array/data you run your PCA on. > > 50 000 dimensions, 820 datapoints. You definitely can't expect to find 50 meaningfull PCs. It's impossible to robustly get them with less than thousand points! >> Number of time periods, size of cross section at point in time? > > I am not sure what the question means. The data is sampled at 0.5Hz. > > Ga?l > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Information System Engineer, Ph.D. Website: http://matthieu-brucher.developpez.com/ Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92 LinkedIn: http://www.linkedin.com/in/matthieubrucher From kwgoodman at gmail.com Mon Jun 8 09:28:06 2009 From: kwgoodman at gmail.com (Keith Goodman) Date: Mon, 8 Jun 2009 06:28:06 -0700 Subject: [Numpy-discussion] performance matrix multiplication vs. 
matlab In-Reply-To: <20090608131704.GF22703@phare.normalesup.org> References: <7f014ea60906060959s6570cc32l277c5ab423f0b9ed@mail.gmail.com> <1244368329.4377.18.camel@gabriel-desktop> <4A2B8A51.3070409@ar.media.kyoto-u.ac.jp> <20090607101204.GB20612@phare.normalesup.org> <20090608053210.GC5032@phare.normalesup.org> <20090608072939.GA23146@phare.normalesup.org> <1cd32cbb0906080602r26553b2egc14b98e1874b86ce@mail.gmail.com> <20090608131704.GF22703@phare.normalesup.org> Message-ID: On Mon, Jun 8, 2009 at 6:17 AM, Gael Varoquaux wrote: > On Mon, Jun 08, 2009 at 09:02:12AM -0400, josef.pktd at gmail.com wrote: >> whats the actual shape of the array/data you run your PCA on. > > 50 000 dimensions, 820 datapoints. Have you tried shuffling each time series, performing PCA, looking at the magnitude of the largest eigenvalue, then repeating many times? That will give you an idea of how large the noise can be. Then you can see how many eigenvectors of the unshuffled data have eigenvalues greater than the noise. It would be kind of the empirical approach to random matrix theory. From gael.varoquaux at normalesup.org Mon Jun 8 09:34:14 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Mon, 8 Jun 2009 15:34:14 +0200 Subject: [Numpy-discussion] performance matrix multiplication vs. matlab In-Reply-To: References: <1244368329.4377.18.camel@gabriel-desktop> <4A2B8A51.3070409@ar.media.kyoto-u.ac.jp> <20090607101204.GB20612@phare.normalesup.org> <20090608053210.GC5032@phare.normalesup.org> <20090608072939.GA23146@phare.normalesup.org> <1cd32cbb0906080602r26553b2egc14b98e1874b86ce@mail.gmail.com> <20090608131704.GF22703@phare.normalesup.org> Message-ID: <20090608133414.GG22703@phare.normalesup.org> On Mon, Jun 08, 2009 at 06:28:06AM -0700, Keith Goodman wrote: > On Mon, Jun 8, 2009 at 6:17 AM, Gael Varoquaux > wrote: > > On Mon, Jun 08, 2009 at 09:02:12AM -0400, josef.pktd at gmail.com wrote: > >> whats the actual shape of the array/data you run your PCA on. > > 50 000 dimensions, 820 datapoints. > Have you tried shuffling each time series, performing PCA, looking at > the magnitude of the largest eigenvalue, then repeating many times? > That will give you an idea of how large the noise can be. Then you can > see how many eigenvectors of the unshuffled data have eigenvalues > greater than the noise. It would be kind of the empirical approach to > random matrix theory. Yes, that's the kind of things that is done in the paper I pointed out and I use to infer the number of PCAs I retain. Ga?l From cimrman3 at ntc.zcu.cz Mon Jun 8 09:38:20 2009 From: cimrman3 at ntc.zcu.cz (Robert Cimrman) Date: Mon, 08 Jun 2009 15:38:20 +0200 Subject: [Numpy-discussion] extract elements of an array that are contained in another array? In-Reply-To: <4A2CFB3E.3080007@ntc.zcu.cz> References: <4A27BCCF.4090608@american.edu> <1cd32cbb0906040535m235a1e90oc52f21806238d5d7@mail.gmail.com> <4A27D67F.3@american.edu> <1cd32cbb0906040750x120abbe8g397eb57e734bdeb8@mail.gmail.com> <4A27E615.70604@american.edu> <1cd32cbb0906040829n74f01778s5ef37894e1b5e081@mail.gmail.com> <4A28AEB2.5050209@ntc.zcu.cz> <1cd32cbb0906060441l119f274frc9108d1444cfe638@mail.gmail.com> <4A2CFB3E.3080007@ntc.zcu.cz> Message-ID: <4A2D144C.8050507@ntc.zcu.cz> Robert Cimrman wrote: > Hi Josef, > > thanks for the summary! I am responding below, later I will make an > enhancement ticket. Done, see http://projects.scipy.org/numpy/ticket/1133 r. 
From cimrman3 at ntc.zcu.cz Mon Jun 8 09:38:46 2009 From: cimrman3 at ntc.zcu.cz (Robert Cimrman) Date: Mon, 08 Jun 2009 15:38:46 +0200 Subject: [Numpy-discussion] setmember1d_nu In-Reply-To: <4A279882.5090700@ntc.zcu.cz> References: <4A279882.5090700@ntc.zcu.cz> Message-ID: <4A2D1466.2000501@ntc.zcu.cz> Robert Cimrman wrote: > Hi Neil, > > Neil Crighton wrote: >> Hi all, >> >> I posted this message couple of days ago, but gmane grouped it with an old >> thread and it hasn't shown up on the front page. So here it is again... >> >> I'd really like to see the setmember1d_nu function in ticket 1036 get into >> numpy. There's a patch waiting for review that including tests: >> >> http://projects.scipy.org/numpy/ticket/1036 >> >> Is there anything I can do to help get it applied? > > I guess I could commit it, if you review the patch and it works for you. > Obviously, I cannot review it myself, but my SVN access may still work :) Thanks for the review, it is in! r. From matthieu.brucher at gmail.com Mon Jun 8 09:58:22 2009 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Mon, 8 Jun 2009 15:58:22 +0200 Subject: [Numpy-discussion] scipy 0.7.1rc2 released In-Reply-To: References: <5b8d13220906050409u30286931w7bd9aac1e01b9ebf@mail.gmail.com> <799406d60906061957w2b0bd6c9n33fb898a7fc16e28@mail.gmail.com> <4A2CF2E3.1080805@ar.media.kyoto-u.ac.jp> <5b8d13220906080454md10f9e6o74c924196f9f7bc@mail.gmail.com> Message-ID: OK, I'm stuck with #946 with the MKL as well (finally managed to compile and use it with only the static library safe for libguide). I'm trying to download the trunk at the moment to check if the segmentation fault is still there. Matthieu 2009/6/8 Matthieu Brucher : > Good luck with fixing this then :| > > I've tried to build scipy with the MKL and ATLAS, and I have in both > cases a segmentation fault. With the MKL, it is the same as in a > previous mail, and for ATLAS it is there: > Regression test for #946. ... Segmentation fault > > A bad ATLAS compilation? > > Matthieu > >>> It could be simply enhanced by refactoring only mconf.h with proper >>> compiler flags, and fix yn.c to remove the NAN detection (as it should >>> be in the mconf.h). >> >> NAN and co definition should be dealt with the portable definitions we >> have now in numpy - I just have to find a way to reuse the >> corresponding code outside numpy (distutils currently does not handle >> proper installation of libraries built through build_clib), it is on >> my TODO list for scipy 0.8. >> >> Unfortunately, this is only the tip of the iceberg. A lot of code in >> cephes uses #ifdef on platform specificities, and let's not forget it >> is pre-ANSI C code (K&R declarations), with a lot of hidden bugs.\ >> >> cheers, >> >> David >> _______________________________________________ >> Numpy-discussion mailing list >> Numpy-discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > > > -- > Information System Engineer, Ph.D. > Website: http://matthieu-brucher.developpez.com/ > Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92 > LinkedIn: http://www.linkedin.com/in/matthieubrucher > -- Information System Engineer, Ph.D. 
Website: http://matthieu-brucher.developpez.com/ Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92 LinkedIn: http://www.linkedin.com/in/matthieubrucher From zelbier at gmail.com Mon Jun 8 09:58:25 2009 From: zelbier at gmail.com (Olivier Verdier) Date: Mon, 8 Jun 2009 15:58:25 +0200 Subject: [Numpy-discussion] matrix default to column vector? In-Reply-To: <3d375d730906071201k6ce4cd37y4afd814d9a551aa0@mail.gmail.com> References: <75c31b2a0905210610x1a321264r5b6f93d327ef2b36@mail.gmail.com> <4A2A8B50.9030904@american.edu> <23907204.post@talk.nabble.com> <3d375d730906070056j43b6e16fp818011701dfa6e9e@mail.gmail.com> <3d375d730906071201k6ce4cd37y4afd814d9a551aa0@mail.gmail.com> Message-ID: Is this lack of associativity really *always* such a huge issue? I can imagine many situations where it is not. One just want to compute A*B*C, without any particular knowing of whether A*(B*C) or (A*B)*C is best. If the user is allowed to blindly use A*B*C, I don't really see why he wouldn't be allowed to use dot(A,B,C) with the same convention... One should realize that allowing dot(A,B,C) is just *better* than the present situation where the user is forced into writing dot(dot(A,B),C) or dot(A,dot(B,C)). One does not remove any liberty from the user. He may always switch back to one of the above forms if he really knows which is best for him. So I fail to see exactly where the problem is... == Olivier 2009/6/7 Robert Kern > On Sun, Jun 7, 2009 at 04:44, Olivier Verdier wrote: > > Yes, I found the thread you are referring > > to: http://mail.python.org/pipermail/python-dev/2008-July/081554.html > > However, since A*B*C exists for matrices and actually computes (A*B)*C, > why > > not do the same with dot? I.e. why not decide that dot(A,B,C) does what > > would A*B*C do, i.e., dot(dot(A,B),C)? > > The performance and precision problems are the responsability of the > user, > > just as with the formula A*B*C. > > I'm happy to make the user responsible for performance and precision > problems if he has the tools to handle them. The operator gives the > user the easy ability to decide the precedence with parentheses. The > function does not. > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > -- Umberto Eco > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From matthieu.brucher at gmail.com Mon Jun 8 10:32:09 2009 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Mon, 8 Jun 2009 16:32:09 +0200 Subject: [Numpy-discussion] scipy 0.7.1rc2 released In-Reply-To: References: <5b8d13220906050409u30286931w7bd9aac1e01b9ebf@mail.gmail.com> <799406d60906061957w2b0bd6c9n33fb898a7fc16e28@mail.gmail.com> <4A2CF2E3.1080805@ar.media.kyoto-u.ac.jp> <5b8d13220906080454md10f9e6o74c924196f9f7bc@mail.gmail.com> Message-ID: David, I've checked out the trunk, and the segmentation fault isn't there anymore (the trunk is labeled 0.8.0 though) Here is the log from the remaining errors with the MKL: ====================================================================== ERROR: Failure: ImportError (/data/pau112/INNO/local/x86_64/lib/python2.5/site-packages/scipy/linalg/atlas_version.so: undefined symbol: ATL_buildinfo) ---------------------------------------------------------------------- Traceback (most recent call last): File "/data/pau112/INNO/local/x86_64/lib/python2.5/site-packages/nose/loader.py", line 364, in loadTestsFromName addr.filename, addr.module) File "/data/pau112/INNO/local/x86_64/lib/python2.5/site-packages/nose/importer.py", line 39, in importFromPath return self.importFromDir(dir_path, fqname) File "/data/pau112/INNO/local/x86_64/lib/python2.5/site-packages/nose/importer.py", line 84, in importFromDir mod = load_module(part_fqname, fh, filename, desc) File "/data/pau112/INNO/local/x86_64/lib/python2.5/site-packages/scipy/linalg/tests/test_atlas_version.py", line 8, in import scipy.linalg.atlas_version ImportError: /data/pau112/INNO/local/x86_64/lib/python2.5/site-packages/scipy/linalg/atlas_version.so: undefined symbol: ATL_buildinfo ====================================================================== ERROR: test_io.test_imread ---------------------------------------------------------------------- Traceback (most recent call last): File "/data/pau112/INNO/local/x86_64/lib/python2.5/site-packages/nose/case.py", line 182, in runTest self.test(*self.arg) File "/data/pau112/INNO/local/x86_64/lib/python2.5/site-packages/scipy/ndimage/tests/test_io.py", line 8, in test_imread img = ndi.imread(lp) AttributeError: 'module' object has no attribute 'imread' ====================================================================== ERROR: test_outer_v (test_lgmres.TestLGMRES) ---------------------------------------------------------------------- Traceback (most recent call last): File "/data/pau112/INNO/local/x86_64/lib/python2.5/site-packages/scipy/sparse/linalg/isolve/tests/test_lgmres.py", line 52, in test_outer_v x0, count_0 = do_solve(outer_k=6, outer_v=outer_v) File "/data/pau112/INNO/local/x86_64/lib/python2.5/site-packages/scipy/sparse/linalg/isolve/tests/test_lgmres.py", line 29, in do_solve x0, flag = lgmres(A, b, x0=zeros(A.shape[0]), inner_m=6, tol=1e-14, **kw) TypeError: 'module' object is not callable ====================================================================== ERROR: test_preconditioner (test_lgmres.TestLGMRES) ---------------------------------------------------------------------- Traceback (most recent call last): File "/data/pau112/INNO/local/x86_64/lib/python2.5/site-packages/scipy/sparse/linalg/isolve/tests/test_lgmres.py", line 41, in test_preconditioner x0, count_0 = do_solve() File "/data/pau112/INNO/local/x86_64/lib/python2.5/site-packages/scipy/sparse/linalg/isolve/tests/test_lgmres.py", line 29, in do_solve x0, flag = lgmres(A, b, x0=zeros(A.shape[0]), inner_m=6, tol=1e-14, **kw) TypeError: 'module' object is 
not callable ====================================================================== ERROR: test_iv_cephes_vs_amos (test_basic.TestBessel) ---------------------------------------------------------------------- Traceback (most recent call last): File "/data/pau112/INNO/local/x86_64/lib/python2.5/site-packages/scipy/special/tests/test_basic.py", line 1691, in test_iv_cephes_vs_amos self.check_cephes_vs_amos(iv, iv, rtol=1e-12, atol=1e-305) File "/data/pau112/INNO/local/x86_64/lib/python2.5/site-packages/scipy/special/tests/test_basic.py", line 1672, in check_cephes_vs_amos assert_tol_equal(c1, c2, err_msg=(v, z), rtol=rtol, atol=atol) File "/data/pau112/INNO/local/x86_64/lib/python2.5/site-packages/scipy/special/tests/test_basic.py", line 38, in assert_tol_equal verbose=verbose, header=header) File "/data/pau112/INNO/local/x86_64/lib/python2.5/site-packages/numpy/testing/utils.py", line 377, in assert_array_compare val = comparison(x[~xnanid], y[~ynanid]) IndexError: 0-d arrays can't be indexed ====================================================================== FAIL: test_lorentz (test_odr.TestODR) ---------------------------------------------------------------------- Traceback (most recent call last): File "/data/pau112/INNO/local/x86_64/lib/python2.5/site-packages/scipy/odr/tests/test_odr.py", line 292, in test_lorentz 3.7798193600109009e+00]), File "/data/pau112/INNO/local/x86_64/lib/python2.5/site-packages/numpy/testing/utils.py", line 537, in assert_array_almost_equal header='Arrays are not almost equal') File "/data/pau112/INNO/local/x86_64/lib/python2.5/site-packages/numpy/testing/utils.py", line 395, in assert_array_compare raise AssertionError(msg) AssertionError: Arrays are not almost equal (mismatch 100.0%) x: array([ 1.00000000e+03, 1.00000000e-01, 3.80000000e+00]) y: array([ 1.43067808e+03, 1.33905090e-01, 3.77981936e+00]) ====================================================================== FAIL: test_multi (test_odr.TestODR) ---------------------------------------------------------------------- Traceback (most recent call last): File "/data/pau112/INNO/local/x86_64/lib/python2.5/site-packages/scipy/odr/tests/test_odr.py", line 188, in test_multi 0.5101147161764654, 0.5173902330489161]), File "/data/pau112/INNO/local/x86_64/lib/python2.5/site-packages/numpy/testing/utils.py", line 537, in assert_array_almost_equal header='Arrays are not almost equal') File "/data/pau112/INNO/local/x86_64/lib/python2.5/site-packages/numpy/testing/utils.py", line 395, in assert_array_compare raise AssertionError(msg) AssertionError: Arrays are not almost equal (mismatch 100.0%) x: array([ 4. , 2. , 7. 
, 0.4, 0.5]) y: array([ 4.37998803, 2.43330576, 8.00288459, 0.51011472, 0.51739023]) ====================================================================== FAIL: test_pearson (test_odr.TestODR) ---------------------------------------------------------------------- Traceback (most recent call last): File "/data/pau112/INNO/local/x86_64/lib/python2.5/site-packages/scipy/odr/tests/test_odr.py", line 235, in test_pearson np.array([ 5.4767400299231674, -0.4796082367610305]), File "/data/pau112/INNO/local/x86_64/lib/python2.5/site-packages/numpy/testing/utils.py", line 537, in assert_array_almost_equal header='Arrays are not almost equal') File "/data/pau112/INNO/local/x86_64/lib/python2.5/site-packages/numpy/testing/utils.py", line 395, in assert_array_compare raise AssertionError(msg) AssertionError: Arrays are not almost equal (mismatch 100.0%) x: array([ 1., 1.]) y: array([ 5.47674003, -0.47960824]) ====================================================================== FAIL: test_jv_cephes_vs_amos (test_basic.TestBessel) ---------------------------------------------------------------------- Traceback (most recent call last): File "/data/pau112/INNO/local/x86_64/lib/python2.5/site-packages/scipy/special/tests/test_basic.py", line 1678, in test_jv_cephes_vs_amos self.check_cephes_vs_amos(jv, jn, rtol=1e-10, atol=1e-305) File "/data/pau112/INNO/local/x86_64/lib/python2.5/site-packages/scipy/special/tests/test_basic.py", line 1672, in check_cephes_vs_amos assert_tol_equal(c1, c2, err_msg=(v, z), rtol=rtol, atol=atol) File "/data/pau112/INNO/local/x86_64/lib/python2.5/site-packages/scipy/special/tests/test_basic.py", line 38, in assert_tol_equal verbose=verbose, header=header) File "/data/pau112/INNO/local/x86_64/lib/python2.5/site-packages/numpy/testing/utils.py", line 395, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=1e-10, atol=1e-305 (-100.3, -1300) (mismatch 100.0%) x: array(0.0) y: array((-0.012756553055739306+0.017557888993457362j)) ====================================================================== FAIL: Real-valued Bessel I overflow ---------------------------------------------------------------------- Traceback (most recent call last): File "/data/pau112/INNO/local/x86_64/lib/python2.5/site-packages/scipy/special/tests/test_basic.py", line 1787, in test_ticket_503 assert_tol_equal(iv(1, 700), 1.528500390233901e302) File "/data/pau112/INNO/local/x86_64/lib/python2.5/site-packages/scipy/special/tests/test_basic.py", line 38, in assert_tol_equal verbose=verbose, header=header) File "/data/pau112/INNO/local/x86_64/lib/python2.5/site-packages/numpy/testing/utils.py", line 395, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=1e-07, atol=0 (mismatch 100.0%) x: array(1.5306870843952584e+302) y: array(1.528500390233901e+302) ====================================================================== FAIL: Real-valued Bessel domains ---------------------------------------------------------------------- Traceback (most recent call last): File "/data/pau112/INNO/local/x86_64/lib/python2.5/site-packages/scipy/special/tests/test_basic.py", line 1770, in test_ticket_854 assert isnan(jv(0.5, -1)) AssertionError ====================================================================== FAIL: test_yv_cephes_vs_amos_only_small_orders (test_basic.TestBessel) ---------------------------------------------------------------------- Traceback (most recent call last): File 
"/data/pau112/INNO/local/x86_64/lib/python2.5/site-packages/scipy/special/tests/test_basic.py", line 1688, in test_yv_cephes_vs_amos_only_small_orders self.check_cephes_vs_amos(yv, yn, rtol=1e-11, atol=1e-305, skip=skipper) File "/data/pau112/INNO/local/x86_64/lib/python2.5/site-packages/scipy/special/tests/test_basic.py", line 1672, in check_cephes_vs_amos assert_tol_equal(c1, c2, err_msg=(v, z), rtol=rtol, atol=atol) File "/data/pau112/INNO/local/x86_64/lib/python2.5/site-packages/scipy/special/tests/test_basic.py", line 38, in assert_tol_equal verbose=verbose, header=header) File "/data/pau112/INNO/local/x86_64/lib/python2.5/site-packages/numpy/testing/utils.py", line 395, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=1e-11, atol=1e-305 (-20.0, -1300) (mismatch 100.0%) x: array(0.0) y: array((-0.021008635623302012+0.013914406061324559j)) ====================================================================== FAIL: test_basic.TestStruve.test_some_values ---------------------------------------------------------------------- Traceback (most recent call last): File "/data/pau112/INNO/local/x86_64/lib/python2.5/site-packages/nose/case.py", line 182, in runTest self.test(*self.arg) File "/data/pau112/INNO/local/x86_64/lib/python2.5/site-packages/scipy/special/tests/test_basic.py", line 2295, in test_some_values assert isnan(struve(-7.1, -1)) AssertionError ---------------------------------------------------------------------- Ran 3574 tests in 44.545s FAILED (KNOWNFAIL=6, SKIP=31, errors=5, failures=8) Matthieu 2009/6/8 Matthieu Brucher : > OK, I'm stuck with #946 with the MKL as well (finally managed to > compile and use it with only the static library safe for libguide). > > I'm trying to download the trunk at the moment to check if the > segmentation fault is still there. > > Matthieu > > 2009/6/8 Matthieu Brucher : >> Good luck with fixing this then :| >> >> I've tried to build scipy with the MKL and ATLAS, and I have in both >> cases a segmentation fault. With the MKL, it is the same as in a >> previous mail, and for ATLAS it is there: >> Regression test for #946. ... Segmentation fault >> >> A bad ATLAS compilation? >> >> Matthieu >> >>>> It could be simply enhanced by refactoring only mconf.h with proper >>>> compiler flags, and fix yn.c to remove the NAN detection (as it should >>>> be in the mconf.h). >>> >>> NAN and co definition should be dealt with the portable definitions we >>> have now in numpy - I just have to find a way to reuse the >>> corresponding code outside numpy (distutils currently does not handle >>> proper installation of libraries built through build_clib), it is on >>> my TODO list for scipy 0.8. >>> >>> Unfortunately, this is only the tip of the iceberg. A lot of code in >>> cephes uses #ifdef on platform specificities, and let's not forget it >>> is pre-ANSI C code (K&R declarations), with a lot of hidden bugs.\ >>> >>> cheers, >>> >>> David >>> _______________________________________________ >>> Numpy-discussion mailing list >>> Numpy-discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >> >> >> >> -- >> Information System Engineer, Ph.D. >> Website: http://matthieu-brucher.developpez.com/ >> Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92 >> LinkedIn: http://www.linkedin.com/in/matthieubrucher >> > > > > -- > Information System Engineer, Ph.D. 
> Website: http://matthieu-brucher.developpez.com/ > Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92 > LinkedIn: http://www.linkedin.com/in/matthieubrucher > -- Information System Engineer, Ph.D. Website: http://matthieu-brucher.developpez.com/ Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92 LinkedIn: http://www.linkedin.com/in/matthieubrucher From jrennie at gmail.com Mon Jun 8 10:40:15 2009 From: jrennie at gmail.com (Jason Rennie) Date: Mon, 8 Jun 2009 10:40:15 -0400 Subject: [Numpy-discussion] performance matrix multiplication vs. matlab In-Reply-To: <4A2D0A3B.9030105@ar.media.kyoto-u.ac.jp> References: <4A295A7D.1040407@ar.media.kyoto-u.ac.jp> <1244362840.4377.10.camel@gabriel-desktop> <1244368329.4377.18.camel@gabriel-desktop> <4A2B8A51.3070409@ar.media.kyoto-u.ac.jp> <20090607101204.GB20612@phare.normalesup.org> <20090608053210.GC5032@phare.normalesup.org> <4A2C9EF9.7030302@ar.media.kyoto-u.ac.jp> <75c31b2a0906080533t29af5e2k6aef04136a3a5a5e@mail.gmail.com> <4A2D0A3B.9030105@ar.media.kyoto-u.ac.jp> Message-ID: <75c31b2a0906080740rf4452d2vaa3a3a6964207621@mail.gmail.com> On Mon, Jun 8, 2009 at 8:55 AM, David Cournapeau < david at ar.media.kyoto-u.ac.jp> wrote: > I think it depends on what you are doing - EM is used for 'real' work > too, after all :) Certainly, but EM is really just a mediocre gradient descent/hill climbing algorithm that is relatively easy to implement. Thanks for the link, I was not aware of this work. What is the > difference between the ECG method and the method proposed by Lange in > [1] ? To avoid 'local trapping' of the parameter in EM methods, > recursive EM [2] may also be a promising method, also it seems to me > that it has not been used so much, but I may well be wrong (I have seen > several people using a simplified version of it without much theoretical > consideration in speech processing). I hung-out in the machine learning community appx. 1999-2007 and thought the Salakhutdinov work was extremely refreshing to see after listening to no end of papers applying EM to whatever was the hot topic at the time. :) I've certainly seen/heard about various fixes to EM, but I haven't seen convincing reason(s) to prefer it over proper gradient descent/hill climbing algorithms (besides its present-ability and ease of implementation). Cheers, Jason -- Jason Rennie Research Scientist, ITA Software 617-714-2645 http://www.itasoftware.com/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From david at ar.media.kyoto-u.ac.jp Mon Jun 8 10:23:29 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Mon, 08 Jun 2009 23:23:29 +0900 Subject: [Numpy-discussion] scipy 0.7.1rc2 released In-Reply-To: References: <5b8d13220906050409u30286931w7bd9aac1e01b9ebf@mail.gmail.com> <799406d60906061957w2b0bd6c9n33fb898a7fc16e28@mail.gmail.com> <4A2CF2E3.1080805@ar.media.kyoto-u.ac.jp> <5b8d13220906080454md10f9e6o74c924196f9f7bc@mail.gmail.com> Message-ID: <4A2D1EE1.1050205@ar.media.kyoto-u.ac.jp> Matthieu Brucher wrote: > David, > > I've checked out the trunk, and the segmentation fault isn't there > anymore (the trunk is labeled 0.8.0 though) > Yes, the upcoming 0.7.1 release has its code in the 0.7.x svn branch. But the fix for #946 is a backport of 0.8.0, so in theory, it should be fixed :) Concerning the other errors: did you compile with intel compilers or GNU ones ? 
cheers, David From matthieu.brucher at gmail.com Mon Jun 8 10:49:44 2009 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Mon, 8 Jun 2009 16:49:44 +0200 Subject: [Numpy-discussion] scipy 0.7.1rc2 released In-Reply-To: <4A2D1EE1.1050205@ar.media.kyoto-u.ac.jp> References: <5b8d13220906050409u30286931w7bd9aac1e01b9ebf@mail.gmail.com> <799406d60906061957w2b0bd6c9n33fb898a7fc16e28@mail.gmail.com> <4A2CF2E3.1080805@ar.media.kyoto-u.ac.jp> <5b8d13220906080454md10f9e6o74c924196f9f7bc@mail.gmail.com> <4A2D1EE1.1050205@ar.media.kyoto-u.ac.jp> Message-ID: 2009/6/8 David Cournapeau : > Matthieu Brucher wrote: >> David, >> >> I've checked out the trunk, and the segmentation fault isn't there >> anymore (the trunk is labeled 0.8.0 though) >> > > Yes, the upcoming 0.7.1 release has its code in the 0.7.x svn branch. > But the fix for #946 is a backport of 0.8.0, so in theory, it should be > fixed :) OK, I didn't check the branches, I should have :| > Concerning the other errors: did you compile with intel compilers or GNU > ones ? Only Intel compilers. Maybe I should check the rc branch instead of the trunk? Matthieu -- Information System Engineer, Ph.D. Website: http://matthieu-brucher.developpez.com/ Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92 LinkedIn: http://www.linkedin.com/in/matthieubrucher From david at ar.media.kyoto-u.ac.jp Mon Jun 8 11:02:54 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Tue, 09 Jun 2009 00:02:54 +0900 Subject: [Numpy-discussion] performance matrix multiplication vs. matlab In-Reply-To: <75c31b2a0906080740rf4452d2vaa3a3a6964207621@mail.gmail.com> References: <4A295A7D.1040407@ar.media.kyoto-u.ac.jp> <1244362840.4377.10.camel@gabriel-desktop> <1244368329.4377.18.camel@gabriel-desktop> <4A2B8A51.3070409@ar.media.kyoto-u.ac.jp> <20090607101204.GB20612@phare.normalesup.org> <20090608053210.GC5032@phare.normalesup.org> <4A2C9EF9.7030302@ar.media.kyoto-u.ac.jp> <75c31b2a0906080533t29af5e2k6aef04136a3a5a5e@mail.gmail.com> <4A2D0A3B.9030105@ar.media.kyoto-u.ac.jp> <75c31b2a0906080740rf4452d2vaa3a3a6964207621@mail.gmail.com> Message-ID: <4A2D281E.3050403@ar.media.kyoto-u.ac.jp> Jason Rennie wrote: > > I hung-out in the machine learning community appx. 1999-2007 and > thought the Salakhutdinov work was extremely refreshing to see after > listening to no end of papers applying EM to whatever was the hot > topic at the time. :) Isn't it true for any general framework who enjoys some popularity :) > I've certainly seen/heard about various fixes to EM, but I haven't > seen convincing reason(s) to prefer it over proper gradient > descent/hill climbing algorithms (besides its present-ability and ease > of implementation). 
I think there are cases where gradient methods are not applicable (latent
models where the complete data Y cannot be split into observations-hidden
(O, H) variables), although I am not sure that's a very common case in
machine learning,

cheers,

David

From david at ar.media.kyoto-u.ac.jp  Mon Jun  8 11:11:23 2009
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Tue, 09 Jun 2009 00:11:23 +0900
Subject: [Numpy-discussion] scipy 0.7.1rc2 released
In-Reply-To: 
References: <5b8d13220906050409u30286931w7bd9aac1e01b9ebf@mail.gmail.com> <799406d60906061957w2b0bd6c9n33fb898a7fc16e28@mail.gmail.com> <4A2CF2E3.1080805@ar.media.kyoto-u.ac.jp> <5b8d13220906080454md10f9e6o74c924196f9f7bc@mail.gmail.com> <4A2D1EE1.1050205@ar.media.kyoto-u.ac.jp>
Message-ID: <4A2D2A1B.60508@ar.media.kyoto-u.ac.jp>

Matthieu Brucher wrote:
>> Concerning the other errors: did you compile with intel compilers or GNU
>> ones ?
>>
>
> Only Intel compilers. Maybe I should check the rc branch instead of the trunk?
>

I just wanted to confirm - I am actually rather surprised there are not
more errors :)

cheers,

David

From jonnojohnson at gmail.com  Mon Jun  8 12:29:15 2009
From: jonnojohnson at gmail.com (Jonno)
Date: Mon, 8 Jun 2009 11:29:15 -0500
Subject: [Numpy-discussion] Interactive Shell/Editor/Workspace(variables)View/Plots
Message-ID: <3d15ebce0906080929r781b024dr9061ea6cff9b1df5@mail.gmail.com>

Hi All,

I'm new to Python and tools like matplotlib and Mayavi, so I may be
missing something basic. I've been looking for a fairly lightweight
editor/interactive shell combo that allows me to create plots and
figures from a shell and play with them and kill them gracefully. The
Mayavi documentation describes using IPython with the -wthread option
at startup, and while this works well, I really would like to use an
environment where I can see the variables. I really like the PyDee
layout and it has everything I need (editor view, shell, workspace,
doc view), but I don't think it can be used with IPython (yet).
Does anyone have any suggestions?
In Pydee I can generate plots and update them from the shell but I
can't (or don't know how to) kill them.
Is there an alternative with an IPython shell that has a built-in
editor view or at least a workspace where I can view variables?

I hope this is an acceptable place to post this. If not, please let me
know if you know a better place to ask.

Cheers,

Jonno.

--
"If a theory can't produce hypotheses, can't be tested, can't be
disproven, and can't make predictions, then it's not a theory and
certainly not science." by spisska on Slashdot, Monday April 21, 2008

From gokhansever at gmail.com  Mon Jun  8 12:35:43 2009
From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_SEVER?=)
Date: Mon, 8 Jun 2009 11:35:43 -0500
Subject: [Numpy-discussion] Interactive Shell/Editor/Workspace(variables)View/Plots
In-Reply-To: <3d15ebce0906080929r781b024dr9061ea6cff9b1df5@mail.gmail.com>
References: <3d15ebce0906080929r781b024dr9061ea6cff9b1df5@mail.gmail.com>
Message-ID: <49d6b3500906080935i7d013bc9q918dc2a7f27536b6@mail.gmail.com>

Hello,

To me, IPython is the right way to follow. Try "whos" to see what's in your
namespace.

You may want to see this instructional video (A Demonstration of the 'IPython'
Interactive Shell) to learn more about IPython's functionality, or you can
delve into its documentation.

There are IPython integration plans for pydee. You can see the details on
pydee's google page.

Gökhan
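For readers following the EM-versus-gradient-descent exchange a few messages
up, a minimal NumPy sketch of EM for a two-component 1-D Gaussian mixture may
help make the two steps concrete. Everything here (the synthetic data, initial
values, and variable names) is illustrative only, not anyone's production code:

import numpy as np

rng = np.random.RandomState(0)
# Synthetic 1-D data drawn from two Gaussians (means -2 and +3)
x = np.concatenate([rng.normal(-2, 1, 300), rng.normal(3, 1, 700)])

w = np.array([0.5, 0.5])        # mixing weights
mu = np.array([-1.0, 1.0])      # component means
var = np.array([1.0, 1.0])      # component variances

for _ in range(50):
    # E-step: posterior responsibility of each component for each point
    dens = np.exp(-(x[:, None] - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)
    resp = w * dens
    resp /= resp.sum(axis=1)[:, None]
    # M-step: closed-form maximization given the responsibilities
    nk = resp.sum(axis=0)
    w = nk / len(x)
    mu = (resp * x[:, None]).sum(axis=0) / nk
    var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk

print w, mu, var    # should recover roughly (0.3, 0.7), (-2, 3), (1, 1)

The E-step is the soft assignment, and the M-step maximizes the expected
complete-data log-likelihood in closed form, which is exactly what makes EM so
easy to implement compared to tuning a step size for a generic gradient method.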
On Mon, Jun 8, 2009 at 11:29 AM, Jonno wrote:
> Hi All,
>
> I'm new to Python and tools like matplotlib and Mayavi, so I may be
> missing something basic. I've been looking for a fairly lightweight
> editor/interactive shell combo that allows me to create plots and
> figures from a shell and play with them and kill them gracefully. The
> Mayavi documentation describes using IPython with the -wthread option
> at startup, and while this works well, I really would like to use an
> environment where I can see the variables. I really like the PyDee
> layout and it has everything I need (editor view, shell, workspace,
> doc view), but I don't think it can be used with IPython (yet).
> Does anyone have any suggestions?
> In Pydee I can generate plots and update them from the shell but I
> can't (or don't know how to) kill them.
> Is there an alternative with an IPython shell that has a built-in
> editor view or at least a workspace where I can view variables?
>
> I hope this is an acceptable place to post this. If not, please let me
> know if you know a better place to ask.
>
> Cheers,
>
> Jonno.
>
> --
> "If a theory can't produce hypotheses, can't be tested, can't be
> disproven, and can't make predictions, then it's not a theory and
> certainly not science." by spisska on Slashdot, Monday April 21, 2008
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From josef.pktd at gmail.com  Mon Jun  8 12:39:43 2009
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Mon, 8 Jun 2009 12:39:43 -0400
Subject: [Numpy-discussion] Interactive Shell/Editor/Workspace(variables)View/Plots
In-Reply-To: <3d15ebce0906080929r781b024dr9061ea6cff9b1df5@mail.gmail.com>
References: <3d15ebce0906080929r781b024dr9061ea6cff9b1df5@mail.gmail.com>
Message-ID: <1cd32cbb0906080939q36edbaf9gb526ac2694f150c4@mail.gmail.com>

On Mon, Jun 8, 2009 at 12:29 PM, Jonno wrote:
> Hi All,
>
> I'm new to Python and tools like matplotlib and Mayavi, so I may be
> missing something basic. I've been looking for a fairly lightweight
> editor/interactive shell combo that allows me to create plots and
> figures from a shell and play with them and kill them gracefully. The
> Mayavi documentation describes using IPython with the -wthread option
> at startup, and while this works well, I really would like to use an
> environment where I can see the variables. I really like the PyDee
> layout and it has everything I need (editor view, shell, workspace,
> doc view), but I don't think it can be used with IPython (yet).
> Does anyone have any suggestions?
> In Pydee I can generate plots and update them from the shell but I
> can't (or don't know how to) kill them.

I'm now using pydee as my main shell to try out new scripts, and I
don't have any problems with the plots. I'm creating plots the
standard way

from matplotlib import pyplot as plt
plt.plot(x,y)

and I can close the popping-up plot windows.
If I have too many plot windows, I use plt.close("all")
and it works without problems.
Sometimes the windows are a bit slow in responding, and I need to use
plt.show() more often than in a regular script.

I especially like the source view doc window in pydee, and the select
lines and execute, and ...

Josef

> Is there an alternative with an IPython shell that has a built-in
> editor view or at least a workspace where I can view variables?
>
> I hope this is an acceptable place to post this. If not, please let me
> know if you know a better place to ask.
>
> Cheers,
>
> Jonno.
>
> --
> "If a theory can't produce hypotheses, can't be tested, can't be
> disproven, and can't make predictions, then it's not a theory and
> certainly not science." by spisska on Slashdot, Monday April 21, 2008
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>

From Chris.Barker at noaa.gov  Mon Jun  8 12:55:09 2009
From: Chris.Barker at noaa.gov (Christopher Barker)
Date: Mon, 08 Jun 2009 09:55:09 -0700
Subject: [Numpy-discussion] matrix default to column vector?
In-Reply-To: 
References: <75c31b2a0905210610x1a321264r5b6f93d327ef2b36@mail.gmail.com> <4A2A8B50.9030904@american.edu> <23907204.post@talk.nabble.com> <3d375d730906070056j43b6e16fp818011701dfa6e9e@mail.gmail.com> <3d375d730906071201k6ce4cd37y4afd814d9a551aa0@mail.gmail.com>
Message-ID: <4A2D426D.3070004@noaa.gov>

Olivier Verdier wrote:
> One
> should realize that allowing dot(A,B,C) is just *better* than the
> present situation where the user is forced into writing dot(dot(A,B),C)
> or dot(A,dot(B,C)).

I'm lost now -- how is this better in any significant way?

Tom K. wrote:
> But,
> almost all experienced users drift away from matrix toward array as they
> find the matrix class too limiting or strange

That's one reason, and the other is that when you are doing real work,
it is very rare for the linear algebra portion to be significant. I know
in my code (and this was true when I was using MATLAB too), I may have
100 lines of code, and one of them is a linear algebra expression that
could be expressed nicely with matrices and infix operators. Given that
the rest of the code is more natural with nd-arrays, why the heck would
I want to use matrices? This drove me crazy with MATLAB -- I hated the
default matrix operators, I was always typing ".*", etc.

> - it seems only applicable for new users and pedagogical purposes.

and I'd take the new users off this list -- it serves no one to teach
people something first, then tell them to abandon it.

Which leaves the pedagogical purposes. In that case, you really need
operators; slightly cleaner syntax that isn't infix doesn't really serve
the pedagogical purpose. It seems there is a small but significant group
of folks on this list that want matrices for that reason. That group
needs to settle on a solution, and then implement it.

Personally, I think the row-vector, column-vector approach is the way to
go -- even though, yes, these are matrices that happen to have one
dimension or the other set to one. I know that when I was learning LA
(and still), I thought about row and column vectors a fair bit, and
indexing them in a simple way would be nice.

But I don't teach, so I'll stop there.

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov

From jonnojohnson at gmail.com  Mon Jun  8 12:58:31 2009
From: jonnojohnson at gmail.com (Jonno)
Date: Mon, 8 Jun 2009 11:58:31 -0500
Subject: [Numpy-discussion] Interactive Shell/Editor/Workspace(variables)View/Plots
In-Reply-To: <1cd32cbb0906080939q36edbaf9gb526ac2694f150c4@mail.gmail.com>
References: <3d15ebce0906080929r781b024dr9061ea6cff9b1df5@mail.gmail.com> <1cd32cbb0906080939q36edbaf9gb526ac2694f150c4@mail.gmail.com>
Message-ID: <3d15ebce0906080958w2d375fc9y87094a33bdb3600f@mail.gmail.com>

On Mon, Jun 8, 2009 at 11:39 AM, wrote:
>
> I'm now using pydee as my main shell to try out new scripts, and I
> don't have any problems with the plots. I'm creating plots the
> standard way
> from matplotlib import pyplot as plt
> plt.plot(x,y)
>
> and I can close the popping-up plot windows.
> If I have too many plot windows, I use plt.close("all")
> and it works without problems.
> Sometimes the windows are a bit slow in responding, and I need to use
> plt.show() more often than in a regular script.
>
> I especially like the source view doc window in pydee, and the select
> lines and execute, and ...
>
> Josef

Thanks Josef,

I shouldn't have included Matplotlib since Pydee does work well with
its plots. I had forgotten that. It really is just the Mayavi plots
(or scenes, I guess) that don't play well.

From jonnojohnson at gmail.com  Mon Jun  8 13:11:31 2009
From: jonnojohnson at gmail.com (Jonno)
Date: Mon, 8 Jun 2009 12:11:31 -0500
Subject: [Numpy-discussion] Interactive Shell/Editor/Workspace(variables)View/Plots
In-Reply-To: <49d6b3500906080935i7d013bc9q918dc2a7f27536b6@mail.gmail.com>
References: <3d15ebce0906080929r781b024dr9061ea6cff9b1df5@mail.gmail.com> <49d6b3500906080935i7d013bc9q918dc2a7f27536b6@mail.gmail.com>
Message-ID: <3d15ebce0906081011uf4e3f81ha45202647e0ba916@mail.gmail.com>

On Mon, Jun 8, 2009 at 11:35 AM, Gökhan SEVER wrote:
> Hello,
>
> To me, IPython is the right way to follow. Try "whos" to see what's in your
> namespace.
>
> You may want to see this instructional video (A Demonstration of the 'IPython'
> Interactive Shell) to learn more about IPython's functionality, or you can
> delve into its documentation.
>
> There are IPython integration plans for pydee. You can see the details on
> pydee's google page.
>
>
> Gökhan

Thanks Gokhan,

I didn't know about whos, so thanks for the tip. What about a
lightweight editor with an integrated IPython shell, then?

I also found PyScripter, which looks pretty nice too but also has the
same lack of an IPython shell.

From zelbier at gmail.com  Mon Jun  8 13:12:43 2009
From: zelbier at gmail.com (Olivier Verdier)
Date: Mon, 8 Jun 2009 19:12:43 +0200
Subject: [Numpy-discussion] matrix default to column vector?
In-Reply-To: <4A2D426D.3070004@noaa.gov>
References: <75c31b2a0905210610x1a321264r5b6f93d327ef2b36@mail.gmail.com> <23907204.post@talk.nabble.com> <3d375d730906070056j43b6e16fp818011701dfa6e9e@mail.gmail.com> <3d375d730906071201k6ce4cd37y4afd814d9a551aa0@mail.gmail.com> <4A2D426D.3070004@noaa.gov>
Message-ID: 

2009/6/8 Christopher Barker
> Olivier Verdier wrote:
> > One
> > should realize that allowing dot(A,B,C) is just *better* than the
> > present situation where the user is forced into writing dot(dot(A,B),C)
> > or dot(A,dot(B,C)).
>
> I'm lost now -- how is this better in any significant way?
Well, allowing dot(A,B,C) does not remove any other possibility, does it?
That is what I meant by "better". It just gives the user an extra
possibility. What would be wrong with that? Especially since matrix users
can already write A*B*C.

I won't fight for this, though. I personally don't care, but I think that
it would remove the last argument for matrices against arrays, namely the
fact that A*B*C is easier to write than dot(dot(A,B),C). I don't understand
why it would be a bad idea to implement this dot(A,B,C).

> Tom K. wrote:
> > But,
> > almost all experienced users drift away from matrix toward array as they
> > find the matrix class too limiting or strange
>
> That's one reason, and the other is that when you are doing real work,
> it is very rare for the linear algebra portion to be significant. I know
> in my code (and this was true when I was using MATLAB too), I may have
> 100 lines of code, and one of them is a linear algebra expression that
> could be expressed nicely with matrices and infix operators. Given that
> the rest of the code is more natural with nd-arrays, why the heck would
> I want to use matrices? This drove me crazy with MATLAB -- I hated the
> default matrix operators, I was always typing ".*", etc.

This exactly agrees with my experience too.

== Olivier
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From gokhansever at gmail.com  Mon Jun  8 13:26:09 2009
From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_SEVER?=)
Date: Mon, 8 Jun 2009 12:26:09 -0500
Subject: [Numpy-discussion] Interactive Shell/Editor/Workspace(variables)View/Plots
In-Reply-To: <3d15ebce0906081011uf4e3f81ha45202647e0ba916@mail.gmail.com>
References: <3d15ebce0906080929r781b024dr9061ea6cff9b1df5@mail.gmail.com> <49d6b3500906080935i7d013bc9q918dc2a7f27536b6@mail.gmail.com> <3d15ebce0906081011uf4e3f81ha45202647e0ba916@mail.gmail.com>
Message-ID: <49d6b3500906081026x5c1ed581pdf170981919a5967@mail.gmail.com>

On Mon, Jun 8, 2009 at 12:11 PM, Jonno wrote:
> On Mon, Jun 8, 2009 at 11:35 AM, Gökhan SEVER wrote:
> > Hello,
> >
> > To me, IPython is the right way to follow. Try "whos" to see what's in
> > your namespace.
> >
> > You may want to see this instructional video (A Demonstration of the
> > 'IPython' Interactive Shell) to learn more about IPython's functionality,
> > or you can delve into its documentation.
> >
> > There are IPython integration plans for pydee. You can see the details
> > on pydee's google page.
> >
> >
> > Gökhan
>
> Thanks Gokhan,
>
> I didn't know about whos, so thanks for the tip. What about a
> lightweight editor with an integrated IPython shell, then?
>
> I also found PyScripter, which looks pretty nice too but also has the
> same lack of an IPython shell.
>

I use scite as my main text editor. It highlights Python syntax nicely, and
has code-completion support. Well, not as powerful as the Eclipse-PyDev pair,
but it works :) And yes, PyDev doesn't have IPython integration either.
Eclipse-PyDev is also slow for me (loading takes lots of time :) and shell
integration is not as easy as in IPy. I am looking forward to the pydee
developers bringing IPython functionality into their development environment.

Besides PyScripter, there is also Eric4 as a free IDE for Python, but again
no IPython.

So far, IPython-Scite is the fastest way that I can build my programs.
Experiment in IPython and build pieces in Scite. I would like to know what
others use in this respect?
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
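On Olivier's dot(A,B,C) suggestion earlier in this thread: numpy had no such
function at the time, but a user-side helper is a one-liner. The name mdot
below is hypothetical, and the left-to-right fold simply reproduces
dot(dot(A,B),C); a smarter version could also pick the cheapest
parenthesization, since the cost of evaluating a matrix chain depends on how
it is associated:

import numpy as np

def mdot(*arrays):
    # Fold np.dot over the arguments, left to right:
    # mdot(A, B, C) == np.dot(np.dot(A, B), C)
    return reduce(np.dot, arrays)   # 'reduce' is a builtin in Python 2

A = np.random.rand(4, 3)
B = np.random.rand(3, 5)
C = np.random.rand(5, 2)
assert np.allclose(mdot(A, B, C), np.dot(np.dot(A, B), C))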
From oliphant at enthought.com  Mon Jun  8 13:54:25 2009
From: oliphant at enthought.com (Travis Oliphant)
Date: Mon, 8 Jun 2009 12:54:25 -0500
Subject: [Numpy-discussion] Interactive Shell/Editor/Workspace(variables)View/Plots
In-Reply-To: <49d6b3500906081026x5c1ed581pdf170981919a5967@mail.gmail.com>
References: <3d15ebce0906080929r781b024dr9061ea6cff9b1df5@mail.gmail.com> <49d6b3500906080935i7d013bc9q918dc2a7f27536b6@mail.gmail.com> <3d15ebce0906081011uf4e3f81ha45202647e0ba916@mail.gmail.com> <49d6b3500906081026x5c1ed581pdf170981919a5967@mail.gmail.com>
Message-ID: <20E4B6FC-F2B1-4C6C-949B-D0589DD797C5@enthought.com>

On Jun 8, 2009, at 12:26 PM, Gökhan SEVER wrote:

> On Mon, Jun 8, 2009 at 12:11 PM, Jonno wrote:
> On Mon, Jun 8, 2009 at 11:35 AM, Gökhan SEVER wrote:
> > Hello,
> >
> > To me, IPython is the right way to follow. Try "whos" to see
> what's in your
> > namespace.
> >
> > You may want to see this instructional video (A Demonstration of the
> 'IPython'
> > Interactive Shell) to learn more about IPython's functionality, or
> you can
> > delve into its documentation.
> >
> > There are IPython integration plans for pydee. You can see the
> details on
> > pydee's google page.
> >
> >
> > Gökhan
>
> Thanks Gokhan,
>
> I didn't know about whos, so thanks for the tip. What about a
> lightweight editor with an integrated IPython shell, then?
>
> I also found PyScripter, which looks pretty nice too but also has the
> same lack of an IPython shell.
>
> I use scite as my main text editor. It highlights Python syntax
> nicely, and has code-completion support. Well, not as powerful as the
> Eclipse-PyDev pair, but it works :) And yes, PyDev doesn't have
> IPython integration either. Eclipse-PyDev is also slow for me
> (loading takes lots of time :) and shell integration is not as easy as
> in IPy. I am looking forward to the pydee developers bringing IPython
> functionality into their development environment.
>
> Besides PyScripter, there is also Eric4 as a free IDE for Python,
> but again no IPython.
>
> So far, IPython-Scite is the fastest way that I can build my programs.
> Experiment in IPython and build pieces in Scite. I would like to
> know what others use in this respect?

You might take a look at EPDLab as well. Thanks to Gael Varoquaux, it
integrates IPython into an Envisage application and has a crude
name-space browser.

EPDLab is part of the Enthought Tool Suite and is an open-source
application (BSD-style license). It's another example (like Mayavi) of
using the Enthought Tool Suite to build applications. Don't be confused:
the binary distribution called EPD is only free for academic use, but
EPDLab itself is completely open source.

You can check out the source code here:

https://svn.enthought.com/svn/enthought/EPDLab/trunk

It requires quite a bit of ETS to run first, though. If you have EPD
installed, then EPDLab is already available to you.

It's still alpha, so I hesitate to advertise it. But it's easy to extend
as you would like, so I thought I would chime in on this discussion.

Best regards,

-Travis

>
>
>
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion

--
Travis Oliphant
Enthought Inc.
1-512-536-1057
http://www.enthought.com
oliphant at enthought.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From aisaac at american.edu Mon Jun 8 14:48:47 2009 From: aisaac at american.edu (Alan G Isaac) Date: Mon, 08 Jun 2009 14:48:47 -0400 Subject: [Numpy-discussion] matrix default to column vector? In-Reply-To: References: <75c31b2a0905210610x1a321264r5b6f93d327ef2b36@mail.gmail.com> <23907204.post@talk.nabble.com> <3d375d730906070056j43b6e16fp818011701dfa6e9e@mail.gmail.com> <3d375d730906071201k6ce4cd37y4afd814d9a551aa0@mail.gmail.com> <4A2D426D.3070004@noaa.gov> Message-ID: <4A2D5D0F.30800@american.edu> Olivier Verdier wrote: > Well, allowing dot(A,B,C) does not remove any other possibility does it? > I won't fight for this though. I personally don't care but I think that > it would remove the last argument for matrices against arrays, namely > the fact that A*B*C is easier to write than dot(dot(A,B),C). Well, no. Notation matters to students. Additionally, matrix exponentiation is useful. E.g., A**(N-1) finds the transitive closure of the binary relation represented by the NxN boolean matrix A. Alan Isaac From aisaac at american.edu Mon Jun 8 15:10:37 2009 From: aisaac at american.edu (Alan G Isaac) Date: Mon, 08 Jun 2009 15:10:37 -0400 Subject: [Numpy-discussion] matrix default to column vector? In-Reply-To: <3d375d730906071208y7e3437d7xeeee9e321d0f78dc@mail.gmail.com> References: <75c31b2a0905210610x1a321264r5b6f93d327ef2b36@mail.gmail.com> <4A29561D.5070806@american.edu> <4A2A8B50.9030904@american.edu> <23907204.post@talk.nabble.com> <23910425.post@talk.nabble.com> <3d375d730906071208y7e3437d7xeeee9e321d0f78dc@mail.gmail.com> Message-ID: <4A2D622D.9050008@american.edu> >> Going back to Alan Isaac's example: >> 1) beta = (X.T*X).I * X.T * Y >> 2) beta = np.dot(np.dot(la.inv(np.dot(X.T,X)),X.T),Y) Robert Kern wrote: > 4) beta = la.lstsq(X, Y)[0] > > I really hate that example. Remember, the example is a **teaching** example. I actually use NumPy in a Master's level math econ course (among other places). As it happens, I do get around to explaining why using an explicit inverse is a bad idea numerically, but that is entirely an aside in a course that is not concerned with numerical methods. It is concerned only with mastering a few basic math tools, and being able to implement some of them in code is largely a check on understanding and precision (and to provide basic background for future applications). Having them use lstsq is counterproductive for the material being covered, at least initially. A typical course of this type uses Excel or includes no applications at all. So please, show a little gratitude. ;-) Alan Isaac From llewelr at gmail.com Mon Jun 8 15:20:29 2009 From: llewelr at gmail.com (llewelr at gmail.com) Date: Mon, 08 Jun 2009 19:20:29 +0000 Subject: [Numpy-discussion] Fwd: Re: is my numpy installation using custom blas/lapack? In-Reply-To: <28e83ea0906081215w2957d52sfa19041e4dcd7499@mail.gmail.com> Message-ID: <001636457a9aa02c1f046bdb2247@google.com> Changing the site.cfg as you suggested did the trick! For what its worth, setup.py build no longer fails as before at compilation step (line 95), (I'm still puzzled whether this earlier 'failure' was caused by some error in my build process but I should probably let it go.) 
and numpy.show_config() now shows ATLAS info under blas_opt_info: blas_opt_info: libraries = ['ptf77blas', 'ptcblas', 'atlas'] library_dirs = ['/usr/local/rich/src/scipy_build/lib'] define_macros = [('ATLAS_INFO', '"\\"3.8.3\\""')] language = c I guess the short answer for whether non-threaded ATLAS libraries are being used (after being found) by a numpy installation is that there is no short answer. Thanks Chris for your patience & help! Numpy is a great resource. Rich -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Mon Jun 8 15:33:13 2009 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 8 Jun 2009 14:33:13 -0500 Subject: [Numpy-discussion] matrix default to column vector? In-Reply-To: <4A2D622D.9050008@american.edu> References: <75c31b2a0905210610x1a321264r5b6f93d327ef2b36@mail.gmail.com> <4A2A8B50.9030904@american.edu> <23907204.post@talk.nabble.com> <23910425.post@talk.nabble.com> <3d375d730906071208y7e3437d7xeeee9e321d0f78dc@mail.gmail.com> <4A2D622D.9050008@american.edu> Message-ID: <3d375d730906081233r3221d2ffq807196a049cab0ce@mail.gmail.com> On Mon, Jun 8, 2009 at 14:10, Alan G Isaac wrote: >>> Going back to Alan Isaac's example: >>> 1) ?beta = (X.T*X).I * X.T * Y >>> 2) ?beta = np.dot(np.dot(la.inv(np.dot(X.T,X)),X.T),Y) > > > Robert Kern wrote: >> 4) beta = la.lstsq(X, Y)[0] >> >> I really hate that example. > > > Remember, the example is a **teaching** example. I know. Honestly, I would prefer that teachers skip over the normal equations entirely and move directly to decomposition approaches. If you are going to make them implement least-squares from more basic tools, I think it's more enlightening as a student to start with the SVD than the normal equations. > I actually use NumPy in a Master's level math econ course > (among other places). ?As it happens, I do get around to > explaining why using an explicit inverse is a bad idea > numerically, but that is entirely an aside in a course > that is not concerned with numerical methods. ?It is > concerned only with mastering a few basic math tools, > and being able to implement some of them in code is > largely a check on understanding and precision (and > to provide basic background for future applications). > Having them use lstsq is counterproductive for the > material being covered, at least initially. > > A typical course of this type uses Excel or includes > no applications at all. ?So please, > show a little gratitude. ?;-) If it's not a class where they are going to use what they learn in the future to write numerical programs, I really don't care whether you teach it with numpy or not. If it *is* such a class, then I would prefer that the students get taught the right way to write numerical programs. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From martyfuhry at gmail.com Mon Jun 8 15:40:36 2009 From: martyfuhry at gmail.com (Marty Fuhry) Date: Mon, 8 Jun 2009 15:40:36 -0400 Subject: [Numpy-discussion] New datetime dtypes Message-ID: Hello, I'm working on the new datetime64 and timedelta64 dtypes (as proposed here: http://projects.scipy.org/numpy/browser/trunk/doc/neps/datetime-proposal3.rst). I'm looking through the C code in numpy core, and can't seem to find much in the way of dtypes. Pierre suggested looking through the multiarraymodule file. 
Where can I find some reference code on the other dtypes? -Marty Fuhry From dwf at cs.toronto.edu Mon Jun 8 14:14:48 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Mon, 8 Jun 2009 14:14:48 -0400 Subject: [Numpy-discussion] performance matrix multiplication vs. matlab In-Reply-To: <75c31b2a0906080533t29af5e2k6aef04136a3a5a5e@mail.gmail.com> References: <4A295A7D.1040407@ar.media.kyoto-u.ac.jp> <7f014ea60906060959s6570cc32l277c5ab423f0b9ed@mail.gmail.com> <1244362840.4377.10.camel@gabriel-desktop> <1244368329.4377.18.camel@gabriel-desktop> <4A2B8A51.3070409@ar.media.kyoto-u.ac.jp> <20090607101204.GB20612@phare.normalesup.org> <20090608053210.GC5032@phare.normalesup.org> <4A2C9EF9.7030302@ar.media.kyoto-u.ac.jp> <75c31b2a0906080533t29af5e2k6aef04136a3a5a5e@mail.gmail.com> Message-ID: An HTML attachment was scrubbed... URL: From jsseabold at gmail.com Mon Jun 8 15:54:19 2009 From: jsseabold at gmail.com (Skipper Seabold) Date: Mon, 8 Jun 2009 15:54:19 -0400 Subject: [Numpy-discussion] matrix default to column vector? In-Reply-To: <3d375d730906081233r3221d2ffq807196a049cab0ce@mail.gmail.com> References: <75c31b2a0905210610x1a321264r5b6f93d327ef2b36@mail.gmail.com> <4A2A8B50.9030904@american.edu> <23907204.post@talk.nabble.com> <23910425.post@talk.nabble.com> <3d375d730906071208y7e3437d7xeeee9e321d0f78dc@mail.gmail.com> <4A2D622D.9050008@american.edu> <3d375d730906081233r3221d2ffq807196a049cab0ce@mail.gmail.com> Message-ID: On Mon, Jun 8, 2009 at 3:33 PM, Robert Kern wrote: > On Mon, Jun 8, 2009 at 14:10, Alan G Isaac wrote: >>>> Going back to Alan Isaac's example: >>>> 1) ?beta = (X.T*X).I * X.T * Y >>>> 2) ?beta = np.dot(np.dot(la.inv(np.dot(X.T,X)),X.T),Y) >> >> >> Robert Kern wrote: >>> 4) beta = la.lstsq(X, Y)[0] >>> >>> I really hate that example. >> >> >> Remember, the example is a **teaching** example. > > I know. Honestly, I would prefer that teachers skip over the normal > equations entirely and move directly to decomposition approaches. If > you are going to make them implement least-squares from more basic > tools, I think it's more enlightening as a student to start with the > SVD than the normal equations. > >> I actually use NumPy in a Master's level math econ course >> (among other places). ?As it happens, I do get around to >> explaining why using an explicit inverse is a bad idea >> numerically, but that is entirely an aside in a course >> that is not concerned with numerical methods. ?It is >> concerned only with mastering a few basic math tools, >> and being able to implement some of them in code is >> largely a check on understanding and precision (and >> to provide basic background for future applications). >> Having them use lstsq is counterproductive for the >> material being covered, at least initially. >> >> A typical course of this type uses Excel or includes >> no applications at all. ?So please, >> show a little gratitude. ?;-) > > If it's not a class where they are going to use what they learn in the > future to write numerical programs, I really don't care whether you > teach it with numpy or not. > > If it *is* such a class, then I would prefer that the students get > taught the right way to write numerical programs. > I started in such a class (with Dr. Isaac as a matter of fact). I found the use of Python with Numpy to be very enlightening for the basic concepts of linear algebra. I appreciated the simple syntax of matrices at the time as a gentler learning curve since my background in programming was mainly at a hobbyist level. 
I then went on to take a few econometrics courses where we learned the normal equations. Now a few years later I am working on scipy.stats as a google summer of code project, and I am learning why a SVD decomposition is much more efficient (an economist never necessarily *needs* to know what's under the hood of their stats package). The intuition for the numerical methods was in place, as well as the basic familiarity with numpy/scipy. So I would not discount this approach too much. People get what they want out of anything, and I was happy to learn about Python and Numpy/Scipy as alternatives to proprietary packages. And I hope my work this summer can contribute even a little to making the project an accessible alternative for researchers without a strong technical background. Skipper From jh at physics.ucf.edu Mon Jun 8 16:02:36 2009 From: jh at physics.ucf.edu (Joe Harrington) Date: Mon, 08 Jun 2009 16:02:36 -0400 Subject: [Numpy-discussion] The SciPy Doc Marathon continues Message-ID: Let's Finish Documenting SciPy! Last year, we began the SciPy Documentation Marathon to write reference pages ("docstrings") for NumPy and SciPy. It was a huge job, bigger than we first imagined, with NumPy alone having over 2,000 functions. We created the doc wiki (now at docs.scipy.org), where you write, review, and proofread docs that then get integrated into the source code. In September, we had over 55% of NumPy in the "first draft" stage, and about 25% to the "needs review" stage. The PDF NumPy Reference Guide was over 300 pages, nicely formatted by ReST, which makes an HTML version as well. The PDF document now has over 500 pages, with the addition of sections from Travis Oliphant's book Guide to NumPy. That's an amazing amount of work, possible through the contributions of over 30 volunteers. It came back to us as the vastly-expanded help pages in NumPy 1.2, released last September. With your help, WE CAN FINISH! This summer we can: - Write all the "important" NumPy pages to the "Needs Review" stage - Start documenting the SciPy package - Get the SciPy User Manual started - Implement dual review - technical and presentation - on the doc wiki - Get NumPy docs and packaging on a sound financial footing We'll start with the first two. UCF has hired David Goldsmith to lead this summer's doc effort. David will write a lot of docs himself, but more importantly, he will organize our efforts toward completing doc milestones. There will be rewards, T-shirts, and likely other fun stuff for those who contribute the most. David will start the ball rolling shortly. This is a big vision, and it will require YOUR help to make it happen! The main need now is for people to work on the reference pages. Here's how: 1. Go to http://docs.scipy.org/NumPy 2. Read the intro and doc standards, and some docstrings on the wiki 3. Make an account 4. Ask the scipy-dev at scipy.org email list for editor access 5. EDIT! All doc discussions (except announcements like this one) should happen on the scipy-dev at scipy.org email list. You can browse the archives and sign up for the list at http://scipy.org/Mailing_Lists . That's where we will announce sprints on topic areas and so on. We'll also meet online every week, Wednesdays at 4:30pm US Eastern Time, on Skype. David will give the details. Welcome back to the Marathon! --jh-- Prof. Joseph Harrington Planetary Sciences Group Department of Physics MAP 414 4000 Central Florida Blvd. 
University of Central Florida
Orlando, FL 32816-2385
jh at physics.ucf.edu
planets.ucf.edu

From Chris.Barker at noaa.gov  Mon Jun  8 16:11:38 2009
From: Chris.Barker at noaa.gov (Christopher Barker)
Date: Mon, 08 Jun 2009 13:11:38 -0700
Subject: [Numpy-discussion] Interactive Shell/Editor/Workspace(variables)View/Plots
In-Reply-To: <49d6b3500906081026x5c1ed581pdf170981919a5967@mail.gmail.com>
References: <3d15ebce0906080929r781b024dr9061ea6cff9b1df5@mail.gmail.com> <49d6b3500906080935i7d013bc9q918dc2a7f27536b6@mail.gmail.com> <3d15ebce0906081011uf4e3f81ha45202647e0ba916@mail.gmail.com> <49d6b3500906081026x5c1ed581pdf170981919a5967@mail.gmail.com>
Message-ID: <4A2D707A.9040303@noaa.gov>

Gökhan SEVER wrote:
> So far, IPython-Scite is the fastest way that I can build my programs.
> Experiment in IPython and build pieces in Scite. I would like to know
> what others use in this respect?

Peppy (http://peppy.flipturn.org/) + iPython

It would be nice to have those two integrated, though it's really not
hard to switch between them. iPython's "run" is wonderful.

A note about Peppy: It's pretty new, not widely used, heavyweight and
not feature complete. However, it does a few things right (by my
personal definition of right ;-) ) that I haven't seen in any other
editor:

* Modern GUI (i.e. not Emacs or vim, which probably get everything else
  right...)
* Scripted/written in Python (the other reason not to use Emacs/vim)
* Designed to be general purpose, not primarily Python
* Multiple top-level windows, and the ability to edit the same file in
  multiple windows at once.
* Python (and other languages) indenting done right (i.e. like Emacs
  Python mode)
* Fully cross platform (Windows, Mac, *nix (GTK) )
* All its other features are pretty common...

Major missing feature: code completion -- I'm really starting to like
that in iPython...

It's been my primary editor for a year or so, and hasn't destroyed any
data yet! I'd love to see it get wider use.

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov

From stefan at sun.ac.za  Mon Jun  8 16:09:37 2009
From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=)
Date: Mon, 8 Jun 2009 22:09:37 +0200
Subject: [Numpy-discussion] matrix default to column vector?
In-Reply-To: <3d375d730906081233r3221d2ffq807196a049cab0ce@mail.gmail.com>
References: <75c31b2a0905210610x1a321264r5b6f93d327ef2b36@mail.gmail.com> <4A2A8B50.9030904@american.edu> <23907204.post@talk.nabble.com> <23910425.post@talk.nabble.com> <3d375d730906071208y7e3437d7xeeee9e321d0f78dc@mail.gmail.com> <4A2D622D.9050008@american.edu> <3d375d730906081233r3221d2ffq807196a049cab0ce@mail.gmail.com>
Message-ID: <9457e7c80906081309t610aeba6xfc6fa71996c0fda1@mail.gmail.com>

2009/6/8 Robert Kern :
>> Remember, the example is a **teaching** example.
>
> I know. Honestly, I would prefer that teachers skip over the normal
> equations entirely and move directly to decomposition approaches. If
> you are going to make them implement least-squares from more basic
> tools, I think it's more enlightening as a student to start with the
> SVD than the normal equations.

I agree, and I wish our curriculum followed that route. In linear
algebra, I also don't much like the way eigenvalues are taught, where
students have to solve characteristic polynomials by hand.
When I teach the subject again, I'll pay more attention to these books:

Numerical linear algebra by Lloyd Trefethen
http://books.google.co.za/books?id=bj-Lu6zjWbEC

(e.g. has SVD in Lecture 4)

Applied Numerical Linear Algebra by James Demmel
http://books.google.co.za/books?id=lr8cFi-YWnIC

(e.g. has perturbation theory on page 4)

Regards
Stéfan

From josef.pktd at gmail.com  Mon Jun  8 16:21:56 2009
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Mon, 8 Jun 2009 16:21:56 -0400
Subject: [Numpy-discussion] matrix default to column vector?
In-Reply-To: <9457e7c80906081309t610aeba6xfc6fa71996c0fda1@mail.gmail.com>
References: <75c31b2a0905210610x1a321264r5b6f93d327ef2b36@mail.gmail.com> <23907204.post@talk.nabble.com> <23910425.post@talk.nabble.com> <3d375d730906071208y7e3437d7xeeee9e321d0f78dc@mail.gmail.com> <4A2D622D.9050008@american.edu> <3d375d730906081233r3221d2ffq807196a049cab0ce@mail.gmail.com> <9457e7c80906081309t610aeba6xfc6fa71996c0fda1@mail.gmail.com>
Message-ID: <1cd32cbb0906081321ge1cb9a2q5974b43ac99651e5@mail.gmail.com>

2009/6/8 Stéfan van der Walt :
> 2009/6/8 Robert Kern :
>>> Remember, the example is a **teaching** example.
>>
>> I know. Honestly, I would prefer that teachers skip over the normal
>> equations entirely and move directly to decomposition approaches. If
>> you are going to make them implement least-squares from more basic
>> tools, I think it's more enlightening as a student to start with the
>> SVD than the normal equations.
>
> I agree, and I wish our curriculum followed that route. In linear
> algebra, I also don't much like the way eigenvalues are taught, where
> students have to solve characteristic polynomials by hand. When I
> teach the subject again, I'll pay more attention to these books:
>
> Numerical linear algebra by Lloyd Trefethen
> http://books.google.co.za/books?id=bj-Lu6zjWbEC
>
> (e.g. has SVD in Lecture 4)
>
> Applied Numerical Linear Algebra by James Demmel
> http://books.google.co.za/books?id=lr8cFi-YWnIC
>
> (e.g. has perturbation theory on page 4)
>
> Regards
> Stéfan

Ok, I also have to give my 2 cents.

Any basic econometrics textbook warns of multicollinearity. Since
economists are mostly interested in the parameter estimates, the
covariance matrix needs to have little multicollinearity; otherwise,
the standard errors of the parameters will be huge.

If I automatically use pinv or lstsq, then, unless I look at the
condition number and singularities, I get estimates that look pretty
nice, even though they have an "arbitrary" choice of the indeterminacy.

So in economics, I never worried too much about the numerical
precision of the inverse, because, if the correlation matrix is close
to singular, the model is misspecified, or needs reparameterization, or
the data is useless for the question.

Compared to endogeneity bias, for example, or homoscedasticity
assumptions and so on, the numerical problem is pretty small.

This doesn't mean matrix decomposition methods are not useful for
numerical calculations and efficiency, but I don't think the numerical
problem deserves a lot of emphasis in a basic econometrics class.

Josef
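To make the condition-number point in this thread concrete, here is a toy
comparison of the normal equations against an SVD-based solver on a nearly
collinear design matrix; the data and numbers are purely illustrative. Forming
X.T*X squares the condition number, so the normal-equations route loses
roughly twice as many digits as np.linalg.lstsq, which works on X directly:

import numpy as np

rng = np.random.RandomState(42)
n = 100
x1 = rng.normal(size=n)
x2 = x1 + 1e-6 * rng.normal(size=n)          # nearly collinear with x1
X = np.column_stack([np.ones(n), x1, x2])
y = np.dot(X, [1.0, 2.0, 3.0]) + 0.1 * rng.normal(size=n)

print np.linalg.cond(X)                       # ~1e6: ill-conditioned design

# Normal equations: the condition number of X.T*X is ~1e12 here
beta_ne = np.linalg.solve(np.dot(X.T, X), np.dot(X.T, y))

# SVD-based least squares, as in np.linalg.lstsq
beta_ls = np.linalg.lstsq(X, y)[0]

print beta_ne
print beta_ls

Both fits reproduce y well; it is the individual coefficients on x1 and x2
that become arbitrary as the collinearity tightens, which is exactly the
standard-error blow-up the econometrics textbooks warn about.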
From Chris.Barker at noaa.gov  Mon Jun  8 16:34:39 2009
From: Chris.Barker at noaa.gov (Christopher Barker)
Date: Mon, 08 Jun 2009 13:34:39 -0700
Subject: [Numpy-discussion] Interactive Shell/Editor/Workspace(variables)View/Plots
In-Reply-To: <20E4B6FC-F2B1-4C6C-949B-D0589DD797C5@enthought.com>
References: <3d15ebce0906080929r781b024dr9061ea6cff9b1df5@mail.gmail.com> <49d6b3500906080935i7d013bc9q918dc2a7f27536b6@mail.gmail.com> <3d15ebce0906081011uf4e3f81ha45202647e0ba916@mail.gmail.com> <49d6b3500906081026x5c1ed581pdf170981919a5967@mail.gmail.com> <20E4B6FC-F2B1-4C6C-949B-D0589DD797C5@enthought.com>
Message-ID: <4A2D75DF.6090503@noaa.gov>

Travis Oliphant wrote:
> You might take a look at EPDLab as well. Thanks to Gael Varoquaux, it
> integrates IPython into an Envisage application and has a crude
> name-space browser

I was wondering when you guys would get around to making one of those.

Nice start -- the iPython shell is nice, though the editor needs a lot of
features. I wonder if you could integrate an existing wxPython editor:

Editra
Peppy
SPE
PyPE
...

And get a full-featured editor that way.

Winpdb would be nice, too....

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov

From robert.kern at gmail.com  Mon Jun  8 16:33:08 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Mon, 8 Jun 2009 15:33:08 -0500
Subject: [Numpy-discussion] matrix default to column vector?
In-Reply-To: <1cd32cbb0906081321ge1cb9a2q5974b43ac99651e5@mail.gmail.com>
References: <75c31b2a0905210610x1a321264r5b6f93d327ef2b36@mail.gmail.com> <23907204.post@talk.nabble.com> <23910425.post@talk.nabble.com> <3d375d730906071208y7e3437d7xeeee9e321d0f78dc@mail.gmail.com> <4A2D622D.9050008@american.edu> <3d375d730906081233r3221d2ffq807196a049cab0ce@mail.gmail.com> <9457e7c80906081309t610aeba6xfc6fa71996c0fda1@mail.gmail.com> <1cd32cbb0906081321ge1cb9a2q5974b43ac99651e5@mail.gmail.com>
Message-ID: <3d375d730906081333m3256b426sa31e770121e3cc08@mail.gmail.com>

On Mon, Jun 8, 2009 at 15:21, wrote:
> 2009/6/8 Stéfan van der Walt :
>> 2009/6/8 Robert Kern :
>>>> Remember, the example is a **teaching** example.
>>>
>>> I know. Honestly, I would prefer that teachers skip over the normal
>>> equations entirely and move directly to decomposition approaches. If
>>> you are going to make them implement least-squares from more basic
>>> tools, I think it's more enlightening as a student to start with the
>>> SVD than the normal equations.
>>
>> I agree, and I wish our curriculum followed that route. In linear
>> algebra, I also don't much like the way eigenvalues are taught, where
>> students have to solve characteristic polynomials by hand. When I
>> teach the subject again, I'll pay more attention to these books:
>>
>> Numerical linear algebra by Lloyd Trefethen
>> http://books.google.co.za/books?id=bj-Lu6zjWbEC
>>
>> (e.g. has SVD in Lecture 4)
>>
>> Applied Numerical Linear Algebra by James Demmel
>> http://books.google.co.za/books?id=lr8cFi-YWnIC
>>
>> (e.g. has perturbation theory on page 4)
>>
>> Regards
>> Stéfan
>
> Ok, I also have to give my 2 cents.
>
> Any basic econometrics textbook warns of multicollinearity. Since
> economists are mostly interested in the parameter estimates, the
> covariance matrix needs to have little multicollinearity; otherwise,
> the standard errors of the parameters will be huge.
> > If I use automatically pinv or lstsq, then, unless I look at the > condition number and singularities, I get estimates that look pretty > nice, even they have an "arbitrary" choice of the indeterminacy. > > So in economics, I never worried too much about the numerical > precision of the inverse, because, if the correlation matrix is close > to singular, the model is misspecified, or needs reparameterization or > the data is useless for the question. > > Compared to endogeneity bias for example, or homoscedasticy > assumptions and so on, the numerical problem is pretty small. > > This doesn't mean matrix decomposition methods are not useful for > numerical calculations and efficiency, but I don't think the numerical > problem deserves a lot of emphasis in a basic econometrics class. Actually, my point is a bit broader. Numerics aside, if you are going to bother peeking under the hood of least-squares at all, I think the student gets a better understanding of least-squares via one of the decomposition methods rather than the normal equations. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From robert.kern at gmail.com Mon Jun 8 16:34:26 2009 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 8 Jun 2009 15:34:26 -0500 Subject: [Numpy-discussion] Interactive Shell/Editor/Workspace(variables)View/Plots In-Reply-To: <4A2D75DF.6090503@noaa.gov> References: <3d15ebce0906080929r781b024dr9061ea6cff9b1df5@mail.gmail.com> <49d6b3500906080935i7d013bc9q918dc2a7f27536b6@mail.gmail.com> <3d15ebce0906081011uf4e3f81ha45202647e0ba916@mail.gmail.com> <49d6b3500906081026x5c1ed581pdf170981919a5967@mail.gmail.com> <20E4B6FC-F2B1-4C6C-949B-D0589DD797C5@enthought.com> <4A2D75DF.6090503@noaa.gov> Message-ID: <3d375d730906081334p66bf8403tcae5c1b9bdc863cd@mail.gmail.com> On Mon, Jun 8, 2009 at 15:34, Christopher Barker wrote: > Travis Oliphant wrote: >> You might take a look at EPDLab as well. ? Thanks to Gael Varoquaux, It >> integrates IPython into an Envisage application and has a crude >> name-space browser > > I was wondering when you guys would get around to making one of those. > > Nice start, the iPython shell is nice, though the editor needs a lot of > features -- I wonder if you could integrate an existing wxPython editor: > > Editra > Peppy > SPE > PyPE > ... > > And get full featured editor that way. That's what this part is for: https://svn.enthought.com/svn/enthought/EPDLab/trunk/enthought/epdlab/remote_editor/ -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco

From robert.kern at gmail.com  Mon Jun  8 16:35:54 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Mon, 8 Jun 2009 15:35:54 -0500
Subject: [Numpy-discussion] Interactive Shell/Editor/Workspace(variables)View/Plots
In-Reply-To: <3d375d730906081334p66bf8403tcae5c1b9bdc863cd@mail.gmail.com>
References: <3d15ebce0906080929r781b024dr9061ea6cff9b1df5@mail.gmail.com> <49d6b3500906080935i7d013bc9q918dc2a7f27536b6@mail.gmail.com> <3d15ebce0906081011uf4e3f81ha45202647e0ba916@mail.gmail.com> <49d6b3500906081026x5c1ed581pdf170981919a5967@mail.gmail.com> <20E4B6FC-F2B1-4C6C-949B-D0589DD797C5@enthought.com> <4A2D75DF.6090503@noaa.gov> <3d375d730906081334p66bf8403tcae5c1b9bdc863cd@mail.gmail.com>
Message-ID: <3d375d730906081335xdc7740el9c312ae6dc7fd205@mail.gmail.com>

On Mon, Jun 8, 2009 at 15:34, Robert Kern wrote:
> On Mon, Jun 8, 2009 at 15:34, Christopher Barker wrote:
>> Travis Oliphant wrote:
>>> You might take a look at EPDLab as well. Thanks to Gael Varoquaux, it
>>> integrates IPython into an Envisage application and has a crude
>>> name-space browser
>>
>> I was wondering when you guys would get around to making one of those.
>>
>> Nice start -- the iPython shell is nice, though the editor needs a lot of
>> features. I wonder if you could integrate an existing wxPython editor:
>>
>> Editra
>> Peppy
>> SPE
>> PyPE
>> ...
>>
>> And get a full-featured editor that way.
>
> That's what this part is for:
>
> https://svn.enthought.com/svn/enthought/EPDLab/trunk/enthought/epdlab/remote_editor/

More accurately, these:

https://svn.enthought.com/svn/enthought/EnvisagePlugins/trunk/enthought/plugins/remote_editor/editor_plugins/

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
-- Umberto Eco

From gael.varoquaux at normalesup.org  Mon Jun  8 17:15:52 2009
From: gael.varoquaux at normalesup.org (Gael Varoquaux)
Date: Mon, 8 Jun 2009 23:15:52 +0200
Subject: [Numpy-discussion] Interactive Shell/Editor/Workspace(variables)View/Plots
In-Reply-To: <20E4B6FC-F2B1-4C6C-949B-D0589DD797C5@enthought.com>
References: <3d15ebce0906080929r781b024dr9061ea6cff9b1df5@mail.gmail.com> <49d6b3500906080935i7d013bc9q918dc2a7f27536b6@mail.gmail.com> <3d15ebce0906081011uf4e3f81ha45202647e0ba916@mail.gmail.com> <49d6b3500906081026x5c1ed581pdf170981919a5967@mail.gmail.com> <20E4B6FC-F2B1-4C6C-949B-D0589DD797C5@enthought.com>
Message-ID: <20090608211552.GE16831@phare.normalesup.org>

On Mon, Jun 08, 2009 at 12:54:25PM -0500, Travis Oliphant wrote:
> You might take a look at EPDLab as well. Thanks to Gael Varoquaux, it
> integrates IPython into an Envisage application and has a crude name-space
> browser

And it integrates with editra to have an editor where you can select code
and run it in EPDLab, in addition to running the whole file.

And we claim that having the editor as a separate process is a feature:
first, we can use a good editor (actually any editor, provided you write a
plugin that sends the code to be executed to EPDLab via sockets); second,
if your execution environment crashes (and yes, this can happen, especially
if you are running custom C code bound into Python), you don't lose your
editor.

Getting things right with the IPython shell was a bastard, and there is
still work to be done (although the latest release of IPython introduces
continuation lines! Yay). On the other hand, you can add a lot of features
based on the existing code.
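As a toy sketch of the editor-in-one-process, shell-in-another handoff Gael
describes: the real EPDLab mechanism lives in enthought.plugins.remote_editor,
and everything below (host, port, protocol, function names) is made up for
illustration. The point is only that the editor survives anything the
execution side does:

# Illustrative only -- not the EPDLab protocol. Python 2 syntax,
# matching the rest of this thread.
import socket

HOST, PORT = "127.0.0.1", 9999

def serve_forever():
    """Execution side: run each received chunk of code in one namespace."""
    ns = {}
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind((HOST, PORT))
    srv.listen(1)
    while True:
        conn, _ = srv.accept()
        code = conn.recv(65536)
        try:
            exec code in ns           # run the selection in a persistent namespace
        except Exception, err:        # a crash here never takes the editor down
            print err
        conn.close()

def send_selection(code):
    """Editor side: ship the selected text to the running shell."""
    cli = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    cli.connect((HOST, PORT))
    cli.sendall(code)
    cli.close()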
And all the components are reusable components, which means that you can build your own application with them. I really cannot work on EPDLab anymore, I don't have time (thanks a lot to Enthought for financing my work on such a thrilling project). I do maintainance on the IPython wx frontend, because I am the person who knows the code best (unfortunately), althought Laurent Dufrechou is helping out. However, I strongly encourage people to contribute to EPDLab. It is open source, in an open SVN. You can get check in rights if you show that your contributions are of quality. Enthought is a company, and has its own agenda. It needs to sell products to consummers, so it might not be interesting in investing time where you might (althought Enthought has proven more than once that they can invest time on long-term projects, just because they believe they are good for the future of scientific computing in Python). On the other hand, if you are willing to devote time to add what you think lacks in EPDLab (whatever it might be), _you_ can make a difference. I believe in the future of EPDLab because it is based on very powerful components, like IPython, Traits, matplotlib, Wx, Mayavi, Chaco. I believe that choosing to work with a power stack like this has an upfront cost: you need to make sure everything fits together. Last year I sent micro-patches to matplotlib to make sure event-loops where correctly detected. I spent more than a month working in the IPython code base. The enthought guys fixed Traits bugs. This is costly and takes time. However, this can get you far, very far. All right, back to catching up with life :) Ga?l From gael.varoquaux at normalesup.org Mon Jun 8 17:17:07 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Mon, 8 Jun 2009 23:17:07 +0200 Subject: [Numpy-discussion] Interactive Shell/Editor/Workspace(variables)View/Plots In-Reply-To: <4A2D75DF.6090503@noaa.gov> References: <3d15ebce0906080929r781b024dr9061ea6cff9b1df5@mail.gmail.com> <49d6b3500906080935i7d013bc9q918dc2a7f27536b6@mail.gmail.com> <3d15ebce0906081011uf4e3f81ha45202647e0ba916@mail.gmail.com> <49d6b3500906081026x5c1ed581pdf170981919a5967@mail.gmail.com> <20E4B6FC-F2B1-4C6C-949B-D0589DD797C5@enthought.com> <4A2D75DF.6090503@noaa.gov> Message-ID: <20090608211707.GF16831@phare.normalesup.org> On Mon, Jun 08, 2009 at 01:34:39PM -0700, Christopher Barker wrote: > Travis Oliphant wrote: > > You might take a look at EPDLab as well. Thanks to Gael Varoquaux, It > > integrates IPython into an Envisage application and has a crude > > name-space browser > I was wondering when you guys would get around to making one of those. > Nice start, the iPython shell is nice, though the editor needs a lot of > features -- I wonder if you could integrate an existing wxPython editor: > Editra Click in the menu: 'new file in remote browser', or something like this. If you have editra installed, it will launch it, with a special plugin allowing you to execute selected code in EPDLab. 
;) Ga?l From dwf at cs.toronto.edu Mon Jun 8 17:19:11 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Mon, 8 Jun 2009 17:19:11 -0400 Subject: [Numpy-discussion] Interactive Shell/Editor/Workspace(variables)View/Plots In-Reply-To: <3d15ebce0906080958w2d375fc9y87094a33bdb3600f@mail.gmail.com> References: <3d15ebce0906080929r781b024dr9061ea6cff9b1df5@mail.gmail.com> <1cd32cbb0906080939q36edbaf9gb526ac2694f150c4@mail.gmail.com> <3d15ebce0906080958w2d375fc9y87094a33bdb3600f@mail.gmail.com> Message-ID: On 8-Jun-09, at 12:58 PM, Jonno wrote: > Thanks Josef, > > I shouldn't have included Matplotlib since Pydee does work well with > its plots. I had forgotten that. It really is just the Mayavi plots > (or scenes I guess) that don't play well. I don't know how exactly matplotlib integration issues are handled in Pydee, but I do know that it's all in Qt. You should be able to set the environment variable ETS_TOOLKIT='qt4' to make Mayavi use the (somewhat neglected but still functional, AFAIK) Qt backend. Wx and Qt event loops competing might be the problem. David From gael.varoquaux at normalesup.org Mon Jun 8 17:23:45 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Mon, 8 Jun 2009 23:23:45 +0200 Subject: [Numpy-discussion] Interactive Shell/Editor/Workspace(variables)View/Plots In-Reply-To: References: <3d15ebce0906080929r781b024dr9061ea6cff9b1df5@mail.gmail.com> <1cd32cbb0906080939q36edbaf9gb526ac2694f150c4@mail.gmail.com> <3d15ebce0906080958w2d375fc9y87094a33bdb3600f@mail.gmail.com> Message-ID: <20090608212345.GG16831@phare.normalesup.org> On Mon, Jun 08, 2009 at 05:19:11PM -0400, David Warde-Farley wrote: > On 8-Jun-09, at 12:58 PM, Jonno wrote: > > Thanks Josef, > > I shouldn't have included Matplotlib since Pydee does work well with > > its plots. I had forgotten that. It really is just the Mayavi plots > > (or scenes I guess) that don't play well. > I don't know how exactly matplotlib integration issues are handled in > Pydee, but I do know that it's all in Qt. > You should be able to set the environment variable ETS_TOOLKIT='qt4' > to make Mayavi use the (somewhat neglected but still functional, > AFAIK) Qt backend. Wx and Qt event loops competing might be the problem. Correct. And as you point out the Qt backend of Mayavi is less functionnal, because (due to licensing reason) there is less economic pressure on making it work. Ga?l From d_l_goldsmith at yahoo.com Mon Jun 8 17:44:12 2009 From: d_l_goldsmith at yahoo.com (d_l_goldsmith at yahoo.com) Date: Mon, 8 Jun 2009 14:44:12 -0700 (PDT) Subject: [Numpy-discussion] Definitions of pv, fv, nper, pmt, and rate Message-ID: <295686.90587.qm@web52108.mail.re2.yahoo.com> Hi, folks.? Unable to find a printed reference for the definition we use to compute the functions in the Subject line of this email, I posted a couple queries for help in this regard in the Discussion for fv (http://docs.scipy.org/numpy/docs/numpy.lib.financial.fv/#discussion-sec).? josef Pktd's reply (thanks!) just makes me even more doubtful that we're using the definition that most users from the financial community would be expecting.? At this point, I have to say, I'm very concerned that our implementation for these is "wrong" (or at least inconsistent with what's used in financial circles); if you know of a reference - less ephemeral than a solely electronic document - defining these functions as we've implemented them, please share. Thanks! 
David Goldsmith

PS: Some of the financial functions' help doc says they're unimplemented - are there plans to implement them, and if not, why do we have help doc for them?

From Chris.Barker at noaa.gov Mon Jun 8 17:47:49 2009
From: Chris.Barker at noaa.gov (Christopher Barker)
Date: Mon, 08 Jun 2009 14:47:49 -0700
Subject: [Numpy-discussion] Interactive Shell/Editor/Workspace(variables)View/Plots
In-Reply-To: <20090608211707.GF16831@phare.normalesup.org>
References: <3d15ebce0906080929r781b024dr9061ea6cff9b1df5@mail.gmail.com> <49d6b3500906080935i7d013bc9q918dc2a7f27536b6@mail.gmail.com> <3d15ebce0906081011uf4e3f81ha45202647e0ba916@mail.gmail.com> <49d6b3500906081026x5c1ed581pdf170981919a5967@mail.gmail.com> <20E4B6FC-F2B1-4C6C-949B-D0589DD797C5@enthought.com> <4A2D75DF.6090503@noaa.gov> <20090608211707.GF16831@phare.normalesup.org>
Message-ID: <4A2D8705.4050208@noaa.gov>

Gael Varoquaux wrote:
> Click in the menu: 'new file in remote browser', or something like this.
> If you have editra installed, it will launch it, with a special plugin
> allowing you to execute selected code in EPDLab.

very cool, thanks!

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception

Chris.Barker at noaa.gov

From juanjo.gomeznavarro at gmail.com Mon Jun 8 17:52:29 2009
From: juanjo.gomeznavarro at gmail.com (Juanjo Gomez Navarro)
Date: Mon, 8 Jun 2009 23:52:29 +0200
Subject: [Numpy-discussion] How to remove fortran-like loops with numpy?
In-Reply-To: <18571cd90906061221q7e31c477pc0c27ee700286ed8@mail.gmail.com>
References: <18571cd90906061221q7e31c477pc0c27ee700286ed8@mail.gmail.com>
Message-ID: <18571cd90906081452g58791bb0ie106924de33bc914@mail.gmail.com>

Hi all,

I'm new to numpy. Actually, I'm new to Python. In order to learn a bit, I want to create a program to plot the Mandelbrot set. The program is quite simple, and I have already written it. The problem is that I come from Fortran, so I am used to thinking in "for" loops. I know that it is not the best way to use Python, and in fact the performance of the program is more than poor.

Here is the program:

> #!/usr/bin/python
>
> import numpy as np
> import matplotlib.pyplot as plt
>
> # Some parameters
> Xmin = -1.5
> Xmax = 0.5
> Ymin = -1
> Ymax = 1
>
> Ds = 0.01
>
> # Initialization of variables
> X = np.arange(Xmin, Xmax, Ds)
> Y = np.arange(Ymax, Ymin, -Ds)
>
> N = np.zeros((X.shape[0], Y.shape[0]), 'f')
>
> ############ Here is the inefficient calculation ############
> for i in range(X.shape[0]):
>     for j in range(Y.shape[0]):
>         z = complex(0.0, 0.0)
>         c = complex(X[i], Y[j])
>         while N[i, j] < 30 and abs(z) < 2:
>             N[i, j] += 1
>             z = z**2 + c
>         if N[i, j] == 29:
>             N[i, j] = 0
> ##############################################################
>
> # And now, just for plotting...
> N = N.transpose()
> fig = plt.figure()
> plt.imshow(N, cmap=plt.cm.Blues)
> plt.title('Mandelbrot set')
> plt.xticks([]); plt.yticks([])
> plt.show()
> fig.savefig('test.png')

As you can see, it is very simple, but it takes several seconds just to create a 200x200 plot. Fortran takes the same time to create a 2000x2000 plot, around 100 times faster... So the question is, do you know how to program this in a Python-like fashion in order to seriously improve the performance?

Thanks in advance
--
Juan José
Gómez Navarro
Edificio CIOyN, Campus de Espinardo, 30100
Departamento de Física
Universidad de Murcia
Tfno. (+34) 968 398552

Email: juanjo.gomeznavarro at gmail.com
Web: http://ciclon.inf.um.es/Inicio.html
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From d_l_goldsmith at yahoo.com Mon Jun 8 17:55:02 2009
From: d_l_goldsmith at yahoo.com (David Goldsmith)
Date: Mon, 8 Jun 2009 14:55:02 -0700 (PDT)
Subject: [Numpy-discussion] Summer NumPy Doc Marathon (Reply-to: scipy-dev@scipy.org)
Message-ID: <110298.95028.qm@web52107.mail.re2.yahoo.com>

Dear SciPy Community Members:

Hi! My name is David Goldsmith. I've been hired for the summer by Joe Harrington to further progress on NumPy documentation and ultimately, pending funding, SciPy documentation. Joe and I are reviving last summer's enthusiasm in the community for this mission and enlisting as many of you as possible in the effort. On that note, please peruse the NumPy Doc Wiki (http://docs.scipy.org/numpy/Front Page/) and, in particular, the master list of functions/objects ("items") needing work (http://docs.scipy.org/numpy/Milestones/). Our goal is to have every item to the ready-for-first-review stage (or better) by August 18 (i.e., the start of SciPyCon09). To accomplish this, we're forming teams to attack each doc category on the Milestones page. From the Milestones page:

"To speed things up, get more uniformity in the docs, and add a social element, we're attacking these categories as teams. A team lead takes responsibility for getting a category to "Needs review" within one month [we expect that some categories will require less time - please furnish your most "optimistically realistic" deadline when "claiming" a category], but no later than 18 August 2009. As leader, you commit to working with anyone who signs up in your category, and vice versa. The scipy-dev mailing list is a great place to recruit helpers.

"Major doc contributors will be listed in NumPy's contributors file, THANKS.txt. Anyone writing more than 1000 words will get a T-shirt (while supplies last, etc.). Teams that reach their goals in time will get special mention in THANKS.txt.

"Of course, you don't have to join a team. If you'd like to work on your own, please choose docstrings from an unclaimed category, and put your name after docstrings you are editing in the list below. If someone later claims that category, please coordinate with them or finish up your current docstrings and move to another category."

Please note that, to edit anything on the Wiki (including the doc itself), you'll need "edit rights" - how you get these is Item 5 under "Before you start" on the "Front Page," but for your convenience, I'll quote that here:

"Register a username on [docs.scipy.org]. Send an e-mail with your username to the scipy-dev mailing list (requires subscribing to the mailing list first, [which can be done at http://mail.scipy.org/mailman/listinfo/scipy-dev]), so that we can give you edit rights. If you are not subscribed to the mailing-list, you can also send an email to gael dot varoquaux at normalesup dot org, but this will take longer [and you'll want to subscribe to scipy-dev anyway, because that's the place to post questions and comments about this whole doc development project]."

Also, I'll be holding a weekly Skype (www.skype.com) telecon - Wednesdays at 4:30pm US Eastern Daylight Time - to review progress and discuss any roadblocks we may have encountered (or anticipate encountering).
If you'd like to participate and haven't already downloaded and installed Skype and registered a Skype ID, you should do those things; then, you'll be able to join in simply by "Skyping" me (Skype ID: d.l.goldsmith) and I'll add you to the call.

So, thanks for your time reading this, and please make time this summer to help us meet (or beat) the goal.

Sincerely,
David Goldsmith, Technical Editor
Olympia, WA

From d_l_goldsmith at yahoo.com Mon Jun 8 18:04:23 2009
From: d_l_goldsmith at yahoo.com (David Goldsmith)
Date: Mon, 8 Jun 2009 15:04:23 -0700 (PDT)
Subject: [Numpy-discussion] How to remove fortran-like loops with numpy?
Message-ID: <75173.19938.qm@web52104.mail.re2.yahoo.com>

I look forward to an instructive reply: the "Pythonic" way to do it would be to take advantage of the facts that Numpy is "pre-vectorized" and uses broadcasting, but so far I haven't been able to figure out (though I haven't yet really buckled down and tried real hard) how to broadcast a conditionally-terminated iteration where the number of iterations will vary among the array elements. Hopefully someone else already has. :-)

DG

--- On Mon, 6/8/09, Juanjo Gomez Navarro wrote:

> From: Juanjo Gomez Navarro
> Subject: [Numpy-discussion] How to remove fortran-like loops with numpy?
> To: numpy-discussion at scipy.org
> Date: Monday, June 8, 2009, 2:52 PM
>
> Hi all,
>
> I'm new to numpy. Actually, I'm new to Python. In
> order to learn a bit, I want to create a program to plot the
> Mandelbrot set. The program is quite simple, and I have
> already written it. The problem is that I come from
> Fortran, so I am used to thinking in "for" loops. I know
> that it is not the best way to use Python and in fact the
> performance of the program is more than poor.
>
> Here is the program:
>
> #!/usr/bin/python
>
> import numpy as np
> import matplotlib.pyplot as plt
>
> # Some parameters
> Xmin = -1.5
> Xmax = 0.5
> Ymin = -1
> Ymax = 1
>
> Ds = 0.01
>
> # Initialization of variables
> X = np.arange(Xmin, Xmax, Ds)
> Y = np.arange(Ymax, Ymin, -Ds)
>
> N = np.zeros((X.shape[0], Y.shape[0]), 'f')
>
> ############ Here is the inefficient calculation ############
> for i in range(X.shape[0]):
>     for j in range(Y.shape[0]):
>         z = complex(0.0, 0.0)
>         c = complex(X[i], Y[j])
>         while N[i, j] < 30 and abs(z) < 2:
>             N[i, j] += 1
>             z = z**2 + c
>         if N[i, j] == 29:
>             N[i, j] = 0
> ##############################################################
>
> # And now, just for plotting...
> N = N.transpose()
> fig = plt.figure()
> plt.imshow(N, cmap=plt.cm.Blues)
> plt.title('Mandelbrot set')
> plt.xticks([]); plt.yticks([])
> plt.show()
> fig.savefig('test.png')
>
> As you can see, it is very simple, but it takes several
> seconds just to create a 200x200 plot. Fortran takes
> the same time to create a 2000x2000 plot, around 100 times
> faster... So the question is, do you know how to program
> this in a Python-like fashion in order to seriously
> improve the performance?
>
> Thanks in advance
> --
> Juan José Gómez Navarro
>
> Edificio CIOyN, Campus de Espinardo, 30100
> Departamento de Física
> Universidad de Murcia
> Tfno.
(+34) 968 398552 > > Email: juanjo.gomeznavarro at gmail.com > > Web: http://ciclon.inf.um.es/Inicio.html > > > -----Inline Attachment Follows----- > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From gokhansever at gmail.com Mon Jun 8 18:14:27 2009 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_SEVER?=) Date: Mon, 8 Jun 2009 17:14:27 -0500 Subject: [Numpy-discussion] Interactive Shell/Editor/Workspace(variables)View/Plots In-Reply-To: <4A2D8705.4050208@noaa.gov> References: <3d15ebce0906080929r781b024dr9061ea6cff9b1df5@mail.gmail.com> <49d6b3500906080935i7d013bc9q918dc2a7f27536b6@mail.gmail.com> <3d15ebce0906081011uf4e3f81ha45202647e0ba916@mail.gmail.com> <49d6b3500906081026x5c1ed581pdf170981919a5967@mail.gmail.com> <20E4B6FC-F2B1-4C6C-949B-D0589DD797C5@enthought.com> <4A2D75DF.6090503@noaa.gov> <20090608211707.GF16831@phare.normalesup.org> <4A2D8705.4050208@noaa.gov> Message-ID: <49d6b3500906081514i4cdbf392q9fbb9e6e999ca6b6@mail.gmail.com> On Mon, Jun 8, 2009 at 4:47 PM, Christopher Barker wrote: > Gael Varoquaux wrote: > > Click in the menu: 'new file in remote browser', or something like this. > > If you have editra installed, it will launch it, with a special plugin > > allowing you to execute selected code in EPDLab. > > very cool, thanks! > > -Chris > > IPython's edit command works in a similar fashion, too. edit test.py open an existing file or creates one, and right after you close the file IPy executes the content. These are from ipy_user_conf.py file: # Configure your favourite editor? # Good idea e.g. for %edit os.path.isfile import ipy_editors # Choose one of these: ipy_editors.scite() #ipy_editors.scite('c:/opt/scite/scite.exe') #ipy_editors.komodo() #ipy_editors.idle() # ... or many others, try 'ipy_editors??' after import to see them G?khan -------------- next part -------------- An HTML attachment was scrubbed... URL: From gael.varoquaux at normalesup.org Mon Jun 8 18:16:47 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Tue, 9 Jun 2009 00:16:47 +0200 Subject: [Numpy-discussion] Interactive Shell/Editor/Workspace(variables)View/Plots In-Reply-To: <49d6b3500906081514i4cdbf392q9fbb9e6e999ca6b6@mail.gmail.com> References: <3d15ebce0906080929r781b024dr9061ea6cff9b1df5@mail.gmail.com> <49d6b3500906080935i7d013bc9q918dc2a7f27536b6@mail.gmail.com> <3d15ebce0906081011uf4e3f81ha45202647e0ba916@mail.gmail.com> <49d6b3500906081026x5c1ed581pdf170981919a5967@mail.gmail.com> <20E4B6FC-F2B1-4C6C-949B-D0589DD797C5@enthought.com> <4A2D75DF.6090503@noaa.gov> <20090608211707.GF16831@phare.normalesup.org> <4A2D8705.4050208@noaa.gov> <49d6b3500906081514i4cdbf392q9fbb9e6e999ca6b6@mail.gmail.com> Message-ID: <20090608221647.GH28350@phare.normalesup.org> On Mon, Jun 08, 2009 at 05:14:27PM -0500, G?khan SEVER wrote: > IPython's edit command works in a similar fashion, too. > edit test.py The cool thing is that you can select text in the editor and execute in EPDLab. On the other hand, I know that IPython has hooks to grow this in the code base, and I would like this to grow also directly in IPython. Hell, I use vim. How cool would it be to select (using visual mode) snippets in vim, and execute them in a running Ipython session. Ga?l From robert.kern at gmail.com Mon Jun 8 18:25:37 2009 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 8 Jun 2009 17:25:37 -0500 Subject: [Numpy-discussion] How to remove fortran-like loops with numpy? 
In-Reply-To: <75173.19938.qm@web52104.mail.re2.yahoo.com>
References: <75173.19938.qm@web52104.mail.re2.yahoo.com>
Message-ID: <3d375d730906081525v41b34e0dt4d32ea58cc041020@mail.gmail.com>

On Mon, Jun 8, 2009 at 17:04, David Goldsmith wrote:
>
> I look forward to an instructive reply: the "Pythonic" way to do it would be to take advantage of the facts that Numpy is "pre-vectorized" and uses broadcasting, but so far I haven't been able to figure out (though I haven't yet really buckled down and tried real hard) how to broadcast a conditionally-terminated iteration where the number of iterations will vary among the array elements. Hopefully someone else already has. :-)

You can't, really. What you can do is just keep iterating with the
whole data set and ignore the parts that have already converged. Here
is an example:

import numpy as np

z = np.zeros((201,201), dtype=complex)
Y, X = np.mgrid[1:-1:-201j, -1.5:0.5:201j]
c = np.empty_like(z)
c.real = X
c.imag = Y
N = np.zeros(z.shape, dtype=int)

while ((N<30) | (abs(z)<2)).all():
    N += abs(z) < 2
    z = z ** 2 + c

N[N>=30] = 0

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
 -- Umberto Eco

From peridot.faceted at gmail.com Mon Jun 8 18:38:46 2009
From: peridot.faceted at gmail.com (Anne Archibald)
Date: Mon, 8 Jun 2009 18:38:46 -0400
Subject: [Numpy-discussion] How to remove fortran-like loops with numpy?
In-Reply-To: <3d375d730906081525v41b34e0dt4d32ea58cc041020@mail.gmail.com>
References: <75173.19938.qm@web52104.mail.re2.yahoo.com> <3d375d730906081525v41b34e0dt4d32ea58cc041020@mail.gmail.com>
Message-ID: 

2009/6/8 Robert Kern :
> On Mon, Jun 8, 2009 at 17:04, David Goldsmith wrote:
>>
>> I look forward to an instructive reply: the "Pythonic" way to do it would be to take advantage of the facts that Numpy is "pre-vectorized" and uses broadcasting, but so far I haven't been able to figure out (though I haven't yet really buckled down and tried real hard) how to broadcast a conditionally-terminated iteration where the number of iterations will vary among the array elements. Hopefully someone else already has. :-)
>
> You can't, really. What you can do is just keep iterating with the
> whole data set and ignore the parts that have already converged. Here
> is an example:

Well, yes and no. This is only worth doing if the number of problem
points that require many iterations is small - not the case here
without some sort of periodicity detection - but you can keep an array
of not-yet-converged points, which you iterate. When some converge,
you store them in a results array (with fancy indexing) and remove
them from your still-converging array.

It's also worth remembering that the overhead of for loops is large
but not enormous, so you can often remove only the inner for loop, in
this case perhaps iterating over the image a line at a time.

Anne

>
> z = np.zeros((201,201), dtype=complex)
> Y, X = np.mgrid[1:-1:-201j, -1.5:0.5:201j]
> c = np.empty_like(z)
> c.real = X
> c.imag = Y
> N = np.zeros(z.shape, dtype=int)
>
> while ((N<30) | (abs(z)<2)).all():
>     N += abs(z) < 2
>     z = z ** 2 + c
>
> N[N>=30] = 0
>
> --
> Robert Kern
>
> "I have come to believe that the whole world is an enigma, a harmless
> enigma that is made terrible by our own mad attempt to interpret it as
> though it had an underlying truth."
> ?-- Umberto Eco > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From d_l_goldsmith at yahoo.com Mon Jun 8 18:40:25 2009 From: d_l_goldsmith at yahoo.com (David Goldsmith) Date: Mon, 8 Jun 2009 15:40:25 -0700 (PDT) Subject: [Numpy-discussion] How to remove fortran-like loops with numpy? Message-ID: <506972.12301.qm@web52110.mail.re2.yahoo.com> Thanks, Robert! DG --- On Mon, 6/8/09, Robert Kern wrote: > I haven't been able to figure out (though I haven't yet > really buckled down and tried real hard) how to broadcast a > conditionally-terminated iteration where the number of > iterations will vary among the array elements. ?Hopefully > someone else already has. :-) > > You can't, really. What you can do is just keep iterating > with the > whole data set and ignore the parts that have already > converged. Here > is an example: > > z = np.zeros((201,201), dtype=complex) > Y, X = np.mgrid[1:-1:-201j, -1.5:0.5:201j] > c = np.empty_like(z) > c.real = X > c.imag = Y > N = np.zeros(z.shape, dtype=int) > > while ((N<30) | (abs(z)<2)).all(): > ? ? N += abs(z) < 2 > ? ? z = z ** 2 + c > > N[N>=30] = 0 > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, > a harmless > enigma that is made terrible by our own mad attempt to > interpret it as > though it had an underlying truth." > ? -- Umberto Eco > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From d_l_goldsmith at yahoo.com Mon Jun 8 19:01:41 2009 From: d_l_goldsmith at yahoo.com (David Goldsmith) Date: Mon, 8 Jun 2009 16:01:41 -0700 (PDT) Subject: [Numpy-discussion] How to remove fortran-like loops with numpy? Message-ID: <243788.65252.qm@web52106.mail.re2.yahoo.com> --- On Mon, 6/8/09, Anne Archibald wrote: > > You can't, really. What you can do is just keep > iterating with the > > whole data set and ignore the parts that have already > converged. Here > > is an example: > > Well, yes and no. This is only worth doing if the number of > problem > points that require many iterations is small - not the case > here > without some sort of periodicity detection - but you can > keep an array > of not-yet-converged points, which you iterate. When some > converge, > you store them in a results array (with fancy indexing) and > remove > them from your still-converging array. Thanks, Anne. This is the way I had anticipated implementing it myself eventually, but the "fancy-indexing" requirement has caused me to keep postponing it, waiting for some time when I'll have a hefty block of time to figure it out and then, inevitably, debug it. :( Also, the transfer of points from un-converged to converged - when that's a large number, might that not be a large time-suck compared to Rob's method? (Too bad this wasn't posted a couple weeks ago: I'd've had time then to implement your method and "race" it against Rob's, but alas, now I have this doc editing job...but that's a good thing, as my fractals are not yet making me any real money.) :-) > It's also worth remembering that the overhead of for loops > is large > but not enormous, so you can often remove only the inner > for loop, in > this case perhaps iterating over the image a line at a > time. Yes, definitely well worth remembering - thanks for reminding us! 
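(For concreteness, here is a minimal sketch of the scheme Anne describes: keep an array of flat indices for the not-yet-diverged points, and shrink it with fancy indexing as points escape. The function name and the escape-count bookkeeping are illustrative choices, not code from this thread.)

import numpy as np

def mandelbrot_active(c, maxiter=30):
    # c: complex array of grid points. Returns the iteration at which
    # each point escaped (|z| >= 2), or 0 if it never escaped.
    flat_c = c.ravel()
    counts = np.zeros(flat_c.size, dtype=int)
    idx = np.arange(flat_c.size)     # flat indices of still-active points
    z = np.zeros_like(flat_c)        # one iterate per active point
    for n in range(1, maxiter + 1):
        z = z * z + flat_c[idx]
        escaped = np.abs(z) >= 2.0
        counts[idx[escaped]] = n     # store results via fancy indexing
        idx = idx[~escaped]          # drop the finished points...
        z = z[~escaped]              # ...and their iterates
        if idx.size == 0:
            break
    return counts.reshape(c.shape)

# Usage on the grid from Robert's example:
# Y, X = np.mgrid[1:-1:-201j, -1.5:0.5:201j]
# N = mandelbrot_active(X + 1j * Y)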
Thanks again, DG > > Anne > From robert.kern at gmail.com Mon Jun 8 19:07:21 2009 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 8 Jun 2009 18:07:21 -0500 Subject: [Numpy-discussion] How to remove fortran-like loops with numpy? In-Reply-To: <243788.65252.qm@web52106.mail.re2.yahoo.com> References: <243788.65252.qm@web52106.mail.re2.yahoo.com> Message-ID: <3d375d730906081607k33105b14s4ef2feeed3433d36@mail.gmail.com> On Mon, Jun 8, 2009 at 18:01, David Goldsmith wrote: > > --- On Mon, 6/8/09, Anne Archibald wrote: > >> > You can't, really. What you can do is just keep >> iterating with the >> > whole data set and ignore the parts that have already >> converged. Here >> > is an example: >> >> Well, yes and no. This is only worth doing if the number of >> problem >> points that require many iterations is small - not the case >> here >> without some sort of periodicity detection - but you can >> keep an array >> of not-yet-converged points, which you iterate. When some >> converge, >> you store them in a results array (with fancy indexing) and >> remove >> them from your still-converging array. > > Thanks, Anne. ?This is the way I had anticipated implementing it myself eventually, but the "fancy-indexing" requirement has caused me to keep postponing it, waiting for some time when I'll have a hefty block of time to figure it out and then, inevitably, debug it. :( ?Also, the transfer of points from un-converged to converged - when that's a large number, might that not be a large time-suck compared to Rob's method? ?(Too bad this wasn't posted a couple weeks ago: I'd've had time then to implement your method and "race" it against Rob's, but alas, now I have this doc editing job...but that's a good thing, as my fractals are not yet making me any real money.) :-) The advantage of my implementation is that I didn't have to think too hard about it. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From jsseabold at gmail.com Mon Jun 8 21:01:25 2009 From: jsseabold at gmail.com (Skipper Seabold) Date: Mon, 8 Jun 2009 21:01:25 -0400 Subject: [Numpy-discussion] Definitions of pv, fv, nper, pmt, and rate In-Reply-To: <295686.90587.qm@web52108.mail.re2.yahoo.com> References: <295686.90587.qm@web52108.mail.re2.yahoo.com> Message-ID: On Mon, Jun 8, 2009 at 5:44 PM, wrote: > > Hi, folks.? Unable to find a printed reference for the definition we use to compute the functions in the Subject line of this email, I posted a couple queries for help in this regard in the Discussion for fv > (http://docs.scipy.org/numpy/docs/numpy.lib.financial.fv/#discussion-sec).? josef Pktd's reply (thanks!) just makes me even more doubtful that we're using the definition that most users from the financial community would be expecting.? At this point, I have to say, I'm very concerned that our implementation for these is "wrong" (or at least inconsistent with what's used in financial circles); if you know of a reference - less ephemeral than a solely electronic document - defining these functions as we've implemented them, please share. ?Thanks! > Just quickly comparing In [3]: np.lib.financial.fv(.1,10,-100,-350) Out[3]: 2501.5523211350032 With OO Calc =fv(.1,10,-100,-350) =2501.55 Both return the value of 350*1.1**10 + 100*1.1**9 + ... + 100*1.1 which is what I would expect it to do. 
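(A quick editorial check of that arithmetic, as a sketch; it assumes the end-of-period payment convention that numpy's fv and the spreadsheets share. Note the written-out sum needs one more 100 for the final payment, a correction that appears downthread.)

rate, n = 0.10, 10

# Closed form: fv = -(pv*(1+rate)**n + pmt*((1+rate)**n - 1)/rate)
fv_closed = -(-350.0 * (1 + rate) ** n + -100.0 * ((1 + rate) ** n - 1) / rate)

# The same value as explicitly compounded cash flows, final payment included:
fv_sum = 350 * 1.1 ** 10 + sum(100 * 1.1 ** k for k in range(10))

print(fv_closed, fv_sum)   # both ~2501.5523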
I didn't look too closely at the docs though, so they might be a bit confusing and need some cleaning up. There was a recent discussion about numpy.financial in this thread . The way that it was left is that they are there as teaching tools to mimic *some* of the functionality of spreadsheets/ financials calculators. I'm currently working on implementing some other common spreadsheet/ financial calculator on my own for possible inclusion somewhere later, as I think was the original vision . Skipper From cycomanic at gmail.com Mon Jun 8 21:10:24 2009 From: cycomanic at gmail.com (Jochen Schroeder) Date: Tue, 9 Jun 2009 13:10:24 +1200 Subject: [Numpy-discussion] Interactive Shell/Editor/Workspace(variables)View/Plots In-Reply-To: <20090608221647.GH28350@phare.normalesup.org> References: <3d15ebce0906080929r781b024dr9061ea6cff9b1df5@mail.gmail.com> <49d6b3500906080935i7d013bc9q918dc2a7f27536b6@mail.gmail.com> <3d15ebce0906081011uf4e3f81ha45202647e0ba916@mail.gmail.com> <49d6b3500906081026x5c1ed581pdf170981919a5967@mail.gmail.com> <20E4B6FC-F2B1-4C6C-949B-D0589DD797C5@enthought.com> <4A2D75DF.6090503@noaa.gov> <20090608211707.GF16831@phare.normalesup.org> <4A2D8705.4050208@noaa.gov> <49d6b3500906081514i4cdbf392q9fbb9e6e999ca6b6@mail.gmail.com> <20090608221647.GH28350@phare.normalesup.org> Message-ID: <20090609011023.GA13728@jochen.schroeder.phy.auckland.ac.nz> On 09/06/09 00:16, Gael Varoquaux wrote: > On Mon, Jun 08, 2009 at 05:14:27PM -0500, G?khan SEVER wrote: > > IPython's edit command works in a similar fashion, too. > > > edit test.py > > The cool thing is that you can select text in the editor and execute in > EPDLab. On the other hand, I know that IPython has hooks to grow this in > the code base, and I would like this to grow also directly in IPython. > > Hell, I use vim. How cool would it be to select (using visual mode) > snippets in vim, and execute them in a running Ipython session. I think there's a vim script for executing the marked code in python. If IPython has already hooks for executing code in an existing session, it might be possible to adapt this script. Also I encourage everyone to have a look at pida: http://pida.co.uk/ which is a python IDE using an embedded vim (although you can embed other editors as well I think). The website looks like development has been stale, but if you look at svn there've been commits lately. Cheers Jochen > > Ga?l > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From stefan at sun.ac.za Mon Jun 8 21:39:03 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Tue, 9 Jun 2009 03:39:03 +0200 Subject: [Numpy-discussion] How to remove fortran-like loops with numpy? In-Reply-To: <18571cd90906081452g58791bb0ie106924de33bc914@mail.gmail.com> References: <18571cd90906061221q7e31c477pc0c27ee700286ed8@mail.gmail.com> <18571cd90906081452g58791bb0ie106924de33bc914@mail.gmail.com> Message-ID: <9457e7c80906081839p12fb39v8efc12c554162f63@mail.gmail.com> Hi Juan 2009/6/8 Juanjo Gomez Navarro : > I'm new in numpy. Actually, I'm new in Python. In order to learn a bit, I > want to create a program to plot the Mandelbrot set. This program is quite > simple, and I have already programmed it. The problem is that I come from > fortran, so I use to think in "for" loops. I know that it is not the best > way to use Python and in fact the performance of the program is more than > poor. 
> > Here is the program: > >> #!/usr/bin/python >> >> import numpy as np >> import matplotlib.pyplot as plt >> >> # Some parameters >> Xmin=-1.5 >> Xmax=0.5 >> Ymin=-1 >> Ymax=1 >> >> Ds = 0.01 >> >> # Initialization of varibles >> X = np.arange(Xmin,Xmax,Ds) >> Y = np.arange(Ymax,Ymin,-Ds) >> >> N = np.zeros((X.shape[0],Y.shape[0]),'f') >> >> ############## Here are inefficient the calculations ################ >> for i in range(X.shape[0]): >> ? for j in range(Y.shape[0]): >> ??? z= complex(0.0, 0.0) >> ??? c = complex(X[i], Y[j]) >> ??? while N[i, j] < 30 and abs(z) < 2: >> ????? N[i, j] += 1 >> ????? z = z**2 + c >> ??? if N[i, j] == 29: >> ????? N[i, j]=0 >> #################################################### >> >> # And now, just for ploting... >> N = N.transpose() >> fig = plt.figure() >> plt.imshow(N,cmap=plt.cm.Blues) >> plt.title('Mandelbrot set') >> plt.xticks([]); plt.yticks([]) >> plt.show() >> fig.savefig('test.png') > > > As you can see, it is very simple, but it takes several seconds running just > to create a 200x200 plot. Fortran takes same time to create a 2000x2000 > plot, around 100 times faster... So the question is, do you know how to > programme this in a Python-like fashion in order to improve seriously the > performance? Here is another version, similar to Robert's, that I wrote up for the documentation project last year: http://mentat.za.net/numpy/intro/intro.html We never used it, but I still like the pretty pictures :-) Cheers St?fan From stefan at sun.ac.za Mon Jun 8 21:42:22 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Tue, 9 Jun 2009 03:42:22 +0200 Subject: [Numpy-discussion] From CorePy: New ExtBuffer object Message-ID: <9457e7c80906081842k5579e01ev424e8b9d570deaac@mail.gmail.com> Hi, Just a heads-up on something they're talking about over at CorePy. Regards St?fan ---------- Forwarded message ---------- From: Andrew Friedley Date: 2009/6/8 Subject: [Corepy-devel] New ExtBuffer object To: CorePy Development I wrote a new buffer object today, called ExtBuffer, that can be used with libraries/objects that support the Python 2.6 buffer interface (e.g. NumPy). ?This brings page-aligned memory (and huge-page) support to anything that can use a buffer object (eg NumPy arrays). ?ExtBuffer can also be initialized using a pointer to an existing memory region. This allows you, for example, to set up a NumPy array spanning a Cell SPU's memory mapped local store, accessing LS like any other NumPy array. The ExtBuffer is included as part of the 'corepy.lib.extarray' module, and can be used like this: import corepy.lib.extarray as extarray import numpy buf = extarray.extbuffer(4096, huge = True) array = numpy.frombuffer(buf, dtype=numpy.int32) I wrote a some documentation here: http://corepy.org/wiki/index.php?title=Extended_Array If anyone has any questions, thoughts, ideas, bugs, etc, please let me know! Andrew _______________________________________________ Corepy-devel mailing list Corepy-devel at osl.iu.edu http://www.osl.iu.edu/mailman/listinfo.cgi/corepy-devel From d_l_goldsmith at yahoo.com Mon Jun 8 22:17:30 2009 From: d_l_goldsmith at yahoo.com (David Goldsmith) Date: Mon, 8 Jun 2009 19:17:30 -0700 (PDT) Subject: [Numpy-discussion] How to remove fortran-like loops with numpy? Message-ID: <500728.93787.qm@web52103.mail.re2.yahoo.com> --- On Mon, 6/8/09, Robert Kern wrote: > Goldsmith > wrote: > > > The advantage of my implementation is that I didn't have to > think too > hard about it. > -- > Robert Kern > Agreed. 
:-) DG From d_l_goldsmith at yahoo.com Mon Jun 8 23:18:04 2009 From: d_l_goldsmith at yahoo.com (David Goldsmith) Date: Mon, 8 Jun 2009 20:18:04 -0700 (PDT) Subject: [Numpy-discussion] Definitions of pv, fv, nper, pmt, and rate Message-ID: <419981.813.qm@web52110.mail.re2.yahoo.com> --- On Mon, 6/8/09, Skipper Seabold wrote: > There was a recent discussion about numpy.financial in this > thread > . > > Skipper Thanks, Skipper. Having now read that thread (but not the arguments, provided elsewhere, for the existence of numpy.financial in the first place), and considering that the only references mentioned there are also electronic ones (which, for the purpose of referencing sources in the function docs, I believe we're wanting to shun as much as possible), I formally "move" that numpy.financial (or at least that subset of it consisting of functions which are commonly subject to multiple definitions) be moved out of numpy. (Where _to_ exactly, I cannot say.) DG From jsseabold at gmail.com Mon Jun 8 23:24:40 2009 From: jsseabold at gmail.com (Skipper Seabold) Date: Mon, 8 Jun 2009 23:24:40 -0400 Subject: [Numpy-discussion] Definitions of pv, fv, nper, pmt, and rate In-Reply-To: <419981.813.qm@web52110.mail.re2.yahoo.com> References: <419981.813.qm@web52110.mail.re2.yahoo.com> Message-ID: On Mon, Jun 8, 2009 at 9:01 PM, Skipper Seabold wrote: > Just quickly comparing > > In [3]: np.lib.financial.fv(.1,10,-100,-350) > Out[3]: 2501.5523211350032 > > With OO Calc > =fv(.1,10,-100,-350) > =2501.55 > > Both return the value of 350*1.1**10 + 100*1.1**9 + ... + 100*1.1 > which is what I would expect it to do. I didn't look too closely at > the docs though, so they might be a bit confusing and need some > cleaning up. > I forgot the last payment (which doesn't earn any interest), so one more 100. On Mon, Jun 8, 2009 at 11:18 PM, David Goldsmith wrote: > > --- On Mon, 6/8/09, Skipper Seabold wrote: > >> There was a recent discussion about numpy.financial in this >> thread >> . >> >> Skipper > > Thanks, Skipper. ?Having now read that thread (but not the arguments, provided elsewhere, for the existence of numpy.financial in the first place), and considering that the only references mentioned there are also electronic ones (which, for the purpose of referencing sources in the function docs, I believe we're wanting to shun as much as possible), I formally "move" that numpy.financial (or at least that subset of it consisting of functions which are commonly subject to multiple definitions) be moved out of numpy. ?(Where _to_ exactly, I cannot say.) > I think fv is the expected behavior. See my reply to the discussion of fv to explain the notes section , and I think I can probably have a look at the consistency of the rest pretty soon. I don't have a more permanent reference for fv offhand, but it should be in any corporate finance text etc. Most of these type of "formulas" use basic results of geometric series to simplify. 
Skipper From aisaac at american.edu Mon Jun 8 23:40:12 2009 From: aisaac at american.edu (Alan G Isaac) Date: Mon, 08 Jun 2009 23:40:12 -0400 Subject: [Numpy-discussion] Definitions of pv, fv, nper, pmt, and rate In-Reply-To: <419981.813.qm@web52110.mail.re2.yahoo.com> References: <419981.813.qm@web52110.mail.re2.yahoo.com> Message-ID: <4A2DD99C.3000201@american.edu> On 6/8/2009 11:18 PM David Goldsmith apparently wrote: > I formally "move" that numpy.financial (or at least that > subset of it consisting of functions which are commonly > subject to multiple definitions) be moved out of numpy. My recollection is that Travis O. added this with the explicit intent of seducing users who might otherwise turn to spreadsheets for such functionality. I.e., it was part of an effort to extend the net of the NumPy community. I am not urging a case one way or another, although I am very sympathetic to that reasoning, whether or not I am correctly recalling the actual motivation. In that light, however, standard spreadsheet definitions would be the proper guide. E.g., the definitions used by Gnumeric. Cheers, Alan Isaac From d_l_goldsmith at yahoo.com Mon Jun 8 23:54:56 2009 From: d_l_goldsmith at yahoo.com (David Goldsmith) Date: Mon, 8 Jun 2009 20:54:56 -0700 (PDT) Subject: [Numpy-discussion] Definitions of pv, fv, nper, pmt, and rate Message-ID: <947999.43926.qm@web52112.mail.re2.yahoo.com> So would we regard a hard-copy of the users guide or reference manual for such a spreadsheet as sufficiently "permanent" to pass muster for use as a reference? DG --- On Mon, 6/8/09, Alan G Isaac wrote: > From: Alan G Isaac > Subject: Re: [Numpy-discussion] Definitions of pv, fv, nper, pmt, and rate > To: "Discussion of Numerical Python" > Date: Monday, June 8, 2009, 8:40 PM > On 6/8/2009 11:18 PM David Goldsmith > apparently wrote: > > I formally "move" that numpy.financial (or at least > that > > subset of it consisting of functions which are > commonly > > subject to multiple definitions) be moved out of > numpy.? > > > > My recollection is that Travis O. added this with the > explicit intent of seducing users who might otherwise > turn to spreadsheets for such functionality.? I.e., > it was part of an effort to extend the net of the NumPy > community. > > I am not urging a case one way or another, although I am > very sympathetic to that reasoning, whether or not I am > correctly recalling the actual motivation. > > In that light, however, standard spreadsheet definitions > would be the proper guide.? E.g., the definitions used > by > Gnumeric. > > Cheers, > Alan Isaac > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From robert.kern at gmail.com Tue Jun 9 00:18:09 2009 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 8 Jun 2009 23:18:09 -0500 Subject: [Numpy-discussion] Definitions of pv, fv, nper, pmt, and rate In-Reply-To: <947999.43926.qm@web52112.mail.re2.yahoo.com> References: <947999.43926.qm@web52112.mail.re2.yahoo.com> Message-ID: <3d375d730906082118v78fc62e7v6fb88a4d7ed88641@mail.gmail.com> On Mon, Jun 8, 2009 at 22:54, David Goldsmith wrote: > > So would we regard a hard-copy of the users guide or reference manual for such a spreadsheet as sufficiently "permanent" to pass muster for use as a reference? 
The OpenFormula standard is probably better: http://www.oasis-open.org/committees/documents.php?wg_abbrev=office-formula -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From jsseabold at gmail.com Tue Jun 9 00:34:06 2009 From: jsseabold at gmail.com (Skipper Seabold) Date: Tue, 9 Jun 2009 00:34:06 -0400 Subject: [Numpy-discussion] Definitions of pv, fv, nper, pmt, and rate In-Reply-To: <3d375d730906082118v78fc62e7v6fb88a4d7ed88641@mail.gmail.com> References: <947999.43926.qm@web52112.mail.re2.yahoo.com> <3d375d730906082118v78fc62e7v6fb88a4d7ed88641@mail.gmail.com> Message-ID: On Tue, Jun 9, 2009 at 12:18 AM, Robert Kern wrote: > On Mon, Jun 8, 2009 at 22:54, David Goldsmith wrote: >> >> So would we regard a hard-copy of the users guide or reference manual for such a spreadsheet as sufficiently "permanent" to pass muster for use as a reference? > > The OpenFormula standard is probably better: > > http://www.oasis-open.org/committees/documents.php?wg_abbrev=office-formula > This is a nice reference. There are notes for which packages agree/disagree, proprietary and open source, and values for tests. Skipper From d_l_goldsmith at yahoo.com Tue Jun 9 00:51:46 2009 From: d_l_goldsmith at yahoo.com (d_l_goldsmith at yahoo.com) Date: Mon, 8 Jun 2009 21:51:46 -0700 (PDT) Subject: [Numpy-discussion] Definitions of pv, fv, nper, pmt, and rate Message-ID: <595939.51926.qm@web52105.mail.re2.yahoo.com> --- On Mon, 6/8/09, Skipper Seabold wrote: > I forgot the last payment (which doesn't earn any > interest), so one more 100. So in fact they're not in agreement? > pretty soon.? I don't have a more permanent reference > for fv offhand, > but it should be in any corporate finance text etc.? > Most of these > type of "formulas" use basic results of geometric series to > simplify. Let me be more specific about the difference between what we have and what I'm finding in print. Essentially, it boils down to this: in every source I've found, two "different" present/future values are discussed, that for a single amount, and that for a constant (i.e., not even the first "payment" is allowed to be different) periodic payment. I have not been able to find a single printed reference that gives a formula for (or even discusses, for that matter) the combination of these two, which is clearly what we have implemented (and which is, just as clearly, actually seen in practice). Now, my lazy side simply hopes that my stridency will finally cause someone to pipe up and say "look, dummy, it's in Schmoe, Joe, 2005. "Advanced Financial Practice." Financial Press, NY NY. There's your reference; find it and look it up if you don't trust me" and then I'll feel like we've at least covered our communal rear-end. 
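(To make the combination concrete, a sketch in the sign convention numpy and OpenFormula use, with end-of-period payments assumed; the combined closed form reduces to each of the two textbook cases when the other argument is zero. The function name is an illustrative choice.)

def fv_combined(rate, nper, pmt, pv):
    # future value of a single amount plus a level payment stream
    return -(pv * (1 + rate) ** nper
             + pmt * ((1 + rate) ** nper - 1) / rate)

fv_lump = fv_combined(0.05, 10, 0, -1000)     # single amount only: 1628.89
fv_annuity = fv_combined(0.05, 10, -100, 0)   # level payments only: 1257.79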
But my more conscientious side worries that, if I've had so much trouble finding our more "advanced" definition (and I have tried, believe me), then I'm concerned that what your typical student (for example) is most likely to encounter is one of those simpler definitions, and thus get confused (at best) if they look at our help doc and find quite a different (at least superficially) definition (or worse, don't look at the help doc, and either can't get the function to work because the required number of inputs doesn't match what they're expecting from their text, or somehow manage to get it to work, but get an answer very different from that given in other sources, e.g., the answers in the back of their text.) One obvious answer to this dilemma is to explain this discrepancy in the help doc, but then we have to explain - clearly and lucidly, mind you - how one uses our functions for the two simpler cases, how/why the formula we use is the combination of the other two, etc. (it's rather hard to anticipate, for me at least, all the possible confusions this discrepancy might create) and in any event, somehow I don't really think something so necessarily elaborate is appropriate in this case. So, again, given that fv and pv (and by extension, nper, pmt, and rate) have multiple definitions floating around out there, I sincerely think we should "punt" (my apologies to those unfamiliar w/ the American "football" metaphor), i.e., rid ourselves of this nightmare, esp. in light of what I feel are compelling, independent arguments against the inclusion of these functions in this library in the first place. Sorry for my stridency, and thank you for your time and patience. DG > > Skipper > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From josef.pktd at gmail.com Tue Jun 9 01:14:09 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 9 Jun 2009 01:14:09 -0400 Subject: [Numpy-discussion] Definitions of pv, fv, nper, pmt, and rate In-Reply-To: <595939.51926.qm@web52105.mail.re2.yahoo.com> References: <595939.51926.qm@web52105.mail.re2.yahoo.com> Message-ID: <1cd32cbb0906082214k1d318fbfn77fb67b664477ae3@mail.gmail.com> On Tue, Jun 9, 2009 at 12:51 AM, wrote: > > --- On Mon, 6/8/09, Skipper Seabold wrote: > >> I forgot the last payment (which doesn't earn any >> interest), so one more 100. > > So in fact they're not in agreement? > >> pretty soon.? I don't have a more permanent reference >> for fv offhand, >> but it should be in any corporate finance text etc. >> Most of these >> type of "formulas" use basic results of geometric series to >> simplify. > > Let me be more specific about the difference between what we have and what I'm finding in print. ?Essentially, it boils down to this: in every source I've found, two "different" present/future values are discussed, that for a single amount, and that for a constant (i.e., not even the first "payment" is allowed to be different) periodic payment. ?I have not been able to find a single printed reference that gives a formula for (or even discusses, for that matter) the combination of these two, which is clearly what we have implemented (and which is, just as clearly, actually seen in practice). > > Now, my lazy side simply hopes that my stridency will finally cause someone to pipe up and say "look, dummy, it's in Schmoe, Joe, 2005. "Advanced Financial Practice." ?Financial Press, NY NY. 
?There's your reference; find it and look it up if you don't trust me" and then I'll feel like we've at least covered our communal rear-end. ?But my more conscientious side worries that, if I've had so much trouble finding our more "advanced" definition (and I have tried, believe me), then I'm concerned that what your typical student (for example) is most likely to encounter is one of those simpler definitions, and thus get confused (at best) if they look at our help doc and find quite a different (at least superficially) definition (or worse, don't look at the help doc, and either can't get the function to work because the required number of inputs doesn't match what they're expecting from their text, or somehow manage to get it to work, but get an answer very > ?different from that given in other sources, e.g., the answers in the back of their text.) > > One obvious answer to this dilemma is to explain this discrepancy in the help doc, but then we have to explain - clearly and lucidly, mind you - how one uses our functions for the two simpler cases, how/why the formula we use is the combination of the other two, etc. (it's rather hard to anticipate, for me at least, all the possible confusions this discrepancy might create) and in any event, somehow I don't really think something so necessarily elaborate is appropriate in this case. ?So, again, given that fv and pv (and by extension, nper, pmt, and rate) have multiple definitions floating around out there, I sincerely think we should "punt" (my apologies to those unfamiliar w/ the American "football" metaphor), i.e., rid ourselves of this nightmare, esp. in light of what I feel are compelling, independent arguments against the inclusion of these functions in this library in the first place. > > Sorry for my stridency, and thank you for your time and patience. > I guess non of them found a textbook either Josef just some samples Note: Applications do not agree on the answer for IPMT(5%/12;10;360;10000;0;1), for payments at the beginning of each period. Kspread agrees with Gnumeric on the answer listed above. Excel and OOo both get -410.38 as the result. TODO: which is correct? Note: OpenOffice.org 2.0 and Excel returns different results to mot of the cases. The following use the result of Excel. Note: Gnumeric gives an error for negative rates. Excel and OOo2 do not. For NPER(-1%;-100;1000), OOo2 gives 9.48, Excel produces 9.483283066, Gnumeric gives a #DIV/0 error. This appears to be a bug in Gnumeric. 
From gael.varoquaux at normalesup.org Tue Jun 9 01:47:51 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Tue, 9 Jun 2009 07:47:51 +0200 Subject: [Numpy-discussion] Interactive Shell/Editor/Workspace(variables)View/Plots In-Reply-To: <20090609011023.GA13728@jochen.schroeder.phy.auckland.ac.nz> References: <49d6b3500906080935i7d013bc9q918dc2a7f27536b6@mail.gmail.com> <3d15ebce0906081011uf4e3f81ha45202647e0ba916@mail.gmail.com> <49d6b3500906081026x5c1ed581pdf170981919a5967@mail.gmail.com> <20E4B6FC-F2B1-4C6C-949B-D0589DD797C5@enthought.com> <4A2D75DF.6090503@noaa.gov> <20090608211707.GF16831@phare.normalesup.org> <4A2D8705.4050208@noaa.gov> <49d6b3500906081514i4cdbf392q9fbb9e6e999ca6b6@mail.gmail.com> <20090608221647.GH28350@phare.normalesup.org> <20090609011023.GA13728@jochen.schroeder.phy.auckland.ac.nz> Message-ID: <20090609054751.GB4831@phare.normalesup.org> On Tue, Jun 09, 2009 at 01:10:24PM +1200, Jochen Schroeder wrote: > On 09/06/09 00:16, Gael Varoquaux wrote: > > On Mon, Jun 08, 2009 at 05:14:27PM -0500, G?khan SEVER wrote: > > > IPython's edit command works in a similar fashion, too. > > > edit test.py > > The cool thing is that you can select text in the editor and execute in > > EPDLab. On the other hand, I know that IPython has hooks to grow this in > > the code base, and I would like this to grow also directly in IPython. > > Hell, I use vim. How cool would it be to select (using visual mode) > > snippets in vim, and execute them in a running Ipython session. > I think there's a vim script for executing the marked code in python. If > IPython has already hooks for executing code in an existing session, it > might be possible to adapt this script. I do think it is, and that's just what I was suggesting. Now, I don't have time for that, but if someone feels like... :) Ga?l From d_l_goldsmith at yahoo.com Tue Jun 9 02:11:59 2009 From: d_l_goldsmith at yahoo.com (David Goldsmith) Date: Mon, 8 Jun 2009 23:11:59 -0700 (PDT) Subject: [Numpy-discussion] More on Summer NumPy Doc Marathon Message-ID: <901731.25166.qm@web52108.mail.re2.yahoo.com> Hi again, folks. I have a special request. Part of the vision for my job is that I'll focus my writing efforts on the docs no one else is gung-ho to work on. So, even if you're not quite ready to commit, if you're leaning toward volunteering to be a team lead for one (or more) categories, please let me know which one(s) (off list, if you prefer) so I can get an initial idea of what the "leftovers" are going to be. Thanks! DG From d_l_goldsmith at yahoo.com Tue Jun 9 02:32:32 2009 From: d_l_goldsmith at yahoo.com (David Goldsmith) Date: Mon, 8 Jun 2009 23:32:32 -0700 (PDT) Subject: [Numpy-discussion] Definitions of pv, fv, nper, pmt, and rate Message-ID: <575900.75584.qm@web52103.mail.re2.yahoo.com> --- On Mon, 6/8/09, Robert Kern wrote: > The OpenFormula standard is probably better: > > http://www.oasis-open.org/committees/documents.php?wg_abbrev=office-formula > > -- > Robert Kern OK, thanks Robert (as always); I'll go ahead and use this until/unless someone provide a printed reference. Thanks again. DG From robince at gmail.com Tue Jun 9 03:50:46 2009 From: robince at gmail.com (Robin) Date: Tue, 9 Jun 2009 08:50:46 +0100 Subject: [Numpy-discussion] performance matrix multiplication vs. 
matlab In-Reply-To: References: <4A295A7D.1040407@ar.media.kyoto-u.ac.jp> <1244362840.4377.10.camel@gabriel-desktop> <1244368329.4377.18.camel@gabriel-desktop> <4A2B8A51.3070409@ar.media.kyoto-u.ac.jp> <20090607101204.GB20612@phare.normalesup.org> <20090608053210.GC5032@phare.normalesup.org> <4A2C9EF9.7030302@ar.media.kyoto-u.ac.jp> <75c31b2a0906080533t29af5e2k6aef04136a3a5a5e@mail.gmail.com> Message-ID: On Mon, Jun 8, 2009 at 7:14 PM, David Warde-Farley wrote: > > On 8-Jun-09, at 8:33 AM, Jason Rennie wrote: > > Note that EM can be very slow to converge: > > That's absolutely true, but EM for PCA can be a life saver in cases where > diagonalizing (or even computing) the full covariance matrix is not a > realistic option. Diagonalization can be a lot of wasted effort if all you > care about are a few leading eigenvectors. EM also lets you deal with > missing values in a principled way, which I don't think you can do with > standard SVD. > > EM certainly isn't a magic bullet but there are circumstances where it's > appropriate. I'm a big fan of the ECG paper too. :) Hi, I've been following this with interest... although I'm not really familiar with the area. At the risk of drifting further off topic I wondered if anyone could recommend an accessible review of these kind of dimensionality reduction techniques... I am familiar with PCA and know of diffusion maps and ICA and others, but I'd never heard of EM and I don't really have any idea how they relate to each other and which might be better for one job or the other... so some sort of primer would be really handy. Cheers Robin From matthieu.brucher at gmail.com Tue Jun 9 03:57:18 2009 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Tue, 9 Jun 2009 09:57:18 +0200 Subject: [Numpy-discussion] performance matrix multiplication vs. matlab In-Reply-To: References: <4A295A7D.1040407@ar.media.kyoto-u.ac.jp> <1244368329.4377.18.camel@gabriel-desktop> <4A2B8A51.3070409@ar.media.kyoto-u.ac.jp> <20090607101204.GB20612@phare.normalesup.org> <20090608053210.GC5032@phare.normalesup.org> <4A2C9EF9.7030302@ar.media.kyoto-u.ac.jp> <75c31b2a0906080533t29af5e2k6aef04136a3a5a5e@mail.gmail.com> Message-ID: 2009/6/9 Robin : > On Mon, Jun 8, 2009 at 7:14 PM, David Warde-Farley wrote: >> >> On 8-Jun-09, at 8:33 AM, Jason Rennie wrote: >> >> Note that EM can be very slow to converge: >> >> That's absolutely true, but EM for PCA can be a life saver in cases where >> diagonalizing (or even computing) the full covariance matrix is not a >> realistic option. Diagonalization can be a lot of wasted effort if all you >> care about are a few leading eigenvectors. EM also lets you deal with >> missing values in a principled way, which I don't think you can do with >> standard SVD. >> >> EM certainly isn't a magic bullet but there are circumstances where it's >> appropriate. I'm a big fan of the ECG paper too. :) > > Hi, > > I've been following this with interest... although I'm not really > familiar with the area. At the risk of drifting further off topic I > wondered if anyone could recommend an accessible review of these kind > of dimensionality reduction techniques... I am familiar with PCA and > know of diffusion maps and ICA and others, but I'd never heard of EM > and I don't really have any idea how they relate to each other and > which might be better for one job or the other... so some sort of > primer would be really handy. Hi, Check Ch. 
Bishop publication on Probabilistic Principal Components Analysis, you have there the parallel between the two (EM is in fact just a way of computing PPCA, and with some Gaussian assumptions, you get PCA). Matthieu -- Information System Engineer, Ph.D. Website: http://matthieu-brucher.developpez.com/ Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92 LinkedIn: http://www.linkedin.com/in/matthieubrucher From david at ar.media.kyoto-u.ac.jp Tue Jun 9 03:54:36 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Tue, 09 Jun 2009 16:54:36 +0900 Subject: [Numpy-discussion] performance matrix multiplication vs. matlab In-Reply-To: References: <4A295A7D.1040407@ar.media.kyoto-u.ac.jp> <1244362840.4377.10.camel@gabriel-desktop> <1244368329.4377.18.camel@gabriel-desktop> <4A2B8A51.3070409@ar.media.kyoto-u.ac.jp> <20090607101204.GB20612@phare.normalesup.org> <20090608053210.GC5032@phare.normalesup.org> <4A2C9EF9.7030302@ar.media.kyoto-u.ac.jp> <75c31b2a0906080533t29af5e2k6aef04136a3a5a5e@mail.gmail.com> Message-ID: <4A2E153C.3080808@ar.media.kyoto-u.ac.jp> Robin wrote: > On Mon, Jun 8, 2009 at 7:14 PM, David Warde-Farley wrote: > >> On 8-Jun-09, at 8:33 AM, Jason Rennie wrote: >> >> Note that EM can be very slow to converge: >> >> That's absolutely true, but EM for PCA can be a life saver in cases where >> diagonalizing (or even computing) the full covariance matrix is not a >> realistic option. Diagonalization can be a lot of wasted effort if all you >> care about are a few leading eigenvectors. EM also lets you deal with >> missing values in a principled way, which I don't think you can do with >> standard SVD. >> >> EM certainly isn't a magic bullet but there are circumstances where it's >> appropriate. I'm a big fan of the ECG paper too. :) >> > > Hi, > > I've been following this with interest... although I'm not really > familiar with the area. At the risk of drifting further off topic I > wondered if anyone could recommend an accessible review of these kind > of dimensionality reduction techniques... I am familiar with PCA and > know of diffusion maps and ICA and others, but I'd never heard of EM > and I don't really have any idea how they relate to each other and > which might be better for one job or the other... so some sort of > primer would be really handy. > I think the biggest problem is the 'babel tower' aspect of machine learning (the expression is from David H. Wolpert I believe), and practitioners in different subfields often use totally different words for more or less the same concepts (and many keep being rediscovered). For example, what ML people call PCA is called Karhunen Lo?ve in signal processing, and the concepts are quite similar. Anyway, the book from Bishop is a pretty good reference by one of the leading researcher: http://research.microsoft.com/en-us/um/people/cmbishop/prml/ It can be read without much background besides basic 1st year calculus/linear algebra. cheers, David From david at ar.media.kyoto-u.ac.jp Tue Jun 9 04:09:06 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Tue, 09 Jun 2009 17:09:06 +0900 Subject: [Numpy-discussion] performance matrix multiplication vs. 
matlab In-Reply-To: <4A2E153C.3080808@ar.media.kyoto-u.ac.jp> References: <4A295A7D.1040407@ar.media.kyoto-u.ac.jp> <1244362840.4377.10.camel@gabriel-desktop> <1244368329.4377.18.camel@gabriel-desktop> <4A2B8A51.3070409@ar.media.kyoto-u.ac.jp> <20090607101204.GB20612@phare.normalesup.org> <20090608053210.GC5032@phare.normalesup.org> <4A2C9EF9.7030302@ar.media.kyoto-u.ac.jp> <75c31b2a0906080533t29af5e2k6aef04136a3a5a5e@mail.gmail.com> <4A2E153C.3080808@ar.media.kyoto-u.ac.jp> Message-ID: <4A2E18A2.7010702@ar.media.kyoto-u.ac.jp> David Cournapeau wrote: > > I think the biggest problem is the 'babel tower' aspect of machine > learning (the expression is from David H. Wolpert I believe), and > practitioners in different subfields often use totally different words > for more or less the same concepts (and many keep being rediscovered). > For example, what ML people call PCA is called Karhunen Lo?ve in signal > processing, and the concepts are quite similar. > > Anyway, the book from Bishop is a pretty good reference by one of the > leading researcher: > > http://research.microsoft.com/en-us/um/people/cmbishop/prml/ > Should have mentioned that it is the same Bishop as mentioned by Matthieu, and that chapter 12 deals with latent models with continuous latent variable, which is one way to consider PCA in a probabilistic framework. David From neilcrighton at gmail.com Tue Jun 9 04:27:55 2009 From: neilcrighton at gmail.com (Neil Crighton) Date: Tue, 9 Jun 2009 08:27:55 +0000 (UTC) Subject: [Numpy-discussion] =?utf-8?q?setmember1d=5Fnu?= References: <4A279882.5090700@ntc.zcu.cz> <4A2D1466.2000501@ntc.zcu.cz> Message-ID: Robert Cimrman ntc.zcu.cz> writes: > >> I'd really like to see the setmember1d_nu function in ticket 1036 get into > >> numpy. There's a patch waiting for review that including tests: > >> > >> http://projects.scipy.org/numpy/ticket/1036 > >> > >> Is there anything I can do to help get it applied? > > > > I guess I could commit it, if you review the patch and it works for you. > > Obviously, I cannot review it myself, but my SVN access may still work :) > > Thanks for the review, it is in! > > r. Great - thanks! People often post to the list asking for this functionality, so it's nice to get it into numpy (whatever it ends up being called). Neil From stefan at sun.ac.za Tue Jun 9 04:28:02 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Tue, 9 Jun 2009 10:28:02 +0200 Subject: [Numpy-discussion] performance matrix multiplication vs. matlab In-Reply-To: <4A2E153C.3080808@ar.media.kyoto-u.ac.jp> References: <4A295A7D.1040407@ar.media.kyoto-u.ac.jp> <4A2B8A51.3070409@ar.media.kyoto-u.ac.jp> <20090607101204.GB20612@phare.normalesup.org> <20090608053210.GC5032@phare.normalesup.org> <4A2C9EF9.7030302@ar.media.kyoto-u.ac.jp> <75c31b2a0906080533t29af5e2k6aef04136a3a5a5e@mail.gmail.com> <4A2E153C.3080808@ar.media.kyoto-u.ac.jp> Message-ID: <9457e7c80906090128u4ac2e67ai8e6381f667daeddf@mail.gmail.com> 2009/6/9 David Cournapeau : > Anyway, the book from Bishop is a pretty good reference by one of the > leading researcher: > > http://research.microsoft.com/en-us/um/people/cmbishop/prml/ > > It can be read without much background besides basic 1st year > calculus/linear algebra. Bishop's book could be confusing at times, so I would also recommend going back to the original papers. It is sometimes easier to learn *with* researchers than from them! 
Cheers St?fan From joschu at caltech.edu Tue Jun 9 04:32:57 2009 From: joschu at caltech.edu (John Schulman) Date: Tue, 9 Jun 2009 01:32:57 -0700 Subject: [Numpy-discussion] error with large memmap Message-ID: <185761440906090132i18f37eb6r23b3e177027e0387@mail.gmail.com> I'm getting the error OverflowError: cannot fit 'long' into an index-sized integer when I try to memmap a 6gb file top of the stack trace is mm = mmap.mmap(fid.fileno(), bytes, access=acc) where bytes = 6528000000L I thought that 64-bit systems with python>2.5 could memmap large files. I'm running the latest EPD python distribution (4.2.30201), which uses python 2.5.4 and numpy 1.2.1 Macbook Pro Core 2 Duo, OS X 10.5.6 From cimrman3 at ntc.zcu.cz Tue Jun 9 04:35:15 2009 From: cimrman3 at ntc.zcu.cz (Robert Cimrman) Date: Tue, 09 Jun 2009 10:35:15 +0200 Subject: [Numpy-discussion] setmember1d_nu In-Reply-To: References: <4A279882.5090700@ntc.zcu.cz> <4A2D1466.2000501@ntc.zcu.cz> Message-ID: <4A2E1EC3.70807@ntc.zcu.cz> Neil Crighton wrote: > Robert Cimrman ntc.zcu.cz> writes: > >>>> I'd really like to see the setmember1d_nu function in ticket 1036 get into >>>> numpy. There's a patch waiting for review that including tests: >>>> >>>> http://projects.scipy.org/numpy/ticket/1036 >>>> >>>> Is there anything I can do to help get it applied? >>> I guess I could commit it, if you review the patch and it works for you. >>> Obviously, I cannot review it myself, but my SVN access may still work :) >> Thanks for the review, it is in! >> >> r. > > Great - thanks! People often post to the list asking for this functionality, so > it's nice to get it into numpy (whatever it ends up being called). Thank you for starting the discussion :) From cimrman3 at ntc.zcu.cz Tue Jun 9 04:37:54 2009 From: cimrman3 at ntc.zcu.cz (Robert Cimrman) Date: Tue, 09 Jun 2009 10:37:54 +0200 Subject: [Numpy-discussion] improving arraysetops Message-ID: <4A2E1F62.4010208@ntc.zcu.cz> Hi, I am starting a new thread, so that it reaches the interested people. Let us discuss improvements to arraysetops (array set operations) at [1] (allowing non-unique arrays as function arguments, better naming conventions and documentation). r. [1] http://projects.scipy.org/numpy/ticket/1133 From seb.haase at gmail.com Tue Jun 9 04:51:38 2009 From: seb.haase at gmail.com (Sebastian Haase) Date: Tue, 9 Jun 2009 10:51:38 +0200 Subject: [Numpy-discussion] How to remove fortran-like loops with numpy? In-Reply-To: <9457e7c80906081839p12fb39v8efc12c554162f63@mail.gmail.com> References: <18571cd90906061221q7e31c477pc0c27ee700286ed8@mail.gmail.com> <18571cd90906081452g58791bb0ie106924de33bc914@mail.gmail.com> <9457e7c80906081839p12fb39v8efc12c554162f63@mail.gmail.com> Message-ID: 2009/6/9 St?fan van der Walt : > Hi Juan > <---cut ---> >> As you can see, it is very simple, but it takes several seconds running just >> to create a 200x200 plot. Fortran takes same time to create a 2000x2000 >> plot, around 100 times faster... So the question is, do you know how to >> programme this in a Python-like fashion in order to improve seriously the >> performance? > > Here is another version, similar to Robert's, that I wrote up for the > documentation project last year: > > http://mentat.za.net/numpy/intro/intro.html > > We never used it, but I still like the pretty pictures :-) +1 Hi St?fan, this does look really nice !! Could it be put somewhere more prominent !? How about onto the SciPy site with a "newcomers start here"-link right on the first page .... 
My 2 cents, Sebastian Haase From charlesr.harris at gmail.com Tue Jun 9 05:34:18 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 9 Jun 2009 03:34:18 -0600 Subject: [Numpy-discussion] error with large memmap In-Reply-To: <185761440906090132i18f37eb6r23b3e177027e0387@mail.gmail.com> References: <185761440906090132i18f37eb6r23b3e177027e0387@mail.gmail.com> Message-ID: On Tue, Jun 9, 2009 at 2:32 AM, John Schulman wrote: > I'm getting the error > OverflowError: cannot fit 'long' into an index-sized integer > when I try to memmap a 6gb file > > top of the stack trace is > mm = mmap.mmap(fid.fileno(), bytes, access=acc) > where bytes = 6528000000L > > I thought that 64-bit systems with python>2.5 could memmap large > files. I'm running the latest EPD python distribution (4.2.30201), > which uses python 2.5.4 and numpy 1.2.1 > Macbook Pro Core 2 Duo, OS X 10.5.6 > Is your python 64 bits? Try: file `which python` and see what it says. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From david at ar.media.kyoto-u.ac.jp Tue Jun 9 05:52:08 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Tue, 09 Jun 2009 18:52:08 +0900 Subject: [Numpy-discussion] error with large memmap In-Reply-To: References: <185761440906090132i18f37eb6r23b3e177027e0387@mail.gmail.com> Message-ID: <4A2E30C8.5060805@ar.media.kyoto-u.ac.jp> Charles R Harris wrote: > > > On Tue, Jun 9, 2009 at 2:32 AM, John Schulman > wrote: > > I'm getting the error > OverflowError: cannot fit 'long' into an index-sized integer > when I try to memmap a 6gb file > > top of the stack trace is > mm = mmap.mmap(fid.fileno(), bytes, access=acc) > where bytes = 6528000000L > > I thought that 64-bit systems with python>2.5 could memmap large > files. I'm running the latest EPD python distribution (4.2.30201), > which uses python 2.5.4 and numpy 1.2.1 > Macbook Pro Core 2 Duo, OS X 10.5.6 > > > Is your python 64 bits? Try: > > file `which python` This is even better: python -c "import platform; print platform.machine()" as mac os x can be confusing with fat binaries and all :) David From jsseabold at gmail.com Tue Jun 9 09:45:43 2009 From: jsseabold at gmail.com (Skipper Seabold) Date: Tue, 9 Jun 2009 09:45:43 -0400 Subject: [Numpy-discussion] Definitions of pv, fv, nper, pmt, and rate In-Reply-To: <1cd32cbb0906082214k1d318fbfn77fb67b664477ae3@mail.gmail.com> References: <595939.51926.qm@web52105.mail.re2.yahoo.com> <1cd32cbb0906082214k1d318fbfn77fb67b664477ae3@mail.gmail.com> Message-ID: On Tue, Jun 9, 2009 at 1:14 AM, wrote: > On Tue, Jun 9, 2009 at 12:51 AM, wrote: >> >> --- On Mon, 6/8/09, Skipper Seabold wrote: >> >>> I forgot the last payment (which doesn't earn any >>> interest), so one more 100. >> >> So in fact they're not in agreement? >> >>> pretty soon.? I don't have a more permanent reference >>> for fv offhand, >>> but it should be in any corporate finance text etc. >>> Most of these >>> type of "formulas" use basic results of geometric series to >>> simplify. >> >> Let me be more specific about the difference between what we have and what I'm finding in print. ?Essentially, it boils down to this: in every source I've found, two "different" present/future values are discussed, that for a single amount, and that for a constant (i.e., not even the first "payment" is allowed to be different) periodic payment. 
?I have not been able to find a single printed reference that gives a formula for (or even discusses, for that matter) the combination of these two, which is clearly what we have implemented (and which is, just as clearly, actually seen in practice). >> These are the two most basic building blocks of time value problems, discounting one cash flow and an annuity. There are *plenty* of examples and use cases for uneven cash flows or for providing a given pv or fv. Without even getting into actual financial contracts, suppose I have an investment account that already has $10,000 and I plan to add $500 every month and earn 4%. Then we would need something like fv to tell me how much this will be worth after 180 months. I don't necessarily need a reference to tell me this would be useful to know. >> Now, my lazy side simply hopes that my stridency will finally cause someone to pipe up and say "look, dummy, it's in Schmoe, Joe, 2005. "Advanced Financial Practice." ?Financial Press, NY NY. ?There's your reference; find it and look it up if you don't trust me" and then I'll feel like we've at least covered our communal rear-end. ?But my more conscientious side worries that, if I've had so much trouble finding our more "advanced" definition (and I have tried, believe me), then I'm concerned that what your typical student (for example) is most likely to encounter is one of those simpler definitions, and thus get confused (at best) if they look at our help doc and find quite a different (at least superficially) definition (or worse, don't look at the help doc, and either can't get the function to work because the required number of inputs doesn't match what they're expecting from their text, or somehow manage to get it to work, but get an answer very >> ?different from that given in other sources, e.g., the answers in the back of their text.) >> I don't know that these are "formulas" per se, rather than convenience functions for typical use cases. That's why they're in spreadsheets in the first place. They also follow the behavior of financial calculators, where you typically have to input a N, I/Y, PMT, PV and FV (even if one of these last two values is zero). If you need a textbook reference, as I said before you could literally pick up any corporate finance text and derive these functions from the basics. Try having a look at some end of chapter questions (or financial calculator handbook) to get an idea of when and how they'd actually be used. >> One obvious answer to this dilemma is to explain this discrepancy in the help doc, but then we have to explain - clearly and lucidly, mind you - how one uses our functions for the two simpler cases, how/why the formula we use is the combination of the other two, etc. (it's rather hard to anticipate, for me at least, all the possible confusions this discrepancy might create) and in any event, somehow I don't really think something so necessarily elaborate is appropriate in this case. ?So, again, given that fv and pv (and by extension, nper, pmt, and rate) have multiple definitions floating around out there, I sincerely think we should "punt" (my apologies to those unfamiliar w/ the American "football" metaphor), i.e., rid ourselves of this nightmare, esp. in light of what I feel are compelling, independent arguments against the inclusion of these functions in this library in the first place. >> >> Sorry for my stridency, and thank you for your time and patience. 
>> I don't think that there are multiple definitions of these (very simple) functions floating around, but rather different assumptions/implementations that lead to ever so slightly different results. My plan for the additions and when checking the existing ones is to derive the result, so that we know what's going on. Once you state your assumptions, the result will be clearly one way or another. This would be my way of "covering" our functions. I derived the result, so here's what's going on, here's a use case to have a look at as an example. Then we should be fine. It's not that I don't appreciate your concern for being correct. I guess it's just that I don't share it (the concern that is) in this case. Skipper From joschu at caltech.edu Tue Jun 9 10:32:24 2009 From: joschu at caltech.edu (John Schulman) Date: Tue, 9 Jun 2009 07:32:24 -0700 Subject: [Numpy-discussion] error with large memmap In-Reply-To: <4A2E30C8.5060805@ar.media.kyoto-u.ac.jp> References: <185761440906090132i18f37eb6r23b3e177027e0387@mail.gmail.com> <4A2E30C8.5060805@ar.media.kyoto-u.ac.jp> Message-ID: <185761440906090732i59421c9iac8a9d12a507569e@mail.gmail.com> OK looks like that was the issue $ python -c "import platform; print platform.machine()" i386 Thanks On Tue, Jun 9, 2009 at 2:52 AM, David Cournapeau wrote: > Charles R Harris wrote: >> >> >> On Tue, Jun 9, 2009 at 2:32 AM, John Schulman > > wrote: >> >> ? ? I'm getting the error >> ? ? OverflowError: cannot fit 'long' into an index-sized integer >> ? ? when I try to memmap a 6gb file >> >> ? ? top of the stack trace is >> ? ? mm = mmap.mmap(fid.fileno(), bytes, access=acc) >> ? ? where bytes = 6528000000L >> >> ? ? I thought that 64-bit systems with python>2.5 could memmap large >> ? ? files. I'm running the latest EPD python distribution (4.2.30201), >> ? ? which uses python 2.5.4 and numpy 1.2.1 >> ? ? Macbook Pro Core 2 Duo, OS X 10.5.6 >> >> >> Is your python 64 bits? Try: >> >> file `which python` > > This is even better: > > python -c "import platform; print platform.machine()" > > as mac os x can be confusing with fat binaries and all :) > > David > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From D.P.Reichert at sms.ed.ac.uk Tue Jun 9 10:38:11 2009 From: D.P.Reichert at sms.ed.ac.uk (David Paul Reichert) Date: Tue, 09 Jun 2009 15:38:11 +0100 Subject: [Numpy-discussion] Adding zero-dimension arrays, bug? Message-ID: <20090609153811.bjxo6yswaoskc88o@www.sms.ed.ac.uk> Hi, Numpy let's me define arrays with zero rows and/or columns, and that's wanted behaviour from what I have read in discussions. However, I can add an array with zero rows to an array with one row (but not more), resulting in another zero row array, like so: In: a = zeros((4,0)) In: a Out: array([], shape=(4, 0), dtype=float64) In: b = zeros((4,1)) In: b Out: array([[ 0.], [ 0.], [ 0.], [ 0.]]) In: a + b Out: array([], shape=(4, 0), dtype=float64) Is this a bug? This should give a shape mismatch error, shouldn't? Cheers David -- The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336. 
From joschu at caltech.edu Tue Jun 9 10:51:59 2009 From: joschu at caltech.edu (John Schulman) Date: Tue, 9 Jun 2009 07:51:59 -0700 Subject: [Numpy-discussion] error with large memmap In-Reply-To: <185761440906090732i59421c9iac8a9d12a507569e@mail.gmail.com> References: <185761440906090132i18f37eb6r23b3e177027e0387@mail.gmail.com> <4A2E30C8.5060805@ar.media.kyoto-u.ac.jp> <185761440906090732i59421c9iac8a9d12a507569e@mail.gmail.com> Message-ID: <185761440906090751u511434bw33122d3e8858fb0b@mail.gmail.com> What's the best way to install a 64-bit python alongside my existing installation? On Tue, Jun 9, 2009 at 7:32 AM, John Schulman wrote: > OK looks like that was the issue > $ python -c "import platform; print platform.machine()" > i386 > > Thanks > > > On Tue, Jun 9, 2009 at 2:52 AM, David > Cournapeau wrote: >> Charles R Harris wrote: >>> >>> >>> On Tue, Jun 9, 2009 at 2:32 AM, John Schulman >> > wrote: >>> >>> ? ? I'm getting the error >>> ? ? OverflowError: cannot fit 'long' into an index-sized integer >>> ? ? when I try to memmap a 6gb file >>> >>> ? ? top of the stack trace is >>> ? ? mm = mmap.mmap(fid.fileno(), bytes, access=acc) >>> ? ? where bytes = 6528000000L >>> >>> ? ? I thought that 64-bit systems with python>2.5 could memmap large >>> ? ? files. I'm running the latest EPD python distribution (4.2.30201), >>> ? ? which uses python 2.5.4 and numpy 1.2.1 >>> ? ? Macbook Pro Core 2 Duo, OS X 10.5.6 >>> >>> >>> Is your python 64 bits? Try: >>> >>> file `which python` >> >> This is even better: >> >> python -c "import platform; print platform.machine()" >> >> as mac os x can be confusing with fat binaries and all :) >> >> David >> _______________________________________________ >> Numpy-discussion mailing list >> Numpy-discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > From david at ar.media.kyoto-u.ac.jp Tue Jun 9 10:47:13 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Tue, 09 Jun 2009 23:47:13 +0900 Subject: [Numpy-discussion] error with large memmap In-Reply-To: <185761440906090751u511434bw33122d3e8858fb0b@mail.gmail.com> References: <185761440906090132i18f37eb6r23b3e177027e0387@mail.gmail.com> <4A2E30C8.5060805@ar.media.kyoto-u.ac.jp> <185761440906090732i59421c9iac8a9d12a507569e@mail.gmail.com> <185761440906090751u511434bw33122d3e8858fb0b@mail.gmail.com> Message-ID: <4A2E75F1.7020909@ar.media.kyoto-u.ac.jp> John Schulman wrote: > What's the best way to install a 64-bit python alongside my existing > installation? > It is a bit complicated because you need to build your own python interpreter (the python.org one does not handle 64 bits AFAIK). You could just install your python somewhere in your $HOME for example, and whenever you use this python interpreter to build a package, it will install it inside the site-packages of this python - so no clash with existing python interpreters. 
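As a quick check on whatever interpreter you end up with, here is a minimal
sketch along the lines of the platform.machine() suggestion earlier in the
thread (Python 2 syntax of this era; nothing below is specific to EPD or OS X):

import platform, sys

print platform.machine()            # 'i386' here means a 32-bit process
print platform.architecture()[0]    # '32bit' or '64bit' for this interpreter
print sys.maxint > 2**32            # True only on a 64-bit build (Python 2)

On an OS X fat binary these report the running process, which is why
platform.machine() was suggested over running `file` on the binary.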
There are also things like virtualenv and co which can be considered if
you need more than a couple of packages,

cheers,

David

From bela.mihalik at gmail.com  Tue Jun  9 11:19:28 2009
From: bela.mihalik at gmail.com (bela)
Date: Tue, 9 Jun 2009 08:19:28 -0700 (PDT)
Subject: [Numpy-discussion] second 2d fft gives the same result as fft+ifft
Message-ID: <23945026.post@talk.nabble.com>


I tried to calculate the second Fourier transformation of an image with the
code below:

---------------------------------------------------------------
import pylab
import numpy

### Create a simple image

fx = numpy.zeros( 128**2 ).reshape(128,128).astype( numpy.float )

for i in xrange(8):
    for j in xrange(8):
        fx[i*8+16][j*8+16] = 1.0

### Fourier Transformations

Ffx = numpy.copy( numpy.fft.fft2( fx ).real )   # 1st fourier
FFfx = numpy.copy( numpy.fft.fft2( Ffx ).real )  # 2nd fourier
IFfx = numpy.copy( numpy.fft.ifft2( Ffx ).real )   # inverse fourier

### Display result

pylab.figure( 1, figsize=(8,8), dpi=125 )

pylab.subplot(221)
pylab.imshow( fx, cmap=pylab.cm.gray )
pylab.colorbar()
pylab.title( "fx" )

pylab.subplot(222)
pylab.imshow( Ffx, cmap=pylab.cm.gray )
pylab.colorbar()
pylab.title( "Ffx" )

pylab.subplot(223)
pylab.imshow( FFfx, cmap=pylab.cm.gray )
pylab.colorbar()
pylab.title( "FFfx" )

pylab.subplot(224)
pylab.imshow( IFfx, cmap=pylab.cm.gray )
pylab.colorbar()
pylab.title( "IFfx" )

pylab.show()
---------------------------------------------------------------

On my computer FFfx is the same as IFfx..... but why?

I uploaded a screenshot about my result here:
http://server6.theimagehosting.com/image.php?img=second_fourier.png

Bela


--
View this message in context: http://www.nabble.com/second-2d-fft-gives-the-same-result-as-fft%2Bifft-tp23945026p23945026.html
Sent from the Numpy-discussion mailing list archive at Nabble.com.


From matthieu.brucher at gmail.com  Tue Jun  9 11:36:18 2009
From: matthieu.brucher at gmail.com (Matthieu Brucher)
Date: Tue, 9 Jun 2009 17:36:18 +0200
Subject: [Numpy-discussion] second 2d fft gives the same result as fft+ifft
In-Reply-To: <23945026.post@talk.nabble.com>
References: <23945026.post@talk.nabble.com>
Message-ID: 

Hi,

Is it really?
You only show the real part of the FFT, so you can't be sure of
what you are saying.
Don't forget that the only difference between the FFT and the iFFT is
(besides the scaling factor) a minus sign in the exponent.

Matthieu

2009/6/9 bela :
>
> I tried to calculate the second Fourier transformation of an image with the
> code below:
>
> ---------------------------------------------------------------
> import pylab
> import numpy
>
> ### Create a simple image
>
> fx = numpy.zeros( 128**2 ).reshape(128,128).astype( numpy.float )
>
> for i in xrange(8):
>     for j in xrange(8):
>         fx[i*8+16][j*8+16] = 1.0
>
> ### Fourier Transformations
>
> Ffx = numpy.copy( numpy.fft.fft2( fx ).real )   # 1st fourier
> FFfx = numpy.copy( numpy.fft.fft2( Ffx ).real )  # 2nd fourier
> IFfx = numpy.copy( numpy.fft.ifft2( Ffx ).real )  
# inverse fourier > > ### Display result > > pylab.figure( 1, figsize=(8,8), dpi=125 ) > > pylab.subplot(221) > pylab.imshow( fx, cmap=pylab.cm.gray ) > pylab.colorbar() > pylab.title( "fx" ) > > pylab.subplot(222) > pylab.imshow( Ffx, cmap=pylab.cm.gray ) > pylab.colorbar() > pylab.title( "Ffx" ) > > pylab.subplot(223) > pylab.imshow( FFfx, cmap=pylab.cm.gray ) > pylab.colorbar() > pylab.title( "FFfx" ) > > pylab.subplot(224) > pylab.imshow( IFfx, cmap=pylab.cm.gray ) > pylab.colorbar() > pylab.title( "IFfx" ) > > pylab.show() > --------------------------------------------------------------- > > On my computer FFfx is the same as IFfx..... but why? > > I uploaded a screenshot about my result here: > http://server6.theimagehosting.com/image.php?img=second_fourier.png > > Bela > > > -- > View this message in context: http://www.nabble.com/second-2d-fft-gives-the-same-result-as-fft%2Bifft-tp23945026p23945026.html > Sent from the Numpy-discussion mailing list archive at Nabble.com. > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Information System Engineer, Ph.D. Website: http://matthieu-brucher.developpez.com/ Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92 LinkedIn: http://www.linkedin.com/in/matthieubrucher From charlesr.harris at gmail.com Tue Jun 9 11:48:05 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 9 Jun 2009 09:48:05 -0600 Subject: [Numpy-discussion] error with large memmap In-Reply-To: <185761440906090751u511434bw33122d3e8858fb0b@mail.gmail.com> References: <185761440906090132i18f37eb6r23b3e177027e0387@mail.gmail.com> <4A2E30C8.5060805@ar.media.kyoto-u.ac.jp> <185761440906090732i59421c9iac8a9d12a507569e@mail.gmail.com> <185761440906090751u511434bw33122d3e8858fb0b@mail.gmail.com> Message-ID: On Tue, Jun 9, 2009 at 8:51 AM, John Schulman wrote: > What's the best way to install a 64-bit python alongside my existing > installation? > There was a long thread about that a while back. Try a search on the archives. It wasn't all that easy, though. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From d_l_goldsmith at yahoo.com Tue Jun 9 12:31:25 2009 From: d_l_goldsmith at yahoo.com (David Goldsmith) Date: Tue, 9 Jun 2009 09:31:25 -0700 (PDT) Subject: [Numpy-discussion] [SciPy-dev] More on Summer NumPy Doc Marathon Message-ID: <667868.5819.qm@web52110.mail.re2.yahoo.com> Thanks, Stefan. The lists you suggest already exist (more or less, depending on the "thing," i.e., list of categories, completely, prioritized list of individual items, sort of, at least w/in the categories) on the Milestones page (that's essentially what the Milestones page is) and the list of individual items is far too long to duplicate here, but for everyone's convenience I'll provide the list of categories (at least those for which the goal has not been, or is not close to being, met, which is most of them): Data type investigation Fourier transforms Linear algebra Error handling Financial functions Functional operations Help routines Indexing Input/Output Logic, comparisons etc. 
Polynomials Random number generation Other random operations Boolean set operations Searching Sorting Statistics Comparison Window functions Sums, interpolation, gradients, etc Arithmetic + basic functions I Arithmetic + basic functions II Arithmetic + basic functions III Masked arrays Masked arrays, II Masked arrays, III Masked arrays, IV Operations on masks Even more MA functions I Even more MA functions II Numpy internals C-types Other math The matrix library Numarray compatibility Numeric compatibility Other array subclasses Matrix subclass Ndarray Ndarray, II Dtypes Ufunc Scalar base class Scalar types Comments: 0) The number of individual items in each of these categories varies from one to a few dozen or so 1) Omitted are a few "meta-categories," e.g., "Routines," "Basic Objects," etc. 2) IMO, there are still too many of these (at least too many to not be intimidating in the manner Stefan has implied); I had it in mind to try to create an intermediate level of organization, i.e., "meso-categories," but I couldn't really justify it on grounds other than there are simply still too many categories to be unintimidating, so I was advised against usage of time in that endeavor. However, if there's an outpouring of support for me doing that, it would fall on sympathetic ears. As far as prioritizing individual items, my opinion is that team leads should do that (or not, as they deem appropriate) - I wouldn't presume to know enough to do that in most cases. However, if people want to furnish me with suggested prioritizations, I'd be happy to be the one to edit the Wiki to reflect these. DG --- On Tue, 6/9/09, St?fan van der Walt wrote: > From: St?fan van der Walt > Subject: Re: [SciPy-dev] More on Summer NumPy Doc Marathon > To: "SciPy Developers List" > Date: Tuesday, June 9, 2009, 1:34 AM > Hi David > > 2009/6/9 David Goldsmith : > > > > Hi again, folks. ?I have a special request. ?Part of > the vision for my job is that I'll focus my writing efforts > on the docs no one else is gung-ho to work on. ?So, even if > you're not quite ready to commit, if you're leaning toward > volunteering to be a team lead for one (or more) categories, > please let me know which one(s) (off list, if you prefer) so > I can get an initial idea of what the "leftovers" are going > to be. ?Thanks! > > That's a pretty wide question.? Maybe you could post a > list of > categories and ask who would be willing to mentor and write > on each? > For the writing, we could decide on a prioritised list of > functions, > publish that list and then document those entries one by > one (i.e. > make the tasks small enough so that people don't run away > screaming). > > Cheers > St?fan > _______________________________________________ > Scipy-dev mailing list > Scipy-dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > From d_l_goldsmith at yahoo.com Tue Jun 9 12:58:05 2009 From: d_l_goldsmith at yahoo.com (David Goldsmith) Date: Tue, 9 Jun 2009 09:58:05 -0700 (PDT) Subject: [Numpy-discussion] Definitions of pv, fv, nper, pmt, and rate Message-ID: <202530.34045.qm@web52109.mail.re2.yahoo.com> --- On Tue, 6/9/09, Skipper Seabold wrote: > These are the two most basic building blocks of time value > problems, > discounting one cash flow and an annuity.? There are > *plenty* of > examples and use cases for uneven cash flows or for > providing a given > pv or fv.? 
Without even getting into actual financial > contracts, > suppose I have an investment account that already has > $10,000 and I > plan to add $500 every month and earn 4%.? Then we > would need > something like fv to tell me how much this will be worth > after 180 > months.? I don't necessarily need a reference to tell > me this would be > useful to know. Use case examples aren't the problem; worked out examples combining these two principles aren't the problem; usefulness isn't the problem; the problem is one of meeting a particular reference standard. > I don't know that these are "formulas" per se, rather than Except that we do provide a "formula" in our help doc; perhaps the "solution" is to get rid of that and include an explanation of how our function combines the two basic formulae (for which I do have a hard-copy reference: Gitman, L. J., 2003. "Principals of Managerial Finance (Brief), 3rd Ed." Pearson Education) to handle the more general case. > FV (even if one of these last two values is zero).? If > you need a > textbook reference, as I said before you could literally > pick up any > corporate finance text and derive these functions from the > basics. I don't question that, what I question is the appropriateness of such derivation in numpy's help doc; as I see it, one of the purposes of providing references in our situation is precisely to avoid having to include derivations in our help doc. But it's all moot, as Robert has furnished me with a reference specifically for the "formula" we do have, and, although it's an electronic reference, it seems "stable" enough to warrant an exception to the "references should be hard-copy AMAP" policy. DG From d_l_goldsmith at yahoo.com Tue Jun 9 13:31:47 2009 From: d_l_goldsmith at yahoo.com (David Goldsmith) Date: Tue, 9 Jun 2009 10:31:47 -0700 (PDT) Subject: [Numpy-discussion] [SciPy-dev] More on Summer NumPy Doc Marathon Message-ID: <570350.72787.qm@web52101.mail.re2.yahoo.com> Thanks, Bruce. --- On Tue, 6/9/09, Bruce Southey wrote: > Hi, > Great. > Can you provide the actual colors at the start for : > ? ? * Edit white, light gray, or yellow > ? ? * Don't edit dark gray > While not color-blind, not all browsers render the same > colors on all > operating systems etc. > What are you using for light gray or is that meant to be > blue. If it is > 'blue' then what does it mean? > It appears to be the same color used on the Front Page to > say 'Proofed'. Yes, by all means, I agree 100% (I'm having the same problem). :-) In fact, I think (and have thought) that the light and dark grey and cyan are all too close to each other - anyone object to me replacing the greys w/ orange and lavender? > What does the 'green' color mean? > The links says 'Reviewed (needs proof)'? but how does > one say 'proofed'. Cyan (if I understand you correctly). > Also the milestone link from the Front Page does not go > anywhere: > http://docs.scipy.org/numpy/Front%20Page/#milestones Ooops, accidentally broke that editing the page to make the Milestones link more prominent - I'll fix it imminently. 
Thanks again, DG > > Bruce > > > > _______________________________________________ > Scipy-dev mailing list > Scipy-dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > From d_l_goldsmith at yahoo.com Tue Jun 9 13:41:21 2009 From: d_l_goldsmith at yahoo.com (David Goldsmith) Date: Tue, 9 Jun 2009 10:41:21 -0700 (PDT) Subject: [Numpy-discussion] [SciPy-dev] More on Summer NumPy Doc Marathon Message-ID: <988376.79271.qm@web52101.mail.re2.yahoo.com> > Also the milestone link from the Front Page does not go > anywhere: > http://docs.scipy.org/numpy/Front%20Page/#milestones Fixed. DG From dwf at cs.toronto.edu Tue Jun 9 14:00:58 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Tue, 9 Jun 2009 14:00:58 -0400 Subject: [Numpy-discussion] performance matrix multiplication vs. matlab In-Reply-To: <4A2E153C.3080808@ar.media.kyoto-u.ac.jp> References: <4A295A7D.1040407@ar.media.kyoto-u.ac.jp> <1244362840.4377.10.camel@gabriel-desktop> <1244368329.4377.18.camel@gabriel-desktop> <4A2B8A51.3070409@ar.media.kyoto-u.ac.jp> <20090607101204.GB20612@phare.normalesup.org> <20090608053210.GC5032@phare.normalesup.org> <4A2C9EF9.7030302@ar.media.kyoto-u.ac.jp> <75c31b2a0906080533t29af5e2k6aef04136a3a5a5e@mail.gmail.com> <4A2E153C.3080808@ar.media.kyoto-u.ac.jp> Message-ID: <1895DB24-ECC3-4611-97DF-869380A8A4F2@cs.toronto.edu> On 9-Jun-09, at 3:54 AM, David Cournapeau wrote: > For example, what ML people call PCA is called Karhunen Lo?ve in > signal > processing, and the concepts are quite similar. Yup. This seems to be a nice set of review notes: http://www.ece.rutgers.edu/~orfanidi/ece525/svd.pdf And going further than just PCA/KLT, tying it together with maximum likelihood factor analysis/linear dynamical systems/hidden Markov models, http://www.cs.toronto.edu/~roweis/papers/NC110201.pdf David From d_l_goldsmith at yahoo.com Tue Jun 9 14:01:57 2009 From: d_l_goldsmith at yahoo.com (David Goldsmith) Date: Tue, 9 Jun 2009 11:01:57 -0700 (PDT) Subject: [Numpy-discussion] second 2d fft gives the same result as fft+ifft Message-ID: <459130.76439.qm@web52103.mail.re2.yahoo.com> --- On Tue, 6/9/09, Matthieu Brucher wrote: > Hi, > > Is it really ? > You only show the imaginary part of the FFT, so you can't > be sure of > what you are saying. Indeed, is there not a "label" for a function f which satisfies Im(iFFT(f)) = Im(FFT^2(f)), Re(iFFT(f)) != Re(FFT^2(f))? (And similarly if Im and Re roles are reversed.) Seems like the class of such functions (if any exist) might have some interesting properties... DG > Don't forget that the only difference between FFT and iFFT > is (besides > of teh scaling factor) a minus sign in the exponent. > > Matthieu > > 2009/6/9 bela : > > > > I tried to calculate the second fourier transformation > of an image with the > > following code below: > > > > > --------------------------------------------------------------- > > import pylab > > import numpy > > > > ### Create a simple image > > > > fx = numpy.zeros( 128**2 ).reshape(128,128).astype( > numpy.float ) > > > > for i in xrange(8): > > ? ? ? ?for j in xrange(8): > > ? ? ? ? ? ? ? ?fx[i*8+16][j*8+16] = 1.0 > > > > ### Fourier Transformations > > > > Ffx = numpy.copy( numpy.fft.fft2( fx ).real ) ? # 1st > fourier > > FFfx = numpy.copy( numpy.fft.fft2( Ffx ).real ) ?# > 2nd fourier > > IFfx = numpy.copy( numpy.fft.ifft2( Ffx ).real ) ? 
# > inverse fourier > > > > ### Display result > > > > pylab.figure( 1, figsize=(8,8), dpi=125 ) > > > > pylab.subplot(221) > > pylab.imshow( fx, cmap=pylab.cm.gray ) > > pylab.colorbar() > > pylab.title( "fx" ) > > > > pylab.subplot(222) > > pylab.imshow( Ffx, cmap=pylab.cm.gray ) > > pylab.colorbar() > > pylab.title( "Ffx" ) > > > > pylab.subplot(223) > > pylab.imshow( FFfx, cmap=pylab.cm.gray ) > > pylab.colorbar() > > pylab.title( "FFfx" ) > > > > pylab.subplot(224) > > pylab.imshow( IFfx, cmap=pylab.cm.gray ) > > pylab.colorbar() > > pylab.title( "IFfx" ) > > > > pylab.show() > > > --------------------------------------------------------------- > > > > On my computer FFfx is the same as IFfx..... but why? > > > > I uploaded a screenshot about my result here: > > http://server6.theimagehosting.com/image.php?img=second_fourier.png > > > > Bela > > > > > > -- > > View this message in context: http://www.nabble.com/second-2d-fft-gives-the-same-result-as-fft%2Bifft-tp23945026p23945026.html > > Sent from the Numpy-discussion mailing list archive at > Nabble.com. > > > > _______________________________________________ > > Numpy-discussion mailing list > > Numpy-discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > -- > Information System Engineer, Ph.D. > Website: http://matthieu-brucher.developpez.com/ > Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92 > LinkedIn: http://www.linkedin.com/in/matthieubrucher > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From d_l_goldsmith at yahoo.com Tue Jun 9 14:06:50 2009 From: d_l_goldsmith at yahoo.com (David Goldsmith) Date: Tue, 9 Jun 2009 11:06:50 -0700 (PDT) Subject: [Numpy-discussion] second 2d fft gives the same result as fft+ifft Message-ID: <315462.74301.qm@web52108.mail.re2.yahoo.com> Sorry, I meant: Im(iFT(FT(f))) = Im(FT^2(f)), Re(iFT(FT(f))) != Re(FT^2(f)) DG --- On Tue, 6/9/09, David Goldsmith wrote: > From: David Goldsmith > Subject: Re: [Numpy-discussion] second 2d fft gives the same result as fft+ifft > To: "Discussion of Numerical Python" > Date: Tuesday, June 9, 2009, 11:01 AM > > --- On Tue, 6/9/09, Matthieu Brucher > wrote: > > > Hi, > > > > Is it really ? > > You only show the imaginary part of the FFT, so you > can't > > be sure of > > what you are saying. > > Indeed, is there not a "label" for a function f which > satisfies > > ? ???Im(iFFT(f)) = Im(FFT^2(f)), > Re(iFFT(f)) != Re(FFT^2(f))? > > (And similarly if Im and Re roles are reversed.)? > Seems like the class of such functions (if any exist) might > have some interesting properties... > > DG > > > Don't forget that the only difference between FFT and > iFFT > > is (besides > > of teh scaling factor) a minus sign in the exponent. > > > > Matthieu > > > > 2009/6/9 bela : > > > > > > I tried to calculate the second fourier > transformation > > of an image with the > > > following code below: > > > > > > > > > --------------------------------------------------------------- > > > import pylab > > > import numpy > > > > > > ### Create a simple image > > > > > > fx = numpy.zeros( 128**2 > ).reshape(128,128).astype( > > numpy.float ) > > > > > > for i in xrange(8): > > > ? ? ? ?for j in xrange(8): > > > ? ? ? ? ? ? ? ?fx[i*8+16][j*8+16] = 1.0 > > > > > > ### Fourier Transformations > > > > > > Ffx = numpy.copy( numpy.fft.fft2( fx ).real ) ? 
> # 1st > > fourier > > > FFfx = numpy.copy( numpy.fft.fft2( Ffx ).real ) > ?# > > 2nd fourier > > > IFfx = numpy.copy( numpy.fft.ifft2( Ffx ).real ) > ? # > > inverse fourier > > > > > > ### Display result > > > > > > pylab.figure( 1, figsize=(8,8), dpi=125 ) > > > > > > pylab.subplot(221) > > > pylab.imshow( fx, cmap=pylab.cm.gray ) > > > pylab.colorbar() > > > pylab.title( "fx" ) > > > > > > pylab.subplot(222) > > > pylab.imshow( Ffx, cmap=pylab.cm.gray ) > > > pylab.colorbar() > > > pylab.title( "Ffx" ) > > > > > > pylab.subplot(223) > > > pylab.imshow( FFfx, cmap=pylab.cm.gray ) > > > pylab.colorbar() > > > pylab.title( "FFfx" ) > > > > > > pylab.subplot(224) > > > pylab.imshow( IFfx, cmap=pylab.cm.gray ) > > > pylab.colorbar() > > > pylab.title( "IFfx" ) > > > > > > pylab.show() > > > > > > --------------------------------------------------------------- > > > > > > On my computer FFfx is the same as IFfx..... but > why? > > > > > > I uploaded a screenshot about my result here: > > > http://server6.theimagehosting.com/image.php?img=second_fourier.png > > > > > > Bela > > > > > > > > > -- > > > View this message in context: http://www.nabble.com/second-2d-fft-gives-the-same-result-as-fft%2Bifft-tp23945026p23945026.html > > > Sent from the Numpy-discussion mailing list > archive at > > Nabble.com. > > > > > > _______________________________________________ > > > Numpy-discussion mailing list > > > Numpy-discussion at scipy.org > > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > > > > > -- > > Information System Engineer, Ph.D. > > Website: http://matthieu-brucher.developpez.com/ > > Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92 > > LinkedIn: http://www.linkedin.com/in/matthieubrucher > > _______________________________________________ > > Numpy-discussion mailing list > > Numpy-discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > ? ? ? > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From pav at iki.fi Tue Jun 9 14:08:43 2009 From: pav at iki.fi (Pauli Virtanen) Date: Tue, 9 Jun 2009 18:08:43 +0000 (UTC) Subject: [Numpy-discussion] Multiplying Python float to numpy.array of objects works but fails with a numpy.float64, numpy Bug? References: Message-ID: Sun, 07 Jun 2009 12:09:40 +0200, Sebastian Walter wrote: > from numpy import * > import numpy > print 'numpy.__version__=',numpy.__version__ > > class adouble: > def __init__(self,x): > self.x = x > def __mul__(self,rhs): > if isinstance(rhs,adouble): > return adouble(self.x * rhs.x) > else: > return adouble(self.x * rhs) > def __str__(self): > return str(self.x) [clip] > print u * float64(3.) # _NOT_ OK! [clip] > > Should I open a ticket for that? Please do, the current behavior doesn't seem correct. -- Pauli Virtanen From robert.kern at gmail.com Tue Jun 9 14:48:38 2009 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 9 Jun 2009 13:48:38 -0500 Subject: [Numpy-discussion] Adding zero-dimension arrays, bug? In-Reply-To: <20090609153811.bjxo6yswaoskc88o@www.sms.ed.ac.uk> References: <20090609153811.bjxo6yswaoskc88o@www.sms.ed.ac.uk> Message-ID: <3d375d730906091148j706fee36sed3c2004b3590fc@mail.gmail.com> On Tue, Jun 9, 2009 at 09:38, David Paul Reichert wrote: > Hi, > > Numpy let's me define arrays with zero rows and/or > columns, and that's wanted behaviour from what I have > read in discussions. 
However, I can add an array
> with zero rows to an array with one row (but not more),
> resulting in another zero row array, like so:
>
>
> In: a = zeros((4,0))
>
> In: a
> Out: array([], shape=(4, 0), dtype=float64)
>
> In: b = zeros((4,1))
>
> In: b
> Out:
> array([[ 0.],
>        [ 0.],
>        [ 0.],
>        [ 0.]])
>
> In: a + b
> Out: array([], shape=(4, 0), dtype=float64)
>
>
> Is this a bug? This should give a shape mismatch error,
> shouldn't?

No. According to the rules of broadcasting, the axis will match iff the
sizes from each array are the same OR one of them is 1. So (4,1) will
broadcast with (4,0) to result in a (4,0) array.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

From bruno.piguet at gmail.com  Tue Jun  9 14:56:29 2009
From: bruno.piguet at gmail.com (bruno Piguet)
Date: Tue, 9 Jun 2009 20:56:29 +0200
Subject: [Numpy-discussion] Howto vectorise a dot product ?
Message-ID: 

Dear all,

Can someone point me to a doc on dot product vectorisation?

Here is what I am trying to do:

I've got a rotation function which looks like:

def rotat_scal(phi, V):
    s = math.sin(phi)
    c = math.cos(phi)
    M = np.zeros((3, 3))
    M[2, 2] = M[1, 1] = c
    M[1, 2] = -s
    M[2, 1] = s
    M[0, 0] = 1
    return np.dot(M, V)

(where phi is a scalar, and V an array of size (3,1))

Now, I want to apply it to a time series of phi and V, in a vectorised way.
So, I tried to simply add a first dimension:
Phi is now of size (n) and V (n, 3).
(I really wish to have this shape, for direct correspondence to file).
The help for dot() says this: For 2-D arrays it is equivalent to matrix multiplication, and for 1-D arrays to inner product of vectors (without complex conjugation). For N dimensions it is a sum product over the last axis of `a` and the second-to-last of `b`:: dot(a, b)[i,j,k,m] = sum(a[i,j,:] * b[k,:,m]) So changing your code to this: return np.dot(M, V[:,:,np.newaxis])[arange(len(phi)), :, arange(len(phi)), :] will do what you want, but it will also do a lot of useless multiplication in computing that product. I'm not sure of any better way, and am kind of curious myself (since I often have to take products of one or several vectors with several matrices). David From charlesr.harris at gmail.com Tue Jun 9 16:46:25 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 9 Jun 2009 14:46:25 -0600 Subject: [Numpy-discussion] Howto vectorise a dot product ? In-Reply-To: References: Message-ID: On Tue, Jun 9, 2009 at 12:56 PM, bruno Piguet wrote: > Dear all, > > Can someone point me to a doc on dot product vectorisation ? > > Here is what I try to do : > > I've got a rotation function which looks like : > > def rotat_scal(phi, V): > s = math.sin(phi) > c = math.cos(phi) > M = np.zeros((3, 3)) > M[2, 2] = M[1, 1] = c > M[1, 2] = -s > M[2, 1] = s > M[0, 0] = 1 > return np.dot(M, V) > > (where phi is a scalar, and V and array of size (3,1)) > > Now, I want to apply it to a time series of phi and V, in a vectorised way. > So, I tried to simply add a first dimension : > Phi is now of size(n) and V (n, 3). > (I really whish to have this shape, for direct correspondance to file). > Well, in this case you can use complex multiplication and either work with just the x,y components or use two complex components, i.e., [x + 1j*y, z]. In the first case you can then do the rotation as V*exp(1j*phi). If you want more general rotations, a ufunc for quaternions would do the trick. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From jacob.benoit.1 at gmail.com Tue Jun 9 21:46:00 2009 From: jacob.benoit.1 at gmail.com (Benoit Jacob) Date: Wed, 10 Jun 2009 03:46:00 +0200 Subject: [Numpy-discussion] performance matrix multiplication vs. matlab Message-ID: Hi, I'm one of the Eigen developers and was pointed to your discussion. I just want to clarify a few things for future reference (not trying to get you to use Eigen): > No, eigen does not provide a (complete) BLAS/LAPACK interface. True, > I don't know if that's even a goal of eigen Not a goal indeed, though there's agreement that such a bridge would be a nice add-on. (Would be a one-directional bridge though. You can't express with BLAS/LAPACK all what you can express with the Eigen API). > (it started as a project for KDE, to support high performance core > computations for things like spreadsheet and co). Yes, that's how it started 3 years ago. A lot changed since, though. See http://eigen.tuxfamily.org/index.php?title=Main_Page#Projects_using_Eigen > Eigen is: > - not mature. Fair enough > - heavily expression-template-based C++, meaning compilation takes ages No, because _we_ are serious about compilation times, unlike other c++ template libraries. But granted, compilation times are not as short as a plain C library either. > + esoteric, impossible to decypher compilation errors. Try it ;) See e.g. this comment: http://www.macresearch.org/interview-eigen-matrix-library#comment-14667 > - SSE dependency harcoded, since it is setup at build time. 
That's > going backward IMHO - I would rather see a numpy/scipy which can load > the optimized code at runtime. Eigen doesn't _require_ any SIMD instruction set although it can use SSE / AltiVec if enabled. It is true that with Eigen this is set up at build time, but this is only because it is squarely _not_ Eigen's job to do runtime platform checks. Eigen isn't a binary library. If you want a runtime platform switch, just compile your critical Eigen code twice, one with SSE one without, and do the platform check in your own app. The good thing here is that Eigen makes sure that the ABI is independent of whether vectorization is enabled. And to reply to Matthieu's mail: > I would add that it relies on C++ compiler extensions (the restrict > keyword) as does blitz. You unfortunately can't expect every compiler > to support it unless the C++ committee finally adds it to the > standard. currently we have: #define EIGEN_RESTRICT __restrict This is ready to be replaced by an empty symbol if some compiler doesn't support restrict. The only reason why we didn't do this is that all compilers we've encountered so far support restrict. Cheers, Benoit From charlesr.harris at gmail.com Tue Jun 9 22:30:01 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 9 Jun 2009 20:30:01 -0600 Subject: [Numpy-discussion] performance matrix multiplication vs. matlab In-Reply-To: References: Message-ID: On Tue, Jun 9, 2009 at 7:46 PM, Benoit Jacob wrote: > Hi, > > I'm one of the Eigen developers and was pointed to your discussion. I > just want to clarify a few things for future reference (not trying to > get you to use Eigen): > > > No, eigen does not provide a (complete) BLAS/LAPACK interface. > > True, > > > I don't know if that's even a goal of eigen > > Not a goal indeed, though there's agreement that such a bridge would > be a nice add-on. (Would be a one-directional bridge though. You can't > express with BLAS/LAPACK all what you can express with the Eigen API). > > > (it started as a project for KDE, to support high performance core > > computations for things like spreadsheet and co). > > Yes, that's how it started 3 years ago. A lot changed since, though. See > http://eigen.tuxfamily.org/index.php?title=Main_Page#Projects_using_Eigen > > > Eigen is: > > - not mature. > > Fair enough > > > - heavily expression-template-based C++, meaning compilation takes ages > > No, because _we_ are serious about compilation times, unlike other c++ > template libraries. But granted, compilation times are not as short as > a plain C library either. > I wonder if it is possible to have a compiler/parser that does nothing but translate templates into c? Say, something written in python ;) Name mangling would be a problem but could perhaps be simplified for the somewhat limited space needed for numpy/scipy. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From david at ar.media.kyoto-u.ac.jp Tue Jun 9 22:33:26 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Wed, 10 Jun 2009 11:33:26 +0900 Subject: [Numpy-discussion] performance matrix multiplication vs. matlab In-Reply-To: References: Message-ID: <4A2F1B76.7080705@ar.media.kyoto-u.ac.jp> Hi Benoit, Benoit Jacob wrote: > No, because _we_ are serious about compilation times, unlike other c++ > template libraries. But granted, compilation times are not as short as > a plain C library either. > I concede it is not as bad as the heavily templated libraries in boost. 
But C++ is just horribly slow to compile, at least with g++ - in scipy, half of the compilation time is spent for a couple of C++ files which uses simple templates. And the compiler takes a lot of memory during compilation (~ 300 Mb per file - that's a problem because I rely a lot on VM to build numpy/scipy binaries). > Eigen doesn't _require_ any SIMD instruction set although it can use > SSE / AltiVec if enabled. > If SSE is not enabled, my (very limited) tests show that eigen does not perform as well as a stock debian ATLAS on the benchmarks given by eigen. For example: g++ benchBlasGemm.cpp -I .. -lblas -O2 -DNDEBUG && ./a.out 300 cblas: 0.034222 (0.788 GFlops/s) eigen : 0.0863581 (0.312 GFlops/s) eigen : 0.121259 (0.222 GFlops/s) g++ benchBlasGemm.cpp -I .. -lblas -O2 -DNDEBUG -msse2 && ./a.out 300 cblas: 0.035438 (0.761 GFlops/s) eigen : 0.0182271 (1.481 GFlops/s) eigen : 0.0860961 (0.313 GFlops/s) (on a PIV, which may not be very representative of current architectures) > It is true that with Eigen this is set up at build time, but this is > only because it is squarely _not_ Eigen's job to do runtime platform > checks. Eigen isn't a binary library. If you want a runtime platform > switch, just compile your critical Eigen code twice, one with SSE one > without, and do the platform check in your own app. The good thing > here is that Eigen makes sure that the ABI is independent of whether > vectorization is enabled. > I understand that it is not a goal of eigen, and that should be the application's job. It is just that MKL does it automatically, and doing it in a cross platform way in the context of python extensions is quite hard because of various linking strategies on different OS. cheers, David From david at ar.media.kyoto-u.ac.jp Tue Jun 9 22:43:18 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Wed, 10 Jun 2009 11:43:18 +0900 Subject: [Numpy-discussion] Inquiry Regarding F2PY Windows Content In-Reply-To: References: Message-ID: <4A2F1DC6.8090008@ar.media.kyoto-u.ac.jp> Carl, Andrew F (AS) wrote: > > Would it be a reasonable request, that the "F2PY Windows" web page > contain known combinations of version numbers for Python, Numpy and > Gfortran verified to play nice? Some references as to queried compiler > system environmental variables would be useful also. > I have added some numpy.distutils support for gfortran on windows, but Windows + gfortran + numpy is unlikely to work well unless you build numpy by yourself with gfortran. I have actually been considering a move to gfortran for windows builds, but I would prefer waiting mingw to officially support for gcc 4.* series. cheers, David From JARED.RUBIN at saic.com Tue Jun 9 22:52:43 2009 From: JARED.RUBIN at saic.com (Rubin, Jared) Date: Tue, 9 Jun 2009 19:52:43 -0700 Subject: [Numpy-discussion] numpy C++ swig class example Message-ID: <3A7F37AE3B50AE479C753AFB27FBB64D018E066B@0461-its-exmb04.us.saic.com> I am using the numpy.i interface file and have gotten the cookbook/swig example to work from scipy. Are there any examples of appyling the numpy.i to a C++ header file. 
I would like to generate a lightweight Array2D class that just uses doubles and would have the following header file

Array2D.h
=========

class Array2D {
public:
    int _nrow;
    int _ncol;
    double* data;
    Array2D(int nrow, int ncol, double *data);
};

// I would expect to have the following Array2D.i interface file

%module array2DD
%{
#define SWIG_FILE_WITH_INIT
#include "Array2D.h"
%}

%include "numpy.i"

%init %{
import_array();
%}

%ignore Array2D();
%ignore Array2D(long nrow, long ncol);
%apply (int DIM1, int DIM2, double* IN_ARRAY2) {(int nrow, int ncol, double *data)}
%include "Array2D.h"

-------------- next part -------------- An HTML attachment was scrubbed... URL: From jsseabold at gmail.com Tue Jun 9 23:50:40 2009 From: jsseabold at gmail.com (Skipper Seabold) Date: Tue, 9 Jun 2009 23:50:40 -0400 Subject: [Numpy-discussion] Definitions of pv, fv, nper, pmt, and rate In-Reply-To: <202530.34045.qm@web52109.mail.re2.yahoo.com> References: <202530.34045.qm@web52109.mail.re2.yahoo.com> Message-ID: On Tue, Jun 9, 2009 at 12:58 PM, David Goldsmith wrote: > > --- On Tue, 6/9/09, Skipper Seabold wrote: > >> These are the two most basic building blocks of time value >> problems, >> discounting one cash flow and an annuity. There are >> *plenty* of >> examples and use cases for uneven cash flows or for >> providing a given >> pv or fv. Without even getting into actual financial >> contracts, >> suppose I have an investment account that already has >> $10,000 and I >> plan to add $500 every month and earn 4%. Then we >> would need >> something like fv to tell me how much this will be worth >> after 180 >> months. I don't necessarily need a reference to tell >> me this would be >> useful to know. > > Use case examples aren't the problem; worked-out examples combining these two principles aren't the problem; usefulness isn't the problem; the problem is one of meeting a particular reference standard. > >> I don't know that these are "formulas" per se, rather than > > Except that we do provide a "formula" in our help doc; perhaps the "solution" is to get rid of that and include an explanation of how our function combines the two basic formulae (for which I do have a hard-copy reference: Gitman, L. J., 2003. "Principles of Managerial Finance (Brief), 3rd Ed." Pearson Education) to handle the more general case. > >> FV (even if one of these last two values is zero). If >> you need a >> textbook reference, as I said before you could literally >> pick up any >> corporate finance text and derive these functions from the >> basics. > > I don't question that, what I question is the appropriateness of such derivation in numpy's help doc; as I see it, one of the purposes of providing references in our situation is precisely to avoid having to include derivations in our help doc. > > But it's all moot, as Robert has furnished me with a reference specifically for the "formula" we do have, and, although it's an electronic reference, it seems "stable" enough to warrant an exception to the "references should be hard-copy AMAP" policy. > Just to follow up a bit with this. I don't want to beat a dead horse, but I'd just like to share my experience with these documents and functions so far. I have implemented the ipmt and ppmt functions that were "not implemented" in numpy.lib.financial as well as written some tests.
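(For reference, the standard end-of-period definitions of these two functions can be built directly from the pmt and fv routines that numpy.lib.financial already provides. A minimal sketch of the textbook formulas, as an illustration only, not necessarily what the patch discussed here contains:)

import numpy as np

def ipmt(rate, per, nper, pv, fv=0.0):
    # Interest part of the per-th payment: one period's interest on the
    # balance still outstanding after per - 1 payments; np.fv gives that
    # remaining balance directly.
    total = np.pmt(rate, nper, pv, fv)
    return np.fv(rate, per - 1, total, pv) * rate

def ppmt(rate, per, nper, pv, fv=0.0):
    # Principal part: whatever portion of the fixed payment is not interest.
    return np.pmt(rate, nper, pv, fv) - ipmt(rate, per, nper, pv, fv)

# Second payment on a 10-period loan of 1000 at 10% per period:
# ipmt(0.10, 2, 10, 1000) -> about -93.73 and ppmt(0.10, 2, 10, 1000) ->
# about -69.02; the two always sum to np.pmt(0.10, 10, 1000) (about
# -162.75), which is a handy invariant to test against.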
ipmt is one of the functions where there was a discrepancy between what OO and Excel report for the beginning-of-period payment assumptions and what Gnumeric and Kspread report, as stated in the OpenFormula document referenced above (I discovered errors in this document as well as in the openoffice documents referenced, btw, but they become obvious when you work these problems out). OpenFormula lists the Gnumeric/Kspread value as the "result" in the document, but there is still a question as to which is correct. Well, I was able to derive both results, and as I suspected the Gnumeric/Kspread one was based on an incorrect assumption (or a mistake in implementation), not a different formula. My point with the derivations wasn't to include them in the documentation, but rather to find out what assumptions are being made and then deduce which results are correct. In the cases of these simple spreadsheet functions I think it should be obvious if it's right or wrong. If there is any interest in adding the ipmt and ppmt functions now, I can apply a patch. If not, I can keep them separate as I work on the other functions. Cheers, Skipper From dwf at cs.toronto.edu Wed Jun 10 00:03:30 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Wed, 10 Jun 2009 00:03:30 -0400 Subject: [Numpy-discussion] Jcamp dx format In-Reply-To: References: Message-ID: On 7-Jun-09, at 4:56 AM, giorgio.luciano at inwind.it wrote: > Sorry for cross posting > > Hello to all, > I've written a script for importing all the spectra files in a directory > and merging them all into one matrix. The imported files are dx files. > The bad part is that the script is in Matlab and it requires a function > from the Bioinformatics Toolbox (jcampread). > Now I want to do the same in Python. I guess I will have no > problem translating the script, but I don't think I have the time > (and capabilities) to rewrite something like jcampread. Since the > JCAMP-DX format is quite common among scientists, can anyone share > some script/function for importing these files in Python? (I guess an > R routine could also do the trick, but I would prefer to use Python.) > Thanks in advance to all > Giorgio Googling revealed this, which should do the trick: http://code.google.com/p/pyms/ David From david at ar.media.kyoto-u.ac.jp Tue Jun 9 23:55:52 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Wed, 10 Jun 2009 12:55:52 +0900 Subject: [Numpy-discussion] performance matrix multiplication vs. matlab In-Reply-To: <1895DB24-ECC3-4611-97DF-869380A8A4F2@cs.toronto.edu> References: <4A295A7D.1040407@ar.media.kyoto-u.ac.jp> <1244362840.4377.10.camel@gabriel-desktop> <1244368329.4377.18.camel@gabriel-desktop> <4A2B8A51.3070409@ar.media.kyoto-u.ac.jp> <20090607101204.GB20612@phare.normalesup.org> <20090608053210.GC5032@phare.normalesup.org> <4A2C9EF9.7030302@ar.media.kyoto-u.ac.jp> <75c31b2a0906080533t29af5e2k6aef04136a3a5a5e@mail.gmail.com> <4A2E153C.3080808@ar.media.kyoto-u.ac.jp> <1895DB24-ECC3-4611-97DF-869380A8A4F2@cs.toronto.edu> Message-ID: <4A2F2EC8.7070804@ar.media.kyoto-u.ac.jp> David Warde-Farley wrote: > On 9-Jun-09, at 3:54 AM, David Cournapeau wrote: > > >> For example, what ML people call PCA is called Karhunen-Loève in >> signal >> processing, and the concepts are quite similar. >> > > Yup. This seems to be a nice set of review notes: > > http://www.ece.rutgers.edu/~orfanidi/ece525/svd.pdf > This looks indeed like a very nice review from a signal processing approach.
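(The PCA/Karhunen-Loève correspondence is easy to check numerically: the principal directions are the eigenvectors of the sample covariance, which are also the right singular vectors of the centered data matrix. A minimal sketch, not taken from the notes above:)

import numpy as np

rng = np.random.RandomState(0)
X = rng.randn(500, 3) * [5.0, 2.0, 0.5]    # 500 samples, 3 variables
Xc = X - X.mean(axis=0)                    # center the data first

# Route 1: eigendecomposition of the sample covariance (the KLT view).
evals, evecs = np.linalg.eigh(np.cov(Xc, rowvar=False))

# Route 2: SVD of the centered data matrix (the usual PCA recipe).
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

# Same answer: squared singular values over (n - 1) are the eigenvalues,
# and the rows of Vt span the same principal directions as evecs.
print(np.allclose(np.sort(s**2 / (len(X) - 1)), np.sort(evals)))  # True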
I never took the time to understand the similarities/differences/connections between traditional SP approaches and the machine learning approach. I wonder if the subspace methods a la PENCIL/MUSIC and co have a (useful) interpretation in a more ML approach; I never really thought about it. I guess other people had :) > And going further than just PCA/KLT, tying it together with maximum > likelihood factor analysis/linear dynamical systems/hidden Markov > models, > > http://www.cs.toronto.edu/~roweis/papers/NC110201.pdf > As much as I like this paper, I always felt that you miss a lot of insights when considering PCA only from a purely statistical POV. I really like the treatment of PCA from a function approximation POV (chapter 9 of Mallat's wavelet book is crystal clear, for example, and it is based on all that cool functional-space theory like Besov spaces). cheers, David From d_l_goldsmith at yahoo.com Wed Jun 10 01:00:14 2009 From: d_l_goldsmith at yahoo.com (David Goldsmith) Date: Tue, 9 Jun 2009 22:00:14 -0700 (PDT) Subject: [Numpy-discussion] Definitions of pv, fv, nper, pmt, and rate Message-ID: <456109.37185.qm@web52105.mail.re2.yahoo.com> --- On Tue, 6/9/09, Skipper Seabold wrote: > I have implemented the ipmt and ppmt functions that were > "not > implemented" in numpy.lib.financial as well as written some > tests. Thanks! > ipmt is one of the functions where there was a discrepancy > between > what OO and Excel report for the beginning-of-period > payment > assumptions and what Gnumeric and Kspread report, as stated > in the > OpenFormula document referenced above (I discovered errors > in this > document as well as in the openoffice documents referenced, btw, > but they > become obvious when you work these problems out). And the nightmare worsens (IMO). > OpenFormula lists > the Gnumeric/Kspread value as the "result" in the document, but > there is > still a question as to which is correct. Well, I was > able to derive both > results, and as I suspected the Gnumeric/Kspread one was based > on an > incorrect assumption (or a mistake in implementation), not a > different > formula. My point with the derivations wasn't to > include them in the > documentation, but rather to find out what assumptions are > being made > and then deduce which results are correct. In the > cases of these > simple spreadsheet functions I think it should be obvious > if it's > right or wrong. OK, I concede defeat: if it is the wisdom of the PTB that numpy.financial be retained, I will stop messing w/ their help doc, 'cause I'm clearly in over my head. DG > If there is any interest in adding the ipmt and ppmt > functions now, I > can apply a patch. If not, I can keep them separate > as I work on the > other functions.
> > Cheers, > > Skipper > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From gokhansever at gmail.com Wed Jun 10 01:26:52 2009 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_SEVER?=) Date: Wed, 10 Jun 2009 00:26:52 -0500 Subject: [Numpy-discussion] Question about memmap Message-ID: <49d6b3500906092226j2e6e2eadme7f5c878886e47d@mail.gmail.com> Hello, I am having problem while trying to memory map a simple file (attached as test.txt) In IPython data = memmap('test.txt', mode='r', dtype=double, shape=(3,5)) data memmap([[ 3.45616501e-86, 4.85780149e-33, 4.85787493e-33, 5.07185821e-86, 4.85780159e-33], [ 4.85787493e-33, 5.07185821e-86, 1.28444278e-57, 1.39804066e-76, 4.85787506e-33], [ 4.83906715e-33, 4.85784273e-33, 4.85787506e-33, 4.83906715e-33, 4.85784273e-33]]) which is not what is in the file. Tried different dtype float options, but always the same result. Could you tell me what could be wrong? Thanks... G?khan -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- 42501.0000 999999.9999 999999.9999 999999.9999 999999.9999 42502.0000 999999.9999 999999.9999 999999.9999 999999.9999 42503.0000 999999.9999 999999.9999 999999.9999 999999.9999 From matthew.brett at gmail.com Wed Jun 10 01:34:32 2009 From: matthew.brett at gmail.com (Matthew Brett) Date: Tue, 9 Jun 2009 22:34:32 -0700 Subject: [Numpy-discussion] Question about memmap In-Reply-To: <49d6b3500906092226j2e6e2eadme7f5c878886e47d@mail.gmail.com> References: <49d6b3500906092226j2e6e2eadme7f5c878886e47d@mail.gmail.com> Message-ID: <1e2af89e0906092234r27255eeds71b2af6f15ca78f0@mail.gmail.com> Hi, > I am having problem while trying to memory map a simple file (attached as > test.txt) The file looks like a text file, but memmap is for binary files. Could that be the problem? Best, Matthew From gokhansever at gmail.com Wed Jun 10 01:38:48 2009 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_SEVER?=) Date: Wed, 10 Jun 2009 00:38:48 -0500 Subject: [Numpy-discussion] Question about memmap In-Reply-To: <1e2af89e0906092234r27255eeds71b2af6f15ca78f0@mail.gmail.com> References: <49d6b3500906092226j2e6e2eadme7f5c878886e47d@mail.gmail.com> <1e2af89e0906092234r27255eeds71b2af6f15ca78f0@mail.gmail.com> Message-ID: <49d6b3500906092238v73d72518ka6f6a49b751f3b7f@mail.gmail.com> On Wed, Jun 10, 2009 at 12:34 AM, Matthew Brett wrote: > Hi, > > > I am having problem while trying to memory map a simple file (attached as > > test.txt) > > The file looks like a text file, but memmap is for binary files. > Could that be the problem? > > Best, > > Matthew > I don't see such a restriction in memmap function based on its help at or memmap? from IPython. http://docs.scipy.org/doc/numpy/reference/generated/numpy.memmap.html#numpy.memmap I thought it will happily work with text files too :( G?khan -------------- next part -------------- An HTML attachment was scrubbed... URL: From d_l_goldsmith at yahoo.com Wed Jun 10 01:56:09 2009 From: d_l_goldsmith at yahoo.com (David Goldsmith) Date: Tue, 9 Jun 2009 22:56:09 -0700 (PDT) Subject: [Numpy-discussion] Question about memmap Message-ID: <140065.95059.qm@web52104.mail.re2.yahoo.com> --- On Tue, 6/9/09, G?khan SEVER wrote: > Matthew Brett > wrote: > > > I am having problem while trying to memory map a > simple file (attached as > > > test.txt) > > The file looks like a text file, but memmap is for > binary files. 
> > Could that be the problem? > > Matthew > > I don't see such a restriction in memmap function based > on its help Fixed (at least in the Numpy Doc Wiki, don't know how long it will take for that to propagate to a release) > at or memmap? from IPython. Sorry, no authority over there. > I thought it will happily work with text files too :( Not the soln. you were hoping for, I know, sorry. DG From d_l_goldsmith at yahoo.com Wed Jun 10 01:56:40 2009 From: d_l_goldsmith at yahoo.com (David Goldsmith) Date: Tue, 9 Jun 2009 22:56:40 -0700 (PDT) Subject: [Numpy-discussion] Question about memmap Message-ID: <350756.17463.qm@web52102.mail.re2.yahoo.com> --- On Tue, 6/9/09, G?khan SEVER wrote: > Matthew Brett > wrote: > > > I am having problem while trying to memory map a > simple file (attached as > > > test.txt) > > The file looks like a text file, but memmap is for > binary files. > > Could that be the problem? > > Matthew > > I don't see such a restriction in memmap function based > on its help Fixed (at least in the Numpy Doc Wiki, don't know how long it will take for that to propagate to a release) > at or memmap? from IPython. Sorry, no authority over there. > I thought it will happily work with text files too :( Not the soln. you were hoping for, I know, sorry. DG From robert.kern at gmail.com Wed Jun 10 01:58:58 2009 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 10 Jun 2009 00:58:58 -0500 Subject: [Numpy-discussion] Question about memmap In-Reply-To: <140065.95059.qm@web52104.mail.re2.yahoo.com> References: <140065.95059.qm@web52104.mail.re2.yahoo.com> Message-ID: <3d375d730906092258q31dda907gceed42ba29158e3c@mail.gmail.com> On Wed, Jun 10, 2009 at 00:56, David Goldsmith wrote: > > --- On Tue, 6/9/09, G?khan SEVER wrote: >> I don't see such a restriction in memmap function based >> on its help > > Fixed (at least in the Numpy Doc Wiki, don't know how long it will take for that to propagate to a release) > >> at or memmap? from IPython. > > Sorry, no authority over there. "memmap?" prints the docstring. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From d_l_goldsmith at yahoo.com Wed Jun 10 01:59:25 2009 From: d_l_goldsmith at yahoo.com (David Goldsmith) Date: Tue, 9 Jun 2009 22:59:25 -0700 (PDT) Subject: [Numpy-discussion] Question about memmap Message-ID: <776404.74211.qm@web52108.mail.re2.yahoo.com> Sorry for the double post, my link and/or browser was acting up. DG From gokhansever at gmail.com Wed Jun 10 02:01:28 2009 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_SEVER?=) Date: Wed, 10 Jun 2009 01:01:28 -0500 Subject: [Numpy-discussion] Question about memmap In-Reply-To: <140065.95059.qm@web52104.mail.re2.yahoo.com> References: <140065.95059.qm@web52104.mail.re2.yahoo.com> Message-ID: <49d6b3500906092301g54edeb98m1994e177588e36d3@mail.gmail.com> On Wed, Jun 10, 2009 at 12:56 AM, David Goldsmith wrote: > > --- On Tue, 6/9/09, G?khan SEVER wrote: > > > Matthew Brett > > wrote: > > > > > I am having problem while trying to memory map a > > simple file (attached as > > > > > test.txt) > > > > The file looks like a text file, but memmap is for > > binary files. > > > > Could that be the problem? 
> > > > Matthew > > > > I don't see such a restriction in memmap function based > > on its help > > Fixed (at least in the Numpy Doc Wiki, don't know how long it will take for > that to propagate to a release) > You mean you modified the rst documents in the numpy trunk? > > > at or memmap? from IPython. > > Sorry, no authority over there. > What do you mean by this? > > > I thought it will happily work with text files too :( > > Not the soln. you were hoping for, I know, sorry. > I was going to compare a script with memmap and loadtxt versions of data loading and processing. It still works fine with loadtxt :) gs -------------- next part -------------- An HTML attachment was scrubbed... URL: From bruno.piguet at gmail.com Wed Jun 10 02:21:07 2009 From: bruno.piguet at gmail.com (bruno Piguet) Date: Wed, 10 Jun 2009 08:21:07 +0200 Subject: [Numpy-discussion] Howto vectorise a dot product ? In-Reply-To: References: Message-ID: 2009/6/9 Charles R Harris > > Well, in this case you can use complex multiplication and either work with > just the x,y components or use two complex components, i.e., [x + 1j*y, z]. > In the first case you can then do the rotation as V*exp(1j*phi). In the real case, it's a true 3-axis rotation, where M = dot(M1(psi), dot(M2(theta), M3(phi))). The decomposition into 2D rotations and the use of complex operations is possible, but the matrix notation is more concise. If you want more general rotations, a ufunc for quaternions would do the > trick. > You mean something like Christoph Gohlke's "transformations.py" program ? I'll also check if I really need vectorisation. After all, Numpy slicing is known to be efficient. I'll do some timing with the pseudo-vectorised function:

def rotat_vect(phi, theta, psi, V):
    for i in xrange(len(phi)):
        rotat_scal(phi[i,:], theta[i,:], psi[i,:], V[i,:])

Bruno. -------------- next part -------------- An HTML attachment was scrubbed... URL: From d_l_goldsmith at yahoo.com Wed Jun 10 02:41:08 2009 From: d_l_goldsmith at yahoo.com (David Goldsmith) Date: Tue, 9 Jun 2009 23:41:08 -0700 (PDT) Subject: [Numpy-discussion] Question about memmap Message-ID: <317513.11704.qm@web52101.mail.re2.yahoo.com> --- On Tue, 6/9/09, Gökhan SEVER wrote: > You mean you modified the rst documents in the numpy > trunk? No, at least I don't think so, I made the modification at: http://docs.scipy.org/numpy/docs/numpy.core.memmap.memmap/ and, IIUC, the "auto-sync" between the Wiki and the rst is one-way: rst changes automatically propagate to the Wiki, but not vice-versa; anyone, please correct me if I'm wrong, and if I'm right, please elaborate on precisely what has to happen for Wiki changes to be propagated to the rst (because I don't know). > > at or memmap? from IPython. > > Sorry, no authority over there. > > What do you mean by this? Sorry again, I don't "do" IPython, so when I saw "at or memmap? from IPython" I thought you must be referring to IPython's independent help doc system, and, by extension, the people who are responsible for it. But Robert set me straight. > > I thought it will happily work with text files too :( > > Not the soln. you were hoping for, I know, sorry. > > I was going to compare a script with memmap and loadtxt > versions of data loading and processing. It still works fine > with loadtxt :) Good. DG
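(To make the text-vs-binary point in this exchange concrete, here is roughly what the two approaches see on a small whitespace-delimited file like the test.txt attached earlier. A sketch; the filename, shape, and values are just the ones from this thread.)

import numpy as np

# loadtxt parses the characters into floats, which is what was wanted:
data = np.loadtxt('test.txt')            # a (3, 5) array of real values

# memmap does no parsing; mapping the same file as raw bytes just gives
# back the characters of the text, one per element:
raw = np.memmap('test.txt', dtype='S1', mode='r')
print(raw[:10])                          # the first ten characters, '42501.0000'

# Mapping it with dtype=double instead reinterprets those same bytes as
# IEEE floats, which is why the values came out as garbage like 3.45e-86.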
From gokhansever at gmail.com Wed Jun 10 02:51:19 2009 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_SEVER?=) Date: Wed, 10 Jun 2009 01:51:19 -0500 Subject: [Numpy-discussion] Question about memmap In-Reply-To: <1e2af89e0906092234r27255eeds71b2af6f15ca78f0@mail.gmail.com> References: <49d6b3500906092226j2e6e2eadme7f5c878886e47d@mail.gmail.com> <1e2af89e0906092234r27255eeds71b2af6f15ca78f0@mail.gmail.com> Message-ID: <49d6b3500906092351o7a28651cn9eb54373cc571403@mail.gmail.com> On Wed, Jun 10, 2009 at 12:34 AM, Matthew Brett wrote: > Hi, > > > I am having problem while trying to memory map a simple file (attached as > > test.txt) > > The file looks like a text file, but memmap is for binary files. > Could that be the problem? > > Best, > > Matthew What's the reason again that memmap only works with binary files? Could the functionality be extended to text files as well? Python's mmap module supports text file mapping; however, I am getting another error this time :(

In [1]: import mmap
In [2]: f = open('test.txt', 'r')
In [3]: map = mmap.mmap(f.fileno(), 0)
---------------------------------------------------------------------------
EnvironmentError                          Traceback (most recent call last)
/home/gsever/Desktop/src/range_calc/ in ()
EnvironmentError: [Errno 13] Permission denied

I am on a Linux machine... gs -------------- next part -------------- An HTML attachment was scrubbed... URL: From gokhansever at gmail.com Wed Jun 10 02:53:36 2009 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_SEVER?=) Date: Wed, 10 Jun 2009 01:53:36 -0500 Subject: [Numpy-discussion] Question about memmap In-Reply-To: <317513.11704.qm@web52101.mail.re2.yahoo.com> References: <317513.11704.qm@web52101.mail.re2.yahoo.com> Message-ID: <49d6b3500906092353m52d18814v7e3a8474d495d2f5@mail.gmail.com> On Wed, Jun 10, 2009 at 1:41 AM, David Goldsmith wrote: > > --- On Tue, 6/9/09, Gökhan SEVER wrote: > > > You mean you modified the rst > > documents in the numpy > > trunk? > > No, at least I don't think so, I made the modification at: > > http://docs.scipy.org/numpy/docs/numpy.core.memmap.memmap/ > > and, IIUC, the "auto-sync" between the Wiki and the rst is one-way: rst > changes automatically propagate to the Wiki, but not vice-versa; anyone, > please correct me if I'm wrong, and if I'm right, please elaborate on > precisely what has to happen for Wiki changes to be propagated to the rst > (because I don't know). > I don't know that there is a two-way sync between the wiki and rst files. As far as I know, fixing the rst files is the right way to go. However, I might be wrong? -------------- next part -------------- An HTML attachment was scrubbed... URL: From d_l_goldsmith at yahoo.com Wed Jun 10 03:03:22 2009 From: d_l_goldsmith at yahoo.com (David Goldsmith) Date: Wed, 10 Jun 2009 00:03:22 -0700 (PDT) Subject: [Numpy-discussion] Question about memmap Message-ID: <138419.24681.qm@web52107.mail.re2.yahoo.com> My present job - and the Summer Numpy Doc Marathon - is premised on making changes/additions through the Wiki; if anyone other than registered developers is to be messing w/ the rst, it's news to me. At this point, someone who knows should please step in and clearly explain the relationship between the Wiki and the rst (or point to the place on the Wiki where this is explained). Thanks!
DG --- On Tue, 6/9/09, G?khan SEVER wrote: > From: G?khan SEVER > Subject: Re: [Numpy-discussion] Question about memmap > To: "Discussion of Numerical Python" > Date: Tuesday, June 9, 2009, 11:53 PM > On Wed, Jun > 10, 2009 at 1:41 AM, David Goldsmith > wrote: > > > > --- On Tue, 6/9/09, G?khan SEVER > wrote: > > > > > You mean you modified the rst > documents in the numpy > > > trunk? > > > > No, at least I don't think so, I made the > modification at: > > > > http://docs.scipy.org/numpy/docs/numpy.core.memmap.memmap/ > > > > and, IIUC, the "auto-sync" between the Wiki and > the rst is one-way: rst changes automatically propagate to > the Wiki, but not vice-versa; anyone, please correct me if > I'm wrong, and if I'm right, please elaborate on > precisely what has to happen for Wiki changes to be > propagated to the rst (because I don't know). > > > > I don't know that there are two way sync between the > wiki and rst files. As far as I know, fixing the rst files > is the right way to go. > However, I might be wrong? > > -----Inline Attachment Follows----- > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From pav at iki.fi Wed Jun 10 03:13:22 2009 From: pav at iki.fi (Pauli Virtanen) Date: Wed, 10 Jun 2009 07:13:22 +0000 (UTC) Subject: [Numpy-discussion] Question about memmap References: <49d6b3500906092226j2e6e2eadme7f5c878886e47d@mail.gmail.com> <1e2af89e0906092234r27255eeds71b2af6f15ca78f0@mail.gmail.com> <49d6b3500906092351o7a28651cn9eb54373cc571403@mail.gmail.com> Message-ID: Wed, 10 Jun 2009 01:51:19 -0500, G?khan SEVER kirjoitti: > What's the reason again that memmap only works with binary files? There are no separate "text files" and "binary files". All files are binary, some just contain text that in some cases represents an array of numbers. Memmap views also text files as binary. It returns you an array representing the *character data* in the file. > Could the functionality be extended to text files as well? In principle, yes. But this would need special parsing of the text in the memmap. Doing this right would be considerably more work than just representing the binary data. Also, I doubt that this would be very useful: representing large amounts of data as text is not efficient. I also think few people have interest in this feature. -- Pauli Virtanen From gokhansever at gmail.com Wed Jun 10 03:20:00 2009 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_SEVER?=) Date: Wed, 10 Jun 2009 02:20:00 -0500 Subject: [Numpy-discussion] Question about memmap In-Reply-To: <138419.24681.qm@web52107.mail.re2.yahoo.com> References: <138419.24681.qm@web52107.mail.re2.yahoo.com> Message-ID: <49d6b3500906100020n653cf834u8250700ecc1caa45@mail.gmail.com> On Wed, Jun 10, 2009 at 2:03 AM, David Goldsmith wrote: > > My present job - and the Summer Numpy Doc Marathon - is premised on making > changes/additions through the Wiki; if anyone other than registered > developers is to be messing w/ the rst, it's news to me. At this point, > someone who knows should please step in and clearly explain the relationship > between the Wiki and the rst (or point to the place on the Wiki where this > is explained). Thanks! > > DG > To me, docstring originated changes should be made on the actual source codes. Since they the preliminary sources for sphinx to work on integrating with rst documents under the /doc folder in the main numpy trunk. 
There is a daily doc build system running so each change will be reflected on the next build cycle. Also for a developer, just doing a "svn up" will fetch the necessary updated from the code-base in this case memmap.py file itself, from there on it's optional whether to read docstings from the file itself or via IPy or in another way. The philosophy I like in this design, a well written and documented code doesn't need an additional documentation because everything one needs is right in the code :) gs -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Wed Jun 10 03:23:18 2009 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 10 Jun 2009 02:23:18 -0500 Subject: [Numpy-discussion] Question about memmap In-Reply-To: <49d6b3500906100020n653cf834u8250700ecc1caa45@mail.gmail.com> References: <138419.24681.qm@web52107.mail.re2.yahoo.com> <49d6b3500906100020n653cf834u8250700ecc1caa45@mail.gmail.com> Message-ID: <3d375d730906100023q4f0b7a43i28b81e9127edb004@mail.gmail.com> On Wed, Jun 10, 2009 at 02:20, G?khan SEVER wrote: > On Wed, Jun 10, 2009 at 2:03 AM, David Goldsmith > wrote: >> >> My present job - and the Summer Numpy Doc Marathon - is premised on making >> changes/additions through the Wiki; if anyone other than registered >> developers is to be messing w/ the rst, it's news to me. ?At this point, >> someone who knows should please step in and clearly explain the relationship >> between the Wiki and the rst (or point to the place on the Wiki where this >> is explained). ?Thanks! >> >> DG > > To me, docstring originated changes should be made on the actual source > codes. Since they the preliminary sources for sphinx to work on integrating > with rst documents under the /doc folder in the main numpy trunk. There is a > daily doc build system running so each change will be reflected on the next > build cycle. The changes in the doc wiki will get pushed to the docstrings in SVN. The Sphinx documentation for numpy.memmap is built from these docstrings. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From gael.varoquaux at normalesup.org Wed Jun 10 03:25:22 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Wed, 10 Jun 2009 09:25:22 +0200 Subject: [Numpy-discussion] Question about memmap In-Reply-To: <49d6b3500906100020n653cf834u8250700ecc1caa45@mail.gmail.com> References: <138419.24681.qm@web52107.mail.re2.yahoo.com> <49d6b3500906100020n653cf834u8250700ecc1caa45@mail.gmail.com> Message-ID: <20090610072522.GA7593@phare.normalesup.org> On Wed, Jun 10, 2009 at 02:20:00AM -0500, G?khan SEVER wrote: > On Wed, Jun 10, 2009 at 2:03 AM, David Goldsmith > <[1]d_l_goldsmith at yahoo.com> wrote: > My present job - and the Summer Numpy Doc Marathon - is premised on > making changes/additions through the Wiki; if anyone other than > registered developers is to be messing w/ the rst, it's news to me. ?At > this point, someone who knows should please step in and clearly explain > the relationship between the Wiki and the rst (or point to the place on > the Wiki where this is explained). ?Thanks! > DG > To me, docstring originated changes should be made on the actual source > codes. Since they the preliminary sources for sphinx to work on > integrating with rst documents under the /doc folder in the main numpy > trunk. 
There is a daily doc build system running so each change will be > reflected on the next build cycle. > Also for a developer, just doing a "svn up" will fetch the necessary > updated from the code-base in this case memmap.py file itself, from there > on it's optional whether to read docstings from the file itself or via IPy > or in another way. The philosophy I like in this design, a well written > and documented code doesn't need an additional documentation because > everything one needs is right in the code :) The wiki is synchronised to the source. Everything that is edited in the source ends up in the wiki, and vice versa, although editing the same docstring at both ends gives a conflict (which can be solved). I tend to encourage using the wiki, because it makes it easy to document for a non-developer. Reviewing the changes is also easier. Gaël From gokhansever at gmail.com Wed Jun 10 03:25:47 2009 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_SEVER?=) Date: Wed, 10 Jun 2009 02:25:47 -0500 Subject: [Numpy-discussion] Question about memmap In-Reply-To: References: <49d6b3500906092226j2e6e2eadme7f5c878886e47d@mail.gmail.com> <1e2af89e0906092234r27255eeds71b2af6f15ca78f0@mail.gmail.com> <49d6b3500906092351o7a28651cn9eb54373cc571403@mail.gmail.com> Message-ID: <49d6b3500906100025w22074fb7t4f7acc7720e185db@mail.gmail.com> On Wed, Jun 10, 2009 at 2:13 AM, Pauli Virtanen wrote: > Wed, 10 Jun 2009 01:51:19 -0500, Gökhan SEVER kirjoitti: > > What's the reason again that memmap only works with binary files? > > There are no separate "text files" and "binary files". All files are > binary, some just contain text that in some cases represents an array of > numbers. > > Memmap views also text files as binary. It returns you an array > representing the *character data* in the file. > > > Could the functionality be extended to text files as well? > > In principle, yes. But this would need special parsing of the text in the > memmap. Doing this right would be considerably more work than just > representing the binary data. Also, I doubt that this would be very > useful: representing large amounts of data as text is not efficient. I > also think few people have interest in this feature. > I was expecting to see a similar result from the loadtxt() function with memmap(). I just can't map the numbers into an array; the whole file is represented as characters. Now I see why I don't see what's actually in my test.txt in terms of numbers. Reading more from memmap.py, I see that it uses the mmap module. Your explanations confirm my observation that text files should also work here --providing that missing special parsing. I don't have much idea of how to implement this... Gokhan -------------- next part -------------- An HTML attachment was scrubbed... URL: From scott.sinclair.za at gmail.com Wed Jun 10 03:26:32 2009 From: scott.sinclair.za at gmail.com (Scott Sinclair) Date: Wed, 10 Jun 2009 09:26:32 +0200 Subject: [Numpy-discussion] Question about memmap In-Reply-To: <138419.24681.qm@web52107.mail.re2.yahoo.com> References: <138419.24681.qm@web52107.mail.re2.yahoo.com> Message-ID: <6a17e9ee0906100026y34035578ye9349da3f79de193@mail.gmail.com> > 2009/6/10 David Goldsmith : > > My present job - and the Summer Numpy Doc Marathon - is premised on making changes/additions through the Wiki; if anyone other than registered developers is to be messing w/ the rst, it's news to me.
At this point, someone who knows should please step in and clearly explain the relationship between the Wiki and the rst (or point to the place on the Wiki where this is explained). Thanks! > > DG To add to Robert's explanation. The front page of the Doc-Wiki says: "You do not need to be a SciPy developer to contribute, as any documentation changes committed directly to the Subversion repository by developers are automatically propogated here on a daily basis. This means that you can be sure the documentation reflected here is in sync with the most recent Scipy development efforts." All of the documentation in the Wiki is actually stored as plain text in rst format (this is what you see when you click on the edit link). The files are stored in a separate subversion repository from the official NumPy and SciPy repositories. The Doc-Wiki simply renders the rst formatted text and provides nice functionality for editing and navigating the documentation. For documentation to get from the Wiki's repo to the main NumPy and SciPy repos, someone (with commit privileges) must make a patch and apply it. Visit http://docs.scipy.org/numpy/patch/ and generate a patch to see what I mean. Any changes a developer checks into the main repos will automatically be propagated to the Doc-Wiki repo once a day, to avoid things getting too confused. The upshot is, if you're a developer you can commit doc changes directly to the main repos. If you are not, you can edit the rst docs in the Doc-Wiki and this will be committed to the main repos at some convenient time (usually just before a release). Cheers, Scott From robert.kern at gmail.com Wed Jun 10 03:30:28 2009 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 10 Jun 2009 02:30:28 -0500 Subject: [Numpy-discussion] Question about memmap In-Reply-To: <49d6b3500906100025w22074fb7t4f7acc7720e185db@mail.gmail.com> References: <49d6b3500906092226j2e6e2eadme7f5c878886e47d@mail.gmail.com> <1e2af89e0906092234r27255eeds71b2af6f15ca78f0@mail.gmail.com> <49d6b3500906092351o7a28651cn9eb54373cc571403@mail.gmail.com> <49d6b3500906100025w22074fb7t4f7acc7720e185db@mail.gmail.com> Message-ID: <3d375d730906100030k4e440861g481205db18016f9c@mail.gmail.com> On Wed, Jun 10, 2009 at 02:25, Gökhan SEVER wrote: > On Wed, Jun 10, 2009 at 2:13 AM, Pauli Virtanen wrote: >> >> Wed, 10 Jun 2009 01:51:19 -0500, Gökhan SEVER kirjoitti: >> > What's the reason again that memmap only works with binary files? >> >> There are no separate "text files" and "binary files". All files are >> binary, some just contain text that in some cases represents an array of >> numbers. >> >> Memmap views also text files as binary. It returns you an array >> representing the *character data* in the file. >> >> > Could the functionality be extended to text files as well? >> >> In principle, yes. But this would need special parsing of the text in the >> memmap. Doing this right would be considerably more work than just >> representing the binary data. Also, I doubt that this would be very >> useful: representing large amounts of data as text is not efficient. I >> also think few people have interest in this feature. > > I was expecting to see a similar result from the loadtxt() function with memmap(). > I just can't map the numbers into an array; the whole file is represented > as characters. Now I see why I don't see what's actually in my test.txt > in terms of numbers. > > Reading more from memmap.py, I see that it uses the mmap module.
Your > explanations confirm my observation that text files should also work here > --providing that missing special parsing. I don't have much idea of how to > implement this... No, numpy.memmap cannot be made to deal meaningfully with text files (except as an array of characters, perhaps, but that's not what we're talking about). In order to parse the text into an array of numbers, all of the memory has to be read. The resulting floating point array will not (and cannot) be synchronized in any way back to the text in the file. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From gokhansever at gmail.com Wed Jun 10 03:33:19 2009 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_SEVER?=) Date: Wed, 10 Jun 2009 02:33:19 -0500 Subject: [Numpy-discussion] Question about memmap In-Reply-To: <6a17e9ee0906100026y34035578ye9349da3f79de193@mail.gmail.com> References: <138419.24681.qm@web52107.mail.re2.yahoo.com> <6a17e9ee0906100026y34035578ye9349da3f79de193@mail.gmail.com> Message-ID: <49d6b3500906100033l43a694ecn54d7aaa75bea1c04@mail.gmail.com> On Wed, Jun 10, 2009 at 2:26 AM, Scott Sinclair wrote: > > 2009/6/10 David Goldsmith : > > > > My present job - and the Summer Numpy Doc Marathon - is premised on > making changes/additions through the Wiki; if anyone other than registered > developers is to be messing w/ the rst, it's news to me. At this point, > someone who knows should please step in and clearly explain the relationship > between the Wiki and the rst (or point to the place on the Wiki where this > is explained). Thanks! > > > > DG > > To add to Robert's eplanation. > > The front page of the Doc-Wiki says: > > "You do not need to be a SciPy developer to contribute, as any > documentation changes committed directly to the Subversion repository > by developers are automatically propogated here on a daily basis. This > means that you can be sure the documentation reflected here is in sync > with the most recent Scipy development efforts." > > All of the documentation in the Wiki is actually stored as plain text > in rst format (this is what you see when you click on the edit link). > The files are stored in a separate subversion repository to the > official NumPy and SciPy repositories. The Doc-Wiki simply renders the > rst formatted text and provides nice functionality for editing and > navigating the documentation. > > For documentation to get from the Wiki's repo to the main NumPy and > SciPy repo's someone (with commit privileges) must make a patch and > apply it. Visit http://docs.scipy.org/numpy/patch/ and generate a > patch to see what I mean. > > Any changes a developer checks into the main repo's will automatically > be propogated to the Doc-Wiki repo once a day, to avoid things getting > to confused. > > The upshot is, if you're a developer you can commit doc changes > directly to the main repos. If you are not you can edit the rst docs > in the Doc-Wiki and this will be committed to the main repos at some > convenient time (usually just before a release). Thanks for the clarification. It seems like the Wiki system simplifies the documentation process way much. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From d_l_goldsmith at yahoo.com Wed Jun 10 03:36:07 2009 From: d_l_goldsmith at yahoo.com (David Goldsmith) Date: Wed, 10 Jun 2009 00:36:07 -0700 (PDT) Subject: [Numpy-discussion] Question about memmap Message-ID: <540908.45602.qm@web52112.mail.re2.yahoo.com> --- On Wed, 6/10/09, Gael Varoquaux wrote: > I tend to encourage using the wiki, because it makes it > easy to document > for a non developper. Reviewing the changes is also > easier. And it provides a level of "protection" for the source; though a late-comer to this system, it's wisdom is plainly apparent, at least to me. OK, so the reconciliation is two-way, via SVN; I take it only registered developers have update/commit privileges? Does at least one developer check SVN at least once daily? DG From d_l_goldsmith at yahoo.com Wed Jun 10 03:42:44 2009 From: d_l_goldsmith at yahoo.com (David Goldsmith) Date: Wed, 10 Jun 2009 00:42:44 -0700 (PDT) Subject: [Numpy-discussion] Question about memmap Message-ID: <761390.60157.qm@web52109.mail.re2.yahoo.com> --- On Wed, 6/10/09, Scott Sinclair wrote: > The front page of the Doc-Wiki says: > > "You do not need to be a SciPy developer to contribute, as > any > documentation changes committed directly to the Subversion > repository > by developers are automatically propogated here on a daily > basis. This > means that you can be sure the documentation reflected here > is in sync > with the most recent Scipy development efforts." Which is why I though the sync was one way. Unfortunately, I didn't now read on (but, as is often the case, what follows makes much more sense now that I know what it means ;-) ). DG > All of the documentation in the Wiki is actually stored as > plain text > in rst format (this is what you see when you click on the > edit link). > The files are stored in a separate subversion repository to > the > official NumPy and SciPy repositories. The Doc-Wiki simply > renders the > rst formatted text and provides nice functionality for > editing and > navigating the documentation. > > For documentation to get from the Wiki's repo to the main > NumPy and > SciPy repo's someone (with commit privileges) must make a > patch and > apply it. Visit http://docs.scipy.org/numpy/patch/ and generate a > patch to see what I mean. > > Any changes a developer checks into the main repo's will > automatically > be propogated to the Doc-Wiki repo once a day, to avoid > things getting > to confused. > > The upshot is, if you're a developer you can commit doc > changes > directly to the main repos. If you are not you can edit the > rst docs > in the Doc-Wiki and this will be committed to the main > repos at some > convenient time (usually just before a release). > > Cheers, > Scott > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From gael.varoquaux at normalesup.org Wed Jun 10 03:54:15 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Wed, 10 Jun 2009 09:54:15 +0200 Subject: [Numpy-discussion] Question about memmap In-Reply-To: <540908.45602.qm@web52112.mail.re2.yahoo.com> References: <540908.45602.qm@web52112.mail.re2.yahoo.com> Message-ID: <20090610075415.GB10728@phare.normalesup.org> On Wed, Jun 10, 2009 at 12:36:07AM -0700, David Goldsmith wrote: > OK, so the reconciliation is two-way, via SVN; I take it only registered developers have update/commit privileges? Does at least one developer check SVN at least once daily? 
The way the web application works, is that it can generate standard diffs to SVN, using the 'patch' link, in the top bar. If you have commit rights on numpy, you can apply them. Also, it pulls the docstrings from the svn nightly. Any conflicts are tracked similarly to SVN, and are listed in the 'Merge' page, where you can resolve them, as you do in SVN. We should make you administrator of the web app fairly soon. Ga?l From scott.sinclair.za at gmail.com Wed Jun 10 04:06:40 2009 From: scott.sinclair.za at gmail.com (Scott Sinclair) Date: Wed, 10 Jun 2009 10:06:40 +0200 Subject: [Numpy-discussion] Question about memmap In-Reply-To: <761390.60157.qm@web52109.mail.re2.yahoo.com> References: <761390.60157.qm@web52109.mail.re2.yahoo.com> Message-ID: <6a17e9ee0906100106x601c4d74o3aa7cd5251ea9703@mail.gmail.com> > 2009/6/10 David Goldsmith : > > --- On Wed, 6/10/09, Scott Sinclair wrote: > >> The front page of the Doc-Wiki says: >> >> "You do not need to be a SciPy developer to contribute, as >> any >> documentation changes committed directly to the Subversion >> repository >> by developers are automatically propogated here on a daily >> basis. This >> means that you can be sure the documentation reflected here >> is in sync >> with the most recent Scipy development efforts." > > Which is why I though the sync was one way. ?Unfortunately, I didn't now read on (but, as is often the case, what follows makes much more sense now that I know what it means ;-) ). > > DG I've modified the Introduction on the front page http://docs.scipy.org/numpy/Front Page Should be "clear as mud" now :) Cheers, Scott From d_l_goldsmith at yahoo.com Wed Jun 10 04:11:17 2009 From: d_l_goldsmith at yahoo.com (David Goldsmith) Date: Wed, 10 Jun 2009 01:11:17 -0700 (PDT) Subject: [Numpy-discussion] Question about memmap Message-ID: <627499.73032.qm@web52101.mail.re2.yahoo.com> Thanks, Scott. Clear as mud. (Just kidding, of course.) ;-) DG --- On Wed, 6/10/09, Scott Sinclair wrote: > From: Scott Sinclair > Subject: Re: [Numpy-discussion] Question about memmap > To: "Discussion of Numerical Python" > Date: Wednesday, June 10, 2009, 1:06 AM > > 2009/6/10 David Goldsmith : > > > > --- On Wed, 6/10/09, Scott Sinclair > wrote: > > > >> The front page of the Doc-Wiki says: > >> > >> "You do not need to be a SciPy developer to > contribute, as > >> any > >> documentation changes committed directly to the > Subversion > >> repository > >> by developers are automatically propogated here on > a daily > >> basis. This > >> means that you can be sure the documentation > reflected here > >> is in sync > >> with the most recent Scipy development efforts." > > > > Which is why I though the sync was one way. > ?Unfortunately, I didn't now read on (but, as is often the > case, what follows makes much more sense now that I know > what it means ;-) ). 
> > DG > > I've modified the Introduction on the front page > http://docs.scipy.org/numpy/Front Page > > Should be "clear as mud" now :) > > Cheers, > Scott > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From juanjo.gomeznavarro at gmail.com Wed Jun 10 04:41:34 2009 From: juanjo.gomeznavarro at gmail.com (Juanjo Gomez Navarro) Date: Wed, 10 Jun 2009 10:41:34 +0200 Subject: [Numpy-discussion] Numpy-discussion Digest, Vol 33, Issue 59 In-Reply-To: References: Message-ID: <18571cd90906100141g6bc009d4g1ba0550f2341cc1d@mail.gmail.com> Thanks a lot, now I have a quite fast program to compute fractals :D. Nevertheless, I want to raise some more doubts. The speed at which some points tend to infinity is huge. Some points, after 10 steps, reach a NaN. This is not a problem on my MacBook, but on the PC the speed is really poor when some infinities are reached (on the Mac, the program takes 3 seconds to run, while on the PC it takes more than 1 minute). In order to solve this, I have added a line to set to 0 the points that have exceeded 2.0 (so they are already out of the Mandelbrot set):

for n in range(1, ITERATIONS):
    print "Iteration %d" % n
    z *= z
    z += c
    fractal[(fractal == 1) & (abs(z) > 2.0)] = float(n) / ITERATIONS
    # This is the new line, to avoid the series at some points reaching
    # infinity, which causes problems on my PC
    z[abs(z) > 2.0] = 0

This solves the problem for the PC, but delays the calculation... On the other hand, the number of calculations that *really need* to be done (on points that have not yet been excluded from the Mandelbrot set) decreases rapidly. In the beginning there are, in a given example, 250000 points, but in the final steps there are only 60000. Nevertheless, I'm calculating *needlessly* the 250000 points all the time, when only 10% of the calculations actually need to be done! It is a waste of time. Is there any way to save time on these useless calculations? The idea would be to perform the update of z only if certain conditions are met, in this case that abs(z) < 2. Thanks. 2009/6/9 > http://mentat.za.net/numpy/intro/intro.html > > We never used it, but I still like the pretty pictures :-) > > Cheers > Stéfan -- Juan José Gómez Navarro Edificio CIOyN, Campus de Espinardo, 30100 Departamento de Física Universidad de Murcia Tfno. (+34) 968 398552 Email: juanjo.gomeznavarro at gmail.com Web: http://ciclon.inf.um.es/Inicio.html -------------- next part -------------- An HTML attachment was scrubbed... URL: From jacob.benoit.1 at gmail.com Wed Jun 10 11:10:47 2009 From: jacob.benoit.1 at gmail.com (Benoit Jacob) Date: Wed, 10 Jun 2009 11:10:47 -0400 Subject: [Numpy-discussion] performance matrix multiplication vs. matlab In-Reply-To: References: Message-ID: 2009/6/9 Charles R Harris : >> > - heavily expression-template-based C++, meaning compilation takes >> > ages >> >> No, because _we_ are serious about compilation times, unlike other c++ >> template libraries. But granted, compilation times are not as short as >> a plain C library either. > > I wonder if it is possible to have a compiler/parser that does nothing but > translate templates into c? Say, something written in python ;) Name > mangling would be a problem but could perhaps be simplified for the somewhat > limited space needed for numpy/scipy. That's not possible: templates are (mostly) the only thing in C++ that can't be translated into C.
In a C++ template library, templated types are types that are not determined by the library itself, but will be determined by the application that uses the library. So the template library itself can't be translated into C because there's no way in C to allow "not yet determined" types. In some trivial cases that can be done with macros, but C++ templates go farther than that, they are Turing complete. A C++ type is a tree and using template expressions you can perform any operation on these trees at compilation time. In Eigen, we use these trees to represent arithmetic expressions like "a+b+c". That's the paradigm known as "expression templates". Benoit From jacob.benoit.1 at gmail.com Wed Jun 10 11:24:42 2009 From: jacob.benoit.1 at gmail.com (Benoit Jacob) Date: Wed, 10 Jun 2009 11:24:42 -0400 Subject: [Numpy-discussion] performance matrix multiplication vs. matlab In-Reply-To: <4A2F1B76.7080705@ar.media.kyoto-u.ac.jp> References: <4A2F1B76.7080705@ar.media.kyoto-u.ac.jp> Message-ID: Hi David, 2009/6/9 David Cournapeau : > Hi Benoit, > > Benoit Jacob wrote: >> No, because _we_ are serious about compilation times, unlike other c++ >> template libraries. But granted, compilation times are not as short as >> a plain C library either. >> > > I concede it is not as bad as the heavily templated libraries in boost. > But C++ is just horribly slow to compile, at least with g++ - in scipy, > half of the compilation time is spent for a couple of C++ files which > uses simple templates. And the compiler takes a lot of memory during > compilation (~ 300 Mb per file - that's a problem because I rely a lot > on VM to build numpy/scipy binaries). Well, I can't comment on other libraries that I don't know. It is true that compilation time and memory usage in C++ templated code will never be as low as in C compilation, and can easily go awry if the c++ programmer isn't careful. Templates are really a scripting language for the compiler and like in any (turing complete) language you can always write a program that takes long to "execute". > >> Eigen doesn't _require_ any SIMD instruction set although it can use >> SSE / AltiVec if enabled. >> > > If SSE is not enabled, my (very limited) tests show that eigen does not > perform as well as a stock debian ATLAS on the benchmarks given by > eigen. For example: Of course! The whole point is that ATLAS is a binary library with its own SSE code, so it is still able to use SSE even if your program was compiled without SSE enabled: ATLAS will run its own platform check at runtime. So it's not a surprise that ATLAS with SSE is faster than Eigen without SSE. By the way this was shown in our benchmark already: http://eigen.tuxfamily.org/index.php?title=Benchmark Scroll down to matrix matrix product. The gray curve "eigen2_novec" is eigen without SSE. > > ?g++ benchBlasGemm.cpp -I .. -lblas -O2 -DNDEBUG && ./a.out 300 > cblas: 0.034222 (0.788 GFlops/s) > eigen : 0.0863581 (0.312 GFlops/s) > eigen : 0.121259 (0.222 GFlops/s) and just out of curiosity, what are the 2 eigen lines ? > > g++ benchBlasGemm.cpp -I .. -lblas -O2 -DNDEBUG -msse2 && ./a.out 300 > cblas: 0.035438 (0.761 GFlops/s) > eigen : 0.0182271 (1.481 GFlops/s) > eigen : 0.0860961 (0.313 GFlops/s) > > (on a PIV, which may not be very representative of current architectures) > >> It is true that with Eigen this is set up at build time, but this is >> only because it is squarely _not_ Eigen's job to do runtime platform >> checks. Eigen isn't a binary library. 
If you want a runtime platform >> switch, just compile your critical Eigen code twice, one with SSE one >> without, and do the platform check in your own app. The good thing >> here is that Eigen makes sure that the ABI is independent of whether >> vectorization is enabled. >> > > I understand that it is not a goal of eigen, and that should be the > application's job. It is just that MKL does it automatically, and doing > it in a cross platform way in the context of python extensions is quite > hard because of various linking strategies on different OS. Yes, I understand that. MKL is not only a math library, it comes with embedded threading library and hardware detection routines. Cheers, Benoit From a.carl at ngc.com Wed Jun 10 11:30:38 2009 From: a.carl at ngc.com (Carl, Andrew F (AS)) Date: Wed, 10 Jun 2009 10:30:38 -0500 Subject: [Numpy-discussion] Inquiry Regarding F2PY Windows Content In-Reply-To: <4A2F1DC6.8090008@ar.media.kyoto-u.ac.jp> References: <4A2F1DC6.8090008@ar.media.kyoto-u.ac.jp> Message-ID: Then how about known combinations of version numbers for g77/gcc, python, and numpy? The gist would be to convey something that is known too work together. Some of the "magic" under the hood is looking at system environmental variables, and the Intel fortran compiler flags wreak havoc, requiring the uninstall of the intel compiler to prevent corruption of the gnu95 setup w/ gfortran. Something on where to look to see whats going on would be helpful! Thanks, Andy -----Original Message----- From: numpy-discussion-bounces at scipy.org [mailto:numpy-discussion-bounces at scipy.org] On Behalf Of David Cournapeau Sent: Tuesday, June 09, 2009 7:43 PM To: Discussion of Numerical Python Subject: Re: [Numpy-discussion] Inquiry Regarding F2PY Windows Content Carl, Andrew F (AS) wrote: > > Would it be a reasonable request, that the "F2PY Windows" web page > contain known combinations of version numbers for Python, Numpy and > Gfortran verified to play nice? Some references as to queried compiler > system environmental variables would be useful also. > I have added some numpy.distutils support for gfortran on windows, but Windows + gfortran + numpy is unlikely to work well unless you build numpy by yourself with gfortran. I have actually been considering a move to gfortran for windows builds, but I would prefer waiting mingw to officially support for gcc 4.* series. cheers, David _______________________________________________ Numpy-discussion mailing list Numpy-discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion From cournape at gmail.com Wed Jun 10 11:43:30 2009 From: cournape at gmail.com (David Cournapeau) Date: Thu, 11 Jun 2009 00:43:30 +0900 Subject: [Numpy-discussion] Inquiry Regarding F2PY Windows Content In-Reply-To: References: <4A2F1DC6.8090008@ar.media.kyoto-u.ac.jp> Message-ID: <5b8d13220906100843y46d228d8nded205449913a0b1@mail.gmail.com> On Thu, Jun 11, 2009 at 12:30 AM, Carl, Andrew F (AS) wrote: > > Then how about known combinations of version numbers for g77/gcc, > python, and numpy? Any of them should work. Numpy on windows is built with g77/gcc as available from the official MinGW installer. >. Something on where to look to > see whats going on would be helpful! Something like python setup.py build_ext --fcompiler=gnu should force to look for g77 instead of the Intel fortran compiler. 
David From bela.mihalik at gmail.com Wed Jun 10 11:53:31 2009 From: bela.mihalik at gmail.com (=?ISO-8859-1?Q?B=E9la_MIHALIK?=) Date: Wed, 10 Jun 2009 17:53:31 +0200 Subject: [Numpy-discussion] second 2d fft gives the same result as fft+ifft In-Reply-To: <315462.74301.qm@web52108.mail.re2.yahoo.com> References: <315462.74301.qm@web52108.mail.re2.yahoo.com> Message-ID: <1edb76140906100853i1a20962eo520f0d1b480ba3f3@mail.gmail.com> I made the transformations with scipy. It works with complex numbers. The results are here: http://server6.theimagehosting.com/image.php?img=second_fourier_scipy.png Also i asked it from others... as i understand now: the second fourier transformation of a function always gives back the mirror of the function itself. I didn't know it :-) thanks, bela 2009/6/9 David Goldsmith > > Sorry, I meant: > > Im(iFT(FT(f))) = Im(FT^2(f)), Re(iFT(FT(f))) != Re(FT^2(f)) > > DG > > --- On Tue, 6/9/09, David Goldsmith wrote: > > > From: David Goldsmith > > Subject: Re: [Numpy-discussion] second 2d fft gives the same result as > fft+ifft > > To: "Discussion of Numerical Python" > > Date: Tuesday, June 9, 2009, 11:01 AM > > > > --- On Tue, 6/9/09, Matthieu Brucher > > wrote: > > > > > Hi, > > > > > > Is it really ? > > > You only show the imaginary part of the FFT, so you > > can't > > > be sure of > > > what you are saying. > > > > Indeed, is there not a "label" for a function f which > > satisfies > > > > Im(iFFT(f)) = Im(FFT^2(f)), > > Re(iFFT(f)) != Re(FFT^2(f))? > > > > (And similarly if Im and Re roles are reversed.) > > Seems like the class of such functions (if any exist) might > > have some interesting properties... > > > > DG > > > > > Don't forget that the only difference between FFT and > > iFFT > > > is (besides > > > of teh scaling factor) a minus sign in the exponent. > > > > > > Matthieu > > > > > > 2009/6/9 bela : > > > > > > > > I tried to calculate the second fourier > > transformation > > > of an image with the > > > > following code below: > > > > > > > > > > > > > --------------------------------------------------------------- > > > > import pylab > > > > import numpy > > > > > > > > ### Create a simple image > > > > > > > > fx = numpy.zeros( 128**2 > > ).reshape(128,128).astype( > > > numpy.float ) > > > > > > > > for i in xrange(8): > > > > for j in xrange(8): > > > > fx[i*8+16][j*8+16] = 1.0 > > > > > > > > ### Fourier Transformations > > > > > > > > Ffx = numpy.copy( numpy.fft.fft2( fx ).real ) > > # 1st > > > fourier > > > > FFfx = numpy.copy( numpy.fft.fft2( Ffx ).real ) > > # > > > 2nd fourier > > > > IFfx = numpy.copy( numpy.fft.ifft2( Ffx ).real ) > > # > > > inverse fourier > > > > > > > > ### Display result > > > > > > > > pylab.figure( 1, figsize=(8,8), dpi=125 ) > > > > > > > > pylab.subplot(221) > > > > pylab.imshow( fx, cmap=pylab.cm.gray ) > > > > pylab.colorbar() > > > > pylab.title( "fx" ) > > > > > > > > pylab.subplot(222) > > > > pylab.imshow( Ffx, cmap=pylab.cm.gray ) > > > > pylab.colorbar() > > > > pylab.title( "Ffx" ) > > > > > > > > pylab.subplot(223) > > > > pylab.imshow( FFfx, cmap=pylab.cm.gray ) > > > > pylab.colorbar() > > > > pylab.title( "FFfx" ) > > > > > > > > pylab.subplot(224) > > > > pylab.imshow( IFfx, cmap=pylab.cm.gray ) > > > > pylab.colorbar() > > > > pylab.title( "IFfx" ) > > > > > > > > pylab.show() > > > > > > > > > --------------------------------------------------------------- > > > > > > > > On my computer FFfx is the same as IFfx..... but > > why? 
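(The mirror claim is easy to check numerically. A small self-contained example -- mine, not the original post's exact code: applying fft2 twice returns the input mirrored about the origin, i.e. with indices negated modulo N, scaled by the number of elements.

    import numpy as np

    f = np.random.rand(8, 8)
    ff = np.fft.fft2(np.fft.fft2(f))

    # f[(-i) % N, (-j) % N]: reverse both axes, then roll each by one
    mirror = np.roll(np.roll(f[::-1, ::-1], 1, axis=0), 1, axis=1)

    print np.allclose(ff.real / f.size, mirror)   # True
    print np.allclose(ff.imag, 0.0)               # True for real input

In other words FF(f)[m] = N * f[(-m) mod N], which is why the doubly transformed image is a mirrored, rescaled copy of the original.)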
> > > > > > > > I uploaded a screenshot about my result here: > > > > http://server6.theimagehosting.com/image.php?img=second_fourier.png > > > > > > > > Bela > > > > > > > > > > > > -- > > > > View this message in context: > http://www.nabble.com/second-2d-fft-gives-the-same-result-as-fft%2Bifft-tp23945026p23945026.html > > > > Sent from the Numpy-discussion mailing list > > archive at > > > Nabble.com. > > > > > > > > _______________________________________________ > > > > Numpy-discussion mailing list > > > > Numpy-discussion at scipy.org > > > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > > > > > > > > > > -- > > > Information System Engineer, Ph.D. > > > Website: http://matthieu-brucher.developpez.com/ > > > Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92 > > > LinkedIn: http://www.linkedin.com/in/matthieubrucher > > > _______________________________________________ > > > Numpy-discussion mailing list > > > Numpy-discussion at scipy.org > > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > > > > > _______________________________________________ > > Numpy-discussion mailing list > > Numpy-discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From a.carl at ngc.com Wed Jun 10 11:55:52 2009 From: a.carl at ngc.com (Carl, Andrew F (AS)) Date: Wed, 10 Jun 2009 10:55:52 -0500 Subject: [Numpy-discussion] Inquiry Regarding F2PY Windows Content In-Reply-To: <5b8d13220906100843y46d228d8nded205449913a0b1@mail.gmail.com> References: <4A2F1DC6.8090008@ar.media.kyoto-u.ac.jp> <5b8d13220906100843y46d228d8nded205449913a0b1@mail.gmail.com> Message-ID: The default finds both the g77 & gfortran compilers. The issue is that the flags associated w/ both appear to be corrupted by the contents of the intel fortran environmental variables (i.e. content matched, removal of the intel system environmental variables made the problem go away), resulting in the example problem failing. Used f2py/diagnose.py to verify. It's just that it took about six python install/uninstall cycles. It would be nice to enable others to avoid the problem via a heads-up on the F2Py Windows web page. -----Original Message----- From: numpy-discussion-bounces at scipy.org [mailto:numpy-discussion-bounces at scipy.org] On Behalf Of David Cournapeau Sent: Wednesday, June 10, 2009 8:44 AM To: Discussion of Numerical Python Subject: Re: [Numpy-discussion] Inquiry Regarding F2PY Windows Content On Thu, Jun 11, 2009 at 12:30 AM, Carl, Andrew F (AS) wrote: > > Then how about known combinations of version numbers for g77/gcc, > python, and numpy? Any of them should work. Numpy on windows is built with g77/gcc as available from the official MinGW installer. >. Something on where to look to > see whats going on would be helpful! Something like python setup.py build_ext --fcompiler=gnu should force to look for g77 instead of the Intel fortran compiler. 
David _______________________________________________ Numpy-discussion mailing list Numpy-discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion From HAWRYLA at novachem.com Wed Jun 10 12:22:16 2009 From: HAWRYLA at novachem.com (Andrew Hawryluk) Date: Wed, 10 Jun 2009 10:22:16 -0600 Subject: [Numpy-discussion] Inquiry Regarding F2PY Windows Content In-Reply-To: <4A2F1DC6.8090008@ar.media.kyoto-u.ac.jp> References: <4A2F1DC6.8090008@ar.media.kyoto-u.ac.jp> Message-ID: <48C01AE7354EC240A26F19CEB995E943033AF1BE@CHMAILMBX01.novachem.com> > -----Original Message----- > From: numpy-discussion-bounces at scipy.org [mailto:numpy-discussion- > bounces at scipy.org] On Behalf Of David Cournapeau > Sent: 9 Jun 2009 8:43 PM > To: Discussion of Numerical Python > Subject: Re: [Numpy-discussion] Inquiry Regarding F2PY Windows Content > > Carl, Andrew F (AS) wrote: > > > > Would it be a reasonable request, that the "F2PY Windows" web page > > contain known combinations of version numbers for Python, Numpy and > > Gfortran verified to play nice? Some references as to queried > compiler > > system environmental variables would be useful also. > > > I have added some numpy.distutils support for gfortran on windows, but > Windows + gfortran + numpy is unlikely to work well unless you build > numpy by yourself with gfortran. I have actually been considering a > move to gfortran for windows builds, but I would prefer waiting mingw > to officially support for gcc 4.* series. I was able to get gfortran working on Windows just a few weeks ago. The only problem I had was that I needed Python >= 2.5.3 before it would work. See issue #2234 in http://www.python.org/download/releases/2.5.4/NEWS.txt Andrew From cournape at gmail.com Wed Jun 10 12:39:08 2009 From: cournape at gmail.com (David Cournapeau) Date: Thu, 11 Jun 2009 01:39:08 +0900 Subject: [Numpy-discussion] Inquiry Regarding F2PY Windows Content In-Reply-To: <48C01AE7354EC240A26F19CEB995E943033AF1BE@CHMAILMBX01.novachem.com> References: <4A2F1DC6.8090008@ar.media.kyoto-u.ac.jp> <48C01AE7354EC240A26F19CEB995E943033AF1BE@CHMAILMBX01.novachem.com> Message-ID: <5b8d13220906100939n54a9ff60ob60ea4e902104706@mail.gmail.com> On Thu, Jun 11, 2009 at 1:22 AM, Andrew Hawryluk wrote: > I was able to get gfortran working on Windows just a few weeks ago. The > only problem I had was that I needed Python >= 2.5.3 before it would > work. See issue #2234 in > http://www.python.org/download/releases/2.5.4/NEWS.txt That's not really the problem I had in mind - rather, if you compile your extension with gfortran, since numpy is built with g77 on windows, you can get into hairy situations since g77 and gfortran are not compatible. If you build numpy yourself with gfortran (which should work), then there is no problem. David From a.carl at ngc.com Wed Jun 10 12:41:53 2009 From: a.carl at ngc.com (Carl, Andrew F (AS)) Date: Wed, 10 Jun 2009 11:41:53 -0500 Subject: [Numpy-discussion] Inquiry Regarding F2PY Windows Content In-Reply-To: <48C01AE7354EC240A26F19CEB995E943033AF1BE@CHMAILMBX01.novachem.com> References: <4A2F1DC6.8090008@ar.media.kyoto-u.ac.jp> <48C01AE7354EC240A26F19CEB995E943033AF1BE@CHMAILMBX01.novachem.com> Message-ID: Got gfortran (dated 2009-04-21) working w/ python 2.5.2, numpy 1.3.0 & MinGW 5.1.4, and after removing the intel fortran system environmental variables (plus six python installs, a intel fortran uninstall, and a day and a half of head scratching). 
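(For reference, the numpy.distutils switch for building numpy itself -- or an extension -- with gfortran is config_fc; the commands below are an illustrative sketch, not a guaranteed recipe:

    python setup.py config_fc --fcompiler=gnu95 build
    python setup.py install

The essential point is that numpy and every Fortran extension loaded alongside it should come from the same compiler family.)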
Andy -----Original Message----- From: numpy-discussion-bounces at scipy.org [mailto:numpy-discussion-bounces at scipy.org] On Behalf Of Andrew Hawryluk Sent: Wednesday, June 10, 2009 9:22 AM To: Discussion of Numerical Python Subject: Re: [Numpy-discussion] Inquiry Regarding F2PY Windows Content > -----Original Message----- > From: numpy-discussion-bounces at scipy.org [mailto:numpy-discussion- > bounces at scipy.org] On Behalf Of David Cournapeau > Sent: 9 Jun 2009 8:43 PM > To: Discussion of Numerical Python > Subject: Re: [Numpy-discussion] Inquiry Regarding F2PY Windows Content > > Carl, Andrew F (AS) wrote: > > > > Would it be a reasonable request, that the "F2PY Windows" web page > > contain known combinations of version numbers for Python, Numpy and > > Gfortran verified to play nice? Some references as to queried > compiler > > system environmental variables would be useful also. > > > I have added some numpy.distutils support for gfortran on windows, but > Windows + gfortran + numpy is unlikely to work well unless you build > numpy by yourself with gfortran. I have actually been considering a > move to gfortran for windows builds, but I would prefer waiting mingw > to officially support for gcc 4.* series. I was able to get gfortran working on Windows just a few weeks ago. The only problem I had was that I needed Python >= 2.5.3 before it would work. See issue #2234 in http://www.python.org/download/releases/2.5.4/NEWS.txt Andrew _______________________________________________ Numpy-discussion mailing list Numpy-discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion From charlesr.harris at gmail.com Wed Jun 10 12:59:36 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 10 Jun 2009 10:59:36 -0600 Subject: [Numpy-discussion] Howto vectorise a dot product ? In-Reply-To: References: Message-ID: On Wed, Jun 10, 2009 at 12:21 AM, bruno Piguet wrote: > 2009/6/9 Charles R Harris > >> >> Well, in this case you can use complex multiplication and either work with >> just the x,y components or use two complex components, i.e., [x + 1j*y, z]. >> In the first case you can then do the rotation as V*exp(1j*phi). > > > In the real case, it's a real 3-axes rotation, where M = dot (M1(psi), dot > (M2(theta), M3(phi))). The decomposition in 2D-rotations and the use of > complex operation is possible, but the matrix notation is more concise. > > If you want more general rotations, a ufunc for quaternions would do the >> trick. >> > > You mean something like Christoph Gohlke's "transformations.py" program ? > I don't know. But I think generating the rotation matrices is likely to be the bottleneck and I'm not sure how you want to specify them. After you have the stack of rotation matrices there are at least three ways to multiply them with stacks of vectors: in a python loop, using newaxis and a sum, or possibly a generalized ufunc. I'm not sure about the latter but I think there is an example that does that. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From srunni at gmail.com Wed Jun 10 13:44:28 2009 From: srunni at gmail.com (Samir Unni) Date: Wed, 10 Jun 2009 12:44:28 -0500 Subject: [Numpy-discussion] F2PY error with G95 on Mac OS 10.5 Message-ID: <283efb240906101044v58200204id267b0bae59946ca@mail.gmail.com> Hi, I'm trying to use F2PY on Mac OS 10.5 with G95, but I'm getting the error "g95: unrecognized option '-shared'". 
I tried modifying the NumPy code to use the correct "-dynamic" flag, rather than the "-shared" flag. While that does allow for F2PY to successfully execute, I get the error Traceback (most recent call last): File "", line 1, in ImportError: dlopen(/Users/srunni/src/pdb2pqr/pdb2pqr/tinker/src/tinker/source/ese.so, 2): no suitable image found. Did find: /Users/srunni/src/pdb2pqr/pdb2pqr/tinker/src/tinker/source/ese.so: unknown file type, first eight bytes: 0x80 0xC0 0x4F 0x00 0xEB 0x57 0xE0 0x8F when I attempt to import the generated module. Any ideas on how to fix this? Thanks, Samir Unni From d_l_goldsmith at yahoo.com Wed Jun 10 13:55:58 2009 From: d_l_goldsmith at yahoo.com (David Goldsmith) Date: Wed, 10 Jun 2009 10:55:58 -0700 (PDT) Subject: [Numpy-discussion] Numpy-discussion Digest, Vol 33, Issue 59 Message-ID: <248036.44290.qm@web52105.mail.re2.yahoo.com> --- On Wed, 6/10/09, Juanjo Gomez Navarro wrote: > The speed at which some points tend to infinite is huge. > Some points, after 10 steps, reach a NaN. This is not > problem in my Mac Book, but in the PC the speed is really > poor when some infinities are reached (in the mac, the > program takes 3 seconds to run, meanwhile in the PC it takes > more than 1 minute). In order to solve this, I have added a > line to set to 0 the points who have reached 2.0 (so they > are already out of the Mandelbrot set): Yeah, of course something like that should've been in the original code. > On the other hand, the number of calculations that > really need to be done (of points who have not yet > been excluded from the Mandelbrot set) decreases rapidly. In > the beginning, there are, in a given example, 250000 points, > but in the final steps there are only 60000. Nevertheless, > I'm calculating needlessly the 250000 points all > the time, when only 10% of calculations should need to be > done! It is a waste of time. > > Is there any way to save time in these useless > calculations? The idea should be to perform the update of z > only if certain conditions are met, in this case that > abs(z)<2. To do that you'd need to use the "fancy indexing" approach suggested by Anne, but of course, as Robert emphasized, figuring out the details of that implementation is much harder. I think the main thing to take away from all this (unless fractals, or an analogous algorithm, is what your ultimate goal is) is the use of implied for loops in the indexing as the Python replacement for explicit for loops in C and FORTRAN. This may be a tough adjustment at first if you've never encountered it in another context. (I'd venture to guess that many (most?) numpy users first encountered it in matlab, idl, Splus, or something like that, and thus came to numpy already familiar w/ the approach; in other words, numpy didn't invent this approach, it has a pretty long history.) Can anyone point Juanjo directly to that portion of a tutorial which goes over this? DG > > Thanks. > > > 2009/6/9 > ? > ttp://mentat.za.net/numpy/intro/intro.html > > > > We never used it, but I still like the pretty pictures :-) > > > > Cheers > > St?fan > > -- > Juan Jos? G?mez Navarro > > Edificio CIOyN, Campus de Espinardo, 30100 > Departamento de F?sica > Universidad de Murcia > Tfno. 
(+34) 968 398552 > > Email: juanjo.gomeznavarro at gmail.com > > Web: http://ciclon.inf.um.es/Inicio.html > > > -----Inline Attachment Follows----- > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From bsouthey at gmail.com Wed Jun 10 14:33:27 2009 From: bsouthey at gmail.com (Bruce Southey) Date: Wed, 10 Jun 2009 13:33:27 -0500 Subject: [Numpy-discussion] Inquiry Regarding F2PY Windows Content In-Reply-To: References: <4A2F1DC6.8090008@ar.media.kyoto-u.ac.jp> <48C01AE7354EC240A26F19CEB995E943033AF1BE@CHMAILMBX01.novachem.com> Message-ID: <4A2FFC77.5090802@gmail.com> On 06/10/2009 11:41 AM, Carl, Andrew F (AS) wrote: > Got gfortran (dated 2009-04-21) working w/ python 2.5.2, numpy 1.3.0 & > MinGW 5.1.4, and after removing the intel fortran system environmental > variables (plus six python installs, a intel fortran uninstall, and a > day and a half of head scratching). > > Andy > Hi, What is the operating system (version and type) you are using? Also, could you please list the steps that worked, or an overview of them, as it may help other people? Thanks Bruce From wfspotz at sandia.gov Wed Jun 10 14:52:22 2009 From: wfspotz at sandia.gov (Bill Spotz) Date: Wed, 10 Jun 2009 14:52:22 -0400 Subject: [Numpy-discussion] numpy C++ swig class example In-Reply-To: <3A7F37AE3B50AE479C753AFB27FBB64D018E066B@0461-its-exmb04.us.saic.com> References: <3A7F37AE3B50AE479C753AFB27FBB64D018E066B@0461-its-exmb04.us.saic.com> Message-ID: <61EAA776-A5E2-4A39-81FB-33EC28FBF79C@sandia.gov> Jared, Typically, public data members would be handled in SWIG with, I believe, the "out" typemap. numpy.i does not support this typemap because there is no general way that I know of to tell it that, for example, _nrow and _ncol are the dimensions of data. Faced with this problem, and a desire to have a "data" attribute, I would probably do the following (ignoring the fact that the constructor has an argument named "data" that clashes with the public variable): %ignore Array2D::data; %apply (int DIM1, int DIM2, double* IN_ARRAY2) {(int nrow, int ncol, double *data)}; %include "Array2D.h" %extend Array2D { PyObject * _getData() { npy_intp dims[2] = {self->_nrow, self->_ncol}; return PyArray_SimpleNewFromData(2, dims, NPY_DOUBLE, (void*)self->data); } } %pythoncode %{ data = property(_getData, doc="docstring for data") %} You'll want four spaces of indentation within the %pythoncode directive to ensure it fits properly under the Array2D python proxy class. This should allow getting the entire array and setting and getting elements of the array: >>> a = Array2D([[1,2],[3,4]]) >>> a.data[0,0] = 5 >>> print a.data [[5 2] [3 4]] >>> print a.data[1,1] 4 I don't think I would do this for a class that advertises itself as an array (for this case, I would make sure that an instance of Array2D behaves like an array rather than its "data" attribute), but I could see other classes that might have an attribute that is an array, where you would want to do it. On Jun 9, 2009, at 10:52 PM, Rubin, Jared wrote: > I am using the numpy.i interface file and have gotten the cookbook/ > swig example to work from scipy. > Are there any examples of applying the numpy.i to a C++ header file? 
> I would like to generate a lightweight > Array2D class that just uses doubles and would have the following > header file > > Array2D.h > ========= > class Array2D { > public: > int _nrow; > int _ncol; > double* data; > Array2D(int nrow, int ncol, double *data); > }; > > // I would expect to have the following Array2D.i interface file > %module array2D > > %{ > #define SWIG_FILE_WITH_INIT > #include "Array2D.h" > %} > > %include "numpy.i" > > %init %{ > import_array(); > %} > > %ignore Array2D(); > %ignore Array2D(long nrow, long ncol); > %apply (int DIM1, int DIM2, double* IN_ARRAY2) {(int nrow, int ncol, > double *data)} > %include "Array2D.h" > ** Bill Spotz ** ** Sandia National Laboratories Voice: (505)845-0170 ** ** P.O. Box 5800 Fax: (505)284-0154 ** ** Albuquerque, NM 87185-0370 Email: wfspotz at sandia.gov ** From gokhansever at gmail.com Wed Jun 10 15:23:47 2009 From: gokhansever at gmail.com (Gökhan SEVER) Date: Wed, 10 Jun 2009 14:23:47 -0500 Subject: [Numpy-discussion] [OFFTOPIC] Reference management for articles Message-ID: <49d6b3500906101223p749a8aefva5e6fa7c679c267b@mail.gmail.com> Hello, I am very off-topic here, sorry about that, but I know most of the people on this list are students / scientists. I just want to get a few opinions on how you manage references (following the lists of references at the end of articles, books, etc., or building your own). Some articles really go beyond all limits on this: I have seen a couple of articles with at least 200 citations attached at the end, which easily confuses me while I am trying to focus on the core of the article. What is your solution for this issue? Do you use a web-based system or a tool like Zotero? Open-source recommendations are most appreciated. Thank you... Gökhan From a.carl at ngc.com Wed Jun 10 15:34:29 2009 From: a.carl at ngc.com (Carl, Andrew F (AS)) Date: Wed, 10 Jun 2009 14:34:29 -0500 Subject: [Numpy-discussion] Inquiry Regarding F2PY Windows Content In-Reply-To: <4A2FFC77.5090802@gmail.com> References: <4A2F1DC6.8090008@ar.media.kyoto-u.ac.jp> <48C01AE7354EC240A26F19CEB995E943033AF1BE@CHMAILMBX01.novachem.com> <4A2FFC77.5090802@gmail.com> Message-ID: PC w/ Windows XP [Version 5.1.2600] Steps: 1) Install Python, Numpy and gfortran, in my case as follows: a) ActiveState ActivePython 2.5.2.2 b) Numpy 1.3.0 c) gFortran (dated 2009-04-21) 2) Install MinGW 5.1.4 (g++, g77, make) 3) Check system environmental variables as follows: a) PATH: C:\gfortran\libexec\gcc\i586-pc-mingw32\4.5.0;C:\gfortran\bin;C:\MinGW\bin b) C_INCLUDE_PATH: C:\gfortran\include 4) Run C:\Python25\Lib\site-packages\numpy\f2py\diagnose.py and review the output for the g77 & gnu95 fcompilers found & the contents of their flags: a) GnuFCompiler instance: c:\MinGW\bin\g77.exe b) Gnu95FCompiler instance: c:\gfortran\bin\gfortran.exe c) If you don't see items "a" & "b" above, don't bother trying the example problem from the "F2py Windows" web page 5) To make matters worse, an install/uninstall of ipython from "Add or Remove Programs" "broke" it, requiring the entire process to be repeated (i.e. uninstall/re-install of numpy & scipy alone did not work). THIS IS SCARY (and frustrating)! 
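A quick way to snapshot the variables that step 3 refers to, before and after installing a compiler (illustrative only; the exact variable names the Intel installer sets vary by release, so check your own environment):

    import os
    for var in ("PATH", "C_INCLUDE_PATH", "FC", "F77", "F90"):
        print var, "=", os.environ.get(var, "<unset>")

Diffing that output from before and after an Intel Fortran install shows exactly which settings f2py's compiler detection is picking up.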
Andy -----Original Message----- From: numpy-discussion-bounces at scipy.org [mailto:numpy-discussion-bounces at scipy.org] On Behalf Of Bruce Southey Sent: Wednesday, June 10, 2009 11:33 AM To: Discussion of Numerical Python Subject: Re: [Numpy-discussion] Inquiry Regarding F2PY Windows Content On 06/10/2009 11:41 AM, Carl, Andrew F (AS) wrote: > Got gfortran (dated 2009-04-21) working w/ python 2.5.2, numpy 1.3.0& > MinGW 5.1.4, and after removing the intel fortran system environmental > variables (plus six python installs, a intel fortran uninstall, and a > day and a half of head scratching). > > Andy > Hi, What is the operating system (version and type) you are using? Also, could you please list the steps that worked or an overview of them as it may help other people? Thanks Bruce _______________________________________________ Numpy-discussion mailing list Numpy-discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion From ramercer at gmail.com Wed Jun 10 15:59:38 2009 From: ramercer at gmail.com (Adam Mercer) Date: Wed, 10 Jun 2009 14:59:38 -0500 Subject: [Numpy-discussion] F2PY error with G95 on Mac OS 10.5 In-Reply-To: <283efb240906101044v58200204id267b0bae59946ca@mail.gmail.com> References: <283efb240906101044v58200204id267b0bae59946ca@mail.gmail.com> Message-ID: <799406d60906101259u4b626ec3t7ce26c0eeedb44c7@mail.gmail.com> On Wed, Jun 10, 2009 at 12:44, Samir Unni wrote: > I'm trying to use F2PY on Mac OS 10.5 with G95, but I'm getting the > error "g95: unrecognized option '-shared'". I tried modifying the > NumPy code to use the correct "-dynamic" flag, rather than the > "-shared" flag. While that does allow for F2PY to successfully > execute, I get the error > > Traceback (most recent call last): > ?File "", line 1, in > ImportError: dlopen(/Users/srunni/src/pdb2pqr/pdb2pqr/tinker/src/tinker/source/ese.so, > 2): no suitable image found. ?Did find: > ? ? ? ?/Users/srunni/src/pdb2pqr/pdb2pqr/tinker/src/tinker/source/ese.so: > unknown file type, first eight bytes: 0x80 0xC0 0x4F 0x00 0xEB 0x57 > 0xE0 0x8F > > when I attempt to import the generated module. Any ideas on how to fix this? AFAIK g95 is not supported by numpy distutils on Mac OS X. Cheers Adam From srunni at gmail.com Wed Jun 10 16:04:14 2009 From: srunni at gmail.com (Samir Unni) Date: Wed, 10 Jun 2009 15:04:14 -0500 Subject: [Numpy-discussion] F2PY error with G95 on Mac OS 10.5 In-Reply-To: <799406d60906101259u4b626ec3t7ce26c0eeedb44c7@mail.gmail.com> References: <283efb240906101044v58200204id267b0bae59946ca@mail.gmail.com> <799406d60906101259u4b626ec3t7ce26c0eeedb44c7@mail.gmail.com> Message-ID: <283efb240906101304m7a12fd52k540d2e19e693beca@mail.gmail.com> On Wed, Jun 10, 2009 at 2:59 PM, Adam Mercer wrote: > On Wed, Jun 10, 2009 at 12:44, Samir Unni wrote: > >> I'm trying to use F2PY on Mac OS 10.5 with G95, but I'm getting the >> error "g95: unrecognized option '-shared'". I tried modifying the >> NumPy code to use the correct "-dynamic" flag, rather than the >> "-shared" flag. While that does allow for F2PY to successfully >> execute, I get the error >> >> Traceback (most recent call last): >> ?File "", line 1, in >> ImportError: dlopen(/Users/srunni/src/pdb2pqr/pdb2pqr/tinker/src/tinker/source/ese.so, >> 2): no suitable image found. ?Did find: >> ? ? ? ?/Users/srunni/src/pdb2pqr/pdb2pqr/tinker/src/tinker/source/ese.so: >> unknown file type, first eight bytes: 0x80 0xC0 0x4F 0x00 0xEB 0x57 >> 0xE0 0x8F >> >> when I attempt to import the generated module. Any ideas on how to fix this? 
> > AFAIK g95 is not supported by numpy distutils on Mac OS X. Are you sure? When I run "f2py -c --help-fcompiler", I get: List of available Fortran compilers: --fcompiler=g95 G95 Fortran Compiler (0.91) G95 is the only compiler listed as available. If it can't be used, then what can? I would actually prefer to use GFortran, but that is not listed as available. Thanks, Samir From ramercer at gmail.com Wed Jun 10 16:11:35 2009 From: ramercer at gmail.com (Adam Mercer) Date: Wed, 10 Jun 2009 15:11:35 -0500 Subject: [Numpy-discussion] F2PY error with G95 on Mac OS 10.5 In-Reply-To: <283efb240906101304m7a12fd52k540d2e19e693beca@mail.gmail.com> References: <283efb240906101044v58200204id267b0bae59946ca@mail.gmail.com> <799406d60906101259u4b626ec3t7ce26c0eeedb44c7@mail.gmail.com> <283efb240906101304m7a12fd52k540d2e19e693beca@mail.gmail.com> Message-ID: <799406d60906101311h72191cbcr2c1b5eab484d71d6@mail.gmail.com> On Wed, Jun 10, 2009 at 15:04, Samir Unni wrote: > Are you sure? When I run "f2py -c --help-fcompiler", I get: > > List of available Fortran compilers: > ?--fcompiler=g95 ?G95 Fortran Compiler (0.91) > > G95 is the only compiler listed as available. If it can't be used, > then what can? I would actually prefer to use GFortran, but that is > not listed as available. I get the following with numpy-1.3.0: $ f2py -c --help-fcompiler Fortran compilers found: Compilers available for this platform, but not found: --fcompiler=absoft Absoft Corp Fortran Compiler --fcompiler=g95 G95 Fortran Compiler --fcompiler=gnu GNU Fortran 77 compiler --fcompiler=gnu95 GNU Fortran 95 compiler --fcompiler=ibm IBM XL Fortran Compiler --fcompiler=intel Intel Fortran Compiler for 32-bit apps --fcompiler=nag NAGWare Fortran 95 Compiler Compilers not available on this platform: --fcompiler=compaq Compaq Fortran Compiler --fcompiler=hpux HP Fortran 90 Compiler --fcompiler=intele Intel Fortran Compiler for Itanium apps --fcompiler=intelem Intel Fortran Compiler for EM64T-based apps --fcompiler=intelev Intel Visual Fortran Compiler for Itanium apps --fcompiler=intelv Intel Visual Fortran Compiler for 32-bit apps --fcompiler=lahey Lahey/Fujitsu Fortran 95 Compiler --fcompiler=mips MIPSpro Fortran Compiler --fcompiler=none Fake Fortran compiler --fcompiler=pg Portland Group Fortran Compiler --fcompiler=sun Sun or Forte Fortran 95 Compiler --fcompiler=vast Pacific-Sierra Research Fortran 90 Compiler For compiler details, run 'config_fc --verbose' setup command. $ Cheers Adam From a.carl at ngc.com Wed Jun 10 16:18:22 2009 From: a.carl at ngc.com (Carl, Andrew F (AS)) Date: Wed, 10 Jun 2009 15:18:22 -0500 Subject: [Numpy-discussion] F2PY error with G95 on Mac OS 10.5 In-Reply-To: <799406d60906101311h72191cbcr2c1b5eab484d71d6@mail.gmail.com> References: <283efb240906101044v58200204id267b0bae59946ca@mail.gmail.com> <799406d60906101259u4b626ec3t7ce26c0eeedb44c7@mail.gmail.com> <283efb240906101304m7a12fd52k540d2e19e693beca@mail.gmail.com> <799406d60906101311h72191cbcr2c1b5eab484d71d6@mail.gmail.com> Message-ID: You might try running: C:\Python25\Lib\site-packages\numpy\f2py\diagnose.py -----Original Message----- From: numpy-discussion-bounces at scipy.org [mailto:numpy-discussion-bounces at scipy.org] On Behalf Of Adam Mercer Sent: Wednesday, June 10, 2009 1:12 PM To: Discussion of Numerical Python Subject: Re: [Numpy-discussion] F2PY error with G95 on Mac OS 10.5 On Wed, Jun 10, 2009 at 15:04, Samir Unni wrote: > Are you sure? 
When I run "f2py -c --help-fcompiler", I get: > > List of available Fortran compilers: > ?--fcompiler=g95 ?G95 Fortran Compiler (0.91) > > G95 is the only compiler listed as available. If it can't be used, > then what can? I would actually prefer to use GFortran, but that is > not listed as available. I get the following with numpy-1.3.0: $ f2py -c --help-fcompiler Fortran compilers found: Compilers available for this platform, but not found: --fcompiler=absoft Absoft Corp Fortran Compiler --fcompiler=g95 G95 Fortran Compiler --fcompiler=gnu GNU Fortran 77 compiler --fcompiler=gnu95 GNU Fortran 95 compiler --fcompiler=ibm IBM XL Fortran Compiler --fcompiler=intel Intel Fortran Compiler for 32-bit apps --fcompiler=nag NAGWare Fortran 95 Compiler Compilers not available on this platform: --fcompiler=compaq Compaq Fortran Compiler --fcompiler=hpux HP Fortran 90 Compiler --fcompiler=intele Intel Fortran Compiler for Itanium apps --fcompiler=intelem Intel Fortran Compiler for EM64T-based apps --fcompiler=intelev Intel Visual Fortran Compiler for Itanium apps --fcompiler=intelv Intel Visual Fortran Compiler for 32-bit apps --fcompiler=lahey Lahey/Fujitsu Fortran 95 Compiler --fcompiler=mips MIPSpro Fortran Compiler --fcompiler=none Fake Fortran compiler --fcompiler=pg Portland Group Fortran Compiler --fcompiler=sun Sun or Forte Fortran 95 Compiler --fcompiler=vast Pacific-Sierra Research Fortran 90 Compiler For compiler details, run 'config_fc --verbose' setup command. $ Cheers Adam _______________________________________________ Numpy-discussion mailing list Numpy-discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion From srunni at gmail.com Wed Jun 10 16:19:52 2009 From: srunni at gmail.com (Samir Unni) Date: Wed, 10 Jun 2009 15:19:52 -0500 Subject: [Numpy-discussion] F2PY error with G95 on Mac OS 10.5 In-Reply-To: <799406d60906101311h72191cbcr2c1b5eab484d71d6@mail.gmail.com> References: <283efb240906101044v58200204id267b0bae59946ca@mail.gmail.com> <799406d60906101259u4b626ec3t7ce26c0eeedb44c7@mail.gmail.com> <283efb240906101304m7a12fd52k540d2e19e693beca@mail.gmail.com> <799406d60906101311h72191cbcr2c1b5eab484d71d6@mail.gmail.com> Message-ID: <283efb240906101319o2ac0e95aieb73b99455c24de8@mail.gmail.com> On Wed, Jun 10, 2009 at 3:11 PM, Adam Mercer wrote: > On Wed, Jun 10, 2009 at 15:04, Samir Unni wrote: > >> Are you sure? When I run "f2py -c --help-fcompiler", I get: >> >> List of available Fortran compilers: >> ?--fcompiler=g95 ?G95 Fortran Compiler (0.91) >> >> G95 is the only compiler listed as available. If it can't be used, >> then what can? I would actually prefer to use GFortran, but that is >> not listed as available. > > I get the following with numpy-1.3.0: > > $ f2py -c --help-fcompiler > Fortran compilers found: > Compilers available for this platform, but not found: > ?--fcompiler=absoft ?Absoft Corp Fortran Compiler > ?--fcompiler=g95 ? ? G95 Fortran Compiler > ?--fcompiler=gnu ? ? GNU Fortran 77 compiler > ?--fcompiler=gnu95 ? GNU Fortran 95 compiler > ?--fcompiler=ibm ? ? IBM XL Fortran Compiler > ?--fcompiler=intel ? Intel Fortran Compiler for 32-bit apps > ?--fcompiler=nag ? ? NAGWare Fortran 95 Compiler > Compilers not available on this platform: > ?--fcompiler=compaq ? Compaq Fortran Compiler > ?--fcompiler=hpux ? ? HP Fortran 90 Compiler > ?--fcompiler=intele ? 
Intel Fortran Compiler for Itanium apps > ?--fcompiler=intelem ?Intel Fortran Compiler for EM64T-based apps > ?--fcompiler=intelev ?Intel Visual Fortran Compiler for Itanium apps > ?--fcompiler=intelv ? Intel Visual Fortran Compiler for 32-bit apps > ?--fcompiler=lahey ? ?Lahey/Fujitsu Fortran 95 Compiler > ?--fcompiler=mips ? ? MIPSpro Fortran Compiler > ?--fcompiler=none ? ? Fake Fortran compiler > ?--fcompiler=pg ? ? ? Portland Group Fortran Compiler > ?--fcompiler=sun ? ? ?Sun or Forte Fortran 95 Compiler > ?--fcompiler=vast ? ? Pacific-Sierra Research Fortran 90 Compiler > For compiler details, run 'config_fc --verbose' setup command. That's odd. You're running Mac OS 10.5.7? Did you install NumPy manually or via Fink? Thanks, Samir From gely at usc.edu Wed Jun 10 16:21:24 2009 From: gely at usc.edu (Geoffrey Ely) Date: Wed, 10 Jun 2009 13:21:24 -0700 Subject: [Numpy-discussion] [OFFTOPIC] Reference management for articles In-Reply-To: <49d6b3500906101223p749a8aefva5e6fa7c679c267b@mail.gmail.com> References: <49d6b3500906101223p749a8aefva5e6fa7c679c267b@mail.gmail.com> Message-ID: If you use LaTex and Mac OSX, I recommend BibDesk: http://bibdesk.sourceforge.net/ Quite nice, and open-source. -Geoff On Jun 10, 2009, at 12:23 PM, G?khan SEVER wrote: > Hello, > > I am very off-the-topic, sorry about that first, but I know most of > the people in this list are students / scientists. Just want to know > a few opinions upon how you manage references (following the list of > references in the end of articles, books, etc... or building your > owns). Some articles are really goes beyond limits on this. I have > seen a couple articles titles with at least 200 citations attached > in the end, and makes me confused easily while trying to focus on > the core of an article. > > What is your solution for this issue? Do you use a web-based system > a tool like Zotero? An open-source recommendations are more > appreciated. > > Thank you... 
> > G?khan > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From srunni at gmail.com Wed Jun 10 16:24:05 2009 From: srunni at gmail.com (Samir Unni) Date: Wed, 10 Jun 2009 15:24:05 -0500 Subject: [Numpy-discussion] F2PY error with G95 on Mac OS 10.5 In-Reply-To: References: <283efb240906101044v58200204id267b0bae59946ca@mail.gmail.com> <799406d60906101259u4b626ec3t7ce26c0eeedb44c7@mail.gmail.com> <283efb240906101304m7a12fd52k540d2e19e693beca@mail.gmail.com> <799406d60906101311h72191cbcr2c1b5eab484d71d6@mail.gmail.com> Message-ID: <283efb240906101324x38638cc4q3dbe16526679c25f@mail.gmail.com> On Wed, Jun 10, 2009 at 3:18 PM, Carl, Andrew F (AS) wrote: > You might try running: C:\Python25\Lib\site-packages\numpy\f2py\diagnose.py That's giving me the same result: List of available Fortran compilers: --fcompiler=g95 G95 Fortran Compiler (0.91) List of unavailable Fortran compilers: --fcompiler=absoft Absoft Corp Fortran Compiler --fcompiler=compaq Compaq Fortran Compiler --fcompiler=compaqv DIGITAL|Compaq Visual Fortran Compiler --fcompiler=gnu GNU Fortran Compiler --fcompiler=gnu95 GNU 95 Fortran Compiler --fcompiler=hpux HP Fortran 90 Compiler --fcompiler=ibm IBM XL Fortran Compiler --fcompiler=intel Intel Fortran Compiler for 32-bit apps --fcompiler=intele Intel Fortran Compiler for Itanium apps --fcompiler=intelem Intel Fortran Compiler for EM64T-based apps --fcompiler=intelev Intel Visual Fortran Compiler for Itanium apps --fcompiler=intelv Intel Visual Fortran Compiler for 32-bit apps --fcompiler=lahey Lahey/Fujitsu Fortran 95 Compiler --fcompiler=mips MIPSpro Fortran Compiler --fcompiler=nag NAGWare Fortran 95 Compiler --fcompiler=none Fake Fortran compiler --fcompiler=pg Portland Group Fortran Compiler --fcompiler=sun Sun|Forte Fortran 95 Compiler --fcompiler=vast Pacific-Sierra Research Fortran 90 Compiler List of unimplemented Fortran compilers: --fcompiler=f Fortran Company/NAG F Compiler Thanks, Samir From ramercer at gmail.com Wed Jun 10 16:24:34 2009 From: ramercer at gmail.com (Adam Mercer) Date: Wed, 10 Jun 2009 15:24:34 -0500 Subject: [Numpy-discussion] F2PY error with G95 on Mac OS 10.5 In-Reply-To: <283efb240906101319o2ac0e95aieb73b99455c24de8@mail.gmail.com> References: <283efb240906101044v58200204id267b0bae59946ca@mail.gmail.com> <799406d60906101259u4b626ec3t7ce26c0eeedb44c7@mail.gmail.com> <283efb240906101304m7a12fd52k540d2e19e693beca@mail.gmail.com> <799406d60906101311h72191cbcr2c1b5eab484d71d6@mail.gmail.com> <283efb240906101319o2ac0e95aieb73b99455c24de8@mail.gmail.com> Message-ID: <799406d60906101324o2ece6b7arffcedfd3ae63a5e2@mail.gmail.com> On Wed, Jun 10, 2009 at 15:19, Samir Unni wrote: > That's odd. You're running Mac OS 10.5.7? Did you install NumPy > manually or via Fink? Yep Intel 10.5.7, installed from MacPorts. 
Cheers Adam From a.carl at ngc.com Wed Jun 10 16:31:01 2009 From: a.carl at ngc.com (Carl, Andrew F (AS)) Date: Wed, 10 Jun 2009 15:31:01 -0500 Subject: [Numpy-discussion] F2PY error with G95 on Mac OS 10.5 In-Reply-To: <283efb240906101324x38638cc4q3dbe16526679c25f@mail.gmail.com> References: <283efb240906101044v58200204id267b0bae59946ca@mail.gmail.com><799406d60906101259u4b626ec3t7ce26c0eeedb44c7@mail.gmail.com><283efb240906101304m7a12fd52k540d2e19e693beca@mail.gmail.com><799406d60906101311h72191cbcr2c1b5eab484d71d6@mail.gmail.com> <283efb240906101324x38638cc4q3dbe16526679c25f@mail.gmail.com> Message-ID: As strange as it may sound, the same type of thing happened on my PC: it was working (i.e. diagnose.py & example problem), then quit working after an uninstall of ipython, requiring a complete reinstall. (see my previous post earlier today). -----Original Message----- From: numpy-discussion-bounces at scipy.org [mailto:numpy-discussion-bounces at scipy.org] On Behalf Of Samir Unni Sent: Wednesday, June 10, 2009 1:24 PM To: Discussion of Numerical Python Subject: Re: [Numpy-discussion] F2PY error with G95 on Mac OS 10.5 On Wed, Jun 10, 2009 at 3:18 PM, Carl, Andrew F (AS) wrote: > You might try running: > C:\Python25\Lib\site-packages\numpy\f2py\diagnose.py That's giving me the same result: List of available Fortran compilers: --fcompiler=g95 G95 Fortran Compiler (0.91) List of unavailable Fortran compilers: --fcompiler=absoft Absoft Corp Fortran Compiler --fcompiler=compaq Compaq Fortran Compiler --fcompiler=compaqv DIGITAL|Compaq Visual Fortran Compiler --fcompiler=gnu GNU Fortran Compiler --fcompiler=gnu95 GNU 95 Fortran Compiler --fcompiler=hpux HP Fortran 90 Compiler --fcompiler=ibm IBM XL Fortran Compiler --fcompiler=intel Intel Fortran Compiler for 32-bit apps --fcompiler=intele Intel Fortran Compiler for Itanium apps --fcompiler=intelem Intel Fortran Compiler for EM64T-based apps --fcompiler=intelev Intel Visual Fortran Compiler for Itanium apps --fcompiler=intelv Intel Visual Fortran Compiler for 32-bit apps --fcompiler=lahey Lahey/Fujitsu Fortran 95 Compiler --fcompiler=mips MIPSpro Fortran Compiler --fcompiler=nag NAGWare Fortran 95 Compiler --fcompiler=none Fake Fortran compiler --fcompiler=pg Portland Group Fortran Compiler --fcompiler=sun Sun|Forte Fortran 95 Compiler --fcompiler=vast Pacific-Sierra Research Fortran 90 Compiler List of unimplemented Fortran compilers: --fcompiler=f Fortran Company/NAG F Compiler Thanks, Samir _______________________________________________ Numpy-discussion mailing list Numpy-discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion From srunni at gmail.com Wed Jun 10 16:33:02 2009 From: srunni at gmail.com (Samir Unni) Date: Wed, 10 Jun 2009 15:33:02 -0500 Subject: [Numpy-discussion] F2PY error with G95 on Mac OS 10.5 In-Reply-To: <799406d60906101324o2ece6b7arffcedfd3ae63a5e2@mail.gmail.com> References: <283efb240906101044v58200204id267b0bae59946ca@mail.gmail.com> <799406d60906101259u4b626ec3t7ce26c0eeedb44c7@mail.gmail.com> <283efb240906101304m7a12fd52k540d2e19e693beca@mail.gmail.com> <799406d60906101311h72191cbcr2c1b5eab484d71d6@mail.gmail.com> <283efb240906101319o2ac0e95aieb73b99455c24de8@mail.gmail.com> <799406d60906101324o2ece6b7arffcedfd3ae63a5e2@mail.gmail.com> Message-ID: <283efb240906101333h73182121va6542e145b9ce8ee@mail.gmail.com> On Wed, Jun 10, 2009 at 3:24 PM, Adam Mercer wrote: > On Wed, Jun 10, 2009 at 15:19, Samir Unni wrote: > >> That's odd. You're running Mac OS 10.5.7? 
Did you install NumPy >> manually or via Fink? > > Yep Intel 10.5.7, installed from MacPorts. MacPorts might be the difference then. I tried both the Fink and manual installs, and both are giving the same result. Do you think you could try doing the manual install and let me know if all those other compilers are still listed as available? Also, what version of Python are you using? Thanks, Samir From timmichelsen at gmx-topmail.de Wed Jun 10 18:08:41 2009 From: timmichelsen at gmx-topmail.de (Tim Michelsen) Date: Thu, 11 Jun 2009 00:08:41 +0200 Subject: [Numpy-discussion] [OFFTOPIC] Reference management for articles In-Reply-To: References: <49d6b3500906101223p749a8aefva5e6fa7c679c267b@mail.gmail.com> Message-ID: > http://bibdesk.sourceforge.net/ I use JabRef for quite some time. Very nice and cross-platform. Good interoperability with LyX. If you with MS or OOo, you'd go for Bibus. Best regards, Timmie From dwf at cs.toronto.edu Wed Jun 10 18:25:32 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Wed, 10 Jun 2009 18:25:32 -0400 Subject: [Numpy-discussion] [OFFTOPIC] Reference management for articles In-Reply-To: <49d6b3500906101223p749a8aefva5e6fa7c679c267b@mail.gmail.com> References: <49d6b3500906101223p749a8aefva5e6fa7c679c267b@mail.gmail.com> Message-ID: <7E8AD77B-6267-49A5-BA0E-F1169A53CE23@cs.toronto.edu> On 10-Jun-09, at 3:23 PM, G?khan SEVER wrote: > I am very off-the-topic, sorry about that first, but I know most of > the > people in this list are students / scientists. Just want to know a few > opinions upon how you manage references (following the list of > references in > the end of articles, books, etc... or building your owns). Some > articles are > really goes beyond limits on this. I have seen a couple articles > titles with > at least 200 citations attached in the end, and makes me confused > easily > while trying to focus on the core of an article. > > What is your solution for this issue? Do you use a web-based system > a tool > like Zotero? An open-source recommendations are more appreciated. BibDesk seems to be preferred by lots of people (Mac only but it is open source). CiteULike has a fairly nice interface and the added perk of being "in the cloud". Plus it can export to .bib which can then be imported by BibDesk or anything that speaks BibTeX for keeping/ searching/viewing an offline copy. David From cournape at gmail.com Wed Jun 10 21:32:40 2009 From: cournape at gmail.com (David Cournapeau) Date: Thu, 11 Jun 2009 10:32:40 +0900 Subject: [Numpy-discussion] F2PY error with G95 on Mac OS 10.5 In-Reply-To: <283efb240906101304m7a12fd52k540d2e19e693beca@mail.gmail.com> References: <283efb240906101044v58200204id267b0bae59946ca@mail.gmail.com> <799406d60906101259u4b626ec3t7ce26c0eeedb44c7@mail.gmail.com> <283efb240906101304m7a12fd52k540d2e19e693beca@mail.gmail.com> Message-ID: <5b8d13220906101832vf83ae81jc2d0a507af58563d@mail.gmail.com> On Thu, Jun 11, 2009 at 5:04 AM, Samir Unni wrote: > On Wed, Jun 10, 2009 at 2:59 PM, Adam Mercer wrote: >> On Wed, Jun 10, 2009 at 12:44, Samir Unni wrote: >> >>> I'm trying to use F2PY on Mac OS 10.5 with G95, but I'm getting the >>> error "g95: unrecognized option '-shared'". I tried modifying the >>> NumPy code to use the correct "-dynamic" flag, rather than the >>> "-shared" flag. 
While that does allow for F2PY to successfully >>> execute, I get the error >>> >>> Traceback (most recent call last): >>> ?File "", line 1, in >>> ImportError: dlopen(/Users/srunni/src/pdb2pqr/pdb2pqr/tinker/src/tinker/source/ese.so, >>> 2): no suitable image found. ?Did find: >>> ? ? ? ?/Users/srunni/src/pdb2pqr/pdb2pqr/tinker/src/tinker/source/ese.so: >>> unknown file type, first eight bytes: 0x80 0xC0 0x4F 0x00 0xEB 0x57 >>> 0xE0 0x8F >>> >>> when I attempt to import the generated module. Any ideas on how to fix this? >> >> AFAIK g95 is not supported by numpy distutils on Mac OS X. No it is not, at least not in your configuration: g95 cannot build universal binaries, and I think the OP error is caused by this. Gfortran is certainly supported. I don't know about fink and darwin ports, but the gfortran compiler available at http://r.research.att.com/tools/ is the recommended one. David From gokhansever at gmail.com Wed Jun 10 23:29:30 2009 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_SEVER?=) Date: Wed, 10 Jun 2009 22:29:30 -0500 Subject: [Numpy-discussion] [OFFTOPIC] Reference management for articles In-Reply-To: <7E8AD77B-6267-49A5-BA0E-F1169A53CE23@cs.toronto.edu> References: <49d6b3500906101223p749a8aefva5e6fa7c679c267b@mail.gmail.com> <7E8AD77B-6267-49A5-BA0E-F1169A53CE23@cs.toronto.edu> Message-ID: <49d6b3500906102029p5fe0f754w68ac67fb860ce48c@mail.gmail.com> Thanks for sharing your ideas on this subject. What I am most likely going to do is, do a test-drive for each mentioned tools, except the ones for MacOS :) since I use Linux (Fedora) almost all the time. I also use OO for composing, and once in a while MS Office tools. I may give 'LyX' a try since it is mentioned alot (or I am just coming across publishings where people use LyX) G?khan -------------- next part -------------- An HTML attachment was scrubbed... URL: From tpk at kraussfamily.org Wed Jun 10 23:58:09 2009 From: tpk at kraussfamily.org (Tom K.) Date: Wed, 10 Jun 2009 20:58:09 -0700 (PDT) Subject: [Numpy-discussion] Howto vectorise a dot product ? In-Reply-To: References: Message-ID: <23974947.post@talk.nabble.com> bruno Piguet wrote: > > Can someone point me to a doc on dot product vectorisation ? > As I posted in the past you can try this one liner: " numpy.array(map(numpy.dot, a, b)) that works for matrix multiply if a, b are (n, 3, 3). " This would also work if a is (n, 3, 3) and b is (n, 3, 1) URL http://www.nabble.com/Array-of-matrices---Inverse-and-dot-td21666949.html#a21670624 Maybe write a loop of length 3? For example, taking a linear combination of columns, out = np.zeros(a.shape[:-1], a.dtype) for i in range(a.shape[-1]): out += a[..., i]*b[..., i, :] Now we can vectorize that loop - here's one with no loop that expands the right array to a larger intermediate array via broadcasting, does elementwise multiplication, and then sums along the appropriate dimension: a=np.random.randn(3,2,2) b=np.random.randn(3,2,1) (a[..., np.newaxis]*b[..., np.newaxis, :, :]).sum(len(a.shape)-1) I think I would still prefer a C implementation ... this isn't terribly readable or optimal (maybe the numeric police can help me out here....! Help!) - Tom -- View this message in context: http://www.nabble.com/Howto-vectorise-a-dot-product---tp23949253p23974947.html Sent from the Numpy-discussion mailing list archive at Nabble.com. 
From jsseabold at gmail.com Thu Jun 11 00:18:15 2009 From: jsseabold at gmail.com (Skipper Seabold) Date: Thu, 11 Jun 2009 00:18:15 -0400 Subject: [Numpy-discussion] [OFFTOPIC] Reference management for articles In-Reply-To: <49d6b3500906102029p5fe0f754w68ac67fb860ce48c@mail.gmail.com> References: <49d6b3500906101223p749a8aefva5e6fa7c679c267b@mail.gmail.com> <7E8AD77B-6267-49A5-BA0E-F1169A53CE23@cs.toronto.edu> <49d6b3500906102029p5fe0f754w68ac67fb860ce48c@mail.gmail.com> Message-ID: On Wed, Jun 10, 2009 at 11:29 PM, G?khan SEVER wrote: > Thanks for sharing your ideas on this subject. > > What I am most likely going to do is, do a test-drive for each mentioned > tools, except the ones for MacOS :) since I use Linux (Fedora) almost all > the time. > > I also use OO for composing, and once in a while MS Office tools. I may give > 'LyX' a try since it is mentioned alot (or I am just coming across > publishings where people use LyX) > FWIW since you're using Linux, I use KBibTeX (if you use KDE or even if not) and LyX. With KBibTeX I just get Google Scholar to output BibTeX along with the articles and edit the source as I need. I don't know about journal quality papers (or support for very complex mathematics so I've heard), but LyX suits my needs (presentations, homeworks, papers, CV) perfectly and you can always modify the source, import or write your own class, or input raw TeX if you need to change some behavior. Not to mention the (configurable) keyboard shortcuts... Cheers, Skipper From charlesr.harris at gmail.com Thu Jun 11 00:18:39 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 10 Jun 2009 22:18:39 -0600 Subject: [Numpy-discussion] Datetime branch Message-ID: Hi Travis, I looked through the recent commits to the datetime branch and, as I'm working on cleaning up arraytypes I would appreciate it if you can merge up your changes there as soon as possible to minimize conflicts. Also, this change looks like a reversion of current trunk to something older: @@ -2688,10 +2690,8 @@ goto err; } - /* - * PyExc_Exception should catch all the standard errors that are - * now raised instead of the string exception "multiarray.error". - * This is for backward compatibility with existing code. - */ - PyDict_SetItemString (d, "error", PyExc_Exception); + /* Fixme: we might want to remove this old string exception string */ + s = PyString_FromString("multiarray.error"); + PyDict_SetItemString (d, "error", s); + Py_DECREF(s); s = PyString_FromString("3.0"); PyDict_SetItemString(d, "__version__", s); And I am concerned that there might be other such cases. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Thu Jun 11 00:21:14 2009 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 10 Jun 2009 23:21:14 -0500 Subject: [Numpy-discussion] Datetime branch In-Reply-To: References: Message-ID: <3d375d730906102121u6094d9f9vfde9ef8fc150e602@mail.gmail.com> On Wed, Jun 10, 2009 at 23:18, Charles R Harris wrote: > Hi Travis, > > I looked through the recent commits to the datetime branch and, as I'm > working on cleaning up arraytypes I would appreciate it if you can merge up > your changes there as soon as possible to minimize conflicts. Also, this > change looks like a reversion of current trunk to something older: Probably my fault. I tried to keep my git branch synched and rebased against svn/trunk, but that may not have worked. 
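(For anyone repeating the workflow: the usual sequence against a git-svn mirror is roughly

    git svn rebase              # pull new svn revisions into the tracking branch
    git rebase trunk datetime   # replay the topic branch on top of them

where "trunk" and "datetime" stand in for whatever the local branches are actually called. If a rebase quietly reuses an old conflict resolution, stale hunks like the one above can survive it.)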
-- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From shivraj.ms at gmail.com Thu Jun 11 07:19:38 2009 From: shivraj.ms at gmail.com (Shivaraj M S) Date: Thu, 11 Jun 2009 16:49:38 +0530 Subject: [Numpy-discussion] all and alltrue Message-ID: <2c1c314c0906110419p4d900bafne8841a00af1823e1@mail.gmail.com> Hello, I just came across 'all' and 'alltrue' functions in fromnumeric.py They are one and same. IMHO, alltrue = all would be sufficient. Regards _______________ Shivaraj -- Regards _______________ Shivaraj -------------- next part -------------- An HTML attachment was scrubbed... URL: From srunni at gmail.com Thu Jun 11 11:56:33 2009 From: srunni at gmail.com (Samir Unni) Date: Thu, 11 Jun 2009 10:56:33 -0500 Subject: [Numpy-discussion] F2PY error with G95 on Mac OS 10.5 In-Reply-To: <5b8d13220906101832vf83ae81jc2d0a507af58563d@mail.gmail.com> References: <283efb240906101044v58200204id267b0bae59946ca@mail.gmail.com> <799406d60906101259u4b626ec3t7ce26c0eeedb44c7@mail.gmail.com> <283efb240906101304m7a12fd52k540d2e19e693beca@mail.gmail.com> <5b8d13220906101832vf83ae81jc2d0a507af58563d@mail.gmail.com> Message-ID: <283efb240906110856i3b5fbf44sd11f8825c0a2cdc6@mail.gmail.com> On Wed, Jun 10, 2009 at 8:32 PM, David Cournapeau wrote: > On Thu, Jun 11, 2009 at 5:04 AM, Samir Unni wrote: >> On Wed, Jun 10, 2009 at 2:59 PM, Adam Mercer wrote: >>> On Wed, Jun 10, 2009 at 12:44, Samir Unni wrote: >>> >>>> I'm trying to use F2PY on Mac OS 10.5 with G95, but I'm getting the >>>> error "g95: unrecognized option '-shared'". I tried modifying the >>>> NumPy code to use the correct "-dynamic" flag, rather than the >>>> "-shared" flag. While that does allow for F2PY to successfully >>>> execute, I get the error >>>> >>>> Traceback (most recent call last): >>>> ?File "", line 1, in >>>> ImportError: dlopen(/Users/srunni/src/pdb2pqr/pdb2pqr/tinker/src/tinker/source/ese.so, >>>> 2): no suitable image found. ?Did find: >>>> ? ? ? ?/Users/srunni/src/pdb2pqr/pdb2pqr/tinker/src/tinker/source/ese.so: >>>> unknown file type, first eight bytes: 0x80 0xC0 0x4F 0x00 0xEB 0x57 >>>> 0xE0 0x8F >>>> >>>> when I attempt to import the generated module. Any ideas on how to fix this? >>> >>> AFAIK g95 is not supported by numpy distutils on Mac OS X. > > No it is not, at least not in your configuration: g95 cannot build > universal binaries, and I think the OP error is caused by this. > Gfortran is certainly supported. Running diagnose.py gives me this error: Couldn't match compiler version for 'GNU Fortran (GCC) 4.3.3\nCopyright (C) 2008 Free Software Foundation, Inc.\n\nGNU Fortran comes with NO WARRANTY, to the extent permitted by law.\nYou may redistribute copies of GNU Fortran\nunder the terms of the GNU General Public License.\nFor more information about these matters, see the file named COPYING\n' Is this the source of the problem? I looked at numpy/distutils/tests/test_fcompiler_gnu.py, and I found this: gfortran_version_strings = [ ('GNU Fortran 95 (GCC 4.0.3 20051023 (prerelease) (Debian 4.0.2-3))', '4.0.3'), ('GNU Fortran 95 (GCC) 4.1.0', '4.1.0'), ('GNU Fortran 95 (GCC) 4.2.0 20060218 (experimental)', '4.2.0'), ('GNU Fortran (GCC) 4.3.0 20070316 (experimental)', '4.3.0'), ] My GNU Fortran version string is "GNU Fortran (GCC) 4.3.3". However, even after adding it to that list and reinstalling, the issue persisted. 
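The banner change is easy to demonstrate with a toy pattern (this is not the actual numpy.distutils regex, only an illustration of the failure mode):

    import re
    pat = re.compile(r'GNU Fortran\s+95\s+\(GCC\)\s*([-0-9.]+)')
    for s in ('GNU Fortran 95 (GCC) 4.1.0', 'GNU Fortran (GCC) 4.3.3'):
        m = pat.search(s)
        print repr(s), '->', m.group(1) if m else None

The 4.3-era banner dropped the "95", so a matcher keyed on it extracts no version at all; if the pattern in numpy/distutils/fcompiler/gnu.py predates the banner change, editing the test file alone cannot help.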
Thanks, Samir From oliphant at enthought.com Thu Jun 11 12:29:04 2009 From: oliphant at enthought.com (Travis Oliphant) Date: Thu, 11 Jun 2009 11:29:04 -0500 Subject: [Numpy-discussion] Datetime branch In-Reply-To: References: Message-ID: <7ECF513D-137D-4016-B832-6FDF9E6D20C4@enthought.com> On Jun 10, 2009, at 11:18 PM, Charles R Harris wrote: > Hi Travis, > > I looked through the recent commits to the datetime branch and, as > I'm working on cleaning up arraytypes I would appreciate it if you > can merge up your changes there as soon as possible to minimize > conflicts. Also, this change looks like a reversion of current trunk > to something older: > > @@ -2688,10 +2690,8 @@ > goto err; > } > - /* > - * PyExc_Exception should catch all the standard errors that are > - * now raised instead of the string exception "multiarray.error". > > - * This is for backward compatibility with existing code. > - */ > - PyDict_SetItemString (d, "error", PyExc_Exception); > + /* Fixme: we might want to remove this old string exception > string */ > > + s = PyString_FromString("multiarray.error"); > + PyDict_SetItemString (d, "error", s); > + Py_DECREF(s); > s = PyString_FromString("3.0"); > PyDict_SetItemString(d, "__version__", s); > And I am concerned that there might be other such cases. There may be. I took Robert's git branch and tried to re-base it. But, I'm a git neophyte and didn't know what I was doing. I tried to eliminate the most obvious cases where is trunk was out-of-date, but obviously missed some. Thanks for checking. I don't want to stomp on your work, but I don't know what you mean by cleaning up arraytypes? I am hoping to get the datetime changes merged to trunk by the end of the month. There are a couple of potential issues that may slow that down because of the change to the PyArray_Descr structure (we need to add a metadata object at the end). We may need to bump up the version number of NumPy to encourage C- extensions to re-compile, but I think it will work without that. -Travis -------------- next part -------------- An HTML attachment was scrubbed... URL: From cournape at gmail.com Thu Jun 11 12:58:14 2009 From: cournape at gmail.com (David Cournapeau) Date: Fri, 12 Jun 2009 01:58:14 +0900 Subject: [Numpy-discussion] F2PY error with G95 on Mac OS 10.5 In-Reply-To: <283efb240906110856i3b5fbf44sd11f8825c0a2cdc6@mail.gmail.com> References: <283efb240906101044v58200204id267b0bae59946ca@mail.gmail.com> <799406d60906101259u4b626ec3t7ce26c0eeedb44c7@mail.gmail.com> <283efb240906101304m7a12fd52k540d2e19e693beca@mail.gmail.com> <5b8d13220906101832vf83ae81jc2d0a507af58563d@mail.gmail.com> <283efb240906110856i3b5fbf44sd11f8825c0a2cdc6@mail.gmail.com> Message-ID: <5b8d13220906110958m49bb236ck6a53b5f386e21df2@mail.gmail.com> On Fri, Jun 12, 2009 at 12:56 AM, Samir Unni wrote: > On Wed, Jun 10, 2009 at 8:32 PM, David Cournapeau wrote: >> On Thu, Jun 11, 2009 at 5:04 AM, Samir Unni wrote: >>> On Wed, Jun 10, 2009 at 2:59 PM, Adam Mercer wrote: >>>> On Wed, Jun 10, 2009 at 12:44, Samir Unni wrote: >>>> >>>>> I'm trying to use F2PY on Mac OS 10.5 with G95, but I'm getting the >>>>> error "g95: unrecognized option '-shared'". I tried modifying the >>>>> NumPy code to use the correct "-dynamic" flag, rather than the >>>>> "-shared" flag. 
While that does allow for F2PY to successfully >>>>> execute, I get the error >>>>> >>>>> Traceback (most recent call last): >>>>> File "<stdin>", line 1, in <module> >>>>> ImportError: dlopen(/Users/srunni/src/pdb2pqr/pdb2pqr/tinker/src/tinker/source/ese.so, >>>>> 2): no suitable image found. Did find: >>>>>         /Users/srunni/src/pdb2pqr/pdb2pqr/tinker/src/tinker/source/ese.so: >>>>> unknown file type, first eight bytes: 0x80 0xC0 0x4F 0x00 0xEB 0x57 >>>>> 0xE0 0x8F >>>>> >>>>> when I attempt to import the generated module. Any ideas on how to fix this? >>>> >>>> AFAIK g95 is not supported by numpy distutils on Mac OS X. >> >> No it is not, at least not in your configuration: g95 cannot build >> universal binaries, and I think the OP error is caused by this. >> Gfortran is certainly supported. > > Running diagnose.py gives me this error: Ignore diagnose.py for the time being - what does f2py say? > Is this the source of the problem? I looked at > numpy/distutils/tests/test_fcompiler_gnu.py, and I found this: > > gfortran_version_strings = [ >    ('GNU Fortran 95 (GCC 4.0.3 20051023 (prerelease) (Debian 4.0.2-3))', >     '4.0.3'), >    ('GNU Fortran 95 (GCC) 4.1.0', '4.1.0'), >    ('GNU Fortran 95 (GCC) 4.2.0 20060218 (experimental)', '4.2.0'), >    ('GNU Fortran (GCC) 4.3.0 20070316 (experimental)', '4.3.0'), > ] > > My GNU Fortran version string is "GNU Fortran (GCC) 4.3.3". However, > even after adding it to that list and reinstalling, the issue > persisted. Yes, this just looks like a unit test, so it won't change anything in numpy.distutils behavior. David From jsseabold at gmail.com Thu Jun 11 12:30:33 2009 From: jsseabold at gmail.com (Skipper Seabold) Date: Thu, 11 Jun 2009 12:30:33 -0400 Subject: [Numpy-discussion] Definitions of pv, fv, nper, pmt, and rate In-Reply-To: <456109.37185.qm@web52105.mail.re2.yahoo.com> References: <456109.37185.qm@web52105.mail.re2.yahoo.com> Message-ID: On Wed, Jun 10, 2009 at 1:00 AM, David Goldsmith wrote: > > --- On Tue, 6/9/09, Skipper Seabold wrote: > >> I have implemented the ipmt and ppmt functions that were >> "not >> implemented" in numpy.lib.financial as well as written some >> tests. > > Thanks! > >> ipmt is one of the functions where there was a discrepancy >> between >> what OO and Excel report for the beginning of period >> payment >> assumptions and what Gnumeric and Kspread report as stated >> in the >> OpenFormula document referenced above (I discovered errors >> in this >> document as well as the openoffice documents referenced btw >> but they >> become obvious when you work these problems out). > > And the nightmare worsens (IMO). > >> OpenFormula lists >> the Gnumeric/Kspread as the "result" in the document, but >> there is >> still a question to which is correct. Well, I was >> able to derive both >> results, and as I suspected the Gnumeric/Kspread was based >> on an >> incorrect assumption (or a mistake in implementation) not a >> different >> formula. My point with the derivations wasn't to >> include them in the >> documentation, but rather to find out what assumptions are >> being made >> and then deduce which results are correct. In the >> cases of these >> simple spreadsheet functions I think it should be obvious >> if it's >> right or wrong. > > OK, I concede defeat: if it is the wisdom of the PTB that numpy.financial be retained, I will stop messing w/ their help doc, 'cause I'm clearly in over my head. I wouldn't concede defeat just yet.
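For anyone trying to follow the modified-internal-rate-of-return question that comes up just below, the textbook definition is short enough to sketch from scratch. This is an illustration of the standard formula only; it is not the numpy.lib.financial.mirr implementation, nor Skipper's patch:

import numpy as np

def mirr_sketch(values, finance_rate, reinvest_rate):
    # Textbook MIRR: future value of the inflows (compounded at the
    # reinvestment rate) over present value of the outflows (discounted
    # at the finance rate), annualized over n - 1 periods.
    values = np.asarray(values, dtype=float)
    n = len(values)
    periods = np.arange(n)
    inflows = np.where(values > 0, values, 0.0)
    outflows = np.where(values < 0, values, 0.0)
    fv = np.sum(inflows * (1 + reinvest_rate) ** (n - 1 - periods))
    pv = np.sum(outflows / (1 + finance_rate) ** periods)
    return (fv / -pv) ** (1.0 / (n - 1)) - 1

# A common worked example; prints roughly 0.1261 (12.61%):
print mirr_sketch([-120000, 39000, 30000, 21000, 37000, 46000], 0.10, 0.12)

Written out this way, the assumptions (which flows compound, to what period, at which rate) are explicit, and those assumptions are exactly where the competing "formulas" disagree.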
A few weeks ago I didn't even know numpy.financial existed, and I appreciate your calling attention to the possibility that there could be problems. With this in mind. It looks like just trying to follow these formulas has led to some problems. There are different "formulas" for the modified internal rate of return out there, but they're all trying to say the same thing just not always very clearly, and this led to np.mirr being incorrect (though it looks like what the formula in the OpenFormula doc, for instance, says just not what it means and does). I have filed a ticket here with a patch and tests. Please let me know if I haven't done the patching and tests correctly. Skipper From srunni at gmail.com Thu Jun 11 13:06:32 2009 From: srunni at gmail.com (Samir Unni) Date: Thu, 11 Jun 2009 12:06:32 -0500 Subject: [Numpy-discussion] F2PY error with G95 on Mac OS 10.5 In-Reply-To: <5b8d13220906110958m49bb236ck6a53b5f386e21df2@mail.gmail.com> References: <283efb240906101044v58200204id267b0bae59946ca@mail.gmail.com> <799406d60906101259u4b626ec3t7ce26c0eeedb44c7@mail.gmail.com> <283efb240906101304m7a12fd52k540d2e19e693beca@mail.gmail.com> <5b8d13220906101832vf83ae81jc2d0a507af58563d@mail.gmail.com> <283efb240906110856i3b5fbf44sd11f8825c0a2cdc6@mail.gmail.com> <5b8d13220906110958m49bb236ck6a53b5f386e21df2@mail.gmail.com> Message-ID: <283efb240906111006p1243dcf9v4a65658f131bb7a9@mail.gmail.com> On Thu, Jun 11, 2009 at 11:58 AM, David Cournapeau wrote: > Ignore diagnose.py for the time being - what does f2py says ? running build running config_fc running build_src building extension "ese" sources f2py options: [] adding '/tmp/tmpFE_lPq/src.macosx-10.5-i386-2.5/fortranobject.c' to sources. adding '/tmp/tmpFE_lPq/src.macosx-10.5-i386-2.5' to include_dirs. creating /tmp/tmpFE_lPq creating /tmp/tmpFE_lPq/src.macosx-10.5-i386-2.5 copying /System/Library/Frameworks/Python.framework/Versions/2.5/Extras/lib/python/numpy/f2py/src/fortranobject.c -> /tmp/tmpFE_lPq/src.macosx-10.5-i386-2.5 copying /System/Library/Frameworks/Python.framework/Versions/2.5/Extras/lib/python/numpy/f2py/src/fortranobject.h -> /tmp/tmpFE_lPq/src.macosx-10.5-i386-2.5 running build_ext customize UnixCCompiler customize UnixCCompiler using build_ext customize NAGFCompiler customize AbsoftFCompiler customize IbmFCompiler Could not locate executable g77 Could not locate executable f77 Could not locate executable f95 customize GnuFCompiler customize Gnu95FCompiler Couldn't match compiler version for 'GNU Fortran (GCC) 4.3.3\nCopyright (C) 2008 Free Software Foundation, Inc.\n\nGNU Fortran comes with NO WARRANTY, to the extent permitted by law.\nYou may redistribute copies of GNU Fortran\nunder the terms of the GNU General Public License.\nFor more information about these matters, see the file named COPYING\n' customize G95FCompiler customize G95FCompiler customize G95FCompiler using build_ext Removing build directory /tmp/tmpFE_lPq The odd thing is that I'm running my locally installed copy of F2PY, but it is still using files from the system-wide version (installed in /System) of NumPy. From jrennie at gmail.com Thu Jun 11 13:12:22 2009 From: jrennie at gmail.com (Jason Rennie) Date: Thu, 11 Jun 2009 13:12:22 -0400 Subject: [Numpy-discussion] performance matrix multiplication vs. 
matlab In-Reply-To: <4A2D281E.3050403@ar.media.kyoto-u.ac.jp> References: <4A295A7D.1040407@ar.media.kyoto-u.ac.jp> <4A2B8A51.3070409@ar.media.kyoto-u.ac.jp> <20090607101204.GB20612@phare.normalesup.org> <20090608053210.GC5032@phare.normalesup.org> <4A2C9EF9.7030302@ar.media.kyoto-u.ac.jp> <75c31b2a0906080533t29af5e2k6aef04136a3a5a5e@mail.gmail.com> <4A2D0A3B.9030105@ar.media.kyoto-u.ac.jp> <75c31b2a0906080740rf4452d2vaa3a3a6964207621@mail.gmail.com> <4A2D281E.3050403@ar.media.kyoto-u.ac.jp> Message-ID: <75c31b2a0906111012x22a146act5d7e5d14ccca9254@mail.gmail.com> On Mon, Jun 8, 2009 at 11:02 AM, David Cournapeau < david at ar.media.kyoto-u.ac.jp> wrote: > Isn't it true for any general framework who enjoys some popularity :) Yup :) I think there are cases where gradient methods are not applicable > (latent models where the complete data Y cannot be split into > observations-hidden (O, H) variables), although I am not sure that's a > very common case in machine learning, > I won't argue with that. My bias has certainly been strongly influenced by the type of problems I've been exposed to. It'd be interesting to hear of a problem where one can't separate observed/hidden variables :) Cheers, Jason -- Jason Rennie Research Scientist, ITA Software 617-714-2645 http://www.itasoftware.com/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Thu Jun 11 13:20:14 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 11 Jun 2009 11:20:14 -0600 Subject: [Numpy-discussion] Datetime branch In-Reply-To: <7ECF513D-137D-4016-B832-6FDF9E6D20C4@enthought.com> References: <7ECF513D-137D-4016-B832-6FDF9E6D20C4@enthought.com> Message-ID: On Thu, Jun 11, 2009 at 10:29 AM, Travis Oliphant wrote: > > On Jun 10, 2009, at 11:18 PM, Charles R Harris wrote: > > Hi Travis, > > I looked through the recent commits to the datetime branch and, as I'm > working on cleaning up arraytypes I would appreciate it if you can merge up > your changes there as soon as possible to minimize conflicts. Also, this > change looks like a reversion of current trunk to something older: > > @@ -2688,10 +2690,8 @@ > goto err; > } > - /* > - * PyExc_Exception should catch all the standard errors that are > - * now raised instead of the string exception "multiarray.error". > > - * This is for backward compatibility with existing code. > - */ > - PyDict_SetItemString (d, "error", PyExc_Exception); > + /* Fixme: we might want to remove this old string exception string */ > > + s = PyString_FromString("multiarray.error"); > + PyDict_SetItemString (d, "error", s); > + Py_DECREF(s); > s = PyString_FromString("3.0"); > PyDict_SetItemString(d, "__version__", s); > > And I am concerned that there might be other such cases. > > > There may be. I took Robert's git branch and tried to re-base it. But, > I'm a git neophyte and didn't know what I was doing. I tried to eliminate > the most obvious cases where is trunk was out-of-date, but obviously missed > some. > > Thanks for checking. I don't want to stomp on your work, but I don't know > what you mean by cleaning up arraytypes? > Reorganize a bit, break the long repeat lines, more prominent labels, c style, that sort of thing. I got most of the c style cleanups done before the 1.3 release but didn't quite finish all the files that are now broken up in multiarray. > I am hoping to get the datetime changes merged to trunk by the end of the > month. 
There are a couple of potential issues that may slow that down > because of the change to the PyArray_Descr structure (we need to add a > metadata object at the end). > Cleaning up the functions associated with the descriptor handling is on my list somewhere, but not near the top at the moment. Too much work ;) > > We may need to bump up the version number of NumPy to encourage > C-extensions to re-compile, but I think it will work without that. > The API is going to be bumped in any case and it may be that now is the time to add in any ABI changes. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From d_l_goldsmith at yahoo.com Thu Jun 11 13:23:53 2009 From: d_l_goldsmith at yahoo.com (David Goldsmith) Date: Thu, 11 Jun 2009 10:23:53 -0700 (PDT) Subject: [Numpy-discussion] Definitions of pv, fv, nper, pmt, and rate Message-ID: <316776.49606.qm@web52102.mail.re2.yahoo.com> Thanks, both for your kind words and for the patch; I don't suppose I could prevail upon you to be Team Lead for doc-ing numpy.finacial? DG --- On Thu, 6/11/09, Skipper Seabold wrote: > > OK, I concede defeat: if it is the wisdom of the PTB > that numpy.financial be retained, I will stop messing w/ > their help doc, 'cause I'm clearly in over my head. > > I wouldn't concede defeat just yet.? A few weeks ago I > didn't even > know numpy.financial existed, and I appreciate your calling > attention > to the possibility that there could be problems. > > With this in mind.? It looks like just trying to > follow these formulas > has led to some problems.? There are different > "formulas" for the > modified internal rate of return out there, but they're all > trying to > say the same thing just not always very clearly, and this > led to > np.mirr being incorrect (though it looks like what the > formula in the > OpenFormula doc, for instance, says just not what it means > and does). > > I have filed a ticket here > with a > patch and tests. > Please let me know if I haven't done the patching and tests > correctly. > > Skipper > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From charlesr.harris at gmail.com Thu Jun 11 13:29:06 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 11 Jun 2009 11:29:06 -0600 Subject: [Numpy-discussion] Datetime branch In-Reply-To: <7ECF513D-137D-4016-B832-6FDF9E6D20C4@enthought.com> References: <7ECF513D-137D-4016-B832-6FDF9E6D20C4@enthought.com> Message-ID: On Thu, Jun 11, 2009 at 10:29 AM, Travis Oliphant wrote: > > On Jun 10, 2009, at 11:18 PM, Charles R Harris wrote: > > Hi Travis, > > I looked through the recent commits to the datetime branch and, as I'm > working on cleaning up arraytypes I would appreciate it if you can merge up > your changes there as soon as possible to minimize conflicts. Also, this > change looks like a reversion of current trunk to something older: > > @@ -2688,10 +2690,8 @@ > goto err; > } > - /* > - * PyExc_Exception should catch all the standard errors that are > - * now raised instead of the string exception "multiarray.error". > > - * This is for backward compatibility with existing code. 
> - */ > - PyDict_SetItemString (d, "error", PyExc_Exception); > + /* Fixme: we might want to remove this old string exception string */ > > + s = PyString_FromString("multiarray.error"); > + PyDict_SetItemString (d, "error", s); > + Py_DECREF(s); > s = PyString_FromString("3.0"); > PyDict_SetItemString(d, "__version__", s); > > And I am concerned that there might be other such cases. > > > There may be. I took Robert's git branch and tried to re-base it. But, > I'm a git neophyte and didn't know what I was doing. I tried to eliminate > the most obvious cases where is trunk was out-of-date, but obviously missed > some. > > Thanks for checking. I don't want to stomp on your work, but I don't know > what you mean by cleaning up arraytypes? > Oh, and slipping the new types in between 64 bit integers and floats is a bit iffy. I've always been a bit bothered by numpy's dependence on the linear order of the types as it is hard to maintain when new types are added, and user types don't seem quite adequate. I don't know what we should do here but it might be worth thinking about some other way of indicating the relation/priority of the different types. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From jsseabold at gmail.com Thu Jun 11 13:32:39 2009 From: jsseabold at gmail.com (Skipper Seabold) Date: Thu, 11 Jun 2009 13:32:39 -0400 Subject: [Numpy-discussion] Definitions of pv, fv, nper, pmt, and rate In-Reply-To: <316776.49606.qm@web52102.mail.re2.yahoo.com> References: <316776.49606.qm@web52102.mail.re2.yahoo.com> Message-ID: On Thu, Jun 11, 2009 at 1:23 PM, David Goldsmith wrote: > > Thanks, both for your kind words and for the patch; I don't suppose I could prevail upon you to be Team Lead for doc-ing numpy.finacial? > > DG > Sure why not. From robert.kern at gmail.com Thu Jun 11 13:34:20 2009 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 11 Jun 2009 12:34:20 -0500 Subject: [Numpy-discussion] Datetime branch In-Reply-To: References: <7ECF513D-137D-4016-B832-6FDF9E6D20C4@enthought.com> Message-ID: <3d375d730906111034j9151d41o3a6775d22101fa6f@mail.gmail.com> On Thu, Jun 11, 2009 at 12:29, Charles R Harris wrote: > Oh, and slipping the new types in between 64 bit integers and floats is a > bit iffy. Where, specifically? There are several linear orders of types in numpy. I tried to be careful to do the right thing in each. The enum numbers are after NPY_VOID, of course, for compatibility. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From charlesr.harris at gmail.com Thu Jun 11 13:39:39 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 11 Jun 2009 11:39:39 -0600 Subject: [Numpy-discussion] Datetime branch In-Reply-To: <3d375d730906111034j9151d41o3a6775d22101fa6f@mail.gmail.com> References: <7ECF513D-137D-4016-B832-6FDF9E6D20C4@enthought.com> <3d375d730906111034j9151d41o3a6775d22101fa6f@mail.gmail.com> Message-ID: On Thu, Jun 11, 2009 at 11:34 AM, Robert Kern wrote: > On Thu, Jun 11, 2009 at 12:29, Charles R > Harris wrote: > > Oh, and slipping the new types in between 64 bit integers and floats is a > > bit iffy. > > Where, specifically? There are several linear orders of types in > numpy. I tried to be careful to do the right thing in each. The enum > numbers are after NPY_VOID, of course, for compatibility. > I noticed. 
I'm not saying it's wrong, just that a linear order lacks descriptive power and is difficult to maintain. I expect you ran into that problem when trying to make everything work as you wanted. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Thu Jun 11 13:47:26 2009 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 11 Jun 2009 12:47:26 -0500 Subject: [Numpy-discussion] Datetime branch In-Reply-To: References: <7ECF513D-137D-4016-B832-6FDF9E6D20C4@enthought.com> <3d375d730906111034j9151d41o3a6775d22101fa6f@mail.gmail.com> Message-ID: <3d375d730906111047r60fa90c4r8c89c269b5f66e4e@mail.gmail.com> On Thu, Jun 11, 2009 at 12:39, Charles R Harris wrote: > > > On Thu, Jun 11, 2009 at 11:34 AM, Robert Kern wrote: >> >> On Thu, Jun 11, 2009 at 12:29, Charles R >> Harris wrote: >> > Oh, and slipping the new types in between 64 bit integers and floats is >> > a >> > bit iffy. >> >> Where, specifically? There are several linear orders of types in >> numpy. I tried to be careful to do the right thing in each. The enum >> numbers are after NPY_VOID, of course, for compatibility. > > I noticed. I'm not saying it's wrong, just that a linear order lacks > descriptive power and is difficult to maintain. I expect you ran into that > problem when trying to make everything work as you wanted. Yes. Now, which place am I slipping in the new types between 64-bit integers and floats? -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From charlesr.harris at gmail.com Thu Jun 11 14:06:41 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 11 Jun 2009 12:06:41 -0600 Subject: [Numpy-discussion] Datetime branch In-Reply-To: <3d375d730906111047r60fa90c4r8c89c269b5f66e4e@mail.gmail.com> References: <7ECF513D-137D-4016-B832-6FDF9E6D20C4@enthought.com> <3d375d730906111034j9151d41o3a6775d22101fa6f@mail.gmail.com> <3d375d730906111047r60fa90c4r8c89c269b5f66e4e@mail.gmail.com> Message-ID: On Thu, Jun 11, 2009 at 11:47 AM, Robert Kern wrote: > On Thu, Jun 11, 2009 at 12:39, Charles R > Harris wrote: > > > > > > On Thu, Jun 11, 2009 at 11:34 AM, Robert Kern > wrote: > >> > >> On Thu, Jun 11, 2009 at 12:29, Charles R > >> Harris wrote: > >> > Oh, and slipping the new types in between 64 bit integers and floats > is > >> > a > >> > bit iffy. > >> > >> Where, specifically? There are several linear orders of types in > >> numpy. I tried to be careful to do the right thing in each. The enum > >> numbers are after NPY_VOID, of course, for compatibility. > > > > I noticed. I'm not saying it's wrong, just that a linear order lacks > > descriptive power and is difficult to maintain. I expect you ran into > that > > problem when trying to make everything work as you wanted. > > Yes. Now, which place am I slipping in the new types between 64-bit > integers and floats? > In the ufunc generator. But most of the macros use the type ordering and how do you control the promotion (or lack thereof) of the various types to/from the datetime types? There also seems to be some mechanism for raising errors that has been added, maybe to loops. I'm not clear on that, did you add some such mechanism? Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From robert.kern at gmail.com Thu Jun 11 14:18:27 2009 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 11 Jun 2009 13:18:27 -0500 Subject: [Numpy-discussion] Datetime branch In-Reply-To: References: <7ECF513D-137D-4016-B832-6FDF9E6D20C4@enthought.com> <3d375d730906111034j9151d41o3a6775d22101fa6f@mail.gmail.com> <3d375d730906111047r60fa90c4r8c89c269b5f66e4e@mail.gmail.com> Message-ID: <3d375d730906111118y5318cf29y6954d2ea8f4a3c47@mail.gmail.com> On Thu, Jun 11, 2009 at 13:06, Charles R Harris wrote: > > > On Thu, Jun 11, 2009 at 11:47 AM, Robert Kern wrote: >> >> On Thu, Jun 11, 2009 at 12:39, Charles R >> Harris wrote: >> > >> > >> > On Thu, Jun 11, 2009 at 11:34 AM, Robert Kern >> > wrote: >> >> >> >> On Thu, Jun 11, 2009 at 12:29, Charles R >> >> Harris wrote: >> >> > Oh, and slipping the new types in between 64 bit integers and floats >> >> > is >> >> > a >> >> > bit iffy. >> >> >> >> Where, specifically? There are several linear orders of types in >> >> numpy. I tried to be careful to do the right thing in each. The enum >> >> numbers are after NPY_VOID, of course, for compatibility. >> > >> > I noticed. I'm not saying it's wrong, just that a linear order lacks >> > descriptive power and is difficult to maintain. I expect you ran into >> > that >> > problem when trying to make everything work as you wanted. >> >> Yes. Now, which place am I slipping in the new types between 64-bit >> integers and floats? > > In the ufunc generator. This line from generate_umath.py? all = '?bBhHiIlLqQtTfdgFDGO' > But most of the macros use the type ordering Not quite. They use the order of the loops given to the ufunc. The order of the types in that string I think you are referring doesn't affect much. Basically, just the comparisons where every type has a loop. > and how > do you control the promotion (or lack thereof) of the various types to/from > the datetime types? PyArray_CanCastSafely() in convert_datatype.c. datetime and timedelta types cannot be auto-casted to or from any datatype. They can be explicitly cast, but ufuncs won't auto-cast them when trying to find the right loop. The datetime types are a bit unique in that they need to exclude certain combinations (e.g. datetime+datetime). Allowing auto-casts prevented me from doing that. In fact, the placement of the datetime typecodes in that string was a leftover from when I was trying to allow auto-casts between integers and datetime types. Now that I disallow them, the ordering can be changed. > There also seems to be some mechanism for raising errors that has been > added, maybe to loops. I'm not clear on that, did you add some such > mechanism? Not really. Object loops already had such a mechanism; I just extended that to do the same thing for the datetime types, too. You will be able to raise a Python exception in the datetime loops. Of course, you pay for that a little because that means that you can't release the GIL. I don't think that will be a substantial problem. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From charlesr.harris at gmail.com Thu Jun 11 14:44:32 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 11 Jun 2009 12:44:32 -0600 Subject: [Numpy-discussion] Datetime branch In-Reply-To: <3d375d730906111118y5318cf29y6954d2ea8f4a3c47@mail.gmail.com> References: <7ECF513D-137D-4016-B832-6FDF9E6D20C4@enthought.com> <3d375d730906111034j9151d41o3a6775d22101fa6f@mail.gmail.com> <3d375d730906111047r60fa90c4r8c89c269b5f66e4e@mail.gmail.com> <3d375d730906111118y5318cf29y6954d2ea8f4a3c47@mail.gmail.com> Message-ID: On Thu, Jun 11, 2009 at 12:18 PM, Robert Kern wrote: > On Thu, Jun 11, 2009 at 13:06, Charles R > Harris wrote: > > > > > > On Thu, Jun 11, 2009 at 11:47 AM, Robert Kern > wrote: > >> > >> On Thu, Jun 11, 2009 at 12:39, Charles R > >> Harris wrote: > >> > > >> > > >> > On Thu, Jun 11, 2009 at 11:34 AM, Robert Kern > >> > wrote: > >> >> > >> >> On Thu, Jun 11, 2009 at 12:29, Charles R > >> >> Harris wrote: > >> >> > Oh, and slipping the new types in between 64 bit integers and > floats > >> >> > is > >> >> > a > >> >> > bit iffy. > >> >> > >> >> Where, specifically? There are several linear orders of types in > >> >> numpy. I tried to be careful to do the right thing in each. The enum > >> >> numbers are after NPY_VOID, of course, for compatibility. > >> > > >> > I noticed. I'm not saying it's wrong, just that a linear order lacks > >> > descriptive power and is difficult to maintain. I expect you ran into > >> > that > >> > problem when trying to make everything work as you wanted. > >> > >> Yes. Now, which place am I slipping in the new types between 64-bit > >> integers and floats? > > > > In the ufunc generator. > > This line from generate_umath.py? > > all = '?bBhHiIlLqQtTfdgFDGO' > > > But most of the macros use the type ordering > > Not quite. They use the order of the loops given to the ufunc. The > order of the types in that string I think you are referring doesn't > affect much. Basically, just the comparisons where every type has a > loop. > > > and how > > do you control the promotion (or lack thereof) of the various types > to/from > > the datetime types? > > PyArray_CanCastSafely() in convert_datatype.c. datetime and timedelta > types cannot be auto-casted to or from any datatype. They can be > explicitly cast, but ufuncs won't auto-cast them when trying to find > the right loop. The datetime types are a bit unique in that they need > to exclude certain combinations (e.g. datetime+datetime). Allowing > auto-casts prevented me from doing that. > The implementation of PyArray_CanCastSafely illustrates two other points that bother me. 1) The rules are encoded in the program logic. This makes them difficult to find or to see what they are and requires editing the code to make changes. 2) Some of the rules are maintained by the types. That is even more obscure and reminiscent of the "friend" functions in c++ that encode the same sort of thing when the operators are overloaded. I never did like that as a general system ;) > In fact, the placement of the datetime typecodes in that string was a > leftover from when I was trying to allow auto-casts between integers > and datetime types. Now that I disallow them, the ordering can be > changed. > > > There also seems to be some mechanism for raising errors that has been > > added, maybe to loops. I'm not clear on that, did you add some such > > mechanism? > > Not really. Object loops already had such a mechanism; I just extended > that to do the same thing for the datetime types, too. 
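An aside for readers trying to follow the casting discussion: the rules that PyArray_CanCastSafely implements can at least be inspected from Python, which makes their one-way nature easy to see. A quick sanity check with plain numpy (nothing from the datetime branch is needed):

import numpy as np

print np.can_cast(np.int32, np.float64)       # True: promoted automatically
print np.can_cast(np.float64, np.int32)       # False: would lose information
print np.can_cast(np.complex128, np.float64)  # False: would drop the imaginary part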
You will be > able to raise a Python exception in the datetime loops. Of course, you > pay for that a little because that means that you can't release the > GIL. I don't think that will be a substantial problem. > Didn't say it was a problem, just that the issue of raising errors in the ufunc loops has come up before and I wondered if you were developing some mechanism for that. BTW, what is the metadata that is going to be added to the types? What purpose does it serve? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Thu Jun 11 14:55:38 2009 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 11 Jun 2009 13:55:38 -0500 Subject: [Numpy-discussion] Datetime branch In-Reply-To: References: <7ECF513D-137D-4016-B832-6FDF9E6D20C4@enthought.com> <3d375d730906111034j9151d41o3a6775d22101fa6f@mail.gmail.com> <3d375d730906111047r60fa90c4r8c89c269b5f66e4e@mail.gmail.com> <3d375d730906111118y5318cf29y6954d2ea8f4a3c47@mail.gmail.com> Message-ID: <3d375d730906111155g498e52c2ua6b206fdb57eee3d@mail.gmail.com> On Thu, Jun 11, 2009 at 13:44, Charles R Harris wrote: > > > On Thu, Jun 11, 2009 at 12:18 PM, Robert Kern wrote: >> >> On Thu, Jun 11, 2009 at 13:06, Charles R >> Harris wrote: >> > >> > >> > On Thu, Jun 11, 2009 at 11:47 AM, Robert Kern >> > wrote: >> >> >> >> On Thu, Jun 11, 2009 at 12:39, Charles R >> >> Harris wrote: >> >> > >> >> > >> >> > On Thu, Jun 11, 2009 at 11:34 AM, Robert Kern >> >> > wrote: >> >> >> >> >> >> On Thu, Jun 11, 2009 at 12:29, Charles R >> >> >> Harris wrote: >> >> >> > Oh, and slipping the new types in between 64 bit integers and >> >> >> > floats >> >> >> > is >> >> >> > a >> >> >> > bit iffy. >> >> >> >> >> >> Where, specifically? There are several linear orders of types in >> >> >> numpy. I tried to be careful to do the right thing in each. The enum >> >> >> numbers are after NPY_VOID, of course, for compatibility. >> >> > >> >> > I noticed. I'm not saying it's wrong, just that a linear order lacks >> >> > descriptive power and is difficult to maintain. I expect you ran into >> >> > that >> >> > problem when trying to make everything work as you wanted. >> >> >> >> Yes. Now, which place am I slipping in the new types between 64-bit >> >> integers and floats? >> > >> > In the ufunc generator. >> >> This line from generate_umath.py? >> >> all = '?bBhHiIlLqQtTfdgFDGO' >> >> > But most of the macros use the type ordering >> >> Not quite. They use the order of the loops given to the ufunc. The >> order of the types in that string I think you are referring doesn't >> affect much. Basically, just the comparisons where every type has a >> loop. >> >> > and how >> > do you control the promotion (or lack thereof) of the various types >> > to/from >> > the datetime types? >> >> PyArray_CanCastSafely() in convert_datatype.c. datetime and timedelta >> types cannot be auto-casted to or from any datatype. They can be >> explicitly cast, but ufuncs won't auto-cast them when trying to find >> the right loop. The datetime types are a bit unique in that they need >> to exclude certain combinations (e.g. datetime+datetime). Allowing >> auto-casts prevented me from doing that. > > The implementation of PyArray_CanCastSafely illustrates two other points > that bother me. > > 1) The rules are encoded in the program logic. This makes them difficult to > find or to see what they are and requires editing the code to make changes. > > 2) Some of the rules are maintained by the types.
That is even more obscure > and reminiscent of the "friend" functions in c++ that encode the same sort > of thing when the operators are overloaded. I never did like that as a > general system ;) Yeah, I'm not much a fan of it, either. But it's what I had to work with. >> In fact, the placement of the datetime typecodes in that string was a >> leftover from when I was trying to allow auto-casts between integers >> and datetime types. Now that I disallow them, the ordering can be >> changed. >> >> > There also seems to be some mechanism for raising errors that has been >> > added, maybe to loops. I'm not clear on that, did you add some such >> > mechanism? >> >> Not really. Object loops already had such a mechanism; I just extended >> that to do the same thing for the datetime types, too. You will be >> able to raise a Python exception in the datetime loops. Of course, you >> pay for that a little because that means that you can't release the >> GIL. I don't think that will be a substantial problem. > > Didn't say it was a problem, just that the issue of raising errors in the > ufunc loops has come up before and I wondered if you were developing some > mechanism for that. We were brainstorming, but there isn't a good way to do it (i.e. allowing a useful message rather than just an error flag) without holding on to the GIL or much more extensive modifications to the machinery. > BTW, what is the metadata that is going to be added to the types? What > purpose does it serve? Storage for the time frequency (days, weeks, months, etc.) per the NEP. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From d_l_goldsmith at yahoo.com Thu Jun 11 14:58:43 2009 From: d_l_goldsmith at yahoo.com (David Goldsmith) Date: Thu, 11 Jun 2009 11:58:43 -0700 (PDT) Subject: [Numpy-discussion] Definitions of pv, fv, nper, pmt, and rate Message-ID: <657765.61364.qm@web52105.mail.re2.yahoo.com> --- On Thu, 6/11/09, Skipper Seabold wrote: > > Thanks, both for your kind words and for the patch; I > don't suppose I could prevail upon you to be Team Lead for > doc-ing numpy.finacial? > > > > DG > > > > Sure why not. Great! I'm done editing the Wiki for a bit if you wanna go ahead and add yourself. DG From oliphant at enthought.com Thu Jun 11 15:07:12 2009 From: oliphant at enthought.com (Travis Oliphant) Date: Thu, 11 Jun 2009 14:07:12 -0500 Subject: [Numpy-discussion] Datetime branch In-Reply-To: References: <7ECF513D-137D-4016-B832-6FDF9E6D20C4@enthought.com> <3d375d730906111034j9151d41o3a6775d22101fa6f@mail.gmail.com> <3d375d730906111047r60fa90c4r8c89c269b5f66e4e@mail.gmail.com> <3d375d730906111118y5318cf29y6954d2ea8f4a3c47@mail.gmail.com> Message-ID: <0762D48A-CEC8-40E2-9DC0-162B7C3C4CA4@enthought.com> On Jun 11, 2009, at 1:44 PM, Charles R Harris wrote: > > The implementation of PyArray_CanCastSafely illustrates two other > points that bother me. > > 1) The rules are encoded in the program logic. This makes them > difficult to find or to see what they are and requires editing the > code to make changes. I agree that this is all sub-optimal. I didn't do much to fix what was there with Numeric except add a semi-orthogonal user-defined approach. I like the generic function concept that was added to the ufuncs quite a bit. I'm wondering if most of the functions currently in the *f member of the data-type structure couldn't be implemented under that notion instead. 
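To make the "generic function" idea concrete, here is a toy, pure-Python sketch of the dispatch pattern being alluded to: one entry point, with per-dtype implementations registered up front and looked up at call time, rather than a fixed slot in each dtype's function table. All names here are invented for illustration:

import numpy as np

_fill_impls = {}

def register_fill(dtype, func):
    _fill_impls[np.dtype(dtype)] = func

def generic_fill(arr, value):
    # Dispatch on the array's dtype; unregistered dtypes fail cleanly.
    try:
        impl = _fill_impls[arr.dtype]
    except KeyError:
        raise TypeError("fill not implemented for dtype %s" % arr.dtype)
    return impl(arr, value)

register_fill(np.float64, lambda arr, v: arr.fill(v))

a = np.zeros(3)
generic_fill(a, 1.5)
print a    # [ 1.5  1.5  1.5]

New implementations, say for a user-defined type, would then be a registration call instead of a patch to the core tables.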
Also, should we attach coercion information to each data-type directly and an API to extend the coercion information? I agree that the "implicit" ordering of the data-types for coercion is wonky, but it allowed the code from Numeric to be used to dispatch in the ufunc instead of designing a new approach. Do you have other ideas about how this might work? > > 2) Some of the rules are maintained by the types. That is even more > obscure and reminiscent of the "friend" functions in c++ that encode > the same sort of thing when the operators are overloaded. I never > did like that as a general system ;) Are you referring to the user-defined data-types? I agree it's pretty kludgy. Are you envisioning a "global" coercion table? It seems like this may need to be operation specific and extensible to allow new data-types to be added fairly easily. > > BTW, what is the metadata that is going to be added to the types? > What purpose does it serve? In the date-time case, it holds what frequency the integer in the data- type represents. There will only be 2 new static data-types. "Datetime" and "Timedelta" that use 8 bytes each. What those 8 bytes represent will be determined by the metadata (years, months, seconds, etc...). But, generally, it will be an extra dictionary that can store anything you want (anybody want to define a "float" data-type that uses IBM format bits?). The ufunc machinery needs to change to handle passing that information in somehow. The approaches we take to doing that will also hopefully allow us to define ufuncs for string, unicode, and void * arrays as well. Thanks, -Travis From charlesr.harris at gmail.com Thu Jun 11 15:24:40 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 11 Jun 2009 13:24:40 -0600 Subject: [Numpy-discussion] Datetime branch In-Reply-To: <0762D48A-CEC8-40E2-9DC0-162B7C3C4CA4@enthought.com> References: <7ECF513D-137D-4016-B832-6FDF9E6D20C4@enthought.com> <3d375d730906111034j9151d41o3a6775d22101fa6f@mail.gmail.com> <3d375d730906111047r60fa90c4r8c89c269b5f66e4e@mail.gmail.com> <3d375d730906111118y5318cf29y6954d2ea8f4a3c47@mail.gmail.com> <0762D48A-CEC8-40E2-9DC0-162B7C3C4CA4@enthought.com> Message-ID: On Thu, Jun 11, 2009 at 1:07 PM, Travis Oliphant wrote: > > On Jun 11, 2009, at 1:44 PM, Charles R Harris wrote: > > > > > The implementation of PyArray_CanCastSafely illustrates two other > > points that bother me. > > > > 1) The rules are encoded in the program logic. This makes them > > difficult to find or to see what they are and requires editing the > > code to make changes. > > I agree that this is all sub-optimal. I didn't do much to fix what > was there with Numeric except add a semi-orthogonal user-defined > approach. > > I like the generic function concept that was added to the ufuncs quite > a bit. I'm wondering if most of the functions currently in the *f > member of the data-type structure couldn't be implemented under that > notion instead. > > Also, should we attach coercion information to each data-type directly > and an API to extend the coercion information? I agree that the > "implicit" ordering of the data-types for coercion is wonky, but it > allowed the code from Numeric to be used to dispatch in the ufunc > instead of designing a new approach. Do you have other ideas about > how this might work? 
> It was a fairly decent system when there were just a few numeric types, but there are more data types then datetime that might be useful so it would be nice if there was a more general way to add them without wading through all the stuff Robert had to do. The descriptors still need to be identified and a number is as good as anything, it is the reliance on ordering that is the limitation. For a general solution, my thoughts have been running along the lines of a table/linked list, but not directly implemented in c. Who wants to edit a 19x19 array, maybe even several of them ;) So I'm trying to think how the rules could be encoded so that a python program could generate tables or lists. The rules could all be collected in one spot, then. Actual code would still be needed for the conversions and loops and there needs to be a way to associate the conversion with the corresponding function. So probably a name as well as a number is needed when a new type is added. > > > > 2) Some of the rules are maintained by the types. That is even more > > obscure and reminiscent of the "friend" functions in c++ that encode > > the same sort of thing when the operators are overloaded. I never > > did like that as a general system ;) > > Are you referring to the user-defined data-types? I agree it's > pretty kludgy. Are you envisioning a "global" coercion table? It > seems like this may need to be operation specific and extensible to > allow new data-types to be added fairly easily. > > > > > BTW, what is the metadata that is going to be added to the types? > > What purpose does it serve? > > In the date-time case, it holds what frequency the integer in the data- > type represents. There will only be 2 new static data-types. > "Datetime" and "Timedelta" that use 8 bytes each. > > What those 8 bytes represent will be determined by the metadata > (years, months, seconds, etc...). > > But, generally, it will be an extra dictionary that can store anything > you want (anybody want to define a "float" data-type that uses IBM > format bits?). The ufunc machinery needs to change to handle passing > that information in somehow. The approaches we take to doing that > will also hopefully allow us to define ufuncs for string, unicode, and > void * arrays as well. > Might be useful for units also. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From oliphant at enthought.com Thu Jun 11 15:25:51 2009 From: oliphant at enthought.com (Travis Oliphant) Date: Thu, 11 Jun 2009 14:25:51 -0500 Subject: [Numpy-discussion] Datetime branch In-Reply-To: <3d375d730906111118y5318cf29y6954d2ea8f4a3c47@mail.gmail.com> References: <7ECF513D-137D-4016-B832-6FDF9E6D20C4@enthought.com> <3d375d730906111034j9151d41o3a6775d22101fa6f@mail.gmail.com> <3d375d730906111047r60fa90c4r8c89c269b5f66e4e@mail.gmail.com> <3d375d730906111118y5318cf29y6954d2ea8f4a3c47@mail.gmail.com> Message-ID: <9CD7EF10-D248-49AA-B972-AF394F433A5A@enthought.com> On Jun 11, 2009, at 1:18 PM, Robert Kern wrote: > This line from generate_umath.py? > > all = '?bBhHiIlLqQtTfdgFDGO' Oh, I don't think it's a good idea to use "T" and "t" for date-time. In the struct module this mean "struct" and "bit" respectively. 
I propose "M" and "m" -Travis From charlesr.harris at gmail.com Thu Jun 11 15:29:51 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 11 Jun 2009 13:29:51 -0600 Subject: [Numpy-discussion] Datetime branch In-Reply-To: <0762D48A-CEC8-40E2-9DC0-162B7C3C4CA4@enthought.com> References: <7ECF513D-137D-4016-B832-6FDF9E6D20C4@enthought.com> <3d375d730906111034j9151d41o3a6775d22101fa6f@mail.gmail.com> <3d375d730906111047r60fa90c4r8c89c269b5f66e4e@mail.gmail.com> <3d375d730906111118y5318cf29y6954d2ea8f4a3c47@mail.gmail.com> <0762D48A-CEC8-40E2-9DC0-162B7C3C4CA4@enthought.com> Message-ID: On Thu, Jun 11, 2009 at 1:07 PM, Travis Oliphant wrote: > > In the date-time case, it holds what frequency the integer in the data- > type represents. There will only be 2 new static data-types. > "Datetime" and "Timedelta" that use 8 bytes each. > > What those 8 bytes represent will be determined by the metadata > (years, months, seconds, etc...). > > But, generally, it will be an extra dictionary that can store anything > you want (anybody want to define a "float" data-type that uses IBM > format bits?). The ufunc machinery needs to change to handle passing > that information in somehow. The approaches we take to doing that > will also hopefully allow us to define ufuncs for string, unicode, and > void * arrays as well. > Hmm. I wonder if there could be a python program that loads conversion information into the dictionary when the module loads? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From pgmdevlist at gmail.com Thu Jun 11 15:37:20 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Thu, 11 Jun 2009 15:37:20 -0400 Subject: [Numpy-discussion] Datetime branch In-Reply-To: <0762D48A-CEC8-40E2-9DC0-162B7C3C4CA4@enthought.com> References: <7ECF513D-137D-4016-B832-6FDF9E6D20C4@enthought.com> <3d375d730906111034j9151d41o3a6775d22101fa6f@mail.gmail.com> <3d375d730906111047r60fa90c4r8c89c269b5f66e4e@mail.gmail.com> <3d375d730906111118y5318cf29y6954d2ea8f4a3c47@mail.gmail.com> <0762D48A-CEC8-40E2-9DC0-162B7C3C4CA4@enthought.com> Message-ID: On Jun 11, 2009, at 3:07 PM, Travis Oliphant wrote: >> BTW, what is the metadata that is going to be added to the types? >> What purpose does it serve? > > In the date-time case, it holds what frequency the integer in the > data- > type represents. There will only be 2 new static data-types. > "Datetime" and "Timedelta" that use 8 bytes each. > > What those 8 bytes represent will be determined by the metadata > (years, months, seconds, etc...). As Charles pointed out, it'd be quite useful for units as well. Or to store some extra information like the filling_value of a MaskedArray... So, this metadata would be attached to an array, right ? Scalars would be considered as 0d array for that purpose, right ? eg, given a 1d array of dates w/ a given frequency, accessing a single element would give me a scalar w/ the same frequency ? > The ufunc machinery needs to change to handle passing > that information in somehow. The approaches we take to doing that > will also hopefully allow us to define ufuncs for string, unicode, and > void * arrays as well. In that case, could we also think about what Darren was suggesting for his units package, viz, a pre-processing function (__array_unwrap__ ?) that complements the current __array_wrap__ one ?
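For reference, this is roughly what the existing hook already buys you. A minimal sketch that only carries a unit label through a ufunc, with no unit arithmetic attempted (an illustration, not Darren's actual package):

import numpy as np

class Unitted(np.ndarray):
    def __new__(cls, data, unit=None):
        obj = np.asarray(data).view(cls)
        obj.unit = unit
        return obj
    def __array_finalize__(self, obj):
        self.unit = getattr(obj, 'unit', None)
    def __array_wrap__(self, out_arr, context=None):
        # Post-processing: re-attach the metadata to the ufunc output.
        out_arr = out_arr.view(type(self))
        out_arr.unit = self.unit
        return out_arr

a = Unitted([1., 2., 3.], unit='m')
print (a + a).unit    # m : the metadata survives the ufunc

The missing piece the suggestion addresses is the symmetric pre-processing step, where the metadata could be inspected or combined before the ufunc runs.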
The idea being that any operation would be performed on a ndarray, the corresponding metadata would be just passed along during the operation, and modifications to the metadata would be taken care of in the pre- and/ or post- processing steps ? Oh, just another question: why trying to put datetime and timedelta in the type ordering ? My understanding is that underneath, they're just long/longlong. It's only because they have a particular metadata that they should be processed differently, right ? So, if soon we add units to floats, the underneath object would still be considered float, dealing w/ the unit has to be left for ufuncs ? From robert.kern at gmail.com Thu Jun 11 15:47:51 2009 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 11 Jun 2009 14:47:51 -0500 Subject: [Numpy-discussion] Datetime branch In-Reply-To: References: <3d375d730906111034j9151d41o3a6775d22101fa6f@mail.gmail.com> <3d375d730906111047r60fa90c4r8c89c269b5f66e4e@mail.gmail.com> <3d375d730906111118y5318cf29y6954d2ea8f4a3c47@mail.gmail.com> <0762D48A-CEC8-40E2-9DC0-162B7C3C4CA4@enthought.com> Message-ID: <3d375d730906111247w30494661j50608a086f402b0b@mail.gmail.com> On Thu, Jun 11, 2009 at 14:37, Pierre GM wrote: > > On Jun 11, 2009, at 3:07 PM, Travis Oliphant wrote: > >>> BTW, what is the metadata that is going to be added to the types? >>> What purpose does it serve? >> >> In the date-time case, it holds what frequency the integer in the >> data- >> type represents. There will only be 2 new static data-types. >> "Datetime" and "Timedelta" that use 8 bytes each. >> >> What those 8 bytes represent will be determined by the metadata >> (years, months, seconds, etc...). > > As Charles pointed out, it'd be quite useful for units as well. Or to > store some extra information like the filling_value of a MaskedArray... > > So, this metadata would be attached to an array, right ? No. The metadata is on the dtype. > Scalars would > be considered as 0d array for that purpose, right ? eg, given a 1d > array of dates w/ a given frequency, accessing a single element would > give me a scalar w/ the same frequency ? It should. The details still need to be worked out. >> The ufunc machinery needs to change to handle passing >> that information in somehow. The approaches we take to doing that >> will also hopefully allow us to define ufuncs for string, unicode, and >> void * arrays as well. > > In that case, could we also think about what Darren was suggesting for > his units package, viz, a pre-processing function (__array_unwrap__ ?) > that complements the current __array_wrap__ one ? The idea being that > any operation would be performed on a ndarray, the corresponding > metadata would be just passed along during the operation, and > modifications to the metadata would be taken care of in the pre- and/ > or post- processing steps ? > > Neither here nor there, I think. > >> Oh, just another question: why trying to put datetime and timedelta in >> the type ordering ? My understanding is that underneath, they're just >> long/longlong. It's only because they have a particular metadata that >> they should be processed differently, right ? > > No. They need to be different types such that the ufunc mechanism can > find the right loop implementations. > >> So, if soon we add units >> to floats, the underneath object would still be considered float, >> dealing w/ the unit has to be left for ufuncs ? > > This is why I don't think this mechanism can be used for units.
-- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From pgmdevlist at gmail.com Thu Jun 11 16:33:10 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Thu, 11 Jun 2009 16:33:10 -0400 Subject: [Numpy-discussion] Datetime branch In-Reply-To: <3d375d730906111247w30494661j50608a086f402b0b@mail.gmail.com> References: <3d375d730906111034j9151d41o3a6775d22101fa6f@mail.gmail.com> <3d375d730906111047r60fa90c4r8c89c269b5f66e4e@mail.gmail.com> <3d375d730906111118y5318cf29y6954d2ea8f4a3c47@mail.gmail.com> <0762D48A-CEC8-40E2-9DC0-162B7C3C4CA4@enthought.com> <3d375d730906111247w30494661j50608a086f402b0b@mail.gmail.com> Message-ID: <49FB8150-0790-487D-8D56-61806AA933BE@gmail.com> On Jun 11, 2009, at 3:47 PM, Robert Kern wrote: > On Thu, Jun 11, 2009 at 14:37, Pierre GM wrote: >> >> On Jun 11, 2009, at 3:07 PM, Travis Oliphant wrote: >> >>>> BTW, what is the metadata that is going to be added to the types? >>>> What purpose does it serve? >>> >>> In the date-time case, it holds what frequency the integer in the >>> data- >>> type represents. There will only be 2 new static data-types. >>> "Datetime" and "Timedelta" that use 8 bytes each. >>> >>> What those 8 bytes represent will be determined by the metadata >>> (years, months, seconds, etc...). >> >> As Charles pointed out, it'd be quite useful for units as well. Or to >> store some extra information like the filling_value of a >> MaskedArray... >> >> So, this metadata would be attached to an array, right ? > > No. The metadata is on the dtype. Ah, OK. Still could be used for units, then. And it'll probably make things easier to define custom dtypes (I was thinking about a standard problem where all the fields of a structured array have the same dtype. A flag could be attached to the main dtype telling that it's OK to perform some functions on fields, for example... Thinking aloud here). >> Scalars would >> be considered as 0d array for that purpose, right ? eg, given a 1d >> array of dates w/ a given frequency, accessing a single element would >> give me a scalar w/ the same frequency ? > > It should. The details still need to be worked out. OK. > >>> The ufunc machinery needs to change to handle passing >>> that information in somehow. The approaches we take to doing that >>> will also hopefully allow us to define ufuncs for string, unicode, >>> and >>> void * arrays as well. >> >> In that case, could we also think about what Darren was suggesting >> for >> his units package, viz, a pre-processing function >> (__array_unwrap__ ?) >> that complements the current __array_wrap__ one ? The idea being that >> any operation would be performed on a ndarray, the corresponding >> metadata would be just passed along during the operation, and >> modifications to the metadata would be taken care of in the pre- and/ >> or post- processing steps ? > > Neither here nor there, I think. > >> Oh, just another question: why trying to put datetime and timedelta >> in >> the type ordering ? My understanding is that underneath, they're just >> long/longlong. It's only because they have a particular metadata that >> they should be processed differently, right ? > > No. They need to be different types such that the ufunc mechanism can > find the right loop implementations. Meh. I'm not familiar enough with the details of C ufuncs, so bear with me for a minute. A datetime is basically a long + a frequency attribute.
All the operations recognized as valid for a datetime object will deal w/ the long part, the frequency are just patched back at the end, right ? So, a ufunc could first check the underlying type (here, long or longlong), then check whether there's a value for the 'unit': if there's one, choose the corresponding loop, if None, use the default (the one we currently have). I really fail to see why we need to see datetime/timedelta as intrinsically different from the other types (apart that they carry some extra info), and why the mechanism should be different for datetime/timedelta than for units, say. >> So, if soon we add units >> to floats, the underneath object would still be considered float, >> dealing w/ the unit has to be left for ufuncs ? > > This is why I don't think this mechanism can be used for units. Robert, would you mind pointing me offlist to the relevant part of the code so that I can try to figure out by myself ? Or just explain it in plain english (which would then be the basis for a documentation of these new features)... From robert.kern at gmail.com Thu Jun 11 17:35:12 2009 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 11 Jun 2009 16:35:12 -0500 Subject: [Numpy-discussion] Datetime branch In-Reply-To: <49FB8150-0790-487D-8D56-61806AA933BE@gmail.com> References: <3d375d730906111047r60fa90c4r8c89c269b5f66e4e@mail.gmail.com> <3d375d730906111118y5318cf29y6954d2ea8f4a3c47@mail.gmail.com> <0762D48A-CEC8-40E2-9DC0-162B7C3C4CA4@enthought.com> <3d375d730906111247w30494661j50608a086f402b0b@mail.gmail.com> <49FB8150-0790-487D-8D56-61806AA933BE@gmail.com> Message-ID: <3d375d730906111435j6b6422es15a56b8f089f410d@mail.gmail.com> On Thu, Jun 11, 2009 at 15:33, Pierre GM wrote: > > On Jun 11, 2009, at 3:47 PM, Robert Kern wrote: > >> On Thu, Jun 11, 2009 at 14:37, Pierre GM wrote: >>> Oh, just another question: why trying to put datetime and timedelta >>> in >>> the type ordering ? My understanding is that underneath, they're just >>> long/longlong. It's only because they have a particular metadata that >>> they should be processed differently, right ? >> >> No. They need to be different types such that the ufunc mechanism can >> find the right loop implementations. > > Meh. I'm not familiar enough with the details of C ufuncs, so bear > with me for a minute. > > A datetime is basically a long + a frequency attribute. All the > operations recognized as valid for a datetime object will deal w/ the > long part, the frequency are just patched back at the end, right ? So, > a ufunc could first check the underlying type (here, long or > longlong), then check whether there's a value for the 'unit': if > there's one, choose the corresponding loop, if None, use the default > (the one we currently have). That's a much more invasive change to the ufunc dispatch mechanism than I wanted to implement. You would need to store these other loops somewhere on the ufunc object along with more complicated metadata about each loop. > I really fail to see why we need to see datetime/timedelta as > intrinsically different from the other types (apart that they carry > some extra info), and why the mechanism should be different for > datetime/timedelta than for units, say. I would want to apply units to any dtype rather than have reserved dtypes for unitted arrays. If you relax that restriction and only consider unitted double arrays, then you could probably use the same mechanism.
-- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From darkgl0w at yahoo.com Fri Jun 12 03:06:18 2009 From: darkgl0w at yahoo.com (Cristi Constantin) Date: Fri, 12 Jun 2009 00:06:18 -0700 (PDT) Subject: [Numpy-discussion] Unite a Rectangular Unicode Array into one newline-separated string Message-ID: <270222.59377.qm@web52104.mail.re2.yahoo.com> Good day. I am trying to unite a Rectangular Unicode Array into one newline-separated string. Basically each line is separated from the next with '\n', and all characters from one line are merged. For example: import numpy as np a = np.arange(12).reshape(3,4) a = np.asarray(a,'U') # Method 1 that works: '\n'.join([ ''.join([j.encode('utf8') for j in i]) for i in a ]) # Prints '0123\n4567\n8911\n' # This is VERY slow. # Method 2 that works: ''.join ( np.hstack( np.hstack( (i,np.array([u'\n'],'U')) ) for i in a)).encode('utf8') # Prints '0123\n4567\n8911\n' # This is faster, but still quite slow. It's very important to encode the result in UTF8, because the values will not work with the ASCII codec. I played with: a.astype(str) # But in some cases, this raises UnicodeEncodeError: 'ascii' codec can't encode character u'\xa9' in position 0: ordinal not in range(128) And I also played with: a.tostring() # But this returns '0\x00\x00\x001\x00\x00\x002\x00\x00\x003\x00\x00\x004\x00\x00\x005\x00\x00\x006\x00\x00\x007\x00\x00\x008\x00\x00\x009\x00\x00\x001\x00\x00\x001\x00\x00\x00' # ... and I have no idea what to do with this value. Can anyone suggest faster methods to transform that unicode array into a string? Thank you very much. -------------- next part -------------- An HTML attachment was scrubbed... URL: From seb.haase at gmail.com Fri Jun 12 04:40:05 2009 From: seb.haase at gmail.com (Sebastian Haase) Date: Fri, 12 Jun 2009 10:40:05 +0200 Subject: [Numpy-discussion] numpy test and "true division" Message-ID: Hi all, first off - I'm happy that I finally figured out how to run the test suite, and I get only one failure (Debian 64-bit, Numpy 1.3.0, "FAIL: Ticket #950") Before though, I got: Ran 2031 tests in 5.272s FAILED (KNOWNFAIL=1, SKIP=11, errors=3, failures=8) Then I realized that I changed my python to always start up using "-Qnew" - i.e. default to true division (1/2 = .5 and not 1/2=0).
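The nature of the fix is worth spelling out: '//' has been available since Python 2.2 and keeps the old integer semantics under every -Q mode. Taking one of the lines flagged in the output below, from numpy/fft/helper.py:

n = 5
print (n + 1) / 2     # classic division: 3; under -Qnew: 3.0, unusable as an index
print (n + 1) // 2    # 3 under both modes; the proposed change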
So, would it be possible to change those 8-12 tests to use '//'? Using python -Qwarn -i I get:

Python 2.5.2 (r252:60911, Jan 4 2009, 21:59:32)
/numpy/core/numerictypes.py:245: DeprecationWarning: classic int division
  bytes = bits / 8
/numpy/core/numerictypes.py:285: DeprecationWarning: classic int division
  na_name = '%s%d' % (english_capitalize(base), bit/2)
/numpy/core/numerictypes.py:315: DeprecationWarning: classic int division
  charname = 'i%d' % (bits/8,)
/numpy/core/numerictypes.py:316: DeprecationWarning: classic int division
  ucharname = 'u%d' % (bits/8,)
/numpy/core/numerictypes.py:557: DeprecationWarning: classic int division
  nbytes[obj] = val[2] / 8
>>> import sys; sys.displayhook = sys.__displayhook__
>>> N.test()
Running unit tests for numpy
NumPy version 1.3.0
NumPy is installed in /home/shaase/Priithon_25_lin64/numpy
Python version 2.5.2 (r252:60911, Jan 4 2009, 21:59:32) [GCC 4.3.2]
nose version 0.10.4
/numpy/f2py/auxfuncs.py:568: DeprecationWarning: classic int division
/numpy/fft/helper.py:38: DeprecationWarning: classic int division
  p2 = (n+1)/2
/numpy/fft/helper.py:67: DeprecationWarning: classic int division
  p2 = n-(n+1)/2
/numpy/lib/tests/test_arraysetops.py:138: DeprecationWarning: classic int division
  a = np.fix( nItem / 10 * np.random.random( nItem ) )
/numpy/lib/tests/test_arraysetops.py:139: DeprecationWarning: classic int division
  b = np.fix( nItem / 10 * np.random.random( nItem ) )
/numpy/lib/function_base.py:613: DeprecationWarning: classic int division
  scl = avg.dtype.type(a.size/avg.size)
/numpy/lib/shape_base.py:1079: DeprecationWarning: classic int division
  n /= max(dim_in,1)
/numpy/linalg/linalg.py:830: DeprecationWarning: classic int division
  for i in range(len(ind)/2):

FAIL: Ticket #950
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/shaase/Priithon_25_lin64/numpy/core/tests/test_regression.py", line 1248, in test_blasdot_uninitialized_memory
    assert np.all(z == 0)
AssertionError
--------------------------------------

Cheers,
Sebastian Haase

From david at ar.media.kyoto-u.ac.jp Fri Jun 12 06:46:04 2009
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Fri, 12 Jun 2009 19:46:04 +0900
Subject: [Numpy-discussion] RFC: add an install_clib numpy.distutils command to install pure C libraries
Message-ID: <4A3231EC.9050707@ar.media.kyoto-u.ac.jp>

Hi,

I have finally spent some time so that we can install pure C libraries using numpy.distutils. With this, one could imagine having a C library for fft or special functions in numpy or scipy, so that the library could be reused in another package at the C level. If someone knowledgeable about numpy.distutils would like to review this branch, I would be grateful:

http://codereview.appspot.com/75047/show

The branch uses this functionality for npy_math, so that we can do:

# get_npymath_info is in numpy.distutils.misc_util, and its output is
# compatible with system_info
config.add_library("foo", sources=['foo.c'], extra_info=get_npymath_info())

The code to add an installable library is generic, but there is no code to get the build info for external packages (except for npy_math).
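For example, a third-party package could then reuse npymath along these lines. This is only a sketch against the branch above: get_npymath_info() is the helper proposed there, not something in a released numpy, and all the package names are made up:

# hypothetical setup.py fragment for a package linking against npymath
from numpy.distutils.misc_util import Configuration, get_npymath_info

def configuration(parent_package='', top_path=None):
    config = Configuration('mypkg', parent_package, top_path)
    # extra_info supplies the include dirs and libraries needed for npymath
    config.add_extension('_core', sources=['_core.c'],
                         extra_info=get_npymath_info())
    return config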
I did not find an obvious way to do this generically yet.

thanks,

David

From gael.varoquaux at normalesup.org Fri Jun 12 07:34:24 2009
From: gael.varoquaux at normalesup.org (Gael Varoquaux)
Date: Fri, 12 Jun 2009 13:34:24 +0200
Subject: [Numpy-discussion] RFC: add an install_clib numpy.distutils command to install pure C libraries
In-Reply-To: <4A3231EC.9050707@ar.media.kyoto-u.ac.jp>
References: <4A3231EC.9050707@ar.media.kyoto-u.ac.jp>
Message-ID: <20090612113424.GB21844@phare.normalesup.org>

On Fri, Jun 12, 2009 at 07:46:04PM +0900, David Cournapeau wrote:
> I have finally spent some time so that we can install pure C
> libraries using numpy.distutils. With this, one could imagine having a C
> library for fft or special functions in numpy or scipy, so that the
> library could be reused in another package at the C level. If someone
> knowledgeable about numpy.distutils would like to review this branch, I
> would be grateful:

Do I understand this well? Does that mean that another package could use the lapack exposed by numpy, or the special functions exposed by scipy, or the random number generator exposed by numpy, at the C level?

That is fantastic!

I am travelling and at a conference for two weeks, so I don't know how soon I will be able to review the code, but I am very interested.

Gaël

From david at ar.media.kyoto-u.ac.jp Fri Jun 12 07:38:51 2009
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Fri, 12 Jun 2009 20:38:51 +0900
Subject: [Numpy-discussion] RFC: add an install_clib numpy.distutils command to install pure C libraries
In-Reply-To: <20090612113424.GB21844@phare.normalesup.org>
References: <4A3231EC.9050707@ar.media.kyoto-u.ac.jp> <20090612113424.GB21844@phare.normalesup.org>
Message-ID: <4A323E4B.2020008@ar.media.kyoto-u.ac.jp>

Gael Varoquaux wrote:
> Do I understand this well? Does that mean that another package could use
> the lapack exposed by numpy, or the special functions exposed by scipy, or
> the random number generator exposed by numpy, at the C level?

Well, that's the goal, yes, but it only solves the problem at the build level. There is still a lot of work to make e.g. blas or lapack usable from C - npy_math is kind of trivial, comparatively.

There is another problem for scipy: I don't know how to make the build information available to third parties - for npymath, it is easy because it is in numpy, so I can just add one function to numpy.distutils. I am tempted to just steal the pkg-config .pc format - having a pkg-config clone in python should be quite trivial, and the format is quite flexible.

cheers,

David

From david at ar.media.kyoto-u.ac.jp Fri Jun 12 07:54:10 2009
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Fri, 12 Jun 2009 20:54:10 +0900
Subject: [Numpy-discussion] scipy 0.7.1 rc3
Message-ID: <4A3241E2.1010000@ar.media.kyoto-u.ac.jp>

Hi,

I have uploaded the binaries and source tarballs for 0.7.1rc3. The rc3 fixes some issues in scipy.special, which caused wrong behavior/crashes on some platforms.
Hopefully, this will be the 0.7.1 release,

cheers,

David

=========================
SciPy 0.7.1 Release Notes
=========================

.. contents::

SciPy 0.7.1 is a bug-fix release with no new features compared to 0.7.0.

scipy.io
========

Bugs fixed:

- Several fixes in Matlab file IO

scipy.odr
=========

Bugs fixed:

- Work around a failure with Python 2.6

scipy.signal
============

A memory leak in lfilter has been fixed, and support for array objects added.

Bugs fixed:

- #880, #925: lfilter fixes
- #871: bicgstab fails on Win32

scipy.sparse
============

Bugs fixed:

- #883: scipy.io.mmread with scipy.sparse.lil_matrix broken
- lil_matrix and csc_matrix now reject unexpected sequences,
  cf. http://thread.gmane.org/gmane.comp.python.scientific.user/19996

scipy.special
=============

Several bugs of varying severity were fixed in the special functions:

- #503, #640: iv: problems at large arguments fixed by new implementation
- #623: jv: fix errors at large arguments
- #679: struve: fix wrong output for v < 0
- #803: pbdv produces invalid output
- #804: lqmn: fix crashes on some input
- #823: betainc: fix documentation
- #834: exp1 strange behavior near negative integer values
- #852: jn_zeros: more accurate results for large s, also in jnp/yn/ynp_zeros
- #853: jv, yv, iv: invalid results for non-integer v < 0, complex x
- #854: jv, yv, iv, kv: return nan more consistently when out-of-domain
- #927: ellipj: fix segfault on Windows
- #946: ellpj: fix segfault on Mac OS X/python 2.6 combination.
- ive, jve, yve, kv, kve: with real-valued input, return nan for out-of-domain
  instead of returning only the real part of the result.

Also, when ``scipy.special.errprint(1)`` has been enabled, warning messages are now issued as Python warnings instead of being printed to stderr.

scipy.stats
===========

- linregress, mannwhitneyu, describe: errors fixed
- kstwobign, norm, expon, exponweib, exponpow, frechet, genexpon, rdist,
  truncexpon, planck: improvements to numerical accuracy in distributions

Windows binaries for python 2.6
===============================

python 2.6 binaries for windows are now included. The binary for python 2.5 requires numpy 1.2.0 or above, and the one for python 2.6 requires numpy 1.3.0 or above.

Universal build for scipy
=========================

The Mac OS X binary installer is now a proper universal build, and does not depend on gfortran anymore (libgfortran is statically linked). The python 2.5 version of scipy requires numpy 1.2.0 or above, the python 2.6 version requires numpy 1.3.0 or above.

Checksums
=========

9dd5af43cc26ae6d38a13b373ba430fa release/installers/scipy-0.7.1rc3-py2.6-python.org.dmg
290c2e056fda1f86dfa9f3a76d207a8c release/installers/scipy-0.7.1rc3-win32-superpack-python2.6.exe
d582dff7535d2b64a097fb4bfbc75d09 release/installers/scipy-0.7.1rc3-win32-superpack-python2.5.exe
a19400ccfd65d1a0a5030848af6f78ea release/installers/scipy-0.7.1rc3.tar.gz
d4ebf322c62b09c4ebaad7b67f92d032 release/installers/scipy-0.7.1rc3.zip
a0ea0366b178a7827f10a480f97c3c47 release/installers/scipy-0.7.1rc3-py2.5-python.org.dmg

From geometrian at gmail.com Fri Jun 12 14:21:02 2009
From: geometrian at gmail.com (Ian Mallett)
Date: Fri, 12 Jun 2009 11:21:02 -0700
Subject: [Numpy-discussion] Elementary Array Switching
Message-ID:

Hi,

I have a NumPy array. The array is 3D, n x n x 3. I'm trying to swap the first element of the last dimension with the last. I tried:

temp = myarray[:,:,0].copy()
myarray[:,:,0] = myarray[:,:,2].copy()
myarray[:,:,2] = temp
del temp

But it doesn't work as expected.
I'm definitely not very good at NumPy, so I get the feeling it's something silly I'm doing. What should that code look like?

Thanks,
Ian

From robert.kern at gmail.com Fri Jun 12 14:25:23 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Fri, 12 Jun 2009 13:25:23 -0500
Subject: [Numpy-discussion] Elementary Array Switching
In-Reply-To:
References:
Message-ID: <3d375d730906121125r51ebb89bqc090ca1cfd9f94dc@mail.gmail.com>

On Fri, Jun 12, 2009 at 13:21, Ian Mallett wrote:
> Hi,
>
> I have a NumPy array. The array is 3D, n x n x 3. I'm trying to swap the
> first element of the last dimension with the last. I tried:
> temp = myarray[:,:,0].copy()
> myarray[:,:,0] = myarray[:,:,2].copy()
> myarray[:,:,2] = temp
> del temp
> But it doesn't work as expected.

What did you expect? What did you get? Show us the results.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
-- Umberto Eco

From petit.frederic at free.fr Fri Jun 12 19:08:48 2009
From: petit.frederic at free.fr (fred)
Date: Sat, 13 Jun 2009 01:08:48 +0200
Subject: [Numpy-discussion] finding index in an array...
Message-ID: <4A32E000.4020901@free.fr>

Hi,

Say I have an array A with shape (10,3) and

A[3,:] = [1,2,3]

I want to find the index of the row of the array that holds the values [1,2,3]. How can I do that?

The only workaround I have found is to use a list:

A.tolist().index([1,2,3])

That works fine, but is there a better solution (without using a list, for instance)?

TIA.

Cheers,

-- 
Fred

From josef.pktd at gmail.com Fri Jun 12 19:32:59 2009
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Fri, 12 Jun 2009 19:32:59 -0400
Subject: [Numpy-discussion] finding index in an array...
In-Reply-To: <4A32E000.4020901@free.fr>
References: <4A32E000.4020901@free.fr>
Message-ID: <1cd32cbb0906121632w624a9e0ake8c0fb2fe34d2a3c@mail.gmail.com>

On Fri, Jun 12, 2009 at 7:08 PM, fred wrote:
> Hi,
>
> Say I have an array A with shape (10,3) and
>
> A[3,:] = [1,2,3]
>
> I want to find the index of the row of the array that holds the values
> [1,2,3]. How can I do that?
>
> The only workaround I have found is to use a list:
>
> A.tolist().index([1,2,3])
>
> That works fine, but is there a better solution (without using a list, for
> instance)?

Something like this should work to find rows with specific elements, if I understand you correctly:

np.nonzero(A.view([('',float)]*3) == np.array((1,2,3),[('',float)]*3))[0]

It creates an extra dimension that needs to be removed with [0], but it would take too long now to remember how to get rid of it.

Josef

From fredmfp at gmail.com Sat Jun 13 04:38:39 2009
From: fredmfp at gmail.com (fred)
Date: Sat, 13 Jun 2009 10:38:39 +0200
Subject: [Numpy-discussion] finding index in an array...
In-Reply-To: <1cd32cbb0906121632w624a9e0ake8c0fb2fe34d2a3c@mail.gmail.com>
References: <4A32E000.4020901@free.fr> <1cd32cbb0906121632w624a9e0ake8c0fb2fe34d2a3c@mail.gmail.com>
Message-ID: <4A33658F.2060007@gmail.com>

josef.pktd at gmail.com a écrit :
> Something like this should work to find rows with specific elements,
> if I understand you correctly:
You did ;-)

> np.nonzero(A.view([('',float)]*3) == np.array((1,2,3),[('',float)]*3))[0]
>
> It creates an extra dimension that needs to be removed with [0], but
> it would take too long now to remember how to get rid of it.

Ok, I get it. Thanks.

BTW, I was thinking this was a FAQ and there was a more straightforward answer, or a builtin function to do the trick...

Cheers,

-- 
Fred

From emmanuelle.gouillart at normalesup.org Sat Jun 13 05:04:11 2009
From: emmanuelle.gouillart at normalesup.org (Emmanuelle Gouillart)
Date: Sat, 13 Jun 2009 11:04:11 +0200
Subject: [Numpy-discussion] finding index in an array...
In-Reply-To: <4A32E000.4020901@free.fr>
References: <4A32E000.4020901@free.fr>
Message-ID: <20090613090411.GA11221@phare.normalesup.org>

Hi Fred,

here is another solution:

>>> A = np.arange(99).reshape((33,3))
>>> mask = (A==np.array([0,1,2]))
>>> np.nonzero(np.prod(mask, axis=1))[0]
array([0])

I found it to be less elegant than Josef's solution changing the dtype of the array, but it may be easier to understand if you're not very familiar with dtypes. Also, it is a bit faster on my computer:

>>> %timeit np.nonzero(A.view([('',int)]*3) == np.array((0,1,2),[('',int)]*3))[0]
10000 loops, best of 3: 61.8 µs per loop
>>> %timeit np.nonzero(np.prod((A==np.array([0,1,2])), axis=1))[0]
10000 loops, best of 3: 38.1 µs per loop

Cheers,

Emmanuelle

On Sat, Jun 13, 2009 at 01:08:48AM +0200, fred wrote:
> Hi,
> Say I have an array A with shape (10,3) and
> A[3,:] = [1,2,3]
> I want to find the index of the row of the array that holds the values [1,2,3].
> How can I do that?
> The only workaround I have found is to use a list:
> A.tolist().index([1,2,3])
> That works fine, but is there a better solution (without using a list, for
> instance)?
> TIA.
> Cheers,

From fredmfp at gmail.com Sat Jun 13 05:51:12 2009
From: fredmfp at gmail.com (fred)
Date: Sat, 13 Jun 2009 11:51:12 +0200
Subject: [Numpy-discussion] finding index in an array...
In-Reply-To: <20090613090411.GA11221@phare.normalesup.org>
References: <4A32E000.4020901@free.fr> <20090613090411.GA11221@phare.normalesup.org>
Message-ID: <4A337690.6060300@gmail.com>

Emmanuelle Gouillart a écrit :
> Hi Fred,

Hi Manue ;-)

> here is another solution:
>>>> A = np.arange(99).reshape((33,3))
>>>> mask = (A==np.array([0,1,2]))
>>>> np.nonzero(np.prod(mask, axis=1))[0]
> array([0])
>
> I found it to be less elegant than Josef's solution changing the dtype of
> the array, but it may be easier to understand if you're not very familiar
> with dtypes.

I have no problem.

> Also, it is a bit faster on my computer:

I take it.

Thanks!

Cheers,

-- 
Fred

From josef.pktd at gmail.com Sat Jun 13 06:37:00 2009
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Sat, 13 Jun 2009 06:37:00 -0400
Subject: [Numpy-discussion] finding index in an array...
In-Reply-To: <4A337690.6060300@gmail.com>
References: <4A32E000.4020901@free.fr> <20090613090411.GA11221@phare.normalesup.org> <4A337690.6060300@gmail.com>
Message-ID: <1cd32cbb0906130337u75a3c296q85418d2b1bad41d5@mail.gmail.com>

On Sat, Jun 13, 2009 at 5:51 AM, fred wrote:
> Emmanuelle Gouillart a écrit :
>> Hi Fred,
> Hi Manue ;-)
>
>> here is another solution:
>>>>> A = np.arange(99).reshape((33,3))
>>>>> mask = (A==np.array([0,1,2]))
>>>>> np.nonzero(np.prod(mask, axis=1))[0]
>> array([0])
>>
>> I found it to be less elegant than Josef's solution changing the dtype of
>> the array, but it may be easier to understand if you're not very familiar
>> with dtypes.
> I have no problem.
I think using a view instead of broadcasting in this case is unnecessarily complicated. Maybe I was practicing views with structured types too much. Using "all" instead of "prod" may be more descriptive:

>>> (A == np.array([1,2,3])).all(1).nonzero()[0]
array([3, 7])
>>> (A == [1,2,3]).all(1).nonzero()[0]
array([3, 7])

Josef

>> Also, it is a bit faster on my computer:
> I take it.
>
> Thanks!
>
> Cheers,
>
> --
> Fred

From david at ar.media.kyoto-u.ac.jp Sat Jun 13 09:46:48 2009
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Sat, 13 Jun 2009 22:46:48 +0900
Subject: [Numpy-discussion] Ready for review: PyArrayNeighIterObject, an iterator to iterate over a neighborhood in arbitrary arrays
Message-ID: <4A33ADC8.9010702@ar.media.kyoto-u.ac.jp>

Hi,

I have cleaned up the code a bit, and would like to suggest the inclusion of a neighborhood iterator for numpy. Stéfan took a look at it already, but it needs more eyeballs. It is a "subclass" of PyArrayIterObject, and can be used to iterate over a neighborhood of a point (handling boundaries with 0 padding for the time being).

http://codereview.appspot.com/75055/show

I have used it to replace the current code for correlateND in scipy.signal, where it works quite well (I think it makes the code more readable in that case).

cheers,

David

From charlesr.harris at gmail.com Sat Jun 13 14:00:53 2009
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Sat, 13 Jun 2009 12:00:53 -0600
Subject: [Numpy-discussion] Ready for review: PyArrayNeighIterObject, an iterator to iterate over a neighborhood in arbitrary arrays
In-Reply-To: <4A33ADC8.9010702@ar.media.kyoto-u.ac.jp>
References: <4A33ADC8.9010702@ar.media.kyoto-u.ac.jp>
Message-ID:

On Sat, Jun 13, 2009 at 7:46 AM, David Cournapeau <david at ar.media.kyoto-u.ac.jp> wrote:
> Hi,
>
> I have cleaned up the code a bit, and would like to suggest the
> inclusion of a neighborhood iterator for numpy. Stéfan took a look at it
> already, but it needs more eyeballs. It is a "subclass" of
> PyArrayIterObject, and can be used to iterate over a neighborhood of a
> point (handling boundaries with 0 padding for the time being).
>
> http://codereview.appspot.com/75055/show
>
> I have used it to replace the current code for correlateND in
> scipy.signal, where it works quite well (I think it makes the code more
> readable in that case).

Some nitpicks:

1) The name neigh sounds like a horse. Maybe region, neighborhood, or something similar would be better.

2) Is PyObject_Init NULL safe?

ret = PyArray_malloc(sizeof(*ret));
+    PyObject_Init((PyObject*)ret, &PyArrayNeighIter_Type);
+    if (ret == NULL) {
+        return NULL;
+    }

3) Documentation is needed. In particular, I think it is worth mentioning that the number of bounds is taken from the PyArrayIterObject, which isn't the most transparent thing.

Otherwise, looks good.

Chuck
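For readers who do not want to read the C: a rough pure-Python model of what the proposed iterator visits. The function name and signature below are invented; only the zero-padding behaviour mirrors the proposal:

import numpy as np

def neighborhood(a, center, bounds):
    """Yield a's values over the offset box `bounds` around `center`,
    substituting 0 for out-of-bounds positions."""
    for off in np.ndindex(*(hi - lo + 1 for lo, hi in bounds)):
        idx = tuple(c + lo + o
                    for c, (lo, hi), o in zip(center, bounds, off))
        if all(0 <= i < n for i, n in zip(idx, a.shape)):
            yield a[idx]
        else:
            yield 0

# e.g. the 3x3 box around the top-left corner of a 3x3 array:
# list(neighborhood(np.arange(9).reshape(3, 3), (0, 0), [(-1, 1), (-1, 1)]))
# -> [0, 0, 0, 0, 0, 1, 0, 3, 4]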
From charlesr.harris at gmail.com Sat Jun 13 14:22:10 2009
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Sat, 13 Jun 2009 12:22:10 -0600
Subject: [Numpy-discussion] Ready for review: PyArrayNeighIterObject, an iterator to iterate over a neighborhood in arbitrary arrays
In-Reply-To:
References: <4A33ADC8.9010702@ar.media.kyoto-u.ac.jp>
Message-ID:

On Sat, Jun 13, 2009 at 12:00 PM, Charles R Harris <charlesr.harris at gmail.com> wrote:
> Some nitpicks:
>
> 1) The name neigh sounds like a horse. Maybe region, neighborhood, or
> something similar would be better.
>
> 2) Is PyObject_Init NULL safe?
>
> ret = PyArray_malloc(sizeof(*ret));
> +    PyObject_Init((PyObject*)ret, &PyArrayNeighIter_Type);
> +    if (ret == NULL) {
> +        return NULL;
> +    }
>
> 3) Documentation is needed. In particular, I think it is worth mentioning
> that the number of bounds is taken from the PyArrayIterObject, which isn't
> the most transparent thing.

More nitpicks:

1) Since reference counting is such a pain, you should document that the constructor returns a new reference, that the PyArrayIterObject does not need to have its reference count incremented before the call, and that the reference count is unchanged on failure.

2) Why are _update_coord_iter(c) and _inc_set_ptr(c) macros? Why are they defined inside functions? If left as macros, they should be in CAPS, but why not just write them out?

3) Is it really worth the hassle to use inline functions? What does it buy in terms of speed that justifies the complication?

Chuck

From cournape at gmail.com Sat Jun 13 14:29:32 2009
From: cournape at gmail.com (David Cournapeau)
Date: Sun, 14 Jun 2009 03:29:32 +0900
Subject: [Numpy-discussion] Ready for review: PyArrayNeighIterObject, an iterator to iterate over a neighborhood in arbitrary arrays
In-Reply-To:
References: <4A33ADC8.9010702@ar.media.kyoto-u.ac.jp>
Message-ID: <5b8d13220906131129l644291dct7e10be81e31ce111@mail.gmail.com>

On Sun, Jun 14, 2009 at 3:00 AM, Charles R Harris wrote:
> Some nitpicks:
>
> 1) The name neigh sounds like a horse. Maybe region, neighborhood, or
> something similar would be better.
Neighborhood makes the name quite long - maybe region would be better, although region does not imply the notion of "contiguity"?

> 2) Is PyObject_Init NULL safe?
>
> ret = PyArray_malloc(sizeof(*ret));
> +    PyObject_Init((PyObject*)ret, &PyArrayNeighIter_Type);
> +    if (ret == NULL) {
> +        return NULL;
> +    }

No idea, I copied from the current iterator. In any case, it is safer to call it after the NULL check.

> 3) Documentation is needed. In particular, I think it is worth mentioning
> that the number of bounds is taken from the PyArrayIterObject, which isn't
> the most transparent thing.

The number of bounds is the number of dimensions of the array; I thought it was unambiguous. And yes, it needs documentation.

thanks for the review,

David

From cournape at gmail.com Sat Jun 13 14:35:38 2009
From: cournape at gmail.com (David Cournapeau)
Date: Sun, 14 Jun 2009 03:35:38 +0900
Subject: [Numpy-discussion] Ready for review: PyArrayNeighIterObject, an iterator to iterate over a neighborhood in arbitrary arrays
In-Reply-To:
References: <4A33ADC8.9010702@ar.media.kyoto-u.ac.jp>
Message-ID: <5b8d13220906131135r6f285545rbfc6cc1699568924@mail.gmail.com>

On Sun, Jun 14, 2009 at 3:22 AM, Charles R Harris wrote:
> 1) Since reference counting is such a pain, you should document that the
> constructor returns a new reference, that the PyArrayIterObject does not
> need to have its reference count incremented before the call, and that the
> reference count is unchanged on failure.

OK.

> 2) Why are _update_coord_iter(c) and _inc_set_ptr(c) macros? Why are they
> defined inside functions? If left as macros, they should be in CAPS, but
> why not just write them out?

They are macros because they are reused in the 2d specialized functions (I will add 3d too).

> 3) Is it really worth the hassle to use inline functions? What does it buy
> in terms of speed that justifies the complication?

Which complication are you talking about? Except NPY_INLINE, I see none. In terms of speed, we are talking about several times faster. Think about using it for correlate, for example: you have an NxN image with an MxM kernel: PyArrayNeigh_IterNext will be called NxNxMxM times... I don't remember the numbers, but it was several times slower without the inline with gcc 4.3 on Linux. The 2d optimized functions, which just do manual loop unrolling, already buy up to a factor of 2x.

cheers,

David

From charlesr.harris at gmail.com Sat Jun 13 14:51:06 2009
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Sat, 13 Jun 2009 12:51:06 -0600
Subject: [Numpy-discussion] Ready for review: PyArrayNeighIterObject, an iterator to iterate over a neighborhood in arbitrary arrays
In-Reply-To: <5b8d13220906131135r6f285545rbfc6cc1699568924@mail.gmail.com>
References: <4A33ADC8.9010702@ar.media.kyoto-u.ac.jp> <5b8d13220906131135r6f285545rbfc6cc1699568924@mail.gmail.com>
Message-ID:

On Sat, Jun 13, 2009 at 12:35 PM, David Cournapeau wrote:
>> 2) Why are _update_coord_iter(c) and _inc_set_ptr(c) macros? Why are they
>> defined inside functions? If left as macros, they should be in CAPS, but
>> why not just write them out?
> They are macros because they are reused in the 2d specialized functions
> (I will add 3d too).

IIRC, inline doesn't recurse, so there is some advantage to having these as macros. But I really dislike seeing macros defined inside of functions, especially when they aren't exclusive to that function. So at least move them outside. But often it is clearer for code maintenance to simply write them out; it just takes a few more lines. IOW, use macros judiciously.

>> 3) Is it really worth the hassle to use inline functions? What does it buy
>> in terms of speed that justifies the complication?
>
> Which complication are you talking about? Except NPY_INLINE, I see
> none. In terms of speed, we are talking about several times faster.

That's what I wanted to hear. But in C++ it is generally best for simplicity and debugging to start out not using inlines, then add them if benchmarks show a decent advantage. And in those cases it is best if the inlines are just a few lines long.

> Think about using it for correlate, for example: you have an NxN image
> with an MxM kernel: PyArrayNeigh_IterNext will be called NxNxMxM
> times... I don't remember the numbers, but it was several times slower
> without the inline with gcc 4.3 on Linux. The 2d optimized functions,
> which just do manual loop unrolling, already buy up to a factor of 2x.

So what is the tradeoff between just unrolling the loops vs inline functions?

Chuck
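As an aside, the "manual loop unrolling" discussed in this thread means writing out the rank loop for the 2d (and 3d) cases instead of walking an arbitrary-rank coordinate vector. Schematically, in Python with invented names (odometer-style increment only, not the real code):

def next_coord(coord, bounds):
    # generic: carry from the last dimension upward
    for i in range(len(coord) - 1, -1, -1):
        if coord[i] < bounds[i][1]:
            coord[i] += 1
            return coord
        coord[i] = bounds[i][0]
    return coord

def next_coord_2d(coord, bounds):
    # 2d-specialized: the dimension loop is written out
    if coord[1] < bounds[1][1]:
        coord[1] += 1
    else:
        coord[1] = bounds[1][0]
        coord[0] += 1
    return coord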
From d_l_goldsmith at yahoo.com Sat Jun 13 15:26:38 2009
From: d_l_goldsmith at yahoo.com (David Goldsmith)
Date: Sat, 13 Jun 2009 12:26:38 -0700 (PDT)
Subject: [Numpy-discussion] Ready for review: PyArrayNeighIterObject, an iterator to iterate over a neighborhood in arbitrary arrays
Message-ID: <190072.52379.qm@web52103.mail.re2.yahoo.com>

--- On Sat, 6/13/09, Charles R Harris wrote:
> Some nitpicks:
>
> 3) Documentation is needed. In particular, I think it is worth
> mentioning that the number of bounds is taken from the
> PyArrayIterObject, which isn't the most transparent
> thing.

OP's recognition of this need acknowledged, I have a nitpick of my own: "documentation needed" is not a nitpick. ;-)

Thanks, David and Chuck.

DG

From d_l_goldsmith at yahoo.com Sat Jun 13 15:58:42 2009
From: d_l_goldsmith at yahoo.com (David Goldsmith)
Date: Sat, 13 Jun 2009 12:58:42 -0700 (PDT)
Subject: [Numpy-discussion] More on doc-ing new functions
Message-ID: <533868.82740.qm@web52112.mail.re2.yahoo.com>

Are new functions automatically added to the NumPy Doc Wiki? In particular: 0) is the documentation itself (assuming there is some) added in such a way that it can be edited by Wiki users; and 1) is the name of the function automatically added to a "best guess" category in the Milestones? If not, and if we can remember to do so, when adding a new function, please add an "issue" at http://code.google.com/p/numpydocmarathon09/issues to record that these chores need doing. Thanks!

DG

From jkington at wisc.edu Sat Jun 13 20:11:32 2009
From: jkington at wisc.edu (Joe Kington)
Date: Sat, 13 Jun 2009 19:11:32 -0500
Subject: [Numpy-discussion] Indexing with integer ndarrays
Message-ID:

Hi folks,

This is probably a very simple question, but it has me stumped...

I have a 2D integer array containing third-dimension indices that I'd like to use to index values in a 3D array.

Basically, I want the equivalent of:

> output = np.zeros((ny,nx))
>
> for i in xrange(ny):
>     for j in xrange(nx):
>         z = grid[i,j]
>         output[j,i] = bigVolume[j,i,z]

Where grid is my 2D array of indices and bigVolume is my 3D array.

I've read the numpy-user and numpybook sections on indexing with an integer ndarray, but I'm still not quite able to wrap my head around how it should work. I'm sure I'm missing something obvious (I haven't been using numpy particularly long).

If it helps anyone visualize it, I'm essentially trying to extract attributes out of a seismic volume along the surface of a horizon.

Thanks!
-Joe

From robert.kern at gmail.com Sat Jun 13 20:17:30 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Sat, 13 Jun 2009 19:17:30 -0500
Subject: [Numpy-discussion] Indexing with integer ndarrays
In-Reply-To:
References:
Message-ID: <3d375d730906131717vc73bb12ief2c72a48a8f4660@mail.gmail.com>

On Sat, Jun 13, 2009 at 19:11, Joe Kington wrote:
> Hi folks,
>
> This is probably a very simple question, but it has me stumped...
>
> I have a 2D integer array containing third-dimension indices that I'd like
> to use to index values in a 3D array.
>
> Basically, I want the equivalent of:
>
>> output = np.zeros((ny,nx))
>>
>> for i in xrange(ny):
>>     for j in xrange(nx):
>>         z = grid[i,j]
>>         output[j,i] = bigVolume[j,i,z]
>
> Where grid is my 2D array of indices and bigVolume is my 3D array.
>
> I've read the numpy-user and numpybook sections on indexing with an integer
> ndarray, but I'm still not quite able to wrap my head around how it should
> work. I'm sure I'm missing something obvious (I haven't been using numpy
> particularly long).
>
> If it helps anyone visualize it, I'm essentially trying to extract
> attributes out of a seismic volume along the surface of a horizon.

I discuss this particular use case (well, a little different; we are pulling out a thin slab around a horizon rather than a slice) here:

http://mail.scipy.org/pipermail/numpy-discussion/2008-July/035776.html

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
-- Umberto Eco

From jkington at wisc.edu Sat Jun 13 20:30:17 2009
From: jkington at wisc.edu (Joe Kington)
Date: Sat, 13 Jun 2009 19:30:17 -0500
Subject: [Numpy-discussion] Indexing with integer ndarrays
In-Reply-To: <3d375d730906131717vc73bb12ief2c72a48a8f4660@mail.gmail.com>
References: <3d375d730906131717vc73bb12ief2c72a48a8f4660@mail.gmail.com>
Message-ID:

Thank you! That answered things quite nicely. My apologies for not finding the earlier discussion before sending out the question...

Thanks again,
-Joe

On Sat, Jun 13, 2009 at 7:17 PM, Robert Kern wrote:
> I discuss this particular use case (well, a little different; we are
> pulling out a thin slab around a horizon rather than a slice) here:
>
> http://mail.scipy.org/pipermail/numpy-discussion/2008-July/035776.html

From geometrian at gmail.com Sun Jun 14 00:47:14 2009
From: geometrian at gmail.com (Ian Mallett)
Date: Sat, 13 Jun 2009 21:47:14 -0700
Subject: [Numpy-discussion] Elementary Array Switching
In-Reply-To: <3d375d730906121125r51ebb89bqc090ca1cfd9f94dc@mail.gmail.com>
References: <3d375d730906121125r51ebb89bqc090ca1cfd9f94dc@mail.gmail.com>
Message-ID:

It seems to be working now--I think my problem is elsewhere. Sorry...
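As a postscript to the array-switching thread above: the first/last channel swap can also be written without a temporary, as a reversed slice on the last axis. A minimal sketch, assuming a shape of n x n x 3 as in the original question:

>>> import numpy as np
>>> myarray = np.random.random((4, 4, 3))
>>> swapped = myarray[:, :, ::-1]   # exchanges channels 0 and 2; a view, no copy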
From cournape at gmail.com Sun Jun 14 03:59:09 2009
From: cournape at gmail.com (David Cournapeau)
Date: Sun, 14 Jun 2009 16:59:09 +0900
Subject: [Numpy-discussion] Ready for review: PyArrayNeighIterObject, an iterator to iterate over a neighborhood in arbitrary arrays
In-Reply-To:
References: <4A33ADC8.9010702@ar.media.kyoto-u.ac.jp> <5b8d13220906131135r6f285545rbfc6cc1699568924@mail.gmail.com>
Message-ID: <5b8d13220906140059k2322a70cs8cf14b43fa64babb@mail.gmail.com>

On Sun, Jun 14, 2009 at 3:51 AM, Charles R Harris wrote:
> IIRC, inline doesn't recurse, so there is some advantage to having these as
> macros. But I really dislike seeing macros defined inside of functions,
> especially when they aren't exclusive to that function.

Well, they are kind of exclusive to this function - and the special 2d case; they are not supposed to be used by themselves (which is why they are undefined right away). But I changed this anyway to ALL CAPS and defined outside.

> IOW, use macros judiciously.

It may not be obvious, because it looks really simple, but the code is heavily optimized. I spent several hours to find an implementation which works as it does now. The macro is used for a reason :)

> That's what I wanted to hear. But in C++ it is generally best for simplicity
> and debugging to start out not using inlines, then add them if benchmarks
> show a decent advantage. And in those cases it is best if the inlines are
> just a few lines long.

That's how I did it :) - and the inlines are short, I think.

To be more concrete about the numbers, the new correlate which uses the iterator gives me (first shape for x, second for y, computing correlate(x, y), time of one iteration on average for timeit):

# Using inline + unrolling (unrolling not implemented for 3d arrays)
100x100 with 20x20  0.0521752119064
300x300 with 10x10  0.0856973171234
9x9x9 with 8x8x8    0.0261607170105

# No explicit inline or unrolling
100x100 with 20x20  0.0859595060349
300x300 with 10x10  0.154387807846
9x9x9 with 8x8x8    0.0328009128571

This is on RHEL5, with a relatively old compiler (gcc 4.1). That's the kind of code where optimization flags/compiler/OS matter a lot (differences are over 100% on the same machine with different combinations of OS/compilers). As a comparison, the inline+unrolling version gives the same speed as the old code for correlate, which assumed contiguity and manipulated indices directly.

cheers,

David
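Roughly how such per-call numbers can be produced (a sketch only; scipy.signal.correlate stands in here for whichever build of correlate is being timed, and the sizes match the first row above):

>>> import timeit
>>> setup = ("import numpy as np; from scipy.signal import correlate; "
...          "x = np.random.rand(100, 100); y = np.random.rand(20, 20)")
>>> timeit.Timer("correlate(x, y)", setup).timeit(number=10) / 10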
From cournape at gmail.com Sun Jun 14 04:07:03 2009
From: cournape at gmail.com (David Cournapeau)
Date: Sun, 14 Jun 2009 17:07:03 +0900
Subject: [Numpy-discussion] Ready for review: PyArrayNeighIterObject, an iterator to iterate over a neighborhood in arbitrary arrays
In-Reply-To: <5b8d13220906140059k2322a70cs8cf14b43fa64babb@mail.gmail.com>
References: <4A33ADC8.9010702@ar.media.kyoto-u.ac.jp> <5b8d13220906131135r6f285545rbfc6cc1699568924@mail.gmail.com> <5b8d13220906140059k2322a70cs8cf14b43fa64babb@mail.gmail.com>
Message-ID: <5b8d13220906140107n6ad3bb26n2271c29e298d4c73@mail.gmail.com>

On Sun, Jun 14, 2009 at 4:59 PM, David Cournapeau wrote:
> It may not be obvious, because it looks really simple, but the code is
> heavily optimized. I spent several hours to find an implementation
> which works as it does now. The macro is used for a reason :)

Forgot a link to the updated version:

http://github.com/cournape/scipy3/blob/1a4a6b4619a5d4f954168a02ed485db1a3b6b8e8/scipy/signal/neighiter.h

cheers,

David

From mmueller at python-academy.de Sun Jun 14 07:44:20 2009
From: mmueller at python-academy.de (Mike Müller)
Date: Sun, 14 Jun 2009 13:44:20 +0200
Subject: [Numpy-discussion] [ANN] Reminder: EuroSciPy 2009 - Early Bird Deadline June 15, 2009
Message-ID: <4A34E294.4040401@python-academy.de>

EuroSciPy 2009 - Early Bird Deadline June 15, 2009
==================================================

The early bird deadline for EuroSciPy 2009 is June 15, 2009. Please register ( http://www.euroscipy.org/registration.html ) by this date to take advantage of the reduced early registration rate.

EuroSciPy 2009
==============

We're pleased to announce the EuroSciPy 2009 Conference to be held in Leipzig, Germany on July 25-26, 2009.

http://www.euroscipy.org

This is the second conference after the successful conference last year. Again, EuroSciPy will be a venue for the European community of users of the Python programming language in science.

Presentation Schedule
---------------------

The schedule of presentations for the EuroSciPy conference is online:
http://www.euroscipy.org/presentations/schedule.html

We have 16 talks from a variety of scientific fields.
All about using Python for scientific work.

Registration
------------

Registration is open. The registration fee is 100.00 € for early registrants and will increase to 150.00 € for late registration after June 15, 2009. On-site registration and registration after July 23, 2009 will be 200.00 €. Registration includes breakfast, snacks and lunch for Saturday and Sunday.

Please register here: http://www.euroscipy.org/registration.html

Important Dates
---------------

March 21      Registration opens
May 8         Abstract submission deadline
May 15        Acceptance of presentations
May 30        Announcement of conference program
June 15       Early bird registration deadline
July 15       Slides submission deadline
July 20 - 24  Pre-Conference courses
July 25/26    Conference
August 15     Paper submission deadline

Venue
-----

mediencampus
Poetenweg 28
04155 Leipzig
Germany

See http://www.euroscipy.org/venue.html for details.

Help Welcome
------------

Would you like to help make EuroSciPy 2009 a success? Here are some ways you can get involved:

* attend the conference
* submit an abstract for a presentation
* give a lightning talk
* make EuroSciPy known:
  - distribute the press release (http://www.euroscipy.org/media.html) to scientific magazines or other relevant media
  - write about it on your website
  - in your blog
  - talk to friends about it
  - post to local e-mail lists
  - post to related forums
  - spread flyers and posters in your institution
  - make entries in relevant event calendars
  - anything you can think of
* inform potential sponsors about the event
* become a sponsor

If you're interested in volunteering to help organize things or have some other idea that can help the conference, please email us at mmueller at python-academy dot de.

Sponsorship
-----------

Would you like to sponsor the conference? There are several options available:
http://www.euroscipy.org/sponsors/become_a_sponsor.html

Pre-Conference Courses
----------------------

Would you like to learn Python or about some of the most used scientific libraries in Python? Then the "Python Summer Course" [1] might be for you. There are two parts to this course:

* a two-day course "Introduction to Python" [2] for people with programming experience in other languages, and
* a three-day course "Python for Scientists and Engineers" [3] that introduces some of the most used Python tools for scientists and engineers such as NumPy, PyTables, and matplotlib

Both courses can be booked individually [4]. Of course, you can attend the courses without registering for EuroSciPy.

[1] http://www.python-academy.com/courses/python_summer_course.html
[2] http://www.python-academy.com/courses/python_course_programmers.html
[3] http://www.python-academy.com/courses/python_course_scientists.html
[4] http://www.python-academy.com/courses/dates.html

From kxroberto at googlemail.com Sun Jun 14 10:11:29 2009
From: kxroberto at googlemail.com (Robert)
Date: Sun, 14 Jun 2009 16:11:29 +0200
Subject: [Numpy-discussion] interleaving arrays
Message-ID:

What's the right way to efficiently weave arrays like this?
>>> n
array([1, 2, 3, 4])
>>> m
array([11, 22, 33, 44])
>>> o
array([111, 222, 333, 444])

=>

[ 1, 11, 111, 2, 22, 222, 3, 33, 333, 4, 44, 444]

From emmanuelle.gouillart at normalesup.org Sun Jun 14 10:21:43 2009
From: emmanuelle.gouillart at normalesup.org (Emmanuelle Gouillart)
Date: Sun, 14 Jun 2009 16:21:43 +0200
Subject: [Numpy-discussion] interleaving arrays
In-Reply-To:
References:
Message-ID: <20090614142143.GC12010@phare.normalesup.org>

a = np.empty(3*n.size, np.int)
a[::3] = n
a[1::3] = m
a[2::3] = o

or

np.array(zip(n,m,o)).ravel()

but the first solution is faster, even if you have to write more :D

Emmanuelle

On Sun, Jun 14, 2009 at 04:11:29PM +0200, Robert wrote:
> What's the right way to efficiently weave arrays like this?
> >>> n
> array([1, 2, 3, 4])
> >>> m
> array([11, 22, 33, 44])
> >>> o
> array([111, 222, 333, 444])
> =>
> [ 1, 11, 111, 2, 22, 222, 3, 33, 333, 4, 44, 444]

From tpk at kraussfamily.org Sun Jun 14 12:20:51 2009
From: tpk at kraussfamily.org (Tom K.)
Date: Sun, 14 Jun 2009 09:20:51 -0700 (PDT)
Subject: [Numpy-discussion] matrix default to column vector?
In-Reply-To:
References: <75c31b2a0905210610x1a321264r5b6f93d327ef2b36@mail.gmail.com> <23693116.post@talk.nabble.com> <4A283BD5.4080405@noaa.gov> <3699E128-2BCA-45E1-97A1-B0DE3B8E8B74@mac.com> <4A283F96.1050103@american.edu> <513936A2-C309-435B-B9AD-1B8B71D57B06@mac.com> <4A29561D.5070806@american.edu> <4A2A8B50.9030904@american.edu> <23907204.post@talk.nabble.com> <23910425.post@talk.nabble.com> <3d375d730906071208y7e3437d7xeeee9e321d0f78dc@mail.gmail.com> <4A2D622D.9050008@american.edu> <3d375d730906081233r3221d2ffq807196a049cab0ce@mail.gmail.com>
Message-ID: <24023215.post@talk.nabble.com>

jseabold wrote:
>
> On Mon, Jun 8, 2009 at 3:33 PM, Robert Kern wrote:
>> On Mon, Jun 8, 2009 at 14:10, Alan G Isaac wrote:
>>>>> Going back to Alan Isaac's example:
>>>>> 1) beta = (X.T*X).I * X.T * Y
>>>>> 2) beta = np.dot(np.dot(la.inv(np.dot(X.T,X)),X.T),Y)
>>>
>>> Robert Kern wrote:
>>>> 4) beta = la.lstsq(X, Y)[0]
>>>>
>>>> I really hate that example.

I propose the following alternative for discussion:

U, s, Vh = np.linalg.svd(X, full_matrices=False)
1) beta = Vh.T*np.asmatrix(np.diag(1/s))*U.T*Y
2) beta = np.dot(Vh.T, np.dot(np.diag(1/s), np.dot(U.T, Y)))

1) is with X and Y starting out as matrices, 2) is with arrays. Even diag results in an array that has to be matricized. Sigh.

-- 
View this message in context: http://www.nabble.com/matrix-default-to-column-vector--tp23652920p24023215.html
Sent from the Numpy-discussion mailing list archive at Nabble.com.
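Both spellings can be collapsed further, since np.linalg.pinv computes exactly this SVD-based pseudo-inverse. A quick sketch with made-up data:

>>> import numpy as np
>>> X = np.random.random((10, 3))
>>> Y = np.random.random((10, 1))
>>> beta = np.dot(np.linalg.pinv(X), Y)   # pinv(X) is Vh.T * diag(1/s) * U.T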
From charlesr.harris at gmail.com Sun Jun 14 12:45:30 2009
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Sun, 14 Jun 2009 10:45:30 -0600
Subject: [Numpy-discussion] Ready for review: PyArrayNeighIterObject, an iterator to iterate over a neighborhood in arbitrary arrays
In-Reply-To: <5b8d13220906140107n6ad3bb26n2271c29e298d4c73@mail.gmail.com>
References: <4A33ADC8.9010702@ar.media.kyoto-u.ac.jp> <5b8d13220906131135r6f285545rbfc6cc1699568924@mail.gmail.com> <5b8d13220906140059k2322a70cs8cf14b43fa64babb@mail.gmail.com> <5b8d13220906140107n6ad3bb26n2271c29e298d4c73@mail.gmail.com>
Message-ID:

On Sun, Jun 14, 2009 at 2:07 AM, David Cournapeau wrote:
> Forgot a link to the updated version:
>
> http://github.com/cournape/scipy3/blob/1a4a6b4619a5d4f954168a02ed485db1a3b6b8e8/scipy/signal/neighiter.h

Looking good. A few more nitpicks ;)

1) The documentation of PyObject_Init doesn't say whether it is NULL safe, so I think there needs to be a check here before the call:

ret = PyArray_malloc(sizeof(*ret));
PyObject_Init((PyObject *)ret, &PyArrayNeighborhoodIter_Type);
if (ret == NULL) {
    return NULL;
}

2) Do the bounds need to be ordered? If so, that should be mentioned and checked.

3) In the documentation x is used but the function prototype uses iter.

4) I don't think the reference is borrowed since it is incremented if the ctor succeeds. I think the point here is that the user doesn't need to worry about it.

5) There should be spaces around the "-" here:

for (i = iter->nd-1; i >= 0; --i)

Likewise, the convention in python seems to be a space between the "for" and "(".

6) If the functions use neighborhood (I do think that looks better), then the file names should also.
Chuck

From charlesr.harris at gmail.com Sun Jun 14 13:02:53 2009
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Sun, 14 Jun 2009 11:02:53 -0600
Subject: [Numpy-discussion] matrix default to column vector?
In-Reply-To: <24023215.post@talk.nabble.com>
References: <75c31b2a0905210610x1a321264r5b6f93d327ef2b36@mail.gmail.com> <23907204.post@talk.nabble.com> <23910425.post@talk.nabble.com> <3d375d730906071208y7e3437d7xeeee9e321d0f78dc@mail.gmail.com> <4A2D622D.9050008@american.edu> <3d375d730906081233r3221d2ffq807196a049cab0ce@mail.gmail.com> <24023215.post@talk.nabble.com>
Message-ID:

On Sun, Jun 14, 2009 at 10:20 AM, Tom K. wrote:
> I propose the following alternative for discussion:
>
> U, s, Vh = np.linalg.svd(X, full_matrices=False)
> 1) beta = Vh.T*np.asmatrix(np.diag(1/s))*U.T*Y
> 2) beta = np.dot(Vh.T, np.dot(np.diag(1/s), np.dot(U.T, Y)))
>
> 1) is with X and Y starting out as matrices, 2) is with arrays. Even diag
> results in an array that has to be matricized. Sigh.

The problem is that I left the diagonal returned by svd as an array rather than a matrix for backward compatibility. Diag returns the diagonal when presented with a 2d matrix (array). I went back and forth on that, but not doing so would have required everyone to change their code to use diagflat instead of diag. I do note that diag doesn't preserve matrices and that looks like a bug to me.

This is also an argument against over-overloaded functions such as diag. Such functions are one of the blemishes of MatLab, IMHO. OTOH, there is something of a shortage of short, meaningful names.

Chuck

From bryan at cole.uklinux.net Sun Jun 14 15:31:49 2009
From: bryan at cole.uklinux.net (Bryan Cole)
Date: Sun, 14 Jun 2009 20:31:49 +0100
Subject: [Numpy-discussion] passing arrays between processes
Message-ID: <1245007907.8230.11.camel@pc2.cole.uklinux.net>

I'm starting work on an application involving cpu-intensive data processing using a quad-core PC. I've not worked with multi-core systems previously and I'm wondering what is the best way to utilise the hardware when working with numpy arrays. I think I'm going to use the multiprocessing package, but what's the best way to pass arrays between processes?

I'm unsure of the relative merits of pipes vs shared mem. Unfortunately, I don't have access to the quad-core machine to benchmark stuff right now. Any advice would be appreciated.

In case it's relevant: the data takes the form of a stream of numpy.double arrays with sizes in the range 2000 to 10000.

cheers,
Bryan

From dagss at student.matnat.uio.no Sun Jun 14 16:04:52 2009
From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn)
Date: Sun, 14 Jun 2009 22:04:52 +0200
Subject: [Numpy-discussion] passing arrays between processes
In-Reply-To: <1245007907.8230.11.camel@pc2.cole.uklinux.net>
References: <1245007907.8230.11.camel@pc2.cole.uklinux.net>
Message-ID: <4A3557E4.9020102@student.matnat.uio.no>

Bryan Cole wrote:
> I'm starting work on an application involving cpu-intensive data
> processing using a quad-core PC. I've not worked with multi-core systems
> previously and I'm wondering what is the best way to utilise the
> hardware when working with numpy arrays.
I think I'm going to use the > multiprocessing package, but what's the best way to pass arrays between > processes? You may want to look at MPI, e.g. mpi4py is convenient for this kind of work. For numerical work across processes it is close to a de facto standard. It requires an MPI implementation set up on your machine though (but for single-machine use this isn't hard to set up, typically just install e.g. OpenMPI), and that you launch Python through mpiexec -n 4 python somescript.py -- Dag Sverre From bryan at cole.uklinux.net Sun Jun 14 16:24:50 2009 From: bryan at cole.uklinux.net (Bryan Cole) Date: Sun, 14 Jun 2009 21:24:50 +0100 Subject: [Numpy-discussion] passing arrays between processes In-Reply-To: <4A3557E4.9020102@student.matnat.uio.no> References: <1245007907.8230.11.camel@pc2.cole.uklinux.net> <4A3557E4.9020102@student.matnat.uio.no> Message-ID: <1245011089.8230.14.camel@pc2.cole.uklinux.net> > > You may want to look at MPI, e.g. mpi4py is convenient for this kind of > work. For numerical work across processes it is close to a de facto > standard. > > It requires an MPI implementation set up on your machine though (but for > single-machine use this isn't hard to set up, typically just install > e.g. OpenMPI), and that you launch Python through > > mpiexec -n 4 python somescript.py > Thanks for this tip. I'm looking at the mpi4py docs now. Seems kinda complicated... In fact, I should have specified previously: I need to deploy on MS-Win. On first glance, I can't see that mpi4py is installable on Windows. Bryan From bryan at cole.uklinux.net Sun Jun 14 16:27:11 2009 From: bryan at cole.uklinux.net (Bryan Cole) Date: Sun, 14 Jun 2009 21:27:11 +0100 Subject: [Numpy-discussion] passing arrays between processes In-Reply-To: <1245011089.8230.14.camel@pc2.cole.uklinux.net> References: <1245007907.8230.11.camel@pc2.cole.uklinux.net> <4A3557E4.9020102@student.matnat.uio.no> <1245011089.8230.14.camel@pc2.cole.uklinux.net> Message-ID: <1245011230.8230.15.camel@pc2.cole.uklinux.net> > In fact, I should have specified previously: I need to > deploy on MS-Win. On first glance, I can't see that mpi4py is > installable on Windows. My mistake. I see it's included in Enthon, which I'm using. Bryan > > > Bryan From robert.kern at gmail.com Sun Jun 14 16:50:31 2009 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 14 Jun 2009 15:50:31 -0500 Subject: [Numpy-discussion] passing arrays between processes In-Reply-To: <1245007907.8230.11.camel@pc2.cole.uklinux.net> References: <1245007907.8230.11.camel@pc2.cole.uklinux.net> Message-ID: <3d375d730906141350t46a03277v10a8c3f2c19cd0f8@mail.gmail.com> On Sun, Jun 14, 2009 at 14:31, Bryan Cole wrote: > I'm starting work on an application involving cpu-intensive data > processing using a quad-core PC. I've not worked with multi-core systems > previously and I'm wondering what is the best way to utilise the > hardware when working with numpy arrays. I think I'm going to use the > multiprocessing package, but what's the best way to pass arrays between > processes? > > I'm unsure of the relative merits of pipes vs shared mem. Unfortunately, > I don't have access to the quad-core machine to benchmark stuff right > now. Any advice would be appreciated. You can see a previous discussion on scipy-user in February titled "shared memory machines" about using arrays backed by shared memory with multiprocessing. 
Particularly this message:

http://mail.scipy.org/pipermail/scipy-user/2009-February/019935.html

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

From neilcrighton at gmail.com  Sun Jun 14 18:40:50 2009
From: neilcrighton at gmail.com (Neil Crighton)
Date: Sun, 14 Jun 2009 22:40:50 +0000 (UTC)
Subject: [Numpy-discussion] improving arraysetops
References: <4A2E1F62.4010208@ntc.zcu.cz>
Message-ID: 

Robert Cimrman ntc.zcu.cz> writes:

> Hi,
> 
> I am starting a new thread, so that it reaches the interested people.
> Let us discuss improvements to arraysetops (array set operations) at [1]
> (allowing non-unique arrays as function arguments, better naming
> conventions and documentation).
> 
> r.
> 
> [1] http://projects.scipy.org/numpy/ticket/1133
> 

Hi,

These changes look good to me. For point (1) I think we should fold the
unique and _nu code into a single function. For point (3) I like in1d -
it's shorter than isin1d but is still clear.

What about merging unique and unique1d? They're essentially identical for
an array input, but unique uses the builtin set() for non-array inputs and
so is around 2x faster in this case - see below. Is it worth accepting a
speed regression for unique to get rid of the function duplication? (Or
can they be combined?)

Neil


In [24]: l = list(np.random.randint(100, size=10000))
In [25]: %timeit np.unique1d(l)
1000 loops, best of 3: 1.9 ms per loop
In [26]: %timeit np.unique(l)
1000 loops, best of 3: 793 µs per loop
In [27]: l = list(np.random.randint(100, size=1000000))
In [28]: %timeit np.unique(l)
10 loops, best of 3: 78 ms per loop
In [29]: %timeit np.unique1d(l)
10 loops, best of 3: 233 ms per loop

From bryan at cole.uklinux.net  Mon Jun 15 02:22:55 2009
From: bryan at cole.uklinux.net (Bryan Cole)
Date: Mon, 15 Jun 2009 07:22:55 +0100
Subject: [Numpy-discussion] passing arrays between processes
In-Reply-To: <3d375d730906141350t46a03277v10a8c3f2c19cd0f8@mail.gmail.com>
References: <1245007907.8230.11.camel@pc2.cole.uklinux.net>
	<3d375d730906141350t46a03277v10a8c3f2c19cd0f8@mail.gmail.com>
Message-ID: <1245046974.14263.4.camel@pc2.cole.uklinux.net>

On Sun, 2009-06-14 at 15:50 -0500, Robert Kern wrote:
> On Sun, Jun 14, 2009 at 14:31, Bryan Cole wrote:
> > I'm starting work on an application involving cpu-intensive data
> > processing using a quad-core PC. I've not worked with multi-core systems
> > previously and I'm wondering what is the best way to utilise the
> > hardware when working with numpy arrays. I think I'm going to use the
> > multiprocessing package, but what's the best way to pass arrays between
> > processes?
> >
> > I'm unsure of the relative merits of pipes vs shared mem. Unfortunately,
> > I don't have access to the quad-core machine to benchmark stuff right
> > now. Any advice would be appreciated.
> 
> You can see a previous discussion on scipy-user in February titled
> "shared memory machines" about using arrays backed by shared memory
> with multiprocessing. Particularly this message:
> 
> http://mail.scipy.org/pipermail/scipy-user/2009-February/019935.html

Thanks.

Does Sturla's extension have any advantages over using a
multiprocessing.sharedctypes.RawArray accessed as a numpy view?
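(For concreteness, a minimal sketch of the RawArray-as-numpy-view recipe
referred to here; the dtype, shape and the in-place doubling done in the
worker are arbitrary examples:)

import ctypes
import multiprocessing as mp
from multiprocessing.sharedctypes import RawArray

import numpy as np

def worker(raw, shape):
    # Re-wrap the shared buffer as an ndarray view; no data is copied.
    arr = np.frombuffer(raw, dtype=np.float64).reshape(shape)
    arr *= 2.0  # modifies the shared memory in place

if __name__ == '__main__':
    shape = (4, 1000)
    raw = RawArray(ctypes.c_double, shape[0] * shape[1])
    arr = np.frombuffer(raw, dtype=np.float64).reshape(shape)
    arr[:] = 1.0
    p = mp.Process(target=worker, args=(raw, shape))
    p.start()
    p.join()
    print(arr[0, :3])  # doubled in the child, visible here without a pipe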
Bryan From fperez.net at gmail.com Mon Jun 15 03:47:29 2009 From: fperez.net at gmail.com (Fernando Perez) Date: Mon, 15 Jun 2009 00:47:29 -0700 Subject: [Numpy-discussion] Tutorial topics for SciPy'09 Conference In-Reply-To: References: Message-ID: Hi all, In order to proceed with contacting speakers, we'd now like to get some feedback from you. This Doodle poll should take no more than a couple of minutes to fill out (no password or registration required): http://doodle.com/hb5bea6fivm3b5bk So please let us know which topics you are most interested in, and we'll do our best to accommodate everyone. Keep in mind that speaker availability and balancing out the topics means that the actual tutorials offered probably won't be exactly the list of top 8 voted topics, but the feedback will certainly help us steer the decision process. Thanks for your time, Dave Peterson and Fernando Perez On Mon, Jun 1, 2009 at 10:21 PM, Fernando Perez wrote: > Hi all, > > The time for the Scipy'09 conference is rapidly approaching, and we > would like to both announce the plan for tutorials and solicit > feedback from everyone on topics of interest. > > Broadly speaking, the plan is something along the lines of ?what we > had last year: one continuous 2-day tutorial ?aimed at introductory > users, starting from the very basics, and in parallel a set of > 'advanced' tutorials, consisting of a series of 2-hour sessions on > specific ?topics. > > We will request that the presenters for the advanced tutorials keep > the 'tutorial' word very much in mind, so that the sessions really > contain hands-on learning work and not simply a 2-hour long slide > presentation. ?We will ?thus require that all the tutorials will be > based on tools that the attendees can install at least 2 weeks in > advance on all ?platforms (no "I released it last night" software). > > With that in mind, we'd like feedback from all of you on possible > topics for the advanced tutorials. ?We have space for 8 slots total, > and here are in no particular order some possible topics. ?At this > point there are no guarantees yet that we can get presentations for > these, but we'd like to establish a first list of preferred topics to > try and secure the presentations as soon as possible. > > This is simply a list of candiate topics that various people have > informally suggested so far: > > - Mayavi/TVTK > - Advanced topics in matplotlib > - Statistics with Scipy > - The TimeSeries scikit > - Designing scientific interfaces with Traits > - Advanced numpy > - Sparse Linear Algebra with Scipy > - Structured and record arrays in numpy > - Cython > - Sage - general tutorial > - Sage - specific topics, suggestions welcome > - Using GPUs with PyCUDA > - Testing strategies for scientific codes > - Parallel processing and mpi4py > - Graph theory with Networkx > - Design patterns for efficient iterator-based scientific codes. > - Symbolic computing with sympy > > We'd like to hear from any ideas on other possible topics of interest, > and we'll then run a doodle poll ?to gather quantitative feedback with > the final list of candidates. > > Many thanks, > > f > From pav at iki.fi Mon Jun 15 04:30:13 2009 From: pav at iki.fi (Pauli Virtanen) Date: Mon, 15 Jun 2009 08:30:13 +0000 (UTC) Subject: [Numpy-discussion] More on doc-ing new functions References: <533868.82740.qm@web52112.mail.re2.yahoo.com> Message-ID: Sat, 13 Jun 2009 12:58:42 -0700, David Goldsmith kirjoitti: > Are new functions automatically added to the Numpy Doc Wiki? 
In > particular: 0) is the documentation itself (assuming there is some) > added in such a way that it can be edited by Wiki users; Yes, new functions appear in the wiki, but, > and 1) is the > name of the function automatically added to a "best guess" category in > the Milestones? they do not automatically appear on the Milestones page. More importantly, new functions must also be added (via the wiki) to the proper .rst file, eg., http://docs.scipy.org/numpy/docs/numpy-docs/reference/routines.set.rst/ in order to be included in the final documentation. -- Pauli Virtanen From pav at iki.fi Mon Jun 15 04:31:54 2009 From: pav at iki.fi (Pauli Virtanen) Date: Mon, 15 Jun 2009 08:31:54 +0000 (UTC) Subject: [Numpy-discussion] Ready for review: PyArrayNeighIterObject, an iterator to iterate over a neighborhood in arbitrary arrays References: <4A33ADC8.9010702@ar.media.kyoto-u.ac.jp> Message-ID: Sat, 13 Jun 2009 12:00:53 -0600, Charles R Harris kirjoitti: > > 3) Documentation is needed. In particular, I think it worth mentioning > that the number of bounds is taken from the PyArrayIterObject, which > isn't the most transparent thing. For reference, the docs should probably go here: http://docs.scipy.org/numpy/docs/numpy-docs/reference/c-api.array.rst/#array-iterators Probably as a new subsection. -- Pauli Virtanen From geometrian at gmail.com Mon Jun 15 04:41:03 2009 From: geometrian at gmail.com (Ian Mallett) Date: Mon, 15 Jun 2009 01:41:03 -0700 Subject: [Numpy-discussion] Interleaved Arrays and Message-ID: Hi, So I'm trying to get a certain sort of 3D terrain working in PyOpenGL. The idea is to get vertex buffer objects to draw a simple 2D plane comprised of many flat polygons, and use a vertex shader to deform that with a heightmap and map that on a sphere. I've managed to do this with a grid (simple points), making the vertex buffer object: threedimensionalgrid = dstack(mgrid[0:size,0:size,0:1])/float(size-1) twodimensionalgrid = threedimensionalgrid.reshape(self.size_squared,3) floattwodimensionalgrid = array(twodimensionalgrid,"f") self.vertex_vbo = vbo.VBO(floattwodimensionalgrid) However, landscapes tend to be, um, solid :D So, the landscape needs to be drawn as quads or triangles. Strips of triangles will be most effective, and the data must be specified to vbo.VBO() in a certain way: n = #blah testlist = [] for x in xrange(n): for y in xrange(n): testlist.append([x,y]) testlist.append([x+1,y]) If "testlist" is an array (i.e., I could go: "array(testlist)"), it works nicely. However, my Python method is certainly improveable with numpy. I suspect the best way is interleaving the arrays [x,y->yn] and [x+1,y->yn] ntimes, but I couldn't figure out how to do that... Help? Thanks, Ian -------------- next part -------------- An HTML attachment was scrubbed... URL: From cimrman3 at ntc.zcu.cz Mon Jun 15 05:55:11 2009 From: cimrman3 at ntc.zcu.cz (Robert Cimrman) Date: Mon, 15 Jun 2009 11:55:11 +0200 Subject: [Numpy-discussion] improving arraysetops In-Reply-To: References: <4A2E1F62.4010208@ntc.zcu.cz> Message-ID: <4A361A7F.6050100@ntc.zcu.cz> Neil Crighton wrote: > Robert Cimrman ntc.zcu.cz> writes: > >> Hi, >> >> I am starting a new thread, so that it reaches the interested people. >> Let us discuss improvements to arraysetops (array set operations) at [1] >> (allowing non-unique arrays as function arguments, better naming >> conventions and documentation). >> >> r. >> >> [1] http://projects.scipy.org/numpy/ticket/1133 >> > > Hi, > > These changes looks good to me. 
For point (1) I think we should fold the > unique and _nu code into a single function. For point (3) I like in1d - it's > shorter than isin1d but is still clear. yes, the _nu functions will be useless then, their bodies can be moved into the generic functions. > What about merging unique and unique1d? They're essentially identical for an > array input, but unique uses the builtin set() for non-array inputs and so is > around 2x faster in this case - see below. Is it worth accepting a speed > regression for unique to get rid of the function duplication? (Or can they be > combined?) unique1d can return the indices - can this be achieved by using set(), too? The implementation for arrays is the same already, IMHO, so I would prefer adding return_index, return_inverse to unique (automatically converting input to array, if necessary), and deprecate unique1d. We can view it also as adding the set() approach to unique1d, when the return_index, return_inverse arguments are not set, and renaming unique1d -> unique. > Neil > > > In [24]: l = list(np.random.randint(100, size=10000)) > In [25]: %timeit np.unique1d(l) > 1000 loops, best of 3: 1.9 ms per loop > In [26]: %timeit np.unique(l) > 1000 loops, best of 3: 793 ?s per loop > In [27]: l = list(np.random.randint(100, size=1000000)) > In [28]: %timeit np.unique(l) > 10 loops, best of 3: 78 ms per loop > In [29]: %timeit np.unique1d(l) > 10 loops, best of 3: 233 ms per loop I have found a strange bug in unique(): In [24]: l = list(np.random.randint(100, size=1000)) In [25]: %timeit np.unique(l) --------------------------------------------------------------------------- UnicodeEncodeError Traceback (most recent call last) /usr/lib64/python2.5/site-packages/IPython/iplib.py in ipmagic(self, arg_s) 951 else: 952 magic_args = self.var_expand(magic_args,1) --> 953 return fn(magic_args) 954 955 def ipalias(self,arg_s): /usr/lib64/python2.5/site-packages/IPython/Magic.py in magic_timeit(self, parameter_s) 1829 precision, 1830 best * scaling[order], -> 1831 units[order]) 1832 if tc > tc_min: 1833 print "Compiler time: %.2f s" % tc UnicodeEncodeError: 'ascii' codec can't encode character u'\xb5' in position 28: ordinal not in range(128) It disappears after increasing the array size, or the integer size. In [39]: np.__version__ Out[39]: '1.4.0.dev7047' r. From Fadhley.Salim at uk.calyon.com Mon Jun 15 06:47:48 2009 From: Fadhley.Salim at uk.calyon.com (Fadhley Salim) Date: Mon, 15 Jun 2009 11:47:48 +0100 Subject: [Numpy-discussion] Scipy 0.6.0 to 0.7.0, sparse matrix change In-Reply-To: <4A361A7F.6050100@ntc.zcu.cz> References: <4A2E1F62.4010208@ntc.zcu.cz> <4A361A7F.6050100@ntc.zcu.cz> Message-ID: I'm trying to track down a numerical discrepancy in our proejct. We noticed that a certain set of results are different having upgraded from scipy 0.6.0 to 0.7.0. The following item from the Scipy change-log is our current number-one suspect. Could anybody who knows suggest what was actually involved in the change which I have highlighted with stars below? Thanks Sparse Matrices --------------- [...] The handling of diagonals in the ``spdiags`` function has been changed. It now agrees with the MATLAB(TM) function of the same name. *** Numerous efficiency improvements to format conversions and sparse matrix arithmetic have been made. Finally, this release contains numerous bugfixes. 
***

From robert.kern at gmail.com  Mon Jun 15 11:43:03 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Mon, 15 Jun 2009 10:43:03 -0500
Subject: [Numpy-discussion] passing arrays between processes
In-Reply-To: <1245046974.14263.4.camel@pc2.cole.uklinux.net>
References: <1245007907.8230.11.camel@pc2.cole.uklinux.net>
	<3d375d730906141350t46a03277v10a8c3f2c19cd0f8@mail.gmail.com>
	<1245046974.14263.4.camel@pc2.cole.uklinux.net>
Message-ID: <3d375d730906150843s4d693580hb1d783957a92df86@mail.gmail.com>

On Mon, Jun 15, 2009 at 01:22, Bryan Cole wrote:
> On Sun, 2009-06-14 at 15:50 -0500, Robert Kern wrote:
>> On Sun, Jun 14, 2009 at 14:31, Bryan Cole wrote:
>> > I'm starting work on an application involving cpu-intensive data
>> > processing using a quad-core PC. I've not worked with multi-core systems
>> > previously and I'm wondering what is the best way to utilise the
>> > hardware when working with numpy arrays. I think I'm going to use the
>> > multiprocessing package, but what's the best way to pass arrays between
>> > processes?
>> >
>> > I'm unsure of the relative merits of pipes vs shared mem. Unfortunately,
>> > I don't have access to the quad-core machine to benchmark stuff right
>> > now. Any advice would be appreciated.
>>
>> You can see a previous discussion on scipy-user in February titled
>> "shared memory machines" about using arrays backed by shared memory
>> with multiprocessing. Particularly this message:
>>
>> http://mail.scipy.org/pipermail/scipy-user/2009-February/019935.html
>
> Thanks.
> > Does Sturla's extension have any advantages over using a > multiprocessing.sharedctypes.RawArray accessed as a numpy view? It will be easier to write code that correctly holds and releases the shared memory with Sturla's extension. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From d_l_goldsmith at yahoo.com Mon Jun 15 12:52:05 2009 From: d_l_goldsmith at yahoo.com (David Goldsmith) Date: Mon, 15 Jun 2009 09:52:05 -0700 (PDT) Subject: [Numpy-discussion] More on doc-ing new functions Message-ID: <494379.70739.qm@web52111.mail.re2.yahoo.com> Thanks, Pauli. Obvious follow-up: --- On Mon, 6/15/09, Pauli Virtanen wrote: > David Goldsmith kirjoitti: > > > Are new functions automatically added to the Numpy Doc > Wiki?? In > > particular: 0) is the documentation itself (assuming > there is some) > > added in such a way that it can be edited by Wiki > users; > > Yes, new functions appear in the wiki, but, > > > and 1) is the > > name of the function automatically added to a "best > guess" category in > > the Milestones?? > > they do not automatically appear on the Milestones page. > > More importantly, new functions must also be added (via the > wiki) to the > proper .rst file, eg., > > ??? http://docs.scipy.org/numpy/docs/numpy-docs/reference/routines.set.rst/ > > in order to be included in the final documentation. > > Pauli Virtanen Is there a protocol for making sure these things get done? (Just don't want to reinvent the wheel.) DG From d_l_goldsmith at yahoo.com Mon Jun 15 12:53:02 2009 From: d_l_goldsmith at yahoo.com (David Goldsmith) Date: Mon, 15 Jun 2009 09:53:02 -0700 (PDT) Subject: [Numpy-discussion] Ready for review: PyArrayNeighIterObject, an iterator to iterate over a neighborhood in arbitrary arrays Message-ID: <433794.99877.qm@web52106.mail.re2.yahoo.com> Thanks, Pauli! DG --- On Mon, 6/15/09, Pauli Virtanen wrote: > From: Pauli Virtanen > Subject: Re: [Numpy-discussion] Ready for review: PyArrayNeighIterObject, an iterator to iterate over a neighborhood in arbitrary arrays > To: numpy-discussion at scipy.org > Date: Monday, June 15, 2009, 1:31 AM > Sat, 13 Jun 2009 12:00:53 -0600, > Charles R Harris kirjoitti: > > > > 3) Documentation is needed. In particular, I think it > worth mentioning > > that the number of bounds is taken from the > PyArrayIterObject, which > > isn't the most transparent thing. > > For reference, the docs should probably go here: > > http://docs.scipy.org/numpy/docs/numpy-docs/reference/c-api.array.rst/#array-iterators > > Probably as a new subsection. > > -- > Pauli Virtanen > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From dalcinl at gmail.com Mon Jun 15 16:42:41 2009 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Mon, 15 Jun 2009 17:42:41 -0300 Subject: [Numpy-discussion] passing arrays between processes In-Reply-To: <1245011230.8230.15.camel@pc2.cole.uklinux.net> References: <1245007907.8230.11.camel@pc2.cole.uklinux.net> <4A3557E4.9020102@student.matnat.uio.no> <1245011089.8230.14.camel@pc2.cole.uklinux.net> <1245011230.8230.15.camel@pc2.cole.uklinux.net> Message-ID: On Sun, Jun 14, 2009 at 5:27 PM, Bryan Cole wrote: >> ?In fact, I should have specified previously: I need to >> deploy on MS-Win. 
On first glance, I can't see that mpi4py is >> installable on Windows. > > My mistake. I see it's included in Enthon, which I'm using. > Hi, Bryan... I'm the author of mpi4py... If you are going to run your code in a single multicore machine, then you should likely use Sturla's extension... As you noticed, MPI is a bit "complicated". Moreover, you will have two dependencies: the core MPI implementation, and mpi4py. These "complications" and extra dependencies however do make sense in the case of DISTRIBUTED computing, i.e, you want to take advantage of many machines to perform your computations. In such cases, MPI is the "smart" approach, and mpi4py the best wrapper out there... -- Lisandro Dalc?n --------------- Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) PTLC - G?emes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From oliphant at enthought.com Mon Jun 15 16:31:16 2009 From: oliphant at enthought.com (Travis Oliphant) Date: Mon, 15 Jun 2009 15:31:16 -0500 Subject: [Numpy-discussion] Join us for the 2nd Scientific Computing with Python Webinar Message-ID: <9806A7FA-C9F7-4491-8AE1-514DC13851F8@enthought.com> Hello all Python users: I am pleased to announce the second installment of a free Webinar series that discusses using Python for scientific computing. Enthought hosts this free series which takes place once a month for about 60-90 minutes. The schedule and length may change based on participation feedback, but for now it is scheduled for the third Friday of every month. This free webinar should not be confused with the EPD webinar on the first Friday of each month which is open only to subscribers to the Enthought Python Distribution at the Basic level or above. This session's speakers will be me (Travis Oliphant) and Peter Wang. I will show off a bit of EPDLab which is an interactive Python environment built using IPython, Traits, and Envisage. Peter Wang will present a demo of Chaco and provide some examples of interactive visualizations that can be easily constructed using it's classes. If there is time after the Chaco demo, I will continue the discussion about Mayavi, but I suspect this will have to wait until the next session. All of the tools we will show are open-source, freely- available tools from multiple sources. They can all be conveniently installed using the Enthought Python Distribution. This event will take place on Friday, June 19th at 1:00pm CDT and will last 60 to 90 minutes depending on the questions asked. If you would like to participate, please register by clicking on the link below or going to https://www1.gotomeeting.com/register/303689873. There will be a 15 minute technical help-session prior to the on-line meeting which you should plan to use if you have never participated in a GoToWebinar previously. During this time you can test your connection and audio equipment as well as familiarize yourself with the GoTo Meeting software (which currently only works with Mac and Windows systems). I am looking forward to interacting with many of you again this Friday. Best regards, Travis Oliphant Enthought, Inc. Enthought is the company that sponsored the creation of SciPy and the Enthought Tool Suite. It continues to sponsor the SciPy community by hosting the SciPy mailing list and website and participating in the development of SciPy and NumPy. 
Enthought creates custom scientific and technical software applications and provides training on using Python for technical computing. Enthought also provides the Enthought Python Distribution. Learn more at http://www.enthought.com Bios for Travis Oliphant and Peter Wang can be read at http://www.enthought.com/company/executive-team.php -- Travis Oliphant Enthought Inc. 1-512-536-1057 http://www.enthought.com oliphant at enthought.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From brennan.williams at visualreservoir.com Mon Jun 15 18:27:13 2009 From: brennan.williams at visualreservoir.com (Brennan Williams) Date: Tue, 16 Jun 2009 10:27:13 +1200 Subject: [Numpy-discussion] npfile deprecation warning Message-ID: <4A36CAC1.8040604@visualreservoir.com> Hi I'm using npfile which is giving me a deprecation warning. For the time being I want to continue using it but I would like to suppress the warning messages. Is it possible to trap the deprecation warning but still have the npfile go ahead? Thanks Brennan From robert.kern at gmail.com Mon Jun 15 18:33:02 2009 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 15 Jun 2009 17:33:02 -0500 Subject: [Numpy-discussion] npfile deprecation warning In-Reply-To: <4A36CAC1.8040604@visualreservoir.com> References: <4A36CAC1.8040604@visualreservoir.com> Message-ID: <3d375d730906151533t57b04203sf7edc68852611579@mail.gmail.com> On Mon, Jun 15, 2009 at 17:27, Brennan Williams wrote: > Hi > > I'm using npfile which is giving me a deprecation warning. For the time > being I want to continue using it but I would like to suppress > the warning messages. Is it possible to trap the deprecation warning but > still have the npfile go ahead? http://docs.python.org/library/warnings -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From brennan.williams at visualreservoir.com Mon Jun 15 19:48:40 2009 From: brennan.williams at visualreservoir.com (Brennan Williams) Date: Tue, 16 Jun 2009 11:48:40 +1200 Subject: [Numpy-discussion] npfile deprecation warning In-Reply-To: <3d375d730906151533t57b04203sf7edc68852611579@mail.gmail.com> References: <4A36CAC1.8040604@visualreservoir.com> <3d375d730906151533t57b04203sf7edc68852611579@mail.gmail.com> Message-ID: <4A36DDD8.2000103@visualreservoir.com> Robert Kern wrote: > On Mon, Jun 15, 2009 at 17:27, Brennan > Williams wrote: > >> Hi >> >> I'm using npfile which is giving me a deprecation warning. For the time >> being I want to continue using it but I would like to suppress >> the warning messages. Is it possible to trap the deprecation warning but >> still have the npfile go ahead? >> > > http://docs.python.org/library/warnings > > Thanks. OK I've put the following in my code... import warnings def fxn(): warnings.warn("deprecated", DeprecationWarning) with warnings.catch_warnings(): warnings.simplefilter("ignore") fxn() but I'm getting an invalid syntax error... with warnings.catch_warnings(): ^ SyntaxError: invalid syntax I haven't used "with" before. Is this supposed to go in the function def where I use npfile? I've put it near the top of my .py file after my imports and before my class definitions. 
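(The failing construct above is the Python 2.6 context-manager form; a
sketch of a 2.5-friendly alternative that silences only DeprecationWarning
is below -- plain warnings-module usage, nothing npfile-specific:)

import warnings

# Ignore DeprecationWarning everywhere; other warning categories still show.
warnings.filterwarnings('ignore', category=DeprecationWarning)

# Or, more narrowly, ignore only deprecation messages mentioning npfile:
warnings.filterwarnings('ignore', message='.*npfile.*',
                        category=DeprecationWarning)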
btw I'm using Python 2.5.4 Brennan From robert.kern at gmail.com Mon Jun 15 19:53:43 2009 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 15 Jun 2009 18:53:43 -0500 Subject: [Numpy-discussion] npfile deprecation warning In-Reply-To: <4A36DDD8.2000103@visualreservoir.com> References: <4A36CAC1.8040604@visualreservoir.com> <3d375d730906151533t57b04203sf7edc68852611579@mail.gmail.com> <4A36DDD8.2000103@visualreservoir.com> Message-ID: <3d375d730906151653r848ce60q8fcf054d2e109f01@mail.gmail.com> On Mon, Jun 15, 2009 at 18:48, Brennan Williams wrote: > Robert Kern wrote: >> On Mon, Jun 15, 2009 at 17:27, Brennan >> Williams wrote: >> >>> Hi >>> >>> I'm using npfile which is giving me a deprecation warning. For the time >>> being I want to continue using it but I would like to suppress >>> the warning messages. Is it possible to trap the deprecation warning but >>> still have the npfile go ahead? >>> >> >> http://docs.python.org/library/warnings >> >> > Thanks. > OK I've put the following in my code... > > import warnings > > def fxn(): > ? ?warnings.warn("deprecated", DeprecationWarning) > > with warnings.catch_warnings(): > ? ?warnings.simplefilter("ignore") > ? ?fxn() catch_warnings() was added in Python 2.6, as stated in the documentation. I recommend setting up the simplefilter in your main() function, and only for DeprecationWarnings. > but I'm getting an invalid syntax error... > > with warnings.catch_warnings(): > ? ? ? ? ? ? ? ? ? ? ? ^ > SyntaxError: invalid syntax > > I haven't used "with" before. Is this supposed to go in the function def > where I use npfile? I've put it near the top of my .py file after my > imports and before my class definitions. You would use the with statement only around code that calls the function. > btw I'm using Python 2.5.4 In Python 2.5, you need this at the top of your file (after docstrings but before any other code): from __future__ import with_statement -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From brennan.williams at visualreservoir.com Mon Jun 15 20:25:41 2009 From: brennan.williams at visualreservoir.com (Brennan Williams) Date: Tue, 16 Jun 2009 12:25:41 +1200 Subject: [Numpy-discussion] npfile deprecation warning In-Reply-To: <3d375d730906151653r848ce60q8fcf054d2e109f01@mail.gmail.com> References: <4A36CAC1.8040604@visualreservoir.com> <3d375d730906151533t57b04203sf7edc68852611579@mail.gmail.com> <4A36DDD8.2000103@visualreservoir.com> <3d375d730906151653r848ce60q8fcf054d2e109f01@mail.gmail.com> Message-ID: <4A36E685.9000908@visualreservoir.com> Robert Kern wrote: > On Mon, Jun 15, 2009 at 18:48, Brennan > Williams wrote: > >> Robert Kern wrote: >> >>> On Mon, Jun 15, 2009 at 17:27, Brennan >>> Williams wrote: >>> >>> >>>> Hi >>>> >>>> I'm using npfile which is giving me a deprecation warning. For the time >>>> being I want to continue using it but I would like to suppress >>>> the warning messages. Is it possible to trap the deprecation warning but >>>> still have the npfile go ahead? >>>> >>>> >>> http://docs.python.org/library/warnings >>> >>> >>> >> Thanks. >> OK I've put the following in my code... >> >> import warnings >> >> def fxn(): >> warnings.warn("deprecated", DeprecationWarning) >> >> with warnings.catch_warnings(): >> warnings.simplefilter("ignore") >> fxn() >> > > catch_warnings() was added in Python 2.6, as stated in the > documentation. My mistake. 
I saw the "new in 2.1" at the top of the page but didn't read all the way to the bottom where catch_warnings is documented (with "new in 2.6"). > I recommend setting up the simplefilter in your main() > function, and only for DeprecationWarnings. > > done and it works. Thanks. >> but I'm getting an invalid syntax error... >> >> with warnings.catch_warnings(): >> ^ >> SyntaxError: invalid syntax >> >> I haven't used "with" before. Is this supposed to go in the function def >> where I use npfile? I've put it near the top of my .py file after my >> imports and before my class definitions. >> > > You would use the with statement only around code that calls the function. > > >> btw I'm using Python 2.5.4 >> > > In Python 2.5, you need this at the top of your file (after docstrings > but before any other code): > > from __future__ import with_statement > > From pav at iki.fi Tue Jun 16 04:15:51 2009 From: pav at iki.fi (Pauli Virtanen) Date: Tue, 16 Jun 2009 08:15:51 +0000 (UTC) Subject: [Numpy-discussion] More on doc-ing new functions References: <494379.70739.qm@web52111.mail.re2.yahoo.com> Message-ID: Mon, 15 Jun 2009 09:52:05 -0700, David Goldsmith kirjoitti: [clip] > Is there a protocol for making sure these things get done? (Just don't > want to reinvent the wheel.) I don't think so. The current way is that people forget to do it, and then someone fixes it afterwards :) I'm not sure how much rubber hose we can use on developers. But please, * When you add new functions in Python APIs, please insert the function to the correct autosummary:: directive in the correct RST file under doc/source/reference/routines.* * * When you add new functions in the C APIs, please insert at least stub function documentation (cfunction:: directives) in the corresponding RST file doc/source/reference/c-api.* * Sphinx has some coverage assessing features which we probably should try out. There's also doc/summarize.py in Numpy's SVN that should report the status, but it's currently broken. -- Pauli Virtanen From william.ratcliff at gmail.com Tue Jun 16 09:21:26 2009 From: william.ratcliff at gmail.com (william ratcliff) Date: Tue, 16 Jun 2009 09:21:26 -0400 Subject: [Numpy-discussion] state of G3F2PY? Message-ID: <827183970906160621s5e5facb5hbc0cae9ec7630517@mail.gmail.com> Hi! I'm looking at trying to bind a rather large (>150K lines of code) crystallography library to python and would like to know what the state of F2py is. Are allocatable arrays supported? Derived types? Modules, Pointers, etc.? Is there a list somewhere? Has anyone else looked into wrapping such a large code base? If so, any pointers? Thanks, William -------------- next part -------------- An HTML attachment was scrubbed... URL: From darkgl0w at yahoo.com Tue Jun 16 10:42:35 2009 From: darkgl0w at yahoo.com (Cristi Constantin) Date: Tue, 16 Jun 2009 07:42:35 -0700 (PDT) Subject: [Numpy-discussion] Array resize question Message-ID: <578355.81118.qm@web52103.mail.re2.yahoo.com> Good day. I have this array: a = array([[u'0', u'0', u'0', u'0', u'0', u' '], ?????? [u'1', u'1', u'1', u'1', u'1', u' '], ?????? [u'2', u'2', u'2', u'2', u'2', u' '], ?????? [u'3', u'3', u'3', u'3', u'3', u' '], ?????? [u'4', u'4', u'4', u'4', u'4', u'']], ????? 
dtype=' From nmb at wartburg.edu Tue Jun 16 11:10:42 2009 From: nmb at wartburg.edu (Neil Martinsen-Burrell) Date: Tue, 16 Jun 2009 10:10:42 -0500 Subject: [Numpy-discussion] Array resize question In-Reply-To: <578355.81118.qm@web52103.mail.re2.yahoo.com> References: <578355.81118.qm@web52103.mail.re2.yahoo.com> Message-ID: <4A37B5F2.9010707@wartburg.edu> On 2009-06-16 09:42 , Cristi Constantin wrote: > Good day. > I have this array: > > a = array([[u'0', u'0', u'0', u'0', u'0', u' '], > [u'1', u'1', u'1', u'1', u'1', u' '], > [u'2', u'2', u'2', u'2', u'2', u' '], > [u'3', u'3', u'3', u'3', u'3', u' '], > [u'4', u'4', u'4', u'4', u'4', u'']], > dtype=' > I want to resize it, but i don't want to alter the order of elements. > > a.resize((5,10)) # Will result in > array([[u'0', u'0', u'0', u'0', u'0', u' ', u'1', u'1', u'1', u'1'], > [u'1', u' ', u'2', u'2', u'2', u'2', u'2', u' ', u'3', u'3'], > [u'3', u'3', u'3', u' ', u'4', u'4', u'4', u'4', u'4', u''], > [u'', u'', u'', u'', u'', u'', u'', u'', u'', u''], > [u'', u'', u'', u'', u'', u'', u'', u'', u'', u'']], > dtype=' > That means all my values are mutilated. What i want is the order to be > kept and only the last elements to become empty. Like this: > array([[u'0', u'0', u'0', u'0', u'0', u' ', u'', u'', u'', u''], > [u'1', u'1', u'1', u'1', u'1', u' ', u'', u'', u'', u''], > [u'2', u'2', u'2', u'2', u'2', u' ', u'', u'', u'', u''], > [u'3', u'3', u'3', u'3', u'3', u' ', u'', u'', u'', u''], > [u'4', u'4', u'4', u'4', u'4', u' ', u'', u'', u'', u'']], > dtype=' > I tried to play with resize like this: > a.resize((5,10), refcheck=True, order=False) > # SystemError: NULL result without error in PyObject_Call > > vCont1.resize((5,10),True,False) > # TypeError: an integer is required > > Can anyone tell me how this "resize" function works ? > I already checked the help file : > http://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.resize.html > Thank you in advance. Resize takes the existing elements and puts them in a new order. This is its purpose. What you want to do is to stack two arrays together to make a new array: In [4]: x = np.random.random((5,5)) In [6]: x.shape Out[6]: (5, 5) In [7]: y = np.zeros((5,5)) In [8]: y.shape Out[8]: (5, 5) In [10]: z = np.hstack((x,y)) In [11]: z.shape Out[11]: (5, 10) In [12]: z Out[12]: array([[ 0.72215359, 0.32388934, 0.24858866, 0.40907379, 0.26072476, 0. , 0. , 0. , 0. , 0. ], [ 0.59085241, 0.88075534, 0.2288914 , 0.49258006, 0.28175061, 0. , 0. , 0. , 0. , 0. ], [ 0.50355137, 0.30180634, 0.09177751, 0.08608373, 0.04114688, 0. , 0. , 0. , 0. , 0. ], [ 0.06053053, 0.80426792, 0.21038812, 0.28098004, 0.88956146, 0. , 0. , 0. , 0. , 0. ], [ 0.17359959, 0.4629072 , 0.30100704, 0.45434713, 0.86597028, 0. , 0. , 0. , 0. , 0. ]]) This should work the same with your unicode arrays in place of the floating point arrays here. (To make an array with all empty strings, you can do y = np.empty((5,5), dtype='U'); y[:] = '') -Neil From kwmsmith at gmail.com Tue Jun 16 12:13:59 2009 From: kwmsmith at gmail.com (Kurt Smith) Date: Tue, 16 Jun 2009 11:13:59 -0500 Subject: [Numpy-discussion] state of G3F2PY? In-Reply-To: <827183970906160621s5e5facb5hbc0cae9ec7630517@mail.gmail.com> References: <827183970906160621s5e5facb5hbc0cae9ec7630517@mail.gmail.com> Message-ID: On Tue, Jun 16, 2009 at 8:21 AM, william ratcliff wrote: > Hi!? I'm looking at trying to bind a rather large (>150K lines of code) > crystallography library to python and would like to know what the state of > F2py is.? 
Are allocatable arrays supported?? Derived types?? Modules, > Pointers, etc.?? Is there a list somewhere?? Has anyone else looked into > wrapping such a large code base?? If so, any pointers? I've never used the current distributed version of f2py (numpy.f2py I believe is the default one) to wrap a library as large as this, and I don't believe that f2py can handle assumed-shape arrays as arguments to routines -- I haven't checked its support for allocatable, though, but I don't think so. I'm certainly open to correction, though. In my experience, f2py excels at wrapping Fortran 77-style programs with older array passing conventions, where the shape information is passed in as arguments to a subroutine/function. It won't solve your problem currently, but you might be interested in my GSoC project which aims to do just what you want -- binding a fortran library to Cython/Python, with support for allocatable/assumed-shape/assumed-size array arguments, pointers, derived types, modules, etc. It is being heavily developed as we speak, but will (hopefully) be usable by sometime this fall. Your library seems pretty large, which would be an excellent test for the GSoC project. If you are willing, and when the project is ready to tackle such a library, we'd be glad to work with you to get your library wrapped. As mentioned, we won't be at this point until the fall, though. Pearu has been kind enough to let us use the 'fparser' module from the G3F2PY project as the fortran parser, so the work of f2py continues on in our GSoC work. There are a few software requrirements -- we've tried to keep dependencies to a minimum: 1) a fortran compiler that supports the intrinsic ISO_C_BINDING module -- gfortran 4.3.3 supports it, as do pretty much every current version of other compilers. 2) The GSoC project is distributed with Cython. 3) Python version >= 2.5. 4) Hopefully you're not on windows -- we certainly plan on supporting Windows at some point in the future, but we don't have access to it for testing right now. Hope this helps, Kurt Smith From william.ratcliff at gmail.com Tue Jun 16 12:19:31 2009 From: william.ratcliff at gmail.com (william ratcliff) Date: Tue, 16 Jun 2009 12:19:31 -0400 Subject: [Numpy-discussion] state of G3F2PY? In-Reply-To: References: <827183970906160621s5e5facb5hbc0cae9ec7630517@mail.gmail.com> Message-ID: <827183970906160919ha402e79wa1dfbee97e4e5a72@mail.gmail.com> I would be interested in testing your GSOC project and will do what I can in the mean time. I do develop on windows, but the library lives on linux, macos, and windows, so we can test on anyg--it also binds with ifort, gfortran, etc. so seems rather robust. Cheers, William On Tue, Jun 16, 2009 at 12:13 PM, Kurt Smith wrote: > On Tue, Jun 16, 2009 at 8:21 AM, william > ratcliff wrote: > > Hi! I'm looking at trying to bind a rather large (>150K lines of code) > > crystallography library to python and would like to know what the state > of > > F2py is. Are allocatable arrays supported? Derived types? Modules, > > Pointers, etc.? Is there a list somewhere? Has anyone else looked into > > wrapping such a large code base? If so, any pointers? > > I've never used the current distributed version of f2py (numpy.f2py I > believe is the default one) to wrap a library as large as this, and I > don't believe that f2py can handle assumed-shape arrays as arguments > to routines -- I haven't checked its support for allocatable, though, > but I don't think so. I'm certainly open to correction, though. 
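(For contrast with the assumed-shape case, a tiny example of the F77-style
interface that the distributed f2py handles well -- the array length is an
explicit argument, which f2py infers from the passed array and makes
optional. This sketch assumes a Fortran compiler is available; the module
and routine names are made up:)

import numpy as np
import numpy.f2py as f2py

src = """
      subroutine dscale(x, n, a)
      integer n
      double precision x(n), a
cf2py intent(inout) x
      integer i
      do i = 1, n
         x(i) = a*x(i)
      end do
      end
"""
f2py.compile(src, modulename='demo', verbose=0)

import demo
x = np.arange(5.0)
demo.dscale(x, 2.0)   # n is inferred from x; x is scaled in place
print(x)              # [ 0.  2.  4.  6.  8.]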
In > my experience, f2py excels at wrapping Fortran 77-style programs with > older array passing conventions, where the shape information is passed > in as arguments to a subroutine/function. > > It won't solve your problem currently, but you might be interested in > my GSoC project which aims to do just what you want -- binding a > fortran library to Cython/Python, with support for > allocatable/assumed-shape/assumed-size array arguments, pointers, > derived types, modules, etc. It is being heavily developed as we > speak, but will (hopefully) be usable by sometime this fall. > > Your library seems pretty large, which would be an excellent test for > the GSoC project. If you are willing, and when the project is ready > to tackle such a library, we'd be glad to work with you to get your > library wrapped. As mentioned, we won't be at this point until the > fall, though. > > Pearu has been kind enough to let us use the 'fparser' module from the > G3F2PY project as the fortran parser, so the work of f2py continues on > in our GSoC work. > > There are a few software requrirements -- we've tried to keep > dependencies to a minimum: > > 1) a fortran compiler that supports the intrinsic ISO_C_BINDING module > -- gfortran 4.3.3 supports it, as do pretty much every current version > of other compilers. > 2) The GSoC project is distributed with Cython. > 3) Python version >= 2.5. > 4) Hopefully you're not on windows -- we certainly plan on supporting > Windows at some point in the future, but we don't have access to it > for testing right now. > > Hope this helps, > > Kurt Smith > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From brian.lewis17 at gmail.com Tue Jun 16 12:48:02 2009 From: brian.lewis17 at gmail.com (Brian Lewis) Date: Tue, 16 Jun 2009 09:48:02 -0700 Subject: [Numpy-discussion] ticket #1096 Message-ID: http://projects.scipy.org/numpy/ticket/1096 Is the fix to this to check if (line 95 of trunk/numpy/core/src/umathmodule.c.src ) const @type@ tmp = x - y; is -inf or not. And if it is, just to return -inf. If so, is there any chance someone can commit this quick fix....I want to use it :) -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Tue Jun 16 14:47:44 2009 From: pav at iki.fi (Pauli Virtanen) Date: Tue, 16 Jun 2009 18:47:44 +0000 (UTC) Subject: [Numpy-discussion] ticket #1096 References: Message-ID: On 2009-06-16, Brian Lewis wrote: > http://projects.scipy.org/numpy/ticket/1096 > > Is the fix to this to check if (line 95 of > trunk/numpy/core/src/umathmodule.c.src > ) > > const @type@ tmp = x - y; > > is -inf or not. And if it is, just to return -inf. That's not the correct fix. Anyway, fixed in r7059. -- Pauli Virtanen From kxroberto at googlemail.com Tue Jun 16 15:18:02 2009 From: kxroberto at googlemail.com (Robert) Date: Tue, 16 Jun 2009 21:18:02 +0200 Subject: [Numpy-discussion] Interleaved Arrays and In-Reply-To: References: Message-ID: Ian Mallett wrote: > > n = #blah > testlist = [] > for x in xrange(n): > for y in xrange(n): > testlist.append([x,y]) > testlist.append([x+1,y]) > > If "testlist" is an array (i.e., I could go: "array(testlist)"), it > works nicely. However, my Python method is certainly improveable with > numpy. 
I suspect the best way is interleaving the arrays [x,y->yn] and > [x+1,y->yn] n times, but I couldn't figure out how to do that... > e.g with column_stack >>> n = 10 >>> xx = np.ones(n) >>> yy = np.arange(n) >>> aa = np.column_stack((xx,yy)) >>> bb = np.column_stack((xx+1,yy)) >>> aa array([[ 1., 0.], [ 1., 1.], [ 1., 2.], [ 1., 3.], [ 1., 4.], [ 1., 5.], [ 1., 6.], [ 1., 7.], [ 1., 8.], [ 1., 9.]]) >>> bb array([[ 2., 0.], [ 2., 1.], [ 2., 2.], [ 2., 3.], [ 2., 4.], [ 2., 5.], [ 2., 6.], [ 2., 7.], [ 2., 8.], [ 2., 9.]]) >>> np.column_stack((aa,bb)) array([[ 1., 0., 2., 0.], [ 1., 1., 2., 1.], [ 1., 2., 2., 2.], [ 1., 3., 2., 3.], [ 1., 4., 2., 4.], [ 1., 5., 2., 5.], [ 1., 6., 2., 6.], [ 1., 7., 2., 7.], [ 1., 8., 2., 8.], [ 1., 9., 2., 9.]]) >>> cc = _ >>> cc.reshape((n*2,2)) array([[ 1., 0.], [ 2., 0.], [ 1., 1.], [ 2., 1.], [ 1., 2.], [ 2., 2.], [ 1., 3.], [ 2., 3.], [ 1., 4.], [ 2., 4.], [ 1., 5.], [ 2., 5.], [ 1., 6.], [ 2., 6.], [ 1., 7.], [ 2., 7.], [ 1., 8.], [ 2., 8.], [ 1., 9.], [ 2., 9.]]) >>> However I feel too, there is a intuitive abbrev function like 'interleave' or so missing in numpy shape_base or so. Robert From brian.lewis17 at gmail.com Tue Jun 16 15:27:44 2009 From: brian.lewis17 at gmail.com (Brian Lewis) Date: Tue, 16 Jun 2009 12:27:44 -0700 Subject: [Numpy-discussion] ticket #1096 In-Reply-To: References: Message-ID: On Tue, Jun 16, 2009 at 11:47 AM, Pauli Virtanen wrote: > Anyway, fixed in r7059. > Thanks! -------------- next part -------------- An HTML attachment was scrubbed... URL: From nmb at wartburg.edu Tue Jun 16 16:14:35 2009 From: nmb at wartburg.edu (Neil Martinsen-Burrell) Date: Tue, 16 Jun 2009 15:14:35 -0500 Subject: [Numpy-discussion] Interleaved Arrays and In-Reply-To: References: Message-ID: <4A37FD2B.9000009@wartburg.edu> On 06/16/2009 02:18 PM, Robert wrote: > >>> n = 10 > >>> xx = np.ones(n) > >>> yy = np.arange(n) > >>> aa = np.column_stack((xx,yy)) > >>> bb = np.column_stack((xx+1,yy)) > >>> aa > array([[ 1., 0.], > [ 1., 1.], > [ 1., 2.], > [ 1., 3.], > [ 1., 4.], > [ 1., 5.], > [ 1., 6.], > [ 1., 7.], > [ 1., 8.], > [ 1., 9.]]) > >>> bb > array([[ 2., 0.], > [ 2., 1.], > [ 2., 2.], > [ 2., 3.], > [ 2., 4.], > [ 2., 5.], > [ 2., 6.], > [ 2., 7.], > [ 2., 8.], > [ 2., 9.]]) > >>> np.column_stack((aa,bb)) > array([[ 1., 0., 2., 0.], > [ 1., 1., 2., 1.], > [ 1., 2., 2., 2.], > [ 1., 3., 2., 3.], > [ 1., 4., 2., 4.], > [ 1., 5., 2., 5.], > [ 1., 6., 2., 6.], > [ 1., 7., 2., 7.], > [ 1., 8., 2., 8.], > [ 1., 9., 2., 9.]]) > >>> cc = _ > >>> cc.reshape((n*2,2)) > array([[ 1., 0.], > [ 2., 0.], > [ 1., 1.], > [ 2., 1.], > [ 1., 2.], > [ 2., 2.], > [ 1., 3.], > [ 2., 3.], > [ 1., 4.], > [ 2., 4.], > [ 1., 5.], > [ 2., 5.], > [ 1., 6.], > [ 2., 6.], > [ 1., 7.], > [ 2., 7.], > [ 1., 8.], > [ 2., 8.], > [ 1., 9.], > [ 2., 9.]]) > >>> > > > However I feel too, there is a intuitive abbrev function like > 'interleave' or so missing in numpy shape_base or so. Using fancy indexing, you can set strided portions of an array equal to another array. 
So:: In [2]: aa = np.empty((10,2)) In [3]: aa[:, 0] = 1 In [4]: aa[:,1] = np.arange(10) In [5]: bb = np.empty((10,2)) In [6]: bb[:,0] = 2 In [7]: bb[:,1] = aa[:,1] # this works In [8]: cc = np.empty((20,2)) In [9]: cc[::2,:] = aa In [10]: cc[1::2,:] = bb In [11]: cc Out[11]: array([[ 1., 0.], [ 2., 0.], [ 1., 1.], [ 2., 1.], [ 1., 2.], [ 2., 2.], [ 1., 3.], [ 2., 3.], [ 1., 4.], [ 2., 4.], [ 1., 5.], [ 2., 5.], [ 1., 6.], [ 2., 6.], [ 1., 7.], [ 2., 7.], [ 1., 8.], [ 2., 8.], [ 1., 9.], [ 2., 9.]]) Using this syntax, interleave could be a one-liner. -Neil From kxroberto at googlemail.com Tue Jun 16 17:05:06 2009 From: kxroberto at googlemail.com (Robert) Date: Tue, 16 Jun 2009 23:05:06 +0200 Subject: [Numpy-discussion] Interleaved Arrays and In-Reply-To: <4A37FD2B.9000009@wartburg.edu> References: <4A37FD2B.9000009@wartburg.edu> Message-ID: Neil Martinsen-Burrell wrote: > On 06/16/2009 02:18 PM, Robert wrote: >> >>> n = 10 >> >>> xx = np.ones(n) >> >>> yy = np.arange(n) >> >>> aa = np.column_stack((xx,yy)) >> >>> bb = np.column_stack((xx+1,yy)) >> >>> aa >> array([[ 1., 0.], >> [ 1., 1.], >> [ 1., 2.], >> [ 1., 3.], >> [ 1., 4.], >> [ 1., 5.], >> [ 1., 6.], >> [ 1., 7.], >> [ 1., 8.], >> [ 1., 9.]]) >> >>> bb >> array([[ 2., 0.], >> [ 2., 1.], >> [ 2., 2.], >> [ 2., 3.], >> [ 2., 4.], >> [ 2., 5.], >> [ 2., 6.], >> [ 2., 7.], >> [ 2., 8.], >> [ 2., 9.]]) >> >>> np.column_stack((aa,bb)) >> array([[ 1., 0., 2., 0.], >> [ 1., 1., 2., 1.], >> [ 1., 2., 2., 2.], >> [ 1., 3., 2., 3.], >> [ 1., 4., 2., 4.], >> [ 1., 5., 2., 5.], >> [ 1., 6., 2., 6.], >> [ 1., 7., 2., 7.], >> [ 1., 8., 2., 8.], >> [ 1., 9., 2., 9.]]) >> >>> cc = _ >> >>> cc.reshape((n*2,2)) >> array([[ 1., 0.], >> [ 2., 0.], >> [ 1., 1.], >> [ 2., 1.], >> [ 1., 2.], >> [ 2., 2.], >> [ 1., 3.], >> [ 2., 3.], >> [ 1., 4.], >> [ 2., 4.], >> [ 1., 5.], >> [ 2., 5.], >> [ 1., 6.], >> [ 2., 6.], >> [ 1., 7.], >> [ 2., 7.], >> [ 1., 8.], >> [ 2., 8.], >> [ 1., 9.], >> [ 2., 9.]]) >> >>> >> >> >> However I feel too, there is a intuitive abbrev function like >> 'interleave' or so missing in numpy shape_base or so. > > Using fancy indexing, you can set strided portions of an array equal to > another array. So:: > > In [2]: aa = np.empty((10,2)) > > In [3]: aa[:, 0] = 1 > > In [4]: aa[:,1] = np.arange(10) > > In [5]: bb = np.empty((10,2)) > > In [6]: bb[:,0] = 2 > > In [7]: bb[:,1] = aa[:,1] # this works > > In [8]: cc = np.empty((20,2)) > > In [9]: cc[::2,:] = aa > > In [10]: cc[1::2,:] = bb > > In [11]: cc > Out[11]: > array([[ 1., 0.], > [ 2., 0.], > [ 1., 1.], > [ 2., 1.], > [ 1., 2.], > [ 2., 2.], > [ 1., 3.], > [ 2., 3.], > [ 1., 4.], > [ 2., 4.], > [ 1., 5.], > [ 2., 5.], > [ 1., 6.], > [ 2., 6.], > [ 1., 7.], > [ 2., 7.], > [ 1., 8.], > [ 2., 8.], > [ 1., 9.], > [ 2., 9.]]) > > Using this syntax, interleave could be a one-liner. > > -Neil that method of 'filling an empty with a pattern' was mentioned in the other (general) interleaving question. It requires however a lot of particular numbers and :'s in the code, and requires even more statements which can hardly be written in functional style - in one line?. The other approach is more jount, free of fancy indexing assignments. The general interleaving should work efficiently in one like this: np.column_stack/concatenate((r,g,b,....), axis=...).reshape(..) But as all this is not intuitive, something like this should be in numpy perhaps? 
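(A compact sketch of that idea for the axis=0 case -- join the inputs along
a new second axis, then reshape so that C ordering does the weaving; the
helper name here is made up:)

import numpy as np

def interleave0(arrays):
    # All inputs must have the same shape; rows alternate in the result.
    stacked = np.concatenate([a[:, np.newaxis] for a in arrays], axis=1)
    return stacked.reshape((-1,) + stacked.shape[2:])

aa = np.column_stack((np.ones(10), np.arange(10.0)))
bb = np.column_stack((np.ones(10) + 1, np.arange(10.0)))
cc = interleave0((aa, bb))   # rows: aa[0], bb[0], aa[1], bb[1], ...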
: def interleave( tup_arrays, axis = None ) Robert From nmb at wartburg.edu Tue Jun 16 22:22:40 2009 From: nmb at wartburg.edu (Neil Martinsen-Burrell) Date: Tue, 16 Jun 2009 21:22:40 -0500 Subject: [Numpy-discussion] Interleaved Arrays and In-Reply-To: References: <4A37FD2B.9000009@wartburg.edu> Message-ID: <4A385370.7000601@wartburg.edu> On 2009-06-16 16:05 , Robert wrote: > Neil Martinsen-Burrell wrote: >> On 06/16/2009 02:18 PM, Robert wrote: >>> >>> n = 10 >>> >>> xx = np.ones(n) >>> >>> yy = np.arange(n) >>> >>> aa = np.column_stack((xx,yy)) >>> >>> bb = np.column_stack((xx+1,yy)) >>> >>> aa >>> array([[ 1., 0.], >>> [ 1., 1.], >>> [ 1., 2.], >>> [ 1., 3.], >>> [ 1., 4.], >>> [ 1., 5.], >>> [ 1., 6.], >>> [ 1., 7.], >>> [ 1., 8.], >>> [ 1., 9.]]) >>> >>> bb >>> array([[ 2., 0.], >>> [ 2., 1.], >>> [ 2., 2.], >>> [ 2., 3.], >>> [ 2., 4.], >>> [ 2., 5.], >>> [ 2., 6.], >>> [ 2., 7.], >>> [ 2., 8.], >>> [ 2., 9.]]) >>> >>> np.column_stack((aa,bb)) >>> array([[ 1., 0., 2., 0.], >>> [ 1., 1., 2., 1.], >>> [ 1., 2., 2., 2.], >>> [ 1., 3., 2., 3.], >>> [ 1., 4., 2., 4.], >>> [ 1., 5., 2., 5.], >>> [ 1., 6., 2., 6.], >>> [ 1., 7., 2., 7.], >>> [ 1., 8., 2., 8.], >>> [ 1., 9., 2., 9.]]) >>> >>> cc = _ >>> >>> cc.reshape((n*2,2)) >>> array([[ 1., 0.], >>> [ 2., 0.], >>> [ 1., 1.], >>> [ 2., 1.], >>> [ 1., 2.], >>> [ 2., 2.], >>> [ 1., 3.], >>> [ 2., 3.], >>> [ 1., 4.], >>> [ 2., 4.], >>> [ 1., 5.], >>> [ 2., 5.], >>> [ 1., 6.], >>> [ 2., 6.], >>> [ 1., 7.], >>> [ 2., 7.], >>> [ 1., 8.], >>> [ 2., 8.], >>> [ 1., 9.], >>> [ 2., 9.]]) >>> >>> >>> >>> >>> However I feel too, there is a intuitive abbrev function like >>> 'interleave' or so missing in numpy shape_base or so. >> >> Using fancy indexing, you can set strided portions of an array equal to >> another array. So:: >> >> In [2]: aa = np.empty((10,2)) >> >> In [3]: aa[:, 0] = 1 >> >> In [4]: aa[:,1] = np.arange(10) >> >> In [5]: bb = np.empty((10,2)) >> >> In [6]: bb[:,0] = 2 >> >> In [7]: bb[:,1] = aa[:,1] # this works >> >> In [8]: cc = np.empty((20,2)) >> >> In [9]: cc[::2,:] = aa >> >> In [10]: cc[1::2,:] = bb >> >> In [11]: cc >> Out[11]: >> array([[ 1., 0.], >> [ 2., 0.], >> [ 1., 1.], >> [ 2., 1.], >> [ 1., 2.], >> [ 2., 2.], >> [ 1., 3.], >> [ 2., 3.], >> [ 1., 4.], >> [ 2., 4.], >> [ 1., 5.], >> [ 2., 5.], >> [ 1., 6.], >> [ 2., 6.], >> [ 1., 7.], >> [ 2., 7.], >> [ 1., 8.], >> [ 2., 8.], >> [ 1., 9.], >> [ 2., 9.]]) >> >> Using this syntax, interleave could be a one-liner. >> >> -Neil > > that method of 'filling an empty with a pattern' was mentioned in > the other (general) interleaving question. It requires however a > lot of particular numbers and :'s in the code, and requires even > more statements which can hardly be written in functional style - > in one line?. The other approach is more jount, free of fancy > indexing assignments. jount? I think that assigning to a strided index is very clear, but that is a difference of opinion. All of the calls to np.empty are the equivalent of the column_stack's in your example. I think that operations on segments of arrays are fundamental to an array-processing language such as NumPy. Using ";" you can put as many of those statements as you would like one line. :) > The general interleaving should work efficiently in one like this: > > np.column_stack/concatenate((r,g,b,....), axis=...).reshape(..) > > But as all this is not intuitive, something like this should be in > numpy perhaps? : > > def interleave( tup_arrays, axis = None ) Here is a minimally tested implementation. 
If anyone really wants this for numpy, I'll gladly add comments and
tests. I couldn't figure out how to automatically find the greatest
dtype, so I added an argument to specify, otherwise it uses the type of
the first array.

def interleave(arrays, axis=0, dtype=None):
    assert len(arrays) > 0
    first = arrays[0]
    assert all([arr.shape == first.shape for arr in arrays])
    new_shape = list(first.shape)
    new_shape[axis] *= len(arrays)
    if dtype is None:
        new_dtype = first.dtype
    else:
        new_dtype = dtype
    interleaved = np.empty(new_shape, new_dtype)
    axis_slice = [slice(None, None, None)]*axis + \
                 [slice(0,None,len(arrays))] + [Ellipsis]
    for i, arr in enumerate(arrays):
        axis_slice[axis] = slice(i, None, len(arrays))
        interleaved[tuple(axis_slice)] = arr
    return interleaved

From peridot.faceted at gmail.com  Tue Jun 16 22:43:53 2009
From: peridot.faceted at gmail.com (Anne Archibald)
Date: Tue, 16 Jun 2009 22:43:53 -0400
Subject: [Numpy-discussion] Interleaved Arrays and
In-Reply-To: <4A385370.7000601@wartburg.edu>
References: <4A37FD2B.9000009@wartburg.edu> <4A385370.7000601@wartburg.edu>
Message-ID: 

I'm not sure it's worth having a function to replace a one-liner
(column_stack followed by reshape). But if you're going to implement
this with slice assignment, you should take advantage of the
flexibility this method allows and offer the possibility of
interleaving "raggedly", that is, where the size of the arrays drops at
some point, so that you could interleave arrays of size 4, 4, and 3 to
get one array of size 11. This allows split and join operations, for
example for multiprocessing. On the other hand you should also include
a documentation warning that this can be slow when interleaving large
numbers of small arrays.

Anne

2009/6/16 Neil Martinsen-Burrell :
> [Neil's reply of 2009-06-16 21:22 quoted in full, including the
> earlier exchange and the draft interleave() implementation above.]
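Anne's ragged case can be written with the same strided assignment; a rough
sketch, where the staircase-length assumption (longest arrays first, lengths
differing by at most one) is an illustrative reading of her 4, 4, 3 example,
not something she specified:

    import numpy as np

    def interleave_ragged(arrays):
        # assumes 1-D inputs sorted by length, longest first, with the
        # lengths differing by at most one, so the result reads
        # a0 b0 c0 a1 b1 c1 ... and the shorter arrays simply run out last
        k = len(arrays)
        n = sum(len(a) for a in arrays)
        out = np.empty(n, dtype=arrays[0].dtype)
        for i, a in enumerate(arrays):
            out[i::k] = a    # this strided slice has exactly len(a) slots
        return out

    # interleave_ragged([np.array([0, 3, 6, 9]), np.array([1, 4, 7, 10]),
    #                    np.array([2, 5, 8])]) -> [0 1 2 ... 9 10]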
From wnbell at gmail.com  Wed Jun 17 02:02:38 2009
From: wnbell at gmail.com (Nathan Bell)
Date: Wed, 17 Jun 2009 02:02:38 -0400
Subject: [Numpy-discussion] Scipy 0.6.0 to 0.7.0, sparse matrix change
In-Reply-To: 
References: <4A2E1F62.4010208@ntc.zcu.cz> <4A361A7F.6050100@ntc.zcu.cz>
Message-ID: 

On Mon, Jun 15, 2009 at 6:47 AM, Fadhley Salim wrote:
>
> I'm trying to track down a numerical discrepancy in our project. We
> noticed that a certain set of results are different having upgraded
> from scipy 0.6.0 to 0.7.0.
>
> The following item from the Scipy change-log is our current number-one
> suspect. Could anybody who knows suggest what was actually involved in
> the change which I have highlighted with stars below?
>
> [...]
>
> The handling of diagonals in the ``spdiags`` function has been changed.
> It now agrees with the MATLAB(TM) function of the same name.
>
> *** Numerous efficiency improvements to format conversions and sparse
> matrix arithmetic have been made. Finally, this release contains
> numerous bugfixes. ***

Can you elaborate on why you think sparse matrices may be the culprit?
None of the changes between 0.6 and 0.7 should produce different
numerical results (beyond standard floating point margins).

-- 
Nathan Bell wnbell at gmail.com
http://www.wnbell.com/

From scott.sinclair.za at gmail.com  Wed Jun 17 03:03:59 2009
From: scott.sinclair.za at gmail.com (Scott Sinclair)
Date: Wed, 17 Jun 2009 09:03:59 +0200
Subject: [Numpy-discussion] Array resize question
In-Reply-To: <578355.81118.qm@web52103.mail.re2.yahoo.com>
References: <578355.81118.qm@web52103.mail.re2.yahoo.com>
Message-ID: <6a17e9ee0906170003n336f4877p4b963b75d2e4983f@mail.gmail.com>

> 2009/6/16 Cristi Constantin
>
> Good day.
> I have this array:
>
> a = array([[u'0', u'0', u'0', u'0', u'0', u' '],
>            [u'1', u'1', u'1', u'1', u'1', u' '],
>            [u'2', u'2', u'2', u'2', u'2', u' '],
>            [u'3', u'3', u'3', u'3', u'3', u' '],
>            [u'4', u'4', u'4', u'4', u'4', u'']],
>           dtype='<U1')
>
> I want to resize it, but i don't want to alter the order of elements.
>
> a.resize((5,10)) # Will result in
> array([[u'0', u'0', u'0', u'0', u'0', u' ', u'1', u'1', u'1', u'1'],
>        [u'1', u' ', u'2', u'2', u'2', u'2', u'2', u' ', u'3', u'3'],
>        [u'3', u'3', u'3', u' ', u'4', u'4', u'4', u'4', u'4', u''],
>        [u'', u'', u'', u'', u'', u'', u'', u'', u'', u''],
>        [u'', u'', u'', u'', u'', u'', u'', u'', u'', u'']],
>       dtype='<U1')
>
> That means all my values are mutilated. What i want is the order to be
> kept and only the last elements to become empty. Like this:
> array([[u'0', u'0', u'0', u'0', u'0', u' ', u'', u'', u'', u''],
>        [u'1', u'1', u'1', u'1', u'1', u' ', u'', u'', u'', u''],
>        [u'2', u'2', u'2', u'2', u'2', u' ', u'', u'', u'', u''],
>        [u'3', u'3', u'3', u'3', u'3', u' ', u'', u'', u'', u''],
>        [u'4', u'4', u'4', u'4', u'4', u' ', u'', u'', u'', u'']],
>       dtype='<U1')
>
> I tried to play with resize like this:
> a.resize((5,10), refcheck=True, order=False)
> # SystemError: NULL result without error in PyObject_Call
>
> vCont1.resize((5,10),True,False)
> # TypeError: an integer is required
>
> Can anyone tell me how this "resize" function works ?
> I already checked the help file :
> http://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.resize.html

The resize method of ndarray is currently broken when the 'order'
keyword is specified, which is why you get the SystemError

http://projects.scipy.org/numpy/ticket/840

It's also worth knowing that the resize function and the ndarray resize
method both behave a little differently. Compare:

http://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.resize.html

and

http://docs.scipy.org/doc/numpy/reference/generated/numpy.resize.html

Basically, the data in your original array is inserted into the new
array in the one-dimensional order that it's stored in memory, and any
remaining space is filled with repeats of the data (resize function) or
packed with zeros (resize array method).

# resize function
>>> import numpy as np
>>> a = np.array([[1, 2],[3, 4]])
>>> print a
[[1 2]
 [3 4]]
>>> print np.resize(a, (3,3))
[[1 2 3]
 [4 1 2]
 [3 4 1]]

# resize array method
>>> b = np.array([[1, 2],[3, 4]])
>>> print b
[[1 2]
 [3 4]]
>>> b.resize((3,3))
>>> print b
[[1 2 3]
 [4 0 0]
 [0 0 0]]

Neil's response gives you what you want in this case.

Cheers,
Scott

From darkgl0w at yahoo.com  Wed Jun 17 03:07:41 2009
From: darkgl0w at yahoo.com (Cristi Constantin)
Date: Wed, 17 Jun 2009 00:07:41 -0700 (PDT)
Subject: [Numpy-discussion] Array resize question closed
Message-ID: <392437.72467.qm@web52110.mail.re2.yahoo.com>

Thank you Neil Martinsen-Burrell and Scott Sinclair for your answers. :)
It's exactly what I wanted to know. I will use hstack and vstack to
"resize" my array. Have a nice day.

--- On Wed, 6/17/09, Scott Sinclair wrote:
> [Scott's explanation and examples quoted back in full.]
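The order-preserving "resize" Cristi settles on can indeed be built from
hstack and vstack; a minimal sketch, where the target shape (3, 3) and the
zero fill are illustrative assumptions, not from the thread:

    import numpy as np

    a = np.array([[1, 2],
                  [3, 4]])

    # pad columns first, then rows, so every value keeps its (row, col)
    rows, cols = a.shape
    new_rows, new_cols = 3, 3
    b = np.hstack([a, np.zeros((rows, new_cols - cols), dtype=a.dtype)])
    b = np.vstack([b, np.zeros((new_rows - rows, new_cols), dtype=b.dtype)])
    # b -> [[1 2 0], [3 4 0], [0 0 0]]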
From neilcrighton at gmail.com  Wed Jun 17 05:34:45 2009
From: neilcrighton at gmail.com (Neil Crighton)
Date: Wed, 17 Jun 2009 09:34:45 +0000 (UTC)
Subject: [Numpy-discussion] all and alltrue
References: <2c1c314c0906110419p4d900bafne8841a00af1823e1@mail.gmail.com>
Message-ID: 

Shivaraj M S gmail.com> writes:
> Hello, I just came across 'all' and 'alltrue' functions in
> fromnumeric.py. They are one and the same. IMHO, alltrue = all
> would be sufficient.
> Regards,
> Shivaraj

There are other duplications too:

np.all      np.alltrue
np.any      np.sometrue
np.deg2rad  np.radians
np.rad2deg  np.degrees

And maybe more I've missed. Can we deprecate alltrue and sometrue, and
either deg2rad/rad2deg, or radians/degrees? They would be deprecated in
1.4 and presumably removed in 1.5.

Neil

From neilcrighton at gmail.com  Wed Jun 17 06:11:47 2009
From: neilcrighton at gmail.com (Neil Crighton)
Date: Wed, 17 Jun 2009 11:11:47 +0100
Subject: [Numpy-discussion] improving arraysetops
Message-ID: <63751c30906170311obc07693u276ba597be31892e@mail.gmail.com>

> > What about merging unique and unique1d? They're essentially
> > identical for an array input, but unique uses the builtin set() for
> > non-array inputs and so is around 2x faster in this case - see
> > below. Is it worth accepting a speed regression for unique to get
> > rid of the function duplication? (Or can they be combined?)
>
> unique1d can return the indices - can this be achieved by using
> set(), too?

No, set() can't return the indices as far as I know.

> The implementation for arrays is the same already, IMHO, so I would
> prefer adding return_index, return_inverse to unique (automatically
> converting input to array, if necessary), and deprecate unique1d.
>
> We can view it also as adding the set() approach to unique1d, when the
> return_index, return_inverse arguments are not set, and renaming
> unique1d -> unique.

This sounds good. If you don't have time to do it, I don't mind having
a go at writing a patch to implement these changes (deprecate the
existing unique1d, rename unique1d to unique and add the set approach
from the old unique, and the other changes mentioned in
http://projects.scipy.org/numpy/ticket/1133).

> I have found a strange bug in unique():
>
> In [24]: l = list(np.random.randint(100, size=1000))
>
> In [25]: %timeit np.unique(l)
> ---------------------------------------------------------------------
> UnicodeEncodeError                Traceback (most recent call last)
>
> /usr/lib64/python2.5/site-packages/IPython/iplib.py in ipmagic(self, arg_s)
>     951         else:
>     952             magic_args = self.var_expand(magic_args,1)
> --> 953         return fn(magic_args)
>     954
>     955     def ipalias(self,arg_s):
>
> /usr/lib64/python2.5/site-packages/IPython/Magic.py in
> magic_timeit(self, parameter_s)
>    1829                             precision,
>    1830                             best * scaling[order],
> -> 1831                            units[order])
>    1832         if tc > tc_min:
>    1833             print "Compiler time: %.2f s" % tc
>
> UnicodeEncodeError: 'ascii' codec can't encode character u'\xb5' in
> position 28: ordinal not in range(128)
>
> It disappears after increasing the array size, or the integer size.
> In [39]: np.__version__
> Out[39]: '1.4.0.dev7047'
>
> r.

Weird! From the error message, it looks like a problem with ipython's
timeit function rather than unique. I can't reproduce it on my machine
(numpy 1.4.0.dev, r7059; IPython 0.10.bzr.r1163).

Neil
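As context for the return_index discussion: a minimal sketch of what the
sort-based arraysetops path provides over set(); the sample data and
variable names are illustrative, not from the thread:

    import numpy as np

    x = np.array([3, 1, 2, 1, 3, 3])

    # set() only yields the distinct values:
    sorted(set(x.tolist()))              # [1, 2, 3]

    # a stable sort can also report where each value first occurs:
    order = x.argsort(kind='mergesort')  # stable, so first occurrences win
    xs = x[order]
    flag = np.concatenate(([True], xs[1:] != xs[:-1]))
    values, first_index = xs[flag], order[flag]
    # values -> [1 2 3], first_index -> [1 2 0]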
From cimrman3 at ntc.zcu.cz  Wed Jun 17 09:06:39 2009
From: cimrman3 at ntc.zcu.cz (Robert Cimrman)
Date: Wed, 17 Jun 2009 15:06:39 +0200
Subject: [Numpy-discussion] improving arraysetops
In-Reply-To: <63751c30906170311obc07693u276ba597be31892e@mail.gmail.com>
References: <63751c30906170311obc07693u276ba597be31892e@mail.gmail.com>
Message-ID: <4A38EA5F.8080507@ntc.zcu.cz>

Hi Neil,

Neil Crighton wrote:
> [...]
> This sounds good. If you don't have time to do it, I don't mind having
> a go at writing a patch to implement these changes (deprecate the
> existing unique1d, rename unique1d to unique and add the set approach
> from the old unique, and the other changes mentioned in
> http://projects.scipy.org/numpy/ticket/1133).

That would be really great - I will not be online starting tomorrow
till the end of next week (more or less), so I can really look at the
issue after I return.

[...]

> Weird! From the error message, it looks like a problem with ipython's
> timeit function rather than unique. I can't reproduce it on my machine
> (numpy 1.4.0.dev, r7059; IPython 0.10.bzr.r1163).

True, I have ipython 0.9.1, that might cause the problem.

cheers,
r.

From nwagner at iam.uni-stuttgart.de  Wed Jun 17 14:18:13 2009
From: nwagner at iam.uni-stuttgart.de (Nils Wagner)
Date: Wed, 17 Jun 2009 20:18:13 +0200
Subject: [Numpy-discussion] numpy port to Jython
Message-ID: 

Hi all,

Is there a port of numpy/scipy to Jython ?

Any pointer would be appreciated.

Thanks in advance

Nils

From robert.kern at gmail.com  Wed Jun 17 14:22:01 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Wed, 17 Jun 2009 13:22:01 -0500
Subject: [Numpy-discussion] numpy port to Jython
In-Reply-To: 
References: 
Message-ID: <3d375d730906171122v7ada14c2v8e03fd88cbbdc8ab@mail.gmail.com>

On Wed, Jun 17, 2009 at 13:18, Nils Wagner wrote:
> Hi all,
>
> Is there a port of numpy/scipy to Jython ?

No. I believe there once was a start at porting an ancient Numeric to
Jython, but I don't think it got very far. Certainly nothing of scipy.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
-- Umberto Eco
From dwf at cs.toronto.edu  Wed Jun 17 15:00:23 2009
From: dwf at cs.toronto.edu (David Warde-Farley)
Date: Wed, 17 Jun 2009 15:00:23 -0400
Subject: [Numpy-discussion] numpy port to Jython
In-Reply-To: 
References: 
Message-ID: 

On 17-Jun-09, at 2:18 PM, Nils Wagner wrote:
> Is there a port of numpy/scipy to Jython ?
>
> Any pointer would be appreciated.

Folks have successfully gotten it working from IronPython (the .NET
CLR) via Ironclad ( http://code.google.com/p/ironclad/ )... not Jython
though.

David

From nwagner at iam.uni-stuttgart.de  Wed Jun 17 15:17:50 2009
From: nwagner at iam.uni-stuttgart.de (Nils Wagner)
Date: Wed, 17 Jun 2009 21:17:50 +0200
Subject: [Numpy-discussion] numpy port to Jython
In-Reply-To: 
References: 
Message-ID: 

On Wed, 17 Jun 2009 15:00:23 -0400 David Warde-Farley wrote:
> On 17-Jun-09, at 2:18 PM, Nils Wagner wrote:
>> Is there a port of numpy/scipy to Jython ?
>>
>> Any pointer would be appreciated.
>
> Folks have successfully gotten it working from IronPython (the .NET
> CLR) via Ironclad ( http://code.google.com/p/ironclad/ )... not
> Jython though.
>
> David

David,

Thank you for your reply.
Unfortunately, Ironclad currently only works on 32-bit Windows ...

Nils

From aisaac at american.edu  Wed Jun 17 17:47:32 2009
From: aisaac at american.edu (Alan G Isaac)
Date: Wed, 17 Jun 2009 17:47:32 -0400
Subject: [Numpy-discussion] OT NPER in Gnumeric (was: Definitions of pv, fv, nper, pmt, and rate)
In-Reply-To: <1cd32cbb0906082214k1d318fbfn77fb67b664477ae3@mail.gmail.com>
References: <595939.51926.qm@web52105.mail.re2.yahoo.com> <1cd32cbb0906082214k1d318fbfn77fb67b664477ae3@mail.gmail.com>
Message-ID: <4A396474.8040804@american.edu>

On 6/9/2009 1:14 AM josef.pktd at gmail.com quoted:
> Note: Gnumeric gives an error for negative rates. Excel and OOo2 do
> not. For NPER(-1%;-100;1000), OOo2 gives 9.48, Excel produces
> 9.483283066, Gnumeric gives a #DIV/0 error. This appears to be a bug
> in Gnumeric.

Recently fixed (in Gnumeric).

Alan Isaac

From fredmfp at gmail.com  Wed Jun 17 18:32:57 2009
From: fredmfp at gmail.com (fred)
Date: Thu, 18 Jun 2009 00:32:57 +0200
Subject: [Numpy-discussion] detecting point out of bounds...
Message-ID: <4A396F19.6050309@gmail.com>

Hi all,

Let's say I have an array (n,3), i.e. x, y, v, in each row.

How can I count the number of points (x,y) that are out of bounds
(xmin, xmax), (ymin, ymax)?

The following is obviously wrong:

        n = (data[:, 0]<xmin).nonzero()[0].size + \
            (data[:, 0]>xmax).nonzero()[0].size + \
            (data[:, 1]<ymin).nonzero()[0].size + \
            (data[:, 1]>ymax).nonzero()[0].size

and I don't want to use a loop to count the number of bad points, of
course.

Any clue?

TIA.

Cheers,

-- 
Fred

From robert.kern at gmail.com  Wed Jun 17 18:47:09 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Wed, 17 Jun 2009 17:47:09 -0500
Subject: [Numpy-discussion] detecting point out of bounds...
In-Reply-To: <4A396F19.6050309@gmail.com>
References: <4A396F19.6050309@gmail.com>
Message-ID: <3d375d730906171547u73e98eb8n1e773214f2588fcc@mail.gmail.com>

On Wed, Jun 17, 2009 at 17:32, fred wrote:
> Hi all,
>
> Let's say I have an array (n,3), i.e. x, y, v, in each row.
>
> How can I count the number of points (x,y) that are out of bounds
> (xmin, xmax), (ymin, ymax)?
>
> The following is obviously wrong:
>
>        n = (data[:, 0]<xmin).nonzero()[0].size + \
>            (data[:, 0]>xmax).nonzero()[0].size + \
>            (data[:, 1]<ymin).nonzero()[0].size + \
>            (data[:, 1]>ymax).nonzero()[0].size
>
> and I don't want to use a loop to count the number of bad points,
> of course.
((data[:,0]<xmin) | (data[:,0]>xmax) | (data[:,1]<ymin) | (data[:,1]>ymax)).sum()

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
-- Umberto Eco

From fredmfp at gmail.com  Wed Jun 17 19:04:39 2009
From: fredmfp at gmail.com (fred)
Date: Thu, 18 Jun 2009 01:04:39 +0200
Subject: [Numpy-discussion] detecting point out of bounds...
In-Reply-To: <3d375d730906171547u73e98eb8n1e773214f2588fcc@mail.gmail.com>
References: <4A396F19.6050309@gmail.com> <3d375d730906171547u73e98eb8n1e773214f2588fcc@mail.gmail.com>
Message-ID: <4A397687.4040803@gmail.com>

Robert Kern a écrit :
> ((data[:,0]<xmin) | (data[:,0]>xmax) | (data[:,1]<ymin) |
> (data[:,1]>ymax)).sum()

Nice, as usual ;-)

I did not know this writing.

Thanks a lot.

Cheers,

-- 
Fred

From cournape at gmail.com  Thu Jun 18 00:35:37 2009
From: cournape at gmail.com (David Cournapeau)
Date: Thu, 18 Jun 2009 13:35:37 +0900
Subject: [Numpy-discussion] Ready for review: PyArrayNeighIterObject, an iterator to iterate over a neighborhood in arbitrary arrays
In-Reply-To: 
References: <4A33ADC8.9010702@ar.media.kyoto-u.ac.jp> <5b8d13220906131135r6f285545rbfc6cc1699568924@mail.gmail.com> <5b8d13220906140059k2322a70cs8cf14b43fa64babb@mail.gmail.com> <5b8d13220906140107n6ad3bb26n2271c29e298d4c73@mail.gmail.com>
Message-ID: <5b8d13220906172135m1d38d3e5ja3e0dd19c8f57fb8@mail.gmail.com>

On Mon, Jun 15, 2009 at 1:45 AM, Charles R Harris wrote:
>
> 1) The documentation of PyObject_Init doesn't say whether it is NULL
> safe, so I think there needs to be a check here before the call:

I checked the code of PyObject_init: I think it is safe to call it
with NULL, since NULL is checked for. I was lazy to change this, as I
should then change other objects which do this as well in numpy for
consistency.

> 2) Do the bounds need to be ordered? If so, that should be mentioned
> and checked.

I added this to the documentation (in the ref guide of numpy)

> 3) In the documentation x is used but the function prototype uses iter.

Fixed.

> 4) I don't think the reference is borrowed since it is incremented if
> the ctor succeeds. I think the point here is that the user doesn't
> need to worry about it.

I have always been confused by the borrowed/stolen vocabulary, to be
honest. I mentioned that nothing is changed if the ctor fails, and
that the neighborhood holds a new reference.

> 5) There should be spaces around the "-" here:
> for (i = iter->nd-1; i >= 0; --i)
> Likewise, the convention in python seems to be a space between the
> "for" and "("

done.

> 6) If the functions use neighborhood (I do think that looks better),
> then the file names should also.

In numpy, that's integrated in iterators.c + one header. I will commit
the code to numpy, once I have checked it works as expected in scipy.

That raises the question: can we make scipy 0.8.0 depend on numpy
1.4.0, or should I maintain a copy of the iterator in scipy 0.8.x so
that it can be compiled with numpy 1.3.0 ?

David
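For readers following along, here is a rough pure-NumPy sketch of the kind
of zero-padded neighborhood access being reviewed above; the function name,
signature and padding choice are illustrative assumptions, not the proposed
C API:

    import numpy as np

    def neighborhood(a, center, bounds, fill=0):
        # bounds[k] = (lo, hi) gives the offsets along axis k, lo <= 0 <= hi;
        # positions falling outside `a` are filled with `fill` (zero padding)
        out = np.empty([hi - lo + 1 for lo, hi in bounds], dtype=a.dtype)
        out.fill(fill)
        src, dst = [], []
        for k, (lo, hi) in enumerate(bounds):
            start, stop = center[k] + lo, center[k] + hi + 1
            cstart, cstop = max(start, 0), min(stop, a.shape[k])
            src.append(slice(cstart, cstop))
            dst.append(slice(cstart - start, cstop - start))
        out[tuple(dst)] = a[tuple(src)]
        return out

    # neighborhood(np.arange(25).reshape(5, 5), (0, 0), [(-1, 1), (-1, 1)])
    # -> the 3x3 patch around element (0, 0), with out-of-bounds cells zeroed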
From charlesr.harris at gmail.com  Thu Jun 18 01:05:50 2009
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Wed, 17 Jun 2009 23:05:50 -0600
Subject: [Numpy-discussion] Ready for review: PyArrayNeighIterObject, an iterator to iterate over a neighborhood in arbitrary arrays
In-Reply-To: <5b8d13220906172135m1d38d3e5ja3e0dd19c8f57fb8@mail.gmail.com>
References: <4A33ADC8.9010702@ar.media.kyoto-u.ac.jp> <5b8d13220906131135r6f285545rbfc6cc1699568924@mail.gmail.com> <5b8d13220906140059k2322a70cs8cf14b43fa64babb@mail.gmail.com> <5b8d13220906140107n6ad3bb26n2271c29e298d4c73@mail.gmail.com> <5b8d13220906172135m1d38d3e5ja3e0dd19c8f57fb8@mail.gmail.com>
Message-ID: 

Hi David,

On Wed, Jun 17, 2009 at 10:35 PM, David Cournapeau wrote:
> [David's point-by-point replies quoted in full.]
>
> In numpy, that's integrated in iterators.c + one header. I will commit
> the code to numpy, once I have checked it works as expected in scipy.

OK, I'm out of nits ;)

> That raises the question: can we make scipy 0.8.0 depend on numpy
> 1.4.0, or should I maintain a copy of the iterator in scipy 0.8.x so
> that it can be compiled with numpy 1.3.0 ?

I don't know. I think it is cleaner to depend on 1.4.0, but there
might be other considerations such as release schedules, both for
numpy/scipy and perhaps some of the major linux distros
(Fedora/Ubuntu). You are probably a better judge of that than I, but
we should discuss it along with setting some goals for the next
release.

Chuck

From darkgl0w at yahoo.com  Thu Jun 18 05:03:49 2009
From: darkgl0w at yahoo.com (Cristi Constantin)
Date: Thu, 18 Jun 2009 02:03:49 -0700 (PDT)
Subject: [Numpy-discussion] Advanced indexing advice?
Message-ID: <688181.89514.qm@web52103.mail.re2.yahoo.com>

Good day.
I have a question about advanced indexing.

I have 2 matrices :

>>> a=array([[ 0,  1,  2,  3,  4,  5],
             [ 6,  7,  8,  9, 10, 11],
             [12, 13, 14, 15, 16, 17],
             [18, 19, 20, 21, 22, 23]])

b=array([[1, 0, 1],
         [0, 2, 0],
         [0, 0, 3]])
>>>

I want to put all NON-zero elements from array B into array A, but use
offset! This is how I did it:

>>> mask = b!=0      # Save all non-zero values from B.
offset = (1,1)   # Save offset.
bshape = b.shape # Save shape of B.
# Action !
a[ offset[0]:bshape[0]+offset[0], offset[1]:bshape[1]+offset[1] ][ mask ] = b[ mask ]
>>>

After this, a becomes :

array([[ 0,  1,  2,  3,  4,  5],
       [ 6,  1,  8,  1, 10, 11],
       [12, 13,  2, 15, 16, 17],
       [18, 19, 20,  3, 22, 23]])

That's exactly what I want.

Now, my method works, but is very slow when used with big arrays. I use
double indexing, one for the offset and one for the mask... Can anyone
suggest another method, maybe with advanced indexing, to do both the
offset and the mask? I am quite a newbie with NumPy.
Or maybe just don't use the mask at all? All non-zero values from B
must be inserted into A, using offset...

I tried :

a[ offset[0]:bshape[0]+offset[0], offset[1]:bshape[1]+offset[1], mask ] = b[ mask ]

and

a[ mask, offset[0]:bshape[0]+offset[0], offset[1]:bshape[1]+offset[1] ] = b[ mask ]

but both methods transform a into :

array([[ 1,  1,  1,  3,  4,  5],
       [ 6,  2,  8,  9, 10, 11],
       [12, 13,  3, 15, 16, 17],
       [18, 19, 20, 21, 22, 23]])

and that's completely wrong.

Any advice is good. Thank you in advance.

From stefan at sun.ac.za  Thu Jun 18 05:16:15 2009
From: stefan at sun.ac.za (Stéfan van der Walt)
Date: Thu, 18 Jun 2009 11:16:15 +0200
Subject: [Numpy-discussion] Advanced indexing advice?
In-Reply-To: <688181.89514.qm@web52103.mail.re2.yahoo.com>
References: <688181.89514.qm@web52103.mail.re2.yahoo.com>
Message-ID: <9457e7c80906180216l27cedd50x20860ffa81e5e4c2@mail.gmail.com>

Hi Cristi

2009/6/18 Cristi Constantin :
> [Cristi's question quoted in full.]

Here's a solution using views:

offset = np.array([1,1])
slices = [slice(*x) for x in zip(offset, offset + b.shape)]
c = a[slices]
mask = (b != 0)
c[mask] = b[mask]

Regards
Stéfan

From geometrian at gmail.com  Thu Jun 18 06:01:15 2009
From: geometrian at gmail.com (Ian Mallett)
Date: Thu, 18 Jun 2009 03:01:15 -0700
Subject: [Numpy-discussion] Interleaved Arrays and
In-Reply-To: 
References: <4A37FD2B.9000009@wartburg.edu> <4A385370.7000601@wartburg.edu>
Message-ID: 

Most excellent solutions, thanks!

From darkgl0w at yahoo.com  Thu Jun 18 06:03:23 2009
From: darkgl0w at yahoo.com (Cristi Constantin)
Date: Thu, 18 Jun 2009 03:03:23 -0700 (PDT)
Subject: [Numpy-discussion] Advanced indexing advice?
Message-ID: <597463.14459.qm@web52108.mail.re2.yahoo.com>

Thank you so much for your prompt answer, Stéfan.
It's a very very interesting method. I will keep it for the future. :)

But, I tested it with a few examples and the speed of execution is just
a tiny bit slower than what I told you I was using. So it's not faster,
it's about the same speed.

Thank you again. I will play with your method a little more.

--- On Thu, 6/18/09, Stéfan van der Walt wrote:
> [Stéfan's reply quoted in full.]

From rowen at u.washington.edu  Thu Jun 18 12:35:32 2009
From: rowen at u.washington.edu (Russell E. Owen)
Date: Thu, 18 Jun 2009 09:35:32 -0700
Subject: [Numpy-discussion] Please add Mac binary for numpy 1.3.0 and Python 2.6
Message-ID: 

If you don't want to build one then you are welcome to serve one I
built. Several people have tried it and reported that it works.
Contact me for a URL.

-- Russell

From stefan at sun.ac.za  Thu Jun 18 17:19:33 2009
From: stefan at sun.ac.za (Stéfan van der Walt)
Date: Thu, 18 Jun 2009 23:19:33 +0200
Subject: [Numpy-discussion] Windows build-slave needed: please help out!
Message-ID: <9457e7c80906181419xc19c750m8de1929424a00330@mail.gmail.com>

Hi everyone,

Thomas Heller was kind enough to host our Windows build-slave thus
far, but he can no longer do so. We need a new home for the Windows
build slave, so if you have a Windows machine that is permanently
on-line and that does not contain mission-critical data, please give
me a shout.

Thanks!
Stéfan

From d_l_goldsmith at yahoo.com  Thu Jun 18 17:58:23 2009
From: d_l_goldsmith at yahoo.com (David Goldsmith)
Date: Thu, 18 Jun 2009 14:58:23 -0700 (PDT)
Subject: [Numpy-discussion] Windows build-slave needed: please help out!
Message-ID: <732957.86374.qm@web52104.mail.re2.yahoo.com>

Hi, Stefan. I was just thinking about the computer I'd designate for
this, and remembered that I still have it running XP Pro and don't
want to "upgrade" it to Vista, so there's that to consider.

DG

--- On Thu, 6/18/09, Stéfan van der Walt wrote:
> [Stéfan's request quoted in full.]

From charlesr.harris at gmail.com  Thu Jun 18 18:17:22 2009
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Thu, 18 Jun 2009 16:17:22 -0600
Subject: [Numpy-discussion] Windows build-slave needed: please help out!
In-Reply-To: <732957.86374.qm@web52104.mail.re2.yahoo.com>
References: <732957.86374.qm@web52104.mail.re2.yahoo.com>
Message-ID: 

On Thu, Jun 18, 2009 at 3:58 PM, David Goldsmith wrote:
>
> Hi, Stefan. I was just thinking about the computer I'd designate for
> this, and remembered that I still have it running XP Pro and don't
> want to "upgrade" it to Vista, so there's that to consider.

Lots of folks still use XP, so I don't see that as a problem. But it
would be nice to have a Vista/Windows7 machine also.
Chuck

From d_l_goldsmith at yahoo.com  Thu Jun 18 19:05:59 2009
From: d_l_goldsmith at yahoo.com (David Goldsmith)
Date: Thu, 18 Jun 2009 16:05:59 -0700 (PDT)
Subject: [Numpy-discussion] Windows build-slave needed: please help out!
Message-ID: <253561.45379.qm@web52112.mail.re2.yahoo.com>

OK, thanks. I assume, if I'm selected, you'll be making this as
"plug-and-play" as possible?

DG

--- On Thu, 6/18/09, Charles R Harris wrote:
> [Charles's reply quoted in full.]

From kxroberto at googlemail.com  Fri Jun 19 09:49:57 2009
From: kxroberto at googlemail.com (Robert)
Date: Fri, 19 Jun 2009 15:49:57 +0200
Subject: [Numpy-discussion] Advanced indexing advice?
In-Reply-To: <597463.14459.qm@web52108.mail.re2.yahoo.com>
References: <597463.14459.qm@web52108.mail.re2.yahoo.com>
Message-ID: 

Well, it's not really slow. Yet with np.where it seems to be 2x faster
for large arrays:

a[1:4,1:4] = np.where(mask,b,a[1:4,1:4])

Otherwise consider Cython:
http://docs.cython.org/docs/numpy_tutorial.html#tuning-indexing-further

Robert

Cristi Constantin wrote:
> [Cristi's thank-you and Stéfan's views solution quoted in full.]
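To tie the thread's pieces together, a small self-contained sketch combining
the offset view with np.where; the helper name paste_nonzero is invented
here for illustration and is not from the thread:

    import numpy as np

    def paste_nonzero(a, b, offset):
        # view of `a` covering the region that b should land on
        region = tuple(slice(o, o + n) for o, n in zip(offset, b.shape))
        a[region] = np.where(b != 0, b, a[region])

    a = np.arange(24).reshape(4, 6)
    b = np.array([[1, 0, 1],
                  [0, 2, 0],
                  [0, 0, 3]])
    paste_nonzero(a, b, (1, 1))   # writes only the non-zero entries of b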
From kxroberto at googlemail.com  Fri Jun 19 09:56:11 2009
From: kxroberto at googlemail.com (Robert)
Date: Fri, 19 Jun 2009 15:56:11 +0200
Subject: [Numpy-discussion] Advanced indexing advice?
In-Reply-To: 
References: <597463.14459.qm@web52108.mail.re2.yahoo.com>
Message-ID: 

Or, if your mask is thin and constant, then consider np.put.

From gael.varoquaux at normalesup.org  Fri Jun 19 10:01:58 2009
From: gael.varoquaux at normalesup.org (Gael Varoquaux)
Date: Fri, 19 Jun 2009 16:01:58 +0200
Subject: [Numpy-discussion] [ANN] SciPy 2009 conference opened up for registration
Message-ID: <20090619140158.GA12012@phare.normalesup.org>

We are finally opening the registration for the SciPy 2009 conference.
It took us time, but the reason is that we made careful budget
estimations to bring the registration cost down. We are very happy to
announce that this year registration to the conference will be only
$150, sprints $100, and students get half price! We made this effort
because we hope it will open up the conference to more people,
especially students that often have to finance this trip with little
budget. As a consequence, however, catering at noon is not included.

This does not mean that we are getting a reduced conference. Quite on
the contrary, this year we have two keynote speakers. And what
speakers: Peter Norvig and Jon Guyer! Peter Norvig is the director of
research at Google and Jon Guyer is a research scientist at NIST, in
the Thermodynamics and Kinetics Group, where he leads FiPy, a finite
volume PDE project in Python.

The SciPy 2009 Conference
==========================

SciPy 2009, the 8th Python in Science conference
(http://conference.scipy.org), will be held from August 18-23, 2009 at
Caltech in Pasadena, CA, USA.

Each year SciPy attracts leading figures in research and scientific
software development with Python from a wide range of scientific and
engineering disciplines. The focus of the conference is both on
scientific libraries and tools developed with Python and on scientific
or engineering achievements using Python.

Call for Papers
================

We welcome contributions from the industry as well as the academic
world. Indeed, industrial research and development as well as academic
research face the challenge of mastering IT tools for exploration,
modeling and analysis. We look forward to hearing your recent
breakthroughs using Python!

Please read the full call for papers
(http://conference.scipy.org/call_for_papers).

Important Dates
================

* Friday, June 26: Abstracts Due
* Saturday, July 4: Announce accepted talks, post schedule
* Friday, July 10: Early Registration ends
* Tuesday-Wednesday, August 18-19: Tutorials
* Thursday-Friday, August 20-21: Conference
* Saturday-Sunday, August 22-23: Sprints
* Friday, September 4: Papers for proceedings due

The SciPy 2009 executive committee
-----------------------------------

* Jarrod Millman, UC Berkeley, USA (Conference Chair)
* Gaël Varoquaux, INRIA Saclay, France (Program Co-Chair)
* Stéfan van der Walt, University of Stellenbosch, South Africa
  (Program Co-Chair)
* Fernando Pérez, UC Berkeley, USA (Tutorial Chair)

From nwagner at iam.uni-stuttgart.de  Fri Jun 19 11:28:30 2009
From: nwagner at iam.uni-stuttgart.de (Nils Wagner)
Date: Fri, 19 Jun 2009 17:28:30 +0200
Subject: [Numpy-discussion] Test bug in reduceat with structured arrays copied for speed.
Message-ID: 

Hi all,

Is this a known failure ?
I am using 1.4.0.dev7069

======================================================================
FAIL: Test bug in reduceat with structured arrays copied for speed.
---------------------------------------------------------------------- Traceback (most recent call last): File "/data/home/nwagner/local/lib/python2.5/site-packages/nose-0.10.4-py2.5.egg/nose/case.py", line 182, in runTest self.test(*self.arg) File "/data/home/nwagner/local/lib/python2.5/site-packages/numpy/core/tests/test_umath.py", line 700, in test_reduceat assert_array_almost_equal(h1, h2) File "/data/home/nwagner/local/lib/python2.5/site-packages/numpy/testing/utils.py", line 537, in assert_array_almost_equal header='Arrays are not almost equal') File "/data/home/nwagner/local/lib/python2.5/site-packages/numpy/testing/utils.py", line 395, in assert_array_compare raise AssertionError(msg) AssertionError: Arrays are not almost equal (mismatch 100.0%) x: array([ 4.61621844e+24, 4.61621844e+24, 4.61621844e+24, 4.61621844e+24], dtype=float32) y: array([ 700., 800., 1000., 7500.], dtype=float32) ---------------------------------------------------------------------- Ran 2065 tests in 200.909s FAILED (KNOWNFAIL=1, failures=1) Nils From pav at iki.fi Fri Jun 19 12:00:43 2009 From: pav at iki.fi (Pauli Virtanen) Date: Fri, 19 Jun 2009 16:00:43 +0000 (UTC) Subject: [Numpy-discussion] Test bug in reduceat with structured arrays copied for speed. References: Message-ID: On 2009-06-19, Nils Wagner wrote: > Is this a known failure ? > I am using 1.4.0.dev7069 Check the tickets: http://projects.scipy.org/numpy/ticket/1108 Cause is not known yet, but that bug most likely has been around for a long time. -- Pauli Virtanen From gael.varoquaux at normalesup.org Fri Jun 19 12:15:16 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Fri, 19 Jun 2009 18:15:16 +0200 Subject: [Numpy-discussion] [Correction] Re: [ANN] SciPy 2009 conference opened up for registration In-Reply-To: <20090619140158.GA12012@phare.normalesup.org> References: <20090619140158.GA12012@phare.normalesup.org> Message-ID: <20090619161516.GB16549@phare.normalesup.org> Please excuse me for incorrect information in my announcement: On Fri, Jun 19, 2009 at 04:01:58PM +0200, Gael Varoquaux wrote: > We are very happy to announce that this year registration to the > conference will be only $150, sprints $100, and students get half price! This should read that the tutorials are $100, not the sprints. The sprints are actually free, off course. We will be very please to see as many people as possible willing to participate at the sprint in making the SciPy ecosystem thrive. Thanks to Travis Oliphant for pointing out the typo. Ga?l Varoquaux From darkgl0w at yahoo.com Fri Jun 19 16:55:29 2009 From: darkgl0w at yahoo.com (Cristi Constantin) Date: Fri, 19 Jun 2009 13:55:29 -0700 (PDT) Subject: [Numpy-discussion] Advanced indexing advice Message-ID: <274044.18856.qm@web52104.mail.re2.yahoo.com> Thank you very much Robert ! For really big matrices, "where" is 4 times faster ! For small matrices, it's slower than my old solution, but it really doesn't matter, it's fast enough. :) I will think about "put" too. I used it before, but i must adapt all functions to obtain the necessary indexes. Have a nice day ! From: Robert Subject: Re: [Numpy-discussion] Advanced indexing advice? To: numpy-discussion at scipy.org Date: Friday, June 19, 2009, 6:49 AM well its not really slow. yet with np.where it seems to be 2x faster for large arrays : a[1:4,1:4] = np.where(mask,b,a[1:4,1:4]) otherwise consider Cython: http://docs.cython.org/docs/numpy_tutorial.html#tuning-indexing-further Robert >? ???2009/6/18 Cristi Constantin ? ???>: >? ? ? 
> I have a question about advanced indexing. >? ? ? > >? ? ? > I have 2 matrices : >? ? ? > >? ? ? >>>> >? ? ? > a=array([[ 0,? 1,? 2,? 3,? 4,? 5], >? ? ? >? ? ? ? ? ???[ 6,? 7,? 8,? 9, 10, 11], >? ? ? >? ? ? ? ? [12, 13, 14, 15, 16, 17], >? ? ? >? ? ? ? ? [18, 19, 20, 21, 22, 23]]) >? ? ? > >? ? ? > b=array([[1, 0, 1], >? ? ? >? ? ? ? ? [0, 2, 0], >? ? ? >? ? ? ? ? [0, 0, 3]]) >? ? ? >>>> >? ? ? > >? ? ? > I want to put all NON-zero elements from array B into array A, >? ???but use offset! -------------- next part -------------- An HTML attachment was scrubbed... URL: From dsdale24 at gmail.com Sat Jun 20 09:47:52 2009 From: dsdale24 at gmail.com (Darren Dale) Date: Sat, 20 Jun 2009 09:47:52 -0400 Subject: [Numpy-discussion] np.test() fails on ubuntu karmic Message-ID: I know karmic is in early development, but I wanted to give the numpy devs a heads up that the test suite will not run on this platform. I've tested numpy-1.2.1 installed with the package manager, the 1.3.0 release, and the trunk. I'm not familiar enough with the internals of numpy's testing framework to to understand the traceback, could anyone comment? Thanks, Darren np.test() Running unit tests for numpy NumPy version 1.3.0 NumPy is installed in /usr/local/lib/python2.6/dist-packages/numpy Python version 2.6.2+ (release26-maint, Jun 19 2009, 15:16:33) [GCC 4.4.0] nose version 0.11.0 --------------------------------------------------------------------------- TypeError Traceback (most recent call last) /home/darren/ in () /usr/local/lib/python2.6/dist-packages/numpy/testing/nosetester.pyc in test(self, label, verbose, extra_argv, doctests, coverage) 249 doctests, coverage) 250 from noseclasses import NumpyTestProgram --> 251 t = NumpyTestProgram(argv=argv, exit=False, plugins=plugins) 252 return t.result 253 /usr/local/lib/python2.6/dist-packages/nose-0.11.0.dev_r0-py2.6.egg/nose/core.pyc in __init__(self, module, defaultTest, argv, testRunner, testLoader, env, config, suite, exit, plugins, addplugins) 111 unittest.TestProgram.__init__( 112 self, module=module, defaultTest=defaultTest, --> 113 argv=argv, testRunner=testRunner, testLoader=testLoader) 114 115 def makeConfig(self, env, plugins=None): /usr/lib/python2.6/unittest.pyc in __init__(self, module, defaultTest, argv, testRunner, testLoader) 817 self.progName = os.path.basename(argv[0]) 818 self.parseArgs(argv) --> 819 self.runTests() 820 821 def usageExit(self, msg=None): /usr/local/lib/python2.6/dist-packages/numpy/testing/noseclasses.pyc in runTests(self) 298 self.testRunner = plug_runner 299 --> 300 self.result = self.testRunner.run(self.test) 301 self.success = self.result.wasSuccessful() 302 return self.success TypeError: unbound method run() must be called with TextTestRunner instance as first argument (got ContextSuite instance instead) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From cournape at gmail.com Sat Jun 20 10:15:42 2009 From: cournape at gmail.com (David Cournapeau) Date: Sat, 20 Jun 2009 23:15:42 +0900 Subject: [Numpy-discussion] Ready for review: PyArrayNeighIterObject, an iterator to iterate over a neighborhood in arbitrary arrays In-Reply-To: References: <4A33ADC8.9010702@ar.media.kyoto-u.ac.jp> <5b8d13220906131135r6f285545rbfc6cc1699568924@mail.gmail.com> <5b8d13220906140059k2322a70cs8cf14b43fa64babb@mail.gmail.com> <5b8d13220906140107n6ad3bb26n2271c29e298d4c73@mail.gmail.com> <5b8d13220906172135m1d38d3e5ja3e0dd19c8f57fb8@mail.gmail.com> Message-ID: <5b8d13220906200715i6565fcb3rf16a65951c135b30@mail.gmail.com> On Thu, Jun 18, 2009 at 2:05 PM, Charles R Harris wrote: > > I don't know. I think it is cleaner to depend on 1.4.0 Oh, definitely - I certainly prefer this solution as well. > but there > might be other considerations such as release schedules, both for > numpy/scipy and perhaps some of the major linux distros > (Fedora/Ubuntu). You are probably a better judge of that than I, but > we should discuss it along with setting some goals for the next > release. I don't think there are many new features for numpy 1.4.0. Most of the changes (in term of commits) are the ones I did to put everything in separate files for multiarray/ufunc, but that has 0 consequence for users (hopefully :) ). There is the usual set of bug-fixes. One thing I want to check is python 2.7 compatibility: there are some massive distutils changes, I have already detected and reported a couple of regressions, but I should test more thoroughly (especially on windows). I know that Travis and Robert pushed some stuff in a branch for date handling, I don't know if they intend to push this soon in the trunk or not ? cheers, David From stefan at sun.ac.za Sat Jun 20 10:15:25 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Sat, 20 Jun 2009 16:15:25 +0200 Subject: [Numpy-discussion] np.test() fails on ubuntu karmic In-Reply-To: References: Message-ID: <9457e7c80906200715i5fc52819rd4e622f738c1e77b@mail.gmail.com> Hi, I am running NumPy from SVN with nose 0.10.4 on Karmic and everything seems ok. Could be that one of the packages in Karmic itself is broken? Regards St?fan 2009/6/20 Darren Dale : > I know karmic is in early development, but I wanted to give the numpy devs a > heads up that the test suite will not run on this platform. I've tested > numpy-1.2.1 installed with the package manager, the 1.3.0 release, and the > trunk. I'm not familiar enough with the internals of numpy's testing > framework to to understand the traceback, could anyone comment? > > Thanks, > Darren From david at ar.media.kyoto-u.ac.jp Sat Jun 20 10:03:52 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Sat, 20 Jun 2009 23:03:52 +0900 Subject: [Numpy-discussion] np.test() fails on ubuntu karmic In-Reply-To: <9457e7c80906200715i5fc52819rd4e622f738c1e77b@mail.gmail.com> References: <9457e7c80906200715i5fc52819rd4e622f738c1e77b@mail.gmail.com> Message-ID: <4A3CEC48.8030608@ar.media.kyoto-u.ac.jp> St?fan van der Walt wrote: > Hi, > > I am running NumPy from SVN with nose 0.10.4 on Karmic and everything > seems ok. Could be that one of the packages in Karmic itself is > broken? 
> Or maybe something with nose 0.11 (I don't see any change related to new nose in numpy/testing since 1.3.0, though, and I think numpy svn works with nose 0.11) David From dsdale24 at gmail.com Sat Jun 20 10:33:32 2009 From: dsdale24 at gmail.com (Darren Dale) Date: Sat, 20 Jun 2009 10:33:32 -0400 Subject: [Numpy-discussion] np.test() fails on ubuntu karmic In-Reply-To: <4A3CEC48.8030608@ar.media.kyoto-u.ac.jp> References: <9457e7c80906200715i5fc52819rd4e622f738c1e77b@mail.gmail.com> <4A3CEC48.8030608@ar.media.kyoto-u.ac.jp> Message-ID: On Sat, Jun 20, 2009 at 10:03 AM, David Cournapeau < david at ar.media.kyoto-u.ac.jp> wrote: > St?fan van der Walt wrote: > > Hi, > > > > I am running NumPy from SVN with nose 0.10.4 on Karmic and everything > > seems ok. Could be that one of the packages in Karmic itself is > > broken? > > > > Or maybe something with nose 0.11 (I don't see any change related to new > nose in numpy/testing since 1.3.0, though, and I think numpy svn works > with nose 0.11) > I should have mentioned that I upgraded to nose-0.11 when I saw the failure with 0.10.4. Darren -------------- next part -------------- An HTML attachment was scrubbed... URL: From dsdale24 at gmail.com Sat Jun 20 10:41:12 2009 From: dsdale24 at gmail.com (Darren Dale) Date: Sat, 20 Jun 2009 10:41:12 -0400 Subject: [Numpy-discussion] np.test() fails on ubuntu karmic In-Reply-To: <9457e7c80906200715i5fc52819rd4e622f738c1e77b@mail.gmail.com> References: <9457e7c80906200715i5fc52819rd4e622f738c1e77b@mail.gmail.com> Message-ID: 2009/6/20 St?fan van der Walt > Hi, > > I am running NumPy from SVN with nose 0.10.4 on Karmic and everything > seems ok. Could be that one of the packages in Karmic itself is > broken? > That's interesting. Have you been keeping up to date with the package manager? Maybe I need to do a fresh install. Thankfully they recently released another alpha. -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan at sun.ac.za Sat Jun 20 12:10:41 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Sat, 20 Jun 2009 18:10:41 +0200 Subject: [Numpy-discussion] np.test() fails on ubuntu karmic In-Reply-To: References: <9457e7c80906200715i5fc52819rd4e622f738c1e77b@mail.gmail.com> Message-ID: <9457e7c80906200910m7590b786w521893bb2af58523@mail.gmail.com> 2009/6/20 Darren Dale : > 2009/6/20 St?fan van der Walt >> I am running NumPy from SVN with nose 0.10.4 on Karmic and everything >> seems ok. ?Could be that one of the packages in Karmic itself is >> broken? > > That's interesting. Have you been keeping up to date with the package > manager? Maybe I need to do a fresh install. Thankfully they recently > released another alpha. I just updated my machine, and everything broke in the way you described. This must be a regression in one of the python26-* packages? Regards St?fan From stefan at sun.ac.za Sat Jun 20 12:12:23 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Sat, 20 Jun 2009 18:12:23 +0200 Subject: [Numpy-discussion] np.test() fails on ubuntu karmic In-Reply-To: <9457e7c80906200910m7590b786w521893bb2af58523@mail.gmail.com> References: <9457e7c80906200715i5fc52819rd4e622f738c1e77b@mail.gmail.com> <9457e7c80906200910m7590b786w521893bb2af58523@mail.gmail.com> Message-ID: <9457e7c80906200912n33e52807xd78bb076a01f5b83@mail.gmail.com> 2009/6/20 St?fan van der Walt : > I just updated my machine, and everything broke in the way you > described. ?This must be a regression in one of the python26-* > packages? 
The problem can be reproduced from the command prompt:

$ nosetests
Traceback (most recent call last):
  File "/usr/bin/nosetests", line 8, in <module>
    load_entry_point('nose==0.10.4', 'console_scripts', 'nosetests')()
  File "/usr/lib/pymodules/python2.6/nose/core.py", line 219, in __init__
    argv=argv, testRunner=testRunner, testLoader=testLoader)
  File "/usr/lib/python2.6/unittest.py", line 819, in __init__
    self.runTests()
  File "/usr/lib/pymodules/python2.6/nose/core.py", line 298, in runTests
    result = self.testRunner.run(self.test)
TypeError: unbound method run() must be called with TextTestRunner instance as first argument (got ContextSuite instance instead)

So we can safely say it is not a NumPy issue.

Regards
Stéfan

From dsdale24 at gmail.com  Sat Jun 20 12:42:16 2009
From: dsdale24 at gmail.com (Darren Dale)
Date: Sat, 20 Jun 2009 12:42:16 -0400
Subject: [Numpy-discussion] np.test() fails on ubuntu karmic
In-Reply-To: <9457e7c80906200912n33e52807xd78bb076a01f5b83@mail.gmail.com>
References: <9457e7c80906200715i5fc52819rd4e622f738c1e77b@mail.gmail.com> <9457e7c80906200910m7590b786w521893bb2af58523@mail.gmail.com> <9457e7c80906200912n33e52807xd78bb076a01f5b83@mail.gmail.com>
Message-ID: 

2009/6/20 Stéfan van der Walt 

> 2009/6/20 Stéfan van der Walt :
> > I just updated my machine, and everything broke in the way you described. This must be a regression in one of the python26-* packages?
>
> The problem can be reproduced from the command prompt:
>
> $ nosetests
> Traceback (most recent call last):
>   File "/usr/bin/nosetests", line 8, in <module>
>     load_entry_point('nose==0.10.4', 'console_scripts', 'nosetests')()
>   File "/usr/lib/pymodules/python2.6/nose/core.py", line 219, in __init__
>     argv=argv, testRunner=testRunner, testLoader=testLoader)
>   File "/usr/lib/python2.6/unittest.py", line 819, in __init__
>     self.runTests()
>   File "/usr/lib/pymodules/python2.6/nose/core.py", line 298, in runTests
>     result = self.testRunner.run(self.test)
> TypeError: unbound method run() must be called with TextTestRunner instance as first argument (got ContextSuite instance instead)
>
> So we can safely say it is not a NumPy issue.

I can confirm, nosetests fails with a fresh install of Karmic alpha 2, but it does not have anything to do with numpy. Thanks for pointing that test out, I'll file a bug report at launchpad.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From erik.tollerud at gmail.com  Sat Jun 20 13:04:03 2009
From: erik.tollerud at gmail.com (Erik Tollerud)
Date: Sat, 20 Jun 2009 12:04:03 -0500
Subject: [Numpy-discussion] Structured array initialization weirdness
Message-ID: 

I've encountered an odd error I don't understand (see the case below): the first structured array ("A" in the example) initializes from a list of length-2 arrays with no problem, but if I give it a 2-by-10 array ("B"), it raises a TypeError... Why would it be any different to convert the first index of the array into a list?
>>> from numpy import *
>>> from numpy.random import *
>>> dt=dtype([('a','f'),('b','f')])
>>> A=array(list(randn(2,10)),dtype=dt)
>>> B=array(randn(2,10),dtype=dt)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: expected a readable buffer object

From stefan at sun.ac.za  Sat Jun 20 13:35:16 2009
From: stefan at sun.ac.za (Stéfan van der Walt)
Date: Sat, 20 Jun 2009 19:35:16 +0200
Subject: [Numpy-discussion] Structured array initialization weirdness
In-Reply-To: 
References: 
Message-ID: <9457e7c80906201035i4929fec1i20fb241d0b8dc052@mail.gmail.com>

Hi Erik

2009/6/20 Erik Tollerud :
> I've encountered an odd error I don't understand (see the case below): the first structured array ("A" in the example) initializes from a list of length-2 arrays with no problem, but if I give it a 2-by-10 array ("B"), it raises a TypeError... Why would it be any different to convert the first index of the array into a list?
>
> >>> from numpy import *
> >>> from numpy.random import *
> >>> dt=dtype([('a','f'),('b','f')])
> >>> A=array(list(randn(2,10)),dtype=dt)
> >>> B=array(randn(2,10),dtype=dt)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> TypeError: expected a readable buffer object

The ndarray constructor does its best to guess what kind of data you are feeding it, but sometimes it needs a bit of help. By casting your array to list, you give numpy a hint as to how your data is partitioned. In this case, however, I think that hint is the wrong one.

I prefer to construct arrays explicitly, so there is no doubt what is happening under the hood:

dt = np.dtype([('a', np.float64), ('b', np.float64)])
np.random.random([2,10]).astype(np.float64).view(dt)

Alternatively, construct the array first, and then fill the elements:

x = np.zeros(10, dtype=dt)
x['a'] = np.random.random(10)
x['b'] = np.random.random(10)

Regards
Stéfan

From charlesr.harris at gmail.com  Sat Jun 20 14:08:29 2009
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Sat, 20 Jun 2009 12:08:29 -0600
Subject: [Numpy-discussion] Ready for review: PyArrayNeighIterObject, an iterator to iterate over a neighborhood in arbitrary arrays
In-Reply-To: <5b8d13220906200715i6565fcb3rf16a65951c135b30@mail.gmail.com>
References: <4A33ADC8.9010702@ar.media.kyoto-u.ac.jp> <5b8d13220906131135r6f285545rbfc6cc1699568924@mail.gmail.com> <5b8d13220906140059k2322a70cs8cf14b43fa64babb@mail.gmail.com> <5b8d13220906140107n6ad3bb26n2271c29e298d4c73@mail.gmail.com> <5b8d13220906172135m1d38d3e5ja3e0dd19c8f57fb8@mail.gmail.com> <5b8d13220906200715i6565fcb3rf16a65951c135b30@mail.gmail.com>
Message-ID: 

On Sat, Jun 20, 2009 at 8:15 AM, David Cournapeau wrote:

> On Thu, Jun 18, 2009 at 2:05 PM, Charles R Harris wrote:
>>
>> I don't know. I think it is cleaner to depend on 1.4.0
>
> Oh, definitely - I certainly prefer this solution as well.
>
>> but there might be other considerations such as release schedules, both for numpy/scipy and perhaps some of the major linux distros (Fedora/Ubuntu). You are probably a better judge of that than I, but we should discuss it along with setting some goals for the next release.
>
> I don't think there are many new features for numpy 1.4.0. Most of the changes (in terms of commits) are the ones I did to put everything in separate files for multiarray/ufunc, but that has 0 consequence for users (hopefully :) ).
>
> There is the usual set of bug-fixes.
> One thing I want to check is python 2.7 compatibility: there are some massive distutils changes, I have already detected and reported a couple of regressions, but I should test more thoroughly (especially on windows).
>
> I know that Travis and Robert pushed some stuff in a branch for date handling, I don't know if they intend to push this soon in the trunk or not ?

Since we are extending the API and bumping up the API number, I wonder if we should change the python dependency to 2.6? If there are otherwise no major improvements in the pipe besides bug fixes now might be the time to start laying the foundations for Python 3.0 compatibility. We could make a bugfix release of 1.3.0 for those folks still using python 2.4-2.5, which is probably quite a few. If we want to go that way we should probably make an ongoing effort to backport fixes as we go because it's such a pain to sort them all out later.

Chuck

From faltet at pytables.org  Sat Jun 20 14:45:40 2009
From: faltet at pytables.org (Francesc Alted)
Date: Sat, 20 Jun 2009 20:45:40 +0200
Subject: [Numpy-discussion] Structured array initialization weirdness
In-Reply-To: 
References: 
Message-ID: <200906202045.40502.faltet@pytables.org>

On Saturday 20 June 2009 19:04:03, Erik Tollerud wrote:
> I've encountered an odd error I don't understand (see the case below): the first structured array ("A" in the example) initializes from a list of length-2 arrays with no problem, but if I give it a 2-by-10 array ("B"), it raises a TypeError... Why would it be any different to convert the first index of the array into a list?
>
> >>> from numpy import *
> >>> from numpy.random import *
> >>> dt=dtype([('a','f'),('b','f')])
> >>> A=array(list(randn(2,10)),dtype=dt)
> >>> B=array(randn(2,10),dtype=dt)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> TypeError: expected a readable buffer object

Besides Stefan's solution, you can use the `numpy.rec` module that provides several methods allowing more magic so as to recognize inputs in a variety of formats. For your needs, `numpy.rec.fromarrays` seems to work just fine:

In [17]: B = np.rec.fromarrays(np.random.randn(2,10),dtype=dt)

In [18]: B
Out[18]:
rec.array([(1.1266019344329834, -0.091760553419589996),
       (0.018915429711341858, 0.26001507043838501),
       (0.80425763130187988, 0.77772557735443115),
       (0.20478853583335876, 0.10050154477357864),
       (0.67508858442306519, 1.8889480829238892),
       (1.0913237333297729, 1.9765472412109375),
       (-0.64121735095977783, -0.14685167372226715),
       (0.26050111651420593, 0.56423413753509521),
       (-0.047166235744953156, 1.0811176300048828),
       (-2.828101634979248, -0.36026483774185181)],
      dtype=[('a', '<f4'), ('b', '<f4')])
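[For reference, the constructions recommended in this thread can be combined into one self-contained sketch. It assumes nothing beyond numpy itself; the '<f4' fields correspond to the 'f' typecode in Erik's dtype, and the variable names are illustrative:]

import numpy as np

dt = np.dtype([('a', 'f4'), ('b', 'f4')])

# Explicit construction: allocate the record array, then fill one field at a time.
x = np.zeros(10, dtype=dt)
x['a'] = np.random.standard_normal(10)
x['b'] = np.random.standard_normal(10)

# np.rec.fromarrays pairs up one input array per field (first axis = fields).
z = np.rec.fromarrays(np.random.standard_normal((2, 10)), dtype=dt)

print(x['a'] + x['b'])  # fields behave like ordinary float32 arrays
print(z.a[:3])          # a rec.array also exposes fields as attributes

[Both routes avoid the ambiguous buffer interpretation that makes array(randn(2,10), dtype=dt) raise a TypeError.]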
From ralf.gommers at googlemail.com  Sat Jun 20 17:33 2009
From: ralf.gommers at googlemail.com (Ralf Gommers)
Date: Sat, 20 Jun 2009 17:33 -0400
Subject: [Numpy-discussion] I/O documentation and code
Message-ID: 

Hi,

I'm working on the I/O documentation, and have a bunch of questions.

1. The npy/npz formats are documented in lib.format and in the NEP (http://svn.scipy.org/svn/numpy/trunk/doc/neps/npy-format.txt). Is lib.format the right place to add relevant parts of the NEP, or would doc.io be better? Or create a separate page (maybe doc.npy_format)? And is the .npz format fixed or still in flux?

2. Is the .npy format version number (now at 1.0) independent of the numpy version numbering, when is it incremented, and will it be backwards compatible?

3. For a longer coherent overview of I/O, does that go in doc.io or routines.io.rst?

4. This page http://www.scipy.org/Data_sets_and_examples talks about including data sets with scipy, has this happened? Would it be possible to include a single small dataset in numpy for use in examples?

5. DataSource contains a lot of TODOs and behavior that is documented as a bug in the docstring. Is anyone working on this? If not, I can give it a go.

TODOs that need work, or at least a yes/no decision:
5a. .zip and .tar support (is .tar needed?)
5b. URLs only work if they include 'http://' (currently documented as a bug, which it not necessarily is. fix or document?)
5c. _cache() does not handle compressed files, and should use shutil.copyfile
5d. make abspath() more robust
5e. in open(), support for creating files and adding a 'subdir' parameter (needed?)

Does anyone have (self-contained) code using DataSource, or a suggestion for data on the web that can be used in examples?

Cheers,
Ralf
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From robert.kern at gmail.com  Sat Jun 20 18:08:29 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Sat, 20 Jun 2009 17:08:29 -0500
Subject: [Numpy-discussion] I/O documentation and code
In-Reply-To: 
References: 
Message-ID: <3d375d730906201508n2fd0496fgb29302d1c89c893b@mail.gmail.com>

On Sat, Jun 20, 2009 at 16:33, Ralf Gommers wrote:
>
> Hi,
>
> I'm working on the I/O documentation, and have a bunch of questions.
>
> 1. The npy/npz formats are documented in lib.format and in the NEP (http://svn.scipy.org/svn/numpy/trunk/doc/neps/npy-format.txt). Is lib.format the right place to add relevant parts of the NEP, or would doc.io be better?

What parts?

> Or create a separate page (maybe doc.npy_format)?

Probably all of the implemented NEPs should have their own place in the documentation and other parts should reference the NEPs for technical detail.

> And is the .npz format fixed or still in flux?

It's not as formalized as the .npy format, but I expect it to be at least as solid as other code in numpy.

> 2. Is the .npy format version number (now at 1.0) independent of the numpy version numbering, when is it incremented, and will it be backwards compatible?

It is independent of numpy version numbering. If we do upgrade the format, the code in numpy.io will still be able to read and write 1.0 files.

> 4. This page http://www.scipy.org/Data_sets_and_examples talks about including data sets with scipy, has this happened? Would it be possible to include a single small dataset in numpy for use in examples?

I think the dataset convention is entirely independent of numpy per se. The current version of this stuff is in the scikits.learn package:

http://svn.scipy.org/svn/scikits/trunk/learn/scikits/learn/datasets/

The proposal could be turned into an "informative" NEP, of course. It needs to be updated, though (e.g. it talks about not needing to combine masked arrays and record arrays, but this has already been done with the numpy.ma rewrite).

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
 -- Umberto Eco
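[To make the formats under discussion concrete, a minimal sketch of the user-facing save/load round trip; the filenames are arbitrary:]

import numpy as np

a = np.arange(10).reshape(2, 5)

# .npy stores a single array; the format version lives in the file header.
np.save('a.npy', a)
b = np.load('a.npy')
assert (a == b).all()

# .npz is a zip archive of .npy files, one member per keyword argument.
np.savez('arrays.npz', a=a, twice=2 * a)
npz = np.load('arrays.npz')
print(npz['twice'])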
From jsseabold at gmail.com  Sat Jun 20 18:24:41 2009
From: jsseabold at gmail.com (Skipper Seabold)
Date: Sat, 20 Jun 2009 18:24:41 -0400
Subject: [Numpy-discussion] I/O documentation and code
In-Reply-To: 
References: 
Message-ID: 

On Sat, Jun 20, 2009 at 5:33 PM, Ralf Gommers wrote:
> Hi,
>
> I'm working on the I/O documentation, and have a bunch of questions.
>
> 1. The npy/npz formats are documented in lib.format and in the NEP (http://svn.scipy.org/svn/numpy/trunk/doc/neps/npy-format.txt). Is lib.format the right place to add relevant parts of the NEP, or would doc.io be better? Or create a separate page (maybe doc.npy_format)? And is the .npz format fixed or still in flux?
>
> 2. Is the .npy format version number (now at 1.0) independent of the numpy version numbering, when is it incremented, and will it be backwards compatible?
>
> 3. For a longer coherent overview of I/O, does that go in doc.io or routines.io.rst?
>
> 4. This page http://www.scipy.org/Data_sets_and_examples talks about including data sets with scipy, has this happened? Would it be possible to include a single small dataset in numpy for use in examples?
>
> 5. DataSource contains a lot of TODOs and behavior that is documented as a bug in the docstring. Is anyone working on this? If not, I can give it a go.

This was proposed as a GSoC project and I went through it, but that's about all I know. I can't find my notes now, but here are some thoughts off the top of my head. The code is here for the record

> TODOs that need work, or at least a yes/no decision:
> 5a. .zip and .tar support (is .tar needed?)

Would these be trivial to implement? And since the import overhead is deferred until it's needed I don't see the harm in including the support...

> 5b. URLs only work if they include 'http://' (currently documented as a bug, which it not necessarily is. fix or document?)

I would say document, since we might have any number of protocols, so it might not make sense to just default to http://

> 5c. _cache() does not handle compressed files, and should use shutil.copyfile

I never understood what this meant, but maybe I'm missing something. If path is a compressed file then it is written to a local directory as a compressed file. What else does it need to handle? Should it be fetch archive, extract (single file or archive), cache locally?

> 5d. make abspath() more robust
> 5e. in open(), support for creating files and adding a 'subdir' parameter (needed?)

I would think there should be support for both of these. I have some rough scripts that I used for remote data fetching and I like it to create a ./tmp directory and cache the file there and then clean up after myself when I'm done.

> Does anyone have (self-contained) code using DataSource, or a suggestion for data on the web that can be used in examples?

I'm not sure if this is what you're after, but I've been using some of these "classic published results" and there are some compressed archives.

http://www.stanford.edu/~clint/bench/

> Cheers,
> Ralf

Skipper

From ralf.gommers at googlemail.com  Sat Jun 20 20:02:12 2009
From: ralf.gommers at googlemail.com (Ralf Gommers)
Date: Sat, 20 Jun 2009 20:02:12 -0400
Subject: [Numpy-discussion] I/O documentation and code
In-Reply-To: <3d375d730906201508n2fd0496fgb29302d1c89c893b@mail.gmail.com>
References: <3d375d730906201508n2fd0496fgb29302d1c89c893b@mail.gmail.com>
Message-ID: 

On Sat, Jun 20, 2009 at 6:08 PM, Robert Kern wrote:

> On Sat, Jun 20, 2009 at 16:33, Ralf Gommers wrote:
> >
> > Hi,
> >
> > I'm working on the I/O documentation, and have a bunch of questions.
> >
> > 1. The npy/npz formats are documented in lib.format and in the NEP (http://svn.scipy.org/svn/numpy/trunk/doc/neps/npy-format.txt). Is lib.format the right place to add relevant parts of the NEP, or would doc.io be better?
>
> What parts?

- abstract (i.e.
what is this, what's it good for)
- comparison with pickle, memmap
- most of the items in "Requirements"
- extension info (.npy/.npz, not enforced)

Right now lib.format does not contain the word "binary", or ".npy". We need a complete description in the reference guide that functions like `save` and `load` can reference.

> > Or create a separate page (maybe doc.npy_format)?
>
> Probably all of the implemented NEPs should have their own place in the documentation and other parts should reference the NEPs for technical detail.

Good point, NEPs should be somewhere in the docs. However, they do not seem appropriate to refer users to, paragraphs like Rationale, Use Cases, Implementation are not aimed directly at users.

> > And is the .npz format fixed or still in flux?
>
> It's not as formalized as the .npy format, but I expect it to be at least as solid as other code in numpy.
>
> > 2. Is the .npy format version number (now at 1.0) independent of the numpy version numbering, when is it incremented, and will it be backwards compatible?
>
> It is independent of numpy version numbering. If we do upgrade the format, the code in numpy.io will still be able to read and write 1.0 files.
>
> > 4. This page http://www.scipy.org/Data_sets_and_examples talks about including data sets with scipy, has this happened? Would it be possible to include a single small dataset in numpy for use in examples?
>
> I think the dataset convention is entirely independent of numpy per se. The current version of this stuff is in the scikits.learn package:
>
> http://svn.scipy.org/svn/scikits/trunk/learn/scikits/learn/datasets/
>
> The proposal could be turned into an "informative" NEP, of course. It needs to be updated, though (e.g. it talks about not needing to combine masked arrays and record arrays, but this has already been done with the numpy.ma rewrite).

Interesting, might be useful for all sorts of examples in docstrings and especially tutorial-style docs. David, do you still plan to put this forward for inclusion in numpy or scipy?

Ralf
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From ralf.gommers at googlemail.com  Sat Jun 20 20:28:14 2009
From: ralf.gommers at googlemail.com (Ralf Gommers)
Date: Sat, 20 Jun 2009 20:28:14 -0400
Subject: [Numpy-discussion] I/O documentation and code
In-Reply-To: 
References: 
Message-ID: 

On Sat, Jun 20, 2009 at 6:24 PM, Skipper Seabold wrote:

> On Sat, Jun 20, 2009 at 5:33 PM, Ralf Gommers wrote:
> > Hi,
> >
> > I'm working on the I/O documentation, and have a bunch of questions.
> >
> > 1. The npy/npz formats are documented in lib.format and in the NEP (http://svn.scipy.org/svn/numpy/trunk/doc/neps/npy-format.txt). Is lib.format the right place to add relevant parts of the NEP, or would doc.io be better? Or create a separate page (maybe doc.npy_format)? And is the .npz format fixed or still in flux?
> >
> > 2. Is the .npy format version number (now at 1.0) independent of the numpy version numbering, when is it incremented, and will it be backwards compatible?
> >
> > 3. For a longer coherent overview of I/O, does that go in doc.io or routines.io.rst?
> >
> > 4. This page http://www.scipy.org/Data_sets_and_examples talks about including data sets with scipy, has this happened? Would it be possible to include a single small dataset in numpy for use in examples?
> >
> > 5. DataSource contains a lot of TODOs and behavior that is documented as a bug in the docstring.
> > Is anyone working on this? If not, I can give it a go.
>
> This was proposed as a GSoC project and I went through it, but that's about all I know. I can't find my notes now, but here are some thoughts off the top of my head. The code is here for the record
>
> > TODOs that need work, or at least a yes/no decision:
> > 5a. .zip and .tar support (is .tar needed?)
>
> Would these be trivial to implement? And since the import overhead is deferred until it's needed I don't see the harm in including the support...

.zip would be similar to .gz and .bz2. These are all assumed to be single files, .tar is usually a file archive which needs a different approach.

> > 5b. URLs only work if they include 'http://' (currently documented as a bug, which it not necessarily is. fix or document?)
>
> I would say document, since we might have any number of protocols, so it might not make sense to just default to http://

agreed.

> > 5c. _cache() does not handle compressed files, and should use shutil.copyfile
>
> I never understood what this meant, but maybe I'm missing something. If path is a compressed file then it is written to a local directory as a compressed file. What else does it need to handle? Should it be fetch archive, extract (single file or archive), cache locally?

Maybe it's about fetching data with gzip-compression, as described here http://diveintopython.org/http_web_services/gzip_compression.html. I agree normal read/write should work with compressed data. For local files, a file copy operation would make more sense than reading the file and then writing it to a new file anyway.

> > 5d. make abspath() more robust
> > 5e. in open(), support for creating files and adding a 'subdir' parameter (needed?)
>
> I would think there should be support for both of these. I have some rough scripts that I used for remote data fetching and I like it to create a ./tmp directory and cache the file there and then clean up after myself when I'm done.
>
> > Does anyone have (self-contained) code using DataSource, or a suggestion for data on the web that can be used in examples?
>
> I'm not sure if this is what you're after, but I've been using some of these "classic published results" and there are some compressed archives.
>
> http://www.stanford.edu/~clint/bench/

Something like this, but we can not rely on such a site to stay where it is forever. Maybe it makes sense to put some files on scipy.org and use those. We could for example use data from http://data.un.org/ , the usage terms allow that and it's high-quality data. For example data on energy usage (total, fossil and alternative sources for different countries). Longer term the data sets from the learn scikit that Robert pointed out may make it into numpy.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
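[Since self-contained DataSource code was requested above, a minimal sketch of the class as currently exposed (assuming it is reachable as numpy.DataSource; the URL and cache directory are purely illustrative):]

import numpy as np

# Files opened through a DataSource are downloaded and cached under destpath.
ds = np.DataSource(destpath='/tmp/ds_cache')
url = 'http://example.com/data/table.txt.gz'  # hypothetical remote file

if ds.exists(url):
    f = ds.open(url)   # fetches, caches, and transparently un-gzips
    print(f.readline())
    f.close()

print(ds.abspath(url))  # local path where the cached copy would live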
From gael.varoquaux at normalesup.org  Sat Jun 20 20:31:04 2009
From: gael.varoquaux at normalesup.org (Gael Varoquaux)
Date: Sun, 21 Jun 2009 02:31:04 +0200
Subject: [Numpy-discussion] Ready for review: PyArrayNeighIterObject, an iterator to iterate over a neighborhood in arbitrary arrays
In-Reply-To: 
References: <5b8d13220906131135r6f285545rbfc6cc1699568924@mail.gmail.com> <5b8d13220906140059k2322a70cs8cf14b43fa64babb@mail.gmail.com> <5b8d13220906140107n6ad3bb26n2271c29e298d4c73@mail.gmail.com> <5b8d13220906172135m1d38d3e5ja3e0dd19c8f57fb8@mail.gmail.com> <5b8d13220906200715i6565fcb3rf16a65951c135b30@mail.gmail.com>
Message-ID: <20090621003104.GA6812@phare.normalesup.org>

On Sat, Jun 20, 2009 at 12:08:29PM -0600, Charles R Harris wrote:
> Since we are extending the API and bumping up the API number, I wonder if we should change the python dependency to 2.6? If there are otherwise no major improvements in the pipe besides bug fixes now might be the time to start laying the foundations for Python 3.0 compatibility.

What is the reasoning behind this? What would bumping Python dependency give us?

Gaël

From robert.kern at gmail.com  Sat Jun 20 20:34:32 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Sat, 20 Jun 2009 19:34:32 -0500
Subject: [Numpy-discussion] Ready for review: PyArrayNeighIterObject, an iterator to iterate over a neighborhood in arbitrary arrays
In-Reply-To: <20090621003104.GA6812@phare.normalesup.org>
References: <5b8d13220906140059k2322a70cs8cf14b43fa64babb@mail.gmail.com> <5b8d13220906140107n6ad3bb26n2271c29e298d4c73@mail.gmail.com> <5b8d13220906172135m1d38d3e5ja3e0dd19c8f57fb8@mail.gmail.com> <5b8d13220906200715i6565fcb3rf16a65951c135b30@mail.gmail.com> <20090621003104.GA6812@phare.normalesup.org>
Message-ID: <3d375d730906201734g7162214aj698113e992e1b36c@mail.gmail.com>

On Sat, Jun 20, 2009 at 19:31, Gael Varoquaux wrote:
>
> On Sat, Jun 20, 2009 at 12:08:29PM -0600, Charles R Harris wrote:
> > Since we are extending the API and bumping up the API number, I wonder if we should change the python dependency to 2.6? If there are otherwise no major improvements in the pipe besides bug fixes now might be the time to start laying the foundations for Python 3.0 compatibility.
>
> What is the reasoning behind this? What would bumping Python dependency give us?

It will probably be difficult to port to Python 3.0 while maintaining compatibility with Python 2.x<6. However, that is a hypothetical, and I am strongly against bumping the Python requirement up to 2.6 until we discover that "difficult" is actually "near-impossible".

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
 -- Umberto Eco

From cournape at gmail.com  Sun Jun 21 00:51:31 2009
From: cournape at gmail.com (David Cournapeau)
Date: Sun, 21 Jun 2009 13:51:31 +0900
Subject: [Numpy-discussion] Ready for review: PyArrayNeighIterObject, an iterator to iterate over a neighborhood in arbitrary arrays
In-Reply-To: 
References: <4A33ADC8.9010702@ar.media.kyoto-u.ac.jp> <5b8d13220906131135r6f285545rbfc6cc1699568924@mail.gmail.com> <5b8d13220906140059k2322a70cs8cf14b43fa64babb@mail.gmail.com> <5b8d13220906140107n6ad3bb26n2271c29e298d4c73@mail.gmail.com> <5b8d13220906172135m1d38d3e5ja3e0dd19c8f57fb8@mail.gmail.com> <5b8d13220906200715i6565fcb3rf16a65951c135b30@mail.gmail.com>
Message-ID: <5b8d13220906202151p2170c1f0q2a9bbe33d2e90983@mail.gmail.com>

On Sun, Jun 21, 2009 at 3:08 AM, Charles R Harris wrote:
>
> Since we are extending the API and bumping up the API number, I wonder if we should change the python dependency to 2.6?

I think we should keep compatibility with python 2.4, it is still used by many people, and the default python on several major distributions (RHEL comes to mind). Moreover, python 2.5/2.6 don't bring many advantages compared to 2.4. The only reason I can see is if that helps for python 3.0 porting - we are too early in the process to have a good idea on this I think (maybe supporting both python 2.* and 3.* is practically impossible, for example, so we would need to support numpy for python 2.* for quite a while).

> If there are otherwise no major improvements in the pipe besides bug fixes now might be the time to start laying the foundations for Python 3.0 compatibility.

Numpy trunk has several enhancements which are minor on their own, but would be useful for scipy 0.8.0, reusable npy_math, in particular, to improve scipy.special.

David

From geometrian at gmail.com  Sun Jun 21 03:04:28 2009
From: geometrian at gmail.com (Ian Mallett)
Date: Sun, 21 Jun 2009 00:04:28 -0700
Subject: [Numpy-discussion] Blurring an Array
Message-ID: 

Hello,

I'm working on a program that will draw me a metallic 3D texture. I successfully made a Perlin noise implementation and found that when the result is blurred in one direction, the result actually looks somewhat like brushed aluminum. The plan is to do this for every n*m*3 layer (2D texture) in the 3D texture.

My solution to this anisotropic blurring looks like:

soften = layerarray.copy()
total = 1
for bluriter in xrange(1,5,1):
    soften[:,bluriter:]  += layerarray[:,:-bluriter]
    soften[:,:-bluriter] += layerarray[:,bluriter:]
    total += 2
soften /= total

Where layerarray is a n*m*3 array, and soften is the array that will be converted into an image with the other 2D images and saved.

This code successfully blurs the array in the y-direction. However, it does not do so the way I would like. The blur is accomplished by a simple shift, making the arrays not line up. This leaves space at the edges. When the final soften array is divided by total, those areas are especially dark. Visually, this is unacceptable, and leads to banding, which is especially evident if the texture is repeated. As you can see in this image, which shows about 6 repeats of the texture, http://img13.imageshack.us/img13/5789/image1wgq.png, the dark edges are annoying. I made sure, of course, that the Perlin noise implementation is tileable.

The solution to my problem is to make the shifted array wrap around so that its overhang fills in the hole the shift caused it to leave behind.
For example, to simulate shifting the texture 8 units up with wrap, the first 8 rows should be removed from the top and added to the bottom. Likewise for columns if the blur goes in that direction.

I already tried a couple of times at this, and it's not working. I need a way to blur soften by column and by row.

Thanks,
Ian
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From robert.kern at gmail.com  Sun Jun 21 04:05:06 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Sun, 21 Jun 2009 03:05:06 -0500
Subject: [Numpy-discussion] Blurring an Array
In-Reply-To: 
References: 
Message-ID: <3d375d730906210105m5d01a831v54bc58b0bbd237ec@mail.gmail.com>

On Sun, Jun 21, 2009 at 02:04, Ian Mallett wrote:
>
> Hello,
>
> I'm working on a program that will draw me a metallic 3D texture. I successfully made a Perlin noise implementation and found that when the result is blurred in one direction, the result actually looks somewhat like brushed aluminum. The plan is to do this for every n*m*3 layer (2D texture) in the 3D texture.
>
> My solution to this anisotropic blurring looks like:
>
> soften = layerarray.copy()
> total = 1
> for bluriter in xrange(1,5,1):
>     soften[:,bluriter:]  += layerarray[:,:-bluriter]
>     soften[:,:-bluriter] += layerarray[:,bluriter:]
>     total += 2
> soften /= total

Use scipy.ndimage.convolve() and an appropriate convolution kernel.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
 -- Umberto Eco

From geometrian at gmail.com  Sun Jun 21 04:31:25 2009
From: geometrian at gmail.com (Ian Mallett)
Date: Sun, 21 Jun 2009 01:31:25 -0700
Subject: [Numpy-discussion] Blurring an Array
In-Reply-To: <3d375d730906210105m5d01a831v54bc58b0bbd237ec@mail.gmail.com>
References: <3d375d730906210105m5d01a831v54bc58b0bbd237ec@mail.gmail.com>
Message-ID: 

Sounds like it would work, but unfortunately numpy was one of my dependency constraints. I should have mentioned that.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From robert.kern at gmail.com  Sun Jun 21 04:34:00 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Sun, 21 Jun 2009 03:34:00 -0500
Subject: [Numpy-discussion] Blurring an Array
In-Reply-To: 
References: <3d375d730906210105m5d01a831v54bc58b0bbd237ec@mail.gmail.com>
Message-ID: <3d375d730906210134l500a7630ka77d00082ffcad40@mail.gmail.com>

On Sun, Jun 21, 2009 at 03:31, Ian Mallett wrote:
>
> Sounds like it would work, but unfortunately numpy was one of my dependency constraints. I should have mentioned that.

In that case, use numpy.roll() instead of slicing to get wraparound.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
 -- Umberto Eco
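[A sketch of how that suggestion slots into the original loop: np.roll() shifts with wraparound, so the texture stays tileable and no dark borders appear. The function name and test shape are illustrative:]

import numpy as np

def directional_blur(layer, radius=4, axis=1):
    # Average `layer` with wrapped, shifted copies of itself along `axis`.
    soften = layer.copy()
    total = 1
    for shift in range(1, radius + 1):
        soften += np.roll(layer, shift, axis=axis)
        soften += np.roll(layer, -shift, axis=axis)
        total += 2
    return soften / total

layer = np.random.random((64, 64, 3))  # one n*m*3 texture layer
blurred = directional_blur(layer)
assert blurred.shape == layer.shape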
From david at ar.media.kyoto-u.ac.jp  Sun Jun 21 05:01:31 2009
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Sun, 21 Jun 2009 18:01:31 +0900
Subject: [Numpy-discussion] Plans for Numpy 1.4.0 and scipy 0.8.0
Message-ID: <4A3DF6EB.4050606@ar.media.kyoto-u.ac.jp>

(Continuing the discussion initiated in the neighborhood iterator thread)

Hi,

    I would like to gather people's opinion on what to target for numpy 1.4.0.
    - Chuck suggested to drop python < 2.6 support from now on. I am against it without a very strong and detailed rationale, because many OS still don't have python 2.6 (RHEL, Ubuntu LTS).
    - Even if not many new features have been implemented since 1.3.0, there were several changes which would be quite useful (using npy_math from numpy for scipy.special, neighborhood iterator for scipy.signal). So releasing 1.4.0 soon would be useful so that scipy 0.8.0 could depend on it.
    - Fixing crashes on windows 64 bits: I have not made any progress on this. I am out of ideas on how to debug the problem, to be honest.

Are there any other features people would like to put into numpy for 1.4.0 ?

cheers,

David

From geometrian at gmail.com  Sun Jun 21 06:48:26 2009
From: geometrian at gmail.com (Ian Mallett)
Date: Sun, 21 Jun 2009 03:48:26 -0700
Subject: [Numpy-discussion] Blurring an Array
In-Reply-To: <3d375d730906210134l500a7630ka77d00082ffcad40@mail.gmail.com>
References: <3d375d730906210105m5d01a831v54bc58b0bbd237ec@mail.gmail.com> <3d375d730906210134l500a7630ka77d00082ffcad40@mail.gmail.com>
Message-ID: 

This works perfectly! Is there likewise a similar call for Numeric?
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From dsdale24 at gmail.com  Sun Jun 21 07:38:26 2009
From: dsdale24 at gmail.com (Darren Dale)
Date: Sun, 21 Jun 2009 07:38:26 -0400
Subject: [Numpy-discussion] Plans for Numpy 1.4.0 and scipy 0.8.0
In-Reply-To: <4A3DF6EB.4050606@ar.media.kyoto-u.ac.jp>
References: <4A3DF6EB.4050606@ar.media.kyoto-u.ac.jp>
Message-ID: 

On Sun, Jun 21, 2009 at 5:01 AM, David Cournapeau <david at ar.media.kyoto-u.ac.jp> wrote:

> (Continuing the discussion initiated in the neighborhood iterator thread)
>
> Hi,
>
> I would like to gather people's opinion on what to target for numpy 1.4.0.
> - Chuck suggested to drop python < 2.6 support from now on. I am against it without a very strong and detailed rationale, because many OS still don't have python 2.6 (RHEL, Ubuntu LTS).
> - Even if not many new features have been implemented since 1.3.0, there were several changes which would be quite useful (using npy_math from numpy for scipy.special, neighborhood iterator for scipy.signal). So releasing 1.4.0 soon would be useful so that scipy 0.8.0 could depend on it.
> - Fixing crashes on windows 64 bits: I have not made any progress on this. I am out of ideas on how to debug the problem, to be honest.
>
> Are there any other features people would like to put into numpy for 1.4.0 ?

I've been trying to engage the numpy developers on a proposal to enhance the array wrapping mechanism in ufuncs. The goal is to enable subclasses of ndarray like MaskedArray and Quantity to work with the built-in ufuncs instead of having to reimplement them. Here is the thread:

http://www.nabble.com/suggestion-for-generalizing-numpy-functions-tt22413628.html#a22413628
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From j.reid at mail.cryst.bbk.ac.uk  Sun Jun 21 10:38:00 2009
From: j.reid at mail.cryst.bbk.ac.uk (John Reid)
Date: Sun, 21 Jun 2009 15:38:00 +0100
Subject: [Numpy-discussion] Plans for Numpy 1.4.0 and scipy 0.8.0
In-Reply-To: <4A3DF6EB.4050606@ar.media.kyoto-u.ac.jp>
References: <4A3DF6EB.4050606@ar.media.kyoto-u.ac.jp>
Message-ID: 

David Cournapeau wrote:
> (Continuing the discussion initiated in the neighborhood iterator thread)
> - Chuck suggested to drop python < 2.6 support from now on.
> I am against it without a very strong and detailed rationale, because many OS still don't have python 2.6 (RHEL, Ubuntu LTS).

I vote against dropping support for python 2.5. Personally I have no incentive to upgrade to 2.6 and am very happy with 2.5.

From dsdale24 at gmail.com  Sun Jun 21 10:55:53 2009
From: dsdale24 at gmail.com (Darren Dale)
Date: Sun, 21 Jun 2009 10:55:53 -0400
Subject: [Numpy-discussion] Plans for Numpy 1.4.0 and scipy 0.8.0
In-Reply-To: 
References: <4A3DF6EB.4050606@ar.media.kyoto-u.ac.jp>
Message-ID: 

On Sun, Jun 21, 2009 at 10:38 AM, John Reid wrote:

> David Cournapeau wrote:
> > (Continuing the discussion initiated in the neighborhood iterator thread)
> > - Chuck suggested to drop python < 2.6 support from now on. I am against it without a very strong and detailed rationale, because many OS still don't have python 2.6 (RHEL, Ubuntu LTS).
>
> I vote against dropping support for python 2.5. Personally I have no incentive to upgrade to 2.6 and am very happy with 2.5.

Will requiring python-2.6 help the developers port numpy to python-3?
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From cournape at gmail.com  Sun Jun 21 11:34:15 2009
From: cournape at gmail.com (David Cournapeau)
Date: Mon, 22 Jun 2009 00:34:15 +0900
Subject: [Numpy-discussion] Plans for Numpy 1.4.0 and scipy 0.8.0
In-Reply-To: 
References: <4A3DF6EB.4050606@ar.media.kyoto-u.ac.jp>
Message-ID: <5b8d13220906210834h3d6afcees372b803c277746c7@mail.gmail.com>

On Sun, Jun 21, 2009 at 11:55 PM, Darren Dale wrote:
> On Sun, Jun 21, 2009 at 10:38 AM, John Reid wrote:
>>
>> David Cournapeau wrote:
>> > (Continuing the discussion initiated in the neighborhood iterator thread)
>> >     - Chuck suggested to drop python < 2.6 support from now on. I am against it without a very strong and detailed rationale, because many OS still don't have python 2.6 (RHEL, Ubuntu LTS).
>>
>> I vote against dropping support for python 2.5. Personally I have no incentive to upgrade to 2.6 and am very happy with 2.5.
>
> Will requiring python-2.6 help the developers port numpy to python-3?

It will help, but it is unclear whether it will be a significant help. From what I have seen, very few non-trivial C extensions have been ported so far, and it is not even clear to me whether it will be possible to support both python 2.x and python 3.x at the same time (without a huge amount of work, that is).

David

From charlesr.harris at gmail.com  Sun Jun 21 11:42:18 2009
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Sun, 21 Jun 2009 09:42:18 -0600
Subject: [Numpy-discussion] Plans for Numpy 1.4.0 and scipy 0.8.0
In-Reply-To: 
References: <4A3DF6EB.4050606@ar.media.kyoto-u.ac.jp>
Message-ID: 

On Sun, Jun 21, 2009 at 8:55 AM, Darren Dale wrote:
> On Sun, Jun 21, 2009 at 10:38 AM, John Reid wrote:
>>
>> David Cournapeau wrote:
>> > (Continuing the discussion initiated in the neighborhood iterator thread)
>> >     - Chuck suggested to drop python < 2.6 support from now on. I am against it without a very strong and detailed rationale, because many OS still don't have python 2.6 (RHEL, Ubuntu LTS).
>>
>> I vote against dropping support for python 2.5. Personally I have no incentive to upgrade to 2.6 and am very happy with 2.5.
>
> Will requiring python-2.6 help the developers port numpy to python-3?

Can't really say at this point, but it is the suggested path to python-3. I raised the point to start a discussion.
My thoughts are that we need to make the move at some point next year.

Chuck

From charlesr.harris at gmail.com  Sun Jun 21 11:53:18 2009
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Sun, 21 Jun 2009 09:53:18 -0600
Subject: [Numpy-discussion] Plans for Numpy 1.4.0 and scipy 0.8.0
In-Reply-To: 
References: <4A3DF6EB.4050606@ar.media.kyoto-u.ac.jp>
Message-ID: 

On Sun, Jun 21, 2009 at 9:42 AM, Charles R Harris wrote:
> On Sun, Jun 21, 2009 at 8:55 AM, Darren Dale wrote:
>> On Sun, Jun 21, 2009 at 10:38 AM, John Reid wrote:
>>>
>>> David Cournapeau wrote:
>>> > (Continuing the discussion initiated in the neighborhood iterator thread)
>>> >     - Chuck suggested to drop python < 2.6 support from now on. I am against it without a very strong and detailed rationale, because many OS still don't have python 2.6 (RHEL, Ubuntu LTS).
>>>
>>> I vote against dropping support for python 2.5. Personally I have no incentive to upgrade to 2.6 and am very happy with 2.5.
>>
>> Will requiring python-2.6 help the developers port numpy to python-3?
>
> Can't really say at this point, but it is the suggested path to python-3. I raised the point to start a discussion. My thoughts are that we need to make the move at some point next year.

Before that move we should have a version of numpy that doesn't need many fixes and that supports developments in scipy so that folks can get good functionality sticking to that release for a while without upgrading. But we do need to think a bit about what that entails.

Chuck

From cournape at gmail.com  Sun Jun 21 11:59:46 2009
From: cournape at gmail.com (David Cournapeau)
Date: Mon, 22 Jun 2009 00:59:46 +0900
Subject: [Numpy-discussion] Plans for Numpy 1.4.0 and scipy 0.8.0
In-Reply-To: 
References: <4A3DF6EB.4050606@ar.media.kyoto-u.ac.jp>
Message-ID: <5b8d13220906210859u64c36af5tdb154593db61ab07@mail.gmail.com>

On Mon, Jun 22, 2009 at 12:42 AM, Charles R Harris wrote:
> On Sun, Jun 21, 2009 at 8:55 AM, Darren Dale wrote:
>> On Sun, Jun 21, 2009 at 10:38 AM, John Reid wrote:
>>>
>>> David Cournapeau wrote:
>>> > (Continuing the discussion initiated in the neighborhood iterator thread)
>>> >     - Chuck suggested to drop python < 2.6 support from now on. I am against it without a very strong and detailed rationale, because many OS still don't have python 2.6 (RHEL, Ubuntu LTS).
>>>
>>> I vote against dropping support for python 2.5. Personally I have no incentive to upgrade to 2.6 and am very happy with 2.5.
>>
>> Will requiring python-2.6 help the developers port numpy to python-3?
>
> Can't really say at this point, but it is the suggested path to python-3.

OTOH, I don't find the python 3 "official" transition story very convincing. I have tried to gather all the information I could find, both on the python wiki and from transition stories. To support both python 2 and 3, the suggestion is to use the 2to3 script, but it is painfully slow for big packages like numpy. And there are very few stories for porting python 3 C extensions.

Another suggestion is to avoid breaking the API when transitioning for python 3. But that seems quite unrealistic. How do we deal with the removing of string/long APIs ? This will impact the numpy API as well, so how do we deal with it ?

Also, there does not seem to be any advantages for python 3 for scientific people ?
cheers,

David

From cournape at gmail.com  Sun Jun 21 12:00:52 2009
From: cournape at gmail.com (David Cournapeau)
Date: Mon, 22 Jun 2009 01:00:52 +0900
Subject: [Numpy-discussion] Plans for Numpy 1.4.0 and scipy 0.8.0
In-Reply-To: <5b8d13220906210859u64c36af5tdb154593db61ab07@mail.gmail.com>
References: <4A3DF6EB.4050606@ar.media.kyoto-u.ac.jp> <5b8d13220906210859u64c36af5tdb154593db61ab07@mail.gmail.com>
Message-ID: <5b8d13220906210900p59af5868mcfb6b444adf96f94@mail.gmail.com>

On Mon, Jun 22, 2009 at 12:59 AM, David Cournapeau wrote:
> Another suggestion is to avoid breaking the API when transitioning for python 3. But that seems quite unrealistic. How do we deal with the removing of string/long APIs ?
                                  ^^^^^
This should be int, of course,

David

From lou_boog2000 at yahoo.com  Sun Jun 21 12:09:23 2009
From: lou_boog2000 at yahoo.com (Lou Pecora)
Date: Sun, 21 Jun 2009 09:09:23 -0700 (PDT)
Subject: [Numpy-discussion] Plans for Numpy 1.4.0 and scipy 0.8.0
Message-ID: <782677.66861.qm@web34406.mail.mud.yahoo.com>

I'm still using 2.4, but I plan to go to 2.5 when the project we're doing now reaches a stable point later this year. Not sure after that. I know it's real work to keep several versions going, but I sense there are a lot of people in the 2.4 - 2.5 window. I guess 2.6 is a mini step toward 3.0. The problem with each step is that all the libraries we rely on have to be upgraded to that step or we might lose the functionality of that library. For me that's a killer. I have to take a good look at all of them before the upgrade or a big project will take a fatal hit.

-- Lou Pecora, my views are my own.

--- On Sun, 6/21/09, John Reid wrote:

From: John Reid
Subject: Re: [Numpy-discussion] Plans for Numpy 1.4.0 and scipy 0.8.0
To: numpy-discussion at scipy.org
Date: Sunday, June 21, 2009, 10:38 AM

David Cournapeau wrote:
> (Continuing the discussion initiated in the neighborhood iterator thread)
>     - Chuck suggested to drop python < 2.6 support from now on. I am against it without a very strong and detailed rationale, because many OS still don't have python 2.6 (RHEL, Ubuntu LTS).

I vote against dropping support for python 2.5. Personally I have no incentive to upgrade to 2.6 and am very happy with 2.5.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From kxroberto at googlemail.com  Sun Jun 21 13:17:37 2009
From: kxroberto at googlemail.com (Robert)
Date: Sun, 21 Jun 2009 19:17:37 +0200
Subject: [Numpy-discussion] Plans for Numpy 1.4.0 and scipy 0.8.0
In-Reply-To: <782677.66861.qm@web34406.mail.mud.yahoo.com>
References: <782677.66861.qm@web34406.mail.mud.yahoo.com>
Message-ID: 

Lou Pecora wrote:
> I'm still using 2.4, but I plan to go to 2.5 when the project we're doing now reaches a stable point later this year. Not sure after that. I know it's real work to keep several versions going, but I sense there are a lot of people in the 2.4 - 2.5 window. I guess 2.6 is a mini step toward 3.0. The problem with each step is that all the libraries we rely on have to be upgraded to that step or we might lose the functionality of that library. For me that's a killer. I have to take a good look at all of them before the upgrade or a big project will take a fatal hit.

+1

I'd like even support for Python 2.3. Many basic libraries still support 2.3.
Recently I wanted to download the latest numpy for 2.3 which I need for some projects and :-( . Just since 2008-08-01 they dropped both 2.3 and 2.4. Is there a serious reason?

And numpy is a very basic library. And what is in numpy (scipy) that requires advanced language syntax? It's just about numbers and slices. A few ifdef's for new concepts like new base classes. It needs nothing of the real news of advanced Pythons. A thing like numpy/scipy is most simple regarding code structure. It should be easy to offer it for old Pythons - at least 2.3, which is the unofficial "Python 2" baseline for library producers. Even a complex GUI app like Pythonwin (where it is very tempting to use advanced sugar) is still supported for even 2.2.

Regarding Python 2 -> 3 Migration: look at e.g. Cython - it produces C code with a few #ifdef's and macros and which compiles both in Py2 (2.3+) and Py3. It's quite simple to maintain. Also, Python code can be written so that it can be auto-transposed from 2 -> 3 for a long time: by 2to3 + a pyXpy comment transposition language like

print "x"            #$3 print("x")
#$3 only_py3_func()
only_py2_func()      #$3

It would be drastic to force the basis for numpy to Py3 so early - unless a back direction is offered. And numpy should/could in my opinion be one of the last libraries which cuts off support for old Pythons.

-

One of the itching problems when using numpy with smaller apps, scripts, web and with freezers is, that import is very slow, and it rams almost the whole numpy into the memory. Many fat dlls like _dotblas, etc. And there are strange (unnecessary) dependencies between the branches and files. See thread "minimal numpy?". That all threatens usability of numpy in many areas. It's designed like "loading numpy/scipy in the lab in the morning and exiting in the evening". That is too sloppy for a basic library which adds an efficient key array class to Python. numpy should be usable in a much lighter way.

If "import numpy" shall by default still import most of numpy - ok, but there should perhaps be at least an easy mechanism to stop this "all-in-one" behavior and dependencies with an easy switch. Or with "import numpy.base". Or in my opinion the reverse behavior would be more decent for a library: "import numpy" imports only the minimum (like in that other thread) and "import numpy.anth[ology] as np" or so may draw the whole mess as before. inter-DLL-imports like multiarray.pyd <-> umath.pyd should use relative imports (like the py modules next to them).

Such cleanup of the code organization and improving usability etc I think is more important than playing a role as "leading py3 enforcer"

Robert

From oliphant at enthought.com  Sun Jun 21 18:04:56 2009
From: oliphant at enthought.com (Travis Oliphant)
Date: Sun, 21 Jun 2009 17:04:56 -0500
Subject: [Numpy-discussion] Plans for Numpy 1.4.0 and scipy 0.8.0
In-Reply-To: 
References: <782677.66861.qm@web34406.mail.mud.yahoo.com>
Message-ID: <406284B7-C717-47E8-8CA1-85B04AFEDBB9@enthought.com>

I don't remember dropping support for 2.4. When did that happen?

Sent from my iPhone

On Jun 21, 2009, at 12:17 PM, Robert wrote:

> Lou Pecora wrote:
>> I'm still using 2.4, but I plan to go to 2.5 when the project we're doing now reaches a stable point later this year. Not sure after that. I know it's real work to keep several versions going, but I sense there are a lot of people in the 2.4 - 2.5 window. I guess 2.6 is a mini step toward 3.0.
>> The problem with each step is that all the libraries we rely on have to be upgraded to that step or we might lose the functionality of that library. For me that's a killer. I have to take a good look at all of them before the upgrade or a big project will take a fatal hit.
>
> +1
>
> I'd like even support for Python 2.3. Many basic libraries still support 2.3. Recently I wanted to download the latest numpy for 2.3 which I need for some projects and :-( . Just since 2008-08-01 they dropped both 2.3 and 2.4. Is there a serious reason?
>
> And numpy is a very basic library. And what is in numpy (scipy) that requires advanced language syntax? It's just about numbers and slices. A few ifdef's for new concepts like new base classes. It needs nothing of the real news of advanced Pythons. A thing like numpy/scipy is most simple regarding code structure. It should be easy to offer it for old Pythons - at least 2.3, which is the unofficial "Python 2" baseline for library producers. Even a complex GUI app like Pythonwin (where it is very tempting to use advanced sugar) is still supported for even 2.2.
>
> Regarding Python 2 -> 3 Migration: look at e.g. Cython - it produces C code with a few #ifdef's and macros and which compiles both in Py2 (2.3+) and Py3. It's quite simple to maintain. Also, Python code can be written so that it can be auto-transposed from 2 -> 3 for a long time: by 2to3 + a pyXpy comment transposition language like
>
> print "x"            #$3 print("x")
> #$3 only_py3_func()
> only_py2_func()      #$3
>
> It would be drastic to force the basis for numpy to Py3 so early - unless a back direction is offered. And numpy should/could in my opinion be one of the last libraries which cuts off support for old Pythons.
>
> -
>
> One of the itching problems when using numpy with smaller apps, scripts, web and with freezers is, that import is very slow, and it rams almost the whole numpy into the memory. Many fat dlls like _dotblas, etc. And there are strange (unnecessary) dependencies between the branches and files. See thread "minimal numpy?". That all threatens usability of numpy in many areas. It's designed like "loading numpy/scipy in the lab in the morning and exiting in the evening". That is too sloppy for a basic library which adds an efficient key array class to Python. numpy should be usable in a much lighter way.
>
> If "import numpy" shall by default still import most of numpy - ok, but there should perhaps be at least an easy mechanism to stop this "all-in-one" behavior and dependencies with an easy switch. Or with "import numpy.base". Or in my opinion the reverse behavior would be more decent for a library: "import numpy" imports only the minimum (like in that other thread) and "import numpy.anth[ology] as np" or so may draw the whole mess as before. inter-DLL-imports like multiarray.pyd <-> umath.pyd should use relative imports (like the py modules next to them).
> Such cleanup of the code organization and improving usability etc I think is more important than playing a role as "leading py3 enforcer"
>
> Robert

From charlesr.harris at gmail.com  Sun Jun 21 19:10:08 2009
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Sun, 21 Jun 2009 17:10:08 -0600
Subject: [Numpy-discussion] Plans for Numpy 1.4.0 and scipy 0.8.0
In-Reply-To: <406284B7-C717-47E8-8CA1-85B04AFEDBB9@enthought.com>
References: <782677.66861.qm@web34406.mail.mud.yahoo.com> <406284B7-C717-47E8-8CA1-85B04AFEDBB9@enthought.com>
Message-ID: 

On Sun, Jun 21, 2009 at 4:04 PM, Travis Oliphant wrote:
> I don't remember dropping support for 2.4. When did that happen?

It didn't, numpy should still run with 2.4. If there is a problem I haven't heard about it.

Chuck

From bsouthey at gmail.com  Sun Jun 21 21:15:47 2009
From: bsouthey at gmail.com (Bruce Southey)
Date: Sun, 21 Jun 2009 20:15:47 -0500
Subject: [Numpy-discussion] Plans for Numpy 1.4.0 and scipy 0.8.0
In-Reply-To: <4A3DF6EB.4050606@ar.media.kyoto-u.ac.jp>
References: <4A3DF6EB.4050606@ar.media.kyoto-u.ac.jp>
Message-ID: 

On Sun, Jun 21, 2009 at 4:01 AM, David Cournapeau <david at ar.media.kyoto-u.ac.jp> wrote:

> (Continuing the discussion initiated in the neighborhood iterator thread)
>
> Hi,
>
> I would like to gather people's opinion on what to target for numpy 1.4.0.
> - Chuck suggested to drop python < 2.6 support from now on. I am against it without a very strong and detailed rationale, because many OS still don't have python 2.6 (RHEL, Ubuntu LTS).
> - Even if not many new features have been implemented since 1.3.0, there were several changes which would be quite useful (using npy_math from numpy for scipy.special, neighborhood iterator for scipy.signal). So releasing 1.4.0 soon would be useful so that scipy 0.8.0 could depend on it.
> - Fixing crashes on windows 64 bits: I have not made any progress on this. I am out of ideas on how to debug the problem, to be honest.

I think this is an essential requirement for numpy. I am prepared to try to help as I now have a 64-bit windows system I can use. Just that I do not like the windows environment for programming one bit. So I would appreciate any pointers to get the necessary functional system for 64-bit windows.

Probably the other item I would suggest is resolving the Matrix issues and having the best solution implemented.

Bruce
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From cournape at gmail.com  Sun Jun 21 22:09:00 2009
From: cournape at gmail.com (David Cournapeau)
Date: Mon, 22 Jun 2009 11:09:00 +0900
Subject: [Numpy-discussion] Plans for Numpy 1.4.0 and scipy 0.8.0
In-Reply-To: <406284B7-C717-47E8-8CA1-85B04AFEDBB9@enthought.com>
References: <782677.66861.qm@web34406.mail.mud.yahoo.com> <406284B7-C717-47E8-8CA1-85B04AFEDBB9@enthought.com>
Message-ID: <5b8d13220906211909m2aebb5eaxe5a809367ecfb8c4@mail.gmail.com>

On Mon, Jun 22, 2009 at 7:04 AM, Travis Oliphant wrote:
> I don't remember dropping support for 2.4. When did that happen?

It does work on 2.4, I regularly test numpy and scipy on RHEL 5 with 64 bits python 2.4. We dropped 2.3 support since 1.2 IIRC.
David

From luigi_curzi at yahoo.it  Sun Jun 21 22:24:43 2009
From: luigi_curzi at yahoo.it (luigi curzi)
Date: Mon, 22 Jun 2009 04:24:43 +0200
Subject: [Numpy-discussion] Memmap + resize
Message-ID: <20090622042443.44fbb6be@pozzaibe>

hello,
is it possible to resize in place a memmap array?

thanks in advance

Luigi

-- 
~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*
I am the mistake, the moment of confusion, the inopportune one.
I am nothing. I shall never be anything. I cannot want to be anything.
Apart from that, I have in me all the dreams of the world.
F. Pessoa
~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*
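[No in-place resize is available for the mapped buffer itself; the usual workaround is to grow the backing file on disk and create a new, larger memmap over it. A minimal sketch, with illustrative file name and sizes:]

import numpy as np

fname = 'data.bin'  # hypothetical backing file
mm = np.memmap(fname, dtype='float64', mode='w+', shape=(100,))
mm[:] = 1.0
mm.flush()
del mm  # release the old mapping before touching the file

# Extend the file on disk, then map it again with the larger shape.
new_len = 200
f = open(fname, 'r+b')
f.seek(new_len * np.dtype('float64').itemsize - 1)
f.write(b'\x00')
f.close()
mm = np.memmap(fname, dtype='float64', mode='r+', shape=(new_len,))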
David

From robert.kern at gmail.com Mon Jun 22 00:00:10 2009 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 21 Jun 2009 23:00:10 -0500 Subject: [Numpy-discussion] Blurring an Array In-Reply-To: References: <3d375d730906210105m5d01a831v54bc58b0bbd237ec@mail.gmail.com> <3d375d730906210134l500a7630ka77d00082ffcad40@mail.gmail.com> Message-ID: <3d375d730906212100h5715852fgd264b845d4145e2e@mail.gmail.com>

On Sun, Jun 21, 2009 at 05:48, Ian Mallett wrote: > > This works perfectly! Is there likewise a similar call for Numeric? If Numeric.roll() exists, then yes. Otherwise, you may have to look at the numpy.roll() sources to replicate what it does. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

From dwf at cs.toronto.edu Mon Jun 22 02:09:52 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Mon, 22 Jun 2009 02:09:52 -0400 Subject: [Numpy-discussion] Plans for Numpy 1.4.0 and scipy 0.8.0 In-Reply-To: <5b8d13220906210859u64c36af5tdb154593db61ab07@mail.gmail.com> References: <4A3DF6EB.4050606@ar.media.kyoto-u.ac.jp> <5b8d13220906210859u64c36af5tdb154593db61ab07@mail.gmail.com> Message-ID:

On 21-Jun-09, at 11:59 AM, David Cournapeau wrote: >> Can't really say at this point, but it is the suggested path to >> python-3. > > OTOH, I don't find the python 3 "official" transition story very > convincing. I have tried to gather all the information I could find, > both on the python wiki and from transition stories. To support both > python 2 and 3, the suggestion is to use the 2to3 script, but it is > painfully slow for big packages like numpy. And there are very few > stories for porting python 3 C extensions. It's the suggested path for python packages in general, but I wonder how readily this advice applies to packages that are so heavily C-dependent. There was talk of using Cython to ease the transition/maintenance of 2.x and 3.x branches, since it abstracts away the choice of old-style buffer interface vs. PEP3118-style. Perhaps Dag Sverre has something to say on this topic? > Also, there does not seem to be any advantages for python 3 for > scientific people ? There's some new stuff with regard to native arbitrary-precision arithmetic, which might affect some people, but I agree on the whole. David

From pgmdevlist at gmail.com Mon Jun 22 02:18:52 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Mon, 22 Jun 2009 02:18:52 -0400 Subject: [Numpy-discussion] Plans for Numpy 1.4.0 and scipy 0.8.0 In-Reply-To: <4A3DF6EB.4050606@ar.media.kyoto-u.ac.jp> References: <4A3DF6EB.4050606@ar.media.kyoto-u.ac.jp> Message-ID: <9B3C6D25-D9C0-449F-92FE-CE9006D7D063@gmail.com>

On Jun 21, 2009, at 5:01 AM, David Cournapeau wrote: > (Continuing the discussion initiated in the neighborhood iterator > thread) > > Hi, > > I would like to gather people's opinion on what to target for numpy > 1.4.0. Is this a wish list ? * As Darren mentioned, some __array_prepare__ method called when a ufunc is called on a subclass of ndarray, before any computation takes place. Think of it as a parallel to __array_wrap__ * A .metadata/.addinfo object storing additional information along a ndarray. Travis O. mentioned that a little while back. Could probably wait a bit, till 1.5 ?
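For readers following the __array_prepare__ discussion: the existing __array_wrap__ hook is called on the result after the ufunc has run, optionally with a context tuple describing the call (the ufunc, its arguments, and an index); the proposal is a hook with the same shape that fires before any computation. A minimal sketch of the existing mechanism, with a made-up Logged subclass for illustration:

    import numpy as np

    class Logged(np.ndarray):
        def __array_wrap__(self, out_arr, context=None):
            # called after the ufunc loop has already produced out_arr;
            # the proposed __array_prepare__ would see the same context
            # before any computation happens
            if context is not None:
                ufunc, args, _ = context
                print "wrapping result of", ufunc.__name__
            return np.ndarray.__array_wrap__(self, out_arr, context)

    x = np.arange(3.0).view(Logged)
    y = np.add(x, 1)   # prints: wrapping result of add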
From cournape at gmail.com Mon Jun 22 02:28:34 2009 From: cournape at gmail.com (David Cournapeau) Date: Mon, 22 Jun 2009 15:28:34 +0900 Subject: [Numpy-discussion] Plans for Numpy 1.4.0 and scipy 0.8.0 In-Reply-To: <9B3C6D25-D9C0-449F-92FE-CE9006D7D063@gmail.com> References: <4A3DF6EB.4050606@ar.media.kyoto-u.ac.jp> <9B3C6D25-D9C0-449F-92FE-CE9006D7D063@gmail.com> Message-ID: <5b8d13220906212328h613c3593vd83549896edbb1f3@mail.gmail.com>

On Mon, Jun 22, 2009 at 3:18 PM, Pierre GM wrote: > > On Jun 21, 2009, at 5:01 AM, David Cournapeau wrote: > >> (Continuing the discussion initiated in the neighborhood iterator >> thread) >> >> Hi, >> >> I would like to gather people's opinion on what to target for numpy >> 1.4.0. > > Is this a wish list ? Yes - as long as the wishes are backed up with a timeline and volunteers for the implementation :) > > * As Darren mentioned, some __array_prepare__ method called when a > ufunc is called on a subclass of ndarray, before any computation takes > place. Think of it as a parallel to __array_wrap__ Has there been any work on this ? I cannot comment on the feature itself, I am not knowledgeable enough on that part of the code. > > * A .metadata/.addinfo object storing additional information along a > ndarray. Travis O. mentioned that a little while back. Could probably > wait a bit, till 1.5 ? Same comment as for __array_prepare__. My main motivation for an early numpy 1.4 is scipy 0.8.0. But if other people think numpy 1.4.0 as it is now is too 'weak', making scipy 0.8.0 compatible with numpy 1.3.0 is technically possible. Then we can talk about adding those features for numpy 1.4.0, and release it later. cheers, David

From cournape at gmail.com Mon Jun 22 02:39:22 2009 From: cournape at gmail.com (David Cournapeau) Date: Mon, 22 Jun 2009 15:39:22 +0900 Subject: [Numpy-discussion] Plans for Numpy 1.4.0 and scipy 0.8.0 In-Reply-To: References: <4A3DF6EB.4050606@ar.media.kyoto-u.ac.jp> Message-ID: <5b8d13220906212339t17750cefj98c9589e2bccaab1@mail.gmail.com>

On Mon, Jun 22, 2009 at 10:15 AM, Bruce Southey wrote: > > > On Sun, Jun 21, 2009 at 4:01 AM, David Cournapeau > wrote: >> >> (Continuing the discussion initiated in the neighborhood iterator thread) >> >> Hi, >> >> I would like to gather people's opinion on what to target for numpy >> 1.4.0. >> - Chuck suggested to drop python < 2.6 support from now on. I am >> against it without a very strong and detailed rationale, because many OS >> still don't have python 2.6 (RHEL, Ubuntu LTS). >> - Even if not many new features have been implemented since 1.3.0, >> there were several changes which would be quite useful (using npy_math >> from numpy for scipy.special, neighborhood iterator for scipy.signal). >> So releasing 1.4.0 soon would be useful so that scipy 0.8.0 could depend >> on it. >> - Fixing crashes on windows 64 bits: I have not made any progress on >> this. I am out of ideas on how to debug the problem, to be honest. >> > > I think this is an essential requirement for numpy. Yes. Windows 64 bits is finally usable (enough drivers) for the end users. And the (totally unusable) numpy 1.3 64-bit binaries have been downloaded quite a bit, confirming the interest. > I am prepared to try to help as I now have a 64-bit windows system I can use. Just that I > do not like the windows environment for programming one bit. Me neither. It is really awful unless you are in the IDE. Ideally, a primarily windows developer would be in charge.
> So I would > appreciate any pointers to get the necessary functional system for 64-bit > windows. To summarize the issue, we have two alternatives: - compiling with VS 2008 works more or less. There are a few bugs, but those are solvable through the debugger I guess with enough motivation. - compiling with mingw-w64 crashes randomly. Sometimes it crashes at import, sometimes during the unit tests, sometimes even before calling init_multiarray (the first numpy C extension). It also depends on how you launch python (IDLE vs. cmd.exe). The big problem with VS 2008 is that there is no free fortran compiler compatible with VS 2008. I cannot even compile a trivial fortran + C project with the VS 2008-gfortran combination. A numpy-only build could still be useful (for matplotlib, for example), but I am reluctant to do that if we later go the mingw route, since both builds would more than likely be ABI incompatible. With mingw compilers, we get gfortran "for free" (I can compile scipy, for example), but there is this very serious crash which happens randomly. The debugger does not work (maybe some stack corruption), and there is no valgrind on windows, so it is very difficult to track down. Of course, mingw debugging symbols are not compatible with MS compilers, so the MS debugger is no option either. David

From stefan at sun.ac.za Mon Jun 22 02:56:30 2009 From: stefan at sun.ac.za (Stéfan van der Walt) Date: Mon, 22 Jun 2009 08:56:30 +0200 Subject: [Numpy-discussion] Plans for Numpy 1.4.0 and scipy 0.8.0 In-Reply-To: <5b8d13220906212328h613c3593vd83549896edbb1f3@mail.gmail.com> References: <4A3DF6EB.4050606@ar.media.kyoto-u.ac.jp> <9B3C6D25-D9C0-449F-92FE-CE9006D7D063@gmail.com> <5b8d13220906212328h613c3593vd83549896edbb1f3@mail.gmail.com> Message-ID: <9457e7c80906212356p4beac9f6uad092aff05ba85cd@mail.gmail.com>

2009/6/22 David Cournapeau : > My main motivation for an early numpy 1.4 is scipy 0.8.0. But if other > people think numpy 1.4.0 as it is now is too 'weak', making scipy > 0.8.0 compatible with numpy 1.3.0 is technically possible. Then we can > talk about adding those features for numpy 1.4.0, and release it > later. I think there are enough changes for a 1.4 release. It makes little sense to put our effort into making SciPy compatible with an old version of NumPy, rather than moving both packages forward. Regards Stéfan

From charlesr.harris at gmail.com Mon Jun 22 03:04:13 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 22 Jun 2009 01:04:13 -0600 Subject: [Numpy-discussion] Plans for Numpy 1.4.0 and scipy 0.8.0 In-Reply-To: <9B3C6D25-D9C0-449F-92FE-CE9006D7D063@gmail.com> References: <4A3DF6EB.4050606@ar.media.kyoto-u.ac.jp> <9B3C6D25-D9C0-449F-92FE-CE9006D7D063@gmail.com> Message-ID:

On Mon, Jun 22, 2009 at 12:18 AM, Pierre GM wrote: > > On Jun 21, 2009, at 5:01 AM, David Cournapeau wrote: > >> (Continuing the discussion initiated in the neighborhood iterator >> thread) >> >> Hi, >> >> I would like to gather people's opinion on what to target for numpy >> 1.4.0. > > Is this a wish list ? > > * As Darren mentioned, some __array_prepare__ method called when a > ufunc is called on a subclass of ndarray, before any computation takes > place. Think of it as a parallel to __array_wrap__ > This is an interesting idea but it needs some fleshing out. Working out a few example applications would firm things up I think. Things like what parameters would be passed need to be specified.
For instance, if units are involved there would be a difference between binary and unary functions. And what of things like log and exp which should have unitless arguments? There looks to be a lot of preliminary work there. > * A .metadata/.addinfo object storing additional information along a > ndarray. Travis O. mentioned that a little while back. Could probably > wait a bit, till 1.5 ? That might be the easiest thing to add. Travis should have something to say about that. Chuck

From charlesr.harris at gmail.com Mon Jun 22 03:06:07 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 22 Jun 2009 01:06:07 -0600 Subject: [Numpy-discussion] Plans for Numpy 1.4.0 and scipy 0.8.0 In-Reply-To: <9457e7c80906212356p4beac9f6uad092aff05ba85cd@mail.gmail.com> References: <4A3DF6EB.4050606@ar.media.kyoto-u.ac.jp> <9B3C6D25-D9C0-449F-92FE-CE9006D7D063@gmail.com> <5b8d13220906212328h613c3593vd83549896edbb1f3@mail.gmail.com> <9457e7c80906212356p4beac9f6uad092aff05ba85cd@mail.gmail.com> Message-ID:

2009/6/22 Stéfan van der Walt : > 2009/6/22 David Cournapeau : >> My main motivation for an early numpy 1.4 is scipy 0.8.0. But if other >> people think numpy 1.4.0 as it is now is too 'weak', making scipy >> 0.8.0 compatible with numpy 1.3.0 is technically possible. Then we can >> talk about adding those features for numpy 1.4.0, and release it >> later. > > I think there are enough changes for a 1.4 release. It makes little > sense to put our effort into making SciPy compatible with an old > version of NumPy, rather than moving both packages forward. > It's looking like fixing bugs and trying to get a working windows 64-bit release might be what we should shoot for. Chuck

From luigi_curzi at yahoo.it Mon Jun 22 04:24:08 2009 From: luigi_curzi at yahoo.it (luigi curzi) Date: Mon, 22 Jun 2009 10:24:08 +0200 Subject: [Numpy-discussion] Memmap + resize In-Reply-To: <20090622042443.44fbb6be@pozzaibe> References: <20090622042443.44fbb6be@pozzaibe> Message-ID: <20090622102408.256ca962@pozzaibe>

On Mon, 22 Jun 2009 04:24:43 +0200, luigi curzi wrote: > hello, > is it possible to resize in place a memmap array? > Maybe it wasn't clear; I mean: I have a numpy array memory-mapped with memmap. When I try to do array.resize(..., refcheck=0) I receive this error: ValueError: cannot resize this array: it does not own its data In fact, array.flags says OWNDATA = False. Hence my question. > thanks in advance > Luigi -- ~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~* I am the mistake, the moment of confusion, the inopportune one. I am nothing. I will never be anything. I cannot want to be anything. Apart from that, I have in me all the dreams of the world. F. Pessoa ~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*

From neilcrighton at gmail.com Mon Jun 22 06:52:17 2009 From: neilcrighton at gmail.com (Neil Crighton) Date: Mon, 22 Jun 2009 10:52:17 +0000 (UTC) Subject: [Numpy-discussion] Patch for review (improving arraysetops) References: <63751c30906170311obc07693u276ba597be31892e@mail.gmail.com> <4A38EA5F.8080507@ntc.zcu.cz> Message-ID:

Robert Cimrman ntc.zcu.cz> writes: > Hi Neil, > > This sounds good. If you don't have time to do it, I don't mind having > > a go at writing > > a patch to implement these changes (deprecate the existing unique1d, rename > > unique1d to unique and add the set approach from the old unique, and the other > > changes mentioned in http://projects.scipy.org/numpy/ticket/1133).
> > That would be really great - I will not be online starting tomorrow till > the end of next week (more or less), so I can really look at the issue > after I return. > Here's a patch that implements most of the changes suggested in the ticket, and merges unique and unique1d functionality into a single function unique in arraysetops: http://projects.scipy.org/numpy/ticket/1133 Please review it. Thanks, Neil

From dsdale24 at gmail.com Mon Jun 22 07:27:48 2009 From: dsdale24 at gmail.com (Darren Dale) Date: Mon, 22 Jun 2009 07:27:48 -0400 Subject: [Numpy-discussion] Plans for Numpy 1.4.0 and scipy 0.8.0 In-Reply-To: References: <4A3DF6EB.4050606@ar.media.kyoto-u.ac.jp> <9B3C6D25-D9C0-449F-92FE-CE9006D7D063@gmail.com> Message-ID:

On Mon, Jun 22, 2009 at 3:04 AM, Charles R Harris wrote: > On Mon, Jun 22, 2009 at 12:18 AM, Pierre GM wrote: > > > > On Jun 21, 2009, at 5:01 AM, David Cournapeau wrote: > > > >> (Continuing the discussion initiated in the neighborhood iterator > >> thread) > >> > >> Hi, > >> > >> I would like to gather people's opinion on what to target for numpy > >> 1.4.0. > > > > Is this a wish list ? > > > > * As Darren mentioned, some __array_prepare__ method called when a > > ufunc is called on a subclass of ndarray, before any computation takes > > place. Think of it as a parallel to __array_wrap__ > > > > This is an interesting idea but it needs some fleshing out. Working out > a few example applications would firm things up I think. Things like > what parameters would be passed need to be specified. I posted an explanation of what parameters would be passed and a simple example of how the implementation might look and behave. I gave two application examples, MaskedArray and Quantities. > For instance, if > units are involved there would be a difference between binary and > unary functions. And what of things like log and exp which should have > unitless arguments? There looks to be a lot of preliminary work there. > These considerations are already addressed and many of the ufuncs (arithmetic, inverse, etc) are already implemented in Quantities-0.5b2 (I'll post a 0.5b3 that implements log and exp this morning at PyPI). The identity of the ufunc itself is used as the context for operating on the units or raising an error if the units are incompatible with the operation.

From dsandie at gmail.com Mon Jun 22 09:29:58 2009 From: dsandie at gmail.com (Sandeep Devadas) Date: Mon, 22 Jun 2009 18:59:58 +0530 Subject: [Numpy-discussion] MKL Path error on Cygwin after installing on windows. Message-ID:

Hello There, My name is Sandeep Devadas and I'm trying to install numpy for python on Cygwin's latest version (1.5.xx) on Windows XP. I'm getting an error message when I follow the instructions given at http://www.scipy.org/Installing_SciPy/Windows (I have searched for the files; except for mkl_ia32, the rest (mkl_c_dll & libguide40) are available at C:\Program Files\Intel\MKL\10.0.5.025\ia32\lib.) When I enter the following at the BASH shell after installing MKL 10.0.5.025 (30-day evaluation), python setup.py config, I get $ python setup.py config Running from numpy source directory.
non-existing path in 'numpy/distutils': 'site.cfg'
F2PY Version 2
blas_opt_info:
blas_mkl_info:
  libraries mkl_ia32,mkl_c_dll,libguide40 not found in \Program Files\Intel\MKL\10.0.5.025\ia32\lib
  NOT AVAILABLE
atlas_blas_threads_info:
Setting PTATLAS=ATLAS
  libraries ptf77blas,ptcblas,atlas not found in /usr/local/lib
  libraries ptf77blas,ptcblas,atlas not found in /usr/lib
  NOT AVAILABLE
atlas_blas_info:
  libraries f77blas,cblas,atlas not found in /usr/local/lib
  libraries f77blas,cblas,atlas not found in /usr/lib
  NOT AVAILABLE
/bin/numpy-1.3.0/numpy/distutils/system_info.py:1383: UserWarning: Atlas (http://math-atlas.sourceforge.net/) libraries not found. Directories to search for the libraries can be specified in the numpy/distutils/site.cfg file (section [atlas]) or by setting the ATLAS environment variable. warnings.warn(AtlasNotFoundError.__doc__)
blas_info:
  libraries blas not found in /usr/local/lib
  FOUND: libraries = ['blas'] library_dirs = ['/usr/lib'] language = f77
  FOUND: libraries = ['blas'] library_dirs = ['/usr/lib'] define_macros = [('NO_ATLAS_INFO', 1)] language = f77
lapack_opt_info:
lapack_mkl_info:
mkl_info:
  libraries mkl_ia32,mkl_c_dll,libguide40 not found in \Program Files\Intel\MKL\10.0.5.025\ia32\lib
  NOT AVAILABLE
  NOT AVAILABLE
atlas_threads_info:
Setting PTATLAS=ATLAS
  libraries ptf77blas,ptcblas,atlas not found in /usr/local/lib
  libraries lapack_atlas not found in /usr/local/lib
  libraries ptf77blas,ptcblas,atlas not found in /usr/lib
  libraries lapack_atlas not found in /usr/lib
numpy.distutils.system_info.atlas_threads_info
  NOT AVAILABLE
atlas_info:
  libraries f77blas,cblas,atlas not found in /usr/local/lib
  libraries lapack_atlas not found in /usr/local/lib
  libraries f77blas,cblas,atlas not found in /usr/lib
  libraries lapack_atlas not found in /usr/lib
numpy.distutils.system_info.atlas_info
  NOT AVAILABLE
/bin/numpy-1.3.0/numpy/distutils/system_info.py:1290: UserWarning: Atlas (http://math-atlas.sourceforge.net/) libraries not found. Directories to search for the libraries can be specified in the numpy/distutils/site.cfg file (section [atlas]) or by setting the ATLAS environment variable. warnings.warn(AtlasNotFoundError.__doc__)
lapack_info:
  libraries lapack not found in /usr/local/lib
  FOUND: libraries = ['lapack'] library_dirs = ['/usr/lib'] language = f77
  FOUND: libraries = ['lapack', 'blas'] library_dirs = ['/usr/lib'] define_macros = [('NO_ATLAS_INFO', 1)] language = f77
running config

Please let me know what to do. Thanks and Regards, Sandeep.

From david at ar.media.kyoto-u.ac.jp Mon Jun 22 09:33:32 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Mon, 22 Jun 2009 22:33:32 +0900 Subject: [Numpy-discussion] MKL Path error on Cygwin after installing on windows. In-Reply-To: References: Message-ID: <4A3F882C.3050904@ar.media.kyoto-u.ac.jp>

Hi Sandeep, Sandeep Devadas wrote: > Hello There, > My name is Sandeep Devadas and I'm trying to install > numpy for python on Cygwin's latest version (1.5.xx) on Windows XP. I'm > getting an error message when I follow the instructions given at > http://www.scipy.org/Installing_SciPy/Windows You can't build a native numpy for windows on cygwin. You need to build from cmd.exe and a native python (from python.org). Note also that building numpy with the MKL is relatively complex - we may not be able to help you to do it completely.
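(As background for anyone attempting this from a native python: numpy's system_info reads the MKL configuration from a [mkl] section in site.cfg, along these lines; this is only a sketch, the key names follow numpy's site.cfg.example and the include path here is a guess based on the default MKL layout:

    [mkl]
    library_dirs = C:\Program Files\Intel\MKL\10.0.5.025\ia32\lib
    include_dirs = C:\Program Files\Intel\MKL\10.0.5.025\include
    mkl_libs = mkl_ia32, mkl_c_dll, libguide40

Note also that the log above shows the drive letter being dropped from the search path, "\Program Files\..." rather than "C:\Program Files\...", which is the sort of path mangling one can expect when mixing cygwin and native windows paths.)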
> > (I have searched for the files; except for mkl_ia32, the > > rest (mkl_c_dll & libguide40) are available at C:\Program > > Files\Intel\MKL\10.0.5.025\ia32\lib.) Most likely, on cygwin, numpy only looks for libraries usable from cygwin (libfoo.a, libfoo.dll). Try again under a native windows shell (cmd.exe). cheers, David

From dsdale24 at gmail.com Mon Jun 22 10:30:27 2009 From: dsdale24 at gmail.com (Darren Dale) Date: Mon, 22 Jun 2009 10:30:27 -0400 Subject: [Numpy-discussion] Plans for Numpy 1.4.0 and scipy 0.8.0 In-Reply-To: References: <4A3DF6EB.4050606@ar.media.kyoto-u.ac.jp> <9B3C6D25-D9C0-449F-92FE-CE9006D7D063@gmail.com> Message-ID:

On Mon, Jun 22, 2009 at 7:27 AM, Darren Dale wrote: > > On Mon, Jun 22, 2009 at 3:04 AM, Charles R Harris <charlesr.harris at gmail.com> wrote: >> On Mon, Jun 22, 2009 at 12:18 AM, Pierre GM wrote: >> > >> > On Jun 21, 2009, at 5:01 AM, David Cournapeau wrote: >> > >> >> (Continuing the discussion initiated in the neighborhood iterator >> >> thread) >> >> >> >> Hi, >> >> >> >> I would like to gather people's opinion on what to target for numpy >> >> 1.4.0. >> > >> > Is this a wish list ? >> > >> > * As Darren mentioned, some __array_prepare__ method called when a >> > ufunc is called on a subclass of ndarray, before any computation takes >> > place. Think of it as a parallel to __array_wrap__ >> > >> >> This is an interesting idea but it needs some fleshing out. Working out >> a few example applications would firm things up I think. Things like >> what parameters would be passed need to be specified. > > I posted an explanation of what parameters would be passed and a simple > example of how the implementation might look and behave. I gave two > application examples, MaskedArray and Quantities. > >> For instance, if >> units are involved there would be a difference between binary and >> unary functions. And what of things like log and exp which should have >> unitless arguments? There looks to be a lot of preliminary work there. >> > These considerations are already addressed and many of the ufuncs > (arithmetic, inverse, etc) are already implemented in Quantities-0.5b2 (I'll > post a 0.5b3 that implements log and exp this morning at PyPI). The identity > of the ufunc itself is used as the context for operating on the units or > raising an error if the units are incompatible with the operation.

I just posted sources and installers for quantities-0.5b5 at http://pypi.python.org/pypi. Part of the code that allows quantities to work with standard numpy ufuncs is in quantity.py (Quantity.__array_wrap__) and the rest is at the end of dimensionality.py. So for example, you can do:

    import numpy as np
    import quantities as pq

    np.log(pq.m)       # raises an error
    np.log(pq.m/pq.m)  # yields: array(0.0) * dimensionless

Unrecognized ufuncs return a dimensionless quantity by default, but quantities could raise an error instead:

    >>> np.logical_and(pq.m, pq.m)
    ufunc not implemented, please file a bug report
    array(True, dtype=bool) * dimensionless

Unit handling is all implemented in __array_wrap__, but this occurs too late in the ufunc to catch errors before the ufunc can modify data in-place. For example:

    q1 = 1*pq.m
    q2 = 1*pq.s
    np.add(q1, q2)      # raises a ValueError
    np.add(q1, q2, q1)  # also raises a ValueError, but too late:
    print q1            # yields array(2.0) * m

Darren
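The in-place case is exactly what the proposed __array_prepare__ hook would address: called with the same arguments as __array_wrap__ but before the ufunc loop runs, it could reject incompatible units while the operands are still untouched. A hypothetical sketch only - Quantity and the _check_units helper here stand in for the real quantities internals:

    class Quantity(np.ndarray):
        def __array_prepare__(self, out_arr, context=None):
            # hypothetical pre-computation hook: raising here would
            # abort the ufunc before q1's data can be overwritten
            if context is not None:
                ufunc, args, _ = context
                self._check_units(ufunc, args)  # raises on unit mismatch
            return out_arr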
From william.ratcliff at gmail.com Mon Jun 22 10:43:03 2009 From: william.ratcliff at gmail.com (william ratcliff) Date: Mon, 22 Jun 2009 10:43:03 -0400 Subject: [Numpy-discussion] Plans for Numpy 1.4.0 and scipy 0.8.0 In-Reply-To: <5b8d13220906212339t17750cefj98c9589e2bccaab1@mail.gmail.com> References: <4A3DF6EB.4050606@ar.media.kyoto-u.ac.jp> <5b8d13220906212339t17750cefj98c9589e2bccaab1@mail.gmail.com> Message-ID: <827183970906220743o4cc1b2ech353b63dd2438984e@mail.gmail.com>

I haven't used it, but you might try Rational Purify from IBM as a valgrind alternative on Windows. They used to have a free trial. Btw, are there any docs that I can read on the issues involved in fortran-c-python bindings? Anything related to gfortran (fortran 95/2003) and python in particular would be much appreciated! Cheers, William

On Mon, Jun 22, 2009 at 2:39 AM, David Cournapeau wrote: > On Mon, Jun 22, 2009 at 10:15 AM, Bruce Southey wrote: > > > > > > On Sun, Jun 21, 2009 at 4:01 AM, David Cournapeau > > wrote: > >> > >> (Continuing the discussion initiated in the neighborhood iterator > thread) > >> > >> Hi, > >> > >> I would like to gather people's opinion on what to target for numpy > >> 1.4.0. > >> - Chuck suggested to drop python < 2.6 support from now on. I am > >> against it without a very strong and detailed rationale, because many OS > >> still don't have python 2.6 (RHEL, Ubuntu LTS). > >> - Even if not many new features have been implemented since 1.3.0, > >> there were several changes which would be quite useful (using npy_math > >> from numpy for scipy.special, neighborhood iterator for scipy.signal). > >> So releasing 1.4.0 soon would be useful so that scipy 0.8.0 could depend > >> on it. > >> - Fixing crashes on windows 64 bits: I have not made any progress on > >> this. I am out of ideas on how to debug the problem, to be honest. > >> > > > > I think this is an essential requirement for numpy. > > Yes. Windows 64 bits is finally usable (enough drivers) for the end > users. And the (totally unusable) numpy 1.3 64-bit binaries have been > downloaded quite a bit, confirming the interest. > > > I am prepared to try to help as I now have a 64-bit windows system I can use. Just that > I > > do not like the windows environment for programming one bit. > > Me neither. It is really awful unless you are in the IDE. Ideally, a > primarily windows developer would be in charge. > > > So I would > > appreciate any pointers to get the necessary functional system for 64-bit > > windows. > > To summarize the issue, we have two alternatives: > - compiling with VS 2008 works more or less. There are a few bugs, > but those are solvable through the debugger I guess with enough > motivation. > - compiling with mingw-w64 crashes randomly. Sometimes it crashes > at import, sometimes during the unit tests, sometimes even before > calling init_multiarray (the first numpy C extension). It also > depends on how you launch python (IDLE vs. cmd.exe). > > The big problem with VS 2008 is that there is no free fortran compiler > compatible with VS 2008. I cannot even compile a trivial fortran + C > project with the VS 2008-gfortran combination. A numpy-only build > could still be useful (for matplotlib, for example), but I am > reluctant to do that if we later go the mingw route, since both builds > would more than likely be ABI incompatible. > > With mingw compilers, we get gfortran "for free" (I can compile scipy, > for example), but there is this very serious crash which happens > randomly.
The debugger does not work (maybe some stack corruption), > and there is no valgrind on windows, so it is very difficult to track > down. Of course, mingw debugging symbols are not compatible with MS > compilers, so the MS debugger is no option either. > > David

From neilcrighton at gmail.com Mon Jun 22 10:42:59 2009 From: neilcrighton at gmail.com (Neil Crighton) Date: Mon, 22 Jun 2009 14:42:59 +0000 (UTC) Subject: [Numpy-discussion] Plans for Numpy 1.4.0 and scipy 0.8.0 References: <4A3DF6EB.4050606@ar.media.kyoto-u.ac.jp> Message-ID:

David Cournapeau ar.media.kyoto-u.ac.jp> writes: > > (Continuing the discussion initiated in the neighborhood iterator thread) > > Hi, > > I would like to gather people's opinion on what to target for numpy > 1.4.0. > Are there any other features people would like to put into numpy for 1.4.0 ? > I'd like to get the patch in ticket 1133 (http://projects.scipy.org/numpy/ticket/1133), or some version of it, into 1.4. It would also be great to get all the docstrings David Goldsmith and others are working on into the next release. Neil

From robert.kern at gmail.com Mon Jun 22 12:22:52 2009 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 22 Jun 2009 11:22:52 -0500 Subject: [Numpy-discussion] Memmap + resize In-Reply-To: <20090622042443.44fbb6be@pozzaibe> References: <20090622042443.44fbb6be@pozzaibe> Message-ID: <3d375d730906220922u5b35a0cdk6d14ca999b7bab20@mail.gmail.com>

2009/6/21 luigi curzi > > hello, > is it possible to resize in place a memmap array? No. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

From neilcrighton at gmail.com Mon Jun 22 16:12:53 2009 From: neilcrighton at gmail.com (Neil Crighton) Date: Mon, 22 Jun 2009 20:12:53 +0000 (UTC) Subject: [Numpy-discussion] Plans for Numpy 1.4.0 and scipy 0.8.0 References: <4A3DF6EB.4050606@ar.media.kyoto-u.ac.jp> <5b8d13220906210859u64c36af5tdb154593db61ab07@mail.gmail.com> Message-ID:

David Cournapeau gmail.com> writes: > >>> David Cournapeau wrote: > >>> > (Continuing the discussion initiated in the neighborhood iterator > >>> > thread) > >>> > - Chuck suggested to drop python < 2.6 support from now on. I am > >>> > against it without a very strong and detailed rationale, because many OS > >>> > still don't have python 2.6 (RHEL, Ubuntu LTS). > >>> > >>> I vote against dropping support for python 2.5. Personally I have no > >>> incentive to upgrade to 2.6 and am very happy with 2.5. > >> > >> Will requiring python-2.6 help the developers port numpy to python-3? > >> > > > > Can't really say at this point, but it is the suggested path to > > python-3. > > OTOH, I don't find the python 3 "official" transition story very > convincing. I have tried to gather all the information I could find, > both on the python wiki and from transition stories. To support both > python 2 and 3, the suggestion is to use the 2to3 script, but it is > painfully slow for big packages like numpy. And there are very few > stories for porting python 3 C extensions. > > Another suggestion is to avoid breaking the API when transitioning for > python 3. But that seems quite unrealistic.
How do we deal with the > removing of string/long APIs ? This will impact the numpy API as well, > so how do we deal with it ? > As I understand this suggestion, they just hope external packages don't say 'Hey, if we're breaking backwards compatibility anyway, let's take the chance to do a whole lot of extra API breakage!' That way, if people have problems migrating to the new version, they know they're likely to be python 3 related. Jarrod Millman's blog post about numpy and python 3 mentions this: http://jarrodmillman.blogspot.com/2009/01/when-will-numpy-and-scipy-migrate-to.html > Also, there does not seem to be any advantages for python 3 for > scientific people ? > I think there are lots of advantages in python 3 for scientific people. The new integer division alone is a huge improvement. I've been bitten by this (1/2 = 0) several times in the past, and the only reason I'm not bitten by it now is that I've trained myself to always type things like 1./x, which look ugly. The reorganisation of the standard library and the removal of duplicate ways of doing things in the core also make the language much easier to learn. This isn't a huge gain for people already familiar with Python's idiosyncrasies, but it's important for people first coming to the language. Print becoming a function would have been a pain for interactive work, but happily ipython auto-parentheses takes care of that. You could argue that moving to python 3 isn't attractive because there isn't any scientific library support, but then that's because numpy hasn't been ported to python 3 yet ;) Neil

From dsdale24 at gmail.com Mon Jun 22 16:54:57 2009 From: dsdale24 at gmail.com (Darren Dale) Date: Mon, 22 Jun 2009 16:54:57 -0400 Subject: [Numpy-discussion] Plans for Numpy 1.4.0 and scipy 0.8.0 In-Reply-To: References: <4A3DF6EB.4050606@ar.media.kyoto-u.ac.jp> <5b8d13220906210859u64c36af5tdb154593db61ab07@mail.gmail.com> Message-ID:

On Mon, Jun 22, 2009 at 4:12 PM, Neil Crighton wrote: > David Cournapeau gmail.com> writes: > > > >>> David Cournapeau wrote: > > >>> > (Continuing the discussion initiated in the neighborhood iterator > > >>> > thread) > > >>> > - Chuck suggested to drop python < 2.6 support from now on. I am > > >>> > against it without a very strong and detailed rationale, because many OS > > >>> > still don't have python 2.6 (RHEL, Ubuntu LTS). > > >>> > > >>> I vote against dropping support for python 2.5. Personally I have no > > >>> incentive to upgrade to 2.6 and am very happy with 2.5. > > >> > > >> Will requiring python-2.6 help the developers port numpy to python-3? > > >> > > > > > > Can't really say at this point, but it is the suggested path to > > > python-3. > > > > OTOH, I don't find the python 3 "official" transition story very > > convincing. I have tried to gather all the information I could find, > > both on the python wiki and from transition stories. To support both > > python 2 and 3, the suggestion is to use the 2to3 script, but it is > > painfully slow for big packages like numpy. And there are very few > > stories for porting python 3 C extensions. > > > > Another suggestion is to avoid breaking the API when transitioning for > > python 3. But that seems quite unrealistic. How do we deal with the > > removing of string/long APIs ? This will impact the numpy API as well, > > so how do we deal with it ?
> > As I understand this suggestion, they just hope external packages don't say > 'Hey, if we're breaking backwards compatibility anyway, let's take the > chance to > do a whole lot of extra API breakage!' That way, if people have problems > migrating to the new version, they know they're likely to be python 3 > related. > Jarrod Millman's blog post about numpy and python 3 mentions this: > > http://jarrodmillman.blogspot.com/2009/01/when-will-numpy-and-scipy-migrate-to.html > > > Also, there does not seem to be any advantages for python 3 for > > scientific people ? > > > > I think there are lots of advantages in python 3 for scientific people. > The > new integer division alone is a huge improvement. I've been bitten by this > (1/2 = 0) several times in the past, and the only reason I'm not bitten by > it > now is that I've trained myself to always type things like 1./x, which look > ugly. > > The reorganisation of the standard library and the removal of duplicate > ways of > doing things in the core also make the language much easier to learn. This > isn't a huge gain for people already familiar with Python's idiosyncrasies, > but > it's important for people first coming to the language. > I'd like to add to that. When I advocate for python, one of the points I make is that the community of scientific researchers who embrace python is rapidly growing. That can be a compelling argument; it makes people much more comfortable about investing a little time to see what it is all about. Someone curious about python will be drawn to learn about the latest and greatest version of the language, maybe they learn some basics, and then they discover that the scientific libraries require an older version of the language. The impression may be that the scientific packages are not well enough supported to keep up with the rest of the python community, and that it is difficult to deal with dependencies in python. Superpacks like pythonxy and EPD can help avoid such a situation, but in the long run I don't think our scientific python community will be well served if we don't try to surmount this obstacle. Darren

From seb.haase at gmail.com Mon Jun 22 16:55:31 2009 From: seb.haase at gmail.com (Sebastian Haase) Date: Mon, 22 Jun 2009 22:55:31 +0200 Subject: [Numpy-discussion] Plans for Numpy 1.4.0 and scipy 0.8.0 In-Reply-To: References: <4A3DF6EB.4050606@ar.media.kyoto-u.ac.jp> <5b8d13220906210859u64c36af5tdb154593db61ab07@mail.gmail.com> Message-ID:

On Mon, Jun 22, 2009 at 10:12 PM, Neil Crighton wrote: > > David Cournapeau gmail.com> writes: > > > >>> David Cournapeau wrote: > > >>> > (Continuing the discussion initiated in the neighborhood iterator > > >>> > thread) > > >>> > - Chuck suggested to drop python < 2.6 support from now on. I am > > >>> > against it without a very strong and detailed rationale, because many OS > > >>> > still don't have python 2.6 (RHEL, Ubuntu LTS). > > >>> > > >>> I vote against dropping support for python 2.5. Personally I have no > > >>> incentive to upgrade to 2.6 and am very happy with 2.5. > > >> > > >> Will requiring python-2.6 help the developers port numpy to python-3? > > >> > > > > > > Can't really say at this point, but it is the suggested path to > > > python-3. > > > > OTOH, I don't find the python 3 "official" transition story very > > convincing. I have tried to gather all the information I could find, > > both on the python wiki and from transition stories.
To support both > > python 2 and 3, the suggestion is to use the 2to3 script, but it is > > painfully slow for big packages like numpy. And there are very few > > stories for porting python 3 C extensions. > > > > Another suggestion is to avoid breaking the API when transitioning for > > python 3. But that seems quite unrealistic. How do we deal with the > > removing of string/long APIs ? This will impact the numpy API as well, > > so how do we deal with it ? > > > > As I understand this suggestion, they just hope external packages don't say > 'Hey, if we're breaking backwards compatibility anyway, let's take the chance to > do a whole lot of extra API breakage!' That way, if people have problems > migrating to the new version, they know they're likely to be python 3 related. > Jarrod Millman's blog post about numpy and python 3 mentions this: > > http://jarrodmillman.blogspot.com/2009/01/when-will-numpy-and-scipy-migrate-to.html > > > Also, there does not seem to be any advantages for python 3 for > > scientific people ? > > > > I think there are lots of advantages in python 3 for scientific people. The > new integer division alone is a huge improvement. I've been bitten by this > (1/2 = 0) several times in the past, and the only reason I'm not bitten by it > now is that I've trained myself to always type things like 1./x, which look > ugly. > > The reorganisation of the standard library and the removal of duplicate ways of > doing things in the core also make the language much easier to learn. This > isn't a huge gain for people already familiar with Python's idiosyncrasies, but > it's important for people first coming to the language. > > Print becoming a function would have been a pain for interactive work, but > happily ipython auto-parentheses takes care of that. > > You could argue that moving to python 3 isn't attractive because there isn't > any scientific library support, but then that's because numpy hasn't been > ported to python 3 yet ;) > Neil, I agree that "new integer division alone is a huge improvement". I have been using python 2 with -Qnew for a long time now, and I like the fact that 1/2 = .5 ;-) (in fact I set it up to get the -Qnew by default, so I don't have to actually type it...) Everything seems to work well! (numpy, scipy, wxPython, sympy, matplotlib, PIL (minor patching), at least) I thought this one was making things much easier to explain to newcomers and made things "future ready" -- even if we will have to wait for the rest of the future.... --Sebastian Haase

From h5py at alfven.org Mon Jun 22 17:48:50 2009 From: h5py at alfven.org (Andrew Collette) Date: Mon, 22 Jun 2009 14:48:50 -0700 Subject: [Numpy-discussion] ANN: HDF5 for Python (h5py) 1.2 Message-ID:

Announcing HDF5 for Python (h5py) 1.2
=====================================

I'm pleased to announce the availability of HDF5 for Python 1.2 final! This release represents a significant update to the h5py feature set. Some of the new features are:

- Support for variable-length strings!
- Use of built-in Python exceptions (KeyError, etc), alongside H5Error
- Top-level support for HDF5 CORE, SEC2, STDIO, WINDOWS and FAMILY drivers
- Support for ENUM and ARRAY types
- Support for Unicode file names
- Big speedup (~3x) when using single-index slicing on a chunked dataset

Main site: http://h5py.alfven.org
Google code: http://h5py.googlecode.com

What is h5py?
-------------

HDF5 for Python (h5py) is a general-purpose Python interface to the Hierarchical Data Format library, version 5.
HDF5 is a versatile, mature scientific software library designed for the fast, flexible storage of enormous amounts of data.

From a Python programmer's perspective, HDF5 provides a robust way to store data, organized by name in a tree-like fashion. You can create datasets (arrays on disk) hundreds of gigabytes in size, and perform random-access I/O on desired sections. Datasets are organized in a filesystem-like hierarchy using containers called "groups", and accessed using the traditional POSIX /path/to/resource syntax.

In addition to providing interoperability with existing HDF5 datasets and platforms, h5py is a convenient way to store and retrieve arbitrary NumPy data and metadata.

Full list of new features in 1.2
--------------------------------

- Variable-length strings are now supported! They are mapped to native Python strings via the NumPy "object" type. VL strings may be read, written and created from h5py, and are allowed in all HDF5 contexts, even as members of compound or array types.
- HDF5 exceptions now inherit from common Python built-ins like TypeError and ValueError (in addition to the current HDF5 error hierarchy), freeing the user from knowledge of the HDF5 error system. Existing code which uses H5Error will continue to work.
- Many different low-level HDF5 drivers can now be used when creating a file, which allows purely in-memory ("core") files, multi-volume ("family") files, and files which use low-level buffered I/O.
- Groups and attributes now support the standard Python dictionary interface methods, including keys(), values() and friends. The existing methods (listnames(), listobjects(), etc.) remain and will not be removed until at least h5py 1.4 or equivalent.
- A workaround for an HDF5 bug has sped up reading/writing of chunked datasets. When using a slice with fewer dimensions than the dataset, there can be as much as a 3x improvement in write times over h5py 1.1.
- Enumerated types are now fully supported; they can be used in NumPy anywhere integer types are allowed, and are stored as native HDF5 enums. Conversion between integers and enums is supported.
- The NumPy "array" dtype is now allowed as a top-level type when creating a dataset, not just as a member of a compound type.
- Unicode file names are now supported
- It's now possible to explicitly set the type of an attribute, and to preserve the type of an attribute while modifying it.
- High-level objects now have .parent and .file attributes, to make the navigation of HDF5 files more convenient.

Design revisions since 1.1
--------------------------

- The role of the "name" attribute on File objects has changed. "name" now returns the HDF5 path of the File object ('/'); the file name on disk is available at File.filename.
- Dictionary-interface methods for Group and AttributeManager objects have been renamed to follow the standard Python convention (keys(), values(), etc). The old method names are still available but deprecated.
- The HDF5 shuffle filter is no longer automatically activated when GZIP or LZF compression is used; many datasets "in the wild" do not benefit from shuffling.
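To give a flavor of the high-level interface, a minimal usage sketch (the file and dataset names here are made up):

    import numpy as np
    import h5py

    f = h5py.File('example.hdf5', 'w')
    dset = f.create_dataset('mydata', data=np.arange(100.0))
    dset.attrs['units'] = 'counts'     # attach named metadata
    print f['mydata'][10:20]           # slice it like a NumPy array
    f.close()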
Standard features
-----------------

- Supports storage of NumPy data of the following types: * Integer/Unsigned Integer * Float/Double * Complex/Double Complex * Compound ("recarray") * Strings * Boolean * Array * Enumeration (integers) * Void
- Random access to datasets using the standard NumPy slicing syntax, including a subset of fancy indexing and point-based selection
- Transparent compression of datasets using GZIP, LZF or SZIP, and error-detection using Fletcher32
- "Pythonic" interface supporting dictionary and NumPy-array metaphors for the high-level HDF5 abstractions like groups and datasets
- A comprehensive, object-oriented wrapping of the HDF5 low-level C API via Cython, in addition to the NumPy-like high-level interface.
- Supports many new features of HDF5 1.8, including recursive iteration over entire files and in-library copy operations on the file tree
- Thread-safe

Where to get it
---------------

* Main website, documentation: http://h5py.alfven.org
* Downloads, bug tracker: http://h5py.googlecode.com

Requires
--------

* Linux, Mac OS-X or Windows
* Python 2.5 (Windows), Python 2.5 or 2.6 (Linux/Mac OS-X)
* NumPy 1.0.3 or later
* HDF5 1.6.5 or later (including 1.8); HDF5 is included with the Windows version.

Thanks
------

Thanks to D. Dale, E. Lawrence and others for their continued support and comments. Also thanks to Francesc Alted and the PyTables project, for inspiration and generously providing their code to the community. Thanks to everyone at the HDF Group for creating such a useful piece of software.

From david at ar.media.kyoto-u.ac.jp Mon Jun 22 21:53:03 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Tue, 23 Jun 2009 10:53:03 +0900 Subject: [Numpy-discussion] Plans for Numpy 1.4.0 and scipy 0.8.0 In-Reply-To: References: <4A3DF6EB.4050606@ar.media.kyoto-u.ac.jp> <5b8d13220906210859u64c36af5tdb154593db61ab07@mail.gmail.com> Message-ID: <4A40357F.2030205@ar.media.kyoto-u.ac.jp>

Neil Crighton wrote: > David Cournapeau gmail.com> writes: > > >>>>> David Cournapeau wrote: >>>>> >>>>>> (Continuing the discussion initiated in the neighborhood iterator >>>>>> thread) >>>>>> - Chuck suggested to drop python < 2.6 support from now on. I am >>>>>> against it without a very strong and detailed rationale, because many OS >>>>>> still don't have python 2.6 (RHEL, Ubuntu LTS). >>>>>> >>>>> I vote against dropping support for python 2.5. Personally I have no >>>>> incentive to upgrade to 2.6 and am very happy with 2.5. >>>>> >>>> Will requiring python-2.6 help the developers port numpy to python-3? >>>> >>>> >>> Can't really say at this point, but it is the suggested path to >>> python-3. >>> >> OTOH, I don't find the python 3 "official" transition story very >> convincing. I have tried to gather all the information I could find, >> both on the python wiki and from transition stories. To support both >> python 2 and 3, the suggestion is to use the 2to3 script, but it is >> painfully slow for big packages like numpy. And there are very few >> stories for porting python 3 C extensions. >> >> Another suggestion is to avoid breaking the API when transitioning for >> python 3. But that seems quite unrealistic. How do we deal with the >> removing of string/long APIs ? This will impact the numpy API as well, >> so how do we deal with it ? >> >> > > As I understand this suggestion, they just hope external packages don't say > 'Hey, if we're breaking backwards compatibility anyway, let's take the chance to > do a whole lot of extra API breakage!'
That way, if people have problems > migrating to the new version, they know they're likely to be python 3 related. > Jarrod Millman's blog post about numpy and python 3 mentions this: > > http://jarrodmillman.blogspot.com/2009/01/when-will-numpy-and-scipy-migrate-to.html > As I understand, the rationale is that by not breaking the API at the same time as the py3k transition, people can migrate more easily through 2to3. But I am really not convinced that's possible in numpy's case. I am not 100 % sure yet, but it does not look like it will be possible to support the py3k C API without breaking things. Again, numpy is quite particular: it is not only a big C extension, which defines new types, but it also exports a C API. So the situation is very close to python itself, which is not backward compatible either. > I think there are lots of advantages in python 3 for scientific people. The > new integer division alone is a huge improvement. I've been bitten by this > (1/2 = 0) several times in the past, and the only reason I'm not bitten by it > now is that I've trained myself to always type things like 1./x, which look > ugly. > > The reorganisation of the standard library and the removal of duplicate ways of > doing things in the core also make the language much easier to learn. This > isn't a huge gain for people already familiar with Python's idiosyncrasies, but > it's important for people first coming to the language. > Hey, different people, different opinions. Those are really minor for me, I am glad it matters for other people :) > You could argue that moving to python 3 isn't attractive because there isn't > any scientific library support, but then that's because numpy hasn't been > ported to python 3 yet ;) > Once numpy is ported, other software should be much easier. For example, I don't expect scipy to be hard to port once numpy is ported. David

From kwmsmith at gmail.com Mon Jun 22 22:53:15 2009 From: kwmsmith at gmail.com (Kurt Smith) Date: Mon, 22 Jun 2009 21:53:15 -0500 Subject: [Numpy-discussion] Help using numpy.distutils.fcompiler for my GSoC project Message-ID:

Hello, I'm working under the able mentoring of Dag Sverre Seljebotn to implement a GSoC project informally known as 'f2cy'. From the 10,000 meter view, f2cy will (1) wrap fortran 77/90/95 code into a python module (reproducing f2py in this regard) with full support for assumed-shape fortran arrays (beyond f2py, if I'm not mistaken); and (2) will wrap the same fortran code in *cython* cdef functions that can be manipulated at the cython level with no interaction with the python API -- yielding close to the metal speed, even for small functions. We're making use of the new ISO_C_BINDING intrinsic module implemented in most every extant Fortran 95 compiler (including gfortran) to make the resulting code very portable. This introduces a small wrinkle to the compilation of the python/cython module, and that's the content of my question. Basically, f2cy generates a 'genconfig.f95' fortran source file that, when compiled and run, creates another fortran source file 'config.f95' and a C header file 'config.h'. config.f95 defines a module that contains the necessary mappings between the ISO_C_BINDING kind-type-parameters (ktps) and the ktps used in the rest of the generated source code. The genconfig.f95 file needs to be compiled and run before the python module can be compiled, since the python module depends on the 'config.f95' file. Once config.f95 exists, it is easy to make the python module.
The trick is in generating the 'config.f95' file in a portable way, using the right fortran compiler with all the right flags. We plan on leveraging the impressive body of work in the numpy.distutils.fcompiler package for this purpose; I don't think it would be too hard to compile the 'genconfig.f95' into an object file, but we then need to compile it to an executable. Is there a way for numpy.distutils to compile a fortran source file into an executable? I'm not well-versed in the black arts of distutils or numpy.distutils. I know the basics and am quickly reading up on it -- any pointers would be very appreciated. Thanks, Kurt Smith

From dsandie at gmail.com Tue Jun 23 04:54:33 2009 From: dsandie at gmail.com (Sandeep Devadas) Date: Tue, 23 Jun 2009 14:24:33 +0530 Subject: [Numpy-discussion] MKL Path error on Cygwin after installing on windows. In-Reply-To: <4A3F882C.3050904@ar.media.kyoto-u.ac.jp> References: <4A3F882C.3050904@ar.media.kyoto-u.ac.jp> Message-ID:

Hi David, On Mon, Jun 22, 2009 at 7:03 PM, David Cournapeau <david at ar.media.kyoto-u.ac.jp> wrote: > Hi Sandeep, > > Sandeep Devadas wrote: > > Hello There, > > My name is Sandeep Devadas and I'm trying to install > > numpy for python on Cygwin's latest version (1.5.xx) on Windows XP. I'm > > getting an error message when I follow the instructions given at > > http://www.scipy.org/Installing_SciPy/Windows > > You can't build a native numpy for windows on cygwin. You need to build > from cmd.exe and a native python (from python.org). Note also that > building numpy with the MKL is relatively complex - we may not be able > to help you to do it completely. > I need to build numpy for the python interpreter on cygwin. There is one program which has been compiled on cygwin and is being called as a module on the python interpreter (installed on cygwin). There is another program which requires numpy as a prerequisite before it can be installed. So numpy on python under cygwin is the only way out. Please let me know how to install numpy under cygwin. Thanks.

From david at ar.media.kyoto-u.ac.jp Tue Jun 23 04:48:51 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Tue, 23 Jun 2009 17:48:51 +0900 Subject: [Numpy-discussion] MKL Path error on Cygwin after installing on windows. In-Reply-To: References: <4A3F882C.3050904@ar.media.kyoto-u.ac.jp> Message-ID: <4A4096F3.4060906@ar.media.kyoto-u.ac.jp>

Sandeep Devadas wrote: > Hi David, > > On Mon, Jun 22, 2009 at 7:03 PM, David Cournapeau > wrote: > > Hi Sandeep, > > Sandeep Devadas wrote: > > Hello There, > > My name is Sandeep Devadas and I'm trying to install > > numpy for python on Cygwin's latest version (1.5.xx) on Windows XP. I'm > > getting an error message when I follow the instructions given at > > http://www.scipy.org/Installing_SciPy/Windows > > You can't build a native numpy for windows on cygwin. You need to > build > from cmd.exe and a native python (from python.org). Note also that > building numpy with the MKL is relatively complex - we may not be able > to help you to do it completely. > > > > I need to build numpy for the python interpreter on cygwin. You can build numpy for cygwin, but not with the MKL. Numpy has no prerequisites, just a C compiler (gcc from cygwin).
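(Concretely, that is just the standard distutils source build, run from the top of an unpacked numpy source tree inside the cygwin shell, assuming the python in question is cygwin's own:

    $ python setup.py build
    $ python setup.py install

)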
cheers, David

From dsandie at gmail.com Tue Jun 23 05:10:27 2009 From: dsandie at gmail.com (Sandeep Devadas) Date: Tue, 23 Jun 2009 14:40:27 +0530 Subject: [Numpy-discussion] MKL Path error on Cygwin after installing on windows. In-Reply-To: <4A4096F3.4060906@ar.media.kyoto-u.ac.jp> References: <4A3F882C.3050904@ar.media.kyoto-u.ac.jp> <4A4096F3.4060906@ar.media.kyoto-u.ac.jp> Message-ID:

How do I install numpy for cygwin under windows? On Tue, Jun 23, 2009 at 2:18 PM, David Cournapeau <david at ar.media.kyoto-u.ac.jp> wrote: > Sandeep Devadas wrote: > > Hi David, > > > > On Mon, Jun 22, 2009 at 7:03 PM, David Cournapeau > > wrote: > > > > Hi Sandeep, > > > > Sandeep Devadas wrote: > > > Hello There, > > > My name is Sandeep Devadas and I'm trying to install > > > numpy for python on Cygwin's latest version (1.5.xx) on Windows > XP. I'm > > > getting an error message when I follow the instructions given at > > > http://www.scipy.org/Installing_SciPy/Windows > > > > You can't build a native numpy for windows on cygwin. You need to > > build > > from cmd.exe and a native python (from python.org). Note also that > > building numpy with the MKL is relatively complex - we may not be > able > > to help you to do it completely. > > > > I need to build numpy for the python interpreter on cygwin. > > You can build numpy for cygwin, but not with the MKL. Numpy has no > prerequisites, just a C compiler (gcc from cygwin). > > cheers, > > David

From faltet at pytables.org Tue Jun 23 08:23:16 2009 From: faltet at pytables.org (Francesc Alted) Date: Tue, 23 Jun 2009 14:23:16 +0200 Subject: [Numpy-discussion] ANN: Numexpr 1.3.1 released Message-ID: <200906231423.17172.faltet@pytables.org>

==========================
Announcing Numexpr 1.3.1
==========================

Numexpr is a fast numerical expression evaluator for NumPy. With it, expressions that operate on arrays (like "3*a+4*b") are accelerated and use less memory than doing the same calculation in Python.

This is a maintenance release. In it, support for the `uint32` type has been added (it is internally upcast to 'int64'), as well as a new `abs()` function (thanks to Pauli Virtanen for the patch). Also, a little tweaking in the treatment of unaligned arrays on Intel architectures allowed for up to 2x speedups in computations involving unaligned arrays.

For example, for multiplying 2 arrays (see the included ``unaligned-simple.py`` benchmark), figures before the tweaking were:

    NumPy aligned:      0.63 s
    NumPy unaligned:    1.66 s
    Numexpr aligned:    0.65 s
    Numexpr unaligned:  1.09 s

while now they are:

    NumPy aligned:      0.63 s
    NumPy unaligned:    1.65 s
    Numexpr aligned:    0.65 s
    Numexpr unaligned:  0.57 s  <-- almost 2x faster than above

You can also see how the unaligned case can be even faster than the aligned one. The explanation is that the 'aligned' array was actually a strided one (actually a column of a structured array), and the total working data size was a bit larger for this case.

In case you want to know more in detail what has changed in this version, see: http://code.google.com/p/numexpr/wiki/ReleaseNotes or have a look at RELEASE_NOTES.txt in the tarball.

Where can I find Numexpr?
=========================

The project is hosted at Google code in:

http://code.google.com/p/numexpr/

And you can get the packages from PyPI as well:

http://pypi.python.org/pypi

How it works?
=============

See: http://code.google.com/p/numexpr/wiki/Overview for a detailed description by the original author (David M. Cooke).

Share your experience
=====================

Let us know of any bugs, suggestions, gripes, kudos, etc. you may have.

Enjoy!

-- Francesc Alted

From dsandie at gmail.com Tue Jun 23 09:13:44 2009
From: dsandie at gmail.com (Sandeep Devadas)
Date: Tue, 23 Jun 2009 18:43:44 +0530
Subject: [Numpy-discussion] MKL Path error on Cygwin after installing on windows.
In-Reply-To:
References: <4A3F882C.3050904@ar.media.kyoto-u.ac.jp> <4A4096F3.4060906@ar.media.kyoto-u.ac.jp>
Message-ID:

Hi, I got numpy working on Python under cygwin and am getting this error. Please let me know what to do, as I am new to Python and don't know anything.

$ python
Python 2.5.2 (r252:60911, Dec 2 2008, 09:26:14)
[GCC 3.4.4 (cygming special, gdc 0.12, using dmd 0.125)] on cygwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy
>>> numpy.test( 10 )
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.5/site-packages/numpy/testing/nosetester.py", line 221, in test
    argv = self._test_argv(label, verbose, extra_argv)
  File "/usr/lib/python2.5/site-packages/numpy/testing/nosetester.py", line 141, in _test_argv
    raise TypeError, 'Selection label should be a string'
TypeError: Selection label should be a string

Thanks and Regards,
Sandeep.

On Tue, Jun 23, 2009 at 2:40 PM, Sandeep Devadas wrote:
> How do I install numpy for cygwin under windows?
>
> On Tue, Jun 23, 2009 at 2:18 PM, David Cournapeau < david at ar.media.kyoto-u.ac.jp> wrote:
>> Sandeep Devadas wrote:
>> > I need to build numpy for the python interpreter on cygwin.
>>
>> You can build numpy for cygwin, but not with the MKL. Numpy has no
>> prerequisite, just a C compiler (gcc from cygwin).
>>
>> cheers,
>>
>> David

From sole at esrf.fr Tue Jun 23 11:58:24 2009
From: sole at esrf.fr ("V. Armando Solé")
Date: Tue, 23 Jun 2009 17:58:24 +0200
Subject: [Numpy-discussion] ANN: HDF5 for Python (h5py) 1.2
In-Reply-To:
References:
Message-ID: <4A40FBA0.1050605@esrf.fr>

Dear Andrew,

I have succeeded in generating a win32 binary installer for python 2.6. Running depends (the Dependency Walker tool) on the installed libraries shows there are no dependencies on other libraries than those of VS2008.
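As a quick sanity check of the binary I ran something like the following (just a smoke test; the file name is arbitrary):

>>> import h5py
>>> f = h5py.File('test.h5', 'w')
>>> f['x'] = [1, 2, 3]
>>> f['x'][:]
array([1, 2, 3])
>>> f.close()

and everything behaves as expected here.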
I have tried to send it to you directly but I am not sure if your mail address accepts attachments. Please let me know if you are interested in getting the installer.

Armando

From mforbes at physics.ubc.ca Tue Jun 23 14:19:48 2009
From: mforbes at physics.ubc.ca (Michael McNeil Forbes)
Date: Tue, 23 Jun 2009 12:19:48 -0600
Subject: [Numpy-discussion] Efficient ?axpy operation without copy (B += a*A)
Message-ID:

Hi,

Is there a way of performing vectorized ?axpy (daxpy) operations without making copies or dropping into C? I.e., I want to do

big = (10000, 5000)
A = np.ones(big, dtype=float)
B = np.ones(big, dtype=float)
a = 1.5
B += a*A

without making any copies? (I know I could go

A *= a
B += A
A /= a

but that is not so efficient either.) There are exposed blas daxpy operations in scipy, but in the version I have (EPD), these also seem to make copies (though recent versions seem to be fixed, by the look of the source).

Thanks,
Michael.

From dalcinl at gmail.com Tue Jun 23 14:41:00 2009
From: dalcinl at gmail.com (Lisandro Dalcin)
Date: Tue, 23 Jun 2009 15:41:00 -0300
Subject: [Numpy-discussion] Help using numpy.distutils.fcompiler for my GSoC project
In-Reply-To:
References:
Message-ID:

On Mon, Jun 22, 2009 at 11:53 PM, Kurt Smith wrote:
> Hello,
>
> Is there a way for numpy.distutils to compile a fortran source file
> into an executable?

If the whole point of building the executable is to run it in order to parse the output, then you can start with this:

$ cat setup.py
from numpy.distutils.core import setup
from numpy.distutils.command.config import config as config_orig

helloworld = """
program hello
    write(*,*) "Hello, World!"
end program hello
"""

class config(config_orig):
    def run(self):
        self.try_run(helloworld, lang='f90')

setup(name="ConfigF90",
      cmdclass={'config' : config})

$ python setup.py config
<... lots of output ...>
gfortran:f90: _configtest.f90
/usr/bin/gfortran -Wall -Wall _configtest.o -lgfortran -o _configtest
_configtest
 Hello, World!
success!
removing: _configtest.f90 _configtest.o _configtest

In order to actually capture the output, you will have to implement the method spawn() in class config(), likely using the subprocess module (or the older os.* APIs for maximum backward compatibility).

Hope this helps,

> I'm not well-versed in the black arts of
> distutils or numpy.distutils. I know the basics and am quickly
> reading up on it -- any pointers would be very appreciated.
>
> Thanks,
>
> Kurt Smith

--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From pav at iki.fi Tue Jun 23 14:46:23 2009
From: pav at iki.fi (Pauli Virtanen)
Date: Tue, 23 Jun 2009 18:46:23 +0000 (UTC)
Subject: [Numpy-discussion] Efficient ?axpy operation without copy (B += a*A)
References:
Message-ID:

On 2009-06-23, Michael McNeil Forbes wrote:
> Is there a way of performing vectorized ?axpy (daxpy) operations
> without making copies or dropping into C?
> > i.e: I want to do > > big = (10000,5000) > A = np.ones(big,dtype=float) > B = np.ones(big,dtype=float) > a = 1.5 > B += a*A I think the only available choice is to use the BLAS routines from scipy.lib: >>> from scipy.lib.blas import get_blas_funcs >>> axpy, = get_blas_funcs(['axpy'], [A, B]) >>> res = axpy(A.ravel(), B.ravel(), A.size, a) >>> res.base is B True >>> B[0,0] 2.5 Works provided A and B are initially in C-order so that ravel() doesn't create copies. If unsure, it's best to make use of the return value of axpy and not assume B is modified in-place. [clip] > There are exposed blas daxpy operations in scipy, but in the version I > have (EPD), these also seem to make copies (though recent version seem > to be fixed by looking at the source.) I don't see many relevant changes in scipy.lib recently, so I'm not sure what change you mean by the above. -- Pauli Virtanen From faltet at pytables.org Tue Jun 23 14:52:01 2009 From: faltet at pytables.org (Francesc Alted) Date: Tue, 23 Jun 2009 20:52:01 +0200 Subject: [Numpy-discussion] ANN: PyTables 2.2b1 ready for testing Message-ID: <200906232052.02129.faltet@pytables.org> Hi, This is for inform you about the first beta release for PyTables 2.2. You will find there some interesting new features, but no question that the most appealing one is the new `tables.Expr` class. You can think about it as powerful evaluator for generic mathematical expressions of NumPy arrays as well as disk-based datasets. `tables.Expr` works like a sort of replacement of the `numpy.memmap` module, but it has the next advantages over the latter: * It can evaluate whatever Numexpr expression without need to take care of temporaries. For example, it can compute expressions like: "a*b-1" or "(a*arctan2(b,c)*sqrt(d))**2-1" where 'a','b','c' and 'd' can be any PyTables homogeneous dataset or NumPy array, in an optimal way (i.e. avoiding temporaries and making an effective use of the computational resources of your machine). * Contrarily to `numpy.memmap`, `tables.Expr` works for *arbitrarily* large datasets, no matter your platform is 32-bit or 64-bit or your available virtual memory: if your disk can keep your input and output datasets, you will be able to do your computations. * In the PyTables tradition, it can make use of compression transparently, so even in the case that your datasets does not fit on-disk, there is still a chance that the compressed ones do. Finally, and although in most of scenarios compression does actually improve the speed of I/O, it is true that CPU is still the main bottleneck when compressing/decompressing. This is being addressed. So, for those of you that need to work with datasets that defies your computer capabilities, please give the `tables.Expr` a try and report your experience. I'll be glad to try to hear you back! Keep reading for instructions on finding the new code and documentation. =========================== Announcing PyTables 2.2b1 =========================== PyTables is a library for managing hierarchical datasets and designed to efficiently cope with extremely large amounts of data with support for full 64-bit file addressing. PyTables runs on top of the HDF5 library and NumPy package for achieving maximum throughput and convenient use. This is the first beta of the PyTables 2.2 series. 
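As a taste of the new class, a minimal out-of-core computation could look like this (a sketch only; the file and node names are invented for the example):

import tables as tb

f = tb.openFile('data.h5', mode='w')
shape = (10**7,)
a = f.createCArray(f.root, 'a', tb.Float64Atom(), shape)
b = f.createCArray(f.root, 'b', tb.Float64Atom(), shape)
out = f.createCArray(f.root, 'out', tb.Float64Atom(), shape)

# 'a' and 'b' are picked up automatically from the local scope;
# the expression is evaluated blockwise, so no operand or temporary
# ever has to fit in main memory
expr = tb.Expr('3*a + 4*b')
expr.setOutput(out)
expr.eval()
f.close()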
Here, you will find support for NumPy's extended slicing in all `Leaf` objects as well as an updated Numexpr module (to 1.3.1), which can lead to up a 25% improvement of the time for both in-kernel and indexed queries for unaligned columns in tables (which can be a quite common situation). But perhaps the most interesting feature is the introduction of the `Expr` class, which allows evaluating expressions containing general array-like objects. It can evaluate expressions (like '3*a+4*b') that operate on *arbitrary large* arrays while optimizing the resources (basically main memory and CPU cache memory) required to perform them. It works similarly to the Numexpr package, but in addition to NumPy objects, it also accepts disk-based homogeneous arrays, like the `Array`, `CArray`, `EArray` and `Column` PyTables objects. You can find the documentation about the new `Expr` class at: http://www.pytables.org/docs/manual-2.2b1/ch04.html#ExprClass In case you want to know more in detail what has changed in this version, have a look at: http://www.pytables.org/moin/ReleaseNotes/Release_2.2b1 You can download a source package with generated PDF and HTML docs, as well as binaries for Windows, from: http://www.pytables.org/download/preliminary For an on-line version of the manual, visit: http://www.pytables.org/docs/manual-2.2b1 Resources ========= About PyTables: http://www.pytables.org About the HDF5 library: http://hdfgroup.org/HDF5/ About NumPy: http://numpy.scipy.org/ Acknowledgments =============== Thanks to many users who provided feature improvements, patches, bug reports, support and suggestions. See the ``THANKS`` file in the distribution package for a (incomplete) list of contributors. Most specially, a lot of kudos go to the HDF5 and NumPy (and numarray!) makers. Without them, PyTables simply would not exist. Share your experience ===================== Let us know of any bugs, suggestions, gripes, kudos, etc. you may have. ---- **Enjoy data!** -- The PyTables Team -- Francesc Alted From charlesr.harris at gmail.com Tue Jun 23 15:31:03 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 23 Jun 2009 13:31:03 -0600 Subject: [Numpy-discussion] Efficient ?axpy operation without copy (B += a*A) In-Reply-To: References: Message-ID: On Tue, Jun 23, 2009 at 12:46 PM, Pauli Virtanen wrote: > On 2009-06-23, Michael McNeil Forbes wrote: >> Is there a way of performing vectorized ?axpy (daxpy) operations >> without making copies or dropping into C? >> >> i.e: I want to do >> >> big = (10000,5000) >> A = np.ones(big,dtype=float) >> B = np.ones(big,dtype=float) >> a = 1.5 >> B += a*A > > I think the only available choice is to use the BLAS routines > from scipy.lib: > >>>> from scipy.lib.blas import get_blas_funcs >>>> axpy, = get_blas_funcs(['axpy'], [A, B]) >>>> res = axpy(A.ravel(), B.ravel(), A.size, a) >>>> res.base is B > True >>>> B[0,0] > 2.5 > > Works provided A and B are initially in C-order so that ravel() > doesn't create copies. If unsure, it's best to make use of the > return value of axpy and not assume B is modified in-place. > > [clip] >> There are exposed blas daxpy operations in scipy, but in the version I >> have (EPD), these also seem to make copies (though recent version seem >> to be fixed by looking at the source.) > > I don't see many relevant changes in scipy.lib recently, so I'm > not sure what change you mean by the above. > Now and then I've thought about adding it as a ufunc. It wouldn't be hard. 
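In the meantime, one can get most of the memory savings in pure numpy by applying the update in blocks, so that only a small temporary is ever alive. A rough sketch (the function name and chunk size are arbitrary, and it assumes C-contiguous inputs so that reshape returns views):

import numpy as np

def axpy_blocked(a, A, B, chunk=65536):
    # B += a*A in place, allocating only chunk-sized temporaries
    Af = A.reshape(-1)
    Bf = B.reshape(-1)
    for i in range(0, Bf.size, chunk):
        Bf[i:i + chunk] += a * Af[i:i + chunk]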
Chuck From kwmsmith at gmail.com Tue Jun 23 15:48:18 2009 From: kwmsmith at gmail.com (Kurt Smith) Date: Tue, 23 Jun 2009 14:48:18 -0500 Subject: [Numpy-discussion] Help using numpy.distutils.fcompiler for my GSoC project In-Reply-To: References: Message-ID: On Tue, Jun 23, 2009 at 1:41 PM, Lisandro Dalcin wrote: > On Mon, Jun 22, 2009 at 11:53 PM, Kurt Smith wrote: >> Hello, >> >> Is there a way for numpy.distutils to compile a fortran source file >> into an executable? > > If the whole point of building the executable is to run it in order to > parse the output, then you can start with this: > > $ cat setup.py > from numpy.distutils.core import setup > from numpy.distutils.command.config import config as config_orig > > helloworld = """ > program hello > ? ?write(*,*) "Hello, World!" > end program hello > """ > > class config(config_orig): > ? ?def run(self): > ? ? ? ?self.try_run(helloworld, lang='f90') > > setup(name="ConfigF90", > ? ? ?cmdclass={'config' : config}) > > > $ python setup.py config > <... lots of ouput ...> > gfortran:f90: _configtest.f90 > /usr/bin/gfortran -Wall -Wall _configtest.o -lgfortran -o _configtest > _configtest > ?Hello, World! > success! > removing: _configtest.f90 _configtest.o _configtest Thank you! The comments at the top of numpy.distutils.command.config.py make me think that try_run won't work for all fortran compilers on all platforms. But this certainly helps and I've almost got it solved. What I've discovered is something more low-level, like the following: $ cat batch.py from numpy.distutils.fcompiler import new_fcompiler fcomp = new_fcompiler() # will grab the default fcompiler, overridable with keyword args. fcomp.customize() objs = fcomp.compile(['./genconfig.f95']) # compiles into an object. fcomp.link_executable(objs, 'genconfig') # makes the executable 'genconfig' # spawn the executable here to generate config files. The above works for simple cases (but for the spawning step, but I'll have that working soon) -- I'll probably end up wrapping it in a run method inside a config subclass, like you have above. Thanks again, Kurt From mforbes at physics.ubc.ca Tue Jun 23 18:51:49 2009 From: mforbes at physics.ubc.ca (Michael McNeil Forbes) Date: Tue, 23 Jun 2009 16:51:49 -0600 Subject: [Numpy-discussion] Efficient ?axpy operation without copy (B += a*A) In-Reply-To: References: Message-ID: Thanks Pauli, On 23 Jun 2009, at 12:46 PM, Pauli Virtanen wrote: >>>> from scipy.lib.blas import get_blas_funcs >>>> axpy, = get_blas_funcs(['axpy'], [A, B]) >>>> res = axpy(A.ravel(), B.ravel(), A.size, a) >>>> res.base is B ... > Works provided A and B are initially in C-order so that ravel() > doesn't create copies. If unsure, it's best to make use of the > return value of axpy and not assume B is modified in-place. > > [clip] >> There are exposed blas daxpy operations in scipy, but in the >> version I >> have (EPD), these also seem to make copies (though recent version >> seem >> to be fixed by looking at the source.) > > I don't see many relevant changes in scipy.lib recently, so I'm > not sure what change you mean by the above. I was not ravelling or providing the size, and as a result the return value was a copy and B was not modified leading me to an incorrect conclusion: It works with EPD. Michael. 
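P.S. For the archives, the complete working incantation here (EPD, shapes from my original post), following Pauli's recipe:

import numpy as np
from scipy.lib.blas import get_blas_funcs

big = (10000, 5000)
A = np.ones(big, dtype=float)
B = np.ones(big, dtype=float)
axpy, = get_blas_funcs(['axpy'], [A, B])
res = axpy(A.ravel(), B.ravel(), A.size, 1.5)
assert res.base is B  # B really was updated in place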
From cournape at gmail.com Tue Jun 23 23:17:36 2009 From: cournape at gmail.com (David Cournapeau) Date: Wed, 24 Jun 2009 12:17:36 +0900 Subject: [Numpy-discussion] Help using numpy.distutils.fcompiler for my GSoC project In-Reply-To: References: Message-ID: <5b8d13220906232017t2ba05467n3cc460ead2ebf750@mail.gmail.com> On Wed, Jun 24, 2009 at 3:41 AM, Lisandro Dalcin wrote: > On Mon, Jun 22, 2009 at 11:53 PM, Kurt Smith wrote: >> Hello, >> >> Is there a way for numpy.distutils to compile a fortran source file >> into an executable? > > If the whole point of building the executable is to run it in order to > parse the output, then you can start with this: > > $ cat setup.py > from numpy.distutils.core import setup > from numpy.distutils.command.config import config as config_orig > > helloworld = """ > program hello > ? ?write(*,*) "Hello, World!" > end program hello > """ > > class config(config_orig): > ? ?def run(self): > ? ? ? ?self.try_run(helloworld, lang='f90') > > setup(name="ConfigF90", > ? ? ?cmdclass={'config' : config}) > > > $ python setup.py config > <... lots of ouput ...> > gfortran:f90: _configtest.f90 > /usr/bin/gfortran -Wall -Wall _configtest.o -lgfortran -o _configtest > _configtest > ?Hello, World! > success! > removing: _configtest.f90 _configtest.o _configtest > > > In order to actually capture the ouput, you will have to implement > method spawn() in class config(), likely using subprocess module (or > older os.* APIS's for max backward compatibility) If possible, you should not build executables, it is not portable. Compiling and linking is Ok, running is not. For a tool which is aimed a general use, I think this is important. Knowing the exact tests needed by the OP would help me to give more detailed advices. cheers, David From david at ar.media.kyoto-u.ac.jp Tue Jun 23 23:05:22 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Wed, 24 Jun 2009 12:05:22 +0900 Subject: [Numpy-discussion] MKL Path error on Cygwin after installing on windows. In-Reply-To: References: <4A3F882C.3050904@ar.media.kyoto-u.ac.jp> <4A4096F3.4060906@ar.media.kyoto-u.ac.jp> Message-ID: <4A4197F2.9060804@ar.media.kyoto-u.ac.jp> Sandeep Devadas wrote: > Hi, > I got numpy working on Python under cygwin and am getting this > error . > Please let me know what to do as I am new to Python and dont know > anything. > > $ python > Python 2.5.2 (r252:60911, Dec 2 2008, 09:26:14) > [GCC 3.4.4 (cygming special, gdc 0.12, using dmd 0.125)] on cygwin > Type "help", "copyright", "credits" or "license" for more information. > >>> import numpy > >>> numpy.test( 10 ) > Traceback (most recent call last): > File "", line 1, in > File "/usr/lib/python2.5/site-packages/numpy/testing/nosetester.py", > line 221, in test > argv = self._test_argv(label, verbose, extra_argv) > File "/usr/lib/python2.5/site-packages/numpy/testing/nosetester.py", > line 141, in _test_argv > raise TypeError, 'Selection label should be a string' > TypeError: Selection label should be a string You can test numpy as follows: >>> import numpy >>> numpy.test() # basic tests >>> numpy.test(label='full') # all tests, including slower ones >>> numpy.test(verbose=10) # verbose output IIRC, there are not that many more tests with label='full' than with the default for numpy (the difference matters for scipy). 
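Note that the test runner needs the nose package to be importable; a quick check (the version printed will be whatever you have installed):

>>> import nose
>>> nose.__version__
'0.11.1'

If that import fails, installing nose (for example with easy_install nose) fixes it.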
cheers,

David

From dsandie at gmail.com Wed Jun 24 07:18:17 2009
From: dsandie at gmail.com (Sandeep Devadas)
Date: Wed, 24 Jun 2009 16:48:17 +0530
Subject: [Numpy-discussion] MKL Path error on Cygwin after installing on windows.
In-Reply-To: <4A4197F2.9060804@ar.media.kyoto-u.ac.jp>
References: <4A3F882C.3050904@ar.media.kyoto-u.ac.jp> <4A4096F3.4060906@ar.media.kyoto-u.ac.jp> <4A4197F2.9060804@ar.media.kyoto-u.ac.jp>
Message-ID:

Hi David,

I tried what you told me and I'm getting this error. Please let me know what to do next.

$ python
Python 2.5.2 (r252:60911, Dec 2 2008, 09:26:14)
[GCC 3.4.4 (cygming special, gdc 0.12, using dmd 0.125)] on cygwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy
>>> numpy.test()
Running unit tests for numpy
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.5/site-packages/numpy/testing/nosetester.py", line 240, in test
    self._show_system_info()
  File "/usr/lib/python2.5/site-packages/numpy/testing/nosetester.py", line 151, in _show_system_info
    nose = import_nose()
  File "/usr/lib/python2.5/site-packages/numpy/testing/nosetester.py", line 51, in import_nose
    raise ImportError(msg)
ImportError: Need nose >= 0.10.0 for tests - see http://somethingaboutorange.com/mrl/projects/nose
>>>

Thanks and Regards,
Sandeep.

On Wed, Jun 24, 2009 at 8:35 AM, David Cournapeau < david at ar.media.kyoto-u.ac.jp> wrote:
> Sandeep Devadas wrote:
> > Hi, I got numpy working on Python under cygwin and am getting this error.
> > Please let me know what to do, as I am new to Python and don't know anything.
>
> You can test numpy as follows:
> >>> import numpy
> >>> numpy.test()  # basic tests
> >>> numpy.test(label='full')  # all tests, including slower ones
> >>> numpy.test(verbose=10)  # verbose output
>
> IIRC, there are not that many more tests with label='full' than with the
> default for numpy (the difference matters for scipy).
>
> cheers,
>
> David

From fredmfp at gmail.com Wed Jun 24 08:30:26 2009
From: fredmfp at gmail.com (fred)
Date: Wed, 24 Jun 2009 14:30:26 +0200
Subject: [Numpy-discussion] convolve optimisation...
Message-ID: <4A421C62.9050502@gmail.com>

Hi all,

Say I have a 2D array A(nx, ny). In each A[i, j] I want to compute convolve(a, kernel), where a is a subarray of A. a and kernel are small compared to A. The problem is that nx & ny are quite "big", i.e. ~1000, so looping over i & j in Python is far too slow here.

So how can I do what I want? Any idea?

TIA.
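To be explicit, this is roughly the loop I want to get rid of (the sizes and the kernel-generating function are placeholders, and I ignore the kernel flip that a true convolution implies):

import numpy as np

nx, ny = 1000, 1000
m = 2  # half-width of the neighbourhood around each pixel
A = np.random.random((nx, ny))
out = np.zeros_like(A)

def make_kernel(i, j):
    # dummy stand-in: the real kernel depends on other parameters
    return np.ones((2*m + 1, 2*m + 1)) / (2*m + 1)**2

for i in range(m, nx - m):
    for j in range(m, ny - m):
        a = A[i - m:i + m + 1, j - m:j + m + 1]
        out[i, j] = (a * make_kernel(i, j)).sum()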
Cheers, -- Fred From fredmfp at gmail.com Wed Jun 24 08:33:20 2009 From: fredmfp at gmail.com (fred) Date: Wed, 24 Jun 2009 14:33:20 +0200 Subject: [Numpy-discussion] convolve optimisation... In-Reply-To: <4A421C62.9050502@gmail.com> References: <4A421C62.9050502@gmail.com> Message-ID: <4A421D10.7090906@gmail.com> fred a ?crit : > Hi all, > > Say I have a 2D array A(nx, ny). > > In each A[i, j] I want to compute convolve(a, kernel) > > where a is subarray of A. > > a and kernel are small besides A. I forgot to mention: kernel is not constant, of course. It varies vs. others parameters. Cheers, -- Fred From stefan at sun.ac.za Wed Jun 24 08:40:37 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Wed, 24 Jun 2009 14:40:37 +0200 Subject: [Numpy-discussion] convolve optimisation... In-Reply-To: <4A421D10.7090906@gmail.com> References: <4A421C62.9050502@gmail.com> <4A421D10.7090906@gmail.com> Message-ID: <9457e7c80906240540m67aa4918oec84b4231fc5fa05@mail.gmail.com> 2009/6/24 fred : > fred a ?crit : >> Hi all, >> >> Say I have a 2D array A(nx, ny). >> >> In each A[i, j] I want to compute convolve(a, kernel) >> >> where a is subarray of A. >> >> a and kernel are small besides A. > I forgot to mention: kernel is not constant, of course. > It varies vs. others parameters. If your kernel varies with i and j, you have little choice but to do this at the C level. Have a look at the Cython convolution example here: http://docs.cython.org/docs/numpy_tutorial.html Alternatively, David Cournapeau can take this opportunity to illustrate his very nifty Neighbour Iterator. Regards St?fan From fredmfp at gmail.com Wed Jun 24 08:58:48 2009 From: fredmfp at gmail.com (fred) Date: Wed, 24 Jun 2009 14:58:48 +0200 Subject: [Numpy-discussion] convolve optimisation... In-Reply-To: <9457e7c80906240540m67aa4918oec84b4231fc5fa05@mail.gmail.com> References: <4A421C62.9050502@gmail.com> <4A421D10.7090906@gmail.com> <9457e7c80906240540m67aa4918oec84b4231fc5fa05@mail.gmail.com> Message-ID: <4A422308.2070304@gmail.com> St?fan van der Walt a ?crit : > If your kernel varies with i and j, you have little choice but to do > this at the C level. > > Have a look at the Cython convolution example here: Thanks. I'm looking at it. > Alternatively, David Cournapeau can take this opportunity to > illustrate his very nifty Neighbour Iterator. Ok. Cheers, -- Fred From dsdale24 at gmail.com Wed Jun 24 09:08:38 2009 From: dsdale24 at gmail.com (Darren Dale) Date: Wed, 24 Jun 2009 09:08:38 -0400 Subject: [Numpy-discussion] suggestion for generalizing numpy functions In-Reply-To: References: Message-ID: On Wed, May 27, 2009 at 11:30 AM, Darren Dale wrote: > Now that numpy-1.3 has been released, I was hoping I could engage the numpy > developers and community concerning my suggestion to improve the ufunc > wrapping mechanism. Currently, ufuncs call, on the way out, the > __array_wrap__ method of the input array with the highest > __array_priority__. > > There are use cases, like masked arrays or arrays with units, where it is > imperative to run some code on the way in to the ufunc as well. MaskedArrays > do this by reimplementing or wrapping ufuncs, but this approach puts some > pretty severe constraints on subclassing. For example, in my Quantities > package I have a Quantity object that derives from ndarray. It has been > suggested that in order to make ufuncs work with Quantity, I should wrap > numpy's built-in ufuncs. 
But I intend to make a MaskedQuantity object as > well, deriving from MaskedArray, and would therefore have to wrap the > MaskedArray ufuncs as well. > > If ufuncs would simply call a method both on the way in and on the way out, > I think this would go a long way to improving this situation. I whipped up a > simple proof of concept and posted it in this thread a while back. For > example, a MaskedQuantity would implement a method like __gfunc_pre__ to > check the validity of the units operation etc, and would then call > MaskedArray.__gfunc_pre__ (if defined) to determine the domain etc. > __gfunc_pre__ would return a dict containing any metadata the subclasses > wish to provide based on the inputs, and that dict would be passed along > with the inputs, output and context to __gfunc_post__, so postprocessing can > be done (__gfunc_post__ replacing __array_wrap__). > > Of course, packages like MaskedArray may still wish to reimplement ufuncs, > like Eric Firing is investigating right now. The point is that classes that > dont care about the implementation of ufuncs, that only need to provide > metadata based on the inputs and the output, can do so using this mechanism > and can build upon other specialized arrays. > > I would really appreciate input from numpy developers and other interested > parties. I would like to continue developing the Quantities package this > summer, and have been approached by numerous people interested in using > Quantities with sage, sympy, matplotlib. But I would prefer to improve the > ufunc mechanism (or establish that there is no interest among the community > to do so) so I can improve the package (or limit its scope) before making an > official announcement. > There was some discussion of this proposal to allow better interaction of ufuncs with ndarray subclasses in another thread (Plans for numpy-1.4.0 and scipy-0.8.0) and the comments were encouraging. I have been trying to gather feedback as to whether the numpy devs were receptive to the idea, and it seems the answer is tentatively yes, although there were questions about who would actually write the code. I guess I have not made clear that I intend to write the implementation and tests. I gained some familiarity with the relevant code while squashing a few bugs for numpy-1.3, but it would be helpful if someone else who is familiar with the existing __array_wrap__ machinery would be willing to discuss this proposal in more detail and offer constructive criticism along the way. Is anyone willing? What is the timeframe being considered for the numpy-1.4 release? Thanks, Darren -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Wed Jun 24 09:42:12 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 24 Jun 2009 07:42:12 -0600 Subject: [Numpy-discussion] suggestion for generalizing numpy functions In-Reply-To: References: Message-ID: On Wed, Jun 24, 2009 at 7:08 AM, Darren Dale wrote: > On Wed, May 27, 2009 at 11:30 AM, Darren Dale wrote: >> >> Now that numpy-1.3 has been released, I was hoping I could engage the >> numpy developers and community concerning my suggestion to improve the ufunc >> wrapping mechanism. Currently, ufuncs call, on the way out, the >> __array_wrap__ method of the input array with the highest >> __array_priority__. >> >> There are use cases, like masked arrays or arrays with units, where it is >> imperative to run some code on the way in to the ufunc as well. 
MaskedArrays >> do this by reimplementing or wrapping ufuncs, but this approach puts some >> pretty severe constraints on subclassing. For example, in my Quantities >> package I have a Quantity object that derives from ndarray. It has been >> suggested that in order to make ufuncs work with Quantity, I should wrap >> numpy's built-in ufuncs. But I intend to make a MaskedQuantity object as >> well, deriving from MaskedArray, and would therefore have to wrap the >> MaskedArray ufuncs as well. >> >> If ufuncs would simply call a method both on the way in and on the way >> out, I think this would go a long way to improving this situation. I whipped >> up a simple proof of concept and posted it in this thread a while back. For >> example, a MaskedQuantity would implement a method like __gfunc_pre__ to >> check the validity of the units operation etc, and would then call >> MaskedArray.__gfunc_pre__ (if defined) to determine the domain etc. >> __gfunc_pre__ would return a dict containing any metadata the subclasses >> wish to provide based on the inputs, and that dict would be passed along >> with the inputs, output and context to __gfunc_post__, so postprocessing can >> be done (__gfunc_post__ replacing __array_wrap__). >> >> Of course, packages like MaskedArray may still wish to reimplement ufuncs, >> like Eric Firing is investigating right now. The point is that classes that >> dont care about the implementation of ufuncs, that only need to provide >> metadata based on the inputs and the output, can do so using this mechanism >> and can build upon other specialized arrays. >> >> I would really appreciate input from numpy developers and other interested >> parties. I would like to continue developing the Quantities package this >> summer, and have been approached by numerous people interested in using >> Quantities with sage, sympy, matplotlib. But I would prefer to improve the >> ufunc mechanism (or establish that there is no interest among the community >> to do so) so I can improve the package (or limit its scope) before making an >> official announcement. > > There was some discussion of this proposal to allow better interaction of > ufuncs with ndarray subclasses in another thread (Plans for numpy-1.4.0 and > scipy-0.8.0) and the comments were encouraging. I have been trying to gather > feedback as to whether the numpy devs were receptive to the idea, and it > seems the answer is tentatively yes, although there were questions about who > would actually write the code. I guess I have not made clear that I intend > to write the implementation and tests. I gained some familiarity with the > relevant code while squashing a few bugs for numpy-1.3, but it would be > helpful if someone else who is familiar with the existing __array_wrap__ > machinery would be willing to discuss this proposal in more detail and offer > constructive criticism along the way. Is anyone willing? > I think Travis would be the only one familiar with that code and that would be from a couple of years back when he wrote it. Most of us have followed the same route as yourself, finding our way into the code by squashing bugs. 
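For concreteness, my reading of the protocol you are proposing is something like the following -- the hook names come from your earlier message, the signatures are my guess, and nothing here exists in numpy today:

import numpy as np

class Quantity(np.ndarray):
    units = None

    def __gfunc_pre__(self, ufunc, inputs):
        # Called on the way into the ufunc: inspect the inputs and
        # return a dict of metadata for the way out.
        return {'units': getattr(self, 'units', None)}

    def __gfunc_post__(self, result, ufunc, inputs, metadata):
        # Called on the way out, replacing __array_wrap__: receives
        # the raw ndarray result plus the metadata from __gfunc_pre__.
        out = result.view(type(self))
        out.units = metadata.get('units')
        return out

If that is roughly right, it would let subclasses like MaskedQuantity stack on top of MaskedArray without reimplementing the ufuncs themselves.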
Chuck From kwmsmith at gmail.com Wed Jun 24 10:02:39 2009 From: kwmsmith at gmail.com (Kurt Smith) Date: Wed, 24 Jun 2009 09:02:39 -0500 Subject: [Numpy-discussion] Help using numpy.distutils.fcompiler for my GSoC project In-Reply-To: <5b8d13220906232017t2ba05467n3cc460ead2ebf750@mail.gmail.com> References: <5b8d13220906232017t2ba05467n3cc460ead2ebf750@mail.gmail.com> Message-ID: On Tue, Jun 23, 2009 at 10:17 PM, David Cournapeau wrote: > If possible, you should not build executables, it is not portable. > Compiling and linking is Ok, running is not. For a tool which is aimed > a general use, I think this is important. Knowing the exact tests > needed by the OP would help me to give more detailed advices. Hmmm. Thanks for the input. Ironically, the reason we're building the executable is for portability of the interoperable types. By running the genconfig program it guarantees that we get the correct C type <-> Fortran type correspondence set up. This is especially desirable given that compiler flags can change the size of some datatypes, which would be captured correctly by the genconfig program's output -- if everything goes as planned ;-) We'd like to make it so that any fortran procedure can be wrapped without having to modify the kind type parameters of the arguments. For clarity: is it the actual steps to run the executable that isn't portable (spawning, etc)? Or is the problem in compiling the object file to an executable? Would it be possible to compile the executable, have it run on those systems that 'work' -- which would hopefully be any Unix-flavor system, and have the user do it manually otherwise? That's suboptimal to say the least, but possibly worth it for the benefits of complete type interoperability. FYI, the genconfig.f95 file is completely self-contained, i.e. we just need to do, essentially, $ fort-compiler genconfig.f95 -o genconfig && ./genconfig (adapted to the platform, of course). This would need to be run before compiling the extension module. Is it possible to make this portable? Thanks again, Kurt From david at ar.media.kyoto-u.ac.jp Wed Jun 24 10:01:55 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Wed, 24 Jun 2009 23:01:55 +0900 Subject: [Numpy-discussion] MKL Path error on Cygwin after installing on windows. In-Reply-To: References: <4A3F882C.3050904@ar.media.kyoto-u.ac.jp> <4A4096F3.4060906@ar.media.kyoto-u.ac.jp> <4A4197F2.9060804@ar.media.kyoto-u.ac.jp> Message-ID: <4A4231D3.7000907@ar.media.kyoto-u.ac.jp> Sandeep Devadas wrote: > Hi David, > I tried what you told me and Im getting this error.Please > let me know what to do next. What about installing nose :) http://somethingaboutorange.com/mrl/projects/nose/0.11.1/ You need setuptools installed. Note that those are only necessary for testing. cheers, David From david at ar.media.kyoto-u.ac.jp Wed Jun 24 10:05:45 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Wed, 24 Jun 2009 23:05:45 +0900 Subject: [Numpy-discussion] Help using numpy.distutils.fcompiler for my GSoC project In-Reply-To: References: <5b8d13220906232017t2ba05467n3cc460ead2ebf750@mail.gmail.com> Message-ID: <4A4232B9.3040607@ar.media.kyoto-u.ac.jp> Kurt Smith wrote: > On Tue, Jun 23, 2009 at 10:17 PM, David Cournapeau wrote: > > >> If possible, you should not build executables, it is not portable. >> Compiling and linking is Ok, running is not. For a tool which is aimed >> a general use, I think this is important. 
Knowing the exact tests >> needed by the OP would help me to give more detailed advices. >> > > Hmmm. Thanks for the input. > > Ironically, the reason we're building the executable is for > portability of the interoperable types. By running the genconfig > program it guarantees that we get the correct C type <-> Fortran type > correspondence set up. This is especially desirable given that > compiler flags can change the size of some datatypes, which would be > captured correctly by the genconfig program's output -- if everything > goes as planned ;-) We'd like to make it so that any fortran > procedure can be wrapped without having to modify the kind type > parameters of the arguments. > Can't you do this without running executables ? What is not portable is to run executables (because they cannot always run - for example cross compilation on windows). Windows causes quite a headache with recent version of pythons w.r.t running executables if you need to link against the C runtime. > This would need to be run before compiling the extension module. Is > it possible to make this portable? > No. But most of the time, you can test things without running anything. For example, all the type sizeofs are detected by compilation-only with numpy, using some C hackery. What are the exact tests you need to do ? cheers, David From dsdale24 at gmail.com Wed Jun 24 10:52:24 2009 From: dsdale24 at gmail.com (Darren Dale) Date: Wed, 24 Jun 2009 10:52:24 -0400 Subject: [Numpy-discussion] suggestion for generalizing numpy functions In-Reply-To: References: Message-ID: On Wed, Jun 24, 2009 at 9:42 AM, Charles R Harris wrote: > On Wed, Jun 24, 2009 at 7:08 AM, Darren Dale wrote: > > On Wed, May 27, 2009 at 11:30 AM, Darren Dale > wrote: > >> > >> Now that numpy-1.3 has been released, I was hoping I could engage the > >> numpy developers and community concerning my suggestion to improve the > ufunc > >> wrapping mechanism. Currently, ufuncs call, on the way out, the > >> __array_wrap__ method of the input array with the highest > >> __array_priority__. > >> > >> There are use cases, like masked arrays or arrays with units, where it > is > >> imperative to run some code on the way in to the ufunc as well. > MaskedArrays > >> do this by reimplementing or wrapping ufuncs, but this approach puts > some > >> pretty severe constraints on subclassing. For example, in my Quantities > >> package I have a Quantity object that derives from ndarray. It has been > >> suggested that in order to make ufuncs work with Quantity, I should wrap > >> numpy's built-in ufuncs. But I intend to make a MaskedQuantity object as > >> well, deriving from MaskedArray, and would therefore have to wrap the > >> MaskedArray ufuncs as well. > >> > >> If ufuncs would simply call a method both on the way in and on the way > >> out, I think this would go a long way to improving this situation. I > whipped > >> up a simple proof of concept and posted it in this thread a while back. > For > >> example, a MaskedQuantity would implement a method like __gfunc_pre__ to > >> check the validity of the units operation etc, and would then call > >> MaskedArray.__gfunc_pre__ (if defined) to determine the domain etc. > >> __gfunc_pre__ would return a dict containing any metadata the subclasses > >> wish to provide based on the inputs, and that dict would be passed along > >> with the inputs, output and context to __gfunc_post__, so postprocessing > can > >> be done (__gfunc_post__ replacing __array_wrap__). 
> >> > >> Of course, packages like MaskedArray may still wish to reimplement > ufuncs, > >> like Eric Firing is investigating right now. The point is that classes > that > >> dont care about the implementation of ufuncs, that only need to provide > >> metadata based on the inputs and the output, can do so using this > mechanism > >> and can build upon other specialized arrays. > >> > >> I would really appreciate input from numpy developers and other > interested > >> parties. I would like to continue developing the Quantities package this > >> summer, and have been approached by numerous people interested in using > >> Quantities with sage, sympy, matplotlib. But I would prefer to improve > the > >> ufunc mechanism (or establish that there is no interest among the > community > >> to do so) so I can improve the package (or limit its scope) before > making an > >> official announcement. > > > > There was some discussion of this proposal to allow better interaction of > > ufuncs with ndarray subclasses in another thread (Plans for numpy-1.4.0 > and > > scipy-0.8.0) and the comments were encouraging. I have been trying to > gather > > feedback as to whether the numpy devs were receptive to the idea, and it > > seems the answer is tentatively yes, although there were questions about > who > > would actually write the code. I guess I have not made clear that I > intend > > to write the implementation and tests. I gained some familiarity with the > > relevant code while squashing a few bugs for numpy-1.3, but it would be > > helpful if someone else who is familiar with the existing __array_wrap__ > > machinery would be willing to discuss this proposal in more detail and > offer > > constructive criticism along the way. Is anyone willing? > > > > I think Travis would be the only one familiar with that code and that > would be from a couple of years back when he wrote it. Most of us have > followed the same route as yourself, finding our way into the code by > squashing bugs. > > Do you mean that you would require Travis to sign off on the implementation (assuming he would agree to review my work)? I would really like to avoid a situation where I invest the time and then the code bitrots because I can't find a route to committing it to svn. Darren -------------- next part -------------- An HTML attachment was scrubbed... URL: From kwmsmith at gmail.com Wed Jun 24 11:07:53 2009 From: kwmsmith at gmail.com (Kurt Smith) Date: Wed, 24 Jun 2009 10:07:53 -0500 Subject: [Numpy-discussion] Help using numpy.distutils.fcompiler for my GSoC project In-Reply-To: <4A4232B9.3040607@ar.media.kyoto-u.ac.jp> References: <5b8d13220906232017t2ba05467n3cc460ead2ebf750@mail.gmail.com> <4A4232B9.3040607@ar.media.kyoto-u.ac.jp> Message-ID: On Wed, Jun 24, 2009 at 9:05 AM, David Cournapeau wrote: > Kurt Smith wrote: >> On Tue, Jun 23, 2009 at 10:17 PM, David Cournapeau wrote: >> >> >>> If possible, you should not build executables, it is not portable. >>> Compiling and linking is Ok, running is not. For a tool which is aimed >>> a general use, I think this is important. Knowing the exact tests >>> needed by the OP would help me to give more detailed advices. >>> >> >> Hmmm. ?Thanks for the input. >> >> Ironically, the reason we're building the executable is for >> portability of the interoperable types. ?By running the genconfig >> program it guarantees that we get the correct C type <-> Fortran type >> correspondence set up. 
?This is especially desirable given that >> compiler flags can change the size of some datatypes, which would be >> captured correctly by the genconfig program's output -- if everything >> goes as planned ;-) ?We'd like to make it so that any fortran >> procedure can be wrapped without having to modify the kind type >> parameters of the arguments. >> > > Can't you do this without running executables ? What is not portable is > to run executables (because they cannot always run - for example cross > compilation on windows). Windows causes quite a headache with recent > version of pythons w.r.t running executables if you need to link against > the C runtime. What we're attempting to do is similar to Cython. Cython is run with a .pyx file as argument, and outputs a standalone C file that can be shipped and later compiled without any Cython dependence from then onwards. We'd like to have f2cy do the same: it is run with fortran source files as arguments, generates wrappers & various build scripts for convenience (a Makefile, a setup.py, SConstruct file, etc) that are shipped. Later these build files/scripts are used to create the extension module, without any dependence on f2cy. So these wrappers & build scripts must have a way to portably generate the Fortran type <-> C type mappings, portable between compiler and platform. A simple example for illustration: a fortran subroutine to be wrapped: subroutine foo(a) integer(kind=selected_int_kind(10)), intent(inout) :: a ... end subroutine foo The 'selected_int_kind(10)' call might correspond to an actual kind type parameter of 1,2,4 or 8 depending on the fortran compiler -- some compilers have the ktp correspond to the byte size, others just label them sequentially, so no consistent meaning is assignable to the value of the selected_int_kind(10) call. It is impossible beforehand to know if that kind-type-parameter corresponds to a c_int, a c_long, etc, since the correspondence can vary platform to platform and can be changed with compiler flags. Some compilers will raise an error if an incorrect assumption is made. We have to have the genconfig executable run beforehand to ensure that the right correspondence is set-up, adapted to that platform and that compiler. Again, since f2cy isn't around to make sure things are set-up correctly, then we're stuck with running the executable to handle all possible points in the [compiler] x [platform] space. There may be a via media: we could have the user supply the type-mapping information to f2cy, which would generate the wrappers with these mappings assumed. It wouldn't be as portable or general, but that would be the tradeoff. It would avoid running the executable on those platforms that won't allow it. > >> This would need to be run before compiling the extension module. ?Is >> it possible to make this portable? >> > > No. But most of the time, you can test things without running anything. > For example, all the type sizeofs are detected by compilation-only with > numpy, using some C hackery. What are the exact tests you need to do ? I assume the type sizeofs are available for the C types, but not the Fortran types, right? Its the fortran type sizeofs that we would need, which I doubt could be determined in a portable way (see above). 
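To see why the guess cannot simply be hard-coded, note that even on the C side the 8-byte integer that selected_int_kind(10) implies (its range goes well past 2**31) has no fixed name; a quick look with ctypes, whose answers differ from platform to platform:

>>> import ctypes
>>> ctypes.sizeof(ctypes.c_long)      # 8 on LP64 unices, 4 on win64
8
>>> ctypes.sizeof(ctypes.c_longlong)  # 8 essentially everywhere
8

so the same kind may map to c_long on one box and c_long_long on another.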
Thanks again, Kurt From david at ar.media.kyoto-u.ac.jp Wed Jun 24 11:40:40 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Thu, 25 Jun 2009 00:40:40 +0900 Subject: [Numpy-discussion] Help using numpy.distutils.fcompiler for my GSoC project In-Reply-To: References: <5b8d13220906232017t2ba05467n3cc460ead2ebf750@mail.gmail.com> <4A4232B9.3040607@ar.media.kyoto-u.ac.jp> Message-ID: <4A4248F8.3090803@ar.media.kyoto-u.ac.jp> Kurt Smith wrote: > On Wed, Jun 24, 2009 at 9:05 AM, David > Cournapeau wrote: > >> Kurt Smith wrote: >> >>> On Tue, Jun 23, 2009 at 10:17 PM, David Cournapeau wrote: >>> >>> >>> >>>> If possible, you should not build executables, it is not portable. >>>> Compiling and linking is Ok, running is not. For a tool which is aimed >>>> a general use, I think this is important. Knowing the exact tests >>>> needed by the OP would help me to give more detailed advices. >>>> >>>> >>> Hmmm. Thanks for the input. >>> >>> Ironically, the reason we're building the executable is for >>> portability of the interoperable types. By running the genconfig >>> program it guarantees that we get the correct C type <-> Fortran type >>> correspondence set up. This is especially desirable given that >>> compiler flags can change the size of some datatypes, which would be >>> captured correctly by the genconfig program's output -- if everything >>> goes as planned ;-) We'd like to make it so that any fortran >>> procedure can be wrapped without having to modify the kind type >>> parameters of the arguments. >>> >>> >> Can't you do this without running executables ? What is not portable is >> to run executables (because they cannot always run - for example cross >> compilation on windows). Windows causes quite a headache with recent >> version of pythons w.r.t running executables if you need to link against >> the C runtime. >> > > What we're attempting to do is similar to Cython. Cython is run with > a .pyx file as argument, and outputs a standalone C file that can be > shipped and later compiled without any Cython dependence from then > onwards. > > We'd like to have f2cy do the same: it is run with fortran source > files as arguments, generates wrappers & various build scripts for > convenience (a Makefile, a setup.py, SConstruct file, etc) that are > shipped. Later these build files/scripts are used to create the > extension module, without any dependence on f2cy. So these wrappers & > build scripts must have a way to portably generate the Fortran type > <-> C type mappings, portable between compiler and platform. > > A simple example for illustration: > > a fortran subroutine to be wrapped: > > subroutine foo(a) > integer(kind=selected_int_kind(10)), intent(inout) :: a > ... > end subroutine foo > > The 'selected_int_kind(10)' call might correspond to an actual kind > type parameter of 1,2,4 or 8 depending on the fortran compiler -- some > compilers have the ktp correspond to the byte size, others just label > them sequentially, so no consistent meaning is assignable to the value > of the selected_int_kind(10) call. It is impossible beforehand to > know if that kind-type-parameter corresponds to a c_int, a c_long, > etc, since the correspondence can vary platform to platform and can be > changed with compiler flags. Some compilers will raise an error if an > incorrect assumption is made. > > We have to have the genconfig executable run beforehand to ensure that > the right correspondence is set-up, adapted to that platform and that > compiler. 
Again, since f2cy isn't around to make sure things are > set-up correctly, then we're stuck with running the executable to > handle all possible points in the [compiler] x [platform] space. > > There may be a via media: we could have the user supply the > type-mapping information to f2cy, which would generate the wrappers > with these mappings assumed. It wouldn't be as portable or general, > but that would be the tradeoff. It would avoid running the executable > on those platforms that won't allow it. > > >>> This would need to be run before compiling the extension module. Is >>> it possible to make this portable? >>> >>> >> No. But most of the time, you can test things without running anything. >> For example, all the type sizeofs are detected by compilation-only with >> numpy, using some C hackery. What are the exact tests you need to do ? >> > > I assume the type sizeofs are available for the C types The configure checks are specific to C, but maybe they can be adapted for fortran (I don't know enough fortran to be sure, though). The code can be found here: http://projects.scipy.org/numpy/browser/trunk/numpy/distutils/command/config.py (line 169) The idea is simple: you use sizeof(type) + a shift as an index of an array, and rely on the compiler to fail if the index is negative. cheers, David From sccolbert at gmail.com Wed Jun 24 12:06:12 2009 From: sccolbert at gmail.com (Chris Colbert) Date: Wed, 24 Jun 2009 12:06:12 -0400 Subject: [Numpy-discussion] convolve optimisation... In-Reply-To: <4A422308.2070304@gmail.com> References: <4A421C62.9050502@gmail.com> <4A421D10.7090906@gmail.com> <9457e7c80906240540m67aa4918oec84b4231fc5fa05@mail.gmail.com> <4A422308.2070304@gmail.com> Message-ID: <7f014ea60906240906g3f913fb9lb00cbdcac37394c0@mail.gmail.com> do you mean that the values in the kernel depends on the kernels position relative to the data to be convolved, or that the kernel is not composed of homogeneous values but otherwise does not change as it is slid around the source data? If the case is the latter, you may be better off doing the convolution in the fourier domain. On Wed, Jun 24, 2009 at 8:58 AM, fred wrote: > St?fan van der Walt a ?crit : > >> If your kernel varies with i and j, you have little choice but to do >> this at the C level. >> >> Have a look at the Cython convolution example here: > Thanks. > I'm looking at it. > >> Alternatively, David Cournapeau can take this opportunity to >> illustrate his very nifty Neighbour Iterator. > Ok. > > Cheers, > > -- > Fred > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From charlesr.harris at gmail.com Wed Jun 24 15:37:40 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 24 Jun 2009 13:37:40 -0600 Subject: [Numpy-discussion] suggestion for generalizing numpy functions In-Reply-To: References: Message-ID: On Wed, Jun 24, 2009 at 8:52 AM, Darren Dale wrote: > On Wed, Jun 24, 2009 at 9:42 AM, Charles R Harris > wrote: >> >> On Wed, Jun 24, 2009 at 7:08 AM, Darren Dale wrote: >> > On Wed, May 27, 2009 at 11:30 AM, Darren Dale >> > wrote: >> >> >> >> Now that numpy-1.3 has been released, I was hoping I could engage the >> >> numpy developers and community concerning my suggestion to improve the >> >> ufunc >> >> wrapping mechanism. Currently, ufuncs call, on the way out, the >> >> __array_wrap__ method of the input array with the highest >> >> __array_priority__. 
>> >> >> >> There are use cases, like masked arrays or arrays with units, where it >> >> is >> >> imperative to run some code on the way in to the ufunc as well. >> >> MaskedArrays >> >> do this by reimplementing or wrapping ufuncs, but this approach puts >> >> some >> >> pretty severe constraints on subclassing. For example, in my Quantities >> >> package I have a Quantity object that derives from ndarray. It has been >> >> suggested that in order to make ufuncs work with Quantity, I should >> >> wrap >> >> numpy's built-in ufuncs. But I intend to make a MaskedQuantity object >> >> as >> >> well, deriving from MaskedArray, and would therefore have to wrap the >> >> MaskedArray ufuncs as well. >> >> >> >> If ufuncs would simply call a method both on the way in and on the way >> >> out, I think this would go a long way to improving this situation. I >> >> whipped >> >> up a simple proof of concept and posted it in this thread a while back. >> >> For >> >> example, a MaskedQuantity would implement a method like __gfunc_pre__ >> >> to >> >> check the validity of the units operation etc, and would then call >> >> MaskedArray.__gfunc_pre__ (if defined) to determine the domain etc. >> >> __gfunc_pre__ would return a dict containing any metadata the >> >> subclasses >> >> wish to provide based on the inputs, and that dict would be passed >> >> along >> >> with the inputs, output and context to __gfunc_post__, so >> >> postprocessing can >> >> be done (__gfunc_post__ replacing __array_wrap__). >> >> >> >> Of course, packages like MaskedArray may still wish to reimplement >> >> ufuncs, >> >> like Eric Firing is investigating right now. The point is that classes >> >> that >> >> dont care about the implementation of ufuncs, that only need to provide >> >> metadata based on the inputs and the output, can do so using this >> >> mechanism >> >> and can build upon other specialized arrays. >> >> >> >> I would really appreciate input from numpy developers and other >> >> interested >> >> parties. I would like to continue developing the Quantities package >> >> this >> >> summer, and have been approached by numerous people interested in using >> >> Quantities with sage, sympy, matplotlib. But I would prefer to improve >> >> the >> >> ufunc mechanism (or establish that there is no interest among the >> >> community >> >> to do so) so I can improve the package (or limit its scope) before >> >> making an >> >> official announcement. >> > >> > There was some discussion of this proposal to allow better interaction >> > of >> > ufuncs with ndarray subclasses in another thread (Plans for numpy-1.4.0 >> > and >> > scipy-0.8.0) and the comments were encouraging. I have been trying to >> > gather >> > feedback as to whether the numpy devs were receptive to the idea, and it >> > seems the answer is tentatively yes, although there were questions about >> > who >> > would actually write the code. I guess I have not made clear that I >> > intend >> > to write the implementation and tests. I gained some familiarity with >> > the >> > relevant code while squashing a few bugs for numpy-1.3, but it would be >> > helpful if someone else who is familiar with the existing __array_wrap__ >> > machinery would be willing to discuss this proposal in more detail and >> > offer >> > constructive criticism along the way. Is anyone willing? >> > >> >> I think Travis would be the only one familiar with that code and that >> would be from a couple of years back when he wrote it. 
From dsdale24 at gmail.com Wed Jun 24 15:49:07 2009
From: dsdale24 at gmail.com (Darren Dale)
Date: Wed, 24 Jun 2009 15:49:07 -0400
Subject: [Numpy-discussion] suggestion for generalizing numpy functions
In-Reply-To:
References:
Message-ID:

On Wed, Jun 24, 2009 at 3:37 PM, Charles R Harris wrote:
> [earlier discussion of the __gfunc_pre__/__gfunc_post__ proposal snipped]
>
> No, just that Travis would know the most about that subsystem if you
> are looking for help. I and others here can look over the code and
> commit it without Travis signing off on it. You could ask for commit
> privileges yourself. The important thing is having some tests and an
> agreement that the interface is appropriate. Pierre also seems
> interested in the functionality so it would be useful for him to say
> that it serves his needs also.

Ok, I'll start working on it then. Any idea what you are targeting for
numpy-1.4? Scipy-2009, or much earlier? I'd like to gauge how to budget
my time.

Darren
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From charlesr.harris at gmail.com Wed Jun 24 16:08:59 2009
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Wed, 24 Jun 2009 14:08:59 -0600
Subject: [Numpy-discussion] suggestion for generalizing numpy functions
In-Reply-To:
References:
Message-ID:

On Wed, Jun 24, 2009 at 1:49 PM, Darren Dale wrote:
> [earlier discussion snipped]
>
> Ok, I'll start working on it then. Any idea what you are targeting for
> numpy-1.4? Scipy-2009, or much earlier? I'd like to gauge how to budget
> my time.

The timeline is open for discussion. A six month timeline would put it
sometime in November but David might want it earlier for scipy 0.8. My
guess would be sometime after Scipy-2009, late September at the
earliest. But as I say, it is open for discussion. What schedule would
you prefer?

Chuck
From christopher.e.kees at usace.army.mil Wed Jun 24 16:14:52 2009
From: christopher.e.kees at usace.army.mil (Chris Kees)
Date: Wed, 24 Jun 2009 15:14:52 -0500
Subject: [Numpy-discussion] latex equations in docstrings
Message-ID: <3EE15B23-A11E-418D-9819-C8408B33FA28@usace.army.mil>

Hi,

Apologies if I sent two copies of this message to the list. If I'm
using the math directive in my docstrings following the scipy
guidelines (http://projects.scipy.org/numpy/wiki/CodingStyleGuidelines),
what tool do I need to use in order to generate html with properly
formatted equations? I read that Sphinx is not primarily intended for
API docs, but using epydoc I can't seem to get the equations formatted.
Are you guys using Sphinx on the numpy/scipy source?

Chris
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
From robert.kern at gmail.com Wed Jun 24 16:36:58 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Wed, 24 Jun 2009 15:36:58 -0500
Subject: [Numpy-discussion] latex equations in docstrings
In-Reply-To: <3EE15B23-A11E-418D-9819-C8408B33FA28@usace.army.mil>
References: <3EE15B23-A11E-418D-9819-C8408B33FA28@usace.army.mil>
Message-ID: <3d375d730906241336u2521d99dp2223e3d053203932@mail.gmail.com>

On Wed, Jun 24, 2009 at 15:14, Chris Kees wrote:
> Hi,
>
> Apologies if I sent two copies of this message to the list. If I'm using
> the math directive in my docstrings following the scipy guidelines
> (http://projects.scipy.org/numpy/wiki/CodingStyleGuidelines), what tool do
> I need to use in order to generate html with properly formatted equations?
> I read that Sphinx is not primarily intended for API docs, but using epydoc
> I can't seem to get the equations formatted. Are you guys using Sphinx on
> the numpy/scipy source?

Yes. The HOWTO_BUILD_DOCS.txt is unfortunately out of date.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco
From dsdale24 at gmail.com Wed Jun 24 16:37:23 2009
From: dsdale24 at gmail.com (Darren Dale)
Date: Wed, 24 Jun 2009 16:37:23 -0400
Subject: [Numpy-discussion] suggestion for generalizing numpy functions
In-Reply-To:
References:
Message-ID:

On Wed, Jun 24, 2009 at 4:08 PM, Charles R Harris wrote:
> [earlier discussion snipped]
>
> The timeline is open for discussion. A six month timeline would put it
> sometime in November but David might want it earlier for scipy 0.8. My
> guess would be sometime after Scipy-2009, late September at the
> earliest. But as I say, it is open for discussion. What schedule would
> you prefer?

I guess I'd like a shot at submitting this in time for 1.4, but I
wouldn't want to hold up the release. Late September should provide
plenty of time.

Darren
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From pav at iki.fi Wed Jun 24 17:26:22 2009
From: pav at iki.fi (Pauli Virtanen)
Date: Wed, 24 Jun 2009 21:26:22 +0000 (UTC)
Subject: [Numpy-discussion] latex equations in docstrings
References: <3EE15B23-A11E-418D-9819-C8408B33FA28@usace.army.mil>
	<3d375d730906241336u2521d99dp2223e3d053203932@mail.gmail.com>
Message-ID:

On 2009-06-24, Robert Kern wrote:
[clip]
> Yes. The HOWTO_BUILD_DOCS.txt is unfortunately out of date.

So it is. Rewritten.

-- 
Pauli Virtanen

From joschu at caltech.edu Thu Jun 25 09:43:58 2009
From: joschu at caltech.edu (John Schulman)
Date: Thu, 25 Jun 2009 09:43:58 -0400
Subject: [Numpy-discussion] switching to float32
Message-ID: <185761440906250643j68423b5ctd80367e856027945@mail.gmail.com>

I'm trying to reduce the memory used in a calculation, so I'd like to
switch my program to float32 instead of float64. Is it possible to
change the numpy default float size, so I don't have to explicitly
state dtype=np.float32 everywhere?

Thanks,
John

From gely at usc.edu Thu Jun 25 11:40:47 2009
From: gely at usc.edu (Geoffrey Ely)
Date: Thu, 25 Jun 2009 08:40:47 -0700
Subject: [Numpy-discussion] switching to float32
In-Reply-To: <185761440906250643j68423b5ctd80367e856027945@mail.gmail.com>
References: <185761440906250643j68423b5ctd80367e856027945@mail.gmail.com>
Message-ID:

This does not exactly answer your question, but you can use the dtype
string representation and the positional parameter to make things
nicer. For example:

a = numpy.array( [1.0, 2.0, 3.0], 'f' )

instead of

a = numpy.array( [1.0, 2.0, 3.0], dtype=numpy.float32 )

-Geoff

On Jun 25, 2009, at 6:43 AM, John Schulman wrote:
> I'm trying to reduce the memory used in a calculation, so I'd like to
> switch my program to float32 instead of float64. Is it possible to
> change the numpy default float size, so I don't have to explicitly
> state dtype=np.float32 everywhere?
>
> Thanks,
> John
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion

From magawake at gmail.com Thu Jun 25 20:59:06 2009
From: magawake at gmail.com (Mag Gam)
Date: Thu, 25 Jun 2009 20:59:06 -0400
Subject: [Numpy-discussion] loading data
Message-ID: <1cbd6f830906251759l1d473d66t3a81e7992d533669@mail.gmail.com>

Hello.

I am very new to NumPy and Python. We are doing some research in our
Physics lab and we need to store massive amounts of data (100GB
daily).
I am therefore going to use hdf5 and h5py. The problem is I
am using np.loadtxt() to create my array and create a dataset
according to that. np.loadtxt() is reading a file which is about 50GB.
This takes a very long time! I was wondering if there was a much
easier and better way of doing this.

TIA

From peridot.faceted at gmail.com Thu Jun 25 21:50:28 2009
From: peridot.faceted at gmail.com (Anne Archibald)
Date: Thu, 25 Jun 2009 21:50:28 -0400
Subject: [Numpy-discussion] loading data
In-Reply-To: <1cbd6f830906251759l1d473d66t3a81e7992d533669@mail.gmail.com>
References: <1cbd6f830906251759l1d473d66t3a81e7992d533669@mail.gmail.com>
Message-ID:

2009/6/25 Mag Gam:
> Hello.
>
> I am very new to NumPy and Python. We are doing some research in our
> Physics lab and we need to store massive amounts of data (100GB
> daily). I am therefore going to use hdf5 and h5py. The problem is I
> am using np.loadtxt() to create my array and create a dataset
> according to that. np.loadtxt() is reading a file which is about 50GB.
> This takes a very long time! I was wondering if there was a much
> easier and better way of doing this.

If you are stuck with the text format, you probably can't beat
numpy.loadtxt(); reading a 50 GB text file is going to be slow no
matter how you cut it. So I would take a look at the code that
generates the text file, and see if there's any way you can make it
generate a format that is faster to read. (I assume the code is in C
or FORTRAN and you'd rather not mess with it more than necessary.)

Of course, generating hdf5 directly is probably fastest; you might
look at the C and FORTRAN hdf5 libraries and see how hard it would be
to integrate them into the code that currently generates a text file.
Even if you need to have a python script to gather the data and add
metadata, hdf5 will be much much more efficient than text files as an
intermediate format.

If integrating HDF5 into the generating application is too difficult,
you can try simply generating a binary format. Using numpy's
structured data types, it is possible to read in binary files
extremely efficiently. If you're using the same architecture to
generate the files as read them, you can just write out raw binary
arrays of floats or doubles and then read them into numpy. I think
FORTRAN also has a semi-standard padded binary format which isn't too
difficult to read either. You could even use numpy's native file
format, which for a single array should be pretty straightforward, and
should yield portable results.

If you really can't modify the code that generates the text files,
your code is going to be slow. But you might be able to make it
slightly less slow. If, for example, the text files are a very
specific format, especially if they're made up of columns of fixed
width, it would be possible to write compiled code to read them
slightly more quickly. (The very easiest way to do this is to write a
little C program that reads the text files and writes out a slightly
friendlier format, as above.) But you may well find that simply
reading a 50 GB file dominates your run time, which would mean that
you're stuck with slowness.

In short: avoid text files if at all possible.

Good luck,
Anne

> TIA
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
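[A minimal editorial sketch of the raw-binary route Anne describes,
using a structured dtype. The field names and the file name are
invented for illustration; the explicit '<f8' (little-endian float64)
type strings pin the byte order down so the files stay portable:]

    import numpy as np

    # Hypothetical record layout for one sample.
    dt = np.dtype([('t', '<f8'), ('x', '<f8'), ('y', '<f8')])

    data = np.zeros(1000, dtype=dt)
    data.tofile('samples.bin')            # headerless raw binary

    back = np.fromfile('samples.bin', dtype=dt)  # reads at disk speed
    # back['x'] etc. are now ordinary float64 columns.

For the single-array case, np.save()/np.load() (numpy's native .npy
format) add a small self-describing header on top of the same raw
layout.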
From nmb at wartburg.edu Thu Jun 25 21:35:29 2009
From: nmb at wartburg.edu (Neil Martinsen-Burrell)
Date: Thu, 25 Jun 2009 20:35:29 -0500 (CDT)
Subject: [Numpy-discussion] loading data
In-Reply-To: <1cbd6f830906251759l1d473d66t3a81e7992d533669@mail.gmail.com>
References: <1cbd6f830906251759l1d473d66t3a81e7992d533669@mail.gmail.com>
Message-ID: <49556.216.203.97.234.1245980129.squirrel@mcsp.wartburg.edu>

On Thu, June 25, 2009 7:59 pm, Mag Gam wrote:
> I am very new to NumPy and Python. We are doing some research in our
> Physics lab and we need to store massive amounts of data (100GB
> daily). I am therefore going to use hdf5 and h5py. The problem is I
> am using np.loadtxt() to create my array and create a dataset
> according to that. np.loadtxt() is reading a file which is about 50GB.
> This takes a very long time! I was wondering if there was a much
> easier and better way of doing this.

50 GB is a *lot* of data to read from a disk into memory (if you really
do have that much memory). A magnetic hard drive can read less than 150
MB/s, so just to read the blocks off the disk would take over 5 minutes.
np.loadtxt has additional processing on top of that. I think you may be
interested in PyTables (www.pytables.org) or np.memmap, although since
you have already settled on HDF5, PyTables would be a natural choice,
since it can process on-disk datasets as if they were NumPy arrays
(which might be nice if you don't have all 50GB of memory).

-Neil

From ckkart at hoc.net Fri Jun 26 04:39:37 2009
From: ckkart at hoc.net (Christian K.)
Date: Fri, 26 Jun 2009 08:39:37 +0000 (UTC)
Subject: [Numpy-discussion] switching to float32
References: <185761440906250643j68423b5ctd80367e856027945@mail.gmail.com>
Message-ID:

John Schulman <joschu at caltech.edu> writes:
>
> I'm trying to reduce the memory used in a calculation, so I'd like to
> switch my program to float32 instead of float64. Is it possible to
> change the numpy default float size, so I don't have to explicitly
> state dtype=np.float32 everywhere?

Possibly not the nicest way, but

np.float64 = np.float32

somewhere at the beginning should work.

Christian

From magawake at gmail.com Fri Jun 26 06:38:11 2009
From: magawake at gmail.com (Mag Gam)
Date: Fri, 26 Jun 2009 06:38:11 -0400
Subject: [Numpy-discussion] loading data
In-Reply-To:
References: <1cbd6f830906251759l1d473d66t3a81e7992d533669@mail.gmail.com>
Message-ID: <1cbd6f830906260338r71722882yd67fa2ade0d2fbbe@mail.gmail.com>

Thanks everyone for the great and well thought out responses!

To make matters worse, this is actually a 50GB compressed csv file. So
it looks like this, 2009.06.01.plasmasub.csv.gz
We get this data from another lab on the West Coast every night,
therefore I don't have the option to have this file natively in hdf5.
We are sticking with hdf5 because we have other applications that use
this data and we wanted to standardize on hdf5.

Since my file is in csv, would it be better for me to create a tsv
file temporarily and have np.loadtxt read that?

Also, I am curious about Neil's np.memmap. Do you have some sample
code for mapping a compressed csv file into memory and loading the
dataset into a dset (hdf5 structure)?

TIA

On Thu, Jun 25, 2009 at 9:50 PM, Anne Archibald wrote:
> [Anne's advice on avoiding text files snipped]

From faltet at pytables.org Fri Jun 26 07:05:58 2009
From: faltet at pytables.org (Francesc Alted)
Date: Fri, 26 Jun 2009 13:05:58 +0200
Subject: [Numpy-discussion] loading data
In-Reply-To: <1cbd6f830906260338r71722882yd67fa2ade0d2fbbe@mail.gmail.com>
References: <1cbd6f830906251759l1d473d66t3a81e7992d533669@mail.gmail.com>
	<1cbd6f830906260338r71722882yd67fa2ade0d2fbbe@mail.gmail.com>
Message-ID: <200906261306.01110.faltet@pytables.org>

On Friday 26 June 2009 12:38:11, Mag Gam wrote:
> Thanks everyone for the great and well thought out responses!
>
> To make matters worse, this is actually a 50GB compressed csv file. So
> it looks like this, 2009.06.01.plasmasub.csv.gz
> We get this data from another lab on the West Coast every night,
> therefore I don't have the option to have this file natively in hdf5.
> We are sticking with hdf5 because we have other applications that use
> this data and we wanted to standardize on hdf5.

Well, since you are adopting HDF5, the best solution is that the West
Coast lab would send the file directly in HDF5. That will save you a
lot of headaches. If this is not possible, then I think the best would
be that you do some profiles in your code and see where the bottleneck
is. Using cProfile normally offers a good insight on what's consuming
more time in your converter.

There are three most probable hot spots: the decompressor (gzip) time,
np.loadtxt and the HDF5 writer function. If the problem is gzip, then
you won't be able to accelerate the conversion unless the other lab is
willing to use a lighter compressor (lzop, for example). If it is
np.loadtxt(), then you should ask yourself if you are trying to load
everything in memory; if you are, don't do that; just try to load &
write slice by slice. Finally, if the problem is on the HDF5 write, try
to write array slices (and not record-by-record writes).

> Also, I am curious about Neil's np.memmap. Do you have some sample
> code for mapping a compressed csv file into memory and loading the
> dataset into a dset (hdf5 structure)?

No, np.memmap is meant to map *uncompressed binary* files in memory, so
you can't follow this path.

-- 
Francesc Alted

From magawake at gmail.com Fri Jun 26 07:09:13 2009
From: magawake at gmail.com (Mag Gam)
Date: Fri, 26 Jun 2009 07:09:13 -0400
Subject: [Numpy-discussion] loading data
In-Reply-To: <200906261306.01110.faltet@pytables.org>
References: <1cbd6f830906251759l1d473d66t3a81e7992d533669@mail.gmail.com>
	<1cbd6f830906260338r71722882yd67fa2ade0d2fbbe@mail.gmail.com>
	<200906261306.01110.faltet@pytables.org>
Message-ID: <1cbd6f830906260409v4baca90dpd5910858c7140218@mail.gmail.com>

I really like the slice by slice idea! But I don't know how to
implement the code. Do you have any sample code? I suspect it's the
writing portion that's taking the longest. I did a simple decompress
test and it's fast.

On Fri, Jun 26, 2009 at 7:05 AM, Francesc Alted wrote:
> [profiling suggestions snipped]
From faltet at pytables.org Fri Jun 26 07:31:40 2009
From: faltet at pytables.org (Francesc Alted)
Date: Fri, 26 Jun 2009 13:31:40 +0200
Subject: [Numpy-discussion] loading data
In-Reply-To: <1cbd6f830906260409v4baca90dpd5910858c7140218@mail.gmail.com>
References: <1cbd6f830906251759l1d473d66t3a81e7992d533669@mail.gmail.com>
	<200906261306.01110.faltet@pytables.org>
	<1cbd6f830906260409v4baca90dpd5910858c7140218@mail.gmail.com>
Message-ID: <200906261331.40299.faltet@pytables.org>

On Friday 26 June 2009 13:09:13, Mag Gam wrote:
> I really like the slice by slice idea!

Hmm, after looking at the np.loadtxt() docstrings it seems it works by
loading the complete file at once, so you shouldn't use this directly
(unless you split your big file before, but this will take time too).
So, I'd say that your best bet would be to use Python's `csv.reader()`
iterator to iterate over the lines in your file and set up a buffer (a
NumPy array/recarray would be fine), so that when the buffer is full it
is written to the HDF5 file. That should be pretty optimal.

With this you will not try to load the entire file into memory, which
is what I think is probably killing the performance in your case
(unless your machine has much more memory than 50 GB, that is).

-- 
Francesc Alted

From magawake at gmail.com Fri Jun 26 07:46:13 2009
From: magawake at gmail.com (Mag Gam)
Date: Fri, 26 Jun 2009 07:46:13 -0400
Subject: [Numpy-discussion] loading data
In-Reply-To: <200906261331.40299.faltet@pytables.org>
References: <1cbd6f830906251759l1d473d66t3a81e7992d533669@mail.gmail.com>
	<200906261306.01110.faltet@pytables.org>
	<1cbd6f830906260409v4baca90dpd5910858c7140218@mail.gmail.com>
	<200906261331.40299.faltet@pytables.org>
Message-ID: <1cbd6f830906260446l7e4c1efdq826ddb72a7dc59cf@mail.gmail.com>

Yes, you are correct!

I think this is the best path.

However, I need to learn how to append to an hdf5 dataset. I looked at
this, http://code.google.com/p/h5py/wiki/FAQ#Appending_data_to_a_dataset
but was not able to do so. Do you happen to have any sample code for
this, if you have used hdf5?

On Fri, Jun 26, 2009 at 7:31 AM, Francesc Alted wrote:
> [csv.reader() and buffering suggestion snipped]
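[To make the buffered approach concrete, here is an editorial sketch
that combines Francesc's csv.reader() suggestion with the resizable-
dataset pattern from the h5py FAQ page linked above. The three float
columns are a made-up placeholder -- the real layout of the plasmasub
file is not shown in the thread -- so adjust the dtype to match:]

    import csv
    import gzip
    import numpy as np
    import h5py

    dt = np.dtype([('x', '<f8'), ('y', '<f8'), ('z', '<f8')])

    with h5py.File('plasmasub.h5', 'w') as f:
        # Start empty, but allow the first axis to grow without bound.
        dset = f.create_dataset('data', shape=(0,), maxshape=(None,),
                                dtype=dt, chunks=True)
        buf = np.empty(100000, dtype=dt)    # fixed-size write buffer
        n = 0
        with gzip.open('2009.06.01.plasmasub.csv.gz', 'rt') as lines:
            for row in csv.reader(lines):
                buf[n] = tuple(float(v) for v in row)
                n += 1
                if n == len(buf):           # buffer full: grow and write
                    dset.resize(dset.shape[0] + n, axis=0)
                    dset[-n:] = buf
                    n = 0
        if n:                               # flush the final partial buffer
            dset.resize(dset.shape[0] + n, axis=0)
            dset[-n:] = buf[:n]

The point of the buffer is that each resize()/slice assignment writes
~100,000 rows in one go, instead of touching the file once per record.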
From faltet at pytables.org Fri Jun 26 08:00:17 2009
From: faltet at pytables.org (Francesc Alted)
Date: Fri, 26 Jun 2009 14:00:17 +0200
Subject: [Numpy-discussion] loading data
In-Reply-To: <1cbd6f830906260446l7e4c1efdq826ddb72a7dc59cf@mail.gmail.com>
References: <1cbd6f830906251759l1d473d66t3a81e7992d533669@mail.gmail.com>
	<200906261331.40299.faltet@pytables.org>
	<1cbd6f830906260446l7e4c1efdq826ddb72a7dc59cf@mail.gmail.com>
Message-ID: <200906261400.17889.faltet@pytables.org>

On Friday 26 June 2009 13:46:13, Mag Gam wrote:
> Yes, you are correct!
>
> I think this is the best path.
>
> However, I need to learn how to append to an hdf5 dataset. I looked at
> this, http://code.google.com/p/h5py/wiki/FAQ#Appending_data_to_a_dataset
> but was not able to do so. Do you happen to have any sample code for
> this, if you have used hdf5?

Well, by looking at the docs, it seems just a matter of using
`.resize()` and then a traditional assignment (à la arr[row1:row2] =
slice). But I'd recommend you to not run too fast and read the
documentation carefully: you will notice that, in the end, you will be
far more productive, trust me ;-)

-- 
Francesc Alted

From dyamins at gmail.com Fri Jun 26 14:51:37 2009
From: dyamins at gmail.com (Dan Yamins)
Date: Fri, 26 Jun 2009 14:51:37 -0400
Subject: [Numpy-discussion] Rec array: numpy.rec vs numpy.array with
	complex dtype
Message-ID: <15e4667e0906261151w7f6a84a2rd7ee42e3265ccd77@mail.gmail.com>

Dear Numpy list:

We've been using the numpy.rec classes to make record array objects.

We've noticed that in more recent versions of numpy, record-array-like
objects can be made directly with the numpy.ndarray class, by passing a
complex data type. However, it looks like the numpy.rec class is still
supported.

So, we have a couple of questions:

1) Which is the preferred way to make a record array: numpy.rec, or
numpy.ndarray with a complex data type? A somewhat detailed explanation
of the comparative properties would be great. (We know it's buried
somewhere in the documentation ... sorry for being lazy!)

2) The individual "records" in a numpy.rec array have the
"numpy.record" type. The individual records in the numpy.array approach
have "numpy.void" type. Can you tell us a little about how these
differ, and what the advantages of one vs. the other are?

3) We've heard talk about "complex data types" in numpy in general. Is
there some good place we can read about this more extensively?
Also: one thing we use and like about the numpy.rec constructors is
that they can take a "names" argument, and the constructor function
does some inferring about what the formats you want are, e.g.:

img = numpy.rec.fromrecords([(0,1,'a'),(2,0,'b')], names = ['A','B','C'])

produces:

rec.array([(0, 1, 'a'), (2, 0, 'b')],
      dtype=[('A', '<i8'), ('B', '<i8'), ('C', '|S1')])

This is very convenient.

My immediate guess for the equivalent thing with the numpy.ndarray
approach:

img = numpy.array([(0,1,'a'),(2,0,'b')], names = ['A','B','C'])

does not work. Is there some syntax for doing this?

thanks,
Dan

From pgmdevlist at gmail.com Fri Jun 26 15:16:51 2009
From: pgmdevlist at gmail.com (Pierre GM)
Date: Fri, 26 Jun 2009 15:16:51 -0400
Subject: [Numpy-discussion] Rec array: numpy.rec vs numpy.array with
	complex dtype
In-Reply-To: <15e4667e0906261151w7f6a84a2rd7ee42e3265ccd77@mail.gmail.com>
Message-ID: <6A78A032-6B93-4E81-91B0-B8FA1DD762C8@gmail.com>

On Jun 26, 2009, at 2:51 PM, Dan Yamins wrote:
>
> We've been using the numpy.rec classes to make record array objects.
>
> We've noticed that in more recent versions of numpy, record-array-like
> objects can be made directly with the numpy.ndarray class, by passing
> a complex data type.

Hasn't it always been the case?

> However, it looks like the numpy.rec class is still supported.
>
> So, we have a couple of questions:
>
> 1) Which is the preferred way to make a record array: numpy.rec, or
> numpy.ndarray with a complex data type? A somewhat detailed
> explanation of the comparative properties would be great. (We know
> it's buried somewhere in the documentation ... sorry for being lazy!)

Short answer:
a np.recarray is a subclass of ndarray with structured dtype, where
fields can be accessed as attributes (as in 'yourarray.yourfield')
instead of as items (as in yourarray['yourfield']).
Under the hood, that means that the __getattribute__ method (and the
corresponding __setattr__) had to be overloaded (you need to check
whether an attribute is a field or not), which slows things down
compared to a standard ndarray.

My favorite way to get a np.recarray is to define a standard ndarray
w/ complex dtype, and then take a view as a recarray.
Example:
>>> np.array([(1,10),(2,20)], dtype=[('a',int), ('b',int)]).view(np.recarray)
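[A quick editorial illustration of the difference Pierre is describing;
the print syntax here is modern Python 3:]

    import numpy as np

    x = np.array([(1, 10), (2, 20)], dtype=[('a', int), ('b', int)])
    rx = x.view(np.recarray)

    print(x['a'])   # item access works on both: [1 2]
    print(rx.a)     # attribute access only works on the recarray view: [1 2]

The view costs nothing to create; only attribute lookups on rx pay the
extra __getattribute__ check.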
You have to construct your dtype explicitly, as in "dtype=[('A', ' References: <185761440906250643j68423b5ctd80367e856027945@mail.gmail.com> Message-ID: <3d375d730906261226t67d92d6asa606b496892aea62@mail.gmail.com> On Fri, Jun 26, 2009 at 03:39, Christian K. wrote: > John Schulman caltech.edu> writes: > >> >> I'm trying to reduce the memory used in a calculation, so I'd like to >> switch my program to float32 instead of float64. Is it possible to >> change the numpy default float size, so I don't have to explicitly >> state dtype=np.float32 everywhere? > > Possibly not the nicest way, but > > np.float64 = np.float32 > > somewhere at the beginning should work. No. There is no way to change the default dtype of ones(), zeros(), etc. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From dyamins at gmail.com Fri Jun 26 15:59:57 2009 From: dyamins at gmail.com (Dan Yamins) Date: Fri, 26 Jun 2009 15:59:57 -0400 Subject: [Numpy-discussion] Rec array: numpy.rec vs numpy.array with complex dtype In-Reply-To: <6A78A032-6B93-4E81-91B0-B8FA1DD762C8@gmail.com> References: <15e4667e0906261151w7f6a84a2rd7ee42e3265ccd77@mail.gmail.com> <6A78A032-6B93-4E81-91B0-B8FA1DD762C8@gmail.com> Message-ID: <15e4667e0906261259x14fee942s1917a208b231e245@mail.gmail.com> Pierre, thanks for your response. I have some follow up questions. Short answer: > a np.recarray is a subclass of ndarray with structured dtype, where > fields can be accessed has attributes (as in 'yourarray.yourfield') > instead of as items (as in yourarray['yourfield']). Is this the only substantial thing added in the recarray class? The fact that you can access some fields via attribute notation? We haven't been using this feature anyhow ... (what happens with the field names have spaces?) Is the recarray class still being developed actively? My favorite way to get a np.recarray is to define a standard ndarray > w/ complex dtype, and then take a view as a recarray > Example: > >>> np.array([(1,10),(2,20)],dtype=[('a',int), > ('b',int)]).view(np.recarray) Is the purpose of this basically to use the property of recarrays of accessing fields as attributes? Or do you have other reasons why you like this view? > Mmh: > >>> x = np.array([(1,10),(2,20)],dtype=[('a',int),('b',int)]) > >>> rx = x.view(np.recarray) > >>> type(x[0]) > > >>> type(rx[0]) > > In [18]: x = np.rec.fromrecords([(0,1,'a'),(2,0,'b')],names = ['A','B']) In [19]: x[0] Out[19]: (0, 1, 'a') In [20]: type(x[0]) Out[20]: In [21]: np.version.version Out[21]: '1.3.0' > 3) We've heard talk about "complex data types" in numpy in general. > Is there some good place we can read about this more extensively? I think the proper term is 'structured data type', or 'structured > array'. > Do you recommend a place we can learn about the interesting things one can do with structured data types? Or is the on-line documentation on the scipy site the best as of now? > > > > You have to construct your dtype explicitly, as in "dtype=[('A', > ' np.rec.fromrecords processes the array and try to guess the best type > for each field, but it's slow and not always correct Evidently. But sometimes (in fact, a lot of times, in our particular applications), the type inference works fine and the slowdown is not large enough to be noticeable. 
And of course in the recarray constructors one can override the type
inference by including a 'dtype' or 'formats' argument as well.
Obviously we can write constructor functions that include type
inference algorithms of our own, ... but having a "standard" way to do
this, with best practices maintained in the numpy core, would be quite
useful nonetheless.

thanks,
Dan
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From pgmdevlist at gmail.com Fri Jun 26 16:54:32 2009
From: pgmdevlist at gmail.com (Pierre GM)
Date: Fri, 26 Jun 2009 16:54:32 -0400
Subject: [Numpy-discussion] Rec array: numpy.rec vs numpy.array with
	complex dtype
In-Reply-To: <15e4667e0906261259x14fee942s1917a208b231e245@mail.gmail.com>
References: <15e4667e0906261151w7f6a84a2rd7ee42e3265ccd77@mail.gmail.com>
	<6A78A032-6B93-4E81-91B0-B8FA1DD762C8@gmail.com>
	<15e4667e0906261259x14fee942s1917a208b231e245@mail.gmail.com>
Message-ID: <8BE9683F-9560-4B02-BE3A-E9DB1C6CD54A@gmail.com>

On Jun 26, 2009, at 3:59 PM, Dan Yamins wrote:
>
> Is this the only substantial thing added in the recarray class?

AFAIK, yes.

> The fact that you can access some fields via attribute notation? We
> haven't been using this feature anyhow ... (what happens when the
> field names have spaces?)

Well, spaces in a field name are a bad idea, but nothing prevents you
from doing it (I wonder whether we shouldn't check for it in the
definition of the dtype). Anyway, that will of course fail gloriously
if you try to access such a field by attribute.

> Is the recarray class still being developed actively?

I don't know. There's not much you can add to it, is there?

> My favorite way to get a np.recarray is to define a standard ndarray
> w/ complex dtype, and then take a view as a recarray.
>
> Is the purpose of this basically to use the property of recarrays of
> accessing fields as attributes? Or do you have other reasons why you
> like this view?

You're correct, it's only to provide a more convenient way to access
fields. I personally stopped using recarrays in favor of the easier
ndarrays w/ structured dtype. If I really need to access fields as
attributes, I'd write a subclass and make each field accessible as a
property.

> Do you recommend a place we can learn about the interesting things
> one can do with structured data types? Or is the on-line
> documentation on the scipy site the best as of now?

http://docs.scipy.org/doc/numpy/reference/arrays.dtypes.html is a good
start. Feel free to start some tutorial page.

> And of course in the recarray constructors one can override the type
> inference by including a 'dtype' or 'formats' argument as well.
> Obviously we can write constructor functions that include type
> inference algorithms of our own, ... but having a "standard" way to
> do this, with best practices maintained in the numpy core, would be
> quite useful nonetheless.

Well, you can always use the functions of the np.rec module (fromfile,
fromstring, fromrecords...). You can also have a look at
np.lib.io.genfromtxt, a function to create a ndarray (or recarray, or
MaskedArray) from a text file. I don't think overloading np.array to
support cases like the ones you described is a good idea: I prefer to
have some specific tools (like the np.rec functions) than one catch-all
function.
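[A bare-bones editorial sketch of the subclass-with-properties approach
Pierre mentions; the class and field names are invented for
illustration:]

    import numpy as np

    class MyArray(np.ndarray):
        # Expose the two known fields as properties, instead of paying
        # recarray's __getattribute__ field check on every lookup.
        @property
        def a(self):
            return self['a']

        @property
        def b(self):
            return self['b']

    arr = np.array([(1, 10), (2, 20)],
                   dtype=[('a', int), ('b', int)]).view(MyArray)
    print(arr.a)   # [1 2]

Unlike np.recarray, only the fields you explicitly declare become
attributes, and every other attribute lookup stays at plain-ndarray
speed.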
From ckkart at hoc.net Fri Jun 26 17:02:23 2009
From: ckkart at hoc.net (Christian K.)
Date: Fri, 26 Jun 2009 23:02:23 +0200
Subject: [Numpy-discussion] switching to float32
In-Reply-To: <3d375d730906261226t67d92d6asa606b496892aea62@mail.gmail.com>
References: <185761440906250643j68423b5ctd80367e856027945@mail.gmail.com>
	<3d375d730906261226t67d92d6asa606b496892aea62@mail.gmail.com>
Message-ID:

Robert Kern wrote:
> On Fri, Jun 26, 2009 at 03:39, Christian K. wrote:
>> [suggestion to rebind np.float64 snipped]
>
> No. There is no way to change the default dtype of ones(), zeros(), etc.

Right, I answered before thinking it through. I thought it was about
replacing the dtype keyword arg in all places with float32, which could
very well be accomplished by the text editor itself as well. Sorry.

Christian

From dineshbvadhia at hotmail.com Fri Jun 26 17:42:26 2009
From: dineshbvadhia at hotmail.com (Dinesh B Vadhia)
Date: Fri, 26 Jun 2009 14:42:26 -0700
Subject: [Numpy-discussion] Import Numpy on Windows Vista x64 AMD
Message-ID:

Ticket# 1084
(http://projects.scipy.org/numpy/timeline?from=2009-06-09T03%3A01%3A59-0500&precision=second)
says that the numpy import on Windows Vista x64 AMD systems works now.
Is this for Numpy 1.3 or 1.4, and if 1.3, has anyone tried it
successfully?

Thanks.

Dinesh
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From charlesr.harris at gmail.com Fri Jun 26 21:13:45 2009
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Fri, 26 Jun 2009 19:13:45 -0600
Subject: [Numpy-discussion] fromfile and ticket #1152
Message-ID:

The question is: what should happen when fewer items are read than
requested. The current behaviour is

1) An error message is written to stderr (needs to be fixed).
2) If 0 items are read, then a NoMemory error is raised ;)

So, should a warning be raised and an array returned with however many
items were read? Meaning an empty array if nothing was read. Or should
an error be raised in this circumstance? The behaviour is currently
undocumented.

Chuck

From cournape at gmail.com Fri Jun 26 22:00:33 2009
From: cournape at gmail.com (David Cournapeau)
Date: Sat, 27 Jun 2009 11:00:33 +0900
Subject: [Numpy-discussion] Import Numpy on Windows Vista x64 AMD
In-Reply-To:
References:
Message-ID: <5b8d13220906261900i51940186q356593c69f898260@mail.gmail.com>

On Sat, Jun 27, 2009 at 6:42 AM, Dinesh B Vadhia wrote:
> Ticket# 1084
> (http://projects.scipy.org/numpy/timeline?from=2009-06-09T03%3A01%3A59-0500&precision=second)
> says that the numpy import on Windows Vista x64 AMD systems works now.

I mistakenly closed it as fixed, but it is just a duplicate.
The problem persists, David From charlesr.harris at gmail.com Fri Jun 26 23:30:20 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 26 Jun 2009 21:30:20 -0600 Subject: [Numpy-discussion] stderr Message-ID: There are a few spots where messages are printed to stderr. Some of these look almost like debugging stuff, for instance

NPY_NO_EXPORT void
format_@name@(char *buf, size_t buflen, @name@ val, unsigned int prec)
{
    /* XXX: Find a correct size here for format string */
    char format[64], *res;
    size_t i, cnt;

    PyOS_snprintf(format, sizeof(format), _FMT1, prec);
    res = NumPyOS_ascii_format@type@(buf, buflen, format, val, 0);
    if (res == NULL) {
        fprintf(stderr, "Error while formatting\n");
        return;
    }

    /* If nothing but digits after sign, append ".0" */
    cnt = strlen(buf);
    for (i = (val < 0) ? 1 : 0; i < cnt; ++i) {
        if (!isdigit(Py_CHARMASK(buf[i]))) {
            break;
        }
    }
    if (i == cnt && buflen >= cnt + 3) {
        strcpy(&buf[cnt], ".0");
    }
}

Do we want to raise an error here? Alternatively, we could use an assert. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From gael.varoquaux at normalesup.org Sat Jun 27 03:09:34 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Sat, 27 Jun 2009 09:09:34 +0200 Subject: [Numpy-discussion] SciPy abstract submission deadline extended Message-ID: <20090627070934.GA6149@phare.normalesup.org> Greetings, The conference committee is extending the deadline for abstract submission for the SciPy 2009 conference by one week. On Friday July 3rd, at midnight Pacific, we will turn off the abstract submission on the conference site. Up to then, you can modify already-submitted abstracts or submit new ones. Submitting Papers --------------------- The program features tutorials, contributed papers, lightning talks, and birds-of-a-feather sessions. We are soliciting talks and accompanying papers (either formal academic or magazine-style articles) that discuss topics which center around scientific computing using Python. These include applications, teaching, future development directions, and research. A collection of peer-reviewed articles will be published as part of the proceedings. Proposals for talks are submitted as extended abstracts. There are two categories of talks: Paper presentations These talks are 35 minutes in duration (including questions). A one-page abstract of no less than 500 words (excluding figures and references) should give an outline of the final paper. Proceedings papers are due two weeks after the conference, and may be in a formal academic style, or in a more relaxed magazine-style format. Rapid presentations These talks are 10 minutes in duration. An abstract of between 300 and 700 words should describe the topic and motivate its relevance to scientific computing. In addition, there will be an open session for lightning talks during which any attendee willing to do so is invited to do a couple-of-minutes-long presentation. If you wish to present a talk at the conference, please create an account on the website (http://conference.scipy.org). You may then submit an abstract by logging in, clicking on your profile and following the "Submit an abstract" link. Submission Guidelines
* Submissions should be uploaded via the online form.
* Submissions whose main purpose is to promote a commercial product or service will be refused.
* All accepted proposals must be presented at the SciPy conference by at least one author.
* Authors of an accepted proposal can provide a final paper for publication in the conference proceedings. Final papers are limited to 7 pages, including diagrams, figures, references, and appendices. The papers will be reviewed to help ensure the high quality of the proceedings. For further information, please visit the conference homepage: http://conference.scipy.org. The SciPy 2009 executive committee
-----------------------------------
* Jarrod Millman, UC Berkeley, USA (Conference Chair)
* Gaël Varoquaux, INRIA Saclay, France (Program Co-Chair)
* Stéfan van der Walt, University of Stellenbosch, South Africa (Program Co-Chair)
* Fernando Pérez, UC Berkeley, USA (Tutorial Chair)
From dg.gmane at thesamovar.net Sat Jun 27 06:58:16 2009 From: dg.gmane at thesamovar.net (Dan Goodman) Date: Sat, 27 Jun 2009 12:58:16 +0200 Subject: [Numpy-discussion] switching to float32 In-Reply-To: <3d375d730906261226t67d92d6asa606b496892aea62@mail.gmail.com> References: <185761440906250643j68423b5ctd80367e856027945@mail.gmail.com> <3d375d730906261226t67d92d6asa606b496892aea62@mail.gmail.com> Message-ID: Robert Kern wrote: > On Fri, Jun 26, 2009 at 03:39, Christian K. wrote: >> John Schulman caltech.edu> writes: >> >>> I'm trying to reduce the memory used in a calculation, so I'd like to >>> switch my program to float32 instead of float64. Is it possible to >>> change the numpy default float size, so I don't have to explicitly >>> state dtype=np.float32 everywhere? >> Possibly not the nicest way, but >> >> np.float64 = np.float32 >> >> somewhere at the beginning should work. > > No. There is no way to change the default dtype of ones(), zeros(), etc. > Is there any chance that this will ever be changed, or is it considered too unpythonic / too much work to implement? I would find this quite a useful feature in a project I'm working on, because I want to mix some GPU code into various places with pycuda, but older GPU cards only have support for float32. Dan From dagss at student.matnat.uio.no Sat Jun 27 08:39:48 2009 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Sat, 27 Jun 2009 14:39:48 +0200 Subject: [Numpy-discussion] switching to float32 In-Reply-To: References: <185761440906250643j68423b5ctd80367e856027945@mail.gmail.com> <3d375d730906261226t67d92d6asa606b496892aea62@mail.gmail.com> Message-ID: <4A461314.6010109@student.matnat.uio.no> Dan Goodman wrote: > Robert Kern wrote: >> On Fri, Jun 26, 2009 at 03:39, Christian K. wrote: >>> John Schulman caltech.edu> writes: >>> >>>> I'm trying to reduce the memory used in a calculation, so I'd like to >>>> switch my program to float32 instead of float64. Is it possible to >>>> change the numpy default float size, so I don't have to explicitly >>>> state dtype=np.float32 everywhere? >>> Possibly not the nicest way, but >>> >>> np.float64 = np.float32 >>> >>> somewhere at the beginning should work. >> No. There is no way to change the default dtype of ones(), zeros(), etc. >> > > Is there any chance that this will ever be changed, or is it considered > too unpythonic / too much work to implement? I would find this quite a > useful feature in a project I'm working on, because I want to mix > some GPU code into various places with pycuda, but older GPU cards only > have support for float32. Well, such a mechanism would mean that your code could not be reused reliably by other code (because, what if two different codebases set different defaults...).
So requiring any such mechanism would make it easier to write non-reusable code, which is generally considered a bad thing... Note that it is relatively easy for you to do e.g.

default_dtype = np.float32

def array(*args, **kw):
    if 'dtype' not in kw:
        kw['dtype'] = default_dtype
    return np.array(*args, **kw)

and so on in your own codebase, avoiding the problem. -- Dag Sverre From d_l_goldsmith at yahoo.com Sat Jun 27 11:06:51 2009 From: d_l_goldsmith at yahoo.com (David Goldsmith) Date: Sat, 27 Jun 2009 08:06:51 -0700 (PDT) Subject: [Numpy-discussion] switching to float32 Message-ID: <661133.27816.qm@web52104.mail.re2.yahoo.com> --- On Sat, 6/27/09, Dag Sverre Seljebotn wrote: > Note that it is relatively easy for you to do e.g.
>
> default_dtype = np.float32
>
> def array(*args, **kw):
>     if 'dtype' not in kw:
>         kw['dtype'] = default_dtype
>     return np.array(*args, **kw)
Perhaps this could be added to the Cookbook. DG > > and so on in your own codebase, avoiding the problem. > > -- > Dag Sverre > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From dineshbvadhia at hotmail.com Sat Jun 27 13:05:29 2009 From: dineshbvadhia at hotmail.com (Dinesh B Vadhia) Date: Sat, 27 Jun 2009 10:05:29 -0700 Subject: [Numpy-discussion] Import Numpy on Windows Vista x64 AMD In-Reply-To: <5b8d13220906261900i51940186q356593c69f898260@mail.gmail.com> References: <5b8d13220906261900i51940186q356593c69f898260@mail.gmail.com> Message-ID: Okay. Maybe a bit harsh, but wouldn't it be better not to have the release available if it cannot be imported? From: David Cournapeau Sent: Friday, June 26, 2009 7:00 PM To: Discussion of Numerical Python Subject: Re: [Numpy-discussion] Import Numpy on Windows Vista x64 AMD On Sat, Jun 27, 2009 at 6:42 AM, Dinesh B Vadhia wrote: > Ticket# 1084 > (http://projects.scipy.org/numpy/timeline?from=2009-06-09T03%3A01%3A59-0500&precision=second) > says that the numpy import on Windows Vista x64 AMD systems works now. I mistakenly closed it as fixed, but it is just a duplicate. The problem persists, David -------------- next part -------------- An HTML attachment was scrubbed... URL: From cournape at gmail.com Sat Jun 27 13:21:06 2009 From: cournape at gmail.com (David Cournapeau) Date: Sun, 28 Jun 2009 02:21:06 +0900 Subject: [Numpy-discussion] Import Numpy on Windows Vista x64 AMD In-Reply-To: References: <5b8d13220906261900i51940186q356593c69f898260@mail.gmail.com> Message-ID: <5b8d13220906271021p43899f2ao1f412b768ff24703@mail.gmail.com> On Sun, Jun 28, 2009 at 2:05 AM, Dinesh B Vadhia wrote: > Okay. Maybe a bit harsh, but wouldn't it be better not to have the release > available if it cannot be imported? It cannot be imported in some situations, but it works fine in others. David From jed.ludlow at gmail.com Sat Jun 27 14:26:04 2009 From: jed.ludlow at gmail.com (Jed Ludlow) Date: Sat, 27 Jun 2009 18:26:04 +0000 (UTC) Subject: [Numpy-discussion] fromfile and ticket #1152 References: Message-ID: Charles R Harris gmail.com> writes: > > The question is: what should happen when fewer items are read than > requested. The current behaviour is
>
> 1) Error message written to stderr (needs to be fixed)
> 2) If 0 items are read then a no-memory error is raised ;)
>
> So, should a warning be raised and an array returned with however many > items were read? Meaning an empty array if nothing was read.
Or should > an error be raised in this circumstance? The behaviour is currently > undocumented. > > Chuck > Of course, this discussion isn't new, and I don't know that it needs to be completely rehashed. See http://mail.scipy.org/pipermail/numpy-discussion/2009-May/042668.html and the discussion that followed. I also noticed that it was suggested that a ticket should be filed on the issue in a separate discussion: http://mail.scipy.org/pipermail/numpy-discussion/2009-May/042638.html I noticed the inconsistency myself recently and noticed that a ticket didn't exist and felt inclined to create one. Incidentally, I was reading from two binary files using fromfile() for one (flat uint32s) and fid.read() for another (full of packed C structures) and found myself having to handle exceptions AND check return lengths in the first case and only having to check return lengths in the second. I would vote toward harmonizing the behavior with the python built-in fid.read(bytes) as the default, which simply returns as many items as could be read before the EOF was reached, including zero. An additional strict interface could be added (by keyword) for those who want an exception raised whenever the requested read count could not be completed. Jed From pav at iki.fi Sat Jun 27 15:08:49 2009 From: pav at iki.fi (Pauli Virtanen) Date: Sat, 27 Jun 2009 19:08:49 +0000 (UTC) Subject: [Numpy-discussion] fromfile and ticket #1152 References: Message-ID: On 2009-06-27, Jed Ludlow wrote: > Of course, this discussion isn't new, and I don't know that it needs to be > completely rehashed. See > > http://mail.scipy.org/pipermail/numpy-discussion/2009-May/042668.html Especially this: http://mail.scipy.org/pipermail/numpy-discussion/2009-May/042762.html I think we sort of converged on some aspects of viable interface options:

- "Non-strict" interface: Return as many items as can be read, or up to `count`, if given. Never raise errors or warnings. Stop reading immediately on unexpected input.

      fromstring("1,2,x,4", sep=",") -> [1,2]
      fromstring("1,2,x,4", sep=",", count=5) -> [1,2]

- Strict interface: Raise ValueError on malformed input, or if there are not enough items for `count`.

      fromstring("1,2,x,4", sep=",") -> ValueError
      fromstring("1,2,3,4", sep=",", count=5) -> ValueError

The main disagreement was which of the above to use as the default. A hybrid of the above, which would raise an error only when `count` was specified, was also suggested. Then some variations on whether default values should be introduced and what to do with non-numeric entries in this case. I believe:

- We should not break backward compatibility, so the "non-strict" interface should be the default. No errors or warnings raised, except passing through underlying I/O errors (e.g. sector not found and other OS-level stuff).
- We could optionally, later on, implement the strict interface.
- We should drop the idea of default values for now, and keep fromfile and fromstring simple.

[clip] > I would vote toward harmonizing the behavior with the python built-in > fid.read(bytes) as the default, which simply returns as many items as could be > read before the EOF was reached, including zero. An additional strict interface > could be added (by keyword) for those who want an exception raised whenever the > requested read count could not be completed.
+1 -- Pauli Virtanen From pav at iki.fi Sat Jun 27 15:12:33 2009 From: pav at iki.fi (Pauli Virtanen) Date: Sat, 27 Jun 2009 19:12:33 +0000 (UTC) Subject: [Numpy-discussion] stderr References: Message-ID: On 2009-06-27, Charles R Harris wrote: [clip]
> >
>     PyOS_snprintf(format, sizeof(format), _FMT1, prec);
>     res = NumPyOS_ascii_format@type@(buf, buflen, format, val, 0);
>     if (res == NULL) {
>         fprintf(stderr, "Error while formatting\n");
>         return;
>     }
>
[clip] > Do we want to raise an error here? Alternatively, we could use an assert. I'd advise against asserts. They should be used only for conditions that are (expected to be) logically impossible. This one here seems to be possible when out of memory, or in some other condition. Also, an assert makes the program crash at the C level, which is clearly undesirable in a Python program as it cannot be handled. Raising an error here seems to be the proper thing to do. -- Pauli Virtanen From charlesr.harris at gmail.com Sat Jun 27 15:38:14 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 27 Jun 2009 13:38:14 -0600 Subject: [Numpy-discussion] stderr In-Reply-To: References: Message-ID: On Sat, Jun 27, 2009 at 1:12 PM, Pauli Virtanen wrote: > On 2009-06-27, Charles R Harris wrote: > [clip]
> > >
> >     PyOS_snprintf(format, sizeof(format), _FMT1, prec);
> >     res = NumPyOS_ascii_format@type@(buf, buflen, format, val, 0);
> >     if (res == NULL) {
> >         fprintf(stderr, "Error while formatting\n");
> >         return;
> >     }
> >
> [clip] > > Do we want to raise an error here? Alternatively, we could use an > > assert. > > I'd advise against asserts. They should be used only for > conditions that are (expected to be) logically impossible. This > one here seems to be possible when out of memory, or in some other > condition. > > Also, an assert makes the program crash at the C level, which is > clearly undesirable in a Python program as it cannot be handled. > > Raising an error here seems to be the proper thing to do. > I'm inclined that way also, but it will require some work. The routine currently returns nothing, so the calling routine will need to call into python to see if an error was raised. Alternatively, we could add an error return. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at googlemail.com Sat Jun 27 15:41:22 2009 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Sat, 27 Jun 2009 15:41:22 -0400 Subject: [Numpy-discussion] npy/npz file extensions in save/savez Message-ID: Hi, The .npy/.npz extensions are just suggestions and can be replaced with something application-specific according to the NEP (see "Conventions" in http://svn.scipy.org/svn/numpy/trunk/doc/neps/npy-format.txt). However the code in save/savez does not seem to allow this. Pass in 'filename.ext' and out comes 'filename.ext.npy'. The relevant code in `save` is:

if isinstance(file, basestring):
    if not file.endswith('.npy'):
        file = file + '.npy'

Replacing the second line with:

if not os.path.splitext(file)[1]:

would fix it, I think. Same for `savez`. `load` does the correct thing: it only cares about the magic string and not the extension. Should save/savez allow other extensions? If so, should I open a ticket and attach this as a patch? Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed...
URL: From woiski at gmail.com Sat Jun 27 17:08:58 2009 From: woiski at gmail.com (Emanuel Woiski) Date: Sat, 27 Jun 2009 18:08:58 -0300 Subject: [Numpy-discussion] Fwd: YASS (Yet Another Success Story) In-Reply-To: <534f19f6-181c-42ee-803c-9bd867b6d053@n4g2000vba.googlegroups.com> References: <534f19f6-181c-42ee-803c-9bd867b6d053@n4g2000vba.googlegroups.com> Message-ID: <9c08f0d10906271408w7cd6cecaud181216dfd97b007@mail.gmail.com> Folks, a success story. woiski ---------- Forwarded message ---------- From: k3xji Date: 2009/6/20 Subject: YASS (Yet Another Success Story) To: python-list at python.org Hi all, Started a project a year ago with hard goals in mind: developing a game server which can handle thousands of clients simultaneously. The system must be stable, scalable, and efficient, and if the client needs to migrate the server to another OS, this should be a matter of time, if possible without changing any code. Also, the system should be controlled via database operations, meaning some specific DB changes should be picked up by the system on the fly and the necessary processing done at runtime (client timeout values, adding new game lobbies, etc.). And all of this should be a one-man project, due to the fact that the project is volunteer work and thus separate from our ordinary jobs; there is no budget :). This was 8 months ago, and I still remember talking with my colleague to decide which programming language was better for our requirements. We wrote the tradeoffs of the different languages on a board and compared our objectives against them by assigning numbers. I remember the last three programming languages standing were: C++, Java, and Python. We decided that code maintainability and stability are cheaper than native code efficiency, and also that easy porting is important. C++ wins only on efficiency, but maintaining the code, even if we use 3rd-party libraries, is a nightmare, especially in a massively multithreaded project like a game server. Also, we decided a game server's bottleneck is network I/O rather than the actual processing, which means there should not be much performance difference between C++ and an interpreted language. So the final round was between Java and Python. I know Java has been adding very good optimizations for interpreter efficiency over the years (JIT compiling, etc.), and I remember reading somewhere that Java is 10 times faster than Python. But Python, on the other hand, is a clean and very simple language, which means the project might be completed sooner, and efficiency, again, should not be much of a problem, as the bottleneck should be network I/O if the server is coded correctly. Although we had made this decision, we still had some doubts, but we moved on with Python. I didn't know Python well at the time and had only coded a few simple projects with it. I started by reading Python socket/threading tutorials, then I downloaded and read actual Python code to get deeper and deeper. This was 10 months ago. Now we have been able to implement the game server in Python using MySQL as a backend DB, and currently 5000 players (and increasing) are playing multiplayer games on it, with a system load of 0.4-0.8 (see Unix getloadavg output) on a 1.8 GHz dual Xeon processor.
Let me give some benchmarks. Please note that one thing I learned through this journey is that testing/benchmarking server software is really difficult, and maybe in some conditions impossible to do correctly, as inactive client connections may cause nondeterministic behavior, and WAN latencies are very different from LAN latencies, etc. Anyway, we had to do a benchmark before going live with this server. We installed an UnrealIRCd server, which also uses select() as its I/O paradigm just as we do, and supports many of the features we have. This server is an IRC server written in C that has been under development for nearly 10 years. We connected thousands of clients and did simple messaging between them. I don't have an official test/benchmark result or a nice graph for this; however, the system running UnrealIRCd locked up when processing 4 Mb/sec of data, whereas our 6-month-old Python server locks up the system resources after 3.5 Mb/sec. We have done other tests, including stressing server accept() and adding inactive connections, but I am really not 100% confident about the test environments and do not want to give you a false impression. But we decided that this 0.5 Mb factor is still not enough for us, and we moved on to inspect what is really causing the bottlenecks and how to optimize them. Here, to be honest, I cannot see very much help from Python. Profiling/optimization is the only place I am upset with Python. Here are the requirements for the profiler: I want to be able to enable/disable the profiler on the fly without requiring any kind of restart, and the profiler should be multithreaded (thus thread-safe). I decided that cProfile and other profiler tools are in need of some updates to fulfill my requirements, which made me write my own profiler. It took me 3 days to come up with a profiler that satisfies my requirements: it profiles only my code, meaning the server code, to avoid performance degradation; timings need not be nested-aware; and timing precision is not very important (beyond seconds). I implemented the profiler with the help of decorators and started profiling my code live. Here comes another part of Python that is somewhat shady: optimization. After profiling the code, it turns out most of the time is spent on the following:

1) selecting/deselecting interest sets for select();
2) unnecessary function calls (in integrating some OOP concepts, I think we overlooked the performance factor, and there really are some functions which could be inlined but were declared as functions);
3) redundant try-excepts all over the place (again our fault: to make the system stable, we had put some debug-purposed, assert-like try-excepts in the main server flow).

After looking at these subjects for 2 weeks, we came up with solutions for all of them, and we see that it works approximately 5 times faster. This test was done live: we ran two server processes, each running on a predefined CPU (meaning the CPU affinity of the process was set beforehand to avoid misinterpreting system diagnostic tools like top(1)). When the same number of clients is connected to both servers, the second one gives 5 times better values in the top and ps tools. The processing time (meaning the server loop except the select() call) is also 10 times faster than before. Just one note about optimizing Python code: do not optimize Python code based on your assumptions; just go and test whether it really runs faster.
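(A minimal illustration of this "measure, don't assume" advice, using the standard library's timeit rather than anything from the project described:)

import timeit

# Guess which version is faster, then check the guess against real timings.
loop_version = timeit.Timer("out = []\nfor i in xrange(1000): out.append(i * i)")
comp_version = timeit.Timer("out = [i * i for i in xrange(1000)]")

print "loop:    ", loop_version.timeit(1000)
print "listcomp:", comp_version.timeit(1000)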
I don't want to go into the details of this hint, but believe me, optimizing Python code can be very tricky. It is then that I decided to write this up as a success story, as I am a newcomer to Python but came up with a nearly commercial product in a very short period of time, and I don't think this is about my personal characteristics or intelligence :), as I am old enough to know that there are much more brilliant people than I am, and they have had similar experiences with Python. So, one last note: every software project goes through the same decisions as above, often much more formally and carefully. I would suggest managers see that for themselves and not just listen to the ordinary brain-washing. Python is a great choice for easy development, easy debugging, and easy maintenance, and, most importantly, it is very time-friendly. Of course there will be tasks in which Python is suitable, but hey, if Python is in the list, take it seriously. Thanks, Special thanks to people in this group who answered my silly Python questions along the way. Sümer Cip -- http://mail.python.org/mailman/listinfo/python-list -------------- next part -------------- An HTML attachment was scrubbed... URL: From dg.gmane at thesamovar.net Sat Jun 27 20:31:21 2009 From: dg.gmane at thesamovar.net (Dan Goodman) Date: Sun, 28 Jun 2009 02:31:21 +0200 Subject: [Numpy-discussion] switching to float32 In-Reply-To: <4A461314.6010109@student.matnat.uio.no> References: <185761440906250643j68423b5ctd80367e856027945@mail.gmail.com> <3d375d730906261226t67d92d6asa606b496892aea62@mail.gmail.com> <4A461314.6010109@student.matnat.uio.no> Message-ID: Dag Sverre Seljebotn wrote: > Well, such a mechanism would mean that your code could not be reused > reliably by other code (because, what if two different codebases set > different defaults...). So requiring any such mechanism would make it > easier to write non-reusable code, which is generally considered a bad > thing...
>
> Note that it is relatively easy for you to do e.g.
>
> default_dtype = np.float32
>
> def array(*args, **kw):
>     if 'dtype' not in kw:
>         kw['dtype'] = default_dtype
>     return np.array(*args, **kw)
>
> and so on in your own codebase, avoiding the problem.
> Neat trick - I think I'll do exactly that. I'll also need to cover a few other cases, like zeros(), ones(), etc., but I think it should work. One could even write a little macro that generates wrappers like this for all numpy/scipy functions that have a 'dtype' argument. Thanks for an excellent idea! Dan From dineshbvadhia at hotmail.com Sat Jun 27 22:31:20 2009 From: dineshbvadhia at hotmail.com (Dinesh B Vadhia) Date: Sat, 27 Jun 2009 19:31:20 -0700 Subject: [Numpy-discussion] Import Numpy on Windows Vista x64 AMD In-Reply-To: <5b8d13220906271021p43899f2ao1f412b768ff24703@mail.gmail.com> References: <5b8d13220906261900i51940186q356593c69f898260@mail.gmail.com> <5b8d13220906271021p43899f2ao1f412b768ff24703@mail.gmail.com> Message-ID: The machine in question is factory installed with:

OS: Windows Vista 64-bit SP2
Processor: Intel Core2 Quad CPU, Q6600 @ 2.4GHz
Memory: 8GB

Apart from Python 2.5.4 nothing has been installed on this machine as it is being used only to run Python programs. Python 2.6.1 was installed so that Numpy 1.3 could be installed. Numpy was installed per machine and per user, and each time import numpy didn't work. If Numpy cannot be imported on this plain-vanilla machine, what configuration does it work on?
Dinesh From: David Cournapeau Sent: Saturday, June 27, 2009 10:21 AM To: Discussion of Numerical Python Subject: Re: [Numpy-discussion] Import Numpy on Windows Vista x64 AMD On Sun, Jun 28, 2009 at 2:05 AM, Dinesh B Vadhia wrote: > Okay. Maybe a bit harsh, but wouldn't it be better not to have the release > available if it cannot be imported? It cannot be imported in some situations, but it works fine in others. David From cournape at gmail.com Sat Jun 27 22:42:26 2009 From: cournape at gmail.com (David Cournapeau) Date: Sun, 28 Jun 2009 11:42:26 +0900 Subject: [Numpy-discussion] Import Numpy on Windows Vista x64 AMD In-Reply-To: References: <5b8d13220906261900i51940186q356593c69f898260@mail.gmail.com> <5b8d13220906271021p43899f2ao1f412b768ff24703@mail.gmail.com> Message-ID: <5b8d13220906271942h3ae1a81icb5a19ac8de2c01a@mail.gmail.com> On Sun, Jun 28, 2009 at 11:31 AM, Dinesh B Vadhia wrote: > The machine in question is factory installed with:
>
> OS: Windows Vista 64-bit SP2
> Processor: Intel Core2 Quad CPU, Q6600 @ 2.4GHz
> Memory: 8GB
>
> Apart from Python 2.5.4 nothing has been installed on this machine as it is > being used only to run Python programs.
>
> Python 2.6.1 was installed so that Numpy 1.3 could be installed. Numpy was > installed per machine and per user, and each time import numpy didn't work.
>
> If Numpy cannot be imported on this plain-vanilla machine, what configuration > does it work on? I think in this case, the problem is Vista. There are subtle problems with dll loading on Vista that nobody in the python community has been able to track down. See for example: http://bugs.python.org/issue4018 I can import it fine in Windows XP 64 (although it is certainly not bug-free - the 64-bit installer is marked as experimental for a reason). David From cournape at gmail.com Sun Jun 28 09:16:50 2009 From: cournape at gmail.com (David Cournapeau) Date: Sun, 28 Jun 2009 22:16:50 +0900 Subject: [Numpy-discussion] stderr In-Reply-To: References: Message-ID: <5b8d13220906280616o31c3fe11j656cf5bf101db2c4@mail.gmail.com> On Sun, Jun 28, 2009 at 4:38 AM, Charles R Harris wrote: > > > On Sat, Jun 27, 2009 at 1:12 PM, Pauli Virtanen wrote: >> >> On 2009-06-27, Charles R Harris wrote: >> [clip]
>> >
>> >     PyOS_snprintf(format, sizeof(format), _FMT1, prec);
>> >     res = NumPyOS_ascii_format@type@(buf, buflen, format, val, 0);
>> >     if (res == NULL) {
>> >         fprintf(stderr, "Error while formatting\n");
>> >         return;
>> >     }
>> >
>> [clip] >> > Do we want to raise an error here? Alternatively, we could use an >> > assert. >> >> I'd advise against asserts. They should be used only for >> conditions that are (expected to be) logically impossible. This >> one here seems to be possible when out of memory, or in some other >> condition. >> >> Also, an assert makes the program crash at the C level, which is >> clearly undesirable in a Python program as it cannot be handled. >> >> Raising an error here seems to be the proper thing to do. > > I'm inclined that way also, but it will require some work. The routine > currently returns nothing, so the calling routine will need to call into > python to see if an error was raised. Alternatively, we could add an error > return. We could just raise a python exception, if this is possible (I don't know if those format_* functions are allowed to fail).
David From david at ar.media.kyoto-u.ac.jp Mon Jun 29 01:34:51 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Mon, 29 Jun 2009 14:34:51 +0900 Subject: [Numpy-discussion] Code for compatibility with gfortran, license issues Message-ID: <4A48527B.9020302@ar.media.kyoto-u.ac.jp> Hi, I started working on a new approach for windows 64 bits support, to be able to combine gfortran and visual studio. Basically, I am reimplementing the needed functions from libgfortran so that it can be built with MS compiler, but I cannot hope to do that without using gfortran sources (under the GPL 3), if only for the signature/API of the functions. Is it ok to include those sources in numpy/scipy, given that they will only be used in those cases where libgfortran would have to be used anyway ? I got pretty far very quickly using this method (full numpy works except for a few unit tests, scipy.linalg, scipy.sparse, scipy.fftpack are already working), so it would be nice if it was possible :) cheers, David From millman at berkeley.edu Mon Jun 29 02:59:42 2009 From: millman at berkeley.edu (Jarrod Millman) Date: Sun, 28 Jun 2009 23:59:42 -0700 Subject: [Numpy-discussion] ANN: SciPy 2009 student sponsorship Message-ID: I am pleased to announce that the Python Software Foundation is sponsoring 10 students' travel, registration, and accommodation for the SciPy 2009 conference (Aug. 18-23). The focus of the conference is both on scientific libraries and tools developed with Python and on scientific or engineering achievements using Python. If you're in college or a graduate program, please check out the details here: http://conference.scipy.org/student-funding About the conference -------------------- SciPy 2009, the 8th Python in Science conference, will be held from August 18-23, 2009 at Caltech in Pasadena, CA, USA. The conference starts with two days of tutorials on the scientific Python tools. There will be two tracks, one introducing the basic tools to beginners, and one for more advanced tools. The tutorials will be followed by two days of talks. Both days of talks will begin with a keynote address. The first day's keynote will be given by Peter Norvig, the Director of Research at Google, while the second keynote will be delivered by Jon Guyer, a Materials Scientist in the Thermodynamics and Kinetics Group at NIST. The program committee will select the remaining talks from submissions to our call for papers. All selected talks will be included in our conference proceedings edited by the program committee. After the talks each day we will provide several rooms for impromptu birds-of-a-feather discussions. Finally, the last two days of the conference will be used for a number of coding sprints on the major software projects in our community. For the 8th consecutive year, the conference will bring together the developers and users of the open source software stack for scientific computing with Python. Attendees have the opportunity to review the available tools and how they apply to specific problems. By providing a forum for developers to share their Python expertise with the wider commercial, academic, and research communities, this conference fosters collaboration and facilitates the sharing of software components, techniques, and a vision for high level language use in scientific computing. For further information, please visit the conference homepage: http://conference.scipy.org.
Important Dates
---------------
* Friday, July 3: Abstracts Due
* Friday, July 10: Announce accepted talks, post schedule
* Friday, July 10: Early Registration ends
* Tuesday-Wednesday, August 18-19: Tutorials
* Thursday-Friday, August 20-21: Conference
* Saturday-Sunday, August 22-23: Sprints
* Friday, September 4: Papers for proceedings due

Executive Committee
-------------------
* Jarrod Millman, UC Berkeley, USA (Conference Chair)
* Gaël Varoquaux, INRIA Saclay, France (Program Co-Chair)
* Stéfan van der Walt, University of Stellenbosch, South Africa (Program Co-Chair)
* Fernando Pérez, UC Berkeley, USA (Tutorial Chair)
From hanni.ali at gmail.com Mon Jun 29 03:09:45 2009 From: hanni.ali at gmail.com (Hanni Ali) Date: Mon, 29 Jun 2009 08:09:45 +0100 Subject: [Numpy-discussion] Code for compatibility with gfortran, license issues In-Reply-To: <4A48527B.9020302@ar.media.kyoto-u.ac.jp> References: <4A48527B.9020302@ar.media.kyoto-u.ac.jp> Message-ID: <789d27b10906290009k5ab6508fo8980a2b3b36b4a1b@mail.gmail.com> Hi David, Sounds very interesting, have you noticed any improvement in performance over using the built-in numpy blas lite? If you need someone to test on Windows 64, I would be happy to do so. Hanni 2009/6/29 David Cournapeau > Hi, > > I started working on a new approach for windows 64 bits support, to > be able to combine gfortran and visual studio. Basically, I am > reimplementing the needed functions from libgfortran so that it can be > built with MS compiler, but I cannot hope to do that without using > gfortran sources (under the GPL 3), if only for the signature/API of the > functions. Is it ok to include those sources in numpy/scipy, given that > they will only be used in those cases where libgfortran would have to be > used anyway ? > > I got pretty far very quickly using this method (full numpy works except > for a few unit tests, scipy.linalg, scipy.sparse, scipy.fftpack are > already working), so it would be nice if it was possible :) > > cheers, > > David > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From cournape at gmail.com Mon Jun 29 04:48:37 2009 From: cournape at gmail.com (David Cournapeau) Date: Mon, 29 Jun 2009 17:48:37 +0900 Subject: [Numpy-discussion] Code for compatibility with gfortran, license issues In-Reply-To: <789d27b10906290009k5ab6508fo8980a2b3b36b4a1b@mail.gmail.com> References: <4A48527B.9020302@ar.media.kyoto-u.ac.jp> <789d27b10906290009k5ab6508fo8980a2b3b36b4a1b@mail.gmail.com> Message-ID: <5b8d13220906290148o299bc711g20f3b65c78e12817@mail.gmail.com> On Mon, Jun 29, 2009 at 4:09 PM, Hanni Ali wrote: > Hi David, > > Sounds very interesting, have you noticed any improvement in performance over > using the built-in numpy blas lite? For now, I focus on building and passing the test suite. That's already a lot of work since MS compilers are very crappy C compilers (lots of missing functions, no complex support, etc., all of which has to be reimplemented). So performance will only come much later :) If you care about performance, I would guess the Intel compilers and one of the numerous commercial Fortran compilers are the way to go.
cheers, David From animator333 at yahoo.com Mon Jun 29 09:20:06 2009 From: animator333 at yahoo.com (Prashant Saxena) Date: Mon, 29 Jun 2009 18:50:06 +0530 (IST) Subject: [Numpy-discussion] numpy/numexpr performance (particle simulation) Message-ID: <193184.73446.qm@web94906.mail.in2.yahoo.com> Hi, I am doing a little test using numpy and numexpr to do a particle simulation. I never used either of them much and this is the first time I have to go deeper. Here is the code:

import numpy as np
import numexpr as nexpr

class Particle( object ):

    def __init__( self, id ):
        self.position = [0.0, 0.0, 0.0]
        self.color = [0.5, 0.5, 0.5]
        self.direction = [0.0, 0.0, 0.0]
        self.id = id
        self.lifeSpan = 0
        self.type = 0

class Emitter( object ):

    def __init__( self ):
        self.particles = np.empty([], dtype=Particle())
        self.currentParticleId = 0
        self.numParticles = 1000
        self.emissionRate = 10
        self.position = [0.0, 0.0, 0.0]
        self.rotation = [0.0, 0.0, 0.0]

    # Add a single particle
    def addParticle( self ):
        """
        Add a single particle in the emitter.
        """
        if self.currentParticleId < self.numParticles:
            self.particles = np.append( self.particles, Particle( self.currentParticleId ) )
            self.currentParticleId += 1

#######################################################
Problem 1: self.particles = np.empty([], dtype=Particle()) In "Emitter" class, how do I initialize a numpy array of "Particle" type? Problem 2: If problem 1 can be solved, is it possible to use numexpr to alter each particle's position in the emitter.particles array? For example, multiply x, y and z of each particle by using three random values. In other words

self.particles[0].position *= random(1.0)
self.particles[1].position *= random(2.0)
self.particles[2].position *= random(3.0)

If problem 1 cannot be solved then maybe I could use something like this using a simple list:

a = [Particle(0), Particle(1)]

How do I modify the position of each particle in array "a" using numexpr? I would be using 1-5 million particles where each particle's attributes position, rotation and color will be modified using some complex equations. The above code is just the beginning, hence any tips for performance would be greatly appreciated. Regards Prashant -------------- next part -------------- An HTML attachment was scrubbed... URL: From afriedle at indiana.edu Mon Jun 29 10:42:35 2009 From: afriedle at indiana.edu (Andrew Friedley) Date: Mon, 29 Jun 2009 10:42:35 -0400 Subject: [Numpy-discussion] Add/multiply reduction confusion Message-ID: <4A48D2DB.10508@indiana.edu> Hi, I'm trying to understand how integer types are upcast for add/multiply operations for my GSoC project (Implementing Ufuncs using CorePy). The documentation says that for reduction with add/multiply operations, integer types are 'upcast' to the int_ type (int64 on my system). What exactly does this mean, internally? Where/when does the upcasting occur? Is it a C-style cast, or a memory copy to a new temporary array? I'm confused as to which low-level ufunc loop type is used (and why). This is what I see:

>>> a = numpy.arange(131072, dtype=numpy.int32)
>>> r = numpy.add.reduce(a)
>>> print type(r)
<type 'numpy.int64'>
>>> print hex(r)
0x1ffff0000L

Okay, fine.
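(As an aside, the chosen loop can also be forced by hand: ufunc.reduce accepts a dtype argument that selects the inner loop, so the int32 wrap-around seen below with a custom ufunc can be reproduced in plain numpy. A sketch, reusing the array from above:)

>>> hex(numpy.add.reduce(a))                       # default: upcast, 'll->l' loop
'0x1ffff0000L'
>>> hex(numpy.add.reduce(a, dtype=numpy.int32))    # force 'ii->i': wraps around
'-0x10000'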
But I have my own ufunc, which defines only the following types right now (I stripped it down for debugging):

>>> print corefunc.add.types
['ii->i', 'll->l']

NumPy has this, for comparison:

>>> print numpy.add.types
['??->?', 'bb->b', 'BB->B', 'hh->h', 'HH->H', 'ii->i', 'II->I', 'll->l',
'LL->L', 'qq->q', 'QQ->Q', 'ff->f', 'dd->d', 'gg->g', 'FF->F', 'DD->D',
'GG->G', 'OO->O']

Also just to verify I did this:

>>> print numpy.typeDict['i']
<type 'numpy.int32'>
>>> print numpy.typeDict['l']
<type 'numpy.int64'>

Yet when I call my own ufunc, this happens:

>>> a = numpy.arange(131072, dtype=numpy.int32)
>>> r = corefunc.add.reduce(a)
>>> print type(r)
<type 'numpy.int32'>
>>> print hex(r)
-0x10000

It looks like no upcasting is occurring here? My ii->i loop is being used, not the ll->l loop... why? I'm guessing this is something I am doing wrong; any ideas what it is? Andrew From kwmsmith at gmail.com Mon Jun 29 11:13:09 2009 From: kwmsmith at gmail.com (Kurt Smith) Date: Mon, 29 Jun 2009 10:13:09 -0500 Subject: [Numpy-discussion] numpy/numexpr performance (particle simulation) In-Reply-To: <193184.73446.qm@web94906.mail.in2.yahoo.com> References: <193184.73446.qm@web94906.mail.in2.yahoo.com> Message-ID: On Mon, Jun 29, 2009 at 8:20 AM, Prashant Saxena wrote: > Hi,
>
> I am doing a little test using numpy and numexpr to do a particle
> simulation. I never used either of them much and this is the first time I
> have to go deeper. Here is the code:
>
> import numpy as np
> import numexpr as nexpr
>
> class Particle( object ):
>
>     def __init__( self, id ):
>         self.position = [0.0, 0.0, 0.0]
>         self.color = [0.5, 0.5, 0.5]
>         self.direction = [0.0, 0.0, 0.0]
>         self.id = id
>         self.lifeSpan = 0
>         self.type = 0
>
> class Emitter( object ):
>
>     def __init__( self ):
>         self.particles = np.empty([], dtype=Particle())
>         self.currentParticleId = 0
>         self.numParticles = 1000
>         self.emissionRate = 10
>         self.position = [0.0, 0.0, 0.0]
>         self.rotation = [0.0, 0.0, 0.0]
>
>     # Add a single particle
>     def addParticle( self ):
>         """
>         Add a single particle in the emitter.
>         """
>         if self.currentParticleId < self.numParticles:
>             self.particles = np.append( self.particles, Particle( self.currentParticleId ) )
>             self.currentParticleId += 1
>
> #######################################################
> Problem 1:
> self.particles = np.empty([], dtype=Particle())
> In "Emitter" class, how do I initialize a numpy array of "Particle" type?
>
> Problem 2:
> If problem 1 can be solved, is it possible to use numexpr to alter each
> particle's position in the emitter.particles array? For example, multiply
> x, y and z of each particle by using three random values. In other words
>
> self.particles[0].position *= random(1.0)
> self.particles[1].position *= random(2.0)
> self.particles[2].position *= random(3.0)
>
> If problem 1 cannot be solved then maybe I could use something like this
> using a simple list:
>
> a = [Particle(0), Particle(1)]
>
> How do I modify the position of each particle in array "a" using numexpr?
>
> I would be using 1-5 million particles where each particle's attributes
> position, rotation and color will be modified using some complex
> equations. The above code is just the beginning, hence any tips for
> performance would be greatly appreciated.
1-5 million particles means that 1) the 'particle' object shouldn't be a pure python object, with lists for the position field, etc.
You won't get very good cache locality (although this might be hard to do anyway, depending on the sort of operations you're doing), all the ints, floats, etc are wrapped up inside python objects, making simple arithmetic very expensive, and you'll want to start thinking about memory usage and minimizing pointer lookups. If you want to go the pure numpy route (that is, not resorting to putting it into an extension module via Pyrex/Cython, f2py, etc.), you could try something like this: import numpy as np DTYPE = np.float32 position_dtype = np.dtype({'names':('x','y','z'), 'formats':(DTYPE, DTYPE, DTYPE)}) # this is a nested dtype, for illustration. It's a nicer data representation, # but you might not want the nesting. particle_dtype = np.dtype( { 'names':('position', 'color', 'direction', 'id', 'lifeSpan', 'type'), 'formats':(position_dtype, position_dtype, position_dtype, np.int32, np.int32, np.int32) } ) # This is the flat version of the above. Should be obvious what's going on. flat_particle_dtype = np.dtype( { 'names':('posx', 'posy', 'posz', 'cx', 'cy', 'cz', 'dirx', 'diry', 'dirz', 'id', 'lifespan', 'type'), 'formats':[DTYPE]*9+[np.int32]*3 } ) class Emitter(object): def __init__(self, N): self.numParticles = N self.particles = np.zeros((self.numParticles,), dtype=flat_particle_dtype) # or you can use the nested version: # self.particles = np.zeros((self.numParticles,), dtype=particle_dtype) self.emissionRate = 10 # ... if __name__ == '__main__': N = 1000 emitter = Emitter(N=N) posx = emitter.particles['posx'] posy = emitter.particles['posy'] posz = emitter.particles['posz'] print emitter.particles[:10] posx[:] = np.ones((N,),dtype=DTYPE) posy[:] = np.ones((N,),dtype=DTYPE)*2 posz[:] = np.ones((N,),dtype=DTYPE)*3 import numexpr posx[:] = numexpr.evaluate('2.*posx + posz - 10.') print emitter.particles[:10] For particle simulations that has to be *really fast*, this isn't the route I'd take for the low-level stuff, though. All depends on how fast you want to have something working vs. how fast you need it to work ;-) Kurt From geometrian at gmail.com Mon Jun 29 12:49:18 2009 From: geometrian at gmail.com (Ian Mallett) Date: Mon, 29 Jun 2009 09:49:18 -0700 Subject: [Numpy-discussion] numpy/numexpr performance (particle simulation) In-Reply-To: References: <193184.73446.qm@web94906.mail.in2.yahoo.com> Message-ID: As an off-topic solution, there's always the GPU to do the the particle updating. With half decent optimization, I've gotten over a million particles in *real-time*. You could presumably run several of these at the same time to get as many particles as you want. Downside would be ease-of-implementation... -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Mon Jun 29 13:15:27 2009 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 29 Jun 2009 12:15:27 -0500 Subject: [Numpy-discussion] Code for compatibility with gfortran, license issues In-Reply-To: <4A48527B.9020302@ar.media.kyoto-u.ac.jp> References: <4A48527B.9020302@ar.media.kyoto-u.ac.jp> Message-ID: <3d375d730906291015q74b2e592n96fb4dc295d97649@mail.gmail.com> On Mon, Jun 29, 2009 at 00:34, David Cournapeau wrote: > Hi, > > ? ?I started working on a new approach for windows 64 bits support, to > be able to combine gfortran and visual studio. 
Basically, I am > reimplementing the needed functions from libgfortran so that it can be > built with MS compiler, but I cannot hope to do that without using > gfortran sources (under the GPL 3), if only for the signature/API of the > functions. Function signatures and public APIs are not copyrightable. What else do you need? > Is it ok to include those sources in numpy/scipy, given that > they will only be used in those cases where libgfortran would have to be > used anyway ? Keep in mind that libgfortran is GPLv3+exception. Is the code that you are looking at from gfortran itself (GPLv3) or libgfortran (GPLv3+exception)? -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From cournape at gmail.com Mon Jun 29 13:47:42 2009 From: cournape at gmail.com (David Cournapeau) Date: Tue, 30 Jun 2009 02:47:42 +0900 Subject: [Numpy-discussion] Code for compatibility with gfortran, license issues In-Reply-To: <3d375d730906291015q74b2e592n96fb4dc295d97649@mail.gmail.com> References: <4A48527B.9020302@ar.media.kyoto-u.ac.jp> <3d375d730906291015q74b2e592n96fb4dc295d97649@mail.gmail.com> Message-ID: <5b8d13220906291047ve1fa9dfq403fe0c85a96990f@mail.gmail.com> On Tue, Jun 30, 2009 at 2:15 AM, Robert Kern wrote: > On Mon, Jun 29, 2009 at 00:34, David > Cournapeau wrote: >> Hi, >> >> ? ?I started working on a new approach for windows 64 bits support, to >> be able to combine gfortran and visual studio. Basically, I am >> reimplementing the needed functions from libgfortran so that it can be >> built with MS compiler, but I cannot hope to do that without using >> gfortran sources (under the GPL 3), if only for the signature/API of the >> functions. > > Function signatures and public APIs are not copyrightable. What else > do you need? I need to implement some intrinsics (_gfortran_pow_i4_i4, etc...). I would not consider them public a priori, but I don't really know the legal meaning of public API. > > Keep in mind that libgfortran is GPLv3+exception. Is the code that you > are looking at from gfortran itself (GPLv3) or libgfortran It should be only libgfortran (and libgcc) - I could not quite understand the exception part for libgfortran and its implications. One function which may be problematic is __chkstk. It seems that anything different from the exact implementation would be troublesome. It is implemented purely in ASM, and does not use standard calling convention, so anything else than a verbatim copy sounds dangerous (the function itself is like 5-6 instructions). I could get rid of most non-computational functions by modifying the fortran code in scipy, except for this one. David From bsouthey at gmail.com Mon Jun 29 14:06:47 2009 From: bsouthey at gmail.com (Bruce Southey) Date: Mon, 29 Jun 2009 13:06:47 -0500 Subject: [Numpy-discussion] Code for compatibility with gfortran, license issues In-Reply-To: <3d375d730906291015q74b2e592n96fb4dc295d97649@mail.gmail.com> References: <4A48527B.9020302@ar.media.kyoto-u.ac.jp> <3d375d730906291015q74b2e592n96fb4dc295d97649@mail.gmail.com> Message-ID: <4A4902B7.2030603@gmail.com> On 06/29/2009 12:15 PM, Robert Kern wrote: > On Mon, Jun 29, 2009 at 00:34, David > Cournapeau wrote: > >> Hi, >> >> I started working on a new approach for windows 64 bits support, to >> be able to combine gfortran and visual studio. 
Basically, I am >> reimplementing the needed functions from libgfortran so that it can be >> built with MS compiler, but I cannot hope to do that without using >> gfortran sources (under the GPL 3), if only for the signature/API of the >> functions. >> > > Function signatures and public APIs are not copyrightable. What else > do you need? > > >> Is it ok to include those sources in numpy/scipy, given that >> they will only be used in those cases where libgfortran would have to be >> used anyway ? >> > > Keep in mind that libgfortran is GPLv3+exception. Is the code that you > are looking at from gfortran itself (GPLv3) or libgfortran > (GPLv3+exception)? > > Since you appear to require that the user already has gfortran installed, then you just need to avoid adding any GPL licensed code to actual numpy/scipy code as that would make numpy/scipy GPL. See SFLC's 'Maintaining Permissive-Licensed Files in a GPL-Licensed Project: Guidelines for Developers': http://www.softwarefreedom.org/resources/2007/gpl-non-gpl-collaboration.html I would think that you could just provide an appropriately licensed package that combines a separately downloaded numpy/scipy with the separately downloaded/installed gfortran to install the new version of numpy/scipy. Essentially the same way you get non-free software like mp3 decoders for certain Linux distros. If that works, then perhaps a clean room implementation and rewrite of certain fortran code could be done to remove the gfortran dependencies. Regards Bruce -------------- next part -------------- An HTML attachment was scrubbed... URL: From rowen at u.washington.edu Mon Jun 29 17:03:23 2009 From: rowen at u.washington.edu (Russell E. Owen) Date: Mon, 29 Jun 2009 14:03:23 -0700 Subject: [Numpy-discussion] Advice on converting Numarray C extension? Message-ID: I have an old Numarray C extension (or, rather, a Python package containing a C extension) that I would like to convert to numpy (in a way that is likely to be supported long-term). Options I have found include: - Use the new numpy extension. This seems likely to be fast and future-proof. But I am finding the documentation slow going. Does anyone know of a simple example (e.g. read in an array, create a new array)? - Use the Numarray compatible C API. Simple (and takes advantage of the nice Numarray tutorial example for documentation), but will this be supported in the long term? - Switch to ctypes. Simple in concept. But I'm wondering if I can get distutils to build the resulting package. - Use SWIG. I have some experience with it, but not with numpy arrays. - Use Cython to replace the C code. No idea if this is a long-term supported package. Another option is to try to rewrite in pure python. Perhaps the numpy indexing is sophisticated enough to allow an efficient solution. The C extension computes a radial profile from a 2-d masked array: radProf(r)= sum of all unmasked pixels at radius r about some specified center index I can easily generate (and cache) a 2-d array of radius index, but is it possible to use that to efficiently generate the desired sum? Any opinions? -- Russell From charlesr.harris at gmail.com Mon Jun 29 17:29:17 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 29 Jun 2009 15:29:17 -0600 Subject: [Numpy-discussion] Advice on converting Numarray C extension? In-Reply-To: References: Message-ID: On Mon, Jun 29, 2009 at 3:03 PM, Russell E. 
Owen wrote: > I have an old Numarray C extension (or, rather, a Python package > containing a C extension) that I would like to convert to numpy > (in a way that is likely to be supported long-term). > > Options I have found include: > > - Use the new numpy extension. This seems likely to be fast and > future-proof. But I am finding the documentation slow going. Does anyone > know of a simple example (e.g. read in an array, create a new array)? > > - Use the Numarray compatible C API. Simple (and takes advantage of the > nice Numarray tutorial example for documentation), but will this be > supported in the long term? > > - Switch to ctypes. Simple in concept. But I'm wondering if I can get > distutils to build the resulting package. > > - Use SWIG. I have some experience with it, but not with numpy arrays. > > - Use Cython to replace the C code. No idea if this is a long-term > supported package. > > Another option is to try to rewrite in pure python. Perhaps the numpy > indexing is sophisticated enough to allow an efficient solution. The C > extension computes a radial profile from a 2-d masked array: > radProf(r)= sum of all unmasked pixels at radius r about some > specified center index > I can easily generate (and cache) a 2-d array of radius index, but is it > possible to use that to efficiently generate the desired sum? > > Any opinions? > How big is the extension and what does it do? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From neilcrighton at gmail.com Mon Jun 29 17:49:47 2009 From: neilcrighton at gmail.com (Neil Crighton) Date: Mon, 29 Jun 2009 22:49:47 +0100 Subject: [Numpy-discussion] Deprecating function_base.unique() Message-ID: <63751c30906291449x53175f56t17aa9eca0253457d@mail.gmail.com> Hi, There was some discussion (e.g. http://article.gmane.org/gmane.comp.python.numeric.general/30629) about changes to the arraysetops module to consolidate the separate unique/non-unique functions and rename setmember1d to in1d. There's a patch that makes these changes in ticket 1133 (http://projects.scipy.org/numpy/ticket/1133) The patch also deprecates function_base.unique and renames arraysetops.unique1d to arraysetops.unique as per this discussion: http://article.gmane.org/gmane.comp.python.numeric.general/31068 Is deprecating function_base.unique the right thing to do? Alternatively, we could deprecate unique1d and transfer its functionality to function_base.unique, but since many of the setops functions use unique, arraysetops seems a more natural place for it. Does anyone have any thoughts about this? Neil From perry at stsci.edu Mon Jun 29 18:01:51 2009 From: perry at stsci.edu (Perry Greenfield) Date: Mon, 29 Jun 2009 18:01:51 -0400 Subject: [Numpy-discussion] Advice on converting Numarray C extension? In-Reply-To: References: Message-ID: <4D2B04ED-4612-4244-A8B8-3FF0C8659582@stsci.edu> Hi Russell, Have you looked at the example in our interactive data analysis tutorial where we compute radial profiles in Python? It's not as fast as C because of the sort, but perhaps that's fast enough for your purposes. I wasn't sure if you had already seen that approach or not. (I think it is in the 3rd chapter, but I can look it up if you need me to). Perry On Jun 29, 2009, at 5:03 PM, Russell E. Owen wrote: > I have an old Numarray C extension (or, rather, a Python package > containing a C extension) that I would like to convert to numpy > (in a way that is likely to be supported long-term). 
> > Options I have found include: > > - Use the new numpy extension. This seems likely to be fast and > future-proof. But I am finding the documentation slow going. Does > anyone > know of a simple example (e.g. read in an array, create a new array)? > > - Use the Numarray compatible C API. Simple (and takes advantage of > the > nice Numarray tutorial example for documentation), but will this be > supported in the long term? > > - Switch to ctypes. Simple in concept. But I'm wondering if I can get > distutils to build the resulting package. > > - Use SWIG. I have some experience with it, but not with numpy arrays. > > - Use Cython to replace the C code. No idea if this is a long-term > supported package. > > Another option is to try to rewrite in pure python. Perhaps the numpy > indexing is sophisticated enough to allow an efficient solution. The C > extension computes a radial profile from a 2-d masked array: > radProf(r)= sum of all unmasked pixels at radius r about some > specified center index > I can easily generate (and cache) a 2-d array of radius index, but > is it > possible to use that to efficiently generate the desired sum? > > Any opinions? > > -- Russell > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From rowen at u.washington.edu Mon Jun 29 18:17:19 2009 From: rowen at u.washington.edu (Russell E. Owen) Date: Mon, 29 Jun 2009 15:17:19 -0700 Subject: [Numpy-discussion] Advice on converting Numarray C extension? References: Message-ID: In article , Charles R Harris wrote: > On Mon, Jun 29, 2009 at 3:03 PM, Russell E. Owen > wrote: > > > I have an old Numarray C extension (or, rather, a Python package > > containing a C extension) that I would like to convert to numpy > > (in a way that is likely to be supported long-term). > > > > Options I have found include: > > > > - Use the new numpy extension. This seems likely to be fast and > > future-proof. But I am finding the documentation slow going. Does anyone > > know of a simple example (e.g. read in an array, create a new array)? > > > > - Use the Numarray compatible C API. Simple (and takes advantage of the > > nice Numarray tutorial example for documentation), but will this be > > supported in the long term? > > > > - Switch to ctypes. Simple in concept. But I'm wondering if I can get > > distutils to build the resulting package. > > > > - Use SWIG. I have some experience with it, but not with numpy arrays. > > > > - Use Cython to replace the C code. No idea if this is a long-term > > supported package. > > > > Another option is to try to rewrite in pure python. Perhaps the numpy > > indexing is sophisticated enough to allow an efficient solution. The C > > extension computes a radial profile from a 2-d masked array: > > radProf(r)= sum of all unmasked pixels at radius r about some > > specified center index > > I can easily generate (and cache) a 2-d array of radius index, but is it > > possible to use that to efficiently generate the desired sum? > > > > Any opinions? > > > > How big is the extension and what does it do? It basically contains 2 functions: 1: radProfile: given a masked image (2d array), a radius and a desired center: compute a new 1d array whose value at index r is the sum of all unmasked pixels at radius r. 
2: radAsymm: given the same inputs as radProfile, return a (scalar) measure of radial asymmetry by computing the variance of unmasked pixels at each radius and combining the results. The original source file is about 1000 lines long, of which 1/3 to 1/2 is the basic C code and the rest is Python wrapper. -- Russell From rowen at u.washington.edu Mon Jun 29 18:18:44 2009 From: rowen at u.washington.edu (Russell E. Owen) Date: Mon, 29 Jun 2009 15:18:44 -0700 Subject: [Numpy-discussion] Advice on converting Numarray C extension? References: <4D2B04ED-4612-4244-A8B8-3FF0C8659582@stsci.edu> Message-ID: In article <4D2B04ED-4612-4244-A8B8-3FF0C8659582 at stsci.edu>, Perry Greenfield wrote: > Hi Russell, > > Have you looked at the example in our interactive data analysis > tutorial where we compute radial profiles in Python? It's not as fast > as C because of the sort, but perhaps that's fast enough for your > purposes. I wasn't sure if you had already seen that approach or not. > (I think it is in the 3rd chapter, but I can look it up if you need me > to). I have not seen this. I'll give it a look. Thanks! But I suspect the sort will add unacceptable overhead because this radial profile is computed as part of an iteration (to find the point of maximum radial symmetry). -- Russell From charlesr.harris at gmail.com Mon Jun 29 20:10:42 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 29 Jun 2009 18:10:42 -0600 Subject: [Numpy-discussion] Advice on converting Numarray C extension? In-Reply-To: References: Message-ID: On Mon, Jun 29, 2009 at 4:17 PM, Russell E. Owen wrote: > In article > , > Charles R Harris wrote: > > > On Mon, Jun 29, 2009 at 3:03 PM, Russell E. Owen > > wrote: > > > > > I have an old Numarray C extension (or, rather, a Python package > > > containing a C extension) that I would like to convert to numpy > > > (in a way that is likely to be supported long-term). > > > > > > Options I have found include: > > > > > > - Use the new numpy extension. This seems likely to be fast and > > > future-proof. But I am finding the documentation slow going. Does > anyone > > > know of a simple example (e.g. read in an array, create a new array)? > > > > > > - Use the Numarray compatible C API. Simple (and takes advantage of the > > > nice Numarray tutorial example for documentation), but will this be > > > supported in the long term? > > > > > > - Switch to ctypes. Simple in concept. But I'm wondering if I can get > > > distutils to build the resulting package. > > > > > > - Use SWIG. I have some experience with it, but not with numpy arrays. > > > > > > - Use Cython to replace the C code. No idea if this is a long-term > > > supported package. > > > > > > Another option is to try to rewrite in pure python. Perhaps the numpy > > > indexing is sophisticated enough to allow an efficient solution. The C > > > extension computes a radial profile from a 2-d masked array: > > > radProf(r)= sum of all unmasked pixels at radius r about some > > > specified center index > > > I can easily generate (and cache) a 2-d array of radius index, but is > it > > > possible to use that to efficiently generate the desired sum? > > > > > > Any opinions? > > > > > > > How big is the extension and what does it do? > > It basically contains 2 functions: > 1: radProfile: given a masked image (2d array), a radius and a desired > center: compute a new 1d array whose value at index r is the sum of all > unmasked pixels at radius r. 
> > 2: radAsymm: given the same inputs as radProfile, return a (scalar) > measure of radial asymmetry by computing the variance of unmasked pixels > at each radius and combining the results. > > The original source file is about 1000 lines long, of which 1/3 to 1/2 > is the basic C code and the rest is Python wrapper. > It sounds small enough that you should be able to update it to the numpy interface. What functions do you need? You should also be able to attach a copy (zipped) if it is small enough, which might help us help you. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Mon Jun 29 20:14:02 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 29 Jun 2009 18:14:02 -0600 Subject: [Numpy-discussion] Advice on converting Numarray C extension? In-Reply-To: References: Message-ID: On Mon, Jun 29, 2009 at 6:10 PM, Charles R Harris wrote: > > > On Mon, Jun 29, 2009 at 4:17 PM, Russell E. Owen wrote: > >> In article >> , >> Charles R Harris wrote: >> >> > On Mon, Jun 29, 2009 at 3:03 PM, Russell E. Owen >> > wrote: >> > >> > > I have an old Numarray C extension (or, rather, a Python package >> > > containing a C extension) that I would like to convert to numpy >> > > (in a way that is likely to be supported long-term). >> > > >> > > Options I have found include: >> > > >> > > - Use the new numpy extension. This seems likely to be fast and >> > > future-proof. But I am finding the documentation slow going. Does >> anyone >> > > know of a simple example (e.g. read in an array, create a new array)? >> > > >> > > - Use the Numarray compatible C API. Simple (and takes advantage of >> the >> > > nice Numarray tutorial example for documentation), but will this be >> > > supported in the long term? >> > > >> > > - Switch to ctypes. Simple in concept. But I'm wondering if I can get >> > > distutils to build the resulting package. >> > > >> > > - Use SWIG. I have some experience with it, but not with numpy arrays. >> > > >> > > - Use Cython to replace the C code. No idea if this is a long-term >> > > supported package. >> > > >> > > Another option is to try to rewrite in pure python. Perhaps the numpy >> > > indexing is sophisticated enough to allow an efficient solution. The C >> > > extension computes a radial profile from a 2-d masked array: >> > > radProf(r)= sum of all unmasked pixels at radius r about some >> > > specified center index >> > > I can easily generate (and cache) a 2-d array of radius index, but is >> it >> > > possible to use that to efficiently generate the desired sum? >> > > >> > > Any opinions? >> > > >> > >> > How big is the extension and what does it do? >> >> It basically contains 2 functions: >> 1: radProfile: given a masked image (2d array), a radius and a desired >> center: compute a new 1d array whose value at index r is the sum of all >> unmasked pixels at radius r. >> >> 2: radAsymm: given the same inputs as radProfile, return a (scalar) >> measure of radial asymmetry by computing the variance of unmasked pixels >> at each radius and combining the results. >> >> The original source file is about 1000 lines long, of which 1/3 to 1/2 >> is the basic C code and the rest is Python wrapper. >> > > It sounds small enough that you should be able to update it to the numpy > interface. What functions do you need? You should also be able to attach a > copy (zipped) if it is small enough, which might help us help you. 
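To answer the pure python question directly: if you cache the radius-index
array the way you describe, the per-radius sum is exactly what np.bincount
computes, so both functions can be written without an explicit loop. A
rough, untested sketch (I am guessing at the mask convention and at how
radAsymm actually combines the per-radius variances):

import numpy as np

def rad_profile(data, mask, r):
    """radProf: sum of the unmasked pixels at each integer radius.

    data: 2d image; mask: bool array, True where a pixel is masked;
    r: cached 2d int array of radius indices about the chosen center.
    """
    good = ~mask
    return np.bincount(r[good], weights=data[good])

def rad_asymm(data, mask, r):
    """Variance of the unmasked pixels at each radius, summed over radii."""
    good = ~mask
    n = np.bincount(r[good]).astype(float)
    s = np.bincount(r[good], weights=data[good])
    ss = np.bincount(r[good], weights=data[good] ** 2)
    nz = n > 0
    # per-radius variance from sums and sums of squares
    var = ss[nz] / n[nz] - (s[nz] / n[nz]) ** 2
    return var.sum()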
And yes, you could probably do it in cython and save yourself a bit of
interface code. I think cython will currently handle 2d arrays without much
trouble. But I don't know what the fastest approach would be here.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From cournape at gmail.com  Mon Jun 29 22:37:46 2009
From: cournape at gmail.com (David Cournapeau)
Date: Tue, 30 Jun 2009 11:37:46 +0900
Subject: [Numpy-discussion] Code for compatibility with gfortran, license issues
In-Reply-To: <4A4902B7.2030603@gmail.com>
References: <4A48527B.9020302@ar.media.kyoto-u.ac.jp>
	<3d375d730906291015q74b2e592n96fb4dc295d97649@mail.gmail.com>
	<4A4902B7.2030603@gmail.com>
Message-ID: <5b8d13220906291937g114b8a65ke11d19f73f89c096@mail.gmail.com>

On Tue, Jun 30, 2009 at 3:06 AM, Bruce Southey wrote:
>
> I would think that you could just provide an appropriately licensed package
> that combines a separately downloaded numpy/scipy with the separately
> downloaded/installed gfortran to install the new version of numpy/scipy.

That's exactly what I would like to avoid :) That's a lot of work, just
for a couple of functions.

> Essentially the same way you get non-free software like mp3 decoders for
> certain Linux distros. If that works, then perhaps a clean room
> implementation and rewrite of certain fortran code could be done to remove
> the gfortran dependencies.

As mentioned by Robert, the runtimes (both libgcc and libgfortran),
although under the GPL, have an exception so that linking against them
does not force you to release the resulting code under the GPL terms. Up
to now, numpy on windows and mac os x has been built with gcc and so has
a dependency on those runtimes. What is not clear to me is whether static
or dynamic linking makes a difference (I did not think about this when
building recent mac os x binaries with libgfortran linked statically).

IOW, the only real difference is that I would like to include some sources
under the GPLv3 (with the GCC exception) in numpy/scipy, sources which
would only be used in the cases where we would have to link against code
under the same license anyway.

David

From nwagner at iam.uni-stuttgart.de  Tue Jun 30 05:22:34 2009
From: nwagner at iam.uni-stuttgart.de (Nils Wagner)
Date: Tue, 30 Jun 2009 11:22:34 +0200
Subject: [Numpy-discussion] permutation symbol
Message-ID: 

Hi all,

How can I build the following product with numpy

q_i = \varepsilon_{ijk} q_{kj}

where \varepsilon_{ijk} denotes the permutation symbol.

Nils

http://mathworld.wolfram.com/PermutationSymbol.html

From nwagner at iam.uni-stuttgart.de  Tue Jun 30 07:11:58 2009
From: nwagner at iam.uni-stuttgart.de (Nils Wagner)
Date: Tue, 30 Jun 2009 13:11:58 +0200
Subject: [Numpy-discussion] permutation symbol
In-Reply-To: 
References: 
Message-ID: 

On Tue, 30 Jun 2009 11:22:34 +0200
 "Nils Wagner" wrote:
> Hi all,
>
> How can I build the following product with numpy
>
> q_i = \varepsilon_{ijk} q_{kj}
>
> where \varepsilon_{ijk} denotes the permutation symbol.
>
> Nils

Sorry for replying to myself.
The permutation symbol is also known as the Levi-Civita symbol.
I found an explicit expression at
http://en.wikipedia.org/wiki/Levi-Civita_symbol

How do I build the product of the Levi-Civita symbol \varepsilon_{ijk} and
the two dimensional array
q_{kj}, i,j,k = 1,2,3 ?

Nils
-------------- next part --------------
A non-text attachment was scrubbed...
Name: levi_civita.py Type: text/x-python Size: 429 bytes Desc: not available URL: From charlesr.harris at gmail.com Tue Jun 30 12:27:05 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 30 Jun 2009 10:27:05 -0600 Subject: [Numpy-discussion] permutation symbol In-Reply-To: References: Message-ID: On Tue, Jun 30, 2009 at 5:11 AM, Nils Wagner wrote: > On Tue, 30 Jun 2009 11:22:34 +0200 > "Nils Wagner" wrote: > >> Hi all, >> >> How can I build the following product with numpy >> >> q_i = \varepsilon_{ijk} q_{kj} >> >> where \varepsilon_{ijk} denotes the permutation symbol. >> >> Nils >> > Sorry for replying to myself. > The permutation symbol is also known as the Levi-Civita symbol. > I found an explicit expression at > http://en.wikipedia.org/wiki/Levi-Civita_symbol > > How do I build the product of the Levi-Civita symbol \varepsilon_{ijk} and > the two dimensional array > q_{kj}, i,j,k = 1,2,3 ? > Write it out explicitly. It essentially antisymmetrizes q and the three off diagonal elements can then be treated as a vector. Depending on how q is formed and the resulting vector is used there may be other things you can do when you use it in a more general expression. If this is part of a general calculation there might be other ways of expressing it. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From nwagner at iam.uni-stuttgart.de Tue Jun 30 12:40:31 2009 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Tue, 30 Jun 2009 18:40:31 +0200 Subject: [Numpy-discussion] permutation symbol In-Reply-To: References: Message-ID: On Tue, 30 Jun 2009 10:27:05 -0600 Charles R Harris wrote: > On Tue, Jun 30, 2009 at 5:11 AM, Nils Wagner > wrote: > >> On Tue, 30 Jun 2009 11:22:34 +0200 >> "Nils Wagner" wrote: >> >>> Hi all, >>> >>> How can I build the following product with numpy >>> >>> q_i = \varepsilon_{ijk} q_{kj} >>> >>> where \varepsilon_{ijk} denotes the permutation symbol. >>> >>> Nils >>> >> Sorry for replying to myself. >> The permutation symbol is also known as the Levi-Civita >>symbol. >> I found an explicit expression at >> http://en.wikipedia.org/wiki/Levi-Civita_symbol >> >> How do I build the product of the Levi-Civita symbol >>\varepsilon_{ijk} and >> the two dimensional array >> q_{kj}, i,j,k = 1,2,3 ? >> > > Write it out explicitly. It essentially antisymmetrizes >q and the three off > diagonal elements can then be treated as a vector. >Depending on how q is > formed and the resulting vector is used there may be >other things you can do > when you use it in a more general expression. If this is >part of a general > calculation there might be other ways of expressing it. > > Chuck Hi Chuck, Thank you for your response. The problem at hand is described in a paper by Angeles namely equation (17c) in "Automatic computation of the screw parameters of rigid-body motions. Part I: Finitely-separated positions" Journal of Dynamic systems, Measurement and Control, Vol. 108 (1986) pp. 32-38 I am looking for a pythonic implementation of the algorithm. 
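What I have so far is only the obvious brute force version - build the
epsilon tensor once and contract it with np.tensordot (untested sketch;
depending on the convention there may be a factor 1/2 in front):

import numpy as np

# Levi-Civita tensor: +1 on the even permutations of (0,1,2),
# -1 on the odd ones (one index pair swapped), 0 everywhere else
eps = np.zeros((3, 3, 3))
for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    eps[i, j, k] = 1.0
    eps[k, j, i] = -1.0

Q = np.array([[0, 0, -1], [-1, 0, 0], [0, 1, 0]], dtype=float)

# q_i = eps_{ijk} Q_{kj}: contract j with axis 1 of Q and k with axis 0
q = np.tensordot(eps, Q, axes=([1, 2], [1, 0]))
print q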
Nils From charlesr.harris at gmail.com Tue Jun 30 12:56:16 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 30 Jun 2009 10:56:16 -0600 Subject: [Numpy-discussion] permutation symbol In-Reply-To: References: Message-ID: On Tue, Jun 30, 2009 at 10:40 AM, Nils Wagner wrote: > On Tue, 30 Jun 2009 10:27:05 -0600 > Charles R Harris wrote: > > On Tue, Jun 30, 2009 at 5:11 AM, Nils Wagner > > wrote: > > > >> On Tue, 30 Jun 2009 11:22:34 +0200 > >> "Nils Wagner" wrote: > >> > >>> Hi all, > >>> > >>> How can I build the following product with numpy > >>> > >>> q_i = \varepsilon_{ijk} q_{kj} > >>> > >>> where \varepsilon_{ijk} denotes the permutation symbol. > >>> > >>> Nils > >>> > >> Sorry for replying to myself. > >> The permutation symbol is also known as the Levi-Civita > >>symbol. > >> I found an explicit expression at > >> http://en.wikipedia.org/wiki/Levi-Civita_symbol > >> > >> How do I build the product of the Levi-Civita symbol > >>\varepsilon_{ijk} and > >> the two dimensional array > >> q_{kj}, i,j,k = 1,2,3 ? > >> > > > > Write it out explicitly. It essentially antisymmetrizes > >q and the three off > > diagonal elements can then be treated as a vector. > >Depending on how q is > > formed and the resulting vector is used there may be > >other things you can do > > when you use it in a more general expression. If this is > >part of a general > > calculation there might be other ways of expressing it. > > > > Chuck > > Hi Chuck, > > Thank you for your response. > The problem at hand is described in a paper by Angeles > namely equation (17c) in > "Automatic computation of the screw parameters of > rigid-body motions. > Part I: Finitely-separated positions" > Journal of Dynamic systems, Measurement and Control, Vol. > 108 (1986) pp. 32-38 > You can solve this problem using quaternions also, in which case it reduces to an eigenvalue problem. You will note that such things as PCA are used in the papers that reference the cited work so you can't really get around that bit of inefficiency. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Tue Jun 30 13:10:39 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 30 Jun 2009 11:10:39 -0600 Subject: [Numpy-discussion] permutation symbol In-Reply-To: References: Message-ID: On Tue, Jun 30, 2009 at 10:56 AM, Charles R Harris < charlesr.harris at gmail.com> wrote: > > > On Tue, Jun 30, 2009 at 10:40 AM, Nils Wagner < > nwagner at iam.uni-stuttgart.de> wrote: > >> On Tue, 30 Jun 2009 10:27:05 -0600 >> Charles R Harris wrote: >> > On Tue, Jun 30, 2009 at 5:11 AM, Nils Wagner >> > wrote: >> > >> >> On Tue, 30 Jun 2009 11:22:34 +0200 >> >> "Nils Wagner" wrote: >> >> >> >>> Hi all, >> >>> >> >>> How can I build the following product with numpy >> >>> >> >>> q_i = \varepsilon_{ijk} q_{kj} >> >>> >> >>> where \varepsilon_{ijk} denotes the permutation symbol. >> >>> >> >>> Nils >> >>> >> >> Sorry for replying to myself. >> >> The permutation symbol is also known as the Levi-Civita >> >>symbol. >> >> I found an explicit expression at >> >> http://en.wikipedia.org/wiki/Levi-Civita_symbol >> >> >> >> How do I build the product of the Levi-Civita symbol >> >>\varepsilon_{ijk} and >> >> the two dimensional array >> >> q_{kj}, i,j,k = 1,2,3 ? >> >> >> > >> > Write it out explicitly. It essentially antisymmetrizes >> >q and the three off >> > diagonal elements can then be treated as a vector. 
>> >Depending on how q is >> > formed and the resulting vector is used there may be >> >other things you can do >> > when you use it in a more general expression. If this is >> >part of a general >> > calculation there might be other ways of expressing it. >> > >> > Chuck >> >> Hi Chuck, >> >> Thank you for your response. >> The problem at hand is described in a paper by Angeles >> namely equation (17c) in >> "Automatic computation of the screw parameters of >> rigid-body motions. >> Part I: Finitely-separated positions" >> Journal of Dynamic systems, Measurement and Control, Vol. >> 108 (1986) pp. 32-38 >> > > You can solve this problem using quaternions also, in which case it reduces > to an eigenvalue problem. You will note that such things as PCA are used in > the papers that reference the cited work so you can't really get around that > bit of inefficiency. > Here's a reference to the quaternion approach: http://people.csail.mit.edu/bkph/papers/Absolute_Orientation.pdf. You can get the translation part from the motion of the centroid. If you are into abstractions you will note that the problem reduces to minimising a quadratic form in the quaternion components. The rest is just algebra ;) Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From nwagner at iam.uni-stuttgart.de Tue Jun 30 14:26:50 2009 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Tue, 30 Jun 2009 20:26:50 +0200 Subject: [Numpy-discussion] permutation symbol In-Reply-To: References: Message-ID: On Tue, 30 Jun 2009 11:10:39 -0600 Charles R Harris wrote: > On Tue, Jun 30, 2009 at 10:56 AM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> >> >> On Tue, Jun 30, 2009 at 10:40 AM, Nils Wagner < >> nwagner at iam.uni-stuttgart.de> wrote: >> >>> On Tue, 30 Jun 2009 10:27:05 -0600 >>> Charles R Harris wrote: >>> > On Tue, Jun 30, 2009 at 5:11 AM, Nils Wagner >>> > wrote: >>> > >>> >> On Tue, 30 Jun 2009 11:22:34 +0200 >>> >> "Nils Wagner" wrote: >>> >> >>> >>> Hi all, >>> >>> >>> >>> How can I build the following product with numpy >>> >>> >>> >>> q_i = \varepsilon_{ijk} q_{kj} >>> >>> >>> >>> where \varepsilon_{ijk} denotes the permutation >>>symbol. >>> >>> >>> >>> Nils >>> >>> >>> >> Sorry for replying to myself. >>> >> The permutation symbol is also known as the >>>Levi-Civita >>> >>symbol. >>> >> I found an explicit expression at >>> >> http://en.wikipedia.org/wiki/Levi-Civita_symbol >>> >> >>> >> How do I build the product of the Levi-Civita symbol >>> >>\varepsilon_{ijk} and >>> >> the two dimensional array >>> >> q_{kj}, i,j,k = 1,2,3 ? >>> >> >>> > >>> > Write it out explicitly. It essentially >>>antisymmetrizes >>> >q and the three off >>> > diagonal elements can then be treated as a vector. >>> >Depending on how q is >>> > formed and the resulting vector is used there may be >>> >other things you can do >>> > when you use it in a more general expression. If this >>>is >>> >part of a general >>> > calculation there might be other ways of expressing >>>it. >>> > >>> > Chuck >>> >>> Hi Chuck, >>> >>> Thank you for your response. >>> The problem at hand is described in a paper by Angeles >>> namely equation (17c) in >>> "Automatic computation of the screw parameters of >>> rigid-body motions. >>> Part I: Finitely-separated positions" >>> Journal of Dynamic systems, Measurement and Control, >>>Vol. >>> 108 (1986) pp. 32-38 >>> >> >> You can solve this problem using quaternions also, in >>which case it reduces >> to an eigenvalue problem. 
You will note that such things
>>as PCA are used in
>> the papers that reference the cited work so you can't
>>really get around that
>> bit of inefficiency.
>>
>
> Here's a reference to the quaternion approach:
> http://people.csail.mit.edu/bkph/papers/Absolute_Orientation.pdf.
>You can
> get the translation part from the motion of the
>centroid.
>
> If you are into abstractions you will note that the
>problem reduces to
> minimising a quadratic form in the quaternion
>components. The rest is just
> algebra ;)
>
> Chuck

It turns out that the product is simply an invariant of a
3 \times 3 matrix.

from numpy import array, zeros, identity
from numpy.linalg import norm


def vect(A):
    """ linear invariant of a 3 x 3 matrix """
    tmp = zeros(3,float)
    tmp[0] = 0.5*(A[2,1]-A[1,2])
    tmp[1] = 0.5*(A[0,2]-A[2,0])
    tmp[2] = 0.5*(A[1,0]-A[0,1])

    return tmp

Q = array([[0,0,-1],[-1,0,0],[0,1,0]])

q = vect(Q)
print q

Nils

From d_l_goldsmith at yahoo.com  Tue Jun 30 14:28:34 2009
From: d_l_goldsmith at yahoo.com (David Goldsmith)
Date: Tue, 30 Jun 2009 11:28:34 -0700 (PDT)
Subject: [Numpy-discussion] permutation symbol
Message-ID: <511928.40212.qm@web52105.mail.re2.yahoo.com>

Hi! I can't help but wonder if - out there somewhere - there's an
optimized algorithm for implementing the general N-D Permutation Symbol:

"The symbol can be generalized to an arbitrary number of elements, in
which case the permutation symbol is (-1)^(i(p)), where i(p) is the number
of transpositions of pairs of elements (i.e., permutation inversions) that
must be composed to build up the permutation p (Skiena 1990)."

(perhaps in the referenced paper - does anyone have "free" access to such
reprints online?) which we could add to NumPy, including its
specializations to the 3-D and 2-D cases as convenience functions. (I'd
write it given the algorithm, but combinatorics was never my strong suit,
i.e., I'm sure someone else on this list could figure out at least a brute
force algorithm more efficiently than yours truly.) Alternatively, isn't
this "just" a tensor product with one factor a constant tensor, and thus
we could implement it that way using an existing resource?

DG

--- On Tue, 6/30/09, Charles R Harris wrote:

> From: Charles R Harris
> Subject: Re: [Numpy-discussion] permutation symbol
> To: "Discussion of Numerical Python"
> Date: Tuesday, June 30, 2009, 10:10 AM
>
>
> On Tue, Jun 30, 2009 at 10:56 AM,
> Charles R Harris
> wrote:
>
>
>
> On
> Tue, Jun 30, 2009 at 10:40 AM, Nils Wagner
> wrote:
>
> > On Tue, 30 Jun 2009 10:27:05 -0600
> > ?Charles R Harris
> wrote:
> > > On Tue, Jun 30, 2009 at 5:11 AM, Nils Wagner
> > > wrote:
> > >
> > >> On Tue, 30 Jun 2009 11:22:34 +0200
> > >> ?"Nils Wagner"
> wrote:
> > >>
> > >>> ?Hi all,
> > >>>
> > >>> How can I build the following product with
> numpy
> > >>>
> > >>> q_i = \varepsilon_{ijk} q_{kj}
> > >>>
> > >>> where ?\varepsilon_{ijk} denotes the
> permutation symbol.
> > >>>
> > >>> Nils
> > >>>
> > >> ?Sorry for replying to myself.
> > >> The permutation symbol is also known as the
> Levi-Civita
> > >>symbol.
> > >> I found an explicit expression at
> > >> http://en.wikipedia.org/wiki/Levi-Civita_symbol
> > >>
> > >> How do I build the product of the Levi-Civita
> symbol
> > >>\varepsilon_{ijk} and
> > >> the two dimensional array
> > >> q_{kj}, i,j,k = 1,2,3 ?
> > >>
> > >
> > > Write it out explicitly. It essentially
> antisymmetrizes
> > >q and the three off
> > > diagonal elements can then be treated as a vector.
> > >Depending on how q is > > > formed and the resulting vector is used there may be > > >other things you can do > > > when you use it in a more general expression. If this > is > > >part of a general > > > calculation there might be other ways of expressing > it. > > > > > > Chuck > > > > Hi Chuck, > > > > Thank you for your response. > > The problem at hand is described in a paper by Angeles > > namely equation (17c) in > > "Automatic computation of the screw parameters of > > rigid-body motions. > > Part I: Finitely-separated positions" > > Journal of Dynamic systems, Measurement and Control, Vol. > > 108 (1986) pp. 32-38 > > > You can solve this problem using quaternions also, in which > case it reduces to an eigenvalue problem. You will note that > such things as PCA are used in the papers that reference the > cited work so you can't really get around that bit of > inefficiency. > > > > Here's a reference to the quaternion approach: http://people.csail.mit.edu/bkph/papers/Absolute_Orientation.pdf. > You can get the translation part from the motion of the > centroid. > > ? > If you are into abstractions you will note that the problem > reduces to minimising a quadratic form in the quaternion > components. The rest is just algebra ;) > > Chuck > > > > -----Inline Attachment Follows----- > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From d_l_goldsmith at yahoo.com Tue Jun 30 14:30:51 2009 From: d_l_goldsmith at yahoo.com (David Goldsmith) Date: Tue, 30 Jun 2009 11:30:51 -0700 (PDT) Subject: [Numpy-discussion] permutation symbol Message-ID: <837214.99427.qm@web52106.mail.re2.yahoo.com> Great, Nils! Now, can you generalize it to N-D for us? ;-) DG --- On Tue, 6/30/09, Nils Wagner wrote: > From: Nils Wagner > Subject: Re: [Numpy-discussion] permutation symbol > To: "Discussion of Numerical Python" > Date: Tuesday, June 30, 2009, 11:26 AM > On Tue, 30 Jun 2009 11:10:39 -0600 > ? Charles R Harris > wrote: > > On Tue, Jun 30, 2009 at 10:56 AM, Charles R Harris > < > > charlesr.harris at gmail.com> > wrote: > > > >> > >> > >> On Tue, Jun 30, 2009 at 10:40 AM, Nils Wagner > < > >> nwagner at iam.uni-stuttgart.de> > wrote: > >> > >>> On Tue, 30 Jun 2009 10:27:05 -0600 > >>>? Charles R Harris > wrote: > >>> > On Tue, Jun 30, 2009 at 5:11 AM, Nils > Wagner > >>> > wrote: > >>> > > >>> >> On Tue, 30 Jun 2009 11:22:34 +0200 > >>> >>? "Nils Wagner" > wrote: > >>> >> > >>> >>>? Hi all, > >>> >>> > >>> >>> How can I build the following > product with numpy > >>> >>> > >>> >>> q_i = \varepsilon_{ijk} q_{kj} > >>> >>> > >>> >>> where? \varepsilon_{ijk} > denotes the permutation > >>>symbol. > >>> >>> > >>> >>> Nils > >>> >>> > >>> >>? Sorry for replying to myself. > >>> >> The permutation symbol is also known > as the > >>>Levi-Civita > >>> >>symbol. > >>> >> I found an explicit expression at > >>> >> http://en.wikipedia.org/wiki/Levi-Civita_symbol > >>> >> > >>> >> How do I build the product of the > Levi-Civita symbol > >>> >>\varepsilon_{ijk} and > >>> >> the two dimensional array > >>> >> q_{kj}, i,j,k = 1,2,3 ? > >>> >> > >>> > > >>> > Write it out explicitly. It essentially > >>>antisymmetrizes > >>> >q and the three off > >>> > diagonal elements can then be treated as > a vector. > >>> >Depending on how q is > >>> > formed and the resulting vector is used > there may be > >>> >other things you can do > >>> > when you use it in a more general > expression. 
If this > >>>is > >>> >part of a general > >>> > calculation there might be other ways of > expressing > >>>it. > >>> > > >>> > Chuck > >>> > >>> Hi Chuck, > >>> > >>> Thank you for your response. > >>> The problem at hand is described in a paper by > Angeles > >>> namely equation (17c) in > >>> "Automatic computation of the screw parameters > of > >>> rigid-body motions. > >>> Part I: Finitely-separated positions" > >>> Journal of Dynamic systems, Measurement and > Control, > >>>Vol. > >>> 108 (1986) pp. 32-38 > >>> > >> > >> You can solve this problem using quaternions also, > in > >>which case it reduces > >> to an eigenvalue problem. You will note that such > things > >>as PCA are used in > >> the papers that reference the cited work so you > can't > >>really get around that > >> bit of inefficiency. > >> > > > > Here's a reference to the quaternion approach: > > http://people.csail.mit.edu/bkph/papers/Absolute_Orientation.pdf. > > >You can > > get the translation part from the motion of the > >centroid. > > > > If you are into abstractions you will note that the > >problem reduces to > > minimising a quadratic form in the quaternion > >components. The rest is just > > algebra ;) > > > > Chuck > > It turns out that the product is simply an invariant of a > 3 \times 3 matrix. > > from numpy import array, zeros, identity > from numpy.linalg import norm > > > def vect(A): > ? ???""" linear invariant of a 3 x 3 > matrix """ > ? ???tmp = zeros(3,float) > ? ???tmp[0] = 0.5*(A[2,1]-A[1,2]) > ? ???tmp[1] = 0.5*(A[0,2]-A[2,0]) > ? ???tmp[2] = 0.5*(A[1,0]-A[0,1]) > > ? ???return tmp > > Q = array([[0,0,-1],[-1,0,0],[0,1,0]]) > > q = vect(Q) > print q > ? > > Nils > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From rowen at u.washington.edu Tue Jun 30 14:31:35 2009 From: rowen at u.washington.edu (Russell E. Owen) Date: Tue, 30 Jun 2009 11:31:35 -0700 Subject: [Numpy-discussion] Advice on converting Numarray C extension? References: Message-ID: In article , Charles R Harris wrote: > On Mon, Jun 29, 2009 at 4:17 PM, Russell E. Owen > wrote: > > > In article > > , > > Charles R Harris wrote: > > > > > On Mon, Jun 29, 2009 at 3:03 PM, Russell E. Owen > > > wrote: > > > > > > > I have an old Numarray C extension (or, rather, a Python package > > > > containing a C extension) that I would like to convert to numpy > > > > (in a way that is likely to be supported long-term). > > > > > > How big is the extension and what does it do? > > > > It basically contains 2 functions: > > 1: radProfile: given a masked image (2d array), a radius and a desired > > center: compute a new 1d array whose value at index r is the sum of all > > unmasked pixels at radius r. > > > > 2: radAsymm: given the same inputs as radProfile, return a (scalar) > > measure of radial asymmetry by computing the variance of unmasked pixels > > at each radius and combining the results. > > > > The original source file is about 1000 lines long, of which 1/3 to 1/2 > > is the basic C code and the rest is Python wrapper. > > It sounds small enough that you should be able to update it to the numpy > interface. What functions do you need? You should also be able to attach a > copy (zipped) if it is small enough, which might help us help you. It is the PyGuide package a 525k zip file. The extension code is in the src directory. 
I would certainly be grateful for any pointers to how the old numarray C API functions map to the new numpy ones. I would prefer to use the new numpy API if I can figure out what to do. -- Russell From charlesr.harris at gmail.com Tue Jun 30 14:48:54 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 30 Jun 2009 12:48:54 -0600 Subject: [Numpy-discussion] permutation symbol In-Reply-To: References: Message-ID: On Tue, Jun 30, 2009 at 12:26 PM, Nils Wagner wrote: > On Tue, 30 Jun 2009 11:10:39 -0600 > Charles R Harris wrote: > > On Tue, Jun 30, 2009 at 10:56 AM, Charles R Harris < > > charlesr.harris at gmail.com> wrote: > > > >> > >> > >> On Tue, Jun 30, 2009 at 10:40 AM, Nils Wagner < > >> nwagner at iam.uni-stuttgart.de> wrote: > >> > >>> On Tue, 30 Jun 2009 10:27:05 -0600 > >>> Charles R Harris wrote: > >>> > On Tue, Jun 30, 2009 at 5:11 AM, Nils Wagner > >>> > wrote: > >>> > > >>> >> On Tue, 30 Jun 2009 11:22:34 +0200 > >>> >> "Nils Wagner" wrote: > >>> >> > >>> >>> Hi all, > >>> >>> > >>> >>> How can I build the following product with numpy > >>> >>> > >>> >>> q_i = \varepsilon_{ijk} q_{kj} > >>> >>> > >>> >>> where \varepsilon_{ijk} denotes the permutation > >>>symbol. > >>> >>> > >>> >>> Nils > >>> >>> > >>> >> Sorry for replying to myself. > >>> >> The permutation symbol is also known as the > >>>Levi-Civita > >>> >>symbol. > >>> >> I found an explicit expression at > >>> >> http://en.wikipedia.org/wiki/Levi-Civita_symbol > >>> >> > >>> >> How do I build the product of the Levi-Civita symbol > >>> >>\varepsilon_{ijk} and > >>> >> the two dimensional array > >>> >> q_{kj}, i,j,k = 1,2,3 ? > >>> >> > >>> > > >>> > Write it out explicitly. It essentially > >>>antisymmetrizes > >>> >q and the three off > >>> > diagonal elements can then be treated as a vector. > >>> >Depending on how q is > >>> > formed and the resulting vector is used there may be > >>> >other things you can do > >>> > when you use it in a more general expression. If this > >>>is > >>> >part of a general > >>> > calculation there might be other ways of expressing > >>>it. > >>> > > >>> > Chuck > >>> > >>> Hi Chuck, > >>> > >>> Thank you for your response. > >>> The problem at hand is described in a paper by Angeles > >>> namely equation (17c) in > >>> "Automatic computation of the screw parameters of > >>> rigid-body motions. > >>> Part I: Finitely-separated positions" > >>> Journal of Dynamic systems, Measurement and Control, > >>>Vol. > >>> 108 (1986) pp. 32-38 > >>> > >> > >> You can solve this problem using quaternions also, in > >>which case it reduces > >> to an eigenvalue problem. You will note that such things > >>as PCA are used in > >> the papers that reference the cited work so you can't > >>really get around that > >> bit of inefficiency. > >> > > > > Here's a reference to the quaternion approach: > > http://people.csail.mit.edu/bkph/papers/Absolute_Orientation.pdf. > >You can > > get the translation part from the motion of the > >centroid. > > > > If you are into abstractions you will note that the > >problem reduces to > > minimising a quadratic form in the quaternion > >components. The rest is just > > algebra ;) > > > > Chuck > > It turns out that the product is simply an invariant of a > 3 \times 3 matrix. 
> > from numpy import array, zeros, identity > from numpy.linalg import norm > > > def vect(A): > """ linear invariant of a 3 x 3 matrix """ > tmp = zeros(3,float) > tmp[0] = 0.5*(A[2,1]-A[1,2]) > tmp[1] = 0.5*(A[0,2]-A[2,0]) > tmp[2] = 0.5*(A[1,0]-A[0,1]) > > return tmp > Yep, that's writing it out explicitly. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From nwagner at iam.uni-stuttgart.de Tue Jun 30 14:50:31 2009 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Tue, 30 Jun 2009 20:50:31 +0200 Subject: [Numpy-discussion] permutation symbol In-Reply-To: <837214.99427.qm@web52106.mail.re2.yahoo.com> References: <837214.99427.qm@web52106.mail.re2.yahoo.com> Message-ID: On Tue, 30 Jun 2009 11:30:51 -0700 (PDT) David Goldsmith wrote: > > Great, Nils! Now, can you generalize it to N-D for us? >;-) > > DG Just curious - Do you have any application for N-D case in mind ? Nils From charlesr.harris at gmail.com Tue Jun 30 14:51:54 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 30 Jun 2009 12:51:54 -0600 Subject: [Numpy-discussion] permutation symbol In-Reply-To: <837214.99427.qm@web52106.mail.re2.yahoo.com> References: <837214.99427.qm@web52106.mail.re2.yahoo.com> Message-ID: On Tue, Jun 30, 2009 at 12:30 PM, David Goldsmith wrote: > > Great, Nils! Now, can you generalize it to N-D for us? ;-) > > It's generally better to use the exterior calculus, determinants, and various other tricks. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Tue Jun 30 15:00:46 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 30 Jun 2009 13:00:46 -0600 Subject: [Numpy-discussion] Advice on converting Numarray C extension? In-Reply-To: References: Message-ID: On Tue, Jun 30, 2009 at 12:31 PM, Russell E. Owen wrote: > In article > , > Charles R Harris wrote: > > > On Mon, Jun 29, 2009 at 4:17 PM, Russell E. Owen > > wrote: > > > > > In article > > > , > > > Charles R Harris wrote: > > > > > > > On Mon, Jun 29, 2009 at 3:03 PM, Russell E. Owen > > > > wrote: > > > > > > > > > I have an old Numarray C extension (or, rather, a Python package > > > > > containing a C extension) that I would like to convert to numpy > > > > > (in a way that is likely to be supported long-term). > > > > > > > > How big is the extension and what does it do? > > > > > > It basically contains 2 functions: > > > 1: radProfile: given a masked image (2d array), a radius and a desired > > > center: compute a new 1d array whose value at index r is the sum of all > > > unmasked pixels at radius r. > > > > > > 2: radAsymm: given the same inputs as radProfile, return a (scalar) > > > measure of radial asymmetry by computing the variance of unmasked > pixels > > > at each radius and combining the results. > > > > > > The original source file is about 1000 lines long, of which 1/3 to 1/2 > > > is the basic C code and the rest is Python wrapper. > > > > It sounds small enough that you should be able to update it to the numpy > > interface. What functions do you need? You should also be able to attach > a > > copy (zipped) if it is small enough, which might help us help you. > > It is the PyGuide package > > a 525k zip file. The extension code is in the src directory. > > I would certainly be grateful for any pointers to how the old numarray C > API functions map to the new numpy ones. I would prefer to use the new > numpy API if I can figure out what to do. 
> You can look at the numpy/numarray/_capi.c file where the translation from numpy to numarray is located. A lot of the functions map directly, others are more complicated. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Tue Jun 30 15:51:15 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 30 Jun 2009 13:51:15 -0600 Subject: [Numpy-discussion] permutation symbol In-Reply-To: References: Message-ID: On Tue, Jun 30, 2009 at 12:26 PM, Nils Wagner wrote: > On Tue, 30 Jun 2009 11:10:39 -0600 > Charles R Harris wrote: > > On Tue, Jun 30, 2009 at 10:56 AM, Charles R Harris < > > charlesr.harris at gmail.com> wrote: > > > >> > >> > >> On Tue, Jun 30, 2009 at 10:40 AM, Nils Wagner < > >> nwagner at iam.uni-stuttgart.de> wrote: > >> > >>> On Tue, 30 Jun 2009 10:27:05 -0600 > >>> Charles R Harris wrote: > >>> > On Tue, Jun 30, 2009 at 5:11 AM, Nils Wagner > >>> > wrote: > >>> > > >>> >> On Tue, 30 Jun 2009 11:22:34 +0200 > >>> >> "Nils Wagner" wrote: > >>> >> > >>> >>> Hi all, > >>> >>> > >>> >>> How can I build the following product with numpy > >>> >>> > >>> >>> q_i = \varepsilon_{ijk} q_{kj} > >>> >>> > >>> >>> where \varepsilon_{ijk} denotes the permutation > >>>symbol. > >>> >>> > >>> >>> Nils > >>> >>> > >>> >> Sorry for replying to myself. > >>> >> The permutation symbol is also known as the > >>>Levi-Civita > >>> >>symbol. > >>> >> I found an explicit expression at > >>> >> http://en.wikipedia.org/wiki/Levi-Civita_symbol > >>> >> > >>> >> How do I build the product of the Levi-Civita symbol > >>> >>\varepsilon_{ijk} and > >>> >> the two dimensional array > >>> >> q_{kj}, i,j,k = 1,2,3 ? > >>> >> > >>> > > >>> > Write it out explicitly. It essentially > >>>antisymmetrizes > >>> >q and the three off > >>> > diagonal elements can then be treated as a vector. > >>> >Depending on how q is > >>> > formed and the resulting vector is used there may be > >>> >other things you can do > >>> > when you use it in a more general expression. If this > >>>is > >>> >part of a general > >>> > calculation there might be other ways of expressing > >>>it. > >>> > > >>> > Chuck > >>> > >>> Hi Chuck, > >>> > >>> Thank you for your response. > >>> The problem at hand is described in a paper by Angeles > >>> namely equation (17c) in > >>> "Automatic computation of the screw parameters of > >>> rigid-body motions. > >>> Part I: Finitely-separated positions" > >>> Journal of Dynamic systems, Measurement and Control, > >>>Vol. > >>> 108 (1986) pp. 32-38 > >>> > >> > >> You can solve this problem using quaternions also, in > >>which case it reduces > >> to an eigenvalue problem. You will note that such things > >>as PCA are used in > >> the papers that reference the cited work so you can't > >>really get around that > >> bit of inefficiency. > >> > > > > Here's a reference to the quaternion approach: > > http://people.csail.mit.edu/bkph/papers/Absolute_Orientation.pdf. > >You can > > get the translation part from the motion of the > >centroid. > > > > If you are into abstractions you will note that the > >problem reduces to > > minimising a quadratic form in the quaternion > >components. The rest is just > > algebra ;) > > > > Chuck > > It turns out that the product is simply an invariant of a > 3 \times 3 matrix. 
> > from numpy import array, zeros, identity > from numpy.linalg import norm > > > def vect(A): > """ linear invariant of a 3 x 3 matrix """ > tmp = zeros(3,float) > tmp[0] = 0.5*(A[2,1]-A[1,2]) > tmp[1] = 0.5*(A[0,2]-A[2,0]) > tmp[2] = 0.5*(A[1,0]-A[0,1]) > > return tmp Out of curiosity, where did the .5 come from? It is not normally part of the Levi-Civita symbol. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From d_l_goldsmith at yahoo.com Tue Jun 30 16:02:23 2009 From: d_l_goldsmith at yahoo.com (David Goldsmith) Date: Tue, 30 Jun 2009 13:02:23 -0700 (PDT) Subject: [Numpy-discussion] permutation symbol Message-ID: <547672.76551.qm@web52101.mail.re2.yahoo.com> No, but my guess is that it might be useful in numerical General Relativity, e.g., as well as pedagogically in teaching differential geometry and tensor algebra. DG --- On Tue, 6/30/09, Nils Wagner wrote: > From: Nils Wagner > Subject: Re: [Numpy-discussion] permutation symbol > To: "Discussion of Numerical Python" > Date: Tuesday, June 30, 2009, 11:50 AM > On Tue, 30 Jun 2009 11:30:51 -0700 > (PDT) > ? David Goldsmith > wrote: > > > > Great, Nils!? Now, can you generalize it to N-D > for us? > >;-) > > > > DG > ? > Just curious - Do you have any application for N-D case in > > mind ? > > Nils > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From charlesr.harris at gmail.com Tue Jun 30 16:17:52 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 30 Jun 2009 14:17:52 -0600 Subject: [Numpy-discussion] permutation symbol In-Reply-To: <547672.76551.qm@web52101.mail.re2.yahoo.com> References: <547672.76551.qm@web52101.mail.re2.yahoo.com> Message-ID: On Tue, Jun 30, 2009 at 2:02 PM, David Goldsmith wrote: > > No, but my guess is that it might be useful in numerical General > Relativity, e.g., as well as pedagogically in teaching differential geometry > and tensor algebra. > > DG > It shouldn't be difficult to generate all the permutations of the indices and assign +/-1 as appropriate. The rest of the entries would be zero. There was a discussion about generating permutations a while back... Hmm, using one of the recursive routines it looks like it shouldn't be too difficult to track even and odd permutations. But for general relativity, four indices with four values, it would probably be quickest to just write down the 24 permutations and use a singleton array, i.e., generate the array once and just return references or copies. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Tue Jun 30 21:00:00 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 30 Jun 2009 19:00:00 -0600 Subject: [Numpy-discussion] Advice on converting Numarray C extension? In-Reply-To: References: Message-ID: On Tue, Jun 30, 2009 at 12:31 PM, Russell E. Owen wrote: > In article > , > Charles R Harris wrote: > > > On Mon, Jun 29, 2009 at 4:17 PM, Russell E. Owen > > wrote: > > > > > In article > > > , > > > Charles R Harris wrote: > > > > > > > On Mon, Jun 29, 2009 at 3:03 PM, Russell E. Owen > > > > wrote: > > > > > > > > > I have an old Numarray C extension (or, rather, a Python package > > > > > containing a C extension) that I would like to convert to numpy > > > > > (in a way that is likely to be supported long-term). > > > > > > > > How big is the extension and what does it do? 
> > > It basically contains 2 functions:
> > > 1: radProfile: given a masked image (2d array), a radius and a desired
> > > center: compute a new 1d array whose value at index r is the sum of all
> > > unmasked pixels at radius r.
> > >
> > > 2: radAsymm: given the same inputs as radProfile, return a (scalar)
> > > measure of radial asymmetry by computing the variance of unmasked
> > > pixels at each radius and combining the results.
> > >
> > > The original source file is about 1000 lines long, of which 1/3 to 1/2
> > > is the basic C code and the rest is Python wrapper.
> >
> > It sounds small enough that you should be able to update it to the numpy
> > interface. What functions do you need? You should also be able to attach
> > a copy (zipped) if it is small enough, which might help us help you.
>
> It is the PyGuide package
> a 525k zip file. The extension code is in the src directory.
>
> I would certainly be grateful for any pointers to how the old numarray C
> API functions map to the new numpy ones. I would prefer to use the new
> numpy API if I can figure out what to do.
>

This doesn't look too bad, I only count 5 functions/macros:

NA_InputArray
NA_OutputArray
NA_ShapeEqual
NA_NewArray
NA_OFFSETDATA

The quick and dirty solution would be to just copy those functions in at
the top of your code. You might want to fix up the NumarrayType enum
instead of including it, and a few other such things. The code looks like
it would go over into cython fairly nicely since it is split between
interface code, which would look good in python, and a couple of pure C
functions. If you have the time that might be a good way to go.

Chuck
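PS: if you end up redoing the interface half in python, the coercion that
NA_InputArray was doing for you is roughly np.ascontiguousarray at the
python level, and the per-radius sum is the same np.bincount trick
mentioned earlier in the thread. A rough, untested sketch of radProfile
(argument layout guessed from your description):

import numpy as np

def rad_profile(data, mask, xc, yc):
    """Sum of unmasked pixels at each integer radius about (xc, yc)."""
    # coerce to well-behaved arrays, as NA_InputArray did on the C side
    data = np.ascontiguousarray(data, dtype=np.float64)
    mask = np.ascontiguousarray(mask, dtype=bool)
    if data.shape != mask.shape:
        raise ValueError("data and mask must have the same shape")
    # integer radius index of every pixel
    y, x = np.indices(data.shape)
    r = np.around(np.hypot(x - xc, y - yc)).astype(int)
    good = ~mask
    return np.bincount(r[good], weights=data[good])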