From chunwei.yuan at gmail.com  Thu Aug  3 13:00:03 2017
From: chunwei.yuan at gmail.com (Chun-Wei Yuan)
Date: Thu, 3 Aug 2017 10:00:03 -0700
Subject: [Numpy-discussion] quantile() or percentile()

Any way I can help expedite this?

On Fri, Jul 21, 2017 at 4:42 PM, Chun-Wei Yuan wrote:
> That would be great. I just used np.argsort because it was familiar to
> me. Didn't know about the C code.
>
> On Fri, Jul 21, 2017 at 3:43 PM, Joseph Fox-Rabinovitz wrote:
>> While #9211 is a good start, it is pretty inefficient in that it
>> performs an O(n log n) sort of the array. It is possible to reduce
>> the time to O(n) by using a partitioning algorithm similar to the one
>> in the C code of percentile. I will look into it as soon as I can.
>>
>> -Joe
>>
>> On Fri, Jul 21, 2017 at 5:34 PM, Chun-Wei Yuan wrote:
>>> Just to provide some context, 9213 actually spawned off of this guy:
>>>
>>> https://github.com/numpy/numpy/pull/9211
>>>
>>> which might address the weighted-inputs issue Joe brought up.
>>>
>>> C
>>>
>>> On Fri, Jul 21, 2017 at 2:21 PM, Joseph Fox-Rabinovitz wrote:
>>>> I think that there would be a very good reason to have a separate
>>>> function if we were to introduce weights to the inputs, similarly
>>>> to the way that we have mean and average. This would have some
>>>> (positive) repercussions, like making weighted histograms with the
>>>> Freedman-Diaconis binwidth estimator a possibility. I have had this
>>>> change on the back-burner for a long time, mainly because I was too
>>>> lazy to figure out how to include it in the C code. However, I will
>>>> take a closer look.
>>>>
>>>> Regards,
>>>>
>>>> -Joe
>>>>
>>>> On Fri, Jul 21, 2017 at 5:11 PM, Chun-Wei Yuan wrote:
>>>>> There's an ongoing effort to introduce quantile() into numpy.
>>>>> You'd use it just like percentile(), but would input your q value
>>>>> in probability space (0.5 for 50%):
>>>>>
>>>>> https://github.com/numpy/numpy/pull/9213
>>>>>
>>>>> Since there's a great deal of overlap between these two functions,
>>>>> we'd like to solicit opinions on how to move forward on this.
>>>>>
>>>>> The current thinking is to tolerate the redundancy and keep both,
>>>>> using one as the engine for the other. I'm partial to having
>>>>> quantile because 1.) I prefer probability space, and 2.) I have a
>>>>> PR waiting on quantile().
>>>>>
>>>>> Best,
>>>>>
>>>>> C
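To make the two call styles concrete, a minimal sketch; `np.quantile` here
is the function proposed in PR 9213 and remains hypothetical until that PR
lands:

    import numpy as np

    x = np.arange(1, 101, dtype=float)

    # existing API: q is given in percent
    p25 = np.percentile(x, 25)

    # proposed API (PR 9213): the same q, in probability space
    # q25 = np.quantile(x, 0.25)   # hypothetical until the PR is merged
    # assert q25 == p25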
From jfoxrabinovitz at gmail.com  Thu Aug  3 14:10:20 2017
From: jfoxrabinovitz at gmail.com (Joseph Fox-Rabinovitz)
Date: Thu, 3 Aug 2017 14:10:20 -0400
Subject: [Numpy-discussion] quantile() or percentile()

Not that I know of. The algorithm is very simple, requiring a
relatively small addition to the current introselect algorithm used
for `np.partition`. My biggest hurdle is figuring out how the calling
machinery really works, so that I can figure out which input-type
permutations I need to generate and how to get the right backend
running for a given function call.

-Joe

On Thu, Aug 3, 2017 at 1:00 PM, Chun-Wei Yuan wrote:
> Any way I can help expedite this?
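For reference, the O(n) selection Joe mentions is what `np.partition`
(introselect) already provides for a single order statistic; a quick
sketch:

    import numpy as np

    x = np.random.rand(10001)
    k = (len(x) - 1) // 4      # index of the lower-quartile order statistic

    # introselect-based partition: O(n) on average, no full O(n log n) sort
    kth_smallest = np.partition(x, k)[k]

    # agrees with the fully sorted result (interpolation conventions aside)
    assert kth_smallest == np.sort(x)[k]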
From chunwei.yuan at gmail.com  Thu Aug  3 17:36:45 2017
From: chunwei.yuan at gmail.com (Chun-Wei Yuan)
Date: Thu, 3 Aug 2017 14:36:45 -0700
Subject: [Numpy-discussion] quantile() or percentile()

Cool. Just as a heads up: for my algorithm to work, I actually need the
indices, which is why argsort() is so important to me. I use it to get
both the ap_sorted and ws_sorted variables. If your weighted-quantile
algo is faster and doesn't require those indices, please by all means
change my implementation. Thanks.

On Thu, Aug 3, 2017 at 11:10 AM, Joseph Fox-Rabinovitz wrote:
> Not that I know of. The algorithm is very simple, requiring a
> relatively small addition to the current introselect algorithm used
> for `np.partition`.
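A rough sketch of the argsort-based weighted quantile being described; the
`ap_sorted` and `ws_sorted` names follow the PR, but the interpolation
convention below is illustrative rather than the PR's exact code:

    import numpy as np

    def weighted_quantile(a, q, weights):
        a = np.asarray(a, dtype=float)
        w = np.asarray(weights, dtype=float)
        sorter = np.argsort(a)     # the indices are the crucial piece
        ap_sorted = a[sorter]      # data points in ascending order
        ws_sorted = w[sorter]      # weights carried along by the same indices
        cw = np.cumsum(ws_sorted)
        # place each sample at a position in probability space
        pos = (cw - 0.5 * ws_sorted) / cw[-1]
        return np.interp(q, pos, ap_sorted)

    # with equal weights this agrees closely with the unweighted quantile
    x = np.random.rand(1001)
    print(weighted_quantile(x, 0.5, np.ones_like(x)), np.percentile(x, 50))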
From jfoxrabinovitz at gmail.com  Fri Aug  4 14:09:23 2017
From: jfoxrabinovitz at gmail.com (Joseph Fox-Rabinovitz)
Date: Fri, 4 Aug 2017 14:09:23 -0400
Subject: [Numpy-discussion] quantile() or percentile()

I will go over your PR carefully to make sure we can agree on a
matching API. After that, we can swap the backend out whenever I get
around to it. Thanks for working on this.

-Joe

On Thu, Aug 3, 2017 at 5:36 PM, Chun-Wei Yuan wrote:
> Cool. Just as a heads up: for my algorithm to work, I actually need
> the indices, which is why argsort() is so important to me.
From laytonjb at att.net  Fri Aug  4 15:24:28 2017
From: laytonjb at att.net (Jeff Layton)
Date: Fri, 4 Aug 2017 15:24:28 -0400
Subject: [Numpy-discussion] F2PY problems with PGI compilers

Good afternoon!

I'm trying to build a Python module using F2PY on a simple Fortran code
using the PGI 17.4 community compilers.

I'm using Conda 4.3.21 with Python 2.7.13 and F2PY 2. The command line
I'm using is:

    f2py --compiler=pg --fcompiler=pg -c -m mdevice mdevice.f90

The output from f2py is at the end of the email. Any suggestions are
greatly appreciated.

Thanks!

Jeff

Output from f2py:

    running build
    running config_cc
    unifing config_cc, config, build_clib, build_ext, build commands --compiler options
    running config_fc
    unifing config_fc, config, build_clib, build_ext, build commands --fcompiler options
    running build_src
    build_src
    building extension "mdevice" sources
    f2py options: []
    f2py:> /tmp/tmptN1fdp/src.linux-x86_64-2.7/mdevicemodule.c
    creating /tmp/tmptN1fdp/src.linux-x86_64-2.7
    Reading fortran codes...
        Reading file 'mdevice.f90' (format:free)
    Post-processing...
        Block: mdevice
            Block: devicequery
    In: :mdevice:mdevice.f90:devicequery
    get_useparameters: no module cudafor info used by devicequery
    Post-processing (stage 2)...
    Building modules...
        Building module "mdevice"...
            Constructing wrapper function "devicequery"...
              devicequery()
        Wrote C/API module "mdevice" to file "/tmp/tmptN1fdp/src.linux-x86_64-2.7/mdevicemodule.c"
    adding '/tmp/tmptN1fdp/src.linux-x86_64-2.7/fortranobject.c' to sources.
    adding '/tmp/tmptN1fdp/src.linux-x86_64-2.7' to include_dirs.
    copying /home/laytonjb/anaconda2/lib/python2.7/site-packages/numpy/f2py/src/fortranobject.c -> /tmp/tmptN1fdp/src.linux-x86_64-2.7
    copying /home/laytonjb/anaconda2/lib/python2.7/site-packages/numpy/f2py/src/fortranobject.h -> /tmp/tmptN1fdp/src.linux-x86_64-2.7
    build_src: building npy-pkg config files
    running build_ext
    error: don't know how to compile C/C++ code on platform 'posix' with 'pg' compiler

From jfoxrabinovitz at gmail.com  Fri Aug  4 18:31:52 2017
From: jfoxrabinovitz at gmail.com (Joseph Fox-Rabinovitz)
Date: Fri, 4 Aug 2017 18:31:52 -0400
Subject: [Numpy-discussion] ENH: Proposal to add np.neighborwise in PR#9514

I would like to propose the addition of a new function, `np.neighborwise`,
in PR#9514. It is based on the discussion relating to my proposal for
`np.ratio` (PR#9481) and Eric Wieser's `np.neighborwise` in PR#9428. This
function accepts an array `a` and a vectorized function of two arguments
`func`, and applies the function to all of the neighboring elements of the
array across multiple dimensions. There are options for masking out parts
of the calculation and for applying the function recursively.

The name of the function is not written in stone. The current name is
taken directly from PR#9428 because I cannot think of a better one.

This function can serve as a backend for the existing `np.diff`, which has
been re-implemented in this PR, as well as for the `ratio` function I
proposed earlier. This adds the diagonal-diffs feature, which is tested
and backwards compatible. `ratio` can be implemented very simply with or
without a mask. With a mask, it can be expressed as
`np.neighborwise(a, np.*_divide, axis=axis, n=n, mask=lambda *args: args[1])`
(the conversion to bool is done automatically).

The one potentially non-backwards-compatible API change that this PR
introduces is that `np.diff` now returns an `ndarray` version of the
input, instead of the original array itself, if `n==0`. Previously, the
exact input reference was returned for `n==0`. I very seriously doubt
that this feature was ever used outside the numpy test suite anyway. The
advantage of this change is that an invalid axis input can now be caught
before returning the unaltered array. If this change is considered too
drastic, I can remove it without removing the axis check.

The two main differences between this PR and PR#9428 are the addition of
masks to the computation and the interpretation of multiple axes. PR#9428
applies `func` successively along each axis, which provides no way of
doing diagonal diffs. I chose to shift along all the axes simultaneously
before applying `func`. To clarify with an example: if we take
`a=[[1, 2], [3, 4]]`, `axis=[0, 1]` and `func=np.subtract`, PR#9428 would
take two diffs, `(4 - 2) - (3 - 1) = 0`, while the version I propose here
just takes the diagonal diff `4 - 1 = 3`. Besides being more intuitive in
my opinion, taking diagonal diffs actually adds a new feature that cannot
be obtained directly by taking successive diffs.

Please let me know your thoughts.

Regards,

-Joe
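Since `np.neighborwise` exists only in the PR, the example above can be
reproduced with plain slicing; this sketch contrasts the two
interpretations of multiple axes:

    import numpy as np

    a = np.array([[1, 2], [3, 4]])

    # this PR: shift along both axes simultaneously, then apply func
    diag_diff = np.subtract(a[1:, 1:], a[:-1, :-1])
    print(diag_diff)    # [[3]], i.e. 4 - 1

    # PR#9428: apply func successively along each axis
    succ_diff = np.diff(np.diff(a, axis=0), axis=1)
    print(succ_diff)    # [[0]], i.e. (4 - 2) - (3 - 1)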
From ben.v.root at gmail.com  Fri Aug  4 21:44:18 2017
From: ben.v.root at gmail.com (Benjamin Root)
Date: Fri, 4 Aug 2017 21:44:18 -0400
Subject: [Numpy-discussion] ENH: Proposal to add np.neighborwise in PR#9514

So, this is a kernel mechanism?

On Fri, Aug 4, 2017 at 6:31 PM, Joseph Fox-Rabinovitz wrote:
> I would like to propose the addition of a new function,
> `np.neighborwise`, in PR#9514.

From tim at cerazone.net  Fri Aug  4 22:54:13 2017
From: tim at cerazone.net (Tim Cera)
Date: Sat, 05 Aug 2017 02:54:13 +0000
Subject: [Numpy-discussion] ENH: Proposal to add np.neighborwise in PR#9514

As noted in https://github.com/numpy/numpy/pull/303, a large part of this
functionality has been implemented before for numpy and didn't go
anywhere, because it is already present in scipy.ndimage.

IMHO it is better suited in numpy, with a better name so that people
don't miss it.

Kindest regards,
Tim

On Fri, Aug 4, 2017 at 6:33 PM Joseph Fox-Rabinovitz wrote:
> I would like to propose the addition of a new function,
> `np.neighborwise`, in PR#9514.
From stefanv at berkeley.edu  Sat Aug  5 19:55:08 2017
From: stefanv at berkeley.edu (Stefan van der Walt)
Date: Sat, 05 Aug 2017 16:55:08 -0700
Subject: [Numpy-discussion] ENH: Proposal to add np.neighborwise in PR#9514

On Fri, Aug 4, 2017, at 19:54, Tim Cera wrote:
> As noted in https://github.com/numpy/numpy/pull/303, a large part of
> this functionality has been implemented before for numpy and didn't go
> anywhere, because it is already present in scipy.ndimage.
>
> IMHO it is better suited in numpy, with a better name so that people
> don't miss it.

Is this essentially `scipy.ndimage.generic_filter`?

Stéfan
From tim at cerazone.net  Sun Aug  6 00:01:32 2017
From: tim at cerazone.net (Tim Cera)
Date: Sun, 06 Aug 2017 04:01:32 +0000
Subject: [Numpy-discussion] ENH: Proposal to add np.neighborwise in PR#9514

If you're into reading ancient history, here is the link to the discussion
where Zachary Pincus makes the same observation. My response was to close
the PR because I could use scipy.ndimage.generic_filter, even though, at
least through my eyes, my implementation was nicer.

http://numpy-discussion.10968.n7.nabble.com/Fwd-numpy-ENH-Initial-implementation-of-a-neighbor-calculation-303-td27508.html

On Sat, Aug 5, 2017 at 7:56 PM Stefan van der Walt wrote:
> Is this essentially `scipy.ndimage.generic_filter`?
>
> Stéfan

From jni.soma at gmail.com  Sun Aug  6 08:37:43 2017
From: jni.soma at gmail.com (Juan Nunez-Iglesias)
Date: Sun, 6 Aug 2017 14:37:43 +0200
Subject: [Numpy-discussion] ENH: Proposal to add np.neighborwise in PR#9514

It's nice that this is pure Python / NumPy vectorized, whereas
generic_filter requires some compilation to get good performance. (Tim,
although your implementation is nice and readable, it would have been
very slow for any significant volumes.)

However, my feeling is that this function is too specialized for a
foundational package like NumPy. As Sebastian Berg pointed out on one of
the PRs, it can cause confusion when there are many ways of achieving the
same outcome. IMHO, the One Way to do this kind of operation is using
generic_filter together with LowLevelCallable. My two blog posts on the
topic:

https://ilovesymposia.com/2017/03/12/scipys-new-lowlevelcallable-is-a-game-changer/
https://ilovesymposia.com/2017/03/15/prettier-lowlevelcallables-with-numba-jit-and-decorators/

This has the advantage that it's even more general. (In fact, it avoids
the repeated-applications-vs-diagonal-application argument altogether:
these are simply two different kernels.)

Perhaps ndimage lacks discoverability to other fields? But I think that
can be better solved with documentation, rather than duplicating
functionality and cluttering the NumPy API. Sorry!

Juan.

On 6 Aug 2017, 6:02 AM +0200, Tim Cera wrote:
> If you're into reading ancient history, here is the link to the
> discussion where Zachary Pincus makes the same observation.
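For concreteness, a small `generic_filter` example with a plain Python
callable; with a LowLevelCallable kernel (as in the blog posts above) the
same call runs without the Python-level overhead:

    import numpy as np
    from scipy.ndimage import generic_filter

    a = np.arange(25, dtype=float).reshape(5, 5)

    # local range (max - min) over each 3x3 neighborhood;
    # the callable receives the neighborhood as a flat 1-D array
    local_range = generic_filter(a, lambda w: w.max() - w.min(), size=3)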
From nisoli at im.ufrj.br  Mon Aug  7 17:01:33 2017
From: nisoli at im.ufrj.br (Nisoli Isaia)
Date: Mon, 7 Aug 2017 18:01:33 -0300
Subject: [Numpy-discussion] np.array, copy=False and memmap

Dear all,
I have a question about the behaviour of

    y = np.array(x, copy=False, dtype='float32')

when x is a memmap. If we check the memmap attribute of mmap

    print "mmap attribute", y._mmap

numpy tells us that y is not a memmap. But the following code snippet
crashes the python interpreter:

    # opens the memmap
    with open(filename, 'r+b') as f:
        mm = mmap.mmap(f.fileno(), 0)
        x = np.frombuffer(mm, dtype='float32')

    # builds an array from the memmap, with the option copy=False
    y = np.array(x, copy=False, dtype='float32')
    print "before", y

    # closes the file
    mm.close()
    print "after", y

In my code I use memmaps to share read-only objects when doing parallel
processing, and the behaviour of np.array, even if not consistent, is
desirable. I share scipy sparse matrices over many processes, and if
np.array would make a copy when dealing with memmaps, this would force
me to rewrite part of the sparse-matrices code.

Would it be possible, in future releases of numpy, to have np.array
check, when copy is false, whether the input is a memmap, and in that
case return a full memmap object instead of slicing it?

Best wishes
Isaia

P.S. A longer account of the issue may be found on my university blog:
http://www.im.ufrj.br/nisoli/blog/?p=131
From allanhaldane at gmail.com  Thu Aug 10 12:27:57 2017
From: allanhaldane at gmail.com (Allan Haldane)
Date: Thu, 10 Aug 2017 12:27:57 -0400
Subject: [Numpy-discussion] np.array, copy=False and memmap

On 08/07/2017 05:01 PM, Nisoli Isaia wrote:
> Would it be possible, in future releases of numpy, to have np.array
> check, when copy is false, whether the input is a memmap, and in that
> case return a full memmap object instead of slicing it?

This does appear to be a bug in numpy or mmap.

Probably the solution isn't to make mmaps a special case; rather, we
should fix a bug somewhere in the use of the PEP 3118 interface.

I've opened an issue on github for your issue:
https://github.com/numpy/numpy/issues/9537

It seems to me that the "correct" behavior may be for it to be
impossible to close the memmap while pointers to it exist; this is the
behavior for `memoryview`s of mmaps. That is, your line `mm.close()`
should raise an error: `BufferError: cannot close exported pointers
exist`.

From allanhaldane at gmail.com  Thu Aug 10 13:00:30 2017
From: allanhaldane at gmail.com (Allan Haldane)
Date: Thu, 10 Aug 2017 13:00:30 -0400
Subject: [Numpy-discussion] Changing MaskedArray.squeeze() to never return masked

On 07/18/2017 09:52 AM, Benjamin Root wrote:
> This sort of change seems very similar to the np.diag() change a few
> years ago. Are there lessons we could learn from then that we could
> apply here?
>
> Why would the returned view not be a masked array?
>
> Ben Root

I am in favor of the proposed change below.

I'd like to merge it, but before that I want to make sure I understand
your comment.

Are you referring to the proposed change to make diag return a view
instead of a copy? Note that this has not actually happened yet:
https://github.com/numpy/numpy/issues/7661

Also, I think this case is different because it does not change core
numpy; rather, this is to make the MaskedArray module act more
consistently with core numpy. Because of that, I think it is much less
problematic than the diag changes.

Cheers,
Allan

> On Tue, Jul 18, 2017 at 9:37 AM, Eric Wieser wrote:
>> When using ndarray.squeeze, a view is returned, which means you can
>> do the following (somewhat contrived) operation:
>>
>>     >>> def fill_contrived(a):
>>     ...     a.squeeze()[...] = 2
>>     ...     return a
>>     >>> fill_contrived(np.array([1]))
>>     array(2)
>>
>> However, when tried with a masked array, this can fail, breaking
>> Liskov substitution:
>>
>>     >>> fill_contrived(np.ma.array([1], mask=[True]))
>>     MaskError: Cannot alter the masked element.
>>
>> This fails because squeeze breaks the contract of returning a view,
>> instead deciding sometimes to return masked.
>> There is a patch that fixes this in gh-9432 - however, by necessity,
>> it breaks any existing code that uses `m_arr.squeeze() is
>> np.ma.masked`.
>>
>> Is this too breaking a change?
>>
>> Eric

From ben.v.root at gmail.com  Thu Aug 10 13:21:17 2017
From: ben.v.root at gmail.com (Benjamin Root)
Date: Thu, 10 Aug 2017 13:21:17 -0400
Subject: [Numpy-discussion] Changing MaskedArray.squeeze() to never return masked

Yes, that is the change I am thinking of. And yes, it hasn't happened
yet. But it has been set to warn for a few years now, and there was a
lot of controversy over it when it was first proposed. That said, I do
think the way it was handled made sense, and it is a good model to
follow for these types of changes.

For all intents and purposes, MaskedArray is "core numpy" for many
users. Yes, it has its quirks, but it has been very stable for many
years, and users have gotten used to the quirks. While I am all for
taking steps to eliminate as many quirks as possible, we need to be
mindful of such potentially disruptive changes and give users enough of
a heads-up about them.

Ben Root

On Thu, Aug 10, 2017 at 1:00 PM, Allan Haldane wrote:
> I am in favor of the proposed change below.
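As a concrete check, this is the pre-patch behavior the quoted code relies
on, per the discussion above (gh-9432 changes it):

    import numpy as np

    m_arr = np.ma.array([1], mask=[True])

    # pre-patch: squeeze() can return the np.ma.masked singleton
    # rather than a view of the array
    print(m_arr.squeeze() is np.ma.masked)   # True before the patch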
From sebastian at sipsolutions.net  Thu Aug 10 14:24:05 2017
From: sebastian at sipsolutions.net (Sebastian Berg)
Date: Thu, 10 Aug 2017 20:24:05 +0200
Subject: [Numpy-discussion] np.array, copy=False and memmap

On Thu, 2017-08-10 at 12:27 -0400, Allan Haldane wrote:
> This does appear to be a bug in numpy or mmap.

Frankly, on first sight, I do not think it is a bug in either of them.
Numpy uses a view (memmap really is just a name for a memory-map-backed
numpy array). The numpy array will hold a reference to the memory map
object in its `.base` attribute (or the base of the base, etc.).

If you close a mmap object and then keep using it, you can get segfaults
of course; I am not sure what you can do about it. Maybe python can try
to warn you when you exit the context/close a file pointer, but I
suppose: Python does memory management for you and makes doing IO
management easy, but you need to manage the IO correctly. That this
segfaults, rather than just raising an error, may be annoying, but seems
the nature of things on first sight.

- Sebastian
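Sebastian's point about `.base`, spelled out: the array keeps the buffer
object alive through a reference, but nothing stops the mmap from being
closed by hand (the filename is illustrative, and 'test' is assumed to
hold float32 data):

    import mmap
    import numpy as np

    with open('test', 'r+b') as f:
        mm = mmap.mmap(f.fileno(), 0)

    x = np.frombuffer(mm, dtype='float32')
    print(x.base)     # the mmap object, kept alive by this reference

    mm.close()        # nothing prevents this...
    # print(x[0])     # ...and touching x afterwards can now segfault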
From allanhaldane at gmail.com  Thu Aug 10 15:56:41 2017
From: allanhaldane at gmail.com (Allan Haldane)
Date: Thu, 10 Aug 2017 15:56:41 -0400
Subject: [Numpy-discussion] np.array, copy=False and memmap

On 08/10/2017 02:24 PM, Sebastian Berg wrote:
> If you close a mmap object and then keep using it, you can get
> segfaults of course; I am not sure what you can do about it.

I admit I have not had time to investigate it thoroughly, but it
appears to me that the intended design of mmap was to make it
impossible to close a mmap if there were still pointers to it. Consider
the following behavior (python3):

    >>> import mmap
    >>> with open('test', 'r+b') as f:
    ...     mm = mmap.mmap(f.fileno(), 0)
    >>> mv = memoryview(mm)
    >>> mm.close()
    BufferError: cannot close exported pointers exist

If memoryview behaves this way, why doesn't/can't ndarray? (Both use
the PEP 3118 interface, as far as I understand.)

You can see in the mmap code that it tries to carefully keep track of
any exported buffers, but numpy manages to bypass this:
https://github.com/python/cpython/blob/b879fe82e7e5c3f7673c9a7fa4aad42bd05445d8/Modules/mmapmodule.c#L727

Allan
From allanhaldane at gmail.com  Thu Aug 10 16:06:38 2017
From: allanhaldane at gmail.com (Allan Haldane)
Date: Thu, 10 Aug 2017 16:06:38 -0400
Subject: [Numpy-discussion] np.array, copy=False and memmap

On 08/07/2017 05:01 PM, Nisoli Isaia wrote:
> In my code I use memmaps to share read-only objects when doing
> parallel processing, and the behaviour of np.array, even if not
> consistent, is desirable.

I just read your blog post, as well. To confirm your question there:
yes, if you slice or "view" a numpy array which points to memmapped
data, then the slice or view will also point to memmapped data and will
not make a copy. This way you avoid using up a lot of memory.

It is also important to realize that `np.memmap` is merely a subclass
of `np.ndarray` which just provides a few extra helper methods that
ndarrays don't have, but is otherwise identical. The most important
difference is that `np.memmap` has a `flush` method. (It also has a
`_mmap` private attribute.) But otherwise, both ndarrays and memmaps
have an internal data pointer pointing to the underlying data, and
slices or views of ndarrays (or memmaps) will point to the same memory
(no copies).

In your code, when you do `y = np.array(x, copy=False)` where x is a
np.memmap object, y will point to the same memory locations as x.
However, y will not be a memmap object, because of how you constructed
it, and so will not have the `flush` method, which can be important if
you are writing to y and expect it to be written to disk. If you are
only reading from y, though, this shouldn't matter.

Also, note that an np.memmap object is different from an mmap.mmap
object: the former uses the latter internally.

Allan
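A short sketch of that distinction, assuming a pre-existing 400-byte file
named 'test' (the filename and shape are illustrative):

    import numpy as np

    mm = np.memmap('test', dtype='float32', mode='r+', shape=(100,))
    view = mm[10:20]                  # still file-backed, no copy made
    arr = np.array(mm, copy=False)    # plain ndarray sharing the same memory

    arr[0] = 2.0                      # writes through to the mapped memory...
    mm.flush()                        # ...but only the np.memmap has flush()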
From wieser.eric+numpy at gmail.com  Thu Aug 10 17:08:31 2017
From: wieser.eric+numpy at gmail.com (Eric Wieser)
Date: Thu, 10 Aug 2017 21:08:31 +0000
Subject: [Numpy-discussion] quantile() or percentile()

Let's try and keep this on topic - most replies to this message have
been about #9211, which is an orthogonal issue.

There are two main questions here:

1. Would the community prefer to use np.quantile(x, 0.25) instead of
   np.percentile(x, 25), if they had the choice?
2. Is this desirable enough to justify increasing the API surface?

The general consensus on the github issue answers yes to 1, but is
neutral on 2. It would be good to get more opinions.

Eric

On Fri, 21 Jul 2017 at 16:12 Chun-Wei Yuan wrote:
> There's an ongoing effort to introduce quantile() into numpy. You'd
> use it just like percentile(), but would input your q value in
> probability space (0.5 for 50%):
>
> https://github.com/numpy/numpy/pull/9213

From jni.soma at gmail.com  Thu Aug 10 18:08:09 2017
From: jni.soma at gmail.com (Juan Nunez-Iglesias)
Date: Fri, 11 Aug 2017 00:08:09 +0200
Subject: [Numpy-discussion] quantile() or percentile()

I concur with the consensus.

On 10 Aug 2017, 11:10 PM +0200, Eric Wieser wrote:
> There are two main questions here:
>
> 1. Would the community prefer to use np.quantile(x, 0.25) instead of
>    np.percentile(x, 25), if they had the choice?
> 2. Is this desirable enough to justify increasing the API surface?
I'm partial to having quantile because 1.) I prefer probability space, and 2.) I have a PR waiting on quantile(). > > > > Best, > > > > C > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sat Aug 12 00:34:48 2017 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sat, 12 Aug 2017 16:34:48 +1200 Subject: [Numpy-discussion] F2PY problems with PGI compilers In-Reply-To: <0017358f-06d0-b702-2c89-3b5f3c13c94e@att.net> References: <0017358f-06d0-b702-2c89-3b5f3c13c94e@att.net> Message-ID: On Sat, Aug 5, 2017 at 7:24 AM, Jeff Layton wrote: > Good afternoon! > > I'm trying to build a Python module using F2PY on a simple Fortran code > using the PGI 17.4 community compilers. > > I'm using Conda 4.3.21 with Python 2.7.13 and F2PY 2. The command line I'm > using is, > > > f2py --compiler=pg --fcompiler=pg -c -m mdevice mdevice.f90 > > > The output from f2py is at the end of the email. Any suggestions are > greatly appreciated. > --compiler=pg seems wrong, that specifies the C/C++ compiler to use not the Fortran compiler. Hence you get the error "don't know how to compile C/C++ code on platform 'posix' with 'pg' compiler". Try just leaving that off (thereby using the default C compiler you have installed, probably gcc). Ralf > Thanks! > > Jeff > > > Output from f2py: > > > > running build > running config_cc > unifing config_cc, config, build_clib, build_ext, build commands > --compiler options > running config_fc > unifing config_fc, config, build_clib, build_ext, build commands > --fcompiler options > running build_src > build_src > building extension "mdevice" sources > f2py options: [] > f2py:> /tmp/tmptN1fdp/src.linux-x86_64-2.7/mdevicemodule.c > creating /tmp/tmptN1fdp/src.linux-x86_64-2.7 > Reading fortran codes... > Reading file 'mdevice.f90' (format:free) > Post-processing... > Block: mdevice > Block: devicequery > In: :mdevice:mdevice.f90:devicequery > get_useparameters: no module cudafor info used by devicequery > Post-processing (stage 2)... > Building modules... > Building module "mdevice"... > Constructing wrapper function "devicequery"... > devicequery() > Wrote C/API module "mdevice" to file "/tmp/tmptN1fdp/src.linux-x86_ > 64-2.7/mdevicemodule.c" > adding '/tmp/tmptN1fdp/src.linux-x86_64-2.7/fortranobject.c' to sources. > adding '/tmp/tmptN1fdp/src.linux-x86_64-2.7' to include_dirs. > copying /home/laytonjb/anaconda2/lib/python2.7/site-packages/numpy/f2py/src/fortranobject.c > -> /tmp/tmptN1fdp/src.linux-x86_64-2.7 > copying /home/laytonjb/anaconda2/lib/python2.7/site-packages/numpy/f2py/src/fortranobject.h > -> /tmp/tmptN1fdp/src.linux-x86_64-2.7 > build_src: building npy-pkg config files > running build_ext > error: don't know how to compile C/C++ code on platform 'posix' with 'pg' > compiler > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From charlesr.harris at gmail.com Sun Aug 13 09:28:52 2017 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 13 Aug 2017 07:28:52 -0600 Subject: [Numpy-discussion] quantile() or percentile() In-Reply-To: References: Message-ID: On Thu, Aug 10, 2017 at 3:08 PM, Eric Wieser wrote: > Let?s try and keep this on topic - most replies to this message has been > about #9211, which is an orthogonal issue. > > There are two main questions here: > > 1. Would the community prefer to use np.quantile(x, 0.25) instead of np.percentile(x, > 25), if they had the choice > 2. Is this desirable enough to justify increasing the API surface? > > The general consensus on the github issue answers yes to 1, but is neutral > on 2. It would be good to get more opinions. > I think a quantile function would be natural and desirable. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Sun Aug 13 12:50:10 2017 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sun, 13 Aug 2017 12:50:10 -0400 Subject: [Numpy-discussion] quantile() or percentile() In-Reply-To: References: Message-ID: On Sun, Aug 13, 2017 at 9:28 AM, Charles R Harris wrote: > > > On Thu, Aug 10, 2017 at 3:08 PM, Eric Wieser > wrote: > >> Let?s try and keep this on topic - most replies to this message has been >> about #9211, which is an orthogonal issue. >> >> There are two main questions here: >> >> 1. Would the community prefer to use np.quantile(x, 0.25) instead of np.percentile(x, >> 25), if they had the choice >> 2. Is this desirable enough to justify increasing the API surface? >> >> The general consensus on the github issue answers yes to 1, but is >> neutral on 2. It would be good to get more opinions. >> > > I think a quantile function would be natural and desirable. > I'm in favor of adding it. (moving away from +0) It should be an obvious code completion choice, np.q? Josef > > > > Chuck > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From laytonjb at att.net Sun Aug 13 15:24:12 2017 From: laytonjb at att.net (Jeff Layton) Date: Sun, 13 Aug 2017 15:24:12 -0400 Subject: [Numpy-discussion] F2PY problems with PGI compilers In-Reply-To: References: <0017358f-06d0-b702-2c89-3b5f3c13c94e@att.net> Message-ID: <99bcdeac-4872-c287-62e7-884c490c22bf@att.net> +SciPy list > > > On Sat, Aug 5, 2017 at 7:24 AM, Jeff Layton > wrote: > > Good afternoon! > > I'm trying to build a Python module using F2PY on a simple Fortran > code using the PGI 17.4 community compilers. > > I'm using Conda 4.3.21 with Python 2.7.13 and F2PY 2. The command > line I'm using is, > > > f2py --compiler=pg --fcompiler=pg -c -m mdevice mdevice.f90 > > > The output from f2py is at the end of the email. Any suggestions > are greatly appreciated. > > > --compiler=pg seems wrong, that specifies the C/C++ compiler to use > not the Fortran compiler. Hence you get the error "don't know how to > compile C/C++ code on platform 'posix' with 'pg' compiler". Try just > leaving that off (thereby using the default C compiler you have > installed, probably gcc). Ralf - thanks for the response! 
I had tried that before and F2PY still thinks it's using the PGI C compiler: running build running config_cc unifing config_cc, config, build_clib, build_ext, build commands --compiler options running config_fc unifing config_fc, config, build_clib, build_ext, build commands --fcompiler options running build_src build_src building extension "mdevice" sources f2py options: [] f2py:> /tmp/tmpkxCUbk/src.linux-x86_64-2.7/mdevicemodule.c creating /tmp/tmpkxCUbk/src.linux-x86_64-2.7 Reading fortran codes... Reading file 'mdevice.f90' (format:free) Post-processing... Block: mdevice Block: devicequery In: :mdevice:mdevice.f90:devicequery get_useparameters: no module cudafor info used by devicequery Post-processing (stage 2)... Building modules... Building module "mdevice"... Constructing wrapper function "devicequery"... devicequery() Wrote C/API module "mdevice" to file "/tmp/tmpkxCUbk/src.linux-x86_64-2.7/mdevicemodule.c" adding '/tmp/tmpkxCUbk/src.linux-x86_64-2.7/fortranobject.c' to sources. adding '/tmp/tmpkxCUbk/src.linux-x86_64-2.7' to include_dirs. copying /home/laytonjb/anaconda2/lib/python2.7/site-packages/numpy/f2py/src/fortranobject.c -> /tmp/tmpkxCUbk/src.linux-x86_64-2.7 copying /home/laytonjb/anaconda2/lib/python2.7/site-packages/numpy/f2py/src/fortranobject.h -> /tmp/tmpkxCUbk/src.linux-x86_64-2.7 build_src: building npy-pkg config files running build_ext customize UnixCCompiler customize UnixCCompiler using build_ext customize PGroupFCompiler Found executable /opt/pgi/linux86-64/pgidir/pgf90 Found executable /opt/pgi/linux86-64/pgidir/pgf77 Found executable /opt/pgi/linux86-64/17.4/bin/pgfortran customize PGroupFCompiler using build_ext building 'mdevice' extension compiling C sources C compiler: /opt/pgi/linux86-64/pgidir/pgcc -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC creating /tmp/tmpkxCUbk/tmp creating /tmp/tmpkxCUbk/tmp/tmpkxCUbk creating /tmp/tmpkxCUbk/tmp/tmpkxCUbk/src.linux-x86_64-2.7 compile options: '-I/tmp/tmpkxCUbk/src.linux-x86_64-2.7 -I/home/laytonjb/anaconda2/lib/python2.7/site-packages/numpy/core/include -I/home/laytonjb/anaconda2/include/python2.7 -c' pgcc: /tmp/tmpkxCUbk/src.linux-x86_64-2.7/mdevicemodule.c pgcc-Error-Unknown switch: -fno-strict-aliasing pgcc-Error-Unknown switch: -fwrapv pgcc-Error-Unknown switch: -Wall pgcc-Error-Unknown switch: -Wstrict-prototypes pgcc-Error-Unknown switch: -fno-strict-aliasing pgcc-Error-Unknown switch: -fwrapv pgcc-Error-Unknown switch: -Wall pgcc-Error-Unknown switch: -Wstrict-prototypes error: Command "/opt/pgi/linux86-64/pgidir/pgcc -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/tmp/tmpkxCUbk/src.linux-x86_64-2.7 -I/home/laytonjb/anaconda2/lib/python2.7/site-packages/numpy/core/include -I/home/laytonjb/anaconda2/include/python2.7 -c /tmp/tmpkxCUbk/src.linux-x86_64-2.7/mdevicemodule.c -o /tmp/tmpkxCUbk/tmp/tmpkxCUbk/src.linux-x86_64-2.7/mdevicemodule.o" failed with exit status 1 I'm definitely at a lose here. I have no idea how to make F2PY work with the PGI compilers. I'm beginning to think F2PY is completely borked unless you use the defaults (gcc). Thanks! Jeff > > > > Thanks! 
> > Jeff > > > Output from f2py: > > > > running build > running config_cc > unifing config_cc, config, build_clib, build_ext, build commands > --compiler options > running config_fc > unifing config_fc, config, build_clib, build_ext, build commands > --fcompiler options > running build_src > build_src > building extension "mdevice" sources > f2py options: [] > f2py:> /tmp/tmptN1fdp/src.linux-x86_64-2.7/mdevicemodule.c > creating /tmp/tmptN1fdp/src.linux-x86_64-2.7 > Reading fortran codes... > Reading file 'mdevice.f90' (format:free) > Post-processing... > Block: mdevice > Block: devicequery > In: :mdevice:mdevice.f90:devicequery > get_useparameters: no module cudafor info used by devicequery > Post-processing (stage 2)... > Building modules... > Building module "mdevice"... > Constructing wrapper function "devicequery"... > devicequery() > Wrote C/API module "mdevice" to file > "/tmp/tmptN1fdp/src.linux-x86_64-2.7/mdevicemodule.c" > adding '/tmp/tmptN1fdp/src.linux-x86_64-2.7/fortranobject.c' to > sources. > adding '/tmp/tmptN1fdp/src.linux-x86_64-2.7' to include_dirs. > copying > /home/laytonjb/anaconda2/lib/python2.7/site-packages/numpy/f2py/src/fortranobject.c > -> /tmp/tmptN1fdp/src.linux-x86_64-2.7 > copying > /home/laytonjb/anaconda2/lib/python2.7/site-packages/numpy/f2py/src/fortranobject.h > -> /tmp/tmptN1fdp/src.linux-x86_64-2.7 > build_src: building npy-pkg config files > running build_ext > error: don't know how to compile C/C++ code on platform 'posix' > with 'pg' compiler > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Mon Aug 14 03:51:03 2017 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Mon, 14 Aug 2017 19:51:03 +1200 Subject: [Numpy-discussion] F2PY problems with PGI compilers In-Reply-To: <99bcdeac-4872-c287-62e7-884c490c22bf@att.net> References: <0017358f-06d0-b702-2c89-3b5f3c13c94e@att.net> <99bcdeac-4872-c287-62e7-884c490c22bf@att.net> Message-ID: On Mon, Aug 14, 2017 at 7:24 AM, Jeff Layton wrote: > +SciPy list > > > > On Sat, Aug 5, 2017 at 7:24 AM, Jeff Layton wrote: > >> Good afternoon! >> >> I'm trying to build a Python module using F2PY on a simple Fortran code >> using the PGI 17.4 community compilers. >> >> I'm using Conda 4.3.21 with Python 2.7.13 and F2PY 2. The command line >> I'm using is, >> >> >> f2py --compiler=pg --fcompiler=pg -c -m mdevice mdevice.f90 >> >> >> The output from f2py is at the end of the email. Any suggestions are >> greatly appreciated. >> > > --compiler=pg seems wrong, that specifies the C/C++ compiler to use not > the Fortran compiler. Hence you get the error "don't know how to compile > C/C++ code on platform 'posix' with 'pg' compiler". Try just leaving that > off (thereby using the default C compiler you have installed, probably gcc). > > > Ralf - thanks for the response! 
I had tried that before and F2PY still > thinks it's using the PGI C compiler: > > > running build > running config_cc > unifing config_cc, config, build_clib, build_ext, build commands > --compiler options > running config_fc > unifing config_fc, config, build_clib, build_ext, build commands > --fcompiler options > running build_src > build_src > building extension "mdevice" sources > f2py options: [] > f2py:> /tmp/tmpkxCUbk/src.linux-x86_64-2.7/mdevicemodule.c > creating /tmp/tmpkxCUbk/src.linux-x86_64-2.7 > Reading fortran codes... > Reading file 'mdevice.f90' (format:free) > Post-processing... > Block: mdevice > Block: devicequery > In: :mdevice:mdevice.f90:devicequery > get_useparameters: no module cudafor info used by devicequery > Post-processing (stage 2)... > Building modules... > Building module "mdevice"... > Constructing wrapper function "devicequery"... > devicequery() > Wrote C/API module "mdevice" to file "/tmp/tmpkxCUbk/src.linux-x86_ > 64-2.7/mdevicemodule.c" > adding '/tmp/tmpkxCUbk/src.linux-x86_64-2.7/fortranobject.c' to sources. > adding '/tmp/tmpkxCUbk/src.linux-x86_64-2.7' to include_dirs. > copying /home/laytonjb/anaconda2/lib/python2.7/site-packages/numpy/f2py/src/fortranobject.c > -> /tmp/tmpkxCUbk/src.linux-x86_64-2.7 > copying /home/laytonjb/anaconda2/lib/python2.7/site-packages/numpy/f2py/src/fortranobject.h > -> /tmp/tmpkxCUbk/src.linux-x86_64-2.7 > build_src: building npy-pkg config files > running build_ext > customize UnixCCompiler > customize UnixCCompiler using build_ext > customize PGroupFCompiler > Found executable /opt/pgi/linux86-64/pgidir/pgf90 > Found executable /opt/pgi/linux86-64/pgidir/pgf77 > Found executable /opt/pgi/linux86-64/17.4/bin/pgfortran > customize PGroupFCompiler using build_ext > building 'mdevice' extension > compiling C sources > C compiler: /opt/pgi/linux86-64/pgidir/pgcc -fno-strict-aliasing -g -O2 > -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC > > creating /tmp/tmpkxCUbk/tmp > creating /tmp/tmpkxCUbk/tmp/tmpkxCUbk > creating /tmp/tmpkxCUbk/tmp/tmpkxCUbk/src.linux-x86_64-2.7 > compile options: '-I/tmp/tmpkxCUbk/src.linux-x86_64-2.7 > -I/home/laytonjb/anaconda2/lib/python2.7/site-packages/numpy/core/include > -I/home/laytonjb/anaconda2/include/python2.7 -c' > pgcc: /tmp/tmpkxCUbk/src.linux-x86_64-2.7/mdevicemodule.c > pgcc-Error-Unknown switch: -fno-strict-aliasing > pgcc-Error-Unknown switch: -fwrapv > pgcc-Error-Unknown switch: -Wall > pgcc-Error-Unknown switch: -Wstrict-prototypes > pgcc-Error-Unknown switch: -fno-strict-aliasing > pgcc-Error-Unknown switch: -fwrapv > pgcc-Error-Unknown switch: -Wall > pgcc-Error-Unknown switch: -Wstrict-prototypes > error: Command "/opt/pgi/linux86-64/pgidir/pgcc -fno-strict-aliasing -g > -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC > -I/tmp/tmpkxCUbk/src.linux-x86_64-2.7 -I/home/laytonjb/anaconda2/ > lib/python2.7/site-packages/numpy/core/include -I/home/laytonjb/anaconda2/include/python2.7 > -c /tmp/tmpkxCUbk/src.linux-x86_64-2.7/mdevicemodule.c -o > /tmp/tmpkxCUbk/tmp/tmpkxCUbk/src.linux-x86_64-2.7/mdevicemodule.o" failed > with exit status 1 > > > > I'm definitely at a lose here. I have no idea how to make F2PY work with > the PGI compilers. I'm beginning to think F2PY is completely borked unless > you use the defaults (gcc). > That's not the case. 
Here is an example when using the Intel Fortran compiler together with either MSVC or Intel C compilers: https://software.intel.com/en-us/articles/building-numpyscipy-with-intel-mkl-and-intel-fortran-on-windows I notice there that in all cases the C compiler is explicitly specified. Did you also try ``--compiler=gcc --fcompiler=pg``? Also, I'm not sure how often this is done with f2py directly; I've only ever used the --fcompiler flag via ``python setup.py config --fcompiler=..``, invoking f2py under the hood. It could be that doing this directly is indeed broken (or was never supported in the first place). Ralf > > Thanks! > > Jeff > > > > > > >> Thanks! >> >> Jeff >> >> >> Output from f2py: >> >> >> >> running build >> running config_cc >> unifing config_cc, config, build_clib, build_ext, build commands >> --compiler options >> running config_fc >> unifing config_fc, config, build_clib, build_ext, build commands >> --fcompiler options >> running build_src >> build_src >> building extension "mdevice" sources >> f2py options: [] >> f2py:> /tmp/tmptN1fdp/src.linux-x86_64-2.7/mdevicemodule.c >> creating /tmp/tmptN1fdp/src.linux-x86_64-2.7 >> Reading fortran codes... >> Reading file 'mdevice.f90' (format:free) >> Post-processing... >> Block: mdevice >> Block: devicequery >> In: :mdevice:mdevice.f90:devicequery >> get_useparameters: no module cudafor info used by devicequery >> Post-processing (stage 2)... >> Building modules... >> Building module "mdevice"... >> Constructing wrapper function "devicequery"... >> devicequery() >> Wrote C/API module "mdevice" to file >> "/tmp/tmptN1fdp/src.linux-x86_64-2.7/mdevicemodule.c" >> adding '/tmp/tmptN1fdp/src.linux-x86_64-2.7/fortranobject.c' to >> sources. >> adding '/tmp/tmptN1fdp/src.linux-x86_64-2.7' to include_dirs. >> copying /home/laytonjb/anaconda2/lib/python2.7/site-packages/numpy/f2py/src/fortranobject.c >> -> /tmp/tmptN1fdp/src.linux-x86_64-2.7 >> copying /home/laytonjb/anaconda2/lib/python2.7/site-packages/numpy/f2py/src/fortranobject.h >> -> /tmp/tmptN1fdp/src.linux-x86_64-2.7 >> build_src: building npy-pkg config files >> running build_ext >> error: don't know how to compile C/C++ code on platform 'posix' with 'pg' >> compiler >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion >> > > > > _______________________________________________ > NumPy-Discussion mailing listNumPy-Discussion at python.orghttps://mail.python.org/mailman/listinfo/numpy-discussion > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Mon Aug 14 04:05:52 2017 From: pav at iki.fi (Pauli Virtanen) Date: Mon, 14 Aug 2017 10:05:52 +0200 Subject: [Numpy-discussion] F2PY problems with PGI compilers In-Reply-To: References: <0017358f-06d0-b702-2c89-3b5f3c13c94e@att.net> <99bcdeac-4872-c287-62e7-884c490c22bf@att.net> Message-ID: Ralf Gommers kirjoitti 14.08.2017 klo 09:51: > I'm definitely at a lose here. I have no idea how to make F2PY work > with the PGI compilers. I'm beginning to think F2PY is completely > borked unless you use the defaults (gcc). > > > That's not the case. Here is an example when using the Intel Fortran > compiler together with either MSVC or Intel C compilers: > https://software.intel.com/en-us/articles/building-numpyscipy-with-intel-mkl-and-intel-fortran-on-windows Note that it is not necessary to use f2py for compiling. 
It can also just generate the C and fortran source files necessary --- although you need to also compile and link in fortranobject.[ch] which are found inside the numpy folders, and supply the correct Python and numpy include paths see numpy.get_includes(). From laytonjb at att.net Mon Aug 14 10:19:46 2017 From: laytonjb at att.net (Jeff Layton) Date: Mon, 14 Aug 2017 10:19:46 -0400 Subject: [Numpy-discussion] F2PY problems with PGI compilers In-Reply-To: References: <0017358f-06d0-b702-2c89-3b5f3c13c94e@att.net> <99bcdeac-4872-c287-62e7-884c490c22bf@att.net> Message-ID: <158a0a4f-450a-be1e-5512-e96974075f5d@att.net> On 08/14/2017 03:51 AM, Ralf Gommers wrote: > > > > > > > I'm definitely at a lose here. I have no idea how to make F2PY > work with the PGI compilers. I'm beginning to think F2PY is > completely borked unless you use the defaults (gcc). > > > That's not the case. Here is an example when using the Intel Fortran > compiler together with either MSVC or Intel C compilers: > https://software.intel.com/en-us/articles/building-numpyscipy-with-intel-mkl-and-intel-fortran-on-windows > > I notice there that in all cases the C compiler is explicitly > specified. Did you also try ``--compiler=gcc --fcompiler=pg``? > > Also, I'm not sure how often this is done with f2py directly; I've > only ever used the --fcompiler flag via ``python setup.py config > --fcompiler=..``, invoking f2py under the hood. It could be that > doing this directly is indeed broken (or was never supported in the > first place). > > Ralf > > Point taken. I don't use Windows too much and I don't use the Intel compiler any more (it's not free for non-commercial use :) ). I tried using "--compiler=gcc --fcompiler=pg" and I get the same answer at the very end. running build_ext error: don't know how to compile C/C++ code on platform 'posix' with 'gcc' compiler Good point about f2py. I'm using the Anaconda distribution of f2py and that may have limitations with respect to the PGI compiler. I may download the f2py source and build it to include PGI support. Maybe that will fix the problem. Thanks! Jeff -------------- next part -------------- An HTML attachment was scrubbed... URL: From laytonjb at att.net Mon Aug 14 10:32:57 2017 From: laytonjb at att.net (Jeff Layton) Date: Mon, 14 Aug 2017 10:32:57 -0400 Subject: [Numpy-discussion] F2PY problems with PGI compilers In-Reply-To: <2b3fbab5-0aa2-adab-3177-f956bfad3403@att.net> References: <0017358f-06d0-b702-2c89-3b5f3c13c94e@att.net> <99bcdeac-4872-c287-62e7-884c490c22bf@att.net> <2b3fbab5-0aa2-adab-3177-f956bfad3403@att.net> Message-ID: <36e5cffb-0d50-9f8b-cdfc-ad38eee00ebe@att.net> On 08/14/2017 10:27 AM, Jeff Layton wrote: > On 08/14/2017 04:05 AM, Pauli Virtanen wrote: >> Ralf Gommers kirjoitti 14.08.2017 klo 09:51: >>> I'm definitely at a lose here. I have no idea how to make F2PY >>> work >>> with the PGI compilers. I'm beginning to think F2PY is completely >>> borked unless you use the defaults (gcc). >>> >>> >>> That's not the case. Here is an example when using the Intel Fortran >>> compiler together with either MSVC or Intel C compilers: >>> https://software.intel.com/en-us/articles/building-numpyscipy-with-intel-mkl-and-intel-fortran-on-windows >>> >> Note that it is not necessary to use f2py for compiling. 
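(For reference, the setup.py route that Ralf mentions can be sketched as below; the module and file names follow this thread, and note that the include-path helper is spelled numpy.get_include() in current numpy:)

# setup.py -- a minimal sketch of the numpy.distutils route
from numpy.distutils.core import setup, Extension

ext = Extension(name='mdevice', sources=['mdevice.f90'])
setup(name='mdevice', ext_modules=[ext])

# then, as suggested above, something like:
#   python setup.py config --fcompiler=pg build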
It can also >> just generate the C and fortran source files necessary --- although you >> need to also compile and link in fortranobject.[ch] which are found >> inside the numpy folders, and supply the correct Python and numpy >> include paths see numpy.get_includes(). > > I was hoping to avoid this :) I wanted to use f2py as a "module > builder" for some code :) However, it appears I will have to go down > this path to see if I can get further. > > Thanks for the advice! > > Jeff > > > > From laytonjb at att.net Mon Aug 14 11:01:49 2017 From: laytonjb at att.net (Jeff Layton) Date: Mon, 14 Aug 2017 11:01:49 -0400 Subject: [Numpy-discussion] F2PY problems with PGI compilers In-Reply-To: <158a0a4f-450a-be1e-5512-e96974075f5d@att.net> References: <0017358f-06d0-b702-2c89-3b5f3c13c94e@att.net> <99bcdeac-4872-c287-62e7-884c490c22bf@att.net> <158a0a4f-450a-be1e-5512-e96974075f5d@att.net> Message-ID: <22965279-d221-c01f-3e22-689ea1fe3442@att.net> On 08/14/2017 10:19 AM, Jeff Layton wrote: > On 08/14/2017 03:51 AM, Ralf Gommers wrote: >> >> >> >> >> >> >> I'm definitely at a lose here. I have no idea how to make F2PY >> work with the PGI compilers. I'm beginning to think F2PY is >> completely borked unless you use the defaults (gcc). >> >> >> That's not the case. Here is an example when using the Intel Fortran >> compiler together with either MSVC or Intel C compilers: >> https://software.intel.com/en-us/articles/building-numpyscipy-with-intel-mkl-and-intel-fortran-on-windows >> >> I notice there that in all cases the C compiler is explicitly >> specified. Did you also try ``--compiler=gcc --fcompiler=pg``? >> >> Also, I'm not sure how often this is done with f2py directly; I've >> only ever used the --fcompiler flag via ``python setup.py config >> --fcompiler=..``, invoking f2py under the hood. It could be that >> doing this directly is indeed broken (or was never supported in the >> first place). >> >> Ralf >> >> > > Point taken. I don't use Windows too much and I don't use the Intel > compiler any more (it's not free for non-commercial use :) ). > > I tried using "--compiler=gcc --fcompiler=pg" and I get the same > answer at the very end. > > > running build_ext > error: don't know how to compile C/C++ code on platform 'posix' with > 'gcc' compiler > > > Good point about f2py. I'm using the Anaconda distribution of f2py and > that may have limitations with respect to the PGI compiler. I may > download the f2py source and build it to include PGI support. Maybe > that will fix the problem. > > Thanks! > > Jeff > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Mon Aug 14 13:17:21 2017 From: chris.barker at noaa.gov (Chris Barker) Date: Mon, 14 Aug 2017 10:17:21 -0700 Subject: [Numpy-discussion] quantile() or percentile() In-Reply-To: References: Message-ID: +1 on quantile() -CHB On Sun, Aug 13, 2017 at 6:28 AM, Charles R Harris wrote: > > > On Thu, Aug 10, 2017 at 3:08 PM, Eric Wieser > wrote: > >> Let?s try and keep this on topic - most replies to this message has been >> about #9211, which is an orthogonal issue. >> >> There are two main questions here: >> >> 1. Would the community prefer to use np.quantile(x, 0.25) instead of np.percentile(x, >> 25), if they had the choice >> 2. Is this desirable enough to justify increasing the API surface? >> >> The general consensus on the github issue answers yes to 1, but is >> neutral on 2. It would be good to get more opinions. 
>> > > I think a quantile function would be natural and desirable. > > > > Chuck > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > > -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Mon Aug 14 13:29:49 2017 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Tue, 15 Aug 2017 05:29:49 +1200 Subject: [Numpy-discussion] F2PY problems with PGI compilers In-Reply-To: <158a0a4f-450a-be1e-5512-e96974075f5d@att.net> References: <0017358f-06d0-b702-2c89-3b5f3c13c94e@att.net> <99bcdeac-4872-c287-62e7-884c490c22bf@att.net> <158a0a4f-450a-be1e-5512-e96974075f5d@att.net> Message-ID: On Tue, Aug 15, 2017 at 2:19 AM, Jeff Layton wrote: > On 08/14/2017 03:51 AM, Ralf Gommers wrote: > > > > > >> >> >> I'm definitely at a lose here. I have no idea how to make F2PY work with >> the PGI compilers. I'm beginning to think F2PY is completely borked unless >> you use the defaults (gcc). >> > > That's not the case. Here is an example when using the Intel Fortran > compiler together with either MSVC or Intel C compilers: > https://software.intel.com/en-us/articles/building- > numpyscipy-with-intel-mkl-and-intel-fortran-on-windows > > I notice there that in all cases the C compiler is explicitly specified. > Did you also try ``--compiler=gcc --fcompiler=pg``? > > Also, I'm not sure how often this is done with f2py directly; I've only > ever used the --fcompiler flag via ``python setup.py config > --fcompiler=..``, invoking f2py under the hood. It could be that doing > this directly is indeed broken (or was never supported in the first place). > > Ralf > > > > Point taken. I don't use Windows too much and I don't use the Intel > compiler any more (it's not free for non-commercial use :) ). > > I tried using "--compiler=gcc --fcompiler=pg" and I get the same answer at > the very end. > > > running build_ext > error: don't know how to compile C/C++ code on platform 'posix' with 'gcc' > compiler > > > Good point about f2py. I'm using the Anaconda distribution of f2py and > that may have limitations with respect to the PGI compiler. I may download > the f2py source and build it to include PGI support. Maybe that will fix > the problem. > That won't make a difference, all the build config code is pure Python. Anaconda will give you the same results as building from source. Ralf > Thanks! > > Jeff > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pavdev at gmx.de Tue Aug 15 12:26:10 2017 From: pavdev at gmx.de (Paul) Date: Tue, 15 Aug 2017 18:26:10 +0200 Subject: [Numpy-discussion] Tensor Contraction (HPTT) and Tensor Transposition (TCL) Message-ID: Hi all, I recently spent some time adding python interfaces to my tensor libraries: * Tensor Contraction Library (TCL): https://github.com/springer13/tcl * Tensor Transposition Library (HPTT): https://github.com/springer13/hptt Both libraries tend to give very significant speedups over what is currently offered by NumPY; Speedups typically range from 5x - 20x w.r.t. HPTT and >>20x for TCL (see attached, Host: 2x Intel Haswell-EP E5-2680 v3 (24 threads)). 
Thus, I was curious if some of you would benefit from those speedups and if you want it to be integrated into NumPY. The HPTT and TCL libraries are respectively similar to numpy.transpose() and numpy.einsum(). I welcome you to give the packages a try and see if they can help you to speedup some of your tensor-related operations. Finally: Which steps would be required to integrate those libraries into NumPY? Which problems do you anticipate? Thank you, Paul -------------- next part -------------- A non-text attachment was scrubbed... Name: hptt_vs_numpy.pdf Type: application/pdf Size: 14732 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: tcl_vs_numpy_vs_eigen.pdf Type: application/pdf Size: 116285 bytes Desc: not available URL: From charlesr.harris at gmail.com Tue Aug 15 14:05:34 2017 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 15 Aug 2017 12:05:34 -0600 Subject: [Numpy-discussion] Tensor Contraction (HPTT) and Tensor Transposition (TCL) In-Reply-To: References: Message-ID: On Tue, Aug 15, 2017 at 10:26 AM, Paul wrote: > Hi all, > > I recently spent some time adding python interfaces to my tensor libraries: > * Tensor Contraction Library (TCL): https://github.com/springer13/tcl > * Tensor Transposition Library (HPTT): https://github.com/springer13/ > hptt > > Both libraries tend to give very significant speedups over what is > currently offered by NumPY; Speedups > typically range from 5x - 20x w.r.t. HPTT and >>20x for TCL (see > attached, Host: 2x Intel Haswell-EP E5-2680 v3 (24 threads)). > Thus, I was curious if some of you would benefit from those speedups and > if you want it to > be integrated into NumPY. > > The HPTT and TCL libraries are respectively similar to numpy.transpose() > and numpy.einsum(). > > I welcome you to give the packages a try and see if they can help you to > speedup some of your tensor-related operations. > > Finally: Which steps would be required to integrate those libraries into > NumPY? Which problems do you anticipate? > > What version of Numpy are you comparing to? Note that in 1.13 you can enable some optimization in einsum, and the coming 1.14 makes that the default and uses CBLAS when possible. If you want to get it into Numpy, it would be worth checking if the existing functions can be improved before adding new ones. Note that Numpy transposition method just rearranges the indices, so the advantage of actual transposition is to have better cache performance or allow direct use of CBLAS. I assume TCL uses some tricks to do transposition in a way that is more cache friendly? Might check the license if your work uses code from a publication. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From pavdev at gmx.de Wed Aug 16 05:39:26 2017 From: pavdev at gmx.de (Paul Springer) Date: Wed, 16 Aug 2017 11:39:26 +0200 Subject: [Numpy-discussion] Tensor Contraction (HPTT) and Tensor Transposition (TCL) In-Reply-To: References: Message-ID: <55f86774-5831-e152-b617-c5e1b6364605@gmx.de> > What version of Numpy are you comparing to? Note that in 1.13 you can > enable some optimization in einsum, and the coming 1.14 makes that the > default and uses CBLAS when possible. I was using 1.10.4; however, I am currently running the benchmark with 1.13.1 and 'optimize=True'; this, however, seems to yield even worse performance (see attached). 
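For readers who want to reproduce the comparison, the flag in question is used as in this sketch (the optimize keyword is the 1.13 feature discussed above):

import numpy as np

a = np.random.rand(64, 64, 64)
b = np.random.rand(64, 64, 64)

c0 = np.einsum('ijk,jkl->il', a, b)                 # default evaluation
c1 = np.einsum('ijk,jkl->il', a, b, optimize=True)  # optimized contraction order
print(np.allclose(c0, c1))                          # True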
If you are interested, you can check the performance difference yourself via: ./benchmark/python/bechmark.sh > If you want to get it into Numpy, it would be worth checking if the > existing functions can be improved before adding new ones. > > Note that Numpy transposition method just rearranges the indices, so > the advantage of actual transposition is to have better cache > performance or allow direct use of CBLAS. I assume TCL uses some > tricks to do transposition in a way that is more cache friendly? HPTT is a sophisticated library for tensor transpositions, as such it blocks the tensors such that (1) spatial locality can be exploited. Moreover, (2) it uses explicit vectorization to take advantage of the CPU's vector units. TCL uses the Transpose-Transpose-GEMM-Transpose approach where all tensors are flattened into matrices (via HPTT) and then contracted via GEMM; the final result is eventually folded (via HPTT) into the desired output tensor. Would it be possible to expose HPTT and TCL as optional packages within NumPY? This way I don't have to redo the work that I've already put into those libraries. > Might check the license if your work uses code from a publication. As far as licenses are concerned that should not be a problem since I wrote to code myself and it doesn't use code from publications other than mine. Best regards, Paul -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: tcl_vs_numpy_vs_eigen.pdf Type: application/pdf Size: 111882 bytes Desc: not available URL: From shoyer at gmail.com Wed Aug 16 11:38:24 2017 From: shoyer at gmail.com (Stephan Hoyer) Date: Wed, 16 Aug 2017 08:38:24 -0700 Subject: [Numpy-discussion] Tensor Contraction (HPTT) and Tensor Transposition (TCL) In-Reply-To: <55f86774-5831-e152-b617-c5e1b6364605@gmx.de> References: <55f86774-5831-e152-b617-c5e1b6364605@gmx.de> Message-ID: On Wed, Aug 16, 2017 at 2:39 AM, Paul Springer wrote: > > What version of Numpy are you comparing to? Note that in 1.13 you can > enable some optimization in einsum, and the coming 1.14 makes that the > default and uses CBLAS when possible. > > I was using 1.10.4; however, I am currently running the benchmark with > 1.13.1 and 'optimize=True'; this, however, seems to yield even worse > performance (see attached). > If you are interested, you can check the performance difference yourself > via: ./benchmark/python/bechmark.sh > This sounds like you may be using relatively small matrices, where the overhead of calculating the optimal strategy dominates. Can you try with a few bigger test cases? -------------- next part -------------- An HTML attachment was scrubbed... URL: From peridot.faceted at gmail.com Wed Aug 16 12:08:39 2017 From: peridot.faceted at gmail.com (Anne Archibald) Date: Wed, 16 Aug 2017 16:08:39 +0000 Subject: [Numpy-discussion] Tensor Contraction (HPTT) and Tensor Transposition (TCL) In-Reply-To: <55f86774-5831-e152-b617-c5e1b6364605@gmx.de> References: <55f86774-5831-e152-b617-c5e1b6364605@gmx.de> Message-ID: (NB all: the thread title seems to interchange the acronyms for the Thread Contraction Library (TCL) and the High-Perormance Tensor Transpose (HPTT) packages. I'm not fixing it so as not to break threading.) On Wed, Aug 16, 2017 at 11:40 AM Paul Springer wrote: > If you want to get it into Numpy, it would be worth checking if the > existing functions can be improved before adding new ones. 
> > Note that Numpy transposition method just rearranges the indices, so the > advantage of actual transposition is to have better cache performance or > allow direct use of CBLAS. I assume TCL uses some tricks to do > transposition in a way that is more cache friendly? > > HPTT is a sophisticated library for tensor transpositions, as such it > blocks the tensors such that (1) spatial locality can be exploited. > Moreover, (2) it uses explicit vectorization to take advantage of the CPU's > vector units. > I think this library provides functionality that isn't readily accessible from within numpy at the moment. The only functions I know of to rearrange the memory layout of data are things like ascontiguousarray and asfortranarray, as well as assignment (e.g. a[...] = b). The general strategy within numpy is to assume that all functions work equally well on arrays with arbitrary memory layouts, so that users often don't even know the memory layouts of their data. The striding functionality means data usually doesn't actually get transposed until absolutely necessary. Of course, few if any numpy functions work equally well on different memory layouts; unary ufuncs contain code to try to carry out their iteration in the fastest way, but it's not clear how well that works or whether they have the freedom to choose the layouts of their output arrays. If you wanted to integrate HPTT into numpy, I think the best approach might be to wire it into the assignment machinery, so that when users do things like a[::2,:] = b[:,::3].T HPTT springs into action behind the scenes and makes this assignment as efficient as possible (how well does it handle arrays with spaces between elements?). Then ascontiguousarray and asfortranarray and the like could simply use assignment to an appropriately-ordered destination when they actually needed to do anything. TCL uses the Transpose-Transpose-GEMM-Transpose approach where all tensors > are flattened into matrices (via HPTT) and then contracted via GEMM; the > final result is eventually folded (via HPTT) into the desired output tensor. > This is a pretty direct replacement of einsum, but I think einsum may well already do pretty much this, apart from not using HPTT to do the transposes. So the way to get this functionality would be to make the matrix-rearrangement primitives use HPTT, as above. Would it be possible to expose HPTT and TCL as optional packages within > NumPY? This way I don't have to redo the work that I've already put into > those libraries. > I think numpy should be regarded as a basically-complete package for manipulating strided in-memory data, to which we are reluctant to add new user-visible functionality. Tools that can act under the hood to make existing code faster, or to reduce the work users must to to make their code run fast enough, are valuable. > Might check the license if your work uses code from a publication. > > As far as licenses are concerned that should not be a problem since I > wrote to code myself and it doesn't use code from publications other than > mine. > Would some of your techniques help numpy to more rapidly evaluate things like C[...] = A+B, when A B and C are arbitrarily strided and there are no ordering constraints on the result? Or just A+B where numpy is free to choose the memory layout for the result? Anne -------------- next part -------------- An HTML attachment was scrubbed... 
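The TTGT idea under discussion can be sketched in plain numpy (an illustration only, not TCL's actual code; the explicit contiguous copy stands in for HPTT's job):

import numpy as np

# contract C[i,l] = sum_{j,k} A[k,i,j] * B[j,k,l]
A = np.random.rand(5, 3, 4)   # axes (k, i, j)
B = np.random.rand(4, 5, 6)   # axes (j, k, l)

At = np.ascontiguousarray(A.transpose(1, 2, 0))         # transpose A to (i, j, k)
C = np.dot(At.reshape(3, 4 * 5), B.reshape(4 * 5, 6))   # one big GEMM
# here the GEMM output is already laid out as (i, l); in general a final
# transpose folds it back into the desired output tensor

print(np.allclose(C, np.einsum('kij,jkl->il', A, B)))   # True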
URL: From pavdev at gmx.de Wed Aug 16 18:29:53 2017 From: pavdev at gmx.de (Paul Springer) Date: Thu, 17 Aug 2017 00:29:53 +0200 Subject: [Numpy-discussion] Tensor Contraction (HPTT) and Tensor Transposition (TCL) In-Reply-To: References: <55f86774-5831-e152-b617-c5e1b6364605@gmx.de> Message-ID: Am 8/16/17 um 5:38 PM schrieb Stephan Hoyer: > On Wed, Aug 16, 2017 at 2:39 AM, Paul Springer > wrote: > > >> What version of Numpy are you comparing to? Note that in 1.13 you >> can enable some optimization in einsum, and the coming 1.14 makes >> that the default and uses CBLAS when possible. > I was using 1.10.4; however, I am currently running the benchmark > with 1.13.1 and 'optimize=True'; this, however, seems to yield > even worse performance (see attached). > If you are interested, you can check the performance difference > yourself via: ./benchmark/python/bechmark.sh > > > This sounds like you may be using relatively small matrices, where the > overhead of calculating the optimal strategy dominates. Can you try > with a few bigger test cases? > The sizes of the tensors varies form ~5MB up to ~100MB towards the far right of the plot; this corresponds to matrices of size ~1000^2 to ~5000^2, thus the sizes should be large enough to amortize any overhead associated to calculating the optimal strategy. -------------- next part -------------- An HTML attachment was scrubbed... URL: From pavdev at gmx.de Wed Aug 16 18:33:27 2017 From: pavdev at gmx.de (Paul Springer) Date: Thu, 17 Aug 2017 00:33:27 +0200 Subject: [Numpy-discussion] Tensor Contraction (HPTT) and Tensor Transposition (TCL) In-Reply-To: References: <55f86774-5831-e152-b617-c5e1b6364605@gmx.de> Message-ID: Am 8/16/17 um 6:08 PM schrieb Anne Archibald: > (NB all: the thread title seems to interchange the acronyms for the > Thread Contraction Library (TCL) and the High-Perormance Tensor > Transpose (HPTT) packages. I'm not fixing it so as not to break > threading.) > > On Wed, Aug 16, 2017 at 11:40 AM Paul Springer > wrote: > >> If you want to get it into Numpy, it would be worth checking if >> the existing functions can be improved before adding new ones. >> >> Note that Numpy transposition method just rearranges the indices, >> so the advantage of actual transposition is to have better cache >> performance or allow direct use of CBLAS. I assume TCL uses some >> tricks to do transposition in a way that is more cache friendly? > HPTT is a sophisticated library for tensor transpositions, as such > it blocks the tensors such that (1) spatial locality can be > exploited. Moreover, (2) it uses explicit vectorization to take > advantage of the CPU's vector units. > > > I think this library provides functionality that isn't readily > accessible from within numpy at the moment. The only functions I know > of to rearrange the memory layout of data are things like > ascontiguousarray and asfortranarray, as well as assignment (e.g. > a[...] = b). The general strategy within numpy is to assume that all > functions work equally well on arrays with arbitrary memory layouts, > so that users often don't even know the memory layouts of their data. > The striding functionality means data usually doesn't actually get > transposed until absolutely necessary. > > Of course, few if any numpy functions work equally well on different > memory layouts; unary ufuncs contain code to try to carry out their > iteration in the fastest way, but it's not clear how well that works > or whether they have the freedom to choose the layouts of their output > arrays. 
> If you wanted to integrate HPTT into numpy, I think the best approach
> might be to wire it into the assignment machinery, so that when users
> do things like a[::2,:] = b[:,::3].T HPTT springs into action behind
> the scenes and makes this assignment as efficient as possible (how
> well does it handle arrays with spaces between elements?). Then
> ascontiguousarray and asfortranarray and the like could simply use
> assignment to an appropriately-ordered destination when they actually
> needed to do anything.
HPTT offers support for subtensors (via the outerSize parameter, which
is similar to the leading dimension in BLAS); thus, HPTT can also deal
with arbitrarily strided transpositions. However, a non-unit stride for
the fastest-varying index is devastating for performance, since it
prohibits the use of vectorization and the exploitation of spatial
locality.

What would the integration of HPTT into NumPy look like?
Which steps would need to be taken?
Would HPTT have to be distributed in source code alongside NumPy (at
that point I might have to change the license for HPTT), or would it be
fine to add a git dependency? That way users who build NumPy from source
could fetch HPTT and set a flag during the build process of NumPy,
indicating that HPTT is available.
What would the process look like if NumPy is distributed as a
precompiled binary?

The same questions apply with respect to TCL.
> > TCL uses the Transpose-Transpose-GEMM-Transpose approach where all
> > tensors are flattened into matrices (via HPTT) and then contracted
> > via GEMM; the final result is eventually folded (via HPTT) into
> > the desired output tensor.
>
> This is a pretty direct replacement of einsum, but I think einsum may
> well already do pretty much this, apart from not using HPTT to do the
> transposes. So the way to get this functionality would be to make the
> matrix-rearrangement primitives use HPTT, as above.
That would certainly be one approach; however, TCL also explores several
different strategies/candidates and picks the one that minimizes the
data movements required by the transpositions.
> > Would it be possible to expose HPTT and TCL as optional packages
> > within NumPy? This way I don't have to redo the work that I've
> > already put into those libraries.
>
> I think numpy should be regarded as a basically-complete package for
> manipulating strided in-memory data, to which we are reluctant to add
> new user-visible functionality. Tools that can act under the hood to
> make existing code faster, or to reduce the work users must do to make
> their code run fast enough, are valuable.
It seems to me that TCL is such a candidate, since it can replace a
significant portion of the functionality offered by numpy.einsum(),
yielding significantly higher performance.

I imagine something of the form:

def einsum(...):
    if tclApplicable and tclAvailable:
        tcl.tensorMult(...)

> Would some of your techniques help numpy to more rapidly evaluate
> things like C[...] = A+B, when A, B and C are arbitrarily strided and
> there are no ordering constraints on the result? Or just A+B where
> numpy is free to choose the memory layout for the result?
Actually, HPTT is only concerned with operations of the form
B[perm(i0,i1,...)] = alpha * A[i0,i1,...] + beta * B[perm(i0,i1,...)]
(where alpha and beta are scalars). Summing over multiple transposed
tensors can be quite challenging (https://arxiv.org/abs/1705.06661) and
is not covered by HPTT. Does this answer your question?
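In numpy terms, that primitive reads as follows for a rank-3 tensor and one arbitrary, hard-coded permutation (a sketch, not HPTT's actual API):

import numpy as np

# B[perm(i0,i1,i2)] = alpha * A[i0,i1,i2] + beta * B[perm(i0,i1,i2)],
# with perm = (2, 0, 1)
alpha, beta = 2.0, 0.5
A = np.random.rand(3, 4, 5)
B = np.random.rand(5, 3, 4)   # the shape of A.transpose(2, 0, 1)

B[...] = alpha * A.transpose(2, 0, 1) + beta * B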
-------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Thu Aug 17 03:55:57 2017 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Thu, 17 Aug 2017 09:55:57 +0200 Subject: [Numpy-discussion] Tensor Contraction (HPTT) and Tensor Transposition (TCL) In-Reply-To: References: <55f86774-5831-e152-b617-c5e1b6364605@gmx.de> Message-ID: <1502956557.27862.3.camel@sipsolutions.net> On Thu, 2017-08-17 at 00:33 +0200, Paul Springer wrote: > Am 8/16/17 um 6:08 PM schrieb Anne Archibald: > > > If you wanted to integrate HPTT into numpy, I think the best > > approach might be to wire it into the assignment machinery, so that > > when users do things like a[::2,:] = b[:,::3].T HPTT springs into > > action behind the scenes and makes this assignment as efficient as > > possible (how well does it handle arrays with spaces between > > elements?). Then ascontiguousarray and asfortranarray and the like > > could simply use assignment to an appropriately-ordered destination > > when they actually needed to do anything. > ?HPTT offers support for subtensor (via the outerSize parameter, > which is similar to the leading dimension in BLAS), thus, HPTT can > also deal with arbitrarily strided transpositions. > However, a non-unite stride for the fastest-varying index is > devastating for performance since this prohibits the use of > vectorization and the exploitation of spatial locality. > > How would the integration of HPTT into NumPY look like?? > Which steps would need to be taken? > Would it be required the HPTT be distributed in source code along > side NumPY (at that point I might have to change the license for > HPTT) or would it be fine to add an git dependency? That way users > who build NumPY from source could fetch HPTT and set a flag during > the build process of NumPY, indicating the HPTT is available?? > How would the process look like if NumPY is distributed as a > precompiled binary? > Well, numpy is BSD, and the official binaries will be BSD, someone else could do less free binaries of course. I doubt we can have a hard dependency unless it is part of the numpy source (some trick like this at one point existed for fftw, but....). I doubt including the source itself is going to happen quickly since we would first have to decide to actually use a modern C++ compiler (I have no idea if that is problematic or not). Having a blocked/fancier (I assume) iterator jump in at least for simple operations such as transposed+copy as Anne suggested sounds very cool though. It could be nice for simple ufuncs at least as well. I have no idea how difficult that may be though or how much complexity it would add to maintenance. My guess is it might require quite a lot of work to integrate such optimizations into the Iterator itself (even though it would be awesome), compared to just trying to plug it into some selected fast paths as Anne suggested. One thing that might be very simple and also pretty nice is just trying to keep the documentation (or wiki page or so linked from the documentation) up to date with suggestions for people interested in speed improvements listing things such as (not sure if we have that): * Use pyfftw for speeding up ffts * numexpr can be nice and gives a way to quickly use multiple cores * numba can automagically compile some python functions to be fast * Use TCL if you need faster einsum(like) operations * ... Just a few thoughts, did not think about details really. 
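As a taste of one entry from the list above (numexpr is an optional install; this is a sketch, not numpy code):

import numpy as np
import numexpr as ne

a = np.random.rand(1000000)
b = np.random.rand(1000000)

c = ne.evaluate('2*a + 3*b')        # multi-threaded, avoids large temporaries
print(np.allclose(c, 2*a + 3*b))    # True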
But yes, it is sounds reasonable to me to re-add support for optional dependencies such as fftw or your TCL. But packagers have to make use of that or I fear it is actually less available than a standalone python module. - Sebastian > The same questions apply with respect to TCL. > > > TCL uses the Transpose-Transpose-GEMM-Transpose approach where > > > all tensors are flattened into matrices (via HPTT) and then > > > contracted via GEMM; the final result is eventually folded (via > > > HPTT) into the desired output tensor. > > > > > > > This is a pretty direct replacement of einsum, but I think einsum > > may well already do pretty much this, apart from not using HPTT to > > do the transposes. So the way to get this functionality would be to > > make the matrix-rearrangement primitives use HPTT, as above. > ?That would certainly be one approach, however, TCL also explores > several different strategies/candidates and picks the one that > minimizes the data movements required by the transpositions. > > > Would it be possible to expose HPTT and TCL as optional packages > > > within NumPY? This way I don't have to redo the work that I've > > > already put into those libraries. > > > > > > > I think numpy should be regarded as a basically-complete package > > for manipulating strided in-memory data, to which we are reluctant > > to add new user-visible functionality. Tools that can act under the > > hood to make existing code faster, or to reduce the work users must > > to to make their code run fast enough, are valuable. > ?It seems to me that TCL is such a candidate, since it can replace a > significant portion of the functionality offered by numpy.einsum(), > yielding significantly higher performance. > > I imagine some thing of the form: > > def einsum(...): > ??? if( tclApplicable and tclAvailable ): > ?????? tcl.tensorMult(...) > > ? > > Would some of your techniques help numpy to more rapidly evaluate > > things like C[...] = A+B, when A B and C are arbitrarily strided > > and there are no ordering constraints on the result? Or just A+B > > where numpy is free to choose the memory layout for the result? > ?Actually, HPTT is only concerned with the operation of the form > B[perm(i0,i1,...)] = alpha * A[i0,i1,...] + beta * B[perm(i0,i1,...)] > (where alpha and beta are scalars). Summing over multiple transposed > tensors can be quite challenging (https://arxiv.org/abs/1705.06661) > and is not covered by HPTT. Does this answer your question? > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 801 bytes Desc: This is a digitally signed message part URL: From chris.barker at noaa.gov Thu Aug 17 12:15:14 2017 From: chris.barker at noaa.gov (Chris Barker) Date: Thu, 17 Aug 2017 09:15:14 -0700 Subject: [Numpy-discussion] Tensor Contraction (HPTT) and Tensor Transposition (TCL) In-Reply-To: <1502956557.27862.3.camel@sipsolutions.net> References: <55f86774-5831-e152-b617-c5e1b6364605@gmx.de> <1502956557.27862.3.camel@sipsolutions.net> Message-ID: On Thu, Aug 17, 2017 at 12:55 AM, Sebastian Berg wrote: > > How would the process look like if NumPY is distributed as a > > precompiled binary? > > > Well, numpy is BSD, and the official binaries will be BSD, someone else > could do less free binaries of course. 
Indeed, if you want it to be distributed as a binary with numpy, then the license needs to be compatible -- do you have a substantial objection to BSD? The BSD family is pretty much the standard for Python -- Python (and numpy) are very broadly used in proprietary software. I doubt we can have a hard > dependency unless it is part of the numpy source and no reason to -- if it is a hard dependency, it HAS to be compatible licensed, and it's a lot easier to keep the source together. However, it _could_ be a soft dependency, like LAPACK/BLAS -- I've honestly lost track, but numpy used come with a lapack-lite (or some such), so that it could be compiled and work with no external LAPACK implementation -- you wouldn't get the best performance, but it would work. I doubt including the source > itself is going to happen quickly since we would first have to decide > to actually use a modern C++ compiler (I have no idea if that is > problematic or not). > could it be there as a conditional compilation? There is a lot of push to support C++11 elsewhere, so a compiled-with-a-modern-compiler numpy is not SO far off.. (for py3 anyway...) * Use TCL if you need faster einsum(like) operations > That is, of course, the other option -- distribute it on its own or maybe in scipy, and then users can use it as an optimization for those few core functions where speed matters to them -- honestly, it's a pretty small fraction of numpy code. But it sure would be nice if it could be built in, and then folks would get better performance without even thinkning about it. > Just a few thoughts, did not think about details really. But yes, it is > sounds reasonable to me to re-add support for optional dependencies > such as fftw or your TCL. But packagers have to make use of that or I > fear it is actually less available than a standalone python module. > true -- though I expect Anaconda / conda forge at least would be likely to pick it up if it works well. -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Thu Aug 17 12:58:33 2017 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 17 Aug 2017 10:58:33 -0600 Subject: [Numpy-discussion] Tensor Contraction (HPTT) and Tensor Transposition (TCL) In-Reply-To: References: <55f86774-5831-e152-b617-c5e1b6364605@gmx.de> <1502956557.27862.3.camel@sipsolutions.net> Message-ID: On Thu, Aug 17, 2017 at 10:15 AM, Chris Barker wrote: > On Thu, Aug 17, 2017 at 12:55 AM, Sebastian Berg < > sebastian at sipsolutions.net> wrote: > >> > How would the process look like if NumPY is distributed as a >> > precompiled binary? >> >> >> Well, numpy is BSD, and the official binaries will be BSD, someone else >> could do less free binaries of course. > > > Indeed, if you want it to be distributed as a binary with numpy, then the > license needs to be compatible -- do you have a substantial objection to > BSD? The BSD family is pretty much the standard for Python -- Python (and > numpy) are very broadly used in proprietary software. > > I doubt we can have a hard >> dependency unless it is part of the numpy source > > > and no reason to -- if it is a hard dependency, it HAS to be compatible > licensed, and it's a lot easier to keep the source together. 
> > However, it _could_ be a soft dependency, like LAPACK/BLAS -- I've > honestly lost track, but numpy used to come with a lapack-lite (or some such), > so that it could be compiled and work with no external LAPACK > implementation -- you wouldn't get the best performance, but it would work. > > I doubt including the source >> itself is going to happen quickly since we would first have to decide >> to actually use a modern C++ compiler (I have no idea if that is >> problematic or not). >> > > could it be there as a conditional compilation? There is a lot of push to > support C++11 elsewhere, so a compiled-with-a-modern-compiler numpy is > not SO far off.. > > (for py3 anyway...) > It would take a fair amount of grunge work to get there. Variables would need renaming, for instance `new`, and other such things. Nothing mind bending, but not completely trivial either. > > * Use TCL if you need faster einsum(like) operations >> > > That is, of course, the other option -- distribute it on its own or maybe > in scipy, and then users can use it as an optimization for those few core > functions where speed matters to them -- honestly, it's a pretty small > fraction of numpy code. > > But it sure would be nice if it could be built in, and then folks would > get better performance without even thinking about it. > > >> Just a few thoughts, did not think about details really. But yes, it >> sounds reasonable to me to re-add support for optional dependencies >> such as fftw or your TCL. But packagers have to make use of that or I >> fear it is actually less available than a standalone python module. >> > > true -- though I expect Anaconda / conda forge at least would be likely to > pick it up if it works well. > > Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From millman at berkeley.edu Fri Aug 18 15:51:52 2017 From: millman at berkeley.edu (Jarrod Millman) Date: Fri, 18 Aug 2017 12:51:52 -0700 Subject: [Numpy-discussion] NetworkX 2.0b1 released Message-ID: Hi All, I am happy to announce the **beta** release of NetworkX 2.0! NetworkX is a Python package for the creation, manipulation, and study of the structure, dynamics, and functions of complex networks. This release supports Python 2.7 and 3.4-3.6 and contains many new features. This release is the result of over two years of work with over 600 pull requests by 85 contributors. We have made **major changes** to the methods in the Multi/Di/Graph classes and before the 2.0 release we need feedback on those changes. If you have code that imports networkx, please take some time to check that you are able to update your code to work with the new release. Please see the draft of the 2.0 release announcement: http://networkx.readthedocs.io/en/latest/news.html#networkx-2-0 In particular, we would like feedback on the migration guide from 1.X to 2.0: http://networkx.readthedocs.io/en/latest/release/migration_guide_from_1.x_to_2.0.html Since it is a beta release, pip won't automatically install it. So $ pip install networkx still installs networkx-1.11. But $ pip install --pre networkx will install networkx-2.0b1. If you already have networkx installed then you need to do $ pip install --pre --upgrade networkx For more information, please visit our `website `_ and our `gallery of examples `_. Please send comments and questions to the `networkx-discuss mailing list `_ or create an issue `here `_.
Best regards, Jarrod From diagonaldevice at gmail.com Fri Aug 18 17:45:23 2017 From: diagonaldevice at gmail.com (Michael Lamparski) Date: Fri, 18 Aug 2017 17:45:23 -0400 Subject: [Numpy-discussion] Why are empty arrays False? Message-ID: Greetings, all. I am troubled. The TL;DR is that `bool(array([])) is False` is misleading, dangerous, and unnecessary. Let's begin with some examples: >>> bool(np.array(1)) True >>> bool(np.array(0)) False >>> bool(np.array([0, 1])) ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all() >>> bool(np.array([1])) True >>> bool(np.array([0])) False >>> bool(np.array([])) False One of these things is not like the other. The first three results embody a design that is consistent with some of the most fundamental design choices in numpy, such as the choice to have comparison operators like `==` work elementwise. And it is the only such design I can think of that is consistent in all edge cases. (see footnote 1) The next two examples (involving arrays of shape (1,)) are a straightforward extension of the design to arrays that are isomorphic to scalars. I can't say I recall ever finding a use for this feature... but it seems fairly harmless. So how about that last example, with array([])? Well... it's /kind of/ like how other python containers work, right? Falseness is emptiness (see footnote 2)... Except that this is actually *a complete lie*, due to /all of the other examples above/! Here's what I would like to see: >>> bool(np.array([])) ValueError: The truth value of a non-scalar array is ambiguous. Use a.any() or a.all() Why do I care? Well, I myself wasted an hour barking up the wrong tree while debugging some code when it turned out that I was mistakenly using truthiness to identify empty arrays. It just so happened that the arrays always contained 1 or 0 elements, so it /appeared/ to work except in the rare case of array([0]) where things suddenly exploded. I posit that there is no usage of the fact that `bool(array([])) is False` in any real-world code which is not accompanied by a horrible bug writhing in hiding just beneath the surface. For this reason, I wish to see this behavior *abolished*. Thank you. -Michael Footnotes: 1: Every now and then, I wish that `ndarray.__{bool,nonzero}__` would just implicitly do `all()`, which would make `if a == b:` work like it does for virtually every other reasonably-designed type in existence. But then I recall that, if this were done, then the behavior of `if a != b:` would stand out like a sore thumb instead. Truly, punting on 'any/all' was the right choice. 2: np.array([[[[]]]]) is also False, which makes this an interesting sort of n-dimensional emptiness test; but if that's really what you're looking for, you can achieve this much more safely with `np.all(x.shape)` or `bool(x.flat)` -------------- next part -------------- An HTML attachment was scrubbed... URL: From shoyer at gmail.com Fri Aug 18 18:00:32 2017 From: shoyer at gmail.com (Stephan Hoyer) Date: Fri, 18 Aug 2017 15:00:32 -0700 Subject: [Numpy-discussion] Why are empty arrays False? In-Reply-To: References: Message-ID: I agree, this behavior seems actively harmful. Let's fix it. On Fri, Aug 18, 2017 at 2:45 PM, Michael Lamparski wrote: > Greetings, all. I am troubled. > > The TL;DR is that `bool(array([])) is False` is misleading, dangerous, and > unnecessary. 
Let's begin with some examples: > > >>> bool(np.array(1)) > True > >>> bool(np.array(0)) > False > >>> bool(np.array([0, 1])) > ValueError: The truth value of an array with more than one element is > ambiguous. Use a.any() or a.all() > >>> bool(np.array([1])) > True > >>> bool(np.array([0])) > False > >>> bool(np.array([])) > False > > One of these things is not like the other. > > The first three results embody a design that is consistent with some of > the most fundamental design choices in numpy, such as the choice to have > comparison operators like `==` work elementwise. And it is the only such > design I can think of that is consistent in all edge cases. (see footnote 1) > > The next two examples (involving arrays of shape (1,)) are a > straightforward extension of the design to arrays that are isomorphic to > scalars. I can't say I recall ever finding a use for this feature... but > it seems fairly harmless. > > So how about that last example, with array([])? Well... it's /kind of/ > like how other python containers work, right? Falseness is emptiness (see > footnote 2)... Except that this is actually *a complete lie*, due to /all > of the other examples above/! > > Here's what I would like to see: > > >>> bool(np.array([])) > ValueError: The truth value of a non-scalar array is ambiguous. Use > a.any() or a.all() > > Why do I care? Well, I myself wasted an hour barking up the wrong tree > while debugging some code when it turned out that I was mistakenly using > truthiness to identify empty arrays. It just so happened that the arrays > always contained 1 or 0 elements, so it /appeared/ to work except in the > rare case of array([0]) where things suddenly exploded. > > I posit that there is no usage of the fact that `bool(array([])) is False` > in any real-world code which is not accompanied by a horrible bug writhing > in hiding just beneath the surface. For this reason, I wish to see this > behavior *abolished*. > > Thank you. > -Michael > > Footnotes: > 1: Every now and then, I wish that `ndarray.__{bool,nonzero}__` would just > implicitly do `all()`, which would make `if a == b:` work like it does for > virtually every other reasonably-designed type in existence. But then I > recall that, if this were done, then the behavior of `if a != b:` would > stand out like a sore thumb instead. Truly, punting on 'any/all' was the > right choice. > > 2: np.array([[[[]]]]) is also False, which makes this an interesting sort > of n-dimensional emptiness test; but if that's really what you're looking > for, you can achieve this much more safely with `np.all(x.shape)` or > `bool(x.flat)` > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pmhobson at gmail.com Fri Aug 18 18:37:52 2017 From: pmhobson at gmail.com (Paul Hobson) Date: Fri, 18 Aug 2017 15:37:52 -0700 Subject: [Numpy-discussion] Why are empty arrays False? In-Reply-To: References: Message-ID: Maybe I'm missing something. This seems fine to me: >>> bool(np.array([])) False But I would have expected these to raise ValueErrors recommending any() and all(): >>> bool(np.array([1])) True >>> bool(np.array([0])) False On Fri, Aug 18, 2017 at 3:00 PM, Stephan Hoyer wrote: > I agree, this behavior seems actively harmful. Let's fix it. 
> > On Fri, Aug 18, 2017 at 2:45 PM, Michael Lamparski < > diagonaldevice at gmail.com> wrote: > >> Greetings, all. I am troubled. >> >> The TL;DR is that `bool(array([])) is False` is misleading, dangerous, >> and unnecessary. Let's begin with some examples: >> >> >>> bool(np.array(1)) >> True >> >>> bool(np.array(0)) >> False >> >>> bool(np.array([0, 1])) >> ValueError: The truth value of an array with more than one element is >> ambiguous. Use a.any() or a.all() >> >>> bool(np.array([1])) >> True >> >>> bool(np.array([0])) >> False >> >>> bool(np.array([])) >> False >> >> One of these things is not like the other. >> >> The first three results embody a design that is consistent with some of >> the most fundamental design choices in numpy, such as the choice to have >> comparison operators like `==` work elementwise. And it is the only such >> design I can think of that is consistent in all edge cases. (see footnote 1) >> >> The next two examples (involving arrays of shape (1,)) are a >> straightforward extension of the design to arrays that are isomorphic to >> scalars. I can't say I recall ever finding a use for this feature... but >> it seems fairly harmless. >> >> So how about that last example, with array([])? Well... it's /kind of/ >> like how other python containers work, right? Falseness is emptiness (see >> footnote 2)... Except that this is actually *a complete lie*, due to /all >> of the other examples above/! >> >> Here's what I would like to see: >> >> >>> bool(np.array([])) >> ValueError: The truth value of a non-scalar array is ambiguous. Use >> a.any() or a.all() >> >> Why do I care? Well, I myself wasted an hour barking up the wrong tree >> while debugging some code when it turned out that I was mistakenly using >> truthiness to identify empty arrays. It just so happened that the arrays >> always contained 1 or 0 elements, so it /appeared/ to work except in the >> rare case of array([0]) where things suddenly exploded. >> >> I posit that there is no usage of the fact that `bool(array([])) is >> False` in any real-world code which is not accompanied by a horrible bug >> writhing in hiding just beneath the surface. For this reason, I wish to see >> this behavior *abolished*. >> >> Thank you. >> -Michael >> >> Footnotes: >> 1: Every now and then, I wish that `ndarray.__{bool,nonzero}__` would >> just implicitly do `all()`, which would make `if a == b:` work like it does >> for virtually every other reasonably-designed type in existence. But then >> I recall that, if this were done, then the behavior of `if a != b:` would >> stand out like a sore thumb instead. Truly, punting on 'any/all' was the >> right choice. >> >> 2: np.array([[[[]]]]) is also False, which makes this an interesting sort >> of n-dimensional emptiness test; but if that's really what you're looking >> for, you can achieve this much more safely with `np.all(x.shape)` or >> `bool(x.flat)` >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion >> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From diagonaldevice at gmail.com Fri Aug 18 19:02:43 2017 From: diagonaldevice at gmail.com (Michael Lamparski) Date: Fri, 18 Aug 2017 19:02:43 -0400 Subject: [Numpy-discussion] Why are empty arrays False? In-Reply-To: References: Message-ID: > But I would have expected these to raise ValueErrors recommending any() and all(): > >>> bool(np.array([1])) > True > >>> bool(np.array([0])) > False While I can't confess to know the *actual* reason why single-element arrays evaluate the way they do, this is how I understand it: One thing that single-element arrays have going for them is that, for arrays like this, `x.any() == x.all()`. Hence, in these cases, there is no ambiguity. In this same light, we can see yet another argument against bool(np.array([])), because guess what: This one IS ambiguous! >>> np.array([]).any() False >>> np.array([]).all() True On Fri, Aug 18, 2017 at 6:37 PM, Paul Hobson wrote: > Maybe I'm missing something. > > This seems fine to me: > >>> bool(np.array([])) > False > > But I would have expected these to raise ValueErrors recommending any() > and all(): > >>> bool(np.array([1])) > True > >>> bool(np.array([0])) > False > > On Fri, Aug 18, 2017 at 3:00 PM, Stephan Hoyer wrote: > >> I agree, this behavior seems actively harmful. Let's fix it. >> >> On Fri, Aug 18, 2017 at 2:45 PM, Michael Lamparski < >> diagonaldevice at gmail.com> wrote: >> >>> Greetings, all. I am troubled. >>> >>> The TL;DR is that `bool(array([])) is False` is misleading, dangerous, >>> and unnecessary. Let's begin with some examples: >>> >>> >>> bool(np.array(1)) >>> True >>> >>> bool(np.array(0)) >>> False >>> >>> bool(np.array([0, 1])) >>> ValueError: The truth value of an array with more than one element is >>> ambiguous. Use a.any() or a.all() >>> >>> bool(np.array([1])) >>> True >>> >>> bool(np.array([0])) >>> False >>> >>> bool(np.array([])) >>> False >>> >>> One of these things is not like the other. >>> >>> The first three results embody a design that is consistent with some of >>> the most fundamental design choices in numpy, such as the choice to have >>> comparison operators like `==` work elementwise. And it is the only such >>> design I can think of that is consistent in all edge cases. (see footnote 1) >>> >>> The next two examples (involving arrays of shape (1,)) are a >>> straightforward extension of the design to arrays that are isomorphic to >>> scalars. I can't say I recall ever finding a use for this feature... but >>> it seems fairly harmless. >>> >>> So how about that last example, with array([])? Well... it's /kind of/ >>> like how other python containers work, right? Falseness is emptiness (see >>> footnote 2)... Except that this is actually *a complete lie*, due to /all >>> of the other examples above/! >>> >>> Here's what I would like to see: >>> >>> >>> bool(np.array([])) >>> ValueError: The truth value of a non-scalar array is ambiguous. Use >>> a.any() or a.all() >>> >>> Why do I care? Well, I myself wasted an hour barking up the wrong tree >>> while debugging some code when it turned out that I was mistakenly using >>> truthiness to identify empty arrays. It just so happened that the arrays >>> always contained 1 or 0 elements, so it /appeared/ to work except in the >>> rare case of array([0]) where things suddenly exploded. >>> >>> I posit that there is no usage of the fact that `bool(array([])) is >>> False` in any real-world code which is not accompanied by a horrible bug >>> writhing in hiding just beneath the surface. 
For this reason, I wish to see >>> this behavior *abolished*. >>> >>> Thank you. >>> -Michael >>> >>> Footnotes: >>> 1: Every now and then, I wish that `ndarray.__{bool,nonzero}__` would >>> just implicitly do `all()`, which would make `if a == b:` work like it does >>> for virtually every other reasonably-designed type in existence. But then >>> I recall that, if this were done, then the behavior of `if a != b:` would >>> stand out like a sore thumb instead. Truly, punting on 'any/all' was the >>> right choice. >>> >>> 2: np.array([[[[]]]]) is also False, which makes this an interesting >>> sort of n-dimensional emptiness test; but if that's really what you're >>> looking for, you can achieve this much more safely with `np.all(x.shape)` >>> or `bool(x.flat)` >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at python.org >>> https://mail.python.org/mailman/listinfo/numpy-discussion >>> >>> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion >> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From wieser.eric+numpy at gmail.com Fri Aug 18 19:07:51 2017 From: wieser.eric+numpy at gmail.com (Eric Wieser) Date: Fri, 18 Aug 2017 23:07:51 +0000 Subject: [Numpy-discussion] Why are empty arrays False? In-Reply-To: References: Message-ID: I'm also in favor of fixing this, although we might need a deprecation cycle with a warning advising to use arr.size in future to detect emptiness - just in case anyone is using it. On Sat, Aug 19, 2017, 06:01 Stephan Hoyer wrote: > I agree, this behavior seems actively harmful. Let's fix it. > > On Fri, Aug 18, 2017 at 2:45 PM, Michael Lamparski < > diagonaldevice at gmail.com> wrote: > >> Greetings, all. I am troubled. >> >> The TL;DR is that `bool(array([])) is False` is misleading, dangerous, >> and unnecessary. Let's begin with some examples: >> >> >>> bool(np.array(1)) >> True >> >>> bool(np.array(0)) >> False >> >>> bool(np.array([0, 1])) >> ValueError: The truth value of an array with more than one element is >> ambiguous. Use a.any() or a.all() >> >>> bool(np.array([1])) >> True >> >>> bool(np.array([0])) >> False >> >>> bool(np.array([])) >> False >> >> One of these things is not like the other. >> >> The first three results embody a design that is consistent with some of >> the most fundamental design choices in numpy, such as the choice to have >> comparison operators like `==` work elementwise. And it is the only such >> design I can think of that is consistent in all edge cases. (see footnote 1) >> >> The next two examples (involving arrays of shape (1,)) are a >> straightforward extension of the design to arrays that are isomorphic to >> scalars. I can't say I recall ever finding a use for this feature... but >> it seems fairly harmless. >> >> So how about that last example, with array([])? Well... it's /kind of/ >> like how other python containers work, right? Falseness is emptiness (see >> footnote 2)... Except that this is actually *a complete lie*, due to /all >> of the other examples above/! >> >> Here's what I would like to see: >> >> >>> bool(np.array([])) >> ValueError: The truth value of a non-scalar array is ambiguous. 
Use >> a.any() or a.all() >> >> Why do I care? Well, I myself wasted an hour barking up the wrong tree >> while debugging some code when it turned out that I was mistakenly using >> truthiness to identify empty arrays. It just so happened that the arrays >> always contained 1 or 0 elements, so it /appeared/ to work except in the >> rare case of array([0]) where things suddenly exploded. >> >> I posit that there is no usage of the fact that `bool(array([])) is >> False` in any real-world code which is not accompanied by a horrible bug >> writhing in hiding just beneath the surface. For this reason, I wish to see >> this behavior *abolished*. >> >> Thank you. >> -Michael >> >> Footnotes: >> 1: Every now and then, I wish that `ndarray.__{bool,nonzero}__` would >> just implicitly do `all()`, which would make `if a == b:` work like it does >> for virtually every other reasonably-designed type in existence. But then >> I recall that, if this were done, then the behavior of `if a != b:` would >> stand out like a sore thumb instead. Truly, punting on 'any/all' was the >> right choice. >> >> 2: np.array([[[[]]]]) is also False, which makes this an interesting sort >> of n-dimensional emptiness test; but if that's really what you're looking >> for, you can achieve this much more safely with `np.all(x.shape)` or >> `bool(x.flat)` >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion >> >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Fri Aug 18 20:12:43 2017 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 18 Aug 2017 17:12:43 -0700 Subject: [Numpy-discussion] Why are empty arrays False? In-Reply-To: References: Message-ID: On Fri, Aug 18, 2017 at 2:45 PM, Michael Lamparski wrote: > Greetings, all. I am troubled. > > The TL;DR is that `bool(array([])) is False` is misleading, dangerous, and > unnecessary. Let's begin with some examples: > >>>> bool(np.array(1)) > True >>>> bool(np.array(0)) > False >>>> bool(np.array([0, 1])) > ValueError: The truth value of an array with more than one element is > ambiguous. Use a.any() or a.all() >>>> bool(np.array([1])) > True >>>> bool(np.array([0])) > False >>>> bool(np.array([])) > False > > One of these things is not like the other. > > The first three results embody a design that is consistent with some of the > most fundamental design choices in numpy, such as the choice to have > comparison operators like `==` work elementwise. And it is the only such > design I can think of that is consistent in all edge cases. (see footnote 1) > > The next two examples (involving arrays of shape (1,)) are a straightforward > extension of the design to arrays that are isomorphic to scalars. I can't > say I recall ever finding a use for this feature... but it seems fairly > harmless. > > So how about that last example, with array([])? Well... it's /kind of/ like > how other python containers work, right? Falseness is emptiness (see > footnote 2)... Except that this is actually *a complete lie*, due to /all > of the other examples above/! Yeah, numpy tries to follow Python conventions, except sometimes you run into these cases where it's trying to simultaneously follow two incompatible extensions and things get... 
problematic. > Here's what I would like to see: > >>>> bool(np.array([])) > ValueError: The truth value of a non-scalar array is ambiguous. Use a.any() > or a.all() > > Why do I care? Well, I myself wasted an hour barking up the wrong tree > while debugging some code when it turned out that I was mistakenly using > truthiness to identify empty arrays. It just so happened that the arrays > always contained 1 or 0 elements, so it /appeared/ to work except in the > rare case of array([0]) where things suddenly exploded. Yeah, we should probably deprecate and remove this (though it will take some time). > 2: np.array([[[[]]]]) is also False, which makes this an interesting sort of > n-dimensional emptiness test; but if that's really what you're looking for, > you can achieve this much more safely with `np.all(x.shape)` or > `bool(x.flat)` x.size is also useful for emptiness checking. -n -- Nathaniel J. Smith -- https://vorpus.org From efiring at hawaii.edu Fri Aug 18 22:34:02 2017 From: efiring at hawaii.edu (Eric Firing) Date: Fri, 18 Aug 2017 16:34:02 -1000 Subject: [Numpy-discussion] Why are empty arrays False? In-Reply-To: References: Message-ID: <3816033a-89ba-5d5e-f3f6-c8b8c6f995c5@hawaii.edu> On 2017/08/18 11:45 AM, Michael Lamparski wrote: > Greetings, all. I am troubled. > > The TL;DR is that `bool(array([])) is False` is misleading, dangerous, > and unnecessary. Let's begin with some examples: > > >>> bool(np.array(1)) > True > >>> bool(np.array(0)) > False > >>> bool(np.array([0, 1])) > ValueError: The truth value of an array with more than one element is > ambiguous. Use a.any() or a.all() > >>> bool(np.array([1])) > True > >>> bool(np.array([0])) > False > >>> bool(np.array([])) > False > > One of these things is not like the other. > > The first three results embody a design that is consistent with some of > the most fundamental design choices in numpy, such as the choice to have > comparison operators like `==` work elementwise. And it is the only > such design I can think of that is consistent in all edge cases. (see > footnote 1) > > The next two examples (involving arrays of shape (1,)) are a > straightforward extension of the design to arrays that are isomorphic to > scalars. I can't say I recall ever finding a use for this feature... > but it seems fairly harmless. > > So how about that last example, with array([])? Well... it's /kind of/ > like how other python containers work, right? Falseness is emptiness > (see footnote 2)... Except that this is actually *a complete lie*, due > to /all of the other examples above/! I don't agree. I think the consistency between bool([]) and bool(array([])) is worth preserving. Nothing you have shown is inconsistent with "Falseness is emptiness", which is quite fundamental in Python. The inconsistency is in distinguishing between 1 element and more than one element. To be consistent, bool(array([0])) and bool(array([0, 1])) should both be True. Contrary to the ValueError message, there need be no ambiguity, any more than there is an ambiguity in bool([1, 2]). Eric > > Here's what I would like to see: > > >>> bool(np.array([])) > ValueError: The truth value of a non-scalar array is ambiguous. Use > a.any() or a.all() > > Why do I care? Well, I myself wasted an hour barking up the wrong tree > while debugging some code when it turned out that I was mistakenly using > truthiness to identify empty arrays.
It just so happened that the arrays > always contained 1 or 0 elements, so it /appeared/ to work except in the > rare case of array([0]) where things suddenly exploded. > > I posit that there is no usage of the fact that `bool(array([])) is > False` in any real-world code which is not accompanied by a horrible bug > writhing in hiding just beneath the surface. For this reason, I wish to > see this behavior *abolished*. > > Thank you. > -Michael > > Footnotes: > 1: Every now and then, I wish that `ndarray.__{bool,nonzero}__` would > just implicitly do `all()`, which would make `if a == b:` work like it > does for virtually every other reasonably-designed type in existence. > But then I recall that, if this were done, then the behavior of `if a != > b:` would stand out like a sore thumb instead. Truly, punting on > 'any/all' was the right choice. > > 2: np.array([[[[]]]]) is also False, which makes this an interesting > sort of n-dimensional emptiness test; but if that's really what you're > looking for, you can achieve this much more safely with > `np.all(x.shape)` or `bool(x.flat)` > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > From wieser.eric+numpy at gmail.com Sat Aug 19 00:19:54 2017 From: wieser.eric+numpy at gmail.com (Eric Wieser) Date: Sat, 19 Aug 2017 04:19:54 +0000 Subject: [Numpy-discussion] Why are empty arrays False? In-Reply-To: <3816033a-89ba-5d5e-f3f6-c8b8c6f995c5@hawaii.edu> References: <3816033a-89ba-5d5e-f3f6-c8b8c6f995c5@hawaii.edu> Message-ID: Defining falseness as emptiness in numpy is problematic, as then bool(array(0)) and bool(0) would have different results. 0d arrays are supposed to behave as much like their scalar values as possible, so this is not acceptable. More importantly though, allowing your proposed semantics would cause a lot of silent bugs in code like `if arr == value`, which would be silently true of array inputs. We already diverge from python on what == means, so I see no reason to match the normal semantics of bool. I'd be tentatively in favor of deprecating bool(array([1])) with a warning asking for `.squeeze()` to be used, since this also hides a (smaller) class of bugs. On Sat, Aug 19, 2017, 10:34 Eric Firing wrote: > On 2017/08/18 11:45 AM, Michael Lamparski wrote: > > Greetings, all. I am troubled. > > > > The TL;DR is that `bool(array([])) is False` is misleading, dangerous, > > and unnecessary. Let's begin with some examples: > > > > >>> bool(np.array(1)) > > True > > >>> bool(np.array(0)) > > False > > >>> bool(np.array([0, 1])) > > ValueError: The truth value of an array with more than one element is > > ambiguous. Use a.any() or a.all() > > >>> bool(np.array([1])) > > True > > >>> bool(np.array([0])) > > False > > >>> bool(np.array([])) > > False > > > > One of these things is not like the other. > > > > The first three results embody a design that is consistent with some of > > the most fundamental design choices in numpy, such as the choice to have > > comparison operators like `==` work elementwise. And it is the only > > such design I can think of that is consistent in all edge cases. (see > > footnote 1) > > > > The next two examples (involving arrays of shape (1,)) are a > > straightforward extension of the design to arrays that are isomorphic to > > scalars. I can't say I recall ever finding a use for this feature... > > but it seems fairly harmless.
> > > > So how about that last example, with array([])? Well... it's /kind of/ > > like how other python containers work, right? Falseness is emptiness > > (see footnote 2)... Except that this is actually *a complete lie*, due > > to /all of the other examples above/! > > I don't agree. I think the consistency between bool([]) and > bool(array([])) is worth preserving. Nothing you have shown is > inconsistent with "Falseness is emptiness", which is quite fundamental > in Python. The inconsistency is in distinguishing between 1 element and > more than one element. To be consistent, bool(array([0])) and > bool(array([0, 1])) should both be True. Contrary to the ValueError > message, there need be no ambiguity, any more than there is an ambiguity > in bool([1, 2]). > > Eric > > > > > > Here's what I would like to see: > > > > >>> bool(np.array([])) > > ValueError: The truth value of a non-scalar array is ambiguous. Use > > a.any() or a.all() > > > > Why do I care? Well, I myself wasted an hour barking up the wrong tree > > while debugging some code when it turned out that I was mistakenly using > > truthiness to identify empty arrays. It just so happened that the arrays > > always contained 1 or 0 elements, so it /appeared/ to work except in the > > rare case of array([0]) where things suddenly exploded. > > > > I posit that there is no usage of the fact that `bool(array([])) is > > False` in any real-world code which is not accompanied by a horrible bug > > writhing in hiding just beneath the surface. For this reason, I wish to > > see this behavior *abolished*. > > > > Thank you. > > -Michael > > > > Footnotes: > > 1: Every now and then, I wish that `ndarray.__{bool,nonzero}__` would > > just implicitly do `all()`, which would make `if a == b:` work like it > > does for virtually every other reasonably-designed type in existence. > > But then I recall that, if this were done, then the behavior of `if a != > > b:` would stand out like a sore thumb instead. Truly, punting on > > 'any/all' was the right choice. > > > > 2: np.array([[[[]]]]) is also False, which makes this an interesting > > sort of n-dimensional emptiness test; but if that's really what you're > > looking for, you can achieve this much more safely with > > `np.all(x.shape)` or `bool(x.flat)` > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From diagonaldevice at gmail.com Sat Aug 19 01:04:56 2017 From: diagonaldevice at gmail.com (Michael Lamparski) Date: Sat, 19 Aug 2017 01:04:56 -0400 Subject: [Numpy-discussion] Why are empty arrays False? In-Reply-To: References: <3816033a-89ba-5d5e-f3f6-c8b8c6f995c5@hawaii.edu> Message-ID: > More importantly though, allowing your proposed semantics would cause a lot of silent bugs in code like `if arr == value`, which would be silently true of array inputs. We already diverge from python on what == means, so I see no reason to match the normal semantics of bool. Eric hits the nail right on the head here. (er, ahh, you're both Eric!) And this gets worse; not only would `a == b` be true, but so would `a != b`! For the vast majority of arrays, `bool(x != x)` would be True! 
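To make that concrete, here is a minimal sketch that simulates the proposed "falseness is emptiness" rule (the helper `emptiness_bool` is hypothetical, not numpy API):

import numpy as np

def emptiness_bool(arr):
    # hypothetical semantics: an array would be truthy iff it is non-empty
    return arr.size > 0

a = np.array([1, 2, 3])
b = np.array([1, 2, 3])
print(emptiness_bool(a == b))  # True: the result array has 3 elements
print(emptiness_bool(a != b))  # also True: still a non-empty result array

Both comparisons produce a length-3 boolean array, so under those semantics both tests would pass no matter what the arrays contain.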
I can resonate with Eric F's feelings, because to be honest, I've never been a big fan of the fact that comparison operators return arrays in the first place. That said... it's a difficult design question, and I can respect the decision that was made; there certainly are a large variety of circumstances where broadcasting these operations is useful. On the other hand, it is a decision that comes with implications that cannot be ignored in many other parts of the library, and truthiness of arrays is one of them. > I'd be tentatively in favor of deprecating bool(array([1])) with a warning asking for `.squeeze()` to be used, since this also hides a (smaller) class of bugs. I can get behind this as well, though I just keep wondering in the back of my mind whether there's some tricky but legitimate use case that I'm not thinking about, where arrays of size 1 just happen to have a natural tendency to arise. On Sat, Aug 19, 2017, 10:34 Eric Firing wrote: > On 2017/08/18 11:45 AM, Michael Lamparski wrote: > > Greetings, all. I am troubled. > > > > The TL;DR is that `bool(array([])) is False` is misleading, dangerous, > > and unnecessary. Let's begin with some examples: > > > > >>> bool(np.array(1)) > > True > > >>> bool(np.array(0)) > > False > > >>> bool(np.array([0, 1])) > > ValueError: The truth value of an array with more than one element is > > ambiguous. Use a.any() or a.all() > > >>> bool(np.array([1])) > > True > > >>> bool(np.array([0])) > > False > > >>> bool(np.array([])) > > False > > > > One of these things is not like the other. > > > > The first three results embody a design that is consistent with some of > > the most fundamental design choices in numpy, such as the choice to have > > comparison operators like `==` work elementwise. And it is the only > > such design I can think of that is consistent in all edge cases. (see > > footnote 1) > > > > The next two examples (involving arrays of shape (1,)) are a > > straightforward extension of the design to arrays that are isomorphic to > > scalars. I can't say I recall ever finding a use for this feature... > > but it seems fairly harmless. > > > > So how about that last example, with array([])? Well... it's /kind of/ > > like how other python containers work, right? Falseness is emptiness > > (see footnote 2)... Except that this is actually *a complete lie*, due > > to /all of the other examples above/! > > I don't agree. I think the consistency between bool([]) and > bool(array([])) is worth preserving. Nothing you have shown is > inconsistent with "Falseness is emptiness", which is quite fundamental > in Python. The inconsistency is in distinguishing between 1 element and > more than one element. To be consistent, bool(array([0])) and > bool(array([0, 1])) should both be True. Contrary to the ValueError > message, there need be no ambiguity, any more than there is an ambiguity > in bool([1, 2]). > > Eric > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From andyfaff at gmail.com Sat Aug 19 03:57:58 2017 From: andyfaff at gmail.com (Andrew Nelson) Date: Sat, 19 Aug 2017 17:57:58 +1000 Subject: [Numpy-discussion] Why are empty arrays False? In-Reply-To: References: Message-ID: > I think the consistency between bool([]) and bool(array([])) is worth preserving I'm with Eric Firing on this one. Empty sequences are False in Python.
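For reference, this is indeed how the built-in containers behave -- though note the second example, which is where numpy's scalar semantics and the container semantics part ways:

>>> bool([]), bool(()), bool(''), bool({})
(False, False, False, False)
>>> bool([0])  # non-empty list, so True, even though its one element is falsy
True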
-------------- next part -------------- An HTML attachment was scrubbed... URL: From wieser.eric+numpy at gmail.com Sat Aug 19 05:00:43 2017 From: wieser.eric+numpy at gmail.com (Eric Wieser) Date: Sat, 19 Aug 2017 09:00:43 +0000 Subject: [Numpy-discussion] Why are empty arrays False? In-Reply-To: References: Message-ID: Andrew, that can only be useful if you also require that all non-empty arrays are True - else code looking for empty arrays gets false positives on arrays of zeros. But as I mention above, that is not acceptable, as it produces silent traps for new users, or functions not written with numpy in mind. "In the face of ambiguity, refuse the temptation to guess" tells us that throwing an error is the right thing to do here. In idiomatic code, numpy arrays have semantics closer to scalars than to sequences - iteration is usually a red flag. Another example of how arrays are not like sequences - the + operator is element-wise addition, not sequence concatenation. On Sat, Aug 19, 2017, 15:58 Andrew Nelson wrote: > > I think the consistency between bool([]) and > bool(array([])) is worth preserving > > I'm with Eric Firing on this one. Empty sequences are False in Python. > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Sat Aug 19 09:22:33 2017 From: pav at iki.fi (Pauli Virtanen) Date: Sat, 19 Aug 2017 15:22:33 +0200 Subject: [Numpy-discussion] Why are empty arrays False? In-Reply-To: References: <3816033a-89ba-5d5e-f3f6-c8b8c6f995c5@hawaii.edu> Message-ID: <31cc66a7-2d79-6dd1-ef09-cb2974736cc2@iki.fi> Michael Lamparski kirjoitti 19.08.2017 klo 07:04: >> I'd be tentatively in favor of deprecating bool(array([1])) with a > warning asking for `.squeeze()` to be used, since this also hides a > (smaller) class of bugs. > > I can get behind this as well, though I just keep wondering in the back > of my mind whether there's some tricky but legitimate use case that I'm > not thinking about, where arrays of size 1 just happen to have a natural > tendency to arise. Changing this sort of fundamental semantics (i.e. size-1 arrays behave like scalars in bool, int, etc. casting context) this late in the game in my opinion should be discussed with more care. While the intention of making it harder to write code with bugs is good, it should not come at the cost of having everyone fix their old scripts, which worked correctly previously, but then suddenly stop working. Note also that I expect polling on this mailing list will not reach the majority of the user base, so I would suggest being very conservative when deprecating features that are not wrong but merely have suboptimal semantics. This sort of backward-incompatible change accumulates, and will lead to rotting of third-party code. -- Pauli Virtanen From diagonaldevice at gmail.com Sat Aug 19 13:18:45 2017 From: diagonaldevice at gmail.com (Michael Lamparski) Date: Sat, 19 Aug 2017 13:18:45 -0400 Subject: [Numpy-discussion] Why are empty arrays False?
In-Reply-To: <31cc66a7-2d79-6dd1-ef09-cb2974736cc2@iki.fi> References: <3816033a-89ba-5d5e-f3f6-c8b8c6f995c5@hawaii.edu> <31cc66a7-2d79-6dd1-ef09-cb2974736cc2@iki.fi> Message-ID: On Sat, Aug 19, 2017 at 9:22 AM, Pauli Virtanen wrote: > While the intention of making it harder to write code with bugs > is good, it should not come at the cost of having everyone fix > their old scripts, which worked correctly previously, but then > suddenly stop working. This is a good point. Deprecating anything in such a widely used library has a very big cost that must be weighed against the benefits, and I agree that truth-testing on size=1 arrays is neither broken nor dangerous. IMO, it is a small refactoring hazard at worst. > Note also that I expect polling on this mailing list will not > reach the majority of the user base, [...] Yep. This thread was really just to test the waters. While there's no way to really reach out to the silent majority, I am going to at least make a github issue and summarize the points from this discussion there. I'm glad to see that the general response so far has been that this seems actionable (specifically, deprecating __nonzero__ on size=0 arrays). -------------- next part -------------- An HTML attachment was scrubbed... URL: From efiring at hawaii.edu Sat Aug 19 14:00:35 2017 From: efiring at hawaii.edu (Eric Firing) Date: Sat, 19 Aug 2017 08:00:35 -1000 Subject: [Numpy-discussion] Why are empty arrays False? In-Reply-To: References: <3816033a-89ba-5d5e-f3f6-c8b8c6f995c5@hawaii.edu> <31cc66a7-2d79-6dd1-ef09-cb2974736cc2@iki.fi> Message-ID: On 2017/08/19 7:18 AM, Michael Lamparski wrote: > While there's no way to really reach out to the silent majority, I am > going to at least make a github issue and summarize the points from this > discussion there. I'm glad to see that the general response so far has > been that this seems actionable (specifically, deprecating __nonzero__ > on size=0 arrays). No, that is the response you agree with; I don't think it is fair to characterize it as the "general response". From diagonaldevice at gmail.com Sat Aug 19 16:26:47 2017 From: diagonaldevice at gmail.com (Michael Lamparski) Date: Sat, 19 Aug 2017 16:26:47 -0400 Subject: [Numpy-discussion] Why are empty arrays False? In-Reply-To: References: <3816033a-89ba-5d5e-f3f6-c8b8c6f995c5@hawaii.edu> <31cc66a7-2d79-6dd1-ef09-cb2974736cc2@iki.fi> Message-ID: On Sat, Aug 19, 2017 at 2:00 PM, Eric Firing wrote: > On 2017/08/19 7:18 AM, Michael Lamparski wrote: > >> While there's no way to really reach out to the silent majority, I am >> going to at least make a github issue and summarize the points from this >> discussion there. I'm glad to see that the general response so far has >> been that this seems actionable (specifically, deprecating __nonzero__ on >> size=0 arrays). >> > > No, that is the response you agree with; I don't think it is fair to > characterize it as the "general response". > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > With regards to gauging "general response," all I'm really trying to do is gauge the likelihood of my issue getting closed right away without action, if I were to file one (e.g. has this issue already been discussed, with a decision to leave things as they are?), because I don't want to waste my time and others' by creating an issue for something that is never going to happen.
I've gotten the impression from this conversation that this change (specifically for size=0) *is* possible, especially since two people with a decent history of contribution to the numpy repository have voiced approval for the change. As I see it, opening an issue will at least invite some more discussion, and at best motivate a change. To me, that is a "generally positive response." --- ...but there's also more to it beyond the "general response." From your words, I get the impression that you believe that I am simply ignoring your comments or do not value them, simply because they go against mine. Please understand: I *don't* enjoy the fact that truthiness of numpy arrays works differently from lists! And there's plenty else that I don't enjoy about numpy, too; I *don't* enjoy the fact that I need to change a whole bunch of `assert a == b` statements to `assert (a == b).all()` after changing the type of some tuple to an array. I *don't* enjoy how numpy's auto-magical shape-finding makes it nearly impossible to have an array of heterogeneous tuples. But over the years, I've also put a considerable amount of time and thought into understanding *why* these design choices were made. Library design is a difficult beast. Every design decision you make can interact in unexpected ways with all of your other decisions, and eventually you have to accept the fact that you can't always have your cake and eat it too. And designing a library like numpy, the library to end all libraries for working with numerical data? That is h-a-r-d HARD. That borders on programming-language-design hard. The fact of the matter is that *I agree with you.* Truthiness SHOULD denote emptiness for python types....but I have already considered this, and weighed it against every other design consideration that came to mind. In the end, those other design considerations won out, and "scalar evaluation/any()/all()" is the lesser of two evils. To convince me personally, you need to start by presenting something novel that I haven't thought about. There will be opportunity for others to do the same on Github. Please; I live for discussions about pitfalls in language and library design! -Michael -------------- next part -------------- An HTML attachment was scrubbed... URL: From efiring at hawaii.edu Sat Aug 19 16:36:15 2017 From: efiring at hawaii.edu (Eric Firing) Date: Sat, 19 Aug 2017 10:36:15 -1000 Subject: [Numpy-discussion] Why are empty arrays False? In-Reply-To: References: <3816033a-89ba-5d5e-f3f6-c8b8c6f995c5@hawaii.edu> <31cc66a7-2d79-6dd1-ef09-cb2974736cc2@iki.fi> Message-ID: <05ae7890-bffe-219e-c627-3694de54a8fd@hawaii.edu> On 2017/08/19 10:26 AM, Michael Lamparski wrote: > There will be opportunity for others to do the same on Github. Please; I > live for discussions about pitfalls in language and library design! > Thank you for your thoughtful discussion. Eric From njs at pobox.com Sat Aug 19 17:24:26 2017 From: njs at pobox.com (Nathaniel Smith) Date: Sat, 19 Aug 2017 14:24:26 -0700 Subject: [Numpy-discussion] Why are empty arrays False? In-Reply-To: <3816033a-89ba-5d5e-f3f6-c8b8c6f995c5@hawaii.edu> References: <3816033a-89ba-5d5e-f3f6-c8b8c6f995c5@hawaii.edu> Message-ID: On Fri, Aug 18, 2017 at 7:34 PM, Eric Firing wrote: > I don't agree. I think the consistency between bool([]) and bool(array([])) > is worth preserving. Nothing you have shown is inconsistent with "Falseness > is emptiness", which is quite fundamental in Python. The inconsistency is > in distinguishing between 1 element and more than one element.
To be > consistent, bool(array([0])) and bool(array([0, 1])) should both be True. > Contrary to the ValueError message, there need be no ambiguity, any more > than there is an ambiguity in bool([1, 2]). Yeah, this is a mess. But we're definitely not going to make bool(array([0])) be True. That would break tons of code that currently relies on the current behavior. And the current behavior does make sense, in every case except empty arrays: bool broadcasts over the array, and then, oh shoot, Python requires that bool's return value be a scalar, so if this results in anything besides an array of size 1, raise an error. OTOH you can't really write code that depends on using the current bool(array([])) semantics for emptiness checking, unless the only two cases you care about are "empty" and "non-empty with exactly one element and that element is truthy". So it's much less likely that changing that will break existing code, plus any code that does break was already likely broken in subtle ways. The consistency-with-Python argument cuts two ways: if an array is a container, then for consistency bool should do emptiness checking. If an array is a bunch of scalars with broadcasting, then for consistency bool should do truthiness checking on the individual elements and raise an error on any array with size != 1. So we can't just rely on consistency-with-Python to resolve the argument -- we need to pick one :-). Though internal consistency within numpy would argue for the latter option, because numpy almost always prefers the bag-of-scalars semantics over the container semantics, e.g. for + and *, like Eric Wieser mentioned. Though there are exceptions like iteration. ...Though actually, iteration and indexing by scalars try to be consistent with Python in yet a third way. They pretend that an array is a unidimensional container holding a bunch of arrays: In [3]: np.array([[1]])[0] Out[3]: array([1]) In [4]: next(iter(np.array([[1]]))) Out[4]: array([1]) So according to this model, bool(np.array([])) should be False, but bool(np.array([[]])) should be True (note that with lists, bool([[]]) is True). But alas: In [5]: bool(np.array([])), bool(np.array([[]])) Out[5]: (False, False) -n -- Nathaniel J. Smith -- https://vorpus.org From m.h.vankerkwijk at gmail.com Sat Aug 19 18:05:50 2017 From: m.h.vankerkwijk at gmail.com (Marten van Kerkwijk) Date: Sat, 19 Aug 2017 18:05:50 -0400 Subject: [Numpy-discussion] Why are empty arrays False? In-Reply-To: References: <3816033a-89ba-5d5e-f3f6-c8b8c6f995c5@hawaii.edu> <31cc66a7-2d79-6dd1-ef09-cb2974736cc2@iki.fi> Message-ID: Agreed with Eric Wieser here: having an empty array test as `False` is less than useless, since a non-empty array either returns something based on its contents or an error. This means that one cannot write statements like `if array:`. Does this leave any use case? It seems to me it just shows there is no point in defining the truthiness of an empty array. -- Marten From ben.v.root at gmail.com Mon Aug 21 10:34:22 2017 From: ben.v.root at gmail.com (Benjamin Root) Date: Mon, 21 Aug 2017 10:34:22 -0400 Subject: [Numpy-discussion] Why are empty arrays False? In-Reply-To: References: <3816033a-89ba-5d5e-f3f6-c8b8c6f995c5@hawaii.edu> <31cc66a7-2d79-6dd1-ef09-cb2974736cc2@iki.fi> Message-ID: I've long ago stopped doing any "emptiness is false"-type tests on any python containers when iterators and generators became common, because they always return True.
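Concretely -- a plain iterator or generator defines neither __len__ nor __bool__, so truth-testing falls back to the default "object exists" and always succeeds:

>>> bool(iter([]))
True
>>> bool(x for x in [])
True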
Ben On Sat, Aug 19, 2017 at 6:05 PM, Marten van Kerkwijk < m.h.vankerkwijk at gmail.com> wrote: > Agreed with Eric Wieser here: having an empty array test as `False` is > less than useless, since a non-empty array either returns something > based on its contents or an error. This means that one cannot write > statements like `if array:`. Does this leave any use case? It seems to > me it just shows there is no point in defining the truthiness of an > empty array. > -- Marten > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Tue Aug 22 12:31:43 2017 From: chris.barker at noaa.gov (Chris Barker) Date: Tue, 22 Aug 2017 09:31:43 -0700 Subject: [Numpy-discussion] Why are empty arrays False? In-Reply-To: References: <3816033a-89ba-5d5e-f3f6-c8b8c6f995c5@hawaii.edu> <31cc66a7-2d79-6dd1-ef09-cb2974736cc2@iki.fi> Message-ID: On Mon, Aug 21, 2017 at 7:34 AM, Benjamin Root wrote: > I've long ago stopped doing any "emptiness is false"-type tests on any > python containers when iterators and generators became common, because they > always return True. > good point. Personally, I've thought for years that Python's "Truthiness" concept is a wart. Sure, empty sequences, and zero values are often "False" in nature, but truthiness really is application-dependent -- in particular, sometimes a value of zero is meaningful, and sometimes not. Is it really so hard to write:

if len(seq) == 0:

or

if x == 0:

or

if arr.size == 0:

or

arr.shape == (0,0):

And then you are being far more explicit about what the test really is. And thanks Ben, for pointing out the issue with iterables. One more example of how Python has really changed its focus: Python 2 (or maybe, Python 1.5) was all about sequences. Python 3 is all about iterables -- and the "empty is False" concept does not map well to iterables.... As to the topic at hand, if we had it to do again, I would NOT make an array that happens to hold a single value act like a scalar for bool() -- a 1-D array that happens to be length-1 really is a different beast than a scalar. But we don't have it to do again -- so we probably need to keep it as it is for backward compatibility. -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From diagonaldevice at gmail.com Tue Aug 22 14:04:25 2017 From: diagonaldevice at gmail.com (Michael Lamparski) Date: Tue, 22 Aug 2017 14:04:25 -0400 Subject: [Numpy-discussion] Why are empty arrays False? In-Reply-To: References: <3816033a-89ba-5d5e-f3f6-c8b8c6f995c5@hawaii.edu> <31cc66a7-2d79-6dd1-ef09-cb2974736cc2@iki.fi> Message-ID: On Tue, Aug 22, 2017 at 12:31 PM, Chris Barker wrote: > Personally, I've thought for years that Python's "Truthiness" concept is a wart. > Sure, empty sequences, and zero values are often "False" in nature, > but truthiness really is application-dependent -- in particular, sometimes > a value of zero is meaningful, and sometimes not. I think truthiness is easily a wart in any dynamically-typed language (and yet ironically, every language I can think of that has truthiness is dynamically typed except for C++).
And yet for some reason it seems to be pressed forward as idiomatic in python, and for that reason alone, I use it. These are questions I ask myself on a daily basis, just to support this strange idiom:

- How close to the public API is this argument?
- Is '' a reasonable value for this string?
- How about an empty tuple? Empty set?
- Should this sentinel value be None or a new object()?
- Is this list local to this function?
- Is the type of this optional argument always True?
- How liable are these answers to change with future refactoring?

which seems like a pretty big laundry list to keep in check for what's supposed to be syntactic sugar. In the end, I will admit that I think my code "looks nice," but I think that's only because I've gotten used to seeing it! After answering all of these questions I tend to find that truthiness is seldom usable in any sort of generic code. These are the kinds of places where I usually find myself using truthiness instead, and all involve working with objects of known type:

# 1. A list used as a stack
while stack:
    top = stack.pop()
    ...

from warnings import warn  # needed for snippet 3

def read_config(d):
    # 2. Empty default value for a mutable argument that I don't mutate
    d = dict(d or {})
    a = d.pop('a')
    b = d.pop('b')
    ...
    # 3. Validating configuration
    if d:
        warn('unrecognized config keys: {!r}'.format(list(d)))

# 4. Oddball cases, e.g. the "linked list" (a, (b, (c, (d, (e, None)))))
def iter_linked_list(node):
    while node:
        value, node = node
        yield value

# 5. ...more oddball stuff...
def format_call(f, *args, **kw):
    arg_s = ', '.join(repr(x) for x in args)
    kw_s = ', '.join(f'{k!s}={v!r}' for k, v in kw.items())
    sep = ', ' if args and kw else ''
    return f'{f.__name__}({arg_s}{sep}{kw_s})'

Meanwhile, for an arbitrary iterator taken as an argument, if you want it to have at least one element for some reason, then good luck; truthiness will not help you. -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Tue Aug 22 17:48:14 2017 From: chris.barker at noaa.gov (Chris Barker) Date: Tue, 22 Aug 2017 14:48:14 -0700 Subject: [Numpy-discussion] Why are empty arrays False? In-Reply-To: References: <3816033a-89ba-5d5e-f3f6-c8b8c6f995c5@hawaii.edu> <31cc66a7-2d79-6dd1-ef09-cb2974736cc2@iki.fi> Message-ID: On Tue, Aug 22, 2017 at 11:04 AM, Michael Lamparski < diagonaldevice at gmail.com> wrote: > I think truthiness is easily a wart in any dynamically-typed language (and > yet ironically, every language I can think of that has truthiness is > dynamically typed except for C++). And yet for some reason it seems to be > pressed forward as idiomatic in python, and for that reason alone, I use > it. > me too :-) > Meanwhile, for an arbitrary iterator taken as an argument, if you want it > to have at least one element for some reason, then good luck; truthiness > will not help you. > of course, nor will len() And this is mostly OK, as if you are taking an arbitrary iterable, then you are probably going to, well, iterate over it, and: for this in an_empty_iterable: ... works fine. But bringing it back OT -- it's all a bit messy, but there is logic for the existing conventions in numpy -- and I think backward compatibility is more important than a slightly cleaner API. -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed...
From renato.fabbri at gmail.com  Thu Aug 24 09:53:27 2017
From: renato.fabbri at gmail.com (Renato Fabbri)
Date: Thu, 24 Aug 2017 10:53:27 -0300
Subject: [Numpy-discussion] power function distribution or power-law
 distribution?
Message-ID:

numpy.random.power.__doc__

uses only the term "power function distribution".

I cannot find a comparison between this term and "power-law distribution"
and am quite interested to know if they are simply synonyms.

Any ideas?

BTW, how is this list related to numpy-discussion at scipy.org?

--
Renato Fabbri
GNU/Linux User #479299
labmacambira.sourceforge.net

From pav at iki.fi  Thu Aug 24 10:07:00 2017
From: pav at iki.fi (Pauli Virtanen)
Date: Thu, 24 Aug 2017 16:07:00 +0200
Subject: [Numpy-discussion] power function distribution or power-law
 distribution?
In-Reply-To:
References:
Message-ID: <1503583620.2351.9.camel@iki.fi>

On Thu, 2017-08-24 at 10:53 -0300, Renato Fabbri wrote:
> numpy.random.power.__doc__
>
> uses only the term "power function distribution".

The documentation in the most recent Numpy version seems to be more
explicit, see the Notes section for the PDF:

https://docs.scipy.org/doc/numpy/reference/generated/numpy.random.power.html

> BTW, how is this list related to numpy-discussion at scipy.org?

That's the old address of this list.
The current address is numpy-discussion at python.org and it should be
used instead.

--
Pauli Virtanen

From renato.fabbri at gmail.com  Thu Aug 24 10:41:14 2017
From: renato.fabbri at gmail.com (Renato Fabbri)
Date: Thu, 24 Aug 2017 11:41:14 -0300
Subject: [Numpy-discussion] power function distribution or power-law
 distribution?
In-Reply-To: <1503583620.2351.9.camel@iki.fi>
References: <1503583620.2351.9.camel@iki.fi>
Message-ID:

Thanks for the reply.

But the question remains: how are the terms "power function distribution"
and "power-law distribution" related?

The documentation link you sent has no information on this.
(And it seems the same as what I get here:

In [6]: n.version.full_version
Out[6]: '1.11.0'
)

On Thu, Aug 24, 2017 at 11:07 AM, Pauli Virtanen wrote:
> The documentation in the most recent Numpy version seems to be more
> explicit, see the Notes section for the PDF: [...]

--
Renato Fabbri
GNU/Linux User #479299
labmacambira.sourceforge.net

From nathan12343 at gmail.com  Thu Aug 24 10:47:51 2017
From: nathan12343 at gmail.com (Nathan Goldbaum)
Date: Thu, 24 Aug 2017 09:47:51 -0500
Subject: [Numpy-discussion] power function distribution or power-law
 distribution?
In-Reply-To:
References: <1503583620.2351.9.camel@iki.fi>
Message-ID:

The latest version of numpy is 1.13.

In this case, as described in the docs, a power function distribution is
one with a probability density function of the form a*x^(a-1) for x
between 0 and 1.
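For reference, the density quoted above integrates to one for any a > 0,
and its mean has a simple closed form; both identities come straight from
the stated pdf and are used in the numerical sketch later in the thread:

    \int_0^1 a x^{a-1} \, dx = \left[ x^a \right]_0^1 = 1,
    \qquad
    E[X] = \int_0^1 x \cdot a x^{a-1} \, dx = \frac{a}{a+1}.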
From renato.fabbri at gmail.com  Thu Aug 24 10:56:46 2017
From: renato.fabbri at gmail.com (Renato Fabbri)
Date: Thu, 24 Aug 2017 11:56:46 -0300
Subject: [Numpy-discussion] power function distribution or power-law
 distribution?
In-Reply-To:
References: <1503583620.2351.9.camel@iki.fi>
Message-ID:

On Thu, Aug 24, 2017 at 11:47 AM, Nathan Goldbaum wrote:

> The latest version of numpy is 1.13.
>
> In this case, as described in the docs, a power function distribution is
> one with a probability density function of the form a*x^(a-1) for x
> between 0 and 1.

ok, let's try ourselves to relate the terms.

Would you agree that the "power function distribution" is a "power-law
distribution" in which the domain is restricted to be [0,1]?

--
Renato Fabbri
GNU/Linux User #479299
labmacambira.sourceforge.net
From josef.pktd at gmail.com  Thu Aug 24 12:57:43 2017
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Thu, 24 Aug 2017 12:57:43 -0400
Subject: [Numpy-discussion] power function distribution or power-law
 distribution?
In-Reply-To:
References: <1503583620.2351.9.camel@iki.fi>
Message-ID:

On Thu, Aug 24, 2017 at 10:56 AM, Renato Fabbri wrote:

> ok, let's try ourselves to relate the terms.
> Would you agree that the "power function distribution" is a "power-law
> distribution" in which the domain is restricted to be [0,1]?

I would phrase it weaker. The emphasis for a power-law distribution is
often, or commonly, on the tail behavior.

The functional form of the pdf is the same as the power-law distribution
but restricted to a finite interval [0,1]; or: the power function
distribution can be considered as a truncated power-law distribution.

(I looked at it maybe 9 years ago, but gave up on the similarity because
the purpose is very different, at least based on what I looked at at the
time. The similarity in name also got me confused initially.)

Josef
From robert.kern at gmail.com  Thu Aug 24 13:19:27 2017
From: robert.kern at gmail.com (Robert Kern)
Date: Thu, 24 Aug 2017 10:19:27 -0700
Subject: [Numpy-discussion] power function distribution or power-law
 distribution?
In-Reply-To:
References: <1503583620.2351.9.camel@iki.fi>
Message-ID:

On Thu, Aug 24, 2017 at 7:56 AM, Renato Fabbri wrote:

> ok, let's try ourselves to relate the terms.
> Would you agree that the "power function distribution" is a "power-law
> distribution" in which the domain is restricted to be [0,1]?

I probably wouldn't. The coincidental similarity in functional form
(domain and normalizing constants notwithstanding) obscures the very
different mechanisms each represent.

The ambiguous name of the method `power` instead of `power_function` is
my fault. You have my apologies.

--
Robert Kern

From josef.pktd at gmail.com  Thu Aug 24 13:24:44 2017
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Thu, 24 Aug 2017 13:24:44 -0400
Subject: [Numpy-discussion] power function distribution or power-law
 distribution?
In-Reply-To:
References: <1503583620.2351.9.camel@iki.fi>
Message-ID:

On Thu, Aug 24, 2017 at 12:57 PM, josef.pktd at gmail.com wrote:

> The functional form of the pdf is the same as the power-law distribution
> but restricted to a finite interval [0,1]; or: the power function
> distribution can be considered as a truncated power-law distribution.

Based on what I start to remember: the power function distribution can
have an increasing pdf. Because of the truncation it does not need the
same parameter restriction as the power-law distribution in order to
integrate to a finite value, so it can be normalized to a proper
distribution.

Josef
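A small numerical illustration of both points above -- np.random.power(a)
really does sample the density a*x^(a-1) on [0, 1], and the truncation is
what makes normalization work for any a > 0. This is a sketch, not from
the thread itself; it assumes scipy is available for the quadrature, and
a = 3.0 is an arbitrary choice:

import numpy as np
from scipy.integrate import quad

a = 3.0
samples = np.random.power(a, size=200000)

# For the power function distribution, E[X] = a / (a + 1),
# so the sample mean should be close to 0.75 for a = 3.
print(samples.mean(), a / (a + 1.0))

# On [0, 1] the kernel x**(a-1) has a finite integral (1/a) for any
# a > 0, so it can always be normalized ...
print(quad(lambda x: x ** (a - 1.0), 0.0, 1.0)[0])

# ... whereas the power-law kernel x**(-a) on [1, inf) only integrates
# to a finite value for a > 1 -- Josef's point about parameter
# restrictions in the untruncated case.
print(quad(lambda x: x ** (-a), 1.0, np.inf)[0])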
From ndbecker2 at gmail.com  Mon Aug 28 15:20:17 2017
From: ndbecker2 at gmail.com (Neal Becker)
Date: Mon, 28 Aug 2017 19:20:17 +0000
Subject: [Numpy-discussion] Interface numpy arrays to Matlab?
Message-ID:

I've searched but haven't found any decent answer. I need to call Matlab
from python. Matlab has a python module for this purpose, but it doesn't
understand numpy AFAICT. What solutions are there for efficiently
interfacing numpy arrays to Matlab?

Thanks,
Neal

From shoyer at gmail.com  Mon Aug 28 16:21:41 2017
From: shoyer at gmail.com (Stephan Hoyer)
Date: Mon, 28 Aug 2017 13:21:41 -0700
Subject: [Numpy-discussion] Interface numpy arrays to Matlab?
In-Reply-To:
References:
Message-ID:

If you can use Octave instead of Matlab, I've had a very good experience
with Oct2Py:

https://github.com/blink1073/oct2py

From perimosocordiae at gmail.com  Mon Aug 28 16:29:25 2017
From: perimosocordiae at gmail.com (CJ Carey)
Date: Mon, 28 Aug 2017 16:29:25 -0400
Subject: [Numpy-discussion] Interface numpy arrays to Matlab?
In-Reply-To:
References:
Message-ID:

Looks like Transplant can handle this use-case.

Blog post: http://bastibe.de/2015-11-03-matlab-engine-performance.html
GitHub link: https://github.com/bastibe/transplant

I haven't given it a try myself, but it looks promising.

From grlee77 at gmail.com  Mon Aug 28 17:27:00 2017
From: grlee77 at gmail.com (Gregory Lee)
Date: Mon, 28 Aug 2017 17:27:00 -0400
Subject: [Numpy-discussion] Interface numpy arrays to Matlab?
In-Reply-To:
References:
Message-ID:

I have not used Transplant, but it sounds fairly similar to
Python-matlab-bridge. We currently optionally call Matlab via
Python-matlab-bridge in some of the tests for the PyWavelets package.

https://arokem.github.io/python-matlab-bridge/
https://github.com/arokem/python-matlab-bridge

I would be interested in hearing about the benefits/drawbacks relative to
Transplant if there is anyone who has used both.
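For a sense of what the Oct2Py route mentioned above looks like in
practice, a minimal sketch along the lines of the oct2py documentation
(the convenience `octave` instance dispatches attribute calls to Octave
functions and converts numpy arrays in both directions; the particular
session below is an illustration, not something from this thread):

import numpy as np
from oct2py import octave

x = np.arange(12.0).reshape(3, 4)

# Attribute access runs the corresponding Octave function; numpy arrays
# are converted on the way in and on the way out.
col_sums = octave.sum(x)    # Octave sum() works down the columns
z = octave.zeros(3, 3)      # comes back as a (3, 3) numpy array

print(col_sums, z.shape)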
From ndbecker2 at gmail.com  Tue Aug 29 07:08:57 2017
From: ndbecker2 at gmail.com (Neal Becker)
Date: Tue, 29 Aug 2017 11:08:57 +0000
Subject: [Numpy-discussion] Interface numpy arrays to Matlab?
In-Reply-To:
References:
Message-ID:

Transplant sounds interesting, I think I could use this. I don't
understand, though, why nobody has used a more direct approach. Matlab
has their python API
https://www.mathworks.com/help/matlab/matlab-engine-for-python.html.
This will pass Matlab arrays to/from python as some kind of opaque blob.
I would guess that inside every Matlab array is a numpy array crying to
be freed -- in both cases an array is a block of memory together with
shape and stride information. So I would hope a direct conversion could
be done, at least via the C API if not directly with the python numpy
API. But it seems nobody has done this, so maybe it's not that simple?
From deak.andris at gmail.com  Tue Aug 29 07:44:36 2017
From: deak.andris at gmail.com (Andras Deak)
Date: Tue, 29 Aug 2017 13:44:36 +0200
Subject: [Numpy-discussion] Interface numpy arrays to Matlab?
In-Reply-To:
References:
Message-ID:

On Tue, Aug 29, 2017 at 1:08 PM, Neal Becker wrote:
> [...]
> [I] would guess that inside every Matlab array is a numpy array crying
> to be freed -- in both cases an array is a block of memory together
> with shape and stride information. So I would hope a direct conversion
> could be done, at least via the C API if not directly with the python
> numpy API. But it seems nobody has done this, so maybe it's not that
> simple?

I was going to suggest this Stack Overflow post earlier but figured that
you must have found it already:

https://stackoverflow.com/questions/34155829/how-to-efficiently-convert-matlab-engine-arrays-to-numpy-ndarray

Based on that it seems that at least arrays returned from the MATLAB
engine can be reasonably converted using their underlying data (`_data`
attribute, together with the `size` attribute to unravel multidimensional
arrays). The other way around (i.e. passing numpy arrays to the MATLAB
engine) seems less straightforward: all I could find was

https://www.mathworks.com/matlabcentral/answers/216498-passing-numpy-ndarray-from-python-to-matlab

The comments there suggest that you can instantiate `matlab.double`
objects from lists that you can pass to the MATLAB engine. Explicitly
converting your arrays to lists along this step doesn't sound too good
to me.

Disclaimer: I haven't tried either method.

Regards,

András Deák
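Putting the two halves of that description together, the round trip would
look roughly like the sketch below. To be clear about what is and is not
official: `matlab.double` and the engine calls are MathWorks' documented
Python API, while the `_data` and `size` attributes are the undocumented
internals the Stack Overflow post relies on -- so this is untested and
fragile by construction:

import numpy as np
import matlab
import matlab.engine

eng = matlab.engine.start_matlab()

# numpy -> MATLAB: the documented (and slow) path goes through nested
# Python lists.
a = np.arange(6.0).reshape(2, 3)
ma = matlab.double(a.tolist())

mb = eng.transpose(ma)

# MATLAB -> numpy: wrap the flat internal buffer and restore the shape;
# MATLAB stores arrays in column-major (Fortran) order.
b = np.array(mb._data).reshape(mb.size, order='F')

print(b.shape)  # (3, 2)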
From chris.barker at noaa.gov  Tue Aug 29 14:52:47 2017
From: chris.barker at noaa.gov (Chris Barker)
Date: Tue, 29 Aug 2017 11:52:47 -0700
Subject: [Numpy-discussion] Interface numpy arrays to Matlab?
In-Reply-To:
References:
Message-ID:

On Tue, Aug 29, 2017 at 4:08 AM, Neal Becker wrote:

> Transplant sounds interesting, I think I could use this. I don't
> understand, though, why nobody has used a more direct approach. Matlab
> has their python API
> https://www.mathworks.com/help/matlab/matlab-engine-for-python.html.
> This will pass Matlab arrays to/from python as some kind of opaque
> blob. I would guess that inside every Matlab array is a numpy array
> crying to be freed -- in both cases an array is a block of memory
> together with shape and stride information.

I agree -- it is absolutely bizarre that they haven't built in a numpy
array <-> Matlab array mapping! Maybe they don't want Matlab users to
realize that numpy provides most of what MATLAB does (but better :-) ) --
and want people to use Python with MATLAB for other pythonic stuff that
MATLAB doesn't do well....

but they do provide a mapping for array.array:

https://www.mathworks.com/help/matlab/matlab_external/use-python-array-array-types.html

which is a buffer you can wrap a numpy array around efficiently.... odd
that you'd have to write that code.

-CHB

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov
From Catherine.M.Moroney at jpl.nasa.gov  Tue Aug 29 21:03:55 2017
From: Catherine.M.Moroney at jpl.nasa.gov (Moroney, Catherine M (398E))
Date: Wed, 30 Aug 2017 01:03:55 +0000
Subject: [Numpy-discussion] selecting all 2-d slices out of n-dimensional
 array
Message-ID: <89D45F18-79B8-454C-9A3E-8417AED831F2@jpl.nasa.gov>

Hello,

I have an n-dimensional array (say (4,4,2,2)) and I wish to automatically
extract all the (4,4) slices in it. i.e.

a = numpy.arange(0, 64).reshape(4,4,2,2)

slice1 = a[..., 0, 0]
slice2 = a[..., 0, 1]
slice3 = a[..., 1, 0]
slice4 = a[..., 1, 1]

Simple enough example, but in my case array "a" will have unknown rank
and size. All I know is that it will have more than 2 dimensions, but I
don't know ahead of time how many dimensions or what the size of those
dimensions are.

What is the best way of tackling this problem without writing a whole
bunch of if-then cases depending on what the rank and shape of a is? Is
there a one-size-fits-all solution?

I'm using python 2.7 and numpy 1.8.2

Thanks for any advice,

Catherine
From robert.kern at gmail.com  Tue Aug 29 21:47:21 2017
From: robert.kern at gmail.com (Robert Kern)
Date: Tue, 29 Aug 2017 18:47:21 -0700
Subject: [Numpy-discussion] selecting all 2-d slices out of n-dimensional
 array
In-Reply-To: <89D45F18-79B8-454C-9A3E-8417AED831F2@jpl.nasa.gov>
References: <89D45F18-79B8-454C-9A3E-8417AED831F2@jpl.nasa.gov>
Message-ID:

On Tue, Aug 29, 2017 at 6:03 PM, Moroney, Catherine M (398E) wrote:

> I have an n-dimensional array (say (4,4,2,2)) and I wish to
> automatically extract all the (4,4) slices in it.
> [...]
> What is the best way of tackling this problem without writing a whole
> bunch of if-then cases depending on what the rank and shape of a is?
> Is there a one-size-fits-all solution?

First, reshape the array to (4, 4, -1). The -1 tells the method to choose
whatever's needed to get the size to work out. Then roll the last axis to
the front, and then you have a sequence of the (4, 4) arrays that you
wanted.

E.g. (using (4,4,3,3) as the original shape for clarity)

[~]
|26> a = numpy.arange(0, 4*4*3*3).reshape(4,4,3,3)

[~]
|27> b = a.reshape([4, 4, -1])

[~]
|28> b.shape
(4, 4, 9)

[~]
|29> c = np.rollaxis(b, -1, 0)

[~]
|30> c.shape
(9, 4, 4)

[~]
|31> c[0]
array([[  0,   9,  18,  27],
       [ 36,  45,  54,  63],
       [ 72,  81,  90,  99],
       [108, 117, 126, 135]])

[~]
|32> c[1]
array([[  1,  10,  19,  28],
       [ 37,  46,  55,  64],
       [ 73,  82,  91, 100],
       [109, 118, 127, 136]])

--
Robert Kern
From jladasky at itu.edu  Tue Aug 29 22:13:08 2017
From: jladasky at itu.edu (John Ladasky)
Date: Tue, 29 Aug 2017 19:13:08 -0700
Subject: [Numpy-discussion] selecting all 2-d slices out of n-dimensional
 array
In-Reply-To:
References: <89D45F18-79B8-454C-9A3E-8417AED831F2@jpl.nasa.gov>
Message-ID:

Nice solution, Robert. My solution was not idiomatic Numpy, but it was
idiomatic Python:

def slice2d(arr):
    xmax, ymax = arr.shape[-2:]
    return (arr[..., x, y] for x in range(xmax) for y in range(ymax))

On Tue, Aug 29, 2017 at 6:47 PM, Robert Kern wrote:

> First, reshape the array to (4, 4, -1). The -1 tells the method to
> choose whatever's needed to get the size to work out. Then roll the
> last axis to the front, and then you have a sequence of the (4, 4)
> arrays that you wanted.

--
John J. Ladasky Jr., Ph.D.
Research Scientist
International Technological University
2711 N. First St, San Jose, CA 95134 USA
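For the thread's general case -- unknown rank, with the first two axes
kept -- Robert's recipe generalizes to a small helper like the sketch
below (np.rollaxis rather than np.moveaxis, since Catherine's numpy 1.8.2
predates moveaxis; the function name is made up for illustration):

import numpy as np

def first_axes_2d_slices(a):
    """Return the 2-d slices spanned by the first two axes, i.e.
    a[:, :, i, j, ...] for every combination of trailing indices,
    generalizing Robert's (4, 4, -1) reshape to any rank >= 2."""
    if a.ndim < 2:
        raise ValueError("need an array with at least 2 dimensions")
    # Collapse all trailing axes into one, then bring that combined
    # axis to the front so that the result's first index walks over
    # the 2-d slices.
    b = a.reshape(a.shape[0], a.shape[1], -1)
    return np.rollaxis(b, -1, 0)

a = np.arange(4 * 4 * 2 * 2).reshape(4, 4, 2, 2)
for s in first_axes_2d_slices(a):
    assert s.shape == (4, 4)   # four slices, matching a[..., i, j]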