From chunwei.yuan at gmail.com  Thu Aug  3 13:00:03 2017
From: chunwei.yuan at gmail.com (Chun-Wei Yuan)
Date: Thu, 3 Aug 2017 10:00:03 -0700
Subject: [Numpy-discussion] quantile() or percentile()

Any way I can help expedite this?

On Fri, Jul 21, 2017 at 4:42 PM, Chun-Wei Yuan wrote:
> That would be great. I just used np.argsort because it was familiar to
> me. Didn't know about the C code.
>
> On Fri, Jul 21, 2017 at 3:43 PM, Joseph Fox-Rabinovitz wrote:
>> While #9211 is a good start, it is pretty inefficient in that it
>> performs an O(n log n) sort of the array. It is possible to reduce
>> the time to O(n) by using a partitioning algorithm similar to the one
>> in the C code of percentile. I will look into it as soon as I can.
>>
>> -Joe
>>
>> On Fri, Jul 21, 2017 at 5:34 PM, Chun-Wei Yuan wrote:
>>> Just to provide some context, 9213 actually spawned off of this guy:
>>>
>>> https://github.com/numpy/numpy/pull/9211
>>>
>>> which might address the weighted-inputs issue Joe brought up.
>>>
>>> C
>>>
>>> On Fri, Jul 21, 2017 at 2:21 PM, Joseph Fox-Rabinovitz wrote:
>>>> I think that there would be a very good reason to have a separate
>>>> function if we were to introduce weights to the inputs, similarly
>>>> to the way that we have mean and average. This would have some
>>>> (positive) repercussions, like making weighted histograms with the
>>>> Freedman-Diaconis binwidth estimator a possibility. I have had this
>>>> change on the back-burner for a long time, mainly because I was too
>>>> lazy to figure out how to include it in the C code. However, I will
>>>> take a closer look.
>>>>
>>>> Regards,
>>>>
>>>> -Joe
>>>>
>>>> On Fri, Jul 21, 2017 at 5:11 PM, Chun-Wei Yuan wrote:
>>>>> There's an ongoing effort to introduce quantile() into numpy.
>>>>> You'd use it just like percentile(), but would input your q value
>>>>> in probability space (0.5 for 50%):
>>>>>
>>>>> https://github.com/numpy/numpy/pull/9213
>>>>>
>>>>> Since there's a great deal of overlap between these two functions,
>>>>> we'd like to solicit opinions on how to move forward on this.
>>>>>
>>>>> The current thinking is to tolerate the redundancy and keep both,
>>>>> using one as the engine for the other. I'm partial to having
>>>>> quantile because 1.) I prefer probability space, and 2.) I have a
>>>>> PR waiting on quantile().
>>>>>
>>>>> Best,
>>>>>
>>>>> C
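To make the two call styles concrete, a minimal sketch; `np.quantile` here
is the function proposed in PR 9213 and remains hypothetical until that PR
lands:

    import numpy as np

    x = np.arange(1, 101, dtype=float)

    # existing API: q is given in percent
    p25 = np.percentile(x, 25)

    # proposed API (PR 9213): the same q, in probability space
    # q25 = np.quantile(x, 0.25)   # hypothetical until the PR is merged
    # assert q25 == p25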
From jfoxrabinovitz at gmail.com  Thu Aug  3 14:10:20 2017
From: jfoxrabinovitz at gmail.com (Joseph Fox-Rabinovitz)
Date: Thu, 3 Aug 2017 14:10:20 -0400
Subject: [Numpy-discussion] quantile() or percentile()

Not that I know of. The algorithm is very simple, requiring a
relatively small addition to the current introselect algorithm used
for `np.partition`. My biggest hurdle is figuring out how the calling
machinery really works, so that I can figure out which input-type
permutations I need to generate and how to get the right backend
running for a given function call.

-Joe

On Thu, Aug 3, 2017 at 1:00 PM, Chun-Wei Yuan wrote:
> Any way I can help expedite this?
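For reference, the O(n) selection Joe mentions is what `np.partition`
(introselect) already provides for a single order statistic; a quick
sketch:

    import numpy as np

    x = np.random.rand(10001)
    k = (len(x) - 1) // 4      # index of the lower-quartile order statistic

    # introselect-based partition: O(n) on average, no full O(n log n) sort
    kth_smallest = np.partition(x, k)[k]

    # agrees with the fully sorted result (interpolation conventions aside)
    assert kth_smallest == np.sort(x)[k]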
From chunwei.yuan at gmail.com  Thu Aug  3 17:36:45 2017
From: chunwei.yuan at gmail.com (Chun-Wei Yuan)
Date: Thu, 3 Aug 2017 14:36:45 -0700
Subject: [Numpy-discussion] quantile() or percentile()

Cool. Just as a heads up: for my algorithm to work, I actually need the
indices, which is why argsort() is so important to me. I use it to get
both the ap_sorted and ws_sorted variables. If your weighted-quantile
algo is faster and doesn't require those indices, please by all means
change my implementation. Thanks.

On Thu, Aug 3, 2017 at 11:10 AM, Joseph Fox-Rabinovitz wrote:
> Not that I know of. The algorithm is very simple, requiring a
> relatively small addition to the current introselect algorithm used
> for `np.partition`.
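A rough sketch of the argsort-based weighted quantile being described; the
`ap_sorted` and `ws_sorted` names follow the PR, but the interpolation
convention below is illustrative rather than the PR's exact code:

    import numpy as np

    def weighted_quantile(a, q, weights):
        a = np.asarray(a, dtype=float)
        w = np.asarray(weights, dtype=float)
        sorter = np.argsort(a)     # the indices are the crucial piece
        ap_sorted = a[sorter]      # data points in ascending order
        ws_sorted = w[sorter]      # weights carried along by the same indices
        cw = np.cumsum(ws_sorted)
        # place each sample at a position in probability space
        pos = (cw - 0.5 * ws_sorted) / cw[-1]
        return np.interp(q, pos, ap_sorted)

    # with equal weights this agrees closely with the unweighted quantile
    x = np.random.rand(1001)
    print(weighted_quantile(x, 0.5, np.ones_like(x)), np.percentile(x, 50))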
From jfoxrabinovitz at gmail.com  Fri Aug  4 14:09:23 2017
From: jfoxrabinovitz at gmail.com (Joseph Fox-Rabinovitz)
Date: Fri, 4 Aug 2017 14:09:23 -0400
Subject: [Numpy-discussion] quantile() or percentile()

I will go over your PR carefully to make sure we can agree on a
matching API. After that, we can swap the backend out whenever I get
around to it. Thanks for working on this.

-Joe

On Thu, Aug 3, 2017 at 5:36 PM, Chun-Wei Yuan wrote:
> Cool. Just as a heads up: for my algorithm to work, I actually need
> the indices, which is why argsort() is so important to me.
From laytonjb at att.net  Fri Aug  4 15:24:28 2017
From: laytonjb at att.net (Jeff Layton)
Date: Fri, 4 Aug 2017 15:24:28 -0400
Subject: [Numpy-discussion] F2PY problems with PGI compilers

Good afternoon!

I'm trying to build a Python module using F2PY on a simple Fortran code
using the PGI 17.4 community compilers.

I'm using Conda 4.3.21 with Python 2.7.13 and F2PY 2. The command line
I'm using is:

    f2py --compiler=pg --fcompiler=pg -c -m mdevice mdevice.f90

The output from f2py is at the end of the email. Any suggestions are
greatly appreciated.

Thanks!

Jeff

Output from f2py:

    running build
    running config_cc
    unifing config_cc, config, build_clib, build_ext, build commands --compiler options
    running config_fc
    unifing config_fc, config, build_clib, build_ext, build commands --fcompiler options
    running build_src
    build_src
    building extension "mdevice" sources
    f2py options: []
    f2py:> /tmp/tmptN1fdp/src.linux-x86_64-2.7/mdevicemodule.c
    creating /tmp/tmptN1fdp/src.linux-x86_64-2.7
    Reading fortran codes...
        Reading file 'mdevice.f90' (format:free)
    Post-processing...
        Block: mdevice
            Block: devicequery
    In: :mdevice:mdevice.f90:devicequery
    get_useparameters: no module cudafor info used by devicequery
    Post-processing (stage 2)...
    Building modules...
        Building module "mdevice"...
            Constructing wrapper function "devicequery"...
              devicequery()
        Wrote C/API module "mdevice" to file "/tmp/tmptN1fdp/src.linux-x86_64-2.7/mdevicemodule.c"
    adding '/tmp/tmptN1fdp/src.linux-x86_64-2.7/fortranobject.c' to sources.
    adding '/tmp/tmptN1fdp/src.linux-x86_64-2.7' to include_dirs.
    copying /home/laytonjb/anaconda2/lib/python2.7/site-packages/numpy/f2py/src/fortranobject.c -> /tmp/tmptN1fdp/src.linux-x86_64-2.7
    copying /home/laytonjb/anaconda2/lib/python2.7/site-packages/numpy/f2py/src/fortranobject.h -> /tmp/tmptN1fdp/src.linux-x86_64-2.7
    build_src: building npy-pkg config files
    running build_ext
    error: don't know how to compile C/C++ code on platform 'posix' with 'pg' compiler

From jfoxrabinovitz at gmail.com  Fri Aug  4 18:31:52 2017
From: jfoxrabinovitz at gmail.com (Joseph Fox-Rabinovitz)
Date: Fri, 4 Aug 2017 18:31:52 -0400
Subject: [Numpy-discussion] ENH: Proposal to add np.neighborwise in PR#9514

I would like to propose the addition of a new function, `np.neighborwise`,
in PR#9514. It is based on the discussion relating to my proposal for
`np.ratio` (PR#9481) and Eric Wieser's `np.neighborwise` in PR#9428. This
function accepts an array `a` and a vectorized function of two arguments
`func`, and applies the function to all of the neighboring elements of the
array across multiple dimensions. There are options for masking out parts
of the calculation and for applying the function recursively.

The name of the function is not written in stone. The current name is
taken directly from PR#9428 because I cannot think of a better one.

This function can serve as a backend for the existing `np.diff`, which has
been re-implemented in this PR, as well as for the `ratio` function I
proposed earlier. This adds the diagonal-diffs feature, which is tested
and backwards compatible. `ratio` can be implemented very simply with or
without a mask. With a mask, it can be expressed as
`np.neighborwise(a, np.*_divide, axis=axis, n=n, mask=lambda *args: args[1])`
(the conversion to bool is done automatically).

The one potentially non-backwards-compatible API change that this PR
introduces is that `np.diff` now returns an `ndarray` version of the
input, instead of the original array itself, if `n==0`. Previously, the
exact input reference was returned for `n==0`. I very seriously doubt
that this feature was ever used outside the numpy test suite anyway. The
advantage of this change is that an invalid axis input can now be caught
before returning the unaltered array. If this change is considered too
drastic, I can remove it without removing the axis check.

The two main differences between this PR and PR#9428 are the addition of
masks to the computation and the interpretation of multiple axes. PR#9428
applies `func` successively along each axis, which provides no way of
doing diagonal diffs. I chose to shift along all the axes simultaneously
before applying `func`. To clarify with an example: if we take
`a=[[1, 2], [3, 4]]`, `axis=[0, 1]` and `func=np.subtract`, PR#9428 would
take two diffs, `(4 - 2) - (3 - 1) = 0`, while the version I propose here
just takes the diagonal diff `4 - 1 = 3`. Besides being more intuitive in
my opinion, taking diagonal diffs actually adds a new feature that cannot
be obtained directly by taking successive diffs.

Please let me know your thoughts.

Regards,

-Joe
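Since `np.neighborwise` exists only in the PR, the example above can be
reproduced with plain slicing; this sketch contrasts the two
interpretations of multiple axes:

    import numpy as np

    a = np.array([[1, 2], [3, 4]])

    # this PR: shift along both axes simultaneously, then apply func
    diag_diff = np.subtract(a[1:, 1:], a[:-1, :-1])
    print(diag_diff)    # [[3]], i.e. 4 - 1

    # PR#9428: apply func successively along each axis
    succ_diff = np.diff(np.diff(a, axis=0), axis=1)
    print(succ_diff)    # [[0]], i.e. (4 - 2) - (3 - 1)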
From ben.v.root at gmail.com  Fri Aug  4 21:44:18 2017
From: ben.v.root at gmail.com (Benjamin Root)
Date: Fri, 4 Aug 2017 21:44:18 -0400
Subject: [Numpy-discussion] ENH: Proposal to add np.neighborwise in PR#9514

So, this is a kernel mechanism?

On Fri, Aug 4, 2017 at 6:31 PM, Joseph Fox-Rabinovitz wrote:
> I would like to propose the addition of a new function,
> `np.neighborwise`, in PR#9514.

From tim at cerazone.net  Fri Aug  4 22:54:13 2017
From: tim at cerazone.net (Tim Cera)
Date: Sat, 05 Aug 2017 02:54:13 +0000
Subject: [Numpy-discussion] ENH: Proposal to add np.neighborwise in PR#9514

As noted in https://github.com/numpy/numpy/pull/303, a large part of this
functionality has been implemented before for numpy and didn't go
anywhere, because it is already present in scipy.ndimage.

IMHO it is better suited in numpy, with a better name so that people
don't miss it.

Kindest regards,
Tim

On Fri, Aug 4, 2017 at 6:33 PM Joseph Fox-Rabinovitz wrote:
> I would like to propose the addition of a new function,
> `np.neighborwise`, in PR#9514.
From stefanv at berkeley.edu  Sat Aug  5 19:55:08 2017
From: stefanv at berkeley.edu (Stefan van der Walt)
Date: Sat, 05 Aug 2017 16:55:08 -0700
Subject: [Numpy-discussion] ENH: Proposal to add np.neighborwise in PR#9514

On Fri, Aug 4, 2017, at 19:54, Tim Cera wrote:
> As noted in https://github.com/numpy/numpy/pull/303, a large part of
> this functionality has been implemented before for numpy and didn't go
> anywhere, because it is already present in scipy.ndimage.
>
> IMHO it is better suited in numpy, with a better name so that people
> don't miss it.

Is this essentially `scipy.ndimage.generic_filter`?

Stéfan
From tim at cerazone.net  Sun Aug  6 00:01:32 2017
From: tim at cerazone.net (Tim Cera)
Date: Sun, 06 Aug 2017 04:01:32 +0000
Subject: [Numpy-discussion] ENH: Proposal to add np.neighborwise in PR#9514

If you're into reading ancient history, here is the link to the discussion
where Zachary Pincus makes the same observation. My response was to close
the PR because I could use scipy.ndimage.generic_filter, even though, at
least through my eyes, my implementation was nicer.

http://numpy-discussion.10968.n7.nabble.com/Fwd-numpy-ENH-Initial-implementation-of-a-neighbor-calculation-303-td27508.html

On Sat, Aug 5, 2017 at 7:56 PM Stefan van der Walt wrote:
> Is this essentially `scipy.ndimage.generic_filter`?
>
> Stéfan

From jni.soma at gmail.com  Sun Aug  6 08:37:43 2017
From: jni.soma at gmail.com (Juan Nunez-Iglesias)
Date: Sun, 6 Aug 2017 14:37:43 +0200
Subject: [Numpy-discussion] ENH: Proposal to add np.neighborwise in PR#9514

It's nice that this is pure Python / NumPy vectorized, whereas
generic_filter requires some compilation to get good performance. (Tim,
although your implementation is nice and readable, it would have been
very slow for any significant volumes.)

However, my feeling is that this function is too specialized for a
foundational package like NumPy. As Sebastian Berg pointed out on one of
the PRs, it can cause confusion when there are many ways of achieving the
same outcome. IMHO, the One Way to do this kind of operation is using
generic_filter together with LowLevelCallable. My two blog posts on the
topic:

https://ilovesymposia.com/2017/03/12/scipys-new-lowlevelcallable-is-a-game-changer/
https://ilovesymposia.com/2017/03/15/prettier-lowlevelcallables-with-numba-jit-and-decorators/

This has the advantage that it's even more general. (In fact, it avoids
the repeated-applications-vs-diagonal-application argument altogether:
these are simply two different kernels.)

Perhaps ndimage lacks discoverability to other fields? But I think that
can be better solved with documentation, rather than duplicating
functionality and cluttering the NumPy API. Sorry!

Juan.

On 6 Aug 2017, 6:02 AM +0200, Tim Cera wrote:
> If you're into reading ancient history, here is the link to the
> discussion where Zachary Pincus makes the same observation.
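For concreteness, a small `generic_filter` example with a plain Python
callable; with a LowLevelCallable kernel (as in the blog posts above) the
same call runs without the Python-level overhead:

    import numpy as np
    from scipy.ndimage import generic_filter

    a = np.arange(25, dtype=float).reshape(5, 5)

    # local range (max - min) over each 3x3 neighborhood;
    # the callable receives the neighborhood as a flat 1-D array
    local_range = generic_filter(a, lambda w: w.max() - w.min(), size=3)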
From nisoli at im.ufrj.br  Mon Aug  7 17:01:33 2017
From: nisoli at im.ufrj.br (Nisoli Isaia)
Date: Mon, 7 Aug 2017 18:01:33 -0300
Subject: [Numpy-discussion] np.array, copy=False and memmap

Dear all,
I have a question about the behaviour of

    y = np.array(x, copy=False, dtype='float32')

when x is a memmap. If we check the memmap attribute of mmap

    print "mmap attribute", y._mmap

numpy tells us that y is not a memmap. But the following code snippet
crashes the python interpreter:

    # opens the memmap
    with open(filename, 'r+b') as f:
        mm = mmap.mmap(f.fileno(), 0)
        x = np.frombuffer(mm, dtype='float32')

    # builds an array from the memmap, with the option copy=False
    y = np.array(x, copy=False, dtype='float32')
    print "before", y

    # closes the file
    mm.close()
    print "after", y

In my code I use memmaps to share read-only objects when doing parallel
processing, and the behaviour of np.array, even if not consistent, is
desirable. I share scipy sparse matrices over many processes, and if
np.array would make a copy when dealing with memmaps, this would force
me to rewrite part of the sparse-matrices code.

Would it be possible, in future releases of numpy, to have np.array
check, when copy is false, whether the input is a memmap, and in that
case return a full memmap object instead of slicing it?

Best wishes
Isaia

P.S. A longer account of the issue may be found on my university blog:
http://www.im.ufrj.br/nisoli/blog/?p=131
From allanhaldane at gmail.com  Thu Aug 10 12:27:57 2017
From: allanhaldane at gmail.com (Allan Haldane)
Date: Thu, 10 Aug 2017 12:27:57 -0400
Subject: [Numpy-discussion] np.array, copy=False and memmap

On 08/07/2017 05:01 PM, Nisoli Isaia wrote:
> Would it be possible, in future releases of numpy, to have np.array
> check, when copy is false, whether the input is a memmap, and in that
> case return a full memmap object instead of slicing it?

This does appear to be a bug in numpy or mmap.

Probably the solution isn't to make mmaps a special case; rather, we
should fix a bug somewhere in the use of the PEP 3118 interface.

I've opened an issue on github for your issue:
https://github.com/numpy/numpy/issues/9537

It seems to me that the "correct" behavior may be for it to be
impossible to close the memmap while pointers to it exist; this is the
behavior for `memoryview`s of mmaps. That is, your line `mm.close()`
should raise an error: `BufferError: cannot close exported pointers
exist`.

From allanhaldane at gmail.com  Thu Aug 10 13:00:30 2017
From: allanhaldane at gmail.com (Allan Haldane)
Date: Thu, 10 Aug 2017 13:00:30 -0400
Subject: [Numpy-discussion] Changing MaskedArray.squeeze() to never return masked

On 07/18/2017 09:52 AM, Benjamin Root wrote:
> This sort of change seems very similar to the np.diag() change a few
> years ago. Are there lessons we could learn from then that we could
> apply here?
>
> Why would the returned view not be a masked array?
>
> Ben Root

I am in favor of the proposed change below.

I'd like to merge it, but before that I want to make sure I understand
your comment.

Are you referring to the proposed change to make diag return a view
instead of a copy? Note that this has not actually happened yet:
https://github.com/numpy/numpy/issues/7661

Also, I think this case is different because it does not change core
numpy; rather, this is to make the MaskedArray module act more
consistently with core numpy. Because of that, I think it is much less
problematic than the diag changes.

Cheers,
Allan

> On Tue, Jul 18, 2017 at 9:37 AM, Eric Wieser wrote:
>> When using ndarray.squeeze, a view is returned, which means you can
>> do the following (somewhat contrived) operation:
>>
>>     >>> def fill_contrived(a):
>>     ...     a.squeeze()[...] = 2
>>     ...     return a
>>     >>> fill_contrived(np.array([1]))
>>     array(2)
>>
>> However, when tried with a masked array, this can fail, breaking
>> Liskov substitution:
>>
>>     >>> fill_contrived(np.ma.array([1], mask=[True]))
>>     MaskError: Cannot alter the masked element.
>>
>> This fails because squeeze breaks the contract of returning a view,
>> instead deciding sometimes to return masked.
>> There is a patch that fixes this in gh-9432 - however, by necessity,
>> it breaks any existing code that uses `m_arr.squeeze() is
>> np.ma.masked`.
>>
>> Is this too breaking a change?
>>
>> Eric

From ben.v.root at gmail.com  Thu Aug 10 13:21:17 2017
From: ben.v.root at gmail.com (Benjamin Root)
Date: Thu, 10 Aug 2017 13:21:17 -0400
Subject: [Numpy-discussion] Changing MaskedArray.squeeze() to never return masked

Yes, that is the change I am thinking of. And yes, it hasn't happened
yet. But it has been set to warn for a few years now, and there was a
lot of controversy over it when it was first proposed. That said, I do
think the way it was handled made sense, and it is a good model to
follow for these types of changes.

For all intents and purposes, MaskedArray is "core numpy" for many
users. Yes, it has its quirks, but it has been very stable for many
years, and users have gotten used to the quirks. While I am all for
taking steps to eliminate as many quirks as possible, we need to be
mindful of such potentially disruptive changes and give users enough of
a heads-up about them.

Ben Root

On Thu, Aug 10, 2017 at 1:00 PM, Allan Haldane wrote:
> I am in favor of the proposed change below.
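As a concrete check, this is the pre-patch behavior the quoted code relies
on, per the discussion above (gh-9432 changes it):

    import numpy as np

    m_arr = np.ma.array([1], mask=[True])

    # pre-patch: squeeze() can return the np.ma.masked singleton
    # rather than a view of the array
    print(m_arr.squeeze() is np.ma.masked)   # True before the patch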
From sebastian at sipsolutions.net  Thu Aug 10 14:24:05 2017
From: sebastian at sipsolutions.net (Sebastian Berg)
Date: Thu, 10 Aug 2017 20:24:05 +0200
Subject: [Numpy-discussion] np.array, copy=False and memmap

On Thu, 2017-08-10 at 12:27 -0400, Allan Haldane wrote:
> This does appear to be a bug in numpy or mmap.

Frankly, on first sight, I do not think it is a bug in either of them.
Numpy uses a view (memmap really is just a name for a memory-map-backed
numpy array). The numpy array will hold a reference to the memory map
object in its `.base` attribute (or the base of the base, etc.).

If you close a mmap object and then keep using it, you can get segfaults
of course; I am not sure what you can do about it. Maybe python can try
to warn you when you exit the context/close a file pointer, but I
suppose: Python does memory management for you and makes doing IO
management easy, but you need to manage the IO correctly. That this
segfaults, rather than just raising an error, may be annoying, but seems
the nature of things on first sight.

- Sebastian
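Sebastian's point about `.base`, spelled out: the array keeps the buffer
object alive through a reference, but nothing stops the mmap from being
closed by hand (the filename is illustrative, and 'test' is assumed to
hold float32 data):

    import mmap
    import numpy as np

    with open('test', 'r+b') as f:
        mm = mmap.mmap(f.fileno(), 0)

    x = np.frombuffer(mm, dtype='float32')
    print(x.base)     # the mmap object, kept alive by this reference

    mm.close()        # nothing prevents this...
    # print(x[0])     # ...and touching x afterwards can now segfault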
From allanhaldane at gmail.com  Thu Aug 10 15:56:41 2017
From: allanhaldane at gmail.com (Allan Haldane)
Date: Thu, 10 Aug 2017 15:56:41 -0400
Subject: [Numpy-discussion] np.array, copy=False and memmap

On 08/10/2017 02:24 PM, Sebastian Berg wrote:
> If you close a mmap object and then keep using it, you can get
> segfaults of course; I am not sure what you can do about it.

I admit I have not had time to investigate it thoroughly, but it
appears to me that the intended design of mmap was to make it
impossible to close a mmap if there were still pointers to it. Consider
the following behavior (python3):

    >>> import mmap
    >>> with open('test', 'r+b') as f:
    ...     mm = mmap.mmap(f.fileno(), 0)
    >>> mv = memoryview(mm)
    >>> mm.close()
    BufferError: cannot close exported pointers exist

If memoryview behaves this way, why doesn't/can't ndarray? (Both use
the PEP 3118 interface, as far as I understand.)

You can see in the mmap code that it tries to carefully keep track of
any exported buffers, but numpy manages to bypass this:
https://github.com/python/cpython/blob/b879fe82e7e5c3f7673c9a7fa4aad42bd05445d8/Modules/mmapmodule.c#L727

Allan
From allanhaldane at gmail.com  Thu Aug 10 16:06:38 2017
From: allanhaldane at gmail.com (Allan Haldane)
Date: Thu, 10 Aug 2017 16:06:38 -0400
Subject: [Numpy-discussion] np.array, copy=False and memmap

On 08/07/2017 05:01 PM, Nisoli Isaia wrote:
> In my code I use memmaps to share read-only objects when doing
> parallel processing, and the behaviour of np.array, even if not
> consistent, is desirable.

I just read your blog post, as well. To confirm your question there:
yes, if you slice or "view" a numpy array which points to memmapped
data, then the slice or view will also point to memmapped data and will
not make a copy. This way you avoid using up a lot of memory.

It is also important to realize that `np.memmap` is merely a subclass
of `np.ndarray` which just provides a few extra helper methods that
ndarrays don't have, but is otherwise identical. The most important
difference is that `np.memmap` has a `flush` method. (It also has a
`_mmap` private attribute.) But otherwise, both ndarrays and memmaps
have an internal data pointer pointing to the underlying data, and
slices or views of ndarrays (or memmaps) will point to the same memory
(no copies).

In your code, when you do `y = np.array(x, copy=False)` where x is a
np.memmap object, y will point to the same memory locations as x.
However, y will not be a memmap object, because of how you constructed
it, and so will not have the `flush` method, which can be important if
you are writing to y and expect it to be written to disk. If you are
only reading from y, though, this shouldn't matter.

Also, note that an np.memmap object is different from an mmap.mmap
object: the former uses the latter internally.

Allan
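A short sketch of that distinction, assuming a pre-existing 400-byte file
named 'test' (the filename and shape are illustrative):

    import numpy as np

    mm = np.memmap('test', dtype='float32', mode='r+', shape=(100,))
    view = mm[10:20]                  # still file-backed, no copy made
    arr = np.array(mm, copy=False)    # plain ndarray sharing the same memory

    arr[0] = 2.0                      # writes through to the mapped memory...
    mm.flush()                        # ...but only the np.memmap has flush()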
From wieser.eric+numpy at gmail.com  Thu Aug 10 17:08:31 2017
From: wieser.eric+numpy at gmail.com (Eric Wieser)
Date: Thu, 10 Aug 2017 21:08:31 +0000
Subject: [Numpy-discussion] quantile() or percentile()

Let's try and keep this on topic - most replies to this message have
been about #9211, which is an orthogonal issue.

There are two main questions here:

1. Would the community prefer to use np.quantile(x, 0.25) instead of
   np.percentile(x, 25), if they had the choice?
2. Is this desirable enough to justify increasing the API surface?

The general consensus on the github issue answers yes to 1, but is
neutral on 2. It would be good to get more opinions.

Eric

On Fri, 21 Jul 2017 at 16:12 Chun-Wei Yuan wrote:
> There's an ongoing effort to introduce quantile() into numpy. You'd
> use it just like percentile(), but would input your q value in
> probability space (0.5 for 50%):
>
> https://github.com/numpy/numpy/pull/9213

From jni.soma at gmail.com  Thu Aug 10 18:08:09 2017
From: jni.soma at gmail.com (Juan Nunez-Iglesias)
Date: Fri, 11 Aug 2017 00:08:09 +0200
Subject: [Numpy-discussion] quantile() or percentile()

I concur with the consensus.

On 10 Aug 2017, 11:10 PM +0200, Eric Wieser wrote:
> There are two main questions here:
>
> 1. Would the community prefer to use np.quantile(x, 0.25) instead of
>    np.percentile(x, 25), if they had the choice?
> 2. Is this desirable enough to justify increasing the API surface?
I'm partial to having quantile because 1.) I prefer probability space, and 2.) I have a PR waiting on quantile(). > > > > Best, > > > > C > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sat Aug 12 00:34:48 2017 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sat, 12 Aug 2017 16:34:48 +1200 Subject: [Numpy-discussion] F2PY problems with PGI compilers In-Reply-To: <0017358f-06d0-b702-2c89-3b5f3c13c94e@att.net> References: <0017358f-06d0-b702-2c89-3b5f3c13c94e@att.net> Message-ID: On Sat, Aug 5, 2017 at 7:24 AM, Jeff Layton wrote: > Good afternoon! > > I'm trying to build a Python module using F2PY on a simple Fortran code > using the PGI 17.4 community compilers. > > I'm using Conda 4.3.21 with Python 2.7.13 and F2PY 2. The command line I'm > using is, > > > f2py --compiler=pg --fcompiler=pg -c -m mdevice mdevice.f90 > > > The output from f2py is at the end of the email. Any suggestions are > greatly appreciated. > --compiler=pg seems wrong, that specifies the C/C++ compiler to use not the Fortran compiler. Hence you get the error "don't know how to compile C/C++ code on platform 'posix' with 'pg' compiler". Try just leaving that off (thereby using the default C compiler you have installed, probably gcc). Ralf > Thanks! > > Jeff > > > Output from f2py: > > > > running build > running config_cc > unifing config_cc, config, build_clib, build_ext, build commands > --compiler options > running config_fc > unifing config_fc, config, build_clib, build_ext, build commands > --fcompiler options > running build_src > build_src > building extension "mdevice" sources > f2py options: [] > f2py:> /tmp/tmptN1fdp/src.linux-x86_64-2.7/mdevicemodule.c > creating /tmp/tmptN1fdp/src.linux-x86_64-2.7 > Reading fortran codes... > Reading file 'mdevice.f90' (format:free) > Post-processing... > Block: mdevice > Block: devicequery > In: :mdevice:mdevice.f90:devicequery > get_useparameters: no module cudafor info used by devicequery > Post-processing (stage 2)... > Building modules... > Building module "mdevice"... > Constructing wrapper function "devicequery"... > devicequery() > Wrote C/API module "mdevice" to file "/tmp/tmptN1fdp/src.linux-x86_ > 64-2.7/mdevicemodule.c" > adding '/tmp/tmptN1fdp/src.linux-x86_64-2.7/fortranobject.c' to sources. > adding '/tmp/tmptN1fdp/src.linux-x86_64-2.7' to include_dirs. > copying /home/laytonjb/anaconda2/lib/python2.7/site-packages/numpy/f2py/src/fortranobject.c > -> /tmp/tmptN1fdp/src.linux-x86_64-2.7 > copying /home/laytonjb/anaconda2/lib/python2.7/site-packages/numpy/f2py/src/fortranobject.h > -> /tmp/tmptN1fdp/src.linux-x86_64-2.7 > build_src: building npy-pkg config files > running build_ext > error: don't know how to compile C/C++ code on platform 'posix' with 'pg' > compiler > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From charlesr.harris at gmail.com Sun Aug 13 09:28:52 2017 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 13 Aug 2017 07:28:52 -0600 Subject: [Numpy-discussion] quantile() or percentile() In-Reply-To: References: Message-ID: On Thu, Aug 10, 2017 at 3:08 PM, Eric Wieser wrote: > Let?s try and keep this on topic - most replies to this message has been > about #9211, which is an orthogonal issue. > > There are two main questions here: > > 1. Would the community prefer to use np.quantile(x, 0.25) instead of np.percentile(x, > 25), if they had the choice > 2. Is this desirable enough to justify increasing the API surface? > > The general consensus on the github issue answers yes to 1, but is neutral > on 2. It would be good to get more opinions. > I think a quantile function would be natural and desirable. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Sun Aug 13 12:50:10 2017 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sun, 13 Aug 2017 12:50:10 -0400 Subject: [Numpy-discussion] quantile() or percentile() In-Reply-To: References: Message-ID: On Sun, Aug 13, 2017 at 9:28 AM, Charles R Harris wrote: > > > On Thu, Aug 10, 2017 at 3:08 PM, Eric Wieser > wrote: > >> Let?s try and keep this on topic - most replies to this message has been >> about #9211, which is an orthogonal issue. >> >> There are two main questions here: >> >> 1. Would the community prefer to use np.quantile(x, 0.25) instead of np.percentile(x, >> 25), if they had the choice >> 2. Is this desirable enough to justify increasing the API surface? >> >> The general consensus on the github issue answers yes to 1, but is >> neutral on 2. It would be good to get more opinions. >> > > I think a quantile function would be natural and desirable. > I'm in favor of adding it. (moving away from +0) It should be an obvious code completion choice, np.q? Josef > > > > Chuck > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From laytonjb at att.net Sun Aug 13 15:24:12 2017 From: laytonjb at att.net (Jeff Layton) Date: Sun, 13 Aug 2017 15:24:12 -0400 Subject: [Numpy-discussion] F2PY problems with PGI compilers In-Reply-To: References: <0017358f-06d0-b702-2c89-3b5f3c13c94e@att.net> Message-ID: <99bcdeac-4872-c287-62e7-884c490c22bf@att.net> +SciPy list > > > On Sat, Aug 5, 2017 at 7:24 AM, Jeff Layton > wrote: > > Good afternoon! > > I'm trying to build a Python module using F2PY on a simple Fortran > code using the PGI 17.4 community compilers. > > I'm using Conda 4.3.21 with Python 2.7.13 and F2PY 2. The command > line I'm using is, > > > f2py --compiler=pg --fcompiler=pg -c -m mdevice mdevice.f90 > > > The output from f2py is at the end of the email. Any suggestions > are greatly appreciated. > > > --compiler=pg seems wrong, that specifies the C/C++ compiler to use > not the Fortran compiler. Hence you get the error "don't know how to > compile C/C++ code on platform 'posix' with 'pg' compiler". Try just > leaving that off (thereby using the default C compiler you have > installed, probably gcc). Ralf - thanks for the response! 
I had tried that before and F2PY still thinks it's using the PGI C compiler: running build running config_cc unifing config_cc, config, build_clib, build_ext, build commands --compiler options running config_fc unifing config_fc, config, build_clib, build_ext, build commands --fcompiler options running build_src build_src building extension "mdevice" sources f2py options: [] f2py:> /tmp/tmpkxCUbk/src.linux-x86_64-2.7/mdevicemodule.c creating /tmp/tmpkxCUbk/src.linux-x86_64-2.7 Reading fortran codes... Reading file 'mdevice.f90' (format:free) Post-processing... Block: mdevice Block: devicequery In: :mdevice:mdevice.f90:devicequery get_useparameters: no module cudafor info used by devicequery Post-processing (stage 2)... Building modules... Building module "mdevice"... Constructing wrapper function "devicequery"... devicequery() Wrote C/API module "mdevice" to file "/tmp/tmpkxCUbk/src.linux-x86_64-2.7/mdevicemodule.c" adding '/tmp/tmpkxCUbk/src.linux-x86_64-2.7/fortranobject.c' to sources. adding '/tmp/tmpkxCUbk/src.linux-x86_64-2.7' to include_dirs. copying /home/laytonjb/anaconda2/lib/python2.7/site-packages/numpy/f2py/src/fortranobject.c -> /tmp/tmpkxCUbk/src.linux-x86_64-2.7 copying /home/laytonjb/anaconda2/lib/python2.7/site-packages/numpy/f2py/src/fortranobject.h -> /tmp/tmpkxCUbk/src.linux-x86_64-2.7 build_src: building npy-pkg config files running build_ext customize UnixCCompiler customize UnixCCompiler using build_ext customize PGroupFCompiler Found executable /opt/pgi/linux86-64/pgidir/pgf90 Found executable /opt/pgi/linux86-64/pgidir/pgf77 Found executable /opt/pgi/linux86-64/17.4/bin/pgfortran customize PGroupFCompiler using build_ext building 'mdevice' extension compiling C sources C compiler: /opt/pgi/linux86-64/pgidir/pgcc -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC creating /tmp/tmpkxCUbk/tmp creating /tmp/tmpkxCUbk/tmp/tmpkxCUbk creating /tmp/tmpkxCUbk/tmp/tmpkxCUbk/src.linux-x86_64-2.7 compile options: '-I/tmp/tmpkxCUbk/src.linux-x86_64-2.7 -I/home/laytonjb/anaconda2/lib/python2.7/site-packages/numpy/core/include -I/home/laytonjb/anaconda2/include/python2.7 -c' pgcc: /tmp/tmpkxCUbk/src.linux-x86_64-2.7/mdevicemodule.c pgcc-Error-Unknown switch: -fno-strict-aliasing pgcc-Error-Unknown switch: -fwrapv pgcc-Error-Unknown switch: -Wall pgcc-Error-Unknown switch: -Wstrict-prototypes pgcc-Error-Unknown switch: -fno-strict-aliasing pgcc-Error-Unknown switch: -fwrapv pgcc-Error-Unknown switch: -Wall pgcc-Error-Unknown switch: -Wstrict-prototypes error: Command "/opt/pgi/linux86-64/pgidir/pgcc -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/tmp/tmpkxCUbk/src.linux-x86_64-2.7 -I/home/laytonjb/anaconda2/lib/python2.7/site-packages/numpy/core/include -I/home/laytonjb/anaconda2/include/python2.7 -c /tmp/tmpkxCUbk/src.linux-x86_64-2.7/mdevicemodule.c -o /tmp/tmpkxCUbk/tmp/tmpkxCUbk/src.linux-x86_64-2.7/mdevicemodule.o" failed with exit status 1 I'm definitely at a lose here. I have no idea how to make F2PY work with the PGI compilers. I'm beginning to think F2PY is completely borked unless you use the defaults (gcc). Thanks! Jeff > > > > Thanks! 
> > Jeff > > > Output from f2py: > > > > running build > running config_cc > unifing config_cc, config, build_clib, build_ext, build commands > --compiler options > running config_fc > unifing config_fc, config, build_clib, build_ext, build commands > --fcompiler options > running build_src > build_src > building extension "mdevice" sources > f2py options: [] > f2py:> /tmp/tmptN1fdp/src.linux-x86_64-2.7/mdevicemodule.c > creating /tmp/tmptN1fdp/src.linux-x86_64-2.7 > Reading fortran codes... > Reading file 'mdevice.f90' (format:free) > Post-processing... > Block: mdevice > Block: devicequery > In: :mdevice:mdevice.f90:devicequery > get_useparameters: no module cudafor info used by devicequery > Post-processing (stage 2)... > Building modules... > Building module "mdevice"... > Constructing wrapper function "devicequery"... > devicequery() > Wrote C/API module "mdevice" to file > "/tmp/tmptN1fdp/src.linux-x86_64-2.7/mdevicemodule.c" > adding '/tmp/tmptN1fdp/src.linux-x86_64-2.7/fortranobject.c' to > sources. > adding '/tmp/tmptN1fdp/src.linux-x86_64-2.7' to include_dirs. > copying > /home/laytonjb/anaconda2/lib/python2.7/site-packages/numpy/f2py/src/fortranobject.c > -> /tmp/tmptN1fdp/src.linux-x86_64-2.7 > copying > /home/laytonjb/anaconda2/lib/python2.7/site-packages/numpy/f2py/src/fortranobject.h > -> /tmp/tmptN1fdp/src.linux-x86_64-2.7 > build_src: building npy-pkg config files > running build_ext > error: don't know how to compile C/C++ code on platform 'posix' > with 'pg' compiler > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Mon Aug 14 03:51:03 2017 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Mon, 14 Aug 2017 19:51:03 +1200 Subject: [Numpy-discussion] F2PY problems with PGI compilers In-Reply-To: <99bcdeac-4872-c287-62e7-884c490c22bf@att.net> References: <0017358f-06d0-b702-2c89-3b5f3c13c94e@att.net> <99bcdeac-4872-c287-62e7-884c490c22bf@att.net> Message-ID: On Mon, Aug 14, 2017 at 7:24 AM, Jeff Layton wrote: > +SciPy list > > > > On Sat, Aug 5, 2017 at 7:24 AM, Jeff Layton wrote: > >> Good afternoon! >> >> I'm trying to build a Python module using F2PY on a simple Fortran code >> using the PGI 17.4 community compilers. >> >> I'm using Conda 4.3.21 with Python 2.7.13 and F2PY 2. The command line >> I'm using is, >> >> >> f2py --compiler=pg --fcompiler=pg -c -m mdevice mdevice.f90 >> >> >> The output from f2py is at the end of the email. Any suggestions are >> greatly appreciated. >> > > --compiler=pg seems wrong, that specifies the C/C++ compiler to use not > the Fortran compiler. Hence you get the error "don't know how to compile > C/C++ code on platform 'posix' with 'pg' compiler". Try just leaving that > off (thereby using the default C compiler you have installed, probably gcc). > > > Ralf - thanks for the response! 
I had tried that before and F2PY still > thinks it's using the PGI C compiler: > > > running build > running config_cc > unifing config_cc, config, build_clib, build_ext, build commands > --compiler options > running config_fc > unifing config_fc, config, build_clib, build_ext, build commands > --fcompiler options > running build_src > build_src > building extension "mdevice" sources > f2py options: [] > f2py:> /tmp/tmpkxCUbk/src.linux-x86_64-2.7/mdevicemodule.c > creating /tmp/tmpkxCUbk/src.linux-x86_64-2.7 > Reading fortran codes... > Reading file 'mdevice.f90' (format:free) > Post-processing... > Block: mdevice > Block: devicequery > In: :mdevice:mdevice.f90:devicequery > get_useparameters: no module cudafor info used by devicequery > Post-processing (stage 2)... > Building modules... > Building module "mdevice"... > Constructing wrapper function "devicequery"... > devicequery() > Wrote C/API module "mdevice" to file "/tmp/tmpkxCUbk/src.linux-x86_ > 64-2.7/mdevicemodule.c" > adding '/tmp/tmpkxCUbk/src.linux-x86_64-2.7/fortranobject.c' to sources. > adding '/tmp/tmpkxCUbk/src.linux-x86_64-2.7' to include_dirs. > copying /home/laytonjb/anaconda2/lib/python2.7/site-packages/numpy/f2py/src/fortranobject.c > -> /tmp/tmpkxCUbk/src.linux-x86_64-2.7 > copying /home/laytonjb/anaconda2/lib/python2.7/site-packages/numpy/f2py/src/fortranobject.h > -> /tmp/tmpkxCUbk/src.linux-x86_64-2.7 > build_src: building npy-pkg config files > running build_ext > customize UnixCCompiler > customize UnixCCompiler using build_ext > customize PGroupFCompiler > Found executable /opt/pgi/linux86-64/pgidir/pgf90 > Found executable /opt/pgi/linux86-64/pgidir/pgf77 > Found executable /opt/pgi/linux86-64/17.4/bin/pgfortran > customize PGroupFCompiler using build_ext > building 'mdevice' extension > compiling C sources > C compiler: /opt/pgi/linux86-64/pgidir/pgcc -fno-strict-aliasing -g -O2 > -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC > > creating /tmp/tmpkxCUbk/tmp > creating /tmp/tmpkxCUbk/tmp/tmpkxCUbk > creating /tmp/tmpkxCUbk/tmp/tmpkxCUbk/src.linux-x86_64-2.7 > compile options: '-I/tmp/tmpkxCUbk/src.linux-x86_64-2.7 > -I/home/laytonjb/anaconda2/lib/python2.7/site-packages/numpy/core/include > -I/home/laytonjb/anaconda2/include/python2.7 -c' > pgcc: /tmp/tmpkxCUbk/src.linux-x86_64-2.7/mdevicemodule.c > pgcc-Error-Unknown switch: -fno-strict-aliasing > pgcc-Error-Unknown switch: -fwrapv > pgcc-Error-Unknown switch: -Wall > pgcc-Error-Unknown switch: -Wstrict-prototypes > pgcc-Error-Unknown switch: -fno-strict-aliasing > pgcc-Error-Unknown switch: -fwrapv > pgcc-Error-Unknown switch: -Wall > pgcc-Error-Unknown switch: -Wstrict-prototypes > error: Command "/opt/pgi/linux86-64/pgidir/pgcc -fno-strict-aliasing -g > -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC > -I/tmp/tmpkxCUbk/src.linux-x86_64-2.7 -I/home/laytonjb/anaconda2/ > lib/python2.7/site-packages/numpy/core/include -I/home/laytonjb/anaconda2/include/python2.7 > -c /tmp/tmpkxCUbk/src.linux-x86_64-2.7/mdevicemodule.c -o > /tmp/tmpkxCUbk/tmp/tmpkxCUbk/src.linux-x86_64-2.7/mdevicemodule.o" failed > with exit status 1 > > > > I'm definitely at a lose here. I have no idea how to make F2PY work with > the PGI compilers. I'm beginning to think F2PY is completely borked unless > you use the defaults (gcc). > That's not the case. 
Here is an example when using the Intel Fortran compiler together with either MSVC or Intel C compilers: https://software.intel.com/en-us/articles/building-numpyscipy-with-intel-mkl-and-intel-fortran-on-windows I notice there that in all cases the C compiler is explicitly specified. Did you also try ``--compiler=gcc --fcompiler=pg``? Also, I'm not sure how often this is done with f2py directly; I've only ever used the --fcompiler flag via ``python setup.py config --fcompiler=..``, invoking f2py under the hood. It could be that doing this directly is indeed broken (or was never supported in the first place). Ralf > > Thanks! > > Jeff > > > > > > >> Thanks! >> >> Jeff >> >> >> Output from f2py: >> >> >> >> running build >> running config_cc >> unifing config_cc, config, build_clib, build_ext, build commands >> --compiler options >> running config_fc >> unifing config_fc, config, build_clib, build_ext, build commands >> --fcompiler options >> running build_src >> build_src >> building extension "mdevice" sources >> f2py options: [] >> f2py:> /tmp/tmptN1fdp/src.linux-x86_64-2.7/mdevicemodule.c >> creating /tmp/tmptN1fdp/src.linux-x86_64-2.7 >> Reading fortran codes... >> Reading file 'mdevice.f90' (format:free) >> Post-processing... >> Block: mdevice >> Block: devicequery >> In: :mdevice:mdevice.f90:devicequery >> get_useparameters: no module cudafor info used by devicequery >> Post-processing (stage 2)... >> Building modules... >> Building module "mdevice"... >> Constructing wrapper function "devicequery"... >> devicequery() >> Wrote C/API module "mdevice" to file >> "/tmp/tmptN1fdp/src.linux-x86_64-2.7/mdevicemodule.c" >> adding '/tmp/tmptN1fdp/src.linux-x86_64-2.7/fortranobject.c' to >> sources. >> adding '/tmp/tmptN1fdp/src.linux-x86_64-2.7' to include_dirs. >> copying /home/laytonjb/anaconda2/lib/python2.7/site-packages/numpy/f2py/src/fortranobject.c >> -> /tmp/tmptN1fdp/src.linux-x86_64-2.7 >> copying /home/laytonjb/anaconda2/lib/python2.7/site-packages/numpy/f2py/src/fortranobject.h >> -> /tmp/tmptN1fdp/src.linux-x86_64-2.7 >> build_src: building npy-pkg config files >> running build_ext >> error: don't know how to compile C/C++ code on platform 'posix' with 'pg' >> compiler >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion >> > > > > _______________________________________________ > NumPy-Discussion mailing listNumPy-Discussion at python.orghttps://mail.python.org/mailman/listinfo/numpy-discussion > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Mon Aug 14 04:05:52 2017 From: pav at iki.fi (Pauli Virtanen) Date: Mon, 14 Aug 2017 10:05:52 +0200 Subject: [Numpy-discussion] F2PY problems with PGI compilers In-Reply-To: References: <0017358f-06d0-b702-2c89-3b5f3c13c94e@att.net> <99bcdeac-4872-c287-62e7-884c490c22bf@att.net> Message-ID: Ralf Gommers kirjoitti 14.08.2017 klo 09:51: > I'm definitely at a lose here. I have no idea how to make F2PY work > with the PGI compilers. I'm beginning to think F2PY is completely > borked unless you use the defaults (gcc). > > > That's not the case. Here is an example when using the Intel Fortran > compiler together with either MSVC or Intel C compilers: > https://software.intel.com/en-us/articles/building-numpyscipy-with-intel-mkl-and-intel-fortran-on-windows Note that it is not necessary to use f2py for compiling. 
It can also just generate the C and fortran source files necessary --- although you need to also compile and link in fortranobject.[ch] which are found inside the numpy folders, and supply the correct Python and numpy include paths see numpy.get_includes(). From laytonjb at att.net Mon Aug 14 10:19:46 2017 From: laytonjb at att.net (Jeff Layton) Date: Mon, 14 Aug 2017 10:19:46 -0400 Subject: [Numpy-discussion] F2PY problems with PGI compilers In-Reply-To: References: <0017358f-06d0-b702-2c89-3b5f3c13c94e@att.net> <99bcdeac-4872-c287-62e7-884c490c22bf@att.net> Message-ID: <158a0a4f-450a-be1e-5512-e96974075f5d@att.net> On 08/14/2017 03:51 AM, Ralf Gommers wrote: > > > > > > > I'm definitely at a lose here. I have no idea how to make F2PY > work with the PGI compilers. I'm beginning to think F2PY is > completely borked unless you use the defaults (gcc). > > > That's not the case. Here is an example when using the Intel Fortran > compiler together with either MSVC or Intel C compilers: > https://software.intel.com/en-us/articles/building-numpyscipy-with-intel-mkl-and-intel-fortran-on-windows > > I notice there that in all cases the C compiler is explicitly > specified. Did you also try ``--compiler=gcc --fcompiler=pg``? > > Also, I'm not sure how often this is done with f2py directly; I've > only ever used the --fcompiler flag via ``python setup.py config > --fcompiler=..``, invoking f2py under the hood. It could be that > doing this directly is indeed broken (or was never supported in the > first place). > > Ralf > > Point taken. I don't use Windows too much and I don't use the Intel compiler any more (it's not free for non-commercial use :) ). I tried using "--compiler=gcc --fcompiler=pg" and I get the same answer at the very end. running build_ext error: don't know how to compile C/C++ code on platform 'posix' with 'gcc' compiler Good point about f2py. I'm using the Anaconda distribution of f2py and that may have limitations with respect to the PGI compiler. I may download the f2py source and build it to include PGI support. Maybe that will fix the problem. Thanks! Jeff -------------- next part -------------- An HTML attachment was scrubbed... URL: From laytonjb at att.net Mon Aug 14 10:32:57 2017 From: laytonjb at att.net (Jeff Layton) Date: Mon, 14 Aug 2017 10:32:57 -0400 Subject: [Numpy-discussion] F2PY problems with PGI compilers In-Reply-To: <2b3fbab5-0aa2-adab-3177-f956bfad3403@att.net> References: <0017358f-06d0-b702-2c89-3b5f3c13c94e@att.net> <99bcdeac-4872-c287-62e7-884c490c22bf@att.net> <2b3fbab5-0aa2-adab-3177-f956bfad3403@att.net> Message-ID: <36e5cffb-0d50-9f8b-cdfc-ad38eee00ebe@att.net> On 08/14/2017 10:27 AM, Jeff Layton wrote: > On 08/14/2017 04:05 AM, Pauli Virtanen wrote: >> Ralf Gommers kirjoitti 14.08.2017 klo 09:51: >>> I'm definitely at a lose here. I have no idea how to make F2PY >>> work >>> with the PGI compilers. I'm beginning to think F2PY is completely >>> borked unless you use the defaults (gcc). >>> >>> >>> That's not the case. Here is an example when using the Intel Fortran >>> compiler together with either MSVC or Intel C compilers: >>> https://software.intel.com/en-us/articles/building-numpyscipy-with-intel-mkl-and-intel-fortran-on-windows >>> >> Note that it is not necessary to use f2py for compiling. 
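(For reference, the setup.py route that Ralf mentions can be sketched as below; the module and file names follow this thread, and note that the include-path helper is spelled numpy.get_include() in current numpy:)

# setup.py -- a minimal sketch of the numpy.distutils route
from numpy.distutils.core import setup, Extension

ext = Extension(name='mdevice', sources=['mdevice.f90'])
setup(name='mdevice', ext_modules=[ext])

# then, as suggested above, something like:
#   python setup.py config --fcompiler=pg build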
It can also >> just generate the C and fortran source files necessary --- although you >> need to also compile and link in fortranobject.[ch] which are found >> inside the numpy folders, and supply the correct Python and numpy >> include paths see numpy.get_includes(). > > I was hoping to avoid this :) I wanted to use f2py as a "module > builder" for some code :) However, it appears I will have to go down > this path to see if I can get further. > > Thanks for the advice! > > Jeff > > > > From laytonjb at att.net Mon Aug 14 11:01:49 2017 From: laytonjb at att.net (Jeff Layton) Date: Mon, 14 Aug 2017 11:01:49 -0400 Subject: [Numpy-discussion] F2PY problems with PGI compilers In-Reply-To: <158a0a4f-450a-be1e-5512-e96974075f5d@att.net> References: <0017358f-06d0-b702-2c89-3b5f3c13c94e@att.net> <99bcdeac-4872-c287-62e7-884c490c22bf@att.net> <158a0a4f-450a-be1e-5512-e96974075f5d@att.net> Message-ID: <22965279-d221-c01f-3e22-689ea1fe3442@att.net> On 08/14/2017 10:19 AM, Jeff Layton wrote: > On 08/14/2017 03:51 AM, Ralf Gommers wrote: >> >> >> >> >> >> >> I'm definitely at a lose here. I have no idea how to make F2PY >> work with the PGI compilers. I'm beginning to think F2PY is >> completely borked unless you use the defaults (gcc). >> >> >> That's not the case. Here is an example when using the Intel Fortran >> compiler together with either MSVC or Intel C compilers: >> https://software.intel.com/en-us/articles/building-numpyscipy-with-intel-mkl-and-intel-fortran-on-windows >> >> I notice there that in all cases the C compiler is explicitly >> specified. Did you also try ``--compiler=gcc --fcompiler=pg``? >> >> Also, I'm not sure how often this is done with f2py directly; I've >> only ever used the --fcompiler flag via ``python setup.py config >> --fcompiler=..``, invoking f2py under the hood. It could be that >> doing this directly is indeed broken (or was never supported in the >> first place). >> >> Ralf >> >> > > Point taken. I don't use Windows too much and I don't use the Intel > compiler any more (it's not free for non-commercial use :) ). > > I tried using "--compiler=gcc --fcompiler=pg" and I get the same > answer at the very end. > > > running build_ext > error: don't know how to compile C/C++ code on platform 'posix' with > 'gcc' compiler > > > Good point about f2py. I'm using the Anaconda distribution of f2py and > that may have limitations with respect to the PGI compiler. I may > download the f2py source and build it to include PGI support. Maybe > that will fix the problem. > > Thanks! > > Jeff > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Mon Aug 14 13:17:21 2017 From: chris.barker at noaa.gov (Chris Barker) Date: Mon, 14 Aug 2017 10:17:21 -0700 Subject: [Numpy-discussion] quantile() or percentile() In-Reply-To: References: Message-ID: +1 on quantile() -CHB On Sun, Aug 13, 2017 at 6:28 AM, Charles R Harris wrote: > > > On Thu, Aug 10, 2017 at 3:08 PM, Eric Wieser > wrote: > >> Let?s try and keep this on topic - most replies to this message has been >> about #9211, which is an orthogonal issue. >> >> There are two main questions here: >> >> 1. Would the community prefer to use np.quantile(x, 0.25) instead of np.percentile(x, >> 25), if they had the choice >> 2. Is this desirable enough to justify increasing the API surface? >> >> The general consensus on the github issue answers yes to 1, but is >> neutral on 2. It would be good to get more opinions. 
>> > > I think a quantile function would be natural and desirable. > > > > Chuck > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > > -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Mon Aug 14 13:29:49 2017 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Tue, 15 Aug 2017 05:29:49 +1200 Subject: [Numpy-discussion] F2PY problems with PGI compilers In-Reply-To: <158a0a4f-450a-be1e-5512-e96974075f5d@att.net> References: <0017358f-06d0-b702-2c89-3b5f3c13c94e@att.net> <99bcdeac-4872-c287-62e7-884c490c22bf@att.net> <158a0a4f-450a-be1e-5512-e96974075f5d@att.net> Message-ID: On Tue, Aug 15, 2017 at 2:19 AM, Jeff Layton wrote: > On 08/14/2017 03:51 AM, Ralf Gommers wrote: > > > > > >> >> >> I'm definitely at a lose here. I have no idea how to make F2PY work with >> the PGI compilers. I'm beginning to think F2PY is completely borked unless >> you use the defaults (gcc). >> > > That's not the case. Here is an example when using the Intel Fortran > compiler together with either MSVC or Intel C compilers: > https://software.intel.com/en-us/articles/building- > numpyscipy-with-intel-mkl-and-intel-fortran-on-windows > > I notice there that in all cases the C compiler is explicitly specified. > Did you also try ``--compiler=gcc --fcompiler=pg``? > > Also, I'm not sure how often this is done with f2py directly; I've only > ever used the --fcompiler flag via ``python setup.py config > --fcompiler=..``, invoking f2py under the hood. It could be that doing > this directly is indeed broken (or was never supported in the first place). > > Ralf > > > > Point taken. I don't use Windows too much and I don't use the Intel > compiler any more (it's not free for non-commercial use :) ). > > I tried using "--compiler=gcc --fcompiler=pg" and I get the same answer at > the very end. > > > running build_ext > error: don't know how to compile C/C++ code on platform 'posix' with 'gcc' > compiler > > > Good point about f2py. I'm using the Anaconda distribution of f2py and > that may have limitations with respect to the PGI compiler. I may download > the f2py source and build it to include PGI support. Maybe that will fix > the problem. > That won't make a difference, all the build config code is pure Python. Anaconda will give you the same results as building from source. Ralf > Thanks! > > Jeff > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pavdev at gmx.de Tue Aug 15 12:26:10 2017 From: pavdev at gmx.de (Paul) Date: Tue, 15 Aug 2017 18:26:10 +0200 Subject: [Numpy-discussion] Tensor Contraction (HPTT) and Tensor Transposition (TCL) Message-ID: Hi all, I recently spent some time adding python interfaces to my tensor libraries: * Tensor Contraction Library (TCL): https://github.com/springer13/tcl * Tensor Transposition Library (HPTT): https://github.com/springer13/hptt Both libraries tend to give very significant speedups over what is currently offered by NumPY; Speedups typically range from 5x - 20x w.r.t. HPTT and >>20x for TCL (see attached, Host: 2x Intel Haswell-EP E5-2680 v3 (24 threads)). 
Thus, I was curious if some of you would benefit from those speedups and if you want it to be integrated into NumPY. The HPTT and TCL libraries are respectively similar to numpy.transpose() and numpy.einsum(). I welcome you to give the packages a try and see if they can help you to speedup some of your tensor-related operations. Finally: Which steps would be required to integrate those libraries into NumPY? Which problems do you anticipate? Thank you, Paul -------------- next part -------------- A non-text attachment was scrubbed... Name: hptt_vs_numpy.pdf Type: application/pdf Size: 14732 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: tcl_vs_numpy_vs_eigen.pdf Type: application/pdf Size: 116285 bytes Desc: not available URL: From charlesr.harris at gmail.com Tue Aug 15 14:05:34 2017 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 15 Aug 2017 12:05:34 -0600 Subject: [Numpy-discussion] Tensor Contraction (HPTT) and Tensor Transposition (TCL) In-Reply-To: References: Message-ID: On Tue, Aug 15, 2017 at 10:26 AM, Paul wrote: > Hi all, > > I recently spent some time adding python interfaces to my tensor libraries: > * Tensor Contraction Library (TCL): https://github.com/springer13/tcl > * Tensor Transposition Library (HPTT): https://github.com/springer13/ > hptt > > Both libraries tend to give very significant speedups over what is > currently offered by NumPY; Speedups > typically range from 5x - 20x w.r.t. HPTT and >>20x for TCL (see > attached, Host: 2x Intel Haswell-EP E5-2680 v3 (24 threads)). > Thus, I was curious if some of you would benefit from those speedups and > if you want it to > be integrated into NumPY. > > The HPTT and TCL libraries are respectively similar to numpy.transpose() > and numpy.einsum(). > > I welcome you to give the packages a try and see if they can help you to > speedup some of your tensor-related operations. > > Finally: Which steps would be required to integrate those libraries into > NumPY? Which problems do you anticipate? > > What version of Numpy are you comparing to? Note that in 1.13 you can enable some optimization in einsum, and the coming 1.14 makes that the default and uses CBLAS when possible. If you want to get it into Numpy, it would be worth checking if the existing functions can be improved before adding new ones. Note that Numpy transposition method just rearranges the indices, so the advantage of actual transposition is to have better cache performance or allow direct use of CBLAS. I assume TCL uses some tricks to do transposition in a way that is more cache friendly? Might check the license if your work uses code from a publication. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From pavdev at gmx.de Wed Aug 16 05:39:26 2017 From: pavdev at gmx.de (Paul Springer) Date: Wed, 16 Aug 2017 11:39:26 +0200 Subject: [Numpy-discussion] Tensor Contraction (HPTT) and Tensor Transposition (TCL) In-Reply-To: References: Message-ID: <55f86774-5831-e152-b617-c5e1b6364605@gmx.de> > What version of Numpy are you comparing to? Note that in 1.13 you can > enable some optimization in einsum, and the coming 1.14 makes that the > default and uses CBLAS when possible. I was using 1.10.4; however, I am currently running the benchmark with 1.13.1 and 'optimize=True'; this, however, seems to yield even worse performance (see attached). 
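For readers who want to reproduce the comparison, the flag in question is used as in this sketch (the optimize keyword is the 1.13 feature discussed above):

import numpy as np

a = np.random.rand(64, 64, 64)
b = np.random.rand(64, 64, 64)

c0 = np.einsum('ijk,jkl->il', a, b)                 # default evaluation
c1 = np.einsum('ijk,jkl->il', a, b, optimize=True)  # optimized contraction order
print(np.allclose(c0, c1))                          # True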
If you are interested, you can check the performance difference yourself via: ./benchmark/python/bechmark.sh > If you want to get it into Numpy, it would be worth checking if the > existing functions can be improved before adding new ones. > > Note that Numpy transposition method just rearranges the indices, so > the advantage of actual transposition is to have better cache > performance or allow direct use of CBLAS. I assume TCL uses some > tricks to do transposition in a way that is more cache friendly? HPTT is a sophisticated library for tensor transpositions, as such it blocks the tensors such that (1) spatial locality can be exploited. Moreover, (2) it uses explicit vectorization to take advantage of the CPU's vector units. TCL uses the Transpose-Transpose-GEMM-Transpose approach where all tensors are flattened into matrices (via HPTT) and then contracted via GEMM; the final result is eventually folded (via HPTT) into the desired output tensor. Would it be possible to expose HPTT and TCL as optional packages within NumPY? This way I don't have to redo the work that I've already put into those libraries. > Might check the license if your work uses code from a publication. As far as licenses are concerned that should not be a problem since I wrote to code myself and it doesn't use code from publications other than mine. Best regards, Paul -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: tcl_vs_numpy_vs_eigen.pdf Type: application/pdf Size: 111882 bytes Desc: not available URL: From shoyer at gmail.com Wed Aug 16 11:38:24 2017 From: shoyer at gmail.com (Stephan Hoyer) Date: Wed, 16 Aug 2017 08:38:24 -0700 Subject: [Numpy-discussion] Tensor Contraction (HPTT) and Tensor Transposition (TCL) In-Reply-To: <55f86774-5831-e152-b617-c5e1b6364605@gmx.de> References: <55f86774-5831-e152-b617-c5e1b6364605@gmx.de> Message-ID: On Wed, Aug 16, 2017 at 2:39 AM, Paul Springer wrote: > > What version of Numpy are you comparing to? Note that in 1.13 you can > enable some optimization in einsum, and the coming 1.14 makes that the > default and uses CBLAS when possible. > > I was using 1.10.4; however, I am currently running the benchmark with > 1.13.1 and 'optimize=True'; this, however, seems to yield even worse > performance (see attached). > If you are interested, you can check the performance difference yourself > via: ./benchmark/python/bechmark.sh > This sounds like you may be using relatively small matrices, where the overhead of calculating the optimal strategy dominates. Can you try with a few bigger test cases? -------------- next part -------------- An HTML attachment was scrubbed... URL: From peridot.faceted at gmail.com Wed Aug 16 12:08:39 2017 From: peridot.faceted at gmail.com (Anne Archibald) Date: Wed, 16 Aug 2017 16:08:39 +0000 Subject: [Numpy-discussion] Tensor Contraction (HPTT) and Tensor Transposition (TCL) In-Reply-To: <55f86774-5831-e152-b617-c5e1b6364605@gmx.de> References: <55f86774-5831-e152-b617-c5e1b6364605@gmx.de> Message-ID: (NB all: the thread title seems to interchange the acronyms for the Thread Contraction Library (TCL) and the High-Perormance Tensor Transpose (HPTT) packages. I'm not fixing it so as not to break threading.) On Wed, Aug 16, 2017 at 11:40 AM Paul Springer wrote: > If you want to get it into Numpy, it would be worth checking if the > existing functions can be improved before adding new ones. 
> > Note that Numpy transposition method just rearranges the indices, so the > advantage of actual transposition is to have better cache performance or > allow direct use of CBLAS. I assume TCL uses some tricks to do > transposition in a way that is more cache friendly? > > HPTT is a sophisticated library for tensor transpositions, as such it > blocks the tensors such that (1) spatial locality can be exploited. > Moreover, (2) it uses explicit vectorization to take advantage of the CPU's > vector units. > I think this library provides functionality that isn't readily accessible from within numpy at the moment. The only functions I know of to rearrange the memory layout of data are things like ascontiguousarray and asfortranarray, as well as assignment (e.g. a[...] = b). The general strategy within numpy is to assume that all functions work equally well on arrays with arbitrary memory layouts, so that users often don't even know the memory layouts of their data. The striding functionality means data usually doesn't actually get transposed until absolutely necessary. Of course, few if any numpy functions work equally well on different memory layouts; unary ufuncs contain code to try to carry out their iteration in the fastest way, but it's not clear how well that works or whether they have the freedom to choose the layouts of their output arrays. If you wanted to integrate HPTT into numpy, I think the best approach might be to wire it into the assignment machinery, so that when users do things like a[::2,:] = b[:,::3].T HPTT springs into action behind the scenes and makes this assignment as efficient as possible (how well does it handle arrays with spaces between elements?). Then ascontiguousarray and asfortranarray and the like could simply use assignment to an appropriately-ordered destination when they actually needed to do anything. TCL uses the Transpose-Transpose-GEMM-Transpose approach where all tensors > are flattened into matrices (via HPTT) and then contracted via GEMM; the > final result is eventually folded (via HPTT) into the desired output tensor. > This is a pretty direct replacement of einsum, but I think einsum may well already do pretty much this, apart from not using HPTT to do the transposes. So the way to get this functionality would be to make the matrix-rearrangement primitives use HPTT, as above. Would it be possible to expose HPTT and TCL as optional packages within > NumPY? This way I don't have to redo the work that I've already put into > those libraries. > I think numpy should be regarded as a basically-complete package for manipulating strided in-memory data, to which we are reluctant to add new user-visible functionality. Tools that can act under the hood to make existing code faster, or to reduce the work users must to to make their code run fast enough, are valuable. > Might check the license if your work uses code from a publication. > > As far as licenses are concerned that should not be a problem since I > wrote to code myself and it doesn't use code from publications other than > mine. > Would some of your techniques help numpy to more rapidly evaluate things like C[...] = A+B, when A B and C are arbitrarily strided and there are no ordering constraints on the result? Or just A+B where numpy is free to choose the memory layout for the result? Anne -------------- next part -------------- An HTML attachment was scrubbed... 
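The TTGT idea under discussion can be sketched in plain numpy (an illustration only, not TCL's actual code; the explicit contiguous copy stands in for HPTT's job):

import numpy as np

# contract C[i,l] = sum_{j,k} A[k,i,j] * B[j,k,l]
A = np.random.rand(5, 3, 4)   # axes (k, i, j)
B = np.random.rand(4, 5, 6)   # axes (j, k, l)

At = np.ascontiguousarray(A.transpose(1, 2, 0))         # transpose A to (i, j, k)
C = np.dot(At.reshape(3, 4 * 5), B.reshape(4 * 5, 6))   # one big GEMM
# here the GEMM output is already laid out as (i, l); in general a final
# transpose folds it back into the desired output tensor

print(np.allclose(C, np.einsum('kij,jkl->il', A, B)))   # True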
URL: From pavdev at gmx.de Wed Aug 16 18:29:53 2017 From: pavdev at gmx.de (Paul Springer) Date: Thu, 17 Aug 2017 00:29:53 +0200 Subject: [Numpy-discussion] Tensor Contraction (HPTT) and Tensor Transposition (TCL) In-Reply-To: References: <55f86774-5831-e152-b617-c5e1b6364605@gmx.de> Message-ID: Am 8/16/17 um 5:38 PM schrieb Stephan Hoyer: > On Wed, Aug 16, 2017 at 2:39 AM, Paul Springer > wrote: > > >> What version of Numpy are you comparing to? Note that in 1.13 you >> can enable some optimization in einsum, and the coming 1.14 makes >> that the default and uses CBLAS when possible. > I was using 1.10.4; however, I am currently running the benchmark > with 1.13.1 and 'optimize=True'; this, however, seems to yield > even worse performance (see attached). > If you are interested, you can check the performance difference > yourself via: ./benchmark/python/bechmark.sh > > > This sounds like you may be using relatively small matrices, where the > overhead of calculating the optimal strategy dominates. Can you try > with a few bigger test cases? > The sizes of the tensors varies form ~5MB up to ~100MB towards the far right of the plot; this corresponds to matrices of size ~1000^2 to ~5000^2, thus the sizes should be large enough to amortize any overhead associated to calculating the optimal strategy. -------------- next part -------------- An HTML attachment was scrubbed... URL: From pavdev at gmx.de Wed Aug 16 18:33:27 2017 From: pavdev at gmx.de (Paul Springer) Date: Thu, 17 Aug 2017 00:33:27 +0200 Subject: [Numpy-discussion] Tensor Contraction (HPTT) and Tensor Transposition (TCL) In-Reply-To: References: <55f86774-5831-e152-b617-c5e1b6364605@gmx.de> Message-ID: Am 8/16/17 um 6:08 PM schrieb Anne Archibald: > (NB all: the thread title seems to interchange the acronyms for the > Thread Contraction Library (TCL) and the High-Perormance Tensor > Transpose (HPTT) packages. I'm not fixing it so as not to break > threading.) > > On Wed, Aug 16, 2017 at 11:40 AM Paul Springer > wrote: > >> If you want to get it into Numpy, it would be worth checking if >> the existing functions can be improved before adding new ones. >> >> Note that Numpy transposition method just rearranges the indices, >> so the advantage of actual transposition is to have better cache >> performance or allow direct use of CBLAS. I assume TCL uses some >> tricks to do transposition in a way that is more cache friendly? > HPTT is a sophisticated library for tensor transpositions, as such > it blocks the tensors such that (1) spatial locality can be > exploited. Moreover, (2) it uses explicit vectorization to take > advantage of the CPU's vector units. > > > I think this library provides functionality that isn't readily > accessible from within numpy at the moment. The only functions I know > of to rearrange the memory layout of data are things like > ascontiguousarray and asfortranarray, as well as assignment (e.g. > a[...] = b). The general strategy within numpy is to assume that all > functions work equally well on arrays with arbitrary memory layouts, > so that users often don't even know the memory layouts of their data. > The striding functionality means data usually doesn't actually get > transposed until absolutely necessary. > > Of course, few if any numpy functions work equally well on different > memory layouts; unary ufuncs contain code to try to carry out their > iteration in the fastest way, but it's not clear how well that works > or whether they have the freedom to choose the layouts of their output > arrays. 
> If you wanted to integrate HPTT into numpy, I think the best approach
> might be to wire it into the assignment machinery, so that when users
> do things like a[::2,:] = b[:,::3].T HPTT springs into action behind
> the scenes and makes this assignment as efficient as possible (how
> well does it handle arrays with spaces between elements?). Then
> ascontiguousarray and asfortranarray and the like could simply use
> assignment to an appropriately-ordered destination when they actually
> needed to do anything.
HPTT offers support for subtensors (via the outerSize parameter, which
is similar to the leading dimension in BLAS); thus, HPTT can also deal
with arbitrarily strided transpositions. However, a non-unit stride for
the fastest-varying index is devastating for performance, since it
prohibits the use of vectorization and the exploitation of spatial
locality.

What would the integration of HPTT into NumPy look like?
Which steps would need to be taken?
Would HPTT have to be distributed in source code alongside NumPy (at
that point I might have to change the license for HPTT), or would it be
fine to add a git dependency? That way users who build NumPy from source
could fetch HPTT and set a flag during the build process of NumPy,
indicating that HPTT is available.
What would the process look like if NumPy is distributed as a
precompiled binary?

The same questions apply with respect to TCL.
> > TCL uses the Transpose-Transpose-GEMM-Transpose approach where all
> > tensors are flattened into matrices (via HPTT) and then contracted
> > via GEMM; the final result is eventually folded (via HPTT) into
> > the desired output tensor.
>
> This is a pretty direct replacement of einsum, but I think einsum may
> well already do pretty much this, apart from not using HPTT to do the
> transposes. So the way to get this functionality would be to make the
> matrix-rearrangement primitives use HPTT, as above.
That would certainly be one approach; however, TCL also explores several
different strategies/candidates and picks the one that minimizes the
data movements required by the transpositions.
> > Would it be possible to expose HPTT and TCL as optional packages
> > within NumPy? This way I don't have to redo the work that I've
> > already put into those libraries.
>
> I think numpy should be regarded as a basically-complete package for
> manipulating strided in-memory data, to which we are reluctant to add
> new user-visible functionality. Tools that can act under the hood to
> make existing code faster, or to reduce the work users must do to make
> their code run fast enough, are valuable.
It seems to me that TCL is such a candidate, since it can replace a
significant portion of the functionality offered by numpy.einsum(),
yielding significantly higher performance.

I imagine something of the form:

def einsum(...):
    if tclApplicable and tclAvailable:
        tcl.tensorMult(...)

> Would some of your techniques help numpy to more rapidly evaluate
> things like C[...] = A+B, when A, B and C are arbitrarily strided and
> there are no ordering constraints on the result? Or just A+B where
> numpy is free to choose the memory layout for the result?
Actually, HPTT is only concerned with operations of the form
B[perm(i0,i1,...)] = alpha * A[i0,i1,...] + beta * B[perm(i0,i1,...)]
(where alpha and beta are scalars). Summing over multiple transposed
tensors can be quite challenging (https://arxiv.org/abs/1705.06661) and
is not covered by HPTT. Does this answer your question?
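In numpy terms, that primitive reads as follows for a rank-3 tensor and one arbitrary, hard-coded permutation (a sketch, not HPTT's actual API):

import numpy as np

# B[perm(i0,i1,i2)] = alpha * A[i0,i1,i2] + beta * B[perm(i0,i1,i2)],
# with perm = (2, 0, 1)
alpha, beta = 2.0, 0.5
A = np.random.rand(3, 4, 5)
B = np.random.rand(5, 3, 4)   # the shape of A.transpose(2, 0, 1)

B[...] = alpha * A.transpose(2, 0, 1) + beta * B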
-------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Thu Aug 17 03:55:57 2017 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Thu, 17 Aug 2017 09:55:57 +0200 Subject: [Numpy-discussion] Tensor Contraction (HPTT) and Tensor Transposition (TCL) In-Reply-To: References: <55f86774-5831-e152-b617-c5e1b6364605@gmx.de> Message-ID: <1502956557.27862.3.camel@sipsolutions.net> On Thu, 2017-08-17 at 00:33 +0200, Paul Springer wrote: > Am 8/16/17 um 6:08 PM schrieb Anne Archibald: > > > If you wanted to integrate HPTT into numpy, I think the best > > approach might be to wire it into the assignment machinery, so that > > when users do things like a[::2,:] = b[:,::3].T HPTT springs into > > action behind the scenes and makes this assignment as efficient as > > possible (how well does it handle arrays with spaces between > > elements?). Then ascontiguousarray and asfortranarray and the like > > could simply use assignment to an appropriately-ordered destination > > when they actually needed to do anything. > ?HPTT offers support for subtensor (via the outerSize parameter, > which is similar to the leading dimension in BLAS), thus, HPTT can > also deal with arbitrarily strided transpositions. > However, a non-unite stride for the fastest-varying index is > devastating for performance since this prohibits the use of > vectorization and the exploitation of spatial locality. > > How would the integration of HPTT into NumPY look like?? > Which steps would need to be taken? > Would it be required the HPTT be distributed in source code along > side NumPY (at that point I might have to change the license for > HPTT) or would it be fine to add an git dependency? That way users > who build NumPY from source could fetch HPTT and set a flag during > the build process of NumPY, indicating the HPTT is available?? > How would the process look like if NumPY is distributed as a > precompiled binary? > Well, numpy is BSD, and the official binaries will be BSD, someone else could do less free binaries of course. I doubt we can have a hard dependency unless it is part of the numpy source (some trick like this at one point existed for fftw, but....). I doubt including the source itself is going to happen quickly since we would first have to decide to actually use a modern C++ compiler (I have no idea if that is problematic or not). Having a blocked/fancier (I assume) iterator jump in at least for simple operations such as transposed+copy as Anne suggested sounds very cool though. It could be nice for simple ufuncs at least as well. I have no idea how difficult that may be though or how much complexity it would add to maintenance. My guess is it might require quite a lot of work to integrate such optimizations into the Iterator itself (even though it would be awesome), compared to just trying to plug it into some selected fast paths as Anne suggested. One thing that might be very simple and also pretty nice is just trying to keep the documentation (or wiki page or so linked from the documentation) up to date with suggestions for people interested in speed improvements listing things such as (not sure if we have that): * Use pyfftw for speeding up ffts * numexpr can be nice and gives a way to quickly use multiple cores * numba can automagically compile some python functions to be fast * Use TCL if you need faster einsum(like) operations * ... Just a few thoughts, did not think about details really. 
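As a taste of one entry from the list above (numexpr is an optional install; this is a sketch, not numpy code):

import numpy as np
import numexpr as ne

a = np.random.rand(1000000)
b = np.random.rand(1000000)

c = ne.evaluate('2*a + 3*b')        # multi-threaded, avoids large temporaries
print(np.allclose(c, 2*a + 3*b))    # True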
But yes, it is sounds reasonable to me to re-add support for optional dependencies such as fftw or your TCL. But packagers have to make use of that or I fear it is actually less available than a standalone python module. - Sebastian > The same questions apply with respect to TCL. > > > TCL uses the Transpose-Transpose-GEMM-Transpose approach where > > > all tensors are flattened into matrices (via HPTT) and then > > > contracted via GEMM; the final result is eventually folded (via > > > HPTT) into the desired output tensor. > > > > > > > This is a pretty direct replacement of einsum, but I think einsum > > may well already do pretty much this, apart from not using HPTT to > > do the transposes. So the way to get this functionality would be to > > make the matrix-rearrangement primitives use HPTT, as above. > ?That would certainly be one approach, however, TCL also explores > several different strategies/candidates and picks the one that > minimizes the data movements required by the transpositions. > > > Would it be possible to expose HPTT and TCL as optional packages > > > within NumPY? This way I don't have to redo the work that I've > > > already put into those libraries. > > > > > > > I think numpy should be regarded as a basically-complete package > > for manipulating strided in-memory data, to which we are reluctant > > to add new user-visible functionality. Tools that can act under the > > hood to make existing code faster, or to reduce the work users must > > to to make their code run fast enough, are valuable. > ?It seems to me that TCL is such a candidate, since it can replace a > significant portion of the functionality offered by numpy.einsum(), > yielding significantly higher performance. > > I imagine some thing of the form: > > def einsum(...): > ??? if( tclApplicable and tclAvailable ): > ?????? tcl.tensorMult(...) > > ? > > Would some of your techniques help numpy to more rapidly evaluate > > things like C[...] = A+B, when A B and C are arbitrarily strided > > and there are no ordering constraints on the result? Or just A+B > > where numpy is free to choose the memory layout for the result? > ?Actually, HPTT is only concerned with the operation of the form > B[perm(i0,i1,...)] = alpha * A[i0,i1,...] + beta * B[perm(i0,i1,...)] > (where alpha and beta are scalars). Summing over multiple transposed > tensors can be quite challenging (https://arxiv.org/abs/1705.06661) > and is not covered by HPTT. Does this answer your question? > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 801 bytes Desc: This is a digitally signed message part URL: From chris.barker at noaa.gov Thu Aug 17 12:15:14 2017 From: chris.barker at noaa.gov (Chris Barker) Date: Thu, 17 Aug 2017 09:15:14 -0700 Subject: [Numpy-discussion] Tensor Contraction (HPTT) and Tensor Transposition (TCL) In-Reply-To: <1502956557.27862.3.camel@sipsolutions.net> References: <55f86774-5831-e152-b617-c5e1b6364605@gmx.de> <1502956557.27862.3.camel@sipsolutions.net> Message-ID: On Thu, Aug 17, 2017 at 12:55 AM, Sebastian Berg wrote: > > How would the process look like if NumPY is distributed as a > > precompiled binary? > > > Well, numpy is BSD, and the official binaries will be BSD, someone else > could do less free binaries of course. 
Indeed, if you want it to be distributed as a binary with numpy, then the license needs to be compatible -- do you have a substantial objection to BSD? The BSD family is pretty much the standard for Python -- Python (and numpy) are very broadly used in proprietary software. I doubt we can have a hard > dependency unless it is part of the numpy source and no reason to -- if it is a hard dependency, it HAS to be compatible licensed, and it's a lot easier to keep the source together. However, it _could_ be a soft dependency, like LAPACK/BLAS -- I've honestly lost track, but numpy used come with a lapack-lite (or some such), so that it could be compiled and work with no external LAPACK implementation -- you wouldn't get the best performance, but it would work. I doubt including the source > itself is going to happen quickly since we would first have to decide > to actually use a modern C++ compiler (I have no idea if that is > problematic or not). > could it be there as a conditional compilation? There is a lot of push to support C++11 elsewhere, so a compiled-with-a-modern-compiler numpy is not SO far off.. (for py3 anyway...) * Use TCL if you need faster einsum(like) operations > That is, of course, the other option -- distribute it on its own or maybe in scipy, and then users can use it as an optimization for those few core functions where speed matters to them -- honestly, it's a pretty small fraction of numpy code. But it sure would be nice if it could be built in, and then folks would get better performance without even thinkning about it. > Just a few thoughts, did not think about details really. But yes, it is > sounds reasonable to me to re-add support for optional dependencies > such as fftw or your TCL. But packagers have to make use of that or I > fear it is actually less available than a standalone python module. > true -- though I expect Anaconda / conda forge at least would be likely to pick it up if it works well. -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Thu Aug 17 12:58:33 2017 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 17 Aug 2017 10:58:33 -0600 Subject: [Numpy-discussion] Tensor Contraction (HPTT) and Tensor Transposition (TCL) In-Reply-To: References: <55f86774-5831-e152-b617-c5e1b6364605@gmx.de> <1502956557.27862.3.camel@sipsolutions.net> Message-ID: On Thu, Aug 17, 2017 at 10:15 AM, Chris Barker wrote: > On Thu, Aug 17, 2017 at 12:55 AM, Sebastian Berg < > sebastian at sipsolutions.net> wrote: > >> > How would the process look like if NumPY is distributed as a >> > precompiled binary? >> >> >> Well, numpy is BSD, and the official binaries will be BSD, someone else >> could do less free binaries of course. > > > Indeed, if you want it to be distributed as a binary with numpy, then the > license needs to be compatible -- do you have a substantial objection to > BSD? The BSD family is pretty much the standard for Python -- Python (and > numpy) are very broadly used in proprietary software. > > I doubt we can have a hard >> dependency unless it is part of the numpy source > > > and no reason to -- if it is a hard dependency, it HAS to be compatible > licensed, and it's a lot easier to keep the source together. 
> > However, it _could_ be a soft dependency, like LAPACK/BLAS -- I've > honestly lost track, but numpy used to come with a lapack-lite (or some such), > so that it could be compiled and work with no external LAPACK > implementation -- you wouldn't get the best performance, but it would work. > > I doubt including the source >> itself is going to happen quickly since we would first have to decide >> to actually use a modern C++ compiler (I have no idea if that is >> problematic or not). >> > > could it be there as a conditional compilation? There is a lot of push to > support C++11 elsewhere, so a compiled-with-a-modern-compiler numpy is > not SO far off.. > > (for py3 anyway...) > It would take a fair amount of grunge work to get there. Variables would need renaming, for instance `new`, and other such things. Nothing mind bending, but not completely trivial either. > > * Use TCL if you need faster einsum(like) operations >> > > That is, of course, the other option -- distribute it on its own or maybe > in scipy, and then users can use it as an optimization for those few core > functions where speed matters to them -- honestly, it's a pretty small > fraction of numpy code. > > But it sure would be nice if it could be built in, and then folks would > get better performance without even thinking about it. > > >> Just a few thoughts, did not think about details really. But yes, it >> sounds reasonable to me to re-add support for optional dependencies >> such as fftw or your TCL. But packagers have to make use of that or I >> fear it is actually less available than a standalone python module. >> > > true -- though I expect Anaconda / conda forge at least would be likely to > pick it up if it works well. > > Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From millman at berkeley.edu Fri Aug 18 15:51:52 2017 From: millman at berkeley.edu (Jarrod Millman) Date: Fri, 18 Aug 2017 12:51:52 -0700 Subject: [Numpy-discussion] NetworkX 2.0b1 released Message-ID: Hi All, I am happy to announce the **beta** release of NetworkX 2.0! NetworkX is a Python package for the creation, manipulation, and study of the structure, dynamics, and functions of complex networks. This release supports Python 2.7 and 3.4-3.6 and contains many new features. This release is the result of over two years of work with over 600 pull requests by 85 contributors. We have made **major changes** to the methods in the Multi/Di/Graph classes and before the 2.0 release we need feedback on those changes. If you have code that imports networkx, please take some time to check that you are able to update your code to work with the new release. Please see the draft of the 2.0 release announcement: http://networkx.readthedocs.io/en/latest/news.html#networkx-2-0 In particular, we would like feedback on the migration guide from 1.X to 2.0: http://networkx.readthedocs.io/en/latest/release/migration_guide_from_1.x_to_2.0.html Since it is a beta release, pip won't automatically install it. So $ pip install networkx still installs networkx-1.11. But $ pip install --pre networkx will install networkx-2.0b1. If you already have networkx installed then you need to do $ pip install --pre --upgrade networkx For more information, please visit our `website `_ and our `gallery of examples `_. Please send comments and questions to the `networkx-discuss mailing list `_ or create an issue `here `_.
Best regards, Jarrod From diagonaldevice at gmail.com Fri Aug 18 17:45:23 2017 From: diagonaldevice at gmail.com (Michael Lamparski) Date: Fri, 18 Aug 2017 17:45:23 -0400 Subject: [Numpy-discussion] Why are empty arrays False? Message-ID: Greetings, all. I am troubled. The TL;DR is that `bool(array([])) is False` is misleading, dangerous, and unnecessary. Let's begin with some examples: >>> bool(np.array(1)) True >>> bool(np.array(0)) False >>> bool(np.array([0, 1])) ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all() >>> bool(np.array([1])) True >>> bool(np.array([0])) False >>> bool(np.array([])) False One of these things is not like the other. The first three results embody a design that is consistent with some of the most fundamental design choices in numpy, such as the choice to have comparison operators like `==` work elementwise. And it is the only such design I can think of that is consistent in all edge cases. (see footnote 1) The next two examples (involving arrays of shape (1,)) are a straightforward extension of the design to arrays that are isomorphic to scalars. I can't say I recall ever finding a use for this feature... but it seems fairly harmless. So how about that last example, with array([])? Well... it's /kind of/ like how other python containers work, right? Falseness is emptiness (see footnote 2)... Except that this is actually *a complete lie*, due to /all of the other examples above/! Here's what I would like to see: >>> bool(np.array([])) ValueError: The truth value of a non-scalar array is ambiguous. Use a.any() or a.all() Why do I care? Well, I myself wasted an hour barking up the wrong tree while debugging some code when it turned out that I was mistakenly using truthiness to identify empty arrays. It just so happened that the arrays always contained 1 or 0 elements, so it /appeared/ to work except in the rare case of array([0]) where things suddenly exploded. I posit that there is no usage of the fact that `bool(array([])) is False` in any real-world code which is not accompanied by a horrible bug writhing in hiding just beneath the surface. For this reason, I wish to see this behavior *abolished*. Thank you. -Michael Footnotes: 1: Every now and then, I wish that `ndarray.__{bool,nonzero}__` would just implicitly do `all()`, which would make `if a == b:` work like it does for virtually every other reasonably-designed type in existence. But then I recall that, if this were done, then the behavior of `if a != b:` would stand out like a sore thumb instead. Truly, punting on 'any/all' was the right choice. 2: np.array([[[[]]]]) is also False, which makes this an interesting sort of n-dimensional emptiness test; but if that's really what you're looking for, you can achieve this much more safely with `np.all(x.shape)` or `bool(x.flat)` -------------- next part -------------- An HTML attachment was scrubbed... URL: From shoyer at gmail.com Fri Aug 18 18:00:32 2017 From: shoyer at gmail.com (Stephan Hoyer) Date: Fri, 18 Aug 2017 15:00:32 -0700 Subject: [Numpy-discussion] Why are empty arrays False? In-Reply-To: References: Message-ID: I agree, this behavior seems actively harmful. Let's fix it. On Fri, Aug 18, 2017 at 2:45 PM, Michael Lamparski wrote: > Greetings, all. I am troubled. > > The TL;DR is that `bool(array([])) is False` is misleading, dangerous, and > unnecessary. 
Let's begin with some examples: > > >>> bool(np.array(1)) > True > >>> bool(np.array(0)) > False > >>> bool(np.array([0, 1])) > ValueError: The truth value of an array with more than one element is > ambiguous. Use a.any() or a.all() > >>> bool(np.array([1])) > True > >>> bool(np.array([0])) > False > >>> bool(np.array([])) > False > > One of these things is not like the other. > > The first three results embody a design that is consistent with some of > the most fundamental design choices in numpy, such as the choice to have > comparison operators like `==` work elementwise. And it is the only such > design I can think of that is consistent in all edge cases. (see footnote 1) > > The next two examples (involving arrays of shape (1,)) are a > straightforward extension of the design to arrays that are isomorphic to > scalars. I can't say I recall ever finding a use for this feature... but > it seems fairly harmless. > > So how about that last example, with array([])? Well... it's /kind of/ > like how other python containers work, right? Falseness is emptiness (see > footnote 2)... Except that this is actually *a complete lie*, due to /all > of the other examples above/! > > Here's what I would like to see: > > >>> bool(np.array([])) > ValueError: The truth value of a non-scalar array is ambiguous. Use > a.any() or a.all() > > Why do I care? Well, I myself wasted an hour barking up the wrong tree > while debugging some code when it turned out that I was mistakenly using > truthiness to identify empty arrays. It just so happened that the arrays > always contained 1 or 0 elements, so it /appeared/ to work except in the > rare case of array([0]) where things suddenly exploded. > > I posit that there is no usage of the fact that `bool(array([])) is False` > in any real-world code which is not accompanied by a horrible bug writhing > in hiding just beneath the surface. For this reason, I wish to see this > behavior *abolished*. > > Thank you. > -Michael > > Footnotes: > 1: Every now and then, I wish that `ndarray.__{bool,nonzero}__` would just > implicitly do `all()`, which would make `if a == b:` work like it does for > virtually every other reasonably-designed type in existence. But then I > recall that, if this were done, then the behavior of `if a != b:` would > stand out like a sore thumb instead. Truly, punting on 'any/all' was the > right choice. > > 2: np.array([[[[]]]]) is also False, which makes this an interesting sort > of n-dimensional emptiness test; but if that's really what you're looking > for, you can achieve this much more safely with `np.all(x.shape)` or > `bool(x.flat)` > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pmhobson at gmail.com Fri Aug 18 18:37:52 2017 From: pmhobson at gmail.com (Paul Hobson) Date: Fri, 18 Aug 2017 15:37:52 -0700 Subject: [Numpy-discussion] Why are empty arrays False? In-Reply-To: References: Message-ID: Maybe I'm missing something. This seems fine to me: >>> bool(np.array([])) False But I would have expected these to raise ValueErrors recommending any() and all(): >>> bool(np.array([1])) True >>> bool(np.array([0])) False On Fri, Aug 18, 2017 at 3:00 PM, Stephan Hoyer wrote: > I agree, this behavior seems actively harmful. Let's fix it. 
> > On Fri, Aug 18, 2017 at 2:45 PM, Michael Lamparski < > diagonaldevice at gmail.com> wrote: > >> Greetings, all. I am troubled. >> >> The TL;DR is that `bool(array([])) is False` is misleading, dangerous, >> and unnecessary. Let's begin with some examples: >> >> >>> bool(np.array(1)) >> True >> >>> bool(np.array(0)) >> False >> >>> bool(np.array([0, 1])) >> ValueError: The truth value of an array with more than one element is >> ambiguous. Use a.any() or a.all() >> >>> bool(np.array([1])) >> True >> >>> bool(np.array([0])) >> False >> >>> bool(np.array([])) >> False >> >> One of these things is not like the other. >> >> The first three results embody a design that is consistent with some of >> the most fundamental design choices in numpy, such as the choice to have >> comparison operators like `==` work elementwise. And it is the only such >> design I can think of that is consistent in all edge cases. (see footnote 1) >> >> The next two examples (involving arrays of shape (1,)) are a >> straightforward extension of the design to arrays that are isomorphic to >> scalars. I can't say I recall ever finding a use for this feature... but >> it seems fairly harmless. >> >> So how about that last example, with array([])? Well... it's /kind of/ >> like how other python containers work, right? Falseness is emptiness (see >> footnote 2)... Except that this is actually *a complete lie*, due to /all >> of the other examples above/! >> >> Here's what I would like to see: >> >> >>> bool(np.array([])) >> ValueError: The truth value of a non-scalar array is ambiguous. Use >> a.any() or a.all() >> >> Why do I care? Well, I myself wasted an hour barking up the wrong tree >> while debugging some code when it turned out that I was mistakenly using >> truthiness to identify empty arrays. It just so happened that the arrays >> always contained 1 or 0 elements, so it /appeared/ to work except in the >> rare case of array([0]) where things suddenly exploded. >> >> I posit that there is no usage of the fact that `bool(array([])) is >> False` in any real-world code which is not accompanied by a horrible bug >> writhing in hiding just beneath the surface. For this reason, I wish to see >> this behavior *abolished*. >> >> Thank you. >> -Michael >> >> Footnotes: >> 1: Every now and then, I wish that `ndarray.__{bool,nonzero}__` would >> just implicitly do `all()`, which would make `if a == b:` work like it does >> for virtually every other reasonably-designed type in existence. But then >> I recall that, if this were done, then the behavior of `if a != b:` would >> stand out like a sore thumb instead. Truly, punting on 'any/all' was the >> right choice. >> >> 2: np.array([[[[]]]]) is also False, which makes this an interesting sort >> of n-dimensional emptiness test; but if that's really what you're looking >> for, you can achieve this much more safely with `np.all(x.shape)` or >> `bool(x.flat)` >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion >> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From diagonaldevice at gmail.com Fri Aug 18 19:02:43 2017 From: diagonaldevice at gmail.com (Michael Lamparski) Date: Fri, 18 Aug 2017 19:02:43 -0400 Subject: [Numpy-discussion] Why are empty arrays False? In-Reply-To: References: Message-ID: > But I would have expected these to raise ValueErrors recommending any() and all(): > >>> bool(np.array([1])) > True > >>> bool(np.array([0])) > False While I can't confess to know the *actual* reason why single-element arrays evaluate the way they do, this is how I understand it: One thing that single-element arrays have going for them is that, for arrays like this, `x.any() == x.all()`. Hence, in these cases, there is no ambiguity. In this same light, we can see yet another argument against bool(np.array([])), because guess what: This one IS ambiguous! >>> np.array([]).any() False >>> np.array([]).all() True On Fri, Aug 18, 2017 at 6:37 PM, Paul Hobson wrote: > Maybe I'm missing something. > > This seems fine to me: > >>> bool(np.array([])) > False > > But I would have expected these to raise ValueErrors recommending any() > and all(): > >>> bool(np.array([1])) > True > >>> bool(np.array([0])) > False > > On Fri, Aug 18, 2017 at 3:00 PM, Stephan Hoyer wrote: > >> I agree, this behavior seems actively harmful. Let's fix it. >> >> On Fri, Aug 18, 2017 at 2:45 PM, Michael Lamparski < >> diagonaldevice at gmail.com> wrote: >> >>> Greetings, all. I am troubled. >>> >>> The TL;DR is that `bool(array([])) is False` is misleading, dangerous, >>> and unnecessary. Let's begin with some examples: >>> >>> >>> bool(np.array(1)) >>> True >>> >>> bool(np.array(0)) >>> False >>> >>> bool(np.array([0, 1])) >>> ValueError: The truth value of an array with more than one element is >>> ambiguous. Use a.any() or a.all() >>> >>> bool(np.array([1])) >>> True >>> >>> bool(np.array([0])) >>> False >>> >>> bool(np.array([])) >>> False >>> >>> One of these things is not like the other. >>> >>> The first three results embody a design that is consistent with some of >>> the most fundamental design choices in numpy, such as the choice to have >>> comparison operators like `==` work elementwise. And it is the only such >>> design I can think of that is consistent in all edge cases. (see footnote 1) >>> >>> The next two examples (involving arrays of shape (1,)) are a >>> straightforward extension of the design to arrays that are isomorphic to >>> scalars. I can't say I recall ever finding a use for this feature... but >>> it seems fairly harmless. >>> >>> So how about that last example, with array([])? Well... it's /kind of/ >>> like how other python containers work, right? Falseness is emptiness (see >>> footnote 2)... Except that this is actually *a complete lie*, due to /all >>> of the other examples above/! >>> >>> Here's what I would like to see: >>> >>> >>> bool(np.array([])) >>> ValueError: The truth value of a non-scalar array is ambiguous. Use >>> a.any() or a.all() >>> >>> Why do I care? Well, I myself wasted an hour barking up the wrong tree >>> while debugging some code when it turned out that I was mistakenly using >>> truthiness to identify empty arrays. It just so happened that the arrays >>> always contained 1 or 0 elements, so it /appeared/ to work except in the >>> rare case of array([0]) where things suddenly exploded. >>> >>> I posit that there is no usage of the fact that `bool(array([])) is >>> False` in any real-world code which is not accompanied by a horrible bug >>> writhing in hiding just beneath the surface. 
For this reason, I wish to see >>> this behavior *abolished*. >>> >>> Thank you. >>> -Michael >>> >>> Footnotes: >>> 1: Every now and then, I wish that `ndarray.__{bool,nonzero}__` would >>> just implicitly do `all()`, which would make `if a == b:` work like it does >>> for virtually every other reasonably-designed type in existence. But then >>> I recall that, if this were done, then the behavior of `if a != b:` would >>> stand out like a sore thumb instead. Truly, punting on 'any/all' was the >>> right choice. >>> >>> 2: np.array([[[[]]]]) is also False, which makes this an interesting >>> sort of n-dimensional emptiness test; but if that's really what you're >>> looking for, you can achieve this much more safely with `np.all(x.shape)` >>> or `bool(x.flat)` >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at python.org >>> https://mail.python.org/mailman/listinfo/numpy-discussion >>> >>> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion >> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From wieser.eric+numpy at gmail.com Fri Aug 18 19:07:51 2017 From: wieser.eric+numpy at gmail.com (Eric Wieser) Date: Fri, 18 Aug 2017 23:07:51 +0000 Subject: [Numpy-discussion] Why are empty arrays False? In-Reply-To: References: Message-ID: I'm also in favor of fixing this, although we might need a deprecation cycle with a warning advising to use arr.size in future to detect emptiness - just in case anyone is using it. On Sat, Aug 19, 2017, 06:01 Stephan Hoyer wrote: > I agree, this behavior seems actively harmful. Let's fix it. > > On Fri, Aug 18, 2017 at 2:45 PM, Michael Lamparski < > diagonaldevice at gmail.com> wrote: > >> Greetings, all. I am troubled. >> >> The TL;DR is that `bool(array([])) is False` is misleading, dangerous, >> and unnecessary. Let's begin with some examples: >> >> >>> bool(np.array(1)) >> True >> >>> bool(np.array(0)) >> False >> >>> bool(np.array([0, 1])) >> ValueError: The truth value of an array with more than one element is >> ambiguous. Use a.any() or a.all() >> >>> bool(np.array([1])) >> True >> >>> bool(np.array([0])) >> False >> >>> bool(np.array([])) >> False >> >> One of these things is not like the other. >> >> The first three results embody a design that is consistent with some of >> the most fundamental design choices in numpy, such as the choice to have >> comparison operators like `==` work elementwise. And it is the only such >> design I can think of that is consistent in all edge cases. (see footnote 1) >> >> The next two examples (involving arrays of shape (1,)) are a >> straightforward extension of the design to arrays that are isomorphic to >> scalars. I can't say I recall ever finding a use for this feature... but >> it seems fairly harmless. >> >> So how about that last example, with array([])? Well... it's /kind of/ >> like how other python containers work, right? Falseness is emptiness (see >> footnote 2)... Except that this is actually *a complete lie*, due to /all >> of the other examples above/! >> >> Here's what I would like to see: >> >> >>> bool(np.array([])) >> ValueError: The truth value of a non-scalar array is ambiguous. 
Use >> a.any() or a.all() >> >> Why do I care? Well, I myself wasted an hour barking up the wrong tree >> while debugging some code when it turned out that I was mistakenly using >> truthiness to identify empty arrays. It just so happened that the arrays >> always contained 1 or 0 elements, so it /appeared/ to work except in the >> rare case of array([0]) where things suddenly exploded. >> >> I posit that there is no usage of the fact that `bool(array([])) is >> False` in any real-world code which is not accompanied by a horrible bug >> writhing in hiding just beneath the surface. For this reason, I wish to see >> this behavior *abolished*. >> >> Thank you. >> -Michael >> >> Footnotes: >> 1: Every now and then, I wish that `ndarray.__{bool,nonzero}__` would >> just implicitly do `all()`, which would make `if a == b:` work like it does >> for virtually every other reasonably-designed type in existence. But then >> I recall that, if this were done, then the behavior of `if a != b:` would >> stand out like a sore thumb instead. Truly, punting on 'any/all' was the >> right choice. >> >> 2: np.array([[[[]]]]) is also False, which makes this an interesting sort >> of n-dimensional emptiness test; but if that's really what you're looking >> for, you can achieve this much more safely with `np.all(x.shape)` or >> `bool(x.flat)` >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion >> >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Fri Aug 18 20:12:43 2017 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 18 Aug 2017 17:12:43 -0700 Subject: [Numpy-discussion] Why are empty arrays False? In-Reply-To: References: Message-ID: On Fri, Aug 18, 2017 at 2:45 PM, Michael Lamparski wrote: > Greetings, all. I am troubled. > > The TL;DR is that `bool(array([])) is False` is misleading, dangerous, and > unnecessary. Let's begin with some examples: > >>>> bool(np.array(1)) > True >>>> bool(np.array(0)) > False >>>> bool(np.array([0, 1])) > ValueError: The truth value of an array with more than one element is > ambiguous. Use a.any() or a.all() >>>> bool(np.array([1])) > True >>>> bool(np.array([0])) > False >>>> bool(np.array([])) > False > > One of these things is not like the other. > > The first three results embody a design that is consistent with some of the > most fundamental design choices in numpy, such as the choice to have > comparison operators like `==` work elementwise. And it is the only such > design I can think of that is consistent in all edge cases. (see footnote 1) > > The next two examples (involving arrays of shape (1,)) are a straightforward > extension of the design to arrays that are isomorphic to scalars. I can't > say I recall ever finding a use for this feature... but it seems fairly > harmless. > > So how about that last example, with array([])? Well... it's /kind of/ like > how other python containers work, right? Falseness is emptiness (see > footnote 2)... Except that this is actually *a complete lie*, due to /all > of the other examples above/! Yeah, numpy tries to follow Python conventions, except sometimes you run into these cases where it's trying to simultaneously follow two incompatible extensions and things get... 
problematic. > Here's what I would like to see: > >>>> bool(np.array([])) > ValueError: The truth value of a non-scalar array is ambiguous. Use a.any() > or a.all() > > Why do I care? Well, I myself wasted an hour barking up the wrong tree > while debugging some code when it turned out that I was mistakenly using > truthiness to identify empty arrays. It just so happened that the arrays > always contained 1 or 0 elements, so it /appeared/ to work except in the > rare case of array([0]) where things suddenly exploded. Yeah, we should probably deprecate and remove this (though it will take some time). > 2: np.array([[[[]]]]) is also False, which makes this an interesting sort of > n-dimensional emptiness test; but if that's really what you're looking for, > you can achieve this much more safely with `np.all(x.shape)` or > `bool(x.flat)` x.size is also useful for emptiness checking. -n -- Nathaniel J. Smith -- https://vorpus.org From efiring at hawaii.edu Fri Aug 18 22:34:02 2017 From: efiring at hawaii.edu (Eric Firing) Date: Fri, 18 Aug 2017 16:34:02 -1000 Subject: [Numpy-discussion] Why are empty arrays False? In-Reply-To: References: Message-ID: <3816033a-89ba-5d5e-f3f6-c8b8c6f995c5@hawaii.edu> On 2017/08/18 11:45 AM, Michael Lamparski wrote: > Greetings, all. I am troubled. > > The TL;DR is that `bool(array([])) is False` is misleading, dangerous, > and unnecessary. Let's begin with some examples: > > >>> bool(np.array(1)) > True > >>> bool(np.array(0)) > False > >>> bool(np.array([0, 1])) > ValueError: The truth value of an array with more than one element is > ambiguous. Use a.any() or a.all() > >>> bool(np.array([1])) > True > >>> bool(np.array([0])) > False > >>> bool(np.array([])) > False > > One of these things is not like the other. > > The first three results embody a design that is consistent with some of > the most fundamental design choices in numpy, such as the choice to have > comparison operators like `==` work elementwise. And it is the only > such design I can think of that is consistent in all edge cases. (see > footnote 1) > > The next two examples (involving arrays of shape (1,)) are a > straightforward extension of the design to arrays that are isomorphic to > scalars. I can't say I recall ever finding a use for this feature... > but it seems fairly harmless. > > So how about that last example, with array([])? Well... it's /kind of/ > like how other python containers work, right? Falseness is emptiness > (see footnote 2)... Except that this is actually *a complete lie*, due > to /all of the other examples above/! I don't agree. I think the consistency between bool([]) and bool(array([])) is worth preserving. Nothing you have shown is inconsistent with "Falseness is emptiness", which is quite fundamental in Python. The inconsistency is in distinguishing between 1 element and more than one element. To be consistent, bool(array([0])) and bool(array([0, 1])) should both be True. Contrary to the ValueError message, there need be no ambiguity, any more than there is an ambiguity in bool([1, 2]). Eric > > Here's what I would like to see: > > >>> bool(np.array([])) > ValueError: The truth value of a non-scalar array is ambiguous. Use > a.any() or a.all() > > Why do I care? Well, I myself wasted an hour barking up the wrong tree > while debugging some code when it turned out that I was mistakenly using > truthiness to identify empty arrays.
It just so happened that the arrays > always contained 1 or 0 elements, so it /appeared/ to work except in the > rare case of array([0]) where things suddenly exploded. > > I posit that there is no usage of the fact that `bool(array([])) is > False` in any real-world code which is not accompanied by a horrible bug > writhing in hiding just beneath the surface. For this reason, I wish to > see this behavior *abolished*. > > Thank you. > -Michael > > Footnotes: > 1: Every now and then, I wish that `ndarray.__{bool,nonzero}__` would > just implicitly do `all()`, which would make `if a == b:` work like it > does for virtually every other reasonably-designed type in existence. > But then I recall that, if this were done, then the behavior of `if a != > b:` would stand out like a sore thumb instead. Truly, punting on > 'any/all' was the right choice. > > 2: np.array([[[[]]]]) is also False, which makes this an interesting > sort of n-dimensional emptiness test; but if that's really what you're > looking for, you can achieve this much more safely with > `np.all(x.shape)` or `bool(x.flat)` > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > From wieser.eric+numpy at gmail.com Sat Aug 19 00:19:54 2017 From: wieser.eric+numpy at gmail.com (Eric Wieser) Date: Sat, 19 Aug 2017 04:19:54 +0000 Subject: [Numpy-discussion] Why are empty arrays False? In-Reply-To: <3816033a-89ba-5d5e-f3f6-c8b8c6f995c5@hawaii.edu> References: <3816033a-89ba-5d5e-f3f6-c8b8c6f995c5@hawaii.edu> Message-ID: Defining falseness as emptiness in numpy is problematic, as then bool(array(0)) and bool(0) would have different results. 0d arrays are supposed to behave as much like their scalar values as possible, so this is not acceptable. More importantly though, allowing your proposed semantics would cause a lot of silent bugs in code like `if arr == value`, which would be silently true of array inputs. We already diverge from python on what == means, so I see no reason to match the normal semantics of bool. I'd be tentatively in favor of deprecating bool(array([1])) with a warning asking for `.squeeze()` to be used, since this also hides a (smaller) class of bugs. On Sat, Aug 19, 2017, 10:34 Eric Firing wrote: > On 2017/08/18 11:45 AM, Michael Lamparski wrote: > > Greetings, all. I am troubled. > > > > The TL;DR is that `bool(array([])) is False` is misleading, dangerous, > > and unnecessary. Let's begin with some examples: > > > > >>> bool(np.array(1)) > > True > > >>> bool(np.array(0)) > > False > > >>> bool(np.array([0, 1])) > > ValueError: The truth value of an array with more than one element is > > ambiguous. Use a.any() or a.all() > > >>> bool(np.array([1])) > > True > > >>> bool(np.array([0])) > > False > > >>> bool(np.array([])) > > False > > > > One of these things is not like the other. > > > > The first three results embody a design that is consistent with some of > > the most fundamental design choices in numpy, such as the choice to have > > comparison operators like `==` work elementwise. And it is the only > > such design I can think of that is consistent in all edge cases. (see > > footnote 1) > > > > The next two examples (involving arrays of shape (1,)) are a > > straightforward extension of the design to arrays that are isomorphic to > > scalars. I can't say I recall ever finding a use for this feature... > > but it seems fairly harmless.
> > > > So how about that last example, with array([])? Well... it's /kind of/ > > like how other python containers work, right? Falseness is emptiness > > (see footnote 2)... Except that this is actually *a complete lie*, due > > to /all of the other examples above/! > > I don't agree. I think the consistency between bool([]) and > bool(array([])) is worth preserving. Nothing you have shown is > inconsistent with "Falseness is emptiness", which is quite fundamental > in Python. The inconsistency is in distinguishing between 1 element and > more than one element. To be consistent, bool(array([0])) and > bool(array([0, 1])) should both be True. Contrary to the ValueError > message, there need be no ambiguity, any more than there is an ambiguity > in bool([1, 2]). > > Eric > > > > > > Here's what I would like to see: > > > > >>> bool(np.array([])) > > ValueError: The truth value of a non-scalar array is ambiguous. Use > > a.any() or a.all() > > > > Why do I care? Well, I myself wasted an hour barking up the wrong tree > > while debugging some code when it turned out that I was mistakenly using > > truthiness to identify empty arrays. It just so happened that the arrays > > always contained 1 or 0 elements, so it /appeared/ to work except in the > > rare case of array([0]) where things suddenly exploded. > > > > I posit that there is no usage of the fact that `bool(array([])) is > > False` in any real-world code which is not accompanied by a horrible bug > > writhing in hiding just beneath the surface. For this reason, I wish to > > see this behavior *abolished*. > > > > Thank you. > > -Michael > > > > Footnotes: > > 1: Every now and then, I wish that `ndarray.__{bool,nonzero}__` would > > just implicitly do `all()`, which would make `if a == b:` work like it > > does for virtually every other reasonably-designed type in existence. > > But then I recall that, if this were done, then the behavior of `if a != > > b:` would stand out like a sore thumb instead. Truly, punting on > > 'any/all' was the right choice. > > > > 2: np.array([[[[]]]]) is also False, which makes this an interesting > > sort of n-dimensional emptiness test; but if that's really what you're > > looking for, you can achieve this much more safely with > > `np.all(x.shape)` or `bool(x.flat)` > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From diagonaldevice at gmail.com Sat Aug 19 01:04:56 2017 From: diagonaldevice at gmail.com (Michael Lamparski) Date: Sat, 19 Aug 2017 01:04:56 -0400 Subject: [Numpy-discussion] Why are empty arrays False? In-Reply-To: References: <3816033a-89ba-5d5e-f3f6-c8b8c6f995c5@hawaii.edu> Message-ID: > More importantly though, allowing your proposed semantics would cause a lot of silent bugs in code like `if arr == value`, which would be silently true of array inputs. We already diverge from python on what == means, so I see no reason to match the normal semantics of bool. Eric hits the nail right on the head here. (er, ahh, you're both Eric!) And this gets worse; not only would `a == b` be true, but so would `a != b`! For the vast majority of arrays, `bool(x != x)` would be True! 
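To make that concrete, here is a minimal sketch that simulates the proposed "falseness is emptiness" rule (the helper `emptiness_bool` is hypothetical, not numpy API):

import numpy as np

def emptiness_bool(arr):
    # hypothetical semantics: an array would be truthy iff it is non-empty
    return arr.size > 0

a = np.array([1, 2, 3])
b = np.array([1, 2, 3])
print(emptiness_bool(a == b))  # True: the result array has 3 elements
print(emptiness_bool(a != b))  # also True: still a non-empty result array

Both comparisons produce a length-3 boolean array, so under those semantics both tests would pass no matter what the arrays contain.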
I can resonate with Eric F's feelings, because to be honest, I've never been a big fan of the fact that comparison operators return arrays in the first place. That said... it's a difficult design question, and I can respect the decision that was made; there certainly are a large variety of circumstances where broadcasting these operations is useful. On the other hand, it is a decision that comes with implications that cannot be ignored in many other parts of the library, and truthiness of arrays is one of them. > I'd be tentatively in favor of deprecating bool(array([1])) with a warning asking for `.squeeze()` to be used, since this also hides a (smaller) class of bugs. I can get behind this as well, though I just keep wondering in the back of my mind whether there's some tricky but legitimate use case that I'm not thinking about, where arrays of size 1 just happen to have a natural tendency to arise. On Sat, Aug 19, 2017, 10:34 Eric Firing wrote: > On 2017/08/18 11:45 AM, Michael Lamparski wrote: > > Greetings, all. I am troubled. > > > > The TL;DR is that `bool(array([])) is False` is misleading, dangerous, > > and unnecessary. Let's begin with some examples: > > > > >>> bool(np.array(1)) > > True > > >>> bool(np.array(0)) > > False > > >>> bool(np.array([0, 1])) > > ValueError: The truth value of an array with more than one element is > > ambiguous. Use a.any() or a.all() > > >>> bool(np.array([1])) > > True > > >>> bool(np.array([0])) > > False > > >>> bool(np.array([])) > > False > > > > One of these things is not like the other. > > > > The first three results embody a design that is consistent with some of > > the most fundamental design choices in numpy, such as the choice to have > > comparison operators like `==` work elementwise. And it is the only > > such design I can think of that is consistent in all edge cases. (see > > footnote 1) > > > > The next two examples (involving arrays of shape (1,)) are a > > straightforward extension of the design to arrays that are isomorphic to > > scalars. I can't say I recall ever finding a use for this feature... > > but it seems fairly harmless. > > > > So how about that last example, with array([])? Well... it's /kind of/ > > like how other python containers work, right? Falseness is emptiness > > (see footnote 2)... Except that this is actually *a complete lie*, due > > to /all of the other examples above/! > > I don't agree. I think the consistency between bool([]) and > bool(array([])) is worth preserving. Nothing you have shown is > inconsistent with "Falseness is emptiness", which is quite fundamental > in Python. The inconsistency is in distinguishing between 1 element and > more than one element. To be consistent, bool(array([0])) and > bool(array([0, 1])) should both be True. Contrary to the ValueError > message, there need be no ambiguity, any more than there is an ambiguity > in bool([1, 2]). > > Eric > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From andyfaff at gmail.com Sat Aug 19 03:57:58 2017 From: andyfaff at gmail.com (Andrew Nelson) Date: Sat, 19 Aug 2017 17:57:58 +1000 Subject: [Numpy-discussion] Why are empty arrays False? In-Reply-To: References: Message-ID: > I think the consistency between bool([]) and bool(array([])) is worth preserving I'm with Eric Firing on this one. Empty sequences are False in Python.
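For reference, this is indeed how the built-in containers behave -- though note the second example, which is where numpy's scalar semantics and the container semantics part ways:

>>> bool([]), bool(()), bool(''), bool({})
(False, False, False, False)
>>> bool([0])  # non-empty list, so True, even though its one element is falsy
True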
-------------- next part -------------- An HTML attachment was scrubbed... URL: From wieser.eric+numpy at gmail.com Sat Aug 19 05:00:43 2017 From: wieser.eric+numpy at gmail.com (Eric Wieser) Date: Sat, 19 Aug 2017 09:00:43 +0000 Subject: [Numpy-discussion] Why are empty arrays False? In-Reply-To: References: Message-ID: Andrew, that can only be useful if you also require that all non-empty arrays are True - else code looking for empty arrays gets false positives on arrays of zeros. But as I mention above, that is not acceptable, as it produces silent traps for new users, or functions not written with numpy in mind. "In the face of ambiguity, refuse the temptation to guess" tells us that throwing an error is the right thing to do here. In idiomatic code, numpy arrays have semantics closer to scalars than to sequences - iteration is usually a red flag. Another example of how arrays are not like sequences - the + operator is element-wise addition, not sequence concatenation. On Sat, Aug 19, 2017, 15:58 Andrew Nelson wrote: > > I think the consistency between bool([]) and > bool(array([])) is worth preserving > > I'm with Eric Firing on this one. Empty sequences are False in Python. > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Sat Aug 19 09:22:33 2017 From: pav at iki.fi (Pauli Virtanen) Date: Sat, 19 Aug 2017 15:22:33 +0200 Subject: [Numpy-discussion] Why are empty arrays False? In-Reply-To: References: <3816033a-89ba-5d5e-f3f6-c8b8c6f995c5@hawaii.edu> Message-ID: <31cc66a7-2d79-6dd1-ef09-cb2974736cc2@iki.fi> Michael Lamparski kirjoitti 19.08.2017 klo 07:04: >> I'd be tentatively in favor of deprecating bool(array([1])) with a > warning asking for `.squeeze()` to be used, since this also hides a > (smaller) class of bugs. > > I can get behind this as well, though I just keep wondering in the back > of my mind whether there's some tricky but legitimate use case that I'm > not thinking about, where arrays of size 1 just happen to have a natural > tendency to arise. Changing this sort of fundamental semantics (i.e. size-1 arrays behave like scalars in bool, int, etc. casting context) this late in the game in my opinion should be discussed with more care. While the intention of making it harder to write code with bugs is good, it should not come at the cost of having everyone fix their old scripts, which worked correctly previously, but then suddenly stop working. Note also that I expect polling on this mailing list will not reach the majority of the user base, so I would suggest being very conservative when deprecating features that are not wrong but merely have suboptimal semantics. This sort of backward-incompatible change accumulates, and will lead to rotting of third-party code. -- Pauli Virtanen From diagonaldevice at gmail.com Sat Aug 19 13:18:45 2017 From: diagonaldevice at gmail.com (Michael Lamparski) Date: Sat, 19 Aug 2017 13:18:45 -0400 Subject: [Numpy-discussion] Why are empty arrays False?
In-Reply-To: <31cc66a7-2d79-6dd1-ef09-cb2974736cc2@iki.fi> References: <3816033a-89ba-5d5e-f3f6-c8b8c6f995c5@hawaii.edu> <31cc66a7-2d79-6dd1-ef09-cb2974736cc2@iki.fi> Message-ID: On Sat, Aug 19, 2017 at 9:22 AM, Pauli Virtanen wrote: > While the intention of making it harder to write code with bugs > is good, it should not come at the cost of having everyone fix > their old scripts, which worked correctly previously, but then > suddenly stop working. This is a good point. Deprecating anything in such a widely used library has a very big cost that must be weighed against the benefits, and I agree that truth-testing on size=1 arrays is neither broken nor dangerous. IMO, it is a small refactoring hazard at worst. > Note also that I expect polling on this mailing list will not > reach the majority of the user base, [...] Yep. This thread was really just to test the waters. While there's no way to really reach out to the silent majority, I am going to at least make a github issue and summarize the points from this discussion there. I'm glad to see that the general response so far has been that this seems actionable (specifically, deprecating __nonzero__ on size=0 arrays). -------------- next part -------------- An HTML attachment was scrubbed... URL: From efiring at hawaii.edu Sat Aug 19 14:00:35 2017 From: efiring at hawaii.edu (Eric Firing) Date: Sat, 19 Aug 2017 08:00:35 -1000 Subject: [Numpy-discussion] Why are empty arrays False? In-Reply-To: References: <3816033a-89ba-5d5e-f3f6-c8b8c6f995c5@hawaii.edu> <31cc66a7-2d79-6dd1-ef09-cb2974736cc2@iki.fi> Message-ID: On 2017/08/19 7:18 AM, Michael Lamparski wrote: > While there's no way to really reach out to the silent majority, I am > going to at least make a github issue and summarize the points from this > discussion there. I'm glad to see that the general response so far has > been that this seems actionable (specifically, deprecating __nonzero__ > on size=0 arrays). No, that is the response you agree with; I don't think it is fair to characterize it as the "general response". From diagonaldevice at gmail.com Sat Aug 19 16:26:47 2017 From: diagonaldevice at gmail.com (Michael Lamparski) Date: Sat, 19 Aug 2017 16:26:47 -0400 Subject: [Numpy-discussion] Why are empty arrays False? In-Reply-To: References: <3816033a-89ba-5d5e-f3f6-c8b8c6f995c5@hawaii.edu> <31cc66a7-2d79-6dd1-ef09-cb2974736cc2@iki.fi> Message-ID: On Sat, Aug 19, 2017 at 2:00 PM, Eric Firing wrote: > On 2017/08/19 7:18 AM, Michael Lamparski wrote: > >> While there's no way to really reach out to the silent majority, I am >> going to at least make a github issue and summarize the points from this >> discussion there. I'm glad to see that the general response so far has >> been that this seems actionable (specifically, deprecating __nonzero__ on >> size=0 arrays). >> > > No, that is the response you agree with; I don't think it is fair to > characterize it as the "general response". > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > With regards to gauging "general response," all I'm really trying to do is gauge the likelihood of my issue getting closed right away without action, if I were to file one (e.g. has this issue already been discussed, with a decision to leave things as they are?), because I don't want to waste my time and others' by creating an issue for something that is never going to happen.
I've gotten the impression from this conversation that this change (specifically for size=0) *is* possible, especially since two people with a decent history of contribution to the numpy repository have voiced approval for the change. As I see it, opening an issue will at least invite some more discussion, and at best motivate a change. To me, that is a "generally positive response." --- ...but there's also more to it beyond the "general response." From your words, I get the impression that you believe that I am simply ignoring your comments or do not value them, simply because they go against mine. Please understand: I *don't* enjoy the fact that truthiness of numpy arrays works differently from lists! And there's plenty else that I don't enjoy about numpy, too; I *don't* enjoy the fact that I need to change a whole bunch of `assert a == b` statements to `assert (a == b).all()` after changing the type of some tuple to an array. I *don't* enjoy how numpy's auto-magical shape-finding makes it nearly impossible to have an array of heterogeneous tuples. But over the years, I've also put a considerable amount of time and thought into understanding *why* these design choices were made. Library design is a difficult beast. Every design decision you make can interact in unexpected ways with all of your other decisions, and eventually you have to accept the fact that you can't always have your cake and eat it too. And designing a library like numpy, the library to end all libraries for working with numerical data? That is h-a-r-d HARD. That borders on programming-language-design hard. The fact of the matter is that *I agree with you.* Truthiness SHOULD denote emptiness for python types....but I have already considered this, and weighed it against every other design consideration that came to mind. In the end, those other design considerations won out, and "scalar evaluation/any()/all()" is the lesser of two evils. To convince me personally, you need to start by presenting something novel that I haven't thought about. There will be opportunity for others to do the same on Github. Please; I live for discussions about pitfalls in language and library design! -Michael -------------- next part -------------- An HTML attachment was scrubbed... URL: From efiring at hawaii.edu Sat Aug 19 16:36:15 2017 From: efiring at hawaii.edu (Eric Firing) Date: Sat, 19 Aug 2017 10:36:15 -1000 Subject: [Numpy-discussion] Why are empty arrays False? In-Reply-To: References: <3816033a-89ba-5d5e-f3f6-c8b8c6f995c5@hawaii.edu> <31cc66a7-2d79-6dd1-ef09-cb2974736cc2@iki.fi> Message-ID: <05ae7890-bffe-219e-c627-3694de54a8fd@hawaii.edu> On 2017/08/19 10:26 AM, Michael Lamparski wrote: > There will be opportunity for others to do the same on Github. Please; I > live for discussions about pitfalls in language and library design! > Thank you for your thoughtful discussion. Eric From njs at pobox.com Sat Aug 19 17:24:26 2017 From: njs at pobox.com (Nathaniel Smith) Date: Sat, 19 Aug 2017 14:24:26 -0700 Subject: [Numpy-discussion] Why are empty arrays False? In-Reply-To: <3816033a-89ba-5d5e-f3f6-c8b8c6f995c5@hawaii.edu> References: <3816033a-89ba-5d5e-f3f6-c8b8c6f995c5@hawaii.edu> Message-ID: On Fri, Aug 18, 2017 at 7:34 PM, Eric Firing wrote: > I don't agree. I think the consistency between bool([]) and bool(array([])) > is worth preserving. Nothing you have shown is inconsistent with "Falseness > is emptiness", which is quite fundamental in Python. The inconsistency is > in distinguishing between 1 element and more than one element.
To be > consistent, bool(array([0])) and bool(array([0, 1])) should both be True. > Contrary to the ValueError message, there need be no ambiguity, any more > than there is an ambiguity in bool([1, 2]). Yeah, this is a mess. But we're definitely not going to make bool(array([0])) be True. That would break tons of code that currently relies on the current behavior. And the current behavior does make sense, in every case except empty arrays: bool broadcasts over the array, and then, oh shoot, Python requires that bool's return value be a scalar, so if this results in anything besides an array of size 1, raise an error. OTOH you can't really write code that depends on using the current bool(array([])) semantics for emptiness checking, unless the only two cases you care about are "empty" and "non-empty with exactly one element and that element is truthy". So it's much less likely that changing that will break existing code, plus any code that does break was already likely broken in subtle ways. The consistency-with-Python argument cuts two ways: if an array is a container, then for consistency bool should do emptiness checking. If an array is a bunch of scalars with broadcasting, then for consistency bool should do truthiness checking on the individual elements and raise an error on any array with size != 1. So we can't just rely on consistency-with-Python to resolve the argument -- we need to pick one :-). Though internal consistency within numpy would argue for the latter option, because numpy almost always prefers the bag-of-scalars semantics over the container semantics, e.g. for + and *, like Eric Wieser mentioned. Though there are exceptions like iteration. ...Though actually, iteration and indexing by scalars try to be consistent with Python in yet a third way. They pretend that an array is a unidimensional container holding a bunch of arrays: In [3]: np.array([[1]])[0] Out[3]: array([1]) In [4]: next(iter(np.array([[1]]))) Out[4]: array([1]) So according to this model, bool(np.array([])) should be False, but bool(np.array([[]])) should be True (note that with lists, bool([[]]) is True). But alas: In [5]: bool(np.array([])), bool(np.array([[]])) Out[5]: (False, False) -n -- Nathaniel J. Smith -- https://vorpus.org From m.h.vankerkwijk at gmail.com Sat Aug 19 18:05:50 2017 From: m.h.vankerkwijk at gmail.com (Marten van Kerkwijk) Date: Sat, 19 Aug 2017 18:05:50 -0400 Subject: [Numpy-discussion] Why are empty arrays False? In-Reply-To: References: <3816033a-89ba-5d5e-f3f6-c8b8c6f995c5@hawaii.edu> <31cc66a7-2d79-6dd1-ef09-cb2974736cc2@iki.fi> Message-ID: Agreed with Eric Wieser here: having an empty array test as `False` is less than useless, since a non-empty array either returns something based on its contents or an error. This means that one cannot write statements like `if array:`. Does this leave any use case? It seems to me it just shows there is no point in defining the truthiness of an empty array. -- Marten From ben.v.root at gmail.com Mon Aug 21 10:34:22 2017 From: ben.v.root at gmail.com (Benjamin Root) Date: Mon, 21 Aug 2017 10:34:22 -0400 Subject: [Numpy-discussion] Why are empty arrays False? In-Reply-To: References: <3816033a-89ba-5d5e-f3f6-c8b8c6f995c5@hawaii.edu> <31cc66a7-2d79-6dd1-ef09-cb2974736cc2@iki.fi> Message-ID: I've long ago stopped doing any "emptiness is false"-type tests on any python containers when iterators and generators became common, because they always return True.
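Concretely -- a plain iterator or generator defines neither __len__ nor __bool__, so truth-testing falls back to the default "object exists" and always succeeds:

>>> bool(iter([]))
True
>>> bool(x for x in [])
True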
Ben On Sat, Aug 19, 2017 at 6:05 PM, Marten van Kerkwijk < m.h.vankerkwijk at gmail.com> wrote: > Agreed with Eric Wieser here: having an empty array test as `False` is > less than useless, since a non-empty array either returns something > based on its contents or an error. This means that one cannot write > statements like `if array:`. Does this leave any use case? It seems to > me it just shows there is no point in defining the truthiness of an > empty array. > -- Marten > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Tue Aug 22 12:31:43 2017 From: chris.barker at noaa.gov (Chris Barker) Date: Tue, 22 Aug 2017 09:31:43 -0700 Subject: [Numpy-discussion] Why are empty arrays False? In-Reply-To: References: <3816033a-89ba-5d5e-f3f6-c8b8c6f995c5@hawaii.edu> <31cc66a7-2d79-6dd1-ef09-cb2974736cc2@iki.fi> Message-ID: On Mon, Aug 21, 2017 at 7:34 AM, Benjamin Root wrote: > I've long ago stopped doing any "emptiness is false"-type tests on any > python containers when iterators and generators became common, because they > always return True. > good point. Personally, I've thought for years that Python's "Truthiness" concept is a wart. Sure, empty sequences, and zero values are often "False" in nature, but truthiness really is application-dependent -- in particular, sometimes a value of zero is meaningful, and sometimes not. Is it really so hard to write:

if len(seq) == 0:

or

if x == 0:

or

if arr.size == 0:

or

arr.shape == (0,0):

And then you are being far more explicit about what the test really is. And thanks Ben, for pointing out the issue with iterables. One more example of how Python has really changed its focus: Python 2 (or maybe, Python 1.5) was all about sequences. Python 3 is all about iterables -- and the "empty is False" concept does not map well to iterables.... As to the topic at hand, if we had it to do again, I would NOT make an array that happens to hold a single value act like a scalar for bool() -- a 1-D array that happens to be length-1 really is a different beast than a scalar. But we don't have it to do again -- so we probably need to keep it as it is for backward compatibility. -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From diagonaldevice at gmail.com Tue Aug 22 14:04:25 2017 From: diagonaldevice at gmail.com (Michael Lamparski) Date: Tue, 22 Aug 2017 14:04:25 -0400 Subject: [Numpy-discussion] Why are empty arrays False? In-Reply-To: References: <3816033a-89ba-5d5e-f3f6-c8b8c6f995c5@hawaii.edu> <31cc66a7-2d79-6dd1-ef09-cb2974736cc2@iki.fi> Message-ID: On Tue, Aug 22, 2017 at 12:31 PM, Chris Barker wrote: > Personally, I've thought for years that Python's "Truthiness" concept is a wart. > Sure, empty sequences, and zero values are often "False" in nature, > but truthiness really is application-dependent -- in particular, sometimes > a value of zero is meaningful, and sometimes not. I think truthiness is easily a wart in any dynamically-typed language (and yet ironically, every language I can think of that has truthiness is dynamically typed except for C++).
And yet for some reason it seems to be pressed forward as idiomatic in python, and for that reason alone, I use it. These are questions I ask myself on a daily basis, just to support this strange idiom:

- How close to the public API is this argument?
- Is '' a reasonable value for this string?
- How about an empty tuple? Empty set?
- Should this sentinel value be None or a new object()?
- Is this list local to this function?
- Is the type of this optional argument always True?
- How liable are these answers to change with future refactoring?

which seems like a pretty big laundry list to keep in check for what's supposed to be syntactic sugar. In the end, I will admit that I think my code "looks nice," but I think that's only because I've gotten used to seeing it! After answering all of these questions I tend to find that truthiness is seldom usable in any sort of generic code. These are the kinds of places where I usually find myself using truthiness instead, and all involve working with objects of known type:

# 1. A list used as a stack
while stack:
    top = stack.pop()
    ...

from warnings import warn  # needed for snippet 3

def read_config(d):
    # 2. Empty default value for a mutable argument that I don't mutate
    d = dict(d or {})
    a = d.pop('a')
    b = d.pop('b')
    ...
    # 3. Validating configuration
    if d:
        warn('unrecognized config keys: {!r}'.format(list(d)))

# 4. Oddball cases, e.g. the "linked list" (a, (b, (c, (d, (e, None)))))
def iter_linked_list(node):
    while node:
        value, node = node
        yield value

# 5. ...more oddball stuff...
def format_call(f, *args, **kw):
    arg_s = ', '.join(repr(x) for x in args)
    kw_s = ', '.join(f'{k!s}={v!r}' for k, v in kw.items())
    sep = ', ' if args and kw else ''
    return f'{f.__name__}({arg_s}{sep}{kw_s})'

Meanwhile, for an arbitrary iterator taken as an argument, if you want it to have at least one element for some reason, then good luck; truthiness will not help you. -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Tue Aug 22 17:48:14 2017 From: chris.barker at noaa.gov (Chris Barker) Date: Tue, 22 Aug 2017 14:48:14 -0700 Subject: [Numpy-discussion] Why are empty arrays False? In-Reply-To: References: <3816033a-89ba-5d5e-f3f6-c8b8c6f995c5@hawaii.edu> <31cc66a7-2d79-6dd1-ef09-cb2974736cc2@iki.fi> Message-ID: On Tue, Aug 22, 2017 at 11:04 AM, Michael Lamparski < diagonaldevice at gmail.com> wrote: > I think truthiness is easily a wart in any dynamically-typed language (and > yet ironically, every language I can think of that has truthiness is > dynamically typed except for C++). And yet for some reason it seems to be > pressed forward as idiomatic in python, and for that reason alone, I use > it. > me too :-) > Meanwhile, for an arbitrary iterator taken as an argument, if you want it > to have at least one element for some reason, then good luck; truthiness > will not help you. > of course, nor will len() And this is mostly OK, as if you are taking an arbitrary iterable, then you are probably going to, well, iterate over it, and: for this in an_empty_iterable: ... works fine. But bringing it back OT -- it's all a bit messy, but there is logic for the existing conventions in numpy -- and I think backward compatibility is more important than a slightly cleaner API. -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed...
From renato.fabbri at gmail.com  Thu Aug 24 09:53:27 2017
From: renato.fabbri at gmail.com (Renato Fabbri)
Date: Thu, 24 Aug 2017 10:53:27 -0300
Subject: [Numpy-discussion] power function distribution or power-law
 distribution?
Message-ID:

numpy.random.power.__doc__

uses only the term "power function distribution".

I cannot find a comparison between this term and "power-law distribution"
and am quite interested to know if they are simply synonyms.

Any ideas?

BTW, how is this list related to numpy-discussion at scipy.org?

--
Renato Fabbri
GNU/Linux User #479299
labmacambira.sourceforge.net

From pav at iki.fi  Thu Aug 24 10:07:00 2017
From: pav at iki.fi (Pauli Virtanen)
Date: Thu, 24 Aug 2017 16:07:00 +0200
Subject: [Numpy-discussion] power function distribution or power-law
 distribution?
In-Reply-To:
References:
Message-ID: <1503583620.2351.9.camel@iki.fi>

On Thu, 2017-08-24 at 10:53 -0300, Renato Fabbri wrote:
> numpy.random.power.__doc__
>
> uses only the term "power function distribution".

The documentation in the most recent Numpy version seems to be more
explicit, see the Notes section for the PDF:

https://docs.scipy.org/doc/numpy/reference/generated/numpy.random.power.html

> BTW, how is this list related to numpy-discussion at scipy.org?

That's the old address of this list.
The current address is numpy-discussion at python.org and it should be
used instead.

--
Pauli Virtanen

From renato.fabbri at gmail.com  Thu Aug 24 10:41:14 2017
From: renato.fabbri at gmail.com (Renato Fabbri)
Date: Thu, 24 Aug 2017 11:41:14 -0300
Subject: [Numpy-discussion] power function distribution or power-law
 distribution?
In-Reply-To: <1503583620.2351.9.camel@iki.fi>
References: <1503583620.2351.9.camel@iki.fi>
Message-ID:

Thanks for the reply.

But the question remains: how are the terms "power function distribution"
and "power-law distribution" related?

The documentation link you sent has no information on this.
(And it seems the same as what I get here:

In [6]: n.version.full_version
Out[6]: '1.11.0'
)

On Thu, Aug 24, 2017 at 11:07 AM, Pauli Virtanen wrote:
> The documentation in the most recent Numpy version seems to be more
> explicit, see the Notes section for the PDF: [...]

--
Renato Fabbri
GNU/Linux User #479299
labmacambira.sourceforge.net

From nathan12343 at gmail.com  Thu Aug 24 10:47:51 2017
From: nathan12343 at gmail.com (Nathan Goldbaum)
Date: Thu, 24 Aug 2017 09:47:51 -0500
Subject: [Numpy-discussion] power function distribution or power-law
 distribution?
In-Reply-To:
References: <1503583620.2351.9.camel@iki.fi>
Message-ID:

The latest version of numpy is 1.13.

In this case, as described in the docs, a power function distribution is
one with a probability density function of the form a*x^(a-1) for x
between 0 and 1.
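For reference, the density quoted above integrates to one for any a > 0,
and its mean has a simple closed form; both identities come straight from
the stated pdf and are used in the numerical sketch later in the thread:

    \int_0^1 a x^{a-1} \, dx = \left[ x^a \right]_0^1 = 1,
    \qquad
    E[X] = \int_0^1 x \cdot a x^{a-1} \, dx = \frac{a}{a+1}.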
From renato.fabbri at gmail.com  Thu Aug 24 10:56:46 2017
From: renato.fabbri at gmail.com (Renato Fabbri)
Date: Thu, 24 Aug 2017 11:56:46 -0300
Subject: [Numpy-discussion] power function distribution or power-law
 distribution?
In-Reply-To:
References: <1503583620.2351.9.camel@iki.fi>
Message-ID:

On Thu, Aug 24, 2017 at 11:47 AM, Nathan Goldbaum wrote:

> The latest version of numpy is 1.13.
>
> In this case, as described in the docs, a power function distribution is
> one with a probability density function of the form a*x^(a-1) for x
> between 0 and 1.

ok, let's try ourselves to relate the terms.

Would you agree that the "power function distribution" is a "power-law
distribution" in which the domain is restricted to be [0,1]?

--
Renato Fabbri
GNU/Linux User #479299
labmacambira.sourceforge.net
From josef.pktd at gmail.com  Thu Aug 24 12:57:43 2017
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Thu, 24 Aug 2017 12:57:43 -0400
Subject: [Numpy-discussion] power function distribution or power-law
 distribution?
In-Reply-To:
References: <1503583620.2351.9.camel@iki.fi>
Message-ID:

On Thu, Aug 24, 2017 at 10:56 AM, Renato Fabbri wrote:

> ok, let's try ourselves to relate the terms.
> Would you agree that the "power function distribution" is a "power-law
> distribution" in which the domain is restricted to be [0,1]?

I would phrase it weaker. The emphasis for a power-law distribution is
often, or commonly, on the tail behavior.

The functional form of the pdf is the same as the power-law distribution
but restricted to a finite interval [0,1]; or: the power function
distribution can be considered as a truncated power-law distribution.

(I looked at it maybe 9 years ago, but gave up on the similarity because
the purpose is very different, at least based on what I looked at at the
time. The similarity in name also got me confused initially.)

Josef
From robert.kern at gmail.com  Thu Aug 24 13:19:27 2017
From: robert.kern at gmail.com (Robert Kern)
Date: Thu, 24 Aug 2017 10:19:27 -0700
Subject: [Numpy-discussion] power function distribution or power-law
 distribution?
In-Reply-To:
References: <1503583620.2351.9.camel@iki.fi>
Message-ID:

On Thu, Aug 24, 2017 at 7:56 AM, Renato Fabbri wrote:

> ok, let's try ourselves to relate the terms.
> Would you agree that the "power function distribution" is a "power-law
> distribution" in which the domain is restricted to be [0,1]?

I probably wouldn't. The coincidental similarity in functional form
(domain and normalizing constants notwithstanding) obscures the very
different mechanisms each represent.

The ambiguous name of the method `power` instead of `power_function` is
my fault. You have my apologies.

--
Robert Kern

From josef.pktd at gmail.com  Thu Aug 24 13:24:44 2017
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Thu, 24 Aug 2017 13:24:44 -0400
Subject: [Numpy-discussion] power function distribution or power-law
 distribution?
In-Reply-To:
References: <1503583620.2351.9.camel@iki.fi>
Message-ID:

On Thu, Aug 24, 2017 at 12:57 PM, josef.pktd at gmail.com wrote:

> The functional form of the pdf is the same as the power-law distribution
> but restricted to a finite interval [0,1]; or: the power function
> distribution can be considered as a truncated power-law distribution.

Based on what I start to remember: the power function distribution can
have an increasing pdf. Because of the truncation it does not need the
same parameter restriction as the power-law distribution in order to
integrate to a finite value, so it can be normalized to a proper
distribution.

Josef
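A small numerical illustration of both points above -- np.random.power(a)
really does sample the density a*x^(a-1) on [0, 1], and the truncation is
what makes normalization work for any a > 0. This is a sketch, not from
the thread itself; it assumes scipy is available for the quadrature, and
a = 3.0 is an arbitrary choice:

import numpy as np
from scipy.integrate import quad

a = 3.0
samples = np.random.power(a, size=200000)

# For the power function distribution, E[X] = a / (a + 1),
# so the sample mean should be close to 0.75 for a = 3.
print(samples.mean(), a / (a + 1.0))

# On [0, 1] the kernel x**(a-1) has a finite integral (1/a) for any
# a > 0, so it can always be normalized ...
print(quad(lambda x: x ** (a - 1.0), 0.0, 1.0)[0])

# ... whereas the power-law kernel x**(-a) on [1, inf) only integrates
# to a finite value for a > 1 -- Josef's point about parameter
# restrictions in the untruncated case.
print(quad(lambda x: x ** (-a), 1.0, np.inf)[0])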
From ndbecker2 at gmail.com  Mon Aug 28 15:20:17 2017
From: ndbecker2 at gmail.com (Neal Becker)
Date: Mon, 28 Aug 2017 19:20:17 +0000
Subject: [Numpy-discussion] Interface numpy arrays to Matlab?
Message-ID:

I've searched but haven't found any decent answer. I need to call Matlab
from python. Matlab has a python module for this purpose, but it doesn't
understand numpy AFAICT. What solutions are there for efficiently
interfacing numpy arrays to Matlab?

Thanks,
Neal

From shoyer at gmail.com  Mon Aug 28 16:21:41 2017
From: shoyer at gmail.com (Stephan Hoyer)
Date: Mon, 28 Aug 2017 13:21:41 -0700
Subject: [Numpy-discussion] Interface numpy arrays to Matlab?
In-Reply-To:
References:
Message-ID:

If you can use Octave instead of Matlab, I've had a very good experience
with Oct2Py:

https://github.com/blink1073/oct2py

From perimosocordiae at gmail.com  Mon Aug 28 16:29:25 2017
From: perimosocordiae at gmail.com (CJ Carey)
Date: Mon, 28 Aug 2017 16:29:25 -0400
Subject: [Numpy-discussion] Interface numpy arrays to Matlab?
In-Reply-To:
References:
Message-ID:

Looks like Transplant can handle this use-case.

Blog post: http://bastibe.de/2015-11-03-matlab-engine-performance.html
GitHub link: https://github.com/bastibe/transplant

I haven't given it a try myself, but it looks promising.

From grlee77 at gmail.com  Mon Aug 28 17:27:00 2017
From: grlee77 at gmail.com (Gregory Lee)
Date: Mon, 28 Aug 2017 17:27:00 -0400
Subject: [Numpy-discussion] Interface numpy arrays to Matlab?
In-Reply-To:
References:
Message-ID:

I have not used Transplant, but it sounds fairly similar to
Python-matlab-bridge. We currently optionally call Matlab via
Python-matlab-bridge in some of the tests for the PyWavelets package.

https://arokem.github.io/python-matlab-bridge/
https://github.com/arokem/python-matlab-bridge

I would be interested in hearing about the benefits/drawbacks relative to
Transplant if there is anyone who has used both.
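For a sense of what the Oct2Py route mentioned above looks like in
practice, a minimal sketch along the lines of the oct2py documentation
(the convenience `octave` instance dispatches attribute calls to Octave
functions and converts numpy arrays in both directions; the particular
session below is an illustration, not something from this thread):

import numpy as np
from oct2py import octave

x = np.arange(12.0).reshape(3, 4)

# Attribute access runs the corresponding Octave function; numpy arrays
# are converted on the way in and on the way out.
col_sums = octave.sum(x)    # Octave sum() works down the columns
z = octave.zeros(3, 3)      # comes back as a (3, 3) numpy array

print(col_sums, z.shape)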
From ndbecker2 at gmail.com  Tue Aug 29 07:08:57 2017
From: ndbecker2 at gmail.com (Neal Becker)
Date: Tue, 29 Aug 2017 11:08:57 +0000
Subject: [Numpy-discussion] Interface numpy arrays to Matlab?
In-Reply-To:
References:
Message-ID:

Transplant sounds interesting, I think I could use this. I don't
understand, though, why nobody has used a more direct approach. Matlab
has their python API
https://www.mathworks.com/help/matlab/matlab-engine-for-python.html.
This will pass Matlab arrays to/from python as some kind of opaque blob.
I would guess that inside every Matlab array is a numpy array crying to
be freed -- in both cases an array is a block of memory together with
shape and stride information. So I would hope a direct conversion could
be done, at least via the C API if not directly with the python numpy
API. But it seems nobody has done this, so maybe it's not that simple?
From deak.andris at gmail.com  Tue Aug 29 07:44:36 2017
From: deak.andris at gmail.com (Andras Deak)
Date: Tue, 29 Aug 2017 13:44:36 +0200
Subject: [Numpy-discussion] Interface numpy arrays to Matlab?
In-Reply-To:
References:
Message-ID:

On Tue, Aug 29, 2017 at 1:08 PM, Neal Becker wrote:
> [...]
> [I] would guess that inside every Matlab array is a numpy array crying
> to be freed -- in both cases an array is a block of memory together
> with shape and stride information. So I would hope a direct conversion
> could be done, at least via the C API if not directly with the python
> numpy API. But it seems nobody has done this, so maybe it's not that
> simple?

I was going to suggest this Stack Overflow post earlier but figured that
you must have found it already:

https://stackoverflow.com/questions/34155829/how-to-efficiently-convert-matlab-engine-arrays-to-numpy-ndarray

Based on that it seems that at least arrays returned from the MATLAB
engine can be reasonably converted using their underlying data (`_data`
attribute, together with the `size` attribute to unravel multidimensional
arrays). The other way around (i.e. passing numpy arrays to the MATLAB
engine) seems less straightforward: all I could find was

https://www.mathworks.com/matlabcentral/answers/216498-passing-numpy-ndarray-from-python-to-matlab

The comments there suggest that you can instantiate `matlab.double`
objects from lists that you can pass to the MATLAB engine. Explicitly
converting your arrays to lists along this step doesn't sound too good
to me.

Disclaimer: I haven't tried either method.

Regards,

András Deák
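Putting the two halves of that description together, the round trip would
look roughly like the sketch below. To be clear about what is and is not
official: `matlab.double` and the engine calls are MathWorks' documented
Python API, while the `_data` and `size` attributes are the undocumented
internals the Stack Overflow post relies on -- so this is untested and
fragile by construction:

import numpy as np
import matlab
import matlab.engine

eng = matlab.engine.start_matlab()

# numpy -> MATLAB: the documented (and slow) path goes through nested
# Python lists.
a = np.arange(6.0).reshape(2, 3)
ma = matlab.double(a.tolist())

mb = eng.transpose(ma)

# MATLAB -> numpy: wrap the flat internal buffer and restore the shape;
# MATLAB stores arrays in column-major (Fortran) order.
b = np.array(mb._data).reshape(mb.size, order='F')

print(b.shape)  # (3, 2)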
From chris.barker at noaa.gov  Tue Aug 29 14:52:47 2017
From: chris.barker at noaa.gov (Chris Barker)
Date: Tue, 29 Aug 2017 11:52:47 -0700
Subject: [Numpy-discussion] Interface numpy arrays to Matlab?
In-Reply-To:
References:
Message-ID:

On Tue, Aug 29, 2017 at 4:08 AM, Neal Becker wrote:

> Transplant sounds interesting, I think I could use this. I don't
> understand, though, why nobody has used a more direct approach. Matlab
> has their python API
> https://www.mathworks.com/help/matlab/matlab-engine-for-python.html.
> This will pass Matlab arrays to/from python as some kind of opaque
> blob. I would guess that inside every Matlab array is a numpy array
> crying to be freed -- in both cases an array is a block of memory
> together with shape and stride information.

I agree -- it is absolutely bizarre that they haven't built in a numpy
array <-> Matlab array mapping! Maybe they don't want Matlab users to
realize that numpy provides most of what MATLAB does (but better :-) ) --
and want people to use Python with MATLAB for other pythonic stuff that
MATLAB doesn't do well....

but they do provide a mapping for array.array:

https://www.mathworks.com/help/matlab/matlab_external/use-python-array-array-types.html

which is a buffer you can wrap a numpy array around efficiently.... odd
that you'd have to write that code.

-CHB

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov
From Catherine.M.Moroney at jpl.nasa.gov  Tue Aug 29 21:03:55 2017
From: Catherine.M.Moroney at jpl.nasa.gov (Moroney, Catherine M (398E))
Date: Wed, 30 Aug 2017 01:03:55 +0000
Subject: [Numpy-discussion] selecting all 2-d slices out of n-dimensional
 array
Message-ID: <89D45F18-79B8-454C-9A3E-8417AED831F2@jpl.nasa.gov>

Hello,

I have an n-dimensional array (say (4,4,2,2)) and I wish to automatically
extract all the (4,4) slices in it. i.e.

a = numpy.arange(0, 64).reshape(4,4,2,2)

slice1 = a[..., 0, 0]
slice2 = a[..., 0, 1]
slice3 = a[..., 1, 0]
slice4 = a[..., 1, 1]

Simple enough example, but in my case array "a" will have unknown rank
and size. All I know is that it will have more than 2 dimensions, but I
don't know ahead of time how many dimensions or what the size of those
dimensions are.

What is the best way of tackling this problem without writing a whole
bunch of if-then cases depending on what the rank and shape of a is? Is
there a one-size-fits-all solution?

I'm using python 2.7 and numpy 1.8.2

Thanks for any advice,

Catherine
From robert.kern at gmail.com  Tue Aug 29 21:47:21 2017
From: robert.kern at gmail.com (Robert Kern)
Date: Tue, 29 Aug 2017 18:47:21 -0700
Subject: [Numpy-discussion] selecting all 2-d slices out of n-dimensional
 array
In-Reply-To: <89D45F18-79B8-454C-9A3E-8417AED831F2@jpl.nasa.gov>
References: <89D45F18-79B8-454C-9A3E-8417AED831F2@jpl.nasa.gov>
Message-ID:

On Tue, Aug 29, 2017 at 6:03 PM, Moroney, Catherine M (398E) wrote:

> I have an n-dimensional array (say (4,4,2,2)) and I wish to
> automatically extract all the (4,4) slices in it.
> [...]
> What is the best way of tackling this problem without writing a whole
> bunch of if-then cases depending on what the rank and shape of a is?
> Is there a one-size-fits-all solution?

First, reshape the array to (4, 4, -1). The -1 tells the method to choose
whatever's needed to get the size to work out. Then roll the last axis to
the front, and then you have a sequence of the (4, 4) arrays that you
wanted.

E.g. (using (4,4,3,3) as the original shape for clarity)

[~]
|26> a = numpy.arange(0, 4*4*3*3).reshape(4,4,3,3)

[~]
|27> b = a.reshape([4, 4, -1])

[~]
|28> b.shape
(4, 4, 9)

[~]
|29> c = np.rollaxis(b, -1, 0)

[~]
|30> c.shape
(9, 4, 4)

[~]
|31> c[0]
array([[  0,   9,  18,  27],
       [ 36,  45,  54,  63],
       [ 72,  81,  90,  99],
       [108, 117, 126, 135]])

[~]
|32> c[1]
array([[  1,  10,  19,  28],
       [ 37,  46,  55,  64],
       [ 73,  82,  91, 100],
       [109, 118, 127, 136]])

--
Robert Kern
From jladasky at itu.edu  Tue Aug 29 22:13:08 2017
From: jladasky at itu.edu (John Ladasky)
Date: Tue, 29 Aug 2017 19:13:08 -0700
Subject: [Numpy-discussion] selecting all 2-d slices out of n-dimensional
 array
In-Reply-To:
References: <89D45F18-79B8-454C-9A3E-8417AED831F2@jpl.nasa.gov>
Message-ID:

Nice solution, Robert. My solution was not idiomatic Numpy, but it was
idiomatic Python:

def slice2d(arr):
    xmax, ymax = arr.shape[-2:]
    return (arr[..., x, y] for x in range(xmax) for y in range(ymax))

On Tue, Aug 29, 2017 at 6:47 PM, Robert Kern wrote:

> First, reshape the array to (4, 4, -1). The -1 tells the method to
> choose whatever's needed to get the size to work out. Then roll the
> last axis to the front, and then you have a sequence of the (4, 4)
> arrays that you wanted.

--
John J. Ladasky Jr., Ph.D.
Research Scientist
International Technological University
2711 N. First St, San Jose, CA 95134 USA
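For the thread's general case -- unknown rank, with the first two axes
kept -- Robert's recipe generalizes to a small helper like the sketch
below (np.rollaxis rather than np.moveaxis, since Catherine's numpy 1.8.2
predates moveaxis; the function name is made up for illustration):

import numpy as np

def first_axes_2d_slices(a):
    """Return the 2-d slices spanned by the first two axes, i.e.
    a[:, :, i, j, ...] for every combination of trailing indices,
    generalizing Robert's (4, 4, -1) reshape to any rank >= 2."""
    if a.ndim < 2:
        raise ValueError("need an array with at least 2 dimensions")
    # Collapse all trailing axes into one, then bring that combined
    # axis to the front so that the result's first index walks over
    # the 2-d slices.
    b = a.reshape(a.shape[0], a.shape[1], -1)
    return np.rollaxis(b, -1, 0)

a = np.arange(4 * 4 * 2 * 2).reshape(4, 4, 2, 2)
for s in first_axes_2d_slices(a):
    assert s.shape == (4, 4)   # four slices, matching a[..., i, j]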