From miles.cranmer at gmail.com  Mon Oct  1 14:36:14 2018
From: miles.cranmer at gmail.com (Miles Cranmer)
Date: Mon, 1 Oct 2018 14:36:14 -0400
Subject: [Numpy-discussion] Fwd: Performance feature for np.isin and np.in1d
In-Reply-To: <CABBNNy=1Mb7fgtO=UZKfxCQqzx+b-D+bX-1rHrNfjY2mhQU-Ew@mail.gmail.com>
References: <CABBNNy=1Mb7fgtO=UZKfxCQqzx+b-D+bX-1rHrNfjY2mhQU-Ew@mail.gmail.com>
Message-ID: <CABBNNykPycxm5hX2ZX_aAS=wKKDYYL8kEoGCxbd2F_LJS_T-hw@mail.gmail.com>

(Not sure what the right list is for this)

Hi,

I have started a PR for a "fast_integers" flag for np.isin and np.in1d
which greatly increases performance when both arrays are integral. It works
by creating a boolean array with elements set to 1 where the parent array
(ar2) has elements and 0 otherwise. This array is then indexed by the child
array (ar1) to create the output.

https://github.com/numpy/numpy/pull/12065

Thoughts on this? Please let me know if you have any questions about my
addition.

Thank you.
Best regards,
Miles
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181001/dac32f0f/attachment.html>

From charlesr.harris at gmail.com  Thu Oct  4 17:30:52 2018
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Thu, 4 Oct 2018 15:30:52 -0600
Subject: [Numpy-discussion] Deactivated appveyor
Message-ID: <CAB6mnx+qAm3+5LRJSxuxT=MUkBqwWDwHU7DaJnQXHR_Xv8m6qQ@mail.gmail.com>

Hi All,

This is just to notify everyone making PRs on github that I have
deactivated the appeveyor webhook now that azure testing seems to be
working for the windows tests. Azure is much faster and I expect that
travis or one of the other platforms will become the testing bottleneck. I
think the new tests can be activated by closing/opening PRs, and maybe it
will happen automatically on updates, we will see. Finding details of
failing tests is a bit of a hassle, but there is an obscure button at the
bottom of the default details page that you can click for actual details,
although you will still need to hunt around to find the pipeline. This is
still somewhat experimental, so post your feedback and complaints here.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181004/eb985c40/attachment.html>

From matti.picus at gmail.com  Fri Oct  5 04:31:20 2018
From: matti.picus at gmail.com (Matti Picus)
Date: Fri, 5 Oct 2018 11:31:20 +0300
Subject: [Numpy-discussion] Adding a hex version like PY_VERSION_HEX
Message-ID: <e9143c26-4da4-dd46-4c0f-812e7229eb9d@gmail.com>

In PR 12074 https://github.com/numpy/numpy/pull/12074 I propose adding a 
function `version.get_numpy_version_as_hex()` which returns a hex value 
to represent the current NumPy version MAJOR.MINOR.MICRO where

v = hex(MAJOR << 24 | MINOR << 16 | MICRO)

so the current 1.15.0 would become '0x10f0000'. I also made this 
avaiable via C through `get_hex_version`. The hex version is based on 
the |PY_VERSION_HEX| macro from CPython.

Currently we have a ABI version and an API version for the numpy C-API. 
We only increment those for updated or breaking changes in the NumPy 
C-API, but not for

- changes in behavior, especially in python code

- changes in sizes of outward-facing structures like PyArray_Desc

Occasionally it is desirable to determine backward compatibility from 
the runtime version, rather than from the ABI or API versions, and 
having it as a single value makes the comparison in C easy. For instance 
this may be convenient when there is suspicion that older header files 
may have been used to create or manipulate an object directly in C (or 
via a cython optimization), and we want to verify the version used to 
create the object, or when we may want to verify de-serialized objects. 
The `numpy.lib._version.NumpyVersion` class enables version comparison 
in python, but I would prefer a single value that can be stored in a C 
struct as an integer type.

Since this is an enhancement proposal, I am bringing the idea to the 
mailing list for reactions.

Matti

From Jerome.Kieffer at esrf.fr  Fri Oct  5 04:46:02 2018
From: Jerome.Kieffer at esrf.fr (Jerome Kieffer)
Date: Fri, 5 Oct 2018 10:46:02 +0200
Subject: [Numpy-discussion] Adding a hex version like PY_VERSION_HEX
In-Reply-To: <e9143c26-4da4-dd46-4c0f-812e7229eb9d@gmail.com>
References: <e9143c26-4da4-dd46-4c0f-812e7229eb9d@gmail.com>
Message-ID: <20181005104602.5c2c971c@lintaillefer.esrf.fr>

On Fri, 5 Oct 2018 11:31:20 +0300
Matti Picus <matti.picus at gmail.com> wrote:

> In PR 12074 https://github.com/numpy/numpy/pull/12074 I propose adding a 
> function `version.get_numpy_version_as_hex()` which returns a hex value 
> to represent the current NumPy version MAJOR.MINOR.MICRO where
> 
> v = hex(MAJOR << 24 | MINOR << 16 | MICRO)

+1

We use it in our code and it is a good practice, much better then 0.9.0>0.10.0 !

We added some support for dev, alpha, beta, RC and final versions in 
https://github.com/silx-kit/silx/blob/master/version.py

Cheers,
-- 
J?r?me Kieffer

From matti.picus at gmail.com  Sun Oct  7 02:24:22 2018
From: matti.picus at gmail.com (Matti Picus)
Date: Sun, 7 Oct 2018 09:24:22 +0300
Subject: [Numpy-discussion] Adding a hex version like PY_VERSION_HEX
In-Reply-To: <20181005104602.5c2c971c@lintaillefer.esrf.fr>
References: <e9143c26-4da4-dd46-4c0f-812e7229eb9d@gmail.com>
 <20181005104602.5c2c971c@lintaillefer.esrf.fr>
Message-ID: <a6eab922-8c06-8dad-d594-373b8c7d0f5f@gmail.com>

On 05/10/18 11:46, Jerome Kieffer wrote:
> On Fri, 5 Oct 2018 11:31:20 +0300
> Matti Picus <matti.picus at gmail.com> wrote:
>
>> In PR 12074 https://github.com/numpy/numpy/pull/12074 I propose adding a
>> function `version.get_numpy_version_as_hex()` which returns a hex value
>> to represent the current NumPy version MAJOR.MINOR.MICRO where
>>
>> v = hex(MAJOR << 24 | MINOR << 16 | MICRO)
> +1
>
> We use it in our code and it is a good practice, much better then 0.9.0>0.10.0 !
>
> We added some support for dev, alpha, beta, RC and final versions in
> https://github.com/silx-kit/silx/blob/master/version.py
>
> Cheers,
Thanks. I think at this point I will change the proposal to

v = hex(MAJOR << 24 | MINOR << 16 | MICRO << 8)

which leaves room for future enhancement with "release level" and "serial" as the lower bits.

Matti


From mark.harfouche at gmail.com  Sun Oct  7 10:32:11 2018
From: mark.harfouche at gmail.com (Mark Harfouche)
Date: Sun, 7 Oct 2018 10:32:11 -0400
Subject: [Numpy-discussion] ndrange, like range but multidimensiontal
Message-ID: <CAC=AwPwftbO53-SME=V0-TKhUZo18+0aooFYObx+c0PHuAz7Yg@mail.gmail.com>

Hi All,

I've been using numpy array objects to store collections of 2D (and soon
ND) variables. When iterating through these collections, I often found it
useful to use `ndindex`, which for `for loops` behaves much like `range`
with only a `stop` parameter.

That said, it lacks a few features that are now present in `range` are
missing from `ndindex`, most notably the ability to iterate over a subset
of the ndindex.

I found myself often writing `itertools.product(range(1, data.shapep[0]),
range(3, data.shape[2]))` for custom iterations. While it does flatten out
the for loop, it is arguable less readable than having 1 or 2 levels of
nested for loops.

It is quite possible that `nditer` would solve my problems, but
unfortunately I am still not able to make sense of then numerous options it
has.

I propose an `ndrange` class that can be used to iterate over
nd-collections mimicking the API of `range` as much as possible and
adapting it to the ND case (i.e. returning tuples instead of singletons).

Since this is an enhancement proposal, I am bringing the idea to the
mailing list for reactions.

The implementation in this PR https://github.com/numpy/numpy/pull/12094 is
based on keeping track of a tuple of python `range` range objects. The
`__iter__` method returns the result of `itertools.product(*self._ranges)`

By leveraging python's `range` implementation, operations like
`containement` `index`, `reversed`, `equality` and most importantly slicing
of the ndrange object are possible to offer to the general numpy audiance.

For example, iterating through a 2D collection but avoiding indexing the
first and last column used to look like this:

```
c = np.empty((4, 4), dtype=object)
# ... compute on c
for j in range(c.shape[0]):
     for i in range(1, c.shape[1]-1):
         c[j, i] # = compute on c[j, i] that depends on the index i, j
```

With `np.ndrange` it can look something like this:

```
c = np.empty((4, 4), dtype=object)
# ... compute on c
for i in np.ndrange(c.shape)[:, 1:-1]:
    c[i] # = some operation on c[i] that depends on the index i
```

very pythonic, very familiar to numpy users

Thank you for the feedback,

Mark

References:
An issue requesting expansion to the ndindex API on github:
https://github.com/numpy/numpy/issues/6393
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181007/3611d2f3/attachment.html>

From allanhaldane at gmail.com  Sun Oct  7 17:20:11 2018
From: allanhaldane at gmail.com (Allan Haldane)
Date: Sun, 7 Oct 2018 17:20:11 -0400
Subject: [Numpy-discussion] ndrange, like range but multidimensiontal
In-Reply-To: <CAC=AwPwftbO53-SME=V0-TKhUZo18+0aooFYObx+c0PHuAz7Yg@mail.gmail.com>
References: <CAC=AwPwftbO53-SME=V0-TKhUZo18+0aooFYObx+c0PHuAz7Yg@mail.gmail.com>
Message-ID: <e66db3d7-6aaa-f7f3-ebb7-b0b35db90b75@gmail.com>

On 10/07/2018 10:32 AM, Mark Harfouche wrote:
> With `np.ndrange` it can look something like this:
> 
> ```
> c = np.empty((4, 4), dtype=object)
> # ... compute on c
> for i in np.ndrange(c.shape)[:, 1:-1]:
>  ??? c[i] # = some operation on c[i] that depends on the index i
> ```
> 
> very pythonic, very familiar to numpy users

So if I understand, this does the same as `np.ndindex` but allows 
numpy-like slicing of the returned generator object, as requested in #6393.

I don't like the duplication in functionality between ndindex and 
ndrange here. Better rather to add the slicing functionality to ndindex, 
  than create a whole new nearly-identical function. np.ndindex is 
already a somewhat obscure and discouraged method since it is usually 
better to find a vectorized numpy operation instead of a for loop, and I 
don't like adding more obscure functions.

But as an improvement to np.ndindex, I think adding this functionality 
seems good if it can be nicely implemented. Maybe there is a way to use 
the same optimization tricks as in the current implementation of ndindex 
but allow different stop/step? A simple wrapper of ndindex?

Cheers,
Allan

From mark.harfouche at gmail.com  Mon Oct  8 12:21:40 2018
From: mark.harfouche at gmail.com (Mark Harfouche)
Date: Mon, 8 Oct 2018 12:21:40 -0400
Subject: [Numpy-discussion] ndrange, like range but multidimensiontal
Message-ID: <CAC=AwPwrrVz5P6Erpa0suc0wAsth7Aq80-DZeK=LDdKguum7rw@mail.gmail.com>

Allan,

Sorry for the delay. I had my mailing list preferences set to digest. I
changed them for now. (I hope this message continues that thread).

Thank you for your feedback. You are correct in identifying that the real
feature is expanding the `ndindex` API to support slicing. See comments
about the separate points you raised below

## Expanding the API of ndindex

> Better rather to add the slicing functionality to ndindex, than create a
whole new nearly-identical function.

This is a very important point. I should have included a note about it. My
[first attempt](
https://github.com/hmaarrfk/numpy/pull/1/files#diff-1bd953557a98073031ce66d05dbde3c8R663)
did try that approach.
I ran into 2 issues:
1. Getting around the catch-all positional argument is annoying, and logic
to do that will likely be error prone. Peculiarities about how we implement
it might cause some very strange for `tuple-like` inputs that we don't
expect.
2. `ndindex` is an iterator itself. As proposed, `ndrange`, like `range`,
is not an iterator. Changing this behaviour would likely lead to breaking
code that uses that assumption. For example anybody using introspection or
code like:

```
indx = np.ndindex(5, 5)
next(indx)  # Don't look at the (0, 0) coordinate
for i in indx:
    print(i)
```
would break if `ndindex` becomes "not an iterator"

For these two reasons, I thought it was easier to simply have a new class,
that seems like a close sibling to `ndindex`.

I personally don't care about point 1 so much. In my mind, start, stop and
step is confusing in ND. but maybe some might find it useful? Point 1 also
makes it harder to make `ndrange` more familiar to `range` users.

> I don't like adding more obscure functions

Hopefully the name `ndrange` makes it easier to find?

## Writing vectorized code

> np.ndindex is  already a somewhat obscure and discouraged method since it
is usually better to find a vectorized numpy operation instead of a for loop

I understand that this kind of function is not focused on `numerical`
operations on the elements of the matrix itself. It really is there to help
fill the void of any useful multi-dimensional python container.

I think `ndrange`/`ndindex` is there to be used like `np.vectorized`. I've
tried to use `np.vectorize` in my own code, but quickly found that making
logic fit into vectorize's requirements was often more complicated than
writing my own loop multi-nested loops. In my opinion, nested `range` loops
or `ndrange`/`ndindex` is a much more natural way to loop over collections
compared to `np.vectorized`.

I'm glad to add warnings to the docs.

## Implementation detail: itertools.product + range vs nditer

> Maybe there is a way to use  the same optimization tricks as in the
current implementation of ndindex  but allow different stop/step?

My primary goal here is to make `ndrange` behave much like `range`. By
implementing it on top of `range`, it makes it obvious to me how to enforce
that behaviour as the API of range gets expanded (though it seems to have
settled since Python 3.3). Whatever we decide to call `ndrange`/`ndindex`,
the tests I wrote can help ensure we have good range-API coverage (for now).

itertools.product + range seems to be much faster than the current
implementation of ndindex

(python 3.6)
```
%%timeit

for i in np.ndindex(100, 100):
    pass
3.94 ms ? 19.4 ?s per loop (mean ? std. dev. of 7 runs, 100 loops each)

%%timeit
import itertools
for i in itertools.product(range(100), range(100)):
    pass
231 ?s ? 1.09 ?s per loop (mean ? std. dev. of 7 runs, 1000 loops each)
```
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181008/9b3117db/attachment.html>

From allanhaldane at gmail.com  Mon Oct  8 15:33:18 2018
From: allanhaldane at gmail.com (Allan Haldane)
Date: Mon, 8 Oct 2018 15:33:18 -0400
Subject: [Numpy-discussion] ndrange, like range but multidimensiontal
In-Reply-To: <CAC=AwPwrrVz5P6Erpa0suc0wAsth7Aq80-DZeK=LDdKguum7rw@mail.gmail.com>
References: <CAC=AwPwrrVz5P6Erpa0suc0wAsth7Aq80-DZeK=LDdKguum7rw@mail.gmail.com>
Message-ID: <039bfa4b-5f14-0241-6fd6-a52b123ac176@gmail.com>

On 10/8/18 12:21 PM, Mark Harfouche wrote:
> 2. `ndindex` is an iterator itself. As proposed, `ndrange`, like
> `range`, is not an iterator. Changing this behaviour would likely lead
> to breaking code that uses that assumption. For example anybody using
> introspection or code like:
> 
> ```
> indx = np.ndindex(5, 5)
> next(indx)? # Don't look at the (0, 0) coordinate
> for i in indx:
> ??? print(i)
> ```
> would break if `ndindex` becomes "not an iterator"

OK, I see now. Just like python3 has separate range and range_iterator
types, where range is sliceable, we would have separate ndrange and
ndindex types, where ndrange is sliceable. You're just copying the
python3 api. That justifies it pretty well for me.

I still think we shouldn't have two functions which do nearly the same
thing. We should only have one, and get rid of the other. I see two ways
forward:

 * replace ndindex by your ndrange code, so it is no longer an iter.
   This would require some deprecation cycles for the cases that break.
 * deprecate ndindex in favor of a new function ndrange. We would keep
   ndindex around for back-compatibility, with a dep warning to use
   ndrange instead.

Doing a code search on github, I can see that a lot of people's code
would break if ndindex no longer was an iter. I also like the name
ndrange for its allusion to python3's range behavior. That makes me lean
towards the second option of a separate ndrange, with possible
deprecation of ndindex.

> itertools.product + range seems to be much faster than the current
> implementation of ndindex
> 
> (python 3.6)
> ```
> %%timeit
> 
> for i in np.ndindex(100, 100):
> ??? pass
> 3.94 ms ? 19.4 ?s per loop (mean ? std. dev. of 7 runs, 100 loops each)
> 
> %%timeit
> import itertools
> for i in itertools.product(range(100), range(100)):
> ??? pass
> 231 ?s ? 1.09 ?s per loop (mean ? std. dev. of 7 runs, 1000 loops each)
> ```

If the new code ends up faster than the old code, that's great, and
further justification for using ndrange instead of ndindex. I had
thought using nditer in the old code was fastest.

So as far as I am concerned, I say go ahead with the PR the way you are
doing it.

Allan

From shoyer at gmail.com  Mon Oct  8 16:25:14 2018
From: shoyer at gmail.com (Stephan Hoyer)
Date: Mon, 8 Oct 2018 13:25:14 -0700
Subject: [Numpy-discussion] ndrange, like range but multidimensiontal
In-Reply-To: <039bfa4b-5f14-0241-6fd6-a52b123ac176@gmail.com>
References: <CAC=AwPwrrVz5P6Erpa0suc0wAsth7Aq80-DZeK=LDdKguum7rw@mail.gmail.com>
 <039bfa4b-5f14-0241-6fd6-a52b123ac176@gmail.com>
Message-ID: <CAEQ_TvcP+E44bagTaV+XqZ0wopqgLv=JP1HyQYwibrecBjCOHw@mail.gmail.com>

I'm open to adding ndrange, and "soft-deprecating" ndindex (i.e.,
discouraging its use in our docs, but not actually deprecating it).
Certainly ndrange seems like a small but meaningful improvement in the
interface.

That said, I'm not convinced this is really worth the trouble. I think the
nested loop is still pretty readable/clear, and there are few times when
I've actually found ndindex() be useful.

On Mon, Oct 8, 2018 at 12:35 PM Allan Haldane <allanhaldane at gmail.com>
wrote:

> On 10/8/18 12:21 PM, Mark Harfouche wrote:
> > 2. `ndindex` is an iterator itself. As proposed, `ndrange`, like
> > `range`, is not an iterator. Changing this behaviour would likely lead
> > to breaking code that uses that assumption. For example anybody using
> > introspection or code like:
> >
> > ```
> > indx = np.ndindex(5, 5)
> > next(indx)  # Don't look at the (0, 0) coordinate
> > for i in indx:
> >     print(i)
> > ```
> > would break if `ndindex` becomes "not an iterator"
>
> OK, I see now. Just like python3 has separate range and range_iterator
> types, where range is sliceable, we would have separate ndrange and
> ndindex types, where ndrange is sliceable. You're just copying the
> python3 api. That justifies it pretty well for me.
>
> I still think we shouldn't have two functions which do nearly the same
> thing. We should only have one, and get rid of the other. I see two ways
> forward:
>
>  * replace ndindex by your ndrange code, so it is no longer an iter.
>    This would require some deprecation cycles for the cases that break.
>  * deprecate ndindex in favor of a new function ndrange. We would keep
>    ndindex around for back-compatibility, with a dep warning to use
>    ndrange instead.
>
> Doing a code search on github, I can see that a lot of people's code
> would break if ndindex no longer was an iter. I also like the name
> ndrange for its allusion to python3's range behavior. That makes me lean
> towards the second option of a separate ndrange, with possible
> deprecation of ndindex.
>
> > itertools.product + range seems to be much faster than the current
> > implementation of ndindex
> >
> > (python 3.6)
> > ```
> > %%timeit
> >
> > for i in np.ndindex(100, 100):
> >     pass
> > 3.94 ms ? 19.4 ?s per loop (mean ? std. dev. of 7 runs, 100 loops each)
> >
> > %%timeit
> > import itertools
> > for i in itertools.product(range(100), range(100)):
> >     pass
> > 231 ?s ? 1.09 ?s per loop (mean ? std. dev. of 7 runs, 1000 loops each)
> > ```
>
> If the new code ends up faster than the old code, that's great, and
> further justification for using ndrange instead of ndindex. I had
> thought using nditer in the old code was fastest.
>
> So as far as I am concerned, I say go ahead with the PR the way you are
> doing it.
>
> Allan
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181008/9d69b368/attachment-0001.html>

From rmcgibbo at gmail.com  Mon Oct  8 16:31:47 2018
From: rmcgibbo at gmail.com (Robert T. McGibbon)
Date: Mon, 8 Oct 2018 16:31:47 -0400
Subject: [Numpy-discussion] Determining NPY_ABI_VERSION statically in
 compiled extensions
Message-ID: <CAN4+E8Fpbs9Ug+v9eRJkSR3W_4S=opnz+DM92zOu_3Kcq8MYWg@mail.gmail.com>

Is anyone aware of any tricks that can be played with tools like `readelf`,
`nm` or `dlopen` / `dlsym` in order to statically determine what version of
numpy a fully-compiled C extension (for example, found inside a wheel) was
compiled against? Even if it only worked with relatively new versions of
numpy, that would be fine.

I'm interested in creating something similar to
https://github.com/pypa/auditwheel that could statically check for
compatibility between wheel files and python installations, in situations
where the metadata about how they were compiled is missing.
-- 
-Robert
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181008/2f53d801/attachment.html>

From matti.picus at gmail.com  Mon Oct  8 17:26:18 2018
From: matti.picus at gmail.com (Matti Picus)
Date: Tue, 9 Oct 2018 00:26:18 +0300
Subject: [Numpy-discussion] Determining NPY_ABI_VERSION statically in
 compiled extensions
In-Reply-To: <CAN4+E8Fpbs9Ug+v9eRJkSR3W_4S=opnz+DM92zOu_3Kcq8MYWg@mail.gmail.com>
References: <CAN4+E8Fpbs9Ug+v9eRJkSR3W_4S=opnz+DM92zOu_3Kcq8MYWg@mail.gmail.com>
Message-ID: <64c3f2fb-c0b3-c013-5402-7be5296e4c59@gmail.com>

On 08/10/18 23:31, Robert T. McGibbon wrote:
> Is anyone aware of any tricks that can be played with tools like 
> `readelf`, `nm` or `dlopen` / `dlsym` in order to statically determine 
> what version of numpy a fully-compiled C extension (for example, found 
> inside a wheel) was compiled against? Even if it only worked with 
> relatively new versions of numpy, that would be fine.
>
> I'm interested in creating something similar to 
> https://github.com/pypa/auditwheel that could statically check for 
> compatibility between wheel files and python installations, in 
> situations where the metadata about how they were compiled is missing.
> -- 
> -Robert
>
NPY_ABI_VERSION is exposed in C as PyArray_GetNDArrayCVersion and 
NPY_API_VERSION is exposed in C as PyArray_GetNDArrayCFeatureVersion. 
These are not incremented for every NumPy release, see the documentation 
in numpy/core/setup_common.py.

The numpy.__version__ is determined by a python file numpy/version.py, 
which is probably what you want to use.

There is an open Issue to better reveal compile time info 
https://github.com/numpy/numpy/issues/10983

Matti

From mark.harfouche at gmail.com  Mon Oct  8 19:25:30 2018
From: mark.harfouche at gmail.com (Mark Harfouche)
Date: Mon, 8 Oct 2018 19:25:30 -0400
Subject: [Numpy-discussion] ndrange, like range but multidimensiontal
In-Reply-To: <CAEQ_TvcP+E44bagTaV+XqZ0wopqgLv=JP1HyQYwibrecBjCOHw@mail.gmail.com>
References: <CAC=AwPwrrVz5P6Erpa0suc0wAsth7Aq80-DZeK=LDdKguum7rw@mail.gmail.com>
 <039bfa4b-5f14-0241-6fd6-a52b123ac176@gmail.com>
 <CAEQ_TvcP+E44bagTaV+XqZ0wopqgLv=JP1HyQYwibrecBjCOHw@mail.gmail.com>
Message-ID: <CAC=AwPxrvy8zA9mKp4iOs1564AgSKmm+GOr8+azAWBtNh8H0BQ@mail.gmail.com>

since ndrange is a superset of the features of ndindex, we can implement
ndindex with ndrange or keep it as is.
ndindex is now a glorified `nditer` object anyway. So it isn't so much of a
maintenance burden.
As for how ndindex is implemented, I'm a little worried about python 2
performance seeing as range is a list.
I would wait on changing the way ndindex is implemented for now.

I agree with Stephan that ndindex should be kept in. Many want backward
compatible code. It would be hard for me to justify why a dependency should
be bumped up to bleeding edge numpy just for a convenience iterator.

Honestly, I was really surprised to see such a speed difference, I thought
it would have been closer.

Allan, I decided to run a few more benchmarks, the nditer just seems slow
for single array access some reason. Maybe a bug?

```
import numpy as np
import itertools
a = np.ones((1000, 1000))

b = {}
for i in np.ndindex(a.shape):
    b[i] = i

%%timeit
# op_flag=('readonly',) doesn't change performance
for a_value in np.nditer(a):
    pass
109 ms ? 921 ?s per loop (mean ? std. dev. of 7 runs, 10 loops each)

%%timeit
for i in itertools.product(range(1000), range(1000)):
    a_value = a[i]
113 ms ? 1.72 ms per loop (mean ? std. dev. of 7 runs, 10 loops each)

%%timeit
for i in itertools.product(range(1000), range(1000)):
    c = b[i]
193 ms ? 3.89 ms per loop (mean ? std. dev. of 7 runs, 1 loop each)

%%timeit
for a_value in a.flat:
    pass
25.3 ms ? 278 ?s per loop (mean ? std. dev. of 7 runs, 10 loops each)

%%timeit
for k, v in b.items():
    pass
19.9 ms ? 675 ?s per loop (mean ? std. dev. of 7 runs, 10 loops each)

%%timeit
for i in itertools.product(range(1000), range(1000)):
    pass
28 ms ? 715 ?s per loop (mean ? std. dev. of 7 runs, 10 loops each)
```

On Mon, Oct 8, 2018 at 4:26 PM Stephan Hoyer <shoyer at gmail.com> wrote:

> I'm open to adding ndrange, and "soft-deprecating" ndindex (i.e.,
> discouraging its use in our docs, but not actually deprecating it).
> Certainly ndrange seems like a small but meaningful improvement in the
> interface.
>
> That said, I'm not convinced this is really worth the trouble. I think the
> nested loop is still pretty readable/clear, and there are few times when
> I've actually found ndindex() be useful.
>
> On Mon, Oct 8, 2018 at 12:35 PM Allan Haldane <allanhaldane at gmail.com>
> wrote:
>
>> On 10/8/18 12:21 PM, Mark Harfouche wrote:
>> > 2. `ndindex` is an iterator itself. As proposed, `ndrange`, like
>> > `range`, is not an iterator. Changing this behaviour would likely lead
>> > to breaking code that uses that assumption. For example anybody using
>> > introspection or code like:
>> >
>> > ```
>> > indx = np.ndindex(5, 5)
>> > next(indx)  # Don't look at the (0, 0) coordinate
>> > for i in indx:
>> >     print(i)
>> > ```
>> > would break if `ndindex` becomes "not an iterator"
>>
>> OK, I see now. Just like python3 has separate range and range_iterator
>> types, where range is sliceable, we would have separate ndrange and
>> ndindex types, where ndrange is sliceable. You're just copying the
>> python3 api. That justifies it pretty well for me.
>>
>> I still think we shouldn't have two functions which do nearly the same
>> thing. We should only have one, and get rid of the other. I see two ways
>> forward:
>>
>>  * replace ndindex by your ndrange code, so it is no longer an iter.
>>    This would require some deprecation cycles for the cases that break.
>>  * deprecate ndindex in favor of a new function ndrange. We would keep
>>    ndindex around for back-compatibility, with a dep warning to use
>>    ndrange instead.
>>
>> Doing a code search on github, I can see that a lot of people's code
>> would break if ndindex no longer was an iter. I also like the name
>> ndrange for its allusion to python3's range behavior. That makes me lean
>> towards the second option of a separate ndrange, with possible
>> deprecation of ndindex.
>>
>> > itertools.product + range seems to be much faster than the current
>> > implementation of ndindex
>> >
>> > (python 3.6)
>> > ```
>> > %%timeit
>> >
>> > for i in np.ndindex(100, 100):
>> >     pass
>> > 3.94 ms ? 19.4 ?s per loop (mean ? std. dev. of 7 runs, 100 loops each)
>> >
>> > %%timeit
>> > import itertools
>> > for i in itertools.product(range(100), range(100)):
>> >     pass
>> > 231 ?s ? 1.09 ?s per loop (mean ? std. dev. of 7 runs, 1000 loops each)
>> > ```
>>
>> If the new code ends up faster than the old code, that's great, and
>> further justification for using ndrange instead of ndindex. I had
>> thought using nditer in the old code was fastest.
>>
>> So as far as I am concerned, I say go ahead with the PR the way you are
>> doing it.
>>
>> Allan
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at python.org
>> https://mail.python.org/mailman/listinfo/numpy-discussion
>>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181008/f50e08e3/attachment-0001.html>

From rmcgibbo at gmail.com  Mon Oct  8 19:37:38 2018
From: rmcgibbo at gmail.com (Robert T. McGibbon)
Date: Mon, 8 Oct 2018 19:37:38 -0400
Subject: [Numpy-discussion] Determining NPY_ABI_VERSION statically in
 compiled extensions
In-Reply-To: <64c3f2fb-c0b3-c013-5402-7be5296e4c59@gmail.com>
References: <CAN4+E8Fpbs9Ug+v9eRJkSR3W_4S=opnz+DM92zOu_3Kcq8MYWg@mail.gmail.com>
 <64c3f2fb-c0b3-c013-5402-7be5296e4c59@gmail.com>
Message-ID: <CAN4+E8FMSa6kgTd4GeYrAk5M0O74eTVqcBb1_2GSF_Rcamub6w@mail.gmail.com>

Matti,

That doesn't quite cover my use case. I'm interested in querying a .whl
file containing .so files that were compiled against numpy (not my
currently installed version of numpy) to determine the conditions under
which those `.so` files were compiled.

-Robert

On Mon, Oct 8, 2018 at 5:26 PM Matti Picus <matti.picus at gmail.com> wrote:

> On 08/10/18 23:31, Robert T. McGibbon wrote:
> > Is anyone aware of any tricks that can be played with tools like
> > `readelf`, `nm` or `dlopen` / `dlsym` in order to statically determine
> > what version of numpy a fully-compiled C extension (for example, found
> > inside a wheel) was compiled against? Even if it only worked with
> > relatively new versions of numpy, that would be fine.
> >
> > I'm interested in creating something similar to
> > https://github.com/pypa/auditwheel that could statically check for
> > compatibility between wheel files and python installations, in
> > situations where the metadata about how they were compiled is missing.
> > --
> > -Robert
> >
> NPY_ABI_VERSION is exposed in C as PyArray_GetNDArrayCVersion and
> NPY_API_VERSION is exposed in C as PyArray_GetNDArrayCFeatureVersion.
> These are not incremented for every NumPy release, see the documentation
> in numpy/core/setup_common.py.
>
> The numpy.__version__ is determined by a python file numpy/version.py,
> which is probably what you want to use.
>
> There is an open Issue to better reveal compile time info
> https://github.com/numpy/numpy/issues/10983
>
> Matti
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>


-- 
-Robert
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181008/6e91d011/attachment.html>

From charlesr.harris at gmail.com  Tue Oct  9 13:53:21 2018
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Tue, 9 Oct 2018 11:53:21 -0600
Subject: [Numpy-discussion] Plans for 1.15.3 release, 1.16.x branch
Message-ID: <CAB6mnxKjupYxYskaUSrsPggtThSEh8j+0gE5wvctKjma_k8QXg@mail.gmail.com>

Hi All,

I'm planning to do a 1.15.3 release in about two weeks, if there are fixes
or regressions that you feel have slipped by without getting marked for
backport, please comment.

I'm planning on branching 1.16.x in mid November, which should provide
enough time for 1-2 release candidates and a release before the end of the
year.

This is all contingent on having the numpy-wheels repo working again. The
latest 0.32 release of the wheel package broke everything, so we will
either need to pin the version or wait on potential fixes currently under
discussion.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181009/971078e5/attachment.html>

From miles.cranmer at gmail.com  Tue Oct  9 14:01:48 2018
From: miles.cranmer at gmail.com (Miles Cranmer)
Date: Tue, 9 Oct 2018 14:01:48 -0400
Subject: [Numpy-discussion] Performance feature for np.isin and np.in1d
In-Reply-To: <CABBNNykPycxm5hX2ZX_aAS=wKKDYYL8kEoGCxbd2F_LJS_T-hw@mail.gmail.com>
References: <CABBNNy=1Mb7fgtO=UZKfxCQqzx+b-D+bX-1rHrNfjY2mhQU-Ew@mail.gmail.com>
 <CABBNNykPycxm5hX2ZX_aAS=wKKDYYL8kEoGCxbd2F_LJS_T-hw@mail.gmail.com>
Message-ID: <CABBNNykhC5Mv6zgkcBEarCm63w5kjSZk_sbRWaf6pCoO0ogXNA@mail.gmail.com>

Hi,

I was wondering how I could have this PR merged (
https://github.com/numpy/numpy/pull/12065)? The discussion on the PR seems
to have gone well and all tests pass.

Cheers,
Miles

On Mon, Oct 1, 2018 at 2:36 PM Miles Cranmer <miles.cranmer at gmail.com>
wrote:

> (Not sure what the right list is for this)
>
> Hi,
>
> I have started a PR for a "fast_integers" flag for np.isin and np.in1d
> which greatly increases performance when both arrays are integral. It works
> by creating a boolean array with elements set to 1 where the parent array
> (ar2) has elements and 0 otherwise. This array is then indexed by the child
> array (ar1) to create the output.
>
> https://github.com/numpy/numpy/pull/12065
>
> Thoughts on this? Please let me know if you have any questions about my
> addition.
>
> Thank you.
> Best regards,
> Miles
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181009/389401e6/attachment.html>

From shoyer at gmail.com  Tue Oct  9 14:07:01 2018
From: shoyer at gmail.com (Stephan Hoyer)
Date: Tue, 9 Oct 2018 11:07:01 -0700
Subject: [Numpy-discussion] Plans for 1.15.3 release, 1.16.x branch
In-Reply-To: <CAB6mnxKjupYxYskaUSrsPggtThSEh8j+0gE5wvctKjma_k8QXg@mail.gmail.com>
References: <CAB6mnxKjupYxYskaUSrsPggtThSEh8j+0gE5wvctKjma_k8QXg@mail.gmail.com>
Message-ID: <CAEQ_Tvf5FKba9MkCftEh3F-99kmcQjONPU=JVZaxGU3eKZ10Ug@mail.gmail.com>

On Tue, Oct 9, 2018 at 10:54 AM Charles R Harris <charlesr.harris at gmail.com>
wrote:

> I'm planning on branching 1.16.x in mid November, which should provide
> enough time for 1-2 release candidates and a release before the end of the
> year.
>

OK, this gives us a good target for finishing up the NEP-18 work!
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181009/2a76786a/attachment.html>

From shoyer at gmail.com  Tue Oct  9 16:58:37 2018
From: shoyer at gmail.com (Stephan Hoyer)
Date: Tue, 9 Oct 2018 13:58:37 -0700
Subject: [Numpy-discussion] ndrange, like range but multidimensiontal
In-Reply-To: <CAC=AwPxrvy8zA9mKp4iOs1564AgSKmm+GOr8+azAWBtNh8H0BQ@mail.gmail.com>
References: <CAC=AwPwrrVz5P6Erpa0suc0wAsth7Aq80-DZeK=LDdKguum7rw@mail.gmail.com>
 <039bfa4b-5f14-0241-6fd6-a52b123ac176@gmail.com>
 <CAEQ_TvcP+E44bagTaV+XqZ0wopqgLv=JP1HyQYwibrecBjCOHw@mail.gmail.com>
 <CAC=AwPxrvy8zA9mKp4iOs1564AgSKmm+GOr8+azAWBtNh8H0BQ@mail.gmail.com>
Message-ID: <CAEQ_TvdhZmYxNs5jj5u7umbQ8RqRoAkOVB+aVtQYTU-Y2drkwg@mail.gmail.com>

The speed difference is interesting but really a different question than
the public API.

I'm coming around to ndrange(). I can see how it could be useful for
symbolic manipulation of arrays and indexing operations, similar to what we
do in dask and xarray.

On Mon, Oct 8, 2018 at 4:25 PM Mark Harfouche <mark.harfouche at gmail.com>
wrote:

> since ndrange is a superset of the features of ndindex, we can implement
> ndindex with ndrange or keep it as is.
> ndindex is now a glorified `nditer` object anyway. So it isn't so much of
> a maintenance burden.
> As for how ndindex is implemented, I'm a little worried about python 2
> performance seeing as range is a list.
> I would wait on changing the way ndindex is implemented for now.
>
> I agree with Stephan that ndindex should be kept in. Many want backward
> compatible code. It would be hard for me to justify why a dependency should
> be bumped up to bleeding edge numpy just for a convenience iterator.
>
> Honestly, I was really surprised to see such a speed difference, I thought
> it would have been closer.
>
> Allan, I decided to run a few more benchmarks, the nditer just seems slow
> for single array access some reason. Maybe a bug?
>
> ```
> import numpy as np
> import itertools
> a = np.ones((1000, 1000))
>
> b = {}
> for i in np.ndindex(a.shape):
>     b[i] = i
>
> %%timeit
> # op_flag=('readonly',) doesn't change performance
> for a_value in np.nditer(a):
>     pass
> 109 ms ? 921 ?s per loop (mean ? std. dev. of 7 runs, 10 loops each)
>
> %%timeit
> for i in itertools.product(range(1000), range(1000)):
>     a_value = a[i]
> 113 ms ? 1.72 ms per loop (mean ? std. dev. of 7 runs, 10 loops each)
>
> %%timeit
> for i in itertools.product(range(1000), range(1000)):
>     c = b[i]
> 193 ms ? 3.89 ms per loop (mean ? std. dev. of 7 runs, 1 loop each)
>
> %%timeit
> for a_value in a.flat:
>     pass
> 25.3 ms ? 278 ?s per loop (mean ? std. dev. of 7 runs, 10 loops each)
>
> %%timeit
> for k, v in b.items():
>     pass
> 19.9 ms ? 675 ?s per loop (mean ? std. dev. of 7 runs, 10 loops each)
>
> %%timeit
> for i in itertools.product(range(1000), range(1000)):
>     pass
> 28 ms ? 715 ?s per loop (mean ? std. dev. of 7 runs, 10 loops each)
> ```
>
> On Mon, Oct 8, 2018 at 4:26 PM Stephan Hoyer <shoyer at gmail.com> wrote:
>
>> I'm open to adding ndrange, and "soft-deprecating" ndindex (i.e.,
>> discouraging its use in our docs, but not actually deprecating it).
>> Certainly ndrange seems like a small but meaningful improvement in the
>> interface.
>>
>> That said, I'm not convinced this is really worth the trouble. I think
>> the nested loop is still pretty readable/clear, and there are few times
>> when I've actually found ndindex() be useful.
>>
>> On Mon, Oct 8, 2018 at 12:35 PM Allan Haldane <allanhaldane at gmail.com>
>> wrote:
>>
>>> On 10/8/18 12:21 PM, Mark Harfouche wrote:
>>> > 2. `ndindex` is an iterator itself. As proposed, `ndrange`, like
>>> > `range`, is not an iterator. Changing this behaviour would likely lead
>>> > to breaking code that uses that assumption. For example anybody using
>>> > introspection or code like:
>>> >
>>> > ```
>>> > indx = np.ndindex(5, 5)
>>> > next(indx)  # Don't look at the (0, 0) coordinate
>>> > for i in indx:
>>> >     print(i)
>>> > ```
>>> > would break if `ndindex` becomes "not an iterator"
>>>
>>> OK, I see now. Just like python3 has separate range and range_iterator
>>> types, where range is sliceable, we would have separate ndrange and
>>> ndindex types, where ndrange is sliceable. You're just copying the
>>> python3 api. That justifies it pretty well for me.
>>>
>>> I still think we shouldn't have two functions which do nearly the same
>>> thing. We should only have one, and get rid of the other. I see two ways
>>> forward:
>>>
>>>  * replace ndindex by your ndrange code, so it is no longer an iter.
>>>    This would require some deprecation cycles for the cases that break.
>>>  * deprecate ndindex in favor of a new function ndrange. We would keep
>>>    ndindex around for back-compatibility, with a dep warning to use
>>>    ndrange instead.
>>>
>>> Doing a code search on github, I can see that a lot of people's code
>>> would break if ndindex no longer was an iter. I also like the name
>>> ndrange for its allusion to python3's range behavior. That makes me lean
>>> towards the second option of a separate ndrange, with possible
>>> deprecation of ndindex.
>>>
>>> > itertools.product + range seems to be much faster than the current
>>> > implementation of ndindex
>>> >
>>> > (python 3.6)
>>> > ```
>>> > %%timeit
>>> >
>>> > for i in np.ndindex(100, 100):
>>> >     pass
>>> > 3.94 ms ? 19.4 ?s per loop (mean ? std. dev. of 7 runs, 100 loops each)
>>> >
>>> > %%timeit
>>> > import itertools
>>> > for i in itertools.product(range(100), range(100)):
>>> >     pass
>>> > 231 ?s ? 1.09 ?s per loop (mean ? std. dev. of 7 runs, 1000 loops each)
>>> > ```
>>>
>>> If the new code ends up faster than the old code, that's great, and
>>> further justification for using ndrange instead of ndindex. I had
>>> thought using nditer in the old code was fastest.
>>>
>>> So as far as I am concerned, I say go ahead with the PR the way you are
>>> doing it.
>>>
>>> Allan
>>> _______________________________________________
>>> NumPy-Discussion mailing list
>>> NumPy-Discussion at python.org
>>> https://mail.python.org/mailman/listinfo/numpy-discussion
>>>
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at python.org
>> https://mail.python.org/mailman/listinfo/numpy-discussion
>>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181009/2b22011f/attachment-0001.html>

From ralf.gommers at gmail.com  Tue Oct  9 23:19:21 2018
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Tue, 9 Oct 2018 20:19:21 -0700
Subject: [Numpy-discussion] Adding a hex version like PY_VERSION_HEX
In-Reply-To: <a6eab922-8c06-8dad-d594-373b8c7d0f5f@gmail.com>
References: <e9143c26-4da4-dd46-4c0f-812e7229eb9d@gmail.com>
 <20181005104602.5c2c971c@lintaillefer.esrf.fr>
 <a6eab922-8c06-8dad-d594-373b8c7d0f5f@gmail.com>
Message-ID: <CABL7CQjJzjfzU+B5sUPtoCoAfZCNDUqPhV06=rS+DPFWQ-GaRg@mail.gmail.com>

On Sat, Oct 6, 2018 at 11:24 PM Matti Picus <matti.picus at gmail.com> wrote:

> On 05/10/18 11:46, Jerome Kieffer wrote:
> > On Fri, 5 Oct 2018 11:31:20 +0300
> > Matti Picus <matti.picus at gmail.com> wrote:
> >
> >> In PR 12074 https://github.com/numpy/numpy/pull/12074 I propose adding
> a
> >> function `version.get_numpy_version_as_hex()` which returns a hex value
> >> to represent the current NumPy version MAJOR.MINOR.MICRO where
> >>
> >> v = hex(MAJOR << 24 | MINOR << 16 | MICRO)
> > +1
> >
> > We use it in our code and it is a good practice, much better then
> 0.9.0>0.10.0 !
> >
> > We added some support for dev, alpha, beta, RC and final versions in
> > https://github.com/silx-kit/silx/blob/master/version.py
> >
> > Cheers,
> Thanks. I think at this point I will change the proposal to
>
> v = hex(MAJOR << 24 | MINOR << 16 | MICRO << 8)
>
> which leaves room for future enhancement with "release level" and "serial"
> as the lower bits.
>

Makes sense, but to me adding a tuple (like sys.version_info) would be more
logical. Do that as well or instead of?

Cheers,
Ralf
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181009/2b988fbc/attachment.html>

From wieser.eric+numpy at gmail.com  Wed Oct 10 00:03:36 2018
From: wieser.eric+numpy at gmail.com (Eric Wieser)
Date: Wed, 10 Oct 2018 05:03:36 +0100
Subject: [Numpy-discussion] Adding a hex version like PY_VERSION_HEX
In-Reply-To: <CABL7CQjJzjfzU+B5sUPtoCoAfZCNDUqPhV06=rS+DPFWQ-GaRg@mail.gmail.com>
References: <e9143c26-4da4-dd46-4c0f-812e7229eb9d@gmail.com>
 <20181005104602.5c2c971c@lintaillefer.esrf.fr>
 <a6eab922-8c06-8dad-d594-373b8c7d0f5f@gmail.com>
 <CABL7CQjJzjfzU+B5sUPtoCoAfZCNDUqPhV06=rS+DPFWQ-GaRg@mail.gmail.com>
Message-ID: <CAL1kJvA_V53jtLvdp98+qniu1XpokJUDohLF3cpGkQaF_n9SDA@mail.gmail.com>

+1 on Ralf's suggestion. I'm not sure there's any case where the C code
should be using a hex version number - either it's using the C api, in
which case it should just be looking at the C api version - or it's calling
back into the python API, in which case it's probably not unreasonable to
ask it to inspect `np.__version__` / a hypothetical `sys.version_info`,
since it's already going through awkwardness to invoke pure-python APIs..

Eric

On Wed, 10 Oct 2018 at 04:23 Ralf Gommers <ralf.gommers at gmail.com> wrote:

> On Sat, Oct 6, 2018 at 11:24 PM Matti Picus <matti.picus at gmail.com> wrote:
>
>> On 05/10/18 11:46, Jerome Kieffer wrote:
>> > On Fri, 5 Oct 2018 11:31:20 +0300
>> > Matti Picus <matti.picus at gmail.com> wrote:
>> >
>> >> In PR 12074 https://github.com/numpy/numpy/pull/12074 I propose
>> adding a
>> >> function `version.get_numpy_version_as_hex()` which returns a hex value
>> >> to represent the current NumPy version MAJOR.MINOR.MICRO where
>> >>
>> >> v = hex(MAJOR << 24 | MINOR << 16 | MICRO)
>> > +1
>> >
>> > We use it in our code and it is a good practice, much better then
>> 0.9.0>0.10.0 !
>> >
>> > We added some support for dev, alpha, beta, RC and final versions in
>> > https://github.com/silx-kit/silx/blob/master/version.py
>> >
>> > Cheers,
>> Thanks. I think at this point I will change the proposal to
>>
>> v = hex(MAJOR << 24 | MINOR << 16 | MICRO << 8)
>>
>> which leaves room for future enhancement with "release level" and
>> "serial" as the lower bits.
>>
>
> Makes sense, but to me adding a tuple (like sys.version_info) would be
> more logical. Do that as well or instead of?
>
> Cheers,
> Ralf
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181010/478de7f4/attachment.html>

From wieser.eric+numpy at gmail.com  Wed Oct 10 00:34:29 2018
From: wieser.eric+numpy at gmail.com (Eric Wieser)
Date: Wed, 10 Oct 2018 05:34:29 +0100
Subject: [Numpy-discussion] ndrange, like range but multidimensiontal
In-Reply-To: <CAEQ_TvdhZmYxNs5jj5u7umbQ8RqRoAkOVB+aVtQYTU-Y2drkwg@mail.gmail.com>
References: <CAC=AwPwrrVz5P6Erpa0suc0wAsth7Aq80-DZeK=LDdKguum7rw@mail.gmail.com>
 <039bfa4b-5f14-0241-6fd6-a52b123ac176@gmail.com>
 <CAEQ_TvcP+E44bagTaV+XqZ0wopqgLv=JP1HyQYwibrecBjCOHw@mail.gmail.com>
 <CAC=AwPxrvy8zA9mKp4iOs1564AgSKmm+GOr8+azAWBtNh8H0BQ@mail.gmail.com>
 <CAEQ_TvdhZmYxNs5jj5u7umbQ8RqRoAkOVB+aVtQYTU-Y2drkwg@mail.gmail.com>
Message-ID: <CAL1kJvBOb5HcSLmULyjX2sHcx49Cy4RGaz4VHCADajbAnZrj5w@mail.gmail.com>

One thing that worries me here - in python, range(...) in essence generates
a lazy list - so I?d expect ndrange to generate a lazy ndarray. In
practice, that means it would be a duck-type defining an __array__ method
to evaluate it, and only implement methods already present in numpy.

It?s not clear to me what the datatype of such an array-like would be.
Candidates I can think of are:

   1. [('i0', intp), ('i1', intp), ...], but this makes tuple coercion a
   little awkward
   2. (intp, (N,)) - which collapses into a shape + (3,) array
   3. object_.
   4. Some new np.tuple_ dtype, a heterogenous tuple, which is like the
   structured np.void but without field names. I?m not sure how vectorized
   element indexing would be spelt though.

Eric
?

On Tue, 9 Oct 2018 at 21:59 Stephan Hoyer <shoyer at gmail.com> wrote:

> The speed difference is interesting but really a different question than
> the public API.
>
> I'm coming around to ndrange(). I can see how it could be useful for
> symbolic manipulation of arrays and indexing operations, similar to what we
> do in dask and xarray.
>
> On Mon, Oct 8, 2018 at 4:25 PM Mark Harfouche <mark.harfouche at gmail.com>
> wrote:
>
>> since ndrange is a superset of the features of ndindex, we can implement
>> ndindex with ndrange or keep it as is.
>> ndindex is now a glorified `nditer` object anyway. So it isn't so much of
>> a maintenance burden.
>> As for how ndindex is implemented, I'm a little worried about python 2
>> performance seeing as range is a list.
>> I would wait on changing the way ndindex is implemented for now.
>>
>> I agree with Stephan that ndindex should be kept in. Many want backward
>> compatible code. It would be hard for me to justify why a dependency should
>> be bumped up to bleeding edge numpy just for a convenience iterator.
>>
>> Honestly, I was really surprised to see such a speed difference, I
>> thought it would have been closer.
>>
>> Allan, I decided to run a few more benchmarks, the nditer just seems slow
>> for single array access some reason. Maybe a bug?
>>
>> ```
>> import numpy as np
>> import itertools
>> a = np.ones((1000, 1000))
>>
>> b = {}
>> for i in np.ndindex(a.shape):
>>     b[i] = i
>>
>> %%timeit
>> # op_flag=('readonly',) doesn't change performance
>> for a_value in np.nditer(a):
>>     pass
>> 109 ms ? 921 ?s per loop (mean ? std. dev. of 7 runs, 10 loops each)
>>
>> %%timeit
>> for i in itertools.product(range(1000), range(1000)):
>>     a_value = a[i]
>> 113 ms ? 1.72 ms per loop (mean ? std. dev. of 7 runs, 10 loops each)
>>
>> %%timeit
>> for i in itertools.product(range(1000), range(1000)):
>>     c = b[i]
>> 193 ms ? 3.89 ms per loop (mean ? std. dev. of 7 runs, 1 loop each)
>>
>> %%timeit
>> for a_value in a.flat:
>>     pass
>> 25.3 ms ? 278 ?s per loop (mean ? std. dev. of 7 runs, 10 loops each)
>>
>> %%timeit
>> for k, v in b.items():
>>     pass
>> 19.9 ms ? 675 ?s per loop (mean ? std. dev. of 7 runs, 10 loops each)
>>
>> %%timeit
>> for i in itertools.product(range(1000), range(1000)):
>>     pass
>> 28 ms ? 715 ?s per loop (mean ? std. dev. of 7 runs, 10 loops each)
>> ```
>>
>> On Mon, Oct 8, 2018 at 4:26 PM Stephan Hoyer <shoyer at gmail.com> wrote:
>>
>>> I'm open to adding ndrange, and "soft-deprecating" ndindex (i.e.,
>>> discouraging its use in our docs, but not actually deprecating it).
>>> Certainly ndrange seems like a small but meaningful improvement in the
>>> interface.
>>>
>>> That said, I'm not convinced this is really worth the trouble. I think
>>> the nested loop is still pretty readable/clear, and there are few times
>>> when I've actually found ndindex() be useful.
>>>
>>> On Mon, Oct 8, 2018 at 12:35 PM Allan Haldane <allanhaldane at gmail.com>
>>> wrote:
>>>
>>>> On 10/8/18 12:21 PM, Mark Harfouche wrote:
>>>> > 2. `ndindex` is an iterator itself. As proposed, `ndrange`, like
>>>> > `range`, is not an iterator. Changing this behaviour would likely lead
>>>> > to breaking code that uses that assumption. For example anybody using
>>>> > introspection or code like:
>>>> >
>>>> > ```
>>>> > indx = np.ndindex(5, 5)
>>>> > next(indx)  # Don't look at the (0, 0) coordinate
>>>> > for i in indx:
>>>> >     print(i)
>>>> > ```
>>>> > would break if `ndindex` becomes "not an iterator"
>>>>
>>>> OK, I see now. Just like python3 has separate range and range_iterator
>>>> types, where range is sliceable, we would have separate ndrange and
>>>> ndindex types, where ndrange is sliceable. You're just copying the
>>>> python3 api. That justifies it pretty well for me.
>>>>
>>>> I still think we shouldn't have two functions which do nearly the same
>>>> thing. We should only have one, and get rid of the other. I see two ways
>>>> forward:
>>>>
>>>>  * replace ndindex by your ndrange code, so it is no longer an iter.
>>>>    This would require some deprecation cycles for the cases that break.
>>>>  * deprecate ndindex in favor of a new function ndrange. We would keep
>>>>    ndindex around for back-compatibility, with a dep warning to use
>>>>    ndrange instead.
>>>>
>>>> Doing a code search on github, I can see that a lot of people's code
>>>> would break if ndindex no longer was an iter. I also like the name
>>>> ndrange for its allusion to python3's range behavior. That makes me lean
>>>> towards the second option of a separate ndrange, with possible
>>>> deprecation of ndindex.
>>>>
>>>> > itertools.product + range seems to be much faster than the current
>>>> > implementation of ndindex
>>>> >
>>>> > (python 3.6)
>>>> > ```
>>>> > %%timeit
>>>> >
>>>> > for i in np.ndindex(100, 100):
>>>> >     pass
>>>> > 3.94 ms ? 19.4 ?s per loop (mean ? std. dev. of 7 runs, 100 loops
>>>> each)
>>>> >
>>>> > %%timeit
>>>> > import itertools
>>>> > for i in itertools.product(range(100), range(100)):
>>>> >     pass
>>>> > 231 ?s ? 1.09 ?s per loop (mean ? std. dev. of 7 runs, 1000 loops
>>>> each)
>>>> > ```
>>>>
>>>> If the new code ends up faster than the old code, that's great, and
>>>> further justification for using ndrange instead of ndindex. I had
>>>> thought using nditer in the old code was fastest.
>>>>
>>>> So as far as I am concerned, I say go ahead with the PR the way you are
>>>> doing it.
>>>>
>>>> Allan
>>>> _______________________________________________
>>>> NumPy-Discussion mailing list
>>>> NumPy-Discussion at python.org
>>>> https://mail.python.org/mailman/listinfo/numpy-discussion
>>>>
>>> _______________________________________________
>>> NumPy-Discussion mailing list
>>> NumPy-Discussion at python.org
>>> https://mail.python.org/mailman/listinfo/numpy-discussion
>>>
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at python.org
>> https://mail.python.org/mailman/listinfo/numpy-discussion
>>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181010/89227354/attachment-0001.html>

From mark.harfouche at gmail.com  Wed Oct 10 09:56:10 2018
From: mark.harfouche at gmail.com (Mark Harfouche)
Date: Wed, 10 Oct 2018 09:56:10 -0400
Subject: [Numpy-discussion] ndrange, like range but multidimensiontal
In-Reply-To: <CAL1kJvBOb5HcSLmULyjX2sHcx49Cy4RGaz4VHCADajbAnZrj5w@mail.gmail.com>
References: <CAC=AwPwrrVz5P6Erpa0suc0wAsth7Aq80-DZeK=LDdKguum7rw@mail.gmail.com>
 <039bfa4b-5f14-0241-6fd6-a52b123ac176@gmail.com>
 <CAEQ_TvcP+E44bagTaV+XqZ0wopqgLv=JP1HyQYwibrecBjCOHw@mail.gmail.com>
 <CAC=AwPxrvy8zA9mKp4iOs1564AgSKmm+GOr8+azAWBtNh8H0BQ@mail.gmail.com>
 <CAEQ_TvdhZmYxNs5jj5u7umbQ8RqRoAkOVB+aVtQYTU-Y2drkwg@mail.gmail.com>
 <CAL1kJvBOb5HcSLmULyjX2sHcx49Cy4RGaz4VHCADajbAnZrj5w@mail.gmail.com>
Message-ID: <CAC=AwPxK7-2Z5Jm=9qTDY99mgpPDsnpZg93v4U6BP7sjSTMVrw@mail.gmail.com>

Eric,

Great point. The multi-dimensional slicing and sequence return type is
definitely strange. I was thinking about that last night.
I?m a little new to the __array__ methods.
Are you saying that the sequence behaviour would stay the same, (ie.
__iter__, __revesed__, __contains__), but
np.asarray(np.ndrange((3, 3)))
would return something like an array of tuples?
I?m not sure this is something that anybody can?t already with do meshgrid
+ stack

and only implement methods already present in numpy.

I?m not sure what this means.

I?ll note that in Python 3
<https://docs.python.org/3/library/stdtypes.html#sequence-types-list-tuple-range>,
range is it?s own thing. It is still a sequence type but it doesn?t support
addition.
I?m kinda ok with ndrange/ndindex being a sequence type, supporting ND
slicing, but not being an array ;)

I?m kinda warming up to the idea of expanding ndindex.

   1. The additional start and step can be omitted from ndindex for a while
   (indefinitely?). Slicing is way more convenient anyway.
   2. Warnings can help people move from nd.index(1, 2, 3) to nd.index((1,
   2, 3))
   3. ndindex can return a seperate iterator, but the ndindex object would
   hold a reference to it. Calls to ndindex.__next__ would simply return
   next(of_that_object)
   Note. This would break introspection since the iterator is no longer
   ndindex type. I?m kinda OK with this though, but breaking code is never
   nice :(
   4. Bench-marking can help motivate the choice of iterator used for step=(1,)
   * N start=(0,) * N
   5. Wait until 2019 because I don?t want to deal with performance
   regressions of potentially using range in Python2 and I don?t want this
   to motivate any implementation details.

Mark

On Wed, Oct 10, 2018 at 12:36 AM Eric Wieser <wieser.eric+numpy at gmail.com>
wrote:

> One thing that worries me here - in python, range(...) in essence
> generates a lazy list - so I?d expect ndrange to generate a lazy ndarray.
> In practice, that means it would be a duck-type defining an __array__
> method to evaluate it, and only implement methods already present in numpy.
>
> It?s not clear to me what the datatype of such an array-like would be.
> Candidates I can think of are:
>
>    1. [('i0', intp), ('i1', intp), ...], but this makes tuple coercion a
>    little awkward
>    2. (intp, (N,)) - which collapses into a shape + (3,) array
>    3. object_.
>    4. Some new np.tuple_ dtype, a heterogenous tuple, which is like the
>    structured np.void but without field names. I?m not sure how
>    vectorized element indexing would be spelt though.
>
> Eric
> ?
>
> On Tue, 9 Oct 2018 at 21:59 Stephan Hoyer <shoyer at gmail.com> wrote:
>
>> The speed difference is interesting but really a different question than
>> the public API.
>>
>> I'm coming around to ndrange(). I can see how it could be useful for
>> symbolic manipulation of arrays and indexing operations, similar to what we
>> do in dask and xarray.
>>
>> On Mon, Oct 8, 2018 at 4:25 PM Mark Harfouche <mark.harfouche at gmail.com>
>> wrote:
>>
>>> since ndrange is a superset of the features of ndindex, we can implement
>>> ndindex with ndrange or keep it as is.
>>> ndindex is now a glorified `nditer` object anyway. So it isn't so much
>>> of a maintenance burden.
>>> As for how ndindex is implemented, I'm a little worried about python 2
>>> performance seeing as range is a list.
>>> I would wait on changing the way ndindex is implemented for now.
>>>
>>> I agree with Stephan that ndindex should be kept in. Many want backward
>>> compatible code. It would be hard for me to justify why a dependency should
>>> be bumped up to bleeding edge numpy just for a convenience iterator.
>>>
>>> Honestly, I was really surprised to see such a speed difference, I
>>> thought it would have been closer.
>>>
>>> Allan, I decided to run a few more benchmarks, the nditer just seems
>>> slow for single array access some reason. Maybe a bug?
>>>
>>> ```
>>> import numpy as np
>>> import itertools
>>> a = np.ones((1000, 1000))
>>>
>>> b = {}
>>> for i in np.ndindex(a.shape):
>>>     b[i] = i
>>>
>>> %%timeit
>>> # op_flag=('readonly',) doesn't change performance
>>> for a_value in np.nditer(a):
>>>     pass
>>> 109 ms ? 921 ?s per loop (mean ? std. dev. of 7 runs, 10 loops each)
>>>
>>> %%timeit
>>> for i in itertools.product(range(1000), range(1000)):
>>>     a_value = a[i]
>>> 113 ms ? 1.72 ms per loop (mean ? std. dev. of 7 runs, 10 loops each)
>>>
>>> %%timeit
>>> for i in itertools.product(range(1000), range(1000)):
>>>     c = b[i]
>>> 193 ms ? 3.89 ms per loop (mean ? std. dev. of 7 runs, 1 loop each)
>>>
>>> %%timeit
>>> for a_value in a.flat:
>>>     pass
>>> 25.3 ms ? 278 ?s per loop (mean ? std. dev. of 7 runs, 10 loops each)
>>>
>>> %%timeit
>>> for k, v in b.items():
>>>     pass
>>> 19.9 ms ? 675 ?s per loop (mean ? std. dev. of 7 runs, 10 loops each)
>>>
>>> %%timeit
>>> for i in itertools.product(range(1000), range(1000)):
>>>     pass
>>> 28 ms ? 715 ?s per loop (mean ? std. dev. of 7 runs, 10 loops each)
>>> ```
>>>
>>> On Mon, Oct 8, 2018 at 4:26 PM Stephan Hoyer <shoyer at gmail.com> wrote:
>>>
>>>> I'm open to adding ndrange, and "soft-deprecating" ndindex (i.e.,
>>>> discouraging its use in our docs, but not actually deprecating it).
>>>> Certainly ndrange seems like a small but meaningful improvement in the
>>>> interface.
>>>>
>>>> That said, I'm not convinced this is really worth the trouble. I think
>>>> the nested loop is still pretty readable/clear, and there are few times
>>>> when I've actually found ndindex() be useful.
>>>>
>>>> On Mon, Oct 8, 2018 at 12:35 PM Allan Haldane <allanhaldane at gmail.com>
>>>> wrote:
>>>>
>>>>> On 10/8/18 12:21 PM, Mark Harfouche wrote:
>>>>> > 2. `ndindex` is an iterator itself. As proposed, `ndrange`, like
>>>>> > `range`, is not an iterator. Changing this behaviour would likely
>>>>> lead
>>>>> > to breaking code that uses that assumption. For example anybody using
>>>>> > introspection or code like:
>>>>> >
>>>>> > ```
>>>>> > indx = np.ndindex(5, 5)
>>>>> > next(indx)  # Don't look at the (0, 0) coordinate
>>>>> > for i in indx:
>>>>> >     print(i)
>>>>> > ```
>>>>> > would break if `ndindex` becomes "not an iterator"
>>>>>
>>>>> OK, I see now. Just like python3 has separate range and range_iterator
>>>>> types, where range is sliceable, we would have separate ndrange and
>>>>> ndindex types, where ndrange is sliceable. You're just copying the
>>>>> python3 api. That justifies it pretty well for me.
>>>>>
>>>>> I still think we shouldn't have two functions which do nearly the same
>>>>> thing. We should only have one, and get rid of the other. I see two
>>>>> ways
>>>>> forward:
>>>>>
>>>>>  * replace ndindex by your ndrange code, so it is no longer an iter.
>>>>>    This would require some deprecation cycles for the cases that break.
>>>>>  * deprecate ndindex in favor of a new function ndrange. We would keep
>>>>>    ndindex around for back-compatibility, with a dep warning to use
>>>>>    ndrange instead.
>>>>>
>>>>> Doing a code search on github, I can see that a lot of people's code
>>>>> would break if ndindex no longer was an iter. I also like the name
>>>>> ndrange for its allusion to python3's range behavior. That makes me
>>>>> lean
>>>>> towards the second option of a separate ndrange, with possible
>>>>> deprecation of ndindex.
>>>>>
>>>>> > itertools.product + range seems to be much faster than the current
>>>>> > implementation of ndindex
>>>>> >
>>>>> > (python 3.6)
>>>>> > ```
>>>>> > %%timeit
>>>>> >
>>>>> > for i in np.ndindex(100, 100):
>>>>> >     pass
>>>>> > 3.94 ms ? 19.4 ?s per loop (mean ? std. dev. of 7 runs, 100 loops
>>>>> each)
>>>>> >
>>>>> > %%timeit
>>>>> > import itertools
>>>>> > for i in itertools.product(range(100), range(100)):
>>>>> >     pass
>>>>> > 231 ?s ? 1.09 ?s per loop (mean ? std. dev. of 7 runs, 1000 loops
>>>>> each)
>>>>> > ```
>>>>>
>>>>> If the new code ends up faster than the old code, that's great, and
>>>>> further justification for using ndrange instead of ndindex. I had
>>>>> thought using nditer in the old code was fastest.
>>>>>
>>>>> So as far as I am concerned, I say go ahead with the PR the way you are
>>>>> doing it.
>>>>>
>>>>> Allan
>>>>> _______________________________________________
>>>>> NumPy-Discussion mailing list
>>>>> NumPy-Discussion at python.org
>>>>> https://mail.python.org/mailman/listinfo/numpy-discussion
>>>>>
>>>> _______________________________________________
>>>> NumPy-Discussion mailing list
>>>> NumPy-Discussion at python.org
>>>> https://mail.python.org/mailman/listinfo/numpy-discussion
>>>>
>>> _______________________________________________
>>> NumPy-Discussion mailing list
>>> NumPy-Discussion at python.org
>>> https://mail.python.org/mailman/listinfo/numpy-discussion
>>>
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at python.org
>> https://mail.python.org/mailman/listinfo/numpy-discussion
>>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181010/fce7ffd3/attachment-0001.html>

From shoyer at gmail.com  Wed Oct 10 12:07:50 2018
From: shoyer at gmail.com (Stephan Hoyer)
Date: Wed, 10 Oct 2018 09:07:50 -0700
Subject: [Numpy-discussion] ndrange, like range but multidimensiontal
In-Reply-To: <CAL1kJvBOb5HcSLmULyjX2sHcx49Cy4RGaz4VHCADajbAnZrj5w@mail.gmail.com>
References: <CAC=AwPwrrVz5P6Erpa0suc0wAsth7Aq80-DZeK=LDdKguum7rw@mail.gmail.com>
 <039bfa4b-5f14-0241-6fd6-a52b123ac176@gmail.com>
 <CAEQ_TvcP+E44bagTaV+XqZ0wopqgLv=JP1HyQYwibrecBjCOHw@mail.gmail.com>
 <CAC=AwPxrvy8zA9mKp4iOs1564AgSKmm+GOr8+azAWBtNh8H0BQ@mail.gmail.com>
 <CAEQ_TvdhZmYxNs5jj5u7umbQ8RqRoAkOVB+aVtQYTU-Y2drkwg@mail.gmail.com>
 <CAL1kJvBOb5HcSLmULyjX2sHcx49Cy4RGaz4VHCADajbAnZrj5w@mail.gmail.com>
Message-ID: <CAEQ_TvdbKNBNYXXfYuEt2Kh4_aJXf3vukuhM3D9_nY1x5Dpc6A@mail.gmail.com>

On Tue, Oct 9, 2018 at 9:34 PM Eric Wieser <wieser.eric+numpy at gmail.com>
wrote:

> One thing that worries me here - in python, range(...) in essence
> generates a lazy list - so I?d expect ndrange to generate a lazy ndarray.
> In practice, that means it would be a duck-type defining an __array__
> method to evaluate it, and only implement methods already present in numpy.
>
> It?s not clear to me what the datatype of such an array-like would be.
> Candidates I can think of are:
>
>    1. [('i0', intp), ('i1', intp), ...], but this makes tuple coercion a
>    little awkward
>
> I think this would be the appropriate choice. What about it makes tuple
coercion awkward? If you use this as the dtype, you both set and get
element as tuples.

In particular, I would say that ndrange() should be a lazy equivalent to
the following explicit constructor:

def ndrange(shape):
dtype = [('i' + str(i), np.intp) for i in range(len(shape))]
array = np.empty(shape, dtype)
for indices in np.ndindex(*shape):
array[indices] = indices
return array

>>> ndrange((2,)
array([(0,), (1,)], dtype=[('i0', '<i8')])

>>> ndrange((2, 3))
array([[(0, 0), (0, 1), (0, 2)], [(1, 0), (1, 1), (1, 2)]], dtype=[('i0',
'<i8'), ('i1', '<i8')])

The one deviation in behavior would be that ndrange() iterates over
flattened elements rather than the first axes.

It is indeed a little awkward to have field names, but given that NumPy
creates those automatically when you supply a dtype like 'i8,i8' this is
probably a reasonable choice.


>    1. (intp, (N,)) - which collapses into a shape + (3,) array
>    2. object_.
>    3. Some new np.tuple_ dtype, a heterogenous tuple, which is like the
>    structured np.void but without field names. I?m not sure how
>    vectorized element indexing would be spelt though.
>
> Eric
> ?
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181010/1b80c63d/attachment.html>

From allanhaldane at gmail.com  Wed Oct 10 14:21:00 2018
From: allanhaldane at gmail.com (Allan Haldane)
Date: Wed, 10 Oct 2018 14:21:00 -0400
Subject: [Numpy-discussion] ndrange, like range but multidimensiontal
In-Reply-To: <CAL1kJvBOb5HcSLmULyjX2sHcx49Cy4RGaz4VHCADajbAnZrj5w@mail.gmail.com>
References: <CAC=AwPwrrVz5P6Erpa0suc0wAsth7Aq80-DZeK=LDdKguum7rw@mail.gmail.com>
 <039bfa4b-5f14-0241-6fd6-a52b123ac176@gmail.com>
 <CAEQ_TvcP+E44bagTaV+XqZ0wopqgLv=JP1HyQYwibrecBjCOHw@mail.gmail.com>
 <CAC=AwPxrvy8zA9mKp4iOs1564AgSKmm+GOr8+azAWBtNh8H0BQ@mail.gmail.com>
 <CAEQ_TvdhZmYxNs5jj5u7umbQ8RqRoAkOVB+aVtQYTU-Y2drkwg@mail.gmail.com>
 <CAL1kJvBOb5HcSLmULyjX2sHcx49Cy4RGaz4VHCADajbAnZrj5w@mail.gmail.com>
Message-ID: <b46cde38-751a-0a93-5a44-0c374ab83e88@gmail.com>

On 10/10/18 12:34 AM, Eric Wieser wrote:
> One thing that worries me here - in python, |range(...)| in essence
> generates a lazy |list| - so I?d expect |ndrange| to generate a lazy
> |ndarray|. In practice, that means it would be a duck-type defining an
> |__array__| method to evaluate it, and only implement methods already
> present in numpy.

Isn't that what arange is for?

It seems like there are two uses of python3's range: 1. creating a 1d
iterable of indices for use in for-loops, and 2. with list(range) can be
used to create a sequence of integers.

Numpy can extend this in two directions:
 * ndrange returns an iterable of nd indices (for for-loops).
 * arange returns an 1d ndarray of integers instead of a list

The application of for-loops, which is more niche, doesn't need
ndarray's vectorized properties, so I'm not convinced it should return
an ndarray. It certainly seems simpler not to return an ndarray, due to
the dtype question.

arange on its own seems to cover the need for a vectorized version of range.

Allan

From mark.harfouche at gmail.com  Thu Oct 11 09:41:42 2018
From: mark.harfouche at gmail.com (Mark Harfouche)
Date: Thu, 11 Oct 2018 09:41:42 -0400
Subject: [Numpy-discussion] ndrange, like range but multidimensiontal
In-Reply-To: <b46cde38-751a-0a93-5a44-0c374ab83e88@gmail.com>
References: <CAC=AwPwrrVz5P6Erpa0suc0wAsth7Aq80-DZeK=LDdKguum7rw@mail.gmail.com>
 <039bfa4b-5f14-0241-6fd6-a52b123ac176@gmail.com>
 <CAEQ_TvcP+E44bagTaV+XqZ0wopqgLv=JP1HyQYwibrecBjCOHw@mail.gmail.com>
 <CAC=AwPxrvy8zA9mKp4iOs1564AgSKmm+GOr8+azAWBtNh8H0BQ@mail.gmail.com>
 <CAEQ_TvdhZmYxNs5jj5u7umbQ8RqRoAkOVB+aVtQYTU-Y2drkwg@mail.gmail.com>
 <CAL1kJvBOb5HcSLmULyjX2sHcx49Cy4RGaz4VHCADajbAnZrj5w@mail.gmail.com>
 <b46cde38-751a-0a93-5a44-0c374ab83e88@gmail.com>
Message-ID: <CAC=AwPxhNoe=HakY-JRr7A0=qritK+A-pk5hBeuViu_BAgMMPg@mail.gmail.com>

I?m really open to these kinds of array extensions but, I (personally) just
don?t know how to do this efficiently.
I feel like ogrid and mgrid are probably enough for people that want think
kind of feature.

My implementation would just be based on python primitives which would
yield performance similar to

In [2]: %timeit np.arange(1000)
1.25 ?s ? 4.01 ns per loop (mean ? std. dev. of 7 runs, 1000000 loops each)

In [4]: %timeit np.asarray(range(1000))
99.6 ?s ? 1.38 ?s per loop (mean ? std. dev. of 7 runs, 10000 loops each)

Here is how mgrid can be used to return something similar to the indices
from ndrange

In [10]: np.mgrid[1:10:3, 2:10:3][:, 1, 1]
Out[10]: array([4, 5])

In [13]: np.ndrange((10, 10))[1::3, 2::3][1, 1]
Out[13]: (4, 5)


On Wed, Oct 10, 2018 at 2:22 PM Allan Haldane <allanhaldane at gmail.com>
wrote:

> On 10/10/18 12:34 AM, Eric Wieser wrote:
> > One thing that worries me here - in python, |range(...)| in essence
> > generates a lazy |list| - so I?d expect |ndrange| to generate a lazy
> > |ndarray|. In practice, that means it would be a duck-type defining an
> > |__array__| method to evaluate it, and only implement methods already
> > present in numpy.
>
> Isn't that what arange is for?
>
> It seems like there are two uses of python3's range: 1. creating a 1d
> iterable of indices for use in for-loops, and 2. with list(range) can be
> used to create a sequence of integers.
>
> Numpy can extend this in two directions:
>  * ndrange returns an iterable of nd indices (for for-loops).
>  * arange returns an 1d ndarray of integers instead of a list
>
> The application of for-loops, which is more niche, doesn't need
> ndarray's vectorized properties, so I'm not convinced it should return
> an ndarray. It certainly seems simpler not to return an ndarray, due to
> the dtype question.
>
> arange on its own seems to cover the need for a vectorized version of
> range.
>
> Allan
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181011/d2f325b1/attachment-0001.html>

From wieser.eric+numpy at gmail.com  Thu Oct 11 10:14:26 2018
From: wieser.eric+numpy at gmail.com (Eric Wieser)
Date: Thu, 11 Oct 2018 07:14:26 -0700
Subject: [Numpy-discussion] ndrange, like range but multidimensiontal
In-Reply-To: <CAEQ_TvdbKNBNYXXfYuEt2Kh4_aJXf3vukuhM3D9_nY1x5Dpc6A@mail.gmail.com>
References: <CAC=AwPwrrVz5P6Erpa0suc0wAsth7Aq80-DZeK=LDdKguum7rw@mail.gmail.com>
 <039bfa4b-5f14-0241-6fd6-a52b123ac176@gmail.com>
 <CAEQ_TvcP+E44bagTaV+XqZ0wopqgLv=JP1HyQYwibrecBjCOHw@mail.gmail.com>
 <CAC=AwPxrvy8zA9mKp4iOs1564AgSKmm+GOr8+azAWBtNh8H0BQ@mail.gmail.com>
 <CAEQ_TvdhZmYxNs5jj5u7umbQ8RqRoAkOVB+aVtQYTU-Y2drkwg@mail.gmail.com>
 <CAL1kJvBOb5HcSLmULyjX2sHcx49Cy4RGaz4VHCADajbAnZrj5w@mail.gmail.com>
 <CAEQ_TvdbKNBNYXXfYuEt2Kh4_aJXf3vukuhM3D9_nY1x5Dpc6A@mail.gmail.com>
Message-ID: <CAL1kJvDMTq8YywKXbUxa_2CGRCj5pkVd36qJBu-Rs4ZW9cz3UA@mail.gmail.com>

If you use this as the dtype, you both set and get element as tuples.

Elements are not got as tuples, but they can be explicitly cast

What about it makes tuple coercion awkward?

This explicit cast

>>> dt_ind2d = np.dtype([('i0', np.intp), ('i1', np.intp)])
>>> ind = np.zeros((), dt_ind2d)[0]
>>> ind, type(ind)
((0, 0), <class 'numpy.void'>)
>>> m[ind]
Traceback (most recent call last):
  File "<pyshell#17>", line 1, in <module>
    m[inds[0]]
IndexError: only integers, slices (`:`), ellipsis (`...`),
numpy.newaxis (`None`) and integer or boolean arrays are valid indices
>>> m[tuple(ind)]
1.0

On Wed, 10 Oct 2018 at 09:08 Stephan Hoyer shoyer at gmail.com
<http://mailto:shoyer at gmail.com> wrote:

On Tue, Oct 9, 2018 at 9:34 PM Eric Wieser <wieser.eric+numpy at gmail.com>
> wrote:
>
>> One thing that worries me here - in python, range(...) in essence
>> generates a lazy list - so I?d expect ndrange to generate a lazy ndarray.
>> In practice, that means it would be a duck-type defining an __array__
>> method to evaluate it, and only implement methods already present in numpy.
>>
>> It?s not clear to me what the datatype of such an array-like would be.
>> Candidates I can think of are:
>>
>>    1. [('i0', intp), ('i1', intp), ...], but this makes tuple coercion a
>>    little awkward
>>
>> I think this would be the appropriate choice. What about it makes tuple
> coercion awkward? If you use this as the dtype, you both set and get
> element as tuples.
>
> In particular, I would say that ndrange() should be a lazy equivalent to
> the following explicit constructor:
>
> def ndrange(shape):
> dtype = [('i' + str(i), np.intp) for i in range(len(shape))]
> array = np.empty(shape, dtype)
> for indices in np.ndindex(*shape):
> array[indices] = indices
> return array
>
> >>> ndrange((2,)
> array([(0,), (1,)], dtype=[('i0', '<i8')])
>
> >>> ndrange((2, 3))
> array([[(0, 0), (0, 1), (0, 2)], [(1, 0), (1, 1), (1, 2)]], dtype=[('i0',
> '<i8'), ('i1', '<i8')])
>
> The one deviation in behavior would be that ndrange() iterates over
> flattened elements rather than the first axes.
>
> It is indeed a little awkward to have field names, but given that NumPy
> creates those automatically when you supply a dtype like 'i8,i8' this is
> probably a reasonable choice.
>
>
>>    1. (intp, (N,)) - which collapses into a shape + (3,) array
>>    2. object_.
>>    3. Some new np.tuple_ dtype, a heterogenous tuple, which is like the
>>    structured np.void but without field names. I?m not sure how
>>    vectorized element indexing would be spelt though.
>>
>> Eric
>> ?
>>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
?
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181011/e1bd6fb8/attachment-0001.html>

From wieser.eric+numpy at gmail.com  Thu Oct 11 10:19:20 2018
From: wieser.eric+numpy at gmail.com (Eric Wieser)
Date: Thu, 11 Oct 2018 07:19:20 -0700
Subject: [Numpy-discussion] ndrange, like range but multidimensiontal
In-Reply-To: <b46cde38-751a-0a93-5a44-0c374ab83e88@gmail.com>
References: <CAC=AwPwrrVz5P6Erpa0suc0wAsth7Aq80-DZeK=LDdKguum7rw@mail.gmail.com>
 <039bfa4b-5f14-0241-6fd6-a52b123ac176@gmail.com>
 <CAEQ_TvcP+E44bagTaV+XqZ0wopqgLv=JP1HyQYwibrecBjCOHw@mail.gmail.com>
 <CAC=AwPxrvy8zA9mKp4iOs1564AgSKmm+GOr8+azAWBtNh8H0BQ@mail.gmail.com>
 <CAEQ_TvdhZmYxNs5jj5u7umbQ8RqRoAkOVB+aVtQYTU-Y2drkwg@mail.gmail.com>
 <CAL1kJvBOb5HcSLmULyjX2sHcx49Cy4RGaz4VHCADajbAnZrj5w@mail.gmail.com>
 <b46cde38-751a-0a93-5a44-0c374ab83e88@gmail.com>
Message-ID: <CAL1kJvAMdZbBXOMKRp_t1_=ONOGCiuKxdxvVA-qqW2jgOSweaw@mail.gmail.com>

Isn?t that what arange is for?

Imagining ourselves in python2 land for now - I?m proposing arange is to
range, as ndrange is to xrange

I?m not convinced it should return an ndarray

I agree - I think it should return a range-like object that:

   - Is convertible via __array__ if needed
   - Looks like an ndarray, with:
      - a .dtype attribute
      - a __getitem__(Tuple[int]) which returns numpy scalars
      - .ravel() and .flat for choosing iteration order.

On Wed, 10 Oct 2018 at 11:21 Allan Haldane allanhaldane at gmail.com
<http://mailto:allanhaldane at gmail.com> wrote:

On 10/10/18 12:34 AM, Eric Wieser wrote:
> > One thing that worries me here - in python, |range(...)| in essence
> > generates a lazy |list| - so I?d expect |ndrange| to generate a lazy
> > |ndarray|. In practice, that means it would be a duck-type defining an
> > |__array__| method to evaluate it, and only implement methods already
> > present in numpy.
>
> Isn't that what arange is for?
>
> It seems like there are two uses of python3's range: 1. creating a 1d
> iterable of indices for use in for-loops, and 2. with list(range) can be
> used to create a sequence of integers.
>
> Numpy can extend this in two directions:
>  * ndrange returns an iterable of nd indices (for for-loops).
>  * arange returns an 1d ndarray of integers instead of a list
>
> The application of for-loops, which is more niche, doesn't need
> ndarray's vectorized properties, so I'm not convinced it should return
> an ndarray. It certainly seems simpler not to return an ndarray, due to
> the dtype question.
>
> arange on its own seems to cover the need for a vectorized version of
> range.
>
> Allan
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
?
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181011/bd646273/attachment.html>

From harrigan.matthew at gmail.com  Thu Oct 11 12:53:51 2018
From: harrigan.matthew at gmail.com (Matthew Harrigan)
Date: Thu, 11 Oct 2018 12:53:51 -0400
Subject: [Numpy-discussion] LaTeX version of boolean indexing
Message-ID: <CAOfRF=g+Q4=Fb61LZsKD0sh6DTSBHPKKA+L0DwJ5p_d40aBqeA@mail.gmail.com>

Hello,

I am documenting some code, translating the core of the algorithm to
LaTeX.  The style I have currently is very similar to the einsum syntax
(which is awesome btw).  Here
<https://gist.github.com/mattharrigan/68b292e64381bba6b78a06a6f1762fa2> is
an example of some of the basic operations in NumPy.  One part I do not
know how to capture well is boolean indexing, ie:

mask = np.array([1, 0, 1])
x = np.array([1, 2, 3])
y = x[mask]

Any suggestions on how to clearly, formally, and concisely show that
operation?  Also, are there any guides on translating NumPy to LaTeX?  It
might be helpful for documenting algorithms and also for people learning
NumPy.

Thank you,
Matt
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181011/68d3a86e/attachment.html>

From deak.andris at gmail.com  Thu Oct 11 13:29:12 2018
From: deak.andris at gmail.com (Andras Deak)
Date: Thu, 11 Oct 2018 19:29:12 +0200
Subject: [Numpy-discussion] LaTeX version of boolean indexing
In-Reply-To: <CAOfRF=g+Q4=Fb61LZsKD0sh6DTSBHPKKA+L0DwJ5p_d40aBqeA@mail.gmail.com>
References: <CAOfRF=g+Q4=Fb61LZsKD0sh6DTSBHPKKA+L0DwJ5p_d40aBqeA@mail.gmail.com>
Message-ID: <CAMEWA4PLLCGrX9r=woRMSTBSbuc079AH9eFbwSpim0xGi_yQsg@mail.gmail.com>

On Thu, Oct 11, 2018 at 6:54 PM Matthew Harrigan
<harrigan.matthew at gmail.com> wrote:
>
> Hello,
>
> I am documenting some code, translating the core of the algorithm to LaTeX.  The style I have currently is very similar to the einsum syntax (which is awesome btw).  Here is an example of some of the basic operations in NumPy.  One part I do not know how to capture well is boolean indexing, ie:
>
> mask = np.array([1, 0, 1])
> x = np.array([1, 2, 3])
> y = x[mask]

That is fancy indexing with an index array rather than boolean
indexing. That's why the result is [2, 1, 2] rather than [1, 3].
In case this is really what you need, it's the case of your indices
originating from another sequence: `y_i = x_{m_i}` where `m_i` is your
indexing sequence.
For proper boolean indexing you lose the one-to-one correspondence
between input and output (due to the size almost always changing), so
you might not be able to formalize it this nicely with an index
appearing in both sides. But something with an indicator might work...

Andr?s

From harrigan.matthew at gmail.com  Thu Oct 11 13:43:40 2018
From: harrigan.matthew at gmail.com (Matthew Harrigan)
Date: Thu, 11 Oct 2018 13:43:40 -0400
Subject: [Numpy-discussion] LaTeX version of boolean indexing
In-Reply-To: <CAMEWA4PLLCGrX9r=woRMSTBSbuc079AH9eFbwSpim0xGi_yQsg@mail.gmail.com>
References: <CAOfRF=g+Q4=Fb61LZsKD0sh6DTSBHPKKA+L0DwJ5p_d40aBqeA@mail.gmail.com>
 <CAMEWA4PLLCGrX9r=woRMSTBSbuc079AH9eFbwSpim0xGi_yQsg@mail.gmail.com>
Message-ID: <CAOfRF=gM4fHMcbBA+m6FY-PUTpPgkYSW7LQ00nEVjqGD2QGoYw@mail.gmail.com>

My apologies, never write code directly in an email...
s/b:
mask = np.array([1, 0, 1], dtype=bool)

What do you mean by indicator?

On Thu, Oct 11, 2018 at 1:31 PM Andras Deak <deak.andris at gmail.com> wrote:

> On Thu, Oct 11, 2018 at 6:54 PM Matthew Harrigan
> <harrigan.matthew at gmail.com> wrote:
> >
> > Hello,
> >
> > I am documenting some code, translating the core of the algorithm to
> LaTeX.  The style I have currently is very similar to the einsum syntax
> (which is awesome btw).  Here is an example of some of the basic operations
> in NumPy.  One part I do not know how to capture well is boolean indexing,
> ie:
> >
> > mask = np.array([1, 0, 1])
> > x = np.array([1, 2, 3])
> > y = x[mask]
>
> That is fancy indexing with an index array rather than boolean
> indexing. That's why the result is [2, 1, 2] rather than [1, 3].
> In case this is really what you need, it's the case of your indices
> originating from another sequence: `y_i = x_{m_i}` where `m_i` is your
> indexing sequence.
> For proper boolean indexing you lose the one-to-one correspondence
> between input and output (due to the size almost always changing), so
> you might not be able to formalize it this nicely with an index
> appearing in both sides. But something with an indicator might work...
>
> Andr?s
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181011/c16dbd76/attachment.html>

From deak.andris at gmail.com  Thu Oct 11 14:03:56 2018
From: deak.andris at gmail.com (Andras Deak)
Date: Thu, 11 Oct 2018 20:03:56 +0200
Subject: [Numpy-discussion] LaTeX version of boolean indexing
In-Reply-To: <CAOfRF=gM4fHMcbBA+m6FY-PUTpPgkYSW7LQ00nEVjqGD2QGoYw@mail.gmail.com>
References: <CAOfRF=g+Q4=Fb61LZsKD0sh6DTSBHPKKA+L0DwJ5p_d40aBqeA@mail.gmail.com>
 <CAMEWA4PLLCGrX9r=woRMSTBSbuc079AH9eFbwSpim0xGi_yQsg@mail.gmail.com>
 <CAOfRF=gM4fHMcbBA+m6FY-PUTpPgkYSW7LQ00nEVjqGD2QGoYw@mail.gmail.com>
Message-ID: <CAMEWA4PaJkSHKg_yhTCz73hTB34X-PLq4iSarqbWuiB-ccw5vQ@mail.gmail.com>

On Thu, Oct 11, 2018 at 7:45 PM Matthew Harrigan
<harrigan.matthew at gmail.com> wrote:
>
> What do you mean by indicator?
>

I mostly meant what wikipedia seems to call "set-builder notation"
(https://en.wikipedia.org/wiki/Set-builder_notation#Sets_defined_by_a_predicate).
Since your "input" is `{x_i | i in [0,1,2]}` but your output is a `y_j
for j in [0,1]`, the straightforward thing I could think of was
defining the set of valid `y_j` values (with an implicit assumption of
the order being preserved, I guess). This would mean you can say
something like
`y_i \in {x_j | m_j}` (omitting the \left/\right/\vert fluff for
simplicity here) where `m_j` are the elements of the boolean mask
(say, `m = [True, False, True]`). In this context I'd understand it
that `m_j` is the predicate and `x_j` are the corresponding values,
however the notation isn't entirely ambiguous (see also a remark on
the above wikipedia page) so you can't really get away with omitting
further explanation in order to resolve ambiguity. Though I guess
calling `m_j` elements of a mask would do the same thing.
The other option that comes to mind is to define the auxiliary indices
`n_i` for which `m_j` are True, then you of course denote the result
with integer indices: `y_i = x_{n_i}` where `i` goes from 0 to the
number of `True`s in `m_j`. But then you have the same difficulty
defining `n_i`.
All in all I'm not sure there's an elegant and concise notation for
boolean masking.

Andr?s

From mark.harfouche at gmail.com  Thu Oct 11 21:53:53 2018
From: mark.harfouche at gmail.com (Mark Harfouche)
Date: Thu, 11 Oct 2018 21:53:53 -0400
Subject: [Numpy-discussion] ndrange, like range but multidimensiontal
In-Reply-To: <CAL1kJvAMdZbBXOMKRp_t1_=ONOGCiuKxdxvVA-qqW2jgOSweaw@mail.gmail.com>
References: <CAC=AwPwrrVz5P6Erpa0suc0wAsth7Aq80-DZeK=LDdKguum7rw@mail.gmail.com>
 <039bfa4b-5f14-0241-6fd6-a52b123ac176@gmail.com>
 <CAEQ_TvcP+E44bagTaV+XqZ0wopqgLv=JP1HyQYwibrecBjCOHw@mail.gmail.com>
 <CAC=AwPxrvy8zA9mKp4iOs1564AgSKmm+GOr8+azAWBtNh8H0BQ@mail.gmail.com>
 <CAEQ_TvdhZmYxNs5jj5u7umbQ8RqRoAkOVB+aVtQYTU-Y2drkwg@mail.gmail.com>
 <CAL1kJvBOb5HcSLmULyjX2sHcx49Cy4RGaz4VHCADajbAnZrj5w@mail.gmail.com>
 <b46cde38-751a-0a93-5a44-0c374ab83e88@gmail.com>
 <CAL1kJvAMdZbBXOMKRp_t1_=ONOGCiuKxdxvVA-qqW2jgOSweaw@mail.gmail.com>
Message-ID: <CAC=AwPxTCzUyQNQE2inj=D+0whknYypMQVOEpxK_XDoyGSPQeQ@mail.gmail.com>

Eric, interesting ideas.

> __getitem__(Tuple[int]) which returns numpy scalars

I'm not sure what you mean. Even if you supply a numpy uint8 to range, it
still returns a python int class.
Would you like ndrange to return a tuple of `uint8` in this case?

```
In [3]: a =
iter(range(np.uint8(10)))

In [4]:
next(a).__class__
Out[4]: int

In [5]:
np.uint8(10).__class__
Out[5]: numpy.uint8
```

Ravel seems like a cool way to choose iteration order. In the PR, I
mentionned that one reason that I removed `'F'` order from the PR was:
1. My implementation was not competitive with the `C` order implementation
in terms of speed (can be fixed)
2. I don't know if it something that people really need to iterate over
collections (annoying to maintain if unused)

Instead, I just showed an example how people could iterate in `F` order
should they need to.

I'm not sure if we ever want the `ndrange` object to return a full matrix.
It seems like we would be creating a custom tuple class just for this which
seems pretty niche.


On Thu, Oct 11, 2018 at 10:21 AM Eric Wieser <wieser.eric+numpy at gmail.com>
wrote:

> Isn?t that what arange is for?
>
> Imagining ourselves in python2 land for now - I?m proposing arange is to
> range, as ndrange is to xrange
>
> I?m not convinced it should return an ndarray
>
> I agree - I think it should return a range-like object that:
>
>    - Is convertible via __array__ if needed
>    - Looks like an ndarray, with:
>       - a .dtype attribute
>       - a __getitem__(Tuple[int]) which returns numpy scalars
>       - .ravel() and .flat for choosing iteration order.
>
> On Wed, 10 Oct 2018 at 11:21 Allan Haldane allanhaldane at gmail.com
> <http://mailto:allanhaldane at gmail.com> wrote:
>
> On 10/10/18 12:34 AM, Eric Wieser wrote:
>> > One thing that worries me here - in python, |range(...)| in essence
>> > generates a lazy |list| - so I?d expect |ndrange| to generate a lazy
>> > |ndarray|. In practice, that means it would be a duck-type defining an
>> > |__array__| method to evaluate it, and only implement methods already
>> > present in numpy.
>>
>> Isn't that what arange is for?
>>
>> It seems like there are two uses of python3's range: 1. creating a 1d
>> iterable of indices for use in for-loops, and 2. with list(range) can be
>> used to create a sequence of integers.
>>
>> Numpy can extend this in two directions:
>>  * ndrange returns an iterable of nd indices (for for-loops).
>>  * arange returns an 1d ndarray of integers instead of a list
>>
>> The application of for-loops, which is more niche, doesn't need
>> ndarray's vectorized properties, so I'm not convinced it should return
>> an ndarray. It certainly seems simpler not to return an ndarray, due to
>> the dtype question.
>>
>> arange on its own seems to cover the need for a vectorized version of
>> range.
>>
>> Allan
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at python.org
>> https://mail.python.org/mailman/listinfo/numpy-discussion
>>
> ?
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181011/be4df84d/attachment-0001.html>

From wieser.eric+numpy at gmail.com  Thu Oct 11 22:29:56 2018
From: wieser.eric+numpy at gmail.com (Eric Wieser)
Date: Thu, 11 Oct 2018 19:29:56 -0700
Subject: [Numpy-discussion] ndrange, like range but multidimensiontal
In-Reply-To: <CAC=AwPxTCzUyQNQE2inj=D+0whknYypMQVOEpxK_XDoyGSPQeQ@mail.gmail.com>
References: <CAC=AwPwrrVz5P6Erpa0suc0wAsth7Aq80-DZeK=LDdKguum7rw@mail.gmail.com>
 <039bfa4b-5f14-0241-6fd6-a52b123ac176@gmail.com>
 <CAEQ_TvcP+E44bagTaV+XqZ0wopqgLv=JP1HyQYwibrecBjCOHw@mail.gmail.com>
 <CAC=AwPxrvy8zA9mKp4iOs1564AgSKmm+GOr8+azAWBtNh8H0BQ@mail.gmail.com>
 <CAEQ_TvdhZmYxNs5jj5u7umbQ8RqRoAkOVB+aVtQYTU-Y2drkwg@mail.gmail.com>
 <CAL1kJvBOb5HcSLmULyjX2sHcx49Cy4RGaz4VHCADajbAnZrj5w@mail.gmail.com>
 <b46cde38-751a-0a93-5a44-0c374ab83e88@gmail.com>
 <CAL1kJvAMdZbBXOMKRp_t1_=ONOGCiuKxdxvVA-qqW2jgOSweaw@mail.gmail.com>
 <CAC=AwPxTCzUyQNQE2inj=D+0whknYypMQVOEpxK_XDoyGSPQeQ@mail.gmail.com>
Message-ID: <CAL1kJvCPJxv-uXGQfr56yuQWs_0afm8M3bVURrkU5ZcfD1RNyA@mail.gmail.com>

I?m not sure if we ever want the ndrange object to return a full matrix.

np.array(ndrange(...)) should definitely return a full array, because
that?s what the user asked for.

Even if you supply a numpy uint8 to range, it still returns a python int
class.

If we want to design ndrange with the intent of indexing only, then it
should probably always use np.intp, whatever the type of the provided
arguments

Would you like ndrange to return a tuple of uint8 in this case?

Tuples are just one of the four options I listed in a previous message. The
downside of tuples is there?s no easy way to say ?take just the first axis
of this range?.
Whatever we pick, the return value should be such that
np.array(ndrange(...))[ind]
== ndrange(...)[idx]
?

On Thu, 11 Oct 2018 at 18:54 Mark Harfouche <mark.harfouche at gmail.com>
wrote:

> Eric, interesting ideas.
>
> > __getitem__(Tuple[int]) which returns numpy scalars
>
> I'm not sure what you mean. Even if you supply a numpy uint8 to range, it
> still returns a python int class.
> Would you like ndrange to return a tuple of `uint8` in this case?
>
> ```
> In [3]: a =
> iter(range(np.uint8(10)))
>
> In [4]:
> next(a).__class__
> Out[4]: int
>
> In [5]:
> np.uint8(10).__class__
> Out[5]: numpy.uint8
> ```
>
> Ravel seems like a cool way to choose iteration order. In the PR, I
> mentionned that one reason that I removed `'F'` order from the PR was:
> 1. My implementation was not competitive with the `C` order implementation
> in terms of speed (can be fixed)
> 2. I don't know if it something that people really need to iterate over
> collections (annoying to maintain if unused)
>
> Instead, I just showed an example how people could iterate in `F` order
> should they need to.
>
> I'm not sure if we ever want the `ndrange` object to return a full matrix.
> It seems like we would be creating a custom tuple class just for this which
> seems pretty niche.
>
>
> On Thu, Oct 11, 2018 at 10:21 AM Eric Wieser <wieser.eric+numpy at gmail.com>
> wrote:
>
>> Isn?t that what arange is for?
>>
>> Imagining ourselves in python2 land for now - I?m proposing arange is to
>> range, as ndrange is to xrange
>>
>> I?m not convinced it should return an ndarray
>>
>> I agree - I think it should return a range-like object that:
>>
>>    - Is convertible via __array__ if needed
>>    - Looks like an ndarray, with:
>>       - a .dtype attribute
>>       - a __getitem__(Tuple[int]) which returns numpy scalars
>>       - .ravel() and .flat for choosing iteration order.
>>
>> On Wed, 10 Oct 2018 at 11:21 Allan Haldane allanhaldane at gmail.com
>> <http://mailto:allanhaldane at gmail.com> wrote:
>>
>> On 10/10/18 12:34 AM, Eric Wieser wrote:
>>> > One thing that worries me here - in python, |range(...)| in essence
>>> > generates a lazy |list| - so I?d expect |ndrange| to generate a lazy
>>> > |ndarray|. In practice, that means it would be a duck-type defining an
>>> > |__array__| method to evaluate it, and only implement methods already
>>> > present in numpy.
>>>
>>> Isn't that what arange is for?
>>>
>>> It seems like there are two uses of python3's range: 1. creating a 1d
>>> iterable of indices for use in for-loops, and 2. with list(range) can be
>>> used to create a sequence of integers.
>>>
>>> Numpy can extend this in two directions:
>>>  * ndrange returns an iterable of nd indices (for for-loops).
>>>  * arange returns an 1d ndarray of integers instead of a list
>>>
>>> The application of for-loops, which is more niche, doesn't need
>>> ndarray's vectorized properties, so I'm not convinced it should return
>>> an ndarray. It certainly seems simpler not to return an ndarray, due to
>>> the dtype question.
>>>
>>> arange on its own seems to cover the need for a vectorized version of
>>> range.
>>>
>>> Allan
>>> _______________________________________________
>>> NumPy-Discussion mailing list
>>> NumPy-Discussion at python.org
>>> https://mail.python.org/mailman/listinfo/numpy-discussion
>>>
>> ?
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at python.org
>> https://mail.python.org/mailman/listinfo/numpy-discussion
>>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181011/774ad81c/attachment.html>

From mark.harfouche at gmail.com  Thu Oct 11 23:15:16 2018
From: mark.harfouche at gmail.com (Mark Harfouche)
Date: Thu, 11 Oct 2018 23:15:16 -0400
Subject: [Numpy-discussion] ndrange, like range but multidimensiontal
In-Reply-To: <CAL1kJvCPJxv-uXGQfr56yuQWs_0afm8M3bVURrkU5ZcfD1RNyA@mail.gmail.com>
References: <CAC=AwPwrrVz5P6Erpa0suc0wAsth7Aq80-DZeK=LDdKguum7rw@mail.gmail.com>
 <039bfa4b-5f14-0241-6fd6-a52b123ac176@gmail.com>
 <CAEQ_TvcP+E44bagTaV+XqZ0wopqgLv=JP1HyQYwibrecBjCOHw@mail.gmail.com>
 <CAC=AwPxrvy8zA9mKp4iOs1564AgSKmm+GOr8+azAWBtNh8H0BQ@mail.gmail.com>
 <CAEQ_TvdhZmYxNs5jj5u7umbQ8RqRoAkOVB+aVtQYTU-Y2drkwg@mail.gmail.com>
 <CAL1kJvBOb5HcSLmULyjX2sHcx49Cy4RGaz4VHCADajbAnZrj5w@mail.gmail.com>
 <b46cde38-751a-0a93-5a44-0c374ab83e88@gmail.com>
 <CAL1kJvAMdZbBXOMKRp_t1_=ONOGCiuKxdxvVA-qqW2jgOSweaw@mail.gmail.com>
 <CAC=AwPxTCzUyQNQE2inj=D+0whknYypMQVOEpxK_XDoyGSPQeQ@mail.gmail.com>
 <CAL1kJvCPJxv-uXGQfr56yuQWs_0afm8M3bVURrkU5ZcfD1RNyA@mail.gmail.com>
Message-ID: <CAC=AwPw1KN+=Y2OdgprRLMyv8cDvO=mvk2vBRairjmpbdVAhxA@mail.gmail.com>

> If we want to design ndrange with the intent of indexing only

This is the only use I had in mind. But I feel like you are able to
envision different use cases.

>  Whatever we pick, the return value should be such that np.array(ndrange(...))[ind]
== ndrange(...)[idx]
I can see the appeal to this.

On Thu, Oct 11, 2018 at 10:31 PM Eric Wieser <wieser.eric+numpy at gmail.com>
wrote:

> I?m not sure if we ever want the ndrange object to return a full matrix.
>
> np.array(ndrange(...)) should definitely return a full array, because
> that?s what the user asked for.
>
> Even if you supply a numpy uint8 to range, it still returns a python int
> class.
>
> If we want to design ndrange with the intent of indexing only, then it
> should probably always use np.intp, whatever the type of the provided
> arguments
>
> Would you like ndrange to return a tuple of uint8 in this case?
>
> Tuples are just one of the four options I listed in a previous message.
> The downside of tuples is there?s no easy way to say ?take just the first
> axis of this range?.
> Whatever we pick, the return value should be such that np.array(ndrange(...))[ind]
> == ndrange(...)[idx]
>
> On Thu, 11 Oct 2018 at 18:54 Mark Harfouche <mark.harfouche at gmail.com>
> wrote:
>
>> Eric, interesting ideas.
>>
>> > __getitem__(Tuple[int]) which returns numpy scalars
>>
>> I'm not sure what you mean. Even if you supply a numpy uint8 to range, it
>> still returns a python int class.
>> Would you like ndrange to return a tuple of `uint8` in this case?
>>
>> ```
>> In [3]: a =
>> iter(range(np.uint8(10)))
>>
>> In [4]:
>> next(a).__class__
>> Out[4]: int
>>
>> In [5]:
>> np.uint8(10).__class__
>> Out[5]: numpy.uint8
>> ```
>>
>> Ravel seems like a cool way to choose iteration order. In the PR, I
>> mentionned that one reason that I removed `'F'` order from the PR was:
>> 1. My implementation was not competitive with the `C` order
>> implementation in terms of speed (can be fixed)
>> 2. I don't know if it something that people really need to iterate over
>> collections (annoying to maintain if unused)
>>
>> Instead, I just showed an example how people could iterate in `F` order
>> should they need to.
>>
>> I'm not sure if we ever want the `ndrange` object to return a full
>> matrix. It seems like we would be creating a custom tuple class just for
>> this which seems pretty niche.
>>
>>
>> On Thu, Oct 11, 2018 at 10:21 AM Eric Wieser <wieser.eric+numpy at gmail.com>
>> wrote:
>>
>>> Isn?t that what arange is for?
>>>
>>> Imagining ourselves in python2 land for now - I?m proposing arange is
>>> to range, as ndrange is to xrange
>>>
>>> I?m not convinced it should return an ndarray
>>>
>>> I agree - I think it should return a range-like object that:
>>>
>>>    - Is convertible via __array__ if needed
>>>    - Looks like an ndarray, with:
>>>       - a .dtype attribute
>>>       - a __getitem__(Tuple[int]) which returns numpy scalars
>>>       - .ravel() and .flat for choosing iteration order.
>>>
>>> On Wed, 10 Oct 2018 at 11:21 Allan Haldane allanhaldane at gmail.com
>>> <http://mailto:allanhaldane at gmail.com> wrote:
>>>
>>> On 10/10/18 12:34 AM, Eric Wieser wrote:
>>>> > One thing that worries me here - in python, |range(...)| in essence
>>>> > generates a lazy |list| - so I?d expect |ndrange| to generate a lazy
>>>> > |ndarray|. In practice, that means it would be a duck-type defining an
>>>> > |__array__| method to evaluate it, and only implement methods already
>>>> > present in numpy.
>>>>
>>>> Isn't that what arange is for?
>>>>
>>>> It seems like there are two uses of python3's range: 1. creating a 1d
>>>> iterable of indices for use in for-loops, and 2. with list(range) can be
>>>> used to create a sequence of integers.
>>>>
>>>> Numpy can extend this in two directions:
>>>>  * ndrange returns an iterable of nd indices (for for-loops).
>>>>  * arange returns an 1d ndarray of integers instead of a list
>>>>
>>>> The application of for-loops, which is more niche, doesn't need
>>>> ndarray's vectorized properties, so I'm not convinced it should return
>>>> an ndarray. It certainly seems simpler not to return an ndarray, due to
>>>> the dtype question.
>>>>
>>>> arange on its own seems to cover the need for a vectorized version of
>>>> range.
>>>>
>>>> Allan
>>>> _______________________________________________
>>>> NumPy-Discussion mailing list
>>>> NumPy-Discussion at python.org
>>>> https://mail.python.org/mailman/listinfo/numpy-discussion
>>>>
>>> _______________________________________________
>>> NumPy-Discussion mailing list
>>> NumPy-Discussion at python.org
>>> https://mail.python.org/mailman/listinfo/numpy-discussion
>>>
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at python.org
>> https://mail.python.org/mailman/listinfo/numpy-discussion
>>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181011/1ce3194a/attachment-0001.html>

From stefanv at berkeley.edu  Fri Oct 12 01:43:58 2018
From: stefanv at berkeley.edu (Stefan van der Walt)
Date: Thu, 11 Oct 2018 22:43:58 -0700
Subject: [Numpy-discussion] BIDS/NumPy dev meetings, Wednesdays 12pm Pacific
Message-ID: <20181012054358.denpcg2dtoree6er@carbo>

Hi everyone,

The team at BIDS meets once a week to discuss progress, priorities, and
roadblocks.  While our priorities are broadly determined by the project
roadmap [0], we would like to provide an opportunity for the community
to give more regular and detailed feedback on our work.

We therefore invite you to join us for our weekly calls,
each **Wednesday from 12:00 to 13:00 Pacific Time**.

Detail of the next meeting (2018-10-17) is given in the agenda [1],
which is a growing document?feel free to add topics you wish to discuss.

We hope to see you there!  I will send another reminder next week.

Best regards,
St?fan


[0] https://www.numpy.org/neps/index.html
[1] https://hackmd.io/YZfpGn5BSu6acAFLBaRjtw#

From einstein.edison at gmail.com  Fri Oct 12 11:34:32 2018
From: einstein.edison at gmail.com (Hameer Abbasi)
Date: Fri, 12 Oct 2018 17:34:32 +0200
Subject: [Numpy-discussion] Exact semantics of ufunc.reduce
Message-ID: <87e71577-4896-44b0-b898-41568db5eebe@Canary>

Hello!

I?m trying to investigate the exact way ufunc.reduce works when given a custom dtype. Does it cast before or after the operation, or somewhere in between? How does this differ from ufunc.reduceat, for example?

We ran into this issue in pydata/sparse#191 (https://github.com/pydata/sparse/issues/191) when trying to match the two where the only thing differing is the number of zeros for sum, which shouldn?t change the result.

Best Regards,
Hameer Abbasi

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181012/211a20c7/attachment.html>

From ralf.gommers at gmail.com  Fri Oct 12 11:46:44 2018
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Fri, 12 Oct 2018 08:46:44 -0700
Subject: [Numpy-discussion] BIDS/NumPy dev meetings,
 Wednesdays 12pm Pacific
In-Reply-To: <20181012054358.denpcg2dtoree6er@carbo>
References: <20181012054358.denpcg2dtoree6er@carbo>
Message-ID: <CABL7CQhdbYzBWwvPOS6t90HA4joHiufu2-ZhKVD0iFe40cg+-g@mail.gmail.com>

On Thu, Oct 11, 2018 at 10:44 PM Stefan van der Walt <stefanv at berkeley.edu>
wrote:

> Hi everyone,
>
> The team at BIDS meets once a week to discuss progress, priorities, and
> roadblocks.  While our priorities are broadly determined by the project
> roadmap [0], we would like to provide an opportunity for the community
> to give more regular and detailed feedback on our work.
>
> We therefore invite you to join us for our weekly calls,
> each **Wednesday from 12:00 to 13:00 Pacific Time**.
>
> Detail of the next meeting (2018-10-17) is given in the agenda [1],
> which is a growing document?feel free to add topics you wish to discuss.
>
> We hope to see you there!  I will send another reminder next week.
>

Sounds like a good idea, thanks for doing this. I'm unlikely to make the
first two meetings, but will try to join when I can after that.

Cheers,
Ralf


> Best regards,
> St?fan
>
>
> [0] https://www.numpy.org/neps/index.html
> [1] https://hackmd.io/YZfpGn5BSu6acAFLBaRjtw#
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181012/72ac8297/attachment.html>

From sebastian at sipsolutions.net  Fri Oct 12 12:11:20 2018
From: sebastian at sipsolutions.net (Sebastian Berg)
Date: Fri, 12 Oct 2018 18:11:20 +0200
Subject: [Numpy-discussion] Exact semantics of ufunc.reduce
In-Reply-To: <87e71577-4896-44b0-b898-41568db5eebe@Canary>
References: <87e71577-4896-44b0-b898-41568db5eebe@Canary>
Message-ID: <eda4b2ba24ea4b65eb63cf74082f21edca43e2cc.camel@sipsolutions.net>

On Fri, 2018-10-12 at 17:34 +0200, Hameer Abbasi wrote:
> Hello!
> 
> I?m trying to investigate the exact way ufunc.reduce works when given
> a custom dtype. Does it cast before or after the operation, or
> somewhere in between? How does this differ from ufunc.reduceat, for
> example?
> 

I am not 100% sure, but I think giving the dtype definitely casts the
output type. And since most ufunc loops are defined as "ff->f", etc.
that effectively casts the input as well. It might be it casts the
input specifically, but I doubt it.

The cast will occur within the buffering machinery, so the cast is only
done in small chunks. But the operation itself should be performed
using the given dtype.

- Sebastian


> We ran into this issue in pydata/sparse#191 when trying to match the
> two where the only thing differing is the number of zeros for sum,
> which shouldn?t change the result.
> 
> Best Regards,
> Hameer Abbasi
> 
> 
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: This is a digitally signed message part
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181012/43c1db8e/attachment.sig>

From mark.harfouche at gmail.com  Mon Oct 15 08:44:01 2018
From: mark.harfouche at gmail.com (Mark Harfouche)
Date: Mon, 15 Oct 2018 08:44:01 -0400
Subject: [Numpy-discussion] A zeros_like implementation based on calloc
 instead of copyto
Message-ID: <CAC=AwPy0biOv_q1Bt1QnNP8fj=n6rxozYHjWdrpD3XZTrvo=Lg@mail.gmail.com>

Hello,

Currently, `zeros_like` is based `copyto` as opposed to `calloc`.
This causes inconsistencies in the amount of time it takes to create an
array with `zeros` + `shape` and `zeros_like` for large arrays.
This was first raised https://github.com/numpy/numpy/issues/9909

It seems to me that a memory copy can be avoided by using
`PyArray_NewFromDescr_int`
in C.

I propose creating a new C_API function `PyArray_NewZerosLikeArray` that
behaves much like the `PyArray_NewLikeArray` with the exception that it
calls
`PyArray_NewFromDescr_int` instead of `PyArray_NewFromDescr` to initialize
the array to zeros with calloc.

An all C implementation of `zeros_like` is also possible by adapting the
`empty_like` function.

A draft implementation is viewable
https://github.com/hmaarrfk/numpy/pull/2/files for those looking for more
details about my proposed implementation.

Thank you for considering.

Mark
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181015/4faf401c/attachment.html>

From stefanv at berkeley.edu  Tue Oct 16 17:26:57 2018
From: stefanv at berkeley.edu (Stefan van der Walt)
Date: Tue, 16 Oct 2018 14:26:57 -0700
Subject: [Numpy-discussion] BIDS/NumPy dev meetings,
 Wednesdays 12pm Pacific
In-Reply-To: <20181012054358.denpcg2dtoree6er@carbo>
References: <20181012054358.denpcg2dtoree6er@carbo>
Message-ID: <20181016212657.sk5ujmlp3p7hxw5b@carbo>

Hi everyone,

This is a friendly reminder of the BIDS/NumPy dev meetings, kicking off
tomorrow at 12pm Pacific time.

Please add any topics you wish to discuss to the agenda linked below.

Best regards,
St?fan


On Thu, 11 Oct 2018 22:43:58 -0700, Stefan van der Walt wrote:
> The team at BIDS meets once a week to discuss progress, priorities, and
> roadblocks.  While our priorities are broadly determined by the project
> roadmap [0], we would like to provide an opportunity for the community
> to give more regular and detailed feedback on our work.
> 
> We therefore invite you to join us for our weekly calls,
> each **Wednesday from 12:00 to 13:00 Pacific Time**.
> 
> Detail of the next meeting (2018-10-17) is given in the agenda [1],
> which is a growing document?feel free to add topics you wish to discuss.
> 
> We hope to see you there!  I will send another reminder next week.
> 
> 
> [0] https://www.numpy.org/neps/index.html
> [1] https://hackmd.io/YZfpGn5BSu6acAFLBaRjtw#

From allanhaldane at gmail.com  Tue Oct 16 18:41:18 2018
From: allanhaldane at gmail.com (Allan Haldane)
Date: Tue, 16 Oct 2018 18:41:18 -0400
Subject: [Numpy-discussion] BIDS/NumPy dev meetings,
 Wednesdays 12pm Pacific
In-Reply-To: <20181016212657.sk5ujmlp3p7hxw5b@carbo>
References: <20181012054358.denpcg2dtoree6er@carbo>
 <20181016212657.sk5ujmlp3p7hxw5b@carbo>
Message-ID: <668fa1d2-bd59-6444-4162-fd42f347939d@gmail.com>

I'll try to make it, especially as it looks like you want to discuss two
of my PRs! :)

I have a different meeting a bit before then which might run over
though, so sorry ahead of time if I'm not there.

Cheers,
Allan


On 10/16/18 5:26 PM, Stefan van der Walt wrote:
> Hi everyone,
> 
> This is a friendly reminder of the BIDS/NumPy dev meetings, kicking off
> tomorrow at 12pm Pacific time.
> 
> Please add any topics you wish to discuss to the agenda linked below.
> 
> Best regards,
> St?fan
> 
> 
> On Thu, 11 Oct 2018 22:43:58 -0700, Stefan van der Walt wrote:
>> The team at BIDS meets once a week to discuss progress, priorities, and
>> roadblocks.  While our priorities are broadly determined by the project
>> roadmap [0], we would like to provide an opportunity for the community
>> to give more regular and detailed feedback on our work.
>>
>> We therefore invite you to join us for our weekly calls,
>> each **Wednesday from 12:00 to 13:00 Pacific Time**.
>>
>> Detail of the next meeting (2018-10-17) is given in the agenda [1],
>> which is a growing document?feel free to add topics you wish to discuss.
>>
>> We hope to see you there!  I will send another reminder next week.
>>
>>
>> [0] https://www.numpy.org/neps/index.html
>> [1] https://hackmd.io/YZfpGn5BSu6acAFLBaRjtw#
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
> 


From ralf.gommers at gmail.com  Tue Oct 16 19:05:43 2018
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Tue, 16 Oct 2018 23:05:43 +0000
Subject: [Numpy-discussion] summary of NumFOCUS Summit & roadmap
 presentations
Message-ID: <CABL7CQjtGxSepDX7oCZvRzLzJ84EFTpk9++jJY50eEfwZWp4ug@mail.gmail.com>

Hi all,

At the end of September the NumFOCUS Summit was held; Allan and I both
attended on behalf of NumPy. I've written up a summary of the event from a
NumPy perspective:
https://rgommers.github.io/2018/10/2018-numfocus-summit---a-summary/. I
also link in that post to both of the presentations I gave on the NumPy
roadmap.

I suspect that what I wrote raises more questions than it answers, so
questions/ideas/criticism very welcome!

Cheers,
Ralf
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181016/c2aa5399/attachment.html>

From matthew.brett at gmail.com  Wed Oct 17 13:47:13 2018
From: matthew.brett at gmail.com (Matthew Brett)
Date: Wed, 17 Oct 2018 18:47:13 +0100
Subject: [Numpy-discussion] random.choice(replace=False) very slow
Message-ID: <CAH6Pt5ohKfRSsogwOZPz1jm9AFE5f7g83rDjij+bZq5RJcKmUw@mail.gmail.com>

Hi,

I noticed that numpy.random.choice was very slow, with the
replace=False option, and then I noticed it can (for most cases) be
made many hundreds of times faster in Python code:

In [18]: sample = np.random.uniform(size=1000000)
In [19]: timeit np.random.choice(sample, 500, replace=False)
        42.1 ms ? 214 ?s per loop (mean ? std. dev. of 7 runs, 10
loops each)
IIn [22]: def rc(x, size):
    ...:     n = np.prod(size)
    ...:     n_plus = n * 2
    ...:     inds = np.unique(np.random.randint(0, n_plus+1, size=n_plus))[:n]
    ...:     return x[inds].reshape(size)
In [23]: timeit rc(sample, 500)
86.5 ?s ? 421 ns per loop (mean ? std. dev. of 7 runs, 10000 loops each)each)

Is there a reason why it's so slow in C?  Could something more
intelligent than the above be used to speed it up?

Cheers,

Matthew

From matti.picus at gmail.com  Wed Oct 17 13:58:55 2018
From: matti.picus at gmail.com (Matti Picus)
Date: Wed, 17 Oct 2018 20:58:55 +0300
Subject: [Numpy-discussion] Approving NEP 27 - Historical discussion of 0-D
 arrays
Message-ID: <1650bf66-12fd-8bdd-21e3-8d5a0ecb206f@gmail.com>

In PR 12166 https://github.com/numpy/numpy/pull/12166 we revived an old 
wiki document discussing the implementation of 0-dimensional arrays. 
This became informational NEP-27 
http://www.numpy.org/neps/nep-0027-zero-rank-arrarys.html. There was 
fruitful discussion of the NEP and the need for both 0-D arrays and 
scalars on the PR comments. The NEP itself is informational and freezes 
the information to the 2006 discussion, noting that "some of the 
information here is dated, for instance indexing of 0-D arrays now is 
now implemented and does not error."


I would like to submit the NEP for discussion and approval.

Matti


From einstein.edison at gmail.com  Wed Oct 17 14:16:22 2018
From: einstein.edison at gmail.com (Hameer Abbasi)
Date: Wed, 17 Oct 2018 20:16:22 +0200
Subject: [Numpy-discussion]
 =?utf-8?Q?random.choice(replace=3DFalse)_?=very slow
In-Reply-To: <CAH6Pt5ohKfRSsogwOZPz1jm9AFE5f7g83rDjij+bZq5RJcKmUw@mail.gmail.com>
References: <CAH6Pt5ohKfRSsogwOZPz1jm9AFE5f7g83rDjij+bZq5RJcKmUw@mail.gmail.com>
Message-ID: <e2119708-8264-458b-9afb-50a56779bc41@Canary>

Hi!

The standard algorithm for sampling without replacement is ``O(N)`` expected for ``N < 0.5 * M`` where ``M`` is the length of the original set, but ``O(N^2)`` worst-case. When this is not true, a simple Durstenfeld-Fisher-Yates shuffle [1] (``O(M)``) can be used on the original set and then the first ``N`` items selected. Although this is fast, it uses up a large amount of memory (``O(M)`` extra memory rather than ``O(N)``) and I?m not sure where the best trade off is. It also can?t be used with an arbitrary probability distribution.

One way to handle this would be to sample a maximum of ``N // 2`` samples and then select the ?unselected? samples instead. Although this has a faster expected run-time than the standard algorithm in all cases, it would break backwards-compatibility guarantees.

Best Regards,
Hameer Abbasi

[1] https://en.wikipedia.org/wiki/Fisher%E2%80%93Yates_shuffle

> On Wednesday, Oct 17, 2018 at 7:48 PM, Matthew Brett <matthew.brett at gmail.com (mailto:matthew.brett at gmail.com)> wrote:
> Hi,
>
> I noticed that numpy.random.choice was very slow, with the
> replace=False option, and then I noticed it can (for most cases) be
> made many hundreds of times faster in Python code:
>
> In [18]: sample = np.random.uniform(size=1000000)
> In [19]: timeit np.random.choice(sample, 500, replace=False)
> 42.1 ms ? 214 ?s per loop (mean ? std. dev. of 7 runs, 10
> loops each)
> IIn [22]: def rc(x, size):
> ...: n = np.prod(size)
> ...: n_plus = n * 2
> ...: inds = np.unique(np.random.randint(0, n_plus+1, size=n_plus))[:n]
> ...: return x[inds].reshape(size)
> In [23]: timeit rc(sample, 500)
> 86.5 ?s ? 421 ns per loop (mean ? std. dev. of 7 runs, 10000 loops each)each)
>
> Is there a reason why it's so slow in C? Could something more
> intelligent than the above be used to speed it up?
>
> Cheers,
>
> Matthew
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181017/fd4d5979/attachment.html>

From einstein.edison at gmail.com  Wed Oct 17 14:34:04 2018
From: einstein.edison at gmail.com (Hameer Abbasi)
Date: Wed, 17 Oct 2018 20:34:04 +0200
Subject: [Numpy-discussion] Approving NEP 27 - Historical discussion of
 0-D arrays
In-Reply-To: <1650bf66-12fd-8bdd-21e3-8d5a0ecb206f@gmail.com>
References: <1650bf66-12fd-8bdd-21e3-8d5a0ecb206f@gmail.com>
Message-ID: <953635be-050c-4f6a-88d2-0e3931fe3ca4@Canary>

Hi everyone,

Ah, I neglected to see that the PR was already merged. In any case, I?ll repeat my comment here (referring to the indexing section):

I would suggest that this section be removed entirely or updated. For example, if xis either an array scalar or a rank zero array, x[...] is guaranteed to be an array and x[()]is guaranteed to be a scalar. The difference is because x[{anything here}, ...] is guaranteed to be an array. In words, if the last index is an ellipsis, the result of indexing is guaranteed to be an array.

I came across this weird behaviour when implementing the equivalent of np.wherefor PyData/Sparse.

Best Regards,
Hameer Abbasi

> On Wednesday, Oct 17, 2018 at 7:59 PM, Matti Picus <matti.picus at gmail.com (mailto:matti.picus at gmail.com)> wrote:
> In PR 12166 https://github.com/numpy/numpy/pull/12166 we revived an old
> wiki document discussing the implementation of 0-dimensional arrays.
> This became informational NEP-27
> http://www.numpy.org/neps/nep-0027-zero-rank-arrarys.html. There was
> fruitful discussion of the NEP and the need for both 0-D arrays and
> scalars on the PR comments. The NEP itself is informational and freezes
> the information to the 2006 discussion, noting that "some of the
> information here is dated, for instance indexing of 0-D arrays now is
> now implemented and does not error."
>
>
> I would like to submit the NEP for discussion and approval.
>
> Matti
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181017/238d7dd1/attachment-0001.html>

From mark.harfouche at gmail.com  Wed Oct 17 14:58:52 2018
From: mark.harfouche at gmail.com (Mark Harfouche)
Date: Wed, 17 Oct 2018 14:58:52 -0400
Subject: [Numpy-discussion] BIDS/NumPy dev meetings,
 Wednesdays 12pm Pacific
In-Reply-To: <668fa1d2-bd59-6444-4162-fd42f347939d@gmail.com>
References: <20181012054358.denpcg2dtoree6er@carbo>
 <20181016212657.sk5ujmlp3p7hxw5b@carbo>
 <668fa1d2-bd59-6444-4162-fd42f347939d@gmail.com>
Message-ID: <CAC=AwPzk_fjX2bc9nXoJOYMpKrk44Wgh+rj9D_5CZQAyehnG3Q@mail.gmail.com>

Stefan. I would like to simply listen in. I cant seem to find the meeting
ID that we need to call in.

On Tue, Oct 16, 2018 at 6:42 PM Allan Haldane <allanhaldane at gmail.com>
wrote:

> I'll try to make it, especially as it looks like you want to discuss two
> of my PRs! :)
>
> I have a different meeting a bit before then which might run over
> though, so sorry ahead of time if I'm not there.
>
> Cheers,
> Allan
>
>
> On 10/16/18 5:26 PM, Stefan van der Walt wrote:
> > Hi everyone,
> >
> > This is a friendly reminder of the BIDS/NumPy dev meetings, kicking off
> > tomorrow at 12pm Pacific time.
> >
> > Please add any topics you wish to discuss to the agenda linked below.
> >
> > Best regards,
> > St?fan
> >
> >
> > On Thu, 11 Oct 2018 22:43:58 -0700, Stefan van der Walt wrote:
> >> The team at BIDS meets once a week to discuss progress, priorities, and
> >> roadblocks.  While our priorities are broadly determined by the project
> >> roadmap [0], we would like to provide an opportunity for the community
> >> to give more regular and detailed feedback on our work.
> >>
> >> We therefore invite you to join us for our weekly calls,
> >> each **Wednesday from 12:00 to 13:00 Pacific Time**.
> >>
> >> Detail of the next meeting (2018-10-17) is given in the agenda [1],
> >> which is a growing document?feel free to add topics you wish to discuss.
> >>
> >> We hope to see you there!  I will send another reminder next week.
> >>
> >>
> >> [0] https://www.numpy.org/neps/index.html
> >> [1] https://hackmd.io/YZfpGn5BSu6acAFLBaRjtw#
> > _______________________________________________
> > NumPy-Discussion mailing list
> > NumPy-Discussion at python.org
> > https://mail.python.org/mailman/listinfo/numpy-discussion
> >
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181017/72436bef/attachment.html>

From einstein.edison at gmail.com  Wed Oct 17 15:06:56 2018
From: einstein.edison at gmail.com (Hameer Abbasi)
Date: Wed, 17 Oct 2018 21:06:56 +0200
Subject: [Numpy-discussion] BIDS/NumPy dev meetings,
 Wednesdays 12pm Pacific
In-Reply-To: <CAC=AwPzk_fjX2bc9nXoJOYMpKrk44Wgh+rj9D_5CZQAyehnG3Q@mail.gmail.com>
References: <20181012054358.denpcg2dtoree6er@carbo>
 <20181016212657.sk5ujmlp3p7hxw5b@carbo>
 <668fa1d2-bd59-6444-4162-fd42f347939d@gmail.com>
 <CAC=AwPzk_fjX2bc9nXoJOYMpKrk44Wgh+rj9D_5CZQAyehnG3Q@mail.gmail.com>
Message-ID: <2c725077-5027-443e-a69b-8e98a1f122eb@Canary>

Dial in: https://berkeley.zoom.us/zoomconference?m=ta2dUMqcdK219Ov78Sj7CMIzzoX2CHGZ

Join in via PC: https://berkeley.zoom.us/j/400054438

Best Regards,
Hameer Abbasi

> On Wednesday, Oct 17, 2018 at 8:59 PM, Mark Harfouche <mark.harfouche at gmail.com (mailto:mark.harfouche at gmail.com)> wrote:
> Stefan. I would like to simply listen in. I cant seem to find the meeting ID that we need to call in.
>
> On Tue, Oct 16, 2018 at 6:42 PM Allan Haldane <allanhaldane at gmail.com (mailto:allanhaldane at gmail.com)> wrote:
> > I'll try to make it, especially as it looks like you want to discuss two
> > of my PRs! :)
> >
> > I have a different meeting a bit before then which might run over
> > though, so sorry ahead of time if I'm not there.
> >
> > Cheers,
> > Allan
> >
> >
> > On 10/16/18 5:26 PM, Stefan van der Walt wrote:
> > > Hi everyone,
> > >
> > > This is a friendly reminder of the BIDS/NumPy dev meetings, kicking off
> > > tomorrow at 12pm Pacific time.
> > >
> > > Please add any topics you wish to discuss to the agenda linked below.
> > >
> > > Best regards,
> > > St?fan
> > >
> > >
> > > On Thu, 11 Oct 2018 22:43:58 -0700, Stefan van der Walt wrote:
> > >> The team at BIDS meets once a week to discuss progress, priorities, and
> > >> roadblocks. While our priorities are broadly determined by the project
> > >> roadmap [0], we would like to provide an opportunity for the community
> > >> to give more regular and detailed feedback on our work.
> > >>
> > >> We therefore invite you to join us for our weekly calls,
> > >> each **Wednesday from 12:00 to 13:00 Pacific Time**.
> > >>
> > >> Detail of the next meeting (2018-10-17) is given in the agenda [1],
> > >> which is a growing document?feel free to add topics you wish to discuss.
> > >>
> > >> We hope to see you there! I will send another reminder next week.
> > >>
> > >>
> > >> [0] https://www.numpy.org/neps/index.html
> > >> [1] https://hackmd.io/YZfpGn5BSu6acAFLBaRjtw#
> > > _______________________________________________
> > > NumPy-Discussion mailing list
> > > NumPy-Discussion at python.org (mailto:NumPy-Discussion at python.org)
> > > https://mail.python.org/mailman/listinfo/numpy-discussion
> > >
> >
> > _______________________________________________
> > NumPy-Discussion mailing list
> > NumPy-Discussion at python.org (mailto:NumPy-Discussion at python.org)
> > https://mail.python.org/mailman/listinfo/numpy-discussion
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181017/bae2506b/attachment.html>

From stefanv at berkeley.edu  Wed Oct 17 19:16:32 2018
From: stefanv at berkeley.edu (Stefan van der Walt)
Date: Wed, 17 Oct 2018 16:16:32 -0700
Subject: [Numpy-discussion] Approving NEP 27 - Historical discussion of
 0-D arrays
In-Reply-To: <1650bf66-12fd-8bdd-21e3-8d5a0ecb206f@gmail.com>
References: <1650bf66-12fd-8bdd-21e3-8d5a0ecb206f@gmail.com>
Message-ID: <20181017231632.6bqdujmnqeskhehu@carbo>

On Wed, 17 Oct 2018 20:58:55 +0300, Matti Picus wrote:
> http://www.numpy.org/neps/nep-0027-zero-rank-arrarys.html. There was
> fruitful discussion of the NEP and the need for both 0-D arrays and scalars
> on the PR comments.

Were those comments integrated back into the NEP?  If not, can we add a
paragraph to summarize the discussion?

St?fan

From shoyer at gmail.com  Wed Oct 17 20:39:12 2018
From: shoyer at gmail.com (Stephan Hoyer)
Date: Wed, 17 Oct 2018 17:39:12 -0700
Subject: [Numpy-discussion] Approving NEP 27 - Historical discussion of
 0-D arrays
In-Reply-To: <20181017231632.6bqdujmnqeskhehu@carbo>
References: <1650bf66-12fd-8bdd-21e3-8d5a0ecb206f@gmail.com>
 <20181017231632.6bqdujmnqeskhehu@carbo>
Message-ID: <CAEQ_Tve3ojibA_0p-h_0pOPupKUP-=M+XtGdnUVZHn9B7WbxSg@mail.gmail.com>

On Wed, Oct 17, 2018 at 4:16 PM Stefan van der Walt <stefanv at berkeley.edu>
wrote:

> On Wed, 17 Oct 2018 20:58:55 +0300, Matti Picus wrote:
> > http://www.numpy.org/neps/nep-0027-zero-rank-arrarys.html. There was
> > fruitful discussion of the NEP and the need for both 0-D arrays and
> scalars
> > on the PR comments.
>
> Were those comments integrated back into the NEP?  If not, can we add a
> paragraph to summarize the discussion?
>

Yes, it's in the second paragraph of the "Abstract" section.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181017/9f33962e/attachment.html>

From shoyer at gmail.com  Wed Oct 17 20:40:50 2018
From: shoyer at gmail.com (Stephan Hoyer)
Date: Wed, 17 Oct 2018 17:40:50 -0700
Subject: [Numpy-discussion] Approving NEP 27 - Historical discussion of
 0-D arrays
In-Reply-To: <1650bf66-12fd-8bdd-21e3-8d5a0ecb206f@gmail.com>
References: <1650bf66-12fd-8bdd-21e3-8d5a0ecb206f@gmail.com>
Message-ID: <CAEQ_Tvebu34HGHVS7Jve8fDGSZQwhBG=-rZf19tDj4yBy-+gqA@mail.gmail.com>

On Wed, Oct 17, 2018 at 10:59 AM Matti Picus <matti.picus at gmail.com> wrote:

> In PR 12166 https://github.com/numpy/numpy/pull/12166 we revived an old
> wiki document discussing the implementation of 0-dimensional arrays.
> This became informational NEP-27
> http://www.numpy.org/neps/nep-0027-zero-rank-arrarys.html. There was
> fruitful discussion of the NEP and the need for both 0-D arrays and
> scalars on the PR comments.
>

 We might consider adding a link to the PR under a "Discussion" section,
like what you can see for NEP 16.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181017/a3033910/attachment.html>

From stefanv at berkeley.edu  Thu Oct 18 21:10:45 2018
From: stefanv at berkeley.edu (Stefan van der Walt)
Date: Thu, 18 Oct 2018 18:10:45 -0700
Subject: [Numpy-discussion] BIDS/NumPy dev meetings,
 Wednesdays 12pm Pacific
In-Reply-To: <20181016212657.sk5ujmlp3p7hxw5b@carbo>
References: <20181012054358.denpcg2dtoree6er@carbo>
 <20181016212657.sk5ujmlp3p7hxw5b@carbo>
Message-ID: <20181019011045.a7ofrndp6p2om43q@carbo>

Thank you to everyone who attended the development meeting this week.
We've posted the agenda/notes online at:

https://github.com/BIDS-numpy/docs/blob/master/status_meetings/status-2018-10-17.md

On Tue, 16 Oct 2018 14:26:57 -0700, Stefan van der Walt wrote:
> Hi everyone,
> 
> This is a friendly reminder of the BIDS/NumPy dev meetings, kicking off
> tomorrow at 12pm Pacific time.
> 
> Please add any topics you wish to discuss to the agenda linked below.
> 
> Best regards,
> St?fan

From matti.picus at gmail.com  Fri Oct 19 04:02:01 2018
From: matti.picus at gmail.com (Matti Picus)
Date: Fri, 19 Oct 2018 11:02:01 +0300
Subject: [Numpy-discussion] Removing priority labels from github
Message-ID: <b10b4607-65e0-ed25-b89d-a07fd2707898@gmail.com>

We currently have highest, high, normal, low, and lowest priority labels 
for github issues/PRs. At the recent status meeting, we proposed 
consolidating these to a single "high" priority label. Anything "low" 
priority should be merged or closed since it will be quickly forgotten, 
and no "normal" tag is needed.


With that, we (the BIDS team) would like to encourage reviewers to use 
the "high" priority tag to indicate things we should be working on.

Any objections or thoughts?


Matti (in the names of Tyler and Stefan)


From matti.picus at gmail.com  Fri Oct 19 04:28:36 2018
From: matti.picus at gmail.com (Matti Picus)
Date: Fri, 19 Oct 2018 11:28:36 +0300
Subject: [Numpy-discussion] asanyarray vs. asarray
Message-ID: <a7e76515-95ab-e867-dc96-92c54d8b0952@gmail.com>

An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181019/950a5083/attachment.html>

From einstein.edison at gmail.com  Fri Oct 19 04:37:41 2018
From: einstein.edison at gmail.com (Hameer Abbasi)
Date: Fri, 19 Oct 2018 10:37:41 +0200
Subject: [Numpy-discussion] asanyarray vs. asarray
In-Reply-To: <a7e76515-95ab-e867-dc96-92c54d8b0952@gmail.com>
References: <a7e76515-95ab-e867-dc96-92c54d8b0952@gmail.com>
Message-ID: <24100c7f-20fd-4eed-99b0-d37660f52223@Canary>

Hi all

> On Friday, Oct 19, 2018 at 10:28 AM, Matti Picus <matti.picus at gmail.com (mailto:matti.picus at gmail.com)> wrote:
>
> Was there discussion around which of `asarray` or asanyarray` to prefer? PR 11162, https://github.com/numpy/numpy/pull/11162, proposes `asanyarray` in place of `asarray` at the entrance to `_quantile_ureduce_func` to preserve ndarray subclasses. Should we be looking into changing all the `asarray` calls into `asanyarray`?
>
>
>

I suspect that this will cause a large number of problems around np.matrix, so unless we deprecate that, this might cause a large amount of problems. The problem with np.matrix is that it?s a subclass, but it?s not substitutable for the base class, and so violates SOLID.

There are efforts to remove np.matrix, with the largest consumer being scipy.sparse, so unless that?s revamped, deprecating np.matrix is kind of hard to do.
>
>
>
>
>
> Matti
>
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion

Best Regards,
Hameer Abbasi

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181019/8bf33452/attachment.html>

From sebastian at sipsolutions.net  Fri Oct 19 05:21:34 2018
From: sebastian at sipsolutions.net (Sebastian Berg)
Date: Fri, 19 Oct 2018 11:21:34 +0200
Subject: [Numpy-discussion] Removing priority labels from github
In-Reply-To: <b10b4607-65e0-ed25-b89d-a07fd2707898@gmail.com>
References: <b10b4607-65e0-ed25-b89d-a07fd2707898@gmail.com>
Message-ID: <22c3981e90e72bf4df9d02c02ce2277d1c7d6e41.camel@sipsolutions.net>

On Fri, 2018-10-19 at 11:02 +0300, Matti Picus wrote:
> We currently have highest, high, normal, low, and lowest priority
> labels 
> for github issues/PRs. At the recent status meeting, we proposed 
> consolidating these to a single "high" priority label. Anything
> "low" 
> priority should be merged or closed since it will be quickly
> forgotten, 
> and no "normal" tag is needed.
> 
> 
> With that, we (the BIDS team) would like to encourage reviewers to
> use 
> the "high" priority tag to indicate things we should be working on.
> 
> Any objections or thoughts?
> 

Sounds like a plan, especially having practically meaningless tags
right now is no help. Most of them are historical and personally I have
only been using the milestones to tag things as high priority (very
occasionally).

- Sebastian


> 
> Matti (in the names of Tyler and Stefan)
> 
> 
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
> 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: This is a digitally signed message part
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181019/59384216/attachment-0001.sig>

From m.h.vankerkwijk at gmail.com  Fri Oct 19 11:11:31 2018
From: m.h.vankerkwijk at gmail.com (Marten van Kerkwijk)
Date: Fri, 19 Oct 2018 11:11:31 -0400
Subject: [Numpy-discussion] asanyarray vs. asarray
In-Reply-To: <24100c7f-20fd-4eed-99b0-d37660f52223@Canary>
References: <a7e76515-95ab-e867-dc96-92c54d8b0952@gmail.com>
 <24100c7f-20fd-4eed-99b0-d37660f52223@Canary>
Message-ID: <CAJNV+9vH2o_2KCiMVc6S1ww=-NXpfGK39UoJVKKJNmv7pN602A@mail.gmail.com>

There are exceptions for `matrix` in quite a few places, and there now is
warning for `maxtrix` - it might not be bad to use `asanyarray` and add an
exception for `maxtrix`. Indeed, I quite like the suggestion by Eric Wieser
to just add the exception to `asanyarray` itself - that way when matrix is
truly deprecated, it will be a very easy change.

-- Marten
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181019/be3ca9f8/attachment.html>

From shoyer at gmail.com  Fri Oct 19 12:09:18 2018
From: shoyer at gmail.com (Stephan Hoyer)
Date: Fri, 19 Oct 2018 09:09:18 -0700
Subject: [Numpy-discussion] asanyarray vs. asarray
In-Reply-To: <CAJNV+9vH2o_2KCiMVc6S1ww=-NXpfGK39UoJVKKJNmv7pN602A@mail.gmail.com>
References: <a7e76515-95ab-e867-dc96-92c54d8b0952@gmail.com>
 <24100c7f-20fd-4eed-99b0-d37660f52223@Canary>
 <CAJNV+9vH2o_2KCiMVc6S1ww=-NXpfGK39UoJVKKJNmv7pN602A@mail.gmail.com>
Message-ID: <CAEQ_TvezgD7_KNB5PD_zjKQ=j+dN-k7BkEzNVVM9+VNAw8b85w@mail.gmail.com>

I don't think it makes much sense to change NumPy's existing usage of
asarray() to asanyarray() unless we add subok=True arguments (which default
to False). But this ends up cluttering NumPy's public API, which is also
undesirable. The preferred way to override NumPy functions going forward
should be __array_function__.

On Fri, Oct 19, 2018 at 8:13 AM Marten van Kerkwijk <
m.h.vankerkwijk at gmail.com> wrote:

> There are exceptions for `matrix` in quite a few places, and there now is
> warning for `maxtrix` - it might not be bad to use `asanyarray` and add an
> exception for `maxtrix`. Indeed, I quite like the suggestion by Eric Wieser
> to just add the exception to `asanyarray` itself - that way when matrix is
> truly deprecated, it will be a very easy change.
>
> -- Marten
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181019/d2f9d3f6/attachment.html>

From einstein.edison at gmail.com  Fri Oct 19 12:15:08 2018
From: einstein.edison at gmail.com (Hameer Abbasi)
Date: Fri, 19 Oct 2018 18:15:08 +0200
Subject: [Numpy-discussion] asanyarray vs. asarray
In-Reply-To: <CAEQ_TvezgD7_KNB5PD_zjKQ=j+dN-k7BkEzNVVM9+VNAw8b85w@mail.gmail.com>
References: <a7e76515-95ab-e867-dc96-92c54d8b0952@gmail.com>
 <24100c7f-20fd-4eed-99b0-d37660f52223@Canary>
 <CAJNV+9vH2o_2KCiMVc6S1ww=-NXpfGK39UoJVKKJNmv7pN602A@mail.gmail.com>
 <CAEQ_TvezgD7_KNB5PD_zjKQ=j+dN-k7BkEzNVVM9+VNAw8b85w@mail.gmail.com>
Message-ID: <cb1364db-9e46-4614-a1df-d5dc33ad2c64@Canary>

Hi!

> On Friday, Oct 19, 2018 at 6:09 PM, Stephan Hoyer <shoyer at gmail.com (mailto:shoyer at gmail.com)> wrote:
> I don't think it makes much sense to change NumPy's existing usage of asarray() to asanyarray() unless we add subok=True arguments (which default to False). But this ends up cluttering NumPy's public API, which is also undesirable.
>
Agreed so far.
>
> The preferred way to override NumPy functions going forward should be __array_function__.
>

I think we should ?soft support? i.e. allow but consider unsupported, the case where one of NumPy?s functions is implemented in terms of others and ?passing through? an array results in the correct behaviour for that array.

>
> On Fri, Oct 19, 2018 at 8:13 AM Marten van Kerkwijk <m.h.vankerkwijk at gmail.com (mailto:m.h.vankerkwijk at gmail.com)> wrote:
> > There are exceptions for `matrix` in quite a few places, and there now is warning for `maxtrix` - it might not be bad to use `asanyarray` and add an exception for `maxtrix`. Indeed, I quite like the suggestion by Eric Wieser to just add the exception to `asanyarray` itself - that way when matrix is truly deprecated, it will be a very easy change.
> >
> > -- Marten
> > _______________________________________________
> > NumPy-Discussion mailing list
> > NumPy-Discussion at python.org (mailto:NumPy-Discussion at python.org)
> > https://mail.python.org/mailman/listinfo/numpy-discussion
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion

Best Regards,
Hameer Abbasi

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181019/f484bbdf/attachment.html>

From shoyer at gmail.com  Fri Oct 19 12:48:57 2018
From: shoyer at gmail.com (Stephan Hoyer)
Date: Fri, 19 Oct 2018 09:48:57 -0700
Subject: [Numpy-discussion] Removing priority labels from github
In-Reply-To: <22c3981e90e72bf4df9d02c02ce2277d1c7d6e41.camel@sipsolutions.net>
References: <b10b4607-65e0-ed25-b89d-a07fd2707898@gmail.com>
 <22c3981e90e72bf4df9d02c02ce2277d1c7d6e41.camel@sipsolutions.net>
Message-ID: <CAEQ_Tvc8nJsyp3gdDWXnH5PVLrR18=MY17umvTNoWzTpu4iVGA@mail.gmail.com>

On Fri, Oct 19, 2018 at 2:22 AM Sebastian Berg <sebastian at sipsolutions.net>
wrote:

> On Fri, 2018-10-19 at 11:02 +0300, Matti Picus wrote:
> > We currently have highest, high, normal, low, and lowest priority
> > labels
> > for github issues/PRs. At the recent status meeting, we proposed
> > consolidating these to a single "high" priority label. Anything
> > "low"
> > priority should be merged or closed since it will be quickly
> > forgotten,
> > and no "normal" tag is needed.
> >
> >
> > With that, we (the BIDS team) would like to encourage reviewers to
> > use
> > the "high" priority tag to indicate things we should be working on.
> >
> > Any objections or thoughts?
> >
>
> Sounds like a plan, especially having practically meaningless tags
> right now is no help. Most of them are historical and personally I have
> only been using the milestones to tag things as high priority (very
> occasionally).
>
> - Sebastian
>

+1 from me as well. I haven't been using these tags at all.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181019/67251a9f/attachment.html>

From ralf.gommers at gmail.com  Fri Oct 19 16:08:27 2018
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Fri, 19 Oct 2018 20:08:27 +0000
Subject: [Numpy-discussion] Removing priority labels from github
In-Reply-To: <CAEQ_Tvc8nJsyp3gdDWXnH5PVLrR18=MY17umvTNoWzTpu4iVGA@mail.gmail.com>
References: <b10b4607-65e0-ed25-b89d-a07fd2707898@gmail.com>
 <22c3981e90e72bf4df9d02c02ce2277d1c7d6e41.camel@sipsolutions.net>
 <CAEQ_Tvc8nJsyp3gdDWXnH5PVLrR18=MY17umvTNoWzTpu4iVGA@mail.gmail.com>
Message-ID: <CABL7CQh2MS_GKdXF22Odnzk+5vePb_NC_aoyhCOBBubNkTcSOA@mail.gmail.com>

On Fri, Oct 19, 2018 at 4:49 PM Stephan Hoyer <shoyer at gmail.com> wrote:

>
>
> On Fri, Oct 19, 2018 at 2:22 AM Sebastian Berg <sebastian at sipsolutions.net>
> wrote:
>
>> On Fri, 2018-10-19 at 11:02 +0300, Matti Picus wrote:
>> > We currently have highest, high, normal, low, and lowest priority
>> > labels
>> > for github issues/PRs. At the recent status meeting, we proposed
>> > consolidating these to a single "high" priority label. Anything
>> > "low"
>> > priority should be merged or closed since it will be quickly
>> > forgotten,
>> > and no "normal" tag is needed.
>> >
>> >
>> > With that, we (the BIDS team) would like to encourage reviewers to
>> > use
>> > the "high" priority tag to indicate things we should be working on.
>> >
>> > Any objections or thoughts?
>> >
>>
>> Sounds like a plan, especially having practically meaningless tags
>> right now is no help. Most of them are historical and personally I have
>> only been using the milestones to tag things as high priority (very
>> occasionally).
>>
>> - Sebastian
>>
>
> +1 from me as well. I haven't been using these tags at all.
>

+1

Ralf
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181019/8372695a/attachment-0001.html>

From ralf.gommers at gmail.com  Fri Oct 19 18:28:28 2018
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Fri, 19 Oct 2018 22:28:28 +0000
Subject: [Numpy-discussion] asanyarray vs. asarray
In-Reply-To: <cb1364db-9e46-4614-a1df-d5dc33ad2c64@Canary>
References: <a7e76515-95ab-e867-dc96-92c54d8b0952@gmail.com>
 <24100c7f-20fd-4eed-99b0-d37660f52223@Canary>
 <CAJNV+9vH2o_2KCiMVc6S1ww=-NXpfGK39UoJVKKJNmv7pN602A@mail.gmail.com>
 <CAEQ_TvezgD7_KNB5PD_zjKQ=j+dN-k7BkEzNVVM9+VNAw8b85w@mail.gmail.com>
 <cb1364db-9e46-4614-a1df-d5dc33ad2c64@Canary>
Message-ID: <CABL7CQgLqiTAO4H+3MUZzDbF3eR_njO4kH9UD1C_t27gnkGYYw@mail.gmail.com>

On Fri, Oct 19, 2018 at 4:15 PM Hameer Abbasi <einstein.edison at gmail.com>
wrote:

> Hi!
>
> On Friday, Oct 19, 2018 at 6:09 PM, Stephan Hoyer <shoyer at gmail.com>
> wrote:
> I don't think it makes much sense to change NumPy's existing usage of
> asarray() to asanyarray() unless we add subok=True arguments (which default
> to False). But this ends up cluttering NumPy's public API, which is also
> undesirable.
>
> Agreed so far.
>

I'm not sure I agree. "subok" is very unpythonic; the average numpy library
function should work fine for a well-behaved subclass (i.e. most things out
there except np.matrix).

>
> The preferred way to override NumPy functions going forward should be
> __array_function__.
>
>
> I think we should ?soft support? i.e. allow but consider unsupported, the
> case where one of NumPy?s functions is implemented in terms of others and
> ?passing through? an array results in the correct behaviour for that array.
>

I don't think we have or want such a concept as "soft support". We intend
to not break anything that now has asanyarray, i.e. it's supported and
ideally we have regression tests for all such functions. For anything we
transition over from asarray to asanyarray, PRs should come with new tests.


>
> On Fri, Oct 19, 2018 at 8:13 AM Marten van Kerkwijk <
> m.h.vankerkwijk at gmail.com> wrote:
>
>> There are exceptions for `matrix` in quite a few places, and there now is
>> warning for `maxtrix` - it might not be bad to use `asanyarray` and add an
>> exception for `maxtrix`. Indeed, I quite like the suggestion by Eric Wieser
>> to just add the exception to `asanyarray` itself - that way when matrix is
>> truly deprecated, it will be a very easy change.
>>
> I don't quite understand this. Adding exceptions is not deprecation - we
then may as well just rip np.matrix out straight away.

What I suggested in the call about this issue is that it's not very
effective to treat functions like percentile/quantile one by one without an
overarching strategy. A way forward could be for someone to write an
overview of which sets of functions now have asanyarray (and actually work
with subclasses), which ones we can and want to change now, and which ones
we can and want to change after np.matrix is gone. Also, some guidelines
for new functions that we add to numpy would be handy. I suspect we've been
adding new functions that use asarray rather than asanyarray, which is
probably undesired.

Cheers,
Ralf


>
>> -- Marten
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at python.org
>> https://mail.python.org/mailman/listinfo/numpy-discussion
>>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
>
> Best Regards,
> Hameer Abbasi
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181019/f5ce0f99/attachment.html>

From njs at pobox.com  Fri Oct 19 18:40:10 2018
From: njs at pobox.com (Nathaniel Smith)
Date: Fri, 19 Oct 2018 15:40:10 -0700
Subject: [Numpy-discussion] asanyarray vs. asarray
In-Reply-To: <CABL7CQgLqiTAO4H+3MUZzDbF3eR_njO4kH9UD1C_t27gnkGYYw@mail.gmail.com>
References: <a7e76515-95ab-e867-dc96-92c54d8b0952@gmail.com>
 <24100c7f-20fd-4eed-99b0-d37660f52223@Canary>
 <CAJNV+9vH2o_2KCiMVc6S1ww=-NXpfGK39UoJVKKJNmv7pN602A@mail.gmail.com>
 <CAEQ_TvezgD7_KNB5PD_zjKQ=j+dN-k7BkEzNVVM9+VNAw8b85w@mail.gmail.com>
 <cb1364db-9e46-4614-a1df-d5dc33ad2c64@Canary>
 <CABL7CQgLqiTAO4H+3MUZzDbF3eR_njO4kH9UD1C_t27gnkGYYw@mail.gmail.com>
Message-ID: <CAPJVwBnM_sr5oBMBSZ4U5S6snF+1Wt=Xnt_XvBCt47L27hv9iw@mail.gmail.com>

On Fri, Oct 19, 2018 at 3:28 PM, Ralf Gommers <ralf.gommers at gmail.com>
wrote:

>
>
> On Fri, Oct 19, 2018 at 4:15 PM Hameer Abbasi <einstein.edison at gmail.com>
> wrote:
>
>> Hi!
>>
>> On Friday, Oct 19, 2018 at 6:09 PM, Stephan Hoyer <shoyer at gmail.com>
>> wrote:
>> I don't think it makes much sense to change NumPy's existing usage of
>> asarray() to asanyarray() unless we add subok=True arguments (which default
>> to False). But this ends up cluttering NumPy's public API, which is also
>> undesirable.
>>
>> Agreed so far.
>>
>
> I'm not sure I agree. "subok" is very unpythonic; the average numpy
> library function should work fine for a well-behaved subclass (i.e. most
> things out there except np.matrix).
>

Masked arrays also tend to break code that's not expecting them (e.g. on a
masked array, arr.sum()/arr.size will silently compute some meaningless
nonsense instead of the mean, and there are lots of formulas out there that
have some similarities with 'mean'). And people do all kinds of weird
things in third-party array subclasses.

Obviously we can't remove asanyarray or break existing code that assumes
particular numpy functions use asanyarray, but fundamentally asanyarray is
just not an API that makes sense or can be supported in a general way, and
our overall goal is to get people to gradually transition away from using
ndarray subclasses in general. That's why we're doing all this work to make
duck arrays work. So extending asanyarray support doesn't seem like a good
priority to spend our limited resources on, to me.

-n

-- 
Nathaniel J. Smith -- https://vorpus.org <http://vorpus.org>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181019/4cda50fd/attachment.html>

From shoyer at gmail.com  Fri Oct 19 18:46:52 2018
From: shoyer at gmail.com (Stephan Hoyer)
Date: Fri, 19 Oct 2018 15:46:52 -0700
Subject: [Numpy-discussion] asanyarray vs. asarray
In-Reply-To: <CABL7CQgLqiTAO4H+3MUZzDbF3eR_njO4kH9UD1C_t27gnkGYYw@mail.gmail.com>
References: <a7e76515-95ab-e867-dc96-92c54d8b0952@gmail.com>
 <24100c7f-20fd-4eed-99b0-d37660f52223@Canary>
 <CAJNV+9vH2o_2KCiMVc6S1ww=-NXpfGK39UoJVKKJNmv7pN602A@mail.gmail.com>
 <CAEQ_TvezgD7_KNB5PD_zjKQ=j+dN-k7BkEzNVVM9+VNAw8b85w@mail.gmail.com>
 <cb1364db-9e46-4614-a1df-d5dc33ad2c64@Canary>
 <CABL7CQgLqiTAO4H+3MUZzDbF3eR_njO4kH9UD1C_t27gnkGYYw@mail.gmail.com>
Message-ID: <CAEQ_TvcgsPQfax3+vUbwHr33Hpffexed2y1bDn7OLgbT4rB_NQ@mail.gmail.com>

>
> I think we should ?soft support? i.e. allow but consider unsupported, the
>> case where one of NumPy?s functions is implemented in terms of others and
>> ?passing through? an array results in the correct behaviour for that array.
>>
>
> I don't think we have or want such a concept as "soft support". We intend
> to not break anything that now has asanyarray, i.e. it's supported and
> ideally we have regression tests for all such functions. For anything we
> transition over from asarray to asanyarray, PRs should come with new tests.
>

The problem with asanyarray() is that there isn't any well defined subclass
API for NumPy, beyond "mostly works like a NumPy array." If every NumPy
subclass strictly obeyed the Liskov Substitution Principle asanyarray()
would be fine, but in practice every subclass I've encountered deviates
from  the behavior of numpy.ndarray in some way.

The means the NumPy codebase has ended up littered with hacks/workarounds
to support various specific subclasses, and new subclasses still don't work
reliably. This makes it challenging to change existing code. For an example
of how bad this is gotten, look at all the work-arounds I had to add to
support np.testing.assert_array_equal() on ndarray subclasses in this
recent PR:
https://github.com/numpy/numpy/pull/12119

My hope is that __array_function__ will finally let us put a stop to this
by offering a better alternative to subclassing.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181019/849d0d74/attachment-0001.html>

From ralf.gommers at gmail.com  Fri Oct 19 19:01:40 2018
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Fri, 19 Oct 2018 23:01:40 +0000
Subject: [Numpy-discussion] asanyarray vs. asarray
In-Reply-To: <CABL7CQgLqiTAO4H+3MUZzDbF3eR_njO4kH9UD1C_t27gnkGYYw@mail.gmail.com>
References: <a7e76515-95ab-e867-dc96-92c54d8b0952@gmail.com>
 <24100c7f-20fd-4eed-99b0-d37660f52223@Canary>
 <CAJNV+9vH2o_2KCiMVc6S1ww=-NXpfGK39UoJVKKJNmv7pN602A@mail.gmail.com>
 <CAEQ_TvezgD7_KNB5PD_zjKQ=j+dN-k7BkEzNVVM9+VNAw8b85w@mail.gmail.com>
 <cb1364db-9e46-4614-a1df-d5dc33ad2c64@Canary>
 <CABL7CQgLqiTAO4H+3MUZzDbF3eR_njO4kH9UD1C_t27gnkGYYw@mail.gmail.com>
Message-ID: <CABL7CQgXNt6ZGrDF8OJz73PDGJmbvaKs6q-3ArFkeLF86bpSEg@mail.gmail.com>

On Fri, Oct 19, 2018 at 10:28 PM Ralf Gommers <ralf.gommers at gmail.com>
wrote:

>
>
> On Fri, Oct 19, 2018 at 4:15 PM Hameer Abbasi <einstein.edison at gmail.com>
> wrote:
>
>> Hi!
>>
>> On Friday, Oct 19, 2018 at 6:09 PM, Stephan Hoyer <shoyer at gmail.com>
>> wrote:
>> I don't think it makes much sense to change NumPy's existing usage of
>> asarray() to asanyarray() unless we add subok=True arguments (which default
>> to False). But this ends up cluttering NumPy's public API, which is also
>> undesirable.
>>
>> Agreed so far.
>>
>
> I'm not sure I agree. "subok" is very unpythonic; the average numpy
> library function should work fine for a well-behaved subclass (i.e. most
> things out there except np.matrix).
>
>>
>> The preferred way to override NumPy functions going forward should be
>> __array_function__.
>>
>>
>> I think we should ?soft support? i.e. allow but consider unsupported, the
>> case where one of NumPy?s functions is implemented in terms of others and
>> ?passing through? an array results in the correct behaviour for that array.
>>
>
> I don't think we have or want such a concept as "soft support". We intend
> to not break anything that now has asanyarray, i.e. it's supported and
> ideally we have regression tests for all such functions. For anything we
> transition over from asarray to asanyarray, PRs should come with new tests.
>
>
>>
>> On Fri, Oct 19, 2018 at 8:13 AM Marten van Kerkwijk <
>> m.h.vankerkwijk at gmail.com> wrote:
>>
>>> There are exceptions for `matrix` in quite a few places, and there now
>>> is warning for `maxtrix` - it might not be bad to use `asanyarray` and add
>>> an exception for `maxtrix`. Indeed, I quite like the suggestion by Eric
>>> Wieser to just add the exception to `asanyarray` itself - that way when
>>> matrix is truly deprecated, it will be a very easy change.
>>>
>> I don't quite understand this. Adding exceptions is not deprecation - we
> then may as well just rip np.matrix out straight away.
>
> What I suggested in the call about this issue is that it's not very
> effective to treat functions like percentile/quantile one by one without an
> overarching strategy. A way forward could be for someone to write an
> overview of which sets of functions now have asanyarray (and actually work
> with subclasses), which ones we can and want to change now, and which ones
> we can and want to change after np.matrix is gone. Also, some guidelines
> for new functions that we add to numpy would be handy. I suspect we've been
> adding new functions that use asarray rather than asanyarray, which is
> probably undesired.
>

Thanks Nathaniel and Stephan. Your comments on my other two points are both
clear and correct (and have been made a number of times before). I think
the "write an overview so we can stop making ad-hoc decisions and having
these discussions" is the most important point I was trying to make though.
If we had such a doc and it concluded "hence we don't change anything,
__array_function__ is the only way to go" then we can just close PRs like
https://github.com/numpy/numpy/pull/11162 straight away.

Cheers,
Ralf
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181019/55d77c56/attachment.html>

From m.h.vankerkwijk at gmail.com  Fri Oct 19 21:23:21 2018
From: m.h.vankerkwijk at gmail.com (Marten van Kerkwijk)
Date: Fri, 19 Oct 2018 21:23:21 -0400
Subject: [Numpy-discussion] asanyarray vs. asarray
In-Reply-To: <CABL7CQgXNt6ZGrDF8OJz73PDGJmbvaKs6q-3ArFkeLF86bpSEg@mail.gmail.com>
References: <a7e76515-95ab-e867-dc96-92c54d8b0952@gmail.com>
 <24100c7f-20fd-4eed-99b0-d37660f52223@Canary>
 <CAJNV+9vH2o_2KCiMVc6S1ww=-NXpfGK39UoJVKKJNmv7pN602A@mail.gmail.com>
 <CAEQ_TvezgD7_KNB5PD_zjKQ=j+dN-k7BkEzNVVM9+VNAw8b85w@mail.gmail.com>
 <cb1364db-9e46-4614-a1df-d5dc33ad2c64@Canary>
 <CABL7CQgLqiTAO4H+3MUZzDbF3eR_njO4kH9UD1C_t27gnkGYYw@mail.gmail.com>
 <CABL7CQgXNt6ZGrDF8OJz73PDGJmbvaKs6q-3ArFkeLF86bpSEg@mail.gmail.com>
Message-ID: <CAJNV+9uXs9at6RYyymQcoGpP4cvr+waVitHWXnsy4MciE+6RnA@mail.gmail.com>

Hi All,

It seems there are two extreme possibilities for general functions:
1. Put `asarray` everywhere. The main benefit that I can see is that even
if people put in list instead of arrays, one is guaranteed to have shape,
dtype, etc. But it seems a bit like calling `int` on everything that might
get used as an index, instead of letting the actual indexing do the proper
thing and call `__index__`.
2. Do not coerce at all, but rather write code assuming something is an
array already. This will often, but not always, just work for array mimics,
with coercion done only where necessary (e.g., in lower-lying C code such
as that of the ufuncs which has a smaller API surface and can be overridden
more easily).

The current __array_function__ work may well provide us with a way to
combine both, if we (over time) move the coercion inside
`ndarray.__array_function__` so that the actual implementation *can* assume
it deals with pure ndarray - then, when relevant, calling that
implementation will be what subclasses/duck arrays can happily do (and it
is up to them to ensure this works).

Of course, the above does not really answer what to do in the meantime. But
perhaps it helps in thinking of what we are actually aiming for.

One last thing: could we please stop bashing subclasses? One can subclass
essentially everything in python, often to great advantage. Subclasses such
as MaskedArray and, yes, Quantity, are widely used, and if they cause
problems perhaps that should be seen as a sign that ndarray subclassing
should be made easier and clearer.

All the best,

Marten


On Fri, Oct 19, 2018 at 7:02 PM Ralf Gommers <ralf.gommers at gmail.com> wrote:

>
>
> On Fri, Oct 19, 2018 at 10:28 PM Ralf Gommers <ralf.gommers at gmail.com>
> wrote:
>
>>
>>
>> On Fri, Oct 19, 2018 at 4:15 PM Hameer Abbasi <einstein.edison at gmail.com>
>> wrote:
>>
>>> Hi!
>>>
>>> On Friday, Oct 19, 2018 at 6:09 PM, Stephan Hoyer <shoyer at gmail.com>
>>> wrote:
>>> I don't think it makes much sense to change NumPy's existing usage of
>>> asarray() to asanyarray() unless we add subok=True arguments (which default
>>> to False). But this ends up cluttering NumPy's public API, which is also
>>> undesirable.
>>>
>>> Agreed so far.
>>>
>>
>> I'm not sure I agree. "subok" is very unpythonic; the average numpy
>> library function should work fine for a well-behaved subclass (i.e. most
>> things out there except np.matrix).
>>
>>>
>>> The preferred way to override NumPy functions going forward should be
>>> __array_function__.
>>>
>>>
>>> I think we should ?soft support? i.e. allow but consider unsupported,
>>> the case where one of NumPy?s functions is implemented in terms of others
>>> and ?passing through? an array results in the correct behaviour for that
>>> array.
>>>
>>
>> I don't think we have or want such a concept as "soft support". We intend
>> to not break anything that now has asanyarray, i.e. it's supported and
>> ideally we have regression tests for all such functions. For anything we
>> transition over from asarray to asanyarray, PRs should come with new tests.
>>
>>
>>>
>>> On Fri, Oct 19, 2018 at 8:13 AM Marten van Kerkwijk <
>>> m.h.vankerkwijk at gmail.com> wrote:
>>>
>>>> There are exceptions for `matrix` in quite a few places, and there now
>>>> is warning for `maxtrix` - it might not be bad to use `asanyarray` and add
>>>> an exception for `maxtrix`. Indeed, I quite like the suggestion by Eric
>>>> Wieser to just add the exception to `asanyarray` itself - that way when
>>>> matrix is truly deprecated, it will be a very easy change.
>>>>
>>> I don't quite understand this. Adding exceptions is not deprecation - we
>> then may as well just rip np.matrix out straight away.
>>
>> What I suggested in the call about this issue is that it's not very
>> effective to treat functions like percentile/quantile one by one without an
>> overarching strategy. A way forward could be for someone to write an
>> overview of which sets of functions now have asanyarray (and actually work
>> with subclasses), which ones we can and want to change now, and which ones
>> we can and want to change after np.matrix is gone. Also, some guidelines
>> for new functions that we add to numpy would be handy. I suspect we've been
>> adding new functions that use asarray rather than asanyarray, which is
>> probably undesired.
>>
>
> Thanks Nathaniel and Stephan. Your comments on my other two points are
> both clear and correct (and have been made a number of times before). I
> think the "write an overview so we can stop making ad-hoc decisions and
> having these discussions" is the most important point I was trying to make
> though. If we had such a doc and it concluded "hence we don't change
> anything, __array_function__ is the only way to go" then we can just close
> PRs like https://github.com/numpy/numpy/pull/11162 straight away.
>
> Cheers,
> Ralf
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181019/4f509952/attachment-0001.html>

From wieser.eric+numpy at gmail.com  Fri Oct 19 21:49:44 2018
From: wieser.eric+numpy at gmail.com (Eric Wieser)
Date: Fri, 19 Oct 2018 18:49:44 -0700
Subject: [Numpy-discussion] asanyarray vs. asarray
In-Reply-To: <CAJNV+9uXs9at6RYyymQcoGpP4cvr+waVitHWXnsy4MciE+6RnA@mail.gmail.com>
References: <a7e76515-95ab-e867-dc96-92c54d8b0952@gmail.com>
 <24100c7f-20fd-4eed-99b0-d37660f52223@Canary>
 <CAJNV+9vH2o_2KCiMVc6S1ww=-NXpfGK39UoJVKKJNmv7pN602A@mail.gmail.com>
 <CAEQ_TvezgD7_KNB5PD_zjKQ=j+dN-k7BkEzNVVM9+VNAw8b85w@mail.gmail.com>
 <cb1364db-9e46-4614-a1df-d5dc33ad2c64@Canary>
 <CABL7CQgLqiTAO4H+3MUZzDbF3eR_njO4kH9UD1C_t27gnkGYYw@mail.gmail.com>
 <CABL7CQgXNt6ZGrDF8OJz73PDGJmbvaKs6q-3ArFkeLF86bpSEg@mail.gmail.com>
 <CAJNV+9uXs9at6RYyymQcoGpP4cvr+waVitHWXnsy4MciE+6RnA@mail.gmail.com>
Message-ID: <CAL1kJvDw=GRbdrQeysaehDQOPzV2oCMNdqdB-cpfqZsn77tKtw@mail.gmail.com>

Subclasses such as MaskedArray and, yes, Quantity, are widely used, and if
they cause problems perhaps that should be seen as a sign that ndarray
subclassing should be made easier and clearer.

Both maskedarray and quantity seem like something that would make more
sense at the dtype level if our dtype system was easier to extend. It might
be good to compile a list of subclassing applications, and split them into
?this ought to be a dtype? and ?this ought to be a different type of
container?.
?

On Fri, 19 Oct 2018 at 18:24 Marten van Kerkwijk <m.h.vankerkwijk at gmail.com>
wrote:

> Hi All,
>
> It seems there are two extreme possibilities for general functions:
> 1. Put `asarray` everywhere. The main benefit that I can see is that even
> if people put in list instead of arrays, one is guaranteed to have shape,
> dtype, etc. But it seems a bit like calling `int` on everything that might
> get used as an index, instead of letting the actual indexing do the proper
> thing and call `__index__`.
> 2. Do not coerce at all, but rather write code assuming something is an
> array already. This will often, but not always, just work for array mimics,
> with coercion done only where necessary (e.g., in lower-lying C code such
> as that of the ufuncs which has a smaller API surface and can be overridden
> more easily).
>
> The current __array_function__ work may well provide us with a way to
> combine both, if we (over time) move the coercion inside
> `ndarray.__array_function__` so that the actual implementation *can* assume
> it deals with pure ndarray - then, when relevant, calling that
> implementation will be what subclasses/duck arrays can happily do (and it
> is up to them to ensure this works).
>
> Of course, the above does not really answer what to do in the meantime.
> But perhaps it helps in thinking of what we are actually aiming for.
>
> One last thing: could we please stop bashing subclasses? One can subclass
> essentially everything in python, often to great advantage. Subclasses such
> as MaskedArray and, yes, Quantity, are widely used, and if they cause
> problems perhaps that should be seen as a sign that ndarray subclassing
> should be made easier and clearer.
>
> All the best,
>
> Marten
>
>
> On Fri, Oct 19, 2018 at 7:02 PM Ralf Gommers <ralf.gommers at gmail.com>
> wrote:
>
>>
>>
>> On Fri, Oct 19, 2018 at 10:28 PM Ralf Gommers <ralf.gommers at gmail.com>
>> wrote:
>>
>>>
>>>
>>> On Fri, Oct 19, 2018 at 4:15 PM Hameer Abbasi <einstein.edison at gmail.com>
>>> wrote:
>>>
>>>> Hi!
>>>>
>>>> On Friday, Oct 19, 2018 at 6:09 PM, Stephan Hoyer <shoyer at gmail.com>
>>>> wrote:
>>>> I don't think it makes much sense to change NumPy's existing usage of
>>>> asarray() to asanyarray() unless we add subok=True arguments (which default
>>>> to False). But this ends up cluttering NumPy's public API, which is also
>>>> undesirable.
>>>>
>>>> Agreed so far.
>>>>
>>>
>>> I'm not sure I agree. "subok" is very unpythonic; the average numpy
>>> library function should work fine for a well-behaved subclass (i.e. most
>>> things out there except np.matrix).
>>>
>>>>
>>>> The preferred way to override NumPy functions going forward should be
>>>> __array_function__.
>>>>
>>>>
>>>> I think we should ?soft support? i.e. allow but consider unsupported,
>>>> the case where one of NumPy?s functions is implemented in terms of others
>>>> and ?passing through? an array results in the correct behaviour for that
>>>> array.
>>>>
>>>
>>> I don't think we have or want such a concept as "soft support". We
>>> intend to not break anything that now has asanyarray, i.e. it's supported
>>> and ideally we have regression tests for all such functions. For anything
>>> we transition over from asarray to asanyarray, PRs should come with new
>>> tests.
>>>
>>>
>>>>
>>>> On Fri, Oct 19, 2018 at 8:13 AM Marten van Kerkwijk <
>>>> m.h.vankerkwijk at gmail.com> wrote:
>>>>
>>>>> There are exceptions for `matrix` in quite a few places, and there now
>>>>> is warning for `maxtrix` - it might not be bad to use `asanyarray` and add
>>>>> an exception for `maxtrix`. Indeed, I quite like the suggestion by Eric
>>>>> Wieser to just add the exception to `asanyarray` itself - that way when
>>>>> matrix is truly deprecated, it will be a very easy change.
>>>>>
>>>> I don't quite understand this. Adding exceptions is not deprecation -
>>> we then may as well just rip np.matrix out straight away.
>>>
>>> What I suggested in the call about this issue is that it's not very
>>> effective to treat functions like percentile/quantile one by one without an
>>> overarching strategy. A way forward could be for someone to write an
>>> overview of which sets of functions now have asanyarray (and actually work
>>> with subclasses), which ones we can and want to change now, and which ones
>>> we can and want to change after np.matrix is gone. Also, some guidelines
>>> for new functions that we add to numpy would be handy. I suspect we've been
>>> adding new functions that use asarray rather than asanyarray, which is
>>> probably undesired.
>>>
>>
>> Thanks Nathaniel and Stephan. Your comments on my other two points are
>> both clear and correct (and have been made a number of times before). I
>> think the "write an overview so we can stop making ad-hoc decisions and
>> having these discussions" is the most important point I was trying to make
>> though. If we had such a doc and it concluded "hence we don't change
>> anything, __array_function__ is the only way to go" then we can just close
>> PRs like https://github.com/numpy/numpy/pull/11162 straight away.
>>
>> Cheers,
>> Ralf
>>
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at python.org
>> https://mail.python.org/mailman/listinfo/numpy-discussion
>>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181019/9aa2428f/attachment.html>

From charlesr.harris at gmail.com  Fri Oct 19 22:00:02 2018
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Fri, 19 Oct 2018 20:00:02 -0600
Subject: [Numpy-discussion] asanyarray vs. asarray
In-Reply-To: <CAL1kJvDw=GRbdrQeysaehDQOPzV2oCMNdqdB-cpfqZsn77tKtw@mail.gmail.com>
References: <a7e76515-95ab-e867-dc96-92c54d8b0952@gmail.com>
 <24100c7f-20fd-4eed-99b0-d37660f52223@Canary>
 <CAJNV+9vH2o_2KCiMVc6S1ww=-NXpfGK39UoJVKKJNmv7pN602A@mail.gmail.com>
 <CAEQ_TvezgD7_KNB5PD_zjKQ=j+dN-k7BkEzNVVM9+VNAw8b85w@mail.gmail.com>
 <cb1364db-9e46-4614-a1df-d5dc33ad2c64@Canary>
 <CABL7CQgLqiTAO4H+3MUZzDbF3eR_njO4kH9UD1C_t27gnkGYYw@mail.gmail.com>
 <CABL7CQgXNt6ZGrDF8OJz73PDGJmbvaKs6q-3ArFkeLF86bpSEg@mail.gmail.com>
 <CAJNV+9uXs9at6RYyymQcoGpP4cvr+waVitHWXnsy4MciE+6RnA@mail.gmail.com>
 <CAL1kJvDw=GRbdrQeysaehDQOPzV2oCMNdqdB-cpfqZsn77tKtw@mail.gmail.com>
Message-ID: <CAB6mnxLxx5CKv=WCN0yGRK47h6V__iGi4t5mNkoHC3jr_ZfH+Q@mail.gmail.com>

On Fri, Oct 19, 2018 at 7:50 PM Eric Wieser <wieser.eric+numpy at gmail.com>
wrote:

> Subclasses such as MaskedArray and, yes, Quantity, are widely used, and if
> they cause problems perhaps that should be seen as a sign that ndarray
> subclassing should be made easier and clearer.
>
> Both maskedarray and quantity seem like something that would make more
> sense at the dtype level if our dtype system was easier to extend. It might
> be good to compile a list of subclassing applications, and split them into
> ?this ought to be a dtype? and ?this ought to be a different type of
> container?.
>

Wes Mckinney has been benchmarking masks vs sentinel values for arrow:
http://wesmckinney.com/blog/bitmaps-vs-sentinel-values/. The (bit) masks
are faster. I'm not convinced dtypes are the way to go.

Chuck


> On Fri, 19 Oct 2018 at 18:24 Marten van Kerkwijk <
> m.h.vankerkwijk at gmail.com> wrote:
>
>> Hi All,
>>
>> It seems there are two extreme possibilities for general functions:
>> 1. Put `asarray` everywhere. The main benefit that I can see is that even
>> if people put in list instead of arrays, one is guaranteed to have shape,
>> dtype, etc. But it seems a bit like calling `int` on everything that might
>> get used as an index, instead of letting the actual indexing do the proper
>> thing and call `__index__`.
>> 2. Do not coerce at all, but rather write code assuming something is an
>> array already. This will often, but not always, just work for array mimics,
>> with coercion done only where necessary (e.g., in lower-lying C code such
>> as that of the ufuncs which has a smaller API surface and can be overridden
>> more easily).
>>
>> The current __array_function__ work may well provide us with a way to
>> combine both, if we (over time) move the coercion inside
>> `ndarray.__array_function__` so that the actual implementation *can* assume
>> it deals with pure ndarray - then, when relevant, calling that
>> implementation will be what subclasses/duck arrays can happily do (and it
>> is up to them to ensure this works).
>>
>> Of course, the above does not really answer what to do in the meantime.
>> But perhaps it helps in thinking of what we are actually aiming for.
>>
>> One last thing: could we please stop bashing subclasses? One can subclass
>> essentially everything in python, often to great advantage. Subclasses such
>> as MaskedArray and, yes, Quantity, are widely used, and if they cause
>> problems perhaps that should be seen as a sign that ndarray subclassing
>> should be made easier and clearer.
>>
>> All the best,
>>
>> Marten
>>
>>
>> On Fri, Oct 19, 2018 at 7:02 PM Ralf Gommers <ralf.gommers at gmail.com>
>> wrote:
>>
>>>
>>>
>>> On Fri, Oct 19, 2018 at 10:28 PM Ralf Gommers <ralf.gommers at gmail.com>
>>> wrote:
>>>
>>>>
>>>>
>>>> On Fri, Oct 19, 2018 at 4:15 PM Hameer Abbasi <
>>>> einstein.edison at gmail.com> wrote:
>>>>
>>>>> Hi!
>>>>>
>>>>> On Friday, Oct 19, 2018 at 6:09 PM, Stephan Hoyer <shoyer at gmail.com>
>>>>> wrote:
>>>>> I don't think it makes much sense to change NumPy's existing usage of
>>>>> asarray() to asanyarray() unless we add subok=True arguments (which default
>>>>> to False). But this ends up cluttering NumPy's public API, which is also
>>>>> undesirable.
>>>>>
>>>>> Agreed so far.
>>>>>
>>>>
>>>> I'm not sure I agree. "subok" is very unpythonic; the average numpy
>>>> library function should work fine for a well-behaved subclass (i.e. most
>>>> things out there except np.matrix).
>>>>
>>>>>
>>>>> The preferred way to override NumPy functions going forward should be
>>>>> __array_function__.
>>>>>
>>>>>
>>>>> I think we should ?soft support? i.e. allow but consider unsupported,
>>>>> the case where one of NumPy?s functions is implemented in terms of others
>>>>> and ?passing through? an array results in the correct behaviour for that
>>>>> array.
>>>>>
>>>>
>>>> I don't think we have or want such a concept as "soft support". We
>>>> intend to not break anything that now has asanyarray, i.e. it's supported
>>>> and ideally we have regression tests for all such functions. For anything
>>>> we transition over from asarray to asanyarray, PRs should come with new
>>>> tests.
>>>>
>>>>
>>>>>
>>>>> On Fri, Oct 19, 2018 at 8:13 AM Marten van Kerkwijk <
>>>>> m.h.vankerkwijk at gmail.com> wrote:
>>>>>
>>>>>> There are exceptions for `matrix` in quite a few places, and there
>>>>>> now is warning for `maxtrix` - it might not be bad to use `asanyarray` and
>>>>>> add an exception for `maxtrix`. Indeed, I quite like the suggestion by Eric
>>>>>> Wieser to just add the exception to `asanyarray` itself - that way when
>>>>>> matrix is truly deprecated, it will be a very easy change.
>>>>>>
>>>>> I don't quite understand this. Adding exceptions is not deprecation -
>>>> we then may as well just rip np.matrix out straight away.
>>>>
>>>> What I suggested in the call about this issue is that it's not very
>>>> effective to treat functions like percentile/quantile one by one without an
>>>> overarching strategy. A way forward could be for someone to write an
>>>> overview of which sets of functions now have asanyarray (and actually work
>>>> with subclasses), which ones we can and want to change now, and which ones
>>>> we can and want to change after np.matrix is gone. Also, some guidelines
>>>> for new functions that we add to numpy would be handy. I suspect we've been
>>>> adding new functions that use asarray rather than asanyarray, which is
>>>> probably undesired.
>>>>
>>>
>>> Thanks Nathaniel and Stephan. Your comments on my other two points are
>>> both clear and correct (and have been made a number of times before). I
>>> think the "write an overview so we can stop making ad-hoc decisions and
>>> having these discussions" is the most important point I was trying to make
>>> though. If we had such a doc and it concluded "hence we don't change
>>> anything, __array_function__ is the only way to go" then we can just close
>>> PRs like https://github.com/numpy/numpy/pull/11162 straight away.
>>>
>>> Cheers,
>>> Ralf
>>>
>>> _______________________________________________
>>> NumPy-Discussion mailing list
>>> NumPy-Discussion at python.org
>>> https://mail.python.org/mailman/listinfo/numpy-discussion
>>>
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at python.org
>> https://mail.python.org/mailman/listinfo/numpy-discussion
>>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181019/6419024c/attachment-0001.html>

From njs at pobox.com  Fri Oct 19 22:08:43 2018
From: njs at pobox.com (Nathaniel Smith)
Date: Fri, 19 Oct 2018 19:08:43 -0700
Subject: [Numpy-discussion] asanyarray vs. asarray
In-Reply-To: <CAB6mnxLxx5CKv=WCN0yGRK47h6V__iGi4t5mNkoHC3jr_ZfH+Q@mail.gmail.com>
References: <a7e76515-95ab-e867-dc96-92c54d8b0952@gmail.com>
 <24100c7f-20fd-4eed-99b0-d37660f52223@Canary>
 <CAJNV+9vH2o_2KCiMVc6S1ww=-NXpfGK39UoJVKKJNmv7pN602A@mail.gmail.com>
 <CAEQ_TvezgD7_KNB5PD_zjKQ=j+dN-k7BkEzNVVM9+VNAw8b85w@mail.gmail.com>
 <cb1364db-9e46-4614-a1df-d5dc33ad2c64@Canary>
 <CABL7CQgLqiTAO4H+3MUZzDbF3eR_njO4kH9UD1C_t27gnkGYYw@mail.gmail.com>
 <CABL7CQgXNt6ZGrDF8OJz73PDGJmbvaKs6q-3ArFkeLF86bpSEg@mail.gmail.com>
 <CAJNV+9uXs9at6RYyymQcoGpP4cvr+waVitHWXnsy4MciE+6RnA@mail.gmail.com>
 <CAL1kJvDw=GRbdrQeysaehDQOPzV2oCMNdqdB-cpfqZsn77tKtw@mail.gmail.com>
 <CAB6mnxLxx5CKv=WCN0yGRK47h6V__iGi4t5mNkoHC3jr_ZfH+Q@mail.gmail.com>
Message-ID: <CAPJVwB=q9RCdzEqO-T+h5sm8AWww=xxL_2X3JNt8L1HhXNYYvg@mail.gmail.com>

On Fri, Oct 19, 2018 at 7:00 PM, Charles R Harris
<charlesr.harris at gmail.com> wrote:
>
> On Fri, Oct 19, 2018 at 7:50 PM Eric Wieser <wieser.eric+numpy at gmail.com>
> wrote:
>>
>> Subclasses such as MaskedArray and, yes, Quantity, are widely used, and if
>> they cause problems perhaps that should be seen as a sign that ndarray
>> subclassing should be made easier and clearer.
>>
>> Both maskedarray and quantity seem like something that would make more
>> sense at the dtype level if our dtype system was easier to extend. It might
>> be good to compile a list of subclassing applications, and split them into
>> ?this ought to be a dtype? and ?this ought to be a different type of
>> container?.
>
> Wes Mckinney has been benchmarking masks vs sentinel values for arrow:
> http://wesmckinney.com/blog/bitmaps-vs-sentinel-values/. The (bit) masks are
> faster. I'm not convinced dtypes are the way to go.

We need to add better support for both user-defined dtypes and for
user-defined containers in any case. So we're going to support both
missing value strategies regardless, and people will be able to choose
based on engineering trade-offs. A missing value dtype is going to
integrate much more easily into the rest of numpy than a new container
where you have to reimplement indexing etc., but maybe custom
containers can be faster. Okay, cool, they're both on PyPI, pick your
favorite!

Trying to wedge masks into *ndarray* seems like a non-starter, though,
because it would require auditing and updating basically all code
using the numpy C API.

-n

-- 
Nathaniel J. Smith -- https://vorpus.org

From njs at pobox.com  Fri Oct 19 22:50:01 2018
From: njs at pobox.com (Nathaniel Smith)
Date: Fri, 19 Oct 2018 19:50:01 -0700
Subject: [Numpy-discussion] asanyarray vs. asarray
In-Reply-To: <CAJNV+9uXs9at6RYyymQcoGpP4cvr+waVitHWXnsy4MciE+6RnA@mail.gmail.com>
References: <a7e76515-95ab-e867-dc96-92c54d8b0952@gmail.com>
 <24100c7f-20fd-4eed-99b0-d37660f52223@Canary>
 <CAJNV+9vH2o_2KCiMVc6S1ww=-NXpfGK39UoJVKKJNmv7pN602A@mail.gmail.com>
 <CAEQ_TvezgD7_KNB5PD_zjKQ=j+dN-k7BkEzNVVM9+VNAw8b85w@mail.gmail.com>
 <cb1364db-9e46-4614-a1df-d5dc33ad2c64@Canary>
 <CABL7CQgLqiTAO4H+3MUZzDbF3eR_njO4kH9UD1C_t27gnkGYYw@mail.gmail.com>
 <CABL7CQgXNt6ZGrDF8OJz73PDGJmbvaKs6q-3ArFkeLF86bpSEg@mail.gmail.com>
 <CAJNV+9uXs9at6RYyymQcoGpP4cvr+waVitHWXnsy4MciE+6RnA@mail.gmail.com>
Message-ID: <CAPJVwBkP9-FCcnK==eB3_GWNfJhwKwu6iXj6_UP0htWrRt_3gg@mail.gmail.com>

On Fri, Oct 19, 2018 at 6:23 PM, Marten van Kerkwijk
<m.h.vankerkwijk at gmail.com> wrote:
> Hi All,
>
> It seems there are two extreme possibilities for general functions:
> 1. Put `asarray` everywhere. The main benefit that I can see is that even if
> people put in list instead of arrays, one is guaranteed to have shape,
> dtype, etc. But it seems a bit like calling `int` on everything that might
> get used as an index, instead of letting the actual indexing do the proper
> thing and call `__index__`.
> 2. Do not coerce at all, but rather write code assuming something is an
> array already. This will often, but not always, just work for array mimics,
> with coercion done only where necessary (e.g., in lower-lying C code such as
> that of the ufuncs which has a smaller API surface and can be overridden
> more easily).

Between these two options, Numpy's APIs are very firmly on the side of
"option 1", and this is common in most public APIs I'm familiar with
(e.g. scipy). I guess you could try to reopen the discussion, but
you'd be pushing against 15+ years of precedent there...

> The current __array_function__ work may well provide us with a way to
> combine both, if we (over time) move the coercion inside
> `ndarray.__array_function__` so that the actual implementation *can* assume
> it deals with pure ndarray - then, when relevant, calling that
> implementation will be what subclasses/duck arrays can happily do (and it is
> up to them to ensure this works).
>
> Of course, the above does not really answer what to do in the meantime. But
> perhaps it helps in thinking of what we are actually aiming for.

We need some kind of asduckarray(), that coerces lists and similar but
allows duck-arrays to pass through.

> One last thing: could we please stop bashing subclasses? One can subclass
> essentially everything in python, often to great advantage. Subclasses such
> as MaskedArray and, yes, Quantity, are widely used, and if they cause
> problems perhaps that should be seen as a sign that ndarray subclassing
> should be made easier and clearer.

Who's bashing? I've spent years thinking about this and come to the
conclusion that there are no viable solutions to the problems with
subclassing ndarray, but that's not the same as bashing :-). If you've
thought of something we've missed, you should share it...

(I also know lots of senior Python devs who believe that using
Python's subclassing support is pretty much always a mistake ? this
talk is popularly cited: https://www.youtube.com/watch?v=3MNVP9-hglc ?
but the issues with ndarray are much more severe than for the average
Python class.)

-n

-- 
Nathaniel J. Smith -- https://vorpus.org

From charlesr.harris at gmail.com  Sat Oct 20 13:08:41 2018
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Sat, 20 Oct 2018 11:08:41 -0600
Subject: [Numpy-discussion] Removing priority labels from github
In-Reply-To: <CABL7CQh2MS_GKdXF22Odnzk+5vePb_NC_aoyhCOBBubNkTcSOA@mail.gmail.com>
References: <b10b4607-65e0-ed25-b89d-a07fd2707898@gmail.com>
 <22c3981e90e72bf4df9d02c02ce2277d1c7d6e41.camel@sipsolutions.net>
 <CAEQ_Tvc8nJsyp3gdDWXnH5PVLrR18=MY17umvTNoWzTpu4iVGA@mail.gmail.com>
 <CABL7CQh2MS_GKdXF22Odnzk+5vePb_NC_aoyhCOBBubNkTcSOA@mail.gmail.com>
Message-ID: <CAB6mnxK422QEi=Qa8uJaC692=K2gRfjL_JwCwN6Nue0AhkTZOw@mail.gmail.com>

On Fri, Oct 19, 2018 at 2:10 PM Ralf Gommers <ralf.gommers at gmail.com> wrote:

>
>
> On Fri, Oct 19, 2018 at 4:49 PM Stephan Hoyer <shoyer at gmail.com> wrote:
>
>>
>>
>> On Fri, Oct 19, 2018 at 2:22 AM Sebastian Berg <
>> sebastian at sipsolutions.net> wrote:
>>
>>> On Fri, 2018-10-19 at 11:02 +0300, Matti Picus wrote:
>>> > We currently have highest, high, normal, low, and lowest priority
>>> > labels
>>> > for github issues/PRs. At the recent status meeting, we proposed
>>> > consolidating these to a single "high" priority label. Anything
>>> > "low"
>>> > priority should be merged or closed since it will be quickly
>>> > forgotten,
>>> > and no "normal" tag is needed.
>>> >
>>> >
>>> > With that, we (the BIDS team) would like to encourage reviewers to
>>> > use
>>> > the "high" priority tag to indicate things we should be working on.
>>> >
>>> > Any objections or thoughts?
>>> >
>>>
>>> Sounds like a plan, especially having practically meaningless tags
>>> right now is no help. Most of them are historical and personally I have
>>> only been using the milestones to tag things as high priority (very
>>> occasionally).
>>>
>>> - Sebastian
>>>
>>
>> +1 from me as well. I haven't been using these tags at all.
>>
>
> +1
>
>
+1 I may have used one of the priority labels once or twice, I don't really
remember. When I think something needs to be fixed or merged I generally
add a benchmark.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181020/8d5c9796/attachment.html>

From charlesr.harris at gmail.com  Sat Oct 20 13:09:25 2018
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Sat, 20 Oct 2018 11:09:25 -0600
Subject: [Numpy-discussion] Removing priority labels from github
In-Reply-To: <CAB6mnxK422QEi=Qa8uJaC692=K2gRfjL_JwCwN6Nue0AhkTZOw@mail.gmail.com>
References: <b10b4607-65e0-ed25-b89d-a07fd2707898@gmail.com>
 <22c3981e90e72bf4df9d02c02ce2277d1c7d6e41.camel@sipsolutions.net>
 <CAEQ_Tvc8nJsyp3gdDWXnH5PVLrR18=MY17umvTNoWzTpu4iVGA@mail.gmail.com>
 <CABL7CQh2MS_GKdXF22Odnzk+5vePb_NC_aoyhCOBBubNkTcSOA@mail.gmail.com>
 <CAB6mnxK422QEi=Qa8uJaC692=K2gRfjL_JwCwN6Nue0AhkTZOw@mail.gmail.com>
Message-ID: <CAB6mnxLrJThJB2sS3aQqaHSO+JryWp_qq9Y0rW9F7wmeKV3KtA@mail.gmail.com>

On Sat, Oct 20, 2018 at 11:08 AM Charles R Harris <charlesr.harris at gmail.com>
wrote:

>
>
> On Fri, Oct 19, 2018 at 2:10 PM Ralf Gommers <ralf.gommers at gmail.com>
> wrote:
>
>>
>>
>> On Fri, Oct 19, 2018 at 4:49 PM Stephan Hoyer <shoyer at gmail.com> wrote:
>>
>>>
>>>
>>> On Fri, Oct 19, 2018 at 2:22 AM Sebastian Berg <
>>> sebastian at sipsolutions.net> wrote:
>>>
>>>> On Fri, 2018-10-19 at 11:02 +0300, Matti Picus wrote:
>>>> > We currently have highest, high, normal, low, and lowest priority
>>>> > labels
>>>> > for github issues/PRs. At the recent status meeting, we proposed
>>>> > consolidating these to a single "high" priority label. Anything
>>>> > "low"
>>>> > priority should be merged or closed since it will be quickly
>>>> > forgotten,
>>>> > and no "normal" tag is needed.
>>>> >
>>>> >
>>>> > With that, we (the BIDS team) would like to encourage reviewers to
>>>> > use
>>>> > the "high" priority tag to indicate things we should be working on.
>>>> >
>>>> > Any objections or thoughts?
>>>> >
>>>>
>>>> Sounds like a plan, especially having practically meaningless tags
>>>> right now is no help. Most of them are historical and personally I have
>>>> only been using the milestones to tag things as high priority (very
>>>> occasionally).
>>>>
>>>> - Sebastian
>>>>
>>>
>>> +1 from me as well. I haven't been using these tags at all.
>>>
>>
>> +1
>>
>>
> +1 I may have used one of the priority labels once or twice, I don't
> really remember. When I think something needs to be fixed or merged I
> generally add a benchmark.
>
>
benchmark <- milestone.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181020/7c88e8bd/attachment.html>

From matti.picus at gmail.com  Mon Oct 22 02:56:37 2018
From: matti.picus at gmail.com (Matti Picus)
Date: Mon, 22 Oct 2018 09:56:37 +0300
Subject: [Numpy-discussion] Reminder: weekly status meeting
Message-ID: <e904c0a4-c086-74af-c371-53d6cd5cbb1c@gmail.com>

Hi everyone,

The team at BIDS meets once a week to discuss progress, priorities, and
roadblocks.  While our priorities are broadly determined by the project
roadmap [0], we would like to provide an opportunity for the community
to give more regular and detailed feedback on our work.

We therefore invite you to join us for our weekly calls,
each **Wednesday from 12:00 to 13:00 Pacific Time**.

Detail of the next meeting (2018-10-24) is given in the agenda [1],
which is a living document. Feel free to add topics you wish to discuss.

We hope to see you there!
Best regards,
St?fan, Tyler, Matti

[0]https://www.numpy.org/neps/index.html
[1]https://hackmd.io/5WZ6VwQKSbSR_4Ng65pUFw?both


From charlesr.harris at gmail.com  Mon Oct 22 14:06:53 2018
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Mon, 22 Oct 2018 12:06:53 -0600
Subject: [Numpy-discussion] NumPy 1.15.3 release
Message-ID: <CAB6mnx+gRa=bw_f2Gue-Z+63te5x9nwk-j6dXjM7sES1LFoL8A@mail.gmail.com>

Hi All,

On behalf of the NumPy team, I am pleased to announce the release of NumPy
1.15.3. This is a bugfix release for bugs and regressions reported
following the 1.15.2 release.  The most noticeable fix is probably for
the memory
leak <https://github.com/numpy/numpy/issues/12037> encountered when slicing
classes derived from Numpy.

The Python versions supported by this release are 2.7, 3.4-3.7. Wheels for
this release can be downloaded from PyPI
<https://pypi.org/project/numpy/1.15.3/>, source archives are available
from Github <https://github.com/numpy/numpy/releases/tag/v1.15.3>.

Compatibility Note
==================

The NumPy 1.15.x OS X wheels released on PyPI no longer contain 32-bit
binaries.  That will also be the case in future releases. See
`#11625 <https://github.com/numpy/numpy/issues/11625>`__ for the related
discussion.  Those needing 32-bit support should look elsewhere or build
from source.

Contributors
============

A total of 7 people contributed to this release.  People with a "+" by their
names contributed a patch for the first time.

* Allan Haldane
* Charles Harris
* Jeroen Demeyer
* Kevin Sheppard
* Matthew Bowden +
* Matti Picus
* Tyler Reddy

Pull requests merged
====================

A total of 12 pull requests were merged for this release.

* `#12080 <https://github.com/numpy/numpy/pull/12080>`__: MAINT: Blacklist
some MSVC complex functions.
* `#12083 <https://github.com/numpy/numpy/pull/12083>`__: TST: Add azure CI
testing to 1.15.x branch.
* `#12084 <https://github.com/numpy/numpy/pull/12084>`__: BUG: test_path()
now uses Path.resolve()
* `#12085 <https://github.com/numpy/numpy/pull/12085>`__: TST, MAINT: Fix
some failing tests on azure-pipelines mac and...
* `#12187 <https://github.com/numpy/numpy/pull/12187>`__: BUG: Fix memory
leak in mapping.c
* `#12188 <https://github.com/numpy/numpy/pull/12188>`__: BUG: Allow
boolean subtract in histogram
* `#12189 <https://github.com/numpy/numpy/pull/12189>`__: BUG: Fix in-place
permutation
* `#12190 <https://github.com/numpy/numpy/pull/12190>`__: BUG: limit
default for get_num_build_jobs() to 8
* `#12191 <https://github.com/numpy/numpy/pull/12191>`__: BUG: OBJECT_to_*
should check for errors
* `#12192 <https://github.com/numpy/numpy/pull/12192>`__: DOC: Prepare for
NumPy 1.15.3 release.
* `#12237 <https://github.com/numpy/numpy/pull/12237>`__: BUG: Fix
MaskedArray fill_value type conversion.
* `#12238 <https://github.com/numpy/numpy/pull/12238>`__: TST: Backport
azure-pipeline testing fixes for Mac

Cheers,

Charles Harris
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181022/bad670ae/attachment.html>

From stefanv at berkeley.edu  Wed Oct 24 18:07:50 2018
From: stefanv at berkeley.edu (Stefan van der Walt)
Date: Wed, 24 Oct 2018 15:07:50 -0700
Subject: [Numpy-discussion] Reminder: weekly status meeting
In-Reply-To: <e904c0a4-c086-74af-c371-53d6cd5cbb1c@gmail.com>
References: <e904c0a4-c086-74af-c371-53d6cd5cbb1c@gmail.com>
Message-ID: <20181024220750.pytz7dav4dabeplx@carbo>

Hi all,

On Mon, 22 Oct 2018 09:56:37 +0300, Matti Picus wrote:
> We therefore invite you to join us for our weekly calls,
> each **Wednesday from 12:00 to 13:00 Pacific Time**.
> 
> Detail of the next meeting (2018-10-24) is given in the agenda

This week's meeting notes are at:

https://github.com/BIDS-numpy/docs/blob/master/status_meetings/status-2018-10-24.md

St?fan

From einstein.edison at gmail.com  Thu Oct 25 06:40:16 2018
From: einstein.edison at gmail.com (Hameer Abbasi)
Date: Thu, 25 Oct 2018 12:40:16 +0200
Subject: [Numpy-discussion] Reminder: weekly status meeting
In-Reply-To: <20181024220750.pytz7dav4dabeplx@carbo>
References: <e904c0a4-c086-74af-c371-53d6cd5cbb1c@gmail.com>
 <20181024220750.pytz7dav4dabeplx@carbo>
Message-ID: <1f0af61b-62f6-4f98-a5e7-6241855a7006@Canary>

Hi!

Sorry to miss this week?s meeting.

If I may point out an inaccuracy in the notes: in PyData/Sparse most things are implemented from the ground up without relying on scipy.sparse. The only part that does rely on it is `sparse.matmul`, `sparse.dot` and `sparse.tensordot`, as well as a few conversions to/from SciPy, if these could depend on Cython wrappers instead that?d be nice.

I should probably update the docs on that. If anyone is willing to discuss pydata/sparse with me, I?ll be available for a meeting anytime.

Best Regards,
Hameer Abbasi

> On Thursday, Oct 25, 2018 at 12:08 AM, Stefan van der Walt <stefanv at berkeley.edu (mailto:stefanv at berkeley.edu)> wrote:
> Hi all,
>
> On Mon, 22 Oct 2018 09:56:37 +0300, Matti Picus wrote:
> > We therefore invite you to join us for our weekly calls,
> > each **Wednesday from 12:00 to 13:00 Pacific Time**.
> >
> > Detail of the next meeting (2018-10-24) is given in the agenda
>
> This week's meeting notes are at:
>
> https://github.com/BIDS-numpy/docs/blob/master/status_meetings/status-2018-10-24.md
>
> St?fan
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181025/11640340/attachment.html>

From alex.rogozhnikov at yandex.ru  Thu Oct 25 16:16:32 2018
From: alex.rogozhnikov at yandex.ru (Alex Rogozhnikov)
Date: Thu, 25 Oct 2018 23:16:32 +0300
Subject: [Numpy-discussion] Depreciating asfortranarray and ascontiguousarray
Message-ID: <8741181540498592@myt3-c573aa6fc782.qloud-c.yandex.net>

An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181025/c6824eea/attachment.html>

From joferkington at gmail.com  Thu Oct 25 17:00:26 2018
From: joferkington at gmail.com (Joe Kington)
Date: Thu, 25 Oct 2018 16:00:26 -0500
Subject: [Numpy-discussion] Depreciating asfortranarray and
 ascontiguousarray
In-Reply-To: <8741181540498592@myt3-c573aa6fc782.qloud-c.yandex.net>
References: <8741181540498592@myt3-c573aa6fc782.qloud-c.yandex.net>
Message-ID: <CACe1pJe0c7yPbmYPH9uPNNPBcN8yyBgu_U51WvWk3jfwak8mPQ@mail.gmail.com>

For what it's worth, these are fairly widely used functions.  From a user
standpoint, I'd gently argue against deprecating them. Documenting the
inconsistency with scalars  seems like a less invasive approach.

In particular ascontiguousarray is a very common check to make when working
with C libraries or low-level file formats.  A significant advantage over
asarray(..., order='C') is readability.  It makes the intention very
clear.  Similarly, asfortranarray is quite readable for folks that aren't
deeply familiar with numpy.

Given that the use-cases they're primarily used for are likely to be read
by developers working in other languages (i.e. ascontiguousarray gets used
at a lot of "boundaries" with other systems), keeping function names that
make intention very clear is important.

Just my $0.02, anyway.  Cheers,
-Joe

On Thu, Oct 25, 2018 at 3:17 PM Alex Rogozhnikov <alex.rogozhnikov at yandex.ru>
wrote:

> Dear numpy community,
>
> I'm planning to depreciate np.asfortranarray and np.ascontiguousarray
> functions due to their misbehavior on scalar (0-D tensors) with PR #12244.
>
> Current behavior (converting scalars to 1-d array with single element)
> - is unexpected and contradicts to documentation
> - probably, can't be changed without breaking external code
> - I believe, this was a cause for poor support of 0-d arrays in mxnet.
> - both functions are easily replaced with asarray(..., order='...'), which
> has expected behavior
>
> There is no timeline for removal - we just need to discourage from using
> this functions in new code.
>
> Function naming may be related to how numpy treats 0-d tensors specially,
> and those probably should not be called arrays.
> https://www.numpy.org/neps/nep-0027-zero-rank-arrarys.html
> However, as a user I never thought about 0-d arrays being special and
> being "not arrays".
>
>
> Please see original discussion at github for more details
> https://github.com/numpy/numpy/issues/5300
>
> Your comments welcome,
> Alex Rogozhnikov
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181025/aa8db7b6/attachment.html>

From jfoxrabinovitz at gmail.com  Thu Oct 25 17:47:37 2018
From: jfoxrabinovitz at gmail.com (Joseph Fox-Rabinovitz)
Date: Thu, 25 Oct 2018 17:47:37 -0400
Subject: [Numpy-discussion] Depreciating asfortranarray and
 ascontiguousarray
In-Reply-To: <CACe1pJe0c7yPbmYPH9uPNNPBcN8yyBgu_U51WvWk3jfwak8mPQ@mail.gmail.com>
References: <8741181540498592@myt3-c573aa6fc782.qloud-c.yandex.net>
 <CACe1pJe0c7yPbmYPH9uPNNPBcN8yyBgu_U51WvWk3jfwak8mPQ@mail.gmail.com>
Message-ID: <CAAa1KPaPZaU41djXwY8bxeaird2qVfz46KrSpFoyHmD9UeztzQ@mail.gmail.com>

In that vein, would it be advisable to re-implement them as aliases for the
correctly behaving functions instead?

- Joe

On Thu, Oct 25, 2018 at 5:01 PM Joe Kington <joferkington at gmail.com> wrote:

> For what it's worth, these are fairly widely used functions.  From a user
> standpoint, I'd gently argue against deprecating them. Documenting the
> inconsistency with scalars  seems like a less invasive approach.
>
> In particular ascontiguousarray is a very common check to make when
> working with C libraries or low-level file formats.  A significant
> advantage over asarray(..., order='C') is readability.  It makes the
> intention very clear.  Similarly, asfortranarray is quite readable for
> folks that aren't deeply familiar with numpy.
>
> Given that the use-cases they're primarily used for are likely to be read
> by developers working in other languages (i.e. ascontiguousarray gets used
> at a lot of "boundaries" with other systems), keeping function names that
> make intention very clear is important.
>
> Just my $0.02, anyway.  Cheers,
> -Joe
>
> On Thu, Oct 25, 2018 at 3:17 PM Alex Rogozhnikov <
> alex.rogozhnikov at yandex.ru> wrote:
>
>> Dear numpy community,
>>
>> I'm planning to depreciate np.asfortranarray and np.ascontiguousarray
>> functions due to their misbehavior on scalar (0-D tensors) with PR #12244
>> .
>>
>> Current behavior (converting scalars to 1-d array with single element)
>> - is unexpected and contradicts to documentation
>> - probably, can't be changed without breaking external code
>> - I believe, this was a cause for poor support of 0-d arrays in mxnet.
>> - both functions are easily replaced with asarray(..., order='...'),
>> which has expected behavior
>>
>> There is no timeline for removal - we just need to discourage from using
>> this functions in new code.
>>
>> Function naming may be related to how numpy treats 0-d tensors specially,
>>
>> and those probably should not be called arrays.
>> https://www.numpy.org/neps/nep-0027-zero-rank-arrarys.html
>> However, as a user I never thought about 0-d arrays being special and
>> being "not arrays".
>>
>>
>> Please see original discussion at github for more details
>> https://github.com/numpy/numpy/issues/5300
>>
>> Your comments welcome,
>> Alex Rogozhnikov
>>
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at python.org
>> https://mail.python.org/mailman/listinfo/numpy-discussion
>>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181025/b11af800/attachment-0001.html>

From deak.andris at gmail.com  Thu Oct 25 18:10:05 2018
From: deak.andris at gmail.com (Andras Deak)
Date: Fri, 26 Oct 2018 00:10:05 +0200
Subject: [Numpy-discussion] Depreciating asfortranarray and
 ascontiguousarray
In-Reply-To: <CAAa1KPaPZaU41djXwY8bxeaird2qVfz46KrSpFoyHmD9UeztzQ@mail.gmail.com>
References: <8741181540498592@myt3-c573aa6fc782.qloud-c.yandex.net>
 <CACe1pJe0c7yPbmYPH9uPNNPBcN8yyBgu_U51WvWk3jfwak8mPQ@mail.gmail.com>
 <CAAa1KPaPZaU41djXwY8bxeaird2qVfz46KrSpFoyHmD9UeztzQ@mail.gmail.com>
Message-ID: <CAMEWA4P1pqHwcdqt72o6BoX8k2yw=1P=TOiW4f6vHgfzKP6vOQ@mail.gmail.com>

On Thu, Oct 25, 2018 at 11:48 PM Joseph Fox-Rabinovitz
<jfoxrabinovitz at gmail.com> wrote:
>
> In that vein, would it be advisable to re-implement them as aliases for the correctly behaving functions instead?
>
> - Joe

Wouldn't "probably, can't be changed without breaking external code"
still apply? As I understand the suggestion for _deprecation_ is only
because there's (a lot of) code relying on the current behaviour (or
at least there's risk).

Andr?s

From tyler.je.reddy at gmail.com  Thu Oct 25 19:18:47 2018
From: tyler.je.reddy at gmail.com (Tyler Reddy)
Date: Thu, 25 Oct 2018 16:18:47 -0700
Subject: [Numpy-discussion] Reminder: weekly status meeting
In-Reply-To: <1f0af61b-62f6-4f98-a5e7-6241855a7006@Canary>
References: <e904c0a4-c086-74af-c371-53d6cd5cbb1c@gmail.com>
 <20181024220750.pytz7dav4dabeplx@carbo>
 <1f0af61b-62f6-4f98-a5e7-6241855a7006@Canary>
Message-ID: <CAHPuU_brG1K2ARwV=fE4e8pLmQsN52kb02ii6bcQSiBL0vaz1g@mail.gmail.com>

What exactly would you like Cython wrappers for? Some of the C++ code
in scipy/sparse/sparsetools?

I see you have COO.from_scipy_sparse(x) in some pydata/sparse code paths,
which presumably you'd like to avoid or improve?

On Thu, 25 Oct 2018 at 03:41, Hameer Abbasi <einstein.edison at gmail.com>
wrote:

> Hi!
>
> Sorry to miss this week?s meeting.
>
> If I may point out an inaccuracy in the notes: in PyData/Sparse most
> things are implemented from the ground up without relying on scipy.sparse.
> The only part that does rely on it is `sparse.matmul`, `sparse.dot` and
> `sparse.tensordot`, as well as a few conversions to/from SciPy, if these
> could depend on Cython wrappers instead that?d be nice.
>
> I should probably update the docs on that. If anyone is willing to discuss
> pydata/sparse with me, I?ll be available for a meeting anytime.
>
> Best Regards,
> Hameer Abbasi
>
> On Thursday, Oct 25, 2018 at 12:08 AM, Stefan van der Walt <
> stefanv at berkeley.edu> wrote:
> Hi all,
>
> On Mon, 22 Oct 2018 09:56:37 +0300, Matti Picus wrote:
>
> We therefore invite you to join us for our weekly calls,
> each **Wednesday from 12:00 to 13:00 Pacific Time**.
>
> Detail of the next meeting (2018-10-24) is given in the agenda
>
>
> This week's meeting notes are at:
>
>
> https://github.com/BIDS-numpy/docs/blob/master/status_meetings/status-2018-10-24.md
>
> St?fan
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181025/a7aefc95/attachment.html>

From shoyer at gmail.com  Thu Oct 25 22:02:20 2018
From: shoyer at gmail.com (Stephan Hoyer)
Date: Thu, 25 Oct 2018 19:02:20 -0700
Subject: [Numpy-discussion] Depreciating asfortranarray and
 ascontiguousarray
In-Reply-To: <CAMEWA4P1pqHwcdqt72o6BoX8k2yw=1P=TOiW4f6vHgfzKP6vOQ@mail.gmail.com>
References: <8741181540498592@myt3-c573aa6fc782.qloud-c.yandex.net>
 <CACe1pJe0c7yPbmYPH9uPNNPBcN8yyBgu_U51WvWk3jfwak8mPQ@mail.gmail.com>
 <CAAa1KPaPZaU41djXwY8bxeaird2qVfz46KrSpFoyHmD9UeztzQ@mail.gmail.com>
 <CAMEWA4P1pqHwcdqt72o6BoX8k2yw=1P=TOiW4f6vHgfzKP6vOQ@mail.gmail.com>
Message-ID: <CAEQ_TveB=VP-3Goxrp=vgeLEAp0ua-yDrDpZtXcfefOXNjPPNA@mail.gmail.com>

On Thu, Oct 25, 2018 at 3:10 PM Andras Deak <deak.andris at gmail.com> wrote:

> On Thu, Oct 25, 2018 at 11:48 PM Joseph Fox-Rabinovitz
> <jfoxrabinovitz at gmail.com> wrote:
> >
> > In that vein, would it be advisable to re-implement them as aliases for
> the correctly behaving functions instead?
> >
> > - Joe
>
> Wouldn't "probably, can't be changed without breaking external code"
> still apply? As I understand the suggestion for _deprecation_ is only
> because there's (a lot of) code relying on the current behaviour (or
> at least there's risk).


I would also advocate for fixing these functions if possible (removing
ndim=1). ascontiguousarray(...) is certainly more readable than asarray(...
order='C').

The conservative way to handle this would be to do a deprecation cycle,
specifically by issuing FutureWarning when scalars or 0d arrays are
encountered as inputs.

Cheers,
Stephan
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181025/3350cd5d/attachment.html>

From einstein.edison at gmail.com  Fri Oct 26 04:47:09 2018
From: einstein.edison at gmail.com (Hameer Abbasi)
Date: Fri, 26 Oct 2018 10:47:09 +0200
Subject: [Numpy-discussion] Reminder: weekly status meeting
In-Reply-To: <CAHPuU_brG1K2ARwV=fE4e8pLmQsN52kb02ii6bcQSiBL0vaz1g@mail.gmail.com>
References: <e904c0a4-c086-74af-c371-53d6cd5cbb1c@gmail.com>
 <20181024220750.pytz7dav4dabeplx@carbo>
 <1f0af61b-62f6-4f98-a5e7-6241855a7006@Canary>
 <CAHPuU_brG1K2ARwV=fE4e8pLmQsN52kb02ii6bcQSiBL0vaz1g@mail.gmail.com>
Message-ID: <164a3678-0838-4bb7-84ed-92e1a249f875@Canary>

Hi everyone,

Like I said, we just use those to coerce SciPy arrays to native ones for compatibility. You could remove all those and the package would work fine, as long as you were using native PyData/Sparse arrays.

The only core functionality dependent on scipy.sparse is matrix multiplication and the like. Everything else is for inter-operability.

Best Regards,
Hameer Abbasi

> On Friday, Oct 26, 2018 at 1:19 AM, Tyler Reddy <tyler.je.reddy at gmail.com (mailto:tyler.je.reddy at gmail.com)> wrote:
> What exactly would you like Cython wrappers for? Some of the C++ code in scipy/sparse/sparsetools?
>
> I see you have COO.from_scipy_sparse(x) in some pydata/sparse code paths, which presumably you'd like to avoid or improve?
> On Thu, 25 Oct 2018 at 03:41, Hameer Abbasi <einstein.edison at gmail.com (mailto:einstein.edison at gmail.com)> wrote:
> > Hi!
> >
> > Sorry to miss this week?s meeting.
> >
> > If I may point out an inaccuracy in the notes: in PyData/Sparse most things are implemented from the ground up without relying on scipy.sparse. The only part that does rely on it is `sparse.matmul`, `sparse.dot` and `sparse.tensordot`, as well as a few conversions to/from SciPy, if these could depend on Cython wrappers instead that?d be nice.
> >
> > I should probably update the docs on that. If anyone is willing to discuss pydata/sparse with me, I?ll be available for a meeting anytime.
> >
> > Best Regards,
> > Hameer Abbasi
> >
> >
> > > On Thursday, Oct 25, 2018 at 12:08 AM, Stefan van der Walt <stefanv at berkeley.edu (mailto:stefanv at berkeley.edu)> wrote:
> > > Hi all,
> > >
> > > On Mon, 22 Oct 2018 09:56:37 +0300, Matti Picus wrote:
> > > > We therefore invite you to join us for our weekly calls,
> > > > each **Wednesday from 12:00 to 13:00 Pacific Time**.
> > > >
> > > > Detail of the next meeting (2018-10-24) is given in the agenda
> > >
> > > This week's meeting notes are at:
> > >
> > > https://github.com/BIDS-numpy/docs/blob/master/status_meetings/status-2018-10-24.md
> > >
> > > St?fan
> > > _______________________________________________
> > > NumPy-Discussion mailing list
> > > NumPy-Discussion at python.org (mailto:NumPy-Discussion at python.org)
> > > https://mail.python.org/mailman/listinfo/numpy-discussion
> > _______________________________________________
> > NumPy-Discussion mailing list
> > NumPy-Discussion at python.org (mailto:NumPy-Discussion at python.org)
> > https://mail.python.org/mailman/listinfo/numpy-discussion
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181026/cd5388d3/attachment-0001.html>

From stefanv at berkeley.edu  Fri Oct 26 13:03:14 2018
From: stefanv at berkeley.edu (Stefan van der Walt)
Date: Fri, 26 Oct 2018 10:03:14 -0700
Subject: [Numpy-discussion] Reminder: weekly status meeting
In-Reply-To: <164a3678-0838-4bb7-84ed-92e1a249f875@Canary>
References: <e904c0a4-c086-74af-c371-53d6cd5cbb1c@gmail.com>
 <20181024220750.pytz7dav4dabeplx@carbo>
 <1f0af61b-62f6-4f98-a5e7-6241855a7006@Canary>
 <CAHPuU_brG1K2ARwV=fE4e8pLmQsN52kb02ii6bcQSiBL0vaz1g@mail.gmail.com>
 <164a3678-0838-4bb7-84ed-92e1a249f875@Canary>
Message-ID: <20181026170314.wqvwwc4ncudz5dzo@carbo>

Hi Hameer,

On Fri, 26 Oct 2018 10:47:09 +0200, Hameer Abbasi wrote:
> The only core functionality dependent on scipy.sparse is matrix
> multiplication and the like. Everything else is for inter-operability.

Thank you for commenting here.

As you know, I am enthusiastic about seeing an `sparray` equivalent to
`spmatrix`.  When we last spoke, my recollection was that it would be
beneficial to `pydata/sparse`.  Is this still correct?

If not, are we now in a situation where it would be more helpful to
build `sparray` based on `pydata/sparse`.

If we can have a good sparse array API in place in SciPy, it may
significantly simplify code in various other libraries (I'm thinking of
scikit-learn, e.g.).

Best regards,
St?fan

From einstein.edison at gmail.com  Fri Oct 26 13:10:19 2018
From: einstein.edison at gmail.com (Hameer Abbasi)
Date: Fri, 26 Oct 2018 19:10:19 +0200
Subject: [Numpy-discussion] Reminder: weekly status meeting
In-Reply-To: <20181026170314.wqvwwc4ncudz5dzo@carbo>
References: <e904c0a4-c086-74af-c371-53d6cd5cbb1c@gmail.com>
 <20181024220750.pytz7dav4dabeplx@carbo>
 <1f0af61b-62f6-4f98-a5e7-6241855a7006@Canary>
 <CAHPuU_brG1K2ARwV=fE4e8pLmQsN52kb02ii6bcQSiBL0vaz1g@mail.gmail.com>
 <164a3678-0838-4bb7-84ed-92e1a249f875@Canary>
 <20181026170314.wqvwwc4ncudz5dzo@carbo>
Message-ID: <f473d66a-7c73-45c8-9e20-9feebe5d832e@Canary>

Hi Stefan!

PyData/Sparse is pretty far along, by January or so we should have a CSR/CSC replacement that is ND. It needs optimisation in a lot of cases but the API is compatible with NumPy and works pretty well already IMO.

PyData/Sparse is pretty much independent of any changes to scipy.sparse at this point. We build on top of NumPy, not scipy.sparse.

Feel free to use any or all of my code for sparray, although I think Ralf Gommers, Matthew Rocklin and others were of the opinion that the data structure should stay in PyData/Sparse and linear algebra and csgraph etc should go into SciPy.

Best Regards,
Hameer Abbasi

> On Friday, Oct 26, 2018 at 7:03 PM, Stefan van der Walt <stefanv at berkeley.edu (mailto:stefanv at berkeley.edu)> wrote:
> Hi Hameer,
>
> On Fri, 26 Oct 2018 10:47:09 +0200, Hameer Abbasi wrote:
> > The only core functionality dependent on scipy.sparse is matrix
> > multiplication and the like. Everything else is for inter-operability.
>
> Thank you for commenting here.
>
> As you know, I am enthusiastic about seeing an `sparray` equivalent to
> `spmatrix`. When we last spoke, my recollection was that it would be
> beneficial to `pydata/sparse`. Is this still correct?
>
> If not, are we now in a situation where it would be more helpful to
> build `sparray` based on `pydata/sparse`.
>
> If we can have a good sparse array API in place in SciPy, it may
> significantly simplify code in various other libraries (I'm thinking of
> scikit-learn, e.g.).
>
> Best regards,
> St?fan
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181026/729fd35b/attachment.html>

From alex.rogozhnikov at yandex.ru  Fri Oct 26 15:47:15 2018
From: alex.rogozhnikov at yandex.ru (Alex Rogozhnikov)
Date: Fri, 26 Oct 2018 22:47:15 +0300
Subject: [Numpy-discussion] Depreciating asfortranarray and
 ascontiguousarray
In-Reply-To: <CAEQ_TveB=VP-3Goxrp=vgeLEAp0ua-yDrDpZtXcfefOXNjPPNA@mail.gmail.com>
References: <8741181540498592@myt3-c573aa6fc782.qloud-c.yandex.net>
 <CACe1pJe0c7yPbmYPH9uPNNPBcN8yyBgu_U51WvWk3jfwak8mPQ@mail.gmail.com>
 <CAAa1KPaPZaU41djXwY8bxeaird2qVfz46KrSpFoyHmD9UeztzQ@mail.gmail.com>
 <CAMEWA4P1pqHwcdqt72o6BoX8k2yw=1P=TOiW4f6vHgfzKP6vOQ@mail.gmail.com>
 <CAEQ_TveB=VP-3Goxrp=vgeLEAp0ua-yDrDpZtXcfefOXNjPPNA@mail.gmail.com>
Message-ID: <15377471540583235@myt6-2fee75662a4f.qloud-c.yandex.net>

An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181026/247f9ac3/attachment.html>

From stefanv at berkeley.edu  Fri Oct 26 16:04:11 2018
From: stefanv at berkeley.edu (Stefan van der Walt)
Date: Fri, 26 Oct 2018 13:04:11 -0700
Subject: [Numpy-discussion] Depreciating asfortranarray and
 ascontiguousarray
In-Reply-To: <CAEQ_TveB=VP-3Goxrp=vgeLEAp0ua-yDrDpZtXcfefOXNjPPNA@mail.gmail.com>
References: <8741181540498592@myt3-c573aa6fc782.qloud-c.yandex.net>
 <CACe1pJe0c7yPbmYPH9uPNNPBcN8yyBgu_U51WvWk3jfwak8mPQ@mail.gmail.com>
 <CAAa1KPaPZaU41djXwY8bxeaird2qVfz46KrSpFoyHmD9UeztzQ@mail.gmail.com>
 <CAMEWA4P1pqHwcdqt72o6BoX8k2yw=1P=TOiW4f6vHgfzKP6vOQ@mail.gmail.com>
 <CAEQ_TveB=VP-3Goxrp=vgeLEAp0ua-yDrDpZtXcfefOXNjPPNA@mail.gmail.com>
Message-ID: <20181026200411.6ovdxezbfhpys5el@carbo>

On Thu, 25 Oct 2018 19:02:20 -0700, Stephan Hoyer wrote:
> I would also advocate for fixing these functions if possible (removing
> ndim=1). ascontiguousarray(...) is certainly more readable than asarray(...
> order='C').

I agree; these are widely used, and makes intuitive sense as part of the
API.

St?fan

From shoyer at gmail.com  Fri Oct 26 16:25:10 2018
From: shoyer at gmail.com (Stephan Hoyer)
Date: Fri, 26 Oct 2018 13:25:10 -0700
Subject: [Numpy-discussion] Depreciating asfortranarray and
 ascontiguousarray
In-Reply-To: <15377471540583235@myt6-2fee75662a4f.qloud-c.yandex.net>
References: <8741181540498592@myt3-c573aa6fc782.qloud-c.yandex.net>
 <CACe1pJe0c7yPbmYPH9uPNNPBcN8yyBgu_U51WvWk3jfwak8mPQ@mail.gmail.com>
 <CAAa1KPaPZaU41djXwY8bxeaird2qVfz46KrSpFoyHmD9UeztzQ@mail.gmail.com>
 <CAMEWA4P1pqHwcdqt72o6BoX8k2yw=1P=TOiW4f6vHgfzKP6vOQ@mail.gmail.com>
 <CAEQ_TveB=VP-3Goxrp=vgeLEAp0ua-yDrDpZtXcfefOXNjPPNA@mail.gmail.com>
 <15377471540583235@myt6-2fee75662a4f.qloud-c.yandex.net>
Message-ID: <CAEQ_TveuX2vhAcHw=v4s4cgK40eaVFkHaMG6=iaAT-6Weu4Gww@mail.gmail.com>

On Fri, Oct 26, 2018 at 12:55 PM Alex Rogozhnikov <
alex.rogozhnikov at yandex.ru> wrote:

>
> The conservative way to handle this would be to do a deprecation cycle,
> specifically by issuing FutureWarning when scalars or 0d arrays are
> encountered as inputs.
> Sounds good to me. Behavior should be scheduled for numpy 1.18?
>

Yes, that sounds about right to me.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181026/ab5e6e04/attachment.html>

From ralf.gommers at gmail.com  Fri Oct 26 17:27:49 2018
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Sat, 27 Oct 2018 10:27:49 +1300
Subject: [Numpy-discussion] Reminder: weekly status meeting
In-Reply-To: <f473d66a-7c73-45c8-9e20-9feebe5d832e@Canary>
References: <e904c0a4-c086-74af-c371-53d6cd5cbb1c@gmail.com>
 <20181024220750.pytz7dav4dabeplx@carbo>
 <1f0af61b-62f6-4f98-a5e7-6241855a7006@Canary>
 <CAHPuU_brG1K2ARwV=fE4e8pLmQsN52kb02ii6bcQSiBL0vaz1g@mail.gmail.com>
 <164a3678-0838-4bb7-84ed-92e1a249f875@Canary>
 <20181026170314.wqvwwc4ncudz5dzo@carbo>
 <f473d66a-7c73-45c8-9e20-9feebe5d832e@Canary>
Message-ID: <CABL7CQiZjTT6mc2KCUa3qYKxj8U7Nr5DmOMw6Wtq9Zb=C9Axsw@mail.gmail.com>

On Sat, Oct 27, 2018 at 6:10 AM Hameer Abbasi <einstein.edison at gmail.com>
wrote:

> Hi Stefan!
>
> PyData/Sparse is pretty far along, by January or so we should have a
> CSR/CSC replacement that is ND. It needs optimisation in a lot of cases but
> the API is compatible with NumPy and works pretty well already IMO.
>
> PyData/Sparse is pretty much independent of any changes to scipy.sparse at
> this point. We build on top of NumPy, not scipy.sparse.
>
> Feel free to use any or all of my code for sparray, although I think Ralf
> Gommers, Matthew Rocklin and others were of the opinion that the data
> structure should stay in PyData/Sparse and linear algebra and csgraph etc
> should go into SciPy.
>

Just to make sure we're talking about the same things here: Stefan, I think
with "sparray" you mean "an n-D sparse array implementation that lives in
SciPy", nothing more specific? In that case pydata/sparse is the one
implementation, and including it in scipy.sparse would make it "sparray".
I'm currently indeed leaning towards depending on pydata/sparse rather than
including it in scipy.

Cheers,
Ralf


> Best Regards,
> Hameer Abbasi
>
> On Friday, Oct 26, 2018 at 7:03 PM, Stefan van der Walt <
> stefanv at berkeley.edu> wrote:
> Hi Hameer,
>
> On Fri, 26 Oct 2018 10:47:09 +0200, Hameer Abbasi wrote:
>
> The only core functionality dependent on scipy.sparse is matrix
> multiplication and the like. Everything else is for inter-operability.
>
>
> Thank you for commenting here.
>
> As you know, I am enthusiastic about seeing an `sparray` equivalent to
> `spmatrix`. When we last spoke, my recollection was that it would be
> beneficial to `pydata/sparse`. Is this still correct?
>
> If not, are we now in a situation where it would be more helpful to
> build `sparray` based on `pydata/sparse`.
>
> If we can have a good sparse array API in place in SciPy, it may
> significantly simplify code in various other libraries (I'm thinking of
> scikit-learn, e.g.).
>
> Best regards,
> St?fan
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181027/23b479e6/attachment.html>

From stefanv at berkeley.edu  Fri Oct 26 18:10:28 2018
From: stefanv at berkeley.edu (Stefan van der Walt)
Date: Fri, 26 Oct 2018 15:10:28 -0700
Subject: [Numpy-discussion] Reminder: weekly status meeting
In-Reply-To: <CABL7CQiZjTT6mc2KCUa3qYKxj8U7Nr5DmOMw6Wtq9Zb=C9Axsw@mail.gmail.com>
References: <e904c0a4-c086-74af-c371-53d6cd5cbb1c@gmail.com>
 <20181024220750.pytz7dav4dabeplx@carbo>
 <1f0af61b-62f6-4f98-a5e7-6241855a7006@Canary>
 <CAHPuU_brG1K2ARwV=fE4e8pLmQsN52kb02ii6bcQSiBL0vaz1g@mail.gmail.com>
 <164a3678-0838-4bb7-84ed-92e1a249f875@Canary>
 <20181026170314.wqvwwc4ncudz5dzo@carbo>
 <f473d66a-7c73-45c8-9e20-9feebe5d832e@Canary>
 <CABL7CQiZjTT6mc2KCUa3qYKxj8U7Nr5DmOMw6Wtq9Zb=C9Axsw@mail.gmail.com>
Message-ID: <20181026221028.3r62dbsqpzzcvjj6@carbo>

On Sat, 27 Oct 2018 10:27:49 +1300, Ralf Gommers wrote:
> Just to make sure we're talking about the same things here: Stefan, I think
> with "sparray" you mean "an n-D sparse array implementation that lives in
> SciPy", nothing more specific? In that case pydata/sparse is the one
> implementation, and including it in scipy.sparse would make it "sparray".
> I'm currently indeed leaning towards depending on pydata/sparse rather than
> including it in scipy.

I want to double check: when we last spoke, it seemed as though certain
refactorings inside of SciPy (specifically, sparray was mentioned) would
simplify the life of pydata/sparse devs.  That no longer seems to be the
case?

If our recommended route is to tell users to use pydata/sparse instead
of SciPy (for the sparse array object), we probably want to get rid of
our own internal implementation, and deprecate spmatrix (or, build
spmatrix on top of pydata/sparse)?

Once we can define a clear API for sparse arrays, we can include some
algorithms that ingest those objects in SciPy.  But, I'm not sure we
have an API in place that will allow handover of such objects to the
existing C/FORTRAN-level code.

St?fan

From sebastian at sipsolutions.net  Fri Oct 26 18:10:12 2018
From: sebastian at sipsolutions.net (Sebastian Berg)
Date: Sat, 27 Oct 2018 00:10:12 +0200
Subject: [Numpy-discussion] Depreciating asfortranarray and
 ascontiguousarray
In-Reply-To: <CAEQ_TveuX2vhAcHw=v4s4cgK40eaVFkHaMG6=iaAT-6Weu4Gww@mail.gmail.com>
References: <8741181540498592@myt3-c573aa6fc782.qloud-c.yandex.net>
 <CACe1pJe0c7yPbmYPH9uPNNPBcN8yyBgu_U51WvWk3jfwak8mPQ@mail.gmail.com>
 <CAAa1KPaPZaU41djXwY8bxeaird2qVfz46KrSpFoyHmD9UeztzQ@mail.gmail.com>
 <CAMEWA4P1pqHwcdqt72o6BoX8k2yw=1P=TOiW4f6vHgfzKP6vOQ@mail.gmail.com>
 <CAEQ_TveB=VP-3Goxrp=vgeLEAp0ua-yDrDpZtXcfefOXNjPPNA@mail.gmail.com>
 <15377471540583235@myt6-2fee75662a4f.qloud-c.yandex.net>
 <CAEQ_TveuX2vhAcHw=v4s4cgK40eaVFkHaMG6=iaAT-6Weu4Gww@mail.gmail.com>
Message-ID: <c2be55450a2c27a6cd2ef37e1e828b544d3ccfef.camel@sipsolutions.net>

On Fri, 2018-10-26 at 13:25 -0700, Stephan Hoyer wrote:
> On Fri, Oct 26, 2018 at 12:55 PM Alex Rogozhnikov <
> alex.rogozhnikov at yandex.ru> wrote:
> >  
> > The conservative way to handle this would be to do a deprecation
> > cycle, specifically by issuing FutureWarning when scalars or 0d
> > arrays are encountered as inputs.
> > Sounds good to me. Behavior should be scheduled for numpy 1.18?
> > 
> 
> Yes, that sounds about right to me.
> 

Is there a way to avoid the future warning? An unavoidable warning in a
widely used function seems really annoying to me. Unless, the 0d thing
happens rarely, but then it might be the downstream users that get the
warning for no reason.

- Sebastian


>  
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: This is a digitally signed message part
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181027/40e6771d/attachment-0001.sig>

From shoyer at gmail.com  Fri Oct 26 19:26:09 2018
From: shoyer at gmail.com (Stephan Hoyer)
Date: Fri, 26 Oct 2018 16:26:09 -0700
Subject: [Numpy-discussion] Depreciating asfortranarray and
 ascontiguousarray
In-Reply-To: <c2be55450a2c27a6cd2ef37e1e828b544d3ccfef.camel@sipsolutions.net>
References: <8741181540498592@myt3-c573aa6fc782.qloud-c.yandex.net>
 <CACe1pJe0c7yPbmYPH9uPNNPBcN8yyBgu_U51WvWk3jfwak8mPQ@mail.gmail.com>
 <CAAa1KPaPZaU41djXwY8bxeaird2qVfz46KrSpFoyHmD9UeztzQ@mail.gmail.com>
 <CAMEWA4P1pqHwcdqt72o6BoX8k2yw=1P=TOiW4f6vHgfzKP6vOQ@mail.gmail.com>
 <CAEQ_TveB=VP-3Goxrp=vgeLEAp0ua-yDrDpZtXcfefOXNjPPNA@mail.gmail.com>
 <15377471540583235@myt6-2fee75662a4f.qloud-c.yandex.net>
 <CAEQ_TveuX2vhAcHw=v4s4cgK40eaVFkHaMG6=iaAT-6Weu4Gww@mail.gmail.com>
 <c2be55450a2c27a6cd2ef37e1e828b544d3ccfef.camel@sipsolutions.net>
Message-ID: <CAEQ_TveRmOwfty-7fs1Nb+oN=ufqvTqib0cv7PXWZz6GET_30Q@mail.gmail.com>

On Fri, Oct 26, 2018 at 3:48 PM Sebastian Berg <sebastian at sipsolutions.net>
wrote:

> On Fri, 2018-10-26 at 13:25 -0700, Stephan Hoyer wrote:
> > On Fri, Oct 26, 2018 at 12:55 PM Alex Rogozhnikov <
> > alex.rogozhnikov at yandex.ru> wrote:
> > >
> > > The conservative way to handle this would be to do a deprecation
> > > cycle, specifically by issuing FutureWarning when scalars or 0d
> > > arrays are encountered as inputs.
> > > Sounds good to me. Behavior should be scheduled for numpy 1.18?
> > >
> >
> > Yes, that sounds about right to me.
> >
>
> Is there a way to avoid the future warning? An unavoidable warning in a
> widely used function seems really annoying to me. Unless, the 0d thing
> happens rarely, but then it might be the downstream users that get the
> warning for no reason.
>
> - Sebastian
>

My suspicion is that 0d arrays are rarely used as arguments to
ascontiguousarray / asfortranarray. But it's hard to say for sure...
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181026/b98e8f01/attachment.html>

From teoliphant at gmail.com  Fri Oct 26 19:26:51 2018
From: teoliphant at gmail.com (Travis Oliphant)
Date: Fri, 26 Oct 2018 18:26:51 -0500
Subject: [Numpy-discussion] Depreciating asfortranarray and
 ascontiguousarray
In-Reply-To: <8741181540498592@myt3-c573aa6fc782.qloud-c.yandex.net>
References: <8741181540498592@myt3-c573aa6fc782.qloud-c.yandex.net>
Message-ID: <CAFMmPGM0Rca8+VzH4ncjTBJckLjYy4oQcwZb73H0U=2CmVbpcg@mail.gmail.com>

What is the justification for deprecation exactly?  These functions have
been well documented and have had the intended behavior of producing arrays
with dimension at least 1 for some time.  Why is it unexpected to produce
arrays of at least 1 dimension?  For some users this is exactly what is
wanted.  I don't understand the statement that behavior with 0-d arrays is
unexpected.

If the desire is to shrink the API of NumPy, I could see that.   But, it
seems odd to me to remove a much-used function with an established behavior
except as part of a wider API-shrinkage effort.

0-d arrays in NumPy are a separate conversation.  At this point, I think it
was a mistake not to embrace 0-d arrays in NumPy from day one.  In some
sense 0-d arrays *are* scalars at least conceptually and for JIT-producing
systems that exist now and will be growing in the future, they can be
equivalent to scalars.

The array scalars should become how you define what is *in* a NumPy array
making them true Python types, rather than Python 1-style "instances" of a
single "Dtype" object.  You would then have 0-d arrays and these Python
"memory" types describing what is *in* the array.

There is a clear way to do this, some of which has been outlined by
Nathaniel, and the rest I have an outline for how to implement.  I can
advise someone on how to do this.

-Travis


On Thu, Oct 25, 2018 at 3:17 PM Alex Rogozhnikov <alex.rogozhnikov at yandex.ru>
wrote:

> Dear numpy community,
>
> I'm planning to depreciate np.asfortranarray and np.ascontiguousarray
> functions due to their misbehavior on scalar (0-D tensors) with PR #12244.
>
> Current behavior (converting scalars to 1-d array with single element)
> - is unexpected and contradicts to documentation
> - probably, can't be changed without breaking external code
> - I believe, this was a cause for poor support of 0-d arrays in mxnet.
> - both functions are easily replaced with asarray(..., order='...'), which
> has expected behavior
>
> There is no timeline for removal - we just need to discourage from using
> this functions in new code.
>
> Function naming may be related to how numpy treats 0-d tensors specially,
> and those probably should not be called arrays.
> https://www.numpy.org/neps/nep-0027-zero-rank-arrarys.html
> However, as a user I never thought about 0-d arrays being special and
> being "not arrays".
>
>
> Please see original discussion at github for more details
> https://github.com/numpy/numpy/issues/5300
>
> Your comments welcome,
> Alex Rogozhnikov
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181026/4a105e21/attachment.html>

From teoliphant at gmail.com  Fri Oct 26 19:47:00 2018
From: teoliphant at gmail.com (Travis Oliphant)
Date: Fri, 26 Oct 2018 18:47:00 -0500
Subject: [Numpy-discussion] Depreciating asfortranarray and
 ascontiguousarray
In-Reply-To: <CAEQ_TveRmOwfty-7fs1Nb+oN=ufqvTqib0cv7PXWZz6GET_30Q@mail.gmail.com>
References: <8741181540498592@myt3-c573aa6fc782.qloud-c.yandex.net>
 <CACe1pJe0c7yPbmYPH9uPNNPBcN8yyBgu_U51WvWk3jfwak8mPQ@mail.gmail.com>
 <CAAa1KPaPZaU41djXwY8bxeaird2qVfz46KrSpFoyHmD9UeztzQ@mail.gmail.com>
 <CAMEWA4P1pqHwcdqt72o6BoX8k2yw=1P=TOiW4f6vHgfzKP6vOQ@mail.gmail.com>
 <CAEQ_TveB=VP-3Goxrp=vgeLEAp0ua-yDrDpZtXcfefOXNjPPNA@mail.gmail.com>
 <15377471540583235@myt6-2fee75662a4f.qloud-c.yandex.net>
 <CAEQ_TveuX2vhAcHw=v4s4cgK40eaVFkHaMG6=iaAT-6Weu4Gww@mail.gmail.com>
 <c2be55450a2c27a6cd2ef37e1e828b544d3ccfef.camel@sipsolutions.net>
 <CAEQ_TveRmOwfty-7fs1Nb+oN=ufqvTqib0cv7PXWZz6GET_30Q@mail.gmail.com>
Message-ID: <CAFMmPGMhC0CLkF5HRonMQg6bPBO4jNBU+Fc51J_avmGT_i8oZQ@mail.gmail.com>

I see now the original motivation as the unfortunate situation that mxnet
authors did not understand that np.ascontiguousarray returned an array of
at least one dimension and perhaps used that one API to assume that NumPy
did not support 0-d arrays --- which NumPy does indeed support.

Certainly that situation would motivate a documentation change to help
steer other future users from making the same incorrect assumption, but
deprecation is a separate question entirely.  I do not agree at all with
the trend to remove functions from NumPy API prior to a dedicated NumPy 2.0
effort.  This breaks the idea of semantic versioning for NumPy.

These functions do, in fact, have a use and were very much intended to
produce one-dimensional arrays --- in order to be used prior to calling C
or Fortran code that expected at least a 1-d array.    A lot of the SciPy
wrapping code needed this behavior.  It is a misinterpretation to assume
this is buggy or unintended.

Improving the documentation to warn about the behavior for 0-d arrays could
indeed be useful.

-Travis


On Fri, Oct 26, 2018 at 6:27 PM Stephan Hoyer <shoyer at gmail.com> wrote:

> On Fri, Oct 26, 2018 at 3:48 PM Sebastian Berg <sebastian at sipsolutions.net>
> wrote:
>
>> On Fri, 2018-10-26 at 13:25 -0700, Stephan Hoyer wrote:
>> > On Fri, Oct 26, 2018 at 12:55 PM Alex Rogozhnikov <
>> > alex.rogozhnikov at yandex.ru> wrote:
>> > >
>> > > The conservative way to handle this would be to do a deprecation
>> > > cycle, specifically by issuing FutureWarning when scalars or 0d
>> > > arrays are encountered as inputs.
>> > > Sounds good to me. Behavior should be scheduled for numpy 1.18?
>> > >
>> >
>> > Yes, that sounds about right to me.
>> >
>>
>> Is there a way to avoid the future warning? An unavoidable warning in a
>> widely used function seems really annoying to me. Unless, the 0d thing
>> happens rarely, but then it might be the downstream users that get the
>> warning for no reason.
>>
>> - Sebastian
>>
>
> My suspicion is that 0d arrays are rarely used as arguments to
> ascontiguousarray / asfortranarray. But it's hard to say for sure...
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181026/8d5db8fc/attachment-0001.html>

From alex.rogozhnikov at yandex.ru  Fri Oct 26 20:06:43 2018
From: alex.rogozhnikov at yandex.ru (Alex Rogozhnikov)
Date: Sat, 27 Oct 2018 03:06:43 +0300
Subject: [Numpy-discussion] Depreciating asfortranarray and
 ascontiguousarray
In-Reply-To: <CAFMmPGM0Rca8+VzH4ncjTBJckLjYy4oQcwZb73H0U=2CmVbpcg@mail.gmail.com>
References: <8741181540498592@myt3-c573aa6fc782.qloud-c.yandex.net>
 <CAFMmPGM0Rca8+VzH4ncjTBJckLjYy4oQcwZb73H0U=2CmVbpcg@mail.gmail.com>
Message-ID: <14021211540598803@sas1-890ba5c2334a.qloud-c.yandex.net>

An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181027/173715b2/attachment.html>

From teoliphant at gmail.com  Fri Oct 26 20:34:46 2018
From: teoliphant at gmail.com (Travis Oliphant)
Date: Fri, 26 Oct 2018 19:34:46 -0500
Subject: [Numpy-discussion] Depreciating asfortranarray and
 ascontiguousarray
In-Reply-To: <14021211540598803@sas1-890ba5c2334a.qloud-c.yandex.net>
References: <8741181540498592@myt3-c573aa6fc782.qloud-c.yandex.net>
 <CAFMmPGM0Rca8+VzH4ncjTBJckLjYy4oQcwZb73H0U=2CmVbpcg@mail.gmail.com>
 <14021211540598803@sas1-890ba5c2334a.qloud-c.yandex.net>
Message-ID: <CAFMmPGNhUUcZqYXKqRVqKqe8LWaZW0+UYvPZcL6xqFBwace1oQ@mail.gmail.com>

On Fri, Oct 26, 2018 at 7:14 PM Alex Rogozhnikov <alex.rogozhnikov at yandex.ru>
wrote:

> > If the desire is to shrink the API of NumPy, I could see that.
>
> Very good desire, but my goal was different.
>

> > For some users this is exactly what is wanted.
>
> Maybe so, but I didn't face such example (and nobody mentioned those so
> far in the discussion).
> The opposite (according to the issue) happened. Mxnet example is
> sufficient in my opinion.
>

I agree that the old motivation of APIs that would make it easy to create
SciPy is no longer a major motivation for most users and even developers
and so these reasons would not be very present (as well as why it wasn't
even mentioned in the documentation).


> Simple example:
> x = np.zeros([])
> assert(x.flags.c_contiguous)
> assert(np.ascontiguousarray(x).shape == x.shape)
>
> Behavior contradicts to documentation (shape is changed) and to name
> (flags are saying - it is already c_contiguous)
>
> If you insist, that keeping ndmin=1 is important (I am not yet convinced,
> but I am ready to believe your autority),
> we can add ndmin=1 to functions' signatures, this way explicitly notifying
> users about expected dimension.
>

I understand the lack of being convinced.  This is ultimately a problem of
0-d arrays not being fully embraced and accepted by the Numeric community
originally (which NumPy inherited during the early days).   Is there a way
to document functions that will be removed on a major version increase
which don't print warnings on use? I would support this.

I'm a big supporter of making a NumPy 2.0 and have been for several years.
Now that Python 3 transition has happened, I think we could seriously
discuss this.  I'm trying to raise funding for maintenance and progress for
NumPy and SciPy right now via Quansight Labs http://www.quansight.com/labs
and I hope to be able to help find grants to support the wonderful efforts
that have been happening for some time.

While I'm thrilled and impressed by the number of amazing devs who have
kept NumPy and SciPy going in mostly their spare time, it has created
challenges that we have not had continuous maintenance funding to allow
continuous paid development so that several people who know about the early
decisions could not be retained to spend time on helping the transition.

Your bringing the problem of mxnet devs is most appreciated.  I will make a
documentation PR.

-Travis


> Alex.
>
>
> 27.10.2018, 02:27, "Travis Oliphant" <teoliphant at gmail.com>:
>
> What is the justification for deprecation exactly?  These functions have
> been well documented and have had the intended behavior of producing arrays
> with dimension at least 1 for some time.  Why is it unexpected to produce
> arrays of at least 1 dimension?  For some users this is exactly what is
> wanted.  I don't understand the statement that behavior with 0-d arrays is
> unexpected.
>
> If the desire is to shrink the API of NumPy, I could see that.   But, it
> seems odd to me to remove a much-used function with an established behavior
> except as part of a wider API-shrinkage effort.
>
> 0-d arrays in NumPy are a separate conversation.  At this point, I think
> it was a mistake not to embrace 0-d arrays in NumPy from day one.  In some
> sense 0-d arrays *are* scalars at least conceptually and for JIT-producing
> systems that exist now and will be growing in the future, they can be
> equivalent to scalars.
>
> The array scalars should become how you define what is *in* a NumPy array
> making them true Python types, rather than Python 1-style "instances" of a
> single "Dtype" object.  You would then have 0-d arrays and these Python
> "memory" types describing what is *in* the array.
>
> There is a clear way to do this, some of which has been outlined by
> Nathaniel, and the rest I have an outline for how to implement.  I can
> advise someone on how to do this.
>
> -Travis
>
>
>
>
> On Thu, Oct 25, 2018 at 3:17 PM Alex Rogozhnikov <
> alex.rogozhnikov at yandex.ru> wrote:
>
> Dear numpy community,
>
> I'm planning to depreciate np.asfortranarray and np.ascontiguousarray
> functions due to their misbehavior on scalar (0-D tensors) with PR #12244.
>
> Current behavior (converting scalars to 1-d array with single element)
> - is unexpected and contradicts to documentation
> - probably, can't be changed without breaking external code
> - I believe, this was a cause for poor support of 0-d arrays in mxnet.
> - both functions are easily replaced with asarray(..., order='...'), which
> has expected behavior
>
> There is no timeline for removal - we just need to discourage from using
> this functions in new code.
>
> Function naming may be related to how numpy treats 0-d tensors specially,
> and those probably should not be called arrays.
> https://www.numpy.org/neps/nep-0027-zero-rank-arrarys.html
> However, as a user I never thought about 0-d arrays being special and
> being "not arrays".
>
>
> Please see original discussion at github for more details
> https://github.com/numpy/numpy/issues/5300
>
> Your comments welcome,
> Alex Rogozhnikov
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
> ,
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181026/d846e5b9/attachment-0001.html>

From teoliphant at gmail.com  Fri Oct 26 22:12:00 2018
From: teoliphant at gmail.com (Travis Oliphant)
Date: Fri, 26 Oct 2018 21:12:00 -0500
Subject: [Numpy-discussion] asanyarray vs. asarray
In-Reply-To: <CAJNV+9uXs9at6RYyymQcoGpP4cvr+waVitHWXnsy4MciE+6RnA@mail.gmail.com>
References: <a7e76515-95ab-e867-dc96-92c54d8b0952@gmail.com>
 <24100c7f-20fd-4eed-99b0-d37660f52223@Canary>
 <CAJNV+9vH2o_2KCiMVc6S1ww=-NXpfGK39UoJVKKJNmv7pN602A@mail.gmail.com>
 <CAEQ_TvezgD7_KNB5PD_zjKQ=j+dN-k7BkEzNVVM9+VNAw8b85w@mail.gmail.com>
 <cb1364db-9e46-4614-a1df-d5dc33ad2c64@Canary>
 <CABL7CQgLqiTAO4H+3MUZzDbF3eR_njO4kH9UD1C_t27gnkGYYw@mail.gmail.com>
 <CABL7CQgXNt6ZGrDF8OJz73PDGJmbvaKs6q-3ArFkeLF86bpSEg@mail.gmail.com>
 <CAJNV+9uXs9at6RYyymQcoGpP4cvr+waVitHWXnsy4MciE+6RnA@mail.gmail.com>
Message-ID: <CAFMmPGNbhJu1vrREnugi54gWHK9MPqFenRMsnscau=tjwVn8Hw@mail.gmail.com>

On Fri, Oct 19, 2018 at 8:24 PM Marten van Kerkwijk <
m.h.vankerkwijk at gmail.com> wrote:

> Hi All,
>
> It seems there are two extreme possibilities for general functions:
> 1. Put `asarray` everywhere. The main benefit that I can see is that even
> if people put in list instead of arrays, one is guaranteed to have shape,
> dtype, etc. But it seems a bit like calling `int` on everything that might
> get used as an index, instead of letting the actual indexing do the proper
> thing and call `__index__`.
>

Yes, actually getting a proper "array protocol" into Python would be a
fantastic approach.   We have been working with Lenore Mullin who is a
researcher on the mathematics of arrays on what it means to be an array and
believe we can come up with an actual array protocol that perhaps could be
put into Python itself (though that isn't our immediate goal right now).


> 2. Do not coerce at all, but rather write code assuming something is an
> array already. This will often, but not always, just work for array mimics,
> with coercion done only where necessary (e.g., in lower-lying C code such
> as that of the ufuncs which has a smaller API surface and can be overridden
> more easily).
>
> The current __array_function__ work may well provide us with a way to
> combine both, if we (over time) move the coercion inside
> `ndarray.__array_function__` so that the actual implementation *can* assume
> it deals with pure ndarray - then, when relevant, calling that
> implementation will be what subclasses/duck arrays can happily do (and it
> is up to them to ensure this works).
>

Also, we could get rid of asarray entirely by changing expectations.  This
automatic conversion code throughout NumPy and SciPy is an example of the
confusion in both of these libraries between "user-oriented interfaces" and
"developer-oriented interfaces".   A developer just wants the library to
use duck-typing and then raise errors if you don't provide the right type
(i.e. a list instead of an array).  The user-interface could happen in
Jupyter, or be isolated to a high-level library or meta-code approach (of
which there are several possibilities for Python).


>
> Of course, the above does not really answer what to do in the meantime.
> But perhaps it helps in thinking of what we are actually aiming for.
>

> One last thing: could we please stop bashing subclasses? One can subclass
> essentially everything in python, often to great advantage. Subclasses such
> as MaskedArray and, yes, Quantity, are widely used, and if they cause
> problems perhaps that should be seen as a sign that ndarray subclassing
> should be made easier and clearer.
>
>
I agree that we can stop bashing subclasses in general.   The problem with
numpy subclasses is that they were made without adherence to SOLID:
https://en.wikipedia.org/wiki/SOLID.  In particular the Liskov substitution
principle:  https://en.wikipedia.org/wiki/Liskov_substitution_principle .
Much of this is my fault.  Being a scientist/engineer more than a computer
scientist, I had no idea what these principles were and did not properly
apply them in creating np.matrix which clearly violates the substitution
principle.

We can clean all this and more up.

But, we really need to start talking about NumPy 2.0 to do it.   Now that
Python 3.x is really here, we can raise the money for it and get it done.
We don't have to just rely on volunteer time.

The world will thank us for actually pushing NumPy 2.0.  I know not
everyone agrees, but for whatever its worth, I feel very, very strongly
about this, and despite not being very active on this list for the past
years, I do have a lot of understanding about how the current code actually
works (and where and why its warts are).

-Travis


> All the best,
>
> Marten
>
>
> On Fri, Oct 19, 2018 at 7:02 PM Ralf Gommers <ralf.gommers at gmail.com>
> wrote:
>
>>
>>
>> On Fri, Oct 19, 2018 at 10:28 PM Ralf Gommers <ralf.gommers at gmail.com>
>> wrote:
>>
>>>
>>>
>>> On Fri, Oct 19, 2018 at 4:15 PM Hameer Abbasi <einstein.edison at gmail.com>
>>> wrote:
>>>
>>>> Hi!
>>>>
>>>> On Friday, Oct 19, 2018 at 6:09 PM, Stephan Hoyer <shoyer at gmail.com>
>>>> wrote:
>>>> I don't think it makes much sense to change NumPy's existing usage of
>>>> asarray() to asanyarray() unless we add subok=True arguments (which default
>>>> to False). But this ends up cluttering NumPy's public API, which is also
>>>> undesirable.
>>>>
>>>> Agreed so far.
>>>>
>>>
>>> I'm not sure I agree. "subok" is very unpythonic; the average numpy
>>> library function should work fine for a well-behaved subclass (i.e. most
>>> things out there except np.matrix).
>>>
>>>>
>>>> The preferred way to override NumPy functions going forward should be
>>>> __array_function__.
>>>>
>>>>
>>>> I think we should ?soft support? i.e. allow but consider unsupported,
>>>> the case where one of NumPy?s functions is implemented in terms of others
>>>> and ?passing through? an array results in the correct behaviour for that
>>>> array.
>>>>
>>>
>>> I don't think we have or want such a concept as "soft support". We
>>> intend to not break anything that now has asanyarray, i.e. it's supported
>>> and ideally we have regression tests for all such functions. For anything
>>> we transition over from asarray to asanyarray, PRs should come with new
>>> tests.
>>>
>>>
>>>>
>>>> On Fri, Oct 19, 2018 at 8:13 AM Marten van Kerkwijk <
>>>> m.h.vankerkwijk at gmail.com> wrote:
>>>>
>>>>> There are exceptions for `matrix` in quite a few places, and there now
>>>>> is warning for `maxtrix` - it might not be bad to use `asanyarray` and add
>>>>> an exception for `maxtrix`. Indeed, I quite like the suggestion by Eric
>>>>> Wieser to just add the exception to `asanyarray` itself - that way when
>>>>> matrix is truly deprecated, it will be a very easy change.
>>>>>
>>>> I don't quite understand this. Adding exceptions is not deprecation -
>>> we then may as well just rip np.matrix out straight away.
>>>
>>> What I suggested in the call about this issue is that it's not very
>>> effective to treat functions like percentile/quantile one by one without an
>>> overarching strategy. A way forward could be for someone to write an
>>> overview of which sets of functions now have asanyarray (and actually work
>>> with subclasses), which ones we can and want to change now, and which ones
>>> we can and want to change after np.matrix is gone. Also, some guidelines
>>> for new functions that we add to numpy would be handy. I suspect we've been
>>> adding new functions that use asarray rather than asanyarray, which is
>>> probably undesired.
>>>
>>
>> Thanks Nathaniel and Stephan. Your comments on my other two points are
>> both clear and correct (and have been made a number of times before). I
>> think the "write an overview so we can stop making ad-hoc decisions and
>> having these discussions" is the most important point I was trying to make
>> though. If we had such a doc and it concluded "hence we don't change
>> anything, __array_function__ is the only way to go" then we can just close
>> PRs like https://github.com/numpy/numpy/pull/11162 straight away.
>>
>> Cheers,
>> Ralf
>>
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at python.org
>> https://mail.python.org/mailman/listinfo/numpy-discussion
>>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181026/c35beb13/attachment.html>

From ralf.gommers at gmail.com  Sat Oct 27 00:08:19 2018
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Sat, 27 Oct 2018 17:08:19 +1300
Subject: [Numpy-discussion] Reminder: weekly status meeting
In-Reply-To: <20181026221028.3r62dbsqpzzcvjj6@carbo>
References: <e904c0a4-c086-74af-c371-53d6cd5cbb1c@gmail.com>
 <20181024220750.pytz7dav4dabeplx@carbo>
 <1f0af61b-62f6-4f98-a5e7-6241855a7006@Canary>
 <CAHPuU_brG1K2ARwV=fE4e8pLmQsN52kb02ii6bcQSiBL0vaz1g@mail.gmail.com>
 <164a3678-0838-4bb7-84ed-92e1a249f875@Canary>
 <20181026170314.wqvwwc4ncudz5dzo@carbo>
 <f473d66a-7c73-45c8-9e20-9feebe5d832e@Canary>
 <CABL7CQiZjTT6mc2KCUa3qYKxj8U7Nr5DmOMw6Wtq9Zb=C9Axsw@mail.gmail.com>
 <20181026221028.3r62dbsqpzzcvjj6@carbo>
Message-ID: <CABL7CQg+JttTbt=9RB8wjyjvw=R0Pje3xquUozcOULx2qmig8Q@mail.gmail.com>

On Sat, Oct 27, 2018 at 11:10 AM Stefan van der Walt <stefanv at berkeley.edu>
wrote:

> On Sat, 27 Oct 2018 10:27:49 +1300, Ralf Gommers wrote:
> > Just to make sure we're talking about the same things here: Stefan, I
> think
> > with "sparray" you mean "an n-D sparse array implementation that lives in
> > SciPy", nothing more specific? In that case pydata/sparse is the one
> > implementation, and including it in scipy.sparse would make it "sparray".
> > I'm currently indeed leaning towards depending on pydata/sparse rather
> than
> > including it in scipy.
>
> I want to double check: when we last spoke, it seemed as though certain
> refactorings inside of SciPy (specifically, sparray was mentioned) would
> simplify the life of pydata/sparse devs.  That no longer seems to be the
> case?
>

There's no such thing as `sparray` anywhere in SciPy. There's two inactive
projects to create an n-D sparse array implementation, one of which is
called sparray (https://github.com/perimosocordiae/sparray). And there's
one very active project to do that same thing which is
https://github.com/pydata/sparse


> If our recommended route is to tell users to use pydata/sparse instead
> of SciPy (for the sparse array object), we probably want to get rid of
> our own internal implementation, and deprecate spmatrix


Doc-deprecate I think; the sparse matrix classes in SciPy are very heavily
used, so it doesn't make sense to start emitting deprecation warnings for
them. But at some point we'll want to point users to pydata/sparse for new
code.


> (or, build
> spmatrix on top of pydata/sparse)?
>

It's the matrix vs. array semantics that are the issue, so not sure that
building one on top of the other would be useful.


> Once we can define a clear API for sparse arrays, we can include some
> algorithms that ingest those objects in SciPy.  But, I'm not sure we
> have an API in place that will allow handover of such objects to the
> existing C/FORTRAN-level code.
>

I don't think the constructors for sparse matrix/array care about C/F
order. pydata/sparse is pure Python (and uses Numba). For reusing
scipy.sparse.linalg and scipy.sparse.csgraph you're right I think that that
will need some careful design work. Not sure anyone has thought about that
in a lot of detail yet.

There are interesting API questions probably, such as how to treat explicit
zeros (that debate still isn't settled for the matrix classes IIRC). And
there's an interesting transition puzzle to figure out (which also includes
np.matrix). At the moment the discussion on that is spread out over many
mailing list threads and Github issues, at some point we'll need to
summarize that. Probably around the time that the CSR/CSC replacement that
Hameer mentioned is finished.

Cheers,
Ralf
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181027/6ea4ef97/attachment-0001.html>

From wieser.eric+numpy at gmail.com  Sat Oct 27 01:36:43 2018
From: wieser.eric+numpy at gmail.com (Eric Wieser)
Date: Fri, 26 Oct 2018 22:36:43 -0700
Subject: [Numpy-discussion] Depreciating asfortranarray and
 ascontiguousarray
In-Reply-To: <CAFMmPGNhUUcZqYXKqRVqKqe8LWaZW0+UYvPZcL6xqFBwace1oQ@mail.gmail.com>
References: <8741181540498592@myt3-c573aa6fc782.qloud-c.yandex.net>
 <CAFMmPGM0Rca8+VzH4ncjTBJckLjYy4oQcwZb73H0U=2CmVbpcg@mail.gmail.com>
 <14021211540598803@sas1-890ba5c2334a.qloud-c.yandex.net>
 <CAFMmPGNhUUcZqYXKqRVqKqe8LWaZW0+UYvPZcL6xqFBwace1oQ@mail.gmail.com>
Message-ID: <CAL1kJvBBpWXosi0RSBLb0dEuTqCdP-rct=n+R=u27fX3nk8=6Q@mail.gmail.com>

in order to be used prior to calling C or Fortran code that expected at
least a 1-d array

I?d argue that the behavior for these functions should have just been to
raise an error saying ?this function does not support 0d arrays?, rather
than silently inserting extra dimensions. As a bonus, that would push the
function developers to add support for 0d. Obviously we can?t make it do
that now, but what we can do is have it emit a warning in those cases.

I think our options are:

   1. Deprecate the entire function
   2. Deprecate and eventually(?) throw an error upon calling the function
   on 0d arrays, with a message like *?in future using ascontiguousarray to
   promote 0d arrays to 1d arrays will not be supported. If promotion is
   intentional, use ascontiguousarray(atleast1d(x)) to silence this warning
   and keep the old behavior, and if not use asarray(x, order='C') to preserve
   0d arrays?*
   3. Deprecate (future-warning) when passed 0d arrays, and eventually skip
   the upcast to 1d.
   If the calling code really needed a 1d array, then it will probably
   fail, which is not really different to 2, but has the advantage that the
   names are less surprising.
   4. Only improve the documentation

My preference would be 3

Eric

On Fri, 26 Oct 2018 at 17:35 Travis Oliphant <teoliphant at gmail.com> wrote:

On Fri, Oct 26, 2018 at 7:14 PM Alex Rogozhnikov <alex.rogozhnikov at yandex.ru>
> wrote:
>
>> > If the desire is to shrink the API of NumPy, I could see that.
>>
>> Very good desire, but my goal was different.
>>
>
>> > For some users this is exactly what is wanted.
>>
>> Maybe so, but I didn't face such example (and nobody mentioned those so
>> far in the discussion).
>> The opposite (according to the issue) happened. Mxnet example is
>> sufficient in my opinion.
>>
>
> I agree that the old motivation of APIs that would make it easy to create
> SciPy is no longer a major motivation for most users and even developers
> and so these reasons would not be very present (as well as why it wasn't
> even mentioned in the documentation).
>
>
>> Simple example:
>> x = np.zeros([])
>> assert(x.flags.c_contiguous)
>> assert(np.ascontiguousarray(x).shape == x.shape)
>>
>> Behavior contradicts to documentation (shape is changed) and to name
>> (flags are saying - it is already c_contiguous)
>>
>> If you insist, that keeping ndmin=1 is important (I am not yet convinced,
>> but I am ready to believe your autority),
>> we can add ndmin=1 to functions' signatures, this way explicitly
>> notifying users about expected dimension.
>>
>
> I understand the lack of being convinced.  This is ultimately a problem of
> 0-d arrays not being fully embraced and accepted by the Numeric community
> originally (which NumPy inherited during the early days).   Is there a way
> to document functions that will be removed on a major version increase
> which don't print warnings on use? I would support this.
>
> I'm a big supporter of making a NumPy 2.0 and have been for several years.
> Now that Python 3 transition has happened, I think we could seriously
> discuss this.  I'm trying to raise funding for maintenance and progress for
> NumPy and SciPy right now via Quansight Labs http://www.quansight.com/labs
> and I hope to be able to help find grants to support the wonderful efforts
> that have been happening for some time.
>
> While I'm thrilled and impressed by the number of amazing devs who have
> kept NumPy and SciPy going in mostly their spare time, it has created
> challenges that we have not had continuous maintenance funding to allow
> continuous paid development so that several people who know about the early
> decisions could not be retained to spend time on helping the transition.
>
> Your bringing the problem of mxnet devs is most appreciated.  I will make
> a documentation PR.
>
> -Travis
>
>
>
>
>> Alex.
>>
>>
>> 27.10.2018, 02:27, "Travis Oliphant" <teoliphant at gmail.com>:
>>
>> What is the justification for deprecation exactly?  These functions have
>> been well documented and have had the intended behavior of producing arrays
>> with dimension at least 1 for some time.  Why is it unexpected to produce
>> arrays of at least 1 dimension?  For some users this is exactly what is
>> wanted.  I don't understand the statement that behavior with 0-d arrays is
>> unexpected.
>>
>> If the desire is to shrink the API of NumPy, I could see that.   But, it
>> seems odd to me to remove a much-used function with an established behavior
>> except as part of a wider API-shrinkage effort.
>>
>> 0-d arrays in NumPy are a separate conversation.  At this point, I think
>> it was a mistake not to embrace 0-d arrays in NumPy from day one.  In some
>> sense 0-d arrays *are* scalars at least conceptually and for JIT-producing
>> systems that exist now and will be growing in the future, they can be
>> equivalent to scalars.
>>
>> The array scalars should become how you define what is *in* a NumPy array
>> making them true Python types, rather than Python 1-style "instances" of a
>> single "Dtype" object.  You would then have 0-d arrays and these Python
>> "memory" types describing what is *in* the array.
>>
>> There is a clear way to do this, some of which has been outlined by
>> Nathaniel, and the rest I have an outline for how to implement.  I can
>> advise someone on how to do this.
>>
>> -Travis
>>
>>
>>
>>
>> On Thu, Oct 25, 2018 at 3:17 PM Alex Rogozhnikov <
>> alex.rogozhnikov at yandex.ru> wrote:
>>
>> Dear numpy community,
>>
>> I'm planning to depreciate np.asfortranarray and np.ascontiguousarray
>> functions due to their misbehavior on scalar (0-D tensors) with PR #12244
>> .
>>
>> Current behavior (converting scalars to 1-d array with single element)
>> - is unexpected and contradicts to documentation
>> - probably, can't be changed without breaking external code
>> - I believe, this was a cause for poor support of 0-d arrays in mxnet.
>> - both functions are easily replaced with asarray(..., order='...'),
>> which has expected behavior
>>
>> There is no timeline for removal - we just need to discourage from using
>> this functions in new code.
>>
>> Function naming may be related to how numpy treats 0-d tensors specially,
>>
>> and those probably should not be called arrays.
>> https://www.numpy.org/neps/nep-0027-zero-rank-arrarys.html
>> However, as a user I never thought about 0-d arrays being special and
>> being "not arrays".
>>
>>
>> Please see original discussion at github for more details
>> https://github.com/numpy/numpy/issues/5300
>>
>> Your comments welcome,
>> Alex Rogozhnikov
>>
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at python.org
>> https://mail.python.org/mailman/listinfo/numpy-discussion
>>
>> ,
>>
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at python.org
>> https://mail.python.org/mailman/listinfo/numpy-discussion
>>
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at python.org
>> https://mail.python.org/mailman/listinfo/numpy-discussion
>>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
?
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181026/3ecde7ce/attachment-0001.html>

From ralf.gommers at gmail.com  Sat Oct 27 02:29:47 2018
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Sat, 27 Oct 2018 19:29:47 +1300
Subject: [Numpy-discussion] Depreciating asfortranarray and
 ascontiguousarray
In-Reply-To: <CAL1kJvBBpWXosi0RSBLb0dEuTqCdP-rct=n+R=u27fX3nk8=6Q@mail.gmail.com>
References: <8741181540498592@myt3-c573aa6fc782.qloud-c.yandex.net>
 <CAFMmPGM0Rca8+VzH4ncjTBJckLjYy4oQcwZb73H0U=2CmVbpcg@mail.gmail.com>
 <14021211540598803@sas1-890ba5c2334a.qloud-c.yandex.net>
 <CAFMmPGNhUUcZqYXKqRVqKqe8LWaZW0+UYvPZcL6xqFBwace1oQ@mail.gmail.com>
 <CAL1kJvBBpWXosi0RSBLb0dEuTqCdP-rct=n+R=u27fX3nk8=6Q@mail.gmail.com>
Message-ID: <CABL7CQh3BdQLeQbDfe_Re7=xOy5wtb+xh774nBsU_7Adyzb1Zw@mail.gmail.com>

On Sat, Oct 27, 2018 at 6:37 PM Eric Wieser <wieser.eric+numpy at gmail.com>
wrote:

> in order to be used prior to calling C or Fortran code that expected at
> least a 1-d array
>
> I?d argue that the behavior for these functions should have just been to
> raise an error saying ?this function does not support 0d arrays?, rather
> than silently inserting extra dimensions. As a bonus, that would push the
> function developers to add support for 0d. Obviously we can?t make it do
> that now, but what we can do is have it emit a warning in those cases.
>
> I think our options are:
>
>    1. Deprecate the entire function
>    2. Deprecate and eventually(?) throw an error upon calling the
>    function on 0d arrays, with a message like *?in future using
>    ascontiguousarray to promote 0d arrays to 1d arrays will not be supported.
>    If promotion is intentional, use ascontiguousarray(atleast1d(x)) to silence
>    this warning and keep the old behavior, and if not use asarray(x,
>    order='C') to preserve 0d arrays?*
>    3. Deprecate (future-warning) when passed 0d arrays, and eventually
>    skip the upcast to 1d.
>    If the calling code really needed a 1d array, then it will probably
>    fail, which is not really different to 2, but has the advantage that the
>    names are less surprising.
>    4. Only improve the documentation
>
> My preference would be 3
>
I'd go for 4, or alternatively for the warning in 2 (which can be left in
place indefinitely). 1 is unwarranted, and 3 will change behavior which is
worse than just warning or stopping to support existing behavior (= 2).

Eric
>
> On Fri, 26 Oct 2018 at 17:35 Travis Oliphant <teoliphant at gmail.com> wrote:
>
> On Fri, Oct 26, 2018 at 7:14 PM Alex Rogozhnikov <
>> alex.rogozhnikov at yandex.ru> wrote:
>>
>>> > If the desire is to shrink the API of NumPy, I could see that.
>>>
>>> Very good desire, but my goal was different.
>>>
>>
>>> > For some users this is exactly what is wanted.
>>>
>>> Maybe so, but I didn't face such example (and nobody mentioned those so
>>> far in the discussion).
>>> The opposite (according to the issue) happened. Mxnet example is
>>> sufficient in my opinion.
>>>
>>
>> I agree that the old motivation of APIs that would make it easy to create
>> SciPy is no longer a major motivation for most users and even developers
>> and so these reasons would not be very present (as well as why it wasn't
>> even mentioned in the documentation).
>>
>>
>>> Simple example:
>>> x = np.zeros([])
>>> assert(x.flags.c_contiguous)
>>> assert(np.ascontiguousarray(x).shape == x.shape)
>>>
>>> Behavior contradicts to documentation (shape is changed) and to name
>>> (flags are saying - it is already c_contiguous)
>>>
>>> If you insist, that keeping ndmin=1 is important (I am not yet
>>> convinced, but I am ready to believe your autority),
>>> we can add ndmin=1 to functions' signatures, this way explicitly
>>> notifying users about expected dimension.
>>>
>>
>> I understand the lack of being convinced.  This is ultimately a problem
>> of 0-d arrays not being fully embraced and accepted by the Numeric
>> community originally (which NumPy inherited during the early days).   Is
>> there a way to document functions that will be removed on a major version
>> increase which don't print warnings on use? I would support this.
>>
>
No, there's no such thing at the moment - the closest thing is
https://github.com/numpy/numpy/wiki/Backwards-incompatible-ideas-for-a-major-release.
I doubt we want such a thing anyway - removing functions without
deprecation warnings first doesn't seem quite right.


>
>> I'm a big supporter of making a NumPy 2.0 and have been for several
>> years. Now that Python 3 transition has happened, I think we could
>> seriously discuss this.
>>
>
I think it's more helpful to discuss goals and concrete plans for those,
rather than a "NumPy 2.0" label. The latter never worked in the past, and
not just because of lack of time/funding - it just means different things
to different people. We now have a good start on what our major goals are (
http://www.numpy.org/neps/#roadmap), let's build on that.

  I'm trying to raise funding for maintenance and progress for NumPy and
>> SciPy right now via Quansight Labs http://www.quansight.com/labs and I
>> hope to be able to help find grants to support the wonderful efforts that
>> have been happening for some time.
>>
>
The NumPy grant and having Tyler/Matti/Stefan at BIDS is a great start to
funded development; more and more diverse funding sources would be awesome.

Cheers,
Ralf


>> While I'm thrilled and impressed by the number of amazing devs who have
>> kept NumPy and SciPy going in mostly their spare time, it has created
>> challenges that we have not had continuous maintenance funding to allow
>> continuous paid development so that several people who know about the early
>> decisions could not be retained to spend time on helping the transition.
>>
>> Your bringing the problem of mxnet devs is most appreciated.  I will make
>> a documentation PR.
>>
>> -Travis
>>
>>
>>
>>
>>> Alex.
>>>
>>>
>>> 27.10.2018, 02:27, "Travis Oliphant" <teoliphant at gmail.com>:
>>>
>>> What is the justification for deprecation exactly?  These functions have
>>> been well documented and have had the intended behavior of producing arrays
>>> with dimension at least 1 for some time.  Why is it unexpected to produce
>>> arrays of at least 1 dimension?  For some users this is exactly what is
>>> wanted.  I don't understand the statement that behavior with 0-d arrays is
>>> unexpected.
>>>
>>> If the desire is to shrink the API of NumPy, I could see that.   But, it
>>> seems odd to me to remove a much-used function with an established behavior
>>> except as part of a wider API-shrinkage effort.
>>>
>>> 0-d arrays in NumPy are a separate conversation.  At this point, I think
>>> it was a mistake not to embrace 0-d arrays in NumPy from day one.  In some
>>> sense 0-d arrays *are* scalars at least conceptually and for JIT-producing
>>> systems that exist now and will be growing in the future, they can be
>>> equivalent to scalars.
>>>
>>> The array scalars should become how you define what is *in* a NumPy
>>> array making them true Python types, rather than Python 1-style "instances"
>>> of a single "Dtype" object.  You would then have 0-d arrays and these
>>> Python "memory" types describing what is *in* the array.
>>>
>>> There is a clear way to do this, some of which has been outlined by
>>> Nathaniel, and the rest I have an outline for how to implement.  I can
>>> advise someone on how to do this.
>>>
>>> -Travis
>>>
>>>
>>>
>>>
>>> On Thu, Oct 25, 2018 at 3:17 PM Alex Rogozhnikov <
>>> alex.rogozhnikov at yandex.ru> wrote:
>>>
>>> Dear numpy community,
>>>
>>> I'm planning to depreciate np.asfortranarray and np.ascontiguousarray
>>> functions due to their misbehavior on scalar (0-D tensors) with PR
>>> #12244.
>>>
>>> Current behavior (converting scalars to 1-d array with single element)
>>> - is unexpected and contradicts to documentation
>>> - probably, can't be changed without breaking external code
>>> - I believe, this was a cause for poor support of 0-d arrays in mxnet.
>>> - both functions are easily replaced with asarray(..., order='...'),
>>> which has expected behavior
>>>
>>> There is no timeline for removal - we just need to discourage from using
>>> this functions in new code.
>>>
>>> Function naming may be related to how numpy treats 0-d tensors
>>> specially,
>>> and those probably should not be called arrays.
>>> https://www.numpy.org/neps/nep-0027-zero-rank-arrarys.html
>>> However, as a user I never thought about 0-d arrays being special and
>>> being "not arrays".
>>>
>>>
>>> Please see original discussion at github for more details
>>> https://github.com/numpy/numpy/issues/5300
>>>
>>> Your comments welcome,
>>> Alex Rogozhnikov
>>>
>>> _______________________________________________
>>> NumPy-Discussion mailing list
>>> NumPy-Discussion at python.org
>>> https://mail.python.org/mailman/listinfo/numpy-discussion
>>>
>>> ,
>>>
>>> _______________________________________________
>>> NumPy-Discussion mailing list
>>> NumPy-Discussion at python.org
>>> https://mail.python.org/mailman/listinfo/numpy-discussion
>>>
>>> _______________________________________________
>>> NumPy-Discussion mailing list
>>> NumPy-Discussion at python.org
>>> https://mail.python.org/mailman/listinfo/numpy-discussion
>>>
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at python.org
>> https://mail.python.org/mailman/listinfo/numpy-discussion
>>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181027/17bd3e9d/attachment-0001.html>

From einstein.edison at gmail.com  Sat Oct 27 04:29:01 2018
From: einstein.edison at gmail.com (Hameer Abbasi)
Date: Sat, 27 Oct 2018 10:29:01 +0200
Subject: [Numpy-discussion] Reminder: weekly status meeting
In-Reply-To: <20181026221028.3r62dbsqpzzcvjj6@carbo>
References: <e904c0a4-c086-74af-c371-53d6cd5cbb1c@gmail.com>
 <20181024220750.pytz7dav4dabeplx@carbo>
 <1f0af61b-62f6-4f98-a5e7-6241855a7006@Canary>
 <CAHPuU_brG1K2ARwV=fE4e8pLmQsN52kb02ii6bcQSiBL0vaz1g@mail.gmail.com>
 <164a3678-0838-4bb7-84ed-92e1a249f875@Canary>
 <20181026170314.wqvwwc4ncudz5dzo@carbo>
 <f473d66a-7c73-45c8-9e20-9feebe5d832e@Canary>
 <CABL7CQiZjTT6mc2KCUa3qYKxj8U7Nr5DmOMw6Wtq9Zb=C9Axsw@mail.gmail.com>
 <20181026221028.3r62dbsqpzzcvjj6@carbo>
Message-ID: <3df79185-ae85-4973-b0f2-78ae8e76acbb@Canary>


> On Saturday, Oct 27, 2018 at 12:10 AM, Stefan van der Walt <stefanv at berkeley.edu (mailto:stefanv at berkeley.edu)> wrote:
> On Sat, 27 Oct 2018 10:27:49 +1300, Ralf Gommers wrote:
> > Just to make sure we're talking about the same things here: Stefan, I think
> > with "sparray" you mean "an n-D sparse array implementation that lives in
> > SciPy", nothing more specific? In that case pydata/sparse is the one
> > implementation, and including it in scipy.sparse would make it "sparray".
> > I'm currently indeed leaning towards depending on pydata/sparse rather than
> > including it in scipy.
>
> I want to double check: when we last spoke, it seemed as though certain
> refactorings inside of SciPy (specifically, sparray was mentioned) would
> simplify the life of pydata/sparse devs. That no longer seems to be the
> case?

Hi! I can?t recall having said this, perhaps you inferred it from the docs (it?s on the front page, so that isn?t unreasonable). We should update that sometime.

That said, we use very little of scipy.sparse in PyData/Sparse. When Matt Rocklin was maintaining the project, that was the case, but even in the later days he shifted much of his code to pure NumPy. I followed that path further, not out of unwillingness to depend on it, but out of desire for generality.

In its current state, the only things in PyData/Sparse that depend on scipy.sparse are:
Conversion to/from scipy.sparse spmatrix classes
A bit of linear algebra i.e. dot, tensordot, matmul.

Best Regards,
Hameer Abbasi

>
> If our recommended route is to tell users to use pydata/sparse instead
> of SciPy (for the sparse array object), we probably want to get rid of
> our own internal implementation, and deprecate spmatrix (or, build
> spmatrix on top of pydata/sparse)?
>
> Once we can define a clear API for sparse arrays, we can include some
> algorithms that ingest those objects in SciPy. But, I'm not sure we
> have an API in place that will allow handover of such objects to the
> existing C/FORTRAN-level code.
>
> St?fan
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181027/4903af6a/attachment.html>

From einstein.edison at gmail.com  Sat Oct 27 04:34:42 2018
From: einstein.edison at gmail.com (Hameer Abbasi)
Date: Sat, 27 Oct 2018 10:34:42 +0200
Subject: [Numpy-discussion] Reminder: weekly status meeting
In-Reply-To: <CABL7CQg+JttTbt=9RB8wjyjvw=R0Pje3xquUozcOULx2qmig8Q@mail.gmail.com>
References: <e904c0a4-c086-74af-c371-53d6cd5cbb1c@gmail.com>
 <20181024220750.pytz7dav4dabeplx@carbo>
 <1f0af61b-62f6-4f98-a5e7-6241855a7006@Canary>
 <CAHPuU_brG1K2ARwV=fE4e8pLmQsN52kb02ii6bcQSiBL0vaz1g@mail.gmail.com>
 <164a3678-0838-4bb7-84ed-92e1a249f875@Canary>
 <20181026170314.wqvwwc4ncudz5dzo@carbo>
 <f473d66a-7c73-45c8-9e20-9feebe5d832e@Canary>
 <CABL7CQiZjTT6mc2KCUa3qYKxj8U7Nr5DmOMw6Wtq9Zb=C9Axsw@mail.gmail.com>
 <20181026221028.3r62dbsqpzzcvjj6@carbo>
 <CABL7CQg+JttTbt=9RB8wjyjvw=R0Pje3xquUozcOULx2qmig8Q@mail.gmail.com>
Message-ID: <cf89f4ed-07b9-4f12-89d4-bb6adc90adac@Canary>


> On Saturday, Oct 27, 2018 at 6:11 AM, Ralf Gommers <ralf.gommers at gmail.com (mailto:ralf.gommers at gmail.com)> wrote:
>
>
> On Sat, Oct 27, 2018 at 11:10 AM Stefan van der Walt <stefanv at berkeley.edu (mailto:stefanv at berkeley.edu)> wrote:
> > On Sat, 27 Oct 2018 10:27:49 +1300, Ralf Gommers wrote:
> > > Just to make sure we're talking about the same things here: Stefan, I think
> > > with "sparray" you mean "an n-D sparse array implementation that lives in
> > > SciPy", nothing more specific? In that case pydata/sparse is the one
> > > implementation, and including it in scipy.sparse would make it "sparray".
> > > I'm currently indeed leaning towards depending on pydata/sparse rather than
> > > including it in scipy.
> >
> > I want to double check: when we last spoke, it seemed as though certain
> > refactorings inside of SciPy (specifically, sparray was mentioned) would
> > simplify the life of pydata/sparse devs. That no longer seems to be the
> > case?
>
> There's no such thing as `sparray` anywhere in SciPy. There's two inactive projects to create an n-D sparse array implementation, one of which is called sparray (https://github.com/perimosocordiae/sparray). And there's one very active project to do that same thing which is https://github.com/pydata/sparse
>
> >
> > If our recommended route is to tell users to use pydata/sparse instead
> > of SciPy (for the sparse array object), we probably want to get rid of
> > our own internal implementation, and deprecate spmatrix
>
> Doc-deprecate I think; the sparse matrix classes in SciPy are very heavily used, so it doesn't make sense to start emitting deprecation warnings for them. But at some point we'll want to point users to pydata/sparse for new code.
>
> > (or, build
> > spmatrix on top of pydata/sparse)?
>
> It's the matrix vs. array semantics that are the issue, so not sure that building one on top of the other would be useful.
>
> >
> > Once we can define a clear API for sparse arrays, we can include some
> > algorithms that ingest those objects in SciPy. But, I'm not sure we
> > have an API in place that will allow handover of such objects to the
> > existing C/FORTRAN-level code.
>
> I don't think the constructors for sparse matrix/array care about C/F order. pydata/sparse is pure Python (and uses Numba). For reusing scipy.sparse.linalg and scipy.sparse.csgraph you're right I think that that will need some careful design work. Not sure anyone has thought about that in a lot of detail yet.
>

They don?t yet. That is a planned feature, allowing an arbitrary permutation of input coordinates.
>
> There are interesting API questions probably, such as how to treat explicit zeros (that debate still isn't settled for the matrix classes IIRC).
>

Explicit zeros are easier now, just use a fill_value of NaN and work with zeros as usual.

Best Regards,
Hameer Abbasi

>
> And there's an interesting transition puzzle to figure out (which also includes np.matrix). At the moment the discussion on that is spread out over many mailing list threads and Github issues, at some point we'll need to summarize that. Probably around the time that the CSR/CSC replacement that Hameer mentioned is finished.
>
> Cheers,
> Ralf
>
>
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181027/fb0a52a0/attachment.html>

From sylvain.corlay at gmail.com  Sat Oct 27 11:16:56 2018
From: sylvain.corlay at gmail.com (Sylvain Corlay)
Date: Sat, 27 Oct 2018 17:16:56 +0200
Subject: [Numpy-discussion] Depreciating asfortranarray and
 ascontiguousarray
In-Reply-To: <CAFMmPGM0Rca8+VzH4ncjTBJckLjYy4oQcwZb73H0U=2CmVbpcg@mail.gmail.com>
References: <8741181540498592@myt3-c573aa6fc782.qloud-c.yandex.net>
 <CAFMmPGM0Rca8+VzH4ncjTBJckLjYy4oQcwZb73H0U=2CmVbpcg@mail.gmail.com>
Message-ID: <CAK=Phk6Vd7PSbKtHR42NJWOSz0f6S6gY--nnMvRBtQvc-ei4Ww@mail.gmail.com>

I would also argue against deprecating these functions that we are using
increasingly in several projects that I am involved in.

On Sat, Oct 27, 2018, 01:28 Travis Oliphant <teoliphant at gmail.com> wrote:

> What is the justification for deprecation exactly?  These functions have
> been well documented and have had the intended behavior of producing arrays
> with dimension at least 1 for some time.  Why is it unexpected to produce
> arrays of at least 1 dimension?  For some users this is exactly what is
> wanted.  I don't understand the statement that behavior with 0-d arrays is
> unexpected.
>
> If the desire is to shrink the API of NumPy, I could see that.   But, it
> seems odd to me to remove a much-used function with an established behavior
> except as part of a wider API-shrinkage effort.
>
> 0-d arrays in NumPy are a separate conversation.  At this point, I think
> it was a mistake not to embrace 0-d arrays in NumPy from day one.  In some
> sense 0-d arrays *are* scalars at least conceptually and for JIT-producing
> systems that exist now and will be growing in the future, they can be
> equivalent to scalars.
>
> The array scalars should become how you define what is *in* a NumPy array
> making them true Python types, rather than Python 1-style "instances" of a
> single "Dtype" object.  You would then have 0-d arrays and these Python
> "memory" types describing what is *in* the array.
>
> There is a clear way to do this, some of which has been outlined by
> Nathaniel, and the rest I have an outline for how to implement.  I can
> advise someone on how to do this.
>
> -Travis
>
>
>
>
> On Thu, Oct 25, 2018 at 3:17 PM Alex Rogozhnikov <
> alex.rogozhnikov at yandex.ru> wrote:
>
>> Dear numpy community,
>>
>> I'm planning to depreciate np.asfortranarray and np.ascontiguousarray
>> functions due to their misbehavior on scalar (0-D tensors) with PR #12244
>> .
>>
>> Current behavior (converting scalars to 1-d array with single element)
>> - is unexpected and contradicts to documentation
>> - probably, can't be changed without breaking external code
>> - I believe, this was a cause for poor support of 0-d arrays in mxnet.
>> - both functions are easily replaced with asarray(..., order='...'),
>> which has expected behavior
>>
>> There is no timeline for removal - we just need to discourage from using
>> this functions in new code.
>>
>> Function naming may be related to how numpy treats 0-d tensors specially,
>>
>> and those probably should not be called arrays.
>> https://www.numpy.org/neps/nep-0027-zero-rank-arrarys.html
>> However, as a user I never thought about 0-d arrays being special and
>> being "not arrays".
>>
>>
>> Please see original discussion at github for more details
>> https://github.com/numpy/numpy/issues/5300
>>
>> Your comments welcome,
>> Alex Rogozhnikov
>>
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at python.org
>> https://mail.python.org/mailman/listinfo/numpy-discussion
>>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181027/a156b279/attachment-0001.html>

From teoliphant at gmail.com  Sat Oct 27 12:00:58 2018
From: teoliphant at gmail.com (Travis Oliphant)
Date: Sat, 27 Oct 2018 11:00:58 -0500
Subject: [Numpy-discussion] Depreciating asfortranarray and
 ascontiguousarray
In-Reply-To: <CAL1kJvBBpWXosi0RSBLb0dEuTqCdP-rct=n+R=u27fX3nk8=6Q@mail.gmail.com>
References: <8741181540498592@myt3-c573aa6fc782.qloud-c.yandex.net>
 <CAFMmPGM0Rca8+VzH4ncjTBJckLjYy4oQcwZb73H0U=2CmVbpcg@mail.gmail.com>
 <14021211540598803@sas1-890ba5c2334a.qloud-c.yandex.net>
 <CAFMmPGNhUUcZqYXKqRVqKqe8LWaZW0+UYvPZcL6xqFBwace1oQ@mail.gmail.com>
 <CAL1kJvBBpWXosi0RSBLb0dEuTqCdP-rct=n+R=u27fX3nk8=6Q@mail.gmail.com>
Message-ID: <CAFMmPGOTD+gkrVM2JDEAbF91mc7ogjNPuacLGtb+gYp0rtbEMQ@mail.gmail.com>

I agree with Number 2 and 4.

On Sat, Oct 27, 2018 at 12:38 AM Eric Wieser <wieser.eric+numpy at gmail.com>
wrote:

> in order to be used prior to calling C or Fortran code that expected at
> least a 1-d array
>
> I?d argue that the behavior for these functions should have just been to
> raise an error saying ?this function does not support 0d arrays?, rather
> than silently inserting extra dimensions. As a bonus, that would push the
> function developers to add support for 0d. Obviously we can?t make it do
> that now, but what we can do is have it emit a warning in those cases.
>
> I think our options are:
>
>    1. Deprecate the entire function
>    2. Deprecate and eventually(?) throw an error upon calling the
>    function on 0d arrays, with a message like *?in future using
>    ascontiguousarray to promote 0d arrays to 1d arrays will not be supported.
>    If promotion is intentional, use ascontiguousarray(atleast1d(x)) to silence
>    this warning and keep the old behavior, and if not use asarray(x,
>    order='C') to preserve 0d arrays?*
>    3. Deprecate (future-warning) when passed 0d arrays, and eventually
>    skip the upcast to 1d.
>    If the calling code really needed a 1d array, then it will probably
>    fail, which is not really different to 2, but has the advantage that the
>    names are less surprising.
>    4. Only improve the documentation
>
> My preference would be 3
>
> Eric
>
> On Fri, 26 Oct 2018 at 17:35 Travis Oliphant <teoliphant at gmail.com> wrote:
>
> On Fri, Oct 26, 2018 at 7:14 PM Alex Rogozhnikov <
>> alex.rogozhnikov at yandex.ru> wrote:
>>
>>> > If the desire is to shrink the API of NumPy, I could see that.
>>>
>>> Very good desire, but my goal was different.
>>>
>>
>>> > For some users this is exactly what is wanted.
>>>
>>> Maybe so, but I didn't face such example (and nobody mentioned those so
>>> far in the discussion).
>>> The opposite (according to the issue) happened. Mxnet example is
>>> sufficient in my opinion.
>>>
>>
>> I agree that the old motivation of APIs that would make it easy to create
>> SciPy is no longer a major motivation for most users and even developers
>> and so these reasons would not be very present (as well as why it wasn't
>> even mentioned in the documentation).
>>
>>
>>> Simple example:
>>> x = np.zeros([])
>>> assert(x.flags.c_contiguous)
>>> assert(np.ascontiguousarray(x).shape == x.shape)
>>>
>>> Behavior contradicts to documentation (shape is changed) and to name
>>> (flags are saying - it is already c_contiguous)
>>>
>>> If you insist, that keeping ndmin=1 is important (I am not yet
>>> convinced, but I am ready to believe your autority),
>>> we can add ndmin=1 to functions' signatures, this way explicitly
>>> notifying users about expected dimension.
>>>
>>
>> I understand the lack of being convinced.  This is ultimately a problem
>> of 0-d arrays not being fully embraced and accepted by the Numeric
>> community originally (which NumPy inherited during the early days).   Is
>> there a way to document functions that will be removed on a major version
>> increase which don't print warnings on use? I would support this.
>>
>> I'm a big supporter of making a NumPy 2.0 and have been for several
>> years. Now that Python 3 transition has happened, I think we could
>> seriously discuss this.  I'm trying to raise funding for maintenance and
>> progress for NumPy and SciPy right now via Quansight Labs
>> http://www.quansight.com/labs and I hope to be able to help find grants
>> to support the wonderful efforts that have been happening for some time.
>>
>> While I'm thrilled and impressed by the number of amazing devs who have
>> kept NumPy and SciPy going in mostly their spare time, it has created
>> challenges that we have not had continuous maintenance funding to allow
>> continuous paid development so that several people who know about the early
>> decisions could not be retained to spend time on helping the transition.
>>
>> Your bringing the problem of mxnet devs is most appreciated.  I will make
>> a documentation PR.
>>
>> -Travis
>>
>>
>>
>>
>>> Alex.
>>>
>>>
>>> 27.10.2018, 02:27, "Travis Oliphant" <teoliphant at gmail.com>:
>>>
>>> What is the justification for deprecation exactly?  These functions have
>>> been well documented and have had the intended behavior of producing arrays
>>> with dimension at least 1 for some time.  Why is it unexpected to produce
>>> arrays of at least 1 dimension?  For some users this is exactly what is
>>> wanted.  I don't understand the statement that behavior with 0-d arrays is
>>> unexpected.
>>>
>>> If the desire is to shrink the API of NumPy, I could see that.   But, it
>>> seems odd to me to remove a much-used function with an established behavior
>>> except as part of a wider API-shrinkage effort.
>>>
>>> 0-d arrays in NumPy are a separate conversation.  At this point, I think
>>> it was a mistake not to embrace 0-d arrays in NumPy from day one.  In some
>>> sense 0-d arrays *are* scalars at least conceptually and for JIT-producing
>>> systems that exist now and will be growing in the future, they can be
>>> equivalent to scalars.
>>>
>>> The array scalars should become how you define what is *in* a NumPy
>>> array making them true Python types, rather than Python 1-style "instances"
>>> of a single "Dtype" object.  You would then have 0-d arrays and these
>>> Python "memory" types describing what is *in* the array.
>>>
>>> There is a clear way to do this, some of which has been outlined by
>>> Nathaniel, and the rest I have an outline for how to implement.  I can
>>> advise someone on how to do this.
>>>
>>> -Travis
>>>
>>>
>>>
>>>
>>> On Thu, Oct 25, 2018 at 3:17 PM Alex Rogozhnikov <
>>> alex.rogozhnikov at yandex.ru> wrote:
>>>
>>> Dear numpy community,
>>>
>>> I'm planning to depreciate np.asfortranarray and np.ascontiguousarray
>>> functions due to their misbehavior on scalar (0-D tensors) with PR
>>> #12244.
>>>
>>> Current behavior (converting scalars to 1-d array with single element)
>>> - is unexpected and contradicts to documentation
>>> - probably, can't be changed without breaking external code
>>> - I believe, this was a cause for poor support of 0-d arrays in mxnet.
>>> - both functions are easily replaced with asarray(..., order='...'),
>>> which has expected behavior
>>>
>>> There is no timeline for removal - we just need to discourage from using
>>> this functions in new code.
>>>
>>> Function naming may be related to how numpy treats 0-d tensors
>>> specially,
>>> and those probably should not be called arrays.
>>> https://www.numpy.org/neps/nep-0027-zero-rank-arrarys.html
>>> However, as a user I never thought about 0-d arrays being special and
>>> being "not arrays".
>>>
>>>
>>> Please see original discussion at github for more details
>>> https://github.com/numpy/numpy/issues/5300
>>>
>>> Your comments welcome,
>>> Alex Rogozhnikov
>>>
>>> _______________________________________________
>>> NumPy-Discussion mailing list
>>> NumPy-Discussion at python.org
>>> https://mail.python.org/mailman/listinfo/numpy-discussion
>>>
>>> ,
>>>
>>> _______________________________________________
>>> NumPy-Discussion mailing list
>>> NumPy-Discussion at python.org
>>> https://mail.python.org/mailman/listinfo/numpy-discussion
>>>
>>> _______________________________________________
>>> NumPy-Discussion mailing list
>>> NumPy-Discussion at python.org
>>> https://mail.python.org/mailman/listinfo/numpy-discussion
>>>
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at python.org
>> https://mail.python.org/mailman/listinfo/numpy-discussion
>>
> ?
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181027/38ea2d72/attachment-0001.html>

From chris.barker at noaa.gov  Mon Oct 29 19:30:44 2018
From: chris.barker at noaa.gov (Chris Barker)
Date: Mon, 29 Oct 2018 16:30:44 -0700
Subject: [Numpy-discussion] asanyarray vs. asarray
In-Reply-To: <CAFMmPGNbhJu1vrREnugi54gWHK9MPqFenRMsnscau=tjwVn8Hw@mail.gmail.com>
References: <a7e76515-95ab-e867-dc96-92c54d8b0952@gmail.com>
 <24100c7f-20fd-4eed-99b0-d37660f52223@Canary>
 <CAJNV+9vH2o_2KCiMVc6S1ww=-NXpfGK39UoJVKKJNmv7pN602A@mail.gmail.com>
 <CAEQ_TvezgD7_KNB5PD_zjKQ=j+dN-k7BkEzNVVM9+VNAw8b85w@mail.gmail.com>
 <cb1364db-9e46-4614-a1df-d5dc33ad2c64@Canary>
 <CABL7CQgLqiTAO4H+3MUZzDbF3eR_njO4kH9UD1C_t27gnkGYYw@mail.gmail.com>
 <CABL7CQgXNt6ZGrDF8OJz73PDGJmbvaKs6q-3ArFkeLF86bpSEg@mail.gmail.com>
 <CAJNV+9uXs9at6RYyymQcoGpP4cvr+waVitHWXnsy4MciE+6RnA@mail.gmail.com>
 <CAFMmPGNbhJu1vrREnugi54gWHK9MPqFenRMsnscau=tjwVn8Hw@mail.gmail.com>
Message-ID: <CALGmxEJCMfm7J8BWbHiXG0LTJxvk=5vp9X=42psY7LiA3uL8fA@mail.gmail.com>

On Fri, Oct 26, 2018 at 7:12 PM, Travis Oliphant <teoliphant at gmail.com>
wrote:


>  agree that we can stop bashing subclasses in general.   The problem with
> numpy subclasses is that they were made without adherence to SOLID:
> https://en.wikipedia.org/wiki/SOLID.  In particular the Liskov
> substitution principle:  https://en.wikipedia.org/wiki/
> Liskov_substitution_principle .
>

...


> did not properly apply them in creating np.matrix which clearly violates
> the substitution principle.
>

So -- could a matrix subclass be made "properly"? or is that an example of
something that should not have been a subclass?

-CHB


-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181029/9f09d9cf/attachment.html>

From ralf.gommers at gmail.com  Mon Oct 29 23:54:17 2018
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Mon, 29 Oct 2018 20:54:17 -0700
Subject: [Numpy-discussion] asanyarray vs. asarray
In-Reply-To: <CALGmxEJCMfm7J8BWbHiXG0LTJxvk=5vp9X=42psY7LiA3uL8fA@mail.gmail.com>
References: <a7e76515-95ab-e867-dc96-92c54d8b0952@gmail.com>
 <24100c7f-20fd-4eed-99b0-d37660f52223@Canary>
 <CAJNV+9vH2o_2KCiMVc6S1ww=-NXpfGK39UoJVKKJNmv7pN602A@mail.gmail.com>
 <CAEQ_TvezgD7_KNB5PD_zjKQ=j+dN-k7BkEzNVVM9+VNAw8b85w@mail.gmail.com>
 <cb1364db-9e46-4614-a1df-d5dc33ad2c64@Canary>
 <CABL7CQgLqiTAO4H+3MUZzDbF3eR_njO4kH9UD1C_t27gnkGYYw@mail.gmail.com>
 <CABL7CQgXNt6ZGrDF8OJz73PDGJmbvaKs6q-3ArFkeLF86bpSEg@mail.gmail.com>
 <CAJNV+9uXs9at6RYyymQcoGpP4cvr+waVitHWXnsy4MciE+6RnA@mail.gmail.com>
 <CAFMmPGNbhJu1vrREnugi54gWHK9MPqFenRMsnscau=tjwVn8Hw@mail.gmail.com>
 <CALGmxEJCMfm7J8BWbHiXG0LTJxvk=5vp9X=42psY7LiA3uL8fA@mail.gmail.com>
Message-ID: <CABL7CQhn45aM9rUYedLAOzFMb2wRLRJMbwE8taWEuLrqSGpmCQ@mail.gmail.com>

On Mon, Oct 29, 2018 at 4:31 PM Chris Barker <chris.barker at noaa.gov> wrote:

> On Fri, Oct 26, 2018 at 7:12 PM, Travis Oliphant <teoliphant at gmail.com>
> wrote:
>
>
>>  agree that we can stop bashing subclasses in general.   The problem with
>> numpy subclasses is that they were made without adherence to SOLID:
>> https://en.wikipedia.org/wiki/SOLID.  In particular the Liskov
>> substitution principle:
>> https://en.wikipedia.org/wiki/Liskov_substitution_principle .
>>
>
> ...
>
>
>> did not properly apply them in creating np.matrix which clearly violates
>> the substitution principle.
>>
>
> So -- could a matrix subclass be made "properly"? or is that an example of
> something that should not have been a subclass?
>

The latter - changing the behavior of multiplication breaks the principle.

Ralf
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181029/7f0a816d/attachment.html>

From wieser.eric+numpy at gmail.com  Tue Oct 30 00:47:54 2018
From: wieser.eric+numpy at gmail.com (Eric Wieser)
Date: Mon, 29 Oct 2018 21:47:54 -0700
Subject: [Numpy-discussion] asanyarray vs. asarray
In-Reply-To: <CABL7CQhn45aM9rUYedLAOzFMb2wRLRJMbwE8taWEuLrqSGpmCQ@mail.gmail.com>
References: <a7e76515-95ab-e867-dc96-92c54d8b0952@gmail.com>
 <24100c7f-20fd-4eed-99b0-d37660f52223@Canary>
 <CAJNV+9vH2o_2KCiMVc6S1ww=-NXpfGK39UoJVKKJNmv7pN602A@mail.gmail.com>
 <CAEQ_TvezgD7_KNB5PD_zjKQ=j+dN-k7BkEzNVVM9+VNAw8b85w@mail.gmail.com>
 <cb1364db-9e46-4614-a1df-d5dc33ad2c64@Canary>
 <CABL7CQgLqiTAO4H+3MUZzDbF3eR_njO4kH9UD1C_t27gnkGYYw@mail.gmail.com>
 <CABL7CQgXNt6ZGrDF8OJz73PDGJmbvaKs6q-3ArFkeLF86bpSEg@mail.gmail.com>
 <CAJNV+9uXs9at6RYyymQcoGpP4cvr+waVitHWXnsy4MciE+6RnA@mail.gmail.com>
 <CAFMmPGNbhJu1vrREnugi54gWHK9MPqFenRMsnscau=tjwVn8Hw@mail.gmail.com>
 <CALGmxEJCMfm7J8BWbHiXG0LTJxvk=5vp9X=42psY7LiA3uL8fA@mail.gmail.com>
 <CABL7CQhn45aM9rUYedLAOzFMb2wRLRJMbwE8taWEuLrqSGpmCQ@mail.gmail.com>
Message-ID: <CAL1kJvC4Py+ikqp6bxyx-_qm9aD_mEhsCFshCR4QXWZt1YGMNg@mail.gmail.com>

The latter - changing the behavior of multiplication breaks the principle.

But this is not the main reason for deprecating matrix - almost all of the
problems I?ve seen have been caused by the way that matrices behave when
sliced. The way that m[i][j] and m[i,j] are different is just one example
of this, the fact that they must be 2d is another.

Matrices behaving differently on multiplication isn?t super different in my
mind to how string arrays fail to multiply at all.

Eric

On Mon, 29 Oct 2018 at 20:54 Ralf Gommers <ralf.gommers at gmail.com> wrote:

On Mon, Oct 29, 2018 at 4:31 PM Chris Barker <chris.barker at noaa.gov> wrote:
>
>> On Fri, Oct 26, 2018 at 7:12 PM, Travis Oliphant <teoliphant at gmail.com>
>> wrote:
>>
>>
>>>  agree that we can stop bashing subclasses in general.   The problem
>>> with numpy subclasses is that they were made without adherence to SOLID:
>>> https://en.wikipedia.org/wiki/SOLID.  In particular the Liskov
>>> substitution principle:
>>> https://en.wikipedia.org/wiki/Liskov_substitution_principle .
>>>
>>
>> ...
>>
>>
>>> did not properly apply them in creating np.matrix which clearly violates
>>> the substitution principle.
>>>
>>
>> So -- could a matrix subclass be made "properly"? or is that an example
>> of something that should not have been a subclass?
>>
>
> The latter - changing the behavior of multiplication breaks the principle.
>
> Ralf
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
?
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181029/cc86d207/attachment-0001.html>

From matti.picus at gmail.com  Tue Oct 30 05:04:04 2018
From: matti.picus at gmail.com (Matti Picus)
Date: Tue, 30 Oct 2018 11:04:04 +0200
Subject: [Numpy-discussion] Attribute hiding APIs for PyArrayObject
Message-ID: <b176f6fa-af8e-79b5-8908-450ab88801ff@gmail.com>

TL;DR - should we revert the attribute-hiding constructs in 
ndarraytypes.h and unify PyArrayObject_fields with PyArrayObject?


Background


NumPy 1.8 deprecated direct access to PyArrayObject fields. It made 
PyArrayObject "opaque", and hid the fields behind a PyArrayObject_fields 
structure 
https://github.com/numpy/numpy/blob/v1.15.3/numpy/core/include/numpy/ndarraytypes.h#L659 
with a comment about moving this to a private header. In order to access 
the fields, users are supposed to use PyArray_FIELDNAME functions, like 
PyArray_DATA and PyArray_NDIM. It seems there were thoughts at the time 
that numpy might move away from a C-struct based

underlying data structure. Other changes were also made to enum names, 
but those are relatively painless to find-and-replace.


NumPy has a mechanism to manage deprecating APIs, C users define 
NPY_NO_DEPRICATED_API to a desired level, say NPY_1_8_API_VERSION, and 
can then access the API "as if" they were using NumPy 1.8. Users who do 
not define NPY_NO_DEPRICATED_API get a warning when compiling, and 
default to the pre-1.8 API (aliasing of PyArrayObject to 
PyArrayObject_fields and direct access to the C struct fields). This is 
convenient for downstream users, both since the new API does not provide 
much added value, and it is much easier to write a->nd than 
PyArray_NDIM(a). For instance, pandas uses direct assignment to the data 
field for fast json parsing 
https://github.com/pandas-dev/pandas/blob/master/pandas/_libs/src/ujson/python/JSONtoObj.c#L203 
via chunks. Working around the new API in pandas would require more 
engineering. Also, for example, cython has a mechanism to transpile 
python code into C, mapping slow python attribute lookup to fast C 
struct field access 
https://cython.readthedocs.io/en/latest/src/userguide/extension_types.html#external-extension-types


In a parallel but not really related universe, cython recently upgraded 
the object mapping so that we can quiet the annoying "size changed" 
runtime warning https://github.com/numpy/numpy/issues/11788 without 
requiring warning filters, but that requires updating the numpy.pxd file 
provided with cython, and it was proposed that NumPy actually vendor its 
own file rather than depending on the cython one 
(https://github.com/numpy/numpy/issues/11803).


The problem


We have now made further changes to our API. In NumPy 1.14 we changed 
UPDATEIFCOPY to WRITEBACKIFCOPY, and in 1.16 we would like to deprecate 
PyArray_SetNumericOps and PyArray_GetNumericOps. The strange warning 
when NPY_NO_DEPRICATED_API is annoying. The new API cannot be supported 
by cython without some deep surgery 
(https://github.com/cython/cython/pull/2640). When I tried dogfooding an 
updated numpy.pxd for the only cython code in NumPy, mtrand.pxy, I came 
across some of these issues (https://github.com/numpy/numpy/pull/12284). 
Forcing the new API will require downstream users to refactor code or 
re-engineer constructs, as in the pandas example above.


The question


Is the attribute-hiding effort worth it? Should we give up, revert the 
PyArrayObject/PyArrayObject_fields division and allow direct access from 
C to the numpy internals? Is there another path forward that is less 
painful?


Matti


From alex.rogozhnikov at yandex.ru  Tue Oct 30 05:57:34 2018
From: alex.rogozhnikov at yandex.ru (Alex Rogozhnikov)
Date: Tue, 30 Oct 2018 12:57:34 +0300
Subject: [Numpy-discussion] einops 0.1
Message-ID: <859711540893454@iva3-294f9af87d55.qloud-c.yandex.net>

An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181030/0047f3d3/attachment.html>

From ewm at redtetrahedron.org  Tue Oct 30 08:14:38 2018
From: ewm at redtetrahedron.org (Eric Moore)
Date: Tue, 30 Oct 2018 08:14:38 -0400
Subject: [Numpy-discussion] asanyarray vs. asarray
In-Reply-To: <CAL1kJvC4Py+ikqp6bxyx-_qm9aD_mEhsCFshCR4QXWZt1YGMNg@mail.gmail.com>
References: <a7e76515-95ab-e867-dc96-92c54d8b0952@gmail.com>
 <24100c7f-20fd-4eed-99b0-d37660f52223@Canary>
 <CAJNV+9vH2o_2KCiMVc6S1ww=-NXpfGK39UoJVKKJNmv7pN602A@mail.gmail.com>
 <CAEQ_TvezgD7_KNB5PD_zjKQ=j+dN-k7BkEzNVVM9+VNAw8b85w@mail.gmail.com>
 <cb1364db-9e46-4614-a1df-d5dc33ad2c64@Canary>
 <CABL7CQgLqiTAO4H+3MUZzDbF3eR_njO4kH9UD1C_t27gnkGYYw@mail.gmail.com>
 <CABL7CQgXNt6ZGrDF8OJz73PDGJmbvaKs6q-3ArFkeLF86bpSEg@mail.gmail.com>
 <CAJNV+9uXs9at6RYyymQcoGpP4cvr+waVitHWXnsy4MciE+6RnA@mail.gmail.com>
 <CAFMmPGNbhJu1vrREnugi54gWHK9MPqFenRMsnscau=tjwVn8Hw@mail.gmail.com>
 <CALGmxEJCMfm7J8BWbHiXG0LTJxvk=5vp9X=42psY7LiA3uL8fA@mail.gmail.com>
 <CABL7CQhn45aM9rUYedLAOzFMb2wRLRJMbwE8taWEuLrqSGpmCQ@mail.gmail.com>
 <CAL1kJvC4Py+ikqp6bxyx-_qm9aD_mEhsCFshCR4QXWZt1YGMNg@mail.gmail.com>
Message-ID: <CAGeA38kJAq2meHigUJxo0ba2O8ZEmJTPx8C1m9pk9-RbwgGqGA@mail.gmail.com>

On Tue, Oct 30, 2018 at 12:49 AM Eric Wieser <wieser.eric+numpy at gmail.com>
wrote:

> The latter - changing the behavior of multiplication breaks the principle.
>
> But this is not the main reason for deprecating matrix - almost all of the
> problems I?ve seen have been caused by the way that matrices behave when
> sliced. The way that m[i][j] and m[i,j] are different is just one example
> of this, the fact that they must be 2d is another.
>
> Matrices behaving differently on multiplication isn?t super different in
> my mind to how string arrays fail to multiply at all.
>

The difference is that string arrays are not numeric.  This is an issue
since people want to pass a matrix Into places that want to multiple
element wise but that then breaks that code unless special provisions are
taken.  Numerical codes don?t work on string arrays anyway.

Eric


Eric
>
> On Mon, 29 Oct 2018 at 20:54 Ralf Gommers <ralf.gommers at gmail.com> wrote:
>
> On Mon, Oct 29, 2018 at 4:31 PM Chris Barker <chris.barker at noaa.gov>
>> wrote:
>>
>>> On Fri, Oct 26, 2018 at 7:12 PM, Travis Oliphant <teoliphant at gmail.com>
>>> wrote:
>>>
>>>
>>>>  agree that we can stop bashing subclasses in general.   The problem
>>>> with numpy subclasses is that they were made without adherence to SOLID:
>>>> https://en.wikipedia.org/wiki/SOLID.  In particular the Liskov
>>>> substitution principle:
>>>> https://en.wikipedia.org/wiki/Liskov_substitution_principle .
>>>>
>>>
>>> ...
>>>
>>>
>>>> did not properly apply them in creating np.matrix which clearly
>>>> violates the substitution principle.
>>>>
>>>
>>> So -- could a matrix subclass be made "properly"? or is that an example
>>> of something that should not have been a subclass?
>>>
>>
>> The latter - changing the behavior of multiplication breaks the principle.
>>
>> Ralf
>>
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at python.org
>> https://mail.python.org/mailman/listinfo/numpy-discussion
>>
> ?
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181030/aa175a71/attachment-0001.html>

From sebastian at sipsolutions.net  Tue Oct 30 10:54:07 2018
From: sebastian at sipsolutions.net (Sebastian Berg)
Date: Tue, 30 Oct 2018 15:54:07 +0100
Subject: [Numpy-discussion] Depreciating asfortranarray and
 ascontiguousarray
In-Reply-To: <CABL7CQh3BdQLeQbDfe_Re7=xOy5wtb+xh774nBsU_7Adyzb1Zw@mail.gmail.com>
References: <8741181540498592@myt3-c573aa6fc782.qloud-c.yandex.net>
 <CAFMmPGM0Rca8+VzH4ncjTBJckLjYy4oQcwZb73H0U=2CmVbpcg@mail.gmail.com>
 <14021211540598803@sas1-890ba5c2334a.qloud-c.yandex.net>
 <CAFMmPGNhUUcZqYXKqRVqKqe8LWaZW0+UYvPZcL6xqFBwace1oQ@mail.gmail.com>
 <CAL1kJvBBpWXosi0RSBLb0dEuTqCdP-rct=n+R=u27fX3nk8=6Q@mail.gmail.com>
 <CABL7CQh3BdQLeQbDfe_Re7=xOy5wtb+xh774nBsU_7Adyzb1Zw@mail.gmail.com>
Message-ID: <4f2c7ca80346d277a3e8ccdb2e47ef967467033d.camel@sipsolutions.net>

On Sat, 2018-10-27 at 19:29 +1300, Ralf Gommers wrote:
> 
> 
> On Sat, Oct 27, 2018 at 6:37 PM Eric Wieser <
> wieser.eric+numpy at gmail.com> wrote:
> > > in order to be used prior to calling C or Fortran code that
> > > expected at least a 1-d array
> > > 
> > 
<snip>
> 
>  
> > > I'm a big supporter of making a NumPy 2.0 and have been for
> > > several years. Now that Python 3 transition has happened, I think
> > > we could seriously discuss this.
> 
> I think it's more helpful to discuss goals and concrete plans for
> those, rather than a "NumPy 2.0" label. The latter never worked in
> the past, and not just because of lack of time/funding - it just
> means different things to different people. We now have a good start
> on what our major goals are (http://www.numpy.org/neps/#roadmap),
> let's build on that.

I agree. I do think that we should not be scared of a major release.
But, I would rather see it as a step towards, for example, better
dtypes. Aiming for a large cleanup seems like it might be a can of
worms [0].

About the asfortranarray/ascontiguousarray thing. I am not sure I like
FutureWarnings in the edge cases, it seems likely they arise randomly
on functions where the devs may not even be aware of it.
I do not like spamming the the API, but if we cannot agree on a nice
way forward, maybe this is a point where creating new names is an
options:

 * ascorderarray/asforderarray
 * asccontiguousarray/asfcontigouousarray
 * np.asarray(..., order='C'), is somewhat the same I guess

not sure I like the names too much, but I think we could find new names
here.
And then putting warnings is IMO OK, if there is a an easy/nice enough
way to avoid them (sure we can start in documentation if it helps).
We can wait for the actual removal for very long and at least until the
next major release or so, I do not think it matters much as long as
visible deprecation warnings exist to push downstream into changing
habits/code, the maintenance burden is pretty much zero after all.

Discussing how to approach larger changes is important, but I doubt
that these particular functions are problematic enough!

- Sebastian


[0] Happy to be shown wrong, but I seriously fear that aiming too high
will hinder progress -- unless maybe there is some very good funding
and skilled devs, but even then it might be too ambitious? -- and I am
not even sure it is easier on downstream.


> 
> > >   I'm trying to raise funding for maintenance and progress for
> > > NumPy and SciPy right now via Quansight Labs 
> > > http://www.quansight.com/labs and I hope to be able to help find
> > > grants to support the wonderful efforts that have been happening
> > > for some time. 
> 
> The NumPy grant and having Tyler/Matti/Stefan at BIDS is a great
> start to funded development; more and more diverse funding sources
> would be awesome.

Yes, that is very cool news!

- Sebastian

> 
> Cheers,
> Ralf
> 
> > > While I'm thrilled and impressed by the number of amazing devs
> > > who have kept NumPy and SciPy going in mostly their spare time,
> > > it has created challenges that we have not had continuous
> > > maintenance funding to allow continuous paid development so that
> > > several people who know about the early decisions could not be
> > > retained to spend time on helping the transition.  
> > > 
> > > Your bringing the problem of mxnet devs is most appreciated.  I
> > > will make a documentation PR.
> > > 
> > > -Travis
> > > 
> > > 
> > > 
> > > >  
> > > > Alex.
> > > >  
> > > >  
> > > > 27.10.2018, 02:27, "Travis Oliphant" <teoliphant at gmail.com>:
> > > > > What is the justification for deprecation exactly?  These
> > > > > functions have been well documented and have had the intended
> > > > > behavior of producing arrays with dimension at least 1 for
> > > > > some time.  Why is it unexpected to produce arrays of at
> > > > > least 1 dimension?  For some users this is exactly what is
> > > > > wanted.  I don't understand the statement that behavior with
> > > > > 0-d arrays is unexpected.
> > > > >  
> > > > > If the desire is to shrink the API of NumPy, I could see
> > > > > that.   But, it seems odd to me to remove a much-used
> > > > > function with an established behavior except as part of a
> > > > > wider API-shrinkage effort. 
> > > > >  
> > > > > 0-d arrays in NumPy are a separate conversation.  At this
> > > > > point, I think it was a mistake not to embrace 0-d arrays in
> > > > > NumPy from day one.  In some sense 0-d arrays *are* scalars
> > > > > at least conceptually and for JIT-producing systems that
> > > > > exist now and will be growing in the future, they can be
> > > > > equivalent to scalars.  
> > > > >  
> > > > > The array scalars should become how you define what is *in* a
> > > > > NumPy array making them true Python types, rather than Python
> > > > > 1-style "instances" of a single "Dtype" object.  You would
> > > > > then have 0-d arrays and these Python "memory" types
> > > > > describing what is *in* the array.  
> > > > >  
> > > > > There is a clear way to do this, some of which has been
> > > > > outlined by Nathaniel, and the rest I have an outline for how
> > > > > to implement.  I can advise someone on how to do this.  
> > > > >  
> > > > > -Travis
> > > > >  
> > > > >  
> > > > >  
> > > > >  
> > > > > On Thu, Oct 25, 2018 at 3:17 PM Alex Rogozhnikov <
> > > > > alex.rogozhnikov at yandex.ru> wrote:
> > > > > > Dear numpy community,
> > > > > >  
> > > > > > I'm planning to depreciate np.asfortranarray and
> > > > > > np.ascontiguousarray
> > > > > > functions due to their misbehavior on scalar (0-D tensors)
> > > > > > with PR #12244.
> > > > > >  
> > > > > > Current behavior (converting scalars to 1-d array with
> > > > > > single element)
> > > > > > - is unexpected and contradicts to documentation
> > > > > > - probably, can't be changed without breaking external code
> > > > > > - I believe, this was a cause for poor support of 0-d
> > > > > > arrays in mxnet.
> > > > > > - both functions are easily replaced with asarray(...,
> > > > > > order='...'), which has expected behavior
> > > > > >  
> > > > > > There is no timeline for removal - we just need to
> > > > > > discourage from using this functions in new code.
> > > > > >  
> > > > > > Function naming may be related to how numpy treats 0-d
> > > > > > tensors specially,  
> > > > > > and those probably should not be called arrays.
> > > > > > https://www.numpy.org/neps/nep-0027-zero-rank-arrarys.html
> > > > > > However, as a user I never thought about 0-d arrays being
> > > > > > special and being "not arrays".
> > > > > >  
> > > > > >  
> > > > > > Please see original discussion at github for more details
> > > > > > https://github.com/numpy/numpy/issues/5300
> > > > > >  
> > > > > > Your comments welcome,
> > > > > > Alex Rogozhnikov
> > > > > >  
> > > > > > _______________________________________________
> > > > > > NumPy-Discussion mailing list
> > > > > > NumPy-Discussion at python.org
> > > > > > https://mail.python.org/mailman/listinfo/numpy-discussion
> > > > > 
> > > > > ,
> > > > > _______________________________________________
> > > > > NumPy-Discussion mailing list
> > > > > NumPy-Discussion at python.org
> > > > > https://mail.python.org/mailman/listinfo/numpy-discussion
> > > > _______________________________________________
> > > > NumPy-Discussion mailing list
> > > > NumPy-Discussion at python.org
> > > > https://mail.python.org/mailman/listinfo/numpy-discussion
> > > 
> > > _______________________________________________
> > > NumPy-Discussion mailing list
> > > NumPy-Discussion at python.org
> > > https://mail.python.org/mailman/listinfo/numpy-discussion
> > > 
> > 
> > _______________________________________________
> > NumPy-Discussion mailing list
> > NumPy-Discussion at python.org
> > https://mail.python.org/mailman/listinfo/numpy-discussion
> 
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: This is a digitally signed message part
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181030/380af5c5/attachment.sig>

From charlesr.harris at gmail.com  Tue Oct 30 13:35:04 2018
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Tue, 30 Oct 2018 11:35:04 -0600
Subject: [Numpy-discussion] NumPy 1.15.4 release
Message-ID: <CAB6mnxK_qi3V0Ur+-Z7-fcFGTVTWCyT30ZcZKFK6tey+NUpj5A@mail.gmail.com>

Hi All,

Just a heads up that I am planning on making a 1.15.4 release this coming
weekend. The only fixes planned at this point are

   - BUG: Fix fill value in masked array '==' and '!=' ops, #12257
   <https://github.com/numpy/numpy/pull/12257>
   - BUG: clear buffer_info_cache on scalar dealloc, #12249
   <https://github.com/numpy/numpy/pull/12249>

If there are other fixes that you think needed, please let me know.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181030/e574fa91/attachment.html>

From matti.picus at gmail.com  Tue Oct 30 15:15:50 2018
From: matti.picus at gmail.com (Matti Picus)
Date: Tue, 30 Oct 2018 21:15:50 +0200
Subject: [Numpy-discussion] Reminder: weekly status meeting 31.10 at 12:00
 pacific time
Message-ID: <1e00f85e-16f5-841a-05de-f1ea0dbd8941@gmail.com>

An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181030/1307559d/attachment.html>

From einstein.edison at gmail.com  Tue Oct 30 16:24:36 2018
From: einstein.edison at gmail.com (Hameer Abbasi)
Date: Tue, 30 Oct 2018 21:24:36 +0100
Subject: [Numpy-discussion] Reminder: weekly status meeting 31.10 at
 12:00 pacific time
In-Reply-To: <1e00f85e-16f5-841a-05de-f1ea0dbd8941@gmail.com>
References: <1e00f85e-16f5-841a-05de-f1ea0dbd8941@gmail.com>
Message-ID: <48dced8b-d66b-4b7b-8dc2-0e06916a448c@Canary>

Hello!

If I may make a suggestion, it might be nice to create a separate calendar and add people to it as needed for better management.

Best Regards,
Hameer Abbasi

> On Tuesday, Oct 30, 2018 at 8:16 PM, Matti Picus <matti.picus at gmail.com (mailto:matti.picus at gmail.com)> wrote:
>
> The draft agenda is at https://hackmd.io/D3I3CdO2T9ipZ2g5uAChcA?both.
>
>
> Everyone is invited to join.
>
>
>
>
>
>
> Matti, Tyler and Stefan
>
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181030/1088b99f/attachment.html>

From shoyer at gmail.com  Tue Oct 30 17:22:04 2018
From: shoyer at gmail.com (Stephan Hoyer)
Date: Tue, 30 Oct 2018 14:22:04 -0700
Subject: [Numpy-discussion] asanyarray vs. asarray
In-Reply-To: <CAL1kJvC4Py+ikqp6bxyx-_qm9aD_mEhsCFshCR4QXWZt1YGMNg@mail.gmail.com>
References: <a7e76515-95ab-e867-dc96-92c54d8b0952@gmail.com>
 <24100c7f-20fd-4eed-99b0-d37660f52223@Canary>
 <CAJNV+9vH2o_2KCiMVc6S1ww=-NXpfGK39UoJVKKJNmv7pN602A@mail.gmail.com>
 <CAEQ_TvezgD7_KNB5PD_zjKQ=j+dN-k7BkEzNVVM9+VNAw8b85w@mail.gmail.com>
 <cb1364db-9e46-4614-a1df-d5dc33ad2c64@Canary>
 <CABL7CQgLqiTAO4H+3MUZzDbF3eR_njO4kH9UD1C_t27gnkGYYw@mail.gmail.com>
 <CABL7CQgXNt6ZGrDF8OJz73PDGJmbvaKs6q-3ArFkeLF86bpSEg@mail.gmail.com>
 <CAJNV+9uXs9at6RYyymQcoGpP4cvr+waVitHWXnsy4MciE+6RnA@mail.gmail.com>
 <CAFMmPGNbhJu1vrREnugi54gWHK9MPqFenRMsnscau=tjwVn8Hw@mail.gmail.com>
 <CALGmxEJCMfm7J8BWbHiXG0LTJxvk=5vp9X=42psY7LiA3uL8fA@mail.gmail.com>
 <CABL7CQhn45aM9rUYedLAOzFMb2wRLRJMbwE8taWEuLrqSGpmCQ@mail.gmail.com>
 <CAL1kJvC4Py+ikqp6bxyx-_qm9aD_mEhsCFshCR4QXWZt1YGMNg@mail.gmail.com>
Message-ID: <CAEQ_TvdLV5KEn7XvKqc_1m2zNkE7FZR_LaXR7MutPS2ULC0iZQ@mail.gmail.com>

On Mon, Oct 29, 2018 at 9:49 PM Eric Wieser <wieser.eric+numpy at gmail.com>
wrote:

> The latter - changing the behavior of multiplication breaks the principle.
>
> But this is not the main reason for deprecating matrix - almost all of the
> problems I?ve seen have been caused by the way that matrices behave when
> sliced. The way that m[i][j] and m[i,j] are different is just one example
> of this, the fact that they must be 2d is another.
>
> Matrices behaving differently on multiplication isn?t super different in
> my mind to how string arrays fail to multiply at all.
>
> Eric
>
It's certainly fine for arithmetic to work differently on an element-wise
basis or even to error. But np.matrix changes the shape of results from
various ndarray operations (e.g., both multiplication and indexing), which
is more than any dtype can do.

The Liskov substitution principle (LSP) suggests that the set of reasonable
ndarray subclasses are exactly those that could also in principle
correspond to a new dtype. Of np.ndarray subclasses in wide-spread use, I
think only the various "array with units" types come close satisfying this
criteria. They only fall short insofar as they present a misleading dtype
(without unit information).

The main problem with subclassing for numpy.ndarray is that it guarantees
too much: a large set of operations/methods along with a specific memory
layout exposed as part of its public API. Worse, ndarray itself is a little
quirky (e.g., with indexing, and its handling of scalars vs. 0d arrays). In
practice, it's basically impossible to layer on complex behavior with these
exact semantics, so only extremely minimal ndarray subclasses don't violate
LSP.

Once we have more easily extended dtypes, I suspect most of the good use
cases for subclassing will have gone away.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181030/b9f7959b/attachment.html>

From chris.barker at noaa.gov  Tue Oct 30 17:44:48 2018
From: chris.barker at noaa.gov (Chris Barker)
Date: Tue, 30 Oct 2018 14:44:48 -0700
Subject: [Numpy-discussion] asanyarray vs. asarray
In-Reply-To: <CAEQ_TvdLV5KEn7XvKqc_1m2zNkE7FZR_LaXR7MutPS2ULC0iZQ@mail.gmail.com>
References: <a7e76515-95ab-e867-dc96-92c54d8b0952@gmail.com>
 <24100c7f-20fd-4eed-99b0-d37660f52223@Canary>
 <CAJNV+9vH2o_2KCiMVc6S1ww=-NXpfGK39UoJVKKJNmv7pN602A@mail.gmail.com>
 <CAEQ_TvezgD7_KNB5PD_zjKQ=j+dN-k7BkEzNVVM9+VNAw8b85w@mail.gmail.com>
 <cb1364db-9e46-4614-a1df-d5dc33ad2c64@Canary>
 <CABL7CQgLqiTAO4H+3MUZzDbF3eR_njO4kH9UD1C_t27gnkGYYw@mail.gmail.com>
 <CABL7CQgXNt6ZGrDF8OJz73PDGJmbvaKs6q-3ArFkeLF86bpSEg@mail.gmail.com>
 <CAJNV+9uXs9at6RYyymQcoGpP4cvr+waVitHWXnsy4MciE+6RnA@mail.gmail.com>
 <CAFMmPGNbhJu1vrREnugi54gWHK9MPqFenRMsnscau=tjwVn8Hw@mail.gmail.com>
 <CALGmxEJCMfm7J8BWbHiXG0LTJxvk=5vp9X=42psY7LiA3uL8fA@mail.gmail.com>
 <CABL7CQhn45aM9rUYedLAOzFMb2wRLRJMbwE8taWEuLrqSGpmCQ@mail.gmail.com>
 <CAL1kJvC4Py+ikqp6bxyx-_qm9aD_mEhsCFshCR4QXWZt1YGMNg@mail.gmail.com>
 <CAEQ_TvdLV5KEn7XvKqc_1m2zNkE7FZR_LaXR7MutPS2ULC0iZQ@mail.gmail.com>
Message-ID: <CALGmxEJ=vsPC6cRmaBC2FyKH-8whqCCNcj9Nzn89Cak=o3o9Og@mail.gmail.com>

On Tue, Oct 30, 2018 at 2:22 PM, Stephan Hoyer <shoyer at gmail.com> wrote:

> The Liskov substitution principle (LSP) suggests that the set of
> reasonable ndarray subclasses are exactly those that could also in
> principle correspond to a new dtype. Of np.ndarray subclasses in
> wide-spread use, I think only the various "array with units" types come
> close satisfying this criteria. They only fall short insofar as they
> present a misleading dtype (without unit information).
>

How about subclasses that only add functionality? My only use case of
subclassing is exactly that:

I have a "bounding box" object (probably could have been called a
rectangle) that is a subclass of ndarray, is always shape (2,2), and has
various methods for merging two such boxes, etc, adding a point, etc.

I did it that way, 'cause I had a lot of code already that simply used a
(2,2) array to represent a bounding box, and I wanted all that code to
still work.

I have had zero problems with it.

Maybe that's too trivial to be worth talking about, but this kind of use
case can be handy.

It is a bit awkward to write the code, though -- it would be nice to have a
cleaner API for this sort of subclassing (not that I have any idea how to
do that)

The main problem with subclassing for numpy.ndarray is that it guarantees
> too much: a large set of operations/methods along with a specific memory
> layout exposed as part of its public API.
>

This is a big deal -- we really have two concepts here:
 - a Python class (type) with certain behaviors in Python code
 - a wrapper around a strided memory block.

maybe it's possible to be clear about that distinction:

"Duck Arrays" are the Python API

Maybe a C-API object  would be useful, that shares the memory layout, but
could have completely different functionality at the Python level.

- CHB


-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181030/0ecf6be2/attachment-0001.html>

From allanhaldane at gmail.com  Tue Oct 30 18:16:49 2018
From: allanhaldane at gmail.com (Allan Haldane)
Date: Tue, 30 Oct 2018 18:16:49 -0400
Subject: [Numpy-discussion] Reminder: weekly status meeting 31.10 at
 12:00 pacific time
In-Reply-To: <1e00f85e-16f5-841a-05de-f1ea0dbd8941@gmail.com>
References: <1e00f85e-16f5-841a-05de-f1ea0dbd8941@gmail.com>
Message-ID: <f8d22376-f68c-c142-b565-c5d0e71eddb8@gmail.com>


I'll try to make it, but can't guarantee.


The last time there was discussion of the structured-array PRs which are
currently held up, and I sort of promised to have a writeup of the issues.

I put up a draft of that here:

https://gist.github.com/ahaldane/6cd44886efb449f9c8d5ea012747323b

Allan


On 10/30/18 3:15 PM, Matti Picus wrote:
> The draft agenda is at https://hackmd.io/D3I3CdO2T9ipZ2g5uAChcA?both.
> 
> Everyone is invited to join.
> 
> 
> Matti, Tyler and Stefan
> 
> 
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
> 


From stefanv at berkeley.edu  Tue Oct 30 18:52:06 2018
From: stefanv at berkeley.edu (Stefan van der Walt)
Date: Tue, 30 Oct 2018 15:52:06 -0700
Subject: [Numpy-discussion] Reminder: weekly status meeting 31.10 at
 12:00 pacific time
In-Reply-To: <48dced8b-d66b-4b7b-8dc2-0e06916a448c@Canary>
References: <1e00f85e-16f5-841a-05de-f1ea0dbd8941@gmail.com>
 <48dced8b-d66b-4b7b-8dc2-0e06916a448c@Canary>
Message-ID: <20181030225206.hx7nnpsoril532gp@carbo>

Hi Hameer,

On Tue, 30 Oct 2018 21:24:36 +0100, Hameer Abbasi wrote:
> If I may make a suggestion, it might be nice to create a separate
> calendar and add people to it as needed for better management.

Can you clarify what you want?  Do you mean we should not announce the
meeting agenda here, and instead only use a calendar?  Or would you like
a calendar link to subcribe to, that also contains a link to the meeting
notes?

Best regards,
St?fan

From einstein.edison at gmail.com  Tue Oct 30 18:56:17 2018
From: einstein.edison at gmail.com (Hameer Abbasi)
Date: Tue, 30 Oct 2018 23:56:17 +0100
Subject: [Numpy-discussion] Reminder: weekly status meeting 31.10 at
 12:00 pacific time
In-Reply-To: <20181030225206.hx7nnpsoril532gp@carbo>
References: <1e00f85e-16f5-841a-05de-f1ea0dbd8941@gmail.com>
 <48dced8b-d66b-4b7b-8dc2-0e06916a448c@Canary>
 <20181030225206.hx7nnpsoril532gp@carbo>
Message-ID: <b24be40b-eddb-4035-96fb-20da04a9c268@Canary>

Hi,

I meant we should have a calendar that?s possible to subscribe to, and in addition announce the agenda here, and that the calendar could contain a link to the meeting agenda.

Best Regards,
Hameer Abbasi

> On Tuesday, Oct 30, 2018 at 11:52 PM, Stefan van der Walt <stefanv at berkeley.edu (mailto:stefanv at berkeley.edu)> wrote:
> Hi Hameer,
>
> On Tue, 30 Oct 2018 21:24:36 +0100, Hameer Abbasi wrote:
> > If I may make a suggestion, it might be nice to create a separate
> > calendar and add people to it as needed for better management.
>
> Can you clarify what you want? Do you mean we should not announce the
> meeting agenda here, and instead only use a calendar? Or would you like
> a calendar link to subcribe to, that also contains a link to the meeting
> notes?
>
> Best regards,
> St?fan
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181030/87f6ac8b/attachment.html>

From harrigan.matthew at gmail.com  Tue Oct 30 19:35:30 2018
From: harrigan.matthew at gmail.com (Matthew Harrigan)
Date: Tue, 30 Oct 2018 19:35:30 -0400
Subject: [Numpy-discussion] asanyarray vs. asarray
In-Reply-To: <CAEQ_TvdLV5KEn7XvKqc_1m2zNkE7FZR_LaXR7MutPS2ULC0iZQ@mail.gmail.com>
References: <a7e76515-95ab-e867-dc96-92c54d8b0952@gmail.com>
 <24100c7f-20fd-4eed-99b0-d37660f52223@Canary>
 <CAJNV+9vH2o_2KCiMVc6S1ww=-NXpfGK39UoJVKKJNmv7pN602A@mail.gmail.com>
 <CAEQ_TvezgD7_KNB5PD_zjKQ=j+dN-k7BkEzNVVM9+VNAw8b85w@mail.gmail.com>
 <cb1364db-9e46-4614-a1df-d5dc33ad2c64@Canary>
 <CABL7CQgLqiTAO4H+3MUZzDbF3eR_njO4kH9UD1C_t27gnkGYYw@mail.gmail.com>
 <CABL7CQgXNt6ZGrDF8OJz73PDGJmbvaKs6q-3ArFkeLF86bpSEg@mail.gmail.com>
 <CAJNV+9uXs9at6RYyymQcoGpP4cvr+waVitHWXnsy4MciE+6RnA@mail.gmail.com>
 <CAFMmPGNbhJu1vrREnugi54gWHK9MPqFenRMsnscau=tjwVn8Hw@mail.gmail.com>
 <CALGmxEJCMfm7J8BWbHiXG0LTJxvk=5vp9X=42psY7LiA3uL8fA@mail.gmail.com>
 <CABL7CQhn45aM9rUYedLAOzFMb2wRLRJMbwE8taWEuLrqSGpmCQ@mail.gmail.com>
 <CAL1kJvC4Py+ikqp6bxyx-_qm9aD_mEhsCFshCR4QXWZt1YGMNg@mail.gmail.com>
 <CAEQ_TvdLV5KEn7XvKqc_1m2zNkE7FZR_LaXR7MutPS2ULC0iZQ@mail.gmail.com>
Message-ID: <CAOfRF=gbeyg_k6XhbWOqqtUqA0L0=fr1r5JAqKwG-09Rbv+mYw@mail.gmail.com>

Would the extended dtypes also violate the Liskov substitution principle?
In place operations which would mutate the dtype are one potential issue.
Would a single dtype for an array be sufficient, i.e. np.polynomial
coefficients?  Compared to ndarray subclasses, the memory layout issue goes
away, but there is still a large set of operations exposed as part of a
public API with various quirks.  I can imagine a new function "asunitless"
scattered around downstream projects.

On Tue, Oct 30, 2018 at 5:23 PM Stephan Hoyer <shoyer at gmail.com> wrote:

> On Mon, Oct 29, 2018 at 9:49 PM Eric Wieser <wieser.eric+numpy at gmail.com>
> wrote:
>
>> The latter - changing the behavior of multiplication breaks the principle.
>>
>> But this is not the main reason for deprecating matrix - almost all of
>> the problems I?ve seen have been caused by the way that matrices behave
>> when sliced. The way that m[i][j] and m[i,j] are different is just one
>> example of this, the fact that they must be 2d is another.
>>
>> Matrices behaving differently on multiplication isn?t super different in
>> my mind to how string arrays fail to multiply at all.
>>
>> Eric
>>
> It's certainly fine for arithmetic to work differently on an element-wise
> basis or even to error. But np.matrix changes the shape of results from
> various ndarray operations (e.g., both multiplication and indexing), which
> is more than any dtype can do.
>
> The Liskov substitution principle (LSP) suggests that the set of
> reasonable ndarray subclasses are exactly those that could also in
> principle correspond to a new dtype. Of np.ndarray subclasses in
> wide-spread use, I think only the various "array with units" types come
> close satisfying this criteria. They only fall short insofar as they
> present a misleading dtype (without unit information).
>
> The main problem with subclassing for numpy.ndarray is that it guarantees
> too much: a large set of operations/methods along with a specific memory
> layout exposed as part of its public API. Worse, ndarray itself is a little
> quirky (e.g., with indexing, and its handling of scalars vs. 0d arrays). In
> practice, it's basically impossible to layer on complex behavior with these
> exact semantics, so only extremely minimal ndarray subclasses don't violate
> LSP.
>
> Once we have more easily extended dtypes, I suspect most of the good use
> cases for subclassing will have gone away.
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181030/dbaab248/attachment-0001.html>

From stefanv at berkeley.edu  Tue Oct 30 20:07:09 2018
From: stefanv at berkeley.edu (Stefan van der Walt)
Date: Tue, 30 Oct 2018 17:07:09 -0700
Subject: [Numpy-discussion] Reminder: weekly status meeting 31.10 at
 12:00 pacific time
In-Reply-To: <b24be40b-eddb-4035-96fb-20da04a9c268@Canary>
References: <1e00f85e-16f5-841a-05de-f1ea0dbd8941@gmail.com>
 <48dced8b-d66b-4b7b-8dc2-0e06916a448c@Canary>
 <20181030225206.hx7nnpsoril532gp@carbo>
 <b24be40b-eddb-4035-96fb-20da04a9c268@Canary>
Message-ID: <20181031000709.k56otchs3bgzlqf4@carbo>

On Tue, 30 Oct 2018 23:56:17 +0100, Hameer Abbasi wrote:
> I meant we should have a calendar that?s possible to subscribe to, and
> in addition announce the agenda here, and that the calendar could
> contain a link to the meeting agenda.

Here you go:

https://calendar.google.com/calendar?cid=YmVya2VsZXkuZWR1X2lla2dwaWdtMjMyamJobGRzZmIyYzJqODFjQGdyb3VwLmNhbGVuZGFyLmdvb2dsZS5jb20

We'll also advertise this on the agenda and in future emails.

Best regards,
St?fan

From wieser.eric+numpy at gmail.com  Tue Oct 30 21:41:51 2018
From: wieser.eric+numpy at gmail.com (Eric Wieser)
Date: Tue, 30 Oct 2018 18:41:51 -0700
Subject: [Numpy-discussion] Attribute hiding APIs for PyArrayObject
In-Reply-To: <b176f6fa-af8e-79b5-8908-450ab88801ff@gmail.com>
References: <b176f6fa-af8e-79b5-8908-450ab88801ff@gmail.com>
Message-ID: <CAL1kJvCbarxGx=rYQTBCR1O9q5j08RKMg7KE-AZQ40+K+HnrAA@mail.gmail.com>

In NumPy 1.14 we changed UPDATEIFCOPY to WRITEBACKIFCOPY, and in 1.16 we
would like to deprecate PyArray_SetNumericOps and PyArray_GetNumericOps.
The strange warning when NPY_NO_DEPRICATED_API is annoying

I?m not sure I make the connection here between hidden fields and API
deprecation. You seem to be asking two vaguely related questions:

   1. Should we have deprecated field access in the first place
   2. Does our api deprecation mechanism need work

I think a more substantial problem statement is needed for 2, so I?m only
going to respond to 1 here.

Hiding fields seems to me to match the CPython model of things, where your
public api is PyArray<thing>_SomeGetter(thing).
If you look at the cpython source code
<https://github.com/python/cpython/blob/e0720cd/Include/tupleobject.h#L24-L34>,
they only expose the underlying struct fields if you don?t define
Py_LIMITED_API, ie if you as a consumer volunteer to be broken by upstream
changes in minor versions. People (like us) are willing to produce separate
builds for each python versions, so often do not define this.

We could add a similar PyArray_LIMITED_API that allows field access under a
similar guarantee - the question is, are many downstream consumers willing
to produce builds against multiple numpy versions? (especially if they also
do so against multiple python versions)

Also, for example, cython has a mechanism to transpile python code into C,
mapping slow python attribute lookup to fast C struct field access

How does this work for builtin types? Does cython deliberately not define
Py_LIMITED_API? Or are you just forced to use PyTuple_GetItem(t) if you
want the fast path.

Eric

On Tue, 30 Oct 2018 at 02:04 Matti Picus <matti.picus at gmail.com> wrote:

TL;DR - should we revert the attribute-hiding constructs in
> ndarraytypes.h and unify PyArrayObject_fields with PyArrayObject?
>
>
> Background
>
>
> NumPy 1.8 deprecated direct access to PyArrayObject fields. It made
> PyArrayObject "opaque", and hid the fields behind a PyArrayObject_fields
> structure
>
> https://github.com/numpy/numpy/blob/v1.15.3/numpy/core/include/numpy/ndarraytypes.h#L659
> with a comment about moving this to a private header. In order to access
> the fields, users are supposed to use PyArray_FIELDNAME functions, like
> PyArray_DATA and PyArray_NDIM. It seems there were thoughts at the time
> that numpy might move away from a C-struct based
>
> underlying data structure. Other changes were also made to enum names,
> but those are relatively painless to find-and-replace.
>
>
> NumPy has a mechanism to manage deprecating APIs, C users define
> NPY_NO_DEPRICATED_API to a desired level, say NPY_1_8_API_VERSION, and
> can then access the API "as if" they were using NumPy 1.8. Users who do
> not define NPY_NO_DEPRICATED_API get a warning when compiling, and
> default to the pre-1.8 API (aliasing of PyArrayObject to
> PyArrayObject_fields and direct access to the C struct fields). This is
> convenient for downstream users, both since the new API does not provide
> much added value, and it is much easier to write a->nd than
> PyArray_NDIM(a). For instance, pandas uses direct assignment to the data
> field for fast json parsing
>
> https://github.com/pandas-dev/pandas/blob/master/pandas/_libs/src/ujson/python/JSONtoObj.c#L203
> via chunks. Working around the new API in pandas would require more
> engineering. Also, for example, cython has a mechanism to transpile
> python code into C, mapping slow python attribute lookup to fast C
> struct field access
>
> https://cython.readthedocs.io/en/latest/src/userguide/extension_types.html#external-extension-types
>
>
> In a parallel but not really related universe, cython recently upgraded
> the object mapping so that we can quiet the annoying "size changed"
> runtime warning https://github.com/numpy/numpy/issues/11788 without
> requiring warning filters, but that requires updating the numpy.pxd file
> provided with cython, and it was proposed that NumPy actually vendor its
> own file rather than depending on the cython one
> (https://github.com/numpy/numpy/issues/11803).
>
>
> The problem
>
>
> We have now made further changes to our API. In NumPy 1.14 we changed
> UPDATEIFCOPY to WRITEBACKIFCOPY, and in 1.16 we would like to deprecate
> PyArray_SetNumericOps and PyArray_GetNumericOps. The strange warning
> when NPY_NO_DEPRICATED_API is annoying. The new API cannot be supported
> by cython without some deep surgery
> (https://github.com/cython/cython/pull/2640). When I tried dogfooding an
> updated numpy.pxd for the only cython code in NumPy, mtrand.pxy, I came
> across some of these issues (https://github.com/numpy/numpy/pull/12284).
> Forcing the new API will require downstream users to refactor code or
> re-engineer constructs, as in the pandas example above.
>
>
> The question
>
>
> Is the attribute-hiding effort worth it? Should we give up, revert the
> PyArrayObject/PyArrayObject_fields division and allow direct access from
> C to the numpy internals? Is there another path forward that is less
> painful?
>
>
> Matti
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
?
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181030/2c0521a8/attachment-0001.html>

From njs at pobox.com  Tue Oct 30 22:33:37 2018
From: njs at pobox.com (Nathaniel Smith)
Date: Tue, 30 Oct 2018 19:33:37 -0700
Subject: [Numpy-discussion] Attribute hiding APIs for PyArrayObject
In-Reply-To: <CAL1kJvCbarxGx=rYQTBCR1O9q5j08RKMg7KE-AZQ40+K+HnrAA@mail.gmail.com>
References: <b176f6fa-af8e-79b5-8908-450ab88801ff@gmail.com>
 <CAL1kJvCbarxGx=rYQTBCR1O9q5j08RKMg7KE-AZQ40+K+HnrAA@mail.gmail.com>
Message-ID: <CAPJVwBkOVDKthCBVy=VoLUk=J134r=YG7QBfarBzX7sGZaTDmQ@mail.gmail.com>

It's probably helpful to know that Py_LIMITED_API is a
kinda-experimental thing that was added in CPython 3.2 (see PEP 384)
and remains almost 100% unused. It has never been a popular or
influential thing (for better or worse).

-n

On Tue, Oct 30, 2018 at 6:41 PM, Eric Wieser
<wieser.eric+numpy at gmail.com> wrote:
> In NumPy 1.14 we changed UPDATEIFCOPY to WRITEBACKIFCOPY, and in 1.16 we
> would like to deprecate PyArray_SetNumericOps and PyArray_GetNumericOps. The
> strange warning when NPY_NO_DEPRICATED_API is annoying
>
> I?m not sure I make the connection here between hidden fields and API
> deprecation. You seem to be asking two vaguely related questions:
>
> Should we have deprecated field access in the first place
> Does our api deprecation mechanism need work
>
> I think a more substantial problem statement is needed for 2, so I?m only
> going to respond to 1 here.
>
> Hiding fields seems to me to match the CPython model of things, where your
> public api is PyArray<thing>_SomeGetter(thing).
> If you look at the cpython source code, they only expose the underlying
> struct fields if you don?t define Py_LIMITED_API, ie if you as a consumer
> volunteer to be broken by upstream changes in minor versions. People (like
> us) are willing to produce separate builds for each python versions, so
> often do not define this.
>
> We could add a similar PyArray_LIMITED_API that allows field access under a
> similar guarantee - the question is, are many downstream consumers willing
> to produce builds against multiple numpy versions? (especially if they also
> do so against multiple python versions)
>
> Also, for example, cython has a mechanism to transpile python code into C,
> mapping slow python attribute lookup to fast C struct field access
>
> How does this work for builtin types? Does cython deliberately not define
> Py_LIMITED_API? Or are you just forced to use PyTuple_GetItem(t) if you want
> the fast path.
>
> Eric
>
> On Tue, 30 Oct 2018 at 02:04 Matti Picus <matti.picus at gmail.com> wrote:
>>
>> TL;DR - should we revert the attribute-hiding constructs in
>> ndarraytypes.h and unify PyArrayObject_fields with PyArrayObject?
>>
>>
>> Background
>>
>>
>> NumPy 1.8 deprecated direct access to PyArrayObject fields. It made
>> PyArrayObject "opaque", and hid the fields behind a PyArrayObject_fields
>> structure
>>
>> https://github.com/numpy/numpy/blob/v1.15.3/numpy/core/include/numpy/ndarraytypes.h#L659
>> with a comment about moving this to a private header. In order to access
>> the fields, users are supposed to use PyArray_FIELDNAME functions, like
>> PyArray_DATA and PyArray_NDIM. It seems there were thoughts at the time
>> that numpy might move away from a C-struct based
>>
>> underlying data structure. Other changes were also made to enum names,
>> but those are relatively painless to find-and-replace.
>>
>>
>> NumPy has a mechanism to manage deprecating APIs, C users define
>> NPY_NO_DEPRICATED_API to a desired level, say NPY_1_8_API_VERSION, and
>> can then access the API "as if" they were using NumPy 1.8. Users who do
>> not define NPY_NO_DEPRICATED_API get a warning when compiling, and
>> default to the pre-1.8 API (aliasing of PyArrayObject to
>> PyArrayObject_fields and direct access to the C struct fields). This is
>> convenient for downstream users, both since the new API does not provide
>> much added value, and it is much easier to write a->nd than
>> PyArray_NDIM(a). For instance, pandas uses direct assignment to the data
>> field for fast json parsing
>>
>> https://github.com/pandas-dev/pandas/blob/master/pandas/_libs/src/ujson/python/JSONtoObj.c#L203
>> via chunks. Working around the new API in pandas would require more
>> engineering. Also, for example, cython has a mechanism to transpile
>> python code into C, mapping slow python attribute lookup to fast C
>> struct field access
>>
>> https://cython.readthedocs.io/en/latest/src/userguide/extension_types.html#external-extension-types
>>
>>
>> In a parallel but not really related universe, cython recently upgraded
>> the object mapping so that we can quiet the annoying "size changed"
>> runtime warning https://github.com/numpy/numpy/issues/11788 without
>> requiring warning filters, but that requires updating the numpy.pxd file
>> provided with cython, and it was proposed that NumPy actually vendor its
>> own file rather than depending on the cython one
>> (https://github.com/numpy/numpy/issues/11803).
>>
>>
>> The problem
>>
>>
>> We have now made further changes to our API. In NumPy 1.14 we changed
>> UPDATEIFCOPY to WRITEBACKIFCOPY, and in 1.16 we would like to deprecate
>> PyArray_SetNumericOps and PyArray_GetNumericOps. The strange warning
>> when NPY_NO_DEPRICATED_API is annoying. The new API cannot be supported
>> by cython without some deep surgery
>> (https://github.com/cython/cython/pull/2640). When I tried dogfooding an
>> updated numpy.pxd for the only cython code in NumPy, mtrand.pxy, I came
>> across some of these issues (https://github.com/numpy/numpy/pull/12284).
>> Forcing the new API will require downstream users to refactor code or
>> re-engineer constructs, as in the pandas example above.
>>
>>
>> The question
>>
>>
>> Is the attribute-hiding effort worth it? Should we give up, revert the
>> PyArrayObject/PyArrayObject_fields division and allow direct access from
>> C to the numpy internals? Is there another path forward that is less
>> painful?
>>
>>
>> Matti
>>
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at python.org
>> https://mail.python.org/mailman/listinfo/numpy-discussion
>
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>


-- 
Nathaniel J. Smith -- https://vorpus.org

From encukou at gmail.com  Wed Oct 31 04:46:03 2018
From: encukou at gmail.com (Petr Viktorin)
Date: Wed, 31 Oct 2018 09:46:03 +0100
Subject: [Numpy-discussion] Attribute hiding APIs for PyArrayObject
In-Reply-To: <CAPJVwBkOVDKthCBVy=VoLUk=J134r=YG7QBfarBzX7sGZaTDmQ@mail.gmail.com>
References: <b176f6fa-af8e-79b5-8908-450ab88801ff@gmail.com>
 <CAL1kJvCbarxGx=rYQTBCR1O9q5j08RKMg7KE-AZQ40+K+HnrAA@mail.gmail.com>
 <CAPJVwBkOVDKthCBVy=VoLUk=J134r=YG7QBfarBzX7sGZaTDmQ@mail.gmail.com>
Message-ID: <c813b71d-a0f4-87a2-67ba-b49fbb3abf59@gmail.com>

On 10/31/18 03:33, Nathaniel Smith wrote:
> It's probably helpful to know that Py_LIMITED_API is a
> kinda-experimental thing that was added in CPython 3.2 (see PEP 384)
> and remains almost 100% unused. It has never been a popular or
> influential thing (for better or worse).

Py_LIMITED_API is not very influential *outside* CPython, but it's not 
(yet) a failed experiment. (Which is not what you said, but someone 
might read it that way.)

The popularity is a bit of a chicken-and-egg problem. Py_LIMITED_API is 
not used much because the current implementation is not useful in the 
real world. But as large projects like Cython and PySide are looking at 
Py_LIMITED_API from their side, problems are getting found and fixed.
It's not a fast process, being all volunteer-driven. But the limited API 
(= stable ABI) does have a major role in thoughts about future CPython 
API design, and the idea (not current implementation) is worth looking at.

What's the idea? In addition to python35/python36/python37, there's a 
"python3" API that you can target, which is slower at run-time but won't 
inflate your build/test matrix.
It's not either-or. CPython provides both.


> -n
> 
> On Tue, Oct 30, 2018 at 6:41 PM, Eric Wieser
> <wieser.eric+numpy at gmail.com> wrote:
>> In NumPy 1.14 we changed UPDATEIFCOPY to WRITEBACKIFCOPY, and in 1.16 we
>> would like to deprecate PyArray_SetNumericOps and PyArray_GetNumericOps. The
>> strange warning when NPY_NO_DEPRICATED_API is annoying
>>
>> I?m not sure I make the connection here between hidden fields and API
>> deprecation. You seem to be asking two vaguely related questions:
>>
>> Should we have deprecated field access in the first place
>> Does our api deprecation mechanism need work
>>
>> I think a more substantial problem statement is needed for 2, so I?m only
>> going to respond to 1 here.
>>
>> Hiding fields seems to me to match the CPython model of things, where your
>> public api is PyArray<thing>_SomeGetter(thing).
>> If you look at the cpython source code, they only expose the underlying
>> struct fields if you don?t define Py_LIMITED_API, ie if you as a consumer
>> volunteer to be broken by upstream changes in minor versions. People (like
>> us) are willing to produce separate builds for each python versions, so
>> often do not define this.
>>
>> We could add a similar PyArray_LIMITED_API that allows field access under a
>> similar guarantee - the question is, are many downstream consumers willing
>> to produce builds against multiple numpy versions? (especially if they also
>> do so against multiple python versions)
>>
>> Also, for example, cython has a mechanism to transpile python code into C,
>> mapping slow python attribute lookup to fast C struct field access
>>
>> How does this work for builtin types? Does cython deliberately not define
>> Py_LIMITED_API? Or are you just forced to use PyTuple_GetItem(t) if you want
>> the fast path.
>>
>> Eric
>>
>> On Tue, 30 Oct 2018 at 02:04 Matti Picus <matti.picus at gmail.com> wrote:
>>>
>>> TL;DR - should we revert the attribute-hiding constructs in
>>> ndarraytypes.h and unify PyArrayObject_fields with PyArrayObject?
>>>
>>>
>>> Background
>>>
>>>
>>> NumPy 1.8 deprecated direct access to PyArrayObject fields. It made
>>> PyArrayObject "opaque", and hid the fields behind a PyArrayObject_fields
>>> structure
>>>
>>> https://github.com/numpy/numpy/blob/v1.15.3/numpy/core/include/numpy/ndarraytypes.h#L659
>>> with a comment about moving this to a private header. In order to access
>>> the fields, users are supposed to use PyArray_FIELDNAME functions, like
>>> PyArray_DATA and PyArray_NDIM. It seems there were thoughts at the time
>>> that numpy might move away from a C-struct based
>>>
>>> underlying data structure. Other changes were also made to enum names,
>>> but those are relatively painless to find-and-replace.
>>>
>>>
>>> NumPy has a mechanism to manage deprecating APIs, C users define
>>> NPY_NO_DEPRICATED_API to a desired level, say NPY_1_8_API_VERSION, and
>>> can then access the API "as if" they were using NumPy 1.8. Users who do
>>> not define NPY_NO_DEPRICATED_API get a warning when compiling, and
>>> default to the pre-1.8 API (aliasing of PyArrayObject to
>>> PyArrayObject_fields and direct access to the C struct fields). This is
>>> convenient for downstream users, both since the new API does not provide
>>> much added value, and it is much easier to write a->nd than
>>> PyArray_NDIM(a). For instance, pandas uses direct assignment to the data
>>> field for fast json parsing
>>>
>>> https://github.com/pandas-dev/pandas/blob/master/pandas/_libs/src/ujson/python/JSONtoObj.c#L203
>>> via chunks. Working around the new API in pandas would require more
>>> engineering. Also, for example, cython has a mechanism to transpile
>>> python code into C, mapping slow python attribute lookup to fast C
>>> struct field access
>>>
>>> https://cython.readthedocs.io/en/latest/src/userguide/extension_types.html#external-extension-types
>>>
>>>
>>> In a parallel but not really related universe, cython recently upgraded
>>> the object mapping so that we can quiet the annoying "size changed"
>>> runtime warning https://github.com/numpy/numpy/issues/11788 without
>>> requiring warning filters, but that requires updating the numpy.pxd file
>>> provided with cython, and it was proposed that NumPy actually vendor its
>>> own file rather than depending on the cython one
>>> (https://github.com/numpy/numpy/issues/11803).
>>>
>>>
>>> The problem
>>>
>>>
>>> We have now made further changes to our API. In NumPy 1.14 we changed
>>> UPDATEIFCOPY to WRITEBACKIFCOPY, and in 1.16 we would like to deprecate
>>> PyArray_SetNumericOps and PyArray_GetNumericOps. The strange warning
>>> when NPY_NO_DEPRICATED_API is annoying. The new API cannot be supported
>>> by cython without some deep surgery
>>> (https://github.com/cython/cython/pull/2640). When I tried dogfooding an
>>> updated numpy.pxd for the only cython code in NumPy, mtrand.pxy, I came
>>> across some of these issues (https://github.com/numpy/numpy/pull/12284).
>>> Forcing the new API will require downstream users to refactor code or
>>> re-engineer constructs, as in the pandas example above.
>>>
>>>
>>> The question
>>>
>>>
>>> Is the attribute-hiding effort worth it? Should we give up, revert the
>>> PyArrayObject/PyArrayObject_fields division and allow direct access from
>>> C to the numpy internals? Is there another path forward that is less
>>> painful?
>>>
>>>
>>> Matti
>>>
>>> _______________________________________________
>>> NumPy-Discussion mailing list
>>> NumPy-Discussion at python.org
>>> https://mail.python.org/mailman/listinfo/numpy-discussion
>>
>>
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at python.org
>> https://mail.python.org/mailman/listinfo/numpy-discussion
>>
> 
> 
> 

From allanhaldane at gmail.com  Wed Oct 31 17:59:29 2018
From: allanhaldane at gmail.com (Allan Haldane)
Date: Wed, 31 Oct 2018 17:59:29 -0400
Subject: [Numpy-discussion] Attribute hiding APIs for PyArrayObject
In-Reply-To: <b176f6fa-af8e-79b5-8908-450ab88801ff@gmail.com>
References: <b176f6fa-af8e-79b5-8908-450ab88801ff@gmail.com>
Message-ID: <a5e00275-1b7f-0fbd-cffb-504b6b4d7f45@gmail.com>

On 10/30/18 5:04 AM, Matti Picus wrote:
> TL;DR - should we revert the attribute-hiding constructs in
> ndarraytypes.h and unify PyArrayObject_fields with PyArrayObject?
> 
> 
> Background
> 
> 
> NumPy 1.8 deprecated direct access to PyArrayObject fields. It made
> PyArrayObject "opaque", and hid the fields behind a PyArrayObject_fields
> structure
> https://github.com/numpy/numpy/blob/v1.15.3/numpy/core/include/numpy/ndarraytypes.h#L659
> with a comment about moving this to a private header. In order to access
> the fields, users are supposed to use PyArray_FIELDNAME functions, like
> PyArray_DATA and PyArray_NDIM. It seems there were thoughts at the time
> that numpy might move away from a C-struct based
> 
> underlying data structure. Other changes were also made to enum names,
> but those are relatively painless to find-and-replace.
> 
> 
> NumPy has a mechanism to manage deprecating APIs, C users define
> NPY_NO_DEPRICATED_API to a desired level, say NPY_1_8_API_VERSION, and
> can then access the API "as if" they were using NumPy 1.8. Users who do
> not define NPY_NO_DEPRICATED_API get a warning when compiling, and
> default to the pre-1.8 API (aliasing of PyArrayObject to
> PyArrayObject_fields and direct access to the C struct fields). This is
> convenient for downstream users, both since the new API does not provide
> much added value, and it is much easier to write a->nd than
> PyArray_NDIM(a). For instance, pandas uses direct assignment to the data
> field for fast json parsing
> https://github.com/pandas-dev/pandas/blob/master/pandas/_libs/src/ujson/python/JSONtoObj.c#L203
> via chunks. Working around the new API in pandas would require more
> engineering. Also, for example, cython has a mechanism to transpile
> python code into C, mapping slow python attribute lookup to fast C
> struct field access
> https://cython.readthedocs.io/en/latest/src/userguide/extension_types.html#external-extension-types
> 
> 
> 
> In a parallel but not really related universe, cython recently upgraded
> the object mapping so that we can quiet the annoying "size changed"
> runtime warning https://github.com/numpy/numpy/issues/11788 without
> requiring warning filters, but that requires updating the numpy.pxd file
> provided with cython, and it was proposed that NumPy actually vendor its
> own file rather than depending on the cython one
> (https://github.com/numpy/numpy/issues/11803).
> 
> 
> The problem
> 
> 
> We have now made further changes to our API. In NumPy 1.14 we changed
> UPDATEIFCOPY to WRITEBACKIFCOPY, and in 1.16 we would like to deprecate
> PyArray_SetNumericOps and PyArray_GetNumericOps. The strange warning
> when NPY_NO_DEPRICATED_API is annoying. The new API cannot be supported
> by cython without some deep surgery
> (https://github.com/cython/cython/pull/2640). When I tried dogfooding an
> updated numpy.pxd for the only cython code in NumPy, mtrand.pxy, I came
> across some of these issues (https://github.com/numpy/numpy/pull/12284).
> Forcing the new API will require downstream users to refactor code or
> re-engineer constructs, as in the pandas example above.

I haven't understood the cython issue, but just want to mention that for
optimization purposes it's nice to be able to modify the fields, like in
the pandas/json example above.

In particular, PyArray_ConcatenateArrays uses some tricks which
temporarily clobber the data pointer and shape of an array to
concatenate arrays efficiently. It seems fairly safe to me. These tricks
would be nice to re-use in a C port of the new block code we merged
recently.

Those optimizations aren't possible if only using PyArray_Object.

Cheers,
Allan


> The question
> 
> 
> Is the attribute-hiding effort worth it? Should we give up, revert the
> PyArrayObject/PyArrayObject_fields division and allow direct access from
> C to the numpy internals? Is there another path forward that is less
> painful?
> 
> 
> Matti
> 
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion


From charlesr.harris at gmail.com  Wed Oct 31 19:00:52 2018
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Wed, 31 Oct 2018 17:00:52 -0600
Subject: [Numpy-discussion] Attribute hiding APIs for PyArrayObject
In-Reply-To: <a5e00275-1b7f-0fbd-cffb-504b6b4d7f45@gmail.com>
References: <b176f6fa-af8e-79b5-8908-450ab88801ff@gmail.com>
 <a5e00275-1b7f-0fbd-cffb-504b6b4d7f45@gmail.com>
Message-ID: <CAB6mnxKPSEwNY_OgiG6rwanqxZf+ZABfTSWKb7Zi8nrORV3zRQ@mail.gmail.com>

On Wed, Oct 31, 2018 at 3:59 PM Allan Haldane <allanhaldane at gmail.com>
wrote:

> On 10/30/18 5:04 AM, Matti Picus wrote:
> > TL;DR - should we revert the attribute-hiding constructs in
> > ndarraytypes.h and unify PyArrayObject_fields with PyArrayObject?
> >
> >
> > Background
> >
> >
> > NumPy 1.8 deprecated direct access to PyArrayObject fields. It made
> > PyArrayObject "opaque", and hid the fields behind a PyArrayObject_fields
> > structure
> >
> https://github.com/numpy/numpy/blob/v1.15.3/numpy/core/include/numpy/ndarraytypes.h#L659
> > with a comment about moving this to a private header. In order to access
> > the fields, users are supposed to use PyArray_FIELDNAME functions, like
> > PyArray_DATA and PyArray_NDIM. It seems there were thoughts at the time
> > that numpy might move away from a C-struct based
> >
> > underlying data structure. Other changes were also made to enum names,
> > but those are relatively painless to find-and-replace.
> >
> >
> > NumPy has a mechanism to manage deprecating APIs, C users define
> > NPY_NO_DEPRICATED_API to a desired level, say NPY_1_8_API_VERSION, and
> > can then access the API "as if" they were using NumPy 1.8. Users who do
> > not define NPY_NO_DEPRICATED_API get a warning when compiling, and
> > default to the pre-1.8 API (aliasing of PyArrayObject to
> > PyArrayObject_fields and direct access to the C struct fields). This is
> > convenient for downstream users, both since the new API does not provide
> > much added value, and it is much easier to write a->nd than
> > PyArray_NDIM(a). For instance, pandas uses direct assignment to the data
> > field for fast json parsing
> >
> https://github.com/pandas-dev/pandas/blob/master/pandas/_libs/src/ujson/python/JSONtoObj.c#L203
> > via chunks. Working around the new API in pandas would require more
> > engineering. Also, for example, cython has a mechanism to transpile
> > python code into C, mapping slow python attribute lookup to fast C
> > struct field access
> >
> https://cython.readthedocs.io/en/latest/src/userguide/extension_types.html#external-extension-types
> >
> >
> >
> > In a parallel but not really related universe, cython recently upgraded
> > the object mapping so that we can quiet the annoying "size changed"
> > runtime warning https://github.com/numpy/numpy/issues/11788 without
> > requiring warning filters, but that requires updating the numpy.pxd file
> > provided with cython, and it was proposed that NumPy actually vendor its
> > own file rather than depending on the cython one
> > (https://github.com/numpy/numpy/issues/11803).
> >
> >
> > The problem
> >
> >
> > We have now made further changes to our API. In NumPy 1.14 we changed
> > UPDATEIFCOPY to WRITEBACKIFCOPY, and in 1.16 we would like to deprecate
> > PyArray_SetNumericOps and PyArray_GetNumericOps. The strange warning
> > when NPY_NO_DEPRICATED_API is annoying. The new API cannot be supported
> > by cython without some deep surgery
> > (https://github.com/cython/cython/pull/2640). When I tried dogfooding an
> > updated numpy.pxd for the only cython code in NumPy, mtrand.pxy, I came
> > across some of these issues (https://github.com/numpy/numpy/pull/12284).
> > Forcing the new API will require downstream users to refactor code or
> > re-engineer constructs, as in the pandas example above.
>
> I haven't understood the cython issue, but just want to mention that for
> optimization purposes it's nice to be able to modify the fields, like in
> the pandas/json example above.
>
> In particular, PyArray_ConcatenateArrays uses some tricks which
> temporarily clobber the data pointer and shape of an array to
> concatenate arrays efficiently. It seems fairly safe to me. These tricks
> would be nice to re-use in a C port of the new block code we merged
> recently.
>
> Those optimizations aren't possible if only using PyArray_Object.
>
>
It's OK for numpy internals to directly access the structures, as
presumably they will be updated if anything changes. Maybe it would be
useful for Cython to have a flag like Py_LIMITED_API?

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181031/ab2413ec/attachment.html>

From ralf.gommers at gmail.com  Wed Oct 31 20:04:16 2018
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Wed, 31 Oct 2018 17:04:16 -0700
Subject: [Numpy-discussion] Attribute hiding APIs for PyArrayObject
In-Reply-To: <CAB6mnxKPSEwNY_OgiG6rwanqxZf+ZABfTSWKb7Zi8nrORV3zRQ@mail.gmail.com>
References: <b176f6fa-af8e-79b5-8908-450ab88801ff@gmail.com>
 <a5e00275-1b7f-0fbd-cffb-504b6b4d7f45@gmail.com>
 <CAB6mnxKPSEwNY_OgiG6rwanqxZf+ZABfTSWKb7Zi8nrORV3zRQ@mail.gmail.com>
Message-ID: <CABL7CQiuiCQzh8bK9hHm1nYdwwAtPmfM8nbz7YrN3upZX3-V4A@mail.gmail.com>

On Wed, Oct 31, 2018 at 4:01 PM Charles R Harris <charlesr.harris at gmail.com>
wrote:

>
>
> On Wed, Oct 31, 2018 at 3:59 PM Allan Haldane <allanhaldane at gmail.com>
> wrote:
>
>> On 10/30/18 5:04 AM, Matti Picus wrote:
>> > TL;DR - should we revert the attribute-hiding constructs in
>> > ndarraytypes.h and unify PyArrayObject_fields with PyArrayObject?
>> >
>> >
>> > Background
>> >
>> >
>> > NumPy 1.8 deprecated direct access to PyArrayObject fields. It made
>> > PyArrayObject "opaque", and hid the fields behind a PyArrayObject_fields
>> > structure
>> >
>> https://github.com/numpy/numpy/blob/v1.15.3/numpy/core/include/numpy/ndarraytypes.h#L659
>> > with a comment about moving this to a private header. In order to access
>> > the fields, users are supposed to use PyArray_FIELDNAME functions, like
>> > PyArray_DATA and PyArray_NDIM. It seems there were thoughts at the time
>> > that numpy might move away from a C-struct based
>> >
>> > underlying data structure. Other changes were also made to enum names,
>> > but those are relatively painless to find-and-replace.
>> >
>> >
>> > NumPy has a mechanism to manage deprecating APIs, C users define
>> > NPY_NO_DEPRICATED_API to a desired level, say NPY_1_8_API_VERSION, and
>> > can then access the API "as if" they were using NumPy 1.8. Users who do
>> > not define NPY_NO_DEPRICATED_API get a warning when compiling, and
>> > default to the pre-1.8 API (aliasing of PyArrayObject to
>> > PyArrayObject_fields and direct access to the C struct fields). This is
>> > convenient for downstream users, both since the new API does not provide
>> > much added value, and it is much easier to write a->nd than
>> > PyArray_NDIM(a). For instance, pandas uses direct assignment to the data
>> > field for fast json parsing
>> >
>> https://github.com/pandas-dev/pandas/blob/master/pandas/_libs/src/ujson/python/JSONtoObj.c#L203
>> > via chunks. Working around the new API in pandas would require more
>> > engineering. Also, for example, cython has a mechanism to transpile
>> > python code into C, mapping slow python attribute lookup to fast C
>> > struct field access
>> >
>> https://cython.readthedocs.io/en/latest/src/userguide/extension_types.html#external-extension-types
>> >
>> >
>> >
>> > In a parallel but not really related universe, cython recently upgraded
>> > the object mapping so that we can quiet the annoying "size changed"
>> > runtime warning https://github.com/numpy/numpy/issues/11788 without
>> > requiring warning filters, but that requires updating the numpy.pxd file
>> > provided with cython, and it was proposed that NumPy actually vendor its
>> > own file rather than depending on the cython one
>> > (https://github.com/numpy/numpy/issues/11803).
>> >
>> >
>> > The problem
>> >
>> >
>> > We have now made further changes to our API. In NumPy 1.14 we changed
>> > UPDATEIFCOPY to WRITEBACKIFCOPY, and in 1.16 we would like to deprecate
>> > PyArray_SetNumericOps and PyArray_GetNumericOps. The strange warning
>> > when NPY_NO_DEPRICATED_API is annoying. The new API cannot be supported
>> > by cython without some deep surgery
>> > (https://github.com/cython/cython/pull/2640). When I tried dogfooding
>> an
>> > updated numpy.pxd for the only cython code in NumPy, mtrand.pxy, I came
>> > across some of these issues (https://github.com/numpy/numpy/pull/12284
>> ).
>> > Forcing the new API will require downstream users to refactor code or
>> > re-engineer constructs, as in the pandas example above.
>>
>> I haven't understood the cython issue, but just want to mention that for
>> optimization purposes it's nice to be able to modify the fields, like in
>> the pandas/json example above.
>>
>> In particular, PyArray_ConcatenateArrays uses some tricks which
>> temporarily clobber the data pointer and shape of an array to
>> concatenate arrays efficiently. It seems fairly safe to me. These tricks
>> would be nice to re-use in a C port of the new block code we merged
>> recently.
>>
>> Those optimizations aren't possible if only using PyArray_Object.
>>
>>
> It's OK for numpy internals to directly access the structures, as
> presumably they will be updated if anything changes. Maybe it would be
> useful for Cython to have a flag like Py_LIMITED_API?
>

That probably only makes sense if we enable such a flag by default  - which
is a big backwards compat break that users can then undo by setting
Py_LIMITED_API=0. Otherwise the vast majority of users will never use it,
and hence we still cannot change in the C API without breaking the world.
Such breakage would be fine for conda, because it special-cases NumPy in
the same way as Python. For wheel/pip users however, it would cause major
issues.

Ralf
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181031/eea54ef7/attachment-0001.html>

From ralf.gommers at gmail.com  Wed Oct 31 20:28:01 2018
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Wed, 31 Oct 2018 17:28:01 -0700
Subject: [Numpy-discussion] asanyarray vs. asarray
In-Reply-To: <CAEQ_TvdLV5KEn7XvKqc_1m2zNkE7FZR_LaXR7MutPS2ULC0iZQ@mail.gmail.com>
References: <a7e76515-95ab-e867-dc96-92c54d8b0952@gmail.com>
 <24100c7f-20fd-4eed-99b0-d37660f52223@Canary>
 <CAJNV+9vH2o_2KCiMVc6S1ww=-NXpfGK39UoJVKKJNmv7pN602A@mail.gmail.com>
 <CAEQ_TvezgD7_KNB5PD_zjKQ=j+dN-k7BkEzNVVM9+VNAw8b85w@mail.gmail.com>
 <cb1364db-9e46-4614-a1df-d5dc33ad2c64@Canary>
 <CABL7CQgLqiTAO4H+3MUZzDbF3eR_njO4kH9UD1C_t27gnkGYYw@mail.gmail.com>
 <CABL7CQgXNt6ZGrDF8OJz73PDGJmbvaKs6q-3ArFkeLF86bpSEg@mail.gmail.com>
 <CAJNV+9uXs9at6RYyymQcoGpP4cvr+waVitHWXnsy4MciE+6RnA@mail.gmail.com>
 <CAFMmPGNbhJu1vrREnugi54gWHK9MPqFenRMsnscau=tjwVn8Hw@mail.gmail.com>
 <CALGmxEJCMfm7J8BWbHiXG0LTJxvk=5vp9X=42psY7LiA3uL8fA@mail.gmail.com>
 <CABL7CQhn45aM9rUYedLAOzFMb2wRLRJMbwE8taWEuLrqSGpmCQ@mail.gmail.com>
 <CAL1kJvC4Py+ikqp6bxyx-_qm9aD_mEhsCFshCR4QXWZt1YGMNg@mail.gmail.com>
 <CAEQ_TvdLV5KEn7XvKqc_1m2zNkE7FZR_LaXR7MutPS2ULC0iZQ@mail.gmail.com>
Message-ID: <CABL7CQh=HmhQC=eBzc0XCQ+K41T_3+59BjWQBjya2FkEVO-TSQ@mail.gmail.com>

On Tue, Oct 30, 2018 at 2:22 PM Stephan Hoyer <shoyer at gmail.com> wrote:

> On Mon, Oct 29, 2018 at 9:49 PM Eric Wieser <wieser.eric+numpy at gmail.com>
> wrote:
>
>> The latter - changing the behavior of multiplication breaks the principle.
>>
>> But this is not the main reason for deprecating matrix - almost all of
>> the problems I?ve seen have been caused by the way that matrices behave
>> when sliced. The way that m[i][j] and m[i,j] are different is just one
>> example of this, the fact that they must be 2d is another.
>>
>> Matrices behaving differently on multiplication isn?t super different in
>> my mind to how string arrays fail to multiply at all.
>>
>> Eric
>>
> It's certainly fine for arithmetic to work differently on an element-wise
> basis or even to error. But np.matrix changes the shape of results from
> various ndarray operations (e.g., both multiplication and indexing), which
> is more than any dtype can do.
>
> The Liskov substitution principle (LSP) suggests that the set of
> reasonable ndarray subclasses are exactly those that could also in
> principle correspond to a new dtype.
>

I don't think so. Dtypes have nothing to do with a whole set of use cases
that add extra methods or attributes. Random made-up example: user has a
system with 1000 sensor signals, some of which should be treated with
robust statistics for <reasons like unreliable hardware>. So user writes a
subclass robust_ndarray, adds a bunch of methods like median/iqr/mad, and
uses isinstance checks in functions that accept both ndarray and
robust_ndarray to figure out how to preprocess sensor signals.

Of course you can do everything you can do with subclasses also in other
ways, but such "let's add some methods or attributes" are much more common
(I think, hard to prove) than "let's change how indexing or multiplication
works" in end user code.

Cheers,
Ralf


> Of np.ndarray subclasses in wide-spread use, I think only the various
> "array with units" types come close satisfying this criteria. They only
> fall short insofar as they present a misleading dtype (without unit
> information).
>
> The main problem with subclassing for numpy.ndarray is that it guarantees
> too much: a large set of operations/methods along with a specific memory
> layout exposed as part of its public API. Worse, ndarray itself is a little
> quirky (e.g., with indexing, and its handling of scalars vs. 0d arrays). In
> practice, it's basically impossible to layer on complex behavior with these
> exact semantics, so only extremely minimal ndarray subclasses don't violate
> LSP.
>
> Once we have more easily extended dtypes, I suspect most of the good use
> cases for subclassing will have gone away.
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20181031/6fe27b4b/attachment.html>