From robert.kern at gmail.com Sun Jul 1 21:53:12 2018 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 1 Jul 2018 18:53:12 -0700 Subject: [Numpy-discussion] Proposal to accept NEP 19: Random Number Generator Policy Message-ID: I propose that we accept NEP 19: Random Number Generator Policy: http://www.numpy.org/neps/nep-0019-rng-policy.html The discussions on this NEP were productive and led to a major revision in how stable random number streams will be maintained in the future for the purpose of unit testing. The current version is much more elegant and imposes less pain on downstream projects. I'd like to thank everyone who participated in that discussion: you really did a good job at clarifying the issues. The NEP is much better because of your input. If there are no substantive objections within 7 days from this email, then the NEP will be accepted; see NEP 0 for more details: http://www.numpy.org/neps/nep-0000.html -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... URL: From maifer at haverford.edu Mon Jul 2 02:42:05 2018 From: maifer at haverford.edu (Maxwell Aifer) Date: Mon, 2 Jul 2018 02:42:05 -0400 Subject: [Numpy-discussion] Polynomial evaluation inconsistencies In-Reply-To: References: Message-ID: Say we add a constructor to the polynomial base class that looks something like this: ------------------------------------------------------------------------------------------- @classmethod def literal(cls, f): def basis_function_getter(self, deg): coefs = [0]*deg + [1] return lambda _: cls(coefs) basis = type('',(object,),{'__getitem__': basis_function_getter})() return f(basis, None) ------------------------------------------------------------------------------------------- Then the repr for, say, a Chebyshev polynomial could look like this: >>> Chebyshev.literal(lambda T,x: 1*T[0](x) + 2*T[1](x) + 3*T[2](x)) Does this sound like a good idea to anyone? 
Max On Sat, Jun 30, 2018 at 6:47 PM, Charles R Harris wrote: > > > On Sat, Jun 30, 2018 at 4:42 PM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> >> >> On Sat, Jun 30, 2018 at 3:41 PM, Eric Wieser > > wrote: >> >>> Since the one of the arguments for the decreasing order seems to just be >>> textual representation - do we want to tweak the repr to something like >>> >>> Polynomial(lambda x: 2*x**3 + 3*x**2 + x + 0) >>> >>> (And add a constructor that calls the lambda with Polynomial(1)) >>> >>> Eric >>> >> >> IIRC there was a proposal for that. There is the possibility of adding >> renderers for latex and html that could be used by Jupyter, and I think the >> ordering was an option. >> > > See https://github.com/numpy/numpy/issues/8893 for the proposal. BTW, if > someone would like to work on this, go for it. > > Chuck > >> ? >>> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From wieser.eric+numpy at gmail.com Mon Jul 2 02:57:24 2018 From: wieser.eric+numpy at gmail.com (Eric Wieser) Date: Sun, 1 Jul 2018 23:57:24 -0700 Subject: [Numpy-discussion] Polynomial evaluation inconsistencies In-Reply-To: References: Message-ID: I think the `x` is just noise there, especially if it's ignored (that is, `T[0](x*2)` doesn't do anything reasonable). Chebyshev.literal(lambda T: 1*T[0] + 2*T[1] + 3*T[2]) Would work, but honestly I don't think that provides much clarity. I think the value here is mainly for "simple" polynomials. 
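As a concrete sketch, Eric's simplified, `x`-free spelling could be implemented as a standalone helper; `literal` and `Basis` are hypothetical names here, not NumPy API:

```python
from numpy.polynomial import Chebyshev

def literal(cls, f):
    """Build a polynomial from a lambda written in terms of basis polynomials."""
    class Basis:
        def __getitem__(self, deg):
            # T[deg] is the basis polynomial of degree ``deg``: all
            # coefficients zero except for a 1 in position ``deg``.
            return cls([0] * deg + [1])
    return f(Basis())

p = literal(Chebyshev, lambda T: 1*T[0] + 2*T[1] + 3*T[2])
print(p.coef)  # [1. 2. 3.]
```

Since adding scaled basis polynomials just accumulates coefficients, the repr could be generated directly from `p.coef` in the same form.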
On Sun, 1 Jul 2018 at 23:42 Maxwell Aifer wrote: > Say we add a constructor to the polynomial base class that looks something > like this: > > > ------------------------------------------------------------------------------------------- > @classmethod > def literal(cls, f): > def basis_function_getter(self, deg): > coefs = [0]*deg + [1] > return lambda _: cls(coefs) > basis = type('',(object,),{'__getitem__': basis_function_getter})() > return f(basis, None) > > ------------------------------------------------------------------------------------------- > > > Then the repr for, say, a Chebyshev polynomial could look like this: > > >>> Chebyshev.literal(lambda T,x: 1*T[0](x) + 2*T[1](x) + 3*T[2](x)) > > Does this sound like a good idea to anyone? > > Max > > > On Sat, Jun 30, 2018 at 6:47 PM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> >> >> On Sat, Jun 30, 2018 at 4:42 PM, Charles R Harris < >> charlesr.harris at gmail.com> wrote: >> >>> >>> >>> On Sat, Jun 30, 2018 at 3:41 PM, Eric Wieser < >>> wieser.eric+numpy at gmail.com> wrote: >>> >>>> Since the one of the arguments for the decreasing order seems to just >>>> be textual representation - do we want to tweak the repr to something like >>>> >>>> Polynomial(lambda x: 2*x**3 + 3*x**2 + x + 0) >>>> >>>> (And add a constructor that calls the lambda with Polynomial(1)) >>>> >>>> Eric >>>> >>> >>> IIRC there was a proposal for that. There is the possibility of adding >>> renderers for latex and html that could be used by Jupyter, and I think the >>> ordering was an option. >>> >> >> See https://github.com/numpy/numpy/issues/8893 for the proposal. BTW, if >> someone would like to work on this, go for it. >> >> Chuck >> >>> ? 
>>>> >>> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion >> >> _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From maifer at haverford.edu Mon Jul 2 03:31:46 2018 From: maifer at haverford.edu (Maxwell Aifer) Date: Mon, 2 Jul 2018 03:31:46 -0400 Subject: [Numpy-discussion] Polynomial evaluation inconsistencies In-Reply-To: References: Message-ID: Ok I see what you mean. If people really want math-like symbolic representations for everything it?s probably better to use sympy or something On Mon, Jul 2, 2018 at 2:59 AM Eric Wieser wrote: > I think the `x` is just noise there, especially if it's ignored (that is, > `T[0](x*2)` doesn't do anything reasonable). > > Chebyshev.literal(lambda T: 1*T[0] + 2*T[1] + 3*T[2]) > > Would work, but honestly I don't think that provides much clarity. I think > the value here is mainly for "simple" polynomials. > > On Sun, 1 Jul 2018 at 23:42 Maxwell Aifer wrote: > >> Say we add a constructor to the polynomial base class that looks >> something like this: >> >> >> ------------------------------------------------------------------------------------------- >> @classmethod >> def literal(cls, f): >> def basis_function_getter(self, deg): >> coefs = [0]*deg + [1] >> return lambda _: cls(coefs) >> basis = type('',(object,),{'__getitem__': >> basis_function_getter})() >> return f(basis, None) >> >> ------------------------------------------------------------------------------------------- >> >> >> Then the repr for, say, a Chebyshev polynomial could look like this: >> >> >>> Chebyshev.literal(lambda T,x: 1*T[0](x) + 2*T[1](x) + 3*T[2](x)) >> >> Does this sound like a good idea to anyone? 
>> >> Max >> >> >> On Sat, Jun 30, 2018 at 6:47 PM, Charles R Harris < >> charlesr.harris at gmail.com> wrote: >> >>> >>> >>> On Sat, Jun 30, 2018 at 4:42 PM, Charles R Harris < >>> charlesr.harris at gmail.com> wrote: >>> >>>> >>>> >>>> On Sat, Jun 30, 2018 at 3:41 PM, Eric Wieser < >>>> wieser.eric+numpy at gmail.com> wrote: >>>> >>>>> Since the one of the arguments for the decreasing order seems to just >>>>> be textual representation - do we want to tweak the repr to something like >>>>> >>>>> Polynomial(lambda x: 2*x**3 + 3*x**2 + x + 0) >>>>> >>>>> (And add a constructor that calls the lambda with Polynomial(1)) >>>>> >>>>> Eric >>>>> >>>> >>>> IIRC there was a proposal for that. There is the possibility of adding >>>> renderers for latex and html that could be used by Jupyter, and I think the >>>> ordering was an option. >>>> >>> >>> See https://github.com/numpy/numpy/issues/8893 for the proposal. BTW, >>> if someone would like to work on this, go for it. >>> >>> Chuck >>> >>>> ? >>>>> >>>> >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at python.org >>> https://mail.python.org/mailman/listinfo/numpy-discussion >>> >>> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From dmitrey15 at gmail.com Mon Jul 2 03:47:56 2018 From: dmitrey15 at gmail.com (Dmitrey Kroshko) Date: Mon, 2 Jul 2018 10:47:56 +0300 Subject: [Numpy-discussion] OpenOpt Suite v 0.5627 Message-ID: hello, OpenOpt Suite v 0.5627 (OpenOpt, FuncDesigner, SpaceFuncs, DerApproximator) is available for download from the link https://app.box.com/s/sikxqjmohtpklqe46ou86t44ma7b27es (unfortunately, I still cannot update it in the PyPI entries because of problems with an incorrect password). It fixes compatibility issues with the latest Python / NumPy / matplotlib versions and has some improvements. Unfortunately, I have not been able to make essential changes since the last release in 2015 due to a lack of financial support; however, I still intend to continue its development. Regards, D. -------------- next part -------------- An HTML attachment was scrubbed... URL: From antoine at python.org Mon Jul 2 17:03:34 2018 From: antoine at python.org (Antoine Pitrou) Date: Mon, 2 Jul 2018 23:03:34 +0200 Subject: [Numpy-discussion] PEP 574 - zero-copy pickling with out of band data Message-ID: <75eb2981-9888-bf95-13fb-6885ed9d6a3b@python.org> Hello, Some of you might know that I've been working on a PEP in order to improve pickling performance of large (or huge) data. The PEP, numbered 574 and titled "Pickle protocol 5 with out-of-band data", allows participating data types to be pickled without any memory copy. https://www.python.org/dev/peps/pep-0574/ The PEP already has an implementation, which is backported as an independent PyPI package under the name "pickle5". https://pypi.org/project/pickle5/ I also have a working patch updating PyArrow to use the PEP-defined extensions to allow for zero-copy pickling of Arrow arrays - without breaking compatibility with existing usage: https://github.com/apache/arrow/pull/2161 Still, it is obvious that one of the primary targets of PEP 574 is NumPy arrays, as the most prevalent datatype in the Python scientific ecosystem.
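For readers unfamiliar with the PEP, the out-of-band API it adds looks roughly like this. This is a sketch using the stdlib `pickle`, which gained protocol 5 in Python 3.8 (the `pickle5` backport exposes the same API); NumPy itself only grew ndarray support for the protocol later, so treat the array case as illustrative:

```python
import pickle
import numpy as np

arr = np.arange(1_000_000, dtype=np.float64)

# With protocol 5, large buffers can be diverted to ``buffer_callback``
# instead of being serialized into the pickle stream itself.
buffers = []
data = pickle.dumps(arr, protocol=5, buffer_callback=buffers.append)

# ``data`` now holds only metadata; the 8 MB array body lives in
# ``buffers`` as PickleBuffer views that can be transported (or shared)
# without a memory copy, then handed back to the unpickler.
roundtripped = pickle.loads(data, buffers=buffers)
assert np.array_equal(roundtripped, arr)
```

Without `buffer_callback`, the same call degrades gracefully to ordinary in-band pickling, which is what keeps the extension backward compatible.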
I'm personally satisfied with the current state of the PEP, but I'd like to have feedback from Numpy core maintainers. I haven't tried (yet?) to draft a Numpy patch to add PEP 574 support, since that's likely to be more involved due to the complexity of Numpy and due to the core being written in C. Therefore I would like some help evaluating whether the PEP is likely to be a good fit for Numpy. Regards Antoine. From charlesr.harris at gmail.com Mon Jul 2 19:16:00 2018 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 2 Jul 2018 17:16:00 -0600 Subject: [Numpy-discussion] PEP 574 - zero-copy pickling with out of band data In-Reply-To: <75eb2981-9888-bf95-13fb-6885ed9d6a3b@python.org> References: <75eb2981-9888-bf95-13fb-6885ed9d6a3b@python.org> Message-ID: On Mon, Jul 2, 2018 at 3:03 PM, Antoine Pitrou wrote: > > Hello, > > Some of you might know that I've been working on a PEP in order to > improve pickling performance of large (or huge) data. The PEP, > numbered 574 and titled "Pickle protocol 5 with out-of-band data", > allows participating data types to be pickled without any memory copy. > https://www.python.org/dev/peps/pep-0574/ > > The PEP already has an implementation, which is backported as an > independent PyPI package under the name "pickle5". > https://pypi.org/project/pickle5/ > > I also have a working patch updating PyArrow to use the PEP-defined > extensions to allow for zero-copy pickling of Arrow arrays - without > breaking compatibility with existing usage: > https://github.com/apache/arrow/pull/2161 > > Still, it is obvious one the primary targets of PEP 574 is Numpy > arrays, as the most prevalent datatype in the Python scientific > ecosystem. I'm personally satisfied with the current state of the PEP, > but I'd like to have feedback from Numpy core maintainers. I haven't > tried (yet?) 
to draft a Numpy patch to add PEP 574 support, since that's > likely to be more involved due to the complexity of Numpy and due to > the core being written in C. Therefore I would like some help > evaluating whether the PEP is likely to be a good fit for Numpy. > > Maybe somewhat off topic, but we have had trouble with a 2 GiB limit on file writes on OS X. See https://github.com/numpy/numpy/issues/3858. Does your implementation work around that? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Mon Jul 2 19:31:05 2018 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 2 Jul 2018 17:31:05 -0600 Subject: [Numpy-discussion] PEP 574 - zero-copy pickling with out of band data In-Reply-To: References: <75eb2981-9888-bf95-13fb-6885ed9d6a3b@python.org> Message-ID: On Mon, Jul 2, 2018 at 5:16 PM, Charles R Harris wrote: > > > On Mon, Jul 2, 2018 at 3:03 PM, Antoine Pitrou wrote: > >> >> Hello, >> >> Some of you might know that I've been working on a PEP in order to >> improve pickling performance of large (or huge) data. The PEP, >> numbered 574 and titled "Pickle protocol 5 with out-of-band data", >> allows participating data types to be pickled without any memory copy. >> https://www.python.org/dev/peps/pep-0574/ >> >> The PEP already has an implementation, which is backported as an >> independent PyPI package under the name "pickle5". >> https://pypi.org/project/pickle5/ >> >> I also have a working patch updating PyArrow to use the PEP-defined >> extensions to allow for zero-copy pickling of Arrow arrays - without >> breaking compatibility with existing usage: >> https://github.com/apache/arrow/pull/2161 >> >> Still, it is obvious one the primary targets of PEP 574 is Numpy >> arrays, as the most prevalent datatype in the Python scientific >> ecosystem. I'm personally satisfied with the current state of the PEP, >> but I'd like to have feedback from Numpy core maintainers. 
I haven't >> tried (yet?) to draft a Numpy patch to add PEP 574 support, since that's >> likely to be more involved due to the complexity of Numpy and due to >> the core being written in C. Therefore I would like some help >> evaluating whether the PEP is likely to be a good fit for Numpy. >> >> > Maybe somewhat off topic, but we have had trouble with a 2 GiB limit on > file writes on OS X. See https://github.com/numpy/numpy/issues/3858. Does > your implementation work around that? > ISTR that some parallel processing applications sent pickled arrays around to different processes, I don't know if that is still the case, but if so, no copy might be a big gain for them. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From andyfaff at gmail.com Mon Jul 2 20:41:04 2018 From: andyfaff at gmail.com (Andrew Nelson) Date: Tue, 3 Jul 2018 10:41:04 +1000 Subject: [Numpy-discussion] PEP 574 - zero-copy pickling with out of band data In-Reply-To: References: <75eb2981-9888-bf95-13fb-6885ed9d6a3b@python.org> Message-ID: On Tue, 3 Jul 2018 at 09:31, Charles R Harris wrote: > > ISTR that some parallel processing applications sent pickled arrays around > to different processes, I don't know if that is still the case, but if so, > no copy might be a big gain for them. > That is very much correct. One example is using MCMC, which is massively parallel. I do parallelisation with mpi4py, and this requires distribution of pickled data of a reasonable size to the entire MPI world. This pickling introduces quite a bit of overhead. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From nathan12343 at gmail.com Mon Jul 2 20:49:17 2018 From: nathan12343 at gmail.com (Nathan Goldbaum) Date: Mon, 2 Jul 2018 19:49:17 -0500 Subject: [Numpy-discussion] PEP 574 - zero-copy pickling with out of band data In-Reply-To: References: <75eb2981-9888-bf95-13fb-6885ed9d6a3b@python.org> Message-ID: On Mon, Jul 2, 2018 at 7:42 PM Andrew Nelson wrote: > > > On Tue, 3 Jul 2018 at 09:31, Charles R Harris > wrote: > >> >> ISTR that some parallel processing applications sent pickled arrays >> around to different processes, I don't know if that is still the case, but >> if so, no copy might be a big gain for them. >> > > That is very much correct. One example is using MCMC, which is massively > parallel. I do parallelisation with mpi4py, and this requires distribution > of pickled data of a reasonable size to the entire MPI world. This pickling > introduces quite a bit of overhead. > Doesn't mpi4py have support for buffered low-level communication of numpy arrays? See e.g. https://mpi4py.scipy.org/docs/usrman/tutorial.html Although I guess that with Antoine's proposal, uses of the "lowercase" mpi4py API where data might get pickled will see speedups. _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed...
URL: From gael.varoquaux at normalesup.org Tue Jul 3 01:35:04 2018 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Tue, 3 Jul 2018 07:35:04 +0200 Subject: [Numpy-discussion] PEP 574 - zero-copy pickling with out of band data In-Reply-To: References: <75eb2981-9888-bf95-13fb-6885ed9d6a3b@python.org> Message-ID: <20180703053504.mubpx5uwcabmeqj5@phare.normalesup.org> On Mon, Jul 02, 2018 at 05:31:05PM -0600, Charles R Harris wrote: > ISTR that some parallel processing applications sent pickled arrays around to > different processes, I don't know if that is still the case, but if so, no copy > might be a big gain for them. Yes, most parallel code that runs across processes or across computers uses some form of pickle. I hope that this PEP will enable large speed-ups. This would be a big deal for parallelism in numerical Python. From andrea.gavana at gmail.com Tue Jul 3 02:54:51 2018 From: andrea.gavana at gmail.com (Andrea Gavana) Date: Tue, 3 Jul 2018 08:54:51 +0200 Subject: [Numpy-discussion] PEP 574 - zero-copy pickling with out of band data In-Reply-To: <20180703053504.mubpx5uwcabmeqj5@phare.normalesup.org> References: <75eb2981-9888-bf95-13fb-6885ed9d6a3b@python.org> <20180703053504.mubpx5uwcabmeqj5@phare.normalesup.org> Message-ID: On Tue, 3 Jul 2018 at 07.35, Gael Varoquaux wrote: > On Mon, Jul 02, 2018 at 05:31:05PM -0600, Charles R Harris wrote: > > ISTR that some parallel processing applications sent pickled arrays > around to > > different processes, I don't know if that is still the case, but if so, > no copy > > might be a big gain for them. > > Yes, most parallel code that runs across processes or across computers uses > some form of pickle. I hope that this PEP will enable large speed-ups. > This would be a big deal for parallelism in numerical Python. This sounds so very powerful...
it's such a pity that this type of gem won't be backported to Python 2 - we have so many legacy applications smoothly running in Python 2 and nowhere near the required resources to even start porting to Python 3, and pickle5 looks like a small revolution in the data-persistence world. Andrea. > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gael.varoquaux at normalesup.org Tue Jul 3 03:20:05 2018 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Tue, 3 Jul 2018 09:20:05 +0200 Subject: [Numpy-discussion] PEP 574 - zero-copy pickling with out of band data In-Reply-To: References: <75eb2981-9888-bf95-13fb-6885ed9d6a3b@python.org> <20180703053504.mubpx5uwcabmeqj5@phare.normalesup.org> Message-ID: <20180703072005.pgzorszamyxguyf4@phare.normalesup.org> On Tue, Jul 03, 2018 at 08:54:51AM +0200, Andrea Gavana wrote: > This sounds so very powerful... it's such a pity that this type of gem won't > be backported to Python 2 - we have so many legacy applications smoothly > running in Python 2 and nowhere near the required resources to even start > porting to Python 3, I am a strong defender of stability and long-term support in scientific software. But what you are demanding is that developers who do free work do not benefit from their own work to have a more powerful environment. More recent versions of Python are improved compared to older ones and make it much easier to write certain idioms. Developers make these changes over years to ensure that codebases are always simpler and more robust. Backporting in effect means doing this work twice, but the second time with more constraints. I just allocated something like a man-year to have robust parallel-computing features work both on Python 2 and Python 3.
With this man-year we could have done many other things. Did I make the correct decision? I am not sure, because this is just creating more technical debt. I understand that we all sit on piles of code that we wrote for a given application at one point, and that we will not be able to modernise it all. But the fact that we don't have the bandwidth to make it evolve probably means that we need to triage what's important and call the rest a loss. Just like if I have 5 old cars in my backyard, I won't be able to keep them all on the road unless I am very rich. People asking for infinite backports to Python 2 are just asking developers to write them a second free check, even larger than the one they just got by having the feature under Python 3. Gaël From andrea.gavana at gmail.com Tue Jul 3 03:42:08 2018 From: andrea.gavana at gmail.com (Andrea Gavana) Date: Tue, 3 Jul 2018 09:42:08 +0200 Subject: [Numpy-discussion] PEP 574 - zero-copy pickling with out of band data In-Reply-To: <20180703072005.pgzorszamyxguyf4@phare.normalesup.org> References: <75eb2981-9888-bf95-13fb-6885ed9d6a3b@python.org> <20180703053504.mubpx5uwcabmeqj5@phare.normalesup.org> <20180703072005.pgzorszamyxguyf4@phare.normalesup.org> Message-ID: Hi, On Tue, 3 Jul 2018 at 09.20, Gael Varoquaux wrote: > On Tue, Jul 03, 2018 at 08:54:51AM +0200, Andrea Gavana wrote: > > This sounds so very powerful... it's such a pity that this type of gem > won't > > be backported to Python 2 - we have so many legacy applications smoothly > > running in Python 2 and nowhere near the required resources to even start > > porting to Python 3, > > I am a strong defender of stability and long-term support in scientific > software. But what you are demanding is that developers who do free work > do not benefit from their own work to have a more powerful environment. > > More recent versions of Python are improved compared to older ones and > make it much easier to write certain idioms.
Developers make these > changes over years to ensure that codebases are always simpler and more > robust. Backporting in effect means doing this work twice, but the second > time with more constraints. I just allocated something like a man-year to > have robust parallel-computing features work both on Python 2 and Python > 3. With this man-year we could have done many other things. Did I make > the correct decision? I am not sure, because this is just creating more > technical debt. > > I understand that we all sit on piles of code that we wrote for a given > application at one point, and that we will not be able to modernise it > all. But the fact that we don't have the bandwidth to make it evolve > probably means that we need to triage what's important and call the rest > a loss. Just like if I have 5 old cars in my backyard, I won't be able > to keep them all on the road unless I am very rich. > > > People asking for infinite backports to Python 2 are just asking > developers to write them a second free check, even larger than the one > they just got by having the feature under Python 3. > Just to clarify: I wasn't asking for anything, just complimenting Antoine's work on something that appears to be a wonderful feature. There was a bit of a rant on my part for sure, but I've never asked for someone to redo the work to make it run on Python 2. Allocating a resource to port hundreds of thousands of LOC is close to an impossibility in the industry I work in, especially because our big team (the two of us) don't code for a living; we have many other duties. We code to make our lives easier. I'm happy if you feel better after your tirade. Andrea. > > Gaël > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed...
URL: From njs at pobox.com Tue Jul 3 04:27:56 2018 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 3 Jul 2018 01:27:56 -0700 Subject: [Numpy-discussion] Fwd: Allowing broadcasting of code dimensions in generalized ufuncs In-Reply-To: References: Message-ID: On Sat, Jun 30, 2018 at 6:51 AM, Marten van Kerkwijk wrote: > Hi All, > > In case it was missed because people have tuned out of the thread: Matti and > I proposed last Tuesday to accept NEP 20 (on coming Tuesday, as per NEP 0), > which introduces notation for generalized ufuncs allowing fixed, flexible > and broadcastable core dimensions. For one thing, this will allow Matti to > finish his work on making matmul a gufunc. > > See http://www.numpy.org/neps/nep-0020-gufunc-signature-enhancement.html So I still have some of the same concerns as before... For the possibly missing dimensions: matmul is really important, and making it a gufunc solves the problem of making it overridable by duck arrays (via __array_ufunc__). Also, it will help later when we rework dtypes: new dtypes will be able to implement matmul by the normal ufunc loop registration mechanism, which is much nicer than the current system where every dtype has a special-case method just for handling matmul. The ? proposal isn't the most elegant idea ever, but we've been tossing around ideas for solving these problems for a while, and so far this seems to be the least-bad one, so... sure, let's do it. For the fixed-size dimensions: this makes me nervous. It's aimed at a real use case, which is a major point in its favor. But a few things make me wary. For input dimensions, it's sugar: the gufunc loop can already raise an error if it doesn't like the size it gets. For output dimensions, it does solve a real problem. But... only part of it. It's awkward that right now you only have a few limited ways to choose output dimensions, but this just extends the list of special cases, rather than solving the underlying problem.
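As background, the signature notation the NEP extends names each argument's core dimensions; `np.vectorize` accepts the same notation and can serve as a pure-Python stand-in for a compiled gufunc (a sketch for illustration, not how matmul is actually implemented):

```python
import numpy as np

# "(m,n),(n,p)->(m,p)": each input has two core dimensions; the shared
# name ``n`` must match across arguments, and any dimensions to the left
# broadcast as loop dimensions. NEP 20 layers markers for fixed, flexible
# and broadcastable core dimensions on top of this notation.
matmat = np.vectorize(np.matmul, signature='(m,n),(n,p)->(m,p)')

a = np.ones((3, 2, 4))   # loop dim 3, core dims (2, 4)
b = np.ones((3, 4, 5))   # loop dim 3, core dims (4, 5)
out = matmat(a, b)
print(out.shape)  # (3, 2, 5)
```

The gufunc machinery derives the output core dims `(m, p)` from the inputs, which is exactly the limited output-shape mechanism being discussed above.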
For example, 'np.linalg.qr' needs a much more generic mechanism to choose output shape, and parametrized dtypes will need a much more generic mechanism to choose output dtype, so we're definitely going to end up with some phase where arbitrary code gets to describe the output array. Are we going to look back on fixed-size dimensions as a quirky, redundant thing? Also, as currently proposed, it seems to rule out the possibility of using name-based axis specification in the future, right? (See https://github.com/numpy/numpy/pull/8819#issuecomment-366329325) Are we sure we want to do that? If everyone else is comfortable with all these things then I won't block it though. For broadcasting: I'm sorry, but I think I'm -1 on this. I feel like it falls into a classic anti-pattern in numpy, where someone sees a cool thing they could do and then goes looking for problems to justify it. (A red flag for me is that "it's easy to implement" keeps being mentioned as justification for doing it.) The all_equal and weighted_mean examples both feel pretty artificial -- traditionally we've always implemented these kinds of functions as regular functions that use (g)ufuncs internally, and it's worked fine (cf. np.allclose, ndarray.mean). In fact in some sense the whole point of numpy is to help people implement functions like this, without having to write their own gufuncs. Is there some reason these need to be gufuncs? And if there is, are these the only things that need to be gufuncs, or is there a broader class we're missing? The design just doesn't feel well-justified to me. And in the past, when we've implemented things like this, where the use cases are thin but hey why not it's easy to do, it's ended up causing two problems: first people start trying to force it into cases where it doesn't quite work, which makes everyone unhappy... 
and then when we eventually do try to solve the problem properly, we end up having to do elaborate workarounds to keep the old not-quite-working use cases from breaking. I'm pretty sure we're going to end up rewriting most of the ufunc code over the next few years as we ramp up duck array and user dtype support, and it's already going to be very difficult, both to design in the first place and then to implement while carefully keeping shims to keep all the old stuff working. Adding features has a very real cost, because it adds extra constraints that all this future work will have to work around. I don't think this meets the bar. By the way, I also think we're getting well past the point where we should be switching from a string-based DSL to a more structured representation. (This is another trap that numpy tends to fall into... the dtype "language" is also a major offender.) This isn't really a commentary on any part of this in particular, but just something that I've been noticing and wanted to mention :-). -n -- Nathaniel J. Smith -- https://vorpus.org From njs at pobox.com Tue Jul 3 04:41:10 2018 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 3 Jul 2018 01:41:10 -0700 Subject: [Numpy-discussion] NEP 22 — Duck typing for NumPy arrays – high level overview Message-ID: Hi all, Here's a NEP that Stephan and I wrote (actually a few months ago, but then I was super slow, thank you Stephan for your patience). It tries to lay out a high-level overview of where we're trying to go with all this duck array stuff, and some general guidelines that we've gradually come around to after lots of time scowling at whiteboards in frustration. -n ---------- =========================================================== NEP 22 — Duck typing for NumPy arrays – high level overview =========================================================== :Author: Stephan Hoyer, Nathaniel J.
Smith :Status: Draft :Type: Informational :Created: 2018-03-22 Abstract -------- We outline a high-level vision for how NumPy will approach handling "duck arrays". This is an Informational-class NEP; it doesn't prescribe full details for any particular implementation. In brief, we propose developing a number of new protocols for defining implementations of multi-dimensional arrays with high-level APIs matching NumPy. Detailed description -------------------- Traditionally, NumPy's ``ndarray`` objects have provided two things: a high-level API for expressing operations on homogeneously-typed, arbitrary-dimensional, array-structured data, and a concrete implementation of the API based on strided in-RAM storage. The API is powerful, fairly general, and used ubiquitously across the scientific Python stack. The concrete implementation, on the other hand, is suitable for a wide range of uses, but has limitations: as data sets grow and NumPy becomes used in a variety of new environments, there are increasingly cases where the strided in-RAM storage strategy is inappropriate, and users find they need sparse arrays, lazily evaluated arrays (as in dask), compressed arrays (as in blosc), arrays stored in GPU memory, arrays stored in alternative formats such as Arrow, and so forth -- yet users still want to work with these arrays using the familiar NumPy APIs, and re-use existing code with minimal (ideally zero) porting overhead. As a working shorthand, we call these "duck arrays", by analogy with Python's "duck typing": a "duck array" is a Python object which "quacks like" a numpy array in the sense that it has the same or similar Python API, but doesn't share the C-level implementation. This NEP doesn't propose any specific changes to NumPy or other projects; instead, it gives an overview of how we hope to extend NumPy to support a robust ecosystem of projects implementing and relying upon its high-level API. Terminology ~~~~~~~~~~~ "Duck array"
works fine as a placeholder for now, but it's pretty jargony and may confuse new users, so we may want to pick something else for the actual API functions. Unfortunately, "array-like" is already taken for the concept of "anything that can be coerced into an array" (including e.g. list objects), and "anyarray" is already taken for the concept of "something that shares ndarray's implementation, but has different semantics", which is the opposite of a duck array (e.g., np.matrix is an "anyarray", but is not a "duck array"). This is a classic bike-shed, so for now we're just using "duck array". Some possible options though include: arrayish, pseudoarray, nominalarray, ersatzarray, arraymimic, ... General approach ~~~~~~~~~~~~~~~~ At a high level, duck array support requires working through each of the API functions provided by NumPy, and figuring out how it can be extended to work with duck array objects. In some cases this is easy (e.g., methods/attributes on ndarray itself); in other cases it's more difficult. Here are some principles we've found useful so far: Principle 1: Focus on "full" duck arrays, but don't rule out "partial" duck arrays ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ We can distinguish between two classes: * "full" duck arrays, which aspire to fully implement np.ndarray's Python-level APIs and work essentially anywhere that np.ndarray works * "partial" duck arrays, which intentionally implement only a subset of np.ndarray's API. Full duck arrays are, well, kind of boring. They have exactly the same semantics as ndarray, with differences being restricted to under-the-hood decisions about how the data is actually stored. The kind of people that are excited about making numpy more extensible are also, unsurprisingly, excited about changing or extending numpy's semantics. So there's been a lot of discussion of how to best support partial duck arrays. We've been guilty of this ourselves.
At this point though, we think the best general strategy is to focus our efforts primarily on supporting full duck arrays, and only worry about partial duck arrays as much as we need to in order to make sure we don't accidentally rule them out for no reason. Why focus on full duck arrays? Several reasons: First, there are lots of very clear use cases. Potential consumers of the full duck array interface include almost every package that uses numpy (scipy, sklearn, astropy, ...), and in particular packages that provide array-wrapping classes that handle multiple types of arrays, such as xarray and dask.array. Potential implementers of the full duck array interface include: distributed arrays, sparse arrays, masked arrays, arrays with units (unless they switch to using dtypes), labeled arrays, and so forth. Clear use cases lead to good and relevant APIs. Second, the Anna Karenina principle applies here: full duck arrays are all alike, but every partial duck array is partial in its own way: * ``xarray.DataArray`` is mostly a duck array, but has incompatible broadcasting semantics. * ``xarray.Dataset`` wraps multiple arrays in one object; it still implements some array interfaces like ``__array_ufunc__``, but certainly not all of them. * ``pandas.Series`` has methods with similar behavior to numpy, but unique null-skipping behavior. * scipy's ``LinearOperator``\s support matrix multiplication and nothing else * h5py and similar libraries for accessing array storage have objects that support numpy-like slicing and conversion into a full array, but not computation. * Some classes may be similar to ndarray, but without supporting the full indexing semantics. And so forth. Despite our best attempts, we haven't found any clear, unique way of slicing up the ndarray API into a hierarchy of related types that captures these distinctions; in fact, it's unlikely that any single person even understands all the distinctions.
And this is important, because we have a *lot* of APIs that we need to add duck array support to (both in numpy and in all the projects that depend on numpy!). By definition, these already work for ``ndarray``, so hopefully getting them to work for full duck arrays shouldn't be so hard, since by definition full duck arrays act like ``ndarray``. It'd be very cumbersome to have to go through each function and identify the exact subset of the ndarray API that it needs, then figure out which partial array types can/should support it. Once we have things working for full duck arrays, we can go back later and refine the APIs further as needed. Focusing on full duck arrays allows us to start making progress immediately. In the future, it might be useful to identify specific use cases for duck arrays and standardize narrower interfaces targeted just at those use cases. For example, it might make sense to have a standard "array loader" interface that file access libraries like h5py, netcdf, pydap, zarr, ... all implement, to make it easy to switch between these libraries. But that's something that we can do as we go, and it doesn't necessarily have to involve the NumPy devs at all. For an example of what this might look like, see the documentation for `dask.array.from_array `__. Principle 2: Take advantage of duck typing ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ``ndarray`` has a very large API surface area:: In [1]: len(set(dir(np.ndarray)) - set(dir(object))) Out[1]: 138 And this is a huge **under**\estimate, because there are also many free-standing functions in NumPy and other libraries which currently use the NumPy C API and thus only work on ``ndarray`` objects. In type theory, a type is defined by the operations you can perform on an object; thus, the actual type of ``ndarray`` includes not just its methods and attributes, but *all* of these functions. For duck arrays to be successful, they'll need to implement a large proportion of the ``ndarray`` API --
but not all of it. (For example, ``dask.array.Array`` does not provide an equivalent to the ``ndarray.ptp`` method, presumably because no-one has ever noticed or cared about its absence. But this doesn't seem to have stopped people from using dask.) This means that realistically, we can't hope to define the whole duck array API up front, or that anyone will be able to implement it all in one go; this will be an incremental process. It also means that even the so-called "full" duck array interface is somewhat fuzzily defined at the borders; there are parts of the ``np.ndarray`` API that duck arrays won't have to implement, but we aren't entirely sure what those are. And ultimately, it isn't really up to the NumPy developers to define what does or doesn't qualify as a duck array. If we want scikit-learn functions to work on dask arrays (for example), then that's going to require negotiation between those two projects to discover incompatibilities, and when an incompatibility is discovered it will be up to them to negotiate who should change and how. The NumPy project can provide technical tools and general advice to help resolve these disagreements, but we can't force one group or another to take responsibility for any given bug. Therefore, even though we're focusing on "full" duck arrays, we *don't* attempt to define a normative "array ABC" -- maybe this will be useful someday, but right now, it's not. And as a convenient side-effect, the lack of a normative definition leaves partial duck arrays room to experiment. But, we do provide some more detailed advice for duck array implementers and consumers below.
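The ``dir()``-based measurement quoted under Principle 2 also gives implementers a cheap way to see how fuzzy their own borders are. A rough sketch, where ``MyDuckArray`` is a made-up placeholder class, not a real project:

```python
import numpy as np

# The ndarray API surface, measured the same way as in the NEP text.
ndarray_api = set(dir(np.ndarray)) - set(dir(object))

class MyDuckArray:
    """Hypothetical partial duck array exposing only a few attributes."""
    shape = (3,)
    ndim = 1
    dtype = np.dtype('float64')

# Which parts of the ndarray API does this type not (yet) provide?
missing = ndarray_api - set(dir(MyDuckArray))
print(f"{len(missing)} of {len(ndarray_api)} ndarray attributes unimplemented")
```

Of course, no count like this can say which of the missing pieces actually matter for a given consumer; that is exactly the project-by-project negotiation described above.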
Principle 3: Focus on protocols ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Historically, numpy has had lots of success at interoperating with third-party objects by defining *protocols*, like ``__array__`` (asks an arbitrary object to convert itself into an array), ``__array_interface__`` (a precursor to Python's buffer protocol), and ``__array_ufunc__`` (allows third-party objects to support ufuncs like ``np.exp``). `NEP 16 `_ took a different approach: we need a duck-array equivalent of ``asarray``, and it proposed to do this by defining a version of ``asarray`` that would let through objects which implemented a new AbstractArray ABC. As noted above, we now think that trying to define an ABC is a bad idea for other reasons. But when this NEP was discussed on the mailing list, we realized that even on its own merits, this idea is not so great. A better approach is to define a *method* that can be called on an arbitrary object to ask it to convert itself into a duck array, and then define a version of ``asarray`` that calls this method. This is strictly more powerful: if an object is already a duck array, it can simply ``return self``. It allows more correct semantics: NEP 16 assumed that ``asarray(obj, dtype=X)`` is the same as ``asarray(obj).astype(X)``, but this isn't true. And it supports more use cases: if h5py supported sparse arrays, it might want to provide an object which is not itself a sparse array, but which can be automatically converted into a sparse array. See NEP for full details. The protocol approach is also more consistent with core Python conventions: for example, see the ``__iter__`` method for coercing objects to iterators, or the ``__index__`` protocol for safe integer coercion. And finally, focusing on protocols leaves the door open for partial duck arrays, which can pick and choose which subset of the protocols they want to participate in, each of which has well-defined semantics. Conclusion: protocols are one honking great idea --
let's do more of those. Principle 4: Reuse existing methods when possible ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ It's tempting to try to define cleaned-up versions of ndarray methods with a more minimal interface to allow for easier implementation. For example, ``__array_reshape__`` could drop some of the strange arguments accepted by ``reshape`` and ``__array_basic_getitem__`` could drop all the `strange edge cases `__ of NumPy's advanced indexing. But as discussed above, we don't really know what APIs we need for duck-typing ndarray. We would inevitably end up with a very long list of new special methods. In contrast, existing methods like ``reshape`` and ``__getitem__`` have the advantage of already being widely used/exercised by libraries that use duck arrays, and in practice, any serious duck array type is going to have to implement them anyway. Principle 5: Make it easy to do the right thing ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Making duck arrays work well is going to be a community effort. Documentation helps, but only goes so far. We want to make it easy to implement duck arrays that do the right thing. One way NumPy can help is by providing mixin classes for implementing large groups of related functionality at once. ``NDArrayOperatorsMixin`` is a good example: it allows for implementing arithmetic operators implicitly via the ``__array_ufunc__`` method. It's not complete, and we'll want more helpers like that (e.g. for reductions). (We initially thought that the importance of these mixins might be an argument for providing an array ABC, since that's the standard way to do mixins in modern Python. But in discussion around NEP 16 we realized that partial duck arrays also wanted to take advantage of these mixins in some cases, so even if we did have an array ABC then the mixins would still need some sort of separate existence. So never mind that argument.)
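As a concrete illustration of the mixin approach described under Principle 5, here is a minimal sketch; ``WrappedArray`` is a made-up example class, and only the simple case is handled (no ``out=``, no reduction methods):

```python
import numpy as np
from numpy.lib.mixins import NDArrayOperatorsMixin

class WrappedArray(NDArrayOperatorsMixin):
    """Hypothetical duck array that defers all ufuncs to a wrapped ndarray."""

    def __init__(self, value):
        self.value = np.asarray(value)

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        if method != '__call__' or kwargs.get('out') is not None:
            return NotImplemented  # keep the sketch to the simple case
        # Unwrap any WrappedArray operands, run the ufunc, wrap the result.
        unwrapped = [x.value if isinstance(x, WrappedArray) else x
                     for x in inputs]
        return WrappedArray(ufunc(*unwrapped, **kwargs))

a = WrappedArray([1.0, 2.0, 3.0])
b = a + 1      # the mixin routes this through np.add -> __array_ufunc__
c = np.exp(a)  # ufuncs also dispatch here directly
```

The mixin supplies all the ``__add__``/``__mul__``/etc. boilerplate in terms of ``__array_ufunc__``, so the author only writes the dispatch logic once.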
Tentative duck array guidelines ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ As a general rule, libraries using duck arrays should insist upon the minimum possible requirements, and libraries implementing duck arrays should provide as complete an API as possible. This will ensure maximum compatibility. For example, users should prefer to rely on ``.transpose()`` rather than ``.swapaxes()`` (which can be implemented in terms of transpose), but duck array authors should ideally implement both. If you are trying to implement a duck array, then you should strive to implement everything. You certainly need ``.shape``, ``.ndim`` and ``.dtype``, but also your dtype attribute should actually be a ``numpy.dtype`` object, weird fancy indexing edge cases should ideally work, etc. Only details related to NumPy's specific ``np.ndarray`` implementation (e.g., ``strides``, ``data``, ``view``) are explicitly out of scope. A (very) rough sketch of future plans ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The proposals discussed so far -- ``__array_ufunc__`` and some kind of ``asarray`` protocol -- are clearly necessary but not sufficient for full duck typing support. We expect the need for additional protocols to support (at least) these features: * **Concatenating** duck arrays, which would be used internally by other array combining methods like stack/vstack/hstack. The implementation of concatenate will need to be negotiated among the list of array arguments. We expect to use an ``__array_concatenate__`` protocol like ``__array_ufunc__`` instead of multiple dispatch. * **Ufunc-like functions** that currently aren't ufuncs. Many NumPy functions like median, percentile, sort, where and clip could be written as generalized ufuncs but currently aren't. Either these functions should be written as ufuncs, or we should consider adding another generic wrapper mechanism that works similarly to ufuncs but makes fewer guarantees about how the implementation is done.
* **Random number generation** with duck arrays, e.g., ``np.random.randn()``. For example, we might want to add new APIs like ``random_like()`` for generating new arrays with a matching shape *and* type -- though we'll need to look at some real examples of how these functions are used to figure out what would be helpful. * **Miscellaneous other functions** such as ``np.einsum``, ``np.zeros_like``, and ``np.broadcast_to`` that don't fall into any of the above categories. * **Checking mutability** on duck arrays, which would imply that they support assignment with ``__setitem__`` and the out argument to ufuncs. Many otherwise fine duck arrays are not easily mutable (for example, because they use some kinds of sparse or compressed storage, or are in read-only shared memory), and it turns out that frequently-used code like the default implementation of ``np.mean`` needs to check this (to decide whether it can re-use temporary arrays). We intentionally do not describe exactly how to add support for these types of duck arrays here. These will be the subject of future NEPs. Copyright --------- This document has been placed in the public domain. -- Nathaniel J. Smith -- https://vorpus.org From neshuagarwal1909 at gmail.com Tue Jul 3 05:58:07 2018 From: neshuagarwal1909 at gmail.com (Aman Agarwal) Date: Tue, 3 Jul 2018 15:28:07 +0530 Subject: [Numpy-discussion] Unable to build Numpy on Mac OSX Message-ID: Hello Numpy, Can you please help us with building the NumPy source code? I am building on Mac OS X 10.11.6. Steps to reproduce: 1. Clone the package from git numpy/numpy.git 2. Build it: $> python setup.py build_ext -i 3. Test it: $> python -c 'import numpy; numpy.test()' On running step 3 I am getting the following error: https://bpaste.net/show/b14b9e380f26 Please let me know if you need more information on this. Regards, Aman Agarwal -------------- next part -------------- An HTML attachment was scrubbed...
URL: From tom.k.cook at gmail.com Tue Jul 3 07:59:23 2018 From: tom.k.cook at gmail.com (Tom Cook) Date: Tue, 3 Jul 2018 12:59:23 +0100 Subject: [Numpy-discussion] Corrupted installation? Message-ID: I have a remote sensing platform that uses Numpy on a raspberry pi. There are about a dozen of these installed. Three days ago, the Python part of the software started crashing during startup, with this exception message: Traceback (most recent call last): File "/home/azi/board/sensor_logger.py", line 16, in from board.measurement.data import Expression, MeasurementData File "/home/azi/board/measurement/data.py", line 2, in import numpy as np File "/usr/lib/python3/dist-packages/numpy/__init__.py", line 153, in from . import add_newdocs File "/usr/lib/python3/dist-packages/numpy/add_newdocs.py", line 13, in from numpy.lib import add_newdoc File "/usr/lib/python3/dist-packages/numpy/lib/__init__.py", line 8, in from .type_check import * File "/usr/lib/python3/dist-packages/numpy/lib/type_check.py", line 11, in import numpy.core.numeric as _nx File "/usr/lib/python3/dist-packages/numpy/core/__init__.py", line 7, in from . import umath SystemError: initialization of umath raised unreported exception I'm running numpy 1.8.2 on python 3.4.2 on a Raspberry Pi running Raspbian Jessie. I realise these are somewhat old, but as the software is running in the field (and I mean literally in a field - on another continent) I'm not in a great position to update them. Can anyone give me a pointer on how to debug this? As far as I can tell, umath.cpython-34m-arm-linux-gnueabihf.so has not changed recently. In fact, as far as I can tell, nothing changed around the time this stopped working. But I guess something must have! Regards, Tom -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From gael.varoquaux at normalesup.org Tue Jul 3 08:21:53 2018 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Tue, 3 Jul 2018 14:21:53 +0200 Subject: [Numpy-discussion] PEP 574 - zero-copy pickling with out of band data In-Reply-To: References: <75eb2981-9888-bf95-13fb-6885ed9d6a3b@python.org> <20180703053504.mubpx5uwcabmeqj5@phare.normalesup.org> <20180703072005.pgzorszamyxguyf4@phare.normalesup.org> Message-ID: <20180703122153.6utsb4eiydm6jf4t@phare.normalesup.org> On Tue, Jul 03, 2018 at 09:42:08AM +0200, Andrea Gavana wrote: > I'm happy if you feel better after your tirade. Not really. I worry a lot that many users are going to be surprised when Python 2 stops being supported, which is in a couple of years. I wrote this tirade not to make me feel better, but to try to underline that the switch is happening, and more and more of these exciting new things would pop up in Python 3. Soon, new releases of projects like numpy and scikit-learn won't support Python 2 anymore, which means that they will be getting exciting features too that don't benefit Python 2 users. It is a pity that some people find themselves left behind, because Python 3 is more and more exciting, with cool asynchronous features, more robust multiprocessing, better pickling, and many other great features. I found that, given a good test suite, porting from 2 to 3 wasn't very hard. The 2 key ingredients were a good test suite, and no hand-written C bindings (Cython makes supporting both 2 and 3 really easy). My goal is not to shame, or create uneasy discussions, but more to encourage people to upgrade, at least for their core dependencies. Maybe I am not conveying the right message, or using the right tone. In which case, my apologies. I am genuinely excited about the Python3 future.
Best, Gaël From charlesr.harris at gmail.com Tue Jul 3 09:32:46 2018 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 3 Jul 2018 07:32:46 -0600 Subject: [Numpy-discussion] Corrupted installation? In-Reply-To: References: Message-ID: On Tue, Jul 3, 2018 at 5:59 AM, Tom Cook wrote: > I have a remote sensing platform that uses Numpy on a raspberry pi. There > are about a dozen of these installed. Three days ago, the Python part of > the software started crashing during startup, with this exception message: > > Traceback (most recent call last): > File "/home/azi/board/sensor_logger.py", line 16, in > from board.measurement.data import Expression, MeasurementData > File "/home/azi/board/measurement/data.py", line 2, in > import numpy as np > File "/usr/lib/python3/dist-packages/numpy/__init__.py", line 153, in > > from . import add_newdocs > File "/usr/lib/python3/dist-packages/numpy/add_newdocs.py", line 13, in > > from numpy.lib import add_newdoc > File "/usr/lib/python3/dist-packages/numpy/lib/__init__.py", line 8, in > > from .type_check import * > File "/usr/lib/python3/dist-packages/numpy/lib/type_check.py", line 11, > in > import numpy.core.numeric as _nx > File "/usr/lib/python3/dist-packages/numpy/core/__init__.py", line 7, > in > from . import umath > SystemError: initialization of umath raised unreported exception > > I'm running numpy 1.8.2 on python 3.4.2 on a Raspberry Pi running Raspbian > Jessie. I realise these are somewhat old, but as the software is running > in the field (and I mean literally in a field - on another continent) I'm > not in a great position to update them. > > Can anyone give me a pointer on how to debug this? As far as I can tell, > umath.cpython-34m-arm-linux-gnueabihf.so has not changed recently. In > fact, as far as I can tell, nothing changed around the time this stopped > working. But I guess something must have! > Well, that numpy is quite old, so we have not touched it for many years.
Did all your boards go down or just one or two? Did you update your own software? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From m.h.vankerkwijk at gmail.com Tue Jul 3 10:10:54 2018 From: m.h.vankerkwijk at gmail.com (Marten van Kerkwijk) Date: Tue, 3 Jul 2018 10:10:54 -0400 Subject: [Numpy-discussion] Fwd: Allowing broadcasting of code dimensions in generalized ufuncs In-Reply-To: References: Message-ID: Hi Nathaniel, Thanks for the detailed thoughts. On Tue, Jul 3, 2018 at 4:27 AM, Nathaniel Smith wrote: > On Sat, Jun 30, 2018 at 6:51 AM, Marten van Kerkwijk > wrote: > > Hi All, > > > > In case it was missed because people have tuned out of the thread: Matti > and > > I proposed last Tuesday to accept NEP 20 (on coming Tuesday, as per NEP > 0), > > which introduces notation for generalized ufuncs allowing fixed, flexible > > and broadcastable core dimensions. For one thing, this will allow Matti > to > > finish his work on making matmul a gufunc. > > > > See http://www.numpy.org/neps/nep-0020-gufunc-signature-enhancement.html > > So I still have some of the same concerns as before... > > For the possibly missing dimensions: matmul is really important, and > making it a gufunc solves the problem of making it overridable by duck > arrays (via __array_ufunc__). Also, it will help later when we rework > dtypes: new dtypes will be able to implement matmul by the normal > ufunc loop registration mechanism, which is much nicer than the > current system where every dtype has a special-case method just for > handling matmul. Indeed, I only became recently aware of the ->dot member of the dtype struct. Pretty ugly! > The ? proposal isn't the most elegant idea ever, but > we've been tossing around ideas for solving these problems for a > while, and so far this seems to be the least-bad one, so... sure, > let's do it. 
> Yes, I think this ugliness is the price we pay for matmul not just meaning matrix multiplication but also no-broadcast vector-matrix, matrix-vector, and vector-vector. (Of course, my after-the-fact annoyance with that design does make me more sensitive to the argument that one should not try to shoehorn gufuncs into cases they are not meant for.) > > For the fixed-size dimensions: this makes me nervous. It's aimed at a > real use case, which is a major point in its favor. But a few things > make me wary. For input dimensions, it's sugar -- the gufunc loop can > already raise an error if it doesn't like the size it gets. Can it? I would never have thought that the inner loop would need to do any checking; it is certainly not obvious from the code or from the documentation. Does the iterator itself check for errors every iteration? If so, that might be one place for a quick speed-up for ufunc.at... > For output > dimensions, it does solve a real problem. But... only part of it. It's > awkward that right now you only have a few limited ways to choose > output dimensions, but this just extends the list of special cases, > rather than solving the underlying problem. For example, > 'np.linalg.qr' needs a much more generic mechanism to choose output > shape, and parametrized dtypes will need a much more generic mechanism > to choose output dtype, so we're definitely going to end up with some > phase where arbitrary code gets to describe the output array. I think this is a much rarer case, which will indeed need some type of hook no matter what. (I have been thinking about solving that by moving the call to the type resolver earlier. That one gets passed `op`, so it can create an output array if it doesn't exist; it doesn't need any of the sizes.) > Are we > going to look back on fixed-size dimensions as a quirky, redundant > thing? > For this particular case, I find the signature so much clearer that that in itself is a reason to do it (readability counts and all that).
> > Also, as currently proposed, it seems to rule out the possibility of > using name-based axis specification in the future, right? (See > https://github.com/numpy/numpy/pull/8819#issuecomment-366329325) Are > we sure we want to do that? > I think it only excludes having the choice of keying the dict with the default axis number *or* with its dimension name, which I think would in fact be a good idea: if one uses a dict to describe axes entries, the keys should be the names of the axes, where name can be one of the fixed numbers. (Actually, strictly, since the default axis number is always negative, one can even have both. But that would be awful.) Should add that I don't think the scheme will work all that well anyway - there are quite a few cases where one duplicates a name (say, a square matrix (n,n)), for which keying by "n" would be less than useful. > If everyone else is comfortable with all these things then I won't > block it though. > > For broadcasting: I'm sorry, but I think I'm -1 on this. I feel like > it falls into a classic anti-pattern in numpy, where someone sees a > cool thing they could do and then goes looking for problems to justify > it. (A red flag for me is that "it's easy to implement" keeps being > mentioned as justification for doing it.) The all_equal and > weighted_mean examples both feel pretty artificial -- traditionally > we've always implemented these kinds of functions as regular functions > that use (g)ufuncs internally, and it's worked fine (cf. np.allclose, > ndarray.mean). In fact in some sense the whole point of numpy is to > help people implement functions like this, without having to write > their own gufuncs. Is there some reason these need to be gufuncs? And > if there is, are these the only things that need to be gufuncs, or is > there a broader class we're missing? The design just doesn't feel > well-justified to me. > Those are all fair points. 
For all_equal one cannot really write a separate function easily given the absence of short-circuiting. But possibly that just argues for the need to short-circuit... > And in the past, when we've implemented things like this, where the > use cases are thin but hey why not it's easy to do, it's ended up > causing two problems: first people start trying to force it into cases > where it doesn't quite work, which makes everyone unhappy... and then > when we eventually do try to solve the problem properly, we end up > having to do elaborate workarounds to keep the old not-quite-working > use cases from breaking. > Given my sentiments about the multiple meanings of `@`, I'm sensitive to this argument. But as Matti pointed out, we do not have to accept the whole NEP. Indeed, the broadcasting is in a separate PR. > > I'm pretty sure we're going to end up rewriting most of the ufunc code > over the next few years as we ramp up duck array and user dtype > support, and it's already going to be very difficult, both to design > in the first place and then to implement while carefully keeping shims > to keep all the old stuff working. Adding features has a very real > cost, because it adds extra constraints that all this future work will > have to work around. I don't think this meets the bar. > I think by now it is clear that moving incrementally is the way to go; the ufunc code is in fact being rewritten, if slowly. > By the way, I also think we're getting well past the point where we > should be switching from a string-based DSL to a more structured > representation. (This is another trap that numpy tends to fall into... > the dtype "language" is also a major offender.) This isn't really a > commentary on any part of this in particular, but just something that > I've been noticing and wanted to mention :-). > Yet, dtype('(3,4)f8') is really clear, unlike all the other forms... It is similar to the string formatting mini-language. Anyway, that is a bit off topic.
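For readers who haven't met the sub-array form Marten mentions: the string spelling and the tuple spelling below construct the same dtype (this is standard NumPy behavior, shown purely for illustration):

```python
import numpy as np

# String mini-language: each element is a 3x4 block of float64.
dt_string = np.dtype('(3,4)f8')

# The same dtype spelled as a structured (base_type, shape) tuple.
dt_tuple = np.dtype((np.float64, (3, 4)))

# When used for an array, the sub-array dimensions are absorbed
# into the resulting array's shape.
arr = np.zeros(2, dtype=dt_string)  # shape becomes (2, 3, 4)
```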
Overall, would one way to move forward be to merge the first PR (flexible and frozen) and defer the broadcastable dimensions? All the best, Marten p.s. I'm amused that the broadcastable dimensions were in fact the default originally. At some point, I should try to find out why that default was changed. -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Tue Jul 3 12:00:44 2018 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 3 Jul 2018 09:00:44 -0700 Subject: [Numpy-discussion] Corrupted installation? In-Reply-To: References: Message-ID: On Tue, Jul 3, 2018, 05:01 Tom Cook wrote: > I have a remote sensing platform that uses Numpy on a raspberry pi. There > are about a dozen of these installed. Three days ago, the Python part of > the software started crashing during startup, with this exception message: > > Traceback (most recent call last): > File "/home/azi/board/sensor_logger.py", line 16, in > from board.measurement.data import Expression, MeasurementData > File "/home/azi/board/measurement/data.py", line 2, in > import numpy as np > File "/usr/lib/python3/dist-packages/numpy/__init__.py", line 153, in > > from . import add_newdocs > File "/usr/lib/python3/dist-packages/numpy/add_newdocs.py", line 13, in > > from numpy.lib import add_newdoc > File "/usr/lib/python3/dist-packages/numpy/lib/__init__.py", line 8, in > > from .type_check import * > File "/usr/lib/python3/dist-packages/numpy/lib/type_check.py", line 11, > in > import numpy.core.numeric as _nx > File "/usr/lib/python3/dist-packages/numpy/core/__init__.py", line 7, in > > from . 
import umath > SystemError: initialization of umath raised unreported exception > At the C level, reporting an exception to the python interpreter involves two steps: (1) stashing the exception object in a special global variable where the interpreter knows to look for it, (2) returning a special value (usually -1 or NULL) that tells the interpreter an exception has been raised and it should go look at that global variable for more details. I *think* that error means that the umath initialization routine did step (1), but not step (2), and the interpreter is cranky because it doesn't know how to interpret this. (Unfortunately I don't have a copy of py34 on my phone to confirm...) If that's correct, though, then probably the next thing you want to do is figure out what exception is being set, since that's likely to be the real error. It's annoying that python doesn't tell you. It should be possible to dig around with gdb and figure it out, but will require some knowledge of the python C API and gdb and I don't know the recipe offhand. Are you having the same problem on all the installations? -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From antoine at python.org Tue Jul 3 12:07:09 2018 From: antoine at python.org (Antoine Pitrou) Date: Tue, 3 Jul 2018 18:07:09 +0200 Subject: [Numpy-discussion] PEP 574 - zero-copy pickling with out of band data In-Reply-To: References: Message-ID: <6d17281f-a569-fc5e-cb61-dc18e670ac4b@python.org> On Mon, 2 Jul 2018 17:16:00 -0600 Charles R Harris wrote: > Maybe somewhat off topic, but we have had trouble with a 2 GiB limit on > file writes on OS X. See https://github.com/numpy/numpy/issues/3858. Does > your implementation work around that? No, it's not the same topic at all. I'd recommend perhaps pinging on the python-dev PR. Regards Antoine.
From shoyer at gmail.com Tue Jul 3 12:11:45 2018 From: shoyer at gmail.com (Stephan Hoyer) Date: Tue, 3 Jul 2018 09:11:45 -0700 Subject: [Numpy-discussion] Fwd: Allowing broadcasting of code dimensions in generalized ufuncs In-Reply-To: References: Message-ID: On Tue, Jul 3, 2018 at 7:13 AM Marten van Kerkwijk < m.h.vankerkwijk at gmail.com> wrote: > Overall, would one way to move forward be to merge the first PR (flexible > and frozen) and defer the broadcastable dimensions? > This would have my support. I have similar misgivings about broadcastable dimensions to those raised by Nathaniel. In particular, I wonder if there is some way to make use of this functionality internally in NumPy for functions like all_equal() without exposing it as part of the external gufunc API. -------------- next part -------------- An HTML attachment was scrubbed... URL: From m.h.vankerkwijk at gmail.com Tue Jul 3 13:17:15 2018 From: m.h.vankerkwijk at gmail.com (Marten van Kerkwijk) Date: Tue, 3 Jul 2018 13:17:15 -0400 Subject: [Numpy-discussion] Fwd: Allowing broadcasting of code dimensions in generalized ufuncs In-Reply-To: References: Message-ID: On Tue, Jul 3, 2018 at 12:11 PM, Stephan Hoyer wrote: > On Tue, Jul 3, 2018 at 7:13 AM Marten van Kerkwijk < > m.h.vankerkwijk at gmail.com> wrote: > >> Overall, would one way to move forward be to merge the first PR (flexible >> and frozen) and defer the broadcastable dimensions? >> > > This would have my support. > > I have similar misgivings about broadcastable dimensions to those raised > by Nathaniel. > > In particular, I wonder if there is some way to make use of this > functionality internally in NumPy for functions like all_equal() without > exposing it as part of the external gufunc API. > > OK, so let me explicitly ask whether there are any objections to going forward with flexible and frozen dimensions, but deferring on broadcastable ones until more compelling use cases have been identified? Thanks, Marten p.s. 
I adjusted the "acceptance PR" to reflect this: https://github.com/numpy/numpy/pull/11429 -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Tue Jul 3 17:34:30 2018 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 3 Jul 2018 14:34:30 -0700 Subject: [Numpy-discussion] Proposal to accept NEP 19: Random Number Generator Policy In-Reply-To: References: Message-ID: There has been one clarification to the text: https://github.com/numpy/numpy/pull/11488 For the legacy RandomState, we will *not* be fixing correctness bugs if doing so would break the stream; this is in contrast with the current policy where we can fix correctness bugs. In the post-NEP world, RandomState's purpose will be to provide across-version-stable numbers for unit tests, so stability is primary. I don't expect to see many more bugs, except in arcane corners of the parameter spaces, like in #11475, which can be avoided, and we can introduce warnings to help users avoid them. https://github.com/numpy/numpy/pull/11475 -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Fri Jul 6 01:54:55 2018 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Thu, 5 Jul 2018 22:54:55 -0700 Subject: [Numpy-discussion] Fwd: [NumFOCUS Projects] NumFOCUS Summit Registration In-Reply-To: References: Message-ID: Hi all, In September the NumFOCUS Summit for 2018 will be held in New York. NumPy can send one representative (or a couple, but costs are only covered for one person). We opened this opportunity up to members of the Steering Council first, however it doesn't look like anyone is in a position to attend. Therefore we'd now like to ask if anyone else would like to attend. Rules of the game: you do need a strong affiliation with the project (e.g.
you have commit rights, have been contributing for a while, have been hired to work on NumPy, etc.); in case multiple people are interested then the NumPy Steering Council will make a decision. If you're interested, please reply here or off-list. Cheers, Ralf ---------- Forwarded message ---------- From: Leah Silen Date: Wed, Jun 27, 2018 at 4:35 PM Subject: [NumFOCUS Projects] NumFOCUS Summit Registration To: projects at numfocus.org Hi all, We'd like to share some additional details on this year's NumFOCUS Project Summit. The Sustainability Workshop portion of the Summit (Sept 22-23) is an opportunity for projects to focus on creating roadmaps and developing strategies for long-term sustainability. This year we would like your help in generating the content and driving the direction of the workshop. We'll be reaching out for your suggestions on specific areas you would like to see addressed. The Project Forum for Core Developers and Critical Users (Sep 24-25) is a meeting, open to the public, for critical users to directly interact with the core developers of NumFOCUS projects. The goal is to connect you with resources to advance your community roadmaps while beginning a dialogue with key stakeholders. The event will be limited to 150 attendees. Again, we will be giving participants (both projects and users) an opportunity to drive the content and agenda. NumFOCUS will cover travel expenses for one representative per sponsored project. This includes an airfare allotment of $600 and hotel accommodations for 4 nights in New York City. The airfare amount is an average; we understand international travel costs will exceed this amount. Please contact us if you would like to have more than one rep attend. When choosing who will attend, we ask that you prioritize members of your leadership team or steering committee as this will help your project get the most benefit out of the Summit experience.
In order to book hotel rooms, it is urgent that we know who will be attending as soon as possible. We've set up the following site for project leads to register: Once again, if you're interested in working on any portion of the summit, we would love your input and leadership in setting the agenda. Please reach out to summit at numfocus.org if you would like to be involved. NumFOCUS Summit Committee - Andy Terrel, James Powell, Leah Silen, Gina Helfrich, Jim Weiss --- Leah Silen Executive Director, NumFOCUS leah at numfocus.org 512-222-5449 -- You received this message because you are subscribed to the Google Groups "Fiscally Sponsored Project Representatives" group. To unsubscribe from this group and stop receiving emails from it, send an email to projects+unsubscribe at numfocus.org. To post to this group, send email to projects at numfocus.org. Visit this group at https://groups.google.com/a/numfocus.org/group/projects/ . To view this discussion on the web visit https://groups.google.com/a/ numfocus.org/d/msgid/projects/CALWv6uLTL1EFFh%2BsaV3YTGVtfVUm01ey4GTnY1SP% 3DLbYNCS8ZA%40mail.gmail.com . -------------- next part -------------- An HTML attachment was scrubbed... URL: From manuel.schoelling at gmx.de Sun Jul 8 05:19:25 2018 From: manuel.schoelling at gmx.de (Manuel =?ISO-8859-1?Q?Sch=F6lling?=) Date: Sun, 08 Jul 2018 11:19:25 +0200 Subject: [Numpy-discussion] Allow moveaxis() to take strings as axis order Message-ID: <871213c48d24f739c6507c7df96c1dd2f96851e5.camel@gmx.de> Hi, I have opened a pull request [1] for enhancing moveaxis() a bit, so it can take strings as arguments to permute the axis order. The new feature is best described by a short example.
All these calls to moveaxis() will do the same thing:

A = np.zeros((0, 1, 2, 3))

# reverse order
np.moveaxis(A, [0, 1, 2, 3], [3, 2, 1, 0])
np.moveaxis(A, 'ijkl', 'lkji')
np.moveaxis(A, 'wxyz', 'zyxw')

# more complicated axis permutation
np.moveaxis(A, [0, 1, 2, 3], [3, 1, 2, 0])
np.moveaxis(A, 'ijkl', 'ljki')
np.moveaxis(A, 'wxyz', 'zxyw')

It has been mentioned by Eric Wieser that np.einsum('ijkl->lkji', A) basically does the same thing, although to a user who is not familiar with Einstein's sum convention it is not really obvious what this call does, because the function name suggests a 'sum' operation while it actually only permutes axes. It was also mentioned that the Zen of Python says

> There should be one-- and preferably only one --obvious way to do it.

and if moveaxis() supports multiple ways to perform its operation, there could be a cognitive bias if the developer is only used to e.g. the string arguments and suddenly sees the array arguments (or vice versa). Despite these arguments I think the string arguments make calls to moveaxis() much more obvious than integer arguments, especially if ndim > 3. How does the mailing list feel about this pull request? Bye, Manuel [1] https://github.com/numpy/numpy/pull/11504

From sandro.tosi at gmail.com Sun Jul 8 22:35:48 2018 From: sandro.tosi at gmail.com (Sandro Tosi) Date: Sun, 8 Jul 2018 22:35:48 -0400 Subject: [Numpy-discussion] NumPy 1.15.0rc1 released In-Reply-To: References: Message-ID: > The Python versions supported by this release are 2.7, 3.4-3.6. The wheels are linked with > OpenBLAS 3.0, which should fix some of the linalg problems reported for NumPy 1.14, > and the source archives were created using Cython 0.28.2 and should work with the upcoming > Python 3.7. just checking: in Debian we're currently linking against libblas/liblapack (as available from http://www.netlib.org/lapack/) - should we start investigating switching to OpenBLAS? Thanks!
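[Editor's note on Manuel's moveaxis proposal above: the proposed string spec reduces to the integer source/destination lists the current moveaxis() already accepts. The helper below is a hypothetical illustration only — axes_from_strings is an invented name, not what PR 11504 actually implements.]

```python
def axes_from_strings(src, dst):
    """Translate moveaxis(A, 'ijkl', 'ljki')-style string specs into the
    integer source/destination lists the existing API accepts."""
    if sorted(src) != sorted(dst):
        raise ValueError("source and destination must use the same labels")
    # Axis i of the input carries label src[i]; it moves to wherever that
    # label sits in the destination string.
    source = list(range(len(src)))
    destination = [dst.index(label) for label in src]
    return source, destination

# reverse order: 'ijkl' -> 'lkji' matches moveaxis(A, [0,1,2,3], [3,2,1,0])
assert axes_from_strings('ijkl', 'lkji') == ([0, 1, 2, 3], [3, 2, 1, 0])
# more complicated permutation: 'wxyz' -> 'zxyw'
assert axes_from_strings('wxyz', 'zxyw') == ([0, 1, 2, 3], [3, 1, 2, 0])
```

The translations reproduce exactly the integer-list equivalences given in Manuel's example.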
-- Sandro "morph" Tosi My website: http://sandrotosi.me/ Me at Debian: http://wiki.debian.org/SandroTosi G+: https://plus.google.com/u/0/+SandroTosi From ralf.gommers at gmail.com Mon Jul 9 01:13:02 2018 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sun, 8 Jul 2018 22:13:02 -0700 Subject: [Numpy-discussion] NumPy 1.15.0rc1 released In-Reply-To: References: Message-ID: On Sun, Jul 8, 2018 at 7:35 PM, Sandro Tosi wrote: > > The Python versions supported by this release are 2.7, 3.4-3.6. The > wheels are linked with > > OpenBLAS 3.0, which should fix some of the linalg problems reported for > NumPy 1.14, > > and the source archives were created using Cython 0.28.2 and should work > with the upcoming > > Python 3.7. > > just checking: in Debian we're currently linking against > libblas/liblapack (as available from http://www.netlib.org/lapack/) - > should we start investigating switching to OpenBLAS? > Yes I'd say so, the performance difference can be very large. I always thought Debian was linking against ATLAS - just because it's available and that's how we always recommended building on Debian/Linux from source. Plain Netlib BLAS/LAPACK is never recommended. Ralf > > Thanks! > > -- > Sandro "morph" Tosi > My website: http://sandrotosi.me/ > Me at Debian: http://wiki.debian.org/SandroTosi > G+: https://plus.google.com/u/0/+SandroTosi > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From opossumnano at gmail.com Mon Jul 9 05:22:34 2018 From: opossumnano at gmail.com (Tiziano Zito) Date: Mon, 9 Jul 2018 11:22:34 +0200 Subject: [Numpy-discussion] NumPy 1.15.0rc1 released In-Reply-To: References: Message-ID: <20180709092233.k77rt2vu55awkvfu@multivac> On Sun 08 Jul, 22:35 -0400, Sandro Tosi wrote: >> The Python versions supported by this release are 2.7, 3.4-3.6. The wheels are linked with >> OpenBLAS 3.0, which should fix some of the linalg problems reported for NumPy 1.14, >> and the source archives were created using Cython 0.28.2 and should work with the upcoming >> Python 3.7. > >just checking: in Debian we're currently linking against >libblas/liblapack (as available from http://www.netlib.org/lapack/) - >should we start investigating switching to OpenBLAS?

Well, as far as I can tell numpy in Debian is built using the /etc/alternatives method, i.e. you can choose which BLAS implementation to use at run time if more than one implementation is installed. In my case, it links to openblas already:

"""
$ ldd /usr/lib/python3/dist-packages/numpy/core/multiarray.cpython-36m-x86_64-linux-gnu.so
    linux-vdso.so.1 (0x00007ffe3bf7b000)
    libblas.so.3 => /usr/lib/x86_64-linux-gnu/libblas.so.3 (0x00007f23df471000)
    libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f23debaa000)
    libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f23de98c000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f23de5d2000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f23df2e9000)
    libopenblas.so.0 => /usr/lib/x86_64-linux-gnu/libopenblas.so.0 (0x00007f23dc35f000)
    libgfortran.so.4 => /usr/lib/x86_64-linux-gnu/libgfortran.so.4 (0x00007f23dbf8b000)
    libquadmath.so.0 => /usr/lib/x86_64-linux-gnu/libquadmath.so.0 (0x00007f23dbd4b000)
    libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007f23dbb2d000)
    libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f23db915000)
$ ls -l /usr/lib/x86_64-linux-gnu/libblas.so.3
lrwxrwxrwx 1 root root 47 Sep 11 2017 /usr/lib/x86_64-linux-gnu/libblas.so.3 -> /etc/alternatives/libblas.so.3-x86_64-linux-gnu
$ ls -l /etc/alternatives/libblas.so.3-x86_64-linux-gnu
lrwxrwxrwx 1 root root 47 Sep 11 2017 /etc/alternatives/libblas.so.3-x86_64-linux-gnu -> /usr/lib/x86_64-linux-gnu/openblas/libblas.so.3
$ update-alternatives --display libblas.so.3-x86_64-linux-gnu
libblas.so.3-x86_64-linux-gnu - auto mode
  link best version is /usr/lib/x86_64-linux-gnu/openblas/libblas.so.3
  link currently points to /usr/lib/x86_64-linux-gnu/openblas/libblas.so.3
  link libblas.so.3-x86_64-linux-gnu is /usr/lib/x86_64-linux-gnu/libblas.so.3
/usr/lib/x86_64-linux-gnu/atlas/libblas.so.3 - priority 35
/usr/lib/x86_64-linux-gnu/blas/libblas.so.3 - priority 10
/usr/lib/x86_64-linux-gnu/openblas/libblas.so.3 - priority 40
"""

So, it seems to me there's no problem to solve in Debian? Ciao! Tiziano

From jeffsaremi at hotmail.com Mon Jul 9 12:48:08 2018 From: jeffsaremi at hotmail.com (jeff saremi) Date: Mon, 9 Jul 2018 16:48:08 +0000 Subject: [Numpy-discussion] Looking for description/insight/documentation on matmul Message-ID: Is there any resource available or anyone who's able to describe matmul operation of matrices when n > 2? The only description i can find is: "If either argument is N-D, N > 2, it is treated as a stack of matrices residing in the last two indexes and broadcast accordingly." which is very cryptic to me. Could someone break this down please? when a [2 3 5 6] is multiplied by a [7 8 9] what are the resulting dimensions? is there one answer to that? Is it deterministic? What does "residing in the last two indices" mean? What is broadcast and where? thanks jeff -------------- next part -------------- An HTML attachment was scrubbed...
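[Editor's note on Jeff's matmul question above: the documented rule can be sketched without NumPy at all. The following hand-rolled shape computation is an illustration of the rule as described in the docs — not NumPy's actual implementation — restricted to the case where both operands have at least two dimensions; matmul_shape is an invented name.]

```python
from itertools import zip_longest

def matmul_shape(a, b):
    """Output shape of a @ b for two >=2-D shapes: the last two axes are
    the matrix ("core") dimensions, everything before them broadcasts
    elementwise like ordinary NumPy broadcasting."""
    if len(a) < 2 or len(b) < 2:
        raise ValueError("this sketch covers only the >=2-D case")
    (n, m), (m2, p) = a[-2:], b[-2:]
    if m != m2:
        raise ValueError(f"core dimensions mismatch: {m} != {m2}")
    # Broadcast the leading ("stack of matrices") dimensions right-to-left;
    # missing dimensions behave as 1.
    batch = []
    for x, y in zip_longest(reversed(a[:-2]), reversed(b[:-2]), fillvalue=1):
        if x != y and 1 not in (x, y):
            raise ValueError(f"cannot broadcast {x} with {y}")
        batch.append(max(x, y))
    return tuple(reversed(batch)) + (n, p)

assert matmul_shape((2, 3, 4, 5, 6), (2, 3, 4, 6, 7)) == (2, 3, 4, 5, 7)
# A 2-D second operand broadcasts against the whole stack:
assert matmul_shape((2, 3, 4, 5, 6), (6, 7)) == (2, 3, 4, 5, 7)
# Jeff's example fails: the core dimensions 6 and 8 do not match.
try:
    matmul_shape((2, 3, 5, 6), (7, 8, 9))
except ValueError as e:
    assert "mismatch" in str(e)
```

So a [2 3 5 6] array cannot be matmul'd with a [7 8 9] array at all, which is also the answer Matti gives below.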
URL: From shoyer at gmail.com Mon Jul 9 13:50:21 2018 From: shoyer at gmail.com (Stephan Hoyer) Date: Mon, 9 Jul 2018 10:50:21 -0700 Subject: [Numpy-discussion] Looking for description/insight/documentation on matmul In-Reply-To: References: Message-ID: Hi Jeff, I think PEP 465 would be the definitive reference here. See the section on "Intended usage details" in https://www.python.org/dev/peps/pep-0465/ Cheers, Stephan On Mon, Jul 9, 2018 at 9:48 AM jeff saremi wrote: > Is there any resource available or anyone who's able to describe matmul > operation of matrices when n > 2? > > The only description i can find is: "If either argument is N-D, N > 2, it > is treated as a stack of matrices residing in the last two indexes and > broadcast accordingly." which is very cryptic to me. > Could someone break this down please? > when a [2 3 5 6] is multiplied by a [7 8 9] what are the resulting > dimensions? is there one answer to that? Is it deterministic? > What does "residing in the last two indices" mean? What is broadcast and > where? > thanks > jeff > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From matti.picus at gmail.com Mon Jul 9 13:54:50 2018 From: matti.picus at gmail.com (Matti Picus) Date: Mon, 9 Jul 2018 10:54:50 -0700 Subject: [Numpy-discussion] Looking for description/insight/documentation on matmul In-Reply-To: References: Message-ID: <32ba25c2-3b23-f9b7-3761-0efe527ca3d5@gmail.com> On 09/07/18 09:48, jeff saremi wrote: > Is there any resource available or anyone who's able to describe > matmul operation of matrices when n > 2? > > The only description i can find is: "If either argument is N-D, N > 2, > it is treated as a stack of matrices residing in the last two indexes > and broadcast accordingly." which is very cryptic to me. 
> Could someone break this down please? > when a [2 3 5 6] is multiplied by a [7 8 9] what are the resulting > dimensions? is there one answer to that? Is it deterministic? > What does "residing in the last two indices" mean? What is broadcast > and where? > thanks > jeff > You could do np.matmul(np.ones((2, 3, 4, 5, 6)), np.ones((2, 3, 4, 6, 7))).shape which yields (2, 3, 4, 5, 7). When ndim >= 2 in both operands, matmul uses the last two dimensions as (..., n, m) @ (..., m, p) -> (..., n, p). Note the repeating "m", so your example would not work: n1=5, m1=6 in the first operand and m2=8, p2=9 in the second so m1 != m2. The "broadcast" refers only to the "..." dimensions, if in either of the operands you replace the 2 or 3 or 4 with 1 then that operand will broadcast (repeat itself) across that dimension to fit the other operand. Also if one of the three first dimensions is missing in one of the operands it will broadcast. When ndim < 2 for one of the operands only, it will be interpreted as "m", and the other dimension "n" or "p" will not appear on the output, so the signature is (..., n, m),(m) -> (..., n) or (m),(..., m, p)->(..., p) When ndim < 2 for both of the operands, it is the same as a dot product and will produce a scalar. You didn't ask, but I will complete the picture: np.dot is different for the case of n>=2. The result will extend (combine? broadcast across?) both sets of ... dimensions, so np.dot(np.ones((2,3,4,5,6)), np.ones((8, 9, 6, 7))).shape which yields (2, 3, 4, 5, 8, 9, 7). The (2, 3, 4, 5) dimensions are followed by (8, 9) Matti From charlesr.harris at gmail.com Mon Jul 9 18:08:12 2018 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 9 Jul 2018 16:08:12 -0600 Subject: [Numpy-discussion] NumPy 1.15.0rc2 released. Message-ID: Hi All, On behalf of the NumPy team I'm pleased to announce the release of NumPy 1.15.0rc2.
This release has an unusual number of cleanups, many deprecations of old functions, and improvements to many existing functions. A total of 435 pull requests were merged for this release, please look at the release notes for details. Some highlights are:

- NumPy has switched to pytest for testing.
- A new `numpy.printoptions` context manager.
- Many improvements to the histogram functions.
- Support for unicode field names in python 2.7.
- Improved support for PyPy.
- Fixes and improvements to `numpy.einsum`.

The Python versions supported by this release are 2.7, 3.4-3.7. The wheels are linked with OpenBLAS v0.3.0, which should fix some of the linalg problems reported for NumPy 1.14. Wheels for this release can be downloaded from PyPI, source archives are available from Github. A total of 131 people contributed to this release. People with a "+" by their names contributed a patch for the first time. - Aaron Critchley + - Aarthi + - Aarthi Agurusa + - Alex Thomas + - Alexander Belopolsky - Allan Haldane - Anas Khan + - Andras Deak - Andrey Portnoy + - Anna Chiara - Aurelien Jarno + - Baurzhan Muftakhidinov - Berend Kapelle + - Bernhard M. Wiedemann - Bjoern Thiel + - Bob Eldering - Cenny Wenner + - Charles Harris - ChloeColeongco + - Chris Billington + - Christopher + - Chun-Wei Yuan + - Claudio Freire + - Daniel Smith - Darcy Meyer + - David Abdurachmanov + - David Freese - Deepak Kumar Gouda + - Dennis Weyland + - Derrick Williams + - Dmitriy Shalyga + - Eric Cousineau + - Eric Larson - Eric Wieser - Evgeni Burovski - Frederick Lefebvre + - Gaspar Karm + - Geoffrey Irving - Gerhard Hobler + - Gerrit Holl - Guo Ci + - Hameer Abbasi + - Han Shen - Hiroyuki V.
Yamazaki + - Hong Xu - Ihor Melnyk + - Jaime Fernandez - Jake VanderPlas + - James Tocknell + - Jarrod Millman - Jeff VanOss + - John Kirkham - Jonas Rauber + - Jonathan March + - Joseph Fox-Rabinovitz - Julian Taylor - Junjie Bai + - Juris Bogusevs + - Jörg Döpfert - Kenichi Maehashi + - Kevin Sheppard - Kimikazu Kato + - Kirit Thadaka + - Kritika Jalan + - Lakshay Garg + - Lars G + - Licht Takeuchi - Louis Potok + - Luke Zoltan Kelley - MSeifert04 + - Mads R. B. Kristensen + - Malcolm Smith + - Mark Harfouche + - Marten H. van Kerkwijk + - Marten van Kerkwijk - Matheus Vieira Portela + - Mathieu Lamarre - Mathieu Sornay + - Matthew Brett - Matthew Rocklin + - Matthias Bussonnier - Matti Picus - Michael Droettboom - Miguel Sánchez de León Peque + - Mike Toews + - Milo + - Nathaniel J. Smith - Nelle Varoquaux - Nicholas Nadeau, P.Eng., AVS + - Nick Minkyu Lee + - Nikita + - Nikita Kartashov + - Nils Becker + - Oleg Zabluda - Orestis Floros + - Pat Gunn + - Paul van Mulbregt + - Pauli Virtanen - Pierre Chanial + - Ralf Gommers - Raunak Shah + - Robert Kern - Russell Keith-Magee + - Ryan Soklaski + - Samuel Jackson + - Sebastian Berg - Siavash Eliasi + - Simon Conseil - Simon Gibbons - Stefan Krah + - Stefan van der Walt - Stephan Hoyer - Subhendu + - Subhendu Ranjan Mishra + - Tai-Lin Wu + - Tobias Fischer + - Toshiki Kataoka + - Tyler Reddy + - Unknown + - Varun Nayyar - Victor Rodriguez + - Warren Weckesser - William D. Irons + - Zane Bradley + - fo40225 + - lapack_lite code generator + - lumbric + - luzpaz + - mamrehn + - tynn + - xoviat Cheers Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From nathan12343 at gmail.com Mon Jul 9 18:39:31 2018 From: nathan12343 at gmail.com (Nathan Goldbaum) Date: Mon, 9 Jul 2018 17:39:31 -0500 Subject: [Numpy-discussion] NumPy 1.15.0rc2 released. In-Reply-To: References: Message-ID: Hi Chuck, Is there a summary of the differences with respect to rc1 somewhere?
Nathan On Mon, Jul 9, 2018 at 5:08 PM Charles R Harris wrote: > Hi All, > > On behalf of the NumPy team I'm pleased to announce the release of NumPy > 1.15.0rc2. > This release has an unusual number of cleanups, many deprecations of old > functions, > and improvements to many existing functions. A total of 435 pull reguests > were merged > for this release, please look at the release notes > for details. Some > highlights are: > > - NumPy has switched to pytest for testing. > - A new `numpy.printoptions` context manager. > - Many improvements to the histogram functions. > - Support for unicode field names in python 2.7. > - Improved support for PyPy. > - Fixes and improvements to `numpy.einsum`. > > The Python versions supported by this release are 2.7, 3.4-3.7. The > wheels are linked with > OpenBLAS v0.3.0, which should fix some of the linalg problems reported for > NumPy 1.14. > > Wheels for this release can be downloaded from PyPI > , source archives are available > from Github . > > A total of 131 people contributed to this release. People with a "+" by > their > names contributed a patch for the first time. > > > - Aaron Critchley + > - Aarthi + > - Aarthi Agurusa + > - Alex Thomas + > - Alexander Belopolsky > - Allan Haldane > - Anas Khan + > - Andras Deak > - Andrey Portnoy + > - Anna Chiara > - Aurelien Jarno + > - Baurzhan Muftakhidinov > - Berend Kapelle + > - Bernhard M. Wiedemann > - Bjoern Thiel + > - Bob Eldering > - Cenny Wenner + > - Charles Harris > - ChloeColeongco + > - Chris Billington + > - Christopher + > - Chun-Wei Yuan + > - Claudio Freire + > - Daniel Smith > - Darcy Meyer + > - David Abdurachmanov + > - David Freese > - Deepak Kumar Gouda + > - Dennis Weyland + > - Derrick Williams + > - Dmitriy Shalyga + > - Eric Cousineau + > - Eric Larson > - Eric Wieser > - Evgeni Burovski > - Frederick Lefebvre + > - Gaspar Karm + > - Geoffrey Irving > - Gerhard Hobler + > - Gerrit Holl > - Guo Ci + > - Hameer Abbasi + > - Han Shen > - Hiroyuki V. 
Yamazaki + > - Hong Xu > - Ihor Melnyk + > - Jaime Fernandez > - Jake VanderPlas + > - James Tocknell + > - Jarrod Millman > - Jeff VanOss + > - John Kirkham > - Jonas Rauber + > - Jonathan March + > - Joseph Fox-Rabinovitz > - Julian Taylor > - Junjie Bai + > - Juris Bogusevs + > - J?rg D?pfert > - Kenichi Maehashi + > - Kevin Sheppard > - Kimikazu Kato + > - Kirit Thadaka + > - Kritika Jalan + > - Lakshay Garg + > - Lars G + > - Licht Takeuchi > - Louis Potok + > - Luke Zoltan Kelley > - MSeifert04 + > - Mads R. B. Kristensen + > - Malcolm Smith + > - Mark Harfouche + > - Marten H. van Kerkwijk + > - Marten van Kerkwijk > - Matheus Vieira Portela + > - Mathieu Lamarre > - Mathieu Sornay + > - Matthew Brett > - Matthew Rocklin + > - Matthias Bussonnier > - Matti Picus > - Michael Droettboom > - Miguel S?nchez de Le?n Peque + > - Mike Toews + > - Milo + > - Nathaniel J. Smith > - Nelle Varoquaux > - Nicholas Nadeau, P.Eng., AVS + > - Nick Minkyu Lee + > - Nikita + > - Nikita Kartashov + > - Nils Becker + > - Oleg Zabluda > - Orestis Floros + > - Pat Gunn + > - Paul van Mulbregt + > - Pauli Virtanen > - Pierre Chanial + > - Ralf Gommers > - Raunak Shah + > - Robert Kern > - Russell Keith-Magee + > - Ryan Soklaski + > - Samuel Jackson + > - Sebastian Berg > - Siavash Eliasi + > - Simon Conseil > - Simon Gibbons > - Stefan Krah + > - Stefan van der Walt > - Stephan Hoyer > - Subhendu + > - Subhendu Ranjan Mishra + > - Tai-Lin Wu + > - Tobias Fischer + > - Toshiki Kataoka + > - Tyler Reddy + > - Unknown + > - Varun Nayyar > - Victor Rodriguez + > - Warren Weckesser > - William D. 
Irons + > - Zane Bradley + > - fo40225 + > - lapack_lite code generator + > - lumbric + > - luzpaz + > - mamrehn + > - tynn + > - xoviat > > Cheers > > Chuck > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Mon Jul 9 18:50:29 2018 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 9 Jul 2018 16:50:29 -0600 Subject: [Numpy-discussion] NumPy 1.15.0rc2 released. In-Reply-To: References: Message-ID: On Mon, Jul 9, 2018 at 4:39 PM, Nathan Goldbaum wrote: > Hi Chuck, > > Is there a summary of the differences with respect to rc1 somewhere? > No, but you can look at the changelog to see what went in, and compare it to the previous changelog to see what is additional. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From m.h.vankerkwijk at gmail.com Mon Jul 9 19:38:18 2018 From: m.h.vankerkwijk at gmail.com (Marten van Kerkwijk) Date: Mon, 9 Jul 2018 16:38:18 -0700 Subject: [Numpy-discussion] NumPy 1.15.0rc2 released. In-Reply-To: References: Message-ID: Maybe not too unhelpful: mhvk at weasel:~/packages/numpy$ git log v1.15.0rc1..v1.15.0rc2 commit ccc68b80305ff5b363d10f6e905fb4e5276a8adb (HEAD, tag: v1.15.0rc2) Author: Charles Harris Date: Mon Jul 9 10:35:40 2018 -0600 REL: NumPy 1.15.0rc2 release. commit 3f6e8dc0a223162ceda7563ac774258a22934ca7 Merge: 3b91b03f6 7af0de94d Author: Charles Harris Date: Mon Jul 9 10:30:38 2018 -0600 Merge pull request #11540 from charris/update-1.15.0-notes DOC: Update the 1.15.0 release notes. commit 7af0de94dca925e2df5b49f9f9405c0cabc02348 Author: Charles Harris Date: Mon Jul 9 09:15:51 2018 -0600 DOC: Update the 1.15.0 release notes. * Note that the release supports Python 3.7.
* Add np.einsum entry [ci skip] commit 3b91b03f6bf173650cea773185c6f4be56fcdb9e Merge: a44b61cd0 9bc770cb3 Author: Charles Harris Date: Sun Jul 8 15:48:02 2018 -0600 Merge pull request #11532 from charris/backport-11515 BUG: Decref of field title caused segfault commit 9bc770cb3472f54dad5d6b36ddff991e1701c06d Author: mattip Date: Fri Jul 6 11:18:37 2018 -0700 fix from review commit c9c85cd37c238e13ae6e0f978e3e8d9ad2be3af5 Author: mattip Date: Thu Jul 5 21:52:30 2018 -0700 BUG: decref of field title caused segfault commit a44b61cd03591af71046f0844699bd31c039ba33 Merge: 6c524f264 8ea9e8bf1 Author: Charles Harris Date: Sun Jul 8 13:47:43 2018 -0600 Merge pull request #11529 from eric-wieser/histogramdd-density-no-deprecation ENH: Add density argument to histogramdd. commit 8ea9e8bf13e4292d02e9ea5af2f4d10c07e02459 Author: Eric Wieser Date: Wed Jun 20 19:58:52 2018 -0700 MAINT: Rename histogramdd's normed argument to density, to match histogram Fixes gh-4371 commit 6c524f264a433426da8133bfefc34be2bc60ae55 Merge: 01cc44e4c 36cf15e69 Author: Charles Harris Date: Thu Jul 5 13:18:46 2018 -0600 Merge pull request #11511 from charris/backport-11479 BUG: Fix #define for ppc64 and ppc64le commit 36cf15e69f692b880390d7c788de83e840e53a0f Author: William D. Irons Date: Mon Jul 2 15:28:52 2018 +0000 BUG: Fix #define for ppc64 and ppc64le The current logic for defining NPY_CPU_PPC64LE and NPY_CPU_PPC64 is incorrect for two reasons: 1) The elif defined for __powerpc__ is proceesed first so any ppc64le or ppc64 system is defined as NPY_CPU_PPC. 2) __ppc64le__ is not defined on a ppc64le system. __PPC64__ is defined and so is __powerpc64__ but the check for little or big endian needs to be done seperately. This pull request fixes the defines for ppc64le and ppc64. Note: This really isn't a issue in the numpy code base at this time because the only place this variable is referenced is in npy_endian.h as a fallback in case endian.h is not on the system. 
commit ccc68b80305ff5b363d10f6e905fb4e5276a8adb (HEAD, tag: v1.15.0rc2) Author: Charles Harris Date: Mon Jul 9 10:35:40 2018 -0600 REL: NumPy 1.15.0rc2 release. commit 3f6e8dc0a223162ceda7563ac774258a22934ca7 Merge: 3b91b03f6 7af0de94d Author: Charles Harris Date: Mon Jul 9 10:30:38 2018 -0600 Merge pull request #11540 from charris/update-1.15.0-notes DOC: Update the 1.15.0 release notes. commit 7af0de94dca925e2df5b49f9f9405c0cabc02348 Author: Charles Harris Date: Mon Jul 9 09:15:51 2018 -0600 DOC: Update the 1.15.0 release notes.
* Note that the release supports Python 3.7. * Add np.einsum entry [ci skip] commit 3b91b03f6bf173650cea773185c6f4be56fcdb9e Merge: a44b61cd0 9bc770cb3 Author: Charles Harris Date: Sun Jul 8 15:48:02 2018 -0600 Merge pull request #11532 from charris/backport-11515 BUG: Decref of field title caused segfault commit 9bc770cb3472f54dad5d6b36ddff991e1701c06d Author: mattip Date: Fri Jul 6 11:18:37 2018 -0700 fix from review commit c9c85cd37c238e13ae6e0f978e3e8d9ad2be3af5 Author: mattip Date: Thu Jul 5 21:52:30 2018 -0700 BUG: decref of field title caused segfault commit a44b61cd03591af71046f0844699bd31c039ba33 Merge: 6c524f264 8ea9e8bf1 Author: Charles Harris Date: Sun Jul 8 13:47:43 2018 -0600 Merge pull request #11529 from eric-wieser/histogramdd-density-no-deprecation ENH: Add density argument to histogramdd. commit 8ea9e8bf13e4292d02e9ea5af2f4d10c07e02459 Author: Eric Wieser Date: Wed Jun 20 19:58:52 2018 -0700 MAINT: Rename histogramdd's normed argument to density, to match histogram Fixes gh-4371 commit 6c524f264a433426da8133bfefc34be2bc60ae55 Merge: 01cc44e4c 36cf15e69 Author: Charles Harris Date: Thu Jul 5 13:18:46 2018 -0600 Merge pull request #11511 from charris/backport-11479 BUG: Fix #define for ppc64 and ppc64le commit 36cf15e69f692b880390d7c788de83e840e53a0f Author: William D. Irons Date: Mon Jul 2 15:28:52 2018 +0000 BUG: Fix #define for ppc64 and ppc64le The current logic for defining NPY_CPU_PPC64LE and NPY_CPU_PPC64 is incorrect for two reasons: 1) The elif defined for __powerpc__ is proceesed first so any ppc64le or ppc64 system is defined as NPY_CPU_PPC. 2) __ppc64le__ is not defined on a ppc64le system. __PPC64__ is defined and so is __powerpc64__ but the check for little or big endian needs to be done seperately. This pull request fixes the defines for ppc64le and ppc64. 
Note: This really isn't a issue in the numpy code base at this time because the only place this variable is referenced is in npy_endian.h as a fallback in case endian.h is not on the system. It would be good to fix in case future code does reference these defines. commit 01cc44e4c8896142fa144f7a6005d74e00086d92 Merge: 107af1261 b85083f7e Author: Charles Harris Date: Tue Jul 3 19:12:36 2018 -0600 Merge pull request #11496 from charris/backport-11468 BUG: Advanced indexing assignment incorrectly took 1-D fastpath commit 107af1261a9815acb094590229c3c8eb8c5ef528 Merge: c4a5f877c c3381b3b6 Author: Charles Harris Date: Tue Jul 3 18:32:26 2018 -0600 Merge pull request #11495 from charris/backport-11455 BENCH: belated addition of lcm, gcd to ufunc benchmark. commit c4a5f877c9e75dcdd7de51246962c985a12f56a8 Merge: ba9e7e068 a5e8037ca Author: Charles Harris Date: Tue Jul 3 18:07:58 2018 -0600 Merge pull request #11494 from charris/backport-11434 MAINT: add PyPI classifier for Python 3.7 commit ba9e7e0685060e279f8bf8e4cf2d5b885cd8c000 Merge: 483f37d0b c03d32408 Author: Charles Harris Date: Tue Jul 3 18:07:37 2018 -0600 Merge pull request #11493 from charris/backport-11449 BUG: Revert #10229 to fix DLL loads on Windows. commit b85083f7e9940a2c7d5fc152206e074b31601a16 Author: Sebastian Berg Date: Sun Jul 1 12:53:53 2018 +0200 BUG: Advanced indexing assignment incorrectly took 1-D fastpath When the index array was non contiguous and not 1D the assignment 1D fastpath (indexed array being 1D) was incorrectly taken (also the assignment value had to be 0D for this to happen). This caused the iteration to use the itemsize as a stride, since it incorrectly assumed the array must be contiguous. The commit additionally adds an assert to the STRIDE fetching macro. Closes gh-11467. commit c3381b3b6865b967720de7d3b75ca534672bfc2e Author: Marten van Kerkwijk Date: Fri Jun 29 10:09:46 2018 -0400 BENCH: belated addition of lcm, gcd to ufunc benchmark. 
commit c03d3240873bc7ad1796b7cd5e3705577aa57ac0 Author: Charles Harris Date: Thu Jun 28 19:23:39 2018 -0600 BUG: Revert #10229 to fix DLL loads on Windows. Numpy wheels on Windows were clearing the ctypes path when they loaded the OpenBLAS DLL, leading to failure of DLL loads subsequent to NumPy import because the needed DLLs could not be found. This isn't a straight revert, it restores to the 1.15.x version of the relevant code. Closes #11431. commit a5e8037cacb40634e7e4c61af20a49750a8655c5 Author: Ralf Gommers Date: Wed Jun 27 19:26:19 2018 -0700 MAINT: add PyPI classifier for Python 3.7 [ci skip] commit 483f37d0b1f22e01c75e5963128fd037c88dbdc3 Merge: c58598f42 14e676a32 Author: Charles Harris Date: Tue Jul 3 15:57:38 2018 -0600 Merge pull request #11491 from charris/backport-11345 BUG/ENH: Einsum optimization path updates and bug fixes. commit 14e676a3224334bbc7132f9b47af563746f4697c Author: Daniel Smith Date: Tue Jul 3 11:23:11 2018 -0700 BUG/ENH: Einsum optimization path updates and bug fixes. (#11345) * Minor tweaks to the optimal path based on opt_einsum * Updates greedy path to current opt_einsum tech * Reworks einsum broadcasting vs dot tech and can_dot logic * MAINT: Fix typo in comment. * BUG: Fix bad merge fixup. commit c58598f42bfb4d22b5971770a87c0b827c22a0fb Merge: 5cd455272 c893aae32 Author: Charles Harris Date: Tue Jul 3 14:12:33 2018 -0600 Merge pull request #11489 from charris/backport-11406 BUG: Ensure out is returned in einsum. 
commit c893aae32993028443080a4a95a019b1a5ce2eca Author: mattip Date: Wed Jun 27 10:02:50 2018 -0700 MAINT: cleanup ret assignment commit 9f366e8627a9f04549f95a7369588aaf433cdf45 Author: mattip Date: Sun Jun 24 15:38:40 2018 -0700 check for unlikely error in Assign_Zero commit 48ed5505fb86e5dc533f98eb1e237d11f94c3c8c Author: mattip Date: Thu Jun 21 15:38:28 2018 -0700 BUG: ensure ret is out in einsum commit 5cd455272bdb86e1c5727815a6ad3053c4363dda Merge: 9597465a5 431740e8a Author: Charles Harris Date: Wed Jun 27 09:50:34 2018 -0600 Merge pull request #11427 from eric-wieser/deprecate-normed-1.15.0 BUG: Fix incorrect deprecation logic for histogram(normed=...) (1.15.x) commit 431740e8a04855f8cdf2f720752e462a9cf69269 Author: Eric Wieser Date: Tue Jun 26 09:03:37 2018 -0700 BUG: Fix incorrect deprecation logic for histogram(normed=...) Fixes #11426, which was introduced in #11323 and #11352 commit 9597465a5d5721dd63a376f286b3cbad6b9dde2f Merge: 6bf63fc46 9ec209118 Author: Charles Harris Date: Thu Jun 21 13:04:28 2018 -0600 Merge pull request #11403 from mattip/remove-npyiter_close-from-notes DOC: Remove npyiter close from notes commit 9ec209118ec9e9e6df7e7c1af077111f194c7850 Author: mattip Date: Thu Jun 21 10:47:40 2018 -0700 DOC: remove NpyIter_Close from release notes commit 6bf63fc46bc22e16fa88a6d65e025df1a1e7524d Author: Charles Harris Date: Thu Jun 21 10:07:00 2018 -0600 REL: prepare 1.15.x for further development commit 914aabf07548ee3ddf5f3795177273d435a38c14 Author: Eric Wieser Date: Sun Jun 17 21:34:55 2018 -0700 TST: Show that histogramdd's normed argument is histogram's density Relevant to gh-4371 ? -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Mon Jul 9 20:07:06 2018 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 9 Jul 2018 18:07:06 -0600 Subject: [Numpy-discussion] NumPy 1.15.0rc2 released. 
In-Reply-To: References: Message-ID: On Mon, Jul 9, 2018 at 5:38 PM, Marten van Kerkwijk < m.h.vankerkwijk at gmail.com> wrote: > Maybe not too unhelpful: > > mhvk at weasel:~/packages/numpy$ git log v1.15.0rc1..v1.15.0rc2 > commit ccc68b80305ff5b363d10f6e905fb4e5276a8adb (HEAD, tag: v1.15.0rc2) > Author: Charles Harris > Date: Mon Jul 9 10:35:40 2018 -0600 > > REL: NumPy 1.15.0rc2 release. > > commit 3f6e8dc0a223162ceda7563ac774258a22934ca7 > Merge: 3b91b03f6 7af0de94d > Author: Charles Harris > Date: Mon Jul 9 10:30:38 2018 -0600 > > Merge pull request #11540 from charris/update-1.15.0-notes > > DOC: Update the 1.15.0 release notes. > > commit 7af0de94dca925e2df5b49f9f9405c0cabc02348 > Author: Charles Harris > Date: Mon Jul 9 09:15:51 2018 -0600 > > DOC: Update the 1.15.0 release notes. > > * Note that the release supports Python 3.7. > * Add np.einsum entry > > [ci skip] > > commit 3b91b03f6bf173650cea773185c6f4be56fcdb9e > Merge: a44b61cd0 9bc770cb3 > Author: Charles Harris > Date: Sun Jul 8 15:48:02 2018 -0600 > > Merge pull request #11532 from charris/backport-11515 > > BUG: Decref of field title caused segfault > > commit 9bc770cb3472f54dad5d6b36ddff991e1701c06d > Author: mattip > Date: Fri Jul 6 11:18:37 2018 -0700 > > fix from review > > commit c9c85cd37c238e13ae6e0f978e3e8d9ad2be3af5 > Author: mattip > Date: Thu Jul 5 21:52:30 2018 -0700 > > BUG: decref of field title caused segfault > > commit a44b61cd03591af71046f0844699bd31c039ba33 > Merge: 6c524f264 8ea9e8bf1 > Author: Charles Harris > Date: Sun Jul 8 13:47:43 2018 -0600 > > Merge pull request #11529 from eric-wieser/histogramdd- > density-no-deprecation > > ENH: Add density argument to histogramdd. 
> > commit 8ea9e8bf13e4292d02e9ea5af2f4d10c07e02459 > Author: Eric Wieser > Date: Wed Jun 20 19:58:52 2018 -0700 > > MAINT: Rename histogramdd's normed argument to density, to match > histogram > > Fixes gh-4371 > > commit 6c524f264a433426da8133bfefc34be2bc60ae55 > Merge: 01cc44e4c 36cf15e69 > Author: Charles Harris > Date: Thu Jul 5 13:18:46 2018 -0600 > > Merge pull request #11511 from charris/backport-11479 > > BUG: Fix #define for ppc64 and ppc64le > > commit 36cf15e69f692b880390d7c788de83e840e53a0f > Author: William D. Irons > Date: Mon Jul 2 15:28:52 2018 +0000 > > BUG: Fix #define for ppc64 and ppc64le > > The current logic for defining NPY_CPU_PPC64LE and NPY_CPU_PPC64 is > incorrect for two reasons: > 1) The elif defined for __powerpc__ is proceesed first so any > ppc64le or ppc64 system is defined as NPY_CPU_PPC. > 2) __ppc64le__ is not defined on a ppc64le system. __PPC64__ is > defined and so is __powerpc64__ but the check for little or > big endian needs to be done seperately. > > This pull request fixes the defines for ppc64le and ppc64. > Note: This really isn't a issue in the numpy code base at this time > because the only place this variable is referenced is in npy_endian.h > as a fallback in case endian.h is not on the system. > It would be good to fix in case future code does reference > these defines. > > commit 01cc44e4c8896142fa144f7a6005d74e00086d92 > Merge: 107af1261 b85083f7e > Author: Charles Harris > Date: Tue Jul 3 19:12:36 2018 -0600 > > Merge pull request #11496 from charris/backport-11468 > > BUG: Advanced indexing assignment incorrectly took 1-D fastpath > > commit 107af1261a9815acb094590229c3c8eb8c5ef528 > Merge: c4a5f877c c3381b3b6 > Author: Charles Harris > Date: Tue Jul 3 18:32:26 2018 -0600 > > Merge pull request #11495 from charris/backport-11455 > > BENCH: belated addition of lcm, gcd to ufunc benchmark. 
> > commit c4a5f877c9e75dcdd7de51246962c985a12f56a8 > Merge: ba9e7e068 a5e8037ca > Author: Charles Harris > Date: Tue Jul 3 18:07:58 2018 -0600 > > Merge pull request #11494 from charris/backport-11434 > > MAINT: add PyPI classifier for Python 3.7 > > commit ba9e7e0685060e279f8bf8e4cf2d5b885cd8c000 > Merge: 483f37d0b c03d32408 > Author: Charles Harris > Date: Tue Jul 3 18:07:37 2018 -0600 > > Merge pull request #11493 from charris/backport-11449 > > BUG: Revert #10229 to fix DLL loads on Windows. > > commit b85083f7e9940a2c7d5fc152206e074b31601a16 > Author: Sebastian Berg > Date: Sun Jul 1 12:53:53 2018 +0200 > > BUG: Advanced indexing assignment incorrectly took 1-D fastpath > > When the index array was non contiguous and not 1D the assignment > 1D fastpath (indexed array being 1D) was incorrectly taken (also > the assignment value had to be 0D for this to happen). > This caused the iteration to use the itemsize as a stride, since > it incorrectly assumed the array must be contiguous. > > The commit additionally adds an assert to the STRIDE fetching macro. > > Closes gh-11467. > > commit c3381b3b6865b967720de7d3b75ca534672bfc2e > Author: Marten van Kerkwijk > Date: Fri Jun 29 10:09:46 2018 -0400 > > BENCH: belated addition of lcm, gcd to ufunc benchmark. > > commit c03d3240873bc7ad1796b7cd5e3705577aa57ac0 > Author: Charles Harris > Date: Thu Jun 28 19:23:39 2018 -0600 > > BUG: Revert #10229 to fix DLL loads on Windows. > > Numpy wheels on Windows were clearing the ctypes path when they loaded > the OpenBLAS DLL, leading to failure of DLL loads subsequent to NumPy > import because the needed DLLs could not be found. > > This isn't a straight revert, it restores to the 1.15.x version of > the relevant code. > > Closes #11431. 
> > commit a5e8037cacb40634e7e4c61af20a49750a8655c5 > Author: Ralf Gommers > Date: Wed Jun 27 19:26:19 2018 -0700 > > MAINT: add PyPI classifier for Python 3.7 > > [ci skip] > > commit 483f37d0b1f22e01c75e5963128fd037c88dbdc3 > Merge: c58598f42 14e676a32 > Author: Charles Harris > Date: Tue Jul 3 15:57:38 2018 -0600 > > Merge pull request #11491 from charris/backport-11345 > > BUG/ENH: Einsum optimization path updates and bug fixes. > > commit 14e676a3224334bbc7132f9b47af563746f4697c > Author: Daniel Smith > Date: Tue Jul 3 11:23:11 2018 -0700 > > BUG/ENH: Einsum optimization path updates and bug fixes. (#11345) > > * Minor tweaks to the optimal path based on opt_einsum > > * Updates greedy path to current opt_einsum tech > > * Reworks einsum broadcasting vs dot tech and can_dot logic > > * MAINT: Fix typo in comment. > > * BUG: Fix bad merge fixup. > > commit c58598f42bfb4d22b5971770a87c0b827c22a0fb > Merge: 5cd455272 c893aae32 > Author: Charles Harris > Date: Tue Jul 3 14:12:33 2018 -0600 > > Merge pull request #11489 from charris/backport-11406 > > BUG: Ensure out is returned in einsum. > > commit c893aae32993028443080a4a95a019b1a5ce2eca > Author: mattip > Date: Wed Jun 27 10:02:50 2018 -0700 > > MAINT: cleanup ret assignment > > commit 9f366e8627a9f04549f95a7369588aaf433cdf45 > Author: mattip > Date: Sun Jun 24 15:38:40 2018 -0700 > > check for unlikely error in Assign_Zero > > commit 48ed5505fb86e5dc533f98eb1e237d11f94c3c8c > Author: mattip > Date: Thu Jun 21 15:38:28 2018 -0700 > > BUG: ensure ret is out in einsum > > commit 5cd455272bdb86e1c5727815a6ad3053c4363dda > Merge: 9597465a5 431740e8a > Author: Charles Harris > Date: Wed Jun 27 09:50:34 2018 -0600 > > Merge pull request #11427 from eric-wieser/deprecate-normed-1.15.0 > > BUG: Fix incorrect deprecation logic for histogram(normed=...) 
(1.15.x) > > commit 431740e8a04855f8cdf2f720752e462a9cf69269 > Author: Eric Wieser > Date: Tue Jun 26 09:03:37 2018 -0700 > > BUG: Fix incorrect deprecation logic for histogram(normed=...) > > Fixes #11426, which was introduced in #11323 and #11352 > > commit 9597465a5d5721dd63a376f286b3cbad6b9dde2f > Merge: 6bf63fc46 9ec209118 > Author: Charles Harris > Date: Thu Jun 21 13:04:28 2018 -0600 > > Merge pull request #11403 from mattip/remove-npyiter_close-from-notes > > DOC: Remove npyiter close from notes > > commit 9ec209118ec9e9e6df7e7c1af077111f194c7850 > Author: mattip > Date: Thu Jun 21 10:47:40 2018 -0700 > > DOC: remove NpyIter_Close from release notes > > commit 6bf63fc46bc22e16fa88a6d65e025df1a1e7524d > Author: Charles Harris > Date: Thu Jun 21 10:07:00 2018 -0600 > > REL: prepare 1.15.x for further development > > commit 914aabf07548ee3ddf5f3795177273d435a38c14 > Author: Eric Wieser > Date: Sun Jun 17 21:34:55 2018 -0700 > > TST: Show that histogramdd's normed argument is histogram's density > > Relevant to gh-4371 > Or charris at fc [numpy.git (master)]$ python tools/changelog.py $GITHUB v1.15.0rc1..v1.15.0rc2 Contributors ============ A total of 8 people contributed to this release.
People with a "+" by their names contributed a patch for the first time. * Charles Harris * Daniel Smith * Eric Wieser * Marten van Kerkwijk * Matti Picus * Ralf Gommers * Sebastian Berg * William D. Irons + Pull requests merged ==================== A total of 12 pull requests were merged for this release. * `#11403 `__: DOC: Remove npyiter close from notes * `#11427 `__: BUG: Fix incorrect deprecation logic for histogram(normed=...)... * `#11489 `__: BUG: Ensure out is returned in einsum. * `#11491 `__: BUG/ENH: Einsum optimization path updates and bug fixes. * `#11493 `__: BUG: Revert #10229 to fix DLL loads on Windows. * `#11494 `__: MAINT: add PyPI classifier for Python 3.7 * `#11495 `__: BENCH: belated addition of lcm, gcd to ufunc benchmark. * `#11496 `__: BUG: Advanced indexing assignment incorrectly took 1-D fastpath * `#11511 `__: BUG: Fix #define for ppc64 and ppc64le * `#11529 `__: ENH: Add density argument to histogramdd. * `#11532 `__: BUG: Decref of field title caused segfault * `#11540 `__: DOC: Update the 1.15.0 release notes. Chuck > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jeffsaremi at hotmail.com Tue Jul 10 11:50:33 2018 From: jeffsaremi at hotmail.com (jeff saremi) Date: Tue, 10 Jul 2018 15:50:33 +0000 Subject: [Numpy-discussion] Looking for description/insight/documentation on matmul In-Reply-To: <32ba25c2-3b23-f9b7-3761-0efe527ca3d5@gmail.com> References: , <32ba25c2-3b23-f9b7-3761-0efe527ca3d5@gmail.com> Message-ID: Thanks a lot Matti. It makes a lot more sense now. ________________________________ From: NumPy-Discussion on behalf of Matti Picus Sent: Monday, July 9, 2018 10:54 AM To: numpy-discussion at python.org Subject: Re: [Numpy-discussion] Looking for description/insight/documentation on matmul On 09/07/18 09:48, jeff saremi wrote: > Is there any resource available or anyone who's able to describe > matmul operation of matrices when n > 2? 
> > The only description i can find is: "If either argument is N-D, N > 2, > it is treated as a stack of matrices residing in the last two indexes > and broadcast accordingly." which is very cryptic to me. > Could someone break this down please? > when a [2 3 5 6] is multiplied by a [7 8 9] what are the resulting > dimensions? is there one answer to that? Is it deterministic? > What does "residing in the last two indices" mean? What is broadcast > and where? > thanks > jeff > You could do np.matmul(np.ones((2, 3, 4, 5, 6)), np.ones((2, 3, 4, 6, 7))).shape which yields (2, 3, 4, 5, 7). When ndim >= 2 in both operands, matmul uses the last two dimensions as (..., n, m) @ (..., m, p) -> (..., n, p). Note the repeating "m", so your example would not work: n1=5, m1=6 in the first operand and m2=8, p2=9 in the second so m1 != m2. The "broadcast" refers only to the "..." dimensions, if in either of the operands you replace the 2 or 3 or 4 with 1 then that operand will broadcast (repeat itself) across that dimension to fit the other operand. Also if one of the three first dimensions is missing in one of the operands it will broadcast. When ndim < 2 for one of the operands only, it will be interpreted as "m", and the other dimension "n" or "p" will not appear on the output, so the signature is (..., n, m),(m) -> (..., n) or (m),(..., m, p)->(..., p) When ndim < 2 for both of the operands, it is the same as a dot product and will produce a scalar. You didn't ask, but I will complete the picture: np.dot is different for the case of n>=2. The result will extend (combine? broadcast across?) both sets of ... dimensions, so np.dot(np.ones((2,3,4,5,6)), np.ones((8, 9, 6, 7))).shape which yields (2, 3, 4, 5, 8, 9, 7).
The (2, 3, 4, 5) dimensions are followed by (8, 9) Matti _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at python.org https://mail.python.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From jeffsaremi at hotmail.com Tue Jul 10 11:51:27 2018 From: jeffsaremi at hotmail.com (jeff saremi) Date: Tue, 10 Jul 2018 15:51:27 +0000 Subject: [Numpy-discussion] Looking for description/insight/documentation on matmul In-Reply-To: References: , Message-ID: Looks great. thanks a lot ________________________________ From: NumPy-Discussion on behalf of Stephan Hoyer Sent: Monday, July 9, 2018 10:50 AM To: Discussion of Numerical Python Subject: Re: [Numpy-discussion] Looking for description/insight/documentation on matmul Hi Jeff, I think PEP 465 would be the definitive reference here. See the section on "Intended usage details" in https://www.python.org/dev/peps/pep-0465/ Cheers, Stephan On Mon, Jul 9, 2018 at 9:48 AM jeff saremi > wrote: Is there any resource available or anyone who's able to describe matmul operation of matrices when n > 2? The only description i can find is: "If either argument is N-D, N > 2, it is treated as a stack of matrices residing in the last two indexes and broadcast accordingly." which is very cryptic to me. Could someone break this down please? when a [2 3 5 6] is multiplied by a [7 8 9] what are the resulting dimensions? is there one answer to that? Is it deterministic? What does "residing in the last two indices" mean? What is broadcast and where? thanks jeff _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at python.org https://mail.python.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed...
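The shape rules described in the thread above can be checked directly. The snippet below is a quick illustration (not part of the original emails); the shapes are the ones used in the discussion:

```python
import numpy as np

# matmul treats the inputs as stacks of matrices living in the last
# two dimensions; the leading "..." dimensions broadcast.
a = np.ones((2, 3, 4, 5, 6))
b = np.ones((2, 3, 4, 6, 7))
assert np.matmul(a, b).shape == (2, 3, 4, 5, 7)

# A leading dimension of 1 broadcasts against the other operand:
c = np.ones((1, 1, 4, 6, 7))
assert np.matmul(a, c).shape == (2, 3, 4, 5, 7)

# A 1-D second operand is interpreted as "m"; the "p" axis disappears,
# so the signature is (..., n, m), (m) -> (..., n):
v = np.ones(6)
assert np.matmul(a, v).shape == (2, 3, 4, 5)

# np.dot instead combines both sets of leading dimensions:
# (2, 3, 4, 5, 6) . (8, 9, 6, 7) -> (2, 3, 4, 5, 8, 9, 7)
d = np.ones((8, 9, 6, 7))
assert np.dot(a, d).shape == (2, 3, 4, 5, 8, 9, 7)
```

Note how the contracted dimension (6) must match in both operands and never appears in the result, for both matmul and dot.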
URL: From matti.picus at gmail.com Wed Jul 11 16:49:40 2018 From: matti.picus at gmail.com (Matti Picus) Date: Wed, 11 Jul 2018 15:49:40 -0500 Subject: [Numpy-discussion] Fwd: [NumFOCUS Projects] NumFOCUS Summit Registration In-Reply-To: References: Message-ID: On 06/07/18 00:54, Ralf Gommers wrote: > Hi all, > > In September the NumFOCUS Summit for 2018 will be held in New York. > NumPy can send one representative (or a couple, but costs are only > covered for one person). We opened this opportunity up to members of > the Steering Council first, however it doesn't look like anyone is in > the position to attend. Therefore we'd now like to ask if anyone else > would like to attend. > > Rules of the game: you do need a strong affiliation with the project > (e.g. you have commit right, have been contributing for a while, have > been hired to work on NumPy, etc.); in case multiple people are > interested then the NumPy Steering Council will make a decision. > > If you're interested, please reply here or off-list. > > Cheers, > Ralf > I would like to put my name forward. Matti > > ---------- Forwarded message ---------- > From: *Leah Silen* > > Date: Wed, Jun 27, 2018 at 4:35 PM > Subject: [NumFOCUS Projects] NumFOCUS Summit Registration > To: projects at numfocus.org > > > * > > Hi all, > > > We?d like to share some additional details on this year?s NumFOCUS > Project Summit. > > > The Sustainability Workshop portion of the Summit (Sept 22-23) is an > opportunity for projects to focus on creating roadmaps and developing > strategies for long-term sustainability. This year we would like your > help in generating the content and driving the direction of the > workshop. We?ll be reaching out for your suggestions on specific areas > you would like to see addressed. > > > The Project Forum for Core Developers and Critical Users(Sep 24-25) is > a meeting, open to the public, for critical users to directly interact > with the core developers of NumFOCUS projects. 
The goal is to connect > you with resources to advance your community roadmaps while beginning > a dialogue with key stakeholders. The event will be limited to 150 > attendees. Again, we will be giving participants (both projects and > users) an opportunity to drive the content and agenda. > > > NumFOCUS will cover travel expenses for one representative per > sponsored project. This includes an airfare allotment of $600 and hotel > accommodations for 4 nights in New York City. The airfare amount is an > average; we understand international travel costs will exceed this > amount. Please contact us if you would like to have more than one rep > attend. When choosing who will attend, we ask that you prioritize > members of your leadership team or steering committee as this will > help your project get the most benefit out of the Summit experience. > > > In order to book hotel rooms, it is urgent that we know who will be > attending as soon as possible. We've set up the following site for > project leads to register: > > > Once again, if you're interested in working on any portion of the > summit, we would love your input and leadership in setting the agenda. > Please reach out to summit at numfocus.org if > you would like to be involved.
> > > > NumFOCUS Summit Committee- > > > Andy Terrel > > James Powell > > Leah Silen > > Gina Helfrich > > Jim Weiss > > > * > --- > Leah Silen > Executive Director, NumFOCUS > leah at numfocus.org > 512-222-5449 > > > > From ralf.gommers at gmail.com Sat Jul 14 00:20:19 2018 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Fri, 13 Jul 2018 23:20:19 -0500 Subject: [Numpy-discussion] Proposal to accept NEP 19: Random Number Generator Policy In-Reply-To: References: Message-ID: On Tue, Jul 3, 2018 at 4:34 PM, Robert Kern wrote: > There has been one clarification to the text: > > https://github.com/numpy/numpy/pull/11488 > > For the legacy RandomState, we will *not* be fixing correctness bugs if > doing so would break the stream; this is in contrast with the current > policy, where we can fix correctness bugs. In the post-NEP world, > RandomState's purpose will be to provide across-version-stable numbers for > unit tests, so stability is primary. I don't expect to see many more bugs, > except in arcane corners of the parameter spaces, like in #11475, which > can be avoided, and we can introduce warnings to help users avoid them. > > https://github.com/numpy/numpy/pull/11475 > The NEP status is now Accepted (https://github.com/numpy/numpy/pull/11560). Thanks Robert (& Kevin & @all)! Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From matti.picus at gmail.com Sat Jul 14 16:39:25 2018 From: matti.picus at gmail.com (Matti Picus) Date: Sat, 14 Jul 2018 15:39:25 -0500 Subject: [Numpy-discussion] we held an impromptu dtype brainstorming session at SciPy Message-ID: <4c46b32e-e6af-2f19-8f0f-09fe3e964f98@gmail.com> The stars all aligned properly and some of the steering committee suggested we put together a quick brainstorming session over what to do with dtypes. About 20 people joined in the discussion, which was very productive.
We began with user stories and design requirements, and asked some of those present to spend 5 minutes and create a straw-man implementation of what their dream dtype implementation would contain. The resulting document https://github.com/numpy/numpy/wiki/Dtype-Brainstorming will serve as the basis for a future NEP and more work toward a better, user-extensible dtype. More comments are welcome; the discussion is only at the beginning stages. Matti From gael.varoquaux at normalesup.org Sat Jul 14 17:58:32 2018 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Sat, 14 Jul 2018 23:58:32 +0200 Subject: [Numpy-discussion] we held an impromptu dtype brainstorming session at SciPy In-Reply-To: <4c46b32e-e6af-2f19-8f0f-09fe3e964f98@gmail.com> References: <4c46b32e-e6af-2f19-8f0f-09fe3e964f98@gmail.com> Message-ID: <20180714215832.sxr4fb5jxtnbest6@phare.normalesup.org> Thank you so much to everybody involved; this is important for the ecosystem. Gaël On Sat, Jul 14, 2018 at 03:39:25PM -0500, Matti Picus wrote: > The stars all aligned properly and some of the steering committee suggested > we put together a quick brainstorming session over what to do with dtypes. > About 20 people joined in the discussion which was very productive. We began > with user stories and design requirements, and asked some present to spend 5 > minutes and create a straw-man implementation of what their dream dtype > implementation would contain. The resulting document > https://github.com/numpy/numpy/wiki/Dtype-Brainstorming will serve as the > basis for a future NEP and more work toward a better, user-extensible dtype. > More comments are welcome, the discussion is only at the beginning stages.
> Matti > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion -- Gael Varoquaux Senior Researcher, INRIA Parietal NeuroSpin/CEA Saclay, Bat 145, 91191 Gif-sur-Yvette France Phone: ++ 33-1-69-08-79-68 http://gael-varoquaux.info http://twitter.com/GaelVaroquaux From njs at pobox.com Sun Jul 15 03:23:16 2018 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 15 Jul 2018 00:23:16 -0700 Subject: [Numpy-discussion] we held an impromptu dtype brainstorming session at SciPy In-Reply-To: <4c46b32e-e6af-2f19-8f0f-09fe3e964f98@gmail.com> References: <4c46b32e-e6af-2f19-8f0f-09fe3e964f98@gmail.com> Message-ID: A few quick things that jumped out at me: - I think the discussion of __array_ufunc__ on dtypes is confused. Dtypes already have a generic mechanism for overriding ufuncs (the ufunc loop dispatch mechanism), it's separate from __array_ufunc__ (so in theory you could e.g. have a user-defined array class that uses __array_ufunc__ and still handles arbitrary user-defined dtypes), and it's more powerful (e.g. can do automatic casting to dtypes that don't even appear in the input arrays). - IMO we should just not support string parsing for new dtypes. The right way to pass structured data in Python is with a structured object, not a string. Language design creates tons of problems and we can avoid all of them if we just don't do it. - In Python, subclassing and nominal types (as opposed to duck types) are both code smells in general, but this is the rare case where we actually do want them. For dtypes defined in C (including all the current built-in ones!), we want to be able to call their special dtype methods directly from C, without jumping back out to Python. Fortunately, we don't have to invent anything new here -- all of Python's built-in special methods have the same issue, and they solve it with what they call "slots".
For example, when you define a new type in C and want to give it an __add__ method, you don't create a Python callable and stick it in the type's __dict__ and expect PyNumber_Add to find it by doing a dict lookup and boxing up the arguments in a tuple and all that. Instead, you fill in the nb_add slot in the C-level type object, and PyNumber_Add calls that directly. (And there's also some fancy-footwork where if you fill in the nb_add slot, then type.__new__ will automatically create a Python-level wrapper and stick it in the __add__ slot in the type dict; or if you're defining a new type in Python, then type.__new__ will notice if you define a Python-level __add__ method and automatically create a C-level wrapper and stick it in the nb_add slot. End result: Python and C callers can both blindly invoke the method using either the Python or C level mechanism, and in all cases it automatically does the most efficient thing.) So for dtypes we want our own slots. This is conceptually straightforward but has a few moving parts you need to line up: to add new slots, you have to add new entries to the PyType struct. You do this the same way you extend any Python object: you subclass PyType, i.e., define a metaclass. Then you make np.dtype an instance of this new metaclass, so that np.dtype and all np.dtype subclasses automatically have the extra slots available in their type object. And then you hook up some plumbing to make sure that the slots are set up correctly (in your metaclass's __new__ method), etc. That said, we should ideally try to make np.dtype a kind of abstract base class with as little as possible logic on the actual class, because being an instance of np.dtype should mean "this object implements these Python and C-level interfaces", not actually trigger behavioral inheritance. Maybe we can move most of the stuff that's currently there into an internal 'legacy_dtype' class? Or maybe we'll just have to grit our teeth and live with a not-quite-ideal design. 
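The metaclass-plus-slots idea can be sketched in pure Python. This is only an illustration of the mechanism described above, not the actual NumPy design; all names here (DTypeMeta, _cast_slot, cast) are hypothetical:

```python
# A rough pure-Python analogue of the "slots" idea: a metaclass mirrors a
# Python-level method into a slot attribute that a fast-path (C-level)
# caller could read directly, the way CPython mirrors __add__ <-> nb_add.

class DTypeMeta(type):
    """Metaclass giving every dtype class a fast-path 'slot'."""
    def __new__(mcs, name, bases, ns):
        cls = super().__new__(mcs, name, bases, ns)
        # Fill the slot from the class body if defined there, otherwise
        # inherit whatever slot a base class already filled in.
        cls._cast_slot = ns.get("cast") or getattr(cls, "_cast_slot", None)
        return cls

class DType(metaclass=DTypeMeta):
    """Abstract base: being a DType means 'implements the slot interface'."""

class Float64(DType):
    def cast(self, value):
        return float(value)

# A C-level caller would read the slot directly instead of doing a dict
# lookup plus argument boxing:
f64 = Float64()
print(Float64._cast_slot(f64, "3.5"))  # 3.5
```

In the real thing the slot table would live in extra fields appended to the C type object, as described above, with `type.__new__`-style plumbing keeping the Python-visible method and the C slot in sync.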
- OTOH, re: "mixins for units" -- just don't go there! make units a wrapper dtype that has-a underlying dtype, where the units class's methods can invoke the wrapped dtype's methods at appropriate times. On Sat, Jul 14, 2018 at 1:39 PM, Matti Picus wrote: > The stars all aligned properly and some of the steering committee suggested > we put together a quick brainstorming session over what to do with dtypes. > About 20 people joined in the discussion which was very productive. We began > with user stories and design requirements, and asked some present to spend 5 > minutes and create a straw-man implementation of what their dream dtype > implementation would contain. The resulting document > https://github.com/numpy/numpy/wiki/Dtype-Brainstorming will serve as the > basis for a future NEP and more work toward a better, user-extensible dtype. > > More comments are welcome, the discussion is only at the beginning stages. > > Matti > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion -- Nathaniel J. Smith -- https://vorpus.org From ralf.gommers at gmail.com Sun Jul 15 18:02:19 2018 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sun, 15 Jul 2018 17:02:19 -0500 Subject: [Numpy-discussion] Fwd: [NumFOCUS Projects] NumFOCUS Summit Registration In-Reply-To: References: Message-ID: On Wed, Jul 11, 2018 at 3:49 PM, Matti Picus wrote: > On 06/07/18 00:54, Ralf Gommers wrote: > >> Hi all, >> >> In September the NumFOCUS Summit for 2018 will be held in New York. NumPy >> can send one representative (or a couple, but costs are only covered for >> one person). We opened this opportunity up to members of the Steering >> Council first, however it doesn't look like anyone is in the position to >> attend. Therefore we'd now like to ask if anyone else would like to attend. >> >> Rules of the game: you do need a strong affiliation with the project >> (e.g. 
you have commit rights, have been contributing for a while, have been >> hired to work on NumPy, etc.); in case multiple people are interested then >> the NumPy Steering Council will make a decision. >> >> If you're interested, please reply here or off-list. >> >> Cheers, >> Ralf >> >> > I would like to put my name forward. > Thanks Matti! That makes two good candidates. I'll follow up right now with the steering committee. If anyone else wants to be considered, please reply within 24 hours. Cheers, Ralf Matti > > >> ---------- Forwarded message ---------- >> From: *Leah Silen* > >> Date: Wed, Jun 27, 2018 at 4:35 PM >> Subject: [NumFOCUS Projects] NumFOCUS Summit Registration >> To: projects at numfocus.org >> >> >> * >> >> Hi all, >> >> >> We'd like to share some additional details on this year's NumFOCUS >> Project Summit. >> >> >> The Sustainability Workshop portion of the Summit (Sept 22-23) is an >> opportunity for projects to focus on creating roadmaps and developing >> strategies for long-term sustainability. This year we would like your help >> in generating the content and driving the direction of the workshop. We'll >> be reaching out for your suggestions on specific areas you would like to >> see addressed. >> >> >> The Project Forum for Core Developers and Critical Users (Sep 24-25) is a >> meeting, open to the public, for critical users to directly interact with >> the core developers of NumFOCUS projects. The goal is to connect you with >> resources to advance your community roadmaps while beginning a dialogue >> with key stakeholders. The event will be limited to 150 attendees. Again, >> we will be giving participants (both projects and users) an opportunity to >> drive the content and agenda. >> >> >> NumFOCUS will cover travel expenses for one representative per sponsored >> project. This includes an airfare allotment of $600 and hotel accommodations >> for 4 nights in New York City.
The airfare amount is an average; we >> understand international travel costs will exceed this amount. Please >> contact us if you would like to have more than one rep attend. When >> choosing who will attend, we ask that you prioritize members of your >> leadership team or steering committee as this will help your project get >> the most benefit out of the Summit experience. >> >> >> In order to book hotel rooms, it is urgent that we know who will be >> attending as soon as possible. We've set up the following site for project >> leads to register: >> >> >> Once again, if you're interested in working on any portion of the summit, >> we would love your input and leadership in setting the agenda. Please reach >> out to summit at numfocus.org if you would like >> to be involved. >> >> >> >> NumFOCUS Summit Committee- >> >> >> Andy Terrel >> >> James Powell >> >> Leah Silen >> >> Gina Helfrich >> >> Jim Weiss >> >> >> * >> --- >> Leah Silen >> Executive Director, NumFOCUS >> leah at numfocus.org >> 512-222-5449 >> >> >> >> >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From robbmcleod at gmail.com Mon Jul 16 15:39:55 2018 From: robbmcleod at gmail.com (Robert McLeod) Date: Mon, 16 Jul 2018 12:39:55 -0700 Subject: [Numpy-discussion] ANN: Numexpr 2.6.6 Message-ID: ========================== Announcing Numexpr 2.6.6 ========================== Hi everyone, This is a bug-fix release. Thanks to Mark Dickinson for a fix to the thread barrier that occasionally suffered from spurious wakeups on MacOSX. Project documentation is available at: http://numexpr.readthedocs.io/ Changes from 2.6.5 to 2.6.6 --------------------------- - Thanks to Mark Dickinson for a fix to the thread barrier that occasionally suffered from spurious wakeups on MacOSX. What's Numexpr?
--------------- Numexpr is a fast numerical expression evaluator for NumPy. With it, expressions that operate on arrays (like "3*a+4*b") are accelerated and use less memory than doing the same calculation in Python. It has multi-threaded capabilities, as well as support for Intel's MKL (Math Kernel Library), which allows an extremely fast evaluation of transcendental functions (sin, cos, tan, exp, log...) while squeezing the last drop of performance out of your multi-core processors. Look here for some benchmarks of numexpr using MKL: https://github.com/pydata/numexpr/wiki/NumexprMKL Its only dependency is NumPy (MKL is optional), so it works well as an easy-to-deploy, easy-to-use, computational engine for projects that don't want to adopt other solutions requiring more heavy dependencies. Where can I find Numexpr? ------------------------- The project is hosted at GitHub in: https://github.com/pydata/numexpr You can get the packages from PyPI as well (but not for RC releases): http://pypi.python.org/pypi/numexpr Documentation is hosted at: http://numexpr.readthedocs.io/en/latest/ Share your experience --------------------- Let us know of any bugs, suggestions, gripes, kudos, etc. you may have. Enjoy data! -- Robert McLeod, Ph.D. robbmcleod at gmail.com robbmcleod at protonmail.com robert.mcleod at hitachi-hhtc.ca www.entropyreduction.al -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Thu Jul 19 17:08:24 2018 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 19 Jul 2018 15:08:24 -0600 Subject: [Numpy-discussion] Commit privileges Message-ID: Hi All, The NumPy Steering Council has been looking at commit rights for the NumPy developers hired at BIDS. We would like them to be able to label PRs, close issues, and merge simple fixes; doing that requires commit privileges. OTOH, it is also the case that people paid to work on NumPy don't automatically receive commit privileges.
So it is a bit of a quandary: we don't seem to have an official document codifying the giving of commit privileges, and the Github privileges are rather coarse-grained, pretty much all or nothing for a given repository. The situation has also caused us to rethink commit privileges in general; perhaps we are being too selective. So there is some interest in offering commit privileges more freely, with the understanding that they are needed for many of the mundane tasks required to maintain NumPy, but that new people should be conservative in their exercise of the privilege. Given the reality of the Github environment, such a system needs to be honor based, but would allow more people an opportunity to participate at a deeper level. So in line with that, we are going to give both of the BIDS workers commit privileges, but also extend the option of commit privileges for issue triage and other such things to the community at large. If you have contributed to NumPy and are interested in having commit rights, please respond to this post, but bear in mind that this is an experiment and that things might change if the system is abused. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Fri Jul 20 15:08:50 2018 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 20 Jul 2018 13:08:50 -0600 Subject: [Numpy-discussion] Upcoming 1.15.0 release Message-ID: Hi All, Just a heads up that the NumPy 1.15.0 release is planned for next Monday, Jul 23. If you have encountered a problem with the pre-release, please yell. Chuck -------------- next part -------------- An HTML attachment was scrubbed...
URL: From ralf.gommers at gmail.com Sat Jul 21 02:11:21 2018 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Fri, 20 Jul 2018 23:11:21 -0700 Subject: [Numpy-discussion] Fwd: [NumFOCUS Projects] NumFOCUS Summit Registration In-Reply-To: References: Message-ID: On Sun, Jul 15, 2018 at 3:02 PM, Ralf Gommers wrote: > > > On Wed, Jul 11, 2018 at 3:49 PM, Matti Picus > wrote: > >> On 06/07/18 00:54, Ralf Gommers wrote: >> >>> Hi all, >>> >>> In September the NumFOCUS Summit for 2018 will be held in New York. >>> NumPy can send one representative (or a couple, but costs are only covered >>> for one person). We opened this opportunity up to members of the Steering >>> Council first, however it doesn't look like anyone is in the position to >>> attend. Therefore we'd now like to ask if anyone else would like to attend. >>> >>> Rules of the game: you do need a strong affiliation with the project >>> (e.g. you have commit right, have been contributing for a while, have been >>> hired to work on NumPy, etc.); in case multiple people are interested then >>> the NumPy Steering Council will make a decision. >>> >>> If you're interested, please reply here or off-list. >>> >>> Cheers, >>> Ralf >>> >>> >> I would like to put my name forward. >> > > Thanks Matti! That makes two good candidates. I'll follow up right now > with the steering committee. > > If anyone else wants to be considered, please reply within 24 hours. > Hi all, here's an update on the decision on who will attend this event to represent NumPy. Allan Haldane and Stefan van der Walt are the two people going. Given that this is more a governance/sustainability/funding type rather than a technical event (even though there's also an opportunity to sprint), it made sense to choose people who have been working on NumPy for at least a couple of years (or much longer, in Stefan's case). Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ralf.gommers at gmail.com Sat Jul 21 14:41:36 2018 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sat, 21 Jul 2018 11:41:36 -0700 Subject: [Numpy-discussion] nominations sought for NumFOCUS board of directors Message-ID: Hi all, Not everyone may be on a NumFOCUS mailing list or read Planet SciPy/Python, so I thought it'd be good to point this out: https://www.numfocus.org/blog/numfocus-to-hold-2018-elections-for-board-of-directors From my perspective, it would be great to see some candidates who are actively involved in sponsored projects, and some who are actively involved in organising events/meetups. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sat Jul 21 19:48:53 2018 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sat, 21 Jul 2018 16:48:53 -0700 Subject: [Numpy-discussion] backwards compatibility and deprecation policy NEP Message-ID: Hi all, Here is a first draft of a NEP on backwards compatibility and deprecation policy. This, I think, mostly formalizes what we've done for the last couple of years; however, I'm sure opinions and wish lists will differ here. Pull request: https://github.com/numpy/numpy/pull/11596 Rendered version: https://github.com/rgommers/numpy/blob/nep-backcompat/doc/neps/nep-0023-backwards-compatibility.rst Full text below (ducks). Cheers, Ralf ======================================================= NEP 23 - Backwards compatibility and deprecation policy ======================================================= :Author: Ralf Gommers :Status: Draft :Type: Process :Created: 2018-07-14 :Resolution: (required for Accepted | Rejected | Withdrawn) Abstract -------- In this NEP we describe NumPy's approach to backwards compatibility, its deprecation and removal policy, and the trade-offs and decision processes for individual cases where breaking backwards compatibility is considered.
Detailed description -------------------- NumPy has a very large user base. Those users rely on NumPy being stable and the code they write that uses NumPy functionality to keep working. NumPy is also actively maintained and improved -- and sometimes improvements require, or are made much easier, by breaking backwards compatibility. Finally, there are trade-offs in stability for existing users vs. avoiding errors or having a better user experience for new users. These competing needs often give rise to heated debates and delays in accepting or rejecting contributions. This NEP tries to address that by providing a policy as well as examples and rationales for when it is or isn't a good idea to break backwards compatibility. General principles: - Aim not to break users' code unnecessarily. - Aim never to change code in ways that can result in users silently getting incorrect results from their previously working code. - Backwards incompatible changes can be made, provided the benefits outweigh the costs. - When assessing the costs, keep in mind that most users do not read the mailing list, do not look at deprecation warnings, and sometimes wait more than one or two years before upgrading from their old version. And that NumPy has many hundreds of thousands or even a couple of million users, so "no one will do or use this" is very likely incorrect. - Benefits include improved functionality, usability and performance (in order of importance), as well as lower maintenance cost and improved future extensibility. - Bug fixes are exempt from the backwards compatibility policy. However in case of serious impact on users (e.g. a downstream library doesn't build anymore), even bug fixes may have to be delayed for one or more releases. - The Python API and the C API will be treated in the same way. Examples ^^^^^^^^ We now discuss a number of concrete examples to illustrate typical issues and trade-offs. 
**Changing the behavior of a function** ``np.histogram`` is probably the most infamous example. First, a new keyword ``new=False`` was introduced; this was then switched over to None one release later, and finally it was removed again. Also, it has a ``normed`` keyword that had behavior that could be considered either suboptimal or broken (depending on one's opinion on the statistics). A new keyword ``density`` was introduced to replace it; ``normed`` started giving ``DeprecationWarning`` only in v1.15.0. Evolution of ``histogram``:: def histogram(a, bins=10, range=None, normed=False): # v1.0.0 def histogram(a, bins=10, range=None, normed=False, weights=None, new=False): #v1.1.0 def histogram(a, bins=10, range=None, normed=False, weights=None, new=None): #v1.2.0 def histogram(a, bins=10, range=None, normed=False, weights=None): #v1.5.0 def histogram(a, bins=10, range=None, normed=False, weights=None, density=None): #v1.6.0 def histogram(a, bins=10, range=None, normed=None, weights=None, density=None): #v1.15.0 # v1.15.0 was the first release where `normed` started emitting # DeprecationWarnings The ``new`` keyword was planned from the start to be temporary; such a plan forces users to change their code more than once. Such keywords (there have been other instances proposed, e.g. ``legacy_index`` in `NEP 21 `_) are not desired. The right thing to have done here would probably have been to deprecate ``histogram`` and introduce a new function ``hist`` in its place. **Returning a view rather than a copy** The ``ndarray.diag`` method used to return a copy. A view would be better for both performance and design consistency. This change was warned about (``FutureWarning``) in v1.8.0, and in v1.9.0 ``diag`` was changed to return a *read-only* view. The planned change to a writeable view in v1.10.0 was postponed due to backwards compatibility concerns, and is still an open issue (gh-7661). What should have happened instead: nothing.
This change resulted in a lot of discussions and wasted effort, did not achieve its final goal, and was not that important in the first place. Finishing the change to a *writeable* view in the future is not desired, because it will result in users silently getting different results if they upgraded multiple versions or simply missed the warnings. **Disallowing indexing with floats** Indexing an array with floats is asking for something ambiguous, and can be a sign of a bug in user code. After some discussion, it was deemed a good idea to deprecate indexing with floats. This was first tried for the v1.8.0 release, however in pre-release testing it became clear that this would break many libraries that depend on NumPy. Therefore it was reverted before release, to give those libraries time to fix their code first. It was finally introduced for v1.11.0 and turned into a hard error for v1.12.0. This change was disruptive, however it did catch real bugs in e.g. SciPy and scikit-learn. Overall the change was worth the cost, and introducing it in master first to allow testing, then removing it again before a release, is a useful strategy. Similar recent deprecations also look like good examples of cleanups/improvements: - removing deprecated boolean indexing (gh-8312) - deprecating truth testing on empty arrays (gh-9718) - deprecating ``np.sum(generator)`` (gh-10670, one issue with this one is that its warning message is wrong - this should error in the future). **Removing the financial functions** The financial functions (e.g. ``np.pmt``) are badly named, are present in the main NumPy namespace, and don't really fit well with NumPy's scope. They were added in 2008 after `a discussion < https://mail.python.org/pipermail/numpy-discussion/2008-April/032353.html>`_ on the mailing list where opinion was divided (but a majority in favor). 
At the moment these functions don't cause a lot of overhead, however there are multiple issues and PRs a year for them, which cost maintainer time to deal with. And they clutter up the ``numpy`` namespace. Discussion in 2013 happened on removing them again (gh-2880). This case is borderline, but given that they're clearly out of scope, deprecation and removal out of at least the main ``numpy`` namespace can be proposed. Alternatively, document clearly that new features for financial functions are unwanted, to keep the maintenance costs to a minimum. **Examples of features not added because of backwards compatibility** TODO: do we have good examples here? Possibly subclassing related? Removing complete submodules ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ This year there have been suggestions to consider removing some or all of ``numpy.distutils``, ``numpy.f2py``, ``numpy.linalg``, and ``numpy.random``. The motivation was that all these cost maintenance effort, and that they slow down work on the core of NumPy (ndarrays, dtypes and ufuncs). The impact on downstream libraries and users would be very large, and maintenance of these modules would still have to happen. Therefore this is simply not a good idea; removing these submodules should not happen even for a new major version of NumPy. Subclassing of ndarray ^^^^^^^^^^^^^^^^^^^^^^ Subclassing of ``ndarray`` is a pain point. ``ndarray`` was not (or at least not well) designed to be subclassed. Despite that, a lot of subclasses have been created even within the NumPy code base itself, and some of those (e.g. ``MaskedArray``, ``astropy.units.Quantity``) are quite popular. The main problems with subclasses are: - They make it hard to change ``ndarray`` in ways that would otherwise be backwards compatible. - Some of them change the behavior of ndarray methods, making it difficult to write code that accepts array duck-types. Subclassing ``ndarray`` has been officially discouraged for a long time.
Of the most important subclasses, ``np.matrix`` will be deprecated (see gh-10142) and ``MaskedArray`` will be kept in NumPy (`NEP 17 `_). ``MaskedArray`` will ideally be rewritten in a way such that it uses only public NumPy APIs. For subclasses outside of NumPy, more work is needed to provide alternatives (e.g. mixins, see gh-9016 and gh-10446) or better support for custom dtypes (see gh-2899). Until that is done, subclasses need to be taken into account when making changes to the NumPy code base. A future change in NumPy to not support subclassing will certainly need a major version increase. Policy ------ 1. Code changes that have the potential to silently change the results of a user's code must never be made (except in the case of clear bugs). 2. Code changes that break users' code (i.e. the user will see a clear exception) can be made, *provided the benefit is worth the cost* and suitable deprecation warnings have been raised first. 3. Deprecation warnings are in all cases warnings that functionality will be removed. If there is no intent to remove functionality, then deprecation in documentation only or other types of warnings shall be used. 4. Deprecations for stylistic reasons (e.g. consistency between functions) are strongly discouraged. Deprecations: - shall include the version numbers of both when the functionality was deprecated and when it will be removed (either two releases after the warning is introduced, or in the next major version). - shall include information on alternatives to the deprecated functionality, or a reason for the deprecation if no clear alternative is available. - shall use ``VisibleDeprecationWarning`` rather than ``DeprecationWarning`` for cases of relevance to end users (as opposed to cases only relevant to libraries building on top of NumPy). - shall be listed in the release notes of the release where the deprecation happened.
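As a concrete illustration of these deprecation rules (an editorial sketch, not part of the NEP draft; the function names are made up), a warning that follows them might look like:

```python
import warnings
import numpy as np

# Hypothetical deprecation following the rules above: name both the
# deprecation and removal versions, point at an alternative, and use
# VisibleDeprecationWarning because end users are affected.
def old_function(a):
    warnings.warn(
        "`old_function` is deprecated since 1.16 and will be removed in "
        "1.18; use `new_function` instead.",
        np.VisibleDeprecationWarning,
        stacklevel=2,  # point the warning at the caller, not this frame
    )
    return np.asarray(a).sum()
```

`stacklevel=2` makes the warning show the user's call site, which makes the deprecation much easier to act on.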
Removal of deprecated functionality: - shall be done after 2 releases (assuming a 6-monthly release cycle; if that changes, there shall be at least 1 year between deprecation and removal), unless the impact of the removal is such that a major version number increase is warranted. - shall be listed in the release notes of the release where the removal happened. Versioning: - removal of deprecated code can be done in any minor (but not bugfix) release. - for heavily used functionality (e.g. removal of ``np.matrix``, of a whole submodule, or significant changes to behavior for subclasses) the major version number shall be increased. In concrete cases where this policy needs to be applied, decisions are made according to the `NumPy governance model `_. Functionality with more strict policies: - ``numpy.random`` has its own backwards compatibility policy, see `NEP 19 `_. - The file format for ``.npy`` and ``.npz`` files must not be changed in a backwards incompatible way. Alternatives ------------ **Being more aggressive with deprecations.** The goal of being more aggressive is to allow NumPy to move forward faster. This would avoid others inventing their own solutions (often in multiple places), as well as be a benefit to users without a legacy code base. We reject this alternative because of the place NumPy has in the scientific Python ecosystem -- being fairly conservative is required in order to not increase the extra maintenance for downstream libraries and end users to an unacceptable level. **Semantic versioning.** This would change the versioning scheme for code removals; those could then only be done when the major version number is increased. Rationale for rejection: semantic versioning is relatively common in software engineering, however it is not at all common in the Python world. Also, it would mean that NumPy's version number simply starts to increase faster, which would be more confusing than helpful. gh-10156 contains more discussion on this alternative.
Discussion
----------

TODO

This section may just be a bullet list including links to any discussions regarding the NEP:

- This includes links to mailing list threads or relevant GitHub issues.

References and Footnotes
------------------------

.. [1] TODO

Copyright
---------

This document has been placed in the public domain. [1]_

From einstein.edison at gmail.com  Sat Jul 21 20:46:04 2018
From: einstein.edison at gmail.com (Hameer Abbasi)
Date: Sat, 21 Jul 2018 20:46:04 -0400
Subject: [Numpy-discussion] backwards compatibility and deprecation policy NEP
In-Reply-To:
References:
Message-ID:

Hello,

Very well written article! It takes a lot of important things into account. I think a number of things should be mentioned, if only in the alternatives:

- One major version number change, with lots of "major version change" deprecations grouped into it, along with an LTS release.
- The possibility of another major version change (possibly the same one) where we re-write all portions that were agreed upon (via NEPs) to be re-written, with a longer LTS release (3 years? 5?).
  - I'm thinking this one could be similar to the Python 2 -> Python 3 transition. Note that this is different from having constant breakages, this will be a mostly one-time effort and one-time breakage.
- We break the ABI, but not most of the C API.
- We port at least bug fixes and possibly oft-requested functionality to the old version for a long time.
- But we fix all of the little things that are agreed upon by the community to be "missing" or "wrong" in the current release. It may be a while before this is adopted but it'll be really beneficial in the long run.
  - We ping the dev-discussions of most major downstream users (SciPy, all the scikits, Matplotlib, etc.) for their "pain points" and also if they think this is a good idea. This way, the users included aren't just those on the NumPy mailing list.
- We enforce good practices in our code. For example, we will explicitly disallow subclassing from ndarray, we get rid of scalars, we fix the type system.

This may sound radical (I myself think so), but consider that if we get rid of a large amount of technical debt at the onset, have a reputation for a clean code-base (rather than one that's decades old), then we could onboard a lot more active developers and existing developers can also get a lot more work done. I may be getting ahead of myself on this, but feel free to leave your thoughts and opinions.

Best regards,
Hameer Abbasi

Sent from Astro for Mac

On 22. Jul 2018 at 01:48, Ralf Gommers wrote:

Hi all,

Here is a first draft of a NEP on backwards compatibility and deprecation policy. This I think mostly formalized what we've done for the last couple of years, however I'm sure opinions and wish lists will differ here.

Pull request: https://github.com/numpy/numpy/pull/11596

Rendered version: https://github.com/rgommers/numpy/blob/nep-backcompat/doc/neps/nep-0023-backwards-compatibility.rst

Full text below (ducks).

Cheers,
Ralf

=======================================================
NEP 23 - Backwards compatibility and deprecation policy
=======================================================

:Author: Ralf Gommers
:Status: Draft
:Type: Process
:Created: 2018-07-14
:Resolution: (required for Accepted | Rejected | Withdrawn)

Abstract
--------

In this NEP we describe NumPy's approach to backwards compatibility, its deprecation and removal policy, and the trade-offs and decision processes for individual cases where breaking backwards compatibility is considered.

Detailed description
--------------------

NumPy has a very large user base. Those users rely on NumPy being stable and the code they write that uses NumPy functionality to keep working. NumPy is also actively maintained and improved -- and sometimes improvements require, or are made much easier, by breaking backwards compatibility.
Finally, there are trade-offs in stability for existing users vs. avoiding errors or having a better user experience for new users. These competing needs often give rise to heated debates and delays in accepting or rejecting contributions. This NEP tries to address that by providing a policy as well as examples and rationales for when it is or isn't a good idea to break backwards compatibility.

General principles:

- Aim not to break users' code unnecessarily.
- Aim never to change code in ways that can result in users silently getting incorrect results from their previously working code.
- Backwards incompatible changes can be made, provided the benefits outweigh the costs.
- When assessing the costs, keep in mind that most users do not read the mailing list, do not look at deprecation warnings, and sometimes wait more than one or two years before upgrading from their old version. And that NumPy has many hundreds of thousands or even a couple of million users, so "no one will do or use this" is very likely incorrect.
- Benefits include improved functionality, usability and performance (in order of importance), as well as lower maintenance cost and improved future extensibility.
- Bug fixes are exempt from the backwards compatibility policy. However in case of serious impact on users (e.g. a downstream library doesn't build anymore), even bug fixes may have to be delayed for one or more releases.
- The Python API and the C API will be treated in the same way.

Examples
^^^^^^^^

We now discuss a number of concrete examples to illustrate typical issues and trade-offs.

**Changing the behavior of a function**

``np.histogram`` is probably the most infamous example. First, a new keyword ``new=False`` was introduced; this was then switched over to None one release later, and finally it was removed again. Also, it has a ``normed`` keyword that had behavior that could be considered either suboptimal or broken (depending on one's opinion on the statistics).
A new keyword ``density`` was introduced to replace it; ``normed`` started giving ``DeprecationWarning`` only in v1.15.0.

Evolution of ``histogram``::

    def histogram(a, bins=10, range=None, normed=False):  # v1.0.0

    def histogram(a, bins=10, range=None, normed=False, weights=None, new=False):  # v1.1.0

    def histogram(a, bins=10, range=None, normed=False, weights=None, new=None):  # v1.2.0

    def histogram(a, bins=10, range=None, normed=False, weights=None):  # v1.5.0

    def histogram(a, bins=10, range=None, normed=False, weights=None, density=None):  # v1.6.0

    def histogram(a, bins=10, range=None, normed=None, weights=None, density=None):  # v1.15.0
    # v1.15.0 was the first release where `normed` started emitting
    # DeprecationWarnings

The ``new`` keyword was planned from the start to be temporary; such a plan forces users to change their code more than once. Such keywords (there have been other instances proposed, e.g. ``legacy_index`` in `NEP 21 `_) are not desired. The right thing to have done here would probably have been to deprecate ``histogram`` and introduce a new function ``hist`` in its place.

**Returning a view rather than a copy**

The ``ndarray.diagonal`` method used to return a copy. A view would be better for both performance and design consistency. This change was warned about (``FutureWarning``) in v1.8.0, and in v1.9.0 ``diagonal`` was changed to return a *read-only* view. The planned change to a writeable view in v1.10.0 was postponed due to backwards compatibility concerns, and is still an open issue (gh-7661).

What should have happened instead: nothing. This change resulted in a lot of discussions and wasted effort, did not achieve its final goal, and was not that important in the first place. Finishing the change to a *writeable* view in the future is not desired, because it will result in users silently getting different results if they upgraded multiple versions or simply missed the warnings.
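As a small editorial illustration (not part of the NEP text), the halfway state of ``diagonal`` described above can be observed directly in a modern NumPy:

```python
import numpy as np

a = np.arange(9).reshape(3, 3)
d = a.diagonal()

# Since v1.9.0, diagonal() returns a view: writing to `a` is visible in `d`.
a[0, 0] = 99
print(d[0])               # 99

# But the view is read-only -- the planned final step to a *writeable* view
# (gh-7661) never happened, so writing to `d` raises an error instead.
print(d.flags.writeable)  # False
```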
**Disallowing indexing with floats**

Indexing an array with floats is asking for something ambiguous, and can be a sign of a bug in user code. After some discussion, it was deemed a good idea to deprecate indexing with floats. This was first tried for the v1.8.0 release, however in pre-release testing it became clear that this would break many libraries that depend on NumPy. Therefore it was reverted before release, to give those libraries time to fix their code first. It was finally introduced for v1.11.0 and turned into a hard error for v1.12.0.

This change was disruptive, however it did catch real bugs in e.g. SciPy and scikit-learn. Overall the change was worth the cost, and introducing it in master first to allow testing, then removing it again before a release, is a useful strategy.

Similar recent deprecations also look like good examples of cleanups/improvements:

- removing deprecated boolean indexing (gh-8312)
- deprecating truth testing on empty arrays (gh-9718)
- deprecating ``np.sum(generator)`` (gh-10670; one issue with this one is that its warning message is wrong - this should error in the future).

**Removing the financial functions**

The financial functions (e.g. ``np.pmt``) are badly named, are present in the main NumPy namespace, and don't really fit well with NumPy's scope. They were added in 2008 after `a discussion <https://mail.python.org/pipermail/numpy-discussion/2008-April/032353.html>`_ on the mailing list where opinion was divided (but a majority in favor). At the moment these functions don't cause a lot of overhead, however there are multiple issues and PRs a year for them which cost maintainer time to deal with. And they clutter up the ``numpy`` namespace. Discussion in 2013 happened on removing them again (gh-2880).

This case is borderline, but given that they're clearly out of scope, deprecation and removal out of at least the main ``numpy`` namespace can be proposed.
Alternatively, document clearly that new features for financial functions are unwanted, to keep the maintenance costs to a minimum.

**Examples of features not added because of backwards compatibility**

TODO: do we have good examples here? Possibly subclassing related?

Removing complete submodules
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This year there have been suggestions to consider removing some or all of ``numpy.distutils``, ``numpy.f2py``, ``numpy.linalg``, and ``numpy.random``. The motivation was that all these cost maintenance effort, and that they slow down work on the core of NumPy (ndarrays, dtypes and ufuncs).

The impact on downstream libraries and users would be very large, and maintenance of these modules would still have to happen. Therefore this is simply not a good idea; removing these submodules should not happen even for a new major version of NumPy.

Subclassing of ndarray
^^^^^^^^^^^^^^^^^^^^^^

Subclassing of ``ndarray`` is a pain point. ``ndarray`` was not (or at least not well) designed to be subclassed. Despite that, a lot of subclasses have been created even within the NumPy code base itself, and some of those (e.g. ``MaskedArray``, ``astropy.units.Quantity``) are quite popular. The main problems with subclasses are:

- They make it hard to change ``ndarray`` in ways that would otherwise be backwards compatible.
- Some of them change the behavior of ndarray methods, making it difficult to write code that accepts array duck-types.

Subclassing ``ndarray`` has been officially discouraged for a long time. Of the most important subclasses, ``np.matrix`` will be deprecated (see gh-10142) and ``MaskedArray`` will be kept in NumPy (`NEP 17 `_). ``MaskedArray`` will ideally be rewritten in a way such that it uses only public NumPy APIs. For subclasses outside of NumPy, more work is needed to provide alternatives (e.g. mixins, see gh-9016 and gh-10446) or better support for custom dtypes (see gh-2899).
Until that is done, subclasses need to be taken into account when making changes to the NumPy code base. A future change in NumPy to not support subclassing will certainly need a major version increase.

Policy
------

1. Code changes that have the potential to silently change the results of a user's code must never be made (except in the case of clear bugs).
2. Code changes that break users' code (i.e. the user will see a clear exception) can be made, *provided the benefit is worth the cost* and suitable deprecation warnings have been raised first.
3. Deprecation warnings are in all cases warnings that functionality will be removed. If there is no intent to remove functionality, then deprecation in documentation only or other types of warnings shall be used.
4. Deprecations for stylistic reasons (e.g. consistency between functions) are strongly discouraged.

Deprecations:

- shall include the version numbers of both when the functionality was deprecated and when it will be removed (either two releases after the warning is introduced, or in the next major version).
- shall include information on alternatives to the deprecated functionality, or a reason for the deprecation if no clear alternative is available.
- shall use ``VisibleDeprecationWarning`` rather than ``DeprecationWarning`` for cases of relevance to end users (as opposed to cases only relevant to libraries building on top of NumPy).
- shall be listed in the release notes of the release where the deprecation happened.

Removal of deprecated functionality:

- shall be done after 2 releases (assuming a 6-monthly release cycle; if that changes, there shall be at least 1 year between deprecation and removal), unless the impact of the removal is such that a major version number increase is warranted.
- shall be listed in the release notes of the release where the removal happened.

Versioning:

- removal of deprecated code can be done in any minor (but not bugfix) release.
- for heavily used functionality (e.g. removal of ``np.matrix``, of a whole submodule, or significant changes to behavior for subclasses) the major version number shall be increased.

In concrete cases where this policy needs to be applied, decisions are made according to the `NumPy governance model `_.

Functionality with more strict policies:

- ``numpy.random`` has its own backwards compatibility policy, see `NEP 19 `_.
- The file format for ``.npy`` and ``.npz`` files must not be changed in a backwards incompatible way.

Alternatives
------------

**Being more aggressive with deprecations.**

The goal of being more aggressive is to allow NumPy to move forward faster. This would avoid others inventing their own solutions (often in multiple places), as well as be a benefit to users without a legacy code base. We reject this alternative because of the place NumPy has in the scientific Python ecosystem - being fairly conservative is required in order to not increase the extra maintenance for downstream libraries and end users to an unacceptable level.

**Semantic versioning.**

This would change the versioning scheme for code removals; those could then only be done when the major version number is increased. Rationale for rejection: semantic versioning is relatively common in software engineering, however it is not at all common in the Python world. Also, it would mean that NumPy's version number simply starts to increase faster, which would be more confusing than helpful. gh-10156 contains more discussion on this alternative.

Discussion
----------

TODO

This section may just be a bullet list including links to any discussions regarding the NEP:

- This includes links to mailing list threads or relevant GitHub issues.

References and Footnotes
------------------------

.. [1] TODO

Copyright
---------

This document has been placed in the public domain.
[1]_

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion at python.org
https://mail.python.org/mailman/listinfo/numpy-discussion

From ralf.gommers at gmail.com  Sat Jul 21 21:05:28 2018
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Sat, 21 Jul 2018 18:05:28 -0700
Subject: [Numpy-discussion] backwards compatibility and deprecation policy NEP
In-Reply-To:
References:
Message-ID:

On Sat, Jul 21, 2018 at 5:46 PM, Hameer Abbasi wrote:

> Hello,
>
> Very well written article! It takes a lot of important things into
> account. I think a number of things should be mentioned, if only in the
> alternatives:
>
> - One major version number change, with lots of "major version change"
> deprecations grouped into it, along with an LTS release.
>

Good point, will add under alternatives. Note that we've tried that before, or planned to do so. It doesn't work well in practice; we also don't really have the manpower to do all the changes we'd want in a single release.

> - The possibility of another major version change (possibly the same
> one) where we re-write all portions that were agreed upon (via NEPs) to be
> re-written, with a longer LTS release (3 years? 5?).
> - I'm thinking this one could be similar to the Python 2 -> Python
> 3 transition. Note that this is different from having constant breakages,
> this will be a mostly one-time effort and one-time breakage.
>

The Python 2 to 3 analogy is a good reason for not doing this :)

> - We break the ABI, but not most of the C API.
>

Good catch, I didn't mention ABI at all. My opinion: breaking ABI will still require a major version change, but the bar for it is now lower. Basically what Travis was arguing for years ago, only today his argument is actually true due to conda and binary wheels on the 3 major platforms.
> - We port at least bug fixes and possibly oft-requested functionality
> to the old version for a long time.
> - But we fix all of the little things that are agreed upon by the
> community to be "missing" or "wrong" in the current release. It may be a
> while before this is adopted but it'll be really beneficial in the long run.
> - We ping the dev-discussions of most major downstream users
> (SciPy, all the scikits, Matplotlib, etc.) for their "pain points" and also
> if they think this is a good idea. This way, the users included
> aren't just those on the NumPy mailing list.
> - We enforce good practices in our code. For example, we will
> explicitly disallow subclassing from ndarray, we get rid of scalars, we fix
> the type system.
>
> This may sound radical (I myself think so), but consider that if we get
> rid of a large amount of technical debt at the onset, have a reputation for
> a clean code-base (rather than one that's decades old), then we could
> onboard a lot more active developers and existing developers can also get a
> lot more work done. I may be getting ahead of myself on this, but feel free
> to leave your thoughts and opinions.

I think it sounds nice in theory, but given the history on large design changes/decisions I don't believe we are able to get things right in a first big rewrite. For example "fix the type system" - we all would like something better, but in the 5+ years that we've talked about it, no one has even put a complete design on paper. And for the ones we did do, like __numpy_ufunc__, we definitely needed a few iterations. That points to gradual evolution being a better model.

Cheers,
Ralf

> Best regards,
> Hameer Abbasi
> Sent from Astro for Mac
>
> On 22. Jul 2018 at 01:48, Ralf Gommers wrote:
>
> Hi all,
>
> Here is a first draft of a NEP on backwards compatibility and deprecation
> policy.
This I think mostly formalized what we've done for the last couple > of years, however I'm sure opinions and wish lists will differ here. > > Pull request: https://github.com/numpy/numpy/pull/11596 > > Rendered version: https://github.com/rgommers/ > numpy/blob/nep-backcompat/doc/neps/nep-0023-backwards-compatibility.rst > > Full text below (ducks). > > Cheers, > Ralf > > > ======================================================= > NEP 23 - Backwards compatibility and deprecation policy > ======================================================= > > :Author: Ralf Gommers > :Status: Draft > :Type: Process > :Created: 2018-07-14 > :Resolution: (required for Accepted | Rejected | Withdrawn) > > Abstract > -------- > > In this NEP we describe NumPy's approach to backwards compatibility, > its deprecation and removal policy, and the trade-offs and decision > processes for individual cases where breaking backwards compatibility > is considered. > > > Detailed description > -------------------- > > NumPy has a very large user base. Those users rely on NumPy being stable > and the code they write that uses NumPy functionality to keep working. > NumPy is also actively maintained and improved -- and sometimes > improvements > require, or are made much easier, by breaking backwards compatibility. > Finally, there are trade-offs in stability for existing users vs. avoiding > errors or having a better user experience for new users. These competing > needs often give rise to heated debates and delays in accepting or > rejecting > contributions. This NEP tries to address that by providing a policy as > well > as examples and rationales for when it is or isn't a good idea to break > backwards compatibility. > > General principles: > > - Aim not to break users' code unnecessarily. > - Aim never to change code in ways that can result in users silently > getting > incorrect results from their previously working code. 
> - Backwards incompatible changes can be made, provided the benefits > outweigh > the costs. > - When assessing the costs, keep in mind that most users do not read the > mailing > list, do not look at deprecation warnings, and sometimes wait more than > one or > two years before upgrading from their old version. And that NumPy has > many hundreds of thousands or even a couple of million users, so "no one > will > do or use this" is very likely incorrect. > - Benefits include improved functionality, usability and performance (in > order > of importance), as well as lower maintenance cost and improved future > extensibility. > - Bug fixes are exempt from the backwards compatibility policy. However > in case > of serious impact on users (e.g. a downstream library doesn't build > anymore), > even bug fixes may have to be delayed for one or more releases. > - The Python API and the C API will be treated in the same way. > > > Examples > ^^^^^^^^ > > We now discuss a number of concrete examples to illustrate typical issues > and trade-offs. > > **Changing the behavior of a function** > > ``np.histogram`` is probably the most infamous example. > First, a new keyword ``new=False`` was introduced, this was then switched > over to None one release later, and finally it was removed again. > Also, it has a ``normed`` keyword that had behavior that could be > considered > either suboptimal or broken (depending on ones opinion on the statistics). > A new keyword ``density`` was introduced to replace it; ``normed`` started > giving > ``DeprecationWarning`` only in v.1.15.0. 
Evolution of ``histogram``:: > > def histogram(a, bins=10, range=None, normed=False): # v1.0.0 > > def histogram(a, bins=10, range=None, normed=False, weights=None, > new=False): #v1.1.0 > > def histogram(a, bins=10, range=None, normed=False, weights=None, > new=None): #v1.2.0 > > def histogram(a, bins=10, range=None, normed=False, weights=None): > #v1.5.0 > > def histogram(a, bins=10, range=None, normed=False, weights=None, > density=None): #v1.6.0 > > def histogram(a, bins=10, range=None, normed=None, weights=None, > density=None): #v1.15.0 > # v1.15.0 was the first release where `normed` started emitting > # DeprecationWarnings > > The ``new`` keyword was planned from the start to be temporary; such a plan > forces users to change their code more than once. Such keywords (there > have > been other instances proposed, e.g. ``legacy_index`` in > `NEP 21 `_) > are not > desired. The right thing to have done here would probably have been to > deprecate ``histogram`` and introduce a new function ``hist`` in its place. > > **Returning a view rather than a copy** > > The ``ndarray.diag`` method used to return a copy. A view would be better > for > both performance and design consistency. This change was warned about > (``FutureWarning``) in v.8.0, and in v1.9.0 ``diag`` was changed to return > a *read-only* view. The planned change to a writeable view in v1.10.0 was > postponed due to backwards compatibility concerns, and is still an open > issue > (gh-7661). > > What should have happened instead: nothing. This change resulted in a lot > of > discussions and wasted effort, did not achieve its final goal, and was not > that > important in the first place. Finishing the change to a *writeable* view > in > the future is not desired, because it will result in users silently getting > different results if they upgraded multiple versions or simply missed the > warnings. 
> > **Disallowing indexing with floats** > > Indexing an array with floats is asking for something ambiguous, and can > be a > sign of a bug in user code. After some discussion, it was deemed a good > idea > to deprecate indexing with floats. This was first tried for the v1.8.0 > release, however in pre-release testing it became clear that this would > break > many libraries that depend on NumPy. Therefore it was reverted before > release, > to give those libraries time to fix their code first. It was finally > introduced for v1.11.0 and turned into a hard error for v1.12.0. > > This change was disruptive, however it did catch real bugs in e.g. SciPy > and > scikit-learn. Overall the change was worth the cost, and introducing it in > master first to allow testing, then removing it again before a release, is > a > useful strategy. > > Similar recent deprecations also look like good examples of > cleanups/improvements: > > - removing deprecated boolean indexing (gh-8312) > - deprecating truth testing on empty arrays (gh-9718) > - deprecating ``np.sum(generator)`` (gh-10670, one issue with this one is > that > its warning message is wrong - this should error in the future). > > **Removing the financial functions** > > The financial functions (e.g. ``np.pmt``) are badly named, are present in > the > main NumPy namespace, and don't really fit well with NumPy's scope. > They were added in 2008 after > `a discussion 2008-April/032353.html>`_ > on the mailing list where opinion was divided (but a majority in favor). > At the moment these functions don't cause a lot of overhead, however there > are > multiple issues and PRs a year for them which cost maintainer time to deal > with. And they clutter up the ``numpy`` namespace. Discussion in 2013 > happened > on removing them again (gh-2880). > > This case is borderline, but given that they're clearly out of scope, > deprecation and removal out of at least the main ``numpy`` namespace can be > proposed. 
Alternatively, document clearly that new features for financial > functions are unwanted, to keep the maintenance costs to a minimum. > > **Examples of features not added because of backwards compatibility** > > TODO: do we have good examples here? Possibly subclassing related? > > > Removing complete submodules > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > > This year there have been suggestions to consider removing some or all of > ``numpy.distutils``, ``numpy.f2py``, ``numpy.linalg``, and > ``numpy.random``. > The motivation was that all these cost maintenance effort, and that they > slow > down work on the core of Numpy (ndarrays, dtypes and ufuncs). > > The import on downstream libraries and users would be very large, and > maintenance of these modules would still have to happen. Therefore this is > simply not a good idea; removing these submodules should not happen even > for > a new major version of NumPy. > > > Subclassing of ndarray > ^^^^^^^^^^^^^^^^^^^^^^ > > Subclassing of ``ndarray`` is a pain point. ``ndarray`` was not (or at > least > not well) designed to be subclassed. Despite that, a lot of subclasses > have > been created even within the NumPy code base itself, and some of those > (e.g. > ``MaskedArray``, ``astropy.units.Quantity``) are quite popular. The main > problems with subclasses are: > > - They make it hard to change ``ndarray`` in ways that would otherwise be > backwards compatible. > - Some of them change the behavior of ndarray methods, making it difficult > to > write code that accepts array duck-types. > > Subclassing ``ndarray`` has been officially discouraged for a long time. > Of > the most important subclasses, ``np.matrix`` will be deprecated (see > gh-10142) > and ``MaskedArray`` will be kept in NumPy (`NEP 17 > `_). > ``MaskedArray`` will ideally be rewritten in a way such that it uses only > public NumPy APIs. For subclasses outside of NumPy, more work is needed to > provide alternatives (e.g. 
mixins, see gh-9016 and gh-10446) or better > support > for custom dtypes (see gh-2899). Until that is done, subclasses need to be > taken into account when making change to the NumPy code base. A future > change > in NumPy to not support subclassing will certainly need a major version > increase. > > > Policy > ------ > > 1. Code changes that have the potential to silently change the results of > a users' > code must never be made (except in the case of clear bugs). > 2. Code changes that break users' code (i.e. the user will see a clear > exception) > can be made, *provided the benefit is worth the cost* and suitable > deprecation > warnings have been raised first. > 3. Deprecation warnings are in all cases warnings that functionality will > be removed. > If there is no intent to remove functionlity, then deprecation in > documentation > only or other types of warnings shall be used. > 4. Deprecations for stylistic reasons (e.g. consistency between functions) > are > strongly discouraged. > > Deprecations: > > - shall include the version numbers of both when the functionality was > deprecated > and when it will be removed (either two releases after the warning is > introduced, or in the next major version). > - shall include information on alternatives to the deprecated > functionality, or a > reason for the deprecation if no clear alternative is available. > - shall use ``VisibleDeprecationWarning`` rather than > ``DeprecationWarning`` > for cases of relevance to end users (as opposed to cases only relevant to > libraries building on top of NumPy). > - shall be listed in the release notes of the release where the > deprecation happened. > > Removal of deprecated functionality: > > - shall be done after 2 releases (assuming a 6-monthly release cycle; if > that changes, > there shall be at least 1 year between deprecation and removal), unless > the > impact of the removal is such that a major version number increase is > warranted. 
> - shall be listed in the release notes of the release where the removal > happened. > > Versioning: > > - removal of deprecated code can be done in any minor (but not bugfix) > release. > - for heavily used functionality (e.g. removal of ``np.matrix``, of a > whole submodule, > or significant changes to behavior for subclasses) the major version > number shall > be increased. > > In concrete cases where this policy needs to be applied, decisions are > made according > to the `NumPy governance model > `_. > > Functionality with more strict policies: > > - ``numpy.random`` has its own backwards compatibility policy, > see `NEP 19 `_. > - The file format for ``.npy`` and ``.npz`` files must not be changed in a > backwards > incompatible way. > > > Alternatives > ------------ > > **Being more agressive with deprecations.** > > The goal of being more agressive is to allow NumPy to move forward faster. > This would avoid others inventing their own solutions (often in multiple > places), as well as be a benefit to users without a legacy code base. We > reject this alternative because of the place NumPy has in the scientific > Python > ecosystem - being fairly conservative is required in order to not increase > the > extra maintenance for downstream libraries and end users to an unacceptable > level. > > **Semantic versioning.** > > This would change the versioning scheme for code removals; those could then > only be done when the major version number is increased. Rationale for > rejection: semantic versioning is relatively common in software > engineering, > however it is not at all common in the Python world. Also, it would mean > that > NumPy's version number simply starts to increase faster, which would be > more > confusing than helpful. gh-10156 contains more discussion on this > alternative. 
> > Discussion > ---------- > > TODO > > This section may just be a bullet list including links to any discussions > regarding the NEP: > > - This includes links to mailing list threads or relevant GitHub issues. > > > References and Footnotes > ------------------------ > > .. [1] TODO > > > Copyright > --------- > > This document has been placed in the public domain. [1]_ > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > > From m.h.vankerkwijk at gmail.com Sat Jul 21 21:39:23 2018 From: m.h.vankerkwijk at gmail.com (Marten van Kerkwijk) Date: Sat, 21 Jul 2018 21:39:23 -0400 Subject: [Numpy-discussion] backwards compatibility and deprecation policy NEP In-Reply-To: References: Message-ID: Hi Ralf, Overall, this looks good. But I think the subclassing section is somewhat misleading in suggesting `ndarray` is not well designed to be subclassed. At least, for neither my work on Quantity nor that on MaskedArray, I've found that the design of `ndarray` itself was a problem. Instead, it was the functions that were, as most were not written with subclassing or duck typing in mind, but rather with the assumption that all input should be an array, and that somehow it is useful to pass anything users pass in through `asarray`. With then layers on top to avoid this in specific circumstances... But perhaps this is what you meant? (I would agree, though, that some ndarray subclasses have been designed poorly - especially, matrix, which then led to a problematic duck array in sparse - and that this has resulted in substantial hassle.
Also, subclassing the subclasses is much more problematic than subclassing ndarray - MaskedArray being a particularly annoying example!) The subclassing section also notes that subclassing has been discouraged for a long time. Is that so? Over time, I've certainly had comments from Nathaniel and some others in discussions of PRs that go in that direction, which perhaps reflected some internal consensus I wasn't aware of, but the documentation does not seem to discourage it (check, e.g., the subclassing section [1]). I also think that it may be good to keep in mind that until `__array_ufunc__`, there wasn't much of a choice - support for duck arrays was even more half-hearted (hopefully to become much better with `__array_function__`). Overall, it seems to me that these days in the python eco-system subclassing is simply expected to work. Even within numpy there are other examples (e.g., ufuncs, dtypes) for which there has been quite a bit of discussion about the benefits subclasses would bring. All the best, Marten [1] https://docs.scipy.org/doc/numpy/user/basics.subclassing.html From m.h.vankerkwijk at gmail.com Sat Jul 21 22:00:16 2018 From: m.h.vankerkwijk at gmail.com (Marten van Kerkwijk) Date: Sat, 21 Jul 2018 22:00:16 -0400 Subject: [Numpy-discussion] backwards compatibility and deprecation policy NEP In-Reply-To: References: Message-ID: > > - We enforce good practices in our code. For example, we will > explicitly disallow subclassing from ndarray, we get rid of scalars, we fix > the type system. > > > Given my other reply, probably no surprise that I strongly disagree with the idea of disallowing subclasses. But I'll add to that reply a more general sentiment, that I think one of the problems has been to think that as one develops code, one thinks one knows in advance what users may want to do with it, what input makes sense, etc.
But at least I have found that I am often wrong, that I'm not imaginative enough to know what people may want to do. So, my sense is that the best one can do is to make as few assumptions as possible, so avoid coercing, etc. And if the code gets to a position where it needs to guess what is meant, it should just fail. -- Marten From m.h.vankerkwijk at gmail.com Sat Jul 21 22:06:53 2018 From: m.h.vankerkwijk at gmail.com (Marten van Kerkwijk) Date: Sat, 21 Jul 2018 22:06:53 -0400 Subject: [Numpy-discussion] backwards compatibility and deprecation policy NEP In-Reply-To: References: Message-ID: Hi Ralf, Maybe as a concrete example of something that has been discussed, for which your proposed text makes (I think) clear what should or should not be done: Many of us hate that `np.array` (like, sadly, many other numpy parts) auto-converts anything not obviously array-like to dtype=object, and it has been suggested we should no longer do this by default [1]. Given your NEP, I think you would disagree with that path, as it would quite obviously break users' code (we also get regular issues about object arrays, which show that they are used a lot in the wild). So, instead I guess one might go with a route where one could explicitly tell `dtype=object` was not wanted (say, `dtype="everything-but-object"`)? All the best, Marten [1] https://github.com/numpy/numpy/issues/5353 -------------- next part -------------- An HTML attachment was scrubbed...
URL: From m.h.vankerkwijk at gmail.com Sat Jul 21 22:08:48 2018 From: m.h.vankerkwijk at gmail.com (Marten van Kerkwijk) Date: Sat, 21 Jul 2018 22:08:48 -0400 Subject: [Numpy-discussion] backwards compatibility and deprecation policy NEP In-Reply-To: References: Message-ID: Agreed that changes better be gradual, and that we do not have the manpower to do otherwise (I was slightly shocked to see that my 94 commits in the last two years make me the fourth most prolific contributor in that period... And that is from the couple of hours a week I use while procrastinating on things related to my astronomy day job!) -- Marten -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Sat Jul 21 22:15:14 2018 From: njs at pobox.com (Nathaniel Smith) Date: Sat, 21 Jul 2018 19:15:14 -0700 Subject: [Numpy-discussion] backwards compatibility and deprecation policy NEP In-Reply-To: References: Message-ID: On Sat, Jul 21, 2018 at 4:48 PM, Ralf Gommers wrote: > Hi all, > > Here is a first draft of a NEP on backwards compatibility and deprecation > policy. This I think mostly formalized what we've done for the last couple > of years, however I'm sure opinions and wish lists will differ here. Oh *awesome*, thanks for putting this together. I think this is a great start, but I'd structure it a bit differently. So let me just make a few high-level comments first and see what you think. Regarding the "general principles" and then "policy": to me these feel like more a brainstorming list, that hasn't been fully distilled down into principles yet. I would try to structure it to start with the overarching principles (changes need to benefit users more than they harm them, numpy is widely used so breaking changes should by default be assumed to be fairly harmful, decisions should be based on data and actual effects on users rather than e.g. 
appealing to the docs or abstract aesthetic principles, silently getting wrong answer is much worse than a loud error), then talk about some of the ways this plays out (if people are currently silently getting the wrong answer -- which is the definition of a bug, but also shows up in the index-by-float case -- then that's really bad; some of our tools for collecting data about how bad a breakage is include testing prominent downstreams ourselves, adding warnings or making .0 releases and seeing how people react, etc.), and then examples. Speaking of examples: I hate to say this because in general I think using examples is a great idea. But... I think you should delete most of these examples. The problem is scope creep: the goal for this NEP (IMO) should be to lay out the principles we use to think about these issues in general, but right now it comes across as trying to lay down a final resolution on lots of specific issues (including several where there are ongoing conversations). It ends up like trying to squish multiple NEPs into one, which makes it hard to discuss, and also distracts from the core purpose. My suggestion: keep just two examples, histogram and indexing-with-floats. These are safely done and dusted, totally uncontroversial (AFAIK), and the first is a good illustration of how one can try to be careful and do the right thing but still get it all wrong, while the second is a good example of (a) how we gathered data and decided that an actually pretty disruptive change was nonetheless worth it, and (b) how we had to manage it through multiple false starts. Regarding the actual policy: One alteration to current practice jumped out at me. This policy categorically rules out all changes that could cause currently working code to silently start doing something wrong, regardless of the specific circumstances. That's not how we actually do things right now. 
Instead, our policy in recent years has been that such changes are permitted in theory, but (a) the starting presumption is that this is super harmful to users so we need a *very* good reason to do it, and (b) if we do go ahead with it, then during the deprecation period we use a highly-visible FutureWarning (instead of the invisible-by-default DeprecationWarning). Personally I think the current policy strikes a better balance. You can see some examples of where we've used this by running 'git log -S FUTUREWARNING -S FutureWarning' -- it's things like a bad default for the rcond argument in lstsq, an obscure and error-prone corner case in indexing (0addc016ba), strange semantics for NaT (https://mail.scipy.org/pipermail/numpy-discussion/2015-October/073968.html), ... we could quibble about individual cases, but I think that taking these on a case-by-case basis is better than ruling them out categorically. And in any case, that is what we do now, so if you want to change this, it's something we should discuss and probably write down some rationale and such :-). Regarding the major version number thing: ugh do we really want to talk about this more. I'd probably leave it out of the NEP entirely. If it stays in, I think it needs a clearer description of what counts as a "major" change. There are some examples of things that do "sound" major, but... the rest of our policy is all about measuring disruption based on effects on users, and by that metric, the index-by-float removal was pretty major. My guess is that by the time we finally remove np.matrix, the actual disruption will be less than it was for removing index-by-float. (As it should be, since keeping index-by-float around was actively causing bugs in even well-maintained downstreams, in a way that np.matrix doesn't.) -n -- Nathaniel J. 
Smith -- https://vorpus.org From njs at pobox.com Sat Jul 21 22:25:47 2018 From: njs at pobox.com (Nathaniel Smith) Date: Sat, 21 Jul 2018 19:25:47 -0700 Subject: [Numpy-discussion] backwards compatibility and deprecation policy NEP In-Reply-To: References: Message-ID: On Sat, Jul 21, 2018 at 5:46 PM, Hameer Abbasi wrote: > The possibility of another major version change (possibly the same one) > where we re-write all portions that were agreed upon (via NEPs) to be > re-written, with a longer LTS release (3 years? 5?). > > I'm thinking this one could be similar to the Python 2 -> Python 3 > transition. Note that this is different from having constant breakages, this > will be a mostly one-time effort and one-time breakage. I agree that this approach should probably be discussed in the NEP, specifically in the "rejected alternatives" section. It keeps coming up, and the reasons why it doesn't work for numpy are not obvious, so well-meaning people will keep bringing it up. It'd be helpful to have a single authoritative place to link to explaining why we don't do things that way. The beginning of the NEP should maybe also state up front that we follow a rolling-deprecations model where different breaking changes happen simultaneously on their own timelines. It's so obvious to me that I didn't notice it was missing, but this is a helpful reminder that it's not obvious to everyone :-). -n -- Nathaniel J. Smith -- https://vorpus.org From ralf.gommers at gmail.com Sun Jul 22 14:57:32 2018 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sun, 22 Jul 2018 11:57:32 -0700 Subject: [Numpy-discussion] backwards compatibility and deprecation policy NEP In-Reply-To: References: Message-ID: Hi Marten, Thanks for the thoughtful reply. On Sat, Jul 21, 2018 at 6:39 PM, Marten van Kerkwijk < m.h.vankerkwijk at gmail.com> wrote: > Hi Ralf, > > Overall, this looks good.
But I think the subclassing section is somewhat > misleading in suggesting `ndarray` is not well designed to be subclassed. > At least, for neither my work on Quantity nor that on MaskedArray, I've > found that the design of `ndarray` itself was a problem. Instead, it was > the functions that were, as most were not written with subclassing or duck > typing in mind, but rather with the assumption that all input should be an > array, and that somehow it is useful to pass anything users pass in through > `asarray`. With then layers on top to avoid this in specific > circumstances... But perhaps this is what you meant? (I would agree, > though, that some ndarray subclasses have been designed poorly - > especially, matrix, which then led to a problematic duck array in sparse - > and that this has resulted in substantial hassle. Also, subclassing the > subclasses is much more problematic that subclassing ndarray - MaskedArray > being a particularly annoying example!) > You're completely right I think. We have had problems with subclasses for a long time, but that is due to mainly np.matrix being badly behaved, which then led to code everywhere using asarray, which then led to lots of issues with other subclasses. This basically meant subclasses were problematic, and hence most numpy devs would like to not see more subclasses. > The subclassing section also notes that subclassing has been discouraged > for a long time. Is that so? Over time, I've certainly had comments from > Nathaniel and some others in discussions of PRs that go in that direction, > which perhaps reflected some internal consensus I wasn't aware of, > I think yes there is some vague but not written down mostly-consensus, due to the dynamic with asarray above. > but the documentation does not seem to discourage it (check, e.g., the > subclassing section [1]). 
I also think that it may be good to keep in mind > that until `__array_ufunc__`, there wasn't much of a choice - support for > duck arrays was even more half-hearted (hopefully to become much better > with `__array_function__`). > True. I think long term duck arrays are the way to go, because asarray is not going to disappear. But for now we just have to do the best we can dealing with subclasses. The subclassing doc [1] really needs an update on what the practical issues are. > Overall, it seems to me that these days in the python eco-system > subclassing is simply expected to work. Even within numpy there are other > examples (e.g., ufuncs, dtypes) for which there has been quite a bit of > discussion about the benefits subclasses would bring. > I'm now thinking what to do with the subclassing section in the NEP. Best to completely remove? I was triggered to include it by some things Stephan said last week about subclasses being a blocker to adding new features. So if we keep the section, it may be helpful for you and Stephan to help shape that. Cheers, Ralf > > All the best, > > Marten > > [1] https://docs.scipy.org/doc/numpy/user/basics.subclassing.html > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sun Jul 22 15:01:20 2018 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sun, 22 Jul 2018 12:01:20 -0700 Subject: [Numpy-discussion] backwards compatibility and deprecation policy NEP In-Reply-To: References: Message-ID: On Sat, Jul 21, 2018 at 7:06 PM, Marten van Kerkwijk < m.h.vankerkwijk at gmail.com> wrote: > Hi Ralf, > > Maybe as a concrete example of something that has been discussed, for > which your proposed text makes (I think) clear what should or should not be > done: > > Many of us hate that `np.array` (like, sadly, many other numpy parts) > auto-converts anything not obviously array-like to dtype=object, and it has > been suggested we should no longer do this by default [1]. 
Given your NEP, > I think you would disagree with that path, as it would quite obviously > break users' code (we also get regular issues about object arrays, which > show that they are used a lot in the wild). So, instead I guess one might > go with a route where one could explicitly tell `dtype=object` was not > wanted (say, `dtype="everything-but-object"`)? > Thanks, good example. "everything-but-object" makes sense to me. I'd indeed argue that changing the current conversion to object dtype behavior would break way too much code. Cheers, Ralf > All the best, > > Marten > > [1] https://github.com/numpy/numpy/issues/5353 From ralf.gommers at gmail.com Sun Jul 22 15:03:05 2018 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sun, 22 Jul 2018 12:03:05 -0700 Subject: [Numpy-discussion] backwards compatibility and deprecation policy NEP In-Reply-To: References: Message-ID: On Sat, Jul 21, 2018 at 7:25 PM, Nathaniel Smith wrote: > On Sat, Jul 21, 2018 at 5:46 PM, Hameer Abbasi > wrote: > > The possibility of another major version change (possibly the same one) > > where we re-write all portions that were agreed upon (via NEPs) to be > > re-written, with a longer LTS release (3 years? 5?). > > > > I'm thinking this one could be similar to the Python 2 -> Python 3 > > transition. Note that this is different from having constant breakages, > this > > will be a mostly one-time effort and one-time breakage. > > I agree that this approach should probably be discussed in the NEP, > specifically in the "rejected alternatives" section. It keeps coming > up, and the reasons why it doesn't work for numpy are not obvious, so > well-meaning people will keep bringing it up. It'd be helpful to have > a single authoritative place to link to explaining why we don't do > things that way.
> good idea, will do > The beginning of the NEP should maybe also state up front that we > follow a rolling-deprecations model where different breaking changes > happen simultaneously on their own timelines. It's so obvious to me > that I didn't notice it was missing, but this is a helpful reminder > that it's not obvious to everyone :-). > Hmm, indeed. It is that obvious to us, but clearly not to people who are new to NumPy/Python. Will add. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sun Jul 22 15:28:30 2018 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sun, 22 Jul 2018 12:28:30 -0700 Subject: [Numpy-discussion] backwards compatibility and deprecation policy NEP In-Reply-To: References: Message-ID: On Sat, Jul 21, 2018 at 7:15 PM, Nathaniel Smith wrote: > On Sat, Jul 21, 2018 at 4:48 PM, Ralf Gommers > wrote: > > Hi all, > > > > Here is a first draft of a NEP on backwards compatibility and deprecation > > policy. This I think mostly formalized what we've done for the last > couple > > of years, however I'm sure opinions and wish lists will differ here. > > Oh *awesome*, thanks for putting this together. > > I think this is a great start, but I'd structure it a bit differently. > So let me just make a few high-level comments first and see what you > think. > > Regarding the "general principles" and then "policy": to me these feel > like more a brainstorming list, that hasn't been fully distilled down > into principles yet. I would try to structure it to start with the > overarching principles (changes need to benefit users more than they > harm them, numpy is widely used so breaking changes should by default > be assumed to be fairly harmful, decisions should be based on data and > actual effects on users rather than e.g. 
appealing to the docs or > abstract aesthetic principles, silently getting wrong answer is much > worse than a loud error), then talk about some of the ways this plays > out (if people are currently silently getting the wrong answer -- > which is the definition of a bug, but also shows up in the > index-by-float case -- then that's really bad; some of our tools for > collecting data about how bad a breakage is include testing prominent > downstreams ourselves, adding warnings or making .0 releases and > seeing how people react, etc.), and then examples. > Thanks, I'll try and rework the general principles, you have some excellent points in here. > Speaking of examples: I hate to say this because in general I think > using examples is a great idea. But... I think you should delete most > of these examples. The problem is scope creep: the goal for this NEP > (IMO) should be to lay out the principles we use to think about these > issues in general, but right now it comes across as trying to lay down > a final resolution on lots of specific issues (including several where > there are ongoing conversations). It ends up like trying to squish > multiple NEPs into one, which makes it hard to discuss, and also > distracts from the core purpose. > I'm not sure this is the best thing to do. I can remove a couple, but aiming to be "totally uncontroversial" is almost impossible given the topic of the NEP. The diag view example is important I think, it's the second most discussed backwards compatibility issue next to histogram. I'm happy to remove the statement on what should happen with it going forward though. Then, I think it's not unreasonable to draw a couple of hard lines. For example, removing complete submodules like linalg or random has ended up on some draft brainstorm roadmap list because someone (no idea who) put it there after a single meeting. 
Clearly the cost-benefit of that is such that there's no point even discussing that more, so I'd rather draw that line here than every time someone opens an issue. Very recent example: https://github.com/numpy/numpy/issues/11457 (remove auto-import of numpy.testing). > > My suggestion: keep just two examples, histogram and > indexing-with-floats. These are safely done and dusted, totally > uncontroversial (AFAIK), and the first is a good illustration of how > one can try to be careful and do the right thing but still get it all > wrong, while the second is a good example of (a) how we gathered data > and decided that an actually pretty disruptive change was nonetheless > worth it, and (b) how we had to manage it through multiple false > starts. > > Regarding the actual policy: One alteration to current practice jumped > out at me. This policy categorically rules out all changes that could > cause currently working code to silently start doing something wrong, > regardless of the specific circumstances. That's not how we actually > do things right now. Instead, our policy in recent years has been that > such changes are permitted in theory, but (a) the starting presumption > is that this is super harmful to users so we need a *very* good reason > to do it, and (b) if we do go ahead with it, then during the > deprecation period we use a highly-visible FutureWarning (instead of > the invisible-by-default DeprecationWarning). > > Personally I think the current policy strikes a better balance. You > can see some examples of where we've used this by running 'git log -S > FUTUREWARNING -S FutureWarning' -- it's things like a bad default for > the rcond argument in lstsq, an obscure and error-prone corner case in > indexing (0addc016ba), strange semantics for NaT > (https://mail.scipy.org/pipermail/numpy-discussion/2015-October/073968.html), > ...
we could quibble about individual cases, but I think that taking > these on a case-by-case basis is better than ruling them out > categorically. And in any case, that is what we do now, so if you want > to change this, it's something we should discuss and probably write > down some rationale and such :-). > You're right here. Thanks for the examples. I'll update this according to your suggestion, and propose to use one of the examples (rcond probably) to illustrate. > Regarding the major version number thing: ugh do we really want to > talk about this more. I'd probably leave it out of the NEP entirely. > If it stays in, I think it needs a clearer description of what counts > as a "major" change. I think it has value to keep it, and that it's not really possible to come up with a very clear description of "major". In particular, I'd like every deprecation message to say "this deprecated feature will be removed by release X.Y.0". At the moment we don't do that, so if users see a message they don't know if a removal will happen next year, in the far future (2.0), or never. The major version thing is quite useful to signal our intent. Doesn't mean we need to exhaustively discuss when to do a 2.0 though, I agree that that's not a very useful discussion right now. Happy to remove this though if people don't like it. Other opinions? Cheers, Ralf > There are some examples of things that do "sound" > major, but... the rest of our policy is all about measuring disruption > based on effects on users, and by that metric, the index-by-float > removal was pretty major. My guess is that by the time we finally > remove np.matrix, the actual disruption will be less than it was for > removing index-by-float. (As it should be, since keeping > index-by-float around was actively causing bugs in even > well-maintained downstreams, in a way that np.matrix doesn't.) > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From anntzer.lee at gmail.com Sun Jul 22 16:52:22 2018 From: anntzer.lee at gmail.com (Antony Lee) Date: Sun, 22 Jul 2018 22:52:22 +0200 Subject: [Numpy-discussion] mplcairo 0.1 release Message-ID: Dear all, I am pleased to announce the release of mplcairo 0.1 # Description mplcairo is a Matplotlib backend based on the well-known cairo library, supporting output to both raster (including interactively) and vector formats. In other words, it provides the functionality of Matplotlib's {,qt5,gtk3,wx,tk,macos}{agg,cairo}, pdf, ps, and svg backends. Per Matplotlib's standard API, the backend can be selected by calling matplotlib.use("module://mplcairo.qt") or setting your MPLBACKEND environment variable to `module://mplcairo.qt` for Qt5, and similarly for other toolkits. The source tarball, and Py3.6 manylinux and Windows wheels, are available on PyPI (I am looking for help to generate the OSX wheels). See the README for more details. # Why a new backend? Compared to Matplotlib's builtin Agg and cairo backends, mplcairo presents the following features: - Improved accuracy (e.g., with marker positioning, quad meshes, and text kerning). - Support for a wider variety of font formats, such as otf and pfb, for vector (PDF, PS, SVG) backends (Matplotlib's Agg backend also supports such fonts). - Optional support for complex text layout (right-to-left languages, etc.) using Raqm. **Note** that Raqm depends on Fribidi, which is licensed under the LGPLv2.1+. - Support for embedding URLs in PDF (but not SVG) output (requires cairo >= 1.15.4). - Support for multi-page output both for PDF and PS (Matplotlib only supports multi-page PDF). - Support for custom blend modes (see `examples/operators.py`). See the README for more details. # Changelog from mplcairo 0.1a1 to mplcairo 0.1 - Integration with libraqm now occurs via dlopen() rather than being selected at compile-time. - Various rendering and performance improvements.
- On Travis, we now run Matplotlib's test suite with mplcairo patching the default Agg renderer. Enjoy, Antony Lee -------------- next part -------------- An HTML attachment was scrubbed... URL: From m.h.vankerkwijk at gmail.com Sun Jul 22 19:31:28 2018 From: m.h.vankerkwijk at gmail.com (Marten van Kerkwijk) Date: Sun, 22 Jul 2018 19:31:28 -0400 Subject: [Numpy-discussion] backwards compatibility and deprecation policy NEP In-Reply-To: References: Message-ID: Hi Ralf, >> Overall, this looks good. But I think the subclassing section is somewhat >> misleading in suggesting `ndarray` is not well designed to be subclassed. >> At least, for neither my work on Quantity nor that on MaskedArray, I've >> found that the design of `ndarray` itself was a problem. Instead, it was >> the functions that were, as most were not written with subclassing or duck >> typing in mind, but rather with the assumption that all input should be an >> array, and that somehow it is useful to pass anything users pass in through >> `asarray`. With then layers on top to avoid this in specific >> circumstances... But perhaps this is what you meant? (I would agree, >> though, that some ndarray subclasses have been designed poorly - >> especially, matrix, which then led to a problematic duck array in sparse - >> and that this has resulted in substantial hassle. Also, subclassing the >> subclasses is much more problematic that subclassing ndarray - MaskedArray >> being a particularly annoying example!) >> > > You're completely right I think. We have had problems with subclasses for > a long time, but that is due to mainly np.matrix being badly behaved, which > then led to code everywhere using asarray, which then led to lots of issues > with other subclasses. This basically meant subclasses were problematic, > and hence most numpy devs would like to not see more subclasses. > Perhaps this history is in fact useful to mention? To learn from mistakes, it must be possible to know about them! 
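As a side note to the `asarray` dynamic described in the exchange above, the subclass-stripping behaviour is easy to demonstrate; a minimal sketch using MaskedArray as the subclass:

```python
import numpy as np

# A MaskedArray is an ndarray subclass that carries a mask with its data.
m = np.ma.MaskedArray([1.0, 2.0, 3.0], mask=[False, True, False])

# np.asarray coerces to the base ndarray class, silently dropping the
# mask -- the pattern that made many library functions hostile to
# subclasses.
plain = np.asarray(m)
print(type(plain).__name__)    # ndarray
print(hasattr(plain, "mask"))  # False

# np.asanyarray passes ndarray subclasses through unchanged.
kept = np.asanyarray(m)
print(isinstance(kept, np.ma.MaskedArray))  # True
```

Functions written with `asanyarray` (or with no coercion at all) are the ones that have tended to keep working for Quantity and MaskedArray.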
> >> The subclassing section also notes that subclassing has been discouraged >> for a long time. Is that so? Over time, I've certainly had comments from >> Nathaniel and some others in discussions of PRs that go in that direction, >> which perhaps reflected some internal consensus I wasn't aware of, >> > > I think yes there is some vague but not written down mostly-consensus, due > to the dynamic with asarray above. > > >> but the documentation does not seem to discourage it (check, e.g., the >> subclassing section [1]). I also think that it may be good to keep in mind >> that until `__array_ufunc__`, there wasn't much of a choice - support for >> duck arrays was even more half-hearted (hopefully to become much better >> with `__array_function__`). >> > > True. I think long term duck arrays are the way to go, because asarray is > not going to disappear. But for now we just have to do the best we can > dealing with subclasses. > > The subclassing doc [1] really needs an update on what the practical > issues are. > > Indeed. > >> Overall, it seems to me that these days in the python eco-system >> subclassing is simply expected to work. Even within numpy there are other >> examples (e.g., ufuncs, dtypes) for which there has been quite a bit of >> discussion about the benefits subclasses would bring. >> > > I'm now thinking what to do with the subclassing section in the NEP. Best > to completely remove? I was triggered to include it by some things Stephan > said last week about subclasses being a blocker to adding new features. So > if we keep the section, it may be helpful for you and Stephan to help shape > that. > > I think even just the history you wrote above is useful. Before suggesting further specific text, might it make sense for the NEP to note that since subclassing will not go away, there is value in having at least one non-trivial, well-designed subclass in numpy? 
I think eventually MaskedArray might become that: it would be an internal check that subclasses can work with all numpy functions (there is no reason for duplication of functions in `np.ma`!). It also is an example of a container-type subclass that adds extra information to an ndarray (since that information is itself array-like, it is not necessarily a super-logical subclass, but it is there... and can thus serve as an example). A second subclass which we have not discussed, but which I think is used quite a bit (from my statistics of one...), is `np.memmap`. Useful if only for showing that a relatively quick hack can give you something quite helpful. All the best, Marten -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Mon Jul 23 13:21:05 2018 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 23 Jul 2018 11:21:05 -0600 Subject: [Numpy-discussion] NumPy 1.15.0 released. Message-ID: Hi All, On behalf of the NumPy team I'm pleased to announce the release of NumPy 1.15.0rc2. This release has an unusual number of cleanups, many deprecations of old functions, and improvements to many existing functions. A total of 438 pull requests were merged for this release; please look at the release notes for details. Some highlights are: - NumPy has switched to pytest for testing. - A new `numpy.printoptions` context manager. - Many improvements to the histogram functions. - Support for unicode field names in python 2.7. - Improved support for PyPy. - Fixes and improvements to `numpy.einsum`. The Python versions supported by this release are 2.7, 3.4-3.7. The wheels are linked with OpenBLAS v0.3.0, which should fix some of the linalg problems reported for NumPy 1.14. Wheels for this release can be downloaded from PyPI, source archives are available from Github. *Contributors* A total of 133 people contributed to this release. People with a "+" by their names contributed a patch for the first time.
* Aaron Critchley + * Aarthi + * Aarthi Agurusa + * Alex Thomas + * Alexander Belopolsky * Allan Haldane * Anas Khan + * Andras Deak * Andrey Portnoy + * Anna Chiara * Aurelien Jarno + * Baurzhan Muftakhidinov * Berend Kapelle + * Bernhard M. Wiedemann * Bjoern Thiel + * Bob Eldering * Cenny Wenner + * Charles Harris * ChloeColeongco + * Chris Billington + * Christopher + * Chun-Wei Yuan + * Claudio Freire + * Daniel Smith * Darcy Meyer + * David Abdurachmanov + * David Freese * Deepak Kumar Gouda + * Dennis Weyland + * Derrick Williams + * Dmitriy Shalyga + * Eric Cousineau + * Eric Larson * Eric Wieser * Evgeni Burovski * Frederick Lefebvre + * Gaspar Karm + * Geoffrey Irving * Gerhard Hobler + * Gerrit Holl * Guo Ci + * Hameer Abbasi + * Han Shen * Hiroyuki V. Yamazaki + * Hong Xu * Ihor Melnyk + * Jaime Fernandez * Jake VanderPlas + * James Tocknell + * Jarrod Millman * Jeff VanOss + * John Kirkham * Jonas Rauber + * Jonathan March + * Joseph Fox-Rabinovitz * Julian Taylor * Junjie Bai + * Juris Bogusevs + * Jörg Döpfert * Kenichi Maehashi + * Kevin Sheppard * Kimikazu Kato + * Kirit Thadaka + * Kritika Jalan + * Kyle Sunden + * Lakshay Garg + * Lars G + * Licht Takeuchi * Louis Potok + * Luke Zoltan Kelley * MSeifert04 + * Mads R. B. Kristensen + * Malcolm Smith + * Mark Harfouche + * Marten H. van Kerkwijk + * Marten van Kerkwijk * Matheus Vieira Portela + * Mathieu Lamarre * Mathieu Sornay + * Matthew Brett * Matthew Rocklin + * Matthias Bussonnier * Matti Picus * Michael Droettboom * Miguel Sánchez de León Peque + * Mike Toews + * Milo + * Nathaniel J.
Smith * Nelle Varoquaux * Nicholas Nadeau, P.Eng., AVS + * Nick Minkyu Lee + * Nikita + * Nikita Kartashov + * Nils Becker + * Oleg Zabluda * Orestis Floros + * Pat Gunn + * Paul van Mulbregt + * Pauli Virtanen * Pierre Chanial + * Ralf Gommers * Raunak Shah + * Robert Kern * Russell Keith-Magee + * Ryan Soklaski + * Samuel Jackson + * Sebastian Berg * Siavash Eliasi + * Simon Conseil * Simon Gibbons * Stefan Krah + * Stefan van der Walt * Stephan Hoyer * Subhendu + * Subhendu Ranjan Mishra + * Tai-Lin Wu + * Tobias Fischer + * Toshiki Kataoka + * Tyler Reddy + * Unknown + * Varun Nayyar * Victor Rodriguez + * Warren Weckesser * William D. Irons + * Zane Bradley + * cclauss + * fo40225 + * lapack_lite code generator + * lumbric + * luzpaz + * mamrehn + * tynn + * xoviat Cheers, Charles Harris -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Mon Jul 23 13:23:11 2018 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 23 Jul 2018 11:23:11 -0600 Subject: [Numpy-discussion] NumPy 1.15.0 released. In-Reply-To: References: Message-ID: On Mon, Jul 23, 2018 at 11:21 AM, Charles R Harris < charlesr.harris at gmail.com> wrote: > Hi All, > > On behalf of the NumPy team I'm pleased to announce the release of NumPy > 1.15.0rc2. > Oops, NumPy 1.15.0. Oh well ... This release has an unusual number of cleanups, many deprecations of old > functions, > and improvements to many existing functions. A total of 438 pull reguests > were merged > for this release, please look at the release notes > for details. Some > highlights are: > > - NumPy has switched to pytest for testing. > - A new `numpy.printoptions` context manager. > - Many improvements to the histogram functions. > - Support for unicode field names in python 2.7. > - Improved support for PyPy. > - Fixes and improvements to `numpy.einsum`. > > The Python versions supported by this release are 2.7, 3.4-3.7. 
The > wheels are linked with > OpenBLAS v0.3.0, which should fix some of the linalg problems reported for > NumPy 1.14. > > Wheels for this release can be downloaded from PyPI > , source archives are available > from Github . > > > *Contributors* > > A total of 133 people contributed to this release. People with a "+" by > their > names contributed a patch for the first time. > > * Aaron Critchley + > * Aarthi + > * Aarthi Agurusa + > * Alex Thomas + > * Alexander Belopolsky > * Allan Haldane > * Anas Khan + > * Andras Deak > * Andrey Portnoy + > * Anna Chiara > * Aurelien Jarno + > * Baurzhan Muftakhidinov > * Berend Kapelle + > * Bernhard M. Wiedemann > * Bjoern Thiel + > * Bob Eldering > * Cenny Wenner + > * Charles Harris > * ChloeColeongco + > * Chris Billington + > * Christopher + > * Chun-Wei Yuan + > * Claudio Freire + > * Daniel Smith > * Darcy Meyer + > * David Abdurachmanov + > * David Freese > * Deepak Kumar Gouda + > * Dennis Weyland + > * Derrick Williams + > * Dmitriy Shalyga + > * Eric Cousineau + > * Eric Larson > * Eric Wieser > * Evgeni Burovski > * Frederick Lefebvre + > * Gaspar Karm + > * Geoffrey Irving > * Gerhard Hobler + > * Gerrit Holl > * Guo Ci + > * Hameer Abbasi + > * Han Shen > * Hiroyuki V. Yamazaki + > * Hong Xu > * Ihor Melnyk + > * Jaime Fernandez > * Jake VanderPlas + > * James Tocknell + > * Jarrod Millman > * Jeff VanOss + > * John Kirkham > * Jonas Rauber + > * Jonathan March + > * Joseph Fox-Rabinovitz > * Julian Taylor > * Junjie Bai + > * Juris Bogusevs + > * J?rg D?pfert > * Kenichi Maehashi + > * Kevin Sheppard > * Kimikazu Kato + > * Kirit Thadaka + > * Kritika Jalan + > * Kyle Sunden + > * Lakshay Garg + > * Lars G + > * Licht Takeuchi > * Louis Potok + > * Luke Zoltan Kelley > * MSeifert04 + > * Mads R. B. Kristensen + > * Malcolm Smith + > * Mark Harfouche + > * Marten H. 
van Kerkwijk + > * Marten van Kerkwijk > * Matheus Vieira Portela + > * Mathieu Lamarre > * Mathieu Sornay + > * Matthew Brett > * Matthew Rocklin + > * Matthias Bussonnier > * Matti Picus > * Michael Droettboom > * Miguel S?nchez de Le?n Peque + > * Mike Toews + > * Milo + > * Nathaniel J. Smith > * Nelle Varoquaux > * Nicholas Nadeau, P.Eng., AVS + > * Nick Minkyu Lee + > * Nikita + > * Nikita Kartashov + > * Nils Becker + > * Oleg Zabluda > * Orestis Floros + > * Pat Gunn + > * Paul van Mulbregt + > * Pauli Virtanen > * Pierre Chanial + > * Ralf Gommers > * Raunak Shah + > * Robert Kern > * Russell Keith-Magee + > * Ryan Soklaski + > * Samuel Jackson + > * Sebastian Berg > * Siavash Eliasi + > * Simon Conseil > * Simon Gibbons > * Stefan Krah + > * Stefan van der Walt > * Stephan Hoyer > * Subhendu + > * Subhendu Ranjan Mishra + > * Tai-Lin Wu + > * Tobias Fischer + > * Toshiki Kataoka + > * Tyler Reddy + > * Unknown + > * Varun Nayyar > * Victor Rodriguez + > * Warren Weckesser > * William D. Irons + > * Zane Bradley + > * cclauss + > * fo40225 + > * lapack_lite code generator + > * lumbric + > * luzpaz + > * mamrehn + > * tynn + > * xoviat > > Cheers, > > Charles Harris > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben.v.root at gmail.com Mon Jul 23 13:44:14 2018 From: ben.v.root at gmail.com (Benjamin Root) Date: Mon, 23 Jul 2018 13:44:14 -0400 Subject: [Numpy-discussion] [Matplotlib-devel] mplcairo 0.1 release In-Reply-To: References: Message-ID: Congratulations to Antony for his hard work on this important backend! As far as I am concerned, the cairo backend is the future of matplotlib. Test this backend out for yourselves and help us take matplotlib to the next level in high-quality charting! Cheers! 
Ben Root On Sun, Jul 22, 2018 at 4:52 PM, Antony Lee wrote: > Dear all, > > I am pleased to announce the release of mplcairo 0.1 > > # Description > > mplcairo is a Matplotlib backend based on the well-known cairo library, > supporting output to both raster (including interactively) and vector > formats. In other words, it provides the functionality of Matplotlib's > {,qt5,gtk3,wx,tk,macos}{agg,cairo}, pdf, ps, and svg backends. > > Per Matplotlib's standard API, the backend can be selected by calling > > matplotlib.use("module://mplcairo.qt") > > or setting your MPLBACKEND environment variable to `module://mplcairo.qt` > for > Qt5, and similarly for other toolkits. > > The source tarball, and Py3.6 manylinux and Windows wheels, are available > on > PyPI (I am looking for help to generate the OSX wheels). > > See the README for more details. > > # Why a new backend? > > Compared to Matplotlib's builtin Agg and cairo backends, mplcairo presents > the > following features: > > - Improved accuracy (e.g., with marker positioning, quad meshes, and text > kerning). > - Support for a wider variety of font formats, such as otf and pfb, for > vector > (PDF, PS, SVG) backends (Matplotlib's Agg backend also supports such > fonts). > - Optional support for complex text layout (right-to-left languages, etc.) > using Raqm. **Note** that Raqm depends on Fribidi, which is licensed > under > the LGPLv2.1+. > - Support for embedding URLs in PDF (but not SVG) output (requires > cairo >= 1.15.4). > - Support for multi-page output both for PDF and PS (Matplotlib only > supports > multi-page PDF). > - Support for custom blend modes (see `examples/operators.py`). > > See the README for more details. > > # Changelog from mplcairo 0.1a1 to mplcairo 0.1 > > - Integration with libraqm now occurs via dlopen() rather than being > selected > at compile-time. > - Various rendering and performance improvements.
> - On Travis, we now run Matplotlib's test suite with mplcairo patching the > default Agg renderer. > > Enjoy, > > Antony Lee > > _______________________________________________ > Matplotlib-devel mailing list > Matplotlib-devel at python.org > https://mail.python.org/mailman/listinfo/matplotlib-devel > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From shoyer at gmail.com Mon Jul 23 13:46:57 2018 From: shoyer at gmail.com (Stephan Hoyer) Date: Mon, 23 Jul 2018 10:46:57 -0700 Subject: [Numpy-discussion] backwards compatibility and deprecation policy NEP In-Reply-To: References: Message-ID: On Sat, Jul 21, 2018 at 6:40 PM Marten van Kerkwijk < m.h.vankerkwijk at gmail.com> wrote: > But I think the subclassing section is somewhat misleading in suggesting > `ndarray` is not well designed to be subclassed. At least, for neither my > work on Quantity nor that on MaskedArray, I've found that the design of > `ndarray` itself was a problem. Instead, it was the functions that were, as > most were not written with subclassing or duck typing in mind, but rather > with the assumption that all input should be an array, and that somehow it > is useful to pass anything users pass in through `asarray`. With then > layers on top to avoid this in specific circumstances... But perhaps this > is what you meant? > I can't speak for Ralf, but yes, this is part of what I had in mind. I don't think you can separate "core" objects/methods from functions that act on them. Either the entire system is designed to handle subclassing through some well-defined interface or it is not. If you don't design a system for subclassing but allow it anyways (and it's impossible to prohibit programmatically in Python), then you can easily end up with very fragile systems that are difficult to modify or extend. As Ralf noted in the NEP, "Some of them change the behavior of ndarray methods, making it difficult to write code that accepts array duck-types."
These changes end up having implications for apparently unrelated functions (e.g., np.median needing to call np.mean internally to handle units properly). I don't think anyone really wants that sort of behavior or lock-in in NumPy itself, but of course that is the price we pay for not having well-defined interfaces :). Hopefully NEP-18 will change that, and eventually we will be able to remove hacks from NumPy that we added only because there weren't any better alternatives available. For the NEP itself, I would not mention "A future change in NumPy to not support subclassing," because it's not as if subclassing is suddenly not going to work as of a certain NumPy release. Certain types of subclasses (e.g., those that only add extra methods and/or metadata and do not modify any existing functionality) have never been a problem and will be fine to support indefinitely. Rather, we might state that "At some point in the future, the NumPy development team may no longer be interested in maintaining workarounds for specific subclasses, because other interfaces for extending NumPy are believed to be more maintainable/preferred." Overall, it seems to me that these days in the python eco-system > subclassing is simply expected to work. > I don't think this is true. You can use subclassing on builtin types like dict, but just because you can do it doesn't mean it's a good idea. If you change built-in methods to work in different ways other things will break in unexpected ways (or simply not change, also in unexpected ways). Probably the only really safe way to subclass a dictionary is to define the __missing__() method and not change any other aspects of the public interface directly. -------------- next part -------------- An HTML attachment was scrubbed...
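[Editor's note: the dict example Stephan mentions is easy to make concrete. A minimal sketch — `DefaultingDict` is an illustrative name, not a real library class.]

```python
class DefaultingDict(dict):
    # Only __missing__ is defined; the rest of the dict interface is
    # left untouched, so the built-in behaviour stays consistent.
    def __missing__(self, key):
        # dict.__getitem__ calls this only when the key is absent.
        return f"<missing: {key!r}>"

d = DefaultingDict(a=1)
assert d["a"] == 1
assert d["b"] == "<missing: 'b'>"
assert "b" not in d          # nothing was inserted by the lookup
assert d.get("b") is None    # get() bypasses __missing__ entirely
```

The last line also shows the limits of even this sanctioned hook: `dict.get()` never consults `__missing__`, so the subclass still does not behave uniformly across the whole interface — which is Stephan's broader point.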
URL: From shoyer at gmail.com Mon Jul 23 13:51:46 2018 From: shoyer at gmail.com (Stephan Hoyer) Date: Mon, 23 Jul 2018 10:51:46 -0700 Subject: [Numpy-discussion] backwards compatibility and deprecation policy NEP In-Reply-To: References: Message-ID: On Sun, Jul 22, 2018 at 12:28 PM Ralf Gommers wrote: > In particular, I'd like every deprecation message to say "this deprecated > feature will be removed by release X.Y.0". At the moment we don't do that, > so if users see a message they don't know if a removal will happen next > year, in the far future (2.0), or never. The major version thing is quite > useful to signal our intent. Doesn't mean we need to exhaustively discuss > when to do a 2.0 though, I agree that that's not a very useful discussion > right now. > I think a more realistic policy would be to say, "This feature was deprecated by release X.Y and may be removed as early as release X.Z." In general we have been conservative in terms of actually finalizing deprecations in NumPy, which I think is warranted given irregularity of our release cycle. It's hard to know exactly which release is going to come out a year or 18 months from when a deprecation starts. > Happy to remove this though if people don't like it. Other opinions? > I would also lean towards removing mention of any major version changes for NumPy. -------------- next part -------------- An HTML attachment was scrubbed... URL: From shoyer at gmail.com Mon Jul 23 14:46:46 2018 From: shoyer at gmail.com (Stephan Hoyer) Date: Mon, 23 Jul 2018 11:46:46 -0700 Subject: [Numpy-discussion] backwards compatibility and deprecation policy NEP In-Reply-To: References: Message-ID: On Sun, Jul 22, 2018 at 12:28 PM Ralf Gommers wrote: > Then, I think it's not unreasonable to draw a couple of hard lines. For > example, removing complete submodules like linalg or random has ended up on > some draft brainstorm roadmap list because someone (no idea who) put it > there after a single meeting. 
Clearly the cost-benefit of that is such that > there's no point even discussing that more, so I'd rather draw that line > here than every time someone opens an issue. > I'm happy to give the broader context here. This came up in the NumPy sprint in Berkeley back in May of this year. The existence of all of these submodules in NumPy is mostly a historical artifact, due to the previously poor state of Python packaging. Our thinking was that perhaps this could be revisited in this age of conda and manylinux wheels. This isn't to say that it would actually be a good idea to remove any of these submodules today. Separate modules bring both benefits and downsides. Benefits: - It can be easier to maintain projects separately rather than inside NumPy, e.g., bug fixes do not need to be tied to NumPy releases. - Separate modules could reduce the maintenance burden for NumPy itself, because energy gets focused on core features. - For projects for which a rewrite would be warranted (e.g., numpy.ma and scipy.sparse), it is *much* easier to innovate outside of NumPy/SciPy. - Packaging. As mentioned above, this is no longer as beneficial as it once was. Downsides: - It's harder to find separate packages than NumPy modules. - If the maintainers and maintenance processes are very similar, then separate projects can add unnecessary overhead. - Changing from bundled to separate packages imposes a significant cost upon their users (e.g., due to changed import paths). Coming back to the NEP: The impact on downstream libraries and users would be very large, and > maintenance of these modules would still have to happen. Therefore this is simply > not a good idea; removing these submodules should not happen even for a > new major version of NumPy. > I'm afraid I disagree pretty strongly here. There should absolutely be a high bar for removing submodules, but we should not rule out the possibility entirely.
It is certainly true that modules need to be maintained for them to remain usable, but I particularly object to the idea that this should be forced upon NumPy maintainers. Open source projects need to be maintained by their users, and if their users cannot devote energy to maintain them then the open source project deserves to die. This is just as true for NumPy submodules as for external packages. NumPy itself only has an obligation to maintain submodules if they are actively needed by the NumPy project and valued by active NumPy contributors. Otherwise, they should be maintained by users who care about them -- whether that means inside or outside NumPy. It serves nobody well to insist on NumPy developers maintaining projects that they don't use or care about. I would suggest the following criteria for considering removing a NumPy submodule: 1. It cannot be relied upon by other portions of NumPy. 2. Either (a) the submodule imposes a significant maintenance burden upon the rest of NumPy that is not balanced by the level of dedicated contributions, or (b) much better alternatives exist outside of NumPy. Preferably all three criteria should be satisfied. -------------- next part -------------- An HTML attachment was scrubbed... URL: From m.h.vankerkwijk at gmail.com Mon Jul 23 16:43:40 2018 From: m.h.vankerkwijk at gmail.com (Marten van Kerkwijk) Date: Mon, 23 Jul 2018 16:43:40 -0400 Subject: [Numpy-discussion] backwards compatibility and deprecation policy NEP In-Reply-To: References: Message-ID: > > > Rather, we might state that "At some point in the future, the NumPy > development team may no longer be interested in maintaining workarounds for > specific subclasses, because other interfaces for extending NumPy are > believed to be more maintainable/preferred." > > That sentence I think covers it very well. Subclasses can and should be expected to evolve along with numpy, and if that means some numpy-version dependent parts, so be it (we have those now...).
It is just that one should not remove functionality without providing the better alternative! -- Marten -------------- next part -------------- An HTML attachment was scrubbed... URL: From shoyer at gmail.com Mon Jul 23 17:32:02 2018 From: shoyer at gmail.com (Stephan Hoyer) Date: Mon, 23 Jul 2018 14:32:02 -0700 Subject: [Numpy-discussion] backwards compatibility and deprecation policy NEP In-Reply-To: References: Message-ID: On Mon, Jul 23, 2018 at 1:45 PM Marten van Kerkwijk < m.h.vankerkwijk at gmail.com> wrote: > That sentence I think covers it very well. Subclasses can and should be > expected to evolve along with numpy, and if that means some numpy-version > dependent parts, so be it (we have those now...). > My hope would be that NumPy gets out of the business of officially providing interfaces like subclassing that are this hard to maintain. In general, we try to hold ourselves to a higher standard of stable code, and this sets up unfortunate conflicts between the needs of different NumPy users. It is just that one should not remove functionality without providing the > better alternative! > Totally agreed! -------------- next part -------------- An HTML attachment was scrubbed... URL: From matti.picus at gmail.com Mon Jul 23 18:24:43 2018 From: matti.picus at gmail.com (Matti Picus) Date: Mon, 23 Jul 2018 15:24:43 -0700 Subject: [Numpy-discussion] Commit privileges In-Reply-To: References: Message-ID: On 19/07/18 14:08, Charles R Harris wrote: > Hi All, > > The NumPy Steering Council has been looking at commit rights for the > NumPy developers hired at BIDS. We would like them to be able to label > PRs, close issues, and merge simple fixes; doing that requires commit > privileges. OTOH, it is also the case that people paid to work on > NumPy don't automatically receive commit privileges. 
So it is a bit a > quandary and we don't seem to have an official document codifying the > giving of commit privileges, and the Github privileges are rather > coarse grained, pretty much all or nothing for a given repository. > > The situation has also caused us to rethink commit privileges in > general, perhaps we are being too selective. So there is some interest > in offering commit privileges more freely, with the understanding that > they are needed for many of the mundane tasks required to maintain > NumPy, but that new people should be conservative in their exercise of > the privilege. Given the reality of the Github environment, such a > system needs be honor based, but would allow more people an > opportunity to participate at a deeper level. > > So in line with that, we are going to give both of the BIDS workers > commit privileges, but also extend the option of commit privileges for > issue triage and other such things to the community at large. If you > have contributed to NumPy and are interested in having commit rights, > please respond to this post, but bear in mind that this is an > experiment and that things might change if the system is abused. > > Chuck > I rephrased this mail as an addition to the governance document, see https://github.com/numpy/numpy/pull/11609 Matti From ralf.gommers at gmail.com Mon Jul 23 21:21:42 2018 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Mon, 23 Jul 2018 18:21:42 -0700 Subject: [Numpy-discussion] Commit privileges In-Reply-To: References: Message-ID: On Mon, Jul 23, 2018 at 3:24 PM, Matti Picus wrote: > On 19/07/18 14:08, Charles R Harris wrote: > >> Hi All, >> >> The NumPy Steering Council has been looking at commit rights for the >> NumPy developers hired at BIDS. We would like them to be able to label PRs, >> close issues, and merge simple fixes; doing that requires commit >> privileges. OTOH, it is also the case that people paid to work on NumPy >> don't automatically receive commit privileges. 
So it is a bit a quandary >> and we don't seem to have an official document codifying the giving of >> commit privileges, and the Github privileges are rather coarse grained, >> pretty much all or nothing for a given repository. >> >> The situation has also caused us to rethink commit privileges in general, >> perhaps we are being too selective. So there is some interest in offering >> commit privileges more freely, with the understanding that they are needed >> for many of the mundane tasks required to maintain NumPy, but that new >> people should be conservative in their exercise of the privilege. Given the >> reality of the Github environment, such a system needs be honor based, but >> would allow more people an opportunity to participate at a deeper level. >> >> So in line with that, we are going to give both of the BIDS workers >> commit privileges, but also extend the option of commit privileges for >> issue triage and other such things to the community at large. If you have >> contributed to NumPy and are interested in having commit rights, please >> respond to this post, but bear in mind that this is an experiment and that >> things might change if the system is abused. >> >> Chuck >> >> > I rephrased this mail as an addition to the governance document, see > https://github.com/numpy/numpy/pull/11609 Thanks Matti, good idea. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ralf.gommers at gmail.com Tue Jul 24 14:34:39 2018 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Tue, 24 Jul 2018 11:34:39 -0700 Subject: [Numpy-discussion] backwards compatibility and deprecation policy NEP In-Reply-To: References: Message-ID: On Mon, Jul 23, 2018 at 1:43 PM, Marten van Kerkwijk < m.h.vankerkwijk at gmail.com> wrote: > >> Rather, we might state that "At some point in the future, the NumPy >> development team may no longer interested in maintaining workarounds for >> specific subclasses, because other interfaces for extending NumPy are >> believed to be more maintainable/preferred." >> >> That sentence I think covers it very well. Subclasses can and should be > expected to evolve along with numpy, and if that means some numpy-version > dependent parts, so be it (we have those now...). It is just that one > should not remove functionality without providing the better alternative! > Thanks for the input both, that makes sense. I'll try and rewrite the section along these lines. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefanv at berkeley.edu Tue Jul 24 15:04:49 2018 From: stefanv at berkeley.edu (Stefan van der Walt) Date: Tue, 24 Jul 2018 12:04:49 -0700 Subject: [Numpy-discussion] Roadmap proposal, v3 Message-ID: <20180724190449.pxud4eebr7juktzq@carbo> Hi everyone, Please take a look at the latest roadmap proposal: https://github.com/numpy/numpy/pull/11611 This is a living document, so can easily be modified in the future, but we'd like to get in place a document that corresponds fairly closely with current community priorities. 
Best regards, Stéfan From gael.varoquaux at normalesup.org Tue Jul 24 16:14:39 2018 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Tue, 24 Jul 2018 22:14:39 +0200 Subject: [Numpy-discussion] Roadmap proposal, v3 In-Reply-To: <20180724190449.pxud4eebr7juktzq@carbo> References: <20180724190449.pxud4eebr7juktzq@carbo> Message-ID: <20180724201439.sfe3eeeoca4v4bmr@phare.normalesup.org> Looks great! Thank you for doing this! Gaël On Tue, Jul 24, 2018 at 12:04:49PM -0700, Stefan van der Walt wrote: > Hi everyone, > Please take a look at the latest roadmap proposal: > https://github.com/numpy/numpy/pull/11611 > This is a living document, so can easily be modified in the future, but > we'd like to get in place a document that corresponds fairly closely > with current community priorities. > Best regards, > Stéfan > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion -- Gael Varoquaux Senior Researcher, INRIA Parietal NeuroSpin/CEA Saclay, Bat 145, 91191 Gif-sur-Yvette France Phone: ++ 33-1-69-08-79-68 http://gael-varoquaux.info http://twitter.com/GaelVaroquaux From shoyer at gmail.com Tue Jul 24 16:44:07 2018 From: shoyer at gmail.com (Stephan Hoyer) Date: Tue, 24 Jul 2018 13:44:07 -0700 Subject: [Numpy-discussion] Roadmap proposal, v3 In-Reply-To: <20180724201439.sfe3eeeoca4v4bmr@phare.normalesup.org> References: <20180724190449.pxud4eebr7juktzq@carbo> <20180724201439.sfe3eeeoca4v4bmr@phare.normalesup.org> Message-ID: Stefan and Ralf -- thanks for finishing this up! I'm quite happy with the state of this. On Tue, Jul 24, 2018 at 1:17 PM Gael Varoquaux < gael.varoquaux at normalesup.org> wrote: > Looks great! Thank you for doing this!
> > Ga?l > > On Tue, Jul 24, 2018 at 12:04:49PM -0700, Stefan van der Walt wrote: > > Hi everyone, > > > Please take a look at the latest roadmap proposal: > > > https://github.com/numpy/numpy/pull/11611 > > > This is a living document, so can easily be modified in the future, but > > we'd like to get in place a document that corresponds fairly closely > > with current community priorities. > > > Best regards, > > St?fan > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > > -- > Gael Varoquaux > Senior Researcher, INRIA Parietal > NeuroSpin/CEA Saclay , Bat 145, 91191 Gif-sur-Yvette France > Phone: ++ 33-1-69-08-79-68 <+33%201%2069%2008%2079%2068> > http://gael-varoquaux.info http://twitter.com/GaelVaroquaux > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From einstein.edison at gmail.com Tue Jul 24 18:07:38 2018 From: einstein.edison at gmail.com (Hameer Abbasi) Date: Tue, 24 Jul 2018 15:07:38 -0700 Subject: [Numpy-discussion] backwards compatibility and deprecation policy NEP In-Reply-To: References: Message-ID: On 23. Jul 2018 at 19:46, Stephan Hoyer wrote: On Sat, Jul 21, 2018 at 6:40 PM Marten van Kerkwijk < m.h.vankerkwijk at gmail.com> wrote: > But I think the subclassing section is somewhat misleading in suggesting > `ndarray` is not well designed to be subclassed. At least, for neither my > work on Quantity nor that on MaskedArray, I've found that the design of > `ndarray` itself was a problem. 
Instead, it was the functions that were, as > most were not written with subclassing or duck typing in mind, but rather > with the assumption that all input should be an array, and that somehow it > is useful to pass anything users pass in through `asarray`. With then > layers on top to avoid this in specific circumstances... But perhaps this > is what you meant? > I can't speak for Ralf, but yes, this is part of what I had in mind. I don't think you can separate "core" objects/methods from functions that act on them. Either the entire system is designed to handle subclassing through some well-defined interface or it is not. If you don't design a system for subclassing but allow it anyways (and it's impossible to prohibit programmatically in Python This isn't really true. Metaprogramming to the rescue I guess. https://stackoverflow.com/questions/16564198/pythons-equivalent-of-nets-sealed-class#16564232 Best regards, Hameer Abbasi Sent from Astro for Mac ), then you can easily end up with very fragile systems that are difficult to modify or extend. As Ralf noted in the NEP, "Some of them change the behavior of ndarray methods, making it difficult to write code that accepts array duck-types."
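[Editor's note: the Stack Overflow approach Hameer links to boils down to a metaclass that vetoes subclassing at class-definition time. A minimal sketch with illustrative names; since Python 3.6 a similar effect can also be had with `__init_subclass__`.]

```python
class Final(type):
    # Any class whose bases include an instance of this metaclass
    # is rejected while the class statement is being executed.
    def __new__(mcs, name, bases, namespace):
        for base in bases:
            if isinstance(base, Final):
                raise TypeError(f"{base.__name__} cannot be subclassed")
        return super().__new__(mcs, name, bases, namespace)

class Sealed(metaclass=Final):
    pass

try:
    class Broken(Sealed):  # raises at class-creation time
        pass
except TypeError as exc:
    failed = str(exc)

assert failed == "Sealed cannot be subclassed"
```

So "sealing" a class is possible, though it is rarely done in practice; the trade-off being debated here is whether an API should invite subclassing at all, not whether Python can technically forbid it.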
Certain types of subclasses (e.g., those that only add extra methods and/or metadata and do not modify any existing functionality) have never been a problem and will be fine to support indefinitely. Rather, we might state that "At some point in the future, the NumPy development team may no longer be interested in maintaining workarounds for specific subclasses, because other interfaces for extending NumPy are believed to be more maintainable/preferred." Overall, it seems to me that these days in the python eco-system > subclassing is simply expected to work. > I don't think this is true. You can use subclassing on builtin types like dict, but just because you can do it doesn't mean it's a good idea. If you change built-in methods to work in different ways other things will break in unexpected ways (or simply not change, also in unexpected ways). Probably the only really safe way to subclass a dictionary is to define the __missing__() method and not change any other aspects of the public interface directly. _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at python.org https://mail.python.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From einstein.edison at gmail.com Tue Jul 24 18:21:17 2018 From: einstein.edison at gmail.com (Hameer Abbasi) Date: Tue, 24 Jul 2018 15:21:17 -0700 Subject: [Numpy-discussion] Roadmap proposal, v3 In-Reply-To: <20180724190449.pxud4eebr7juktzq@carbo> References: <20180724190449.pxud4eebr7juktzq@carbo> Message-ID: Hey Stefan/Ralf/Stephan, This looks nice, generally what the community agrees on. Great work, and thanks for putting this together. Best regards, Hameer Abbasi Sent from Astro for Mac On 24. 
Jul 2018 at 21:04, Stefan van der Walt wrote: Hi everyone, Please take a look at the latest roadmap proposal: https://github.com/numpy/numpy/pull/11611 This is a living document, so can easily be modified in the future, but we'd like to get in place a document that corresponds fairly closely with current community priorities. Best regards, Stéfan _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at python.org https://mail.python.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Tue Jul 24 20:34:15 2018 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Tue, 24 Jul 2018 17:34:15 -0700 Subject: [Numpy-discussion] backwards compatibility and deprecation policy NEP In-Reply-To: References: Message-ID: On Mon, Jul 23, 2018 at 11:46 AM, Stephan Hoyer wrote: > On Sun, Jul 22, 2018 at 12:28 PM Ralf Gommers > wrote: > >> Then, I think it's not unreasonable to draw a couple of hard lines. For >> example, removing complete submodules like linalg or random has ended up on >> some draft brainstorm roadmap list because someone (no idea who) put it >> there after a single meeting. Clearly the cost-benefit of that is such that >> there's no point even discussing that more, so I'd rather draw that line >> here than every time someone opens an issue. >> > > I'm happy to give the broader context here. This came up in the NumPy > sprint in Berkeley back in May of this year. > > The existence of all of these submodules in NumPy is mostly a historical > artifact, due to the previously poor state of Python packaging. > That's true. Our thinking was that perhaps this could be revisited in this age of conda > and manylinux wheels. > > This isn't to say that it would actually be a good idea to remove any of > these submodules today. Separate modules bring both benefits and downsides. 
> > Benefits: > - It can be easier to maintain projects separately rather than inside > NumPy, e.g., bug fixes do not need to be tied to NumPy releases. > - Separate modules could reduce the maintenance burden for NumPy itself, > because energy gets focused on core features. > That's certainly not a given though. Those things still need to be maintained, and splitting up packages increases overhead for e.g. doing releases. It's quite unclear if splitting would increase the developer pool. - For projects for which a rewrite would be warranted (e.g., numpy.ma and > scipy.sparse), it is *much* easier to innovate outside of NumPy/SciPy. > Agreed. That can happen and is already happening though (e.g. https://github.com/pydata/sparse). It doesn't have much to do with removing existing submodules. - Packaging. As mentioned above, this is no longer as beneficial as it once > was. > True, no longer as beneficial - that's not really a benefit though, packaging just works fine either way. > Downsides: > - It's harder to find separate packages than NumPy modules. > - If the maintainers and maintenance processes are very similar, then > separate projects can add unnecessary overhead. > - Changing from bundled to separate packages imposes a significant cost > upon their users (e.g., due to changed import paths). > > Coming back to the NEP: > > The impact on downstream libraries and users would be very large, and >> > maintenance of these modules would still have to happen. Therefore this >> is simply not a good idea; removing these submodules should not happen >> even for a new major version of NumPy. >> > > I'm afraid I disagree pretty strongly here. There should absolutely be a > high bar for removing submodules, but we should not rule out the > possibility entirely. 
My thinking here is: given that we're not even willing to remove MaskedArray (NEP 17), for which the benefits of removing are a lot higher and the user base smaller, we are certainly not going to be removing random or linalg or distutils in the foreseeable future. So we may as well say that. Otherwise we have the discussions regularly (we actually just did have one for numpy.testing in gh-11457), which is just a waste of energy. > It is certainly true that modules need to be maintained for them to > remain usable, but I particularly object to the idea that this should be > forced upon NumPy maintainers. > Nothing is "forced on you" as a NumPy maintainer - we are all individuals who do things voluntarily (okay, almost all - we have some funding now) and can choose to not spend any time on certain parts of NumPy. MaskedArray languished for quite a while before Marten and Eric spent a lot of time in improving it and closing lots of issues related to it. That can happen. Open source projects need to be maintained by their users, and if their > users cannot devote energy to maintain them then the open source project > deserves to die. This is just as true for NumPy submodules as for external > packages. > > NumPy itself only has an obligation to maintain submodules if they are > actively needed by the NumPy project and valued by active NumPy > contributors. > This is a very developer-centric view. We have lots of users and also lots of no-longer-active contributors. The needs, interests and previous work put into NumPy of those groups of people matter. Otherwise, they should be maintained by users who care about them -- > whether that means inside or outside NumPy. It serves nobody well to insist > on NumPy developers maintaining projects that they don't use or care about. > > I would like to suggest the following criteria for considering removing a > NumPy submodule: > 1. It cannot be relied upon by other portions of NumPy. > 2. 
Either > (a) the submodule imposes a significant maintenance burden upon the rest > of NumPy that is not balanced by the level of dedicated contributions, or > (b) much better alternatives exist outside of NumPy > To quote Nathaniel: "the rest of our policy is all about measuring disruption based on effects on users". That's absent from your criteria. Why I would like to keep this point in is: - the discussion does come up, see draft brainstorm roadmap list and gh-11457. - the outcome of such discussions is in practice 100% clear. - I would like to avoid having drawn out discussions each time (this eats up a lot of energy for me), and I *really* would like to avoid saying "I don't have time to discuss, but this is just not going to happen" or "consider it vetoed". - Hence: just write it down, so we can refer to it. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Tue Jul 24 23:07:58 2018 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 24 Jul 2018 20:07:58 -0700 Subject: [Numpy-discussion] backwards compatibility and deprecation policy NEP In-Reply-To: References: Message-ID: On Sun, Jul 22, 2018 at 12:28 PM, Ralf Gommers wrote: > On Sat, Jul 21, 2018 at 7:15 PM, Nathaniel Smith wrote: >> Speaking of examples: I hate to say this because in general I think >> using examples is a great idea. But... I think you should delete most >> of these examples. The problem is scope creep: the goal for this NEP >> (IMO) should be to lay out the principles we use to think about these >> issues in general, but right now it comes across as trying to lay down >> a final resolution on lots of specific issues (including several where >> there are ongoing conversations). It ends up like trying to squish >> multiple NEPs into one, which makes it hard to discuss, and also >> distracts from the core purpose. > > > I'm not sure this is the best thing to do. 
I can remove a couple, but aiming > to be "totally uncontroversial" is almost impossible given the topic of the > NEP. Of course the NEP itself will have some things to discuss -- but I think the discussion will be more productive if we can stay focused on the core part of the NEP, which is the general principles we use to evaluate each specific situation as it comes up. Look at how much of the discussion so far has gotten derailed onto topics like subclassing, submodules, etc. > The diag view example is important I think, it's the second most > discussed backwards compatibility issue next to histogram. I'm happy to > remove the statement on what should happen with it going forward though. It's the most discussed issue because it was the test case where we developed all these policies in the first place :-). I'm not sure it's particularly interesting aside from that, and that specific history ("let's come up with a transition plan for this feature that no-one actually cares about, b/c no-one cares about it so it's a good thing to use as a test case") is unlikely to be repeated. > Then, I think it's not unreasonable to draw a couple of hard lines. For > example, removing complete submodules like linalg or random has ended up on > some draft brainstorm roadmap list because someone (no idea who) put it > there after a single meeting. Clearly the cost-benefit of that is such that > there's no point even discussing that more, so I'd rather draw that line > here than every time someone opens an issue. Very recent example: > https://github.com/numpy/numpy/issues/11457 (remove auto-import of > numpy.testing). I can see an argument for splitting random and linalg into their own modules, which numpy depends on and imports so that existing code doesn't break. E.g. this might let people install an old version of random if they needed to reproduce some old results, or help us merge numpy and scipy's linalg modules into a single package. 
I agree though that making 'np.linalg' start raising AttributeError is a total non-starter. >> Regarding the major version number thing: ugh do we really want to >> talk about this more. I'd probably leave it out of the NEP entirely. >> If it stays in, I think it needs a clearer description of what counts >> as a "major" change. > > > I think it has value to keep it, and that it's not really possible to come > up with a very clear description of "major". In particular, I'd like every > deprecation message to say "this deprecated feature will be removed by > release X.Y.0". At the moment we don't do that, so if users see a message > they don't know if a removal will happen next year, in the far future (2.0), > or never. The major version thing is quite useful to signal our intent. > Doesn't mean we need to exhaustively discuss when to do a 2.0 though, I > agree that that's not a very useful discussion right now. The problem is that "2.0" means a lot of different things to different people, not just "some future date to be determined", so using it that way will confuse people. Also, it's hard to predict when a deprecation will actually happen... it's very common that we adjust the schedule as we go (e.g. when we try to remove it and then discover it breaks everyone so we have to put it back for a while). I feel like it would be better to do this based on time -- like say "this will be removed " or something, and then it might take longer but not shorter? Re: version numbers, I actually think numpy should consider switching to calver [1]. We'd be giving up on being able to do a "2.0", but that's kind of a good thing -- if a change is too big to handle through our normal deprecation cycle, then it's probably too big to handle period. And "numpy 2018.3" gives you more information than our current scheme -- for example you could see at a glance that numpy 2012.1 is super out-of-date, and we could tell people that numpy 2019.1 will drop python 2 support. 
...But that's a whole other discussion, and we shouldn't get derailed onto it here in this NEP's thread :-). [1] https://calver.org/ -n -- Nathaniel J. Smith -- https://vorpus.org From ralf.gommers at gmail.com Wed Jul 25 00:20:27 2018 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Tue, 24 Jul 2018 21:20:27 -0700 Subject: [Numpy-discussion] backwards compatibility and deprecation policy NEP In-Reply-To: References: Message-ID: On Tue, Jul 24, 2018 at 8:07 PM, Nathaniel Smith wrote: > On Sun, Jul 22, 2018 at 12:28 PM, Ralf Gommers > wrote: > > On Sat, Jul 21, 2018 at 7:15 PM, Nathaniel Smith wrote: > >> Speaking of examples: I hate to say this because in general I think > >> using examples is a great idea. But... I think you should delete most > >> of these examples. The problem is scope creep: the goal for this NEP > >> (IMO) should be to lay out the principles we use to think about these > >> issues in general, but right now it comes across as trying to lay down > >> a final resolution on lots of specific issues (including several where > >> there are ongoing conversations). It ends up like trying to squish > >> multiple NEPs into one, which makes it hard to discuss, and also > >> distracts from the core purpose. > > > > > > I'm not sure this is the best thing to do. I can remove a couple, but > aiming > > to be "totally uncontroversial" is almost impossible given the topic of > the > > NEP. > > Of course the NEP itself will have some things to discuss -- but I > think the discussion will be more productive if we can stay focused on > the core part of the NEP, which is the general principles we use to > evaluate each specific situation as it comes up. Look at how much of > the discussion so far has gotten derailed onto topics like > subclassing, submodules, etc. > The subclassing discussion was actually illuminating and useful. Maybe it does deserve its own write-up somewhere though. Happy to remove that too. 
I'd then like to put it somewhere else - in the docs, another NEP, ...? The submodules one I'd really like to keep. > > The diag view example is important I think, it's the second most > > discussed backwards compatibility issue next to histogram. I'm happy to > > remove the statement on what should happen with it going forward though. > > It's the most discussed issue because it was the test case where we > developed all these policies in the first place :-). Pretty sure that's not true, we had policies long before that, plus it was not advertised as a test case for backwards compat (it's just an improvement that someone wanted to implement). But well, I don't care enough about this particular one to argue about it - I'll remove it. I'm not sure it's > particularly interesting aside from that, and that specific history > ("let's come up with a transition plan for this feature that no-one > actually cares about, b/c no-one cares about it so it's a good thing > to use as a test case") is unlikely to be repeated. > > > Then, I think it's not unreasonable to draw a couple of hard lines. For > > example, removing complete submodules like linalg or random has ended up > on > > some draft brainstorm roadmap list because someone (no idea who) put it > > there after a single meeting. Clearly the cost-benefit of that is such > that > > there's no point even discussing that more, so I'd rather draw that line > > here than every time someone opens an issue. Very recent example: > > https://github.com/numpy/numpy/issues/11457 (remove auto-import of > > numpy.testing). > > I can see an argument for splitting random and linalg into their own > modules, which numpy depends on and imports so that existing code > doesn't break. Me too, that could happen. But that's unrelated to backwards compatibility. E.g. this might let people install an old version of > random if they needed to reproduce some old results, or help us merge > numpy and scipy's linalg modules into a single package. 
I agree though > that making 'np.linalg' start raising AttributeError is a total > non-starter. > It is, hence why I say above that I'd like to keep that example. > >> Regarding the major version number thing: ugh do we really want to > >> talk about this more. I'd probably leave it out of the NEP entirely. > >> If it stays in, I think it needs a clearer description of what counts > >> as a "major" change. > > > > > > I think it has value to keep it, and that it's not really possible to > come > > up with a very clear description of "major". In particular, I'd like > every > > deprecation message to say "this deprecated feature will be removed by > > release X.Y.0". At the moment we don't do that, so if users see a message > > they don't know if a removal will happen next year, in the far future > (2.0), > > or never. The major version thing is quite useful to signal our intent. > > Doesn't mean we need to exhaustively discuss when to do a 2.0 though, I > > agree that that's not a very useful discussion right now. > > The problem is that "2.0" means a lot of different things to different > people, not just "some future date to be determined", so using it that > way will confuse people. Also, it's hard to predict when a deprecation > will actually happen... it's very common that we adjust the schedule > as we go (e.g. when we try to remove it and then discover it breaks > everyone so we have to put it back for a while). > > I feel like it would be better to do this based on time This does make sense to me. -- like say > "this will be removed " or something, and then it > might take longer but not shorter? > You can't practically do "today", should be . But yes that is useful, the point is to give a clear indication and it's then easy for the user to figure out when the earliest date is that the removal could happen. Given that this is clear and avoids the version number discussion, I'm happy to go with that and remove the major/minor version text. 
Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From ndbecker2 at gmail.com Wed Jul 25 07:44:19 2018 From: ndbecker2 at gmail.com (Neal Becker) Date: Wed, 25 Jul 2018 07:44:19 -0400 Subject: [Numpy-discussion] update to numpy-1.15.0 gives new warnings from scipy Message-ID: After update to numpy-1.15.0, I'm getting warnings from scipy. These probably come from my code using convolve. Does scipy need updating? /home/nbecker/.local/lib/python3.6/site-packages/scipy/fftpack/basic.py:160: FutureWarning: Using a non-tuple sequence for multidimensional indexing is deprecated; use `arr[tuple(seq)]` instead of `arr[seq]`. In the future this will be interpreted as an array index, `arr[np.array(seq)]`, which will result either in an error or a different result. z[index] = x /home/nbecker/.local/lib/python3.6/site-packages/scipy/signal/signaltools.py:491: FutureWarning: Using a non-tuple sequence for multidimensional indexing is deprecated; use `arr[tuple(seq)]` instead of `arr[seq]`. In the future this will be interpreted as an array index, `arr[np.array(seq)]`, which will result either in an error or a different result. return x[reverse].conj() /home/nbecker/.local/lib/python3.6/site-packages/scipy/signal/signaltools.py:251: FutureWarning: Using a non-tuple sequence for multidimensional indexing is deprecated; use `arr[tuple(seq)]` instead of `arr[seq]`. In the future this will be interpreted as an array index, `arr[np.array(seq)]`, which will result either in an error or a different result. in1zpadded[sc] = in1.copy() -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ralf.gommers at gmail.com Wed Jul 25 18:13:30 2018 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Wed, 25 Jul 2018 22:13:30 +0000 Subject: [Numpy-discussion] update to numpy-1.15.0 gives new warnings from scipy In-Reply-To: References: Message-ID: On Wed, Jul 25, 2018 at 11:44 AM, Neal Becker wrote: > After update to numpy-1.15.0, I'm getting warnings from scipy. > These probably come from my code using convolve. Does scipy need updating? > Should already be fixed in scipy master. Cheers, Ralf > /home/nbecker/.local/lib/python3.6/site-packages/scipy/fftpack/basic.py:160: > FutureWarning: Using a non-tuple sequence for multidimensional indexing is > deprecated; use `arr[tuple(seq)]` instead of `arr[seq]`. In the future this > will be interpreted as an array index, `arr[np.array(seq)]`, which will > result either in an error or a different result. > z[index] = x > /home/nbecker/.local/lib/python3.6/site-packages/scipy/signal/signaltools.py:491: > FutureWarning: Using a non-tuple sequence for multidimensional indexing is > deprecated; use `arr[tuple(seq)]` instead of `arr[seq]`. In the future this > will be interpreted as an array index, `arr[np.array(seq)]`, which will > result either in an error or a different result. > return x[reverse].conj() > /home/nbecker/.local/lib/python3.6/site-packages/scipy/signal/signaltools.py:251: > FutureWarning: Using a non-tuple sequence for multidimensional indexing is > deprecated; use `arr[tuple(seq)]` instead of `arr[seq]`. In the future this > will be interpreted as an array index, `arr[np.array(seq)]`, which will > result either in an error or a different result. > in1zpadded[sc] = in1.copy() > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... 
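[Editor's note: for readers hitting the same FutureWarning in their own code, the rewrite that the warning text asks for can be sketched as follows. This is a hypothetical minimal example, not the actual scipy code; the names `z`, `index`, and `rows` are made up to mirror the snippets in the warnings.]

```python
import numpy as np

z = np.zeros((3, 4))

# Case 1: the list is meant as a multidimensional index, one entry per
# axis.  NumPy 1.15 warns on `z[index]`; converting the list to a tuple,
# as the warning text suggests, makes the intent unambiguous:
index = [slice(0, 2), slice(1, 3)]
z[tuple(index)] = 1           # equivalent to z[0:2, 1:3] = 1

# Case 2: the list is meant as an array of row indices (fancy indexing).
# Making that explicit with np.asarray also silences the ambiguity:
rows = [0, 2]
z[np.asarray(rows)] = 2       # rows 0 and 2 set to 2

print(int(z.sum()))           # 8 + 2 + 8 -> prints 18
```

Either spelling keeps working unchanged on later NumPy releases, where plain-list indexing is interpreted as an array index.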
URL: From chris.barker at noaa.gov Wed Jul 25 11:13:26 2018 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Wed, 25 Jul 2018 08:13:26 -0700 Subject: [Numpy-discussion] Roadmap proposal, v3 In-Reply-To: References: <20180724190449.pxud4eebr7juktzq@carbo> Message-ID: Great work, thanks! I see this: "- Fixed width encoded strings (utf8, latin1, ...)" And a bit of discussion in the PR. But I think there are key questions to be addressed in handling strings in numpy. I know we've had a lot of discussion about it on this list over the years -- but is there a place that has captured that discussion / and or we can start a new one? For example, I am very wary of putting a non-fixed width encoding (e.g. UTF-8) in a fixed width field. But this PR is not the place to discuss that. -CHB Sent from my iPhone On Jul 24, 2018, at 3:21 PM, Hameer Abbasi wrote: Hey Stefan/Ralf/Stephan, This looks nice, generally what the community agrees on. Great work, and thanks for putting this together. Best regards, Hameer Abbasi Sent from Astro for Mac On 24. Jul 2018 at 21:04, Stefan van der Walt wrote: Hi everyone, Please take a look at the latest roadmap proposal: https://github.com/numpy/numpy/pull/11611 This is a living document, so can easily be modified in the future, but we'd like to get in place a document that corresponds fairly closely with current community priorities. Best regards, Stéfan _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at python.org https://mail.python.org/mailman/listinfo/numpy-discussion _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at python.org https://mail.python.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... 
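[Editor's note: Chris's wariness about variable-width encodings in fixed-width fields has a concrete failure mode: naive byte truncation can split a multi-byte code point. A small plain-Python illustration (not a proposed NumPy API):]

```python
s = "résumé"                  # 6 characters, but 8 bytes in UTF-8
data = s.encode("utf-8")
assert len(data) == 8         # each 'é' takes 2 bytes

field = data[:7]              # force the string into a 7-byte "field"
try:
    field.decode("utf-8")
except UnicodeDecodeError:
    # The cut landed in the middle of the trailing 'é'.
    print("truncation split a multi-byte code point")
```

Any fixed-width UTF-8 dtype would need a policy for exactly this case (truncate at a code-point boundary, raise, or pad), which is part of why the design needs its own discussion.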
URL: From sebastian at sipsolutions.net Wed Jul 25 18:10:49 2018 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Thu, 26 Jul 2018 00:10:49 +0200 Subject: [Numpy-discussion] update to numpy-1.15.0 gives new warnings from scipy In-Reply-To: References: Message-ID: <6187d31033d1720364332d58b2e60a73dc754ffe.camel@sipsolutions.net> On Wed, 2018-07-25 at 07:44 -0400, Neal Becker wrote: > After update to numpy-1.15.0, I'm getting warnings from scipy. > These probably come from my code using convolve. Does scipy need > updating? > Probably yes, I am a bit surprised we did not notice it before if it is in scipy (or maybe scipy is already fixed?). This may be one of the more controversial new warnings, so let's see if it comes up more. Right now it seems not to affect much, I guess. If the correct thing to do is to use the list as an array, then the easiest solution may be to do: z[index,] = x # note the additional `,` # or alternatively of course: z[np.asarray(index)] = x Otherwise, you will have to use `tuple(index)` to make sure numpy interprets it as a multi-dimensional index. The problem here, that this solves, is that if you have `z[some_list]` currently numpy basically guesses whether you want a multi-dimensional index or not. - Sebastian > /home/nbecker/.local/lib/python3.6/site- > packages/scipy/fftpack/basic.py:160: FutureWarning: Using a non-tuple > sequence for multidimensional indexing is deprecated; use > `arr[tuple(seq)]` instead of `arr[seq]`. 
In the future this will be > interpreted as an array index, `arr[np.array(seq)]`, which will > result either in an error or a different result. > return x[reverse].conj() > /home/nbecker/.local/lib/python3.6/site- > packages/scipy/signal/signaltools.py:251: FutureWarning: Using a non- > tuple sequence for multidimensional indexing is deprecated; use > `arr[tuple(seq)]` instead of `arr[seq]`. In the future this will be > interpreted as an array index, `arr[np.array(seq)]`, which will > result either in an error or a different result. > in1zpadded[sc] = in1.copy() > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: This is a digitally signed message part URL: From stefanv at berkeley.edu Wed Jul 25 18:54:57 2018 From: stefanv at berkeley.edu (Stefan van der Walt) Date: Wed, 25 Jul 2018 15:54:57 -0700 Subject: [Numpy-discussion] Roadmap proposal, v3 In-Reply-To: References: <20180724190449.pxud4eebr7juktzq@carbo> Message-ID: <20180725225457.eccwrgyslc2rdugv@carbo> Hi Chris, On Wed, 25 Jul 2018 08:13:26 -0700, Chris Barker - NOAA Federal wrote: > For example, I am very wary of putting a non-fixed width encoding (e.g. > Utf-8) in a fixed width field. > > But this PR is not the place to discuss that. Since you've followed that discussion closely, can you push a commit to my PR with text that more accurately captures the situation? Thanks! 
Stéfan From shoyer at gmail.com Wed Jul 25 19:57:18 2018 From: shoyer at gmail.com (Stephan Hoyer) Date: Wed, 25 Jul 2018 16:57:18 -0700 Subject: [Numpy-discussion] Roadmap proposal, v3 In-Reply-To: <20180725225457.eccwrgyslc2rdugv@carbo> References: <20180724190449.pxud4eebr7juktzq@carbo> <20180725225457.eccwrgyslc2rdugv@carbo> Message-ID: On Wed, Jul 25, 2018 at 4:02 PM Stefan van der Walt wrote: > Hi Chris, > > On Wed, 25 Jul 2018 08:13:26 -0700, Chris Barker - NOAA Federal wrote: > > For example, I am very wary of putting a non-fixed width encoding (e.g. > > UTF-8) in a fixed width field. > > > > But this PR is not the place to discuss that. > > Since you've followed that discussion closely, can you push a commit to > my PR with text that more accurately captures the situation? > > Thanks! > Stéfan > Hi Chris, Obviously the string dtype proposal in the roadmap is only a sketch at this point :). I do think that options listed currently (encoded strings with fixed-width storage and variable length strings) cover the breadth of proposals from last time. We may not want to implement all of them in NumPy, but I think we can agree that there are use cases for all of them, even if only as external dtypes? Would it help to add "and/or" after the first bullet? Mostly I would like to have "improve string dtypes" in some form on the roadmap, and thought it would be helpful to list the concrete proposals that I recall. The actual design choices (especially if we propose to change any default behavior) will certainly need a NEP. Best, Stephan -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From chris.barker at noaa.gov Wed Jul 25 20:41:19 2018 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Wed, 25 Jul 2018 20:41:19 -0400 Subject: [Numpy-discussion] Roadmap proposal, v3 In-Reply-To: References: <20180724190449.pxud4eebr7juktzq@carbo> <20180725225457.eccwrgyslc2rdugv@carbo> Message-ID: > Obviously the string dtype proposal in the roadmap is only a sketch at this point :). > > I do think that options listed currently (encoded strings with fixed-width storage and variable length strings) cover the breadth of proposals from last time. We may not want to implement all of them in NumPy, but I think we can agree that there are use cases for all of them, even if only as external dtypes? Maybe :-) -- but I totally agree that more complete handling of strings should be on the roadmap. > Would it help to add "and/or" after the first bullet? Mostly I would like to have "improve string dtypes" in some form on the roadmap, and thought it would be helpful to list the concrete proposals that I recall. Sure, something like and/or that makes it clear that the details are yet to be determined would be great. > The actual design choices (especially if we propose to change any default behavior) will certainly need a NEP. Then that will be the place to hash out the details -- perfect. I just got a little concerned that a not-well-vetted solution was getting nailed down in the roadmap. -CHB From ralf.gommers at gmail.com Thu Jul 26 00:19:12 2018 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Thu, 26 Jul 2018 04:19:12 +0000 Subject: [Numpy-discussion] Roadmap proposal, v3 In-Reply-To: References: <20180724190449.pxud4eebr7juktzq@carbo> <20180725225457.eccwrgyslc2rdugv@carbo> Message-ID: On Thu, Jul 26, 2018 at 12:41 AM, Chris Barker - NOAA Federal < chris.barker at noaa.gov> wrote: > > Obviously the string dtype proposal in the roadmap is only a sketch at > this point :). 
> > > > I do think that options listed currently (encoded strings with > fixed-width storage and variable length strings) cover the breadth of > proposals from last time. We may not want to implement all of them in > NumPy, but I think we can agree that there are use cases for all of them, even > if only as external dtypes? > > Maybe :-) -- but I totally agree that more complete handling of strings > should be on the roadmap. > > > Would it help to add "and/or" after the first bullet? Mostly I would > like to have "improve string dtypes" in some form on the > roadmap, and thought it would be helpful to list the concrete proposals > that I recall. > > Sure, something like and/or that makes it clear that the details are > yet to be determined would be great. > > > The actual design choices (especially if we propose to change any > default behavior) will certainly need a NEP. > +1 the roadmap just contains topics/directions of interest. It's not the place for any technical decisions. Related note: we are now using the "wish list" issue label for anything that is too small to put on a roadmap or write a NEP for. Right now there's a lot of random stuff in that label though ([1]), so I think we have to clean that up. Examples of good wish list items that are on there now: - Document API generation with setup.py, genapi.py, etc. (gh-9203) - Feature request: signal broadcasting is OK over core dimension (gh-8811) - Multidimensional fftfreq/rfftfreq (gh-9094) I plan to go through and remove the label from issues that don't fit in the next couple of days. Cheers, Ralf [1] https://github.com/numpy/numpy/issues?q=is%3Aopen+is%3Aissue+label%3A%2223+-+Wish+List%22 -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From shoyer at gmail.com Fri Jul 27 14:02:06 2018 From: shoyer at gmail.com (Stephan Hoyer) Date: Fri, 27 Jul 2018 11:02:06 -0700 Subject: [Numpy-discussion] backwards compatibility and deprecation policy NEP In-Reply-To: References: Message-ID: On Tue, Jul 24, 2018 at 5:38 PM Ralf Gommers wrote: > This is a very developer-centric view. We have lots of users and also lots > of no-longer-active contributors. The needs, interests and previous work > put into NumPy of those groups of people matter. > Yes, I suppose it is :). I tend to view NumPy's developers (interpreted somewhat broadly, including those who contribute to the project in other ways) as the ultimate representatives of NumPy's user base. > I would like to suggest the following criteria for considering removing a >> NumPy submodule: >> > 1. It cannot be relied upon by other portions of NumPy. >> 2. Either >> (a) the submodule imposes a significant maintenance burden upon the rest >> of NumPy that is not balanced by the level of dedicated contributions, or >> (b) much better alternatives exist outside of NumPy >> > > To quote Nathaniel: "the rest of our policy is all about measuring > disruption based on effects on users". That's absent from your criteria. > Yes, "Can be achieved with minimum disruption for users" would be appropriate to add as another top-level criterion. Why I would like to keep this point in is: > - the discussion does come up, see draft brainstorm roadmap list and > gh-11457. > - the outcome of such discussions is in practice 100% clear. > - I would like to avoid having drawn-out discussions each time (this eats > up a lot of energy for me), and I *really* would like to avoid saying "I > don't have time to discuss, but this is just not going to happen" or > "consider it vetoed". > - Hence: just write it down, so we can refer to it. > I would rather we just say that the bar for deprecating or removing *any* functionality in NumPy is extremely high. 
np.matrix is probably the best example in recent times: - np.matrix is officially discouraged (which we prefer even to deprecation) - we *anticipate* deprecating it as soon as there's a viable alternative to scipy.sparse - even then, we will be very cautious about ever removing it, with the understanding that it is widely used As for updating this section of the NEP: - We could certainly note that to date NumPy has not removed any complete submodules (is this true?), and that for these modules in particular, the cost-benefit ratio does not favor removal at this time. - Documenting the criteria we've come up with here, even though it hasn't been satisfied yet, might be helpful to demonstrate the high bar that is required. - I don't like rejecting the possibility of removing submodules entirely as "simply not a good idea". It may become a good idea in the future, if some of the underlying facts change. I would also suggest highlighting two other strategies that NumPy uses instead of deprecation/removal: - Official discouragement. Discouraging or deemphasizing in our docs is the preferred strategy for older APIs that still have well-defined behavior but that are arguably less consistent with the rest of NumPy. Examples: isin vs in1d, stack/block vs hstack/dstack/vstack. - Benign neglect. This is our preferred alternative to removing submodules. Merely being in NumPy does not automatically guarantee that a module is well maintained, nor does it imply that a submodule is the best tool for the job. That's OK, as long as the incremental maintenance burden on the rest of NumPy is not too high. -------------- next part -------------- An HTML attachment was scrubbed... 
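[Editor's note: the "official discouragement" pair mentioned above, isin vs in1d, can be illustrated concretely. `np.isin` is the newer, more general spelling of `np.in1d`; unlike `in1d`, it preserves the shape of its first argument, which is why documentation points new code at it without deprecating the old name:

```python
import numpy as np

element = np.array([[0, 2], [4, 6]])
test_elements = [1, 2, 4, 8]

# in1d always returns a flattened 1-D membership mask...
flat_mask = np.in1d(element, test_elements)
# flat_mask -> [False, True, True, False]

# ...while isin keeps the shape of `element`.
mask = np.isin(element, test_elements)
# mask -> [[False, True], [True, False]]
```

This is exactly the "discourage, don't remove" pattern the email describes: both spellings keep working.]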
URL: From matti.picus at gmail.com Fri Jul 27 14:21:43 2018 From: matti.picus at gmail.com (Matti Picus) Date: Fri, 27 Jul 2018 14:21:43 -0400 Subject: [Numpy-discussion] NEP 15, 20 implementations waiting for review Message-ID: <70bc563e-8e9a-57b4-f0e3-1fe87bf42b15@gmail.com> Two largish pull requests that implement approved NEPs are waiting for review: https://github.com/numpy/numpy/pull/11175 for expanded gufunc signatures (NEP 20) https://github.com/numpy/numpy/pull/10915 for merging multiarray and umath c-extension modules (NEP 15) I realize reviewer time is precious, and appreciate that these PRs are perhaps more complex to review than others, but it would be nice to keep the good momentum we have in the NEP process moving forward. Thanks, Matti From stefanv at berkeley.edu Fri Jul 27 18:02:58 2018 From: stefanv at berkeley.edu (Stefan van der Walt) Date: Fri, 27 Jul 2018 15:02:58 -0700 Subject: [Numpy-discussion] Adoption of a Code of Conduct Message-ID: <20180727220258.fdrtqxuk5t5xfqns@carbo> Hi everyone, A while ago, SciPy (the library) adopted its Code of Conduct: https://docs.scipy.org/doc/scipy/reference/dev/conduct/code_of_conduct.html We worked hard to make that document friendly, while at the same time stating clearly the kinds of behavior that would and would not be tolerated. I propose that we adopt the SciPy code of conduct for NumPy as well. It is a good way to signal to newcomers that this is a community that cares about how people are treated. And I think we should do anything in our power to make NumPy as attractive as possible! If we adopt this document as policy, we will need to select a Code of Conduct committee, to whom potential transgressions can be reported. The individuals doing this for SciPy may very well be happy to do the same for NumPy, but the community should decide who will best serve those roles. Let me know your thoughts. Thanks! 
Stéfan From ralf.gommers at gmail.com Fri Jul 27 18:30:20 2018 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Fri, 27 Jul 2018 22:30:20 +0000 Subject: [Numpy-discussion] Adoption of a Code of Conduct In-Reply-To: <20180727220258.fdrtqxuk5t5xfqns@carbo> References: <20180727220258.fdrtqxuk5t5xfqns@carbo> Message-ID: On Fri, Jul 27, 2018 at 10:02 PM, Stefan van der Walt wrote: > Hi everyone, > > A while ago, SciPy (the library) adopted its Code of Conduct: > https://docs.scipy.org/doc/scipy/reference/dev/conduct/code_ > of_conduct.html > > We worked hard to make that document friendly, while at the same time > stating clearly the kinds of behavior that would and would not be > tolerated. > > I propose that we adopt the SciPy code of conduct for NumPy as well. It > is a good way to signal to newcomers that this is a community that cares > about how people are treated. And I think we should do anything in our > power to make NumPy as attractive as possible! > +1 Maybe a bit of context: the SciPy code of conduct had quite a lot of discussion, and importantly in the end everyone involved in the discussion was happy with (or at least not displeased by) the final document. Hence I see it as a good document to adopt also by other projects. And here's what I wrote as the intro for that CoC discussion: As you probably know, Code of Conduct (CoC) documents are becoming more common every year for open source projects, and there are a number of good reasons to adopt a CoC: 1. It gives us the opportunity to explicitly express the values and behaviors we'd like to see in our community. 2. It is designed to make everyone feel welcome (and while I think we're a welcoming community anyway, not having a CoC may look explicitly unwelcoming to some potential contributors nowadays). 3. 
It gives us a tool to address a set of problems if and when they occur, as well as a way for anyone to report issues or behavior that is unacceptable to them (much better than having those people potentially leave the community). 4. SciPy is not yet a fiscally sponsored project of NumFOCUS, however I think we'd like to be in the near future. NumFOCUS has started to require having a CoC as a prerequisite for new projects joining it. The PSF has the same requirement for any sponsorship for events/projects that it gives. Note on (4): NumPy is a sponsored project of NumFOCUS, and I've been asked several times how it can be that NumPy is sponsored but does not have a CoC. Cheers, Ralf > If we adopt this document as policy, we will need to select a Code of > Conduct committee, to whom potential transgressions can be reported. > The individuals doing this for SciPy may very well be happy to do the > same for NumPy, but the community should decide whom will best serve > those roles. > > Let me know your thoughts. > > Thanks! > St?fan > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From m.h.vankerkwijk at gmail.com Fri Jul 27 20:03:36 2018 From: m.h.vankerkwijk at gmail.com (Marten van Kerkwijk) Date: Fri, 27 Jul 2018 20:03:36 -0400 Subject: [Numpy-discussion] Adoption of a Code of Conduct In-Reply-To: References: <20180727220258.fdrtqxuk5t5xfqns@carbo> Message-ID: My ideal version would be substantially shorter, maybe just quote the golden rule, but I am happy with the suggestion to just adapt this text. I particularly appreciate the lack of absolutism in the text, and the acknowledgement that it is possible to have a bad day even while not distracting from the overall message. 
-- Marten On Fri, Jul 27, 2018 at 6:30 PM, Ralf Gommers wrote: > > > On Fri, Jul 27, 2018 at 10:02 PM, Stefan van der Walt < > stefanv at berkeley.edu> wrote: > >> Hi everyone, >> >> A while ago, SciPy (the library) adopted its Code of Conduct: >> https://docs.scipy.org/doc/scipy/reference/dev/conduct/code_ >> of_conduct.html >> >> We worked hard to make that document friendly, while at the same time >> stating clearly the kinds of behavior that would and would not be >> tolerated. >> >> I propose that we adopt the SciPy code of conduct for NumPy as well. It >> is a good way to signal to newcomers that this is a community that cares >> about how people are treated. And I think we should do anything in our >> power to make NumPy as attractive as possible! >> > > +1 > > Maybe a bit of context: the SciPy code of conduct had quite a lot of > discussion, and importantly in the end everyone involved in the discussion > was happy with (or at least not displeased by) the final document. Hence I > see it as a good document to adopt also by other projects. > > And here's what I wrote as the intro for that CoC discussion: > As you probably know, Code of Conduct (CoC) documents are becoming more > common every year for open source projects, and there are a number of good > reasons to adopt a CoC: > 1. It gives us the opportunity to explicitly express the values and > behaviors we'd like to see in our community. > 2. It is designed to make everyone feel welcome (and while I think we're a > welcoming community anyway, not having a CoC may look explicitly > unwelcoming to some potential contributors nowadays). > 3. It gives us a tool to address a set of problems if and when they occur, > as well as a way for anyone to report issues or behavior that is > unacceptable to them (much better than having those people potentially > leave the community). > 4. SciPy is not yet a fiscally sponsored project of NumFOCUS, however I > think we'd like to be in the near future. 
NumFOCUS has started to require > having a CoC as a prerequisite for new projects joining it. The PSF has > the same requirement for any sponsorship for events/projects that it gives. > > Note on (4): NumPy is a sponsored project of NumFOCUS, and I've been asked > several times how it can be that NumPy is sponsored but does not have a > CoC. > > Cheers, > Ralf > > >> If we adopt this document as policy, we will need to select a Code of >> Conduct committee, to whom potential transgressions can be reported. >> The individuals doing this for SciPy may very well be happy to do the >> same for NumPy, but the community should decide who will best serve >> those roles. >> >> Let me know your thoughts. >> >> Thanks! >> Stéfan >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion >> > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Fri Jul 27 20:30:54 2018 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 27 Jul 2018 18:30:54 -0600 Subject: [Numpy-discussion] Adoption of a Code of Conduct In-Reply-To: References: <20180727220258.fdrtqxuk5t5xfqns@carbo> Message-ID: On Fri, Jul 27, 2018 at 6:03 PM, Marten van Kerkwijk < m.h.vankerkwijk at gmail.com> wrote: > My ideal version would be substantially shorter, maybe just quote the > golden rule, but I am happy with the suggestion to just adapt this text. I > particularly appreciate the lack of absolutism in the text, and the > acknowledgement that it is possible to have a bad day even while not > distracting from the overall message. > I tend to the shorter is better side as well, but need to reread what SciPy ended up with. 
Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From shoyer at gmail.com Fri Jul 27 20:31:51 2018 From: shoyer at gmail.com (Stephan Hoyer) Date: Fri, 27 Jul 2018 17:31:51 -0700 Subject: [Numpy-discussion] Adoption of a Code of Conduct In-Reply-To: References: <20180727220258.fdrtqxuk5t5xfqns@carbo> Message-ID: I would be happy to adopt the SciPy code of conduct and code of conduct committee both. On Fri, Jul 27, 2018 at 5:04 PM Marten van Kerkwijk < m.h.vankerkwijk at gmail.com> wrote: > My ideal version would be substantially shorter, maybe just quote the > golden rule, but I am happy with the suggestion to just adapt this text. I > particularly appreciate the lack of absolutism in the text, and the > acknowledgement that it is possible to have a bad day even while not > distracting from the overall message. > -- Marten > > On Fri, Jul 27, 2018 at 6:30 PM, Ralf Gommers > wrote: > >> >> >> On Fri, Jul 27, 2018 at 10:02 PM, Stefan van der Walt < >> stefanv at berkeley.edu> wrote: >> >>> Hi everyone, >>> >>> A while ago, SciPy (the library) adopted its Code of Conduct: >>> >>> https://docs.scipy.org/doc/scipy/reference/dev/conduct/code_of_conduct.html >>> >>> We worked hard to make that document friendly, while at the same time >>> stating clearly the kinds of behavior that would and would not be >>> tolerated. >>> >>> I propose that we adopt the SciPy code of conduct for NumPy as well. It >>> is a good way to signal to newcomers that this is a community that cares >>> about how people are treated. And I think we should do anything in our >>> power to make NumPy as attractive as possible! >>> >> >> +1 >> >> Maybe a bit of context: the SciPy code of conduct had quite a lot of >> discussion, and importantly in the end everyone involved in the discussion >> was happy with (or at least not displeased by) the final document. Hence I >> see it as a good document to adopt also by other projects. 
>> >> And here's what I wrote as the intro for that CoC discussion: >> As you probably know, Code of Conduct (CoC) documents are becoming more >> common every year for open source projects, and there are a number of good >> reasons to adopt a CoC: >> 1. It gives us the opportunity to explicitly express the values and >> behaviors we'd like to see in our community. >> 2. It is designed to make everyone feel welcome (and while I think we're >> a welcoming community anyway, not having a CoC may look explicitly >> unwelcoming to some potential contributors nowadays). >> 3. It gives us a tool to address a set of problems if and when they >> occur, as well as a way for anyone to report issues or behavior that is >> unacceptable to them (much better than having those people potentially >> leave the community). >> 4. SciPy is not yet a fiscally sponsored project of NumFOCUS, however I >> think we'd like to be in the near future. NumFOCUS has started to require >> having a CoC as a prerequisite for new projects joining it. The PSF has >> the same requirement for any sponsorship for events/projects that it gives. >> >> Note on (4): NumPy is a sponsored project of NumFOCUS, and I've been >> asked several times how it can be that NumPy is sponsored but does not have >> a CoC. >> >> Cheers, >> Ralf >> >> >>> If we adopt this document as policy, we will need to select a Code of >>> Conduct committee, to whom potential transgressions can be reported. >>> The individuals doing this for SciPy may very well be happy to do the >>> same for NumPy, but the community should decide whom will best serve >>> those roles. >>> >>> Let me know your thoughts. >>> >>> Thanks! 
>>> Stéfan >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at python.org >>> https://mail.python.org/mailman/listinfo/numpy-discussion >>> >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion >> >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefanv at berkeley.edu Fri Jul 27 23:48:27 2018 From: stefanv at berkeley.edu (Stefan van der Walt) Date: Fri, 27 Jul 2018 20:48:27 -0700 Subject: [Numpy-discussion] Adoption of a Code of Conduct In-Reply-To: References: <20180727220258.fdrtqxuk5t5xfqns@carbo> Message-ID: <164df0076f8.27ae.acf34a9c767d7bb498a799333be0433e@fastmail.com> On July 27, 2018 17:04:23 Marten van Kerkwijk wrote: > My ideal version would be substantially shorter, maybe just quote the > golden rule, but I am happy with the suggestion to just adapt this text. Agreed! There's some basic ground that needs to be covered, though, and the result of exploring that fully is, practically, what you see here. I'm not opposed to modifying the document in principle, although I reckon it would be somewhat easier, from both a maintenance and adoption perspective, to use the same. Best regards, Stéfan -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From charlesr.harris at gmail.com Mon Jul 30 21:25:15 2018 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 30 Jul 2018 19:25:15 -0600 Subject: [Numpy-discussion] Adoption of a Code of Conduct In-Reply-To: <20180727220258.fdrtqxuk5t5xfqns@carbo> References: <20180727220258.fdrtqxuk5t5xfqns@carbo> Message-ID: On Fri, Jul 27, 2018 at 4:02 PM, Stefan van der Walt wrote: > Hi everyone, > > A while ago, SciPy (the library) adopted its Code of Conduct: > https://docs.scipy.org/doc/scipy/reference/dev/conduct/ > code_of_conduct.html > > We worked hard to make that document friendly, while at the same time > stating clearly the kinds of behavior that would and would not be > tolerated. > > I propose that we adopt the SciPy code of conduct for NumPy as well. It > is a good way to signal to newcomers that this is a community that cares > about how people are treated. And I think we should do anything in our > power to make NumPy as attractive as possible! > > If we adopt this document as policy, we will need to select a Code of > Conduct committee, to whom potential transgressions can be reported. > The individuals doing this for SciPy may very well be happy to do the > same for NumPy, but the community should decide whom will best serve > those roles. > > Let me know your thoughts. > +1 from me. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Mon Jul 30 21:32:22 2018 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 30 Jul 2018 19:32:22 -0600 Subject: [Numpy-discussion] backwards compatibility and deprecation policy NEP In-Reply-To: References: Message-ID: On Fri, Jul 27, 2018 at 12:02 PM, Stephan Hoyer wrote: > On Tue, Jul 24, 2018 at 5:38 PM Ralf Gommers > wrote: > >> This is very developer-centric view. We have lots of users and also lots >> of no-longer-active contributors. The needs, interests and previous work >> put into NumPy of those groups of people matter. 
>> > > Yes, I suppose it is :). > > I tend to view NumPy's developers (interpreted somewhat broadly, including > those who contribute to the project in other ways) as the ultimate > representatives of NumPy's user base. > > >> I would like to suggest the following criteria for considering removing a >>> NumPy submodule: >>> >> 1. It cannot be relied upon by other portions of NumPy. >>> 2. Either >>> (a) the submodule imposes a significant maintenance burden upon the rest >>> of NumPy that is not balanced by the level of dedicated contributions, or >>> (b) much better alternatives exist outside of NumPy >>> >> >> To quote Nathaniel: "the rest of our policy is all about measuring >> disruption based on effects on users". That's absent from your criteria. >> > > Yes, "Can be achieved with minimum disruption for users" would be > appropriate to add as another top-level criterion. > > Why I would like to keep this point in is: >> - the discussion does come up, see draft brainstorm roadmap list and >> gh-11457. >> - the outcome of such discussions is in practice 100% clear. >> - I would like to avoid having drawn-out discussions each time (this eats >> up a lot of energy for me), and I *really* would like to avoid saying "I >> don't have time to discuss, but this is just not going to happen" or >> "consider it vetoed". >> - Hence: just write it down, so we can refer to it. >> > > I would rather we just say that the bar for deprecating or removing *any* > functionality in NumPy is extremely high. 
np.matrix is probably the best > example in recent times: > - np.matrix is officially discouraged (which we prefer even to deprecation) > - we *anticipate* deprecating it as soon as there's a viable alternative > to scipy.sparse > - even then, we will be very cautious about ever removing it, with the > understanding that it is widely used > > As for updating this section of the NEP: > - We could certainly note that to date NumPy has not removed any complete > submodules (is this true?), and that for these modules in particular, the > cost-benefit ratio does not favor removal at this time. > Not quite true. We removed the Numarray and Numeric compatibility modules. That broke Konrad Hinsen's package. > - Documenting the criteria we've come up with here, even though it hasn't > been satisfied yet, might be helpful to demonstrate the high bar that is > required. > - I don't like rejecting the possibility of removing submodules entirely as > "simply not a good idea". It may become a good idea in the future, if some > of the underlying facts change. > > I would also suggest highlighting two other strategies that NumPy uses > instead of deprecation/removal: > - Official discouragement. Discouraging or deemphasizing in our docs is > the preferred strategy for older APIs that still have well-defined behavior > but that are arguably less consistent with the rest of NumPy. Examples: > isin vs in1d, stack/block vs hstack/dstack/vstack. > - Benign neglect. This is our preferred alternative to removing submodules. > Merely being in NumPy does not automatically guarantee that a module is > well maintained, nor does it imply that a submodule is the best tool for > the job. That's OK, as long as the incremental maintenance burden on the > rest of NumPy is not too high. > It might help to make a cheat sheet listing discouraged functions together with their suggested replacements. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL:
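[Editor's note: the deprecation ladder discussed in this thread (discourage, then deprecate, then eventually remove) is implemented in practice with Python's standard warnings machinery. A hypothetical sketch, with all names invented for illustration:

```python
import warnings

def new_func(x):
    """The preferred spelling."""
    return 2 * x

def old_func(x):
    """Hypothetical discouraged alias kept for backwards compatibility."""
    # stacklevel=2 makes the warning point at the caller's code
    # rather than at this compatibility shim.
    warnings.warn("old_func is deprecated; use new_func instead",
                  DeprecationWarning, stacklevel=2)
    return new_func(x)
```

By default, CPython only shows DeprecationWarning in `__main__` and in test runners, which is what lets a deprecated alias linger for several release cycles before removal is even considered.]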