From gizmoguy1 at gmail.com Wed Jul 1 14:34:13 2020 From: gizmoguy1 at gmail.com (John Preston) Date: Wed, 1 Jul 2020 19:34:13 +0100 Subject: [Numpy-discussion] Proposal to add clause to license prohibiting use by oil and gas extraction companies Message-ID: Hello all, The following proposal was originally issue #16722 on GitHub but at the request of Matti Picus I am moving the discussion to this list. "NumPy is the fundamental package needed for scientific computing with Python." I am asking the NumPy project to leverage its position as a core dependency among statistical, numerical, and ML projects, in the pursuit of climate justice. It is easy to identify open-source software used by the oil and gas industry which relies on NumPy [1] [2] , and it is highly likely that NumPy is used in closed-source and in-house software at oil and gas extraction companies such as Aramco, ExxonMobil, BP, Shell, and others. I believe it is possible to use software licensing to discourage the use of NumPy and dependent packages by companies such as these, and that doing so would frustrate the ability of these companies to identify and extract new oil and gas reserves. I propose NumPy's current BSD 3-Clause license be extended to include the following conditions, in line with the Climate Strike License [3] : * The Software may not be used in applications and services that are used for or aid in the exploration, extraction, refinement, processing, or transportation of fossil fuels. * The Software may not be used by companies that rely on fossil fuel extraction as their primary means of revenue. 
This includes but is not limited to the companies listed at https://climatestrike.software/blocklist I accept that there are issues around adopting such a proposal, including that: addition of such clauses violates the Open Source Initiative's canonical Open Source Definition, which explicitly excludes licenses that limit re-use "in a specific field of endeavor", and therefore if these clauses were adopted NumPy would no longer "be open-source" by this definition; there may be collateral damage among the wider user base and project sponsorship, due to the vague nature of the first clause, and this may affect the longevity of the project and its standing within the Python, numerical, statistical, and ML communities. My intention with the opening of this issue is to promote constructive discussion of the use of software licensing -- and other measures -- for working towards climate justice -- and other forms of justice -- in the context of NumPy and other popular open-source libraries. Some people will say that NumPy is "just a tool" and that it sits independent of how it is used, but due to its utility and its influence as a major open-source library, I think it is essential that we consider the position of the Climate Strike License authors, that "as tech workers, we should take responsibility in how our software is used". Many thanks to all of the contributors who have put so much time and energy into NumPy.
[1] https://github.com/gazprom-neft/petroflow [2] https://github.com/climate-strike/analysis [3] https://github.com/climate-strike/license From sseibert at anaconda.com Wed Jul 1 15:16:12 2020 From: sseibert at anaconda.com (Stanley Seibert) Date: Wed, 1 Jul 2020 14:16:12 -0500 Subject: [Numpy-discussion] Proposal to add clause to license prohibiting use by oil and gas extraction companies In-Reply-To: References: Message-ID: I think it is important to acknowledge that, regardless of the merits of such a license change on its own, NumPy's position in the dependency stack of PyData makes a license change that restricts an existing class of users impossible without causing a lot of chaos for non-NumPy developers who may not be involved in the decision. Imagine if NumPy switched to GPL (which also conflicts with the IT policy for many companies). This would immediately trigger a fork of NumPy at the last BSD licensed release. Ignoring the trademark issues, let's assume the fork is called "numpy-nogpl". Now every single PyData package that depends on NumPy now has to decide whether to depend on numpy or numpy-nogpl. A project sticking with "numpy" effectively means they are now forcing their user base to accept GPL software (even though their own package license has not changed), so likely many will have to push out a release that depends on numpy-nogpl, at least until they can decide whether they are willing to lose some of their users. Now every PyData package (of which there are many) is trying to decide which NumPy fork to depend on, and those packages that aren't updated have a new user policy forced on them. This is not unlike the problem with NumPy releasing a backward incompatible API change and breaking downstream packages, but in this case the incompatibility is legal, rather than functional. The deeper a project is in the dependency stack, the bigger the collateral disruption will be. 
I think the only way to do something like this would be for the NumPy development community to choose to fork themselves, pick a new project name, stop working on the original NumPy, and then lobby the community to switch to their fork. On Wed, Jul 1, 2020 at 1:35 PM John Preston wrote: > Hello all, > > The following proposal was originally issue #16722 on GitHub but at > the request of Matti Picus I am moving the discussion to this list. > > > "NumPy is the fundamental package needed for scientific computing with > Python." > > I am asking the NumPy project to leverage its position as a core > dependency among statistical, numerical, and ML projects, in the > pursuit of climate justice. It is easy to identify open-source > software used by the oil and gas industry which relies on NumPy [1] > [2] , and it is highly likely that NumPy is used in closed-source and > in-house software at oil and gas extraction companies such as Aramco, > ExxonMobil, BP, Shell, and others. I believe it is possible to use > software licensing to discourage the use of NumPy and dependent > packages by companies such as these, and that doing so would frustrate > the ability of these companies to identify and extract new oil and gas > reserves. > > I propose NumPy's current BSD 3-Clause license be extended to include > the following conditions, in line with the Climate Strike License [3] > : > > * The Software may not be used in applications and services that > are used for or > aid in the exploration, extraction, refinement, processing, or > transportation > of fossil fuels. > > * The Software may not be used by companies that rely on fossil > fuel extraction > as their primary means of revenue. 
This includes but is not > limited to the > companies listed at https://climatestrike.software/blocklist > > I accept that there are issues around adopting such a proposal, including > that: > > addition of such clauses violates the Open Source Initiative's > canonical Open Source Definition, which explicitly excludes licenses > that limit re-use "in a specific field of endeavor", and therefore if > these clauses were adopted NumPy would no longer "be open-source" by > this definition; > there may be collateral damage among the wider user base and project > sponsorship, due to the vague nature of the first clause, and this may > affect the longevity of the project and its standing within the > Python, numerical, statistical, and ML communities. > > My intention with the opening of this issue is to promote constructive > discussion of the use of software licensing -- and other measures -- > for working towards climate justice -- and other forms of justice -- > in the context of NumPy and other popular open-source libraries. Some > people will say that NumPy is "just a tool" and that it sits > independent of how it is used, but due to its utility and its > influence as a major open-source library, I think it is essential that > we consider the position of the Climate Strike License authors, that > "as tech workers, we should take responsibility in how our software is > used". > > Many thanks to all of the contributors who have put so much time and > energy into NumPy. > > [1] https://github.com/gazprom-neft/petroflow > [2] https://github.com/climate-strike/analysis > [3] https://github.com/climate-strike/license > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed...
URL:
From sebastian at sipsolutions.net Wed Jul 1 15:17:17 2020
From: sebastian at sipsolutions.net (Sebastian Berg)
Date: Wed, 01 Jul 2020 14:17:17 -0500
Subject: [Numpy-discussion] Improving Complex Comparison/Ordering in Numpy
In-Reply-To:
References:
Message-ID: <453d1835fa50583ccf0f0c39252a55cd33a368cd.camel@sipsolutions.net>

On Sat, 2020-06-27 at 16:08 -0700, Rakesh Vasudevan wrote:
> Hi all,
>
> Following up on this. Created a WIP PR
> https://github.com/numpy/numpy/pull/16700
>
> As stated in the original thread, We need to start by having a sort()
> function for complex numbers that can do it based on keys, rather than
> plain arithmetic ordering.
>
> There are two broad ways to approach a sorting function that supports
> keys (Not just for complex numbers).
>

Thanks for this. I think the idea is good in general and I would be happy to discuss details here. It was discussed briefly here: https://github.com/numpy/numpy/issues/15981

This is a WIP, but it makes it easy to try out what the new API could or should look like, and to see the potential impact on existing code. The current choice is, for example:

    np.sort(arr, keys=(arr.real, arr.imag))

`keys` is like the `key` argument to Python's sorts, but unlike Python's sorts it is passed a sequence of arrays rather than a function.

An alternative spelling could be `by=...`, or maybe someone has a different API idea. There are also some implementation details to figure out, since internally it will probably do an `argsort` over all key arrays, which is much like, but a bit faster than, `np.lexsort` + `np.take_along_axis`.

I like this approach in general, since I do not think complex lexicographic sorting is "obvious", and it also allows:

    np.sort(complex_arr, keys=(abs(complex_arr),))

to get convenient (although maybe not the fastest) sorting by magnitude. That seems like a reasonable API choice.
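For readers who want to try the idea without the PR, the keyed sort described above can be approximated with functions that already exist in NumPy. This is only a sketch under stated assumptions: `sort_by_keys` is a hypothetical helper name (it is not part of NumPy or of the PR), and it assumes the first key is meant to be the primary one. `np.lexsort` treats its last key as primary, so the keys are reversed before the call.

```python
import numpy as np

def sort_by_keys(arr, keys, axis=-1):
    """Hypothetical helper: sort `arr` along `axis`, ordered by `keys`.

    keys[0] is the primary key, keys[1] breaks ties, and so on.
    """
    arr = np.asarray(arr)
    # np.lexsort uses the LAST key as the primary one, so reverse the keys.
    order = np.lexsort(tuple(keys)[::-1], axis=axis)
    return np.take_along_axis(arr, order, axis=axis)

arr = np.array([1 + 3j, 1 + 2j, 0 + 5j, 2 - 0.5j])

# Lexicographic order: real part first, imaginary part as tie-breaker.
by_real_imag = sort_by_keys(arr, (arr.real, arr.imag))

# Order by magnitude only.
by_magnitude = sort_by_keys(arr, (np.abs(arr),))
```

With a built-in `keys=` (or `by=`) argument the two calls at the end would presumably collapse into a single `np.sort` call each; the sketch only shows what such an argument would have to compute.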
So I am happy if Rakesh pushes this forward, and if anyone has doubts about the API choice in general, or about the implications for complex sorting specifically, it would be good to discuss them. The PR already allows some testing of the feature.

Cheers,

Sebastian

> 1. Add a key kwarg to sort() (function and method), to support
>    key-based sorting on arrays.
> 2. Use a new function along the lines of
>    sortby(c_arr, key=(c_arr.real, c_arr.imag))
>
> In this PR I have chosen approach 1 for the following reasons:
>
> 1. Approach 1 makes it easier to deal with both the in-place method and
>    the function. Since we can make the change in the C sort function, we
>    have minimal change in the Python layer. This, I hope, results in
>    minimal impact on current code that handles complex sorting. One
>    example within numpy is the linalg module's svd() function.
> 2. With approach 2, when we deprecate complex arithmetic ordering,
>    existing methods using sort() for complex types need to update their
>    signature.
>
> As it stands the PR does the following 3 things within the Python-C Array
> method implementation of sort:
>
> 1. Checks for complex type: if the array is of complex type, it creates a
>    default key (when no key is passed) which mimics the current
>    arithmetic ordering in NumPy.
> 2. Uses the keys to perform a Py_LexSort and generate indices.
> 3. Performs the take_along_axis via a C callback and copies the result
>    over to the original array (pseudo in-place).
>
> I am requesting feedback/help on implementing the take_along_axis logic
> at the C level in an in-place manner, and on the approach in general.
>
> This will further feed into max() and min() as well. Once we figure this
> out, the next step would be to deprecate arithmetic ordering for complex
> types (which I think will be a PR of its own).
>
> Regards
>
> Rakesh
>
> On Thu, Jun 4, 2020 at 9:21 PM Brock Mendel wrote:
>
> > Corresponding pandas issue:
> > https://github.com/pandas-dev/pandas/issues/28050
> >
> > On Thu, Jun 4, 2020 at 9:17 PM Rakesh Vasudevan
> > <rakesh.nvasudev at gmail.com> wrote:
> >
> > > Hi all,
> > >
> > > As a follow-up to gh-15981
> > > <https://github.com/numpy/numpy/issues/15981>, I would like to propose
> > > a change to bring complex dtype(s) comparison operators and related
> > > functions in line with the respective CPython implementations.
> > >
> > > The current state of complex dtype comparisons/ordering, as summarised
> > > in the issue, is as follows:
> > >
> > > # In python
> > > >>> cnum = 1 + 2j
> > > >>> cnum_two = 1 + 3j
> > >
> > > # Doing a comparison yields
> > > >>> cnum > cnum_two
> > > TypeError: '>' not supported between instances of 'complex' and 'complex'
> > >
> > > # Doing the same in NumPy scalar comparison
> > > >>> np.array(cnum) > np.array(cnum_two)
> > >
> > > # Yields
> > > False
> > >
> > > *NOTE*: only >, <, >=, <= do not work on complex numbers in Python;
> > > equality (==) does work.
> > >
> > > Similarly, sorting uses comparison operators behind the scenes to sort
> > > complex values. Again this behavior diverges from the default Python
> > > behavior.
> > >
> > > # In native python
> > > >>> clist = [cnum, cnum_two]
> > > >>> sorted(clist, key=lambda c: (c.real, c.imag))
> > > [(1+2j), (1+3j)]
> > >
> > > # In numpy
> > > >>> np.sort(clist)  # Uses the default comparison order
> > >
> > > # Yields same result
> > >
> > > # To get a cpython like sorting call we can do the following in numpy
> > > # (lexsort needs arrays and treats its last key as the primary one)
> > > >>> carr = np.array(clist)
> > > >>> np.take_along_axis(carr, np.lexsort((carr.imag, carr.real), 0), 0)
> > >
> > > This proposal aims to bring parity between default Python handling of
> > > complex numbers and the handling of complex types in NumPy.
> > >
> > > This is a two-step process:
> > >
> > > 1. Sort complex numbers in a pythonic way, accepting key arguments,
> > >    and deprecate usage of sort() on complex numbers without a key
> > >    argument.
> > >    1. Possibly extend this to max(), min(), if it makes sense to do so.
> > >    2. Since sort() is being updated for complex numbers,
> > >       searchsorted() is also a good candidate for implementing this
> > >       change.
> > > 2. Once this is done, we can deprecate the usage of comparison
> > >    operators (>, <, >=, <=) on complex dtypes.
> > >
> > > *Handling sort() for complex numbers*
> > > There are two approaches we can take for this:
> > >
> > > 1. Update the sort() method to have a "key" kwarg. When a key value is
> > >    passed, use lexsort to get the indices and sort by them. We could
> > >    support lambda function keys like Python, but that is likely to be
> > >    very slow.
> > > 2. Create a new wrapper function sort_by() (placeholder name;
> > >    requesting name suggestions/feedback) that essentially acts as
> > >    syntactic sugar for
> > >    1. np.take_along_axis(carr, np.lexsort((carr.imag, carr.real), 0), 0)
> > >
> > > 1. Improve the existing sort_complex() method with the new key search
> > >    functionality (though the change will only reflect for complex
> > >    dtypes).
> > >
> > > We could choose either method; both have pros and cons. Approach 1
> > > makes the sort function signature closer to its Python counterpart,
> > > while approach 2 provides a better distinction between the two
> > > approaches for sorting. The performance of the approach 1 function
> > > would vary, due to the key being an optional argument. Would love the
> > > community's thoughts on this.
> > >
> > > *Handling min() and max() for complex numbers*
> > >
> > > Since min and max are essentially a set of comparisons, in Python they
> > > are not allowed on complex numbers:
> > >
> > > >>> clist = [cnum, cnum_two]
> > > >>> min(clist)
> > > Traceback (most recent call last):
> > >   File "<stdin>", line 1, in <module>
> > > TypeError: '<' not supported between instances of 'complex' and 'complex'
> > >
> > > # But using the key argument again works
> > > >>> min(clist, key=lambda c: (c.real, c.imag))
> > >
> > > We could use a similar key kwarg for min() and max() in python, but the
> > > question remains how we handle the keys in this use case. The naive way
> > > would be to sort() on the keys and take the last or first element, which
> > > is likely going to be slow. Requesting suggestions on approaching this.
> > >
> > > *Comments on isclose()*
> > > Both Python and NumPy use the absolute value/magnitude for comparing
> > > whether two values are close enough. Hence I do not see this change
> > > affecting this function.
> > >
> > > Requesting feedback and suggestions on the above.
> > > > > > Thank you, > > > > > > Rakesh > > > _______________________________________________ > > > NumPy-Discussion mailing list > > > NumPy-Discussion at python.org > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: This is a digitally signed message part URL: From gyromagnetic at gmail.com Wed Jul 1 15:22:20 2020 From: gyromagnetic at gmail.com (gyro funch) Date: Wed, 1 Jul 2020 13:22:20 -0600 Subject: [Numpy-discussion] Proposal to add clause to license prohibiting use by oil and gas extraction companies In-Reply-To: References: Message-ID: Hello, I greatly respect the intention, but this is a very slippery slope. Will you exempt groups within these companies that are working on 'green' technologies (e.g., biofuels)? Will you add to the license restrictions companies who make use of oil and gas extracted by these companies (automotive, chemical/polymers, etc.)? Will you follow the chain from extraction to consumption and add the links to the license 'blacklist'? -gyro On 7/1/2020 12:34 PM, John Preston wrote: > Hello all, > > The following proposal was originally issue #16722 on GitHub but at > the request of Matti Picus I am moving the discussion to this list. > > > "NumPy is the fundamental package needed for scientific computing with Python." > > I am asking the NumPy project to leverage its position as a core > dependency among statistical, numerical, and ML projects, in the > pursuit of climate justice. 
It is easy to identify open-source > software used by the oil and gas industry which relies on NumPy [1] > [2] , and it is highly likely that NumPy is used in closed-source and > in-house software at oil and gas extraction companies such as Aramco, > ExxonMobil, BP, Shell, and others. I believe it is possible to use > software licensing to discourage the use of NumPy and dependent > packages by companies such as these, and that doing so would frustrate > the ability of these companies to identify and extract new oil and gas > reserves. > > I propose NumPy's current BSD 3-Clause license be extended to include > the following conditions, in line with the Climate Strike License [3] > : > > * The Software may not be used in applications and services that > are used for or > aid in the exploration, extraction, refinement, processing, or > transportation > of fossil fuels. > > * The Software may not be used by companies that rely on fossil > fuel extraction > as their primary means of revenue. This includes but is not > limited to the > companies listed at https://climatestrike.software/blocklist > > I accept that there are issues around adopting such a proposal, including that: > > addition of such clauses violates the Open Source Initiative's > canonical Open Source Definition, which explicitly excludes licenses > that limit re-use "in a specific field of endeavor", and therefore if > these clauses were adopted NumPy would no longer "be open-source" by > this definition; > there may be collateral damage among the wider user base and project > sponsorship, due to the vague nature of the first clause, and this may > affect the longevity of the project and its standing within the > Python, numerical, statistical, and ML communities. 
> > My intention with the opening of this issue is to promote constructive > discussion of the use of software licensing -- and other measures -- > for working towards climate justice -- and other forms of justice -- > in the context of NumPy and other popular open-source libraries. Some > people will say that NumPy is "just a tool" and that it sits > independent of how it is used, but due to its utility and its > influence as a major open-source library, I think it is essential that > we consider the position of the Climate Strike License authors, that > "as tech workers, we should take responsibility in how our software is > used". > > Many thanks to all of the contributors who have put so much time and > energy into NumPy. > > [1] https://github.com/gazprom-neft/petroflow > [2] https://github.com/climate-strike/analysis > [3] https://github.com/climate-strike/license > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- A non-text attachment was scrubbed... Name: pEpkey.asc Type: application/pgp-keys Size: 1765 bytes Desc: not available URL: From andrea.gavana at gmail.com Wed Jul 1 15:32:17 2020 From: andrea.gavana at gmail.com (Andrea Gavana) Date: Wed, 1 Jul 2020 21:32:17 +0200 Subject: [Numpy-discussion] Proposal to add clause to license prohibiting use by oil and gas extraction companies In-Reply-To: References: Message-ID: On Wed, 1 Jul 2020 at 21.23, gyro funch wrote: > Hello, > > I greatly respect the intention, but this is a very slippery slope. > > Will you exempt groups within these companies that are working on > 'green' technologies (e.g., biofuels)? > > Will you add to the license restrictions companies who make use of oil > and gas extracted by these companies (automotive, chemical/polymers, etc.)?
> > Will you follow the chain from extraction to consumption and add the > links to the license 'blacklist'? > > -gyro Thank you for injecting some sense and a few reality checks into the discussion. Andrea. > > > On 7/1/2020 12:34 PM, John Preston wrote: > > Hello all, > > > > The following proposal was originally issue #16722 on GitHub but at > > the request of Matti Picus I am moving the discussion to this list. > > > > > > "NumPy is the fundamental package needed for scientific computing with > Python." > > > > I am asking the NumPy project to leverage its position as a core > > dependency among statistical, numerical, and ML projects, in the > > pursuit of climate justice. It is easy to identify open-source > > software used by the oil and gas industry which relies on NumPy [1] > > [2] , and it is highly likely that NumPy is used in closed-source and > > in-house software at oil and gas extraction companies such as Aramco, > > ExxonMobil, BP, Shell, and others. I believe it is possible to use > > software licensing to discourage the use of NumPy and dependent > > packages by companies such as these, and that doing so would frustrate > > the ability of these companies to identify and extract new oil and gas > > reserves. > > > > I propose NumPy's current BSD 3-Clause license be extended to include > > the following conditions, in line with the Climate Strike License [3] > > : > > > > * The Software may not be used in applications and services that > > are used for or > > aid in the exploration, extraction, refinement, processing, or > > transportation > > of fossil fuels. > > > > * The Software may not be used by companies that rely on fossil > > fuel extraction > > as their primary means of revenue. 
This includes but is not > > limited to the > > companies listed at https://climatestrike.software/blocklist > > > > I accept that there are issues around adopting such a proposal, > including that: > > > > addition of such clauses violates the Open Source Initiative's > > canonical Open Source Definition, which explicitly excludes licenses > > that limit re-use "in a specific field of endeavor", and therefore if > > these clauses were adopted NumPy would no longer "be open-source" by > > this definition; > > there may be collateral damage among the wider user base and project > > sponsorship, due to the vague nature of the first clause, and this may > > affect the longevity of the project and its standing within the > > Python, numerical, statistical, and ML communities. > > > > My intention with the opening of this issue is to promote constructive > > discussion of the use of software licensing -- and other measures -- > > for working towards climate justice -- and other forms of justice -- > > in the context of NumPy and other popular open-source libraries. Some > > people will say that NumPy is "just a tool" and that it sits > > independent of how it is used, but due to its utility and its > > influence as a major open-source library, I think it is essential that > > we consider the position of the Climate Strike License authors, that > > "as tech workers, we should take responsibility in how our software is > > used". > > > > Many thanks to all of the contributors who have put so much time and > > energy into NumPy.
> >
> > [1] https://github.com/gazprom-neft/petroflow
> > [2] https://github.com/climate-strike/analysis
> > [3] https://github.com/climate-strike/license
> > _______________________________________________
> > NumPy-Discussion mailing list
> > NumPy-Discussion at python.org
> > https://mail.python.org/mailman/listinfo/numpy-discussion
> >
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
From shoyer at gmail.com Wed Jul 1 15:48:57 2020
From: shoyer at gmail.com (Stephan Hoyer)
Date: Wed, 1 Jul 2020 12:48:57 -0700
Subject: [Numpy-discussion] Improving Complex Comparison/Ordering in Numpy
In-Reply-To: <453d1835fa50583ccf0f0c39252a55cd33a368cd.camel@sipsolutions.net>
References: <453d1835fa50583ccf0f0c39252a55cd33a368cd.camel@sipsolutions.net>
Message-ID:

On Wed, Jul 1, 2020 at 12:23 PM Sebastian Berg wrote:

> This is a WIP, but allows nicely to try out how the new API
> could/should look like, and see the potential impact to code. The
> current choice is for:
>
>     np.sort(arr, keys=(arr.real, arr.imag))
>
> for example. `keys` is like the `key` argument to pythons sorts, but
> unlike python sorts is not passed a function but rather a sequence of
> arrays.
>
> Alternative spellings could be `by=...`? Or maybe someone has a
> different API idea.
>

I really like the look of np.sort(arr, by=(arr.real, arr.imag)).

- This avoids adding an extra function sortby into NumPy's API. The default behavior (by=None) would of course be to sort by the array being sorted, so it's backwards compatible.
- Calling the new argument "by" instead of "key" avoids confusion with the behavior of Python's sort/sorted (which take functions instead of sequences).
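To make the distinction in the second point concrete, here is a small sketch contrasting the two conventions using only calls that exist today. Python's `key` is a function applied to each element, whereas the array-based spelling passes whole key arrays. The variable names are illustrative only, and the `by=` argument itself is still just a proposal, so the NumPy side is written with `np.lexsort`, whose last key is the primary one.

```python
import numpy as np

values = [2 - 1j, 1 + 3j, 1 + 2j]

# Python: `key` is a function, called once per element.
py_sorted = sorted(values, key=lambda c: (c.real, c.imag))

# NumPy today: the "keys" are whole arrays, combined by lexsort.
arr = np.array(values)
order = np.lexsort((arr.imag, arr.real))  # last key (real part) is primary
np_sorted = np.take_along_axis(arr, order, axis=0)
```

Both produce the same ordering; the proposed `by=(arr.real, arr.imag)` would express the NumPy half in a single call.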
The combination of lexsort() and take_along_axis() makes it possible to achieve this behavior currently, but it is definitely less clear than a single function call. -------------- next part -------------- An HTML attachment was scrubbed... URL: From tcaswell at gmail.com Wed Jul 1 15:57:43 2020 From: tcaswell at gmail.com (Thomas Caswell) Date: Wed, 1 Jul 2020 15:57:43 -0400 Subject: [Numpy-discussion] Proposal to add clause to license prohibiting use by oil and gas extraction companies In-Reply-To: References: Message-ID: While well intentioned, this is not something that NumPy (or the rest of the scientific Python stack) should consider doing. Philosophically, I think this is something that those of us who work on Open Source have to accept: some people are going to use it for things we think make the world a better place and some people are going to use it for things we think make the world a worse place. The mechanisms that ensure we can get the tools into the hands of the first group also means we can not keep them out of the hands of the second (independent of how any given person defines the groups). Tom On Wed, Jul 1, 2020 at 3:32 PM Andrea Gavana wrote: > On Wed, 1 Jul 2020 at 21.23, gyro funch wrote: > >> Hello, >> >> I greatly respect the intention, but this is a very slippery slope. >> >> Will you exempt groups within these companies that are working on >> 'green' technologies (e.g., biofuels)? >> >> Will you add to the license restrictions companies who make use of oil >> and gas extracted by these companies (automotive, chemical/polymers, >> etc.)? >> >> Will you follow the chain from extraction to consumption and add the >> links to the license 'blacklist'? >> >> -gyro > > > Thank you for injecting some sense and a few reality checks into the > discussion. > > Andrea. 
> > > >> >> >> On 7/1/2020 12:34 PM, John Preston wrote: >> > Hello all, >> > >> > The following proposal was originally issue #16722 on GitHub but at >> > the request of Matti Picus I am moving the discussion to this list. >> > >> > >> > "NumPy is the fundamental package needed for scientific computing with >> Python." >> > >> > I am asking the NumPy project to leverage its position as a core >> > dependency among statistical, numerical, and ML projects, in the >> > pursuit of climate justice. It is easy to identify open-source >> > software used by the oil and gas industry which relies on NumPy [1] >> > [2] , and it is highly likely that NumPy is used in closed-source and >> > in-house software at oil and gas extraction companies such as Aramco, >> > ExxonMobil, BP, Shell, and others. I believe it is possible to use >> > software licensing to discourage the use of NumPy and dependent >> > packages by companies such as these, and that doing so would frustrate >> > the ability of these companies to identify and extract new oil and gas >> > reserves. >> > >> > I propose NumPy's current BSD 3-Clause license be extended to include >> > the following conditions, in line with the Climate Strike License [3] >> > : >> > >> > * The Software may not be used in applications and services that >> > are used for or >> > aid in the exploration, extraction, refinement, processing, or >> > transportation >> > of fossil fuels. >> > >> > * The Software may not be used by companies that rely on fossil >> > fuel extraction >> > as their primary means of revenue. 
This includes but is not >> > limited to the >> > companies listed at https://climatestrike.software/blocklist >> > >> > I accept that there are issues around adopting such a proposal, >> including that: >> > >> > addition of such clauses violates the Open Source Initiative's >> > canonical Open Source Definition, which explicitly excludes licenses >> > that limit re-use "in a specific field of endeavor", and therefore if >> > these clauses were adopted NumPy would no longer "be open-source" by >> > this definition; >> > there may be collateral damage among the wider user base and project >> > sponsorship, due to the vague nature of the first clause, and this may >> > affect the longevity of the project and its standing within the >> > Python, numerical, statistical, and ML communities. >> > >> > My intention with the opening of this issue is to promote constructive >> > discussion of the use of software licensing -- and other measures -- >> > for working towards climate justice -- and other forms of justice -- >> > in the context of NumPy and other popular open-source libraries. Some >> > people will say that NumPy is "just a tool" and that it sits >> > independent of how it is used, but due to its utility and its >> > influence as a major open-source library, I think it is essential that >> > we consider the position of the Climate Strike License authors, that >> > "as tech workers, we should take responsibility in how our software is >> > used". >> > >> > Many thanks to all of the contributors who have put so much time and >> > energy into NumPy.
--
Thomas Caswell
tcaswell at gmail.com

From rmay31 at gmail.com Wed Jul 1 15:59:50 2020
From: rmay31 at gmail.com (Ryan May)
Date: Wed, 1 Jul 2020 13:59:50 -0600
Subject: [Numpy-discussion] Proposal to add clause to license prohibiting use by oil and gas extraction companies

Hi,

I can respect where this comes from, especially as someone who works in atmospheric science. I'm glad people are trying to do what they can.

With that said, I am -1000 on this. In my opinion, a software license is a wholly inappropriate venue for trying to do this. At the top of the home page for the Free Software Foundation: "Free software developers guarantee everyone equal rights to their programs". What you're proposing is essentially "everyone equal rights so long as they aren't working on things I disagree with". The nobility of the cause, in my opinion, doesn't justify compromising the values behind free software.

As someone with some minuscule commits in the numpy codebase, I would not want them distributed under the modified license. As a developer of other downstream projects, I would switch to the BSD fork of the project that would inevitably materialize.

Ryan

--
Ryan May

From daniele at grinta.net Wed Jul 1 18:20:45 2020
From: daniele at grinta.net (Daniele Nicolodi)
Date: Wed, 1 Jul 2020 16:20:45 -0600
Subject: [Numpy-discussion] Proposal to add clause to license prohibiting use by oil and gas extraction companies

On 01-07-2020 12:34, John Preston wrote:
> Hello all,
>
> The following proposal was originally issue #16722 on GitHub but at
> the request of Matti Picus I am moving the discussion to this list.
[snip]

Hello John,

I don't have copyright on any of the Numpy code; however, I would like to point out a few problems I see in this proposal.

First, as you write, such a license does not qualify as Free Software as defined by the OSI or the DFSG. Adopting this license would mean that Numpy could not be included in many distributions that give their users the guarantee that the software they receive is Free Software. Debian would remove Numpy from its archive, for example. Fedora would probably do the same. Conda would need to do the same, but since Numpy is at the base of the Python scientific stack, this would effectively kill Conda. This would have immediate repercussions on companies that offer services based on Numpy and on software that depends on Numpy.

Second, the terms of the license are extremely vague, at least in a legal framework. In particular, "used for or aid in" is a very poor choice of words. It could be argued that if I use Numpy in the code that handles the orders for my pizza shop and I am asked to deliver pizzas to Exxon employees working late at night, I am aiding in "the exploration, extraction, refinement, processing, or transportation of fossil fuels". Thus, someone who holds copyright on (even a very small) part of the Numpy code could sue me and demand a free lifetime supply of pizza for me to continue to be able to use Numpy. In practice this would make everyone avoid using Numpy in their software for fear of violating these clauses.
At the same time, the wording may be too vague to be enforceable in court. In practice this would mean that most of the "good guys" (as per the Climate Strike License definition) would avoid using Numpy because they do not have the resources to fight alleged license violations in court, while the "bad guys" would continue to use it because they have a whole legal department to handle something like this.

Third, if a software project were to adopt something like the Climate Strike License, why shouldn't it adopt licenses whose terms are thought to advance some other political agenda? While the fact that the reliance on fossil fuels is the cause of climate change is widely (but not universally) acknowledged, and we may agree that the big economic interests in the enterprises related to fossil fuels are holding back alternative solutions, there are many other causes on which an agreement would be very difficult and would drag the project members into interminable discussions.

Fourth, are we sure that making fossil fuel companies and companies that rely on fossil fuels less efficient (by forbidding access to the Python scientific software stack) would make them less dangerous for the climate? Absurdly, the Climate Strike License forbids a company that wants to migrate from a business model based on fossil fuels to something more sustainable from using software with this license to evaluate and form its plans.

Free Software (in its copyleft or permissive licensing variants) has been so successful also because its promoters have not tried to leverage it for other (noble or otherwise) purposes. There has been talk in the past of incorporating other clauses in Free Software licenses to advance other causes (from "cause no harm" kinds of things to provisions to ensure the economic viability of the development) and the conclusion has always been that it is not a good idea. The reasons presented here are just some.
I am sure you can find more detailed essays from authors much more authoritative than me on this matter.

Cheers,
Dan

From sebastian at sipsolutions.net Wed Jul 1 21:44:04 2020
From: sebastian at sipsolutions.net (Sebastian Berg)
Date: Wed, 01 Jul 2020 20:44:04 -0500
Subject: [Numpy-discussion] Improving Complex Comparison/Ordering in Numpy
References: <453d1835fa50583ccf0f0c39252a55cd33a368cd.camel@sipsolutions.net>
Message-ID: <5c68a2bedaa6b050fa0d63d6252e88df5fa247cf.camel@sipsolutions.net>

On Wed, 2020-07-01 at 12:48 -0700, Stephan Hoyer wrote:
> On Wed, Jul 1, 2020 at 12:23 PM Sebastian Berg <sebastian at sipsolutions.net> wrote:
>
> > This is a WIP, but allows nicely to try out how the new API
> > could/should look like, and see the potential impact to code. The
> > current choice is for:
> >
> >     np.sort(arr, keys=(arr.real, arr.imag))
> >
> > for example. `keys` is like the `key` argument to Python's sorts, but
> > unlike Python's sorts it is not passed a function but rather a
> > sequence of arrays.
> >
> > Alternative spellings could be `by=...`? Or maybe someone has a
> > different API idea.
>
> I really like the look of np.sort(arr, by=(arr.real, arr.imag)).
> - This avoids adding an extra function sortby into NumPy's API. The
>   default behavior (by=None) would of course be to sort by the arrays
>   being sorted, so it's backwards compatible.
> - Calling the new argument "by" instead of "key" avoids confusion with
>   the behavior of Python's sort/sorted (which take functions instead of
>   sequences).

I just noticed that `DataFrame.sort_values()` uses `by=...` with a list of column names. However, I guess that is fairly compatible with this usage.

- Sebastian

> The combination of lexsort() and take_along_axis() makes it possible to
> achieve this behavior currently, but it is definitely less clear than a
> single function call.
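
For reference, the lexsort() plus take_along_axis() combination mentioned above could look roughly like the sketch below. It is only an illustration and not code from the thread: the array `a` and the names `order` and `sorted_a` are made up for the example, and the proposed `keys=`/`by=` argument is deliberately not used because it is still under discussion. One detail that is easy to miss is that np.lexsort() treats the last key in the tuple as the primary sort key.

```python
import numpy as np

# Hypothetical example data (not from the thread): two rows of complex values.
a = np.array([[3 + 1j, 1 + 2j, 3 - 1j],
              [0 + 1j, 0 - 1j, -1 + 0j]])

# np.lexsort uses the LAST key as the primary key, so passing
# (imag, real) orders by real part and breaks ties on the imaginary part.
order = np.lexsort((a.imag, a.real), axis=-1)

# Apply the per-row ordering to the original array.
sorted_a = np.take_along_axis(a, order, axis=-1)

print(sorted_a)
```

The same `order` array could be used to reorder any other array of the same shape, which is presumably the two-step pattern that a single `keys=` or `by=` argument is meant to replace.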

From jni at fastmail.com Thu Jul 2 04:58:11 2020
From: jni at fastmail.com (Juan Nunez-Iglesias)
Date: Thu, 2 Jul 2020 18:58:11 +1000
Subject: [Numpy-discussion] Proposal to add clause to license prohibiting use by oil and gas extraction companies

Hi everyone,

If you live in Australia, this has been a rough year to think about climate change. After the hottest and driest year on record, over 20% of the forest surface area of the south east was burned in the bushfires. Although I was hundreds of kilometres from the nearest fire, the air quality was rated as hazardous for several days in my city. This brought home for me two points.

One, that "4°C" is not about taking off a jumper and going to the beach more often, but actually represents a complete transformation of our planet. 4°C is what separates us from the last ice age, so we can expect our planet in 80 years to be as unrecognisable from today as today is from the ice age.

Two, that climate change is already with us, and we can't just continue to ignore the problem and enjoy whatever years of climate peace we thought we had left. Greta has it right, we are running out of time and absolutely drastic action is needed.

All this is a prelude to add my voice to everyone who has already said that messing with the NumPy license is absolutely *not* the drastic action needed, and will be counter-productive, as many have noted.

Having said this, I'm happy that the community is getting involved and getting active and coming up with creative ideas to do their part.
If someone wants to start a "Pythonistas for Climate Action" user group, I'll be the first to join. I had planned to give a lightning talk in the vein of the above at SciPy, which, and believe me that I hate to hate on my favourite conference, recently loudly thanked Shell [1] for being a platinum sponsor. (Not to mention that Enthought derives about a third of its income from fossil fuel companies.) Unfortunately and for obvious reasons I won't make it to SciPy after all, but again, I'm happy to see the community rising.

Perhaps this is derailing the discussion, but, anyone up for a "Python for Climate Action" BoF at the conference? I can probably make the late-afternoon BoFs given the time difference.

Juan.

[1]: https://twitter.com/SciPyConf/status/1276898138977193984

From ralf.gommers at gmail.com Thu Jul 2 06:12:13 2020
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Thu, 2 Jul 2020 12:12:13 +0200
Subject: [Numpy-discussion] Proposal to add clause to license prohibiting use by oil and gas extraction companies

On Thu, Jul 2, 2020 at 10:58 AM Juan Nunez-Iglesias wrote:
[snip]
> Perhaps this is derailing the discussion, but, anyone up for a "Python for
> Climate Action" BoF at the conference? I can probably make the
> late-afternoon BoFs given the time difference.

Thanks for this Juan. I don't think it's derailing the discussion. Thinking about things we *can* do that may have a positive influence on the climate emergency we're in, or the state of the world in general, is valid and probably the most productive turn this conversation can take. Changing the NumPy license isn't feasible, because of many of the pragmatic reasons already pointed out. That said, the "NumPy is just a tool" point of view is fairly naive; I think we do have a responsibility to at least think about the wider issues and possibly make some changes.

One thing I have been thinking about recently is the educational material and high level documentation we produce. When we use data sources or write tutorials, we can incorporate data and examples related to climate issues, social issues, ethics in ML/AI, etc.

Another thing to think about is: what do we, NumPy maintainers and contributors, choose to spend our time on? Not each issue/PR opened deserves our time equally - we're (almost) all volunteers after all. A PR that for example improves the classroom experience of teaching NumPy may be prioritized over a PR that helps fix an issue for <... framework that's not contributing back in any way>.

I'd be interested to hear if others have thought about this before or have any ideas.

Cheers,
Ralf

From ilhanpolat at gmail.com Thu Jul 2 06:33:46 2020
From: ilhanpolat at gmail.com (Ilhan Polat)
Date: Thu, 2 Jul 2020 12:33:46 +0200
Subject: [Numpy-discussion] Proposal to add clause to license prohibiting use by oil and gas extraction companies

Ralf basically wrote the email that I was about to send, in a much more structured way, so thanks for that.

I'd like to mention also that the oil and gas industry practically cannot be cornered by these restrictions. So even though the cause is very noble and I wholeheartedly agree, forcing this type of exclusion will only strengthen their hand in moving to other commercial software (they can even afford to acquire whole companies) and forcing their employees to use it. That would finally boomerang back as a reduction in the potential contributors to open source who would otherwise have contributed back just because they liked it (like most of us did back in the day). For example, Shell and Intel are corporate level collaborators. Should we also ban usage of MKL? Of course not, because this is not about driving Shell and others to software starvation but actually forcing them to take concrete steps towards addressing the climate crisis.

This is not to say we are desperate, quite the contrary; however, this strategy seems dire given the possible outcomes. I really would like to take the more concrete approach that Ralf outlined. Again, it is not a crusade against commercial software; I truly think all have different shoes to fill. However, making the switch from commercial software to open source as smooth as possible would actually send the message that we are not bound to conglomerate structures to achieve noble goals. Thus this would make a bolder statement about what software can manage to show. Signal processing can make fuel consumption notebooks, stats can display bicycle usage results and their impact, etc. Again, it is a mentality that we are trying to build, so it shouldn't rise to the level of annoyance, so that everyone can hop on the bandwagon.

From albuscode at gmail.com Thu Jul 2 06:38:56 2020
From: albuscode at gmail.com (Inessa Pawson)
Date: Thu, 2 Jul 2020 20:38:56 +1000
Subject: [Numpy-discussion] Python for Climate Action session at SciPy'20

Hi, Juan!
I'm still in the process of scheduling live networking sessions at SciPy'20 and would be happy to set up one on the topic of Python for Climate Action. We could host it on July 8th or 10th at 5 - 6 p.m. CDT. Would you be available to moderate it?

--
Every good wish,
*Inessa Pawson*
Albus Code
inessa at albuscode.org

From jni at fastmail.com Thu Jul 2 10:16:42 2020
From: jni at fastmail.com (Juan Nunez-Iglesias)
Date: Fri, 3 Jul 2020 00:16:42 +1000
Subject: [Numpy-discussion] Python for Climate Action session at SciPy'20

Hi Inessa,

Thanks for offering! I definitely want to participate but I would *love it* if an actual climate scientist or even *any* atmospheric scientist would step up to chair the session! I have not thought all that deeply about this problem, and mostly I feel helpless and frustrated.

If no one else volunteers though I'm happy to do it.

I much prefer the Wednesday session. Let's book it in!

Thank you all,

Juan.

From mikofski at berkeley.edu Thu Jul 2 12:09:32 2020
From: mikofski at berkeley.edu (Dr. Mark Alexander Mikofski PhD)
Date: Thu, 2 Jul 2020 09:09:32 -0700
Subject: [Numpy-discussion] Python for Climate Action session at SciPy'20

I can repost this on the pvlib (solar energy photovoltaic library) Python Google group (https://groups.google.com/forum/m/#!forum/pvlib-python). We have plenty of both climate and atmospheric scientists, and we are avid users of Numpy, SciPy, and the scientific stack. We would love to share constructive uses of Python in climate science.

From mikofski at berkeley.edu Thu Jul 2 13:18:54 2020
From: mikofski at berkeley.edu (Dr. Mark Alexander Mikofski PhD)
Date: Thu, 2 Jul 2020 10:18:54 -0700
Subject: [Numpy-discussion] Proposal to add clause to license prohibiting use by oil and gas extraction companies

Thank you everyone. This is a fascinating thread, and very interesting to see how it has transformed into constructive discussion of positive action. Along that line I think it could be useful to curate a list of Python (and OpenSci) packages using Numpy, SciPy, or any part of the Python scientific stack.

For example I know there are several clean, renewable energy packages that depend on Numpy, etc, especially in solar energy. The lead maintainer for pvlib python presented a list at our annual IEEE PV Specialists conference 2 years ago; it's on GitHub here: https://openpvtools.readthedocs.io/en/latest/ In particular pvlib python and rdtools are two widely used tools that depend on Numpy, SciPy, etc. (Disclaimer, I am one of the maintainers for pvlib.)

I think we could pull together projects from the Journal of Open Source Software (JOSS), OpenSci, and NumFOCUS. Then we could host/link/highlight examples, case studies, and projects that use Numpy to combat climate change.
There is already a separate thread started by Inessa Pawson (Re: [Numpy-discussion] Python for Climate Action session at SciPy'20) following up on Juan's idea for BoF at SciPy to highlight climate change action by the Python scientific community. We could try to encourage more climate change and clean energy projects to participate in SciPy and PyCon. Conversely, we could even promote Numpy with climate and clean energy scientists at their conferences like AMS and IEEE PVSC. For example, next year we are planning to host a Python tutorial at PVSC as part of the tutorial program that precedes the conference, but we need support with logistics. (Thanks already to Yuvi Panda for help with mybinder & TLJH.) This could be the opportunity for the Python scientific community, clean energy & climate scientists, academic, & national labs to collaborate and synergize.

I volunteer to participate in whatever capacity I can to develop this collaboration if folks think it is useful. I'm not sure how to proceed, but whatever the result I believe some momentum is forming here, so there's an opportunity to carpe diem.

Cheers,
Mark

On Thu, Jul 2, 2020, 3:35 AM Ilhan Polat wrote:

> Ralf basically wrote the email that I was about to send in a much more
> structured way so thanks for that. I'd like to mention also that oil&gas
> industry practically cannot be cornered by these restrictions. So even the
> cause is very noble and I wholeheartedly agree, forcing this type of
> exclusions only will make their hand stronger in going to other commercial
> software (they can really afford even acquiring whole companies) and
> forcing their employees using it and finally boomeranging back to the
> reduction of the potential contributors to open source who would have
> otherwise contributed back just because they liked it (like most of us did
> back in the day). For example, Shell and Intel are corporate level
> collaborators. Should we ban also usage of MKL? Of course not, because this
> is not about driving Shell and others to software starvation but actually
> forcing them to take concrete steps towards the climate crisis. This is not
> to say we are desperate, quite the contrary, however this strategy seems
> dire against the possible outcomes.
>
> I really would like to take a more concrete approach that Ralf outlined.
> Again, it is not a crusade against commercial software, I truly think all
> have different shoes to fill in. However, making the switch from commercial
> software to open source as smooth as possible would actually emit the
> message that we are not bound to conglomerate structures to achieve noble
> goals. Thus this would make a bolder statement as far as what software can
> manage to display. Signal processing can make fuel consumption notebooks,
> stats can display bicycle usage results and their impact etc. Again it is a
> mentality that we are trying to build so it shouldn't be up to the level of
> annoyance so that everyone can hop on the bandwagon.
>
> On Thu, Jul 2, 2020 at 12:14 PM Ralf Gommers
> wrote:
>
>> On Thu, Jul 2, 2020 at 10:58 AM Juan Nunez-Iglesias
>> wrote:
>>
>>> Hi everyone,
>>>
>>> If you live in Australia, this has been a rough year to think about
>>> climate change. After the hottest and driest year on record, over 20% of
>>> the forest surface area of the south east was burned in the bushfires.
>>> Although I was hundreds of kilometres from the nearest fire, the air
>>> quality was rated as hazardous for several days in my city. This brought
>>> home for me two points.
>>>
>>> One, that "4°C" is not about taking off a jumper and going to the beach
>>> more often, but actually represents a complete transformation of our
>>> planet. 4°C is what separates us from the last ice age, so we can expect
>>> our planet in 80 years to be as unrecognisable from today as today is from
>>> the ice age.
>>>
>>> Two, that climate change is already with us, and we can't just continue
>>> to ignore the problem and enjoy whatever years of climate peace we thought
>>> we had left. Greta has it right, we are running out of time and absolutely
>>> drastic action is needed.
>>>
>>> All this is a prelude to add my voice to everyone who has already said
>>> that *messing with the NumPy license is absolutely *not* the drastic
>>> action needed*, and will be counter-productive, as many have noted.
>>>
>>> Having said this, I'm happy that the community is getting involved and
>>> getting active and coming up with creative ideas to do their part. If
>>> someone wants to start a "Pythonistas for Climate Action" user group, I'll
>>> be the first to join. I had planned to give a lightning talk in the vein of
>>> the above at SciPy, which, and believe me that I hate to hate on my
>>> favourite conference, recently loudly thanked Shell [1] for being a
>>> platinum sponsor. (Not to mention that Enthought derives about a third of
>>> its income from fossil fuel companies.) Unfortunately and for obvious
>>> reasons I won't make it to SciPy after all, but again, I'm happy to see the
>>> community rising.
>>>
>>> Perhaps this is derailing the discussion, but, anyone up for a "Python
>>> for Climate Action" BoF at the conference? I can probably make the
>>> late-afternoon BoFs given the time difference.
>>>
>>
>> Thanks for this Juan. I don't think it's derailing the discussion.
>> Thinking about things we *can* do that may have a positive influence on the
>> climate emergency we're in, or the state of the world in general, are valid
>> and probably the most productive turn this conversation can take. Changing
>> the NumPy license isn't feasible, because of many of the pragmatic reasons
>> already pointed out. That said, the "NumPy is just a tool" point of view is
>> fairly naive; I think we do have a responsibility to at least think about
>> the wider issues and possibly make some changes.
>>
>> One thing I have been thinking about recently is the educational material
>> and high level documentation we produce. When we use data sources or write
>> tutorials, we can incorporate data and examples related to climate issues,
>> social issues, ethics in ML/AI, etc.
>>
>> Another thing to think about is: what do we, NumPy maintainers and
>> contributors, choose to spend our time on? Not each issue/PR opened
>> deserves our time equally - we're (almost) all volunteers after all. A PR
>> that for example improves the classroom experience of teaching NumPy may be
>> prioritized over a PR that helps fix an issue for <...
>> framework that's not contributing back in any way>.
>>
>> I'd be interested to hear if others have thought about this before or
>> have any ideas.
>>
>> Cheers,
>> Ralf

From gizmoguy1 at gmail.com Thu Jul 2 17:05:44 2020
From: gizmoguy1 at gmail.com (John Preston)
Date: Thu, 2 Jul 2020 22:05:44 +0100
Subject: [Numpy-discussion] Proposal to add clause to license prohibiting use by oil and gas extraction companies
In-Reply-To:
References:
Message-ID:

Thank you all for your input on this proposal.
I am very grateful for the time you have all spent to provide such well reasoned critiques and I'm especially glad to see that this thread has triggered discussion of other, more pragmatic, actions that the community can take in pursuit of climate justice.

I found Stanley's analogy of this proposal being a "backwards incompatible [legal] API change" particularly insightful, and Daniele has illustrated exactly the kind of chaos this would create downstream, threatening both NumPy itself (due to the packaging requirements of distributors like Debian and Fedora) and its dependent packages like Conda. Fundamentally I see this issue as a philosophical one around how we define, and the importance of, 'free software' and 'open source software'. From a principles-based perspective, I agree with Ryan that "equal rights except" is not truly equal, and that changing the definition(s) of F/OSS would damage the movement by making it much less clear what is and isn't F/OSS. On the other hand, from a pragmatic perspective, I care less about if software I use is strictly F/OSS, and more about if I can do what I want with it, and who else gets to enjoy those privileges -- I choose the word 'privilege' here specifically to highlight that the core of F/OSS is rights, which are unconditional, whereas this proposal would make those rights contingent on conditions that cannot be met by all actors, and therefore they would be privileges, not rights. So essentially, this proposal is asking "are there some uses of NumPy which are so ethically wrong, that it would be better for NumPy to be non-F/OSS in order to prevent those uses, than for NumPy to be F/OSS, and advance the F/OSS movement, while also allowing those uses?"

Answering this question requires an awareness of the broader context within which NumPy sits. Ilhan has pointed out that O&G companies cannot be coerced by more restrictive licensing of NumPy because there are commercial options that they could use instead.
Therefore, without evidence that NumPy powers a significant chunk of the analytics at major O&G companies, and that relicensing NumPy would cause significant disruption to those companies and their ability to carry out their operations, it is much more likely that any negative effect on O&G, and therefore any positive effect on the climate, would be outweighed by the harm caused to downstream packages. I agree that the first term is particularly vague, and I would love to see input from lawyers on how the software community can adopt rigid clauses in licenses for software that needs this, because although F/OSS may be "good by default" in that for most software, most of the time, releasing as F/OSS will be good, this does not mean that there is no software which requires stricter licensing. I would draw an analogy with responsible disclosure of vulnerabilities: vendors are provided with a window of time to fix a vulnerability before researchers publish their findings, on the basis that immediate publication of the findings presents more of a threat than a benefit, because malicious actors could weaponise and abuse the vulnerability before it is patched. In other words, as software creators, we have a responsibility to weigh the potential and actual uses of our software to determine if we are in a position to prevent harm by licensing or relicensing our software appropriately. I do not think the second clause is vague or unenforceable, as it should be demonstrable by any company what its primary revenue sources are and if any of those activities constitute fossil fuel extraction. However, the second clause by itself may not be sufficient to prevent use of software by O&G: Shell could form a company Shell Analytics, which carries out all analytical work for the other departments, and thus the primary business of that company would be "numerical services". Regarding other political agendas, as an advocate of responsible/ethical/political/... 
software licensing (where appropriate), I would like to see a set of lawyer-vetted clauses that could be plugged into base licenses and combined in a compatible way. While there are many other causes (arms manufacture, animal rights, ...) which are more controversial than the climate emergency which could be discussed in the context of software use and software licensing, I do not think that it would be a bad thing for these discussions to be able to take place.

Regarding companies migrating their business models, this is a great point but I have no idea how I would structure a clause that could allow this without potentially opening an unwanted loophole. I suppose any company that wished to pivot like this could incorporate a new entity which would be permitted to use the software, effectively the inverse of the Shell Analytics example.

I believe I have addressed all of the issues which have been raised. In the interest of keeping discussion here focused on NumPy and actions that this community can take towards solving the climate emergency and achieving climate justice, I would ask that any further non-NumPy-specific criticism of the proposal be directed towards the Climate Strike License repo at GitHub [1], and I am very happy to continue discussing these issues via email or in another public forum of people's preference.

I think the most critical blow to this proposal is the lack of evidence that relicensing NumPy would significantly frustrate the operation of major O&G extraction companies.

I think Juan's suggestion of a "Pythonistas for Climate Action" sounds fantastic and I'm really glad to see Inessa Pawson has started another thread for establishing a "Python for Climate Action" session at SciPy'20. It is disappointing to hear of the sponsorship of the conference by Shell. Perhaps we should call on the organisers to drop Shell as a sponsor? I am happy to draft a petition letter if anyone is willing to sign, and very open to other suggestions.
Finally, I would like to point everyone towards ClimateAction.tech [2], "a global community of tech professionals using our skills, expertise and platforms to support solutions to the climate crisis".

[1] https://github.com/climate-strike/license
[2] https://climateaction.tech/

Many thanks again for all of your responses,
John

From rakesh.nvasudev at gmail.com Thu Jul 2 17:38:16 2020
From: rakesh.nvasudev at gmail.com (Rakesh Vasudevan)
Date: Thu, 2 Jul 2020 14:38:16 -0700
Subject: [Numpy-discussion] Improving Complex Comparison/Ordering in Numpy
In-Reply-To: <5c68a2bedaa6b050fa0d63d6252e88df5fa247cf.camel@sipsolutions.net>
References: <453d1835fa50583ccf0f0c39252a55cd33a368cd.camel@sipsolutions.net> <5c68a2bedaa6b050fa0d63d6252e88df5fa247cf.camel@sipsolutions.net>
Message-ID:

I agree with the idea of setting the parameter apart from Python's; "by" sounds like a good alternative.

Rakesh

On Wed, Jul 1, 2020, 18:45 Sebastian Berg wrote:

> On Wed, 2020-07-01 at 12:48 -0700, Stephan Hoyer wrote:
> > On Wed, Jul 1, 2020 at 12:23 PM Sebastian Berg <
> > sebastian at sipsolutions.net>
> > wrote:
> >
> > > This is a WIP, but allows nicely to try out how the new API
> > > could/should look like, and see the potential impact to code. The
> > > current choice is for:
> > >
> > > np.sort(arr, keys=(arr.real, arr.imag))
> > >
> > > for example. `keys` is like the `key` argument to Python's sorts, but
> > > unlike Python's sorts it is not passed a function but rather a sequence
> > > of arrays.
> > >
> > > Alternative spellings could be `by=...`? Or maybe someone has a
> > > different API idea.
> > >
> >
> > I really like the look of np.sort(arr, by=(arr.real, arr.imag)).
> > - This avoids adding an extra function sortby into NumPy's API. The
> > default behavior (by=None) would of course be to sort by the arrays
> > being sorted, so it's backwards compatible.
> > - Calling the new argument "by" instead of "key" avoids confusion
> > with the behavior of Python's sort/sorted (which take functions
> > instead of sequences).
>
>
> I just noticed that `DataFrame.sort_values()` uses `by=...` with a list
> of column names. However, I guess that is fairly compatible with this
> usage.
>
> - Sebastian
>
>
> > The combination of lexsort() and take_along_axis() makes it possible
> > to achieve this behavior currently, but it is definitely less clear
> > than a single function call.

From albuscode at gmail.com Thu Jul 2 22:12:00 2020
From: albuscode at gmail.com (Inessa Pawson)
Date: Fri, 3 Jul 2020 12:12:00 +1000
Subject: [Numpy-discussion] Help us to make NumPy better
Message-ID:

Yes, it's a survey. But it's very important.

Having limited human and financial resources is a common challenge for open source projects. NumPy is not an exception. Please join this structured dialogue with the NumPy leadership team to better guide and prioritize decision-making about the development of NumPy as software and a community.
What started as a humble project by a dedicated core of user-developers has since transformed into a foundational component of the widely-adopted scientific Python ecosystem used by millions worldwide.

To engage non-English speaking stakeholders, the inaugural NumPy community survey is offered in 7 additional languages: Bangla, Hindi, Japanese, Mandarin, Portuguese, Russian, and Spanish.

The survey will take about 15 minutes of your time and close on July 17th. Click here to get started.

Inessa Pawson
NumPy Survey Team

From toddrjen at gmail.com Thu Jul 2 22:22:53 2020
From: toddrjen at gmail.com (Todd)
Date: Thu, 2 Jul 2020 22:22:53 -0400
Subject: [Numpy-discussion] Proposal to add clause to license prohibiting use by oil and gas extraction companies
In-Reply-To:
References:
Message-ID:

On Thu, Jul 2, 2020 at 5:06 PM John Preston wrote:

> Thank you all for your input on this proposal. I am very grateful for
> the time you have all spent to provide such well reasoned critiques
> and I'm especially glad to see that this thread has triggered
> discussion of other, more pragmatic, actions that the community can
> take in pursuit of climate justice.
>
> I found Stanley's analogy of this proposal being a "backwards
> incompatible [legal] API change" particularly insightful, and Daniele
> has illustrated exactly the kind of chaos this would create
> downstream, threatening both NumPy itself (due to the packaging
> requirements of distributors like Debian and Fedora) and its dependent
> packages like Conda. Fundamentally I see this issue as a philosophical
> one around how we define, and the importance of, 'free software' and
> 'open source software'. From a principles-based perspective, I agree
> with Ryan that "equal rights except" is not truly equal, and that
> changing the definition(s) of F/OSS would damage the movement by
> making it much less clear what is and isn't F/OSS. On the other hand,
> from a pragmatic perspective, I care less about if software I use is
> strictly F/OSS, and more about if I can do what I want with it, and
> who else gets to enjoy those privileges -- I choose the word
> 'privilege' here specifically to highlight that the core of F/OSS is
> rights, which are unconditional, whereas this proposal would make
> those rights contingent on conditions that cannot be met by all
> actors, and therefore they would be privileges, not rights. So
> essentially, this proposal is asking "are there some uses of NumPy
> which are so ethically wrong, that it would be better for NumPy to be
> non-F/OSS in order to prevent those uses, than for NumPy to be F/OSS,
> and advance the F/OSS movement, while also allowing those uses?"
>
> Answering this question requires an awareness of the broader context
> within which NumPy sits. Ilhan has pointed out that O&G companies
> cannot be coerced by more restrictive licensing of NumPy because there
> are commercial options that they could use instead. Therefore, without
> evidence that NumPy powers a significant chunk of the analytics at
> major O&G companies, and that relicensing NumPy would cause
> significant disruption to those companies and their ability to carry
> out their operations, it is much more likely that any negative effect
> on O&G, and therefore any positive effect on the climate, would be
> outweighed by the harm caused to downstream packages.
>
> I agree that the first term is particularly vague, and I would love to
> see input from lawyers on how the software community can adopt rigid
> clauses in licenses for software that needs this, because although
> F/OSS may be "good by default" in that for most software, most of the
> time, releasing as F/OSS will be good, this does not mean that there
> is no software which requires stricter licensing. I would draw an
> analogy with responsible disclosure of vulnerabilities: vendors are
> provided with a window of time to fix a vulnerability before
> researchers publish their findings, on the basis that immediate
> publication of the findings presents more of a threat than a benefit,
> because malicious actors could weaponise and abuse the vulnerability
> before it is patched. In other words, as software creators, we have a
> responsibility to weigh the potential and actual uses of our software
> to determine if we are in a position to prevent harm by licensing or
> relicensing our software appropriately.
>

I think you are still grossly underestimating just how disastrous this change would be to numpy. For one thing, this would make numpy GPL-incompatible. No GPL software would be legally able to use numpy as a dependency anymore, killing likely thousands of downstream projects. And it isn't always under the control of the project, since a lot of projects have non-Python dependencies that are GPL. For example PyFFTW could no longer exist, since FFTW3 is GPL. RPY2, which lets R and Python interact, would be effectively killed, since R and many core packages are GPL, and it is essentially useless without numpy or other packages that depend on numpy.

The end result would be an instant fork of the project at the point the license changed. There are just too many packages that use GPL to make such a change feasible. So this would end up fracturing and hurting the community without actually accomplishing your goal. This isn't a hypothetical issue, people have tried putting additional restrictions on their software like this, and it tends to kill the project.
URL: From albuscode at gmail.com Fri Jul 3 03:45:02 2020 From: albuscode at gmail.com (Inessa Pawson) Date: Fri, 3 Jul 2020 17:45:02 +1000 Subject: [Numpy-discussion] Python for Climate Action session at SciPy'20 In-Reply-To: References: Message-ID: Juan, I've scheduled the "Python for Climate Action" session for July 8th at 5 - 6 p.m. CDT. Mark, it would be great if someone from pvlib could moderate it. The networking sessions at SciPy'20 will be hosted in a format similar to unconferences. We will be using Zoom to make it easier for the moderators to oversee their sessions. Don't hesitate to contact me if you have any further questions regarding this session. ---------- Forwarded message ---------- > From: "Dr. Mark Alexander Mikofski PhD" > To: Discussion of Numerical Python > Cc: > Bcc: > Date: Thu, 2 Jul 2020 09:09:32 -0700 > Subject: Re: [Numpy-discussion] Python for Climate Action session at > SciPy'20 > I can repost this on pvlib (solar energy photovoltaic library) Python > Google group (https://groups.google.com/forum/m/#!forum/pvlib-python). We > have plenty of both climate and atmospheric scientists, and we are avid > users of Numpy, SciPy, and the scientific stack. We would love to share > constructive uses of Python in climate science. > > On Thu, Jul 2, 2020, 7:18 AM Juan Nunez-Iglesias wrote: > >> Hi Inessa, >> >> Thanks for offering! I definitely want to participate but I would *love >> it* if an actual climate scientist or even *any* atmospheric scientist >> would step up to chair the session! I have not thought all that deeply >> about this problem, and mostly I feel helpless and frustrated. >> >> If no one else volunteers though I'm happy to do it. >> >> I much prefer the Wednesday session. Let's book it in! >> >> Thank you all, >> >> Juan. >> >> On 2 Jul 2020, at 8:38 pm, Inessa Pawson wrote: >> >> Hi, Juan! 
>> I'm still in the process of scheduling live networking sessions at >> SciPy'20 and would be happy to set up one on the topic of Python for >> Climate Action. We could host it on July 8th or 10th at 5 - 6 p.m. CDT. >> Would you be available to moderate it? >> >> -- Every good wish, *Inessa Pawson* Albus Code inessa at albuscode.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From sabertooth2022 at gmail.com Fri Jul 3 09:19:21 2020 From: sabertooth2022 at gmail.com (Saber Tooth) Date: Fri, 3 Jul 2020 18:49:21 +0530 Subject: [Numpy-discussion] Google Summer of Docs Faq and Tutorial Proposed Structure . In-Reply-To: References: Message-ID: Hi Melissa, I have been trying to follow https://numpy.org/devdocs/dev/index.html to build the docs, but I am not able to complete the build despite following the guidelines exactly. Is there a Slack channel for this? I would be able to explain and resolve it better there. Until then: I ran the make html command after following the link above. This is the error when I have not switched to the required git version: installed numpy 32f514f546 != current repo git version '3cf4c28906' use "make dist" or "GITVER=32f514f546 make html ..." Makefile:93: recipe for target 'version-check' failed make: *** [version-check] Error 1 When I make it match my installed numpy version, there are quite a lot of warnings and problems such as: document isn't included in any toctree failed to import 'numpy.polynomial.set_default_printstyle': no module named numpy.polynomial.set_default_printstyle WARNING: autodoc: failed to import module 'typing' from module 'numpy'; the following exception was raised: No module named 'numpy.typing' Many modules were missing, which produced warnings. The errors may be specific to my workspace, but I have tried reinstalling everything and got the latest version of Cython too. 
It would be really helpful if you could give me some advice or a workaround for the problem. On Fri, 26 Jun 2020 at 01:39, Melissa Mendonça wrote: > Hello Mrinal, > > I don't think I have any other specific issues to comment on your > proposal. The mentors will analyze all proposals after July 9th. > > If you want to start contributing to the docs, you are welcome. You can > check the Contributor Guide (https://numpy.org/devdocs/dev/index.html), > How to contribute documentation ( > https://numpy.org/devdocs/dev/howto-docs.html) and the current > documentation. > > Thanks, > > Melissa > > On Thu, Jun 25, 2020 at 2:27 PM Saber Tooth > wrote: > >> Hello Melissa, >> I have altered and built upon my previous proposal; if you could take a >> few minutes to read it, it would be great. I have tried to explain the >> sections we had discussed in the meeting on 22 June. >> >> Moreover, as per your feedback, I would like to start contributing to the >> docs and start building the docs in my Linux environment. >> >> Best, >> Mrinal Tyagi >> >> >> On Wed, 24 Jun 2020 at 2:36 PM, Saber Tooth >> wrote: >> >>> Hello Melissa and all the Mentors, >>> >>> In the last meeting of NumPy Docs held on June 22nd, I had proposed a >>> Tutorial and FAQs structure, where it was discussed how we should include >>> them in the new structured Documentation. >>> Now after a lot of Googling I have been able to provide a >>> structured solution. >>> It is very important for us to include the FAQs section, but why? >>> If we see how many views some questions on Stack Overflow >>> https://stackoverflow.com/questions/tagged/numpy?tab=frequent&page=1&pagesize=15 >>> have got, the answer is obvious: the 1st page itself has garnered some 4-5 *million* >>> views and there are really many questions. It means if we are able to >>> answer some of these questions, not necessarily the exact questions, it >>> would be really good for the Docs. 
>>> >>> I have tried to break the FAQs page into sections so that everything is >>> not crammed onto one page and we have good docs as per SEO guidelines. >>> I have structured it based on best practices for the FAQs section: >>> https://www.shivarweb.com/9665/faq-page-best-practices/ . >>> >>> I have improved further upon my previous proposal, please have a look. >>> Eagerly waiting @Melissa Mendonça @Ralf Gommers >>> for your feedback on the same. I would be >>> obliged if you have some ideas which could be incorporated with the same or >>> built upon these. >>> >>> >>> >>> Link to Proposal : >>> https://docs.google.com/document/d/1q-BYO-GqlBIsMizCzjbANORz6OOaC-6oEzuE1_HqJl4/edit?usp=sharing >>> >>> >>> Link to FAQs Proposed structure : >>> https://docs.google.com/document/d/1q-BYO-GqlBIsMizCzjbANORz6OOaC-6oEzuE1_HqJl4/edit#bookmark=id.2vkitb6djmq0 >>> >>> Link to Tutorials Proposed structure : >>> https://docs.google.com/document/d/1q-BYO-GqlBIsMizCzjbANORz6OOaC-6oEzuE1_HqJl4/edit#bookmark=id.pwfvp3lbd2fx >>> >>> Thanks, >>> Mrinal Tyagi >>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From melissawm at gmail.com Fri Jul 3 09:39:06 2020 From: melissawm at gmail.com (=?UTF-8?Q?Melissa_Mendon=C3=A7a?=) Date: Fri, 3 Jul 2020 10:39:06 -0300 Subject: [Numpy-discussion] Documentation Team meeting - Monday July 6 In-Reply-To: References: Message-ID: Hi all! This is a reminder that our next Documentation Team meeting will be on *Monday, July 6* at 3PM UTC**. If you wish to join on Zoom, you need to use this link https://zoom.us/j/420005230 Here's the permanent hackmd document with the meeting notes (still being updated in the next few days!): https://hackmd.io/oB_boakvRqKR-_2jRV-Qjg Hope to see you around! 
** You can click this link to get the correct time at your timezone: https://www.timeanddate.com/worldclock/fixedtime.html?msg=NumPy+Documentation+Team+Meeting&iso=20200706T15&p1=1440&ah=1 *** You can add the NumPy community calendar to your google calendar by clicking this link: https://calendar.google.com/calendar/r?cid=YmVya2VsZXkuZWR1X2lla2dwaWdtMjMyamJobGRzZmIyYzJqODFjQGdyb3VwLmNhbGVuZGFyLmdvb2dsZS5jb20 - Melissa -------------- next part -------------- An HTML attachment was scrubbed... URL: From sabertooth2022 at gmail.com Fri Jul 3 11:22:14 2020 From: sabertooth2022 at gmail.com (Saber Tooth) Date: Fri, 3 Jul 2020 20:52:14 +0530 Subject: [Numpy-discussion] Google Summer of Docs Faq and Tutorial Proposed Structure . In-Reply-To: References: Message-ID: Hello Melissa , Just read your email regarding docs meeting , I have now also tried to use https://bjnath.github.io/demodocs/new_docs_dev as a way to build docs , they were really useful for me as now i have got the message "build succeeded. The HTML pages are in build/html." . I believe we can really replace https://numpy.org/devdocs/dev/index.html with https://bjnath.github.io/demodocs/new_docs_dev , the former being incomplete and sometimes misleading . I can now really start on with contributing to NumPy having done my Development setup . I would also like to ask what are the fields in NumPy Docs that u would really like me to contribute to , should i be doing as per my proposal or an issue ? Looking forward to hearing from you Best, Mrinal -------------- next part -------------- An HTML attachment was scrubbed... URL: From melissawm at gmail.com Fri Jul 3 17:17:12 2020 From: melissawm at gmail.com (=?UTF-8?Q?Melissa_Mendon=C3=A7a?=) Date: Fri, 3 Jul 2020 18:17:12 -0300 Subject: [Numpy-discussion] Google Summer of Docs Faq and Tutorial Proposed Structure . In-Reply-To: References: Message-ID: Hello Mrinal, That is indeed something we are looking at. Feel free to join the discussion at the documentation meeting on Monday, or here if you'd like. I don't think Ben has opened an issue or PR about this yet, but if he does the link will surely be at the hackmd document for the docs meeting (here: https://hackmd.io/oB_boakvRqKR-_2jRV-Qjg) About contributing, you can start by issues that you feel are interesting to you, or even opening new ones, such as https://github.com/numpy/numpy/issues/15760 and https://github.com/numpy/numpy/issues/15793, for example. If you need help you can tag me on the specific issue and I'll try to answer to the best of my abilities. About the GSoD proposal, keep in mind that the period for the evaluation of projects starts on July 9. We've had a large number of people interested in the project, so there are no guarantees at this point that your project will be chosen. If you want to work on your project anyway, feel free! 
Best, - Melissa -------------- next part -------------- An HTML attachment was scrubbed... URL: From albuscode at gmail.com Fri Jul 3 22:12:00 2020 From: albuscode at gmail.com (Inessa Pawson) Date: Sat, 4 Jul 2020 12:12:00 +1000 Subject: [Numpy-discussion] Help us to make NumPy better In-Reply-To: References: Message-ID: On Fri, Jul 3, 2020 at 5:45 PM wrote: > > ---------- Forwarded message ---------- > From: Inessa Pawson > To: numpy-discussion at python.org > Cc: > Bcc: > Date: Fri, 3 Jul 2020 12:12:00 +1000 > Subject: [Numpy-discussion] Help us to make NumPy better > > Yes, it's a survey. But it's very important. > > Having limited human and financial resources is a common challenge for > open source projects. NumPy is not an exception. Please join this > structured dialogue with the NumPy leadership team to better guide and > prioritize decision-making about the development of NumPy as software and a > community. > > What started as a humble project by a dedicated core of user-developers > has since transformed into a foundational component of the widely-adopted > scientific Python ecosystem used by millions worldwide. To engage > non-English speaking stakeholders, the inaugural NumPy community survey is > offered in 7 additional languages: Bangla, Hindi, Japanese, Mandarin, > Portuguese, Russian, and Spanish. > > The survey will take about 15 minutes of your time and close on July 17th. Click > here to get > started. 
> > Inessa Pawson > > NumPy Survey Team > Thank you to everyone who took the time to participate in the inaugural NumPy community survey in the first 24 hours! It is an effort of a group of volunteers who showed a remarkable dedication to the project amidst the COVID-19 pandemic. I'd like to take this opportunity to acknowledge everyone involved.
Survey team (in alphabetical order): Ross Barnowski, Sebastian Berg, Xiaoyi Deng, Ralf Gommers, Stephanie Mendoza, Inessa Pawson.
Translations team (in alphabetical order): Yuki Dunn, Kiko Correoso Garcia, Jose Guzman, Paul Ivanov, Siddhartha Kapoor, Tetsuo Koyama, Guilherme Leobas, Dayane Machado, Mahfuza Humayra Mohona, Aerik Pawson, Zijie Poh, Sumera Priyadarsini, Shaloo Shalini, Kriti Singh, Alexandre de Siqueira.
Special thanks: Prof. Frederick Conrad (Survey Research Center, Institute for Social Research, University of Michigan), Prof. Jim Lepkowski (Survey Research Center, Institute for Social Research, University of Michigan), Prof. Michael Traugott (Survey Research Center, Institute for Social Research, University of Michigan), Deji Suolang, Mame Fatou Thiam.
-- Every good wish, *Inessa Pawson* Albus Code inessa at albuscode.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From numpy_gsod at bigriver.xyz Sat Jul 4 08:35:11 2020 From: numpy_gsod at bigriver.xyz (Ben Nathanson) Date: Sat, 4 Jul 2020 08:35:11 -0400 Subject: [Numpy-discussion] Google Summer of Docs Faq and Tutorial Proposed Structure . Message-ID: Let me first say I'm a volunteer and not a member of the group that will evaluate proposals. But I can tell you how I would choose. A right-sized proposal is stronger than an overambitious one. I'd scale back and pick a single target. From your listed experience, this would be your first significant docs work. Before you can overhaul an entire site, you need to get a feel for the pace of the work. The review process alone guarantees the site can't be transformed in three months. 
For argument's sake, let's say you focus on mining the internet for how-to subjects. This is not a bad GSoD idea, because it's open-ended: No matter how fast you write, you'll never exhaust the supply of topics, and if work proceeds more slowly than you'd hoped, you will have established a methodology, and any how-tos we get are more than we have now. Apart from appropriate scope, a proposal needs credibility. GSoD tech writers are a coveted resource, so a project wants to be confident that the writer they pick will deliver the goods. We're scientists and engineers, yourself included, so we look for evidence. Put evidence in your proposal. Assuming you pick how-tos, show the NumPy team how thoroughly you've considered the issues, explain your methodology and its strengths and shortcomings, and, most importantly, give samples of how-tos you've transformed. It's like a job interview, but harder: Not only do you have to provide the answers, you have to anticipate the questions. Does that mean you have to do work upfront that you might do during GSoD? Yes. You're staking capital. The more you put on the table, the more confident the team can be in your sincerity and skill. That said, you do stand to lose it all if someone else is chosen. That can happen even if you send in a proposal that everyone agrees looks good. Your effort will not be a waste; you'll have developed skill in drafting a competitive proposal -- useful in grant writing, calls for papers, and next year's GSoD. Again, this is peer review, not official guidance. And to be clear, I myself am not participating in GSoD; I thought I might but chose instead to simply volunteer. From numpy_gsod at bigriver.xyz Sat Jul 4 09:40:37 2020 From: numpy_gsod at bigriver.xyz (Ben Nathanson) Date: Sat, 4 Jul 2020 09:40:37 -0400 Subject: [Numpy-discussion] Google Summer of Docs Faq and Tutorial Proposed Structure . 
In-Reply-To: References: Message-ID: If you do choose how-tos, I meant to say that the first place to mine should be this very list. For instance, a question the other day on seeding random sequences sparked an illuminating and far-reaching discussion. Some things that make this list a great source:
* Extracting how-tos from the mailing list does a real service -- questions on the mailing list are much less visible via Google search than SO questions
* Answers here are likely to be deep and interesting (i.e., not simply answers you'll find in the docs)
* We own the list; no doubts about usage rights
* We have authoritative answers from code authors
Mining only this list would not be enough for a proposal, however; there'd need to be something else as well (e.g., mining SO/Reddit). On the subject of mining SO, I'd suggest not only weighting by frequency but also searching out answers that came from the community -- e.g. Robert Kern, Warren Weckesser, Jaime Fernández del Río (jaime), and others whom I apologize for missing. Here again we'd add value by giving prominence (and an imprimatur) to the best answers. From evgeny.burovskiy at gmail.com Sat Jul 4 13:01:29 2020 From: evgeny.burovskiy at gmail.com (Evgeni Burovski) Date: Sat, 4 Jul 2020 20:01:29 +0300 Subject: [Numpy-discussion] reseed random generator (1.19) In-Reply-To: References: <2C4AB2E0-D641-4AF8-B2F2-C959F133D483@hxcore.ol> <53BDD45B-693A-42DA-9165-DDF79A75B3D7@hxcore.ol> Message-ID: Thanks Kevin, thanks Robert, this is very helpful! I'd strongly agree with Matti that your explanations could/should make it to the docs. Maybe it's something for the GSoD. While we're on the subject, one comment and two (hopefully last) questions: 1. My two cents w.r.t. `np.random.simple_seed()` function Robert mentioned: I personally would find it way more confusing than a clear explanation + example in the docs. 
I'd ask myself what's "simple" here, click through to the source of this `simple_seed`, find out that it's a docstring and a two-liner, and just copy-paste the latter into my user code. Again, just FWIW. 2. What would be a preferred way of spelling out "give me the N-th spawned child SeedSequence"? The use case is that I prepare (human-readable) input files once and run a number of computational jobs in separate OS processes. From what Kevin said, I can of course give each worker a pair of (entropy, worker_id) and then each of them does at startup > parent_seq = SeedSequence(entropy) > this_sequence = parent_seq.spawn(worker_id + 1)[worker_id] Is this a recommended way, or is there a better API? Or does the number of spawned children need to be known beforehand? I'd much rather avoid serialization/deserialization if possible. 3. Is there a way of telling the number of draws a generator did? The use case is to checkpoint the number of draws and `.advance` the bit generator when resuming from the checkpoint. (The runs are longer than the batch queue limits). Thanks! Evgeni On Mon, Jun 29, 2020 at 11:06 PM Robert Kern wrote: > > On Mon, Jun 29, 2020 at 11:30 AM Robert Kern wrote: >> >> On Mon, Jun 29, 2020 at 11:10 AM Kevin Sheppard wrote: >>> >>> The total number of digits in the binary representation is somewhere between 32 and 128. >> >> >> I like using the standard library `secrets` module. >> >> >>> import secrets >> >>> secrets.randbelow(1<<128) >> 8080125189471896523368405732926911908 >> >> If you want an easy-to-follow rule, just use the above snippet to get a 128-bit number. More than 128 bits won't do you any good (at least by default, the internal bottleneck inside of SeedSequence is a 128-bit pool), and 128-bit numbers are just about small enough to copy-paste comfortably. > > > Sorry, `secrets.randbits(128)` is the cleaner form of this. 
> > -- > Robert Kern > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion From robert.kern at gmail.com Sat Jul 4 13:55:45 2020 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 4 Jul 2020 13:55:45 -0400 Subject: [Numpy-discussion] reseed random generator (1.19) In-Reply-To: References: <2C4AB2E0-D641-4AF8-B2F2-C959F133D483@hxcore.ol> <53BDD45B-693A-42DA-9165-DDF79A75B3D7@hxcore.ol> Message-ID: On Sat, Jul 4, 2020 at 1:03 PM Evgeni Burovski wrote: > Thanks Kevin, thanks Robert, this is very helpful! > > I'd strongly agree with Matti that your explanations could/should make > it to the docs. Maybe it's something for the GSoD. > > While we're on the subject, one comment and two (hopefully last) questions: > > 1. My two cents w.r.t. `np.random.simple_seed()` function Robert > mentioned: I personally would find it way more confusing than a clear > explanation + example in the docs. I'd ask myself what's "simple" > here, click through to the source of this `simple_seed`, find out that > it's a docstring and a two-liner, and just copy-paste the latter into > my user code. Again, just FWIW. > Noted. > 2. What would be a preferred way of spelling out "give me the N-th > spawned child SeedSequence"? > The use case is that I prepare (human-readable) input files once and > run a number of computational jobs in separate OS processes. From what > Kevin said, I can of course give each worker a pair of (entropy, > worker_id) and then each of them does at startup > > > parent_seq = SeedSequence(entropy) > > this_sequence = parent_seq.spawn(worker_id + 1)[worker_id] > > Is this a recommended way, or is there a better API? Or does the > number of spawned children need to be known beforehand? > I'd much rather avoid serialization/deserialization if possible. > Assuming that `worker_id` starts at 0: this_sequence = SeedSequence(entropy, spawn_key=(worker_id,)) > 3. 
Is there a way of telling the number of draws a generator did? > > The use case is to checkpoint the number of draws and `.advance` the > bit generator when resuming from the checkpoint. (The runs are longer > then the batch queue limits). > There are computations you can do on the internal state of PCG64 and Philox to get this information, but not in general, no. I do recommend serializing the Generator or BitGenerator (or at least the BitGenerator's .state property, which is a nice JSONable dict for PCG64) for checkpointing purposes. Among other things, there is a cached uint32 for when odd numbers of uint32s are drawn that you might need to handle. The state of the default PCG64 is much smaller than MT19937. It's less work and more reliable than computing that distance and storing the original seed and the distance. -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... URL: From ndbecker2 at gmail.com Sat Jul 4 14:37:49 2020 From: ndbecker2 at gmail.com (Neal Becker) Date: Sat, 4 Jul 2020 14:37:49 -0400 Subject: [Numpy-discussion] reseed random generator (1.19) In-Reply-To: References: <2C4AB2E0-D641-4AF8-B2F2-C959F133D483@hxcore.ol> <53BDD45B-693A-42DA-9165-DDF79A75B3D7@hxcore.ol> Message-ID: On Sat, Jul 4, 2020 at 1:56 PM Robert Kern wrote: .... > > 3. Is there a way of telling the number of draws a generator did? >> >> The use case is to checkpoint the number of draws and `.advance` the >> bit generator when resuming from the checkpoint. (The runs are longer >> then the batch queue limits). >> > > There are computations you can do on the internal state of PCG64 and > Philox to get this information, but not in general, no. I do recommend > serializing the Generator or BitGenerator (or at least the BitGenerator's > .state property, which is a nice JSONable dict for PCG64) for checkpointing > purposes. Among other things, there is a cached uint32 for when odd numbers > of uint32s are drawn that you might need to handle. 
The state of the > default PCG64 is much smaller than MT19937. It's less work and more > reliable than computing that distance and storing the original seed and the > distance. > > -- > Robert Kern > Sorry, you lost me here. If I want to save, restore the state of a generator, can I use pickle/unpickle? -- *Those who don't understand recursion are doomed to repeat it* -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Sat Jul 4 14:41:17 2020 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 4 Jul 2020 14:41:17 -0400 Subject: [Numpy-discussion] reseed random generator (1.19) In-Reply-To: References: <2C4AB2E0-D641-4AF8-B2F2-C959F133D483@hxcore.ol> <53BDD45B-693A-42DA-9165-DDF79A75B3D7@hxcore.ol> Message-ID: On Sat, Jul 4, 2020, 2:39 PM Neal Becker wrote: > > > On Sat, Jul 4, 2020 at 1:56 PM Robert Kern wrote: > .... > >> >> 3. Is there a way of telling the number of draws a generator did? >>> >>> The use case is to checkpoint the number of draws and `.advance` the >>> bit generator when resuming from the checkpoint. (The runs are longer >>> then the batch queue limits). >>> >> >> There are computations you can do on the internal state of PCG64 and >> Philox to get this information, but not in general, no. I do recommend >> serializing the Generator or BitGenerator (or at least the BitGenerator's >> .state property, which is a nice JSONable dict for PCG64) for checkpointing >> purposes. Among other things, there is a cached uint32 for when odd numbers >> of uint32s are drawn that you might need to handle. The state of the >> default PCG64 is much smaller than MT19937. It's less work and more >> reliable than computing that distance and storing the original seed and the >> distance. >> >> -- >> Robert Kern >> > > Sorry, you lost me here. If I want to save, restore the state of a > generator, can I use pickle/unpickle? > Absolutely. 
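
A minimal sketch of the checkpointing approach discussed above, assuming NumPy 1.17 or later (where `Generator` and `PCG64` are available). The names `rng`, `blob`, `saved_state` and the seed value are illustrative only and do not come from the thread. It shows two routes: pickling the whole `Generator`, and saving the bit generator's `.state` dict through JSON.

```python
import json
import pickle

import numpy as np

# Illustrative seed; any seed works.
rng = np.random.default_rng(12345)
rng.standard_normal(5)  # do some work before the checkpoint

# Route 1: pickle the whole Generator.
blob = pickle.dumps(rng)

# Route 2: keep only the bit generator state (a plain dict for PCG64)
# and round-trip it through JSON, as one might for a checkpoint file.
saved_state = json.loads(json.dumps(rng.bit_generator.state))

# Draws made by the original generator after the checkpoint.
expected = rng.standard_normal(3)

# Resume from the pickle.
restored = pickle.loads(blob)
from_pickle = restored.standard_normal(3)

# Resume from the state dict: build a fresh PCG64 and overwrite its state.
fresh = np.random.Generator(np.random.PCG64())
fresh.bit_generator.state = saved_state
from_state = fresh.standard_normal(3)

print(np.array_equal(expected, from_pickle))
print(np.array_equal(expected, from_state))
```

Either route resumes the stream where it left off, so there is no need to count draws and call `.advance` by hand.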
From tyler.je.reddy at gmail.com Sat Jul 4 19:47:24 2020
From: tyler.je.reddy at gmail.com (Tyler Reddy)
Date: Sat, 4 Jul 2020 17:47:24 -0600
Subject: [Numpy-discussion] ANN: SciPy 1.5.1
Message-ID:

Hi all,

On behalf of the SciPy development team I'm pleased to announce the release of SciPy 1.5.1, which is a bug fix release.

Sources and binary wheels can be found at: https://pypi.org/project/scipy/ and at: https://github.com/scipy/scipy/releases/tag/v1.5.1

One of a few ways to install this release with pip:

pip install scipy==1.5.1

==========================
SciPy 1.5.1 Release Notes
==========================

SciPy 1.5.1 is a bug-fix release with no new features compared to 1.5.0. In particular, an issue where DLL loading can fail for SciPy wheels on Windows with Python 3.6 has been fixed.

Authors
=======

* Peter Bell
* Loïc Estève
* Philipp Thölke +
* Tyler Reddy
* Paul van Mulbregt
* Pauli Virtanen
* Warren Weckesser

A total of 7 people contributed to this release. People with a "+" by their names contributed a patch for the first time. This list of names is automatically generated, and may not be fully complete.

Issues closed for 1.5.1
--------------------------------

* `#9108 `__: documentation: scipy.spatial.KDTree vs. scipy.spatial.cKDTree
* `#12218 `__: Type error in stats.ks_2samp when alternative != 'two-sided-
* `#12406 `__: DOC: Docstring in stats.anderson function not properly formatted
* `#12418 `__: Regression in hierarchy.dendogram

Pull requests for 1.5.1
--------------------------------

* `#12280 `__: BUG: Fixes gh-12218, TypeError converting int to float inside...
* `#12336 `__: BUG: KDTree should reject complex input points
* `#12344 `__: MAINT: Don't use numpy's aliases of Python builtin objects.
* `#12407 `__: DOC: Fix docstring for dist param in anderson function * `#12410 `__: CI: Run the Azure Windows Python36 32bit tests with mode 'fast' * `#12421 `__: Fix regression in scipy 1.5.0 in dendogram when labels is a numpy... * `#12462 `__: MAINT: move distributor_init import after __config__ import Checksums ========= MD5 ~~~ b71e8115d61c604cc65e5ecc556131f6 scipy-1.5.1-cp36-cp36m-macosx_10_9_x86_64.whl 0190c11f75ed28a7e56050182ca95a18 scipy-1.5.1-cp36-cp36m-manylinux1_i686.whl c4dd717a3a0c3fe64380039e4fda663f scipy-1.5.1-cp36-cp36m-manylinux1_x86_64.whl baad02c954e85e7fd3d4a9fd49fc6359 scipy-1.5.1-cp36-cp36m-win32.whl 9edc3a9aedf6bffccb17101c905126d0 scipy-1.5.1-cp36-cp36m-win_amd64.whl 83479a6de66a6bc2da0990fa71cf3cec scipy-1.5.1-cp37-cp37m-macosx_10_9_x86_64.whl f2d5c8713b087545c5ec19cc8e46212c scipy-1.5.1-cp37-cp37m-manylinux1_i686.whl 6a18a9636342574ae55d3a80136c550c scipy-1.5.1-cp37-cp37m-manylinux1_x86_64.whl 5da68faf5b32c539d1cb5390df010cc8 scipy-1.5.1-cp37-cp37m-win32.whl 2ca8c59a6712e91ac78b8540ab694b53 scipy-1.5.1-cp37-cp37m-win_amd64.whl cceb059d0cf6a70e62452deb5571ba00 scipy-1.5.1-cp38-cp38-macosx_10_9_x86_64.whl 8a65b30ccd72409704d3300922da2b7f scipy-1.5.1-cp38-cp38-manylinux1_i686.whl 00181f52a7917d1c3d50e42a76a6df96 scipy-1.5.1-cp38-cp38-manylinux1_x86_64.whl 2aa8b6ddceaebe7b33d71dbad0e208cc scipy-1.5.1-cp38-cp38-win32.whl a626585d08b0991c8f2df0caacdf9997 scipy-1.5.1-cp38-cp38-win_amd64.whl f6986798b7d22ffc5f80b749d7ec27ca scipy-1.5.1.tar.gz e126a1a0ff954b924a8273faa7437fe3 scipy-1.5.1.tar.xz 3bce82b23d45d1a96ee270f23176746a scipy-1.5.1.zip SHA256 ~~~~~~ 058e84930407927f71963a4ad8c1dc96c4d2d075636a68578195648c81f78810 scipy-1.5.1-cp36-cp36m-macosx_10_9_x86_64.whl 7908c85854c5b5b6d3ce7fefafac1ca3e23ff9ac41edabc2d46ae5dc9fa070ac scipy-1.5.1-cp36-cp36m-manylinux1_i686.whl 8302d69fb1528ea7c7f2a1ea640d354c981b6eb8192d1c175349874209397604 scipy-1.5.1-cp36-cp36m-manylinux1_x86_64.whl 35d042d6499caf1a5d171baed0ebf01eb665b7af2ad98a8ff1b0e6e783654540 
scipy-1.5.1-cp36-cp36m-win32.whl 5e0bb43ff581811ab7f27425f6b96c1ddf7591ccad2e486c9af0b910c18f7185 scipy-1.5.1-cp36-cp36m-win_amd64.whl b4858ccbd88f4b53950fb9fc0069c1d9fea83d7cff2382e1d8b023d3f4883014 scipy-1.5.1-cp37-cp37m-macosx_10_9_x86_64.whl eb46d8b5947ca27b0bc972cecfba8130f088a83ab3d08c1a6033d9070b3046b3 scipy-1.5.1-cp37-cp37m-manylinux1_i686.whl fff15df01bef1243468be60c55178ed7576270b200aab08a7ffd5b8e0bbc340c scipy-1.5.1-cp37-cp37m-manylinux1_x86_64.whl 81859ed3aad620752dd2c07c32b5d3a80a0d47c5e3813904621954a78a0ae899 scipy-1.5.1-cp37-cp37m-win32.whl c05c6fe76228cc13c5214e9faf5f2a871a1da54473bc417ab9da310d0e5fff8b scipy-1.5.1-cp37-cp37m-win_amd64.whl 71742889393a724dfce755b6b61228677873d269a4234e51ddaf08b998433c91 scipy-1.5.1-cp38-cp38-macosx_10_9_x86_64.whl 9323d268775991b79690f7b9a28a4e8b8c4f2b160ed9f8a90123127314e2d3c1 scipy-1.5.1-cp38-cp38-manylinux1_i686.whl 06b19a650471781056c1a2172eeeeb777b8b516e9434005dd392a4559e0938b9 scipy-1.5.1-cp38-cp38-manylinux1_x86_64.whl 57a0f2be3063dbe1e3daf31ec9005576e8fd1022a28159d0db71d14566899d16 scipy-1.5.1-cp38-cp38-win32.whl c06e731aa46c0dfc563cc636155758178ebc019ef78b9b0f4370effe2ac0f0e6 scipy-1.5.1-cp38-cp38-win_amd64.whl 039572f0ca9578a466683558c5bf1e65d442860ec6e13307d528749cfe6d07b8 scipy-1.5.1.tar.gz 0728bd66a5251cfeff17a72280ae5a40ec14add217f94868d1415b3c469b610a scipy-1.5.1.tar.xz 6dfa9d1e718588f48731e865674b3270130f7736d6c7dc5ceaeb048f55ed793a scipy-1.5.1.zip -------------- next part -------------- An HTML attachment was scrubbed... URL: From albuscode at gmail.com Mon Jul 6 06:14:44 2020 From: albuscode at gmail.com (Inessa Pawson) Date: Mon, 6 Jul 2020 20:14:44 +1000 Subject: [Numpy-discussion] =?utf-8?q?Parlez-vous_fran=C3=A7ais=3F?= Message-ID: As most of you know, the inaugural NumPy community survey is currently underway. Fabrice Silva kindly offered his help to translate the survey questionnaire into French to maximize the participation of NumPy users and developers from the French-speaking countries. 
He has already completed his part of the translation (in less than 24 hours!). We are looking for another French-speaking volunteer to finalize the translation process of this document. If you are available, please email me at inessa at albuscode.org.

--
Every good wish,
Inessa Pawson
NumPy survey team
inessa at albuscode.org

From asmeurer at gmail.com Mon Jul 6 14:39:19 2020
From: asmeurer at gmail.com (Aaron Meurer)
Date: Mon, 6 Jul 2020 12:39:19 -0600
Subject: [Numpy-discussion] What is up with raw boolean indices (like a[False])?
Message-ID:

I've been trying to figure out this behavior. It doesn't seem to be documented at https://numpy.org/doc/stable/reference/arrays.indexing.html

>>> a = np.empty((2, 3))
>>> a.shape
(2, 3)
>>> a[True].shape
(1, 2, 3)
>>> a[False].shape
(0, 2, 3)

It seems like indexing with a raw boolean (True or False) adds an axis with a dimension 1 or 0, resp.

Except it only works once:

>>> a[:,False]
array([], shape=(2, 0, 3), dtype=float64)
>>> a[:,False, False]
array([], shape=(2, 0, 3), dtype=float64)
>>> a[:,False,True].shape
(2, 0, 3)
>>> a[:,True,False].shape
(2, 0, 3)

The docs say "A single boolean index array is practically identical to x[obj.nonzero()]". I have a hard time seeing this as an extension of that, since indexing by `np.nonzero(False)` or `np.nonzero(True)` *replaces* the given axis.

>>> a[np.nonzero(True)].shape
(1, 3)
>>> a[np.nonzero(False)].shape
(0, 3)

I think at best this behavior should be documented. I'm trying to understand the motivation for it, or if it's even intentional. And in particular, why do multiple boolean indices not insert multiple axes? It would actually be useful to be able to generically add length 0 axes using an index, similar to how `newaxis` adds a length 1 axis.
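
A short script along these lines can reproduce the shapes in question. It is only a sketch, assuming `a` has shape (2, 3); the `shape_*` variable names are illustrative. The `np.nonzero` calls on bare booleans are left out, since calling `nonzero` on 0-D input is deprecated.

```python
import numpy as np

a = np.empty((2, 3))

# A bare boolean index adds one axis: length 1 for True, length 0 for False.
shape_true = a[True].shape
shape_false = a[False].shape

# Several bare booleans in one index still add only a single axis.
shape_one = a[:, False].shape
shape_two = a[:, False, False].shape
shape_mixed = a[:, True, False].shape

# For comparison, each np.newaxis adds its own length-1 axis.
shape_newaxis = a[:, np.newaxis, np.newaxis].shape

print(shape_true, shape_false)
print(shape_one, shape_two, shape_mixed)
print(shape_newaxis)
```

The contrast with `np.newaxis` is the point of the question: repeated `newaxis` entries stack up, while repeated bare booleans do not.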
Aaron Meurer

From sebastian at sipsolutions.net Mon Jul 6 14:51:09 2020
From: sebastian at sipsolutions.net (Sebastian Berg)
Date: Mon, 06 Jul 2020 13:51:09 -0500
Subject: [Numpy-discussion] What is up with raw boolean indices (like a[False])?
In-Reply-To:
References:
Message-ID: <94681b6f3db5965de95b63ed3a012c503cb9ba8c.camel@sipsolutions.net>

On Mon, 2020-07-06 at 12:39 -0600, Aaron Meurer wrote:
> I've been trying to figure out this behavior. It doesn't seem to be
> documented at
> https://numpy.org/doc/stable/reference/arrays.indexing.html
>
> >>> a = np.empty((2, 3))
> >>> a.shape
> (2, 3)
> >>> a[True].shape
> (1, 2, 3)
> >>> a[False].shape
> (0, 2, 3)
>
> It seems like indexing with a raw boolean (True or False) adds an axis
> with a dimension 1 or 0, resp.
>
> Except it only works once:
>
> >>> a[:,False]
> array([], shape=(2, 0, 3), dtype=float64)
> >>> a[:,False, False]
> array([], shape=(2, 0, 3), dtype=float64)
> >>> a[:,False,True].shape
> (2, 0, 3)
> >>> a[:,True,False].shape
> (2, 0, 3)
>
> The docs say "A single boolean index array is practically identical to
> x[obj.nonzero()]". I have a hard time seeing this as an extension of
> that, since indexing by `np.nonzero(False)` or `np.nonzero(True)`
> *replaces* the given axis.
>
> >>> a[np.nonzero(True)].shape
> (1, 3)
> >>> a[np.nonzero(False)].shape
> (0, 3)
>
> I think at best this behavior should be documented. I'm trying to
> understand the motivation for it, or if it's even intentional. And in
> particular, why do multiple boolean indices not insert multiple axes?
> It would actually be useful to be able to generically add length 0
> axes using an index, similar to how `newaxis` adds a length 1 axis.

It's fully intentional as it is the correct generalization from an N-D boolean index to include a 0-D boolean index.
To be fair, there is a footnote in the "Detailed notes" saying that: "the nonzero equivalence for Boolean arrays does not hold for zero dimensional boolean arrays.", this is for technical reasons since `nonzero` does not do useful things for 0-D input. In any case, a boolean index always does the following: 1. It will *remove as many dimensions as the index has, because this is the number of dimensions effectively indexed by it* 2. It will add a single new dimension at the same place. The length of this new dimension is the number of `True` elements. 3. If you have multiple advanced indexing you get annoying broadcasting of all of these. That is *always* confusing for boolean indices. 0-D should not be too special there... And this generalizes to 0-D just as well, even if it may be a bit surprising at first. I have written much of this more clearly once before in this NEP, which may be a good read to _really_ understand it: https://numpy.org/neps/nep-0021-advanced-indexing.html In general, I wonder if going into much depth about how 0-D arrays are not actually really handled very special is good. Yes, its confusing on its own, but it seems also a bit like overloading the user with unnecessary knowledge? Cheers, Sebastian > > Aaron Meurer > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: This is a digitally signed message part URL: From asmeurer at gmail.com Mon Jul 6 15:27:17 2020 From: asmeurer at gmail.com (Aaron Meurer) Date: Mon, 6 Jul 2020 13:27:17 -0600 Subject: [Numpy-discussion] What is up with raw boolean indices (like a[False])? 
In-Reply-To: References: Message-ID: > Its fully intentional as it is the correct generalization from an N-D > boolean index to include a 0-D boolean index. > To be fair, there is a footnote in the "Detailed notes" saying that: > "the nonzero equivalence for Boolean arrays does not hold for zero > dimensional boolean arrays.", this is for technical reasons since > `nonzero` does not do useful things for 0-D input. > > In any case, a boolean index always does the following: > 1. It will *remove as many dimensions as the index has, because this > is the number of dimensions effectively indexed by it* > 2. It will add a single new dimension at the same place. The length of > this new dimension is the number of `True` elements. > 3. If you have multiple advanced indexing you get annoying broadcasting > of all of these. That is *always* confusing for boolean indices. > 0-D should not be too special there... > And this generalizes to 0-D just as well, even if it may be a bit > surprising at first. I guess if those are the base rules for boolean indices this makes sense. So that brings up the question then, is there a way to add arbitrary empty dimensions using an index? > > I have written much of this more clearly once before in this NEP, which > may be a good read to _really_ understand it: > https://numpy.org/neps/nep-0021-advanced-indexing.html > In general, I wonder if going into much depth about how 0-D arrays are > not actually really handled very special is good. Yes, its confusing > on its own, but it seems also a bit like overloading the user with > unnecessary knowledge? The page I referenced is already written like a very highly technical document, so I think it should embrace that and fully describe the spec of NumPy indexing. NumPy could use more user-friendly documentation for indexing, but that page ain't it. FWIW, I wrote some documentation on slices of my own here https://quansight.github.io/ndindex/slices.html. 
I eventually plan to extend this to all forms of NumPy indexing. Anyway, the three bullet points you mentioned above would be helpful to include in the docs. > Cheers, > Sebastian From sabertooth2022 at gmail.com Tue Jul 7 03:07:42 2020 From: sabertooth2022 at gmail.com (Saber Tooth) Date: Tue, 7 Jul 2020 12:37:42 +0530 Subject: [Numpy-discussion] GSoD application technical limitations Message-ID: Hello Melissa , Today I was writing my GSoD proposal application , while I tried to copy / paste my exact Google docs proposal in the application description , but even after clearing all the formatting , the application form is stating that the " response is too large , try shortening some answers " Now many people have complained regarding this issue on the season of docs channel , so it means the issue is upsetting everyone . So I wanted to request if it's fine I also mention the link to my google docs Proposal where it is explained in much more detail. In application I have almost shortened half of my proposal therefore it has started to lose its minute explanatory details . Thanks , Mrinal Tyagi -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Tue Jul 7 04:34:17 2020 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Tue, 7 Jul 2020 10:34:17 +0200 Subject: [Numpy-discussion] GSoD application technical limitations In-Reply-To: References: Message-ID: On Tue, Jul 7, 2020 at 9:07 AM Saber Tooth wrote: > Hello Melissa , > > Today I was writing my GSoD proposal application , while I tried to copy / > paste my exact Google docs proposal in the application description , but > even after clearing all the formatting , the application form is stating > that the " response is too large , try shortening some answers " > Now many people have complained regarding this issue on the season of docs > channel , so it means the issue is upsetting everyone . 
> > So I wanted to request if it's fine I also mention the link to my google > docs Proposal where it is explained in much more detail. > In application I have almost shortened half of my proposal therefore it > has started to lose its minute explanatory details . > Hi Mrinal, thanks for asking - yes this is the right thing to do. Please add a link and keep your Google Docs proposal exactly as you would like. This goes for all applications, we will take your Google Docs, HackMD doc or other format proposal into account. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Tue Jul 7 13:19:36 2020 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Tue, 07 Jul 2020 12:19:36 -0500 Subject: [Numpy-discussion] NumPy Community Meeting Wednesday Message-ID: <59ff27bc13d18d14ec61034be309ca8ad3290f20.camel@sipsolutions.net> Hi all, There will be a NumPy Community meeting Wednesday July 8th at 1pm Pacific Time (20:00 UTC). Everyone is invited and encouraged to join in and edit the work-in-progress meeting topics and notes at: https://hackmd.io/76o-IxCjQX2mOXO_wwkcpg?both Best wishes Sebastian -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: This is a digitally signed message part URL: From sabertooth2022 at gmail.com Wed Jul 8 02:31:12 2020 From: sabertooth2022 at gmail.com (Saber Tooth) Date: Wed, 8 Jul 2020 12:01:12 +0530 Subject: [Numpy-discussion] NumPy Documentation Gallery structure Message-ID: Hello Melissa and Ralf , I was just wondering if we haven't had much discussion on the structuring of tutorials , while we have had some discussion on Explanations . I have been analysing the Tutorials of some other communities too , say Matplotlib : they have basically structured the tutorials page like a Gallery where examples have been arranged in the order of experience level . 
Link to Matplotlib tutorials : https://matplotlib.org/3.2.2/tutorials/index.html Here is SunPy Documentation. They have tried to structure their How-TO examples in the same way like a Gallery . Link to SunPy Gallery Examples : https://docs.sunpy.org/en/v2.0.1/generated/gallery/index.html# Please have a Look , maybe we can also get some new ideas for our Documentation . I think it will help us to structure the Tutorials page of NumPy too . Thanks , Mrinal Tyagi -------------- next part -------------- An HTML attachment was scrubbed... URL: From matti.picus at gmail.com Wed Jul 8 02:39:18 2020 From: matti.picus at gmail.com (Matti Picus) Date: Wed, 8 Jul 2020 09:39:18 +0300 Subject: [Numpy-discussion] NumPy Documentation Gallery structure In-Reply-To: References: Message-ID: <1188e74d-8a05-0687-1f2e-e87c27988741@gmail.com> An HTML attachment was scrubbed... URL: From sabertooth2022 at gmail.com Wed Jul 8 03:05:54 2020 From: sabertooth2022 at gmail.com (Saber Tooth) Date: Wed, 8 Jul 2020 12:35:54 +0530 Subject: [Numpy-discussion] NumPy Documentation Gallery structure In-Reply-To: <1188e74d-8a05-0687-1f2e-e87c27988741@gmail.com> References: <1188e74d-8a05-0687-1f2e-e87c27988741@gmail.com> Message-ID: Hi Matti , Yup , Matplotlib and SunPy are using Sphinx Gallery ? , I am asking , could we are also have some discussion around such structure maybe modifying it in a way more suited to NumPy community ? What do you think? Thanks , Mrinal On Wed, 8 Jul, 2020, 12:09 pm Matti Picus, wrote: > I think those projects use > https://github.com/sphinx-gallery/sphinx-gallery to do the layout > > Matti > > > On 7/8/20 9:31 AM, Saber Tooth wrote: > > Hello Melissa and Ralf , > > I was just wondering if we haven't had much discussion on the structuring > of tutorials , while we have had some discussion on Explanations . 
> > I have been analysing the Tutorials of some other communities too , say > Matplotlib : > they have basically structured the tutorials page like a Gallery where > examples have been arranged in the order of experience level . > Link to Matplotlib tutorials : > https://matplotlib.org/3.2.2/tutorials/index.html > > Here is SunPy Documentation. They have tried to structure their How-TO > examples in the same way like a Gallery . > Link to SunPy Gallery Examples : > https://docs.sunpy.org/en/v2.0.1/generated/gallery/index.html# > > Please have a Look , maybe we can also get some new ideas for our > Documentation . > I think it will help us to structure the Tutorials page of NumPy too . > > Thanks , > Mrinal Tyagi > -- > You received this message because you are subscribed to the Google Groups > "numpy-scipy-gsod" group. > To unsubscribe from this group and stop receiving emails from it, send an > email to numpy-scipy-gsod+unsubscribe at googlegroups.com. > To view this discussion on the web visit > https://groups.google.com/d/msgid/numpy-scipy-gsod/CAAq%2BcUHVfs-u%2B2cAwXZjwvnd6VOPBvmM1BMxJtPmfa6oHZz0nA%40mail.gmail.com > > . > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From kevin.k.sheppard at gmail.com Wed Jul 8 03:19:05 2020 From: kevin.k.sheppard at gmail.com (Kevin Sheppard) Date: Wed, 8 Jul 2020 08:19:05 +0100 Subject: [Numpy-discussion] NumPy Documentation Gallery structure In-Reply-To: References: <1188e74d-8a05-0687-1f2e-e87c27988741@gmail.com>, Message-ID: An HTML attachment was scrubbed... URL: From sabertooth2022 at gmail.com Fri Jul 10 01:59:27 2020 From: sabertooth2022 at gmail.com (Saber Tooth) Date: Fri, 10 Jul 2020 11:29:27 +0530 Subject: [Numpy-discussion] Collection of Questions for Explanations Message-ID: Hi Mentors , In the last Docs meeting , we discussed about Explanations and ways we could use to gather relevant topics for which we could provide explanations . 
The light was shed upon mining of questions firstly from numpy communities via Discussion like groups , then move to other sources . We can mine such topics from the NumPy community via mining email threads of numpy-discussion at python.org for Questions . Apart from doing this I am thinking of opening an issue , the track of which will be kept by me and other contributors if they wish . We can write relevant topics for explanations over there . The link to this issue can be shared with numpy-discussion at python.org and other NumPy community forums , where contributors can comment their questions or queries which they wish to get documented . This way we can avoid multiple Github issues by using one Github issue , moreover this issue will populate queries from the community which will be visible to everyone and may give rise to ideas regarding explanations which we haven't thought about , in fact we can close other such issues later by moving their queries to this issue . In this manner we can have enough data within a month or two to document an Explanations page . Then we could frame explanations from this pool of data in this single Github issue , Do suggest your ideas/advice for the following and also if we should be doing something else . Thanks , Mrinal Tyagi -------------- next part -------------- An HTML attachment was scrubbed... URL: From melissawm at gmail.com Fri Jul 10 11:18:57 2020 From: melissawm at gmail.com (=?UTF-8?Q?Melissa_Mendon=C3=A7a?=) Date: Fri, 10 Jul 2020 12:18:57 -0300 Subject: [Numpy-discussion] Collection of Questions for Explanations In-Reply-To: References: Message-ID: Sounds like a good idea, Mrinal! Make sure to tag it with DOC and mention that it's a tracking issue in the description. Cheers, Melissa Em sex, 10 de jul de 2020 02:59, Saber Tooth escreveu: > Hi Mentors , > > In the last Docs meeting , we discussed about Explanations and ways we > could use to gather relevant topics for which we could provide explanations > . 
The light was shed upon mining of questions firstly from numpy > communities via Discussion like groups , then move to other sources . > > We can mine such topics from the NumPy community via mining email threads > of numpy-discussion at python.org for Questions . Apart from doing this I > am thinking of opening an issue , the track of which will be kept by me and > other contributors if they wish . We can write relevant topics for > explanations over there . The link to this issue can be shared with > numpy-discussion at python.org and other NumPy community forums , where > contributors can comment their questions or queries which they wish to get > documented . > > This way we can avoid multiple Github issues by using one Github issue , > moreover this issue will populate queries from the community which will be > visible to everyone and may give rise to ideas regarding explanations which > we haven't thought about , in fact we can close other such issues later by > moving their queries to this issue . > > In this manner we can have enough data within a month or two to document > an Explanations page . Then we could frame explanations from this pool of > data in this single Github issue , > > Do suggest your ideas/advice for the following and also if we should be > doing something else . > > Thanks , > Mrinal Tyagi > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Fri Jul 10 12:46:50 2020 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Fri, 10 Jul 2020 11:46:50 -0500 Subject: [Numpy-discussion] NumPy Sprint on the weekend Message-ID: <91b28adc57378254ccc40f29a73196e66e1bd981.camel@sipsolutions.net> Hi all, since there is the SciPy sprints going on we will also be joining the event and sprinting as well. The kickoff for the SciPy sprint session is at 9:00 CST (7:00 PST, 16:00 CET), with the start of the sprints around 9:30. 
We are gathering all information at the hackmd: https://hackmd.io/r4pcf4CQRMOUY2VXOWk5_Q so that we can be flexible with organization (e.g. which video software to use).

Note that attendance may pick up only a few hours later, but we will try to be around most of the time.

Currently, we have a few documentation fixes listed to be worked on, but it will be a good time to just chat or ask for help setting up and getting started. Another bigger topic is website translations, but more will certainly be added and input is welcome! If you plan on coming, feel free to write yourself into the hackmd so that we know when you are going to be around.

Hope you drop in, if just to say hi!

Cheers,

Sebastian

From sabertooth2022 at gmail.com Sat Jul 11 03:20:38 2020
From: sabertooth2022 at gmail.com (Saber Tooth)
Date: Sat, 11 Jul 2020 12:50:38 +0530
Subject: [Numpy-discussion] Collection of Questions for Explanations
In-Reply-To:
References:
Message-ID:

Thanks Melissa,

I have opened the issue, but I am not able to add labels to it as I currently don't have 'write' access in the org! I think we can share this issue in the NumPy sprint to gather some data from there. What do you think?

On Fri, 10 Jul 2020 at 20:49, Melissa Mendonça wrote:

> Sounds like a good idea, Mrinal! Make sure to tag it with DOC and mention
> that it's a tracking issue in the description.
>
> Cheers,
>
> Melissa
>
> On Fri, 10 Jul 2020 02:59, Saber Tooth
> wrote:
>
>> Hi Mentors ,
>>
>> In the last Docs meeting , we discussed about Explanations and ways we
>> could use to gather relevant topics for which we could provide explanations
>> . The light was shed upon mining of questions firstly from numpy
>> communities via Discussion like groups , then move to other sources .
>> >> We can mine such topics from the NumPy community via mining email threads >> of numpy-discussion at python.org for Questions . Apart from doing this I >> am thinking of opening an issue , the track of which will be kept by me and >> other contributors if they wish . We can write relevant topics for >> explanations over there . The link to this issue can be shared with >> numpy-discussion at python.org and other NumPy community forums , where >> contributors can comment their questions or queries which they wish to get >> documented . >> >> This way we can avoid multiple Github issues by using one Github issue , >> moreover this issue will populate queries from the community which will be >> visible to everyone and may give rise to ideas regarding explanations which >> we haven't thought about , in fact we can close other such issues later by >> moving their queries to this issue . >> >> In this manner we can have enough data within a month or two to document >> an Explanations page . Then we could frame explanations from this pool of >> data in this single Github issue , >> >> Do suggest your ideas/advice for the following and also if we should be >> doing something else . >> >> Thanks , >> Mrinal Tyagi >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sabertooth2022 at gmail.com Sat Jul 11 05:09:33 2020 From: sabertooth2022 at gmail.com (Saber Tooth) Date: Sat, 11 Jul 2020 14:39:33 +0530 Subject: [Numpy-discussion] Collection of Questions for Explanations In-Reply-To: References: Message-ID: Here is the issue https://github.com/numpy/numpy/issues/16801 that i have opened , I request the required access rights to add labels to it . Moreover I have added this topic in the NumPy sprint https://hackmd.io/r4pcf4CQRMOUY2VXOWk5_Q so that contributors can use it to submit their queries/doubts . 
I also request you to share this GitHub issue with the other NumPy discussion forums that I may not be aware of, so that everyone knows about it and everyone can do their bit in helping us mine Explanations. Thanks, Mrinal Tyagi On Sat, 11 Jul 2020 at 12:50, Saber Tooth wrote: > Thanks Melissa, > I have opened the issue, but am not able to add labels to it as currently > I don't have 'write' access in the org! > I think we can share this issue in the NumPy sprint to gather some data > from there. > What do you think? > > On Fri, 10 Jul 2020 at 20:49, Melissa Mendonça > wrote: > >> Sounds like a good idea, Mrinal! Make sure to tag it with DOC and mention >> that it's a tracking issue in the description. >> >> Cheers, >> >> Melissa >> >> On Fri, 10 Jul 2020 at 02:59, Saber Tooth >> wrote: >> >>> Hi Mentors, >>> >>> In the last Docs meeting, we discussed Explanations and ways we >>> could use to gather relevant topics for which we could provide explanations. >>> The idea raised was to mine questions first from NumPy >>> communities via discussion-like groups, then move to other sources. >>> >>> We can mine such topics from the NumPy community via mining email >>> threads of numpy-discussion at python.org for questions. Apart from >>> doing this I am thinking of opening an issue, which will be tracked >>> by me and other contributors if they wish. We can write relevant >>> topics for explanations over there. The link to this issue can be shared >>> with numpy-discussion at python.org and other NumPy community forums, >>> where contributors can comment with the questions or queries which they wish >>> to get documented.
>>> >>> This way we can avoid multiple GitHub issues by using one GitHub issue; >>> moreover, this issue will collect queries from the community which will be >>> visible to everyone and may give rise to ideas regarding explanations which >>> we haven't thought about. In fact, we can close other such issues later by >>> moving their queries to this issue. >>> >>> In this manner we can have enough data within a month or two to document >>> an Explanations page. Then we could frame explanations from this pool of >>> data in this single GitHub issue. >>> >>> Do suggest your ideas/advice for the following and also if we should be >>> doing something else. >>> >>> Thanks, >>> Mrinal Tyagi >>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From ram at rachum.com Sun Jul 12 09:00:45 2020 From: ram at rachum.com (Ram Rachum) Date: Sun, 12 Jul 2020 16:00:45 +0300 Subject: [Numpy-discussion] An alternative to vectorize that lets you access the array? In-Reply-To: References: Message-ID: Hi everyone, Here's a problem I've been dealing with. I wonder whether NumPy has a tool that will help me, or whether this could be a useful feature request. In the upcoming EuroPython 2020, I'll do a talk about live-coding a music synthesizer. It's going to be a fun talk; I'll use the sounddevice module to make a program that plays music. Do attend, or watch it on YouTube when it's out :) There's a part in my talk that I could make simpler, and thus shave 3-4 minutes of cumbersome explanations. These 3-4 minutes matter a great deal to me. But for that I need to do something with NumPy and I don't know whether it's possible or not. The sounddevice library takes an ndarray of sound data and plays it.
Currently I use `vectorize` to produce that array: output_array = np.vectorize(f, otypes='d')(input_array) And I'd like to replace it with this code, which is supposed to give the same output: output_array = np.ndarray(input_array.shape, dtype='d') for i, item in enumerate(input_array): output_array[i] = f(item) The reason I want the second version is that I can then have sounddevice start playing `output_array` in a separate thread, while it's being calculated. (Yes, I know about the GIL, I believe that sounddevice releases it.) Unfortunately, the for loop is very slow, even when I'm not processing the data on separate thread. I benchmarked it on both CPython and PyPy3, which is my target platform. On CPython it's 3 times slower than vectorize, and on PyPy3 it's 67 times slower than vectorize! That's despite the fact that the Numpy documentation says "The `vectorize` function is provided primarily for convenience, not for performance. The implementation is essentially a `for` loop." So here are a few questions: 1. Is there something like `vectorize`, except you get to access the output array before it's finished? If not, what do you think about adding that as an option to `vectorize`? 2. Is there a more efficient way of writing the `for` loop I've written above? Or any other kind of solution to my problem? Thanks for your help, Ram Rachum. -------------- next part -------------- An HTML attachment was scrubbed... URL: From deak.andris at gmail.com Sun Jul 12 09:42:57 2020 From: deak.andris at gmail.com (Andras Deak) Date: Sun, 12 Jul 2020 15:42:57 +0200 Subject: [Numpy-discussion] An alternative to vectorize that lets you access the array? In-Reply-To: References: Message-ID: On Sun, Jul 12, 2020 at 3:02 PM Ram Rachum wrote: > > Hi everyone, > > Here's a problem I've been dealing with. I wonder whether NumPy has a tool that will help me, or whether this could be a useful feature request. 
> > In the upcoming EuroPython 20200, I'll do a talk about live-coding a music synthesizer. It's going to be a fun talk, I'll use the sounddevice module to make a program that plays music. Do attend, or watch it on YouTube when it's out :) > > There's a part in my talk that I could make simpler, and thus shave 3-4 minutes of cumbersome explanations. These 3-4 minutes matter a great deal to me. But for that I need to do something with NumPy and I don't know whether it's possible or not. > > > The sounddevice library takes an ndarray of sound data and plays it. Currently I use `vectorize` to produce that array: > > output_array = np.vectorize(f, otypes='d')(input_array) > > And I'd like to replace it with this code, which is supposed to give the same output: > > output_array = np.ndarray(input_array.shape, dtype='d') > for i, item in enumerate(input_array): > output_array[i] = f(item) > > The reason I want the second version is that I can then have sounddevice start playing `output_array` in a separate thread, while it's being calculated. (Yes, I know about the GIL, I believe that sounddevice releases it.) > > Unfortunately, the for loop is very slow, even when I'm not processing the data on separate thread. I benchmarked it on both CPython and PyPy3, which is my target platform. On CPython it's 3 times slower than vectorize, and on PyPy3 it's 67 times slower than vectorize! That's despite the fact that the Numpy documentation says "The `vectorize` function is provided primarily for convenience, not for performance. The implementation is essentially a `for` loop." > > So here are a few questions: > > 1. Is there something like `vectorize`, except you get to access the output array before it's finished? If not, what do you think about adding that as an option to `vectorize`? > > 2. Is there a more efficient way of writing the `for` loop I've written above? Or any other kind of solution to my problem? Hi Ram, I find your description of the behaviour really surprising! 
My experience with np.vectorize has been consistent with the documentation's note. Can you please provide some more context? 1. What shape is your array? 2. How exactly did you compute the runtimes? 3. How large are the runtimes we are talking about? Are you sure you're not measuring some kind of overhead? 4. What kind of work does f do? This is mostly relevant for your question about alternatives for your loop. Unfortunately I don't believe it's possible, or that it would even _be_ possible, to give access to half-done results of computations. As far as I know even asynchronous libraries make you wait until some result is done. So unless you chop up your array along the first dimension and explicitly work with each slice independently, I'm pretty sure this is not possible. Just imagine the wealth of possible race conditions if you could have access to half-initialized arrays. The only actionable suggestion I have for your loop is to replace the `np.ndarray` call with one to `np.empty`. My impression has always been that arrays should be instantiated with one of the helper functions rather than directly from the type. Personally, I don't use vectorize at all because I tend to find that it only misleads the reader. Regards, András > > Thanks for your help, > Ram Rachum. > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion From sebastian at sipsolutions.net Sun Jul 12 10:01:32 2020 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Sun, 12 Jul 2020 09:01:32 -0500 Subject: [Numpy-discussion] An alternative to vectorize that lets you access the array? In-Reply-To: References: Message-ID: <80a74935c7065bb738a46e4eeeed8e5e038e36b5.camel@sipsolutions.net> On Sun, 2020-07-12 at 16:00 +0300, Ram Rachum wrote: > Hi everyone, > > Here's a problem I've been dealing with.
I wonder whether NumPy has a > tool > that will help me, or whether this could be a useful feature request. > > In the upcoming EuroPython 2020, I'll do a talk about live-coding a > music > synthesizer. It's going to be a fun talk, I'll use the sounddevice > module to make > a > program that plays music. Do attend, or watch it on YouTube when it's > out :) > Sounds like a fun talk :). > There's a part in my talk that I could make simpler, and thus shave > 3-4 > minutes of cumbersome explanations. These 3-4 minutes matter a great > deal > to me. But for that I need to do something with NumPy and I don't > know > whether it's possible or not. > > > The sounddevice library takes an ndarray of sound data and plays it. > Currently I use `vectorize` to produce that array: > > output_array = np.vectorize(f, otypes='d')(input_array) > > And I'd like to replace it with this code, which is supposed to give > the > same output: > > output_array = np.ndarray(input_array.shape, dtype='d') Maybe use `np.empty(input_array.shape, dtype="d")` instead. `np.ndarray` works but is pretty low-level, and I would usually avoid it for array creation. > for i, item in enumerate(input_array): > output_array[i] = f(item) > OK, one hack that you can try is to replace `item` with `item.item()`; that will convert the NumPy scalar to a Python scalar, which is quite a lot more lightweight and faster. Also it might give PyPy more chance to optimize `f` I suppose. > The reason I want the second version is that I can then have > sounddevice > start playing `output_array` in a separate thread, while it's being > calculated. (Yes, I know about the GIL, I believe that sounddevice > releases > it.) `np.vectorize` will definitely not release the GIL; this loop may release it in between iterations (I am not sure), but it also adds quite a bit of overhead compared to `vectorize`. The best thing of course would be if you can rewrite `f` to accept an array?
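For readers following the thread, the two suggestions above (allocate with `np.empty`, and pass a Python scalar via `.item()`) can be sketched as below. This is only an illustration: the body of `f` is a hypothetical stand-in, since the thread never shows the real synthesizer function, and no timing claims are made here.

```python
import numpy as np


def f(x):
    # Hypothetical stand-in for the per-sample function in the thread;
    # the real `f` is not shown there.
    return 0.5 * x * x


input_array = np.linspace(0.0, 1.0, 1000)

# Approach currently used in the thread.
vectorized = np.vectorize(f, otypes='d')(input_array)

# Loop variant: np.empty instead of np.ndarray for allocation, and
# item.item() so that `f` receives a plain Python float rather than
# a NumPy scalar.
output_array = np.empty(input_array.shape, dtype='d')
for i, item in enumerate(input_array):
    output_array[i] = f(item.item())

# If `f` can be expressed with array operations, no loop is needed.
# This particular stand-in happens to accept an array directly.
array_result = f(input_array)
```

All three results should agree; which variant is fastest depends on `f` and on the interpreter, so it is worth measuring on the target platform.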
> Unfortunately, the for loop is very slow, even when I'm not > processing the > data on separate thread. I benchmarked it on both CPython and PyPy3, > which > is my target platform. On CPython it's 3 times slower than vectorize, > and > on PyPy3 it's 67 times slower than vectorize! That's despite the fact > that > the Numpy documentation says "The `vectorize` function is provided > primarily for convenience, not for performance. The implementation is > essentially a `for` loop." PyPy is nice because it makes NumPy just work. Unfortunately, that also adds some overheads, so at least some slowdown is probably expected. I am not sure about why it is so much. I would not be surprised if a list comprehension is not much faster, especially on PyPy (assuming you cannot modify `f` to work with arrays). > So here are a few questions: > > 1. Is there something like `vectorize`, except you get to access the > output > array before it's finished? If not, what do you think about adding > that as > an option to `vectorize`? vectorize should allow an `out=` argument to pass in the output array, would that help you? So you can access it, but I am not sure how that will help you. Although you could create a big result array and then access chunks of it: final_arr = np.empty(...) newly_written = slice(0, 1000) run_calculation(final_arr[newly_written]) where newly_written is defined by the input chunk you got, I suppose. > > 2. Is there a more efficient way of writing the `for` loop I've > written > above? Or any other kind of solution to my As said, the main thing would be to modify `f` in whatever way possible. For that it would be useful to know what `f` does exactly. Maybe you can move `f` to Cython or numba, or maybe write in a way that works on arrays... > > Thanks for your help, > Ram Rachum. 
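The chunked idea above (one big result array, filled slice by slice) might look like the following sketch. The two-argument `run_calculation`, the chunk size, and the use of `np.sin` as the workload are all assumptions made for illustration; the thread only gives the one-line outline.

```python
import numpy as np


def run_calculation(out_chunk, in_chunk):
    # Hypothetical worker: writes its results in place into the slice
    # view it is given. np.sin is just a placeholder workload.
    out_chunk[:] = np.sin(in_chunk)


input_array = np.linspace(0.0, 2.0 * np.pi, 5000)
final_arr = np.empty(input_array.shape, dtype='d')

chunk_size = 1000  # assumed value, tune for the application
finished = 0       # number of leading samples that are safe to read

for start in range(0, input_array.size, chunk_size):
    newly_written = slice(start, start + chunk_size)
    # Basic slicing returns a view, so writes land in final_arr itself.
    run_calculation(final_arr[newly_written], input_array[newly_written])
    finished = min(start + chunk_size, input_array.size)
    # A consumer could now safely read final_arr[:finished].
```

Only the prefix `final_arr[:finished]` holds valid data at any point; the rest of the array is uninitialized memory until its chunk has been processed.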
> _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: This is a digitally signed message part URL: From ram at rachum.com Sun Jul 12 10:19:30 2020 From: ram at rachum.com (Ram Rachum) Date: Sun, 12 Jul 2020 17:19:30 +0300 Subject: [Numpy-discussion] An alternative to vectorize that lets you access the array? In-Reply-To: <80a74935c7065bb738a46e4eeeed8e5e038e36b5.camel@sipsolutions.net> References: <80a74935c7065bb738a46e4eeeed8e5e038e36b5.camel@sipsolutions.net> Message-ID: Thank you Sebastian and Andras for your detailed replies. Sebastian, your suggestion of adding `item.item()` solved my problem! Now the for loop is still slower than vectorize, but by a smaller factor, and that's fast enough for my demonstration. My problem is solved and I'm very happy! I also tried your `out=` suggestion for vectorize, but I think you made a mistake, as it doesn't seem that it takes that argument. If I missed something and it does (maybe it's a very new feature?) that would be even better for me than the `.item()` solution. On Sun, Jul 12, 2020 at 5:03 PM Sebastian Berg wrote: > On Sun, 2020-07-12 at 16:00 +0300, Ram Rachum wrote: > > Hi everyone, > > > > Here's a problem I've been dealing with. I wonder whether NumPy has a > > tool > > that will help me, or whether this could be a useful feature request. > > > > In the upcoming EuroPython 20200, I'll do a talk about live-coding a > > music > > synthesizer. It's going to be a fun talk, I'll use the sounddevice > > module to make > > a > > program that plays music. Do attend, or watch it on YouTube when it's > > out :) > > > > Sounds like a fun talk :). 
> > > There's a part in my talk that I could make simpler, and thus shave > > 3-4 > > minutes of cumbersome explanations. These 3-4 minutes matter a great > > deal > > to me. But for that I need to do something with NumPy and I don't > > know > > whether it's possible or not. > > > > > > The sounddevice library takes an ndarray of sound data and plays it. > > Currently I use `vectorize` to produce that array: > > > > output_array = np.vectorize(f, otypes='d')(input_array) > > > > And I'd like to replace it with this code, which is supposed to give > > the > > same output: > > > > output_array = np.ndarray(input_array.shape, dtype='d') > > Maybe use `np.empty(inpyt_array.shape, dtype="d")` instead. > `np.ndarray` works but is pretty low-level, and I would usually avoid > it for array creation. > > > for i, item in enumerate(input_array): > > output_array[i] = f(item) > > > > Ok, one hack that you can try, is to replace `item` with `item.item()`, > that will convert the NumPy scalar to a Python scalar, which is quite a > lot more lightweight and faster. Also it might give PyPy more chance > to optimize `f` I suppose. > > > > The reason I want the second version is that I can then have > > sounddevice > > start playing `output_array` in a separate thread, while it's being > > calculated. (Yes, I know about the GIL, I believe that sounddevice > > releases > > it.) > > `np.vectorize` will definitely not release the GIL, this loop may in > between (I am not sure), but also adds quite a bit of overheads > compared to `vectorize`. The best thing of course would be if you can > rewrite `f` to accept an array? > > > > Unfortunately, the for loop is very slow, even when I'm not > > processing the > > data on separate thread. I benchmarked it on both CPython and PyPy3, > > which > > is my target platform. On CPython it's 3 times slower than vectorize, > > and > > on PyPy3 it's 67 times slower than vectorize! 
That's despite the fact > > that > > the Numpy documentation says "The `vectorize` function is provided > > primarily for convenience, not for performance. The implementation is > > essentially a `for` loop." > > PyPy is nice because it makes NumPy just work. Unfortunately, that also > adds some overheads, so at least some slowdown is probably expected. I > am not sure about why it is so much. > I would not be surprised if a list comprehension is not much faster, > especially on PyPy (assuming you cannot modify `f` to work with > arrays). > > > So here are a few questions: > > > > 1. Is there something like `vectorize`, except you get to access the > > output > > array before it's finished? If not, what do you think about adding > > that as > > an option to `vectorize`? > > vectorize should allow an `out=` argument to pass in the output array, > would that help you? So you can access it, but I am not sure how that > will help you. Although you could create a big result array and then > access chunks of it: > > final_arr = np.empty(...) > newly_written = slice(0, 1000) > run_calculation(final_arr[newly_written]) > > where newly_written is defined by the input chunk you got, I suppose. > > > > > > 2. Is there a more efficient way of writing the `for` loop I've > > written > > above? Or any other kind of solution to my > > As said, the main thing would be to modify `f` in whatever way > possible. For that it would be useful to know what `f` does exactly. > Maybe you can move `f` to Cython or numba, or maybe write in a way that > works on arrays... > > > > > Thanks for your help, > > Ram Rachum. 
> > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Sun Jul 12 15:46:51 2020 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Sun, 12 Jul 2020 14:46:51 -0500 Subject: [Numpy-discussion] Numpy FFT normalization options issue (addition of new option) In-Reply-To: References: <1591279296277-0.post@n7.nabble.com> <1591388154629-0.post@n7.nabble.com> <1591670629042-0.post@n7.nabble.com> <1592938084292-0.post@n7.nabble.com> <1593233597029-0.post@n7.nabble.com> , Message-ID: Just a heads-up: I think the discussion has settled on "backward" (default, identical to None), "forward", and the existing "ortho". Thus "forward" and "backward" are now new valid values for the `norm` keyword argument to the fft functions in NumPy. (see https://github.com/numpy/numpy/pull/16476) - Sebastian On Mon, 2020-06-29 at 01:59 +0000, Peter Bell wrote: > > > Honestly, I don't find "forward" very informative. There isn't > > > any real convention on whether FFT or IFFT have any > > > normalization. > > > To the best of my experience, either forward or inverse could be > > > normalized by 1/N, or each normalized by 1/sqrt(N), or neither > > > could be normalized. I will say my expertise is in signal > > > processing and communications. > > > > > > Perhaps > > > norm = {full, half, none} would be clearest to me. > > If I understand your point correctly and the discussion so far, the > > intention here is to use the keyword to denote the convention for > > an > > FFT-IFFT pair rather than just normalization in a single > > transformation (either FFT or IFFT).
> > The idea being that calling ifft on the output of fft while using > > the > > same `norm` would be more or less identity. This would work for > > "half", but not for, say, "full". We need to come up with a name > > that > > specifies where normalization happens with regards to the > > forward-inverse pair. > > For what it's worth, I'm not sure that norm referring to a pair of > transforms was ever a conscious decision. The numpy issue that first > proposed the norm argument was gh-2142 which references > scipy.fftpack's discrete cosine transforms. However, fftpack's dct > never applied a 1/N normalization factor in either direction. So, > norm=None really did mean "no normalization". It was then carried > over to NumPy with None instead meaning "default normalization". > > Unfortunately, this means norm=None could easily be mistaken for "no > normalization", and would make accepting norm="none" terribly > confusing. To break this confusion, I think the documentation should > refer to norm={"backward", "ortho", "forward"} where "backward" is a > synonym for norm=None. > > As an aside, the history with the dct makes it clear the choice was > "ortho" and not "unitary" because the dct is a real transform. > > -Peter > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: This is a digitally signed message part URL: From ram at rachum.com Mon Jul 13 08:45:00 2020 From: ram at rachum.com (Ram Rachum) Date: Mon, 13 Jul 2020 15:45:00 +0300 Subject: [Numpy-discussion] An alternative to vectorize that lets you access the array? In-Reply-To: References: <80a74935c7065bb738a46e4eeeed8e5e038e36b5.camel@sipsolutions.net> Message-ID: Thank you Sebastian and Andras for your detailed replies. 
Sebastian, your suggestion of adding `item.item()` solved my problem! Now the for loop is still slower than vectorize, but by a smaller factor, and that's fast enough for my demonstration. My problem is solved and I'm very happy! I also tried your `out=` suggestion for vectorize, but I think you made a mistake, as it doesn't seem that it takes that argument. If I missed something and it does (maybe it's a very new feature?) that would be even better for me than the `.item()` solution. > On Sun, Jul 12, 2020 at 5:03 PM Sebastian Berg > wrote: > >> On Sun, 2020-07-12 at 16:00 +0300, Ram Rachum wrote: >> > Hi everyone, >> > >> > Here's a problem I've been dealing with. I wonder whether NumPy has a >> > tool >> > that will help me, or whether this could be a useful feature request. >> > >> > In the upcoming EuroPython 20200, I'll do a talk about live-coding a >> > music >> > synthesizer. It's going to be a fun talk, I'll use the sounddevice >> > module to make >> > a >> > program that plays music. Do attend, or watch it on YouTube when it's >> > out :) >> > >> >> Sounds like a fun talk :). >> >> > There's a part in my talk that I could make simpler, and thus shave >> > 3-4 >> > minutes of cumbersome explanations. These 3-4 minutes matter a great >> > deal >> > to me. But for that I need to do something with NumPy and I don't >> > know >> > whether it's possible or not. >> > >> > >> > The sounddevice library takes an ndarray of sound data and plays it. >> > Currently I use `vectorize` to produce that array: >> > >> > output_array = np.vectorize(f, otypes='d')(input_array) >> > >> > And I'd like to replace it with this code, which is supposed to give >> > the >> > same output: >> > >> > output_array = np.ndarray(input_array.shape, dtype='d') >> >> Maybe use `np.empty(inpyt_array.shape, dtype="d")` instead. >> `np.ndarray` works but is pretty low-level, and I would usually avoid >> it for array creation. 
>> >> > for i, item in enumerate(input_array): >> > output_array[i] = f(item) >> > >> >> Ok, one hack that you can try, is to replace `item` with `item.item()`, >> that will convert the NumPy scalar to a Python scalar, which is quite a >> lot more lightweight and faster. Also it might give PyPy more chance >> to optimize `f` I suppose. >> >> >> > The reason I want the second version is that I can then have >> > sounddevice >> > start playing `output_array` in a separate thread, while it's being >> > calculated. (Yes, I know about the GIL, I believe that sounddevice >> > releases >> > it.) >> >> `np.vectorize` will definitely not release the GIL, this loop may in >> between (I am not sure), but also adds quite a bit of overheads >> compared to `vectorize`. The best thing of course would be if you can >> rewrite `f` to accept an array? >> >> >> > Unfortunately, the for loop is very slow, even when I'm not >> > processing the >> > data on separate thread. I benchmarked it on both CPython and PyPy3, >> > which >> > is my target platform. On CPython it's 3 times slower than vectorize, >> > and >> > on PyPy3 it's 67 times slower than vectorize! That's despite the fact >> > that >> > the Numpy documentation says "The `vectorize` function is provided >> > primarily for convenience, not for performance. The implementation is >> > essentially a `for` loop." >> >> PyPy is nice because it makes NumPy just work. Unfortunately, that also >> adds some overheads, so at least some slowdown is probably expected. I >> am not sure about why it is so much. >> I would not be surprised if a list comprehension is not much faster, >> especially on PyPy (assuming you cannot modify `f` to work with >> arrays). >> >> > So here are a few questions: >> > >> > 1. Is there something like `vectorize`, except you get to access the >> > output >> > array before it's finished? If not, what do you think about adding >> > that as >> > an option to `vectorize`? 
>> >> vectorize should allow an `out=` argument to pass in the output array, >> would that help you? So you can access it, but I am not sure how that >> will help you. Although you could create a big result array and then >> access chunks of it: >> >> final_arr = np.empty(...) >> newly_written = slice(0, 1000) >> run_calculation(final_arr[newly_written]) >> >> where newly_written is defined by the input chunk you got, I suppose. >> >> >> > >> > 2. Is there a more efficient way of writing the `for` loop I've >> > written >> > above? Or any other kind of solution to my >> >> As said, the main thing would be to modify `f` in whatever way >> possible. For that it would be useful to know what `f` does exactly. >> Maybe you can move `f` to Cython or numba, or maybe write in a way that >> works on arrays... >> >> > >> > Thanks for your help, >> > Ram Rachum. >> > _______________________________________________ >> > NumPy-Discussion mailing list >> > NumPy-Discussion at python.org >> > https://mail.python.org/mailman/listinfo/numpy-discussion >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Mon Jul 13 10:52:40 2020 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Mon, 13 Jul 2020 09:52:40 -0500 Subject: [Numpy-discussion] An alternative to vectorize that lets you access the array? In-Reply-To: References: <80a74935c7065bb738a46e4eeeed8e5e038e36b5.camel@sipsolutions.net> Message-ID: On Mon, 2020-07-13 at 15:45 +0300, Ram Rachum wrote: > Thank you Sebastian and Andras for your detailed replies. > > Sebastian, your suggestion of adding `item.item()` solved my problem! > Now > the for loop is still slower than vectorize, but by a smaller factor, > and > that's fast enough for my demonstration. 
My problem is solved and I'm > very > happy! > > I also tried your `out=` suggestion for vectorize, but I think you > made a > mistake, as it doesn't seem that it takes that argument. If I missed > something and it does (maybe it's a very new feature?) that would be > even > better for me than the `.item()` solution. > You are right, I thought vectorize may be a proper ufunc internally in this branch (like frompyfunc), but `frompyfunc` currently does not support dtypes other than object (which could be a nice improvement to make vectorize more replaceable). - Sebastian > > > On Sun, Jul 12, 2020 at 5:03 PM Sebastian Berg < > > sebastian at sipsolutions.net> > > wrote: > > > > > On Sun, 2020-07-12 at 16:00 +0300, Ram Rachum wrote: > > > > Hi everyone, > > > > > > > > Here's a problem I've been dealing with. I wonder whether NumPy > > > > has a > > > > tool > > > > that will help me, or whether this could be a useful feature > > > > request. > > > > > > > > In the upcoming EuroPython 20200, I'll do a talk about live- > > > > coding a > > > > music > > > > synthesizer. It's going to be a fun talk, I'll use the > > > > sounddevice > > > > module to > > > > make > > > > a > > > > program that plays music. Do attend, or watch it on YouTube > > > > when it's > > > > out :) > > > > > > > > > > Sounds like a fun talk :). > > > > > > > There's a part in my talk that I could make simpler, and thus > > > > shave > > > > 3-4 > > > > minutes of cumbersome explanations. These 3-4 minutes matter a > > > > great > > > > deal > > > > to me. But for that I need to do something with NumPy and I > > > > don't > > > > know > > > > whether it's possible or not. > > > > > > > > > > > > The sounddevice library takes an ndarray of sound data and > > > > plays it. 
> > > > Currently I use `vectorize` to produce that array: > > > > > > > > output_array = np.vectorize(f, otypes='d')(input_array) > > > > > > > > And I'd like to replace it with this code, which is supposed to > > > > give > > > > the > > > > same output: > > > > > > > > output_array = np.ndarray(input_array.shape, dtype='d') > > > > > > Maybe use `np.empty(inpyt_array.shape, dtype="d")` instead. > > > `np.ndarray` works but is pretty low-level, and I would usually > > > avoid > > > it for array creation. > > > > > > > for i, item in enumerate(input_array): > > > > output_array[i] = f(item) > > > > > > > > > > Ok, one hack that you can try, is to replace `item` with > > > `item.item()`, > > > that will convert the NumPy scalar to a Python scalar, which is > > > quite a > > > lot more lightweight and faster. Also it might give PyPy more > > > chance > > > to optimize `f` I suppose. > > > > > > > > > > The reason I want the second version is that I can then have > > > > sounddevice > > > > start playing `output_array` in a separate thread, while it's > > > > being > > > > calculated. (Yes, I know about the GIL, I believe that > > > > sounddevice > > > > releases > > > > it.) > > > > > > `np.vectorize` will definitely not release the GIL, this loop may > > > in > > > between (I am not sure), but also adds quite a bit of overheads > > > compared to `vectorize`. The best thing of course would be if > > > you can > > > rewrite `f` to accept an array? > > > > > > > > > > Unfortunately, the for loop is very slow, even when I'm not > > > > processing the > > > > data on separate thread. I benchmarked it on both CPython and > > > > PyPy3, > > > > which > > > > is my target platform. On CPython it's 3 times slower than > > > > vectorize, > > > > and > > > > on PyPy3 it's 67 times slower than vectorize! 
That's despite > > > > the fact > > > > that > > > > the Numpy documentation says "The `vectorize` function is > > > > provided > > > > primarily for convenience, not for performance. The > > > > implementation is > > > > essentially a `for` loop." > > > > > > PyPy is nice because it makes NumPy just work. Unfortunately, > > > that also > > > adds some overheads, so at least some slowdown is probably > > > expected. I > > > am not sure about why it is so much. > > > I would not be surprised if a list comprehension is not much > > > faster, > > > especially on PyPy (assuming you cannot modify `f` to work with > > > arrays). > > > > > > > So here are a few questions: > > > > > > > > 1. Is there something like `vectorize`, except you get to > > > > access the > > > > output > > > > array before it's finished? If not, what do you think about > > > > adding > > > > that as > > > > an option to `vectorize`? > > > > > > vectorize should allow an `out=` argument to pass in the output > > > array, > > > would that help you? So you can access it, but I am not sure how > > > that > > > will help you. Although you could create a big result array and > > > then > > > access chunks of it: > > > > > > final_arr = np.empty(...) > > > newly_written = slice(0, 1000) > > > run_calculation(final_arr[newly_written]) > > > > > > where newly_written is defined by the input chunk you got, I > > > suppose. > > > > > > > > > > 2. Is there a more efficient way of writing the `for` loop I've > > > > written > > > > above? Or any other kind of solution to my > > > > > > As said, the main thing would be to modify `f` in whatever way > > > possible. For that it would be useful to know what `f` does > > > exactly. > > > Maybe you can move `f` to Cython or numba, or maybe write in a > > > way that > > > works on arrays... > > > > > > > Thanks for your help, > > > > Ram Rachum. 
> > > > _______________________________________________ > > > > NumPy-Discussion mailing list > > > > NumPy-Discussion at python.org > > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > > _______________________________________________ > > > NumPy-Discussion mailing list > > > NumPy-Discussion at python.org > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: This is a digitally signed message part URL: From sebastian at sipsolutions.net Wed Jul 15 12:15:12 2020 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Wed, 15 Jul 2020 11:15:12 -0500 Subject: [Numpy-discussion] NumPy Development Meeting Today - Triage Focus Message-ID: <0953c5547f72e944714da292719e2c5d7ec6025f.camel@sipsolutions.net> Hi all, Our bi-weekly triage-focused NumPy development meeting is today (Wednesday, July 15th) at 11 am Pacific Time (18:00 UTC). Everyone is invited to join in and edit the work-in-progress meeting topics and notes: https://hackmd.io/68i_JvOYQfy9ERiHgXMPvg I encourage everyone to notify us of issues or PRs that you feel should be prioritized or simply discussed briefly. Just comment on it so we can label it, or add your PR/issue to this week's topics for discussion. Best regards Sebastian -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: This is a digitally signed message part URL: From daniele at grinta.net Wed Jul 15 13:45:53 2020 From: daniele at grinta.net (Daniele Nicolodi) Date: Wed, 15 Jul 2020 11:45:53 -0600 Subject: [Numpy-discussion] An alternative to vectorize that lets you access the array?
In-Reply-To: References: Message-ID: <5751b010-1aca-3eaf-d7f3-9bcba51928f5@grinta.net> On 12/07/2020 07:00, Ram Rachum wrote: > The reason I want the second version is that I can then have sounddevice > start playing `output_array` in a separate thread, while it's being > calculated. (Yes, I know about the GIL, I believe that sounddevice > releases it.) I don't think this is a sound design. I don't know sounddevice, but in similar situations the standard pattern is to allocate a buffer (in this case it can be a numpy array) and pass that to the consumer (sounddevice in your case). The consumer then tells the producer (your music synth) when it has to produce more data. At a quick read, it seems that the sounddevice.Stream class lets you apply this pattern: https://python-sounddevice.readthedocs.io/en/0.3.15/usage.html#callback-streams This also easily allows your producer function to operate on arrays and not on single elements. Using numpy functions to operate on arrays is going to be more efficient than iterating on the elements in Python. Cheers, Dan From sabertooth2022 at gmail.com Thu Jul 16 04:14:17 2020 From: sabertooth2022 at gmail.com (Saber Tooth) Date: Thu, 16 Jul 2020 13:44:17 +0530 Subject: [Numpy-discussion] Discussion around How-to ,Tutorials. Message-ID: Hello Melissa, Are we having our Docs meeting on 20th July? If yes, could we have some discussion around the structure of the tutorials and how-to sections? I wanted to discuss how we are planning to distinguish between the two: should the how-tos be straightforward solutions like here or something else? Should tutorials be structured as in here or something else? The general expectation is that a "How-to" presents just the solution to the problem, like above, while a tutorial presents a detailed solution with explanations for most of the steps. Thanks, Mrinal -------------- next part -------------- An HTML attachment was scrubbed...
URL: From aminthefresh at gmail.com Thu Jul 16 14:27:03 2020 From: aminthefresh at gmail.com (Amin Sadeghi) Date: Thu, 16 Jul 2020 14:27:03 -0400 Subject: [Numpy-discussion] Augment unique method Message-ID: It would be handy to add "atol" and "rtol" optional arguments to the "unique" method. I'm proposing this since uniqueness is a bit vague for floats. This change would be clearly backwards-compatible. -------------- next part -------------- An HTML attachment was scrubbed... URL: From rth.yurchak at gmail.com Thu Jul 16 14:41:35 2020 From: rth.yurchak at gmail.com (Roman Yurchak) Date: Thu, 16 Jul 2020 20:41:35 +0200 Subject: [Numpy-discussion] Augment unique method In-Reply-To: References: Message-ID: <626440af-3ec8-bac7-fcfd-7e4e8237a499@gmail.com> One issue with adding a tolerance to np.unique for floats is the following: say you have [0, 0.1, 0.2, 0.3, 0.4, 0.5] with atol=0.15 Should this return a single element or multiple ones? On one side, each consecutive float is closer than the tolerance to the next one, but the first one and the last one are clearly not within atol. Generally this is similar to what the DBSCAN clustering algorithm does (e.g. in scikit-learn) and that would probably be out of scope for np.unique. Roman On 16/07/2020 20:27, Amin Sadeghi wrote: > It would be handy to add "atol" and "rtol" optional arguments to the > "unique" method. I'm proposing this since uniqueness is a bit vague for > floats. This change would be clearly backwards-compatible.
> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > From shoyer at gmail.com Thu Jul 16 15:06:21 2020 From: shoyer at gmail.com (Stephan Hoyer) Date: Thu, 16 Jul 2020 12:06:21 -0700 Subject: [Numpy-discussion] Augment unique method In-Reply-To: <626440af-3ec8-bac7-fcfd-7e4e8237a499@gmail.com> References: <626440af-3ec8-bac7-fcfd-7e4e8237a499@gmail.com> Message-ID: On Thu, Jul 16, 2020 at 11:41 AM Roman Yurchak wrote: > One issue with adding a tolerance to np.unique for floats is say you have > [0, 0.1, 0.2, 0.3, 0.4, 0.5] with atol=0.15 > > Should this return a single element or multiple ones? One once side each > consecutive float is closer than the tolerance to the next one but the > first one and the last one are clearly not within atol. > > Generally this is similar to what DBSCAN clustering algorithm does (e.g. > in scikit-learn) and that would probably be out of scope for np.unique. > I agree, I don't think there's an easy answer for selecting "approximately unique" floats in the case of overlap. np.unique() does actually have well-defined behavior for floats: it compares them for exact equality. This isn't always directly useful, but it definitely is well defined. My suggestion for this use-case would be to round floats to the desired precision before passing them into np.unique(). > Roman > > On 16/07/2020 20:27, Amin Sadeghi wrote: > > It would be handy to add "atol" and "rtol" optional arguments to the > > "unique" method. I'm proposing this since uniqueness is a bit vague for > > floats. This change would be clearly backwards-compatible.
> > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From melissawm at gmail.com Thu Jul 16 15:16:29 2020 From: melissawm at gmail.com (=?UTF-8?Q?Melissa_Mendon=C3=A7a?=) Date: Thu, 16 Jul 2020 16:16:29 -0300 Subject: [Numpy-discussion] Discussion around How-to ,Tutorials. In-Reply-To: References: Message-ID: Hello, Mrinal Yes, we are having our meeting on monday (more information in a message still to come). Right now, our idea of what tutorials and how-tos are comes from Daniele Procida [1] (detailed in NEP 44 [2]). However, I believe the discussion around exactly what we will cover or try to achieve in each kind of document is not finished. In fact, we are still learning how to organize this. Feel free to add a topic in the meeting notes for the Documentation Meeting [3] so we can discuss this on monday. Cheers, Melissa [1] https://documentation.divio.com/ [2] https://numpy.org/neps/nep-0044-restructuring-numpy-docs.html [3] https://hackmd.io/oB_boakvRqKR-_2jRV-Qjg On Thu, Jul 16, 2020 at 5:14 AM Saber Tooth wrote: > Hello Melissa , > > Are we having our Docs meeting on 20th July ? > If yes , then could we have some discussion around the structure of > tutorials ,how-to sections . > I wanted to discuss how we are planning to distinguish between the two , > should the How-to be straightforward solutions like here > or > something else . > Tutorials should be structured as in here > or > something else . 
> > The general belief is when we hear the word "How to" is that we are > greeted to just the solution of the problem like above , while when we hear > Tutorials we are greeted to Detailed solutions with explanations for most > of the steps . > > Thanks , > Mrinal > -------------- next part -------------- An HTML attachment was scrubbed... URL: From aminthefresh at gmail.com Thu Jul 16 16:04:38 2020 From: aminthefresh at gmail.com (aminthefresh at gmail.com) Date: Thu, 16 Jul 2020 16:04:38 -0400 Subject: [Numpy-discussion] Augment unique method In-Reply-To: References: <626440af-3ec8-bac7-fcfd-7e4e8237a499@gmail.com> Message-ID: <001b01d65bac$5676ff30$0364fd90$@gmail.com> I see your point. How about passing the number of significant figures instead of atol? In fact, that's what I originally intended, but I thought that it could be expressed via atol and rtol, whereas the number of significant figures doesn't seem to suffer from the ambiguity you pointed out. From: NumPy-Discussion On Behalf Of Stephan Hoyer Sent: Thursday, July 16, 2020 3:06 PM To: Discussion of Numerical Python Subject: Re: [Numpy-discussion] Augment unique method On Thu, Jul 16, 2020 at 11:41 AM Roman Yurchak wrote: One issue with adding a tolerance to np.unique for floats is say you have [0, 0.1, 0.2, 0.3, 0.4, 0.5] with atol=0.15 Should this return a single element or multiple ones? One once side each consecutive float is closer than the tolerance to the next one but the first one and the last one are clearly not within atol. Generally this is similar to what DBSCAN clustering algorithm does (e.g. in scikit-learn) and that would probably be out of scope for np.unique. I agree, I don't think there's an easy answer for selecting "approximately unique" floats in the case of overlap. np.unique() does actually have well defined behavior for float, comparing floats for exact equality. This isn't always directly useful, but it definitely is well defined.
My suggestion for this use-case would be round floats to the desired precision before passing them into np.unique(). Roman On 16/07/2020 20:27, Amin Sadeghi wrote: > It would be handy to add "atol" and "rtol" optional arguments to the > "unique" method. I'm proposing this since uniqueness is a bit vague for > floats. This change would be clearly backwards-compatible. > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at python.org https://mail.python.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From shoyer at gmail.com Thu Jul 16 16:14:05 2020 From: shoyer at gmail.com (Stephan Hoyer) Date: Thu, 16 Jul 2020 13:14:05 -0700 Subject: [Numpy-discussion] Augment unique method In-Reply-To: <001b01d65bac$5676ff30$0364fd90$@gmail.com> References: <626440af-3ec8-bac7-fcfd-7e4e8237a499@gmail.com> <001b01d65bac$5676ff30$0364fd90$@gmail.com> Message-ID: On Thu, Jul 16, 2020 at 1:04 PM wrote: > I see your point. How about passing number of significant figures instead > of atol. > > > > In fact, that?s what I originally intended but I thought that it could be > expressed via atol and rtol, whereas number of significant figures doesn?t > seem to suffer from the ambiguity you pointed out. > This can already be expressed clearly* with a separate function call, e.g., np.unique(np.round(x, 3)) In general, it's a better software design practice to have separate composable functions rather than adding more features into a single function. So I don't think this would be an improvement for np.unique(). * Note: this is rounding to fixed precision rather than a fixed number of significant figures. 
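To make the distinction in the note above concrete, here is a minimal sketch comparing rounding to a fixed number of decimals with rounding to a fixed number of significant figures before calling np.unique(). The helper `round_to_sig_figs` is hypothetical (it is not a NumPy function), and the sample values are made up for illustration.

```python
import numpy as np


def round_to_sig_figs(x, sig):
    """Hypothetical helper, not part of NumPy: round to `sig` significant figures."""
    x = np.asarray(x, dtype=float)
    out = np.zeros_like(x)
    nonzero = x != 0
    # Decimal exponent of each nonzero value, e.g. 1234.5 -> 3, 0.00123 -> -3.
    exponent = np.floor(np.log10(np.abs(x[nonzero])))
    # Scale so that `sig` digits sit left of the decimal point, round, scale back.
    scale = 10.0 ** (sig - 1 - exponent)
    out[nonzero] = np.round(x[nonzero] * scale) / scale
    return out


# Made-up sample data with near-duplicates at different magnitudes.
x = np.array([1.0001, 1.0002, 2.5, 2.5004, 1234.4, 1234.9])

# Fixed precision: three decimals, regardless of magnitude.
fixed = np.unique(np.round(x, 3))

# Fixed number of significant figures: three digits, relative to magnitude.
sig = np.unique(round_to_sig_figs(x, 3))

print(fixed)
print(sig)
```

With three decimals the two values near 1234 stay distinct, while with three significant figures they collapse into one. Either way, the grouping is decided by the rounding step and np.unique() itself still compares for exact equality.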
I can see a case why adding a helper function for rounding to a number of significant digits would be useful, but this should be a separate change from np.unique(). You can certainly do this currently in NumPy but it's a bit of work: https://stackoverflow.com/questions/18915378/rounding-to-significant-figures-in-numpy > > > *From:* NumPy-Discussion gmail.com at python.org> *On Behalf Of *Stephan Hoyer > *Sent:* Thursday, July 16, 2020 3:06 PM > *To:* Discussion of Numerical Python > *Subject:* Re: [Numpy-discussion] Augment unique method > > > > On Thu, Jul 16, 2020 at 11:41 AM Roman Yurchak > wrote: > > One issue with adding a tolerance to np.unique for floats is say you have > [0, 0.1, 0.2, 0.3, 0.4, 0.5] with atol=0.15 > > Should this return a single element or multiple ones? One once side each > consecutive float is closer than the tolerance to the next one but the > first one and the last one are clearly not within atol. > > Generally this is similar to what DBSCAN clustering algorithm does (e.g. > in scikit-learn) and that would probably be out of scope for np.unique. > > > > I agree, I don't think there's an easy answer for selecting "approximately > unique" floats in the case of overlap. > > > > np.unique() does actually have well defined behavior for float, comparing > floats for exact equality. This isn't always directly useful, but it > definitely is well defined. > > > > My suggestion for this use-case would be round floats to the desired > precision before passing them into np.unique(). > > > > > > Roman > > On 16/07/2020 20:27, Amin Sadeghi wrote: > > It would be handy to add "atol" and "rtol" optional arguments to the > > "unique" method. I'm proposing this since uniqueness is a bit vague for > > floats. This change would be clearly backwards-compatible. 
> > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From melissawm at gmail.com Fri Jul 17 13:10:36 2020 From: melissawm at gmail.com (=?UTF-8?Q?Melissa_Mendon=C3=A7a?=) Date: Fri, 17 Jul 2020 14:10:36 -0300 Subject: [Numpy-discussion] Documentation Team meeting - Monday July 20 In-Reply-To: References: Message-ID: Hello everyone! This is a reminder that our next Documentation Team meeting will be on *Monday, July 20* at 3PM UTC**. If you wish to join on Zoom, you need to use this link https://zoom.us/j/420005230 Here's the permanent hackmd document with the meeting notes: https://hackmd.io/oB_boakvRqKR-_2jRV-Qjg Hope to see you around! ** You can click this link to get the correct time at your timezone: https://www.timeanddate.com/worldclock/fixedtime.html?msg=NumPy+Documentation+Team+Meeting&iso=20200720T15&p1=1440&ah=1 *** You can add the NumPy community calendar to your google calendar by clicking this link: https://calendar.google.com/calendar/r?cid=YmVya2VsZXkuZWR1X2lla2dwaWdtMjMyamJobGRzZmIyYzJqODFjQGdyb3VwLmNhbGVuZGFyLmdvb2dsZS5jb20 - Melissa -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From sebastian at sipsolutions.net Tue Jul 21 13:18:42 2020 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Tue, 21 Jul 2020 12:18:42 -0500 Subject: [Numpy-discussion] NumPy Community Meeting Wednesday Message-ID: <98e596664cf77fded3f138c6f402458e084f4fb5.camel@sipsolutions.net> Hi all, There will be a NumPy Community meeting Wednesday July 22nd at 1pm Pacific Time (20:00 UTC). Everyone is invited and encouraged to join in and edit the work-in-progress meeting topics and notes at: https://hackmd.io/76o-IxCjQX2mOXO_wwkcpg?both Best wishes Sebastian -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: This is a digitally signed message part URL: From charlesr.harris at gmail.com Tue Jul 21 17:53:32 2020 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 21 Jul 2020 15:53:32 -0600 Subject: [Numpy-discussion] NumPy 1.19.1 released. Message-ID: Hi All, On behalf of the NumPy team I am pleased to announce that NumPy 1.19.1 has been released. This release supports Python 3.6-3.8 and may be built with the latest Python 3.9 beta. It fixes several bugs found in the 1.19.0 release, replaces several functions deprecated in the upcoming Python 3.9 release, has improved support for AIX, and has a number of development-related updates to keep CI working with recent upstream changes. Downstream developers should use Cython >= 0.29.21 when building for Python 3.9 and Cython >= 0.29.16 when building for Python 3.8. OpenBLAS >= 0.3.7 is needed to avoid wrong results on the Skylake architecture. The NumPy wheels for this release can be downloaded from PyPI; source archives, release notes, and wheel hashes are available from GitHub. Linux users will need pip >= 19.3 in order to install manylinux2010 and manylinux2014 wheels. *Contributors* A total of 15 people contributed to this release.
People with a "+" by their names contributed a patch for the first time. - Abhinav Reddy + - Anirudh Subramanian - Antonio Larrosa + - Charles Harris - Chunlin Fang - Eric Wieser - Etienne Guesnet + - Kevin Sheppard - Matti Picus - Raghuveer Devulapalli - Roman Yurchak - Ross Barnowski - Sayed Adel - Sebastian Berg - Tyler Reddy *Pull requests merged* A total of 25 pull requests were merged for this release. - #16649: MAINT, CI: disable Shippable cache - #16652: MAINT: Replace PyUString_GET_SIZE with PyUnicode_GetLength. - #16654: REL: Fix outdated docs link - #16656: BUG: raise IEEE exception on AIX - #16672: BUG: Fix bug in AVX complex absolute while processing array of... - #16693: TST: Add extra debugging information to CPU features detection - #16703: BLD: Add CPU entry for Emscripten / WebAssembly - #16705: TST: Disable Python 3.9-dev testing. - #16714: MAINT: Disable use_hugepages in case of ValueError - #16724: BUG: Fix PyArray_SearchSorted signature. - #16768: MAINT: Fixes for deprecated functions in scalartypes.c.src - #16772: MAINT: Remove unneeded call to PyUnicode_READY - #16776: MAINT: Fix deprecated functions in scalarapi.c - #16779: BLD, ENH: Add RPATH support for AIX - #16780: BUG: Fix default fallback in genfromtxt - #16784: BUG: Added missing return after raising error in methods.c - #16795: BLD: update cython to 0.29.21 - #16832: MAINT: setuptools 49.2.0 emits a warning, avoid it - #16872: BUG: Validate output size in bin- and multinomial - #16875: BLD, MAINT: Pin setuptools - #16904: DOC: Reconstruct Testing Guideline. - #16905: TST, BUG: Re-raise MemoryError exception in test_large_zip's... - #16906: BUG, DOC: Fix bad MPL kwarg. - #16916: BUG: Fix string/bytes to complex assignment - #16922: REL: Prepare for NumPy 1.19.1 release Cheers, Charles Harris -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From asmeurer at gmail.com Wed Jul 22 18:23:13 2020 From: asmeurer at gmail.com (Aaron Meurer) Date: Wed, 22 Jul 2020 16:23:13 -0600 Subject: [Numpy-discussion] Why does fancy indexing work like this? Message-ID: Why does fancy indexing have this behavior? >>> a = np.empty((0, 1, 2)) >>> b = np.empty((1, 1, 2)) >>> a[np.array([10, 10])] Traceback (most recent call last): File "", line 1, in IndexError: index 10 is out of bounds for axis 0 with size 0 >>> a[:, np.array([10, 10])] array([], shape=(0, 2, 2), dtype=float64) >>> a[:, :, np.array([10, 10])] array([], shape=(0, 1, 2), dtype=float64) >>> b[np.array([10, 10])] Traceback (most recent call last): File "", line 1, in IndexError: index 10 is out of bounds for axis 0 with size 1 >>> b[:, np.array([10, 10])] Traceback (most recent call last): File "", line 1, in IndexError: index 10 is out of bounds for axis 1 with size 1 >>> b[:, :, np.array([10, 10])] Traceback (most recent call last): File "", line 1, in IndexError: index 10 is out of bounds for axis 2 with size 2 As far as I can tell, the behavior is that if an array has a 0 dimension and an integer array index indexes an axis that isn't 0, there are no bounds checks. Why does it do this? It seems to be inconsistent with the behavior of shape () fancy indices (integer indices). I couldn't find any reference to this behavior in https://numpy.org/doc/stable/reference/arrays.indexing.html. Aaron Meurer From sebastian at sipsolutions.net Wed Jul 22 18:31:58 2020 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Wed, 22 Jul 2020 17:31:58 -0500 Subject: [Numpy-discussion] Why does fancy indexing work like this? In-Reply-To: References: Message-ID: <66ed40caf93a7f24672ce511370ce38176a6eff1.camel@sipsolutions.net> On Wed, 2020-07-22 at 16:23 -0600, Aaron Meurer wrote: > Why does fancy indexing have this behavior? 
> > > > > a = np.empty((0, 1, 2)) > > > > b = np.empty((1, 1, 2)) > > > > a[np.array([10, 10])] > Traceback (most recent call last): > File "", line 1, in > IndexError: index 10 is out of bounds for axis 0 with size 0 > > > > a[:, np.array([10, 10])] > array([], shape=(0, 2, 2), dtype=float64) > > > > a[:, :, np.array([10, 10])] > array([], shape=(0, 1, 2), dtype=float64) > > > > b[np.array([10, 10])] > Traceback (most recent call last): > File "", line 1, in > IndexError: index 10 is out of bounds for axis 0 with size 1 > > > > b[:, np.array([10, 10])] > Traceback (most recent call last): > File "", line 1, in > IndexError: index 10 is out of bounds for axis 1 with size 1 > > > > b[:, :, np.array([10, 10])] > Traceback (most recent call last): > File "", line 1, in > IndexError: index 10 is out of bounds for axis 2 with size 2 > > As far as I can tell, the behavior is that if an array has a 0 > dimension and an integer array index indexes an axis that isn't 0, > there are no bounds checks. Why does it do this? It seems to be > inconsistent with the behavior of shape () fancy indices (integer > indices). I couldn't find any reference to this behavior in > https://numpy.org/doc/stable/reference/arrays.indexing.html. > The reason is because we used to not do this when there are *two* advanced indices: arr = np.ones((5, 6)) arr[[], [10, 10]] giving an empty result. If you check on master (and maybe on 1.19.x, I am not sure). You should see that all of your examples give a deprecation warning to be turned into an error (except the example I gave above, which can be argued to be correct). - Sebastian > Aaron Meurer > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: This is a digitally signed message part URL: From asmeurer at gmail.com Wed Jul 22 18:55:14 2020 From: asmeurer at gmail.com (Aaron Meurer) Date: Wed, 22 Jul 2020 16:55:14 -0600 Subject: [Numpy-discussion] Why does fancy indexing work like this? In-Reply-To: <66ed40caf93a7f24672ce511370ce38176a6eff1.camel@sipsolutions.net> References: <66ed40caf93a7f24672ce511370ce38176a6eff1.camel@sipsolutions.net> Message-ID: Ah, so I guess I caught this issue right as it got fixed. There are no warnings in 1.19.0, but I can confirm I get the warnings in numpy master. 1.19.1 isn't on conda yet, but I tried building it and didn't get the warning there. So I guess I need to wait for 1.19.2. How long do deprecation cycles like this tend to last (I'm also curious when the warnings for things like a[[[0, 1], [0, 1]]] will go away)? Aaron Meurer On Wed, Jul 22, 2020 at 4:32 PM Sebastian Berg wrote: > > On Wed, 2020-07-22 at 16:23 -0600, Aaron Meurer wrote: > > Why does fancy indexing have this behavior?
> > > > > > > a = np.empty((0, 1, 2)) > > > > > b = np.empty((1, 1, 2)) > > > > > a[np.array([10, 10])] > > Traceback (most recent call last): > > File "", line 1, in > > IndexError: index 10 is out of bounds for axis 0 with size 0 > > > > > a[:, np.array([10, 10])] > > array([], shape=(0, 2, 2), dtype=float64) > > > > > a[:, :, np.array([10, 10])] > > array([], shape=(0, 1, 2), dtype=float64) > > > > > b[np.array([10, 10])] > > Traceback (most recent call last): > > File "", line 1, in > > IndexError: index 10 is out of bounds for axis 0 with size 1 > > > > > b[:, np.array([10, 10])] > > Traceback (most recent call last): > > File "", line 1, in > > IndexError: index 10 is out of bounds for axis 1 with size 1 > > > > > b[:, :, np.array([10, 10])] > > Traceback (most recent call last): > > File "", line 1, in > > IndexError: index 10 is out of bounds for axis 2 with size 2 > > > > As far as I can tell, the behavior is that if an array has a 0 > > dimension and an integer array index indexes an axis that isn't 0, > > there are no bounds checks. Why does it do this? It seems to be > > inconsistent with the behavior of shape () fancy indices (integer > > indices). I couldn't find any reference to this behavior in > > https://numpy.org/doc/stable/reference/arrays.indexing.html. > > > > The reason is because we used to not do this when there are *two* > advanced indices: > > arr = np.ones((5, 6)) > arr[[], [10, 10]] > > giving an empty result. If you check on master (and maybe on 1.19.x, I > am not sure). You should see that all of your examples give a > deprecation warning to be turned into an error (except the example I > gave above, which can be argued to be correct). 
> > - Sebastian > > > > Aaron Meurer > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion From asmeurer at gmail.com Wed Jul 22 19:08:04 2020 From: asmeurer at gmail.com (Aaron Meurer) Date: Wed, 22 Jul 2020 17:08:04 -0600 Subject: [Numpy-discussion] Why does fancy indexing work like this? In-Reply-To: References: <66ed40caf93a7f24672ce511370ce38176a6eff1.camel@sipsolutions.net> Message-ID: On Wed, Jul 22, 2020 at 4:55 PM Aaron Meurer wrote: > > Ah, so I guess I caught this issue right as it got fixed. There are no > warnings in 1.19.0, but I can confirm I get the warnings in numpy > master. 1.19.1 isn't on conda yet, but I tried building it and didn't > get the warning there. So I guess I need to wait for 0.19.2. Or rather 1.20 I guess https://github.com/numpy/numpy/pull/15900. By the way, it would be useful if deprecation warnings like this had a functionality to enable the actual post-deprecation behavior. Right now the warning says to run warnings.simplefilter('error'), but this causes the above indexing to raise DeprecationWarning, not IndexError. Aaron Meurer From sebastian at sipsolutions.net Wed Jul 22 19:13:57 2020 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Wed, 22 Jul 2020 18:13:57 -0500 Subject: [Numpy-discussion] Why does fancy indexing work like this? In-Reply-To: References: <66ed40caf93a7f24672ce511370ce38176a6eff1.camel@sipsolutions.net> Message-ID: On Wed, 2020-07-22 at 16:55 -0600, Aaron Meurer wrote: > Ah, so I guess I caught this issue right as it got fixed. There are > no Yes, on a general note. 
Advanced indexing grew over time in a maze of paths, and things like empty arrays were long not too well supported in many parts of NumPy. That this went through > warnings in 1.19.0, but I can confirm I get the warnings in numpy > master. 1.19.1 isn't on conda yet, but I tried building it and didn't > get the warning there. So I guess I need to wait for 0.19.2. We don't add warnings in minor releases, so 1.19.2 will definitely never get it. I did not remember whether it was in there, because it was merged around the same time 1.19.x was branched. About your warnings, do you have a nice way to do that? The mechanism for warnings does not really give a good way to catch that a warning was raised and then turn it into an error. Unless someone contributes a slick way to do it, I am not sure the complexity pays off. IIRC, I added the note about raising the warning, because in this particular case the deprecation warning (turned into an error) happens to be chained due to implementation details. (so you do see the "original" error printed out). > > How long do deprecation cycles like this tend to last (I'm also > curious when the warnings for things like a[[[0, 1], [0, 1]]] will go > away)? Not sure, this is a corner case, and is bugging pandas a bit, so it may be a bit quicker, but likely still 2 releases? We are not always good about phasing out deprecations immediately when it is plausible. The one you mention strikes me as a bigger one though, so I think we should wait about 2 years. It is plausible that we are there already, even for a while. Cheers, Sebastian > > Aaron Meurer > > On Wed, Jul 22, 2020 at 4:32 PM Sebastian Berg > wrote: > > On Wed, 2020-07-22 at 16:23 -0600, Aaron Meurer wrote: > > > Why does fancy indexing have this behavior? 
> > > > > > > > > a = np.empty((0, 1, 2)) > > > > > > b = np.empty((1, 1, 2)) > > > > > > a[np.array([10, 10])] > > > Traceback (most recent call last): > > > File "", line 1, in > > > IndexError: index 10 is out of bounds for axis 0 with size 0 > > > > > > a[:, np.array([10, 10])] > > > array([], shape=(0, 2, 2), dtype=float64) > > > > > > a[:, :, np.array([10, 10])] > > > array([], shape=(0, 1, 2), dtype=float64) > > > > > > b[np.array([10, 10])] > > > Traceback (most recent call last): > > > File "", line 1, in > > > IndexError: index 10 is out of bounds for axis 0 with size 1 > > > > > > b[:, np.array([10, 10])] > > > Traceback (most recent call last): > > > File "", line 1, in > > > IndexError: index 10 is out of bounds for axis 1 with size 1 > > > > > > b[:, :, np.array([10, 10])] > > > Traceback (most recent call last): > > > File "", line 1, in > > > IndexError: index 10 is out of bounds for axis 2 with size 2 > > > > > > As far as I can tell, the behavior is that if an array has a 0 > > > dimension and an integer array index indexes an axis that isn't > > > 0, > > > there are no bounds checks. Why does it do this? It seems to be > > > inconsistent with the behavior of shape () fancy indices (integer > > > indices). I couldn't find any reference to this behavior in > > > https://numpy.org/doc/stable/reference/arrays.indexing.html. > > > > > > > The reason is because we used to not do this when there are *two* > > advanced indices: > > > > arr = np.ones((5, 6)) > > arr[[], [10, 10]] > > > > giving an empty result. If you check on master (and maybe on > > 1.19.x, I > > am not sure). You should see that all of your examples give a > > deprecation warning to be turned into an error (except the example > > I > > gave above, which can be argued to be correct). 
> >
> > - Sebastian
> >
> > > Aaron Meurer

From asmeurer at gmail.com Wed Jul 22 19:35:04 2020
From: asmeurer at gmail.com (Aaron Meurer)
Date: Wed, 22 Jul 2020 17:35:04 -0600
Subject: [Numpy-discussion] Why does fancy indexing work like this?
In-Reply-To:
References: <66ed40caf93a7f24672ce511370ce38176a6eff1.camel@sipsolutions.net>
Message-ID:

> About your warnings, do you have a nice way to do that? The mechanism
> for warnings does not really give a good way to catch that a warning
> was raised and then turn it into an error. Unless someone contributes
> a slick way to do it, I am not sure the complexity pays off.

I don't really know how flags and options and such work in NumPy, but
I would imagine something like

    # Either a single flag for all deprecations or a per-deprecation flag
    if flags['post-deprecation']:
        raise IndexError(...)
    else:
        warnings.warn(...)

I don't know if the fact that the code that does this is in C
complicates things.

In other words, something that works kind of like __future__ flags for
upgrading the behavior to post-deprecation.
>
> IIRC, I added the note about raising the warning, because in this
> particular case the deprecation warning (turned into an error) happens
> to be chained due to implementation details. (so you do see the
> "original" error printed out).

Yes, it's nice that you can see it. But for my use case, I want to be
able to "except IndexError". Basically, for ndindex, I test against
NumPy to make sure the semantics are identical, and that includes
making sure identical exceptions are raised. I also want to make it so
that the ndindex semantics always follow post-deprecation behavior for
any NumPy deprecations, since that leads to a cleaner API. But that
means that my test code has to do fancy shenanigans to catch these
deprecation warnings and treat them like the right errors.

But even as a general principle, I think for any deprecation warning,
users should be able to update their code in such a way that the
current version doesn't give the warning and also it will continue to
work and be idiomatic for future versions. For simple deprecations
where you remove a function x(), this is often as simple as telling
people to replace x() with y(). But these deprecations aren't so
simple, because the indexing itself is valid and will stay valid, it's
just the behavior that will change. If there's no way to do this, then
a deprecation warning serves little purpose because users who see the
warning won't be able to do anything about it until things actually
change. There would be little difference from just changing things
outright. For the list as tuple indexing thing, you can already kind
of do this by making sure your fancy indices are always arrays. For
the out of bounds one, it's a little harder. I guess for most
use-cases, you aren't actually checking for IndexErrors, and the thing
that will become an error usually indicates a bug in user code, so
maybe it isn't a huge deal (I admit my use-cases aren't typical).
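
As a small illustration of the list-as-tuple point above (a sketch, not
taken from the thread): converting a nested-list index with np.asarray,
or spelling out tuple(...), states the intended meaning explicitly, so
the result should not depend on how a given NumPy version interprets a
bare nested list. The array `a` and the index `idx` are made up for the
example.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
idx = [[0, 1], [0, 1]]

# One integer array index applied to axis 0: picks rows 0 and 1, twice.
as_array = a[np.asarray(idx)]

# Two index arrays, one per axis: picks the elements a[0, 0] and a[1, 1].
as_tuple = a[tuple(idx)]
```

The first form yields an array of shape (2, 2, 4); the second yields the
two diagonal elements.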
Aaron Meurer From sebastian at sipsolutions.net Thu Jul 23 11:18:28 2020 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Thu, 23 Jul 2020 10:18:28 -0500 Subject: [Numpy-discussion] Why does fancy indexing work like this? In-Reply-To: References: <66ed40caf93a7f24672ce511370ce38176a6eff1.camel@sipsolutions.net> Message-ID: <3f4ba7b2363e17e1d08be90f19aff38ae9860eca.camel@sipsolutions.net> On Wed, 2020-07-22 at 17:35 -0600, Aaron Meurer wrote: > > About your warnings, do you have a nice way to do that? The > > mechanism > > for warnings does not really give a good way to catch that a > > warning > > was raised and then turn it into an error. Unless someone > > contributes > > a slick way to do it, I am not sure the complexity pays off. > > I don't really know how flags and options and such work in NumPy, but > I would imagine something like > > if flags['post-deprecation'] = True: # Either a single flag for all > deprecations or a per-deprecation flag > raise IndexError(...) > else: > warnings.warn(...) > We have never done global flags for these things much in NumPy, I don't know of precedence in other packages, possibly aside future imports, but I am not even sure they have been used in this way. > I don't know if the fact that the code that does this is in C > complicates things. > > In other words, something that works kind of like __future__ flags > for > upgrading the behavior to post-deprecation. > > > IIRC, I added the note about raising the warning, because in this > > particular case the deprecation warning (turned into an error) > > happens > > to be chained due to implementation details. (so you do see the > > "original" error printed out). > > Yes, it's nice that you can see it. But for my use case, I want to be > able to "except IndexError". Basically, for ndindex, I test against > NumPy to make sure the semantics are identical, and that includes > making sure identical exceptions are raised. 
> I also want to make it so that the ndindex semantics always follow
> post-deprecation behavior for any NumPy deprecations, since that
> leads to a cleaner API. But that means that my test code has to do
> fancy shenanigans to catch these deprecation warnings and treat them
> like the right errors.
>
> But even as a general principle, I think for any deprecation warning,
> users should be able to update their code in such a way that the
> current version doesn't give the warning and also it will continue to

For FutureWarnings, I will always try very hard to give an option to
opt in to the new or the old behaviour, ideally with code that is also
compatible with earlier NumPy versions.

Here, for a DeprecationWarning that obviously has no "alternative", I
cannot think of any precedent in any other package or Python itself
doing such a dance. And it is extremely fringe (you only need it
because you are testing another package against NumPy behaviour!).

So I am happy to merge it if it's proposed (maybe it's easier for you
to add this to NumPy than to work around it in your tests), but I am
honestly concerned that proposing this as a general principle is far
more churn than it is worth. At least unless there is some consensus
(and probably precedent in the scientific Python ecosystem or Python
itself).

Cheers,

Sebastian

> work and be idiomatic for future versions. For simple deprecations
> where you remove a function x(), this is often as simple as telling
> people to replace x() with y(). But these deprecations aren't so
> simple, because the indexing itself is valid and will stay valid,
> it's just the behavior that will change. If there's no way to do
> this, then a deprecation warning serves little purpose because users
> who see the warning won't be able to do anything about it until
> things actually change. There would be little difference from just
> changing things outright.
For the list as tuple indexing thing, you can already kind > of do this by making sure your fancy indices are always arrays. For > the out of bounds one, it's a little harder. I guess for most > use-cases, you aren't actually checking for IndexErrors, and the > thing > that will become an error usually indicates a bug in user code, so > maybe it isn't a huge deal (I admit my use-cases aren't typical). > > Aaron Meurer > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: This is a digitally signed message part URL: From sebastian at sipsolutions.net Thu Jul 23 11:30:01 2020 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Thu, 23 Jul 2020 10:30:01 -0500 Subject: [Numpy-discussion] Why does fancy indexing work like this? In-Reply-To: <3f4ba7b2363e17e1d08be90f19aff38ae9860eca.camel@sipsolutions.net> References: <66ed40caf93a7f24672ce511370ce38176a6eff1.camel@sipsolutions.net> <3f4ba7b2363e17e1d08be90f19aff38ae9860eca.camel@sipsolutions.net> Message-ID: <8773e52959d499e668f06d60d49a068bc6a77d43.camel@sipsolutions.net> On Thu, 2020-07-23 at 10:18 -0500, Sebastian Berg wrote: > On Wed, 2020-07-22 at 17:35 -0600, Aaron Meurer wrote: > > > About your warnings, do you have a nice way to do that? The > > > mechanism > > > for warnings does not really give a good way to catch that a > > > warning > > > was raised and then turn it into an error. Unless someone > > > contributes > > > a slick way to do it, I am not sure the complexity pays off. 
> > > > I don't really know how flags and options and such work in NumPy, > > but > > I would imagine something like > > > > if flags['post-deprecation'] = True: # Either a single flag for all > > deprecations or a per-deprecation flag > > raise IndexError(...) > > else: > > warnings.warn(...) > > > > We have never done global flags for these things much in NumPy, I > don't > know of precedence in other packages, possibly aside future imports, > but I am not even sure they have been used in this way. > > > I don't know if the fact that the code that does this is in C > > complicates things. > > > > In other words, something that works kind of like __future__ flags > > for > > upgrading the behavior to post-deprecation. > > > > > IIRC, I added the note about raising the warning, because in this > > > particular case the deprecation warning (turned into an error) > > > happens > > > to be chained due to implementation details. (so you do see the > > > "original" error printed out). > > > > Yes, it's nice that you can see it. But for my use case, I want to > > be > > able to "except IndexError". Basically, for ndindex, I test against > > NumPy to make sure the semantics are identical, and that includes > > making sure identical exceptions are raised. I also want to make it > > so > > that the ndindex semantics always follow post-deprecation behavior > > for > > any NumPy deprecations, since that leads to a cleaner API. But that > > means that my test code has to do fancy shenanigans to catch these > > deprecation warnings and treat them like the right errors. > > > > But even as a general principle, I think for any deprecation > > warning, > > users should be able to update their code in such a way that the > > current version doesn't give the warning and also it will continue > > to > > For FutureWarnings, I will always try very hard to give an option to > opt-in to new behaviour or old behaviour ? ideally with code > compatible > also with earlier NumPy versions. 
> > Here, for a DeprecationWarning that has obviously no "alternative", I > cannot think of any precedence in any other package or Python itself > doing such a dance. And it is extremely fringe (you only need it > because you are testing another package against numpy behaviour!). > > So I am happy to merge it if its proposed (maybe its easier for you > to > add this to NumPy then work around it in your tests), but I am > honestly > concerned that proposing this as a general principle is far more > churn > then worth the trouble. At least unless there is some consensus (and > probably precendence in the scientific python ecosystem or python > itself). > After writing this, I realized that I actually remember the *opposite* discussion occurring before. I think in some of the equality deprecations, we actually raise the new error due to an internal try/except clause. And there was a complaint that its confusing that a non-deprecation-warning is raised when the error will only happen with DeprecationWarnings being set to error. - Sebastian > Cheers, > > Sebastian > > > > work and be idiomatic for future versions. For simple deprecations > > where you remove a function x(), this is often as simple as telling > > people to replace x() with y(). But these deprecations aren't so > > simple, because the indexing itself is valid and will stay valid, > > it's > > just the behavior that will change. If there's no way to do this, > > then > > a deprecation warning serves little purpose because users who see > > the > > warning won't be able to do anything about it until things actually > > change. There would be little difference from just changing things > > outright. For the list as tuple indexing thing, you can already > > kind > > of do this by making sure your fancy indices are always arrays. For > > the out of bounds one, it's a little harder. 
> > I guess for most use-cases, you aren't actually checking for
> > IndexErrors, and the thing that will become an error usually
> > indicates a bug in user code, so maybe it isn't a huge deal (I admit
> > my use-cases aren't typical).
> >
> > Aaron Meurer

From asmeurer at gmail.com Thu Jul 23 14:18:28 2020
From: asmeurer at gmail.com (Aaron Meurer)
Date: Thu, 23 Jul 2020 12:18:28 -0600
Subject: [Numpy-discussion] Why does fancy indexing work like this?
In-Reply-To: <8773e52959d499e668f06d60d49a068bc6a77d43.camel@sipsolutions.net>
References: <66ed40caf93a7f24672ce511370ce38176a6eff1.camel@sipsolutions.net> <3f4ba7b2363e17e1d08be90f19aff38ae9860eca.camel@sipsolutions.net> <8773e52959d499e668f06d60d49a068bc6a77d43.camel@sipsolutions.net>
Message-ID:

> After writing this, I realized that I actually remember the *opposite*
> discussion occurring before. I think in some of the equality
> deprecations, we actually raise the new error due to an internal
> try/except clause. And there was a complaint that its confusing that a
> non-deprecation-warning is raised when the error will only happen with
> DeprecationWarnings being set to error.
>
> - Sebastian

I noticed that warnings.catch_warnings does the right thing with
warnings that are raised alongside an exception (although it is a bit
clunky to use).
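
One way to do the kind of test-side translation discussed in this thread
can be sketched with the standard library alone. This is an
illustration under stated assumptions, not code from NumPy or ndindex:
`deprecated_lookup` is a made-up stand-in for an operation that only
warns today, and `lookup_post_deprecation` is a hypothetical wrapper
that escalates the warning inside warnings.catch_warnings and converts
it into the error it announces.

```python
import warnings


def deprecated_lookup(values, index):
    """Hypothetical stand-in for an operation that warns today and will raise later."""
    if index >= len(values):
        warnings.warn(
            f"index {index} is out of bounds; this will raise IndexError in the future",
            DeprecationWarning,
            stacklevel=2,
        )
        return None
    return values[index]


def lookup_post_deprecation(values, index):
    """Behave as if the deprecation had already expired."""
    with warnings.catch_warnings():
        # Inside this block, a DeprecationWarning is raised as an exception.
        warnings.simplefilter("error", DeprecationWarning)
        try:
            return deprecated_lookup(values, index)
        except DeprecationWarning as exc:
            # Translate the escalated warning into the error it announces.
            raise IndexError(str(exc)) from exc
```

The filter change is undone when the `with` block exits, so code outside
the wrapper still sees an ordinary warning.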
Aaron Meurer From tyler.je.reddy at gmail.com Thu Jul 23 22:21:07 2020 From: tyler.je.reddy at gmail.com (Tyler Reddy) Date: Thu, 23 Jul 2020 20:21:07 -0600 Subject: [Numpy-discussion] ANN: SciPy 1.5.2 Message-ID: Hi all, On behalf of the SciPy development team I'm pleased to announce the release of SciPy 1.5.2, which is a bug fix release. Sources and binary wheels can be found at: https://pypi.org/project/scipy/ and at: https://github.com/scipy/scipy/releases/tag/v1.5.2 One of a few ways to install this release with pip: pip install scipy==1.5.2 ========================== SciPy 1.5.2 Release Notes ========================== SciPy 1.5.2 is a bug-fix release with no new features compared to 1.5.1. Authors ====== * Peter Bell * Tobias Biester + * Evgeni Burovski * Thomas A Caswell * Ralf Gommers * Sturla Molden * Andrew Nelson * ofirr + * Sambit Panda * Ilhan Polat * Tyler Reddy * Atsushi Sakai * Pauli Virtanen A total of 13 people contributed to this release. People with a "+" by their names contributed a patch for the first time. This list of names is automatically generated, and may not be fully complete. Issues closed for 1.5.2 ------------------------------ * `#3847 `__: Crash of interpolate.splprep(task=-1) * `#7395 `__: splprep segfaults if fixed knots are specified * `#10761 `__: scipy.signal.convolve2d produces incorrect values for large arrays * `#11971 `__: DOC: search in devdocs returns wrong link * `#12155 `__: BUG: Fix permutation of distance matrices in scipy.stats.multiscale_graphcorr * `#12203 `__: Unable to install on PyPy 7.3.1 (Python 3.6.9) * `#12316 `__: negative scipy.spatial.distance.correlation * `#12422 `__: BUG: slsqp: ValueError: failed to initialize intent(inout) array... 
* `#12428 `__: stats.truncnorm.rvs() never returns a scalar in 1.5 * `#12441 `__: eigvalsh inconsistent eigvals= subset_by_index= * `#12445 `__: DOC: scipy.linalg.eigh * `#12449 `__: Warnings are not filtered in csr_matrix.sum() * `#12469 `__: SciPy 1.9 exception in LSQSphereBivariateSpline * `#12487 `__: BUG: optimize: incorrect result from approx_fprime * `#12493 `__: CI: GitHub Actions for maintenance branches * `#12533 `__: eigh gives incorrect results * `#12579 `__: BLD, MAINT: distutils issues in wheels repo Pull requests for 1.5.2 ------------------------------- * `#12156 `__: BUG: Fix permutation of distance matrices in scipy.stats.multiscale_graphcorr * `#12238 `__: BUG: Use 64-bit indexing in convolve2d to avoid overflow * `#12256 `__: BLD: Build lsap as a single extension instead of extension +... * `#12320 `__: BUG: spatial: avoid returning negative correlation distance * `#12383 `__: ENH: Make cKDTree.tree more efficient * `#12392 `__: DOC: update scipy-sphinx-theme * `#12430 `__: BUG: truncnorm and geninvgauss never return scalars from rvs * `#12437 `__: BUG: optimize: cast bounds to floats in new_bounds_to_old/old_bounds_to_new * `#12442 `__: MAINT:linalg: Fix for input args of eigvalsh * `#12461 `__: MAINT: sparse: write matrix/asmatrix wrappers without warning... * `#12478 `__: BUG: fix array_like input defects and add tests for all functions... * `#12488 `__: BUG: fix approx_derivative step size. 
Closes #12487 * `#12500 `__: CI: actions branch trigger fix * `#12501 `__: CI: actions branch trigger fix * `#12504 `__: BUG: cKDTreeNode use after free * `#12529 `__: MAINT: allow graceful docs re-upload * `#12538 `__: BUG:linalg: eigh type parameter handling corrected * `#12560 `__: MAINT: truncnorm.rvs compatibility for \`Generator\` * `#12562 `__: redo gh-12188: fix segfaults in splprep with fixed knots * `#12586 `__: BLD: Add -std=c99 to sigtools to compile with C99 * `#12590 `__: CI: Add GCC 4.8 entry to travis build matrix * `#12591 `__: BLD: fix cython error on master-branch cython Checksums ========= MD5 ~~~ 2e046d26cdc4241a6a5b2907d57528df scipy-1.5.2-cp36-cp36m-macosx_10_9_x86_64.whl 902dea66453e2fa0616e9479970986f5 scipy-1.5.2-cp36-cp36m-manylinux1_i686.whl e130db080706d9f4ce22d8493c8e1ce2 scipy-1.5.2-cp36-cp36m-manylinux1_x86_64.whl 721f16bae600731e479a5b4e98ce9a97 scipy-1.5.2-cp36-cp36m-win32.whl a3171cfe38618d51acbfb8d1b39ac612 scipy-1.5.2-cp36-cp36m-win_amd64.whl c9f733d4d2e82c098c08760963dafaf8 scipy-1.5.2-cp37-cp37m-macosx_10_9_x86_64.whl 53ba6c502d09145b38e0e857b2d4a273 scipy-1.5.2-cp37-cp37m-manylinux1_i686.whl b9db33944ac4147936a7f42df8e95ad2 scipy-1.5.2-cp37-cp37m-manylinux1_x86_64.whl be9e8bfdf0e5e0914d1e1605be26d9c0 scipy-1.5.2-cp37-cp37m-win32.whl 848fa7b82a25d0ce36710ccc47ebc2ca scipy-1.5.2-cp37-cp37m-win_amd64.whl 590cd3b70a2dc8664896d6b9e2e5fc6d scipy-1.5.2-cp38-cp38-macosx_10_9_x86_64.whl 7fdbb19c15702b98319ea4ea32df8458 scipy-1.5.2-cp38-cp38-manylinux1_i686.whl 301f3a873e1bfef70d6f594c489fafe8 scipy-1.5.2-cp38-cp38-manylinux1_x86_64.whl 8c08ac0f55810e89e336eb3bf5a7b337 scipy-1.5.2-cp38-cp38-win32.whl 711f5c47c801dc79bead7d40669fd8c9 scipy-1.5.2-cp38-cp38-win_amd64.whl 620fc39f371e04a76af5d0290f8d3753 scipy-1.5.2.tar.gz 5bc188f21054a2ecff74fae40dd298da scipy-1.5.2.tar.xz 17bc80802955d100f6c1335594eda29a scipy-1.5.2.zip SHA256 ~~~~~~ cca9fce15109a36a0a9f9cfc64f870f1c140cb235ddf27fe0328e6afb44dfed0 
scipy-1.5.2-cp36-cp36m-macosx_10_9_x86_64.whl 1c7564a4810c1cd77fcdee7fa726d7d39d4e2695ad252d7c86c3ea9d85b7fb8f scipy-1.5.2-cp36-cp36m-manylinux1_i686.whl 07e52b316b40a4f001667d1ad4eb5f2318738de34597bd91537851365b6c61f1 scipy-1.5.2-cp36-cp36m-manylinux1_x86_64.whl d56b10d8ed72ec1be76bf10508446df60954f08a41c2d40778bc29a3a9ad9bce scipy-1.5.2-cp36-cp36m-win32.whl 8e28e74b97fc8d6aa0454989db3b5d36fc27e69cef39a7ee5eaf8174ca1123cb scipy-1.5.2-cp36-cp36m-win_amd64.whl 6e86c873fe1335d88b7a4bfa09d021f27a9e753758fd75f3f92d714aa4093768 scipy-1.5.2-cp37-cp37m-macosx_10_9_x86_64.whl a0afbb967fd2c98efad5f4c24439a640d39463282040a88e8e928db647d8ac3d scipy-1.5.2-cp37-cp37m-manylinux1_i686.whl eecf40fa87eeda53e8e11d265ff2254729d04000cd40bae648e76ff268885d66 scipy-1.5.2-cp37-cp37m-manylinux1_x86_64.whl 315aa2165aca31375f4e26c230188db192ed901761390be908c9b21d8b07df62 scipy-1.5.2-cp37-cp37m-win32.whl ec5fe57e46828d034775b00cd625c4a7b5c7d2e354c3b258d820c6c72212a6ec scipy-1.5.2-cp37-cp37m-win_amd64.whl fc98f3eac993b9bfdd392e675dfe19850cc8c7246a8fd2b42443e506344be7d9 scipy-1.5.2-cp38-cp38-macosx_10_9_x86_64.whl a785409c0fa51764766840185a34f96a0a93527a0ff0230484d33a8ed085c8f8 scipy-1.5.2-cp38-cp38-manylinux1_i686.whl 0a0e9a4e58a4734c2eba917f834b25b7e3b6dc333901ce7784fd31aefbd37b2f scipy-1.5.2-cp38-cp38-manylinux1_x86_64.whl dac09281a0eacd59974e24525a3bc90fa39b4e95177e638a31b14db60d3fa806 scipy-1.5.2-cp38-cp38-win32.whl 92eb04041d371fea828858e4fff182453c25ae3eaa8782d9b6c32b25857d23bc scipy-1.5.2-cp38-cp38-win_amd64.whl 066c513d90eb3fd7567a9e150828d39111ebd88d3e924cdfc9f8ce19ab6f90c9 scipy-1.5.2.tar.gz 28d5d2e9af6ca5c0352cd83fb64191f2d8e883ab5287a221ba7a175c8cc2ccbe scipy-1.5.2.tar.xz a9054595a370f24d68f7a694037316b69ae80f5837323d567f76cde055189c08 scipy-1.5.2.zip -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From eyal.kutz at gmail.com Sun Jul 26 10:21:29 2020
From: eyal.kutz at gmail.com (Eyal Kutz)
Date: Sun, 26 Jul 2020 17:21:29 +0300
Subject: [Numpy-discussion] NumPy dtype API improvement suggestion
Message-ID:

I am interested in suggesting an API improvement for NumPy.
I wish to make it so that the following code:

    @np.dtype
    class Point:
        x: np.int16
        y: np.int16

would be equivalent to the following code:

    Point = np.dtype([('x', np.int16), ('y', np.int16)])

I am willing to submit the code changes required to make this happen.

From kevin.k.sheppard at gmail.com Sun Jul 26 12:31:58 2020
From: kevin.k.sheppard at gmail.com (Kevin Sheppard)
Date: Sun, 26 Jul 2020 17:31:58 +0100
Subject: [Numpy-discussion] NumPy dtype API improvement suggestion
In-Reply-To:
References:
Message-ID:

Better would be to have an object like NamedTuple in typing that would
allow

    class Point(DType):
        x: np.int16
        y: np.int16

On Sun, Jul 26, 2020 at 3:22 PM Eyal Kutz wrote:

> I am interested in suggesting an API improvement for NumPy.
> I wish to make it so that the following code:
>
>     @np.dtype
>     class Point:
>         x: np.int16
>         y: np.int16
>
> would be equivalent to the following code:
>
>     Point = np.dtype([('x', np.int16), ('y', np.int16)])
>
> I am willing to submit the code changes required to make this happen.

From eyal.kutz at gmail.com Sun Jul 26 12:58:53 2020
From: eyal.kutz at gmail.com (Eyal Kutz)
Date: Sun, 26 Jul 2020 19:58:53 +0300
Subject: [Numpy-discussion] NumPy dtype API improvement suggestion
Message-ID:

Kevin Sheppard, I agree with you, but I don't know how to do this. Do you?
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From rainwoodman at gmail.com Sun Jul 26 20:26:00 2020 From: rainwoodman at gmail.com (Feng Yu) Date: Sun, 26 Jul 2020 17:26:00 -0700 Subject: [Numpy-discussion] NumPy dtype API improvement suggestion In-Reply-To: References: Message-ID: Hi, Would it be possible to also allow a byte offset for the field? e.g., class Point(np.struct): x: np.field('i4', offset=8) y: np.field(' wrote: > Better would be to have an object like NamedTuple in typing that would > allow > > class Point(DType): > x: np.int16 > y: np.int16 > > > > On Sun, Jul 26, 2020 at 3:22 PM Eyal Kutz wrote: > >> I am interested in suggesting an API improvement for NumPy. >> I wish to make it so that the following code: >> @np.dtype >> class Point: >> x: np.int16 >> y: np.int16 >> would be equivalent to the following code: >> Point = np.dtype([('x', np.int16), ('y', np.int16)]) >> >> I am willing to submit the code changes required to make this happen. >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sabertooth2022 at gmail.com Mon Jul 27 02:46:53 2020 From: sabertooth2022 at gmail.com (Saber Tooth) Date: Mon, 27 Jul 2020 12:16:53 +0530 Subject: [Numpy-discussion] Journey with NumPy in Docs Development Message-ID: Hello @Melissa Mendon?a @Ralf Gommers @Matti Picus , My journey working with NumPy Docs team has been nothing short of Progressive , the insights that i have gained attending the Docs meeting has helped me refine my proposal . 
Attending the past three meetings has helped me remove some redundant
features from my proposal, letting me structure the Explanations,
How-Tos, and Tutorials in a better way, keeping in mind the
requirements highlighted by NEP-44 and the mentors.

I have really tried to work on NumPy community bonding. I have been
part of some really good discussions, contributed some ideas, and
raised some queries, which so far has been a good way for me to start
contributing, as today I understand what the NumPy docs are trying to
deliver. I have really been enticed by the workflow of the community,
which encourages me to think and contribute as much as I can.

I have been mining the NumPy-Discussion mailing list since the last
meeting, using the archive mentioned in the HackMD doc by @Ralf
Gommers, collecting topics for which we can frame Tutorials, How-Tos,
and Explanations. The discussion around Explanations with @Melissa
Mendonça and the community has been really fruitful. Moreover, I am
currently working on a how-to, for which I will soon open a PR.

Contributing to NumPy in this manner will, I believe, help me get
started with the official documentation period more efficiently, as I
have already worked on community bonding, which will continue to
strengthen in the coming days.

Thank you for your encouragement and for having me. I am looking
forward to delivering much more and contributing to my GSoD proposal
and beyond.

Thanks,
Mrinal Tyagi
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From sebastian at sipsolutions.net Mon Jul 27 09:46:45 2020 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Mon, 27 Jul 2020 08:46:45 -0500 Subject: [Numpy-discussion] NumPy dtype API improvement suggestion In-Reply-To: References: Message-ID: <18fc3daec6da6a2b4f1c42a8f81024810d9e4936.camel@sipsolutions.net> On Sun, 2020-07-26 at 17:31 +0100, Kevin Sheppard wrote: > Better would be to have an object like NamedTuple in typing that > would allow > > class Point(DType): > x: np.int16 > y: np.int16 > I agree with this type of use case (whatever the syntax is). But I think there are too many small issues around it currently to do that in NumPy. For that the new DTypes need to move along a bit further, so that we do not lock-in nice syntax with features that have lots of quirks. See also: https://numpy.org/neps/nep-0041-improved-dtype-support.html https://numpy.org/neps/nep-0042-new-dtypes.html I think you could do much of such syntax now, but I would suggest to do it outside of NumPy proper (at least the experimentation) at this time for that reason. Hopefully, it won't be too long until we can think of creating such API for good inside NumPy. Cheers, Sebastian > > On Sun, Jul 26, 2020 at 3:22 PM Eyal Kutz > wrote: > > > I am interested in suggesting an API improvement for NumPy. > > I wish to make it so that the following code: > > @np.dtype > > class Point: > > x: np.int16 > > y: np.int16 > > would be equivalent to the following code: > > Point = np.dtype([('x', np.int16), ('y', np.int16)]) > > > > I am willing to submit the code changes required to make this > > happen. 
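
The suggestion to experiment outside NumPy proper could start from
something like the following minimal sketch. `struct_dtype` is a
hypothetical helper name, not an existing or proposed NumPy API; it
assumes the class annotations are real type objects (so no
`from __future__ import annotations`) and it ignores offsets, alignment
and nested structs.

```python
import numpy as np


def struct_dtype(cls):
    """Hypothetical decorator: build a structured dtype from class annotations."""
    # Annotation order is preserved, so field order follows the class body.
    fields = [(name, np.dtype(tp)) for name, tp in cls.__annotations__.items()]
    return np.dtype(fields)


@struct_dtype
class Point:
    x: np.int16
    y: np.int16


points = np.zeros(3, dtype=Point)
```

Here `Point` ends up bound to an ordinary structured dtype, so it
compares equal to the hand-written np.dtype([('x', np.int16), ('y',
np.int16)]) from the original proposal.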

From warren.weckesser at gmail.com Wed Jul 29 13:50:37 2020
From: warren.weckesser at gmail.com (Warren Weckesser)
Date: Wed, 29 Jul 2020 13:50:37 -0400
Subject: [Numpy-discussion] NumPy Development Meeting Today - Triage Focus
Message-ID:

Hi all,

Sorry for the short notice--Sebastian is off this week, and the rest
of us forgot to send the email reminder.

Our bi-weekly triage-focused NumPy development meeting is in 10 minutes
(today, Wednesday, July 29th, at 11 am Pacific Time (18:00 UTC)).
Everyone is invited to join in and edit the work-in-progress meeting
topics and notes:

https://hackmd.io/68i_JvOYQfy9ERiHgXMPvg

I encourage everyone to notify us of issues or PRs that you feel should
be prioritized or simply discussed briefly. Just comment on it so we
can label it, or add your PR/issue to this week's topics for discussion.

Best regards
Warren

From melissawm at gmail.com Fri Jul 31 08:10:27 2020
From: melissawm at gmail.com (Melissa Mendonça)
Date: Fri, 31 Jul 2020 09:10:27 -0300
Subject: [Numpy-discussion] Documentation Team meeting - Monday August 3
In-Reply-To:
References:
Message-ID:

Hi all!

This is a reminder that our next Documentation Team meeting will be on
*Monday, August 3* at 3PM UTC**.
If you wish to join on Zoom, you need to use this link

https://zoom.us/j/420005230

Here's the permanent hackmd document with the meeting notes (still
being updated in the next few days!):

https://hackmd.io/oB_boakvRqKR-_2jRV-Qjg

Hope to see you around!

** You can click this link to get the correct time at your timezone:
https://www.timeanddate.com/worldclock/fixedtime.html?msg=NumPy+Documentation+Team+Meeting&iso=20200803T15&p1=1440&ah=1

*** You can add the NumPy community calendar to your google calendar by
clicking this link:
https://calendar.google.com/calendar/r?cid=YmVya2VsZXkuZWR1X2lla2dwaWdtMjMyamJobGRzZmIyYzJqODFjQGdyb3VwLmNhbGVuZGFyLmdvb2dsZS5jb20

- Melissa

From p.steinbach at hzdr.de Fri Jul 31 08:32:16 2020
From: p.steinbach at hzdr.de (Peter Steinbach)
Date: Fri, 31 Jul 2020 14:32:16 +0200
Subject: [Numpy-discussion] a summary function to get a quick glimpse on the contents of a numpy array
Message-ID:

Dear numpy devs and interested readers,

as a day-to-day user, it occurred to me that having a quick look into
the contents and extents of arrays is well doable with numpy. numpy
offers a rich set of methods for this. However, very often I observe
that I and others just want to see whether the values of an array have
a certain min/max or mean, or how wide the range of values is.

I hence sat down to write a summary function that returns a string of
hand-picked summary statistics for a quick inspection. I propose to
include it in numpy and would love to have your feedback on this idea
before I submit a PR.
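
To make the idea concrete, here is a minimal sketch of such a helper.
It is not the code from the branch linked in this message;
`summary_stats` is a hypothetical name, and returning a dict (rather
than a formatted string) is an assumption made to keep the example
short. The column names follow the ones used in the example output of
this message.

```python
import numpy as np


def summary_stats(a, axis=None):
    """Hypothetical helper: return seven summary statistics as a dict."""
    a = np.asarray(a, dtype=float)
    # One percentile call yields the quartiles and the median together.
    q25, median, q75 = np.percentile(a, [25, 50, 75], axis=axis)
    return {
        "min": a.min(axis=axis),
        "25perc": q25,
        "mean": a.mean(axis=axis),
        "stdev": a.std(axis=axis),
        "median": median,
        "75perc": q75,
        "max": a.max(axis=axis),
    }


flat = summary_stats(np.arange(1, 6))
per_row = summary_stats(np.arange(20).reshape(4, 5), axis=1)
```

With `axis=None` every entry is a scalar; with `axis=1` every entry is
an array holding one value per row, which mirrors the two cases shown
in the examples of this message.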
Here is the core functionality:

Examples
--------
>>> a = np.random.normal(size=20)
>>> print(summary(a))
       min     25perc       mean      stdev     median     75perc        max
 -2.289870  -2.265757  -0.083213   1.115033  -0.162885  -2.217532   1.639802
>>> a = np.reshape(a, newshape=(4,5))
>>> print(summary(a,axis=1))
         min     25perc       mean      stdev     median     75perc        max
0  -0.976279  -0.974090   0.293003   1.009383   0.466814  -0.969712   1.519695
1  -0.468854  -0.467739   0.184139   0.649378  -0.036762  -0.465510   1.303144
2  -2.289870  -2.276455  -0.324450   1.230031  -0.289008  -2.249625   1.111107
3  -1.782239  -1.777304  -0.485546   1.259598  -1.236190  -1.767434   1.639802

So you see, it is merely a tiny helper function that can help practitioners and data scientists get a quick insight into what an array contains.

First off, here is the code:
https://github.com/psteinb/numpy/blob/summary-function/numpy/lib/utils.py#L1021

I put it there as I am not sure at this point whether the community would appreciate such a function or not. Judging from the tests, lib/utils.py appears to be a place for undocumented functions. So to resolve this and prepare a proper PR, please let me know where this summary function could reside!

Second, please give me your thoughts on the summary function's output. Should the number of digits be configurable? Should the columns be configurable? Is it ok to honor the axis parameter which is found in so many numpy functions?

Last but not least, let me stress that this is my first contribution to numpy. I love the library and would like to contribute something back. So bear with me if my code violates best practices in your community for now. I'll sink my teeth into the formalities of a github PR once I get support from the community and the core devs.

I think that a summary function would be a valuable addition to numpy!

Best,
Peter

-------------- next part --------------
A non-text attachment was scrubbed...
Name: smime.p7s
Type: application/pkcs7-signature
Size: 5373 bytes
Desc: S/MIME Cryptographic Signature
URL:

From melissawm at gmail.com Fri Jul 31 11:21:49 2020
From: melissawm at gmail.com (Melissa Mendonça)
Date: Fri, 31 Jul 2020 12:21:49 -0300
Subject: [Numpy-discussion] Documentation Team meeting - Monday August 3
In-Reply-To:
References:
Message-ID:

Thanks, you too!

On Fri, Jul 31, 2020 at 12:21 PM olzhas robo.ai wrote:

> Thank you, Melissa! Have a great weekend :)

-------------- next part --------------
An HTML attachment was scrubbed...
URL: