From stefanv at berkeley.edu Thu Jul 1 03:39:00 2021 From: stefanv at berkeley.edu (Stefan van der Walt) Date: Thu, 01 Jul 2021 00:39:00 -0700 Subject: [Numpy-discussion] =?utf-8?q?=60keepdims=3DTrue=60_for_argmin/ar?= =?utf-8?q?gmx_and_C-API_=60PyArray=5FArgMaxWithKeepdims=60?= In-Reply-To: <1df08e1ed077b7ee380fd6b3374bff5c2fdadd8e.camel@sipsolutions.net> References: <1df08e1ed077b7ee380fd6b3374bff5c2fdadd8e.camel@sipsolutions.net> Message-ID: <0d0e2ffb-9811-4b6b-8aef-3bd3ff78e895@www.fastmail.com> Hi Sebastian, On Wed, Jun 30, 2021, at 18:23, Sebastian Berg wrote: > The PR https://github.com/numpy/numpy/pull/19211 proposes to extend > argmin and argmax with a `keepdims=False` keyword-only argument. This seems consistent with existing APIs, so I'm not concerned. For those wondering, `keepdims` preserves the number of dimensions of the original array in a reduction operation like `sum`: In [1]: X = np.random.random((10, 15)) In [2]: np.sum(X).shape Out[2]: () In [3]: np.sum(X, keepdims=True).shape Out[3]: (1, 1) This is sometimes useful for broadcasting. > The PR also proposes to add: > > * `PyArray_ArgMinWithKeepdims` > * `PyArray_ArgMaxWithKeepdims` I am curious whether this is our general pattern for adding keyword argument functionality to functions in the C-API. It seems a bit excessive! St?fan From sebastian at sipsolutions.net Thu Jul 1 12:31:55 2021 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Thu, 01 Jul 2021 11:31:55 -0500 Subject: [Numpy-discussion] `keepdims=True` for argmin/argmx and C-API `PyArray_ArgMaxWithKeepdims` In-Reply-To: <0d0e2ffb-9811-4b6b-8aef-3bd3ff78e895@www.fastmail.com> References: <1df08e1ed077b7ee380fd6b3374bff5c2fdadd8e.camel@sipsolutions.net> <0d0e2ffb-9811-4b6b-8aef-3bd3ff78e895@www.fastmail.com> Message-ID: On Thu, 2021-07-01 at 00:39 -0700, Stefan van der Walt wrote: > Hi Sebastian, > > On Wed, Jun 30, 2021, at 18:23, Sebastian Berg wrote: > > The PR https://github.com/numpy/numpy/pull/19211?proposes to extend > > argmin and argmax with a `keepdims=False` keyword-only argument. > > This seems consistent with existing APIs, so I'm not concerned. > > For those wondering, `keepdims` preserves the number of dimensions of > the original array in a reduction operation like `sum`: > > In [1]: X = np.random.random((10, 15)) > > In [2]: np.sum(X).shape > Out[2]: () > > In [3]: np.sum(X, keepdims=True).shape > Out[3]: (1, 1) > > This is sometimes useful for broadcasting. > > > The PR? also proposes to add: > > > > * `PyArray_ArgMinWithKeepdims` > > * `PyArray_ArgMaxWithKeepdims` > > I am curious whether this is our general pattern for adding keyword > argument functionality to functions in the C-API.? It seems a bit > excessive! True, I am now tending a bit towards delaying this until someone actually asks for it... In most use-cases just using the Python API is likely only a small overhead anyway if done right. I do not think we have a pattern. We do have some functions with the pattern of `With...And...` to allow signatures of different complexity. But very few of this type of python additions ever made it into the C- API. For `Reshape`, `order=` was added by introducing `NewShape`. I have some hope that very long-term, HPy might solve this for us... 
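To make the "small overhead in Python" point concrete, the proposed behaviour can already be emulated with `np.expand_dims`, and the kept axis is exactly what `np.take_along_axis` expects (a quick sketch using only released NumPy functions; the `keepdims=True` call itself is what the PR proposes, not existing API):

import numpy as np

x = np.random.random((10, 15))

# what the proposed np.argmax(x, axis=1, keepdims=True) would return,
# emulated with today's API:
idx = np.expand_dims(np.argmax(x, axis=1), axis=1)
idx.shape                            # (10, 1) instead of (10,)

# the kept axis makes the indices directly usable for indexing back:
np.take_along_axis(x, idx, axis=1)   # shape (10, 1), the per-row maxima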
Cheers, Sebastian > > Stéfan > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion
From gsingh at quansight.com Thu Jul 1 12:49:29 2021 From: gsingh at quansight.com (Gagandeep Singh) Date: Thu, 1 Jul 2021 22:19:29 +0530 Subject: [Numpy-discussion] `keepdims=True` for argmin/argmax and C-API `PyArray_ArgMaxWithKeepdims` In-Reply-To: References: <1df08e1ed077b7ee380fd6b3374bff5c2fdadd8e.camel@sipsolutions.net> <0d0e2ffb-9811-4b6b-8aef-3bd3ff78e895@www.fastmail.com> Message-ID: Hi, So should I remove these new functions from the public C-API? Let me know. I will do that. On Thu, 1 Jul, 2021, 10:02 pm Sebastian Berg, wrote: > On Thu, 2021-07-01 at 00:39 -0700, Stefan van der Walt wrote: > > Hi Sebastian, > > > > On Wed, Jun 30, 2021, at 18:23, Sebastian Berg wrote: > > > The PR https://github.com/numpy/numpy/pull/19211 proposes to extend > > > argmin and argmax with a `keepdims=False` keyword-only argument. > > > > This seems consistent with existing APIs, so I'm not concerned. > > > > For those wondering, `keepdims` preserves the number of dimensions of > > the original array in a reduction operation like `sum`: > > > > In [1]: X = np.random.random((10, 15)) > > > > In [2]: np.sum(X).shape > > Out[2]: () > > > > In [3]: np.sum(X, keepdims=True).shape > > Out[3]: (1, 1) > > > > This is sometimes useful for broadcasting. > > > > > The PR also proposes to add: > > > > > > * `PyArray_ArgMinWithKeepdims` > > > * `PyArray_ArgMaxWithKeepdims` > > > > I am curious whether this is our general pattern for adding keyword > > argument functionality to functions in the C-API. It seems a bit > > excessive! > > True, I am now tending a bit towards delaying this until someone > actually asks for it... > In most use-cases just using the Python API is likely only a small > overhead anyway if done right. > > I do not think we have a pattern. We do have some functions with the > pattern of `With...And...` to allow signatures of different complexity. > But very few of this type of python additions ever made it into the C- > API. For `Reshape`, `order=` was added by introducing `NewShape`. > > I have some hope that very long-term, HPy might solve this for us... > > Cheers, > > Sebastian > > > > > > > Stéfan > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion >
From matti.picus at gmail.com Thu Jul 1 16:40:12 2021 From: matti.picus at gmail.com (Matti Picus) Date: Thu, 1 Jul 2021 23:40:12 +0300 Subject: [Numpy-discussion] `keepdims=True` for argmin/argmax and C-API `PyArray_ArgMaxWithKeepdims` In-Reply-To: References: <1df08e1ed077b7ee380fd6b3374bff5c2fdadd8e.camel@sipsolutions.net> <0d0e2ffb-9811-4b6b-8aef-3bd3ff78e895@www.fastmail.com> Message-ID: <68756d1e-3e23-0efc-24ed-aecb732f45d8@gmail.com> On 1/7/21 7:49 pm, Gagandeep Singh wrote: > Hi, > > So should I remove these new functions from the public C-API? Let me know. > I will do that. > > Yes please. If needed we can add them, but once in we cannot remove them.
Matti From gsingh at quansight.com Fri Jul 2 00:42:02 2021 From: gsingh at quansight.com (Gagandeep Singh) Date: Fri, 2 Jul 2021 10:12:02 +0530 Subject: [Numpy-discussion] `keepdims=True` for argmin/argmx and C-API `PyArray_ArgMaxWithKeepdims` In-Reply-To: <68756d1e-3e23-0efc-24ed-aecb732f45d8@gmail.com> References: <1df08e1ed077b7ee380fd6b3374bff5c2fdadd8e.camel@sipsolutions.net> <0d0e2ffb-9811-4b6b-8aef-3bd3ff78e895@www.fastmail.com> <68756d1e-3e23-0efc-24ed-aecb732f45d8@gmail.com> Message-ID: Hi, I have removed the two new C functions from public C-API. Let me know if anything else is needed. Thanks. On Fri, Jul 2, 2021 at 2:10 AM Matti Picus wrote: > > On 1/7/21 7:49 pm, Gagandeep Singh wrote: > > Hi, > > > > So should I remove these new functions from public C-API? Let me know. > > I will do that. > > > > > > Yes please. If needed we can add them, but once in we cannot remove them. > > Matti > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ilhanpolat at gmail.com Fri Jul 2 07:19:24 2021 From: ilhanpolat at gmail.com (Ilhan Polat) Date: Fri, 2 Jul 2021 14:19:24 +0300 Subject: [Numpy-discussion] is_triangular, is_diagonal, is_symmetric et al. in NumPy or SciPy linalg In-Reply-To: References: Message-ID: Ah right. So two things, the original reason f9r this question is because I can't decide in https://github.com/scipy/scipy/pull/12824 whether others would also benefit from quick structure determination. I can keep it private function or we can put them some misc or lib folder so all can use. Say there is a special method for triangular matrices but you can't guarantee the structure so you can quickly check for it. At worst O(n**2) complexity for diagonal arrays and almost O(2n) for full arrays makes it quite appealing. But then again maybe NumPy is a better place since probably it will be faster to have this in pure C with the right headers and without the extra Cython overhead. Funny you mention the container idea. This is precisely what I'm doing in PR mentioned above (I'll push when I'm done). I stole the idea from Tim Davis himself in a Julia discussion for keeping the factorization as an attribute to be used later if need be. So yes it makes a lot of sense Sparse or not. On Wed, 30 Jun 2021, 19:14 Evgeni Burovski, wrote: > Hi Ilhan, > > Overall I think something like this would be great. However, I wonder > if you considered having a specialized container with a structure tag > instead of trying to discover the structure. If it's a container, it > can neatly wrap various lapack storage schemes and dispatch to an > appropriate lapack functionality. Possibly even sparse storage > schemes. And it seems a bit more robust than trying to discover the > structure (e.g. what about off-band elements of \sim 1e-16 etc). > > The next question is of course if this should live in scipy/numpy > .linalg or as a separate repo, at least for some time (maybe in the > scipy organization?). So that it can iterate faster, among other > things. > (I'd be interested in contributing FWIW) > > Cheers, > > Evgeni > > > On Wed, Jun 30, 2021 at 1:22 AM Ilhan Polat wrote: > > > > Dear all, > > > > I'm writing some helper Cythpm functions for scipy.linalg which is kinda > performant and usable. And there is still quite some wiggle room for more. 
> > > > In many linalg routines there is a lot of performance benefit if the > structure can be discovered in a cheap and reliable way at the outset. For > example if symmetric then eig can delegate to eigh or if triangular then > triangular solvers can be used in linalg.solve and lstsq so forth > > > > Here is the Cythonized version for Jupyter notebook to paste to discover > the lower/upper bandwidth of square array A that competes well with A != 0 > just to use some low level function (note the latter returns an array hence > more cost is involved) There is a higher level supervisor function that > checks C-contiguousness otherwise specializes to different versions of it > > > > Initial cell > > > > %load_ext Cython > > %load_ext line_profiler > > import cython > > import line_profiler > > > > Then another cell > > > > %%cython > > # cython: language_level=3 > > # cython: linetrace=True > > # cython: binding = True > > # distutils: define_macros=CYTHON_TRACE=1 > > # distutils: define_macros=CYTHON_TRACE_NOGIL=1 > > > > cimport cython > > cimport numpy as cnp > > import numpy as np > > import line_profiler > > ctypedef fused np_numeric_t: > > cnp.int8_t > > cnp.int16_t > > cnp.int32_t > > cnp.int64_t > > cnp.uint8_t > > cnp.uint16_t > > cnp.uint32_t > > cnp.uint64_t > > cnp.float32_t > > cnp.float64_t > > cnp.complex64_t > > cnp.complex128_t > > cnp.int_t > > cnp.long_t > > cnp.longlong_t > > cnp.uint_t > > cnp.ulong_t > > cnp.ulonglong_t > > cnp.intp_t > > cnp.uintp_t > > cnp.float_t > > cnp.double_t > > cnp.longdouble_t > > > > > > @cython.linetrace(True) > > @cython.initializedcheck(False) > > @cython.boundscheck(False) > > @cython.wraparound(False) > > cpdef inline (int, int) band_check_internal(np_numeric_t[:, ::1]A): > > cdef Py_ssize_t n = A.shape[0], lower_band = 0, upper_band = 0, r, c > > cdef np_numeric_t zero = 0 > > > > for r in xrange(n): > > # Only bother if outside the existing band: > > for c in xrange(r-lower_band): > > if A[r, c] != zero: > > lower_band = r - c > > break > > > > for c in xrange(n - 1, r + upper_band, -1): > > if A[r, c] != zero: > > upper_band = c - r > > break > > > > return lower_band, upper_band > > > > Final cell for use-case --------------- > > > > # Make arbitrary lower-banded array > > n = 50 # array size > > k = 3 # k'th subdiagonal > > R = np.zeros([n, n], dtype=np.float32) > > R[[x for x in range(n)], [x for x in range(n)]] = 1 > > R[[x for x in range(n-1)], [x for x in range(1,n)]] = 1 > > R[[x for x in range(1,n)], [x for x in range(n-1)]] = 1 > > R[[x for x in range(k,n)], [x for x in range(n-k)]] = 2 > > > > Some very haphazardly put together metrics > > > > %timeit band_check_internal(R) > > 2.59 ?s ? 84.7 ns per loop (mean ? std. dev. of 7 runs, 100000 loops > each) > > > > %timeit np.linalg.solve(R, zzz) > > 824 ?s ? 6.24 ?s per loop (mean ? std. dev. of 7 runs, 1000 loops each) > > > > %timeit R != 0. > > 1.65 ?s ? 43.1 ns per loop (mean ? std. dev. of 7 runs, 1000000 loops > each) > > > > So the worst case cost is negligible in general (note that the given > code is slower as it uses the fused type however if I go with tempita > standalone version is faster) > > > > Two questions: > > > > 1) This is missing np.half/float16 functionality since any arithmetic > with float16 is might not be reliable including nonzero check. IS it safe > to view it as np.uint16 and use that specialization? I'm not sure about the > sign bit hence the question. 
I can leave this out since almost all linalg > suite rejects this datatype due to well-known lack of supprt. > > > > 2) Should this be in NumPy or SciPy linalg? It is quite relevant to be > on SciPy but then again this stuff is purely about array structures. But if > the opinion is for NumPy then I would need a volunteer because NumPy > codebase flies way above my head. > > > > > > All feedback welcome > > > > Best > > ilhan > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From oscar.j.benjamin at gmail.com Fri Jul 2 13:03:19 2021 From: oscar.j.benjamin at gmail.com (Oscar Benjamin) Date: Fri, 2 Jul 2021 18:03:19 +0100 Subject: [Numpy-discussion] is_triangular, is_diagonal, is_symmetric et al. in NumPy or SciPy linalg In-Reply-To: References: Message-ID: If you're going to provide routines for structure determination it might be worth looking at algorithms that can identify more general or less obvious structure as well. SymPy's matrices module needs a lot of work and is improving a lot which will become noticeable over the next few releases but one of the important optimisations being used is Tarjan's algorithm for finding the strongly connected components of a graph. This is a generalisation of checking for triangular or diagonal matrices. With this approach you can identify any permutation of the rows and columns of a square matrix that can bring it into block triangular or block diagonal form which can reduce many O(n**3) algorithms substantially. The big-O for Tarjan's algorithm itself is basically the same as checking whether a matrix is triangular/diagonal. For example the matrix determinant is invariant under permutations of the rows and columns. If you can permute a matrix into block triangular form then the determinant is just the product of the determinants of the diagonal blocks. If the base case algorithm has n**3 operations then reducing it to two operations of size n/2 is a speed up of ~4x. In the extreme this discovers that a matrix is triangular and reduces the whole operation to O(n) (plus the cost of Tarjan's algorithm). However the graph-based approach also benefits wider classes e.g. you get almost all the same benefit for a matrix that is almost diagonal but has a few off-diagonal elements. Using sympy master branch (as .strongly_connected_components() is not released yet): In [19]: from sympy import Matrix In [20]: M = Matrix([[1, 0, 2, 0], [9, 3, 1, 2], [3, 0, 4, 0], [5, 8, 6, 7]]) In [21]: M Out[21]: ?1 0 2 0? ?9 3 1 2? ?3 0 4 0? ?5 8 6 7? In [22]: M.strongly_connected_components() # Tarjan's algorithm Out[22]: [[0, 2], [1, 3]] In [23]: M[[0, 2, 1, 3], [0, 2, 1, 3]] # outer indexing for permutation Out[23]: ?1 2 0 0? ?3 4 0 0? ?9 1 3 2? ?5 6 8 7? In [24]: M.det() Out[24]: -10 In [25]: M[[0,2],[0,2]].det() * M[[1, 3], [1, 3]].det() Out[25]: -10 -- Oscar On Fri, 2 Jul 2021 at 12:20, Ilhan Polat wrote: > > Ah right. So two things, the original reason f9r this question is because I can't decide in https://github.com/scipy/scipy/pull/12824 whether others would also benefit from quick structure determination. 
> > I can keep it private function or we can put them some misc or lib folder so all can use. Say there is a special method for triangular matrices but you can't guarantee the structure so you can quickly check for it. At worst O(n**2) complexity for diagonal arrays and almost O(2n) for full arrays makes it quite appealing. > > But then again maybe NumPy is a better place since probably it will be faster to have this in pure C with the right headers and without the extra Cython overhead. > > Funny you mention the container idea. This is precisely what I'm doing in PR mentioned above (I'll push when I'm done). I stole the idea from Tim Davis himself in a Julia discussion for keeping the factorization as an attribute to be used later if need be. So yes it makes a lot of sense Sparse or not. > > On Wed, 30 Jun 2021, 19:14 Evgeni Burovski, wrote: >> >> Hi Ilhan, >> >> Overall I think something like this would be great. However, I wonder >> if you considered having a specialized container with a structure tag >> instead of trying to discover the structure. If it's a container, it >> can neatly wrap various lapack storage schemes and dispatch to an >> appropriate lapack functionality. Possibly even sparse storage >> schemes. And it seems a bit more robust than trying to discover the >> structure (e.g. what about off-band elements of \sim 1e-16 etc). >> >> The next question is of course if this should live in scipy/numpy >> .linalg or as a separate repo, at least for some time (maybe in the >> scipy organization?). So that it can iterate faster, among other >> things. >> (I'd be interested in contributing FWIW) >> >> Cheers, >> >> Evgeni >> >> >> On Wed, Jun 30, 2021 at 1:22 AM Ilhan Polat wrote: >> > >> > Dear all, >> > >> > I'm writing some helper Cythpm functions for scipy.linalg which is kinda performant and usable. And there is still quite some wiggle room for more. >> > >> > In many linalg routines there is a lot of performance benefit if the structure can be discovered in a cheap and reliable way at the outset. 
For example if symmetric then eig can delegate to eigh or if triangular then triangular solvers can be used in linalg.solve and lstsq so forth >> > >> > Here is the Cythonized version for Jupyter notebook to paste to discover the lower/upper bandwidth of square array A that competes well with A != 0 just to use some low level function (note the latter returns an array hence more cost is involved) There is a higher level supervisor function that checks C-contiguousness otherwise specializes to different versions of it >> > >> > Initial cell >> > >> > %load_ext Cython >> > %load_ext line_profiler >> > import cython >> > import line_profiler >> > >> > Then another cell >> > >> > %%cython >> > # cython: language_level=3 >> > # cython: linetrace=True >> > # cython: binding = True >> > # distutils: define_macros=CYTHON_TRACE=1 >> > # distutils: define_macros=CYTHON_TRACE_NOGIL=1 >> > >> > cimport cython >> > cimport numpy as cnp >> > import numpy as np >> > import line_profiler >> > ctypedef fused np_numeric_t: >> > cnp.int8_t >> > cnp.int16_t >> > cnp.int32_t >> > cnp.int64_t >> > cnp.uint8_t >> > cnp.uint16_t >> > cnp.uint32_t >> > cnp.uint64_t >> > cnp.float32_t >> > cnp.float64_t >> > cnp.complex64_t >> > cnp.complex128_t >> > cnp.int_t >> > cnp.long_t >> > cnp.longlong_t >> > cnp.uint_t >> > cnp.ulong_t >> > cnp.ulonglong_t >> > cnp.intp_t >> > cnp.uintp_t >> > cnp.float_t >> > cnp.double_t >> > cnp.longdouble_t >> > >> > >> > @cython.linetrace(True) >> > @cython.initializedcheck(False) >> > @cython.boundscheck(False) >> > @cython.wraparound(False) >> > cpdef inline (int, int) band_check_internal(np_numeric_t[:, ::1]A): >> > cdef Py_ssize_t n = A.shape[0], lower_band = 0, upper_band = 0, r, c >> > cdef np_numeric_t zero = 0 >> > >> > for r in xrange(n): >> > # Only bother if outside the existing band: >> > for c in xrange(r-lower_band): >> > if A[r, c] != zero: >> > lower_band = r - c >> > break >> > >> > for c in xrange(n - 1, r + upper_band, -1): >> > if A[r, c] != zero: >> > upper_band = c - r >> > break >> > >> > return lower_band, upper_band >> > >> > Final cell for use-case --------------- >> > >> > # Make arbitrary lower-banded array >> > n = 50 # array size >> > k = 3 # k'th subdiagonal >> > R = np.zeros([n, n], dtype=np.float32) >> > R[[x for x in range(n)], [x for x in range(n)]] = 1 >> > R[[x for x in range(n-1)], [x for x in range(1,n)]] = 1 >> > R[[x for x in range(1,n)], [x for x in range(n-1)]] = 1 >> > R[[x for x in range(k,n)], [x for x in range(n-k)]] = 2 >> > >> > Some very haphazardly put together metrics >> > >> > %timeit band_check_internal(R) >> > 2.59 ?s ? 84.7 ns per loop (mean ? std. dev. of 7 runs, 100000 loops each) >> > >> > %timeit np.linalg.solve(R, zzz) >> > 824 ?s ? 6.24 ?s per loop (mean ? std. dev. of 7 runs, 1000 loops each) >> > >> > %timeit R != 0. >> > 1.65 ?s ? 43.1 ns per loop (mean ? std. dev. of 7 runs, 1000000 loops each) >> > >> > So the worst case cost is negligible in general (note that the given code is slower as it uses the fused type however if I go with tempita standalone version is faster) >> > >> > Two questions: >> > >> > 1) This is missing np.half/float16 functionality since any arithmetic with float16 is might not be reliable including nonzero check. IS it safe to view it as np.uint16 and use that specialization? I'm not sure about the sign bit hence the question. I can leave this out since almost all linalg suite rejects this datatype due to well-known lack of supprt. >> > >> > 2) Should this be in NumPy or SciPy linalg? 
It is quite relevant to be on SciPy but then again this stuff is purely about array structures. But if the opinion is for NumPy then I would need a volunteer because NumPy codebase flies way above my head. >> > >> > >> > All feedback welcome >> > >> > Best >> > ilhan >> > >> > _______________________________________________ >> > NumPy-Discussion mailing list >> > NumPy-Discussion at python.org >> > https://mail.python.org/mailman/listinfo/numpy-discussion >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion From ilhanpolat at gmail.com Fri Jul 2 16:00:19 2021 From: ilhanpolat at gmail.com (Ilhan Polat) Date: Fri, 2 Jul 2021 23:00:19 +0300 Subject: [Numpy-discussion] is_triangular, is_diagonal, is_symmetric et al. in NumPy or SciPy linalg In-Reply-To: References: Message-ID: Yes they go by the name of morally triangular matrices (quite a stupid name but in their defense I think it was an insider joke) this is also given in Tim Davis' book as an exercise via linked lists. The issue is that LAPACK doesn't support these permuted matrices. Hence we are left with two options Either copy paste row/columns so that the array stays contiguous or permute a copy of the array. Both can be significant cost while trying to shave off solving time. But you are right this can be present even though solvers and eig routines won't use it. I'll put my Cython code back in. On Fri, 2 Jul 2021, 20:05 Oscar Benjamin, wrote: > If you're going to provide routines for structure determination it > might be worth looking at algorithms that can identify more general or > less obvious structure as well. SymPy's matrices module needs a lot of > work and is improving a lot which will become noticeable over the next > few releases but one of the important optimisations being used is > Tarjan's algorithm for finding the strongly connected components of a > graph. This is a generalisation of checking for triangular or diagonal > matrices. With this approach you can identify any permutation of the > rows and columns of a square matrix that can bring it into block > triangular or block diagonal form which can reduce many O(n**3) > algorithms substantially. The big-O for Tarjan's algorithm itself is > basically the same as checking whether a matrix is > triangular/diagonal. > > For example the matrix determinant is invariant under permutations of > the rows and columns. If you can permute a matrix into block > triangular form then the determinant is just the product of the > determinants of the diagonal blocks. If the base case algorithm has > n**3 operations then reducing it to two operations of size n/2 is a > speed up of ~4x. In the extreme this discovers that a matrix is > triangular and reduces the whole operation to O(n) (plus the cost of > Tarjan's algorithm). However the graph-based approach also benefits > wider classes e.g. you get almost all the same benefit for a matrix > that is almost diagonal but has a few off-diagonal elements. > > Using sympy master branch (as .strongly_connected_components() is not > released yet): > > In [19]: from sympy import Matrix > > In [20]: M = Matrix([[1, 0, 2, 0], [9, 3, 1, 2], [3, 0, 4, 0], [5, 8, 6, > 7]]) > > In [21]: M > Out[21]: > ?1 0 2 0? > ?9 3 1 2? > ?3 0 4 0? > ?5 8 6 7? 
> > In [22]: M.strongly_connected_components() # Tarjan's algorithm > Out[22]: [[0, 2], [1, 3]] > > In [23]: M[[0, 2, 1, 3], [0, 2, 1, 3]] # outer indexing for permutation > Out[23]: > ?1 2 0 0? > ?3 4 0 0? > ?9 1 3 2? > ?5 6 8 7? > > In [24]: M.det() > Out[24]: -10 > > In [25]: M[[0,2],[0,2]].det() * M[[1, 3], [1, 3]].det() > Out[25]: -10 > > -- > Oscar > > On Fri, 2 Jul 2021 at 12:20, Ilhan Polat wrote: > > > > Ah right. So two things, the original reason f9r this question is > because I can't decide in https://github.com/scipy/scipy/pull/12824 > whether others would also benefit from quick structure determination. > > > > I can keep it private function or we can put them some misc or lib > folder so all can use. Say there is a special method for triangular > matrices but you can't guarantee the structure so you can quickly check for > it. At worst O(n**2) complexity for diagonal arrays and almost O(2n) for > full arrays makes it quite appealing. > > > > But then again maybe NumPy is a better place since probably it will be > faster to have this in pure C with the right headers and without the extra > Cython overhead. > > > > Funny you mention the container idea. This is precisely what I'm doing > in PR mentioned above (I'll push when I'm done). I stole the idea from Tim > Davis himself in a Julia discussion for keeping the factorization as an > attribute to be used later if need be. So yes it makes a lot of sense > Sparse or not. > > > > On Wed, 30 Jun 2021, 19:14 Evgeni Burovski, > wrote: > >> > >> Hi Ilhan, > >> > >> Overall I think something like this would be great. However, I wonder > >> if you considered having a specialized container with a structure tag > >> instead of trying to discover the structure. If it's a container, it > >> can neatly wrap various lapack storage schemes and dispatch to an > >> appropriate lapack functionality. Possibly even sparse storage > >> schemes. And it seems a bit more robust than trying to discover the > >> structure (e.g. what about off-band elements of \sim 1e-16 etc). > >> > >> The next question is of course if this should live in scipy/numpy > >> .linalg or as a separate repo, at least for some time (maybe in the > >> scipy organization?). So that it can iterate faster, among other > >> things. > >> (I'd be interested in contributing FWIW) > >> > >> Cheers, > >> > >> Evgeni > >> > >> > >> On Wed, Jun 30, 2021 at 1:22 AM Ilhan Polat > wrote: > >> > > >> > Dear all, > >> > > >> > I'm writing some helper Cythpm functions for scipy.linalg which is > kinda performant and usable. And there is still quite some wiggle room for > more. > >> > > >> > In many linalg routines there is a lot of performance benefit if the > structure can be discovered in a cheap and reliable way at the outset. 
For > example if symmetric then eig can delegate to eigh or if triangular then > triangular solvers can be used in linalg.solve and lstsq so forth > >> > > >> > Here is the Cythonized version for Jupyter notebook to paste to > discover the lower/upper bandwidth of square array A that competes well > with A != 0 just to use some low level function (note the latter returns an > array hence more cost is involved) There is a higher level supervisor > function that checks C-contiguousness otherwise specializes to different > versions of it > >> > > >> > Initial cell > >> > > >> > %load_ext Cython > >> > %load_ext line_profiler > >> > import cython > >> > import line_profiler > >> > > >> > Then another cell > >> > > >> > %%cython > >> > # cython: language_level=3 > >> > # cython: linetrace=True > >> > # cython: binding = True > >> > # distutils: define_macros=CYTHON_TRACE=1 > >> > # distutils: define_macros=CYTHON_TRACE_NOGIL=1 > >> > > >> > cimport cython > >> > cimport numpy as cnp > >> > import numpy as np > >> > import line_profiler > >> > ctypedef fused np_numeric_t: > >> > cnp.int8_t > >> > cnp.int16_t > >> > cnp.int32_t > >> > cnp.int64_t > >> > cnp.uint8_t > >> > cnp.uint16_t > >> > cnp.uint32_t > >> > cnp.uint64_t > >> > cnp.float32_t > >> > cnp.float64_t > >> > cnp.complex64_t > >> > cnp.complex128_t > >> > cnp.int_t > >> > cnp.long_t > >> > cnp.longlong_t > >> > cnp.uint_t > >> > cnp.ulong_t > >> > cnp.ulonglong_t > >> > cnp.intp_t > >> > cnp.uintp_t > >> > cnp.float_t > >> > cnp.double_t > >> > cnp.longdouble_t > >> > > >> > > >> > @cython.linetrace(True) > >> > @cython.initializedcheck(False) > >> > @cython.boundscheck(False) > >> > @cython.wraparound(False) > >> > cpdef inline (int, int) band_check_internal(np_numeric_t[:, ::1]A): > >> > cdef Py_ssize_t n = A.shape[0], lower_band = 0, upper_band = 0, > r, c > >> > cdef np_numeric_t zero = 0 > >> > > >> > for r in xrange(n): > >> > # Only bother if outside the existing band: > >> > for c in xrange(r-lower_band): > >> > if A[r, c] != zero: > >> > lower_band = r - c > >> > break > >> > > >> > for c in xrange(n - 1, r + upper_band, -1): > >> > if A[r, c] != zero: > >> > upper_band = c - r > >> > break > >> > > >> > return lower_band, upper_band > >> > > >> > Final cell for use-case --------------- > >> > > >> > # Make arbitrary lower-banded array > >> > n = 50 # array size > >> > k = 3 # k'th subdiagonal > >> > R = np.zeros([n, n], dtype=np.float32) > >> > R[[x for x in range(n)], [x for x in range(n)]] = 1 > >> > R[[x for x in range(n-1)], [x for x in range(1,n)]] = 1 > >> > R[[x for x in range(1,n)], [x for x in range(n-1)]] = 1 > >> > R[[x for x in range(k,n)], [x for x in range(n-k)]] = 2 > >> > > >> > Some very haphazardly put together metrics > >> > > >> > %timeit band_check_internal(R) > >> > 2.59 ?s ? 84.7 ns per loop (mean ? std. dev. of 7 runs, 100000 loops > each) > >> > > >> > %timeit np.linalg.solve(R, zzz) > >> > 824 ?s ? 6.24 ?s per loop (mean ? std. dev. of 7 runs, 1000 loops > each) > >> > > >> > %timeit R != 0. > >> > 1.65 ?s ? 43.1 ns per loop (mean ? std. dev. of 7 runs, 1000000 loops > each) > >> > > >> > So the worst case cost is negligible in general (note that the given > code is slower as it uses the fused type however if I go with tempita > standalone version is faster) > >> > > >> > Two questions: > >> > > >> > 1) This is missing np.half/float16 functionality since any arithmetic > with float16 is might not be reliable including nonzero check. IS it safe > to view it as np.uint16 and use that specialization? 
I'm not sure about the > sign bit hence the question. I can leave this out since almost all linalg > suite rejects this datatype due to well-known lack of supprt. > >> > > >> > 2) Should this be in NumPy or SciPy linalg? It is quite relevant to > be on SciPy but then again this stuff is purely about array structures. But > if the opinion is for NumPy then I would need a volunteer because NumPy > codebase flies way above my head. > >> > > >> > > >> > All feedback welcome > >> > > >> > Best > >> > ilhan > >> > > >> > _______________________________________________ > >> > NumPy-Discussion mailing list > >> > NumPy-Discussion at python.org > >> > https://mail.python.org/mailman/listinfo/numpy-discussion > >> _______________________________________________ > >> NumPy-Discussion mailing list > >> NumPy-Discussion at python.org > >> https://mail.python.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From melissawm at gmail.com Fri Jul 2 17:59:39 2021 From: melissawm at gmail.com (=?UTF-8?Q?Melissa_Mendon=C3=A7a?=) Date: Fri, 2 Jul 2021 18:59:39 -0300 Subject: [Numpy-discussion] Documentation Team meeting - Monday July 5 In-Reply-To: References: Message-ID: Hi all! Our next Documentation Team meeting will be on *Monday, July 5* at ***4PM UTC***. All are welcome - you don't need to already be a contributor to join. If you have questions or are curious about what we're doing, we'll be happy to meet you! If you wish to join on Zoom, use this link: https://zoom.us/j/96219574921?pwd=VTRNeGwwOUlrYVNYSENpVVBRRjlkZz09#success Here's the permanent hackmd document with the meeting notes (still being updated in the next few days!): https://hackmd.io/oB_boakvRqKR-_2jRV-Qjg Hope to see you around! ** You can click this link to get the correct time at your timezone: https://www.timeanddate.com/worldclock/fixedtime.html?msg=NumPy+Documentation+Team+Meeting&iso=20210705T16&p1=1440&ah=1 *** You can add the NumPy community calendar to your google calendar by clicking this link: https://calendar.google.com/calendar /r?cid=YmVya2VsZXkuZWR1X2lla2dwaWdtMjMyamJobGRzZmIyYzJqODFjQGdyb3VwLmNhbGVuZGFyLmdvb2dsZS5jb20 - Melissa -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sun Jul 4 16:00:06 2021 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sun, 4 Jul 2021 22:00:06 +0200 Subject: [Numpy-discussion] copy="never" discussion and no deprecation cycle? In-Reply-To: References: <224c4894-4ed2-a6ae-bd7a-2de8ce3b7d02@gmail.com> <18d0ca87-7bf8-4629-9781-364a75b92a94@www.fastmail.com> <9e960b67-2556-4485-824f-13c4e83868be@www.fastmail.com> <8E0F2CD0-CBBC-4252-8B6D-9D69E7E8DCAB@fastmail.com> <8c0dd25f-8df6-4d59-ae47-0961cf6b95db@www.fastmail.com> Message-ID: Let's see if we can finalize this. On Thu, Jun 24, 2021 at 9:23 PM Stefan van der Walt wrote: > On Thu, Jun 24, 2021, at 01:03, Ralf Gommers wrote: > > For this one, I'd say it kinda looks like we do need one, so then let's > just add one and be done with it, rather than inventing odd patterns like > tacking enum members onto an existing function. 
> > > There are two arguments on the table that resonate with me: > > 1. Chuck argues that the current `copy=False` behavior (which, in fact, > means copy-if-needed) is nonsensical and should be fixed. > 2. Ralf argues that strings are ultimately the interface we'd like to see. > > To achieve (1), we would need a deprecation cycle. During that > deprecation cycle, we would need to provide a way to continue providing > 'copy-if-needed' behavior. This can be achieved either with an enum or by > accepting strings. > > Stephan argues that accepting strings will be harmful to new code running > on old versions of NumPy. I would still like to get a sense of how often > this happens, or if that is a hit we are willing to take. If we decide > that the concern is a significant one, then we would have to go the enum > route, at least for a while. However, I see no compelling reason to have > that enum live in the top-level namespace though: it is for relatively > advanced use, and it will be temporary. > > If we take the enum route, how do we get to (2)? We add a type check for > a few releases and raise an error on string arguments (or, alternatively, > handle 'always'/'never'/'if_needed' without advertising that > functionality). Then, once we switch to string arguments, users will get > an error (for old NumPy) or it will work as expected (for new NumPy). > What Stephan said in his last email seems right, just switch to strings at some point (probably after 3 years or so), and stop recommending the enum. > I didn't think so originally, but I suppose we are in NEP territory now. > I don't think so. We basically arrived at the solution, and there's a PR that is mostly done too. This really isn't that complicated that we should require a NEP. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefanv at berkeley.edu Sun Jul 4 21:52:58 2021 From: stefanv at berkeley.edu (Stefan van der Walt) Date: Sun, 04 Jul 2021 18:52:58 -0700 Subject: [Numpy-discussion] =?utf-8?q?copy=3D=22never=22_discussion_and_n?= =?utf-8?q?o_deprecation_cycle=3F?= In-Reply-To: References: <224c4894-4ed2-a6ae-bd7a-2de8ce3b7d02@gmail.com> <18d0ca87-7bf8-4629-9781-364a75b92a94@www.fastmail.com> <9e960b67-2556-4485-824f-13c4e83868be@www.fastmail.com> <8E0F2CD0-CBBC-4252-8B6D-9D69E7E8DCAB@fastmail.com> <8c0dd25f-8df6-4d59-ae47-0961cf6b95db@www.fastmail.com> Message-ID: <30aebe03-f4fb-412b-8f03-f47c7abbfca6@www.fastmail.com> On Sun, Jul 4, 2021, at 13:00, Ralf Gommers wrote: > I don't think so. We basically arrived at the solution, and there's a PR that is mostly done too. This really isn't that complicated that we should require a NEP. Personally, I don't like np.CopyMode in the main namespace. If we can agree to stash it somewhere else, and tentatively aim to move to strings at point X in time for consistency with the rest of the API, I have no issue with going ahead. St?fan -------------- next part -------------- An HTML attachment was scrubbed... URL: From mikofski at berkeley.edu Mon Jul 5 01:35:37 2021 From: mikofski at berkeley.edu (Dr. Mark Alexander Mikofski PhD) Date: Sun, 4 Jul 2021 22:35:37 -0700 Subject: [Numpy-discussion] [ANN] Software job opportunity in clean energy Message-ID: Dear Pythonistas, DNV Energy USA is looking for an experienced software engineer to help accelerate the renewable energy transition. Do you know any software engineers interested in clean energy? Would you mind sharing the following link with your network? 
https://www.linkedin.com/jobs/view/2574048777 Thank you! Mark A. Mikofski -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Mon Jul 5 03:42:48 2021 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Mon, 5 Jul 2021 09:42:48 +0200 Subject: [Numpy-discussion] copy="never" discussion and no deprecation cycle? In-Reply-To: <30aebe03-f4fb-412b-8f03-f47c7abbfca6@www.fastmail.com> References: <224c4894-4ed2-a6ae-bd7a-2de8ce3b7d02@gmail.com> <18d0ca87-7bf8-4629-9781-364a75b92a94@www.fastmail.com> <9e960b67-2556-4485-824f-13c4e83868be@www.fastmail.com> <8E0F2CD0-CBBC-4252-8B6D-9D69E7E8DCAB@fastmail.com> <8c0dd25f-8df6-4d59-ae47-0961cf6b95db@www.fastmail.com> <30aebe03-f4fb-412b-8f03-f47c7abbfca6@www.fastmail.com> Message-ID: On Mon, Jul 5, 2021 at 3:53 AM Stefan van der Walt wrote: > On Sun, Jul 4, 2021, at 13:00, Ralf Gommers wrote: > > I don't think so. We basically arrived at the solution, and there's a PR > that is mostly done too. This really isn't that complicated that we should > require a NEP. > > > Personally, I don't like np.CopyMode in the main namespace. If we can > agree to stash it somewhere else, and tentatively aim to move to strings at > point X in time for consistency with the rest of the API, I have no issue > with going ahead. > I share your dislike, but I don't really see a better place where it doesn't make it even harder to spell, but I did just think of an alternative that may actually be quite reasonable: keep it private. The reason why Gagandeep started working on this is so we can have the never-copy behavior in the `numpy.array_api` namespace. For the `asarray` function there, the `copy` keyword is still boolean, with description: Whether or not to make a copy of the input. If True, always copies. If False, never copies for input which supports DLPack or the buffer protocol, and raises ValueError in case that would be necessary. If None , reuses existing memory buffer if possible, copies otherwise. Default: None. In the end I think that's better than strings, and way better than enums - we just can't have that in the main namespace, because we can't change what `False` does. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefanv at berkeley.edu Mon Jul 5 14:17:56 2021 From: stefanv at berkeley.edu (Stefan van der Walt) Date: Mon, 05 Jul 2021 11:17:56 -0700 Subject: [Numpy-discussion] =?utf-8?q?copy=3D=22never=22_discussion_and_n?= =?utf-8?q?o_deprecation_cycle=3F?= In-Reply-To: References: <224c4894-4ed2-a6ae-bd7a-2de8ce3b7d02@gmail.com> <18d0ca87-7bf8-4629-9781-364a75b92a94@www.fastmail.com> <9e960b67-2556-4485-824f-13c4e83868be@www.fastmail.com> <8E0F2CD0-CBBC-4252-8B6D-9D69E7E8DCAB@fastmail.com> <8c0dd25f-8df6-4d59-ae47-0961cf6b95db@www.fastmail.com> <30aebe03-f4fb-412b-8f03-f47c7abbfca6@www.fastmail.com> Message-ID: <06b21c5a-c5c9-4169-91ca-808eafd4ac01@www.fastmail.com> On Mon, Jul 5, 2021, at 00:42, Ralf Gommers wrote: > I share your dislike, but I don't really see a better place where it doesn't make it even harder to spell, but I did just think of an alternative that may actually be quite reasonable: keep it private. That would be fine. We haven't had this feature requested for many years, so as long as it is available in some shape or form it should satisfy the advanced users who need it. It also doesn't force us into a decision we cannot reverse (adding to the top-level API). 
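As a quick reminder of why a new spelling is needed at all: in current NumPy, `copy=False` already means "copy only if needed", so there is no way to request a hard failure instead of a silent copy. A minimal illustration of today's behaviour (nothing below depends on the PR):

import numpy as np

lst = [1.0, 2.0, 3.0]
a = np.array(lst, copy=False)   # still copies: a Python list has no buffer to reuse
a.base is None                  # True -- `a` owns freshly allocated memory

b = np.asarray(a)               # no copy needed, the same object comes back
b is a                          # True

# What is missing is a spelling that raises in the first case instead of
# silently copying -- that is the gap the never-copy option fills.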
> The reason why Gagandeep started working on this is so we can have the never-copy behavior in the `numpy.array_api` namespace. For the `asarray` function there, the `copy` keyword is still boolean, with description: > > Whether or not to make a copy of the input. If ` True`, always copies. > If ` False`, never copies for input which supports DLPack or the buffer protocol, > and raises ` ValueError`` `in case that would be necessary. > If ` None ` , reuses existing memory buffer if possible, copies otherwise. > Default: ` None`. > > In the end I think that's better than strings, and way better than enums - we just can't have that in the main namespace, because we can't change what `False` does. I agree that this is a good API (although not everybody else does). W.r.t. NumPy's API: it could be okay to change the behavior of copy=False to make it more strict (no copies ever), because then at least errors will be raised and we can provide a message with instructions on how to fix it. St?fan -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Tue Jul 6 22:05:41 2021 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Tue, 06 Jul 2021 21:05:41 -0500 Subject: [Numpy-discussion] NumPy Community Meeting Wednesday Message-ID: <50e7b4fd9567a93a48e19a8b08a7a7735ffeb86b.camel@sipsolutions.net> Hi all, There will be a NumPy Community meeting Wednesday July 7th at 20:00 UTC. Everyone is invited and encouraged to join in and edit the work-in-progress meeting topics and notes at: https://hackmd.io/76o-IxCjQX2mOXO_wwkcpg?both Best wishes Sebastian From sebastian at sipsolutions.net Wed Jul 7 11:51:52 2021 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Wed, 07 Jul 2021 10:51:52 -0500 Subject: [Numpy-discussion] Floating point warnings/errors for comparisons, etc.? Message-ID: Hi all, I am trying to clean up our floating point warning handling: https://github.com/numpy/numpy/pull/19316 And an upcoming PR to remove most floating point error clearing. There are some things I am unsure about, though. Part of why it got so confusing, is that GCC seemed to have fixed/changed their behaviour for comparisons with NaN. In GCC 7 it did not give the warning, but GCC 8 does (correctly). Comparison with NaN ------------------- IEEE says that the default comparisons should give warnings for comparison with NaN (except == and !=). And notes that an alternative should be provided (C99 does this with `isless`, etc.). We currently break this by suppressing invalid value warnings for all comparisons (e.g. also `NaN > 0.`). We can easily do either version (aside possibly compiler issues). Making it give warnings had one test case fail for `quantile`, which uses the pattern: if not (np.all(q >= 0) and np.all(q <= 1)): raise ValueError("bad q") This would additionally (and first) give an "invalid value" warning and require `np.errstate(invalid="ignore") to suppress it. I dislike diverging from IEEE, but Python also does not warn for [1]: float("nan") >= 0 and presumably the user either explicitly created the NaN or has seen a warning earlier during computation when the NaN was first created. (IEEE does not distinguish creating a new NaN with `0./0.` from a comparison with `NaN > 0.` [2]. So we can't easily make this settable via `np.errstate` or so.) So, should we ignore the warning here? Compiler Issues --------------- Some compilers may get flags wrong. How much effort do we want to spend on details few users will notice? 
My current problem is `1 % 0` and `divmod(1, 0)`. The MacOS/clang CI does not set the correct "invalid value" warning flag. (The remainder is NaN, so a new NaN is created and that should be indicated but the C99 `fmod` does not set it.) Signalling NaNs --------------- I propose dropping any special concern for signalling NaNs. Which means they raise almost always. Although, rarely we might suppress the warning if we do it manually for normal NaNs [0]. We have two tests which check for behaviour on signalling NaNs. I could not find having any logic to them besides someone being surprised at signalling NaN behaviour at the time ? not based on use-cases. Even functions like `isnan` give a warning for signalling NaNs! The "fix" for anyone having sNaN's is to convert them to qNaNs as early as possible. Which e.g. `np.positive(arr, out=arr)` should probably do. If this becomes an issue, maybe we could have an explicit ufunc. Cheers, Sebastian [0] Mainly it seems SSE2 does not provide some non-error comparisons. So trying to avoid manually clearing errors might make some SSE code considerable slower (e.g. `isfinite`, `np.min`). [1] Probably Python just does not check the CPU warning flags [2] https://www.gnu.org/software/libc/manual/html_node/FP-Exceptions.html From jerry.morrison+numpy at gmail.com Wed Jul 7 15:55:49 2021 From: jerry.morrison+numpy at gmail.com (Jerry Morrison) Date: Wed, 7 Jul 2021 12:55:49 -0700 Subject: [Numpy-discussion] NumPy's BLAS library on macOS? Message-ID: Would someone please answer installation questions about NumPy's BLAS on macOS? I'm not finding the answers in the release notes , the PR source, the docs , or Stack Overflow . Q1. The NumPy 1.21.0 release note says "This change enables the Accelerate Framework as an option on macOS." How to set that option on/off? Q2. How to determine if NumPy uses Accelerate vs. its internal copy of OpenBLAS? After installing a wheel, `numpy.show_config()` shows the openblas_info library_dirs et al as '/usr/local/lib'. Neither '/usr/local/lib/' nor 'site-packages/numpy/' contains a *blas*.so library (for Python 3.8.* on macOS 10.14.6) but the doc says "The OpenBLAS libraries are included in the wheel." Q3. How to pip install NumPy 1.21.0 in a way that ensures it uses its embedded OpenBLAS on macOS as on Linux? I'm aiming for as portable results as possible. Or should we link NumPy to an external OpenBLAS via `pip install numpy --no-binary numpy==1.21.0` with `~/.numpy-site.cfg`? (Ditto for SciPy.) Q4. Can the new NPY_* environment variables select specific BLAS & LAPACK libraries through pip install, and perhaps install faster than building NumPy, SciPy, etc. from source? How to do that? Q5. Is NumPy's embedded OpenBLAS compiled by gcc or clang? Is that controllable via `pip install`? Thank you! -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Wed Jul 7 16:31:42 2021 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Wed, 7 Jul 2021 22:31:42 +0200 Subject: [Numpy-discussion] NumPy's BLAS library on macOS? In-Reply-To: References: Message-ID: On Wed, Jul 7, 2021 at 9:56 PM Jerry Morrison < jerry.morrison+numpy at gmail.com> wrote: > Would someone please answer installation questions about NumPy's BLAS on > macOS? I'm not finding the answers in the release notes > , the PR > source, the docs > , or Stack Overflow > . > > > Q1. The NumPy 1.21.0 release note > says "This change > enables the Accelerate Framework as an option on macOS." How to set > that option on/off? 
> It's autodetected at build time. If you have no other BLAS installed, it will be used. Or explicitly select it with NPY_BLAS_ORDER/NPY_LAPACK_ORDER > Q2. How to determine if NumPy uses Accelerate vs. its internal copy of > OpenBLAS? > After installing a wheel, `numpy.show_config()` shows the openblas_info > library_dirs et al as '/usr/local/lib'. Neither '/usr/local/lib/' nor > 'site-packages/numpy/' contains a *blas*.so library (for Python 3.8.* on > macOS 10.14.6) but the doc says "The > OpenBLAS libraries are included in the wheel." > It's a build-time option, you cannot select it at runtime. > Q3. How to pip install NumPy 1.21.0 in a way that ensures it uses its > embedded OpenBLAS on macOS as on Linux? I'm aiming for as portable results > as possible. Or should we link NumPy to an external OpenBLAS via `pip > install numpy --no-binary numpy==1.21.0` with `~/.numpy-site.cfg`? (Ditto > for SciPy.) > If you install a wheel, you will always get the bundled OpenBLAS on every platform for which we have binary wheels. > > Q4. Can the new NPY_* environment variables select specific BLAS & LAPACK > libraries through pip install, and perhaps install faster than building > NumPy, SciPy, etc. from source? How to do that? > This question seems a little bit confused. Those env vars just select the BLAS/LAPACK library. It will not affect build time - we're never building BLAS or LAPACK itself from source. > Q5. Is NumPy's embedded OpenBLAS compiled by gcc or clang? Is that > controllable via `pip install`? > gcc/gfortran. and no, you cannot control it through pip Cheers, Ralf > Thank you! > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jerry.morrison+numpy at gmail.com Wed Jul 7 19:23:24 2021 From: jerry.morrison+numpy at gmail.com (Jerry Morrison) Date: Wed, 7 Jul 2021 16:23:24 -0700 Subject: [Numpy-discussion] NumPy's BLAS library on macOS? In-Reply-To: References: Message-ID: Got it! *Summary:* * Installing a numpy wheel (e.g. `pip install numpy==1.21.0`) uses its embedded OpenBLAS on every platform that has a wheel. That OpenBLAS is always compiled with gcc/gfortran. In this case, `np.show_config()` reports `library_dirs = ['/usr/local/lib']` even though there's no libblas in that directory. * Installing numpy from source (e.g. `pip install numpy==1.21.0 --no-binary numpy)` looks for BLAS & LAPACK libraries at build time as influenced by the environment vars NPY_BLAS_ORDER/NPY_LAPACK_ORDER or by the file ~/.numpy-site.cfg. On macOS, 'accelerate' is in the default search order after 'openblas'. On macOS < 11.3, importing numpy that's linked to Accelerate will detect an Accelerate bug and raise a RuntimeError. On Wed, Jul 7, 2021 at 1:32 PM Ralf Gommers wrote: > > > On Wed, Jul 7, 2021 at 9:56 PM Jerry Morrison < > jerry.morrison+numpy at gmail.com> wrote: > >> Would someone please answer installation questions about NumPy's BLAS on >> macOS? I'm not finding the answers in the release notes >> , the PR >> source, the docs >> , or Stack Overflow >> . >> >> >> Q1. The NumPy 1.21.0 release note >> says "This change >> enables the Accelerate Framework as an option on macOS." How to set >> that option on/off? >> > > It's autodetected at build time. If you have no other BLAS installed, it > will be used. Or explicitly select it with NPY_BLAS_ORDER/NPY_LAPACK_ORDER > > >> Q2. 
How to determine if NumPy uses Accelerate vs. its internal copy of >> OpenBLAS? >> After installing a wheel, `numpy.show_config()` shows the openblas_info >> library_dirs et al as '/usr/local/lib'. Neither '/usr/local/lib/' nor >> 'site-packages/numpy/' contains a *blas*.so library (for Python 3.8.* on >> macOS 10.14.6) but the doc says "The >> OpenBLAS libraries are included in the wheel." >> > > It's a build-time option, you cannot select it at runtime. > > >> Q3. How to pip install NumPy 1.21.0 in a way that ensures it uses its >> embedded OpenBLAS on macOS as on Linux? I'm aiming for as portable results >> as possible. Or should we link NumPy to an external OpenBLAS via `pip >> install numpy --no-binary numpy==1.21.0` with `~/.numpy-site.cfg`? (Ditto >> for SciPy.) >> > > If you install a wheel, you will always get the bundled OpenBLAS on every > platform for which we have binary wheels. > > >> >> Q4. Can the new NPY_* environment variables select specific BLAS & LAPACK >> libraries through pip install, and perhaps install faster than building >> NumPy, SciPy, etc. from source? How to do that? >> > > This question seems a little bit confused. Those env vars just select the > BLAS/LAPACK library. It will not affect build time - we're never building > BLAS or LAPACK itself from source. > > >> Q5. Is NumPy's embedded OpenBLAS compiled by gcc or clang? Is that >> controllable via `pip install`? >> > > gcc/gfortran. and no, you cannot control it through pip > > Cheers, > Ralf > > >> Thank you! >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From matti.picus at gmail.com Thu Jul 8 01:14:10 2021 From: matti.picus at gmail.com (Matti Picus) Date: Thu, 8 Jul 2021 08:14:10 +0300 Subject: [Numpy-discussion] NumPy's BLAS library on macOS? In-Reply-To: References: Message-ID: On 8/7/21 2:23 am, Jerry Morrison wrote: > Got it! > > *Summary:* > * Installing a numpy wheel?(e.g. `pip install numpy==1.21.0`) uses its > embedded OpenBLAS on every platform that has a wheel. > ??That OpenBLAS is always compiled with gcc/gfortran. > ? In this case, `np.show_config()` reports `library_dirs = > ['/usr/local/lib']` even though there's no libblas in that directory. > > * Installing numpy from source (e.g. `pip install numpy==1.21.0 > --no-binary numpy)` looks for BLAS & LAPACK libraries at build time as > influenced by the environment vars NPY_BLAS_ORDER/NPY_LAPACK_ORDER or > by the file ~/.numpy-site.cfg. > ? On macOS, 'accelerate' is in the default search order after 'openblas'. > ? On macOS detect an Accelerate bug and raise a RuntimeError. That seems correct, although admittedly show_config could do a better job. The problem is that not every BLAS implementation provides a convenient method to self-report. It might be nice to document all this somewhere more permanent, the docstring for show_config might be a good place to start. Matti From jerry.morrison+numpy at gmail.com Thu Jul 8 17:46:45 2021 From: jerry.morrison+numpy at gmail.com (Jerry Morrison) Date: Thu, 8 Jul 2021 14:46:45 -0700 Subject: [Numpy-discussion] NumPy's BLAS library on macOS? 
In-Reply-To: References: Message-ID: On Wed, Jul 7, 2021 at 10:14 PM Matti Picus wrote: > > On 8/7/21 2:23 am, Jerry Morrison wrote: > > Got it! > > > > *Summary:* > > * Installing a numpy wheel (e.g. `pip install numpy==1.21.0`) uses its > > embedded OpenBLAS on every platform that has a wheel. > > That OpenBLAS is always compiled with gcc/gfortran. > > In this case, `np.show_config()` reports `library_dirs = > > ['/usr/local/lib']` even though there's no libblas in that directory. > > > > * Installing numpy from source (e.g. `pip install numpy==1.21.0 > > --no-binary numpy)` looks for BLAS & LAPACK libraries at build time as > > influenced by the environment vars NPY_BLAS_ORDER/NPY_LAPACK_ORDER or > > by the file ~/.numpy-site.cfg. > > On macOS, 'accelerate' is in the default search order after 'openblas'. > > On macOS < 11.3, importing numpy that's linked to Accelerate will > > detect an Accelerate bug and raise a RuntimeError. > > > That seems correct, although admittedly show_config could do a better > job. The problem is that not every BLAS implementation provides a > convenient method to self-report. > For implementations that don't self-report, could show_config detect that it's using embedded OpenBLAS or a system Accelerate library? > > It might be nice to document all this somewhere more permanent, the > docstring for show_config might be a good place to start. > Agreed. On the https://numpy.org/install/ installation page? Do you want a PR? (How to get the translations?) -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Fri Jul 9 05:20:16 2021 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Fri, 9 Jul 2021 11:20:16 +0200 Subject: [Numpy-discussion] NumPy's BLAS library on macOS? In-Reply-To: References: Message-ID: On Thu, Jul 8, 2021 at 11:47 PM Jerry Morrison < jerry.morrison+numpy at gmail.com> wrote: > > On Wed, Jul 7, 2021 at 10:14 PM Matti Picus wrote: > >> >> On 8/7/21 2:23 am, Jerry Morrison wrote: >> > Got it! >> > >> > *Summary:* >> > * Installing a numpy wheel (e.g. `pip install numpy==1.21.0`) uses its >> > embedded OpenBLAS on every platform that has a wheel. >> > That OpenBLAS is always compiled with gcc/gfortran. >> > In this case, `np.show_config()` reports `library_dirs = >> > ['/usr/local/lib']` even though there's no libblas in that directory. >> > >> > * Installing numpy from source (e.g. `pip install numpy==1.21.0 >> > --no-binary numpy)` looks for BLAS & LAPACK libraries at build time as >> > influenced by the environment vars NPY_BLAS_ORDER/NPY_LAPACK_ORDER or >> > by the file ~/.numpy-site.cfg. >> > On macOS, 'accelerate' is in the default search order after >> 'openblas'. >> > On macOS < 11.3, importing numpy that's linked to Accelerate will >> > detect an Accelerate bug and raise a RuntimeError. >> >> >> That seems correct, although admittedly show_config could do a better >> job. The problem is that not every BLAS implementation provides a >> convenient method to self-report. >> > > For implementations that don't self-report, could show_config detect that > it's using embedded OpenBLAS or a system Accelerate library? > It's not about "self reporting". What `show_config` shows for BLAS/LAPACK is the build time configuration, not the one at runtime. The paths are present in `numpy/__config.py__`, which is a file generated by the build process. > > >> >> It might be nice to document all this somewhere more permanent, the >> docstring for show_config might be a good place to start. 
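For anyone following along, the build-time information being discussed can already be inspected from Python. A minimal sketch (the output differs per build, and `get_info` returns an empty dict for sections that were not configured):

    import numpy as np

    # Printed summary of the BLAS/LAPACK setup detected when this NumPy was built
    np.show_config()

    # The same data as plain dicts, generated into numpy/__config__.py at build
    # time -- it describes the build, not what is actually loaded at runtime
    print(np.__config__.get_info('openblas_info'))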
>> > > Agreed. On the https://numpy.org/install/ installation page? > Do you want a PR? (How to get the translations?) > Thanks, a PR would be nice. It's too detailed for the website; Matti's suggestion was in the docstring of `show_config`, which is defined at https://github.com/numpy/numpy/blob/main/numpy/distutils/misc_util.py#L2332 Cheers, Ralf > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Tue Jul 13 17:15:49 2021 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Tue, 13 Jul 2021 16:15:49 -0500 Subject: [Numpy-discussion] copy="never" discussion and no deprecation cycle? In-Reply-To: <06b21c5a-c5c9-4169-91ca-808eafd4ac01@www.fastmail.com> References: <224c4894-4ed2-a6ae-bd7a-2de8ce3b7d02@gmail.com> <18d0ca87-7bf8-4629-9781-364a75b92a94@www.fastmail.com> <9e960b67-2556-4485-824f-13c4e83868be@www.fastmail.com> <8E0F2CD0-CBBC-4252-8B6D-9D69E7E8DCAB@fastmail.com> <8c0dd25f-8df6-4d59-ae47-0961cf6b95db@www.fastmail.com> <30aebe03-f4fb-412b-8f03-f47c7abbfca6@www.fastmail.com> <06b21c5a-c5c9-4169-91ca-808eafd4ac01@www.fastmail.com> Message-ID: On Mon, 2021-07-05 at 11:17 -0700, Stefan van der Walt wrote: > On Mon, Jul 5, 2021, at 00:42, Ralf Gommers wrote: > > I share your dislike, but I don't really see a better place where > > it doesn't make it even harder to spell, but I did just think of an > > alternative that may actually be quite reasonable: keep it private. > > That would be fine.? We haven't had this feature requested for many > years, so as long as it is available in some shape or form it should > satisfy the advanced users who need it.? It also doesn't force us > into a decision we cannot reverse (adding to the top-level API). > I am happy with (semi?)-private. Although, I would prefer a long-term goal we can work towards. > > The reason why Gagandeep started working on this is so we can have > > the never-copy behavior in the `numpy.array_api` namespace. For the > > `asarray` function there, the `copy` keyword is still boolean, with > > description: > > > > ??? Whether or not to make a copy of the input. If ` True`, always > > copies. > > ??? If ` False`, never copies for input which supports DLPack or > > the buffer protocol, > > ??? and raises ` ValueError`` `in case that would be necessary. > > ??? If ` None ` , reuses existing memory buffer if possible, copies > > otherwise. > > ??? Default: ` None`. > > > > In the end I think that's better than strings, and way better than > > enums - we just can't have that in the main namespace, because we > > can't change what `False` does. > If we can converge on this as an ideal API, should we really keep `copy=False` around without a warning? And if tag on a warning, maybe we may as well migrate NumPy itself (excruciatingly slow if necessary)? We seem to find some principle to dislike every single idea (I am probably forgetting a few): * Enums: * Namespace bloat * (Maybe clunky spelling) * Strings: * not strictly backward compatible (if accidentally used on old versions or potentially `__array_function__`.) * Slow to transition necessary * (Possibly not a good mix with `True/False` in general) * Transition `copy={True, False, None}`: * "Terrible API for a 3-way option" * some users have to update their code (libraries more than end-users, and libraries are easier to update). 
and I am honestly not sure that any of those is worrying. My preference would be to decide on the ideal API, and then move towards it. And if we don't think `CopyMode` is the right solution, then it should be added only "semi-public": probably with an underscore and documented to go away again, but allowing a pattern of:

    if np.__version__ >= "1.22.0":
        if hasattr(np, "_CopyMode"):
            never_copy = np._CopyMode.NEVER
        else:
            never_copy = "never"
    else:
        ...  # oops

for libraries that need to work around transition difficulties. About a NEP: I am not sure we need one, although I am not opposed. It may make sense... Especially if whatever we converge on violates some written or unwritten "policy". However, I am wary to bring up a possible NEP if there is no clarity of where things are going. IMO, a NEP should be a concrete proposal, and that means that whoever writes it must have confidence in a proposal. If we transitioned from the brain-storming stage to a "formal decision making" one, then maybe a NEP is what we need. But I currently don't know what the concrete proposal would be.

Cheers,

Sebastian

> I agree <https://github.com/numpy/numpy/pull/19173#issuecomment-858226896>
> that this is a good API (although not everybody else does)
> <https://github.com/numpy/numpy/pull/19173#issuecomment-860314626>.
>
> W.r.t. NumPy's API: it could be okay to change the behavior of
> copy=False to make it more strict (no copies ever), because then at
> least errors will be raised and we can provide a message with
> instructions on how to fix it.
>
> Stéfan
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion

From sebastian at sipsolutions.net Tue Jul 13 21:27:13 2021 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Tue, 13 Jul 2021 20:27:13 -0500 Subject: [Numpy-discussion] NumPy Development Meeting Wednesday Message-ID: <2cb4f8a27719336641f08eabd1bae5f329aa8ae0.camel@sipsolutions.net> Hi all, Our bi-weekly triage-focused NumPy development meeting is Wednesday, July 14th at 9 am Pacific Time (16:00 UTC). Everyone is invited to join in and edit the work-in-progress meeting topics and notes: https://hackmd.io/68i_JvOYQfy9ERiHgXMPvg I encourage everyone to notify us of issues or PRs that you feel should be prioritized, discussed, or reviewed. Best regards Sebastian From tom.programs at gmail.com Wed Jul 14 11:38:12 2021 From: tom.programs at gmail.com (Tom Programming) Date: Wed, 14 Jul 2021 18:38:12 +0300 Subject: [Numpy-discussion] sinpi/cospi trigonometric functions Message-ID: <93FAA9D1-59F6-41B4-BAFB-577356097FFA@gmail.com> Hi all, (I am very new to this mail list so please cut me some slack) trigonometric functions like sin(x) are usually implemented as: 1. Some very complicated function that does bit twiddling and basically computes the remainder of x by pi/2. An example is http://www.netlib.org/fdlibm/e_rem_pio2.c (which calls http://www.netlib.org/fdlibm/k_rem_pio2.c ), i.e. ~500 lines of branching C code. The complexity arises in part because for big values of x the subtraction becomes more and more ill defined, due to x being represented in binary base, to which an irrational number has to be subtracted, and consecutive floating point values being further and further apart for higher absolute values. 2. A Taylor series for the small values of x, 3. Plus some manipulation to get the correct branch, deal with subnormal numbers, deal with -0, etc... (a short numerical illustration follows below)
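To make the rounding issue concrete, here is a small illustration (a sketch only: the exact values are platform and library dependent, and `naive_sinpi` is a hypothetical helper written for this example, not an existing NumPy function):

    import numpy as np

    # Multiply-then-reduce: 10*np.pi is not an exact multiple of the true pi,
    # so the result is a tiny nonzero number instead of 0.
    print(np.sin(np.pi * 10))     # about -1.2e-15 rather than 0.0

    def naive_sinpi(x):
        # Reduce-then-multiply sketch of sinpi(x) = sin(pi*x): split off the
        # nearest integer first (that subtraction is exact for ordinary
        # values), and multiply by pi only at the very end.
        x = np.asarray(x, dtype=float)
        n = np.round(x)
        r = x - n                                  # in [-0.5, 0.5]
        sign = np.where(n % 2 == 0, 1.0, -1.0)     # sin(pi*(n+r)) = (-1)**n * sin(pi*r)
        return sign * np.sin(np.pi * r)

    print(naive_sinpi(10.0))      # exactly 0.0
    print(naive_sinpi(0.5))       # 1.0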
If we used a function like sinpi(x) = sin(pi*x) part (1) can be greatly simplified, since it becomes trivial to separate the reminder of the division by pi/2. There are gains both in the accuracy and the performance. In large parts of the code anyways there is a pi inside the argument of sin since it is common to compute something like sin(2*pi*f*t) etc. So I wonder if it is feasible to implement those functions in numpy. To strengthen my argument I'll note that the IEEE standard, too, defines ( https://irem.univ-reunion.fr/IMG/pdf/ieee-754-2008.pdf ) the functions sinPi, cosPi, tanPi, atanPi, atan2Pi. And there are existing implementations, for example, in Julia ( https://github.com/JuliaLang/julia/blob/6aaedecc447e3d8226d5027fb13d0c3cbfbfea2a/base/special/trig.jl#L741-L745 ) and the Boost C++ Math library ( https://www.boost.org/doc/libs/1_54_0/boost/math/special_functions/sin_pi.hpp ) And that issue caused by apparently inexact calculations have been raised in the past in various forums ( https://stackoverflow.com/questions/20903384/numpy-sinpi-returns-negative-value https://stackoverflow.com/questions/51425732/how-to-make-sinpi-and-cospi-2-zero https://www.reddit.com/r/Python/comments/2g99wa/why_does_python_not_make_sinpi_0_just_really/ ... ) PS: to be nitpicky I see that most implementation implement sinpi as sin(pi*x) for small values of x, i.e. they multiply x by pi and then use the same coefficients for the Taylor series as the canonical sin. A multiply instruction could be spared, in my opinion, by storing different Taylor expansion number coefficients tailored for the sinpi function. It is not clear to me if it is not done because the performance gain is small, because I am wrong about something, or because those 6 constants of the Taylor expansion have a "sacred aura" about them and nobody wants to enter deeply into this. PPS: I am aware that it could be seen as rude to request a feature from an open source project but I am asking if there is a point in providing these functions in the first place. I could try to provide implementations for them in some time if it is indeed a worthwhile effort Yours, Tom. From robert.kern at gmail.com Wed Jul 14 12:15:46 2021 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 14 Jul 2021 12:15:46 -0400 Subject: [Numpy-discussion] sinpi/cospi trigonometric functions In-Reply-To: <93FAA9D1-59F6-41B4-BAFB-577356097FFA@gmail.com> References: <93FAA9D1-59F6-41B4-BAFB-577356097FFA@gmail.com> Message-ID: On Wed, Jul 14, 2021 at 11:39 AM Tom Programming wrote: > Hi all, > > (I am very new to this mail list so please cut me some slack) > > trigonometric functions like sin(x) are usually implemented as: > > 1. Some very complicated function that does bit twiddling and basically > computes the reminder of x by pi/2. An example in > http://www.netlib.org/fdlibm/e_rem_pio2.c (that calls > http://www.netlib.org/fdlibm/k_rem_pio2.c ). i.e. ~500 lines of branching > C code. The complexity arises in part because for big values of x the > subtraction becomes more and more ill defined, due to x being represented > in binary base to which an irrational number has to subtracted and > consecutive floating point values being more and more apart for higher > absolute values. > 2. A Taylor series for the small values of x, > 3. Plus some manipulation to get the correct branch, deal with subnormal > numbers, deal with -0, etc... 
> > If we used a function like sinpi(x) = sin(pi*x) part (1) can be greatly > simplified, since it becomes trivial to separate the reminder of the > division by pi/2. There are gains both in the accuracy and the performance. > > In large parts of the code anyways there is a pi inside the argument of > sin since it is common to compute something like sin(2*pi*f*t) etc. So I > wonder if it is feasible to implement those functions in numpy. > > To strengthen my argument I'll note that the IEEE standard, too, defines ( > https://irem.univ-reunion.fr/IMG/pdf/ieee-754-2008.pdf ) the functions > sinPi, cosPi, tanPi, atanPi, atan2Pi. And there are existing > implementations, for example, in Julia ( > https://github.com/JuliaLang/julia/blob/6aaedecc447e3d8226d5027fb13d0c3cbfbfea2a/base/special/trig.jl#L741-L745 > ) and the Boost C++ Math library ( > https://www.boost.org/doc/libs/1_54_0/boost/math/special_functions/sin_pi.hpp > ) > > And that issue caused by apparently inexact calculations have been raised > in the past in various forums ( > https://stackoverflow.com/questions/20903384/numpy-sinpi-returns-negative-value > https://stackoverflow.com/questions/51425732/how-to-make-sinpi-and-cospi-2-zero > https://www.reddit.com/r/Python/comments/2g99wa/why_does_python_not_make_sinpi_0_just_really/ > ... ) > > PS: to be nitpicky I see that most implementation implement sinpi as > sin(pi*x) for small values of x, i.e. they multiply x by pi and then use > the same coefficients for the Taylor series as the canonical sin. A > multiply instruction could be spared, in my opinion, by storing different > Taylor expansion number coefficients tailored for the sinpi function. It is > not clear to me if it is not done because the performance gain is small, > because I am wrong about something, or because those 6 constants of the > Taylor expansion have a "sacred aura" about them and nobody wants to enter > deeply into this. > The main value of the sinpi(x) formulation is that you can do the reduction on x more accurately than on pi*x (reduce-then-multiply rather than multiply-then-reduce) for people who particularly care about the special locations of half-integer x. sin() and cos() are often not implemented in software, but by CPU instructions, so you don't want to reimplement them. There is likely not a large accuracy win by removing the final multiplication. We do have sindg(), cosdg(), and tandg() in scipy.special that work similarly for inputs in degrees rather than radians. They also follow the reduce-then-multiply strategy. scipy.special would be a good place for sinpi() and friends. -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From josh.craig.wilson at gmail.com Wed Jul 14 12:25:10 2021 From: josh.craig.wilson at gmail.com (Joshua Wilson) Date: Wed, 14 Jul 2021 09:25:10 -0700 Subject: [Numpy-discussion] sinpi/cospi trigonometric functions In-Reply-To: References: <93FAA9D1-59F6-41B4-BAFB-577356097FFA@gmail.com> Message-ID: I'll note that SciPy actually does have `sinpi` and `cospi`-they just happen to be private: https://github.com/scipy/scipy/blob/master/scipy/special/functions.json#L58 https://github.com/scipy/scipy/blob/master/scipy/special/functions.json#L12 They are used extensively inside the module though as helpers in other special functions and have extensive tests of their own: https://github.com/scipy/scipy/blob/master/scipy/special/tests/test_mpmath.py#L533 https://github.com/scipy/scipy/blob/master/scipy/special/tests/test_mpmath.py#L547 https://github.com/scipy/scipy/blob/master/scipy/special/tests/test_mpmath.py#L960 https://github.com/scipy/scipy/blob/master/scipy/special/tests/test_mpmath.py#L1741 I have no objections to making them public; the PR is as simple as removing the underscores and adding a docstring. On Wed, Jul 14, 2021 at 9:17 AM Robert Kern wrote: > > On Wed, Jul 14, 2021 at 11:39 AM Tom Programming wrote: >> >> Hi all, >> >> (I am very new to this mail list so please cut me some slack) >> >> trigonometric functions like sin(x) are usually implemented as: >> >> 1. Some very complicated function that does bit twiddling and basically computes the reminder of x by pi/2. An example in http://www.netlib.org/fdlibm/e_rem_pio2.c (that calls http://www.netlib.org/fdlibm/k_rem_pio2.c ). i.e. ~500 lines of branching C code. The complexity arises in part because for big values of x the subtraction becomes more and more ill defined, due to x being represented in binary base to which an irrational number has to subtracted and consecutive floating point values being more and more apart for higher absolute values. >> 2. A Taylor series for the small values of x, >> 3. Plus some manipulation to get the correct branch, deal with subnormal numbers, deal with -0, etc... >> >> If we used a function like sinpi(x) = sin(pi*x) part (1) can be greatly simplified, since it becomes trivial to separate the reminder of the division by pi/2. There are gains both in the accuracy and the performance. >> >> In large parts of the code anyways there is a pi inside the argument of sin since it is common to compute something like sin(2*pi*f*t) etc. So I wonder if it is feasible to implement those functions in numpy. >> >> To strengthen my argument I'll note that the IEEE standard, too, defines ( https://irem.univ-reunion.fr/IMG/pdf/ieee-754-2008.pdf ) the functions sinPi, cosPi, tanPi, atanPi, atan2Pi. And there are existing implementations, for example, in Julia ( https://github.com/JuliaLang/julia/blob/6aaedecc447e3d8226d5027fb13d0c3cbfbfea2a/base/special/trig.jl#L741-L745 ) and the Boost C++ Math library ( https://www.boost.org/doc/libs/1_54_0/boost/math/special_functions/sin_pi.hpp ) >> >> And that issue caused by apparently inexact calculations have been raised in the past in various forums ( https://stackoverflow.com/questions/20903384/numpy-sinpi-returns-negative-value https://stackoverflow.com/questions/51425732/how-to-make-sinpi-and-cospi-2-zero https://www.reddit.com/r/Python/comments/2g99wa/why_does_python_not_make_sinpi_0_just_really/ ... ) >> >> PS: to be nitpicky I see that most implementation implement sinpi as sin(pi*x) for small values of x, i.e. 
they multiply x by pi and then use the same coefficients for the Taylor series as the canonical sin. A multiply instruction could be spared, in my opinion, by storing different Taylor expansion number coefficients tailored for the sinpi function. It is not clear to me if it is not done because the performance gain is small, because I am wrong about something, or because those 6 constants of the Taylor expansion have a "sacred aura" about them and nobody wants to enter deeply into this. > > > The main value of the sinpi(x) formulation is that you can do the reduction on x more accurately than on pi*x (reduce-then-multiply rather than multiply-then-reduce) for people who particularly care about the special locations of half-integer x. sin() and cos() are often not implemented in software, but by CPU instructions, so you don't want to reimplement them. There is likely not a large accuracy win by removing the final multiplication. > > We do have sindg(), cosdg(), and tandg() in scipy.special that work similarly for inputs in degrees rather than radians. They also follow the reduce-then-multiply strategy. scipy.special would be a good place for sinpi() and friends. > > -- > Robert Kern > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion From robert.kern at gmail.com Wed Jul 14 12:29:25 2021 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 14 Jul 2021 12:29:25 -0400 Subject: [Numpy-discussion] sinpi/cospi trigonometric functions In-Reply-To: References: <93FAA9D1-59F6-41B4-BAFB-577356097FFA@gmail.com> Message-ID: On Wed, Jul 14, 2021 at 12:26 PM Joshua Wilson wrote: > I'll note that SciPy actually does have `sinpi` and `cospi`-they just > happen to be private: > > https://github.com/scipy/scipy/blob/master/scipy/special/functions.json#L58 > https://github.com/scipy/scipy/blob/master/scipy/special/functions.json#L12 > > They are used extensively inside the module though as helpers in other > special functions and have extensive tests of their own: > > > https://github.com/scipy/scipy/blob/master/scipy/special/tests/test_mpmath.py#L533 > > https://github.com/scipy/scipy/blob/master/scipy/special/tests/test_mpmath.py#L547 > > https://github.com/scipy/scipy/blob/master/scipy/special/tests/test_mpmath.py#L960 > > https://github.com/scipy/scipy/blob/master/scipy/special/tests/test_mpmath.py#L1741 > > I have no objections to making them public; the PR is as simple as > removing the underscores and adding a docstring. > Delightful! -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Thu Jul 15 06:21:43 2021 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Thu, 15 Jul 2021 12:21:43 +0200 Subject: [Numpy-discussion] reducing effort spent on wheel builds? Message-ID: Hey all, This whole thread is quite interesting: https://twitter.com/zooba/status/1415440484181417998. Given how much effort we are spending on really niche wheel builds, I?m wondering if we should just draw a line somewhere: - we do what we do now for the main platforms: Windows, Linux (x86, aarch64), macOS, *but*: - no wheels for ppc64le - no wheels for Alpine Linux - no wheels for PyPy - no wheels for Raspberry Pi, AIX or whatever other niche thing comes next. - drop 32-bit Linux in case it is becoming an annoyance. 
This is not an actual proposal (yet) and I should sleep on this some more, but I've seen Chuck and Matti burn a lot of time on the numpy-wheels repo again recently, and I've done the same for SciPy. The situation is not very sustainable and needs a rethink. The current recipe is "someone who cares about a platform writes a PEP, then pip/wheel add a platform tag for it (very little work), and then the maintainers of each Python package are now responsible for wheel builds (a ton of work)". Most of these platforms have package managers, which are all more capable than pip et al., and if they don't then wheels can be hosted elsewhere (example: https://www.piwheels.org/). And then there's Conda, Nix, Spack, etc. too of course. Drawing a line somewhere distributes the workload, where packagers who care about some platform and have better tools at hand can do the packaging, and maintainers can go do something with more impact like write new code or review PRs. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From evgeny.burovskiy at gmail.com Thu Jul 15 06:52:43 2021 From: evgeny.burovskiy at gmail.com (Evgeni Burovski) Date: Thu, 15 Jul 2021 13:52:43 +0300 Subject: [Numpy-discussion] reducing effort spent on wheel builds? In-Reply-To: References: Message-ID: FWIW, here's a big fat +1 from me for spreading the load. I'd even advocate for trimming the CI and going for "We officially support this (small) list of platforms and gladly accept patches for anything else as long as they do not break the officially supported ones". ISTM the list of supported platforms (both wheels and CI) has grown too large and seems to only grow in time. The user requests are perfectly understandable, everyone wants to be in the press-the-button-and-it-works world, but the maintainer/RM effort seems to be crossing over to being too much. A perfect picture in this regard is probably what we had a couple of years ago with Gohlke Windows wheels. Christoph was doing a great job maintaining the wheels and sending patches. For more exotic platforms, there simply has to be a champion. Or if some entity wants to fund the work for some platform, great --- this should also live somewhere outside of the main repo/CI/pypi. Whether to upload sdists to PYPI --- this is a bit orthogonal, but indeed maybe it's best to only upload the small set of wheels indeed. sdists are just stored in the GH releases and it's not a big deal to grab them from GH (or even pip install from the release tag directly). This would probably improve user experience too (no more cryptic errors from attempting to compile from source on an unsuspecting user). My 2p, Evgeni On Thu, Jul 15, 2021 at 1:22 PM Ralf Gommers wrote: > > Hey all, > > This whole thread is quite interesting: https://twitter.com/zooba/status/1415440484181417998. Given how much effort we are spending on really niche wheel builds, I?m wondering if we should just draw a line somewhere: > > we do what we do now for the main platforms: Windows, Linux (x86, aarch64), macOS, *but*: > no wheels for ppc64le > no wheels for Alpine Linux > no wheels for PyPy > no wheels for Raspberry Pi, AIX or whatever other niche thing comes next. > drop 32-bit Linux in case it is becoming an annoyance. > > This is not an actual proposal (yet) and I should sleep on this some more, but I've seen Chuck and Matti burn a lot of time on the numpy-wheels repo again recently, and I've done the same for SciPy. The situation is not very sustainable and needs a rethink. 
> > The current recipe is "someone who cares about a platform writes a PEP, then pip/wheel add a platform tag for it (very little work), and then the maintainers of each Python package are now responsible for wheel builds (a ton of work)". Most of these platforms have package managers, which are all more capable than pip et al., and if they don't then wheels can be hosted elsewhere (example: https://www.piwheels.org/). And then there's Conda, Nix, Spack, etc. too of course. > > Drawing a line somewhere distributes the workload, where packagers who care about some platform and have better tools at hand can do the packaging, and maintainers can go do something with more impact like write new code or review PRs. > > > > Cheers, > Ralf > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion From melissawm at gmail.com Thu Jul 15 07:46:44 2021 From: melissawm at gmail.com (=?UTF-8?Q?Melissa_Mendon=C3=A7a?=) Date: Thu, 15 Jul 2021 08:46:44 -0300 Subject: [Numpy-discussion] Newcomer's meeting - later today (4pm UTC)! Message-ID: Hi all! Sorry for the late notice - our next Newcomer's Meeting is today, * July 15, at 4pm UTC.* This is an informal meeting with no agenda to ask questions, get to know other people and (hopefully) figure out ways to contribute to NumPy. Feel free to join if you are lurking around but found it hard to start contributing - we'll do our best to support you. If you wish to join on Zoom, use this link: https://zoom.us/j/6345425936 Hope to see you around! ** You can click this link to get the correct time at your timezone: https://www.timeanddate.com/worldclock/fixedtime.html?msg=NumPy+Newcomer%27s+Meeting&iso=20210715T16&p1=1440&ah=1 *** You can add the NumPy community calendar to your google calendar by clicking this link: https://calendar.google.com/calendar /r?cid=YmVya2VsZXkuZWR1X2lla2dwaWdtMjMyamJobGRzZmIyYzJqODFjQGdyb3VwLmNhbGVuZGFyLmdvb2dsZS5jb20 - Melissa -------------- next part -------------- An HTML attachment was scrubbed... URL: From andyfaff at gmail.com Thu Jul 15 08:21:56 2021 From: andyfaff at gmail.com (Andrew Nelson) Date: Thu, 15 Jul 2021 22:21:56 +1000 Subject: [Numpy-discussion] reducing effort spent on wheel builds? In-Reply-To: References: Message-ID: I'd be +1 on reducing the number of wheels to reduce maintainer effort (esp. the 32 bit builds). However, I think reducing the CI dedication (+maintainer effort towards that) to those minority platforms would inevitably increase the risk of numpy/scipy failing to work on them, and possibly risk not being installable on them. Would we want to drop them without having a positive assurance that the minority platform/package maintainers would keep checking for numpy/scipy health; I guess one can make that choice when you know just how many times those exotic wheels are downloaded (modulo CI runs). I'd like to see sdists being kept on PyPI. Those minority platforms may keep patching scipy/numpy source to keep it installable from source, and pip knows where to find it. I suspect the majority platforms will have all the wheels they need, so only the small set of exotic platforms will use sdist, with those users being more capable of dealing with the aftermath. On Thu, 15 Jul 2021 at 20:53, Evgeni Burovski wrote: > FWIW, here's a big fat +1 from me for spreading the load. 
I'd even > advocate for trimming the CI and going for "We officially support this > (small) list of platforms and gladly accept patches for anything else > as long as they do not break the officially supported ones". ISTM the > list of supported platforms (both wheels and CI) has grown too large > and seems to only grow in time. > > The user requests are perfectly understandable, everyone wants to be > in the press-the-button-and-it-works world, but the maintainer/RM > effort seems to be crossing over to being too much. > > A perfect picture in this regard is probably what we had a couple of > years ago with Gohlke Windows wheels. Christoph was doing a great job > maintaining the wheels and sending patches. For more exotic platforms, > there simply has to be a champion. Or if some entity wants to fund the > work for some platform, great --- this should also live somewhere > outside of the main repo/CI/pypi. > > Whether to upload sdists to PYPI --- this is a bit orthogonal, but > indeed maybe it's best to only upload the small set of wheels indeed. > sdists are just stored in the GH releases and it's not a big deal to > grab them from GH (or even pip install from the release tag directly). > This would probably improve user experience too (no more cryptic > errors from attempting to compile from source on an unsuspecting > user). > > My 2p, > > Evgeni > > On Thu, Jul 15, 2021 at 1:22 PM Ralf Gommers > wrote: > > > > Hey all, > > > > This whole thread is quite interesting: > https://twitter.com/zooba/status/1415440484181417998. Given how much > effort we are spending on really niche wheel builds, I?m wondering if we > should just draw a line somewhere: > > > > we do what we do now for the main platforms: Windows, Linux (x86, > aarch64), macOS, *but*: > > no wheels for ppc64le > > no wheels for Alpine Linux > > no wheels for PyPy > > no wheels for Raspberry Pi, AIX or whatever other niche thing comes next. > > drop 32-bit Linux in case it is becoming an annoyance. > > > > This is not an actual proposal (yet) and I should sleep on this some > more, but I've seen Chuck and Matti burn a lot of time on the numpy-wheels > repo again recently, and I've done the same for SciPy. The situation is not > very sustainable and needs a rethink. > > > > The current recipe is "someone who cares about a platform writes a PEP, > then pip/wheel add a platform tag for it (very little work), and then the > maintainers of each Python package are now responsible for wheel builds (a > ton of work)". Most of these platforms have package managers, which are all > more capable than pip et al., and if they don't then wheels can be hosted > elsewhere (example: https://www.piwheels.org/). And then there's Conda, > Nix, Spack, etc. too of course. > > > > Drawing a line somewhere distributes the workload, where packagers who > care about some platform and have better tools at hand can do the > packaging, and maintainers can go do something with more impact like write > new code or review PRs. > > > > > > > > Cheers, > > Ralf > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -- _____________________________________ Dr. 
Andrew Nelson _____________________________________ -------------- next part -------------- An HTML attachment was scrubbed... URL: From matti.picus at gmail.com Thu Jul 15 08:40:33 2021 From: matti.picus at gmail.com (Matti Picus) Date: Thu, 15 Jul 2021 15:40:33 +0300 Subject: [Numpy-discussion] reducing effort spent on wheel builds? In-Reply-To: References: Message-ID: <65a423a1-c126-2dc9-426d-44c3292683f5@gmail.com> On 15/7/21 1:21 pm, Ralf Gommers wrote: > Hey all, > > I've seen Chuck and Matti burn a lot of time on the numpy-wheels repo > again recently, and I've done the same for SciPy. ... > > Cheers, > Ralf Since my name was mentioned, the things I have spent time on for wheel packaging in order of time spent (not scientifically measured rather looking back at the closed PRs and thinking "oh yeah, that was painful") have been - Updating manylnux to move away from 10 year old glibc on linux (still stuck and not clear how to finish it [0]) - Moving from travis-ci.org to travis-ci.com (with the panic around build credits) and from Appveyor to github actions/azure pipelines - Moving from rackspace's wheel hosting to anaconda.org - Working around CI failures with aarch64 for linux, mainly due to shortcomings in the free CI providers - Bugs in dependencies: avoiding buggy Accelerate when building NumPy and debugging Windows/aarch64 problems with OpenBLAS - Updating OpenBLAS versions - Shepherding Apple M1 hardware through the manylinux/multibuild/wheel pipeline (all the hard work was done by others) - Trying to make 64-bit interfaces for OpenBLAS work (99.9% done by Chuck) - Updating PyPy versions Only the last one, which was actually the least painful, would be helped by Ralf's list. On the other hand, packaging is made harder as more technologies go into a wheel build. The twitter thread started with "SciPy added a required dependency on a technology which broke things, but people stepped up to fix the problem quickly" and morphed into "lets drop wheels for lesser used platforms". I am not sure the discussion should have moved away from the first point so quickly. Perhaps there should be some discussion of the cost of adding new build dependencies and somehow making the dependency conditional for a release or two until all the worst kinks are worked out. For the record, I am +1 on removing sdists from PyPI until pip changes its default to --only-binary :all: [1] Matti [0] https://github.com/pypa/manylinux/issues/1012 [1] https://github.com/pypa/pip/issues/9140 From ralf.gommers at gmail.com Thu Jul 15 09:11:03 2021 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Thu, 15 Jul 2021 15:11:03 +0200 Subject: [Numpy-discussion] reducing effort spent on wheel builds? In-Reply-To: <65a423a1-c126-2dc9-426d-44c3292683f5@gmail.com> References: <65a423a1-c126-2dc9-426d-44c3292683f5@gmail.com> Message-ID: On Thu, Jul 15, 2021 at 2:41 PM Matti Picus wrote: > On 15/7/21 1:21 pm, Ralf Gommers wrote: > > > Hey all, > > > > I've seen Chuck and Matti burn a lot of time on the numpy-wheels repo > > again recently, and I've done the same for SciPy. ... 
> > > > Cheers, > > Ralf > > > Since my name was mentioned, the things I have spent time on for wheel > packaging in order of time spent (not scientifically measured rather > looking back at the closed PRs and thinking "oh yeah, that was painful") > have been > > - Updating manylnux to move away from 10 year old glibc on linux (still > stuck and not clear how to finish it [0]) > > - Moving from travis-ci.org to travis-ci.com (with the panic around > build credits) and from Appveyor to github actions/azure pipelines > > - Moving from rackspace's wheel hosting to anaconda.org > > - Working around CI failures with aarch64 for linux, mainly due to > shortcomings in the free CI providers > > - Bugs in dependencies: avoiding buggy Accelerate when building NumPy > and debugging Windows/aarch64 problems with OpenBLAS > > - Updating OpenBLAS versions > > - Shepherding Apple M1 hardware through the manylinux/multibuild/wheel > pipeline (all the hard work was done by others) > > - Trying to make 64-bit interfaces for OpenBLAS work (99.9% done by Chuck) > > - Updating PyPy versions > Thanks Matti! This list is super helpful. > Only the last one, which was actually the least painful, would be helped > by Ralf's list. > Not so sure about that - probably the single biggest pain points are CI providers (especially the exotic ones) and OpenBLAS - a dependency we struggle with mostly because of wheels. Without ppc64le and s390x (forgot that one) we wouldn't need TravisCI at all for example, and we would have less work on https://github.com/MacPython/openblas-libs too. On the other hand, packaging is made harder as more technologies go into > a wheel build. The twitter thread started with "SciPy added a required > dependency on a technology which broke things, but people stepped up to > fix the problem quickly" and morphed into "lets drop wheels for lesser > used platforms". I am not sure the discussion should have moved away > from the first point so quickly. Perhaps there should be some discussion > of the cost of adding new build dependencies and somehow making the > dependency conditional for a release or two until all the worst kinks > are worked out. > Not quite the right mailing list, but: it *is* an optional dependency in scipy 1.7.0, and there has been no proposal so far to make it required. > > For the record, I am +1 on removing sdists from PyPI until pip changes > its default to --only-binary :all: [1] > > Matti > > > [0] https://github.com/pypa/manylinux/issues/1012 Ah yes:( That just seems like a major oversight in the perennial manylinux spec. If no one fixes it, guess we wait till we can jump to `_2_??` with a higher GCC requirement. Cheers, Ralf > [1] https://github.com/pypa/pip/issues/9140 > -------------- next part -------------- An HTML attachment was scrubbed... URL: From kevin.k.sheppard at gmail.com Thu Jul 15 09:44:53 2021 From: kevin.k.sheppard at gmail.com (Kevin Sheppard) Date: Thu, 15 Jul 2021 14:44:53 +0100 Subject: [Numpy-discussion] reducing effort spent on wheel builds? In-Reply-To: References: <65a423a1-c126-2dc9-426d-44c3292683f5@gmail.com>, Message-ID: <722D8D57-0B5D-4699-8C9E-FF49A00A2853@hxcore.ol> An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Thu Jul 15 10:02:17 2021 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 15 Jul 2021 08:02:17 -0600 Subject: [Numpy-discussion] reducing effort spent on wheel builds? 
In-Reply-To: References: Message-ID: I spent so much time updating the wheels builds to 64 bit BLAS mostly because - I needed to actually understand how multibuild worked (and docs are minimal). - I don't know powershell (and docs are hard to find). - I don't know azure works at a lower level (and docs are hard to find). And I thought it would all be drop in easy :) What that indicates to me is that we could use a build expert and simpler infrastructure. Anaconda no doubt has people far more knowledgeable about such matters than I am. As to the twitter thread, what is the proposal(s)? I'm perfectly happy to leave support for more exotic platforms to people who have the hardware and make use of it, but no doubt such people have the opposite problem of not knowing much about NumPy. Chuck On Thu, Jul 15, 2021 at 4:22 AM Ralf Gommers wrote: > Hey all, > > This whole thread is quite interesting: > https://twitter.com/zooba/status/1415440484181417998. Given how much > effort we are spending on really niche wheel builds, I?m wondering if we > should just draw a line somewhere: > > - we do what we do now for the main platforms: Windows, Linux (x86, > aarch64), macOS, *but*: > - no wheels for ppc64le > - no wheels for Alpine Linux > - no wheels for PyPy > - no wheels for Raspberry Pi, AIX or whatever other niche thing comes > next. > - drop 32-bit Linux in case it is becoming an annoyance. > > This is not an actual proposal (yet) and I should sleep on this some more, > but I've seen Chuck and Matti burn a lot of time on the numpy-wheels repo > again recently, and I've done the same for SciPy. The situation is not very > sustainable and needs a rethink. > > The current recipe is "someone who cares about a platform writes a PEP, > then pip/wheel add a platform tag for it (very little work), and then the > maintainers of each Python package are now responsible for wheel builds (a > ton of work)". Most of these platforms have package managers, which are all > more capable than pip et al., and if they don't then wheels can be hosted > elsewhere (example: https://www.piwheels.org/). And then there's Conda, > Nix, Spack, etc. too of course. > > Drawing a line somewhere distributes the workload, where packagers who > care about some platform and have better tools at hand can do the > packaging, and maintainers can go do something with more impact like write > new code or review PRs. > > > > Cheers, > Ralf > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Thu Jul 15 10:15:12 2021 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 15 Jul 2021 08:15:12 -0600 Subject: [Numpy-discussion] reducing effort spent on wheel builds? In-Reply-To: References: Message-ID: On Thu, Jul 15, 2021 at 8:02 AM Charles R Harris wrote: > I spent so much time updating the wheels builds to 64 bit BLAS mostly > because > > > - I needed to actually understand how multibuild worked (and docs are > minimal). > - I don't know powershell (and docs are hard to find). > - I don't know azure works at a lower level (and docs are hard to > find). > > > And I thought it would all be drop in easy :) What that indicates to me is > that we could use a build expert and simpler infrastructure. Anaconda no > doubt has people far more knowledgeable about such matters than I am. 
> > As to the twitter thread, what is the proposal(s)? I'm perfectly happy to > leave support for more exotic platforms to people who have the hardware and > make use of it, but no doubt such people have the opposite problem of not > knowing much about NumPy. > > Chuck > > On Thu, Jul 15, 2021 at 4:22 AM Ralf Gommers > wrote: > >> Hey all, >> >> This whole thread is quite interesting: >> https://twitter.com/zooba/status/1415440484181417998. Given how much >> effort we are spending on really niche wheel builds, I?m wondering if we >> should just draw a line somewhere: >> >> - we do what we do now for the main platforms: Windows, Linux (x86, >> aarch64), macOS, *but*: >> - no wheels for ppc64le >> - no wheels for Alpine Linux >> - no wheels for PyPy >> - no wheels for Raspberry Pi, AIX or whatever other niche thing comes >> next. >> - drop 32-bit Linux in case it is becoming an annoyance. >> >> This is not an actual proposal (yet) and I should sleep on this some >> more, but I've seen Chuck and Matti burn a lot of time on the numpy-wheels >> repo again recently, and I've done the same for SciPy. The situation is not >> very sustainable and needs a rethink. >> >> The current recipe is "someone who cares about a platform writes a PEP, >> then pip/wheel add a platform tag for it (very little work), and then the >> maintainers of each Python package are now responsible for wheel builds (a >> ton of work)". Most of these platforms have package managers, which are all >> more capable than pip et al., and if they don't then wheels can be hosted >> elsewhere (example: https://www.piwheels.org/). And then there's Conda, >> Nix, Spack, etc. too of course. >> >> Drawing a line somewhere distributes the workload, where packagers who >> care about some platform and have better tools at hand can do the >> packaging, and maintainers can go do something with more impact like write >> new code or review PRs. >> >> >> >> Cheers, >> Ralf >> >> Let me add that distutils brings it's own set of problems. A better build system could help make building wheels simpler on various platforms. But then, someone would also need to become expert in that build system. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From evgeny.burovskiy at gmail.com Thu Jul 15 10:26:34 2021 From: evgeny.burovskiy at gmail.com (Evgeni Burovski) Date: Thu, 15 Jul 2021 17:26:34 +0300 Subject: [Numpy-discussion] reducing effort spent on wheel builds? In-Reply-To: References: Message-ID: > However, I think reducing the CI dedication (+maintainer effort towards that) to those minority platforms would inevitably increase the risk of numpy/scipy failing to work on them, and possibly risk not being installable on them. Would we want to drop them without having a positive assurance that the minority platform/package maintainers would keep checking for numpy/scipy health Nothing blocks a platform champion from running the CI on their scipy fork. This way, the burden is on them, not on individual contributors (including first-time contributors who only deal with common platforms). ??, 15 ???. 2021 ?., 15:22 Andrew Nelson : > I'd be +1 on reducing the number of wheels to reduce maintainer effort > (esp. the 32 bit builds). However, I think reducing the CI dedication > (+maintainer effort towards that) to those minority platforms would > inevitably increase the risk of numpy/scipy failing to work on them, and > possibly risk not being installable on them. 
Would we want to drop them > without having a positive assurance that the minority platform/package > maintainers would keep checking for numpy/scipy health; I guess one can > make that choice when you know just how many times those exotic wheels are > downloaded (modulo CI runs). > > I'd like to see sdists being kept on PyPI. Those minority platforms may > keep patching scipy/numpy source to keep it installable from source, and > pip knows where to find it. I suspect the majority platforms will have all > the wheels they need, so only the small set of exotic platforms will use > sdist, with those users being more capable of dealing with the aftermath. > > On Thu, 15 Jul 2021 at 20:53, Evgeni Burovski > wrote: > >> FWIW, here's a big fat +1 from me for spreading the load. I'd even >> advocate for trimming the CI and going for "We officially support this >> (small) list of platforms and gladly accept patches for anything else >> as long as they do not break the officially supported ones". ISTM the >> list of supported platforms (both wheels and CI) has grown too large >> and seems to only grow in time. >> >> The user requests are perfectly understandable, everyone wants to be >> in the press-the-button-and-it-works world, but the maintainer/RM >> effort seems to be crossing over to being too much. >> >> A perfect picture in this regard is probably what we had a couple of >> years ago with Gohlke Windows wheels. Christoph was doing a great job >> maintaining the wheels and sending patches. For more exotic platforms, >> there simply has to be a champion. Or if some entity wants to fund the >> work for some platform, great --- this should also live somewhere >> outside of the main repo/CI/pypi. >> >> Whether to upload sdists to PYPI --- this is a bit orthogonal, but >> indeed maybe it's best to only upload the small set of wheels indeed. >> sdists are just stored in the GH releases and it's not a big deal to >> grab them from GH (or even pip install from the release tag directly). >> This would probably improve user experience too (no more cryptic >> errors from attempting to compile from source on an unsuspecting >> user). >> >> My 2p, >> >> Evgeni >> >> On Thu, Jul 15, 2021 at 1:22 PM Ralf Gommers >> wrote: >> > >> > Hey all, >> > >> > This whole thread is quite interesting: >> https://twitter.com/zooba/status/1415440484181417998. Given how much >> effort we are spending on really niche wheel builds, I?m wondering if we >> should just draw a line somewhere: >> > >> > we do what we do now for the main platforms: Windows, Linux (x86, >> aarch64), macOS, *but*: >> > no wheels for ppc64le >> > no wheels for Alpine Linux >> > no wheels for PyPy >> > no wheels for Raspberry Pi, AIX or whatever other niche thing comes >> next. >> > drop 32-bit Linux in case it is becoming an annoyance. >> > >> > This is not an actual proposal (yet) and I should sleep on this some >> more, but I've seen Chuck and Matti burn a lot of time on the numpy-wheels >> repo again recently, and I've done the same for SciPy. The situation is not >> very sustainable and needs a rethink. >> > >> > The current recipe is "someone who cares about a platform writes a PEP, >> then pip/wheel add a platform tag for it (very little work), and then the >> maintainers of each Python package are now responsible for wheel builds (a >> ton of work)". Most of these platforms have package managers, which are all >> more capable than pip et al., and if they don't then wheels can be hosted >> elsewhere (example: https://www.piwheels.org/). 
And then there's Conda, >> Nix, Spack, etc. too of course. >> > >> > Drawing a line somewhere distributes the workload, where packagers who >> care about some platform and have better tools at hand can do the >> packaging, and maintainers can go do something with more impact like write >> new code or review PRs. >> > >> > >> > >> > Cheers, >> > Ralf >> > >> > _______________________________________________ >> > NumPy-Discussion mailing list >> > NumPy-Discussion at python.org >> > https://mail.python.org/mailman/listinfo/numpy-discussion >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion >> > > > -- > _____________________________________ > Dr. Andrew Nelson > > > _____________________________________ > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Thu Jul 15 15:54:21 2021 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Thu, 15 Jul 2021 21:54:21 +0200 Subject: [Numpy-discussion] reducing effort spent on wheel builds? In-Reply-To: <722D8D57-0B5D-4699-8C9E-FF49A00A2853@hxcore.ol> References: <65a423a1-c126-2dc9-426d-44c3292683f5@gmail.com> <722D8D57-0B5D-4699-8C9E-FF49A00A2853@hxcore.ol> Message-ID: On Thu, Jul 15, 2021 at 3:45 PM Kevin Sheppard wrote: > When thinking about supporting a platform, it seems reasonable to consider > other sources for pre-compiled binaries, e.g., conda and especially > conda-forge. Conda-forge has > > > > linux-ppc64le v1.7.0 > > linux-64 v1.7.0 > > linux-aarch64 v1.7.0 > > osx-arm64 v1.7.0 > > osx-64 v1.7.0 > > win-32 v1.2.1 > > win-64 v1.7.0 > > > > If this list included PyPy that it would be virtually complete. > Conda-forge supports PyPy, and the NumPy build seems to have a problem but that's being worked on at the moment: https://github.com/conda-forge/numpy-feedstock/pull/238 Cheers, Ralf > > It seems reasonable to direct users to conda-forge if they are unable to > build a package for a less common platform. > > > > Building on conda-forge is often easier than building a wheel for PyPI due > to the substantial infrastructure that the conda-forge team provides to > automatically wire up CI for a lot of platforms. > > > > Kevin > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Thu Jul 15 16:04:08 2021 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Thu, 15 Jul 2021 22:04:08 +0200 Subject: [Numpy-discussion] reducing effort spent on wheel builds? In-Reply-To: References: Message-ID: On Thu, Jul 15, 2021 at 4:15 PM Charles R Harris wrote: > > > On Thu, Jul 15, 2021 at 8:02 AM Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> I spent so much time updating the wheels builds to 64 bit BLAS mostly >> because >> >> >> - I needed to actually understand how multibuild worked (and docs are >> minimal). >> - I don't know powershell (and docs are hard to find). >> - I don't know azure works at a lower level (and docs are hard to >> find). >> >> Yes, this is a problem. powershell/bash + no docs is not a recipe for success. Longer term we could move it over to cibuildwheel which has decent docs, but that in itself is a lot of work so it won't happen overnight. 
>> And I thought it would all be drop in easy :) What that indicates to me >> is that we could use a build expert and simpler infrastructure. Anaconda no >> doubt has people far more knowledgeable about such matters than I am. >> >> As to the twitter thread, what is the proposal(s)? >> > There's no proposal yet, still stewing a bit on it. Although maybe the first part of one is ready: no more new wheel flavors for things with <<0.5% of the user base. I'm perfectly happy to leave support for more exotic platforms to people >> who have the hardware and make use of it, but no doubt such people have the >> opposite problem of not knowing much about NumPy. >> > >> >> > Let me add that distutils brings it's own set of problems. A better build > system could help make building wheels simpler on various platforms. But > then, someone would also need to become expert in that build system. > Working on that one:) Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From matti.picus at gmail.com Fri Jul 16 03:02:01 2021 From: matti.picus at gmail.com (Matti Picus) Date: Fri, 16 Jul 2021 10:02:01 +0300 Subject: [Numpy-discussion] reducing effort spent on wheel builds? In-Reply-To: References: <65a423a1-c126-2dc9-426d-44c3292683f5@gmail.com> Message-ID: On 15/7/21 4:11 pm, Ralf Gommers wrote: > > > On Thu, Jul 15, 2021 at 2:41 PM Matti Picus > wrote: > > On the other hand, packaging is made harder as more technologies > go into > a wheel build. ... Perhaps there should be some discussion > of the cost of adding new build dependencies and somehow making the > dependency conditional ... > > > Not quite the right mailing list, but: it *is* an optional dependency > in scipy 1.7.0, and there has been no proposal so far to make it > required. > > > Cheers, > Ralf > Thanks for the clarification, sorry for the misunderstanding Matti From chris.barker at noaa.gov Fri Jul 16 14:11:28 2021 From: chris.barker at noaa.gov (Chris Barker) Date: Fri, 16 Jul 2021 11:11:28 -0700 Subject: [Numpy-discussion] reducing effort spent on wheel builds? In-Reply-To: References: <65a423a1-c126-2dc9-426d-44c3292683f5@gmail.com> Message-ID: Just a note on: > For the record, I am +1 on removing sdists from PyPI until pip changes its default to --only-binary :all: [1] I agree that the defaults for pip are unfortunate (and indeed the legacy of pip doing, well, a lot, (i.e. building and installing and package managing and dependencies, and ...) with one interface. However, There's a long tradition of sdists on PyPi -- and PyPi is used, for the most part, as the source of sdists for other systems (conda-forge for example). I did just check, and numpy is an exception -- it's pointing to gitHub: source: url: https://github.com/numpy/numpy/releases/download/v{{ version }}/numpy-{{ version }}.tar.gz But others may be counting on sdists on PyPi. Also, an sdist is not always the same as a gitHub release -- there is some "magic" in building it -- it's not just a copy of the repo. Again, numpy may be building its releases as an sdist (or it just doesn't. matter), but something to keep in mind. Another thought is to only support platforms that have a committed maintainer -- I think that's how Python itself does it. The more obscure platforms are only supported if someone steps up to support them (I suppose that's technically true for all platforms, but not hard to find someone on the existing core dev team to support the majors). 
This can be a bit tricky, as the users of a platform may not have the skills to maintain the builds, but it seems fair enough to only support platforms that someone cares enough about to do the work. -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From melissawm at gmail.com Sat Jul 17 13:44:18 2021 From: melissawm at gmail.com (=?UTF-8?Q?Melissa_Mendon=C3=A7a?=) Date: Sat, 17 Jul 2021 14:44:18 -0300 Subject: [Numpy-discussion] Documentation Team meeting - Monday July 19 In-Reply-To: References: Message-ID: Hi all! Our next Documentation Team meeting will be on *Monday, July 19* at ***4PM UTC***. All are welcome - you don't need to already be a contributor to join. If you have questions or are curious about what we're doing, we'll be happy to meet you! If you wish to join on Zoom, use this link: https://zoom.us/j/96219574921?pwd=VTRNeGwwOUlrYVNYSENpVVBRRjlkZz09#success Here's the permanent hackmd document with the meeting notes (still being updated in the next few days!): https://hackmd.io/oB_boakvRqKR-_2jRV-Qjg Hope to see you around! ** You can click this link to get the correct time at your timezone: https://www.timeanddate.com/worldclock/fixedtime.html?msg=NumPy+Documentation+Team+Meeting&iso=20210719T16&p1=1440&ah=1 *** You can add the NumPy community calendar to your google calendar by clicking this link: https://calendar.google.com/calendar /r?cid=YmVya2VsZXkuZWR1X2lla2dwaWdtMjMyamJobGRzZmIyYzJqODFjQGdyb3VwLmNhbGVuZGFyLmdvb2dsZS5jb20 - Melissa -------------- next part -------------- An HTML attachment was scrubbed... URL: From matti.picus at gmail.com Sun Jul 18 11:53:26 2021 From: matti.picus at gmail.com (Matti Picus) Date: Sun, 18 Jul 2021 18:53:26 +0300 Subject: [Numpy-discussion] Proposal to accept NEP 49: Data allocation strategies In-Reply-To: References: <68067d32-4112-67e2-3b8b-08feb7c875ba@gmail.com> <7ee0aded6d0b6036710a683ed60c832d920ac6fa.camel@sipsolutions.net> <2f5624a12f2005a62f5c2549624fb3684009d41a.camel@sipsolutions.net> <8c06b4f4-ebfb-e8d5-9460-4e700d70ca85@gmail.com> <1620921962860-0.post@n7.nabble.com> Message-ID: <7156a370-3147-a4b9-ad34-5f6ac7d8a223@gmail.com> The NEP [0] and the corresponding PR [1] have gone through another round of editing. I would like to restart the discussion here if anyone has more to add. Things that have changed since the last round: - The functions now accept a context argument - The code has been cleaned up for consistency - The language of the NEP has been tightened Thanks to all who have contributed to the discussion so far. Matti [0] https://numpy.org/neps/nep-0049.html [1] https://github.com/numpy/numpy/pull/17582 From charlesr.harris at gmail.com Sun Jul 18 16:13:08 2021 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 18 Jul 2021 14:13:08 -0600 Subject: [Numpy-discussion] (no subject) Message-ID: Hi All, On behalf of the NumPy team I am pleased to announce the release of NumPy 1.21.1. The NumPy 1.21.1 is a maintenance release that fixes bugs discovered after the 1.21.0 release. OpenBLAS has also been updated to v0.3.17 to deal with arm64 problems. The Python versions supported for this release are 3.7-3.9. The 1.21.x series is compatible with development Python 3.10 and Python 3.10 will be officially supported after it is released. 
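(A quick sanity check after upgrading, purely illustrative and not part of the release notes:)

```python
import numpy as np

print(np.__version__)   # expected: '1.21.1' after upgrading

# Optionally run the bundled test suite; this needs pytest and hypothesis installed.
# np.test()
```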
Wheels can be downloaded from PyPI; source archives, release notes, and wheel hashes are available on GitHub. Linux users will need pip >= 19.3 in order to install manylinux2010 and manylinux2014 wheels. *Contributors* A total of 11 people contributed to this release. People with a "+" by their names contributed a patch for the first time. - Bas van Beek - Charles Harris - Ganesh Kathiresan - Gregory R. Lee - Hugo Defois + - Kevin Sheppard - Matti Picus - Ralf Gommers - Sayed Adel - Sebastian Berg - Thomas J. Fan *Pull requests merged* A total of 26 pull requests were merged for this release. - #19311: REV,BUG: Replace `NotImplemented` with `typing.Any` - #19324: MAINT: Fixed the return-dtype of `ndarray.real` and `imag` - #19330: MAINT: Replace `"dtype[Any]"` with `dtype` in the definiton of... - #19342: DOC: Fix some docstrings that crash pdf generation. - #19343: MAINT: bump scipy-mathjax - #19347: BUG: Fix arr.flat.index for large arrays and big-endian machines - #19348: ENH: add `numpy.f2py.get_include` function - #19349: BUG: Fix reference count leak in ufunc dtype handling - #19350: MAINT: Annotate missing attributes of `np.number` subclasses - #19351: BUG: Fix cast safety and comparisons for zero sized voids - #19352: BUG: Correct Cython declaration in random - #19353: BUG: protect against accessing base attribute of a NULL subarray - #19365: BUG, SIMD: Fix detecting AVX512 features on Darwin - #19366: MAINT: remove `print()`'s in distutils template handling - #19390: ENH: SIMD architectures to show_config - #19391: BUG: Do not raise deprecation warning for all nans in unique... - #19392: BUG: Fix NULL special case in object-to-any cast code - #19430: MAINT: Use arm64-graviton2 for testing on travis - #19495: BUILD: update OpenBLAS to v0.3.17 - #19496: MAINT: Avoid unicode characters in division SIMD code comments - #19499: BUG, SIMD: Fix infinite loop during count non-zero on GCC-11 - #19500: BUG: fix a numpy.npiter leak in npyiter_multi_index_set - #19501: TST: Fix a `GenericAlias` test failure for python 3.9.0 - #19502: MAINT: Start testing with Python 3.10.0b3. - #19503: MAINT: Add missing dtype overloads for object- and ctypes-based... - #19510: REL: Prepare for NumPy 1.21.1 release. Cheers, Charles Harris -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Mon Jul 19 15:01:58 2021 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Mon, 19 Jul 2021 14:01:58 -0500 Subject: [Numpy-discussion] Add smallest_normal and smallest_subnormal attributes to finfo In-Reply-To: References: Message-ID: <0ac63e8104df5184e6c884bcaf7f68b6f2ae5d56.camel@sipsolutions.net> Hi all, We just merged the PR to add `np.finfo.smallest_normal` and `np.finfo.smallest_subnormal` to `np.finfo` (the floating point DType information object): * smallest_normal: - An alias for `np.finfo.tiny` - The smallest "normal". I.e. the smallest number larger than zero that has full precision. * smallest_subnormal: - Equivalent to `np.nextafter(0., 1.)` - The smallest subnormal/denormal number. I.e. the smallest representable number larger than zero. Please don't hesitate to comment if you have any thoughts on the API addition.
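(For concreteness, a small illustration of how the new attributes relate to the existing ones; the `smallest_normal`/`smallest_subnormal` names assume a NumPy build that already contains the just-merged PR:)

```python
import numpy as np

info = np.finfo(np.float64)

# Smallest positive "normal" float64 (full precision); alias of the old `tiny`.
print(info.tiny)               # 2.2250738585072014e-308
print(info.smallest_normal)    # same value, clearer name

# Smallest positive subnormal, i.e. the first representable number above 0.0.
print(np.nextafter(0.0, 1.0))  # 5e-324
print(info.smallest_subnormal) # same value, clearer name
```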
Cheers, Sebastian On Wed, 2021-04-21 at 17:44 -0500, Stephannie Jiménez Gacha wrote: > Good afternoon, > > Given the discussions happened in the Data API consortium when > looking into > the attributes of `finfo` used in the wild, we found that `tiny` is > used > regularly but in a good amount of cases not for its intended purpose > but > rather as "just give me a small number". Following this we are > proposing > the addition of `smallest_normal` and `smallest_subnormal` > attributes. > Personally, I think that the `tiny` name is a little bit odd and > misleading, so it will be great to leave that as an alias but have a > clear > name in this class. > > Right now the PR: https://github.com/numpy/numpy/pull/18536 has all > the > changes and all the values added were checked against IEEE-754 > standard. > One of the main concerns is the support of subnormal numbers in > certain > architectures, where the values can't be calculated accurately. Given > the > state of the discussion, we don't know if the best alternative is to > not > add the `smallest_subnormal` attribute and just add the > `smallest_number` > attribute as an alias to `tiny`. > > We open this to discussion to see what way we can go in order to get > this > PR merged. > > *Stephannie Jimenez Gacha* Software developer > > *Quansight* | Your Data Experts > > w: www.quansight.com e: sgacha at quansight.com > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: This is a digitally signed message part URL: From stephaniemendoza2014 at gmail.com Tue Jul 20 11:33:51 2021 From: stephaniemendoza2014 at gmail.com (Stephanie Mendoza) Date: Tue, 20 Jul 2021 10:33:51 -0500 Subject: [Numpy-discussion] NumPy Community Survey 2021 - Call for participation Message-ID: Hello NumPy Community, Last year's inaugural community survey yielded participation from over 1,200 NumPy users from 75 different countries. The responses provided invaluable feedback from the NumPy community. You can view the results from the 2020 survey here: https://numpy.org/user-survey-2020/ It's time for another survey, and we are counting on you once again. The 2021 survey is now LIVE, and you can participate using the following link: https://berkeley.qualtrics.com/jfe/form/SV_aaOONjgcBXDSl4q. Please feel free to share this link with others so that we can reach as much of the NumPy community as possible. Please direct any questions / concerns to stephaniemendoza2014 at gmail.com. Sincerely, Stephanie Mendoza -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Tue Jul 20 16:52:44 2021 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Tue, 20 Jul 2021 15:52:44 -0500 Subject: [Numpy-discussion] NumPy Community Meeting Wednesday Message-ID: Hi all, There will be a NumPy Community meeting Wednesday July 21st at 20:00 UTC. Everyone is invited and encouraged to join in and edit the work-in-progress meeting topics and notes at: https://hackmd.io/76o-IxCjQX2mOXO_wwkcpg?both Best wishes Sebastian -------------- next part -------------- A non-text attachment was scrubbed...
Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: This is a digitally signed message part URL: From ndbecker2 at gmail.com Wed Jul 21 08:39:56 2021 From: ndbecker2 at gmail.com (Neal Becker) Date: Wed, 21 Jul 2021 08:39:56 -0400 Subject: [Numpy-discussion] Add count (and dtype) to packbits Message-ID: In my application I need to pack bits of a specified group size into integral values. Currently np.packbits only packs into full bytes. For example, I might have a string of bits encoded as a np.uint8 vector with each uint8 item specifying a single bit 1/0. I want to encode them 4 bits at a time into a np.uint32 vector. python code to implement this: --------------- def pack_bits (inp, bits_per_word, dir=1, dtype=np.int32): assert bits_per_word <= np.dtype(dtype).itemsize * 8 assert len(inp) % bits_per_word == 0 out = np.empty (len (inp)//bits_per_word, dtype=dtype) i = 0 o = 0 while i < len(inp): ret = 0 for b in range (bits_per_word): if dir > 0: ret |= inp[i] << b else: ret |= inp[i] << (bits_per_word - b - 1) i += 1 out[o] = ret o += 1 return out --------------- It looks like unpackbits has a "count" parameter but packbits does not. Also would be good to be able to specify an output dtype. From deak.andris at gmail.com Wed Jul 21 08:52:55 2021 From: deak.andris at gmail.com (Andras Deak) Date: Wed, 21 Jul 2021 14:52:55 +0200 Subject: [Numpy-discussion] Add count (and dtype) to packbits In-Reply-To: References: Message-ID: On Wed, Jul 21, 2021 at 2:40 PM Neal Becker wrote: > In my application I need to pack bits of a specified group size into > integral values. > Currently np.packbits only packs into full bytes. > For example, I might have a string of bits encoded as a np.uint8 > vector with each uint8 item specifying a single bit 1/0. I want to > encode them 4 bits at a time into a np.uint32 vector. > > python code to implement this: > > --------------- > def pack_bits (inp, bits_per_word, dir=1, dtype=np.int32): > assert bits_per_word <= np.dtype(dtype).itemsize * 8 > assert len(inp) % bits_per_word == 0 > out = np.empty (len (inp)//bits_per_word, dtype=dtype) > i = 0 > o = 0 > while i < len(inp): > ret = 0 > for b in range (bits_per_word): > if dir > 0: > ret |= inp[i] << b > else: > ret |= inp[i] << (bits_per_word - b - 1) > i += 1 > out[o] = ret > o += 1 > return out > --------------- > Can't you just `packbits` into a uint8 array and then convert that to uint32? If I change `dtype` in your code from `np.int32` to `np.uint32` (as you mentioned in your email) I can do this: rng = np.random.default_rng() arr = (rng.uniform(size=32) < 0.5).astype(np.uint8) group_size = 4 original = pack_bits(arr, group_size, dtype=np.uint32) new = np.packbits(arr.reshape(-1, group_size), axis=-1, bitorder='little').ravel().astype(np.uint32) print(np.array_equal(new, original)) # True There could be edge cases where the result dtype is too small, but I haven't thought about that part of the problem. I assume this would work as long as `group_size <= 8`. Andr?s > It looks like unpackbits has a "count" parameter but packbits does not. > Also would be good to be able to specify an output dtype. > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ndbecker2 at gmail.com Wed Jul 21 08:58:12 2021 From: ndbecker2 at gmail.com (Neal Becker) Date: Wed, 21 Jul 2021 08:58:12 -0400 Subject: [Numpy-discussion] Add count (and dtype) to packbits In-Reply-To: References: Message-ID: Well that's just the point, I wanted to consider group size > 8. On Wed, Jul 21, 2021 at 8:53 AM Andras Deak wrote: > > On Wed, Jul 21, 2021 at 2:40 PM Neal Becker wrote: >> >> In my application I need to pack bits of a specified group size into >> integral values. >> Currently np.packbits only packs into full bytes. >> For example, I might have a string of bits encoded as a np.uint8 >> vector with each uint8 item specifying a single bit 1/0. I want to >> encode them 4 bits at a time into a np.uint32 vector. >> >> python code to implement this: >> >> --------------- >> def pack_bits (inp, bits_per_word, dir=1, dtype=np.int32): >> assert bits_per_word <= np.dtype(dtype).itemsize * 8 >> assert len(inp) % bits_per_word == 0 >> out = np.empty (len (inp)//bits_per_word, dtype=dtype) >> i = 0 >> o = 0 >> while i < len(inp): >> ret = 0 >> for b in range (bits_per_word): >> if dir > 0: >> ret |= inp[i] << b >> else: >> ret |= inp[i] << (bits_per_word - b - 1) >> i += 1 >> out[o] = ret >> o += 1 >> return out >> --------------- > > > Can't you just `packbits` into a uint8 array and then convert that to uint32? If I change `dtype` in your code from `np.int32` to `np.uint32` (as you mentioned in your email) I can do this: > > rng = np.random.default_rng() > arr = (rng.uniform(size=32) < 0.5).astype(np.uint8) > group_size = 4 > original = pack_bits(arr, group_size, dtype=np.uint32) > new = np.packbits(arr.reshape(-1, group_size), axis=-1, bitorder='little').ravel().astype(np.uint32) > print(np.array_equal(new, original)) > # True > > There could be edge cases where the result dtype is too small, but I haven't thought about that part of the problem. I assume this would work as long as `group_size <= 8`. > > Andrés > >> >> It looks like unpackbits has a "count" parameter but packbits does not. >> Also would be good to be able to specify an output dtype. >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion -- Those who don't understand recursion are doomed to repeat it From nicholaitukanov at gmail.com Wed Jul 21 15:37:28 2021 From: nicholaitukanov at gmail.com (Nicholai Tukanov) Date: Wed, 21 Jul 2021 14:37:28 -0500 Subject: [Numpy-discussion] Adding POWER10 (VSX4) support to the SIMD framework Message-ID: I would like to understand how to go about extending the SIMD framework in order to add support for POWER10. Specifically, I would like to add the following instructions: `lxvp` and `stxvp` which load/store 256 bits into/from two vectors. I believe that this will be able to give a decent performance boost for those on POWER machines since it can halve the number of loads/stores issued. Additionally, matrix engines (2-D SIMD instructions) are becoming quite popular due to their performance improvements for deep learning and scientific computing. Would it be beneficial to add these new advanced SIMD instructions into the framework or should these instructions be left to libraries such as OpenBLAS and MKL?
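(As a practical starting point, the runtime side of the SIMD framework already reports which VSX levels were compiled in and which ones the current CPU supports. The snippet below reads internal, underscore-prefixed attributes and is only a sketch; a `VSX4` entry for POWER10 is hypothetical until such support actually lands.)

```python
from numpy.core._multiarray_umath import (
    __cpu_baseline__,   # features every build of this binary assumes
    __cpu_dispatch__,   # features compiled as runtime-dispatched targets
    __cpu_features__,   # which features the running CPU actually has
)

print(__cpu_baseline__)
print(__cpu_dispatch__)
# Show just the VSX-related detection results (POWER machines).
print({k: v for k, v in __cpu_features__.items() if k.startswith("VSX")})
```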
Thank you, Nicholai Tukanov -------------- next part -------------- An HTML attachment was scrubbed... URL: From Daniel.Waddington at ibm.com Wed Jul 21 20:54:30 2021 From: Daniel.Waddington at ibm.com (Daniel Waddington) Date: Thu, 22 Jul 2021 00:54:30 +0000 Subject: [Numpy-discussion] Proposed change from POSIX to PyMem_RawXXX Message-ID: An HTML attachment was scrubbed... URL: From Daniel.Waddington at ibm.com Thu Jul 22 12:48:05 2021 From: Daniel.Waddington at ibm.com (Daniel Waddington) Date: Thu, 22 Jul 2021 16:48:05 +0000 Subject: [Numpy-discussion] Proposed change from POSIX to PyMem_RawXXX (plain text resend) Message-ID: Hi, I'm working with NumPy in the context of supporting different memory types such as persistent memory and CXL-attached memory. I would like to propose a minor change, but figured I would get some initial feedback from the developer community before submitting a PR. In multiarray/alloc.c the allocator (beneath the cache) uses the POSIX malloc/calloc/realloc/free. I propose that these should be changed to PyMem_RawXXX equivalents. The reason for this is that by doing so, one can use the Python custom allocator functions (e.g. PyMem_GetAllocator/PyMem_SetAllocator) to intercept the memory allocator for NumPy arrays. This will be useful as heterogeneous memory types need to be supported. I don't think this will drastically change performance but it is an extra function redirection (and it will only impact when the cache can't deliver). There are likely other places in NumPy that could do with a rinse and repeat - maybe someone could advise? Thanks, Daniel Waddington IBM Research --- Example patch for 1.19.x (I'm building with Python3.6) diff --git a/numpy/core/src/multiarray/alloc.c b/numpy/core/src/multiarray/alloc.c index 795fc7315..e9e888478 100644 --- a/numpy/core/src/multiarray/alloc.c +++ b/numpy/core/src/multiarray/alloc.c @@ -248,7 +248,7 @@ PyDataMem_NEW(size_t size) void *result; assert(size != 0); - result = malloc(size); + result = PyMem_RawMalloc(size); if (_PyDataMem_eventhook != NULL) { NPY_ALLOW_C_API_DEF NPY_ALLOW_C_API @@ -270,7 +270,7 @@ PyDataMem_NEW_ZEROED(size_t size, size_t elsize) { void *result; - result = calloc(size, elsize); + result = PyMem_RawCalloc(size, elsize); if (_PyDataMem_eventhook != NULL) { NPY_ALLOW_C_API_DEF NPY_ALLOW_C_API @@ -291,7 +291,7 @@ NPY_NO_EXPORT void PyDataMem_FREE(void *ptr) { PyTraceMalloc_Untrack(NPY_TRACE_DOMAIN, (npy_uintp)ptr); - free(ptr); + PyMem_RawFree(ptr); From sebastian at sipsolutions.net Thu Jul 22 13:04:58 2021 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Thu, 22 Jul 2021 12:04:58 -0500 Subject: [Numpy-discussion] Proposed change from POSIX to PyMem_RawXXX (plain text resend) In-Reply-To: References: Message-ID: On Thu, 2021-07-22 at 16:48 +0000, Daniel Waddington wrote: > Hi, > I'm working with NumPy in the context of supporting different memory > types such as persistent memory and CXL-attached memory. I would like to > propose a minor change, but figured I would get some initial feedback > from the developer community before submitting a PR. Hi Daniel, you may want to have a look at Matti's NEP to allow custom allocation strategies: https://numpy.org/neps/nep-0049.html When implemented, this will allow explicitly modifying the behaviour here (which means you could make it use the Python version). In principle, once that work is done, we could also use the Python allocator as you are proposing. It may be a follow-up discussion. The difficulty is that the NumPy ABI is fully open: 1.
A user can create an array with data they allocated 2. In theory, a user could `realloc` or even replace an array's `data`. In practice, hopefully nobody does the second one, but we can't be sure. The first means we have to wait for the NEP, because it will allow us to work around the problem: We can use different `free`/`realloc` if a user provided the data. The second means that we have to be careful when considering changing the default even after implementing the NEP. But it may be possible, at least if we do it slowly/gently. Cheers, Sebastian > > In multiarray/alloc.c the allocator (beneath the cache) uses the > POSIX malloc/calloc/realloc/free. I propose that these should be > changed to PyMem_RawXXX equivalents. The reason for this is that by > doing so, one can use the Python custom allocator functions (e.g. > PyMem_GetAllocator/PyMem_SetAllocator) to intercept the memory > allocator for NumPy arrays. This will be useful as heterogeneous > memory types need to be supported. I don't think this will drastically change > performance but it is an extra function redirection (and it will only > impact when the cache can't deliver). > > There are likely other places in NumPy that could do with a rinse and > repeat - maybe someone could advise? > > Thanks, > Daniel Waddington > IBM Research > > --- > Example patch for 1.19.x (I'm building with Python3.6) > > diff --git a/numpy/core/src/multiarray/alloc.c > b/numpy/core/src/multiarray/alloc.c > index 795fc7315..e9e888478 100644 > --- a/numpy/core/src/multiarray/alloc.c > +++ b/numpy/core/src/multiarray/alloc.c > @@ -248,7 +248,7 @@ PyDataMem_NEW(size_t size) >     void *result; > >     assert(size != 0); > -    result = malloc(size); > +    result = PyMem_RawMalloc(size); >     if (_PyDataMem_eventhook != NULL) { >         NPY_ALLOW_C_API_DEF >         NPY_ALLOW_C_API > @@ -270,7 +270,7 @@ PyDataMem_NEW_ZEROED(size_t size, size_t elsize) > { >     void *result; > > -    result = calloc(size, elsize); > +    result = PyMem_RawCalloc(size, elsize); >     if (_PyDataMem_eventhook != NULL) { >         NPY_ALLOW_C_API_DEF >         NPY_ALLOW_C_API > @@ -291,7 +291,7 @@ NPY_NO_EXPORT void > PyDataMem_FREE(void *ptr) > { >     PyTraceMalloc_Untrack(NPY_TRACE_DOMAIN, (npy_uintp)ptr); > -    free(ptr); > +    PyMem_RawFree(ptr); > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: This is a digitally signed message part URL: From scratchmex at gmail.com Mon Jul 26 01:04:27 2021 From: scratchmex at gmail.com (Ivan Gonzalez) Date: Mon, 26 Jul 2021 01:04:27 -0400 Subject: [Numpy-discussion] Fwd: ndarray should offer __format__ that can adjust precision In-Reply-To: References: Message-ID: It would be nice to be able to use the Python syntax we already use to format the precision of floating numbers in numpy: >>> a = np.array([-np.pi, np.pi]) >>> print(f"{a:+.2f}") [-3.14 +3.14] This is particularly useful when you have large arrays.
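(For comparison, the closest equivalents available today go through `np.array2string` or the `np.printoptions` context manager, which is noticeably more verbose than an f-string; a small sketch:)

```python
import numpy as np

a = np.array([-np.pi, np.pi])

# Current workarounds for controlling precision and sign of the printed array:
print(np.array2string(a, precision=2, sign='+'))   # [-3.14 +3.14]

with np.printoptions(precision=2, sign='+'):       # temporarily change the global options
    print(a)                                       # [-3.14 +3.14]
```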
The problem is that if you want to do it today, it is not implemented: >>> print(f"{a:+.2f}") Traceback (most recent call last): File "<stdin>", line 1, in <module> TypeError: unsupported format string passed to numpy.ndarray.__format__ In this PR (https://github.com/numpy/numpy/pull/19550) I propose a very basic formatting implementation for numeric types that uses `array2string`, just like `str` currently does. At first, since we are only considering formatting the numeric types, floating-point numbers specifically, we are only interested in being able to change the precision, the sign, and possibly the rounding or truncation. Since the `array2string` function already does everything we need, we only need to implement the `__format__` function of the `ndarray` class, which parses a predefined format (similar to the one already used by Python for built-in data types) to specify the parameters mentioned above. I propose a mini format specification inspired by the [Format Specification Mini-Language](https://docs.python.org/3/library/string.html#formatspec). ``` format_spec ::= [sign][.precision][type] sign ::= "+" | "-" | " " precision ::= [0-9]+ type ::= "f" | "e" ``` We are going to consider only 3 arguments of the `array2string` function: `precision`, `suppress_small`, `sign`. In particular, the `type` token sets the `suppress_small` argument to True when the type is `f` and False when it is `e`. This is in order to mimic Python's behavior of truncating decimals when using fixed-point notation. As @brandon-rhodes said in gh-5543, when you try to format an array containing Python objects, the behavior should be the same as what Python implements by default in the `object` class: `format(a, "")` should be equivalent to `str(a)` and `format(a, "not empty")` should raise an exception. What remains to be defined is the behavior when trying to format an array with a non-numeric data type (`np.numeric`) other than `np.object_`. Should we raise an exception? In my opinion yes, so that if formatting is extended in the future -- for example, to dates -- people are aware that it was not previously implemented. I'm open to suggestions. - Ivan -------------- next part -------------- An HTML attachment was scrubbed... URL: From melissawm at gmail.com Mon Jul 26 16:03:55 2021 From: melissawm at gmail.com (=?UTF-8?Q?Melissa_Mendon=C3=A7a?=) Date: Mon, 26 Jul 2021 17:03:55 -0300 Subject: [Numpy-discussion] Newcomer's Meeting July 29: special accessibility mini-sprint! Message-ID: Hello, folks! This week we have our planned Newcomer's meeting on July 29, at 8PM UTC. This time, we are teaming up with Tony Fast and Isabela Presedo-Floyd to propose a low-code mini-sprint focusing on accessibility. This is a great opportunity to make your first contribution to NumPy! We will be focusing on writing alt-text for images in our documentation. Alt-text provides a textual alternative to non-text content in web pages, and as cited in this WebAIM document [1]: "Alternative text serves several functions: - It is read by screen readers in place of images allowing the content and function of the image to be accessible to those with visual or certain cognitive disabilities. - It is displayed in place of the image in browsers if the image file is not loaded or when the user has chosen not to view images. - It provides a semantic meaning and description to images which can be read by search engines or be used to later determine the content of the image from page context alone."
You can find some more information on how to add alt-text in [2]. As usual, all are welcome - feel free to join even if you just want to observe! To join on Zoom, use this link: https://zoom.us/j/6345425936 Hope to see you around! - Melissa [1] https://webaim.org/techniques/alttext/ [2] https://www.w3.org/WAI/tutorials/images/decision-tree/ ** You can click this link to get the correct time at your timezone: https://www.timeanddate.com/worldclock/fixedtime.html?msg=NumPy+Newcomers%27s+Meeting+-+Accessibility+Mini-Sprint&iso=20210729T20&p1=1440&ah=1 *** You can add the NumPy community calendar to your Google calendar by clicking this link: https://calendar.google.com/calendar /r?cid=YmVya2VsZXkuZWR1X2lla2dwaWdtMjMyamJobGRzZmIyYzJqODFjQGdyb3VwLmNhbGVuZGFyLmdvb2dsZS5jb20 -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Tue Jul 27 14:39:39 2021 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Tue, 27 Jul 2021 11:39:39 -0700 Subject: [Numpy-discussion] NumPy Development Meeting Wednesday - Triage Focus Message-ID: <95c5b85fe4e29e66d775d9aea8fd844458dbb9af.camel@sipsolutions.net> Hi all, Our bi-weekly triage-focused NumPy development meeting is Wednesday, July 28th at 9 am Pacific Time (16:00 UTC). Everyone is invited to join in and edit the work-in-progress meeting topics and notes: https://hackmd.io/68i_JvOYQfy9ERiHgXMPvg I encourage everyone to notify us of issues or PRs that you feel should be prioritized, discussed, or reviewed. Best regards Sebastian From sebastian at sipsolutions.net Tue Jul 27 19:50:21 2021 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Tue, 27 Jul 2021 16:50:21 -0700 Subject: [Numpy-discussion] Floating point precision expectations in NumPy Message-ID: <3f6563789d2b4ef963c4407e187e1797866eeea5.camel@sipsolutions.net> Hi all, there is a proposal to add some Intel specific fast math routine to NumPy: https://github.com/numpy/numpy/pull/19478 part of numerical algorithms is that there is always a speed vs. precision trade-off, giving a more precise result is slower. So there is a question what the general precision expectation should be in NumPy. And how much is it acceptable to diverge in the precision/speed trade-off depending on CPU/system? I doubt we can formulate very clear rules here, but any input on what precision you would expect or trade-offs seem acceptable would be appreciated! Some more details ----------------- This is mainly interesting e.g. for functions like logarithms, trigonometric functions, or cubic roots. Some basic functions (multiplication, addition) are correct as per IEEE standard and give the best possible result, but these are typically only correct within very small numerical errors. This is typically measured as "ULP": https://en.wikipedia.org/wiki/Unit_in_the_last_place where 0.5 ULP would be the best possible result. Merging the PR may mean relaxing the current precision slightly in some places. In general Intel advertises 4 ULP of precision (although the actual precision for most functions seems better). 
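(To make the ULP figure concrete, here is a small illustrative aside using `np.spacing`, which returns the gap between a float and the next representable value:)

```python
import numpy as np

x = np.float64(1e6)
one_ulp = np.spacing(x)   # size of 1 ULP at this magnitude
print(one_ulp)            # roughly 1.16e-10 for float64 near 1e6

# A routine accurate to 4 ULP may return any value within about
# 4 * one_ulp of the correctly rounded result at this magnitude.
print(4 * one_ulp)
```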
Here are two tables, one from glibc and one for the Intel functions: https://www.gnu.org/software/libc/manual/html_node/Errors-in-Math-Functions.html (Mainly the LA column) https://software.intel.com/content/www/us/en/develop/documentation/onemkl-vmperfdata/top/real-functions/measured-accuracy-of-all-real-vm-functions.html Different implementations give different accuracy, but formulating some guidelines/expectations (or referencing them) would be useful guidance. For basic From gregor.thalhammer at gmail.com Wed Jul 28 06:13:44 2021 From: gregor.thalhammer at gmail.com (Gregor Thalhammer) Date: Wed, 28 Jul 2021 12:13:44 +0200 Subject: [Numpy-discussion] Floating point precision expectations in NumPy In-Reply-To: <3f6563789d2b4ef963c4407e187e1797866eeea5.camel@sipsolutions.net> References: <3f6563789d2b4ef963c4407e187e1797866eeea5.camel@sipsolutions.net> Message-ID: > On 28.07.2021, at 01:50, Sebastian Berg wrote: > > Hi all, > > there is a proposal to add some Intel specific fast math routine to > NumPy: > > https://github.com/numpy/numpy/pull/19478 Many years ago I wrote a package https://github.com/geggo/uvml that makes the VML, a fast implementation of transcendental math functions, available for numpy. Don't know if it still compiles. It uses Intel VML, designed for processing arrays, not the SVML intrinsics. Because of this it is less machine dependent (optimized implementations are selected automatically depending on the availability of, e.g., SSE, AVX, or AVX512), just link to a library. It compiles as an external module, can be activated at runtime. Different precision models can be selected at runtime (globally). I think Intel advocates using the LA (low accuracy) mode as a good compromise between performance and accuracy. Different people have strongly diverging opinions about what to expect. The speedups possibly gained by these approaches often vaporize in non-benchmark applications, as for those functions performance is often limited by memory bandwidth, unless all your data stays in CPU cache. By default I would go for high accuracy mode, with an option to switch to low accuracy if one urgently needs the better performance. But then one should use different approaches for speeding up numpy. Gregor > > part of numerical algorithms is that there is always a speed vs. > precision trade-off, giving a more precise result is slower. > > So there is a question what the general precision expectation should be > in NumPy. And how much is it acceptable to diverge in the > precision/speed trade-off depending on CPU/system? > > I doubt we can formulate very clear rules here, but any input on what > precision you would expect or trade-offs seem acceptable would be > appreciated! > > > Some more details > ----------------- > > This is mainly interesting e.g. for functions like logarithms, > trigonometric functions, or cubic roots. > > Some basic functions (multiplication, addition) are correct as per IEEE > standard and give the best possible result, but these are typically > only correct within very small numerical errors. > > This is typically measured as "ULP": > > https://en.wikipedia.org/wiki/Unit_in_the_last_place > > where 0.5 ULP would be the best possible result. > > > Merging the PR may mean relaxing the current precision slightly in some > places. In general Intel advertises 4 ULP of precision (although the > actual precision for most functions seems better). 
> > > Here are two tables, one from glibc and one for the Intel functions: > > https://www.gnu.org/software/libc/manual/html_node/Errors-in-Math-Functions.html > (Mainly the LA column) https://software.intel.com/content/www/us/en/develop/documentation/onemkl-vmperfdata/top/real-functions/measured-accuracy-of-all-real-vm-functions.html > > > Different implementation give different accuracy, but formulating some > guidelines/expectation (or referencing them) would be useful guidance. > > For basic > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion From ganesh3597 at gmail.com Thu Jul 29 12:16:16 2021 From: ganesh3597 at gmail.com (Ganesh Kathiresan) Date: Thu, 29 Jul 2021 21:46:16 +0530 Subject: [Numpy-discussion] Proposal for adding bit_count Message-ID: Hi All, I am working on a new UFunc, ` bit_count ` (popcount in other languages) that aims to count the number of 1-bits in the absolute value of an Integer. Implementation ---------------------------------- The primary reference for the implementation is CountBitsSetParallel . Here we take 12 operations to achieve the result which is the same as the lookup table method but does not suffer from memory issues or cache misses. The implementation is aimed at unsigned integers, absolute value of signed integers and objects that support the operation. Usage -------------- >>> np.bit_count(1023) 10 >>> a = np.array([2**i - 1 for i in range(16)]) >>> np.bit_count(a) array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]) >>> np.int32(1023).bit_count() 10 Notes ------------- 1. Python has included this method here (3.10+). Tracking issue 2. NumPy tracking issue 3. Interesting read on how we get the magic number. Needed a bit of digging :) Please let us know what you think about the implementation and where we can improve in terms of performance or interface. Regards, Ganesh -------------- next part -------------- An HTML attachment was scrubbed... URL: From jerry.morrison+numpy at gmail.com Fri Jul 30 14:04:06 2021 From: jerry.morrison+numpy at gmail.com (Jerry Morrison) Date: Fri, 30 Jul 2021 11:04:06 -0700 Subject: [Numpy-discussion] Floating point precision expectations in NumPy In-Reply-To: <3f6563789d2b4ef963c4407e187e1797866eeea5.camel@sipsolutions.net> References: <3f6563789d2b4ef963c4407e187e1797866eeea5.camel@sipsolutions.net> Message-ID: On Tue, Jul 27, 2021 at 4:55 PM Sebastian Berg wrote: > Hi all, > > there is a proposal to add some Intel specific fast math routine to > NumPy: > > https://github.com/numpy/numpy/pull/19478 > > part of numerical algorithms is that there is always a speed vs. > precision trade-off, giving a more precise result is slower. > > So there is a question what the general precision expectation should be > in NumPy. And how much is it acceptable to diverge in the > precision/speed trade-off depending on CPU/system? > > I doubt we can formulate very clear rules here, but any input on what > precision you would expect or trade-offs seem acceptable would be > appreciated! > > > Some more details > ----------------- > > This is mainly interesting e.g. for functions like logarithms, > trigonometric functions, or cubic roots. > > Some basic functions (multiplication, addition) are correct as per IEEE > standard and give the best possible result, but these are typically > only correct within very small numerical errors. 
> > This is typically measured as "ULP": > > https://en.wikipedia.org/wiki/Unit_in_the_last_place > > where 0.5 ULP would be the best possible result. > > > Merging the PR may mean relaxing the current precision slightly in some > places. In general Intel advertises 4 ULP of precision (although the > actual precision for most functions seems better). > > > Here are two tables, one from glibc and one for the Intel functions: > > > https://www.gnu.org/software/libc/manual/html_node/Errors-in-Math-Functions.html > (Mainly the LA column) > https://software.intel.com/content/www/us/en/develop/documentation/onemkl-vmperfdata/top/real-functions/measured-accuracy-of-all-real-vm-functions.html > > > Different implementation give different accuracy, but formulating some > guidelines/expectation (or referencing them) would be useful guidance. > "Close enough" depends on the application but non-linear models can get the "butterfly effect" where the results diverge if they aren't identical. For a certain class of scientific programming applications, reproducibility is paramount. Development teams may use a variety of development laptops, workstations, scientific computing clusters, and cloud computing platforms. If the tests pass on your machine but fail in CI, you have a debugging problem. If your published scientific article links to source code that replicates your computation, scientists will expect to be able to run that code, now or in a couple decades, and replicate the same outputs. They'll be using different OS releases and maybe different CPU + accelerator architectures. Reproducible Science is good. Replicated Science is better. Clearly there are other applications where it's easy to trade reproducibility and some precision for speed. -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Fri Jul 30 15:17:49 2021 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Fri, 30 Jul 2021 12:17:49 -0700 Subject: [Numpy-discussion] Floating point precision expectations in NumPy In-Reply-To: References: <3f6563789d2b4ef963c4407e187e1797866eeea5.camel@sipsolutions.net> Message-ID: On Fri, 2021-07-30 at 11:04 -0700, Jerry Morrison wrote: > On Tue, Jul 27, 2021 at 4:55 PM Sebastian Berg < > sebastian at sipsolutions.net> > wrote: > > > Hi all, > > > > there is a proposal to add some Intel specific fast math routine to > > NumPy: > > > > https://github.com/numpy/numpy/pull/19478 > > > > part of numerical algorithms is that there is always a speed vs. > > precision trade-off, giving a more precise result is slower. > >
> > Development teams may use a variety of development laptops, > workstations, > scientific computing clusters, and cloud computing platforms. If the > tests > pass on your machine but fail in CI, you have a debugging problem. > > If your published scientific article links to source code that > replicates > your computation, scientists will expect to be able to run that code, > now > or in a couple decades, and replicate the same outputs. They'll be > using > different OS releases and maybe different CPU + accelerator > architectures. > > Reproducible Science is good. Replicated Science is better. > > > Clearly there are other applications where it's easy to trade > reproducibility and some precision for speed. Agreed, although there are so many factors, often out of our control, that I am not sure that true replicability is achievable without containers :(. It would be amazing if NumPy could have a "replicable" mode, but I am not sure how that could be done, or if the "ground work" in the math and linear algebra libraries even exists. However, even if it is practically impossible to make things replicable, there is an argument for improving reproducibility and replicability, e.g. by choosing the high-accuracy version here. Even if it is impossible to actually ensure. Cheers, Sebastian > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion From ralf.gommers at gmail.com Sat Jul 31 14:14:08 2021 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sat, 31 Jul 2021 20:14:08 +0200 Subject: [Numpy-discussion] Adding POWER10 (VSX4) support to the SIMD framework In-Reply-To: References: Message-ID: On Wed, Jul 21, 2021 at 9:38 PM Nicholai Tukanov wrote: > I would like to understand how to go about extending the SIMD framework in > order to add support for POWER10. Specifically, I would like to add the > following instructions: `lxvp` and `stxvp` which loads/stores 256 bits > into/from two vectors. I believe that this will be able to give a decent > performance boost for those on POWER machines since it can halved the > amount of loads/stores issued. > Thanks for proposing this Nicholai. Hopefully someone more knowledgeable than me can point out how to go about this. > Additionally, matrix engines (2-D SIMD instructions) are becoming quite > popular due to their performance improvements for deep learning and > scientific computing. Would it be beneficial to add these new advanced SIMD > instructions into the framework or should these instructions be left to > libraries such as OpenBLAS and MKL? > This is indeed best left to OpenBLAS, MKL et al. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: