From harrigan.matthew at gmail.com Thu Nov 1 06:51:21 2018 From: harrigan.matthew at gmail.com (Matthew Harrigan) Date: Thu, 1 Nov 2018 06:51:21 -0400 Subject: [Numpy-discussion] asanyarray vs. asarray In-Reply-To: References: <24100c7f-20fd-4eed-99b0-d37660f52223@Canary> Message-ID: > > I don't think so. Dtypes have nothing to do with a whole set of use cases > that add extra methods or attributes. Random made-up example: user has a > system with 1000 sensor signals, some of which should be treated with > robust statistics for . So user writes a > subclass robust_ndarray, adds a bunch of methods like median/iqr/mad, and > uses isinstance checks in functions that accept both ndarray and > robust_ndarray to figure out how to preprocess sensor signals. > > Of course you can do everything you can do with subclasses also in other > ways, but such "let's add some methods or attributes" are much more common > (I think, hard to prove) than "let's change how indexing or multiplication > works" in end user code. > > Cheers, > Ralf > To build on Ralf's thought, a common subclass use case would be to add logging to various methods and attributes. That might actually be useful for ndarray for understanding what is under the hood of some function in a downstream project. It would satisfy SOLID and not be related at all to dtype subclasses. On Wed, Oct 31, 2018 at 8:28 PM Ralf Gommers wrote: > > > On Tue, Oct 30, 2018 at 2:22 PM Stephan Hoyer wrote: > >> On Mon, Oct 29, 2018 at 9:49 PM Eric Wieser >> wrote: >> >>> The latter - changing the behavior of multiplication breaks the >>> principle. >>> >>> But this is not the main reason for deprecating matrix - almost all of >>> the problems I've seen have been caused by the way that matrices behave >>> when sliced. The way that m[i][j] and m[i,j] are different is just one >>> example of this, the fact that they must be 2d is another.
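The slicing quirks Eric describes are easy to check directly; a quick sketch, runnable with any NumPy version that still ships `np.matrix`:

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
m = np.matrix([[1, 2], [3, 4]])

# ndarray: chained and tuple indexing agree, and a row is 1d
assert a[0][0] == a[0, 0] == 1
assert a[0].shape == (2,)

# np.matrix: a "row" is still 2d, so m[i][j] is not m[i, j]
assert m[0].shape == (1, 2)      # slicing cannot drop below 2d
assert m[0, 0] == 1              # tuple indexing yields the element
assert m[0][0].shape == (1, 2)   # chained indexing returns the row again
```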
>>> >>> Matrices behaving differently on multiplication isn't super different in >>> my mind to how string arrays fail to multiply at all. >>> >>> Eric >>> >> It's certainly fine for arithmetic to work differently on an element-wise >> basis or even to error. But np.matrix changes the shape of results from >> various ndarray operations (e.g., both multiplication and indexing), which >> is more than any dtype can do. >> >> The Liskov substitution principle (LSP) suggests that the set of >> reasonable ndarray subclasses are exactly those that could also in >> principle correspond to a new dtype. >> > > I don't think so. Dtypes have nothing to do with a whole set of use cases > that add extra methods or attributes. Random made-up example: user has a > system with 1000 sensor signals, some of which should be treated with > robust statistics for . So user writes a > subclass robust_ndarray, adds a bunch of methods like median/iqr/mad, and > uses isinstance checks in functions that accept both ndarray and > robust_ndarray to figure out how to preprocess sensor signals. > > Of course you can do everything you can do with subclasses also in other > ways, but such "let's add some methods or attributes" are much more common > (I think, hard to prove) than "let's change how indexing or multiplication > works" in end user code. > > Cheers, > Ralf > > > >> Of np.ndarray subclasses in wide-spread use, I think only the various >> "array with units" types come close to satisfying this criterion. They only >> fall short insofar as they present a misleading dtype (without unit >> information). >> >> The main problem with subclassing for numpy.ndarray is that it guarantees >> too much: a large set of operations/methods along with a specific memory >> layout exposed as part of its public API. Worse, ndarray itself is a little >> quirky (e.g., with indexing, and its handling of scalars vs. 0d arrays).
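The scalar-versus-0d quirk alluded to here, for concreteness:

```python
import numpy as np

y = np.arange(3)

# Integer indexing returns a numpy scalar, not a 0d array...
e = y[0]
assert not isinstance(e, np.ndarray)

# ...while an explicit 0d array is a different beast again:
z = np.array(5.0)
assert isinstance(z, np.ndarray) and z.ndim == 0
assert z[()] == 5.0 and not isinstance(z[()], np.ndarray)
```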
In >> practice, it's basically impossible to layer on complex behavior with these >> exact semantics, so only extremely minimal ndarray subclasses don't violate >> LSP. >> >> Once we have more easily extended dtypes, I suspect most of the good use >> cases for subclassing will have gone away. >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From m.h.vankerkwijk at gmail.com Thu Nov 1 15:06:29 2018 From: m.h.vankerkwijk at gmail.com (Marten van Kerkwijk) Date: Thu, 1 Nov 2018 15:06:29 -0400 Subject: [Numpy-discussion] asanyarray vs. asarray In-Reply-To: References: <24100c7f-20fd-4eed-99b0-d37660f52223@Canary> Message-ID: The substitution principle is interesting (and, being trained as an astronomer, not a computer scientist, I had not heard of it before). I think matrix is indeed obviously wrong here (with indexing being more annoying, but multiplication being a good example as well). Perhaps more interesting as an example to consider is MaskedArray, which is much closer to a sensible subclass, though different from Quantity in that what is masked can itself be an ndarray subclass. In a sense, it is more of a container class, in which the operations are done on what is inside it, with some care taken about which elements are fixed. This becomes quite clear when one thinks of implementing __array_ufunc__ or __array_function__: for Quantity, calling super after dealing with the units is very logical, for MaskedArray, it makes more sense to call the (universal) function again on the contents [1]. 
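The "deal with the units, then call super" pattern can be sketched with a deliberately minimal, hypothetical unit-carrying subclass — `Unitful` and its single-unit handling are made up for illustration and are not Astropy's actual Quantity:

```python
import numpy as np

class Unitful(np.ndarray):
    # Hypothetical minimal "array with units" subclass, for illustration only.
    def __new__(cls, data, unit=""):
        obj = np.asarray(data, dtype=float).view(cls)
        obj.unit = unit
        return obj

    def __array_finalize__(self, obj):
        self.unit = getattr(obj, "unit", "")

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        # "Deal with the units, then call super": here we just propagate the
        # unit of the first Unitful input, skipping any real unit algebra.
        unit = next(x.unit for x in inputs if isinstance(x, Unitful))
        raw = [np.asarray(x) for x in inputs]
        result = super().__array_ufunc__(ufunc, method, *raw, **kwargs)
        if isinstance(result, np.ndarray):
            result = result.view(Unitful)
            result.unit = unit
        return result

q = Unitful([1.0, 2.0], unit="m")
r = q + q
assert r.tolist() == [2.0, 4.0] and r.unit == "m"
```

A MaskedArray-style container would instead re-invoke `ufunc` on its contents rather than call super, as described above.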
For this particular class, if reimplemented, it might make most sense as a "mixin" since its attributes depend both on the masked class (.mask, etc.) and on what is being masked (say, .unit for a quantity). Thus, the final class might be an auto-generated new class (e.g., MaskedQuantity(MaskedArray, Quantity)). We have just added a new Distribution class to astropy which is based on this idea [2] (since this uses casting from structured dtypes which hold the samples to real arrays on which functions are evaluated, this probably could be done just as well or better with more flexible dtypes, but we have to deal with what's available in the real world, not the ideal one...). -- Marten [1] http://www.numpy.org/neps/nep-0013-ufunc-overrides.html#subclass-hierarchies [2] https://github.com/astropy/astropy/pull/6945 -------------- next part -------------- An HTML attachment was scrubbed... URL: From m.h.vankerkwijk at gmail.com Sun Nov 4 10:59:24 2018 From: m.h.vankerkwijk at gmail.com (Marten van Kerkwijk) Date: Sun, 4 Nov 2018 10:59:24 -0500 Subject: [Numpy-discussion] Should unique types of all arguments be passed on in __array_function__? Message-ID: Hi All, While thinking about implementations using __array_function__, I wondered whether the "types" argument passed on is not defined too narrowly. Currently, it only contains the types of arguments that provide __array_ufunc__, but wouldn't it make more sense to provide the unique types of all arguments, independently of whether those types have defined __array_ufunc__? It would seem quite useful for any override to know, e.g., whether a string or an integer is passed on. I thought of this partially as I was wondering how an implementation for ndarray itself would look like. 
For that, it is definitely useful to know all unique types, since if it is only ndarray, no casting whatsoever needs to be done, while if there are integers, lists, etc, an attempt has to be made to turn these into arrays (i.e., the `as[any]array` calls currently present in the implementations, which really more logically are part of `ndarray.__array_function__` dispatch). Should we change this? It is quite trivially done, but perhaps I am missing a reason for omitting the non-override types. All the best, Marten -------------- next part -------------- An HTML attachment was scrubbed... URL: From m.h.vankerkwijk at gmail.com Sun Nov 4 11:44:06 2018 From: m.h.vankerkwijk at gmail.com (Marten van Kerkwijk) Date: Sun, 4 Nov 2018 11:44:06 -0500 Subject: [Numpy-discussion] Implementations of ndarray.__array_function__ (and ndarray.__array_ufunc__) Message-ID: Hi again, Another thought about __array_function__, this time about the implementation for ndarray. In it, we currently check whether any of the types define a (different) __array_function__, and, if so, give up. This seems too strict: I think that, at least in principle, subclasses should be allowed through even if they override __array_function__. This thought was triggered by Travis pointing to the Liskov substitution principle [1], that code written for a given type should just work on a (properly written) subclass. This suggests `ndarray` should not exclude subclasses even if they override __array_function__, since if the subclass does not work that way, it can already ensure an error is raised since it knows it is called first. Indeed, this is also how python itself works: if, e.g., I subclass list as follows:
```
class MyList(list):
    def __radd__(self, other):
        return NotImplemented
```
then any `list + mylist` will just concatenate the lists, even though `MyList.__radd__` explicitly tells it cannot do it (it returning `NotImplemented` means that `list.__add__` gets a chance).
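Marten's list example can be run directly:

```python
class MyList(list):
    def __radd__(self, other):
        return NotImplemented

# MyList.__radd__ is tried first (the reflected method of a subclass takes
# priority), but its NotImplemented simply hands control back to
# list.__add__, which concatenates because a MyList *is* a list:
result = [1] + MyList([2, 3])
assert result == [1, 2, 3]
assert type(result) is list
```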
The reason that we do not already follow this logic may be that currently `ndarray.__array_function__` ends by calling the public function, which will lead to infinite recursion if there is a subclass that overrides __array_function__ and returns NotImplemented. However, inside ndarray.__array_function__, there is no real reason to call the public function - one might as well just call the implementation, in which case this is not a problem. Does the above make sense? I realize that the same would be true for `__array_ufunc__`, though there the situation is slightly trickier since it is not as easy to bypass any further override checks. Nevertheless, it does seem like it would be correct to do the same there. (And if we agree this is the case, I'd quite happily implement it -- with the merger of multiarray and umath it has become much easier to do.) All the best, Marten [1] https://en.wikipedia.org/wiki/Liskov_substitution_principle -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sun Nov 4 12:31:05 2018 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 4 Nov 2018 10:31:05 -0700 Subject: [Numpy-discussion] NumPy 1.15.4 release Message-ID: Hi All, On behalf of the NumPy team, I am pleased to announce the release of NumPy 1.15.4. This is a bugfix release for bugs and regressions reported following the 1.15.3 release. The most noticeable fix is probably having a boolean type fill value for masked arrays after the use of `==` and `!=`. The Python versions supported by this release are 2.7, 3.4-3.7. Wheels for this release can be downloaded from PyPI , source archives are available from Github . Compatibility Note ================== The NumPy 1.15.x OS X wheels released on PyPI no longer contain 32-bit binaries. That will also be the case in future releases. See `#11625 < https://github.com/numpy/numpy/issues/11625>`__ for the related discussion. 
Those needing 32-bit support should look elsewhere or build from source.

Contributors
============

A total of 4 people contributed to this release. People with a "+" by their names contributed a patch for the first time.

- Charles Harris
- Matti Picus
- Sebastian Berg
- bbbbbbbbba +

Pull requests merged
====================

A total of 4 pull requests were merged for this release.

- `#12296 `__: BUG: Dealloc cached buffer info
- `#12297 `__: BUG: Fix fill value in masked array '==' and '!=' ops.
- `#12307 `__: DOC: Correct the default value of `optimize` in `numpy.einsum`
- `#12320 `__: REL: Prepare for the NumPy 1.15.4 release

Cheers, Charles Harris -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sun Nov 4 13:04:36 2018 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 4 Nov 2018 11:04:36 -0700 Subject: [Numpy-discussion] Prep for NumPy 1.16.0 branch Message-ID: Hi All, Time to begin looking forward to the NumPy 1.16.x branch. I think there are three main topics to address: 1. current PRs that need review and merging, 2. critical fixes that need to be made, 3. status of `__array_function__`. The last probably needs some discussion. `__array_function__` seems to be working at this point, but does cause noticeable slowdowns in some function calls. I don't know if those slowdowns are significant in practice, the only way to discover that may be to make the release, or at least thorough testing of the release candidates, but we should at least discuss them and possible workarounds if needed.
Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From m.h.vankerkwijk at gmail.com Sun Nov 4 13:30:47 2018 From: m.h.vankerkwijk at gmail.com (Marten van Kerkwijk) Date: Sun, 4 Nov 2018 13:30:47 -0500 Subject: [Numpy-discussion] Prep for NumPy 1.16.0 branch In-Reply-To: References: Message-ID: Hi Chuck, For `__array_function__`, there was some discussion in https://github.com/numpy/numpy/issues/12225 that for 1.16 we might want to follow after all Nathaniel's suggestion of using an environment variable or so to opt in (since introspection breaks on python2 with our wrapped implementations). Given also the possibly significant hit in performance, this may be the best option. All the best, Marten -------------- next part -------------- An HTML attachment was scrubbed... URL: From matti.picus at gmail.com Sun Nov 4 16:48:19 2018 From: matti.picus at gmail.com (Matti Picus) Date: Sun, 4 Nov 2018 16:48:19 -0500 Subject: [Numpy-discussion] Prep for NumPy 1.16.0 branch In-Reply-To: References: Message-ID: <84bd348f-d85f-c7ee-3795-eea3bcb2f734@gmail.com> On 4/11/18 8:04 pm, Charles R Harris wrote: > Hi All, > > Time to begin looking forward to the NumPy 1.16.x branch. I think > there are three main topics to address: > > 1. current PRs that need review and merging, > 2. critical fixes that need to be made, > 3. status of `__array_function__`. > > The last probably needs some discussion. `__array_fuction__` seems to > be working at this point, but does cause noticeable slowdowns in some > function calls. I don't know if those slowdowns are significant in > practice, the only way to discover that may be to make the release, or > at least thorough testing of the release candidates, but we should at > least discuss them and possible workarounds if needed. 
Trying to have > things in good shape is important because 1.16.x will be the last > release that supports Python 2.7, and even though we will be > maintaining it for the next year, that will be easier if it is stable. > As to the first two topics, I think we should try to be conservative > at this point and look mostly for bug fixes and documentation updates. > > Thoughts? > > Chuck > Beyond things with the 1.16 milestone, it would be nice to address the structured array cleanup https://gist.github.com/ahaldane/6cd44886efb449f9c8d5ea012747323b and to get the matmul-as-ufunc https://github.com/numpy/numpy/pull/12219 merged Matti From shoyer at gmail.com Sun Nov 4 19:51:07 2018 From: shoyer at gmail.com (Stephan Hoyer) Date: Sun, 4 Nov 2018 16:51:07 -0800 Subject: [Numpy-discussion] Should unique types of all arguments be passed on in __array_function__? In-Reply-To: References: Message-ID: On Sun, Nov 4, 2018 at 8:03 AM Marten van Kerkwijk < m.h.vankerkwijk at gmail.com> wrote: > I thought of this partially as I was wondering how an implementation for > ndarray itself would look like. For that, it is definitely useful to know > all unique types, since if it is only ndarray, no casting whatsoever needs > to be done, while if there are integers, lists, etc, an attempt has to be > made to turn these into arrays > OK, so hypothetically we could invoke versions of each numpy function that don't call `as[any]array`, and this would slightly speed up subclasses that call super().__array_function__? The former feels pretty unlikely for now -- and would be speeding up a somewhat niche use-case (more niche even than __array_function__ in general) -- but perhaps I could be convinced. > (i.e., the `as[any]array` calls currently present in the implementations, > which really more logically are part of `ndarray.__array_function__` > dispatch).
> I can sort of see the reasoning for this, but I suspect the overhead of actually calling `ndarray.__array_function__` as part of calling every NumPy function would be prohibitive. It would mean that __array_function__ attributes get checked twice, once for dispatching and once in `ndarray.__array_function__`. It would also mean that `ndarray.__array_function__` would need to grow a general purpose coercion mechanism for converting array-like arguments into ndarray objects. I suspect this isn't really possible given the diversity of function signatures in NumPy, e.g., consider the handling of lists in np.block() (recurse) vs. np.concatenate (pass through) vs ufuncs (coerce to ndarray). The best we could do would be to add another special function, like the dispatchers, for handling coercion for each specific NumPy function. Should we change this? It is quite trivially done, but perhaps I am missing > a reason for omitting the non-override types. > Realistically, without these other changes in NumPy, how would this improve code using __array_function__? From a general purpose dispatching perspective, are there cases where you'd want to return NotImplemented based on types that don't implement __array_function__? I guess this might help if your alternative array class is super-explicit, and doesn't automatically call `asmyarray()` on each argument. You could rely on __array_function__ to return NotImplemented (and thus raise TypeError) rather than type checking in every function you write for your alternative arrays. One minor downside would be speed: now __array_function__ implementations need to check a longer list of types. Another minor downside: if users follow the example of the NDArrayOperatorsMixin docstring, they would now need to explicitly list all of the scalar types (without __array_function__) that they support, including builtin types like int and type(None).
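For reference, a minimal sketch of what an override currently sees — `MyArray` is hypothetical; the behavior follows NEP 18 as shipped in NumPy >= 1.17:

```python
import numpy as np

seen_types = []

class MyArray:
    # Hypothetical duck array, for illustration only.
    def __init__(self, data):
        self.data = np.asarray(data)

    def __array_function__(self, func, types, args, kwargs):
        seen_types.extend(types)      # record what dispatch passes us
        if func is np.concatenate:
            arrays = [x.data if isinstance(x, MyArray) else np.asarray(x)
                      for x in args[0]]
            return MyArray(np.concatenate(arrays, **kwargs))
        return NotImplemented         # NumPy then raises TypeError for us

a, b = MyArray([1, 2]), MyArray([3])
c = np.concatenate([a, b, [4]])
# `types` contained only MyArray -- the plain list argument, which lacks
# __array_function__, never shows up, which is exactly the point at issue.
assert c.data.tolist() == [1, 2, 3, 4]
assert seen_types == [MyArray]
```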
I suppose this ties into our recommended best practices for doing type checking in __array_ufunc__/__array_function__ implementations, which should probably be updated regardless: https://github.com/numpy/numpy/issues/12258#issuecomment-432858949 Best, Stephan -------------- next part -------------- An HTML attachment was scrubbed... URL: From shoyer at gmail.com Sun Nov 4 20:16:12 2018 From: shoyer at gmail.com (Stephan Hoyer) Date: Sun, 4 Nov 2018 17:16:12 -0800 Subject: [Numpy-discussion] Prep for NumPy 1.16.0 branch In-Reply-To: References: Message-ID: On Sun, Nov 4, 2018 at 10:32 AM Marten van Kerkwijk < m.h.vankerkwijk at gmail.com> wrote: > Hi Chuck, > > For `__array_function__`, there was some discussion in > https://github.com/numpy/numpy/issues/12225 that for 1.16 we might want > to follow after all Nathaniel's suggestion of using an environment variable > or so to opt in (since introspection breaks on python2 with our wrapped > implementations). Given also the possibly significant hit in performance, > this may be the best option. > All the best, > > Marten > I am also leaning towards this right now, depending on how long we plan to wait for releasing 1.16. It will take us at least a little while to sort out performance issues for __array_function__, I'd guess at least a few weeks. Then a blocker still might turn up during the release candidate process (though I think we've found most of the major bugs / downstream issues already through tests on NumPy's dev branch). Overall, it does feel a little misguided to rush in a change as pervasive as __array_function__ for a long term support release. If we exclude __array_function__ I expect the whole release process for 1.16 would go much smoother. We might even try to get 1.17 out faster than usual, so we can minimize the number of additional changes besides __array_function__ and going Python 3 only -- that's already a good bit of change.
Note that if we make this change (reverting __array_function__), we'll need to revisit where we put a few deprecation warnings -- these will need to be restored into function bodies, not their dispatcher functions. Also: it would be really nice if we get matmul-as-ufunc in before (or at the same time) as __array_function__, so we have a complete story about it being possible to override everything in NumPy. This is another argument for delaying __array_function__, if matmul-as-ufunc can't make it in time for 1.16. Best, Stephan -------------- next part -------------- An HTML attachment was scrubbed... URL: From shoyer at gmail.com Sun Nov 4 20:57:01 2018 From: shoyer at gmail.com (Stephan Hoyer) Date: Sun, 4 Nov 2018 17:57:01 -0800 Subject: [Numpy-discussion] Implementations of ndarray.__array_function__ (and ndarray.__array_ufunc__) In-Reply-To: References: Message-ID: On Sun, Nov 4, 2018 at 8:45 AM Marten van Kerkwijk < m.h.vankerkwijk at gmail.com> wrote: > Does the above make sense? I realize that the same would be true for > `__array_ufunc__`, though there the situation is slightly trickier since it > is not as easy to bypass any further override checks. Nevertheless, it does > seem like it would be correct to do the same there. (And if we agree this > is the case, I'd quite happily implement it -- with the merger of > multiarray and umath it has become much easier to do.) > Marten actually implemented a draft version of this already in https://github.com/numpy/numpy/pull/12328 :). I found reading over the PR helpful for understanding this proposal. I guess the practical import of this change is that it makes it (much?) easier to write __array_function__ for ndarray subclasses: if there's a function where NumPy's default function works fine, you don't need to bother with returning anything other than NotImplemented from __array_function__. It's sort of like NotImplementedButCoercible, but only for ndarray subclasses.
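To make the proposal concrete: with subclasses let through (as `ndarray.__array_function__` does in NumPy >= 1.17), a do-nothing subclass only has to log and defer — `LoggingArray` is a made-up name for illustration:

```python
import numpy as np

class LoggingArray(np.ndarray):
    calls = []   # shared log, just for the demonstration

    def __array_function__(self, func, types, args, kwargs):
        LoggingArray.calls.append(func.__name__)
        # Defer to ndarray's default implementation; safe here because
        # LoggingArray changes no semantics, so substitutability holds.
        return super().__array_function__(func, types, args, kwargs)

x = np.arange(4.0).view(LoggingArray)
m = np.mean(x)
assert float(m) == 1.5
assert LoggingArray.calls == ["mean"]
```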
One minor downside is that this might make it harder to eventually deprecate and/or contemplate removing checks for 'mean' methods in functions like np.mean(), because __array_function__ implementers might still be relying on this. But so far, I think this makes sense. The PR includes additional changes to np.core.overrides, but I'm not sure if those are actually required here (or rather only possible due to this change). I guess they are needed if you want to be able to count on ndarray.__array_function__ being called after subclass __array_function__ methods. I'm not sure I like this part: it means that ndarray.__array_function__ actually gets called when other arguments implement __array_function__. For interactions with objects that aren't ndarray subclasses this is entirely pointless and would unnecessarily slow things down, since ndarray.__array_function__ will always return NotImplemented. -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sun Nov 4 22:02:24 2018 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 4 Nov 2018 20:02:24 -0700 Subject: [Numpy-discussion] Prep for NumPy 1.16.0 branch In-Reply-To: References: Message-ID: On Sun, Nov 4, 2018 at 6:16 PM Stephan Hoyer wrote: > On Sun, Nov 4, 2018 at 10:32 AM Marten van Kerkwijk < > m.h.vankerkwijk at gmail.com> wrote: > >> Hi Chuck, >> >> For `__array_function__`, there was some discussion in >> https://github.com/numpy/numpy/issues/12225 that for 1.16 we might want >> to follow after all Nathaniel's suggestion of using an environment variable >> or so to opt in (since introspection breaks on python2 with our wrapped >> implementations). Given also the possibly significant hit in performance, >> this may be the best option. >> All the best, >> >> Marten >> > > I am also leaning towards this right now, depending on how long we plan to
It will take us at least a little while to sort > out performance issues for __array_function__, I'd guess at least a few > weeks. Then a blocker still might turn up during the release candidate > process (though I think we've found most of the major bugs / downstream > issues already through tests on NumPy's dev branch). > My tentative schedule is to branch in about two weeks, then allow 2 weeks of testing for rc1, possibly another two weeks for rc2, and then a final. so possibly about six weeks to final release. That leaves 2 to 4 weeks of slack before 2019. > Overall, it does feels a little misguided to rush in a change as pervasive > as __array_function__ for a long term support release. If we exclude > __array_function__ I expect the whole release process for 1.16 would go > much smoother. We might even try to get 1.17 out faster than usual, so we > can minimize the number of additional changes besides __array_function__ > and going Python 3 only -- that's already a good bit of change. > I would like to get 1.17 out a bit early. I'm not sure how many backwards incompatible changes we want to have in the first post python2 release. My initial thoughts are to drop Python 2.7 testing, go to C99, and get the new fft in. Beyond that, I'm hesitant to start tearing out all the Python2 special casing in the first new release, although that could certainly be the main task for 1.17 and would clean up the code considerably. It might also be a good time to catch up on changing deprecations to errors. Thoughts on how to proceed are welcome. > Note that if we make this change (reverting __array_function__), we'll > need to revisit where we put a few deprecation warnings -- these will need > to be restored into function bodies, not their dispatcher functions. > > Also: it would be really nice if we get matmul-as-ufunc in before (or at > the same time) as __array_function__, so we have a complete story about it > being possible to override everything in NumPy. 
This is another argument > for delaying __array_function__, if matmul-as-ufunc can't make it in time > for 1.16. > That's two votes for matmul-as-ufunc. How much would it cost to simply make __array_function__ a nop? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From mark.harfouche at gmail.com Sun Nov 4 22:34:51 2018 From: mark.harfouche at gmail.com (Mark Harfouche) Date: Sun, 4 Nov 2018 22:34:51 -0500 Subject: [Numpy-discussion] out parameter for np.fromfile Message-ID: I was wondering what would your thoughts be on adding an output parameter to np.fromfile? The advantage would be when interfacing with executables like ffmpeg which are arguably easier to use by calling them as a subprocess compared to a shared library in python. Having the output parameter in np.fromfile would enable pre-allocation of large arrays that are reused during the computation of new image frames when decoding large video files. Thoughts are appreciated! Mark -------------- next part -------------- An HTML attachment was scrubbed... URL: From mark.harfouche at gmail.com Sun Nov 4 22:43:04 2018 From: mark.harfouche at gmail.com (Mark Harfouche) Date: Sun, 4 Nov 2018 22:43:04 -0500 Subject: [Numpy-discussion] Prep for NumPy 1.16.0 branch In-Reply-To: References: Message-ID: > Thoughts on how to proceed are welcome. I've been involved in scikit-image and that project tore out the python2 only code rather quickly after 2.7 support was dropped. I think it caused a few hiccups when backporting bugfixes. I imagine that `1.16.1` and `1.16.2` releases will come out quickly and as such, I think removing `if else` statements for python2 immediately after `1.16` is released will cause annoyances in the first few months bugs are being ironed out. My 2cents. 
On Sun, Nov 4, 2018 at 10:04 PM Charles R Harris wrote: > > > On Sun, Nov 4, 2018 at 6:16 PM Stephan Hoyer wrote: > >> On Sun, Nov 4, 2018 at 10:32 AM Marten van Kerkwijk < >> m.h.vankerkwijk at gmail.com> wrote: >> >>> Hi Chuck, >>> >>> For `__array_function__`, there was some discussion in >>> https://github.com/numpy/numpy/issues/12225 that for 1.16 we might want >>> to follow after all Nathaniel's suggestion of using an environment variable >>> or so to opt in (since introspection breaks on python2 with our wrapped >>> implementations). Given also the possibly significant hit in performance, >>> this may be the best option. >>> All the best, >>> >>> Marten >>> >> >> I am also leaning towards this right now, depending on how long we plan >> to wait for releasing 1.16. It will take us at least a little while to sort >> out performance issues for __array_function__, I'd guess at least a few >> weeks. Then a blocker still might turn up during the release candidate >> process (though I think we've found most of the major bugs / downstream >> issues already through tests on NumPy's dev branch). >> > > My tentative schedule is to branch in about two weeks, then allow 2 weeks > of testing for rc1, possibly another two weeks for rc2, and then a final. > so possibly about six weeks to final release. That leaves 2 to 4 weeks of > slack before 2019. > > >> Overall, it does feels a little misguided to rush in a change as >> pervasive as __array_function__ for a long term support release. If we >> exclude __array_function__ I expect the whole release process for 1.16 >> would go much smoother. We might even try to get 1.17 out faster than >> usual, so we can minimize the number of additional changes besides >> __array_function__ and going Python 3 only -- that's already a good bit of >> change. >> > > I would like to get 1.17 out a bit early. I'm not sure how many backwards > incompatible changes we want to have in the first post python2 release. 
My > initial thoughts are to drop Python 2.7 testing, go to C99, and get the new > fft in. Beyond that, I'm hesitant to start tearing out all the Python2 > special casing in the first new release, although that could certainly be > the main task for 1.17 and would clean up the code considerably. It might > also be a good time to catch up on changing deprecations to errors. > Thoughts on how to proceed are welcome. > > >> Note that if we make this change (reverting __array_function__), we'll >> need to revisit where we put a few deprecation warnings -- these will need >> to be restored into function bodies, not their dispatcher functions. >> >> Also: it would be really nice if we get matmul-as-ufunc in before (or at >> the same time) as __array_function__, so we have a complete story about it >> being possible to override everything in NumPy. This is another argument >> for delaying __array_function__, if matmul-as-ufunc can't make it in time >> for 1.16. >> > > That's two votes for matmul-as-ufunc. How much would it cost to simply > make __array_function__ a nop? > > Chuck > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From m.h.vankerkwijk at gmail.com Mon Nov 5 09:00:05 2018 From: m.h.vankerkwijk at gmail.com (Marten van Kerkwijk) Date: Mon, 5 Nov 2018 09:00:05 -0500 Subject: [Numpy-discussion] Should unique types of all arguments be passed on in __array_function__? In-Reply-To: References: Message-ID: Hi Stephan, I fear my example about thinking about `ndarray.__array_function__` distracted from the gist of my question, which was whether for `__array_function__` implementations *generally* it wouldn't be handier to have all unique types rather than just those that override `__array_function__`. 
It would seem that for any other implementation than for numpy itself, the presence of __array_function__ is indeed almost irrelevant. As a somewhat random example, why would it, e.g., for DASK be useful to know that another argument is a Quantity, but not that it is a file handle? (Presumably, it cannot handle either...) All the best, Marten -------------- next part -------------- An HTML attachment was scrubbed... URL: From m.h.vankerkwijk at gmail.com Mon Nov 5 09:08:18 2018 From: m.h.vankerkwijk at gmail.com (Marten van Kerkwijk) Date: Mon, 5 Nov 2018 09:08:18 -0500 Subject: [Numpy-discussion] Should unique types of all arguments be passed on in __array_function__? In-Reply-To: References: Message-ID: More specifically: Should we change this? It is quite trivially done, but perhaps I am missing >> a reason for omitting the non-override types. >> > > Realistically, without these other changes in NumPy, how would this > improve code using __array_function__? From a general purpose dispatching > perspective, are there cases where you'd want to return NotImplemented > based on types that don't implement __array_function__? > I think, yes, that would be the closest analogy to the python operators. Saves you from having separate cases for types that have and do not have `__array_function__`. > I guess this might help if your alternative array class is super-explicit, > and doesn't automatically call `asmyarray()` on each argument. You could > rely on __array_function__ to return NotImplement (and thus raise > TypeError) rather than type checking in every function you write for your > alternative arrays. > Indeed. > One minor downside would speed: now __array_function__ implementations > need to check a longer list of types. > That's true. 
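To make the trade-off concrete, here is a minimal sketch of a duck array under the proposed protocol, mirroring the operator protocol by returning NotImplemented instead of raising; `MyArray` and `HANDLED` are illustrative names, not NumPy API:

```python
import numpy as np

HANDLED = {}  # maps NumPy functions to our own implementations

class MyArray:
    """Minimal duck array overriding a single NumPy function."""

    def __init__(self, data):
        self.data = np.asarray(data)

    def __array_function__(self, func, types, args, kwargs):
        if func not in HANDLED:
            return NotImplemented
        # The cost under discussion: every implementation scans the whole
        # list of relevant types before claiming the call.
        if not all(issubclass(t, (MyArray, np.ndarray)) for t in types):
            return NotImplemented
        return HANDLED[func](*args, **kwargs)

def _concatenate(arrays, axis=0):
    # Unwrap MyArray inputs, delegate to NumPy, and rewrap the result.
    parts = [a.data if isinstance(a, MyArray) else np.asarray(a) for a in arrays]
    return MyArray(np.concatenate(parts, axis=axis))

HANDLED[np.concatenate] = _concatenate

result = np.concatenate([MyArray([1, 2]), MyArray([3, 4])])
```

Returning NotImplemented rather than raising TypeError is what lets another argument's implementation get a chance, exactly as with binary operators.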
> > Another minor downside: if users follow the example of > NDArrayOperatorsMixin docstring, they would now need to explicitly list all > of the scalar types (without __array_function__) that they support, > including builtin types like int and type(None). I suppose this ties into > our recommended best practices for doing type checking in > __array_ufunc__/__array_function__ implementations, which should probably > be updated regardless: > https://github.com/numpy/numpy/issues/12258#issuecomment-432858949 > > Also true. It makes me wonder again whether passing on the types is useful at all... But I end up thinking that it is not up to an implementation to raise TypeError - it should just return NotImplemented. If we'd wanted to give more information, we might also consider passing on `overloaded_args` - then perhaps one has the best of both worlds. All the best, Marten -------------- next part -------------- An HTML attachment was scrubbed... URL: From m.h.vankerkwijk at gmail.com Mon Nov 5 09:23:59 2018 From: m.h.vankerkwijk at gmail.com (Marten van Kerkwijk) Date: Mon, 5 Nov 2018 09:23:59 -0500 Subject: [Numpy-discussion] Implementations of ndarray.__array_function__ (and ndarray.__array_ufunc__) In-Reply-To: References: Message-ID: On Sun, Nov 4, 2018 at 8:57 PM Stephan Hoyer wrote: > On Sun, Nov 4, 2018 at 8:45 AM Marten van Kerkwijk < > m.h.vankerkwijk at gmail.com> wrote: > >> Does the above make sense? I realize that the same would be true for >> `__array_ufunc__`, though there the situation is slightly trickier since it >> is not as easy to bypass any further override checks. Nevertheless, it does >> seem like it would be correct to do the same there. (And if we agree this >> is the case, I'd quite happily implement it -- with the merger of >> multiarray and umath it has become much easier to do.) >> > > Marten actually implemented a draft version of this already in > https://github.com/numpy/numpy/pull/12328 :). 
I found reading over the PR > helpful for understand this proposal. > > I guess the practical import of this change is that it makes it (much?) > easier to write __array_function__ for ndarray subclasses: if there's a > function where NumPy's default function works fine, you don't need to > bother with returning anything other than NotImplemented from > __array_function__. It's sort of like NotImplementedButCoercible, but only > for ndarray subclasses. > Yes, return NotImplemented if there is another array, or, even simpler, just call super. Note that it is not quite like `NotImplementedButCoercible`, since no actual coercion to ndarray would necessarily be needed - with adherence to the Liskov substitution principle, the subclass might stay intact (if only partially initialized). > One minor downside is that this might make it harder to eventually > deprecate and/or contemplate removing checks for 'mean' methods in > functions like np.mean(), because __array_function__ implementers might > still be relying on this. > I think this is somewhat minor indeed, since we can (and should) insist that subclasses here properly behave as subclasses, so if an ndarray-specific implementation breaks a subclass, that might well indicate that the subclass is not quite good enough (and we can now point out there is a way to override the function). It might also indicate that the code itself could be better - that would be a win. But so far, I think this makes sense. > > The PR includes additional changes to np.core.overrides, but I'm not sure > if those are actually required here (or rather only possible due to this > change). I guess they are needed if you want to be able to count on > ndarray.__array_function__ being called after subclass __array_function__ > methods. > It is mostly a transfer of functionality from `get_override_types_and_args` to the place where the implementation is decided upon. Perhaps more logical even if we do not pursue this. 
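For illustration, the kind of logging subclass mentioned earlier in the thread reduces to a few lines under this scheme, assuming ndarray.__array_function__ behaves as proposed (i.e. it runs NumPy's own implementation whenever every type in `types` is an ndarray subclass):

```python
import numpy as np

calls = []  # record of dispatched NumPy functions, for later inspection

class LoggedArray(np.ndarray):
    """ndarray subclass that logs each dispatched function, then defers."""

    def __array_function__(self, func, types, args, kwargs):
        calls.append(func.__name__)
        # Defer: ndarray.__array_function__ runs NumPy's implementation
        # when all types in `types` are ndarray subclasses.
        return super().__array_function__(func, types, args, kwargs)

a = np.arange(6).view(LoggedArray)
m = np.mean(a)  # dispatches through LoggedArray.__array_function__
```

The subclass never needs per-function overrides for functions where NumPy's default is fine.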
> > I'm not sure I like this part: it means that ndarray.__array_function__ > actually gets called when other arguments implement __array_function__. For > interactions with objects that aren't ndarray subclasses this is entirely > pointless and would unnecessarily slow things down, since > ndarray._array_function__ will always return NotImplemented. > Agreed here. I did in fact think about it, but wasn't sure (and didn't have time to think how to check) that the gain in time for cases where an ndarray comes before the relevant array mimic (and there thus a needless call to ndarray.__array_function__ can be prevented) was worth it compared to the cost of attempting to do the removal for cases where the array mimic came first or where there was no regular ndarray in the first place. But I think this is an implementation detail; for now, let me add a note to the PR about it. All the best, Marten -------------- next part -------------- An HTML attachment was scrubbed... URL: From m.h.vankerkwijk at gmail.com Mon Nov 5 09:28:12 2018 From: m.h.vankerkwijk at gmail.com (Marten van Kerkwijk) Date: Mon, 5 Nov 2018 09:28:12 -0500 Subject: [Numpy-discussion] Should unique types of all arguments be passed on in __array_function__? In-Reply-To: References: Message-ID: Hi Stephan, Another part of your reply worth considering, though slightly off topic for the question here, of what to pass on in `types`: On Sun, Nov 4, 2018 at 7:51 PM Stephan Hoyer wrote: > On Sun, Nov 4, 2018 at 8:03 AM Marten van Kerkwijk < > m.h.vankerkwijk at gmail.com> wrote: > >> I thought of this partially as I was wondering how an implementation for >> ndarray itself would look like. 
For that, it is definitely useful to know >> all unique types, since if it is only ndarray, no casting whatsoever needs >> to be done, while if there are integers, lists, etc, an attempt has to be >> made to turn these into arrays >> > > OK, so hypothetically we could invoke versions of each the numpy function > that doesn't call `as[any]array`, and this would slightly speed-up > subclasses that call super().__array_function__? > > A longer-term goal that I had in mind here was generally for the implementations to just be able to assume their arguments are ndarray, i.e., be free to assume there is a shape, dtype, etc. That is not specifically useful for subclasses; for pure python code, it might also mean array mimics could happily use the implementation. But perhaps more importantly, the code would become substantially cleaner. Anyway, really a longer-term goal... All the best, Marten -------------- next part -------------- An HTML attachment was scrubbed... URL: From m.h.vankerkwijk at gmail.com Mon Nov 5 09:36:39 2018 From: m.h.vankerkwijk at gmail.com (Marten van Kerkwijk) Date: Mon, 5 Nov 2018 09:36:39 -0500 Subject: [Numpy-discussion] out parameter for np.fromfile In-Reply-To: References: Message-ID: Hi Mark, Having an `out` might make sense. With present numpy, if you are really dealing with a file or file-like object, you might consider using `np.memmap` to access the data more directly. If it is something that looks more like a buffer, `np.frombuffer` may be useful (that doesn't copy data, but points the array at the memory that holds the buffer). All the best, Marten On Sun, Nov 4, 2018 at 10:35 PM Mark Harfouche wrote: > I was wondering what would your thoughts be on adding an output parameter > to np.fromfile? > > The advantage would be when interfacing with executables like ffmpeg > which are arguably easier to use by calling them as a subprocess compared > to a shared library in python. 
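Until an `out` parameter exists, the pre-allocation Mark describes can be approximated with `readinto`, which fills an existing array's buffer in place; a sketch, with `io.BytesIO` standing in for an unseekable pipe such as ffmpeg's stdout:

```python
import io
import numpy as np

# Reused frame buffer, allocated once.
frame = np.empty((480, 640, 3), dtype=np.uint8)

# Stand-in for an unseekable pipe, e.g. the stdout of a subprocess.
stream = io.BytesIO(b"\x01" * frame.nbytes)

# readinto writes straight into the array's memory, avoiding the
# temporary bytes object that np.frombuffer(stream.read(...)) would need.
nread = stream.readinto(memoryview(frame).cast("B"))
```

The `cast("B")` flattens the array's buffer to a plain byte view, so any C-contiguous dtype/shape works the same way.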
> > Having the output parameter in np.fromfile would enable pre-allocation of > large arrays that are reused during the computation of new image frames > when decoding large video files. > > Thoughts are appreciated! > > Mark > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From m.h.vankerkwijk at gmail.com Mon Nov 5 09:42:42 2018 From: m.h.vankerkwijk at gmail.com (Marten van Kerkwijk) Date: Mon, 5 Nov 2018 09:42:42 -0500 Subject: [Numpy-discussion] Prep for NumPy 1.16.0 branch In-Reply-To: References: Message-ID: For astropy, we are also waiting a little before having a rip-python2-out fiesta. I think it is worth trying to get matmul in 1.16, independently of __array_function__ - it really belongs with the ufunc overrides and all the groundwork has been done. For __array_function__, is it at all an option to go to the disable-by-default step? I.e., by default have array_function_dispatch just return the implementation instead of wrapping it? Though perhaps reversion is indeed cleaner; most people who would like to play with it are quite able to install the development version... -- Marten On Sun, Nov 4, 2018 at 10:43 PM Mark Harfouche wrote: > > Thoughts on how to proceed are welcome. > > I've been involved in scikit-image and that project tore out the python2 > only code rather quickly after 2.7 support was dropped. I think it caused a > few hiccups when backporting bugfixes. I imagine that `1.16.1` and `1.16.2` > releases will come out quickly and as such, I think removing `if else` > statements for python2 immediately after `1.16` is released will cause > annoyances in the first few months while bugs are being ironed out. > > My 2 cents.
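The disable-by-default step Marten mentions could be as small as an import-time switch in the dispatch decorator; a rough sketch (the environment-variable name and wrapper internals are illustrative only):

```python
import functools
import os

# Opt-in switch, read once at import time (name is hypothetical).
_ENABLED = os.environ.get("NUMPY_EXPERIMENTAL_ARRAY_FUNCTION", "0") == "1"

def array_function_dispatch(dispatcher):
    """Sketch of a dispatch decorator that can be turned into a no-op."""
    def decorator(implementation):
        if not _ENABLED:
            # Disabled: hand back the implementation untouched, so there is
            # zero per-call overhead and introspection stays intact.
            return implementation

        @functools.wraps(implementation)
        def wrapper(*args, **kwargs):
            relevant_args = dispatcher(*args, **kwargs)
            # ... try __array_function__ on relevant_args here ...
            return implementation(*args, **kwargs)

        return wrapper
    return decorator

@array_function_dispatch(lambda a: (a,))
def add_one(a):
    """Toy dispatched function; behaves identically in both modes."""
    return a + 1
```

When disabled, the decorator returns the implementation itself, so reverting later becomes a one-line change rather than a wholesale removal.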
> > On Sun, Nov 4, 2018 at 10:04 PM Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> >> >> On Sun, Nov 4, 2018 at 6:16 PM Stephan Hoyer wrote: >> >>> On Sun, Nov 4, 2018 at 10:32 AM Marten van Kerkwijk < >>> m.h.vankerkwijk at gmail.com> wrote: >>> >>>> Hi Chuck, >>>> >>>> For `__array_function__`, there was some discussion in >>>> https://github.com/numpy/numpy/issues/12225 that for 1.16 we might >>>> want to follow after all Nathaniel's suggestion of using an environment >>>> variable or so to opt in (since introspection breaks on python2 with our >>>> wrapped implementations). Given also the possibly significant hit in >>>> performance, this may be the best option. >>>> All the best, >>>> >>>> Marten >>>> >>> >>> I am also leaning towards this right now, depending on how long we plan >>> to wait for releasing 1.16. It will take us at least a little while to sort >>> out performance issues for __array_function__, I'd guess at least a few >>> weeks. Then a blocker still might turn up during the release candidate >>> process (though I think we've found most of the major bugs / downstream >>> issues already through tests on NumPy's dev branch). >>> >> >> My tentative schedule is to branch in about two weeks, then allow 2 weeks >> of testing for rc1, possibly another two weeks for rc2, and then a final. >> so possibly about six weeks to final release. That leaves 2 to 4 weeks of >> slack before 2019. >> >> >>> Overall, it does feels a little misguided to rush in a change as >>> pervasive as __array_function__ for a long term support release. If we >>> exclude __array_function__ I expect the whole release process for 1.16 >>> would go much smoother. We might even try to get 1.17 out faster than >>> usual, so we can minimize the number of additional changes besides >>> __array_function__ and going Python 3 only -- that's already a good bit of >>> change. >>> >> >> I would like to get 1.17 out a bit early. 
I'm not sure how many backwards >> incompatible changes we want to have in the first post python2 release. My >> initial thoughts are to drop Python 2.7 testing, go to C99, and get the new >> fft in. Beyond that, I'm hesitant to start tearing out all the Python2 >> special casing in the first new release, although that could certainly be >> the main task for 1.17 and would clean up the code considerably. It might >> also be a good time to catch up on changing deprecations to errors. >> Thoughts on how to proceed are welcome. >> >> >>> Note that if we make this change (reverting __array_function__), we'll >>> need to revisit where we put a few deprecation warnings -- these will need >>> to be restored into function bodies, not their dispatcher functions. >>> >>> Also: it would be really nice if we get matmul-as-ufunc in before (or at >>> the same time) as __array_function__, so we have a complete story about it >>> being possible to override everything in NumPy. This is another argument >>> for delaying __array_function__, if matmul-as-ufunc can't make it in time >>> for 1.16. >>> >> >> That's two votes for matmul-as-ufunc. How much would it cost to simply >> make __array_function__ a nop? >> >> Chuck >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mark.harfouche at gmail.com Mon Nov 5 11:24:57 2018 From: mark.harfouche at gmail.com (Mark Harfouche) Date: Mon, 5 Nov 2018 11:24:57 -0500 Subject: [Numpy-discussion] out parameter for np.fromfile In-Reply-To: References: Message-ID: Thanks Marten. 
I tried memmap for a few things but it seemed to create another OS-level buffer in specific situations. I think the `seek` operation in the `memmap` also caused some performance bottlenecks. Maybe I'll have time to summarize my findings another day. The particular use case of `ffmpeg` is tricky since it is grabbing a lot of data from `stdout`, which isn't a typical file buffer. Specifically, it is often `buffered` but `unseekable`. On Mon, Nov 5, 2018 at 9:38 AM Marten van Kerkwijk < m.h.vankerkwijk at gmail.com> wrote: > Hi Mark, > > Having an `out` might make sense. With present numpy, if you are really > dealing with a file or file-like object, you might consider using > `np.memmap` to access the data more directly. If it is something that looks > more like a buffer, `np.frombuffer` may be useful (that doesn't copy data, > but points the array at the memory that holds the buffer). > > All the best, > > Marten > > > On Sun, Nov 4, 2018 at 10:35 PM Mark Harfouche > wrote: > >> I was wondering what would your thoughts be on adding an output parameter >> to np.fromfile? >> >> The advantage would be when interfacing with executables like ffmpeg >> which are arguably easier to use by calling them as a subprocess compared >> to a shared library in python. >> >> Having the output parameter in np.fromfile would enable pre-allocation >> of large arrays that are reused during the computation of new image frames >> when decoding large video files. >> >> Thoughts are appreciated! >> >> Mark >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed...
URL: From f.s.farimani at gmail.com Mon Nov 5 16:44:52 2018 From: f.s.farimani at gmail.com (Foad Sojoodi Farimani) Date: Mon, 5 Nov 2018 22:44:52 +0100 Subject: [Numpy-discussion] numpy pprint? Message-ID: Hello everyone, Following this question , I'm convinced that numpy ndarrays are not MATLAB/mathematical multidimensional matrices and I should stop expecting them to be. However I still think it would have a lot of benefit to have a function like sympy's pprint to pretty-print: something like pandas .head and .tail methods plus .left .right .UpLeft .UpRight .DownLeft .DownRight methods. When nothing is mentioned it would show 4 corners and put dots in the middle if the array is too big for the terminal. Best, Foad -------------- next part -------------- An HTML attachment was scrubbed... URL: From mark.harfouche at gmail.com Tue Nov 6 00:11:20 2018 From: mark.harfouche at gmail.com (Mark Harfouche) Date: Tue, 6 Nov 2018 00:11:20 -0500 Subject: [Numpy-discussion] numpy pprint? In-Reply-To: References: Message-ID: Foad, Visualizing data is definitely a complex field. I definitely feel your pain. Printing your data is but one way of visualizing it, and probably only useful for very small and constrained datasets. Have you looked into set_printoptions to see how numpy's existing capabilities might help you with your visualization? The code you showed seems quite good. I wouldn't worry about performance when it comes to functions that will seldom be called in tight loops. As you'll learn more about python and numpy, you'll keep expanding it to include more use cases. For many of my projects, I create small submodules for visualization tailored to the specific needs of the particular project. I'll try to incorporate your functions and see how I use them. Your original post seems to have some confusion about C Style vs F Style ordering. I hope that has been resolved.
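As an aside, NumPy's summarization already produces the four-corners-with-dots view being asked for, controlled by the `threshold` and `edgeitems` print options; for example:

```python
import numpy as np

a = np.arange(10000).reshape(100, 100)

# threshold: summarize once the array has more elements than this;
# edgeitems: how many rows/columns to keep at each corner.
s = np.array2string(a, threshold=50, edgeitems=2)
print(s)
```

This prints the two leading and trailing rows and columns, with `...` standing in for the elided middle; `np.set_printoptions` applies the same settings globally instead of per call.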
There is also a lot of good documentation https://docs.scipy.org/doc/numpy/user/numpy-for-matlab-users.html#numpy-for-matlab-users-notes about transitioning from matlab. Mark On Mon, Nov 5, 2018 at 4:46 PM Foad Sojoodi Farimani wrote: > Hello everyone, > > Following this question , > I'm convinced that numpy ndarrays are not MATLAB/mathematical > multidimentional matrices and I should stop expecting them to be. However I > still think it would have a lot of benefit to have a function like sympy's > pprint to pretty print. something like pandas .head and .tail method plus > .left .right .UpLeft .UpRight .DownLeft .DownRight methods. when nothing > mentioned it would show 4 corners and put dots in the middle if the array > is to big for the terminal. > > Best, > Foad > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From wieser.eric+numpy at gmail.com Tue Nov 6 00:50:31 2018 From: wieser.eric+numpy at gmail.com (Eric Wieser) Date: Mon, 5 Nov 2018 21:50:31 -0800 Subject: [Numpy-discussion] numpy pprint? In-Reply-To: References: Message-ID: Hijacking this thread while on the topic of pprint - we might want to look into a table-based `_html_repr_` or `_latex_repr_` for use in ipython - where we can print the full array and let scrollbars replace ellipses. Eric On Mon, 5 Nov 2018 at 21:11 Mark Harfouche wrote: > Foad, > > Visualizing data is definitely a complex field. I definitely feel your > pain. > Printing your data is but one way of visualizing it, and probably only > useful for very small and constrained datasets. > Have you looked into set_printoptions > > to see how numpy?s existing capabilities might help you with your > visualization? > > The code you showed seems quite good. 
I wouldn?t worry about performance > when it comes to functions that will seldom be called in tight loops. > As you?ll learn more about python and numpy, you?ll keep expanding it to > include more use cases. > For many of my projects, I create small submodules for visualization > tailored to the specific needs of the particular project. > I?ll try to incorporate your functions and see how I use them. > > Your original post seems to have some confusion about C Style vs F Style > ordering. I hope that has been resolved. > There is also a lot of good documentation > > https://docs.scipy.org/doc/numpy/user/numpy-for-matlab-users.html#numpy-for-matlab-users-notes > about transitioning from matlab. > > Mark > > On Mon, Nov 5, 2018 at 4:46 PM Foad Sojoodi Farimani < > f.s.farimani at gmail.com> wrote: > >> Hello everyone, >> >> Following this question , >> I'm convinced that numpy ndarrays are not MATLAB/mathematical >> multidimentional matrices and I should stop expecting them to be. However I >> still think it would have a lot of benefit to have a function like sympy's >> pprint to pretty print. something like pandas .head and .tail method plus >> .left .right .UpLeft .UpRight .DownLeft .DownRight methods. when nothing >> mentioned it would show 4 corners and put dots in the middle if the array >> is to big for the terminal. >> >> Best, >> Foad >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From f.s.farimani at gmail.com Tue Nov 6 02:26:02 2018 From: f.s.farimani at gmail.com (Foad Sojoodi Farimani) Date: Tue, 6 Nov 2018 08:26:02 +0100 Subject: [Numpy-discussion] numpy pprint? 
In-Reply-To: References: Message-ID: Dear Mark, Thanks for the reply. I will reply in between your lines: On Tue, Nov 6, 2018 at 6:11 AM Mark Harfouche wrote: > Foad, > > Visualizing data is definitely a complex field. I definitely feel your > pain. > I have actually been using numpy for a couple of years without noticing these issues. Recently I have been trying to encourage my colleagues to move from MATLAB to Python and also prepare some workshops for the PhD network of my university. > Printing your data is but one way of visualizing it, and probably only > useful for very small and constrained datasets. > Well, actually it can be very useful. Consider Pandas' .head() and .tail() methods or Sympy's pretty-printing functionalities. For bigger datasets the function can get the terminal's width and height and then, based on the input (U(n),D(n),L(n),R(n),UR(n,m),UL(n,m),DR(n,m),DL(n,m)), display what can be shown and put horizontal three-dots (…) or vertical/inclined ones. Or if it is Jupyter then one can use Markdown/LaTeX for pretty printing or even HTML to add sliders as suggested by Eric. > Have you looked into set_printoptions > > to see how numpy's existing capabilities might help you with your > visualization? > This is indeed very useful. Especially the threshold option can help a lot with adjusting the width, but only for specific cases. > The code you showed seems quite good. I wouldn't worry about performance > when it comes to functions that will seldom be called in tight loops. > Thanks, but I know it is very bad: - it does not work properly for floats - it only works for 1D and 2D - there can be some recursive function I believe. As you'll learn more about python and numpy, you'll keep expanding it to > include more use cases. > For many of my projects, I create small submodules for visualization > tailored to the specific needs of the particular project. > I'll try to incorporate your functions and see how I use them. > Thanks a lot.
Looking forward to your feedback > Your original post seems to have some confusion about C Style vs F Style > ordering. I hope that has been resolved. > I actually came to the conclusion that calling it C-style or F-style, or maybe row-major/column-major, is bad practice. Numpy's ndarrays are not mathematical multidimensional arrays but Python's nested, homogeneous and uniform lists. It means, for example, that 1, [1], [[1]] and [[[1]]] are all different, while in all other mathematical languages out there (including Sympy's matrices) they are the same. > There is also a lot of good documentation > https://docs.scipy.org/doc/numpy/user/numpy-for-matlab-users.html#numpy-for-matlab-users-notes > about transitioning from matlab. > I have seen this one and many others, which I'm trying to comprehend and then put in some slides made in Jupyter notebooks. Maybe when they are ready I will create a GitHub repo and upload them alongside the possible video recordings of the workshops. Foad > Mark > On Mon, Nov 5, 2018 at 4:46 PM Foad Sojoodi Farimani < > f.s.farimani at gmail.com> wrote: > >> Hello everyone, >> >> Following this question , >> I'm convinced that numpy ndarrays are not MATLAB/mathematical >> multidimensional matrices and I should stop expecting them to be. However I >> still think it would have a lot of benefit to have a function like sympy's >> pprint to pretty print. something like pandas .head and .tail method plus >> .left .right .UpLeft .UpRight .DownLeft .DownRight methods. when nothing >> mentioned it would show 4 corners and put dots in the middle if the array >> is too big for the terminal.
>> >> Best, >> Foad >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From f.s.farimani at gmail.com Tue Nov 6 02:28:19 2018 From: f.s.farimani at gmail.com (Foad Sojoodi Farimani) Date: Tue, 6 Nov 2018 08:28:19 +0100 Subject: [Numpy-discussion] numpy pprint? In-Reply-To: References: Message-ID: It is not hijacking if I asked for it :)) for IPython/Jupyter using Markdown/LaTeX would be awesome or even better using HTML to add sliders just like Pandas... F. On Tue, Nov 6, 2018 at 6:51 AM Eric Wieser wrote: > Hijacking this thread while on the topic of pprint - we might want to look > into a table-based `_html_repr_` or `_latex_repr_` for use in ipython - > where we can print the full array and let scrollbars replace ellipses. > > Eric > > On Mon, 5 Nov 2018 at 21:11 Mark Harfouche > wrote: > >> Foad, >> >> Visualizing data is definitely a complex field. I definitely feel your >> pain. >> Printing your data is but one way of visualizing it, and probably only >> useful for very small and constrained datasets. >> Have you looked into set_printoptions >> >> to see how numpy's existing capabilities might help you with your >> visualization? >> >> The code you showed seems quite good. I wouldn't worry about performance >> when it comes to functions that will seldom be called in tight loops. >> As you'll learn more about python and numpy, you'll keep expanding it to >> include more use cases. >> For many of my projects, I create small submodules for visualization >> tailored to the specific needs of the particular project. >> I'll try to incorporate your functions and see how I use them.
>> >> Your original post seems to have some confusion about C Style vs F Style >> ordering. I hope that has been resolved. >> There is also a lot of good documentation >> >> https://docs.scipy.org/doc/numpy/user/numpy-for-matlab-users.html#numpy-for-matlab-users-notes >> about transitioning from matlab. >> >> Mark >> >> On Mon, Nov 5, 2018 at 4:46 PM Foad Sojoodi Farimani < >> f.s.farimani at gmail.com> wrote: >> >>> Hello everyone, >>> >>> Following this question , >>> I'm convinced that numpy ndarrays are not MATLAB/mathematical >>> multidimentional matrices and I should stop expecting them to be. However I >>> still think it would have a lot of benefit to have a function like sympy's >>> pprint to pretty print. something like pandas .head and .tail method plus >>> .left .right .UpLeft .UpRight .DownLeft .DownRight methods. when nothing >>> mentioned it would show 4 corners and put dots in the middle if the array >>> is to big for the terminal. >>> >>> Best, >>> Foad >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at python.org >>> https://mail.python.org/mailman/listinfo/numpy-discussion >>> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From wieser.eric+numpy at gmail.com Tue Nov 6 03:45:17 2018 From: wieser.eric+numpy at gmail.com (Eric Wieser) Date: Tue, 6 Nov 2018 00:45:17 -0800 Subject: [Numpy-discussion] numpy pprint? 
In-Reply-To: References: Message-ID: Here's how that could look https://numpyintegration-ericwieser.notebooks.azure.com/j/notebooks/pprint.ipynb Feel free to play around and see if you can produce something more useful On Mon, 5 Nov 2018 at 23:28 Foad Sojoodi Farimani wrote: > It is not highking if I asked for it :)) > for IPython/Jupyter using Markdown/LaTeX would be awesome > or even better using HTML to add sliders just like Pandas... > > F. > > On Tue, Nov 6, 2018 at 6:51 AM Eric Wieser > wrote: > >> Hijacking this thread while on the topic of pprint - we might want to >> look into a table-based `_html_repr_` or `_latex_repr_` for use in ipython >> - where we can print the full array and let scrollbars replace ellipses. >> >> Eric >> >> On Mon, 5 Nov 2018 at 21:11 Mark Harfouche >> wrote: >> >>> Foad, >>> >>> Visualizing data is definitely a complex field. I definitely feel your >>> pain. >>> Printing your data is but one way of visualizing it, and probably only >>> useful for very small and constrained datasets. >>> Have you looked into set_printoptions >>> >>> to see how numpy?s existing capabilities might help you with your >>> visualization? >>> >>> The code you showed seems quite good. I wouldn?t worry about performance >>> when it comes to functions that will seldom be called in tight loops. >>> As you?ll learn more about python and numpy, you?ll keep expanding it to >>> include more use cases. >>> For many of my projects, I create small submodules for visualization >>> tailored to the specific needs of the particular project. >>> I?ll try to incorporate your functions and see how I use them. >>> >>> Your original post seems to have some confusion about C Style vs F Style >>> ordering. I hope that has been resolved. >>> There is also a lot of good documentation >>> >>> https://docs.scipy.org/doc/numpy/user/numpy-for-matlab-users.html#numpy-for-matlab-users-notes >>> about transitioning from matlab. 
>>>
>>> Mark
>>>
>>> On Mon, Nov 5, 2018 at 4:46 PM Foad Sojoodi Farimani <f.s.farimani at gmail.com> wrote:
>>>
>>>> [...]

From f.s.farimani at gmail.com Tue Nov 6 03:55:59 2018
From: f.s.farimani at gmail.com (Foad Sojoodi Farimani)
Date: Tue, 6 Nov 2018 09:55:59 +0100
Subject: [Numpy-discussion] numpy pprint?
In-Reply-To: References: Message-ID:
Wow, this is awesome.
Some points though:

- not everybody uses IPython/Jupyter; having the functionality for
  conventional consoles would also help,
  something like Sympy's init_printing/init_session, which smartly chooses
  the right representation considering the terminal;
- I don't think putting everything in boxes is helping; it is confusing. I
  would rather have horizontal and vertical square brackets represent each
  nested array;
- it would be awesome if, in IPython/Jupyter, hovering over an element made
  a popup show its index;
- one could read the width and height of the terminal and the other options
  I mentioned in my reply to Mark to show L R U P or a combination of these
  plus some numbers (similar to Pandas' .head and .tail methods), and then
  show the rest with a unicode ellipsis (…).

P.S. I had no idea our university Microsoft services also offer Azure
Notebooks, awesome :P

F.

On Tue, Nov 6, 2018 at 9:45 AM Eric Wieser wrote:

> Here's how that could look:
>
> https://numpyintegration-ericwieser.notebooks.azure.com/j/notebooks/pprint.ipynb
>
> Feel free to play around and see if you can produce something more useful.
>
> [...]

From mark.harfouche at gmail.com Tue Nov 6 05:07:04 2018
From: mark.harfouche at gmail.com (Mark Harfouche)
Date: Tue, 6 Nov 2018 05:07:04 -0500
Subject: [Numpy-discussion] numpy pprint?
In-Reply-To: References: Message-ID:
Foad,

In response to:

> Thanks but I know it is very bad:
>
> - it does not work properly for floats
> - it only works for 1D and 2D
> - there can be some recursive function I believe.

I think this is the awesome part about being able to write 10 lines of code
that are dedicated to representing exactly one thing.

Other than that, yeah, encouraging people to transition from MATLAB is
challenging. MATLAB is definitely good at doing matrix operations. Python 3
somewhat helps in that regard.

I'm super glad you are bringing usability issues up and working toward
solving them.

Maybe you can describe the interface for Python you find practical to
introduce to newcomers, so as to motivate the discussion?
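The corner display Foad asks for (show the four corners, elide the middle, in the spirit of pandas' .head/.tail) can be sketched roughly as follows. The `corners` helper and its signature are hypothetical, not an existing numpy API:

```python
import numpy as np

def corners(a, n=3, m=3):
    """Print the four n-by-m corners of a 2-D array, eliding the middle.

    Hypothetical helper, loosely modeled on pandas' .head()/.tail();
    not part of numpy.
    """
    rows, cols = a.shape
    if rows <= 2 * n and cols <= 2 * m:
        # Small enough to show in full.
        print(a)
        return

    def show(left, right):
        # Print matching left/right corner blocks with a horizontal ellipsis.
        for lrow, rrow in zip(left, right):
            print(" ".join(f"{x:6g}" for x in lrow), " … ",
                  " ".join(f"{x:6g}" for x in rrow))

    show(a[:n, :m], a[:n, -m:])      # top-left and top-right corners
    print("   ⋮".ljust(22), "⋱", "⋮")
    show(a[-n:, :m], a[-n:, -m:])    # bottom-left and bottom-right corners

corners(np.arange(100).reshape(10, 10))
```

A full implementation would also read the terminal size (e.g. via `shutil.get_terminal_size()`) to pick `n` and `m`, as suggested earlier in the thread.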
Mark

On Tue, Nov 6, 2018 at 3:57 AM Foad Sojoodi Farimani <f.s.farimani at gmail.com> wrote:

> Wow, this is awesome.
> Some points though:
> [...]

From f.s.farimani at gmail.com Tue Nov 6 05:16:23 2018
From: f.s.farimani at gmail.com (Foad Sojoodi Farimani)
Date: Tue, 6 Nov 2018 11:16:23 +0100
Subject: [Numpy-discussion] numpy pprint?
In-Reply-To: References: Message-ID:
In between your lines:

On Tue, Nov 6, 2018 at 11:07 AM Mark Harfouche wrote:

> Foad,
>
> In response to:
>
> Thanks but I know it is very bad:
>
> - it does not work properly for floats
> - it only works for 1D and 2D
> - there can be some recursive function I believe.
>
> I think this is the awesome part about being able to write 10 lines of
> code that are dedicated to representing exactly one thing.
>
> Other than that, yeah, encouraging people to transition from MATLAB is
> challenging.
> MATLAB is definitely good at doing matrix operations. Python 3
> somewhat helps in that regard.
>
> I'm super glad you are bringing usability issues up and working toward
> solving them.
>
> Maybe you can describe the interface for Python you find practical to
> introduce to newcomers, so as to motivate the discussion?

I have been thinking about Spyder, but it has a lot of issues with the
standard Python distribution and pip. JupyterLab would be awesome, except
some Jupyter Notebook extensions are missing, for example the variable
inspector, RISE for slides, Hinterland, ... For the moment, Jupyter
Notebook is the most reliable/complete I could find.

F.

> Mark
>
> On Tue, Nov 6, 2018 at 3:57 AM Foad Sojoodi Farimani wrote:
>
>> [...]

From mark.harfouche at gmail.com Tue Nov 6 05:41:20 2018
From: mark.harfouche at gmail.com (Mark Harfouche)
Date: Tue, 6 Nov 2018 05:41:20 -0500
Subject: [Numpy-discussion] numpy pprint?
In-Reply-To: References: Message-ID:
To install Spyder, I wonder if Anaconda is a possibility. It also installs
a lot of packages that your pupils/peers might be using but that you might
not anticipate.

On a semi-related note, a recent change to the repr broke a lot of
downstream tests. HTML and LaTeX reprs are probably easier to experiment
with in that sense. That said, that might just be a doctest issue and not a
numpy issue.

On Tue, Nov 6, 2018 at 5:18 AM Foad Sojoodi Farimani wrote:

> In between your lines:
>
> On Tue, Nov 6, 2018 at 11:07 AM Mark Harfouche wrote:
>
>> Foad,
>>
>> In response to:
>>
>> Thanks but I know it is very bad:
>>
>> - it does not work properly for floats
>> - it only works for 1D and 2D
>> - there can be some recursive function I believe.
>>
>> I think this is the awesome part about being able to write 10 lines of
>> code that are dedicated to representing exactly one thing.
>>
>> Other than that, yeah, encouraging people to transition from MATLAB is
>> challenging. MATLAB is definitely good at doing matrix operations.
>> Python 3 somewhat helps in that regard.
>>
>> I'm super glad you are bringing usability issues up and working toward
>> solving them.
>>
>> Maybe you can describe the interface for Python you find practical to
>> introduce to newcomers, so as to motivate the discussion?
> I have been thinking about Spyder, but it has a lot of issues with the
> standard Python distribution and pip. JupyterLab would be awesome, except
> some Jupyter Notebook extensions are missing, for example the variable
> inspector, RISE for slides, Hinterland, ... For the moment, Jupyter
> Notebook is the most reliable/complete I could find.
>
> F.
>
> [...]

From deak.andris at gmail.com Tue Nov 6 05:43:01 2018
From: deak.andris at gmail.com (Andras Deak)
Date: Tue, 6 Nov 2018 11:43:01 +0100
Subject: [Numpy-discussion] numpy pprint?
In-Reply-To: References: Message-ID:
On Tue, Nov 6, 2018 at 8:26 AM Foad Sojoodi Farimani wrote:
>
> Dear Mark,
>
> Thanks for the reply. I will write in between your lines:
>
> On Tue, Nov 6, 2018 at 6:11 AM Mark Harfouche wrote:
>>
>> Foad,
>>
>> Visualizing data is definitely a complex field. I definitely feel your
>> pain.
>
> I have actually been using numpy for a couple of years without noticing
> these issues. Recently I have been trying to encourage my colleagues to
> move from MATLAB to Python, and also to prepare some workshops for the
> PhD network of my university.
>
>> Printing your data is but one way of visualizing it, and probably only
>> useful for very small and constrained datasets.
>
> Well, actually it can be very useful. Consider Pandas' .head() and
> .tail() methods or Sympy's pretty-printing functionality. For bigger
> datasets the function can get the terminal's width and height and then,
> based on the input (U(n), D(n), L(n), R(n), UR(n,m), UL(n,m), DR(n,m),
> DL(n,m)), display what can be shown and put in horizontal ellipses (…)
> or vertical/inclined ones. Or, if it is Jupyter, then one can use
> Markdown/LaTeX for pretty printing, or even HTML to add sliders, as
> suggested by Eric.
>
>> Have you looked into set_printoptions to see how numpy's existing
>> capabilities might help you with your visualization?
>
> This is indeed very useful.
> Especially the threshold option can help a lot with adjusting the width,
> but only for specific cases.
>
>> The code you showed seems quite good. I wouldn't worry about performance
>> when it comes to functions that will seldom be called in tight loops.
>
> Thanks, but I know it is very bad:
>
> - it does not work properly for floats
> - it only works for 1D and 2D
> - there can be some recursive function I believe.
>
>> As you'll learn more about python and numpy, you'll keep expanding it to
>> include more use cases.
>> For many of my projects, I create small submodules for visualization
>> tailored to the specific needs of the particular project.
>> I'll try to incorporate your functions and see how I use them.
>
> Thanks a lot, looking forward to your feedback.
>
>> Your original post seems to have some confusion about C-style vs F-style
>> ordering. I hope that has been resolved.
>
> I actually came to the conclusion that calling it C-style or F-style, or
> maybe row-major/column-major, is bad practice. Numpy's ndarrays are not
> mathematical multidimensional arrays but Python's nested, homogeneous and
> uniform lists. It means, for example, that 1, [1], [[1]] and [[[1]]] are
> all different, while in all other mathematical languages out there
> (including Sympy's matrices) they are the same.

I'm probably missing your point, because I don't understand your claim.
Mathematically speaking, 1 and [1] and [[1]] and [[[1]]] are different
objects. One is a scalar; the second is an element of R^n with n=1, which
is basically a scalar too from a math perspective; the third one is a
2-index object (an operator acting on R^1); the last one is a three-index
object. These are all mathematically distinct.
Furthermore, row-major and column-major order are a purely technical
detail describing how the underlying data represented by these
multidimensional arrays is laid out in memory. So C/F-style order and the
semantics of multidimensional arrays are, at least as I see it,
independent notions.
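Both of Andras's points can be checked directly in numpy: different nesting depths produce arrays of different dimensionality, while C vs F order changes only the memory layout, not the array's semantics. A small illustration:

```python
import numpy as np

# Different nesting depths give arrays of different dimensionality.
for obj in (1, [1], [[1]], [[[1]]]):
    print(np.array(obj).shape)
# () then (1,) then (1, 1) then (1, 1, 1)

# C order vs F order: same logical array, different memory layout.
a = np.array([[1, 2], [3, 4]], order='C')
b = np.array([[1, 2], [3, 4]], order='F')
print(np.array_equal(a, b))   # True: identical semantics
print(a.ravel(order='K'))     # [1 2 3 4] - elements in row-major storage order
print(b.ravel(order='K'))     # [1 3 2 4] - elements in column-major storage order
```

`ravel(order='K')` reads the elements in the order they sit in memory, which is the only place the C/F distinction is visible.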
Andr?s From f.s.farimani at gmail.com Tue Nov 6 05:55:47 2018 From: f.s.farimani at gmail.com (Foad Sojoodi Farimani) Date: Tue, 6 Nov 2018 11:55:47 +0100 Subject: [Numpy-discussion] numpy pprint? In-Reply-To: References: Message-ID: Dear Andr?s, Try those different option in MATLAB for example. or Octave/Scilab/Sympy-Matrix... they are all the same. The term "multidimensional arrays" is a little bit vague. one might think of multidimensional matrices ( I don't think there is such a thing in math) if coming from MATLAB. I also think the row-major column major terminology is confusing. there are no rows or columns for that matter. Numpy ndarrays are homogeneous, uniform nested lists. one can represent different layers of this list in different ways using rows or columns. regardless of all these different point of views having graphical and pretty printing representations would help a lot. that's my main goal at the moment. Best, Foad On Tue, Nov 6, 2018 at 11:43 AM Andras Deak wrote: > On Tue, Nov 6, 2018 at 8:26 AM Foad Sojoodi Farimani > wrote: > > > > Dear Mark, > > > > Thanks for the reply. I will write in between your lines: > > > > On Tue, Nov 6, 2018 at 6:11 AM Mark Harfouche > wrote: > >> > >> Foad, > >> > >> Visualizing data is definitely a complex field. I definitely feel your > pain. > > > > I have actually been using numpy for a couple of years without noticing > these issues. recently I have been trying to encourage my collogues to move > from MATLAB to Python and also prepare some workshops for PhD network of my > university. > >> > >> Printing your data is but one way of visualizing it, and probably only > useful for very small and constrained datasets. > > > > well actually it can be very useful. Consider Pandas .head() and .tail() > methods or Sympy's pretty printing functionalities. 
For bigger datasets the > function can get the terminal's width and height and then based on the input > (U(n),D(n),L(n),R(n),UR(n,m),UL(n,m),DR(n,m),DL(n,m)) display what can be > shown and put horizontal 3-dots (\u2026, "…") or vertical/inclined ones. Or if > it is Jupyter then one can use Markdown/LaTeX for pretty printing or even > HTML to add sliders as suggested by Eric. > >> > >> Have you looked into set_printoptions to see how numpy's existing > capabilities might help you with your visualization? > > > > This is indeed very useful. Especially the threshold option can help a > lot with adjusting the width, but only for specific cases. > >> > >> The code you showed seems quite good. I wouldn't worry about > performance when it comes to functions that will seldom be called in tight > loops. > > > > Thanks but I know it is very bad: > > > > it does not work properly for floats > > it only works for 1D and 2D > > there can be some recursive function I believe. > >> > >> As you'll learn more about python and numpy, you'll keep expanding it > to include more use cases. > >> For many of my projects, I create small submodules for visualization > tailored to the specific needs of the particular project. > >> I'll try to incorporate your functions and see how I use them. > > > > Thanks a lot, looking forward to your feedback > >> > >> Your original post seems to have some confusion about C Style vs F > Style ordering. I hope that has been resolved. > > > > I actually came to the conclusion that calling it C-Style or F-Style or > maybe row-major/column-major are bad practices. Numpy's ndarrays are not > mathematical multidimensional arrays but Python's nested, homogeneous and > uniform lists. It means for example 1, [1], [[1]] and [[[1]]] are all > different, while in all other mathematical languages out there (including > Sympy's matrices) they are the same. > > I'm probably missing your point, because I don't understand your > claim.
Mathematically speaking, 1 and [1] and [[1]] and [[[1]]] are > different objects. One is a scalar, the second is an element of R^n > with n=1, which is basically a scalar too from a math perspective, the > third one is a 2-index object (an operator acting on R^1), the last > one is a three-index object. These are all mathematically distinct. > Furthermore, row-major and column-major order are a purely technical > detail describing how the underlying data that is being represented by > these multidimensional arrays is laid out in memory. So C/F-style > order and the semantics of multidimensional arrays, at least as I see > it, are independent notions. > > Andrés > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Tue Nov 6 09:42:05 2018 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 6 Nov 2018 07:42:05 -0700 Subject: [Numpy-discussion] numpy pprint? In-Reply-To: References: Message-ID: On Tue, Nov 6, 2018 at 3:56 AM Foad Sojoodi Farimani wrote: > Dear Andrés, > > Try those different options in MATLAB for example, or > Octave/Scilab/Sympy-Matrix... they are all the same. The term > "multidimensional arrays" is a little bit vague. One might think of > multidimensional matrices (I don't think there is such a thing in math) if > coming from MATLAB. I also think the row-major/column-major terminology is > confusing. There are no rows or columns for that matter. Numpy ndarrays are > homogeneous, uniform nested lists. One can represent different layers of > this list in different ways using rows or columns. > > I think the current popular terminology is `tensors` for `multidimensional arrays`. Note that matrices are a different type of object. Chuck -------------- next part -------------- An HTML attachment was scrubbed...
URL: From robert.kern at gmail.com Tue Nov 6 15:11:13 2018 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 6 Nov 2018 12:11:13 -0800 Subject: [Numpy-discussion] numpy pprint? In-Reply-To: References: Message-ID: On Tue, Nov 6, 2018 at 6:43 AM Charles R Harris wrote: > > On Tue, Nov 6, 2018 at 3:56 AM Foad Sojoodi Farimani < > f.s.farimani at gmail.com> wrote: > >> Dear Andrés, >> >> Try those different options in MATLAB for example, or >> Octave/Scilab/Sympy-Matrix... they are all the same. >> > Of course, these are all systems with a focus on matrices per se rather than general arrays. They take liberties with non-2-dim arrays that make sense if the focus is on treating 2-dim arrays as matrices. One of the motivating reasons for numpy's early development (as Numeric) was to get away from those assumptions and limitations and be a general array processing system. Part of the reason that we chose the terminology "multidimensional array" is to emphasize those differences. > The term "multidimensional arrays" is a little bit vague. One might think >> of multidimensional matrices (I don't think there is such a thing in math) >> if coming from MATLAB. I also think the row-major/column-major terminology >> is confusing. There are no rows or columns for that matter. >> > Granted, but it's long-established terminology, and not actually important for a user to know unless someone is working in C with a flat representation of the allocated memory. > Numpy ndarrays are homogeneous, uniform nested lists. One can represent >> different layers of this list in different ways using rows or columns. >> > You have to be careful here as well. "list" also has semantic baggage. Data structures are generally only called "lists" in a wide variety of programming languages if they have cheap appends and other such mutation operations.
numpy arrays don't (as well as the things that we call "arrays" in FORTRAN and C/C++ that are distinct from what we would call "lists" in those languages). Please be assured that "multidimensional array" is terminology that we didn't make up. It does derive from a tradition of mathematical programming in FORTRAN and C and makes meaningful semantic distinctions within that tradition. There are other traditions, and we might well have settled on different terminology if we had derived from those. We do expect people to come from a variety of traditions and have a period of adjustment as they learn some new terminology. That's perfectly reasonable, which is good, because it is entirely unavoidable. There isn't a universal set of terminology that's going to work with everyone's experience out of the gate. I think the current popular terminology is `tensors` for `multidimensional > arrays`. Note that matrices are a different type of object. > Popular, but quite misleading, in the same way that not every 2-dim array is a matrix, as someone who works on tensor machine learning methods once complained to me. -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... URL: From f.s.farimani at gmail.com Tue Nov 6 16:50:54 2018 From: f.s.farimani at gmail.com (Foad Sojoodi Farimani) Date: Tue, 6 Nov 2018 22:50:54 +0100 Subject: [Numpy-discussion] numpy pprint? In-Reply-To: References: Message-ID: Hello Numpyers, I just added the pretty printing for 3D arrays too: https://stackoverflow.com/a/53164538/4999991 I would highly appreciate it if you could check the implementation and let me know what you think about it. Best, Foad On Mon, Nov 5, 2018 at 10:44 PM Foad Sojoodi Farimani < f.s.farimani at gmail.com> wrote: > Hello everyone, > > Following this question , > I'm convinced that numpy ndarrays are not MATLAB/mathematical > multidimensional matrices and I should stop expecting them to be.
However I > still think it would have a lot of benefit to have a function like sympy's > pprint to pretty print. Something like pandas' .head() and .tail() methods plus > .left .right .UpLeft .UpRight .DownLeft .DownRight methods. When nothing is > mentioned it would show 4 corners and put dots in the middle if the array > is too big for the terminal. > > Best, > Foad > -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefanv at berkeley.edu Tue Nov 6 18:06:40 2018 From: stefanv at berkeley.edu (Stefan van der Walt) Date: Tue, 6 Nov 2018 15:06:40 -0800 Subject: [Numpy-discussion] Prep for NumPy 1.16.0 branch In-Reply-To: References: Message-ID: <20181106230640.eqgvpmt2wcqglsh4@carbo> On Sun, 04 Nov 2018 17:16:12 -0800, Stephan Hoyer wrote: > On Sun, Nov 4, 2018 at 10:32 AM Marten van Kerkwijk < > m.h.vankerkwijk at gmail.com> wrote: > > > For `__array_function__`, there was some discussion in > > https://github.com/numpy/numpy/issues/12225 that for 1.16 we might want > > to follow after all Nathaniel's suggestion of using an environment variable > > or so to opt in (since introspection breaks on python2 with our wrapped > > implementations). Given also the possibly significant hit in performance, > > this may be the best option. > > All the best, > > I am also leaning towards this right now, depending on how long we plan to > wait for releasing 1.16. It will take us at least a little while to sort > out performance issues for __array_function__, I'd guess at least a few > weeks. Then a blocker still might turn up during the release candidate > process (though I think we've found most of the major bugs / downstream > issues already through tests on NumPy's dev branch). Just to make sure I understand correctly: the suggestion is to use an environment variable to temporarily toggle the feature, but the plan in the long run will be to have it enabled all the time, correct?
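(For concreteness, an environment-variable opt-in of this sort might look roughly like the sketch below; the variable name is purely illustrative, since none had been settled on at this point in the discussion:)

```python
import os

# Hypothetical feature toggle; "NUMPY_EXPERIMENTAL_ARRAY_FUNCTION" is an
# illustrative name, not a decided API.
def array_function_enabled():
    return os.environ.get('NUMPY_EXPERIMENTAL_ARRAY_FUNCTION', '0') == '1'

# Inside numpy, the dispatch machinery would only be wired up when set:
if array_function_enabled():
    pass  # install the __array_function__ dispatch wrappers
```

Users would then opt in per-process (e.g. `NUMPY_EXPERIMENTAL_ARRAY_FUNCTION=1 python script.py`) until the protocol is enabled unconditionally.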
Stéfan From stefanv at berkeley.edu Tue Nov 6 18:52:05 2018 From: stefanv at berkeley.edu (Stefan van der Walt) Date: Tue, 6 Nov 2018 15:52:05 -0800 Subject: [Numpy-discussion] numpy pprint? In-Reply-To: References: Message-ID: <20181106235205.6v4bo6aglwnzsoop@carbo> On Tue, 06 Nov 2018 12:11:13 -0800, Robert Kern wrote: > Popular, but quite misleading, in the same way that not every 2-dim array > is a matrix. As someone who works on tensor machine learning methods once > complained to me. Are you referring to vectors, structured arrays, or something else? Stéfan From robert.kern at gmail.com Tue Nov 6 19:13:35 2018 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 6 Nov 2018 16:13:35 -0800 Subject: [Numpy-discussion] numpy pprint? In-Reply-To: <20181106235205.6v4bo6aglwnzsoop@carbo> References: <20181106235205.6v4bo6aglwnzsoop@carbo> Message-ID: On Tue, Nov 6, 2018 at 3:55 PM Stefan van der Walt wrote: > On Tue, 06 Nov 2018 12:11:13 -0800, Robert Kern wrote: > > Popular, but quite misleading, in the same way that not every 2-dim array > > is a matrix. As someone who works on tensor machine learning methods once > > complained to me. > > Are you referring to vectors, structured arrays, or something else? > I was responding to this statement by Chuck: > I think the current popular terminology is `tensors` for `multidimensional arrays`. Mostly popularized by Tensorflow. But the "tensors" that flow through Tensorflow are mostly just multidimensional arrays and have no tensor-algebraic meaning. Similarly, a 2-dim array (say, a grayscale intensity image) doesn't necessarily have a matrix-algebraic interpretation, either. A 640x480 grayscale image is not a linear transformation from R^640 to R^480. It's just a collection of numbers that are convenient to organize as a 2D grid.
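The distinction shows up directly in code: elementwise semantics make sense for an image-like 2D array even when matrix semantics do not. A small sketch (the shapes here are just for illustration):

```python
import numpy as np

# A tiny grayscale "image": a 2-D grid of intensities, not a linear map.
img = np.arange(12.0).reshape(3, 4)

# Elementwise operations (brightness scaling, thresholding) are meaningful:
brighter = img * 1.5
mask = img > 5.0

# Composing the image with itself as a matrix is not even defined,
# since a 3x4 "matrix" cannot be matrix-multiplied by another 3x4 one:
try:
    img @ img
except ValueError as e:
    print("no matrix-algebraic meaning here:", type(e).__name__)
```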
This seems to be a pain point with some tensor methods ML researchers who have to explain their work to an audience that seems to think that Tensorflow must make their lives (and theses) easy. :-) -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... URL: From matti.picus at gmail.com Tue Nov 6 19:28:36 2018 From: matti.picus at gmail.com (Matti Picus) Date: Tue, 6 Nov 2018 16:28:36 -0800 Subject: [Numpy-discussion] Weekly status meeting 8.11 at 12:00 pacific time Message-ID: <814a37a5-2650-5a63-f692-d859bd194dd6@gmail.com> We will be holding our weekly BIDS NumPy status meeting on Thurs Nov 8 at noon pacific time. We moved the meeting to Thursday because of a scheduling conflict. Please join us. The draft agenda, along with details of how to join, is up at https://hackmd.io/TTurMvviSkarcxf8vURq-Q?both Previous sessions' notes are available at https://github.com/BIDS-numpy/docs/tree/master/status_meetings Matti, Tyler and Stefan From shoyer at gmail.com Tue Nov 6 23:09:42 2018 From: shoyer at gmail.com (Stephan Hoyer) Date: Tue, 6 Nov 2018 23:09:42 -0500 Subject: [Numpy-discussion] Prep for NumPy 1.16.0 branch In-Reply-To: <20181106230640.eqgvpmt2wcqglsh4@carbo> References: <20181106230640.eqgvpmt2wcqglsh4@carbo> Message-ID: On Tue, Nov 6, 2018 at 6:08 PM Stefan van der Walt wrote: > On Sun, 04 Nov 2018 17:16:12 -0800, Stephan Hoyer wrote: > > On Sun, Nov 4, 2018 at 10:32 AM Marten van Kerkwijk < > > m.h.vankerkwijk at gmail.com> wrote: > > > > > For `__array_function__`, there was some discussion in > > > https://github.com/numpy/numpy/issues/12225 that for 1.16 we might > want > > > to follow after all Nathaniel's suggestion of using an environment > variable > > > or so to opt in (since introspection breaks on python2 with our wrapped > > > implementations). Given also the possibly significant hit in > performance, > > > this may be the best option. 
> > > All the best, > > I am also leaning towards this right now, depending on how long we plan > to > > wait for releasing 1.16. It will take us at least a little while to sort > > out performance issues for __array_function__, I'd guess at least a few > > weeks. Then a blocker still might turn up during the release candidate > > process (though I think we've found most of the major bugs / downstream > > issues already through tests on NumPy's dev branch). > > Just to make sure I understand correctly: the suggestion is to use an > environment variable to temporarily toggle the feature, but the plan in > the long run will be to have it enabled all the time, correct? > Yes, exactly. __array_function__ would be opt-in only for 1.16 but enabled by default for 1.17. -------------- next part -------------- An HTML attachment was scrubbed... URL: From wieser.eric+numpy at gmail.com Wed Nov 7 01:23:33 2018 From: wieser.eric+numpy at gmail.com (Eric Wieser) Date: Tue, 6 Nov 2018 22:23:33 -0800 Subject: [Numpy-discussion] numpy pprint? In-Reply-To: References: Message-ID: Foad: having the functionality for conventional consoles would also help I think the most important thing in a conventional console is to output the array in a format that allows you to reconstruct the object. That makes it way easier for people to reproduce each other's problems without having their full dataset. If your goal is to visualize complex arrays, I think the console is a pretty limited tool, and numpy already does as much as is worthwhile there. I don't think putting everything in boxes is helping. it is confusing. I would rather having horizontal and vertical square brackets represent each nested array See my update at the same link, which shows an alternative which draws those brackets as you envision it would be awesome if in IPython/Jupyter hovering over an element a popup would show the index It… already does…
to show L R U P or combination of these plus some numbers I don't know what you mean by this. Eric On Tue, 6 Nov 2018 at 00:56 Foad Sojoodi Farimani f.s.farimani at gmail.com wrote: Wow, this is awesome. > Some points though: > > - not everybody uses IPython/Jupyter; having the functionality for > conventional consoles would also help. something like > Sympy's init_printing/init_session which smartly chooses the right > representation considering the terminal. > - I don't think putting everything in boxes is helping. it is > confusing. I would rather having horizontal and vertical square brackets > represent each nested array > - it would be awesome if in IPython/Jupyter hovering over an element a > popup would show the index > - one could read the width and height of the terminal and other > options I mentioned in reply to Mark to show L R U P or combination of these > plus some numbers (similar to Pandas .head .tail methods) and then show the > rest by a unicode 3-dot ellipsis > > P.S. I had no idea our university Microsoft services also offers Azure > Notebooks awesome :P > > F. > > On Tue, Nov 6, 2018 at 9:45 AM Eric Wieser > wrote: > >> Here's how that could look >> >> >> https://numpyintegration-ericwieser.notebooks.azure.com/j/notebooks/pprint.ipynb >> >> Feel free to play around and see if you can produce something more useful >> >> >> >> On Mon, 5 Nov 2018 at 23:28 Foad Sojoodi Farimani >> wrote: >> >>> It is not hijacking if I asked for it :)) >>> for IPython/Jupyter using Markdown/LaTeX would be awesome >>> or even better using HTML to add sliders just like Pandas... >>> >>> F. >>> >>> On Tue, Nov 6, 2018 at 6:51 AM Eric Wieser >>> wrote: >>> >>>> Hijacking this thread while on the topic of pprint - we might want to >>>> look into a table-based `_repr_html_` or `_repr_latex_` for use in ipython >>>> - where we can print the full array and let scrollbars replace ellipses.
>>>> >>>> Eric >>>> >>>> On Mon, 5 Nov 2018 at 21:11 Mark Harfouche >>>> wrote: >>>> >>>>> Foad, >>>>> >>>>> Visualizing data is definitely a complex field. I definitely feel your >>>>> pain. >>>>> Printing your data is but one way of visualizing it, and probably only >>>>> useful for very small and constrained datasets. >>>>> Have you looked into set_printoptions >>>>> >>>>> to see how numpy?s existing capabilities might help you with your >>>>> visualization? >>>>> >>>>> The code you showed seems quite good. I wouldn?t worry about >>>>> performance when it comes to functions that will seldom be called in tight >>>>> loops. >>>>> As you?ll learn more about python and numpy, you?ll keep expanding it >>>>> to include more use cases. >>>>> For many of my projects, I create small submodules for visualization >>>>> tailored to the specific needs of the particular project. >>>>> I?ll try to incorporate your functions and see how I use them. >>>>> >>>>> Your original post seems to have some confusion about C Style vs F >>>>> Style ordering. I hope that has been resolved. >>>>> There is also a lot of good documentation >>>>> >>>>> https://docs.scipy.org/doc/numpy/user/numpy-for-matlab-users.html#numpy-for-matlab-users-notes >>>>> about transitioning from matlab. >>>>> >>>>> Mark >>>>> >>>>> On Mon, Nov 5, 2018 at 4:46 PM Foad Sojoodi Farimani < >>>>> f.s.farimani at gmail.com> wrote: >>>>> >>>>>> Hello everyone, >>>>>> >>>>>> Following this question >>>>>> , I'm convinced that >>>>>> numpy ndarrays are not MATLAB/mathematical multidimentional matrices and I >>>>>> should stop expecting them to be. However I still think it would have a lot >>>>>> of benefit to have a function like sympy's pprint to pretty print. >>>>>> something like pandas .head and .tail method plus .left .right .UpLeft >>>>>> .UpRight .DownLeft .DownRight methods. when nothing mentioned it would show >>>>>> 4 corners and put dots in the middle if the array is to big for the >>>>>> terminal. 
>>>>>> >>>>>> Best, >>>>>> Foad >>>>>> _______________________________________________ >>>>>> NumPy-Discussion mailing list >>>>>> NumPy-Discussion at python.org >>>>>> https://mail.python.org/mailman/listinfo/numpy-discussion >>>>>> >>>>> _______________________________________________ >>>>> NumPy-Discussion mailing list >>>>> NumPy-Discussion at python.org >>>>> https://mail.python.org/mailman/listinfo/numpy-discussion >>>>> >>>> _______________________________________________ >>>> NumPy-Discussion mailing list >>>> NumPy-Discussion at python.org >>>> https://mail.python.org/mailman/listinfo/numpy-discussion >>>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at python.org >>> https://mail.python.org/mailman/listinfo/numpy-discussion >>> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > ? -------------- next part -------------- An HTML attachment was scrubbed... URL: From f.s.farimani at gmail.com Wed Nov 7 03:24:30 2018 From: f.s.farimani at gmail.com (Foad Sojoodi Farimani) Date: Wed, 7 Nov 2018 09:24:30 +0100 Subject: [Numpy-discussion] numpy pprint? In-Reply-To: References: Message-ID: Dear Eric, In between your lines: On Wed, Nov 7, 2018 at 7:23 AM Eric Wieser wrote: > Foad: > > having the functionality for conventional consols would also help > > I think the most important thing in a conventional console is to output > the array in a format that allows you to reconstruct the object. That makes > it way easier for people to reproduce each others problems without having > their full dataset. 
If your goal is to visualize > complex arrays, I think the console is a pretty limited tool, and numpy > already does as much as is worthwhile there. > I agree with most of what you say: > > 1. the current representation of a numpy array with print() is of course > already there and it is not my goal to replace it, but rather to add something > like Sympy's pprint function, an alternative representation. > 2. the reason I'm using the console is first because there are people who > use it and secondly because I have no idea how to do what you are doing :)) > there is room for both I think > > I don't think putting everything in boxes is helping. it is confusing. I > would rather having horizontal and vertical square brackets represent each > nested array > > See my update at the same link, which shows an alternative which draws > those brackets as you envision > Wow, this is awesome: [image: 2018-11-07_09-07-37.gif] I wonder if you could make this the default view without the need of those other inputs or the lambda function? this is almost 80% of what I had in mind > it would be awesome if in IPython/Jupyter hovering over an element a popup > would show the index > > It… already does… > it doesn't on my browser :( > to show L R U P or combination of these plus some numbers > > I don't know what you mean by this. > Imagine the Pandas .head() and .tail() functions which accept positive integer inputs to show a specific number of rows. Now our print function could have two inputs: one string which should be L for left, R for right, U for up and D for down, respectively UL, UR, DL and DR for corners; another input is a tuple of integers which in the case of U,D,L,R is only one integer showing the number of rows or columns, and in the case of UL, UR, DL, DR two integers showing the number of rows and columns in that specific corner to be shown. What could be added: 1. adding slide bars for big datasets 2. compressing the result according to the terminal's dimensions (as Pandas does) 3.
editing the variables like Spyders variable explorer 4. adding dimensions or rows/columns within a dimension or elements in rows/columns Again thanks a lot for your help. I appreciate your kind support. Best, Foad Eric > > On Tue, 6 Nov 2018 at 00:56 Foad Sojoodi Farimani f.s.farimani at gmail.com > wrote: > > Wow, this is awesome. >> Some points though: >> >> - not everybody uses IPython/Jupyter having the functionality for >> conventional consols would also help. something like >> Sypy's init_printing/init_session which smartly chooses the right >> representation considering the terminal. >> - I don't think putting everything in boxes is helping. it is >> confusing. I would rather having horizontal and vertical square brackets >> represent each nested array >> - it would be awesome if in IPython/Jupyter hovering over an element >> a popup would show the index >> - one could read the width and height of the terminal and other >> options I mentioned in reply Mark to show L R U P or combination of these >> plus some numbers (similar to Pandas .head .tail) methods and then show the >> rest by unicod 3dot >> >> P.S. I had no idea our university Microsoft services also offers Azure >> Notebooks awesome :P >> >> F. >> >> On Tue, Nov 6, 2018 at 9:45 AM Eric Wieser >> wrote: >> >>> Here's how that could look >>> >>> >>> https://numpyintegration-ericwieser.notebooks.azure.com/j/notebooks/pprint.ipynb >>> >>> Feel free to play around and see if you can produce something more useful >>> >>> >>> >>> On Mon, 5 Nov 2018 at 23:28 Foad Sojoodi Farimani < >>> f.s.farimani at gmail.com> wrote: >>> >>>> It is not highking if I asked for it :)) >>>> for IPython/Jupyter using Markdown/LaTeX would be awesome >>>> or even better using HTML to add sliders just like Pandas... >>>> >>>> F. 
>>>> >>>> On Tue, Nov 6, 2018 at 6:51 AM Eric Wieser >>>> wrote: >>>> >>>>> Hijacking this thread while on the topic of pprint - we might want to >>>>> look into a table-based `_html_repr_` or `_latex_repr_` for use in ipython >>>>> - where we can print the full array and let scrollbars replace ellipses. >>>>> >>>>> Eric >>>>> >>>>> On Mon, 5 Nov 2018 at 21:11 Mark Harfouche >>>>> wrote: >>>>> >>>>>> Foad, >>>>>> >>>>>> Visualizing data is definitely a complex field. I definitely feel >>>>>> your pain. >>>>>> Printing your data is but one way of visualizing it, and probably >>>>>> only useful for very small and constrained datasets. >>>>>> Have you looked into set_printoptions >>>>>> >>>>>> to see how numpy?s existing capabilities might help you with your >>>>>> visualization? >>>>>> >>>>>> The code you showed seems quite good. I wouldn?t worry about >>>>>> performance when it comes to functions that will seldom be called in tight >>>>>> loops. >>>>>> As you?ll learn more about python and numpy, you?ll keep expanding it >>>>>> to include more use cases. >>>>>> For many of my projects, I create small submodules for visualization >>>>>> tailored to the specific needs of the particular project. >>>>>> I?ll try to incorporate your functions and see how I use them. >>>>>> >>>>>> Your original post seems to have some confusion about C Style vs F >>>>>> Style ordering. I hope that has been resolved. >>>>>> There is also a lot of good documentation >>>>>> >>>>>> https://docs.scipy.org/doc/numpy/user/numpy-for-matlab-users.html#numpy-for-matlab-users-notes >>>>>> about transitioning from matlab. >>>>>> >>>>>> Mark >>>>>> >>>>>> On Mon, Nov 5, 2018 at 4:46 PM Foad Sojoodi Farimani < >>>>>> f.s.farimani at gmail.com> wrote: >>>>>> >>>>>>> Hello everyone, >>>>>>> >>>>>>> Following this question >>>>>>> , I'm convinced that >>>>>>> numpy ndarrays are not MATLAB/mathematical multidimentional matrices and I >>>>>>> should stop expecting them to be. 
However I still think it would have a lot >>>>>>> of benefit to have a function like sympy's pprint to pretty print. >>>>>>> something like pandas .head and .tail method plus .left .right .UpLeft >>>>>>> .UpRight .DownLeft .DownRight methods. when nothing mentioned it would show >>>>>>> 4 corners and put dots in the middle if the array is to big for the >>>>>>> terminal. >>>>>>> >>>>>>> Best, >>>>>>> Foad >>>>>>> _______________________________________________ >>>>>>> NumPy-Discussion mailing list >>>>>>> NumPy-Discussion at python.org >>>>>>> https://mail.python.org/mailman/listinfo/numpy-discussion >>>>>>> >>>>>> _______________________________________________ >>>>>> NumPy-Discussion mailing list >>>>>> NumPy-Discussion at python.org >>>>>> https://mail.python.org/mailman/listinfo/numpy-discussion >>>>>> >>>>> _______________________________________________ >>>>> NumPy-Discussion mailing list >>>>> NumPy-Discussion at python.org >>>>> https://mail.python.org/mailman/listinfo/numpy-discussion >>>>> >>>> _______________________________________________ >>>> NumPy-Discussion mailing list >>>> NumPy-Discussion at python.org >>>> https://mail.python.org/mailman/listinfo/numpy-discussion >>>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at python.org >>> https://mail.python.org/mailman/listinfo/numpy-discussion >>> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: 2018-11-07_09-07-37.gif Type: image/gif Size: 37249 bytes Desc: not available URL: From f.s.farimani at gmail.com Wed Nov 7 04:16:14 2018 From: f.s.farimani at gmail.com (Foad Sojoodi Farimani) Date: Wed, 7 Nov 2018 10:16:14 +0100 Subject: [Numpy-discussion] numpy pprint? In-Reply-To: References: Message-ID: Dear All, Here in this NoteBook I tried to compare my unicode implementation with Eric's HTML version, plus links to the discussions on different forums if you would like to follow. For example: [image: chrome_2018-11-07_10-15-20.png] Best regards, Foad On Wed, Nov 7, 2018 at 9:24 AM Foad Sojoodi Farimani wrote: > Dear Eric, > > In between your lines: > > On Wed, Nov 7, 2018 at 7:23 AM Eric Wieser > wrote: > >> Foad: >> >> having the functionality for conventional consoles would also help >> >> I think the most important thing in a conventional console is to output >> the array in a format that allows you to reconstruct the object. That makes >> it way easier for people to reproduce each other's problems without having >> their full dataset. If your goal is to visualize complex arrays, I think >> the console is a pretty limited tool, and numpy already does as much as is >> worthwhile there. >> > I agree with most of what you say: > > 1. the current representation of a numpy array with print() is of course > already there and it is not my goal to replace it, but rather to add something > like Sympy's pprint function, an alternative representation. > 2. the reason I'm using the console is first because there are people who > use it and secondly because I have no idea how to do what you are doing :)) > there is room for both I think > > I don't think putting everything in boxes is helping. it is confusing.
I >> would rather having horizontal and vertical square brackets represent each >> nested array >> >> See my update at the same link, which shows an alternative which draws >> those brackets as you envi >> > wow this is awesome: > > > [image: 2018-11-07_09-07-37.gif] > > I wonder if you could make this the default view without the need of those > other inputs or the lamda function? this is almost 80% of what I had in mind > > > >> it would be awesome if in IPython/Jupyter hovering over an element a >> popup would show the index >> >> It? already does? >> > it doesn't on my browser :( > >> to show L R U P or combination of these plus some numbers >> >> I don?t know what you mean by this. >> > imaging the Pandas .head() and .tail functions which accept positive > integer inputs to show specific number of rows. now our print function > could have two inputs one string which should be L for left, R for right, > U for up and D for down. respectively UL, UR, DL and DR for corners. > another input is a tuple of integers which in the case of U,D,L,R is only > one integer showing the number of rows or columns. and in the case of UL, > UR, DL, DR two integers showing the number of rows and columns in that > specific corner to be shown. > > What could be added: > > 1. adding slide bars for big datasets > 2. compressing the result according to the terminals dimensions (as > Pandas does) > 3. editing the variables like Spyders variable explorer > 4. adding dimensions or rows/columns within a dimension or elements in > rows/columns > > Again thanks a lot for your help. I appreciate your kind support. > > Best, > Foad > > Eric >> >> On Tue, 6 Nov 2018 at 00:56 Foad Sojoodi Farimani f.s.farimani at gmail.com >> wrote: >> >> Wow, this is awesome. >>> Some points though: >>> >>> - not everybody uses IPython/Jupyter having the functionality for >>> conventional consols would also help. 
something like >>> Sympy's init_printing/init_session, which smartly chooses the right >>> representation considering the terminal. >>> - I don't think putting everything in boxes is helping; it is >>> confusing. I would rather have horizontal and vertical square brackets >>> represent each nested array >>> - it would be awesome if in IPython/Jupyter hovering over an element >>> a popup would show the index >>> - one could read the width and height of the terminal and other >>> options I mentioned in my reply to Mark to show L R U D or a combination of these >>> plus some numbers (similar to Pandas' .head and .tail methods) and then show the >>> rest by a Unicode ellipsis >>> >>> P.S. I had no idea our university's Microsoft services also offer Azure >>> Notebooks. awesome :P >>> >>> F. >>> >>> On Tue, Nov 6, 2018 at 9:45 AM Eric Wieser >>> wrote: >>> >>>> Here's how that could look >>>> >>>> >>>> https://numpyintegration-ericwieser.notebooks.azure.com/j/notebooks/pprint.ipynb >>>> >>>> Feel free to play around and see if you can produce something more >>>> useful >>>> >>>> >>>> >>>> On Mon, 5 Nov 2018 at 23:28 Foad Sojoodi Farimani < >>>> f.s.farimani at gmail.com> wrote: >>>> >>>>> It is not hijacking if I asked for it :)) >>>>> for IPython/Jupyter using Markdown/LaTeX would be awesome >>>>> or even better using HTML to add sliders just like Pandas... >>>>> >>>>> F. >>>>> >>>>> On Tue, Nov 6, 2018 at 6:51 AM Eric Wieser < >>>>> wieser.eric+numpy at gmail.com> wrote: >>>>> >>>>>> Hijacking this thread while on the topic of pprint - we might want to >>>>>> look into a table-based `_repr_html_` or `_repr_latex_` for use in IPython >>>>>> - where we can print the full array and let scrollbars replace ellipses. >>>>>> >>>>>> Eric >>>>>> >>>>>> On Mon, 5 Nov 2018 at 21:11 Mark Harfouche >>>>>> wrote: >>>>>>> >>>>>>> Foad, >>>>>>> >>>>>>> Visualizing data is definitely a complex field. I definitely feel >>>>>>> your pain.
>>>>>>> Printing your data is but one way of visualizing it, and probably >>>>>>> only useful for very small and constrained datasets. >>>>>>> Have you looked into set_printoptions >>>>>>> >>>>>>> to see how numpy's existing capabilities might help you with your >>>>>>> visualization? >>>>>>> >>>>>>> The code you showed seems quite good. I wouldn't worry about >>>>>>> performance when it comes to functions that will seldom be called in tight >>>>>>> loops. >>>>>>> As you'll learn more about python and numpy, you'll keep expanding >>>>>>> it to include more use cases. >>>>>>> For many of my projects, I create small submodules for visualization >>>>>>> tailored to the specific needs of the particular project. >>>>>>> I'll try to incorporate your functions and see how I use them. >>>>>>> >>>>>>> Your original post seems to have some confusion about C-style vs F-style >>>>>>> ordering. I hope that has been resolved. >>>>>>> There is also a lot of good documentation >>>>>>> >>>>>>> https://docs.scipy.org/doc/numpy/user/numpy-for-matlab-users.html#numpy-for-matlab-users-notes >>>>>>> about transitioning from MATLAB. >>>>>>> >>>>>>> Mark >>>>>>> >>>>>>> On Mon, Nov 5, 2018 at 4:46 PM Foad Sojoodi Farimani < >>>>>>> f.s.farimani at gmail.com> wrote: >>>>>>> >>>>>>>> Hello everyone, >>>>>>>> >>>>>>>> Following this question >>>>>>>> , I'm convinced that >>>>>>>> numpy ndarrays are not MATLAB/mathematical multidimensional matrices and I >>>>>>>> should stop expecting them to be. However, I still think it would have a lot >>>>>>>> of benefit to have a function like sympy's pprint to pretty-print, >>>>>>>> something like pandas' .head and .tail methods plus .left .right .UpLeft >>>>>>>> .UpRight .DownLeft .DownRight methods. when nothing is mentioned it would show >>>>>>>> 4 corners and put dots in the middle if the array is too big for the >>>>>>>> terminal.
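[The "four corners with dots" view requested here can already be approximated with NumPy's existing print options: `threshold` sets how large an array must be before it is summarized, and `edgeitems` sets how many rows/columns survive at each edge. A minimal sketch; the array shape is just an illustration:]

```python
import numpy as np

a = np.arange(10000).reshape(100, 100)

# Summarize any array with more than `threshold` elements, keeping
# `edgeitems` rows/columns at each edge -- the four corners survive
# and "..." fills the middle.
with np.printoptions(threshold=20, edgeitems=2):
    print(a)
```

[This only covers the symmetric case; the directional variants (UL, UR, DL, DR) proposed in this thread would still need custom slicing.]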
>>>>>>>> >>>>>>>> Best, >>>>>>>> Foad >>>>>>>> _______________________________________________ >>>>>>>> NumPy-Discussion mailing list >>>>>>>> NumPy-Discussion at python.org >>>>>>>> https://mail.python.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: chrome_2018-11-07_10-15-20.png Type: image/png Size: 28101 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed...
Name: 2018-11-07_09-07-37.gif Type: image/gif Size: 37249 bytes Desc: not available URL: From einstein.edison at gmail.com Fri Nov 9 10:15:31 2018 From: einstein.edison at gmail.com (Hameer Abbasi) Date: Fri, 9 Nov 2018 16:15:31 +0100 Subject: [Numpy-discussion] asarray/anyarray; matrix/subclass Message-ID: <48ba8375-8499-4d57-acf0-803d02d120ce@Canary> > Begin forwarded message: > > From: Stephan Hoyer > Date: Friday, Nov 09, 2018 at 3:19 PM > To: Hameer Abbasi > Cc: Stefan van der Walt , Marten van Kerkwijk > Subject: asarray/anyarray; matrix/subclass > > This is a great discussion, but let's try to have it in public (e.g., on the NumPy mailing list). > On Fri, Nov 9, 2018 at 8:42 AM Hameer Abbasi wrote: > > Hi Stephan, > > > > The issue I have with writing another function is that asarray/asanyarray are so widely used that it'd be a huge maintenance task to update them throughout NumPy, not to mention other codebases, which would also have to rely on newer NumPy versions for this. In short, it would dramatically reduce the adaptability of this function. > > > > One path we can take is to allow asarray/asanyarray to be overridable via __array_function__ (the former is debatable). This solves most of our duck-array related issues without introducing another protocol. > > > > Regardless of what path we choose, I would recommend changing asanyarray to not pass through np.matrix, instead passing through mat.view(type=np.ndarray), which has O(1) cost and memory. In the vast majority of contexts, it's used to ensure an array-ish structure for another operation, and usually there's no guarantee that what comes out will be a matrix anyway. I suggest we raise a FutureWarning and then change this behaviour. > > > > There have been a number of discussions about deprecating np.matrix (and a few about MaskedArray as well, though there are less compelling reasons for that one). I suggest we start down that path as soon as possible.
The biggest (only?) user I know of blocking that is scipy.sparse, and we're on our way to replacing that with PyData/Sparse. > > > > Best Regards, > > Hameer Abbasi > > > > > > > On Friday, Nov 09, 2018 at 1:26 AM, Stephan Hoyer wrote: > > > Hi Hameer, > > > > > > I'd love to talk about this in more detail. I agree that something like this is needed. > > > > > > The challenge with reusing an existing function like asanyarray() is that there is at least one (somewhat?) widely used ndarray subclass that badly violates the Liskov Substitution Principle: np.matrix. > > > > > > NumPy can't really use np.asanyarray() widely for internal purposes until we don't have to worry about np.matrix. We might special-case np.matrix in some way, but then asanyarray() would do totally opposite things on different versions of NumPy. It's almost certainly a better idea to just write a new function with the desired semantics, and "soft deprecate" asanyarray(). The new function can explicitly blacklist np.matrix, as well as any other subclasses we know of that badly violate LSP. > > > > > > Cheers, > > > Stephan > > > On Thu, Nov 8, 2018 at 5:06 PM Hameer Abbasi wrote: > > > > No, Stefan, I'll do that now. Putting you in the cc. > > > > > > > > It slipped my mind among the million other things I had in mind, namely my job visa. It was only done this Monday. > > > > > > > > Hi, Marten, Stephan: > > > > > > > > Stefan wants me to write up a NEP that allows a given object to specify that it is a duck array, namely that it follows duck-array semantics. > > > > > > > > We were thinking of switching asanyarray to pass through anything that implements the duck-array protocol along with ndarray subclasses. I'm sure this would help XArray and Quantity work better with existing codebases, along with PyData/Sparse arrays. > > > > > > > > Would you be interested?
> > > > > > > > Best Regards, > > > > Hameer Abbasi > > > > > > > > > > > > > On Thursday, Nov 08, 2018 at 9:09 PM, Stefan van der Walt wrote: > > > > > Hi Hameer, > > > > > > > > > > In last week's meeting, we had the following in the notes: > > > > > > > > > > > Hameer is contacting Marten & Stephan and writing up a draft NEP for > > > > > > clarifying the asarray/asanyarray and matrix/subclass path forward. > > > > > > > > > > Did any of that happen that you could share? > > > > > > > > > > Thanks and best regards, > > > > > Stéfan Hello, everyone, Stefan van der Walt, Stephan Hoyer, Marten van Kerkwijk and I were having a discussion about the state of matrix, asarray and asanyarray. Our thoughts are summarised above (in the quoted text that I'm forwarding). Basically, this grew out of a discussion relating to asanyarray/asarray inconsistencies in NumPy about which to use where. Historically, asarray was used in many libraries/places instead of asanyarray, usually because np.matrix caused problems due to its special behaviour with regard to indexing (it always returns a 2-D object when eliminating one dimension, but a 0-D one when eliminating both), its behaviour regarding __mul__ (the multiplication operator represents matrix multiplication rather than element-wise multiplication) and its fixed dimensionality (matrix is 2-D only). Because of these three things, as Stephan accurately pointed out, it violates the Liskov Substitution Principle. Because of this behaviour, many libraries switched from using asanyarray to asarray, as np.matrix wouldn't work with their code. This shut out other ndarray subclasses from being used as well, such as MaskedArray and astropy.Quantity. Even if asanyarray is used, there is usually no guarantee that a matrix will be returned instead of an array. The changes I'm proposing are twofold, but simple: asanyarray should return mat.view(type=np.ndarray) instead of matrices, after an appropriate time with a FutureWarning.
This allows us to preserve the performance (creating a view is O(1) both in memory and time) and the mutability of the original matrix. This change should happen after a FutureWarning and the usual grace period. In the spirit of allowing duck-arrays to work with existing NumPy code, asanyarray should be overridable via __array_function__, so that duck arrays can decide whether to pass themselves through. If subclasses are allowed, so should duck-arrays be. This is a part of a larger effort to deprecate np.matrix. As far as I'm aware, it has one big customer (scipy.sparse). The effort to replace that is already underway at PyData/Sparse. Best Regards, Hameer Abbasi -------------- next part -------------- An HTML attachment was scrubbed... URL: From dpgrote at lbl.gov Fri Nov 9 18:00:01 2018 From: dpgrote at lbl.gov (David Grote) Date: Fri, 9 Nov 2018 15:00:01 -0800 Subject: [Numpy-discussion] Problem with libgfortran installed with pip install numpy In-Reply-To: References: Message-ID: Hi Matthew - Do you have any comment on this? Thanks! Dave On Wed, Sep 5, 2018 at 5:01 PM Charles R Harris wrote: > > > On Wed, Sep 5, 2018 at 5:38 PM David Grote wrote: > >> >> Hi - I have recently come across this problem. On my mac, I build a >> Fortran code, producing a shared object, that I import into Python along >> with numpy. This had been working fine until recently, when I started seeing >> seg faults deep inside the Fortran code, usually in Fortran print >> statements. I tracked this down to a gfortran version issue. >> >> I use the brew installation of Python and gcc (using the most recent >> version, 8.2.0). gcc of course installs a version of libgfortran.dylib. >> Doing an lsof of a running Python, I see that it finds that copy of >> libgfortran, and also a copy that was downloaded with numpy >> (/usr/local/lib/python3.7/site-packages/numpy/.dylibs/libgfortran.3.dylib). >> Looking at numpy's copy of libgfortran, I see that it is version 4.9.0, >> much older.
Since my code is importing numpy first, the OS seems to be using >> numpy's version of libgfortran to link when importing my code. I know from >> other experience that older versions of libgfortran are not compatible with >> code compiled using a new version of gfortran, and so segfaults >> happen. >> >> If I download the numpy source and do python setup.py install, I don't >> have this problem. >> >> After this long description, my question is: why is such an old version of >> gcc used to build the distribution of numpy that gets installed from pypi? >> gcc version 4.9.0 is from 2014. Can a newer version be used? >> > > The library came in with the use of OpenBLAS. I don't think there is a > fundamental reason that a newer version of gfortran couldn't be used, but I > have little experience with the Mac. Note that we have also given up on 32-bit > Python on Mac for library-related reasons. Matthew Brett would be the > guy to discuss this with. > > Chuck > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Fri Nov 9 18:08:40 2018 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 9 Nov 2018 15:08:40 -0800 Subject: [Numpy-discussion] Problem with libgfortran installed with pip install numpy In-Reply-To: References: Message-ID: On Wed, Sep 5, 2018 at 4:37 PM, David Grote wrote: > > Hi - I have recently come across this problem. On my mac, I build a Fortran > code, producing a shared object, that I import into Python along with numpy. > This had been working fine until recently, when I started seeing seg faults > deep inside the Fortran code, usually in Fortran print statements. I tracked > this down to a gfortran version issue. > > I use the brew installation of Python and gcc (using the most recent > version, 8.2.0).
gcc of course installs a version of libgfortran.dylib. > Doing an lsof of a running Python, I see that it finds that copy of > libgfortran, and also a copy that was downloaded with numpy > (/usr/local/lib/python3.7/site-packages/numpy/.dylibs/libgfortran.3.dylib). > Looking at numpy's copy of libgfortran, I see that it is version 4.9.0, much > older. Since my code is importing numpy first, the OS seems to be using numpy's > version of libgfortran to link when importing my code. I know from other > experience that older versions of libgfortran are not compatible with code > compiled using a new version of gfortran, and so segfaults happen. Normally on macOS, it's fine to have multiple versions of the same library used at the same time, because the linker looks up symbols using a (source library, symbol name) pair. (This is called the "two-level namespace".) So it's strange that these two libgfortrans would interfere with each other. Does gfortran not use the two-level namespace when linking Fortran code? -n -- Nathaniel J. Smith -- https://vorpus.org From shoyer at gmail.com Fri Nov 9 18:28:09 2018 From: shoyer at gmail.com (Stephan Hoyer) Date: Fri, 9 Nov 2018 18:28:09 -0500 Subject: [Numpy-discussion] Should unique types of all arguments be passed on in __array_function__? In-Reply-To: References: Message-ID: On Mon, Nov 5, 2018 at 9:00 AM Marten van Kerkwijk < m.h.vankerkwijk at gmail.com> wrote: > Hi Stephan, > > I fear my example about `ndarray.__array_function__` > distracted from the gist of my question, which was whether for > `__array_function__` implementations *generally* it wouldn't be handier to > have all unique types rather than just those that override > `__array_function__`. It would seem that for any implementation other than > numpy itself, the presence of __array_function__ is indeed almost > irrelevant.
As a somewhat random example, why would it, e.g., for Dask be > useful to know that another argument is a Quantity, but not that it is a > file handle? (Presumably, it cannot handle either...) > In practice, it is of course easy to simply ignore arguments that don't define __array_function__. But I do think the distinction is important for more than merely ndarray: the value of the types argument tells you the set of types that might have a conflicting implementation. For example, Dask might be happy to handle any non-arrays as scalars (like NumPy), e.g., it should be fine to make a dask array consisting of a decimal object. Since decimal doesn't define __array_function__, there's no need to do anything special to handle it inside dask.array.Array.__array_function__. If decimal appeared in types, then dask would have to be careful to let arbitrary types that don't define __array_function__ pass through. In contrast, dask definitely wants to know if another type defines __array_function__, because they might have a conflicting implementation. This is the main reason why we have the types argument in the first place: to make these checks easy. In my experience, it is super common for Python arithmetic methods to be implemented improperly, i.e., never returning NotImplemented. This will hopefully be less common with __array_function__. More broadly, it is only necessary to reject an argument type at the __array_function__ level if it defines __array_function__ itself, because that's the only case where it would make a difference to return NotImplemented rather than trying (and failing) to call the overridden function implementation. > > All the best, > > Marten > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed...
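[The role of the `types` argument described above can be made concrete with a small sketch. `MyDuckArray` is hypothetical (not Dask's actual implementation); it shows the standard NEP 18 pattern: defer, by returning NotImplemented, only when `types` contains an unrecognized type that itself defines `__array_function__`, while scalars such as `decimal.Decimal` never appear in `types` and need no special handling.]

```python
import numpy as np

class MyDuckArray:
    """Hypothetical duck array illustrating the NEP 18 dispatch pattern."""

    def __init__(self, data):
        self.data = np.asarray(data)

    def __array_function__(self, func, types, args, kwargs):
        # `types` only ever contains argument types that define
        # __array_function__, so plain scalars cannot trip this check.
        if not all(issubclass(t, (MyDuckArray, np.ndarray)) for t in types):
            return NotImplemented  # another override might conflict
        # Unwrap our arguments and delegate to the plain NumPy function.
        unwrapped = tuple(a.data if isinstance(a, MyDuckArray) else a
                          for a in args)
        return MyDuckArray(func(*unwrapped, **kwargs))

# NumPy functions (>= 1.17) dispatch to the override automatically:
out = np.sum(MyDuckArray([1, 2, 3]))
print(type(out).__name__, out.data)  # MyDuckArray 6
```

[Note the simple tuple unwrapping above would not reach into nested sequences (e.g. the list argument of `np.concatenate`); a real implementation would recurse.]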
URL: From shoyer at gmail.com Fri Nov 9 18:39:13 2018 From: shoyer at gmail.com (Stephan Hoyer) Date: Fri, 9 Nov 2018 18:39:13 -0500 Subject: [Numpy-discussion] asarray/anyarray; matrix/subclass In-Reply-To: <48ba8375-8499-4d57-acf0-803d02d120ce@Canary> References: <48ba8375-8499-4d57-acf0-803d02d120ce@Canary> Message-ID: I'm still not sure I agree with the advantages of reusing asanyarray(), even if matrix did not exist. Yes, asanyarray will exist in old NumPy versions, but you can't use it with sparse arrays anyway because it will have the wrong semantics. I expect this would be a bug magnet, with inadvertent loading of sparse arrays into memory if you're accidentally using old NumPy. With regards to the protocol, I would suggest a dedicated method, e.g., __asanyarray__ (or something similar based on the final chosen name of the function). Coercing to arrays is special enough to have its own dedicated protocol, and it could be useful for libraries like xarray to check for __asanyarray__ attributes before deciding which coercion mechanism to use. On Fri, Nov 9, 2018 at 10:17 AM Hameer Abbasi wrote: > Begin forwarded message: > > From: Stephan Hoyer > Date: Friday, Nov 09, 2018 at 3:19 PM > To: Hameer Abbasi > Cc: Stefan van der Walt , Marten van Kerkwijk > Subject: asarray/anyarray; matrix/subclass > > This is a great discussion, but let's try to have it in public (e.g., on > the NumPy mailing list). > On Fri, Nov 9, 2018 at 8:42 AM Hameer Abbasi > wrote: > >> Hi Stephan, >> >> The issue I have with writing another function is that asarray/asanyarray >> are so widely used that it'd be a huge maintenance task to update them >> throughout NumPy, not to mention other codebases, which would also have to >> rely on newer NumPy versions for this. In short, it >> would dramatically reduce the adaptability of this function. >> >> One path we can take is to allow asarray/asanyarray to be overridable via >> __array_function__ (the former is debatable).
This solves most of our >> duck-array related issues without introducing another protocol. >> >> Regardless of what path we choose, I would recommend changing asanyarray >> to not pass through np.matrix, instead passing through >> mat.view(type=np.ndarray), which has O(1) cost and memory. In the >> vast majority of contexts, it's used to ensure an array-ish structure for >> another operation, and usually there's no guarantee that what comes out >> will be a matrix anyway. I suggest we raise a FutureWarning and then change >> this behaviour. >> >> There have been a number of discussions about deprecating np.matrix (and >> a few about MaskedArray as well, though there are less compelling reasons >> for that one). I suggest we start down that path as soon as possible. The >> biggest (only?) user I know of blocking that is scipy.sparse, and we're on >> our way to replacing that with PyData/Sparse. >> >> Best Regards, >> Hameer Abbasi >> >> On Friday, Nov 09, 2018 at 1:26 AM, Stephan Hoyer >> wrote: >> Hi Hameer, >> >> I'd love to talk about this in more detail. I agree that something like >> this is needed. >> >> The challenge with reusing an existing function like asanyarray() is that >> there is at least one (somewhat?) widely used ndarray subclass that badly >> violates the Liskov Substitution Principle: np.matrix. >> >> NumPy can't really use np.asanyarray() widely for internal purposes until >> we don't have to worry about np.matrix. We might special-case np.matrix in >> some way, but then asanyarray() would do totally opposite things on >> different versions of NumPy. It's almost certainly a better idea to just >> write a new function with the desired semantics, and "soft deprecate" >> asanyarray(). The new function can explicitly blacklist np.matrix, as well >> as any other subclasses we know of that badly violate LSP. >> >> Cheers, >> Stephan >> On Thu, Nov 8, 2018 at 5:06 PM Hameer Abbasi >> wrote: >> >>> No, Stefan, I'll do that now.
Putting you in the cc. >>> >>> It slipped my mind among the million other things I had in mind, >>> namely my job visa. It was only done this Monday. >>> >>> Hi, Marten, Stephan: >>> >>> Stefan wants me to write up a NEP that allows a given object to specify >>> that it is a duck array, namely that it follows duck-array semantics. >>> >>> We were thinking of switching asanyarray to pass through >>> anything that implements the duck-array protocol along with ndarray >>> subclasses. I'm sure this would help XArray and Quantity work better with >>> existing codebases, along with PyData/Sparse arrays. >>> >>> Would you be interested? >>> >>> Best Regards, >>> Hameer Abbasi >>> >>> On Thursday, Nov 08, 2018 at 9:09 PM, Stefan van der Walt < >>> stefanv at berkeley.edu> wrote: >>> Hi Hameer, >>> >>> In last week's meeting, we had the following in the notes: >>> >>> Hameer is contacting Marten & Stephan and writing up a draft NEP for >>> clarifying the asarray/asanyarray and matrix/subclass path forward. >>> >>> >>> Did any of that happen that you could share? >>> >>> Thanks and best regards, >>> Stéfan >>> >>> > Hello, everyone, > > Stefan van der Walt, Stephan Hoyer, Marten van Kerkwijk and I were having > a discussion about the state of matrix, asarray and asanyarray. Our > thoughts are summarised above (in the quoted text that I'm forwarding). > > Basically, this grew out of a discussion relating to asanyarray/asarray > inconsistencies in NumPy about which to use where. Historically, asarray > was used in many libraries/places instead of asanyarray, usually because > np.matrix caused problems due to its special behaviour with regard to > indexing (it always returns a 2-D object when eliminating one dimension, > but a 0-D one when eliminating both), its behaviour regarding __mul__ (the > multiplication operator represents matrix multiplication rather than > element-wise multiplication) and its fixed dimensionality (matrix is 2-D > only).
Because of these three things, as Stephan accurately pointed out, it > violates the Liskov Substitution Principle. > > Because of this behaviour, many libraries switched from using asanyarray > to asarray, as np.matrix wouldn't work with their code. This shut out other > ndarray subclasses from being used as well, such as MaskedArray and > astropy.Quantity. Even if asanyarray is used, there is usually no guarantee > that a matrix will be returned instead of an array. > > The changes I'm proposing are twofold, but simple: > > - asanyarray should return mat.view(type=np.ndarray) instead of > matrices, after an appropriate time with a FutureWarning. This allows us to > preserve the performance (creating a view is O(1) both in memory and time) > and the mutability of the original matrix. This change should happen after > a FutureWarning and the usual grace period. > - In the spirit of allowing duck-arrays to work with existing NumPy > code, asanyarray should be overridable via __array_function__, so that duck > arrays can decide whether to pass themselves through. If subclasses are > allowed, so should duck-arrays be. > > This is a part of a larger effort to deprecate np.matrix. As far as I'm > aware, it has one big customer (scipy.sparse). The effort to replace that > is already underway at PyData/Sparse. > > Best Regards, > Hameer Abbasi > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Fri Nov 9 18:45:46 2018 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 9 Nov 2018 15:45:46 -0800 Subject: [Numpy-discussion] asarray/anyarray; matrix/subclass In-Reply-To: <48ba8375-8499-4d57-acf0-803d02d120ce@Canary> References: <48ba8375-8499-4d57-acf0-803d02d120ce@Canary> Message-ID: But matrix isn't the only problem with asanyarray.
np.ma also violates Liskov. No doubt there are other problematic ndarray subclasses out there too... If we were going to try to reuse asanyarray through some deprecation mechanism, I think we'd need to deprecate allowing asanyarray to return *any* ndarray subclass, unless they explicitly provided an __asanyarray__ dunder. But at that point I'm not sure what the point would be of reusing it. On Fri, Nov 9, 2018 at 7:15 AM, Hameer Abbasi wrote: > Begin forwarded message: > > From: Stephan Hoyer > Date: Friday, Nov 09, 2018 at 3:19 PM > To: Hameer Abbasi > Cc: Stefan van der Walt , Marten van Kerkwijk > Subject: asarray/anyarray; matrix/subclass > > This is a great discussion, but let's try to have it in public (e.g., on the > NumPy mailing list). > On Fri, Nov 9, 2018 at 8:42 AM Hameer Abbasi > wrote: >> >> Hi Stephan, >> >> The issue I have with writing another function is that asarray/asanyarray >> are so widely used that it'd be a huge maintenance task to update them >> throughout NumPy, not to mention other codebases, which would also have to >> rely on newer NumPy versions for this. In short, it >> would dramatically reduce the adaptability of this function. >> >> One path we can take is to allow asarray/asanyarray to be overridable via >> __array_function__ (the former is debatable). This solves most of our >> duck-array related issues without introducing another protocol. >> >> Regardless of what path we choose, I would recommend changing asanyarray >> to not pass through np.matrix, instead passing through >> mat.view(type=np.ndarray), which has O(1) cost and memory. In the >> vast majority of contexts, it's used to ensure an array-ish structure for >> another operation, and usually there's no guarantee that what comes out will >> be a matrix anyway. I suggest we raise a FutureWarning and then change this >> behaviour.
>> >> There have been a number of discussions about deprecating np.matrix (and a >> few about MaskedArray as well, though there are less compelling reasons for >> that one). I suggest we start down that path as soon as possible. The >> biggest (only?) user I know of blocking that is scipy.sparse, and we're on >> our way to replacing that with PyData/Sparse. >> >> Best Regards, >> Hameer Abbasi >> >> On Friday, Nov 09, 2018 at 1:26 AM, Stephan Hoyer >> wrote: >> Hi Hameer, >> >> I'd love to talk about this in more detail. I agree that something like >> this is needed. >> >> The challenge with reusing an existing function like asanyarray() is that >> there is at least one (somewhat?) widely used ndarray subclass that badly >> violates the Liskov Substitution Principle: np.matrix. >> >> NumPy can't really use np.asanyarray() widely for internal purposes until >> we don't have to worry about np.matrix. We might special-case np.matrix in >> some way, but then asanyarray() would do totally opposite things on >> different versions of NumPy. It's almost certainly a better idea to just >> write a new function with the desired semantics, and "soft deprecate" >> asanyarray(). The new function can explicitly blacklist np.matrix, as well >> as any other subclasses we know of that badly violate LSP. >> >> Cheers, >> Stephan >> On Thu, Nov 8, 2018 at 5:06 PM Hameer Abbasi >> wrote: >>> >>> No, Stefan, I'll do that now. Putting you in the cc. >>> >>> It slipped my mind among the million other things I had in mind, namely: >>> my job visa. It was only done this Monday. >>> >>> Hi, Marten, Stephan: >>> >>> Stefan wants me to write up a NEP that allows a given object to specify >>> that it is a duck array, namely that it follows duck-array semantics. >>> >>> We were thinking of switching asanyarray to pass through >>> anything that implements the duck-array protocol along with ndarray >>> subclasses.
I'm sure this would help XArray and Quantity work better with >>> existing codebases, along with PyData/Sparse arrays. >>> >>> Would you be interested? >>> >>> Best Regards, >>> Hameer Abbasi >>> >>> On Thursday, Nov 08, 2018 at 9:09 PM, Stefan van der Walt >>> wrote: >>> Hi Hameer, >>> >>> In last week's meeting, we had the following in the notes: >>> >>> Hameer is contacting Marten & Stephan and writing up a draft NEP for >>> clarifying the asarray/asanyarray and matrix/subclass path forward. >>> >>> >>> Did any of that happen that you could share? >>> >>> Thanks and best regards, >>> Stéfan > > > Hello, everyone, > > Stefan van der Walt, Stephan Hoyer, Marten van Kerkwijk and I were having a > discussion about the state of matrix, asarray and asanyarray. Our thoughts > are summarised above (in the quoted text that I'm forwarding). > > Basically, this grew out of a discussion relating to asanyarray/asarray > inconsistencies in NumPy about which to use where. Historically, asarray was > used in many libraries/places instead of asanyarray, usually because > np.matrix caused problems due to its special behaviour with regard to > indexing (it always returns a 2-D object when eliminating one dimension, but > a 0-D one when eliminating both), its behaviour regarding __mul__ (the > multiplication operator represents matrix multiplication rather than > element-wise multiplication) and its fixed dimensionality (matrix is 2-D > only). Because of these three things, as Stephan accurately pointed out, it > violates the Liskov Substitution Principle. > > Because of this behaviour, many libraries switched from using asanyarray to > asarray, as np.matrix wouldn't work with their code. This shut out other > ndarray subclasses from being used as well, such as MaskedArray and > astropy.Quantity. Even if asanyarray is used, there is usually no guarantee > that a matrix will be returned instead of an array.
> > The changes I'm proposing are twofold, but simple: > > asanyarray should return mat.view(type=np.ndarray) instead of matrices. > This allows us to preserve > the performance (creating a view is O(1) both in memory and time) and the > mutability of the original matrix. This change should happen after a > FutureWarning and the usual grace period. > In the spirit of allowing duck-arrays to work with existing NumPy code, > asanyarray should be overridable via __array_function__, so that duck arrays > can decide whether to pass themselves through. If subclasses are allowed, > duck-arrays should be as well. > > This is part of a larger effort to deprecate np.matrix. As far as I'm > aware, it has one big customer (scipy.sparse). The effort to replace that is > already underway at PyData/Sparse. > > Best Regards, > Hameer Abbasi > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -- Nathaniel J. Smith -- https://vorpus.org From shoyer at gmail.com Fri Nov 9 19:59:32 2018 From: shoyer at gmail.com (Stephan Hoyer) Date: Fri, 9 Nov 2018 19:59:32 -0500 Subject: [Numpy-discussion] asarray/anyarray; matrix/subclass In-Reply-To: References: <48ba8375-8499-4d57-acf0-803d02d120ce@Canary> Message-ID: On Fri, Nov 9, 2018 at 6:46 PM Nathaniel Smith wrote: > But matrix isn't the only problem with asanyarray. np.ma also violates > Liskov. No doubt there are other problematic ndarray subclasses out > there too... > Please forgive my ignorance (I don't really use masked arrays), but how specifically do masked arrays violate Liskov? In most cases shouldn't they work the same as base numpy arrays, except with operations keeping track of masks? I'm sure there are some cases where masked arrays have different semantics than NumPy arrays, but are any of these intentional?
I would guess that the worst current violation is that there is a risk of losing mask information in some operations, but implementing __array_function__ would presumably make it possible to fix most of these. -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Fri Nov 9 20:09:04 2018 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 9 Nov 2018 17:09:04 -0800 Subject: [Numpy-discussion] asarray/anyarray; matrix/subclass In-Reply-To: References: <48ba8375-8499-4d57-acf0-803d02d120ce@Canary> Message-ID: On Fri, Nov 9, 2018 at 4:59 PM, Stephan Hoyer wrote: > On Fri, Nov 9, 2018 at 6:46 PM Nathaniel Smith wrote: >> >> But matrix isn't the only problem with asanyarray. np.ma also violates >> Liskov. No doubt there are other problematic ndarray subclasses out >> there too... > > > Please forgive my ignorance (I don't really use mask arrays), but how > specifically do masked arrays violate Liskov? In most cases shouldn't they > work the same as base numpy arrays, except with operations keeping track of > masks? Since many operations silently skip over masked values, the computation semantics are different. For example, in a regular array, sum()/size() == mean(), but with a masked array these are totally different operations. So if you have code that was written for regular arrays, but pass in a masked array, there's a solid chance that it will silently return nonsensical results. (This is why it's better for NAs to propagate by default.) -n -- Nathaniel J. 
Smith -- https://vorpus.org From matti.picus at gmail.com Sat Nov 10 11:02:23 2018 From: matti.picus at gmail.com (Matti Picus) Date: Sat, 10 Nov 2018 08:02:23 -0800 Subject: [Numpy-discussion] asarray/anyarray; matrix/subclass In-Reply-To: References: <48ba8375-8499-4d57-acf0-803d02d120ce@Canary> Message-ID: On 9/11/18 5:09 pm, Nathaniel Smith wrote: > On Fri, Nov 9, 2018 at 4:59 PM, Stephan Hoyer wrote: >> On Fri, Nov 9, 2018 at 6:46 PM Nathaniel Smith wrote: >>> But matrix isn't the only problem with asanyarray. np.ma also violates >>> Liskov. No doubt there are other problematic ndarray subclasses out >>> there too... >>> >>> >>> Please forgive my ignorance (I don't really use mask arrays), but how >>> specifically do masked arrays violate Liskov? In most cases shouldn't they >>> work the same as base numpy arrays, except with operations keeping track of >>> masks? > Since many operations silently skip over masked values, the > computation semantics are different. For example, in a regular array, > sum()/size() == mean(), but with a masked array these are totally > different operations. So if you have code that was written for regular > arrays, but pass in a masked array, there's a solid chance that it > will silently return nonsensical results. > > (This is why it's better for NAs to propagate by default.) > > -n Echoes of the discussions in NEPs 12, 24, 25, 26.
http://www.numpy.org/neps Matti From m.h.vankerkwijk at gmail.com Sat Nov 10 12:49:34 2018 From: m.h.vankerkwijk at gmail.com (Marten van Kerkwijk) Date: Sat, 10 Nov 2018 12:49:34 -0500 Subject: [Numpy-discussion] asarray/anyarray; matrix/subclass In-Reply-To: References: <48ba8375-8499-4d57-acf0-803d02d120ce@Canary> Message-ID: Hi Hameer, I do not think we should change `asanyarray` itself to special-case matrix; rather, we could start converting `asarray` to `asanyarray` and solve the problems that produces for matrices in `matrix` itself (e.g., by overriding the relevant function with `__array_function__`). I think the idea of providing an `__anyarray__` method (in analogy with `__array__`) might work. Indeed, the default in `ndarray` (and thus all its subclasses) could be to let it return `self` and to override it for `matrix` to return an ndarray view. All the best, Marten p.s. Note that we are already giving PendingDeprecationWarning for matrix; https://github.com/numpy/numpy/pull/10142. On Sat, Nov 10, 2018 at 11:02 AM Matti Picus wrote: > On 9/11/18 5:09 pm, Nathaniel Smith wrote: > > On Fri, Nov 9, 2018 at 4:59 PM, Stephan Hoyer wrote: > >> On Fri, Nov 9, 2018 at 6:46 PM Nathaniel Smith wrote: > >>> But matrix isn't the only problem with asanyarray. np.ma also violates > >>> Liskov. No doubt there are other problematic ndarray subclasses out > >>> there too... > >>> > >>> > >>> Please forgive my ignorance (I don't really use mask arrays), but how > >>> specifically do masked arrays violate Liskov? In most cases shouldn't > they > >>> work the same as base numpy arrays, except with operations keeping > track of > >>> masks? > > Since many operations silently skip over masked values, the > > computation semantics are different. For example, in a regular array, > > sum()/size() == mean(), but with a masked array these are totally > > different operations. 
So if you have code that was written for regular > > arrays, but pass in a masked array, there's a solid chance that it > > will silently return nonsensical results. > > > > (This is why it's better for NAs to propagate by default.) > > > > -n > > > Echos of the discussions in neps 12, 24, 25, 26. http://www.numpy.org/neps > > > Matti > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From m.h.vankerkwijk at gmail.com Sat Nov 10 12:59:01 2018 From: m.h.vankerkwijk at gmail.com (Marten van Kerkwijk) Date: Sat, 10 Nov 2018 12:59:01 -0500 Subject: [Numpy-discussion] Should unique types of all arguments be passed on in __array_function__? In-Reply-To: References: Message-ID: > More broadly, it is only necessary to reject an argument type at the > __array_function__ level if it defines __array_function__ itself, because > that?s the only case where it would make a difference to return > NotImplemented rather than trying (and failing) to call the overriden > function implementation. > Yes, this makes sense -- these are the only types that could possibly change the outcome if the class now called fails to produce a result. Indeed, that reasoning makes it logical that `ndarray` itself is not present even though it defines `__array_ufunc__` - we know it cannot handle anything with a `__array_ufunc__` implementation. Hameer, is Stephan's argument convincing to you too? If so, I'll close the PR. -- Marten -------------- next part -------------- An HTML attachment was scrubbed... 
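The sum()/size() vs mean() divergence quoted above is easy to verify. An illustrative snippet only; note that the `skipna` argument discussed later in this thread does not exist in np.ma, and the "naive" mean below stands in for generic ndarray code receiving a masked array:

```python
import numpy as np

# NaN-based missing values propagate through reductions:
with_nan = np.array([1.0, 2.0, np.nan, 4.0])
nan_mean = with_nan.mean()                 # nan

# Masked values are silently skipped instead, so for the same
# "3 valid out of 4" data, sum()/size no longer equals mean():
masked = np.ma.masked_array([1.0, 2.0, -1.0, 4.0],
                            mask=[False, False, True, False])
m_sum = masked.sum()                       # 7.0, over the 3 unmasked values
m_mean = masked.mean()                     # 7.0 / 3, not 7.0 / 4
naive_mean = masked.sum() / masked.size    # what ndarray-style code expects
```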
URL: From shoyer at gmail.com Sat Nov 10 15:16:28 2018 From: shoyer at gmail.com (Stephan Hoyer) Date: Sat, 10 Nov 2018 15:16:28 -0500 Subject: [Numpy-discussion] asarray/anyarray; matrix/subclass In-Reply-To: References: <48ba8375-8499-4d57-acf0-803d02d120ce@Canary> Message-ID: On Sat, Nov 10, 2018 at 9:49 AM Marten van Kerkwijk < m.h.vankerkwijk at gmail.com> wrote: > Hi Hameer, > > I do not think we should change `asanyarray` itself to special-case > matrix; rather, we could start converting `asarray` to `asanyarray` and > solve the problems that produces for matrices in `matrix` itself (e.g., by > overriding the relevant function with `__array_function__`). > > I think the idea of providing an `__anyarray__` method (in analogy with > `__array__`) might work. Indeed, the default in `ndarray` (and thus all its > subclasses) could be to let it return `self` and to override it for > `matrix` to return an ndarray view. > Yes, we certainly would rather implement a matrix.__anyarray__ method (if we're already doing a new protocol) rather than special case np.matrix explicitly. Unfortunately, per Nathaniel's comments about NA skipping behavior, it seems like we will also need MaskedArray.__anyarray__ to return something other than itself. In principle, we should probably write a new version of MaskedArray that doesn't deviate from ndarray semantics, but that's a rather large project (we'd also probably want to stop subclassing ndarray). Changing the default aggregation behavior for the existing MaskedArray is also an option but that would be a serious annoyance to users and a backwards compatibility break. If the only way MaskedArray violates Liskov is in terms of NA skipping aggregations by default, then this might be viable. In practice, this would require adding an explicit skipna argument so FutureWarnings could be silenced.
The plus side of this option is that it > would make it easier to use np.anyarray() or any new coercion function > throughout the internal NumPy code base. > > To summarize, I think these are our options: > 1. Change the behavior of np.anyarray() to check for an __anyarray__() > protocol. Change np.matrix.__anyarray__() to return a base numpy array > (this is a minor backwards compatibility break, but probably for the best). > Start issuing a FutureWarning for any MaskedArray operations that violate > Liskov and add a skipna argument that in the future will default to > skipna=False. > 2. Introduce a new coercion function, e.g., np.duckarray(). This is the > easiest option because we don't need to clean up NumPy's existing ndarray > subclasses. > > P.S. I'm just glad pandas stopped subclassing ndarray a while ago -- > there's no way pandas.Series() could be fixed up to not violate Liskov :). > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From wieser.eric+numpy at gmail.com Sat Nov 10 16:15:16 2018 From: wieser.eric+numpy at gmail.com (Eric Wieser) Date: Sat, 10 Nov 2018 13:15:16 -0800 Subject: [Numpy-discussion] asarray/anyarray; matrix/subclass In-Reply-To: References: <48ba8375-8499-4d57-acf0-803d02d120ce@Canary> Message-ID: > If the only way MaskedArray violates Liskov is in terms of NA skipping aggregations by default, then this might be viable One of the ways to fix these Liskov substitution problems is just to introduce more base classes - for instance, if we had an `NDContainer` base class with only slicing support, then masked arrays would be an exact Liskov substitution, but np.matrix would not.
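A sketch of what such a membership test might look like (`NDContainer` and the function name here are hypothetical, not anything in NumPy). One wrinkle: because np.matrix *inherits* from ndarray, virtually registering ndarray with an ABC would sweep matrix in too, so the exclusion has to be explicit:

```python
import numpy as np

def is_ndcontainer(obj):
    """Hypothetical check for a minimal 'NDContainer' contract:
    the object promises ndarray-style shapes and slicing only.

    MaskedArray slices exactly like ndarray, so it qualifies;
    np.matrix changes result shapes under slicing, so it is
    excluded even though it subclasses ndarray.
    """
    return isinstance(obj, np.ndarray) and not isinstance(obj, np.matrix)
```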
Eric On Sat, 10 Nov 2018 at 12:17 Stephan Hoyer wrote: > On Sat, Nov 10, 2018 at 9:49 AM Marten van Kerkwijk < > m.h.vankerkwijk at gmail.com> wrote: > >> Hi Hameer, >> >> I do not think we should change `asanyarray` itself to special-case >> matrix; rather, we could start converting `asarray` to `asanyarray` and >> solve the problems that produces for matrices in `matrix` itself (e.g., by >> overriding the relevant function with `__array_function__`). >> >> I think the idea of providing an `__anyarray__` method (in analogy with >> `__array__`) might work. Indeed, the default in `ndarray` (and thus all its >> subclasses) could be to let it return `self` and to override it for >> `matrix` to return an ndarray view. >> > > Yes, we certainly would rather implement a matrix.__anyarray__ method (if > we're already doing a new protocol) rather than special case np.matrix > explicitly. > > Unfortunately, per Nathaniel's comments about NA skipping behavior, it > seems like we will also need MaskedArray.__anyarray__ to return something > other than itself. In principle, we should probably write new version of > MaskedArray that doesn't deviate from ndarray semantics, but that's a > rather large project (we'd also probably want to stop subclassing ndarray). > > Changing the default aggregation behavior for the existing MaskedArray is > also an option but that would be a serious annoyance to users and backwards > compatibility break. If the only way MaskedArray violates Liskov is in > terms of NA skipping aggregations by default, then this might be viable. In > practice, this would require adding an explicit skipna argument so > FutureWarnings could be silenced. The plus side of this option is that it > would make it easier to use np.anyarray() or any new coercion function > throughout the internal NumPy code base. > > To summarize, I think these are our options: > 1. Change the behavior of np.anyarray() to check for an __anyarray__() > protocol. 
Change np.matrix.__anyarray__() to return a base numpy array > (this is a minor backwards compatibility break, but probably for the best). > Start issuing a FutureWarning for any MaskedArray operations that violate > Liskov and add a skipna argument that in the future will default to > skipna=False. > 2. Introduce a new coercion function, e.g., np.duckarray(). This is the > easiest option because we don't need to cleanup NumPy's existing ndarray > subclasses. > > P.S. I'm just glad pandas stopped subclassing ndarray a while ago -- > there's no way pandas.Series() could be fixed up to not violate Liskov :). > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From shoyer at gmail.com Sat Nov 10 16:19:15 2018 From: shoyer at gmail.com (Stephan Hoyer) Date: Sat, 10 Nov 2018 16:19:15 -0500 Subject: [Numpy-discussion] Prep for NumPy 1.16.0 branch In-Reply-To: References: <20181106230640.eqgvpmt2wcqglsh4@carbo> Message-ID: On Tue, Nov 6, 2018 at 8:09 PM Stephan Hoyer wrote: > On Tue, Nov 6, 2018 at 6:08 PM Stefan van der Walt > wrote: > >> On Sun, 04 Nov 2018 17:16:12 -0800, Stephan Hoyer wrote: >> > On Sun, Nov 4, 2018 at 10:32 AM Marten van Kerkwijk < >> > m.h.vankerkwijk at gmail.com> wrote: >> > >> > > For `__array_function__`, there was some discussion in >> > > https://github.com/numpy/numpy/issues/12225 that for 1.16 we might >> want >> > > to follow after all Nathaniel's suggestion of using an environment >> variable >> > > or so to opt in (since introspection breaks on python2 with our >> wrapped >> > > implementations). Given also the possibly significant hit in >> performance, >> > > this may be the best option. >> > > All the best, >> > >> > I am also leaning towards this right now, depending on how long we plan >> to >> > wait for releasing 1.16. 
It will take us at least a little while to sort >> > out performance issues for __array_function__, I'd guess at least a few >> > weeks. Then a blocker still might turn up during the release candidate >> > process (though I think we've found most of the major bugs / downstream >> > issues already through tests on NumPy's dev branch). >> >> Just to make sure I understand correctly: the suggestion is to use an >> environment variable to temporarily toggle the feature, but the plan in >> the long run will be to have it enabled all the time, correct? >> > > Yes, exactly. __array_function__ would be opt-in only for 1.16 but enabled > by default for 1.17. > I've gone ahead and made a pull request making __array_function__ opt-in only for now: https://github.com/numpy/numpy/pull/12362 -------------- next part -------------- An HTML attachment was scrubbed... URL: From einstein.edison at gmail.com Sat Nov 10 17:07:10 2018 From: einstein.edison at gmail.com (Hameer Abbasi) Date: Sat, 10 Nov 2018 23:07:10 +0100 Subject: [Numpy-discussion] Should unique types of all arguments be passed on in __array_function__? In-Reply-To: References: Message-ID: <6acc78cc-3c78-4440-b692-8482c350992d@Canary> > On Saturday, Nov 10, 2018 at 6:59 PM, Marten van Kerkwijk wrote: > > > More broadly, it is only necessary to reject an argument type at the __array_function__ level if it defines __array_function__ itself, because that's the only case where it would make a difference to return NotImplemented rather than trying (and failing) to call the overridden function implementation. > > Yes, this makes sense -- these are the only types that could possibly change the outcome if the class now called fails to produce a result. Indeed, that reasoning makes it logical that `ndarray` itself is not present even though it defines `__array_ufunc__` - we know it cannot handle anything with a `__array_ufunc__` implementation. > > Hameer, is Stephan's argument convincing to you too? If so, I'll close the PR.
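A toy consumer makes the dispatch rule concrete. This is an illustrative duck array, not NumPy's internal dispatch machinery, and it assumes `__array_function__` is enabled (it became the default in NumPy 1.17):

```python
import numpy as np

class DuckArray:
    """Toy wrapper showing why only __array_function__-defining
    types need to be inspected."""

    def __init__(self, data):
        self.data = np.asarray(data)

    def __array_function__(self, func, types, args, kwargs):
        # 'types' holds only the unique argument types that define
        # __array_function__ themselves. For any other argument type,
        # returning NotImplemented here would be indistinguishable
        # from trying (and failing) inside the implementation below.
        if not all(issubclass(t, (DuckArray, np.ndarray)) for t in types):
            return NotImplemented
        unwrapped = tuple(
            a.data if isinstance(a, DuckArray) else a for a in args
        )
        return func(*unwrapped, **kwargs)

result = np.mean(DuckArray([1.0, 2.0, 3.0]))
```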
I agree with Stephan here, other than the fact that ndarray should be in the list of types. I can think of many cases in PyData/Sparse where I don't want to allow mixed inputs, but maybe that's a tangential discussion. Best Regards, Hameer Abbasi > > -- Marten _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From einstein.edison at gmail.com Sat Nov 10 17:20:46 2018 From: einstein.edison at gmail.com (Hameer Abbasi) Date: Sat, 10 Nov 2018 23:20:46 +0100 Subject: [Numpy-discussion] asarray/anyarray; matrix/subclass In-Reply-To: References: <48ba8375-8499-4d57-acf0-803d02d120ce@Canary> Message-ID: <859687cb-c1f0-4e9e-a8ed-b92b35c40969@Canary> > On Saturday, Nov 10, 2018 at 9:16 PM, Stephan Hoyer wrote: > On Sat, Nov 10, 2018 at 9:49 AM Marten van Kerkwijk wrote: > > Hi Hameer, > > > > I do not think we should change `asanyarray` itself to special-case matrix; rather, we could start converting `asarray` to `asanyarray` and solve the problems that produces for matrices in `matrix` itself (e.g., by overriding the relevant function with `__array_function__`). > > > > I think the idea of providing an `__anyarray__` method (in analogy with `__array__`) might work. Indeed, the default in `ndarray` (and thus all its subclasses) could be to let it return `self` and to override it for `matrix` to return an ndarray view. > > Yes, we certainly would rather implement a matrix.__anyarray__ method (if we're already doing a new protocol) rather than special case np.matrix explicitly. > > Unfortunately, per Nathaniel's comments about NA skipping behavior, it seems like we will also need MaskedArray.__anyarray__ to return something other than itself.
In principle, we should probably write new version of MaskedArray that doesn't deviate from ndarray semantics, but that's a rather large project (we'd also probably want to stop subclassing ndarray). > > Changing the default aggregation behavior for the existing MaskedArray is also an option but that would be a serious annoyance to users and backwards compatibility break. If the only way MaskedArray violates Liskov is in terms of NA skipping aggregations by default, then this might be viable. In practice, this would require adding an explicit skipna argument so FutureWarnings could be silenced. The plus side of this option is that it would make it easier to use np.anyarray() or any new coercion function throughout the internal NumPy code base. > > To summarize, I think these are our options: > 1. Change the behavior of np.anyarray() to check for an __anyarray__() protocol. Change np.matrix.__anyarray__() to return a base numpy array (this is a minor backwards compatibility break, but probably for the best). Start issuing a FutureWarning for any MaskedArray operations that violate Liskov and add a skipna argument that in the future will default to skipna=False. > > > > > > 2. Introduce a new coercion function, e.g., np.duckarray(). This is the easiest option because we don't need to cleanup NumPy's existing ndarray subclasses. > > > > > My vote is still for 1. I don't have an issue for PyData/Sparse depending on recent-ish NumPy versions -- it'll need a lot of the recent protocols anyway, although I could be convinced otherwise if major package devs (scikits, SciPy, Dask) were to weigh in and say they'll jump on it (which seems unlikely given SciPy's policy to support old NumPy versions). > > > P.S. I'm just glad pandas stopped subclassing ndarray a while ago -- there's no way pandas.Series() could be fixed up to not violate Liskov :).
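For comparison, a sketch of what the option-2 entry point might look like. `np.duckarray` and `__anyarray__` are only proposed names from this thread (neither exists in NumPy), and the matrix branch merely emulates the `__anyarray__` implementation option 1 would give np.matrix:

```python
import numpy as np

def duckarray(obj):
    # Proposed protocol: let the object decide what to pass through.
    anyarray = getattr(obj, "__anyarray__", None)
    if anyarray is not None:
        return anyarray()
    # What np.matrix.__anyarray__ would do under option 1:
    # an O(1) view with base-ndarray semantics.
    if isinstance(obj, np.matrix):
        return obj.view(np.ndarray)
    if isinstance(obj, np.ndarray):
        return obj               # other subclasses pass through as-is
    return np.asarray(obj)       # coerce lists, scalars, etc.
```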
_______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From shoyer at gmail.com Sat Nov 10 17:39:15 2018 From: shoyer at gmail.com (Stephan Hoyer) Date: Sat, 10 Nov 2018 17:39:15 -0500 Subject: [Numpy-discussion] asarray/anyarray; matrix/subclass In-Reply-To: <859687cb-c1f0-4e9e-a8ed-b92b35c40969@Canary> References: <48ba8375-8499-4d57-acf0-803d02d120ce@Canary> <859687cb-c1f0-4e9e-a8ed-b92b35c40969@Canary> Message-ID: On Sat, Nov 10, 2018 at 2:22 PM Hameer Abbasi wrote: > To summarize, I think these are our options: > > 1. Change the behavior of np.anyarray() to check for an __anyarray__() > protocol. Change np.matrix.__anyarray__() to return a base numpy array > (this is a minor backwards compatibility break, but probably for the best). > Start issuing a FutureWarning for any MaskedArray operations that violate > Liskov and add a skipna argument that in the future will default to > skipna=False. > > 2. Introduce a new coercion function, e.g., np.duckarray(). This is the > easiest option because we don't need to cleanup NumPy's existing ndarray > subclasses. > > > My vote is still for 1. I don?t have an issue for PyData/Sparse depending > on recent-ish NumPy versions ? It?ll need a lot of the recent protocols > anyway, although I could be convinced otherwise if major package devs > (scikits, SciPy, Dask) were to weigh in and say they?ll jump on it (which > seems unlikely given SciPy?s policy to support old NumPy versions). > I agree that option (1) is fine for PyData/sparse. The bigger issue is that this change should be conditional on making breaking changes (at least raising FutureWarning for now) to np.ma.MaskedArray. I don't know how people who currently use MaskedArray would feel about that. I would love to hear their thoughts. 
-------------- next part -------------- An HTML attachment was scrubbed... URL: From m.h.vankerkwijk at gmail.com Sat Nov 10 17:50:56 2018 From: m.h.vankerkwijk at gmail.com (Marten van Kerkwijk) Date: Sat, 10 Nov 2018 17:50:56 -0500 Subject: [Numpy-discussion] asarray/anyarray; matrix/subclass In-Reply-To: References: <48ba8375-8499-4d57-acf0-803d02d120ce@Canary> <859687cb-c1f0-4e9e-a8ed-b92b35c40969@Canary> Message-ID: On Sat, Nov 10, 2018 at 5:39 PM Stephan Hoyer wrote: > On Sat, Nov 10, 2018 at 2:22 PM Hameer Abbasi > wrote: > >> To summarize, I think these are our options: >> >> 1. Change the behavior of np.anyarray() to check for an __anyarray__() >> protocol. Change np.matrix.__anyarray__() to return a base numpy array >> (this is a minor backwards compatibility break, but probably for the best). >> Start issuing a FutureWarning for any MaskedArray operations that violate >> Liskov and add a skipna argument that in the future will default to >> skipna=False. >> >> 2. Introduce a new coercion function, e.g., np.duckarray(). This is the >> easiest option because we don't need to cleanup NumPy's existing ndarray >> subclasses. >> >> >> My vote is still for 1. I don?t have an issue for PyData/Sparse depending >> on recent-ish NumPy versions ? It?ll need a lot of the recent protocols >> anyway, although I could be convinced otherwise if major package devs >> (scikits, SciPy, Dask) were to weigh in and say they?ll jump on it (which >> seems unlikely given SciPy?s policy to support old NumPy versions). >> > > I agree that option (1) is fine for PyData/sparse. The bigger issue is > that this change should be conditional on making breaking changes (at least > raising FutureWarning for now) to np.ma.MaskedArray. > Might be good to try before worrying too much - MaskedArray already overrides *a lot*; it is not at all obvious to me that things wouldn't "just work" if we bulk-replaced `asarray` with `asanyarray`. 
And with `__array_function__` we now have the option to fix code paths that do not work immediately. -- Marten -------------- next part -------------- An HTML attachment was scrubbed... URL: From shoyer at gmail.com Sat Nov 10 17:51:39 2018 From: shoyer at gmail.com (Stephan Hoyer) Date: Sat, 10 Nov 2018 17:51:39 -0500 Subject: [Numpy-discussion] Should unique types of all arguments be passed on in __array_function__? In-Reply-To: <6acc78cc-3c78-4440-b692-8482c350992d@Canary> References: <6acc78cc-3c78-4440-b692-8482c350992d@Canary> Message-ID: On Sat, Nov 10, 2018 at 2:08 PM Hameer Abbasi wrote: > I agree with Stephan here, other than the fact that ndarray should be in > the list of types. I can think of many cases in PyData/Sparse where I dont > want to allow mixed inputs, but maybe that?s a tangential discussion. > To be clear: ndarray *is* currently preserved in the list of types passed to __array_function__ (because ndarray.__array_function__ is defined). -------------- next part -------------- An HTML attachment was scrubbed... URL: From einstein.edison at gmail.com Sat Nov 10 18:16:33 2018 From: einstein.edison at gmail.com (Hameer Abbasi) Date: Sun, 11 Nov 2018 00:16:33 +0100 Subject: [Numpy-discussion] Should unique types of all arguments be passed on in __array_function__? In-Reply-To: References: <6acc78cc-3c78-4440-b692-8482c350992d@Canary> Message-ID: <5d1ee0c0-893f-4af9-b325-fe88a0ff94c9@Canary> In that case, ignore my comment. :) Best Regards, Hameer Abbasi > On Saturday, Nov 10, 2018 at 11:52 PM, Stephan Hoyer wrote: > On Sat, Nov 10, 2018 at 2:08 PM Hameer Abbasi wrote: > > I agree with Stephan here, other than the fact that ndarray should be in the list of types. I can think of many cases in PyData/Sparse where I dont want to allow mixed inputs, but maybe that?s a tangential discussion. > > > > > > > To be clear: ndarray *is* currently preserved in the list of types passed to __array_function__ (because ndarray.__array_function__ is defined). 
_______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sat Nov 10 19:39:10 2018 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 10 Nov 2018 17:39:10 -0700 Subject: [Numpy-discussion] asarray/anyarray; matrix/subclass In-Reply-To: References: <48ba8375-8499-4d57-acf0-803d02d120ce@Canary> Message-ID: On Sat, Nov 10, 2018 at 2:15 PM Eric Wieser wrote: > > If the only way MaskedArray violates Liskov is in terms of NA skipping > aggregations by default, then this might be viable > > One of the ways to fix these liskov substitution problems is just to > introduce more base classes - for instance, if we had an `NDContainer` base > class with only slicing support, then masked arrays would be an exact > liskov substitution, but np.matrix would not. > > Eric > I've had the same thought and wouldn't be surprised if others have considered that possibility. Travis would be a good guy to ask about that. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From efiring at hawaii.edu Sun Nov 11 01:44:01 2018 From: efiring at hawaii.edu (Eric Firing) Date: Sat, 10 Nov 2018 20:44:01 -1000 Subject: [Numpy-discussion] asarray/anyarray; matrix/subclass In-Reply-To: References: <48ba8375-8499-4d57-acf0-803d02d120ce@Canary> <859687cb-c1f0-4e9e-a8ed-b92b35c40969@Canary> Message-ID: <8dff2289-e52e-d906-ee44-a165281102b3@hawaii.edu> On 2018/11/10 12:39 PM, Stephan Hoyer wrote: > On Sat, Nov 10, 2018 at 2:22 PM Hameer Abbasi > wrote: > > To summarize, I think these are our options: > > 1. Change the behavior of np.anyarray() to check for an > __anyarray__() protocol. Change np.matrix.__anyarray__() to > return a base numpy array (this is a minor backwards > compatibility break, but probably for the best). 
Start issuing a > FutureWarning for any MaskedArray operations that violate Liskov > and add a skipna argument that in the future will default to > skipna=False. > > 2. Introduce a new coercion function, e.g., np.duckarray(). This > is the easiest option because we don't need to cleanup NumPy's > existing ndarray subclasses. > > > My vote is still for 1. I don?t have an issue for PyData/Sparse > depending on recent-ish NumPy versions ? It?ll need a lot of the > recent protocols anyway, although I could be convinced otherwise if > major package devs (scikits, SciPy, Dask) were to weigh in and say > they?ll jump on it (which seems unlikely given SciPy?s policy to > support old NumPy versions). > > > I agree that option (1) is fine for PyData/sparse. The bigger issue is > that this change should be conditional on making breaking changes (at > least raising FutureWarning for now) to np.ma.MaskedArray. > > I don't know how people who currently use MaskedArray would feel about > that. I would love to hear their thoughts. Thank you. I am a user of masked arrays, and have been since pre-numpy days. I introduced their extensive use in matplotlib long ago. I have been a bit concerned, indeed, that all of the discussion of modifying masked arrays seems to be by people who don't actually use them explicitly (though they might be using them without knowing it via internal operations in matplotlib, or they might be quickly getting rid of them after they are yielded by netCDF4.Dataset()). I think that those of us who do use masked arrays recognize that they are not perfect; they have some quirks and gotchas, and one has to be careful to use numpy.ma functions instead of numpy functions in most cases. But we use them because they have real advantages over the alternatives, which are using nans and/or manually tracking independent masks throughout calculations. These advantages are largely because masked values *don't* behave like nan, *don't* propagate. 
This is fundamental to the design, and motivated by real-life use cases. The proposal to add a skipna kwarg to MaskedArray looks to me like it is giving purity priority over practicality. It will force ma users to insert skipna kwargs all over the place--because the default will be contrary to the primary purposes of using masked arrays, in most cases. How many people will it actually benefit? How many people are being bitten, and how badly, by masked array behavior? If there were a prospect of truly integrating missing/masked value handling into numpy, simplifying or phasing out numpy.ma, I would be delighted--I think it is the biggest single fundamental improvement that could be made, from the user's standpoint. I was sad to see Mark Wiebe's work in that direction come to grief. If there are ways of gradually improving numpy.ma and its interoperability with the rest of numpy and with the proliferation of duck arrays, I'm all in favor--so long as they don't effectively wreck numpy.ma for its present intended purposes. Eric > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > From m.h.vankerkwijk at gmail.com Sun Nov 11 10:47:55 2018 From: m.h.vankerkwijk at gmail.com (Marten van Kerkwijk) Date: Sun, 11 Nov 2018 10:47:55 -0500 Subject: [Numpy-discussion] asarray/anyarray; matrix/subclass In-Reply-To: <8dff2289-e52e-d906-ee44-a165281102b3@hawaii.edu> References: <48ba8375-8499-4d57-acf0-803d02d120ce@Canary> <859687cb-c1f0-4e9e-a8ed-b92b35c40969@Canary> <8dff2289-e52e-d906-ee44-a165281102b3@hawaii.edu> Message-ID: Hi Eric, Thanks very much for the detailed response; it is good to be reminded that `MaskedArray` is used in a package that, indeed, (nearly?) all of us use! 
But I do think that those of us who have been trying to change MaskedArray are generally good at making sure the tests continue to pass, i.e., that the behaviour does not change (the main exception in the last few years was that views should be taken of masks too, not just the data). I also think that between __array_ufunc__ and __array_function__, it has become quite easy to ensure that one no longer has to rely on `np.ma` functions, i.e., that the regular numpy functions will do the right thing. But it will need work to actually implement that. All the best, Marten -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefanv at berkeley.edu Sun Nov 11 17:07:48 2018 From: stefanv at berkeley.edu (Stefan van der Walt) Date: Sun, 11 Nov 2018 14:07:48 -0800 Subject: [Numpy-discussion] Developer Meeting, Berkeley, 30 Nov / 1 Dec Message-ID: <20181111220748.hjhomk7wmd2mh3k6@carbo> Hi everyone, On Friday 30 November & Saturday 1 December we will host a NumPy Development Meeting at the Berkeley Institute for Data Science. We will discuss the work being done on dtypes, review NEPs under implementation, and solicit feedback for updating the community roadmap. Please get in touch if you would like to attend, so that we can tally the numbers and work out travel support. Best regards, Stéfan From stefanv at berkeley.edu Sun Nov 11 17:16:36 2018 From: stefanv at berkeley.edu (Stefan van der Walt) Date: Sun, 11 Nov 2018 14:16:36 -0800 Subject: [Numpy-discussion] Weekly status meeting 8.11 at 12:00 pacific time In-Reply-To: <814a37a5-2650-5a63-f692-d859bd194dd6@gmail.com> References: <814a37a5-2650-5a63-f692-d859bd194dd6@gmail.com> Message-ID: <20181111221636.efqb7nl24shrpzsq@carbo> On Tue, 06 Nov 2018 16:28:36 -0800, Matti Picus wrote: > We will be holding our weekly BIDS NumPy status meeting on Thurs Nov 8 at > noon pacific time.
The meeting notes are now posted at https://github.com/BIDS-numpy/docs/blob/master/status_meetings/status-2018-11-08.md Stéfan From shoyer at gmail.com Sun Nov 11 21:39:47 2018 From: shoyer at gmail.com (Stephan Hoyer) Date: Sun, 11 Nov 2018 18:39:47 -0800 Subject: [Numpy-discussion] asarray/anyarray; matrix/subclass In-Reply-To: <8dff2289-e52e-d906-ee44-a165281102b3@hawaii.edu> References: <48ba8375-8499-4d57-acf0-803d02d120ce@Canary> <859687cb-c1f0-4e9e-a8ed-b92b35c40969@Canary> <8dff2289-e52e-d906-ee44-a165281102b3@hawaii.edu> Message-ID: On Sat, Nov 10, 2018 at 10:45 PM Eric Firing wrote: > On 2018/11/10 12:39 PM, Stephan Hoyer wrote: > > On Sat, Nov 10, 2018 at 2:22 PM Hameer Abbasi > > wrote: > > > > To summarize, I think these are our options: > > > > 1. Change the behavior of np.anyarray() to check for an > > __anyarray__() protocol. Change np.matrix.__anyarray__() to > > return a base numpy array (this is a minor backwards > > compatibility break, but probably for the best). Start issuing a > > FutureWarning for any MaskedArray operations that violate Liskov > > and add a skipna argument that in the future will default to > > skipna=False. > > > > 2. Introduce a new coercion function, e.g., np.duckarray(). This > > is the easiest option because we don't need to cleanup NumPy's > > existing ndarray subclasses. > > > > > > My vote is still for 1. I don't have an issue for PyData/Sparse > > depending on recent-ish NumPy versions -- It'll need a lot of the > > recent protocols anyway, although I could be convinced otherwise if > > major package devs (scikits, SciPy, Dask) were to weigh in and say > > they'll jump on it (which seems unlikely given SciPy's policy to > > support old NumPy versions). > > > > > > I agree that option (1) is fine for PyData/sparse. The bigger issue is > > that this change should be conditional on making breaking changes (at > > least raising FutureWarning for now) to np.ma.MaskedArray.
> > > > I don't know how people who currently use MaskedArray would feel about > > that. I would love to hear their thoughts. > > Thank you. I am a user of masked arrays, and have been since pre-numpy > days. I introduced their extensive use in matplotlib long ago. I have > been a bit concerned, indeed, that all of the discussion of modifying > masked arrays seems to be by people who don't actually use them > explicitly (though they might be using them without knowing it via > internal operations in matplotlib, or they might be quickly getting rid > of them after they are yielded by netCDF4.Dataset()). > > I think that those of us who do use masked arrays recognize that they > are not perfect; they have some quirks and gotchas, and one has to be > careful to use numpy.ma functions instead of numpy functions in most > cases. But we use them because they have real advantages over the > alternatives, which are using nans and/or manually tracking independent > masks throughout calculations. These advantages are largely because > masked values *don't* behave like nan, *don't* propagate. This is > fundamental to the design, and motivated by real-life use cases. > > The proposal to add a skipna kwarg to MaskedArray looks to me like it is > giving purity priority over practicality. It will force ma users to > insert skipna kwargs all over the place--because the default will be > contrary to the primary purposes of using masked arrays, in most cases. > How many people will it actually benefit? How many people are being > bitten, and how badly, by masked array behavior? > > If there were a prospect of truly integrating missing/masked value > handling into numpy, simplifying or phasing out numpy.ma, I would be > delighted--I think it is the biggest single fundamental improvement that > could be made, from the user's standpoint. I was sad to see Mark > Wiebe's work in that direction come to grief. 
> > If there are ways of gradually improving numpy.ma and its > interoperability with the rest of numpy and with the proliferation of > duck arrays, I'm all in favor--so long as they don't effectively wreck > numpy.ma for its present intended purposes. Eric -- thank you for sharing your perspective! I guess it should not be surprising that the semantics of MaskedArray intentionally deviate from the semantics of base NumPy arrays. This deviation is fortunately less severe than the deviations in the behavior of np.matrix, but it still presents some difficulties for duck typing. We're in a position to reduce (but still not eliminate) these differences with new protocols like __array_function__. I think Nathaniel actually summarized these issues pretty well in NEP 16 ( http://www.numpy.org/neps/nep-0016-abstract-array.html). If we want a coercion function that guarantees an object is a "full duck array", then it can't pass on either np.matrix or MaskedArray in their current state. Anything less than full compatibility provides a shaky foundation for use in downstream projects or inside NumPy itself. In theory (certainly if we were starting from scratch) it would make sense to make asabstractarray() pass on any ndarray subclass, but this would require willingness to make breaking changes to both np.matrix and MaskedArray. I would suggest adopting a variation of the proposal in NEP 16, except using a protocol rather than an abstract base class per NEP 22, e.g., # names still to be determined def asabstractarray(array, dtype): if hasattr(array, '__abstractarray__'): return array.__abstractarray__(dtype=dtype) return asarray(array, dtype) -------------- next part -------------- An HTML attachment was scrubbed...
URL: From dpgrote at lbl.gov Mon Nov 12 19:49:40 2018 From: dpgrote at lbl.gov (David Grote) Date: Mon, 12 Nov 2018 16:49:40 -0800 Subject: [Numpy-discussion] Problem with libgfortran installed with pip install numpy In-Reply-To: References: Message-ID: Yes, thanks, that's something I hadn't looked into. For legacy reasons, my code was being built with the option -flat_namespace. I don't remember the reason why, but many years ago that option was needed for the code to run on the mac. The code is made up of several shared objects that have dependencies on each other and apparently there was a problem getting it all linked together properly without that option. But, I tried it out now without flat_namespace and it seemed to work fine. Whatever the problem was, it seems to have been fixed some other way. It works Ok having the pip version of numpy (with its old libgfortran). I'm still curious about why such an old version of gfortran is still being used. Dave On Fri, Nov 9, 2018 at 3:09 PM Nathaniel Smith wrote: > On Wed, Sep 5, 2018 at 4:37 PM, David Grote wrote: > > > > Hi - I have recently come across this problem. On my mac, I build a > Fortran > > code, producing a shared object, that I import into Python along with > numpy. > > This had been working fine until recently when I started seeing seg > faults > > deep inside the Fortran code, usually in Fortran print statements. I > tracked > > this down to a gfortran version issue. > > > > I use the brew installation of Python and gcc (using the most recent > > version, 8.2.0). gcc of course installs a version of libgfortran.dylib. > > Doing a lsof of a running Python, I see that it finds that copy of > > libgfortran, and also a copy that was downloaded with numpy > > > (/usr/local/lib/python3.7/site-packages/numpy/.dylibs/libgfortran.3.dylib). > > Looking at numpy's copy of libgfortran, I see that it is version 4.9.0, > much > > older.
Since my code is importing numpy first, the OS seems to be using > numpy's > > version of libgfortran to link when importing my code. I know from other > > experience that older versions of libgfortran are not compatible with > code > > compiled using a new version of gfortran and so therefore segfaults > happen. > > Normally on MacOS, it's fine to have multiple versions of the same > library used at the same time, because the linker looks up symbols > using a (source library, symbol name) pair. (This is called the > "two-level namespace".) So it's strange that these two libgfortrans > would interfere with each other. Does gfortran not use the two-level > namespace when linking fortran code? > > -n > > -- > Nathaniel J. Smith -- https://vorpus.org > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From matti.picus at gmail.com Tue Nov 13 20:19:25 2018 From: matti.picus at gmail.com (Matti Picus) Date: Tue, 13 Nov 2018 17:19:25 -0800 Subject: [Numpy-discussion] Weekly status meeting Nov 14 at 12:00 Pacific time Message-ID: <9f446058-21b1-df1d-b61c-db0289a846b7@gmail.com> We will be holding our weekly BIDS NumPy status meeting on Wed Nov 14 at noon Pacific time. Please join us.
The draft agenda, along with details of how to join, is up at https://hackmd.io/H0x6z5uYSgex2FA6p5nlvw?both Previous sessions' notes are available at https://github.com/BIDS-numpy/docs/tree/master/status_meetings Matti, Tyler and Stefan From shoyer at gmail.com Wed Nov 14 11:57:21 2018 From: shoyer at gmail.com (Stephan Hoyer) Date: Wed, 14 Nov 2018 08:57:21 -0800 Subject: [Numpy-discussion] Vectorized version of numpy.linspace Message-ID: It recently came up on GitHub (as part of the discussion in https://github.com/numpy/numpy/issues/12379) that numpy.linspace could, at least in principle, be modified to support array inputs: It looks like this has been requested on StackOverflow, too: https://stackoverflow.com/questions/46694167/vectorized-numpy-linspace-across-multi-dimensional-arrays My tentative proposal: - "start" and "stop" are broadcast against each other to form start/stop arrays. (Or we could require that start/stop have matching shape.) - A new dimension of size "num" is inserted into the result, either along the first or last axis. - A new keyword argument "axis" could control where the axis is inserted in the result. - Vectorization support should be added in the same way to geomspace and logspace. Does this seem like a good idea? It's a relatively simple generalization, and one that I, at least, would find useful (I can think of a use-case in my own code that came up just last week). I doubt I'll have time to implement this myself in the near future, but I thought I would get the discussion going -- this might be a good project for a new contributor to work on. Cheers, Stephan -------------- next part -------------- An HTML attachment was scrubbed...
URL: From ralf.gommers at gmail.com Wed Nov 14 12:28:47 2018 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Wed, 14 Nov 2018 09:28:47 -0800 Subject: [Numpy-discussion] Vectorized version of numpy.linspace In-Reply-To: References: Message-ID: On Wed, Nov 14, 2018 at 8:57 AM Stephan Hoyer wrote: > It recently came up on GitHub (at part of the discussion in > https://github.com/numpy/numpy/issues/12379) that numpy.linspace could, > at least in principle, be modified to support array inputs: > > It looks like this has been requested on StackOverflow, too: > > https://stackoverflow.com/questions/46694167/vectorized-numpy-linspace-across-multi-dimensional-arrays > > My tentative proposal: > - "start" and "stop" are broadcast against each other to form start/stop > arrays. (Or we could require that start/stop have matching shape.) > - A new dimension of size "num" is inserted into the result, either along > the first or last axis. > - A new keyword argument "axis" could control where the axis is inserted > in the result. > - Vectorization support should be added in the same way to geomspace and > logspace. > > Does this seem like a good idea? It's a relatively simple generalization, > and one that I, at least, would find useful (I can think of a use-case in > my own code that came up just last week). > This feels a bit forced. There's not much relevance to the minor performance gain, and for code clarity it probably also wouldn't help (actually it hurts usability for 99.x% of use cases by making the doc more complicated). Not sure that it really would require a new axis argument, as Marten said on the issue. Also, the num keyword cannot be vectorized, unless one returns a list of arrays, which would actually be more natural here than a 2-D array. So, at best a don't care for me - I'm -0.5. 
Cheers, Ralf > > I doubt I'll have time to implement this myself in the near future, but I > thought I would get the discussion going -- this might be a good project > for a new contributor to work on. > > Cheers, > Stephan > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gerrit.holl at gmail.com Wed Nov 14 12:41:02 2018 From: gerrit.holl at gmail.com (Gerrit Holl) Date: Wed, 14 Nov 2018 17:41:02 +0000 Subject: [Numpy-discussion] Vectorized version of numpy.linspace In-Reply-To: References: Message-ID: On Wed, 14 Nov 2018 at 17:29, Ralf Gommers wrote: > On Wed, Nov 14, 2018 at 8:57 AM Stephan Hoyer wrote: >> >> It recently came up on GitHub (at part of the discussion in https://github.com/numpy/numpy/issues/12379) that numpy.linspace could, at least in principle, be modified to support array inputs: >> >> It looks like this has been requested on StackOverflow, too: >> https://stackoverflow.com/questions/46694167/vectorized-numpy-linspace-across-multi-dimensional-arrays >> >> My tentative proposal: >> - "start" and "stop" are broadcast against each other to form start/stop arrays. (Or we could require that start/stop have matching shape.) >> - A new dimension of size "num" is inserted into the result, either along the first or last axis. >> - A new keyword argument "axis" could control where the axis is inserted in the result. >> - Vectorization support should be added in the same way to geomspace and logspace. >> >> Does this seem like a good idea? It's a relatively simple generalization, and one that I, at least, would find useful (I can think of a use-case in my own code that came up just last week). > > > This feels a bit forced. 
There's not much relevance to the minor performance gain, and for code clarity it probably also wouldn't help (actually it hurts usability for 99.x% of use cases by making the doc more complicated). Not sure that it really would require a new axis argument, as Marten said on the issue. Also, the num keyword cannot be vectorized, unless one returns a list of arrays, which would actually be more natural here than a 2-D array. > > So, at best a don't care for me - I'm -0.5. For what it's worth, I had a use case for this in the past week, when I needed many simple linear interpolations between two values (thus a linspace) with only the value of boundary points varying. However, this was the first time I've ever needed it, and I found a recipe on Stack Overflow within minutes (https://stackoverflow.com/a/42617889/974555) so it wasn't a big deal that it wasn't available in core numpy. Gerrit. From harrigan.matthew at gmail.com Wed Nov 14 12:57:35 2018 From: harrigan.matthew at gmail.com (Matthew Harrigan) Date: Wed, 14 Nov 2018 12:57:35 -0500 Subject: [Numpy-discussion] Vectorized version of numpy.linspace In-Reply-To: References: Message-ID: I put in an issue a while ago, https://github.com/numpy/numpy/issues/8839 My use case was somewhat similar to meshgrid but with a nonrectangular domain. Not terribly hard to code, but my expectation is that numpy functions should always allow broadcasting if that operation makes sense.
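Such a vectorized linspace is indeed only a few lines of broadcasting. A sketch of the proposed behavior, with the new axis placed first (`linspace_nd` is just a placeholder name, not anything in NumPy):

```python
import numpy as np

def linspace_nd(start, stop, num=50):
    # Broadcast start and stop against each other, then insert a new
    # axis of length `num` at the front (assumes num >= 2).
    start, stop = np.broadcast_arrays(np.asarray(start, dtype=float),
                                      np.asarray(stop, dtype=float))
    step = (stop - start) / (num - 1)
    # arange column reshaped so it broadcasts against the start/stop shape
    ramp = np.arange(num).reshape((num,) + (1,) * start.ndim)
    return start + step * ramp

out = linspace_nd([0.0, 10.0], [1.0, 20.0], num=5)
print(out.shape)    # (5, 2); out[:, 1] runs from 10.0 to 20.0 in steps of 2.5
```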
On Nov 14, 2018 12:42 PM, "Gerrit Holl" wrote: On Wed, 14 Nov 2018 at 17:29, Ralf Gommers wrote: > On Wed, Nov 14, 2018 at 8:57 AM Stephan Hoyer wrote: >> >> It recently came up on GitHub (at part of the discussion in https://github.com/numpy/numpy/issues/12379) that numpy.linspace could, at least in principle, be modified to support array inputs: >> >> It looks like this has been requested on StackOverflow, too: >> https://stackoverflow.com/questions/46694167/vectorized-numpy-linspace-across-multi-dimensional-arrays >> >> My tentative proposal: >> - "start" and "stop" are broadcast against each other to form start/stop arrays. (Or we could require that start/stop have matching shape.) >> - A new dimension of size "num" is inserted into the result, either along the first or last axis. >> - A new keyword argument "axis" could control where the axis is inserted in the result. >> - Vectorization support should be added in the same way to geomspace and logspace. >> >> Does this seem like a good idea? It's a relatively simple generalization, and one that I, at least, would find useful (I can think of a use-case in my own code that came up just last week). > > > This feels a bit forced. There's not much relevance to the minor performance gain, and for code clarity it probably also wouldn't help (actually it hurts usability for 99.x% of use cases by making the doc more complicated). Not sure that it really would require a new axis argument, as Marten said on the issue. Also, the num keyword cannot be vectorized, unless one returns a list of arrays, which would actually be more natural here than a 2-D array. > > So, at best a don't care for me - I'm -0.5. For what it's worth, I had a use case for this in the past week, when I needed many simple linear interpolations between two values (thus a linspace) with only the value of boundary points varying. 
However, this was the first time I've ever needed it, and I found a recipe on Stack Overflow within minutes (https://stackoverflow.com/a/42617889/974555) so it wasn't a big deal that it wasn't available in core numpy. Gerrit. _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at python.org https://mail.python.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From m.h.vankerkwijk at gmail.com Wed Nov 14 13:06:42 2018 From: m.h.vankerkwijk at gmail.com (Marten van Kerkwijk) Date: Wed, 14 Nov 2018 13:06:42 -0500 Subject: [Numpy-discussion] Vectorized version of numpy.linspace In-Reply-To: References: Message-ID: Just to add: nothing conceptually is strange for start and end to be arrays. Indeed, the code would work for arrays as is if it didn't check the `step == 0` case to handle denormals (arguably an even less common case than array-like inputs...), and made a trivial change to get the new axis to be at the start. (I guess I'm agreeing with Matthew here that if conceptually things make sense for arrays, then it should just work; there are of course counterexamples, such as `np.arange`, for which array-valued input makes even less sense than allowing float and complex). -------------- next part -------------- An HTML attachment was scrubbed...
URL: From lagru at mailbox.org Wed Nov 14 13:20:33 2018 From: lagru at mailbox.org (=?UTF-8?Q?Lars_Gr=c3=bcter?=) Date: Wed, 14 Nov 2018 19:20:33 +0100 Subject: [Numpy-discussion] Vectorized version of numpy.linspace In-Reply-To: References: Message-ID: <3f9fa77e-421b-e71d-ae6d-0994078fc23b@mailbox.org> On 14.11.18 17:57, Stephan Hoyer wrote: > It recently came up on GitHub (at part of the discussion > in https://github.com/numpy/numpy/issues/12379) that numpy.linspace > could, at least in principle, be modified to support array inputs: > > It looks like this has been requested on StackOverflow, too: > https://stackoverflow.com/questions/46694167/vectorized-numpy-linspace-across-multi-dimensional-arrays > > My tentative proposal: > - "start" and "stop" are broadcast against each other to form start/stop > arrays. (Or we could require that start/stop have matching shape.) > - A new dimension of size "num" is inserted into the result, either > along the first or last axis. > - A new keyword argument "axis" could control where the axis is inserted > in the result. > - Vectorization support should be added in the same way to geomspace and > logspace. This reminds me of a function [1] I wrote which I think has a lot of similarities to what Stephan describes here. It is currently part of a PR to rewrite numpy.pad [2]. Maybe that's helpful to know depending on how this discussion resolves.
:) Best regards, Lars [1] https://github.com/lagru/numpy/blob/605fc1d7f871b1c8cf2783f594dee88cdf96d7ec/numpy/lib/arraypad.py#L14-L67 [2] https://github.com/numpy/numpy/pull/11358 From m.h.vankerkwijk at gmail.com Wed Nov 14 14:32:08 2018 From: m.h.vankerkwijk at gmail.com (Marten van Kerkwijk) Date: Wed, 14 Nov 2018 14:32:08 -0500 Subject: [Numpy-discussion] Vectorized version of numpy.linspace In-Reply-To: <3f9fa77e-421b-e71d-ae6d-0994078fc23b@mailbox.org> References: <3f9fa77e-421b-e71d-ae6d-0994078fc23b@mailbox.org> Message-ID: Code being better than words: see https://github.com/numpy/numpy/pull/12388 for an implementation. The change in the code proper is very small, though it is worrying that it causes two rather unrelated tests to fail (even if arguably both tests were wrong). Note that this does not give flexibility to put the axis where one wants; as written, the code made putting it at the start the obvious solution, as it avoids doing anything with the shapes of start and stop. -- Marten -------------- next part -------------- An HTML attachment was scrubbed... URL: From m.h.vankerkwijk at gmail.com Wed Nov 14 14:34:31 2018 From: m.h.vankerkwijk at gmail.com (Marten van Kerkwijk) Date: Wed, 14 Nov 2018 14:34:31 -0500 Subject: [Numpy-discussion] Vectorized version of numpy.linspace In-Reply-To: <3f9fa77e-421b-e71d-ae6d-0994078fc23b@mailbox.org> References: <3f9fa77e-421b-e71d-ae6d-0994078fc23b@mailbox.org> Message-ID: On Wed, Nov 14, 2018 at 1:21 PM Lars Grüter wrote: > > This reminds me of a function [1] I wrote which I think has a lot of > similarities to what Stephan describes here. It is currently part of a > PR to rewrite numpy.pad [2]. > > If we start to write the equivalent internally, then perhaps we should indeed expose it! Would my trial indeed help, or is it insufficient? (Sorry for not checking myself in detail; it is not immediately obvious.) -- Marten -------------- next part -------------- An HTML attachment was scrubbed...
URL: From sebastian at sipsolutions.net Wed Nov 14 17:06:10 2018 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Wed, 14 Nov 2018 23:06:10 +0100 Subject: [Numpy-discussion] Vectorized version of numpy.linspace In-Reply-To: References: <3f9fa77e-421b-e71d-ae6d-0994078fc23b@mailbox.org> Message-ID: <2aa3b023e09ef196da423ad208e60bdb19062978.camel@sipsolutions.net> On Wed, 2018-11-14 at 14:32 -0500, Marten van Kerkwijk wrote: > Code being better than words: see > https://github.com/numpy/numpy/pull/12388 for an implementation. The > change in the code proper is very small, though it is worrying that > it causes two rather unrelated tests too fail (even if arguably both > tests were wrong). > > Note that this does not give flexibility to put the axis where one > wants; as written, the code made putting it at the start the obvious > solution, as it avoids doing anything with the shapes of start and > stop. Hehe, my first gut feeling was the last axis to be the obvious one ;). This has been discussed before (but what hasn't) I believe, probably some old issue or even PR somewhere. I am mildly in favor, just because there is probably not much reason against an easy vectorization. Doesn't need to be advertised much in the docs anyway. Although it might be good to settle the "obvious" part in case I am not alone in first thinking of -1 being the obvious default. I would probably skip the axis argument for now, unless someone actually has a use case. I think this used to half work at some point, but it was probably so broken that we can just make it well defined? - Sebastian > > -- Marten > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: This is a digitally signed message part URL: From shoyer at gmail.com Wed Nov 14 17:46:11 2018 From: shoyer at gmail.com (Stephan Hoyer) Date: Wed, 14 Nov 2018 14:46:11 -0800 Subject: [Numpy-discussion] Vectorized version of numpy.linspace In-Reply-To: <2aa3b023e09ef196da423ad208e60bdb19062978.camel@sipsolutions.net> References: <3f9fa77e-421b-e71d-ae6d-0994078fc23b@mailbox.org> <2aa3b023e09ef196da423ad208e60bdb19062978.camel@sipsolutions.net> Message-ID: On Wed, Nov 14, 2018 at 2:35 PM Sebastian Berg wrote: > On Wed, 2018-11-14 at 14:32 -0500, Marten van Kerkwijk wrote: > > Code being better than words: see > > https://github.com/numpy/numpy/pull/12388 for an implementation. The > > change in the code proper is very small, though it is worrying that > > it causes two rather unrelated tests too fail (even if arguably both > > tests were wrong). > > > > Note that this does not give flexibility to put the axis where one > > wants; as written, the code made putting it at the start the obvious > > solution, as it avoids doing anything with the shapes of start and > > stop. > > Hehe, my first gut feeling was the last axis to be the obvious one ;). > This has been discussed before (but what hasn't) I believe, probably > some old issue or even PR somewhere. > I am mildly in favor, just because there is probably not much reason > against an easy vectorization. Doesn't need to be advertised much in > the docs anyway. > Although it might be good to settle the "obvious" part in case I am not > alone in first thinking of -1 being the obvious default. I would > probably skip the axis argument for now, unless someone actually has a > use case. Indeed -- I think the best argument for adding an "axis" argument is that it allows people to be explicit about where the axis ends up, e.g., both np.linspace(start, stop, num=5, axis=0) and np.linspace(start, stop, num=5, axis=-1) make their intent quite clear. 
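Since the axis keyword is only proposed at this point, here is a sketch of the shapes the two spellings would produce, emulated with existing NumPy:

```python
import numpy as np

start = np.zeros((3, 4))
stop = np.ones((3, 4))
num = 5

# axis=0 behavior: the new length-num axis leads, so the result
# still broadcasts against start and stop
ramp = start + (stop - start) / (num - 1) * np.arange(num).reshape(-1, 1, 1)
print(ramp.shape)                       # (5, 3, 4)

# axis=-1 behavior: same values, new axis moved to the end (gufunc-style)
print(np.moveaxis(ramp, 0, -1).shape)   # (3, 4, 5)
```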
To me, axis=0 feels like the right default, matching np.concatenate and np.stack. But NumPy already has split conventions for this sort of thing (e.g., gufuncs add axes at the end), so I like the explicit option. -------------- next part -------------- An HTML attachment was scrubbed... URL: From einstein.edison at gmail.com Wed Nov 14 17:59:53 2018 From: einstein.edison at gmail.com (Hameer Abbasi) Date: Wed, 14 Nov 2018 23:59:53 +0100 Subject: [Numpy-discussion] Vectorized version of numpy.linspace In-Reply-To: References: <3f9fa77e-421b-e71d-ae6d-0994078fc23b@mailbox.org> <2aa3b023e09ef196da423ad208e60bdb19062978.camel@sipsolutions.net> Message-ID: > On Wednesday, Nov 14, 2018 at 11:46 PM, Stephan Hoyer wrote: > > > On Wed, Nov 14, 2018 at 2:35 PM Sebastian Berg wrote: > > On Wed, 2018-11-14 at 14:32 -0500, Marten van Kerkwijk wrote: > > > Code being better than words: see > > > https://github.com/numpy/numpy/pull/12388 for an implementation. The > > > change in the code proper is very small, though it is worrying that > > > it causes two rather unrelated tests too fail (even if arguably both > > > tests were wrong). > > > > > > Note that this does not give flexibility to put the axis where one > > > wants; as written, the code made putting it at the start the obvious > > > solution, as it avoids doing anything with the shapes of start and > > > stop. > > > > Hehe, my first gut feeling was the last axis to be the obvious one ;). > > This has been discussed before (but what hasn't) I believe, probably > > some old issue or even PR somewhere. > > I am mildly in favor, just because there is probably not much reason > > against an easy vectorization. Doesn't need to be advertised much in > > the docs anyway. > > Although it might be good to settle the "obvious" part in case I am not > > alone in first thinking of -1 being the obvious default. I would > > probably skip the axis argument for now, unless someone actually has a > > use case. 
> > Indeed -- I think the best argument for adding an "axis" argument is that it allows people to be explicit about where the axis ends up, e.g., both np.linspace(start, stop, num=5, axis=0) and np.linspace(start, stop, num=5, axis=-1) make their intent quite clear. > > To me, axis=0 feels like the right default, matching np.concatenate and np.stack. But NumPy already has split conventions for this sort of thing (e.g., gufuncs add axes at the end), so I like the explicit option. I'd like to have another vote for axis=-1 by default. Stack and concatenate are different because we are concatenating/stacking complete arrays, so it makes sense to "compose" them along the first axis to maintain C-contiguous-ness. I actually think of this as the reverse, we are "composing/combining" lots of 1D arrays over all the other dimensions, so to preserve C-contiguous-ness, it's better to have axis=-1. > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion Best Regards, Hameer Abbasi -------------- next part -------------- An HTML attachment was scrubbed... URL: From m.h.vankerkwijk at gmail.com Wed Nov 14 18:14:33 2018 From: m.h.vankerkwijk at gmail.com (Marten van Kerkwijk) Date: Wed, 14 Nov 2018 18:14:33 -0500 Subject: [Numpy-discussion] Vectorized version of numpy.linspace In-Reply-To: References: <3f9fa77e-421b-e71d-ae6d-0994078fc23b@mailbox.org> <2aa3b023e09ef196da423ad208e60bdb19062978.camel@sipsolutions.net> Message-ID: I see the logic in having the linear space be last, but one non-negligible advantage of the default being the first axis is that whatever is produced broadcasts properly against start and stop. -- Marten -------------- next part -------------- An HTML attachment was scrubbed...
URL: From shoyer at gmail.com Wed Nov 14 18:20:19 2018 From: shoyer at gmail.com (Stephan Hoyer) Date: Wed, 14 Nov 2018 15:20:19 -0800 Subject: [Numpy-discussion] Vectorized version of numpy.linspace In-Reply-To: References: <3f9fa77e-421b-e71d-ae6d-0994078fc23b@mailbox.org> <2aa3b023e09ef196da423ad208e60bdb19062978.camel@sipsolutions.net> Message-ID: On Wed, Nov 14, 2018 at 3:16 PM Marten van Kerkwijk < m.h.vankerkwijk at gmail.com> wrote: > I see the logic in having the linear space be last, but one non-negligible > advantage of the default being the first axis is that whatever is produced > broadcasts properly against start and stop. > -- Marten > Yes, this is exactly why I needed to insert the new axis at the start. That said, either default axis position is fine by me as long as we have the explicit option. -------------- next part -------------- An HTML attachment was scrubbed... URL: From wieser.eric+numpy at gmail.com Wed Nov 14 21:27:22 2018 From: wieser.eric+numpy at gmail.com (Eric Wieser) Date: Wed, 14 Nov 2018 18:27:22 -0800 Subject: [Numpy-discussion] Vectorized version of numpy.linspace In-Reply-To: References: <3f9fa77e-421b-e71d-ae6d-0994078fc23b@mailbox.org> <2aa3b023e09ef196da423ad208e60bdb19062978.camel@sipsolutions.net> Message-ID: I too buy into axis=0 being the better default here for broadcasting reasons. Having it possible to specify explicitly would be handy though, for something like: x_ramped = np.linspace(x.min(axis=2), x.max(axis=2), 100, axis=2) On Wed, 14 Nov 2018 at 15:20 Stephan Hoyer wrote: > On Wed, Nov 14, 2018 at 3:16 PM Marten van Kerkwijk < > m.h.vankerkwijk at gmail.com> wrote: >> I see the logic in having the linear space be last, but one >> non-negligible advantage of the default being the first axis is that >> whatever is produced broadcasts properly against start and stop. >> -- Marten >> > > Yes, this is exactly why I needed to insert the new axis at the start.
> > That said, either default axis position is fine by me as long as we have > the explicit option. > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Thu Nov 15 03:44:22 2018 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Thu, 15 Nov 2018 09:44:22 +0100 Subject: [Numpy-discussion] Vectorized version of numpy.linspace In-Reply-To: References: <3f9fa77e-421b-e71d-ae6d-0994078fc23b@mailbox.org> <2aa3b023e09ef196da423ad208e60bdb19062978.camel@sipsolutions.net> Message-ID: <7c2ecd57e4ca0f2e107ae0152d158a556c1d8b1e.camel@sipsolutions.net> On Wed, 2018-11-14 at 14:46 -0800, Stephan Hoyer wrote: > > > On Wed, Nov 14, 2018 at 2:35 PM Sebastian Berg < > sebastian at sipsolutions.net> wrote: > > On Wed, 2018-11-14 at 14:32 -0500, Marten van Kerkwijk wrote: > > some old issue or even PR somewhere. > > I am mildly in favor, just because there is probably not much > > reason > > against an easy vectorization. Doesn't need to be advertised much > > in > > the docs anyway. > > Although it might be good to settle the "obvious" part in case I am > > not > > alone in first thinking of -1 being the obvious default. I would > > probably skip the axis argument for now, unless someone actually > > has a > > use case. > > Indeed -- I think the best argument for adding an "axis" argument is > that it allows people to be explicit about where the axis ends up, > e.g., both np.linspace(start, stop, num=5, axis=0) and > np.linspace(start, stop, num=5, axis=-1) make their intent quite > clear. > > To me, axis=0 feels like the right default, matching np.concatenate > and np.stack. But NumPy already has split conventions for this sort > of thing (e.g., gufuncs add axes at the end), so I like the explicit > option. 
I think that argument goes both ways. Because linspace with an array input can be seen as stacking the linear ramps and not stacking some interpolated intermediate values from start/stop. (Sure, it is more than one dimension, but I would seriously argue the linear ramps are the basic building block and not the input start/stop arrays.) - Sebastian > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: This is a digitally signed message part URL: From romesh.abey at gmail.com Sun Nov 18 23:03:44 2018 From: romesh.abey at gmail.com (Romesh Abeysuriya) Date: Mon, 19 Nov 2018 12:03:44 +0800 Subject: [Numpy-discussion] lstsq underdetermined behaviour Message-ID: Hi all, I'm solving an underdetermined system using `numpy.linalg.lstsq` and trying to track down its behavior for underdetermined systems. In previous versions of numpy (e.g. 1.14) in `linalg.py` the definition for `lstsq` calls `dgelsd` for real inputs, which I think means that the underdetermined system is solved with the minimum-norm solution (that is, minimizing the norm of the solution vector, in addition to minimizing the residual). In 1.15 the call is instead to `_umath_linalg.lstsq_m` and I'm not sure what this actually ends up doing - does this end up being the same as `dgelsd`? 
If so, it would be great if the documentation for `numpy.linalg.lstsq` stated that it is returning the minimum-norm solution (as it stands, it reads as undefined, so in theory I don't think one can rely on any particular solution being returned for an underdetermined system) Cheers, Romesh From wieser.eric+numpy at gmail.com Sun Nov 18 23:24:18 2018 From: wieser.eric+numpy at gmail.com (Eric Wieser) Date: Sun, 18 Nov 2018 20:24:18 -0800 Subject: [Numpy-discussion] lstsq underdetermined behaviour In-Reply-To: References: Message-ID: > In 1.15 the call is instead to `_umath_linalg.lstsq_m` and I'm not sure what this actually ends up doing - does this end up being the same as `dgelsd`? When the arguments are real, yes. What changed is that the dispatching now happens in C, which was done as a step towards the incomplete https://github.com/numpy/numpy/issues/8720. I'm not an expert - but aren't "minimum norm" and "least squares" two ways to state the same thing? Eric On Sun, 18 Nov 2018 at 20:04 Romesh Abeysuriya wrote: > Hi all, > > I'm solving an underdetermined system using `numpy.linalg.lstsq` and > trying to track down its behavior for underdetermined systems. In > previous versions of numpy (e.g. 1.14) in `linalg.py` the definition > for `lstsq` calls `dgelsd` for real inputs, which I think means that > the underdetermined system is solved with the minimum-norm solution > (that is, minimizing the norm of the solution vector, in addition to > minimizing the residual). In 1.15 the call is instead to > `_umath_linalg.lstsq_m` and I'm not sure what this actually ends up > doing - does this end up being the same as `dgelsd`? 
If so, it would > be great if the documentation for `numpy.linalg.lstsq` stated that it > is returning the minimum-norm solution (as it stands, it reads as > undefined, so in theory I don't think one can rely on any particular > solution being returned for an underdetermined system) > > Cheers, > Romesh > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sun Nov 18 23:29:55 2018 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 18 Nov 2018 21:29:55 -0700 Subject: [Numpy-discussion] lstsq underdetermined behaviour In-Reply-To: References: Message-ID: On Sun, Nov 18, 2018 at 9:24 PM Eric Wieser wrote: > > In 1.15 the call is instead to `_umath_linalg.lstsq_m` and I'm not sure > what this actually ends up doing - does this end up being the same as > `dgelsd`? > > When the arguments are real, yes. What changed is that the dispatching now > happens in C, which was done as a step towards the incomplete > https://github.com/numpy/numpy/issues/8720. > > I'm not an expert - but aren't "minimum norm" and "least squares" two ways > to state the same thing? > > If there aren't enough data points to uniquely determine the minimizing solution, the solution vector of shortest length is returned. In practice it is pretty useless because it depends on the column scaling and there is generally no natural metric in the solution space. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From romesh.abey at gmail.com Mon Nov 19 00:15:33 2018 From: romesh.abey at gmail.com (Romesh Abeysuriya) Date: Mon, 19 Nov 2018 13:15:33 +0800 Subject: [Numpy-discussion] lstsq underdetermined behaviour In-Reply-To: References: Message-ID: Thanks both! 
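[Editor's note: the minimum-norm behaviour discussed in this thread can be checked directly; a small sketch (not from the thread) using the modern ``numpy.linalg.lstsq`` interface:]

```python
import numpy as np

# Underdetermined system: one equation, three unknowns (x0 + x1 = 1, x2 unused).
A = np.array([[1.0, 1.0, 0.0]])
b = np.array([1.0])

x, residuals, rank, sv = np.linalg.lstsq(A, b, rcond=None)

# dgelsd returns the minimum-norm least-squares solution: among all exact
# solutions [a, 1 - a, c], the one with smallest 2-norm is [0.5, 0.5, 0.0],
# with a coefficient of 0 for the variable that does not appear at all.
```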
Yes, I guess it's typically 'least squares' referring to the residual vector, and 'minimum norm' referring to the solution vector. That's certainly how the documentation for `dgelsd` frames it. In my case, the minimum norm solution can be sensibly interpreted (and in particular, it guarantees that the solution is 0 for missing variables), so it's great to know that I can rely on this being returned Cheers, Romesh On Mon, Nov 19, 2018 at 12:30 PM Charles R Harris wrote: > > > > On Sun, Nov 18, 2018 at 9:24 PM Eric Wieser wrote: >> >> > In 1.15 the call is instead to `_umath_linalg.lstsq_m` and I'm not sure what this actually ends up doing - does this end up being the same as `dgelsd`? >> >> When the arguments are real, yes. What changed is that the dispatching now happens in C, which was done as a step towards the incomplete https://github.com/numpy/numpy/issues/8720. >> >> I'm not an expert - but aren't "minimum norm" and "least squares" two ways to state the same thing? >> > > If there aren't enough data points to uniquely determine the minimizing solution, the solution vector of shortest length is returned. In practice it is pretty useless because it depends on the column scaling and there is generally no natural metric in the solution space. > > > > Chuck > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion From charlesr.harris at gmail.com Mon Nov 26 12:41:24 2018 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 26 Nov 2018 10:41:24 -0700 Subject: [Numpy-discussion] NumPy 1.16 Message-ID: Hi All, Just an update of the NumPy 1.16 release schedule. The last PR milestoned for the release is #12219 , and it is about done. The current release blocker is the upcoming OpenBLAS 0.3.4, which should fix the reported threading problems, but if it doesn't come out in the next month we might want to revisit that. 
The OpenBLAS issues for 0.3.4 are here. Meanwhile, other PRs continue to go in. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From tyler.je.reddy at gmail.com Mon Nov 26 13:47:47 2018 From: tyler.je.reddy at gmail.com (Tyler Reddy) Date: Mon, 26 Nov 2018 10:47:47 -0800 Subject: [Numpy-discussion] ANN: SciPy 1.2.0rc1 -- please test Message-ID: -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA256 Hi all, On behalf of the SciPy development team I'm pleased to announce the release candidate SciPy 1.2.0rc1. Please help us test out this release candidate -- the 1.2.x series will be an LTS release and the last to support Python 2.7. Sources and binary wheels can be found at: https://pypi.python.org/pypi/scipy and at: https://github.com/scipy/scipy/releases/tag/v1.2.0rc1 To install the release candidate with pip: pip install scipy==1.2.0rc1 ========================== SciPy 1.2.0 Release Notes ========================== Note: Scipy 1.2.0 is not released yet! SciPy 1.2.0 is the culmination of 6 months of hard work. It contains many new features, numerous bug-fixes, improved test coverage and better documentation. There have been a number of deprecations and API changes in this release, which are documented below. All users are encouraged to upgrade to this release, as there are a large number of bug-fixes and optimizations. Before upgrading, we recommend that users check that their own code does not use deprecated SciPy functionality (to do so, run your code with ``python -Wd`` and check for ``DeprecationWarning`` s). Our development attention will now shift to bug-fix releases on the 1.2.x branch, and on adding new features on the master branch. This release requires Python 2.7 or 3.4+ and NumPy 1.8.2 or greater. Note: This will be the last SciPy release to support Python 2.7. Consequently, the 1.2.x series will be a long term support (LTS) release; we will backport bug fixes until 1 Jan 2020. 
For running on PyPy, PyPy3 6.0+ and NumPy 1.15.0 are required. Highlights of this release ---------------------------- - - 1-D root finding improvements with a new solver, ``toms748``, and a new unified interface, ``root_scalar`` - - New ``dual_annealing`` optimization method that combines stochastic and local deterministic searching - - A new optimization algorithm, ``shgo`` (simplicial homology global optimization) for derivative free optimization problems - - A new category of quaternion-based transformations is available in `scipy.spatial.transform` New features ============ `scipy.ndimage` improvements - -------------------------------- Proper spline coefficient calculations have been added for the ``mirror``, ``wrap``, and ``reflect`` modes of `scipy.ndimage.rotate` `scipy.fftpack` improvements - -------------------------------- DCT-IV, DST-IV, DCT-I, and DST-I orthonormalization are now supported in `scipy.fftpack`. `scipy.interpolate` improvements - -------------------------------- `scipy.interpolate.pade` now accepts a new argument for the order of the numerator `scipy.cluster` improvements - ---------------------------- `scipy.cluster.vq.kmeans2` gained a new initialization method, kmeans++. `scipy.special` improvements - ---------------------------- The function ``softmax`` was added to `scipy.special`. `scipy.optimize` improvements - ----------------------------- The one-dimensional nonlinear solvers have been given a unified interface `scipy.optimize.root_scalar`, similar to the `scipy.optimize.root` interface for multi-dimensional solvers. ``scipy.optimize.root_scalar(f, bracket=[a, b], method="brenth")`` is equivalent to ``scipy.optimize.brenth(f, a, b)``. If no ``method`` is specified, an appropriate one will be selected based upon the bracket and the number of derivatives available. The so-called Algorithm 748 of Alefeld, Potra and Shi for root-finding within an enclosing interval has been added as `scipy.optimize.toms748`. 
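[Editor's note: the new unified ``root_scalar`` interface described above can be exercised like this; a sketch against the SciPy >= 1.2 API, using an arbitrary example function:]

```python
from scipy import optimize

def f(x):
    # Simple cubic with a single real root at x = 1.
    return x**3 - 1.0

# New unified interface: dispatches to a bracketing method.
sol = optimize.root_scalar(f, bracket=[0.0, 2.0], method="brenth")

# Equivalent legacy call:
root = optimize.brenth(f, 0.0, 2.0)
```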
This provides guaranteed convergence to a root with convergence rate per function evaluation of approximately 1.65 (for sufficiently well-behaved functions). ``differential_evolution`` now has the ``updating`` and ``workers`` keywords. The first chooses between continuous updating of the best solution vector (the default), or once per generation. Continuous updating can lead to faster convergence. The ``workers`` keyword accepts an ``int`` or map-like callable, and parallelises the solver (having the side effect of updating once per generation). Supplying an ``int`` evaluates the trial solutions in N parallel parts. Supplying a map-like callable allows other parallelisation approaches (such as ``mpi4py``, or ``joblib``) to be used. ``dual_annealing`` (and ``shgo`` below) is a powerful new general purpose global optimization (GO) algorithm. ``dual_annealing`` uses two annealing processes to accelerate the convergence towards the global minimum of an objective mathematical function. The first annealing process controls the stochastic Markov chain searching and the second annealing process controls the deterministic minimization. So, dual annealing is a hybrid method that takes advantage of stochastic and local deterministic searching in an efficient way. ``shgo`` (simplicial homology global optimization) is a similar algorithm appropriate for solving black box and derivative free optimization (DFO) problems. The algorithm generally converges to the global solution in finite time. The convergence holds for non-linear inequality and equality constraints. In addition to returning a global minimum, the algorithm also returns any other global and local minima found after every iteration. This makes it useful for exploring the solutions in a domain. `scipy.optimize.newton` can now accept a scalar or an array. ``MINPACK`` usage is now thread-safe, such that ``MINPACK`` + callbacks may be used on multiple threads. 
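[Editor's note: a minimal ``dual_annealing`` run on a standard multimodal test function; a sketch assuming SciPy >= 1.2, and stochastic since no seed is fixed:]

```python
import numpy as np
from scipy.optimize import dual_annealing

def rastrigin(x):
    # Classic multimodal benchmark: global minimum 0 at the origin,
    # surrounded by many local minima.
    return 10 * len(x) + np.sum(x**2 - 10 * np.cos(2 * np.pi * x))

# Stochastic global search plus deterministic local polishing.
res = dual_annealing(rastrigin, bounds=[(-5.12, 5.12)] * 2)
```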
`scipy.signal` improvements - --------------------------- Digital filter design functions now include a parameter to specify the sampling rate. Previously, digital filters could only be specified using normalized frequency, but different functions used different scales (e.g. 0 to 1 for ``butter`` vs 0 to pi for ``freqz``), leading to errors and confusion. With the ``fs`` parameter, ordinary frequencies can now be entered directly into functions, with the normalization handled internally. ``find_peaks`` and related functions no longer raise an exception if the properties of a peak have unexpected values (e.g. a prominence of 0). A ``PeakPropertyWarning`` is given instead. The new keyword argument ``plateau_size`` was added to ``find_peaks``. ``plateau_size`` may be used to select peaks based on the length of the flat top of a peak. ``welch()`` and ``csd()`` methods in `scipy.signal` now support calculation of a median average PSD, using the ``average='median'`` keyword `scipy.sparse` improvements - --------------------------- The `scipy.sparse.bsr_matrix.tocsr` method is now implemented directly instead of converting via COO format, and the `scipy.sparse.bsr_matrix.tocsc` method is now also routed via CSR conversion instead of COO. The efficiency of both conversions is now improved. The issue where SuperLU or UMFPACK solvers crashed on matrices with non-canonical format in `scipy.sparse.linalg` was fixed. The solver wrapper canonicalizes the matrix if necessary before calling the SuperLU or UMFPACK solver. The ``largest`` option of `scipy.sparse.linalg.lobpcg()` was fixed to have a correct (and expected) behavior. The order of the eigenvalues was made consistent with the ARPACK solver (``eigs()``), i.e. ascending for the smallest eigenvalues, and descending for the largest eigenvalues. The `scipy.sparse.random` function is now faster and also supports integer and complex values by passing the appropriate value to the ``dtype`` argument. 
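[Editor's note: the new ``fs`` parameter described above in action; a sketch showing that specifying ordinary frequencies is equivalent to normalizing by the Nyquist frequency by hand:]

```python
import numpy as np
from scipy import signal

fs = 1000.0  # sampling rate in Hz (arbitrary example value)

# 4th-order Butterworth low-pass with a 100 Hz corner, in ordinary frequency:
b, a = signal.butter(4, 100.0, btype="low", fs=fs)

# The old way: normalize the corner frequency by the Nyquist frequency.
b_old, a_old = signal.butter(4, 100.0 / (fs / 2.0), btype="low")
```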
`scipy.spatial` improvements - ---------------------------- The function `scipy.spatial.distance.jaccard` was modified to return 0 instead of ``np.nan`` when two all-zero vectors are compared. Support for the Jensen Shannon distance, the square-root of the divergence, has been added under `scipy.spatial.distance.jensenshannon`. An optional keyword was added to the function `scipy.spatial.cKDTree.query_ball_point()` to sort or not sort the returned indices. Not sorting the indices can speed up calls. A new category of quaternion-based transformations is available in `scipy.spatial.transform`, including spherical linear interpolation of rotations (``Slerp``), conversions to and from quaternions, Euler angles, and general rotation and inversion capabilities (`spatial.transform.Rotation`), and uniform random sampling of 3D rotations (`spatial.transform.Rotation.random`). `scipy.stats` improvements - -------------------------- The Yeo-Johnson power transformation is now supported (``yeojohnson``, ``yeojohnson_llf``, ``yeojohnson_normmax``, ``yeojohnson_normplot``). Unlike the Box-Cox transformation, the Yeo-Johnson transformation can accept negative values. Added a general method to sample random variates based on the density only, in the new function ``rvs_ratio_uniforms``. The Yule-Simon distribution (``yulesimon``) was added -- this is a new discrete probability distribution. ``stats`` and ``mstats`` now have access to a new regression method, ``siegelslopes``, a robust linear regression algorithm. `scipy.stats.gaussian_kde` now has the ability to deal with weighted samples, and should have a modest improvement in performance. Levy Stable Parameter Estimation, PDF, and CDF calculations are now supported for `scipy.stats.levy_stable`. 
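[Editor's note: a quick sketch of the new `scipy.spatial.transform.Rotation` API described above, assuming SciPy >= 1.2:]

```python
import numpy as np
from scipy.spatial.transform import Rotation

# A 90-degree rotation about the z-axis, built from Euler angles:
r = Rotation.from_euler("z", 90, degrees=True)

v = r.apply([1.0, 0.0, 0.0])  # rotate the x unit vector
q = r.as_quat()               # quaternion in (x, y, z, w) order
```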
The Brunner-Munzel test is now available as ``brunnermunzel`` in ``stats`` and ``mstats`` `scipy.linalg` improvements - -------------------------- `scipy.linalg.lapack` now exposes the LAPACK routines using the Rectangular Full Packed storage (RFP) for upper triangular, lower triangular, symmetric, or Hermitian matrices; the upper trapezoidal fat matrix RZ decomposition routines are now available as well. Deprecated features =================== The functions ``hyp2f0``, ``hyp1f2`` and ``hyp3f0`` in ``scipy.special`` have been deprecated. Backwards incompatible changes ============================== LAPACK version 3.4.0 or later is now required. Building with Apple Accelerate is no longer supported. The function ``scipy.linalg.subspace_angles(A, B)`` now gives correct results for all angles. Before this, the function only returned correct values for those angles which were greater than pi/4. Support for the Bento build system has been removed. Bento has not been maintained for several years, and did not have good Python 3 or wheel support, hence it was time to remove it. The required signature of `scipy.optimize.linprog` ``method=simplex`` callback function has changed. Before iteration begins, the simplex solver first converts the problem into a standard form that does not, in general, have the same variables or constraints as the problem defined by the user. Previously, the simplex solver would pass a user-specified callback function several separate arguments, such as the current solution vector ``xk``, corresponding to this standard form problem. Unfortunately, the relationship between the standard form problem and the user-defined problem was not documented, limiting the utility of the information passed to the callback function. In addition to numerous bug fix changes, the simplex solver now passes a user-specified callback function a single ``OptimizeResult`` object containing information that corresponds directly to the user-defined problem. 
In future releases, this ``OptimizeResult`` object may be expanded to include additional information, such as variables corresponding to the standard-form problem and information concerning the relationship between the standard-form and user-defined problems. The implementation of `scipy.sparse.random` has changed, and this affects the numerical values returned for both ``sparse.random`` and ``sparse.rand`` for some matrix shapes and a given seed. `scipy.optimize.newton` will no longer use Halley's method in cases where it negatively impacts convergence. Other changes ============= Authors ======= * @endolith * @luzpaz * Hameer Abbasi + * akahard2dj + * Anton Akhmerov * Joseph Albert * alexthomas93 + * ashish + * atpage + * Blair Azzopardi + * Yoshiki Vázquez Baeza * Bence Bagi + * Christoph Baumgarten * Lucas Bellomo + * BH4 + * Aditya Bharti * Max Bolingbroke * François Boulogne * Ward Bradt + * Matthew Brett * Evgeni Burovski * Rafał Byczek + * Alfredo Canziani + * CJ Carey * Lucía Cheung + * Poom Chiarawongse + * Jeanne Choo + * Robert Cimrman * Graham Clenaghan + * cynthia-rempel + * Johannes Damp + * Jaime Fernandez del Rio * Dowon + * emmi474 + * Stefan Endres + * Thomas Etherington + * Alex Fikl + * fo40225 + * Joseph Fox-Rabinovitz * Lars G * Abhinav Gautam + * Stiaan Gerber + * C.A.M. Gerlach + * Ralf Gommers * Todd Goodall * Lars Grueter + * Sylvain Gubian + * Matt Haberland * David Hagen * Will Handley + * Charles Harris * Ian Henriksen * Thomas Hisch + * Theodore Hu * Michael Hudson-Doyle + * Nicolas Hug + * jakirkham + * Jakob Jakobson + * James + * Jan Schlüter * jeanpauphilet + * josephmernst + * Kai + * Kai-Striega + * kalash04 + * Toshiki Kataoka + * Konrad0 + * Tom Krauss + * Johannes Kulick * Lars Grüter + * Eric Larson * Denis Laxalde * Will Lee + * Katrin Leinweber + * Yin Li + * P. L. 
Lim + * Jesse Livezey + * Duncan Macleod + * MatthewFlamm + * Nikolay Mayorov * Mike McClurg + * Christian Meyer + * Mark Mikofski * Naoto Mizuno + * mohmmadd + * Nathan Musoke * Anju Geetha Nair + * Andrew Nelson * Ayappan P + * Nick Papior * Haesun Park + * Ronny Pfannschmidt + * pijyoi + * Ilhan Polat * Anthony Polloreno + * Ted Pudlik * puenka * Eric Quintero * Pradeep Reddy Raamana + * Vyas Ramasubramani + * Ramon Viñas + * Tyler Reddy * Joscha Reimer * Antonio H Ribeiro * richardjgowers + * Rob + * robbystk + * Lucas Roberts + * rohan + * Joaquin Derrac Rus + * Josua Sassen + * Bruce Sharpe + * Max Shinn + * Scott Sievert * Sourav Singh * Strahinja Lukić + * Kai Striega + * Shinya SUZUKI + * Mike Toews + * Piotr Uchwat * Miguel de Val-Borro + * Nicky van Foreest * Paul van Mulbregt * Gael Varoquaux * Pauli Virtanen * Stefan van der Walt * Warren Weckesser * Joshua Wharton + * Bernhard M. Wiedemann + * Eric Wieser * Josh Wilson * Tony Xiang + * Roman Yurchak + * Roy Zywina + A total of 137 people contributed to this release. People with a "+" by their names contributed a patch for the first time. This list of names is automatically generated, and may not be fully complete. Issues closed for 1.2.0 - ----------------------- * `#1240 `__: Allowing multithreaded use of minpack through scipy.optimize... * `#1432 `__: scipy.stats.mode extremely slow (Trac #905) * `#3372 `__: Please add Sphinx search field to online scipy html docs * `#3678 `__: _clough_tocher_2d_single direction between centroids * `#4174 `__: lobpcg "largest" option invalid? * `#5493 `__: anderson_ksamp p-values>1 * `#5743 `__: slsqp fails to detect infeasible problem * `#6139 `__: scipy.optimize.linprog failed to find a feasible starting point... * `#6358 `__: stats: docstring for `vonmises_line` points to `vonmises_line`... * `#6498 `__: runtests.py is missing in pypi distfile * `#7426 `__: scipy.stats.ksone(n).pdf(x) returns nan for positive values of... 
* `#7455 `__: scipy.stats.ksone.pdf(2,x) return incorrect values for x near... * `#7456 `__: scipy.special.smirnov and scipy.special.smirnovi have accuracy... * `#7492 `__: scipy.special.kolmogorov(x)/kolmogi(p) inefficient, inaccurate... * `#7914 `__: TravisCI not failing when it should for -OO run * `#8064 `__: linalg.solve test crashes on Windows * `#8212 `__: LAPACK Rectangular Full Packed routines * `#8256 `__: differential_evolution bug converges to wrong results in complex... * `#8443 `__: Deprecate `hyp2f0`, `hyp1f2`, and `hyp3f0`? * `#8452 `__: DOC: ARPACK tutorial has two conflicting equations * `#8680 `__: scipy fails compilation when building from source * `#8686 `__: Division by zero in _trustregion.py when x0 is exactly equal... * `#8700 `__: _MINPACK_LOCK not held when calling into minpack from least_squares * `#8786 `__: erroneous moment values for t-distribution * `#8791 `__: Checking COLA condition in istft should be optional (or omitted) * `#8843 `__: imresize cannot be deprecated just yet * `#8844 `__: Inverse Wishart Log PDF Incorrect for Non-diagonal Scale Matrix? * `#8878 `__: vonmises and vonmises_line in stats: vonmises wrong and superfluous? * `#8895 `__: v1.1.0 `ndi.rotate` documentation - reused parameters not filled... * `#8900 `__: Missing complex conjugation in scipy.sparse.linalg.LinearOperator * `#8904 `__: BUG: if zero derivative at root, then Newton fails with RuntimeWarning * `#8911 `__: make_interp_spline bc_type incorrect input interpretation * `#8942 `__: MAINT: Refactor `_linprog.py` and `_linprog_ip.py` to remove... * `#8947 `__: np.int64 in scipy.fftpack.next_fast_len * `#9020 `__: BUG: linalg.subspace_angles gives wrong results * `#9033 `__: scipy.stats.normaltest sometimes gives incorrect returns b/c... * `#9036 `__: Bizarre times for `scipy.sparse.rand` function with 'low' density... * `#9044 `__: optimize.minimize(method=`trust-constr`) result dict does not... 
* `#9071 `__: doc/linalg: add cho_solve_banded to see also of cholesky_banded * `#9082 `__: eigenvalue sorting in scipy.sparse.linalg.eigsh * `#9086 `__: signaltools.py:491: FutureWarning: Using a non-tuple sequence... * `#9091 `__: test_spline_filter failure on 32-bit * `#9122 `__: Typo on scipy minimization tutorial * `#9135 `__: doc error at https://docs.scipy.org/doc/scipy/reference/tutorial/stats/discrete_poisson.html * `#9167 `__: DOC: BUG: typo in ndimage LowLevelCallable tutorial example * `#9169 `__: truncnorm does not work if b < a in scipy.stats * `#9250 `__: scipy.special.tests.test_mpmath::TestSystematic::test_pcfw fails... * `#9259 `__: rv.expect() == rv.mean() is false for rv.mean() == nan (and inf) * `#9286 `__: DOC: Rosenbrock expression in optimize.minimize tutorial * `#9316 `__: SLSQP fails in nested optimization * `#9337 `__: scipy.signal.find_peaks key typo in documentation * `#9345 `__: Example from documentation of scipy.sparse.linalg.eigs raises... * `#9383 `__: Default value for "mode" in "ndimage.shift" * `#9419 `__: dual_annealing off by one in the number of iterations * `#9442 `__: Error in Defintion of Rosenbrock Function * `#9453 `__: TST: test_eigs_consistency() doesn't have consistent results Pull requests for 1.2.0 - ----------------------- * `#7352 `__: ENH: add Brunner Munzel test to scipy.stats. * `#7373 `__: BUG: Jaccard distance for all-zero arrays would return np.nan * `#7374 `__: ENH: Add PDF, CDF and parameter estimation for Stable Distributions * `#8098 `__: ENH: Add shgo for global optimization of NLPs. 
* `#8203 `__: ENH: adding simulated dual annealing to optimize * `#8259 `__: Option to follow original Storn and Price algorithm and its parallelisation * `#8293 `__: ENH add ratio-of-uniforms method for rv generation to scipy.stats * `#8294 `__: BUG: Fix slowness in stats.mode * `#8295 `__: ENH: add Jensen Shannon distance to `scipy.spatial.distance` * `#8357 `__: ENH: vectorize scalar zero-search-functions * `#8397 `__: Add `fs=` parameter to filter design functions * `#8537 `__: ENH: Implement mode parameter for spline filtering. * `#8558 `__: ENH: small speedup for stats.gaussian_kde * `#8560 `__: BUG: fix p-value calc of anderson_ksamp in scipy.stats * `#8614 `__: ENH: correct p-values for stats.kendalltau and stats.mstats.kendalltau * `#8670 `__: ENH: Require Lapack 3.4.0 * `#8683 `__: Correcting kmeans documentation * `#8725 `__: MAINT: Cleanup scipy.optimize.leastsq * `#8726 `__: BUG: Fix _get_output in scipy.ndimage to support string * `#8733 `__: MAINT: stats: A bit of clean up. * `#8737 `__: BUG: Improve numerical precision/convergence failures of smirnov/kolmogorov * `#8738 `__: MAINT: stats: A bit of clean up in test_distributions.py. * `#8740 `__: BF/ENH: make minpack thread safe * `#8742 `__: BUG: Fix division by zero in trust-region optimization methods * `#8746 `__: MAINT: signal: Fix a docstring of a private function, and fix... * `#8750 `__: DOC clarified description of norminvgauss in scipy.stats * `#8753 `__: DOC: signal: Fix a plot title in the chirp docstring. * `#8755 `__: DOC: MAINT: Fix link to the wheel documentation in developer... * `#8760 `__: BUG: stats: boltzmann wasn't setting the upper bound. * `#8763 `__: [DOC] Improved scipy.cluster.hierarchy documentation * `#8765 `__: DOC: added example for scipy.stat.mstats.tmin * `#8788 `__: DOC: fix definition of optional `disp` parameter * `#8802 `__: MAINT: Suppress dd_real unused function compiler warnings. 
* `#8803 `__: ENH: Add full_output support to optimize.newton() * `#8804 `__: MAINT: stats cleanup * `#8808 `__: DOC: add note about isinstance for frozen rvs * `#8812 `__: Updated numpydoc submodule * `#8813 `__: MAINT: stats: Fix multinomial docstrings, and do some clean up. * `#8816 `__: BUG: fixed _stats of t-distribution in scipy.stats * `#8817 `__: BUG: ndimage: Fix validation of the origin argument in correlate... * `#8822 `__: BUG: integrate: Fix crash with repeated t values in odeint. * `#8832 `__: Hyperlink DOIs against preferred resolver * `#8837 `__: BUG: sparse: Ensure correct dtype for sparse comparison operations. * `#8839 `__: DOC: stats: A few tweaks to the linregress docstring. * `#8846 `__: BUG: stats: Fix logpdf method of invwishart. * `#8849 `__: DOC: signal: Fixed mistake in the firwin docstring. * `#8854 `__: DOC: fix type descriptors in ltisys documentation * `#8865 `__: Fix tiny typo in docs for chi2 pdf * `#8870 `__: Fixes related to invertibility of STFT * `#8872 `__: ENH: special: Add the softmax function * `#8874 `__: DOC correct gamma function in docstrings in scipy.stats * `#8876 `__: ENH: Added TOMS Algorithm 748 as 1-d root finder; 17 test function... * `#8882 `__: ENH: Only use Halley's adjustment to Newton if close enough. * `#8883 `__: FIX: optimize: make jac and hess truly optional for 'trust-constr' * `#8885 `__: TST: Do not error on warnings raised about non-tuple indexing. * `#8887 `__: MAINT: filter out np.matrix PendingDeprecationWarning's in numpy... * `#8889 `__: DOC: optimize: separate legacy interfaces from new ones * `#8890 `__: ENH: Add optimize.root_scalar() as a universal dispatcher for... * `#8899 `__: DCT-IV, DST-IV and DCT-I, DST-I orthonormalization support in... * `#8901 `__: MAINT: Reorganize flapack.pyf.src file * `#8907 `__: BUG: ENH: Check if guess for newton is already zero before checking... 
* `#8908 `__: ENH: Make sorting optional for cKDTree.query_ball_point() * `#8910 `__: DOC: sparse.csgraph simple examples. * `#8914 `__: DOC: interpolate: fix equivalences of string aliases * `#8918 `__: add float_control(precise, on) to _fpumode.c * `#8919 `__: MAINT: interpolate: improve error messages for common `bc_type`... * `#8920 `__: DOC: update Contributing to SciPy to say "prefer no PEP8 only... * `#8924 `__: MAINT: special: deprecate `hyp2f0`, `hyp1f2`, and `hyp3f0` * `#8927 `__: MAINT: special: remove `errprint` * `#8932 `__: Fix broadcasting scale arg of entropy * `#8936 `__: Fix (some) non-tuple index warnings * `#8937 `__: ENH: implement sparse matrix BSR to CSR conversion directly. * `#8938 `__: DOC: add @_ni_docstrings.docfiller in ndimage.rotate * `#8940 `__: Update _discrete_distns.py * `#8943 `__: DOC: Finish dangling sentence in `convolve` docstring * `#8944 `__: MAINT: Address tuple indexing and warnings * `#8945 `__: ENH: spatial.transform.Rotation [GSOC2018] * `#8950 `__: csgraph Dijkstra function description rewording * `#8953 `__: DOC, MAINT: HTTP -> HTTPS, and other linkrot fixes * `#8955 `__: BUG: np.int64 in scipy.fftpack.next_fast_len * `#8958 `__: MAINT: Add more descriptive error message for phase one simplex. 
* `#8962 `__: BUG: sparse.linalg: add missing conjugate to _ScaledLinearOperator.adjoint * `#8963 `__: BUG: sparse.linalg: downgrade LinearOperator TypeError to warning * `#8965 `__: ENH: Wrapped RFP format and RZ decomposition routines * `#8969 `__: MAINT: doc and code fixes for optimize.newton * `#8970 `__: Added 'average' keyword for welch/csd to enable median averaging * `#8971 `__: Better imresize deprecation warning * `#8972 `__: MAINT: Switch np.where(c) for np.nonzero(c) * `#8975 `__: MAINT: Fix warning-based failures * `#8979 `__: DOC: fix description of count_sort keyword of dendrogram * `#8982 `__: MAINT: optimize: Fixed minor mistakes in test_linprog.py (#8978) * `#8984 `__: BUG: sparse.linalg: ensure expm casts integer inputs to float * `#8986 `__: BUG: optimize/slsqp: do not exit with convergence on steps where... * `#8989 `__: MAINT: use collections.abc in basinhopping * `#8990 `__: ENH extend p-values of anderson_ksamp in scipy.stats * `#8991 `__: ENH: Weighted kde * `#8993 `__: ENH: spatial.transform.Rotation.random [GSOC 2018] * `#8994 `__: ENH: spatial.transform.Slerp [GSOC 2018] * `#8995 `__: TST: time.time in test * `#9007 `__: Fix typo in fftpack.rst * `#9013 `__: Added correct plotting code for two sided output from spectrogram * `#9014 `__: BUG: differential_evolution with inf objective functions * `#9017 `__: BUG: fixed #8446 corner case for asformat(array|dense) * `#9018 `__: MAINT: _lib/ccallback: remove unused code * `#9021 `__: BUG: Issue with subspace_angles * `#9022 `__: DOC: Added "See Also" section to lombscargle docstring * `#9034 `__: BUG: Fix tolerance printing behavior, remove meaningless tol... 
* `#9035 `__: TST: improve signal.bsplines test coverage * `#9037 `__: ENH: add a new init method for k-means * `#9039 `__: DOC: Add examples to fftpack.irfft docstrings * `#9048 `__: ENH: scipy.sparse.random * `#9050 `__: BUG: scipy.io.hb_write: fails for matrices not in csc format * `#9051 `__: MAINT: Fix slow sparse.rand for k < mn/3 (#9036). * `#9054 `__: MAINT: spatial: Explicitly initialize LAPACK output parameters. * `#9055 `__: DOC: Add examples to scipy.special docstrings * `#9056 `__: ENH: Use one thread in OpenBLAS * `#9059 `__: DOC: Update README with link to Code of Conduct * `#9060 `__: BLD: remove support for the Bento build system. * `#9062 `__: DOC add sections to overview in scipy.stats * `#9066 `__: BUG: Correct "remez" error message * `#9069 `__: DOC: update linalg section of roadmap for LAPACK versions. * `#9079 `__: MAINT: add spatial.transform to refguide check; complete some... * `#9081 `__: MAINT: Add warnings if pivot value is close to tolerance in linprog(method='simplex') * `#9084 `__: BUG fix incorrect p-values of kurtosistest in scipy.stats * `#9095 `__: DOC: add sections to mstats overview in scipy.stats * `#9096 `__: BUG: Add test for Stackoverflow example from issue 8174. * `#9101 `__: ENH: add Siegel slopes (robust regression) to scipy.stats * `#9105 `__: allow resample_poly() to output float32 for float32 inputs. 
* `#9112 `__: MAINT: optimize: make trust-constr accept constraint dict (#9043) * `#9118 `__: Add doc entry to cholesky_banded * `#9120 `__: eigsh documentation parameters * `#9125 `__: interpolative: correctly reconstruct full rank matrices * `#9126 `__: MAINT: Use warnings for unexpected peak properties * `#9129 `__: BUG: Do not catch and silence KeyboardInterrupt * `#9131 `__: DOC: Correct the typo in scipy.optimize tutorial page * `#9133 `__: FIX: Avoid use of bare except * `#9134 `__: DOC: Update of 'return_eigenvectors' description * `#9137 `__: DOC: typo fixes for discrete Poisson tutorial * `#9139 `__: FIX: Doctest failure in optimize tutorial * `#9143 `__: DOC: missing sigma in Pearson r formula * `#9145 `__: MAINT: Refactor linear programming solvers * `#9149 `__: FIX: Make scipy.odr.ODR ifixx equal to its data.fix if given * `#9156 `__: DOC: special: Mention the sigmoid function in the expit docstring. * `#9160 `__: Fixed a latex delimiter error in levy() * `#9170 `__: DOC: correction / update of docstrings of distributions in scipy.stats * `#9171 `__: better description of the hierarchical clustering parameter * `#9174 `__: domain check for a < b in stats.truncnorm * `#9175 `__: DOC: Minor grammar fix * `#9176 `__: BUG: CloughTocher2DInterpolator: fix miscalculation at neighborless... * `#9177 `__: BUILD: Document the "clean" target in the doc/Makefile. * `#9178 `__: MAINT: make refguide-check more robust for printed numpy arrays * `#9186 `__: MAINT: Remove np.ediff1d occurence * `#9188 `__: DOC: correct typo in extending ndimage with C * `#9190 `__: ENH: Support specifying axes for fftconvolve * `#9192 `__: MAINT: optimize: fixed @pv style suggestions from #9112 * `#9200 `__: Fix make_interp_spline(..., k=0 or 1, axis<0) * `#9201 `__: BUG: sparse.linalg/gmres: use machine eps in breakdown check * `#9204 `__: MAINT: fix up stats.spearmanr and match mstats.spearmanr with... * `#9206 `__: MAINT: include benchmarks and dev files in sdist. 
* `#9208 `__: TST: signal: bump bsplines test tolerance for complex data * `#9210 `__: TST: mark tests as slow, fix missing random seed * `#9211 `__: ENH: add capability to specify orders in pade func * `#9217 `__: MAINT: Include ``success`` and ``nit`` in OptimizeResult returned... * `#9222 `__: ENH: interpolate: Use scipy.spatial.distance to speed-up Rbf * `#9229 `__: MNT: Fix Fourier filter double case * `#9233 `__: BUG: spatial/distance: fix pdist/cdist performance regression... * `#9234 `__: FIX: Proper suppression * `#9235 `__: BENCH: rationalize slow benchmarks + miscellaneous fixes * `#9238 `__: BENCH: limit number of parameter combinations in spatial.*KDTree... * `#9239 `__: DOC: stats: Fix LaTeX markup of a couple distribution PDFs. * `#9241 `__: ENH: Evaluate plateau size during peak finding * `#9242 `__: ENH: stats: Implement _ppf and _logpdf for crystalball, and do... * `#9246 `__: DOC: Properly render versionadded directive in HTML documentation * `#9255 `__: DOC: mention RootResults in optimization reference guide * `#9260 `__: TST: relax some tolerances so tests pass with x87 math * `#9264 `__: TST Use assert_raises "match" parameter instead of the "message"... 
* `#9267 `__: DOC: clarify expect() return val when moment is inf/nan * `#9272 `__: DOC: Add description of default bounds to linprog * `#9277 `__: MAINT: sparse/linalg: make test deterministic * `#9278 `__: MAINT: interpolate: pep8 cleanup in test_polyint * `#9279 `__: Fixed docstring for resample * `#9280 `__: removed first check for float in get_sum_dtype * `#9281 `__: BUG: only accept 1d input for bartlett / levene in scipy.stats * `#9282 `__: MAINT: dense_output and t_eval are mutually exclusive inputs * `#9283 `__: MAINT: add docs and do some cleanups in interpolate.Rbf * `#9288 `__: Run distance_transform_edt tests on all types * `#9294 `__: DOC: fix the formula typo * `#9298 `__: MAINT: optimize/trust-constr: restore .niter attribute for backward-compat * `#9299 `__: DOC: clarification of default rvs method in scipy.stats * `#9301 `__: MAINT: removed unused import sys * `#9302 `__: MAINT: removed unused imports * `#9303 `__: DOC: signal: Refer to fs instead of nyq in the firwin docstring. * `#9305 `__: ENH: Added Yeo-Johnson power transformation * `#9306 `__: ENH - add dual annealing * `#9309 `__: ENH add the yulesimon distribution to scipy.stats * `#9317 `__: Nested SLSQP bug fix. * `#9320 `__: MAINT: stats: avoid underflow in stats.geom.ppf * `#9326 `__: Add example for Rosenbrock function * `#9332 `__: Sort file lists * `#9340 `__: Fix typo in find_peaks documentation * `#9343 `__: MAINT Use np.full when possible * `#9344 `__: DOC: added examples to docstring of dirichlet class * `#9346 `__: DOC: Fix import of scipy.sparse.linalg in example (#9345) * `#9350 `__: Fix interpolate read only * `#9351 `__: MAINT: special.erf: use the x->-x symmetry * `#9356 `__: Fix documentation typo * `#9358 `__: DOC: improve doc for ksone and kstwobign in scipy.stats * `#9362 `__: DOC: Change datatypes of A matrices in linprog * `#9364 `__: MAINT: Adds implicit none to fftpack fortran sources * `#9369 `__: DOC: minor tweak to CoC (updated NumFOCUS contact address). 
* `#9373 `__: Fix exception if python is called with -OO option * `#9374 `__: FIX: AIX compilation issue with NAN and INFINITY * `#9376 `__: COBLYA -> COBYLA in docs * `#9377 `__: DOC: Add examples integrate: fixed_quad and quadrature * `#9379 `__: MAINT: TST: Make tests NumPy 1.8 compatible * `#9385 `__: CI: On Travis matrix "OPTIMIZE=-OO" flag ignored * `#9387 `__: Fix defaut value for 'mode' in 'ndimage.shift' in the doc * `#9392 `__: BUG: rank has to be integer in rank_filter: fixed issue 9388 * `#9399 `__: DOC: Misc. typos * `#9400 `__: TST: stats: Fix the expected r-value of a linregress test. * `#9405 `__: BUG: np.hstack does not accept generator expressions * `#9408 `__: ENH: linalg: Shorter ill-conditioned warning message * `#9418 `__: DOC: Fix ndimage docstrings and reduce doc build warnings * `#9421 `__: DOC: Add missing docstring examples in scipy.spatial * `#9422 `__: DOC: Add an example to integrate.newton_cotes * `#9427 `__: BUG: Fixed defect with maxiter #9419 in dual annealing * `#9431 `__: BENCH: Add dual annealing to scipy benchmark (see #9415) * `#9435 `__: DOC: Add docstring examples for stats.binom_test * `#9443 `__: DOC: Fix the order of indices in optimize tutorial * `#9444 `__: MAINT: interpolate: use operator.index for checking/coercing... * `#9445 `__: DOC: Added missing example to stats.mstats.kruskal * `#9446 `__: DOC: Add note about version changed for jaccard distance * `#9447 `__: BLD: version-script handling in setup.py * `#9448 `__: TST: skip a problematic linalg test * `#9449 `__: TST: fix missing seed in lobpcg test. 
* `#9456 `__: TST: test_eigs_consistency() now sorts output Checksums ========= MD5 ~~~ 47d309402d2e5574be8fa261fadfaf58 scipy-1.2.0rc1-cp27-cp27m-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl 911dfde5be66403c07c60e19aa631dc2 scipy-1.2.0rc1-cp27-cp27m-manylinux1_i686.whl a693189336365595b42b0d93f825b826 scipy-1.2.0rc1-cp27-cp27m-manylinux1_x86_64.whl ec5abd33480761ed9701f7fd2274fc47 scipy-1.2.0rc1-cp27-cp27m-win32.whl bc3d40311f057b12f8fea97166ef8112 scipy-1.2.0rc1-cp27-cp27m-win_amd64.whl 33848233e6438b1ff9183d8a4794daed scipy-1.2.0rc1-cp27-cp27mu-manylinux1_i686.whl c2cb1166ce0071e1fe42ed1c3e60b75e scipy-1.2.0rc1-cp27-cp27mu-manylinux1_x86_64.whl a04b5a758f555e05e6431b8a1f035888 scipy-1.2.0rc1-cp34-cp34m-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl ad511246d0742cf0669fedf292cc01bb scipy-1.2.0rc1-cp34-cp34m-manylinux1_i686.whl 15aa08ef43a6c5cb320bc015f01087ad scipy-1.2.0rc1-cp34-cp34m-manylinux1_x86_64.whl 86a59b81a3e6894d9054139311a2c51f scipy-1.2.0rc1-cp34-cp34m-win32.whl 8b6e33253579916ea873c45989ee5bea scipy-1.2.0rc1-cp34-cp34m-win_amd64.whl db7a4de02828471bf9f800814ff68627 scipy-1.2.0rc1-cp35-cp35m-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl c296b6270d29d3ec6dafddf8ceae67fb scipy-1.2.0rc1-cp35-cp35m-manylinux1_i686.whl 3ba7a825f61822128d960fa728010e51 scipy-1.2.0rc1-cp35-cp35m-manylinux1_x86_64.whl c083c8287da110b707d181f6638ce122 scipy-1.2.0rc1-cp35-cp35m-win32.whl 2242eac92681085258535ed96bd040d7 scipy-1.2.0rc1-cp35-cp35m-win_amd64.whl d5c238903c00e91a40d56023f1a27ed4 scipy-1.2.0rc1-cp36-cp36m-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl e8ab7487a9a53b86a6510772c45af787 scipy-1.2.0rc1-cp36-cp36m-manylinux1_i686.whl 9991b5958d736488bef638eea463945d scipy-1.2.0rc1-cp36-cp36m-manylinux1_x86_64.whl 9c108a9d7e967b8c9a5e5143b1a15b40 
scipy-1.2.0rc1-cp36-cp36m-win32.whl 54d63041f0315d341d9ffb028a98e767 scipy-1.2.0rc1-cp36-cp36m-win_amd64.whl 534276c864ab3139811561c022608cc3 scipy-1.2.0rc1-cp37-cp37m-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl 4bd26c179a10891087bd81a658573683 scipy-1.2.0rc1-cp37-cp37m-manylinux1_i686.whl 1a3170b4f1df42f28efbe197e54eb9a3 scipy-1.2.0rc1-cp37-cp37m-manylinux1_x86_64.whl 72b89f7e2c1d13dc8dbb21600fb184da scipy-1.2.0rc1-cp37-cp37m-win32.whl 433ef294c4a015da0a6e0f063289a658 scipy-1.2.0rc1-cp37-cp37m-win_amd64.whl 83abb1befce326916e0435d428b36e62 scipy-1.2.0rc1.tar.gz 21b7570fb577543807feb4b4c1fdad8a scipy-1.2.0rc1.tar.xz 11221dc23e0d3b316b5564f2f435aaf1 scipy-1.2.0rc1.zip SHA256 ~~~~~~ a3282027743e89d5fcb6cd7a9e4ecdbbde61bf8126fd19683e97b69f5f6a2163 scipy-1.2.0rc1-cp27-cp27m-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl 8b8841e03620d4d704f4efd3daed575e23a6272ae9c9be2c4cc751febee984f7 scipy-1.2.0rc1-cp27-cp27m-manylinux1_i686.whl 254ea2f7f3c5afef9f02c0192d2dbd8f5336c1b1c53efae7ef64a8880cccc299 scipy-1.2.0rc1-cp27-cp27m-manylinux1_x86_64.whl d449832419df3e37a3942778b46c140fd61d1e4f38f9e34bed278a77aacdbd31 scipy-1.2.0rc1-cp27-cp27m-win32.whl e57f3e3eaa88cd7bc93466122fca48c36863e03aeb24f9d570a2e6b2ea3cbf92 scipy-1.2.0rc1-cp27-cp27m-win_amd64.whl 00dcb606101fa10951ee235af69dbff55d999b01c9bb2bc0f64df5fc3aff4eb6 scipy-1.2.0rc1-cp27-cp27mu-manylinux1_i686.whl 38857eb49f7e38d3ec079772225a79235d0bd847e24d2fa8c9a9fa70ee69f0a2 scipy-1.2.0rc1-cp27-cp27mu-manylinux1_x86_64.whl 9b9950278f36c4e7779148cc9a719d41a305024c70005f994412845172ade346 scipy-1.2.0rc1-cp34-cp34m-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl c8a2fc15c64f1b162b02f86d28153b65f64720ca64e832bcf5bfae060a507949 scipy-1.2.0rc1-cp34-cp34m-manylinux1_i686.whl 4e205e0e5e88fe2105c594a706ac9fa558fe7a4daa2bef2c86b343ab507d8bd6 
scipy-1.2.0rc1-cp34-cp34m-manylinux1_x86_64.whl b04fb8ba796198d83614138d7f96e88791d83f5f4e31e628cce51fda1a092f66 scipy-1.2.0rc1-cp34-cp34m-win32.whl c34e4ce9a8c62c85f322d7897a257a839a4bb7fd94e08701044e9b1cc1bb15a6 scipy-1.2.0rc1-cp34-cp34m-win_amd64.whl cb7f4c289c06514c6e224263868a98d5e6aa6e8a90f2304b16ff276aa96030ce scipy-1.2.0rc1-cp35-cp35m-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl dcaeed34e7e965e935ba252fd3290c04cafb7d84f6f44c9c08269504aa2e0a05 scipy-1.2.0rc1-cp35-cp35m-manylinux1_i686.whl 13b234b4fbda1368474a667fc29e8365b622c4c607ed490f00643625c8f50cca scipy-1.2.0rc1-cp35-cp35m-manylinux1_x86_64.whl 073560d6598b97b8f214a31856a1d43c96701b46c5cc9805789f962c3b8a0e97 scipy-1.2.0rc1-cp35-cp35m-win32.whl 600cf596b83f185dca3455d03ca802de0b4de98cb4c8d041226089d78d04c2bc scipy-1.2.0rc1-cp35-cp35m-win_amd64.whl 77b7ed8e13f03a843d2f11bd140f224abe62fb56656f35348578e3838ff4d528 scipy-1.2.0rc1-cp36-cp36m-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl 98786b516d1955592f921c2b36505840ff921c9e82f5c28a6c2a580fb3e96af1 scipy-1.2.0rc1-cp36-cp36m-manylinux1_i686.whl 8f62fc6ac6b842d7c3c499269150ec81d26a0dea5ce818aef3b0e9e14e53c5c7 scipy-1.2.0rc1-cp36-cp36m-manylinux1_x86_64.whl 213cfc35ec2fbf86c5957d1ada99a1fe1eacccdb498d4790089af9cf50cadab4 scipy-1.2.0rc1-cp36-cp36m-win32.whl 287e46f87399eb9897c726c6d358c893f6c769300599bc5da95cbf3397a00aa7 scipy-1.2.0rc1-cp36-cp36m-win_amd64.whl 7116d193da4baca6f7cd1cd7810548fa03ed4a06e05f325978b2f0018b05ead9 scipy-1.2.0rc1-cp37-cp37m-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl 9cd3fe767225b0dcfcc9f09ed18288b51c29b35ca16dc4e5a6f3587b4ac7d7b5 scipy-1.2.0rc1-cp37-cp37m-manylinux1_i686.whl 4024aea96a01ca05af701f93fd227a7f646258116903899a75b1f4a1f0134bf8 scipy-1.2.0rc1-cp37-cp37m-manylinux1_x86_64.whl eccd0e6d321585b6e2ee18fa0ee1db4fe042e52fb4ae8e28a3d223df6bcd6a8e 
scipy-1.2.0rc1-cp37-cp37m-win32.whl 96f6c69b2d8f63dad5612385521d0cf8b62f665a62f56b6b3d3fb7042a63c34c scipy-1.2.0rc1-cp37-cp37m-win_amd64.whl ca9ba36bb271dcb4273e330c0bd418a1c3566ff76248dd634efecbe0e9c1721c scipy-1.2.0rc1.tar.gz b8fe757f377b43c733a0ba235b990fb4b3722bd6f5930a26359d16752e94560c scipy-1.2.0rc1.tar.xz da1980e7e037e2275821d7611a91eadafdc157b096f36f41d05cc0ea4ae539bc scipy-1.2.0rc1.zip -----BEGIN PGP SIGNATURE----- iQIzBAEBCAAdFiEEgfAqrOuERrV4fwr6XCLdIy4FKasFAlv8LB8ACgkQXCLdIy4F KauhWw//cFUboxtuNZeEh98Pn2mfVIqRF8hc6VK7jeXU64qiAdOnbBRKXfoXWpbt 8fEckTFVVt2CQzIEZju429P0eWv8psfvn90cNnj5VHz7h46AH5gQee/cfP1kNWGO mxcYczqZN8+ldDnww4hfk0R+3IrpfmQbVSMEqBF84uottfgUjUzPtcFHJmmmcOF2 7XWhAP5IWMctcwq3r3WAa8yLUmlEIB3Fs9mPPr1We7e0QiXnEueUrXVZF9Oz3UpJ UFiTwX5nxnNdxZJwtLbBXhxJUbvPnooby0x1NwxFG6AwrWaPG3my0e9kIzpwqfsf 0B7Pt/xPEHWIizaqMyfUiaS7FZOUfvq6xs5aCyQeP9Wtnwherm6SQ/Qs+LFSb8E2 9cgyFmaBeslrTMd3f1YMcb6d1hmNSjbBmWccZEXCqjC9UClxbAQDHDcdmTl+4vp7 xATCxWkYVP9lOLH5TtIC1Bo0AZXiOxI3JnojQS65JH0w4unh1IJJkENurzCATh4/ vnyddy0x2e1ok5bX8fpVFrCc8lSdO8aDY5cYEU55XZhkyroet3IaOYqwPzPQPwoJ QgYEsqxa2Urn0faV5uuDFTscnCAHK0EVXhr3yyL0YoIGOPffM8JaswEAUaFwLrq0 KEbOavdImtSwqL9v6x5tDyYXh/mHQ70GGtPS68iSc1NexRrwLAw= =F/+Z -----END PGP SIGNATURE----- From charlesr.harris at gmail.com Mon Nov 26 16:16:04 2018 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 26 Nov 2018 14:16:04 -0700 Subject: [Numpy-discussion] ANN: SciPy 1.2.0rc1 -- please test In-Reply-To: References: Message-ID: Congratulations on your first release. May there be many more :) Chuck On Mon, Nov 26, 2018 at 11:48 AM Tyler Reddy wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA256 > > Hi all, > > On behalf of the SciPy development team I'm pleased to announce > the release candidate SciPy 1.2.0rc1. Please help us test out this > release candidate -- the 1.2.x series will be an LTS release and the > last to support Python 2.7.
> > Sources and binary wheels can be found at: > https://pypi.python.org/pypi/scipy > and at: > https://github.com/scipy/scipy/releases/tag/v1.2.0rc1 > > To install the release candidate with pip: > > pip install scipy==1.2.0rc1 > > > ========================== > SciPy 1.2.0 Release Notes > ========================== > > Note: Scipy 1.2.0 is not released yet! > > SciPy 1.2.0 is the culmination of 6 months of hard work. It contains > many new features, numerous bug-fixes, improved test coverage and better > documentation. There have been a number of deprecations and API changes > in this release, which are documented below. All users are encouraged to > upgrade to this release, as there are a large number of bug-fixes and > optimizations. Before upgrading, we recommend that users check that > their own code does not use deprecated SciPy functionality (to do so, > run your code with ``python -Wd`` and check for ``DeprecationWarning`` s). > Our development attention will now shift to bug-fix releases on the > 1.2.x branch, and on adding new features on the master branch. > > This release requires Python 2.7 or 3.4+ and NumPy 1.8.2 or greater. > > Note: This will be the last SciPy release to support Python 2.7. > Consequently, the 1.2.x series will be a long term support (LTS) > release; we will backport bug fixes until 1 Jan 2020. > > For running on PyPy, PyPy3 6.0+ and NumPy 1.15.0 are required. 
> > Highlights of this release > ---------------------------- > > - - 1-D root finding improvements with a new solver, ``toms748``, and a new > unified interface, ``root_scalar`` > - - New ``dual_annealing`` optimization method that combines stochastic and > local deterministic searching > - - A new optimization algorithm, ``shgo`` (simplicial homology > global optimization) for derivative free optimization problems > - - A new category of quaternion-based transformations are available in > `scipy.spatial.transform` > > New features > ============ > > `scipy.ndimage` improvements > - -------------------------------- > > Proper spline coefficient calculations have been added for the ``mirror``, > ``wrap``, and ``reflect`` modes of `scipy.ndimage.rotate` > > `scipy.fftpack` improvements > - -------------------------------- > > DCT-IV, DST-IV, DCT-I, and DST-I orthonormalization are now supported in > `scipy.fftpack`. > > `scipy.interpolate` improvements > - -------------------------------- > > `scipy.interpolate.pade` now accepts a new argument for the order of the > numerator > > `scipy.cluster` improvements > - ---------------------------- > > `scipy.cluster.vq.kmeans2` gained a new initialization method, kmeans++. > > `scipy.special` improvements > - ---------------------------- > > The function ``softmax`` was added to `scipy.special`. > > `scipy.optimize` improvements > - ----------------------------- > > The one-dimensional nonlinear solvers have been given a unified interface > `scipy.optimize.root_scalar`, similar to the `scipy.optimize.root` > interface > for multi-dimensional solvers. ``scipy.optimize.root_scalar(f, bracket=[a > ,b], > method="brenth")`` is equivalent to ``scipy.optimize.brenth(f, a ,b)``. > If no > ``method`` is specified, an appropriate one will be selected based upon the > bracket and the number of derivatives available. 
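> The unified 1-D root-finding interface described above can be exercised
> with a minimal sketch; the objective function and bracket below are
> invented for illustration and are not from the release notes (requires
> SciPy >= 1.2):

```python
# Hedged sketch of the scipy.optimize.root_scalar dispatcher; the
# function f and the bracket [1, 2] are illustrative choices only.
from scipy.optimize import root_scalar

def f(x):
    # Root at sqrt(2), which lies inside the bracket [1, 2].
    return x ** 2 - 2

sol = root_scalar(f, bracket=[1, 2], method="brenth")
# sol.root is close to 1.41421356; sol.converged reports success.
```

> If ``method`` is omitted, ``root_scalar`` selects a solver from the
> bracket and the derivatives provided, as the notes describe.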
> > The so-called Algorithm 748 of Alefeld, Potra and Shi for root-finding > within > an enclosing interval has been added as `scipy.optimize.toms748`. This > provides > guaranteed convergence to a root with convergence rate per function > evaluation > of approximately 1.65 (for sufficiently well-behaved functions). > > ``differential_evolution`` now has the ``updating`` and ``workers`` > keywords. > The first chooses between continuous updating of the best solution vector > (the > default), or once per generation. Continuous updating can lead to faster > convergence. The ``workers`` keyword accepts an ``int`` or map-like > callable, > and parallelises the solver (having the side effect of updating once per > generation). Supplying an ``int`` evaluates the trial solutions in N > parallel > parts. Supplying a map-like callable allows other parallelisation > approaches > (such as ``mpi4py``, or ``joblib``) to be used. > > ``dual_annealing`` (and ``shgo`` below) is a powerful new general purpose > global optimization (GO) algorithm. ``dual_annealing`` uses two annealing > processes to accelerate the convergence towards the global minimum of an > objective mathematical function. The first annealing process controls the > stochastic Markov chain searching and the second annealing process > controls the > deterministic minimization. So, dual annealing is a hybrid method that > takes > advantage of stochastic and local deterministic searching in an efficient > way. > > ``shgo`` (simplicial homology global optimization) is a similar algorithm > appropriate for solving black box and derivative free optimization (DFO) > problems. The algorithm generally converges to the global solution in > finite > time. The convergence holds for non-linear inequality and > equality constraints. In addition to returning a global minimum, the > algorithm also returns any other global and local minima found after every > iteration. This makes it useful for exploring the solutions in a domain.
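> The two global optimizers described above can be tried on a toy problem;
> the quadratic objective and bounds below are invented for illustration
> and are not from the release notes (requires SciPy >= 1.2):

```python
import numpy as np
from scipy.optimize import dual_annealing, shgo

# Illustrative objective: a 2-D quadratic bowl whose global minimum
# sits at the origin.
def objective(x):
    return float(np.sum(np.asarray(x) ** 2))

bounds = [(-5.0, 5.0), (-5.0, 5.0)]

# Stochastic annealing combined with local deterministic search.
res_da = dual_annealing(objective, bounds=bounds, maxiter=200)

# Simplicial homology global optimization on the same problem.
res_shgo = shgo(objective, bounds=bounds)

# Both return an OptimizeResult exposing .x (minimizer) and .fun (minimum).
```

> On this convex toy problem both solvers locate the origin; their real
> value is on multimodal objectives where local methods get stuck.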
> > `scipy.optimize.newton` can now accept a scalar or an array > > ``MINPACK`` usage is now thread-safe, such that ``MINPACK`` + callbacks may > be used on multiple threads. > > `scipy.signal` improvements > - --------------------------- > > Digital filter design functions now include a parameter to specify the > sampling > rate. Previously, digital filters could only be specified using normalized > frequency, but different functions used different scales (e.g. 0 to 1 for > ``butter`` vs 0 to π for ``freqz``), leading to errors and confusion. With > the ``fs`` parameter, ordinary frequencies can now be entered directly into > functions, with the normalization handled internally. > > ``find_peaks`` and related functions no longer raise an exception if the > properties of a peak have unexpected values (e.g. a prominence of 0). A > ``PeakPropertyWarning`` is given instead. > > The new keyword argument ``plateau_size`` was added to ``find_peaks``. > ``plateau_size`` may be used to select peaks based on the length of the > flat top of a peak. > > ``welch()`` and ``csd()`` methods in `scipy.signal` now support calculation > of a median average PSD, using the ``average='median'`` keyword > > `scipy.sparse` improvements > - --------------------------- > > The `scipy.sparse.bsr_matrix.tocsr` method is now implemented directly > instead > of converting via COO format, and the `scipy.sparse.bsr_matrix.tocsc` > method > is now also routed via CSR conversion instead of COO. The efficiency of > both > conversions is now improved. > > The issue where SuperLU or UMFPACK solvers crashed on matrices with > non-canonical format in `scipy.sparse.linalg` was fixed. The solver wrapper > canonicalizes the matrix if necessary before calling the SuperLU or UMFPACK > solver. > > The ``largest`` option of `scipy.sparse.linalg.lobpcg()` was fixed to have > a correct (and expected) behavior. The order of the eigenvalues was made > consistent with the ARPACK solver (``eigs()``), i.e.
ascending for the > smallest eigenvalues, and descending for the largest eigenvalues. > > The `scipy.sparse.random` function is now faster and also supports integer > and > complex values by passing the appropriate value to the ``dtype`` argument. > > `scipy.spatial` improvements > - ---------------------------- > > The function `scipy.spatial.distance.jaccard` was modified to return 0 > instead > of ``np.nan`` when two all-zero vectors are compared. > > Support for the Jensen Shannon distance, the square-root of the > divergence, has > been added under `scipy.spatial.distance.jensenshannon` > > An optional keyword was added to the function > `scipy.spatial.cKDTree.query_ball_point()` to sort or not sort the returned > indices. Not sorting the indices can speed up calls. > > A new category of quaternion-based transformations are available in > `scipy.spatial.transform`, including spherical linear interpolation of > rotations (``Slerp``), conversions to and from quaternions, Euler angles, > and general rotation and inversion capabilities > (`spatial.transform.Rotation`), and uniform random sampling of 3D > rotations (`spatial.transform.Rotation.random`). > > `scipy.stats` improvements > - -------------------------- > > The Yeo-Johnson power transformation is now supported (``yeojohnson``, > ``yeojohnson_llf``, ``yeojohnson_normmax``, ``yeojohnson_normplot``). > Unlike > the Box-Cox transformation, the Yeo-Johnson transformation can accept > negative > values. > > Added a general method to sample random variates based on the density > only, in > the new function ``rvs_ratio_uniforms``. > > The Yule-Simon distribution (``yulesimon``) was added -- this is a new > discrete probability distribution. 
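> The new `scipy.spatial.transform` module described above can be sketched
> briefly; the specific angle and vector are illustrative only (requires
> SciPy >= 1.2):

```python
from scipy.spatial.transform import Rotation

# Build a 90-degree rotation about the z-axis from Euler angles and
# apply it to a unit vector along x; the result should lie on the y-axis.
rot = Rotation.from_euler('z', 90, degrees=True)
vec = rot.apply([1.0, 0.0, 0.0])

# Round-tripping through the quaternion representation preserves
# the rotation.
rot2 = Rotation.from_quat(rot.as_quat())
```

> The same ``Rotation`` objects feed ``Slerp`` for spherical linear
> interpolation and ``Rotation.random`` for uniform random sampling.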
> > ``stats`` and ``mstats`` now have access to a new regression method, > ``siegelslopes``, a robust linear regression algorithm > > `scipy.stats.gaussian_kde` now has the ability to deal with weighted > samples, > and should have a modest improvement in performance > > Levy Stable Parameter Estimation, PDF, and CDF calculations are now > supported > for `scipy.stats.levy_stable`. > > The Brunner-Munzel test is now available as ``brunnermunzel`` in ``stats`` > and ``mstats`` > > `scipy.linalg` improvements > - -------------------------- > > `scipy.linalg.lapack` now exposes the LAPACK routines using the Rectangular > Full Packed storage (RFP) for upper triangular, lower triangular, > symmetric, > or Hermitian matrices; the upper trapezoidal fat matrix RZ decomposition > routines are now available as well. > > Deprecated features > =================== > The functions ``hyp2f0``, ``hyp1f2`` and ``hyp3f0`` in ``scipy.special`` > have > been deprecated. > > > Backwards incompatible changes > ============================== > > LAPACK version 3.4.0 or later is now required. Building with > Apple Accelerate is no longer supported. > > The function ``scipy.linalg.subspace_angles(A, B)`` now gives correct > results for all angles. Before this, the function only returned > correct values for those angles which were greater than pi/4. > > Support for the Bento build system has been removed. Bento has not been > maintained for several years, and did not have good Python 3 or wheel > support, > hence it was time to remove it. > > The required signature of `scipy.optimize.linprog` ``method=simplex`` > callback function has changed. Before iteration begins, the simplex solver > first converts the problem into a standard form that does not, in general, > have the same variables or constraints > as the problem defined by the user.
Previously, the simplex solver would > pass a > user-specified callback function several separate arguments, such as the > current solution vector ``xk``, corresponding to this standard form > problem. > Unfortunately, the relationship between the standard form problem and the > user-defined problem was not documented, limiting the utility of the > information passed to the callback function. > > In addition to numerous bug fix changes, the simplex solver now passes a > user-specified callback function a single ``OptimizeResult`` object > containing > information that corresponds directly to the user-defined problem. In > future > releases, this ``OptimizeResult`` object may be expanded to include > additional > information, such as variables corresponding to the standard-form problem > and > information concerning the relationship between the standard-form and > user-defined problems. > > The implementation of `scipy.sparse.random` has changed, and this affects > the > numerical values returned for both ``sparse.random`` and ``sparse.rand`` > for > some matrix shapes and a given seed. > > `scipy.optimize.newton` will no longer use Halley's method in cases where > it > negatively impacts convergence > > Other changes > ============= > > > Authors > ======= > > * @endolith > * @luzpaz > * Hameer Abbasi + > * akahard2dj + > * Anton Akhmerov > * Joseph Albert > * alexthomas93 + > * ashish + > * atpage + > * Blair Azzopardi + > * Yoshiki V?zquez Baeza > * Bence Bagi + > * Christoph Baumgarten > * Lucas Bellomo + > * BH4 + > * Aditya Bharti > * Max Bolingbroke > * Fran?ois Boulogne > * Ward Bradt + > * Matthew Brett > * Evgeni Burovski > * Rafa? 
Byczek + > * Alfredo Canziani + > * CJ Carey > * Luc?a Cheung + > * Poom Chiarawongse + > * Jeanne Choo + > * Robert Cimrman > * Graham Clenaghan + > * cynthia-rempel + > * Johannes Damp + > * Jaime Fernandez del Rio > * Dowon + > * emmi474 + > * Stefan Endres + > * Thomas Etherington + > * Alex Fikl + > * fo40225 + > * Joseph Fox-Rabinovitz > * Lars G > * Abhinav Gautam + > * Stiaan Gerber + > * C.A.M. Gerlach + > * Ralf Gommers > * Todd Goodall > * Lars Grueter + > * Sylvain Gubian + > * Matt Haberland > * David Hagen > * Will Handley + > * Charles Harris > * Ian Henriksen > * Thomas Hisch + > * Theodore Hu > * Michael Hudson-Doyle + > * Nicolas Hug + > * jakirkham + > * Jakob Jakobson + > * James + > * Jan Schl?ter > * jeanpauphilet + > * josephmernst + > * Kai + > * Kai-Striega + > * kalash04 + > * Toshiki Kataoka + > * Konrad0 + > * Tom Krauss + > * Johannes Kulick > * Lars Gr?ter + > * Eric Larson > * Denis Laxalde > * Will Lee + > * Katrin Leinweber + > * Yin Li + > * P. L. Lim + > * Jesse Livezey + > * Duncan Macleod + > * MatthewFlamm + > * Nikolay Mayorov > * Mike McClurg + > * Christian Meyer + > * Mark Mikofski > * Naoto Mizuno + > * mohmmadd + > * Nathan Musoke > * Anju Geetha Nair + > * Andrew Nelson > * Ayappan P + > * Nick Papior > * Haesun Park + > * Ronny Pfannschmidt + > * pijyoi + > * Ilhan Polat > * Anthony Polloreno + > * Ted Pudlik > * puenka > * Eric Quintero > * Pradeep Reddy Raamana + > * Vyas Ramasubramani + > * Ramon Vi?as + > * Tyler Reddy > * Joscha Reimer > * Antonio H Ribeiro > * richardjgowers + > * Rob + > * robbystk + > * Lucas Roberts + > * rohan + > * Joaquin Derrac Rus + > * Josua Sassen + > * Bruce Sharpe + > * Max Shinn + > * Scott Sievert > * Sourav Singh > * Strahinja Luki? 
+ > * Kai Striega + > * Shinya SUZUKI + > * Mike Toews + > * Piotr Uchwat > * Miguel de Val-Borro + > * Nicky van Foreest > * Paul van Mulbregt > * Gael Varoquaux > * Pauli Virtanen > * Stefan van der Walt > * Warren Weckesser > * Joshua Wharton + > * Bernhard M. Wiedemann + > * Eric Wieser > * Josh Wilson > * Tony Xiang + > * Roman Yurchak + > * Roy Zywina + > > A total of 137 people contributed to this release. > People with a "+" by their names contributed a patch for the first time. > This list of names is automatically generated, and may not be fully > complete. > > Issues closed for 1.2.0 > - ----------------------- > > * `#1240 `__: Allowing > multithreaded use of minpack through scipy.optimize... > * `#1432 `__: > scipy.stats.mode extremely slow (Trac #905) > * `#3372 `__: Please add > Sphinx search field to online scipy html docs > * `#3678 `__: > _clough_tocher_2d_single direction between centroids > * `#4174 `__: lobpcg > "largest" option invalid? > * `#5493 `__: anderson_ksamp > p-values>1 > * `#5743 `__: slsqp fails to > detect infeasible problem > * `#6139 `__: > scipy.optimize.linprog failed to find a feasible starting point... > * `#6358 `__: stats: > docstring for `vonmises_line` points to `vonmises_line`... > * `#6498 `__: runtests.py is > missing in pypi distfile > * `#7426 `__: > scipy.stats.ksone(n).pdf(x) returns nan for positive values of... > * `#7455 `__: > scipy.stats.ksone.pdf(2,x) return incorrect values for x near... > * `#7456 `__: > scipy.special.smirnov and scipy.special.smirnovi have accuracy... > * `#7492 `__: > scipy.special.kolmogorov(x)/kolmogi(p) inefficient, inaccurate... > * `#7914 `__: TravisCI not > failing when it should for -OO run > * `#8064 `__: linalg.solve > test crashes on Windows > * `#8212 `__: LAPACK > Rectangular Full Packed routines > * `#8256 `__: > differential_evolution bug converges to wrong results in complex... > * `#8443 `__: Deprecate > `hyp2f0`, `hyp1f2`, and `hyp3f0`? 
> * `#8452 `__: DOC: ARPACK > tutorial has two conflicting equations > * `#8680 `__: scipy fails > compilation when building from source > * `#8686 `__: Division by > zero in _trustregion.py when x0 is exactly equal... > * `#8700 `__: _MINPACK_LOCK > not held when calling into minpack from least_squares > * `#8786 `__: erroneous > moment values for t-distribution > * `#8791 `__: Checking COLA > condition in istft should be optional (or omitted) > * `#8843 `__: imresize cannot > be deprecated just yet > * `#8844 `__: Inverse Wishart > Log PDF Incorrect for Non-diagonal Scale Matrix? > * `#8878 `__: vonmises and > vonmises_line in stats: vonmises wrong and superfluous? > * `#8895 `__: v1.1.0 > `ndi.rotate` documentation - reused parameters not filled... > * `#8900 `__: Missing complex > conjugation in scipy.sparse.linalg.LinearOperator > * `#8904 `__: BUG: if zero > derivative at root, then Newton fails with RuntimeWarning > * `#8911 `__: > make_interp_spline bc_type incorrect input interpretation > * `#8942 `__: MAINT: Refactor > `_linprog.py` and `_linprog_ip.py` to remove... > * `#8947 `__: np.int64 in > scipy.fftpack.next_fast_len > * `#9020 `__: BUG: > linalg.subspace_angles gives wrong results > * `#9033 `__: > scipy.stats.normaltest sometimes gives incorrect returns b/c... > * `#9036 `__: Bizarre times > for `scipy.sparse.rand` function with 'low' density... > * `#9044 `__: > optimize.minimize(method=`trust-constr`) result dict does not... > * `#9071 `__: doc/linalg: add > cho_solve_banded to see also of cholesky_banded > * `#9082 `__: eigenvalue > sorting in scipy.sparse.linalg.eigsh > * `#9086 `__: > signaltools.py:491: FutureWarning: Using a non-tuple sequence... 
> * `#9091 `__: > test_spline_filter failure on 32-bit > * `#9122 `__: Typo on scipy > minimization tutorial > * `#9135 `__: doc error at > https://docs.scipy.org/doc/scipy/reference/tutorial/stats/discrete_poisson.html > * `#9167 `__: DOC: BUG: typo > in ndimage LowLevelCallable tutorial example > * `#9169 `__: truncnorm does > not work if b < a in scipy.stats > * `#9250 `__: > scipy.special.tests.test_mpmath::TestSystematic::test_pcfw fails... > * `#9259 `__: rv.expect() == > rv.mean() is false for rv.mean() == nan (and inf) > * `#9286 `__: DOC: Rosenbrock > expression in optimize.minimize tutorial > * `#9316 `__: SLSQP fails in > nested optimization > * `#9337 `__: > scipy.signal.find_peaks key typo in documentation > * `#9345 `__: Example from > documentation of scipy.sparse.linalg.eigs raises... > * `#9383 `__: Default value > for "mode" in "ndimage.shift" > * `#9419 `__: dual_annealing > off by one in the number of iterations > * `#9442 `__: Error in > Defintion of Rosenbrock Function > * `#9453 `__: TST: > test_eigs_consistency() doesn't have consistent results > > > Pull requests for 1.2.0 > - ----------------------- > > * `#7352 `__: ENH: add Brunner > Munzel test to scipy.stats. > * `#7373 `__: BUG: Jaccard > distance for all-zero arrays would return np.nan > * `#7374 `__: ENH: Add PDF, CDF > and parameter estimation for Stable Distributions > * `#8098 `__: ENH: Add shgo for > global optimization of NLPs. 
> * `#8203 `__: ENH: adding > simulated dual annealing to optimize > * `#8259 `__: Option to follow > original Storn and Price algorithm and its parallelisation > * `#8293 `__: ENH add > ratio-of-uniforms method for rv generation to scipy.stats > * `#8294 `__: BUG: Fix slowness > in stats.mode > * `#8295 `__: ENH: add Jensen > Shannon distance to `scipy.spatial.distance` > * `#8357 `__: ENH: vectorize > scalar zero-search-functions > * `#8397 `__: Add `fs=` > parameter to filter design functions > * `#8537 `__: ENH: Implement > mode parameter for spline filtering. > * `#8558 `__: ENH: small > speedup for stats.gaussian_kde > * `#8560 `__: BUG: fix p-value > calc of anderson_ksamp in scipy.stats > * `#8614 `__: ENH: correct > p-values for stats.kendalltau and stats.mstats.kendalltau > * `#8670 `__: ENH: Require > Lapack 3.4.0 > * `#8683 `__: Correcting kmeans > documentation > * `#8725 `__: MAINT: Cleanup > scipy.optimize.leastsq > * `#8726 `__: BUG: Fix > _get_output in scipy.ndimage to support string > * `#8733 `__: MAINT: stats: A > bit of clean up. > * `#8737 `__: BUG: Improve > numerical precision/convergence failures of smirnov/kolmogorov > * `#8738 `__: MAINT: stats: A > bit of clean up in test_distributions.py. > * `#8740 `__: BF/ENH: make > minpack thread safe > * `#8742 `__: BUG: Fix division > by zero in trust-region optimization methods > * `#8746 `__: MAINT: signal: > Fix a docstring of a private function, and fix... > * `#8750 `__: DOC clarified > description of norminvgauss in scipy.stats > * `#8753 `__: DOC: signal: Fix > a plot title in the chirp docstring. > * `#8755 `__: DOC: MAINT: Fix > link to the wheel documentation in developer... > * `#8760 `__: BUG: stats: > boltzmann wasn't setting the upper bound. 
> * `#8763 `__: [DOC] Improved > scipy.cluster.hierarchy documentation > * `#8765 `__: DOC: added > example for scipy.stat.mstats.tmin > * `#8788 `__: DOC: fix > definition of optional `disp` parameter > * `#8802 `__: MAINT: Suppress > dd_real unused function compiler warnings. > * `#8803 `__: ENH: Add > full_output support to optimize.newton() > * `#8804 `__: MAINT: stats > cleanup > * `#8808 `__: DOC: add note > about isinstance for frozen rvs > * `#8812 `__: Updated numpydoc > submodule > * `#8813 `__: MAINT: stats: Fix > multinomial docstrings, and do some clean up. > * `#8816 `__: BUG: fixed _stats > of t-distribution in scipy.stats > * `#8817 `__: BUG: ndimage: Fix > validation of the origin argument in correlate... > * `#8822 `__: BUG: integrate: > Fix crash with repeated t values in odeint. > * `#8832 `__: Hyperlink DOIs > against preferred resolver > * `#8837 `__: BUG: sparse: > Ensure correct dtype for sparse comparison operations. > * `#8839 `__: DOC: stats: A few > tweaks to the linregress docstring. > * `#8846 `__: BUG: stats: Fix > logpdf method of invwishart. > * `#8849 `__: DOC: signal: > Fixed mistake in the firwin docstring. > * `#8854 `__: DOC: fix type > descriptors in ltisys documentation > * `#8865 `__: Fix tiny typo in > docs for chi2 pdf > * `#8870 `__: Fixes related to > invertibility of STFT > * `#8872 `__: ENH: special: Add > the softmax function > * `#8874 `__: DOC correct gamma > function in docstrings in scipy.stats > * `#8876 `__: ENH: Added TOMS > Algorithm 748 as 1-d root finder; 17 test function... > * `#8882 `__: ENH: Only use > Halley's adjustment to Newton if close enough. > * `#8883 `__: FIX: optimize: > make jac and hess truly optional for 'trust-constr' > * `#8885 `__: TST: Do not error > on warnings raised about non-tuple indexing. > * `#8887 `__: MAINT: filter out > np.matrix PendingDeprecationWarning's in numpy... 
> * `#8889 `__: DOC: optimize: > separate legacy interfaces from new ones > * `#8890 `__: ENH: Add > optimize.root_scalar() as a universal dispatcher for... > * `#8899 `__: DCT-IV, DST-IV > and DCT-I, DST-I orthonormalization support in... > * `#8901 `__: MAINT: Reorganize > flapack.pyf.src file > * `#8907 `__: BUG: ENH: Check > if guess for newton is already zero before checking... > * `#8908 `__: ENH: Make sorting > optional for cKDTree.query_ball_point() > * `#8910 `__: DOC: > sparse.csgraph simple examples. > * `#8914 `__: DOC: interpolate: > fix equivalences of string aliases > * `#8918 `__: add > float_control(precise, on) to _fpumode.c > * `#8919 `__: MAINT: > interpolate: improve error messages for common `bc_type`... > * `#8920 `__: DOC: update > Contributing to SciPy to say "prefer no PEP8 only... > * `#8924 `__: MAINT: special: > deprecate `hyp2f0`, `hyp1f2`, and `hyp3f0` > * `#8927 `__: MAINT: special: > remove `errprint` > * `#8932 `__: Fix broadcasting > scale arg of entropy > * `#8936 `__: Fix (some) > non-tuple index warnings > * `#8937 `__: ENH: implement > sparse matrix BSR to CSR conversion directly. > * `#8938 `__: DOC: add > @_ni_docstrings.docfiller in ndimage.rotate > * `#8940 `__: Update > _discrete_distns.py > * `#8943 `__: DOC: Finish > dangling sentence in `convolve` docstring > * `#8944 `__: MAINT: Address > tuple indexing and warnings > * `#8945 `__: ENH: > spatial.transform.Rotation [GSOC2018] > * `#8950 `__: csgraph Dijkstra > function description rewording > * `#8953 `__: DOC, MAINT: HTTP > -> HTTPS, and other linkrot fixes > * `#8955 `__: BUG: np.int64 in > scipy.fftpack.next_fast_len > * `#8958 `__: MAINT: Add more > descriptive error message for phase one simplex. 
> * `#8962 `__: BUG: > sparse.linalg: add missing conjugate to _ScaledLinearOperator.adjoint > * `#8963 `__: BUG: > sparse.linalg: downgrade LinearOperator TypeError to warning > * `#8965 `__: ENH: Wrapped RFP > format and RZ decomposition routines > * `#8969 `__: MAINT: doc and > code fixes for optimize.newton > * `#8970 `__: Added 'average' > keyword for welch/csd to enable median averaging > * `#8971 `__: Better imresize > deprecation warning > * `#8972 `__: MAINT: Switch > np.where(c) for np.nonzero(c) > * `#8975 `__: MAINT: Fix > warning-based failures > * `#8979 `__: DOC: fix > description of count_sort keyword of dendrogram > * `#8982 `__: MAINT: optimize: > Fixed minor mistakes in test_linprog.py (#8978) > * `#8984 `__: BUG: > sparse.linalg: ensure expm casts integer inputs to float > * `#8986 `__: BUG: > optimize/slsqp: do not exit with convergence on steps where... > * `#8989 `__: MAINT: use > collections.abc in basinhopping > * `#8990 `__: ENH extend > p-values of anderson_ksamp in scipy.stats > * `#8991 `__: ENH: Weighted kde > * `#8993 `__: ENH: > spatial.transform.Rotation.random [GSOC 2018] > * `#8994 `__: ENH: > spatial.transform.Slerp [GSOC 2018] > * `#8995 `__: TST: time.time in > test > * `#9007 `__: Fix typo in > fftpack.rst > * `#9013 `__: Added correct > plotting code for two sided output from spectrogram > * `#9014 `__: BUG: > differential_evolution with inf objective functions > * `#9017 `__: BUG: fixed #8446 > corner case for asformat(array|dense) > * `#9018 `__: MAINT: > _lib/ccallback: remove unused code > * `#9021 `__: BUG: Issue with > subspace_angles > * `#9022 `__: DOC: Added "See > Also" section to lombscargle docstring > * `#9034 `__: BUG: Fix > tolerance printing behavior, remove meaningless tol... 
> * `#9035 `__: TST: improve > signal.bsplines test coverage > * `#9037 `__: ENH: add a new > init method for k-means > * `#9039 `__: DOC: Add examples > to fftpack.irfft docstrings > * `#9048 `__: ENH: > scipy.sparse.random > * `#9050 `__: BUG: > scipy.io.hb_write: fails for matrices not in csc format > * `#9051 `__: MAINT: Fix slow > sparse.rand for k < mn/3 (#9036). > * `#9054 `__: MAINT: spatial: > Explicitly initialize LAPACK output parameters. > * `#9055 `__: DOC: Add examples > to scipy.special docstrings > * `#9056 `__: ENH: Use one > thread in OpenBLAS > * `#9059 `__: DOC: Update > README with link to Code of Conduct > * `#9060 `__: BLD: remove > support for the Bento build system. > * `#9062 `__: DOC add sections > to overview in scipy.stats > * `#9066 `__: BUG: Correct > "remez" error message > * `#9069 `__: DOC: update > linalg section of roadmap for LAPACK versions. > * `#9079 `__: MAINT: add > spatial.transform to refguide check; complete some... > * `#9081 `__: MAINT: Add > warnings if pivot value is close to tolerance in linprog(method='simplex') > * `#9084 `__: BUG fix incorrect > p-values of kurtosistest in scipy.stats > * `#9095 `__: DOC: add sections > to mstats overview in scipy.stats > * `#9096 `__: BUG: Add test for > Stackoverflow example from issue 8174. > * `#9101 `__: ENH: add Siegel > slopes (robust regression) to scipy.stats > * `#9105 `__: allow > resample_poly() to output float32 for float32 inputs. 
> * `#9112 `__: MAINT: optimize: > make trust-constr accept constraint dict (#9043) > * `#9118 `__: Add doc entry to > cholesky_banded > * `#9120 `__: eigsh > documentation parameters > * `#9125 `__: interpolative: > correctly reconstruct full rank matrices > * `#9126 `__: MAINT: Use > warnings for unexpected peak properties > * `#9129 `__: BUG: Do not catch > and silence KeyboardInterrupt > * `#9131 `__: DOC: Correct the > typo in scipy.optimize tutorial page > * `#9133 `__: FIX: Avoid use of > bare except > * `#9134 `__: DOC: Update of > 'return_eigenvectors' description > * `#9137 `__: DOC: typo fixes > for discrete Poisson tutorial > * `#9139 `__: FIX: Doctest > failure in optimize tutorial > * `#9143 `__: DOC: missing > sigma in Pearson r formula > * `#9145 `__: MAINT: Refactor > linear programming solvers > * `#9149 `__: FIX: Make > scipy.odr.ODR ifixx equal to its data.fix if given > * `#9156 `__: DOC: special: > Mention the sigmoid function in the expit docstring. > * `#9160 `__: Fixed a latex > delimiter error in levy() > * `#9170 `__: DOC: correction / > update of docstrings of distributions in scipy.stats > * `#9171 `__: better > description of the hierarchical clustering parameter > * `#9174 `__: domain check for > a < b in stats.truncnorm > * `#9175 `__: DOC: Minor > grammar fix > * `#9176 `__: BUG: > CloughTocher2DInterpolator: fix miscalculation at neighborless... > * `#9177 `__: BUILD: Document > the "clean" target in the doc/Makefile. 
> * `#9178 `__: MAINT: make > refguide-check more robust for printed numpy arrays > * `#9186 `__: MAINT: Remove > np.ediff1d occurence > * `#9188 `__: DOC: correct typo > in extending ndimage with C > * `#9190 `__: ENH: Support > specifying axes for fftconvolve > * `#9192 `__: MAINT: optimize: > fixed @pv style suggestions from #9112 > * `#9200 `__: Fix > make_interp_spline(..., k=0 or 1, axis<0) > * `#9201 `__: BUG: > sparse.linalg/gmres: use machine eps in breakdown check > * `#9204 `__: MAINT: fix up > stats.spearmanr and match mstats.spearmanr with... > * `#9206 `__: MAINT: include > benchmarks and dev files in sdist. > * `#9208 `__: TST: signal: bump > bsplines test tolerance for complex data > * `#9210 `__: TST: mark tests > as slow, fix missing random seed > * `#9211 `__: ENH: add > capability to specify orders in pade func > * `#9217 `__: MAINT: Include > ``success`` and ``nit`` in OptimizeResult returned... > * `#9222 `__: ENH: interpolate: > Use scipy.spatial.distance to speed-up Rbf > * `#9229 `__: MNT: Fix Fourier > filter double case > * `#9233 `__: BUG: > spatial/distance: fix pdist/cdist performance regression... > * `#9234 `__: FIX: Proper > suppression > * `#9235 `__: BENCH: > rationalize slow benchmarks + miscellaneous fixes > * `#9238 `__: BENCH: limit > number of parameter combinations in spatial.*KDTree... > * `#9239 `__: DOC: stats: Fix > LaTeX markup of a couple distribution PDFs. > * `#9241 `__: ENH: Evaluate > plateau size during peak finding > * `#9242 `__: ENH: stats: > Implement _ppf and _logpdf for crystalball, and do... > * `#9246 `__: DOC: Properly > render versionadded directive in HTML documentation > * `#9255 `__: DOC: mention > RootResults in optimization reference guide > * `#9260 `__: TST: relax some > tolerances so tests pass with x87 math > * `#9264 `__: TST Use > assert_raises "match" parameter instead of the "message"... 
> * `#9267 `__: DOC: clarify > expect() return val when moment is inf/nan > * `#9272 `__: DOC: Add > description of default bounds to linprog > * `#9277 `__: MAINT: > sparse/linalg: make test deterministic > * `#9278 `__: MAINT: > interpolate: pep8 cleanup in test_polyint > * `#9279 `__: Fixed docstring > for resample > * `#9280 `__: removed first > check for float in get_sum_dtype > * `#9281 `__: BUG: only accept > 1d input for bartlett / levene in scipy.stats > * `#9282 `__: MAINT: > dense_output and t_eval are mutually exclusive inputs > * `#9283 `__: MAINT: add docs > and do some cleanups in interpolate.Rbf > * `#9288 `__: Run > distance_transform_edt tests on all types > * `#9294 `__: DOC: fix the > formula typo > * `#9298 `__: MAINT: > optimize/trust-constr: restore .niter attribute for backward-compat > * `#9299 `__: DOC: > clarification of default rvs method in scipy.stats > * `#9301 `__: MAINT: removed > unused import sys > * `#9302 `__: MAINT: removed > unused imports > * `#9303 `__: DOC: signal: > Refer to fs instead of nyq in the firwin docstring. > * `#9305 `__: ENH: Added > Yeo-Johnson power transformation > * `#9306 `__: ENH - add dual > annealing > * `#9309 `__: ENH add the > yulesimon distribution to scipy.stats > * `#9317 `__: Nested SLSQP bug > fix. 
> * `#9320 `__: MAINT: stats: > avoid underflow in stats.geom.ppf > * `#9326 `__: Add example for > Rosenbrock function > * `#9332 `__: Sort file lists > * `#9340 `__: Fix typo in > find_peaks documentation > * `#9343 `__: MAINT Use np.full > when possible > * `#9344 `__: DOC: added > examples to docstring of dirichlet class > * `#9346 `__: DOC: Fix import > of scipy.sparse.linalg in example (#9345) > * `#9350 `__: Fix interpolate > read only > * `#9351 `__: MAINT: > special.erf: use the x->-x symmetry > * `#9356 `__: Fix documentation > typo > * `#9358 `__: DOC: improve doc > for ksone and kstwobign in scipy.stats > * `#9362 `__: DOC: Change > datatypes of A matrices in linprog > * `#9364 `__: MAINT: Adds > implicit none to fftpack fortran sources > * `#9369 `__: DOC: minor tweak > to CoC (updated NumFOCUS contact address). > * `#9373 `__: Fix exception if > python is called with -OO option > * `#9374 `__: FIX: AIX > compilation issue with NAN and INFINITY > * `#9376 `__: COBLYA -> COBYLA > in docs > * `#9377 `__: DOC: Add examples > integrate: fixed_quad and quadrature > * `#9379 `__: MAINT: TST: Make > tests NumPy 1.8 compatible > * `#9385 `__: CI: On Travis > matrix "OPTIMIZE=-OO" flag ignored > * `#9387 `__: Fix defaut value > for 'mode' in 'ndimage.shift' in the doc > * `#9392 `__: BUG: rank has to > be integer in rank_filter: fixed issue 9388 > * `#9399 `__: DOC: Misc. typos > * `#9400 `__: TST: stats: Fix > the expected r-value of a linregress test. 
> * `#9405 `__: BUG: np.hstack > does not accept generator expressions > * `#9408 `__: ENH: linalg: > Shorter ill-conditioned warning message > * `#9418 `__: DOC: Fix ndimage > docstrings and reduce doc build warnings > * `#9421 `__: DOC: Add missing > docstring examples in scipy.spatial > * `#9422 `__: DOC: Add an > example to integrate.newton_cotes > * `#9427 `__: BUG: Fixed defect > with maxiter #9419 in dual annealing > * `#9431 `__: BENCH: Add dual > annealing to scipy benchmark (see #9415) > * `#9435 `__: DOC: Add > docstring examples for stats.binom_test > * `#9443 `__: DOC: Fix the > order of indices in optimize tutorial > * `#9444 `__: MAINT: > interpolate: use operator.index for checking/coercing... > * `#9445 `__: DOC: Added > missing example to stats.mstats.kruskal > * `#9446 `__: DOC: Add note > about version changed for jaccard distance > * `#9447 `__: BLD: > version-script handling in setup.py > * `#9448 `__: TST: skip a > problematic linalg test > * `#9449 `__: TST: fix missing > seed in lobpcg test. 
> * `#9456 `__: TST: > test_eigs_consistency() now sorts output > > Checksums > ========= > > MD5 > ~~~ > > 47d309402d2e5574be8fa261fadfaf58 > scipy-1.2.0rc1-cp27-cp27m-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl > 911dfde5be66403c07c60e19aa631dc2 > scipy-1.2.0rc1-cp27-cp27m-manylinux1_i686.whl > a693189336365595b42b0d93f825b826 > scipy-1.2.0rc1-cp27-cp27m-manylinux1_x86_64.whl > ec5abd33480761ed9701f7fd2274fc47 scipy-1.2.0rc1-cp27-cp27m-win32.whl > bc3d40311f057b12f8fea97166ef8112 scipy-1.2.0rc1-cp27-cp27m-win_amd64.whl > 33848233e6438b1ff9183d8a4794daed > scipy-1.2.0rc1-cp27-cp27mu-manylinux1_i686.whl > c2cb1166ce0071e1fe42ed1c3e60b75e > scipy-1.2.0rc1-cp27-cp27mu-manylinux1_x86_64.whl > a04b5a758f555e05e6431b8a1f035888 > scipy-1.2.0rc1-cp34-cp34m-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl > ad511246d0742cf0669fedf292cc01bb > scipy-1.2.0rc1-cp34-cp34m-manylinux1_i686.whl > 15aa08ef43a6c5cb320bc015f01087ad > scipy-1.2.0rc1-cp34-cp34m-manylinux1_x86_64.whl > 86a59b81a3e6894d9054139311a2c51f scipy-1.2.0rc1-cp34-cp34m-win32.whl > 8b6e33253579916ea873c45989ee5bea scipy-1.2.0rc1-cp34-cp34m-win_amd64.whl > db7a4de02828471bf9f800814ff68627 > scipy-1.2.0rc1-cp35-cp35m-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl > c296b6270d29d3ec6dafddf8ceae67fb > scipy-1.2.0rc1-cp35-cp35m-manylinux1_i686.whl > 3ba7a825f61822128d960fa728010e51 > scipy-1.2.0rc1-cp35-cp35m-manylinux1_x86_64.whl > c083c8287da110b707d181f6638ce122 scipy-1.2.0rc1-cp35-cp35m-win32.whl > 2242eac92681085258535ed96bd040d7 scipy-1.2.0rc1-cp35-cp35m-win_amd64.whl > d5c238903c00e91a40d56023f1a27ed4 > scipy-1.2.0rc1-cp36-cp36m-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl > e8ab7487a9a53b86a6510772c45af787 > scipy-1.2.0rc1-cp36-cp36m-manylinux1_i686.whl > 9991b5958d736488bef638eea463945d > 
scipy-1.2.0rc1-cp36-cp36m-manylinux1_x86_64.whl > 9c108a9d7e967b8c9a5e5143b1a15b40 scipy-1.2.0rc1-cp36-cp36m-win32.whl > 54d63041f0315d341d9ffb028a98e767 scipy-1.2.0rc1-cp36-cp36m-win_amd64.whl > 534276c864ab3139811561c022608cc3 > scipy-1.2.0rc1-cp37-cp37m-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl > 4bd26c179a10891087bd81a658573683 > scipy-1.2.0rc1-cp37-cp37m-manylinux1_i686.whl > 1a3170b4f1df42f28efbe197e54eb9a3 > scipy-1.2.0rc1-cp37-cp37m-manylinux1_x86_64.whl > 72b89f7e2c1d13dc8dbb21600fb184da scipy-1.2.0rc1-cp37-cp37m-win32.whl > 433ef294c4a015da0a6e0f063289a658 scipy-1.2.0rc1-cp37-cp37m-win_amd64.whl > 83abb1befce326916e0435d428b36e62 scipy-1.2.0rc1.tar.gz > 21b7570fb577543807feb4b4c1fdad8a scipy-1.2.0rc1.tar.xz > 11221dc23e0d3b316b5564f2f435aaf1 scipy-1.2.0rc1.zip > > SHA256 > ~~~~~~ > > a3282027743e89d5fcb6cd7a9e4ecdbbde61bf8126fd19683e97b69f5f6a2163 > scipy-1.2.0rc1-cp27-cp27m-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl > 8b8841e03620d4d704f4efd3daed575e23a6272ae9c9be2c4cc751febee984f7 > scipy-1.2.0rc1-cp27-cp27m-manylinux1_i686.whl > 254ea2f7f3c5afef9f02c0192d2dbd8f5336c1b1c53efae7ef64a8880cccc299 > scipy-1.2.0rc1-cp27-cp27m-manylinux1_x86_64.whl > d449832419df3e37a3942778b46c140fd61d1e4f38f9e34bed278a77aacdbd31 > scipy-1.2.0rc1-cp27-cp27m-win32.whl > e57f3e3eaa88cd7bc93466122fca48c36863e03aeb24f9d570a2e6b2ea3cbf92 > scipy-1.2.0rc1-cp27-cp27m-win_amd64.whl > 00dcb606101fa10951ee235af69dbff55d999b01c9bb2bc0f64df5fc3aff4eb6 > scipy-1.2.0rc1-cp27-cp27mu-manylinux1_i686.whl > 38857eb49f7e38d3ec079772225a79235d0bd847e24d2fa8c9a9fa70ee69f0a2 > scipy-1.2.0rc1-cp27-cp27mu-manylinux1_x86_64.whl > 9b9950278f36c4e7779148cc9a719d41a305024c70005f994412845172ade346 > scipy-1.2.0rc1-cp34-cp34m-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl > c8a2fc15c64f1b162b02f86d28153b65f64720ca64e832bcf5bfae060a507949 > 
scipy-1.2.0rc1-cp34-cp34m-manylinux1_i686.whl > 4e205e0e5e88fe2105c594a706ac9fa558fe7a4daa2bef2c86b343ab507d8bd6 > scipy-1.2.0rc1-cp34-cp34m-manylinux1_x86_64.whl > b04fb8ba796198d83614138d7f96e88791d83f5f4e31e628cce51fda1a092f66 > scipy-1.2.0rc1-cp34-cp34m-win32.whl > c34e4ce9a8c62c85f322d7897a257a839a4bb7fd94e08701044e9b1cc1bb15a6 > scipy-1.2.0rc1-cp34-cp34m-win_amd64.whl > cb7f4c289c06514c6e224263868a98d5e6aa6e8a90f2304b16ff276aa96030ce > scipy-1.2.0rc1-cp35-cp35m-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl > dcaeed34e7e965e935ba252fd3290c04cafb7d84f6f44c9c08269504aa2e0a05 > scipy-1.2.0rc1-cp35-cp35m-manylinux1_i686.whl > 13b234b4fbda1368474a667fc29e8365b622c4c607ed490f00643625c8f50cca > scipy-1.2.0rc1-cp35-cp35m-manylinux1_x86_64.whl > 073560d6598b97b8f214a31856a1d43c96701b46c5cc9805789f962c3b8a0e97 > scipy-1.2.0rc1-cp35-cp35m-win32.whl > 600cf596b83f185dca3455d03ca802de0b4de98cb4c8d041226089d78d04c2bc > scipy-1.2.0rc1-cp35-cp35m-win_amd64.whl > 77b7ed8e13f03a843d2f11bd140f224abe62fb56656f35348578e3838ff4d528 > scipy-1.2.0rc1-cp36-cp36m-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl > 98786b516d1955592f921c2b36505840ff921c9e82f5c28a6c2a580fb3e96af1 > scipy-1.2.0rc1-cp36-cp36m-manylinux1_i686.whl > 8f62fc6ac6b842d7c3c499269150ec81d26a0dea5ce818aef3b0e9e14e53c5c7 > scipy-1.2.0rc1-cp36-cp36m-manylinux1_x86_64.whl > 213cfc35ec2fbf86c5957d1ada99a1fe1eacccdb498d4790089af9cf50cadab4 > scipy-1.2.0rc1-cp36-cp36m-win32.whl > 287e46f87399eb9897c726c6d358c893f6c769300599bc5da95cbf3397a00aa7 > scipy-1.2.0rc1-cp36-cp36m-win_amd64.whl > 7116d193da4baca6f7cd1cd7810548fa03ed4a06e05f325978b2f0018b05ead9 > scipy-1.2.0rc1-cp37-cp37m-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl > 9cd3fe767225b0dcfcc9f09ed18288b51c29b35ca16dc4e5a6f3587b4ac7d7b5 > scipy-1.2.0rc1-cp37-cp37m-manylinux1_i686.whl > 
4024aea96a01ca05af701f93fd227a7f646258116903899a75b1f4a1f0134bf8 > scipy-1.2.0rc1-cp37-cp37m-manylinux1_x86_64.whl > eccd0e6d321585b6e2ee18fa0ee1db4fe042e52fb4ae8e28a3d223df6bcd6a8e > scipy-1.2.0rc1-cp37-cp37m-win32.whl > 96f6c69b2d8f63dad5612385521d0cf8b62f665a62f56b6b3d3fb7042a63c34c > scipy-1.2.0rc1-cp37-cp37m-win_amd64.whl > ca9ba36bb271dcb4273e330c0bd418a1c3566ff76248dd634efecbe0e9c1721c > scipy-1.2.0rc1.tar.gz > b8fe757f377b43c733a0ba235b990fb4b3722bd6f5930a26359d16752e94560c > scipy-1.2.0rc1.tar.xz > da1980e7e037e2275821d7611a91eadafdc157b096f36f41d05cc0ea4ae539bc > scipy-1.2.0rc1.zip > -----BEGIN PGP SIGNATURE----- > > iQIzBAEBCAAdFiEEgfAqrOuERrV4fwr6XCLdIy4FKasFAlv8LB8ACgkQXCLdIy4F > KauhWw//cFUboxtuNZeEh98Pn2mfVIqRF8hc6VK7jeXU64qiAdOnbBRKXfoXWpbt > 8fEckTFVVt2CQzIEZju429P0eWv8psfvn90cNnj5VHz7h46AH5gQee/cfP1kNWGO > mxcYczqZN8+ldDnww4hfk0R+3IrpfmQbVSMEqBF84uottfgUjUzPtcFHJmmmcOF2 > 7XWhAP5IWMctcwq3r3WAa8yLUmlEIB3Fs9mPPr1We7e0QiXnEueUrXVZF9Oz3UpJ > UFiTwX5nxnNdxZJwtLbBXhxJUbvPnooby0x1NwxFG6AwrWaPG3my0e9kIzpwqfsf > 0B7Pt/xPEHWIizaqMyfUiaS7FZOUfvq6xs5aCyQeP9Wtnwherm6SQ/Qs+LFSb8E2 > 9cgyFmaBeslrTMd3f1YMcb6d1hmNSjbBmWccZEXCqjC9UClxbAQDHDcdmTl+4vp7 > xATCxWkYVP9lOLH5TtIC1Bo0AZXiOxI3JnojQS65JH0w4unh1IJJkENurzCATh4/ > vnyddy0x2e1ok5bX8fpVFrCc8lSdO8aDY5cYEU55XZhkyroet3IaOYqwPzPQPwoJ > QgYEsqxa2Urn0faV5uuDFTscnCAHK0EVXhr3yyL0YoIGOPffM8JaswEAUaFwLrq0 > KEbOavdImtSwqL9v6x5tDyYXh/mHQ70GGtPS68iSc1NexRrwLAw= > =F/+Z > -----END PGP SIGNATURE----- > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
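The hashes above can be checked against a downloaded file before installing. A minimal sketch in Python (the filename and digest in the comment are copied from the announcement; the helper name is just for illustration):

```python
import hashlib

def sha256_of(path, chunk_size=1 << 16):
    """Stream a file through SHA-256 and return its hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large wheels/tarballs don't load fully into memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Compare against the digest published above for the same file, e.g.:
# expected = "ca9ba36bb271dcb4273e330c0bd418a1c3566ff76248dd634efecbe0e9c1721c"
# assert sha256_of("scipy-1.2.0rc1.tar.gz") == expected
```

The same pattern works for the MD5 column by swapping in hashlib.md5, though SHA-256 is the one worth trusting.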
URL: From ralf.gommers at gmail.com Mon Nov 26 20:09:01 2018 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Mon, 26 Nov 2018 17:09:01 -0800 Subject: [Numpy-discussion] NumPy 1.16 In-Reply-To: References: Message-ID: On Mon, Nov 26, 2018 at 9:41 AM Charles R Harris wrote: > Hi All, > > Just an update of the NumPy 1.16 release schedule. The last PR milestoned > for the release is #12219 , and > it is about done. The current release blocker is the upcoming OpenBLAS > 0.3.4, which should fix the reported threading problems, but if it doesn't > come out in the next month we might want to revisit that. The OpenBLAS > issues for 0.3.4 are here > . > > From a read through those issues, they seem to be related to new threading issues introduced in 0.3.1. IIRC we're building against 0.3.0 for 1.15.x. So the reported issues may not all be fixed. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Mon Nov 26 22:39:07 2018 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 26 Nov 2018 20:39:07 -0700 Subject: [Numpy-discussion] NumPy 1.16 In-Reply-To: References: Message-ID: On Mon, Nov 26, 2018 at 6:09 PM Ralf Gommers wrote: > > > On Mon, Nov 26, 2018 at 9:41 AM Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> Hi All, >> >> Just an update of the NumPy 1.16 release schedule. The last PR milestoned >> for the release is #12219 , and >> it is about done. The current release blocker is the upcoming OpenBLAS >> 0.3.4, which should fix the reported threading problems, but if it doesn't >> come out in the next month we might want to revisit that. The OpenBLAS >> issues for 0.3.4 are here >> . >> >> > > From a read through those issues, they seem to be related to new threading > issues introduced in 0.3.1. IIRC we're building against 0.3.0 for 1.15.x. > So the reported issues may not all be fixed. 
> One of the threading problems, present since 0.2.15, is already fixed for 0.3.4, see https://github.com/xianyi/OpenBLAS/issues/1844. It was the easy one :) I think the milestone was put together after the fact. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From matti.picus at gmail.com Tue Nov 27 15:57:06 2018 From: matti.picus at gmail.com (Matti Picus) Date: Tue, 27 Nov 2018 12:57:06 -0800 Subject: [Numpy-discussion] Weekly status meeting Nov 28 at 12:00 Pacific time Message-ID: <623a1c81-943c-023b-91ff-5d497eb000d4@gmail.com> We will be holding our weekly BIDS NumPy status meeting on Wed Nov 28 at noon Pacific time. Please join us. The draft agenda, along with details of how to join, is up at https://hackmd.io/Gn1ymjwkRjm9WVY5Cgbwsw?both Previous sessions' notes are available at https://github.com/BIDS-numpy/docs/tree/master/status_meetings Matti, Tyler and Stefan From matti.picus at gmail.com Wed Nov 28 16:14:59 2018 From: matti.picus at gmail.com (Matti Picus) Date: Wed, 28 Nov 2018 13:14:59 -0800 Subject: [Numpy-discussion] Reminder: Numpy dev meeting Fri-Sat Nov 30-Dec 1 Message-ID: We will be meeting at BIDS 9:00AM Friday for a two-day NumPy developer meeting. All are welcome, if you haven't already please let Stefan know you are coming so we can plan for space. A tentative schedule is here https://hackmd.io/gFqjPUSvSmm-0gmBDbrTBw?both Feel free to add content (just tag it with your name). Tyler, Matti, Stefan. From einstein.edison at gmail.com Wed Nov 28 16:17:59 2018 From: einstein.edison at gmail.com (Hameer Abbasi) Date: Wed, 28 Nov 2018 22:17:59 +0100 Subject: [Numpy-discussion] Reminder: Numpy dev meeting Fri-Sat Nov 30-Dec 1 In-Reply-To: References: Message-ID: <44e4c2ca-ed73-4496-ac41-7546f7d35353@Canary> Hi everyone! Just want to add that I'll be available remotely for some of the time. You can reach me via email and we can set up a call. 
Best Regards, Hameer Abbasi > On Wednesday, Nov 28, 2018 at 10:15 PM, Matti Picus wrote: > We will be meeting at BIDS 9:00AM Friday for a two-day NumPy developer > meeting. All are welcome, if you haven't already please let Stefan know > you are coming so we can plan for space. A tentative schedule is here > > > https://hackmd.io/gFqjPUSvSmm-0gmBDbrTBw?both > > > Feel free to add content (just tag it with your name). > > > Tyler, Matti, Stefan. > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefanv at berkeley.edu Wed Nov 28 16:21:37 2018 From: stefanv at berkeley.edu (Stefan van der Walt) Date: Wed, 28 Nov 2018 13:21:37 -0800 Subject: [Numpy-discussion] Reminder: Numpy dev meeting Fri-Sat Nov 30-Dec 1 In-Reply-To: <44e4c2ca-ed73-4496-ac41-7546f7d35353@Canary> References: <44e4c2ca-ed73-4496-ac41-7546f7d35353@Canary> Message-ID: <20181128212137.sl5y34unxah4bbnk@carbo> On Wed, 28 Nov 2018 22:17:59 +0100, Hameer Abbasi wrote: > Just want to add that I'll be available remotely for some of the > time. You can reach me via email and we can set up a call. Thank you, Hameer! Stéfan From teoliphant at gmail.com Wed Nov 28 18:34:51 2018 From: teoliphant at gmail.com (Travis Oliphant) Date: Wed, 28 Nov 2018 17:34:51 -0600 Subject: [Numpy-discussion] Reminder: Numpy dev meeting Fri-Sat Nov 30-Dec 1 In-Reply-To: References: Message-ID: I will be available remotely as well, but unable to come to BIDS this week. I am particularly interested in how to improve the dtype subsystem --- potentially using libndtypes from the xnd project. Thanks, -Travis On Wed, Nov 28, 2018 at 3:16 PM Matti Picus wrote: > We will be meeting at BIDS 9:00AM Friday for a two-day NumPy developer > meeting. 
All are welcome, if you haven't already please let Stefan know > you are coming so we can plan for space. A tentative schedule is here > > > https://hackmd.io/gFqjPUSvSmm-0gmBDbrTBw?both > > > Feel free to add content (just tag it with your name). > > > Tyler, Matti, Stefan. > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefanv at berkeley.edu Wed Nov 28 18:39:44 2018 From: stefanv at berkeley.edu (Stefan van der Walt) Date: Wed, 28 Nov 2018 15:39:44 -0800 Subject: [Numpy-discussion] Reminder: Numpy dev meeting Fri-Sat Nov 30-Dec 1 In-Reply-To: References: Message-ID: <20181128233944.xrx7lggpxib2glbh@carbo> On Wed, 28 Nov 2018 17:34:51 -0600, Travis Oliphant wrote: > I am particularly interested in how to improve the dtype subsystem --- > potentially using libndtypes from the xnd project. We'd love to hear your opinion on the matter, and can set up a Zoom call for that purpose. Please let me know off-list which time between noon and 2pm California time works best. Best regards, Stéfan From charlesr.harris at gmail.com Wed Nov 28 23:17:16 2018 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 28 Nov 2018 21:17:16 -0700 Subject: [Numpy-discussion] Reminder: Numpy dev meeting Fri-Sat Nov 30-Dec 1 In-Reply-To: References: Message-ID: On Wed, Nov 28, 2018 at 4:35 PM Travis Oliphant wrote: > I will be available remotely as well, but unable to come to BIDS this > week. > > I am particularly interested in how to improve the dtype subsystem --- > potentially using libndtypes from the xnd project. > > That would be good. We are at a critical spot where, if we are not careful, there will be a proliferation of incompatible solutions. I believe that Quansight is doing some work that should possibly be part of NumPy. 
Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From oleksandr.pavlyk at intel.com Thu Nov 29 14:50:38 2018 From: oleksandr.pavlyk at intel.com (Pavlyk, Oleksandr) Date: Thu, 29 Nov 2018 19:50:38 +0000 Subject: [Numpy-discussion] How to get sources of the current state of 1.16? Message-ID: <4C9EDA7282E297428F3986994EB0FBD30606D40B@ORSMSX108.amr.corp.intel.com> Are they the same as the current master, or is there a mechanism to query issue tracker for all PRs designated to make it into 1.16? Even then, what should the base be? Perhaps this is already documented in either an issue or a PR. Thanks for the pointer. Sasha -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Thu Nov 29 14:59:25 2018 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 29 Nov 2018 11:59:25 -0800 Subject: [Numpy-discussion] How to get sources of the current state of 1.16? In-Reply-To: <4C9EDA7282E297428F3986994EB0FBD30606D40B@ORSMSX108.amr.corp.intel.com> References: <4C9EDA7282E297428F3986994EB0FBD30606D40B@ORSMSX108.amr.corp.intel.com> Message-ID: There is a GitHub Milestone for 1.16: https://github.com/numpy/numpy/milestone/58 On Thu, Nov 29, 2018 at 11:54 AM Pavlyk, Oleksandr < oleksandr.pavlyk at intel.com> wrote: > Are they the same as the current master, or is there a mechanism to query > issue tracker for all PRs designated to make it into 1.16? > > > > Even then, what should the base be? > > > > Perhaps this is already documented in either an issue or a PR. > > > > Thanks for the pointer. > > > > Sasha > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From oleksandr.pavlyk at intel.com Thu Nov 29 17:35:28 2018 From: oleksandr.pavlyk at intel.com (Pavlyk, Oleksandr) Date: Thu, 29 Nov 2018 22:35:28 +0000 Subject: [Numpy-discussion] How to get sources of the current state of 1.16? In-Reply-To: References: <4C9EDA7282E297428F3986994EB0FBD30606D40B@ORSMSX108.amr.corp.intel.com> Message-ID: <4C9EDA7282E297428F3986994EB0FBD30606D4E8@ORSMSX108.amr.corp.intel.com> Sorry for not being more explicit. Yes, the milestone gives me a set of PRs whose changes are designated for 1.16. If I were to manually build sources for a 1.16 candidate I would 1. create a branch at some SHA in the master, 2. cherry-pick commits from PRs enumerated in milestone/58 not reachable from the chosen SHA. What I am missing is the convention on how to choose the merge-base in step 1. I am interested in building sources so that I can try to generate patches on top of these sources in advance, and give myself enough time to resolve possible conflicts. Thanks, Sasha From: NumPy-Discussion [mailto:numpy-discussion-bounces+oleksandr.pavlyk=intel.com at python.org] On Behalf Of Robert Kern Sent: Thursday, November 29, 2018 1:59 PM To: Discussion of Numerical Python Subject: Re: [Numpy-discussion] How to get sources of the current state of 1.16? There is a GitHub Milestone for 1.16: https://github.com/numpy/numpy/milestone/58 On Thu, Nov 29, 2018 at 11:54 AM Pavlyk, Oleksandr > wrote: Are they the same as the current master, or is there a mechanism to query issue tracker for all PRs designated to make it into 1.16? Even then, what should the base be? Perhaps this is already documented in either an issue or a PR. Thanks for the pointer. Sasha _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at python.org https://mail.python.org/mailman/listinfo/numpy-discussion -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From robert.kern at gmail.com Thu Nov 29 18:00:45 2018 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 29 Nov 2018 15:00:45 -0800 Subject: [Numpy-discussion] How to get sources of the current state of 1.16? In-Reply-To: <4C9EDA7282E297428F3986994EB0FBD30606D4E8@ORSMSX108.amr.corp.intel.com> References: <4C9EDA7282E297428F3986994EB0FBD30606D40B@ORSMSX108.amr.corp.intel.com> <4C9EDA7282E297428F3986994EB0FBD30606D4E8@ORSMSX108.amr.corp.intel.com> Message-ID: On Thu, Nov 29, 2018 at 2:38 PM Pavlyk, Oleksandr < oleksandr.pavlyk at intel.com> wrote: > Sorry for not being more explicit. > > > > Yes, the milestone gives me a set of PRs whose changes are designated for > 1.16. > > > > If I were to manually build sources for a 1.16 candidate I would > > 1. create a branch at some SHA in the master, > > 2. cherry-pick commits from PRs enumerated in milestone/58 > not reachable from the chosen SHA. > > > > What I am missing is the convention on how to choose the merge-base in > step 1. > > > > I am interested in building sources so that I can try to generate patches > on top of these sources in advance, and give myself enough time to resolve > possible conflicts. > Just master right now. At some point, there will be a release branch, but that will be much closer to the release, when most of the PRs are already merged and it's just last-minute fixes and release-specific changes (e.g. bumping version numbers, etc.) to do. -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Thu Nov 29 22:09:54 2018 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 29 Nov 2018 19:09:54 -0800 Subject: [Numpy-discussion] How to get sources of the current state of 1.16? 
In-Reply-To: <4C9EDA7282E297428F3986994EB0FBD30606D40B@ORSMSX108.amr.corp.intel.com> References: <4C9EDA7282E297428F3986994EB0FBD30606D40B@ORSMSX108.amr.corp.intel.com> Message-ID: On Thu, Nov 29, 2018 at 11:53 AM Pavlyk, Oleksandr < oleksandr.pavlyk at intel.com> wrote: > Are they the same as the current master, or is there a mechanism to query > issue tracker for all PRs designated to make it into 1.16? > > > > Even then, what should the base be? > > Current master is the 1.16 development branch and everything in it will become part of 1.16 when the 1.16.x branch is made. Master will then become the 1.17 development branch. All PRs start in master. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL:
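[Editor's note: the two-step recipe discussed in this thread, creating a branch at a chosen base SHA and then cherry-picking the milestone commits not reachable from it, can be sketched with plain git. The demo below runs on a throwaway repository: the branch name `1.16-candidate`, the commit messages, and the SHAs are illustrative stand-ins, not NumPy's actual history.]

```shell
#!/bin/sh
# Sketch of the branch-then-cherry-pick workflow on a throwaway repo.
set -e
tmpdir=$(mktemp -d)
cd "$tmpdir"
git init -q demo && cd demo
git config user.email dev@example.com
git config user.name dev
# master: a base commit, then one later commit standing in for a milestone PR
git commit -q --allow-empty -m "base"
base=$(git rev-parse HEAD)
echo fix > fix.txt
git add fix.txt
git commit -q -m "milestone PR: fix"
pr=$(git rev-parse HEAD)
# step 1: create the candidate branch at the chosen base SHA
git checkout -q -b 1.16-candidate "$base"
# step 2: cherry-pick the milestone commits not reachable from the base
git cherry-pick "$pr" > /dev/null
test -f fix.txt && echo "cherry-pick applied"
```

Against the real repository, the base would be the chosen merge-base on master and the picked SHAs would come from the PRs in milestone 58; PRs merged via merge commits need `git cherry-pick -m 1 <sha>` to select the mainline parent.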