From cournape at gmail.com Sun Feb 1 02:48:29 2015 From: cournape at gmail.com (David Cournapeau) Date: Sun, 1 Feb 2015 08:48:29 +0100 Subject: [Numpy-discussion] missing FloatingPointError for numpy on cygwin64 In-Reply-To: References: Message-ID: Hi Sebastian, I think you may be one of the first people to report using cygwin 64. I think it makes sense to support that platform as it is becoming more common. Could you report the value of `sys.platform` on cygwin64? The first place I would look for cygwin-related FPU issues is there: https://github.com/numpy/numpy/blob/master/numpy/core/setup.py#L638 David On Sat, Jan 31, 2015 at 9:53 PM, Sebastien Gouezel < sebastien.gouezel at univ-rennes1.fr> wrote: > Dear all, > > I tried to use numpy (version 1.9.1, installed by `pip install numpy`) > on cygwin64. I encountered the following weird bug: > > >>> import numpy > >>> with numpy.errstate(all='raise'): > ... print 1/float64(0.0) > inf > > I was expecting a FloatingPointError, but it didn't show up. Curiously, > with different numerical types (all intxx, or float128), I indeed get > the FloatingPointError. > > Same thing with the most recent git version, or with 1.7.1 provided as a > precompiled package by cygwin. This behavior does not happen on cygwin32 > (I always get the FloatingPointError there). > > I wonder if there is something weird with my config, or if this is a > genuine reproducible bug. If so, where should I start looking if I want > to fix it? (I don't know anything about numpy's code) > > Sebastien > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
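[For context, a minimal sketch of the behaviour Sebastien expected: on a platform where FPU error handling works, dividing a float64 by zero under errstate(all='raise') raises FloatingPointError rather than silently returning inf.]

```python
import numpy as np

# On a platform with working FPU exception support, float64 division
# by zero under errstate(all='raise') raises FloatingPointError.
try:
    with np.errstate(all='raise'):
        result = np.float64(1.0) / np.float64(0.0)
    print("no error raised, got", result)   # the buggy cygwin64 behaviour
except FloatingPointError as exc:
    print("FloatingPointError:", exc)       # the expected behaviour
```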
URL: From sebastien.gouezel at univ-rennes1.fr Sun Feb 1 09:24:46 2015 From: sebastien.gouezel at univ-rennes1.fr (Sebastien Gouezel) Date: Sun, 01 Feb 2015 15:24:46 +0100 Subject: [Numpy-discussion] missing FloatingPointError for numpy on cygwin64 In-Reply-To: References: Message-ID: Le 01/02/2015 08:48, David Cournapeau wrote: > Could you report the value of `sys.platform` on cygwin64? The first > place I would look for cygwin-related FPU issues is there: > https://github.com/numpy/numpy/blob/master/numpy/core/setup.py#L638 `sys.platform` is simply `cygwin`, just as on cygwin32. It should not be too hard to support cygwin64: most tests in the testsuite pass, and almost all the failing ones involve exceptions or warnings that are not raised, as in my first observation. However, debugging this seems to be beyond my skills! From jtaylor.debian at googlemail.com Sun Feb 1 12:26:46 2015 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Sun, 01 Feb 2015 18:26:46 +0100 Subject: [Numpy-discussion] ANN: NumPy 1.9.2 release candidate Message-ID: <54CE61D6.2090000@googlemail.com> Hi, We have finished the first release candidate of NumPy 1.9.2. The 1.9.2 release will, as usual, be a bugfix-only release in the 1.9.x series. The tarballs and win32 binaries are available on sourceforge: https://sourceforge.net/projects/numpy/files/NumPy/1.9.2rc1/ If no regressions show up, the final release is planned for next week. The upgrade is recommended for all users of the 1.9.x series. 
The following issues have been fixed:

* gh-5316: fix too large dtype alignment of strings and complex types
* gh-5424: fix ma.median when used on ndarrays
* gh-5481: Fix astype for structured array fields of different byte order
* gh-5155: Fix loadtxt with comments=None and a string None data
* gh-4476: Masked array view fails if structured dtype has datetime component
* gh-5388: Make RandomState.set_state and RandomState.get_state threadsafe
* gh-5390: make seed, randint and shuffle threadsafe
* gh-5374: Fixed incorrect assert_array_almost_equal_nulp documentation
* gh-5393: Add support for ATLAS > 3.9.33.
* gh-5313: PyArray_AsCArray caused segfault for 3d arrays
* gh-5492: handle out of memory in rfftf
* gh-4181: fix a few bugs in the random.pareto docstring
* gh-5359: minor changes to linspace docstring
* gh-4723: fix a compile issue on AIX

Source tarballs, windows installers and release notes can be found at https://sourceforge.net/projects/numpy/files/NumPy/1.9.2rc1/ Cheers, The NumPy Developer team -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From matthew.brett at gmail.com Sun Feb 1 18:53:48 2015 From: matthew.brett at gmail.com (Matthew Brett) Date: Sun, 1 Feb 2015 15:53:48 -0800 Subject: [Numpy-discussion] [SciPy-Dev] ANN: NumPy 1.9.2 release candidate In-Reply-To: <54CE61D6.2090000@googlemail.com> References: <54CE61D6.2090000@googlemail.com> Message-ID: Hi, On Sun, Feb 1, 2015 at 9:26 AM, Julian Taylor wrote: > Hi, > > We have finished the first release candidate of NumPy 1.9.2. > The 1.9.2 release will as usual be a bugfix only release to the 1.9.x > series. > The tarballs and win32 binaries are available on sourceforge: > https://sourceforge.net/projects/numpy/files/NumPy/1.9.2rc1/ > > If no regressions show up the final release is planned next week. > The upgrade is recommended for all users of the 1.9.x series. 
> > Following issues have been fixed: > * gh-5316: fix too large dtype alignment of strings and complex types > * gh-5424: fix ma.median when used on ndarrays > * gh-5481: Fix astype for structured array fields of different byte order > * gh-5155: Fix loadtxt with comments=None and a string None data > * gh-4476: Masked array view fails if structured dtype has datetime > component > * gh-5388: Make RandomState.set_state and RandomState.get_state threadsafe > * gh-5390: make seed, randint and shuffle threadsafe > * gh-5374: Fixed incorrect assert_array_almost_equal_nulp documentation > * gh-5393: Add support for ATLAS > 3.9.33. > * gh-5313: PyArray_AsCArray caused segfault for 3d arrays > * gh-5492: handle out of memory in rfftf > * gh-4181: fix a few bugs in the random.pareto docstring > * gh-5359: minor changes to linspace docstring > * gh-4723: fix a compile issues on AIX > > Source tarballs, windows installers and release notes can be found at > https://sourceforge.net/projects/numpy/files/NumPy/1.9.2rc1/ I built wheels for OSX testing, via the automated travis builders [1]. Install with: pip install -f http://wheels.scipy.org -U --pre numpy Scipy ecosystem tests (scipy, pandas, etc) running against the rc1 wheel at [2]. Cheers, Matthew [1] https://travis-ci.org/MacPython/numpy-wheels [2] https://travis-ci.org/MacPython/scipy-stack-osx-testing From maniteja.modesty067 at gmail.com Mon Feb 2 08:30:36 2015 From: maniteja.modesty067 at gmail.com (Maniteja Nandana) Date: Mon, 2 Feb 2015 19:00:36 +0530 Subject: [Numpy-discussion] Regarding taking up project ideas and GSoC 2015 Message-ID: Hello everyone, I am a third year computer science undergraduate student, from BITS Pilani, India. First of all, I have cross posted to both the mailing lists, since there is a large overlap among contributors of the organisations. Please do mention if there is any objection to do so. 
I am writing this mail to introduce myself and to try to work on ideas that are hovering around, if possible contribute as much as I can to the community. Apologies in advance for the lengthy mail. I have been in active touch with numpy and scipy since December, though I have been following the mailing lists and discussions for quite some time. There is an ambience which makes me stick to the github page all the time, just so that I keep learning new things as well as help others if possible. I think I have a working knowledge of git, though not at an advanced level, which I learned when working on small patches to numpy and scipy, thanks for the help there. Though not greatly familiar with the structure of modules in the library, I have a basic understanding of the codebase, getting to know things as and when I have been digging at them, during various issues. I have recently looked into the Scipy 1.0 Roadmap, todo wiki and Project Ideas on github wiki. I was interested in the numpy idea for pythonic dtypes and also in the scipy interpolation module, which were close to my academic fields. I would also be eager to learn about the other ideas, but as they are not a part of my curriculum, I would need to spend some time to get an idea of them. Also there are various enhancement proposals, mentioned in issues and PRs, like the elliptical integrals of the third kind, which was brought up recently. It would be great to try to work on such ideas, but I would need your guidance. I am pursuing a computer science major, and have completed courses in linear algebra, differential equations, computer graphics, probability and statistics, machine learning, information retrieval and data mining. I have beginner knowledge in image and signal processing. I was wondering if I could request the concerned people who work in these fields to help beginners like me to look into and learn about the concepts. 
It would be a really great learning experience to work under you guys, and start as early as possible, irrespective of whether it is a feasible SoC project, but only according to your convenience. I would be more than pleased and happy to be a long term contributor. My github profile is https://github.com/maniteja123. It is just a mix of my modest beginning to open source contribution and my academic projects. Waiting in anticipation for your response. Thanks for spending time to read along my lengthy mail. Cheers, N.Maniteja _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion SciPy-Dev mailing list SciPy-Dev at scipy.org http://mail.scipy.org/mailman/listinfo/scipy-dev -------------- next part -------------- An HTML attachment was scrubbed... URL: From jaime.frio at gmail.com Mon Feb 2 09:25:32 2015 From: jaime.frio at gmail.com (=?UTF-8?Q?Jaime_Fern=C3=A1ndez_del_R=C3=ADo?=) Date: Mon, 2 Feb 2015 06:25:32 -0800 Subject: [Numpy-discussion] Views of a different dtype In-Reply-To: <1422695822.12798.14.camel@sebastian-t440> References: <1422695822.12798.14.camel@sebastian-t440> Message-ID: On Sat, Jan 31, 2015 at 1:17 AM, Sebastian Berg wrote: > On Fr, 2015-01-30 at 19:52 -0800, Jaime Fern?ndez del R?o wrote: > > On Thu, Jan 29, 2015 at 8:57 AM, Nathaniel Smith > > wrote: > > On Thu, Jan 29, 2015 at 12:56 AM, Jaime Fern?ndez del R?o > > wrote: > > [...] > > > > > > > Could we make it more like: check to see if the last dimension > > works. > > If not, raise an error (and let the user transpose some other > > dimension there if that's what they wanted)? Or require the > > user to > > specify which dimension will absorb the shape change? 
(If we > > were > > doing this from scratch, then it would be tempting to just say > > that we > > always add a new dimension at the end with newtype.itemsize / > > oldtype.itemsize entries, or absorb such a dimension if > > shrinking. As > > a bonus, this would always work, regardless of contiguity! > > Except that > > when shrinking the last dimension would have to be contiguous, > > of > > course.) > > > > > > When we roll @ in and people start working with stacks of matrices, we > > will probably find ourselves having to create an alias, similar to .T, > > for .swapaxes(-1, -2). Searching for the smallest stride allows to > > take views of such arrays, which does not work right now because the > > array is no longer contiguous globally. > > > > That is true, but I agree with Nathaniel at least as far as that I would > prefer a user to be able to safely use `view` even he has not even an > inkling about what his memory layout is. One option would be an > `axis=-1` default (maybe FutureWarn this from `axis=None` which would > look at order, see below -- or maybe have axis='A', 'C' and 'F' and > default to 'A' for starters). > > This even now could start creating bugs when enabling relaxed > strides :(, because your good old fortran order complex array being > viewed as a float one could expand along the wrong axis, and even > without such arrays swap order pretty fast when operating on them, which > can create impossibly to find bugs, because even a poweruser is likely > to forget about such things. > > Of course you could argue that view is a poweruser feature and a user > using it should keep these things in mind.... Though if you argue that, > you can almost just use `np.ndarray` directly ;) -- ok, not really > considering how cumbersome it is, but still. > I have been giving this some thought, and am willing to concede that my first proposal may have been too ambitious. So even though the knob goes to 11, we can always do things incrementally. 
I am also wary of adding new keywords when it seems obvious that we do not have the functionality completely figured out, so here's my new proposal: - The objective is that a view of an array that is the result of slicing a contiguous array should be possible, if it remains "contiguous" (meaning stride == itemsize) along its original contiguous (first or last) dimension. This eliminates axis transposition from the previous proposal, although reversing the axes completely would also work. - To verify this, unless the C contiguous or Fortran contiguous flags are set, we would still need to look at the strides. An array would be C contiguous if, starting from the last stride it is equal to the itemsize, and working backwards every next stride is larger or equal than the product of the previous stride by the previous dimension. dimensions of size 1 would be ignored for these, except for the last one, which would be taken to have stride = itemsize. The Fortran case is of course the same in reverse. - I think the above combined with the current preference of C contiguousness over Fortran, would actually allow the views to always be reversible, which is also a nice thing to have. This eliminates most of the weirdness, but extends current functionality to cover cases like Jens reported a few days back. Does this sound better? Jaime > > - Sebastian > > > > > I guess the main consideration for this is that we may be > > stuck with > > stuff b/c of backwards compatibility. Can you maybe say a > > little bit > > about what is allowed now, and what constraints that puts on > > things? > > E.g. are we already grovelling around in strides and picking > > random > > dimensions in some cases? > > > > > > Just to restate it: right now we only allow new views if the array is > > globally contiguous, so either along the first or last dimension. > > > > > > Jaime > > > > > > -n > > > > -- > > Nathaniel J. 
Smith > > Postdoctoral researcher - Informatics - University of > > Edinburgh > > http://vorpus.org > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > > > > -- > > (\__/) > > ( O.o) > > ( > <) Este es Conejo. Copia a Conejo en tu firma y ayúdale en sus > > planes de dominación mundial. > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- (\__/) ( O.o) ( > <) Este es Conejo. Copia a Conejo en tu firma y ayúdale en sus planes de dominación mundial. -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastien.gouezel at univ-rennes1.fr Mon Feb 2 16:47:33 2015 From: sebastien.gouezel at univ-rennes1.fr (Sebastien Gouezel) Date: Mon, 02 Feb 2015 22:47:33 +0100 Subject: [Numpy-discussion] missing FloatingPointError for numpy on cygwin64 In-Reply-To: References: Message-ID: Le 01/02/2015 08:48, David Cournapeau a écrit : > The first > place I would look for cygwin-related FPU issues is there: > https://github.com/numpy/numpy/blob/master/numpy/core/setup.py#L638 The pointer is a good one. Thanks to you, I have found the problem (a wrong numpy-specific fenv.h was included in the math routines instead of the system-wide version) and submitted a patch to the bug tracker. With the patch, almost all tests pass on cygwin64! 
From ndbecker2 at gmail.com Mon Feb 2 20:34:32 2015 From: ndbecker2 at gmail.com (Neal Becker) Date: Mon, 02 Feb 2015 20:34:32 -0500 Subject: [Numpy-discussion] Views of a different dtype References: Message-ID: I find it useful to be able to view a simple 1D contiguous array of complex as float (alternating real and imag), and also to do the reverse. From sebastian at sipsolutions.net Tue Feb 3 04:28:59 2015 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Tue, 03 Feb 2015 10:28:59 +0100 Subject: [Numpy-discussion] Views of a different dtype In-Reply-To: References: <1422695822.12798.14.camel@sebastian-t440> Message-ID: <1422955739.5533.42.camel@sebastian-t440> On Mo, 2015-02-02 at 06:25 -0800, Jaime Fernández del Río wrote: > On Sat, Jan 31, 2015 at 1:17 AM, Sebastian Berg > wrote: > On Fr, 2015-01-30 at 19:52 -0800, Jaime Fernández del Río > wrote: > > On Thu, Jan 29, 2015 at 8:57 AM, Nathaniel Smith > > > wrote: > > On Thu, Jan 29, 2015 at 12:56 AM, Jaime Fernández > del Río > > wrote: > > [...] > > > > > > > Could we make it more like: check to see if the last > dimension > > works. > > If not, raise an error (and let the user transpose > some other > > dimension there if that's what they wanted)? Or > require the > > user to > > specify which dimension will absorb the shape > change? (If we > > were > > doing this from scratch, then it would be tempting > to just say > > that we > > always add a new dimension at the end with > newtype.itemsize / > > oldtype.itemsize entries, or absorb such a dimension > if > > shrinking. As > > a bonus, this would always work, regardless of > contiguity! > > Except that > > when shrinking the last dimension would have to be > contiguous, > > of > > course.) > > > > > > When we roll @ in and people start working with stacks of > matrices, we > > will probably find ourselves having to create an alias, > similar to .T, > > for .swapaxes(-1, -2). 
Searching for the smallest stride > allows to > > take views of such arrays, which does not work right now > because the > > array is no longer contiguous globally. > > > > That is true, but I agree with Nathaniel at least as far as > that I would > prefer a user to be able to safely use `view` even he has not > even an > inkling about what his memory layout is. One option would be > an > `axis=-1` default (maybe FutureWarn this from `axis=None` > which would > look at order, see below -- or maybe have axis='A', 'C' and > 'F' and > default to 'A' for starters). > > This even now could start creating bugs when enabling relaxed > strides :(, because your good old fortran order complex array > being > viewed as a float one could expand along the wrong axis, and > even > without such arrays swap order pretty fast when operating on > them, which > can create impossibly to find bugs, because even a poweruser > is likely > to forget about such things. > > Of course you could argue that view is a poweruser feature and > a user > using it should keep these things in mind.... Though if you > argue that, > you can almost just use `np.ndarray` directly ;) -- ok, not > really > considering how cumbersome it is, but still. > > > I have been giving this some thought, and am willing to concede that > my first proposal may have been too ambitious. So even though the knob > goes to 11, we can always do things incrementally. I am also wary of > adding new keywords when it seems obvious that we do not have the > functionality completely figured out, so here's my new proposal: > > > * The objective is that a view of an array that is the result of > slicing a contiguous array should be possible, if it remains > "contiguous" (meaning stride == itemsize) along its original > contiguous (first or last) dimension. This eliminates axis > transposition from the previous proposal, although reversing > the axes completely would also work. 
> * To verify this, unless the C contiguous or Fortran contiguous > flags are set, we would still need to look at the strides. An > array would be C contiguous if, starting from the last stride > it is equal to the itemsize, and working backwards every next > stride is larger or equal than the product of the previous > stride by the previous dimension. dimensions of size 1 would > be ignored for these, except for the last one, which would be > taken to have stride = itemsize. The Fortran case is of course > the same in reverse. > * I think the above combined with the current preference of C > contiguousness over Fortran, would actually allow the views to > always be reversible, which is also a nice thing to have. > This eliminates most of the weirdness, but extends current > functionality to cover cases like Jens reported a few days back. > > > Does this sound better? > It seems fine as such, but I still worry about relaxed strides, though this is not really directly related to your efforts here. The problem I see is something like this (any numpy version): arr = np.array([[1, 2]], dtype=np.float64, order='C').T # note that arr is fortran contiguous view = arr.view(np.complex128) not_arr = view.view(np.float64) np.array_equal(arr, not_arr) # False! And with relaxed strides, the situation should become worse, because "Fortran order unless C order" logic is harder to predict, and here does an actual difference even for non (1, 1) arrays. Which creates the possibility of breaking currently working code. - Sebastian > > Jaime > > > > - Sebastian > > > > > I guess the main consideration for this is that we > may be > > stuck with > > stuff b/c of backwards compatibility. Can you maybe > say a > > little bit > > about what is allowed now, and what constraints that > puts on > > things? > > E.g. are we already grovelling around in strides and > picking > > random > > dimensions in some cases? 
> > > > > > Just to restate it: right now we only allow new views if the > array is > > globally contiguous, so either along the first or last > dimension. > > > > > > Jaime > > > > > > -n > > > > -- > > Nathaniel J. Smith > > Postdoctoral researcher - Informatics - University > of > > Edinburgh > > http://vorpus.org > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > > > > > > -- > > (\__/) > > ( O.o) > > ( > <) Este es Conejo. Copia a Conejo en tu firma y ay?dale > en sus > > planes de dominaci?n mundial. > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > -- > (\__/) > ( O.o) > ( > <) Este es Conejo. Copia a Conejo en tu firma y ay?dale en sus > planes de dominaci?n mundial. > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From jaime.frio at gmail.com Tue Feb 3 10:18:19 2015 From: jaime.frio at gmail.com (=?UTF-8?Q?Jaime_Fern=C3=A1ndez_del_R=C3=ADo?=) Date: Tue, 3 Feb 2015 07:18:19 -0800 Subject: [Numpy-discussion] Views of a different dtype In-Reply-To: <1422955739.5533.42.camel@sebastian-t440> References: <1422695822.12798.14.camel@sebastian-t440> <1422955739.5533.42.camel@sebastian-t440> Message-ID: On Tue, Feb 3, 2015 at 1:28 AM, Sebastian Berg wrote: > On Mo, 2015-02-02 at 06:25 -0800, Jaime Fern?ndez del R?o wrote: > > On Sat, Jan 31, 2015 at 1:17 AM, Sebastian Berg > > wrote: > > On Fr, 2015-01-30 at 19:52 -0800, Jaime Fern?ndez del R?o > > wrote: > > > On Thu, Jan 29, 2015 at 8:57 AM, Nathaniel Smith > > > > > wrote: > > > On Thu, Jan 29, 2015 at 12:56 AM, Jaime Fern?ndez > > del R?o > > > wrote: > > > [...] > > > > > > > > > > > > Could we make it more like: check to see if the last > > dimension > > > works. > > > If not, raise an error (and let the user transpose > > some other > > > dimension there if that's what they wanted)? Or > > require the > > > user to > > > specify which dimension will absorb the shape > > change? (If we > > > were > > > doing this from scratch, then it would be tempting > > to just say > > > that we > > > always add a new dimension at the end with > > newtype.itemsize / > > > oldtype.itemsize entries, or absorb such a dimension > > if > > > shrinking. As > > > a bonus, this would always work, regardless of > > contiguity! > > > Except that > > > when shrinking the last dimension would have to be > > contiguous, > > > of > > > course.) > > > > > > > > > When we roll @ in and people start working with stacks of > > matrices, we > > > will probably find ourselves having to create an alias, > > similar to .T, > > > for .swapaxes(-1, -2). 
Searching for the smallest stride > > allows to > > > take views of such arrays, which does not work right now > > because the > > > array is no longer contiguous globally. > > > > > > > That is true, but I agree with Nathaniel at least as far as > > that I would > > prefer a user to be able to safely use `view` even he has not > > even an > > inkling about what his memory layout is. One option would be > > an > > `axis=-1` default (maybe FutureWarn this from `axis=None` > > which would > > look at order, see below -- or maybe have axis='A', 'C' and > > 'F' and > > default to 'A' for starters). > > > > This even now could start creating bugs when enabling relaxed > > strides :(, because your good old fortran order complex array > > being > > viewed as a float one could expand along the wrong axis, and > > even > > without such arrays swap order pretty fast when operating on > > them, which > > can create impossibly to find bugs, because even a poweruser > > is likely > > to forget about such things. > > > > Of course you could argue that view is a poweruser feature and > > a user > > using it should keep these things in mind.... Though if you > > argue that, > > you can almost just use `np.ndarray` directly ;) -- ok, not > > really > > considering how cumbersome it is, but still. > > > > > > I have been giving this some thought, and am willing to concede that > > my first proposal may have been too ambitious. So even though the knob > > goes to 11, we can always do things incrementally. I am also wary of > > adding new keywords when it seems obvious that we do not have the > > functionality completely figured out, so here's my new proposal: > > > > > > * The objective is that a view of an array that is the result of > > slicing a contiguous array should be possible, if it remains > > "contiguous" (meaning stride == itemsize) along its original > > contiguous (first or last) dimension. 
This eliminates axis > > transposition from the previous proposal, although reversing > > the axes completely would also work. > > * To verify this, unless the C contiguous or Fortran contiguous > > flags are set, we would still need to look at the strides. An > > array would be C contiguous if, starting from the last stride > > it is equal to the itemsize, and working backwards every next > > stride is larger or equal than the product of the previous > > stride by the previous dimension. dimensions of size 1 would > > be ignored for these, except for the last one, which would be > > taken to have stride = itemsize. The Fortran case is of course > > the same in reverse. > > * I think the above combined with the current preference of C > > contiguousness over Fortran, would actually allow the views to > > always be reversible, which is also a nice thing to have. > > This eliminates most of the weirdness, but extends current > > functionality to cover cases like Jens reported a few days back. > > > > > > Does this sound better? > > > > It seems fine as such, but I still worry about relaxed strides, though > this is not really directly related to your efforts here. The problem I > see is something like this (any numpy version): > > arr = np.array([[1, 2]], dtype=np.float64, order='C').T > # note that arr is fortran contiguous > view = arr.view(np.complex128) > not_arr = view.view(np.float64) > np.array_equal(arr, not_arr) # False! > Yes, dimensions of size one can be a pain... > > And with relaxed strides, the situation should become worse, because > "Fortran order unless C order" logic is harder to predict, and here does > an actual difference even for non (1, 1) arrays. Which creates the > possibility of breaking currently working code. > Do you have a concrete example of what a non (1, 1) array that fails with relaxed strides would look like? 
If we used, as right now, the array flags as a first choice point, and only if none is set try to determine it from the strides/dimensions information, I fail to imagine any situation where the end result would be worse than now. I don't think that a little bit of predictable surprising in an advanced functionality is too bad. We could start raising "on the face of ambiguity, we refuse to guess" errors, even for the current behavior you show above, but that is more likely to trip people by not giving them any simple workaround, that it seems to me would be "add a .T if all dimensions are 1" in some particular situations. Or are you thinking of something more serious than a shape mismatch when you write about "breaking current code"? If there are any real loopholes in expanding this functionality, then lets not do it, but we know we have at least one user unsatisfied with the current performance, so I really think it is worth trying. Plus, I'll admit to that, messing around with some of these stuff deep inside the guts of the beast is lots of fun! ;) Jaime > > - Sebastian > > > > > > Jaime > > > > > > > > - Sebastian > > > > > > > > I guess the main consideration for this is that we > > may be > > > stuck with > > > stuff b/c of backwards compatibility. Can you maybe > > say a > > > little bit > > > about what is allowed now, and what constraints that > > puts on > > > things? > > > E.g. are we already grovelling around in strides and > > picking > > > random > > > dimensions in some cases? > > > > > > > > > Just to restate it: right now we only allow new views if the > > array is > > > globally contiguous, so either along the first or last > > dimension. > > > > > > > > > Jaime > > > > > > > > > -n > > > > > > -- > > > Nathaniel J. 
Smith > > > Postdoctoral researcher - Informatics - University > > of > > > Edinburgh > > > http://vorpus.org > > > _______________________________________________ > > > NumPy-Discussion mailing list > > > NumPy-Discussion at scipy.org > > > > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > > > > > > > > > > > > -- > > > (\__/) > > > ( O.o) > > > ( > <) Este es Conejo. Copia a Conejo en tu firma y ay?dale > > en sus > > > planes de dominaci?n mundial. > > > _______________________________________________ > > > NumPy-Discussion mailing list > > > NumPy-Discussion at scipy.org > > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > > > > > > -- > > (\__/) > > ( O.o) > > ( > <) Este es Conejo. Copia a Conejo en tu firma y ay?dale en sus > > planes de dominaci?n mundial. > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- (\__/) ( O.o) ( > <) Este es Conejo. Copia a Conejo en tu firma y ay?dale en sus planes de dominaci?n mundial. -------------- next part -------------- An HTML attachment was scrubbed... 
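[Jaime's stride-walking rule above can be sketched in a few lines. This is a hypothetical helper for illustration only, not NumPy API; the name is made up: an array counts as C contiguous for view purposes if its last non-size-1 axis is packed (stride == itemsize) and, walking backwards, every earlier stride is at least the product of the next stride and the next dimension.]

```python
import numpy as np

def view_compatible_c(arr):
    """Sketch of the proposed check (hypothetical helper, not NumPy API)."""
    # Ignore size-1 axes, as the proposal suggests.
    dims = [(n, s) for n, s in zip(arr.shape, arr.strides) if n != 1]
    if not dims:
        return True  # 0-d or all size-1 axes: trivially contiguous
    n_last, s_last = dims[-1]
    if s_last != arr.itemsize:
        return False  # last axis must be packed (stride == itemsize)
    min_stride = n_last * s_last
    for n, s in reversed(dims[:-1]):
        # Each earlier stride must be >= product of the next stride
        # and the next dimension (rows may be padded, e.g. by slicing).
        if s < min_stride:
            return False
        min_stride = n * s
    return True

a = np.zeros((4, 6))
print(view_compatible_c(a))          # True: plain C array
print(view_compatible_c(a[:, :3]))   # True: padded rows, but packed last axis
print(view_compatible_c(a.T))        # False: last axis not packed
```

[This would admit the sliced-array case Jens reported while still rejecting a globally non-contiguous transpose. The Fortran check would be the mirror image.]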
URL: 

From sebastian at sipsolutions.net Tue Feb 3 11:59:09 2015
From: sebastian at sipsolutions.net (Sebastian Berg)
Date: Tue, 03 Feb 2015 17:59:09 +0100
Subject: [Numpy-discussion] Views of a different dtype
In-Reply-To: 
References: <1422695822.12798.14.camel@sebastian-t440> <1422955739.5533.42.camel@sebastian-t440>
Message-ID: <1422982749.12912.5.camel@sebastian-t440>

On Di, 2015-02-03 at 07:18 -0800, Jaime Fernández del Río wrote:
> Do you have a concrete example of what a non (1, 1) array that fails
> with relaxed strides would look like?
>
> If we used the array flags as a first choice point, as we do right now,
> and only tried to determine it from the strides/dimensions information
> when none is set, I fail to imagine any situation where the end result
> would be worse than now. I don't think that a little bit of predictable
> surprise in an advanced functionality is too bad. We could start raising
> "in the face of ambiguity, we refuse to guess" errors, even for the
> current behavior you show above, but that is more likely to trip people
> up by not giving them any simple workaround, which it seems to me would
> be "add a .T if all dimensions are 1" in some particular situations. Or
> are you thinking of something more serious than a shape mismatch when
> you write about "breaking current code"?

Yes, I am talking only about wrongly shaped results for some fortran order arrays. A (20, 1) fortran order complex array being viewed as float will, with relaxed strides, become a (20, 2) array instead of a (40, 1) one.

> If there are any real loopholes in expanding this functionality, then
> let's not do it, but we know we have at least one user unsatisfied with
> the current performance, so I really think it is worth trying. Plus,
> I'll admit it: messing around with some of this stuff deep inside the
> guts of the beast is lots of fun! ;)

I do not think there are loopholes with expanding this functionality.
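Sebastian's example is easy to check; the following is a minimal sketch, assuming a NumPy build with relaxed strides enabled (the default from 1.10 on), where the length-1 axis makes the array both C- and F-contiguous and the view expands the last axis:

```python
import numpy as np

# A (20, 1) Fortran-order complex array; with relaxed strides the
# length-1 second axis makes it both C- and F-contiguous.
a = np.zeros((20, 1), dtype=np.complex128, order='F')

# Viewing it as float64 halves the itemsize, so the last axis doubles:
# (20, 2) rather than the (40, 1) a Fortran-order interpretation
# of the memory would produce.
b = a.view(np.float64)
print(b.shape)  # -> (20, 2)
```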
I think there have been regressions when we turned relaxed strides on, because suddenly the fortran order array might be expanded along a 1-sized axis, because it is also C order. So I wonder if we can fix these regressions and at the same time maybe provide a more intuitive approach than using the memory order blindly....

- Sebastian

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion at scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: This is a digitally signed message part
URL: 

From olof.backing at combitech.se Tue Feb 3 12:51:37 2015
From: olof.backing at combitech.se (Backing Olof)
Date: Tue, 3 Feb 2015 17:51:37 +0000
Subject: [Numpy-discussion] F2PY and multi-dimension arrays - unexpected array size error
Message-ID: <49F675D2-88C7-46BC-8252-153429468219@combitech.se>

Hi,

I am helping out with a Python and Fortran project.
Let me give you some background:

* Fortran source:

C Bergstrom FCC
C User subroutine VUMAT
      subroutine VUMAT(
C Read only -
     *     nblock, ndir, nshr, nstatev, nprops,
     *     stepTime, dt,
     *     props,
     *     density, strainInc,
     *     tempOld,
     *     stressOld, stateOld, enerInternOld, enerInelasOld,
C Write only -
     *     stressNew, stateNew, enerInternNew, enerInelasNew )
C
      include 'vaba_param.inc'
C
      dimension props(nprops),
     1     density(nblock), strainInc(nblock,ndir+nshr),
     2     tempOld(nblock),
     5     stressOld(nblock,ndir+nshr), stateOld(nblock,nstatev),
     6     enerInternOld(nblock), enerInelasOld(nblock),
     2     stressNew(nblock,ndir+nshr), stateNew(nblock,nstatev),
     3     enerInternNew(nblock), enerInelasNew(nblock)

* Corresponding .pyf:

integer :: nblock
integer :: ndir
integer :: nshr
integer :: nstatev
integer :: nprops
real :: steptime
real :: dt
real dimension(nprops) :: props
real dimension(nblock) :: density
real dimension(nblock,ndir+nshr) :: straininc
real dimension(nblock) :: tempold
real dimension(nblock,ndir+nshr) :: stressold
real dimension(nblock,nstatev) :: stateold
real dimension(nblock) :: enerinternold
real dimension(nblock) :: enerinelasold
real dimension(nblock,ndir+nshr),intent(out) :: stressnew
real dimension(nblock,nstatev),intent(out) :: statenew
real dimension(nblock),intent(out) :: enerinternnew
real dimension(nblock),intent(out) :: enerinelasnew

* Python source with call of Fortran routine:

import numpy as np
from VUMAT_Bergstrom_FCC import vumat

nblock = 1
ndir = 3
nshr = 3
nstatev = 3
nprops = 11
stepTime = 1
dt = 1
props = np.array([10, 0.5, 1e10, 5, 1e12, 3e-6, 8e-6, 27, 2], float)
density = np.array([[7.8e3]], float)
strainInc = np.array([[1,-0.5,-0.5,0,0,0]], float)
tempOld = np.array([[1]], float)
stressOld = np.array([[1,1,1,1,1,1]], float)
stateOld = np.array([[1,1,1]], float)
enerInternOld = np.array([1], float)
enerInelasOld = np.array([1], float)
stressNew = np.array([[]], float)
stateNew = np.array([[]], float)
enerInternNew = np.array([[]], float)
enerInelasNew = np.array([[]], float)
stressNew, stateNew, enerInternNew, enerInelasNew = vumat(nblock, ndir, nshr, nstatev, nprops, stepTime, dt, props, density, strainInc, tempOld, stressOld, stateOld, enerInternOld, enerInelasOld)

When trying to run with Python 2.7 I get:

olof at ubuntu:~$ ./demo.py
unexpected array size: new_size=4, got array with arr_size=1
Traceback (most recent call last):
  File "./demo.py", line 33, in <module>
    main()
  File "./demo.py", line 30, in main
    stressNew, stateNew, enerInternNew, enerInelasNew = vumat(nblock, ndir, nshr, nstatev, nprops, stepTime, dt, props, density, strainInc, tempOld, stressOld, stateOld, enerInternOld, enerInelasOld)
VUMAT_Bergstrom_FCC.error: failed in converting 9th argument `stressold' of VUMAT_Bergstrom_FCC.vumat to C/Fortran array

Other stuff:
* python 2.7.6
* numpy/f2py 1.8.2
* gcc/gfortran 4.8.2
* ubuntu 14.04 LTS 32-bit

I have tried to google, read the f2py manual, fortran tutorials etc, but to no avail. I must also admit that my knowledge in python is so-so and fortran even less(!). What is the missing statement/syntax that I can't get correct?

Your humble programmer,
Olof

From jaime.frio at gmail.com Tue Feb 3 15:47:20 2015
From: jaime.frio at gmail.com (=?UTF-8?Q?Jaime_Fern=C3=A1ndez_del_R=C3=ADo?=)
Date: Tue, 3 Feb 2015 12:47:20 -0800
Subject: [Numpy-discussion] Views of a different dtype
In-Reply-To: <1422982749.12912.5.camel@sebastian-t440>
References: <1422695822.12798.14.camel@sebastian-t440> <1422955739.5533.42.camel@sebastian-t440> <1422982749.12912.5.camel@sebastian-t440>
Message-ID: 

On Tue, Feb 3, 2015 at 8:59 AM, Sebastian Berg wrote:
> On Di, 2015-02-03 at 07:18 -0800, Jaime Fernández del Río wrote:
> > Do you have a concrete example of what a non (1, 1) array that fails
> > with relaxed strides would look like?
> >
> > If we used the array flags as a first choice point, as we do right now,
> > and only tried to determine it from the strides/dimensions information
> > when none is set, I fail to imagine any situation where the end result
I don't think that a little bit of > > predictable surprising in an advanced functionality is too bad. We > > could start raising "on the face of ambiguity, we refuse to guess" > > errors, even for the current behavior you show above, but that is more > > likely to trip people by not giving them any simple workaround, that > > it seems to me would be "add a .T if all dimensions are 1" in some > > particular situations. Or are you thinking of something more serious > > than a shape mismatch when you write about "breaking current code"? > > > > Yes, I am talking only about wrongly shaped results for some fortran > order arrays. A (20, 1) fortran order complex array being viewed as > float, will with relaxed strides become a (20, 2) array instead of a > (40, 1) one. > That is a limitation of the current implementation too, and happens already whenever relaxed strides are in place. Which is the default for 1.10, right? Perhaps giving 'view' an 'order' or 'axis' kwarg could make sense after all? It should probably be more of a hint of what to do (fortran vs c) when in doubt. "C" would prioritize last axis, "F" the first, and we could even add a "raise" option to have it fail if the axis cannot be inferred from the strides and shape. Current behavior would is equivalent to what "C" would do. Jaime > > > > If there are any real loopholes in expanding this functionality, then > > lets not do it, but we know we have at least one user unsatisfied with > > the current performance, so I really think it is worth trying. Plus, > > I'll admit to that, messing around with some of these stuff deep > > inside the guts of the beast is lots of fun! ;) > > > > I do not think there are loopholes with expanding this functionality. I > think there have regressions when we put relaxed strides to on, because > suddenly the fortran order array might be expanded along a 1-sized axis, > because it is also C order. 
So I wonder if we can fix these regressions and at the same time maybe provide a more intuitive approach than using the memory order blindly....

- Sebastian

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion at scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From warren.weckesser at gmail.com Tue Feb 3 15:58:17 2015
From: warren.weckesser at gmail.com (Warren Weckesser)
Date: Tue, 3 Feb 2015 15:58:17 -0500
Subject: [Numpy-discussion] Any interest in a 'heaviside' ufunc?
Message-ID: 

I have an implementation of the Heaviside function as a numpy ufunc.
Is there any interest in adding this to numpy? The function is simply:

                     0    if x < 0
    heaviside(x) =   0.5  if x == 0
                     1    if x > 0

Warren
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From aron at ahmadia.net Tue Feb 3 15:59:39 2015
From: aron at ahmadia.net (Aron Ahmadia)
Date: Tue, 3 Feb 2015 15:59:39 -0500
Subject: [Numpy-discussion] Any interest in a 'heaviside' ufunc?
In-Reply-To: 
References: 
Message-ID: 

That seems useful to me.

On Tue, Feb 3, 2015 at 3:58 PM, Warren Weckesser wrote:
> I have an implementation of the Heaviside function as a numpy ufunc. Is there any interest in adding this to numpy? The function is simply:
>
>                      0    if x < 0
>     heaviside(x) =   0.5  if x == 0
>                      1    if x > 0
>
> Warren

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion at scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From insertinterestingnamehere at gmail.com Tue Feb 3 16:52:12 2015
From: insertinterestingnamehere at gmail.com (Ian Henriksen)
Date: Tue, 03 Feb 2015 21:52:12 +0000
Subject: [Numpy-discussion] Views of a different dtype
References: <1422695822.12798.14.camel@sebastian-t440> <1422955739.5533.42.camel@sebastian-t440> <1422982749.12912.5.camel@sebastian-t440>
Message-ID: 

On Tue Feb 03 2015 at 1:47:34 PM Jaime Fernández del Río <jaime.frio at gmail.com> wrote:
> On Tue, Feb 3, 2015 at 8:59 AM, Sebastian Berg wrote:
>> On Di, 2015-02-03 at 07:18 -0800, Jaime Fernández del Río wrote:
>> > Do you have a concrete example of what a non (1, 1) array that fails
>> > with relaxed strides would look like?
>> >
>> > If we used the array flags as a first choice point, as we do right now,
>> > and only tried to determine it from the strides/dimensions information
>> > when none is set, I fail to imagine any situation where the end result
would be worse than now. I don't think that a little bit of predictable surprise in an advanced functionality is too bad. We could start raising "in the face of ambiguity, we refuse to guess" errors, even for the current behavior you show above, but that is more likely to trip people up by not giving them any simple workaround, which it seems to me would be "add a .T if all dimensions are 1" in some particular situations. Or are you thinking of something more serious than a shape mismatch when you write about "breaking current code"?
>>
>> Yes, I am talking only about wrongly shaped results for some fortran order arrays. A (20, 1) fortran order complex array being viewed as float will, with relaxed strides, become a (20, 2) array instead of a (40, 1) one.
>
> That is a limitation of the current implementation too, and happens already whenever relaxed strides are in place. Which is the default for 1.10, right?
>
> Perhaps giving 'view' an 'order' or 'axis' kwarg could make sense after all? It should probably be more of a hint of what to do (fortran vs c) when in doubt. "C" would prioritize the last axis, "F" the first, and we could even add a "raise" option to have it fail if the axis cannot be inferred from the strides and shape. Current behavior is equivalent to what "C" would do.
>
> Jaime

IMHO, the best option would be something like this:

- When changing to a type with smaller itemsize, add a new axis after the others so the resulting array is C contiguous (unless a different axis is specified by a keyword argument). The advantage here is that if you index the new view using the old indices for an entry, you get an array showing its representation in the new type.
- No shape change for views with the same itemsize.
- When changing to a type with a larger itemsize, collapse along the last axis unless a different axis is specified, throwing an error if the collapse cannot be performed along that axis.
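The first bullet can be emulated today for a C-contiguous array; this is only a sketch of the proposed semantics (the automatic trailing axis is Ian's proposal, not current `view` behavior):

```python
import numpy as np

a = np.arange(6, dtype=np.complex128).reshape(2, 3)  # C-contiguous

# Emulate the proposal: view as the smaller type, then split the
# expansion off into a new trailing axis.
b = a.view(np.float64).reshape(a.shape + (-1,))
print(b.shape)  # -> (2, 3, 2)

# Indexing with the old indices now shows the new-type representation
# of a single original element: real and imaginary parts of a[0, 2].
print(b[0, 2])  # -> [2. 0.]
```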
The last point is essentially just adding an axis argument. I like that idea because it gives users the ability to do all that the array's memory layout allows in a clear and concise way. Throwing an error if the default axis doesn't work would be a good way to prevent strange bugs from happening when the default behavior is expected.

The first point would be a break in backwards compatibility, so I'm not sure if it's feasible at this point. The advantage would be that all arrays returned when using this functionality would be contiguous along the last axis. The shape of the new array would be independent of the memory layout of the original one. This would also be a much cleaner way to ensure that views of a different type are always reversible while still allowing for relaxed strides.

Either way, thanks for looking into this. It's a great feature to have available.

-Ian Henriksen

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion at scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From sturla.molden at gmail.com Tue Feb 3 23:14:09 2015
From: sturla.molden at gmail.com (Sturla Molden)
Date: Wed, 4 Feb 2015 04:14:09 +0000 (UTC)
Subject: [Numpy-discussion] Any interest in a 'heaviside' ufunc?
References: 
Message-ID: <1143124890444715758.373612sturla.molden-gmail.com@news.gmane.org>

Warren Weckesser wrote:
>                      0    if x < 0
>     heaviside(x) =   0.5  if x == 0
>                      1    if x > 0

This is not correct. The discrete form of the Heaviside step function has the value 1 for x == 0.
heaviside = lambda x : 1 - (x < 0).astype(int)

Sturla

From aron at ahmadia.net Tue Feb 3 23:41:30 2015
From: aron at ahmadia.net (Aron Ahmadia)
Date: Tue, 3 Feb 2015 23:41:30 -0500
Subject: [Numpy-discussion] Any interest in a 'heaviside' ufunc?
In-Reply-To: <1143124890444715758.373612sturla.molden-gmail.com@news.gmane.org>
References: <1143124890444715758.373612sturla.molden-gmail.com@news.gmane.org>
Message-ID: 

> This is not correct. The discrete form of the Heaviside step function
> has the value 1 for x == 0.

Yeah, I was looking at it and wondering if I'd misremembered the definition. Assuming you're implementing the discrete Heaviside function, H[0] = 1 as Sturla notes.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From warren.weckesser at gmail.com Wed Feb 4 00:18:51 2015
From: warren.weckesser at gmail.com (Warren Weckesser)
Date: Wed, 4 Feb 2015 00:18:51 -0500
Subject: [Numpy-discussion] Any interest in a 'heaviside' ufunc?
In-Reply-To: <1143124890444715758.373612sturla.molden-gmail.com@news.gmane.org>
References: <1143124890444715758.373612sturla.molden-gmail.com@news.gmane.org>
Message-ID: 

On Tue, Feb 3, 2015 at 11:14 PM, Sturla Molden wrote:
> This is not correct. The discrete form of the Heaviside step function
> has the value 1 for x == 0.
>
> heaviside = lambda x : 1 - (x < 0).astype(int)

By "discrete form", do you mean discrete time (i.e. a function defined on the integers)? Then I agree, the discrete time unit step function is defined as

    u(k) = 0    k < 0
           1    k >= 0

for integer k.

The domain of the proposed Heaviside function is not discrete; it is defined for arbitrary floating point (real) arguments. In this case, the choice heaviside(0) = 0.5 is a common convention.
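For concreteness, a minimal vectorized sketch of that convention (`heaviside` here is illustrative, not an existing NumPy function, and the `zero_value` argument is hypothetical):

```python
import numpy as np

def heaviside(x, zero_value=0.5):
    # 0 below zero, `zero_value` at exactly zero, 1 above zero
    x = np.asarray(x, dtype=float)
    return np.where(x > 0, 1.0, np.where(x == 0, zero_value, 0.0))

print(heaviside([-2.0, 0.0, 3.0]))       # values 0, 0.5, 1
print(heaviside([-2.0, 0.0, 3.0], 1.0))  # right-continuous convention
```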
See for example:

* http://mathworld.wolfram.com/HeavisideStepFunction.html
* http://www.mathworks.com/help/symbolic/heaviside.html
* http://en.wikipedia.org/wiki/Heaviside_step_function, in particular http://en.wikipedia.org/wiki/Heaviside_step_function#Zero_argument

Other common conventions are the right-continuous version that you prefer (heaviside(0) = 1), or the left-continuous version (heaviside(0) = 0). We can accommodate the alternatives with an additional argument that sets the value at 0:

    heaviside(x, zero_value=0.5)

Warren

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion at scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From jaime.frio at gmail.com Wed Feb 4 00:58:05 2015
From: jaime.frio at gmail.com (=?UTF-8?Q?Jaime_Fern=C3=A1ndez_del_R=C3=ADo?=)
Date: Tue, 3 Feb 2015 21:58:05 -0800
Subject: [Numpy-discussion] Any interest in a 'heaviside' ufunc?
In-Reply-To: 
References: 
Message-ID: 

On Tue, Feb 3, 2015 at 12:58 PM, Warren Weckesser <warren.weckesser at gmail.com> wrote:
> I have an implementation of the Heaviside function as a numpy ufunc. Is there any interest in adding this to numpy? The function is simply:
>
>                      0    if x < 0
>     heaviside(x) =   0.5  if x == 0
>                      1    if x > 0

I don't think there's anything like it in numpy. Wouldn't scipy.special be a better home for it?

Jaime

--
(\__/)
( O.o)
( > <) Este es Conejo. Copia a Conejo en tu firma y ayúdale en sus planes de dominación mundial.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From josef.pktd at gmail.com Wed Feb 4 01:02:26 2015
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Wed, 4 Feb 2015 01:02:26 -0500
Subject: [Numpy-discussion] Any interest in a 'heaviside' ufunc?
In-Reply-To: 
References: <1143124890444715758.373612sturla.molden-gmail.com@news.gmane.org>
Message-ID: 

On Wed, Feb 4, 2015 at 12:18 AM, Warren Weckesser <warren.weckesser at gmail.com> wrote:
> The domain of the proposed Heaviside function is not discrete; it is
> defined for arbitrary floating point (real) arguments. In this case,
> the choice heaviside(0) = 0.5 is a common convention.
>
> We can accommodate the alternatives with an additional argument that
> sets the value at 0:
>
>     heaviside(x, zero_value=0.5)

What's the use case for a heaviside function? I don't think I have needed one since I was using mathematica or maple.

    (x < 0).astype(...)
    (x <= 0).astype(...)
    np.sign(x, dtype)

look useful enough for most cases, or not?
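A quick sketch of those comparison-based equivalents (illustrative only; note that which convention you get at exactly zero differs between them):

```python
import numpy as np

x = np.array([-2.0, 0.0, 3.0])

step_right = (x >= 0).astype(float)  # 1.0 at zero (right-continuous)
step_left  = (x > 0).astype(float)   # 0.0 at zero (left-continuous)
half_step  = 0.5 * (np.sign(x) + 1)  # 0.5 at zero, since sign(0) == 0

print(step_right, step_left, half_step)
```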
(What I wish numpy had is a conditional place that doesn't calculate all the values. (I think there is a helper function in scipy.stats for that))

Josef

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion at scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From dskgair at gmail.com Wed Feb 4 02:22:48 2015
From: dskgair at gmail.com (David Kershaw)
Date: Wed, 4 Feb 2015 07:22:48 +0000 (UTC)
Subject: [Numpy-discussion] advanced indexing question
Message-ID: 

The numpy reference manual, array objects/indexing/advanced indexing, says:

    Advanced indexing always returns a copy of the data (contrast with
    basic slicing that returns a view).

If I run the following code:

    import numpy as np
    d = range(2)
    x = np.arange(36).reshape(3,2,3,2)
    y = x[:,d,:,d]
    y += 1
    print x
    x[:,d,:,d] += 1
    print x

then the first print x shows that x is unchanged, as it should be since y was a copy, not a view, but the second print x shows that all the elements of x with 1st index = 3rd index are now 1 bigger. Why did the left side of x[:,d,:,d]+=1 act like a view and not a copy?

Thanks,
David

From sebastian at sipsolutions.net Wed Feb 4 03:03:14 2015
From: sebastian at sipsolutions.net (Sebastian Berg)
Date: Wed, 04 Feb 2015 09:03:14 +0100
Subject: [Numpy-discussion] advanced indexing question
In-Reply-To: 
References: 
Message-ID: <1423036994.10576.1.camel@sebastian-t440>

On Mi, 2015-02-04 at 07:22 +0000, David Kershaw wrote:
> The numpy reference manual, array objects/indexing/advanced indexing, says:
>
>     Advanced indexing always returns a copy of the data (contrast with
>     basic slicing that returns a view).
> If I run the following code:
>
>     import numpy as np
>     d = range(2)
>     x = np.arange(36).reshape(3,2,3,2)
>     y = x[:,d,:,d]
>     y += 1
>     print x
>     x[:,d,:,d] += 1
>     print x
>
> then the first print x shows that x is unchanged as it should be since y
> was a copy, not a view, but the second print x shows that all the elements
> of x with 1st index = 3rd index are now 1 bigger. Why did the left side of
> x[:,d,:,d]+=1 act like a view and not a copy?

Python has a mechanism both for getting an item and for setting an item. The latter will end up doing this (python already does this for us):

    x[:,d,:,d] = x[:,d,:,d] + 1

so there is an item assignment going on (__setitem__ not __getitem__).

- Sebastian

> Thanks,
> David

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion at scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: This is a digitally signed message part
URL: 

From dskgair at gmail.com Wed Feb 4 04:15:49 2015
From: dskgair at gmail.com (David Kershaw)
Date: Wed, 4 Feb 2015 09:15:49 +0000 (UTC)
Subject: [Numpy-discussion] advanced indexing question
References: <1423036994.10576.1.camel@sebastian-t440>
Message-ID: 

Sebastian Berg <sebastian at sipsolutions.net> writes:
> The latter will end up doing this (python already does this for us): > x[:,d,:,d] = x[:,d,:,d] + 1 > so there is an item assignment going on (__setitem__ not __getitem__) > > - Sebastian > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > Thanks for the prompt help Sebastian, So can I use any legitimate ndarray indexing selection object, obj, in x.__setitem__(obj,y) and as long as y's shape can be broadcast to x[obj]'s shape it will always set the appropriate elements of x to the corresponding elements of y? From sturla.molden at gmail.com Wed Feb 4 05:05:58 2015 From: sturla.molden at gmail.com (Sturla Molden) Date: Wed, 04 Feb 2015 11:05:58 +0100 Subject: [Numpy-discussion] Any interest in a 'heaviside' ufunc? In-Reply-To: References: <1143124890444715758.373612sturla.molden-gmail.com@news.gmane.org> Message-ID: On 04/02/15 06:18, Warren Weckesser wrote: > By "discrete form", do you mean discrete time (i.e. a function defined > on the integers)? Then I agree, the discrete time unit step function is > defined as It is the cumulative integral of the delta function, and thus it can never obtain the value 0.5. The delta function is defined to have an integral of 0 or 1. Sturla From davidmenhur at gmail.com Wed Feb 4 05:45:25 2015 From: davidmenhur at gmail.com (=?UTF-8?B?RGHPgGlk?=) Date: Wed, 4 Feb 2015 11:45:25 +0100 Subject: [Numpy-discussion] Any interest in a 'heaviside' ufunc? In-Reply-To: References: <1143124890444715758.373612sturla.molden-gmail.com@news.gmane.org> Message-ID: On 4 February 2015 at 11:05, Sturla Molden wrote: > On 04/02/15 06:18, Warren Weckesser wrote: > >> By "discrete form", do you mean discrete time (i.e. a function defined >> on the integers)? Then I agree, the discrete time unit step function is >> defined as > > It is the cumulative integral of the delta function, and thus it can > never obtain the value 0.5. 
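Returning to the advanced-indexing exchange above: Sebastian's __getitem__/__setitem__ explanation can be checked with a short self-contained script (d is written as an explicit list here; the shapes and sums are worked out in the comments):

```python
import numpy as np

d = [0, 1]
x = np.arange(36).reshape(3, 2, 3, 2)

# __getitem__: two advanced indices separated by a slice produce a copy,
# with the broadcast dimension moved to the front, so y has shape (2, 3, 3).
y = x[:, d, :, d]
y += 1
assert x.sum() == np.arange(36).sum()  # x is untouched by modifying the copy

# __setitem__: Python rewrites the augmented assignment as
# x[:, d, :, d] = x[:, d, :, d] + 1, so x itself is updated in place.
x[:, d, :, d] += 1
assert x.sum() == np.arange(36).sum() + 18  # the 18 selected elements grew by 1
```

This confirms the explanation: the getitem expression returns a copy, while the same index expression on the left-hand side of an assignment mutates the original array.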
The delta function is defined to have an > integral of 0 or 1. > > Sturla There are several definitions. Abramowitz and Stegun (http://people.math.sfu.ca/~cbm/aands/page_1020.htm) assign the value 0.5 at x=0. It can also be defined as: H(x) = 1/2 * (1 + sign(x)) Where sign(0) = 0, and therefore H(0) = 1/2. Actually, Heaviside function is better seen as a distribution instead of a function, and then there is no problem with the value at 0, as long as it is finite. From jaime.frio at gmail.com Wed Feb 4 10:13:15 2015 From: jaime.frio at gmail.com (=?UTF-8?Q?Jaime_Fern=C3=A1ndez_del_R=C3=ADo?=) Date: Wed, 4 Feb 2015 07:13:15 -0800 Subject: [Numpy-discussion] Views of a different dtype In-Reply-To: References: <1422695822.12798.14.camel@sebastian-t440> <1422955739.5533.42.camel@sebastian-t440> <1422982749.12912.5.camel@sebastian-t440> Message-ID: On Tue, Feb 3, 2015 at 1:52 PM, Ian Henriksen < insertinterestingnamehere at gmail.com> wrote: > On Tue Feb 03 2015 at 1:47:34 PM Jaime Fern?ndez del R?o < > jaime.frio at gmail.com> wrote: > >> On Tue, Feb 3, 2015 at 8:59 AM, Sebastian Berg < >> sebastian at sipsolutions.net> wrote: >> >>> On Di, 2015-02-03 at 07:18 -0800, Jaime Fern?ndez del R?o wrote: >>> > >>> >>> > >>> > >>> > >>> > Do you have a concrete example of what a non (1, 1) array that fails >>> > with relaxed strides would look like? >>> > >>> > >>> > If we used, as right now, the array flags as a first choice point, and >>> > only if none is set try to determine it from the strides/dimensions >>> > information, I fail to imagine any situation where the end result >>> > would be worse than now. I don't think that a little bit of >>> > predictable surprising in an advanced functionality is too bad. 
We >>> > could start raising "on the face of ambiguity, we refuse to guess" >>> > errors, even for the current behavior you show above, but that is more >>> > likely to trip people by not giving them any simple workaround, that >>> > it seems to me would be "add a .T if all dimensions are 1" in some >>> > particular situations. Or are you thinking of something more serious >>> > than a shape mismatch when you write about "breaking current code"? >>> > >>> >>> Yes, I am talking only about wrongly shaped results for some fortran >>> order arrays. A (20, 1) fortran order complex array being viewed as >>> float, will with relaxed strides become a (20, 2) array instead of a >>> (40, 1) one. >>> >> >> That is a limitation of the current implementation too, and happens >> already whenever relaxed strides are in place. Which is the default for >> 1.10, right? >> >> Perhaps giving 'view' an 'order' or 'axis' kwarg could make sense after >> all? It should probably be more of a hint of what to do (fortran vs c) when >> in doubt. "C" would prioritize last axis, "F" the first, and we could even >> add a "raise" option to have it fail if the axis cannot be inferred from >> the strides and shape. Current behavior is equivalent to what "C" >> would do. >> >> Jaime >> > > > IMHO, the best option would be something like this: > > - When changing to a type with smaller itemsize, add a new axis after the > others so the resulting array is C contiguous (unless a different axis is > specified by a keyword argument). The advantage here is that if you index > the new view using the old indices for an entry, you get an array showing > its representation in the new type. > - No shape change for views with the same itemsize > - When changing to a type with a larger itemsize, collapse along the last > axis unless a different axis is specified, throwing an error if collapsing > along the specified axis is not possible.
> My only concern with adding a new axis, backwards compatibility aside, is that you would not know whether to keep or discard the resulting size 1 dimension when taking a view of the view. We could reuse the keepdims terminology from ufuncs, though. So the current behavior would remain unchanged if you set axis=None, keepdims=True, and we could transition to a more reasonable default with axis=-1, keepdims=False over a few releases with adequate warnings. In my mind, when expanding, the axis would not indicate where to place the new axis, but which axis to collapse over; that is the hard part to figure out! If you want something else, it is only a call to rollaxis away. Perhaps we could also give keepdims a meaning when expanding, as to whether to add a new axis at the end, or change the size of the chosen dimension. I don't know, there may be an actual interface hidden somewhere here, but it still needs some cooking to fully figure it out. Jaime > > The last point essentially is just adding an axis argument. I like that > idea because it gives users the ability to do all that the array's memory > layout allows in a clear and concise way. Throwing an error if the default > axis doesn't work would be a good way to prevent strange bugs from > happening when the default behavior is expected. > > The first point would be a break in backwards compatibility, so I'm not > sure if it's feasible at this point. The advantage would be that all > arrays returned when using this functionality would be contiguous along the > last axis. The shape of the new array would be independent of the memory > layout of the original one. This would also be a much cleaner way to ensure > that views of a different type are always reversible while still allowing > for relaxed strides. > > Either way, thanks for looking into this. It's a great feature to have > available.
> > -Ian Henriksen > > >> >> >> >>> > >>> > If there are any real loopholes in expanding this functionality, then >>> > lets not do it, but we know we have at least one user unsatisfied with >>> > the current performance, so I really think it is worth trying. Plus, >>> > I'll admit to that, messing around with some of these stuff deep >>> > inside the guts of the beast is lots of fun! ;) >>> > >>> >>> I do not think there are loopholes with expanding this functionality. I >>> think there have regressions when we put relaxed strides to on, because >>> suddenly the fortran order array might be expanded along a 1-sized axis, >>> because it is also C order. So I wonder if we can fix these regressions >>> and at the same time maybe provide a more intuitive approach then using >>> the memory order blindly.... >>> >> >>> - Sebastian >>> >>> > >>> > Jaime >>> > >>> > >>> > - Sebastian >>> > >>> > >>> > > >>> > > Jaime >>> > > >>> > > >>> > > >>> > > - Sebastian >>> > > >>> > > > >>> > > > I guess the main consideration for this is >>> > that we >>> > > may be >>> > > > stuck with >>> > > > stuff b/c of backwards compatibility. Can >>> > you maybe >>> > > say a >>> > > > little bit >>> > > > about what is allowed now, and what >>> > constraints that >>> > > puts on >>> > > > things? >>> > > > E.g. are we already grovelling around in >>> > strides and >>> > > picking >>> > > > random >>> > > > dimensions in some cases? >>> > > > >>> > > > >>> > > > Just to restate it: right now we only allow new >>> > views if the >>> > > array is >>> > > > globally contiguous, so either along the first or >>> > last >>> > > dimension. >>> > > > >>> > > > >>> > > > Jaime >>> > > > >>> > > > >>> > > > -n >>> > > > >>> > > > -- >>> > > > Nathaniel J. 
Smith >>> > > > Postdoctoral researcher - Informatics - University of Edinburgh >>> > > > http://vorpus.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From cjw at ncf.ca Wed Feb 4 14:51:19 2015 From: cjw at ncf.ca (Colin J. Williams) Date: Wed, 04 Feb 2015 14:51:19 -0500 Subject: [Numpy-discussion] Characteristic of a Matrix. In-Reply-To: References: <54AB3588.1090902@ncf.ca> <54AC849C.20902@ncf.ca> Message-ID: <54D27837.9000608@ncf.ca> On 06/01/2015 8:38 PM, Alexander Belopolsky wrote: > > On Tue, Jan 6, 2015 at 8:20 PM, Nathaniel Smith > wrote: > > > Since matrices are now part of some high school curricula, I urge that they > > be treated appropriately in Numpy. Further, I suggest that > consideration be > > given to establishing V and VT sub-classes, to cover vectors and > transposed > > vectors. > > The numpy devs don't really have the interest or the skills to create > a great library for pedagogical use in high schools.
If you're > interested in an interface like this, then I'd suggest creating a new > package focused specifically on that (which might use numpy > internally). There's really no advantage in glomming this into numpy > proper. > > > Sorry for taking this further off-topic, but I recently discovered an > excellent SAGE package, . While it's > targeted audience includes math graduate students and research > mathematicians, parts of it are accessible to schoolchildren. SAGE is > written in Python and integrates a number of packages including numpy. My remark about high school was intended to emphasise that matrix algebra is an essential part of linear algebra. Numpy has not fully developed this part. I feel that Guido may not have fully understood the availability of the Matrix class when he approved the reliance on dot(). > > I would highly recommend to anyone interested in using Python for > education to take a look at SAGE. Thanks Alexander, I'll do that. It looks excellent, but it seems that the University of Washington has funding problems and does not appear to have the crew of volunteers that Python has. Regards, Colin W. > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From cjw at ncf.ca Wed Feb 4 15:07:23 2015 From: cjw at ncf.ca (Colin J. Williams) Date: Wed, 04 Feb 2015 15:07:23 -0500 Subject: [Numpy-discussion] Characteristic of a Matrix. In-Reply-To: References: <54AB3588.1090902@ncf.ca> <54AC849C.20902@ncf.ca> <54AD8CAF.5090207@ncf.ca> Message-ID: <54D27BFB.3000800@ncf.ca> On 08/01/2015 1:19 PM, Ryan Nelson wrote: > Colin, > > I'll second the endorsement of Sage; however, for teaching purposes, I > would suggest Sage Math Cloud. 
It is a free, web-based version of > Sage, and it does not require you or the students to install any > software (besides a new-ish web browser). It also make > sharing/collaborative work quite easy as well. I've used this a bit > for demos, and it's great. The author William Stein is good at > correcting bugs/issues very quickly. > > Sage implements it's own Matrix and Vector classes, and the Vector > class has a "column" method that returns a column vector (transpose). > http://www.sagemath.org/doc/tutorial/tour_linalg.html > > For what it's worth, I agree with others about the benefits of > avoiding a Matrix class in Numpy. In my experience, it certainly makes > things cleaner in larger projects when I always use NDArray and just > call the appropriate linear algebra functions (e.g. np.dot, etc) when > that is context I need. > > Anyway, just my two cents. > > Ryan Ryan, Thanks. I agree that Sage Math Cloud seems the better way to go for students. However your preference for the dot() world may be because the Numpy Matrix Class is inadequately developed. I'm not suggesting that development, at this time, but proposing that the errors I referenced be considered as bugs. Colin W. > > On Wed, Jan 7, 2015 at 2:44 PM, cjw > > wrote: > > Thanks Alexander, > > I'll look at Sage. > > Colin W. > > > On 06-Jan-15 8:38 PM, Alexander Belopolsky wrote: >> On Tue, Jan 6, 2015 at 8:20 PM, Nathaniel Smith wrote: >> >>>> Since matrices are now part of some high school curricula, I urge that >>> they >>>> be treated appropriately in Numpy. Further, I suggest that >>> consideration be >>>> given to establishing V and VT sub-classes, to cover vectors and >>> transposed >>>> vectors. >>> The numpy devs don't really have the interest or the skills to create >>> a great library for pedagogical use in high schools. If you're >>> interested in an interface like this, then I'd suggest creating a new >>> package focused specifically on that (which might use numpy >>> internally). 
There's really no advantage in glomming this into numpy >>> proper. >> Sorry for taking this further off-topic, but I recently discovered an >> excellent SAGE package, . While its targeted >> audience includes math graduate students and research mathematicians, parts >> of it are accessible to schoolchildren. SAGE is written in Python and >> integrates a number of packages including numpy. >> >> I would highly recommend to anyone interested in using Python for education >> to take a look at SAGE. >> >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From shoyer at gmail.com Thu Feb 5 14:06:17 2015 From: shoyer at gmail.com (Stephan Hoyer) Date: Thu, 5 Feb 2015 11:06:17 -0800 Subject: [Numpy-discussion] New function: np.stack? Message-ID: There are two usual ways to combine a sequence of arrays into a new array: 1. concatenated along an existing axis 2. stacked along a new axis For 1, we have np.concatenate. For 2, we have np.vstack, np.hstack, np.dstack and np.column_stack. For arrays with arbitrary dimensions, there is the np.array constructor, possibly with transpose to get the result in the correct order. (I've used this last option in the past but haven't been especially happy with it -- it takes some trial and error to get the axis swapping or transpose right for higher dimensional input.) These methods are similar but subtly distinct, and none of them generalize well to n-dimensional input.
It seems like the function we are missing is the plain np.stack, which takes the axis to stack along as a keyword argument. The exact desired functionality is clearest to understand by example: >>> X = [np.random.randn(100, 200) for i in range(10)] >>> stack(X, axis=0).shape (10, 100, 200) >>> stack(X, axis=1).shape (100, 10, 200) >>> stack(X, axis=2).shape (100, 200, 10) So I'd like to propose this new function for numpy. The desired signature would be simply np.stack(arrays, axis=0). Ideally, the confusing mess of other stacking functions could then be deprecated, though we could probably never remove them. Matthew Rocklin recently wrote an out-of-core version of this for his dask project (part of Blaze), which is what got me thinking about the need for this functionality: https://github.com/ContinuumIO/dask/pull/30 Cheers, Stephan -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben.root at ou.edu Thu Feb 5 14:10:10 2015 From: ben.root at ou.edu (Benjamin Root) Date: Thu, 5 Feb 2015 14:10:10 -0500 Subject: [Numpy-discussion] New function: np.stack? In-Reply-To: References: Message-ID: +1! I could never keep straight which stack function I needed anyway.
(I've used this last option in the past but haven't been > especially happy with it -- it takes some trial and error to get the axis > swapping or transpose right for higher dimensional input.) > > This methods are similar but subtly distinct, and none of them generalize > well to n-dimensional input. It seems like the function we are missing is > the plain np.stack, which takes the axis to stack along as a keyword > argument. The exact desired functionality is clearest to understand by > example: > > >>> X = [np.random.randn(100, 200) for i in range(10)] > >>> stack(X, axis=0).shape > (10, 100, 200) > >>> stack(X, axis=1).shape > (100, 10, 200) > >>> stack(X, axis=2).shape > (100, 200, 10) > > So I'd like to propose this new function for numpy. The desired signature > would be simply np.stack(arrays, axis=0). Ideally, the confusing mess of > other stacking functions could then be deprecated, though we could probably > never remove them. > > Matthew Rocklin recent wrote an out of core version this for his dask > project (part of Blaze), which is what got me thinking about the need for > this functionality: > https://github.com/ContinuumIO/dask/pull/30 > > Cheers, > Stephan > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Thu Feb 5 15:15:14 2015 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 5 Feb 2015 15:15:14 -0500 Subject: [Numpy-discussion] suggestion: improve text of failing test Message-ID: The assert_allclose text is not precise enough to be helpful to fix a test failure that cannot be replicated on every machine, and we cannot just quickly grab --pdb-failures. By how much do I have to lower the precision to make it pass on this continuous integration machine? 
assert_allclose(he, hefd, rtol=5e-10) File "C:\Python27\envs\py3\lib\site-packages\numpy\testing\utils.py", line 1297, in assert_allclose verbose=verbose, header=header) File "C:\Python27\envs\py3\lib\site-packages\numpy\testing\utils.py", line 665, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=5e-10, atol=0 (mismatch 100.0%) x: array([[ -2.965667e+01, -1.988865e+02, -2.370194e+00, -1.003654e+01], [ -1.988865e+02, -1.383377e+03, -1.592292e+01, -6.800266e+01], [ -2.370194e+00, -1.592292e+01, -8.301699e-01, -8.301699e-01], [ -1.003654e+01, -6.800266e+01, -8.301699e-01, -3.449885e+00]]) y: array([[ -2.965667e+01, -1.988865e+02, -2.370194e+00, -1.003654e+01], [ -1.988865e+02, -1.383377e+03, -1.592292e+01, -6.800266e+01], [ -2.370194e+00, -1.592292e+01, -8.301699e-01, -8.301699e-01], [ -1.003654e+01, -6.800266e+01, -8.301699e-01, -3.449885e+00]]) the suggestion is to add rtol and atol to the mismatch summary, so we can see if it's just a precision issue or something serious rtol = np.max(np.abs(x / y - 1)) atol = np.max(np.abs(x - y)) (mismatch 100.0% rtol=xxx atol=xxx) (and as an aside to the "all close" discussion: I do set the tolerances very carefully especially if the agreement with comparison numbers is below 1e-6 or so) Josef -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Thu Feb 5 15:39:27 2015 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 5 Feb 2015 12:39:27 -0800 Subject: [Numpy-discussion] suggestion: improve text of failing test In-Reply-To: References: Message-ID: On 5 Feb 2015 12:15, wrote: > > The assert_allclose text is not precise enough to be helpful to fix a test failure that cannot be replicated on every machine, and we cannot just quickly grab --pdb-failures. > > By how much do I have to lower the precision to make it pass on this continuous integration machine?
> > > assert_allclose(he, hefd, rtol=5e-10) > File "C:\Python27\envs\py3\lib\site-packages\numpy\testing\utils.py", line 1297, in assert_allclose > verbose=verbose, header=header) > File "C:\Python27\envs\py3\lib\site-packages\numpy\testing\utils.py", line 665, in assert_array_compare > raise AssertionError(msg) > AssertionError: > Not equal to tolerance rtol=5e-10, atol=0 > > (mismatch 100.0%) > x: array([[ -2.965667e+01, -1.988865e+02, -2.370194e+00, -1.003654e+01], > [ -1.988865e+02, -1.383377e+03, -1.592292e+01, -6.800266e+01], > [ -2.370194e+00, -1.592292e+01, -8.301699e-01, -8.301699e-01], > [ -1.003654e+01, -6.800266e+01, -8.301699e-01, -3.449885e+00]]) > y: array([[ -2.965667e+01, -1.988865e+02, -2.370194e+00, -1.003654e+01], > [ -1.988865e+02, -1.383377e+03, -1.592292e+01, -6.800266e+01], > [ -2.370194e+00, -1.592292e+01, -8.301699e-01, -8.301699e-01], > [ -1.003654e+01, -6.800266e+01, -8.301699e-01, -3.449885e+00]]) > > > the suggestion is to add rtol and atol to the mismatch summary, so we can see if it's just a precision issue or something serious > > rtol = np.max(np.abs(x / y - 1) > atol = np.max(np.abs(x - y) > > (mismatch 100.0% rtol=xxx atol=xxx) So basically just printing what rtol and/or atol would have to be to make the test pass? Sounds useful to me. (There is a bit of an infelicity in that if you're using both atol and rtol in the same test then there's no easy way to suggest how to fix both simultaneously, but I'm not sure how to fix that. Maybe we should also print max(abs(x[y == 0]))?) Want to submit a pull request? -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From jaime.frio at gmail.com Thu Feb 5 16:07:36 2015 From: jaime.frio at gmail.com (=?UTF-8?Q?Jaime_Fern=C3=A1ndez_del_R=C3=ADo?=) Date: Thu, 5 Feb 2015 13:07:36 -0800 Subject: [Numpy-discussion] New function: np.stack? In-Reply-To: References: Message-ID: On Thu, Feb 5, 2015 at 11:10 AM, Benjamin Root wrote: > +1! 
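To make the suggestion from the assert_allclose thread concrete: a small helper along the lines Josef sketches (the name mismatch_summary is hypothetical; nothing like it existed in numpy.testing at the time) that reports the smallest rtol and atol which would individually make the comparison pass:

```python
import numpy as np

def mismatch_summary(x, y):
    """Return the smallest rtol and atol that would individually make an
    allclose-style check pass (ignoring the combined rtol/atol case)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # guard against division by zero when y contains exact zeros
    with np.errstate(divide="ignore", invalid="ignore"):
        rtol = np.max(np.abs(x / y - 1))
    atol = np.max(np.abs(x - y))
    return rtol, atol

x = np.array([1.0, 2.0, 3.0])
y = x * (1 + 2e-10)  # relative error of about 2e-10
rtol, atol = mismatch_summary(x, y)
print("mismatch rtol=%g atol=%g" % (rtol, atol))
```

A failure message extended with these two numbers would immediately show whether a CI failure is a last-digit precision wobble or a genuinely wrong result.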
+1! I could never keep straight which stack function I needed anyway. > > Wasn't there a proposal a while back for a more generic stacker, like > "tetrix" or something that allowed one to piece together tiles of different > sizes? > > Ben Root > > On Thu, Feb 5, 2015 at 2:06 PM, Stephan Hoyer wrote: > >> There are two usual ways to combine a sequence of arrays into a new array: >> 1. concatenated along an existing axis >> 2. stacked along a new axis >> >> For 1, we have np.concatenate. For 2, we have np.vstack, np.hstack, >> np.dstack and np.column_stack. For arrays with arbitrary dimensions, there >> is the np.array constructor, possibly with transpose to get the result in >> the correct order. (I've used this last option in the past but haven't been >> especially happy with it -- it takes some trial and error to get the axis >> swapping or transpose right for higher dimensional input.) >> >> These methods are similar but subtly distinct, and none of them generalize >> well to n-dimensional input. It seems like the function we are missing is >> the plain np.stack, which takes the axis to stack along as a keyword >> argument. The exact desired functionality is clearest to understand by >> example: >> >> >>> X = [np.random.randn(100, 200) for i in range(10)] >> >>> stack(X, axis=0).shape >> (10, 100, 200) >> >>> stack(X, axis=1).shape >> (100, 10, 200) >> >>> stack(X, axis=2).shape >> (100, 200, 10) >> >> So I'd like to propose this new function for numpy. The desired signature >> would be simply np.stack(arrays, axis=0). Ideally, the confusing mess of >> other stacking functions could then be deprecated, though we could probably >> never remove them. >> > Leaving aside error checking, once you have a positive axis, I think this can be implemented in 2 lines of code: sl = (slice(None),)*axis + (np.newaxis,) return np.concatenate([arr[sl] for arr in arrays], axis=axis) I don't have an opinion either way, and I guess if the hstacks and company have a place in numpy, this does as well.
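A runnable version of the approach Jaime outlines above (the function name stack and the negative-axis handling here are illustrative; no such function existed in numpy at the time of this thread):

```python
import numpy as np

def stack(arrays, axis=0):
    """Stack a sequence of same-shaped arrays along a new axis."""
    arrays = [np.asarray(arr) for arr in arrays]
    if axis < 0:
        # allow negative axis like other numpy functions do
        axis += arrays[0].ndim + 1
    # insert a length-1 axis in each input, then concatenate along it
    sl = (slice(None),) * axis + (np.newaxis,)
    return np.concatenate([arr[sl] for arr in arrays], axis=axis)

X = [np.random.randn(100, 200) for i in range(10)]
print(stack(X, axis=0).shape)  # (10, 100, 200)
print(stack(X, axis=1).shape)  # (100, 10, 200)
print(stack(X, axis=2).shape)  # (100, 200, 10)
```

The concatenation axis must match the position of the inserted newaxis, otherwise the inputs would be joined along axis 0 regardless of the requested axis.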
Jaime -- (\__/) ( O.o) ( > <) Este es Conejo. Copia a Conejo en tu firma y ayúdale en sus planes de dominación mundial. -------------- next part -------------- An HTML attachment was scrubbed... URL: From alex.eberspaecher at gmail.com Thu Feb 5 16:35:51 2015 From: alex.eberspaecher at gmail.com (Alexander Eberspächer) Date: Thu, 05 Feb 2015 22:35:51 +0100 Subject: [Numpy-discussion] Any interest in a 'heaviside' ufunc? In-Reply-To: References: <1143124890444715758.373612sturla.molden-gmail.com@news.gmane.org> Message-ID: <54D3E237.3020304@gmail.com> On 04.02.2015 11:45, Daπid wrote: > There are several definitions. Abramowitz and Stegun > (http://people.math.sfu.ca/~cbm/aands/page_1020.htm) assign the value > 0.5 at x=0. The NIST handbook uses the value 0 at x=0. Perhaps a Heaviside with an optional argument that defines the value at x=0 would be good. I'd love to see that in NumPy. > Actually, Heaviside function is better seen as a distribution instead > of a function, and then there is no problem with the value at 0, as > long as it is finite. Understanding a distribution as the limit of a sequence of functions, the value at x=0 then depends on the choice of function in the sequence, I guess. Using something symmetrical such as a Gaussian or a centred box then makes the value of 0.5 plausible. Alex From josef.pktd at gmail.com Thu Feb 5 17:16:13 2015 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 5 Feb 2015 17:16:13 -0500 Subject: [Numpy-discussion] suggestion: improve text of failing test In-Reply-To: References: Message-ID: On Thu, Feb 5, 2015 at 3:39 PM, Nathaniel Smith wrote: > On 5 Feb 2015 12:15, wrote: > > > > The assert_allclose text is not precise enough to be helpful to fix a > test failure that cannot be replicated on every machine, and we cannot just > quickly grab --pdb-failures. > > > > By how much do I have to lower the precision to make it pass on this > continuous integration machine?
> > > > > > assert_allclose(he, hefd, rtol=5e-10) > > File "C:\Python27\envs\py3\lib\site-packages\numpy\testing\utils.py", > line 1297, in assert_allclose > > verbose=verbose, header=header) > > File "C:\Python27\envs\py3\lib\site-packages\numpy\testing\utils.py", > line 665, in assert_array_compare > > raise AssertionError(msg) > > AssertionError: > > Not equal to tolerance rtol=5e-10, atol=0 > > > > (mismatch 100.0%) > > x: array([[ -2.965667e+01, -1.988865e+02, -2.370194e+00, > -1.003654e+01], > > [ -1.988865e+02, -1.383377e+03, -1.592292e+01, -6.800266e+01], > > [ -2.370194e+00, -1.592292e+01, -8.301699e-01, -8.301699e-01], > > [ -1.003654e+01, -6.800266e+01, -8.301699e-01, -3.449885e+00]]) > > y: array([[ -2.965667e+01, -1.988865e+02, -2.370194e+00, > -1.003654e+01], > > [ -1.988865e+02, -1.383377e+03, -1.592292e+01, -6.800266e+01], > > [ -2.370194e+00, -1.592292e+01, -8.301699e-01, -8.301699e-01], > > [ -1.003654e+01, -6.800266e+01, -8.301699e-01, -3.449885e+00]]) > > > > > > the suggestion is to add rtol and atol to the mismatch summary, so we > can see if it's just a precision issue or something serious > > > > rtol = np.max(np.abs(x / y - 1) > > atol = np.max(np.abs(x - y) > > > > (mismatch 100.0% rtol=xxx atol=xxx) > > So basically just printing what rtol and/or atol would have to be to make > the test pass? Sounds useful to me. (There is a bit of an infelicity in > that if you're using both atol and rtol in the same test then there's no > easy way to suggest how to fix both simultaneously, but I'm not sure how to > fix that. Maybe we should also print max(abs(x[y == 0]))?) > I usually check the rtol and atol as above in pdb on failure, Most of the time it's enough information to figure out how to twist the numbers. There are only a few cases where I'm fine tuning both rtol and atol at the same time. I guess there is the sum of the tol from the definition of allclose. 
We don't have many cases with y == 0 mixed together with large numbers, because our reference numbers usually also have numerical noise. One point is also to just make the test output more informative, to see if the test machine is just a bit off even if the mismatch is 100%. > Want to submit a pull request? > Not really, I'd rather stick to my corner and let someone else get on the numpy contributor list :) (header was "suggestion" not "proposal") Thanks, Josef > -n > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pierre.haessig at crans.org Fri Feb 6 05:01:14 2015 From: pierre.haessig at crans.org (Pierre Haessig) Date: Fri, 06 Feb 2015 11:01:14 +0100 Subject: [Numpy-discussion] Any interest in a 'heaviside' ufunc? In-Reply-To: References: Message-ID: <54D490EA.7070700@crans.org> On 04/02/2015 06:58, Jaime Fernández del Río wrote: > > I have an implementation of the Heaviside function as a numpy > ufunc. Is there any interest in adding this to numpy? The > function is simply: > > 0 if x < 0 > heaviside(x) = 0.5 if x == 0 > 1 if x > 0 > > > I don't think there's anything like it in numpy. Wouldn't > scipy.special be a better home for it? scipy.signal could also host it, since it already contains functions for linear systems (e.g. step response, which are closely related), and also some waveform generators like square() http://docs.scipy.org/doc/scipy-0.14.0/reference/signal.html However, I agree with Josef when he says that this function is a bit thin. best, Pierre -------------- next part -------------- An HTML attachment was scrubbed...
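To make the proposal concrete, the piecewise definition quoted above, extended with the optional value at x == 0 that was suggested earlier in the thread, could be sketched like this (the function body and its `zero` keyword are a hypothetical illustration, not an existing numpy or scipy API):

```python
import numpy as np

def heaviside(x, zero=0.5):
    """Heaviside step function; `zero` is the value returned at x == 0
    (0.5 per Abramowitz & Stegun, 0 per the NIST handbook)."""
    x = np.asarray(x, dtype=float)
    # 1 for positive inputs, 0 for negative, `zero` exactly at zero.
    return np.where(x > 0, 1.0, np.where(x < 0, 0.0, zero))
```

A real ufunc implementation would live in C, but the semantics would match this sketch.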
URL: From domors at gmx.net Sun Feb 8 16:03:58 2015 From: domors at gmx.net (Stefan Reiterer) Date: Sun, 8 Feb 2015 22:03:58 +0100 Subject: [Numpy-discussion] Silent Broadcasting considered harmful Message-ID: An HTML attachment was scrubbed... URL: From hoogendoorn.eelco at gmail.com Sun Feb 8 16:08:58 2015 From: hoogendoorn.eelco at gmail.com (Eelco Hoogendoorn) Date: Sun, 8 Feb 2015 22:08:58 +0100 Subject: [Numpy-discussion] Silent Broadcasting considered harmful In-Reply-To: References: Message-ID: > I have personally used Octave and/or Numpy for several years now and never ever needed broadcasting. But since it is still there there will be many users who need it, there will be some use for it. Uhm, yeah, there is some use for it. I'm all for explicit over implicit, but personally current broadcasting rules have never bothered me, certainly not to the extent of justifying massive backwards compatibility violations. Take it from someone who relies on broadcasting for every other line of code. On Sun, Feb 8, 2015 at 10:03 PM, Stefan Reiterer wrote: > Hi! > > As shortly discussed on github: > https://github.com/numpy/numpy/issues/5541 > > I personally think that silent broadcasting is not a good thing. I had > recently a lot > of trouble with row and column vectors which got broadcast together > although it was > more annoying than useful, especially since I had to search deep down into > the code to find out > that the problem was nothing else than broadcasting... > > I have personally used Octave and/or Numpy for several years now and never > ever needed broadcasting. > But since it is still there there will be many users who need it, there > will be some use for it. > > So I suggest that the best would be to throw warnings when arrays get > broadcast, like > Octave does. Python warnings can be caught and handled, that would be a > great benefit.
> > Another idea would be to provide warning levels for broadcasting, e.g. > 0 = Never, 1 = Warn once, 2 = Warn always, 3 = Forbid, i.e. throw an exception, > with 0 as default. > This would avoid breaking other code, and give the user some control over > broadcasting. > > Kind regards, > Stefan > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From domors at gmx.net Sun Feb 8 16:14:24 2015 From: domors at gmx.net (Stefan Reiterer) Date: Sun, 8 Feb 2015 22:14:24 +0100 Subject: [Numpy-discussion] Silent Broadcasting considered harmful In-Reply-To: References: , Message-ID: An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Sun Feb 8 16:15:56 2015 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sun, 8 Feb 2015 16:15:56 -0500 Subject: [Numpy-discussion] Silent Broadcasting considered harmful In-Reply-To: References: Message-ID: On Sun, Feb 8, 2015 at 4:08 PM, Eelco Hoogendoorn wrote: >> I personally use Octave and/or Numpy for several years now and never ever >> needed braodcasting. > But since it is still there there will be many users who need it, there will > be some use for it. > > Uhm, yeah, there is some use for it. Im all for explicit over implicit, but > personally current broadcasting rules have never bothered me, certainly not > to the extent of justifying massive backwards compatibility violations. Take > It from someone who relies on broadcasting for every other line of code. > > > On Sun, Feb 8, 2015 at 10:03 PM, Stefan Reiterer wrote: >> >> Hi! >> >> As shortly discussed on github: >> https://github.com/numpy/numpy/issues/5541 >> >> I personally think that silent Broadcasting is not a good thing.
I had >> recently a lot >> of trouble with row and column vectors which got bradcastet toghether >> altough it was >> more annoying than useful, especially since I had to search deep down into >> the code to find out >> that the problem was nothing else than Broadcasting... >> >> I personally use Octave and/or Numpy for several years now and never ever >> needed braodcasting. >> But since it is still there there will be many users who need it, there >> will be some use for it. >> >> So I suggest that the best would be to throw warnings when arrays get >> Broadcasted like >> Octave do. Python warnings can be catched and handled, that would be a >> great benefit. >> >> Another idea would to provide warning levels for braodcasting, e.g >> 0 = Never, 1=Warn once, 2=Warn always, 3 = Forbid aka throw exception, >> with 0 as default. >> This would avoid breaking other code, and give the user some control over >> braodcasting. >> >> Kind regards, >> Stefan Numpy broadcasting is the greatest feature of numpy !!! It just takes a bit of getting used to coming from another matrix language. Josef what 3!? >> >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From charlesr.harris at gmail.com Sun Feb 8 16:17:51 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 8 Feb 2015 14:17:51 -0700 Subject: [Numpy-discussion] Silent Broadcasting considered harmful In-Reply-To: References: Message-ID: On Sun, Feb 8, 2015 at 2:14 PM, Stefan Reiterer wrote: > Yeah I'm aware of that, that's the reason why I suggested a warning level > as an alternative. > Setting no warnings as default would avoid breaking existing code. > *Gesendet:* Sonntag, 08. 
Februar 2015 um 22:08 Uhr > *Von:* "Eelco Hoogendoorn" > *An:* "Discussion of Numerical Python" > *Betreff:* Re: [Numpy-discussion] Silent Broadcasting considered harmful > > I personally use Octave and/or Numpy for several years now and never > ever needed braodcasting. > But since it is still there there will be many users who need it, there > will be some use for it. > > Uhm, yeah, there is some use for it. Im all for explicit over implicit, > but personally current broadcasting rules have never bothered me, certainly > not to the extent of justifying massive backwards compatibility violations. > Take It from someone who relies on broadcasting for every other line of > code. > > > It's how numpy works. It would be like getting into your car and being warned that it has wheels. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From domors at gmx.net Sun Feb 8 16:24:35 2015 From: domors at gmx.net (Stefan Reiterer) Date: Sun, 8 Feb 2015 22:24:35 +0100 Subject: [Numpy-discussion] Silent Broadcasting considered harmful In-Reply-To: References: , Message-ID: An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Sun Feb 8 16:33:22 2015 From: matthew.brett at gmail.com (Matthew Brett) Date: Sun, 8 Feb 2015 13:33:22 -0800 Subject: [Numpy-discussion] Silent Broadcasting considered harmful In-Reply-To: References: Message-ID: Hi, On Sun, Feb 8, 2015 at 1:24 PM, Stefan Reiterer wrote: > I don't think this is a good comparison, especially since broadcasting is a > feature not a necessity ... > It's more like turning off/on driving assistance. > > And as already mentioned: other matrix languages also allow it, but they > warn about it's usage. > This has indeed it's merits. > Gesendet: Sonntag, 08. 
Februar 2015 um 22:17 Uhr > Von: "Charles R Harris" > An: "Discussion of Numerical Python" > Betreff: Re: [Numpy-discussion] Silent Broadcasting considered harmful > > > On Sun, Feb 8, 2015 at 2:14 PM, Stefan Reiterer wrote: >> >> Yeah I'm aware of that, that's the reason why I suggested a warning level >> as an alternative. >> Setting no warnings as default would avoid breaking existing code. >> Gesendet: Sonntag, 08. Februar 2015 um 22:08 Uhr >> Von: "Eelco Hoogendoorn" >> An: "Discussion of Numerical Python" >> Betreff: Re: [Numpy-discussion] Silent Broadcasting considered harmful >> > I personally use Octave and/or Numpy for several years now and never >> > ever needed braodcasting. >> But since it is still there there will be many users who need it, there >> will be some use for it. >> >> Uhm, yeah, there is some use for it. Im all for explicit over implicit, >> but personally current broadcasting rules have never bothered me, certainly >> not to the extent of justifying massive backwards compatibility violations. >> Take It from someone who relies on broadcasting for every other line of >> code. >> > > > It's how numpy works. It would be like getting into your car and being > warned that it has wheels. I agree. I knew about broadcasting as soon as I started using numpy, so I can honestly say this has never surprised me. There are other major incompatibilities between Matlab and numpy, such as 0-based indices, and array views. I think the matrix class would solve this for you, although most of us don't use that very much. It would be a major change to warn about broadcasting by default, and it would be odd, because we encourage broadcasting in our docs and common code, rightly I think. The naive user would not know to turn on this warning. I could imagine a non-default warning, but I doubt it would be much used. 
Cheers, Matthew From sgwoodjr at gmail.com Sun Feb 8 16:39:45 2015 From: sgwoodjr at gmail.com (Simon Wood) Date: Sun, 8 Feb 2015 16:39:45 -0500 Subject: [Numpy-discussion] Silent Broadcasting considered harmful In-Reply-To: References: Message-ID: On Sun, Feb 8, 2015 at 4:24 PM, Stefan Reiterer wrote: > I don't think this is a good comparison, especially since broadcasting is > a feature not a necessity ... > It's more like turning off/on driving assistance. > > And as already mentioned: other matrix languages also allow it, but they > warn about it's usage. > This has indeed it's merits. > *Gesendet:* Sonntag, 08. Februar 2015 um 22:17 Uhr > *Von:* "Charles R Harris" > *An:* "Discussion of Numerical Python" > *Betreff:* Re: [Numpy-discussion] Silent Broadcasting considered harmful > > > On Sun, Feb 8, 2015 at 2:14 PM, Stefan Reiterer wrote: >> >> Yeah I'm aware of that, that's the reason why I suggested a warning >> level as an alternative. >> Setting no warnings as default would avoid breaking existing code. >> *Gesendet:* Sonntag, 08. Februar 2015 um 22:08 Uhr >> *Von:* "Eelco Hoogendoorn" >> *An:* "Discussion of Numerical Python" >> *Betreff:* Re: [Numpy-discussion] Silent Broadcasting considered harmful >> > I personally use Octave and/or Numpy for several years now and never >> ever needed braodcasting. >> But since it is still there there will be many users who need it, there >> will be some use for it. >> >> Uhm, yeah, there is some use for it. Im all for explicit over implicit, >> but personally current broadcasting rules have never bothered me, certainly >> not to the extent of justifying massive backwards compatibility violations. >> Take It from someone who relies on broadcasting for every other line of >> code. >> >> > > It's how numpy works. It would be like getting into your car and being > warned that it has wheels. 
> > Chuck > _______________________________________________ NumPy-Discussion > mailing list NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > I agree, I do not think this is a good comparison. All cars have wheels, there are no surprises there. This is more like a car that decides to do something completely different from everything that you learned about in driving school. I find the broadcasting aspect of Numpy a turn off. If I go to add a 1x3 vector to a 3x1 vector, I want the program to warn me or error out. I don't want it to do something under the covers that has no mathematical basis or definition. Also, Octave may provide a warning, but Matlab errors out..."Matrix dimensions must agree". Which they must, at least in my world. _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From focke at slac.stanford.edu Sun Feb 8 16:54:06 2015 From: focke at slac.stanford.edu (Warren Focke) Date: Sun, 8 Feb 2015 13:54:06 -0800 (PST) Subject: [Numpy-discussion] Silent Broadcasting considered harmful In-Reply-To: References: , Message-ID: On Sun, 8 Feb 2015, Stefan Reiterer wrote: > And as already mentioned: other matrix languages also allow it, but they warn about it's usage. > This has indeed it's merits. numpy isn't a matrix language. They're arrays. Storing numbers that you are thinking of as a vector in an array doesn't turn the array into a vector. There are linear algebra modules; I haven't used them. 
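A quick demonstration of the array (not matrix) semantics under discussion, using only plain numpy: a (1, 3) row added to a (3, 1) column broadcasts to a (3, 3) array, while one-dimensional arrays of unequal length refuse to combine:

```python
import numpy as np

row = np.arange(3).reshape(1, 3)   # shape (1, 3)
col = np.arange(3).reshape(3, 1)   # shape (3, 1)

# Both operands are virtually stretched to the common shape (3, 3).
print((row + col).shape)           # (3, 3)

# Mismatched 1-d arrays do NOT broadcast; numpy raises immediately.
try:
    np.arange(3) + np.arange(4)
except ValueError:
    print("operands could not be broadcast together")
```

Anyone wanting Matlab-style strictness can check shapes explicitly (e.g. `assert a.shape == b.shape`) before such an operation.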
w From matthew.brett at gmail.com Sun Feb 8 16:56:13 2015 From: matthew.brett at gmail.com (Matthew Brett) Date: Sun, 8 Feb 2015 13:56:13 -0800 Subject: [Numpy-discussion] Silent Broadcasting considered harmful In-Reply-To: References: Message-ID: Hi, On Sun, Feb 8, 2015 at 1:39 PM, Simon Wood wrote: > > > On Sun, Feb 8, 2015 at 4:24 PM, Stefan Reiterer wrote: >> >> I don't think this is a good comparison, especially since broadcasting is >> a feature not a necessity ... >> It's more like turning off/on driving assistance. >> >> And as already mentioned: other matrix languages also allow it, but they >> warn about it's usage. >> This has indeed it's merits. >> Gesendet: Sonntag, 08. Februar 2015 um 22:17 Uhr >> Von: "Charles R Harris" >> An: "Discussion of Numerical Python" >> Betreff: Re: [Numpy-discussion] Silent Broadcasting considered harmful >> >> >> On Sun, Feb 8, 2015 at 2:14 PM, Stefan Reiterer wrote: >>> >>> Yeah I'm aware of that, that's the reason why I suggested a warning level >>> as an alternative. >>> Setting no warnings as default would avoid breaking existing code. >>> Gesendet: Sonntag, 08. Februar 2015 um 22:08 Uhr >>> Von: "Eelco Hoogendoorn" >>> An: "Discussion of Numerical Python" >>> Betreff: Re: [Numpy-discussion] Silent Broadcasting considered harmful >>> > I personally use Octave and/or Numpy for several years now and never >>> > ever needed braodcasting. >>> But since it is still there there will be many users who need it, there >>> will be some use for it. >>> >>> Uhm, yeah, there is some use for it. Im all for explicit over implicit, >>> but personally current broadcasting rules have never bothered me, certainly >>> not to the extent of justifying massive backwards compatibility violations. >>> Take It from someone who relies on broadcasting for every other line of >>> code. >>> >> >> >> It's how numpy works. It would be like getting into your car and being >> warned that it has wheels. 
>> >> Chuck >> _______________________________________________ NumPy-Discussion mailing >> list NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > I agree, I do not think this is a good comparison. All cars have wheels, > there are no surprises there. This is more like a car that decides to do > something completely different from everything that you learned about in > driving school. > I find the broadcasting aspect of Numpy a turn off. If I go to add a 1x3 > vector to a 3x1 vector, I want the program to warn me or error out. I don't > want it to do something under the covers that has no mathematical basis or > definition. Also, Octave may provide a warning, but Matlab errors > out..."Matrix dimensions must agree". Which they must, at least in my world. In a previous life, many of us were very serious users of Matlab, myself included. Matlab / Octave have a model of the array as being a matrix, but numpy does not have this model. There is a Matrix class that implements this model, but usually experienced numpy users either never use this, or stop using it. I can only say - subjectively I know - that I did not personally suffer from this when I switched to numpy from Matlab, partly because I was fully aware that I was going to have to change the way I thought about arrays, for various reasons. After a short while getting used to it, broadcasting seemed like a huge win. I guess the fact that other languages have adopted it means that others have had the same experience. So, numpy is not a straight replacement of Matlab, in terms of design. To pursue the analogy, you have learned to drive an automatic car. Numpy is a stick-shift car. There are good reasons to prefer a stick-shift, but it does mean that someone trained on an automatic is bound to feel that a stick-shift is uncomfortable for a while. 
Best, Matthew From hoogendoorn.eelco at gmail.com Sun Feb 8 16:57:05 2015 From: hoogendoorn.eelco at gmail.com (Eelco Hoogendoorn) Date: Sun, 8 Feb 2015 22:57:05 +0100 Subject: [Numpy-discussion] Silent Broadcasting considered harmful In-Reply-To: References: Message-ID: This. (nd)arrays are a far more widespread concept than linear algebraic operations. If you want LA semantics, use the matrix subclass. Or don't, since simply sticking to the much more pervasive and general ndarray semantics is usually simpler and less confusing. On Sun, Feb 8, 2015 at 10:54 PM, Warren Focke wrote: > On Sun, 8 Feb 2015, Stefan Reiterer wrote: > > > And as already mentioned: other matrix languages also allow it, but they > warn about it's usage. > > This has indeed it's merits. > > numpy isn't a matrix language. They're arrays. Storing numbers that you are > thinking of as a vector in an array doesn't turn the array into a vector. > There > are linear algebra modules; I haven't used them. > > w > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben.root at ou.edu Sun Feb 8 16:56:53 2015 From: ben.root at ou.edu (Benjamin Root) Date: Sun, 8 Feb 2015 16:56:53 -0500 Subject: [Numpy-discussion] Silent Broadcasting considered harmful In-Reply-To: References: Message-ID: numpy is like Tesla. Everybody else has been doing it wrong... Ben Root On Sun, Feb 8, 2015 at 4:39 PM, Simon Wood wrote: > > > On Sun, Feb 8, 2015 at 4:24 PM, Stefan Reiterer wrote: > >> I don't think this is a good comparison, especially since broadcasting is >> a feature not a necessity ... >> It's more like turning off/on driving assistance. >> >> And as already mentioned: other matrix languages also allow it, but they >> warn about it's usage. >> This has indeed it's merits. >> *Gesendet:* Sonntag, 08. 
Februar 2015 um 22:17 Uhr >> *Von:* "Charles R Harris" >> *An:* "Discussion of Numerical Python" >> *Betreff:* Re: [Numpy-discussion] Silent Broadcasting considered harmful >> >> >> On Sun, Feb 8, 2015 at 2:14 PM, Stefan Reiterer wrote: >>> >>> Yeah I'm aware of that, that's the reason why I suggested a warning >>> level as an alternative. >>> Setting no warnings as default would avoid breaking existing code. >>> *Gesendet:* Sonntag, 08. Februar 2015 um 22:08 Uhr >>> *Von:* "Eelco Hoogendoorn" >>> *An:* "Discussion of Numerical Python" >>> *Betreff:* Re: [Numpy-discussion] Silent Broadcasting considered harmful >>> > I personally use Octave and/or Numpy for several years now and never >>> ever needed braodcasting. >>> But since it is still there there will be many users who need it, there >>> will be some use for it. >>> >>> Uhm, yeah, there is some use for it. Im all for explicit over >>> implicit, but personally current broadcasting rules have never bothered me, >>> certainly not to the extent of justifying massive backwards compatibility >>> violations. Take It from someone who relies on broadcasting for every other >>> line of code. >>> >>> >> >> It's how numpy works. It would be like getting into your car and being >> warned that it has wheels. >> >> Chuck >> _______________________________________________ NumPy-Discussion >> mailing list NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > I agree, I do not think this is a good comparison. All cars have wheels, > there are no surprises there. This is more like a car that decides to do > something completely different from everything that you learned about in > driving school. > > I find the broadcasting aspect of Numpy a turn off. If I go to add a 1x3 > vector to a 3x1 vector, I want the program to warn me or error out. I don't > want it to do something under the covers that has no mathematical basis or > definition. 
Also, Octave may provide a warning, but Matlab errors > out..."Matrix dimensions must agree". Which they must, at least in my world. > _______________________________________________ > >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From domors at gmx.net Sun Feb 8 17:17:30 2015 From: domors at gmx.net (Stefan Reiterer) Date: Sun, 8 Feb 2015 23:17:30 +0100 Subject: [Numpy-discussion] Silent Broadcasting considered harmful In-Reply-To: References: , Message-ID: An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Sun Feb 8 17:28:10 2015 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sun, 8 Feb 2015 17:28:10 -0500 Subject: [Numpy-discussion] Silent Broadcasting considered harmful In-Reply-To: References: Message-ID: On Sun, Feb 8, 2015 at 4:56 PM, Matthew Brett wrote: > Hi, > > On Sun, Feb 8, 2015 at 1:39 PM, Simon Wood wrote: >> >> >> On Sun, Feb 8, 2015 at 4:24 PM, Stefan Reiterer wrote: >>> >>> I don't think this is a good comparison, especially since broadcasting is >>> a feature not a necessity ... >>> It's more like turning off/on driving assistance. >>> >>> And as already mentioned: other matrix languages also allow it, but they >>> warn about it's usage. >>> This has indeed it's merits. >>> Gesendet: Sonntag, 08. Februar 2015 um 22:17 Uhr >>> Von: "Charles R Harris" >>> An: "Discussion of Numerical Python" >>> Betreff: Re: [Numpy-discussion] Silent Broadcasting considered harmful >>> >>> >>> On Sun, Feb 8, 2015 at 2:14 PM, Stefan Reiterer wrote: >>>> >>>> Yeah I'm aware of that, that's the reason why I suggested a warning level >>>> as an alternative. 
>>>> Setting no warnings as default would avoid breaking existing code. >>>> Gesendet: Sonntag, 08. Februar 2015 um 22:08 Uhr >>>> Von: "Eelco Hoogendoorn" >>>> An: "Discussion of Numerical Python" >>>> Betreff: Re: [Numpy-discussion] Silent Broadcasting considered harmful >>>> > I personally use Octave and/or Numpy for several years now and never >>>> > ever needed braodcasting. >>>> But since it is still there there will be many users who need it, there >>>> will be some use for it. >>>> >>>> Uhm, yeah, there is some use for it. Im all for explicit over implicit, >>>> but personally current broadcasting rules have never bothered me, certainly >>>> not to the extent of justifying massive backwards compatibility violations. >>>> Take It from someone who relies on broadcasting for every other line of >>>> code. >>>> >>> >>> >>> It's how numpy works. It would be like getting into your car and being >>> warned that it has wheels. >>> >>> Chuck >>> _______________________________________________ NumPy-Discussion mailing >>> list NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >> >> I agree, I do not think this is a good comparison. All cars have wheels, >> there are no surprises there. This is more like a car that decides to do >> something completely different from everything that you learned about in >> driving school. > >> I find the broadcasting aspect of Numpy a turn off. If I go to add a 1x3 >> vector to a 3x1 vector, I want the program to warn me or error out. I don't >> want it to do something under the covers that has no mathematical basis or >> definition. Also, Octave may provide a warning, but Matlab errors >> out..."Matrix dimensions must agree". Which they must, at least in my world. > > In a previous life, many of us were very serious users of Matlab, > myself included. > > Matlab / Octave have a model of the array as being a matrix, but numpy > does not have this model. 
There is a Matrix class that implements > this model, but usually experienced numpy users either never use this, > or stop using it. > > I can only say - subjectively I know - that I did not personally > suffer from this when I switched to numpy from Matlab, partly because > I was fully aware that I was going to have to change the way I thought > about arrays, for various reasons. After a short while getting used > to it, broadcasting seemed like a huge win. I guess the fact that > other languages have adopted it means that others have had the same > experience. > > So, numpy is not a straight replacement of Matlab, in terms of design. > > To pursue the analogy, you have learned to drive an automatic car. > Numpy is a stick-shift car. There are good reasons to prefer a > stick-shift, but it does mean that someone trained on an automatic is > bound to feel that a stick-shift is uncomfortable for a while. I think the analogy is Python printing at the start and all the time a warning "We use indentation, not braces, brackets or `end` to indicate blocks of code." Josef > > Best, > > Matthew > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From josef.pktd at gmail.com Sun Feb 8 17:43:19 2015 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sun, 8 Feb 2015 17:43:19 -0500 Subject: [Numpy-discussion] Silent Broadcasting considered harmful In-Reply-To: References: Message-ID: On Sun, Feb 8, 2015 at 5:17 PM, Stefan Reiterer wrote: > Actually I use numpy for several years now, and I love it. > The reason that I think silent broadcasting of sums is bad > comes simply from the fact, that I had more trouble with it, than it helped > me. > > I won't stop using numpy because of that, but I think this behavior may > backfire, > and thats the reason I started this discussion. 
Till now the only way out of > the misery > is to make proper unit tests, and to be careful as hell with dimensions and > shape checks. I fully agree with the last part. We need a lot more checks and be a lot more careful in numpy than in matlab. But that's a fundamental difference between the array versus matrix approach. For me the main behavior I had to adjust to was loosing a dimension in any reduce operation, mean, sum, ... if x is 2d x - x.mean(1) we loose a dimension, and it doesn't broadcast in the right direction x - x.mean(0) perfect, no `repeat` needed, it just broadcasts the way we need. Josef > > Providing optional warnings just would be an elegant way out of this. > > Cheers, > Stefan > Gesendet: Sonntag, 08. Februar 2015 um 22:56 Uhr > Von: "Matthew Brett" > > An: "Discussion of Numerical Python" > Betreff: Re: [Numpy-discussion] Silent Broadcasting considered harmful > Hi, > > On Sun, Feb 8, 2015 at 1:39 PM, Simon Wood wrote: >> >> >> On Sun, Feb 8, 2015 at 4:24 PM, Stefan Reiterer wrote: >>> >>> I don't think this is a good comparison, especially since broadcasting is >>> a feature not a necessity ... >>> It's more like turning off/on driving assistance. >>> >>> And as already mentioned: other matrix languages also allow it, but they >>> warn about it's usage. >>> This has indeed it's merits. >>> Gesendet: Sonntag, 08. Februar 2015 um 22:17 Uhr >>> Von: "Charles R Harris" >>> An: "Discussion of Numerical Python" >>> Betreff: Re: [Numpy-discussion] Silent Broadcasting considered harmful >>> >>> >>> On Sun, Feb 8, 2015 at 2:14 PM, Stefan Reiterer wrote: >>>> >>>> Yeah I'm aware of that, that's the reason why I suggested a warning >>>> level >>>> as an alternative. >>>> Setting no warnings as default would avoid breaking existing code. >>>> Gesendet: Sonntag, 08. 
Februar 2015 um 22:08 Uhr >>>> Von: "Eelco Hoogendoorn" >>>> An: "Discussion of Numerical Python" >>>> Betreff: Re: [Numpy-discussion] Silent Broadcasting considered harmful >>>> > I personally use Octave and/or Numpy for several years now and never >>>> > ever needed braodcasting. >>>> But since it is still there there will be many users who need it, there >>>> will be some use for it. >>>> >>>> Uhm, yeah, there is some use for it. Im all for explicit over implicit, >>>> but personally current broadcasting rules have never bothered me, >>>> certainly >>>> not to the extent of justifying massive backwards compatibility >>>> violations. >>>> Take It from someone who relies on broadcasting for every other line of >>>> code. >>>> >>> >>> >>> It's how numpy works. It would be like getting into your car and being >>> warned that it has wheels. >>> >>> Chuck >>> _______________________________________________ NumPy-Discussion mailing >>> list NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >> >> I agree, I do not think this is a good comparison. All cars have wheels, >> there are no surprises there. This is more like a car that decides to do >> something completely different from everything that you learned about in >> driving school. > >> I find the broadcasting aspect of Numpy a turn off. If I go to add a 1x3 >> vector to a 3x1 vector, I want the program to warn me or error out. I >> don't >> want it to do something under the covers that has no mathematical basis or >> definition. Also, Octave may provide a warning, but Matlab errors >> out..."Matrix dimensions must agree". Which they must, at least in my >> world. > > In a previous life, many of us were very serious users of Matlab, > myself included. > > Matlab / Octave have a model of the array as being a matrix, but numpy > does not have this model. 
There is a Matrix class that implements > this model, but usually experienced numpy users either never use this, > or stop using it. > > I can only say - subjectively I know - that I did not personally > suffer from this when I switched to numpy from Matlab, partly because > I was fully aware that I was going to have to change the way I thought > about arrays, for various reasons. After a short while getting used > to it, broadcasting seemed like a huge win. I guess the fact that > other languages have adopted it means that others have had the same > experience. > > So, numpy is not a straight replacement of Matlab, in terms of design. > > To pursue the analogy, you have learned to drive an automatic car. > Numpy is a stick-shift car. There are good reasons to prefer a > stick-shift, but it does mean that someone trained on an automatic is > bound to feel that a stick-shift is uncomfortable for a while. > > Best, > > Matthew > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From njs at pobox.com Sun Feb 8 17:47:52 2015 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 8 Feb 2015 14:47:52 -0800 Subject: [Numpy-discussion] Silent Broadcasting considered harmful In-Reply-To: References: Message-ID: On 8 Feb 2015 13:39, "Simon Wood" wrote: > > I find the broadcasting aspect of Numpy a turn off. If I go to add a 1x3 vector to a 3x1 vector, I want the program to warn me or error out. I don't want it to do something under the covers that has no mathematical basis or definition. Also, Octave may provide a warning, but Matlab errors out..."Matrix dimensions must agree". Which they must, at least in my world. 
There may be another matlab/numpy idiom clash here that's affecting this: in MATLAB, vectors are always 1 x n or n x 1, because of the matrix focused history. In numpy the idiomatic thing to do is to make vectors one-dimensional, and then this confusion cannot arise. Indeed, the only cases I'm thinking of where I even create a 3x1 or 1x3 vector in the first place are when I'm about to do something clever with broadcasting. -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From sgwoodjr at gmail.com Sun Feb 8 17:47:57 2015 From: sgwoodjr at gmail.com (Simon Wood) Date: Sun, 8 Feb 2015 17:47:57 -0500 Subject: [Numpy-discussion] Silent Broadcasting considered harmful In-Reply-To: References: Message-ID: On Sun, Feb 8, 2015 at 5:28 PM, wrote: > On Sun, Feb 8, 2015 at 4:56 PM, Matthew Brett > wrote: > > Hi, > > > > On Sun, Feb 8, 2015 at 1:39 PM, Simon Wood wrote: > >> > >> > >> On Sun, Feb 8, 2015 at 4:24 PM, Stefan Reiterer wrote: > >>> > >>> I don't think this is a good comparison, especially since broadcasting > is > >>> a feature not a necessity ... > >>> It's more like turning off/on driving assistance. > >>> > >>> And as already mentioned: other matrix languages also allow it, but > they > >>> warn about it's usage. > >>> This has indeed it's merits. > >>> Gesendet: Sonntag, 08. Februar 2015 um 22:17 Uhr > >>> Von: "Charles R Harris" > >>> An: "Discussion of Numerical Python" > >>> Betreff: Re: [Numpy-discussion] Silent Broadcasting considered harmful > >>> > >>> > >>> On Sun, Feb 8, 2015 at 2:14 PM, Stefan Reiterer > wrote: > >>>> > >>>> Yeah I'm aware of that, that's the reason why I suggested a warning > level > >>>> as an alternative. > >>>> Setting no warnings as default would avoid breaking existing code. > >>>> Gesendet: Sonntag, 08. 
Februar 2015 um 22:08 Uhr > >>>> Von: "Eelco Hoogendoorn" > >>>> An: "Discussion of Numerical Python" > >>>> Betreff: Re: [Numpy-discussion] Silent Broadcasting considered harmful > >>>> > I personally use Octave and/or Numpy for several years now and > never > >>>> > ever needed braodcasting. > >>>> But since it is still there there will be many users who need it, > there > >>>> will be some use for it. > >>>> > >>>> Uhm, yeah, there is some use for it. Im all for explicit over > implicit, > >>>> but personally current broadcasting rules have never bothered me, > certainly > >>>> not to the extent of justifying massive backwards compatibility > violations. > >>>> Take It from someone who relies on broadcasting for every other line > of > >>>> code. > >>>> > >>> > >>> > >>> It's how numpy works. It would be like getting into your car and being > >>> warned that it has wheels. > >>> > >>> Chuck > >>> _______________________________________________ NumPy-Discussion > mailing > >>> list NumPy-Discussion at scipy.org > >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion > >>> > >> > >> I agree, I do not think this is a good comparison. All cars have wheels, > >> there are no surprises there. This is more like a car that decides to do > >> something completely different from everything that you learned about in > >> driving school. > > > >> I find the broadcasting aspect of Numpy a turn off. If I go to add a 1x3 > >> vector to a 3x1 vector, I want the program to warn me or error out. I > don't > >> want it to do something under the covers that has no mathematical basis > or > >> definition. Also, Octave may provide a warning, but Matlab errors > >> out..."Matrix dimensions must agree". Which they must, at least in my > world. > > > > In a previous life, many of us were very serious users of Matlab, > > myself included. > > > > Matlab / Octave have a model of the array as being a matrix, but numpy > > does not have this model. 
There is a Matrix class that implements > > this model, but usually experienced numpy users either never use this, > > or stop using it. > > > > I can only say - subjectively I know - that I did not personally > > suffer from this when I switched to numpy from Matlab, partly because > > I was fully aware that I was going to have to change the way I thought > > about arrays, for various reasons. After a short while getting used > > to it, broadcasting seemed like a huge win. I guess the fact that > > other languages have adopted it means that others have had the same > > experience. > > > > So, numpy is not a straight replacement of Matlab, in terms of design. > > > > To pursue the analogy, you have learned to drive an automatic car. > > Numpy is a stick-shift car. There are good reasons to prefer a > > stick-shift, but it does mean that someone trained on an automatic is > > bound to feel that a stick-shift is uncomfortable for a while. > > > I think the analogy is Python printing at the start and all the time a > warning > "We use indentation, not braces, brackets or `end` to indicate blocks of > code." > > Josef > > > Not quite the same. This is not so much about language semantics as mathematical definitions. You (the Numpy community) have decided to overload certain mathematical operators to act in a way that is not consistent with linear algebra teachings. This can be a bit confusing for people who develop and implement mathematical algorithms that have a strong foundation in linear algebra, irrespective of the language they are migrating from. With that said, I do appreciate the comments by Matthew, Eelco and others. Numpy is *not* a linear algebra package, so it does not adhere to the same mathematical definitions. This realization has cleared some things up. 
-Simon > > > > Best, > > > > Matthew > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Sun Feb 8 17:52:53 2015 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 8 Feb 2015 14:52:53 -0800 Subject: [Numpy-discussion] Silent Broadcasting considered harmful In-Reply-To: References: Message-ID: On 8 Feb 2015 13:04, "Stefan Reiterer" wrote: > > So I suggest that the best would be to throw warnings when arrays get broadcast, like > Octave does. Python warnings can be caught and handled, which would be a great benefit. > > Another idea would be to provide warning levels for broadcasting, e.g. > 0 = Never, 1 = Warn once, 2 = Warn always, 3 = Forbid aka throw exception, > with 0 as default. > This would avoid breaking other code, and give the user some control over broadcasting. Unfortunately adding warnings is a non-starter for technical reasons, even before we get into the more subjective debate about ideal API design: issuing a warning is extremely slow (relative to typical array operations), EVEN IF the warning is disabled. (By the time you can figure out it's disabled, it's too late.) So this would cause massive slowdowns in existing code. Note also that in numpy, even simple expressions like '2 * arr' rely on broadcasting. Do you really want warnings for all these cases? -n -------------- next part -------------- An HTML attachment was scrubbed...
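A minimal sketch of Nathaniel's point above -- ordinary NumPy expressions broadcast constantly, while genuinely incompatible shapes still fail loudly. The shapes used here are illustrative choices, not taken from the thread:

```python
import numpy as np

# A scalar operand is broadcast against the whole array -- a warning on
# every such case would fire on almost every line of NumPy code.
arr = np.arange(4.0)
doubled = 2 * arr                      # shapes () and (4,) -> (4,)

# The same rule lets a row of column means be subtracted from a 2-d array.
x = np.ones((4, 3))
centered = x - x.mean(axis=0)          # shapes (4, 3) and (3,) -> (4, 3)

# Broadcasting is not "anything goes": incompatible shapes still raise.
try:
    np.ones((4, 3)) + np.ones((2, 3))
except ValueError:
    print("incompatible shapes raise ValueError")
```

So the choice is not between silent magic and safety: shape mismatches that cannot be aligned already produce an error; only alignable shapes are stretched.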
URL: From cjw at ncf.ca Sun Feb 8 17:57:39 2015 From: cjw at ncf.ca (cjw) Date: Sun, 08 Feb 2015 17:57:39 -0500 Subject: [Numpy-discussion] Silent Broadcasting considered harmful In-Reply-To: References: Message-ID: <54D7E9E3.2040302@ncf.ca> An HTML attachment was scrubbed... URL: From rays at blue-cove.com Sun Feb 8 18:24:25 2015 From: rays at blue-cove.com (R Schumacher) Date: Sun, 08 Feb 2015 15:24:25 -0800 Subject: [Numpy-discussion] Silent Broadcasting considered harmful In-Reply-To: References: Message-ID: <201502082324.t18NOWh5009583@blue-cove.com> At 02:47 PM 2/8/2015, Simon Wood wrote: >Not quite the same. This is not so much about language semantics as >mathematical definitions. You (the Numpy community) have decided to >overload certain mathematical operators to act in a way that is not >consistent with linear algebra teachings. This can be a bit >confusing for people who develop and implement mathematical >algorithms that have a strong foundation in linear algebra, >irrespective of the language they are migrating from. > >With that said, I do appreciate the comments by Matthew, Eelco and >others. Numpy is *not* a linear algebra package, so it does not >adhere to the same mathematical definitions. This realization has >cleared some things up. Via my (admittedly infrequent use of) numpy.linalg http://docs.scipy.org/doc/numpy/reference/routines.linalg.html#linear-algebra-on-several-matrices-at-once I think it behaves more in line with algebraic thinkers. I do not have any issue with broadcasting, and use it frequently, but I've always wanted to see more examples and discussion directly in the docs, in general. I have over the years posted and argued for a doc site more like PHP-doc, where users can contribute examples and discuss them. There is a wealth of such examples here in the list and the tutorial, but finding them requires unnecessary time and Google-foo.
- Ray Schumacher From hoogendoorn.eelco at gmail.com Sun Feb 8 18:24:55 2015 From: hoogendoorn.eelco at gmail.com (Eelco Hoogendoorn) Date: Mon, 9 Feb 2015 00:24:55 +0100 Subject: [Numpy-discussion] Silent Broadcasting considered harmful In-Reply-To: References: Message-ID: Yeah, it's all about the preferred semantics. Indeed if you want to use LA semantics, ndarray semantics are somewhat of a disappointment; though I would argue they have a very well designed internal logic of their own. Much more so than LA semantics in the first place; LA semantics fail to generalize in any sort of elegant way to higher order tensors, and all the operations you might expect from them. For that reason, my default approach for (multi)linear products is np.einsum. Instead of relying on row/column conventions which might break if you add additional axes, or need to swap them around for performance/compatibility reasons, it is actually very nice to always make it explicit which axes you are acting upon. On Sun, Feb 8, 2015 at 11:47 PM, Simon Wood wrote: > > > On Sun, Feb 8, 2015 at 5:28 PM, wrote: > >> On Sun, Feb 8, 2015 at 4:56 PM, Matthew Brett >> wrote: >> > Hi, >> > >> > On Sun, Feb 8, 2015 at 1:39 PM, Simon Wood wrote: >> >> >> >> >> >> On Sun, Feb 8, 2015 at 4:24 PM, Stefan Reiterer >> wrote: >> >>> >> >>> I don't think this is a good comparison, especially since >> broadcasting is >> >>> a feature not a necessity ... >> >>> It's more like turning off/on driving assistance. >> >>> >> >>> And as already mentioned: other matrix languages also allow it, but >> they >> >>> warn about it's usage. >> >>> This has indeed it's merits. >> >>> Gesendet: Sonntag, 08.
Februar 2015 um 22:17 Uhr >> >>> Von: "Charles R Harris" >> >>> An: "Discussion of Numerical Python" >> >>> Betreff: Re: [Numpy-discussion] Silent Broadcasting considered harmful >> >>> >> >>> >> >>> On Sun, Feb 8, 2015 at 2:14 PM, Stefan Reiterer >> wrote: >> >>>> >> >>>> Yeah I'm aware of that, that's the reason why I suggested a warning >> level >> >>>> as an alternative. >> >>>> Setting no warnings as default would avoid breaking existing code. >> >>>> Gesendet: Sonntag, 08. Februar 2015 um 22:08 Uhr >> >>>> Von: "Eelco Hoogendoorn" >> >>>> An: "Discussion of Numerical Python" >> >>>> Betreff: Re: [Numpy-discussion] Silent Broadcasting considered >> harmful >> >>>> > I personally use Octave and/or Numpy for several years now and >> never >> >>>> > ever needed braodcasting. >> >>>> But since it is still there there will be many users who need it, >> there >> >>>> will be some use for it. >> >>>> >> >>>> Uhm, yeah, there is some use for it. Im all for explicit over >> implicit, >> >>>> but personally current broadcasting rules have never bothered me, >> certainly >> >>>> not to the extent of justifying massive backwards compatibility >> violations. >> >>>> Take It from someone who relies on broadcasting for every other line >> of >> >>>> code. >> >>>> >> >>> >> >>> >> >>> It's how numpy works. It would be like getting into your car and being >> >>> warned that it has wheels. >> >>> >> >>> Chuck >> >>> _______________________________________________ NumPy-Discussion >> mailing >> >>> list NumPy-Discussion at scipy.org >> >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >>> >> >> >> >> I agree, I do not think this is a good comparison. All cars have >> wheels, >> >> there are no surprises there. This is more like a car that decides to >> do >> >> something completely different from everything that you learned about >> in >> >> driving school. >> > >> >> I find the broadcasting aspect of Numpy a turn off. 
If I go to add a >> 1x3 >> >> vector to a 3x1 vector, I want the program to warn me or error out. I >> don't >> >> want it to do something under the covers that has no mathematical >> basis or >> >> definition. Also, Octave may provide a warning, but Matlab errors >> >> out..."Matrix dimensions must agree". Which they must, at least in my >> world. >> > >> > In a previous life, many of us were very serious users of Matlab, >> > myself included. >> > >> > Matlab / Octave have a model of the array as being a matrix, but numpy >> > does not have this model. There is a Matrix class that implements >> > this model, but usually experienced numpy users either never use this, >> > or stop using it. >> > >> > I can only say - subjectively I know - that I did not personally >> > suffer from this when I switched to numpy from Matlab, partly because >> > I was fully aware that I was going to have to change the way I thought >> > about arrays, for various reasons. After a short while getting used >> > to it, broadcasting seemed like a huge win. I guess the fact that >> > other languages have adopted it means that others have had the same >> > experience. >> > >> > So, numpy is not a straight replacement of Matlab, in terms of design. >> > >> > To pursue the analogy, you have learned to drive an automatic car. >> > Numpy is a stick-shift car. There are good reasons to prefer a >> > stick-shift, but it does mean that someone trained on an automatic is >> > bound to feel that a stick-shift is uncomfortable for a while. >> >> >> I think the analogy is Python printing at the start and all the time a >> warning >> "We use indentation, not braces, brackets or `end` to indicate blocks of >> code." >> >> Josef >> >> >> > Not quite the same. This is not so much about language semantics as > mathematical definitions. You (the Numpy community) have decided to > overload certain mathematical operators to act in a way that is not > consistent with linear algebra teachings. 
This can be a bit confusing for > people who develop and implement mathematical algorithms that have a strong > foundation in linear algebra, irrespective of the language they are > migrating from. > > With that said, I do appreciate the comments by Matthew, Eelco and others. > Numpy is *not* a linear algebra package, so it does not adhere to the same > mathematical definitions. This realization has cleared some things up. > > -Simon > > >> > >> > Best, >> > >> > Matthew >> > _______________________________________________ >> > NumPy-Discussion mailing list >> > NumPy-Discussion at scipy.org >> > http://mail.scipy.org/mailman/listinfo/numpy-discussion >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From efiring at hawaii.edu Sun Feb 8 19:12:52 2015 From: efiring at hawaii.edu (Eric Firing) Date: Sun, 08 Feb 2015 14:12:52 -1000 Subject: [Numpy-discussion] Silent Broadcasting considered harmful In-Reply-To: References: Message-ID: <54D7FB84.9000401@hawaii.edu> On 2015/02/08 12:43 PM, josef.pktd at gmail.com wrote: > > For me the main behavior I had to adjust to was loosing a dimension in > any reduce operation, mean, sum, ... > > if x is 2d > x - x.mean(1) > we loose a dimension, and it doesn't broadcast in the right direction Though you can use: x_demeaned = x - np.mean(x, axis=1, keepdims=True) > > x - x.mean(0) > perfect, no `repeat` needed, it just broadcasts the way we need. 
> > Josef From josef.pktd at gmail.com Sun Feb 8 19:52:13 2015 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sun, 8 Feb 2015 19:52:13 -0500 Subject: [Numpy-discussion] Silent Broadcasting considered harmful In-Reply-To: <54D7FB84.9000401@hawaii.edu> References: <54D7FB84.9000401@hawaii.edu> Message-ID: On Sun, Feb 8, 2015 at 7:12 PM, Eric Firing wrote: > On 2015/02/08 12:43 PM, josef.pktd at gmail.com wrote: > >> >> For me the main behavior I had to adjust to was losing a dimension in >> any reduce operation, mean, sum, ... >> >> if x is 2d >> x - x.mean(1) >> we lose a dimension, and it doesn't broadcast in the right direction > > Though you can use: > > x_demeaned = x - np.mean(x, axis=1, keepdims=True) Yes, I thought afterwards it may not be a good example, because it illustrates that numpy developers do respond by improving things that are clumsier than in other languages/packages like the "matrix" languages. (and I don't want broadcasting to change or even to cause warnings.) keepdims didn't exist when I started out with numpy and scipy 7 or so years ago. Nevertheless, it's still often easier to write a function that assumes a specific shape structure than to code for general nd arrays.

def my_function_that_works_over_rows(x, axis=0):
    if x.ndim == 1:
        x = x[:, None]
    if axis != 0:
        raise ValueError('only axis=0 is supported :(')

Josef > >> >> x - x.mean(0) >> perfect, no `repeat` needed, it just broadcasts the way we need. >> >> Josef > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From sturla.molden at gmail.com Mon Feb 9 01:48:15 2015 From: sturla.molden at gmail.com (Sturla Molden) Date: Mon, 9 Feb 2015 06:48:15 +0000 (UTC) Subject: [Numpy-discussion] Silent Broadcasting considered harmful References: Message-ID: <604083984445156579.995302sturla.molden-gmail.com@news.gmane.org> Matthew Brett wrote: > I agree.
I knew about broadcasting as soon as I started using numpy, > so I can honestly say this has never surprised me. Fortran 90 has broadcasting too. NumPy's broadcasting was inspired by Fortran 90, which was the lingua franca of scientific computing in the 1990s. Like NumPy, Fortran 90 is an array language, not a matrix language like Matlab. > There are other major incompatibilities between Matlab and numpy, such > as 0-based indices, and array views. Yes. NumPy is not Matlab and not intending to clone Matlab. Those who want Matlab know where to find it. Those who need a free Matlab clone can download Scilab. Sturla From sturla.molden at gmail.com Mon Feb 9 01:59:09 2015 From: sturla.molden at gmail.com (Sturla Molden) Date: Mon, 9 Feb 2015 06:59:09 +0000 (UTC) Subject: [Numpy-discussion] Silent Broadcasting considered harmful References: Message-ID: <1285134740445157537.157712sturla.molden-gmail.com@news.gmane.org> Simon Wood wrote: > Not quite the same. This is not so much about language semantics as > mathematical definitions. You (the Numpy community) have decided to > overload certain mathematical operators to act in a way that is not > consistent with linear algebra teachings. We have overloaded the operators to make them work like Fortran 90. That is about as hardcore numerical computing semantics as you get. > This can be a bit confusing for > people who develop and implement mathematical algorithms that have a strong > foundation in linear algebra, irrespective of the language they are > migrating from. Those who develop such algorithms are probably going to use BLAS and LAPACK. Then by deduction they know about Fortran, and then by deduction they know about array broadcasting. 
Sturla From sturla.molden at gmail.com Mon Feb 9 02:29:42 2015 From: sturla.molden at gmail.com (Sturla Molden) Date: Mon, 09 Feb 2015 08:29:42 +0100 Subject: [Numpy-discussion] Silent Broadcasting considered harmful In-Reply-To: References: , Message-ID: On 08/02/15 23:17, Stefan Reiterer wrote: > Actually I use numpy for several years now, and I love it. > The reason that I think silent broadcasting of sums is bad > comes simply from the fact, that I had more trouble with it, than it > helped me. In Fortran 90, broadcasting allows us to write concise expressions which are compiled into very efficient machine code. This would be very difficult for the compiler if the code used explicit repeats and reshapes instead of broadcasting. In NumPy and Matlab, this aspect is not equally important. But broadcasting makes array expressions more terse, and it saves some redundant array copies. NumPy code tends to be memory bound; broadcasting can therefore improve performance, particularly when arrays are large. But the effect is not as dramatic as it is in Fortran. Readability is also important. Vectorized Matlab code quickly degenerates into a mess of reshapes and repmats, and can be very hard to read. NumPy and Fortran 90 code does not lose readability like this, even in the most complicated array expressions. Sturla From domors at gmx.net Mon Feb 9 02:34:00 2015 From: domors at gmx.net (Stefan Reiterer) Date: Mon, 9 Feb 2015 08:34:00 +0100 Subject: [Numpy-discussion] Silent Broadcasting considered harmful In-Reply-To: References: , Message-ID: An HTML attachment was scrubbed... URL: From domors at gmx.net Mon Feb 9 02:37:24 2015 From: domors at gmx.net (Stefan Reiterer) Date: Mon, 9 Feb 2015 08:37:24 +0100 Subject: [Numpy-discussion] Silent Broadcasting considered harmful In-Reply-To: <201502082324.t18NOWh5009583@blue-cove.com> References: , <201502082324.t18NOWh5009583@blue-cove.com> Message-ID: An HTML attachment was scrubbed...
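Sturla's contrast between repmat-style code and broadcasting can be made concrete with a short sketch; the array shapes below are arbitrary illustrative choices, not taken from the thread:

```python
import numpy as np

x = np.arange(12.0).reshape(4, 3)
col_means = x.mean(axis=0)                     # shape (3,)

# Matlab/repmat style: materialize the repeated means explicitly
# as a full (4, 3) array before subtracting.
centered_tiled = x - np.tile(col_means, (x.shape[0], 1))

# Broadcasting style: the (3,) means are stretched virtually across
# the rows; no (4, 3) intermediate copy of the means is allocated.
centered_bcast = x - col_means

assert np.array_equal(centered_tiled, centered_bcast)
```

Both forms compute the same result; the broadcast version is shorter and skips the explicit intermediate, which is where the memory-bound performance point comes from.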
URL: From sturla.molden at gmail.com Mon Feb 9 03:47:26 2015 From: sturla.molden at gmail.com (Sturla Molden) Date: Mon, 09 Feb 2015 09:47:26 +0100 Subject: [Numpy-discussion] Silent Broadcasting considered harmful In-Reply-To: References: , Message-ID: On 09/02/15 08:34, Stefan Reiterer wrote: > So maybe the better way would be not to add warnings to broadcasting > operations, but to overhaul the matrix class > to make it more attractive for numerical linear algebra(?) I think you underestimate the amount of programming this would take. Take an archetypal LA trait of Matlab, the backslash operator. If you have an expression like a \ b it is evaluated as solve(a,b). But how should you solve this? LU, QR, Cholesky, SVD? Exact? Least-squares? Surely that depends on the context and the array. Matlab's backslash operator amounts to about 100,000 LOC. Let's say we add attributes to the Matrix class to symbolically express things like inversion, symmetric matrix, triangular matrix, tridiagonal matrix, etc. Say that you know a is symmetric and positive definite and b rectangular. How would you solve this most efficiently with BLAS and LAPACK? (a**-1) * b In SciPy notation this is cho_solve(cho_factor(A), B). But what if you know that B has a certain structure? And what if the arrays are in C order instead of Fortran order? Is there a way to avoid copy-and-transpose? NumPy's dot operator does this, but LA solvers should too. Put that on top of kernels for evaluating any kind of (a**-1) * b expression. But what if the orders are reversed? b * (a**-1) Those will also need special-casing. A LA package would need specialized code for every thinkable combination, and pick the best one on the fly. That is what Matlab's backslash operator can do. NumPy cannot. But it is not difficult if you want to sit down and program it. It would just take you about 100,000 to half a million LOC. And chances are your effort will not even be appreciated.
Sturla From njs at pobox.com Mon Feb 9 03:51:30 2015 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 9 Feb 2015 00:51:30 -0800 Subject: [Numpy-discussion] Silent Broadcasting considered harmful In-Reply-To: References: Message-ID: On 8 Feb 2015 23:34, "Stefan Reiterer" wrote: > > Ok that are indeed some good reasons to keep the status quo, especially since performance is crucial for numpy. > > It's a dillemma: Using the matrix class for linear algebra would be the correct way for such thing, > but the matrix API is not that powerful and beautiful as the one of arrays. > On the other hand arrays are beautiful, but not exactly intended to use for linear algebra. > > So maybe the better way would be not to add warnings to braodcasting operations, but to overhaul the matrix class > to make it more attractive for numerical linear algebra(?) The problem here is that as soon as you try to mix code using the matrix class with code using the array class, everything devolves into an unmaintainable mess. And all third party libraries have settled on using the array class, so our long term plan is to get rid of the matrix class entirely -- it's a nice idea but in practice it creates more problems than it solves. (You can look up PEP 465 for some more discussion of this.) What I think this illustrates is just that textbook linear algebra and computational linear algebra have somewhat different concerns and needs. Textbooks don't have to care about the comparability of APIs across a large system, but that's an absolutely crucial concern when writing code. The obvious analogy is how in textbook linear algebra you just write A^{-1} B but on a computer you had better think about some matrix factorization method instead. Similarly, textbooks generally don't have any formal way to describe flow control and loops, and just write A_i = B_i C_i but in an implementation, looping is often the bulk of the code. Broadcasting provides a powerful notation for describing such loops. 
(Indeed, it's pretty common to see formal write-ups of algorithms where the authors either have to resort to painful circumlocutions to describe the data flow, or else just give up and use MATLAB or numpy notation instead of traditional math notation.) So I think it's incorrect to say that arrays are not designed to be used for linear algebra. They're not designed to be used to write linear algebra *textbooks* -- but they're an extraordinarily well-designed tool for *computational* linear algebra. These are different things and need different tools. -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From domors at gmx.net Mon Feb 9 04:15:22 2015 From: domors at gmx.net (Stefan Reiterer) Date: Mon, 9 Feb 2015 10:15:22 +0100 Subject: [Numpy-discussion] Silent Broadcasting considered harmful In-Reply-To: References: , Message-ID: An HTML attachment was scrubbed... URL: From olivier.grisel at ensta.org Mon Feb 9 10:35:35 2015 From: olivier.grisel at ensta.org (Olivier Grisel) Date: Mon, 9 Feb 2015 16:35:35 +0100 Subject: [Numpy-discussion] new mingw-w64 based numpy and scipy wheel (still experimental) In-Reply-To: References: Message-ID: Hi Carl, Could you please provide some details on how you used your mingw-static toolchain to build OpenBLAS, numpy & scipy? I would like to replicate but apparently the default Makefile in the openblas project expects unix commands such as `uname` and `perl` that are not part of your archive. Did you compile those utilities from source or did you use another distribution of mingw with additional tools such as MSYS? For numpy and scipy, besides applying your patches, did you configure anything in site.cfg? I understand that you put the libopenblas.dll in the numpy/core folder but where do you put the BLAS / LAPACK header files?
I would like to help automating that build in some CI environment (with either Windows or Linux + wine) but I am affraid that I am not familiar enough with the windows build of numpy & scipy to get it working all by myself. -- Olivier From rays at blue-cove.com Mon Feb 9 10:52:19 2015 From: rays at blue-cove.com (R Schumacher) Date: Mon, 09 Feb 2015 07:52:19 -0800 Subject: [Numpy-discussion] Silent Broadcasting considered harmful In-Reply-To: References: <201502082324.t18NOWh5009583@blue-cove.com> Message-ID: <201502091552.t19FqOAN010807@blue-cove.com> The problem would be human resources to implement it on the doc site; I don't know how PHP people control modern spammers. The current, Pythonic way would be to edit the tutorial when inspired. However http://wiki.scipy.org/Tentative_NumPy_Tutorial reads: "Please do not hesitate to click the edit button. You will need to create a User Account first. " which leads to http://wiki.scipy.org/UserPreferences "UserPreferences You are not allowed to view this page." ;) - Ray Schumacher At 11:37 PM 2/8/2015, you wrote: >That sounds like a good idea! I didn't see any real good examples of usage >after some googling. Giving more examples of effective usage could also clear >more things up regarding design decisions. Additionally I'm always interested >in learning some new tricks :) > >Cheers, >Stefan > >Gesendet: Montag, 09. Februar 2015 um 00:24 Uhr Von: "R Schumacher" > An: "Discussion of Numerical Python" > Betreff: Re: [Numpy-discussion] Silent >Broadcasting considered harmful >At 02:47 PM 2/8/2015, Simon Wood wrote: >Not quite the same. This is >not so much about language semantics as >mathematical definitions. >You (the Numpy community) have decided to >overload certain >mathematical operators to act in a way that is not >consistent with >linear algebra teachings. 
This can be a bit >confusing for people >who develop and implement mathematical >algorithms that have a >strong foundation in linear algebra, >irrespective of the language >they are migrating from. > >With that said, I do appreciate the >comments by Matthew, Eelco and >others. Numpy is *not* a linear >algebra package, so it does not >adhere to the same mathematical >definitions. This realization has >cleared some things up. Via my >(admittedly infrequent use of) numpy.linalg >http://docs.scipy.org/doc/numpy/reference/routines.linalg.html#linear-algebra-on-several-matrices-at-once >I think it behaves more in line with algebraic thinkers. I do not >have any issue with broadcasting, and use it frequently, but I've >always wanted to see more examples and discussion directly in the >docs, in general. I have over years post/argued for a doc site more >like PHP-doc, where users can contribute examples and discuss them. >There is a wealth of such examples here in the list and the >tutorial, but requires unnecessary time and Google-foo. - Ray >Schumacher _______________________________________________ >NumPy-Discussion mailing list NumPy-Discussion at scipy.org >http://mail.scipy.org/mailman/listinfo/numpy-discussion >_______________________________________________ >NumPy-Discussion mailing list >NumPy-Discussion at scipy.org >http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben.root at ou.edu Mon Feb 9 11:31:09 2015 From: ben.root at ou.edu (Benjamin Root) Date: Mon, 9 Feb 2015 11:31:09 -0500 Subject: [Numpy-discussion] converting a list of tuples into an array of tuples? Message-ID: I am trying to write up some code that takes advantage of np.tile() on arbitrary array-like objects. I only want to tile along the first axis. Any other axis, if they exist, should be left alone. I first coerce the object using np.asanyarray(), tile it, and then coerce it back to the original type. 
The problem seems to be that some of my array-like objects are being "over-coerced", particularly the list of tuples. I tried doing "np.asanyarray(a, dtype='O')", but that still turns it into a 2-D array. Am I missing something? Thanks, Ben Root -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Mon Feb 9 12:22:35 2015 From: chris.barker at noaa.gov (Chris Barker) Date: Mon, 9 Feb 2015 09:22:35 -0800 Subject: [Numpy-discussion] Silent Broadcasting considered harmful In-Reply-To: References: Message-ID: On Sun, Feb 8, 2015 at 2:17 PM, Stefan Reiterer wrote: > Till now the only way out of the misery > is to make proper unit tests, > That's the only way out of the misery of software bugs in general -- nothing special here ;-) Python is a dynamically typed language -- EVERYTHING could do something unexpected if you pass in a different type, or shape of array, or whatever, than you expect. If you want type safety -- use something else ;-) I'm sorry you had a hard time with a particular bug -- but for me, I find broadcasting errors to usually be about as shallow as type errors -- which is to say usually found early and easily. > Providing optional warnings just would be an elegant way out of this. Broadcasting is widely used in numpy code -- a huge pile of warnings would be really painful! Do you realize that: arr = np.ones((5,)) ar2 = arr * 5 is broadcasting, too? -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From shoyer at gmail.com Mon Feb 9 12:39:13 2015 From: shoyer at gmail.com (Stephan Hoyer) Date: Mon, 9 Feb 2015 09:39:13 -0800 Subject: [Numpy-discussion] converting a list of tuples into an array of tuples?
In-Reply-To: References: Message-ID:

It appears that the only reliable way to do this may be to use a loop to modify an object array in-place. Pandas has a version of this written in Cython:
https://github.com/pydata/pandas/blob/c1a0dbc4c0dd79d77b2a34be5bc35493279013ab/pandas/lib.pyx#L342

To quote Wes McKinney: "Seriously can't believe I had to write this function"

Best,
Stephan

On Mon, Feb 9, 2015 at 8:31 AM, Benjamin Root wrote:

> I am trying to write up some code that takes advantage of np.tile() on
> arbitrary array-like objects. I only want to tile along the first axis. Any
> other axes, if they exist, should be left alone. I first coerce the object
> using np.asanyarray(), tile it, and then coerce it back to the original
> type.
>
> The problem seems to be that some of my array-like objects are being
> "over-coerced", particularly the list of tuples. I tried doing
> "np.asanyarray(a, dtype='O')", but that still turns it into a 2-D array.
>
> Am I missing something?
>
> Thanks,
> Ben Root
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From jaime.frio at gmail.com Mon Feb 9 12:49:38 2015
From: jaime.frio at gmail.com (=?UTF-8?Q?Jaime_Fern=C3=A1ndez_del_R=C3=ADo?=)
Date: Mon, 9 Feb 2015 09:49:38 -0800
Subject: [Numpy-discussion] converting a list of tuples into an array of tuples?
In-Reply-To: References: Message-ID:

On Mon, Feb 9, 2015 at 8:31 AM, Benjamin Root wrote:

> I am trying to write up some code that takes advantage of np.tile() on
> arbitrary array-like objects. I only want to tile along the first axis. Any
> other axes, if they exist, should be left alone. I first coerce the object
> using np.asanyarray(), tile it, and then coerce it back to the original
> type.
> > The problem seems to be that some of my array-like objects are being
> "over-coerced", particularly the list of tuples. I tried doing
> "np.asanyarray(a, dtype='O')", but that still turns it into a 2-D array.
>

The default constructors will drill down until they find a scalar or a non-matching shape. So you get an array full of Python ints, or floats, but still 2-D:

>>> a = [tuple(range(j, j+3)) for j in range(5)]
>>> a
[(0, 1, 2), (1, 2, 3), (2, 3, 4), (3, 4, 5), (4, 5, 6)]
>>> np.asarray(a, dtype=object)
array([[0, 1, 2],
       [1, 2, 3],
       [2, 3, 4],
       [3, 4, 5],
       [4, 5, 6]], dtype=object)

If you add a non-matching item, e.g. an empty tuple, then all works fine for your purposes:

>>> a.append(())
>>> np.asarray(a, dtype=object)
array([(0, 1, 2), (1, 2, 3), (2, 3, 4), (3, 4, 5), (4, 5, 6), ()], dtype=object)

But you would then have to discard that item before tiling. The only other way is to first create the object array, then assign your array-like object to it:

>>> a.pop()
()
>>> a
[(0, 1, 2), (1, 2, 3), (2, 3, 4), (3, 4, 5), (4, 5, 6)]
>>> b = np.empty(len(a), object)
>>> b[:] = a
>>> b
array([(0, 1, 2), (1, 2, 3), (2, 3, 4), (3, 4, 5), (4, 5, 6)], dtype=object)

Not sure if this has always worked, or if it breaks down in some corner case, but Wes may not have had to write that function after all! At least not in Cython.

Jaime

--
(\__/)
( O.o)
( > <) Este es Conejo. Copia a Conejo en tu firma y ayúdale en sus planes de dominación mundial.

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From sturla.molden at gmail.com Mon Feb 9 12:59:14 2015
From: sturla.molden at gmail.com (Sturla Molden)
Date: Mon, 9 Feb 2015 17:59:14 +0000 (UTC)
Subject: [Numpy-discussion] Silent Broadcasting considered harmful
References: Message-ID: <1172003712445196982.840417sturla.molden-gmail.com@news.gmane.org>

Chris Barker wrote:
> Do you realize that:
>
> arr = np.ones((5,))
>
> ar2 = arr * 5
>
> is broadcasting, too?
Perhaps we should only warn for a subset of broadcasting operations? E.g. avoid the warning on scalar times array. I prefer we don't warn about this though, because it might be interpreted as if broadcasting is "undesired". A warning means something bad, right?

There are just two things that can come out of this: First, some stupid package author will turn the warnings on and cause warning mayhem everywhere. Second, some stupid manager will decide that the NumPy code should be free of broadcasts, and then NumPy is crippled for the developers.

Sturla

From ben.root at ou.edu Mon Feb 9 14:14:34 2015
From: ben.root at ou.edu (Benjamin Root)
Date: Mon, 9 Feb 2015 14:14:34 -0500
Subject: [Numpy-discussion] converting a list of tuples into an array of tuples?
In-Reply-To: References: Message-ID:

Yeah, well, you know Wes... in for a penny, in for a pound (or something like that). Significant portions of pandas already need Cython, so might as well get as much performance as possible.

Btw, the edge case (if you want to call it that) is when it is given an N-dimensional array:

>>> import numpy as np
>>> a = np.zeros((4, 5))
>>> b = np.empty(4, object)
>>> b[:] = a
Traceback (most recent call last):
  File "", line 1, in
ValueError: could not broadcast input array from shape (4,5) into shape (4)

I am already filtering out ndarrays anyway, so it isn't a big deal. I was just hoping to reduce the amount of code as much as possible by removing the filter. This will do for now. Thank you for the clarifications.

Ben Root

On Mon, Feb 9, 2015 at 12:49 PM, Jaime Fernández del Río < jaime.frio at gmail.com> wrote:

> On Mon, Feb 9, 2015 at 8:31 AM, Benjamin Root wrote:
>
>> I am trying to write up some code that takes advantage of np.tile() on
>> arbitrary array-like objects. I only want to tile along the first axis. Any
>> other axes, if they exist, should be left alone. I first coerce the object
>> using np.asanyarray(), tile it, and then coerce it back to the original
>> type.
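For anyone wanting to reuse the idiom from this thread, the empty-array-plus-slice-assignment trick can be wrapped in a small helper. `as_object_array` is a name invented here for illustration, not a NumPy API, and per Ben's observation the assignment raises for inputs that are already N-D arrays, so the sketch rejects those up front:

```python
import numpy as np

def as_object_array(seq):
    """Coerce a sequence into a 1-D object array, keeping tuples intact.

    Hypothetical helper built on the idiom from this thread; it is not a
    NumPy API. Inputs that are already ndarrays would hit the broadcast
    error Ben describes, so they are rejected explicitly.
    """
    if isinstance(seq, np.ndarray):
        raise TypeError("already an ndarray; handle separately")
    out = np.empty(len(seq), dtype=object)
    out[:] = seq  # element-wise assignment: no drilling down into the tuples
    return out

a = [(0, 1, 2), (1, 2, 3), (2, 3, 4)]
b = as_object_array(a)
print(b.shape)           # (3,) -- each element is still a tuple
print(np.tile(b, 2)[3])  # (0, 1, 2)
```

With the tuples preserved as array elements, np.tile only ever sees a 1-D array, which is exactly the behavior Ben was after.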
>> >> The problem seems to be that some of my array-like objects are being
>> "over-coerced", particularly the list of tuples. I tried doing
>> "np.asanyarray(a, dtype='O')", but that still turns it into a 2-D array.
>>
>
> The default constructors will drill down until they find a scalar or a
> non-matching shape. So you get an array full of Python ints, or floats, but
> still 2-D:
>
> >>> a = [tuple(range(j, j+3)) for j in range(5)]
> >>> a
> [(0, 1, 2), (1, 2, 3), (2, 3, 4), (3, 4, 5), (4, 5, 6)]
> >>> np.asarray(a, dtype=object)
> array([[0, 1, 2],
>        [1, 2, 3],
>        [2, 3, 4],
>        [3, 4, 5],
>        [4, 5, 6]], dtype=object)
>
> If you add a non-matching item, e.g. an empty tuple, then all works fine
> for your purposes:
>
> >>> a.append(())
> >>> np.asarray(a, dtype=object)
> array([(0, 1, 2), (1, 2, 3), (2, 3, 4), (3, 4, 5), (4, 5, 6), ()],
> dtype=object)
>
> But you would then have to discard that item before tiling. The only other
> way is to first create the object array, then assign your array-like object
> to it:
>
> >>> a.pop()
> ()
> >>> a
> [(0, 1, 2), (1, 2, 3), (2, 3, 4), (3, 4, 5), (4, 5, 6)]
> >>> b = np.empty(len(a), object)
> >>> b[:] = a
> >>> b
> array([(0, 1, 2), (1, 2, 3), (2, 3, 4), (3, 4, 5), (4, 5, 6)],
> dtype=object)
>
> Not sure if this has always worked, or if it breaks down in some corner
> case, but Wes may not have had to write that function after all! At least
> not in Cython.
>
> Jaime
>
> --
> (\__/)
> ( O.o)
> ( > <) Este es Conejo. Copia a Conejo en tu firma y ayúdale en sus planes
> de dominación mundial.
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From sturla.molden at gmail.com Mon Feb 9 16:22:20 2015
From: sturla.molden at gmail.com (Sturla Molden)
Date: Mon, 9 Feb 2015 21:22:20 +0000 (UTC)
Subject: [Numpy-discussion] new mingw-w64 based numpy and scipy wheel (still experimental)
References: Message-ID: <1011426328445209322.896459sturla.molden-gmail.com@news.gmane.org>

Two quick comments:

- You need MSYS or Cygwin to build OpenBLAS. MSYS has uname and perl. Carl probably used MSYS.
- BLAS and LAPACK are Fortran libs, hence there are no header files. NumPy and SciPy include their own cblas headers.

Sturla

Olivier Grisel wrote:
> Hi Carl,
>
> Could you please provide some details on how you used your
> mingw-static toolchain to build OpenBLAS, numpy & scipy? I would like
> to replicate, but apparently the default Makefile in the OpenBLAS
> project expects unix commands such as `uname` and `perl` that are not
> part of your archive. Did you compile those utilities from source or
> did you use another distribution of mingw with additional tools such
> as MSYS?
>
> For numpy and scipy, besides applying your patches, did you configure
> anything in site.cfg? I understand that you put the libopenblas.dll in
> the numpy/core folder, but where do you put the BLAS / LAPACK header
> files?
>
> I would like to help automate that build in some CI environment
> (with either Windows or Linux + wine) but I am afraid that I am not
> familiar enough with the windows build of numpy & scipy to get it
> working all by myself.
From cmkleffner at gmail.com Mon Feb 9 16:49:35 2015
From: cmkleffner at gmail.com (Carl Kleffner)
Date: Mon, 9 Feb 2015 22:49:35 +0100
Subject: [Numpy-discussion] new mingw-w64 based numpy and scipy wheel (still experimental)
In-Reply-To: <1011426328445209322.896459sturla.molden-gmail.com@news.gmane.org>
References: <1011426328445209322.896459sturla.molden-gmail.com@news.gmane.org>
Message-ID:

Basically you need:

(1) site.cfg or %HOME%\.numpy-site.cfg with the following content (change the paths according to your installation):

[openblas]
libraries = openblas
library_dirs = D:/devel/packages/openblas/amd64/lib
include_dirs = D:/devel/packages/openblas/amd64/include

OpenBLAS was built with the help of msys2, the successor of msys.

(2) I created an import library for python##.dll in \libs\ I copied python##.dll into a temporary folder and executed (example for python-2.7):

> gendef python27.dll
> dlltool --dllname python27.dll --def python27.def --output-lib libpython27.dll.a
> copy libpython27.dll.a \libs\libpython27.dll.a

(3) Before starting the numpy build I copied libopenblas.dll to numpy\core\

I am currently reworking the overall procedure to allow the installation of the toolchain as a wheel, with some postprocessing to handle all these intermediate steps.

Cheers,
Carl

2015-02-09 22:22 GMT+01:00 Sturla Molden :

> Two quick comments:
> - You need MSYS or Cygwin to build OpenBLAS. MSYS has uname and perl. Carl
> probably used MSYS.
> - BLAS and LAPACK are Fortran libs, hence there are no header files. NumPy
> and SciPy include their own cblas headers.
>
> Sturla
>
> Olivier Grisel wrote:
> > Hi Carl,
> >
> > Could you please provide some details on how you used your
> > mingw-static toolchain to build OpenBLAS, numpy & scipy? I would like
> > to replicate, but apparently the default Makefile in the OpenBLAS
> > project expects unix commands such as `uname` and `perl` that are not
> > part of your archive.
Did you compile those utilities from source or
> > did you use another distribution of mingw with additional tools such
> > as MSYS?
> >
> > For numpy and scipy, besides applying your patches, did you configure
> > anything in site.cfg? I understand that you put the libopenblas.dll in
> > the numpy/core folder, but where do you put the BLAS / LAPACK header
> > files?
> >
> > I would like to help automate that build in some CI environment
> > (with either Windows or Linux + wine) but I am afraid that I am not
> > familiar enough with the windows build of numpy & scipy to get it
> > working all by myself.
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From cjw at ncf.ca Mon Feb 9 19:02:55 2015
From: cjw at ncf.ca (cjw)
Date: Mon, 09 Feb 2015 19:02:55 -0500
Subject: [Numpy-discussion] Silent Broadcasting considered harmful
In-Reply-To: References: , Message-ID: <54D94AAF.8050801@ncf.ca>

An HTML attachment was scrubbed...
URL:

From chris.barker at noaa.gov Mon Feb 9 20:02:27 2015
From: chris.barker at noaa.gov (Chris Barker)
Date: Mon, 9 Feb 2015 17:02:27 -0800
Subject: [Numpy-discussion] Silent Broadcasting considered harmful
In-Reply-To: <54D94AAF.8050801@ncf.ca> References: <54D94AAF.8050801@ncf.ca>
Message-ID:

On Mon, Feb 9, 2015 at 4:02 PM, cjw wrote:

> to overhaul the matrix class
> to make it more attractive for numerical linear algebra(?)
>
> +1

Sure -- though I don't know that this actually has anything to do with broadcasting -- unless the idea is that Matrices would be broadcastable?

But anyway, the Matrix class leaves a lot to be desired. Enough, in fact, that most of us don't recommend using it at all. There has been a bunch of discussion on this list in the past about what could be done to make it better.
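For readers trying to follow the matrix-versus-array part of this thread, the behavioural split between the two classes is easy to demonstrate; this is generic NumPy background rather than code from any of the messages:

```python
import numpy as np

A = np.matrix([[1, 2], [3, 4]])  # np.matrix: always 2-D, '*' means matrix product
B = np.array([[1, 2], [3, 4]])   # np.ndarray: '*' means elementwise product

print((A * A).tolist())       # [[7, 10], [15, 22]] -- matrix multiplication
print((B * B).tolist())       # [[1, 4], [9, 16]]   -- elementwise
print(np.dot(B, B).tolist())  # [[7, 10], [15, 22]] -- explicit matmul on arrays
```

In Python 3.5+ the dedicated `@` operator lets `B @ B` spell the matrix product on plain arrays, which is the main reason the thread questions what use cases remain for the matrix class.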
But the real blocker is that no one who actually develops numpy itself uses them, or has need for them. The strongest use case seems to be for teaching that involves linear algebra concepts, not real production code.

Also -- it's proven to be really hard to write subclasses of ndarray that work consistently and well -- you tend to keep accidentally getting raw arrays back... So maybe it can't really be done well at all?

-Chris

> I hope that this will be explored. @ could still be used by those who
> wish to remain in the array world.
>
> Colin W.
>
> Cheers,
> Stefan

*Sent:* Sunday, 08 February 2015 at 23:52
*From:* "Nathaniel Smith"
*To:* "Discussion of Numerical Python"
*Subject:* Re: [Numpy-discussion] Silent Broadcasting considered harmful

> On 8 Feb 2015 13:04, "Stefan Reiterer" wrote:
> >
> > So I suggest that the best would be to throw warnings when arrays get
> > broadcast, as Octave does. Python warnings can be caught and handled, which would be a great
> > benefit.
> >
> > Another idea would be to provide warning levels for broadcasting, e.g.
> > 0 = Never, 1 = Warn once, 2 = Warn always, 3 = Forbid aka throw exception,
> > with 0 as default.
> > This would avoid breaking other code, and give the user some control over
> > broadcasting.
>
> Unfortunately adding warnings is a non-starter for technical reasons, even
> before we get into the more subjective debate about ideal API design: issuing a
> warning is extremely slow (relative to typical array operations), EVEN IF the
> warning is disabled. (By the time you can figure out it's disabled, it's too
> late.) So this would cause massive slowdowns in existing code.
>
> Note also that in numpy, even simple expressions like '2 * arr' rely on
> broadcasting. Do you really want warnings for all these cases?
> -n
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception

Chris.Barker at noaa.gov
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From sturla.molden at gmail.com Tue Feb 10 03:02:57 2015
From: sturla.molden at gmail.com (Sturla Molden)
Date: Tue, 10 Feb 2015 08:02:57 +0000 (UTC)
Subject: [Numpy-discussion] Silent Broadcasting considered harmful
References: <54D94AAF.8050801@ncf.ca>
Message-ID: <383916662445247937.461554sturla.molden-gmail.com@news.gmane.org>

Chris Barker wrote:
> The strongest use case seems to be
> for teaching that involves linear algebra concepts, not real production
> code.

Not really. SymPy is a better teaching tool. Some find A*B easier to read than dot(A,B). But with the @ operator in Python 3.5 it does not have a use case at all.

Sturla

From toddrjen at gmail.com Tue Feb 10 03:28:24 2015
From: toddrjen at gmail.com (Todd)
Date: Tue, 10 Feb 2015 09:28:24 +0100
Subject: [Numpy-discussion] Silent Broadcasting considered harmful
In-Reply-To: <54D94AAF.8050801@ncf.ca> References: <54D94AAF.8050801@ncf.ca>
Message-ID:

On Feb 10, 2015 1:03 AM, "cjw" wrote:
>
> On 09-Feb-15 2:34 AM, Stefan Reiterer wrote:
>> OK, those are indeed some good reasons to keep the status quo, especially since
>> performance is crucial for numpy.
>> It's a dilemma: Using the matrix class for linear algebra would be the correct
>> way to do such things, but the matrix API is not as powerful and beautiful as that of arrays.
>> On the other hand arrays are beautiful, but not exactly intended for
>> linear algebra. So maybe the better way would be not to add warnings to broadcasting operations,
>> but to overhaul the matrix class to make it more attractive for numerical linear algebra(?)
>
> +1
> I hope that this will be explored. @ could still be used by those who wish to remain in the array world.
>

What about splitting it off into a scikit, or at least some sort of separate package? If there is sufficient interest in it, it can be maintained there. If not, at least people can use it as-is. But there would not be any expectation going forward that the rest of numpy has to work well with it.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From chris.barker at noaa.gov Tue Feb 10 11:40:56 2015
From: chris.barker at noaa.gov (Chris Barker)
Date: Tue, 10 Feb 2015 08:40:56 -0800
Subject: [Numpy-discussion] Silent Broadcasting considered harmful
In-Reply-To: References: <54D94AAF.8050801@ncf.ca>
Message-ID:

On Tue, Feb 10, 2015 at 12:28 AM, Todd wrote:

> >> So maybe the better way would be not to add warnings to broadcasting
> >> operations, but to overhaul the matrix class
> >> to make it more attractive for numerical linear algebra(?)
>
> What about splitting it off into a scikit, or at least some sort of
> separate package? If there is sufficient interest in it, it can be
> maintained there. If not, at least people can use it as-is. But there
> would not be any expectation going forward that the rest of numpy has to
> work well with it
>

Well, splitting it off is a good idea, seeing as how it hasn't gotten much love. But if the rest of numpy does not work well with it, then it becomes even less useful.

-CHB

--
Christopher Barker, Ph.D.
Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Tue Feb 10 12:10:48 2015 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 10 Feb 2015 18:10:48 +0100 Subject: [Numpy-discussion] Aligned / configurable memory allocation Message-ID: <20150210181048.5ef9224c@fsol> Hello, I apologize for pinging the list, but I was wondering if there was interest in either of https://github.com/numpy/numpy/pull/5457 (make array data aligned by default) or https://github.com/numpy/numpy/pull/5470 (make the array data allocator configurable)? Regards Antoine. From njs at pobox.com Tue Feb 10 14:26:22 2015 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 10 Feb 2015 11:26:22 -0800 Subject: [Numpy-discussion] Aligned / configurable memory allocation In-Reply-To: <20150210181048.5ef9224c@fsol> References: <20150210181048.5ef9224c@fsol> Message-ID: On 10 Feb 2015 09:11, "Antoine Pitrou" wrote: > > > Hello, > > I apologize for pinging the list, but I was wondering if there was > interest in either of https://github.com/numpy/numpy/pull/5457 (make > array data aligned by default) or > https://github.com/numpy/numpy/pull/5470 (make the array data allocator > configurable)? I'm not a fan of the configurable allocator. It adds new public APIs for us to support, and makes switching to using Python's own memory allocation APIs more complex. The feature is intrinsically dangerous, because newly installed deallocators must be able to handle memory allocated by the previous allocator. (AFAICT the included test case can crash the test process if you get unlucky and GC runs during it?). And no one's articulated any compelling argument for why we need this configurability. 
Regarding the aligned allocation patch, I think the problem is just that none of us have any way to evaluate it. I'd feel a lot more comfortable with some solid numbers showing the costs and benefits on old and new systems. -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Tue Feb 10 14:50:01 2015 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Tue, 10 Feb 2015 20:50:01 +0100 Subject: [Numpy-discussion] Silent Broadcasting considered harmful In-Reply-To: References: <54D94AAF.8050801@ncf.ca> Message-ID: On Tue, Feb 10, 2015 at 5:40 PM, Chris Barker wrote: > > On Tue, Feb 10, 2015 at 12:28 AM, Todd wrote: > >> >> So maybe the better way would be not to add warnings to braodcasting >> operations, >> >> but to overhaul the matrix class >> >> to make it more attractive for numerical linear algebra(?) >> > > >> What about splitting it off into a scikit, or at least some sort of >> separate package? If there is sufficient interest in it, it can be >> maintained there. If not, at least people can use it as-is. But there >> would not be any expectation going forward that the rest of numpy has to >> work well with it >> > > Well, splitting it off is a good idea, > It's not, that would be a massive backwards compat break. Just leave as is, and write this discussion up in a FAQ so we won't keep going in circles on this topic. Ralf > seeing as how it hasn't gotten much love. But if the rest of numpy does > not work well with it, then it becomes even less useful. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From asmund.hjulstad at gmail.com Tue Feb 10 15:02:26 2015 From: asmund.hjulstad at gmail.com (=?UTF-8?Q?=C3=85smund_Hjulstad?=) Date: Tue, 10 Feb 2015 21:02:26 +0100 Subject: [Numpy-discussion] How to debugging python crash in ipython notebook Message-ID: Hello all, I am struggling with a python crash of an ipython notebook (kernel) that I do not know how to debug. 
If I run this: (valgt is a pandas dataframe with 354 lines and 14 numerical columns) sns.pairplot(valgt) plt.savefig('parvis.png', dpi=600) in the same cell, the kernel consistently crashes. If I run in two separate cells, or exporting to a Python script and running there (changing the inline render to qt), everything is OK. This is win64 on Win7, both the free build from conda, the accelerate build from conda, and also a version I have built myself. (VS2010 and Intel Fortran XE 2015) The following is the package list from the 'vanilla' conda: # packages in environment [...]: # dateutil 2.1 py34_2 ipython 2.3.1 py34_0 jinja2 2.7.3 py34_1 markupsafe 0.23 py34_0 matplotlib 1.4.2 np19py34_0 numexpr 2.3.1 np19py34_0 numpy 1.9.1 py34_0 pandas 0.15.2 np19py34_0 pip 6.0.6 py34_0 pyparsing 2.0.1 py34_0 pyqt 4.10.4 py34_0 pyreadline 2.0 py34_0 python 3.4.2 1 pytz 2014.9 py34_0 pywin32 219 py34_0 pyzmq 14.5.0 py34_0 rpy2 2.5.5 scipy 0.15.1 np19py34_0 seaborn 0.5.1 np19py34_0 setuptools 12.0.5 py34_0 six 1.9.0 py34_0 tornado 4.0.2 py34_0 rpy2 is from the Gohlke python collection. The home-built version is a python 3.4.3rc1, with certifi (14.5.14) Cython (0.21.2) ipython (2.4.1) Jinja2 (2.7.3) jsonschema (2.4.0) MarkupSafe (0.23) matplotlib (1.4.2) mistune (0.5) nose (1.3.4) numexpr (2.4.1.dev0) numpy (1.10.0.dev0+98a8fe3) pandas (0.15.2) pip (6.0.8) pyparsing (2.0.3) pyreadline (2.0) PySide (1.2.2) python-dateutil (2.4.0) pytz (2014.10) pywin32 (219) pyzmq (14.5.0) rpy2 (2.5.6) scipy (0.15.1) seaborn (0.6.dev0) setuptools (12.0.5) six (1.9.0) tornado (4.1) wheel (0.24.0) They crash in the same way, though, the same applies to a test I did with ipython 3 I am wondering how important it is to have everything built with the same compiler and MKL library. Does all packages that use numpy need to be built against the specific numpy+mkl version? For example rpy2 (which is not used near the cells where it crashes), should I always rebuild this as well if I build differently, or...? 
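One generic way to get a post-mortem trace from a hard interpreter crash like the one described above is the standard-library `faulthandler` module (Python 3.3+); this is general advice rather than something from the thread:

```python
import faulthandler
import sys

# Install handlers that dump the Python traceback of every thread to
# stderr when the process receives SIGSEGV, SIGFPE, SIGABRT or SIGBUS.
faulthandler.enable(file=sys.stderr)

# For hangs rather than hard crashes, a periodic dump can help instead:
# faulthandler.dump_traceback_later(30, repeat=True)

print(faulthandler.is_enabled())  # True
```

Running the interpreter with `python -X faulthandler ...` achieves the same without code changes; the resulting dump points at the frame (and hence the extension module) that faulted, which helps decide whether mismatched builds are to blame.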
And, how can I generate debug information from the kernel that ipython notebook starts? Any pointers are most appreciated.

--
mvh, Åsmund Hjulstad
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From solipsis at pitrou.net Tue Feb 10 16:10:26 2015
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 10 Feb 2015 22:10:26 +0100
Subject: [Numpy-discussion] Aligned / configurable memory allocation
References: <20150210181048.5ef9224c@fsol>
Message-ID: <20150210221026.1f559156@fsol>

On Tue, 10 Feb 2015 11:26:22 -0800 Nathaniel Smith wrote:
> On 10 Feb 2015 09:11, "Antoine Pitrou" wrote:
> >
> > Hello,
> >
> > I apologize for pinging the list, but I was wondering if there was
> > interest in either of https://github.com/numpy/numpy/pull/5457 (make
> > array data aligned by default) or
> > https://github.com/numpy/numpy/pull/5470 (make the array data allocator
> > configurable)?
>
> I'm not a fan of the configurable allocator. It adds new public APIs for us
> to support, and makes switching to using Python's own memory allocation
> APIs more complex. The feature is intrinsically dangerous, because newly
> installed deallocators must be able to handle memory allocated by the
> previous allocator. (AFAICT the included test case can crash the test
> process if you get unlucky and GC runs during it?).

It's taken care of in the patch.

> Regarding the aligned allocation patch, I think the problem is just that
> none of us have any way to evaluate it. I'd feel a lot more comfortable
> with some solid numbers showing the costs and benefits on old and new
> systems.

Ok.

Regards
Antoine.
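As background for the aligned-allocation discussion: users who need alignment today can already get it from pure Python by over-allocating and slicing to an aligned offset. `aligned_zeros` below is a well-known workaround sketch of my own, not code from either pull request:

```python
import numpy as np

def aligned_zeros(n, dtype=np.float64, alignment=64):
    """Allocate a zeroed 1-D array whose data pointer is `alignment`-byte aligned.

    User-space workaround (over-allocate, then slice to an aligned offset);
    the slice keeps the oversized buffer alive as its base.
    """
    itemsize = np.dtype(dtype).itemsize
    buf = np.zeros(n * itemsize + alignment, dtype=np.uint8)
    offset = (-buf.ctypes.data) % alignment  # bytes needed to reach alignment
    out = buf[offset:offset + n * itemsize].view(dtype)
    assert out.ctypes.data % alignment == 0
    return out

a = aligned_zeros(1000, alignment=64)
print(a.ctypes.data % 64)  # 0
```

The PRs under discussion would make such tricks unnecessary by handling alignment inside NumPy's allocator itself.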
From njs at pobox.com Tue Feb 10 16:33:58 2015 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 10 Feb 2015 13:33:58 -0800 Subject: [Numpy-discussion] Aligned / configurable memory allocation In-Reply-To: <20150210221026.1f559156@fsol> References: <20150210181048.5ef9224c@fsol> <20150210221026.1f559156@fsol> Message-ID: On 10 Feb 2015 13:10, "Antoine Pitrou" wrote: > > On Tue, 10 Feb 2015 11:26:22 -0800 > Nathaniel Smith wrote: > > On 10 Feb 2015 09:11, "Antoine Pitrou" wrote: > > > > > > > > > Hello, > > > > > > I apologize for pinging the list, but I was wondering if there was > > > interest in either of https://github.com/numpy/numpy/pull/5457 (make > > > array data aligned by default) or > > > https://github.com/numpy/numpy/pull/5470 (make the array data allocator > > > configurable)? > > > > I'm not a fan of the configurable allocator. It adds new public APIs for us > > to support, and makes switching to using Python's own memory allocation > > APIs more complex. The feature is intrinsically dangerous, because newly > > installed deallocators must be able to handle memory allocated by the > > previous allocator. (AFAICT the included test case can crash the test > > process if you get unlucky and GC runs during it?). > > It's taken care of in the patch. Ah, I see -- I missed that you added an allocator field to PyArrayObject. That does reduce my objections to the patch. But I'm still not sure what problems this is solving exactly. Also, if we do decide to add a deallocation callback to PyArrayObject then I think we should take advantage of the opportunity to also make life easier for c API users who need a custom callback on a case-by-case basis and currently have to jump through hoops using ->base. -n -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jtaylor.debian at googlemail.com Tue Feb 10 16:54:49 2015 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Tue, 10 Feb 2015 22:54:49 +0100 Subject: [Numpy-discussion] Aligned / configurable memory allocation In-Reply-To: References: <20150210181048.5ef9224c@fsol> <20150210221026.1f559156@fsol> Message-ID: <54DA7E29.6000202@googlemail.com> On 10.02.2015 22:33, Nathaniel Smith wrote: > On 10 Feb 2015 13:10, "Antoine Pitrou" > wrote: >> >> On Tue, 10 Feb 2015 11:26:22 -0800 >> Nathaniel Smith > wrote: >> > On 10 Feb 2015 09:11, "Antoine Pitrou" > wrote: >> > > >> > > >> > > Hello, >> > > >> > > I apologize for pinging the list, but I was wondering if there was >> > > interest in either of https://github.com/numpy/numpy/pull/5457 (make >> > > array data aligned by default) or >> > > https://github.com/numpy/numpy/pull/5470 (make the array data > allocator >> > > configurable)? >> > >> > I'm not a fan of the configurable allocator. It adds new public APIs > for us >> > to support, and makes switching to using Python's own memory allocation >> > APIs more complex. The feature is intrinsically dangerous, because newly >> > installed deallocators must be able to handle memory allocated by the >> > previous allocator. (AFAICT the included test case can crash the test >> > process if you get unlucky and GC runs during it?). >> >> It's taken care of in the patch. unfortunately it also breaks the ABI on two fronts, by adding a new member to the public array struct which needs initializing by non api using users and by removing the ability to use free on array pointers. Both not particularly large breaks, but breaks nonetheless. At least for the first issue we should (like for the proposed dtype and ufunc changes) apply a more generic break of hiding the new internal members in a new private structure that embeds the public structure unchanged. 
The second issue can probably be ignored, though we could retain it for posix/c11 as those standards wisely decided to make aligned pointers freeable with free. That on the other hand costs us efficient calloc and realloc (standards committees are weird sometimes ...)

> Ah, I see -- I missed that you added an allocator field to
> PyArrayObject. That does reduce my objections to the patch. But I'm
> still not sure what problems this is solving exactly.
>
> Also, if we do decide to add a deallocation callback to PyArrayObject
> then I think we should take advantage of the opportunity to also make
> life easier for C API users who need a custom callback on a case-by-case
> basis and currently have to jump through hoops using ->base.
>

From cjw at ncf.ca Tue Feb 10 17:07:54 2015
From: cjw at ncf.ca (cjw)
Date: Tue, 10 Feb 2015 15:07:54 -0700 (MST)
Subject: [Numpy-discussion] Matrix Class
Message-ID: <1423606074771-39719.post@n7.nabble.com>

It seems to be agreed that there are weaknesses in the existing Numpy Matrix Class. Some problems are illustrated below. I'll try to put some suggestions forward over the coming weeks and would appreciate comments.

Colin W.

Test Script:

if __name__ == '__main__':
    a = mat([4, 5, 6])            # Good
    print('a: ', a)
    b = mat([4, '5', 6])          # Not the expected result
    print('b: ', b)
    c = mat([[4, 5, 6], [7, 8]])  # Wrongly accepted as rectangular
    print('c: ', c)
    d = mat([[1, 2, 3]])
    try:
        d[0, 1] = 'b'             # Correctly flagged, not numeric
    except ValueError:
        print("d[0, 1]= 'b' # Correctly flagged, not numeric", ' ValueError')
    print('d: ', d)

Result:

*** Python 2.7.9 (default, Dec 10 2014, 12:28:03) [MSC v.1500 64 bit (AMD64)] on win32. ***
>>>
a:  [[4 5 6]]
b:  [['4' '5' '6']]
c:  [[[4, 5, 6] [7, 8]]]
d[0, 1]= 'b' # Correctly flagged, not numeric  ValueError
d:  [[1 2 3]]
>>>

--
View this message in context: http://numpy-discussion.10968.n7.nabble.com/Matrix-Class-tp39719.html
Sent from the Numpy-discussion mailing list archive at Nabble.com.
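The failure modes Colin's script demonstrates can be caught up front with a few explicit checks; `checked_matrix` is a hypothetical helper sketching what a stricter constructor could do, not a proposal for the actual class:

```python
import numpy as np

def checked_matrix(rows):
    """Hypothetical strict constructor catching the failure modes above."""
    widths = {len(r) for r in rows}
    if len(widths) != 1:
        # mat([[4, 5, 6], [7, 8]]) case: refuse ragged input outright
        raise ValueError("rows are not rectangular: widths %s" % sorted(widths))
    a = np.asarray(rows)
    if not np.issubdtype(a.dtype, np.number):
        # mat([4, '5', 6]) case: refuse silent coercion to a string dtype
        raise TypeError("non-numeric data coerced to dtype %r" % (a.dtype,))
    return a

print(checked_matrix([[4, 5, 6], [7, 8, 9]]).shape)  # (2, 3)
```

With these checks, the `b` and `c` cases in the test script would raise immediately instead of producing surprising arrays.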
From kartik.peri at gmail.com Tue Feb 10 21:38:54 2015 From: kartik.peri at gmail.com (Kartik Kumar Perisetla) Date: Tue, 10 Feb 2015 21:38:54 -0500 Subject: [Numpy-discussion] Using numpy on hadoop streaming: ImportError: cannot import name multiarray Message-ID: Hi all, for one of my projects I am basically using NLTK for POS tagging, which internally uses an 'english.pickle' file. I managed to package the nltk library with these pickle files to make them available to mapper and reducer for the hadoop streaming job using the -file option. However, when the nltk library is trying to load that pickle file, it gives an error for numpy, since the cluster I am running this job on does not have numpy installed. Also, I don't have root access, thus can't install numpy or any other package on the cluster. So the only way is to package the python modules to make them available for mapper and reducer. I successfully managed to do that. But now the problem is when numpy is imported, it imports multiarray by default (as seen in __init__.py) and this is where I am getting the error: File "/usr/lib64/python2.6/pickle.py", line 1370, in load return Unpickler(file).load() File "/usr/lib64/python2.6/pickle.py", line 858, in load dispatch[key](self) File "/usr/lib64/python2.6/pickle.py", line 1090, in load_global klass = self.find_class(module, name) File "/usr/lib64/python2.6/pickle.py", line 1124, in find_class __import__(module) File "numpy.mod/numpy/__init__.py", line 170, in <module> File "numpy.mod/numpy/add_newdocs.py", line 13, in <module> File "numpy.mod/numpy/lib/__init__.py", line 8, in <module> File "numpy.mod/numpy/lib/type_check.py", line 11, in <module> File "numpy.mod/numpy/core/__init__.py", line 6, in <module> ImportError: cannot import name multiarray I tried moving the numpy directory from my local machine, which contains multiarray.pyd, to the cluster to make it available to mapper and reducer, but this didn't help. Any input on how to resolve this (keeping the constraint that I cannot install anything on cluster machines)? Thanks!
-- Regards, Kartik Perisetla -------------- next part -------------- An HTML attachment was scrubbed... URL: From sturla.molden at gmail.com Wed Feb 11 00:06:46 2015 From: sturla.molden at gmail.com (Sturla Molden) Date: Wed, 11 Feb 2015 05:06:46 +0000 (UTC) Subject: [Numpy-discussion] Silent Broadcasting considered harmful References: <54D94AAF.8050801@ncf.ca> Message-ID: <676532241445323875.925957sturla.molden-gmail.com@news.gmane.org> Chris Barker wrote: > Well, splitting it off is a good idea, seeing as how it hasn't gotten much > love. But if the rest of numpy does not work well with it, then it becomes > even less useful. PEP 3118 takes care of that. Sturla From davidmenhur at gmail.com Wed Feb 11 01:56:22 2015 From: davidmenhur at gmail.com (=?UTF-8?B?RGHPgGlk?=) Date: Wed, 11 Feb 2015 07:56:22 +0100 Subject: [Numpy-discussion] Using numpy on hadoop streaming: ImportError: cannot import name multiarray In-Reply-To: References: Message-ID: On 11 February 2015 at 03:38, Kartik Kumar Perisetla wrote: > Also, I don't have root access thus, can't install numpy or any other > package on cluster You can create a virtualenv, and install packages on it without needing root access. To minimize trouble, you can ensure it uses the system packages when available. Here are instructions on how to install it: https://stackoverflow.com/questions/9348869/how-to-install-virtualenv-without-using-sudo http://opensourcehacker.com/2012/09/16/recommended-way-for-sudo-free-installation-of-python-software-with-virtualenv/ This does not require root access, but it is probably good to check with the sysadmins to make sure they are fine with it. /David. From kartik.peri at gmail.com Wed Feb 11 02:06:12 2015 From: kartik.peri at gmail.com (Kartik Kumar Perisetla) Date: Wed, 11 Feb 2015 02:06:12 -0500 Subject: [Numpy-discussion] Using numpy on hadoop streaming: ImportError: cannot import name multiarray In-Reply-To: References: Message-ID: Thanks David. 
But do I need to install virtualenv on every node in hadoop cluster? Actually I am not very sure whether same namenodes are assigned for my every hadoop job. So how shall I proceed on such scenario. Thanks for your inputs. Kartik On Feb 11, 2015 1:56 AM, "Da?id" wrote: > On 11 February 2015 at 03:38, Kartik Kumar Perisetla > wrote: > > Also, I don't have root access thus, can't install numpy or any other > > package on cluster > > You can create a virtualenv, and install packages on it without > needing root access. To minimize trouble, you can ensure it uses the > system packages when available. Here are instructions on how to > install it: > > > https://stackoverflow.com/questions/9348869/how-to-install-virtualenv-without-using-sudo > > http://opensourcehacker.com/2012/09/16/recommended-way-for-sudo-free-installation-of-python-software-with-virtualenv/ > > This does not require root access, but it is probably good to check > with the sysadmins to make sure they are fine with it. > > > /David. > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From davidmenhur at gmail.com Wed Feb 11 07:17:35 2015 From: davidmenhur at gmail.com (=?UTF-8?B?RGHPgGlk?=) Date: Wed, 11 Feb 2015 13:17:35 +0100 Subject: [Numpy-discussion] Using numpy on hadoop streaming: ImportError: cannot import name multiarray In-Reply-To: References: Message-ID: On 11 February 2015 at 08:06, Kartik Kumar Perisetla wrote: > Thanks David. But do I need to install virtualenv on every node in hadoop > cluster? Actually I am not very sure whether same namenodes are assigned for > my every hadoop job. So how shall I proceed on such scenario. I have never used hadoop, but in the clusters I have used, you have a home folder on the central node, and each and every computing node has access to it. 
You can then install Python in your home folder and make every node run that, or pull a local copy. Probably the cluster support can clear this up further and adapt it to your particular case. /David. From dieter.van.eessen at gmail.com Wed Feb 11 07:46:49 2015 From: dieter.van.eessen at gmail.com (Dieter Van Eessen) Date: Wed, 11 Feb 2015 13:46:49 +0100 Subject: [Numpy-discussion] 3D array and the right hand rule In-Reply-To: References: Message-ID: Ok, thanks for the reply! Indeed, I know about the use of transformation matrices to manipulate points in space. That's all matrix manipulation anyway.... But, (and perhaps this is not the right place to ask the following question): But are there no known mathmatical algorithms which involve the use of 3n arrays (or higher dimensions) to transform an object between one state and the other? This is an open question, as my knowledge of math is lacking on this area. I'm currently limited to 3D object manipulation and some statistics which all rely on matrix calculus... kind regards, Dieter On Fri, Jan 30, 2015 at 2:32 AM, Alexander Belopolsky wrote: > > On Mon, Jan 26, 2015 at 6:06 AM, Dieter Van Eessen < > dieter.van.eessen at gmail.com> wrote: > >> I've read that numpy.array isn't arranged according to the >> 'right-hand-rule' (right-hand-rule => thumb = +x; index finger = +y, bend >> middle finder = +z). This is also confirmed by an old message I dug up from >> the mailing list archives. (see message below) >> > > Dieter, > > It looks like you are confusing dimensionality of the array with the > dimensionality of a vector that it might store. If you are interested in > using numpy for 3D modeling, you will likely only encounter 1-dimensional > arrays (vectors) of size 3 and 2-dimensional arrays (matrices) of size 9 > or shape (3, 3). > > A 3-dimensional array is a stack of matrices and the 'right-hand-rule' > does not really apply. The notion of C/F-contiguous deals with the order > of axes (e.g. 
width first or depth first) while the right-hand-rule is > about the direction of the axes (if you "flip" the middle finger right hand > becomes left.) In the case of arrays this would probably correspond to > little-endian vs. big-endian: is a[0] stored at a higher or lower address > than a[1]. However, whatever the answer to this question is for a > particular system, it is the same for all axes in the array, so right-hand > - left-hand distinction does not apply. > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- gtz, Dieter VE -------------- next part -------------- An HTML attachment was scrubbed... URL: From rnelsonchem at gmail.com Wed Feb 11 10:21:44 2015 From: rnelsonchem at gmail.com (Ryan Nelson) Date: Wed, 11 Feb 2015 10:21:44 -0500 Subject: [Numpy-discussion] Matrix Class In-Reply-To: <1423606074771-39719.post@n7.nabble.com> References: <1423606074771-39719.post@n7.nabble.com> Message-ID: So: In [2]: np.mat([4,'5',6]) Out[2]: matrix([['4', '5', '6']], dtype=' wrote: > It seems to be agreed that there are weaknesses in the existing Numpy > Matrix > Class. > > Some problems are illustrated below. > > I'll try to put some suggestions over the coming weeks and would appreciate > comments. > > Colin W. > > Test Script: > > if __name__ == '__main__': > a= mat([4, 5, 6]) # Good > print('a: ', a) > b= mat([4, '5', 6]) # Not the expected result > print('b: ', b) > c= mat([[4, 5, 6], [7, 8]]) # Wrongly accepted as rectangular > print('c: ', c) > d= mat([[1, 2, 3]]) > try: > d[0, 1]= 'b' # Correctly flagged, not numeric > except ValueError: > print("d[0, 1]= 'b' # Correctly flagged, not numeric", > ' > ValueError') > print('d: ', d) > > Result: > > *** Python 2.7.9 (default, Dec 10 2014, 12:28:03) [MSC v.1500 64 bit > (AMD64)] on win32. 
*** > >>> > a: [[4 5 6]] > b: [['4' '5' '6']] > c: [[[4, 5, 6] [7, 8]]] > d[0, 1]= 'b' # Correctly flagged, not numeric ValueError > d: [[1 2 3]] > >>> > > > > > > -- > View this message in context: > http://numpy-discussion.10968.n7.nabble.com/Matrix-Class-tp39719.html > Sent from the Numpy-discussion mailing list archive at Nabble.com. > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Wed Feb 11 10:47:54 2015 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Wed, 11 Feb 2015 16:47:54 +0100 Subject: [Numpy-discussion] Matrix Class In-Reply-To: <1423606074771-39719.post@n7.nabble.com> References: <1423606074771-39719.post@n7.nabble.com> Message-ID: <1423669674.13462.1.camel@sipsolutions.net> On Di, 2015-02-10 at 15:07 -0700, cjw wrote: > It seems to be agreed that there are weaknesses in the existing Numpy Matrix > Class. > > Some problems are illustrated below. > Not to delve deeply into a discussion, but unfortunately, there seem far more fundamental problems because of the always 2-D thing and the simple fact that matrix is more of a second class citizen in numpy (or in other words a lot of this is just the general fact that it is an ndarray subclass). I think some of these issues were summarized in the discussion about the @ operator. I am not saying that a matrix class separate from numpy cannot solve these, but within numpy it seems hard. > I'll try to put some suggestions over the coming weeks and would appreciate > comments. > > Colin W. 
> > Test Script: > > if __name__ == '__main__': > a= mat([4, 5, 6]) # Good > print('a: ', a) > b= mat([4, '5', 6]) # Not the expected result > print('b: ', b) > c= mat([[4, 5, 6], [7, 8]]) # Wrongly accepted as rectangular > print('c: ', c) > d= mat([[1, 2, 3]]) > try: > d[0, 1]= 'b' # Correctly flagged, not numeric > except ValueError: > print("d[0, 1]= 'b' # Correctly flagged, not numeric", ' > ValueError') > print('d: ', d) > > Result: > > *** Python 2.7.9 (default, Dec 10 2014, 12:28:03) [MSC v.1500 64 bit > (AMD64)] on win32. *** > >>> > a: [[4 5 6]] > b: [['4' '5' '6']] > c: [[[4, 5, 6] [7, 8]]] > d[0, 1]= 'b' # Correctly flagged, not numeric ValueError > d: [[1 2 3]] > >>> > > > > > > -- > View this message in context: http://numpy-discussion.10968.n7.nabble.com/Matrix-Class-tp39719.html > Sent from the Numpy-discussion mailing list archive at Nabble.com. > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From cjw at ncf.ca Wed Feb 11 11:38:03 2015 From: cjw at ncf.ca (cjw) Date: Wed, 11 Feb 2015 11:38:03 -0500 Subject: [Numpy-discussion] Matrix Class In-Reply-To: <1423669674.13462.1.camel@sipsolutions.net> References: <1423606074771-39719.post@n7.nabble.com> <1423669674.13462.1.camel@sipsolutions.net> Message-ID: <54DB856B.3000709@ncf.ca> An HTML attachment was scrubbed... URL: From cjw at ncf.ca Wed Feb 11 11:54:28 2015 From: cjw at ncf.ca (cjw) Date: Wed, 11 Feb 2015 11:54:28 -0500 Subject: [Numpy-discussion] Matrix Class In-Reply-To: References: <1423606074771-39719.post@n7.nabble.com> Message-ID: <54DB8944.3010608@ncf.ca> An HTML attachment was scrubbed... 
URL: From alan.isaac at gmail.com Wed Feb 11 12:13:32 2015 From: alan.isaac at gmail.com (Alan G Isaac) Date: Wed, 11 Feb 2015 12:13:32 -0500 Subject: [Numpy-discussion] Matrix Class In-Reply-To: <54DB856B.3000709@ncf.ca> References: <1423606074771-39719.post@n7.nabble.com> <1423669674.13462.1.camel@sipsolutions.net> <54DB856B.3000709@ncf.ca> Message-ID: <54DB8DBC.9020505@gmail.com> Just recalling the one-year-ago discussion: http://comments.gmane.org/gmane.comp.python.numeric.general/56494 Alan Isaac From sebastian at sipsolutions.net Wed Feb 11 12:19:25 2015 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Wed, 11 Feb 2015 18:19:25 +0100 Subject: [Numpy-discussion] Matrix Class In-Reply-To: <54DB856B.3000709@ncf.ca> References: <1423606074771-39719.post@n7.nabble.com> <1423669674.13462.1.camel@sipsolutions.net> <54DB856B.3000709@ncf.ca> Message-ID: <1423675165.13462.3.camel@sipsolutions.net> On Mi, 2015-02-11 at 11:38 -0500, cjw wrote: > > On 11-Feb-15 10:47 AM, Sebastian Berg wrote: > > > On Di, 2015-02-10 at 15:07 -0700, cjw wrote: > > > It seems to be agreed that there are weaknesses in the existing Numpy Matrix > > > Class. > > > > > > Some problems are illustrated below. > > > > > Not to delve deeply into a discussion, but unfortunately, there seem far > > more fundamental problems because of the always 2-D thing and the simple > > fact that matrix is more of a second class citizen in numpy (or in other > > words a lot of this is just the general fact that it is an ndarray > > subclass). > Thanks Sebastian, > > We'll have to see what comes out of the discussion. > > I would be grateful if you could expand on the "always 2D thing". Is > there a need for a collection of matrices, where a function is applied > to each component of the collection? > No, I just mean the fact that a matrix is always 2D. 
This makes some things like some indexing operations awkward, and some functions that expect a numpy array (but think they can handle subclasses fine) may just plain break. And then ndarray subclasses are just a bit problematic... In short, you cannot generally expect a function which works great with arrays to also work great with matrices, I believe. This is true for some things within numpy and certainly for third party libraries I am sure. - Sebastian > Colin W. > > > > I think some of these issues were summarized in the discussion about the > > @ operator. I am not saying that a matrix class separate from numpy > > cannot solve these, but within numpy it seems hard. > > > > > > > I'll try to put some suggestions over the coming weeks and would appreciate > > > comments. > > > > > > Colin W. > > > > > > Test Script: > > > > > > if __name__ == '__main__': > > > a= mat([4, 5, 6]) # Good > > > print('a: ', a) > > > b= mat([4, '5', 6]) # Not the expected result > > > print('b: ', b) > > > c= mat([[4, 5, 6], [7, 8]]) # Wrongly accepted as rectangular > > > print('c: ', c) > > > d= mat([[1, 2, 3]]) > > > try: > > > d[0, 1]= 'b' # Correctly flagged, not numeric > > > except ValueError: > > > print("d[0, 1]= 'b' # Correctly flagged, not numeric", ' > > > ValueError') > > > print('d: ', d) > > > > > > Result: > > > > > > *** Python 2.7.9 (default, Dec 10 2014, 12:28:03) [MSC v.1500 64 bit > > > (AMD64)] on win32. *** > > > a: [[4 5 6]] > > > b: [['4' '5' '6']] > > > c: [[[4, 5, 6] [7, 8]]] > > > d[0, 1]= 'b' # Correctly flagged, not numeric ValueError > > > d: [[1 2 3]] > > > > > > > > > > > > > > > -- > > > View this message in context: http://numpy-discussion.10968.n7.nabble.com/Matrix-Class-tp39719.html > > > Sent from the Numpy-discussion mailing list archive at Nabble.com.
> > > _______________________________________________ > > > NumPy-Discussion mailing list > > > NumPy-Discussion at scipy.org > > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From shoyer at gmail.com Wed Feb 11 13:22:28 2015 From: shoyer at gmail.com (Stephan Hoyer) Date: Wed, 11 Feb 2015 10:22:28 -0800 Subject: [Numpy-discussion] Matrix Class In-Reply-To: <1423675165.13462.3.camel@sipsolutions.net> References: <1423606074771-39719.post@n7.nabble.com> <1423669674.13462.1.camel@sipsolutions.net> <54DB856B.3000709@ncf.ca> <1423675165.13462.3.camel@sipsolutions.net> Message-ID: On Wed, Feb 11, 2015 at 9:19 AM, Sebastian Berg wrote: > On Mi, 2015-02-11 at 11:38 -0500, cjw wrote: > No, I just mean the fact that a matrix is always 2D. This makes some > things like some indexing operations awkward and some functions that > expect a numpy array (but think they can handle subclasses fine) may > just plain brake. And then ndarray subclasses are just a bit > problematic.... > Indeed. In my opinion, a "fixed" version of np.matrix should (1) not be a np.ndarray subclass and (2) exist in a third party library not numpy itself. 
I don't think it's really feasible to fix np.matrix in its current state as an ndarray subclass, but even a fixed matrix class doesn't really belong in numpy itself, which has too long release cycles and compatibility guarantees for experimentation -- not to mention that the mere existence of the matrix class in numpy leads new users astray. If you're really excited about using matrix objects, I really would recommend starting a new project to implement the functionality (or maybe such a project already exists -- I don't know). Numpy has some excellent hooks for non-ndarray ndarray-like objects, so it's pretty straightforward to integrate with numpy ufuncs, etc. -------------- next part -------------- An HTML attachment was scrubbed... URL: From rhattersley at gmail.com Wed Feb 11 13:36:52 2015 From: rhattersley at gmail.com (R Hattersley) Date: Wed, 11 Feb 2015 18:36:52 +0000 Subject: [Numpy-discussion] Matrix Class In-Reply-To: References: <1423606074771-39719.post@n7.nabble.com> <1423669674.13462.1.camel@sipsolutions.net> <54DB856B.3000709@ncf.ca> <1423675165.13462.3.camel@sipsolutions.net> Message-ID: On 11 February 2015 at 18:22, Stephan Hoyer wrote: > In my opinion, a "fixed" version of np.matrix should (1) not be a > np.ndarray subclass and (2) exist in a third party library not numpy itself. > +1 for both of those ... but especially the first. -------------- next part -------------- An HTML attachment was scrubbed... URL: From cjw at ncf.ca Wed Feb 11 14:07:52 2015 From: cjw at ncf.ca (cjw) Date: Wed, 11 Feb 2015 14:07:52 -0500 Subject: [Numpy-discussion] Matrix Class In-Reply-To: <1423675165.13462.3.camel@sipsolutions.net> References: <1423606074771-39719.post@n7.nabble.com> <1423669674.13462.1.camel@sipsolutions.net> <54DB856B.3000709@ncf.ca> <1423675165.13462.3.camel@sipsolutions.net> Message-ID: <54DBA888.4090009@ncf.ca> An HTML attachment was scrubbed... 
URL: From cjw at ncf.ca Wed Feb 11 14:25:35 2015 From: cjw at ncf.ca (cjw) Date: Wed, 11 Feb 2015 14:25:35 -0500 Subject: [Numpy-discussion] Matrix Class In-Reply-To: <54DB8DBC.9020505@gmail.com> References: <1423606074771-39719.post@n7.nabble.com> <1423669674.13462.1.camel@sipsolutions.net> <54DB856B.3000709@ncf.ca> <54DB8DBC.9020505@gmail.com> Message-ID: <54DBACAF.8050203@ncf.ca> An HTML attachment was scrubbed... URL: From alan.isaac at gmail.com Wed Feb 11 14:57:26 2015 From: alan.isaac at gmail.com (Alan G Isaac) Date: Wed, 11 Feb 2015 14:57:26 -0500 Subject: [Numpy-discussion] Matrix Class In-Reply-To: <54DBACAF.8050203@ncf.ca> References: <1423606074771-39719.post@n7.nabble.com> <1423669674.13462.1.camel@sipsolutions.net> <54DB856B.3000709@ncf.ca> <54DB8DBC.9020505@gmail.com> <54DBACAF.8050203@ncf.ca> Message-ID: <54DBB426.9010301@gmail.com> On 2/11/2015 2:25 PM, cjw wrote: > I think of the matrix as a numeric object. What would the case be for having a Boolean matrix? It's one of my primary uses: https://en.wikipedia.org/wiki/Adjacency_matrix Numpy already provides SVD: http://docs.scipy.org/doc/numpy/reference/generated/numpy.linalg.svd.html A lot of core linear algebra is in `numpy.linalg`, and SciPy has much more. Remember for matrix `M` you can always apply any numpy function to `M.A`. I think gains could be in lazy evaluation structures (e.g., a KroneckerProduct object that never actually produces the product unless forced to.)
Cheers, Alan From pav at iki.fi Wed Feb 11 15:36:40 2015 From: pav at iki.fi (Pauli Virtanen) Date: Wed, 11 Feb 2015 22:36:40 +0200 Subject: [Numpy-discussion] Matrix Class In-Reply-To: <54DBB426.9010301@gmail.com> References: <1423606074771-39719.post@n7.nabble.com> <1423669674.13462.1.camel@sipsolutions.net> <54DB856B.3000709@ncf.ca> <54DB8DBC.9020505@gmail.com> <54DBACAF.8050203@ncf.ca> <54DBB426.9010301@gmail.com> Message-ID: 11.02.2015, 21:57, Alan G Isaac kirjoitti: [clip] > I think gains could be in lazy evaluation structures (e.g., > a KroneckerProduct object that never actually produces the product > unless forced to.) This sounds like an abstract linear operator interface. Several attempts have been made to this direction in Python world, but I think none of them has really gained traction so far. One is even in Scipy. Unfortunately, that one's design has grown organically, and it's mostly suited just for specifying inputs to sparse solvers etc. rather than abstract manipulations. If there was a popular way to deal with these objects, it could become even more popular reasonably quickly. From rnelsonchem at gmail.com Wed Feb 11 16:18:14 2015 From: rnelsonchem at gmail.com (Ryan Nelson) Date: Wed, 11 Feb 2015 16:18:14 -0500 Subject: [Numpy-discussion] Matrix Class In-Reply-To: <54DB8944.3010608@ncf.ca> References: <1423606074771-39719.post@n7.nabble.com> <54DB8944.3010608@ncf.ca> Message-ID: Colin, I currently use Py3.4 and Numpy 1.9.1. However, I built a quick test conda environment with Python2.7 and Numpy 1.7.0, and I get the same: ############ Python 2.7.9 |Continuum Analytics, Inc.| (default, Dec 18 2014, 16:57:52) [MSC v .1500 64 bit (AMD64)] Type "copyright", "credits" or "license" for more information. IPython 2.3.1 -- An enhanced Interactive Python. Anaconda is brought to you by Continuum Analytics. Please check out: http://continuum.io/thanks and https://binstar.org ? -> Introduction and overview of IPython's features. 
%quickref -> Quick reference. help -> Python's own help system. object? -> Details about 'object', use 'object??' for extra details. In [1]: import numpy as np In [2]: np.__version__ Out[2]: '1.7.0' In [3]: np.mat([4,'5',6]) Out[3]: matrix([['4', '5', '6']], dtype='|S1') In [4]: np.mat([4,'5',6], dtype=int) Out[4]: matrix([[4, 5, 6]]) ############### As to your comment about coordinating with Statsmodels, you should see the links in the thread that Alan posted: http://permalink.gmane.org/gmane.comp.python.numeric.general/56516 http://permalink.gmane.org/gmane.comp.python.numeric.general/56517 Josef's comments at the time seem to echo the issues the devs (and others) have with the matrix class. Maybe things have changed with Statsmodels. I know I mentioned Sage and SageMathCloud before. I'll just point out that there are folks that use this for real research problems, not just as a pedagogical tool. They have a Matrix/vector/column_matrix class that do what you were expecting from your problems posted above. Indeed below is a (truncated) cut and past from a Sage Worksheet. (See http://www.sagemath.org/doc/tutorial/tour_linalg.html) ########## In : Matrix([1,'2',3]) Error in lines 1-1 Traceback (most recent call last): TypeError: unable to find a common ring for all elements In : Matrix([[1,2,3],[4,5]]) ValueError: List of rows is not valid (rows are wrong types or lengths) In : vector([1,2,3]) (1, 2, 3) In : column_matrix([1,2,3]) [1] [2] [3] ########## Large portions of the custom code and wrappers in Sage are written in Python. I don't think their Matrix object is a subclass of ndarray, so perhaps you could strip out the Matrix stuff from here to make a separate project with just the Matrix stuff, if you don't want to go through the Sage interface. 
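To make the comparison with Sage concrete: the kind of up-front validation its Matrix constructor performs can be sketched in a few lines of plain Python on top of numpy. This is only an illustration of the idea, not an existing API; the function name and the specific checks are made up:

```python
import numbers
import numpy as np

def strict_matrix(rows):
    """Build a 2-D array, refusing ragged or non-numeric input up front."""
    if not rows or not isinstance(rows[0], (list, tuple)):
        rows = [rows]                  # promote a flat list to a single row
    width = len(rows[0])
    for r in rows:
        if len(r) != width:
            raise ValueError("rows have unequal lengths")
        for x in r:
            if not isinstance(x, numbers.Number):
                raise TypeError("non-numeric entry: %r" % (x,))
    return np.array(rows)

strict_matrix([4, 5, 6])               # fine: a 1x3 numeric array
try:
    strict_matrix([4, '5', 6])         # rejected instead of upcast to strings
except TypeError as e:
    print(e)
try:
    strict_matrix([[4, 5, 6], [7, 8]])  # rejected instead of silently accepted
except ValueError as e:
    print(e)
```

This reproduces the behaviour Colin expected in cases (b) and (c) of his test script: mixed types and ragged rows fail loudly at construction time rather than producing a surprising array.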
On Wed, Feb 11, 2015 at 11:54 AM, cjw wrote: > > On 11-Feb-15 10:21 AM, Ryan Nelson wrote: > > So: > > In [2]: np.mat([4,'5',6]) > Out[2]: > matrix([['4', '5', '6']], dtype=' > In [3]: np.mat([4,'5',6], dtype=int) > Out[3]: matrix([[4, 5, 6]]) > > > Thanks Ryan, > > We are not singing from the same hymn book. > > Using PyScripter, I get: > > *** Python 2.7.9 (default, Dec 10 2014, 12:28:03) [MSC v.1500 64 bit > (AMD64)] on win32. *** > >>> import numpy as np > >>> print('Numpy version: ', np.__version__) > ('Numpy version: ', '1.9.0') > >>> > > Could you say which version you are using please? > > Colin W > > > On Tue, Feb 10, 2015 at 5:07 PM, cjw wrote: > > > It seems to be agreed that there are weaknesses in the existing Numpy > Matrix > Class. > > Some problems are illustrated below. > > I'll try to put some suggestions over the coming weeks and would appreciate > comments. > > Colin W. > > Test Script: > > if __name__ == '__main__': > a= mat([4, 5, 6]) # Good > print('a: ', a) > b= mat([4, '5', 6]) # Not the expected result > print('b: ', b) > c= mat([[4, 5, 6], [7, 8]]) # Wrongly accepted as rectangular > print('c: ', c) > d= mat([[1, 2, 3]]) > try: > d[0, 1]= 'b' # Correctly flagged, not numeric > except ValueError: > print("d[0, 1]= 'b' # Correctly flagged, not numeric", > ' > ValueError') > print('d: ', d) > > Result: > > *** Python 2.7.9 (default, Dec 10 2014, 12:28:03) [MSC v.1500 64 bit > (AMD64)] on win32. *** > > a: [[4 5 6]] > b: [['4' '5' '6']] > c: [[[4, 5, 6] [7, 8]]] > d[0, 1]= 'b' # Correctly flagged, not numeric ValueError > d: [[1 2 3]] > > > > > > -- > View this message in context:http://numpy-discussion.10968.n7.nabble.com/Matrix-Class-tp39719.html > Sent from the Numpy-discussion mailing list archive at Nabble.com. 
> _______________________________________________ > NumPy-Discussion mailing listNumPy-Discussion at scipy.orghttp://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing listNumPy-Discussion at scipy.orghttp://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From kartik.peri at gmail.com Wed Feb 11 16:59:35 2015 From: kartik.peri at gmail.com (Kartik Kumar Perisetla) Date: Wed, 11 Feb 2015 16:59:35 -0500 Subject: [Numpy-discussion] Using numpy on hadoop streaming: ImportError: cannot import name multiarray In-Reply-To: References: Message-ID: Hi David, Thanks for your response. But I can't install anything on cluster. *Could anyone please help me understand how the file 'multiarray.so' is used by the tagger. I mean how it is loaded( I assume its some sort of DLL for windows and shared library for unix based systems). Is it a module or what?* Right now what I did is I packaged numpy so that numpy will be present at the current working directory for mapper and reducer. So now control goes into numpy packaged alongwith mapper. 
But still right now I see such error: *File "glossextractionengine.mod/nltk/tag/__init__.py", line 123, in pos_tag File "glossextractionengine.mod/pickle.py", line 1380, in load return doctest.testmod() File "glossextractionengine.mod/pickle.py", line 860, in load return stopinst.value File "glossextractionengine.mod/pickle.py", line 1092, in load_global dispatch[GLOBAL] = load_global File "glossextractionengine.mod/pickle.py", line 1126, in find_class klass = getattr(mod, name) File "numpy.mod/numpy/__init__.py", line 137, in File "numpy.mod/numpy/add_newdocs.py", line 13, in File "numpy.mod/numpy/lib/__init__.py", line 4, in File "numpy.mod/numpy/lib/type_check.py", line 21, in File "numpy.mod/numpy/core/__init__.py", line 9, in ImportError: No module named multiarray* In this case the file 'multiarray.so' is present in within core package only, but it is still not found. Can anyone throw some light on it. Thanks! Kartik On Wed, Feb 11, 2015 at 7:17 AM, Da?id wrote: > On 11 February 2015 at 08:06, Kartik Kumar Perisetla > wrote: > > Thanks David. But do I need to install virtualenv on every node in hadoop > > cluster? Actually I am not very sure whether same namenodes are assigned > for > > my every hadoop job. So how shall I proceed on such scenario. > > I have never used hadoop, but in the clusters I have used, you have a > home folder on the central node, and each and every computing node has > access to it. You can then install Python in your home folder and make > every node run that, or pull a local copy. > > Probably the cluster support can clear this up further and adapt it to > your particular case. > > /David. > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Regards, Kartik Perisetla -------------- next part -------------- An HTML attachment was scrubbed... 
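One way to debug this class of failure is to reproduce the mapper's import setup locally, outside Hadoop, so the ImportError surfaces with a clearer message. The sketch below fabricates a tiny stand-in package in a temporary directory (the name `mypkg` and the paths are made up; substitute the real unpacked bundle directory, e.g. the `numpy.mod` seen in the traceback):

```python
import os
import sys
import tempfile

# Stand-in for the job's unpacked bundle directory; a real check would
# point `bundle` at the directory the streaming job ships to each node.
bundle = tempfile.mkdtemp()
pkg = os.path.join(bundle, "mypkg")
os.makedirs(pkg)
with open(os.path.join(pkg, "__init__.py"), "w") as f:
    f.write("value = 42\n")

# What a mapper would do first: put the bundle ahead of everything else
# on sys.path, then attempt the import and fail fast with a useful message.
sys.path.insert(0, bundle)
try:
    import mypkg
    print("import ok, value =", mypkg.value)
except ImportError as e:
    print("bundle unusable on %s: %s" % (sys.platform, e))
```

Note also that a compiled extension such as multiarray is platform-specific: a `.pyd` built on Windows can never load on a Linux cluster node, so the bundled numpy has to be built on (or for) the same platform and Python version as the cluster's interpreter.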
URL: From ndbecker2 at gmail.com Thu Feb 12 09:21:56 2015 From: ndbecker2 at gmail.com (Neal Becker) Date: Thu, 12 Feb 2015 09:21:56 -0500 Subject: [Numpy-discussion] unpacking data values into array of bits Message-ID: I need to transmit some data values. These values will be float and long values. I need them encoded into a string of bits. The only way I found so far to do this seems rather roundabout: np.unpackbits (np.array (memoryview(struct.pack ('d', pi)))) Out[45]: array([0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 1, 1, 1, 1, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0], dtype=uint8) (which I'm not certain is correct) Also, I don't know how to reverse this process -- -- Those who don't understand recursion are doomed to repeat it From robert.kern at gmail.com Thu Feb 12 09:32:22 2015 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 12 Feb 2015 14:32:22 +0000 Subject: [Numpy-discussion] unpacking data values into array of bits In-Reply-To: References: Message-ID: On Thu, Feb 12, 2015 at 2:21 PM, Neal Becker wrote: > > I need to transmit some data values. These values will be float and long > values. I need them encoded into a string of bits. > > The only way I found so far to do this seems rather roundabout: > > > np.unpackbits (np.array (memoryview(struct.pack ('d', pi)))) > Out[45]: > array([0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 0, 0, 1, 0, > 0, 0, 1, 0, 1, 0, 1, 0, 0, 1, 1, 1, 1, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0, > 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0], dtype=uint8) > > (which I'm not certain is correct) > > Also, I don't know how to reverse this process You already had your string ready for transmission with `struct.pack('d', pi)`. -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... 
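On the "reverse this process" part of Neal's question: `np.packbits` is the inverse of `np.unpackbits`, so the float can be recovered by packing the bit array back into bytes and handing those to `struct.unpack`. A round-trip sketch:

```python
import struct
from math import pi

import numpy as np

# float -> 64 bits (one uint8 per bit), as in the original post
bits = np.unpackbits(np.frombuffer(struct.pack('d', pi), dtype=np.uint8))
assert bits.size == 64

# bits -> bytes -> float again
raw = np.packbits(bits).tobytes()
value, = struct.unpack('d', raw)
print(value)   # 3.141592653589793
```

The round trip is exact because `struct.pack('d', ...)` already produces the 8 raw IEEE-754 bytes; `unpackbits`/`packbits` only move between those bytes and an explicit array of bits.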
URL: From ndbecker2 at gmail.com Thu Feb 12 10:00:50 2015 From: ndbecker2 at gmail.com (Neal Becker) Date: Thu, 12 Feb 2015 10:00:50 -0500 Subject: [Numpy-discussion] unpacking data values into array of bits References: Message-ID: Robert Kern wrote: > On Thu, Feb 12, 2015 at 2:21 PM, Neal Becker wrote: >> >> I need to transmit some data values. These values will be float and long >> values. I need them encoded into a string of bits. >> >> The only way I found so far to do this seems rather roundabout: >> >> >> np.unpackbits (np.array (memoryview(struct.pack ('d', pi)))) >> Out[45]: >> array([0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 0, 0, 1, > 0, >> 0, 0, 1, 0, 1, 0, 1, 0, 0, 1, 1, 1, 1, 1, 0, 1, 1, 0, 0, 1, 0, 0, > 0, >> 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0], dtype=uint8) >> >> (which I'm not certain is correct) >> >> Also, I don't know how to reverse this process > > You already had your string ready for transmission with `struct.pack('d', > pi)`. > > -- > Robert Kern my transmitter wants an np array of bits, not a string -- -- Those who don't understand recursion are doomed to repeat it From robert.kern at gmail.com Thu Feb 12 10:08:39 2015 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 12 Feb 2015 15:08:39 +0000 Subject: [Numpy-discussion] unpacking data values into array of bits In-Reply-To: References: Message-ID: On Thu, Feb 12, 2015 at 3:00 PM, Neal Becker wrote: > > Robert Kern wrote: > > > On Thu, Feb 12, 2015 at 2:21 PM, Neal Becker wrote: > >> > >> I need to transmit some data values. These values will be float and long > >> values. I need them encoded into a string of bits. 
> >> > >> The only way I found so far to do this seems rather roundabout: > >> > >> > >> np.unpackbits (np.array (memoryview(struct.pack ('d', pi)))) > >> Out[45]: > >> array([0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 0, 0, 1, > > 0, > >> 0, 0, 1, 0, 1, 0, 1, 0, 0, 1, 1, 1, 1, 1, 0, 1, 1, 0, 0, 1, 0, 0, > > 0, > >> 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0], dtype=uint8) > >> > >> (which I'm not certain is correct) > >> > >> Also, I don't know how to reverse this process > > > > You already had your string ready for transmission with `struct.pack('d', > > pi)`. > > > > -- > > Robert Kern > > my transmitter wants an np array of bits, not a string Can you provide any details on what your "transmitter" is? -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... URL: From ndbecker2 at gmail.com Thu Feb 12 10:22:08 2015 From: ndbecker2 at gmail.com (Neal Becker) Date: Thu, 12 Feb 2015 10:22:08 -0500 Subject: [Numpy-discussion] unpacking data values into array of bits References: Message-ID: Robert Kern wrote: > On Thu, Feb 12, 2015 at 3:00 PM, Neal Becker wrote: >> >> Robert Kern wrote: >> >> > On Thu, Feb 12, 2015 at 2:21 PM, Neal Becker > wrote: >> >> >> >> I need to transmit some data values. These values will be float and > long >> >> values. I need them encoded into a string of bits. >> >> >> >> The only way I found so far to do this seems rather roundabout: >> >> >> >> >> >> np.unpackbits (np.array (memoryview(struct.pack ('d', pi)))) >> >> Out[45]: >> >> array([0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 0, 0, > 1, >> > 0, >> >> 0, 0, 1, 0, 1, 0, 1, 0, 0, 1, 1, 1, 1, 1, 0, 1, 1, 0, 0, 1, 0, > 0, >> > 0, >> >> 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0], > dtype=uint8) >> >> >> >> (which I'm not certain is correct) >> >> >> >> Also, I don't know how to reverse this process >> > >> > You already had your string ready for transmission with > `struct.pack('d', >> > pi)`. 
>> > >> > -- >> > Robert Kern >> >> my transmitter wants an np array of bits, not a string > > Can you provide any details on what your "transmitter" is? > > -- My transmitter is c++ code that accepts as input a numpy array of np.int32. Each element of that array has value 0 or 1. From robert.kern at gmail.com Thu Feb 12 10:32:00 2015 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 12 Feb 2015 15:32:00 +0000 Subject: [Numpy-discussion] unpacking data values into array of bits In-Reply-To: References: Message-ID: On Thu, Feb 12, 2015 at 3:22 PM, Neal Becker wrote: > > Robert Kern wrote: > > > On Thu, Feb 12, 2015 at 3:00 PM, Neal Becker wrote: > >> > >> Robert Kern wrote: > >> > >> > On Thu, Feb 12, 2015 at 2:21 PM, Neal Becker > > wrote: > >> >> > >> >> I need to transmit some data values. These values will be float and > > long > >> >> values. I need them encoded into a string of bits. > >> >> > >> >> The only way I found so far to do this seems rather roundabout: > >> >> > >> >> > >> >> np.unpackbits (np.array (memoryview(struct.pack ('d', pi)))) > >> >> Out[45]: > >> >> array([0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 0, 0, > > 1, > >> > 0, > >> >> 0, 0, 1, 0, 1, 0, 1, 0, 0, 1, 1, 1, 1, 1, 0, 1, 1, 0, 0, 1, 0, > > 0, > >> > 0, > >> >> 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0], > > dtype=uint8) > >> >> > >> >> (which I'm not certain is correct) > >> >> > >> >> Also, I don't know how to reverse this process > >> > > >> > You already had your string ready for transmission with > > `struct.pack('d', > >> > pi)`. > >> > > >> > -- > >> > Robert Kern > >> > >> my transmitter wants an np array of bits, not a string > > > > Can you provide any details on what your "transmitter" is? > > > > -- > > My transmitter is c++ code that accepts as input a numpy array of np.int32. > Each element of that array has value 0 or 1. Ah, great. That makes sense, then. 
def tobeckerbits(x): return np.unpackbits(np.frombuffer(np.asarray(x), dtype=np.uint8)).astype(np.int32) def frombeckerbits(bits, dtype): return np.frombuffer(np.packbits(bits), dtype=dtype)[0] -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... URL: From rays at blue-cove.com Thu Feb 12 10:44:44 2015 From: rays at blue-cove.com (R Schumacher) Date: Thu, 12 Feb 2015 07:44:44 -0800 Subject: [Numpy-discussion] unpacking data values into array of bits In-Reply-To: References: Message-ID: <201502121544.t1CFim6g007523@blue-cove.com> Hmmm np.unpackbits (np.array (memoryview(struct.pack ('d', np.pi)))).astype(np.int32) np.array([b for b in np.binary_repr(314159)], 'int32') # ints only! np.array([b for b in bin(struct.unpack('!i',struct.pack('!f',1.0))[0])[2:]], 'int32') timing is untested. - Ray Schumacher At 07:22 AM 2/12/2015, you wrote: >Robert Kern wrote: > > > On Thu, Feb 12, 2015 at 3:00 PM, Neal Becker wrote: > >> > >> Robert Kern wrote: > >> > >> > On Thu, Feb 12, 2015 at 2:21 PM, Neal Becker > > wrote: > >> >> > >> >> I need to transmit some data values. These values will be float and > > long > >> >> values. I need them encoded into a string of bits. > >> >> > >> >> The only way I found so far to do this seems rather roundabout: > >> >> > >> >> > >> >> np.unpackbits (np.array (memoryview(struct.pack ('d', pi)))) > >> >> Out[45]: > >> >> array([0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 0, 0, > > 1, > >> > 0, > >> >> 0, 0, 1, 0, 1, 0, 1, 0, 0, 1, 1, 1, 1, 1, 0, 1, 1, 0, 0, 1, 0, > > 0, > >> > 0, > >> >> 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0], > > dtype=uint8) > >> >> > >> >> (which I'm not certain is correct) > >> >> > >> >> Also, I don't know how to reverse this process > >> > > >> > You already had your string ready for transmission with > > `struct.pack('d', > >> > pi)`. 
> >> > > >> > -- > >> > Robert Kern > >> > >> my transmitter wants an np array of bits, not a string > > > > Can you provide any details on what your "transmitter" is? > > > > -- > >My transmitter is c++ code that accepts as input a numpy array of np.int32. >Each element of that array has value 0 or 1. > > >_______________________________________________ >NumPy-Discussion mailing list >NumPy-Discussion at scipy.org >http://mail.scipy.org/mailman/listinfo/numpy-discussion From ndbecker2 at gmail.com Thu Feb 12 10:45:20 2015 From: ndbecker2 at gmail.com (Neal Becker) Date: Thu, 12 Feb 2015 10:45:20 -0500 Subject: [Numpy-discussion] unpacking data values into array of bits References: Message-ID: Robert Kern wrote: > On Thu, Feb 12, 2015 at 3:22 PM, Neal Becker wrote: >> >> Robert Kern wrote: >> >> > On Thu, Feb 12, 2015 at 3:00 PM, Neal Becker > wrote: >> >> >> >> Robert Kern wrote: >> >> >> >> > On Thu, Feb 12, 2015 at 2:21 PM, Neal Becker >> > wrote: >> >> >> >> >> >> I need to transmit some data values. These values will be float and >> > long >> >> >> values. I need them encoded into a string of bits. >> >> >> >> >> >> The only way I found so far to do this seems rather roundabout: >> >> >> >> >> >> >> >> >> np.unpackbits (np.array (memoryview(struct.pack ('d', pi)))) >> >> >> Out[45]: >> >> >> array([0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 0, > 0, >> > 1, >> >> > 0, >> >> >> 0, 0, 1, 0, 1, 0, 1, 0, 0, 1, 1, 1, 1, 1, 0, 1, 1, 0, 0, 1, > 0, >> > 0, >> >> > 0, >> >> >> 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0], >> > dtype=uint8) >> >> >> >> >> >> (which I'm not certain is correct) >> >> >> >> >> >> Also, I don't know how to reverse this process >> >> > >> >> > You already had your string ready for transmission with >> > `struct.pack('d', >> >> > pi)`. >> >> > >> >> > -- >> >> > Robert Kern >> >> >> >> my transmitter wants an np array of bits, not a string >> > >> > Can you provide any details on what your "transmitter" is? 
>> > >> > -- >> >> My transmitter is c++ code that accepts as input a numpy array of > np.int32. >> Each element of that array has value 0 or 1. > > Ah, great. That makes sense, then. > > def tobeckerbits(x): > return np.unpackbits(np.frombuffer(np.asarray(x), > dtype=np.uint8)).astype(np.int32) > > def frombeckerbits(bits, dtype): > return np.frombuffer(np.packbits(bits), dtype=dtype)[0] > > -- > Robert Kern Nice! Also seems to work for arrays of values: def tobeckerbits(x): return np.unpackbits(np.frombuffer(np.asarray(x), dtype=np.uint8)).astype(np.int32) def frombeckerbits(bits, dtype): return np.frombuffer(np.packbits(bits), dtype=dtype) << leaving off the [0] x = tobeckerbits (2.7) y = frombeckerbits (x, float) x2 = tobeckerbits (np.array ((1.1, 2.2))) y2 = frombeckerbits (x2, float) -- -- Those who don't understand recursion are doomed to repeat it From rays at blue-cove.com Thu Feb 12 10:55:50 2015 From: rays at blue-cove.com (R Schumacher) Date: Thu, 12 Feb 2015 07:55:50 -0800 Subject: [Numpy-discussion] unpacking data values into array of bits In-Reply-To: References: Message-ID: <201502121555.t1CFtsxt010136@blue-cove.com> At 07:45 AM 2/12/2015, you wrote: >Robert Kern wrote: > > def tobeckerbits(x): > > return np.unpackbits(np.frombuffer(np.asarray(x), > > dtype=np.uint8)).astype(np.int32) > > > > def frombeckerbits(bits, dtype): > > return np.frombuffer(np.packbits(bits), dtype=dtype)[0] > > > > -- > > Robert Kern > >Nice! 
Also seems to work for arrays of values: It's also fastest, with most pushed to numpy: >>> s='np.unpackbits(np.frombuffer(np.asarray(np.pi), dtype=np.uint8)).astype(np.int32)' >>> timr(s) 0.000252470827292 >>> s="np.array([b for b in bin(struct.unpack('!i',struct.pack('!f',1.0))[0])[2:]], 'int32')" >>> timr(s) 0.000513665240078 >>> s="np.unpackbits (np.array (memoryview(struct.pack ('d', np.pi)))).astype(np.int32)" >>> timr(s) 0.000466455247988 >>> s="np.array([b for b in np.binary_repr(314159)], 'int32')" >>> timr(s) 0.000423350472602 From njs at pobox.com Thu Feb 12 12:15:31 2015 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 12 Feb 2015 09:15:31 -0800 Subject: [Numpy-discussion] unpacking data values into array of bits In-Reply-To: <201502121555.t1CFtsxt010136@blue-cove.com> References: <201502121555.t1CFtsxt010136@blue-cove.com> Message-ID: On 12 Feb 2015 07:55, "R Schumacher" wrote: > > >>> s='np.unpackbits(np.frombuffer(np.asarray(np.pi), > dtype=np.uint8)).astype(np.int32)' > >>> timr(s) > 0.000252470827292 I'm not sure what timr is, but you should check out ipython and its built in %timeit command, which is trivial to use but, because it uses the 'timeit' module, goes to extreme effort to get accurate times (minimizing timing overhead, performing multiple runs, etc.). -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From cjwilliams43 at gmail.com Thu Feb 12 18:44:07 2015 From: cjwilliams43 at gmail.com (Colin J. Williams) Date: Thu, 12 Feb 2015 18:44:07 -0500 Subject: [Numpy-discussion] Matrix Class In-Reply-To: References: <1423606074771-39719.post@n7.nabble.com> <54DB8944.3010608@ncf.ca> Message-ID: Thanks Ryan. There are a number of good thoughts in your message. I'll try to keep track of them. Another respondent reported different results than mine. I'm in the process of re-installing to check. Colin W. On 11 February 2015 at 16:18, Ryan Nelson wrote: > Colin, > > I currently use Py3.4 and Numpy 1.9.1. 
However, I built a quick test conda > environment with Python2.7 and Numpy 1.7.0, and I get the same: > > ############ > Python 2.7.9 |Continuum Analytics, Inc.| (default, Dec 18 2014, 16:57:52) > [MSC v > .1500 64 bit (AMD64)] > Type "copyright", "credits" or "license" for more information. > > IPython 2.3.1 -- An enhanced Interactive Python. > Anaconda is brought to you by Continuum Analytics. > Please check out: http://continuum.io/thanks and https://binstar.org > ? -> Introduction and overview of IPython's features. > %quickref -> Quick reference. > help -> Python's own help system. > object? -> Details about 'object', use 'object??' for extra details. > > In [1]: import numpy as np > > In [2]: np.__version__ > Out[2]: '1.7.0' > > In [3]: np.mat([4,'5',6]) > Out[3]: > matrix([['4', '5', '6']], > dtype='|S1') > > In [4]: np.mat([4,'5',6], dtype=int) > Out[4]: matrix([[4, 5, 6]]) > ############### > > As to your comment about coordinating with Statsmodels, you should see the > links in the thread that Alan posted: > http://permalink.gmane.org/gmane.comp.python.numeric.general/56516 > http://permalink.gmane.org/gmane.comp.python.numeric.general/56517 > Josef's comments at the time seem to echo the issues the devs (and others) > have with the matrix class. Maybe things have changed with Statsmodels. > > I know I mentioned Sage and SageMathCloud before. I'll just point out that > there are folks that use this for real research problems, not just as a > pedagogical tool. They have a Matrix/vector/column_matrix class that do > what you were expecting from your problems posted above. Indeed below is a > (truncated) cut and past from a Sage Worksheet. 
(See > http://www.sagemath.org/doc/tutorial/tour_linalg.html) > ########## > In : Matrix([1,'2',3]) > Error in lines 1-1 > Traceback (most recent call last): > TypeError: unable to find a common ring for all elements > > In : Matrix([[1,2,3],[4,5]]) > ValueError: List of rows is not valid (rows are wrong types or lengths) > > In : vector([1,2,3]) > (1, 2, 3) > > In : column_matrix([1,2,3]) > [1] > [2] > [3] > ########## > > Large portions of the custom code and wrappers in Sage are written in > Python. I don't think their Matrix object is a subclass of ndarray, so > perhaps you could strip out the Matrix stuff from here to make a separate > project with just the Matrix stuff, if you don't want to go through the > Sage interface. > > > On Wed, Feb 11, 2015 at 11:54 AM, cjw wrote: > >> >> On 11-Feb-15 10:21 AM, Ryan Nelson wrote: >> >> So: >> >> In [2]: np.mat([4,'5',6]) >> Out[2]: >> matrix([['4', '5', '6']], dtype='> >> In [3]: np.mat([4,'5',6], dtype=int) >> Out[3]: matrix([[4, 5, 6]]) >> >> >> Thanks Ryan, >> >> We are not singing from the same hymn book. >> >> Using PyScripter, I get: >> >> *** Python 2.7.9 (default, Dec 10 2014, 12:28:03) [MSC v.1500 64 bit >> (AMD64)] on win32. *** >> >>> import numpy as np >> >>> print('Numpy version: ', np.__version__) >> ('Numpy version: ', '1.9.0') >> >>> >> >> Could you say which version you are using please? >> >> Colin W >> >> On Tue, Feb 10, 2015 at 5:07 PM, cjw wrote: >> >> >> It seems to be agreed that there are weaknesses in the existing Numpy >> Matrix >> Class. >> >> Some problems are illustrated below. >> >> I'll try to put some suggestions over the coming weeks and would appreciate >> comments. >> >> Colin W. 
>> >> Test Script: >> >> if __name__ == '__main__': >> a= mat([4, 5, 6]) # Good >> print('a: ', a) >> b= mat([4, '5', 6]) # Not the expected result >> print('b: ', b) >> c= mat([[4, 5, 6], [7, 8]]) # Wrongly accepted as rectangular >> print('c: ', c) >> d= mat([[1, 2, 3]]) >> try: >> d[0, 1]= 'b' # Correctly flagged, not numeric >> except ValueError: >> print("d[0, 1]= 'b' # Correctly flagged, not numeric", >> ' >> ValueError') >> print('d: ', d) >> >> Result: >> >> *** Python 2.7.9 (default, Dec 10 2014, 12:28:03) [MSC v.1500 64 bit >> (AMD64)] on win32. *** >> >> a: [[4 5 6]] >> b: [['4' '5' '6']] >> c: [[[4, 5, 6] [7, 8]]] >> d[0, 1]= 'b' # Correctly flagged, not numeric ValueError >> d: [[1 2 3]] >> >> >> >> >> -- >> View this message in context:http://numpy-discussion.10968.n7.nabble.com/Matrix-Class-tp39719.html >> Sent from the Numpy-discussion mailing list archive at Nabble.com. >> _______________________________________________ >> NumPy-Discussion mailing listNumPy-Discussion at scipy.orghttp://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> >> >> _______________________________________________ >> NumPy-Discussion mailing listNumPy-Discussion at scipy.orghttp://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... 
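[Editorial note: Colin's test script shows np.mat silently coercing mixed input to strings and accepting ragged rows. A small wrapper sketch that raises the errors he expected, in the spirit of Sage's Matrix constructor; the helper name `checked_mat` is mine and is not part of numpy.]

```python
import numpy as np

def checked_mat(rows):
    """Hypothetical helper: reject mixed types and ragged rows up front
    instead of letting numpy silently coerce to strings or build an
    object array."""
    try:
        arr = np.asarray(rows)
    except ValueError:
        # Recent numpy raises directly on ragged ("inhomogeneous") input.
        raise ValueError("rows have inconsistent lengths")
    if arr.dtype == object:
        # Older numpy builds an object array for ragged input instead.
        raise ValueError("rows have inconsistent lengths or types")
    if not np.issubdtype(arr.dtype, np.number):
        raise TypeError("non-numeric element found")
    return np.atleast_2d(arr)

checked_mat([4, 5, 6])  # fine: a 1x3 numeric array
```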
URL: From faltet at gmail.com Fri Feb 13 05:51:48 2015 From: faltet at gmail.com (Francesc Alted) Date: Fri, 13 Feb 2015 11:51:48 +0100 Subject: [Numpy-discussion] Vectorizing computation Message-ID: Hi, I would like to vectorize the next computation: nx, ny, nz = 720, 180, 3 outheight = np.arange(nz) * 3 oro = np.arange(nx * ny).reshape((nx, ny)) def compute1(outheight, oro): result = np.zeros((nx, ny, nz)) for ix in range(nx): for iz in range(nz): result[ix, :, iz] = outheight[iz] + oro[ix, :] return result I think this should be possible by using an advanced use of broadcasting in numpy. Anyone willing to post a solution? Thanks, -- Francesc Alted -------------- next part -------------- An HTML attachment was scrubbed... URL: From jtaylor.debian at googlemail.com Fri Feb 13 06:51:02 2015 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Fri, 13 Feb 2015 12:51:02 +0100 Subject: [Numpy-discussion] Vectorizing computation In-Reply-To: References: Message-ID: <54DDE526.8070701@googlemail.com> On 02/13/2015 11:51 AM, Francesc Alted wrote: > Hi, > > I would like to vectorize the next computation: > > nx, ny, nz = 720, 180, 3 > outheight = np.arange(nz) * 3 > oro = np.arange(nx * ny).reshape((nx, ny)) > > def compute1(outheight, oro): > result = np.zeros((nx, ny, nz)) > for ix in range(nx): > for iz in range(nz): > result[ix, :, iz] = outheight[iz] + oro[ix, :] > return result > > I think this should be possible by using an advanced use of broadcasting > in numpy. Anyone willing to post a solution? 
result = outheight + oro.reshape(nx, ny, 1) From faltet at gmail.com Fri Feb 13 07:03:50 2015 From: faltet at gmail.com (Francesc Alted) Date: Fri, 13 Feb 2015 13:03:50 +0100 Subject: [Numpy-discussion] Vectorizing computation In-Reply-To: <54DDE526.8070701@googlemail.com> References: <54DDE526.8070701@googlemail.com> Message-ID: 2015-02-13 12:51 GMT+01:00 Julian Taylor : > On 02/13/2015 11:51 AM, Francesc Alted wrote: > > Hi, > > > > I would like to vectorize the next computation: > > > > nx, ny, nz = 720, 180, 3 > > outheight = np.arange(nz) * 3 > > oro = np.arange(nx * ny).reshape((nx, ny)) > > > > def compute1(outheight, oro): > > result = np.zeros((nx, ny, nz)) > > for ix in range(nx): > > for iz in range(nz): > > result[ix, :, iz] = outheight[iz] + oro[ix, :] > > return result > > > > I think this should be possible by using an advanced use of broadcasting > > in numpy. Anyone willing to post a solution? > > > result = outheight + oro.reshape(nx, ny, 1) > > And 4x faster for my case. Oh my, I am afraid that my mind will never scratch all the amazing possibilities that broadcasting is offering :) Thank you very much for such an elegant solution! Francesc -------------- next part -------------- An HTML attachment was scrubbed... 
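[Editorial note: Julian's reshape one-liner can be checked against the original loop directly. A quick equivalence check, using smaller dimensions than the thread's 720x180x3 so it runs instantly.]

```python
import numpy as np

nx, ny, nz = 6, 4, 3
outheight = np.arange(nz) * 3
oro = np.arange(nx * ny).reshape((nx, ny))

def compute1(outheight, oro):
    # The explicit double loop from the original post.
    result = np.zeros((nx, ny, nz))
    for ix in range(nx):
        for iz in range(nz):
            result[ix, :, iz] = outheight[iz] + oro[ix, :]
    return result

# Broadcasting: (nx, ny, 1) + (nz,) -> (nx, ny, nz)
vectorized = outheight + oro.reshape(nx, ny, 1)
assert np.array_equal(compute1(outheight, oro), vectorized)
```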
URL: From jtaylor.debian at googlemail.com Fri Feb 13 07:25:31 2015 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Fri, 13 Feb 2015 13:25:31 +0100 Subject: [Numpy-discussion] Vectorizing computation In-Reply-To: References: <54DDE526.8070701@googlemail.com> Message-ID: <54DDED3B.9050504@googlemail.com> On 02/13/2015 01:03 PM, Francesc Alted wrote: > 2015-02-13 12:51 GMT+01:00 Julian Taylor >: > > On 02/13/2015 11:51 AM, Francesc Alted wrote: > > Hi, > > > > I would like to vectorize the next computation: > > > > nx, ny, nz = 720, 180, 3 > > outheight = np.arange(nz) * 3 > > oro = np.arange(nx * ny).reshape((nx, ny)) > > > > def compute1(outheight, oro): > > result = np.zeros((nx, ny, nz)) > > for ix in range(nx): > > for iz in range(nz): > > result[ix, :, iz] = outheight[iz] + oro[ix, :] > > return result > > > > I think this should be possible by using an advanced use of > broadcasting > > in numpy. Anyone willing to post a solution? > > > result = outheight + oro.reshape(nx, ny, 1) > > > And 4x faster for my case. Oh my, I am afraid that my mind will never > scratch all the amazing possibilities that broadcasting is offering :) > > Thank you very much for such an elegant solution! 
>
if speed is a concern this is faster as it has a better data layout for numpy during the computation, but the result may be worse laid out for further processing

result = outheight.reshape(nz, 1, 1) + oro
return np.rollaxis(result, 0, 3)

From faltet at gmail.com Fri Feb 13 07:32:57 2015
From: faltet at gmail.com (Francesc Alted)
Date: Fri, 13 Feb 2015 13:32:57 +0100
Subject: [Numpy-discussion] Vectorizing computation
In-Reply-To: <54DDED3B.9050504@googlemail.com>
References: <54DDE526.8070701@googlemail.com> <54DDED3B.9050504@googlemail.com>
Message-ID:

2015-02-13 13:25 GMT+01:00 Julian Taylor :
> On 02/13/2015 01:03 PM, Francesc Alted wrote:
> > 2015-02-13 12:51 GMT+01:00 Julian Taylor >:
> >
> > On 02/13/2015 11:51 AM, Francesc Alted wrote:
> > > Hi,
> > >
> > > I would like to vectorize the next computation:
> > >
> > > nx, ny, nz = 720, 180, 3
> > > outheight = np.arange(nz) * 3
> > > oro = np.arange(nx * ny).reshape((nx, ny))
> > >
> > > def compute1(outheight, oro):
> > > result = np.zeros((nx, ny, nz))
> > > for ix in range(nx):
> > > for iz in range(nz):
> > > result[ix, :, iz] = outheight[iz] + oro[ix, :]
> > > return result
> > >
> > > I think this should be possible by using an advanced use of
> > broadcasting
> > > in numpy. Anyone willing to post a solution?
> >
> > result = outheight + oro.reshape(nx, ny, 1)
> >
> > And 4x faster for my case. Oh my, I am afraid that my mind will never
> > scratch all the amazing possibilities that broadcasting is offering :)
> >
> > Thank you very much for such an elegant solution!
>
> if speed is a concern this is faster as it has a better data layout for
> numpy during the computation, but the result may be worse laid out for
> further processing
>
> result = outheight.reshape(nz, 1, 1) + oro
> return np.rollaxis(result, 0, 3)
>
Holy cow, this makes for another 4x speed improvement!
I don't think I need that much in my scenario, so I will stick with the first one (more readable and the expected data layout), but thanks a lot!

Francesc
-------------- next part -------------- An HTML attachment was scrubbed...
URL:

From sturla.molden at gmail.com Fri Feb 13 17:36:29 2015
From: sturla.molden at gmail.com (Sturla Molden)
Date: Fri, 13 Feb 2015 23:36:29 +0100
Subject: [Numpy-discussion] Nature says 'Pick up Python'
Message-ID:

A recent article in Nature advises scientists to use Python, Cython and the SciPy stack.

http://www.nature.com/news/programming-pick-up-python-1.16833

Sturla

From josef.pktd at gmail.com Sat Feb 14 11:35:25 2015
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Sat, 14 Feb 2015 11:35:25 -0500
Subject: [Numpy-discussion] Matrix Class
In-Reply-To: References: <1423606074771-39719.post@n7.nabble.com> <54DB8944.3010608@ncf.ca>
Message-ID:

On Wed, Feb 11, 2015 at 4:18 PM, Ryan Nelson wrote:
> Colin,
>
> I currently use Py3.4 and Numpy 1.9.1. However, I built a quick test conda
> environment with Python2.7 and Numpy 1.7.0, and I get the same:
>
> ############
> Python 2.7.9 |Continuum Analytics, Inc.| (default, Dec 18 2014, 16:57:52)
> [MSC v.1500 64 bit (AMD64)]
> Type "copyright", "credits" or "license" for more information.
>
> IPython 2.3.1 -- An enhanced Interactive Python.
> Anaconda is brought to you by Continuum Analytics.
> Please check out: http://continuum.io/thanks and https://binstar.org
> ? -> Introduction and overview of IPython's features.
> %quickref -> Quick reference.
> help -> Python's own help system.
> object? -> Details about 'object', use 'object??' for extra details.
>
> In [1]: import numpy as np
>
> In [2]: np.__version__
> Out[2]: '1.7.0'
>
> In [3]: np.mat([4,'5',6])
> Out[3]:
> matrix([['4', '5', '6']], dtype='|S1')
>
> In [4]: np.mat([4,'5',6], dtype=int)
> Out[4]: matrix([[4, 5, 6]])
> ###############
>
> As to your comment about coordinating with Statsmodels, you should see the
> links in the thread that Alan posted:
> http://permalink.gmane.org/gmane.comp.python.numeric.general/56516
> http://permalink.gmane.org/gmane.comp.python.numeric.general/56517
> Josef's comments at the time seem to echo the issues the devs (and others)
> have with the matrix class. Maybe things have changed with Statsmodels.

Not changed, we have a strict policy against using np.matrix.

Generic efficient versions for linear operators, Kronecker or sparse block matrix style operations would be useful, but I would use array semantics, similar to using dot or linalg functions on ndarrays.

Josef
(long reply canceled because I'm writing too much that might only be of tangential interest or has been in some of the matrix discussion before.)

> I know I mentioned Sage and SageMathCloud before. I'll just point out that
> there are folks that use this for real research problems, not just as a
> pedagogical tool. They have a Matrix/vector/column_matrix class that do what
> you were expecting from your problems posted above. Indeed below is a
> (truncated) cut and past from a Sage Worksheet. (See
> http://www.sagemath.org/doc/tutorial/tour_linalg.html)
> ##########
> In : Matrix([1,'2',3])
> Error in lines 1-1
> Traceback (most recent call last):
> TypeError: unable to find a common ring for all elements
>
> In : Matrix([[1,2,3],[4,5]])
> ValueError: List of rows is not valid (rows are wrong types or lengths)
>
> In : vector([1,2,3])
> (1, 2, 3)
>
> In : column_matrix([1,2,3])
> [1]
> [2]
> [3]
> ##########
>
> Large portions of the custom code and wrappers in Sage are written in
> Python.
I don't think their Matrix object is a subclass of ndarray, so > perhaps you could strip out the Matrix stuff from here to make a separate > project with just the Matrix stuff, if you don't want to go through the Sage > interface. > > > On Wed, Feb 11, 2015 at 11:54 AM, cjw wrote: >> >> >> On 11-Feb-15 10:21 AM, Ryan Nelson wrote: >> >> So: >> >> In [2]: np.mat([4,'5',6]) >> Out[2]: >> matrix([['4', '5', '6']], dtype='> >> In [3]: np.mat([4,'5',6], dtype=int) >> Out[3]: matrix([[4, 5, 6]]) >> >> Thanks Ryan, >> >> We are not singing from the same hymn book. >> >> Using PyScripter, I get: >> >> *** Python 2.7.9 (default, Dec 10 2014, 12:28:03) [MSC v.1500 64 bit >> (AMD64)] on win32. *** >> >>> import numpy as np >> >>> print('Numpy version: ', np.__version__) >> ('Numpy version: ', '1.9.0') >> >>> >> >> Could you say which version you are using please? >> >> Colin W >> >> On Tue, Feb 10, 2015 at 5:07 PM, cjw wrote: >> >> It seems to be agreed that there are weaknesses in the existing Numpy >> Matrix >> Class. >> >> Some problems are illustrated below. >> >> I'll try to put some suggestions over the coming weeks and would >> appreciate >> comments. >> >> Colin W. >> >> Test Script: >> >> if __name__ == '__main__': >> a= mat([4, 5, 6]) # Good >> print('a: ', a) >> b= mat([4, '5', 6]) # Not the expected result >> print('b: ', b) >> c= mat([[4, 5, 6], [7, 8]]) # Wrongly accepted as rectangular >> print('c: ', c) >> d= mat([[1, 2, 3]]) >> try: >> d[0, 1]= 'b' # Correctly flagged, not numeric >> except ValueError: >> print("d[0, 1]= 'b' # Correctly flagged, not numeric", >> ' >> ValueError') >> print('d: ', d) >> >> Result: >> >> *** Python 2.7.9 (default, Dec 10 2014, 12:28:03) [MSC v.1500 64 bit >> (AMD64)] on win32. 
*** >> >> a: [[4 5 6]] >> b: [['4' '5' '6']] >> c: [[[4, 5, 6] [7, 8]]] >> d[0, 1]= 'b' # Correctly flagged, not numeric ValueError >> d: [[1 2 3]] >> >> >> >> -- >> View this message in context: >> http://numpy-discussion.10968.n7.nabble.com/Matrix-Class-tp39719.html >> Sent from the Numpy-discussion mailing list archive at Nabble.com. >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From cjw at ncf.ca Sat Feb 14 12:05:37 2015 From: cjw at ncf.ca (cjw) Date: Sat, 14 Feb 2015 12:05:37 -0500 Subject: [Numpy-discussion] Matrix Class In-Reply-To: References: <1423606074771-39719.post@n7.nabble.com> <54DB8944.3010608@ncf.ca> Message-ID: <54DF8061.4080202@ncf.ca> On 14-Feb-15 11:35 AM, josef.pktd at gmail.com wrote: > On Wed, Feb 11, 2015 at 4:18 PM, Ryan Nelson wrote: >> Colin, >> >> I currently use Py3.4 and Numpy 1.9.1. However, I built a quick test conda >> environment with Python2.7 and Numpy 1.7.0, and I get the same: >> >> ############ >> Python 2.7.9 |Continuum Analytics, Inc.| (default, Dec 18 2014, 16:57:52) >> [MSC v >> .1500 64 bit (AMD64)] >> Type "copyright", "credits" or "license" for more information. >> >> IPython 2.3.1 -- An enhanced Interactive Python. >> Anaconda is brought to you by Continuum Analytics. >> Please check out: http://continuum.io/thanks and https://binstar.org >> ? 
-> Introduction and overview of IPython's features. >> %quickref -> Quick reference. >> help -> Python's own help system. >> object? -> Details about 'object', use 'object??' for extra details. >> >> In [1]: import numpy as np >> >> In [2]: np.__version__ >> Out[2]: '1.7.0' >> >> In [3]: np.mat([4,'5',6]) >> Out[3]: >> matrix([['4', '5', '6']], >> dtype='|S1') >> >> In [4]: np.mat([4,'5',6], dtype=int) >> Out[4]: matrix([[4, 5, 6]]) >> ############### >> >> As to your comment about coordinating with Statsmodels, you should see the >> links in the thread that Alan posted: >> http://permalink.gmane.org/gmane.comp.python.numeric.general/56516 >> http://permalink.gmane.org/gmane.comp.python.numeric.general/56517 >> Josef's comments at the time seem to echo the issues the devs (and others) >> have with the matrix class. Maybe things have changed with Statsmodels. > Not changed, we have a strict policy against using np.matrix. > > generic efficient versions for linear operators, kronecker or sparse > block matrix styly operations would be useful, but I would use array > semantics, similar to using dot or linalg functions on ndarrays. > > Josef > (long reply canceled because I'm writing too much that might only be > of tangential interest or has been in some of the matrix discussion > before.) Josef, Many thanks. I have gained the impression that there is some antipathy to np.matrix, perhaps this is because, as others have suggested, the array doesn't provide an appropriate framework. Where are such policy decisions documented? Numpy doesn't appear to have a BDFL. I had read Alan's links back in February and now have note of them. Colin W. > > > >> I know I mentioned Sage and SageMathCloud before. I'll just point out that >> there are folks that use this for real research problems, not just as a >> pedagogical tool. They have a Matrix/vector/column_matrix class that do what >> you were expecting from your problems posted above. 
Indeed below is a >> (truncated) cut and paste from a Sage Worksheet. (See >> http://www.sagemath.org/doc/tutorial/tour_linalg.html) >> ########## >> In : Matrix([1,'2',3]) >> Error in lines 1-1 >> Traceback (most recent call last): >> TypeError: unable to find a common ring for all elements >> >> In : Matrix([[1,2,3],[4,5]]) >> ValueError: List of rows is not valid (rows are wrong types or lengths) >> >> In : vector([1,2,3]) >> (1, 2, 3) >> >> In : column_matrix([1,2,3]) >> [1] >> [2] >> [3] >> ########## >> >> Large portions of the custom code and wrappers in Sage are written in >> Python. I don't think their Matrix object is a subclass of ndarray, so >> perhaps you could strip out the Matrix stuff from here to make a separate >> project with just the Matrix stuff, if you don't want to go through the Sage >> interface. >> >> >> On Wed, Feb 11, 2015 at 11:54 AM, cjw wrote: >>> >>> On 11-Feb-15 10:21 AM, Ryan Nelson wrote: >>> >>> So: >>> >>> In [2]: np.mat([4,'5',6]) >>> Out[2]: >>> matrix([['4', '5', '6']], dtype='<U1') >>> >>> In [3]: np.mat([4,'5',6], dtype=int) >>> Out[3]: matrix([[4, 5, 6]]) >>> >>> Thanks Ryan, >>> >>> We are not singing from the same hymn book. >>> >>> Using PyScripter, I get: >>> >>> *** Python 2.7.9 (default, Dec 10 2014, 12:28:03) [MSC v.1500 64 bit >>> (AMD64)] on win32. *** >>>>>> import numpy as np >>>>>> print('Numpy version: ', np.__version__) >>> ('Numpy version: ', '1.9.0') >>> Could you say which version you are using please? >>> >>> Colin W >>> >>> On Tue, Feb 10, 2015 at 5:07 PM, cjw wrote: >>> >>> It seems to be agreed that there are weaknesses in the existing Numpy >>> Matrix >>> Class. >>> >>> Some problems are illustrated below. >>> >>> I'll try to put some suggestions over the coming weeks and would >>> appreciate >>> comments. >>> >>> Colin W.
>>> >>> Test Script: >>> >>> if __name__ == '__main__': >>> a= mat([4, 5, 6]) # Good >>> print('a: ', a) >>> b= mat([4, '5', 6]) # Not the expected result >>> print('b: ', b) >>> c= mat([[4, 5, 6], [7, 8]]) # Wrongly accepted as rectangular >>> print('c: ', c) >>> d= mat([[1, 2, 3]]) >>> try: >>> d[0, 1]= 'b' # Correctly flagged, not numeric >>> except ValueError: >>> print("d[0, 1]= 'b' # Correctly flagged, not numeric", >>> ' >>> ValueError') >>> print('d: ', d) >>> >>> Result: >>> >>> *** Python 2.7.9 (default, Dec 10 2014, 12:28:03) [MSC v.1500 64 bit >>> (AMD64)] on win32. *** >>> >>> a: [[4 5 6]] >>> b: [['4' '5' '6']] >>> c: [[[4, 5, 6] [7, 8]]] >>> d[0, 1]= 'b' # Correctly flagged, not numeric ValueError >>> d: [[1 2 3]] >>> >>> >>> >>> -- >>> View this message in context: >>> http://numpy-discussion.10968.n7.nabble.com/Matrix-Class-tp39719.html >>> Sent from the Numpy-discussion mailing list archive at Nabble.com. >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >>> >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >>> >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From josef.pktd at gmail.com Sat Feb 14 14:36:46 2015 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 14 Feb 2015 
14:36:46 -0500 Subject: [Numpy-discussion] Matrix Class In-Reply-To: <54DF8061.4080202@ncf.ca> References: <1423606074771-39719.post@n7.nabble.com> <54DB8944.3010608@ncf.ca> <54DF8061.4080202@ncf.ca> Message-ID: On Sat, Feb 14, 2015 at 12:05 PM, cjw wrote: > > On 14-Feb-15 11:35 AM, josef.pktd at gmail.com wrote: >> >> On Wed, Feb 11, 2015 at 4:18 PM, Ryan Nelson >> wrote: >>> >>> Colin, >>> >>> I currently use Py3.4 and Numpy 1.9.1. However, I built a quick test >>> conda >>> environment with Python2.7 and Numpy 1.7.0, and I get the same: >>> >>> ############ >>> Python 2.7.9 |Continuum Analytics, Inc.| (default, Dec 18 2014, 16:57:52) >>> [MSC v >>> .1500 64 bit (AMD64)] >>> Type "copyright", "credits" or "license" for more information. >>> >>> IPython 2.3.1 -- An enhanced Interactive Python. >>> Anaconda is brought to you by Continuum Analytics. >>> Please check out: http://continuum.io/thanks and https://binstar.org >>> ? -> Introduction and overview of IPython's features. >>> %quickref -> Quick reference. >>> help -> Python's own help system. >>> object? -> Details about 'object', use 'object??' for extra details. >>> >>> In [1]: import numpy as np >>> >>> In [2]: np.__version__ >>> Out[2]: '1.7.0' >>> >>> In [3]: np.mat([4,'5',6]) >>> Out[3]: >>> matrix([['4', '5', '6']], >>> dtype='|S1') >>> >>> In [4]: np.mat([4,'5',6], dtype=int) >>> Out[4]: matrix([[4, 5, 6]]) >>> ############### >>> >>> As to your comment about coordinating with Statsmodels, you should see >>> the >>> links in the thread that Alan posted: >>> http://permalink.gmane.org/gmane.comp.python.numeric.general/56516 >>> http://permalink.gmane.org/gmane.comp.python.numeric.general/56517 >>> Josef's comments at the time seem to echo the issues the devs (and >>> others) >>> have with the matrix class. Maybe things have changed with Statsmodels. >> >> Not changed, we have a strict policy against using np.matrix. 
>> >> generic efficient versions for linear operators, kronecker or sparse >> block matrix style operations would be useful, but I would use array >> semantics, similar to using dot or linalg functions on ndarrays. >> >> Josef >> (long reply canceled because I'm writing too much that might only be >> of tangential interest or has been in some of the matrix discussion >> before.) > > Josef, > > Many thanks. I have gained the impression that there is some antipathy to > np.matrix, perhaps this is because, as others have suggested, the array > doesn't provide an appropriate framework. It's not directly antipathy, it's cost-benefit analysis. np.matrix has few advantages, but makes reading and maintaining code much more difficult. Having to watch out for multiplication `*` is a lot of extra work. Checking shapes and fixing bugs with unexpected dtypes is also a lot of work, but we have large benefits. For a long time the policy in statsmodels was to keep pandas out of the core of functions (i.e. out of the actual calculations) and restrict it to inputs and returns. However, pandas is becoming more popular and can do some things much better than plain numpy, so it is slowly moving inside some of our core calculations. It's still an easy source of bugs, but we do gain something. Benefits like these don't exist for np.matrix. > > Where are such policy decisions documented? Numpy doesn't appear to have a > BDFL. In general it's a mix of mailing list discussions and discussion in issues and PRs. I'm not directly involved in numpy and don't subscribe to numpy's github notifications. For scipy (and partially for statsmodels): I think large parts of policies for code and workflow are not explicitly specified, but are more an understanding of maintainers and developers that can slowly change over time, build up through spread out discussion as temporary consensus (or without strong objections).
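[Editorial note: the multiplication pitfall described above is easy to demonstrate side by side. The following is a standalone illustration with made-up arrays, not code from the thread.]

```python
import numpy as np

a = np.array([[1., 2.], [3., 4.]])
m = np.matrix(a)

# ndarray `*` is elementwise, matrix `*` is matrix multiplication --
# the same expression means two different things depending on type.
elementwise = a * a
matmul = m * m

# With plain arrays, matrix multiplication is spelled explicitly:
explicit = np.dot(a, a)

# Another common surprise: indexing a matrix never drops to 1-D.
row_a = a[0]   # shape (2,)
row_m = m[0]   # shape (1, 2)
```

This is exactly the ambiguity that makes `*` hard to review in matrix-using code: the reader must know the type of every operand before an expression can be understood.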
scipy has a hacking text file to describe some of it, but I haven't read it in ages. (long term changes compared to 6 years ago: required code review and required test coverage.) Josef > > I had read Alan's links back in February and now have note of them. > > Colin W. > >> >> >> >>> I know I mentioned Sage and SageMathCloud before. I'll just point out >>> that >>> there are folks that use this for real research problems, not just as a >>> pedagogical tool. They have a Matrix/vector/column_matrix class that do >>> what >>> you were expecting from your problems posted above. Indeed below is a >>> (truncated) cut and past from a Sage Worksheet. (See >>> http://www.sagemath.org/doc/tutorial/tour_linalg.html) >>> ########## >>> In : Matrix([1,'2',3]) >>> Error in lines 1-1 >>> Traceback (most recent call last): >>> TypeError: unable to find a common ring for all elements >>> >>> In : Matrix([[1,2,3],[4,5]]) >>> ValueError: List of rows is not valid (rows are wrong types or lengths) >>> >>> In : vector([1,2,3]) >>> (1, 2, 3) >>> >>> In : column_matrix([1,2,3]) >>> [1] >>> [2] >>> [3] >>> ########## >>> >>> Large portions of the custom code and wrappers in Sage are written in >>> Python. I don't think their Matrix object is a subclass of ndarray, so >>> perhaps you could strip out the Matrix stuff from here to make a separate >>> project with just the Matrix stuff, if you don't want to go through the >>> Sage >>> interface. >>> >>> >>> On Wed, Feb 11, 2015 at 11:54 AM, cjw wrote: >>>> >>>> >>>> On 11-Feb-15 10:21 AM, Ryan Nelson wrote: >>>> >>>> So: >>>> >>>> In [2]: np.mat([4,'5',6]) >>>> Out[2]: >>>> matrix([['4', '5', '6']], dtype='>>> >>>> In [3]: np.mat([4,'5',6], dtype=int) >>>> Out[3]: matrix([[4, 5, 6]]) >>>> >>>> Thanks Ryan, >>>> >>>> We are not singing from the same hymn book. >>>> >>>> Using PyScripter, I get: >>>> >>>> *** Python 2.7.9 (default, Dec 10 2014, 12:28:03) [MSC v.1500 64 bit >>>> (AMD64)] on win32. 
*** >>>>>>> >>>>>>> import numpy as np >>>>>>> print('Numpy version: ', np.__version__) >>>> >>>> ('Numpy version: ', '1.9.0') >>>> Could you say which version you are using please? >>>> >>>> Colin W >>>> >>>> On Tue, Feb 10, 2015 at 5:07 PM, cjw wrote: >>>> >>>> It seems to be agreed that there are weaknesses in the existing Numpy >>>> Matrix >>>> Class. >>>> >>>> Some problems are illustrated below. >>>> >>>> I'll try to put some suggestions over the coming weeks and would >>>> appreciate >>>> comments. >>>> >>>> Colin W. >>>> >>>> Test Script: >>>> >>>> if __name__ == '__main__': >>>> a= mat([4, 5, 6]) # Good >>>> print('a: ', a) >>>> b= mat([4, '5', 6]) # Not the expected result >>>> print('b: ', b) >>>> c= mat([[4, 5, 6], [7, 8]]) # Wrongly accepted as >>>> rectangular >>>> print('c: ', c) >>>> d= mat([[1, 2, 3]]) >>>> try: >>>> d[0, 1]= 'b' # Correctly flagged, not >>>> numeric >>>> except ValueError: >>>> print("d[0, 1]= 'b' # Correctly flagged, not >>>> numeric", >>>> ' >>>> ValueError') >>>> print('d: ', d) >>>> >>>> Result: >>>> >>>> *** Python 2.7.9 (default, Dec 10 2014, 12:28:03) [MSC v.1500 64 bit >>>> (AMD64)] on win32. *** >>>> >>>> a: [[4 5 6]] >>>> b: [['4' '5' '6']] >>>> c: [[[4, 5, 6] [7, 8]]] >>>> d[0, 1]= 'b' # Correctly flagged, not numeric ValueError >>>> d: [[1 2 3]] >>>> >>>> >>>> >>>> -- >>>> View this message in context: >>>> http://numpy-discussion.10968.n7.nabble.com/Matrix-Class-tp39719.html >>>> Sent from the Numpy-discussion mailing list archive at Nabble.com. 
>>>> _______________________________________________ >>>> NumPy-Discussion mailing list >>>> NumPy-Discussion at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>>> >>>> >>>> >>>> _______________________________________________ >>>> NumPy-Discussion mailing list >>>> NumPy-Discussion at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>>> >>>> >>>> >>>> _______________________________________________ >>>> NumPy-Discussion mailing list >>>> NumPy-Discussion at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>>> >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From charlesr.harris at gmail.com Sat Feb 14 16:27:58 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 14 Feb 2015 14:27:58 -0700 Subject: [Numpy-discussion] Matrix Class In-Reply-To: References: <1423606074771-39719.post@n7.nabble.com> <54DB8944.3010608@ncf.ca> <54DF8061.4080202@ncf.ca> Message-ID: On Sat, Feb 14, 2015 at 12:36 PM, wrote: > On Sat, Feb 14, 2015 at 12:05 PM, cjw wrote: > > > > On 14-Feb-15 11:35 AM, josef.pktd at gmail.com wrote: > >> > >> On Wed, Feb 11, 2015 at 4:18 PM, Ryan Nelson > >> wrote: > >>> > >>> Colin, > >>> > >>> I currently use Py3.4 and Numpy 1.9.1. However, I built a quick test > >>> conda > >>> environment with Python2.7 and Numpy 1.7.0, and I get the same: > >>> > >>> ############ > >>> Python 2.7.9 |Continuum Analytics, Inc.| (default, Dec 18 2014, > 16:57:52) > >>> [MSC v > >>> .1500 64 bit (AMD64)] > >>> Type "copyright", "credits" or "license" for more information. > >>> > >>> IPython 2.3.1 -- An enhanced Interactive Python. 
> >>> Anaconda is brought to you by Continuum Analytics. > >>> Please check out: http://continuum.io/thanks and https://binstar.org > >>> ? -> Introduction and overview of IPython's features. > >>> %quickref -> Quick reference. > >>> help -> Python's own help system. > >>> object? -> Details about 'object', use 'object??' for extra details. > >>> > >>> In [1]: import numpy as np > >>> > >>> In [2]: np.__version__ > >>> Out[2]: '1.7.0' > >>> > >>> In [3]: np.mat([4,'5',6]) > >>> Out[3]: > >>> matrix([['4', '5', '6']], > >>> dtype='|S1') > >>> > >>> In [4]: np.mat([4,'5',6], dtype=int) > >>> Out[4]: matrix([[4, 5, 6]]) > >>> ############### > >>> > >>> As to your comment about coordinating with Statsmodels, you should see > >>> the > >>> links in the thread that Alan posted: > >>> http://permalink.gmane.org/gmane.comp.python.numeric.general/56516 > >>> http://permalink.gmane.org/gmane.comp.python.numeric.general/56517 > >>> Josef's comments at the time seem to echo the issues the devs (and > >>> others) > >>> have with the matrix class. Maybe things have changed with Statsmodels. > >> > >> Not changed, we have a strict policy against using np.matrix. > >> > >> generic efficient versions for linear operators, kronecker or sparse > >> block matrix styly operations would be useful, but I would use array > >> semantics, similar to using dot or linalg functions on ndarrays. > >> > >> Josef > >> (long reply canceled because I'm writing too much that might only be > >> of tangential interest or has been in some of the matrix discussion > >> before.) > > > > Josef, > > > > Many thanks. I have gained the impression that there is some antipathy > to > > np.matrix, perhaps this is because, as others have suggested, the array > > doesn't provide an appropriate framework. > > It's not directly antipathy, it's cost-benefit analysis. > > np.matrix has few advantages, but makes reading and maintaining code > much more difficult. 
> Having to watch out for multiplication `*` is a lot of extra work. > > Checking shapes and fixing bugs with unexpected dtypes is also a lot > of work, but we have large benefits. > For a long time the policy in statsmodels was to keep pandas out of > the core of functions (i.e. out of the actual calculations) and > restrict it to inputs and returns. However, pandas is becoming more > popular and can do some things much better than plain numpy, so it is > slowly moving inside some of our core calculations. > It's still an easy source of bugs, but we do gain something. > Any bits of Pandas that might be good for numpy/scipy to steal? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Sat Feb 14 20:21:43 2015 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 14 Feb 2015 20:21:43 -0500 Subject: [Numpy-discussion] Matrix Class In-Reply-To: References: <1423606074771-39719.post@n7.nabble.com> <54DB8944.3010608@ncf.ca> <54DF8061.4080202@ncf.ca> Message-ID: On Sat, Feb 14, 2015 at 4:27 PM, Charles R Harris wrote: > > > On Sat, Feb 14, 2015 at 12:36 PM, wrote: >> >> On Sat, Feb 14, 2015 at 12:05 PM, cjw wrote: >> > >> > On 14-Feb-15 11:35 AM, josef.pktd at gmail.com wrote: >> >> >> >> On Wed, Feb 11, 2015 at 4:18 PM, Ryan Nelson >> >> wrote: >> >>> >> >>> Colin, >> >>> >> >>> I currently use Py3.4 and Numpy 1.9.1. However, I built a quick test >> >>> conda >> >>> environment with Python2.7 and Numpy 1.7.0, and I get the same: >> >>> >> >>> ############ >> >>> Python 2.7.9 |Continuum Analytics, Inc.| (default, Dec 18 2014, >> >>> 16:57:52) >> >>> [MSC v >> >>> .1500 64 bit (AMD64)] >> >>> Type "copyright", "credits" or "license" for more information. >> >>> >> >>> IPython 2.3.1 -- An enhanced Interactive Python. >> >>> Anaconda is brought to you by Continuum Analytics. >> >>> Please check out: http://continuum.io/thanks and https://binstar.org >> >>> ? 
-> Introduction and overview of IPython's features. >> >>> %quickref -> Quick reference. >> >>> help -> Python's own help system. >> >>> object? -> Details about 'object', use 'object??' for extra details. >> >>> >> >>> In [1]: import numpy as np >> >>> >> >>> In [2]: np.__version__ >> >>> Out[2]: '1.7.0' >> >>> >> >>> In [3]: np.mat([4,'5',6]) >> >>> Out[3]: >> >>> matrix([['4', '5', '6']], >> >>> dtype='|S1') >> >>> >> >>> In [4]: np.mat([4,'5',6], dtype=int) >> >>> Out[4]: matrix([[4, 5, 6]]) >> >>> ############### >> >>> >> >>> As to your comment about coordinating with Statsmodels, you should see >> >>> the >> >>> links in the thread that Alan posted: >> >>> http://permalink.gmane.org/gmane.comp.python.numeric.general/56516 >> >>> http://permalink.gmane.org/gmane.comp.python.numeric.general/56517 >> >>> Josef's comments at the time seem to echo the issues the devs (and >> >>> others) >> >>> have with the matrix class. Maybe things have changed with >> >>> Statsmodels. >> >> >> >> Not changed, we have a strict policy against using np.matrix. >> >> >> >> generic efficient versions for linear operators, kronecker or sparse >> >> block matrix styly operations would be useful, but I would use array >> >> semantics, similar to using dot or linalg functions on ndarrays. >> >> >> >> Josef >> >> (long reply canceled because I'm writing too much that might only be >> >> of tangential interest or has been in some of the matrix discussion >> >> before.) >> > >> > Josef, >> > >> > Many thanks. I have gained the impression that there is some antipathy >> > to >> > np.matrix, perhaps this is because, as others have suggested, the array >> > doesn't provide an appropriate framework. >> >> It's not directly antipathy, it's cost-benefit analysis. >> >> np.matrix has few advantages, but makes reading and maintaining code >> much more difficult. >> Having to watch out for multiplication `*` is a lot of extra work. 
>> >> Checking shapes and fixing bugs with unexpected dtypes is also a lot >> of work, but we have large benefits. >> For a long time the policy in statsmodels was to keep pandas out of >> the core of functions (i.e. out of the actual calculations) and >> restrict it to inputs and returns. However, pandas is becoming more >> popular and can do some things much better than plain numpy, so it is >> slowly moving inside some of our core calculations. >> It's still an easy source of bugs, but we do gain something. > > > Any bits of Pandas that might be good for numpy/scipy to steal? I'm not a Pandas expert. Some of it comes into statsmodels because we need the data handling also inside a function, e.g. keeping track of labels, indices, and so on. Another reason is that contributors are more familiar with pandas's way of solving problems, even if I suspect numpy would be more efficient. However, a recent change replaces where I would have used np.unique with pandas.factorize, which is supposed to be faster. https://github.com/statsmodels/statsmodels/pull/2213 Two or three years ago my numpy way of group handling (using np.unique, bincount and similar) was still faster than the pandas `apply` version, but I'm not sure that's still true. And to emphasize: all our heavy stuff, especially the big models, still only has numpy and scipy inside (with the exception of one model waiting in a PR).
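[Editorial note: the np.unique/bincount style of group handling mentioned here can be sketched as follows. This is a generic illustration with invented data, not statsmodels code.]

```python
import numpy as np

labels = np.array(["a", "b", "a", "c", "b", "a"])
values = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])

# np.unique factorizes the labels: sorted uniques plus an integer
# code for each element.
uniques, codes = np.unique(labels, return_inverse=True)

# bincount on the codes gives per-group sums and counts in one pass.
sums = np.bincount(codes, weights=values)
counts = np.bincount(codes)
means = sums / counts  # group means, aligned with `uniques`
```

pandas `groupby`/`apply` computes the same kind of per-group reduction; the sketch above is the plain-numpy equivalent being compared against.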
Josef > > > > Chuck > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From jaime.frio at gmail.com Sat Feb 14 21:21:18 2015 From: jaime.frio at gmail.com (Jaime Fernández del Río) Date: Sat, 14 Feb 2015 18:21:18 -0800 Subject: [Numpy-discussion] Matrix Class In-Reply-To: References: <1423606074771-39719.post@n7.nabble.com> <54DB8944.3010608@ncf.ca> <54DF8061.4080202@ncf.ca> Message-ID: On Sat, Feb 14, 2015 at 5:21 PM, wrote: > On Sat, Feb 14, 2015 at 4:27 PM, Charles R Harris > wrote: > > > > > > On Sat, Feb 14, 2015 at 12:36 PM, wrote: > >> > >> On Sat, Feb 14, 2015 at 12:05 PM, cjw wrote: > >> > > >> > On 14-Feb-15 11:35 AM, josef.pktd at gmail.com wrote: > >> >> > >> >> On Wed, Feb 11, 2015 at 4:18 PM, Ryan Nelson > >> >> wrote: > >> >>> > >> >>> Colin, > >> >>> > >> >>> I currently use Py3.4 and Numpy 1.9.1. However, I built a quick test > >> >>> conda > >> >>> environment with Python2.7 and Numpy 1.7.0, and I get the same: > >> >>> > >> >>> ############ > >> >>> Python 2.7.9 |Continuum Analytics, Inc.| (default, Dec 18 2014, > >> >>> 16:57:52) > >> >>> [MSC v > >> >>> .1500 64 bit (AMD64)] > >> >>> Type "copyright", "credits" or "license" for more information. > >> >>> > >> >>> IPython 2.3.1 -- An enhanced Interactive Python. > >> >>> Anaconda is brought to you by Continuum Analytics. > >> >>> Please check out: http://continuum.io/thanks and > https://binstar.org > >> >>> ? -> Introduction and overview of IPython's features. > >> >>> %quickref -> Quick reference. > >> >>> help -> Python's own help system. > >> >>> object? -> Details about 'object', use 'object??' for extra > details.
> >> >>> > >> >>> In [1]: import numpy as np > >> >>> > >> >>> In [2]: np.__version__ > >> >>> Out[2]: '1.7.0' > >> >>> > >> >>> In [3]: np.mat([4,'5',6]) > >> >>> Out[3]: > >> >>> matrix([['4', '5', '6']], > >> >>> dtype='|S1') > >> >>> > >> >>> In [4]: np.mat([4,'5',6], dtype=int) > >> >>> Out[4]: matrix([[4, 5, 6]]) > >> >>> ############### > >> >>> > >> >>> As to your comment about coordinating with Statsmodels, you should > see > >> >>> the > >> >>> links in the thread that Alan posted: > >> >>> http://permalink.gmane.org/gmane.comp.python.numeric.general/56516 > >> >>> http://permalink.gmane.org/gmane.comp.python.numeric.general/56517 > >> >>> Josef's comments at the time seem to echo the issues the devs (and > >> >>> others) > >> >>> have with the matrix class. Maybe things have changed with > >> >>> Statsmodels. > >> >> > >> >> Not changed, we have a strict policy against using np.matrix. > >> >> > >> >> generic efficient versions for linear operators, kronecker or sparse > >> >> block matrix styly operations would be useful, but I would use array > >> >> semantics, similar to using dot or linalg functions on ndarrays. > >> >> > >> >> Josef > >> >> (long reply canceled because I'm writing too much that might only be > >> >> of tangential interest or has been in some of the matrix discussion > >> >> before.) > >> > > >> > Josef, > >> > > >> > Many thanks. I have gained the impression that there is some > antipathy > >> > to > >> > np.matrix, perhaps this is because, as others have suggested, the > array > >> > doesn't provide an appropriate framework. > >> > >> It's not directly antipathy, it's cost-benefit analysis. > >> > >> np.matrix has few advantages, but makes reading and maintaining code > >> much more difficult. > >> Having to watch out for multiplication `*` is a lot of extra work. > >> > >> Checking shapes and fixing bugs with unexpected dtypes is also a lot > >> of work, but we have large benefits. 
> >> For a long time the policy in statsmodels was to keep pandas out of > >> the core of functions (i.e. out of the actual calculations) and > >> restrict it to inputs and returns. However, pandas is becoming more > >> popular and can do some things much better than plain numpy, so it is > >> slowly moving inside some of our core calculations. > >> It's still an easy source of bugs, but we do gain something. > > > > > > Any bits of Pandas that might be good for numpy/scipy to steal? > > I'm not a Pandas expert. > Some of it comes into statsmodels because we need the data handling > also inside a function, e.g. keeping track of labels, indices, and so > on. Another reason is that contributors are more familiar with > pandas's way of solving problems, even if I suspect numpy would be > more efficient. > > However, a recent change replaces where I would have used np.unique > with pandas.factorize, which is supposed to be faster. > https://github.com/statsmodels/statsmodels/pull/2213 Numpy could use some form of hash table for its arraysetops, which is where pandas is getting its advantage from. It is a tricky thing though, see e.g. these timings: a = np.random.randint(10, size=1000) srs = pd.Series(a) %timeit np.unique(a) 100000 loops, best of 3: 13.2 µs per loop %timeit srs.unique() 100000 loops, best of 3: 15.6 µs per loop %timeit pd.factorize(a) 10000 loops, best of 3: 25.6 µs per loop %timeit np.unique(a, return_inverse=True) 10000 loops, best of 3: 82.5 µs per loop These last timings are with 1.9.0 and 0.14.0, so numpy doesn't have https://github.com/numpy/numpy/pull/5012 yet, which makes the operation in which numpy is slower about 2x faster. And if you need your unique values sorted, then things are more even, especially if numpy runs 2x faster: %timeit pd.factorize(a, sort=True) 10000 loops, best of 3: 36.4 µs per loop The algorithms scale differently though, so for sufficiently large data Pandas is going to win almost certainly.
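[Editorial note: for readers without pandas installed, the semantic difference between the two operations timed above can be sketched in pure Python. `factorize` builds a hash table and assigns codes in order of first appearance, while `np.unique` sorts. This toy version is only an illustration of the semantics, nowhere near production speed.]

```python
import numpy as np

def toy_factorize(arr):
    """Hash-table factorization: uniques in order of first appearance."""
    table = {}                                 # value -> integer code
    codes = np.empty(len(arr), dtype=np.intp)
    uniques = []
    for i, v in enumerate(arr):
        code = table.get(v)
        if code is None:
            code = len(uniques)
            table[v] = code
            uniques.append(v)
        codes[i] = code
    return codes, np.asarray(uniques)

a = np.array([3, 1, 3, 2, 1])

codes, uniques = toy_factorize(a)   # codes [0 1 0 2 1], uniques [3 1 2]

# np.unique instead returns *sorted* uniques plus the matching inverse:
u, inv = np.unique(a, return_inverse=True)  # u [1 2 3], inv [2 0 2 1 0]
```

The sorted-versus-first-appearance difference is why `pd.factorize(a, sort=True)` is the fairer comparison against `np.unique(a, return_inverse=True)`.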
Not sure if they support all dtypes, nor how efficient their use of memory is. I did a toy implementation of a hash table, mimicking Python's dictionary, for numpy some time ago, see here: https://github.com/jaimefrio/numpy/commit/50b951289dfe9e2c3ef8950184090742ff2ac896 and if I remember correctly for the basic unique operations it was generally faster, both than numpy and pandas, but only by a factor of about 2x, which didn't seem to justify the effort. More complicated operations can probably benefit more, as the pd.factorize example shows. It still seems like an awful lot of work for an operation that isn't obviously needed. If Numpy attempted to have some form of groupby functionality it could make more sense. As is, not really sure. Jaime > > Two or three years ago my numpy way of group handling (using > np.unique, bincount and similar) was still faster than the pandas > `apply` version, I'm not sure that's still true. > > > And to emphasize: all our heavy stuff especially the big models still > only have numpy and scipy inside (with the exception of one model > waiting in a PR). > > Josef > > > > > > > > > > Chuck > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- (\__/) ( O.o) ( > <) Este es Conejo. Copia a Conejo en tu firma y ayúdale en sus planes de dominación mundial. -------------- next part -------------- An HTML attachment was scrubbed... URL: From mwojc at p.lodz.pl Mon Feb 16 05:58:24 2015 From: mwojc at p.lodz.pl (Marek Wojciechowski) Date: Mon, 16 Feb 2015 11:58:24 +0100 Subject: [Numpy-discussion] ANN: ffnet-0.8.0 released Message-ID: <4564421.3WSxHCMc7D@think> ffnet-0.8.0 has been released.
ffnet is a fast and easy-to-use feed-forward neural network training solution for python. This version supports python 3. Look at the ffnet website: http://ffnet.sourceforge.net for installation instructions and documentation. Regards, -- Marek Wojciechowski --- Politechnika Łódzka Lodz University of Technology Treść tej wiadomości zawiera informacje przeznaczone tylko dla adresata. Jeżeli nie jesteście Państwo jej adresatem, bądź otrzymaliście ją przez pomyłkę, prosimy o powiadomienie o tym nadawcy oraz trwałe jej usunięcie. This email contains information intended solely for the use of the individual to whom it is addressed. If you are not the intended recipient or if you have received this message in error, please notify the sender and delete it from your system. From njs at pobox.com Tue Feb 17 20:07:07 2015 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 17 Feb 2015 17:07:07 -0800 Subject: [Numpy-discussion] PyCon? Message-ID: Hi all, It looks like I'll be at PyCon this year. Anyone else? Any interest in organizing a numpy sprint? -n -- Nathaniel J. Smith -- http://vorpus.org From cournape at gmail.com Wed Feb 18 02:43:34 2015 From: cournape at gmail.com (David Cournapeau) Date: Wed, 18 Feb 2015 07:43:34 +0000 Subject: [Numpy-discussion] PyCon? In-Reply-To: References: Message-ID: I'll be there as well, though I am still figuring out when exactly. On Wed, Feb 18, 2015 at 1:07 AM, Nathaniel Smith wrote: > Hi all, > > It looks like I'll be at PyCon this year. Anyone else? Any interest in > organizing a numpy sprint? > > -n > > -- > Nathaniel J. Smith -- http://vorpus.org > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed...
URL: From raniere at ime.unicamp.br Thu Feb 19 07:03:15 2015 From: raniere at ime.unicamp.br (Raniere Silva) Date: Thu, 19 Feb 2015 10:03:15 -0200 Subject: [Numpy-discussion] Google Summer of Code and NumFOCUS Message-ID: <20150219120315.GA16143@pupunha> Hi, NumFOCUS promotes and supports the ongoing research and development of open-source computing tools including NumPy. This year NumFOCUS wants to try to be a Google Summer of Code "umbrella" mentoring organization. Umbrella organizations are mentoring organizations accepted into the Google Summer of Code program that have other open source organizations working "under" them. Sometimes organizations that work very closely or have very similar goals or communities may get put together under an "umbrella." Google still expects all organizations under the umbrella, whether accepted into the program under their title or not, to adhere to all the rules and regulations of the program. From https://www.google-melange.com/gsoc/document/show/gsoc_program/google/gsoc2015/help_page#umbrella_organization We want to help promote and support NumPy. We encourage NumPy to apply to Google Summer of Code under your own title and will be very happy if you can also do so with us. If you are interested, please check https://github.com/swcarpentry/gsoc2015 and https://github.com/swcarpentry/gsoc2015/blob/master/CONTRIBUTING.md. If you have any questions, please email me directly. Thanks in advance, Raniere -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 819 bytes Desc: not available URL: From ralf.gommers at gmail.com Fri Feb 20 04:05:50 2015 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Fri, 20 Feb 2015 10:05:50 +0100 Subject: [Numpy-discussion] GSoC'15 prep - ideas page Message-ID: Hi all,
There is actually one urgent thing to be done (before 19.00 UTC today), which is to get our ideas page in decent shape. It doesn't have to be final, but there has to be enough on there for the organizers to judge it. This page is here: https://github.com/scipy/scipy/wiki/GSoC-project-ideas. I'll be reworking it and linking it from the PSF page today, but if you already have new ideas please add them there. See https://wiki.python.org/moin/SummerOfCode/OrgIdeasPageTemplate for this year's template for adding a new idea. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Fri Feb 20 10:53:05 2015 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Fri, 20 Feb 2015 16:53:05 +0100 Subject: [Numpy-discussion] Google Summer of Code and NumFOCUS In-Reply-To: <20150219120315.GA16143@pupunha> References: <20150219120315.GA16143@pupunha> Message-ID: Hi Raniere, On Thu, Feb 19, 2015 at 1:03 PM, Raniere Silva wrote: > Hi, > > NumFOCUS has promotes and supports the ongoing research and development of > open-source computing tools including NumPy. > > This year NumFOCUS want to try be a Google Summer of Code > "umbrella" mentoring organization, > > Umbrella organizations are mentoring organizations accepted into the > Google > Summer of Code program that have other open source organizations > working > "under" them. Sometime organizations that work very closely or have > very > similar goals or communities may get put together under an "umbrella." > Google stills expects all organizations under the umbrella, whether > accepted > into the program under their title or not, to adhere to all the rules > and > regulations of the program. > > From > https://www.google-melange.com/gsoc/document/show/gsoc_program/google/gsoc2015/help_page#umbrella_organization > > To help promote and support NumPy. 
> > We encourage NumPy to apply to Google Summer of Code under your own title > and will be very happy if you can also do with us. > If you are interested, please check > https://github.com/swcarpentry/gsoc2015 > and https://github.com/swcarpentry/gsoc2015/blob/master/CONTRIBUTING.md. > > If you have any question, please email me directly. > Thanks for the enthusiasm in getting Numfocus set up as a new mentoring org - I'm sure that this will be useful for the long term. I'm not sure if you're aware that Numpy, Scipy and many other scientific Python projects have been participating under the umbrella of the PSF for a number of years already (see https://wiki.python.org/moin/SummerOfCode/2014). For this year we'd like to keep it that way. I'll email you privately with more details and to see if I can help you in any way. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From raniere at ime.unicamp.br Fri Feb 20 11:19:00 2015 From: raniere at ime.unicamp.br (Raniere Silva) Date: Fri, 20 Feb 2015 14:19:00 -0200 Subject: [Numpy-discussion] Google Summer of Code and NumFOCUS In-Reply-To: References: <20150219120315.GA16143@pupunha> Message-ID: <20150220161900.GF12853@pupunha> Hi Ralf, > Thanks for the enthusiasm in getting Numfocus set up as a new mentoring org > - I'm sure that this will be useful for the long term. Thanks for your email. > I'm not sure if > you're aware that Numpy, Scipy and many other scientific Python projects > have been participating under the umbrella of the PSF for a number of years > already (see https://wiki.python.org/moin/SummerOfCode/2014). Yes, I'm aware of it. I only care that every project has the opportunity to participate in GSoC and participants have fun during it. =) > For this year we'd like to keep it that way. No problem for me. I hope that we can work together on this next year. Raniere -------------- next part -------------- A non-text attachment was scrubbed...
Name: not available Type: application/pgp-signature Size: 819 bytes Desc: not available URL: From aldcroft at head.cfa.harvard.edu Sun Feb 22 13:21:43 2015 From: aldcroft at head.cfa.harvard.edu (Aldcroft, Thomas) Date: Sun, 22 Feb 2015 13:21:43 -0500 Subject: [Numpy-discussion] One-byte string dtype: third time's the charm? Message-ID: The idea of a one-byte string dtype has been extensively discussed twice before, with a lot of good input and ideas, but no action [1, 2]. tl;dr: Perfect is the enemy of good. Can numpy just add a one-byte string dtype named 's' that uses latin-1 encoding as a bridge to enable Python 3 usage in the near term? A key consequence of not having a one-byte string dtype is that handling ASCII data stored in binary formats such as HDF or FITS is basically broken in Python 3. Packages like h5py, pytables, and astropy.io.fits all return text data arrays with the numpy 'S' type, and in fact have no direct support for the numpy wide unicode 'U' type. In Python 3, the 'S' type array cannot be compared with the Python str type, so that something like below fails: >>> mask = (names_array == "john") # FAIL Problems like this are now showing up in the wild [3]. Workarounds are also showing up, like a way to easily convert from 'S' to 'U' within astropy Tables [4], but this is really not a desirable way to go. Gigabyte-sized string data arrays are not uncommon, so converting to UCS-4 is a real memory and performance hit. For a good top-level summary of much of the previous thread discussion, see [5] from Chris Barker. Condensing this down to just a few points: - *Changing* the behavior of the existing 'S' type is going to break code and seems a bad idea. - *Adding* a new dtype 's' will work and allow highly performant conversion from 'S' to 's' via view(). - Using the latin-1 encoding will minimize code breakage vis-a-vis what works in Python 2 [6]. 
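For concreteness, the failure mode and the cost of the 'U' workaround look like this (a minimal sketch; the array contents are made up for illustration):

```python
import numpy as np

# 'S' arrays hold bytes; on Python 3 comparing them against a str
# no longer does an elementwise comparison, hence the FAIL above
names_array = np.array([b"john", b"mary"], dtype="S4")

# the current workaround: decode to the fixed-width UCS-4 'U' dtype
unames = names_array.astype("U4")
mask = unames == "john"            # now a proper elementwise comparison
assert mask.tolist() == [True, False]

# but 'U' costs 4 bytes per character instead of 1
assert names_array.dtype.itemsize == 4
assert unames.dtype.itemsize == 16
```

For a gigabyte-sized 'S' array, that factor of four is exactly the memory hit described above.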
Using latin-1 is a pragmatic compromise that provides continuity to allow scientists to run their existing code in Python 3 and have things just work. It isn't perfect and it should not be the end of the story, but it would be good. This single issue is the *only* thing blocking me and my team from using Python 3 in operations. As a final point, I don't know the numpy internals at all, but it *seems* like this proposal is one of the easiest to implement amongst those that were discussed. Cheers, Tom [1]: http://mail.scipy.org/pipermail/numpy-discussion/2014-January/068622.html [2]: http://mail.scipy.org/pipermail/numpy-discussion/2014-July/070574.html [3]: https://github.com/astropy/astropy/issues/3311 [4]: http://astropy.readthedocs.org/en/latest/api/astropy.table.Table.html#astropy.table.Table.convert_bytestring_to_unicode [5]: http://mail.scipy.org/pipermail/numpy-discussion/2014-July/070631.html [6]: It is not uncommon to store uint8 data in a bytestring -------------- next part -------------- An HTML attachment was scrubbed... URL: From sturla.molden at gmail.com Sun Feb 22 14:29:25 2015 From: sturla.molden at gmail.com (Sturla Molden) Date: Sun, 22 Feb 2015 20:29:25 +0100 Subject: [Numpy-discussion] One-byte string dtype: third time's the charm? In-Reply-To: References: Message-ID: On 22/02/15 19:21, Aldcroft, Thomas wrote: > Problems like this are now showing up in the wild [3]. Workarounds are > also showing up, like a way to easily convert from 'S' to 'U' within > astropy Tables [4], but this is really not a desirable way to go. > Gigabyte-sized string data arrays are not uncommon, so converting to > UCS-4 is a real memory and performance hit. Why UCS-4? Python's internal "flexible string representation" will use ASCII for ASCII text.
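For illustration, the difference between CPython's per-string choice and numpy's fixed-width layout can be seen directly (exact byte counts are CPython implementation details, so the check below is deliberately loose):

```python
import sys
import numpy as np

# CPython (PEP 393) stores 1, 2, or 4 bytes per character, chosen
# per string based on the widest code point it contains
narrow = "a" * 1000                  # pure ASCII: 1 byte per character
wide = "\U0001F600" * 1000           # non-BMP: 4 bytes per character
assert sys.getsizeof(wide) > 3 * sys.getsizeof(narrow)

# numpy's fixed-width 'U' dtype, by contrast, always pays 4 bytes
# per character, whatever the contents of the array
assert np.dtype("U1000").itemsize == 4000
```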
By PEP 393 an application should not assume an internal string representation at all: https://www.python.org/dev/peps/pep-0393/ If the problem is a PEP 393 violation in NumPy string or unicode dtype, we shouldn't violate it even further by adding a latin-1 encoded ascii string. We should let Python represent strings as it wants, and it will not bloat. I am -1 on adding latin-1 and +1 on making the unicode dtype PEP 393 compliant if it is not. And on Python 3 'U' and 'S' should just be synonyms. You can also store an array of bytes with uint8. Then you can decode it however you like to make a Python string. If it is encoded as latin-1 then decode it as latin-1:

In [1]: import numpy as np

In [2]: ascii_bytestr = "The quick brown fox jumps over the lazy dog".encode('latin-1')

In [3]: numpy_bytestr = np.array(memoryview(ascii_bytestr))

In [4]: numpy_bytestr.dtype, numpy_bytestr.shape
Out[4]: (dtype('uint8'), (43,))

In [5]: bytes(numpy_bytestr).decode('latin-1')
Out[5]: 'The quick brown fox jumps over the lazy dog'

Sturla From njs at pobox.com Sun Feb 22 14:52:24 2015 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 22 Feb 2015 11:52:24 -0800 Subject: [Numpy-discussion] One-byte string dtype: third time's the charm? In-Reply-To: References: Message-ID: On Sun, Feb 22, 2015 at 10:21 AM, Aldcroft, Thomas wrote: > The idea of a one-byte string dtype has been extensively discussed twice > before, with a lot of good input and ideas, but no action [1, 2]. > > tl;dr: Perfect is the enemy of good. Can numpy just add a one-byte string > dtype named 's' that uses latin-1 encoding as a bridge to enable Python 3 > usage in the near term? I think this is a good idea. I think overall it would be good for numpy to switch to using variable-length strings in most cases (cf. pandas), which is a different kind of change, but fixed-length 8-bit encoded text is obviously a common on-disk format in scientific applications, so numpy will still need some way to deal with it conveniently.
In the long run we'd like to have more flexibility (e.g. allowing choice of character encoding), but since this proposal is a subset of that functionality, then it won't interfere with later improvements. I can see an argument for utf8 over latin1, but it really doesn't matter that much so whatever, blue and purple bikesheds are both fine. The tricky bit here is "just" :-). Do you want to implement this? Do you know someone who does? It's possible but will be somewhat annoying, since to do it directly without refactoring how dtypes work first then you'll have to add lots of copy-paste code to all the different ufuncs. -n -- Nathaniel J. Smith -- http://vorpus.org From njs at pobox.com Sun Feb 22 14:57:35 2015 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 22 Feb 2015 11:57:35 -0800 Subject: [Numpy-discussion] One-byte string dtype: third time's the charm? In-Reply-To: References: Message-ID: On Sun, Feb 22, 2015 at 11:29 AM, Sturla Molden wrote: > On 22/02/15 19:21, Aldcroft, Thomas wrote: > >> Problems like this are now showing up in the wild [3]. Workarounds are >> also showing up, like a way to easily convert from 'S' to 'U' within >> astropy Tables [4], but this is really not a desirable way to go. >> Gigabyte-sized string data arrays are not uncommon, so converting to >> UCS-4 is a real memory and performance hit. > > Why UCS-4? The Python's internal "flexible string respresentation" will > use ascii for ascii text. This is a discussion about how strings are represented as bit-patterns inside ndarrays; the internal storage representation used by 'str' is irrelevant. -n -- Nathaniel J. Smith -- http://vorpus.org From robert.kern at gmail.com Sun Feb 22 15:04:17 2015 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 22 Feb 2015 20:04:17 +0000 Subject: [Numpy-discussion] One-byte string dtype: third time's the charm? 
In-Reply-To: References: Message-ID: On Sun, Feb 22, 2015 at 7:29 PM, Sturla Molden wrote: > > On 22/02/15 19:21, Aldcroft, Thomas wrote: > > > Problems like this are now showing up in the wild [3]. Workarounds are > > also showing up, like a way to easily convert from 'S' to 'U' within > > astropy Tables [4], but this is really not a desirable way to go. > > Gigabyte-sized string data arrays are not uncommon, so converting to > > UCS-4 is a real memory and performance hit. > > Why UCS-4? The Python's internal "flexible string respresentation" will > use ascii for ascii text. numpy's 'U' dtype is UCS-4, and this is what Thomas is referring to, not Python's string type. It cannot have a flexible representation as it *is* the representation. Python 3's `str` type is opaque, so it can freely choose how to represent the data in memory. numpy dtypes transparently describe how the data is represented in memory. -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... URL: From sturla.molden at gmail.com Sun Feb 22 15:40:26 2015 From: sturla.molden at gmail.com (Sturla Molden) Date: Sun, 22 Feb 2015 21:40:26 +0100 Subject: [Numpy-discussion] One-byte string dtype: third time's the charm? In-Reply-To: References: Message-ID: On 22/02/15 21:04, Robert Kern wrote: > Python 3's `str` type is opaque, so it can > freely choose how to represent the data in memory. numpy dtypes > transparently describe how the data is represented in memory. Hm, yes, that is a good point. Sturla From sturla.molden at gmail.com Sun Feb 22 15:50:19 2015 From: sturla.molden at gmail.com (Sturla Molden) Date: Sun, 22 Feb 2015 21:50:19 +0100 Subject: [Numpy-discussion] One-byte string dtype: third time's the charm? In-Reply-To: References: Message-ID: On 22/02/15 20:57, Nathaniel Smith wrote: > This is a discussion about how strings are represented as bit-patterns > inside ndarrays; the internal storage representation used by 'str' is > irrelevant. 
I thought it would be clever to just use the same internal representation as Python would choose. But obviously it is not. UTF-8 would fail because it is not regularly stored. And every string in an ndarray will need to have the same encoding, but Python might think otherwise. Sturla From aldcroft at head.cfa.harvard.edu Sun Feb 22 17:40:08 2015 From: aldcroft at head.cfa.harvard.edu (Aldcroft, Thomas) Date: Sun, 22 Feb 2015 17:40:08 -0500 Subject: [Numpy-discussion] One-byte string dtype: third time's the charm? In-Reply-To: References: Message-ID: On Sun, Feb 22, 2015 at 2:52 PM, Nathaniel Smith wrote: > On Sun, Feb 22, 2015 at 10:21 AM, Aldcroft, Thomas > wrote: > > The idea of a one-byte string dtype has been extensively discussed twice > > before, with a lot of good input and ideas, but no action [1, 2]. > > > > tl;dr: Perfect is the enemy of good. Can numpy just add a one-byte > string > > dtype named 's' that uses latin-1 encoding as a bridge to enable Python 3 > > usage in the near term? > > I think this is a good idea. I think overall it would be good for > numpy to switch to using variable-length strings in most cases (cf. > pandas), which is a different kind of change, but fixed-length 8-bit > encoded text is obviously a common on-disk format in scientific > applications, so numpy will still need some way to deal with it > conveniently. In the long run we'd like to have more flexibility (e.g. > allowing choice of character encoding), but since this proposal is a > subset of that functionality, then it won't interfere with later > improvements. I can see an argument for utf8 over latin1, but it > really doesn't matter that much so whatever, blue and purple bikesheds > are both fine. > > The tricky bit here is "just" :-). Do you want to implement this? Do > you know someone who does? 
It's possible but will be somewhat > annoying, since to do it directly without refactoring how dtypes work > first then you'll have to add lots of copy-paste code to all the > different ufuncs. > I would be happy to have a go at this, with the caveat that someone who understands numpy would need to get me started with a minimal prototype. From there I can do the "annoying" copy-paste for ufuncs etc, writing tests and docs. I'm assuming that with a prototype the rest can be done without any deep understanding of numpy internals (which I do not have). - Tom > > -n > -- > Nathaniel J. Smith -- http://vorpus.org > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sun Feb 22 17:42:19 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 22 Feb 2015 15:42:19 -0700 Subject: [Numpy-discussion] One-byte string dtype: third time's the charm? In-Reply-To: References: Message-ID: On Sun, Feb 22, 2015 at 12:52 PM, Nathaniel Smith wrote: > On Sun, Feb 22, 2015 at 10:21 AM, Aldcroft, Thomas > wrote: > > The idea of a one-byte string dtype has been extensively discussed twice > > before, with a lot of good input and ideas, but no action [1, 2]. > > > > tl;dr: Perfect is the enemy of good. Can numpy just add a one-byte > > string > > dtype named 's' that uses latin-1 encoding as a bridge to enable Python > > 3 > > usage in the near term? > > I think this is a good idea. I think overall it would be good for > numpy to switch to using variable-length strings in most cases (cf. > pandas), which is a different kind of change, but fixed-length 8-bit > encoded text is obviously a common on-disk format in scientific > applications, so numpy will still need some way to deal with it > conveniently.
In the long run we'd like to have more flexibility (e.g. > allowing choice of character encoding), but since this proposal is a > subset of that functionality, then it won't interfere with later > improvements. I can see an argument for utf8 over latin1, but it > really doesn't matter that much so whatever, blue and purple bikesheds > are both fine. > > The tricky bit here is "just" :-). Do you want to implement this? Do > you know someone who does? It's possible but will be somewhat > annoying, since to do it directly without refactoring how dtypes work > first then you'll have to add lots of copy-paste code to all the > different ufuncs. > We're also running out of letters for types. We need to decide on how to extend that representation. It would seem straightforward to just start using multiple letters, but there is a lot of code that uses things like `for dt in 'efdg':`. Can we perhaps introduce an extended dtype structure, maybe with some ideas from dynd and versioning? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sun Feb 22 17:46:57 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 22 Feb 2015 15:46:57 -0700 Subject: [Numpy-discussion] One-byte string dtype: third time's the charm? In-Reply-To: References: Message-ID: On Sun, Feb 22, 2015 at 3:40 PM, Aldcroft, Thomas < aldcroft at head.cfa.harvard.edu> wrote: > > > On Sun, Feb 22, 2015 at 2:52 PM, Nathaniel Smith wrote: > >> On Sun, Feb 22, 2015 at 10:21 AM, Aldcroft, Thomas >> wrote: >> > The idea of a one-byte string dtype has been extensively discussed twice >> > before, with a lot of good input and ideas, but no action [1, 2]. >> > >> > tl;dr: Perfect is the enemy of good. Can numpy just add a one-byte >> string >> > dtype named 's' that uses latin-1 encoding as a bridge to enable Python >> 3 >> > usage in the near term? >> >> I think this is a good idea.
I think overall it would be good for >> numpy to switch to using variable-length strings in most cases (cf. >> pandas), which is a different kind of change, but fixed-length 8-bit >> encoded text is obviously a common on-disk format in scientific >> applications, so numpy will still need some way to deal with it >> conveniently. In the long run we'd like to have more flexibility (e.g. >> allowing choice of character encoding), but since this proposal is a >> subset of that functionality, then it won't interfere with later >> improvements. I can see an argument for utf8 over latin1, but it >> really doesn't matter that much so whatever, blue and purple bikesheds >> are both fine. >> >> The tricky bit here is "just" :-). Do you want to implement this? Do >> you know someone who does? It's possible but will be somewhat >> annoying, since to do it directly without refactoring how dtypes work >> first then you'll have to add lots of copy-paste code to all the >> different ufuncs. >> > > I'm would be happy to have a go at this, with the caveat that someone who > understands numpy would need to get me started with a minimal prototype. > From there I can do the "annoying" copy-paste for ufuncs etc, writing tests > and docs. I'm assuming that with a prototype then the rest can be done > without any deep understanding of numpy internals (which I do not have). > > - Tom > > The last two new types added to numpy were float16 and datetime64. Might be worth looking at the steps needed to implement those. There was also a user type, `rational` that got added, that could also provide a template. Maybe we need to have a way to add 'numpy certified' user data types. It might also be possible to reuse the `c` data type, currently implemented as `S1` IIRC, but that could cause some problems. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From njs at pobox.com Sun Feb 22 17:52:40 2015 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 22 Feb 2015 14:52:40 -0800 Subject: [Numpy-discussion] One-byte string dtype: third time's the charm? In-Reply-To: References: Message-ID: On Sun, Feb 22, 2015 at 2:42 PM, Charles R Harris wrote: > > On Sun, Feb 22, 2015 at 12:52 PM, Nathaniel Smith wrote: >> >> On Sun, Feb 22, 2015 at 10:21 AM, Aldcroft, Thomas >> wrote: >> > The idea of a one-byte string dtype has been extensively discussed twice >> > before, with a lot of good input and ideas, but no action [1, 2]. >> > >> > tl;dr: Perfect is the enemy of good. Can numpy just add a one-byte >> > string >> > dtype named 's' that uses latin-1 encoding as a bridge to enable Python >> > 3 >> > usage in the near term? >> >> I think this is a good idea. I think overall it would be good for >> numpy to switch to using variable-length strings in most cases (cf. >> pandas), which is a different kind of change, but fixed-length 8-bit >> encoded text is obviously a common on-disk format in scientific >> applications, so numpy will still need some way to deal with it >> conveniently. In the long run we'd like to have more flexibility (e.g. >> allowing choice of character encoding), but since this proposal is a >> subset of that functionality, then it won't interfere with later >> improvements. I can see an argument for utf8 over latin1, but it >> really doesn't matter that much so whatever, blue and purple bikesheds >> are both fine. >> >> The tricky bit here is "just" :-). Do you want to implement this? Do >> you know someone who does? It's possible but will be somewhat >> annoying, since to do it directly without refactoring how dtypes work >> first then you'll have to add lots of copy-paste code to all the >> different ufuncs. > > We're also running out of letters for types. We need to decide on how to > extend that representation. 
It would seem straightforward to just start > using multiple letters, but there is a lot of code that uses things like `for > dt in 'efdg':`. Can we perhaps introduce an extended dtype structure, maybe > with some ideas from dynd and versioning? I don't mind using "s" for this particular case, but in general I think we should de-emphasise the string representations, and even allow new dtypes to forgo them entirely. We have all of Python to work with. It's much nicer for users and for us to write things like dtype=np.someclass(special_option=True) instead of dtype="SC[S_O=T]" or whatever weird ad-hoc syntax we come up with. (Obviously there are some details to work out with things like the .npy format, but these seem solvable.) -n -- Nathaniel J. Smith -- http://vorpus.org From aldcroft at head.cfa.harvard.edu Sun Feb 22 17:56:25 2015 From: aldcroft at head.cfa.harvard.edu (Aldcroft, Thomas) Date: Sun, 22 Feb 2015 17:56:25 -0500 Subject: [Numpy-discussion] One-byte string dtype: third time's the charm? In-Reply-To: References: Message-ID: On Sun, Feb 22, 2015 at 5:46 PM, Charles R Harris < charlesr.harris at gmail.com> wrote: > >> >> >> On Sun, Feb 22, 2015 at 3:40 PM, Aldcroft, Thomas < >> aldcroft at head.cfa.harvard.edu> wrote: >> >>> >>> >>> On Sun, Feb 22, 2015 at 2:52 PM, Nathaniel Smith wrote: >>> >>>> On Sun, Feb 22, 2015 at 10:21 AM, Aldcroft, Thomas >>>> wrote: >>>> > The idea of a one-byte string dtype has been extensively discussed >>>> twice >>>> > before, with a lot of good input and ideas, but no action [1, 2]. >>>> > >>>> > tl;dr: Perfect is the enemy of good. Can numpy just add a one-byte >>>> string >>>> > dtype named 's' that uses latin-1 encoding as a bridge to enable >>>> Python 3 >>>> > usage in the near term? >>>> >>>> I think this is a good idea. I think overall it would be good for >>>> numpy to switch to using variable-length strings in most cases (cf.
>>> pandas), which is a different kind of change, but fixed-length 8-bit >>> encoded text is obviously a common on-disk format in scientific >>> applications, so numpy will still need some way to deal with it >>> conveniently. In the long run we'd like to have more flexibility (e.g. >>> allowing choice of character encoding), but since this proposal is a >>> subset of that functionality, then it won't interfere with later >>> improvements. I can see an argument for utf8 over latin1, but it >>> really doesn't matter that much so whatever, blue and purple bikesheds >>> are both fine. >>> >>> The tricky bit here is "just" :-). Do you want to implement this? Do >>> you know someone who does? It's possible but will be somewhat >>> annoying, since to do it directly without refactoring how dtypes work >>> first then you'll have to add lots of copy-paste code to all the >>> different ufuncs. >>> >> >> I'm would be happy to have a go at this, with the caveat that someone who >> understands numpy would need to get me started with a minimal prototype. >> From there I can do the "annoying" copy-paste for ufuncs etc, writing tests >> and docs. I'm assuming that with a prototype then the rest can be done >> without any deep understanding of numpy internals (which I do not have). >> >> - Tom >> >> > > The last two new types added to numpy were float16 and datetime64. Might > be worth looking at the steps needed to implement those. There was also a > user type, `rational` that got added, that could also provide a template. > Maybe we need to have a way to add 'numpy certified' user data types. It > might also be possible to reuse the `c` data type, currently implemented as > `S1` IIRC, but that could cause some problems. > OK I'll have a look at those. 
Thanks, Tom > > Chuck > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Sun Feb 22 18:36:00 2015 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 22 Feb 2015 15:36:00 -0800 Subject: [Numpy-discussion] One-byte string dtype: third time's the charm? In-Reply-To: References: Message-ID: On Sun, Feb 22, 2015 at 2:46 PM, Charles R Harris wrote: > > On Sun, Feb 22, 2015 at 3:40 PM, Aldcroft, Thomas > wrote: >> >> >> >> On Sun, Feb 22, 2015 at 2:52 PM, Nathaniel Smith wrote: >>> >>> On Sun, Feb 22, 2015 at 10:21 AM, Aldcroft, Thomas >>> wrote: >>> > The idea of a one-byte string dtype has been extensively discussed >>> > twice >>> > before, with a lot of good input and ideas, but no action [1, 2]. >>> > >>> > tl;dr: Perfect is the enemy of good. Can numpy just add a one-byte >>> > string >>> > dtype named 's' that uses latin-1 encoding as a bridge to enable Python >>> > 3 >>> > usage in the near term? >>> >>> I think this is a good idea. I think overall it would be good for >>> numpy to switch to using variable-length strings in most cases (cf. >>> pandas), which is a different kind of change, but fixed-length 8-bit >>> encoded text is obviously a common on-disk format in scientific >>> applications, so numpy will still need some way to deal with it >>> conveniently. In the long run we'd like to have more flexibility (e.g. >>> allowing choice of character encoding), but since this proposal is a >>> subset of that functionality, then it won't interfere with later >>> improvements. I can see an argument for utf8 over latin1, but it >>> really doesn't matter that much so whatever, blue and purple bikesheds >>> are both fine. >>> >>> The tricky bit here is "just" :-). Do you want to implement this? Do >>> you know someone who does? 
It's possible but will be somewhat >>> annoying, since to do it directly without refactoring how dtypes work >>> first then you'll have to add lots of copy-paste code to all the >>> different ufuncs. >> >> >> I'm would be happy to have a go at this, with the caveat that someone who >> understands numpy would need to get me started with a minimal prototype. >> From there I can do the "annoying" copy-paste for ufuncs etc, writing tests >> and docs. I'm assuming that with a prototype then the rest can be done >> without any deep understanding of numpy internals (which I do not have). >> >> - Tom >> > > > The last two new types added to numpy were float16 and datetime64. Might be > worth looking at the steps needed to implement those. There was also a user > type, `rational` that got added, that could also provide a template. Maybe > we need to have a way to add 'numpy certified' user data types. It might > also be possible to reuse the `c` data type, currently implemented as `S1` > IIRC, but that could cause some problems. float16 and rational probably aren't too relevant because they are fixed-size types, and variable-size dtypes are much trickier. datetime64 will be more similar, but also add its own irrelevant complexities -- you might be best off just looking at how S and U work and copying them. -n -- Nathaniel J. Smith -- http://vorpus.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From aldcroft at head.cfa.harvard.edu Sun Feb 22 18:38:54 2015 From: aldcroft at head.cfa.harvard.edu (Aldcroft, Thomas) Date: Sun, 22 Feb 2015 18:38:54 -0500 Subject: [Numpy-discussion] One-byte string dtype: third time's the charm? 
In-Reply-To: References: Message-ID: On Sun, Feb 22, 2015 at 5:56 PM, Aldcroft, Thomas < aldcroft at head.cfa.harvard.edu> wrote: > > > On Sun, Feb 22, 2015 at 5:46 PM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> >> >> On Sun, Feb 22, 2015 at 3:40 PM, Aldcroft, Thomas < >> aldcroft at head.cfa.harvard.edu> wrote: >> >>> >>> >>> On Sun, Feb 22, 2015 at 2:52 PM, Nathaniel Smith wrote: >>> >>>> On Sun, Feb 22, 2015 at 10:21 AM, Aldcroft, Thomas >>>> wrote: >>>> > The idea of a one-byte string dtype has been extensively discussed >>>> twice >>>> > before, with a lot of good input and ideas, but no action [1, 2]. >>>> > >>>> > tl;dr: Perfect is the enemy of good. Can numpy just add a one-byte >>>> string >>>> > dtype named 's' that uses latin-1 encoding as a bridge to enable >>>> Python 3 >>>> > usage in the near term? >>>> >>>> I think this is a good idea. I think overall it would be good for >>>> numpy to switch to using variable-length strings in most cases (cf. >>>> pandas), which is a different kind of change, but fixed-length 8-bit >>>> encoded text is obviously a common on-disk format in scientific >>>> applications, so numpy will still need some way to deal with it >>>> conveniently. In the long run we'd like to have more flexibility (e.g. >>>> allowing choice of character encoding), but since this proposal is a >>>> subset of that functionality, then it won't interfere with later >>>> improvements. I can see an argument for utf8 over latin1, but it >>>> really doesn't matter that much so whatever, blue and purple bikesheds >>>> are both fine. >>>> >>>> The tricky bit here is "just" :-). Do you want to implement this? Do >>>> you know someone who does? It's possible but will be somewhat >>>> annoying, since to do it directly without refactoring how dtypes work >>>> first then you'll have to add lots of copy-paste code to all the >>>> different ufuncs. 
>>>> >>> >>> I'm would be happy to have a go at this, with the caveat that someone >>> who understands numpy would need to get me started with a minimal >>> prototype. From there I can do the "annoying" copy-paste for ufuncs etc, >>> writing tests and docs. I'm assuming that with a prototype then the rest >>> can be done without any deep understanding of numpy internals (which I do >>> not have). >>> >>> - Tom >>> >>> >> >> The last two new types added to numpy were float16 and datetime64. Might >> be worth looking at the steps needed to implement those. There was also a >> user type, `rational` that got added, that could also provide a template. >> Maybe we need to have a way to add 'numpy certified' user data types. It >> might also be possible to reuse the `c` data type, currently implemented as >> `S1` IIRC, but that could cause some problems. >> > > OK I'll have a look at those. > On second thought.. Maybe I'm being naive, but I think that starting from scratch looking at entirely new dtypes is harder than it needs to be, or at least not the most straightforward path [EDIT: just saw email from Nathan agreeing here]. What is being proposed is essentially: - For Python 2, the 's' type is exactly a clone of 'S'. In other words 's' will interface with Python as a bytes (aka str) object just like 'S'. - For Python 3, the 's' type is internally the same as 'S' (np.bytes_) in all operations, but interfaces with Python as a latin-1 encoded string. So the only difference is at the interface layer with Python (initialization, comparison, iteration, etc). So as a starting point we would want to clone 'S' to 's', then fix up the interface to Python 3. Does that sound about right? 
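Those semantics can be sketched in plain Python — note the 's' dtype itself does not exist, and `to_s` / `s_item` below are hypothetical helper names that only illustrate the intended behaviour:

```python
import numpy as np

def to_s(strings, width):
    # storage identical to 'S': fixed-width, latin-1 encoded bytes
    return np.array([s.encode("latin-1") for s in strings],
                    dtype="S%d" % width)

def s_item(arr, i):
    # the Python 3 interface layer: items come back as str, not bytes
    return arr[i].decode("latin-1")

names = to_s(["john", "mary"], 4)
assert names.dtype == np.dtype("S4")   # same bytes in memory as 'S'
assert s_item(names, 0) == "john"      # but a str at the boundary
```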
- Tom > > Thanks, > Tom > > >> >> Chuck >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Sun Feb 22 18:44:57 2015 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 22 Feb 2015 15:44:57 -0800 Subject: [Numpy-discussion] One-byte string dtype: third time's the charm? In-Reply-To: References: Message-ID: On Feb 22, 2015 3:39 PM, "Aldcroft, Thomas" wrote: > > > > On Sun, Feb 22, 2015 at 5:56 PM, Aldcroft, Thomas < aldcroft at head.cfa.harvard.edu> wrote: >> >> >> >> On Sun, Feb 22, 2015 at 5:46 PM, Charles R Harris < charlesr.harris at gmail.com> wrote: >>> >>> >>> >>> On Sun, Feb 22, 2015 at 3:40 PM, Aldcroft, Thomas < aldcroft at head.cfa.harvard.edu> wrote: >>>> >>>> >>>> >>>> On Sun, Feb 22, 2015 at 2:52 PM, Nathaniel Smith wrote: >>>>> >>>>> On Sun, Feb 22, 2015 at 10:21 AM, Aldcroft, Thomas >>>>> wrote: >>>>> > The idea of a one-byte string dtype has been extensively discussed twice >>>>> > before, with a lot of good input and ideas, but no action [1, 2]. >>>>> > >>>>> > tl;dr: Perfect is the enemy of good. Can numpy just add a one-byte string >>>>> > dtype named 's' that uses latin-1 encoding as a bridge to enable Python 3 >>>>> > usage in the near term? >>>>> >>>>> I think this is a good idea. I think overall it would be good for >>>>> numpy to switch to using variable-length strings in most cases (cf. >>>>> pandas), which is a different kind of change, but fixed-length 8-bit >>>>> encoded text is obviously a common on-disk format in scientific >>>>> applications, so numpy will still need some way to deal with it >>>>> conveniently. In the long run we'd like to have more flexibility (e.g. 
>>>>> allowing choice of character encoding), but since this proposal is a >>>>> subset of that functionality, it won't interfere with later >>>>> improvements. I can see an argument for utf8 over latin1, but it >>>>> really doesn't matter that much so whatever, blue and purple bikesheds >>>>> are both fine. >>>>> >>>>> The tricky bit here is "just" :-). Do you want to implement this? Do >>>>> you know someone who does? It's possible but will be somewhat >>>>> annoying, since if you do it directly, without first refactoring how >>>>> dtypes work, you'll have to add lots of copy-paste code to all the >>>>> different ufuncs. >>>> >>>> >>>> I would be happy to have a go at this, with the caveat that someone who understands numpy would need to get me started with a minimal prototype. From there I can do the "annoying" copy-paste for ufuncs etc., writing tests and docs. I'm assuming that with a prototype the rest can be done without any deep understanding of numpy internals (which I do not have). >>>> >>>> - Tom >>>> >>> >>> >>> The last two new types added to numpy were float16 and datetime64. Might be worth looking at the steps needed to implement those. There was also a user type, `rational`, that got added, that could also provide a template. Maybe we need to have a way to add 'numpy certified' user data types. It might also be possible to reuse the `c` data type, currently implemented as `S1` IIRC, but that could cause some problems. >> >> >> OK, I'll have a look at those. > > > On second thought... Maybe I'm being naive, but I think that starting from scratch with entirely new dtypes is harder than it needs to be, or at least not the most straightforward path [EDIT: just saw the email from Nathaniel agreeing here]. What is being proposed is essentially: > > - For Python 2, the 's' type is exactly a clone of 'S'. In other words 's' will interface with Python as a bytes (aka str) object just like 'S'.
> - For Python 3, the 's' type is internally the same as 'S' (np.bytes_) in all operations, but interfaces with Python as a latin-1 encoded string. So the only difference is at the interface layer with Python (initialization, comparison, iteration, etc). > > So as a starting point we would want to clone 'S' to 's', then fix up the interface to Python 3. Does that sound about right? Sounds reasonable to me. You'll also want to consider interactions between the dtypes -- mixed operations like array("a", dtype="s") == array("a", dtype="U") should do the right thing, and casting s<->U ditto. -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From jaime.frio at gmail.com Mon Feb 23 02:52:56 2015 From: jaime.frio at gmail.com (=?UTF-8?Q?Jaime_Fern=C3=A1ndez_del_R=C3=ADo?=) Date: Sun, 22 Feb 2015 23:52:56 -0800 Subject: [Numpy-discussion] np.nonzero behavior with multidimensional arrays Message-ID: This was raised in SO today: http://stackoverflow.com/questions/28663142/why-is-np-wheres-result-read-only-for-multi-dimensional-arrays/28664009 np.nonzero (and np.where for boolean arrays) behave differently for 1-D and higher dimensional arrays: In the first case, a tuple with a single behaved base ndarray is returned:

>>> a = np.ma.array(range(6))
>>> np.where(a > 3)
(array([4, 5]),)
>>> np.where(a > 3)[0].flags
  C_CONTIGUOUS : True
  F_CONTIGUOUS : True
  OWNDATA : True
  WRITEABLE : True
  ALIGNED : True
  UPDATEIFCOPY : False

In the second, a tuple with as many arrays as dimensions in the passed array is returned, but the arrays are not base ndarrays, but of the same subtype as was passed to the function.
These arrays are also set as non-writeable:

>>> np.where(a.reshape(2, 3) > 3)
(masked_array(data = [1 1], mask = False, fill_value = 999999),
 masked_array(data = [1 2], mask = False, fill_value = 999999))
>>> np.where(a.reshape(2, 3) > 3)[0].flags
  C_CONTIGUOUS : False
  F_CONTIGUOUS : False
  OWNDATA : False
  WRITEABLE : False
  ALIGNED : True
  UPDATEIFCOPY : False

I can't think of any reason that justifies this difference, and believe they should be made to return similar results. My feeling is that the proper behavior is the 1-D one, and that the behavior for multidimensional arrays should match it. Can anyone think of any reason that justifies the current behavior? Jaime -- (\__/) ( O.o) ( > <) Este es Conejo. Copia a Conejo en tu firma y ayúdale en sus planes de dominación mundial. -------------- next part -------------- An HTML attachment was scrubbed... URL: From andrew.collette at gmail.com Mon Feb 23 11:55:22 2015 From: andrew.collette at gmail.com (Andrew Collette) Date: Mon, 23 Feb 2015 09:55:22 -0700 Subject: [Numpy-discussion] One-byte string dtype: third time's the charm? In-Reply-To: References: Message-ID: Hi all, > Using latin-1 is a pragmatic compromise that provides continuity to allow > scientists to run their existing code in Python 3 and have things just work. > It isn't perfect and it should not be the end of the story, but it would be > good. This single issue is the *only* thing blocking me and my team from > using Python 3 in operations. Since you mentioned HDF compatibility, I would just note that the two string formats HDF5 supports are ASCII and UTF-8, although presently no validation is performed by HDF5 as to the actual contents. This shouldn't discourage anyone from going with Latin-1, but it would mean that h5py (and presumably PyTables) would have to choose from the following options:

1. Convert to UTF-8, and risk truncation
2. Store as ASCII and replace out-of-range characters with "?"
3.
Just store the Latin-1 text in a type labelled "ASCII", and live with it. 4. Raise an exception if non-ASCII characters are present Realistically, h5py might go with (3) as the ASCII type in HDF5 is much abused already. Andrew From aldcroft at head.cfa.harvard.edu Mon Feb 23 12:19:46 2015 From: aldcroft at head.cfa.harvard.edu (Aldcroft, Thomas) Date: Mon, 23 Feb 2015 12:19:46 -0500 Subject: [Numpy-discussion] One-byte string dtype: third time's the charm? In-Reply-To: References: Message-ID: On Mon, Feb 23, 2015 at 11:55 AM, Andrew Collette wrote: > Hi all, > > > Using latin-1 is a pragmatic compromise that provides continuity to allow > > scientists to run their existing code in Python 3 and have things just > work. > > It isn't perfect and it should not be the end of the story, but it would > be > > good. This single issue is the *only* thing blocking me and my team from > > using Python 3 in operations. > > Since you mentioned HDF compatibility, I would just note that the two > string formats HDF5 supports are ASCII and UTF-8, although presently > no validation is performed by HDF5 as to the actual contents. This > shouldn't discourage anyone from going with Latin-1, but it would mean > that h5py (and presumably PyTables) would have to choose from the > following options: > > 1. Convert to UTF-8, and risk truncation > 2. Store as ASCII and replace out-of-range characters with "?" > 3. Just store the Latin-1 text in a type labelled "ASCII", and live with > it. > 4. Raise an exception if non-ASCII characters are present > > Realistically, h5py might go with (3) as the ASCII type in HDF5 is > much abused already. > I was working on the assumption that (3) would be the best choice, for the reason you gave and to minimize breakage in transitioning to Python 3. 
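To make the truncation risk in option (1) concrete, the expansion is easy to see in plain Python (a hypothetical field value, not h5py behavior): a value that fits a fixed-width field under latin-1 can need more bytes under UTF-8.

```python
s = 'caf\xe9'                           # hypothetical 4-character value, 'café'
assert len(s.encode('latin-1')) == 4    # fits a 4-byte fixed-width field exactly
assert len(s.encode('utf-8')) == 5      # '\xe9' takes two bytes in UTF-8, so the
                                        # same fixed-width field would truncate
```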
- Tom > > Andrew > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jtaylor.debian at googlemail.com Mon Feb 23 15:12:33 2015 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Mon, 23 Feb 2015 21:12:33 +0100 Subject: [Numpy-discussion] np.nonzero behavior with multidimensional arrays In-Reply-To: References: Message-ID: <54EB89B1.2090101@googlemail.com> On 23.02.2015 08:52, Jaime Fernández del Río wrote: > This was raised in SO today: > > http://stackoverflow.com/questions/28663142/why-is-np-wheres-result-read-only-for-multi-dimensional-arrays/28664009 > > np.nonzero (and np.where for boolean arrays) behave differently for 1-D > and higher dimensional arrays: > > In the first case, a tuple with a single behaved base ndarray is returned: > > In the second, a tuple with as many arrays as dimensions in the passed > array is returned, but the arrays are not base ndarrays, but of the same > subtype as was passed to the function. These arrays are also set as > non-writeable: > The non-writeable looks like a bug to me; it should probably just use PyArray_FLAGS(self) instead of 0. We had a similar one with the new indexing, it's easy to forget this. Concerning subtypes, I don't think there is a good reason to preserve them here and it should just return an ndarray. where with one argument returns a new object that indexes the input object, so it is not really related anymore to what it indexes and there is no information that numpy could reasonably propagate.
(where with three arguments makes sense with subtypes, and fixing that is on my todo list) From jaime.frio at gmail.com Mon Feb 23 16:29:51 2015 From: jaime.frio at gmail.com (=?UTF-8?Q?Jaime_Fern=C3=A1ndez_del_R=C3=ADo?=) Date: Mon, 23 Feb 2015 13:29:51 -0800 Subject: [Numpy-discussion] np.nonzero behavior with multidimensional arrays In-Reply-To: <54EB89B1.2090101@googlemail.com> References: <54EB89B1.2090101@googlemail.com> Message-ID: On Mon, Feb 23, 2015 at 12:12 PM, Julian Taylor < jtaylor.debian at googlemail.com> wrote: > On 23.02.2015 08:52, Jaime Fernández del Río wrote: > > This was raised in SO today: > > > > > http://stackoverflow.com/questions/28663142/why-is-np-wheres-result-read-only-for-multi-dimensional-arrays/28664009 > > > > np.nonzero (and np.where for boolean arrays) behave differently for 1-D > > and higher dimensional arrays: > > > > In the first case, a tuple with a single behaved base ndarray is returned: > > > > In the second, a tuple with as many arrays as dimensions in the passed > > array is returned, but the arrays are not base ndarrays, but of the same > > subtype as was passed to the function. These arrays are also set as > > non-writeable: > > > > > The non-writeable looks like a bug to me, it should probably just use > PyArray_FLAGS(self) instead of 0. We had a similar one with the new > indexing, it's easy to forget this. > > Concerning subtypes, I don't think there is a good reason to preserve > them here and it should just return an ndarray. > where with one argument returns a new object that indexes the input > object so it is not really related anymore to what it indexes and there > is no information that numpy could reasonably propagate. > That was my thinking when I sent that message last night: add the PyArray_FLAGS argument, and pass the type of the return array rather than the input array when creating the views.
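For reference, with a plain base ndarray the one-argument where and nonzero agree exactly; a minimal sketch of the behavior under discussion (the subtype and writeable questions only arise for subclasses such as masked arrays):

```python
import numpy as np

a = np.arange(6).reshape(2, 3)      # [[0, 1, 2], [3, 4, 5]]
rows_w, cols_w = np.where(a > 3)    # one-argument where ...
rows_n, cols_n = np.nonzero(a > 3)  # ... behaves as nonzero
assert rows_w.tolist() == rows_n.tolist() == [1, 1]   # 4 and 5 live in row 1
assert cols_w.tolist() == cols_n.tolist() == [1, 2]   # at columns 1 and 2
```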
I tried to put that in a PR, but it fails a number of tests, as the return of np.nonzero is specifically checked to return the subtype of the passed in array, both in matrixlib, as well as in core/test_regression.py, related to Trac #791: https://github.com/numpy/numpy/issues/1389 So it seems that 7 years ago they had a different view on this, perhaps Chuck remembers what the rationale was, but this seems like a weird requirement for index returning functions: nonzero, argmin/max, argsort, argpartition and the like. Jaime -- (\__/) ( O.o) ( > <) Este es Conejo. Copia a Conejo en tu firma y ayúdale en sus planes de dominación mundial. -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Mon Feb 23 17:22:59 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 23 Feb 2015 15:22:59 -0700 Subject: [Numpy-discussion] np.nonzero behavior with multidimensional arrays In-Reply-To: References: <54EB89B1.2090101@googlemail.com> Message-ID: On Mon, Feb 23, 2015 at 2:29 PM, Jaime Fernández del Río < jaime.frio at gmail.com> wrote: > > > On Mon, Feb 23, 2015 at 12:12 PM, Julian Taylor < > jtaylor.debian at googlemail.com> wrote: > >> On 23.02.2015 08:52, Jaime Fernández del Río wrote: >> > This was raised in SO today: >> > >> > >> http://stackoverflow.com/questions/28663142/why-is-np-wheres-result-read-only-for-multi-dimensional-arrays/28664009 >> > >> > np.nonzero (and np.where for boolean arrays) behave differently for 1-D >> > and higher dimensional arrays: >> > >> > In the first case, a tuple with a single behaved base ndarray is >> returned: >> > >> > In the second, a tuple with as many arrays as dimensions in the passed >> > array is returned, but the arrays are not base ndarrays, but of the same >> > subtype as was passed to the function.
These arrays are also set as >> > non-writeable: >> > >> >> >> The non-writeable looks like a bug to me, it should probably just use >> PyArray_FLAGS(self) instead of 0. We had a similar one with the new >> indexing, it's easy to forget this. >> >> Concerning subtypes, I don't think there is a good reason to preserve >> them here and it should just return an ndarray. >> where with one argument returns a new object that indexes the input >> object so it is not really related anymore to what it indexes and there >> is no information that numpy could reasonably propagate. >> > > That was my thinking when I sent that message last night: add the > PyArray_FLAGS argument, and pass the type of the return array rather than > the input array when creating the views. > > I tried to put that in a PR, but it fails a number of tests, as the return > of np.nonzero is specifically checked to return the subtype of the passed > in array, both in matrixlib, as well as in core/test_regression.py, related > to Trac #791: > > https://github.com/numpy/numpy/issues/1389 > > So it seems that 7 years ago they had a different view on this, perhaps > Chuck remembers what the rationale was, but this seems like a weird > requirement for index returning functions: nonzero, argmin/max, argsort, > argpartition and the like. > That would be, what, 2008? That was way long ago, back around 1.1, and before I was much involved. I don't know what the rationale was at that time, but it may have been inherited from Numeric or Numarray, or just seemed like the right thing to do. Chuck -------------- next part -------------- An HTML attachment was scrubbed...
URL: From ralf.gommers at gmail.com Tue Feb 24 01:49:41 2015 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Tue, 24 Feb 2015 07:49:41 +0100 Subject: [Numpy-discussion] GSoC'15 - mentors & ideas Message-ID: Hi all, On Fri, Feb 20, 2015 at 10:05 AM, Ralf Gommers wrote: > Hi all, > > It's time to start preparing for this year's Google Summer of Code. There > is actually one urgent thing to be done (before 19.00 UTC today), which is > to get our ideas page in decent shape. It doesn't have to be final, but > there has to be enough on there for the organizers to judge it. This page > is here: https://github.com/scipy/scipy/wiki/GSoC-project-ideas. I'll be > reworking it and linking it from the PSF page today, but if you already > have new ideas please add them there. See > https://wiki.python.org/moin/SummerOfCode/OrgIdeasPageTemplate for this > year's template for adding a new idea. > The ideas page is now in pretty good shape. More ideas are very welcome though, especially easy or easy/intermediate ideas. Numpy right now has zero easy ones and Scipy only one and a half. What we also need is mentors. All ideas already have a potential mentor listed, however some ideas are from last year and I'm not sure that all those mentors really are available this year. And more than one potential mentor per idea is always good. So can everyone please add/remove his or her name on that page? I'm happy to take care of most of the organizational aspects this year, however I'll be offline for two weeks in July and from the end of August onwards, so I'll need some help in those periods. Any volunteers? Thanks, Ralf -------------- next part -------------- An HTML attachment was scrubbed...
URL: From nickpapior at gmail.com Tue Feb 24 08:16:30 2015 From: nickpapior at gmail.com (Nick Papior Andersen) Date: Tue, 24 Feb 2015 13:16:30 +0000 Subject: [Numpy-discussion] PR, extended site.cfg capabilities Message-ID: Dear all, I have initiated a PR-5597, which enables the reading of new flags from the site.cfg file. @rgommers requested that I posted some information on this site, possibly somebody could test it on their setup. So the PR basically enables reading these extra options in each section:

runtime_library_dirs: Add runtime library directories to the shared libraries (overrides the dreaded LD_LIBRARY_PATH)
extra_compile_args: Adds extra compile flags to the compilation
extra_link_args: Adds extra flags when linking to libraries

Note that this PR will "fix" a lot of issues down the line. Specifically all software which utilises numpy's distutils will benefit from this. As an example, I have successfully set runtime_library_dirs for site.cfg in numpy, where scipy, petsc4py, pygsl, slepc4py utilise these flags and this enables me to create environments without the need for LD_LIBRARY_PATH. The other options simply add to the flexibility of the compilation to test different optimisations etc. For instance my OpenBLAS section looks like this:

[openblas]
library_dirs = /opt/openblas/0.2.13/gnu-4.9.2/lib
include_dirs = /opt/openblas/0.2.13/gnu-4.9.2/include
runtime_library_dirs = /opt/openblas/0.2.13/gnu-4.9.2/lib

I hope this can be of use to somebody else than me :) Feel free to test it and provide feedback! -- Kind regards Nick -------------- next part -------------- An HTML attachment was scrubbed...
URL: From jtaylor.debian at googlemail.com Tue Feb 24 08:31:15 2015 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Tue, 24 Feb 2015 14:31:15 +0100 Subject: [Numpy-discussion] PR, extended site.cfg capabilities In-Reply-To: References: Message-ID: <54EC7D23.6000700@googlemail.com> On 02/24/2015 02:16 PM, Nick Papior Andersen wrote: > Dear all, > > I have initiated a PR-5597 , > which enables the reading of new flags from the site.cfg file. > @rgommers requested that I posted some information on this site, > possibly somebody could test it on their setup. I do not fully understand the purpose of these changes. Can you give some more detailed use cases? > > So the PR basically enables reading these extra options in each section: > runtime_library_dirs : Add runtime library directories to the shared > libraries (overrides the dreaded LD_LIBRARY_PATH) LD_LIBRARY_PATH should not be used during compilation, this is a runtime flag that numpy.distutils has no control over. Can you explain in more detail what you intend to do with this flag? > extra_compile_args: Adds extra compile flags to the compilation extra flags to which compilation? site.cfg lists libraries that already are compiled. The only influence compiler flags could have is for header only libraries that profit from some flags. But numpy has no support for such libraries currently. E.g. cblas.h (which is just a header with signatures) is bundled with numpy. I guess third parties may make use of this, an example would be good. > extra_link_args: Adds extra flags when linking to libraries This flag may be useful. It could be used to pass options required during linking, like -Wl,--no-as-needed that is sometimes needed to link with gsl. Possibly also useful for link time optimizations. 
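For concreteness, a section combining all three proposed options could look like the following site.cfg sketch (the paths and flags are hypothetical, chosen only to illustrate each option):

```ini
# Hypothetical site.cfg fragment using the options proposed in PR-5597.
[openblas]
library_dirs = /opt/openblas/lib
include_dirs = /opt/openblas/include
# bake an rpath into the built extensions instead of relying on LD_LIBRARY_PATH
runtime_library_dirs = /opt/openblas/lib
# extra flags passed when compiling sources that use this library
extra_compile_args = -O3
# extra flags passed at link time, e.g. for libraries needing --no-as-needed
extra_link_args = -Wl,--no-as-needed
```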
From nickpapior at gmail.com Tue Feb 24 08:51:18 2015 From: nickpapior at gmail.com (Nick Papior Andersen) Date: Tue, 24 Feb 2015 14:51:18 +0100 Subject: [Numpy-discussion] PR, extended site.cfg capabilities In-Reply-To: <54EC7D23.6000700@googlemail.com> References: <54EC7D23.6000700@googlemail.com> Message-ID: 2015-02-24 14:31 GMT+01:00 Julian Taylor : > On 02/24/2015 02:16 PM, Nick Papior Andersen wrote: > > Dear all, > > > > I have initiated a PR-5597, > > which enables the reading of new flags from the site.cfg file. > > @rgommers requested that I posted some information on this site, > > possibly somebody could test it on their setup. > > I do not fully understand the purpose of these changes. Can you give > some more detailed use cases? > > So the PR basically enables reading these extra options in each section: > > runtime_library_dirs : Add runtime library directories to the shared > > libraries (overrides the dreaded LD_LIBRARY_PATH) > > LD_LIBRARY_PATH should not be used during compilation, this is a runtime > flag that numpy.distutils has no control over. > Can you explain in more detail what you intend to do with this flag? > Yes, but in my case I almost never set LD_LIBRARY_PATH; instead I link with the runtime library directory so that LD_LIBRARY_PATH need not be set. Consider this output for linalg/lapack_lite.so:

$> echo $LD_LIBRARY_PATH

$> ldd lapack_lite.so
	libopenblas.so.0 => not found

$> echo $LD_LIBRARY_PATH
/path/to/openblas/lib
$> ldd lapack_lite.so
	libopenblas.so.0 => /path/to/openblas/lib/libopenblas.so.0

However, if I compile numpy with runtime_library_dirs = /path/to/openblas/lib in the openblas section, then the output would be

$> echo $LD_LIBRARY_PATH

$> ldd lapack_lite.so
	libopenblas.so.0 => /path/to/openblas/lib/libopenblas.so.0

I.e. screw-ups in LD_LIBRARY_PATH can be circumvented. > > extra_compile_args: Adds extra compile flags to the compilation > > extra flags to which compilation?
> site.cfg lists libraries that already are compiled. The only influence > compiler flags could have is for header only libraries that profit from > some flags. But numpy has no support for such libraries currently. E.g. > cblas.h (which is just a header with signatures) is bundled with numpy. > I guess third parties may make use of this, an example would be good. > The way I see distutils in numpy is that it extends the generic distutils package so that packages relying on numpy can compile their software the way they want. In some of the extra software I work with, using numpy's distutils to link to lapack/blas is easy, but adding specific compilation flags to sources is not so easy (it requires editing the compiler sources). Also, when numpy compiles lapack_litemodule.c it does so with the generic flags in the compilers specified in the numpy distribution; however, now it also uses the flags provided in extra_compile_args from the lapack section of site.cfg. In that regard I would not consider numpy as having "no support", as some packages do in fact use it. > > extra_link_args: Adds extra flags when linking to libraries > > This flag may be useful. > It could be used to pass options required during linking, like > -Wl,--no-as-needed that is sometimes needed to link with gsl. > Possibly also useful for link time optimizations. > Exactly, the runtime_library_dirs can be considered a shorthand for: extra_link_args = -Wl,-rpath= -Wl,-rpath= So you might consider it superfluous, but the intrinsic distutils package allows both abstractions, so why not allow them both? > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > I hope this clarified a bit. Thanks for the questions. -- Kind regards Nick -------------- next part -------------- An HTML attachment was scrubbed...
URL: From jtaylor.debian at googlemail.com Tue Feb 24 08:56:23 2015 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Tue, 24 Feb 2015 14:56:23 +0100 Subject: [Numpy-discussion] PR, extended site.cfg capabilities In-Reply-To: <54EC7D23.6000700@googlemail.com> References: <54EC7D23.6000700@googlemail.com> Message-ID: <54EC8307.4000400@googlemail.com> On 02/24/2015 02:31 PM, Julian Taylor wrote: > On 02/24/2015 02:16 PM, Nick Papior Andersen wrote: >> Dear all, >> >> I have initiated a PR-5597 , >> which enables the reading of new flags from the site.cfg file. >> @rgommers requested that I posted some information on this site, >> possibly somebody could test it on their setup. > > I do not fully understand the purpose of these changes. Can you give > some more detailed use cases? I think I understand better now, so this is intended as a site.cfg equivalent (and possibly more portable) variant of the environment variables that control these options? e.g. runtime_lib_dirs would be equivalent to LD_RUN_PATH env. variable and build_ext --rpath and the compile_extra_opts equivalent to the OPT env variable? > > >> >> So the PR basically enables reading these extra options in each section: >> runtime_library_dirs : Add runtime library directories to the shared >> libraries (overrides the dreaded LD_LIBRARY_PATH) > > LD_LIBRARY_PATH should not be used during compilation, this is a runtime > flag that numpy.distutils has no control over. > Can you explain in more detail what you intend to do with this flag? > >> extra_compile_args: Adds extra compile flags to the compilation > > extra flags to which compilation? > site.cfg lists libraries that already are compiled. The only influence > compiler flags could have is for header only libraries that profit from > some flags. But numpy has no support for such libraries currently. E.g. > cblas.h (which is just a header with signatures) is bundled with numpy. 
> I guess third parties may make use of this, an example would be good. > >> extra_link_args: Adds extra flags when linking to libraries > > This flag may be useful. > It could be used to pass options required during linking, like > -Wl,--no-as-needed that is sometimes needed to link with gsl. > Possibly also useful for link time optimizations. > From nickpapior at gmail.com Tue Feb 24 09:01:33 2015 From: nickpapior at gmail.com (Nick Papior Andersen) Date: Tue, 24 Feb 2015 15:01:33 +0100 Subject: [Numpy-discussion] PR, extended site.cfg capabilities In-Reply-To: <54EC8307.4000400@googlemail.com> References: <54EC7D23.6000700@googlemail.com> <54EC8307.4000400@googlemail.com> Message-ID: 2015-02-24 14:56 GMT+01:00 Julian Taylor : > On 02/24/2015 02:31 PM, Julian Taylor wrote: > > On 02/24/2015 02:16 PM, Nick Papior Andersen wrote: > >> Dear all, > >> > >> I have initiated a PR-5597 , > >> which enables the reading of new flags from the site.cfg file. > >> @rgommers requested that I posted some information on this site, > >> possibly somebody could test it on their setup. > > > > I do not fully understand the purpose of these changes. Can you give > > some more detailed use cases? > > I think I understand better now, so this is intended as a site.cfg > equivalent (and possibly more portable) variant of the environment > variables that control these options? > e.g. runtime_lib_dirs would be equivalent to LD_RUN_PATH env. variable > and build_ext --rpath > and the compile_extra_opts equivalent to the OPT env variable? > > Yes, but with the flexibility of each library (section). Instead of "globally" using the env's. And also that the site.cfg file is used in scipy which does not force the user to build numpy AND scipy with build_ext --rpath, etc. 
> > > > >> > >> So the PR basically enables reading these extra options in each section: > >> runtime_library_dirs : Add runtime library directories to the shared > >> libraries (overrides the dreaded LD_LIBRARY_PATH) > > > > LD_LIBRARY_PATH should not be used during compilation, this is a runtime > > flag that numpy.distutils has no control over. > > Can you explain in more detail what you intend to do with this flag? > > > >> extra_compile_args: Adds extra compile flags to the compilation > > > > extra flags to which compilation? > > site.cfg lists libraries that already are compiled. The only influence > > compiler flags could have is for header only libraries that profit from > > some flags. But numpy has no support for such libraries currently. E.g. > > cblas.h (which is just a header with signatures) is bundled with numpy. > > I guess third parties may make use of this, an example would be good. > > > >> extra_link_args: Adds extra flags when linking to libraries > > > > This flag may be useful. > > It could be used to pass options required during linking, like > > -Wl,--no-as-needed that is sometimes needed to link with gsl. > > Possibly also useful for link time optimizations. > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Kind regards Nick -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Tue Feb 24 11:41:03 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 24 Feb 2015 09:41:03 -0700 Subject: [Numpy-discussion] GSoC'15 - mentors & ideas In-Reply-To: References: Message-ID: On Mon, Feb 23, 2015 at 11:49 PM, Ralf Gommers wrote: > Hi all, > > > On Fri, Feb 20, 2015 at 10:05 AM, Ralf Gommers > wrote: > >> Hi all, >> >> It's time to start preparing for this year's Google Summer of Code. 
There >> is actually one urgent thing to be done (before 19.00 UTC today), which is >> to get our ideas page in decent shape. It doesn't have to be final, but >> there has to be enough on there for the organizers to judge it. This page >> is here: https://github.com/scipy/scipy/wiki/GSoC-project-ideas. I'll be >> reworking it and linking it from the PSF page today, but if you already >> have new ideas please add them there. See >> https://wiki.python.org/moin/SummerOfCode/OrgIdeasPageTemplate for this >> year's template for adding a new idea. >> > > The ideas page is now in pretty good shape. More ideas are very welcome > though, especially easy or easy/intermediate ideas. Numpy right now has > zero easy ones and Scipy only one and a half. > We could add a benchmark project for numpy that would build off the work Pauli is doing in Scipy. That would be easy to intermediate I think, as the programming bits might be easy, but coming up with the benchmarks would be more difficult. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From jtaylor.debian at googlemail.com Tue Feb 24 12:15:19 2015 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Tue, 24 Feb 2015 18:15:19 +0100 Subject: [Numpy-discussion] GSoC'15 - mentors & ideas In-Reply-To: References: Message-ID: <54ECB1A7.8060401@googlemail.com> On 02/24/2015 05:41 PM, Charles R Harris wrote: > > > On Mon, Feb 23, 2015 at 11:49 PM, Ralf Gommers > wrote: > > Hi all, > > > On Fri, Feb 20, 2015 at 10:05 AM, Ralf Gommers > > wrote: > > Hi all, > > It's time to start preparing for this year's Google Summer of > Code. There is actually one urgent thing to be done (before > 19.00 UTC today), which is to get our ideas page in decent > shape. It doesn't have to be final, but there has to be enough > on there for the organizers to judge it. This page is here: > https://github.com/scipy/scipy/wiki/GSoC-project-ideas. 
I'll be > reworking it and linking it from the PSF page today, but if you > already have new ideas please add them there. See > https://wiki.python.org/moin/SummerOfCode/OrgIdeasPageTemplate > for this year's template for adding a new idea. > > > The ideas page is now in pretty good shape. More ideas are very > welcome though, especially easy or easy/intermediate ideas. Numpy > right now has zero easy ones and Scipy only one and a half. > > > We could add a benchmark project for numpy that would build off the work > Pauli is doing in Scipy. That would be easy to intermediate I think, as > the programming bits might be easy, but coming up with the benchmarks > would be more difficult. > we already have decent set of benchmarks in yaroslavs setup: http://yarikoptic.github.io/numpy-vbench/ From charlesr.harris at gmail.com Tue Feb 24 13:29:16 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 24 Feb 2015 11:29:16 -0700 Subject: [Numpy-discussion] GSoC'15 - mentors & ideas In-Reply-To: <54ECB1A7.8060401@googlemail.com> References: <54ECB1A7.8060401@googlemail.com> Message-ID: On Tue, Feb 24, 2015 at 10:15 AM, Julian Taylor < jtaylor.debian at googlemail.com> wrote: > On 02/24/2015 05:41 PM, Charles R Harris wrote: > > > > > > On Mon, Feb 23, 2015 at 11:49 PM, Ralf Gommers > > wrote: > > > > Hi all, > > > > > > On Fri, Feb 20, 2015 at 10:05 AM, Ralf Gommers > > > wrote: > > > > Hi all, > > > > It's time to start preparing for this year's Google Summer of > > Code. There is actually one urgent thing to be done (before > > 19.00 UTC today), which is to get our ideas page in decent > > shape. It doesn't have to be final, but there has to be enough > > on there for the organizers to judge it. This page is here: > > https://github.com/scipy/scipy/wiki/GSoC-project-ideas. I'll be > > reworking it and linking it from the PSF page today, but if you > > already have new ideas please add them there. 
See > > https://wiki.python.org/moin/SummerOfCode/OrgIdeasPageTemplate > > for this year's template for adding a new idea. > > > > > > The ideas page is now in pretty good shape. More ideas are very > > welcome though, especially easy or easy/intermediate ideas. Numpy > > right now has zero easy ones and Scipy only one and a half. > > > > > > We could add a benchmark project for numpy that would build off the work > > Pauli is doing in Scipy. That would be easy to intermediate I think, as > > the programming bits might be easy, but coming up with the benchmarks > > would be more difficult. > > > > we already have decent set of benchmarks in yaroslavs setup: > http://yarikoptic.github.io/numpy-vbench/ > Are you suggesting that we steal copy those (my thought), or that we don't need a project? I note that Pauli is using Air Speed Velocity instead of vbench. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Wed Feb 25 00:11:31 2015 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 24 Feb 2015 21:11:31 -0800 Subject: [Numpy-discussion] GSoC'15 - mentors & ideas In-Reply-To: References: <54ECB1A7.8060401@googlemail.com> Message-ID: Not sure if this is a full GSoC but it would be good to get the benchmarks into the numpy repository, so we can start asking people who submit optimizations to submit new benchmarks as part of the PR (just like other changes require tests). On Feb 24, 2015 10:29 AM, "Charles R Harris" wrote: > > > On Tue, Feb 24, 2015 at 10:15 AM, Julian Taylor < > jtaylor.debian at googlemail.com> wrote: > >> On 02/24/2015 05:41 PM, Charles R Harris wrote: >> > >> > >> > On Mon, Feb 23, 2015 at 11:49 PM, Ralf Gommers > > > wrote: >> > >> > Hi all, >> > >> > >> > On Fri, Feb 20, 2015 at 10:05 AM, Ralf Gommers >> > > wrote: >> > >> > Hi all, >> > >> > It's time to start preparing for this year's Google Summer of >> > Code. 
There is actually one urgent thing to be done (before >> > 19.00 UTC today), which is to get our ideas page in decent >> > shape. It doesn't have to be final, but there has to be enough >> > on there for the organizers to judge it. This page is here: >> > https://github.com/scipy/scipy/wiki/GSoC-project-ideas. I'll be >> > reworking it and linking it from the PSF page today, but if you >> > already have new ideas please add them there. See >> > https://wiki.python.org/moin/SummerOfCode/OrgIdeasPageTemplate >> > for this year's template for adding a new idea. >> > >> > >> > The ideas page is now in pretty good shape. More ideas are very >> > welcome though, especially easy or easy/intermediate ideas. Numpy >> > right now has zero easy ones and Scipy only one and a half. >> > >> > >> > We could add a benchmark project for numpy that would build off the work >> > Pauli is doing in Scipy. That would be easy to intermediate I think, as >> > the programming bits might be easy, but coming up with the benchmarks >> > would be more difficult. >> > >> >> we already have decent set of benchmarks in yaroslavs setup: >> http://yarikoptic.github.io/numpy-vbench/ >> > > Are you suggesting that we steal copy those (my thought), or > that we don't need a project? I note that Pauli is using Air Speed Velocity > instead of vbench. > > Chuck > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From pav at iki.fi Wed Feb 25 12:59:11 2015 From: pav at iki.fi (Pauli Virtanen) Date: Wed, 25 Feb 2015 19:59:11 +0200 Subject: [Numpy-discussion] GSoC'15 - mentors & ideas In-Reply-To: References: <54ECB1A7.8060401@googlemail.com> Message-ID: 25.02.2015, 07:11, Nathaniel Smith kirjoitti: > Not sure if this is a full GSoC but it would be good to get the benchmarks > into the numpy repository, so we can start asking people who submit > optimizations to submit new benchmarks as part of the PR (just like other > changes require tests). This may be relevant in this respect: https://github.com/scipy/scipy/pull/4501 From pav at iki.fi Wed Feb 25 13:23:39 2015 From: pav at iki.fi (Pauli Virtanen) Date: Wed, 25 Feb 2015 20:23:39 +0200 Subject: [Numpy-discussion] GSoC'15 - mentors & ideas In-Reply-To: References: <54ECB1A7.8060401@googlemail.com> Message-ID: 25.02.2015, 19:59, Pauli Virtanen kirjoitti: > 25.02.2015, 07:11, Nathaniel Smith kirjoitti: >> Not sure if this is a full GSoC but it would be good to get the benchmarks >> into the numpy repository, so we can start asking people who submit >> optimizations to submit new benchmarks as part of the PR (just like other >> changes require tests). > > This may be relevant in this respect: > > https://github.com/scipy/scipy/pull/4501 Ok, I didn't read the thread. The vbench benchmarks seem to not be so many and could probably be ported to asv fairly quickly. The bigger job is in setting up and maintaining a host that runs them periodically. Also, asv doesn't (yet) do branches. 
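For concreteness, an asv benchmark is just a Python class, so a ported
vbench case would look roughly like the sketch below (the class and method
names here are invented for illustration; asv times any method whose name
starts with ``time_`` and calls ``setup()`` beforehand):

```python
# A minimal asv (airspeed velocity) benchmark, roughly the shape a
# ported vbench case would take.  "TimeDot" and "time_dot" are invented
# names for illustration; asv discovers any method named "time_*" in
# classes under the project's benchmarks/ directory.
import numpy as np

class TimeDot:
    def setup(self):
        # setup() runs before the timed method, so allocation is not
        # included in the measured time
        self.a = np.ones((200, 200))
        self.b = np.ones((200, 200))

    def time_dot(self):
        np.dot(self.a, self.b)
```

asv then records the timing of ``time_dot`` per commit and can plot the
history, which is the part that needs a host to run it periodically.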
Pauli From charlesr.harris at gmail.com Wed Feb 25 15:09:58 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 25 Feb 2015 13:09:58 -0700 Subject: [Numpy-discussion] GSoC'15 - mentors & ideas In-Reply-To: References: <54ECB1A7.8060401@googlemail.com> Message-ID: On Wed, Feb 25, 2015 at 11:23 AM, Pauli Virtanen wrote: > 25.02.2015, 19:59, Pauli Virtanen kirjoitti: > > 25.02.2015, 07:11, Nathaniel Smith kirjoitti: > >> Not sure if this is a full GSoC but it would be good to get the > benchmarks > >> into the numpy repository, so we can start asking people who submit > >> optimizations to submit new benchmarks as part of the PR (just like > other > >> changes require tests). > > > > This may be relevant in this respect: > > > > https://github.com/scipy/scipy/pull/4501 > > Ok, I didn't read the thread. The vbench benchmarks seem to not be so > many and could probably be ported to asv fairly quickly. The bigger job > is in setting up and maintaining a host that runs them periodically. > Also, asv doesn't (yet) do branches. > I would expect a GSOC student to also write some benchmarks. Anyone have thoughts/ideas on hosting? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Wed Feb 25 15:23:02 2015 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 25 Feb 2015 12:23:02 -0800 Subject: [Numpy-discussion] GSoC'15 - mentors & ideas In-Reply-To: References: <54ECB1A7.8060401@googlemail.com> Message-ID: On Feb 25, 2015 12:10 PM, "Charles R Harris" wrote: > > > > On Wed, Feb 25, 2015 at 11:23 AM, Pauli Virtanen wrote: >> >> 25.02.2015, 19:59, Pauli Virtanen kirjoitti: >> > 25.02.2015, 07:11, Nathaniel Smith kirjoitti: >> >> Not sure if this is a full GSoC but it would be good to get the benchmarks >> >> into the numpy repository, so we can start asking people who submit >> >> optimizations to submit new benchmarks as part of the PR (just like other >> >> changes require tests). 
>> >
>> > This may be relevant in this respect:
>> >
>> > https://github.com/scipy/scipy/pull/4501
>>
>> Ok, I didn't read the thread. The vbench benchmarks seem to not be so
>> many and could probably be ported to asv fairly quickly. The bigger job
>> is in setting up and maintaining a host that runs them periodically.
>> Also, asv doesn't (yet) do branches.
>
>
> I would expect a GSOC student to also write some benchmarks. Anyone have thoughts/ideas on hosting?

One possibility is Rackspace, who seem keen to hand out ~unlimited amounts
of computing resources to FOSS projects. (I think their default is
$10,000/mo/project worth of VMs/storage/etc.) Of course one has to be a bit
careful running benchmarks on virtual hardware...

-n
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From jaime.frio at gmail.com Wed Feb 25 16:24:42 2015
From: jaime.frio at gmail.com (=?UTF-8?Q?Jaime_Fern=C3=A1ndez_del_R=C3=ADo?=)
Date: Wed, 25 Feb 2015 13:24:42 -0800
Subject: [Numpy-discussion] Objects exposing the array interface
Message-ID:

An issue was raised yesterday in github, regarding np.may_share_memory when
run on a class exposing an array using the __array__ method. You can check
the details here:

https://github.com/numpy/numpy/issues/5604

Looking into it, I found out that NumPy doesn't really treat objects
exposing __array__, __array_interface__, or __array_struct__ as if they
were proper arrays:

1. When converting these objects to arrays using PyArray_Converter, if the
arrays returned by any of the array interfaces are not C contiguous,
aligned, and writeable, a copy that is will be made. Proper arrays and
subclasses are passed unchanged. This is the source of the error reported
above.

2. When converting these objects using PyArray_OutputConverter, as well as
in similar code in the ufunc machinery, anything other than a proper array
or subclass raises an error.
This means that, contrary to what the docs on subclassing say, see below, you cannot use an object exposing the array interface as an output parameter to a ufunc The following classes can be used to test this behavior: class Foo: def __init__(self, arr): self.arr = arr def __array__(self): return self.arr class Bar: def __init__(self, arr): self.arr = arr self.__array_interface__ = arr.__array_interface__ class Baz: def __init__(self, arr): self.arr = arr self.__array_struct__ = arr.__array_struct__ They all behave the same with these examples: >>> a = Foo(np.ones(5)) >>> np.add(a, a) array([ 2., 2., 2., 2., 2.]) >>> np.add.accumulate(a) array([ 1., 2., 3., 4., 5.]) >>> np.add(a, a, out=a) Traceback (most recent call last): File "", line 1, in TypeError: return arrays must be of ArrayType >>> np.add.accumulate(a, out=a) Traceback (most recent call last): File "", line 1, in TypeError: output must be an array I think this should be changed, and whatever gets handed by this methods/interfaces be treated as if it were an array or subclass of it. This is actually what the docs on subclassing say about __array__ here: http://docs.scipy.org/doc/numpy/reference/arrays.classes.html#numpy.class.__array__ This also seems to contradict a rather cryptic comment in the code of PyArray_GetArrayParamsFromObject, which is part of the call sequence of this whole mess, see here: https://github.com/numpy/numpy/blob/maintenance/1.9.x/numpy/core/src/multiarray/ctors.c#L1495 /* * If op supplies the __array__ function. * The documentation says this should produce a copy, so * we skip this method if writeable is true, because the intent * of writeable is to modify the operand. * XXX: If the implementation is wrong, and/or if actual * usage requires this behave differently, * this should be changed! 
*/ There has already been some discussion in the issue linked above, but I would appreciate any other thoughts on the idea of treating objects with some form of array interface as if they were arrays. Does it need a deprecation cycle? Is there some case I am not considering where this could go horribly wrong? Jaime -- (\__/) ( O.o) ( > <) Este es Conejo. Copia a Conejo en tu firma y ay?dale en sus planes de dominaci?n mundial. -------------- next part -------------- An HTML attachment was scrubbed... URL: From shoyer at gmail.com Wed Feb 25 16:56:03 2015 From: shoyer at gmail.com (Stephan Hoyer) Date: Wed, 25 Feb 2015 13:56:03 -0800 Subject: [Numpy-discussion] Objects exposing the array interface In-Reply-To: References: Message-ID: On Wed, Feb 25, 2015 at 1:24 PM, Jaime Fern?ndez del R?o < jaime.frio at gmail.com> wrote: > 1. When converting these objects to arrays using PyArray_Converter, if > the arrays returned by any of the array interfaces is not C contiguous, > aligned, and writeable, a copy that is will be made. Proper arrays and > subclasses are passed unchanged. This is the source of the error reported > above. > When converting these objects to arrays using PyArray_Converter, if the arrays returned by any of the array interfaces is not C contiguous, aligned, and writeable, a copy that is will be made. Proper arrays and subclasses are passed unchanged. This is the source of the error reported above. I'm not entirely sure I understand this -- how is PyArray_Convert used in numpy? 
For example, if I pass a non-contiguous array to your class Foo, np.asarray does not do a copy: In [25]: orig = np.zeros((3, 4))[:2, :3] In [26]: orig.flags Out[26]: C_CONTIGUOUS : False F_CONTIGUOUS : False OWNDATA : False WRITEABLE : True ALIGNED : True UPDATEIFCOPY : False In [27]: subclass = Foo(orig) In [28]: np.asarray(subclass) Out[28]: array([[ 0., 0., 0.], [ 0., 0., 0.]]) In [29]: np.asarray(subclass)[:] = 1 In [30]: np.asarray(subclass) Out[30]: array([[ 1., 1., 1.], [ 1., 1., 1.]]) But yes, this is probably a bug. 2. When converting these objects using PyArray_OutputConverter, as well as > in similar code in the ufucn machinery, anything other than a proper array > or subclass raises an error. This means that, contrary to what the docs on > subclassing say, see below, you cannot use an object exposing the array > interface as an output parameter to a ufunc > Here it might be a good idea to distinguish between objects that define __array__ vs __array_interface__/__array_struct__. A class that defines __array__ might not be very ndarray-like at all, but rather be something that can be *converted* to an ndarray. For example, objects in pandas define __array__, but updating the return value of df.__array__() in-place will not necessarily update the DataFrame (e.g., if the frame had inhomogeneous dtypes). -------------- next part -------------- An HTML attachment was scrubbed... URL: From jaime.frio at gmail.com Wed Feb 25 17:48:10 2015 From: jaime.frio at gmail.com (=?UTF-8?Q?Jaime_Fern=C3=A1ndez_del_R=C3=ADo?=) Date: Wed, 25 Feb 2015 14:48:10 -0800 Subject: [Numpy-discussion] Objects exposing the array interface In-Reply-To: References: Message-ID: On Wed, Feb 25, 2015 at 1:56 PM, Stephan Hoyer wrote: > > > On Wed, Feb 25, 2015 at 1:24 PM, Jaime Fern?ndez del R?o < > jaime.frio at gmail.com> wrote: > >> 1. 
When converting these objects to arrays using PyArray_Converter, if >> the arrays returned by any of the array interfaces is not C contiguous, >> aligned, and writeable, a copy that is will be made. Proper arrays and >> subclasses are passed unchanged. This is the source of the error reported >> above. >> > > > When converting these objects to arrays using PyArray_Converter, if the > arrays returned by any of the array interfaces is not C contiguous, > aligned, and writeable, a copy that is will be made. Proper arrays and > subclasses are passed unchanged. This is the source of the error reported > above. > > I'm not entirely sure I understand this -- how is PyArray_Convert used in > numpy? For example, if I pass a non-contiguous array to your class Foo, > np.asarray does not do a copy: > It is used by many (all?) C functions that take an array as input. This follows a different path than what np.asarray or np.asanyarray do, which are calls to np.array, which maps to the C function _array_fromobject which can be found here: https://github.com/numpy/numpy/blob/maintenance/1.9.x/numpy/core/src/multiarray/multiarraymodule.c#L1592 And ufuncs have their own conversion code, which doesn't really help either. Not sure it would be possible to have them all use a common code base, but it is certainly well worth trying. > > In [25]: orig = np.zeros((3, 4))[:2, :3] > > In [26]: orig.flags > Out[26]: > C_CONTIGUOUS : False > F_CONTIGUOUS : False > OWNDATA : False > WRITEABLE : True > ALIGNED : True > UPDATEIFCOPY : False > > In [27]: subclass = Foo(orig) > > In [28]: np.asarray(subclass) > Out[28]: > array([[ 0., 0., 0.], > [ 0., 0., 0.]]) > > In [29]: np.asarray(subclass)[:] = 1 > > In [30]: np.asarray(subclass) > Out[30]: > array([[ 1., 1., 1.], > [ 1., 1., 1.]]) > > > But yes, this is probably a bug. > > 2. 
When converting these objects using PyArray_OutputConverter, as well as >> in similar code in the ufucn machinery, anything other than a proper array >> or subclass raises an error. This means that, contrary to what the docs on >> subclassing say, see below, you cannot use an object exposing the array >> interface as an output parameter to a ufunc >> > > Here it might be a good idea to distinguish between objects that define > __array__ vs __array_interface__/__array_struct__. A class that defines > __array__ might not be very ndarray-like at all, but rather be something > that can be *converted* to an ndarray. For example, objects in pandas > define __array__, but updating the return value of df.__array__() in-place > will not necessarily update the DataFrame (e.g., if the frame had > inhomogeneous dtypes). > I am not really sure what the behavior of __array__ should be. The link to the subclassing docs I gave before indicates that it should be possible to write to it if it is writeable (and probably pandas should set the writeable flag to False if it cannot be reliably written to), but the obscure comment I mentioned seems to point to the opposite, that it should never be written to. This is probably a good moment in time to figure out what the proper behavior should be and document it. Jaime -- (\__/) ( O.o) ( > <) Este es Conejo. Copia a Conejo en tu firma y ay?dale en sus planes de dominaci?n mundial. -------------- next part -------------- An HTML attachment was scrubbed... URL: From shoyer at gmail.com Thu Feb 26 00:22:23 2015 From: shoyer at gmail.com (Stephan Hoyer) Date: Wed, 25 Feb 2015 21:22:23 -0800 Subject: [Numpy-discussion] Objects exposing the array interface In-Reply-To: References: Message-ID: On Wed, Feb 25, 2015 at 2:48 PM, Jaime Fern?ndez del R?o < jaime.frio at gmail.com> wrote: > I am not really sure what the behavior of __array__ should be. 
The link > to the subclassing docs I gave before indicates that it should be possible > to write to it if it is writeable (and probably pandas should set the > writeable flag to False if it cannot be reliably written to), but the > obscure comment I mentioned seems to point to the opposite, that it should > never be written to. This is probably a good moment in time to figure out > what the proper behavior should be and document it. > It's one thing to rely on the result of __array__ being writeable. It's another thing to rely on writing to that array to modify the original array-like object. Presuming the later would be a mistake. Let me give three categories of examples where I know this would fail: - pandas: for DataFrame objects with inhomogeneous dtype - netCDF4 and other IO libraries: The array's data may be readonly on disk or require a network call to access. The memory model may not even be able to be cleanly mapped to numpy's (e.g., it may use chunked storage) - blaze.Data: Blaze arrays use lazily evaluation and don't support mutation As far as I know, none of these libraries produce readonly ndarray objects from __array__. It can actually be highly convenient to return normal, writeable ndarrays even if they don't modify the original source, because this lets you do all the normal numpy stuff to the returned array, including operations that mutate it. -------------- next part -------------- An HTML attachment was scrubbed... URL: From sank.daniel at gmail.com Thu Feb 26 00:41:50 2015 From: sank.daniel at gmail.com (Daniel Sank) Date: Wed, 25 Feb 2015 21:41:50 -0800 Subject: [Numpy-discussion] Would like to patch docstring for numpy.random.normal Message-ID: Dear numpy users, I would like to clarify the docstring for numpy.random.normal. I submitted a patch but it was rejected because it breaks the tests. Unfortunately, the development workflow page does not explain how to run the tests (in fact it doesn't mention them at all). 
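For reference, the two usual ways to run the suite (common practice around
the 1.9 series, not taken from the workflow docs, so check the dev
documentation for the authoritative version) are sketched below:

```python
# Two common ways to run NumPy's test suite (a sketch of common practice
# around the 1.9 series; not taken from the workflow docs):
#
#   1. From a git checkout, runtests.py builds numpy in-place and runs
#      the tests against that build:
#        python runtests.py                      # whole suite
#        python runtests.py -t numpy/core/tests/test_multiarray.py
#
#   2. Against an installed numpy, from any interpreter:
#        import numpy
#        numpy.test()       # numpy.test('full') includes the slow tests
#
# A docstring-only change such as the one proposed can also be
# smoke-tested directly, without running the full suite:
import numpy

doc = numpy.random.normal.__doc__
assert "loc" in doc and "scale" in doc  # the parameter names at issue
```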
I am therefore writing to discuss my proposed change and find out how to run the tests so that I can make it compatible with the existing code. The current form of the docstring for numpy.random.normal is as follows: """ normal(loc=0.0, scale=1.0, size=None) Parameters ---------- loc : float Mean ("centre") of the distribution. scale : float Standard deviation (spread or "width") of the distribution. size : tuple of ints Output shape. If the given shape is, e.g., ``(m, n, k)``, then ``m * n * k`` samples are drawn. Notes ----- The probability density for the Gaussian distribution is .. math:: p(x) = \frac{1}{\sqrt{ 2 \pi \sigma^2 }} e^{ - \frac{ (x - \mu)^2 } {2 \sigma^2} }, where :math:`\mu` is the mean and :math:`\sigma` the standard deviation. The square of the standard deviation, :math:`\sigma^2`, is called the variance. """ It seems unnecessarily convoluted to name the input arguments "loc" and "scale", then immediately define them as the "mean" and "standard deviation" in the Parameters section, and then again rename them as "mu" and "sigma" in the written formula. I propose to simply change the argument names to "mean" and "sigma" to improve consistency. I tried fixing this via a pull request [1] but it was closed because my change broke the tests. Unfortunately, the development workflow section of the web page doesn't explain how to run the tests (in fact it doesn't even mention them). How can I make the proposed change without breaking the tests, or equivalently how do I find out how to run the tests myself so I can find an acceptable way of making the change on my own? [1] https://github.com/numpy/numpy/pull/5607#issuecomment-76114282 -- Daniel Sank -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From toddrjen at gmail.com Thu Feb 26 05:54:24 2015 From: toddrjen at gmail.com (Todd) Date: Thu, 26 Feb 2015 11:54:24 +0100 Subject: [Numpy-discussion] GSoC'15 - mentors & ideas In-Reply-To: References: Message-ID: I am not able to mentor, but I have some ideas about easier projects. These may be too easy, too hard, or not even desirable so take them or leave them as you please. scipy: Implement a set of circular statistics functions comparable to those in R or MATLAB circular statistics toolbox. Either implement some window functions that only apply to the beginning and end of an array, or implement a wrapper that takes a window function and some parameters and creates a new window that only applies to the beginning and end of an array. numpy: Integrate the bottleneck project optimizations into numpy proper. Integrate as much as possible the matplotlib.mlab functionality into numpy (and, optionally, also scipy). In many places different approaches to the same task have substantially different performance (such as indexing vs. take) and check for one approach being substantially slower. If it is, fix the performance problem if possible (perhaps by using the same implementation), and if not document the difference. Modify ufuncs so their documentation appears in help() in addition to numpy.info(). Hi all, On Fri, Feb 20, 2015 at 10:05 AM, Ralf Gommers wrote: > Hi all, > > It's time to start preparing for this year's Google Summer of Code. There > is actually one urgent thing to be done (before 19.00 UTC today), which is > to get our ideas page in decent shape. It doesn't have to be final, but > there has to be enough on there for the organizers to judge it. This page > is here: https://github.com/scipy/scipy/wiki/GSoC-project-ideas. I'll be > reworking it and linking it from the PSF page today, but if you already > have new ideas please add them there. 
See
> https://wiki.python.org/moin/SummerOfCode/OrgIdeasPageTemplate for this
> year's template for adding a new idea.
>

The ideas page is now in pretty good shape. More ideas are very welcome
though, especially easy or easy/intermediate ideas. Numpy right now has
zero easy ones and Scipy only one and a half.

What we also need is mentors. All ideas already have a potential mentor
listed, however some ideas are from last year and I'm not sure that all
those mentors really are available this year. And more than one potential
mentor per idea is always good. So can everyone please add/remove his or
her name on that page?

I'm happy to take care of most of the organizational aspects this year,
however I'll be offline for two weeks in July and from the end of August
onwards, so I'll need some help in those periods. Any volunteers?

Thanks,
Ralf

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion at scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From jaime.frio at gmail.com Thu Feb 26 10:09:57 2015
From: jaime.frio at gmail.com (=?UTF-8?Q?Jaime_Fern=C3=A1ndez_del_R=C3=ADo?=)
Date: Thu, 26 Feb 2015 07:09:57 -0800
Subject: [Numpy-discussion] GSoC'15 - mentors & ideas
In-Reply-To:
References:
Message-ID:

On Thu, Feb 26, 2015 at 2:54 AM, Todd wrote:

> I am not able to mentor, but I have some ideas about easier projects.
> These may be too easy, too hard, or not even desirable so take them or
> leave them as you please.
>
> scipy:
>
> Implement a set of circular statistics functions comparable to those in R
> or MATLAB circular statistics toolbox.
>
> Either implement some window functions that only apply to the beginning
> and end of an array, or implement a wrapper that takes a window function
> and some parameters and creates a new window that only applies to the
> beginning and end of an array.
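That wrapper idea fits in a few lines; ``edge_window`` below is an invented
name for the sketch, not an existing NumPy or SciPy function. It splits an
ordinary symmetric window into its rising and falling halves and leaves the
middle flat:

```python
import numpy as np

def edge_window(window_func, n, edge):
    """Length-n window that tapers only the first and last `edge` samples.

    A sketch of the wrapper idea above, not an existing NumPy/SciPy
    function: the rising and falling halves of an ordinary window
    (e.g. np.hanning) are applied at the edges, and the middle stays 1.0.
    """
    if 2 * edge > n:
        raise ValueError("tapered edges would overlap")
    taper = window_func(2 * edge)   # full symmetric window of length 2*edge
    w = np.ones(n)
    w[:edge] = taper[:edge]         # rising half at the start
    w[n - edge:] = taper[edge:]     # falling half at the end
    return w
```

With ``window_func=np.hanning`` this is essentially a Tukey (tapered
cosine) window: ``edge_window(np.hanning, 1000, 50)`` leaves samples 50
through 949 untouched and tapers only the outer 50 on each side.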
> > numpy: > > Integrate the bottleneck project optimizations into numpy proper. > > Not sure how much of the bottleneck optimizations can be fitted into the ufunc machinery. But I'd be more than happy to mentor or co-mentor an implementation in numpy of the moving window functions. I have already contributed some work on some of those in scipy.ndimage and pandas, and find the subject fascinating. > Integrate as much as possible the matplotlib.mlab functionality into numpy > (and, optionally, also scipy). > > In many places different approaches to the same task have substantially > different performance (such as indexing vs. take) and check for one > approach being substantially slower. If it is, fix the performance problem > if possible (perhaps by using the same implementation), and if not document > the difference. > The take performance advantage is no longer there since seberg's rewrite of indexing. Are there any other obvious examples? > Modify ufuncs so their documentation appears in help() in addition to > numpy.info(). > To add one of my own: the old iterator is still being used in many, many places throughout the numpy code base. Wouldn't it make sense to port those to the new one? In doing so, it would probably lead to producing simplified interfaces to the new iterator, e.g. reproducing the old PyIter_AllButAxis is infinitely more verbose with the new iterator. > Hi all, > > > On Fri, Feb 20, 2015 at 10:05 AM, Ralf Gommers > wrote: > >> Hi all, >> >> It's time to start preparing for this year's Google Summer of Code. There >> is actually one urgent thing to be done (before 19.00 UTC today), which is >> to get our ideas page in decent shape. It doesn't have to be final, but >> there has to be enough on there for the organizers to judge it. This page >> is here: https://github.com/scipy/scipy/wiki/GSoC-project-ideas. I'll be >> reworking it and linking it from the PSF page today, but if you already >> have new ideas please add them there. 
See >> https://wiki.python.org/moin/SummerOfCode/OrgIdeasPageTemplate for this >> year's template for adding a new idea. >> > > The ideas page is now in pretty good shape. More ideas are very welcome > though, especially easy or easy/intermediate ideas. Numpy right now has > zero easy ones and Scipy only one and a half. > > What we also need is mentors. All ideas already have a potential mentor > listed, however some ideas are from last year and I'm not sure that all > those mentors really are available this year. And more than one potential > mentor per idea is always good. So can everyone please add/remove his or > her name on that page? > > I'm happy to take care of most of the organizational aspects this year, > however I'll be offline for two weeks in July and from the end of August > onwards, so I'll some help in those periods. Any volunteers? > > Thanks, > Ralf > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- (\__/) ( O.o) ( > <) Este es Conejo. Copia a Conejo en tu firma y ay?dale en sus planes de dominaci?n mundial. -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Thu Feb 26 10:25:06 2015 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Thu, 26 Feb 2015 16:25:06 +0100 Subject: [Numpy-discussion] GSoC'15 - mentors & ideas In-Reply-To: References: Message-ID: <1424964306.21620.1.camel@sipsolutions.net> On Do, 2015-02-26 at 07:09 -0800, Jaime Fern?ndez del R?o wrote: > > > To add one of my own: the old iterator is still being used in many, > many places throughout the numpy code base. Wouldn't it make sense to > port those to the new one? 
In doing so, it would probably lead to > producing simplified interfaces to the new iterator, e.g. reproducing > the old PyIter_AllButAxis is infinitely more verbose with the new > iterator. > > Might be a bit off topic. But I used to wonder if it could make sense to create a Cython code generation support for nditer? NDiter is pretty powerful, but we often have things like the contiguous special case, buffering, etc. that is always identical code but without having something ready in cython nobody will ever use nditer from cython, even though for some things it might make a lot of sense. - Sebastian > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From cimrman3 at ntc.zcu.cz Thu Feb 26 11:12:51 2015 From: cimrman3 at ntc.zcu.cz (Robert Cimrman) Date: Thu, 26 Feb 2015 17:12:51 +0100 Subject: [Numpy-discussion] ANN: SfePy 2015.1 Message-ID: <54EF4603.8090300@ntc.zcu.cz> I am pleased to announce release 2015.1 of SfePy. Description ----------- SfePy (simple finite elements in Python) is a software for solving systems of coupled partial differential equations by the finite element method or by the isogeometric analysis (preliminary support). It is distributed under the new BSD license. Home page: http://sfepy.org Mailing list: http://groups.google.com/group/sfepy-devel Git (source) repository, issue tracker, wiki: http://github.com/sfepy Highlights of this release -------------------------- - support for multiple fields in isogeometric analysis - redesigned handling of solver parameters - new modal analysis example For full release notes see http://docs.sfepy.org/doc/release_notes.html#id1 (rather long and technical). 
Best regards, Robert Cimrman and Contributors (*) (*) Contributors to this release (alphabetical order): Lubos Kejzlar, Vladimir Lukes From mads.ipsen at gmail.com Fri Feb 27 11:02:17 2015 From: mads.ipsen at gmail.com (Mads Ipsen) Date: Fri, 27 Feb 2015 17:02:17 +0100 Subject: [Numpy-discussion] dot and MKL memory Message-ID: <54F09509.1070203@gmail.com> Hi, If I build Python 2.7.2 and numpy-1.9.1 and run the following script import numpy c = numpy.ones((500000, 4)) mat = numpy.identity(4) r = numpy.dot(c, mat) the evaluation of the 'dot' increases the memory by app. 35 MB. If, in addition, I build numpy-1.9.1 with MKL support, and run the script, the evaluation of the 'dot' increases the memory by app. 450 MB. Is the expected? Best regards, Mads specs: Ubuntu 12.04 ifort (IFORT) 14.0.1 2013100 gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5) -- +---------------------------------------------------------------------+ | Mads Ipsen | +----------------------------------+----------------------------------+ | Overgaden Oven Vandet 106, 4.tv | phone: +45-29716388 | | DK-1415 K?benhavn K | email: mads.ipsen at gmail.com | | Denmark | map : https://goo.gl/maps/oQ6y6 | +----------------------------------+----------------------------------+ From charlesr.harris at gmail.com Fri Feb 27 11:50:03 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 27 Feb 2015 09:50:03 -0700 Subject: [Numpy-discussion] dot and MKL memory In-Reply-To: <54F09509.1070203@gmail.com> References: <54F09509.1070203@gmail.com> Message-ID: On Fri, Feb 27, 2015 at 9:02 AM, Mads Ipsen wrote: > Hi, > > If I build Python 2.7.2 and numpy-1.9.1 and run the following script > > import numpy > c = numpy.ones((500000, 4)) > mat = numpy.identity(4) > r = numpy.dot(c, mat) > > the evaluation of the 'dot' increases the memory by app. 35 MB. > > If, in addition, I build numpy-1.9.1 with MKL support, and run the > script, the evaluation of the 'dot' increases the memory by app. 450 MB. 
> > Is this expected? > > Best regards, > > Mads > > specs: > > Ubuntu 12.04 > ifort (IFORT) 14.0.1 2013100 > gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5) > > No, but I don't know why that is happening with MKL. Can anyone else reproduce this? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From davidmenhur at gmail.com Fri Feb 27 12:43:12 2015 From: davidmenhur at gmail.com (Daπid) Date: Fri, 27 Feb 2015 18:43:12 +0100 Subject: [Numpy-discussion] dot and MKL memory In-Reply-To: References: <54F09509.1070203@gmail.com> Message-ID: On 27 February 2015 at 17:50, Charles R Harris wrote: >> > > No, but I don't know why that is happening with MKL. Can anyone else > reproduce this? > > Chuck I can't reproduce it. I have checked both my system Python (ATLAS) and Conda with MKL, watching in parallel with vmstat. The difference between them is under 1 MB. I am on Fedora 20; the system Python uses GCC 4.8 and ATLAS 3.8.4, and Conda is compiled with GCC 4.4 and linked to MKL 11.1. /David. From sturla.molden at gmail.com Fri Feb 27 17:33:22 2015 From: sturla.molden at gmail.com (Sturla Molden) Date: Fri, 27 Feb 2015 23:33:22 +0100 Subject: [Numpy-discussion] So I found a bug... Message-ID: Somewhere... But where is it? NumPy, SciPy, Matplotlib, Cython or ipython? I am suspecting ipython, but proving it is hard... http://nbviewer.ipython.org/urls/dl.dropboxusercontent.com/u/12464039/lenna-bug.ipynb Sturla From ben.root at ou.edu Fri Feb 27 17:39:10 2015 From: ben.root at ou.edu (Benjamin Root) Date: Fri, 27 Feb 2015 17:39:10 -0500 Subject: [Numpy-discussion] So I found a bug... In-Reply-To: References: Message-ID: It is Friday evening here... I must be really dense. What bug are we looking at? Is this another white-gold vs. blue-black dress color thing? Ben Root On Fri, Feb 27, 2015 at 5:33 PM, Sturla Molden wrote: > Somewhere... But where is it? > > NumPy, SciPy, Matplotlib, Cython or ipython?
> > I am suspecting ipython, but proving it is hard... > > > http://nbviewer.ipython.org/urls/dl.dropboxusercontent.com/u/12464039/lenna-bug.ipynb > > > Sturla > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben.root at ou.edu Fri Feb 27 17:40:51 2015 From: ben.root at ou.edu (Benjamin Root) Date: Fri, 27 Feb 2015 17:40:51 -0500 Subject: [Numpy-discussion] So I found a bug... In-Reply-To: References: Message-ID: oh... I think I see what you are referring to. The second image should have the regular lenna image, not the negative? Ben Root On Fri, Feb 27, 2015 at 5:39 PM, Benjamin Root wrote: > It is Friday evening here... I must be really dense. What bug are we > looking at? Is this another white-gold vs. blue-black dress color thing? > > Ben Root > > On Fri, Feb 27, 2015 at 5:33 PM, Sturla Molden > wrote: > >> Somewhere... But where is it? >> >> NumPy, SciPy, Matplotlib, Cython or ipython? >> >> I am suspecting ipython, but proving it is hard... >> >> >> http://nbviewer.ipython.org/urls/dl.dropboxusercontent.com/u/12464039/lenna-bug.ipynb >> >> >> Sturla >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sturla.molden at gmail.com Fri Feb 27 17:52:17 2015 From: sturla.molden at gmail.com (Sturla Molden) Date: Fri, 27 Feb 2015 23:52:17 +0100 Subject: [Numpy-discussion] So I found a bug... In-Reply-To: References: Message-ID: On 27/02/15 23:40, Benjamin Root wrote: > oh... I think I see what you are referring to. The second image should > have the regular lenna image, not the negative? Yeah. The ndarray references are getting messed up. 
It is actually quite serious. Sturla > > Ben Root > > On Fri, Feb 27, 2015 at 5:39 PM, Benjamin Root > wrote: > > It is Friday evening here... I must be really dense. What bug are we > looking at? Is this another white-gold vs. blue-black dress color thing? > > Ben Root > > On Fri, Feb 27, 2015 at 5:33 PM, Sturla Molden > > wrote: > > Somewhere... But where is it? > > NumPy, SciPy, Matplotlib, Cython or ipython? > > I am suspecting ipython, but proving it is hard... > > http://nbviewer.ipython.org/urls/dl.dropboxusercontent.com/u/12464039/lenna-bug.ipynb > > > Sturla > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From sturla.molden at gmail.com Fri Feb 27 17:57:59 2015 From: sturla.molden at gmail.com (Sturla Molden) Date: Fri, 27 Feb 2015 23:57:59 +0100 Subject: [Numpy-discussion] So I found a bug... In-Reply-To: References: Message-ID: On 27/02/15 23:39, Benjamin Root wrote: > Is this another white-gold vs. blue-black dress color thing? No. It is what you said in your next post. I hate that dress image. The first time I looked at it, it was white and gold, then it became blue and black, and the third time it was grayish blue and bronze. I'm not going to look at it again; it might have exploit code to plant malware in my brain. Sturla > > Ben Root > > On Fri, Feb 27, 2015 at 5:33 PM, Sturla Molden > wrote: > > Somewhere... But where is it? > > NumPy, SciPy, Matplotlib, Cython or ipython? > > I am suspecting ipython, but proving it is hard...
> > http://nbviewer.ipython.org/urls/dl.dropboxusercontent.com/u/12464039/lenna-bug.ipynb > > > Sturla > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From robert.kern at gmail.com Fri Feb 27 18:04:42 2015 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 27 Feb 2015 23:04:42 +0000 Subject: [Numpy-discussion] So I found a bug... In-Reply-To: References: Message-ID: On Fri, Feb 27, 2015 at 10:33 PM, Sturla Molden wrote: > > Somewhere... But where is it? > > NumPy, SciPy, Matplotlib, Cython or ipython? > > I am suspecting ipython, but proving it is hard... > > http://nbviewer.ipython.org/urls/dl.dropboxusercontent.com/u/12464039/lenna-bug.ipynb When plt.imshow() is given floating point RGB images, it assumes that each channel is normalized to 1. You are mixing a 0..255 image with a 0..1 image. Divide `lenna` by 255.0 before you stack it with `_dct`. Or multiply `_dct` by 255 and cast it to uint8. -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... URL: From sturla.molden at gmail.com Fri Feb 27 18:19:35 2015 From: sturla.molden at gmail.com (Sturla Molden) Date: Sat, 28 Feb 2015 00:19:35 +0100 Subject: [Numpy-discussion] So I found a bug... In-Reply-To: References: Message-ID: On 28/02/15 00:04, Robert Kern wrote: > When plt.imshow() is given floating point RGB images, it assumes that > each channel is normalized to 1. You are mixing a 0..255 image with a > 0..1 image. Divide `lenna` by 255.0 before you stack it with `_dct`. Or > multiply `_dct` by 255 and cast it to uint8. Right. Thanks. Since it's past midnight this probably means I should not touch the computer until my brain has rebooted. 
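A minimal sketch of the scaling mix-up Robert describes, using small synthetic arrays in place of the notebook's images (the names `lenna` and `_dct` are stand-ins for the notebook's variables, not its actual data): matplotlib's imshow() interprets float RGB data as normalized to [0, 1], so stacking a 0..255 float image next to a 0..1 float image sends one panel wildly out of range.

```python
import numpy as np

# Stand-ins for the notebook's images: one float RGB image on the
# 0..255 scale, one on the 0..1 scale.
lenna = np.full((8, 8, 3), 200.0)   # 0..255 float image
_dct = np.full((8, 8, 3), 0.5)      # 0..1 float image

# Fix 1: normalize the 0..255 image to 0..1 before stacking,
# so both halves share the scale imshow() expects for floats.
stacked = np.hstack([lenna / 255.0, _dct])

# Fix 2: scale the 0..1 image up to 0..255 and cast both to uint8,
# which imshow() interprets on the 0..255 scale.
stacked_u8 = np.hstack([lenna, _dct * 255]).astype(np.uint8)
```

Either variant puts both panels on one scale, so `plt.imshow(stacked)` (or `plt.imshow(stacked_u8)`) renders them consistently instead of clipping the out-of-range half.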
From mads.ipsen at gmail.com Fri Feb 27 18:47:59 2015 From: mads.ipsen at gmail.com (Mads Ipsen) Date: Sat, 28 Feb 2015 00:47:59 +0100 Subject: [Numpy-discussion] dot and MKL memory In-Reply-To: References: <54F09509.1070203@gmail.com> Message-ID: <54F1022F.1060807@gmail.com> On 27/02/15 17:50, Charles R Harris wrote: > > > On Fri, Feb 27, 2015 at 9:02 AM, Mads Ipsen > wrote: > > Hi, > > If I build Python 2.7.2 and numpy-1.9.1 and run the following script > > import numpy > c = numpy.ones((500000, 4)) > mat = numpy.identity(4) > r = numpy.dot(c, mat) > > the evaluation of the 'dot' increases the memory by approx. 35 MB. > > If, in addition, I build numpy-1.9.1 with MKL support, and run the > script, the evaluation of the 'dot' increases the memory by approx. 450 MB. > > Is this expected? > > Best regards, > > Mads > > specs: > > Ubuntu 12.04 > ifort (IFORT) 14.0.1 2013100 > gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5) > > > No, but I don't know why that is happening with MKL. Can anyone else > reproduce this? > > Chuck > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > I have now tested and reproduced this on two different Ubuntu 12.04 boxes. Both were tested with a vanilla Python-2.7.9 install and two different MKL setups: *) composer_xe_2015.2.164 icc (ICC) 15.0.2 20150121 ifort (IFORT) 15.0.2 20150121 build: python setup.py build --compiler=intelem python setup.py install *) composer_xe_2013_sp1.1.106 ifort (IFORT) 14.0.1 20131008 gcc version 4.6.3 build: python setup.py config --compiler=unix config_fc --fcompiler=intelem install On both setups I use the following site.cfg [mkl] library_dirs = /path_to_intel_composer/mkl/lib/intel64/ include_dirs = /path_to_intel_composer/mkl/lib/intel64/mkl/include/ mkl_libs = mkl_rt lapack_libs = If I omit the MKL setup (i.e. build without MKL) the approx. 400 MB memory increase goes away.
If I can provide you with any further useful info, please let me know. Best regards, Mads -- +---------------------------------------------------------------------+ | Mads Ipsen | +----------------------------------+----------------------------------+ | Overgaden Oven Vandet 106, 4.tv | phone: +45-29716388 | | DK-1415 København K | email: mads.ipsen at gmail.com | | Denmark | map : https://goo.gl/maps/oQ6y6 | +----------------------------------+----------------------------------+
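To make the MB figures in this thread easier to compare across boxes, one rough way to quantify the increase is to sample the process's peak RSS around the `dot` call. This is only a sketch, not what anyone in the thread ran: it assumes a Unix system (the stdlib `resource` module is Unix-only) where `ru_maxrss` is reported in KiB, as on Linux (it is bytes on OS X).

```python
import resource
import numpy as np

def peak_rss_mb():
    # Peak resident set size of this process; assumed to be KiB (Linux).
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024.0

before = peak_rss_mb()
c = np.ones((500000, 4))
mat = np.identity(4)
r = np.dot(c, mat)
after = peak_rss_mb()

# c and r each hold 500000 * 4 * 8 bytes ~ 15.3 MB, so growth far beyond
# roughly twice that suggests extra allocations inside the BLAS dot path.
print("peak RSS grew by approx. %.1f MB" % (after - before))
```

Since peak RSS only ever grows, each build (with and without MKL) should be measured in a fresh interpreter process rather than back-to-back in one session.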