From pierre.haessig at crans.org Fri Apr 1 08:15:56 2016 From: pierre.haessig at crans.org (Pierre Haessig) Date: Fri, 1 Apr 2016 14:15:56 +0200 Subject: [Numpy-discussion] openopt.org down for months? Message-ID: <56FE667C.40009@crans.org> Hello, I noticed some weeks (or months) ago that the openopt.org website is down. Today, discussing optimization packages in Python with a colleague, I noticed it is still down. Does somebody reading the numpy list have more information about the state of the OpenOpt project? Beyond the code, which is still reachable on PyPI, I remember the website was a useful resource on the topic, with its Wikipedia-style pages. I've investigated related websites and noticed that: * the last PyPI update is from August 2015 https://pypi.python.org/pypi/openopt * the last tweet is from July 2015 https://twitter.com/dmitrey15 (with tweets coming every other month or so until then). best, Pierre From jaime.frio at gmail.com Fri Apr 1 16:04:24 2016 From: jaime.frio at gmail.com (=?UTF-8?Q?Jaime_Fern=C3=A1ndez_del_R=C3=ADo?=) Date: Fri, 1 Apr 2016 22:04:24 +0200 Subject: [Numpy-discussion] Starting work on ufunc rewrite In-Reply-To: References: Message-ID: On Thu, Mar 31, 2016 at 10:14 PM, Joseph Fox-Rabinovitz < jfoxrabinovitz at gmail.com> wrote: > There is certainly good precedent for the approach you suggest. > Shortly after Nathaniel mentioned the rewrite to me, I looked up > d-pointers as a possible technique: https://wiki.qt.io/D-Pointer. > Yes, the idea is similar, although somewhat simpler since we are doing C, not C++. > > If we allow arbitrary kwargs for the new functions, is that something > you would want to note in the public structure? I was thinking > something along the lines of adding a hook to process additional > kwargs and return a void * that would then be passed to the loop. > I'm not sure I understand what you mean... But I also don't think it is very relevant at this point? What I intend to do is simply to hide the guts of ufuncs, breaking everyone's code once... so that we can later change whatever we want without breaking anything else. PyUFunc_GenericFunction already takes *args and **kwargs, and the internal logic of how these get processed can be modified at will. If what you are proposing is to create a PyUFunc_FromFuncAndDataAndSignatureAndKwargProcessor API function that would provide a customized function to process extra kwargs and somehow pass them into the actual ufunc loop, that would just be an API extension, and there shouldn't be any major problem in introducing that whenever, especially once we are free to modify the internal representation of ufuncs without breaking ABI compatibility. > To do this incrementally, perhaps opening a special development branch > on the main repository is in order? > Yes, something like that seems like the right thing to do indeed. I would like someone with more git foo than me to spell out the details of how we would create and eventually merge that branch. > > I would love to join in the fun as time permits. Unfortunately, it is > not especially permissive right about now. I will at least throw in > some ideas that I have been mulling over. > Please do!
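To make that concrete, here is a rough sketch of the NpyIter-style split I have in mind. Everything below is illustrative only, not a committed API, and the members of the private struct in particular are just placeholders:

    /* public header: all that user code ever sees */
    typedef struct NpyUFunc_tag NpyUFunc;  /* opaque at this level */

    typedef struct {
        PyObject_HEAD
        NpyUFunc *ufunc;  /* the guts live behind this pointer */
    } PyUFuncObject;

    /* private header, never installed: free to change between releases */
    struct NpyUFunc_tag {
        int nin, nout, nargs;
        void **inner_loops;
        /* ... whatever else we need, including trailing
           variable-sized storage, as NpyIter does ... */
    };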
Jaime > > -Joe > > > On Thu, Mar 31, 2016 at 4:00 PM, Jaime Fernández del Río > > wrote: > > I have started discussing with Nathaniel the implementation of the ufunc > ABI > > break that he proposed in a draft NEP a few months ago: > > > > http://thread.gmane.org/gmane.comp.python.numeric.general/61270 > > > > His original proposal was to make the public portion of PyUFuncObject be: > > > > typedef struct { > > PyObject_HEAD > > int nin, nout, nargs; > > } PyUFuncObject; > > > > Of course the idea is that internally we would use a much larger struct > that > > we could change at will, as long as its first few entries matched those > of > > PyUFuncObject. My problem with this, and I may very well be missing > > something, is that in PyUFunc_Type we need to set the tp_basicsize to the > > size of the extended struct, so we would end up having to expose its > > contents. This is somewhat similar to what now happens with > PyArrayObject: > > anyone can #include "ndarraytypes.h", cast PyArrayObject* to > > PyArrayObjectFields*, and access the guts of the struct without using the > > supplied API inline functions. Not the end of the world, but if you want > to > > make something private, you might as well make it truly private. > > > > I think it would be better to have something similar to what NpyIter does:: > > > > typedef struct { > > PyObject_HEAD > > NpyUFunc *ufunc; > > } PyUFuncObject; > > > > where NpyUFunc would, at this level, be an opaque type of which nothing > > would be known. We could have some of the NpyUFunc attributes cached on > the > > PyUFuncObject struct for easier access, as is done in > NewNpyArrayIterObject. > > This would also give us more liberty in making NpyUFunc be whatever we > want > > it to be, including a variable-sized memory chunk that we could use and > > access at will. NpyIter is again a good example, where rather than > storing > > pointers to strides and dimensions arrays, these are made part of the > > NpyIter memory chunk, effectively being equivalent to having variable > sized > > arrays as part of the struct. And I think we will probably no longer > trigger > > the Cython warnings about size changes either. > > > > Any thoughts on this approach? Is there anything fundamentally wrong with > > what I'm proposing here? > > > > Also, this is probably going to end up being a rewrite of a pretty large > and > > complex codebase. I am not sure that working on this on my own and > > eventually sending a humongous PR is the best approach. Any thoughts on > how > > best to handle turning this into a collaborative, incremental effort? > Anyone > > who would like to join in the fun? > > > > Jaime > > > > -- > > (\__/) > > ( O.o) > > ( > <) This is Conejo. Copy Conejo into your signature and help him in his > > plans for world domination. > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -- (\__/) ( O.o) ( > <) This is Conejo. Copy Conejo into your signature and help him in his plans for world domination. -------------- next part -------------- An HTML attachment was scrubbed...
URL: From ayush.kesarwani at gmail.com Sat Apr 2 12:17:27 2016 From: ayush.kesarwani at gmail.com (Ayush Kesarwani) Date: Sat, 2 Apr 2016 21:47:27 +0530 Subject: [Numpy-discussion] Call for Proposals || PyCon India 2016 Message-ID: Hello Everyone The Call for Proposals (CFP) for PyCon India 2016, New Delhi is live now. We have started accepting proposals. Those interested in submitting a talk proposal should do so at the given link [1]. More information about the event is available at the official website [2]. Kindly adhere to the guidelines mentioned for the submission of proposals. Please help us spread the word. Kindly use #inpycon in your social updates. Any queries regarding the CFP can be sent to contact at in.pycon.org. Regards Team InPycon [1] bit.ly/inpycon2016cfp [2] http://bit.ly/inpycon2016 -------------- next part -------------- An HTML attachment was scrubbed... URL: From mitchell at intertrust.com Sat Apr 2 15:59:11 2016 From: mitchell at intertrust.com (Steve Mitchell) Date: Sat, 2 Apr 2016 19:59:11 +0000 Subject: [Numpy-discussion] rational custom dtype example Message-ID: <57155BF8DF3FF541BF7E8515C12E957836F1B229@exch-1.corp.intertrust.com> I have noticed a few issues with the "rational" custom C dtype example. 1. It doesn't build on Windows. I managed to tweak it to build. Mainly, the MSVC9 compiler supports only C89. 2. A few tests don't pass on Windows, due to integer sizes. 3. The copyswap and copyswapn routines don't do in-place swapping if src == NULL, as specified in the docs. http://docs.scipy.org/doc/numpy-1.10.0/reference/c-api.types-and-structures.html --Steve -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Sat Apr 2 21:12:03 2016 From: njs at pobox.com (Nathaniel Smith) Date: Sat, 2 Apr 2016 18:12:03 -0700 Subject: [Numpy-discussion] Starting work on ufunc rewrite In-Reply-To: References: Message-ID: On Thu, Mar 31, 2016 at 1:00 PM, Jaime Fernández del Río wrote: > I have started discussing with Nathaniel the implementation of the ufunc ABI > break that he proposed in a draft NEP a few months ago: > > http://thread.gmane.org/gmane.comp.python.numeric.general/61270 > > His original proposal was to make the public portion of PyUFuncObject be: > > typedef struct { > PyObject_HEAD > int nin, nout, nargs; > } PyUFuncObject; > > Of course the idea is that internally we would use a much larger struct that > we could change at will, as long as its first few entries matched those of > PyUFuncObject. My problem with this, and I may very well be missing > something, is that in PyUFunc_Type we need to set the tp_basicsize to the > size of the extended struct, so we would end up having to expose its > contents. How so? tp_basicsize tells you the size of the real struct, but that doesn't let you actually access any of its fields. Unless you decide to start cheating and reaching into random bits of memory by hand, but, well, this is C, we can't really prevent that :-). > This is somewhat similar to what now happens with PyArrayObject: > anyone can #include "ndarraytypes.h", cast PyArrayObject* to > PyArrayObjectFields*, and access the guts of the struct without using the > supplied API inline functions. Not the end of the world, but if you want to > make something private, you might as well make it truly private. Yeah, there is also an issue here where we don't always do a great job of separating our internal headers from our public headers.
But that's orthogonal -- any solution for hiding PyUFunc's internals will require handling that somehow. > I think it would be better to have something similar to what NpyIter does:: > > typedef struct { > PyObject_HEAD > NpyUFunc *ufunc; > } PyUFuncObject; A few points: We have to leave nin, nout, nargs where they are in PyUFuncObject, because there is code out there that accesses them. This technique is usually used when you want to allow subclassing of a struct, while also allowing you to add fields later without breaking ABI. We don't want to allow subclassing of PyUFunc (regardless of what happens here -- subclassing just creates tons of problems), so AFAICT it isn't really necessary. It adds a bit of extra complexity (two allocations instead of one, extra pointer chasing, etc.), though to be fair the hidden struct approach also adds some complexity (you have to cast to the internal type), so it's not a huge deal either way. If the NpyUFunc pointer field is public then in principle people could refer to it and create problems down the line in case we ever decided to switch to a different strategy... not very likely given that it'd just be a meaningless opaque pointer, but mentioning it for completeness's sake. > where NpyUFunc would, at this level, be an opaque type of which nothing > would be known. We could have some of the NpyUFunc attributes cached on the > PyUFuncObject struct for easier access, as is done in NewNpyArrayIterObject. Caching sounds like *way* more complexity than we want :-). As soon as you have two copies of data then they can get out of sync... > This would also give us more liberty in making NpyUFunc be whatever we want > it to be, including a variable-sized memory chunk that we could use and > access at will. Python objects are allowed to be variable size: tp_basicsize is the minimum size. Built-ins like lists and strings have variable size structs. > NpyIter is again a good example, where rather than storing > pointers to strides and dimensions arrays, these are made part of the > NpyIter memory chunk, effectively being equivalent to having variable sized > arrays as part of the struct. And I think we will probably no longer trigger > the Cython warnings about size changes either. > > Any thoughts on this approach? Is there anything fundamentally wrong with > what I'm proposing here? Modulo the issue with nin/nout/nargs, I don't see any compelling advantages to your proposal given our particular situation, so I don't think it makes a huge difference either way. Maybe I'm missing something. > Also, this is probably going to end up being a rewrite of a pretty large and > complex codebase. I am not sure that working on this on my own and > eventually sending a humongous PR is the best approach. Any thoughts on how > best to handle turning this into a collaborative, incremental effort? Anyone > who would like to join in the fun? I'd strongly recommend breaking it up into individually mergeable pieces to the absolute maximum extent possible, and merging them back as we go, so that we never have a giant branch diverging from master. (E.g., refactor a few functions -> submit a PR -> merge, refactor some more -> merge, add a new feature enabled by the refactoring -> merge, repeat). There are limits to how far you can take this, e.g.
the PR for just hiding the current API + adding back the public API pieces that Numba needs will itself be not quite trivial even if we do no refactoring yet, and until we get more of an outline for where we're trying to get to it will be hard to tell how to break it into pieces :-). But once things are hidden it should be possible to do quite a bit of internal rearranging incrementally on master, I hope? For coordinating this though it would probably be good to start working on some public notes (gdocs or the wiki or something) where we sketch out some overall plan, make a plan of attack for how to break it up, etc., and maybe have some higher-bandwidth conversations to make that outline (google hangout?). -n -- Nathaniel J. Smith -- https://vorpus.org From matthew.brett at gmail.com Sat Apr 2 21:11:51 2016 From: matthew.brett at gmail.com (Matthew Brett) Date: Sat, 2 Apr 2016 18:11:51 -0700 Subject: [Numpy-discussion] linux wheels coming soon In-Reply-To: References: Message-ID: On Fri, Mar 25, 2016 at 6:39 AM, Peter Cock wrote: > On Fri, Mar 25, 2016 at 3:02 AM, Robert T. McGibbon wrote: >> I suspect that many of the maintainers of major scipy-ecosystem projects are >> aware of these (or other similar) travis wheel caches, but would guess that >> the pool of travis-ci python users who weren't aware of these wheel caches >> is much much larger. So there will still be a lot of travis-ci clock cycles >> saved by manylinux wheels. >> >> -Robert > > Yes exactly. Availability of NumPy Linux wheels on PyPI is definitely something > I would suggest adding to the release notes. Hopefully this will help trigger > a general availability of wheels in the numpy-ecosystem :) > > In the case of Travis CI, their VM images for Python already have a version > of NumPy installed, but having the latest version of NumPy and SciPy etc > available as Linux wheels would be very nice. We're very nearly there now. The latest versions of numpy, scipy, scikit-image, pandas, numexpr, statsmodels wheels for testing at http://ccdd0ebb5a931e58c7c5-aae005c4999d7244ac63632f8b80e089.r77.cf2.rackcdn.com/ Please do test with: python -m install --upgrade pip pip install --trusted-host=ccdd0ebb5a931e58c7c5-aae005c4999d7244ac63632f8b80e089.r77.cf2.rackcdn.com --find-links=http://ccdd0ebb5a931e58c7c5-aae005c4999d7244ac63632f8b80e089.r77.cf2.rackcdn.com numpy scipy scikit-learn numexpr python -c 'import numpy; numpy.test("full")' python -c 'import scipy; scipy.test("full")' We would love to get any feedback as to whether these work on your machines. Cheers, Matthew From njs at pobox.com Sat Apr 2 21:15:41 2016 From: njs at pobox.com (Nathaniel Smith) Date: Sat, 2 Apr 2016 18:15:41 -0700 Subject: [Numpy-discussion] Starting work on ufunc rewrite In-Reply-To: References: Message-ID: On Thu, Mar 31, 2016 at 3:09 PM, Irwin Zaid wrote: > Hey guys, > > I figured I'd just chime in here. > > Over in DyND-town, we've spent a lot of time figuring out how to structure > DyND callables, which are actually more general than NumPy gufuncs. We've > just recently got them to a place where we are very happy, and are able to > represent a wide range of computations. > > Our callables use a two-fold approach to evaluation. The first pass is a > resolution pass, where a callable can specialize what it is doing based on > the input types. It is able to deduce the return type, multidispatch, or > even perform some sort of recursive analysis in the form of computations > that call themselves. 
The second pass is construction of a kernel object > that is exactly specialized to the metadata (e.g., strides, contiguity, ...) > of the array. > > The callable itself can store arbitrary data, as can each pass of the > evaluation. Either (or both) of these passes can be done ahead of time, > allowing one to have a callable exactly specialized for your array. > > If NumPy is looking to change its ufunc design, we'd be happy to share our > experiences with this. Yeah, this all sounds very relevant :-). You can even see some of the kernel of that design in numpy's current ufuncs, with their first-stage "resolver" choosing which inner loop to use, but we definitely need to make these semantics richer if we want to allow for things like inner loops that depend on kwargs (e.g. sort(..., kind="quicksort") versus sort(..., kind="mergesort")) or dtype attributes. Is your design written up anywhere? -n -- Nathaniel J. Smith -- https://vorpus.org From olivier.grisel at ensta.org Sun Apr 3 07:37:49 2016 From: olivier.grisel at ensta.org (Olivier Grisel) Date: Sun, 3 Apr 2016 13:37:49 +0200 Subject: [Numpy-discussion] linux wheels coming soon In-Reply-To: References: Message-ID: typo: python -m install --upgrade pip should read: python -m pip install --upgrade pip -- Olivier From olivier.grisel at ensta.org Sun Apr 3 10:20:11 2016 From: olivier.grisel at ensta.org (Olivier Grisel) Date: Sun, 3 Apr 2016 16:20:11 +0200 Subject: [Numpy-discussion] linux wheels coming soon In-Reply-To: References: Message-ID: I ran some tests on an image of the upcoming Ubuntu xenial, which ships a version of pip recent enough to install manylinux1 wheels by default, and everything looks fine. Just to clarify, those wheels use OpenBLAS 0.2.17, which has proven to be both fast and very stable on various CPU architectures, while we could not achieve similar results with ATLAS 3.10. -- Olivier Grisel From p.j.a.cock at googlemail.com Mon Apr 4 12:02:08 2016 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Mon, 4 Apr 2016 17:02:08 +0100 Subject: [Numpy-discussion] linux wheels coming soon In-Reply-To: References: Message-ID: On Sun, Apr 3, 2016 at 2:11 AM, Matthew Brett wrote: > On Fri, Mar 25, 2016 at 6:39 AM, Peter Cock wrote: >> On Fri, Mar 25, 2016 at 3:02 AM, Robert T. McGibbon wrote: >>> I suspect that many of the maintainers of major scipy-ecosystem projects are >>> aware of these (or other similar) travis wheel caches, but would guess that >>> the pool of travis-ci python users who weren't aware of these wheel caches >>> is much much larger. So there will still be a lot of travis-ci clock cycles >>> saved by manylinux wheels. >>> >>> -Robert >> >> Yes exactly. Availability of NumPy Linux wheels on PyPI is definitely something >> I would suggest adding to the release notes. Hopefully this will help trigger >> a general availability of wheels in the numpy-ecosystem :) >> >> In the case of Travis CI, their VM images for Python already have a version >> of NumPy installed, but having the latest version of NumPy and SciPy etc >> available as Linux wheels would be very nice. > > We're very nearly there now. > > The latest versions of numpy, scipy, scikit-image, pandas, numexpr, > statsmodels wheels for testing at > http://ccdd0ebb5a931e58c7c5-aae005c4999d7244ac63632f8b80e089.r77.cf2.rackcdn.com/ > > Please do test with: > ... > > We would love to get any feedback as to whether these work on your machines.
Hi Matthew, Testing on a 64bit CentOS 6 machine with Python 3.5 compiled from source under my home directory: $ python3.5 -m pip install --upgrade pip Requirement already up-to-date: pip in ./lib/python3.5/site-packages $ python3.5 -m pip install --trusted-host=ccdd0ebb5a931e58c7c5-aae005c4999d7244ac63632f8b80e089.r77.cf2.rackcdn.com --find-links=http://ccdd0ebb5a931e58c7c5-aae005c4999d7244ac63632f8b80e089.r77.cf2.rackcdn.com numpy scipy Requirement already satisfied (use --upgrade to upgrade): numpy in ./lib/python3.5/site-packages Requirement already satisfied (use --upgrade to upgrade): scipy in ./lib/python3.5/site-packages $ python3.5 -m pip install --trusted-host=ccdd0ebb5a931e58c7c5-aae005c4999d7244ac63632f8b80e089.r77.cf2.rackcdn.com --find-links=http://ccdd0ebb5a931e58c7c5-aae005c4999d7244ac63632f8b80e089.r77.cf2.rackcdn.com numpy scipy --upgrade Collecting numpy Downloading http://ccdd0ebb5a931e58c7c5-aae005c4999d7244ac63632f8b80e089.r77.cf2.rackcdn.com/numpy-1.11.0-cp35-cp35m-manylinux1_x86_64.whl (15.5MB) 100% |████████████████████████████████| 15.5MB 42.1MB/s Collecting scipy Downloading http://ccdd0ebb5a931e58c7c5-aae005c4999d7244ac63632f8b80e089.r77.cf2.rackcdn.com/scipy-0.17.0-cp35-cp35m-manylinux1_x86_64.whl (40.8MB) 100% |████████████████████████████████| 40.8MB 53.6MB/s Installing collected packages: numpy, scipy Found existing installation: numpy 1.10.4 Uninstalling numpy-1.10.4: Successfully uninstalled numpy-1.10.4 Found existing installation: scipy 0.16.0 Uninstalling scipy-0.16.0: Successfully uninstalled scipy-0.16.0 Successfully installed numpy-1.11.0 scipy-0.17.0 $ python3.5 -c 'import numpy; numpy.test("full")' Running unit tests for numpy NumPy version 1.11.0 NumPy relaxed strides checking option: False NumPy is installed in /home/xxx/lib/python3.5/site-packages/numpy Python version 3.5.0 (default, Sep 28 2015, 11:25:31) [GCC 4.4.7 20120313 (Red Hat 4.4.7-16)] nose version 1.3.7
[snip: several thousand test-progress characters] ---------------------------------------------------------------------- Ran 6332 tests in 243.029s OK (KNOWNFAIL=7, SKIP=2) So far so good, but there are a lot of deprecation warnings etc from SciPy, $ python3.5 -c 'import scipy; scipy.test("full")' Running unit tests for scipy NumPy version 1.11.0 NumPy relaxed strides checking option: False NumPy is installed in /home/xxx/lib/python3.5/site-packages/numpy SciPy version 0.17.0 SciPy is installed in /home/xxx/lib/python3.5/site-packages/scipy Python version 3.5.0 (default, Sep 28 2015, 11:25:31) [GCC 4.4.7 20120313 (Red Hat 4.4.7-16)] nose version 1.3.7 [snip] /home/xxx/lib/python3.5/site-packages/numpy/lib/utils.py:99: DeprecationWarning: `rand` is deprecated! numpy.testing.rand is deprecated in numpy 1.11.
Use numpy.random.rand instead. warnings.warn(depdoc, DeprecationWarning) [snip] /home/xxx/lib/python3.5/site-packages/scipy/io/arff/tests/test_arffread.py:254: DeprecationWarning: parsing timezone aware datetimes is deprecated; this will raise an error in the future ], dtype='datetime64[m]') /home/xxx/lib/python3.5/site-packages/scipy/io/arff/arffread.py:638: PendingDeprecationWarning: generator '_loadarff..generator' raised StopIteration [snip] /home/xxx/lib/python3.5/site-packages/scipy/sparse/tests/test_base.py:2425: DeprecationWarning: This function is deprecated. Please call randint(-5, 5 + 1) instead I = np.random.random_integers(-M + 1, M - 1, size=NUM_SAMPLES) [snip] 0-th dimension must be fixed to 3 but got 15 [snip] ---------------------------------------------------------------------- Ran 21407 tests in 741.602s OK (KNOWNFAIL=130, SKIP=1775) Hopefully I didn't miss anything important in hand editing the scipy output. Peter From matthew.brett at gmail.com Mon Apr 4 13:47:03 2016 From: matthew.brett at gmail.com (Matthew Brett) Date: Mon, 4 Apr 2016 10:47:03 -0700 Subject: [Numpy-discussion] linux wheels coming soon In-Reply-To: References: Message-ID: Hi, On Mon, Apr 4, 2016 at 9:02 AM, Peter Cock wrote: > On Sun, Apr 3, 2016 at 2:11 AM, Matthew Brett wrote: >> On Fri, Mar 25, 2016 at 6:39 AM, Peter Cock wrote: >>> On Fri, Mar 25, 2016 at 3:02 AM, Robert T. McGibbon wrote: >>>> I suspect that many of the maintainers of major scipy-ecosystem projects are >>>> aware of these (or other similar) travis wheel caches, but would guess that >>>> the pool of travis-ci python users who weren't aware of these wheel caches >>>> is much much larger. So there will still be a lot of travis-ci clock cycles >>>> saved by manylinux wheels. >>>> >>>> -Robert >>> >>> Yes exactly. Availability of NumPy Linux wheels on PyPI is definitely something >>> I would suggest adding to the release notes. Hopefully this will help trigger >>> a general availability of wheels in the numpy-ecosystem :) >>> >>> In the case of Travis CI, their VM images for Python already have a version >>> of NumPy installed, but having the latest version of NumPy and SciPy etc >>> available as Linux wheels would be very nice. >> >> We're very nearly there now. >> >> The latest versions of numpy, scipy, scikit-image, pandas, numexpr, >> statsmodels wheels for testing at >> http://ccdd0ebb5a931e58c7c5-aae005c4999d7244ac63632f8b80e089.r77.cf2.rackcdn.com/ >> >> Please do test with: >> ... >> >> We would love to get any feedback as to whether these work on your machines. 
> > Hi Matthew, > > Testing on a 64bit CentOS 6 machine with Python 3.5 compiled > from source under my home directory: > > > $ python3.5 -m pip install --upgrade pip > Requirement already up-to-date: pip in ./lib/python3.5/site-packages > > $ python3.5 -m pip install > --trusted-host=ccdd0ebb5a931e58c7c5-aae005c4999d7244ac63632f8b80e089.r77.cf2.rackcdn.com > --find-links=http://ccdd0ebb5a931e58c7c5-aae005c4999d7244ac63632f8b80e089.r77.cf2.rackcdn.com > numpy scipy > Requirement already satisfied (use --upgrade to upgrade): numpy in > ./lib/python3.5/site-packages > Requirement already satisfied (use --upgrade to upgrade): scipy in > ./lib/python3.5/site-packages > > $ python3.5 -m pip install > --trusted-host=ccdd0ebb5a931e58c7c5-aae005c4999d7244ac63632f8b80e089.r77.cf2.rackcdn.com > --find-links=http://ccdd0ebb5a931e58c7c5-aae005c4999d7244ac63632f8b80e089.r77.cf2.rackcdn.com > numpy scipy --upgrade > Collecting numpy > Downloading http://ccdd0ebb5a931e58c7c5-aae005c4999d7244ac63632f8b80e089.r77.cf2.rackcdn.com/numpy-1.11.0-cp35-cp35m-manylinux1_x86_64.whl (15.5MB) > 100% |████████████████████████████████| 15.5MB 42.1MB/s > Collecting scipy > Downloading http://ccdd0ebb5a931e58c7c5-aae005c4999d7244ac63632f8b80e089.r77.cf2.rackcdn.com/scipy-0.17.0-cp35-cp35m-manylinux1_x86_64.whl (40.8MB) > 100% |████████████████████████████████| 40.8MB 53.6MB/s > Installing collected packages: numpy, scipy > Found existing installation: numpy 1.10.4 > Uninstalling numpy-1.10.4: > Successfully uninstalled numpy-1.10.4 > Found existing installation: scipy 0.16.0 > Uninstalling scipy-0.16.0: > Successfully uninstalled scipy-0.16.0 > Successfully installed numpy-1.11.0 scipy-0.17.0 > > $ python3.5 -c 'import numpy; numpy.test("full")' > Running unit tests for numpy > NumPy version 1.11.0 > NumPy relaxed strides checking option: False > NumPy is installed in /home/xxx/lib/python3.5/site-packages/numpy > Python version 3.5.0 (default, Sep 28 2015, 11:25:31) [GCC 4.4.7 > 20120313 (Red Hat 4.4.7-16)] > nose version 1.3.7 >
[snip: several thousand quoted test-progress characters] > ---------------------------------------------------------------------- > Ran 6332 tests in 243.029s > > OK (KNOWNFAIL=7, SKIP=2) > > So far so good, but there are a lot of deprecation warnings etc from SciPy, > > $ python3.5 -c 'import scipy; scipy.test("full")' > Running unit tests for scipy > NumPy version 1.11.0 > NumPy relaxed strides checking option: False > NumPy is installed in /home/xxx/lib/python3.5/site-packages/numpy > SciPy version 0.17.0 > SciPy is installed in /home/xxx/lib/python3.5/site-packages/scipy > Python version 3.5.0 (default, Sep 28 2015, 11:25:31) [GCC 4.4.7 > 20120313 (Red Hat 4.4.7-16)] > nose version 1.3.7 > [snip] > /home/xxx/lib/python3.5/site-packages/numpy/lib/utils.py:99: > DeprecationWarning: `rand` is deprecated!
> numpy.testing.rand is deprecated in numpy 1.11. Use numpy.random.rand instead. > warnings.warn(depdoc, DeprecationWarning) > [snip] > /home/xxx/lib/python3.5/site-packages/scipy/io/arff/tests/test_arffread.py:254: > DeprecationWarning: parsing timezone aware datetimes is deprecated; > this will raise an error in the future > ], dtype='datetime64[m]') > /home/xxx/lib/python3.5/site-packages/scipy/io/arff/arffread.py:638: > PendingDeprecationWarning: generator '_loadarff..generator' > raised StopIteration > [snip] > /home/xxx/lib/python3.5/site-packages/scipy/sparse/tests/test_base.py:2425: > DeprecationWarning: This function is deprecated. Please call > randint(-5, 5 + 1) instead > I = np.random.random_integers(-M + 1, M - 1, size=NUM_SAMPLES) > [snip] > 0-th dimension must be fixed to 3 but got 15 > [snip] > ---------------------------------------------------------------------- > Ran 21407 tests in 741.602s > > OK (KNOWNFAIL=130, SKIP=1775) > > > Hopefully I didn't miss anything important in hand editing the scipy > output. Thanks a lot for testing. I believe the deprecation warnings are expected, because numpy 1.11.0 introduced a new deprecation warning when using `random_integers`. Scipy 0.17.0 is using `random_integers` in a few places. Best, Matthew _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion From gfyoung17 at gmail.com Mon Apr 4 14:26:26 2016 From: gfyoung17 at gmail.com (G Young) Date: Mon, 4 Apr 2016 19:26:26 +0100 Subject: [Numpy-discussion] linux wheels coming soon In-Reply-To: References: Message-ID: Matthew, you are correct. A lot of things happened with random integer generation recently (including deprecating random_integers), but I believe those warnings should be squashed in the upcoming version of SciPy, from what I remember. On Mon, Apr 4, 2016 at 6:47 PM, Matthew Brett wrote: > Hi, > > On Mon, Apr 4, 2016 at 9:02 AM, Peter Cock > wrote: > > On Sun, Apr 3, 2016 at 2:11 AM, Matthew Brett > wrote: > >> On Fri, Mar 25, 2016 at 6:39 AM, Peter Cock > wrote: > >>> On Fri, Mar 25, 2016 at 3:02 AM, Robert T. McGibbon < rmcgibbo at gmail.com> wrote: > >>>> I suspect that many of the maintainers of major scipy-ecosystem > projects are > >>>> aware of these (or other similar) travis wheel caches, but would > guess that > >>>> the pool of travis-ci python users who weren't aware of these wheel > caches > >>>> is much much larger. So there will still be a lot of travis-ci clock > cycles > >>>> saved by manylinux wheels. > >>>> > >>>> -Robert > >>> > >>> Yes exactly. Availability of NumPy Linux wheels on PyPI is definitely > something > >>> I would suggest adding to the release notes. Hopefully this will help > trigger > >>> a general availability of wheels in the numpy-ecosystem :) > >>> > >>> In the case of Travis CI, their VM images for Python already have a > version > >>> of NumPy installed, but having the latest version of NumPy and SciPy > etc > >>> available as Linux wheels would be very nice. > >> > >> We're very nearly there now. > >> > >> The latest versions of numpy, scipy, scikit-image, pandas, numexpr, > >> statsmodels wheels for testing at > >> > http://ccdd0ebb5a931e58c7c5-aae005c4999d7244ac63632f8b80e089.r77.cf2.rackcdn.com/ > >> > >> Please do test with: > >> ... > >> > >> We would love to get any feedback as to whether these work on your > machines.
> > Hi Matthew, > > Testing on a 64bit CentOS 6 machine with Python 3.5 compiled > > from source under my home directory: > > > > $ python3.5 -m pip install --upgrade pip > > Requirement already up-to-date: pip in ./lib/python3.5/site-packages > > > > $ python3.5 -m pip install > > --trusted-host= ccdd0ebb5a931e58c7c5-aae005c4999d7244ac63632f8b80e089.r77.cf2.rackcdn.com > > --find-links= http://ccdd0ebb5a931e58c7c5-aae005c4999d7244ac63632f8b80e089.r77.cf2.rackcdn.com > > numpy scipy > > Requirement already satisfied (use --upgrade to upgrade): numpy in > > ./lib/python3.5/site-packages > > Requirement already satisfied (use --upgrade to upgrade): scipy in > > ./lib/python3.5/site-packages > > > > $ python3.5 -m pip install > > --trusted-host= ccdd0ebb5a931e58c7c5-aae005c4999d7244ac63632f8b80e089.r77.cf2.rackcdn.com > > --find-links= http://ccdd0ebb5a931e58c7c5-aae005c4999d7244ac63632f8b80e089.r77.cf2.rackcdn.com > > numpy scipy --upgrade > > Collecting numpy > > Downloading http://ccdd0ebb5a931e58c7c5-aae005c4999d7244ac63632f8b80e089.r77.cf2.rackcdn.com/numpy-1.11.0-cp35-cp35m-manylinux1_x86_64.whl (15.5MB) > > 100% |████████████████████████████████| 15.5MB 42.1MB/s > > Collecting scipy > > Downloading http://ccdd0ebb5a931e58c7c5-aae005c4999d7244ac63632f8b80e089.r77.cf2.rackcdn.com/scipy-0.17.0-cp35-cp35m-manylinux1_x86_64.whl (40.8MB) > > 100% |████████████████████████████████| 40.8MB 53.6MB/s > > Installing collected packages: numpy, scipy > > Found existing installation: numpy 1.10.4 > > Uninstalling numpy-1.10.4: > > Successfully uninstalled numpy-1.10.4 > > Found existing installation: scipy 0.16.0 > > Uninstalling scipy-0.16.0: > > Successfully uninstalled scipy-0.16.0 > > Successfully installed numpy-1.11.0 scipy-0.17.0 > > > > $ python3.5 -c 'import numpy; numpy.test("full")' > > Running unit tests for numpy > > NumPy version 1.11.0 > > NumPy relaxed strides checking option: False > > NumPy is installed in /home/xxx/lib/python3.5/site-packages/numpy > > Python version 3.5.0 (default, Sep 28 2015, 11:25:31) [GCC 4.4.7 > > 20120313 (Red Hat 4.4.7-16)] > > nose version 1.3.7 > >
> > [snip: several thousand quoted test-progress characters]
> > ---------------------------------------------------------------------- > > Ran 6332 tests in 243.029s > > > > OK (KNOWNFAIL=7, SKIP=2) > > > > > > > > So far so good, but there are a lot of deprecation warnings etc from > SciPy, > > > > > > $ python3.5 -c 'import scipy; scipy.test("full")' > > Running unit tests for scipy > > NumPy version 1.11.0 > > NumPy relaxed strides checking option: False > > NumPy is installed in /home/xxx/lib/python3.5/site-packages/numpy > > SciPy version 0.17.0 > > SciPy is installed in /home/xxx/lib/python3.5/site-packages/scipy > > Python version 3.5.0 (default, Sep 28 2015, 11:25:31) [GCC 4.4.7 > > 20120313 (Red Hat 4.4.7-16)] > > nose version 1.3.7 > > [snip] > > /home/xxx/lib/python3.5/site-packages/numpy/lib/utils.py:99: > > DeprecationWarning: `rand` is deprecated! > > numpy.testing.rand is deprecated in numpy 1.11. Use numpy.random.rand > instead. > > warnings.warn(depdoc, DeprecationWarning) > > [snip] > > > /home/xxx/lib/python3.5/site-packages/scipy/io/arff/tests/test_arffread.py:254: > > DeprecationWarning: parsing timezone aware datetimes is deprecated; > > this will raise an error in the future > > ], dtype='datetime64[m]') > > /home/xxx/lib/python3.5/site-packages/scipy/io/arff/arffread.py:638: > > PendingDeprecationWarning: generator '_loadarff..generator' > > raised StopIteration > > [snip] > > > /home/xxx/lib/python3.5/site-packages/scipy/sparse/tests/test_base.py:2425: > > DeprecationWarning: This function is deprecated. Please call > > randint(-5, 5 + 1) instead > > I = np.random.random_integers(-M + 1, M - 1, size=NUM_SAMPLES) > > [snip] > > 0-th dimension must be fixed to 3 but got 15 > > [snip] > > ---------------------------------------------------------------------- > > Ran 21407 tests in 741.602s > > > > OK (KNOWNFAIL=130, SKIP=1775) > > > > > > Hopefully I didn't miss anything important in hand editing the scipy > output. > > Thanks a lot for testing. > > I believe the deprecation warnings are expected, because numpy 1.11.0 > introduced a new deprecation warning when using `random_integers`. > Scipy 0.17.0 is using `random_integers` in a few places. > > Best, > > Matthew > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From matt.p.conte at gmail.com Mon Apr 4 13:35:45 2016 From: matt.p.conte at gmail.com (mpc) Date: Mon, 4 Apr 2016 10:35:45 -0700 (MST) Subject: [Numpy-discussion] Multidimension array access in C via Python API Message-ID: <1459791345159-42710.post@n7.nabble.com> Hello, is there a C-API function for numpy that can implement Python's multidimensional indexing? For example, if I had a 2d array: PyArrayObject * M; and an index int i; how do I extract the i-th row M[i,:] or i-th column M[:,i]? Ideally it would be great if it returned another PyArrayObject* object (not a newly allocated one, but whose data will point to the correct memory locations of M). I've searched everywhere in the API documentation, Google, and SO, but no luck. Any help is greatly appreciated. Thank you. -Matthew -- View this message in context: http://numpy-discussion.10968.n7.nabble.com/Multidimension-array-access-in-C-via-Python-API-tp42710.html Sent from the Numpy-discussion mailing list archive at Nabble.com. 
From tjhnson at gmail.com Mon Apr 4 15:23:06 2016 From: tjhnson at gmail.com (T J) Date: Mon, 4 Apr 2016 14:23:06 -0500 Subject: [Numpy-discussion] Floor division on int returns float Message-ID: I'm on NumPy 1.10.4 (mkl). >>> np.uint(3) // 2 # 1.0 >>> 3 // 2 # 1 Is this behavior expected? It's certainly not desired from my perspective. If this is not a bug, could someone explain the rationale to me. Thanks. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ewm at redtetrahedron.org Mon Apr 4 15:44:51 2016 From: ewm at redtetrahedron.org (Eric Moore) Date: Mon, 4 Apr 2016 15:44:51 -0400 Subject: [Numpy-discussion] Multidimension array access in C via Python API In-Reply-To: <1459791345159-42710.post@n7.nabble.com> References: <1459791345159-42710.post@n7.nabble.com> Message-ID:

/* obj[ind] */
PyObject* DoIndex(PyObject* obj, int ind)
{
    PyObject *oind, *ret;

    oind = PyLong_FromLong(ind);
    if (!oind) {
        return NULL;
    }
    ret = PyObject_GetItem(obj, oind);
    Py_DECREF(oind);
    return ret;
}

/* obj[inds[0], inds[1], ... inds[n_ind-1]] */
PyObject* DoMultiIndex(PyObject* obj, int *inds, int n_ind)
{
    PyObject *ret, *oind, *temp;

    oind = PyTuple_New(n_ind);
    if (!oind)
        return NULL;
    for (int k = 0; k < n_ind; ++k) {
        temp = PyLong_FromLong(inds[k]);
        if (!temp) {
            /* must not store a NULL in the tuple */
            Py_DECREF(oind);
            return NULL;
        }
        PyTuple_SET_ITEM(oind, k, temp);
    }
    ret = PyObject_GetItem(obj, oind);
    Py_DECREF(oind);
    return ret;
}

/* obj[b:e:step] */
PyObject* DoSlice(PyObject* obj, int b, int e, int step)
{
    PyObject *oind, *ret, *ob, *oe, *ostep;

    ob = PyLong_FromLong(b);
    if (!ob)
        return NULL;
    oe = PyLong_FromLong(e);
    if (!oe) {
        Py_DECREF(ob);
        return NULL;
    }
    ostep = PyLong_FromLong(step);
    if (!ostep) {
        Py_DECREF(ob);
        Py_DECREF(oe);
        return NULL;
    }
    oind = PySlice_New(ob, oe, ostep);
    Py_DECREF(ob);
    Py_DECREF(oe);
    Py_DECREF(ostep);
    if (!oind)
        return NULL;
    ret = PyObject_GetItem(obj, oind);
    Py_DECREF(oind);
    return ret;
}

-Eric On Mon, Apr 4, 2016 at 1:35 PM, mpc wrote: > Hello, > > is there a C-API function for numpy that can implement Python's > multidimensional indexing? > > For example, if I had a 2d array: > > PyArrayObject * M; > > and an index > > int i; > > how do I extract the i-th row M[i,:] or i-th column M[:,i]? > > Ideally it would be great if it returned another PyArrayObject* object (not > a newly allocated one, but whose data will point to the correct memory > locations of M). > > I've searched everywhere in the API documentation, Google, and SO, but no > luck. > > Any help is greatly appreciated. > > Thank you. > > -Matthew > > > > -- > View this message in context: > http://numpy-discussion.10968.n7.nabble.com/Multidimension-array-access-in-C-via-Python-API-tp42710.html > Sent from the Numpy-discussion mailing list archive at Nabble.com. > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From efiring at hawaii.edu Mon Apr 4 15:56:57 2016 From: efiring at hawaii.edu (Eric Firing) Date: Mon, 4 Apr 2016 09:56:57 -1000 Subject: [Numpy-discussion] Floor division on int returns float In-Reply-To: References: Message-ID: <5702C709.1040801@hawaii.edu> On 2016/04/04 9:23 AM, T J wrote: > I'm on NumPy 1.10.4 (mkl). > > >>> np.uint(3) // 2 # 1.0 > >>> 3 // 2 # 1 > > Is this behavior expected? It's certainly not desired from my > perspective. If this is not a bug, could someone explain the rationale > to me. > > Thanks.
I agree that it's almost always undesirable; one would reasonably expect some sort of int. Here's what I think is going on: The odd behavior occurs only with np.uint, which is np.uint64, and when the denominator is a signed int. The problem is that if the denominator is negative, the result will be negative, so it can't have the same type as the first numerator. Furthermore, if the denominator is -1, the result will be minus the numerator, and that can't be represented by np.uint or np.int. Therefore the result is returned as np.float64. The promotion rules are based on what *could* happen in an operation, not on what *is* happening in a given instance. Eric From charlesr.harris at gmail.com Mon Apr 4 16:24:08 2016 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 4 Apr 2016 14:24:08 -0600 Subject: [Numpy-discussion] rational custom dtype example In-Reply-To: <57155BF8DF3FF541BF7E8515C12E957836F1B229@exch-1.corp.intertrust.com> References: <57155BF8DF3FF541BF7E8515C12E957836F1B229@exch-1.corp.intertrust.com> Message-ID: On Sat, Apr 2, 2016 at 1:59 PM, Steve Mitchell wrote: > I have noticed a few issues with the ?rational? custom C dtype example. > > > > 1. It doesn?t build on Windows. I managed to tweak it to build. > Mainly, the MSVC9 compiler is C89. > > 2. A few tests don?t pass on Windows, due to integer sizes. > > 3. The copyswap and copyswapn routines don?t do in-place swapping > if src == NULL, as specified in the docs. > > http://docs.scipy.org/doc/numpy-1.10.0/reference/c-api.types-and-structures.html > > > Needless to say the example hasn't been tested on windows ;) A PR fixing the issues that you discovered would be welcome. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From matt.p.conte at gmail.com Mon Apr 4 15:59:17 2016 From: matt.p.conte at gmail.com (mpc) Date: Mon, 4 Apr 2016 12:59:17 -0700 (MST) Subject: [Numpy-discussion] Multidimension array access in C via Python API In-Reply-To: References: <1459791345159-42710.post@n7.nabble.com> Message-ID: <1459799957089-42715.post@n7.nabble.com> Thanks for responding. It looks you made/found these yourself since I can't find anything like this in the API. I can't believe it isn't, so convenient! By the way, from what I understand, the ':' is represented as *PySlice_New(NULL, NULL, NULL) *in the C API when accessing by index, correct? Therefore the final result will be something like: *PyObject* first_column_tuple = PyTuple_New(2); PyTuple_SET_ITEM(first_column_tuple, 0, PySlice_New(NULL, NULL, NULL)); PyTuple_SET_ITEM(first_column_tuple, 1, PyInt_FromLong(0)); PyObject* first_column_buffer = PyObject_GetItem(src_buffer, first_column_tuple); * -- View this message in context: http://numpy-discussion.10968.n7.nabble.com/Multidimension-array-access-in-C-via-Python-API-tp42710p42715.html Sent from the Numpy-discussion mailing list archive at Nabble.com. From ewm at redtetrahedron.org Mon Apr 4 17:14:27 2016 From: ewm at redtetrahedron.org (Eric Moore) Date: Mon, 4 Apr 2016 17:14:27 -0400 Subject: [Numpy-discussion] Multidimension array access in C via Python API In-Reply-To: <1459799957089-42715.post@n7.nabble.com> References: <1459791345159-42710.post@n7.nabble.com> <1459799957089-42715.post@n7.nabble.com> Message-ID: Yes, PySlice_New(NULL, NULL, NULL) is the same as ':'. Depending on what exactly you want to do with the column once you've extracted it, this may not be the best way to do it. 
Are you absolutely certain that you actually need a PyArrayObject that points to the column? Eric On Mon, Apr 4, 2016 at 3:59 PM, mpc wrote: > Thanks for responding. > > It looks you made/found these yourself since I can't find anything like > this > in the API. I can't believe it isn't, so convenient! > > By the way, from what I understand, the ':' is represented as > *PySlice_New(NULL, NULL, NULL) *in the C API when accessing by index, > correct? > > > Therefore the final result will be something like: > > *PyObject* first_column_tuple = PyTuple_New(2); > PyTuple_SET_ITEM(first_column_tuple, 0, PySlice_New(NULL, NULL, NULL)); > PyTuple_SET_ITEM(first_column_tuple, 1, PyInt_FromLong(0)); > PyObject* first_column_buffer = PyObject_GetItem(src_buffer, > first_column_tuple); > * > > > > -- > View this message in context: > http://numpy-discussion.10968.n7.nabble.com/Multidimension-array-access-in-C-via-Python-API-tp42710p42715.html > Sent from the Numpy-discussion mailing list archive at Nabble.com. > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From matt.p.conte at gmail.com Mon Apr 4 16:28:20 2016 From: matt.p.conte at gmail.com (mpc) Date: Mon, 4 Apr 2016 13:28:20 -0700 (MST) Subject: [Numpy-discussion] Multidimension array access in C via Python API In-Reply-To: References: <1459791345159-42710.post@n7.nabble.com> <1459799957089-42715.post@n7.nabble.com> Message-ID: <1459801700080-42717.post@n7.nabble.com> I think that I do, since I intend to do array specific operations on the resulting column of data. e.g: *PyArray_Min* *PyArray_Max* which require a PyArrayObject argument I also plan to use *PyArray_Where* to find individual point locations in data columns x,y,z within a 3D range, but it doesn't look like it needs PyArrayObject. -- View this message in context: http://numpy-discussion.10968.n7.nabble.com/Multidimension-array-access-in-C-via-Python-API-tp42710p42717.html Sent from the Numpy-discussion mailing list archive at Nabble.com. From njs at pobox.com Mon Apr 4 18:57:24 2016 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 4 Apr 2016 15:57:24 -0700 Subject: [Numpy-discussion] Multidimension array access in C via Python API In-Reply-To: <1459799957089-42715.post@n7.nabble.com> References: <1459791345159-42710.post@n7.nabble.com> <1459799957089-42715.post@n7.nabble.com> Message-ID: On Apr 4, 2016 1:58 PM, "mpc" wrote: > > Thanks for responding. > > It looks you made/found these yourself since I can't find anything like this > in the API. I can't believe it isn't, so convenient! > > By the way, from what I understand, the ':' is represented as > *PySlice_New(NULL, NULL, NULL) *in the C API when accessing by index, > correct? > > > Therefore the final result will be something like: > > *PyObject* first_column_tuple = PyTuple_New(2); > PyTuple_SET_ITEM(first_column_tuple, 0, PySlice_New(NULL, NULL, NULL)); > PyTuple_SET_ITEM(first_column_tuple, 1, PyInt_FromLong(0)); > PyObject* first_column_buffer = PyObject_GetItem(src_buffer, > first_column_tuple); > * If this is what your code looks like, then I strongly suspect you'll be better off writing it in cython or even just plain python. The above code won't run any faster than the equivalent code in python, but it's much harder to read and write... -n -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From matt.p.conte at gmail.com Tue Apr 5 11:39:39 2016 From: matt.p.conte at gmail.com (mpc) Date: Tue, 5 Apr 2016 08:39:39 -0700 (MST) Subject: [Numpy-discussion] Multidimension array access in C via Python API In-Reply-To: References: <1459791345159-42710.post@n7.nabble.com> <1459799957089-42715.post@n7.nabble.com> Message-ID: <1459870779508-42719.post@n7.nabble.com> This is the reason I'm doing this in the first place, because I made a pure python version but it runs really slow for larger data sets, so I'm basically rewriting the same function but using the Python and Numpy C API, but if you're saying it won't run any faster then maybe I'm going at it the wrong way. (Why use the C function version if it's the same speed anyway?) You're suggesting perhaps a cython approach, or perhaps a strictly C/C++ approach given the raw data? -- View this message in context: http://numpy-discussion.10968.n7.nabble.com/Multidimension-array-access-in-C-via-Python-API-tp42710p42719.html Sent from the Numpy-discussion mailing list archive at Nabble.com. From ewm at redtetrahedron.org Tue Apr 5 13:17:42 2016 From: ewm at redtetrahedron.org (Eric Moore) Date: Tue, 5 Apr 2016 13:17:42 -0400 Subject: [Numpy-discussion] Multidimension array access in C via Python API In-Reply-To: <1459870779508-42719.post@n7.nabble.com> References: <1459791345159-42710.post@n7.nabble.com> <1459799957089-42715.post@n7.nabble.com> <1459870779508-42719.post@n7.nabble.com> Message-ID: Its difficult to say why your code is slow without seeing it. i.e. are you generating large temporaries? Or doing loops in python that can be pushed down to C via vectorizing? It may or may not be necessary to leave python to get things to run fast enough. -Eric On Tue, Apr 5, 2016 at 11:39 AM, mpc wrote: > This is the reason I'm doing this in the first place, because I made a pure > python version but it runs really slow for larger data sets, so I'm > basically rewriting the same function but using the Python and Numpy C API, > but if you're saying it won't run any faster then maybe I'm going at it the > wrong way. (Why use the C function version if it's the same speed anyway?) > > You're suggesting perhaps a cython approach, or perhaps a strictly C/C++ > approach given the raw data? > > > > -- > View this message in context: > http://numpy-discussion.10968.n7.nabble.com/Multidimension-array-access-in-C-via-Python-API-tp42710p42719.html > Sent from the Numpy-discussion mailing list archive at Nabble.com. > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Tue Apr 5 13:22:17 2016 From: matthew.brett at gmail.com (Matthew Brett) Date: Tue, 5 Apr 2016 10:22:17 -0700 Subject: [Numpy-discussion] Using OpenBLAS for manylinux wheels In-Reply-To: References: Message-ID: On Mon, Mar 28, 2016 at 2:33 PM, Matthew Brett wrote: > Hi, > > Olivier Grisel and I are working on building and testing manylinux > wheels for numpy and scipy. > > We first thought that we should use ATLAS BLAS, but Olivier found that > my build of these could be very slow [1]. I set up a testing grid [2] > which found test errors for numpy and scipy using ATLAS wheels. 
> > On the other hand, the same testing grid finds no errors or failures > [3] using latest OpenBLAS (0.2.17) and running tests for: > > numpy > scipy > scikit-learn > numexpr > pandas > statsmodels > > This is on the travis-ci ubuntu VMs. > > Please do test on your own machines with something like this script [4]: > > source test_manylinux.sh > > We have worried in the past about the reliability of OpenBLAS, but I > find these tests reassuring. > > Are there any other tests of OpenBLAS that we should run to assure > ourselves that it is safe to use? Here is an update on progress: We've now done a lot of testing on the Linux OpenBLAS wheels. They pass all tests on Linux, with Intel kernels: https://travis-ci.org/matthew-brett/manylinux-testing/builds/120825485 http://nipy.bic.berkeley.edu/builders/manylinux-2.7-debian/builds/22 http://nipy.bic.berkeley.edu/builders/manylinux-2.7-fedora/builds/10 Xianyi, the maintainer of OpenBLAS, is very helpfully running the OpenBLAS buildbot nightly tests with numpy and scipy: http://build.openblas.net/builders There is still one BLAS-related failure on these tests on AMD chips: https://github.com/xianyi/OpenBLAS-CI/issues/10 I propose to hold off distributing the OpenBLAS wheels until the OpenBLAS tests are clean on the OpenBLAS buildbots - any objections? Cheers, Matthew From solipsis at pitrou.net Tue Apr 5 13:24:30 2016 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 5 Apr 2016 19:24:30 +0200 Subject: [Numpy-discussion] Multidimension array access in C via Python API References: <1459791345159-42710.post@n7.nabble.com> <1459799957089-42715.post@n7.nabble.com> <1459870779508-42719.post@n7.nabble.com> Message-ID: <20160405192430.4d46efd3@fsol> On Tue, 5 Apr 2016 08:39:39 -0700 (MST) mpc wrote: > This is the reason I'm doing this in the first place, because I made a pure > python version but it runs really slow for larger data sets, so I'm > basically rewriting the same function but using the Python and Numpy C API, > but if you're saying it won't run any faster then maybe I'm going at it the > wrong way. (Why use the C function version if it's the same speed anyway?) The Python and Numpy C API are generally not very user-friendly compared to Python code, even hand-optimized. Cython will let you write code that looks quite close to normal Python code, but with additional annotations for better performance. Or you can try Numba, a just-in-time compiler for scientific code that understand Numpy arrays: http://numba.pydata.org/ Regards Antoine. From olivier.grisel at ensta.org Tue Apr 5 13:36:37 2016 From: olivier.grisel at ensta.org (Olivier Grisel) Date: Tue, 5 Apr 2016 19:36:37 +0200 Subject: [Numpy-discussion] Using OpenBLAS for manylinux wheels In-Reply-To: References: Message-ID: > Xianyi, the maintainer of OpenBLAS, is very helpfully running the > OpenBLAS buildbot nightly tests with numpy and scipy: > > http://build.openblas.net/builders > > There is still one BLAS-related failure on these tests on AMD chips: > > https://github.com/xianyi/OpenBLAS-CI/issues/10 > > I propose to hold off distributing the OpenBLAS wheels until the > OpenBLAS tests are clean on the OpenBLAS buildbots - any objections? I agree. If someone can understand the fortran code of that scipy test failure to extract a minimalistic reproduction case that only use a few BLAS calls that would help. 
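A bare skeleton along those lines could start from a direct dgemm_ call and
grow toward whatever the test exercises -- hypothetical code, not extracted
from the failing test:

#include <stdio.h>

/* Fortran BLAS entry point (column-major) */
extern void dgemm_(const char *transa, const char *transb,
                   const int *m, const int *n, const int *k,
                   const double *alpha, const double *a, const int *lda,
                   const double *b, const int *ldb,
                   const double *beta, double *c, const int *ldc);

int main(void)
{
    int n = 2;
    double a[4] = {1, 2, 3, 4};   /* 2x2 matrices, column-major */
    double b[4] = {5, 6, 7, 8};
    double c[4] = {0};
    double one = 1.0, zero = 0.0;

    /* c = a * b */
    dgemm_("N", "N", &n, &n, &n, &one, a, &n, b, &n, &zero, c, &n);
    printf("%g %g %g %g\n", c[0], c[1], c[2], c[3]);
    return 0;
}

Linked against the suspect OpenBLAS build, its output could then be compared
against a reference BLAS on the same box.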
-- Olivier http://twitter.com/ogrisel - http://github.com/ogrisel From stefan at seefeld.name Tue Apr 5 13:42:24 2016 From: stefan at seefeld.name (Stefan Seefeld) Date: Tue, 5 Apr 2016 13:42:24 -0400 Subject: [Numpy-discussion] Multidimension array access in C via Python API In-Reply-To: <20160405192430.4d46efd3@fsol> References: <1459791345159-42710.post@n7.nabble.com> <1459799957089-42715.post@n7.nabble.com> <1459870779508-42719.post@n7.nabble.com> <20160405192430.4d46efd3@fsol> Message-ID: <5703F900.8080809@seefeld.name> On 05.04.2016 13:24, Antoine Pitrou wrote: > On Tue, 5 Apr 2016 08:39:39 -0700 (MST) > mpc wrote: >> This is the reason I'm doing this in the first place, because I made a pure >> python version but it runs really slow for larger data sets, so I'm >> basically rewriting the same function but using the Python and Numpy C API, >> but if you're saying it won't run any faster then maybe I'm going at it the >> wrong way. (Why use the C function version if it's the same speed anyway?) > The Python and Numpy C API are generally not very user-friendly > compared to Python code, even hand-optimized. > > Cython will let you write code that looks quite close to normal Python > code, but with additional annotations for better performance. Or you > can try Numba, a just-in-time compiler for scientific code that > understand Numpy arrays: > http://numba.pydata.org/ And just for the record: there is also Boost.Python, if you already have a C++ library you want to bind to Python. There even is a Boost.NumPy project, which however isn't formally part of Boost yet. Still, it may be an inspiration...: http://stefanseefeld.github.io/boost.python/doc/html/index.html https://github.com/ndarray/Boost.NumPy Regards, Stefan -- ...ich hab' noch einen Koffer in Berlin... From njs at pobox.com Tue Apr 5 13:44:38 2016 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 5 Apr 2016 10:44:38 -0700 Subject: [Numpy-discussion] Using OpenBLAS for manylinux wheels In-Reply-To: References: Message-ID: On Apr 5, 2016 10:23 AM, "Matthew Brett" wrote: > > On Mon, Mar 28, 2016 at 2:33 PM, Matthew Brett wrote: > > Hi, > > > > Olivier Grisel and I are working on building and testing manylinux > > wheels for numpy and scipy. > > > > We first thought that we should use ATLAS BLAS, but Olivier found that > > my build of these could be very slow [1]. I set up a testing grid [2] > > which found test errors for numpy and scipy using ATLAS wheels. > > > > On the other hand, the same testing grid finds no errors or failures > > [3] using latest OpenBLAS (0.2.17) and running tests for: > > > > numpy > > scipy > > scikit-learn > > numexpr > > pandas > > statsmodels > > > > This is on the travis-ci ubuntu VMs. > > > > Please do test on your own machines with something like this script [4]: > > > > source test_manylinux.sh > > > > We have worried in the past about the reliability of OpenBLAS, but I > > find these tests reassuring. > > > > Are there any other tests of OpenBLAS that we should run to assure > > ourselves that it is safe to use? > > Here is an update on progress: > > We've now done a lot of testing on the Linux OpenBLAS wheels. 
They
> pass all tests on Linux, with Intel kernels:
>
> https://travis-ci.org/matthew-brett/manylinux-testing/builds/120825485
> http://nipy.bic.berkeley.edu/builders/manylinux-2.7-debian/builds/22
> http://nipy.bic.berkeley.edu/builders/manylinux-2.7-fedora/builds/10
>
> Xianyi, the maintainer of OpenBLAS, is very helpfully running the
> OpenBLAS buildbot nightly tests with numpy and scipy:
>
> http://build.openblas.net/builders
>
> There is still one BLAS-related failure on these tests on AMD chips:
>
> https://github.com/xianyi/OpenBLAS-CI/issues/10
>
> I propose to hold off distributing the OpenBLAS wheels until the
> OpenBLAS tests are clean on the OpenBLAS buildbots - any objections?

Alternatively, would it make sense to add a local patch to our openblas
builds to blacklist the piledriver kernel and then distribute them now?
(I'm not immediately sure what would be involved in doing this but it
seems unlikely that it would require anything tricky?)

-n
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From matt.p.conte at gmail.com Tue Apr 5 12:48:14 2016
From: matt.p.conte at gmail.com (mpc)
Date: Tue, 5 Apr 2016 09:48:14 -0700 (MST)
Subject: [Numpy-discussion] Multidimension array access in C via Python API
In-Reply-To:
References: <1459791345159-42710.post@n7.nabble.com>
 <1459799957089-42715.post@n7.nabble.com>
 <1459870779508-42719.post@n7.nabble.com>
Message-ID: <1459874894454-42726.post@n7.nabble.com>

The idea is that I want to thin a large 2D buffer of x,y,z points to a given
resolution by dividing the data into equal sized "cubes" (i.e. resolution is
number of cubes along each axis) and averaging the points inside each cube
(if any).

    # Fill up buffer data for demonstration purposes with initial buffer of
    # size 10,000,000 to reduce to 1,000,000
    size = 10000000
    buffer = np.ndarray(shape=(size, 3), dtype=np.float)
    # fill it up
    buffer[:, 0] = np.random.ranf(size)
    buffer[:, 1] = np.random.ranf(size)
    buffer[:, 2] = np.random.ranf(size)

    # Create result buffer to size of cubed resolution (i.e. 100 ^ 3 = 1,000,000)
    resolution = 100
    thinned_buffer = np.ndarray(shape=(resolution ** 3, 3), dtype=np.float)

    # Trying to convert the following into C to speed it up
    x_buffer = buffer[:, 0]
    y_buffer = buffer[:, 1]
    z_buffer = buffer[:, 2]
    min_x = x_buffer.min()
    max_x = x_buffer.max()
    min_y = y_buffer.min()
    max_y = y_buffer.max()
    min_z = z_buffer.min()
    max_z = z_buffer.max()
    z_block = (max_z - min_z) / resolution
    x_block = (max_x - min_x) / resolution
    y_block = (max_y - min_y) / resolution

    current_idx = 0
    x_idx = min_x
    while x_idx < max_x:
        y_idx = min_y
        while y_idx < max_y:
            z_idx = min_z
            while z_idx < max_z:
                inside_block_points = np.where((x_buffer >= x_idx) &
                                               (x_buffer <= x_idx + x_block) &
                                               (y_buffer >= y_idx) &
                                               (y_buffer <= y_idx + y_block) &
                                               (z_buffer >= z_idx) &
                                               (z_buffer <= z_idx + z_block))
                if inside_block_points[0].size > 0:
                    mean_point = buffer[inside_block_points[0]].mean(axis=0)
                    thinned_buffer[current_idx] = mean_point
                    current_idx += 1
                z_idx += z_block
            y_idx += y_block
        x_idx += x_block
    return thinned_buffer

--
View this message in context: http://numpy-discussion.10968.n7.nabble.com/Multidimension-array-access-in-C-via-Python-API-tp42710p42726.html
Sent from the Numpy-discussion mailing list archive at Nabble.com.
From ben.v.root at gmail.com Tue Apr 5 13:56:28 2016 From: ben.v.root at gmail.com (Benjamin Root) Date: Tue, 5 Apr 2016 13:56:28 -0400 Subject: [Numpy-discussion] Multidimension array access in C via Python API In-Reply-To: <1459874894454-42726.post@n7.nabble.com> References: <1459791345159-42710.post@n7.nabble.com> <1459799957089-42715.post@n7.nabble.com> <1459870779508-42719.post@n7.nabble.com> <1459874894454-42726.post@n7.nabble.com> Message-ID: You might do better using scipy.spatial. It has very useful data structures for handling spatial coordinates. I am not exactly sure how to use them for this specific problem (not a domain expert), but I would imagine that the QHull wrappers there might give you some useful tools. Ben Root On Tue, Apr 5, 2016 at 12:48 PM, mpc wrote: > The idea is that I want to thin a large 2D buffer of x,y,z points to a > given > resolution by dividing the data into equal sized "cubes" (i.e. resolution > is > number of cubes along each axis) and averaging the points inside each cube > (if any). > > > * # Fill up buffer data for demonstration purposes with initial buffer > of > size 10,000,000 to reduce to 1,000,000 > size = 10000000 > buffer = np.ndarray(shape=(size,3), dtype=np.float) > # fill it up > buffer[:, 0] = np.random.ranf(size) > buffer[:, 1] = np.random.ranf(size) > buffer[:, 2] = np.random.ranf(size) > > # Create result buffer to size of cubed resolution (i.e. 100 ^ 3 = > 1,000,000) > resolution = 100 > thinned_buffer = np.ndarray(shape=(resolution ** 3,3), dtype=np.float) > > # Trying to convert the following into C to speed it up > x_buffer = buffer[:, 0] > y_buffer = buffer[:, 1] > z_buffer = buffer[:, 2] > min_x = x_buffer.min() > max_x = x_buffer.max() > min_y = y_buffer.min() > max_y = y_buffer.max() > min_z = z_buffer.min() > max_z = z_buffer.max() > z_block = (max_z - min_z) / resolution > x_block = (max_x - min_x) / resolution > y_block = (max_y - min_y) / resolution > > current_idx = 0 > x_idx = min_x > while x_idx < max_x: > y_idx = min_y > while y_idx < max_y: > z_idx = min_z > while z_idx < max_z: > inside_block_points = np.where((x_buffer >= x_idx) & > (x_buffer <= > x_idx + x_block) & > (y_buffer >= > y_idx) & > (y_buffer <= > y_idx + y_block) & > (z_buffer >= > z_idx) & > (z_buffer <= > z_idx + z_block)) > if inside_block_points[0].size > 0: > mean_point = > buffer[inside_block_points[0]].mean(axis=0) > thinned_buffer[current_idx] = mean_point > current_idx += 1 > z_idx += z_block > y_idx += y_block > x_idx += x_block > return thin_buffer > * > > > > -- > View this message in context: > http://numpy-discussion.10968.n7.nabble.com/Multidimension-array-access-in-C-via-Python-API-tp42710p42726.html > Sent from the Numpy-discussion mailing list archive at Nabble.com. > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From njs at pobox.com Tue Apr 5 14:03:10 2016
From: njs at pobox.com (Nathaniel Smith)
Date: Tue, 5 Apr 2016 11:03:10 -0700
Subject: [Numpy-discussion] Multidimension array access in C via Python API
In-Reply-To: <1459870779508-42719.post@n7.nabble.com>
References: <1459791345159-42710.post@n7.nabble.com>
 <1459799957089-42715.post@n7.nabble.com>
 <1459870779508-42719.post@n7.nabble.com>
Message-ID:

On Apr 5, 2016 9:39 AM, "mpc" wrote:
>
> This is the reason I'm doing this in the first place, because I made a pure
> python version but it runs really slow for larger data sets, so I'm
> basically rewriting the same function but using the Python and Numpy C API,
> but if you're saying it won't run any faster then maybe I'm going at it the
> wrong way. (Why use the C function version if it's the same speed anyway?)

I haven't had a chance to look carefully at the code you posted, but the
useful general rule is that code does not magically become faster by
writing it in C -- it only becomes faster if by writing it in C you are
able to write it in a way that you couldn't from python. Those high-level
Numpy C functions you started out using from C *are* the code that you're
using from python (e.g. writing arr1 + arr2 essentially just calls
PyArray_Add and so forth); there's a little bit of pure overhead from
python itself but it's surprisingly small. (I've seen estimates of 10-20%
for regular python code, and it should be much less than that for code
like yours that's spending most of its time inside numpy.) Where you can
really win is places where you go outside the Numpy C API -- replacing an
element-by-element loop with C, that sort of thing.

> You're suggesting perhaps a cython approach, or perhaps a strictly C/C++
> approach given the raw data?

Where cython shines is when you have a small amount of code that really
needs to be in C, and it's surrounded by / mixed in with code where
regular numpy operations are fine. Or just in general it's great for
taking care of all the annoying boilerplate involved in getting between
python and C.

First though you need some idea of what, algorithmically, you want to
change about your implementation that will make it go faster, and why
C/C++ might or might not help with implementing this strategy :-).

-n
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From chris.barker at noaa.gov Tue Apr 5 14:16:14 2016
From: chris.barker at noaa.gov (Chris Barker)
Date: Tue, 5 Apr 2016 11:16:14 -0700
Subject: [Numpy-discussion] Multidimension array access in C via Python API
In-Reply-To: <1459874894454-42726.post@n7.nabble.com>
References: <1459791345159-42710.post@n7.nabble.com>
 <1459799957089-42715.post@n7.nabble.com>
 <1459870779508-42719.post@n7.nabble.com>
 <1459874894454-42726.post@n7.nabble.com>
Message-ID:

On Tue, Apr 5, 2016 at 9:48 AM, mpc wrote:

> The idea is that I want to thin a large 2D buffer of x,y,z points to a
> given
> resolution by dividing the data into equal sized "cubes" (i.e. resolution
> is
> number of cubes along each axis) and averaging the points inside each cube
> (if any).
>

Are the original x,y,z points arranged along a nice even grid? Or
arbitrarily spaced?

If the former, I have Cython code that does that :-) I could dig it up,
haven't used it in a while. Or scikit.image might have something.

If the latter, then Ben is right -- you NEED a spatial index --
scipy.spatial.kdtree will probably do what you want, though it would be
easier to use a sphere to average over than a cube.
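Something like this, as a rough sketch (made-up sizes and search radius,
using scipy's cKDTree):

import numpy as np
from scipy.spatial import cKDTree

points = np.random.rand(100000, 3)    # arbitrary x, y, z points
centres = np.random.rand(1000, 3)     # e.g. the cube centres
tree = cKDTree(points)

# indices of all points within radius r of each centre
neighbours = tree.query_ball_point(centres, r=0.05)
means = [points[idx].mean(axis=0) for idx in neighbours if idx]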
Also, maybe Kernel Density Estimation could help here???? https://jakevdp.github.io/blog/2013/12/01/kernel-density-estimation/ Otherwise, you could use Cython to write a non-vectorized version of your below code -- it would be order NM where N is the number of "cubes" and M is the number of original points. I think, but would be a lot faster than the pure python. -CHB Here is where you would do the cython: while x_idx < max_x: > y_idx = min_y > while y_idx < max_y: > z_idx = min_z > while z_idx < max_z: > inside_block_points = np.where((x_buffer >= x_idx) & > (x_buffer <= > x_idx + x_block) & > (y_buffer >= > y_idx) & > (y_buffer <= > y_idx + y_block) & > (z_buffer >= > z_idx) & > (z_buffer <= > z_idx + z_block)) > instead of where, you could loop through all your points and find the ones inside your extents. though now that I think about it -- you are mapping arbitrary points to a regular grid, so you only need to go through the points once, assigning each one to a bin, and then compute the average in each bin. Is this almost a histogram? -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Tue Apr 5 14:19:56 2016 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Tue, 05 Apr 2016 20:19:56 +0200 Subject: [Numpy-discussion] Multidimension array access in C via Python API In-Reply-To: <1459874894454-42726.post@n7.nabble.com> References: <1459791345159-42710.post@n7.nabble.com> <1459799957089-42715.post@n7.nabble.com> <1459870779508-42719.post@n7.nabble.com> <1459874894454-42726.post@n7.nabble.com> Message-ID: <1459880396.9911.22.camel@sipsolutions.net> On Di, 2016-04-05 at 09:48 -0700, mpc wrote: > The idea is that I want to thin a large 2D buffer of x,y,z points to > a given > resolution by dividing the data into equal sized "cubes" (i.e. > resolution is > number of cubes along each axis) and averaging the points inside each > cube > (if any). > Another point is timing your actual code, in this case you could have noticed that all time is spend in the while loops and little time in those min/max calls before. Algorithms, or what you do is the other thing. In the end, it seems your code is just a high dimensional histogram. Though I am not sure if numpy's histogram is fast, I am sure it vastly outperforms this and if you are interested in how it does this, you could even check its code, it is just in python (though numpy internally always has quite a lot of fun boilerplate to make sure of corner cases). And if you search for what you want to do first, you may find faster solutions easily, batteries included and all, there are a lot of tools out there. The other point is, don't optimize much if you don't know exactly what you need to optimize. - Sebastian > > * # Fill up buffer data for demonstration purposes with initial > buffer of > size 10,000,000 to reduce to 1,000,000 > size = 10000000 > buffer = np.ndarray(shape=(size,3), dtype=np.float) > # fill it up > buffer[:, 0] = np.random.ranf(size) > buffer[:, 1] = np.random.ranf(size) > buffer[:, 2] = np.random.ranf(size) > > # Create result buffer to size of cubed resolution (i.e. 
100 ^ 3 > = > 1,000,000) > resolution = 100 > thinned_buffer = np.ndarray(shape=(resolution ** 3,3), > dtype=np.float) > > # Trying to convert the following into C to speed it up > x_buffer = buffer[:, 0] > y_buffer = buffer[:, 1] > z_buffer = buffer[:, 2] > min_x = x_buffer.min() > max_x = x_buffer.max() > min_y = y_buffer.min() > max_y = y_buffer.max() > min_z = z_buffer.min() > max_z = z_buffer.max() > z_block = (max_z - min_z) / resolution > x_block = (max_x - min_x) / resolution > y_block = (max_y - min_y) / resolution > > current_idx = 0 > x_idx = min_x > while x_idx < max_x: > y_idx = min_y > while y_idx < max_y: > z_idx = min_z > while z_idx < max_z: > inside_block_points = np.where((x_buffer >= x_idx) & > > (x_buffer <= > x_idx + x_block) & > > (y_buffer >= > y_idx) & > > (y_buffer <= > y_idx + y_block) & > > (z_buffer >= > z_idx) & > > (z_buffer <= > z_idx + z_block)) > if inside_block_points[0].size > 0: > mean_point = > buffer[inside_block_points[0]].mean(axis=0) > thinned_buffer[current_idx] = mean_point > current_idx += 1 > z_idx += z_block > y_idx += y_block > x_idx += x_block > return thin_buffer > * > > > > -- > View this message in context: http://numpy-discussion.10968.n7.nabble > .com/Multidimension-array-access-in-C-via-Python-API > -tp42710p42726.html > Sent from the Numpy-discussion mailing list archive at Nabble.com. > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From sebastian at sipsolutions.net Tue Apr 5 14:29:22 2016 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Tue, 05 Apr 2016 20:29:22 +0200 Subject: [Numpy-discussion] Multidimension array access in C via Python API In-Reply-To: <1459880396.9911.22.camel@sipsolutions.net> References: <1459791345159-42710.post@n7.nabble.com> <1459799957089-42715.post@n7.nabble.com> <1459870779508-42719.post@n7.nabble.com> <1459874894454-42726.post@n7.nabble.com> <1459880396.9911.22.camel@sipsolutions.net> Message-ID: <1459880962.9911.23.camel@sipsolutions.net> On Di, 2016-04-05 at 20:19 +0200, Sebastian Berg wrote: > On Di, 2016-04-05 at 09:48 -0700, mpc wrote: > > The idea is that I want to thin a large 2D buffer of x,y,z points > > to > > a given > > resolution by dividing the data into equal sized "cubes" (i.e. > > resolution is > > number of cubes along each axis) and averaging the points inside > > each > > cube > > (if any). > > > > Another point is timing your actual code, in this case you could have > noticed that all time is spend in the while loops and little time in > those min/max calls before. > > Algorithms, or what you do is the other thing. In the end, it seems > your code is just a high dimensional histogram. Though I am not sure > if > numpy's histogram is fast, I am sure it vastly outperforms this and > if Hmm, well maybe not quite, but it seems similar like a weighted histogram. > you are interested in how it does this, you could even check its > code, > it is just in python (though numpy internally always has quite a lot > of > fun boilerplate to make sure of corner cases). > > And if you search for what you want to do first, you may find faster > solutions easily, batteries included and all, there are a lot of > tools > out there. 
The other point is, don't optimize much if you don't know > exactly what you need to optimize. > > - Sebastian > > > > > * # Fill up buffer data for demonstration purposes with initial > > buffer of > > size 10,000,000 to reduce to 1,000,000 > > size = 10000000 > > buffer = np.ndarray(shape=(size,3), dtype=np.float) > > # fill it up > > buffer[:, 0] = np.random.ranf(size) > > buffer[:, 1] = np.random.ranf(size) > > buffer[:, 2] = np.random.ranf(size) > > > > # Create result buffer to size of cubed resolution (i.e. 100 ^ > > 3 > > = > > 1,000,000) > > resolution = 100 > > thinned_buffer = np.ndarray(shape=(resolution ** 3,3), > > dtype=np.float) > > > > # Trying to convert the following into C to speed it up > > x_buffer = buffer[:, 0] > > y_buffer = buffer[:, 1] > > z_buffer = buffer[:, 2] > > min_x = x_buffer.min() > > max_x = x_buffer.max() > > min_y = y_buffer.min() > > max_y = y_buffer.max() > > min_z = z_buffer.min() > > max_z = z_buffer.max() > > z_block = (max_z - min_z) / resolution > > x_block = (max_x - min_x) / resolution > > y_block = (max_y - min_y) / resolution > > > > current_idx = 0 > > x_idx = min_x > > while x_idx < max_x: > > y_idx = min_y > > while y_idx < max_y: > > z_idx = min_z > > while z_idx < max_z: > > inside_block_points = np.where((x_buffer >= x_idx) > > & > > > > (x_buffer <= > > x_idx + x_block) & > > > > (y_buffer >= > > y_idx) & > > > > (y_buffer <= > > y_idx + y_block) & > > > > (z_buffer >= > > z_idx) & > > > > (z_buffer <= > > z_idx + z_block)) > > if inside_block_points[0].size > 0: > > mean_point = > > buffer[inside_block_points[0]].mean(axis=0) > > thinned_buffer[current_idx] = mean_point > > current_idx += 1 > > z_idx += z_block > > y_idx += y_block > > x_idx += x_block > > return thin_buffer > > * > > > > > > > > -- > > View this message in context: > > http://numpy-discussion.10968.n7.nabble > > .com/Multidimension-array-access-in-C-via-Python-API > > -tp42710p42726.html > > Sent from the Numpy-discussion mailing list archive at Nabble.com. > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From matt.p.conte at gmail.com Tue Apr 5 13:55:22 2016 From: matt.p.conte at gmail.com (mpc) Date: Tue, 5 Apr 2016 10:55:22 -0700 (MST) Subject: [Numpy-discussion] Multidimension array access in C via Python API In-Reply-To: References: <1459791345159-42710.post@n7.nabble.com> <1459799957089-42715.post@n7.nabble.com> <1459870779508-42719.post@n7.nabble.com> <1459874894454-42726.post@n7.nabble.com> Message-ID: <1459878922273-42732.post@n7.nabble.com> The points are indeed arbitrarily spaced, and yes I have heard tale of using spatial indices for this sort of problem, and it looks like that would be the best bet for me. Thanks for the other suggestions as well! -- View this message in context: http://numpy-discussion.10968.n7.nabble.com/Multidimension-array-access-in-C-via-Python-API-tp42710p42732.html Sent from the Numpy-discussion mailing list archive at Nabble.com. 
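For what it's worth, Sebastian's "high dimensional histogram" reading can
be spelled directly with np.histogramdd, computing per-cube sums and
counts -- a rough sketch with made-up sizes:

import numpy as np

buffer = np.random.rand(100000, 3)
res = 20

# counts per cube, then the per-coordinate sums via weighted histograms
counts, edges = np.histogramdd(buffer, bins=res)
sums = np.stack([np.histogramdd(buffer, bins=res, weights=buffer[:, k])[0]
                 for k in range(3)], axis=-1)
with np.errstate(invalid='ignore'):
    means = sums / counts[..., None]   # NaN marks the empty cubes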
From matt.p.conte at gmail.com Tue Apr 5 14:09:40 2016
From: matt.p.conte at gmail.com (mpc)
Date: Tue, 5 Apr 2016 11:09:40 -0700 (MST)
Subject: [Numpy-discussion] Multidimension array access in C via Python API
In-Reply-To:
References: <1459791345159-42710.post@n7.nabble.com>
 <1459799957089-42715.post@n7.nabble.com>
 <1459870779508-42719.post@n7.nabble.com>
 <1459874894454-42726.post@n7.nabble.com>
Message-ID: <1459879780918-42733.post@n7.nabble.com>

This wasn't intended to be a histogram, but you're right in that it would be
much better if I can just go through each point once and bin the results;
that makes more sense, thanks!

--
View this message in context: http://numpy-discussion.10968.n7.nabble.com/Multidimension-array-access-in-C-via-Python-API-tp42710p42733.html
Sent from the Numpy-discussion mailing list archive at Nabble.com.

From ewm at redtetrahedron.org Tue Apr 5 15:31:03 2016
From: ewm at redtetrahedron.org (Eric Moore)
Date: Tue, 5 Apr 2016 15:31:03 -0400
Subject: [Numpy-discussion] Multidimension array access in C via Python API
In-Reply-To: <1459879780918-42733.post@n7.nabble.com>
References: <1459791345159-42710.post@n7.nabble.com>
 <1459799957089-42715.post@n7.nabble.com>
 <1459870779508-42719.post@n7.nabble.com>
 <1459874894454-42726.post@n7.nabble.com>
 <1459879780918-42733.post@n7.nabble.com>
Message-ID:

import numpy as np

def reduce_data(buffer, resolution):
    thinned_buffer = np.zeros((resolution**3, 3))

    min_xyz = buffer.min(axis=0)
    max_xyz = buffer.max(axis=0)
    delta_xyz = max_xyz - min_xyz

    inds_xyz = np.floor(resolution * (buffer - min_xyz) / delta_xyz).astype(int)

    # handle values right at the max
    inds_xyz[inds_xyz == resolution] -= 1

    # convert to linear indices so that we can use np.add.at
    inds_lin = inds_xyz[:, 0]
    inds_lin += inds_xyz[:, 1] * resolution
    inds_lin += inds_xyz[:, 2] * resolution**2

    np.add.at(thinned_buffer, inds_lin, buffer)
    counts = np.bincount(inds_lin, minlength=resolution**3)

    thinned_buffer[counts != 0, :] /= counts[counts != 0, None]
    return thinned_buffer

The bulk of the time is spent in np.add.at, so just over 5 s here with your
1e7 to 1e6 example.

On Tue, Apr 5, 2016 at 2:09 PM, mpc wrote:

> This wasn't intended to be a histogram, but you're right in that it would be
> much better if I can just go through each point once and bin the results;
> that makes more sense, thanks!
>
> --
> View this message in context:
> http://numpy-discussion.10968.n7.nabble.com/Multidimension-array-access-in-C-via-Python-API-tp42710p42733.html
> Sent from the Numpy-discussion mailing list archive at Nabble.com.
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> https://mail.scipy.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ewm at redtetrahedron.org Tue Apr 5 15:42:50 2016
From: ewm at redtetrahedron.org (Eric Moore)
Date: Tue, 5 Apr 2016 15:42:50 -0400
Subject: [Numpy-discussion] Multidimension array access in C via Python API
In-Reply-To:
References: <1459791345159-42710.post@n7.nabble.com>
 <1459799957089-42715.post@n7.nabble.com>
 <1459870779508-42719.post@n7.nabble.com>
 <1459874894454-42726.post@n7.nabble.com>
 <1459879780918-42733.post@n7.nabble.com>
Message-ID:

Eh. The order of the outputs will be different than in your code, if that
makes a difference.
On Tue, Apr 5, 2016 at 3:31 PM, Eric Moore wrote: > def reduce_data(buffer, resolution): > thinned_buffer = np.zeros((resolution**3, 3)) > > min_xyz = buffer.min(axis=0) > max_xyz = buffer.max(axis=0) > delta_xyz = max_xyz - min_xyz > > inds_xyz = np.floor(resolution * (buffer - min_xyz) / > delta_xyz).astype(int) > > # handle values right at the max > inds_xyz[inds_xyz == resolution] -= 1 > > # covert to linear indices so that we can use np.add.at > inds_lin = inds_xyz[:,0] > inds_lin += inds_xyz[:,1] * resolution > inds_lin += inds_xyz[:,2] * resolution**2 > > np.add.at(thinned_buffer, inds_lin, buffer) > counts = np.bincount(inds_lin, minlength=resolution**3) > > thinned_buffer[counts != 0, :] /= counts[counts != 0, None] > return thinned_buffer > > > The bulk of the time is spent in np.add.at, so just over 5 s here with > your 1e7 to 1e6 example. > > On Tue, Apr 5, 2016 at 2:09 PM, mpc wrote: > >> This wasn't intended to be a histogram, but you're right in that it would >> be >> much better if I can just go through each point once and bin the results, >> that makes more sense, thanks! >> >> >> >> -- >> View this message in context: >> http://numpy-discussion.10968.n7.nabble.com/Multidimension-array-access-in-C-via-Python-API-tp42710p42733.html >> Sent from the Numpy-discussion mailing list archive at Nabble.com. >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From matt.p.conte at gmail.com Tue Apr 5 14:56:49 2016 From: matt.p.conte at gmail.com (mpc) Date: Tue, 5 Apr 2016 11:56:49 -0700 (MST) Subject: [Numpy-discussion] Multidimension array access in C via Python API In-Reply-To: References: <1459791345159-42710.post@n7.nabble.com> <1459799957089-42715.post@n7.nabble.com> <1459870779508-42719.post@n7.nabble.com> <1459874894454-42726.post@n7.nabble.com> <1459879780918-42733.post@n7.nabble.com> Message-ID: <1459882609441-42736.post@n7.nabble.com> That's a very clever approach. I also found a way using the pandas library with the groupby function. points_df = pandas.DataFrame.from_records(buffer) new_buffer = points_df.groupby(qcut(points_df.index, resolution**3)).mean() I did the original approach with all of those loops because I need a way to measure and report on progress, and although these advanced functions are great, they are still asynchronous and blocking and provides no means of indicating progress. Still cool though, thanks! -- View this message in context: http://numpy-discussion.10968.n7.nabble.com/Multidimension-array-access-in-C-via-Python-API-tp42710p42736.html Sent from the Numpy-discussion mailing list archive at Nabble.com. From toddrjen at gmail.com Tue Apr 5 22:11:14 2016 From: toddrjen at gmail.com (Todd) Date: Tue, 5 Apr 2016 22:11:14 -0400 Subject: [Numpy-discussion] ndarray.T2 for 2D transpose In-Reply-To: References: Message-ID: When you try to transpose a 1D array, it does nothing. This is the correct behavior, since it transposing a 1D array is meaningless. However, this can often lead to unexpected errors since this is rarely what you want. You can convert the array to 2D, using `np.atleast_2d` or `arr[None]`, but this makes simple linear algebra computations more difficult. I propose adding an argument to transpose, perhaps called `expand` or `expanddim`, which if `True` (it is `False` by default) will force the array to be at least 2D. 
A shortcut property, `ndarray.T2`, would be the same as `ndarray.transpose(True)`. -------------- next part -------------- An HTML attachment was scrubbed... URL: From alan.isaac at gmail.com Tue Apr 5 22:26:16 2016 From: alan.isaac at gmail.com (Alan Isaac) Date: Tue, 5 Apr 2016 22:26:16 -0400 Subject: [Numpy-discussion] ndarray.T2 for 2D transpose In-Reply-To: References: Message-ID: <570473C8.1050209@gmail.com> On 4/5/2016 10:11 PM, Todd wrote: > When you try to transpose a 1D array, it does nothing. This is the > correct behavior, since it transposing a 1D array is meaningless. > However, this can often lead to unexpected errors since this is rarely > what you want. You can convert the array to 2D, using `np.atleast_2d` > or `arr[None]`, but this makes simple linear algebra computations more > difficult. > > I propose adding an argument to transpose, perhaps called `expand` or > `expanddim`, which if `True` (it is `False` by default) will force the > array to be at least 2D. A shortcut property, `ndarray.T2`, would be > the same as `ndarray.transpose(True)`. Use `dot`. E.g., m.dot(a) hth, Alan Isaac From jni.soma at gmail.com Tue Apr 5 22:49:45 2016 From: jni.soma at gmail.com (Juan Nunez-Iglesias) Date: Wed, 6 Apr 2016 12:49:45 +1000 Subject: [Numpy-discussion] ndarray.T2 for 2D transpose In-Reply-To: <570473C8.1050209@gmail.com> References: <570473C8.1050209@gmail.com> Message-ID: Todd, Would you consider a 1D array to be a row vector or a column vector for the purposes of transposition? The "correct" answer is not clear to me. Juan. On Wed, Apr 6, 2016 at 12:26 PM, Alan Isaac wrote: > On 4/5/2016 10:11 PM, Todd wrote: > >> When you try to transpose a 1D array, it does nothing. This is the >> correct behavior, since it transposing a 1D array is meaningless. >> However, this can often lead to unexpected errors since this is rarely >> what you want. You can convert the array to 2D, using `np.atleast_2d` >> or `arr[None]`, but this makes simple linear algebra computations more >> difficult. >> >> I propose adding an argument to transpose, perhaps called `expand` or >> `expanddim`, which if `True` (it is `False` by default) will force the >> array to be at least 2D. A shortcut property, `ndarray.T2`, would be >> the same as `ndarray.transpose(True)`. >> > > > > Use `dot`. E.g., > m.dot(a) > > hth, > Alan Isaac > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Tue Apr 5 23:14:23 2016 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 5 Apr 2016 20:14:23 -0700 Subject: [Numpy-discussion] ndarray.T2 for 2D transpose In-Reply-To: References: Message-ID: On Tue, Apr 5, 2016 at 7:11 PM, Todd wrote: > When you try to transpose a 1D array, it does nothing. This is the correct > behavior, since it transposing a 1D array is meaningless. However, this can > often lead to unexpected errors since this is rarely what you want. You can > convert the array to 2D, using `np.atleast_2d` or `arr[None]`, but this > makes simple linear algebra computations more difficult. > > I propose adding an argument to transpose, perhaps called `expand` or > `expanddim`, which if `True` (it is `False` by default) will force the array > to be at least 2D. A shortcut property, `ndarray.T2`, would be the same as > `ndarray.transpose(True)`. 
An alternative that was mentioned in the bug tracker
(https://github.com/numpy/numpy/issues/7495), possibly by me, would be
to have arr.T2 act as a stacked-transpose operator, i.e. treat an arr
with shape (..., n, m) as being a (...)-shaped stack of (n, m)
matrices, and transpose each of those matrices, so the output shape is
(..., m, n). And since this operation intrinsically acts on arrays
with shape (..., n, m) then trying to apply it to a 0d or 1d array
would be an error.

-n

--
Nathaniel J. Smith -- https://vorpus.org

From olivier.grisel at ensta.org Wed Apr 6 05:04:19 2016
From: olivier.grisel at ensta.org (Olivier Grisel)
Date: Wed, 6 Apr 2016 11:04:19 +0200
Subject: [Numpy-discussion] Using OpenBLAS for manylinux wheels
In-Reply-To:
References:
Message-ID:

2016-04-05 19:44 GMT+02:00 Nathaniel Smith :
>
>> I propose to hold off distributing the OpenBLAS wheels until the
>> OpenBLAS tests are clean on the OpenBLAS buildbots - any objections?
>
> Alternatively, would it make sense to add a local patch to our openblas
> builds to blacklist the piledriver kernel and then distribute them now? (I'm
> not immediately sure what would be involved in doing this but it seems
> unlikely that it would require anything tricky?)

I tried to force the use of the NEHALEM or the BARCELONA driver on a
PILEDRIVER AMD box, and while it fixes the original test failure in isolve,
it causes this other scipy test to fail:

======================================================================
FAIL: test_nanmedian_all_axis (test_stats.TestNanFunc)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/site-packages/scipy/stats/tests/test_stats.py",
line 242, in test_nanmedian_all_axis
    assert_equal(len(w), 4)
  File "/usr/local/lib/python2.7/site-packages/numpy/testing/utils.py",
line 375, in assert_equal
    raise AssertionError(msg)
AssertionError:
Items are not equal:
 ACTUAL: 1
 DESIRED: 4

--
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel

From contrebasse at gmail.com Wed Apr 6 05:51:46 2016
From: contrebasse at gmail.com (Joseph Martinot-Lagarde)
Date: Wed, 6 Apr 2016 09:51:46 +0000 (UTC)
Subject: [Numpy-discussion] ndarray.T2 for 2D transpose
References:
Message-ID:

Nathaniel Smith <njs at pobox.com> writes:

> An alternative that was mentioned in the bug tracker
> (https://github.com/numpy/numpy/issues/7495), possibly by me, would be
> to have arr.T2 act as a stacked-transpose operator, i.e. treat an arr
> with shape (..., n, m) as being a (...)-shaped stack of (n, m)
> matrices, and transpose each of those matrices, so the output shape is
> (..., m, n). And since this operation intrinsically acts on arrays
> with shape (..., n, m) then trying to apply it to a 0d or 1d array
> would be an error.

I think that the problem is not that it doesn't raise an error for 1D
arrays, but that it doesn't do anything useful to 1D arrays. Raising an
error would change nothing about the way transpose is used now.

For a 1D array a of shape (N,), I expect a.T2 to be of shape (N, 1), which
is useful when writing formulas, and clearer than a[None].T. Actually I'd
like a.T to do that already, but I guess backward compatibility is more
important.
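Both behaviours discussed in this thread can be approximated today with
plain numpy -- a quick sketch of existing idioms, not a proposed API:

import numpy as np

a = np.arange(3)            # shape (3,); a.T is a no-op
col = a[:, np.newaxis]      # shape (3, 1), what Joseph expects from a.T2
row = a[np.newaxis, :]      # shape (1, 3)

# the stacked-transpose reading of T2, for ndim >= 2:
m = np.arange(24).reshape(2, 3, 4)
mt = np.swapaxes(m, -1, -2)  # shape (2, 4, 3)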
From njs at pobox.com Wed Apr 6 06:26:00 2016 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 6 Apr 2016 03:26:00 -0700 Subject: [Numpy-discussion] Using OpenBLAS for manylinux wheels In-Reply-To: References: Message-ID: On Wed, Apr 6, 2016 at 2:04 AM, Olivier Grisel wrote: > 2016-04-05 19:44 GMT+02:00 Nathaniel Smith : >> >>> I propose to hold off distributing the OpenBLAS wheels until the >>> OpenBLAS tests are clean on the OpenBLAS buildbots - any objections? >> >> Alternatively, would it make sense to add a local patch to our openblas >> builds to blacklist the piledriver kernel and then distribute them now? (I'm >> not immediately sure what would be involved in doing this but it seems >> unlikely that it would require anything tricky?) > > I tried to force use the NEHALEM or the BARCELONA driver on a PILEDRIVER > AMD box and while it fixes the original test failure in isolve it > causes this other > scipy test to fail: > > ====================================================================== > FAIL: test_nanmedian_all_axis (test_stats.TestNanFunc) > ---------------------------------------------------------------------- > Traceback (most recent call last): > File "/usr/local/lib/python2.7/site-packages/scipy/stats/tests/test_stats.py", > line 242, in test_nanmedian_all_axis > assert_equal(len(w), 4) > File "/usr/local/lib/python2.7/site-packages/numpy/testing/utils.py", > line 375, in assert_equal > raise AssertionError(msg) > AssertionError: > Items are not equal: > ACTUAL: 1 > DESIRED: 4 I'm reading this email next to https://github.com/xianyi/OpenBLAS-CI/issues/10#issuecomment-206195714 and I'm confused :-). -n -- Nathaniel J. Smith -- https://vorpus.org From mmwoodman at gmail.com Wed Apr 6 06:59:32 2016 From: mmwoodman at gmail.com (Marmaduke Woodman) Date: Wed, 6 Apr 2016 12:59:32 +0200 Subject: [Numpy-discussion] Build NumPy against MATLAB BLAS & Lapack Message-ID: hi I'm trying to provide a Python numerical library to colleagues who are MATLAB users, since recent MATLAB versions include an official method for calling Python [1], and I'm struggling to build NumPy which is compatible with MATLAB's libraries. My current site.cfg is here https://gist.github.com/maedoc/a41cb253011ad55edacca560a66dff76 I build from Git master with BLAS=none LAPACK=none ATLAS=none python setup.py install Currently, most of NumPy works, except linear algebra calls, like pinv, which segfault inside the MKL: np.__config__.show() shows NumPy thinks it linked against OpenBLAS with a CBLAS ABI, not a standard Fortran BLAS api. Is there a way to force the NumPy build to ignore OpenBLAS and assume it's calling the standard Fortran style BLAS functions? Thanks in advance for any comments or suggestions, Marmaduke [1] http://fr.mathworks.com/help/matlab/call-python-libraries.html From olivier.grisel at ensta.org Wed Apr 6 07:19:16 2016 From: olivier.grisel at ensta.org (Olivier Grisel) Date: Wed, 6 Apr 2016 13:19:16 +0200 Subject: [Numpy-discussion] Using OpenBLAS for manylinux wheels In-Reply-To: References: Message-ID: Yes sorry I forgot to update the thread. Actually I am no longer sure how I go this error. I am re-running the full test suite because I cannot reproduce it when running the test_stats.py module alone. -- Olivier From ndbecker2 at gmail.com Wed Apr 6 09:18:24 2016 From: ndbecker2 at gmail.com (Neal Becker) Date: Wed, 06 Apr 2016 09:18:24 -0400 Subject: [Numpy-discussion] mtrand.c update 1.11 breaks my crappy code Message-ID: I have C++ code that tries to share the mtrand state. 
It unfortunately depends on the layout of RandomState which used to be: struct __pyx_obj_6mtrand_RandomState { PyObject_HEAD rk_state *internal_state; PyObject *lock; }; But with 1.11 it's: struct __pyx_obj_6mtrand_RandomState { PyObject_HEAD struct __pyx_vtabstruct_6mtrand_RandomState *__pyx_vtab; rk_state *internal_state; PyObject *lock; PyObject *state_address; }; So 1. Why the change? 2. How can I write portable code? From robert.kern at gmail.com Wed Apr 6 09:29:57 2016 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 6 Apr 2016 14:29:57 +0100 Subject: [Numpy-discussion] mtrand.c update 1.11 breaks my crappy code In-Reply-To: References: Message-ID: On Wed, Apr 6, 2016 at 2:18 PM, Neal Becker wrote: > > I have C++ code that tries to share the mtrand state. It unfortunately > depends on the layout of RandomState which used to be: > > struct __pyx_obj_6mtrand_RandomState { > PyObject_HEAD > rk_state *internal_state; > PyObject *lock; > }; > > But with 1.11 it's: > struct __pyx_obj_6mtrand_RandomState { > PyObject_HEAD > struct __pyx_vtabstruct_6mtrand_RandomState *__pyx_vtab; > rk_state *internal_state; > PyObject *lock; > PyObject *state_address; > }; > > So > 1. Why the change? > 2. How can I write portable code? There is no C API to RandomState at this time, stable, portable or otherwise. It's all private implementation detail. If you would like a stable and portable C API for RandomState, you will need to contribute one using PyCapsules to expose the underlying rk_state* pointer. https://docs.python.org/2.7/c-api/capsule.html -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... URL: From olivier.grisel at ensta.org Wed Apr 6 09:47:00 2016 From: olivier.grisel at ensta.org (Olivier Grisel) Date: Wed, 6 Apr 2016 15:47:00 +0200 Subject: [Numpy-discussion] Using OpenBLAS for manylinux wheels In-Reply-To: References: Message-ID: I updated the issue: https://github.com/xianyi/OpenBLAS-CI/issues/10#issuecomment-206195714 The random test_nanmedian_all_axis failure is unrelated to openblas and should be ignored. -- Olivier From ndbecker2 at gmail.com Wed Apr 6 09:49:36 2016 From: ndbecker2 at gmail.com (Neal Becker) Date: Wed, 06 Apr 2016 09:49:36 -0400 Subject: [Numpy-discussion] mtrand.c update 1.11 breaks my crappy code References: Message-ID: Robert Kern wrote: > On Wed, Apr 6, 2016 at 2:18 PM, Neal Becker wrote: >> >> I have C++ code that tries to share the mtrand state. It unfortunately >> depends on the layout of RandomState which used to be: >> >> struct __pyx_obj_6mtrand_RandomState { >> PyObject_HEAD >> rk_state *internal_state; >> PyObject *lock; >> }; >> >> But with 1.11 it's: >> struct __pyx_obj_6mtrand_RandomState { >> PyObject_HEAD >> struct __pyx_vtabstruct_6mtrand_RandomState *__pyx_vtab; >> rk_state *internal_state; >> PyObject *lock; >> PyObject *state_address; >> }; >> >> So >> 1. Why the change? >> 2. How can I write portable code? > > There is no C API to RandomState at this time, stable, portable or > otherwise. It's all private implementation detail. If you would like a > stable and portable C API for RandomState, you will need to contribute one > using PyCapsules to expose the underlying rk_state* pointer. > > https://docs.python.org/2.7/c-api/capsule.html > > -- > Robert Kern I don't see how pycapsule helps here. What I need is, my C++ code receives a RandomState object. 
I need to call e.g., rk_random, passing the pointer to rk_state - code looks like this; RandomState* r = (RandomState*)(rs.ptr()); // result_type buffer; // rk_fill ((void*)&buffer, sizeof(buffer), r->internal_state); if (sizeof(result_type) == sizeof (uint64_t)) return rk_ulong (r->internal_state); else if (sizeof(result_type) == sizeof (uint32_t)) return rk_random (r->internal_state); From ndbecker2 at gmail.com Wed Apr 6 10:02:37 2016 From: ndbecker2 at gmail.com (Neal Becker) Date: Wed, 06 Apr 2016 10:02:37 -0400 Subject: [Numpy-discussion] mtrand.c update 1.11 breaks my crappy code References: Message-ID: Neal Becker wrote: > Robert Kern wrote: > >> On Wed, Apr 6, 2016 at 2:18 PM, Neal Becker wrote: >>> >>> I have C++ code that tries to share the mtrand state. It unfortunately >>> depends on the layout of RandomState which used to be: >>> >>> struct __pyx_obj_6mtrand_RandomState { >>> PyObject_HEAD >>> rk_state *internal_state; >>> PyObject *lock; >>> }; >>> >>> But with 1.11 it's: >>> struct __pyx_obj_6mtrand_RandomState { >>> PyObject_HEAD >>> struct __pyx_vtabstruct_6mtrand_RandomState *__pyx_vtab; >>> rk_state *internal_state; >>> PyObject *lock; >>> PyObject *state_address; >>> }; >>> >>> So >>> 1. Why the change? >>> 2. How can I write portable code? >> >> There is no C API to RandomState at this time, stable, portable or >> otherwise. It's all private implementation detail. If you would like a >> stable and portable C API for RandomState, you will need to contribute >> one using PyCapsules to expose the underlying rk_state* pointer. >> >> https://docs.python.org/2.7/c-api/capsule.html >> >> -- >> Robert Kern > > I don't see how pycapsule helps here. What I need is, my C++ code > receives > a RandomState object. I need to call e.g., rk_random, passing the pointer > to rk_state - code looks like this; > > RandomState* r = (RandomState*)(rs.ptr()); > // result_type buffer; > // rk_fill ((void*)&buffer, sizeof(buffer), r->internal_state); > if (sizeof(result_type) == sizeof (uint64_t)) > return rk_ulong (r->internal_state); > else if (sizeof(result_type) == sizeof (uint32_t)) > return rk_random (r->internal_state); Nevermind, I see it's described here: https://docs.python.org/2.7/extending/extending.html#using-capsules From njs at pobox.com Wed Apr 6 10:31:32 2016 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 6 Apr 2016 07:31:32 -0700 Subject: [Numpy-discussion] mtrand.c update 1.11 breaks my crappy code In-Reply-To: References: Message-ID: On Apr 6, 2016 06:31, "Robert Kern" wrote: > > On Wed, Apr 6, 2016 at 2:18 PM, Neal Becker wrote: > > > > I have C++ code that tries to share the mtrand state. It unfortunately > > depends on the layout of RandomState which used to be: > > > > struct __pyx_obj_6mtrand_RandomState { > > PyObject_HEAD > > rk_state *internal_state; > > PyObject *lock; > > }; > > > > But with 1.11 it's: > > struct __pyx_obj_6mtrand_RandomState { > > PyObject_HEAD > > struct __pyx_vtabstruct_6mtrand_RandomState *__pyx_vtab; > > rk_state *internal_state; > > PyObject *lock; > > PyObject *state_address; > > }; > > > > So > > 1. Why the change? > > 2. How can I write portable code? > > There is no C API to RandomState at this time, stable, portable or otherwise. It's all private implementation detail. If you would like a stable and portable C API for RandomState, you will need to contribute one using PyCapsules to expose the underlying rk_state* pointer. 
> > https://docs.python.org/2.7/c-api/capsule.html I'm very wary about the idea of exposing the rk_state pointer at all. We could have a C API to random but my strong preference would be for something that only exposes opaque function calls that take a RandomState and return some random numbers, and getting even this right in a clean and maintainable way isn't trivial. Obviously another option is to call one of the python methods to get an ndarray and read out its memory contents. If you can do this in a batch (fetching a bunch of numbers for each call) to amortize the additional overhead of going through python, then it might work fine. (Python overhead is not actually that much -- mostly just having to do a handful of extra allocations.) Or, possibly the best option, one could use one of the many fine C random libraries inside your code, and if you need your code to be deterministic given a RandomState you could derive your state initialization from a single call to some RandomState method. -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From ndbecker2 at gmail.com Wed Apr 6 11:17:20 2016 From: ndbecker2 at gmail.com (Neal Becker) Date: Wed, 06 Apr 2016 11:17:20 -0400 Subject: [Numpy-discussion] mtrand.c update 1.11 breaks my crappy code References: Message-ID: Nathaniel Smith wrote: > On Apr 6, 2016 06:31, "Robert Kern" wrote: >> >> On Wed, Apr 6, 2016 at 2:18 PM, Neal Becker wrote: >> > >> > I have C++ code that tries to share the mtrand state. It unfortunately >> > depends on the layout of RandomState which used to be: >> > >> > struct __pyx_obj_6mtrand_RandomState { >> > PyObject_HEAD >> > rk_state *internal_state; >> > PyObject *lock; >> > }; >> > >> > But with 1.11 it's: >> > struct __pyx_obj_6mtrand_RandomState { >> > PyObject_HEAD >> > struct __pyx_vtabstruct_6mtrand_RandomState *__pyx_vtab; >> > rk_state *internal_state; >> > PyObject *lock; >> > PyObject *state_address; >> > }; >> > >> > So >> > 1. Why the change? >> > 2. How can I write portable code? >> >> There is no C API to RandomState at this time, stable, portable or > otherwise. It's all private implementation detail. If you would like a > stable and portable C API for RandomState, you will need to contribute one > using PyCapsules to expose the underlying rk_state* pointer. >> >> https://docs.python.org/2.7/c-api/capsule.html > > I'm very wary about the idea of exposing the rk_state pointer at all. We > could have a C API to random but my strong preference would be for > something that only exposes opaque function calls that take a RandomState > and return some random numbers, and getting even this right in a clean and > maintainable way isn't trivial. > > Obviously another option is to call one of the python methods to get an > ndarray and read out its memory contents. If you can do this in a batch > (fetching a bunch of numbers for each call) to amortize the additional > overhead of going through python, then it might work fine. (Python > overhead is not actually that much -- mostly just having to do a handful > of extra allocations.) > > Or, possibly the best option, one could use one of the many fine C random > libraries inside your code, and if you need your code to be deterministic > given a RandomState you could derive your state initialization from a > single call to some RandomState method. > > -n I prefer to use a single instance of a RandomState so that there are guarantees about the independence of streams generated from python random functions, and from my c++ code. 
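A minimal Python sketch of the two workarounds suggested above --
batch-fetching through the public API, and deriving a child seed from
the single shared RandomState. The names here are illustrative only;
none of this is an existing NumPy C API:

import numpy as np

rs = np.random.RandomState(12345)  # the single shared instance

# Option 1: batch-fetch through the public API and hand the buffer to
# the compiled code, amortizing the per-call Python overhead.
buf = rs.random_sample(100000)

# Option 2: consume exactly one draw from the shared stream to seed an
# independent generator dedicated to the C++ side.
child_seed = rs.randint(0, 2**31)
cpp_rs = np.random.RandomState(child_seed)
noise = cpp_rs.standard_normal(1000)
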
True, there are simpler approaches - but I'm a purist. Yes, if there were an api use mkl random functions from a RandomState object that would solve my problem. Or even if there was an API to get a internal_state pointer from a RandomState object. From robert.kern at gmail.com Wed Apr 6 11:27:49 2016 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 6 Apr 2016 16:27:49 +0100 Subject: [Numpy-discussion] mtrand.c update 1.11 breaks my crappy code In-Reply-To: References: Message-ID: On Wed, Apr 6, 2016 at 4:17 PM, Neal Becker wrote: > I prefer to use a single instance of a RandomState so that there are > guarantees about the independence of streams generated from python random > functions, and from my c++ code. True, there are simpler approaches - but > I'm a purist. Consider using PRNGs that actually expose truly independent streams instead of a single shared stream: https://github.com/bashtage/ng-numpy-randomstate -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Wed Apr 6 11:44:19 2016 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Wed, 6 Apr 2016 08:44:19 -0700 Subject: [Numpy-discussion] ndarray.T2 for 2D transpose In-Reply-To: References: Message-ID: <4414284980401594570@unknownmsgid> > I think that the problem is not that it doesn't raise an error for 1D array, > but that it doesn't do anything useful to 1D arrays. Raising an error would > change nothing to the way transpose is used now. No, but it would make it clear that you can't expect transpose to make a 1D array into a2D array. > For a 1D array a of shape (N,), I expect a.T2 to be of shape (N, 1), Why not (1,N)? -- it is not well defined, though I suppose it's not so bad to establish a convention that a 1-D array is a "row vector" rather than a "column vector". But the truth is that Numpy arrays are arrays, not matrices and vectors. The "right" way to do this is to properly extend and support the matrix object, adding row and column vector objects, and then it would be clear. But while there has been a lot of discussion about that in the past, the fact is that no one wants it bad enough to write the code. So I think it's better to keep Numpy arrays "pure", and if you want to change the rank of an array, you do so explicitly. I use: A_vector.shape = (-1,1) BTW, if transposing a (N,) array gives you a (N,1) array, what does transposing a (N,1) array give you? (1,N) or (N,) ? -CHB > which > is useful when writing formulas, and clearer that a[None].T. Actually I'd > like a.T to do that alreadu, but I guess backward compatibility is more > important. > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion From insertinterestingnamehere at gmail.com Wed Apr 6 13:10:48 2016 From: insertinterestingnamehere at gmail.com (Ian Henriksen) Date: Wed, 06 Apr 2016 17:10:48 +0000 Subject: [Numpy-discussion] ndarray.T2 for 2D transpose In-Reply-To: References: Message-ID: On Tue, Apr 5, 2016 at 9:14 PM Nathaniel Smith wrote: > On Tue, Apr 5, 2016 at 7:11 PM, Todd wrote: > > When you try to transpose a 1D array, it does nothing. This is the > correct > > behavior, since it transposing a 1D array is meaningless. However, this > can > > often lead to unexpected errors since this is rarely what you want. 
You > can > > convert the array to 2D, using `np.atleast_2d` or `arr[None]`, but this > > makes simple linear algebra computations more difficult. > > > > I propose adding an argument to transpose, perhaps called `expand` or > > `expanddim`, which if `True` (it is `False` by default) will force the > array > > to be at least 2D. A shortcut property, `ndarray.T2`, would be the same > as > > `ndarray.transpose(True)`. > > An alternative that was mentioned in the bug tracker > (https://github.com/numpy/numpy/issues/7495), possibly by me, would be > to have arr.T2 act as a stacked-transpose operator, i.e. treat an arr > with shape (..., n, m) as being a (...)-shaped stack of (n, m) > matrices, and transpose each of those matrices, so the output shape is > (..., m, n). And since this operation intrinsically acts on arrays > with shape (..., n, m) then trying to apply it to a 0d or 1d array > would be an error. > > -n > > -- > Nathaniel J. Smith -- https://vorpus.org > I agree that we could really use a shorter syntax for a broadcasting transpose. Swapaxes is far too verbose for something that should be so common now that we've introduced the new matmul operator. That said, the fact that 1-D vectors are conceptually so similar to row vectors makes transposing a 1-D array a potential pitfall for a lot of people. When broadcasting along the leading dimension, a (n) shaped array and a (1, n) shaped array are already treated as equivalent. Treating a 1-D array like a row vector for transposes seems like a reasonable way to make things more intuitive for users. Rather than raising an error for arrays with fewer than two dimensions, the new syntax could be made equivalent to np.swapaxes(np.atleast2d(arr), -1, -2). From the standpoint of broadcasting semantics, using atleast2d can be viewed as allowing broadcasting along the inner dimensions. Though that's not a common thing, at least there's a precedent. The only downside I can see with allowing T2 to call atleast2d is that it would make things like A @ b and A @ b.T2 equivalent when B is one-dimensional. That's already the case with our current syntax though. There's some inherent design tension between the fact that broadcasting usually prepends ones to fill in missing dimensions and the fact that our current linear algebra semantics often treat rows as columns, but making 1-D arrays into rows makes a lot of sense as far as user experience goes. Great ideas everyone! Best, -Ian Henriksen -------------- next part -------------- An HTML attachment was scrubbed... URL: From toddrjen at gmail.com Wed Apr 6 13:39:49 2016 From: toddrjen at gmail.com (Todd) Date: Wed, 6 Apr 2016 13:39:49 -0400 Subject: [Numpy-discussion] ndarray.T2 for 2D transpose In-Reply-To: References: <570473C8.1050209@gmail.com> Message-ID: I would make `arr.T2` the same as `np.atleast_2d(arr).T`. So a 1D array would act as a row vector, since that is already the convention for coercing 1D arrays to 2D. On Tue, Apr 5, 2016 at 10:49 PM, Juan Nunez-Iglesias wrote: > Todd, > > Would you consider a 1D array to be a row vector or a column vector for > the purposes of transposition? The "correct" answer is not clear to me. > > Juan. > > On Wed, Apr 6, 2016 at 12:26 PM, Alan Isaac wrote: > >> On 4/5/2016 10:11 PM, Todd wrote: >> >>> When you try to transpose a 1D array, it does nothing. This is the >>> correct behavior, since it transposing a 1D array is meaningless. >>> However, this can often lead to unexpected errors since this is rarely >>> what you want. 
You can convert the array to 2D, using `np.atleast_2d` >>> or `arr[None]`, but this makes simple linear algebra computations more >>> difficult. >>> >>> I propose adding an argument to transpose, perhaps called `expand` or >>> `expanddim`, which if `True` (it is `False` by default) will force the >>> array to be at least 2D. A shortcut property, `ndarray.T2`, would be >>> the same as `ndarray.transpose(True)`. >>> >> >> >> >> Use `dot`. E.g., >> m.dot(a) >> >> hth, >> Alan Isaac >> >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From toddrjen at gmail.com Wed Apr 6 13:43:15 2016 From: toddrjen at gmail.com (Todd) Date: Wed, 6 Apr 2016 13:43:15 -0400 Subject: [Numpy-discussion] ndarray.T2 for 2D transpose In-Reply-To: References: Message-ID: On Tue, Apr 5, 2016 at 11:14 PM, Nathaniel Smith wrote: > On Tue, Apr 5, 2016 at 7:11 PM, Todd wrote: > > When you try to transpose a 1D array, it does nothing. This is the > correct > > behavior, since it transposing a 1D array is meaningless. However, this > can > > often lead to unexpected errors since this is rarely what you want. You > can > > convert the array to 2D, using `np.atleast_2d` or `arr[None]`, but this > > makes simple linear algebra computations more difficult. > > > > I propose adding an argument to transpose, perhaps called `expand` or > > `expanddim`, which if `True` (it is `False` by default) will force the > array > > to be at least 2D. A shortcut property, `ndarray.T2`, would be the same > as > > `ndarray.transpose(True)`. > > An alternative that was mentioned in the bug tracker > (https://github.com/numpy/numpy/issues/7495), possibly by me, would be > to have arr.T2 act as a stacked-transpose operator, i.e. treat an arr > with shape (..., n, m) as being a (...)-shaped stack of (n, m) > matrices, and transpose each of those matrices, so the output shape is > (..., m, n). And since this operation intrinsically acts on arrays > with shape (..., n, m) then trying to apply it to a 0d or 1d array > would be an error. > > My intention was to make linear algebra operations easier in numpy. With the @ operator available, it is now very easy to do basic linear algebra on arrays without needing the matrix class. But getting an array into a state where you can use the @ operator effectively is currently pretty verbose and confusing. I was trying to find a way to make the @ operator more useful. -------------- next part -------------- An HTML attachment was scrubbed... URL: From toddrjen at gmail.com Wed Apr 6 13:47:14 2016 From: toddrjen at gmail.com (Todd) Date: Wed, 6 Apr 2016 13:47:14 -0400 Subject: [Numpy-discussion] ndarray.T2 for 2D transpose In-Reply-To: <4414284980401594570@unknownmsgid> References: <4414284980401594570@unknownmsgid> Message-ID: On Wed, Apr 6, 2016 at 11:44 AM, Chris Barker - NOAA Federal < chris.barker at noaa.gov> wrote: > But the truth is that Numpy arrays are arrays, not matrices and vectors. > > The "right" way to do this is to properly extend and support the > matrix object, adding row and column vector objects, and then it would > be clear. 
But while there has been a lot of discussion about that in > the past, the fact is that no one wants it bad enough to write the > code. > > So I think it's better to keep Numpy arrays "pure", and if you want to > change the rank of an array, you do so explicitly. > I think that cat is already out of the bag. As long as you can do matrix multiplication on arrays using the @ operator, I think they aren't really "pure" anymore. > BTW, if transposing a (N,) array gives you a (N,1) array, what does > transposing a (N,1) array give you? > > (1,N) or (N,) ? > My suggestion is that this explicitly increases the number of dimensions to at least 2. The result will always have at least 2 dimensions. So 0D -> 2D, 1D -> 2D, 2D -> 2D, 3D -> 3D, 4D -> 4D, etc. So this would be equivalent to the existing `atleast_2d` function. -------------- next part -------------- An HTML attachment was scrubbed... URL: From alan.isaac at gmail.com Wed Apr 6 15:05:11 2016 From: alan.isaac at gmail.com (Alan Isaac) Date: Wed, 6 Apr 2016 15:05:11 -0400 Subject: [Numpy-discussion] ndarray.T2 for 2D transpose In-Reply-To: References: <4414284980401594570@unknownmsgid> Message-ID: <57055DE7.8050504@gmail.com> On 4/6/2016 1:47 PM, Todd wrote: > My suggestion is that this explicitly increases the number of > dimensions to at least 2. The result will always have at least 2 > dimensions. So 0D -> 2D, 1D -> 2D, 2D -> 2D, 3D -> 3D, 4D -> 4D, etc. > So this would be equivalent to the existing `atleast_2d` function. I truly hope nothing is done like this. But underlying the proposal is apparently the idea that there be an attribute equivalent to `atleast_2d`. Then call it `d2p`. You can now have `a.d2p.T` which is a lot more explicit and general than say `a.T2`, while requiring only 3 more keystrokes. (It's still horribly ugly, though, and I hope this too is dismissed.) Alan Isaac From chris.barker at noaa.gov Wed Apr 6 16:50:34 2016 From: chris.barker at noaa.gov (Chris Barker) Date: Wed, 6 Apr 2016 13:50:34 -0700 Subject: [Numpy-discussion] ndarray.T2 for 2D transpose In-Reply-To: References: <4414284980401594570@unknownmsgid> Message-ID: On Wed, Apr 6, 2016 at 10:47 AM, Todd wrote: > > I think that cat is already out of the bag. As long as you can do matrix > multiplication on arrays using the @ operator, I think they aren't really > "pure" anymore. > not really -- you still need to use arrays that are the "correct" shape. Ideally, a row vector is (1, N) and a column vector is (N,1). Though I know there are places that a 1-D array is treated as a column vector. > > >> BTW, if transposing a (N,) array gives you a (N,1) array, what does >> transposing a (N,1) array give you? >> >> (1,N) or (N,) ? >> > > My suggestion is that this explicitly increases the number of dimensions > to at least 2. The result will always have at least 2 dimensions. So 0D > -> 2D, 1D -> 2D, 2D -> 2D, 3D -> 3D, 4D -> 4D, etc. So this would be > equivalent to the existing `atleast_2d` function. > my point is that for 2D arrays: arr.T.T == arr, but in this case, we would be making a one way street: when you transpose a 1D array, you treat it as a row vector, and return a "column vector" -- a (N,1) array. But when you transpose a "column vector" to get a row vector, you get a (1,N) array, not a (N) array. So I think we need to either have proper row and column vectors (to go with matrices) or require people to create the appropriate 2D arrays. Perhaps there should be an easier more obvious way to spell "make this a column vector", but I don't think .T is it. 
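For reference, a quick sketch of how the current explicit spellings
behave (resulting shapes shown in comments; nothing here is new API):

import numpy as np

a = np.arange(4)      # shape (4,)
a.T                   # shape (4,)   -- transpose is a no-op on 1D
a[None, :]            # shape (1, 4) -- explicit row vector
a[:, None]            # shape (4, 1) -- explicit column vector
np.atleast_2d(a)      # shape (1, 4) -- treats 1D input as a row

arr = np.arange(4)
arr.shape = (-1, 1)   # in-place reshape to a (4, 1) column
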
Though arr.shape = (-1,1) has always worked fine for me. -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Wed Apr 6 17:20:51 2016 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 6 Apr 2016 14:20:51 -0700 Subject: [Numpy-discussion] ndarray.T2 for 2D transpose In-Reply-To: References: Message-ID: On Wed, Apr 6, 2016 at 10:43 AM, Todd wrote: > On Tue, Apr 5, 2016 at 11:14 PM, Nathaniel Smith wrote: >> >> On Tue, Apr 5, 2016 at 7:11 PM, Todd wrote: >> > When you try to transpose a 1D array, it does nothing. This is the >> > correct >> > behavior, since it transposing a 1D array is meaningless. However, this >> > can >> > often lead to unexpected errors since this is rarely what you want. You >> > can >> > convert the array to 2D, using `np.atleast_2d` or `arr[None]`, but this >> > makes simple linear algebra computations more difficult. >> > >> > I propose adding an argument to transpose, perhaps called `expand` or >> > `expanddim`, which if `True` (it is `False` by default) will force the >> > array >> > to be at least 2D. A shortcut property, `ndarray.T2`, would be the same >> > as >> > `ndarray.transpose(True)`. >> >> An alternative that was mentioned in the bug tracker >> (https://github.com/numpy/numpy/issues/7495), possibly by me, would be >> to have arr.T2 act as a stacked-transpose operator, i.e. treat an arr >> with shape (..., n, m) as being a (...)-shaped stack of (n, m) >> matrices, and transpose each of those matrices, so the output shape is >> (..., m, n). And since this operation intrinsically acts on arrays >> with shape (..., n, m) then trying to apply it to a 0d or 1d array >> would be an error. >> > > My intention was to make linear algebra operations easier in numpy. With > the @ operator available, it is now very easy to do basic linear algebra on > arrays without needing the matrix class. But getting an array into a state > where you can use the @ operator effectively is currently pretty verbose and > confusing. I was trying to find a way to make the @ operator more useful. Can you elaborate on what you're doing that you find verbose and confusing, maybe paste an example? I've never had any trouble like this doing linear algebra with @ or dot (which have similar semantics for 1d arrays), which is probably just because I've had different use cases, but it's much easier to talk about these things with a concrete example in front of us to put everyone on the same page. -n -- Nathaniel J. Smith -- https://vorpus.org From irvin.probst at ensta-bretagne.fr Thu Apr 7 03:39:14 2016 From: irvin.probst at ensta-bretagne.fr (Irvin Probst) Date: Thu, 7 Apr 2016 09:39:14 +0200 Subject: [Numpy-discussion] ndarray.T2 for 2D transpose In-Reply-To: References: Message-ID: <57060EA2.4070705@ensta-bretagne.fr> On 06/04/2016 04:11, Todd wrote: > > When you try to transpose a 1D array, it does nothing. This is the > correct behavior, since it transposing a 1D array is meaningless. > However, this can often lead to unexpected errors since this is rarely > what you want. You can convert the array to 2D, using `np.atleast_2d` > or `arr[None]`, but this makes simple linear algebra computations more > difficult. 
> > I propose adding an argument to transpose, perhaps called `expand` or > `expanddim`, which if `True` (it is `False` by default) will force the > array to be at least 2D. A shortcut property, `ndarray.T2`, would be > the same as `ndarray.transpose(True)` > Hello, My two cents here, I've seen hundreds of people (literally hundreds) stumbling on this .T trick with 1D vectors when they were trying to do some linear algebra with numpy so at first I had the same feeling as you. But the real issue was that *all* these people were coming from matlab and expected numpy to behave the same way. Once the logic behind 1D vectors was explained it made sense to most of them and there were no more problems. And by the way I don't see any way to tell apart a 1D "row vector" from a 1D "column vector", think of a code mixing a Rn=>R jacobian matrix and some data supposed to be used as measurements in a linear system, so we have J=np.array([1,2,3,4]) and B=np.array([5,6,7,8]), what would the output of J.T2 and B.T2 be ? I think it's much better to get used to writing J=np.array([1,2,3,4]).reshape(1,4) and B=np.array([5,6,7,8]).reshape(4,1), then you can use .T and @ without any verbosity and at least if forces users (read "my students" here) to think twice before writing some linear algebra nonsense. Regards. -------------- next part -------------- An HTML attachment was scrubbed... URL: From contrebasse at gmail.com Thu Apr 7 04:59:44 2016 From: contrebasse at gmail.com (Joseph Martinot-Lagarde) Date: Thu, 7 Apr 2016 08:59:44 +0000 (UTC) Subject: [Numpy-discussion] ndarray.T2 for 2D transpose References: <4414284980401594570@unknownmsgid> <57055DE7.8050504@gmail.com> Message-ID: Alan Isaac gmail.com> writes: > But underlying the proposal is apparently the > idea that there be an attribute equivalent to > `atleast_2d`. Then call it `d2p`. > You can now have `a.d2p.T` which is a lot > more explicit and general than say `a.T2`, > while requiring only 3 more keystrokes. How about a.T2d or a .T2D ? From opossumnano at gmail.com Thu Apr 7 05:07:16 2016 From: opossumnano at gmail.com (Tiziano Zito) Date: Thu, 07 Apr 2016 02:07:16 -0700 (PDT) Subject: [Numpy-discussion] =?utf-8?q?=5BANN=5D_Summer_School_=22Advanced_?= =?utf-8?q?Scientific_Programming_in_Python=22_in_Reading=2C_UK=2C_Septemb?= =?utf-8?b?ZXIgNeKAlDExLCAyMDE2?= Message-ID: <57062344.4a231c0a.a405e.ffffa5c8@mx.google.com> Advanced Scientific Programming in Python ========================================= a Summer School by the G-Node, and the Centre for Integrative Neuroscience and Neurodynamics, School of Psychology and Clinical Language Sciences, University of Reading, UK Scientists spend more and more time writing, maintaining, and debugging software. While techniques for doing this efficiently have evolved, only few scientists have been trained to use them. As a result, instead of doing their research, they spend far too much time writing deficient code and reinventing the wheel. In this course we will present a selection of advanced programming techniques and best practices which are standard in the industry, but especially tailored to the needs of a programming scientist. Lectures are devised to be interactive and to give the students enough time to acquire direct hands-on experience with the materials. Students will work in pairs throughout the school and will team up to practice the newly learned skills in a real programming project ? an entertaining computer game. We use the Python programming language for the entire course. 
Python works as a simple programming language for beginners, but more importantly, it also works great in scientific simulations and data analysis. We show how clean language design, ease of extensibility, and the great wealth of open source libraries for scientific computing and data visualization are driving Python to become a standard tool for the programming scientist. This school is targeted at Master or PhD students and Post-docs from all areas of science. Competence in Python or in another language such as Java, C/C++, MATLAB, or Mathematica is absolutely required. Basic knowledge of Python and of a version control system such as git, subversion, mercurial, or bazaar is assumed. Participants without any prior experience with Python and/or git should work through the proposed introductory material before the course. We are striving hard to get a pool of students which is international and gender-balanced. You can apply online: https://python.g-node.org Application deadline: 23:59 UTC, May 15, 2016. Be sure to read the FAQ before applying. Participation is for free, i.e. no fee is charged! Participants however should take care of travel, living, and accommodation expenses by themselves. Travel grants may be available. Date & Location =============== September 5?11, 2016. Reading, UK Program ======= - Best Programming Practices ? Best practices for scientific programming ? Version control with git and how to contribute to open source projects with GitHub ? Best practices in data visualization - Software Carpentry ? Test-driven development ? Debugging with a debuggger ? Profiling code - Scientific Tools for Python ? Advanced NumPy - Advanced Python ? Decorators ? Context managers ? Generators - The Quest for Speed ? Writing parallel applications ? Interfacing to C with Cython ? Memory-bound problems and memory profiling ? Data containers: storage and fast access to large data - Practical Software Development ? Group project Preliminary Faculty =================== ? Francesc Alted, freelance consultant, author of PyTables, Spain ? Pietro Berkes, Enthought Inc., Cambridge, UK ? Zbigniew J?drzejewski-Szmek, Krasnow Institute, George Mason University, Fairfax, VA, USA ? Eilif Muller, Blue Brain Project, ?cole Polytechnique F?d?rale de Lausanne, Switzerland ? Juan Nunez-Iglesias, Victorian Life Sciences Computation Initiative, University of Melbourne, Australia ? Rike-Benjamin Schuppner, Institute for Theoretical Biology, Humboldt-Universit?t zu Berlin, Germany ? Bartosz Tele?czuk, European Institute for Theoretical Neuroscience, CNRS, Paris, France ? St?fan van der Walt, Berkeley Institute for Data Science, UC Berkeley, CA, USA ? Nelle Varoquaux, Centre for Computational Biology Mines ParisTech, Institut Curie, U900 INSERM, Paris, France ? Tiziano Zito, freelance consultant, Germany Organizers ========== For the German Neuroinformatics Node of the INCF (G-Node) Germany: ? Tiziano Zito, freelance consultant, Germany ? Zbigniew J?drzejewski-Szmek, Krasnow Institute, George Mason University, Fairfax, USA ? Jakob Jordan, Institute of Neuroscience and Medicine (INM-6), Forschungszentrum J?lich GmbH, Germany For the Centre for Integrative Neuroscience and Neurodynamics, School of Psychology and Clinical Language Sciences, University of Reading UK: ? 
Etienne Roesch, Centre for Integrative Neuroscience and Neurodynamics, University of Reading, UK Website: https://python.g-node.org Contact: python-info at g-node.org From contrebasse at gmail.com Thu Apr 7 05:31:34 2016 From: contrebasse at gmail.com (Joseph Martinot-Lagarde) Date: Thu, 7 Apr 2016 09:31:34 +0000 (UTC) Subject: [Numpy-discussion] ndarray.T2 for 2D transpose References: <4414284980401594570@unknownmsgid> Message-ID: > > For a 1D array a of shape (N,), I expect a.T2 to be of shape (N, 1), > > Why not (1,N)? -- it is not well defined, though I suppose it's not so > bad to establish a convention that a 1-D array is a "row vector" > rather than a "column vector". I like Todd's simple proposal: a.T2 should be equivalent to np.atleast_2d(arr).T > BTW, if transposing a (N,) array gives you a (N,1) array, what does > transposing a (N,1) array give you? > > (1,N) or (N,) ? The proposal changes nothin for dims > 1, so (1,N). That means that a.T2.T2 doesn"t have the same shape as a. It boils down to practicality vs purity, as often ! From faltet at gmail.com Thu Apr 7 05:46:19 2016 From: faltet at gmail.com (Francesc Alted) Date: Thu, 7 Apr 2016 11:46:19 +0200 Subject: [Numpy-discussion] ANN: numexpr 2.5.2 released Message-ID: ========================= Announcing Numexpr 2.5.2 ========================= Numexpr is a fast numerical expression evaluator for NumPy. With it, expressions that operate on arrays (like "3*a+4*b") are accelerated and use less memory than doing the same calculation in Python. It wears multi-threaded capabilities, as well as support for Intel's MKL (Math Kernel Library), which allows an extremely fast evaluation of transcendental functions (sin, cos, tan, exp, log...) while squeezing the last drop of performance out of your multi-core processors. Look here for a some benchmarks of numexpr using MKL: https://github.com/pydata/numexpr/wiki/NumexprMKL Its only dependency is NumPy (MKL is optional), so it works well as an easy-to-deploy, easy-to-use, computational engine for projects that don't want to adopt other solutions requiring more heavy dependencies. What's new ========== This is a maintenance release shaking some remaining problems with VML (it is nice to see how Anaconda VML's support helps raising hidden issues). Now conj() and abs() are actually added as VML-powered functions, preventing the same problems than log10() before (PR #212); thanks to Tom Kooij. Upgrading to this release is highly recommended. In case you want to know more in detail what has changed in this version, see: https://github.com/pydata/numexpr/blob/master/RELEASE_NOTES.rst Where I can find Numexpr? ========================= The project is hosted at GitHub in: https://github.com/pydata/numexpr You can get the packages from PyPI as well (but not for RC releases): http://pypi.python.org/pypi/numexpr Share your experience ===================== Let us know of any bugs, suggestions, gripes, kudos, etc. you may have. Enjoy data! -- Francesc Alted -------------- next part -------------- An HTML attachment was scrubbed... URL: From faltet at gmail.com Thu Apr 7 07:43:36 2016 From: faltet at gmail.com (Francesc Alted) Date: Thu, 7 Apr 2016 13:43:36 +0200 Subject: [Numpy-discussion] ANN: python-blosc 1.3.1 Message-ID: ============================= Announcing python-blosc 1.3.1 ============================= What is new? ============ This is an important release in terms of stability. Now, the -O1 flag for compiling the included C-Blosc sources on Linux. 
This represents slower performance, but fixes the nasty issue #110. In case maximum speed is needed, please `compile python-blosc with an external C-Blosc library < https://github.com/Blosc/python-blosc#compiling-with-an-installed-blosc-library-recommended )>`_. Also, symbols like BLOSC_MAX_BUFFERSIZE have been replaced for allowing backward compatibility with python-blosc 1.2.x series. For whetting your appetite, look at some benchmarks here: https://github.com/Blosc/python-blosc#benchmarking For more info, you can have a look at the release notes in: https://github.com/Blosc/python-blosc/blob/master/RELEASE_NOTES.rst More docs and examples are available in the documentation site: http://python-blosc.blosc.org What is it? =========== Blosc (http://www.blosc.org) is a high performance compressor optimized for binary data. It has been designed to transmit data to the processor cache faster than the traditional, non-compressed, direct memory fetch approach via a memcpy() OS call. Blosc works well for compressing numerical arrays that contains data with relatively low entropy, like sparse data, time series, grids with regular-spaced values, etc. python-blosc (http://python-blosc.blosc.org/) is the Python wrapper for the Blosc compression library, with added functions (`compress_ptr()` and `pack_array()`) for efficiently compressing NumPy arrays, minimizing the number of memory copies during the process. python-blosc can be used to compress in-memory data buffers for transmission to other machines, persistence or just as a compressed cache. There is also a handy tool built on top of python-blosc called Bloscpack (https://github.com/Blosc/bloscpack). It features a commmand line interface that allows you to compress large binary datafiles on-disk. It also comes with a Python API that has built-in support for serializing and deserializing Numpy arrays both on-disk and in-memory at speeds that are competitive with regular Pickle/cPickle machinery. Sources repository ============== The sources and documentation are managed through github services at: http://github.com/Blosc/python-blosc ---- **Enjoy data!** -- Francesc Alted -------------- next part -------------- An HTML attachment was scrubbed... URL: From faltet at gmail.com Thu Apr 7 08:45:52 2016 From: faltet at gmail.com (Francesc Alted) Date: Thu, 7 Apr 2016 14:45:52 +0200 Subject: [Numpy-discussion] ANN: bcolz 1.0.0 (final) released Message-ID: ============================= Announcing bcolz 1.0.0 final ============================= What's new ========== Yeah, 1.0.0 is finally here. We are not introducing any exciting new feature (just some optimizations and bug fixes), but bcolz is already 6 years old and it implements most of the capabilities that it was designed for, so I decided to release a 1.0.0 meaning that the format is declared stable and that people can be assured that future bcolz releases will be able to read bcolz 1.0 data files (and probably much earlier ones too) for a long while. Such a format is fully described at: https://github.com/Blosc/bcolz/blob/master/DISK_FORMAT_v1.rst Also, a 1.0.0 release means that bcolz 1.x series will be based on C-Blosc 1.x series (https://github.com/Blosc/c-blosc). After C-Blosc 2.x (https://github.com/Blosc/c-blosc2) would be out, a new bcolz 2.x is expected taking advantage of shiny new features of C-Blosc2 (more compressors, more filters, native variable length support and the concept of super-chunks), which should be very beneficial for next bcolz generation. 
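For readers new to the project, a minimal usage sketch of the stable
1.0 containers in the on-disk format just described (assuming only
NumPy and bcolz are installed; 'mydir.bcolz' is an illustrative path):

import numpy as np
import bcolz

a = np.arange(int(1e7))
ca = bcolz.carray(a)         # compressed, in-memory container
print(ca.nbytes, ca.cbytes)  # uncompressed vs. compressed size

# The same container can be persisted in the on-disk format instead:
cd = bcolz.carray(a, rootdir='mydir.bcolz', mode='w')
cd.flush()
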
Important: this is a final release and there are no important known bugs there, so this is recommended to be used in production. Enjoy! For a more detailed change log, see: https://github.com/Blosc/bcolz/blob/master/RELEASE_NOTES.rst For some comparison between bcolz and other compressed data containers, see: https://github.com/FrancescAlted/DataContainersTutorials specially chapters 3 (in-memory containers) and 4 (on-disk containers). Also, if it happens that you are in Madrid during this weekend, you can drop by my tutorial and talk: http://pydata.org/madrid2016/schedule/ See you! What it is ========== *bcolz* provides columnar and compressed data containers that can live either on-disk or in-memory. Column storage allows for efficiently querying tables with a large number of columns. It also allows for cheap addition and removal of column. In addition, bcolz objects are compressed by default for reducing memory/disk I/O needs. The compression process is carried out internally by Blosc, an extremely fast meta-compressor that is optimized for binary data. Lastly, high-performance iterators (like ``iter()``, ``where()``) for querying the objects are provided. bcolz can use numexpr internally so as to accelerate many vector and query operations (although it can use pure NumPy for doing so too). numexpr optimizes the memory usage and use several cores for doing the computations, so it is blazing fast. Moreover, since the carray/ctable containers can be disk-based, and it is possible to use them for seamlessly performing out-of-memory computations. bcolz has minimal dependencies (NumPy), comes with an exhaustive test suite and fully supports both 32-bit and 64-bit platforms. Also, it is typically tested on both UNIX and Windows operating systems. Together, bcolz and the Blosc compressor, are finally fulfilling the promise of accelerating memory I/O, at least for some real scenarios: http://nbviewer.ipython.org/github/Blosc/movielens-bench/blob/master/querying-ep14.ipynb#Plots Other users of bcolz are Visualfabriq (http://www.visualfabriq.com/) , Quantopian (https://www.quantopian.com/) and Scikit-Allel ( https://github.com/cggh/scikit-allel) which you can read more about by pointing your browser at the links below. * Visualfabriq: * *bquery*, A query and aggregation framework for Bcolz: * https://github.com/visualfabriq/bquery * Quantopian: * Using compressed data containers for faster backtesting at scale: * https://quantopian.github.io/talks/NeedForSpeed/slides.html * Scikit-Allel * Provides an alternative backend to work with compressed arrays * https://scikit-allel.readthedocs.org/en/latest/model/bcolz.html Resources ========= Visit the main bcolz site repository at: http://github.com/Blosc/bcolz Manual: http://bcolz.blosc.org Home of Blosc compressor: http://blosc.org User's mail list: bcolz at googlegroups.com http://groups.google.com/group/bcolz License is the new BSD: https://github.com/Blosc/bcolz/blob/master/LICENSES/BCOLZ.txt Release notes can be found in the Git repository: https://github.com/Blosc/bcolz/blob/master/RELEASE_NOTES.rst ---- **Enjoy data!** -- Francesc Alted -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From toddrjen at gmail.com Thu Apr 7 11:13:41 2016 From: toddrjen at gmail.com (Todd) Date: Thu, 7 Apr 2016 11:13:41 -0400 Subject: [Numpy-discussion] ndarray.T2 for 2D transpose In-Reply-To: References: Message-ID: On Wed, Apr 6, 2016 at 5:20 PM, Nathaniel Smith wrote: > On Wed, Apr 6, 2016 at 10:43 AM, Todd wrote: > > > > My intention was to make linear algebra operations easier in numpy. With > > the @ operator available, it is now very easy to do basic linear algebra > on > > arrays without needing the matrix class. But getting an array into a > state > > where you can use the @ operator effectively is currently pretty verbose > and > > confusing. I was trying to find a way to make the @ operator more > useful. > > Can you elaborate on what you're doing that you find verbose and > confusing, maybe paste an example? I've never had any trouble like > this doing linear algebra with @ or dot (which have similar semantics > for 1d arrays), which is probably just because I've had different use > cases, but it's much easier to talk about these things with a concrete > example in front of us to put everyone on the same page. > > Let's say you want to do a simple matrix multiplication example. You create two example arrays like so: a = np.arange(20) b = np.arange(10, 50, 10) Now you want to do a.T @ b First you need to turn a into a 2D array. I can think of 10 ways to do this off the top of my head, and there may be more: 1a) a[:, None] 1b) a[None] 1c) a[None, :] 2a) a.shape = (1, -1) 2b) a.shape = (-1, 1) 3a) a.reshape(1, -1) 3b) a.reshape(-1, 1) 4a) np.reshape(a, (1, -1)) 4b) np.reshape(a, (-1, 1)) 5) np.atleast_2d(a) 5 is pretty clear, and will work fine with any number of dimensions, but is also long to type out when trying to do a simple example. The different variants of 1, 2, 3, and 4, however, will only work with 1D arrays (making them less useful for functions), are not immediately obvious to me what the result will be (I always need to try it to make sure the result is what I expect), and are easy to get mixed up in my opinion. They also require people keep a mental list of lots of ways to do what should be a very simple task. Basically, my argument here is the same as the argument from pep465 for the inclusion of the @ operator: https://www.python.org/dev/peps/pep-0465/#transparent-syntax-is-especially-crucial-for-non-expert-programmers "A large proportion of scientific code is written by people who are experts in their domain, but are not experts in programming. And there are many university courses run each year with titles like "Data analysis for social scientists" which assume no programming background, and teach some combination of mathematical techniques, introduction to programming, and the use of programming to implement these mathematical techniques, all within a 10-15 week period. These courses are more and more often being taught in Python rather than special-purpose languages like R or Matlab. For these kinds of users, whose programming knowledge is fragile, the existence of a transparent mapping between formulas and code often means the difference between succeeding and failing to write that code at all." -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From toddrjen at gmail.com Thu Apr 7 11:14:40 2016 From: toddrjen at gmail.com (Todd) Date: Thu, 7 Apr 2016 11:14:40 -0400 Subject: [Numpy-discussion] ndarray.T2 for 2D transpose In-Reply-To: References: <4414284980401594570@unknownmsgid> <57055DE7.8050504@gmail.com> Message-ID: On Thu, Apr 7, 2016 at 4:59 AM, Joseph Martinot-Lagarde < contrebasse at gmail.com> wrote: > Alan Isaac gmail.com> writes: > > > But underlying the proposal is apparently the > > idea that there be an attribute equivalent to > > `atleast_2d`. Then call it `d2p`. > > You can now have `a.d2p.T` which is a lot > > more explicit and general than say `a.T2`, > > while requiring only 3 more keystrokes. > > > How about a.T2d or a .T2D ? > > I thought of that, but I wanted to keep things as short as possible (but not shorter). -------------- next part -------------- An HTML attachment was scrubbed... URL: From toddrjen at gmail.com Thu Apr 7 11:22:44 2016 From: toddrjen at gmail.com (Todd) Date: Thu, 7 Apr 2016 11:22:44 -0400 Subject: [Numpy-discussion] ndarray.T2 for 2D transpose In-Reply-To: <57060EA2.4070705@ensta-bretagne.fr> References: <57060EA2.4070705@ensta-bretagne.fr> Message-ID: On Thu, Apr 7, 2016 at 3:39 AM, Irvin Probst wrote: > On 06/04/2016 04:11, Todd wrote: > > When you try to transpose a 1D array, it does nothing. This is the > correct behavior, since it transposing a 1D array is meaningless. However, > this can often lead to unexpected errors since this is rarely what you > want. You can convert the array to 2D, using `np.atleast_2d` or > `arr[None]`, but this makes simple linear algebra computations more > difficult. > > I propose adding an argument to transpose, perhaps called `expand` or > `expanddim`, which if `True` (it is `False` by default) will force the > array to be at least 2D. A shortcut property, `ndarray.T2`, would be the > same as `ndarray.transpose(True)` > > Hello, > My two cents here, I've seen hundreds of people (literally hundreds) > stumbling on this .T trick with 1D vectors when they were trying to do some > linear algebra with numpy so at first I had the same feeling as you. But > the real issue was that *all* these people were coming from matlab and > expected numpy to behave the same way. Once the logic behind 1D vectors was > explained it made sense to most of them and there were no more problems. > > The problem isn't necessarily understanding, although that is a problem. The bigger problem is having to jump through hoops to do basic matrix math. > And by the way I don't see any way to tell apart a 1D "row vector" from a > 1D "column vector", think of a code mixing a Rn=>R jacobian matrix and some > data supposed to be used as measurements in a linear system, so we have > J=np.array([1,2,3,4]) and B=np.array([5,6,7,8]), what would the output of > J.T2 and B.T2 be ? > > As I said elsewhere, we already have a convention for this established by `np.atleast_2d`. 1D arrays are treated as row vectors. `np.hstack` and `np.vstack` also treat 1D arrays as row vectors. So `arr.T2` will follow this convention, being equivalent to `np.atleast_2d(arr).T`. > I think it's much better to get used to writing > J=np.array([1,2,3,4]).reshape(1,4) and B=np.array([5,6,7,8]).reshape(4,1), > then you can use .T and @ without any verbosity and at least if forces > users (read "my students" here) to think twice before writing some linear > algebra nonsense. 
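As a quick illustration of that reshape-based style (values arbitrary):

import numpy as np

J = np.array([1, 2, 3, 4]).reshape(1, 4)  # explicit row
B = np.array([5, 6, 7, 8]).reshape(4, 1)  # explicit column

J @ B  # shape (1, 1) inner product: [[70]]
B @ J  # shape (4, 4) outer product
J.T    # shape (4, 1), and J.T.T recovers J exactly
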
> > That works okay when you know beforehand what the shape of the array is (although it may very well be the different between a simple, 1-line piece of code and a 3-line piece of code). But what if you try to turn this into a general-purpose function? Then any function that has linear algebra needs to call `atleast_2d` on every value used in that linear algebra, or use `if` tests. And if you forget, it may not be obvious until much later depending on what you initially use the function for and what you use it for later. -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Thu Apr 7 11:35:12 2016 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 7 Apr 2016 11:35:12 -0400 Subject: [Numpy-discussion] ndarray.T2 for 2D transpose In-Reply-To: References: Message-ID: On Thu, Apr 7, 2016 at 11:13 AM, Todd wrote: > On Wed, Apr 6, 2016 at 5:20 PM, Nathaniel Smith wrote: >> >> On Wed, Apr 6, 2016 at 10:43 AM, Todd wrote: >> > >> > My intention was to make linear algebra operations easier in numpy. >> > With >> > the @ operator available, it is now very easy to do basic linear algebra >> > on >> > arrays without needing the matrix class. But getting an array into a >> > state >> > where you can use the @ operator effectively is currently pretty verbose >> > and >> > confusing. I was trying to find a way to make the @ operator more >> > useful. >> >> Can you elaborate on what you're doing that you find verbose and >> confusing, maybe paste an example? I've never had any trouble like >> this doing linear algebra with @ or dot (which have similar semantics >> for 1d arrays), which is probably just because I've had different use >> cases, but it's much easier to talk about these things with a concrete >> example in front of us to put everyone on the same page. >> > > Let's say you want to do a simple matrix multiplication example. You create > two example arrays like so: > > a = np.arange(20) > b = np.arange(10, 50, 10) > > Now you want to do > > a.T @ b > > First you need to turn a into a 2D array. I can think of 10 ways to do this > off the top of my head, and there may be more: > > 1a) a[:, None] > 1b) a[None] > 1c) a[None, :] > 2a) a.shape = (1, -1) > 2b) a.shape = (-1, 1) > 3a) a.reshape(1, -1) > 3b) a.reshape(-1, 1) > 4a) np.reshape(a, (1, -1)) > 4b) np.reshape(a, (-1, 1)) > 5) np.atleast_2d(a) > > 5 is pretty clear, and will work fine with any number of dimensions, but is > also long to type out when trying to do a simple example. The different > variants of 1, 2, 3, and 4, however, will only work with 1D arrays (making > them less useful for functions), are not immediately obvious to me what the > result will be (I always need to try it to make sure the result is what I > expect), and are easy to get mixed up in my opinion. They also require > people keep a mental list of lots of ways to do what should be a very simple > task. > > Basically, my argument here is the same as the argument from pep465 for the > inclusion of the @ operator: > https://www.python.org/dev/peps/pep-0465/#transparent-syntax-is-especially-crucial-for-non-expert-programmers > > "A large proportion of scientific code is written by people who are experts > in their domain, but are not experts in programming. 
And there are many > university courses run each year with titles like "Data analysis for social > scientists" which assume no programming background, and teach some > combination of mathematical techniques, introduction to programming, and the > use of programming to implement these mathematical techniques, all within a > 10-15 week period. These courses are more and more often being taught in > Python rather than special-purpose languages like R or Matlab. > > For these kinds of users, whose programming knowledge is fragile, the > existence of a transparent mapping between formulas and code often means the > difference between succeeding and failing to write that code at all." This doesn't work because of the ambiguity between column and row vector. In most cases 1d vectors in statistics/econometrics are column vectors. Sometime it takes me a long time to figure out whether an author uses row or column vector for transpose. i.e. I often need x.T dot y which works for 1d and 2d to produce inner product. but the outer product would require most of the time a column vector so it's defined as x dot x.T. I think keeping around explicitly 2d arrays if necessary is less error prone and confusing. But I wouldn't mind a shortcut for atleast_2d (although more often I need atleast_2dcol to translate formulas) Josef > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > From toddrjen at gmail.com Thu Apr 7 11:42:04 2016 From: toddrjen at gmail.com (Todd) Date: Thu, 7 Apr 2016 11:42:04 -0400 Subject: [Numpy-discussion] ndarray.T2 for 2D transpose In-Reply-To: References: Message-ID: On Thu, Apr 7, 2016 at 11:35 AM, wrote: > On Thu, Apr 7, 2016 at 11:13 AM, Todd wrote: > > On Wed, Apr 6, 2016 at 5:20 PM, Nathaniel Smith wrote: > >> > >> On Wed, Apr 6, 2016 at 10:43 AM, Todd wrote: > >> > > >> > My intention was to make linear algebra operations easier in numpy. > >> > With > >> > the @ operator available, it is now very easy to do basic linear > algebra > >> > on > >> > arrays without needing the matrix class. But getting an array into a > >> > state > >> > where you can use the @ operator effectively is currently pretty > verbose > >> > and > >> > confusing. I was trying to find a way to make the @ operator more > >> > useful. > >> > >> Can you elaborate on what you're doing that you find verbose and > >> confusing, maybe paste an example? I've never had any trouble like > >> this doing linear algebra with @ or dot (which have similar semantics > >> for 1d arrays), which is probably just because I've had different use > >> cases, but it's much easier to talk about these things with a concrete > >> example in front of us to put everyone on the same page. > >> > > > > Let's say you want to do a simple matrix multiplication example. You > create > > two example arrays like so: > > > > a = np.arange(20) > > b = np.arange(10, 50, 10) > > > > Now you want to do > > > > a.T @ b > > > > First you need to turn a into a 2D array. 
I can think of 10 ways to do > this > > off the top of my head, and there may be more: > > > > 1a) a[:, None] > > 1b) a[None] > > 1c) a[None, :] > > 2a) a.shape = (1, -1) > > 2b) a.shape = (-1, 1) > > 3a) a.reshape(1, -1) > > 3b) a.reshape(-1, 1) > > 4a) np.reshape(a, (1, -1)) > > 4b) np.reshape(a, (-1, 1)) > > 5) np.atleast_2d(a) > > > > 5 is pretty clear, and will work fine with any number of dimensions, but > is > > also long to type out when trying to do a simple example. The different > > variants of 1, 2, 3, and 4, however, will only work with 1D arrays > (making > > them less useful for functions), are not immediately obvious to me what > the > > result will be (I always need to try it to make sure the result is what I > > expect), and are easy to get mixed up in my opinion. They also require > > people keep a mental list of lots of ways to do what should be a very > simple > > task. > > > > Basically, my argument here is the same as the argument from pep465 for > the > > inclusion of the @ operator: > > > https://www.python.org/dev/peps/pep-0465/#transparent-syntax-is-especially-crucial-for-non-expert-programmers > > > > "A large proportion of scientific code is written by people who are > experts > > in their domain, but are not experts in programming. And there are many > > university courses run each year with titles like "Data analysis for > social > > scientists" which assume no programming background, and teach some > > combination of mathematical techniques, introduction to programming, and > the > > use of programming to implement these mathematical techniques, all > within a > > 10-15 week period. These courses are more and more often being taught in > > Python rather than special-purpose languages like R or Matlab. > > > > For these kinds of users, whose programming knowledge is fragile, the > > existence of a transparent mapping between formulas and code often means > the > > difference between succeeding and failing to write that code at all." > > This doesn't work because of the ambiguity between column and row vector. > > In most cases 1d vectors in statistics/econometrics are column > vectors. Sometime it takes me a long time to figure out whether an > author uses row or column vector for transpose. > > i.e. I often need x.T dot y which works for 1d and 2d to produce > inner product. > but the outer product would require most of the time a column vector > so it's defined as x dot x.T. > > I think keeping around explicitly 2d arrays if necessary is less error > prone and confusing. > > But I wouldn't mind a shortcut for atleast_2d (although more often I > need atleast_2dcol to translate formulas) > > At least from what I have seen, in all cases in numpy where a 1D array is treated as a 2D array, it is always treated as a row vector, the examples I can think of being atleast_2d, hstack, vstack, and dstack. So using this convention would be in line with how it is used elsewhere in numpy. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From josef.pktd at gmail.com Thu Apr 7 11:56:37 2016 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 7 Apr 2016 11:56:37 -0400 Subject: [Numpy-discussion] ndarray.T2 for 2D transpose In-Reply-To: References: Message-ID: On Thu, Apr 7, 2016 at 11:42 AM, Todd wrote: > On Thu, Apr 7, 2016 at 11:35 AM, wrote: >> >> On Thu, Apr 7, 2016 at 11:13 AM, Todd wrote: >> > On Wed, Apr 6, 2016 at 5:20 PM, Nathaniel Smith wrote: >> >> >> >> On Wed, Apr 6, 2016 at 10:43 AM, Todd wrote: >> >> > >> >> > My intention was to make linear algebra operations easier in numpy. >> >> > With >> >> > the @ operator available, it is now very easy to do basic linear >> >> > algebra >> >> > on >> >> > arrays without needing the matrix class. But getting an array into a >> >> > state >> >> > where you can use the @ operator effectively is currently pretty >> >> > verbose >> >> > and >> >> > confusing. I was trying to find a way to make the @ operator more >> >> > useful. >> >> >> >> Can you elaborate on what you're doing that you find verbose and >> >> confusing, maybe paste an example? I've never had any trouble like >> >> this doing linear algebra with @ or dot (which have similar semantics >> >> for 1d arrays), which is probably just because I've had different use >> >> cases, but it's much easier to talk about these things with a concrete >> >> example in front of us to put everyone on the same page. >> >> >> > >> > Let's say you want to do a simple matrix multiplication example. You >> > create >> > two example arrays like so: >> > >> > a = np.arange(20) >> > b = np.arange(10, 50, 10) >> > >> > Now you want to do >> > >> > a.T @ b >> > >> > First you need to turn a into a 2D array. I can think of 10 ways to do >> > this >> > off the top of my head, and there may be more: >> > >> > 1a) a[:, None] >> > 1b) a[None] >> > 1c) a[None, :] >> > 2a) a.shape = (1, -1) >> > 2b) a.shape = (-1, 1) >> > 3a) a.reshape(1, -1) >> > 3b) a.reshape(-1, 1) >> > 4a) np.reshape(a, (1, -1)) >> > 4b) np.reshape(a, (-1, 1)) >> > 5) np.atleast_2d(a) >> > >> > 5 is pretty clear, and will work fine with any number of dimensions, but >> > is >> > also long to type out when trying to do a simple example. The different >> > variants of 1, 2, 3, and 4, however, will only work with 1D arrays >> > (making >> > them less useful for functions), are not immediately obvious to me what >> > the >> > result will be (I always need to try it to make sure the result is what >> > I >> > expect), and are easy to get mixed up in my opinion. They also require >> > people keep a mental list of lots of ways to do what should be a very >> > simple >> > task. >> > >> > Basically, my argument here is the same as the argument from pep465 for >> > the >> > inclusion of the @ operator: >> > >> > https://www.python.org/dev/peps/pep-0465/#transparent-syntax-is-especially-crucial-for-non-expert-programmers >> > >> > "A large proportion of scientific code is written by people who are >> > experts >> > in their domain, but are not experts in programming. And there are many >> > university courses run each year with titles like "Data analysis for >> > social >> > scientists" which assume no programming background, and teach some >> > combination of mathematical techniques, introduction to programming, and >> > the >> > use of programming to implement these mathematical techniques, all >> > within a >> > 10-15 week period. These courses are more and more often being taught in >> > Python rather than special-purpose languages like R or Matlab. 
>> > For these kinds of users, whose programming knowledge is fragile, the
>> > existence of a transparent mapping between formulas and code often
>> > means the difference between succeeding and failing to write that
>> > code at all."
>>
>> This doesn't work because of the ambiguity between column and row
>> vectors.
>>
>> In most cases 1d vectors in statistics/econometrics are column vectors.
>> Sometimes it takes me a long time to figure out whether an author uses
>> row or column vectors for the transpose.
>>
>> I.e., I often need x.T dot y, which works for 1d and 2d to produce an
>> inner product, but the outer product usually requires a column vector,
>> so it's defined as x dot x.T.
>>
>> I think keeping explicitly 2d arrays around, where necessary, is less
>> error-prone and confusing.
>>
>> But I wouldn't mind a shortcut for atleast_2d (although more often I
>> need an atleast_2dcol to translate formulas)
>>
> At least from what I have seen, in all cases in numpy where a 1D array
> is treated as a 2D array, it is always treated as a row vector; the
> examples I can think of are atleast_2d, hstack, vstack, and dstack. So
> using this convention would be in line with how it is used elsewhere in
> numpy.

AFAIK, linear algebra works differently -- 1-D is special:

>>> xx = np.arange(20).reshape(4,5)
>>> yy = np.arange(4)
>>> xx.dot(yy)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
    xx.dot(yy)
ValueError: objects are not aligned
>>> yy = np.arange(5)
>>> xx.dot(yy)
array([ 30,  80, 130, 180])
>>> xx.dot(yy[:,None])
array([[ 30],
       [ 80],
       [130],
       [180]])
>>> yy[:4].dot(xx)
array([70, 76, 82, 88, 94])
>>> np.__version__
'1.6.1'

I don't think numpy treats 1d arrays as row vectors. numpy has C-order
for axis preference, which coincides in many cases with row-vector
behavior.

>>> np.concatenate(([[1,2,3]], [4,5,6]))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
    np.concatenate(([[1,2,3]], [4,5,6]))
ValueError: arrays must have same number of dimensions

It's not an uncommon exception for me.

Josef

>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> https://mail.scipy.org/mailman/listinfo/numpy-discussion
>

From insertinterestingnamehere at gmail.com Thu Apr 7 13:00:33 2016
From: insertinterestingnamehere at gmail.com (Ian Henriksen)
Date: Thu, 07 Apr 2016 17:00:33 +0000
Subject: [Numpy-discussion] ndarray.T2 for 2D transpose
In-Reply-To: References: Message-ID: 

On Wed, Apr 6, 2016 at 3:21 PM Nathaniel Smith wrote:

> Can you elaborate on what you're doing that you find verbose and
> confusing, maybe paste an example? I've never had any trouble like
> this doing linear algebra with @ or dot (which have similar semantics
> for 1d arrays), which is probably just because I've had different use
> cases, but it's much easier to talk about these things with a concrete
> example in front of us to put everyone on the same page.
>
> -n
>

Here's another example that I've seen catch people now and again.

A = np.random.rand(100, 100)
b = np.random.rand(10)
A * b.T

In this case the user pretty clearly meant to be broadcasting along the
rows of A rather than along the columns, but the code fails silently. When
an issue like this gets mixed into a larger series of broadcasting
operations, the error becomes difficult to find.
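(To make the silent failure concrete -- a toy sketch, assuming b was
meant to have length matching A's rows:)

>>> A = np.array([[1., 2.], [3., 4.]])
>>> b = np.array([10., 100.])
>>> A * b.T          # .T is a no-op on 1d b, so this scales the *columns*
array([[  10.,  200.],
       [  30.,  400.]])
>>> A * b[:, None]   # scaling the *rows*, which was the intent
array([[  10.,   20.],
       [ 300.,  400.]])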
This error isn't necessarily unique to beginners either. It's a common
typo that catches intermediate users who know about broadcasting
semantics but weren't keeping close enough track of the dimensionality of
the different intermediate expressions in their code.

Best,

-Ian Henriksen
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From chris.barker at noaa.gov Thu Apr 7 13:03:10 2016
From: chris.barker at noaa.gov (Chris Barker)
Date: Thu, 7 Apr 2016 10:03:10 -0700
Subject: [Numpy-discussion] ndarray.T2 for 2D transpose
In-Reply-To: References: Message-ID: 

On Thu, Apr 7, 2016 at 8:13 AM, Todd wrote:

> First you need to turn a into a 2D array. I can think of 10 ways to do
> this off the top of my head, and there may be more:
>
> snip

> Basically, my argument here is the same as the argument from pep465 for
> the inclusion of the @ operator:
>
> https://www.python.org/dev/peps/pep-0465/#transparent-syntax-is-especially-crucial-for-non-expert-programmers
>

I think this is all a good argument for a clean and obvious way to make a
column vector, but I don't think overloading transpose is the way to do
that.

-CHB

-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception

Chris.Barker at noaa.gov
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From sebastian at sipsolutions.net Thu Apr 7 13:20:24 2016
From: sebastian at sipsolutions.net (Sebastian Berg)
Date: Thu, 07 Apr 2016 19:20:24 +0200
Subject: [Numpy-discussion] ndarray.T2 for 2D transpose
In-Reply-To: References: Message-ID: <1460049624.6828.14.camel@sipsolutions.net>

On Do, 2016-04-07 at 11:56 -0400, josef.pktd at gmail.com wrote:
>
> I don't think numpy treats 1d arrays as row vectors. numpy has C
> -order for axis preference which coincides in many cases with row
> vector behavior.
>

Well, the broadcasting rules are that (n,) should typically behave
similarly to (1, n). However, for dot/matmul and @ the rules are
stretched to mean "the one-dimensional thing that gives an inner
product" (using matmul since my python has no @ yet):

In [12]: a = np.arange(20)
In [13]: b = np.arange(20)

In [14]: np.matmul(a, b)
Out[14]: 2470

In [15]: np.matmul(a, b[:, None])
Out[15]: array([2470])

In [16]: np.matmul(a[None, :], b)
Out[16]: array([2470])

In [17]: np.matmul(a[None, :], b[:, None])
Out[17]: array([[2470]])

which indeed gives us a fun thing, because if you look at the last line,
the outer product equivalent would be:

outer = np.matmul(a[None, :].T, b[:, None].T)

Now if I go back to the earlier example:

a.T @ b

does not achieve the outer product at all when using T2, since

a.T2 @ b.T2  # only correct for a, but not for b
a.T2 @ b     # b attempts to be "inner", so does not work

It almost seems to me that the example is a counterexample, because on
first sight the `T2` attribute would still leave you with no shorthand
for `b`.

I understand the pain of having to write (and parse, and get into the
depths of) things like `arr[:, np.newaxis]` or reshape. I also understand
the idea of a shorthand for vectorized matrix operations. That is, an
argument for a T2 attribute which errors on 1D arrays (not sure I like
it, but that is a different issue).

However, it seems that implicit adding of an axis which only works half
the time does not help too much?
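(For reference, the explicit spellings that do produce the outer product
today -- a small sketch:)

>>> a = np.arange(3)
>>> b = np.arange(3)
>>> np.matmul(a[:, None], b[None, :])   # explicit column times row
array([[0, 0, 0],
       [0, 1, 2],
       [0, 2, 4]])
>>> np.outer(a, b)                      # the spelling that already exists
array([[0, 0, 0],
       [0, 1, 2],
       [0, 2, 4]])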
I have to admit I don't write these things too much, but I wonder if it
would not help more if we just provided some better information/link to
longer examples in the "dimension mismatch" error message?

In the end it is quite simple: like Nathaniel, I think I would like to
see some example code where the code obviously looks easier than before.
With the `@` operator that was the case; with the "dimension adding
logic" I am not so sure, plus it seems it may add other pitfalls.

- Sebastian

>
> >>> np.concatenate(([[1,2,3]], [4,5,6]))
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>     np.concatenate(([[1,2,3]], [4,5,6]))
> ValueError: arrays must have same number of dimensions
>
> It's not an uncommon exception for me.
>
> Josef
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: This is a digitally signed message part
URL: 

From sebastian at sipsolutions.net Thu Apr 7 13:22:12 2016
From: sebastian at sipsolutions.net (Sebastian Berg)
Date: Thu, 07 Apr 2016 19:22:12 +0200
Subject: [Numpy-discussion] ndarray.T2 for 2D transpose
In-Reply-To: References: Message-ID: <1460049732.6828.17.camel@sipsolutions.net>

On Do, 2016-04-07 at 17:00 +0000, Ian Henriksen wrote:
>
> On Wed, Apr 6, 2016 at 3:21 PM Nathaniel Smith wrote:
> > Can you elaborate on what you're doing that you find verbose and
> > confusing, maybe paste an example? I've never had any trouble like
> > this doing linear algebra with @ or dot (which have similar
> > semantics for 1d arrays), which is probably just because I've had
> > different use cases, but it's much easier to talk about these things
> > with a concrete example in front of us to put everyone on the same
> > page.
> >
> > -n
>
> Here's another example that I've seen catch people now and again.
>
> A = np.random.rand(100, 100)
> b = np.random.rand(10)
> A * b.T
>
> In this case the user pretty clearly meant to be broadcasting along
> the rows of A rather than along the columns, but the code fails
> silently. When an issue like this gets mixed into a larger series of
> broadcasting operations, the error becomes difficult to find. This
> error isn't necessarily unique to beginners either. It's a common
> typo that catches intermediate users who know about broadcasting
> semantics but weren't keeping close enough track of the
> dimensionality of the different intermediate expressions in their
> code.
>

Yes, but as noted in my other mail, A @ b.T2 would behave the same as far
as I can see?! Because `@` tries to make sense of 1-D arrays in an
"inner" fashion.

- Sebastian

> Best,
>
> -Ian Henriksen
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: This is a digitally signed message part
URL: 

From josef.pktd at gmail.com Thu Apr 7 13:29:54 2016
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Thu, 7 Apr 2016 13:29:54 -0400
Subject: [Numpy-discussion] ndarray.T2 for 2D transpose
In-Reply-To: <1460049624.6828.14.camel@sipsolutions.net>
References: <1460049624.6828.14.camel@sipsolutions.net>
Message-ID: 

On Thu, Apr 7, 2016 at 1:20 PM, Sebastian Berg wrote:

> On Do, 2016-04-07 at 11:56 -0400, josef.pktd at gmail.com wrote:
> >
> > I don't think numpy treats 1d arrays as row vectors. numpy has C
> > -order for axis preference which coincides in many cases with row
> > vector behavior.
> >
>
> Well, the broadcasting rules are that (n,) should typically behave
> similarly to (1, n). However, for dot/matmul and @ the rules are
> stretched to mean "the one-dimensional thing that gives an inner
> product" (using matmul since my python has no @ yet):
>
> In [12]: a = np.arange(20)
> In [13]: b = np.arange(20)
>
> In [14]: np.matmul(a, b)
> Out[14]: 2470
>
> In [15]: np.matmul(a, b[:, None])
> Out[15]: array([2470])
>
> In [16]: np.matmul(a[None, :], b)
> Out[16]: array([2470])
>
> In [17]: np.matmul(a[None, :], b[:, None])
> Out[17]: array([[2470]])
>
> which indeed gives us a fun thing, because if you look at the last
> line, the outer product equivalent would be:
>
> outer = np.matmul(a[None, :].T, b[:, None].T)
>
> Now if I go back to the earlier example:
>
> a.T @ b
>
> does not achieve the outer product at all when using T2, since
>
> a.T2 @ b.T2  # only correct for a, but not for b
> a.T2 @ b     # b attempts to be "inner", so does not work
>
> It almost seems to me that the example is a counterexample, because on
> first sight the `T2` attribute would still leave you with no shorthand
> for `b`.
>

a.T2 @ b.T2.T

(T2 as a shortcut for creating a[:, None] -- that's neat, except if a is
already 2D)

Josef

> I understand the pain of having to write (and parse, and get into the
> depths of) things like `arr[:, np.newaxis]` or reshape. I also
> understand the idea of a shorthand for vectorized matrix operations.
> That is, an argument for a T2 attribute which errors on 1D arrays (not
> sure I like it, but that is a different issue).
>
> However, it seems that implicit adding of an axis which only works
> half the time does not help too much? I have to admit I don't write
> these things too much, but I wonder if it would not help more if we
> just provided some better information/link to longer examples in the
> "dimension mismatch" error message?
>
> In the end it is quite simple: like Nathaniel, I think I would like to
> see some example code where the code obviously looks easier than
> before. With the `@` operator that was the case; with the "dimension
> adding logic" I am not so sure, plus it seems it may add other
> pitfalls.
>
> - Sebastian
>
> > >>> np.concatenate(([[1,2,3]], [4,5,6]))
> > Traceback (most recent call last):
> >   File "<stdin>", line 1, in <module>
> >     np.concatenate(([[1,2,3]], [4,5,6]))
> > ValueError: arrays must have same number of dimensions
> >
> > It's not an uncommon exception for me.
> > > > Josef > > > > > > > > _______________________________________________ > > > NumPy-Discussion mailing list > > > NumPy-Discussion at scipy.org > > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Thu Apr 7 13:35:58 2016 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Thu, 07 Apr 2016 19:35:58 +0200 Subject: [Numpy-discussion] ndarray.T2 for 2D transpose In-Reply-To: References: <1460049624.6828.14.camel@sipsolutions.net> Message-ID: <1460050558.6828.20.camel@sipsolutions.net> On Do, 2016-04-07 at 13:29 -0400, josef.pktd at gmail.com wrote: > > > On Thu, Apr 7, 2016 at 1:20 PM, Sebastian Berg < > sebastian at sipsolutions.net> wrote: > > On Do, 2016-04-07 at 11:56 -0400, josef.pktd at gmail.com wrote: > > > > > > > > > > > > > > > > > > I don't think numpy treats 1d arrays as row vectors. numpy has C > > > -order for axis preference which coincides in many cases with row > > > vector behavior. > > > > > > > Well, broadcasting rules, are that (n,) should typically behave > > similar > > to (1, n). However, for dot/matmul and @ the rules are stretched to > > mean "the one dimensional thing that gives an inner product" (using > > matmul since my python has no @ yet): > > > > In [12]: a = np.arange(20) > > In [13]: b = np.arange(20) > > > > In [14]: np.matmul(a, b) > > Out[14]: 2470 > > > > In [15]: np.matmul(a, b[:, None]) > > Out[15]: array([2470]) > > > > In [16]: np.matmul(a[None, :], b) > > Out[16]: array([2470]) > > > > In [17]: np.matmul(a[None, :], b[:, None]) > > Out[17]: array([[2470]]) > > > > which indeed gives us a fun thing, because if you look at the last > > line, the outer product equivalent would be: > > > > outer = np.matmul(a[None, :].T, b[:, None].T) > > > > Now if I go back to the earlier example: > > > > a.T @ b > > > > Does not achieve the outer product at all with using T2, since > > > > a.T2 @ b.T2 # only correct for a, but not for b > > a.T2 @ b # b attempts to be "inner", so does not work > > > > It almost seems to me that the example is a counter example, > > because on > > first sight the `T2` attribute would still leave you with no > > shorthand > > for `b`. > a.T2 @ b.T2.T > Actually, better would be: a.T2 @ b.T2.T2 # Aha? And true enough, that works, but is it still reasonably easy to find and understand? Or is it just frickeling around, the same as you would try `a[:, None]` before finding `a[None, :]`, maybe worse? - Sebastian > > (T2 as shortcut for creating a[:, None] that's neat, except if a is > already 2D) > > Josef > > > > > I understand the pain of having to write (and parse get into the > > depth > > of) things like `arr[:, np.newaxis]` or reshape. I also understand > > the > > idea of a shorthand for vectorized matrix operations. That is, an > > argument for a T2 attribute which errors on 1D arrays (not sure I > > like > > it, but that is a different issue). > > > > However, it seems that implicit adding of an axis which only works > > half > > the time does not help too much? 
I have to admit I don't write > > these > > things too much, but I wonder if it would not help more if we just > > provided some better information/link to longer examples in the > > "dimension mismatch" error message? > > > > In the end it is quite simple, as Nathaniel, I think I would like > > to > > see some example code, where the code obviously looks easier then > > before? With the `@` operator that was the case, with the > > "dimension > > adding logic" I am not so sure, plus it seems it may add other > > pitfalls. > > > > - Sebastian > > > > > > > > > > > >>> np.concatenate(([[1,2,3]], [4,5,6])) > > > Traceback (most recent call last): > > > File "", line 1, in > > > np.concatenate(([[1,2,3]], [4,5,6])) > > > ValueError: arrays must have same number of dimensions > > > > > > It's not an uncommon exception for me. > > > > > > Josef > > > > > > > > > > > _______________________________________________ > > > > NumPy-Discussion mailing list > > > > NumPy-Discussion at scipy.org > > > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > _______________________________________________ > > > NumPy-Discussion mailing list > > > NumPy-Discussion at scipy.org > > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From josef.pktd at gmail.com Thu Apr 7 13:46:23 2016 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 7 Apr 2016 13:46:23 -0400 Subject: [Numpy-discussion] ndarray.T2 for 2D transpose In-Reply-To: <1460050558.6828.20.camel@sipsolutions.net> References: <1460049624.6828.14.camel@sipsolutions.net> <1460050558.6828.20.camel@sipsolutions.net> Message-ID: On Thu, Apr 7, 2016 at 1:35 PM, Sebastian Berg wrote: > On Do, 2016-04-07 at 13:29 -0400, josef.pktd at gmail.com wrote: > > > > > > On Thu, Apr 7, 2016 at 1:20 PM, Sebastian Berg < > > sebastian at sipsolutions.net> wrote: > > > On Do, 2016-04-07 at 11:56 -0400, josef.pktd at gmail.com wrote: > > > > > > > > > > > > > > > > > > > > > > > > > I don't think numpy treats 1d arrays as row vectors. numpy has C > > > > -order for axis preference which coincides in many cases with row > > > > vector behavior. > > > > > > > > > > Well, broadcasting rules, are that (n,) should typically behave > > > similar > > > to (1, n). 
> > > However, for dot/matmul and @ the rules are stretched to mean "the
> > > one-dimensional thing that gives an inner product" (using matmul
> > > since my python has no @ yet):
> > >
> > > In [12]: a = np.arange(20)
> > > In [13]: b = np.arange(20)
> > >
> > > In [14]: np.matmul(a, b)
> > > Out[14]: 2470
> > >
> > > In [15]: np.matmul(a, b[:, None])
> > > Out[15]: array([2470])
> > >
> > > In [16]: np.matmul(a[None, :], b)
> > > Out[16]: array([2470])
> > >
> > > In [17]: np.matmul(a[None, :], b[:, None])
> > > Out[17]: array([[2470]])
> > >
> > > which indeed gives us a fun thing, because if you look at the last
> > > line, the outer product equivalent would be:
> > >
> > > outer = np.matmul(a[None, :].T, b[:, None].T)
> > >
> > > Now if I go back to the earlier example:
> > >
> > > a.T @ b
> > >
> > > does not achieve the outer product at all when using T2, since
> > >
> > > a.T2 @ b.T2  # only correct for a, but not for b
> > > a.T2 @ b     # b attempts to be "inner", so does not work
> > >
> > > It almost seems to me that the example is a counterexample, because
> > > on first sight the `T2` attribute would still leave you with no
> > > shorthand for `b`.
> >
> > a.T2 @ b.T2.T
>
> Actually, better would be:
>
> a.T2 @ b.T2.T2  # Aha?
>
> And true enough, that works, but is it still reasonably easy to find
> and understand?
> Or is it just frickeling around, the same as you would try `a[:, None]`
> before finding `a[None, :]`, maybe worse?
>

I had thought about it earlier, but it's "too cute" for my taste (and I
think I would complain during code review when I see this).

Josef

>
> - Sebastian
>
> > (T2 as a shortcut for creating a[:, None] -- that's neat, except if a
> > is already 2D)
> >
> > Josef
> >
> > > I understand the pain of having to write (and parse, and get into
> > > the depths of) things like `arr[:, np.newaxis]` or reshape. I also
> > > understand the idea of a shorthand for vectorized matrix
> > > operations. That is, an argument for a T2 attribute which errors on
> > > 1D arrays (not sure I like it, but that is a different issue).
> > >
> > > However, it seems that implicit adding of an axis which only works
> > > half the time does not help too much? I have to admit I don't write
> > > these things too much, but I wonder if it would not help more if we
> > > just provided some better information/link to longer examples in
> > > the "dimension mismatch" error message?
> > >
> > > In the end it is quite simple: like Nathaniel, I think I would like
> > > to see some example code where the code obviously looks easier than
> > > before. With the `@` operator that was the case; with the
> > > "dimension adding logic" I am not so sure, plus it seems it may add
> > > other pitfalls.
> > >
> > > - Sebastian
> > >
> > > > >>> np.concatenate(([[1,2,3]], [4,5,6]))
> > > > Traceback (most recent call last):
> > > >   File "<stdin>", line 1, in <module>
> > > >     np.concatenate(([[1,2,3]], [4,5,6]))
> > > > ValueError: arrays must have same number of dimensions
> > > >
> > > > It's not an uncommon exception for me.
> > > > Josef
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> https://mail.scipy.org/mailman/listinfo/numpy-discussion

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion at scipy.org
https://mail.scipy.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: This is a digitally signed message part
URL: 

From chris.barker at noaa.gov Thu Apr 7 14:17:34 2016
From: chris.barker at noaa.gov (Chris Barker)
Date: Thu, 7 Apr 2016 11:17:34 -0700
Subject: [Numpy-discussion] ndarray.T2 for 2D transpose
In-Reply-To: References: Message-ID: 

On Thu, Apr 7, 2016 at 10:00 AM, Ian Henriksen <
insertinterestingnamehere at gmail.com> wrote:

> Here's another example that I've seen catch people now and again.
>
> A = np.random.rand(100, 100)
> b = np.random.rand(10)
> A * b.T
>

typo? that was supposed to be

b = np.random.rand(100). yes?

This is exactly what someone else referred to as the expectations of
someone that comes from MATLAB, and doesn't yet "get" that 1D arrays are
1D arrays.

All of this is EXACTLY the motivation for the matrix class -- which never
took off, and was never complete (it needed a row and column vector
implementation, if you ask me). But I think the reason it didn't take off
is that it really isn't that useful, but is different enough from regular
arrays to be a greater source of confusion. And it was decided that all
people REALLY wanted was an obvious way to get matrix multiply, which we
now have with @.

So this discussion brings up that we also need an easy and obvious way to
make a column vector --

maybe:

np.col_vector(arr)

which would be a synonym for np.reshape(arr, (-1,1))

would that make anyone happy?

NOTE: having transposing a 1D array raise an exception would help remove a
lot of the confusion, but it may be too late for that....

> In this case the user pretty clearly meant to be broadcasting along the
> rows of A rather than along the columns, but the code fails silently.
>

hence the exception idea....

maybe a warning?

-CHB

-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception

Chris.Barker at noaa.gov
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From matthew.brett at gmail.com Thu Apr 7 14:21:46 2016
From: matthew.brett at gmail.com (Matthew Brett)
Date: Thu, 7 Apr 2016 11:21:46 -0700
Subject: [Numpy-discussion] ndarray.T2 for 2D transpose
In-Reply-To: References: Message-ID: 

On Thu, Apr 7, 2016 at 11:17 AM, Chris Barker wrote:
> On Thu, Apr 7, 2016 at 10:00 AM, Ian Henriksen
> wrote:
>>
>> Here's another example that I've seen catch people now and again.
>>
>> A = np.random.rand(100, 100)
>> b = np.random.rand(10)
>> A * b.T
>
> typo? that was supposed to be
>
> b = np.random.rand(100). yes?
>
> This is exactly what someone else referred to as the expectations of
> someone that comes from MATLAB, and doesn't yet "get" that 1D arrays are
> 1D arrays.
>
> All of this is EXACTLY the motivation for the matrix class -- which
> never took off, and was never complete (it needed a row and column
> vector implementation, if you ask me). But I think the reason it didn't
> take off is that it really isn't that useful, but is different enough
> from regular arrays to be a greater source of confusion. And it was
> decided that all people REALLY wanted was an obvious way to get matrix
> multiply, which we now have with @.
>
> So this discussion brings up that we also need an easy and obvious way
> to make a column vector --
>
> maybe:
>
> np.col_vector(arr)
>
> which would be a synonym for np.reshape(arr, (-1,1))

Yes, I was going to suggest `colvec` and `rowvec`.

Matthew

From josef.pktd at gmail.com Thu Apr 7 14:31:17 2016
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Thu, 7 Apr 2016 14:31:17 -0400
Subject: [Numpy-discussion] ndarray.T2 for 2D transpose
In-Reply-To: References: Message-ID: 

On Thu, Apr 7, 2016 at 2:17 PM, Chris Barker wrote:

> On Thu, Apr 7, 2016 at 10:00 AM, Ian Henriksen <
> insertinterestingnamehere at gmail.com> wrote:
>
>> Here's another example that I've seen catch people now and again.
>>
>> A = np.random.rand(100, 100)
>> b = np.random.rand(10)
>> A * b.T
>>
>
> typo? that was supposed to be
>
> b = np.random.rand(100). yes?
>
> This is exactly what someone else referred to as the expectations of
> someone that comes from MATLAB, and doesn't yet "get" that 1D arrays are
> 1D arrays.
>
> All of this is EXACTLY the motivation for the matrix class -- which
> never took off, and was never complete (it needed a row and column
> vector implementation, if you ask me). But I think the reason it didn't
> take off is that it really isn't that useful, but is different enough
> from regular arrays to be a greater source of confusion. And it was
> decided that all people REALLY wanted was an obvious way to get matrix
> multiply, which we now have with @.
>
> So this discussion brings up that we also need an easy and obvious way
> to make a column vector --
>
> maybe:
>
> np.col_vector(arr)
>
> which would be a synonym for np.reshape(arr, (-1,1))
>
> would that make anyone happy?
>
> NOTE: having transposing a 1D array raise an exception would help remove
> a lot of the confusion, but it may be too late for that....
>
> In this case the user pretty clearly meant to be broadcasting along the
>> rows of A rather than along the columns, but the code fails silently.
>>
>
> hence the exception idea....
>
> maybe a warning?
>

AFAIR, there is a lot of code that works correctly with .T being a noop
for 1D, e.g. covariance matrix/inner product x.T dot y as mentioned
before.

Write unit tests with non-square 2d arrays, and the exception / test
error shows up fast.

Josef

>
> -CHB
>
> --
>
> Christopher Barker, Ph.D.
> Oceanographer
>
> Emergency Response Division
> NOAA/NOS/OR&R (206) 526-6959 voice
> 7600 Sand Point Way NE (206) 526-6329 fax
> Seattle, WA 98115 (206) 526-6317 main reception
>
> Chris.Barker at noaa.gov
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From irvin.probst at ensta-bretagne.fr Thu Apr 7 14:58:10 2016
From: irvin.probst at ensta-bretagne.fr (Irvin Probst)
Date: Thu, 07 Apr 2016 20:58:10 +0200
Subject: [Numpy-discussion] ndarray.T2 for 2D transpose
In-Reply-To: References: Message-ID: 

On Thu, 7 Apr 2016 14:31:17 -0400, josef.pktd at gmail.com wrote:
>> So this discussion brings up that we also need an easy and obvious
>> way to make a column vector --
>>
>> maybe:
>>
>> np.col_vector(arr)

FWIW I would give a +1e42 to something like np.colvect and np.rowvect
(or whatever variant of these names). This is human-readable and does
not break anything; it's just an explicit shortcut to
reshape/atleast_2d/etc.

Regards.

From insertinterestingnamehere at gmail.com Thu Apr 7 15:21:13 2016
From: insertinterestingnamehere at gmail.com (Ian Henriksen)
Date: Thu, 07 Apr 2016 19:21:13 +0000
Subject: [Numpy-discussion] ndarray.T2 for 2D transpose
In-Reply-To: References: Message-ID: 

On Thu, Apr 7, 2016 at 12:18 PM Chris Barker wrote:

> On Thu, Apr 7, 2016 at 10:00 AM, Ian Henriksen <
> insertinterestingnamehere at gmail.com> wrote:
>
>> Here's another example that I've seen catch people now and again.
>>
>> A = np.random.rand(100, 100)
>> b = np.random.rand(10)
>> A * b.T
>>
>
> typo? that was supposed to be
>
> b = np.random.rand(100). yes?
>

Hahaha, thanks, yes, in describing a common typo I demonstrated another
one. At least this one doesn't fail silently.

> This is exactly what someone else referred to as the expectations of
> someone that comes from MATLAB, and doesn't yet "get" that 1D arrays are
> 1D arrays.
>
> All of this is EXACTLY the motivation for the matrix class -- which
> never took off, and was never complete (it needed a row and column
> vector implementation, if you ask me). But I think the reason it didn't
> take off is that it really isn't that useful, but is different enough
> from regular arrays to be a greater source of confusion. And it was
> decided that all people REALLY wanted was an obvious way to get matrix
> multiply, which we now have with @.
>

Most of the cases I've seen this error have come from people unfamiliar
with matlab who, like I said, weren't tracking dimensions quite as
carefully as they should have. That said, it's just anecdotal evidence. I
wouldn't be at all surprised if this were an issue for matlab users as
well. As far as the matrix class goes, we really shouldn't be telling
anyone to use that anymore.

> So this discussion brings up that we also need an easy and obvious way
> to make a column vector --
>
> maybe:
>
> np.col_vector(arr)
>
> which would be a synonym for np.reshape(arr, (-1,1))
>
> would that make anyone happy?
>
> NOTE: having transposing a 1D array raise an exception would help remove
> a lot of the confusion, but it may be too late for that....
>
> In this case the user pretty clearly meant to be broadcasting along the
>> rows of A rather than along the columns, but the code fails silently.
>>
>
> hence the exception idea....
>

Yep. An exception may be the best way forward here.
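(A minimal sketch of the semantics being discussed -- a hypothetical
helper, not an existing numpy API:)

import numpy as np

def t2(arr):
    # Hypothetical "broadcasting transpose": swap the last two axes,
    # refusing to silently pass 1-d input through unchanged.
    arr = np.asarray(arr)
    if arr.ndim < 2:
        raise ValueError("t2 requires an array with at least 2 dimensions")
    return arr.swapaxes(-1, -2)

>>> t2(np.ones((5, 2, 3))).shape   # transposes each matrix in the stack
(5, 3, 2)
>>> t2(np.arange(3))               # raises ValueError instead of a no-op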
My biggest objection is that the current semantics make it easy for
people to silently get unintended behavior.

> maybe a warning?
>
> -CHB
>

-Ian Henriksen
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From insertinterestingnamehere at gmail.com Thu Apr 7 15:26:47 2016
From: insertinterestingnamehere at gmail.com (Ian Henriksen)
Date: Thu, 07 Apr 2016 19:26:47 +0000
Subject: [Numpy-discussion] ndarray.T2 for 2D transpose
In-Reply-To: References: Message-ID: 

On Thu, Apr 7, 2016 at 12:31 PM wrote:

> Write unit tests with non-square 2d arrays, and the exception / test
> error shows up fast.
>
> Josef
>

Absolutely, but good programming practices don't totally obviate helpful
error messages.

Best,
-Ian
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From josef.pktd at gmail.com Thu Apr 7 15:53:25 2016
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Thu, 7 Apr 2016 15:53:25 -0400
Subject: [Numpy-discussion] ndarray.T2 for 2D transpose
In-Reply-To: References: Message-ID: 

On Thu, Apr 7, 2016 at 3:26 PM, Ian Henriksen <
insertinterestingnamehere at gmail.com> wrote:

> On Thu, Apr 7, 2016 at 12:31 PM wrote:
>
>> Write unit tests with non-square 2d arrays, and the exception / test
>> error shows up fast.
>>
>> Josef
>>
>
> Absolutely, but good programming practices don't totally obviate helpful
> error messages.
>

The current behavior is perfectly well defined, and I don't want a lot of
warnings showing up because .T works suddenly only for ndim != 1.
I make lots of mistakes during programming. But shape mismatches are
usually very fast to catch.

If you want safe programming, then force everyone to use only 2-D like in
matlab. It would have prevented me from making many mistakes.

>>> np.array(1).T
array(1)

Another noop. Why doesn't it convert it to 2d?

Josef

>
> Best,
> -Ian
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From insertinterestingnamehere at gmail.com Thu Apr 7 16:07:40 2016
From: insertinterestingnamehere at gmail.com (Ian Henriksen)
Date: Thu, 07 Apr 2016 20:07:40 +0000
Subject: [Numpy-discussion] ndarray.T2 for 2D transpose
In-Reply-To: References: Message-ID: 

On Thu, Apr 7, 2016 at 1:53 PM wrote:

> On Thu, Apr 7, 2016 at 3:26 PM, Ian Henriksen <
> insertinterestingnamehere at gmail.com> wrote:
>
>> Absolutely, but good programming practices don't totally obviate
>> helpful error messages.
>>
>
> The current behavior is perfectly well defined, and I don't want a lot
> of warnings showing up because .T works suddenly only for ndim != 1.
> I make lots of mistakes during programming. But shape mismatches are
> usually very fast to catch.
>
> If you want safe programming, then force everyone to use only 2-D like
> in matlab. It would have prevented me from making many mistakes.
>
> >>> np.array(1).T
> array(1)
>
> Another noop. Why doesn't it convert it to 2d?
>
> Josef
>

I think we've misunderstood each other. Sorry if I was unclear. As I've
understood the discussion thus far, "raising an error" refers to raising
an error when a 1D array is used with the syntax a.T2 (for swapping the
last two dimensions?).
As far as whether or not a.T should raise an error for 1D arrays, that
ship has definitely already sailed. I'm making the case that there's
value in having an abbreviated syntax that helps prevent errors from
accidentally using a 1D array, not that we should change the existing
semantics.

Cheers,

-Ian
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From josef.pktd at gmail.com Thu Apr 7 16:20:25 2016
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Thu, 7 Apr 2016 16:20:25 -0400
Subject: [Numpy-discussion] ndarray.T2 for 2D transpose
In-Reply-To: References: Message-ID: 

On Thu, Apr 7, 2016 at 4:07 PM, Ian Henriksen <
insertinterestingnamehere at gmail.com> wrote:

> I think we've misunderstood each other. Sorry if I was unclear. As I've
> understood the discussion thus far, "raising an error" refers to raising
> an error when a 1D array is used with the syntax a.T2 (for swapping the
> last two dimensions?). As far as whether or not a.T should raise an
> error for 1D arrays, that ship has definitely already sailed. I'm making
> the case that there's value in having an abbreviated syntax that helps
> prevent errors from accidentally using a 1D array, not that we should
> change the existing semantics.
>

Sorry, I misunderstood. I'm not sure which case CHB initially meant.

Josef

>
> Cheers,
>
> -Ian
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From njs at pobox.com Thu Apr 7 16:37:56 2016
From: njs at pobox.com (Nathaniel Smith)
Date: Thu, 7 Apr 2016 13:37:56 -0700
Subject: [Numpy-discussion] ndarray.T2 for 2D transpose
In-Reply-To: References: Message-ID: 

On Thu, Apr 7, 2016 at 10:00 AM, Ian Henriksen wrote:
>
> Here's another example that I've seen catch people now and again.
>
> A = np.random.rand(100, 100)
> b = np.random.rand(10)
> A * b.T
>
> In this case the user pretty clearly meant to be broadcasting along the
> rows of A rather than along the columns, but the code fails silently.
> When an issue like this gets mixed into a larger series of broadcasting
> operations, the error becomes difficult to find.

I feel like this is an argument for named axes, and broadcasting rules
that respect those names, as in xarray? There's been some speculative
discussion about adding something along these lines to numpy, though
nothing that's even reached the half-baked stage.

-n

--
Nathaniel J. Smith -- https://vorpus.org
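(For context, a sketch of what named-axes broadcasting looks like in
xarray today -- assuming xarray is installed; the dimension names here
are made up:)

>>> import numpy as np
>>> import xarray as xr
>>> A = xr.DataArray(np.ones((2, 3)), dims=["row", "col"])
>>> b = xr.DataArray(np.array([10., 100.]), dims=["row"])
>>> (A * b).dims   # aligned by name, not by position
('row', 'col')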
From chris.barker at noaa.gov Thu Apr 7 16:38:48 2016
From: chris.barker at noaa.gov (Chris Barker)
Date: Thu, 7 Apr 2016 13:38:48 -0700
Subject: [Numpy-discussion] ndarray.T2 for 2D transpose
In-Reply-To: References: Message-ID: 

On Thu, Apr 7, 2016 at 11:31 AM, wrote:

> maybe a warning?
>>
>
> AFAIR, there is a lot of code that works correctly with .T being a noop
> for 1D, e.g. covariance matrix/inner product x.T dot y as mentioned
> before.
>

oh well, then no warning, either.

> Write unit tests with non-square 2d arrays, and the exception / test
> error shows up fast.
>

Guido wrote a note to python-ideas about the conflict between the use
cases of "scripting" and "large system development" -- he urged both
camps to respect and listen to each other.

I think this is very much a "scripters" issue -- so no unit tests,
etc....

For my part, I STILL have to kick myself once in a while for using square
arrays in testing/exploration!

-CHB

-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception

Chris.Barker at noaa.gov
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From stefanv at berkeley.edu Thu Apr 7 18:03:38 2016
From: stefanv at berkeley.edu (Stéfan van der Walt)
Date: Thu, 7 Apr 2016 15:03:38 -0700
Subject: [Numpy-discussion] ndarray.T2 for 2D transpose
In-Reply-To: References: Message-ID: 

On 7 April 2016 at 11:17, Chris Barker wrote:
> np.col_vector(arr)
>
> which would be a synonym for np.reshape(arr, (-1,1))
>
> would that make anyone happy?

I'm curious to see use cases where this doesn't solve the problem.

The most common operations that I run into:

colvec = lambda x: np.c_[x]

x = np.array([1, 2, 3])
A = np.arange(9).reshape((3, 3))

1) x @ x (equivalent to x @ colvec(x))
2) A @ x (equivalent to A @ colvec(x), apart from the shape)
3) x @ A
4) x @ colvec(x) -- gives an error, but perhaps this should work and
   be equivalent to np.dot(colvec(x), rowvec(x)) ?

If (4) were changed, 1D arrays would mostly* be interpreted as row
vectors, and there would be no need for a rowvec function. And we
already do that kind of magic for (2).

Stéfan

* not for special case (1)

From charlesr.harris at gmail.com Fri Apr 8 12:59:22 2016
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Fri, 8 Apr 2016 10:59:22 -0600
Subject: [Numpy-discussion] ndarray.T2 for 2D transpose
In-Reply-To: References: Message-ID: 

On Thu, Apr 7, 2016 at 4:03 PM, Stéfan van der Walt wrote:

> On 7 April 2016 at 11:17, Chris Barker wrote:
> > np.col_vector(arr)
> >
> > which would be a synonym for np.reshape(arr, (-1,1))
> >
> > would that make anyone happy?
>
> I'm curious to see use cases where this doesn't solve the problem.
>
> The most common operations that I run into:
>
> colvec = lambda x: np.c_[x]
>
> x = np.array([1, 2, 3])
> A = np.arange(9).reshape((3, 3))
>
> 1) x @ x (equivalent to x @ colvec(x))
> 2) A @ x (equivalent to A @ colvec(x), apart from the shape)
> 3) x @ A
> 4) x @ colvec(x) -- gives an error, but perhaps this should work and
>    be equivalent to np.dot(colvec(x), rowvec(x)) ?
>
> If (4) were changed, 1D arrays would mostly* be interpreted as row
> vectors, and there would be no need for a rowvec function. And we
> already do that kind of magic for (2).
>

Apropos column/row vectors, I've toyed a bit with the idea of adding a
flag to numpy arrays to indicate that the last index is one or the other,
and maybe neither.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From stefanv at berkeley.edu Fri Apr 8 13:26:54 2016
From: stefanv at berkeley.edu (Stéfan van der Walt)
Date: Fri, 8 Apr 2016 10:26:54 -0700
Subject: [Numpy-discussion] ndarray.T2 for 2D transpose
In-Reply-To: References: Message-ID: 

On 7 April 2016 at 15:03, Stéfan van der Walt wrote:
> 4) x @ colvec(x) -- gives an error, but perhaps this should work and
>    be equivalent to np.dot(colvec(x), rowvec(x)) ?

Sorry, that should have been

4) colvec(x) @ x

Stéfan

From chris.barker at noaa.gov Fri Apr 8 14:17:42 2016
From: chris.barker at noaa.gov (Chris Barker)
Date: Fri, 8 Apr 2016 11:17:42 -0700
Subject: [Numpy-discussion] ndarray.T2 for 2D transpose
In-Reply-To: References: Message-ID: 

On Fri, Apr 8, 2016 at 9:59 AM, Charles R Harris wrote:

> Apropos column/row vectors, I've toyed a bit with the idea of adding a
> flag to numpy arrays to indicate that the last index is one or the
> other, and maybe neither.
>

I don't follow this. Wouldn't it only be an issue for 1D arrays, rather
than the "last index"? Or maybe I'm totally missing the point.

But anyway, are (N,1) and (1, N) arrays insufficient for representing
column and row vectors for some reason? If not -- then we have a way to
express a column or row vector, we just need an easier and more obvious
way to create them.

*maybe* we could have actual column and row vector classes -- they would
BE regular arrays, with (1,N) or (N,1) dimensions, and act the same in
every way except their __repr__, and we'd provide handy factory functions
for them.

These were needed to complete the old Matrix class -- which is no longer
needed now that we have @ (i.e. a 2D array IS a matrix).

Note: this is not very well thought out!

-CHB

-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception

Chris.Barker at noaa.gov
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From charlesr.harris at gmail.com Fri Apr 8 15:55:47 2016
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Fri, 8 Apr 2016 13:55:47 -0600
Subject: [Numpy-discussion] ndarray.T2 for 2D transpose
In-Reply-To: References: Message-ID: 

On Fri, Apr 8, 2016 at 12:17 PM, Chris Barker wrote:

> On Fri, Apr 8, 2016 at 9:59 AM, Charles R Harris <
> charlesr.harris at gmail.com> wrote:
>
>> Apropos column/row vectors, I've toyed a bit with the idea of adding a
>> flag to numpy arrays to indicate that the last index is one or the
>> other, and maybe neither.
>>
>
> I don't follow this. Wouldn't it only be an issue for 1D arrays, rather
> than the "last index"? Or maybe I'm totally missing the point.
>
> But anyway, are (N,1) and (1, N) arrays insufficient for representing
> column and row vectors for some reason? If not -- then we have a way to
> express a column or row vector, we just need an easier and more obvious
> way to create them.
>
> *maybe* we could have actual column and row vector classes -- they would
> BE regular arrays, with (1,N) or (N,1) dimensions, and act the same in
> every way except their __repr__, and we'd provide handy factory
> functions for them.
> > These were needed to complete the old Matrix class -- which is no longer > needed now that we have @ (i.e. a 2D array IS a matrix) > One problem with that approach is that `vrow @ vcol` has dimension 1 x 1, which is not a scalar. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From insertinterestingnamehere at gmail.com Fri Apr 8 16:28:20 2016 From: insertinterestingnamehere at gmail.com (Ian Henriksen) Date: Fri, 08 Apr 2016 20:28:20 +0000 Subject: [Numpy-discussion] ndarray.T2 for 2D transpose In-Reply-To: References: Message-ID: On Thu, Apr 7, 2016 at 4:04 PM St?fan van der Walt wrote: > On 7 April 2016 at 11:17, Chris Barker wrote: > > np.col_vector(arr) > > > > which would be a synonym for np.reshape(arr, (-1,1)) > > > > would that make anyone happy? > > I'm curious to see use cases where this doesn't solve the problem. > > The most common operations that I run into: > > colvec = lambda x: np.c_[x] > > x = np.array([1, 2, 3]) > A = np.arange(9).reshape((3, 3)) > > > 1) x @ x (equivalent to x @ colvec(x)) > 2) A @ x (equivalent to A @ colvec(x), apart from the shape) > 3) x @ A > 4) x @ colvec(x) -- gives an error, but perhaps this should work and > be equivalent to np.dot(colvec(x), rowvec(x)) ? > > If (4) were changed, 1D arrays would mostly* be interpreted as row > vectors, and there would be no need for a rowvec function. And we > already do that kind of magic for (2). > > St?fan > > * not for special case (1) > > Thinking this over a bit more, I think a broadcasting transpose that errors out on arrays that are less than 2D would cover the use cases of which I'm aware. The biggest things to me are having a broadcasting 2D transpose and having some form of transpose that doesn't silently pass 1D arrays through unchanged. Adding properties like colvec and rowvec has less bearing on the use cases I'm aware of, but they both provide nice syntax sugar for common reshape operations. It seems like it would cover all the needed cases for simplifying expressions involving matrix multiplication. It's not totally clear what the semantics should be for higher dimensional arrays or 2D arrays that already have a shape incompatible with the one desired. Best, -Ian Henriksen -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Fri Apr 8 16:52:04 2016 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 8 Apr 2016 16:52:04 -0400 Subject: [Numpy-discussion] ndarray.T2 for 2D transpose In-Reply-To: References: Message-ID: On Fri, Apr 8, 2016 at 3:55 PM, Charles R Harris wrote: > > > On Fri, Apr 8, 2016 at 12:17 PM, Chris Barker > wrote: > >> On Fri, Apr 8, 2016 at 9:59 AM, Charles R Harris < >> charlesr.harris at gmail.com> wrote: >> >>> Apropos column/row vectors, I've toyed a bit with the idea of adding a >>> flag to numpy arrays to indicate that the last index is one or the other, >>> and maybe neither. >>> >> >> I don't follow this. wouldn't it ony be an issue for 1D arrays, rather >> than the "last index". Or maybe I'm totally missing the point. >> >> But anyway, are (N,1) and (1, N) arrays insufficient for representing >> column and row vectors for some reason? If not -- then we have a way to >> express a column or row vector, we just need an easier and more obvious way >> to create them. >> >> *maybe* we could have actual column and row vector classes -- they would >> BE regular arrays, with (1,N) or (N,1) dimensions, and act the same in >> every way except their __repr__. 
>> and we'd provide handy factory functions for them.
>>
>> These were needed to complete the old Matrix class -- which is no
>> longer needed now that we have @ (i.e. a 2D array IS a matrix)
>>
>
> One problem with that approach is that `vrow @ vcol` has dimension
> 1 x 1, which is not a scalar.
>

I think it's not supposed to be a scalar, if @ breaks on scalars:

`vrow @ vcol @ a`

Josef

>
> Chuck
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From alan.isaac at gmail.com Fri Apr 8 17:09:53 2016
From: alan.isaac at gmail.com (Alan Isaac)
Date: Fri, 8 Apr 2016 17:09:53 -0400
Subject: [Numpy-discussion] ndarray.T2 for 2D transpose
In-Reply-To: References: Message-ID: <57081E21.9030103@gmail.com>

On 4/8/2016 4:28 PM, Ian Henriksen wrote:
> The biggest things to me are having a broadcasting 2D transpose and
> having some form of transpose that doesn't silently pass 1D arrays
> through unchanged.

This comment, like much of this thread, seems to long for the matrix
class but not want to actually use it.

It seems pretty simple to me: if you want everything forced to 2d,
always use the matrix class. If you want to use arrays, they work
nicely now, and they work as expected once you understand what you
are working with. (I.e., *not* matrices.)

Btw, numpy.outer(a, b) produces an outer product. This may be off
topic, but it seemed to me that some of the discussion overlooks this.

I suggest that anyone who thinks numpy is falling short in this area
point out how Mma has addressed this shortcoming. Wolfram will never
be accused of a reluctance to add functions when there is a perceived
need ...

Cheers,
Alan Isaac

From charlesr.harris at gmail.com Fri Apr 8 17:11:05 2016
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Fri, 8 Apr 2016 15:11:05 -0600
Subject: [Numpy-discussion] ndarray.T2 for 2D transpose
In-Reply-To: References: Message-ID: 

On Fri, Apr 8, 2016 at 2:52 PM, wrote:

> On Fri, Apr 8, 2016 at 3:55 PM, Charles R Harris <
> charlesr.harris at gmail.com> wrote:
>
>> On Fri, Apr 8, 2016 at 12:17 PM, Chris Barker wrote:
>>
>>> On Fri, Apr 8, 2016 at 9:59 AM, Charles R Harris <
>>> charlesr.harris at gmail.com> wrote:
>>>
>>>> Apropos column/row vectors, I've toyed a bit with the idea of adding
>>>> a flag to numpy arrays to indicate that the last index is one or the
>>>> other, and maybe neither.
>>>>
>>>
>>> I don't follow this. Wouldn't it only be an issue for 1D arrays,
>>> rather than the "last index"? Or maybe I'm totally missing the point.
>>>
>>> But anyway, are (N,1) and (1, N) arrays insufficient for representing
>>> column and row vectors for some reason? If not -- then we have a way
>>> to express a column or row vector, we just need an easier and more
>>> obvious way to create them.
>>>
>>> *maybe* we could have actual column and row vector classes -- they
>>> would BE regular arrays, with (1,N) or (N,1) dimensions, and act the
>>> same in every way except their __repr__, and we'd provide handy
>>> factory functions for them.
>>>
>>> These were needed to complete the old Matrix class -- which is no
>>> longer needed now that we have @ (i.e. a 2D array IS a matrix)
>>>
>>
>> One problem with that approach is that `vrow @ vcol` has dimension
>> 1 x 1,
>> > > I think it's not supposed to be a scalar, if @ breaks on scalars > > `vrow @ vcol @ a > It's supposed to be a scalar and the expression should be written `vrow @ vcol * a`, although parens are probably desireable for clarity `(vrow @ vcol) * a`. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Fri Apr 8 17:13:08 2016 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 8 Apr 2016 14:13:08 -0700 Subject: [Numpy-discussion] ndarray.T2 for 2D transpose In-Reply-To: <57081E21.9030103@gmail.com> References: <57081E21.9030103@gmail.com> Message-ID: On Fri, Apr 8, 2016 at 2:09 PM, Alan Isaac wrote: > On 4/8/2016 4:28 PM, Ian Henriksen wrote: >> >> The biggest things to me are having a broadcasting 2D transpose and having >> some >> form of transpose that doesn't silently pass 1D arrays through unchanged. > > > > This comment, like much of this thread, seems to long > for the matrix class but not want to actually use it. > > It seems pretty simple to me: if you want everything > forced to 2d, always use the matrix class. If you want > to use arrays, they work nicely now, and they work > as expected once you understand what you are > working with. (I.e., *not* matrices.) Note the word "broadcasting" -- he doesn't want 2d matrices, he wants tools that make it easy to work with stacks of 2d matrices stored in 2-or-more-dimensional arrays. -n -- Nathaniel J. Smith -- https://vorpus.org From josef.pktd at gmail.com Fri Apr 8 17:30:44 2016 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 8 Apr 2016 17:30:44 -0400 Subject: [Numpy-discussion] ndarray.T2 for 2D transpose In-Reply-To: References: Message-ID: On Fri, Apr 8, 2016 at 5:11 PM, Charles R Harris wrote: > > > On Fri, Apr 8, 2016 at 2:52 PM, wrote: > >> >> >> On Fri, Apr 8, 2016 at 3:55 PM, Charles R Harris < >> charlesr.harris at gmail.com> wrote: >> >>> >>> >>> On Fri, Apr 8, 2016 at 12:17 PM, Chris Barker >>> wrote: >>> >>>> On Fri, Apr 8, 2016 at 9:59 AM, Charles R Harris < >>>> charlesr.harris at gmail.com> wrote: >>>> >>>>> Apropos column/row vectors, I've toyed a bit with the idea of adding a >>>>> flag to numpy arrays to indicate that the last index is one or the other, >>>>> and maybe neither. >>>>> >>>> >>>> I don't follow this. wouldn't it ony be an issue for 1D arrays, rather >>>> than the "last index". Or maybe I'm totally missing the point. >>>> >>>> But anyway, are (N,1) and (1, N) arrays insufficient for representing >>>> column and row vectors for some reason? If not -- then we have a way to >>>> express a column or row vector, we just need an easier and more obvious way >>>> to create them. >>>> >>>> *maybe* we could have actual column and row vector classes -- they >>>> would BE regular arrays, with (1,N) or (N,1) dimensions, and act the same >>>> in every way except their __repr__. and we're provide handy factor >>>> functions for them. >>>> >>>> These were needed to complete the old Matrix class -- which is no >>>> longer needed now that we have @ (i.e. a 2D array IS a matrix) >>>> >>> >>> One problem with that approach is that `vrow @ vcol` has dimension 1 x >>> 1, which is not a scalar. >>> >> >> I think it's not supposed to be a scalar, if @ breaks on scalars >> >> `vrow @ vcol @ a >> > > It's supposed to be a scalar and the expression should be written `vrow @ > vcol * a`, although parens are probably desireable for clarity `(vrow @ > vcol) * a`. 
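For concreteness, a minimal sketch of the shapes being argued about (plain numpy; @ requires Python 3.5 and numpy 1.10+; vrow and vcol here are just ordinary 2-d arrays, not a new class):

import numpy as np
x = np.array([1., 2., 3.])   # plain 1-d array
vrow = x.reshape(1, -1)      # 1 x 3 "row vector"
vcol = x.reshape(-1, 1)      # 3 x 1 "column vector"
(vrow @ vcol).shape          # (1, 1): a 2-d array, not a scalar
(vrow @ vcol)[0, 0]          # 14.0, the scalar has to be dug out by indexing
x @ x                        # 14.0, the 1-d case contracts to a scalar directly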
if a is 1d or a 2d vcol, and vrow and vcol could also be 2d arrays (not a single row or col), this is just a part of a long linear algebra expression. 1d dot 1d is different from vrow dot vcol; A dot 1d is different from A dot vcol. There are intentional differences in the linear algebra behavior of 1d versus a col or row vector. One of those is dropping the extra dimension. We are using this a lot to switch between 1-d and 2-d cases. And another great thing about numpy is that often code immediately generalizes from 1-d to 2-d with just some tiny adjustments. (I haven't played with @ yet) I worry that making the 1-d arrays suddenly behave ambiguously as a weird 1-d/2-d mixture will make code more inconsistent and more difficult to follow. Shortcuts and variations of atleast_2d sound fine, but not implicitly. Josef > > Chuck > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From alan.isaac at gmail.com Fri Apr 8 18:04:50 2016 From: alan.isaac at gmail.com (Alan Isaac) Date: Fri, 8 Apr 2016 18:04:50 -0400 Subject: [Numpy-discussion] ndarray.T2 for 2D transpose In-Reply-To: References: <57081E21.9030103@gmail.com> Message-ID: <57082B02.7050509@gmail.com> On 4/8/2016 5:13 PM, Nathaniel Smith wrote: > he doesn't want 2d matrices, he wants > tools that make it easy to work with stacks of 2d matrices stored in > 2-or-more-dimensional arrays. Like `map`? Alan Isaac From insertinterestingnamehere at gmail.com Fri Apr 8 19:37:40 2016 From: insertinterestingnamehere at gmail.com (Ian Henriksen) Date: Fri, 08 Apr 2016 23:37:40 +0000 Subject: [Numpy-discussion] ndarray.T2 for 2D transpose In-Reply-To: <57082B02.7050509@gmail.com> References: <57081E21.9030103@gmail.com> <57082B02.7050509@gmail.com> Message-ID: On Fri, Apr 8, 2016 at 4:04 PM Alan Isaac wrote: > On 4/8/2016 5:13 PM, Nathaniel Smith wrote: > > he doesn't want 2d matrices, he wants > > tools that make it easy to work with stacks of 2d matrices stored in > > 2-or-more-dimensional arrays. > > > Like `map`? > > Alan Isaac > > Sorry if there's any misunderstanding here. Map doesn't really help much. That'd only be good for dealing with three dimensional cases and you'd get a list of arrays, not a view with the appropriate axes swapped. np.einsum('...ji', a), np.swapaxes(a, -1, -2) and np.rollaxis(a, -1, -2) all do the right thing, but they are all fairly verbose for such a simple operation. Here's a simple example of when such a thing would be useful. With 2D arrays you can write a.dot(b.T) If you want to have that same operation follow the existing gufunc broadcasting semantics you end up having to write one of the following: np.einsum('...ij,...kj', a, b) a @ np.swapaxes(b, -1, -2) a @ np.rollaxis(b, -1, -2) None of those are very concise, and, when I look at them, I don't usually think "that does a.dot(b.T)." If we introduced the T2 syntax, this would be valid: a @ b.T2 It makes the intent much clearer. This helps readability even more when you're trying to put together something that follows a larger equation while still broadcasting correctly. Does this help make the use cases a bit clearer? Best, -Ian Henriksen -------------- next part -------------- An HTML attachment was scrubbed...
URL: From charlesr.harris at gmail.com Sat Apr 9 17:24:19 2016 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 9 Apr 2016 15:24:19 -0600 Subject: [Numpy-discussion] Preliminary schedule for 1.12 Message-ID: Hi All, As we are trying out an accelerated release schedule this year it is time to start thinking of the 1.12 release. My current plan is to release a Numpy 1.11.1 at the end of the month with a few small fixups. Numpy 1.11.0 looks to have been one of our more successful releases and there are currently only three fixups in 1.11.x, none of which are major, so I think we can just release with no betas or release candidates unless something terrible turns up. After that release I'm looking to branch 1.12.x in early to mid May aiming at a final sometime in late July or early August. The main thing I think we must have in 1.12 is `__numpy_ufunc__`, so unless someone else wants to resurrect that topic I will do so myself starting sometime next week. I don't think a lot of work is needed to finish things up, Nathaniel's PR #6001 is a good start and with the addition of some opt out code that adheres to the Python convention should provide a solution we can all live with. Others may disagree, which is why we are still discussing the topic at this late date, but I'm hopeful. If there are other PRs or issues that folks feel need to be in 1.12.x, please reply to this post. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From yellowhat46 at gmail.com Sun Apr 10 06:04:16 2016 From: yellowhat46 at gmail.com (Vasco Gervasi) Date: Sun, 10 Apr 2016 12:04:16 +0200 Subject: [Numpy-discussion] f2py: ram usage Message-ID: Hi all, I am trying to write some code to do calculation onto an array: for each row I need to do some computation and have a number as return. To speed up the process I wrote a fortran subroutine that is called from python [using f2py] for each row of the array, so the input of this subroutine is a row and the output is a number. This method works but I saw some speed advantage if I pass the entire array to fortran and then, inside fortran, call the subroutine that does the math; so in this case I pass an array and return a vector. But I noticed that when python pass the array to fortran, the array is copied and the RAM usage double. Is there a way to "move" the array to fortran, I don't care if the array is lost after the call to fortran. The pyd module is generated using: python f2py.py -c --opt="-ffree-form -Ofast" -m F2PYMOD F2PYMOD.f90 Thanks Vasco -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Sun Apr 10 06:53:49 2016 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Sun, 10 Apr 2016 12:53:49 +0200 Subject: [Numpy-discussion] f2py: ram usage In-Reply-To: References: Message-ID: <1460285629.5816.3.camel@sipsolutions.net> On So, 2016-04-10 at 12:04 +0200, Vasco Gervasi wrote: > Hi all, > I am trying to write some code to do calculation onto an array: for > each row I need to do some computation and have a number as return. > To speed up the process I wrote a fortran subroutine that is called > from python [using f2py] for each row of the array, so the input of > this subroutine is a row and the output is a number. > This method works but I saw some speed advantage if I pass the entire > array to fortran and then, inside fortran, call the subroutine that > does the math; so in this case I pass an array and return a vector. 
> But I noticed that when python pass the array to fortran, the array > is copied and the RAM usage double. I expect that the fortran code needs your arrays to be fortran contiguous, so the wrappers need to copy them. The easiest solution may be to create your array in python with the `order="F"` flag. NumPy will have a tendency to prefer C-order and uses it as default though when doing something with an "F" ordered array. That said, I have never used f2py, so these are just well founded guesses. - Sebastian > Is there a way to "move" the array to fortran, I don't care if the > array is lost after the call to fortran. > The pyd module is generated using: python f2py.py -c --opt="-ffree > -form -Ofast" -m F2PYMOD F2PYMOD.f90 > > Thanks > Vasco > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From gnurser at gmail.com Mon Apr 11 06:56:13 2016 From: gnurser at gmail.com (George Nurser) Date: Mon, 11 Apr 2016 11:56:13 +0100 Subject: [Numpy-discussion] f2py: ram usage In-Reply-To: <1460285629.5816.3.camel@sipsolutions.net> References: <1460285629.5816.3.camel@sipsolutions.net> Message-ID: Yes, f2py is probably copying the arrays; you can check this by appending -DF2PY_REPORT_ON_ARRAY_COPY=1 to your call to f2py. I normally prefer to keep the numpy arrays C-order (most efficient for numpy) and simply pass the array transpose to the f2py-ized fortran routine. This means that the fortran array indices are reversed, but this is the most natural way in any case. --George Nurser On 10 April 2016 at 11:53, Sebastian Berg wrote: > On So, 2016-04-10 at 12:04 +0200, Vasco Gervasi wrote: > > Hi all, > > I am trying to write some code to do calculation onto an array: for > > each row I need to do some computation and have a number as return. > > To speed up the process I wrote a fortran subroutine that is called > > from python [using f2py] for each row of the array, so the input of > > this subroutine is a row and the output is a number. > > This method works but I saw some speed advantage if I pass the entire > > array to fortran and then, inside fortran, call the subroutine that > > does the math; so in this case I pass an array and return a vector. > > But I noticed that when python pass the array to fortran, the array > > is copied and the RAM usage double. > > I expect that the fortran code needs your arrays to be fortran > contiguous, so the wrappers need to copy them. > > The easiest solution may be to create your array in python with the > `order="F"` flag. NumPy will have a tendency to prefer C-order and uses > it as default though when doing something with an "F" ordered array. > > That said, I have never used f2py, so these are just well founded > guesses. > > - Sebastian > > > > > Is there a way to "move" the array to fortran, I don't care if the > > array is lost after the call to fortran. 
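A minimal sketch of the workaround Sebastian suggests, using the module name from the build command quoted here (the subroutine name calc is hypothetical; rebuilding with -DF2PY_REPORT_ON_ARRAY_COPY=1, as George suggests below, would confirm whether a copy still happens):

import numpy as np
import F2PYMOD  # the f2py-built extension module from the quoted command

# Allocating the array Fortran-contiguous up front lets the f2py wrapper
# hand the buffer to Fortran directly instead of making a copy.
arr = np.zeros((1000000, 8), order='F')
# ... fill arr row by row ...
result = F2PYMOD.calc(arr)  # hypothetical subroutine; no copy, so no doubled RAM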
> > The pyd module is generated using: python f2py.py -c --opt="-ffree > > -form -Ofast" -m F2PYMOD F2PYMOD.f90 > > > > Thanks > > Vasco > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From yellowhat46 at gmail.com Mon Apr 11 07:24:14 2016 From: yellowhat46 at gmail.com (Vasco Gervasi) Date: Mon, 11 Apr 2016 13:24:14 +0200 Subject: [Numpy-discussion] f2py: ram usage In-Reply-To: References: <1460285629.5816.3.camel@sipsolutions.net> Message-ID: Using order='F' solved the problem. Thanks for the reply. -------------- next part -------------- An HTML attachment was scrubbed... URL: From matej.tyc at gmail.com Mon Apr 11 08:39:41 2016 From: matej.tyc at gmail.com (=?UTF-8?B?TWF0xJtqICBUw73EjQ==?=) Date: Mon, 11 Apr 2016 14:39:41 +0200 Subject: [Numpy-discussion] Numpy arrays shareable among related processes (PR #7533) Message-ID: Dear Numpy developers, I propose a pull request https://github.com/numpy/numpy/pull/7533 that features numpy arrays that can be shared among processes (with some effort). Why: In CPython, multiprocessing is the only way to exploit multi-core CPUs if your parallel code can't avoid creating Python objects. In that case, CPython's GIL makes threads unusable. However, unlike with threading, sharing data among processes is something that is non-trivial and platform-dependent. Although numpy (and certainly some other packages) implement some operations in a way that GIL is not a concern, consider another case: You have a large amount of data in the form of a numpy array and you want to pass it to a function of an arbitrary Python module that also expects a numpy array (e.g. a list of vertex coordinates as an input and an array of the corresponding polygon as an output). Here, it is clear the GIL is an issue, and since you want a numpy array on both ends, you would have to copy your numpy array to a multiprocessing.Array (to pass the data) and then convert it back to an ndarray in the worker process. This contribution would streamline it a bit - you would create an array as you are used to, pass it to the subprocess as you would do with the multiprocessing.Array, and the process can work with a numpy array right away. How: The idea is to create a numpy array in a buffer that can be shared among processes. Python has support for this in its standard library, so the current solution creates a multiprocessing.Array and then passes it as the "buffer" to ndarray.__new__. That would be it on Unixes, but on Windows, there has to be a custom pickle method, otherwise the array "forgets" that its buffer is that special and the sharing doesn't work. Some of what has been said in the pull request & my answer to that: * ... I do see some value in providing a canonical right way to construct shared memory arrays in NumPy, but I'm not very happy with this solution, ... terrible code organization (with the global variables): * I understand that, however this is a pattern of Python multiprocessing and everybody who wants to use the Pool and shared data either is familiar with this approach or has to become familiar with [2, 3].
The good compromise is to have a separate module for each parallel calculation, so global variables are not a problem. * Can you explain why the ndarray subclass is needed? Subclasses can be rather annoying to get right, and also for other reasons. * The shmarray class needs the custom pickler (but only on Windows). * If there's some way we can paper over the boilerplate such that users can use it without understanding the arcana of multiprocessing, then yes, that would be great. But otherwise I'm not sure there's anything to be gained by putting it in a library rather than referring users to the examples on StackOverflow [1] [2]. * What about telling users: "You can use numpy with multiprocessing. Remember the multiprocessing.Value and multiprocessing.Array classes? numpy.shm works exactly the same way, which means that it shares their limitations. Refer to an example: ." Notice that although those SO links contain all of the information, it is very difficult to get it up and running for a newcomer like me a few years ago. * This needs tests and justification for custom pickling methods, which are not used in any of the current examples. ... * I am sorry, but I don't fully understand that point. The custom pickling method of shmarray has to be there on Windows, but users don't have to know about it at all. As noted earlier, the global variable is the only way of using standard Python multiprocessing.Pool with shared objects. [1]: http://stackoverflow.com/questions/10721915/shared-memory-objects-in-python-multiprocessing [2]: http://stackoverflow.com/questions/7894791/use-numpy-array-in-shared-memory-for-multiprocessing [3]: http://stackoverflow.com/questions/1675766/how-to-combine-pool-map-with-array-shared-memory-in-python-multiprocessing From shoyer at gmail.com Mon Apr 11 12:37:41 2016 From: shoyer at gmail.com (Stephan Hoyer) Date: Mon, 11 Apr 2016 09:37:41 -0700 Subject: [Numpy-discussion] Numpy arrays shareable among related processes (PR #7533) In-Reply-To: References: Message-ID: On Mon, Apr 11, 2016 at 5:39 AM, Matěj Týč wrote: > * ... I do see some value in providing a canonical right way to > construct shared memory arrays in NumPy, but I'm not very happy with > this solution, ... terrible code organization (with the global > variables): > * I understand that, however this is a pattern of Python > multiprocessing and everybody who wants to use the Pool and shared > data either is familiar with this approach or has to become familiar > with [2, 3]. The good compromise is to have a separate module for each > parallel calculation, so global variables are not a problem. > OK, we can agree to disagree on this one. I still don't think I could get code using this pattern checked in at my work (for good reason). > * If there's some way we can paper over the boilerplate such that > users can use it without understanding the arcana of multiprocessing, > then yes, that would be great. But otherwise I'm not sure there's > anything to be gained by putting it in a library rather than referring > users to the examples on StackOverflow [1] [2]. > * What about telling users: "You can use numpy with multiprocessing. > Remember the multiprocessing.Value and multiprocessing.Array classes? > numpy.shm works exactly the same way, which means that it shares their > limitations. Refer to an example: ." Notice that > although those SO links contain all of the information, it is very > difficult to get it up and running for a newcomer like me a few years > ago.
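For reference, the whole pattern in minimal, runnable form (Unix fork semantics assumed, which is exactly the case the Windows-only pickler exists to handle; the names here are illustrative and not the PR's API, following the StackOverflow recipes cited above):

import ctypes
import multiprocessing as mp
import numpy as np

# module-level global: workers inherit it when the Pool forks
shared = mp.Array(ctypes.c_double, 100)
arr = np.frombuffer(shared.get_obj())  # ndarray view on the shared buffer

def fill(i):
    arr[i] = i  # writes land in the shared memory, no copying

if __name__ == '__main__':
    pool = mp.Pool(4)
    pool.map(fill, range(100))
    pool.close()
    print(arr.sum())  # 4950.0, the workers' writes are visible here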
> I guess I'm still not convinced this is the best we can do with the multiprocessing library. If we're going to do this, then we definitely need to have the fully canonical example. For example, could you make the shared array a global variable and then still pass references to functions called by the processes anyways? The examples on stackoverflow that we're both looking at are varied enough that it's not obvious to me that this is as good as it gets. > * This needs tests and justification for custom pickling methods, > which are not used in any of the current examples. ... > * I am sorry, but I don't fully understand that point. The custom > pickling method of shmarray has to be there on Windows, but users > don't have to know about it at all. As noted earlier, the global > variable is the only way of using standard Python multiprocessing.Pool > with shared objects. > That sounds like a fine justification, but given that it wasn't obvious, you need a comment saying as much in the source code :). Also, it breaks pickle, which is another limitation that needs to be documented. -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Mon Apr 11 19:24:12 2016 From: chris.barker at noaa.gov (Chris Barker) Date: Mon, 11 Apr 2016 16:24:12 -0700 Subject: [Numpy-discussion] ndarray.T2 for 2D transpose In-Reply-To: References: <57081E21.9030103@gmail.com> <57082B02.7050509@gmail.com> Message-ID: On Fri, Apr 8, 2016 at 4:37 PM, Ian Henriksen < insertinterestingnamehere at gmail.com> wrote: > If we introduced the T2 syntax, this would be valid: > > a @ b.T2 > > It makes the intent much clearer. > would: a @ colvector(b) work too? or is T2 generalized to more than one column? (though I suppose colvector() could support more than one column also -- weird though that might be.) -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From insertinterestingnamehere at gmail.com Mon Apr 11 20:25:10 2016 From: insertinterestingnamehere at gmail.com (Ian Henriksen) Date: Tue, 12 Apr 2016 00:25:10 +0000 Subject: [Numpy-discussion] ndarray.T2 for 2D transpose In-Reply-To: References: <57081E21.9030103@gmail.com> <57082B02.7050509@gmail.com> Message-ID: On Mon, Apr 11, 2016 at 5:24 PM Chris Barker wrote: > On Fri, Apr 8, 2016 at 4:37 PM, Ian Henriksen < > insertinterestingnamehere at gmail.com> wrote: > > >> If we introduced the T2 syntax, this would be valid: >> >> a @ b.T2 >> >> It makes the intent much clearer. >> > > would: > > a @ colvector(b) > > work too? or is T2 generalized to more than one column? (though I suppose > colvector() could support more than one column also -- weird though that > might be.) > > -CHB > Right, so I've opted to withdraw my support for having the T2 syntax prepend dimensions when the array has fewer than two dimensions. Erroring out in the 1D case addresses my concerns well enough. The colvec/rowvec idea seems nice too, but it matters a bit less to me, so I'll leave that discussion open for others to follow up on. Having T2 be a broadcasting transpose is a bit more general than any semantics for rowvec/colvec that I can think of.
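As a sketch of what such a broadcasting transpose would do (T2 itself does not exist; np.swapaxes is the stand-in, wrapped here in a hypothetical helper):

import numpy as np

def t2(a):
    # broadcasting transpose: swap the last two axes, refuse anything below 2-d
    if a.ndim < 2:
        raise ValueError('broadcasting transpose requires at least 2 dimensions')
    return np.swapaxes(a, -1, -2)

a = np.random.rand(2, 3, 4)
b = np.random.rand(2, 1, 3, 4)
(a @ t2(b)).shape  # (2, 2, 3, 3), matching the example that follows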
Here are specific arrays that, in the expression a @ b.T2, can only be handled using some sort of transpose: a = np.random.rand(2, 3, 4) b = np.random.rand(2, 1, 3, 4) Using these inputs, the expression a @ b.T2 would have the shape (2, 2, 3, 3). All the T2 property would be doing is a transpose that has similar broadcasting semantics to matmul, solve, inv, and the other gufuncs. The primary difference with those other functions is that transposes would be done as views whereas the other operations, because of the computations they perform, all have to return new output arrays. Hope this helps, -Ian Henriksen -------------- next part -------------- An HTML attachment was scrubbed... URL: From patrik.andersson.3 at volvocars.com Tue Apr 12 05:23:38 2016 From: patrik.andersson.3 at volvocars.com (Andersson, Patrik (S.E.)) Date: Tue, 12 Apr 2016 09:23:38 +0000 Subject: [Numpy-discussion] NumpyDistribution has no object attribute include_package_data Message-ID: <492BB257675288498D0A474D4BE584DD1CBE510E@AMSPRD4001MB023.050d.mgd.msft.net> Hi, I'm compiling Numpy 1.11.0 from source with python3.3, gcc-5.3.0, with LAPACK_SRC and BLAS_SRC set. During the creation of numpy.egg-info, an error occurs with NumpyDistribution: it complains that it has no attribute include_package_data. Where shall I start looking? Kind regards Patrik -------------- next part -------------- An HTML attachment was scrubbed... URL: From tjhnson at gmail.com Tue Apr 12 20:57:59 2016 From: tjhnson at gmail.com (T J) Date: Tue, 12 Apr 2016 19:57:59 -0500 Subject: [Numpy-discussion] Floor divison on int returns float In-Reply-To: <5702C709.1040801@hawaii.edu> References: <5702C709.1040801@hawaii.edu> Message-ID: Thanks Eric. Also relevant: https://github.com/numba/numba/issues/909 Looks like Numba has found a way to avoid this edge case. On Monday, April 4, 2016, Eric Firing wrote: > On 2016/04/04 9:23 AM, T J wrote: > >> I'm on NumPy 1.10.4 (mkl). >> >> >>> np.uint(3) // 2 # 1.0 >> >>> 3 // 2 # 1 >> >> Is this behavior expected? It's certainly not desired from my >> perspective. If this is not a bug, could someone explain the rationale >> to me. >> >> Thanks. >> > > I agree that it's almost always undesirable; one would reasonably expect > some sort of int. Here's what I think is going on: > > The odd behavior occurs only with np.uint, which is np.uint64, and when > the denominator is a signed int. The problem is that if the denominator is > negative, the result will be negative, so it can't have the same type as > the first numerator. Furthermore, if the denominator is -1, the result > will be minus the numerator, and that can't be represented by np.uint or > np.int. Therefore the result is returned as np.float64. The promotion > rules are based on what *could* happen in an operation, not on what *is* > happening in a given instance. > > Eric > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed...
URL: From matthew.brett at gmail.com Tue Apr 12 22:11:21 2016 From: matthew.brett at gmail.com (Matthew Brett) Date: Tue, 12 Apr 2016 19:11:21 -0700 Subject: [Numpy-discussion] Using OpenBLAS for manylinux wheels In-Reply-To: References: Message-ID: On Wed, Apr 6, 2016 at 6:47 AM, Olivier Grisel wrote: > I updated the issue: > > https://github.com/xianyi/OpenBLAS-CI/issues/10#issuecomment-206195714 > > The random test_nanmedian_all_axis failure is unrelated to openblas > and should be ignored. It looks like all is well now, at least for the OpenBLAS buildbots : http://build.openblas.net/builders Matthew From matthew.brett at gmail.com Tue Apr 12 22:15:19 2016 From: matthew.brett at gmail.com (Matthew Brett) Date: Tue, 12 Apr 2016 19:15:19 -0700 Subject: [Numpy-discussion] linux wheels coming soon In-Reply-To: References: Message-ID: Hi, On Sat, Apr 2, 2016 at 6:11 PM, Matthew Brett wrote: > On Fri, Mar 25, 2016 at 6:39 AM, Peter Cock wrote: >> On Fri, Mar 25, 2016 at 3:02 AM, Robert T. McGibbon wrote: >>> I suspect that many of the maintainers of major scipy-ecosystem projects are >>> aware of these (or other similar) travis wheel caches, but would guess that >>> the pool of travis-ci python users who weren't aware of these wheel caches >>> is much much larger. So there will still be a lot of travis-ci clock cycles >>> saved by manylinux wheels. >>> >>> -Robert >> >> Yes exactly. Availability of NumPy Linux wheels on PyPI is definitely something >> I would suggest adding to the release notes. Hopefully this will help trigger >> a general availability of wheels in the numpy-ecosystem :) >> >> In the case of Travis CI, their VM images for Python already have a version >> of NumPy installed, but having the latest version of NumPy and SciPy etc >> available as Linux wheels would be very nice. > > We're very nearly there now. > > The latest versions of numpy, scipy, scikit-image, pandas, numexpr, > statsmodels wheels for testing at > http://ccdd0ebb5a931e58c7c5-aae005c4999d7244ac63632f8b80e089.r77.cf2.rackcdn.com/ > > Please do test with: > > python -m pip install --upgrade pip > > pip install --trusted-host=ccdd0ebb5a931e58c7c5-aae005c4999d7244ac63632f8b80e089.r77.cf2.rackcdn.com > --find-links=http://ccdd0ebb5a931e58c7c5-aae005c4999d7244ac63632f8b80e089.r77.cf2.rackcdn.com > numpy scipy scikit-learn numexpr > > python -c 'import numpy; numpy.test("full")' > python -c 'import scipy; scipy.test("full")' > > We would love to get any feedback as to whether these work on your machines. I've just rebuilt these wheels with the just-released OpenBLAS 0.2.18. OpenBLAS is now passing all its own tests and tests on numpy / scipy / scikit-learn at http://build.openblas.net/builders Our tests of the wheels look good too: http://nipy.bic.berkeley.edu/builders/manylinux-2.7-debian http://nipy.bic.berkeley.edu/builders/manylinux-2.7-debian https://travis-ci.org/matthew-brett/manylinux-testing So I think these are ready to go. I propose uploading these wheels for numpy and scipy to pypi tomorrow unless anyone has an objection. Cheers, Matthew From antony.lee at berkeley.edu Tue Apr 12 22:17:21 2016 From: antony.lee at berkeley.edu (Antony Lee) Date: Tue, 12 Apr 2016 19:17:21 -0700 Subject: [Numpy-discussion] Floor divison on int returns float In-Reply-To: References: <5702C709.1040801@hawaii.edu> Message-ID: This kind of issue (see also https://github.com/numpy/numpy/issues/3511) has become more annoying now that indexing requires integers (indexing with a float raises a VisibleDeprecationWarning). 
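A short session showing the annoyance (numpy 1.11 era behavior, as described here):

import numpy as np

q = np.uint64(6) // 2    # 3.0, a np.float64, even though both operands are integral
arr = np.arange(10)
arr[q]                   # warns (VisibleDeprecationWarning): float index
arr[int(q)]              # the workaround: cast the quotient back to a Python int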
The argument "dividing an uint by an int may give a result that does not fit in an uint nor in an int" does not sound very convincing to me, after all even adding two (sized) ints may give a result that does not fit in the same size, but numpy does not upcast everything there: In [17]: np.int32(2**31 - 1) + np.int32(2**31 - 1) Out[17]: -2 In [18]: type(np.int32(2**31 - 1) + np.int32(2**31 - 1)) Out[18]: numpy.int32 I'd think that overflowing operations should just overflow (and possibly raise a warning via the seterr mechanism), but their possibility should not be an argument for modifying the output type. Antony 2016-04-12 17:57 GMT-07:00 T J : > Thanks Eric. > > Also relevant: https://github.com/numba/numba/issues/909 > > Looks like Numba has found a way to avoid this edge case. > > > > On Monday, April 4, 2016, Eric Firing wrote: > >> On 2016/04/04 9:23 AM, T J wrote: >> >>> I'm on NumPy 1.10.4 (mkl). >>> >>> >>> np.uint(3) // 2 # 1.0 >>> >>> 3 // 2 # 1 >>> >>> Is this behavior expected? It's certainly not desired from my >>> perspective. If this is not a bug, could someone explain the rationale >>> to me. >>> >>> Thanks. >>> >> >> I agree that it's almost always undesirable; one would reasonably expect >> some sort of int. Here's what I think is going on: >> >> The odd behavior occurs only with np.uint, which is np.uint64, and when >> the denominator is a signed int. The problem is that if the denominator is >> negative, the result will be negative, so it can't have the same type as >> the first numerator. Furthermore, if the denominator is -1, the result >> will be minus the numerator, and that can't be represented by np.uint or >> np.int. Therefore the result is returned as np.float64. The promotion >> rules are based on what *could* happen in an operation, not on what *is* >> happening in a given instance. >> >> Eric >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Tue Apr 12 22:56:49 2016 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 12 Apr 2016 19:56:49 -0700 Subject: [Numpy-discussion] Floor divison on int returns float In-Reply-To: References: <5702C709.1040801@hawaii.edu> Message-ID: So what type should uint64 + int64 return? On Apr 12, 2016 7:17 PM, "Antony Lee" wrote: > This kind of issue (see also https://github.com/numpy/numpy/issues/3511) > has become more annoying now that indexing requires integers (indexing with > a float raises a VisibleDeprecationWarning). The argument "dividing an > uint by an int may give a result that does not fit in an uint nor in an > int" does not sound very convincing to me, after all even adding two > (sized) ints may give a result that does not fit in the same size, but > numpy does not upcast everything there: > > In [17]: np.int32(2**31 - 1) + np.int32(2**31 - 1) > Out[17]: -2 > > In [18]: type(np.int32(2**31 - 1) + np.int32(2**31 - 1)) > Out[18]: numpy.int32 > > > I'd think that overflowing operations should just overflow (and possibly > raise a warning via the seterr mechanism), but their possibility should not > be an argument for modifying the output type. > > Antony > > 2016-04-12 17:57 GMT-07:00 T J : > >> Thanks Eric. 
>> >> Also relevant: https://github.com/numba/numba/issues/909 >> >> Looks like Numba has found a way to avoid this edge case. >> >> >> >> On Monday, April 4, 2016, Eric Firing wrote: >> >>> On 2016/04/04 9:23 AM, T J wrote: >>> >>>> I'm on NumPy 1.10.4 (mkl). >>>> >>>> >>> np.uint(3) // 2 # 1.0 >>>> >>> 3 // 2 # 1 >>>> >>>> Is this behavior expected? It's certainly not desired from my >>>> perspective. If this is not a bug, could someone explain the rationale >>>> to me. >>>> >>>> Thanks. >>>> >>> >>> I agree that it's almost always undesirable; one would reasonably expect >>> some sort of int. Here's what I think is going on: >>> >>> The odd behavior occurs only with np.uint, which is np.uint64, and when >>> the denominator is a signed int. The problem is that if the denominator is >>> negative, the result will be negative, so it can't have the same type as >>> the first numerator. Furthermore, if the denominator is -1, the result >>> will be minus the numerator, and that can't be represented by np.uint or >>> np.int. Therefore the result is returned as np.float64. The promotion >>> rules are based on what *could* happen in an operation, not on what *is* >>> happening in a given instance. >>> >>> Eric >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> https://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From antony.lee at berkeley.edu Wed Apr 13 00:26:19 2016 From: antony.lee at berkeley.edu (Antony Lee) Date: Tue, 12 Apr 2016 21:26:19 -0700 Subject: [Numpy-discussion] Floor divison on int returns float In-Reply-To: References: <5702C709.1040801@hawaii.edu> Message-ID: Whatever the C rules are (which I don't know off the top of my head, but I guess it must be one of uint64 or int64). It's not as if conversion to float64 was lossless: In [38]: 2**63 - (np.int64(2**62-1) + np.uint64(2**62-1)) Out[38]: 0.0 Note that the result of (np.int64(2**62-1) + np.uint64(2**62-1)) would actually fit in an int64 (or an uint64), so arguably the conversion to float makes things worse. Antony 2016-04-12 19:56 GMT-07:00 Nathaniel Smith : > So what type should uint64 + int64 return? > On Apr 12, 2016 7:17 PM, "Antony Lee" wrote: > >> This kind of issue (see also https://github.com/numpy/numpy/issues/3511) >> has become more annoying now that indexing requires integers (indexing with >> a float raises a VisibleDeprecationWarning). The argument "dividing an >> uint by an int may give a result that does not fit in an uint nor in an >> int" does not sound very convincing to me, after all even adding two >> (sized) ints may give a result that does not fit in the same size, but >> numpy does not upcast everything there: >> >> In [17]: np.int32(2**31 - 1) + np.int32(2**31 - 1) >> Out[17]: -2 >> >> In [18]: type(np.int32(2**31 - 1) + np.int32(2**31 - 1)) >> Out[18]: numpy.int32 >> >> >> I'd think that overflowing operations should just overflow (and possibly >> raise a warning via the seterr mechanism), but their possibility should not >> be an argument for modifying the output type. 
>> >> Antony >> >> 2016-04-12 17:57 GMT-07:00 T J : >> >>> Thanks Eric. >>> >>> Also relevant: https://github.com/numba/numba/issues/909 >>> >>> Looks like Numba has found a way to avoid this edge case. >>> >>> >>> >>> On Monday, April 4, 2016, Eric Firing wrote: >>> >>>> On 2016/04/04 9:23 AM, T J wrote: >>>> >>>>> I'm on NumPy 1.10.4 (mkl). >>>>> >>>>> >>> np.uint(3) // 2 # 1.0 >>>>> >>> 3 // 2 # 1 >>>>> >>>>> Is this behavior expected? It's certainly not desired from my >>>>> perspective. If this is not a bug, could someone explain the rationale >>>>> to me. >>>>> >>>>> Thanks. >>>>> >>>> >>>> I agree that it's almost always undesirable; one would reasonably >>>> expect some sort of int. Here's what I think is going on: >>>> >>>> The odd behavior occurs only with np.uint, which is np.uint64, and when >>>> the denominator is a signed int. The problem is that if the denominator is >>>> negative, the result will be negative, so it can't have the same type as >>>> the first numerator. Furthermore, if the denominator is -1, the result >>>> will be minus the numerator, and that can't be represented by np.uint or >>>> np.int. Therefore the result is returned as np.float64. The >>>> promotion rules are based on what *could* happen in an operation, not on >>>> what *is* happening in a given instance. >>>> >>>> Eric >>>> >>>> _______________________________________________ >>>> NumPy-Discussion mailing list >>>> NumPy-Discussion at scipy.org >>>> https://mail.scipy.org/mailman/listinfo/numpy-discussion >>>> >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> https://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >>> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From antony.lee at berkeley.edu Wed Apr 13 03:42:06 2016 From: antony.lee at berkeley.edu (Antony Lee) Date: Wed, 13 Apr 2016 00:42:06 -0700 Subject: [Numpy-discussion] Changing the behavior of (builtins.)round (via the __round__ dunder) to return an integer Message-ID: https://github.com/numpy/numpy/issues/3511 proposed (nearly three years ago) to return an integer when `builtins.round` (which calls the `__round__ dunder method, and thereafter called `round` (... not to be confused with `np.round`)) is called with a single argument. Currently, `round` returns a floating scalar for numpy scalars, matching the Python2 behavior. Python3 changed the behavior of `round` to return an int when it is called with a single argument (otherwise, the return type matches the type of the first argument). I believe this is more intuitive, and is arguably becoming more important now that numpy is deprecating (via a VisibleDeprecationWarning) indexing with a float: having to write array[int(round(some_float))] is rather awkward. (Note that I am suggesting to switch to the new behavior regardless of the version of Python.) Note that currently the `__round__` dunder is not implemented for arrays (... 
see https://github.com/numpy/numpy/issues/6248) so it would be feasible to always return a signed integer of the same size with an OverflowError on overflow (at least, any floating point that is round-able without loss of precision will be covered). If `__round__` ends up being implemented for ndarrays too, I guess the correct behavior will be whatever we come up for signaling failure in integer operations (see current behavior of `np.array([0, 1]) // np.array([0, 1])`). Also note the comment posted by @njsmith on the github issue thread: I'd be fine with matching python here, but we need to run it by the mailing list. Not clear what the right kind of deprecation is... Normally FutureWarning since there's no error involved, but that would both be very annoying (basically makes round unusable -- you get this noisy warning even if what you're doing is round(a).astype(int)), and the change is relatively low risk compared to most FutureWarning changes, since the actual values returned are identical before and after the change. Thoughts? Antony -------------- next part -------------- An HTML attachment was scrubbed... URL: From shoyer at gmail.com Wed Apr 13 04:31:06 2016 From: shoyer at gmail.com (Stephan Hoyer) Date: Wed, 13 Apr 2016 01:31:06 -0700 Subject: [Numpy-discussion] Changing the behavior of (builtins.)round (via the __round__ dunder) to return an integer In-Reply-To: References: Message-ID: On Wed, Apr 13, 2016 at 12:42 AM, Antony Lee wrote: > (Note that I am suggesting to switch to the new behavior regardless of the > version of Python.) > I would lean towards making this change only for Python 3. This is arguably more consistent with Python than changing the behavior on Python 2.7, too. The most obvious way in which a float being surprisingly switched to an integer could cause silent bugs (rather than noisy TypeErrors) is if the number is used in division. True division in Python 3 eliminates this risk. Generally, I agree with your reasoning. It would be unfortunate to be stuck with this legacy behavior forever. -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Wed Apr 13 11:06:43 2016 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 13 Apr 2016 11:06:43 -0400 Subject: [Numpy-discussion] Changing the behavior of (builtins.)round (via the __round__ dunder) to return an integer In-Reply-To: References: Message-ID: On Wed, Apr 13, 2016 at 4:31 AM, Stephan Hoyer wrote: > On Wed, Apr 13, 2016 at 12:42 AM, Antony Lee > wrote: > >> (Note that I am suggesting to switch to the new behavior regardless of >> the version of Python.) >> > > I would lean towards making this change only for Python 3. This is > arguably more consistent with Python than changing the behavior on Python > 2.7, too. > > The most obvious way in which a float being surprisingly switched to an > integer could cause silent bugs (rather than noisy TypeErrors) is if the > number is used in division. True division in Python 3 eliminates this risk. > > Generally, I agree with your reasoning. It would be unfortunate to be > stuck with this legacy behavior forever. > > The difference is that Python 3 has looooong ints, (and doesn't have to overflow, AFAICS) what happens with nan? I guess inf would overflow? 
(nan and inf are preserved with np.round) Josef > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Wed Apr 13 12:07:32 2016 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 13 Apr 2016 17:07:32 +0100 Subject: [Numpy-discussion] Floor divison on int returns float In-Reply-To: References: <5702C709.1040801@hawaii.edu> Message-ID: On Wed, Apr 13, 2016 at 3:17 AM, Antony Lee wrote: > > This kind of issue (see also https://github.com/numpy/numpy/issues/3511) has become more annoying now that indexing requires integers (indexing with a float raises a VisibleDeprecationWarning). The argument "dividing an uint by an int may give a result that does not fit in an uint nor in an int" does not sound very convincing to me, It shouldn't because that's not the rule that numpy follows. The range of the result is never considered. Both *inputs* are cast to the same type that can represent the full range of either input type (for that matter, the actual *values* of the inputs are also never considered). In the case of uint64 and int64, there is no really good common type (the integer hierarchy has to top out somewhere), but float64 merely loses resolution rather than cutting off half of the range of uint64. -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... URL: From shoyer at gmail.com Wed Apr 13 12:35:19 2016 From: shoyer at gmail.com (Stephan Hoyer) Date: Wed, 13 Apr 2016 09:35:19 -0700 Subject: [Numpy-discussion] Changing the behavior of (builtins.)round (via the __round__ dunder) to return an integer In-Reply-To: References: Message-ID: On Wed, Apr 13, 2016 at 8:06 AM, wrote: > > The difference is that Python 3 has looooong ints, (and doesn't have to > overflow, AFAICS) > This is a good point. But if your float is so big that rounding it to an integer would overflow int64, rounding is already a no-op. I'm sure this has been done before but I would guess it's quite rare. I would be OK raising in this situation, especially because np.around will still be around returning floats. > what happens with nan? > I guess inf would overflow? > builtins.round raises for both of these (in Python 3) and I would propose copying this behavior: In [52]: round(float('inf')) --------------------------------------------------------------------------- OverflowError Traceback (most recent call last) in () ----> 1 round(float('inf')) OverflowError: cannot convert float infinity to integer In [53]: round(float('nan')) --------------------------------------------------------------------------- ValueError Traceback (most recent call last) in () ----> 1 round(float('nan')) ValueError: cannot convert float NaN to integer -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Wed Apr 13 15:15:30 2016 From: matthew.brett at gmail.com (Matthew Brett) Date: Wed, 13 Apr 2016 12:15:30 -0700 Subject: [Numpy-discussion] linux wheels coming soon In-Reply-To: References: Message-ID: On Tue, Apr 12, 2016 at 7:15 PM, Matthew Brett wrote: > Hi, > > On Sat, Apr 2, 2016 at 6:11 PM, Matthew Brett wrote: >> On Fri, Mar 25, 2016 at 6:39 AM, Peter Cock wrote: >>> On Fri, Mar 25, 2016 at 3:02 AM, Robert T. 
McGibbon wrote: >>>> I suspect that many of the maintainers of major scipy-ecosystem projects are >>>> aware of these (or other similar) travis wheel caches, but would guess that >>>> the pool of travis-ci python users who weren't aware of these wheel caches >>>> is much much larger. So there will still be a lot of travis-ci clock cycles >>>> saved by manylinux wheels. >>>> >>>> -Robert >>> >>> Yes exactly. Availability of NumPy Linux wheels on PyPI is definitely something >>> I would suggest adding to the release notes. Hopefully this will help trigger >>> a general availability of wheels in the numpy-ecosystem :) >>> >>> In the case of Travis CI, their VM images for Python already have a version >>> of NumPy installed, but having the latest version of NumPy and SciPy etc >>> available as Linux wheels would be very nice. >> >> We're very nearly there now. >> >> The latest versions of numpy, scipy, scikit-image, pandas, numexpr, >> statsmodels wheels for testing at >> http://ccdd0ebb5a931e58c7c5-aae005c4999d7244ac63632f8b80e089.r77.cf2.rackcdn.com/ >> >> Please do test with: >> >> python -m pip install --upgrade pip >> >> pip install --trusted-host=ccdd0ebb5a931e58c7c5-aae005c4999d7244ac63632f8b80e089.r77.cf2.rackcdn.com >> --find-links=http://ccdd0ebb5a931e58c7c5-aae005c4999d7244ac63632f8b80e089.r77.cf2.rackcdn.com >> numpy scipy scikit-learn numexpr >> >> python -c 'import numpy; numpy.test("full")' >> python -c 'import scipy; scipy.test("full")' >> >> We would love to get any feedback as to whether these work on your machines. > > I've just rebuilt these wheels with the just-released OpenBLAS 0.2.18. > > OpenBLAS is now passing all its own tests and tests on numpy / scipy / > scikit-learn at http://build.openblas.net/builders > > Our tests of the wheels look good too: > > http://nipy.bic.berkeley.edu/builders/manylinux-2.7-debian > http://nipy.bic.berkeley.edu/builders/manylinux-2.7-debian > https://travis-ci.org/matthew-brett/manylinux-testing > > So I think these are ready to go. I propose uploading these wheels > for numpy and scipy to pypi tomorrow unless anyone has an objection. Done. If y'all are on linux, and you have pip >= 8.11, you should now see this kind of thing: $ pip install numpy scipy Collecting numpy Downloading numpy-1.11.0-cp27-cp27mu-manylinux1_x86_64.whl (15.3MB) 100% |????????????????????????????????| 15.3MB 61kB/s Collecting scipy Downloading scipy-0.17.0-cp27-cp27mu-manylinux1_x86_64.whl (39.5MB) 100% |????????????????????????????????| 39.5MB 24kB/s Installing collected packages: numpy, scipy Successfully installed numpy-1.11.0 scipy-0.17.0 Cheers, Matthew From olivier.grisel at ensta.org Wed Apr 13 16:19:40 2016 From: olivier.grisel at ensta.org (Olivier Grisel) Date: Wed, 13 Apr 2016 22:19:40 +0200 Subject: [Numpy-discussion] linux wheels coming soon In-Reply-To: References: Message-ID: \o/ Thank you very much Matthew. I will upload the scikit-learn wheels soon. -- Olivier From njs at pobox.com Wed Apr 13 16:22:53 2016 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 13 Apr 2016 13:22:53 -0700 Subject: [Numpy-discussion] linux wheels coming soon In-Reply-To: References: Message-ID: Woot! \o/ On Wed, Apr 13, 2016 at 12:15 PM, Matthew Brett wrote: > On Tue, Apr 12, 2016 at 7:15 PM, Matthew Brett > wrote: > > Hi, > > > > On Sat, Apr 2, 2016 at 6:11 PM, Matthew Brett > wrote: > >> On Fri, Mar 25, 2016 at 6:39 AM, Peter Cock > wrote: > >>> On Fri, Mar 25, 2016 at 3:02 AM, Robert T. 
McGibbon < > rmcgibbo at gmail.com> wrote: > >>>> I suspect that many of the maintainers of major scipy-ecosystem > projects are > >>>> aware of these (or other similar) travis wheel caches, but would > guess that > >>>> the pool of travis-ci python users who weren't aware of these wheel > caches > >>>> is much much larger. So there will still be a lot of travis-ci clock > cycles > >>>> saved by manylinux wheels. > >>>> > >>>> -Robert > >>> > >>> Yes exactly. Availability of NumPy Linux wheels on PyPI is definitely > something > >>> I would suggest adding to the release notes. Hopefully this will help > trigger > >>> a general availability of wheels in the numpy-ecosystem :) > >>> > >>> In the case of Travis CI, their VM images for Python already have a > version > >>> of NumPy installed, but having the latest version of NumPy and SciPy > etc > >>> available as Linux wheels would be very nice. > >> > >> We're very nearly there now. > >> > >> The latest versions of numpy, scipy, scikit-image, pandas, numexpr, > >> statsmodels wheels for testing at > >> > http://ccdd0ebb5a931e58c7c5-aae005c4999d7244ac63632f8b80e089.r77.cf2.rackcdn.com/ > >> > >> Please do test with: > >> > >> python -m pip install --upgrade pip > >> > >> pip install --trusted-host= > ccdd0ebb5a931e58c7c5-aae005c4999d7244ac63632f8b80e089.r77.cf2.rackcdn.com > >> --find-links= > http://ccdd0ebb5a931e58c7c5-aae005c4999d7244ac63632f8b80e089.r77.cf2.rackcdn.com > >> numpy scipy scikit-learn numexpr > >> > >> python -c 'import numpy; numpy.test("full")' > >> python -c 'import scipy; scipy.test("full")' > >> > >> We would love to get any feedback as to whether these work on your > machines. > > > > I've just rebuilt these wheels with the just-released OpenBLAS 0.2.18. > > > > OpenBLAS is now passing all its own tests and tests on numpy / scipy / > > scikit-learn at http://build.openblas.net/builders > > > > Our tests of the wheels look good too: > > > > http://nipy.bic.berkeley.edu/builders/manylinux-2.7-debian > > http://nipy.bic.berkeley.edu/builders/manylinux-2.7-debian > > https://travis-ci.org/matthew-brett/manylinux-testing > > > > So I think these are ready to go. I propose uploading these wheels > > for numpy and scipy to pypi tomorrow unless anyone has an objection. > > Done. If y'all are on linux, and you have pip >= 8.11, you should > now see this kind of thing: > > $ pip install numpy scipy > Collecting numpy > Downloading numpy-1.11.0-cp27-cp27mu-manylinux1_x86_64.whl (15.3MB) > 100% |????????????????????????????????| 15.3MB 61kB/s > Collecting scipy > Downloading scipy-0.17.0-cp27-cp27mu-manylinux1_x86_64.whl (39.5MB) > 100% |????????????????????????????????| 39.5MB 24kB/s > Installing collected packages: numpy, scipy > Successfully installed numpy-1.11.0 scipy-0.17.0 > > Cheers, > > Matthew > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Nathaniel J. Smith -- https://vorpus.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From oscar.j.benjamin at gmail.com Wed Apr 13 16:29:40 2016 From: oscar.j.benjamin at gmail.com (Oscar Benjamin) Date: Wed, 13 Apr 2016 21:29:40 +0100 Subject: [Numpy-discussion] linux wheels coming soon In-Reply-To: References: Message-ID: On 13 April 2016 at 20:15, Matthew Brett wrote: > Done. If y'all are on linux, and you have pip >= 8.11, you should > now see this kind of thing: That's fantastic. Thanks Matt! 
I just test installed this and ran numpy.test(). All tests passed but then I got a segfault at the end by (semi-accidentally) hitting Ctrl-C at the prompt: $ python Python 2.7.9 (default, Apr 2 2015, 15:33:21) [GCC 4.9.2] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import numpy >>> numpy.test() Running unit tests for numpy Ran 5781 tests in 72.238s OK (KNOWNFAIL=6, SKIP=15) >>> Segmentation fault (core dumped) It was stopped at the prompt and then I did Ctrl-C and then the seg-fault message. $ uname -a Linux vnwulf 3.19.0-15-generic #15-Ubuntu SMP Thu Apr 16 23:32:37 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux $ lsb_release -a No LSB modules are available. Distributor ID: Ubuntu Description: Ubuntu 15.04 Release: 15.04 Codename: vivid -- Oscar From matthew.brett at gmail.com Wed Apr 13 16:47:29 2016 From: matthew.brett at gmail.com (Matthew Brett) Date: Wed, 13 Apr 2016 13:47:29 -0700 Subject: [Numpy-discussion] linux wheels coming soon In-Reply-To: References: Message-ID: On Wed, Apr 13, 2016 at 1:29 PM, Oscar Benjamin wrote: > On 13 April 2016 at 20:15, Matthew Brett wrote: >> Done. If y'all are on linux, and you have pip >= 8.11, you should >> now see this kind of thing: > > That's fantastic. Thanks Matt! > > I just test installed this and ran numpy.test(). All tests passed but > then I got a segfault at the end by (semi-accidentally) hitting Ctrl-C > at the prompt: > > $ python > Python 2.7.9 (default, Apr 2 2015, 15:33:21) > [GCC 4.9.2] on linux2 > Type "help", "copyright", "credits" or "license" for more information. >>>> import numpy >>>> numpy.test() > Running unit tests for numpy > > Ran 5781 tests in 72.238s > > OK (KNOWNFAIL=6, SKIP=15) > >>>> Segmentation fault (core dumped) > > It was stopped at the prompt and then I did Ctrl-C and then the > seg-fault message. > > $ uname -a > Linux vnwulf 3.19.0-15-generic #15-Ubuntu SMP Thu Apr 16 23:32:37 UTC > 2015 x86_64 x86_64 x86_64 GNU/Linux > $ lsb_release -a > No LSB modules are available. > Distributor ID: Ubuntu > Description: Ubuntu 15.04 > Release: 15.04 > Codename: vivid > Thanks so much for testing - that's very useful. I get the same thing on my Debian Sid machine. Actually I also get the same thing with a local compile against Debian ATLAS, here's the stack trace after: >>> import numpy; numpy.test() >>> # Ctrl-C https://gist.github.com/f6d8fb42f24689b39536a2416d717056 Do you get this as well? Cheers, Matthew From njs at pobox.com Wed Apr 13 16:48:58 2016 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 13 Apr 2016 13:48:58 -0700 Subject: [Numpy-discussion] Floor divison on int returns float In-Reply-To: References: <5702C709.1040801@hawaii.edu> Message-ID: On Apr 13, 2016 9:08 AM, "Robert Kern" wrote: > > On Wed, Apr 13, 2016 at 3:17 AM, Antony Lee wrote: > > > > This kind of issue (see also https://github.com/numpy/numpy/issues/3511) has become more annoying now that indexing requires integers (indexing with a float raises a VisibleDeprecationWarning). The argument "dividing an uint by an int may give a result that does not fit in an uint nor in an int" does not sound very convincing to me, > > It shouldn't because that's not the rule that numpy follows. The range of the result is never considered. Both *inputs* are cast to the same type that can represent the full range of either input type (for that matter, the actual *values* of the inputs are also never considered). 
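Concretely, the rule in a short session (the dtypes alone, never the values, pick the result type):

import numpy as np
np.result_type(np.uint64, np.int64)  # dtype('float64'), decided from the types alone
np.uint64(2) + np.int64(3)           # 5.0, a np.float64
np.uint64(2) // np.int64(3)          # 0.0, likewise float64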
In the case of uint64 and int64, there is no really good common type (the integer hierarchy has to top out somewhere), but float64 merely loses resolution rather than cutting off half of the range of uint64.

Let me play devil's advocate for a moment, since I've just been playing out this debate in my own mind and you've done a good job of articulating the case for that side :-).

The counter argument is: it doesn't really matter about having a common type or not; what matters is whether the operation can be defined sensibly. For uint64 <op> int64, this is actually not a problem: we provide 2s complement signed ints, so uint64 and int64 are both integers-mod-2**64, just choosing different representatives for the equivalence classes in the upper half of the ring. In particular, the uint64 and int64 ranges are isomorphic to each other.

Or with less jargon: casting between uint64 and int64 commutes with all arithmetic operations, so you actually get the same result performing the operation in infinite precision and then casting to uint64 or int64, or casting both operands to uint64 or int64, performing the operation, and then casting the result to uint64 or int64. Basically the operations are totally well-defined even if we stick within integers, and the casting is just another form of integer wraparound; we're already happy to tolerate wraparound for int64 <op> int64 or uint64 <op> uint64, so it's not entirely clear why we go all the way to float to avoid it for uint64 <op> int64.

[On second thought... I'm actually not 100% sure that the all-operations-commute-with-casting thing is true in the case of //'s rounding behavior. I would have to squint a lot to figure that out. I guess comparison operations are another exception -- a < b != np.uint64(a) < np.uint64(b) in general.]

-n

From charlesr.harris at gmail.com Wed Apr 13 17:49:15 2016 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 13 Apr 2016 15:49:15 -0600 Subject: [Numpy-discussion] Floor divison on int returns float In-Reply-To: References: <5702C709.1040801@hawaii.edu> Message-ID: On Wed, Apr 13, 2016 at 2:48 PM, Nathaniel Smith wrote: > On Apr 13, 2016 9:08 AM, "Robert Kern" wrote: > > > > On Wed, Apr 13, 2016 at 3:17 AM, Antony Lee > wrote: > > > > > > This kind of issue (see also > https://github.com/numpy/numpy/issues/3511) has become more annoying now > that indexing requires integers (indexing with a float raises a > VisibleDeprecationWarning). The argument "dividing an uint by an int may > give a result that does not fit in an uint nor in an int" does not sound > very convincing to me, > > > > It shouldn't because that's not the rule that numpy follows. The range > of the result is never considered. Both *inputs* are cast to the same type > that can represent the full range of either input type (for that matter, > the actual *values* of the inputs are also never considered). In the case > of uint64 and int64, there is no really good common type (the integer > hierarchy has to top out somewhere), but float64 merely loses resolution > rather than cutting off half of the range of uint64. > > Let me play devil's advocate for a moment, since I've just been > playing out this debate in my own mind and you've done a good job of > articulating the case for that side :-). > > The counter argument is: it doesn't really matter about having a > common type or not; what matters is whether the operation can be > defined sensibly.
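The wraparound argument quoted here can be checked directly with arrays (a sketch with made-up values, not from the thread):

>>> import numpy as np
>>> a = np.array([2**64 - 1, 2**63], dtype=np.uint64)   # values in the "upper half"
>>> b = np.array([7, 1], dtype=np.uint64)
>>> u = a + b                                    # uint64 arithmetic, wraps mod 2**64
>>> s = a.astype(np.int64) + b.astype(np.int64)  # same bit patterns, int64 arithmetic
>>> (u == s.astype(np.uint64)).all()             # the casts commute with +
True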
For uint64 <op> int64, this is actually not a > problem: we provide 2s complement signed ints, so uint64 and int64 are > both integers-mod-2**64, just choosing different representatives for > the equivalence classes in the upper half of the ring. In particular, > the uint64 and int64 ranges are isomorphic to each other. > > Or with less jargon: casting between uint64 and int64 commutes with > all arithmetic operations, so you actually get the same result > performing the operation in infinite precision and then casting to > uint64 or int64, or casting both operands to uint64 or int64, > performing the operation, and then casting the result to uint64 or > int64. Basically the operations > are totally well-defined even if we stick within integers, and the > casting is just another form of integer wraparound; we're already > happy to tolerate wraparound for int64 <op> int64 or uint64 <op> > uint64, so it's not entirely clear why we go all the way to float to > avoid it for uint64 <op> int64. > > [On second thought... I'm actually not 100% sure that the > all-operations-commute-with-casting thing is true in the case of //'s > rounding behavior. I would have to squint a lot to figure that out. I > guess comparison operations are another exception -- a < b != > np.uint64(a) < np.uint64(b) in general.] > I looked this up once, `C` returns unsigned in the scalar case when both operands have the same width. See Usual Arithmetic Conversions. I think that is not a bad choice, but there is the back compatibility problem, plus it is a bit exceptional. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Wed Apr 13 17:52:29 2016 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 13 Apr 2016 15:52:29 -0600 Subject: [Numpy-discussion] linux wheels coming soon In-Reply-To: References: Message-ID: On Wed, Apr 13, 2016 at 1:15 PM, Matthew Brett wrote: > On Tue, Apr 12, 2016 at 7:15 PM, Matthew Brett > wrote: > > Hi, > > > > On Sat, Apr 2, 2016 at 6:11 PM, Matthew Brett > wrote: > >> On Fri, Mar 25, 2016 at 6:39 AM, Peter Cock > wrote: > >>> On Fri, Mar 25, 2016 at 3:02 AM, Robert T. McGibbon < > rmcgibbo at gmail.com> wrote: > >>>> I suspect that many of the maintainers of major scipy-ecosystem > projects are > >>>> aware of these (or other similar) travis wheel caches, but would > guess that > >>>> the pool of travis-ci python users who weren't aware of these wheel > caches > >>>> is much much larger. So there will still be a lot of travis-ci clock > cycles > >>>> saved by manylinux wheels. > >>>> > >>>> -Robert > >>> > >>> Yes exactly. Availability of NumPy Linux wheels on PyPI is definitely > something > >>> I would suggest adding to the release notes. Hopefully this will help > trigger > >>> a general availability of wheels in the numpy-ecosystem :) > >>> > >>> In the case of Travis CI, their VM images for Python already have a > version > >>> of NumPy installed, but having the latest version of NumPy and SciPy > etc > >>> available as Linux wheels would be very nice. > >> > >> We're very nearly there now.
> >> > >> The latest versions of numpy, scipy, scikit-image, pandas, numexpr, > >> statsmodels wheels for testing at > >> > http://ccdd0ebb5a931e58c7c5-aae005c4999d7244ac63632f8b80e089.r77.cf2.rackcdn.com/ > >> > >> Please do test with: > >> > >> python -m pip install --upgrade pip > >> > >> pip install --trusted-host= > ccdd0ebb5a931e58c7c5-aae005c4999d7244ac63632f8b80e089.r77.cf2.rackcdn.com > >> --find-links= > http://ccdd0ebb5a931e58c7c5-aae005c4999d7244ac63632f8b80e089.r77.cf2.rackcdn.com > >> numpy scipy scikit-learn numexpr > >> > >> python -c 'import numpy; numpy.test("full")' > >> python -c 'import scipy; scipy.test("full")' > >> > >> We would love to get any feedback as to whether these work on your > machines. > > > > I've just rebuilt these wheels with the just-released OpenBLAS 0.2.18. > > > > OpenBLAS is now passing all its own tests and tests on numpy / scipy / > > scikit-learn at http://build.openblas.net/builders > > > > Our tests of the wheels look good too: > > > > http://nipy.bic.berkeley.edu/builders/manylinux-2.7-debian > > https://travis-ci.org/matthew-brett/manylinux-testing > > > > So I think these are ready to go. I propose uploading these wheels > > for numpy and scipy to pypi tomorrow unless anyone has an objection. > > Done. If y'all are on linux, and you have pip >= 8.11, you should > now see this kind of thing: > > $ pip install numpy scipy > Collecting numpy > Downloading numpy-1.11.0-cp27-cp27mu-manylinux1_x86_64.whl (15.3MB) > 100% |████████████████████████████████| 15.3MB 61kB/s > Collecting scipy > Downloading scipy-0.17.0-cp27-cp27mu-manylinux1_x86_64.whl (39.5MB) > 100% |████████████████████████████████| 39.5MB 24kB/s > Installing collected packages: numpy, scipy > Successfully installed numpy-1.11.0 scipy-0.17.0 > Great work. It is nice that we are finally getting the Linux thing squared away after all these years. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From oscar.j.benjamin at gmail.com Wed Apr 13 18:11:03 2016 From: oscar.j.benjamin at gmail.com (Oscar Benjamin) Date: Wed, 13 Apr 2016 23:11:03 +0100 Subject: [Numpy-discussion] linux wheels coming soon In-Reply-To: References: Message-ID: On 13 Apr 2016 21:48, "Matthew Brett" wrote: > > On Wed, Apr 13, 2016 at 1:29 PM, Oscar Benjamin > wrote: > > On 13 April 2016 at 20:15, Matthew Brett wrote: > >> Done. If y'all are on linux, and you have pip >= 8.11, you should > >> now see this kind of thing: > > > > That's fantastic. Thanks Matt! > > > > I just test installed this and ran numpy.test(). All tests passed but > > then I got a segfault at the end by (semi-accidentally) hitting Ctrl-C > > at the prompt: > > > > $ python > > Python 2.7.9 (default, Apr 2 2015, 15:33:21) > > [GCC 4.9.2] on linux2 > > Type "help", "copyright", "credits" or "license" for more information. > >>>> import numpy > >>>> numpy.test() > > Running unit tests for numpy > > > > Ran 5781 tests in 72.238s > > > > OK (KNOWNFAIL=6, SKIP=15) > > > >>>> Segmentation fault (core dumped) > > > > It was stopped at the prompt and then I did Ctrl-C and then the > > seg-fault message. > > > > $ uname -a > > Linux vnwulf 3.19.0-15-generic #15-Ubuntu SMP Thu Apr 16 23:32:37 UTC > > 2015 x86_64 x86_64 x86_64 GNU/Linux > > $ lsb_release -a > > No LSB modules are available. > > Distributor ID: Ubuntu > > Description: Ubuntu 15.04 > > Release: 15.04 > > Codename: vivid > > > > Thanks so much for testing - that's very useful.
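A crash like this is easiest to diagnose with a C-level backtrace, such as the one in the gist quoted just below; one way to capture it, assuming gdb is installed, is to run the same session under gdb:

$ gdb --args python
(gdb) run
>>> import numpy; numpy.test()
>>> # Ctrl-C here stops in gdb rather than reaching Python; pass it on with:
(gdb) signal SIGINT
(gdb) bt    # on the segfault, gdb regains control and bt prints the C stack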
> > I get the same thing on my Debian Sid machine. > > Actually I also get the same thing with a local compile against Debian > ATLAS, here's the stack trace after: > > >>> import numpy; numpy.test() > >>> # Ctrl-C > > https://gist.github.com/f6d8fb42f24689b39536a2416d717056 > > Do you get this as well? It's late here but I'll test again tomorrow. What do I need to do to get comparable output? -- Oscar -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Wed Apr 13 18:38:40 2016 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 13 Apr 2016 15:38:40 -0700 Subject: [Numpy-discussion] linux wheels coming soon In-Reply-To: References: Message-ID: I can reproduce in self-compiled 1.9, so it's not a new bug. I think something's going wrong with NPY_SIGINT_ON / NPY_SIGINT_OFF, where our special sigint handler is getting left in place even after our code finishes running. Skimming the code, my best guess is that this is due to a race condition in how we save/restore the original signal handler, when multiple threads are running numpy fftpack code at the same time (and thus using NPY_SIGINT_{ON,OFF} from multiple threads). -n On Wed, Apr 13, 2016 at 1:47 PM, Matthew Brett wrote: > On Wed, Apr 13, 2016 at 1:29 PM, Oscar Benjamin > wrote: >> On 13 April 2016 at 20:15, Matthew Brett wrote: >>> Done. If y'all are on linux, and you have pip >= 8.11, you should >>> now see this kind of thing: >> >> That's fantastic. Thanks Matt! >> >> I just test installed this and ran numpy.test(). All tests passed but >> then I got a segfault at the end by (semi-accidentally) hitting Ctrl-C >> at the prompt: >> >> $ python >> Python 2.7.9 (default, Apr 2 2015, 15:33:21) >> [GCC 4.9.2] on linux2 >> Type "help", "copyright", "credits" or "license" for more information. >>>>> import numpy >>>>> numpy.test() >> Running unit tests for numpy >> >> Ran 5781 tests in 72.238s >> >> OK (KNOWNFAIL=6, SKIP=15) >> >>>>> Segmentation fault (core dumped) >> >> It was stopped at the prompt and then I did Ctrl-C and then the >> seg-fault message. >> >> $ uname -a >> Linux vnwulf 3.19.0-15-generic #15-Ubuntu SMP Thu Apr 16 23:32:37 UTC >> 2015 x86_64 x86_64 x86_64 GNU/Linux >> $ lsb_release -a >> No LSB modules are available. >> Distributor ID: Ubuntu >> Description: Ubuntu 15.04 >> Release: 15.04 >> Codename: vivid >> > > Thanks so much for testing - that's very useful. > > I get the same thing on my Debian Sid machine. > > Actually I also get the same thing with a local compile against Debian > ATLAS, here's the stack trace after: > >>>> import numpy; numpy.test() >>>> # Ctrl-C > > https://gist.github.com/f6d8fb42f24689b39536a2416d717056 > > Do you get this as well? > > Cheers, > > Matthew > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion -- Nathaniel J. Smith -- https://vorpus.org From njs at pobox.com Wed Apr 13 20:46:31 2016 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 13 Apr 2016 17:46:31 -0700 Subject: [Numpy-discussion] linux wheels coming soon In-Reply-To: References: Message-ID: https://github.com/numpy/numpy/issues/7545 On Wed, Apr 13, 2016 at 3:38 PM, Nathaniel Smith wrote: > I can reproduce in self-compiled 1.9, so it's not a new bug. > > I think something's going wrong with NPY_SIGINT_ON / NPY_SIGINT_OFF, > where our special sigint handler is getting left in place even after > our code finishes running. 
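The suspected bug pattern is easy to sketch in pure Python (an illustration only -- numpy's actual code is C, and Python's signal.signal only works from the main thread, so the cross-thread interleaving is emulated sequentially here):

import signal

def temp_handler(signum, frame):
    # stand-in for numpy's temporary SIGINT handler
    pass

# The ON/OFF pairs of two "threads", interleaved:
saved_a = signal.signal(signal.SIGINT, temp_handler)    # A: ON, saves the default handler
saved_b = signal.signal(signal.SIGINT, temp_handler)    # B: ON, saves temp_handler instead!
signal.signal(signal.SIGINT, saved_a)                   # A: OFF, restores the default
signal.signal(signal.SIGINT, saved_b)                   # B: OFF, re-installs temp_handler
assert signal.getsignal(signal.SIGINT) is temp_handler  # the temporary handler has leaked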
> > Skimming the code, my best guess is that this is due to a race > condition in how we save/restore the original signal handler, when > multiple threads are running numpy fftpack code at the same time (and > thus using NPY_SIGINT_{ON,OFF} from multiple threads). > > -n > > On Wed, Apr 13, 2016 at 1:47 PM, Matthew Brett wrote: >> On Wed, Apr 13, 2016 at 1:29 PM, Oscar Benjamin >> wrote: >>> On 13 April 2016 at 20:15, Matthew Brett wrote: >>>> Done. If y'all are on linux, and you have pip >= 8.11, you should >>>> now see this kind of thing: >>> >>> That's fantastic. Thanks Matt! >>> >>> I just test installed this and ran numpy.test(). All tests passed but >>> then I got a segfault at the end by (semi-accidentally) hitting Ctrl-C >>> at the prompt: >>> >>> $ python >>> Python 2.7.9 (default, Apr 2 2015, 15:33:21) >>> [GCC 4.9.2] on linux2 >>> Type "help", "copyright", "credits" or "license" for more information. >>>>>> import numpy >>>>>> numpy.test() >>> Running unit tests for numpy >>> >>> Ran 5781 tests in 72.238s >>> >>> OK (KNOWNFAIL=6, SKIP=15) >>> >>>>>> Segmentation fault (core dumped) >>> >>> It was stopped at the prompt and then I did Ctrl-C and then the >>> seg-fault message. >>> >>> $ uname -a >>> Linux vnwulf 3.19.0-15-generic #15-Ubuntu SMP Thu Apr 16 23:32:37 UTC >>> 2015 x86_64 x86_64 x86_64 GNU/Linux >>> $ lsb_release -a >>> No LSB modules are available. >>> Distributor ID: Ubuntu >>> Description: Ubuntu 15.04 >>> Release: 15.04 >>> Codename: vivid >>> >> >> Thanks so much for testing - that's very useful. >> >> I get the same thing on my Debian Sid machine. >> >> Actually I also get the same thing with a local compile against Debian >> ATLAS, here's the stack trace after: >> >>>>> import numpy; numpy.test() >>>>> # Ctrl-C >> >> https://gist.github.com/f6d8fb42f24689b39536a2416d717056 >> >> Do you get this as well? >> >> Cheers, >> >> Matthew >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > -- > Nathaniel J. Smith -- https://vorpus.org -- Nathaniel J. Smith -- https://vorpus.org From solipsis at pitrou.net Thu Apr 14 04:06:28 2016 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 14 Apr 2016 10:06:28 +0200 Subject: [Numpy-discussion] Floor divison on int returns float References: <5702C709.1040801@hawaii.edu> Message-ID: <20160414100628.2f3c66d0@fsol> On Wed, 13 Apr 2016 15:49:15 -0600 Charles R Harris wrote: > > I looked this up once, `C` returns unsigned in the scalar case when both > operands have the same width. See Usual Arithmetic Conversions > . > I think that is not a bad choice, but there is the back compatibility > problem, plus it is a bit exceptional. It may be a worse choice for Python. In the original use case (indexing with an integer), losing the sign is a bug since negative indices have a well-defined meaning in Python. This is a far more likely issue than magnitude loss on a 64-bit integer. In Numba, we decided that combining signed and unsigned would return signed (see http://numba.pydata.org/numba-doc/dev/proposals/integer-typing.html#proposal-predictable-width-conserving-typing). Regards Antoine. 
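The indexing point is concrete: a negative index is meaningful in Python, while the same value reinterpreted as unsigned is a huge positive number (a quick illustration, not from the thread):

>>> import numpy as np
>>> a = np.arange(10)
>>> a[np.int64(-2)]                   # negative indices count from the end
8
>>> np.array(-2).astype(np.uint64)    # the same value, forced to unsigned
array(18446744073709551614, dtype=uint64)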
From jenshnielsen at gmail.com Thu Apr 14 11:02:17 2016 From: jenshnielsen at gmail.com (Jens Nielsen) Date: Thu, 14 Apr 2016 15:02:17 +0000 Subject: [Numpy-discussion] linux wheels coming soon In-Reply-To: References: Message-ID: I have tried testing the wheels in a project that runs tests on Travis's Trusty infrastructure which. The wheels work great for python 3.5 and saves us several minuts of runtime. However, I am having trouble using the wheels on python 2.7 on the same Trusty machines. It seems to be because the wheels are tagged as cp27-cp27mu (numpy-1.11.0-cp27-cp27mu-manylinux1_x86_64.whl) where as pip.pep425tags.get_abi_tag() returns cp27m on this particular python version. (Stock python 2.7 installed on Travis 14.04 VMs) Any chance of a cp27m compatible wheel build? best Jens On Thu, 14 Apr 2016 at 01:46 Nathaniel Smith wrote: > https://github.com/numpy/numpy/issues/7545 > > On Wed, Apr 13, 2016 at 3:38 PM, Nathaniel Smith wrote: > > I can reproduce in self-compiled 1.9, so it's not a new bug. > > > > I think something's going wrong with NPY_SIGINT_ON / NPY_SIGINT_OFF, > > where our special sigint handler is getting left in place even after > > our code finishes running. > > > > Skimming the code, my best guess is that this is due to a race > > condition in how we save/restore the original signal handler, when > > multiple threads are running numpy fftpack code at the same time (and > > thus using NPY_SIGINT_{ON,OFF} from multiple threads). > > > > -n > > > > On Wed, Apr 13, 2016 at 1:47 PM, Matthew Brett > wrote: > >> On Wed, Apr 13, 2016 at 1:29 PM, Oscar Benjamin > >> wrote: > >>> On 13 April 2016 at 20:15, Matthew Brett > wrote: > >>>> Done. If y'all are on linux, and you have pip >= 8.11, you should > >>>> now see this kind of thing: > >>> > >>> That's fantastic. Thanks Matt! > >>> > >>> I just test installed this and ran numpy.test(). All tests passed but > >>> then I got a segfault at the end by (semi-accidentally) hitting Ctrl-C > >>> at the prompt: > >>> > >>> $ python > >>> Python 2.7.9 (default, Apr 2 2015, 15:33:21) > >>> [GCC 4.9.2] on linux2 > >>> Type "help", "copyright", "credits" or "license" for more information. > >>>>>> import numpy > >>>>>> numpy.test() > >>> Running unit tests for numpy > >>> > >>> Ran 5781 tests in 72.238s > >>> > >>> OK (KNOWNFAIL=6, SKIP=15) > >>> > >>>>>> Segmentation fault (core dumped) > >>> > >>> It was stopped at the prompt and then I did Ctrl-C and then the > >>> seg-fault message. > >>> > >>> $ uname -a > >>> Linux vnwulf 3.19.0-15-generic #15-Ubuntu SMP Thu Apr 16 23:32:37 UTC > >>> 2015 x86_64 x86_64 x86_64 GNU/Linux > >>> $ lsb_release -a > >>> No LSB modules are available. > >>> Distributor ID: Ubuntu > >>> Description: Ubuntu 15.04 > >>> Release: 15.04 > >>> Codename: vivid > >>> > >> > >> Thanks so much for testing - that's very useful. > >> > >> I get the same thing on my Debian Sid machine. > >> > >> Actually I also get the same thing with a local compile against Debian > >> ATLAS, here's the stack trace after: > >> > >>>>> import numpy; numpy.test() > >>>>> # Ctrl-C > >> > >> https://gist.github.com/f6d8fb42f24689b39536a2416d717056 > >> > >> Do you get this as well? > >> > >> Cheers, > >> > >> Matthew > >> _______________________________________________ > >> NumPy-Discussion mailing list > >> NumPy-Discussion at scipy.org > >> https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > > -- > > Nathaniel J. Smith -- https://vorpus.org > > > > -- > Nathaniel J. 
Smith -- https://vorpus.org > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Thu Apr 14 14:04:12 2016 From: matthew.brett at gmail.com (Matthew Brett) Date: Thu, 14 Apr 2016 11:04:12 -0700 Subject: [Numpy-discussion] linux wheels coming soon In-Reply-To: References: Message-ID: Hi, On Thu, Apr 14, 2016 at 8:02 AM, Jens Nielsen wrote: > I have tried testing the wheels in a project that runs tests on Travis's > Trusty infrastructure which. The wheels work great for python 3.5 and saves > us several minuts of runtime. > > However, I am having trouble using the wheels on python 2.7 on the same > Trusty machines. It seems to be because the wheels are tagged as cp27-cp27mu > (numpy-1.11.0-cp27-cp27mu-manylinux1_x86_64.whl) where as > pip.pep425tags.get_abi_tag() returns cp27m on this particular python > version. (Stock python 2.7 installed on Travis 14.04 VMs) Any chance of a > cp27m compatible wheel build? Ouch - do you know where travis-ci's Python 2.7 comes from? I see that the standard apt-get install -y python is a wide (mu) build... Cheers, Matthew From ben.v.root at gmail.com Thu Apr 14 14:11:17 2016 From: ben.v.root at gmail.com (Benjamin Root) Date: Thu, 14 Apr 2016 14:11:17 -0400 Subject: [Numpy-discussion] linux wheels coming soon In-Reply-To: References: Message-ID: Are we going to have to have documentation somewhere making it clear that the numpy wheel shouldn't be used in a conda environment? Not that I would expect this issue to come up all that often, but I could imagine a scenario where a non-scientist is simply using a base conda distribution because that is what IT put on their system. Then they do "pip install ipython" that indirectly brings in numpy (through the matplotlib dependency), and end up with an incompatible numpy because they would have been linked against different pythons? Or is this not an issue? Ben Root On Thu, Apr 14, 2016 at 2:04 PM, Matthew Brett wrote: > Hi, > > On Thu, Apr 14, 2016 at 8:02 AM, Jens Nielsen > wrote: > > I have tried testing the wheels in a project that runs tests on Travis's > > Trusty infrastructure which. The wheels work great for python 3.5 and > saves > > us several minuts of runtime. > > > > However, I am having trouble using the wheels on python 2.7 on the same > > Trusty machines. It seems to be because the wheels are tagged as > cp27-cp27mu > > (numpy-1.11.0-cp27-cp27mu-manylinux1_x86_64.whl) where as > > pip.pep425tags.get_abi_tag() returns cp27m on this particular python > > version. (Stock python 2.7 installed on Travis 14.04 VMs) Any chance of a > > cp27m compatible wheel build? > > Ouch - do you know where travis-ci's Python 2.7 comes from? I see > that the standard apt-get install -y python is a wide (mu) build... > > Cheers, > > Matthew > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From matthew.brett at gmail.com Thu Apr 14 14:26:27 2016 From: matthew.brett at gmail.com (Matthew Brett) Date: Thu, 14 Apr 2016 11:26:27 -0700 Subject: [Numpy-discussion] linux wheels coming soon In-Reply-To: References: Message-ID: Hi, On Thu, Apr 14, 2016 at 11:11 AM, Benjamin Root wrote: > Are we going to have to have documentation somewhere making it clear that > the numpy wheel shouldn't be used in a conda environment? Not that I would > expect this issue to come up all that often, but I could imagine a scenario > where a non-scientist is simply using a base conda distribution because that > is what IT put on their system. Then they do "pip install ipython" that > indirectly brings in numpy (through the matplotlib dependency), and end up > with an incompatible numpy because they would have been linked against > different pythons? > > Or is this not an issue? I'm afraid I don't know conda at all, but I'm guessing that pip will not install numpy when it is installed via conda. So the potential difference is that, pre-wheel, if numpy was not installed in your conda environment, then pip would build numpy from source, whereas now you'll get a binary install. I _think_ that Python's binary API specification (pip.pep425tags.get_abi_tag()) should prevent pip from installing an incompatible wheel. Are there any conda experts out there who can give more detail, or more convincing assurance? Cheers, Matthew From gfyoung17 at gmail.com Thu Apr 14 14:51:24 2016 From: gfyoung17 at gmail.com (G Young) Date: Thu, 14 Apr 2016 19:51:24 +0100 Subject: [Numpy-discussion] linux wheels coming soon In-Reply-To: References: Message-ID: Actually, conda pip will install the wheels that you put up. The good news is: they all (by which I mean *numpy* and *scipy* both on 2.7 and 3.5) pass! On Thu, Apr 14, 2016 at 7:26 PM, Matthew Brett wrote: > Hi, > > On Thu, Apr 14, 2016 at 11:11 AM, Benjamin Root > wrote: > > Are we going to have to have documentation somewhere making it clear that > > the numpy wheel shouldn't be used in a conda environment? Not that I > would > > expect this issue to come up all that often, but I could imagine a > scenario > > where a non-scientist is simply using a base conda distribution because > that > > is what IT put on their system. Then they do "pip install ipython" that > > indirectly brings in numpy (through the matplotlib dependency), and end > up > > with an incompatible numpy because they would have been linked against > > different pythons? > > > > Or is this not an issue? > > I'm afraid I don't know conda at all, but I'm guessing that pip will > not install numpy when it is installed via conda. > > So the potential difference is that, pre-wheel, if numpy was not > installed in your conda environment, then pip would build numpy from > source, whereas now you'll get a binary install. > > I _think_ that Python's binary API specification > (pip.pep425tags.get_abi_tag()) should prevent pip from installing an > incompatible wheel. Are there any conda experts out there who can > give more detail, or more convincing assurance? > > Cheers, > > Matthew > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From njs at pobox.com Thu Apr 14 15:07:54 2016 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 14 Apr 2016 12:07:54 -0700 Subject: [Numpy-discussion] linux wheels coming soon In-Reply-To: References: Message-ID: On Apr 14, 2016 11:11 AM, "Benjamin Root" wrote: > > Are we going to have to have documentation somewhere making it clear that the numpy wheel shouldn't be used in a conda environment? Not that I would expect this issue to come up all that often, but I could imagine a scenario where a non-scientist is simply using a base conda distribution because that is what IT put on their system. Then they do "pip install ipython" that indirectly brings in numpy (through the matplotlib dependency), and end up with an incompatible numpy because they would have been linked against different pythons? > > Or is this not an issue? There are always issues when you have two different package managers maintaining separate and out-of-sync metadata about what they think is installed, but that's true for any mixed use of conda and pip. But: - pip won't install a numpy that is incompatible with your python, unless Anaconda is actively breaking cpython's standard abi (they aren't) or there's a bug in pip (possible, but no reports yet). - conda packages for python packages like numpy do generally include the .egg-info / .dist-info directories that pip uses to store its installation metadata, so pip can "see" packages installed by conda (but not vice-versa). So "pip install matplotlib" won't drag in a pypi numpy if there's already a conda numpy installed. AFAIK the one case that's nasty is if you first install a conda X, and then install a pypi X, and then try to use conda to (explicitly, or implicitly via dependencies) upgrade X. And maybe this is particularly nasty for X=numpy just because numpy is so low in the stack, but it's not really numpy specific. (NB I'm not an expert on the internals of conda though :-).) Actually, from the numpy developer point of view, one of the major advantages of having wheels is that we can ask people to test prereleases with 'pip install -U --pre numpy'. If you're a conda user you should only do this in a temporary environment (like any use of pip really), but I definitely hope that some conda users will do exactly that to test things :-). Also note that there's nothing Linux specific about this scenario. We've been shipping osx wheels for ages, and AFAIK it hasn't caused any disaster. -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From jjhelmus at gmail.com Thu Apr 14 15:25:10 2016 From: jjhelmus at gmail.com (Jonathan Helmus) Date: Thu, 14 Apr 2016 14:25:10 -0500 Subject: [Numpy-discussion] linux wheels coming soon In-Reply-To: References: Message-ID: <570FEE96.1030705@gmail.com> On 4/14/16 1:26 PM, Matthew Brett wrote: > Hi, > > On Thu, Apr 14, 2016 at 11:11 AM, Benjamin Root wrote: >> Are we going to have to have documentation somewhere making it clear that >> the numpy wheel shouldn't be used in a conda environment? Not that I would >> expect this issue to come up all that often, but I could imagine a scenario >> where a non-scientist is simply using a base conda distribution because that >> is what IT put on their system. Then they do "pip install ipython" that >> indirectly brings in numpy (through the matplotlib dependency), and end up >> with an incompatible numpy because they would have been linked against >> different pythons? >> >> Or is this not an issue? 
> I'm afraid I don't know conda at all, but I'm guessing that pip will > not install numpy when it is installed via conda. Correct, pip will not (or at least should not, and did not in my tests) install numpy over top of an existing conda installed numpy. Unfortunately from my testing, conda will install a conda version of numpy over top of a pip installed version. This may be the expected behavior as conda maintains its own list of installed packages. > So the potential difference is that, pre-wheel, if numpy was not > installed in your conda environment, then pip would build numpy from > source, whereas now you'll get a binary install. > > I _think_ that Python's binary API specification > (pip.pep425tags.get_abi_tag()) should prevent pip from installing an > incompatible wheel. Are there any conda experts out there who can > give more detail, or more convincing assurance? I tested "pip install numpy" in conda environments (conda's equivalent to virtualenvs) which did not have numpy installed previously for Python 2.7, 3.4 and 3.5 in a Ubuntu 14.04 Docker container. In all cases numpy was installed from the whl file and appeared to be functional. Running the numpy test suite found three failing tests for Python 2.7 and 3.5 and 21 errors in Python 3.4. The 2.7 and 3.5 failures do not look concerning but the 3.4 errors are a bit strange. Logs are in https://gist.github.com/jjhelmus/a433a66d56fb0e39b8ebde248ad3fe36 Cheers, - Jonathan Helmus > > Cheers, > > Matthew > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion From matthew.brett at gmail.com Thu Apr 14 15:57:35 2016 From: matthew.brett at gmail.com (Matthew Brett) Date: Thu, 14 Apr 2016 12:57:35 -0700 Subject: [Numpy-discussion] linux wheels coming soon In-Reply-To: <570FEE96.1030705@gmail.com> References: <570FEE96.1030705@gmail.com> Message-ID: On Thu, Apr 14, 2016 at 12:25 PM, Jonathan Helmus wrote: > > > On 4/14/16 1:26 PM, Matthew Brett wrote: >> >> Hi, >> >> On Thu, Apr 14, 2016 at 11:11 AM, Benjamin Root >> wrote: >>> >>> Are we going to have to have documentation somewhere making it clear that >>> the numpy wheel shouldn't be used in a conda environment? Not that I >>> would >>> expect this issue to come up all that often, but I could imagine a >>> scenario >>> where a non-scientist is simply using a base conda distribution because >>> that >>> is what IT put on their system. Then they do "pip install ipython" that >>> indirectly brings in numpy (through the matplotlib dependency), and end >>> up >>> with an incompatible numpy because they would have been linked against >>> different pythons? >>> >>> Or is this not an issue? >> >> I'm afraid I don't know conda at all, but I'm guessing that pip will >> not install numpy when it is installed via conda. > > Correct, pip will not (or at least should not, and did not in my tests) > install numpy over top of an existing conda installed numpy. Unfortunately > from my testing, conda will install a conda version of numpy over top of a > pip installed version. This may be the expected behavior as conda maintains > its own list of installed packages. >> >> So the potential difference is that, pre-wheel, if numpy was not >> installed in your conda environment, then pip would build numpy from >> source, whereas now you'll get a binary install. 
>> >> I _think_ that Python's binary API specification >> (pip.pep425tags.get_abi_tag()) should prevent pip from installing an >> incompatible wheel. Are there any conda experts out there who can >> give more detail, or more convincing assurance? > > I tested "pip install numpy" in conda environments (conda's equivalent to > virtualenvs) which did not have numpy installed previously for Python 2.7, > 3.4 and 3.5 in a Ubuntu 14.04 Docker container. In all cases numpy was > installed from the whl file and appeared to be functional. Running the > numpy test suite found three failing tests for Python 2.7 and 3.5 and 21 > errors in Python 3.4. The 2.7 and 3.5 failures do not look concerning but > the 3.4 errors are a bit strange. > Logs are in > https://gist.github.com/jjhelmus/a433a66d56fb0e39b8ebde248ad3fe36 Thanks for testing. For: docker run -ti --rm ubuntu:14.04 /bin/bash apt-get update && apt-get install -y curl curl -LO https://bootstrap.pypa.io/get-pip.py python3 get-pip.py pip install numpy nose python3 -c "import numpy; numpy.test()" I get: FAILED (KNOWNFAIL=7, SKIP=17, errors=21) This is stock Python 3.4 - so not a conda issue. It is definitely a problem with the wheel because a compiled numpy wheel on the same docker image: apt-get update && apt-get install -y curl python3-dev curl -LO https://bootstrap.pypa.io/get-pip.py python3 get-pip.py pip install --no-binary=:all: numpy nose python3 -c "import numpy; numpy.test()" gives no test errors. It looks like we have some more work to do... Cheers, Matthew From pmhobson at gmail.com Thu Apr 14 15:59:39 2016 From: pmhobson at gmail.com (Paul Hobson) Date: Thu, 14 Apr 2016 12:59:39 -0700 Subject: [Numpy-discussion] linux wheels coming soon In-Reply-To: References: Message-ID: On Thu, Apr 14, 2016 at 12:07 PM, Nathaniel Smith wrote: > On Apr 14, 2016 11:11 AM, "Benjamin Root" wrote: > > > > Are we going to have to have documentation somewhere making it clear > that the numpy wheel shouldn't be used in a conda environment? Not that I > would expect this issue to come up all that often, but I could imagine a > scenario where a non-scientist is simply using a base conda distribution > because that is what IT put on their system. Then they do "pip install > ipython" that indirectly brings in numpy (through the matplotlib > dependency), and end up with an incompatible numpy because they would have > been linked against different pythons? > > > > Or is this not an issue? > > There are always issues when you have two different package managers > maintaining separate and out-of-sync metadata about what they think is > installed, but that's true for any mixed use of conda and pip. > > But: > - pip won't install a numpy that is incompatible with your python, unless > Anaconda is actively breaking cpython's standard abi (they aren't) or > there's a bug in pip (possible, but no reports yet). > - conda packages for python packages like numpy do generally include the > .egg-info / .dist-info directories that pip uses to store its installation > metadata, so pip can "see" packages installed by conda (but not > vice-versa). So "pip install matplotlib" won't drag in a pypi numpy if > there's already a conda numpy installed. > Minor clarification:. I believe conda can see pip-installed packages. If I execute "conda list" in an environment, I can see packaged installed by both pip, conda, and locally (i.e., "pip install . -e"). -paul -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ben.v.root at gmail.com Thu Apr 14 16:04:52 2016 From: ben.v.root at gmail.com (Benjamin Root) Date: Thu, 14 Apr 2016 16:04:52 -0400 Subject: [Numpy-discussion] linux wheels coming soon In-Reply-To: References: Message-ID: I am honestly surprised that these worked (I haven't gotten around to testing for myself). I could have sworn there was a difference in how Continuum compiled python such that any binaries built against a stock python would not work in a conda environment. I ran into issues a couple years ago where a modwsgi package provided through yum wouldn't work with miniconda because of link-time differences. I cannot for the life of me remember the error message, though. Ben Root On Thu, Apr 14, 2016 at 3:59 PM, Paul Hobson wrote: > > > On Thu, Apr 14, 2016 at 12:07 PM, Nathaniel Smith wrote: > >> On Apr 14, 2016 11:11 AM, "Benjamin Root" wrote: >> > >> > Are we going to have to have documentation somewhere making it clear >> that the numpy wheel shouldn't be used in a conda environment? Not that I >> would expect this issue to come up all that often, but I could imagine a >> scenario where a non-scientist is simply using a base conda distribution >> because that is what IT put on their system. Then they do "pip install >> ipython" that indirectly brings in numpy (through the matplotlib >> dependency), and end up with an incompatible numpy because they would have >> been linked against different pythons? >> > >> > Or is this not an issue? >> >> There are always issues when you have two different package managers >> maintaining separate and out-of-sync metadata about what they think is >> installed, but that's true for any mixed use of conda and pip. >> >> But: >> - pip won't install a numpy that is incompatible with your python, unless >> Anaconda is actively breaking cpython's standard abi (they aren't) or >> there's a bug in pip (possible, but no reports yet). >> - conda packages for python packages like numpy do generally include the >> .egg-info / .dist-info directories that pip uses to store its installation >> metadata, so pip can "see" packages installed by conda (but not >> vice-versa). So "pip install matplotlib" won't drag in a pypi numpy if >> there's already a conda numpy installed. >> > Minor clarification:. I believe conda can see pip-installed packages. > > If I execute "conda list" in an environment, I can see packaged installed > by both pip, conda, and locally (i.e., "pip install . -e"). > > -paul > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Thu Apr 14 16:11:18 2016 From: matthew.brett at gmail.com (Matthew Brett) Date: Thu, 14 Apr 2016 13:11:18 -0700 Subject: [Numpy-discussion] linux wheels coming soon In-Reply-To: References: <570FEE96.1030705@gmail.com> Message-ID: On Thu, Apr 14, 2016 at 12:57 PM, Matthew Brett wrote: > On Thu, Apr 14, 2016 at 12:25 PM, Jonathan Helmus wrote: >> >> >> On 4/14/16 1:26 PM, Matthew Brett wrote: >>> >>> Hi, >>> >>> On Thu, Apr 14, 2016 at 11:11 AM, Benjamin Root >>> wrote: >>>> >>>> Are we going to have to have documentation somewhere making it clear that >>>> the numpy wheel shouldn't be used in a conda environment? 
Not that I >>>> would >>>> expect this issue to come up all that often, but I could imagine a >>>> scenario >>>> where a non-scientist is simply using a base conda distribution because >>>> that >>>> is what IT put on their system. Then they do "pip install ipython" that >>>> indirectly brings in numpy (through the matplotlib dependency), and end >>>> up >>>> with an incompatible numpy because they would have been linked against >>>> different pythons? >>>> >>>> Or is this not an issue? >>> >>> I'm afraid I don't know conda at all, but I'm guessing that pip will >>> not install numpy when it is installed via conda. >> >> Correct, pip will not (or at least should not, and did not in my tests) >> install numpy over top of an existing conda installed numpy. Unfortunately >> from my testing, conda will install a conda version of numpy over top of a >> pip installed version. This may be the expected behavior as conda maintains >> its own list of installed packages. >>> >>> So the potential difference is that, pre-wheel, if numpy was not >>> installed in your conda environment, then pip would build numpy from >>> source, whereas now you'll get a binary install. >>> >>> I _think_ that Python's binary API specification >>> (pip.pep425tags.get_abi_tag()) should prevent pip from installing an >>> incompatible wheel. Are there any conda experts out there who can >>> give more detail, or more convincing assurance? >> >> I tested "pip install numpy" in conda environments (conda's equivalent to >> virtualenvs) which did not have numpy installed previously for Python 2.7, >> 3.4 and 3.5 in a Ubuntu 14.04 Docker container. In all cases numpy was >> installed from the whl file and appeared to be functional. Running the >> numpy test suite found three failing tests for Python 2.7 and 3.5 and 21 >> errors in Python 3.4. The 2.7 and 3.5 failures do not look concerning but >> the 3.4 errors are a bit strange. >> Logs are in >> https://gist.github.com/jjhelmus/a433a66d56fb0e39b8ebde248ad3fe36 > > Thanks for testing. For: > > docker run -ti --rm ubuntu:14.04 /bin/bash > > apt-get update && apt-get install -y curl > curl -LO https://bootstrap.pypa.io/get-pip.py > python3 get-pip.py > pip install numpy nose > python3 -c "import numpy; numpy.test()" > > I get: > > FAILED (KNOWNFAIL=7, SKIP=17, errors=21) > > This is stock Python 3.4 - so not a conda issue. It is definitely a > problem with the wheel because a compiled numpy wheel on the same > docker image: > > apt-get update && apt-get install -y curl python3-dev > curl -LO https://bootstrap.pypa.io/get-pip.py > python3 get-pip.py > pip install --no-binary=:all: numpy nose > python3 -c "import numpy; numpy.test()" > > gives no test errors. > > It looks like we have some more work to do... Actually, I can solve these errors by first doing: apt-get install gcc I think these must be bugs in the numpy tests where numpy is assuming a functional compiler. Does the conda numpy give test errors when there is no compiler? Cheers, Matthew From matthew.brett at gmail.com Thu Apr 14 16:22:01 2016 From: matthew.brett at gmail.com (Matthew Brett) Date: Thu, 14 Apr 2016 13:22:01 -0700 Subject: [Numpy-discussion] linux wheels coming soon In-Reply-To: References: Message-ID: On Thu, Apr 14, 2016 at 8:02 AM, Jens Nielsen wrote: > I have tried testing the wheels in a project that runs tests on Travis's > Trusty infrastructure which. The wheels work great for python 3.5 and saves > us several minuts of runtime. 
> > However, I am having trouble using the wheels on python 2.7 on the same > Trusty machines. It seems to be because the wheels are tagged as cp27-cp27mu > (numpy-1.11.0-cp27-cp27mu-manylinux1_x86_64.whl) where as > pip.pep425tags.get_abi_tag() returns cp27m on this particular python > version. (Stock python 2.7 installed on Travis 14.04 VMs) Any chance of a > cp27m compatible wheel build? Nathaniel / other pip experts - I can't remember the history of these tags. Is there any danger that an older pip will install a cp27m wheel on a cp27mu system? Matthew From jjhelmus at gmail.com Thu Apr 14 16:47:19 2016 From: jjhelmus at gmail.com (Jonathan Helmus) Date: Thu, 14 Apr 2016 15:47:19 -0500 Subject: [Numpy-discussion] linux wheels coming soon In-Reply-To: References: <570FEE96.1030705@gmail.com> Message-ID: <571001D7.3090401@gmail.com> On 4/14/16 3:11 PM, Matthew Brett wrote: > On Thu, Apr 14, 2016 at 12:57 PM, Matthew Brett wrote: >> On Thu, Apr 14, 2016 at 12:25 PM, Jonathan Helmus wrote: >>> >>> On 4/14/16 1:26 PM, Matthew Brett wrote: >>>> Hi, >>>> >>>> On Thu, Apr 14, 2016 at 11:11 AM, Benjamin Root >>>> wrote: >>>>> Are we going to have to have documentation somewhere making it clear that >>>>> the numpy wheel shouldn't be used in a conda environment? Not that I >>>>> would >>>>> expect this issue to come up all that often, but I could imagine a >>>>> scenario >>>>> where a non-scientist is simply using a base conda distribution because >>>>> that >>>>> is what IT put on their system. Then they do "pip install ipython" that >>>>> indirectly brings in numpy (through the matplotlib dependency), and end >>>>> up >>>>> with an incompatible numpy because they would have been linked against >>>>> different pythons? >>>>> >>>>> Or is this not an issue? >>>> I'm afraid I don't know conda at all, but I'm guessing that pip will >>>> not install numpy when it is installed via conda. >>> Correct, pip will not (or at least should not, and did not in my tests) >>> install numpy over top of an existing conda installed numpy. Unfortunately >>> from my testing, conda will install a conda version of numpy over top of a >>> pip installed version. This may be the expected behavior as conda maintains >>> its own list of installed packages. >>>> So the potential difference is that, pre-wheel, if numpy was not >>>> installed in your conda environment, then pip would build numpy from >>>> source, whereas now you'll get a binary install. >>>> >>>> I _think_ that Python's binary API specification >>>> (pip.pep425tags.get_abi_tag()) should prevent pip from installing an >>>> incompatible wheel. Are there any conda experts out there who can >>>> give more detail, or more convincing assurance? >>> I tested "pip install numpy" in conda environments (conda's equivalent to >>> virtualenvs) which did not have numpy installed previously for Python 2.7, >>> 3.4 and 3.5 in a Ubuntu 14.04 Docker container. In all cases numpy was >>> installed from the whl file and appeared to be functional. Running the >>> numpy test suite found three failing tests for Python 2.7 and 3.5 and 21 >>> errors in Python 3.4. The 2.7 and 3.5 failures do not look concerning but >>> the 3.4 errors are a bit strange. >>> Logs are in >>> https://gist.github.com/jjhelmus/a433a66d56fb0e39b8ebde248ad3fe36 >> Thanks for testing. 
For: >> >> docker run -ti --rm ubuntu:14.04 /bin/bash >> >> apt-get update && apt-get install -y curl >> curl -LO https://bootstrap.pypa.io/get-pip.py >> python3 get-pip.py >> pip install numpy nose >> python3 -c "import numpy; numpy.test()" >> >> I get: >> >> FAILED (KNOWNFAIL=7, SKIP=17, errors=21) >> >> This is stock Python 3.4 - so not a conda issue. It is definitely a >> problem with the wheel because a compiled numpy wheel on the same >> docker image: >> >> apt-get update && apt-get install -y curl python3-dev >> curl -LO https://bootstrap.pypa.io/get-pip.py >> python3 get-pip.py >> pip install --no-binary=:all: numpy nose >> python3 -c "import numpy; numpy.test()" >> >> gives no test errors. >> >> It looks like we have some more work to do... > Actually, I can solve these errors by first doing: > > apt-get install gcc > > I think these must be bugs in the numpy tests where numpy is assuming > a functional compiler. > > Does the conda numpy give test errors when there is no compiler? > > Cheers, > > Matthew Yes, both the wheel and conda numpy packages give errors when there is not a compiler. These errors clear when gcc is installed. Looks like the wheels are fine, just forgot about a compiler. Cheers, - Jonathan Helmus From matthew.brett at gmail.com Thu Apr 14 17:32:08 2016 From: matthew.brett at gmail.com (Matthew Brett) Date: Thu, 14 Apr 2016 14:32:08 -0700 Subject: [Numpy-discussion] linux wheels coming soon In-Reply-To: <571001D7.3090401@gmail.com> References: <570FEE96.1030705@gmail.com> <571001D7.3090401@gmail.com> Message-ID: On Thu, Apr 14, 2016 at 1:47 PM, Jonathan Helmus wrote: > > > On 4/14/16 3:11 PM, Matthew Brett wrote: >> >> On Thu, Apr 14, 2016 at 12:57 PM, Matthew Brett >> wrote: >>> >>> On Thu, Apr 14, 2016 at 12:25 PM, Jonathan Helmus >>> wrote: >>>> >>>> >>>> On 4/14/16 1:26 PM, Matthew Brett wrote: >>>>> >>>>> Hi, >>>>> >>>>> On Thu, Apr 14, 2016 at 11:11 AM, Benjamin Root >>>>> wrote: >>>>>> >>>>>> Are we going to have to have documentation somewhere making it clear >>>>>> that >>>>>> the numpy wheel shouldn't be used in a conda environment? Not that I >>>>>> would >>>>>> expect this issue to come up all that often, but I could imagine a >>>>>> scenario >>>>>> where a non-scientist is simply using a base conda distribution >>>>>> because >>>>>> that >>>>>> is what IT put on their system. Then they do "pip install ipython" >>>>>> that >>>>>> indirectly brings in numpy (through the matplotlib dependency), and >>>>>> end >>>>>> up >>>>>> with an incompatible numpy because they would have been linked against >>>>>> different pythons? >>>>>> >>>>>> Or is this not an issue? >>>>> >>>>> I'm afraid I don't know conda at all, but I'm guessing that pip will >>>>> not install numpy when it is installed via conda. >>>> >>>> Correct, pip will not (or at least should not, and did not in my tests) >>>> install numpy over top of an existing conda installed numpy. >>>> Unfortunately >>>> from my testing, conda will install a conda version of numpy over top of >>>> a >>>> pip installed version. This may be the expected behavior as conda >>>> maintains >>>> its own list of installed packages. >>>>> >>>>> So the potential difference is that, pre-wheel, if numpy was not >>>>> installed in your conda environment, then pip would build numpy from >>>>> source, whereas now you'll get a binary install. >>>>> >>>>> I _think_ that Python's binary API specification >>>>> (pip.pep425tags.get_abi_tag()) should prevent pip from installing an >>>>> incompatible wheel. 
Are there any conda experts out there who can >>>>> give more detail, or more convincing assurance? >>>> >>>> I tested "pip install numpy" in conda environments (conda's equivalent >>>> to >>>> virtualenvs) which did not have numpy installed previously for Python >>>> 2.7, >>>> 3.4 and 3.5 in a Ubuntu 14.04 Docker container. In all cases numpy was >>>> installed from the whl file and appeared to be functional. Running the >>>> numpy test suite found three failing tests for Python 2.7 and 3.5 and 21 >>>> errors in Python 3.4. The 2.7 and 3.5 failures do not look concerning >>>> but >>>> the 3.4 errors are a bit strange. >>>> Logs are in >>>> https://gist.github.com/jjhelmus/a433a66d56fb0e39b8ebde248ad3fe36 >>> >>> Thanks for testing. For: >>> >>> docker run -ti --rm ubuntu:14.04 /bin/bash >>> >>> apt-get update && apt-get install -y curl >>> curl -LO https://bootstrap.pypa.io/get-pip.py >>> python3 get-pip.py >>> pip install numpy nose >>> python3 -c "import numpy; numpy.test()" >>> >>> I get: >>> >>> FAILED (KNOWNFAIL=7, SKIP=17, errors=21) >>> >>> This is stock Python 3.4 - so not a conda issue. It is definitely a >>> problem with the wheel because a compiled numpy wheel on the same >>> docker image: >>> >>> apt-get update && apt-get install -y curl python3-dev >>> curl -LO https://bootstrap.pypa.io/get-pip.py >>> python3 get-pip.py >>> pip install --no-binary=:all: numpy nose >>> python3 -c "import numpy; numpy.test()" >>> >>> gives no test errors. >>> >>> It looks like we have some more work to do... >> >> Actually, I can solve these errors by first doing: >> >> apt-get install gcc >> >> I think these must be bugs in the numpy tests where numpy is assuming >> a functional compiler. >> >> Does the conda numpy give test errors when there is no compiler? >> >> Cheers, >> >> Matthew > > > Yes, both the wheel and conda numpy packages give errors when there is not a > compiler. These errors clear when gcc is installed. Looks like the wheels > are fine, just forgot about a compiler. Thanks for checking. I think the problem is fixed here: https://github.com/numpy/numpy/pull/7549 Cheers, Matthew From njs at pobox.com Thu Apr 14 21:00:12 2016 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 14 Apr 2016 18:00:12 -0700 Subject: [Numpy-discussion] linux wheels coming soon In-Reply-To: References: Message-ID: On Thu, Apr 14, 2016 at 1:22 PM, Matthew Brett wrote: > On Thu, Apr 14, 2016 at 8:02 AM, Jens Nielsen wrote: >> I have tried testing the wheels in a project that runs tests on Travis's >> Trusty infrastructure which. The wheels work great for python 3.5 and saves >> us several minuts of runtime. >> >> However, I am having trouble using the wheels on python 2.7 on the same >> Trusty machines. It seems to be because the wheels are tagged as cp27-cp27mu >> (numpy-1.11.0-cp27-cp27mu-manylinux1_x86_64.whl) where as >> pip.pep425tags.get_abi_tag() returns cp27m on this particular python >> version. (Stock python 2.7 installed on Travis 14.04 VMs) Any chance of a >> cp27m compatible wheel build? > > Nathaniel / other pip experts - I can't remember the history of these tags. > > Is there any danger that an older pip will install a cp27m wheel on a > cp27mu system? No, support for cp27m/cp27mu tags went in before support for manylinux tags. And in any case, a pip that doesn't know about cp27m/cp27mu will just not install such wheels. 
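For anyone trying to work out which flavor their own interpreter is, both the tag and the unicode width behind it are easy to check (pep425tags is internal pip API, so treat this as a diagnostic only):

>>> import sys, pip
>>> pip.pep425tags.get_abi_tag()    # pip >= 8.1
'cp27mu'
>>> sys.maxunicode > 0xffff         # True on a ucs4 ("wide", mu) 2.7 build
True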
The dangerous case is if you were to use an old version of bdist_wheel that generated a wheel with the "none" abi tag instead of a cp27m/cp27mu abi tag -- this will mess up all versions of pip, new and old. But the manylinux docker image definitely has a new enough version of the wheel package that this is not a problem. ...I guess the other dangerous case is if you generate a wheel that simply has the wrong name -- this happened to the gevent packager due to some distutils brokenness involving using the same source directory to build both wheels. So don't do that :-). (IIRC there's an open bug against auditwheel to check for all these problems -- belt *and* suspenders -- but that hasn't been implemented yet.) -n -- Nathaniel J. Smith -- https://vorpus.org From cjw at ncf.ca Thu Apr 14 23:08:56 2016 From: cjw at ncf.ca (Colin J. Williams) Date: Thu, 14 Apr 2016 23:08:56 -0400 Subject: [Numpy-discussion] Please Unsubscribe Message-ID: <57105B48.6010406@ncf.ca> From charlesr.harris at gmail.com Sat Apr 16 17:34:58 2016 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 16 Apr 2016 15:34:58 -0600 Subject: [Numpy-discussion] Preparing for 1.12 branch Message-ID: Hi All, This is just a request that numpy reviewers tag PRs that they think merit inclusion in 1.12 with `1.12.0 release`. The tag doesn't mean that the PR need be in 1.12, but it will help prioritize the review process. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Sat Apr 16 23:02:18 2016 From: matthew.brett at gmail.com (Matthew Brett) Date: Sat, 16 Apr 2016 20:02:18 -0700 Subject: [Numpy-discussion] linux wheels coming soon In-Reply-To: References: Message-ID: Hi, On Thu, Apr 14, 2016 at 11:04 AM, Matthew Brett wrote: > Hi, > > On Thu, Apr 14, 2016 at 8:02 AM, Jens Nielsen wrote: >> I have tried testing the wheels in a project that runs tests on Travis's >> Trusty infrastructure which. The wheels work great for python 3.5 and saves >> us several minuts of runtime. >> >> However, I am having trouble using the wheels on python 2.7 on the same >> Trusty machines. It seems to be because the wheels are tagged as cp27-cp27mu >> (numpy-1.11.0-cp27-cp27mu-manylinux1_x86_64.whl) where as >> pip.pep425tags.get_abi_tag() returns cp27m on this particular python >> version. (Stock python 2.7 installed on Travis 14.04 VMs) Any chance of a >> cp27m compatible wheel build? > > Ouch - do you know where travis-ci's Python 2.7 comes from? I see > that the standard apt-get install -y python is a wide (mu) build... I built some narrow unicode builds (numpy-1.11.0-cp27-cp27m-manylinux1_x86_64.whl etc) here: http://ccdd0ebb5a931e58c7c5-aae005c4999d7244ac63632f8b80e089.r77.cf2.rackcdn.com/ Would you mind testing them to see if they work on travis-ci? Thanks, Matthew From matthew.brett at gmail.com Sat Apr 16 23:23:47 2016 From: matthew.brett at gmail.com (Matthew Brett) Date: Sat, 16 Apr 2016 20:23:47 -0700 Subject: [Numpy-discussion] linux wheels coming soon In-Reply-To: References: Message-ID: On Sat, Apr 16, 2016 at 8:02 PM, Matthew Brett wrote: > Hi, > > On Thu, Apr 14, 2016 at 11:04 AM, Matthew Brett wrote: >> Hi, >> >> On Thu, Apr 14, 2016 at 8:02 AM, Jens Nielsen wrote: >>> I have tried testing the wheels in a project that runs tests on Travis's >>> Trusty infrastructure which. The wheels work great for python 3.5 and saves >>> us several minuts of runtime. >>> >>> However, I am having trouble using the wheels on python 2.7 on the same >>> Trusty machines. 
It seems to be because the wheels are tagged as cp27-cp27mu >>> (numpy-1.11.0-cp27-cp27mu-manylinux1_x86_64.whl) where as >>> pip.pep425tags.get_abi_tag() returns cp27m on this particular python >>> version. (Stock python 2.7 installed on Travis 14.04 VMs) Any chance of a >>> cp27m compatible wheel build? >> >> Ouch - do you know where travis-ci's Python 2.7 comes from? I see >> that the standard apt-get install -y python is a wide (mu) build... > > I built some narrow unicode builds > (numpy-1.11.0-cp27-cp27m-manylinux1_x86_64.whl etc) here: > > http://ccdd0ebb5a931e58c7c5-aae005c4999d7244ac63632f8b80e089.r77.cf2.rackcdn.com/ > > Would you mind testing them to see if they work on travis-ci? I tried testing on trusty with travis-ci, but it appears to pick up the mu builds as on precise... https://travis-ci.org/matthew-brett/manylinux-testing/jobs/123652670#L161 Cheers, Matthew From olivier.grisel at ensta.org Sun Apr 17 06:05:10 2016 From: olivier.grisel at ensta.org (Olivier Grisel) Date: Sun, 17 Apr 2016 12:05:10 +0200 Subject: [Numpy-discussion] linux wheels coming soon In-Reply-To: References: Message-ID: I tried on trusty and is also picked numpy-1.11.0-cp27-cp27mu-manylinux1_x86_64.whl using the system python 2.7 (in a virtualenv with pip 8.1.1): >>> import pip >>> pip.pep425tags.get_abi_tag() 'cp27mu' Outside of the virtualenv I still have the pip version from ubuntu trusty and it does cannot detect ABI tags: $ /usr/bin/pip --version pip 1.5.4 from /usr/lib/python2.7/dist-packages (python 2.7) >>> import pip >>> pip.pep425tags.get_abi_tag() Traceback (most recent call last): File "", line 1, in AttributeError: 'module' object has no attribute 'get_abi_tag' But we don't really care because manylinux1 wheels can only be installed by pip 8.1 and later. Previous versions of pip should just ignore those wheels and try to install from the source tarball instead. -- Olivier From jenshnielsen at gmail.com Sun Apr 17 12:48:48 2016 From: jenshnielsen at gmail.com (Jens Nielsen) Date: Sun, 17 Apr 2016 16:48:48 +0000 Subject: [Numpy-discussion] linux wheels coming soon In-Reply-To: References: Message-ID: I have tested the new cp27m wheels and they seem to work great. @Matthew I am using the: ``` sudo: required dist: trusty images mentioned here https://docs.travis-ci.com/user/ci-environment/. As far as I can see you are doing: sudo: false dist: trusty I had no idea such an image exist since it's not documented on https://docs.travis-ci.com/user/ci-environment/ Anyway your tests runs with python 2.7.9 where as the sudo: requires ships python 2.7.10 so it's clearly a different python version: @Olivier Grisel this only applies to Travis's own home build versions of python 2.7 on the Trusty running on google compute engine. It ships it's own prebuild python version. I don't have any issues with the stock versions on Ubuntu which pip tells me are indeed cp27mu. It seems like the new cp27m wheels works as expected. 
Thanks a lot Doing: ``` python -c "from pip import pep425tags; print(pep425tags.is_manylinux1_compatible()); print(pep425tags.have_compatible_glibc(2, 5)); print(pep425tags.get_abi_tag())" pip install --timeout=60 --no-index --trusted-host " ccdd0ebb5a931e58c7c5-aae005c4999d7244ac63632f8b80e089.r77.cf2.rackcdn.com" --find-links " http://ccdd0ebb5a931e58c7c5-aae005c4999d7244ac63632f8b80e089.r77.cf2.rackcdn.com/" numpy scipy --upgrade ``` results in: ``` True True cp27m Ignoring indexes: https://pypi.python.org/simple Collecting numpy Downloading http://ccdd0ebb5a931e58c7c5-aae005c4999d7244ac63632f8b80e089.r77.cf2.rackcdn.com/numpy-1.11.0-cp27-cp27m-manylinux1_x86_64.whl (15.3MB) 100% |????????????????????????????????| 15.3MB 49.0MB/s Collecting scipy Downloading http://ccdd0ebb5a931e58c7c5-aae005c4999d7244ac63632f8b80e089.r77.cf2.rackcdn.com/scipy-0.17.0-cp27-cp27m-manylinux1_x86_64.whl (39.5MB) 100% |????????????????????????????????| 39.5MB 21.1MB/s Installing collected packages: numpy, scipy Found existing installation: numpy 1.10.1 Uninstalling numpy-1.10.1: Successfully uninstalled numpy-1.10.1 Successfully installed numpy-1.11.0 scipy-0.17.0 ``` And all my tests pass as expected. Thanks a lot for all the work. Best Jens On Sun, 17 Apr 2016 at 11:05 Olivier Grisel wrote: > I tried on trusty and is also picked > numpy-1.11.0-cp27-cp27mu-manylinux1_x86_64.whl using the system python > 2.7 (in a virtualenv with pip 8.1.1): > > >>> import pip > >>> pip.pep425tags.get_abi_tag() > 'cp27mu' > > Outside of the virtualenv I still have the pip version from ubuntu > trusty and it does cannot detect ABI tags: > > $ /usr/bin/pip --version > pip 1.5.4 from /usr/lib/python2.7/dist-packages (python 2.7) > > >>> import pip > >>> pip.pep425tags.get_abi_tag() > Traceback (most recent call last): > File "", line 1, in > AttributeError: 'module' object has no attribute 'get_abi_tag' > > But we don't really care because manylinux1 wheels can only be > installed by pip 8.1 and later. Previous versions of pip should just > ignore those wheels and try to install from the source tarball > instead. > > -- > Olivier > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From olivier.grisel at ensta.org Sun Apr 17 13:46:36 2016 From: olivier.grisel at ensta.org (Olivier Grisel) Date: Sun, 17 Apr 2016 19:46:36 +0200 Subject: [Numpy-discussion] linux wheels coming soon In-Reply-To: References: Message-ID: Thanks for the clarification, I read your original report too quickly. I wonder why the travis maintainers built Python 2.7 with a non-standard unicode option. Edit (after googling): this is a known issue. The image with Python 2.7.11 will be fixed: https://github.com/travis-ci/travis-ci/issues/5107 -- Olivier From ben.v.root at gmail.com Sun Apr 17 15:03:02 2016 From: ben.v.root at gmail.com (Benjamin Root) Date: Sun, 17 Apr 2016 15:03:02 -0400 Subject: [Numpy-discussion] linux wheels coming soon In-Reply-To: References: Message-ID: Yeah! That's the bug I encountered! So, that would explain why this seems to work fine now (I tried it out a bit on Friday on a CentOS6 system, but didn't run the test suite). Cheers! Ben Root On Sun, Apr 17, 2016 at 1:46 PM, Olivier Grisel wrote: > Thanks for the clarification, I read your original report too quickly. 
> > I wonder why the travis maintainers built Python 2.7 with a > non-standard unicode option. > > Edit (after googling): this is a known issue. The image with Python > 2.7.11 will be fixed: > > https://github.com/travis-ci/travis-ci/issues/5107 > > -- > Olivier > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Sun Apr 17 15:25:34 2016 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 17 Apr 2016 12:25:34 -0700 Subject: [Numpy-discussion] linux wheels coming soon In-Reply-To: References: Message-ID: On Apr 17, 2016 10:47 AM, "Olivier Grisel" wrote: > > Thanks for the clarification, I read your original report too quickly. > > I wonder why the travis maintainers built Python 2.7 with a > non-standard unicode option. Because for some reason cpython's configure script (in the now somewhat ancient versions we're talking about) defaults to non-standard broken Unicode support, and you have to explicitly override it if you want working standard Unicode support. I guess this made sense in like the 90s before people realized how unicode was going to go down. Same issue affects pyenv users (or used to, I think they might have just fixed it [0]) and Enthought Canopy. -n [0] https://github.com/yyuu/pyenv/issues/257 -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Mon Apr 18 17:49:50 2016 From: matthew.brett at gmail.com (Matthew Brett) Date: Mon, 18 Apr 2016 14:49:50 -0700 Subject: [Numpy-discussion] linux wheels coming soon In-Reply-To: References: Message-ID: On Sun, Apr 17, 2016 at 9:48 AM, Jens Nielsen wrote: > I have tested the new cp27m wheels and they seem to work great. > > @Matthew I am using the: > > ``` > sudo: required > dist: trusty > > images mentioned here https://docs.travis-ci.com/user/ci-environment/. As > far as I can see you are doing: > sudo: false > dist: trusty > > I had no idea such an image exist since it's not documented on > https://docs.travis-ci.com/user/ci-environment/ > > Anyway your tests runs with python 2.7.9 where as the sudo: requires ships > python 2.7.10 so it's clearly a different python version: > > @Olivier Grisel this only applies to Travis's own home build versions of > python 2.7 on the Trusty running on google compute engine. > It ships it's own prebuild python version. I don't have any issues with the > stock versions on Ubuntu which pip tells me are indeed cp27mu. > > It seems like the new cp27m wheels works as expected. 
Thanks a lot > Doing: > > ``` > python -c "from pip import pep425tags; > print(pep425tags.is_manylinux1_compatible()); > print(pep425tags.have_compatible_glibc(2, 5)); > print(pep425tags.get_abi_tag())" > pip install --timeout=60 --no-index --trusted-host > "ccdd0ebb5a931e58c7c5-aae005c4999d7244ac63632f8b80e089.r77.cf2.rackcdn.com" > --find-links > "http://ccdd0ebb5a931e58c7c5-aae005c4999d7244ac63632f8b80e089.r77.cf2.rackcdn.com/" > numpy scipy --upgrade > ``` > results in: > > ``` > True > True > cp27m > Ignoring indexes: https://pypi.python.org/simple > Collecting numpy > Downloading > http://ccdd0ebb5a931e58c7c5-aae005c4999d7244ac63632f8b80e089.r77.cf2.rackcdn.com/numpy-1.11.0-cp27-cp27m-manylinux1_x86_64.whl > (15.3MB) > 100% |????????????????????????????????| 15.3MB 49.0MB/s > Collecting scipy > Downloading > http://ccdd0ebb5a931e58c7c5-aae005c4999d7244ac63632f8b80e089.r77.cf2.rackcdn.com/scipy-0.17.0-cp27-cp27m-manylinux1_x86_64.whl > (39.5MB) > 100% |????????????????????????????????| 39.5MB 21.1MB/s > Installing collected packages: numpy, scipy > Found existing installation: numpy 1.10.1 > Uninstalling numpy-1.10.1: > Successfully uninstalled numpy-1.10.1 > Successfully installed numpy-1.11.0 scipy-0.17.0 > ``` > And all my tests pass as expected. Thanks for testing. I set up a buildbot test to run against a narrow unicode build of Python: http://nipy.bic.berkeley.edu/builders/manylinux-2.7-debian-narrow/builds/1 All tests pass for me too, so I've done the pypi upload for the narrow unicode numpy, scipy, cython wheels. Cheers, Matthew From matthew.brett at gmail.com Tue Apr 19 03:17:40 2016 From: matthew.brett at gmail.com (Matthew Brett) Date: Tue, 19 Apr 2016 00:17:40 -0700 Subject: [Numpy-discussion] linux wheels coming soon In-Reply-To: References: Message-ID: Hi, On Mon, Apr 18, 2016 at 2:49 PM, Matthew Brett wrote: > On Sun, Apr 17, 2016 at 9:48 AM, Jens Nielsen wrote: >> I have tested the new cp27m wheels and they seem to work great. >> >> @Matthew I am using the: >> >> ``` >> sudo: required >> dist: trusty >> >> images mentioned here https://docs.travis-ci.com/user/ci-environment/. As >> far as I can see you are doing: >> sudo: false >> dist: trusty >> >> I had no idea such an image exist since it's not documented on >> https://docs.travis-ci.com/user/ci-environment/ >> >> Anyway your tests runs with python 2.7.9 where as the sudo: requires ships >> python 2.7.10 so it's clearly a different python version: >> >> @Olivier Grisel this only applies to Travis's own home build versions of >> python 2.7 on the Trusty running on google compute engine. >> It ships it's own prebuild python version. I don't have any issues with the >> stock versions on Ubuntu which pip tells me are indeed cp27mu. >> >> It seems like the new cp27m wheels works as expected. 
Thanks a lot >> Doing: >> >> ``` >> python -c "from pip import pep425tags; >> print(pep425tags.is_manylinux1_compatible()); >> print(pep425tags.have_compatible_glibc(2, 5)); >> print(pep425tags.get_abi_tag())" >> pip install --timeout=60 --no-index --trusted-host >> "ccdd0ebb5a931e58c7c5-aae005c4999d7244ac63632f8b80e089.r77.cf2.rackcdn.com" >> --find-links >> "http://ccdd0ebb5a931e58c7c5-aae005c4999d7244ac63632f8b80e089.r77.cf2.rackcdn.com/" >> numpy scipy --upgrade >> ``` >> results in: >> >> ``` >> True >> True >> cp27m >> Ignoring indexes: https://pypi.python.org/simple >> Collecting numpy >> Downloading >> http://ccdd0ebb5a931e58c7c5-aae005c4999d7244ac63632f8b80e089.r77.cf2.rackcdn.com/numpy-1.11.0-cp27-cp27m-manylinux1_x86_64.whl >> (15.3MB) >> 100% |????????????????????????????????| 15.3MB 49.0MB/s >> Collecting scipy >> Downloading >> http://ccdd0ebb5a931e58c7c5-aae005c4999d7244ac63632f8b80e089.r77.cf2.rackcdn.com/scipy-0.17.0-cp27-cp27m-manylinux1_x86_64.whl >> (39.5MB) >> 100% |????????????????????????????????| 39.5MB 21.1MB/s >> Installing collected packages: numpy, scipy >> Found existing installation: numpy 1.10.1 >> Uninstalling numpy-1.10.1: >> Successfully uninstalled numpy-1.10.1 >> Successfully installed numpy-1.11.0 scipy-0.17.0 >> ``` >> And all my tests pass as expected. > > Thanks for testing. I've also tested a range of numpy and scipy wheels built with the manylinux docker image. Built numpy and scipy wheels here: http://nipy.bic.berkeley.edu/manylinux/ Test script and output here: http://nipy.bic.berkeley.edu/manylinux/tests/ There are some test failures in the logs there, but I think they are all known failures from old numpy / scipy versions, particularly https://github.com/scipy/scipy/issues/5370 Y'all can test for yourselves with something like: python -m pip install -U pip pip install -f https://nipy.bic.berkeley.edu/manylinux numpy==1.6.2 scipy==0.16.0 I propose to upload these historical wheels to pypi to make it easier to test against older versions of numpy / scipy. Any objections? Cheers, Matthew From olivier.grisel at ensta.org Tue Apr 19 04:12:54 2016 From: olivier.grisel at ensta.org (Olivier Grisel) Date: Tue, 19 Apr 2016 10:12:54 +0200 Subject: [Numpy-discussion] linux wheels coming soon In-Reply-To: References: Message-ID: I think that would be very useful, e.g. for downstream projects to check that they work properly with old versions using a simple pip install command on their CI workers. -- Olivier From barberchris01 at gmail.com Tue Apr 19 14:12:54 2016 From: barberchris01 at gmail.com (Chris Barber) Date: Tue, 19 Apr 2016 11:12:54 -0700 Subject: [Numpy-discussion] nan version of einsum Message-ID: Is there any interest in a nan-ignoring version of einsum a la nansum, nanprod, etc? Any idea how difficult it would be to implement? - Chris -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Wed Apr 20 01:08:54 2016 From: matthew.brett at gmail.com (Matthew Brett) Date: Tue, 19 Apr 2016 22:08:54 -0700 Subject: [Numpy-discussion] linux wheels coming soon In-Reply-To: References: Message-ID: On Tue, Apr 19, 2016 at 1:12 AM, Olivier Grisel wrote: > I think that would be very useful, e.g. for downstream projects to > check that they work properly with old versions using a simple pip > install command on their CI workers. 
Done for numpy 1.6.0 through 1.10.4, scipy 0.9 through scipy 0.16.1

Please let me know of any problems,

Matthew

From olivier.grisel at ensta.org  Wed Apr 20 04:59:31 2016
From: olivier.grisel at ensta.org (Olivier Grisel)
Date: Wed, 20 Apr 2016 10:59:31 +0200
Subject: [Numpy-discussion] linux wheels coming soon
In-Reply-To:
References:
Message-ID:

Thanks,

I think next we could upgrade the travis configuration of numpy and
scipy to build and upload manylinux1 wheels to
http://travis-dev-wheels.scipy.org/ for downstream projects to test
against the master branch of numpy and scipy without having to build
those from source.

However that would require publishing an official pre-built
libopenblas.so (+headers) archive or RPM package. That archive would
serve as the reference library to build scipy stack manylinux1 wheels.

--
Olivier

From jenshnielsen at gmail.com  Wed Apr 20 06:33:56 2016
From: jenshnielsen at gmail.com (Jens Nielsen)
Date: Wed, 20 Apr 2016 10:33:56 +0000
Subject: [Numpy-discussion] linux wheels coming soon
In-Reply-To:
References:
Message-ID:

Thanks

I can confirm that the new narrow unicode build wheels of Scipy work as
expected for my project.
@Olivier Grisel Thanks for finding the Travis issue; it's probably worth
considering switching the Travis build to 2.7.11 to avoid other similar
issues.

The old versions of numpy are very handy for downstream testing. I have
verified that they work as expected in the Matplotlib tests here:
https://github.com/jenshnielsen/matplotlib/tree/travisnowheelhouse where we
are testing against numpy 1.6 as the earliest. This branch switches
matplotlib from the scikit-image wheelhouse to manylinux wheels, which
seems to work great.

best
Jens

On Wed, 20 Apr 2016 at 09:59 Olivier Grisel wrote:

> Thanks,
>
> I think next we could upgrade the travis configuration of numpy and
> scipy to build and upload manylinux1 wheels to
> http://travis-dev-wheels.scipy.org/ for downstream projects to test
> against the master branch of numpy and scipy without having to build
> those from source.
>
> However that would require publishing an official pre-built
> libopenblas.so (+headers) archive or RPM package. That archive would
> serve as the reference library to build scipy stack manylinux1 wheels.
> --
> Olivier
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> https://mail.scipy.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From matthew.brett at gmail.com  Wed Apr 20 10:57:50 2016
From: matthew.brett at gmail.com (Matthew Brett)
Date: Wed, 20 Apr 2016 07:57:50 -0700
Subject: [Numpy-discussion] linux wheels coming soon
In-Reply-To:
References:
Message-ID:

On Wed, Apr 20, 2016 at 1:59 AM, Olivier Grisel
wrote:
> Thanks,
>
> I think next we could upgrade the travis configuration of numpy and
> scipy to build and upload manylinux1 wheels to
> http://travis-dev-wheels.scipy.org/ for downstream projects to test
> against the master branch of numpy and scipy without having to build
> those from source.
>
> However that would require publishing an official pre-built
> libopenblas.so (+headers) archive or RPM package. That archive would
> serve as the reference library to build scipy stack manylinux1 wheels.

There's an OpenBLAS archive up at:

http://ccdd0ebb5a931e58c7c5-aae005c4999d7244ac63632f8b80e089.r77.cf2.rackcdn.com/openblas_0.2.18.tgz

- is that the right place for it?  It gets uploaded by the
manylinux-builds travis run.
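If anyone wants to poke at the archive contents, a minimal sketch for
fetching and unpacking it (Python 3 stdlib only; the directory layout
inside the tarball is my assumption, so list it before linking against
anything):

```python
import tarfile
import urllib.request

URL = ("http://ccdd0ebb5a931e58c7c5-aae005c4999d7244ac63632f8b80e089"
       ".r77.cf2.rackcdn.com/openblas_0.2.18.tgz")

# download to a local file, then show what is inside before extracting
fname, _ = urllib.request.urlretrieve(URL, "openblas_0.2.18.tgz")
with tarfile.open(fname, "r:gz") as tar:
    tar.list()                        # inspect the layout first
    tar.extractall("openblas_0.2.18")
```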
Cheers,

Matthew

From matthew.brett at gmail.com  Wed Apr 20 14:41:49 2016
From: matthew.brett at gmail.com (Matthew Brett)
Date: Wed, 20 Apr 2016 11:41:49 -0700
Subject: [Numpy-discussion] linux wheels coming soon
In-Reply-To:
References:
Message-ID:

Hi,

On Wed, Apr 20, 2016 at 3:33 AM, Jens Nielsen wrote:
> Thanks
>
> I can confirm that the new narrow unicode build wheels of Scipy work as
> expected for my project.
> @Olivier Grisel Thanks for finding the Travis issue; it's probably worth
> considering switching the Travis build to 2.7.11 to avoid other similar
> issues.
>
> The old versions of numpy are very handy for downstream testing. I have
> verified that they work as expected in the Matplotlib tests here:
> https://github.com/jenshnielsen/matplotlib/tree/travisnowheelhouse where we
> are testing against numpy 1.6 as the earliest. This branch switches
> matplotlib from the scikit-image wheelhouse to manylinux wheels, which seems
> to work great.

Jens - any interest in working together on a good matplotlib build recipe?

Matthew

From mitchell at intertrust.com  Wed Apr 20 15:22:49 2016
From: mitchell at intertrust.com (Steve Mitchell)
Date: Wed, 20 Apr 2016 19:22:49 +0000
Subject: [Numpy-discussion] Do getitem/setitem already have GIL?
Message-ID: <57155BF8DF3FF541BF7E8515C12E957836F1DE6E@exch-1.corp.intertrust.com>

When writing custom PyArray_ArrFuncs getitem() and setitem(), do I need to acquire the GIL, or has it been done for me already by the caller?

--Steve

http://docs.scipy.org/doc/numpy/reference/c-api.array.html?highlight=allow_c_api#group-2
http://docs.scipy.org/doc/numpy/reference/internals.code-explanations.html?highlight=gil#function-call
http://docs.scipy.org/doc/numpy/reference/c-api.types-and-structures.html
https://docs.python.org/2/c-api/init.html#thread-state-and-the-global-interpreter-lock

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From sebastian at sipsolutions.net  Thu Apr 21 03:02:17 2016
From: sebastian at sipsolutions.net (Sebastian Berg)
Date: Thu, 21 Apr 2016 09:02:17 +0200
Subject: [Numpy-discussion] Do getitem/setitem already have GIL?
In-Reply-To: <57155BF8DF3FF541BF7E8515C12E957836F1DE6E@exch-1.corp.intertrust.com>
References: <57155BF8DF3FF541BF7E8515C12E957836F1DE6E@exch-1.corp.intertrust.com>
Message-ID: <1461222137.31520.9.camel@sipsolutions.net>

This is for a custom dtype? getitem and setitem work with objects and
must have the GIL in any case, so yes, you can safely assume this.

I think you probably have to set the flags correctly for some things to
work right, so that the PyDataType_REFCHK macro gives the right result.
Though frankly, I am just poking at it here, could be all wrong.

- Sebastian

On Wed, 2016-04-20 at 19:22 +0000, Steve Mitchell wrote:
> When writing custom PyArray_ArrFuncs getitem() and setitem(), do I
> need to acquire the GIL, or has it been done for me already by the
> caller?
>
> --Steve
>
> http://docs.scipy.org/doc/numpy/reference/c-api.array.html?highlight=
> allow_c_api#group-2
> http://docs.scipy.org/doc/numpy/reference/internals.code-explanations
> .html?highlight=gil#function-call
> http://docs.scipy.org/doc/numpy/reference/c-api.types-and-structures.
> html
> https://docs.python.org/2/c-api/init.html#thread-state-and-the-global
> -interpreter-lock
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> https://mail.scipy.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: This is a digitally signed message part
URL: 

From olivier.grisel at ensta.org  Thu Apr 21 04:47:50 2016
From: olivier.grisel at ensta.org (Olivier Grisel)
Date: Thu, 21 Apr 2016 10:47:50 +0200
Subject: [Numpy-discussion] linux wheels coming soon
In-Reply-To:
References:
Message-ID:

2016-04-20 16:57 GMT+02:00 Matthew Brett :
> On Wed, Apr 20, 2016 at 1:59 AM, Olivier Grisel
> wrote:
>> Thanks,
>>
>> I think next we could upgrade the travis configuration of numpy and
>> scipy to build and upload manylinux1 wheels to
>> http://travis-dev-wheels.scipy.org/ for downstream projects to test
>> against the master branch of numpy and scipy without having to build
>> those from source.
>>
>> However that would require publishing an official pre-built
>> libopenblas.so (+headers) archive or RPM package. That archive would
>> serve as the reference library to build scipy stack manylinux1 wheels.
>
> There's an OpenBLAS archive up at:
> http://ccdd0ebb5a931e58c7c5-aae005c4999d7244ac63632f8b80e089.r77.cf2.rackcdn.com/openblas_0.2.18.tgz

Thanks.

> - is that the right place for it?  It gets uploaded by the
> manylinux-builds travis run.

The only problem with rackspace cloud files is that as of now there is
no way to put a short domain name (CNAME) with https. Maybe we could
use the github "release" system on a github repo under the numpy
github organization to host it. Or alternatively use an external
binary file host that uses github credentials for upload rights, for
instance bintray (I have no experience with this yet though).

--
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel

From matthew.brett at gmail.com  Fri Apr 22 14:17:11 2016
From: matthew.brett at gmail.com (Matthew Brett)
Date: Fri, 22 Apr 2016 11:17:11 -0700
Subject: [Numpy-discussion] linux wheels coming soon
In-Reply-To:
References:
Message-ID:

On Thu, Apr 21, 2016 at 1:47 AM, Olivier Grisel
wrote:
> 2016-04-20 16:57 GMT+02:00 Matthew Brett :
>> On Wed, Apr 20, 2016 at 1:59 AM, Olivier Grisel
>> wrote:
>>> Thanks,
>>>
>>> I think next we could upgrade the travis configuration of numpy and
>>> scipy to build and upload manylinux1 wheels to
>>> http://travis-dev-wheels.scipy.org/ for downstream projects to test
>>> against the master branch of numpy and scipy without having to build
>>> those from source.
>>>
>>> However that would require publishing an official pre-built
>>> libopenblas.so (+headers) archive or RPM package. That archive would
>>> serve as the reference library to build scipy stack manylinux1 wheels.
>>
>> There's an OpenBLAS archive up at:
>> http://ccdd0ebb5a931e58c7c5-aae005c4999d7244ac63632f8b80e089.r77.cf2.rackcdn.com/openblas_0.2.18.tgz
>
> Thanks.
>
>> - is that the right place for it?  It gets uploaded by the
>> manylinux-builds travis run.
>
> The only problem with rackspace cloud files is that as of now there is
> no way to put a short domain name (CNAME) with https. Maybe we could
> use the github "release" system on a github repo under the numpy
> github organization to host it.
Or alternatively use an external
> binary file host that uses github credentials for upload rights, for
> instance bintray (I have no experience with this yet though).

The github releases idea sounds intriguing.  Do you have any
experience with that?  Are there good examples other than the API
documentation?

https://developer.github.com/v3/repos/releases/

Cheers,

Matthew

From olivier.grisel at ensta.org  Fri Apr 22 14:27:58 2016
From: olivier.grisel at ensta.org (Olivier Grisel)
Date: Fri, 22 Apr 2016 20:27:58 +0200
Subject: [Numpy-discussion] linux wheels coming soon
In-Reply-To:
References:
Message-ID:

2016-04-22 20:17 GMT+02:00 Matthew Brett :
>
> The github releases idea sounds intriguing.  Do you have any
> experience with that?  Are there good examples other than the API
> documentation?
>
> https://developer.github.com/v3/repos/releases/

I never used it but I assume we could create a numpy-openblas repo to
host official builds suitable for embedding in numpy wheels, for each
stable release of OpenBLAS.

There is also a travis deployment target:

https://docs.travis-ci.com/user/deployment/releases

I am not sure that the travis timeout is long enough to build
openblas. I believe so but I have not tried it myself yet.

--
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel

From matthew.brett at gmail.com  Fri Apr 22 14:35:24 2016
From: matthew.brett at gmail.com (Matthew Brett)
Date: Fri, 22 Apr 2016 11:35:24 -0700
Subject: [Numpy-discussion] linux wheels coming soon
In-Reply-To:
References:
Message-ID:

On Fri, Apr 22, 2016 at 11:27 AM, Olivier Grisel
wrote:
> 2016-04-22 20:17 GMT+02:00 Matthew Brett :
>>
>> The github releases idea sounds intriguing.  Do you have any
>> experience with that?  Are there good examples other than the API
>> documentation?
>>
>> https://developer.github.com/v3/repos/releases/
>
> I never used it but I assume we could create a numpy-openblas repo to
> host official builds suitable for embedding in numpy wheels, for each
> stable release of OpenBLAS.
>
> There is also a travis deployment target:
>
> https://docs.travis-ci.com/user/deployment/releases

Ah - thanks - that's a good resource.

> I am not sure that the travis timeout is long enough to build
> openblas. I believe so but I have not tried it myself yet.

Yes, the manylinux-builds repo currently builds openblas for each
entry in the build matrix, so it's easily within time:

https://travis-ci.org/matthew-brett/manylinux-builds/builds/123643313

It would be good to think of a way of supporting a set of libraries,
such as libpng, freetype, openblas.  We might need to support both
64-bit and 32-bit versions as well.  Then, some automated build
script would by default pick up the latest of these for numpy,
matplotlib etc.

Matthew

From caichinger at ubimet.com  Mon Apr 25 17:25:58 2016
From: caichinger at ubimet.com (Christian Aichinger)
Date: Mon, 25 Apr 2016 23:25:58 +0200 (CEST)
Subject: [Numpy-discussion] Adding the Linux Wheels for old releases breaks builds
In-Reply-To: <1297328701.10307133.1461613969671.JavaMail.root@ubimet.com>
Message-ID: <1763498184.10397690.1461619558865.JavaMail.root@ubimet.com>

Hi!
The addition of the Linux Wheels broke the build process of several of our Debian packages, which rely on NumPy installed inside virtualenvs. The problem stems from the pre-compiled shared libraries included in the Wheels; details are in .

I'm bringing this up here because these changes have implications that may not have been fully realized before.

The Wheel packages are great for end users; they make NumPy much more easily installable for average people. Unfortunately, they are precisely the wrong thing for anyone re-packaging NumPy (e.g. shipping it in a virtualenv inside RPM or Debian packages). For that use-case, you typically want to build NumPy yourself.[1] You could rely on this happening before; now a `--no-binary` argument for pip is needed to get that behavior.
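For concreteness, the fix on our side amounts to something like the
following in the packaging scripts (a minimal sketch; the version pin is
only an example):

```python
import subprocess

# force a local compile and refuse any binary wheel from PyPI;
# --no-binary exists since pip 7, and the numpy pin is illustrative
subprocess.check_call(
    ["pip", "install", "--no-binary", ":all:", "numpy==1.10.4"])
```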
The Wheel packages are great for end users, they make NumPy much more easily installable for average people. Unfortunately, they are precisely the wrong thing for anyone re-packaging NumPy (e.g. shipping it in a virtualenv inside RPM or Debian packages). For that use-case, you typically want to build NumPy yourself.[1] You could rely on this happening before, now a `--no-binary` argument for pip is needed to get that behavior. Put another way, the addition of the Wheels silently invalidated the assumption that a `pip install numpy` would locally compile the package. In the perfect world, anyone re-packaging NumPy would specify `--no-binary` if they want to enforce local building. However, currently, --no-binary is not in widespread use because it was never necessary before. I fully agree that the Wheels have great value, but adding them for old releases (back to 1.6.0 from 2011) suddenly changes the NumPy distribution for people who explicitly pinned an older version to avoid surprises. It invites downstream build failures (as happened to us) and adds externally-built shared objects in a way that people won't expect. I would propose to only add Wheels for new releases and to explicitly mention this issue in the release notes, so people are not blind-sided by it. I realize that this would be a painfully slow process, but silently breaking previously working use-cases for old releases seems worse to me (though it is difficult to estimate how many people are negatively affected by this). Regards, Chris [1]: You want to build locally for many reasons, e.g. to link against your system's libraries which you get security upgrades for; to have the confidence that you can actually build the package from source if need be; to be sure that the binaries really correspond to the source code; ... ------------------------------------------------------------------------------------------------------------------- UBIMET GmbH - weather matters Christian Aichinger ? IT A-1220 Wien ? Donau-City-Stra?e 11 ? Tel +43 1 263 11 22 ? Fax +43 1 263 11 22 219 caichinger at ubimet.com ? www.ubimet.com From matthew.brett at gmail.com Mon Apr 25 17:28:58 2016 From: matthew.brett at gmail.com (Matthew Brett) Date: Mon, 25 Apr 2016 14:28:58 -0700 Subject: [Numpy-discussion] Adding the Linux Wheels for old releases breaks builds In-Reply-To: <1763498184.10397690.1461619558865.JavaMail.root@ubimet.com> References: <1297328701.10307133.1461613969671.JavaMail.root@ubimet.com> <1763498184.10397690.1461619558865.JavaMail.root@ubimet.com> Message-ID: Hi, On Mon, Apr 25, 2016 at 2:25 PM, Christian Aichinger wrote: > Hi! > The addition of the Linux Wheels broke the build process of several of our Debian packages, which rely on NumPy installed inside virtualenvs. The problem stems from the pre-compiled shared libraries included in the Wheels, details are in . > > I'm bringing this up here because these changes have implications that may not have been fully realized before. > > The Wheel packages are great for end users, they make NumPy much more easily installable for average people. Unfortunately, they are precisely the wrong thing for anyone re-packaging NumPy (e.g. shipping it in a virtualenv inside RPM or Debian packages). For that use-case, you typically want to build NumPy yourself.[1] You could rely on this happening before, now a `--no-binary` argument for pip is needed to get that behavior. 
Put another way, the addition of the Wheels silently invalidated the assumption that a `pip install numpy` would locally compile the package. > > In the perfect world, anyone re-packaging NumPy would specify `--no-binary` if they want to enforce local building. However, currently, --no-binary is not in widespread use because it was never necessary before. > > I fully agree that the Wheels have great value, but adding them for old releases (back to 1.6.0 from 2011) suddenly changes the NumPy distribution for people who explicitly pinned an older version to avoid surprises. It invites downstream build failures (as happened to us) and adds externally-built shared objects in a way that people won't expect. > > I would propose to only add Wheels for new releases and to explicitly mention this issue in the release notes, so people are not blind-sided by it. I realize that this would be a painfully slow process, but silently breaking previously working use-cases for old releases seems worse to me (though it is difficult to estimate how many people are negatively affected by this). There's more discussion of this issue over on https://github.com/numpy/numpy/issues/7570 Cheers, Matthew From dsaumyajit at student.nitw.ac.in Tue Apr 26 05:35:14 2016 From: dsaumyajit at student.nitw.ac.in (Saumyajit Dey) Date: Tue, 26 Apr 2016 15:05:14 +0530 Subject: [Numpy-discussion] (no subject) Message-ID: Hi, This is Saumyajit Dey and I am looking forward to start contributing to NumPy. I have never contributed to any open source projects before so I would want to know some tips and guidelines to start contributing. Regards, Saumyajit Saumyajit Dey Junior Undergraduate Student: Department of Computer Science and Engineering National Institute of Technology Warangal (NITW), India Cell: +91-8885847028 -------------- next part -------------- An HTML attachment was scrubbed... URL: From pmhobson at gmail.com Tue Apr 26 12:05:29 2016 From: pmhobson at gmail.com (Paul Hobson) Date: Tue, 26 Apr 2016 09:05:29 -0700 Subject: [Numpy-discussion] (no subject) In-Reply-To: References: Message-ID: Saumyajit, Numpy's source code is hosted on Github. You can find the contributing guides there: https://github.com/numpy/numpy/blob/master/CONTRIBUTING.md -paul On Tue, Apr 26, 2016 at 2:35 AM, Saumyajit Dey < dsaumyajit at student.nitw.ac.in> wrote: > Hi, > > This is Saumyajit Dey and I am looking forward to start contributing to > NumPy. > > I have never contributed to any open source projects before so I would > want to know some tips and guidelines to start contributing. > > Regards, > Saumyajit > > Saumyajit Dey > Junior Undergraduate Student: > Department of Computer Science and Engineering > National Institute of Technology > Warangal (NITW), India > Cell: +91-8885847028 > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dsaumyajit at student.nitw.ac.in Tue Apr 26 12:10:59 2016 From: dsaumyajit at student.nitw.ac.in (Saumyajit Dey) Date: Tue, 26 Apr 2016 21:40:59 +0530 Subject: [Numpy-discussion] (no subject) In-Reply-To: References: Message-ID: ?Thanks a lot, Paul for the reply. I will? look into the contribution guidelines. Also could you please suggest some good reading resources for getting to know more about NumPy. 
Regards,
Saumyajit

Saumyajit Dey
Junior Undergraduate Student:
Department of Computer Science and Engineering
National Institute of Technology
Warangal (NITW), India
Cell: +91-8885847028

On Tue, Apr 26, 2016 at 9:35 PM, Paul Hobson wrote:

> Saumyajit,
>
> Numpy's source code is hosted on Github. You can find the contributing
> guides there:
> https://github.com/numpy/numpy/blob/master/CONTRIBUTING.md
>
> -paul
>
> On Tue, Apr 26, 2016 at 2:35 AM, Saumyajit Dey <
> dsaumyajit at student.nitw.ac.in> wrote:
>
>> Hi,
>>
>> This is Saumyajit Dey and I am looking forward to start contributing to
>> NumPy.
>>
>> I have never contributed to any open source projects before so I would
>> want to know some tips and guidelines to start contributing.
>>
>> Regards,
>> Saumyajit
>>
>> Saumyajit Dey
>> Junior Undergraduate Student:
>> Department of Computer Science and Engineering
>> National Institute of Technology
>> Warangal (NITW), India
>> Cell: +91-8885847028
>>
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at scipy.org
>> https://mail.scipy.org/mailman/listinfo/numpy-discussion
>>
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> https://mail.scipy.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From maniteja.modesty067 at gmail.com  Wed Apr 27 08:35:45 2016
From: maniteja.modesty067 at gmail.com (Maniteja Nandana)
Date: Wed, 27 Apr 2016 18:05:45 +0530
Subject: [Numpy-discussion] (no subject)
In-Reply-To:
References:
Message-ID:

Hi,

Welcome! It would be a good exercise to look at the documentation and
tutorial for Numpy at http://docs.scipy.org/doc/

Also the lectures at www.scipy-lectures.org might be an interesting
introduction to scientific Python and the numpy stack.

Hope it helps.

Happy learning!

Cheers,
Maniteja.
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion at scipy.org
https://mail.scipy.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From hdante.lnls at gmail.com  Wed Apr 27 10:35:15 2016
From: hdante.lnls at gmail.com (Henrique Almeida)
Date: Wed, 27 Apr 2016 11:35:15 -0300
Subject: [Numpy-discussion] Help with bit arrays
Message-ID:

Hello, what's the current status on numpy for loading bit-arrays ?

I'm currently unable to correctly load black and white (1-bit) TIFF
images. Code example follows:

from PIL import Image
import numpy
from matplotlib import pyplot

img = Image.open('oi-00.tiff')
a = numpy.array(img)

^ does not work for 1-bit TIFF images

PIL source shows that it incorrectly uses typestr == '|b1'. I tried to
change this to '|t1', but I get :

TypeError: data type "|t1" not understood

My goal is to make the above code to work for black and white TIFF
images the same way it works for grayscale images. Any help ?

From dsaumyajit at student.nitw.ac.in  Wed Apr 27 12:41:11 2016
From: dsaumyajit at student.nitw.ac.in (Saumyajit Dey)
Date: Wed, 27 Apr 2016 22:11:11 +0530
Subject: [Numpy-discussion] (no subject)
In-Reply-To:
References:
Message-ID:

Hi,

Thanks a lot for the reply. I am looking into the documentation already.
Also is there any guide as to how the source code of Numpy is organised?

For example, when I write

np.power(2,3)

what is the workflow in terms of functions in different modules being
called?

Regards,
Saumyajit
Saumyajit Dey
Junior Undergraduate Student:
Department of Computer Science and Engineering
National Institute of Technology
Warangal (NITW), India
Cell: +91-8885847028

On Wed, Apr 27, 2016 at 6:05 PM, Maniteja Nandana <
maniteja.modesty067 at gmail.com> wrote:

> Hi,
>
> Welcome! It would be a good exercise to look at the documentation and
> tutorial for Numpy at http://docs.scipy.org/doc/
>
> Also the lectures at www.scipy-lectures.org might be an interesting
> introduction to scientific Python and the numpy stack.
>
> Hope it helps.
>
> Happy learning!
>
> Cheers,
> Maniteja.
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> https://mail.scipy.org/mailman/listinfo/numpy-discussion
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> https://mail.scipy.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From sebastian at sipsolutions.net  Wed Apr 27 13:12:27 2016
From: sebastian at sipsolutions.net (Sebastian Berg)
Date: Wed, 27 Apr 2016 19:12:27 +0200
Subject: [Numpy-discussion] (no subject)
In-Reply-To:
References:
Message-ID: <1461777147.10852.31.camel@sipsolutions.net>

On Wed, 2016-04-27 at 22:11 +0530, Saumyajit Dey wrote:
> Hi,
>
> Thanks a lot for the reply. I am looking into the documentation
> already. Also is there any guide as to how the source code of Numpy
> is organised?
>
> For example, when I write
>
> np.power(2,3)
>
> what is the workflow in terms of functions in different modules being
> called?
>

No, there is not much. There are different paths/possibilities,
sometimes even intermingled. These are some:

1. Pure python functions, e.g. np.stack, np.delete, ... They are not
hard to find/figure out (though some details may be); frankly, I
usually just use the ?? magic in ipython.
2. Python shims for attributes; attributes usually go to methods.c in
numpy/core/src/multiarray.
3. General C-functions (reshape, etc.) are usually in some specialized
file in numpy/core/src/multiarray but wrapped by multiarraymodule.c, so
you can backtrace from there. The exact calls can get pretty complex
(they can even call back into python).
4. One important category (also tricky) are ufuncs. They form their own
building block in the code base. The whole interplay of things is quite
complex, so unless you need something specific, it is probably enough
to understand what they can do for you and that they wrap C-functions
working on a single axis in some sense. (code in numpy/core/src/umath)

Frankly, I often just "git grep ..." to find the right place to look
for something. It is not impossible to understand the logic behind the
files and what calls what, but I would claim it is usually faster to
grep for it if you are interested in something specific.

There are some more arcane things, such as code generation, but that is
easier to ask for/figure out for a specific problem. Things such as
what happens when a ufunc is called, and how the ufunc is created in
the first place, are non-trivial.

NumPy has a few rather distinct building blocks: ufuncs, the iterator,
general C-based shape/container functions, general python functions,
linear algebra, fft, polynomials, .... I would argue that finding
something that interests you and trying to figure that out and asking
us about it explicitly is probably best.
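To see the split for your np.power example directly, a quick sketch
(just standard introspection on public attributes, nothing special):

```python
import inspect
import numpy as np

# np.power is not a Python function at all but a ufunc object built in C
# (numpy/core/src/umath), so there is no Python source to jump to:
print(type(np.power))                 # <class 'numpy.ufunc'>
print(np.power.nin, np.power.nout)    # 2 1 -> two inputs, one output

# a pure-python function like np.stack does have source you can read:
print(inspect.getsourcefile(np.stack))
```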
Honestly, I think all of us devs have at least two things in the above
list we know almost nothing about. E.g. you don't need to understand
details of the FFT implementation unless you want to actually change
something there.

There are some "easy issues" marked in the github issue list, which may
be worth a shot if you like to just dive in. You could poke at one you
find interesting and then ask us (we might have tagged something as
"easy" but I would not guarantee all of them are; sometimes there are
unexpected difficulties, or it is easy only if you already know where to
look).

- Sebastian

> Regards,
> Saumyajit
>
> Saumyajit Dey
> Junior Undergraduate Student:
> Department of Computer Science and Engineering
> National Institute of Technology
> Warangal (NITW), India
> Cell: +91-8885847028
>
> On Wed, Apr 27, 2016 at 6:05 PM, Maniteja Nandana <
> maniteja.modesty067 at gmail.com> wrote:
> > Hi,
> >
> > Welcome! It would be a good exercise to look at the documentation
> > and tutorial for Numpy at http://docs.scipy.org/doc/
> > Also the lectures at www.scipy-lectures.org might be an interesting
> > introduction to scientific Python and the numpy stack.
> > Hope it helps.
> > Happy learning!
> > Cheers,
> > Maniteja.
> > _______________________________________________
> > NumPy-Discussion mailing list
> > NumPy-Discussion at scipy.org
> > https://mail.scipy.org/mailman/listinfo/numpy-discussion
> >
> > _______________________________________________
> > NumPy-Discussion mailing list
> > NumPy-Discussion at scipy.org
> > https://mail.scipy.org/mailman/listinfo/numpy-discussion
> >
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> https://mail.scipy.org/mailman/listinfo/numpy-discussion

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: This is a digitally signed message part
URL: 

From barberchris01 at gmail.com  Wed Apr 27 18:36:53 2016
From: barberchris01 at gmail.com (Chris Barber)
Date: Wed, 27 Apr 2016 15:36:53 -0700
Subject: [Numpy-discussion] nan version of einsum
In-Reply-To:
References:
Message-ID:

Hi,

Looks like I was a little confused. It appears that the nan* versions of
functions in numpy just substitute the NaNs in a copy of the original
array and so are just convenience methods. I was imagining that they were
optimized and handling the NaNs at a lower level. It looks like the
"bottleneck" package tries to do this for nansum, nanprod, etc, but I
don't know if it is able to take advantage of SSE or not.

Anyways, maybe if I elaborate someone can offer suggestions.

My application is: given a large N-dimensional array, and a selected
(N-1)-dimensional slice of it, compute the Pearson correlation of that
slice versus all other (N-1)-dimensional slices of the array, omitting
NaN's from the calculation.
Ignoring NaN's for a moment, here is the slow and obvious way to do it: from scipy.stats import pearsonr import numpy as np def corrs(data,index,dim): seed = data.take(index,dim) res = np.zeros(data.shape[dim]) for i in range(0,data.shape[dim]): res[i] = pearsonr(seed, data.take(i,dim))[0] return res Doing all the math by hand and using einsum, there is an extremely fast (though fairly cryptic) way of doing this: def corrs2(data, index, axis): seed = data.take([index], axis=axis) sdims = range(0,seed.ndim) ddims = range(0,data.ndim) sample_axes = np.array([i for i in ddims if i != axis]) seed_mean = np.einsum(seed, sdims, []) / seed.size data_mean = np.einsum(data, ddims, [axis]) / seed.size data_mean.shape = tuple(data.shape[i] if i == axis else 1 for i in ddims) # restore dims after einsum seed_dev = np.einsum(seed-seed_mean, sdims, sdims) numerator = np.einsum(seed_dev, ddims, data, ddims, [axis]) numerator -= np.einsum(seed_dev, ddims, data_mean, ddims, [axis]) denominator = np.einsum(data, ddims, data, ddims, [axis]) denominator += -2.0*np.einsum(data, ddims, data_mean, ddims, [axis]) denominator += np.sum(data_mean**2, axis=sample_axes) * seed.size denominator *= np.einsum(seed_dev**2, sdims, [axis]) denominator = np.sqrt(denominator) return np.clip(numerator / denominator, -1.0, 1.0) It also doesn't need to make a copy of the array. Re-introducing the requirement to handle NaNs, though, I couldn't find any option besides making a mask array and introducing that explicitly into the calculations. That's why I was imagining an optimized "naneinsum." Are there any existing ways of doing sums and products on numpy arrays that have a fast way of handling NaNs? Or is a mask array the best thing to hope for? I think that for this problem that I could transpose and reshape the N-dimensional array down to 2-dimensions without making an array copy, which might make it easier to interface with some optimizing package that doesn't support multidimensional arrays fully. I am fairly new to making numpy/python fast, and coming from MATLAB am very impressed, though there are a bewlidering number of options when it comes to trying to optimize. Thanks, Chris On Tue, Apr 19, 2016 at 11:12 AM, Chris Barber wrote: > Is there any interest in a nan-ignoring version of einsum a la nansum, > nanprod, etc? Any idea how difficult it would be to implement? > > - Chris > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Permafacture at gmail.com Wed Apr 27 19:48:13 2016 From: Permafacture at gmail.com (Elliot Hallmark) Date: Wed, 27 Apr 2016 18:48:13 -0500 Subject: [Numpy-discussion] Is this a known error in Ipython with numpy? Message-ID: Hello, I haven't worked hard yet to create a minimal runnable (reproduce-able) example but I wanted to check if this sounds familiar to anyone. I have a pretty involved program that resizes arrays in place with arr.resize. When I run it with python it completes and gives the expected result. When I run it in Ipython, I get the following error: ``` ---> 43 self._buffer.resize((count,)+self._dim) 44 45 ValueError: cannot resize an array that references or is referenced by another array in this way. Use the resize function ``` It was consistently doing this on the same array, after resizing two others before hand, even after rebooting. But trying to track it down it goes away even if I undo everything I did to try and track it down. Does this sound familiar? -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From argriffi at ncsu.edu Wed Apr 27 21:49:16 2016 From: argriffi at ncsu.edu (Alexander Griffing) Date: Wed, 27 Apr 2016 21:49:16 -0400 Subject: [Numpy-discussion] Is this a known error in Ipython with numpy? In-Reply-To: References: Message-ID: On Wed, Apr 27, 2016 at 7:48 PM, Elliot Hallmark wrote: > Hello, > > I haven't worked hard yet to create a minimal runnable (reproduce-able) > example but I wanted to check if this sounds familiar to anyone. > > I have a pretty involved program that resizes arrays in place with > arr.resize. When I run it with python it completes and gives the expected > result. When I run it in Ipython, I get the following error: > > ``` > ---> 43 self._buffer.resize((count,)+self._dim) > 44 > 45 > > ValueError: cannot resize an array that references or is referenced > by another array in this way. Use the resize function > ``` > > It was consistently doing this on the same array, after resizing two others > before hand, even after rebooting. But trying to track it down it goes away > even if I undo everything I did to try and track it down. > > Does this sound familiar? > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > I'd guess the issue is that your environment (IPython or IDLE or whatever) is keeping a reference to your array, for example so that you can refer to earlier outputs using the underscore _. Here's how I was able to reproduce the problem: This is OK: >>> import numpy as np >>> a = np.arange(10) >>> a.resize(3) >>> a array([0, 1, 2]) This gives an error: >>> import numpy as np >>> a = np.arange(10) >>> a array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) >>> a.resize(3) Traceback (most recent call last): File "", line 1, in ValueError: cannot resize an array that references or is referenced by another array in this way. Use the resize function Cheers, Alex From nathan12343 at gmail.com Wed Apr 27 21:54:37 2016 From: nathan12343 at gmail.com (Nathan Goldbaum) Date: Wed, 27 Apr 2016 20:54:37 -0500 Subject: [Numpy-discussion] Is this a known error in Ipython with numpy? In-Reply-To: References: Message-ID: On Wed, Apr 27, 2016 at 8:49 PM, Alexander Griffing wrote: > On Wed, Apr 27, 2016 at 7:48 PM, Elliot Hallmark > wrote: > > Hello, > > > > I haven't worked hard yet to create a minimal runnable (reproduce-able) > > example but I wanted to check if this sounds familiar to anyone. > > > > I have a pretty involved program that resizes arrays in place with > > arr.resize. When I run it with python it completes and gives the > expected > > result. When I run it in Ipython, I get the following error: > > > > ``` > > ---> 43 self._buffer.resize((count,)+self._dim) > > 44 > > 45 > > > > ValueError: cannot resize an array that references or is referenced > > by another array in this way. Use the resize function > > ``` > > > > It was consistently doing this on the same array, after resizing two > others > > before hand, even after rebooting. But trying to track it down it goes > away > > even if I undo everything I did to try and track it down. > You might find the %xdel magic to be useful: In [1]: %xdel? Docstring: Delete a variable, trying to clear it from anywhere that IPython's machinery has references to it. By default, this uses the identity of the named object in the user namespace to remove references held under other names. The object is also removed from the output history. 
Options -n : Delete the specified name from all namespaces, without checking their identity. > > > > Does this sound familiar? > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > I'd guess the issue is that your environment (IPython or IDLE or > whatever) is keeping a reference to your array, for example so that > you can refer to earlier outputs using the underscore _. Here's how I > was able to reproduce the problem: > > This is OK: > > >>> import numpy as np > >>> a = np.arange(10) > >>> a.resize(3) > >>> a > array([0, 1, 2]) > > This gives an error: > > >>> import numpy as np > >>> a = np.arange(10) > >>> a > array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) > >>> a.resize(3) > Traceback (most recent call last): > File "", line 1, in > ValueError: cannot resize an array that references or is referenced > by another array in this way. Use the resize function > > Cheers, > Alex > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Permafacture at gmail.com Wed Apr 27 22:42:23 2016 From: Permafacture at gmail.com (Elliot Hallmark) Date: Wed, 27 Apr 2016 21:42:23 -0500 Subject: [Numpy-discussion] Is this a known error in Ipython with numpy? In-Reply-To: References: Message-ID: Hey Alex. Thanks. I was aware of that. However, I was simply doing `run myscript.py` on the first input line of the Ipython shell, so I did not expect this behaviour. The ipython list would be a better place to ask I guess, since the behaviour on numpy's part is to be expected. Just wondering if any numpy folks had seen this before. Elliot -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben.v.root at gmail.com Thu Apr 28 13:50:10 2016 From: ben.v.root at gmail.com (Benjamin Root) Date: Thu, 28 Apr 2016 13:50:10 -0400 Subject: [Numpy-discussion] Possible negative impact of future change to None comparison Message-ID: Working on closing out some bug reports at work, and I ran into one about comparisons to 'None' will result in elementwise object comparison in the future. Now, I totally get the idea behind the change, and I am not here to argue that decision. However, I have come across a situation where the change might have an unexpected negative consequence. Consider the following: p = Pool(min(len(tiles), maxprocs)) res = p.map(_wrap_tilesum, izip(repeat(args.varname), tiles, repeat(request_start), repeat(args.timeLen), repeat(args.srcdir), repeat(args.tarredInputs), repeat(args.dataset))) p.close() p.join() (tiles, tile_start_dates, tile_end_dates, tile_lons, tile_lats) = zip(*res) if None in tiles: logging.critical("At least one tile was invalid. Aborting") raise Exception("Invalid data retrieved!") Essentailly, in the nominal case, "tiles" would be a list of numpy arrays. However, my error handling is such that if one of my subprocesses errors out, then it returns a None instead of a numpy array. So, all I am doing is testing to see if any of the items in the "tiles" list is None. I have zero desire to actually compare None with the elements in the arrays that happens to be in the list. Of course, I can rewrite this if statement as `any(tile is None for tile in tiles)`, but that isn't my point. Is `if None in tiles:` an unreasonable idiom? Cheers! 
Ben Root -------------- next part -------------- An HTML attachment was scrubbed... URL: From hdante.lnls at gmail.com Fri Apr 29 11:27:38 2016 From: hdante.lnls at gmail.com (Henrique Almeida) Date: Fri, 29 Apr 2016 12:27:38 -0300 Subject: [Numpy-discussion] Help with bit arrays In-Reply-To: References: Message-ID: Any help with this problem ? 2016-04-27 11:35 GMT-03:00 Henrique Almeida : > Hello, what's the current status on numpy for loading bit-arrays ? > > I'm currently unable to correctly load black and white (1-bit) TIFF > images. Code example follows: > > from PIL import Image > import numpy > from matplotlib import pyplot > > img = Image.open('oi-00.tiff') > a = numpy.array(img) > > ^ does not work for 1-bit TIFF images > > PIL source shows that it incorrectly uses typestr == '|b1'. I tried to > change this to '|t1', but I get : > > TypeError: data type "|t1" not understood > > My goal is to make the above code to work for black and white TIFF > images the same way it works for grayscale images. Any help ? From pmhobson at gmail.com Fri Apr 29 12:06:24 2016 From: pmhobson at gmail.com (Paul Hobson) Date: Fri, 29 Apr 2016 09:06:24 -0700 Subject: [Numpy-discussion] Help with bit arrays In-Reply-To: References: Message-ID: Does using pyplot.imgread work? On Fri, Apr 29, 2016 at 8:27 AM, Henrique Almeida wrote: > Any help with this problem ? > > 2016-04-27 11:35 GMT-03:00 Henrique Almeida : > > Hello, what's the current status on numpy for loading bit-arrays ? > > > > I'm currently unable to correctly load black and white (1-bit) TIFF > > images. Code example follows: > > > > from PIL import Image > > import numpy > > from matplotlib import pyplot > > > > img = Image.open('oi-00.tiff') > > a = numpy.array(img) > > > > ^ does not work for 1-bit TIFF images > > > > PIL source shows that it incorrectly uses typestr == '|b1'. I tried to > > change this to '|t1', but I get : > > > > TypeError: data type "|t1" not understood > > > > My goal is to make the above code to work for black and white TIFF > > images the same way it works for grayscale images. Any help ? > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hdante.lnls at gmail.com Fri Apr 29 12:31:26 2016 From: hdante.lnls at gmail.com (Henrique Almeida) Date: Fri, 29 Apr 2016 13:31:26 -0300 Subject: [Numpy-discussion] Help with bit arrays In-Reply-To: References: Message-ID: Paul, yes, imread() worked for reading the black and white TIFF. The situation improved, but now, there seems to be some problem with the color map. Example code: #!/usr/bin/env python3 import numpy from matplotlib import pyplot, cm img = pyplot.imread('oi-00.tiff') pyplot.imshow(img) pyplot.colorbar() pyplot.show() The code can open both 1-bit and 8-bit images, but only with 8 bits the image is shown with the colormap colors. The 1 bit image is shown as black and white. The questions: 1) Should Image.open() behave like pyplot.imread() ? Is this a bug in PIL ? 2) Why isn't the colormap working with black and white images ? 2016-04-29 13:06 GMT-03:00 Paul Hobson : > Does using pyplot.imgread work? > > On Fri, Apr 29, 2016 at 8:27 AM, Henrique Almeida > wrote: >> >> Any help with this problem ? >> >> 2016-04-27 11:35 GMT-03:00 Henrique Almeida : >> > Hello, what's the current status on numpy for loading bit-arrays ? 
2016-04-29 13:06 GMT-03:00 Paul Hobson :
> Does using pyplot.imread work?
>
> On Fri, Apr 29, 2016 at 8:27 AM, Henrique Almeida 
> wrote:
>>
>> Any help with this problem ?
>>
>> 2016-04-27 11:35 GMT-03:00 Henrique Almeida :
>> > Hello, what's the current status on numpy for loading bit-arrays ?
>> >
>> > I'm currently unable to correctly load black and white (1-bit) TIFF
>> > images. Code example follows:
>> >
>> > from PIL import Image
>> > import numpy
>> > from matplotlib import pyplot
>> >
>> > img = Image.open('oi-00.tiff')
>> > a = numpy.array(img)
>> >
>> > ^ does not work for 1-bit TIFF images
>> >
>> > PIL source shows that it incorrectly uses typestr == '|b1'. I tried to
>> > change this to '|t1', but I get :
>> >
>> > TypeError: data type "|t1" not understood
>> >
>> > My goal is to make the above code to work for black and white TIFF
>> > images the same way it works for grayscale images. Any help ?
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at scipy.org
>> https://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> https://mail.scipy.org/mailman/listinfo/numpy-discussion
>

From ben.v.root at gmail.com  Fri Apr 29 12:38:57 2016
From: ben.v.root at gmail.com (Benjamin Root)
Date: Fri, 29 Apr 2016 12:38:57 -0400
Subject: [Numpy-discussion] Help with bit arrays
In-Reply-To: 
References: 
Message-ID: 

What kind of array is "img"? What is its dtype and shape?

plt.imshow() will use the default colormap for matplotlib if the given
array is just 2D. But if it is 3D (a 2D array of RGB[A] channels), then
it will forego the colormap and utilize that for the colors. It knows
nothing of the colormap contained in the TIFF.

Ben Root


On Fri, Apr 29, 2016 at 12:31 PM, Henrique Almeida 
wrote:
> Paul, yes, imread() worked for reading the black and white TIFF. The
> situation improved, but now, there seems to be some problem with the
> color map. Example code:
>
> #!/usr/bin/env python3
> import numpy
> from matplotlib import pyplot, cm
>
> img = pyplot.imread('oi-00.tiff')
> pyplot.imshow(img)
> pyplot.colorbar()
> pyplot.show()
>
> The code can open both 1-bit and 8-bit images, but only with 8 bits
> the image is shown with the colormap colors. The 1 bit image is shown
> as black and white.
>
> The questions:
> 1) Should Image.open() behave like pyplot.imread() ? Is this a bug in PIL ?
> 2) Why isn't the colormap working with black and white images ?
>
> 2016-04-29 13:06 GMT-03:00 Paul Hobson :
> > Does using pyplot.imread work?
> >
> > On Fri, Apr 29, 2016 at 8:27 AM, Henrique Almeida
> > 
> > wrote:
> >>
> >> Any help with this problem ?
> >>
> >> 2016-04-27 11:35 GMT-03:00 Henrique Almeida :
> >> > Hello, what's the current status on numpy for loading bit-arrays ?
> >> _______________________________________________
> >> NumPy-Discussion mailing list
> >> NumPy-Discussion at scipy.org
> >> https://mail.scipy.org/mailman/listinfo/numpy-discussion
> >
> >
> >
> > _______________________________________________
> > NumPy-Discussion mailing list
> > NumPy-Discussion at scipy.org
> > https://mail.scipy.org/mailman/listinfo/numpy-discussion
> >
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> https://mail.scipy.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From hdante.lnls at gmail.com  Fri Apr 29 12:43:06 2016
From: hdante.lnls at gmail.com (Henrique Almeida)
Date: Fri, 29 Apr 2016 13:43:06 -0300
Subject: [Numpy-discussion] Help with bit arrays
In-Reply-To: 
References: 
Message-ID: 

For 1 bit images, the resulting array has shape (256, 256, 4). For
grayscale images, the shape is (256, 256). So the image seems to have
been loaded as a color image.

2016-04-29 13:38 GMT-03:00 Benjamin Root :
>> What kind of array is "img"? What is its dtype and shape?
>>
>> plt.imshow() will use the default colormap for matplotlib if the given
>> array is just 2D. But if it is 3D (a 2D array of RGB[A] channels), then
>> it will forego the colormap and utilize that for the colors. It knows
>> nothing of the colormap contained in the TIFF.
>>
>> Ben Root
>>
>>
>> On Fri, Apr 29, 2016 at 12:31 PM, Henrique Almeida 
>> wrote:
>>>
>>> Paul, yes, imread() worked for reading the black and white TIFF. The
>>> situation improved, but now, there seems to be some problem with the
>>> color map. Example code:
>>>
>>> #!/usr/bin/env python3
>>> import numpy
>>> from matplotlib import pyplot, cm
>>>
>>> img = pyplot.imread('oi-00.tiff')
>>> pyplot.imshow(img)
>>> pyplot.colorbar()
>>> pyplot.show()
>>>
>>> The code can open both 1-bit and 8-bit images, but only with 8 bits
>>> the image is shown with the colormap colors. The 1 bit image is shown
>>> as black and white.
>>>
>>> The questions:
>>> 1) Should Image.open() behave like pyplot.imread() ? Is this a bug in PIL ?
>>> 2) Why isn't the colormap working with black and white images ?
>>>
>>> 2016-04-29 13:06 GMT-03:00 Paul Hobson :
>>> > Does using pyplot.imread work?
>>> >
>>> > On Fri, Apr 29, 2016 at 8:27 AM, Henrique Almeida
>>> > 
>>> > wrote:
>>> >>
>>> >> Any help with this problem ?
>>> >>
>>> >> 2016-04-27 11:35 GMT-03:00 Henrique Almeida :
>>> >> > Hello, what's the current status on numpy for loading bit-arrays ?
>>> >> >
>>> >> > I'm currently unable to correctly load black and white (1-bit) TIFF
>>> >> > images. Code example follows:
>>> >> >
>>> >> > from PIL import Image
>>> >> > import numpy
>>> >> > from matplotlib import pyplot
>>> >> >
>>> >> > img = Image.open('oi-00.tiff')
>>> >> > a = numpy.array(img)
>>> >> >
>>> >> > ^ does not work for 1-bit TIFF images
>>> >> >
>>> >> > PIL source shows that it incorrectly uses typestr == '|b1'. I tried
>>> >> > to change this to '|t1', but I get :
>>> >> >
>>> >> > TypeError: data type "|t1" not understood
>>> >> >
>>> >> > My goal is to make the above code to work for black and white TIFF
>>> >> > images the same way it works for grayscale images. Any help ?
>>> >> _______________________________________________
>>> >> NumPy-Discussion mailing list
>>> >> NumPy-Discussion at scipy.org
>>> >> https://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> https://mail.scipy.org/mailman/listinfo/numpy-discussion
>

From hdante.lnls at gmail.com  Fri Apr 29 12:47:07 2016
From: hdante.lnls at gmail.com (Henrique Almeida)
Date: Fri, 29 Apr 2016 13:47:07 -0300
Subject: [Numpy-discussion] Help with bit arrays
In-Reply-To: 
References: 
Message-ID: 

I think in any case the result is unexpected: PIL is loading garbage
from memory when loading black and white images because it sends the
wrong buffer size, and matplotlib correctly loads the black and white
image but stores it in a 3D array.

2016-04-29 13:43 GMT-03:00 Henrique Almeida :
> For 1 bit images, the resulting array has shape (256, 256, 4). For
> grayscale images, the shape is (256, 256). So the image seems to have
> been loaded as a color image.
>
> 2016-04-29 13:38 GMT-03:00 Benjamin Root :
>> What kind of array is "img"? What is its dtype and shape?
>>
>> plt.imshow() will use the default colormap for matplotlib if the given
>> array is just 2D. But if it is 3D (a 2D array of RGB[A] channels), then
>> it will forego the colormap and utilize that for the colors. It knows
>> nothing of the colormap contained in the TIFF.
>>
>> Ben Root
>>
>>
>> On Fri, Apr 29, 2016 at 12:31 PM, Henrique Almeida 
>> wrote:
>>>
>>> Paul, yes, imread() worked for reading the black and white TIFF. The
>>> situation improved, but now, there seems to be some problem with the
>>> color map. Example code:
>>>
>>> #!/usr/bin/env python3
>>> import numpy
>>> from matplotlib import pyplot, cm
>>>
>>> img = pyplot.imread('oi-00.tiff')
>>> pyplot.imshow(img)
>>> pyplot.colorbar()
>>> pyplot.show()
>>>
>>> The code can open both 1-bit and 8-bit images, but only with 8 bits
>>> the image is shown with the colormap colors. The 1 bit image is shown
>>> as black and white.
>>>
>>> The questions:
>>> 1) Should Image.open() behave like pyplot.imread() ? Is this a bug in PIL ?
>>> 2) Why isn't the colormap working with black and white images ?
>>>
>>> 2016-04-29 13:06 GMT-03:00 Paul Hobson :
>>> > Does using pyplot.imread work?
>>> >
>>> > On Fri, Apr 29, 2016 at 8:27 AM, Henrique Almeida
>>> > 
>>> > wrote:
>>> >>
>>> >> Any help with this problem ?
>>> >>
>>> >> 2016-04-27 11:35 GMT-03:00 Henrique Almeida :
>>> >> > Hello, what's the current status on numpy for loading bit-arrays ?
>>> >> >
>>> >> > I'm currently unable to correctly load black and white (1-bit) TIFF
>>> >> > images. Code example follows:
>>> >> >
>>> >> > from PIL import Image
>>> >> > import numpy
>>> >> > from matplotlib import pyplot
>>> >> >
>>> >> > img = Image.open('oi-00.tiff')
>>> >> > a = numpy.array(img)
>>> >> >
>>> >> > ^ does not work for 1-bit TIFF images
>>> >> >
>>> >> > PIL source shows that it incorrectly uses typestr == '|b1'. I tried
I tried >>> >> > to >>> >> > change this to '|t1', but I get : >>> >> > >>> >> > TypeError: data type "|t1" not understood >>> >> > >>> >> > My goal is to make the above code to work for black and white TIFF >>> >> > images the same way it works for grayscale images. Any help ? >>> >> _______________________________________________ >>> >> NumPy-Discussion mailing list >>> >> NumPy-Discussion at scipy.org >>> >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >>> > >>> > >>> > >>> > _______________________________________________ >>> > NumPy-Discussion mailing list >>> > NumPy-Discussion at scipy.org >>> > https://mail.scipy.org/mailman/listinfo/numpy-discussion >>> > >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> From ben.v.root at gmail.com Fri Apr 29 13:22:21 2016 From: ben.v.root at gmail.com (Benjamin Root) Date: Fri, 29 Apr 2016 13:22:21 -0400 Subject: [Numpy-discussion] Help with bit arrays In-Reply-To: References: Message-ID: What behavior is unexpected? For the (256, 256) images, matplotlib applies its default colormap to the grayscale (v1.5 and previous, that is jet, +v2.0, that will be viridis). The numpy array as loaded from PIL will never carry any additional information that came from the TIFF. As for PIL, it will return an RGB[A] array if there is colormap data in the TIFF. If there is no colormap specified in the TIFF, it'll give you a simple 2D array. Now, maybe you'd like it to always return an RGB[A] array, but without a colormap in the TIFF, it makes sense to return the data as-is. This makes sense for people treating the TIFF as a data format rather than a visualization data format. Ben Root On Fri, Apr 29, 2016 at 12:47 PM, Henrique Almeida wrote: > I think in any case, the result is unexpected, PIL is loading garbage > from memory when loading black and white images because it sends the > wrong buffer size, and matplotlib correctly loads the black and white > image, but stores it in a 3D array. > > 2016-04-29 13:43 GMT-03:00 Henrique Almeida : > > For 1 bit images, the resulting array has shape (256, 256, 4). For > > grayscale images, the shape is (256, 256). So the image seems to have > > been loaded as a color image. > > > > 2016-04-29 13:38 GMT-03:00 Benjamin Root : > >> What kind of array is "img"? What is its dtype and shape? > >> > >> plt.imshow() will use the default colormap for matplotlib if the given > array > >> is just 2D. But if it is 3D (a 2D array of RGB[A] channels), then it > will > >> forego the colormap and utilize that for the colors. It knows nothing > of the > >> colormap contained in the TIFF. > >> > >> Ben Root > >> > >> > >> On Fri, Apr 29, 2016 at 12:31 PM, Henrique Almeida < > hdante.lnls at gmail.com> > >> wrote: > >>> > >>> Paul, yes, imread() worked for reading the black and white TIFF. The > >>> situation improved, but now, there seems to be some problem with the > >>> color map. 
> >>>
> >>> #!/usr/bin/env python3
> >>> import numpy
> >>> from matplotlib import pyplot, cm
> >>>
> >>> img = pyplot.imread('oi-00.tiff')
> >>> pyplot.imshow(img)
> >>> pyplot.colorbar()
> >>> pyplot.show()
> >>>
> >>> The code can open both 1-bit and 8-bit images, but only with 8 bits
> >>> the image is shown with the colormap colors. The 1 bit image is shown
> >>> as black and white.
> >>>
> >>> The questions:
> >>> 1) Should Image.open() behave like pyplot.imread() ? Is this a bug in PIL ?
> >>> 2) Why isn't the colormap working with black and white images ?
> >>>
> >>> 2016-04-29 13:06 GMT-03:00 Paul Hobson :
> >>> > Does using pyplot.imread work?
> >>> >
> >>> > On Fri, Apr 29, 2016 at 8:27 AM, Henrique Almeida
> >>> > 
> >>> > wrote:
> >>> >>
> >>> >> Any help with this problem ?
> >>> >>
> >>> >> 2016-04-27 11:35 GMT-03:00 Henrique Almeida :
> >>> >> > Hello, what's the current status on numpy for loading bit-arrays ?
> >>> >> >
> >>> >> > I'm currently unable to correctly load black and white (1-bit) TIFF
> >>> >> > images. Code example follows:
> >>> >> >
> >>> >> > from PIL import Image
> >>> >> > import numpy
> >>> >> > from matplotlib import pyplot
> >>> >> >
> >>> >> > img = Image.open('oi-00.tiff')
> >>> >> > a = numpy.array(img)
> >>> >> >
> >>> >> > ^ does not work for 1-bit TIFF images
> >>> >> >
> >>> >> > PIL source shows that it incorrectly uses typestr == '|b1'. I tried
> >>> >> > to
> >>> >> > change this to '|t1', but I get :
> >>> >> >
> >>> >> > TypeError: data type "|t1" not understood
> >>> >> >
> >>> >> > My goal is to make the above code to work for black and white TIFF
> >>> >> > images the same way it works for grayscale images. Any help ?
> >>> >> _______________________________________________
> >>> >> NumPy-Discussion mailing list
> >>> >> NumPy-Discussion at scipy.org
> >>> >> https://mail.scipy.org/mailman/listinfo/numpy-discussion
> >>> >
> >>> >
> >>> >
> >>> > _______________________________________________
> >>> > NumPy-Discussion mailing list
> >>> > NumPy-Discussion at scipy.org
> >>> > https://mail.scipy.org/mailman/listinfo/numpy-discussion
> >>> >
> >>> _______________________________________________
> >>> NumPy-Discussion mailing list
> >>> NumPy-Discussion at scipy.org
> >>> https://mail.scipy.org/mailman/listinfo/numpy-discussion
> >>
> >>
> >>
> >> _______________________________________________
> >> NumPy-Discussion mailing list
> >> NumPy-Discussion at scipy.org
> >> https://mail.scipy.org/mailman/listinfo/numpy-discussion
> >>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> https://mail.scipy.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From hdante.lnls at gmail.com  Fri Apr 29 13:42:48 2016
From: hdante.lnls at gmail.com (Henrique Almeida)
Date: Fri, 29 Apr 2016 14:42:48 -0300
Subject: [Numpy-discussion] Help with bit arrays
In-Reply-To: 
References: 
Message-ID: 

I agree with everything, but the unexpected parts are something else.
The TIFF images have no colormap. The colormap that I'm referring to is
the GUI colormap, used by matplotlib to draw the image (imshow parameter
cmap). The problematic image format is the black and white 1-bit TIFF
format. It is a bit-array format: all bits are packed in sequence.
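(For illustration, unpacking such a packed buffer by hand with numpy
would look something like the sketch below; the zeroed buffer is just a
stand-in for the real strip data:)

import numpy
# 256 * 256 pixels at 1 bit each, packed 8 to a byte -> 8192 bytes
packed = numpy.zeros(8192, dtype=numpy.uint8)
bits = numpy.unpackbits(packed)   # 65536 values, one 0/1 per pixel
image = bits.reshape(256, 256)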
PIL passes this kind of image to numpy through the array_interface
getter, which returns an array description of shape = (256, 256), type
string "|b1", and a data buffer that is an 8192-byte array
(256 * 256 * 1 bit). This description is invalid and causes numpy to
read 65536 bytes from memory, causing a buffer overrun (even though it
does not crash). This is unexpected #1.

matplotlib.imread(), when loading an 8-bit grayscale image, creates an
array of shape (256, 256). matplotlib.imread(), when loading a 1-bit
black and white image, creates an array of shape (256, 256, 4) by first
converting the black and white image to RGBA. This difference between
grayscale and black and white is unexpected #2.
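(In the meantime, a workaround sketch that sidesteps both surprises --
assuming Pillow/PIL's convert() -- is to force the 1-bit image up to
8-bit grayscale before wrapping it, so numpy sees one byte per pixel
and imshow gets a plain 2D array that the cmap argument applies to:)

from PIL import Image
import numpy
from matplotlib import pyplot, cm

img = Image.open('oi-00.tiff').convert('L')  # mode "1" -> mode "L" (8-bit)
a = numpy.array(img)                         # shape (256, 256), dtype uint8
pyplot.imshow(a, cmap=cm.gray)
pyplot.colorbar()
pyplot.show()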
2016-04-29 14:22 GMT-03:00 Benjamin Root :
> What behavior is unexpected? For the (256, 256) images, matplotlib
> applies its default colormap to the grayscale (v1.5 and previous, that
> is jet; v2.0 and later, that will be viridis). The numpy array as loaded
> from PIL will never carry any additional information that came from the
> TIFF.
>
> As for PIL, it will return an RGB[A] array if there is colormap data in
> the TIFF. If there is no colormap specified in the TIFF, it'll give you
> a simple 2D array. Now, maybe you'd like it to always return an RGB[A]
> array, but without a colormap in the TIFF, it makes sense to return the
> data as-is. This makes sense for people treating the TIFF as a data
> format rather than a visualization data format.
>
> Ben Root
>
>
> On Fri, Apr 29, 2016 at 12:47 PM, Henrique Almeida 
> wrote:
>>
>> I think in any case the result is unexpected: PIL is loading garbage
>> from memory when loading black and white images because it sends the
>> wrong buffer size, and matplotlib correctly loads the black and white
>> image but stores it in a 3D array.
>>
>> 2016-04-29 13:43 GMT-03:00 Henrique Almeida :
>> > For 1 bit images, the resulting array has shape (256, 256, 4). For
>> > grayscale images, the shape is (256, 256). So the image seems to have
>> > been loaded as a color image.
>> >
>> > 2016-04-29 13:38 GMT-03:00 Benjamin Root :
>> >> What kind of array is "img"? What is its dtype and shape?
>> >>
>> >> plt.imshow() will use the default colormap for matplotlib if the given
>> >> array is just 2D. But if it is 3D (a 2D array of RGB[A] channels), then
>> >> it will forego the colormap and utilize that for the colors. It knows
>> >> nothing of the colormap contained in the TIFF.
>> >>
>> >> Ben Root
>> >>
>> >>
>> >> On Fri, Apr 29, 2016 at 12:31 PM, Henrique Almeida 
>> >> wrote:
>> >>>
>> >>> Paul, yes, imread() worked for reading the black and white TIFF. The
>> >>> situation improved, but now, there seems to be some problem with the
>> >>> color map. Example code:
>> >>>
>> >>> #!/usr/bin/env python3
>> >>> import numpy
>> >>> from matplotlib import pyplot, cm
>> >>>
>> >>> img = pyplot.imread('oi-00.tiff')
>> >>> pyplot.imshow(img)
>> >>> pyplot.colorbar()
>> >>> pyplot.show()
>> >>>
>> >>> The code can open both 1-bit and 8-bit images, but only with 8 bits
>> >>> the image is shown with the colormap colors. The 1 bit image is shown
>> >>> as black and white.
>> >>>
>> >>> The questions:
>> >>> 1) Should Image.open() behave like pyplot.imread() ? Is this a bug in PIL ?
>> >>> 2) Why isn't the colormap working with black and white images ?
>> >>>
>> >>> 2016-04-29 13:06 GMT-03:00 Paul Hobson :
>> >>> > Does using pyplot.imread work?
>> >>> >
>> >>> > On Fri, Apr 29, 2016 at 8:27 AM, Henrique Almeida
>> >>> > 
>> >>> > wrote:
>> >>> >>
>> >>> >> Any help with this problem ?
>> >>> >>
>> >>> >> 2016-04-27 11:35 GMT-03:00 Henrique Almeida :
>> >>> >> > Hello, what's the current status on numpy for loading bit-arrays ?
>> >>> >> >
>> >>> >> > I'm currently unable to correctly load black and white (1-bit) TIFF
>> >>> >> > images. Code example follows:
>> >>> >> >
>> >>> >> > from PIL import Image
>> >>> >> > import numpy
>> >>> >> > from matplotlib import pyplot
>> >>> >> >
>> >>> >> > img = Image.open('oi-00.tiff')
>> >>> >> > a = numpy.array(img)
>> >>> >> >
>> >>> >> > ^ does not work for 1-bit TIFF images
>> >>> >> >
>> >>> >> > PIL source shows that it incorrectly uses typestr == '|b1'. I tried
>> >>> >> > to
>> >>> >> > change this to '|t1', but I get :
>> >>> >> >
>> >>> >> > TypeError: data type "|t1" not understood
>> >>> >> >
>> >>> >> > My goal is to make the above code to work for black and white TIFF
>> >>> >> > images the same way it works for grayscale images. Any help ?
>> >>> >> _______________________________________________
>> >>> >> NumPy-Discussion mailing list
>> >>> >> NumPy-Discussion at scipy.org
>> >>> >> https://mail.scipy.org/mailman/listinfo/numpy-discussion
>> >>> >
>> >>> >
>> >>> >
>> >>> > _______________________________________________
>> >>> > NumPy-Discussion mailing list
>> >>> > NumPy-Discussion at scipy.org
>> >>> > https://mail.scipy.org/mailman/listinfo/numpy-discussion
>> >>> >
>> >>> _______________________________________________
>> >>> NumPy-Discussion mailing list
>> >>> NumPy-Discussion at scipy.org
>> >>> https://mail.scipy.org/mailman/listinfo/numpy-discussion
>> >>
>> >>
>> >>
>> >> _______________________________________________
>> >> NumPy-Discussion mailing list
>> >> NumPy-Discussion at scipy.org
>> >> https://mail.scipy.org/mailman/listinfo/numpy-discussion
>> >>
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at scipy.org
>> https://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> https://mail.scipy.org/mailman/listinfo/numpy-discussion
>