From ralf.gommers at gmail.com Mon Feb 1 16:25:17 2016 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Mon, 1 Feb 2016 22:25:17 +0100 Subject: [Numpy-discussion] Numpy 1.11.0b2 released In-Reply-To: <56ADE8A9.7030901@googlemail.com> References: <56ABA552.1090002@gmail.com> <56ADE8A9.7030901@googlemail.com> Message-ID: On Sun, Jan 31, 2016 at 11:57 AM, Julian Taylor < jtaylor.debian at googlemail.com> wrote: > On 01/30/2016 06:27 PM, Ralf Gommers wrote: > > > > > > On Fri, Jan 29, 2016 at 11:39 PM, Nathaniel Smith > > wrote: > > > > It occurs to me that the best solution might be to put together a > > .travis.yml for the release branches that does: "for pkg in > > IMPORTANT_PACKAGES: pip install $pkg; python -c 'import pkg; > > pkg.test()'" > > This might not be viable right now, but will be made more viable if > > pypi starts allowing official Linux wheels, which looks likely to > > happen before 1.12... (see PEP 513) > > > > On Jan 29, 2016 9:46 AM, "Andreas Mueller" > > wrote: > > > > > > Is this the point when scikit-learn should build against it? > > > > Yes please! > > > > > Or do we wait for an RC? > > > > This is still all in flux, but I think we might actually want a rule > > that says it can't become an RC until after we've tested > > scikit-learn (and a list of similarly prominent packages). On the > > theory that RC means "we think this is actually good enough to > > release" :-). OTOH I'm not sure the alpha/beta/RC distinction is > > very helpful; maybe they should all just be betas. > > > > > Also, we need a scipy build against it. Who does that? > > > > Like Julian says, it shouldn't be necessary. In fact using old > > builds of scipy and scikit-learn is even better than rebuilding > > them, because it tests numpy's ABI compatibility -- if you find you > > *have* to rebuild something then we *definitely* want to know that. > > > > > Our continuous integration doesn't usually build scipy or numpy, > so it will be a bit tricky to add to our config. > > > Would you run our master tests? [did we ever finish this > discussion?] > > > > We didn't, and probably should... :-) > > > > Why would that be necessary if scikit-learn simply tests pre-releases of > > numpy as you suggested earlier in the thread (with --pre)? > > > > There's also https://github.com/MacPython/scipy-stack-osx-testing by the > > way, which could have scikit-learn and scikit-image added to it. > > > > That's two options that are imho both better than adding more workload > > for the numpy release manager. Also from a principled point of view, > > packages should test with new versions of their dependencies, not the > > other way around. > > > It would be nice but its not realistic, I doubt most upstreams that are > not themselves major downstreams are even subscribed to this list. > I'm pretty sure that some core devs from all major scipy stack packages are subscribed to this list. Testing or delegating testing of least our major downstreams should be > the job of the release manager. > If we make it (almost) fully automated, like in https://github.com/MacPython/scipy-stack-osx-testing, then I agree that adding this to the numpy release checklist would make sense. But it should really only be a tiny amount of work - we're short on developer power, and many things that are cross-project like build & test infrastructure (numpy.distutils, needed pip/packaging fixes, numpy.testing), scipy.org (the "stack" website), numpydoc, etc. are mostly maintained by the numpy/scipy devs. 
I'm very reluctant to say yes to putting even more work on top of that. So: it would really help if someone could pick up the automation part of this and improve the stack testing, so the numpy release manager doesn't have to do this. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From jtaylor.debian at googlemail.com Mon Feb 1 17:14:27 2016 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Mon, 1 Feb 2016 23:14:27 +0100 Subject: [Numpy-discussion] Numpy 1.11.0b2 released In-Reply-To: References: <56ABA552.1090002@gmail.com> <56ADE8A9.7030901@googlemail.com> Message-ID: <56AFD8C3.30801@googlemail.com> hi, even if it are good changes, I find it reasonable to ask for a delay in numpy release if you need more time to adapt. Of course this has to be within reason and can be rejected, but its very valuable to know changes break existing old workarounds. If pyfits broke there is probably a lot more code we don't know about that is also broken. Sometimes we might even be able to get the good without breaking the bad. E.g. thanks to Sebastians heroic efforts in his recent indexing rewrite only very little broke and a lot of odd stuff could be equipped with deprecation warnings instead of breaking. Of course that cannot often be done or be worthwhile but its at least worth considering when we change core functionality. cheers, Julian On 31.01.2016 22:52, Marten van Kerkwijk wrote: > Hi Julian, > > While the numpy 1.10 situation was bad, I do want to clarify that the > problems we had in astropy were a consequence of *good* changes in > `recarray`, which solved many problems, but also broke the work-arounds > that had been created in `astropy.io.fits` quite a long time ago > (possibly before astropy became as good as it tries to be now at moving > issues upstream and perhaps before numpy had become as responsive to > what happens downstream as it is now; I think it is fair to say many > project's attitude to testing has changed rather drastically in the last > decade!). > > I do agree, though, that it just goes to show one has to try to be > careful, and like Nathaniel's suggestion of automatic testing with > pre-releases -- I just asked on our astropy-dev list whether we can > implement it. > > All the best, > > Marten > From jtaylor.debian at googlemail.com Mon Feb 1 18:22:23 2016 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Tue, 2 Feb 2016 00:22:23 +0100 Subject: [Numpy-discussion] Numpy pull requests getting out of hand. In-Reply-To: References: Message-ID: <56AFE8AF.7000806@googlemail.com> I don't like that approach, closing PRs with valuable code causes them to get lost in the much larger set of closed ones. Instead we could tag them appropriately so they can be found by parties interested in using or finishing them. These could also be used for new contributors to work on. You could also tag and close them, but I find it makes them harder to discover especially for outsiders who are not aware of this policy. Also we could extend our contribution guidelines to note that reviewing existing PRs can be a much more valuable contribution than adding new code. Possibly adding some review guidelines. 
On 31.01.2016 20:25, Jeff Reback wrote: > FYI also useful to simply close by time - say older than 6 months with a message for the writer to reopen if they want to work on it > > then u don't get too many stale ones > > my 2c > >> On Jan 31, 2016, at 2:10 PM, Charles R Harris wrote: >> >> Hi All, >> >> There are now 130 open numpy pull requests and it seems almost impossible to keep that number down. My personal decision is that I am going to ignore any new enhancements for the next couple of months and only merge bug fixes, tests, house keeping (style, docs, deprecations), and old PRs. I would also request that other maintainers start looking a taking care of older PRs, either cleaning them up and merging, or closing them. >> >> Chuck >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > From evgeny.burovskiy at gmail.com Mon Feb 1 19:21:37 2016 From: evgeny.burovskiy at gmail.com (Evgeni Burovski) Date: Tue, 2 Feb 2016 00:21:37 +0000 Subject: [Numpy-discussion] Numpy pull requests getting out of hand. In-Reply-To: <56AFE8AF.7000806@googlemail.com> References: <56AFE8AF.7000806@googlemail.com> Message-ID: On Mon, Feb 1, 2016 at 11:22 PM, Julian Taylor wrote: > I don't like that approach, closing PRs with valuable code causes them > to get lost in the much larger set of closed ones. > Instead we could tag them appropriately so they can be found by > parties interested in using or finishing them. These could also be used > for new contributors to work on. > You could also tag and close them, but I find it makes them harder to > discover especially for outsiders who are not aware of this policy. > > Also we could extend our contribution guidelines to note that > reviewing existing PRs can be a much more valuable contribution than > adding new code. Possibly adding some review guidelines. FWIW, I second this. http://scipy.github.io/devdocs/hacking.html mentions "Contributing new code" and "Contributing by helping maintain existing code". We should add "Contributing by reviewing open PRs" and "Contributing by rebasing stalled PRs". It's very much not obvious to would-be-contributors that either is welcome or that the latter is actually possible. I don't even think we need to detail guidelines too much, apart from stressing that we want to preserve the original commit authorship. And maybe "briefly introduce yourself if we don't know you yet". W.r.t. tagging, over at scipy we have a "needs-work" tag. As of today, there is also the "incomplete" tag, which I do not think conveys the message --- we should stress the call for action, not a limbo status. At a minimum, we could adopt a "needs-review" tag from e.g. matplotlib and introduce something like "up-for-grabs" or "needs-champion" tag. [Of course, the "we should" above is more like, "I think it might make sense for scipy, maybe also numpy could consider this". 
] My 2 kopeiki, Evgeni > > On 31.01.2016 20:25, Jeff Reback wrote: >> FYI also useful to simply close by time - say older than 6 months with a message for the writer to reopen if they want to work on it >> >> then u don't get too many stale ones >> >> my 2c >> >>> On Jan 31, 2016, at 2:10 PM, Charles R Harris wrote: >>> >>> Hi All, >>> >>> There are now 130 open numpy pull requests and it seems almost impossible to keep that number down. My personal decision is that I am going to ignore any new enhancements for the next couple of months and only merge bug fixes, tests, house keeping (style, docs, deprecations), and old PRs. I would also request that other maintainers start looking a taking care of older PRs, either cleaning them up and merging, or closing them. >>> >>> Chuck >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion From derek at astro.physik.uni-goettingen.de Mon Feb 1 19:23:33 2016 From: derek at astro.physik.uni-goettingen.de (Derek Homeier) Date: Tue, 2 Feb 2016 01:23:33 +0100 Subject: [Numpy-discussion] Numpy 1.11.0b1 is out In-Reply-To: <1454230082.3650.8.camel@sipsolutions.net> References: <1453896636.18959.2.camel@sipsolutions.net> <1454230082.3650.8.camel@sipsolutions.net> Message-ID: > On 31 Jan 2016, at 9:48 am, Sebastian Berg wrote: > > On Sa, 2016-01-30 at 20:27 +0100, Derek Homeier wrote: >> On 27 Jan 2016, at 1:10 pm, Sebastian Berg < >> sebastian at sipsolutions.net> wrote: >>> >>> On Mi, 2016-01-27 at 11:19 +0000, Nadav Horesh wrote: >>>> Why the dot function/method is slower than @ on python 3.5.1? >>>> Tested >>>> from the latest 1.11 maintenance branch. >>>> >>> >>> The explanation I think is that you do not have a blas >>> optimization. In >>> which case the fallback mode is probably faster in the @ case >>> (since it >>> has SSE2 optimization by using einsum, while np.dot does not do >>> that). >> >> I am a bit confused now, as A @ c is just short for A.__matmul__(c) >> or equivalent >> to np.matmul(A,c), so why would these not use the optimised blas? >> Also, I am getting almost identical results on my Mac, yet I thought >> numpy would >> by default build against the VecLib optimised BLAS. If I build >> explicitly against >> ATLAS, I am actually seeing slightly slower results. >> But I also saw these kind of warnings on the first timeit runs: >> >> %timeit A.dot(c) >> The slowest run took 6.91 times longer than the fastest. This could >> mean that an intermediate result is being cached >> >> and when testing much larger arrays, the discrepancy between matmul >> and dot rather >> increases, so perhaps this is more an issue of a less memory >> -efficient implementation >> in np.dot? > > Sorry, I missed the fact that one of the arrays was 3D. In that case I > am not even sure which if the functions call into blas or what else > they have to do, would have to check. Note that `np.dot` uses a > different type of combinging high dimensional arrays. 
@/matmul > broadcasts extra axes, while np.dot will do the outer combination of > them, so that the result is: > > As = A.shape > As.pop(-1) > cs = c.shape > cs.pop(-2) # if possible > result_shape = As + cs > > which happens to be identical if only A.ndim > 2 and c.ndim <= 2. Makes sense now; with A.ndim = 2 both operations take about the same time (and are ~50% faster with VecLib than with ATLAS) and yield identical results, while any additional dimension in A adds more overhead time to np.dot, and the results are np.allclose, but not exactly identical. Thanks, Derek From ndbecker2 at gmail.com Tue Feb 2 10:33:37 2016 From: ndbecker2 at gmail.com (Neal Becker) Date: Tue, 02 Feb 2016 10:33:37 -0500 Subject: [Numpy-discussion] julia - Multidimensional algorithms and iteration Message-ID: http://julialang.org/blog/2016/02/iteration/ From pav at iki.fi Tue Feb 2 11:45:20 2016 From: pav at iki.fi (Pauli Virtanen) Date: Tue, 2 Feb 2016 18:45:20 +0200 Subject: [Numpy-discussion] Numpy 1.11.0b2 released In-Reply-To: References: <56ABA552.1090002@gmail.com> <56ADE8A9.7030901@googlemail.com> Message-ID: 01.02.2016, 23:25, Ralf Gommers kirjoitti: [clip] > So: it would really help if someone could pick up the automation part of > this and improve the stack testing, so the numpy release manager doesn't > have to do this. quick hack: https://github.com/pv/testrig Not that I'm necessarily volunteering to maintain the setup, though, but if it seems useful, move it under numpy org. -- Pauli Virtanen From jni.soma at gmail.com Tue Feb 2 20:56:13 2016 From: jni.soma at gmail.com (Juan Nunez-Iglesias) Date: Wed, 3 Feb 2016 12:56:13 +1100 Subject: [Numpy-discussion] julia - Multidimensional algorithms and iteration In-Reply-To: References: Message-ID: Nice. I particularly liked that indices are just arrays, so you can do array arithmetic on them. I spend a lot of time converting tuples-to-array-to-tuples. If I understand correctly, indexing-with-arrays is overloaded in NumPy so the tuple syntax isn't going away any time soon, is it? On Wed, Feb 3, 2016 at 2:33 AM, Neal Becker wrote: > http://julialang.org/blog/2016/02/iteration/ > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Thu Feb 4 00:18:38 2016 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 3 Feb 2016 21:18:38 -0800 Subject: [Numpy-discussion] Numpy 1.11.0b2 released In-Reply-To: References: <56ABA552.1090002@gmail.com> <56ADE8A9.7030901@googlemail.com> Message-ID: On Tue, Feb 2, 2016 at 8:45 AM, Pauli Virtanen wrote: > 01.02.2016, 23:25, Ralf Gommers kirjoitti: > [clip] >> So: it would really help if someone could pick up the automation part of >> this and improve the stack testing, so the numpy release manager doesn't >> have to do this. > > quick hack: https://github.com/pv/testrig > > Not that I'm necessarily volunteering to maintain the setup, though, but > if it seems useful, move it under numpy org. That's pretty cool :-). I also was fiddling with a similar idea a bit, though much less fancy... 
my little script cheats and uses miniconda to fetch pre-built versions of some packages, and then runs the tests against numpy 1.10.2 (as shipped by anaconda) + the numpy master, and does a diff (with a bit of massaging to make things more readable, like summarizing warnings): https://travis-ci.org/njsmith/numpy/builds/106865202 Search for "#####" to jump between sections of the output. Some observations: testing* matplotlib* this way doesn't work, b/c they need special test data files that anaconda doesn't ship :-/ *scipy*: *one new failure*, in test_nanmedian_all_axis 250 calls to np.testing.rand (wtf), 92 calls to random_integers, 3 uses of datetime64 with timezones. And for some reason the new numpy gives more "invalid value encountered in greater"-type warnings. *astropy*: *two weird failures* that hopefully some astropy person will look into; two spurious failures due to over-strict testing of warnings *scikit-learn*: several* new failures:* 1 "invalid slice" (?), 2 "OverflowError: value too large to convert to int". No idea what's up with these. Hopefully some scikit-learn person will investigate? 2 np.ma view warnings, 16 multi-character strings used where "C" or "F" expected, 1514 (!!) calls to random_integers *pandas:* zero new failures, only new warnings are about NaT, as expected. I guess their whole "running their tests against numpy master" thing works! *statsmodels:* * absolute disaster*. *261 *new failures, I think mostly because of numpy getting pickier about float->int conversions. Also a few "invalid slice". 102 np.ma view warnings. I don't have a great sense of whether the statsmodels breakages are ones that will actually impact users, or if they're just like, 1 bad utility function that only gets used in the test suite. (well, probably not the latter, because they do have different tracebacks). If this is typical though then we may need to back those integer changes out and replace them by a really loud obnoxious warning for a release or two :-/ The other problem here is that statsmodels hasn't done a release since 2014 :-/ -n -- Nathaniel J. Smith -- https://vorpus.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Thu Feb 4 00:56:08 2016 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 3 Feb 2016 21:56:08 -0800 Subject: [Numpy-discussion] Numpy 1.11.0b2 released In-Reply-To: References: <56ABA552.1090002@gmail.com> <56ADE8A9.7030901@googlemail.com> Message-ID: On Wed, Feb 3, 2016 at 9:18 PM, Nathaniel Smith wrote: > > On Tue, Feb 2, 2016 at 8:45 AM, Pauli Virtanen wrote: > > 01.02.2016, 23:25, Ralf Gommers kirjoitti: > > [clip] > >> So: it would really help if someone could pick up the automation part of > >> this and improve the stack testing, so the numpy release manager doesn't > >> have to do this. > > > > quick hack: https://github.com/pv/testrig > > > > Not that I'm necessarily volunteering to maintain the setup, though, but > > if it seems useful, move it under numpy org. > > That's pretty cool :-). I also was fiddling with a similar idea a bit, though much less fancy... my little script cheats and uses miniconda to fetch pre-built versions of some packages, and then runs the tests against numpy 1.10.2 (as shipped by anaconda) + the numpy master, and does a diff (with a bit of massaging to make things more readable, like summarizing warnings): Whoops, got distracted talking about the results and forgot to say -- I guess we should think about how to combine these? 
I like the information on warnings, because it helps gauge the impact of deprecations, which is a thing that takes a lot of our attention. But your approach is clearly fancier in terms of how it parses the test results. (Do you think the fanciness is worth it? I can see an argument for crude and simple if the fanciness ends up being fragile, but I haven't read the code -- mostly I was just being crude and simple because I'm lazy :-).) An extra ~2 hours of tests / 6-way parallelism is not that big a deal in the grand scheme of things (and I guess it's probably less than that if we can take advantage of existing binary builds) -- certainly I can see an argument for enabling it by default on the maintenance/1.x branches. Running N extra test suites ourselves is not actually more expensive than asking N projects to run 1 more testsuite :-). The trickiest part is getting it to give actually-useful automated pass/fail feedback, as opposed to requiring someone to remember to look at it manually :-/ Maybe it should be uploading the reports somewhere? So there'd be a readable "what's currently broken by 1.x" page, plus with persistent storage we could get travis to flag if new additions to the release branch causes any new failures to appear? (That way we only have to remember to look at the report manually once per release, instead of constantly throughout the process.) -n -- Nathaniel J. Smith -- https://vorpus.org From nadavh at visionsense.com Thu Feb 4 04:32:36 2016 From: nadavh at visionsense.com (Nadav Horesh) Date: Thu, 4 Feb 2016 09:32:36 +0000 Subject: [Numpy-discussion] [OT] Interpolation of an unevently sampled bandwidth limited signal Message-ID: I have several cases of hand digitized spectra that I'd like to resample these spectra at even spacings. My problem is that cubic or RBF splines often result in an unacceptible over-shooting. Is there a python module that provides something similar to sinc interpolation on unevenly space sampled signal? Nadav. -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Thu Feb 4 04:33:29 2016 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 4 Feb 2016 10:33:29 +0100 Subject: [Numpy-discussion] Numpy 1.11.0b2 released References: <56ABA552.1090002@gmail.com> <56ADE8A9.7030901@googlemail.com> Message-ID: <20160204103329.7bab2d2f@fsol> On Wed, 3 Feb 2016 21:56:08 -0800 Nathaniel Smith wrote: > > An extra ~2 hours of tests / 6-way parallelism is not that big a deal > in the grand scheme of things (and I guess it's probably less than > that if we can take advantage of existing binary builds) -- certainly > I can see an argument for enabling it by default on the > maintenance/1.x branches. Running N extra test suites ourselves is not > actually more expensive than asking N projects to run 1 more testsuite > :-). The trickiest part is getting it to give actually-useful > automated pass/fail feedback, as opposed to requiring someone to > remember to look at it manually :-/ Yes, I think that's where the problem lies. Python had something called "community buildbots" at a time (testing well-known libraries such as Twisted against the Python trunk), but it suffered from lack of attention and finally was dismantled. Apparently having the people running it and the people most interested in it not being the same ones ended up a bad idea :-) That said, if you do something like that with Numpy, we would be interested in having Numba be part of the tested packages. Regards Antoine. 
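[Editor's note: the "IMPORTANT_PACKAGES" idea quoted at the top of this thread amounts to roughly the following. This is only a minimal sketch: the package list is taken from the results posted above, the module names and the assumption that every project exposes a top-level ``pkg.test()`` entry point (and signals failure through the exit status) are illustrative rather than guaranteed, and filling exactly those gaps with proper reporting is what testrig and the miniconda diff script are for.]

```
# Minimal sketch: install a numpy pre-release, then install each downstream
# package and run its test suite in a fresh subprocess.  Package/module names
# are assumptions for illustration; not every project exposes pkg.test().
import subprocess
import sys

IMPORTANT_PACKAGES = [
    # (pip name, importable module name)
    ("scipy", "scipy"),
    ("pandas", "pandas"),
    ("astropy", "astropy"),
    ("scikit-learn", "sklearn"),
    ("statsmodels", "statsmodels"),
]

# pull in the current numpy beta/RC from PyPI
subprocess.check_call([sys.executable, "-m", "pip", "install",
                       "--upgrade", "--pre", "numpy"])

failed = []
for pip_name, module in IMPORTANT_PACKAGES:
    subprocess.check_call([sys.executable, "-m", "pip", "install", pip_name])
    # run the package's own test entry point; a non-zero exit (or an
    # uncaught exception) is counted as a failure against the pre-release
    ret = subprocess.call([sys.executable, "-c",
                           "import {0}; {0}.test()".format(module)])
    if ret != 0:
        failed.append(pip_name)

print("failed against the pre-release: %s" % (", ".join(failed) or "none"))
sys.exit(1 if failed else 0)
```

Run as a cron-style job on the maintenance branches, something along these lines would at least flag downstream breakage before an RC goes out, without adding per-release work for the release manager.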
From evgeny.burovskiy at gmail.com Thu Feb 4 04:42:23 2016 From: evgeny.burovskiy at gmail.com (Evgeni Burovski) Date: Thu, 4 Feb 2016 09:42:23 +0000 Subject: [Numpy-discussion] [OT] Interpolation of an unevently sampled bandwidth limited signal In-Reply-To: References: Message-ID: On Thu, Feb 4, 2016 at 9:32 AM, Nadav Horesh wrote: > I have several cases of hand digitized spectra that I'd like to resample > these spectra at even spacings. My problem is that cubic or RBF splines > often result in an unacceptible over-shooting. Is there a python module that > provides something similar to sinc interpolation on unevenly space sampled > signal? There are PCHIP and Akima interpolators in scipy.interpolate, both are designed to prevent overshooting at the expense of only being C1-smooth. (No idea about sinc interpolation) From evgeny.burovskiy at gmail.com Thu Feb 4 04:51:32 2016 From: evgeny.burovskiy at gmail.com (Evgeni Burovski) Date: Thu, 4 Feb 2016 09:51:32 +0000 Subject: [Numpy-discussion] Numpy 1.11.0b2 released In-Reply-To: References: <56ABA552.1090002@gmail.com> <56ADE8A9.7030901@googlemail.com> Message-ID: > scipy: > one new failure, in test_nanmedian_all_axis > 250 calls to np.testing.rand (wtf), 92 calls to random_integers, 3 uses > of datetime64 with timezones. And for some reason the new numpy gives more > "invalid value encountered in greater"-type warnings. One limitation of this approach, AFAIU, is that the downstream versions are pinned by whatever is available from anaconda, correct? Not a big deal per se, just something to keep in mind when looking at the report that there might be false positives. For scipy, for instance, this seems to test 0.16.1. Most (all?) of these are fixed in 0.17.0. At any rate, this is great regardless --- thank you! Cheers, Evgeni From nadavh at visionsense.com Thu Feb 4 06:34:49 2016 From: nadavh at visionsense.com (Nadav Horesh) Date: Thu, 4 Feb 2016 11:34:49 +0000 Subject: [Numpy-discussion] [OT] Interpolation of an unevently sampled bandwidth limited signal In-Reply-To: References: , Message-ID: Thank you, I'll try this. Interpolation by the sinc function is equivalent to what yiu get if you'll synthesize a smooth function by summing its Fourier component obtained via FFT of the data. Nadav. ________________________________________ From: NumPy-Discussion on behalf of Evgeni Burovski Sent: 04 February 2016 11:42 To: Discussion of Numerical Python Subject: Re: [Numpy-discussion] [OT] Interpolation of an unevently sampled bandwidth limited signal On Thu, Feb 4, 2016 at 9:32 AM, Nadav Horesh wrote: > I have several cases of hand digitized spectra that I'd like to resample > these spectra at even spacings. My problem is that cubic or RBF splines > often result in an unacceptible over-shooting. Is there a python module that > provides something similar to sinc interpolation on unevenly space sampled > signal? There are PCHIP and Akima interpolators in scipy.interpolate, both are designed to prevent overshooting at the expense of only being C1-smooth. 
(No idea about sinc interpolation) _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion From pav at iki.fi Thu Feb 4 09:09:26 2016 From: pav at iki.fi (Pauli Virtanen) Date: Thu, 4 Feb 2016 16:09:26 +0200 Subject: [Numpy-discussion] Numpy 1.11.0b2 released In-Reply-To: References: <56ABA552.1090002@gmail.com> <56ADE8A9.7030901@googlemail.com> Message-ID: 04.02.2016, 07:56, Nathaniel Smith kirjoitti: [clip] > Whoops, got distracted talking about the results and forgot to say -- > I guess we should think about how to combine these? I like the > information on warnings, because it helps gauge the impact of > deprecations, which is a thing that takes a lot of our attention. But > your approach is clearly fancier in terms of how it parses the test > results. (Do you think the fanciness is worth it? I can see an > argument for crude and simple if the fanciness ends up being fragile, > but I haven't read the code -- mostly I was just being crude and > simple because I'm lazy :-).) The fanciness is essentially a question of implementation language and ease of writing the reporting code. At 640 SLOC it's probably not so bad. I guess it's reasonably robust --- the test report formats are unlikely to change, and pip/virtualenv will probably continue to work esp. with pinned pip version. It should be simple to extract also the warnings from the test stdout. I'm not sure if the order of test results is deterministic in nose/py.test, so I don't know if just diffing the outputs always works. Building downstream from source avoids future binary compatibility issues. [clip] > Maybe it should be uploading the reports somewhere? So there'd be a > readable "what's currently broken by 1.x" page, plus with persistent > storage we could get travis to flag if new additions to the release > branch causes any new failures to appear? (That way we only have to > remember to look at the report manually once per release, instead of > constantly throughout the process.) This is probably possible to implement. Although, I'm not sure how much added value this is compared to travis matrix, eg. https://travis-ci.org/pv/testrig/ Of course, if the suggestion is that the results are generated on somewhere else than on travis, then that's a different matter. -- Pauli Virtanen From jjstickel at gmail.com Thu Feb 4 09:32:03 2016 From: jjstickel at gmail.com (Jonathan Stickel) Date: Thu, 4 Feb 2016 07:32:03 -0700 Subject: [Numpy-discussion] [OT] Interpolation of an unevently sampled, bandwidth limited signal In-Reply-To: References: Message-ID: <56B360E3.3060402@gmail.com> On 2/4/16 02:42 , numpy-discussion-request at scipy.org wrote: > Date: Thu, 4 Feb 2016 09:32:36 +0000 > From: Nadav Horesh > To: numpy-discussion > Subject: [Numpy-discussion] [OT] Interpolation of an unevently sampled > bandwidth limited signal > Message-ID: > > > Content-Type: text/plain; charset="iso-8859-1" > > I have several cases of hand digitized spectra that I'd like to resample these spectra at even spacings. My problem is that cubic or RBF splines often result in an unacceptible over-shooting. Is there a python module that provides something similar to sinc interpolation on unevenly space sampled signal? > > > Nadav. You might try scikit-datasmooth: https://pypi.python.org/pypi/scikits.datasmooth BTW, this wouldn't be offtopic on the scipy-user list. 
Regards, Jonathan From charlesr.harris at gmail.com Thu Feb 4 10:17:28 2016 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 4 Feb 2016 08:17:28 -0700 Subject: [Numpy-discussion] [OT] Interpolation of an unevently sampled bandwidth limited signal In-Reply-To: References: Message-ID: On Thu, Feb 4, 2016 at 4:34 AM, Nadav Horesh wrote: > Thank you, I'll try this. > Interpolation by the sinc function is equivalent to what yiu get if you'll > synthesize a smooth function by summing its Fourier component obtained via > FFT of the data. > You might be interested in the NUFFT, see https://jakevdp.github.io/blog/2015/02/24/optimizing-python-with-numpy-and-numba/ Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From tcaswell at gmail.com Thu Feb 4 10:32:46 2016 From: tcaswell at gmail.com (Thomas Caswell) Date: Thu, 04 Feb 2016 15:32:46 +0000 Subject: [Numpy-discussion] Numpy 1.11.0b2 released In-Reply-To: References: <56ABA552.1090002@gmail.com> <56ADE8A9.7030901@googlemail.com> Message-ID: The test data for mpl is available as a sperate conda package, matplotlib-tests. The reason for splitting it is 40Mb of tests images. Tom On Thu, Feb 4, 2016, 09:09 Pauli Virtanen wrote: > 04.02.2016, 07:56, Nathaniel Smith kirjoitti: > [clip] > > Whoops, got distracted talking about the results and forgot to say -- > > I guess we should think about how to combine these? I like the > > information on warnings, because it helps gauge the impact of > > deprecations, which is a thing that takes a lot of our attention. But > > your approach is clearly fancier in terms of how it parses the test > > results. (Do you think the fanciness is worth it? I can see an > > argument for crude and simple if the fanciness ends up being fragile, > > but I haven't read the code -- mostly I was just being crude and > > simple because I'm lazy :-).) > > The fanciness is essentially a question of implementation language and > ease of writing the reporting code. At 640 SLOC it's probably not so bad. > > I guess it's reasonably robust --- the test report formats are unlikely > to change, and pip/virtualenv will probably continue to work esp. with > pinned pip version. > > It should be simple to extract also the warnings from the test stdout. > > I'm not sure if the order of test results is deterministic in > nose/py.test, so I don't know if just diffing the outputs always works. > > Building downstream from source avoids future binary compatibility issues. > > [clip] > > Maybe it should be uploading the reports somewhere? So there'd be a > > readable "what's currently broken by 1.x" page, plus with persistent > > storage we could get travis to flag if new additions to the release > > branch causes any new failures to appear? (That way we only have to > > remember to look at the report manually once per release, instead of > > constantly throughout the process.) > > This is probably possible to implement. Although, I'm not sure how much > added value this is compared to travis matrix, eg. > https://travis-ci.org/pv/testrig/ > > Of course, if the suggestion is that the results are generated on > somewhere else than on travis, then that's a different matter. > > -- > Pauli Virtanen > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From nadavh at visionsense.com Thu Feb 4 11:59:46 2016 From: nadavh at visionsense.com (Nadav Horesh) Date: Thu, 4 Feb 2016 16:59:46 +0000 Subject: [Numpy-discussion] [OT] Interpolation of an unevently sampled bandwidth limited signal In-Reply-To: References: , Message-ID: Excellent! I was looking for nonuniform FFT as a component for the interpolation. I am thinking of combining nufft with czt (from scipy) for the interpolation. Nadav ________________________________ From: NumPy-Discussion on behalf of Charles R Harris Sent: 04 February 2016 17:17 To: Discussion of Numerical Python Subject: Re: [Numpy-discussion] [OT] Interpolation of an unevently sampled bandwidth limited signal On Thu, Feb 4, 2016 at 4:34 AM, Nadav Horesh > wrote: Thank you, I'll try this. Interpolation by the sinc function is equivalent to what yiu get if you'll synthesize a smooth function by summing its Fourier component obtained via FFT of the data. You might be interested in the NUFFT, see https://jakevdp.github.io/blog/2015/02/24/optimizing-python-with-numpy-and-numba/ Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Thu Feb 4 12:33:52 2016 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 4 Feb 2016 10:33:52 -0700 Subject: [Numpy-discussion] Numpy 1.11.0b2 released In-Reply-To: References: <56ABA552.1090002@gmail.com> <56ADE8A9.7030901@googlemail.com> Message-ID: On Wed, Feb 3, 2016 at 10:18 PM, Nathaniel Smith wrote: > On Tue, Feb 2, 2016 at 8:45 AM, Pauli Virtanen wrote: > > 01.02.2016, 23:25, Ralf Gommers kirjoitti: > > [clip] > >> So: it would really help if someone could pick up the automation part of > >> this and improve the stack testing, so the numpy release manager doesn't > >> have to do this. > > > > quick hack: https://github.com/pv/testrig > > > > Not that I'm necessarily volunteering to maintain the setup, though, but > > if it seems useful, move it under numpy org. > > That's pretty cool :-). I also was fiddling with a similar idea a bit, > though much less fancy... my little script cheats and uses miniconda to > fetch pre-built versions of some packages, and then runs the tests against > numpy 1.10.2 (as shipped by anaconda) + the numpy master, and does a diff > (with a bit of massaging to make things more readable, like summarizing > warnings): > > https://travis-ci.org/njsmith/numpy/builds/106865202 > > Search for "#####" to jump between sections of the output. > > Some observations: > > testing* matplotlib* this way doesn't work, b/c they need special test > data files that anaconda doesn't ship :-/ > > *scipy*: > *one new failure*, in test_nanmedian_all_axis > 250 calls to np.testing.rand (wtf), 92 calls to random_integers, 3 uses > of datetime64 with timezones. And for some reason the new numpy gives more > "invalid value encountered in greater"-type warnings. > > *astropy*: > *two weird failures* that hopefully some astropy person will look into; > two spurious failures due to over-strict testing of warnings > > *scikit-learn*: > several* new failures:* 1 "invalid slice" (?), 2 "OverflowError: value > too large to convert to int". No idea what's up with these. Hopefully some > scikit-learn person will investigate? > 2 np.ma view warnings, 16 multi-character strings used where "C" or "F" > expected, 1514 (!!) calls to random_integers > > *pandas:* > zero new failures, only new warnings are about NaT, as expected. I guess > their whole "running their tests against numpy master" thing works! 
> > > *statsmodels:* > * absolute disaster*. *261 *new failures, I think mostly because of > numpy getting pickier about float->int conversions. Also a few "invalid > slice". > 102 np.ma view warnings. > > I don't have a great sense of whether the statsmodels breakages are ones > that will actually impact users, or if they're just like, 1 bad utility > function that only gets used in the test suite. (well, probably not the > latter, because they do have different tracebacks). If this is typical > though then we may need to back those integer changes out and replace them > by a really loud obnoxious warning for a release or two :-/ The other > problem here is that statsmodels hasn't done a release since 2014 :-/ > I'm going to do a second beta this weekend and will try putting it up on pypi. The statsmodels are a concern, we may need to put off the transition to integer only indexes. OTOH, if statsmodels can't fix things up we will have to deal with that at some point. Apparently we also need to do something about invisible deprecation warnings. Python changing the default to ignore was, IIRC, due to a Python screw up in backporting PyCapsule to 2.7 and deprecating PyCObject in the process. The easiest way out of that hole was painting it over. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Fri Feb 5 11:27:59 2016 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Fri, 5 Feb 2016 08:27:59 -0800 Subject: [Numpy-discussion] Numpy 1.11.0b2 released In-Reply-To: References: <56ABA552.1090002@gmail.com> <56ADE8A9.7030901@googlemail.com> Message-ID: <6161619481366224101@unknownmsgid> > An extra ~2 hours of tests / 6-way parallelism is not that big a deal > in the grand scheme of things (and I guess it's probably less than > that if we can take advantage of existing binary builds) If we set up a numpy-testing conda channel, it could be used to cache binary builds for all he versions of everything we want to test against. Conda-build-all could make it manageable to maintain that channel. -CHB From njs at pobox.com Fri Feb 5 12:55:10 2016 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 5 Feb 2016 09:55:10 -0800 Subject: [Numpy-discussion] Numpy 1.11.0b2 released In-Reply-To: <6161619481366224101@unknownmsgid> References: <56ABA552.1090002@gmail.com> <56ADE8A9.7030901@googlemail.com> <6161619481366224101@unknownmsgid> Message-ID: On Feb 5, 2016 8:28 AM, "Chris Barker - NOAA Federal" wrote: > > > An extra ~2 hours of tests / 6-way parallelism is not that big a deal > > in the grand scheme of things (and I guess it's probably less than > > that if we can take advantage of existing binary builds) > > If we set up a numpy-testing conda channel, it could be used to cache > binary builds for all he versions of everything we want to test > against. > > Conda-build-all could make it manageable to maintain that channel. What would be the advantage of maintaining that channel ourselves instead of using someone else's binary builds that already exist (e.g. Anaconda's, or official project wheels)? -n -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From pav at iki.fi Fri Feb 5 13:14:30 2016 From: pav at iki.fi (Pauli Virtanen) Date: Fri, 5 Feb 2016 20:14:30 +0200 Subject: [Numpy-discussion] Numpy 1.11.0b2 released In-Reply-To: References: <56ABA552.1090002@gmail.com> <56ADE8A9.7030901@googlemail.com> <6161619481366224101@unknownmsgid> Message-ID: 05.02.2016, 19:55, Nathaniel Smith kirjoitti: > On Feb 5, 2016 8:28 AM, "Chris Barker - NOAA Federal" > wrote: >> >>> An extra ~2 hours of tests / 6-way parallelism is not that big a deal >>> in the grand scheme of things (and I guess it's probably less than >>> that if we can take advantage of existing binary builds) >> >> If we set up a numpy-testing conda channel, it could be used to cache >> binary builds for all he versions of everything we want to test >> against. >> >> Conda-build-all could make it manageable to maintain that channel. > > What would be the advantage of maintaining that channel ourselves instead > of using someone else's binary builds that already exist (e.g. Anaconda's, > or official project wheels)? ABI compatibility. However, as I understand it, not many backward ABI incompatible changes in Numpy are not expected in future. If they were, I note that if you work in the same environment, you can push repeated compilation times to zero compared to the time it takes to run tests in a way that requires less configuration, by enabling ccache/f90cache. From chris.barker at noaa.gov Fri Feb 5 16:16:46 2016 From: chris.barker at noaa.gov (Chris Barker) Date: Fri, 5 Feb 2016 13:16:46 -0800 Subject: [Numpy-discussion] Numpy 1.11.0b2 released In-Reply-To: References: <56ABA552.1090002@gmail.com> <56ADE8A9.7030901@googlemail.com> <6161619481366224101@unknownmsgid> Message-ID: On Fri, Feb 5, 2016 at 9:55 AM, Nathaniel Smith wrote: > > If we set up a numpy-testing conda channel, it could be used to cache > > binary builds for all he versions of everything we want to test > > against. > > > > Conda-build-all could make it manageable to maintain that channel. > > What would be the advantage of maintaining that channel ourselves instead > of using someone else's binary builds that already exist (e.g. Anaconda's, > or official project wheels)? > other's binary wheels are only available for the versions that are supported. Usually the latest releases, but Anaconda doesn't always have the latest builds of everything. Maybe we want to test against matplotlib master (or a release candidate, or??), for instance. And when we are testing a numpy-abi-breaking release, we'll need to have everything tested against that release. Usually, when you set up a conda environment, it preferentially pulls from the default channel anyway (or any other channel you set up) , so we'd only maintain stuff that wasn't readily available elsewhere. -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From josef.pktd at gmail.com Fri Feb 5 16:41:10 2016 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 5 Feb 2016 16:41:10 -0500 Subject: [Numpy-discussion] Numpy 1.11.0b2 released In-Reply-To: References: <56ABA552.1090002@gmail.com> <56ADE8A9.7030901@googlemail.com> <6161619481366224101@unknownmsgid> Message-ID: On Fri, Feb 5, 2016 at 3:24 PM, wrote: > > > On Fri, Feb 5, 2016 at 1:14 PM, Pauli Virtanen wrote: > >> 05.02.2016, 19:55, Nathaniel Smith kirjoitti: >> > On Feb 5, 2016 8:28 AM, "Chris Barker - NOAA Federal" < >> chris.barker at noaa.gov> >> > wrote: >> >> >> >>> An extra ~2 hours of tests / 6-way parallelism is not that big a deal >> >>> in the grand scheme of things (and I guess it's probably less than >> >>> that if we can take advantage of existing binary builds) >> >> >> >> If we set up a numpy-testing conda channel, it could be used to cache >> >> binary builds for all he versions of everything we want to test >> >> against. >> >> >> >> Conda-build-all could make it manageable to maintain that channel. >> > >> > What would be the advantage of maintaining that channel ourselves >> instead >> > of using someone else's binary builds that already exist (e.g. >> Anaconda's, >> > or official project wheels)? >> >> ABI compatibility. However, as I understand it, not many backward ABI >> incompatible changes in Numpy are not expected in future. >> >> If they were, I note that if you work in the same environment, you can >> push repeated compilation times to zero compared to the time it takes to >> run tests in a way that requires less configuration, by enabling >> ccache/f90cache. >> > > > control of fortran compiler and libraries > > I was just looking at some new test errors on TravisCI in unchanged code > of statsmodels, and it looks like conda switched from openblas to mkl > yesterday. > > (statsmodels doesn't care when compiling which BLAS/LAPACK is used as long > as they work because we don't have Fortran code.) > > Josef > > (sending again, delivery refused) > > >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Fri Feb 5 18:24:37 2016 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 5 Feb 2016 15:24:37 -0800 Subject: [Numpy-discussion] Numpy 1.11.0b2 released In-Reply-To: References: <56ABA552.1090002@gmail.com> <56ADE8A9.7030901@googlemail.com> <6161619481366224101@unknownmsgid> Message-ID: On Fri, Feb 5, 2016 at 1:16 PM, Chris Barker wrote: > On Fri, Feb 5, 2016 at 9:55 AM, Nathaniel Smith wrote: >> >> > If we set up a numpy-testing conda channel, it could be used to cache >> > binary builds for all he versions of everything we want to test >> > against. >> > >> > Conda-build-all could make it manageable to maintain that channel. >> >> What would be the advantage of maintaining that channel ourselves instead >> of using someone else's binary builds that already exist (e.g. Anaconda's, >> or official project wheels)? > > other's binary wheels are only available for the versions that are > supported. Usually the latest releases, but Anaconda doesn't always have the > latest builds of everything. True, though official project wheels will hopefully solve that soon. > Maybe we want to test against matplotlib master (or a release candidate, > or??), for instance. 
Generally I think for numpy's purposes we want to test against the latest released version, because it doesn't do end-users much good if a numpy release breaks their environment, and the only fix is hiding in some git repo somewhere :-). But yeah. > And when we are testing a numpy-abi-breaking release, we'll need to have > everything tested against that release. There aren't any current plans to have such a release, but true. -n -- Nathaniel J. Smith -- https://vorpus.org From faltet at gmail.com Sat Feb 6 05:50:51 2016 From: faltet at gmail.com (Francesc Alted) Date: Sat, 6 Feb 2016 11:50:51 +0100 Subject: [Numpy-discussion] ANN: numexpr 2.5 Message-ID: ========================= Announcing Numexpr 2.5 ========================= Numexpr is a fast numerical expression evaluator for NumPy. With it, expressions that operate on arrays (like "3*a+4*b") are accelerated and use less memory than doing the same calculation in Python. It wears multi-threaded capabilities, as well as support for Intel's MKL (Math Kernel Library), which allows an extremely fast evaluation of transcendental functions (sin, cos, tan, exp, log...) while squeezing the last drop of performance out of your multi-core processors. Look here for a some benchmarks of numexpr using MKL: https://github.com/pydata/numexpr/wiki/NumexprMKL Its only dependency is NumPy (MKL is optional), so it works well as an easy-to-deploy, easy-to-use, computational engine for projects that don't want to adopt other solutions requiring more heavy dependencies. What's new ========== In this version, a lock has been added so that numexpr can be called from multithreaded apps. Mind that this does not prevent numexpr to use multiple cores internally. Also, a new min() and max() functions have been added. Thanks to contributors! In case you want to know more in detail what has changed in this version, see: https://github.com/pydata/numexpr/blob/master/RELEASE_NOTES.rst Where I can find Numexpr? ========================= The project is hosted at GitHub in: https://github.com/pydata/numexpr You can get the packages from PyPI as well (but not for RC releases): http://pypi.python.org/pypi/numexpr Share your experience ===================== Let us know of any bugs, suggestions, gripes, kudos, etc. you may have. Enjoy data! -- Francesc Alted -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Sat Feb 6 15:26:34 2016 From: matthew.brett at gmail.com (Matthew Brett) Date: Sat, 6 Feb 2016 12:26:34 -0800 Subject: [Numpy-discussion] Multi-distribution Linux wheels - please test Message-ID: Hi, As some of you may have seen, Robert McGibbon and Nathaniel have just guided a PEP for multi-distribution Linux wheels past the approval process over on distutils-sig: https://www.python.org/dev/peps/pep-0513/ The PEP includes a docker image on which y'all can build wheels which match the PEP: https://quay.io/repository/manylinux/manylinux Now we're at the stage where we need stress-testing of the built wheels to find any problems we hadn't thought of. 
I've built numpy and scipy wheels here: https://nipy.bic.berkeley.edu/manylinux/ So, if you have a Linux distribution handy, we would love to hear from you about the results of testing these guys, maybe on the lines of: pip install -f https://nipy.bic.berkeley.edu/manylinux numpy scipy python -c 'import numpy; numpy.test()' python -c 'import scipy; scipy.test()' These manylinux wheels should soon be available on pypi, and soon after, installable with latest pip, so we would like to fix as many problems as possible before going live. Cheers, Matthew From Permafacture at gmail.com Sat Feb 6 17:56:34 2016 From: Permafacture at gmail.com (Elliot Hallmark) Date: Sat, 6 Feb 2016 16:56:34 -0600 Subject: [Numpy-discussion] resizeable arrays using shared memory? Message-ID: Hi all, I have a program that uses resize-able arrays. I already over-provision the arrays and use slices, but every now and then the data outgrows that array and it needs to be resized. Now, I would like to have these arrays shared between processes spawned via multiprocessing (for fast interprocess communication purposes, not for parallelizing work on an array). I don't care about mapping to a file on disk, and I don't want disk I/O happening. I don't care (really) about data being copied in memory on resize. I *do* want the array to be resized "in place", so that the child processes can still access the arrays from the object they were initialized with. I can share arrays easily using arrays that are backed by memmap. Ie: ``` #Source: http://github.com/rainwoodman/sharedmem class anonymousmemmap(numpy.memmap): def __new__(subtype, shape, dtype=numpy.uint8, order='C'): descr = numpy.dtype(dtype) _dbytes = descr.itemsize shape = numpy.atleast_1d(shape) size = 1 for k in shape: size *= k bytes = int(size*_dbytes) if bytes > 0: mm = mmap.mmap(-1,bytes) else: mm = numpy.empty(0, dtype=descr) self = numpy.ndarray.__new__(subtype, shape, dtype=descr, buffer=mm, order=order) self._mmap = mm return self def __array_wrap__(self, outarr, context=None): return numpy.ndarray.__array_wrap__(self.view(numpy.ndarray), outarr, context) ``` This cannot be resized because it does not own it's own data (ValueError: cannot resize this array: it does not own its data). (numpy.memmap has this same issue [0], even if I set refcheck to False and even though the docs say otherwise [1]). arr._mmap.resize(x) fails because it is annonymous (error: [Errno 9] Bad file descriptor). If I create a file and use that fileno to create the memmap, then I can resize `arr._mmap` but the array itself is not resized. Is there a way to accomplish what I want? Or, do I just need to figure out a way to communicate new arrays to the child processes? Thanks, Elliot [0] https://github.com/numpy/numpy/issues/4198. [1] http://docs.scipy.org/doc/numpy/reference/generated/numpy.memmap.resize.html -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Sat Feb 6 18:21:46 2016 From: chris.barker at noaa.gov (Chris Barker) Date: Sat, 6 Feb 2016 15:21:46 -0800 Subject: [Numpy-discussion] Numpy 1.11.0b2 released In-Reply-To: References: <56ABA552.1090002@gmail.com> <56ADE8A9.7030901@googlemail.com> <6161619481366224101@unknownmsgid> Message-ID: On Fri, Feb 5, 2016 at 3:24 PM, Nathaniel Smith wrote: > On Fri, Feb 5, 2016 at 1:16 PM, Chris Barker > wrote: > > >> > If we set up a numpy-testing conda channel, it could be used to cache > >> > binary builds for all he versions of everything we want to test > >> > against. 
> Anaconda doesn't always have the > > latest builds of everything. OK, this may be more or less helpful, depending on what we want to built against. But a conda environment (maybe tied to a custom channel) really does make a nice contained space for testing that can be set up fast on a CI server. If whoever is setting up a test system/matrix thinks this would be useful, I'd be glad to help set it up. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From msarahan at gmail.com Sat Feb 6 18:42:03 2016 From: msarahan at gmail.com (Michael Sarahan) Date: Sat, 06 Feb 2016 23:42:03 +0000 Subject: [Numpy-discussion] Numpy 1.11.0b2 released In-Reply-To: References: <56ABA552.1090002@gmail.com> <56ADE8A9.7030901@googlemail.com> <6161619481366224101@unknownmsgid> Message-ID: FWIW, we (Continuum) are working on a CI system that builds conda recipes. Part of this is testing not only individual packages that change, but also any downstream packages that are also in the repository of recipes. The configuration for this is in https://github.com/conda/conda-recipes/blob/master/.binstar.yml and the project doing the dependency detection is in https://github.com/ContinuumIO/ProtoCI/ This is still being established (particularly, provisioning build workers), but please talk with us if you're interested. Chris, it may still be useful to use docker here (perhaps on the build worker, or elsewhere), also, as the distinction between build machines and user machines is important to make. Docker would be great for making sure that all dependency requirements are met on end-user systems (we've had a few recent issues with libgfortran accidentally missing as a requirement of scipy). Best, Michael On Sat, Feb 6, 2016 at 5:22 PM Chris Barker wrote: > On Fri, Feb 5, 2016 at 3:24 PM, Nathaniel Smith wrote: > >> On Fri, Feb 5, 2016 at 1:16 PM, Chris Barker >> wrote: >> > > >> >> > If we set up a numpy-testing conda channel, it could be used to cache >> >> > binary builds for all he versions of everything we want to test >> >> > against. >> > Anaconda doesn't always have the >> > latest builds of everything. > > > OK, this may be more or less helpful, depending on what we want to built > against. But a conda environment (maybe tied to a custom channel) really > does make a nice contained space for testing that can be set up fast on a > CI server. > > If whoever is setting up a test system/matrix thinks this would be useful, > I'd be glad to help set it up. > > -Chris > > > > > > -- > > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rmcgibbo at gmail.com Sat Feb 6 18:51:11 2016 From: rmcgibbo at gmail.com (Robert T. 
McGibbon) Date: Sat, 6 Feb 2016 15:51:11 -0800 Subject: [Numpy-discussion] Numpy 1.11.0b2 released In-Reply-To: References: <56ABA552.1090002@gmail.com> <56ADE8A9.7030901@googlemail.com> <6161619481366224101@unknownmsgid> Message-ID: > (we've had a few recent issues with libgfortran accidentally missing as a requirement of scipy). On this topic, you may be able to get some milage out of adapting pypa/auditwheel, which can load up extension module `.so` files inside a wheel (or conda package) and walk the shared library dependency tree like the runtime linker (using pyelftools), and check whether things are going to resolve properly and where shared libraries are loaded from. Something like that should be able to, with minimal adaptation to use the conda dependency resolver, check that a conda package properly declares all of the shared library dependencies it actually needs. -Robert On Sat, Feb 6, 2016 at 3:42 PM, Michael Sarahan wrote: > FWIW, we (Continuum) are working on a CI system that builds conda > recipes. Part of this is testing not only individual packages that change, > but also any downstream packages that are also in the repository of > recipes. The configuration for this is in > https://github.com/conda/conda-recipes/blob/master/.binstar.yml and the > project doing the dependency detection is in > https://github.com/ContinuumIO/ProtoCI/ > > This is still being established (particularly, provisioning build > workers), but please talk with us if you're interested. > > Chris, it may still be useful to use docker here (perhaps on the build > worker, or elsewhere), also, as the distinction between build machines and > user machines is important to make. Docker would be great for making sure > that all dependency requirements are met on end-user systems (we've had a > few recent issues with libgfortran accidentally missing as a requirement of > scipy). > > Best, > Michael > > On Sat, Feb 6, 2016 at 5:22 PM Chris Barker wrote: > >> On Fri, Feb 5, 2016 at 3:24 PM, Nathaniel Smith wrote: >> >>> On Fri, Feb 5, 2016 at 1:16 PM, Chris Barker >>> wrote: >>> >> >> >>> >> > If we set up a numpy-testing conda channel, it could be used to >>> cache >>> >> > binary builds for all he versions of everything we want to test >>> >> > against. >>> >> Anaconda doesn't always have the >>> > latest builds of everything. >> >> >> OK, this may be more or less helpful, depending on what we want to built >> against. But a conda environment (maybe tied to a custom channel) really >> does make a nice contained space for testing that can be set up fast on a >> CI server. >> >> If whoever is setting up a test system/matrix thinks this would be >> useful, I'd be glad to help set it up. >> >> -Chris >> >> >> >> >> >> -- >> >> Christopher Barker, Ph.D. >> Oceanographer >> >> Emergency Response Division >> NOAA/NOS/OR&R (206) 526-6959 voice >> 7600 Sand Point Way NE (206) 526-6329 fax >> Seattle, WA 98115 (206) 526-6317 main reception >> >> Chris.Barker at noaa.gov >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- -Robert -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From chris.barker at noaa.gov Sat Feb 6 18:52:37 2016 From: chris.barker at noaa.gov (Chris Barker) Date: Sat, 6 Feb 2016 15:52:37 -0800 Subject: [Numpy-discussion] Numpy 1.11.0b2 released In-Reply-To: References: <56ABA552.1090002@gmail.com> <56ADE8A9.7030901@googlemail.com> <6161619481366224101@unknownmsgid> Message-ID: On Sat, Feb 6, 2016 at 3:42 PM, Michael Sarahan wrote: > FWIW, we (Continuum) are working on a CI system that builds conda recipes. > great, could be handy. I hope you've looked at the open-source systems that do this: obvious-ci and conda-build-all. And conda-smithy to help set it all up.. Chris, it may still be useful to use docker here (perhaps on the build > worker, or elsewhere), also, as the distinction between build machines and > user machines is important to make. Docker would be great for making sure > that all dependency requirements are met on end-user systems > yes -- veryhandy, I have certainly accidentally brough in other system libs in a build.... Too bad it's Linux only. Though very useful for manylinux. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From msarahan at gmail.com Sat Feb 6 19:11:27 2016 From: msarahan at gmail.com (Michael Sarahan) Date: Sun, 07 Feb 2016 00:11:27 +0000 Subject: [Numpy-discussion] Numpy 1.11.0b2 released In-Reply-To: References: <56ABA552.1090002@gmail.com> <56ADE8A9.7030901@googlemail.com> <6161619481366224101@unknownmsgid> Message-ID: Robert, Thanks for pointing out auditwheel. We're experimenting with a GCC 5.2 toolchain, and this tool will be invaluable. Chris, Both conda-build-all and obvious-ci are excellent projects, and we'll leverage them where we can (particularly conda-build-all). Obvious CI and conda-smithy are in a slightly different space, as we want to use our own anaconda.org build service, rather than write scripts to run on other CI services. With more control, we can do cool things like splitting up build jobs and further parallelizing them on more workers, which I see as very important if we're going to be building downstream stuff. As I see it, the single, massive recipe repo that is conda-recipes has been a disadvantage for a while in terms of complexity, but now may be an advantage in terms of building downstream packages (how else would dependency get resolved?) It remains to be seen whether git submodules might replace individual folders in conda-recipes - I think this might give project maintainers more direct control over their packages. The goal, much like ObviousCI, is to enable project maintainers to get their latest releases available in conda sooner, and to simplify the whole CI setup process. We hope we can help each other rather than compete. Best, Michael On Sat, Feb 6, 2016 at 5:53 PM Chris Barker wrote: > On Sat, Feb 6, 2016 at 3:42 PM, Michael Sarahan > wrote: > >> FWIW, we (Continuum) are working on a CI system that builds conda >> recipes. >> > > great, could be handy. I hope you've looked at the open-source systems > that do this: obvious-ci and conda-build-all. And conda-smithy to help set > it all up.. > > Chris, it may still be useful to use docker here (perhaps on the build >> worker, or elsewhere), also, as the distinction between build machines and >> user machines is important to make. 
Docker would be great for making sure >> that all dependency requirements are met on end-user systems >> > > yes -- veryhandy, I have certainly accidentally brough in other system > libs in a build.... > > Too bad it's Linux only. Though very useful for manylinux. > > > -Chris > > > > -- > > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Sat Feb 6 21:01:41 2016 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Sun, 07 Feb 2016 03:01:41 +0100 Subject: [Numpy-discussion] resizeable arrays using shared memory? In-Reply-To: References: Message-ID: <1454810501.1557.5.camel@sipsolutions.net> On Sa, 2016-02-06 at 16:56 -0600, Elliot Hallmark wrote: > Hi all, > > I have a program that uses resize-able arrays. I already over > -provision the arrays and use slices, but every now and then the data > outgrows that array and it needs to be resized. > > Now, I would like to have these arrays shared between processes > spawned via multiprocessing (for fast interprocess communication > purposes, not for parallelizing work on an array). I don't care > about mapping to a file on disk, and I don't want disk I/O happening. > I don't care (really) about data being copied in memory on resize. > I *do* want the array to be resized "in place", so that the child > processes can still access the arrays from the object they were > initialized with. > > > I can share arrays easily using arrays that are backed by memmap. > Ie: > > ``` > #Source: http://github.com/rainwoodman/sharedmem > > > class anonymousmemmap(numpy.memmap): > def __new__(subtype, shape, dtype=numpy.uint8, order='C'): > > descr = numpy.dtype(dtype) > _dbytes = descr.itemsize > > shape = numpy.atleast_1d(shape) > size = 1 > for k in shape: > size *= k > > bytes = int(size*_dbytes) > > if bytes > 0: > mm = mmap.mmap(-1,bytes) > else: > mm = numpy.empty(0, dtype=descr) > self = numpy.ndarray.__new__(subtype, shape, dtype=descr, > buffer=mm, order=order) > self._mmap = mm > return self > > def __array_wrap__(self, outarr, context=None): > return > numpy.ndarray.__array_wrap__(self.view(numpy.ndarray), outarr, > context) > ``` > > This cannot be resized because it does not own it's own data > (ValueError: cannot resize this array: it does not own its data). > (numpy.memmap has this same issue [0], even if I set refcheck to > False and even though the docs say otherwise [1]). > > arr._mmap.resize(x) fails because it is annonymous (error: [Errno 9] > Bad file descriptor). If I create a file and use that fileno to > create the memmap, then I can resize `arr._mmap` but the array itself > is not resized. > > Is there a way to accomplish what I want? Or, do I just need to > figure out a way to communicate new arrays to the child processes? > I guess the answer is no, but the first question should be whether you can create a new array viewing the same data that is just larger? Since you have the mmap, that would be creating a new view into it. I.e. your "array" would be the memmap, and to use it, you always rewrap it into a new numpy array. 
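A tiny sketch of that rewrapping idea, assuming the anonymous mmap is over-provisioned up front so that "growing" the array only means wrapping a larger view of the same buffer (names and sizes here are illustrative, not from the original code):

```
import mmap
import numpy as np

mm = mmap.mmap(-1, 8 * 1024 * 1024)   # over-provisioned anonymous shared mapping

def as_array(buf, n, dtype=np.float64):
    # Re-wrap the first n elements of the buffer as an ndarray view (no copy).
    return np.frombuffer(buf, dtype=dtype, count=n)

a = as_array(mm, 100)
a[:] = 1.0
b = as_array(mm, 1000)                 # later, a larger view over the same memory
assert b[0] == 1.0                     # writes through `a` are visible through `b`
```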
Other then that, you would have to mess with the internal ndarray structure, since these kind of operations appear rather unsafe. - Sebastian > Thanks, > Elliot > > [0] https://github.com/numpy/numpy/issues/4198. > > [1] http://docs.scipy.org/doc/numpy/reference/generated/numpy.memmap. > resize.html > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From nadavh at visionsense.com Sun Feb 7 00:28:48 2016 From: nadavh at visionsense.com (Nadav Horesh) Date: Sun, 7 Feb 2016 05:28:48 +0000 Subject: [Numpy-discussion] Multi-distribution Linux wheels - please test In-Reply-To: References: Message-ID: Test platform: python 3.4.1 on archlinux x86_64 scipy test: OK OK (KNOWNFAIL=97, SKIP=1626) numpy tests: Failed on long double and int128 tests, and got one error: Traceback (most recent call last): File "/usr/lib/python3.5/site-packages/nose/case.py", line 198, in runTest self.test(*self.arg) File "/usr/lib/python3.5/site-packages/numpy/core/tests/test_longdouble.py", line 108, in test_fromstring_missing np.array([1])) File "/usr/lib/python3.5/site-packages/numpy/testing/utils.py", line 296, in assert_equal return assert_array_equal(actual, desired, err_msg, verbose) File "/usr/lib/python3.5/site-packages/numpy/testing/utils.py", line 787, in assert_array_equal verbose=verbose, header='Arrays are not equal') File "/usr/lib/python3.5/site-packages/numpy/testing/utils.py", line 668, in assert_array_compare raise AssertionError(msg) AssertionError: Arrays are not equal (shapes (6,), (1,) mismatch) x: array([ 1., -1., 3., 4., 5., 6.]) y: array([1]) ---------------------------------------------------------------------- Ran 6019 tests in 28.029s FAILED (KNOWNFAIL=13, SKIP=12, errors=1, failures=18 ________________________________________ From: NumPy-Discussion on behalf of Matthew Brett Sent: 06 February 2016 22:26 To: Discussion of Numerical Python; SciPy Developers List Subject: [Numpy-discussion] Multi-distribution Linux wheels - please test Hi, As some of you may have seen, Robert McGibbon and Nathaniel have just guided a PEP for multi-distribution Linux wheels past the approval process over on distutils-sig: https://www.python.org/dev/peps/pep-0513/ The PEP includes a docker image on which y'all can build wheels which match the PEP: https://quay.io/repository/manylinux/manylinux Now we're at the stage where we need stress-testing of the built wheels to find any problems we hadn't thought of. I've built numpy and scipy wheels here: https://nipy.bic.berkeley.edu/manylinux/ So, if you have a Linux distribution handy, we would love to hear from you about the results of testing these guys, maybe on the lines of: pip install -f https://nipy.bic.berkeley.edu/manylinux numpy scipy python -c 'import numpy; numpy.test()' python -c 'import scipy; scipy.test()' These manylinux wheels should soon be available on pypi, and soon after, installable with latest pip, so we would like to fix as many problems as possible before going live. 
Cheers, Matthew _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion From matthew.brett at gmail.com Sun Feb 7 00:52:02 2016 From: matthew.brett at gmail.com (Matthew Brett) Date: Sat, 6 Feb 2016 21:52:02 -0800 Subject: [Numpy-discussion] Multi-distribution Linux wheels - please test In-Reply-To: References: Message-ID: On Sat, Feb 6, 2016 at 9:28 PM, Nadav Horesh wrote: > Test platform: python 3.4.1 on archlinux x86_64 > > scipy test: OK > > OK (KNOWNFAIL=97, SKIP=1626) > > > numpy tests: Failed on long double and int128 tests, and got one error: > > Traceback (most recent call last): > File "/usr/lib/python3.5/site-packages/nose/case.py", line 198, in runTest > self.test(*self.arg) > File "/usr/lib/python3.5/site-packages/numpy/core/tests/test_longdouble.py", line 108, in test_fromstring_missing > np.array([1])) > File "/usr/lib/python3.5/site-packages/numpy/testing/utils.py", line 296, in assert_equal > return assert_array_equal(actual, desired, err_msg, verbose) > File "/usr/lib/python3.5/site-packages/numpy/testing/utils.py", line 787, in assert_array_equal > verbose=verbose, header='Arrays are not equal') > File "/usr/lib/python3.5/site-packages/numpy/testing/utils.py", line 668, in assert_array_compare > raise AssertionError(msg) > AssertionError: > Arrays are not equal > > (shapes (6,), (1,) mismatch) > x: array([ 1., -1., 3., 4., 5., 6.]) > y: array([1]) > > ---------------------------------------------------------------------- > Ran 6019 tests in 28.029s > > FAILED (KNOWNFAIL=13, SKIP=12, errors=1, failures=18 Great - thanks so much for doing this. Do you get a different error if you compile from source? If you compile from source, do you link to OpenBLAS? Thanks again, Matthew From nadavh at visionsense.com Sun Feb 7 05:06:43 2016 From: nadavh at visionsense.com (Nadav Horesh) Date: Sun, 7 Feb 2016 10:06:43 +0000 Subject: [Numpy-discussion] Multi-distribution Linux wheels - please test In-Reply-To: References: , Message-ID: The reult tests of numpy 1.10.4 installed from source: OK (KNOWNFAIL=4, SKIP=6) I think I use openblas, as it is installed instead the normal blas/cblas. 
Nadav, ________________________________________ From: NumPy-Discussion on behalf of Nadav Horesh Sent: 07 February 2016 07:28 To: Discussion of Numerical Python; SciPy Developers List Subject: Re: [Numpy-discussion] Multi-distribution Linux wheels - please test Test platform: python 3.4.1 on archlinux x86_64 scipy test: OK OK (KNOWNFAIL=97, SKIP=1626) numpy tests: Failed on long double and int128 tests, and got one error: Traceback (most recent call last): File "/usr/lib/python3.5/site-packages/nose/case.py", line 198, in runTest self.test(*self.arg) File "/usr/lib/python3.5/site-packages/numpy/core/tests/test_longdouble.py", line 108, in test_fromstring_missing np.array([1])) File "/usr/lib/python3.5/site-packages/numpy/testing/utils.py", line 296, in assert_equal return assert_array_equal(actual, desired, err_msg, verbose) File "/usr/lib/python3.5/site-packages/numpy/testing/utils.py", line 787, in assert_array_equal verbose=verbose, header='Arrays are not equal') File "/usr/lib/python3.5/site-packages/numpy/testing/utils.py", line 668, in assert_array_compare raise AssertionError(msg) AssertionError: Arrays are not equal (shapes (6,), (1,) mismatch) x: array([ 1., -1., 3., 4., 5., 6.]) y: array([1]) ---------------------------------------------------------------------- Ran 6019 tests in 28.029s FAILED (KNOWNFAIL=13, SKIP=12, errors=1, failures=18 ________________________________________ From: NumPy-Discussion on behalf of Matthew Brett Sent: 06 February 2016 22:26 To: Discussion of Numerical Python; SciPy Developers List Subject: [Numpy-discussion] Multi-distribution Linux wheels - please test Hi, As some of you may have seen, Robert McGibbon and Nathaniel have just guided a PEP for multi-distribution Linux wheels past the approval process over on distutils-sig: https://www.python.org/dev/peps/pep-0513/ The PEP includes a docker image on which y'all can build wheels which match the PEP: https://quay.io/repository/manylinux/manylinux Now we're at the stage where we need stress-testing of the built wheels to find any problems we hadn't thought of. I've built numpy and scipy wheels here: https://nipy.bic.berkeley.edu/manylinux/ So, if you have a Linux distribution handy, we would love to hear from you about the results of testing these guys, maybe on the lines of: pip install -f https://nipy.bic.berkeley.edu/manylinux numpy scipy python -c 'import numpy; numpy.test()' python -c 'import scipy; scipy.test()' These manylinux wheels should soon be available on pypi, and soon after, installable with latest pip, so we would like to fix as many problems as possible before going live. 
Cheers, Matthew _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion From njs at pobox.com Sun Feb 7 05:40:10 2016 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 7 Feb 2016 02:40:10 -0800 Subject: [Numpy-discussion] Multi-distribution Linux wheels - please test In-Reply-To: References: Message-ID: On Feb 6, 2016 12:27 PM, "Matthew Brett" wrote: > > Hi, > > As some of you may have seen, Robert McGibbon and Nathaniel have just > guided a PEP for multi-distribution Linux wheels past the approval > process over on distutils-sig: > > https://www.python.org/dev/peps/pep-0513/ > > The PEP includes a docker image on which y'all can build wheels which > match the PEP: > > https://quay.io/repository/manylinux/manylinux This is the wrong repository :-) It moved, and there are two now: quay.io/pypa/manylinux1_x86_64 quay.io/pypa/manylinux1_i686 -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From Permafacture at gmail.com Sun Feb 7 18:11:34 2016 From: Permafacture at gmail.com (Elliot Hallmark) Date: Sun, 7 Feb 2016 17:11:34 -0600 Subject: [Numpy-discussion] resizeable arrays using shared memory? In-Reply-To: <1454810501.1557.5.camel@sipsolutions.net> References: <1454810501.1557.5.camel@sipsolutions.net> Message-ID: That makes sense. I could either send a signal to the child process letting it know to re-instantiate the numpy array using the same (but now resized) buffer, or I could have it check to see if the buffer has been resized when it might need it and re-instantiate then. That's actually not too bad. It would be nice if the array could be resized, but it's probably unstable to do so and there isn't much demand for it. Thanks, Elliot On Sat, Feb 6, 2016 at 8:01 PM, Sebastian Berg wrote: > On Sa, 2016-02-06 at 16:56 -0600, Elliot Hallmark wrote: > > Hi all, > > > > I have a program that uses resize-able arrays. I already over > > -provision the arrays and use slices, but every now and then the data > > outgrows that array and it needs to be resized. > > > > Now, I would like to have these arrays shared between processes > > spawned via multiprocessing (for fast interprocess communication > > purposes, not for parallelizing work on an array). I don't care > > about mapping to a file on disk, and I don't want disk I/O happening. > > I don't care (really) about data being copied in memory on resize. > > I *do* want the array to be resized "in place", so that the child > > processes can still access the arrays from the object they were > > initialized with. > > > > > > I can share arrays easily using arrays that are backed by memmap. 
> > Ie: > > > > ``` > > #Source: http://github.com/rainwoodman/sharedmem > > > > > > class anonymousmemmap(numpy.memmap): > > def __new__(subtype, shape, dtype=numpy.uint8, order='C'): > > > > descr = numpy.dtype(dtype) > > _dbytes = descr.itemsize > > > > shape = numpy.atleast_1d(shape) > > size = 1 > > for k in shape: > > size *= k > > > > bytes = int(size*_dbytes) > > > > if bytes > 0: > > mm = mmap.mmap(-1,bytes) > > else: > > mm = numpy.empty(0, dtype=descr) > > self = numpy.ndarray.__new__(subtype, shape, dtype=descr, > > buffer=mm, order=order) > > self._mmap = mm > > return self > > > > def __array_wrap__(self, outarr, context=None): > > return > > numpy.ndarray.__array_wrap__(self.view(numpy.ndarray), outarr, > > context) > > ``` > > > > This cannot be resized because it does not own it's own data > > (ValueError: cannot resize this array: it does not own its data). > > (numpy.memmap has this same issue [0], even if I set refcheck to > > False and even though the docs say otherwise [1]). > > > > arr._mmap.resize(x) fails because it is annonymous (error: [Errno 9] > > Bad file descriptor). If I create a file and use that fileno to > > create the memmap, then I can resize `arr._mmap` but the array itself > > is not resized. > > > > Is there a way to accomplish what I want? Or, do I just need to > > figure out a way to communicate new arrays to the child processes? > > > > I guess the answer is no, but the first question should be whether you > can create a new array viewing the same data that is just larger? Since > you have the mmap, that would be creating a new view into it. > > I.e. your "array" would be the memmap, and to use it, you always rewrap > it into a new numpy array. > > Other then that, you would have to mess with the internal ndarray > structure, since these kind of operations appear rather unsafe. > > - Sebastian > > > > Thanks, > > Elliot > > > > [0] https://github.com/numpy/numpy/issues/4198. > > > > [1] http://docs.scipy.org/doc/numpy/reference/generated/numpy.memmap. > > resize.html > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Sun Feb 7 18:33:01 2016 From: matthew.brett at gmail.com (Matthew Brett) Date: Sun, 7 Feb 2016 15:33:01 -0800 Subject: [Numpy-discussion] Multi-distribution Linux wheels - please test In-Reply-To: References: Message-ID: Hi, On Sun, Feb 7, 2016 at 2:06 AM, Nadav Horesh wrote: > The reult tests of numpy 1.10.4 installed from source: > > OK (KNOWNFAIL=4, SKIP=6) > > > I think I use openblas, as it is installed instead the normal blas/cblas. Thanks again for the further tests. 
What do you get for: python -c 'import numpy; print(numpy.__config__.show())' Matthew From njs at pobox.com Sun Feb 7 19:17:46 2016 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 7 Feb 2016 16:17:46 -0800 Subject: [Numpy-discussion] [SciPy-Dev] Multi-distribution Linux wheels - please test In-Reply-To: References: Message-ID: On Feb 7, 2016 15:27, "Charles R Harris" wrote: > > > > On Sun, Feb 7, 2016 at 2:16 PM, Nathaniel Smith wrote: >> >> On Sun, Feb 7, 2016 at 9:49 AM, Charles R Harris >> wrote: >> > >> > >> > On Sun, Feb 7, 2016 at 3:40 AM, Nathaniel Smith wrote: >> >> >> >> On Feb 6, 2016 12:27 PM, "Matthew Brett" wrote: >> >> > >> >> > Hi, >> >> > >> >> > As some of you may have seen, Robert McGibbon and Nathaniel have just >> >> > guided a PEP for multi-distribution Linux wheels past the approval >> >> > process over on distutils-sig: >> >> > >> >> > https://www.python.org/dev/peps/pep-0513/ >> >> > >> >> > The PEP includes a docker image on which y'all can build wheels which >> >> > match the PEP: >> >> > >> >> > https://quay.io/repository/manylinux/manylinux >> >> >> >> This is the wrong repository :-) It moved, and there are two now: >> >> >> >> quay.io/pypa/manylinux1_x86_64 >> >> quay.io/pypa/manylinux1_i686 >> > >> > >> > I'm going to put out 1.11.0b3 today. What would be the best thing to do for >> > testing? >> >> I'd say, don't worry about building linux wheels as part of the >> release cycle yet -- it'll still be a bit before they're allowed on >> pypi or pip will recognize the new special tag. So for now you can >> leave it to Matthew or someone to build test images and stick them up >> on a server somewhere, same as before :-) > > > Should I try putting the sources up on pypi? +1 -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From nilsc.becker at gmail.com Sun Feb 7 19:39:28 2016 From: nilsc.becker at gmail.com (Nils Becker) Date: Mon, 8 Feb 2016 01:39:28 +0100 Subject: [Numpy-discussion] Linking other libm-Implementation Message-ID: Hi all, I wanted to know if there is any sane way to build numpy while linking to a different implementation of libm? A drop-in replacement for libm (e.g. openlibm) should in principle work, I guess, but I did not manage to actually make it work. As far as I understand the build code, setting MATHLIB=openlibm should suffice, but it did not. The build works fine, but in the end when running numpy apparently the functions of the system libm.so are used. I could not verify this directly (as I do not know how) but noticed that there is no performance difference between the builds - while there is one with pure C programs linked against libm and openlibm. Using amdlibm would require some work as the functions are prefixed with "_amd", I guess? Using intels libimf should work when using intels compiler, but I did not try this. With gcc I did not get it to work. A quite general question: At the moment the performance and the accuracy of the base mathematical functions depends on the platform and libm-Implementation of the system. Although there are functions defined in npy_math, they are only used as fall-backs, if they are not provided by a library. (correct me if I am wrong here) Is there some plan to change this in the future and provide defined behaviour (specified accuracy and/or speed) across platforms? As I understood it Julia started openlibm for this reason (which is based on fdlibm/msun, same as npy_math). Cheers Nils -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From njs at pobox.com Sun Feb 7 20:15:06 2016 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 7 Feb 2016 17:15:06 -0800 Subject: [Numpy-discussion] Linking other libm-Implementation In-Reply-To: References: Message-ID: On Sun, Feb 7, 2016 at 4:39 PM, Nils Becker wrote: > Hi all, > > I wanted to know if there is any sane way to build numpy while linking to a > different implementation of libm? > A drop-in replacement for libm (e.g. openlibm) should in principle work, I > guess, but I did not manage to actually make it work. As far as I understand > the build code, setting MATHLIB=openlibm should suffice, but it did not. The > build works fine, but in the end when running numpy apparently the functions > of the system libm.so are used. I could not verify this directly (as I do > not know how) but noticed that there is no performance difference between > the builds - while there is one with pure C programs linked against libm and > openlibm. > Using amdlibm would require some work as the functions are prefixed with > "_amd", I guess? Using intels libimf should work when using intels compiler, > but I did not try this. With gcc I did not get it to work. > > A quite general question: At the moment the performance and the accuracy of > the base mathematical functions depends on the platform and > libm-Implementation of the system. Although there are functions defined in > npy_math, they are only used as fall-backs, if they are not provided by a > library. (correct me if I am wrong here) > Is there some plan to change this in the future and provide defined > behaviour (specified accuracy and/or speed) across platforms? As I > understood it Julia started openlibm for this reason (which is based on > fdlibm/msun, same as npy_math). The npy_math functions are used if otherwise unavailable OR if someone has at some point noticed that say glibc 2.4-2.10 has a bad quality tan (or whatever) and added a special case hack that checks for those particular library versions and uses our built-in version instead. It's not the most convenient setup to maintain, so there's been some discussion of trying openlibm instead [1], but AFAIK you're the first person to find the time to actually sit down and try doing it :-). You should be able to tell what math library you're linked to by running ldd (on linux) or otool (on OS X) against the .so / .dylib files inside your built copy of numpy -- e.g. ldd numpy/core/umath.cpython-34m.so (exact filename and command will vary depending on python version and platform). -n [1] https://github.com/numpy/numpy/search?q=openlibm&type=Issues&utf8=%E2%9C%93 -- Nathaniel J. 
Smith -- https://vorpus.org From nadavh at visionsense.com Mon Feb 8 01:09:56 2016 From: nadavh at visionsense.com (Nadav Horesh) Date: Mon, 8 Feb 2016 06:09:56 +0000 Subject: [Numpy-discussion] Multi-distribution Linux wheels - please test In-Reply-To: References: , Message-ID: Thank you fo reminding me, it is OK now: $ python -c 'import numpy; print(numpy.__config__.show())' lapack_opt_info: library_dirs = ['/usr/local/lib'] language = c libraries = ['openblas'] define_macros = [('HAVE_CBLAS', None)] blas_mkl_info: NOT AVAILABLE openblas_info: library_dirs = ['/usr/local/lib'] language = c libraries = ['openblas'] define_macros = [('HAVE_CBLAS', None)] openblas_lapack_info: library_dirs = ['/usr/local/lib'] language = c libraries = ['openblas'] define_macros = [('HAVE_CBLAS', None)] blas_opt_info: library_dirs = ['/usr/local/lib'] language = c libraries = ['openblas'] define_macros = [('HAVE_CBLAS', None)] None I updated openblas to the latest version (0.2.15) and it pass the tests Nadav. ________________________________________ From: NumPy-Discussion on behalf of Matthew Brett Sent: 08 February 2016 01:33 To: Discussion of Numerical Python Subject: Re: [Numpy-discussion] Multi-distribution Linux wheels - please test Hi, On Sun, Feb 7, 2016 at 2:06 AM, Nadav Horesh wrote: > The reult tests of numpy 1.10.4 installed from source: > > OK (KNOWNFAIL=4, SKIP=6) > > > I think I use openblas, as it is installed instead the normal blas/cblas. Thanks again for the further tests. What do you get for: python -c 'import numpy; print(numpy.__config__.show())' Matthew _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion From matthew.brett at gmail.com Mon Feb 8 01:13:49 2016 From: matthew.brett at gmail.com (Matthew Brett) Date: Sun, 7 Feb 2016 22:13:49 -0800 Subject: [Numpy-discussion] Multi-distribution Linux wheels - please test In-Reply-To: References: Message-ID: On Sun, Feb 7, 2016 at 10:09 PM, Nadav Horesh wrote: > Thank you fo reminding me, it is OK now: > $ python -c 'import numpy; print(numpy.__config__.show())' > > lapack_opt_info: > library_dirs = ['/usr/local/lib'] > language = c > libraries = ['openblas'] > define_macros = [('HAVE_CBLAS', None)] > blas_mkl_info: > NOT AVAILABLE > openblas_info: > library_dirs = ['/usr/local/lib'] > language = c > libraries = ['openblas'] > define_macros = [('HAVE_CBLAS', None)] > openblas_lapack_info: > library_dirs = ['/usr/local/lib'] > language = c > libraries = ['openblas'] > define_macros = [('HAVE_CBLAS', None)] > blas_opt_info: > library_dirs = ['/usr/local/lib'] > language = c > libraries = ['openblas'] > define_macros = [('HAVE_CBLAS', None)] > None > > I updated openblas to the latest version (0.2.15) and it pass the tests Oh dear - now I'm confused. So you installed the wheel, and tested it, and it gave a test failure. Then you updated openblas using pacman, and then reran the tests against the wheel numpy, and they passed? That's a bit frightening - the wheel should only see its own copy of openblas... 
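One way to check that directly, along the lines of the ldd suggestion earlier in the thread - a rough sketch that runs ldd over numpy's compiled extension modules and prints the libm/BLAS lines (Linux only; module filenames vary by platform and Python version):

```
import glob
import os
import subprocess
import numpy as np

pkg_dir = os.path.dirname(np.__file__)
for so in sorted(glob.glob(os.path.join(pkg_dir, "core", "*.so"))):
    out = subprocess.check_output(["ldd", so]).decode()
    hits = [line.strip() for line in out.splitlines()
            if "libm" in line or "blas" in line.lower()]
    if hits:
        print(os.path.basename(so))
        print("\n".join(hits))
```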
Thanks for persisting, Matthew From njs at pobox.com Mon Feb 8 01:15:13 2016 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 7 Feb 2016 22:15:13 -0800 Subject: [Numpy-discussion] Multi-distribution Linux wheels - please test In-Reply-To: References: Message-ID: On Sat, Feb 6, 2016 at 9:28 PM, Nadav Horesh wrote: > Test platform: python 3.4.1 on archlinux x86_64 > > scipy test: OK > > OK (KNOWNFAIL=97, SKIP=1626) > > > numpy tests: Failed on long double and int128 tests, and got one error: Could you post the complete output from the test suite somewhere? (Maybe gist.github.com) -n -- Nathaniel J. Smith -- https://vorpus.org From nadavh at visionsense.com Mon Feb 8 02:10:06 2016 From: nadavh at visionsense.com (Nadav Horesh) Date: Mon, 8 Feb 2016 07:10:06 +0000 Subject: [Numpy-discussion] Multi-distribution Linux wheels - please test In-Reply-To: References: , Message-ID: I have atlas-lapack-base installed via pacman (required by sagemath).
> Since the numpy installation insisted on openblas on /usr/local, I got the > openblas source-code and installed it on /usr/local. > BTW, I use 1.11b rather then 1.10.x since the 1.10 is very slow in > handling recarrays. For the tests I am erasing the 1.11 installation, and > installing the 1.10.4 wheel. I do verify that I have the right version > before running the tests, but I am not sure if there no unnoticed side > effects. > > Would it help if I put a side the openblas installation and rerun the test? > > Nadav > ________________________________________ > From: NumPy-Discussion on behalf of > Matthew Brett > Sent: 08 February 2016 08:13 > To: Discussion of Numerical Python > Subject: Re: [Numpy-discussion] Multi-distribution Linux wheels - please > test > > On Sun, Feb 7, 2016 at 10:09 PM, Nadav Horesh > wrote: > > Thank you fo reminding me, it is OK now: > > $ python -c 'import numpy; print(numpy.__config__.show())' > > > > lapack_opt_info: > > library_dirs = ['/usr/local/lib'] > > language = c > > libraries = ['openblas'] > > define_macros = [('HAVE_CBLAS', None)] > > blas_mkl_info: > > NOT AVAILABLE > > openblas_info: > > library_dirs = ['/usr/local/lib'] > > language = c > > libraries = ['openblas'] > > define_macros = [('HAVE_CBLAS', None)] > > openblas_lapack_info: > > library_dirs = ['/usr/local/lib'] > > language = c > > libraries = ['openblas'] > > define_macros = [('HAVE_CBLAS', None)] > > blas_opt_info: > > library_dirs = ['/usr/local/lib'] > > language = c > > libraries = ['openblas'] > > define_macros = [('HAVE_CBLAS', None)] > > None > > > > I updated openblas to the latest version (0.2.15) and it pass the tests > > Oh dear - now I'm confused. So you installed the wheel, and tested > it, and it gave a test failure. Then you updated openblas using > pacman, and then reran the tests against the wheel numpy, and they > passed? That's a bit frightening - the wheel should only see its own > copy of openblas... > > Thans for persisting, > > Matthew > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Mon Feb 8 02:48:27 2016 From: matthew.brett at gmail.com (Matthew Brett) Date: Sun, 7 Feb 2016 23:48:27 -0800 Subject: [Numpy-discussion] Multi-distribution Linux wheels - please test In-Reply-To: References: Message-ID: Hi Nadav, On Sun, Feb 7, 2016 at 11:13 PM, Nathaniel Smith wrote: > (This is not relevant to the main topic of the thread, but FYI I think the > recarray issues are fixed in 1.10.4.) > > On Feb 7, 2016 11:10 PM, "Nadav Horesh" wrote: >> >> I have atlas-lapack-base installed via pacman (required by sagemath). >> Since the numpy installation insisted on openblas on /usr/local, I got the >> openblas source-code and installed it on /usr/local. >> BTW, I use 1.11b rather then 1.10.x since the 1.10 is very slow in >> handling recarrays. For the tests I am erasing the 1.11 installation, and >> installing the 1.10.4 wheel. I do verify that I have the right version >> before running the tests, but I am not sure if there no unnoticed side >> effects. >> >> Would it help if I put a side the openblas installation and rerun the >> test? 
Would you mind doing something like this, and posting the output?: virtualenv test-manylinux source test-manylinux/bin/activate pip install -f https://nipy.bic.berkeley.edu/manylinux numpy==1.10.4 nose python -c 'import numpy; numpy.test()' python -c 'import numpy; print(numpy.__config__.show())' deactivate virtualenv test-from-source source test-from-source/bin/activate pip install numpy==1.10.4 nose python -c 'import numpy; numpy.test()' python -c 'import numpy; print(numpy.__config__.show())' deactivate I'm puzzled that the wheel gives a test error when the source install does not, and my best guess was an openblas problem, but this just to make sure we have the output from the exact same numpy version, at least. Thanks again, Matthew From njs at pobox.com Mon Feb 8 03:03:29 2016 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 8 Feb 2016 00:03:29 -0800 Subject: [Numpy-discussion] Multi-distribution Linux wheels - please test In-Reply-To: References: Message-ID: On Feb 7, 2016 11:49 PM, "Matthew Brett" wrote: > > Hi Nadav, > > On Sun, Feb 7, 2016 at 11:13 PM, Nathaniel Smith wrote: > > (This is not relevant to the main topic of the thread, but FYI I think the > > recarray issues are fixed in 1.10.4.) > > > > On Feb 7, 2016 11:10 PM, "Nadav Horesh" wrote: > >> > >> I have atlas-lapack-base installed via pacman (required by sagemath). > >> Since the numpy installation insisted on openblas on /usr/local, I got the > >> openblas source-code and installed it on /usr/local. > >> BTW, I use 1.11b rather then 1.10.x since the 1.10 is very slow in > >> handling recarrays. For the tests I am erasing the 1.11 installation, and > >> installing the 1.10.4 wheel. I do verify that I have the right version > >> before running the tests, but I am not sure if there no unnoticed side > >> effects. > >> > >> Would it help if I put a side the openblas installation and rerun the > >> test? > > Would you mind doing something like this, and posting the output?: > > virtualenv test-manylinux > source test-manylinux/bin/activate > pip install -f https://nipy.bic.berkeley.edu/manylinux numpy==1.10.4 nose > python -c 'import numpy; numpy.test()' > python -c 'import numpy; print(numpy.__config__.show())' > deactivate > > virtualenv test-from-source > source test-from-source/bin/activate > pip install numpy==1.10.4 nose > python -c 'import numpy; numpy.test()' > python -c 'import numpy; print(numpy.__config__.show())' > deactivate > > I'm puzzled that the wheel gives a test error when the source install > does not, and my best guess was an openblas problem, but this just to > make sure we have the output from the exact same numpy version, at > least. It's hard to say without seeing the full output, but AFAICT the only failures mentioned so far are in long double stuff, which shouldn't have any connection to openblas at all? -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From olivier.grisel at ensta.org Mon Feb 8 03:09:36 2016 From: olivier.grisel at ensta.org (Olivier Grisel) Date: Mon, 8 Feb 2016 09:09:36 +0100 Subject: [Numpy-discussion] Multi-distribution Linux wheels - please test In-Reply-To: References: Message-ID: I used docker to run the numpy tests on base/archlinux. I had to pacman -Sy python-pip openssl and gcc (required by one of the numpy tests): ``` Ran 5621 tests in 34.482s OK (KNOWNFAIL=4, SKIP=9) ``` Everything looks fine. 
-- Olivier From olivier.grisel at ensta.org Mon Feb 8 04:35:42 2016 From: olivier.grisel at ensta.org (Olivier Grisel) Date: Mon, 8 Feb 2016 10:35:42 +0100 Subject: [Numpy-discussion] Multi-distribution Linux wheels - please test In-Reply-To: References: Message-ID: I found another problem by running the tests of scikit-learn: python3 -c "import numpy as np; from scipy import linalg; linalg.eigh(np.random.randn(200, 200))" Segmentation fault Note that the following works: python3 -c "import numpy as np; np.linalg.eigh(np.random.randn(200, 200))" Also note that all scipy tests pass: Ran 20180 tests in 366.163s OK (KNOWNFAIL=97, SKIP=1657) -- Olivier Grisel From nilsc.becker at gmail.com Mon Feb 8 06:04:27 2016 From: nilsc.becker at gmail.com (Nils Becker) Date: Mon, 8 Feb 2016 12:04:27 +0100 Subject: [Numpy-discussion] Linking other libm-Implementation In-Reply-To: References: Message-ID: > The npy_math functions are used if otherwise unavailable OR if someone > has at some point noticed that say glibc 2.4-2.10 has a bad quality > tan (or whatever) and added a special case hack that checks for those > particular library versions and uses our built-in version instead. > It's not the most convenient setup to maintain, so there's been some > discussion of trying openlibm instead [1], but AFAIK you're the first > person to find the time to actually sit down and try doing it :-). > > You should be able to tell what math library you're linked to by > running ldd (on linux) or otool (on OS X) against the .so / .dylib > files inside your built copy of numpy -- e.g. > > ldd numpy/core/umath.cpython-34m.so > > (exact filename and command will vary depending on python version and > platform). > > -n > > [1] > https://github.com/numpy/numpy/search?q=openlibm&type=Issues&utf8=%E2%9C%93 > > Ok, I with a little help from someone, at least I got it to work somehow. Apparently linking to openlibm is not a problem, MATHLIB=openlibm does the job. The resulting .so-files are linked to openlibm AND libm. I do not know why, maybe you would have to call gcc with -nostdlib and explicitly include everything you need. When running such a build of numpy, however, only the functions in libm are called. What did the job was to export LD_PRELOAD=/usr/lib/libopenlibm.so. In that case the functions from openlibm are used. This works with any build of numpy and needs no rebuilding. Of course its hacky and not a solution but at the moment it seems by far the easiest way to use a different libm implementation. This does also work with intels libimf. It does not work with amdlibm as they use the prefix amd_ in function names which would require real changes to the build system. Very superficial benchmarks (see below) seem devastating for gnu libm. It seems that openlibm (compiled with gcc -mtune=native -O3) performs really well and intels libm implementation is the best (on my intel CPU). I did not check the accuracy of the functions, though. My own code uses a lot of trigonometric and complex functions (optics calculations). I'd guess it could go 25% faster by just using a better libm implementation. Therefore, I have an interest in getting sane linking to a defined libm implementation to work. Apparently openlibm seems quite a good choice for numpy, at least performance wise. However, I did not find any documentation or tests of the accuracy of its functions. A benchmarking and testing (for accuracy) code for libms would probably be a good starting point for a discussion. 
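A rough sketch of such a micro-benchmark through numpy's ufuncs, matching the "3000 x f(double[100000])" format of the numbers quoted below (whether it actually exercises the preloaded libm depends on how numpy was built and on LD_PRELOAD, as described above):

```
import timeit
import numpy as np

x = np.random.rand(100000) + 0.1       # strictly positive, so log() is defined
for fn in (np.sin, np.log, np.exp):
    t = timeit.timeit(lambda: fn(x), number=3000)
    print("3000 x %s(double[100000]): %s s" % (fn.__name__, t))
```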
I could maybe help with that - but apparently not with any linking/building stuff (I just don't get it). Benchmark: gnu libm.so 3000 x sin(double[100000]): 6.68215647800389 s 3000 x log(double[100000]): 8.86350397899514 s 3000 x exp(double[100000]): 6.560557693999726 s openlibm.so 3000 x sin(double[100000]): 4.5058218560006935 s 3000 x log(double[100000]): 4.106520485998772 s 3000 x exp(double[100000]): 4.597905882001214 s Intel libimf.so 3000 x sin(double[100000]): 4.282402812998043 s 3000 x log(double[100000]): 4.008453270995233 s 3000 x exp(double[100000]): 3.301279639999848 s -------------- next part -------------- An HTML attachment was scrubbed... URL: From davidmenhur at gmail.com Mon Feb 8 06:29:58 2016 From: davidmenhur at gmail.com (=?UTF-8?B?RGHPgGlk?=) Date: Mon, 8 Feb 2016 12:29:58 +0100 Subject: [Numpy-discussion] Multi-distribution Linux wheels - please test In-Reply-To: References: Message-ID: On 6 February 2016 at 21:26, Matthew Brett wrote: > > pip install -f https://nipy.bic.berkeley.edu/manylinux numpy scipy > python -c 'import numpy; numpy.test()' > python -c 'import scipy; scipy.test()' > All the tests pass on my Fedora 23 with Python 2.7, but it seems to be linking to the system openblas: numpy.show_config() lapack_opt_info: libraries = ['openblas'] library_dirs = ['/usr/local/lib'] define_macros = [('HAVE_CBLAS', None)] language = c blas_opt_info: libraries = ['openblas'] library_dirs = ['/usr/local/lib'] define_macros = [('HAVE_CBLAS', None)] language = c openblas_info: libraries = ['openblas'] library_dirs = ['/usr/local/lib'] define_macros = [('HAVE_CBLAS', None)] language = c openblas_lapack_info: libraries = ['openblas'] library_dirs = ['/usr/local/lib'] define_macros = [('HAVE_CBLAS', None)] language = c blas_mkl_info: NOT AVAILABLE I can also reproduce Ogrisel's segfault. -------------- next part -------------- An HTML attachment was scrubbed... URL: From evgeny.burovskiy at gmail.com Mon Feb 8 06:57:48 2016 From: evgeny.burovskiy at gmail.com (Evgeni Burovski) Date: Mon, 8 Feb 2016 11:57:48 +0000 Subject: [Numpy-discussion] Fwd: Multi-distribution Linux wheels - please test In-Reply-To: References: Message-ID: ---------- Forwarded message ---------- From: Evgeni Burovski Date: Mon, Feb 8, 2016 at 11:56 AM Subject: Re: [Numpy-discussion] Multi-distribution Linux wheels - please test To: Discussion of Numerical Python On Sat, Feb 6, 2016 at 8:26 PM, Matthew Brett wrote: > Hi, > > As some of you may have seen, Robert McGibbon and Nathaniel have just > guided a PEP for multi-distribution Linux wheels past the approval > process over on distutils-sig: > > https://www.python.org/dev/peps/pep-0513/ > > The PEP includes a docker image on which y'all can build wheels which > match the PEP: > > https://quay.io/repository/manylinux/manylinux > > Now we're at the stage where we need stress-testing of the built > wheels to find any problems we hadn't thought of. > > I've built numpy and scipy wheels here: > > https://nipy.bic.berkeley.edu/manylinux/ > > So, if you have a Linux distribution handy, we would love to hear from > you about the results of testing these guys, maybe on the lines of: > > pip install -f https://nipy.bic.berkeley.edu/manylinux numpy scipy > python -c 'import numpy; numpy.test()' > python -c 'import scipy; scipy.test()' > > These manylinux wheels should soon be available on pypi, and soon > after, installable with latest pip, so we would like to fix as many > problems as possible before going live. 
> > Cheers, > > Matthew > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion Hi, Bog-standard Ubuntu 12.04, fresh virtualenv: Python 2.7.3 (default, Jun 22 2015, 19:33:41) [GCC 4.6.3] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import numpy >>> numpy.__version__ '1.10.4' >>> numpy.test() Running unit tests for numpy NumPy version 1.10.4 NumPy relaxed strides checking option: False NumPy is installed in /home/br/virtualenvs/manylinux/local/lib/python2.7/site-packages/numpy Python version 2.7.3 (default, Jun 22 2015, 19:33:41) [GCC 4.6.3] nose version 1.3.7 ====================================================================== ERROR: test_multiarray.TestNewBufferProtocol.test_relaxed_strides ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/br/virtualenvs/manylinux/local/lib/python2.7/site-packages/nose/case.py", line 197, in runTest self.test(*self.arg) File "/home/br/virtualenvs/manylinux/local/lib/python2.7/site-packages/numpy/core/tests/test_multiarray.py", line 5366, in test_relaxed_strides fd.write(c.data) TypeError: 'buffer' does not have the buffer interface ---------------------------------------------------------------------- * Scipy tests pass with one error in TestNanFuncs, but the interpreter crashes immediately afterwards. Same machine, python 3.5: both numpy and scipy tests pass. From olivier.grisel at ensta.org Mon Feb 8 10:19:59 2016 From: olivier.grisel at ensta.org (Olivier Grisel) Date: Mon, 8 Feb 2016 16:19:59 +0100 Subject: [Numpy-discussion] Multi-distribution Linux wheels - please test In-Reply-To: References: Message-ID: Note that the above segfault was found in a VM (docker-machine virtualbox guest VM launched on a OSX host). The DYNAMIC_ARCH feature of OpenBLAS detects an Sandybridge core (using https://gist.github.com/ogrisel/ad4e547a32d0eb18b4ff). Here are the flags of the CPU visible from inside the docker container: cat /proc/cpuinfo | grep flags flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc pni pclmulqdq monitor ssse3 cx16 sse4_1 sse4_2 popcnt aes xsave avx rdrand hypervisor lahf_lm If I fix the Nehalem kernel by setting the environment variable the problem disappears: OPENBLAS_CORETYPE=Nehalem python3 -c "import numpy as np; from scipy import linalg; linalg.eigh(np.random.randn(200, 200))" So this is an issue with the architecture detection of OpenBLAS. -- Olivier From davidmenhur at gmail.com Mon Feb 8 11:23:53 2016 From: davidmenhur at gmail.com (=?UTF-8?B?RGHPgGlk?=) Date: Mon, 8 Feb 2016 17:23:53 +0100 Subject: [Numpy-discussion] Multi-distribution Linux wheels - please test In-Reply-To: References: Message-ID: On 8 February 2016 at 16:19, Olivier Grisel wrote: > > > OPENBLAS_CORETYPE=Nehalem python3 -c "import numpy as np; from scipy > import linalg; linalg.eigh(np.random.randn(200, 200))" > > So this is an issue with the architecture detection of OpenBLAS. I am seeing the same problem on a native Linux box, with Ivy Bridge processor (i5-3317U). According to your script, both my native openblas and the one in the wheel recognises my CPU as Sandybridge, but the wheel produces a segmentation fault. Setting the architecture to Nehalem works. 
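For reference, roughly the same workaround applied from inside Python - a sketch only; the environment variable has to be set before the OpenBLAS shared library is first loaded, i.e. before numpy/scipy are imported:

```
import os
os.environ["OPENBLAS_CORETYPE"] = "Nehalem"   # force the Nehalem kernel

import numpy as np
from scipy import linalg

# The call that segfaulted with the misdetected Sandybridge kernel.
w, v = linalg.eigh(np.random.randn(200, 200))
print(w[:3])
```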
-------------- next part -------------- An HTML attachment was scrubbed... URL: From jtaylor.debian at googlemail.com Mon Feb 8 11:40:02 2016 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Mon, 8 Feb 2016 17:40:02 +0100 Subject: [Numpy-discussion] Multi-distribution Linux wheels - please test In-Reply-To: References: Message-ID: <56B8C4E2.3060901@googlemail.com> On 02/08/2016 05:23 PM, Da?id wrote: > > On 8 February 2016 at 16:19, Olivier Grisel > wrote: > > > > OPENBLAS_CORETYPE=Nehalem python3 -c "import numpy as np; from scipy > import linalg; linalg.eigh(np.random.randn(200, 200))" > > So this is an issue with the architecture detection of OpenBLAS. > > > I am seeing the same problem on a native Linux box, with Ivy Bridge > processor (i5-3317U). According to your script, both my native openblas > and the one in the wheel recognises my CPU as Sandybridge, but the wheel > produces a segmentation fault. Setting the architecture to Nehalem works. > more likely that is a bug the kernel of openblas instead of its cpu detection. The cpuinfo of Oliver indicates its at least a sandy bridge, and ivy bridge is be sandy bridge compatible. Is an up to date version of openblas used? From njs at pobox.com Mon Feb 8 12:36:33 2016 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 8 Feb 2016 09:36:33 -0800 Subject: [Numpy-discussion] Linking other libm-Implementation In-Reply-To: References: Message-ID: On Feb 8, 2016 3:04 AM, "Nils Becker" wrote: > [...] > Very superficial benchmarks (see below) seem devastating for gnu libm. It seems that openlibm (compiled with gcc -mtune=native -O3) performs really well and intels libm implementation is the best (on my intel CPU). I did not check the accuracy of the functions, though. > > My own code uses a lot of trigonometric and complex functions (optics calculations). I'd guess it could go 25% faster by just using a better libm implementation. Therefore, I have an interest in getting sane linking to a defined libm implementation to work. On further thought: I guess that to do this we actually will need to change the names of the functions in openlibm and then use those names when calling from numpy. So long as we're using the regular libm symbol names, it doesn't matter what library the python extensions themselves are linked to; the way ELF symbol lookup works, the libm that the python interpreter is linked to will be checked *before* checking the libm that numpy is linked to, so the symbols will all get shadowed. I guess statically linking openlibm would also work, but not sure that's a great idea since we'd need it multiple places. > Apparently openlibm seems quite a good choice for numpy, at least performance wise. However, I did not find any documentation or tests of the accuracy of its functions. A benchmarking and testing (for accuracy) code for libms would probably be a good starting point for a discussion. I could maybe help with that - but apparently not with any linking/building stuff (I just don't get it). 
> > Benchmark: > > gnu libm.so > 3000 x sin(double[100000]): 6.68215647800389 s > 3000 x log(double[100000]): 8.86350397899514 s > 3000 x exp(double[100000]): 6.560557693999726 s > > openlibm.so > 3000 x sin(double[100000]): 4.5058218560006935 s > 3000 x log(double[100000]): 4.106520485998772 s > 3000 x exp(double[100000]): 4.597905882001214 s > > Intel libimf.so > 3000 x sin(double[100000]): 4.282402812998043 s > 3000 x log(double[100000]): 4.008453270995233 s > 3000 x exp(double[100000]): 3.301279639999848 s I would be highly suspicious that this speed comes at the expense of accuracy... My impression is that there's a lot of room to make speed/accuracy tradeoffs in these functions, and modern glibc's libm has seen a fair amount of scrutiny by people who have access to the same code that openlibm is based off of. But then again, maybe not :-). If these are the operations that you care about optimizing, an even better approach might be to figure out how to integrate a vector math library here like yeppp (BSD licensed) or MKL. Libm tries to optimize log(scalar); these are libraries that specifically try to optimize log(vector). Adding this would require changing numpy's code to use these new APIs though. (Very new gcc can also try to do this in some cases but I don't know how good at it it is... Julian might.) -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From jtaylor.debian at googlemail.com Mon Feb 8 12:54:46 2016 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Mon, 8 Feb 2016 18:54:46 +0100 Subject: [Numpy-discussion] Linking other libm-Implementation In-Reply-To: References: Message-ID: <56B8D666.8070402@googlemail.com> On 02/08/2016 06:36 PM, Nathaniel Smith wrote: > On Feb 8, 2016 3:04 AM, "Nils Becker" > wrote: >> > [...] >> Very superficial benchmarks (see below) seem devastating for gnu libm. > It seems that openlibm (compiled with gcc -mtune=native -O3) performs > really well and intels libm implementation is the best (on my intel > CPU). I did not check the accuracy of the functions, though. >> >> My own code uses a lot of trigonometric and complex functions (optics > calculations). I'd guess it could go 25% faster by just using a better > libm implementation. Therefore, I have an interest in getting sane > linking to a defined libm implementation to work. > > On further thought: I guess that to do this we actually will need to > change the names of the functions in openlibm and then use those names > when calling from numpy. So long as we're using the regular libm symbol > names, it doesn't matter what library the python extensions themselves > are linked to; the way ELF symbol lookup works, the libm that the python > interpreter is linked to will be checked *before* checking the libm that > numpy is linked to, so the symbols will all get shadowed. > > I guess statically linking openlibm would also work, but not sure that's > a great idea since we'd need it multiple places. > >> Apparently openlibm seems quite a good choice for numpy, at least > performance wise. However, I did not find any documentation or tests of > the accuracy of its functions. A benchmarking and testing (for accuracy) > code for libms would probably be a good starting point for a discussion. > I could maybe help with that - but apparently not with any > linking/building stuff (I just don't get it). 
>> >> Benchmark: >> >> gnu libm.so >> 3000 x sin(double[100000]): 6.68215647800389 s >> 3000 x log(double[100000]): 8.86350397899514 s >> 3000 x exp(double[100000]): 6.560557693999726 s >> >> openlibm.so >> 3000 x sin(double[100000]): 4.5058218560006935 s >> 3000 x log(double[100000]): 4.106520485998772 s >> 3000 x exp(double[100000]): 4.597905882001214 s >> >> Intel libimf.so >> 3000 x sin(double[100000]): 4.282402812998043 s >> 3000 x log(double[100000]): 4.008453270995233 s >> 3000 x exp(double[100000]): 3.301279639999848 s > > I would be highly suspicious that this speed comes at the expense of > accuracy... My impression is that there's a lot of room to make > speed/accuracy tradeoffs in these functions, and modern glibc's libm has > seen a fair amount of scrutiny by people who have access to the same > code that openlibm is based off of. But then again, maybe not :-). > > If these are the operations that you care about optimizing, an even > better approach might be to figure out how to integrate a vector math > library here like yeppp (BSD licensed) or MKL. Libm tries to optimize > log(scalar); these are libraries that specifically try to optimize > log(vector). Adding this would require changing numpy's code to use > these new APIs though. (Very new gcc can also try to do this in some > cases but I don't know how good at it it is... Julian might.) > > -n which version of glibm was used here? There are significant difference in performance between versions. Also the input ranges are very important for these functions, depending on input the speed of these functions can vary by factors of 1000. glibm now includes vectorized versions of most math functions, does openlibm have vectorized math? Thats where most speed can be gained, a lot more than 25%. From matthew.brett at gmail.com Mon Feb 8 13:21:19 2016 From: matthew.brett at gmail.com (Matthew Brett) Date: Mon, 8 Feb 2016 10:21:19 -0800 Subject: [Numpy-discussion] [SciPy-Dev] Multi-distribution Linux wheels - please test In-Reply-To: References: Message-ID: Hi, On Mon, Feb 8, 2016 at 3:29 AM, Da?id wrote: > > On 6 February 2016 at 21:26, Matthew Brett wrote: >> >> >> pip install -f https://nipy.bic.berkeley.edu/manylinux numpy scipy >> python -c 'import numpy; numpy.test()' >> python -c 'import scipy; scipy.test()' > > > > All the tests pass on my Fedora 23 with Python 2.7, but it seems to be > linking to the system openblas: > > numpy.show_config() > lapack_opt_info: > libraries = ['openblas'] > library_dirs = ['/usr/local/lib'] > define_macros = [('HAVE_CBLAS', None)] > language = c numpy.show_config() shows the places that numpy found the libraries at build time. In the case of the manylinux wheel builds, I put openblas at /usr/local , but the place the wheel should be loading openblas from is /.libs. For example, I think you'll find that the numpy tests will still pass if you remove any openblas installation at /usr/local . 
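A quick way to double-check which BLAS actually got mapped at run time, independent of what show_config() recorded at build time, is to look at the process's own memory map. This is a Linux-only sketch, and the 'openblas' substring match is just an assumption about the bundled library's file name:

import numpy  # importing numpy pulls in whatever BLAS it links to

with open('/proc/self/maps') as maps:
    blas_paths = sorted({line.split()[-1] for line in maps
                         if 'openblas' in line.lower()})

for path in blas_paths:
    print(path)

On a wheel install this should print a path inside the installed numpy tree (the bundled .libs directory) rather than anything under /usr/local/lib.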
Thanks for testing by the way, Matthew From matthew.brett at gmail.com Mon Feb 8 13:23:17 2016 From: matthew.brett at gmail.com (Matthew Brett) Date: Mon, 8 Feb 2016 10:23:17 -0800 Subject: [Numpy-discussion] Fwd: Multi-distribution Linux wheels - please test In-Reply-To: References: Message-ID: On Mon, Feb 8, 2016 at 3:57 AM, Evgeni Burovski wrote: > ---------- Forwarded message ---------- > From: Evgeni Burovski > Date: Mon, Feb 8, 2016 at 11:56 AM > Subject: Re: [Numpy-discussion] Multi-distribution Linux wheels - please test > To: Discussion of Numerical Python > > > On Sat, Feb 6, 2016 at 8:26 PM, Matthew Brett wrote: >> Hi, >> >> As some of you may have seen, Robert McGibbon and Nathaniel have just >> guided a PEP for multi-distribution Linux wheels past the approval >> process over on distutils-sig: >> >> https://www.python.org/dev/peps/pep-0513/ >> >> The PEP includes a docker image on which y'all can build wheels which >> match the PEP: >> >> https://quay.io/repository/manylinux/manylinux >> >> Now we're at the stage where we need stress-testing of the built >> wheels to find any problems we hadn't thought of. >> >> I've built numpy and scipy wheels here: >> >> https://nipy.bic.berkeley.edu/manylinux/ >> >> So, if you have a Linux distribution handy, we would love to hear from >> you about the results of testing these guys, maybe on the lines of: >> >> pip install -f https://nipy.bic.berkeley.edu/manylinux numpy scipy >> python -c 'import numpy; numpy.test()' >> python -c 'import scipy; scipy.test()' >> >> These manylinux wheels should soon be available on pypi, and soon >> after, installable with latest pip, so we would like to fix as many >> problems as possible before going live. >> >> Cheers, >> >> Matthew >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > Hi, > > Bog-standard Ubuntu 12.04, fresh virtualenv: > > Python 2.7.3 (default, Jun 22 2015, 19:33:41) > [GCC 4.6.3] on linux2 > Type "help", "copyright", "credits" or "license" for more information. >>>> import numpy >>>> numpy.__version__ > '1.10.4' >>>> numpy.test() > Running unit tests for numpy > NumPy version 1.10.4 > NumPy relaxed strides checking option: False > NumPy is installed in > /home/br/virtualenvs/manylinux/local/lib/python2.7/site-packages/numpy > Python version 2.7.3 (default, Jun 22 2015, 19:33:41) [GCC 4.6.3] > nose version 1.3.7 > > > > ====================================================================== > ERROR: test_multiarray.TestNewBufferProtocol.test_relaxed_strides > ---------------------------------------------------------------------- > Traceback (most recent call last): > File "/home/br/virtualenvs/manylinux/local/lib/python2.7/site-packages/nose/case.py", > line 197, in runTest > self.test(*self.arg) > File "/home/br/virtualenvs/manylinux/local/lib/python2.7/site-packages/numpy/core/tests/test_multiarray.py", > line 5366, in test_relaxed_strides > fd.write(c.data) > TypeError: 'buffer' does not have the buffer interface > > ---------------------------------------------------------------------- > > > * Scipy tests pass with one error in TestNanFuncs, but the interpreter > crashes immediately afterwards. > > > Same machine, python 3.5: both numpy and scipy tests pass. 
Ouch - great that you found these, I'll take a look, Matthew From evgeny.burovskiy at gmail.com Mon Feb 8 13:41:01 2016 From: evgeny.burovskiy at gmail.com (Evgeni Burovski) Date: Mon, 8 Feb 2016 18:41:01 +0000 Subject: [Numpy-discussion] [SciPy-Dev] Multi-distribution Linux wheels - please test In-Reply-To: References: Message-ID: > numpy.show_config() shows the places that numpy found the libraries at > build time. In the case of the manylinux wheel builds, I put openblas > at /usr/local , but the place the wheel should be loading openblas > from is /.libs. For example, I think you'll > find that the numpy tests will still pass if you remove any openblas > installation at /usr/local . Confirmed: I do not have openblas in that location, and tests sort of pass (see a parallel email in this thread). By the way, is there a chance you could use a more specific location --- "What does your numpy.show_config() show?" is a question we often ask when receiving bug reports; having a marker location could save us an iteration when dealing with those when your wheels are common. -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Mon Feb 8 13:47:01 2016 From: matthew.brett at gmail.com (Matthew Brett) Date: Mon, 8 Feb 2016 10:47:01 -0800 Subject: [Numpy-discussion] [SciPy-Dev] Multi-distribution Linux wheels - please test In-Reply-To: References: Message-ID: On Mon, Feb 8, 2016 at 10:41 AM, Evgeni Burovski wrote: > >> numpy.show_config() shows the places that numpy found the libraries at >> build time. In the case of the manylinux wheel builds, I put openblas >> at /usr/local , but the place the wheel should be loading openblas >> from is /.libs. For example, I think you'll >> find that the numpy tests will still pass if you remove any openblas >> installation at /usr/local . > > Confirmed: I do not have openblas in that location, and tests sort of pass > (see a parallel email in this thread). > > By the way, is there a chance you could use a more specific location --- > "What does your numpy.show_config() show?" is a question we often ask when > receiving bug reports; having a marker location could save us an iteration > when dealing with those when your wheels are common. That's a good idea. Matthew From matthew.brett at gmail.com Mon Feb 8 14:25:11 2016 From: matthew.brett at gmail.com (Matthew Brett) Date: Mon, 8 Feb 2016 11:25:11 -0800 Subject: [Numpy-discussion] Multi-distribution Linux wheels - please test In-Reply-To: <56B8C4E2.3060901@googlemail.com> References: <56B8C4E2.3060901@googlemail.com> Message-ID: Hi Julian, On Mon, Feb 8, 2016 at 8:40 AM, Julian Taylor wrote: > On 02/08/2016 05:23 PM, Da?id wrote: >> >> On 8 February 2016 at 16:19, Olivier Grisel > > wrote: >> >> >> >> OPENBLAS_CORETYPE=Nehalem python3 -c "import numpy as np; from scipy >> import linalg; linalg.eigh(np.random.randn(200, 200))" >> >> So this is an issue with the architecture detection of OpenBLAS. >> >> >> I am seeing the same problem on a native Linux box, with Ivy Bridge >> processor (i5-3317U). According to your script, both my native openblas >> and the one in the wheel recognises my CPU as Sandybridge, but the wheel >> produces a segmentation fault. Setting the architecture to Nehalem works. >> > > more likely that is a bug the kernel of openblas instead of its cpu > detection. > The cpuinfo of Oliver indicates its at least a sandy bridge, and ivy > bridge is be sandy bridge compatible. > Is an up to date version of openblas used? 
I used the latest release, v0.2.15: https://github.com/matthew-brett/manylinux-builds/blob/master/build_openblas.sh#L5 Is there a later version that we should try? Cheers, Matthew From chris.barker at noaa.gov Mon Feb 8 14:59:02 2016 From: chris.barker at noaa.gov (Chris Barker) Date: Mon, 8 Feb 2016 11:59:02 -0800 Subject: [Numpy-discussion] Numpy 1.11.0b2 released In-Reply-To: References: <56ABA552.1090002@gmail.com> <56ADE8A9.7030901@googlemail.com> <6161619481366224101@unknownmsgid> Message-ID: On Sat, Feb 6, 2016 at 4:11 PM, Michael Sarahan wrote: > Chris, > > Both conda-build-all and obvious-ci are excellent projects, and we'll > leverage them where we can (particularly conda-build-all). Obvious CI and > conda-smithy are in a slightly different space, as we want to use our own > anaconda.org build service, rather than write scripts to run on other CI > services. > I don't think conda-build-all or, for that matter, conda-smithy are fixed to any particular CI server. But anyway, the anaconda.org build service looks nice -- I'll need to give that a try. I've actually been building everything on my own machines anyway so far. > As I see it, the single, massive recipe repo that is conda-recipes has > been a disadvantage for a while in terms of complexity, but now may be an > advantage in terms of building downstream packages (how else would > dependency get resolved?) > yup -- but the other issue is that conda-recipes didn't seem to be maintained, really... > The goal, much like ObviousCI, is to enable project maintainers to get > their latest releases available in conda sooner, and to simplify the whole > CI setup process. We hope we can help each other rather than compete. > Great goal! Thanks, -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Mon Feb 8 15:28:28 2016 From: chris.barker at noaa.gov (Chris Barker) Date: Mon, 8 Feb 2016 12:28:28 -0800 Subject: [Numpy-discussion] GSoC? Message-ID: ANyone interested in Google Summer of Code this year? I think the real challenge is having folks with the time to really put into mentoring, but if folks want to do it -- numpy could really benefit. Maybe as a python.org sub-project? https://wiki.python.org/moin/SummerOfCode/2016 Deadlines are approaching -- so I thought I'd ping the list and see if folks are interested. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From gregor.thalhammer at gmail.com Mon Feb 8 17:32:21 2016 From: gregor.thalhammer at gmail.com (Gregor Thalhammer) Date: Mon, 8 Feb 2016 23:32:21 +0100 Subject: [Numpy-discussion] Linking other libm-Implementation In-Reply-To: References: Message-ID: <9EB7C08F-337E-41D4-A162-9664F1790A07@gmail.com> > Am 08.02.2016 um 18:36 schrieb Nathaniel Smith : > > On Feb 8, 2016 3:04 AM, "Nils Becker" > wrote: > > > [...] > > Very superficial benchmarks (see below) seem devastating for gnu libm. It seems that openlibm (compiled with gcc -mtune=native -O3) performs really well and intels libm implementation is the best (on my intel CPU). 
I did not check the accuracy of the functions, though. > > > > My own code uses a lot of trigonometric and complex functions (optics calculations). I'd guess it could go 25% faster by just using a better libm implementation. Therefore, I have an interest in getting sane linking to a defined libm implementation to work. > > On further thought: I guess that to do this we actually will need to change the names of the functions in openlibm and then use those names when calling from numpy. So long as we're using the regular libm symbol names, it doesn't matter what library the python extensions themselves are linked to; the way ELF symbol lookup works, the libm that the python interpreter is linked to will be checked *before* checking the libm that numpy is linked to, so the symbols will all get shadowed. > > I guess statically linking openlibm would also work, but not sure that's a great idea since we'd need it multiple places. > > > Apparently openlibm seems quite a good choice for numpy, at least performance wise. However, I did not find any documentation or tests of the accuracy of its functions. A benchmarking and testing (for accuracy) code for libms would probably be a good starting point for a discussion. I could maybe help with that - but apparently not with any linking/building stuff (I just don't get it). > > > > Benchmark: > > > > gnu libm.so > > 3000 x sin(double[100000]): 6.68215647800389 s > > 3000 x log(double[100000]): 8.86350397899514 s > > 3000 x exp(double[100000]): 6.560557693999726 s > > > > openlibm.so > > 3000 x sin(double[100000]): 4.5058218560006935 s > > 3000 x log(double[100000]): 4.106520485998772 s > > 3000 x exp(double[100000]): 4.597905882001214 s > > > > Intel libimf.so > > 3000 x sin(double[100000]): 4.282402812998043 s > > 3000 x log(double[100000]): 4.008453270995233 s > > 3000 x exp(double[100000]): 3.301279639999848 s > > I would be highly suspicious that this speed comes at the expense of accuracy... My impression is that there's a lot of room to make speed/accuracy tradeoffs in these functions, and modern glibc's libm has seen a fair amount of scrutiny by people who have access to the same code that openlibm is based off of. But then again, maybe not :-). > > If these are the operations that you care about optimizing, an even better approach might be to figure out how to integrate a vector math library here like yeppp (BSD licensed) or MKL. Libm tries to optimize log(scalar); these are libraries that specifically try to optimize log(vector). Adding this would require changing numpy's code to use these new APIs though. (Very new gcc can also try to do this in some cases but I don't know how good at it it is... Julian might.) > Years ago I made the vectorized math functions from Intels Vector Math Library (VML), part of MKL, available for numpy, see https://github.com/geggo/uvml Not particularly difficult, you not even have to change numpy. For some cases (e.g., exp) I have seen speedups up to 5x-10x. Unfortunately MKL is not free, and free vector math libraries like yeppp implement much fewer functions or do not support the required strided memory layout. But to improve performance, numexpr, numba or theano are much better. Gregor > -n > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From c99.smruti at gmail.com Mon Feb 8 18:33:01 2016 From: c99.smruti at gmail.com (SMRUTI RANJAN SAHOO) Date: Tue, 9 Feb 2016 05:03:01 +0530 Subject: [Numpy-discussion] GSoC? In-Reply-To: References: Message-ID: sir actually i am interested very much . so can you help me about this or suggest some , so that i can contribute . Thanks & Regards, Smruti Ranjan Sahoo On Tue, Feb 9, 2016 at 1:58 AM, Chris Barker wrote: > ANyone interested in Google Summer of Code this year? > > I think the real challenge is having folks with the time to really put > into mentoring, but if folks want to do it -- numpy could really benefit. > > Maybe as a python.org sub-project? > > https://wiki.python.org/moin/SummerOfCode/2016 > > Deadlines are approaching -- so I thought I'd ping the list and see if > folks are interested. > > -Chris > > > > -- > > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Mon Feb 8 19:02:47 2016 From: chris.barker at noaa.gov (Chris Barker) Date: Mon, 8 Feb 2016 16:02:47 -0800 Subject: [Numpy-discussion] GSoC? In-Reply-To: References: Message-ID: As you can see in the timeline: https://developers.google.com/open-source/gsoc/timeline We are now in the stage where mentoring organizations are getting their act together. So the question now is -- are there folks that want to mentor for numpy projects? It can be rewarding, but it's a pretty big commitment as well, and, I suppose depending on the project, would require some good knowledge of the innards of numpy -- there are not a lot of those folks out there that have that background. So to students, I suggest you keep an eye out, and engage a little later on in the process. That being said, if you have a idea for a numpy improvement you'd like to work on , by all means propose it and maybe you'll get a mentor or two excited. -CHB On Mon, Feb 8, 2016 at 3:33 PM, SMRUTI RANJAN SAHOO wrote: > sir actually i am interested very much . so can you help me about this or > suggest some , so that i can contribute . > > > > > Thanks & Regards, > Smruti Ranjan Sahoo > > On Tue, Feb 9, 2016 at 1:58 AM, Chris Barker > wrote: > >> ANyone interested in Google Summer of Code this year? >> >> I think the real challenge is having folks with the time to really put >> into mentoring, but if folks want to do it -- numpy could really benefit. >> >> Maybe as a python.org sub-project? >> >> https://wiki.python.org/moin/SummerOfCode/2016 >> >> Deadlines are approaching -- so I thought I'd ping the list and see if >> folks are interested. >> >> -Chris >> >> >> >> -- >> >> Christopher Barker, Ph.D. 
>> Oceanographer >> >> Emergency Response Division >> NOAA/NOS/OR&R (206) 526-6959 voice >> 7600 Sand Point Way NE (206) 526-6329 fax >> Seattle, WA 98115 (206) 526-6317 main reception >> >> Chris.Barker at noaa.gov >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Mon Feb 8 19:37:09 2016 From: matthew.brett at gmail.com (Matthew Brett) Date: Mon, 8 Feb 2016 16:37:09 -0800 Subject: [Numpy-discussion] Fwd: Multi-distribution Linux wheels - please test In-Reply-To: References: Message-ID: On Mon, Feb 8, 2016 at 10:23 AM, Matthew Brett wrote: > On Mon, Feb 8, 2016 at 3:57 AM, Evgeni Burovski > wrote: >> ---------- Forwarded message ---------- >> From: Evgeni Burovski >> Date: Mon, Feb 8, 2016 at 11:56 AM >> Subject: Re: [Numpy-discussion] Multi-distribution Linux wheels - please test >> To: Discussion of Numerical Python >> >> >> On Sat, Feb 6, 2016 at 8:26 PM, Matthew Brett wrote: >>> Hi, >>> >>> As some of you may have seen, Robert McGibbon and Nathaniel have just >>> guided a PEP for multi-distribution Linux wheels past the approval >>> process over on distutils-sig: >>> >>> https://www.python.org/dev/peps/pep-0513/ >>> >>> The PEP includes a docker image on which y'all can build wheels which >>> match the PEP: >>> >>> https://quay.io/repository/manylinux/manylinux >>> >>> Now we're at the stage where we need stress-testing of the built >>> wheels to find any problems we hadn't thought of. >>> >>> I've built numpy and scipy wheels here: >>> >>> https://nipy.bic.berkeley.edu/manylinux/ >>> >>> So, if you have a Linux distribution handy, we would love to hear from >>> you about the results of testing these guys, maybe on the lines of: >>> >>> pip install -f https://nipy.bic.berkeley.edu/manylinux numpy scipy >>> python -c 'import numpy; numpy.test()' >>> python -c 'import scipy; scipy.test()' >>> >>> These manylinux wheels should soon be available on pypi, and soon >>> after, installable with latest pip, so we would like to fix as many >>> problems as possible before going live. >>> >>> Cheers, >>> >>> Matthew >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> >> >> Hi, >> >> Bog-standard Ubuntu 12.04, fresh virtualenv: >> >> Python 2.7.3 (default, Jun 22 2015, 19:33:41) >> [GCC 4.6.3] on linux2 >> Type "help", "copyright", "credits" or "license" for more information. 
>>>>> import numpy >>>>> numpy.__version__ >> '1.10.4' >>>>> numpy.test() >> Running unit tests for numpy >> NumPy version 1.10.4 >> NumPy relaxed strides checking option: False >> NumPy is installed in >> /home/br/virtualenvs/manylinux/local/lib/python2.7/site-packages/numpy >> Python version 2.7.3 (default, Jun 22 2015, 19:33:41) [GCC 4.6.3] >> nose version 1.3.7 >> >> >> >> ====================================================================== >> ERROR: test_multiarray.TestNewBufferProtocol.test_relaxed_strides >> ---------------------------------------------------------------------- >> Traceback (most recent call last): >> File "/home/br/virtualenvs/manylinux/local/lib/python2.7/site-packages/nose/case.py", >> line 197, in runTest >> self.test(*self.arg) >> File "/home/br/virtualenvs/manylinux/local/lib/python2.7/site-packages/numpy/core/tests/test_multiarray.py", >> line 5366, in test_relaxed_strides >> fd.write(c.data) >> TypeError: 'buffer' does not have the buffer interface >> >> ---------------------------------------------------------------------- >> >> >> * Scipy tests pass with one error in TestNanFuncs, but the interpreter >> crashes immediately afterwards. >> >> >> Same machine, python 3.5: both numpy and scipy tests pass. > > Ouch - great that you found these, I'll take a look, I think these are problems with numpy and Python 2.7.3 - because I got the same "TypeError: 'buffer' does not have the buffer interface" on numpy with OS X with Python.org python 2.7.3, installing from a wheel, or installing from source. I also get a scipy segfault with scipy 0.17.0 installed from an OSX wheel, with output ending: test_check_finite (test_basic.TestLstsq) ... /Users/mb312/.virtualenvs/test/lib/python2.7/site-packages/scipy/linalg/basic.py:884: RuntimeWarning: internal gelsd driver lwork query error, required iwork dimension not returned. This is likely the result of LAPACK bug 0038, fixed in LAPACK 3.2.2 (released July 21, 2010). Falling back to 'gelss' driver. warnings.warn(mesg, RuntimeWarning) ok test_random_complex_exact (test_basic.TestLstsq) ... FAIL test_random_complex_overdet (test_basic.TestLstsq) ... Bus error This is so whether scipy is running on top of source- or wheel-built numpy, and for a scipy built from source. Same numpy error installing on a bare Ubuntu 12.04, either installing from a wheel built on 12.04 on travis: pip install -f http://travis-wheels.scikit-image.org --trusted-host travis-wheels.scikit-image.org --no-index numpy or from numpy built from source. I can't replicate the segfault with manylinux wheels and scipy. 
On the other hand, I get a new test error for numpy from manylinux, scipy from manylinux, like this: $ python -c 'import scipy.linalg; scipy.linalg.test()' ====================================================================== FAIL: test_decomp.test_eigh('general ', 6, 'F', True, False, False, (2, 4)) ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/local/lib/python2.7/dist-packages/nose/case.py", line 197, in runTest self.test(*self.arg) File "/usr/local/lib/python2.7/dist-packages/scipy/linalg/tests/test_decomp.py", line 658, in eigenhproblem_general assert_array_almost_equal(diag2_, ones(diag2_.shape[0]), DIGITS[dtype]) File "/usr/local/lib/python2.7/dist-packages/numpy/testing/utils.py", line 892, in assert_array_almost_equal precision=decimal) File "/usr/local/lib/python2.7/dist-packages/numpy/testing/utils.py", line 713, in assert_array_compare raise AssertionError(msg) AssertionError: Arrays are not almost equal to 4 decimals (mismatch 100.0%) x: array([ 0., 0., 0.], dtype=float32) y: array([ 1., 1., 1.]) ---------------------------------------------------------------------- Ran 1507 tests in 14.928s FAILED (KNOWNFAIL=4, SKIP=1, failures=1) This is a very odd error, which we don't get when running over a numpy installed from source, linked to ATLAS, and doesn't happen when running the tests via: nosetests /usr/local/lib/python2.7/dist-packages/scipy/linalg So, something about the copy of numpy (linked to openblas) is affecting the results of scipy (also linked to openblas), and only with a particular environment / test order. If you'd like to try and see whether y'all can do a better job of debugging than me: # Run this script inside a docker container started with this incantation: # docker run -ti --rm ubuntu:12.04 /bin/bash apt-get update apt-get install -y python curl apt-get install libpython2.7 # this won't be necessary with next iteration of manylinux wheel builds curl -LO https://bootstrap.pypa.io/get-pip.py python get-pip.py pip install -f https://nipy.bic.berkeley.edu/manylinux numpy scipy nose python -c 'import scipy.linalg; scipy.linalg.test()' Cheers, Matthew From njs at pobox.com Mon Feb 8 20:26:49 2016 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 8 Feb 2016 17:26:49 -0800 Subject: [Numpy-discussion] Fwd: Multi-distribution Linux wheels - please test In-Reply-To: References: Message-ID: On Mon, Feb 8, 2016 at 4:37 PM, Matthew Brett wrote: [...] > I can't replicate the segfault with manylinux wheels and scipy. 
On > the other hand, I get a new test error for numpy from manylinux, scipy > from manylinux, like this: > > $ python -c 'import scipy.linalg; scipy.linalg.test()' > > ====================================================================== > FAIL: test_decomp.test_eigh('general ', 6, 'F', True, False, False, (2, 4)) > ---------------------------------------------------------------------- > Traceback (most recent call last): > File "/usr/local/lib/python2.7/dist-packages/nose/case.py", line > 197, in runTest > self.test(*self.arg) > File "/usr/local/lib/python2.7/dist-packages/scipy/linalg/tests/test_decomp.py", > line 658, in eigenhproblem_general > assert_array_almost_equal(diag2_, ones(diag2_.shape[0]), DIGITS[dtype]) > File "/usr/local/lib/python2.7/dist-packages/numpy/testing/utils.py", > line 892, in assert_array_almost_equal > precision=decimal) > File "/usr/local/lib/python2.7/dist-packages/numpy/testing/utils.py", > line 713, in assert_array_compare > raise AssertionError(msg) > AssertionError: > Arrays are not almost equal to 4 decimals > > (mismatch 100.0%) > x: array([ 0., 0., 0.], dtype=float32) > y: array([ 1., 1., 1.]) > > ---------------------------------------------------------------------- > Ran 1507 tests in 14.928s > > FAILED (KNOWNFAIL=4, SKIP=1, failures=1) > > This is a very odd error, which we don't get when running over a numpy > installed from source, linked to ATLAS, and doesn't happen when > running the tests via: > > nosetests /usr/local/lib/python2.7/dist-packages/scipy/linalg > > So, something about the copy of numpy (linked to openblas) is > affecting the results of scipy (also linked to openblas), and only > with a particular environment / test order. > > If you'd like to try and see whether y'all can do a better job of > debugging than me: > > # Run this script inside a docker container started with this incantation: > # docker run -ti --rm ubuntu:12.04 /bin/bash > apt-get update > apt-get install -y python curl > apt-get install libpython2.7 # this won't be necessary with next > iteration of manylinux wheel builds > curl -LO https://bootstrap.pypa.io/get-pip.py > python get-pip.py > pip install -f https://nipy.bic.berkeley.edu/manylinux numpy scipy nose > python -c 'import scipy.linalg; scipy.linalg.test()' I just tried this and on my laptop it completed without error. Best guess is that we're dealing with some memory corruption bug inside openblas, so it's getting perturbed by things like exactly what other calls to openblas have happened (which is different depending on whether numpy is linked to openblas), and which core type openblas has detected. On my laptop, which *doesn't* show the problem, running with OPENBLAS_VERBOSE=2 says "Core: Haswell". Guess the next step is checking what core type the failing machines use, and running valgrind... anyone have a good valgrind suppressions file? -n -- Nathaniel J. Smith -- https://vorpus.org From Permafacture at gmail.com Mon Feb 8 21:01:29 2016 From: Permafacture at gmail.com (Elliot Hallmark) Date: Mon, 8 Feb 2016 20:01:29 -0600 Subject: [Numpy-discussion] GSoC? In-Reply-To: References: Message-ID: Is there a clean way of importing existing C code as a vectorized numpy func? Like, it would be awesome to use gdal in a vectorized way just with ctypes or something. Just something I've dreamed of that I thought I'd ask about in regards to the GSoC. 
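For the easy cases something like this already works today: ctypes to load the library, then np.frompyfunc to get broadcasting over arrays. It is only a convenience, though -- every element still goes through a Python-level call, so there is no real speed win. A sketch, using plain libm purely as a stand-in for a real library like gdal:

import ctypes
import ctypes.util

import numpy as np

# Load an existing C library and describe one scalar function.
libm = ctypes.CDLL(ctypes.util.find_library('m'))
libm.cos.restype = ctypes.c_double
libm.cos.argtypes = [ctypes.c_double]

# Wrap the scalar C function so it broadcasts like a ufunc
# (frompyfunc returns an object array, hence the astype below).
c_cos = np.frompyfunc(libm.cos, 1, 1)

x = np.linspace(0.0, np.pi, 5)
print(c_cos(x).astype(float))

Getting the inner loop to stay in C -- generating a real ufunc around an arbitrary C function, numpy.ctypeslib / PyUFunc_FromFuncAndData territory -- is more the GSoC-sized piece of work.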
Elliot On Feb 8, 2016 6:03 PM, "Chris Barker" wrote: > As you can see in the timeline: > > https://developers.google.com/open-source/gsoc/timeline > > We are now in the stage where mentoring organizations are getting their > act together. So the question now is -- are there folks that want to mentor > for numpy projects? It can be rewarding, but it's a pretty big commitment > as well, and, I suppose depending on the project, would require some good > knowledge of the innards of numpy -- there are not a lot of those folks out > there that have that background. > > So to students, I suggest you keep an eye out, and engage a little later > on in the process. > > That being said, if you have a idea for a numpy improvement you'd like to > work on , by all means propose it and maybe you'll get a mentor or two > excited. > > -CHB > > > > > > On Mon, Feb 8, 2016 at 3:33 PM, SMRUTI RANJAN SAHOO > wrote: > >> sir actually i am interested very much . so can you help me about this >> or suggest some , so that i can contribute . >> >> >> >> >> Thanks & Regards, >> Smruti Ranjan Sahoo >> >> On Tue, Feb 9, 2016 at 1:58 AM, Chris Barker >> wrote: >> >>> ANyone interested in Google Summer of Code this year? >>> >>> I think the real challenge is having folks with the time to really put >>> into mentoring, but if folks want to do it -- numpy could really benefit. >>> >>> Maybe as a python.org sub-project? >>> >>> https://wiki.python.org/moin/SummerOfCode/2016 >>> >>> Deadlines are approaching -- so I thought I'd ping the list and see if >>> folks are interested. >>> >>> -Chris >>> >>> >>> >>> -- >>> >>> Christopher Barker, Ph.D. >>> Oceanographer >>> >>> Emergency Response Division >>> NOAA/NOS/OR&R (206) 526-6959 voice >>> 7600 Sand Point Way NE (206) 526-6329 fax >>> Seattle, WA 98115 (206) 526-6317 main reception >>> >>> Chris.Barker at noaa.gov >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> https://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >>> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > > -- > > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Mon Feb 8 21:04:18 2016 From: matthew.brett at gmail.com (Matthew Brett) Date: Mon, 8 Feb 2016 18:04:18 -0800 Subject: [Numpy-discussion] Fwd: Multi-distribution Linux wheels - please test In-Reply-To: References: Message-ID: On Mon, Feb 8, 2016 at 5:26 PM, Nathaniel Smith wrote: > On Mon, Feb 8, 2016 at 4:37 PM, Matthew Brett wrote: > [...] >> I can't replicate the segfault with manylinux wheels and scipy. 
On >> the other hand, I get a new test error for numpy from manylinux, scipy >> from manylinux, like this: >> >> $ python -c 'import scipy.linalg; scipy.linalg.test()' >> >> ====================================================================== >> FAIL: test_decomp.test_eigh('general ', 6, 'F', True, False, False, (2, 4)) >> ---------------------------------------------------------------------- >> Traceback (most recent call last): >> File "/usr/local/lib/python2.7/dist-packages/nose/case.py", line >> 197, in runTest >> self.test(*self.arg) >> File "/usr/local/lib/python2.7/dist-packages/scipy/linalg/tests/test_decomp.py", >> line 658, in eigenhproblem_general >> assert_array_almost_equal(diag2_, ones(diag2_.shape[0]), DIGITS[dtype]) >> File "/usr/local/lib/python2.7/dist-packages/numpy/testing/utils.py", >> line 892, in assert_array_almost_equal >> precision=decimal) >> File "/usr/local/lib/python2.7/dist-packages/numpy/testing/utils.py", >> line 713, in assert_array_compare >> raise AssertionError(msg) >> AssertionError: >> Arrays are not almost equal to 4 decimals >> >> (mismatch 100.0%) >> x: array([ 0., 0., 0.], dtype=float32) >> y: array([ 1., 1., 1.]) >> >> ---------------------------------------------------------------------- >> Ran 1507 tests in 14.928s >> >> FAILED (KNOWNFAIL=4, SKIP=1, failures=1) >> >> This is a very odd error, which we don't get when running over a numpy >> installed from source, linked to ATLAS, and doesn't happen when >> running the tests via: >> >> nosetests /usr/local/lib/python2.7/dist-packages/scipy/linalg >> >> So, something about the copy of numpy (linked to openblas) is >> affecting the results of scipy (also linked to openblas), and only >> with a particular environment / test order. >> >> If you'd like to try and see whether y'all can do a better job of >> debugging than me: >> >> # Run this script inside a docker container started with this incantation: >> # docker run -ti --rm ubuntu:12.04 /bin/bash >> apt-get update >> apt-get install -y python curl >> apt-get install libpython2.7 # this won't be necessary with next >> iteration of manylinux wheel builds >> curl -LO https://bootstrap.pypa.io/get-pip.py >> python get-pip.py >> pip install -f https://nipy.bic.berkeley.edu/manylinux numpy scipy nose >> python -c 'import scipy.linalg; scipy.linalg.test()' > > I just tried this and on my laptop it completed without error. > > Best guess is that we're dealing with some memory corruption bug > inside openblas, so it's getting perturbed by things like exactly what > other calls to openblas have happened (which is different depending on > whether numpy is linked to openblas), and which core type openblas has > detected. > > On my laptop, which *doesn't* show the problem, running with > OPENBLAS_VERBOSE=2 says "Core: Haswell". > > Guess the next step is checking what core type the failing machines > use, and running valgrind... anyone have a good valgrind suppressions > file? My machine (which does give the failure) gives Core: Core2 with OPENBLAS_VERBOSE=2 Matthew From njs at pobox.com Mon Feb 8 21:07:04 2016 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 8 Feb 2016 18:07:04 -0800 Subject: [Numpy-discussion] Fwd: Multi-distribution Linux wheels - please test In-Reply-To: References: Message-ID: On Mon, Feb 8, 2016 at 6:04 PM, Matthew Brett wrote: > On Mon, Feb 8, 2016 at 5:26 PM, Nathaniel Smith wrote: >> On Mon, Feb 8, 2016 at 4:37 PM, Matthew Brett wrote: >> [...] >>> I can't replicate the segfault with manylinux wheels and scipy. 
On >>> the other hand, I get a new test error for numpy from manylinux, scipy >>> from manylinux, like this: >>> >>> $ python -c 'import scipy.linalg; scipy.linalg.test()' >>> >>> ====================================================================== >>> FAIL: test_decomp.test_eigh('general ', 6, 'F', True, False, False, (2, 4)) >>> ---------------------------------------------------------------------- >>> Traceback (most recent call last): >>> File "/usr/local/lib/python2.7/dist-packages/nose/case.py", line >>> 197, in runTest >>> self.test(*self.arg) >>> File "/usr/local/lib/python2.7/dist-packages/scipy/linalg/tests/test_decomp.py", >>> line 658, in eigenhproblem_general >>> assert_array_almost_equal(diag2_, ones(diag2_.shape[0]), DIGITS[dtype]) >>> File "/usr/local/lib/python2.7/dist-packages/numpy/testing/utils.py", >>> line 892, in assert_array_almost_equal >>> precision=decimal) >>> File "/usr/local/lib/python2.7/dist-packages/numpy/testing/utils.py", >>> line 713, in assert_array_compare >>> raise AssertionError(msg) >>> AssertionError: >>> Arrays are not almost equal to 4 decimals >>> >>> (mismatch 100.0%) >>> x: array([ 0., 0., 0.], dtype=float32) >>> y: array([ 1., 1., 1.]) >>> >>> ---------------------------------------------------------------------- >>> Ran 1507 tests in 14.928s >>> >>> FAILED (KNOWNFAIL=4, SKIP=1, failures=1) >>> >>> This is a very odd error, which we don't get when running over a numpy >>> installed from source, linked to ATLAS, and doesn't happen when >>> running the tests via: >>> >>> nosetests /usr/local/lib/python2.7/dist-packages/scipy/linalg >>> >>> So, something about the copy of numpy (linked to openblas) is >>> affecting the results of scipy (also linked to openblas), and only >>> with a particular environment / test order. >>> >>> If you'd like to try and see whether y'all can do a better job of >>> debugging than me: >>> >>> # Run this script inside a docker container started with this incantation: >>> # docker run -ti --rm ubuntu:12.04 /bin/bash >>> apt-get update >>> apt-get install -y python curl >>> apt-get install libpython2.7 # this won't be necessary with next >>> iteration of manylinux wheel builds >>> curl -LO https://bootstrap.pypa.io/get-pip.py >>> python get-pip.py >>> pip install -f https://nipy.bic.berkeley.edu/manylinux numpy scipy nose >>> python -c 'import scipy.linalg; scipy.linalg.test()' >> >> I just tried this and on my laptop it completed without error. >> >> Best guess is that we're dealing with some memory corruption bug >> inside openblas, so it's getting perturbed by things like exactly what >> other calls to openblas have happened (which is different depending on >> whether numpy is linked to openblas), and which core type openblas has >> detected. >> >> On my laptop, which *doesn't* show the problem, running with >> OPENBLAS_VERBOSE=2 says "Core: Haswell". >> >> Guess the next step is checking what core type the failing machines >> use, and running valgrind... anyone have a good valgrind suppressions >> file? > > My machine (which does give the failure) gives > > Core: Core2 > > with OPENBLAS_VERBOSE=2 Yep, that allows me to reproduce it: root at f7153f0cc841:/# OPENBLAS_VERBOSE=2 OPENBLAS_CORETYPE=Core2 python -c 'import scipy.linalg; scipy.linalg.test()' Core: Core2 [...] ====================================================================== FAIL: test_decomp.test_eigh('general ', 6, 'F', True, False, False, (2, 4)) ---------------------------------------------------------------------- [...] 
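For anyone checking their own machine, the detected kernel can also be read directly from the bundled library instead of setting OPENBLAS_VERBOSE. A sketch only: the .libs directory and the libopenblas* file name are assumptions about the manylinux wheel layout, and it relies on the build exporting openblas_get_corename()/openblas_get_config(), which recent OpenBLAS releases do:

import ctypes
import glob
import os

import numpy as np

# Locate the OpenBLAS shared object shipped inside the wheel.
libs_dir = os.path.join(os.path.dirname(np.__file__), '.libs')
sopath = glob.glob(os.path.join(libs_dir, 'libopenblas*.so*'))[0]

openblas = ctypes.CDLL(sopath)
openblas.openblas_get_corename.restype = ctypes.c_char_p
openblas.openblas_get_config.restype = ctypes.c_char_p

print(openblas.openblas_get_corename())  # e.g. Core2, Sandybridge, Haswell
print(openblas.openblas_get_config())    # version and build options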
So this is indeed sounding like an OpenBLAS issue... next stop valgrind, I guess :-/ -- Nathaniel J. Smith -- https://vorpus.org From njs at pobox.com Mon Feb 8 22:59:07 2016 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 8 Feb 2016 19:59:07 -0800 Subject: [Numpy-discussion] Fwd: Multi-distribution Linux wheels - please test In-Reply-To: References: Message-ID: On Mon, Feb 8, 2016 at 6:07 PM, Nathaniel Smith wrote: > On Mon, Feb 8, 2016 at 6:04 PM, Matthew Brett wrote: >> On Mon, Feb 8, 2016 at 5:26 PM, Nathaniel Smith wrote: >>> On Mon, Feb 8, 2016 at 4:37 PM, Matthew Brett wrote: >>> [...] >>>> I can't replicate the segfault with manylinux wheels and scipy. On >>>> the other hand, I get a new test error for numpy from manylinux, scipy >>>> from manylinux, like this: >>>> >>>> $ python -c 'import scipy.linalg; scipy.linalg.test()' >>>> >>>> ====================================================================== >>>> FAIL: test_decomp.test_eigh('general ', 6, 'F', True, False, False, (2, 4)) >>>> ---------------------------------------------------------------------- >>>> Traceback (most recent call last): >>>> File "/usr/local/lib/python2.7/dist-packages/nose/case.py", line >>>> 197, in runTest >>>> self.test(*self.arg) >>>> File "/usr/local/lib/python2.7/dist-packages/scipy/linalg/tests/test_decomp.py", >>>> line 658, in eigenhproblem_general >>>> assert_array_almost_equal(diag2_, ones(diag2_.shape[0]), DIGITS[dtype]) >>>> File "/usr/local/lib/python2.7/dist-packages/numpy/testing/utils.py", >>>> line 892, in assert_array_almost_equal >>>> precision=decimal) >>>> File "/usr/local/lib/python2.7/dist-packages/numpy/testing/utils.py", >>>> line 713, in assert_array_compare >>>> raise AssertionError(msg) >>>> AssertionError: >>>> Arrays are not almost equal to 4 decimals >>>> >>>> (mismatch 100.0%) >>>> x: array([ 0., 0., 0.], dtype=float32) >>>> y: array([ 1., 1., 1.]) >>>> >>>> ---------------------------------------------------------------------- >>>> Ran 1507 tests in 14.928s >>>> >>>> FAILED (KNOWNFAIL=4, SKIP=1, failures=1) >>>> >>>> This is a very odd error, which we don't get when running over a numpy >>>> installed from source, linked to ATLAS, and doesn't happen when >>>> running the tests via: >>>> >>>> nosetests /usr/local/lib/python2.7/dist-packages/scipy/linalg >>>> >>>> So, something about the copy of numpy (linked to openblas) is >>>> affecting the results of scipy (also linked to openblas), and only >>>> with a particular environment / test order. >>>> >>>> If you'd like to try and see whether y'all can do a better job of >>>> debugging than me: >>>> >>>> # Run this script inside a docker container started with this incantation: >>>> # docker run -ti --rm ubuntu:12.04 /bin/bash >>>> apt-get update >>>> apt-get install -y python curl >>>> apt-get install libpython2.7 # this won't be necessary with next >>>> iteration of manylinux wheel builds >>>> curl -LO https://bootstrap.pypa.io/get-pip.py >>>> python get-pip.py >>>> pip install -f https://nipy.bic.berkeley.edu/manylinux numpy scipy nose >>>> python -c 'import scipy.linalg; scipy.linalg.test()' >>> >>> I just tried this and on my laptop it completed without error. >>> >>> Best guess is that we're dealing with some memory corruption bug >>> inside openblas, so it's getting perturbed by things like exactly what >>> other calls to openblas have happened (which is different depending on >>> whether numpy is linked to openblas), and which core type openblas has >>> detected. 
>>> >>> On my laptop, which *doesn't* show the problem, running with >>> OPENBLAS_VERBOSE=2 says "Core: Haswell". >>> >>> Guess the next step is checking what core type the failing machines >>> use, and running valgrind... anyone have a good valgrind suppressions >>> file? >> >> My machine (which does give the failure) gives >> >> Core: Core2 >> >> with OPENBLAS_VERBOSE=2 > > Yep, that allows me to reproduce it: > > root at f7153f0cc841:/# OPENBLAS_VERBOSE=2 OPENBLAS_CORETYPE=Core2 python > -c 'import scipy.linalg; scipy.linalg.test()' > Core: Core2 > [...] > ====================================================================== > FAIL: test_decomp.test_eigh('general ', 6, 'F', True, False, False, (2, 4)) > ---------------------------------------------------------------------- > [...] > > So this is indeed sounding like an OpenBLAS issue... next stop > valgrind, I guess :-/ Here's the valgrind output: https://gist.github.com/njsmith/577d028e79f0a80d2797 There's a lot of it, but no smoking guns have jumped out at me :-/ -n -- Nathaniel J. Smith -- https://vorpus.org From davidmenhur at gmail.com Tue Feb 9 04:21:12 2016 From: davidmenhur at gmail.com (=?UTF-8?B?RGHPgGlk?=) Date: Tue, 9 Feb 2016 10:21:12 +0100 Subject: [Numpy-discussion] Multi-distribution Linux wheels - please test In-Reply-To: References: <56B8C4E2.3060901@googlemail.com> Message-ID: On 8 February 2016 at 20:25, Matthew Brett wrote: > > I used the latest release, v0.2.15: > > https://github.com/matthew-brett/manylinux-builds/blob/master/build_openblas.sh#L5 > > Is there a later version that we should try? > > Cheers, > That is the one in the Fedora repos that is working for me. How are you compiling it? Mine is compiled with GCC 5 with the options seen in the source rpm: http://koji.fedoraproject.org/koji/packageinfo?packageID=15277 -------------- next part -------------- An HTML attachment was scrubbed... URL: From rainwoodman at gmail.com Tue Feb 9 04:51:36 2016 From: rainwoodman at gmail.com (Feng Yu) Date: Tue, 9 Feb 2016 01:51:36 -0800 Subject: [Numpy-discussion] resizeable arrays using shared memory? In-Reply-To: References: <1454810501.1557.5.camel@sipsolutions.net> Message-ID: Hi, If the base address and size of the anonymous memory map are 'shared', then one can protect them with a lock, grow the memmap with remap (or unmap and map, or other tricks), and release the lock. During the 'resize' call, any reference to the array from Python in other processes could just spin on the lock. This is probably more defined than using signals. but I am not sure about how to enforce the spinning when an object is referenced. A possibility is that one can insist that a 'resizable' mmap must be accessed via a context manager, e.g. growable = shm.growable(initsize) rank = do the magic to fork processes if rank == 0: growable.grow(fill=0, size=10) else: with growable as a: a += 10 Yu On Sun, Feb 7, 2016 at 3:11 PM, Elliot Hallmark wrote: > That makes sense. I could either send a signal to the child process letting > it know to re-instantiate the numpy array using the same (but now resized) > buffer, or I could have it check to see if the buffer has been resized when > it might need it and re-instantiate then. That's actually not too bad. It > would be nice if the array could be resized, but it's probably unstable to > do so and there isn't much demand for it. 
> > Thanks, > Elliot > > On Sat, Feb 6, 2016 at 8:01 PM, Sebastian Berg > wrote: >> >> On Sa, 2016-02-06 at 16:56 -0600, Elliot Hallmark wrote: >> > Hi all, >> > >> > I have a program that uses resize-able arrays. I already over >> > -provision the arrays and use slices, but every now and then the data >> > outgrows that array and it needs to be resized. >> > >> > Now, I would like to have these arrays shared between processes >> > spawned via multiprocessing (for fast interprocess communication >> > purposes, not for parallelizing work on an array). I don't care >> > about mapping to a file on disk, and I don't want disk I/O happening. >> > I don't care (really) about data being copied in memory on resize. >> > I *do* want the array to be resized "in place", so that the child >> > processes can still access the arrays from the object they were >> > initialized with. >> > >> > >> > I can share arrays easily using arrays that are backed by memmap. >> > Ie: >> > >> > ``` >> > #Source: http://github.com/rainwoodman/sharedmem >> > >> > >> > class anonymousmemmap(numpy.memmap): >> > def __new__(subtype, shape, dtype=numpy.uint8, order='C'): >> > >> > descr = numpy.dtype(dtype) >> > _dbytes = descr.itemsize >> > >> > shape = numpy.atleast_1d(shape) >> > size = 1 >> > for k in shape: >> > size *= k >> > >> > bytes = int(size*_dbytes) >> > >> > if bytes > 0: >> > mm = mmap.mmap(-1,bytes) >> > else: >> > mm = numpy.empty(0, dtype=descr) >> > self = numpy.ndarray.__new__(subtype, shape, dtype=descr, >> > buffer=mm, order=order) >> > self._mmap = mm >> > return self >> > >> > def __array_wrap__(self, outarr, context=None): >> > return >> > numpy.ndarray.__array_wrap__(self.view(numpy.ndarray), outarr, >> > context) >> > ``` >> > >> > This cannot be resized because it does not own it's own data >> > (ValueError: cannot resize this array: it does not own its data). >> > (numpy.memmap has this same issue [0], even if I set refcheck to >> > False and even though the docs say otherwise [1]). >> > >> > arr._mmap.resize(x) fails because it is annonymous (error: [Errno 9] >> > Bad file descriptor). If I create a file and use that fileno to >> > create the memmap, then I can resize `arr._mmap` but the array itself >> > is not resized. >> > >> > Is there a way to accomplish what I want? Or, do I just need to >> > figure out a way to communicate new arrays to the child processes? >> > >> >> I guess the answer is no, but the first question should be whether you >> can create a new array viewing the same data that is just larger? Since >> you have the mmap, that would be creating a new view into it. >> >> I.e. your "array" would be the memmap, and to use it, you always rewrap >> it into a new numpy array. >> >> Other then that, you would have to mess with the internal ndarray >> structure, since these kind of operations appear rather unsafe. >> >> - Sebastian >> >> >> > Thanks, >> > Elliot >> > >> > [0] https://github.com/numpy/numpy/issues/4198. >> > >> > [1] http://docs.scipy.org/doc/numpy/reference/generated/numpy.memmap. 
>> > resize.html >> > >> > >> > _______________________________________________ >> > NumPy-Discussion mailing list >> > NumPy-Discussion at scipy.org >> > https://mail.scipy.org/mailman/listinfo/numpy-discussion >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > From davidmenhur at gmail.com Tue Feb 9 05:15:36 2016 From: davidmenhur at gmail.com (=?UTF-8?B?RGHPgGlk?=) Date: Tue, 9 Feb 2016 11:15:36 +0100 Subject: [Numpy-discussion] resizeable arrays using shared memory? In-Reply-To: References: Message-ID: On 6 February 2016 at 23:56, Elliot Hallmark wrote: > Now, I would like to have these arrays shared between processes spawned via multiprocessing (for fast interprocess communication purposes, not for parallelizing work on an array). I don't care about mapping to a file on disk, and I don't want disk I/O happening. I don't care (really) about data being copied in memory on resize. I *do* want the array to be resized "in place", so that the child processes can still access the arrays from the object they were initialized with. If you are only reading in parallel, and you can afford the extra dependency, one alternative way to do this would be to use an expandable array from HDF5: http://www.pytables.org/usersguide/libref/homogenous_storage.html#earrayclassdescr To avoid I/O, your file can live in RAM. http://www.pytables.org/cookbook/inmemory_hdf5_files.html From nilsc.becker at gmail.com Tue Feb 9 05:21:51 2016 From: nilsc.becker at gmail.com (Nils Becker) Date: Tue, 9 Feb 2016 11:21:51 +0100 Subject: [Numpy-discussion] Linking other libm-Implementation In-Reply-To: <9EB7C08F-337E-41D4-A162-9664F1790A07@gmail.com> References: <9EB7C08F-337E-41D4-A162-9664F1790A07@gmail.com> Message-ID: 2016-02-08 18:54 GMT+01:00 Julian Taylor : > which version of glibm was used here? There are significant difference > in performance between versions. > Also the input ranges are very important for these functions, depending > on input the speed of these functions can vary by factors of 1000. > > glibm now includes vectorized versions of most math functions, does > openlibm have vectorized math? > Thats where most speed can be gained, a lot more than 25%. glibc 2.22 was used running on archlinux. As far as I know openlibm does not include special vectorized functions. (for reference vectorized operations in glibc: https://sourceware.org/glibc/wiki/libmvec). 2016-02-08 23:32 GMT+01:00 Gregor Thalhammer : > Years ago I made the vectorized math functions from Intels Vector Math > Library (VML), part of MKL, available for numpy, see > https://github.com/geggo/uvml > Not particularly difficult, you not even have to change numpy. For some > cases (e.g., exp) I have seen speedups up to 5x-10x. Unfortunately MKL is > not free, and free vector math libraries like yeppp implement much fewer > functions or do not support the required strided memory layout. But to > improve performance, numexpr, numba or theano are much better. > > Gregor > > Thank you very much for the link! I did not know about numpy.set_numeric_ops. You are right, vectorized operations can push down calculation time per element by factors. 
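For anyone else who had not seen the hook before, the mechanism itself is tiny -- a toy sketch of what the uvml wrapper does, with a do-nothing frompyfunc standing in for a VML-backed ufunc (the replacement here is illustrative only, and of course slower than the default):

import numpy as np

# Stand-in "fast" multiply; a real wrapper would point this at a
# vectorized library instead of a per-element Python call.
my_multiply = np.frompyfunc(lambda a, b: a * b, 2, 1)

old_ops = np.set_numeric_ops(multiply=my_multiply)  # returns the old ops
x = np.arange(4.0)
print(x * x)                   # now routed through my_multiply
np.set_numeric_ops(**old_ops)  # restore the defaults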
The benchmarks done for the yeppp-project also indicate that (however far you would trust them: http://www.yeppp.info/benchmarks.html). But I would agree that this domain should be left to specialized tools like numexpr as fully exploiting the speedup depends on the expression, that should be calculated. It is not suitable as a standard for numpy. Still, I think it would be good to give the possibility to choose the libm numpy links against. And be it simply to allow to choose or guarantee a specific accuracy/performance on different platforms and systems. Maybe maintaining a de-facto libm in npy_math could be replaced with a dependency on e.g. openlibm. But such a decision would require a thorough benchmark/testing of the available solutions. Especially with respect to the accuracy-performance-tradeoff that was mentioned. Cheers Nils -------------- next part -------------- An HTML attachment was scrubbed... URL: From evgeny.burovskiy at gmail.com Tue Feb 9 08:19:29 2016 From: evgeny.burovskiy at gmail.com (Evgeni Burovski) Date: Tue, 9 Feb 2016 13:19:29 +0000 Subject: [Numpy-discussion] Fwd: Multi-distribution Linux wheels - please test In-Reply-To: References: Message-ID: >>> ====================================================================== >>> ERROR: test_multiarray.TestNewBufferProtocol.test_relaxed_strides >>> ---------------------------------------------------------------------- >>> Traceback (most recent call last): >>> File "/home/br/virtualenvs/manylinux/local/lib/python2.7/site-packages/nose/case.py", >>> line 197, in runTest >>> self.test(*self.arg) >>> File "/home/br/virtualenvs/manylinux/local/lib/python2.7/site-packages/numpy/core/tests/test_multiarray.py", >>> line 5366, in test_relaxed_strides >>> fd.write(c.data) >>> TypeError: 'buffer' does not have the buffer interface >>> >>> ---------------------------------------------------------------------- >>> >>> >>> * Scipy tests pass with one error in TestNanFuncs, but the interpreter >>> crashes immediately afterwards. >>> >>> >>> Same machine, python 3.5: both numpy and scipy tests pass. >> >> Ouch - great that you found these, I'll take a look, > > I think these are problems with numpy and Python 2.7.3 - because I got > the same "TypeError: 'buffer' does not have the buffer interface" on > numpy with OS X with Python.org python 2.7.3, installing from a wheel, > or installing from source. Indeed --- updated to python 2.7.11 (Thanks Felix Krull!) and the failure is gone, `numpy.test()` passes. 
However: >>> numpy.test("full") Running unit tests for numpy NumPy version 1.10.4 NumPy relaxed strides checking option: False NumPy is installed in /home/br/virtualenvs/manylinux/local/lib/python2.7/site-packages/numpy Python version 2.7.11 (default, Dec 14 2015, 22:56:59) [GCC 4.6.3] nose version 1.3.7 ====================================================================== ERROR: test_kind.TestKind.test_all ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/br/virtualenvs/manylinux/local/lib/python2.7/site-packages/nose/case.py", line 381, in setUp try_run(self.inst, ('setup', 'setUp')) File "/home/br/virtualenvs/manylinux/local/lib/python2.7/site-packages/nose/util.py", line 471, in try_run return func() File "/home/br/virtualenvs/manylinux/local/lib/python2.7/site-packages/numpy/f2py/tests/util.py", line 367, in setUp module_name=self.module_name) File "/home/br/virtualenvs/manylinux/local/lib/python2.7/site-packages/numpy/f2py/tests/util.py", line 79, in wrapper memo[key] = func(*a, **kw) File "/home/br/virtualenvs/manylinux/local/lib/python2.7/site-packages/numpy/f2py/tests/util.py", line 150, in build_module __import__(module_name) ImportError: /home/br/virtualenvs/manylinux/local/lib/python2.7/site-packages/numpy/core/../.libs/libgfortran.so.3: version `GFORTRAN_1.4' not found (required by /tmp/tmpPVjYDE/_test_ext_module_5405.so) ====================================================================== ERROR: test_mixed.TestMixed.test_all ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/br/virtualenvs/manylinux/local/lib/python2.7/site-packages/nose/case.py", line 381, in setUp try_run(self.inst, ('setup', 'setUp')) File "/home/br/virtualenvs/manylinux/local/lib/python2.7/site-packages/nose/util.py", line 471, in try_run return func() File "/home/br/virtualenvs/manylinux/local/lib/python2.7/site-packages/numpy/f2py/tests/util.py", line 367, in setUp module_name=self.module_name) File "/home/br/virtualenvs/manylinux/local/lib/python2.7/site-packages/numpy/f2py/tests/util.py", line 79, in wrapper memo[key] = func(*a, **kw) File "/home/br/virtualenvs/manylinux/local/lib/python2.7/site-packages/numpy/f2py/tests/util.py", line 150, in build_module __import__(module_name) ImportError: /home/br/virtualenvs/manylinux/local/lib/python2.7/site-packages/numpy/core/../.libs/libgfortran.so.3: version `GFORTRAN_1.4' not found (required by /tmp/tmpPVjYDE/_test_ext_module_5405.so) ====================================================================== ERROR: test_mixed.TestMixed.test_docstring ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/br/virtualenvs/manylinux/local/lib/python2.7/site-packages/nose/case.py", line 381, in setUp try_run(self.inst, ('setup', 'setUp')) File "/home/br/virtualenvs/manylinux/local/lib/python2.7/site-packages/nose/util.py", line 471, in try_run return func() File "/home/br/virtualenvs/manylinux/local/lib/python2.7/site-packages/numpy/f2py/tests/util.py", line 367, in setUp module_name=self.module_name) File "/home/br/virtualenvs/manylinux/local/lib/python2.7/site-packages/numpy/f2py/tests/util.py", line 85, in wrapper raise ret ImportError: /home/br/virtualenvs/manylinux/local/lib/python2.7/site-packages/numpy/core/../.libs/libgfortran.so.3: version `GFORTRAN_1.4' not found (required by /tmp/tmpPVjYDE/_test_ext_module_5405.so) 
====================================================================== ERROR: test_basic (test_function_base.TestMedian) ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/br/virtualenvs/manylinux/local/lib/python2.7/site-packages/numpy/lib/tests/test_function_base.py", line 2361, in test_basic assert_(w[0].category is RuntimeWarning) IndexError: list index out of range ====================================================================== ERROR: test_nan_behavior (test_function_base.TestMedian) ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/br/virtualenvs/manylinux/local/lib/python2.7/site-packages/numpy/lib/tests/test_function_base.py", line 2464, in test_nan_behavior assert_(w[0].category is RuntimeWarning) IndexError: list index out of range ====================================================================== FAIL: test_default (test_numeric.TestSeterr) ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/br/virtualenvs/manylinux/local/lib/python2.7/site-packages/numpy/core/tests/test_numeric.py", line 281, in test_default under='ignore', AssertionError: {'over': 'raise', 'divide': 'warn', 'invalid': 'warn', 'under': 'ignore'} != {'over': 'warn', 'divide': 'warn', 'invalid': 'warn', 'under': 'ignore'} - {'divide': 'warn', 'invalid': 'warn', 'over': 'raise', 'under': 'ignore'} ? ^^^^ + {'divide': 'warn', 'invalid': 'warn', 'over': 'warn', 'under': 'ignore'} ? ++ ^ ====================================================================== FAIL: test_allnans (test_nanfunctions.TestNanFunctions_Median) ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/br/virtualenvs/manylinux/local/lib/python2.7/site-packages/numpy/lib/tests/test_nanfunctions.py", line 544, in test_allnans assert_(len(w) == 1) File "/home/br/virtualenvs/manylinux/local/lib/python2.7/site-packages/numpy/testing/utils.py", line 53, in assert_ raise AssertionError(smsg) AssertionError ---------------------------------------------------------------------- Ran 6148 tests in 77.301s FAILED (KNOWNFAIL=3, SKIP=6, errors=5, failures=2) Not sure if any of these are present for python 2.7.3 and can no longer easily test. `scipy.test("full")` almost passes, there's a bunch of warnings-related noise and https://github.com/scipy/scipy/issues/5823 Nothing too bad on that machine, it seems :-). > I also get a scipy segfault with scipy 0.17.0 installed from an OSX > wheel, with output ending: > > test_check_finite (test_basic.TestLstsq) ... > /Users/mb312/.virtualenvs/test/lib/python2.7/site-packages/scipy/linalg/basic.py:884: > RuntimeWarning: internal gelsd driver lwork query error, required > iwork dimension not returned. This is likely the result of LAPACK bug > 0038, fixed in LAPACK 3.2.2 (released July 21, 2010). Falling back to > 'gelss' driver. > warnings.warn(mesg, RuntimeWarning) > ok > test_random_complex_exact (test_basic.TestLstsq) ... FAIL > test_random_complex_overdet (test_basic.TestLstsq) ... Bus error Oh, that one again... > This is so whether scipy is running on top of source- or wheel-built > numpy, and for a scipy built from source. 
> > Same numpy error installing on a bare Ubuntu 12.04, either installing > from a wheel built on 12.04 on travis: > > pip install -f http://travis-wheels.scikit-image.org --trusted-host > travis-wheels.scikit-image.org --no-index numpy > > or from numpy built from source. > > I can't replicate the segfault with manylinux wheels and scipy. On > the other hand, I get a new test error for numpy from manylinux, scipy > from manylinux, like this: > > $ python -c 'import scipy.linalg; scipy.linalg.test()' > > ====================================================================== > FAIL: test_decomp.test_eigh('general ', 6, 'F', True, False, False, (2, 4)) > ---------------------------------------------------------------------- > Traceback (most recent call last): > File "/usr/local/lib/python2.7/dist-packages/nose/case.py", line > 197, in runTest > self.test(*self.arg) > File "/usr/local/lib/python2.7/dist-packages/scipy/linalg/tests/test_decomp.py", > line 658, in eigenhproblem_general > assert_array_almost_equal(diag2_, ones(diag2_.shape[0]), DIGITS[dtype]) > File "/usr/local/lib/python2.7/dist-packages/numpy/testing/utils.py", > line 892, in assert_array_almost_equal > precision=decimal) > File "/usr/local/lib/python2.7/dist-packages/numpy/testing/utils.py", > line 713, in assert_array_compare > raise AssertionError(msg) > AssertionError: > Arrays are not almost equal to 4 decimals > > (mismatch 100.0%) > x: array([ 0., 0., 0.], dtype=float32) > y: array([ 1., 1., 1.]) > > ---------------------------------------------------------------------- > Ran 1507 tests in 14.928s > > FAILED (KNOWNFAIL=4, SKIP=1, failures=1) > > This is a very odd error, which we don't get when running over a numpy > installed from source, linked to ATLAS, and doesn't happen when > running the tests via: > > nosetests /usr/local/lib/python2.7/dist-packages/scipy/linalg > > So, something about the copy of numpy (linked to openblas) is > affecting the results of scipy (also linked to openblas), and only > with a particular environment / test order. > > If you'd like to try and see whether y'all can do a better job of > debugging than me: > > # Run this script inside a docker container started with this incantation: > # docker run -ti --rm ubuntu:12.04 /bin/bash > apt-get update > apt-get install -y python curl > apt-get install libpython2.7 # this won't be necessary with next > iteration of manylinux wheel builds > curl -LO https://bootstrap.pypa.io/get-pip.py > python get-pip.py > pip install -f https://nipy.bic.berkeley.edu/manylinux numpy scipy nose > python -c 'import scipy.linalg; scipy.linalg.test()' > > Cheers, > > Matthew > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion From nadavh at visionsense.com Tue Feb 9 09:07:35 2016 From: nadavh at visionsense.com (Nadav Horesh) Date: Tue, 9 Feb 2016 14:07:35 +0000 Subject: [Numpy-discussion] Multi-distribution Linux wheels - please test In-Reply-To: References: , Message-ID: Do not know what happened --- all test passed, even when removed openblas (Nathaniel was right). 
Manylinux config: python -c 'import numpy; print(numpy.__config__.show())' blas_opt_info: define_macros = [('HAVE_CBLAS', None)] libraries = ['openblas'] language = c library_dirs = ['/usr/local/lib'] lapack_opt_info: define_macros = [('HAVE_CBLAS', None)] libraries = ['openblas'] language = c library_dirs = ['/usr/local/lib'] blas_mkl_info: NOT AVAILABLE openblas_lapack_info: define_macros = [('HAVE_CBLAS', None)] libraries = ['openblas'] language = c library_dirs = ['/usr/local/lib'] openblas_info: define_macros = [('HAVE_CBLAS', None)] libraries = ['openblas'] language = c library_dirs = ['/usr/local/lib'] None Source installtion: python -c 'import numpy; print(numpy.__config__.show())' openblas_info: library_dirs = ['/usr/local/lib'] libraries = ['openblas', 'openblas'] language = c runtime_library_dirs = ['/usr/local/lib'] define_macros = [('HAVE_CBLAS', None)] openblas_lapack_info: library_dirs = ['/usr/local/lib'] libraries = ['openblas', 'openblas'] language = c runtime_library_dirs = ['/usr/local/lib'] define_macros = [('HAVE_CBLAS', None)] lapack_opt_info: extra_compile_args = ['-g -ftree-vectorize -mtune=native -march=native -O3'] runtime_library_dirs = ['/usr/local/lib'] define_macros = [('HAVE_CBLAS', None)] libraries = ['openblas', 'openblas', 'atlas', 'f77blas', 'cblas', 'blas'] language = c library_dirs = ['/usr/local/lib', '/usr/lib'] blas_mkl_info: NOT AVAILABLE blas_opt_info: extra_compile_args = ['-g -ftree-vectorize -mtune=native -march=native -O3'] runtime_library_dirs = ['/usr/local/lib'] define_macros = [('HAVE_CBLAS', None)] libraries = ['openblas', 'openblas', 'atlas', 'f77blas', 'cblas', 'blas'] language = c library_dirs = ['/usr/local/lib', '/usr/lib'] None ________________________________________ From: NumPy-Discussion on behalf of Matthew Brett Sent: 08 February 2016 09:48 To: Discussion of Numerical Python Subject: Re: [Numpy-discussion] Multi-distribution Linux wheels - please test Hi Nadav, On Sun, Feb 7, 2016 at 11:13 PM, Nathaniel Smith wrote: > (This is not relevant to the main topic of the thread, but FYI I think the > recarray issues are fixed in 1.10.4.) > > On Feb 7, 2016 11:10 PM, "Nadav Horesh" wrote: >> >> I have atlas-lapack-base installed via pacman (required by sagemath). >> Since the numpy installation insisted on openblas on /usr/local, I got the >> openblas source-code and installed it on /usr/local. >> BTW, I use 1.11b rather then 1.10.x since the 1.10 is very slow in >> handling recarrays. For the tests I am erasing the 1.11 installation, and >> installing the 1.10.4 wheel. I do verify that I have the right version >> before running the tests, but I am not sure if there no unnoticed side >> effects. >> >> Would it help if I put a side the openblas installation and rerun the >> test? Would you mind doing something like this, and posting the output?: virtualenv test-manylinux source test-manylinux/bin/activate pip install -f https://nipy.bic.berkeley.edu/manylinux numpy==1.10.4 nose python -c 'import numpy; numpy.test()' python -c 'import numpy; print(numpy.__config__.show())' deactivate virtualenv test-from-source source test-from-source/bin/activate pip install numpy==1.10.4 nose python -c 'import numpy; numpy.test()' python -c 'import numpy; print(numpy.__config__.show())' deactivate I'm puzzled that the wheel gives a test error when the source install does not, and my best guess was an openblas problem, but this just to make sure we have the output from the exact same numpy version, at least. 
Thanks again, Matthew _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion From davidmenhur at gmail.com Tue Feb 9 10:06:04 2016 From: davidmenhur at gmail.com (=?UTF-8?B?RGHPgGlk?=) Date: Tue, 9 Feb 2016 16:06:04 +0100 Subject: [Numpy-discussion] Linking other libm-Implementation In-Reply-To: References: Message-ID: On 8 February 2016 at 18:36, Nathaniel Smith wrote: > I would be highly suspicious that this speed comes at the expense of > accuracy... My impression is that there's a lot of room to make > speed/accuracy tradeoffs in these functions, and modern glibc's libm has > seen a fair amount of scrutiny by people who have access to the same code > that openlibm is based off of. But then again, maybe not :-). I did some digging, and I found this: http://julia-programming-language.2336112.n4.nabble.com/Is-the-accuracy-of-Julia-s-elementary-functions-exp-sin-known-td32736.html In short: according to their devs, most openlibm functions are accurate to less than 1ulp, while GNU libm is rounded to closest float. /David. From gregor.thalhammer at gmail.com Tue Feb 9 12:02:41 2016 From: gregor.thalhammer at gmail.com (Gregor Thalhammer) Date: Tue, 9 Feb 2016 18:02:41 +0100 Subject: [Numpy-discussion] Linking other libm-Implementation In-Reply-To: References: <9EB7C08F-337E-41D4-A162-9664F1790A07@gmail.com> Message-ID: <428CC29A-A3DA-46A6-B7A1-A43345B1AB41@gmail.com> > Am 09.02.2016 um 11:21 schrieb Nils Becker : > > 2016-02-08 18:54 GMT+01:00 Julian Taylor >: > > which version of glibm was used here? There are significant difference > > in performance between versions. > > Also the input ranges are very important for these functions, depending > > on input the speed of these functions can vary by factors of 1000. > > > > glibm now includes vectorized versions of most math functions, does > > openlibm have vectorized math? > > Thats where most speed can be gained, a lot more than 25%. > > glibc 2.22 was used running on archlinux. As far as I know openlibm does not include special vectorized functions. (for reference vectorized operations in glibc: https://sourceware.org/glibc/wiki/libmvec ). > > 2016-02-08 23:32 GMT+01:00 Gregor Thalhammer >: > Years ago I made the vectorized math functions from Intels Vector Math Library (VML), part of MKL, available for numpy, see https://github.com/geggo/uvml > Not particularly difficult, you not even have to change numpy. For some cases (e.g., exp) I have seen speedups up to 5x-10x. Unfortunately MKL is not free, and free vector math libraries like yeppp implement much fewer functions or do not support the required strided memory layout. But to improve performance, numexpr, numba or theano are much better. > > Gregor > > > Thank you very much for the link! I did not know about numpy.set_numeric_ops. > You are right, vectorized operations can push down calculation time per element by factors. The benchmarks done for the yeppp-project also indicate that (however far you would trust them: http://www.yeppp.info/benchmarks.html ). But I would agree that this domain should be left to specialized tools like numexpr as fully exploiting the speedup depends on the expression, that should be calculated. It is not suitable as a standard for bumpy. Why should numpy not provide fast transcendental math functions? For linear algebra it supports fast implementations, even non-free (MKL). Wouldn?t it be nice if numpy outperforms C? 
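As an aside, claims like "accurate to less than 1 ulp" are not hard to
spot-check. A rough sketch, using mpmath purely as a high-precision reference
(an extra dependency; the exact maximum you see will of course depend on the
libm version and the input range):

import numpy as np
import mpmath

mpmath.mp.prec = 200  # plenty of guard bits for a double-precision reference

def ulp_error(x):
    got = float(np.exp(x))                  # whatever libm numpy was built against
    ref = mpmath.exp(mpmath.mpf(float(x)))  # high-precision reference value
    return abs(mpmath.mpf(got) - ref) / float(np.spacing(abs(got)))

errs = [ulp_error(v) for v in np.random.uniform(-30.0, 30.0, 1000)]
print(float(max(errs)))  # <= 0.5 for a correctly rounded exp, around 1 for faster libms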
> > Still, I think it would be good to give the possibility to choose the libm numpy links against. And be it simply to allow to choose or guarantee a specific accuracy/performance on different platforms and systems. > Maybe maintaining a de-facto libm in npy_math could be replaced with a dependency on e.g. openlibm. But such a decision would require a thorough benchmark/testing of the available solutions. Especially with respect to the accuracy-performance-tradeoff that was mentioned. > Intel publishes accuracy/performance charts for VML/MKL: https://software.intel.com/sites/products/documentation/doclib/mkl/vm/functions/_accuracyall.html For GNU libc it is more difficult to find similarly precise data, I only could find: http://www.gnu.org/software/libc/manual/html_node/Errors-in-Math-Functions.html Gregor > Cheers > Nils > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Tue Feb 9 14:13:58 2016 From: matthew.brett at gmail.com (Matthew Brett) Date: Tue, 9 Feb 2016 11:13:58 -0800 Subject: [Numpy-discussion] Linking other libm-Implementation In-Reply-To: References: Message-ID: On Tue, Feb 9, 2016 at 7:06 AM, Da?id wrote: > On 8 February 2016 at 18:36, Nathaniel Smith wrote: >> I would be highly suspicious that this speed comes at the expense of >> accuracy... My impression is that there's a lot of room to make >> speed/accuracy tradeoffs in these functions, and modern glibc's libm has >> seen a fair amount of scrutiny by people who have access to the same code >> that openlibm is based off of. But then again, maybe not :-). > > > I did some digging, and I found this: > > http://julia-programming-language.2336112.n4.nabble.com/Is-the-accuracy-of-Julia-s-elementary-functions-exp-sin-known-td32736.html > > In short: according to their devs, most openlibm functions are > accurate to less than 1ulp, while GNU libm is rounded to closest > float. So GNU libm has max error <= 0.5 ULP, openlibm has <= 1 ULP, and OSX is (almost always) somewhere in-between. So, is <= 1 ULP good enough? Cheers, Matthew From jtaylor.debian at googlemail.com Tue Feb 9 14:37:26 2016 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Tue, 9 Feb 2016 20:37:26 +0100 Subject: [Numpy-discussion] Fwd: Multi-distribution Linux wheels - please test In-Reply-To: References: Message-ID: <56BA3FF6.40207@googlemail.com> On 09.02.2016 04:59, Nathaniel Smith wrote: > On Mon, Feb 8, 2016 at 6:07 PM, Nathaniel Smith wrote: >> On Mon, Feb 8, 2016 at 6:04 PM, Matthew Brett wrote: >>> On Mon, Feb 8, 2016 at 5:26 PM, Nathaniel Smith wrote: >>>> On Mon, Feb 8, 2016 at 4:37 PM, Matthew Brett wrote: >>>> [...] >>>>> I can't replicate the segfault with manylinux wheels and scipy. 
On >>>>> the other hand, I get a new test error for numpy from manylinux, scipy >>>>> from manylinux, like this: >>>>> >>>>> $ python -c 'import scipy.linalg; scipy.linalg.test()' >>>>> >>>>> ====================================================================== >>>>> FAIL: test_decomp.test_eigh('general ', 6, 'F', True, False, False, (2, 4)) >>>>> ---------------------------------------------------------------------- >>>>> Traceback (most recent call last): >>>>> File "/usr/local/lib/python2.7/dist-packages/nose/case.py", line >>>>> 197, in runTest >>>>> self.test(*self.arg) >>>>> File "/usr/local/lib/python2.7/dist-packages/scipy/linalg/tests/test_decomp.py", >>>>> line 658, in eigenhproblem_general >>>>> assert_array_almost_equal(diag2_, ones(diag2_.shape[0]), DIGITS[dtype]) >>>>> File "/usr/local/lib/python2.7/dist-packages/numpy/testing/utils.py", >>>>> line 892, in assert_array_almost_equal >>>>> precision=decimal) >>>>> File "/usr/local/lib/python2.7/dist-packages/numpy/testing/utils.py", >>>>> line 713, in assert_array_compare >>>>> raise AssertionError(msg) >>>>> AssertionError: >>>>> Arrays are not almost equal to 4 decimals >>>>> >>>>> (mismatch 100.0%) >>>>> x: array([ 0., 0., 0.], dtype=float32) >>>>> y: array([ 1., 1., 1.]) >>>>> >>>>> ---------------------------------------------------------------------- >>>>> Ran 1507 tests in 14.928s >>>>> >>>>> FAILED (KNOWNFAIL=4, SKIP=1, failures=1) >>>>> >>>>> This is a very odd error, which we don't get when running over a numpy >>>>> installed from source, linked to ATLAS, and doesn't happen when >>>>> running the tests via: >>>>> >>>>> nosetests /usr/local/lib/python2.7/dist-packages/scipy/linalg >>>>> >>>>> So, something about the copy of numpy (linked to openblas) is >>>>> affecting the results of scipy (also linked to openblas), and only >>>>> with a particular environment / test order. >>>>> >>>>> If you'd like to try and see whether y'all can do a better job of >>>>> debugging than me: >>>>> >>>>> # Run this script inside a docker container started with this incantation: >>>>> # docker run -ti --rm ubuntu:12.04 /bin/bash >>>>> apt-get update >>>>> apt-get install -y python curl >>>>> apt-get install libpython2.7 # this won't be necessary with next >>>>> iteration of manylinux wheel builds >>>>> curl -LO https://bootstrap.pypa.io/get-pip.py >>>>> python get-pip.py >>>>> pip install -f https://nipy.bic.berkeley.edu/manylinux numpy scipy nose >>>>> python -c 'import scipy.linalg; scipy.linalg.test()' >>>> >>>> I just tried this and on my laptop it completed without error. >>>> >>>> Best guess is that we're dealing with some memory corruption bug >>>> inside openblas, so it's getting perturbed by things like exactly what >>>> other calls to openblas have happened (which is different depending on >>>> whether numpy is linked to openblas), and which core type openblas has >>>> detected. >>>> >>>> On my laptop, which *doesn't* show the problem, running with >>>> OPENBLAS_VERBOSE=2 says "Core: Haswell". >>>> >>>> Guess the next step is checking what core type the failing machines >>>> use, and running valgrind... anyone have a good valgrind suppressions >>>> file? >>> >>> My machine (which does give the failure) gives >>> >>> Core: Core2 >>> >>> with OPENBLAS_VERBOSE=2 >> >> Yep, that allows me to reproduce it: >> >> root at f7153f0cc841:/# OPENBLAS_VERBOSE=2 OPENBLAS_CORETYPE=Core2 python >> -c 'import scipy.linalg; scipy.linalg.test()' >> Core: Core2 >> [...] 
>> ====================================================================== >> FAIL: test_decomp.test_eigh('general ', 6, 'F', True, False, False, (2, 4)) >> ---------------------------------------------------------------------- >> [...] >> >> So this is indeed sounding like an OpenBLAS issue... next stop >> valgrind, I guess :-/ > > Here's the valgrind output: > https://gist.github.com/njsmith/577d028e79f0a80d2797 > > There's a lot of it, but no smoking guns have jumped out at me :-/ > > -n > plenty of smoking guns, e.g.: .............==3695== Invalid read of size 8 3417 ==3695== at 0x7AAA9C0: daxpy_k_CORE2 (in /usr/local/lib/python2.7/dist-packages/numpy/.libs/libopenblas.so.0) 3418 ==3695== by 0x76BEEFC: ger_kernel (in /usr/local/lib/python2.7/dist-packages/numpy/.libs/libopenblas.so.0) 3419 ==3695== by 0x788F618: exec_blas (in /usr/local/lib/python2.7/dist-packages/numpy/.libs/libopenblas.so.0) 3420 ==3695== by 0x76BF099: dger_thread (in /usr/local/lib/python2.7/dist-packages/numpy/.libs/libopenblas.so.0) 3421 ==3695== by 0x767DC37: dger_ (in /usr/local/lib/python2.7/dist-packages/numpy/.libs/libopenblas.so.0) I think I have reported that to openblas already, they said do that intentionally, though last I checked they are missing the code that verifies this is actually allowed (if your not crossing a page you can read beyond the boundaries). Its pretty likely its a pointless micro optimization, you normally only use that trick for string functions where you don't know the size of the string. Your code also indicates it ran on core2, while the issues occur on sandybridge, maybe valgrind messes with the cpu detection so it won't show anything. From matthew.brett at gmail.com Tue Feb 9 14:40:17 2016 From: matthew.brett at gmail.com (Matthew Brett) Date: Tue, 9 Feb 2016 11:40:17 -0800 Subject: [Numpy-discussion] Fwd: Multi-distribution Linux wheels - please test In-Reply-To: <56BA3FF6.40207@googlemail.com> References: <56BA3FF6.40207@googlemail.com> Message-ID: On Tue, Feb 9, 2016 at 11:37 AM, Julian Taylor wrote: > On 09.02.2016 04:59, Nathaniel Smith wrote: >> On Mon, Feb 8, 2016 at 6:07 PM, Nathaniel Smith wrote: >>> On Mon, Feb 8, 2016 at 6:04 PM, Matthew Brett wrote: >>>> On Mon, Feb 8, 2016 at 5:26 PM, Nathaniel Smith wrote: >>>>> On Mon, Feb 8, 2016 at 4:37 PM, Matthew Brett wrote: >>>>> [...] >>>>>> I can't replicate the segfault with manylinux wheels and scipy. 
On >>>>>> the other hand, I get a new test error for numpy from manylinux, scipy >>>>>> from manylinux, like this: >>>>>> >>>>>> $ python -c 'import scipy.linalg; scipy.linalg.test()' >>>>>> >>>>>> ====================================================================== >>>>>> FAIL: test_decomp.test_eigh('general ', 6, 'F', True, False, False, (2, 4)) >>>>>> ---------------------------------------------------------------------- >>>>>> Traceback (most recent call last): >>>>>> File "/usr/local/lib/python2.7/dist-packages/nose/case.py", line >>>>>> 197, in runTest >>>>>> self.test(*self.arg) >>>>>> File "/usr/local/lib/python2.7/dist-packages/scipy/linalg/tests/test_decomp.py", >>>>>> line 658, in eigenhproblem_general >>>>>> assert_array_almost_equal(diag2_, ones(diag2_.shape[0]), DIGITS[dtype]) >>>>>> File "/usr/local/lib/python2.7/dist-packages/numpy/testing/utils.py", >>>>>> line 892, in assert_array_almost_equal >>>>>> precision=decimal) >>>>>> File "/usr/local/lib/python2.7/dist-packages/numpy/testing/utils.py", >>>>>> line 713, in assert_array_compare >>>>>> raise AssertionError(msg) >>>>>> AssertionError: >>>>>> Arrays are not almost equal to 4 decimals >>>>>> >>>>>> (mismatch 100.0%) >>>>>> x: array([ 0., 0., 0.], dtype=float32) >>>>>> y: array([ 1., 1., 1.]) >>>>>> >>>>>> ---------------------------------------------------------------------- >>>>>> Ran 1507 tests in 14.928s >>>>>> >>>>>> FAILED (KNOWNFAIL=4, SKIP=1, failures=1) >>>>>> >>>>>> This is a very odd error, which we don't get when running over a numpy >>>>>> installed from source, linked to ATLAS, and doesn't happen when >>>>>> running the tests via: >>>>>> >>>>>> nosetests /usr/local/lib/python2.7/dist-packages/scipy/linalg >>>>>> >>>>>> So, something about the copy of numpy (linked to openblas) is >>>>>> affecting the results of scipy (also linked to openblas), and only >>>>>> with a particular environment / test order. >>>>>> >>>>>> If you'd like to try and see whether y'all can do a better job of >>>>>> debugging than me: >>>>>> >>>>>> # Run this script inside a docker container started with this incantation: >>>>>> # docker run -ti --rm ubuntu:12.04 /bin/bash >>>>>> apt-get update >>>>>> apt-get install -y python curl >>>>>> apt-get install libpython2.7 # this won't be necessary with next >>>>>> iteration of manylinux wheel builds >>>>>> curl -LO https://bootstrap.pypa.io/get-pip.py >>>>>> python get-pip.py >>>>>> pip install -f https://nipy.bic.berkeley.edu/manylinux numpy scipy nose >>>>>> python -c 'import scipy.linalg; scipy.linalg.test()' >>>>> >>>>> I just tried this and on my laptop it completed without error. >>>>> >>>>> Best guess is that we're dealing with some memory corruption bug >>>>> inside openblas, so it's getting perturbed by things like exactly what >>>>> other calls to openblas have happened (which is different depending on >>>>> whether numpy is linked to openblas), and which core type openblas has >>>>> detected. >>>>> >>>>> On my laptop, which *doesn't* show the problem, running with >>>>> OPENBLAS_VERBOSE=2 says "Core: Haswell". >>>>> >>>>> Guess the next step is checking what core type the failing machines >>>>> use, and running valgrind... anyone have a good valgrind suppressions >>>>> file? 
>>>> >>>> My machine (which does give the failure) gives >>>> >>>> Core: Core2 >>>> >>>> with OPENBLAS_VERBOSE=2 >>> >>> Yep, that allows me to reproduce it: >>> >>> root at f7153f0cc841:/# OPENBLAS_VERBOSE=2 OPENBLAS_CORETYPE=Core2 python >>> -c 'import scipy.linalg; scipy.linalg.test()' >>> Core: Core2 >>> [...] >>> ====================================================================== >>> FAIL: test_decomp.test_eigh('general ', 6, 'F', True, False, False, (2, 4)) >>> ---------------------------------------------------------------------- >>> [...] >>> >>> So this is indeed sounding like an OpenBLAS issue... next stop >>> valgrind, I guess :-/ >> >> Here's the valgrind output: >> https://gist.github.com/njsmith/577d028e79f0a80d2797 >> >> There's a lot of it, but no smoking guns have jumped out at me :-/ >> >> -n >> > > plenty of smoking guns, e.g.: > > .............==3695== Invalid read of size 8 > 3417 ==3695== at 0x7AAA9C0: daxpy_k_CORE2 (in > /usr/local/lib/python2.7/dist-packages/numpy/.libs/libopenblas.so.0) > 3418 ==3695== by 0x76BEEFC: ger_kernel (in > /usr/local/lib/python2.7/dist-packages/numpy/.libs/libopenblas.so.0) > 3419 ==3695== by 0x788F618: exec_blas (in > /usr/local/lib/python2.7/dist-packages/numpy/.libs/libopenblas.so.0) > 3420 ==3695== by 0x76BF099: dger_thread (in > /usr/local/lib/python2.7/dist-packages/numpy/.libs/libopenblas.so.0) > 3421 ==3695== by 0x767DC37: dger_ (in > /usr/local/lib/python2.7/dist-packages/numpy/.libs/libopenblas.so.0) > > > I think I have reported that to openblas already, they said do that > intentionally, though last I checked they are missing the code that > verifies this is actually allowed (if your not crossing a page you can > read beyond the boundaries). Its pretty likely its a pointless micro > optimization, you normally only use that trick for string functions > where you don't know the size of the string. > > Your code also indicates it ran on core2, while the issues occur on > sandybridge, maybe valgrind messes with the cpu detection so it won't > show anything. Julian - thanks for having a look. Do you happen to remember the openblas issue number for this? Was there an obvious place we could patch openblas to avoid this error in particular? Cheers, Matthew From matthew.brett at gmail.com Tue Feb 9 14:52:35 2016 From: matthew.brett at gmail.com (Matthew Brett) Date: Tue, 9 Feb 2016 11:52:35 -0800 Subject: [Numpy-discussion] Fwd: Multi-distribution Linux wheels - please test In-Reply-To: References: Message-ID: On Mon, Feb 8, 2016 at 7:59 PM, Nathaniel Smith wrote: > On Mon, Feb 8, 2016 at 6:07 PM, Nathaniel Smith wrote: >> On Mon, Feb 8, 2016 at 6:04 PM, Matthew Brett wrote: >>> On Mon, Feb 8, 2016 at 5:26 PM, Nathaniel Smith wrote: >>>> On Mon, Feb 8, 2016 at 4:37 PM, Matthew Brett wrote: >>>> [...] >>>>> I can't replicate the segfault with manylinux wheels and scipy. 
On >>>>> the other hand, I get a new test error for numpy from manylinux, scipy >>>>> from manylinux, like this: >>>>> >>>>> $ python -c 'import scipy.linalg; scipy.linalg.test()' >>>>> >>>>> ====================================================================== >>>>> FAIL: test_decomp.test_eigh('general ', 6, 'F', True, False, False, (2, 4)) >>>>> ---------------------------------------------------------------------- >>>>> Traceback (most recent call last): >>>>> File "/usr/local/lib/python2.7/dist-packages/nose/case.py", line >>>>> 197, in runTest >>>>> self.test(*self.arg) >>>>> File "/usr/local/lib/python2.7/dist-packages/scipy/linalg/tests/test_decomp.py", >>>>> line 658, in eigenhproblem_general >>>>> assert_array_almost_equal(diag2_, ones(diag2_.shape[0]), DIGITS[dtype]) >>>>> File "/usr/local/lib/python2.7/dist-packages/numpy/testing/utils.py", >>>>> line 892, in assert_array_almost_equal >>>>> precision=decimal) >>>>> File "/usr/local/lib/python2.7/dist-packages/numpy/testing/utils.py", >>>>> line 713, in assert_array_compare >>>>> raise AssertionError(msg) >>>>> AssertionError: >>>>> Arrays are not almost equal to 4 decimals >>>>> >>>>> (mismatch 100.0%) >>>>> x: array([ 0., 0., 0.], dtype=float32) >>>>> y: array([ 1., 1., 1.]) >>>>> >>>>> ---------------------------------------------------------------------- >>>>> Ran 1507 tests in 14.928s >>>>> >>>>> FAILED (KNOWNFAIL=4, SKIP=1, failures=1) >>>>> >>>>> This is a very odd error, which we don't get when running over a numpy >>>>> installed from source, linked to ATLAS, and doesn't happen when >>>>> running the tests via: >>>>> >>>>> nosetests /usr/local/lib/python2.7/dist-packages/scipy/linalg >>>>> >>>>> So, something about the copy of numpy (linked to openblas) is >>>>> affecting the results of scipy (also linked to openblas), and only >>>>> with a particular environment / test order. >>>>> >>>>> If you'd like to try and see whether y'all can do a better job of >>>>> debugging than me: >>>>> >>>>> # Run this script inside a docker container started with this incantation: >>>>> # docker run -ti --rm ubuntu:12.04 /bin/bash >>>>> apt-get update >>>>> apt-get install -y python curl >>>>> apt-get install libpython2.7 # this won't be necessary with next >>>>> iteration of manylinux wheel builds >>>>> curl -LO https://bootstrap.pypa.io/get-pip.py >>>>> python get-pip.py >>>>> pip install -f https://nipy.bic.berkeley.edu/manylinux numpy scipy nose >>>>> python -c 'import scipy.linalg; scipy.linalg.test()' >>>> >>>> I just tried this and on my laptop it completed without error. >>>> >>>> Best guess is that we're dealing with some memory corruption bug >>>> inside openblas, so it's getting perturbed by things like exactly what >>>> other calls to openblas have happened (which is different depending on >>>> whether numpy is linked to openblas), and which core type openblas has >>>> detected. >>>> >>>> On my laptop, which *doesn't* show the problem, running with >>>> OPENBLAS_VERBOSE=2 says "Core: Haswell". >>>> >>>> Guess the next step is checking what core type the failing machines >>>> use, and running valgrind... anyone have a good valgrind suppressions >>>> file? >>> >>> My machine (which does give the failure) gives >>> >>> Core: Core2 >>> >>> with OPENBLAS_VERBOSE=2 >> >> Yep, that allows me to reproduce it: >> >> root at f7153f0cc841:/# OPENBLAS_VERBOSE=2 OPENBLAS_CORETYPE=Core2 python >> -c 'import scipy.linalg; scipy.linalg.test()' >> Core: Core2 >> [...] 
>> ====================================================================== >> FAIL: test_decomp.test_eigh('general ', 6, 'F', True, False, False, (2, 4)) >> ---------------------------------------------------------------------- >> [...] >> >> So this is indeed sounding like an OpenBLAS issue... next stop >> valgrind, I guess :-/ > > Here's the valgrind output: > https://gist.github.com/njsmith/577d028e79f0a80d2797 > > There's a lot of it, but no smoking guns have jumped out at me :-/ Could you send me instructions on replicating the valgrind run, I'll run on on the actual Core2 machine... Matthew From jtaylor.debian at googlemail.com Tue Feb 9 14:55:28 2016 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Tue, 9 Feb 2016 20:55:28 +0100 Subject: [Numpy-discussion] Fwd: Multi-distribution Linux wheels - please test In-Reply-To: References: Message-ID: <56BA4430.9040709@googlemail.com> On 09.02.2016 20:52, Matthew Brett wrote: > On Mon, Feb 8, 2016 at 7:59 PM, Nathaniel Smith wrote: >> On Mon, Feb 8, 2016 at 6:07 PM, Nathaniel Smith wrote: >>> On Mon, Feb 8, 2016 at 6:04 PM, Matthew Brett wrote: >>>> On Mon, Feb 8, 2016 at 5:26 PM, Nathaniel Smith wrote: >>>>> On Mon, Feb 8, 2016 at 4:37 PM, Matthew Brett wrote: >>>>> [...] >>>>>> I can't replicate the segfault with manylinux wheels and scipy. On >>>>>> the other hand, I get a new test error for numpy from manylinux, scipy >>>>>> from manylinux, like this: >>>>>> >>>>>> $ python -c 'import scipy.linalg; scipy.linalg.test()' >>>>>> >>>>>> ====================================================================== >>>>>> FAIL: test_decomp.test_eigh('general ', 6, 'F', True, False, False, (2, 4)) >>>>>> ---------------------------------------------------------------------- >>>>>> Traceback (most recent call last): >>>>>> File "/usr/local/lib/python2.7/dist-packages/nose/case.py", line >>>>>> 197, in runTest >>>>>> self.test(*self.arg) >>>>>> File "/usr/local/lib/python2.7/dist-packages/scipy/linalg/tests/test_decomp.py", >>>>>> line 658, in eigenhproblem_general >>>>>> assert_array_almost_equal(diag2_, ones(diag2_.shape[0]), DIGITS[dtype]) >>>>>> File "/usr/local/lib/python2.7/dist-packages/numpy/testing/utils.py", >>>>>> line 892, in assert_array_almost_equal >>>>>> precision=decimal) >>>>>> File "/usr/local/lib/python2.7/dist-packages/numpy/testing/utils.py", >>>>>> line 713, in assert_array_compare >>>>>> raise AssertionError(msg) >>>>>> AssertionError: >>>>>> Arrays are not almost equal to 4 decimals >>>>>> >>>>>> (mismatch 100.0%) >>>>>> x: array([ 0., 0., 0.], dtype=float32) >>>>>> y: array([ 1., 1., 1.]) >>>>>> >>>>>> ---------------------------------------------------------------------- >>>>>> Ran 1507 tests in 14.928s >>>>>> >>>>>> FAILED (KNOWNFAIL=4, SKIP=1, failures=1) >>>>>> >>>>>> This is a very odd error, which we don't get when running over a numpy >>>>>> installed from source, linked to ATLAS, and doesn't happen when >>>>>> running the tests via: >>>>>> >>>>>> nosetests /usr/local/lib/python2.7/dist-packages/scipy/linalg >>>>>> >>>>>> So, something about the copy of numpy (linked to openblas) is >>>>>> affecting the results of scipy (also linked to openblas), and only >>>>>> with a particular environment / test order. 
>>>>>> >>>>>> If you'd like to try and see whether y'all can do a better job of >>>>>> debugging than me: >>>>>> >>>>>> # Run this script inside a docker container started with this incantation: >>>>>> # docker run -ti --rm ubuntu:12.04 /bin/bash >>>>>> apt-get update >>>>>> apt-get install -y python curl >>>>>> apt-get install libpython2.7 # this won't be necessary with next >>>>>> iteration of manylinux wheel builds >>>>>> curl -LO https://bootstrap.pypa.io/get-pip.py >>>>>> python get-pip.py >>>>>> pip install -f https://nipy.bic.berkeley.edu/manylinux numpy scipy nose >>>>>> python -c 'import scipy.linalg; scipy.linalg.test()' >>>>> >>>>> I just tried this and on my laptop it completed without error. >>>>> >>>>> Best guess is that we're dealing with some memory corruption bug >>>>> inside openblas, so it's getting perturbed by things like exactly what >>>>> other calls to openblas have happened (which is different depending on >>>>> whether numpy is linked to openblas), and which core type openblas has >>>>> detected. >>>>> >>>>> On my laptop, which *doesn't* show the problem, running with >>>>> OPENBLAS_VERBOSE=2 says "Core: Haswell". >>>>> >>>>> Guess the next step is checking what core type the failing machines >>>>> use, and running valgrind... anyone have a good valgrind suppressions >>>>> file? >>>> >>>> My machine (which does give the failure) gives >>>> >>>> Core: Core2 >>>> >>>> with OPENBLAS_VERBOSE=2 >>> >>> Yep, that allows me to reproduce it: >>> >>> root at f7153f0cc841:/# OPENBLAS_VERBOSE=2 OPENBLAS_CORETYPE=Core2 python >>> -c 'import scipy.linalg; scipy.linalg.test()' >>> Core: Core2 >>> [...] >>> ====================================================================== >>> FAIL: test_decomp.test_eigh('general ', 6, 'F', True, False, False, (2, 4)) >>> ---------------------------------------------------------------------- >>> [...] >>> >>> So this is indeed sounding like an OpenBLAS issue... next stop >>> valgrind, I guess :-/ >> >> Here's the valgrind output: >> https://gist.github.com/njsmith/577d028e79f0a80d2797 >> >> There's a lot of it, but no smoking guns have jumped out at me :-/ > > Could you send me instructions on replicating the valgrind run, I'll > run on on the actual Core2 machine... > > Matthew please also use this suppression file, should reduce the python noise significantly but it might be a bit out of date. Used to work fine on an ubuntu built python. -------------- next part -------------- # # This is a valgrind suppression file that should be used when using valgrind. # # Here's an example of running valgrind: # # cd python/dist/src # valgrind --tool=memcheck --suppressions=Misc/valgrind-python.supp \ # ./python -E -tt ./Lib/test/regrtest.py -u bsddb,network # # You must edit Objects/obmalloc.c and uncomment Py_USING_MEMORY_DEBUGGER # to use the preferred suppressions with Py_ADDRESS_IN_RANGE. # # If you do not want to recompile Python, you can uncomment # suppressions for PyObject_Free and PyObject_Realloc. # # See Misc/README.valgrind for more information. 
# all tool names: Addrcheck,Memcheck,cachegrind,helgrind,massif { ADDRESS_IN_RANGE/Invalid read of size 4 Memcheck:Addr4 fun:Py_ADDRESS_IN_RANGE } { ADDRESS_IN_RANGE/Invalid read of size 4 Memcheck:Value4 fun:Py_ADDRESS_IN_RANGE } { ADDRESS_IN_RANGE/Invalid read of size 8 (x86_64 aka amd64) Memcheck:Value8 fun:Py_ADDRESS_IN_RANGE } { ADDRESS_IN_RANGE/Conditional jump or move depends on uninitialised value Memcheck:Cond fun:Py_ADDRESS_IN_RANGE } # # Leaks (including possible leaks) # Hmmm, I wonder if this masks some real leaks. I think it does. # Will need to fix that. # { Suppress leaking the GIL. Happens once per process, see comment in ceval.c. Memcheck:Leak fun:malloc fun:PyThread_allocate_lock fun:PyEval_InitThreads } { Suppress leaking the GIL after a fork. Memcheck:Leak fun:malloc fun:PyThread_allocate_lock fun:PyEval_ReInitThreads } { Suppress leaking the autoTLSkey. This looks like it shouldn't leak though. Memcheck:Leak fun:malloc fun:PyThread_create_key fun:_PyGILState_Init fun:Py_InitializeEx fun:Py_Main } { Hmmm, is this a real leak or like the GIL? Memcheck:Leak fun:malloc fun:PyThread_ReInitTLS } { Handle PyMalloc confusing valgrind (possibly leaked) Memcheck:Leak fun:realloc fun:_PyObject_GC_Resize # fun:COMMENT_THIS_LINE_TO_DISABLE_LEAK_WARNING } { Handle PyMalloc confusing valgrind (possibly leaked) Memcheck:Leak fun:malloc fun:_PyObject_GC_New # fun:COMMENT_THIS_LINE_TO_DISABLE_LEAK_WARNING } { Handle PyMalloc confusing valgrind (possibly leaked) Memcheck:Leak fun:malloc fun:_PyObject_GC_NewVar # fun:COMMENT_THIS_LINE_TO_DISABLE_LEAK_WARNING } # # Non-python specific leaks # { Handle pthread issue (possibly leaked) Memcheck:Leak fun:calloc fun:allocate_dtv fun:_dl_allocate_tls_storage fun:_dl_allocate_tls } { Handle pthread issue (possibly leaked) Memcheck:Leak fun:memalign fun:_dl_allocate_tls_storage fun:_dl_allocate_tls } # Object Malloc/Free/Realloc stuff, very broad { ADDRESS_IN_RANGE/Invalid read of size 4 Memcheck:Addr4 fun:PyObject_Free* } { ADDRESS_IN_RANGE/Invalid read of size 4 Memcheck:Value4 fun:PyObject_Free* } { ADDRESS_IN_RANGE/Conditional jump or move depends on uninitialised value Memcheck:Cond fun:PyObject_Free* } { ADDRESS_IN_RANGE/Invalid read of size 4 Memcheck:Addr4 fun:PyObject_Realloc* } { ADDRESS_IN_RANGE/Invalid read of size 4 Memcheck:Value4 fun:PyObject_Realloc* } { ADDRESS_IN_RANGE/Conditional jump or move depends on uninitialised value Memcheck:Cond fun:PyObject_Realloc* } # Object Malloc/Free/Realloc stuff for size 8 { ADDRESS_IN_RANGE/Invalid read of size 8 Memcheck:Addr8 fun:PyObject_Free* } { ADDRESS_IN_RANGE/Invalid read of size 8 Memcheck:Value8 fun:PyObject_Free* } { ADDRESS_IN_RANGE/Invalid read of size 8 Memcheck:Addr8 fun:PyObject_Realloc* } { ADDRESS_IN_RANGE/Invalid read of size 8 Memcheck:Value8 fun:PyObject_Realloc* } ### ### All the suppressions below are for errors that occur within libraries ### that Python uses. The problems to not appear to be related to Python's ### use of the libraries. 
### { Generic ubuntu ld problems Memcheck:Addr8 obj:/lib/ld-2.4.so obj:/lib/ld-2.4.so obj:/lib/ld-2.4.so obj:/lib/ld-2.4.so } { Generic gentoo ld problems Memcheck:Cond obj:/lib/ld-2.3.4.so obj:/lib/ld-2.3.4.so obj:/lib/ld-2.3.4.so obj:/lib/ld-2.3.4.so } { DBM problems, see test_dbm Memcheck:Param write(buf) fun:write obj:/usr/lib/libdb1.so.2 obj:/usr/lib/libdb1.so.2 obj:/usr/lib/libdb1.so.2 obj:/usr/lib/libdb1.so.2 fun:dbm_close } { DBM problems, see test_dbm Memcheck:Value8 fun:memmove obj:/usr/lib/libdb1.so.2 obj:/usr/lib/libdb1.so.2 obj:/usr/lib/libdb1.so.2 obj:/usr/lib/libdb1.so.2 fun:dbm_store fun:dbm_ass_sub } { DBM problems, see test_dbm Memcheck:Cond obj:/usr/lib/libdb1.so.2 obj:/usr/lib/libdb1.so.2 obj:/usr/lib/libdb1.so.2 fun:dbm_store fun:dbm_ass_sub } { DBM problems, see test_dbm Memcheck:Cond fun:memmove obj:/usr/lib/libdb1.so.2 obj:/usr/lib/libdb1.so.2 obj:/usr/lib/libdb1.so.2 obj:/usr/lib/libdb1.so.2 fun:dbm_store fun:dbm_ass_sub } { GDBM problems, see test_gdbm Memcheck:Param write(buf) fun:write fun:gdbm_open } { ZLIB problems, see test_gzip Memcheck:Cond obj:/lib/libz.so.1.2.3 obj:/lib/libz.so.1.2.3 fun:deflate } { Avoid problems w/readline doing a putenv and leaking on exit Memcheck:Leak fun:malloc fun:xmalloc fun:sh_set_lines_and_columns fun:_rl_get_screen_size fun:_rl_init_terminal_io obj:/lib/libreadline.so.4.3 fun:rl_initialize } ### ### These occur from somewhere within the SSL, when running ### test_socket_sll. They are too general to leave on by default. ### ###{ ### somewhere in SSL stuff ### Memcheck:Cond ### fun:memset ###} ###{ ### somewhere in SSL stuff ### Memcheck:Value4 ### fun:memset ###} ### ###{ ### somewhere in SSL stuff ### Memcheck:Cond ### fun:MD5_Update ###} ### ###{ ### somewhere in SSL stuff ### Memcheck:Value4 ### fun:MD5_Update ###} # # All of these problems come from using test_socket_ssl # { from test_socket_ssl Memcheck:Cond fun:BN_bin2bn } { from test_socket_ssl Memcheck:Cond fun:BN_num_bits_word } { from test_socket_ssl Memcheck:Value4 fun:BN_num_bits_word } { from test_socket_ssl Memcheck:Cond fun:BN_mod_exp_mont_word } { from test_socket_ssl Memcheck:Cond fun:BN_mod_exp_mont } { from test_socket_ssl Memcheck:Param write(buf) fun:write obj:/usr/lib/libcrypto.so.0.9.7 } { from test_socket_ssl Memcheck:Cond fun:RSA_verify } { from test_socket_ssl Memcheck:Value4 fun:RSA_verify } { from test_socket_ssl Memcheck:Value4 fun:DES_set_key_unchecked } { from test_socket_ssl Memcheck:Value4 fun:DES_encrypt2 } { from test_socket_ssl Memcheck:Cond obj:/usr/lib/libssl.so.0.9.7 } { from test_socket_ssl Memcheck:Value4 obj:/usr/lib/libssl.so.0.9.7 } { from test_socket_ssl Memcheck:Cond fun:BUF_MEM_grow_clean } { from test_socket_ssl Memcheck:Cond fun:memcpy fun:ssl3_read_bytes } { from test_socket_ssl Memcheck:Cond fun:SHA1_Update } { from test_socket_ssl Memcheck:Value4 fun:SHA1_Update } #jtaylor added { Memcheck:Addr4 fun:PyObject_GC_Del fun:tupledealloc.* } { Memcheck:Addr4 fun:PyObject_GC_Del fun:code_dealloc.* } { Memcheck:Cond fun:PyObject_GC_Del fun:code_dealloc.* } { Memcheck:Value8 fun:PyObject_GC_Del fun:code_dealloc.* } { Memcheck:Value8 fun:PyObject_GC_Del fun:tupledealloc.* } { Memcheck:Cond fun:PyObject_GC_Del fun:tupledealloc.* } { Memcheck:Addr4 fun:PyObject_GC_Del fun:dict_dealloc.* } { Memcheck:Cond fun:PyObject_GC_Del fun:dict_dealloc.* } { Memcheck:Value8 fun:PyObject_GC_Del fun:dict_dealloc.* } { Memcheck:Addr4 fun:PyObject_GC_Del fun:collect.* } { Memcheck:Cond fun:PyObject_GC_Del fun:collect.* } { Memcheck:Value8 
fun:PyObject_GC_Del fun:collect.* } { Memcheck:Addr4 fun:match_dealloc.* fun:frame_dealloc.* } { Memcheck:Addr4 fun:PyObject_GC_Del fun:subtype_dealloc.* } { Memcheck:Addr4 fun:PyObject_GC_Del fun:frame_dealloc.* fun:PyEval_EvalFrameEx fun:PyEval_EvalFrameEx fun:PyEval_EvalFrameEx fun:PyEval_EvalFrameEx } { Memcheck:Addr4 fun:PyObject_GC_Del fun:PyFrame_ClearFreeList fun:collect.* fun:_PyObject_GC_New } { Memcheck:Addr4 fun:PyObject_GC_Del fun:PyFrame_ClearFreeList fun:collect.* } { Memcheck:Cond fun:PyObject_GC_Del fun:PyFrame_ClearFreeList fun:collect.* } { Memcheck:Cond fun:PyObject_GC_Del fun:subtype_dealloc.* } { Memcheck:Addr4 fun:PyObject_GC_Del fun:PyDict_Fini fun:Py_Finalize } { Memcheck:Cond fun:PyObject_GC_Del fun:PyDict_Fini fun:Py_Finalize } { Memcheck:Value8 fun:PyObject_GC_Del fun:PyDict_Fini fun:Py_Finalize } { Memcheck:Value8 fun:PyGrammar_RemoveAccelerators fun:Py_Finalize } { Memcheck:Addr4 fun:PyGrammar_RemoveAccelerators fun:Py_Finalize } { Memcheck:Cond fun:PyGrammar_RemoveAccelerators fun:Py_Finalize } From njs at pobox.com Tue Feb 9 15:01:16 2016 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 9 Feb 2016 12:01:16 -0800 Subject: [Numpy-discussion] Fwd: Multi-distribution Linux wheels - please test In-Reply-To: <56BA3FF6.40207@googlemail.com> References: <56BA3FF6.40207@googlemail.com> Message-ID: On Tue, Feb 9, 2016 at 11:37 AM, Julian Taylor wrote: > On 09.02.2016 04:59, Nathaniel Smith wrote: >> On Mon, Feb 8, 2016 at 6:07 PM, Nathaniel Smith wrote: >>> On Mon, Feb 8, 2016 at 6:04 PM, Matthew Brett wrote: >>>> On Mon, Feb 8, 2016 at 5:26 PM, Nathaniel Smith wrote: >>>>> On Mon, Feb 8, 2016 at 4:37 PM, Matthew Brett wrote: >>>>> [...] >>>>>> I can't replicate the segfault with manylinux wheels and scipy. On >>>>>> the other hand, I get a new test error for numpy from manylinux, scipy >>>>>> from manylinux, like this: >>>>>> >>>>>> $ python -c 'import scipy.linalg; scipy.linalg.test()' >>>>>> >>>>>> ====================================================================== >>>>>> FAIL: test_decomp.test_eigh('general ', 6, 'F', True, False, False, (2, 4)) >>>>>> ---------------------------------------------------------------------- >>>>>> Traceback (most recent call last): >>>>>> File "/usr/local/lib/python2.7/dist-packages/nose/case.py", line >>>>>> 197, in runTest >>>>>> self.test(*self.arg) >>>>>> File "/usr/local/lib/python2.7/dist-packages/scipy/linalg/tests/test_decomp.py", >>>>>> line 658, in eigenhproblem_general >>>>>> assert_array_almost_equal(diag2_, ones(diag2_.shape[0]), DIGITS[dtype]) >>>>>> File "/usr/local/lib/python2.7/dist-packages/numpy/testing/utils.py", >>>>>> line 892, in assert_array_almost_equal >>>>>> precision=decimal) >>>>>> File "/usr/local/lib/python2.7/dist-packages/numpy/testing/utils.py", >>>>>> line 713, in assert_array_compare >>>>>> raise AssertionError(msg) >>>>>> AssertionError: >>>>>> Arrays are not almost equal to 4 decimals >>>>>> >>>>>> (mismatch 100.0%) >>>>>> x: array([ 0., 0., 0.], dtype=float32) >>>>>> y: array([ 1., 1., 1.]) >>>>>> >>>>>> ---------------------------------------------------------------------- >>>>>> Ran 1507 tests in 14.928s >>>>>> >>>>>> FAILED (KNOWNFAIL=4, SKIP=1, failures=1) >>>>>> >>>>>> This is a very odd error, which we don't get when running over a numpy >>>>>> installed from source, linked to ATLAS, and doesn't happen when >>>>>> running the tests via: >>>>>> >>>>>> nosetests /usr/local/lib/python2.7/dist-packages/scipy/linalg >>>>>> >>>>>> So, something about the copy of numpy (linked to openblas) 
is >>>>>> affecting the results of scipy (also linked to openblas), and only >>>>>> with a particular environment / test order. >>>>>> >>>>>> If you'd like to try and see whether y'all can do a better job of >>>>>> debugging than me: >>>>>> >>>>>> # Run this script inside a docker container started with this incantation: >>>>>> # docker run -ti --rm ubuntu:12.04 /bin/bash >>>>>> apt-get update >>>>>> apt-get install -y python curl >>>>>> apt-get install libpython2.7 # this won't be necessary with next >>>>>> iteration of manylinux wheel builds >>>>>> curl -LO https://bootstrap.pypa.io/get-pip.py >>>>>> python get-pip.py >>>>>> pip install -f https://nipy.bic.berkeley.edu/manylinux numpy scipy nose >>>>>> python -c 'import scipy.linalg; scipy.linalg.test()' >>>>> >>>>> I just tried this and on my laptop it completed without error. >>>>> >>>>> Best guess is that we're dealing with some memory corruption bug >>>>> inside openblas, so it's getting perturbed by things like exactly what >>>>> other calls to openblas have happened (which is different depending on >>>>> whether numpy is linked to openblas), and which core type openblas has >>>>> detected. >>>>> >>>>> On my laptop, which *doesn't* show the problem, running with >>>>> OPENBLAS_VERBOSE=2 says "Core: Haswell". >>>>> >>>>> Guess the next step is checking what core type the failing machines >>>>> use, and running valgrind... anyone have a good valgrind suppressions >>>>> file? >>>> >>>> My machine (which does give the failure) gives >>>> >>>> Core: Core2 >>>> >>>> with OPENBLAS_VERBOSE=2 >>> >>> Yep, that allows me to reproduce it: >>> >>> root at f7153f0cc841:/# OPENBLAS_VERBOSE=2 OPENBLAS_CORETYPE=Core2 python >>> -c 'import scipy.linalg; scipy.linalg.test()' >>> Core: Core2 >>> [...] >>> ====================================================================== >>> FAIL: test_decomp.test_eigh('general ', 6, 'F', True, False, False, (2, 4)) >>> ---------------------------------------------------------------------- >>> [...] >>> >>> So this is indeed sounding like an OpenBLAS issue... next stop >>> valgrind, I guess :-/ >> >> Here's the valgrind output: >> https://gist.github.com/njsmith/577d028e79f0a80d2797 >> >> There's a lot of it, but no smoking guns have jumped out at me :-/ >> >> -n >> > > plenty of smoking guns, e.g.: > > .............==3695== Invalid read of size 8 > 3417 ==3695== at 0x7AAA9C0: daxpy_k_CORE2 (in > /usr/local/lib/python2.7/dist-packages/numpy/.libs/libopenblas.so.0) > 3418 ==3695== by 0x76BEEFC: ger_kernel (in > /usr/local/lib/python2.7/dist-packages/numpy/.libs/libopenblas.so.0) > 3419 ==3695== by 0x788F618: exec_blas (in > /usr/local/lib/python2.7/dist-packages/numpy/.libs/libopenblas.so.0) > 3420 ==3695== by 0x76BF099: dger_thread (in > /usr/local/lib/python2.7/dist-packages/numpy/.libs/libopenblas.so.0) > 3421 ==3695== by 0x767DC37: dger_ (in > /usr/local/lib/python2.7/dist-packages/numpy/.libs/libopenblas.so.0) > > > I think I have reported that to openblas already, they said do that > intentionally, though last I checked they are missing the code that > verifies this is actually allowed (if your not crossing a page you can > read beyond the boundaries). Its pretty likely its a pointless micro > optimization, you normally only use that trick for string functions > where you don't know the size of the string. Yeah, I thought that was intentional, and we're not getting a segfault so I don't think they're hitting any page boundaries. 
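For reference, the over-read trick under discussion: reading a few bytes past
the end of a buffer cannot fault as long as the extra read stays on the same
memory page as the last valid byte. Roughly the check below -- only a sketch of
the idea, not OpenBLAS's actual code, and the page size is assumed rather than
queried:

PAGE_SIZE = 4096  # typical x86-64 page size

def overread_stays_on_page(last_valid_addr, extra_bytes):
    # "Safe" only in the sense of not segfaulting; valgrind will still
    # report the access as an invalid read, as it does above.
    page_start = last_valid_addr & ~(PAGE_SIZE - 1)
    return last_valid_addr + extra_bytes < page_start + PAGE_SIZE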
It's possible they're screwing it up and somehow the random data they're reading can affect the results, and that's why we get the wrong answer sometimes, but that's just a wild guess. > Your code also indicates it ran on core2, while the issues occur on > sandybridge, maybe valgrind messes with the cpu detection so it won't > show anything. It ran on core2 because I set OPENBLAS_CORETYPE=core2, since that seems to trigger the particular issue that Matthew ran into (and indeed, the relevant test failure did occur in that valgrind run). The sandybridge thing is a different issue I think. -n -- Nathaniel J. Smith -- https://vorpus.org From freddyrietdijk at fridh.nl Tue Feb 9 15:01:11 2016 From: freddyrietdijk at fridh.nl (Freddy Rietdijk) Date: Tue, 9 Feb 2016 21:01:11 +0100 Subject: [Numpy-discussion] Fwd: Multi-distribution Linux wheels - please test In-Reply-To: References: <56B8C4E2.3060901@googlemail.com> Message-ID: On Nix we also had trouble with OpenBLAS 0.2.15. Version 0.2.14 did not cause any segmentation faults so we reverted to that version. https://github.com/scipy/scipy/issues/5620 (hopefully this time the e-mail gets through) On Tue, Feb 9, 2016 at 10:21 AM, Da?id wrote: > On 8 February 2016 at 20:25, Matthew Brett > wrote: > >> >> I used the latest release, v0.2.15: >> >> https://github.com/matthew-brett/manylinux-builds/blob/master/build_openblas.sh#L5 >> >> Is there a later version that we should try? >> >> Cheers, >> > > That is the one in the Fedora repos that is working for me. How are you > compiling it? > > Mine is compiled with GCC 5 with the options seen in the source rpm: > http://koji.fedoraproject.org/koji/packageinfo?packageID=15277 > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jtaylor.debian at googlemail.com Tue Feb 9 15:08:25 2016 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Tue, 9 Feb 2016 21:08:25 +0100 Subject: [Numpy-discussion] Fwd: Multi-distribution Linux wheels - please test In-Reply-To: References: <56BA3FF6.40207@googlemail.com> Message-ID: <56BA4739.4020908@googlemail.com> On 09.02.2016 21:01, Nathaniel Smith wrote: > On Tue, Feb 9, 2016 at 11:37 AM, Julian Taylor > wrote: >> On 09.02.2016 04:59, Nathaniel Smith wrote: >>> On Mon, Feb 8, 2016 at 6:07 PM, Nathaniel Smith wrote: >>>> On Mon, Feb 8, 2016 at 6:04 PM, Matthew Brett wrote: >>>>> On Mon, Feb 8, 2016 at 5:26 PM, Nathaniel Smith wrote: >>>>>> On Mon, Feb 8, 2016 at 4:37 PM, Matthew Brett wrote: >>>>>> [...] >>>>>>> I can't replicate the segfault with manylinux wheels and scipy. 
On >>>>>>> the other hand, I get a new test error for numpy from manylinux, scipy >>>>>>> from manylinux, like this: >>>>>>> >>>>>>> $ python -c 'import scipy.linalg; scipy.linalg.test()' >>>>>>> >>>>>>> ====================================================================== >>>>>>> FAIL: test_decomp.test_eigh('general ', 6, 'F', True, False, False, (2, 4)) >>>>>>> ---------------------------------------------------------------------- >>>>>>> Traceback (most recent call last): >>>>>>> File "/usr/local/lib/python2.7/dist-packages/nose/case.py", line >>>>>>> 197, in runTest >>>>>>> self.test(*self.arg) >>>>>>> File "/usr/local/lib/python2.7/dist-packages/scipy/linalg/tests/test_decomp.py", >>>>>>> line 658, in eigenhproblem_general >>>>>>> assert_array_almost_equal(diag2_, ones(diag2_.shape[0]), DIGITS[dtype]) >>>>>>> File "/usr/local/lib/python2.7/dist-packages/numpy/testing/utils.py", >>>>>>> line 892, in assert_array_almost_equal >>>>>>> precision=decimal) >>>>>>> File "/usr/local/lib/python2.7/dist-packages/numpy/testing/utils.py", >>>>>>> line 713, in assert_array_compare >>>>>>> raise AssertionError(msg) >>>>>>> AssertionError: >>>>>>> Arrays are not almost equal to 4 decimals >>>>>>> >>>>>>> (mismatch 100.0%) >>>>>>> x: array([ 0., 0., 0.], dtype=float32) >>>>>>> y: array([ 1., 1., 1.]) >>>>>>> >>>>>>> ---------------------------------------------------------------------- >>>>>>> Ran 1507 tests in 14.928s >>>>>>> >>>>>>> FAILED (KNOWNFAIL=4, SKIP=1, failures=1) >>>>>>> >>>>>>> This is a very odd error, which we don't get when running over a numpy >>>>>>> installed from source, linked to ATLAS, and doesn't happen when >>>>>>> running the tests via: >>>>>>> >>>>>>> nosetests /usr/local/lib/python2.7/dist-packages/scipy/linalg >>>>>>> >>>>>>> So, something about the copy of numpy (linked to openblas) is >>>>>>> affecting the results of scipy (also linked to openblas), and only >>>>>>> with a particular environment / test order. >>>>>>> >>>>>>> If you'd like to try and see whether y'all can do a better job of >>>>>>> debugging than me: >>>>>>> >>>>>>> # Run this script inside a docker container started with this incantation: >>>>>>> # docker run -ti --rm ubuntu:12.04 /bin/bash >>>>>>> apt-get update >>>>>>> apt-get install -y python curl >>>>>>> apt-get install libpython2.7 # this won't be necessary with next >>>>>>> iteration of manylinux wheel builds >>>>>>> curl -LO https://bootstrap.pypa.io/get-pip.py >>>>>>> python get-pip.py >>>>>>> pip install -f https://nipy.bic.berkeley.edu/manylinux numpy scipy nose >>>>>>> python -c 'import scipy.linalg; scipy.linalg.test()' >>>>>> >>>>>> I just tried this and on my laptop it completed without error. >>>>>> >>>>>> Best guess is that we're dealing with some memory corruption bug >>>>>> inside openblas, so it's getting perturbed by things like exactly what >>>>>> other calls to openblas have happened (which is different depending on >>>>>> whether numpy is linked to openblas), and which core type openblas has >>>>>> detected. >>>>>> >>>>>> On my laptop, which *doesn't* show the problem, running with >>>>>> OPENBLAS_VERBOSE=2 says "Core: Haswell". >>>>>> >>>>>> Guess the next step is checking what core type the failing machines >>>>>> use, and running valgrind... anyone have a good valgrind suppressions >>>>>> file? 
>>>>> >>>>> My machine (which does give the failure) gives >>>>> >>>>> Core: Core2 >>>>> >>>>> with OPENBLAS_VERBOSE=2 >>>> >>>> Yep, that allows me to reproduce it: >>>> >>>> root at f7153f0cc841:/# OPENBLAS_VERBOSE=2 OPENBLAS_CORETYPE=Core2 python >>>> -c 'import scipy.linalg; scipy.linalg.test()' >>>> Core: Core2 >>>> [...] >>>> ====================================================================== >>>> FAIL: test_decomp.test_eigh('general ', 6, 'F', True, False, False, (2, 4)) >>>> ---------------------------------------------------------------------- >>>> [...] >>>> >>>> So this is indeed sounding like an OpenBLAS issue... next stop >>>> valgrind, I guess :-/ >>> >>> Here's the valgrind output: >>> https://gist.github.com/njsmith/577d028e79f0a80d2797 >>> >>> There's a lot of it, but no smoking guns have jumped out at me :-/ >>> >>> -n >>> >> >> plenty of smoking guns, e.g.: >> >> .............==3695== Invalid read of size 8 >> 3417 ==3695== at 0x7AAA9C0: daxpy_k_CORE2 (in >> /usr/local/lib/python2.7/dist-packages/numpy/.libs/libopenblas.so.0) >> 3418 ==3695== by 0x76BEEFC: ger_kernel (in >> /usr/local/lib/python2.7/dist-packages/numpy/.libs/libopenblas.so.0) >> 3419 ==3695== by 0x788F618: exec_blas (in >> /usr/local/lib/python2.7/dist-packages/numpy/.libs/libopenblas.so.0) >> 3420 ==3695== by 0x76BF099: dger_thread (in >> /usr/local/lib/python2.7/dist-packages/numpy/.libs/libopenblas.so.0) >> 3421 ==3695== by 0x767DC37: dger_ (in >> /usr/local/lib/python2.7/dist-packages/numpy/.libs/libopenblas.so.0) >> >> >> I think I have reported that to openblas already, they said do that >> intentionally, though last I checked they are missing the code that >> verifies this is actually allowed (if your not crossing a page you can >> read beyond the boundaries). Its pretty likely its a pointless micro >> optimization, you normally only use that trick for string functions >> where you don't know the size of the string. > > Yeah, I thought that was intentional, and we're not getting a segfault > so I don't think they're hitting any page boundaries. It's possible > they're screwing it up and somehow the random data they're reading can > affect the results, and that's why we get the wrong answer sometimes, > but that's just a wild guess. with openblas everything is possible, especially this exact type of issue. See e.g.: https://github.com/xianyi/OpenBLAS/issues/171 here it loaded too much data, partly uninitialized, and if its filled with nan it spreads into the actually used data. That was a lot of fun to debug, and openblas is riddled with this stuff... e.g. here my favourite comment in openblas (which is probably the source of https://github.com/scipy/scipy/issues/5528): 51 /* make it volatile because some function (ex: dgemv_n.S) */ \ 52 /* do not restore all register */ \ https://github.com/xianyi/OpenBLAS/blob/develop/common_stackalloc.h#L51 From charlesr.harris at gmail.com Tue Feb 9 21:09:51 2016 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 9 Feb 2016 19:09:51 -0700 Subject: [Numpy-discussion] NumPy 1.11.0b3 released. Message-ID: Hi All, I'm pleased to announce the release of NumPy 1.11.0b3. This beta contains additional bug fixes as well as limiting the number of FutureWarnings raised by assignment to masked array slices. One issue that remains to be decided is whether or not to postpone raising an error for floats used as indexes. Sources may be found on Sourceforge and both sources and OS X wheels are availble on pypi. 
Please test, hopefully this will be that last beta needed. As a note on problems encountered, twine uploads continue to fail for me, but there are still variations to try. The wheeluploader downloaded wheels as it should, but could not upload them, giving the error message "HTTPError: 413 Client Error: Request Entity Too Large for url: https://www.python.org/pypi". Firefox also complains that http://wheels.scipy.org is incorrectly configured with an invalid certificate. Enjoy, Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From t3kcit at gmail.com Wed Feb 10 11:53:30 2016 From: t3kcit at gmail.com (Andreas Mueller) Date: Wed, 10 Feb 2016 11:53:30 -0500 Subject: [Numpy-discussion] Numpy 1.11.0b2 released In-Reply-To: <31866A9B-CA5B-4576-9BF6-9471C438D7F6@gmail.com> References: <56ABA552.1090002@gmail.com> <31866A9B-CA5B-4576-9BF6-9471C438D7F6@gmail.com> Message-ID: <56BB6B0A.1080307@gmail.com> Thanks, that is very helpful! On 01/30/2016 01:40 PM, Jeff Reback wrote: > just my 2c > > it's fairly straightforward to add a test to the Travis matrix to grab > numpy wheels built numpy wheels (works for conda or pip installs). > > so in pandas we r testing 2.7/3.5 against numpy master continuously > > https://github.com/pydata/pandas/blob/master/ci/install-3.5_NUMPY_DEV.sh > > On Jan 30, 2016, at 1:16 PM, Nathaniel Smith > wrote: > >> On Jan 30, 2016 9:27 AM, "Ralf Gommers" > > wrote: >> > >> > >> > >> > On Fri, Jan 29, 2016 at 11:39 PM, Nathaniel Smith > > wrote: >> >> >> >> It occurs to me that the best solution might be to put together a >> .travis.yml for the release branches that does: "for pkg in >> IMPORTANT_PACKAGES: pip install $pkg; python -c 'import pkg; pkg.test()'" >> >> This might not be viable right now, but will be made more viable >> if pypi starts allowing official Linux wheels, which looks likely to >> happen before 1.12... (see PEP 513) >> >> >> >> On Jan 29, 2016 9:46 AM, "Andreas Mueller" > > wrote: >> >> > >> >> > Is this the point when scikit-learn should build against it? >> >> >> >> Yes please! >> >> >> >> > Or do we wait for an RC? >> >> >> >> This is still all in flux, but I think we might actually want a >> rule that says it can't become an RC until after we've tested >> scikit-learn (and a list of similarly prominent packages). On the >> theory that RC means "we think this is actually good enough to >> release" :-). OTOH I'm not sure the alpha/beta/RC distinction is very >> helpful; maybe they should all just be betas. >> >> >> >> > Also, we need a scipy build against it. Who does that? >> >> >> >> Like Julian says, it shouldn't be necessary. In fact using old >> builds of scipy and scikit-learn is even better than rebuilding them, >> because it tests numpy's ABI compatibility -- if you find you *have* >> to rebuild something then we *definitely* want to know that. >> >> >> >> > Our continuous integration doesn't usually build scipy or numpy, >> so it will be a bit tricky to add to our config. >> >> > Would you run our master tests? [did we ever finish this >> discussion?] >> >> >> >> We didn't, and probably should... :-) >> > >> > Why would that be necessary if scikit-learn simply tests >> pre-releases of numpy as you suggested earlier in the thread (with >> --pre)? >> > >> > There's also https://github.com/MacPython/scipy-stack-osx-testing >> by the way, which could have scikit-learn and scikit-image added to it. >> > >> > That's two options that are imho both better than adding more >> workload for the numpy release manager. 
Also from a principled point >> of view, packages should test with new versions of their >> dependencies, not the other way around. >> >> Sorry, that was unclear. I meant that we should finish the >> discussion, not that we should necessarily be the ones running the >> tests. "The discussion" being this one: >> >> https://github.com/numpy/numpy/issues/6462#issuecomment-148094591 >> https://github.com/numpy/numpy/issues/6494 >> >> I'm not saying that the release manager necessarily should be running >> the tests (though it's one option). But the 1.10 experience seems to >> indicate that we need *some* process for the release manager to make >> sure that some basic downstream testing has happened. Another option >> would be keeping a checklist of downstream projects and making sure >> they've all checked in and confirmed that they've run tests before >> making the release. >> >> -n >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From t3kcit at gmail.com Wed Feb 10 11:57:38 2016 From: t3kcit at gmail.com (Andreas Mueller) Date: Wed, 10 Feb 2016 11:57:38 -0500 Subject: [Numpy-discussion] Numpy 1.11.0b2 released In-Reply-To: References: <56ABA552.1090002@gmail.com> <56ADE8A9.7030901@googlemail.com> Message-ID: <56BB6C02.1040407@gmail.com> On 02/01/2016 04:25 PM, Ralf Gommers wrote: > > It would be nice but its not realistic, I doubt most upstreams > that are > not themselves major downstreams are even subscribed to this list. > > > I'm pretty sure that some core devs from all major scipy stack > packages are subscribed to this list. Well, I don't think anyone else from sklearn picked up on this, and I myself totally forgot the issue for the last two weeks. I think continuously testing against numpy master might actually be feasible for us, but I'm not entirely sure.... -------------- next part -------------- An HTML attachment was scrubbed... URL: From nilsc.becker at gmail.com Wed Feb 10 14:01:31 2016 From: nilsc.becker at gmail.com (Nils Becker) Date: Wed, 10 Feb 2016 20:01:31 +0100 Subject: [Numpy-discussion] Linking other libm-Implementation In-Reply-To: References: Message-ID: 2016-02-09 18:02 GMT+01:00 Gregor Thalhammer : >> It is not suitable as a standard for numpy. > > Why should numpy not provide fast transcendental math functions? For linear algebra it supports fast implementations, even non-free (MKL). Wouldn?t it be nice if numpy outperforms C? Floating point operations that make use of vector extensions of modern processors may behave subtly different. This especially concerns floating point exceptions, where sometimes silent infinities are generated instead of raising a divide-by-zero exception (best description I could find on the spot: https://randomascii.wordpress.com/2012/04/21/exceptional-floating-point/, also see the notes on C99-compliance of the new vector expressions in glibc: https://sourceware.org/glibc/wiki/libmvec). I think the default in numpy should strive to be mostly standard compliant. But of course an option to activate vector math operations would be nice - if that is necessary with packages like numexpr is another question. 
One other point is the extended/long double type which is normally not supported by those libraries (as vector extensions cannot handle them). > Intel publishes accuracy/performance charts for VML/MKL: > https://software.intel.com/sites/products/documentation/doclib/mkl/vm/functions/_accuracyall.html > > For GNU libc it is more difficult to find similarly precise data, I only could find: > http://www.gnu.org/software/libc/manual/html_node/Errors-in-Math-Functions.html On Tue, Feb 9, 2016 at 7:06 AM, Da?id wrote: > I did some digging, and I found this: > > http://julia-programming-language.2336112.n4.nabble.com/Is-the-accuracy-of-Julia-s-elementary-functions-exp-sin-known-td32736.html Thank you for looking that up! I did not knew about the stuff published by Intel yet. 2016-02-09 20:13 GMT+01:00 Matthew Brett : > So GNU libm has max error <= 0.5 ULP, openlibm has <= 1 ULP, and OSX > is (almost always) somewhere in-between. > > So, is <= 1 ULP good enough? Calculating transcendental functions correctly rounded is very, very hard and to my knowledge there is no complete libm implementation that guarantees the necessary accuracy for all possible inputs. One effort was/is the Correctly Rounded LibM (crlibm [1]) which tried to prove the accuracy of their algorithms. However, the performance impact to achieve that last ulp in all rounding modes can be severe. Assessing accuracy of a function implementation is hard. Testing all possible inputs is not feasible (2^32/64 for single/double) and proving accuracy bounds may be even harder. Most of the time one samples accuracy with random numbers from a certain range. This generates tables like the ones for GNU libm or Intel. This is a kind of "faithful" accuracy as you believe that the accuracy you tested on a sample extends to the whole argument range. The error in worst case may be (much) bigger. That being said, I believe the values given by GNU libm for example are very trustworthy. libm is not always correctly rounded (which would be <= 0.5ulp in round-to-nearest), however, the error bounds given in the table seem to cover all worst cases. Common single-argument functions (sin, cos) are correctly rounded and even complex two-argument functions (cpow) are at most 5ulp off. I do not think that other implementations are more accurate. So libm is definitely good enough, accuracy-wise. In any case I would like to build a testing framework to compare some libms and check accuracy/performance (at least Intel has a history of underestimating their error bounds in transcendental functions [2]). crlibm offers worst-case arguments for some functions which could be used to complement randomized sampling. Maybe I have some time in the next weeks... [1] http://lipforge.ens-lyon.fr/www/crlibm/ [2] https://randomascii.wordpress.com/2014/10/09/intel-underestimates-error-bounds-by-1-3-quintillion/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Wed Feb 10 15:33:53 2016 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Wed, 10 Feb 2016 21:33:53 +0100 Subject: [Numpy-discussion] GSoC? In-Reply-To: References: Message-ID: On Tue, Feb 9, 2016 at 1:02 AM, Chris Barker wrote: > As you can see in the timeline: > > https://developers.google.com/open-source/gsoc/timeline > > We are now in the stage where mentoring organizations are getting their > act together. So the question now is -- are there folks that want to mentor > for numpy projects? 
It can be rewarding, but it's a pretty big commitment > as well, and, I suppose depending on the project, would require some good > knowledge of the innards of numpy -- there are not a lot of those folks out > there that have that background. > Note that we have always done a combined numpy/scipy ideas page and submission. For really good students numpy may be the right challenge, but in general scipy is easier to get started on. So we have difficult project ideas for both, but easy/intermediate ones will most likely be for scipy. > > So to students, I suggest you keep an eye out, and engage a little later > on in the process. > > That being said, if you have a idea for a numpy improvement you'd like to > work on , by all means propose it and maybe you'll get a mentor or two > excited. > > -CHB > > > > > > On Mon, Feb 8, 2016 at 3:33 PM, SMRUTI RANJAN SAHOO > wrote: > >> sir actually i am interested very much . so can you help me about this >> or suggest some , so that i can contribute . >> > Hi Smruti, I suggest you look at the numpy or scipy issues on Github, and start with one labeled "easy-fix". > >> Thanks & Regards, >> Smruti Ranjan Sahoo >> >> On Tue, Feb 9, 2016 at 1:58 AM, Chris Barker >> wrote: >> >>> >>> I think the real challenge is having folks with the time to really put >>> into mentoring, but if folks want to do it -- numpy could really benefit. >>> >>> Maybe as a python.org sub-project? >>> >> Under the PSF umbrella has always worked very well, both in terms of communication quality and of getting the amount of slots we wanted, so yes. > >>> https://wiki.python.org/moin/SummerOfCode/2016 >>> >>> Deadlines are approaching -- so I thought I'd ping the list and see if >>> folks are interested. >>> ANyone interested in Google Summer of Code this year? >>> >> Yes, last year we had quite a productive GSoC, so I had planned to organize it along the same lines again (with an updated ideas page of course). Are you maybe interested in co-organizing or mentoring Chris? Updating the ideas page, proposal reviewing and interviewing students via video calls can be time-consuming, and mentoring definitely is, so the more the merrier. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Wed Feb 10 16:58:33 2016 From: pav at iki.fi (Pauli Virtanen) Date: Wed, 10 Feb 2016 23:58:33 +0200 Subject: [Numpy-discussion] NumPy 1.11.0b3 released. In-Reply-To: References: Message-ID: 10.02.2016, 04:09, Charles R Harris kirjoitti: > I'm pleased to announce the release of NumPy 1.11.0b3. This beta contains [clip] > Please test, hopefully this will be that last beta needed. FWIW, https://travis-ci.org/pv/testrig/builds/108384173 From charlesr.harris at gmail.com Wed Feb 10 17:36:09 2016 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 10 Feb 2016 15:36:09 -0700 Subject: [Numpy-discussion] NumPy 1.11.0b3 released. In-Reply-To: References: Message-ID: On Wed, Feb 10, 2016 at 2:58 PM, Pauli Virtanen wrote: > 10.02.2016, 04:09, Charles R Harris kirjoitti: > > I'm pleased to announce the release of NumPy 1.11.0b3. This beta contains > [clip] > > Please test, hopefully this will be that last beta needed. > > FWIW, https://travis-ci.org/pv/testrig/builds/108384173 Thanks Pauli, very interesting. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From chris.barker at noaa.gov Wed Feb 10 17:48:39 2016 From: chris.barker at noaa.gov (Chris Barker) Date: Wed, 10 Feb 2016 14:48:39 -0800 Subject: [Numpy-discussion] GSoC? In-Reply-To: References: Message-ID: Thanks Ralf, Note that we have always done a combined numpy/scipy ideas page and > submission. For really good students numpy may be the right challenge, but > in general scipy is easier to get started on. > yup -- good idea. Is there a page ready to go, or do we need to get one up? (I don't even know where to put it...) > Under the PSF umbrella has always worked very well, both in terms of > communication quality and of getting the amount of slots we wanted, so yes. > hmm, looking here: https://wiki.python.org/moin/SummerOfCode/2016#Sub-orgs it seems it's time to get started. and I _think_ our ideas page can go on that Wiki. > Are you maybe interested in co-organizing or mentoring Chris? Updating the > ideas page, proposal reviewing and interviewing students via video calls > can be time-consuming, and mentoring definitely is, so the more the merrier. > I would love to help -- though I don't think I can commit to being a full-on mentor. If we get a couple people to agree to mentor, then we can get ourselves setup up with the PSF. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Wed Feb 10 17:55:14 2016 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Wed, 10 Feb 2016 23:55:14 +0100 Subject: [Numpy-discussion] GSoC? In-Reply-To: References: Message-ID: On Wed, Feb 10, 2016 at 11:48 PM, Chris Barker wrote: > Thanks Ralf, > > Note that we have always done a combined numpy/scipy ideas page and >> submission. For really good students numpy may be the right challenge, but >> in general scipy is easier to get started on. >> > > yup -- good idea. Is there a page ready to go, or do we need to get one > up? (I don't even know where to put it...) > This is last year's page: https://github.com/scipy/scipy/wiki/GSoC-2015-project-ideas Some ideas have been worked on, others are still relevant. Let's copy this page to -2016- and start editing it and adding new ideas. I'll start right now actually. > > >> Under the PSF umbrella has always worked very well, both in terms of >> communication quality and of getting the amount of slots we wanted, so yes. >> > > hmm, looking here: > > https://wiki.python.org/moin/SummerOfCode/2016#Sub-orgs > > it seems it's time to get started. and I _think_ our ideas page can go on > that Wiki. > > >> Are you maybe interested in co-organizing or mentoring Chris? Updating >> the ideas page, proposal reviewing and interviewing students via video >> calls can be time-consuming, and mentoring definitely is, so the more the >> merrier. >> > > I would love to help -- though I don't think I can commit to being a > full-on mentor. > > If we get a couple people to agree to mentor, > That's always the tricky part. We normally let people indicate whether they're interested in mentoring for specific project ideas on the ideas page. > then we can get ourselves setup up with the PSF. > > That's the easiest part, takes one email and one wiki page edit:) Ralf -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ralf.gommers at gmail.com Wed Feb 10 18:02:52 2016 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Thu, 11 Feb 2016 00:02:52 +0100 Subject: [Numpy-discussion] GSoC? In-Reply-To: References: Message-ID: On Wed, Feb 10, 2016 at 11:55 PM, Ralf Gommers wrote: > > > On Wed, Feb 10, 2016 at 11:48 PM, Chris Barker > wrote: > >> Thanks Ralf, >> >> Note that we have always done a combined numpy/scipy ideas page and >>> submission. For really good students numpy may be the right challenge, but >>> in general scipy is easier to get started on. >>> >> >> yup -- good idea. Is there a page ready to go, or do we need to get one >> up? (I don't even know where to put it...) >> > > This is last year's page: > https://github.com/scipy/scipy/wiki/GSoC-2015-project-ideas > > Some ideas have been worked on, others are still relevant. Let's copy this > page to -2016- and start editing it and adding new ideas. I'll start right > now actually. > OK first version: https://github.com/scipy/scipy/wiki/GSoC-2016-project-ideas I kept some of the ideas from last year, but removed all potential mentors as the same people may not be available this year - please re-add yourselves where needed. And to everyone who has a good idea, and preferably is willing to mentor for that idea: please add it to that page. Ralf > > >> >> >>> Under the PSF umbrella has always worked very well, both in terms of >>> communication quality and of getting the amount of slots we wanted, so yes. >>> >> >> hmm, looking here: >> >> https://wiki.python.org/moin/SummerOfCode/2016#Sub-orgs >> >> it seems it's time to get started. and I _think_ our ideas page can go on >> that Wiki. >> >> >>> Are you maybe interested in co-organizing or mentoring Chris? Updating >>> the ideas page, proposal reviewing and interviewing students via video >>> calls can be time-consuming, and mentoring definitely is, so the more the >>> merrier. >>> >> >> I would love to help -- though I don't think I can commit to being a >> full-on mentor. >> >> If we get a couple people to agree to mentor, >> > > That's always the tricky part. We normally let people indicate whether > they're interested in mentoring for specific project ideas on the ideas > page. > > >> then we can get ourselves setup up with the PSF. >> >> > > That's the easiest part, takes one email and one wiki page edit:) > > Ralf > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Wed Feb 10 18:55:09 2016 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 10 Feb 2016 18:55:09 -0500 Subject: [Numpy-discussion] NumPy 1.11.0b3 released. In-Reply-To: References: Message-ID: On Wed, Feb 10, 2016 at 5:36 PM, Charles R Harris wrote: > > > On Wed, Feb 10, 2016 at 2:58 PM, Pauli Virtanen wrote: > >> 10.02.2016, 04:09, Charles R Harris kirjoitti: >> > I'm pleased to announce the release of NumPy 1.11.0b3. This beta >> contains >> [clip] >> > Please test, hopefully this will be that last beta needed. >> >> FWIW, https://travis-ci.org/pv/testrig/builds/108384173 > > > Thanks Pauli, very interesting. > Thanks Pauli, me too is this intended?: return np.r_[[np.nan] * head, x, [np.nan] * tail] TypeError: 'numpy.float64' object cannot be interpreted as an index In the old times of Python 2.x, statsmodels avoided integers so we don't get accidental integer division. Python wanted float() everywhere. Looks like numpy wants int() everywhere. 
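A minimal illustration of the cast that is now needed (the value of head here is made up; this is not the actual statsmodels code):

    import numpy as np

    x = np.arange(3.0)
    head = np.float64(2)                # came out of a floating-point computation
    np.r_[[np.nan] * int(head), x]      # fine: the repetition count is a genuine Python int

Without the explicit int() the float64 is refused where an integer is required, which is the TypeError shown in the traceback above.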
(fixed in statsmodels master) Josef > > Chuck > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From shoyer at gmail.com Wed Feb 10 19:01:41 2016 From: shoyer at gmail.com (Stephan Hoyer) Date: Wed, 10 Feb 2016 16:01:41 -0800 Subject: [Numpy-discussion] GSoC? In-Reply-To: References: Message-ID: On Wed, Feb 10, 2016 at 3:02 PM, Ralf Gommers wrote: > OK first version: > https://github.com/scipy/scipy/wiki/GSoC-2016-project-ideas > I kept some of the ideas from last year, but removed all potential mentors > as the same people may not be available this year - please re-add > yourselves where needed. > > And to everyone who has a good idea, and preferably is willing to mentor > for that idea: please add it to that page. > > Ralf > I removed the "Improve Numpy datetime functionality" project, since the relevant improvements have mostly already made it into NumPy 1.11. We might consider adding "improve duck typing for numpy arrays" if any GSOC students are true masochists ;). I could potentially be a mentor for this one, though of course Nathaniel is the obvious choice. Stephan -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Wed Feb 10 19:22:06 2016 From: chris.barker at noaa.gov (Chris Barker) Date: Wed, 10 Feb 2016 16:22:06 -0800 Subject: [Numpy-discussion] GSoC? In-Reply-To: References: Message-ID: > > We might consider adding "improve duck typing for numpy arrays" > care to elaborate on that one? I know it come up on here that it would be good to have some code in numpy itself that made it easier to make array-like objects (I.e. do indexing the same way) Is that what you mean? -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From jfoxrabinovitz at gmail.com Wed Feb 10 22:09:09 2016 From: jfoxrabinovitz at gmail.com (Joseph Fox-Rabinovitz) Date: Wed, 10 Feb 2016 22:09:09 -0500 Subject: [Numpy-discussion] Deprecating `numpy.iterable` Message-ID: I have created a PR to deprecate `np.iterable` (https://github.com/numpy/numpy/pull/7202). It is a very old function, introduced as a utility in 2005 (https://github.com/numpy/numpy/commit/052a7b2e3276a303be1083022fc24d43084d2e14), and there is no good reason for it to be part of the public API. It is used internally 10 times within numpy. I have repaced those usages with a private function `np.lib.function_base._iterable` and added a `DeprecationWarning` to the public function. Is there anyone that objects to deprecating this function? Regards, -Joseph From mwojc at p.lodz.pl Thu Feb 11 10:40:22 2016 From: mwojc at p.lodz.pl (Marek Wojciechowski) Date: Thu, 11 Feb 2016 16:40:22 +0100 Subject: [Numpy-discussion] Build fortran extension on Windows with gfortran and MSVC Message-ID: <5304287.uJTab5Mlm1@think> Hi! It seems that on Windows + Python-3.5 fortran extensions cannot be built anymore with f2py and mingw32 compilers, because of new MSVC. 
Here is the short description of the errors one gets: http://stackoverflow.com/questions/33822554/build-fortran-extension-on-windows-with-gfortran-and-msvc This is sad, because this worked quite nicely with all previous python versions. I'm curious what are the options now? Is this going to be supported in the future, with mingw-w64 for example? Regards, -- Marek --- Politechnika Łódzka Lodz University of Technology This email contains information intended solely for the use of the individual to whom it is addressed. If you are not the intended recipient or if you have received this message in error, please notify the sender and delete it from your system. From jgomezdans at gmail.com Thu Feb 11 11:23:23 2016 From: jgomezdans at gmail.com (Jose Gomez-Dans) Date: Thu, 11 Feb 2016 16:23:23 +0000 Subject: [Numpy-discussion] Build fortran extension on Windows with gfortran and MSVC In-Reply-To: <5304287.uJTab5Mlm1@think> References: <5304287.uJTab5Mlm1@think> Message-ID: Hi, On 11 February 2016 at 15:40, Marek Wojciechowski wrote: > It seems that on Windows + Python-3.5 fortran extensions cannot be built > anymore with f2py and mingw32 compilers, because of new MSVC. Here is the > short description of the errors one gets: > Off-topic, but this is something that I get asked quite often, and I don't know what to say, not being a windows user: Is there some "howto" compile f2py fortran modules from scratch in windows with anaconda available? Jose -------------- next part -------------- An HTML attachment was scrubbed... URL: From mwojc at p.lodz.pl Thu Feb 11 11:40:36 2016 From: mwojc at p.lodz.pl (Marek Wojciechowski) Date: Thu, 11 Feb 2016 17:40:36 +0100 Subject: [Numpy-discussion] Build fortran extension on Windows with gfortran and MSVC In-Reply-To: <5304287.uJTab5Mlm1@think> References: <5304287.uJTab5Mlm1@think> Message-ID: <1605685.nivjaMJExt@think> On Thursday, 11 February 2016 16:40:22, Marek Wojciechowski wrote: > Hi! > > It seems that on Windows + Python-3.5 fortran extensions cannot be built > anymore with f2py and mingw32 compilers, because of new MSVC. Here is the > short description of the errors one gets: > > http://stackoverflow.com/questions/33822554/build-fortran-extension-on-windo > ws-with-gfortran-and-msvc > > This is sad, because this worked quite nicely with all previous python > versions. I'm curious what are the options now? Is this going to be > supported in the future, with mingw-w64 for example? I found that the effort is on the way: http://mingwpy.github.io. The roadmap is quite ambitious... Regards, -- Marek --- Politechnika Łódzka Lodz University of Technology This email contains information intended solely for the use of the individual to whom it is addressed. If you are not the intended recipient or if you have received this message in error, please notify the sender and delete it from your system.
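(For context, the invocation that used to work and now fails is the ordinary f2py one, roughly f2py -c -m mymod mymod.f90 --fcompiler=gnu95 --compiler=mingw32, where the module and file names are only placeholders; the breakage on Python 3.5 comes from the switch to the new MSVC toolchain described in the Stack Overflow post above.)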
From mwojc at p.lodz.pl Thu Feb 11 11:44:19 2016 From: mwojc at p.lodz.pl (Marek Wojciechowski) Date: Thu, 11 Feb 2016 17:44:19 +0100 Subject: [Numpy-discussion] Build fortran extension on Windows with gfortran and MSVC In-Reply-To: References: <5304287.uJTab5Mlm1@think> Message-ID: <2238991.j8P6SsEmqA@think> On Thursday, 11 February 2016 16:23:23, Jose Gomez-Dans wrote: > Hi, > > On 11 February 2016 at 15:40, Marek Wojciechowski wrote: > > It seems that on Windows + Python-3.5 fortran extensions cannot be built > > anymore with f2py and mingw32 compilers, because of new MSVC. Here is the > > > short description of the errors one gets: > Off-topic, but this is something that I get asked quite often, and I don't > know what to say, not being a windows user: > Is there some "howto" compile f2py fortran modules from scratch in windows > with anaconda available? After installing mingw (conda install mingw), you should go like on Linux. I think f2py is choosing --compiler=mingw32 by default. -- Marek --- Politechnika Łódzka Lodz University of Technology This email contains information intended solely for the use of the individual to whom it is addressed. If you are not the intended recipient or if you have received this message in error, please notify the sender and delete it from your system. From shoyer at gmail.com Thu Feb 11 13:25:33 2016 From: shoyer at gmail.com (Stephan Hoyer) Date: Thu, 11 Feb 2016 10:25:33 -0800 Subject: [Numpy-discussion] Deprecating `numpy.iterable` In-Reply-To: References: Message-ID: We certainly can (and probably should) deprecate this, but we can't remove it for a very long time. np.iterable is used in a lot of third party code. On Wed, Feb 10, 2016 at 7:09 PM, Joseph Fox-Rabinovitz < jfoxrabinovitz at gmail.com> wrote: > I have created a PR to deprecate `np.iterable` > (https://github.com/numpy/numpy/pull/7202). It is a very old function, > introduced as a utility in 2005 > ( > https://github.com/numpy/numpy/commit/052a7b2e3276a303be1083022fc24d43084d2e14 > ), > and there is no good reason for it to be part of the public API. It is > used internally 10 times within numpy. I have repaced those usages > with a private function `np.lib.function_base._iterable` and added a > `DeprecationWarning` to the public function. > > Is there anyone that objects to deprecating this function? > > Regards, > > -Joseph > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Thu Feb 11 14:12:18 2016 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 11 Feb 2016 11:12:18 -0800 Subject: [Numpy-discussion] Deprecating `numpy.iterable` In-Reply-To: References: Message-ID: Oh wow, yeah, there are tons of uses: https://github.com/search?q=%22np.iterable%22&ref=simplesearch&type=Code&utf8=%E2%9C%93 Meh, I dunno, maybe we're stuck with it. It's not a major maintenance burden at least. -n On Thu, Feb 11, 2016 at 10:25 AM, Stephan Hoyer wrote: > We certainly can (and probably should) deprecate this, but we can't remove > it for a very long time. > > np.iterable is used in a lot of third party code.
> > On Wed, Feb 10, 2016 at 7:09 PM, Joseph Fox-Rabinovitz > wrote: >> >> I have created a PR to deprecate `np.iterable` >> (https://github.com/numpy/numpy/pull/7202). It is a very old function, >> introduced as a utility in 2005 >> >> (https://github.com/numpy/numpy/commit/052a7b2e3276a303be1083022fc24d43084d2e14), >> and there is no good reason for it to be part of the public API. It is >> used internally 10 times within numpy. I have repaced those usages >> with a private function `np.lib.function_base._iterable` and added a >> `DeprecationWarning` to the public function. >> >> Is there anyone that objects to deprecating this function? >> >> Regards, >> >> -Joseph >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Nathaniel J. Smith -- https://vorpus.org From ben.v.root at gmail.com Thu Feb 11 14:54:00 2016 From: ben.v.root at gmail.com (Benjamin Root) Date: Thu, 11 Feb 2016 14:54:00 -0500 Subject: [Numpy-discussion] Deprecating `numpy.iterable` In-Reply-To: References: Message-ID: Huh... matplotlib could use that! We have been using our own internal function left over from the numerix days, I think. Ben Root On Thu, Feb 11, 2016 at 2:12 PM, Nathaniel Smith wrote: > Oh wow, yeah, there are tons of uses: > > https://github.com/search?q=%22np.iterable%22&ref=simplesearch&type=Code&utf8=%E2%9C%93 > > Meh, I dunno, maybe we're stuck with it. It's not a major maintenance > burden at least. > > -n > > On Thu, Feb 11, 2016 at 10:25 AM, Stephan Hoyer wrote: > > We certainly can (and probably should) deprecate this, but we can't > remove > > it for a very long time. > > > > np.iterable is used in a lot of third party code. > > > > On Wed, Feb 10, 2016 at 7:09 PM, Joseph Fox-Rabinovitz > > wrote: > >> > >> I have created a PR to deprecate `np.iterable` > >> (https://github.com/numpy/numpy/pull/7202). It is a very old function, > >> introduced as a utility in 2005 > >> > >> ( > https://github.com/numpy/numpy/commit/052a7b2e3276a303be1083022fc24d43084d2e14 > ), > >> and there is no good reason for it to be part of the public API. It is > >> used internally 10 times within numpy. I have repaced those usages > >> with a private function `np.lib.function_base._iterable` and added a > >> `DeprecationWarning` to the public function. > >> > >> Is there anyone that objects to deprecating this function? > >> > >> Regards, > >> > >> -Joseph > >> _______________________________________________ > >> NumPy-Discussion mailing list > >> NumPy-Discussion at scipy.org > >> https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > -- > Nathaniel J. Smith -- https://vorpus.org > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jfoxrabinovitz at gmail.com Thu Feb 11 14:57:34 2016 From: jfoxrabinovitz at gmail.com (Joseph Fox-Rabinovitz) Date: Thu, 11 Feb 2016 14:57:34 -0500 Subject: [Numpy-discussion] Deprecating `numpy.iterable` In-Reply-To: References: Message-ID: It appears that deprecating `numpy.iterable` would be counterproductive. I have reverted my PR to just making the return value an actual `bool` instead of an `int`. On Thu, Feb 11, 2016 at 2:54 PM, Benjamin Root wrote: > Huh... matplotlib could use that! We have been using our own internal > function left over from the numerix days, I think. > > Ben Root > > On Thu, Feb 11, 2016 at 2:12 PM, Nathaniel Smith wrote: >> >> Oh wow, yeah, there are tons of uses: >> >> https://github.com/search?q=%22np.iterable%22&ref=simplesearch&type=Code&utf8=%E2%9C%93 >> >> Meh, I dunno, maybe we're stuck with it. It's not a major maintenance >> burden at least. >> >> -n >> >> On Thu, Feb 11, 2016 at 10:25 AM, Stephan Hoyer wrote: >> > We certainly can (and probably should) deprecate this, but we can't >> > remove >> > it for a very long time. >> > >> > np.iterable is used in a lot of third party code. >> > >> > On Wed, Feb 10, 2016 at 7:09 PM, Joseph Fox-Rabinovitz >> > wrote: >> >> >> >> I have created a PR to deprecate `np.iterable` >> >> (https://github.com/numpy/numpy/pull/7202). It is a very old function, >> >> introduced as a utility in 2005 >> >> >> >> >> >> (https://github.com/numpy/numpy/commit/052a7b2e3276a303be1083022fc24d43084d2e14), >> >> and there is no good reason for it to be part of the public API. It is >> >> used internally 10 times within numpy. I have repaced those usages >> >> with a private function `np.lib.function_base._iterable` and added a >> >> `DeprecationWarning` to the public function. >> >> >> >> Is there anyone that objects to deprecating this function? >> >> >> >> Regards, >> >> >> >> -Joseph >> >> _______________________________________________ >> >> NumPy-Discussion mailing list >> >> NumPy-Discussion at scipy.org >> >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> > >> > >> > >> > _______________________________________________ >> > NumPy-Discussion mailing list >> > NumPy-Discussion at scipy.org >> > https://mail.scipy.org/mailman/listinfo/numpy-discussion >> > >> >> >> >> -- >> Nathaniel J. Smith -- https://vorpus.org >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > From matthew.brett at gmail.com Thu Feb 11 20:19:09 2016 From: matthew.brett at gmail.com (Matthew Brett) Date: Thu, 11 Feb 2016 17:19:09 -0800 Subject: [Numpy-discussion] Hook in __init__.py to let distributors patch numpy Message-ID: Hi, Over at https://github.com/numpy/numpy/issues/5479 we're discussing Windows wheels. On thing that we would like to be able to ship Windows wheels, is to be able to put some custom checks into numpy when you build the wheels. Specifically, for Windows, we're building on top of ATLAS BLAS / LAPACK, and we need to check that the system on which the wheel is running, has SSE2 instructions, otherwise we know ATLAS will crash (almost everybody does have SSE2 these days). 
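As an illustration of the kind of check meant here, on Windows the CPU feature can be queried with nothing but ctypes -- this is only a sketch of the idea, not the code shipped in the wheels:

    import ctypes
    import sys

    PF_XMMI64_INSTRUCTIONS_AVAILABLE = 10   # SSE2 flag for kernel32.IsProcessorFeaturePresent

    if sys.platform == 'win32':
        has_sse2 = bool(ctypes.windll.kernel32.IsProcessorFeaturePresent(
            PF_XMMI64_INSTRUCTIONS_AVAILABLE))
        if not has_sse2:
            raise ImportError("this numpy build was linked against an SSE2 ATLAS, "
                              "but this CPU does not support SSE2")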
The way I propose we do that, is this patch here: https://github.com/numpy/numpy/pull/7231 diff --git a/numpy/__init__.py b/numpy/__init__.py index 0fcd509..ba3ba16 100644 --- a/numpy/__init__.py +++ b/numpy/__init__.py @@ -190,6 +190,12 @@ def pkgload(*packages, **options): test = testing.nosetester._numpy_tester().test bench = testing.nosetester._numpy_tester().bench + # Allow platform-specific build to intervene in numpy init + try: + from . import _distributor_init + except ImportError: + pass + from . import core from .core import * from . import compat So, numpy __init__.py looks for a module `_distributor_init`, in which the distributor might have put custom code to do any checks and initialization needed for the particular platform. We don't by default ship a `_distributor_init.py` but leave it up to packagers to generate this when building binaries. Does that sound like a sensible approach to y'all? Cheers, Matthew From msarahan at gmail.com Thu Feb 11 20:24:02 2016 From: msarahan at gmail.com (Michael Sarahan) Date: Fri, 12 Feb 2016 01:24:02 +0000 Subject: [Numpy-discussion] Hook in __init__.py to let distributors patch numpy In-Reply-To: References: Message-ID: +1. This seems nicer than patching __init__.py itself, in that it is much more transparent. Good idea. Michael On Thu, Feb 11, 2016 at 7:19 PM Matthew Brett wrote: > Hi, > > Over at https://github.com/numpy/numpy/issues/5479 we're discussing > Windows wheels. > > On thing that we would like to be able to ship Windows wheels, is to > be able to put some custom checks into numpy when you build the > wheels. > > Specifically, for Windows, we're building on top of ATLAS BLAS / > LAPACK, and we need to check that the system on which the wheel is > running, has SSE2 instructions, otherwise we know ATLAS will crash > (almost everybody does have SSE2 these days). > > The way I propose we do that, is this patch here: > > https://github.com/numpy/numpy/pull/7231 > > diff --git a/numpy/__init__.py b/numpy/__init__.py > index 0fcd509..ba3ba16 100644 > --- a/numpy/__init__.py > +++ b/numpy/__init__.py > @@ -190,6 +190,12 @@ def pkgload(*packages, **options): > test = testing.nosetester._numpy_tester().test > bench = testing.nosetester._numpy_tester().bench > > + # Allow platform-specific build to intervene in numpy init > + try: > + from . import _distributor_init > + except ImportError: > + pass > + > from . import core > from .core import * > from . import compat > > So, numpy __init__.py looks for a module `_distributor_init`, in which > the distributor might have put custom code to do any checks and > initialization needed for the particular platform. We don't by > default ship a `_distributor_init.py` but leave it up to packagers to > generate this when building binaries. > > Does that sound like a sensible approach to y'all? > > Cheers, > > Matthew > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Fri Feb 12 04:37:52 2016 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 12 Feb 2016 09:37:52 +0000 Subject: [Numpy-discussion] Hook in __init__.py to let distributors patch numpy In-Reply-To: References: Message-ID: I would add a numpy/_distributor_init.py module and unconditionally import it in the __init__.py. It's contents in our upstream sources would just be a docstring: """Distributors! 
Put your initialization code here! """ One important technical benefit is that the unconditional import won't hide ImportErrors in the distributor's code. On Fri, Feb 12, 2016 at 1:19 AM, Matthew Brett wrote: > Hi, > > Over at https://github.com/numpy/numpy/issues/5479 we're discussing > Windows wheels. > > On thing that we would like to be able to ship Windows wheels, is to > be able to put some custom checks into numpy when you build the > wheels. > > Specifically, for Windows, we're building on top of ATLAS BLAS / > LAPACK, and we need to check that the system on which the wheel is > running, has SSE2 instructions, otherwise we know ATLAS will crash > (almost everybody does have SSE2 these days). > > The way I propose we do that, is this patch here: > > https://github.com/numpy/numpy/pull/7231 > > diff --git a/numpy/__init__.py b/numpy/__init__.py > index 0fcd509..ba3ba16 100644 > --- a/numpy/__init__.py > +++ b/numpy/__init__.py > @@ -190,6 +190,12 @@ def pkgload(*packages, **options): > test = testing.nosetester._numpy_tester().test > bench = testing.nosetester._numpy_tester().bench > > + # Allow platform-specific build to intervene in numpy init > + try: > + from . import _distributor_init > + except ImportError: > + pass > + > from . import core > from .core import * > from . import compat > > So, numpy __init__.py looks for a module `_distributor_init`, in which > the distributor might have put custom code to do any checks and > initialization needed for the particular platform. We don't by > default ship a `_distributor_init.py` but leave it up to packagers to > generate this when building binaries. > > Does that sound like a sensible approach to y'all? > > Cheers, > > Matthew > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... URL: From filaboia at gmail.com Fri Feb 12 09:40:37 2016 From: filaboia at gmail.com (=?UTF-8?Q?S=C3=A9rgio?=) Date: Fri, 12 Feb 2016 12:40:37 -0200 Subject: [Numpy-discussion] [Suggestion] Labelled Array Message-ID: Hello, This is my first e-mail, I will try to make the idea simple. Similar to masked array it would be interesting to use a label array to guide operations. Ex.: >>> x labelled_array(data = [[0 1 2] [3 4 5] [6 7 8]], label = [[0 1 2] [0 1 2] [0 1 2]]) >>> sum(x) array([9, 12, 15]) The operations would create a new axis for label indexing. You could think of it as a collection of masks, one for each label. I don't know a way to make something like this efficiently without a loop. Just wondering... S?rgio. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben.v.root at gmail.com Fri Feb 12 09:49:51 2016 From: ben.v.root at gmail.com (Benjamin Root) Date: Fri, 12 Feb 2016 09:49:51 -0500 Subject: [Numpy-discussion] [Suggestion] Labelled Array In-Reply-To: References: Message-ID: Seems like you are talking about xarray: https://github.com/pydata/xarray Cheers! Ben Root On Fri, Feb 12, 2016 at 9:40 AM, S?rgio wrote: > Hello, > > This is my first e-mail, I will try to make the idea simple. > > Similar to masked array it would be interesting to use a label array to > guide operations. > > Ex.: > >>> x > labelled_array(data = > [[0 1 2] > [3 4 5] > [6 7 8]], > label = > [[0 1 2] > [0 1 2] > [0 1 2]]) > > >>> sum(x) > array([9, 12, 15]) > > The operations would create a new axis for label indexing. 
> > You could think of it as a collection of masks, one for each label. > > I don't know a way to make something like this efficiently without a loop. > Just wondering... > > S?rgio. > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben.v.root at gmail.com Fri Feb 12 09:52:54 2016 From: ben.v.root at gmail.com (Benjamin Root) Date: Fri, 12 Feb 2016 09:52:54 -0500 Subject: [Numpy-discussion] [Suggestion] Labelled Array In-Reply-To: References: Message-ID: Re-reading your post, I see you are talking about something different. Not exactly sure what your use-case is. Ben Root On Fri, Feb 12, 2016 at 9:49 AM, Benjamin Root wrote: > Seems like you are talking about xarray: https://github.com/pydata/xarray > > Cheers! > Ben Root > > On Fri, Feb 12, 2016 at 9:40 AM, S?rgio wrote: > >> Hello, >> >> This is my first e-mail, I will try to make the idea simple. >> >> Similar to masked array it would be interesting to use a label array to >> guide operations. >> >> Ex.: >> >>> x >> labelled_array(data = >> [[0 1 2] >> [3 4 5] >> [6 7 8]], >> label = >> [[0 1 2] >> [0 1 2] >> [0 1 2]]) >> >> >>> sum(x) >> array([9, 12, 15]) >> >> The operations would create a new axis for label indexing. >> >> You could think of it as a collection of masks, one for each label. >> >> I don't know a way to make something like this efficiently without a >> loop. Just wondering... >> >> S?rgio. >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From t3kcit at gmail.com Fri Feb 12 16:17:30 2016 From: t3kcit at gmail.com (Andreas Mueller) Date: Fri, 12 Feb 2016 16:17:30 -0500 Subject: [Numpy-discussion] NumPy 1.11.0b3 released. In-Reply-To: References: Message-ID: <56BE4BEA.5040100@gmail.com> Hi. Where can I find the changelog? It would be good for us to know which changes are done one purpos without hunting through the issue tracker. Thanks, Andy On 02/09/2016 09:09 PM, Charles R Harris wrote: > Hi All, > > I'm pleased to announce the release of NumPy 1.11.0b3. This beta > contains additional bug fixes as well as limiting the number of > FutureWarnings raised by assignment to masked array slices. One issue > that remains to be decided is whether or not to postpone raising an > error for floats used as indexes. Sources may be found on Sourceforge > and > both sources and OS X wheels are availble on pypi. Please test, > hopefully this will be that last beta needed. > > As a note on problems encountered, twine uploads continue to fail for > me, but there are still variations to try. The wheeluploader > downloaded wheels as it should, but could not upload them, giving the > error message "HTTPError: 413 Client Error: Request Entity Too Large > for url: https://www.python.org/pypi". Firefox also complains that > http://wheels.scipy.org is incorrectly configured with an invalid > certificate. > > Enjoy, > > Chuck > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From nathan12343 at gmail.com Fri Feb 12 16:19:56 2016 From: nathan12343 at gmail.com (Nathan Goldbaum) Date: Fri, 12 Feb 2016 15:19:56 -0600 Subject: [Numpy-discussion] NumPy 1.11.0b3 released. In-Reply-To: <56BE4BEA.5040100@gmail.com> References: <56BE4BEA.5040100@gmail.com> Message-ID: https://github.com/numpy/numpy/blob/master/doc/release/1.11.0-notes.rst On Fri, Feb 12, 2016 at 3:17 PM, Andreas Mueller wrote: > Hi. > Where can I find the changelog? > It would be good for us to know which changes are done one purpos without > hunting through the issue tracker. > > Thanks, > Andy > > > On 02/09/2016 09:09 PM, Charles R Harris wrote: > > Hi All, > > I'm pleased to announce the release of NumPy 1.11.0b3. This beta contains > additional bug fixes as well as limiting the number of FutureWarnings > raised by assignment to masked array slices. One issue that remains to be > decided is whether or not to postpone raising an error for floats used as > indexes. Sources may be found on Sourceforge > and both > sources and OS X wheels are availble on pypi. Please test, hopefully this > will be that last beta needed. > > As a note on problems encountered, twine uploads continue to fail for me, > but there are still variations to try. The wheeluploader downloaded wheels > as it should, but could not upload them, giving the error message > "HTTPError: 413 Client Error: Request Entity Too Large for url: > https://www.python.org/pypi". Firefox also > complains that http://wheels.scipy.org is incorrectly configured with an > invalid certificate. > > Enjoy, > > Chuck > > > _______________________________________________ > NumPy-Discussion mailing listNumPy-Discussion at scipy.orghttps://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Fri Feb 12 18:06:41 2016 From: matthew.brett at gmail.com (Matthew Brett) Date: Fri, 12 Feb 2016 15:06:41 -0800 Subject: [Numpy-discussion] Windows wheels for testing Message-ID: Hi, We're talking about putting up Windows wheels for numpy on pypi - here: https://github.com/numpy/numpy/issues/5479 I've built some wheels that might be suitable - available here: http://nipy.bic.berkeley.edu/scipy_installers/atlas_builds/ I'd be very grateful if y'all would test these. They should install with something like: pip install -f https://nipy.bic.berkeley.edu/scipy_installers/atlas_builds numpy This should work for Pythons 2.7, 3.4, 3.5, both 32 and 64-bit. Any feedback would be very useful, Cheers, Matthew From rays at blue-cove.com Fri Feb 12 18:15:42 2016 From: rays at blue-cove.com (R Schumacher) Date: Fri, 12 Feb 2016 15:15:42 -0800 Subject: [Numpy-discussion] Windows wheels for testing In-Reply-To: References: Message-ID: <201602122315.u1CNFmmU019206@blue-cove.com> At 03:06 PM 2/12/2016, you wrote: >Any feedback would be very useful, Sure, here's a little: C:\Python34\Scripts>pip install -f https://nipy.bic.berkeley.edu/scipy_installers/atlas_builds numpy Requirement already satisfied (use --upgrade to upgrade): numpy in c:\python34\lib\site-packages Cleaning up... 
From matthewb at berkeley.edu Fri Feb 12 18:45:28 2016 From: matthewb at berkeley.edu (Matthew Brett) Date: Fri, 12 Feb 2016 15:45:28 -0800 Subject: [Numpy-discussion] Fwd: Windows wheels for testing In-Reply-To: References: <201602122315.u1CNFmmU019206@blue-cove.com> Message-ID: On Fri, Feb 12, 2016 at 3:15 PM, R Schumacher wrote: > At 03:06 PM 2/12/2016, you wrote: > >> Any feedback would be very useful, > > > Sure, here's a little: > > C:\Python34\Scripts>pip install -f > https://nipy.bic.berkeley.edu/scipy_installers/atlas_builds numpy > Requirement already satisfied (use --upgrade to upgrade): numpy in > c:\python34\lib\site-packages > Cleaning up... Thanks - yes - I should maybe have said that testing in a virtual environment is your best bet. Here's me testing from Powershell on a 32-bit Windows and on Pythons 3.5 and 2.7: PS C:\tmp> c:\Python35\python -m venv np-testing PS C:\tmp> .\np-testing\Scripts\Activate.ps1 (np-testing) PS C:\tmp> pip install -f https://nipy.bic.berkeley.edu/scipy_installers/atlas_builds numpy nose Collecting numpy Downloading https://nipy.bic.berkeley.edu/scipy_installers/atlas_builds/numpy-1.10.4-cp35-none-win32.whl (6.6MB) 100% |################################| 6.6MB 34kB/s Collecting nose Using cached nose-1.3.7-py3-none-any.whl Installing collected packages: numpy, nose Successfully installed nose-1.3.7 numpy-1.10.4 You are using pip version 7.1.2, however version 8.0.2 is available. You should consider upgrading via the 'python -m pip install --upgrade pip' command. (np-testing) PS C:\tmp> python -c 'import numpy; numpy.test()' Running unit tests for numpy NumPy version 1.10.4 NumPy relaxed strides checking option: False NumPy is installed in C:\tmp\np-testing\lib\site-packages\numpy Python version 3.5.0 (v3.5.0:374f501f4567, Sep 13 2015, 02:16:59) [MSC v.1900 32 bit (Intel)] nose version 1.3.7 ......... [test output] Now on 2.7 (np-testing) PS C:\tmp> deactivate PS C:\tmp> C:\Python27\Scripts\pip.exe install virtualenv [...] output snipped PS C:\tmp> C:\Python27\Scripts\virtualenv.exe np27-testing Ignoring indexes: https://pypi.python.org/simple Collecting pip Collecting wheel Collecting setuptools Installing collected packages: pip, wheel, setuptools Successfully installed pip-7.1.2 setuptools-18.5 wheel-0.26.0 PS C:\tmp> .\np27-testing\Scripts\activate.ps1 (np27-testing) PS C:\tmp> pip install -f https://nipy.bic.berkeley.edu/scipy_installers/atlas_builds numpy nose Collecting numpy c:\tmp\np27-testing\lib\site-packages\pip\_vendor\requests\packages\urllib3\util\ssl_.py:90: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. For more information, see https://urllib3.readthedocs.org/en/latest/security.html#insecureplatf ormwarning. InsecurePlatformWarning Downloading https://nipy.bic.berkeley.edu/scipy_installers/atlas_builds/numpy-1.10.4-cp27-none-win32.whl (6.4MB) 100% |################################| 6.4MB 35kB/s Collecting nose c:\tmp\np27-testing\lib\site-packages\pip\_vendor\requests\packages\urllib3\util\ssl_.py:90: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. For more information, see https://urllib3.readthedocs.org/en/latest/security.html#insecureplatf ormwarning. 
InsecurePlatformWarning Using cached nose-1.3.7-py2-none-any.whl Installing collected packages: numpy, nose Successfully installed nose-1.3.7 numpy-1.10.4 You are using pip version 7.1.2, however version 8.0.2 is available. You should consider upgrading via the 'python -m pip install --upgrade pip' command. (np27-testing) PS C:\tmp> python -c 'import numpy; numpy.test()' Running unit tests for numpy NumPy version 1.10.4 NumPy relaxed strides checking option: False NumPy is installed in c:\tmp\np27-testing\lib\site-packages\numpy Python version 2.7.8 (default, Jun 30 2014, 16:03:49) [MSC v.1500 32 bit (Intel)] nose version 1.3.7 .............. [etc] Thanks a lot for testing, Matthew From rgutenk at email.arizona.edu Fri Feb 12 19:06:32 2016 From: rgutenk at email.arizona.edu (Gutenkunst, Ryan N - (rgutenk)) Date: Sat, 13 Feb 2016 00:06:32 +0000 Subject: [Numpy-discussion] Subclassing ma.masked_array, code broken after version 1.9 Message-ID: Hello all, In 2009 I developed an application that uses a subclass of masked arrays as a central data object. My subclass Spectrum possesses additional attributes along with many custom methods. It was very convenient to be able to use standard numpy functions for doing arithmetic on these objects. However, my code broke with numpy 1.10. I've finally had a chance to track down the problem, and I am hoping someone can suggest a workaround. See below for an example, which is as minimal as I could concoct. In this case, I have a Spectrum object that I'd like to take the logarithm of using numpy.ma.log, while preserving the value of the "folded" attribute. Up to numpy 1.9, this worked as expected, but in numpy 1.10 and 1.11 the attribute is not preserved. The change in behavior appears to be driven by a commit made on Jun 16th, 2015 by Marten van Kerkwijk. In particular, the commit changed _MaskedUnaryOperation.__call__ so that the result array's update_from method is no longer called with the input array as the argument, but rather the result of the numpy UnaryOperation (old line 889, new line 885). Because that UnaryOperation doesn't carry my new attribute, it's not present for update_from to access. I notice that similar changes were made to MaskedBinaryOperation, although I haven't tested those. It's not clear to me from the commit message why this particular change was made, so I don't know whether this new behavior is intentional. I know that subclassing arrays isn't widely encouraged, but it has been very convenient in my code. Is it still possible to subclass masked_array in such a way that functions like numpy.ma.log preserve additional attributes? If so, can someone point me in the right direction? Thanks! 
Ryan *** Begin example import numpy print 'Working with numpy {0}'.format(numpy.__version__) class Spectrum(numpy.ma.masked_array): def __new__(cls, data, mask=numpy.ma.nomask, data_folded=None): subarr = numpy.ma.masked_array(data, mask=mask, keep_mask=True, shrink=True) subarr = subarr.view(cls) subarr.folded = data_folded return subarr def __array_finalize__(self, obj): if obj is None: return numpy.ma.masked_array.__array_finalize__(self, obj) self.folded = getattr(obj, 'folded', 'unspecified') def _update_from(self, obj): print('Input to update_from: {0}'.format(repr(obj))) numpy.ma.masked_array._update_from(self, obj) self.folded = getattr(obj, 'folded', 'unspecified') def __repr__(self): return 'Spectrum(%s, folded=%s)'\ % (str(self), str(self.folded)) fs1 = Spectrum([2,3,4.], data_folded=True) fs2 = numpy.ma.log(fs1) print('fs2.folded status: {0}'.format(fs2.folded)) print('Expectation is True, achieved with numpy 1.9') *** End example -- Ryan Gutenkunst Assistant Professor Molecular and Cellular Biology University of Arizona phone: (520) 626-0569, office LSS 325 http://gutengroup.mcb.arizona.edu Latest paper: "Computationally efficient composite likelihood statistics for demographic inference" Molecular Biology and Evolution; http://dx.doi.org/10.1093/molbev/msv255 From rays at blue-cove.com Fri Feb 12 23:18:56 2016 From: rays at blue-cove.com (R Schumacher) Date: Fri, 12 Feb 2016 20:18:56 -0800 Subject: [Numpy-discussion] Fwd: Windows wheels for testing In-Reply-To: References: <201602122315.u1CNFmmU019206@blue-cove.com> Message-ID: <201602130419.u1D4J5FL030486@blue-cove.com> At 03:45 PM 2/12/2016, you wrote: >PS C:\tmp> c:\Python35\python -m venv np-testing >PS C:\tmp> .\np-testing\Scripts\Activate.ps1 >(np-testing) PS C:\tmp> pip install -f >https://nipy.bic.berkeley.edu/scipy_installers/atlas_builds numpy nose C:\Python34\Scripts>pip install "D:\Python distros\numpy-1.10.4-cp34-none-win_amd64.whl" Unpacking d:\python distros\numpy-1.10.4-cp34-none-win_amd64.whl Installing collected packages: numpy Successfully installed numpy Cleaning up... C:\Python34\Scripts>..\python Python 3.4.2 (v3.4.2:ab2c023a9432, Oct 6 2014, 22:16:31) [MSC v.1600 64 bit (AMD64)] on win32 Type "help", "copyright", "credits" or "license" for more information. >>> import numpy >>> numpy.test() Running unit tests for numpy NumPy version 1.10.4 NumPy relaxed strides checking option: False NumPy is installed in C:\Python34\lib\site-packages\numpy Python version 3.4.2 (v3.4.2:ab2c023a9432, Oct 6 2014, 22:16:31) [MSC v.1600 64 bit (AMD64)] nose version 1.3.7 .......................F....S........................................................................................... .....................................................................................................S.................. ..........................................................................................C:\Python34\lib\unittest\case. 
py:162: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future callable_obj(*args, **kwargs) ........C:\Python34\lib\unittest\case.py:162: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future callable_obj(*args, **kwargs) C:\Python34\lib\unittest\case.py:162: DeprecationWarning: using a non-integer number instead of an integer will result i n an error in the future callable_obj(*args, **kwargs) .......................................................................................S................................ ........................................................................................................................ ..........................................................................C:\Python34\lib\unittest\case.py:162: Deprecat ionWarning: using a non-integer number instead of an integer will result in an error in the future callable_obj(*args, **kwargs) ..C:\Python34\lib\unittest\case.py:162: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future callable_obj(*args, **kwargs) C:\Python34\lib\unittest\case.py:162: DeprecationWarning: using a non-integer number instead of an integer will result i n an error in the future callable_obj(*args, **kwargs) C:\Python34\lib\unittest\case.py:162: DeprecationWarning: using a non-integer number instead of an integer will result i n an error in the future callable_obj(*args, **kwargs) C:\Python34\lib\unittest\case.py:162: DeprecationWarning: using a non-integer number instead of an integer will result i n an error in the future callable_obj(*args, **kwargs) ........................................................................................................................ ........................................................................................................................ ........................................................................................................................ ........................................................................................................................ ........................................................................................................................ ........................................................................................................................ ........................................................................................................................ ........................................................................................................................ ........................................................................................................................ ........................................................................................................................ ........................................................................................................................ ...............................................K.........................................................C:\Python34\lib \site-packages\numpy\ma\core.py:989: RuntimeWarning: invalid value encountered in multiply masked_da = umath.multiply(m, da) C:\Python34\lib\site-packages\numpy\ma\core.py:989: RuntimeWarning: invalid value encountered in multiply masked_da = umath.multiply(m, da) ........................................................................................................................ 
..................C:\Python34\lib\site-packages\numpy\core\tests\test_numerictypes.py:372: DeprecationWarning: using a n on-integer number instead of an integer will result in an error in the future return self.ary['f0', 'f1'] ........................................................................................................................ .............................................................................................K.......................... ........................................................................................................................ ........................................................................................................................ ........................................................................................................................ ....K..K................................K...SK.S.......S................................................................ ..................ES..SS................................................................................................ ........................................................................................................................ ........................................................................................................................ ........................................................................................................................ ........................................................................................................................ ................................................................................S....................................... ........................................................................................................................ ........................................................................................................................ ...........................................K.................K.......................................................... ........................................................................................................................ ........................................................................................................................ ........................................................................................................................ ...........................................S............................................................................ ..........................................S............................................................C:\Python34\lib\s ite-packages\numpy\ma\core.py:4089: UserWarning: Warning: converting a masked element to nan. warnings.warn("Warning: converting a masked element to nan.") ..............................................................................................................C:\Python3 4\lib\site-packages\numpy\ma\core.py:5116: RuntimeWarning: invalid value encountered in power np.power(out, 0.5, out=out, casting='unsafe') ........................................................................................................................ ........................................................................................................................ ........................................................................................................................ ........................................................................................................................ 
........................................................................................................................ ........................................................................................................................ ........................................................................................................................ .......................... ====================================================================== ERROR: test_compile1 (test_system_info.TestSystemInfoReading) ---------------------------------------------------------------------- Traceback (most recent call last): File "C:\Python34\lib\site-packages\numpy\distutils\tests\test_system_info.py", line 182, in test_compile1 c.compile([os.path.basename(self._src1)], output_dir=self._dir1) File "C:\Python34\lib\distutils\msvc9compiler.py", line 460, in compile self.initialize() File "C:\Python34\lib\site-packages\numpy\distutils\msvccompiler.py", line 17, in initialize distutils.msvccompiler.MSVCCompiler.initialize(self, plat_name) File "C:\Python34\lib\distutils\msvc9compiler.py", line 371, in initialize vc_env = query_vcvarsall(VERSION, plat_spec) File "C:\Python34\lib\distutils\msvc9compiler.py", line 259, in query_vcvarsall raise DistutilsPlatformError("Unable to find vcvarsall.bat") distutils.errors.DistutilsPlatformError: Unable to find vcvarsall.bat ====================================================================== FAIL: test_blasdot.test_blasdot_used ---------------------------------------------------------------------- Traceback (most recent call last): File "C:\Python34\lib\site-packages\nose\case.py", line 198, in runTest self.test(*self.arg) File "C:\Python34\lib\site-packages\numpy\testing\decorators.py", line 146, in skipper_func return f(*args, **kwargs) File "C:\Python34\lib\site-packages\numpy\core\tests\test_blasdot.py", line 31, in test_blasdot_used assert_(dot is _dotblas.dot) File "C:\Python34\lib\site-packages\numpy\testing\utils.py", line 53, in assert_ raise AssertionError(smsg) AssertionError ---------------------------------------------------------------------- Ran 5575 tests in 32.042s FAILED (KNOWNFAIL=8, SKIP=12, errors=1, failures=1) >>> From matthew.brett at gmail.com Fri Feb 12 23:23:04 2016 From: matthew.brett at gmail.com (Matthew Brett) Date: Fri, 12 Feb 2016 20:23:04 -0800 Subject: [Numpy-discussion] Fwd: Windows wheels for testing In-Reply-To: <201602130419.u1D4J5FL030486@blue-cove.com> References: <201602122315.u1CNFmmU019206@blue-cove.com> <201602130419.u1D4J5FL030486@blue-cove.com> Message-ID: On Fri, Feb 12, 2016 at 8:18 PM, R Schumacher wrote: > At 03:45 PM 2/12/2016, you wrote: >> >> PS C:\tmp> c:\Python35\python -m venv np-testing >> PS C:\tmp> .\np-testing\Scripts\Activate.ps1 >> (np-testing) PS C:\tmp> pip install -f >> https://nipy.bic.berkeley.edu/scipy_installers/atlas_builds numpy nose > > > C:\Python34\Scripts>pip install "D:\Python > distros\numpy-1.10.4-cp34-none-win_amd64.whl" > Unpacking d:\python distros\numpy-1.10.4-cp34-none-win_amd64.whl > Installing collected packages: numpy > Successfully installed numpy > Cleaning up... > > C:\Python34\Scripts>..\python > Python 3.4.2 (v3.4.2:ab2c023a9432, Oct 6 2014, 22:16:31) [MSC v.1600 64 bit > (AMD64)] on win32 > Type "help", "copyright", "credits" or "license" for more information. 
>>>> import numpy
>>>> numpy.test()
> Running unit tests for numpy
> NumPy version 1.10.4
> [quoted test output snipped]
> FAILED (KNOWNFAIL=8, SKIP=12, errors=1, failures=1)
>

Great - thanks - I got the same couple of failures - I believe they
are benign...
Matthew

From gfyoung17 at gmail.com Sat Feb 13 02:16:27 2016 From: gfyoung17 at gmail.com (G Young) Date: Sat, 13 Feb 2016 07:16:27 +0000 Subject: [Numpy-discussion] Fwd: Windows wheels for testing In-Reply-To: References: <201602122315.u1CNFmmU019206@blue-cove.com> <201602130419.u1D4J5FL030486@blue-cove.com> Message-ID:

AFAIK the vcvarsall.bat error occurs when your MSVC directories aren't
properly linked in your system registry, so Python cannot find the file.
This is not a numpy-specific issue, so I certainly would agree that that
failure is not blocking.

Other than that, this build contains the mingw32.lib bug that I fixed
here, but other than that, everything else passes on relevant Python
versions for 32-bit!

On Sat, Feb 13, 2016 at 4:23 AM, Matthew Brett wrote:
> On Fri, Feb 12, 2016 at 8:18 PM, R Schumacher wrote:
> > At 03:45 PM 2/12/2016, you wrote:
> > [quoted install and test output snipped]
> > FAILED (KNOWNFAIL=8, SKIP=12, errors=1, failures=1)
> >
> Great - thanks - I got the same couple of failures - I believe they
> are benign...
>
> Matthew
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> https://mail.scipy.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ralf.gommers at gmail.com Sat Feb 13 10:03:42 2016 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sat, 13 Feb 2016 16:03:42 +0100 Subject: [Numpy-discussion] ANN: numpydoc 0.6.0 released Message-ID:

Hi all,

I'm pleased to announce the release of numpydoc 0.6.0. The main new feature
is support for the Yields section in numpy-style docstrings. This is
described in
https://github.com/numpy/numpy/blob/master/doc/HOWTO_DOCUMENT.rst.txt

Numpydoc can be installed from PyPi: https://pypi.python.org/pypi/numpydoc

Cheers,
Ralf
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
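A minimal sketch of what the new Yields support covers - an illustrative
numpy-style docstring written for this note, not an excerpt from the
numpydoc documentation:

    def pairwise_sums(a):
        """Yield sums of consecutive pairs of an array.

        Parameters
        ----------
        a : ndarray
            One-dimensional input array.

        Yields
        ------
        s : float
            Sum of two consecutive elements of `a`.
        """
        # Walk the array once, pairing each element with its successor.
        for left, right in zip(a[:-1], a[1:]):
            yield left + right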
From rays at blue-cove.com Sat Feb 13 10:42:29 2016 From: rays at blue-cove.com (R Schumacher) Date: Sat, 13 Feb 2016 07:42:29 -0800 Subject: [Numpy-discussion] Fwd: Windows wheels for testing In-Reply-To: References: <201602122315.u1CNFmmU019206@blue-cove.com> <201602130419.u1D4J5FL030486@blue-cove.com> Message-ID: <201602131542.u1DFgYGQ015201@blue-cove.com>

Have you all conferred with C Gohlke on his Windows build bot? I've never
seen a description of his recipes.
The MKL linking aside, his binaries always seem to work flawlessly.

- Ray

At 11:16 PM 2/12/2016, you wrote:
>AFAIK the vcvarsall.bat error occurs when your
>MSVC directories aren't properly linked in your
>system registry, so Python cannot find the
>file. This is not a numpy-specific issue, so I
>certainly would agree that that failure is not blocking.
>
>Other than that, this build contains the
>mingw32.lib bug that I fixed
>here,
>but other than that, everything else passes on
>relevant Python versions for 32-bit!
>
>On Sat, Feb 13, 2016 at 4:23 AM, Matthew Brett
><matthew.brett at gmail.com> wrote:
>On Fri, Feb 12, 2016 at 8:18 PM, R Schumacher
><rays at blue-cove.com> wrote:
> > At 03:45 PM 2/12/2016, you wrote:
>[quoted install and test output snipped]
>
>Great - thanks - I got the same couple of failures - I believe they
>are benign...
>
>Matthew
>_______________________________________________
>NumPy-Discussion mailing list
>NumPy-Discussion at scipy.org
>https://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
>_______________________________________________
>NumPy-Discussion mailing list
>NumPy-Discussion at scipy.org
>https://mail.scipy.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From gfyoung17 at gmail.com Sat Feb 13 11:11:13 2016 From: gfyoung17 at gmail.com (G Young) Date: Sat, 13 Feb 2016 16:11:13 +0000 Subject: [Numpy-discussion] Fwd: Windows wheels for testing In-Reply-To: <201602131542.u1DFgYGQ015201@blue-cove.com> References: <201602122315.u1CNFmmU019206@blue-cove.com> <201602130419.u1D4J5FL030486@blue-cove.com> <201602131542.u1DFgYGQ015201@blue-cove.com> Message-ID:

I've actually had test failures on occasion (i.e. when I run
"numpy.test()") with his builds but overall, they are quite good.

Speaking of MKL, for anyone who uses conda, does anyone know if it is
possible to link the "mkl" package to the numpy source? My first guess is
no since the description appears to imply that the package provides
runtime libraries and not static libraries that numpy would need, but
perhaps someone who knows better can illuminate.

On Sat, Feb 13, 2016 at 3:42 PM, R Schumacher wrote:
> Have you all conferred with C Gohlke on his Windows build bot? I've never
> seen a description of his recipes.
> The MKL linking aside, his binaries always seem to work flawlessly.
> > - Ray > > > At 11:16 PM 2/12/2016, you wrote: > > AFAIK the vcvarsall.bat error occurs when your MSVC directories aren't > properly linked in your system registry, so Python cannot find the file.? > This is not a numpy-specific issue, so I certainly would agree that that > failure is not blocking. > > Other than that, this build contains the mingw32.lib bug that I fixed? > here , but other than that, > everything else passes on relevant Python versions for 32-bit! > > On Sat, Feb 13, 2016 at 4:23 AM, Matthew Brett > wrote: > On Fri, Feb 12, 2016 at 8:18 PM, R Schumacher wrote: > > At 03:45 PM 2/12/2016, you wrote: > >> > >> PS C:\tmp> c:\Python35\python -m venv np-testing > >> PS C:\tmp> .\np-testing\Scripts\Activate.ps1 > >> (np-testing) PS C:\tmp> pip install -f > >> https://nipy.bic.berkeley.edu/scipy_installers/atlas_builds numpy nose > > > > > > C:\Python34\Scripts>pip install? "D:\Python > > distros\numpy-1.10.4-cp34-none-win_amd64.whl" > > Unpacking d:\python distros\numpy-1.10.4-cp34-none-win_amd64.whl > > Installing collected packages: numpy > > Successfully installed numpy > > Cleaning up... > > > > C:\Python34\Scripts>..\python > > Python 3.4.2 (v3.4.2:ab2c023a9432, Oct? 6 2014, 22:16:31) [MSC v.1600 > 64 bit > > (AMD64)] on win32 > > Type "help", "copyright", "credits" or "license" for more information. > >>>> import numpy > >>>> numpy.test() > > Running unit tests for numpy > > NumPy version 1.10.4 > > NumPy relaxed strides checking option: False > > NumPy is installed in C:\Python34\lib\site-packages\numpy > > Python version 3.4.2 (v3.4.2:ab2c023a9432, Oct? 6 2014, 22:16:31) [MSC > > v.1600 64 bit (AMD64)] > > nose version 1.3.7 > > > .......................F....S........................................................................................... > > > .....................................................................................................S.................. > > > ..........................................................................................C:\Python34\lib\unittest\case. > > py:162: DeprecationWarning: using a non-integer number instead of an > integer > > will result in an error in the future > >? ? callable_obj(*args, **kwargs) > > ........C:\Python34\lib\unittest\case.py:162: DeprecationWarning: using a > > non-integer number instead of an integer will > > result in an error in the future > >? ? callable_obj(*args, **kwargs) > > C:\Python34\lib\unittest\case.py:162: DeprecationWarning: using a > > non-integer number instead of an integer will result i > > n an error in the future > >? ? callable_obj(*args, **kwargs) > > > .......................................................................................S................................ > > > ........................................................................................................................ > > > ..........................................................................C:\Python34\lib\unittest\case.py:162: > > Deprecat > > ionWarning: using a non-integer number instead of an integer will result > in > > an error in the future > >? ? callable_obj(*args, **kwargs) > > ..C:\Python34\lib\unittest\case.py:162: DeprecationWarning: using a > > non-integer number instead of an integer will result > >? in an error in the future > >? ? callable_obj(*args, **kwargs) > > C:\Python34\lib\unittest\case.py:162: DeprecationWarning: using a > > non-integer number instead of an integer will result i > > n an error in the future > >? ? 
callable_obj(*args, **kwargs) > > C:\Python34\lib\unittest\case.py:162: DeprecationWarning: using a > > non-integer number instead of an integer will result i > > n an error in the future > >? ? callable_obj(*args, **kwargs) > > C:\Python34\lib\unittest\case.py:162: DeprecationWarning: using a > > non-integer number instead of an integer will result i > > n an error in the future > >? ? callable_obj(*args, **kwargs) > > > ........................................................................................................................ > > > ........................................................................................................................ > > > ........................................................................................................................ > > > ........................................................................................................................ > > > ........................................................................................................................ > > > ........................................................................................................................ > > > ........................................................................................................................ > > > ........................................................................................................................ > > > ........................................................................................................................ > > > ........................................................................................................................ > > > ........................................................................................................................ > > > ...............................................K.........................................................C:\Python34\lib > > \site-packages\numpy\ma\core.py:989: RuntimeWarning: invalid value > > encountered in multiply > >? ? masked_da = umath.multiply(m, da) > > C:\Python34\lib\site-packages\numpy\ma\core.py:989: RuntimeWarning: > invalid > > value encountered in multiply > >? ? masked_da = umath.multiply(m, da) > > > ........................................................................................................................ > > > ..................C:\Python34\lib\site-packages\numpy\core\tests\test_numerictypes.py:372: > > DeprecationWarning: using a n > > on-integer number instead of an integer will result in an error in the > > future > >? ? return self.ary['f0', 'f1'] > > > ........................................................................................................................ > > > .............................................................................................K.......................... > > > ........................................................................................................................ > > > ........................................................................................................................ > > > ........................................................................................................................ > > > ....K..K................................K...SK.S.......S................................................................ > > > ..................ES..SS................................................................................................ 
> > > ........................................................................................................................ > > > ........................................................................................................................ > > > ........................................................................................................................ > > > ........................................................................................................................ > > > ................................................................................S....................................... > > > ........................................................................................................................ > > > ........................................................................................................................ > > > ...........................................K.................K.......................................................... > > > ........................................................................................................................ > > > ........................................................................................................................ > > > ........................................................................................................................ > > > ...........................................S............................................................................ > > > ..........................................S............................................................C:\Python34\lib\s > > ite-packages\numpy\ma\core.py:4089: UserWarning: Warning: converting a > > masked element to nan. > >? ? warnings.warn("Warning: converting a masked element to nan.") > > > ..............................................................................................................C:\Python3 > > 4\lib\site-packages\numpy\ma\core.py:5116: RuntimeWarning: invalid value > > encountered in power > >? ? np.power(out, 0.5, out=out, casting='unsafe') > > > ........................................................................................................................ > > > ........................................................................................................................ > > > ........................................................................................................................ > > > ........................................................................................................................ > > > ........................................................................................................................ > > > ........................................................................................................................ > > > ........................................................................................................................ > > .......................... > > ====================================================================== > > ERROR: test_compile1 (test_system_info.TestSystemInfoReading) > > ---------------------------------------------------------------------- > > Traceback (most recent call last): > >? ? File > > > "C:\Python34\lib\site-packages\numpy\distutils\tests\test_system_info.py", > > line 182, in test_compile1 > >? ? ? c.compile([os.path.basename(self._src1)], output_dir=self._dir1) > >? ? 
File "C:\Python34\lib\distutils\msvc9compiler.py", line 460, in > compile > >? ? ? self.initialize() > >? ? File > "C:\Python34\lib\site-packages\numpy\distutils\msvccompiler.py", line > > 17, in initialize > >? ? ? distutils.msvccompiler.MSVCCompiler.initialize(self, plat_name) > >? ? File "C:\Python34\lib\distutils\msvc9compiler.py", line 371, in > initialize > >? ? ? vc_env = query_vcvarsall(VERSION, plat_spec) > >? ? File "C:\Python34\lib\distutils\msvc9compiler.py", line 259, in > > query_vcvarsall > >? ? ? raise DistutilsPlatformError("Unable to find vcvarsall.bat") > > distutils.errors.DistutilsPlatformError: Unable to find vcvarsall.bat > > > > ====================================================================== > > FAIL: test_blasdot.test_blasdot_used > > ---------------------------------------------------------------------- > > Traceback (most recent call last): > >? ? File "C:\Python34\lib\site-packages\nose\case.py", line 198, in > runTest > >? ? ? self.test(*self.arg) > >? ? File "C:\Python34\lib\site-packages\numpy\testing\decorators.py", > line > > 146, in skipper_func > >? ? ? return f(*args, **kwargs) > >? ? File > "C:\Python34\lib\site-packages\numpy\core\tests\test_blasdot.py", > > line 31, in test_blasdot_used > >? ? ? assert_(dot is _dotblas.dot) > >? ? File "C:\Python34\lib\site-packages\numpy\testing\utils.py", line > 53, in > > assert_ > >? ? ? raise AssertionError(smsg) > > AssertionError > > > > ---------------------------------------------------------------------- > > Ran 5575 tests in 32.042s > > > > FAILED (KNOWNFAIL=8, SKIP=12, errors=1, failures=1) > > > > Great - thanks - I got the same couple of failures - I believe they > are benign... > > Matthew > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sat Feb 13 11:31:42 2016 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 13 Feb 2016 09:31:42 -0700 Subject: [Numpy-discussion] Modulus (remainder) function corner cases Message-ID: Hi All, I'm curious as to what folks think about some choices in the compution of the remainder function. As an example where different choices can be made In [2]: -1e-64 % 1. Out[2]: 1.0 In [3]: float64(-1e-64) % 1. Out[3]: 0.99999999999999989 The first is Python, the second is in my branch. The first is more accurate numerically, but the modulus is of the same magnitude as the divisor. The second maintains the convention that the result must have smaller magnitude than the divisor. There are other corner cases along the same lines. So the question is, which is more desirable: maintaining numerical accuracy or enforcing mathematical convention? The differences are on the order of an ulp, but there will be a skew in the distribution of the errors if convention is maintained. 
The Fortran modulo function, which is the same basic function as in my branch, does not specify any bounds on the result for floating numbers, but gives only the formula, modulus(a, b) = a - b*floor(a/b), which has the advantage of being simple and well defined ;) Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sat Feb 13 11:42:03 2016 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 13 Feb 2016 09:42:03 -0700 Subject: [Numpy-discussion] Modulus (remainder) function corner cases In-Reply-To: References: Message-ID: On Sat, Feb 13, 2016 at 9:31 AM, Charles R Harris wrote: > Hi All, > > I'm curious as to what folks think about some choices in the compution of > the remainder function. As an example where different choices can be made > > In [2]: -1e-64 % 1. > Out[2]: 1.0 > > In [3]: float64(-1e-64) % 1. > Out[3]: 0.99999999999999989 > > The first is Python, the second is in my branch. The first is more accurate > numerically, but the modulus is of the same magnitude as the divisor. The > second maintains the convention that the result must have smaller > magnitude than the divisor. There are other corner cases along the same > lines. So the question is, which is more desirable: maintaining numerical > accuracy or enforcing mathematical convention? The differences are on the > order of an ulp, but there will be a skew in the distribution of the > errors if convention is maintained. > > The Fortran modulo function, which is the same basic function as in my > branch, does not specify any bounds on the result for floating numbers, but > gives only the formula, modulus(a, b) = a - b*floor(a/b), which has the > advantage of being simple and well defined ;) > Note that the other enforced bound is that the result have the same sign as the divisor. Python enforces that by adjusting the integer part, I enforce it by adjusting the remainder. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From allanhaldane at gmail.com Sat Feb 13 12:11:07 2016 From: allanhaldane at gmail.com (Allan Haldane) Date: Sat, 13 Feb 2016 12:11:07 -0500 Subject: [Numpy-discussion] [Suggestion] Labelled Array In-Reply-To: References: Message-ID: <56BF63AB.2080004@gmail.com> I've had a pretty similar idea for a new indexing function 'split_classes' which would help in your case, which essentially does def split_classes(c, v): return [v[c == u] for u in unique(c)] Your example could be coded as >>> [sum(c) for c in split_classes(label, data)] [9, 12, 15] I feel I've come across the need for such a function often enough that it might be generally useful to people as part of numpy. The implementation of split_classes above has pretty poor performance because it creates many temporary boolean arrays, so my plan for a PR was to have a speedy version of it that uses a single pass through v. (I often wanted to use this function on large datasets). If anyone has any comments on the idea (good idea. bad idea?) I'd love to hear. I have some further notes and examples here: https://gist.github.com/ahaldane/1e673d2fe6ffe0be4f21 Allan On 02/12/2016 09:40 AM, S?rgio wrote: > Hello, > > This is my first e-mail, I will try to make the idea simple. > > Similar to masked array it would be interesting to use a label array to > guide operations. 
> > Ex.: > >>> x > labelled_array(data = > [[0 1 2] > [3 4 5] > [6 7 8]], > label = > [[0 1 2] > [0 1 2] > [0 1 2]]) > > >>> sum(x) > array([9, 12, 15]) > > The operations would create a new axis for label indexing. > > You could think of it as a collection of masks, one for each label. > > I don't know a way to make something like this efficiently without a > loop. Just wondering... > > S?rgio. > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > From jjhelmus at gmail.com Sat Feb 13 12:55:04 2016 From: jjhelmus at gmail.com (Jonathan Helmus) Date: Sat, 13 Feb 2016 11:55:04 -0600 Subject: [Numpy-discussion] Fwd: Windows wheels for testing In-Reply-To: References: <201602122315.u1CNFmmU019206@blue-cove.com> <201602130419.u1D4J5FL030486@blue-cove.com> Message-ID: <56BF6DF8.30304@gmail.com> On 2/12/16 10:23 PM, Matthew Brett wrote: > On Fri, Feb 12, 2016 at 8:18 PM, R Schumacher wrote: >> At 03:45 PM 2/12/2016, you wrote: >>> PS C:\tmp> c:\Python35\python -m venv np-testing >>> PS C:\tmp> .\np-testing\Scripts\Activate.ps1 >>> (np-testing) PS C:\tmp> pip install -f >>> https://nipy.bic.berkeley.edu/scipy_installers/atlas_builds numpy nose >> >> C:\Python34\Scripts>pip install "D:\Python >> distros\numpy-1.10.4-cp34-none-win_amd64.whl" >> Unpacking d:\python distros\numpy-1.10.4-cp34-none-win_amd64.whl >> Installing collected packages: numpy >> Successfully installed numpy >> Cleaning up... >> >> C:\Python34\Scripts>..\python >> Python 3.4.2 (v3.4.2:ab2c023a9432, Oct 6 2014, 22:16:31) [MSC v.1600 64 bit >> (AMD64)] on win32 >> Type "help", "copyright", "credits" or "license" for more information. >>>>> import numpy >>>>> numpy.test() >> Running unit tests for numpy >> NumPy version 1.10.4 >> NumPy relaxed strides checking option: False >> NumPy is installed in C:\Python34\lib\site-packages\numpy >> Python version 3.4.2 (v3.4.2:ab2c023a9432, Oct 6 2014, 22:16:31) [MSC >> v.1600 64 bit (AMD64)] >> nose version 1.3.7 >> .......................F....S........................................................................................... >> .....................................................................................................S.................. >> ..........................................................................................C:\Python34\lib\unittest\case. >> py:162: DeprecationWarning: using a non-integer number instead of an integer >> will result in an error in the future >> callable_obj(*args, **kwargs) >> ........C:\Python34\lib\unittest\case.py:162: DeprecationWarning: using a >> non-integer number instead of an integer will >> result in an error in the future >> callable_obj(*args, **kwargs) >> C:\Python34\lib\unittest\case.py:162: DeprecationWarning: using a >> non-integer number instead of an integer will result i >> n an error in the future >> callable_obj(*args, **kwargs) >> .......................................................................................S................................ >> ........................................................................................................................ 
>> [several hundred lines of quoted test-progress dots, DeprecationWarnings and RuntimeWarnings elided -- this is the same numpy.test() log already quoted in full earlier in the thread; the errors and failures follow below]
>> ====================================================================== >> ERROR: test_compile1 (test_system_info.TestSystemInfoReading) >> ---------------------------------------------------------------------- >> Traceback (most recent call last): >> File >> "C:\Python34\lib\site-packages\numpy\distutils\tests\test_system_info.py", >> line 182, in test_compile1 >> c.compile([os.path.basename(self._src1)], output_dir=self._dir1) >> File "C:\Python34\lib\distutils\msvc9compiler.py", line 460, in compile >> self.initialize() >> File "C:\Python34\lib\site-packages\numpy\distutils\msvccompiler.py", line >> 17, in initialize >> distutils.msvccompiler.MSVCCompiler.initialize(self, plat_name) >> File "C:\Python34\lib\distutils\msvc9compiler.py", line 371, in initialize >> vc_env = query_vcvarsall(VERSION, plat_spec) >> File "C:\Python34\lib\distutils\msvc9compiler.py", line 259, in >> query_vcvarsall >> raise DistutilsPlatformError("Unable to find vcvarsall.bat") >> distutils.errors.DistutilsPlatformError: Unable to find vcvarsall.bat >> >> ====================================================================== >> FAIL: test_blasdot.test_blasdot_used >> ---------------------------------------------------------------------- >> Traceback (most recent call last): >> File "C:\Python34\lib\site-packages\nose\case.py", line 198, in runTest >> self.test(*self.arg) >> File "C:\Python34\lib\site-packages\numpy\testing\decorators.py", line >> 146, in skipper_func >> return f(*args, **kwargs) >> File "C:\Python34\lib\site-packages\numpy\core\tests\test_blasdot.py", >> line 31, in test_blasdot_used >> assert_(dot is _dotblas.dot) >> File "C:\Python34\lib\site-packages\numpy\testing\utils.py", line 53, in >> assert_ >> raise AssertionError(smsg) >> AssertionError >> >> ---------------------------------------------------------------------- >> Ran 5575 tests in 32.042s >> >> FAILED (KNOWNFAIL=8, SKIP=12, errors=1, failures=1) >> > Great - thanks - I got the same couple of failures - I believe they > are benign... > > Matthew Matthew, The wheels seem to work fine in the Python provided by Continuum on 32-bit Windows. Tested in Python 2.7, 3.3 and 3.4. The only test errors/failures was the the vcvarsall.bat error on all three versions. Full tests logs at https://gist.github.com/jjhelmus/de2b34779e83eb37a70f. Cheers, - Jonathan Helmus From allanhaldane at gmail.com Sat Feb 13 13:01:53 2016 From: allanhaldane at gmail.com (Allan Haldane) Date: Sat, 13 Feb 2016 13:01:53 -0500 Subject: [Numpy-discussion] [Suggestion] Labelled Array In-Reply-To: <56BF63AB.2080004@gmail.com> References: <56BF63AB.2080004@gmail.com> Message-ID: <56BF6F91.5020606@gmail.com> Sorry, to reply to myself here, but looking at it with fresh eyes maybe the performance of the naive version isn't too bad. Here's a comparison of the naive vs a better implementation: def split_classes_naive(c, v): return [v[c == u] for u in unique(c)] def split_classes(c, v): perm = c.argsort() csrt = c[perm] div = where(csrt[1:] != csrt[:-1])[0] + 1 return [v[x] for x in split(perm, div)] >>> c = randint(0,32,size=100000) >>> v = arange(100000) >>> %timeit split_classes_naive(c,v) 100 loops, best of 3: 8.4 ms per loop >>> %timeit split_classes(c,v) 100 loops, best of 3: 4.79 ms per loop In any case, maybe it is useful to Sergio or others. 
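Along the same lines, when the goal is a grouped reduction rather than the groups themselves, the sort can be combined with np.add.reduceat so that only the per-label results are materialized. group_sums below is a hypothetical helper, shown on the flattened labels from the original example, not a tuned implementation:

import numpy as np

def group_sums(c, v):
    # sort once, then reduce each contiguous run of equal labels
    perm = c.argsort(kind='mergesort')
    csrt = c[perm]
    starts = np.r_[0, np.where(csrt[1:] != csrt[:-1])[0] + 1]
    return csrt[starts], np.add.reduceat(v[perm], starts)

label = np.array([0, 1, 2, 0, 1, 2, 0, 1, 2])
data = np.arange(9)
print(group_sums(label, data))
# (array([0, 1, 2]), array([ 9, 12, 15]))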
Allan On 02/13/2016 12:11 PM, Allan Haldane wrote: > I've had a pretty similar idea for a new indexing function > 'split_classes' which would help in your case, which essentially does > > def split_classes(c, v): > return [v[c == u] for u in unique(c)] > > Your example could be coded as > > >>> [sum(c) for c in split_classes(label, data)] > [9, 12, 15] > > I feel I've come across the need for such a function often enough that > it might be generally useful to people as part of numpy. The > implementation of split_classes above has pretty poor performance > because it creates many temporary boolean arrays, so my plan for a PR > was to have a speedy version of it that uses a single pass through v. > (I often wanted to use this function on large datasets). > > If anyone has any comments on the idea (good idea. bad idea?) I'd love > to hear. > > I have some further notes and examples here: > https://gist.github.com/ahaldane/1e673d2fe6ffe0be4f21 > > Allan > > On 02/12/2016 09:40 AM, S?rgio wrote: >> Hello, >> >> This is my first e-mail, I will try to make the idea simple. >> >> Similar to masked array it would be interesting to use a label array to >> guide operations. >> >> Ex.: >> >>> x >> labelled_array(data = >> [[0 1 2] >> [3 4 5] >> [6 7 8]], >> label = >> [[0 1 2] >> [0 1 2] >> [0 1 2]]) >> >> >>> sum(x) >> array([9, 12, 15]) >> >> The operations would create a new axis for label indexing. >> >> You could think of it as a collection of masks, one for each label. >> >> I don't know a way to make something like this efficiently without a >> loop. Just wondering... >> >> S?rgio. >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> > From njs at pobox.com Sat Feb 13 13:16:19 2016 From: njs at pobox.com (Nathaniel Smith) Date: Sat, 13 Feb 2016 10:16:19 -0800 Subject: [Numpy-discussion] [Suggestion] Labelled Array In-Reply-To: References: Message-ID: I believe this is basically a groupby, which is one of pandas's core competencies... even if numpy were to add some utilities for this kind of thing, then I doubt we'd do as well as them, so you might check whether pandas works for you first :-) On Feb 12, 2016 6:40 AM, "S?rgio" wrote: > Hello, > > This is my first e-mail, I will try to make the idea simple. > > Similar to masked array it would be interesting to use a label array to > guide operations. > > Ex.: > >>> x > labelled_array(data = > [[0 1 2] > [3 4 5] > [6 7 8]], > label = > [[0 1 2] > [0 1 2] > [0 1 2]]) > > >>> sum(x) > array([9, 12, 15]) > > The operations would create a new axis for label indexing. > > You could think of it as a collection of masks, one for each label. > > I don't know a way to make something like this efficiently without a loop. > Just wondering... > > S?rgio. > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From josef.pktd at gmail.com Sat Feb 13 13:29:44 2016 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 13 Feb 2016 13:29:44 -0500 Subject: [Numpy-discussion] [Suggestion] Labelled Array In-Reply-To: <56BF6F91.5020606@gmail.com> References: <56BF63AB.2080004@gmail.com> <56BF6F91.5020606@gmail.com> Message-ID: On Sat, Feb 13, 2016 at 1:01 PM, Allan Haldane wrote: > Sorry, to reply to myself here, but looking at it with fresh eyes maybe > the performance of the naive version isn't too bad. Here's a comparison of > the naive vs a better implementation: > > def split_classes_naive(c, v): > return [v[c == u] for u in unique(c)] > > def split_classes(c, v): > perm = c.argsort() > csrt = c[perm] > div = where(csrt[1:] != csrt[:-1])[0] + 1 > return [v[x] for x in split(perm, div)] > > >>> c = randint(0,32,size=100000) > >>> v = arange(100000) > >>> %timeit split_classes_naive(c,v) > 100 loops, best of 3: 8.4 ms per loop > >>> %timeit split_classes(c,v) > 100 loops, best of 3: 4.79 ms per loop > The usecases I recently started to target for similar things is 1 Million or more rows and 10000 uniques in the labels. The second version should be faster for large number of uniques, I guess. Overall numpy is falling far behind pandas in terms of simple groupby operations. bincount and histogram (IIRC) worked for some cases but are rather limited. reduce_at looks nice for cases where it applies. In contrast to the full sized labels in the original post, I only know of applications where the labels are 1-D corresponding to rows or columns. Josef > > In any case, maybe it is useful to Sergio or others. > > Allan > > > On 02/13/2016 12:11 PM, Allan Haldane wrote: > >> I've had a pretty similar idea for a new indexing function >> 'split_classes' which would help in your case, which essentially does >> >> def split_classes(c, v): >> return [v[c == u] for u in unique(c)] >> >> Your example could be coded as >> >> >>> [sum(c) for c in split_classes(label, data)] >> [9, 12, 15] >> >> I feel I've come across the need for such a function often enough that >> it might be generally useful to people as part of numpy. The >> implementation of split_classes above has pretty poor performance >> because it creates many temporary boolean arrays, so my plan for a PR >> was to have a speedy version of it that uses a single pass through v. >> (I often wanted to use this function on large datasets). >> >> If anyone has any comments on the idea (good idea. bad idea?) I'd love >> to hear. >> >> I have some further notes and examples here: >> https://gist.github.com/ahaldane/1e673d2fe6ffe0be4f21 >> >> Allan >> >> On 02/12/2016 09:40 AM, S?rgio wrote: >> >>> Hello, >>> >>> This is my first e-mail, I will try to make the idea simple. >>> >>> Similar to masked array it would be interesting to use a label array to >>> guide operations. >>> >>> Ex.: >>> >>> x >>> labelled_array(data = >>> [[0 1 2] >>> [3 4 5] >>> [6 7 8]], >>> label = >>> [[0 1 2] >>> [0 1 2] >>> [0 1 2]]) >>> >>> >>> sum(x) >>> array([9, 12, 15]) >>> >>> The operations would create a new axis for label indexing. >>> >>> You could think of it as a collection of masks, one for each label. >>> >>> I don't know a way to make something like this efficiently without a >>> loop. Just wondering... >>> >>> S?rgio. 
>>> >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> https://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >>> >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jeffreback at gmail.com Sat Feb 13 13:39:34 2016 From: jeffreback at gmail.com (Jeff Reback) Date: Sat, 13 Feb 2016 13:39:34 -0500 Subject: [Numpy-discussion] [Suggestion] Labelled Array In-Reply-To: References: <56BF63AB.2080004@gmail.com> <56BF6F91.5020606@gmail.com> Message-ID: In [10]: pd.options.display.max_rows=10 In [13]: np.random.seed(1234) In [14]: c = np.random.randint(0,32,size=100000) In [15]: v = np.arange(100000) In [16]: df = DataFrame({'v' : v, 'c' : c}) In [17]: df Out[17]: c v 0 15 0 1 19 1 2 6 2 3 21 3 4 12 4 ... .. ... 99995 7 99995 99996 2 99996 99997 27 99997 99998 28 99998 99999 7 99999 [100000 rows x 2 columns] In [19]: df.groupby('c').count() Out[19]: v c 0 3136 1 3229 2 3093 3 3121 4 3041 .. ... 27 3128 28 3063 29 3147 30 3073 31 3090 [32 rows x 1 columns] In [20]: %timeit df.groupby('c').count() 100 loops, best of 3: 2 ms per loop In [21]: %timeit df.groupby('c').mean() 100 loops, best of 3: 2.39 ms per loop In [22]: df.groupby('c').mean() Out[22]: v c 0 49883.384885 1 50233.692165 2 48634.116069 3 50811.743992 4 50505.368629 .. ... 27 49715.349425 28 50363.501469 29 50485.395933 30 50190.155223 31 50691.041748 [32 rows x 1 columns] On Sat, Feb 13, 2016 at 1:29 PM, wrote: > > > On Sat, Feb 13, 2016 at 1:01 PM, Allan Haldane > wrote: > >> Sorry, to reply to myself here, but looking at it with fresh eyes maybe >> the performance of the naive version isn't too bad. Here's a comparison of >> the naive vs a better implementation: >> >> def split_classes_naive(c, v): >> return [v[c == u] for u in unique(c)] >> >> def split_classes(c, v): >> perm = c.argsort() >> csrt = c[perm] >> div = where(csrt[1:] != csrt[:-1])[0] + 1 >> return [v[x] for x in split(perm, div)] >> >> >>> c = randint(0,32,size=100000) >> >>> v = arange(100000) >> >>> %timeit split_classes_naive(c,v) >> 100 loops, best of 3: 8.4 ms per loop >> >>> %timeit split_classes(c,v) >> 100 loops, best of 3: 4.79 ms per loop >> > > The usecases I recently started to target for similar things is 1 Million > or more rows and 10000 uniques in the labels. > The second version should be faster for large number of uniques, I guess. > > Overall numpy is falling far behind pandas in terms of simple groupby > operations. bincount and histogram (IIRC) worked for some cases but are > rather limited. > > reduce_at looks nice for cases where it applies. > > In contrast to the full sized labels in the original post, I only know of > applications where the labels are 1-D corresponding to rows or columns. > > Josef > > > >> >> In any case, maybe it is useful to Sergio or others. 
>> >> Allan >> >> >> On 02/13/2016 12:11 PM, Allan Haldane wrote: >> >>> I've had a pretty similar idea for a new indexing function >>> 'split_classes' which would help in your case, which essentially does >>> >>> def split_classes(c, v): >>> return [v[c == u] for u in unique(c)] >>> >>> Your example could be coded as >>> >>> >>> [sum(c) for c in split_classes(label, data)] >>> [9, 12, 15] >>> >>> I feel I've come across the need for such a function often enough that >>> it might be generally useful to people as part of numpy. The >>> implementation of split_classes above has pretty poor performance >>> because it creates many temporary boolean arrays, so my plan for a PR >>> was to have a speedy version of it that uses a single pass through v. >>> (I often wanted to use this function on large datasets). >>> >>> If anyone has any comments on the idea (good idea. bad idea?) I'd love >>> to hear. >>> >>> I have some further notes and examples here: >>> https://gist.github.com/ahaldane/1e673d2fe6ffe0be4f21 >>> >>> Allan >>> >>> On 02/12/2016 09:40 AM, S?rgio wrote: >>> >>>> Hello, >>>> >>>> This is my first e-mail, I will try to make the idea simple. >>>> >>>> Similar to masked array it would be interesting to use a label array to >>>> guide operations. >>>> >>>> Ex.: >>>> >>> x >>>> labelled_array(data = >>>> [[0 1 2] >>>> [3 4 5] >>>> [6 7 8]], >>>> label = >>>> [[0 1 2] >>>> [0 1 2] >>>> [0 1 2]]) >>>> >>>> >>> sum(x) >>>> array([9, 12, 15]) >>>> >>>> The operations would create a new axis for label indexing. >>>> >>>> You could think of it as a collection of masks, one for each label. >>>> >>>> I don't know a way to make something like this efficiently without a >>>> loop. Just wondering... >>>> >>>> S?rgio. >>>> >>>> >>>> _______________________________________________ >>>> NumPy-Discussion mailing list >>>> NumPy-Discussion at scipy.org >>>> https://mail.scipy.org/mailman/listinfo/numpy-discussion >>>> >>>> >>> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jeffreback at gmail.com Sat Feb 13 13:42:20 2016 From: jeffreback at gmail.com (Jeff Reback) Date: Sat, 13 Feb 2016 13:42:20 -0500 Subject: [Numpy-discussion] [Suggestion] Labelled Array In-Reply-To: References: <56BF63AB.2080004@gmail.com> <56BF6F91.5020606@gmail.com> Message-ID: These operations get slower as the number of groups increase, but with a faster function (e.g. the standard ones which are cythonized), the constant on the increase is pretty low. In [23]: c = np.random.randint(0,10000,size=100000) In [24]: df = DataFrame({'v' : v, 'c' : c}) In [25]: %timeit df.groupby('c').count() 100 loops, best of 3: 3.18 ms per loop In [26]: len(df.groupby('c').count()) Out[26]: 10000 In [27]: df.groupby('c').count() Out[27]: v c 0 9 1 11 2 7 3 8 4 16 ... .. 
9995 11 9996 13 9997 13 9998 7 9999 10 [10000 rows x 1 columns] On Sat, Feb 13, 2016 at 1:39 PM, Jeff Reback wrote: > In [10]: pd.options.display.max_rows=10 > > In [13]: np.random.seed(1234) > > In [14]: c = np.random.randint(0,32,size=100000) > > In [15]: v = np.arange(100000) > > In [16]: df = DataFrame({'v' : v, 'c' : c}) > > In [17]: df > Out[17]: > c v > 0 15 0 > 1 19 1 > 2 6 2 > 3 21 3 > 4 12 4 > ... .. ... > 99995 7 99995 > 99996 2 99996 > 99997 27 99997 > 99998 28 99998 > 99999 7 99999 > > [100000 rows x 2 columns] > > In [19]: df.groupby('c').count() > Out[19]: > v > c > 0 3136 > 1 3229 > 2 3093 > 3 3121 > 4 3041 > .. ... > 27 3128 > 28 3063 > 29 3147 > 30 3073 > 31 3090 > > [32 rows x 1 columns] > > In [20]: %timeit df.groupby('c').count() > 100 loops, best of 3: 2 ms per loop > > In [21]: %timeit df.groupby('c').mean() > 100 loops, best of 3: 2.39 ms per loop > > In [22]: df.groupby('c').mean() > Out[22]: > v > c > 0 49883.384885 > 1 50233.692165 > 2 48634.116069 > 3 50811.743992 > 4 50505.368629 > .. ... > 27 49715.349425 > 28 50363.501469 > 29 50485.395933 > 30 50190.155223 > 31 50691.041748 > > [32 rows x 1 columns] > > > On Sat, Feb 13, 2016 at 1:29 PM, wrote: > >> >> >> On Sat, Feb 13, 2016 at 1:01 PM, Allan Haldane >> wrote: >> >>> Sorry, to reply to myself here, but looking at it with fresh eyes maybe >>> the performance of the naive version isn't too bad. Here's a comparison of >>> the naive vs a better implementation: >>> >>> def split_classes_naive(c, v): >>> return [v[c == u] for u in unique(c)] >>> >>> def split_classes(c, v): >>> perm = c.argsort() >>> csrt = c[perm] >>> div = where(csrt[1:] != csrt[:-1])[0] + 1 >>> return [v[x] for x in split(perm, div)] >>> >>> >>> c = randint(0,32,size=100000) >>> >>> v = arange(100000) >>> >>> %timeit split_classes_naive(c,v) >>> 100 loops, best of 3: 8.4 ms per loop >>> >>> %timeit split_classes(c,v) >>> 100 loops, best of 3: 4.79 ms per loop >>> >> >> The usecases I recently started to target for similar things is 1 Million >> or more rows and 10000 uniques in the labels. >> The second version should be faster for large number of uniques, I guess. >> >> Overall numpy is falling far behind pandas in terms of simple groupby >> operations. bincount and histogram (IIRC) worked for some cases but are >> rather limited. >> >> reduce_at looks nice for cases where it applies. >> >> In contrast to the full sized labels in the original post, I only know of >> applications where the labels are 1-D corresponding to rows or columns. >> >> Josef >> >> >> >>> >>> In any case, maybe it is useful to Sergio or others. >>> >>> Allan >>> >>> >>> On 02/13/2016 12:11 PM, Allan Haldane wrote: >>> >>>> I've had a pretty similar idea for a new indexing function >>>> 'split_classes' which would help in your case, which essentially does >>>> >>>> def split_classes(c, v): >>>> return [v[c == u] for u in unique(c)] >>>> >>>> Your example could be coded as >>>> >>>> >>> [sum(c) for c in split_classes(label, data)] >>>> [9, 12, 15] >>>> >>>> I feel I've come across the need for such a function often enough that >>>> it might be generally useful to people as part of numpy. The >>>> implementation of split_classes above has pretty poor performance >>>> because it creates many temporary boolean arrays, so my plan for a PR >>>> was to have a speedy version of it that uses a single pass through v. >>>> (I often wanted to use this function on large datasets). >>>> >>>> If anyone has any comments on the idea (good idea. bad idea?) I'd love >>>> to hear. 
>>>> >>>> I have some further notes and examples here: >>>> https://gist.github.com/ahaldane/1e673d2fe6ffe0be4f21 >>>> >>>> Allan >>>> >>>> On 02/12/2016 09:40 AM, S?rgio wrote: >>>> >>>>> Hello, >>>>> >>>>> This is my first e-mail, I will try to make the idea simple. >>>>> >>>>> Similar to masked array it would be interesting to use a label array to >>>>> guide operations. >>>>> >>>>> Ex.: >>>>> >>> x >>>>> labelled_array(data = >>>>> [[0 1 2] >>>>> [3 4 5] >>>>> [6 7 8]], >>>>> label = >>>>> [[0 1 2] >>>>> [0 1 2] >>>>> [0 1 2]]) >>>>> >>>>> >>> sum(x) >>>>> array([9, 12, 15]) >>>>> >>>>> The operations would create a new axis for label indexing. >>>>> >>>>> You could think of it as a collection of masks, one for each label. >>>>> >>>>> I don't know a way to make something like this efficiently without a >>>>> loop. Just wondering... >>>>> >>>>> S?rgio. >>>>> >>>>> >>>>> _______________________________________________ >>>>> NumPy-Discussion mailing list >>>>> NumPy-Discussion at scipy.org >>>>> https://mail.scipy.org/mailman/listinfo/numpy-discussion >>>>> >>>>> >>>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> https://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jjhelmus at gmail.com Sat Feb 13 13:48:57 2016 From: jjhelmus at gmail.com (Jonathan Helmus) Date: Sat, 13 Feb 2016 12:48:57 -0600 Subject: [Numpy-discussion] Subclassing ma.masked_array, code broken after version 1.9 In-Reply-To: References: Message-ID: <56BF7A99.5040507@gmail.com> On 2/12/16 6:06 PM, Gutenkunst, Ryan N - (rgutenk) wrote: > Hello all, > > In 2009 I developed an application that uses a subclass of masked arrays as a central data object. My subclass Spectrum possesses additional attributes along with many custom methods. It was very convenient to be able to use standard numpy functions for doing arithmetic on these objects. However, my code broke with numpy 1.10. I've finally had a chance to track down the problem, and I am hoping someone can suggest a workaround. > > See below for an example, which is as minimal as I could concoct. In this case, I have a Spectrum object that I'd like to take the logarithm of using numpy.ma.log, while preserving the value of the "folded" attribute. Up to numpy 1.9, this worked as expected, but in numpy 1.10 and 1.11 the attribute is not preserved. > > The change in behavior appears to be driven by a commit made on Jun 16th, 2015 by Marten van Kerkwijk. In particular, the commit changed _MaskedUnaryOperation.__call__ so that the result array's update_from method is no longer called with the input array as the argument, but rather the result of the numpy UnaryOperation (old line 889, new line 885). Because that UnaryOperation doesn't carry my new attribute, it's not present for update_from to access. I notice that similar changes were made to MaskedBinaryOperation, although I haven't tested those. It's not clear to me from the commit message why this particular change was made, so I don't know whether this new behavior is intentional. > > I know that subclassing arrays isn't widely encouraged, but it has been very convenient in my code. 
Is it still possible to subclass masked_array in such a way that functions like numpy.ma.log preserve additional attributes? If so, can someone point me in the right direction? > > Thanks! > Ryan > > *** Begin example > > import numpy > print 'Working with numpy {0}'.format(numpy.__version__) > > class Spectrum(numpy.ma.masked_array): > def __new__(cls, data, mask=numpy.ma.nomask, data_folded=None): > subarr = numpy.ma.masked_array(data, mask=mask, keep_mask=True, > shrink=True) > subarr = subarr.view(cls) > subarr.folded = data_folded > > return subarr > > def __array_finalize__(self, obj): > if obj is None: > return > numpy.ma.masked_array.__array_finalize__(self, obj) > self.folded = getattr(obj, 'folded', 'unspecified') > > def _update_from(self, obj): > print('Input to update_from: {0}'.format(repr(obj))) > numpy.ma.masked_array._update_from(self, obj) > self.folded = getattr(obj, 'folded', 'unspecified') > > def __repr__(self): > return 'Spectrum(%s, folded=%s)'\ > % (str(self), str(self.folded)) > > fs1 = Spectrum([2,3,4.], data_folded=True) > fs2 = numpy.ma.log(fs1) > print('fs2.folded status: {0}'.format(fs2.folded)) > print('Expectation is True, achieved with numpy 1.9') > > *** End example > > -- > Ryan Gutenkunst > Assistant Professor > Molecular and Cellular Biology > University of Arizona > phone: (520) 626-0569, office LSS 325 > http://gutengroup.mcb.arizona.edu > Latest paper: "Computationally efficient composite likelihood statistics for demographic inference" > Molecular Biology and Evolution; http://dx.doi.org/10.1093/molbev/msv255 Ryan, I'm not sure if you will be able to get this to work as in NumPy 1.9, but the __array_wrap__ method is intended to be the mechanism for subclasses to set their return type, adjust metadata, etc [1]. Unfortunately, the numpy.ma.log function does not seem to make a call to __array_wrap__ (at least in NumPy 1.10.2) although numpy.log does: from __future__ import print_function import numpy print('Working with numpy {0}'.format(numpy.__version__)) class Spectrum(numpy.ma.masked_array): def __new__(cls, data, mask=numpy.ma.nomask, data_folded=None): subarr = numpy.ma.masked_array(data, mask=mask, keep_mask=True, shrink=True) subarr = subarr.view(cls) subarr.folded = data_folded return subarr def __array_finalize__(self, obj): if obj is None: return numpy.ma.masked_array.__array_finalize__(self, obj) self.folded = getattr(obj, 'folded', 'unspecified') def __array_wrap__(self, out_arr, context=None): print('__array_wrap__ called') return numpy.ndarray.__array_wrap__(self, out_arr, context) def __repr__(self): return 'Spectrum(%s, folded=%s)'\ % (str(self), str(self.folded)) fs1 = Spectrum([2,3,4.], data_folded=True) print('numpy.ma.log:') fs2 = numpy.ma.log(fs1) print('fs2 type:', type(fs2)) print('fs2.folded status: {0}'.format(fs2.folded)) print('numpy.log:') fs3 = numpy.log(fs1) print('fs3 type:', type(fs3)) print('fs3.folded status: {0}'.format(fs3.folded)) ---- $ python example.py Working with numpy 1.10.2 numpy.ma.log: fs2 type: fs2.folded status: unspecified numpy.log: __array_wrap__ called fs3 type: fs3.folded status: True The change mentioned in the original message was made in pull request 3907 [2] in case anyone wants to have a look. 
Cheers, - Jonathan Helmus [1] http://docs.scipy.org/doc/numpy-1.10.1/user/basics.subclassing.html#array-wrap-for-ufuncs [2] https://github.com/numpy/numpy/pull/3907 From josef.pktd at gmail.com Sat Feb 13 13:51:44 2016 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 13 Feb 2016 13:51:44 -0500 Subject: [Numpy-discussion] [Suggestion] Labelled Array In-Reply-To: References: <56BF63AB.2080004@gmail.com> <56BF6F91.5020606@gmail.com> Message-ID: On Sat, Feb 13, 2016 at 1:42 PM, Jeff Reback wrote: > These operations get slower as the number of groups increase, but with a > faster function (e.g. the standard ones which are cythonized), the > constant on > the increase is pretty low. > > In [23]: c = np.random.randint(0,10000,size=100000) > > In [24]: df = DataFrame({'v' : v, 'c' : c}) > > In [25]: %timeit df.groupby('c').count() > 100 loops, best of 3: 3.18 ms per loop > > In [26]: len(df.groupby('c').count()) > Out[26]: 10000 > > In [27]: df.groupby('c').count() > Out[27]: > v > c > 0 9 > 1 11 > 2 7 > 3 8 > 4 16 > ... .. > 9995 11 > 9996 13 > 9997 13 > 9998 7 > 9999 10 > > [10000 rows x 1 columns] > > One other difference across usecases is whether this is a single operation, or we want to optimize the data format for a large number of different calculations. (We have both cases in statsmodels.) In the latter case it's worth spending some extra computational effort on rearranging the data to be either sorted or in lists of arrays, (I guess without having done any timings). Josef > > On Sat, Feb 13, 2016 at 1:39 PM, Jeff Reback wrote: > >> In [10]: pd.options.display.max_rows=10 >> >> In [13]: np.random.seed(1234) >> >> In [14]: c = np.random.randint(0,32,size=100000) >> >> In [15]: v = np.arange(100000) >> >> In [16]: df = DataFrame({'v' : v, 'c' : c}) >> >> In [17]: df >> Out[17]: >> c v >> 0 15 0 >> 1 19 1 >> 2 6 2 >> 3 21 3 >> 4 12 4 >> ... .. ... >> 99995 7 99995 >> 99996 2 99996 >> 99997 27 99997 >> 99998 28 99998 >> 99999 7 99999 >> >> [100000 rows x 2 columns] >> >> In [19]: df.groupby('c').count() >> Out[19]: >> v >> c >> 0 3136 >> 1 3229 >> 2 3093 >> 3 3121 >> 4 3041 >> .. ... >> 27 3128 >> 28 3063 >> 29 3147 >> 30 3073 >> 31 3090 >> >> [32 rows x 1 columns] >> >> In [20]: %timeit df.groupby('c').count() >> 100 loops, best of 3: 2 ms per loop >> >> In [21]: %timeit df.groupby('c').mean() >> 100 loops, best of 3: 2.39 ms per loop >> >> In [22]: df.groupby('c').mean() >> Out[22]: >> v >> c >> 0 49883.384885 >> 1 50233.692165 >> 2 48634.116069 >> 3 50811.743992 >> 4 50505.368629 >> .. ... >> 27 49715.349425 >> 28 50363.501469 >> 29 50485.395933 >> 30 50190.155223 >> 31 50691.041748 >> >> [32 rows x 1 columns] >> >> >> On Sat, Feb 13, 2016 at 1:29 PM, wrote: >> >>> >>> >>> On Sat, Feb 13, 2016 at 1:01 PM, Allan Haldane >>> wrote: >>> >>>> Sorry, to reply to myself here, but looking at it with fresh eyes maybe >>>> the performance of the naive version isn't too bad. 
Here's a comparison of >>>> the naive vs a better implementation: >>>> >>>> def split_classes_naive(c, v): >>>> return [v[c == u] for u in unique(c)] >>>> >>>> def split_classes(c, v): >>>> perm = c.argsort() >>>> csrt = c[perm] >>>> div = where(csrt[1:] != csrt[:-1])[0] + 1 >>>> return [v[x] for x in split(perm, div)] >>>> >>>> >>> c = randint(0,32,size=100000) >>>> >>> v = arange(100000) >>>> >>> %timeit split_classes_naive(c,v) >>>> 100 loops, best of 3: 8.4 ms per loop >>>> >>> %timeit split_classes(c,v) >>>> 100 loops, best of 3: 4.79 ms per loop >>>> >>> >>> The usecases I recently started to target for similar things is 1 >>> Million or more rows and 10000 uniques in the labels. >>> The second version should be faster for large number of uniques, I guess. >>> >>> Overall numpy is falling far behind pandas in terms of simple groupby >>> operations. bincount and histogram (IIRC) worked for some cases but are >>> rather limited. >>> >>> reduce_at looks nice for cases where it applies. >>> >>> In contrast to the full sized labels in the original post, I only know >>> of applications where the labels are 1-D corresponding to rows or columns. >>> >>> Josef >>> >>> >>> >>>> >>>> In any case, maybe it is useful to Sergio or others. >>>> >>>> Allan >>>> >>>> >>>> On 02/13/2016 12:11 PM, Allan Haldane wrote: >>>> >>>>> I've had a pretty similar idea for a new indexing function >>>>> 'split_classes' which would help in your case, which essentially does >>>>> >>>>> def split_classes(c, v): >>>>> return [v[c == u] for u in unique(c)] >>>>> >>>>> Your example could be coded as >>>>> >>>>> >>> [sum(c) for c in split_classes(label, data)] >>>>> [9, 12, 15] >>>>> >>>>> I feel I've come across the need for such a function often enough that >>>>> it might be generally useful to people as part of numpy. The >>>>> implementation of split_classes above has pretty poor performance >>>>> because it creates many temporary boolean arrays, so my plan for a PR >>>>> was to have a speedy version of it that uses a single pass through v. >>>>> (I often wanted to use this function on large datasets). >>>>> >>>>> If anyone has any comments on the idea (good idea. bad idea?) I'd love >>>>> to hear. >>>>> >>>>> I have some further notes and examples here: >>>>> https://gist.github.com/ahaldane/1e673d2fe6ffe0be4f21 >>>>> >>>>> Allan >>>>> >>>>> On 02/12/2016 09:40 AM, S?rgio wrote: >>>>> >>>>>> Hello, >>>>>> >>>>>> This is my first e-mail, I will try to make the idea simple. >>>>>> >>>>>> Similar to masked array it would be interesting to use a label array >>>>>> to >>>>>> guide operations. >>>>>> >>>>>> Ex.: >>>>>> >>> x >>>>>> labelled_array(data = >>>>>> [[0 1 2] >>>>>> [3 4 5] >>>>>> [6 7 8]], >>>>>> label = >>>>>> [[0 1 2] >>>>>> [0 1 2] >>>>>> [0 1 2]]) >>>>>> >>>>>> >>> sum(x) >>>>>> array([9, 12, 15]) >>>>>> >>>>>> The operations would create a new axis for label indexing. >>>>>> >>>>>> You could think of it as a collection of masks, one for each label. >>>>>> >>>>>> I don't know a way to make something like this efficiently without a >>>>>> loop. Just wondering... >>>>>> >>>>>> S?rgio. 
>>>>>> >>>>>> >>>>>> _______________________________________________ >>>>>> NumPy-Discussion mailing list >>>>>> NumPy-Discussion at scipy.org >>>>>> https://mail.scipy.org/mailman/listinfo/numpy-discussion >>>>>> >>>>>> >>>>> >>>> _______________________________________________ >>>> NumPy-Discussion mailing list >>>> NumPy-Discussion at scipy.org >>>> https://mail.scipy.org/mailman/listinfo/numpy-discussion >>>> >>> >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> https://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >>> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Sat Feb 13 16:38:47 2016 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 13 Feb 2016 16:38:47 -0500 Subject: [Numpy-discussion] ANN: numpydoc 0.6.0 released In-Reply-To: References: Message-ID: On Sat, Feb 13, 2016 at 10:03 AM, Ralf Gommers wrote: > Hi all, > > I'm pleased to announce the release of numpydoc 0.6.0. The main new > feature is support for the Yields section in numpy-style docstrings. This > is described in > https://github.com/numpy/numpy/blob/master/doc/HOWTO_DOCUMENT.rst.txt > > Numpydoc can be installed from PyPi: https://pypi.python.org/pypi/numpydoc > Thanks, BTW: the status section in the howto still refers to the documentation editor, which has been retired AFAIK. Josef > > > Cheers, > Ralf > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jeffreback at gmail.com Sat Feb 13 19:53:15 2016 From: jeffreback at gmail.com (Jeff Reback) Date: Sat, 13 Feb 2016 19:53:15 -0500 Subject: [Numpy-discussion] ANN: pandas v0.18.0rc1 - RELEASE CANDIDATE Message-ID: Hi, I'm pleased to announce the availability of the first release candidate of Pandas 0.18.0. Please try this RC and report any issues here: Pandas Issues We will be releasing officially in 1-2 weeks or so. **RELEASE CANDIDATE 1** This is a major release from 0.17.1 and includes a small number of API changes, several new features, enhancements, and performance improvements along with a large number of bug fixes. We recommend that all users upgrade to this version. Highlights include: - pandas >= 0.18.0 will no longer support compatibility with Python version 2.6 GH7718 or version 3.3 GH11273 - Moving and expanding window functions are now methods on Series and DataFrame similar to .groupby like objects, see here . - Adding support for a RangeIndex as a specialized form of the Int64Index for memory savings, see here . - API breaking .resample changes to make it more .groupby like, see here - Removal of support for positional indexing with floats, which was deprecated since 0.14.0. This will now raise a TypeError, see here - The .to_xarray() function has been added for compatibility with the xarray package see here . - Addition of the .str.extractall() method , and API changes to the the .str.extract() method , and the .str.cat() method - pd.test() top-level nose test runner is available GH4327 See the Whatsnew for much more information. 
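As a quick illustration of the method-style API mentioned in the highlights (a sketch based on the whatsnew linked above, not an excerpt from it; check the final release notes for exact keyword names):

import numpy as np
import pandas as pd

s = pd.Series(np.arange(10.0))

# window functions become methods (previously pd.rolling_mean(s, 3), etc.)
print(s.rolling(window=3).mean().tail())
print(s.expanding().sum().tail())

# .resample() now returns a deferred, groupby-like object
ts = pd.Series(np.arange(10.0),
               index=pd.date_range('2016-01-01', periods=10, freq='D'))
print(ts.resample('2D').mean())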
Best way to get this is to install via conda from our development channel. Builds for osx-64,linux-64,win-64 for Python 2.7 and Python 3.5 are all available. conda install pandas=v0.18.0rc1 -c pandas Thanks to all who made this release happen. It is a very large release! Jeff -------------- next part -------------- An HTML attachment was scrubbed... URL: From antony.lee at berkeley.edu Sat Feb 13 20:57:59 2016 From: antony.lee at berkeley.edu (Antony Lee) Date: Sat, 13 Feb 2016 17:57:59 -0800 Subject: [Numpy-discussion] Suggestion: special-case np.array(range(...)) to be faster Message-ID: Compare (on Python3 -- for Python2, read "xrange" instead of "range"): In [2]: %timeit np.array(range(1000000), np.int64) 10 loops, best of 3: 156 ms per loop In [3]: %timeit np.arange(1000000, dtype=np.int64) 1000 loops, best of 3: 853 ?s per loop Note that while iterating over a range is not very fast, it is still much better than the array creation: In [4]: from collections import deque In [5]: %timeit deque(range(1000000), 1) 10 loops, best of 3: 25.5 ms per loop On one hand, special cases are awful. On the other hand, the range builtin is probably important enough to deserve a special case to make this construction faster. Or not? I initially opened this as https://github.com/numpy/numpy/issues/7233 but it was suggested there that this should be discussed on the ML first. (The real issue which prompted this suggestion: I was building sparse matrices using scipy.sparse.csc_matrix with some indices specified using range, and that construction step turned out to take a significant portion of the time because of the calls to np.array). Antony -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Sat Feb 13 21:43:48 2016 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 13 Feb 2016 21:43:48 -0500 Subject: [Numpy-discussion] Suggestion: special-case np.array(range(...)) to be faster In-Reply-To: References: Message-ID: On Sat, Feb 13, 2016 at 8:57 PM, Antony Lee wrote: > Compare (on Python3 -- for Python2, read "xrange" instead of "range"): > > In [2]: %timeit np.array(range(1000000), np.int64) > 10 loops, best of 3: 156 ms per loop > > In [3]: %timeit np.arange(1000000, dtype=np.int64) > 1000 loops, best of 3: 853 ?s per loop > > > Note that while iterating over a range is not very fast, it is still much > better than the array creation: > > In [4]: from collections import deque > > In [5]: %timeit deque(range(1000000), 1) > 10 loops, best of 3: 25.5 ms per loop > > > On one hand, special cases are awful. On the other hand, the range builtin > is probably important enough to deserve a special case to make this > construction faster. Or not? I initially opened this as > https://github.com/numpy/numpy/issues/7233 but it was suggested there > that this should be discussed on the ML first. > > (The real issue which prompted this suggestion: I was building sparse > matrices using scipy.sparse.csc_matrix with some indices specified using > range, and that construction step turned out to take a significant portion > of the time because of the calls to np.array). > IMO: I don't see a reason why this should be supported. There is np.arange after all for this usecase, and from_iter. range and the other guys are iterators, and in several cases we can use larange = list(range(...)) as a short cut to get python list.for python 2/3 compatibility. I think this might be partially a learning effect in the python 2 to 3 transition. 
After using almost only python 3 for maybe a year, I don't think it's difficult to remember the differences when writing code that is py 2.7 and py 3.x compatible. It's just **another** thing to watch out for if milliseconds matter in your application. Josef > > Antony > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Sat Feb 13 21:48:31 2016 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 13 Feb 2016 21:48:31 -0500 Subject: [Numpy-discussion] Suggestion: special-case np.array(range(...)) to be faster In-Reply-To: References: Message-ID: On Sat, Feb 13, 2016 at 9:43 PM, wrote: > > > On Sat, Feb 13, 2016 at 8:57 PM, Antony Lee > wrote: > >> Compare (on Python3 -- for Python2, read "xrange" instead of "range"): >> >> In [2]: %timeit np.array(range(1000000), np.int64) >> 10 loops, best of 3: 156 ms per loop >> >> In [3]: %timeit np.arange(1000000, dtype=np.int64) >> 1000 loops, best of 3: 853 ?s per loop >> >> >> Note that while iterating over a range is not very fast, it is still much >> better than the array creation: >> >> In [4]: from collections import deque >> >> In [5]: %timeit deque(range(1000000), 1) >> 10 loops, best of 3: 25.5 ms per loop >> >> >> On one hand, special cases are awful. On the other hand, the range >> builtin is probably important enough to deserve a special case to make this >> construction faster. Or not? I initially opened this as >> https://github.com/numpy/numpy/issues/7233 but it was suggested there >> that this should be discussed on the ML first. >> >> (The real issue which prompted this suggestion: I was building sparse >> matrices using scipy.sparse.csc_matrix with some indices specified using >> range, and that construction step turned out to take a significant portion >> of the time because of the calls to np.array). >> > > > IMO: I don't see a reason why this should be supported. There is np.arange > after all for this usecase, and from_iter. > range and the other guys are iterators, and in several cases we can use > larange = list(range(...)) as a short cut to get python list.for python 2/3 > compatibility. > > I think this might be partially a learning effect in the python 2 to 3 > transition. After using almost only python 3 for maybe a year, I don't > think it's difficult to remember the differences when writing code that is > py 2.7 and py 3.x compatible. > > > It's just **another** thing to watch out for if milliseconds matter in > your application. > side question: Is there a simple way to distinguish a iterator or generator from an iterable data structure? Josef > > Josef > > >> >> Antony >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From allanhaldane at gmail.com Sat Feb 13 22:41:13 2016 From: allanhaldane at gmail.com (Allan Haldane) Date: Sat, 13 Feb 2016 22:41:13 -0500 Subject: [Numpy-discussion] [Suggestion] Labelled Array In-Reply-To: References: <56BF63AB.2080004@gmail.com> <56BF6F91.5020606@gmail.com> Message-ID: <56BFF759.7010505@gmail.com> Impressive! 
Possibly there's still a case for including a 'groupby' function in numpy itself since it's a generally useful operation, but I do see less of a need given the nice pandas functionality. At least, next time someone asks a stackoverflow question like the ones below someone should tell them to use pandas! (copied from my gist for future list reference). http://stackoverflow.com/questions/4373631/sum-array-by-number-in-numpy http://stackoverflow.com/questions/31483912/split-numpy-array-according-to-values-in-the-array-a-condition/31484134#31484134 http://stackoverflow.com/questions/31863083/python-split-numpy-array-based-on-values-in-the-array http://stackoverflow.com/questions/28599405/splitting-an-array-into-two-smaller-arrays-in-python http://stackoverflow.com/questions/7662458/how-to-split-an-array-according-to-a-condition-in-numpy Allan On 02/13/2016 01:39 PM, Jeff Reback wrote: > In [10]: pd.options.display.max_rows=10 > > In [13]: np.random.seed(1234) > > In [14]: c = np.random.randint(0,32,size=100000) > > In [15]: v = np.arange(100000) > > In [16]: df = DataFrame({'v' : v, 'c' : c}) > > In [17]: df > Out[17]: > c v > 0 15 0 > 1 19 1 > 2 6 2 > 3 21 3 > 4 12 4 > ... .. ... > 99995 7 99995 > 99996 2 99996 > 99997 27 99997 > 99998 28 99998 > 99999 7 99999 > > [100000 rows x 2 columns] > > In [19]: df.groupby('c').count() > Out[19]: > v > c > 0 3136 > 1 3229 > 2 3093 > 3 3121 > 4 3041 > .. ... > 27 3128 > 28 3063 > 29 3147 > 30 3073 > 31 3090 > > [32 rows x 1 columns] > > In [20]: %timeit df.groupby('c').count() > 100 loops, best of 3: 2 ms per loop > > In [21]: %timeit df.groupby('c').mean() > 100 loops, best of 3: 2.39 ms per loop > > In [22]: df.groupby('c').mean() > Out[22]: > v > c > 0 49883.384885 > 1 50233.692165 > 2 48634.116069 > 3 50811.743992 > 4 50505.368629 > .. ... > 27 49715.349425 > 28 50363.501469 > 29 50485.395933 > 30 50190.155223 > 31 50691.041748 > > [32 rows x 1 columns] > > > On Sat, Feb 13, 2016 at 1:29 PM, > wrote: > > > > On Sat, Feb 13, 2016 at 1:01 PM, Allan Haldane > > wrote: > > Sorry, to reply to myself here, but looking at it with fresh > eyes maybe the performance of the naive version isn't too bad. > Here's a comparison of the naive vs a better implementation: > > def split_classes_naive(c, v): > return [v[c == u] for u in unique(c)] > > def split_classes(c, v): > perm = c.argsort() > csrt = c[perm] > div = where(csrt[1:] != csrt[:-1])[0] + 1 > return [v[x] for x in split(perm, div)] > > >>> c = randint(0,32,size=100000) > >>> v = arange(100000) > >>> %timeit split_classes_naive(c,v) > 100 loops, best of 3: 8.4 ms per loop > >>> %timeit split_classes(c,v) > 100 loops, best of 3: 4.79 ms per loop > > > The usecases I recently started to target for similar things is 1 > Million or more rows and 10000 uniques in the labels. > The second version should be faster for large number of uniques, I > guess. > > Overall numpy is falling far behind pandas in terms of simple > groupby operations. bincount and histogram (IIRC) worked for some > cases but are rather limited. > > reduce_at looks nice for cases where it applies. > > In contrast to the full sized labels in the original post, I only > know of applications where the labels are 1-D corresponding to rows > or columns. > > Josef > > > In any case, maybe it is useful to Sergio or others. 
> > Allan > > > On 02/13/2016 12:11 PM, Allan Haldane wrote: > > I've had a pretty similar idea for a new indexing function > 'split_classes' which would help in your case, which > essentially does > > def split_classes(c, v): > return [v[c == u] for u in unique(c)] > > Your example could be coded as > > >>> [sum(c) for c in split_classes(label, data)] > [9, 12, 15] > > I feel I've come across the need for such a function often > enough that > it might be generally useful to people as part of numpy. The > implementation of split_classes above has pretty poor > performance > because it creates many temporary boolean arrays, so my plan > for a PR > was to have a speedy version of it that uses a single pass > through v. > (I often wanted to use this function on large datasets). > > If anyone has any comments on the idea (good idea. bad > idea?) I'd love > to hear. > > I have some further notes and examples here: > https://gist.github.com/ahaldane/1e673d2fe6ffe0be4f21 > > Allan > > On 02/12/2016 09:40 AM, S?rgio wrote: > > Hello, > > This is my first e-mail, I will try to make the idea simple. > > Similar to masked array it would be interesting to use a > label array to > guide operations. > > Ex.: > >>> x > labelled_array(data = > [[0 1 2] > [3 4 5] > [6 7 8]], > label = > [[0 1 2] > [0 1 2] > [0 1 2]]) > > >>> sum(x) > array([9, 12, 15]) > > The operations would create a new axis for label indexing. > > You could think of it as a collection of masks, one for > each label. > > I don't know a way to make something like this > efficiently without a > loop. Just wondering... > > S?rgio. > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > From antony.lee at berkeley.edu Sun Feb 14 03:21:34 2016 From: antony.lee at berkeley.edu (Antony Lee) Date: Sun, 14 Feb 2016 00:21:34 -0800 Subject: [Numpy-discussion] Suggestion: special-case np.array(range(...)) to be faster In-Reply-To: References: Message-ID: re: no reason why... This has nothing to do with Python2/Python3 (I personally stopped using Python2 at least 3 years ago.) Let me put it this way instead: if Python3's "range" (or Python2's "xrange") was not a builtin type but a type provided by numpy, I don't think it would be controversial at all to provide an `__array__` special method to efficiently convert it to a ndarray. It would be the same if `np.array` used a `functools.singledispatch` dispatcher rather than an `__array__` special method (which is obviously not possible for chronological reasons). re: iterable vs iterator: check for the presence of the __next__ special method (or isinstance(x, Iterable) vs. 
isinstance(x, Iterator) and not isinstance(x, Iterable)) Antony 2016-02-13 18:48 GMT-08:00 : > > > On Sat, Feb 13, 2016 at 9:43 PM, wrote: > >> >> >> On Sat, Feb 13, 2016 at 8:57 PM, Antony Lee >> wrote: >> >>> Compare (on Python3 -- for Python2, read "xrange" instead of "range"): >>> >>> In [2]: %timeit np.array(range(1000000), np.int64) >>> 10 loops, best of 3: 156 ms per loop >>> >>> In [3]: %timeit np.arange(1000000, dtype=np.int64) >>> 1000 loops, best of 3: 853 ?s per loop >>> >>> >>> Note that while iterating over a range is not very fast, it is still >>> much better than the array creation: >>> >>> In [4]: from collections import deque >>> >>> In [5]: %timeit deque(range(1000000), 1) >>> 10 loops, best of 3: 25.5 ms per loop >>> >>> >>> On one hand, special cases are awful. On the other hand, the range >>> builtin is probably important enough to deserve a special case to make this >>> construction faster. Or not? I initially opened this as >>> https://github.com/numpy/numpy/issues/7233 but it was suggested there >>> that this should be discussed on the ML first. >>> >>> (The real issue which prompted this suggestion: I was building sparse >>> matrices using scipy.sparse.csc_matrix with some indices specified using >>> range, and that construction step turned out to take a significant portion >>> of the time because of the calls to np.array). >>> >> >> >> IMO: I don't see a reason why this should be supported. There is >> np.arange after all for this usecase, and from_iter. >> range and the other guys are iterators, and in several cases we can use >> larange = list(range(...)) as a short cut to get python list.for python 2/3 >> compatibility. >> >> I think this might be partially a learning effect in the python 2 to 3 >> transition. After using almost only python 3 for maybe a year, I don't >> think it's difficult to remember the differences when writing code that is >> py 2.7 and py 3.x compatible. >> >> >> It's just **another** thing to watch out for if milliseconds matter in >> your application. >> > > > side question: Is there a simple way to distinguish a iterator or > generator from an iterable data structure? > > Josef > > > >> >> Josef >> >> >>> >>> Antony >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> https://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >>> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sun Feb 14 09:17:19 2016 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sun, 14 Feb 2016 15:17:19 +0100 Subject: [Numpy-discussion] ANN: numpydoc 0.6.0 released In-Reply-To: References: Message-ID: On Sat, Feb 13, 2016 at 10:38 PM, wrote: > > > On Sat, Feb 13, 2016 at 10:03 AM, Ralf Gommers > wrote: > >> Hi all, >> >> I'm pleased to announce the release of numpydoc 0.6.0. The main new >> feature is support for the Yields section in numpy-style docstrings. This >> is described in >> https://github.com/numpy/numpy/blob/master/doc/HOWTO_DOCUMENT.rst.txt >> >> Numpydoc can be installed from PyPi: >> https://pypi.python.org/pypi/numpydoc >> > > > Thanks, > > BTW: the status section in the howto still refers to the documentation > editor, which has been retired AFAIK. > Thanks Josef. I sent a PR to remove that text. 
Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Sun Feb 14 09:28:03 2016 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sun, 14 Feb 2016 09:28:03 -0500 Subject: [Numpy-discussion] Suggestion: special-case np.array(range(...)) to be faster In-Reply-To: References: Message-ID: On Sun, Feb 14, 2016 at 3:21 AM, Antony Lee wrote: > re: no reason why... > This has nothing to do with Python2/Python3 (I personally stopped using > Python2 at least 3 years ago.) Let me put it this way instead: if > Python3's "range" (or Python2's "xrange") was not a builtin type but a type > provided by numpy, I don't think it would be controversial at all to > provide an `__array__` special method to efficiently convert it to a > ndarray. It would be the same if `np.array` used a > `functools.singledispatch` dispatcher rather than an `__array__` special > method (which is obviously not possible for chronological reasons). > > But numpy does provide arange. What's the reason to not use np.arange and use an iterator instead? > re: iterable vs iterator: check for the presence of the __next__ special > method (or isinstance(x, Iterable) vs. isinstance(x, Iterator) and not > isinstance(x, Iterable)) > AFAIR and from spot checking the mailing list, in the past the argument was that it's too complicated to mix array/asarray creation with fromiter building of arrays. (I have no idea if array could cheaply delegate to fromiter.) Josef > > Antony > > > 2016-02-13 18:48 GMT-08:00 : > >> >> >> On Sat, Feb 13, 2016 at 9:43 PM, wrote: >> >>> >>> >>> On Sat, Feb 13, 2016 at 8:57 PM, Antony Lee >>> wrote: >>> >>>> Compare (on Python3 -- for Python2, read "xrange" instead of "range"): >>>> >>>> In [2]: %timeit np.array(range(1000000), np.int64) >>>> 10 loops, best of 3: 156 ms per loop >>>> >>>> In [3]: %timeit np.arange(1000000, dtype=np.int64) >>>> 1000 loops, best of 3: 853 ?s per loop >>>> >>>> >>>> Note that while iterating over a range is not very fast, it is still >>>> much better than the array creation: >>>> >>>> In [4]: from collections import deque >>>> >>>> In [5]: %timeit deque(range(1000000), 1) >>>> 10 loops, best of 3: 25.5 ms per loop >>>> >>>> >>>> On one hand, special cases are awful. On the other hand, the range >>>> builtin is probably important enough to deserve a special case to make this >>>> construction faster. Or not? I initially opened this as >>>> https://github.com/numpy/numpy/issues/7233 but it was suggested there >>>> that this should be discussed on the ML first. >>>> >>>> (The real issue which prompted this suggestion: I was building sparse >>>> matrices using scipy.sparse.csc_matrix with some indices specified using >>>> range, and that construction step turned out to take a significant portion >>>> of the time because of the calls to np.array). >>>> >>> >>> >>> IMO: I don't see a reason why this should be supported. There is >>> np.arange after all for this usecase, and from_iter. >>> range and the other guys are iterators, and in several cases we can use >>> larange = list(range(...)) as a short cut to get python list.for python 2/3 >>> compatibility. >>> >>> I think this might be partially a learning effect in the python 2 to 3 >>> transition. After using almost only python 3 for maybe a year, I don't >>> think it's difficult to remember the differences when writing code that is >>> py 2.7 and py 3.x compatible. >>> >>> >>> It's just **another** thing to watch out for if milliseconds matter in >>> your application. 
>>> >> >> >> side question: Is there a simple way to distinguish a iterator or >> generator from an iterable data structure? >> >> Josef >> >> >> >>> >>> Josef >>> >>> >>>> >>>> Antony >>>> >>>> _______________________________________________ >>>> NumPy-Discussion mailing list >>>> NumPy-Discussion at scipy.org >>>> https://mail.scipy.org/mailman/listinfo/numpy-discussion >>>> >>>> >>> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sun Feb 14 09:36:05 2016 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sun, 14 Feb 2016 15:36:05 +0100 Subject: [Numpy-discussion] Suggestion: special-case np.array(range(...)) to be faster In-Reply-To: References: Message-ID: On Sun, Feb 14, 2016 at 9:21 AM, Antony Lee wrote: > re: no reason why... > This has nothing to do with Python2/Python3 (I personally stopped using > Python2 at least 3 years ago.) Let me put it this way instead: if > Python3's "range" (or Python2's "xrange") was not a builtin type but a type > provided by numpy, I don't think it would be controversial at all to > provide an `__array__` special method to efficiently convert it to a > ndarray. It would be the same if `np.array` used a > `functools.singledispatch` dispatcher rather than an `__array__` special > method (which is obviously not possible for chronological reasons). > > re: iterable vs iterator: check for the presence of the __next__ special > method (or isinstance(x, Iterable) vs. isinstance(x, Iterator) and not > isinstance(x, Iterable)) > I think it's good to do something about this, but it's not clear what the exact proposal is. I could image one or both of: - special-case the range() object in array (and asarray/asanyarray?) such that array(range(N)) becomes as fast as arange(N). - special-case all iterators, such that array(range(N)) becomes as fast as deque(range(N)) or yet something else? Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From nilsc.becker at gmail.com Sun Feb 14 14:35:27 2016 From: nilsc.becker at gmail.com (Nils Becker) Date: Sun, 14 Feb 2016 20:35:27 +0100 Subject: [Numpy-discussion] Modulus (remainder) function corner cases In-Reply-To: References: Message-ID: 2016-02-13 17:42 GMT+01:00 Charles R Harris : > The Fortran modulo function, which is the same basic function as in my >> branch, does not specify any bounds on the result for floating numbers, but >> gives only the formula, modulus(a, b) = a - b*floor(a/b), which has the >> advantage of being simple and well defined ;) >> > > In the light of the libm-discussion I spent some time looking at floating point functions and their accuracy. I would vote in favor of keeping an implementation that uses the fmod-function of the system library and bends it to adhere to the python convention (sign of divisor). There is probably a reason why the fmod-implementation is not as simple as "a - b*floor(a/b)" [1]. One obvious problem with the simple expression arises when a/b = 0.0 in floating point. E.g. 
In [43]: np.__version__ Out[43]: '1.10.4' In [44]: x = np.float64(1e-320) In [45]: y = np.float64(-1e10) In [46]: x % y # this uses libm's fmod on my system Out[46]: -10000000000.0 # == y, correctly rounded result in round-to-nearest In [47]: x - y*np.floor(x/y) # this here is the naive expression Out[47]: 9.9998886718268301e-321 # == x, wrong sign There are other problems (a/b = inf in floating point). As I do not understand the implementation of fmod (for example in openlibm) in detail I cannot give a full list of corner cases. Unfortunately, I did not follow the (many different) bug reports on this issue originally and am confused why there was a need to change the implementation in the first place. numpy's "%" operator seems to work quite well on my system. Therefore, this mail may be rather unproductive as I am missing some information. Concerning your original question: Many elementary functions loose their mathematical properties when they are calculated correctly-rounded in floating point numbers [2]. We do not fix this for other functions, I would not fix it here. Cheers Nils [1] https://github.com/JuliaLang/openlibm/blob/master/src/e_fmod.c [2] np.exp(np.nextafter(1.0, 0.0)) < np.e -> False (Monotonicity lost in exp(x)). -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sun Feb 14 14:54:47 2016 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 14 Feb 2016 12:54:47 -0700 Subject: [Numpy-discussion] Modulus (remainder) function corner cases In-Reply-To: References: Message-ID: On Sun, Feb 14, 2016 at 12:35 PM, Nils Becker wrote: > 2016-02-13 17:42 GMT+01:00 Charles R Harris : > >> The Fortran modulo function, which is the same basic function as in my >>> branch, does not specify any bounds on the result for floating numbers, but >>> gives only the formula, modulus(a, b) = a - b*floor(a/b), which has >>> the advantage of being simple and well defined ;) >>> >> >> > In the light of the libm-discussion I spent some time looking at floating > point functions and their accuracy. I would vote in favor of keeping an > implementation that uses the fmod-function of the system library and bends > it to adhere to the python convention (sign of divisor). There is probably > a reason why the fmod-implementation is not as simple as "a - b*floor(a/b)" > [1]. > > One obvious problem with the simple expression arises when a/b = 0.0 in > floating point. E.g. > > In [43]: np.__version__ > Out[43]: '1.10.4' > In [44]: x = np.float64(1e-320) > In [45]: y = np.float64(-1e10) > In [46]: x % y # this uses libm's fmod on my system > Out[46]: -10000000000.0 # == y, correctly rounded result in > round-to-nearest > In [47]: x - y*np.floor(x/y) # this here is the naive expression > Out[47]: 9.9998886718268301e-321 # == x, wrong sign > But more accurate ;) Currently, this is actually clipped. In [3]: remainder(x,y) Out[3]: -0.0 In [4]: x - y*floor(x/y) Out[4]: 9.9998886718268301e-321 In [5]: fmod(x,y) Out[5]: 9.9998886718268301e-321 > There are other problems (a/b = inf in floating point). As I do not > understand the implementation of fmod (for example in openlibm) in detail I > cannot give a full list of corner cases. > ? Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
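To make the sign-convention point concrete, a small sketch contrasting C-style fmod (result takes the sign of the dividend) with Python's floor-based convention (result takes the sign of the divisor) and the a - b*floor(a/b) formula:

import math
import numpy as np

a, b = -7.0, 3.0
print(math.fmod(a, b))           # -1.0, sign of the dividend (C fmod)
print(a % b)                     #  2.0, sign of the divisor (Python)
print(a - b * np.floor(a / b))   #  2.0, the floor-based formula
print(np.remainder(a, b))        #  2.0, matches the Python convention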
URL: From charlesr.harris at gmail.com Sun Feb 14 15:11:39 2016 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 14 Feb 2016 13:11:39 -0700 Subject: [Numpy-discussion] Modulus (remainder) function corner cases In-Reply-To: References: Message-ID: On Sun, Feb 14, 2016 at 12:54 PM, Charles R Harris < charlesr.harris at gmail.com> wrote: > > > On Sun, Feb 14, 2016 at 12:35 PM, Nils Becker > wrote: > >> 2016-02-13 17:42 GMT+01:00 Charles R Harris : >> >>> The Fortran modulo function, which is the same basic function as in my >>>> branch, does not specify any bounds on the result for floating >>>> numbers, but gives only the formula, modulus(a, b) = a - b*floor(a/b), >>>> which has the advantage of being simple and well defined ;) >>>> >>> >>> >> In the light of the libm-discussion I spent some time looking at floating >> point functions and their accuracy. I would vote in favor of keeping an >> implementation that uses the fmod-function of the system library and bends >> it to adhere to the python convention (sign of divisor). There is probably >> a reason why the fmod-implementation is not as simple as "a - b*floor(a/b)" >> [1]. >> >> One obvious problem with the simple expression arises when a/b = 0.0 in >> floating point. E.g. >> >> In [43]: np.__version__ >> Out[43]: '1.10.4' >> In [44]: x = np.float64(1e-320) >> In [45]: y = np.float64(-1e10) >> In [46]: x % y # this uses libm's fmod on my system >> > I'm not too worried about denormals. However, this might be considered a bug in the floor function In [16]: floor(-1e-330) Out[16]: -0.0 Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From antony.lee at berkeley.edu Sun Feb 14 15:23:20 2016 From: antony.lee at berkeley.edu (Antony Lee) Date: Sun, 14 Feb 2016 12:23:20 -0800 Subject: [Numpy-discussion] Suggestion: special-case np.array(range(...)) to be faster In-Reply-To: References: Message-ID: I was thinking (1) (special-case range()); however (2) may be more generally applicable and useful. Antony 2016-02-14 6:36 GMT-08:00 Ralf Gommers : > > > On Sun, Feb 14, 2016 at 9:21 AM, Antony Lee > wrote: > >> re: no reason why... >> This has nothing to do with Python2/Python3 (I personally stopped using >> Python2 at least 3 years ago.) Let me put it this way instead: if >> Python3's "range" (or Python2's "xrange") was not a builtin type but a type >> provided by numpy, I don't think it would be controversial at all to >> provide an `__array__` special method to efficiently convert it to a >> ndarray. It would be the same if `np.array` used a >> `functools.singledispatch` dispatcher rather than an `__array__` special >> method (which is obviously not possible for chronological reasons). >> >> re: iterable vs iterator: check for the presence of the __next__ special >> method (or isinstance(x, Iterable) vs. isinstance(x, Iterator) and not >> isinstance(x, Iterable)) >> > > I think it's good to do something about this, but it's not clear what the > exact proposal is. I could image one or both of: > > - special-case the range() object in array (and asarray/asanyarray?) > such that array(range(N)) becomes as fast as arange(N). > - special-case all iterators, such that array(range(N)) becomes as fast > as deque(range(N)) > > or yet something else? 
> > Ralf > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sun Feb 14 15:35:27 2016 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 14 Feb 2016 13:35:27 -0700 Subject: [Numpy-discussion] Modulus (remainder) function corner cases In-Reply-To: References: Message-ID: On Sun, Feb 14, 2016 at 1:11 PM, Charles R Harris wrote: > > > On Sun, Feb 14, 2016 at 12:54 PM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> >> >> On Sun, Feb 14, 2016 at 12:35 PM, Nils Becker >> wrote: >> >>> 2016-02-13 17:42 GMT+01:00 Charles R Harris : >>> >>>> The Fortran modulo function, which is the same basic function as in my >>>>> branch, does not specify any bounds on the result for floating >>>>> numbers, but gives only the formula, modulus(a, b) = a - b*floor(a/b), >>>>> which has the advantage of being simple and well defined ;) >>>>> >>>> >>>> >>> In the light of the libm-discussion I spent some time looking at >>> floating point functions and their accuracy. I would vote in favor of >>> keeping an implementation that uses the fmod-function of the system library >>> and bends it to adhere to the python convention (sign of divisor). There is >>> probably a reason why the fmod-implementation is not as simple as "a - >>> b*floor(a/b)" [1]. >>> >>> One obvious problem with the simple expression arises when a/b = 0.0 in >>> floating point. E.g. >>> >>> In [43]: np.__version__ >>> Out[43]: '1.10.4' >>> In [44]: x = np.float64(1e-320) >>> In [45]: y = np.float64(-1e10) >>> In [46]: x % y # this uses libm's fmod on my system >>> >> > I'm not too worried about denormals. However, this might be considered a > bug in the floor function > > In [16]: floor(-1e-330) > Out[16]: -0.0 > > However, I do note that some languages offer two versions of modulus, one floor based and the other trunc based (effectively fmod). What I wanted is to keep the remainder consistent with the floor function in the C library. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sun Feb 14 16:36:00 2016 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 14 Feb 2016 14:36:00 -0700 Subject: [Numpy-discussion] Suggestion: special-case np.array(range(...)) to be faster In-Reply-To: References: Message-ID: On Sun, Feb 14, 2016 at 7:36 AM, Ralf Gommers wrote: > > > On Sun, Feb 14, 2016 at 9:21 AM, Antony Lee > wrote: > >> re: no reason why... >> This has nothing to do with Python2/Python3 (I personally stopped using >> Python2 at least 3 years ago.) Let me put it this way instead: if >> Python3's "range" (or Python2's "xrange") was not a builtin type but a type >> provided by numpy, I don't think it would be controversial at all to >> provide an `__array__` special method to efficiently convert it to a >> ndarray. It would be the same if `np.array` used a >> `functools.singledispatch` dispatcher rather than an `__array__` special >> method (which is obviously not possible for chronological reasons). >> >> re: iterable vs iterator: check for the presence of the __next__ special >> method (or isinstance(x, Iterable) vs. isinstance(x, Iterator) and not >> isinstance(x, Iterable)) >> > > I think it's good to do something about this, but it's not clear what the > exact proposal is. 
I could image one or both of: > > - special-case the range() object in array (and asarray/asanyarray?) > such that array(range(N)) becomes as fast as arange(N). > - special-case all iterators, such that array(range(N)) becomes as fast > as deque(range(N)) > I think the last wouldn't help much, as numpy would still need to determine dimensions and type. I assume that is one of the reason sparse itself doesn't do that. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From robbmcleod at gmail.com Sun Feb 14 17:19:49 2016 From: robbmcleod at gmail.com (Robert McLeod) Date: Sun, 14 Feb 2016 23:19:49 +0100 Subject: [Numpy-discussion] Numexpr-3.0 proposal Message-ID: Hello everyone, I've done some work on making a new version of Numexpr that would fix some of the limitations of the original virtual machine with regards to data types and operation/function count. Basically I re-wrote the Python and C sides to use 4-byte words, instead of null-terminated strings, for operations and passing types. This means the number of operations and types isn't significantly limited anymore. Francesc Alted suggested I should come here and get some advice from the community. I wrote a short proposal on the Wiki here: https://github.com/pydata/numexpr/wiki/Numexpr-3.0-Branch-Overview One can see my branch here: https://github.com/robbmcleod/numexpr/tree/numexpr-3.0 If anyone has any comments they'd be welcome. Questions from my side for the group: 1.) Numpy casting: I downloaded the Numpy source and after browsing it seems the best approach is probably to just use numpy.core.numerictypes.find_common_type? 2.) Can anyone foresee any issues with casting build-in Python types (i.e. float and integer) to their OS dependent numpy equivalents? Numpy already seems to do this. 3.) Is anyone enabling the Intel VML library? There are a number of comments in the code that suggest it's not accelerating the code. It also seems to cause problems with bundling numexpr with cx_freeze. 4.) I took a stab at converting from distutils to setuputils but this seems challenging with numpy as a dependency. I wonder if anyone has tried monkey-patching so that setup.py build_ext uses distutils and then pass the interpreter.pyd/so as a data file, or some other such chicanery? (I was going to ask about attaching a debugger, but I just noticed: https://wiki.python.org/moin/DebuggingWithGdb ) Ciao, Robert -- Robert McLeod, Ph.D. Center for Cellular Imaging and Nano Analytics (C-CINA) Biozentrum der Universit?t Basel Mattenstrasse 26, 4058 Basel Work: +41.061.387.3225 robert.mcleod at unibas.ch robert.mcleod at bsse.ethz.ch robbmcleod at gmail.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Mon Feb 15 01:21:37 2016 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Mon, 15 Feb 2016 07:21:37 +0100 Subject: [Numpy-discussion] Suggestion: special-case np.array(range(...)) to be faster In-Reply-To: References: Message-ID: On Sun, Feb 14, 2016 at 10:36 PM, Charles R Harris < charlesr.harris at gmail.com> wrote: > > > On Sun, Feb 14, 2016 at 7:36 AM, Ralf Gommers > wrote: > >> >> >> On Sun, Feb 14, 2016 at 9:21 AM, Antony Lee >> wrote: >> >>> re: no reason why... >>> This has nothing to do with Python2/Python3 (I personally stopped using >>> Python2 at least 3 years ago.) 
Let me put it this way instead: if >>> Python3's "range" (or Python2's "xrange") was not a builtin type but a type >>> provided by numpy, I don't think it would be controversial at all to >>> provide an `__array__` special method to efficiently convert it to a >>> ndarray. It would be the same if `np.array` used a >>> `functools.singledispatch` dispatcher rather than an `__array__` special >>> method (which is obviously not possible for chronological reasons). >>> >>> re: iterable vs iterator: check for the presence of the __next__ special >>> method (or isinstance(x, Iterable) vs. isinstance(x, Iterator) and not >>> isinstance(x, Iterable)) >>> >> >> I think it's good to do something about this, but it's not clear what the >> exact proposal is. I could image one or both of: >> >> - special-case the range() object in array (and asarray/asanyarray?) >> such that array(range(N)) becomes as fast as arange(N). >> - special-case all iterators, such that array(range(N)) becomes as fast >> as deque(range(N)) >> > > I think the last wouldn't help much, as numpy would still need to > determine dimensions and type. I assume that is one of the reason sparse > itself doesn't do that. > Not orders of magnitude, but this shows that there's something to optimize for iterators: In [1]: %timeit np.array(range(100000)) 100 loops, best of 3: 14.9 ms per loop In [2]: %timeit np.array(list(range(100000))) 100 loops, best of 3: 9.68 ms per loop Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Mon Feb 15 01:28:30 2016 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Mon, 15 Feb 2016 07:28:30 +0100 Subject: [Numpy-discussion] Numexpr-3.0 proposal In-Reply-To: References: Message-ID: On Sun, Feb 14, 2016 at 11:19 PM, Robert McLeod wrote: > > 4.) I took a stab at converting from distutils to setuputils but this > seems challenging with numpy as a dependency. I wonder if anyone has tried > monkey-patching so that setup.py build_ext uses distutils and then pass the > interpreter.pyd/so as a data file, or some other such chicanery? > Not sure what you mean, since numpexpr already uses setuptools: https://github.com/pydata/numexpr/blob/master/setup.py#L22. What is the real goal you're trying to achieve? This monkeypatching is a bad idea: https://github.com/robbmcleod/numexpr/blob/numexpr-3.0/setup.py#L19. Both setuptools and numpy.distutils already do that, and that's already one too many. So you definitely don't want to add a third place.... You can use the -j (--parallel) flag to numpy.distutils instead, see http://docs.scipy.org/doc/numpy-dev/user/building.html#parallel-builds Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From antony.lee at berkeley.edu Mon Feb 15 02:41:29 2016 From: antony.lee at berkeley.edu (Antony Lee) Date: Sun, 14 Feb 2016 23:41:29 -0800 Subject: [Numpy-discussion] Suggestion: special-case np.array(range(...)) to be faster In-Reply-To: References: Message-ID: I wonder whether numpy is using the "old" iteration protocol (repeatedly calling x[i] for increasing i until StopIteration is reached?) A quick timing shows that it is indeed slower. ... actually it's not even clear to me what qualifies as a sequence for `np.array`: class C: def __iter__(self): return iter(range(10)) # [0... 
9] under the new iteration protocol def __getitem__(self, i): raise IndexError # [] under the old iteration protocol np.array(C()) ===> array(<__main__.C object at 0x7f3f21ffff28>, dtype=object) So how can np.array(range(...)) even work? 2016-02-14 22:21 GMT-08:00 Ralf Gommers : > > > On Sun, Feb 14, 2016 at 10:36 PM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> >> >> On Sun, Feb 14, 2016 at 7:36 AM, Ralf Gommers >> wrote: >> >>> >>> >>> On Sun, Feb 14, 2016 at 9:21 AM, Antony Lee >>> wrote: >>> >>>> re: no reason why... >>>> This has nothing to do with Python2/Python3 (I personally stopped using >>>> Python2 at least 3 years ago.) Let me put it this way instead: if >>>> Python3's "range" (or Python2's "xrange") was not a builtin type but a type >>>> provided by numpy, I don't think it would be controversial at all to >>>> provide an `__array__` special method to efficiently convert it to a >>>> ndarray. It would be the same if `np.array` used a >>>> `functools.singledispatch` dispatcher rather than an `__array__` special >>>> method (which is obviously not possible for chronological reasons). >>>> >>>> re: iterable vs iterator: check for the presence of the __next__ >>>> special method (or isinstance(x, Iterable) vs. isinstance(x, Iterator) and >>>> not isinstance(x, Iterable)) >>>> >>> >>> I think it's good to do something about this, but it's not clear what >>> the exact proposal is. I could image one or both of: >>> >>> - special-case the range() object in array (and asarray/asanyarray?) >>> such that array(range(N)) becomes as fast as arange(N). >>> - special-case all iterators, such that array(range(N)) becomes as >>> fast as deque(range(N)) >>> >> >> I think the last wouldn't help much, as numpy would still need to >> determine dimensions and type. I assume that is one of the reason sparse >> itself doesn't do that. >> > > Not orders of magnitude, but this shows that there's something to optimize > for iterators: > > In [1]: %timeit np.array(range(100000)) > 100 loops, best of 3: 14.9 ms per loop > > In [2]: %timeit np.array(list(range(100000))) > 100 loops, best of 3: 9.68 ms per loop > > Ralf > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Mon Feb 15 03:07:03 2016 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Mon, 15 Feb 2016 09:07:03 +0100 Subject: [Numpy-discussion] Suggestion: special-case np.array(range(...)) to be faster In-Reply-To: References: Message-ID: <1455523623.8049.9.camel@sipsolutions.net> On So, 2016-02-14 at 23:41 -0800, Antony Lee wrote: > I wonder whether numpy is using the "old" iteration protocol > (repeatedly calling x[i] for increasing i until StopIteration is > reached?) A quick timing shows that it is indeed slower. > ... actually it's not even clear to me what qualifies as a sequence > for `np.array`: > > class C: > def __iter__(self): > return iter(range(10)) # [0... 9] under the new iteration > protocol > def __getitem__(self, i): > raise IndexError # [] under the old iteration protocol > Numpy currently uses PySequence_Fast, but it has to do a two pass algorithm (find dtype+dims), and the range is converted twice to list by this call. That explains the speed advantage of converting to list manually. 
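One rough way to observe this from Python (an illustrative sketch, not code from the thread): wrap a range in a minimal sequence-protocol object and count element accesses during np.array(); if the two-pass behaviour described above applies, the count should come out near twice the length.

import numpy as np

class CountingSeq(object):
    # wraps a range() and counts sequence-protocol element accesses
    def __init__(self, n):
        self._r = range(n)
        self.getitem_calls = 0
    def __len__(self):
        return len(self._r)
    def __getitem__(self, i):
        self.getitem_calls += 1
        return self._r[i]

seq = CountingSeq(1000)
arr = np.array(seq)
print(arr.dtype, arr.shape)
print("__getitem__ calls during np.array():", seq.getitem_calls)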
- Sebastian > np.array(C()) > ===> array(<__main__.C object at 0x7f3f21ffff28>, dtype=object) > > So how can np.array(range(...)) even work? > > 2016-02-14 22:21 GMT-08:00 Ralf Gommers : > > > > > > On Sun, Feb 14, 2016 at 10:36 PM, Charles R Harris < > > charlesr.harris at gmail.com> wrote: > > > > > > > > > On Sun, Feb 14, 2016 at 7:36 AM, Ralf Gommers < > > > ralf.gommers at gmail.com> wrote: > > > > > > > > > > > > On Sun, Feb 14, 2016 at 9:21 AM, Antony Lee < > > > > antony.lee at berkeley.edu> wrote: > > > > > re: no reason why... > > > > > This has nothing to do with Python2/Python3 (I personally > > > > > stopped using Python2 at least 3 years ago.) Let me put it > > > > > this way instead: if Python3's "range" (or Python2's > > > > > "xrange") was not a builtin type but a type provided by > > > > > numpy, I don't think it would be controversial at all to > > > > > provide an `__array__` special method to efficiently convert > > > > > it to a ndarray. It would be the same if `np.array` used a > > > > > `functools.singledispatch` dispatcher rather than an > > > > > `__array__` special method (which is obviously not possible > > > > > for chronological reasons). > > > > > > > > > > re: iterable vs iterator: check for the presence of the > > > > > __next__ special method (or isinstance(x, Iterable) vs. > > > > > isinstance(x, Iterator) and not isinstance(x, Iterable)) > > > > > > > > > I think it's good to do something about this, but it's not > > > > clear what the exact proposal is. I could image one or both of: > > > > > > > > - special-case the range() object in array (and > > > > asarray/asanyarray?) such that array(range(N)) becomes as fast > > > > as arange(N). > > > > - special-case all iterators, such that array(range(N)) > > > > becomes as fast as deque(range(N)) > > > > > > > I think the last wouldn't help much, as numpy would still need to > > > determine dimensions and type. I assume that is one of the > > > reason sparse itself doesn't do that. > > > > > Not orders of magnitude, but this shows that there's something to > > optimize for iterators: > > > > In [1]: %timeit np.array(range(100000)) > > 100 loops, best of 3: 14.9 ms per loop > > > > In [2]: %timeit np.array(list(range(100000))) > > 100 loops, best of 3: 9.68 ms per loop > > > > Ralf > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From njs at pobox.com Mon Feb 15 03:10:11 2016 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 15 Feb 2016 00:10:11 -0800 Subject: [Numpy-discussion] Suggestion: special-case np.array(range(...)) to be faster In-Reply-To: References: Message-ID: On Sun, Feb 14, 2016 at 11:41 PM, Antony Lee wrote: > I wonder whether numpy is using the "old" iteration protocol (repeatedly > calling x[i] for increasing i until StopIteration is reached?) A quick > timing shows that it is indeed slower. Yeah, I'm pretty sure that np.array doesn't know anything about "iterable", just about "sequence" (calling x[i] for 0 <= i < i.__len__()). 
(See Sequence vs Iterable: https://docs.python.org/3/library/collections.abc.html) Personally I'd like it if we could eventually make it so np.array specifically looks for lists and only lists, because the way it has so many different fallbacks right now creates all confusion between which objects are elements. Compare: In [5]: np.array([(1, 2), (3, 4)]).shape Out[5]: (2, 2) In [6]: np.array([(1, 2), (3, 4)], dtype="i4,i4").shape Out[6]: (2,) -n -- Nathaniel J. Smith -- https://vorpus.org From gregor.thalhammer at gmail.com Mon Feb 15 04:43:53 2016 From: gregor.thalhammer at gmail.com (Gregor Thalhammer) Date: Mon, 15 Feb 2016 10:43:53 +0100 Subject: [Numpy-discussion] Numexpr-3.0 proposal In-Reply-To: References: Message-ID: <029E5E0A-21A5-42B4-B306-A2C8C12DF73B@gmail.com> > Am 14.02.2016 um 23:19 schrieb Robert McLeod : > > Hello everyone, > > I've done some work on making a new version of Numexpr that would fix some of the limitations of the original virtual machine with regards to data types and operation/function count. Basically I re-wrote the Python and C sides to use 4-byte words, instead of null-terminated strings, for operations and passing types. This means the number of operations and types isn't significantly limited anymore. > > Francesc Alted suggested I should come here and get some advice from the community. I wrote a short proposal on the Wiki here: > > https://github.com/pydata/numexpr/wiki/Numexpr-3.0-Branch-Overview > > One can see my branch here: > > https://github.com/robbmcleod/numexpr/tree/numexpr-3.0 > > If anyone has any comments they'd be welcome. Questions from my side for the group: > > 1.) Numpy casting: I downloaded the Numpy source and after browsing it seems the best approach is probably to just use numpy.core.numerictypes.find_common_type? > > 2.) Can anyone foresee any issues with casting build-in Python types (i.e. float and integer) to their OS dependent numpy equivalents? Numpy already seems to do this. > > 3.) Is anyone enabling the Intel VML library? There are a number of comments in the code that suggest it's not accelerating the code. It also seems to cause problems with bundling numexpr with cx_freeze. > Dear Robert, thanks for your effort on improving numexpr. Indeed, vectorized math libraries (VML) can give a large boost in performance (~5x), except for a couple of basic operations (add, mul, div), which current compilers are able to vectorize automatically. With recent gcc even more functions are vectorized, see https://sourceware.org/glibc/wiki/libmvec But you need special flags depending on the platform (SSE, AVX present?), runtime detection of processor capabilities would be nice for distributing binaries. Some time ago, since I lost access to Intels MKL, I patched numexpr to use Accelerate/Veclib on os x, which is preinstalled on each mac, see https://github.com/geggo/numexpr.git veclib_support branch. As you increased the opcode size, I could imagine providing a bit to switch (during runtime) between internal functions and vectorized ones, that would be handy for tests and benchmarks. Gregor > 4.) I took a stab at converting from distutils to setuputils but this seems challenging with numpy as a dependency. I wonder if anyone has tried monkey-patching so that setup.py build_ext uses distutils and then pass the interpreter.pyd/so as a data file, or some other such chicanery? 
> > (I was going to ask about attaching a debugger, but I just noticed: https://wiki.python.org/moin/DebuggingWithGdb ) > > Ciao, > > Robert > > -- > Robert McLeod, Ph.D. > Center for Cellular Imaging and Nano Analytics (C-CINA) > Biozentrum der Universit?t Basel > Mattenstrasse 26, 4058 Basel > Work: +41.061.387.3225 > robert.mcleod at unibas.ch > robert.mcleod at bsse.ethz.ch > robbmcleod at gmail.com > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From antony.lee at berkeley.edu Mon Feb 15 11:13:29 2016 From: antony.lee at berkeley.edu (Antony Lee) Date: Mon, 15 Feb 2016 08:13:29 -0800 Subject: [Numpy-discussion] Suggestion: special-case np.array(range(...)) to be faster In-Reply-To: References: Message-ID: Indeed: In [1]: class C: def __getitem__(self, i): if i < 10: return i else: raise IndexError def __len__(self): return 10 ...: In [2]: np.array(C()) Out[2]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) (omitting __len__ results in the creation of an object array, consistently with the fact that the sequence protocol requires __len__). Meanwhile, I found a new way to segfault numpy :-) In [3]: class C: def __getitem__(self, i): if i < 10: return i else: raise IndexError def __len__(self): return 42 ...: In [4]: np.array(C()) Fatal Python error: Segmentation fault 2016-02-15 0:10 GMT-08:00 Nathaniel Smith : > On Sun, Feb 14, 2016 at 11:41 PM, Antony Lee > wrote: > > I wonder whether numpy is using the "old" iteration protocol (repeatedly > > calling x[i] for increasing i until StopIteration is reached?) A quick > > timing shows that it is indeed slower. > > Yeah, I'm pretty sure that np.array doesn't know anything about > "iterable", just about "sequence" (calling x[i] for 0 <= i < > i.__len__()). > > (See Sequence vs Iterable: > https://docs.python.org/3/library/collections.abc.html) > > Personally I'd like it if we could eventually make it so np.array > specifically looks for lists and only lists, because the way it has so > many different fallbacks right now creates all confusion between which > objects are elements. Compare: > > In [5]: np.array([(1, 2), (3, 4)]).shape > Out[5]: (2, 2) > > In [6]: np.array([(1, 2), (3, 4)], dtype="i4,i4").shape > Out[6]: (2,) > > -n > > -- > Nathaniel J. Smith -- https://vorpus.org > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jeffreback at gmail.com Mon Feb 15 11:24:51 2016 From: jeffreback at gmail.com (Jeff Reback) Date: Mon, 15 Feb 2016 11:24:51 -0500 Subject: [Numpy-discussion] Suggestion: special-case np.array(range(...)) to be faster In-Reply-To: References: Message-ID: just an FYI. pandas implemented a RangeIndex in upcoming 0.18.0, mainly for memory savings, see here , similar to how python range/xrange work. though there are substantial perf benefits, mainly with set operations, see here though didn't officially benchmark thes. 
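A sketch of the two ideas in play here (illustrative, not from the message above; the MyRange class is hypothetical and only shows the __array__ hook suggested earlier in the thread, and memory-reporting attributes should be checked against the pandas version in use):

import numpy as np
import pandas as pd

# RangeIndex stores just start/stop/step; a materialized integer index
# stores every value (~8 MB for a million int64 labels).
ri = pd.RangeIndex(1000000)
ii = pd.Index(np.arange(1000000))
print(ri.nbytes, ii.nbytes)

# Any range-like object can expose __array__ so that np.array()/np.asarray()
# bypasses the slow generic-sequence path and effectively calls np.arange().
class MyRange(object):                 # hypothetical, for illustration only
    def __init__(self, n):
        self.n = n
    def __array__(self, dtype=None):
        return np.arange(self.n, dtype=dtype)

print(np.asarray(MyRange(5)))          # -> [0 1 2 3 4]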
Jeff On Mon, Feb 15, 2016 at 11:13 AM, Antony Lee wrote: > Indeed: > > In [1]: class C: > def __getitem__(self, i): > if i < 10: return i > else: raise IndexError > def __len__(self): > return 10 > ...: > > In [2]: np.array(C()) > Out[2]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) > > > (omitting __len__ results in the creation of an object array, consistently > with the fact that the sequence protocol requires __len__). > Meanwhile, I found a new way to segfault numpy :-) > > In [3]: class C: > def __getitem__(self, i): > if i < 10: return i > else: raise IndexError > def __len__(self): > return 42 > ...: > > In [4]: np.array(C()) > Fatal Python error: Segmentation fault > > > 2016-02-15 0:10 GMT-08:00 Nathaniel Smith : > >> On Sun, Feb 14, 2016 at 11:41 PM, Antony Lee >> wrote: >> > I wonder whether numpy is using the "old" iteration protocol (repeatedly >> > calling x[i] for increasing i until StopIteration is reached?) A quick >> > timing shows that it is indeed slower. >> >> Yeah, I'm pretty sure that np.array doesn't know anything about >> "iterable", just about "sequence" (calling x[i] for 0 <= i < >> i.__len__()). >> >> (See Sequence vs Iterable: >> https://docs.python.org/3/library/collections.abc.html) >> >> Personally I'd like it if we could eventually make it so np.array >> specifically looks for lists and only lists, because the way it has so >> many different fallbacks right now creates all confusion between which >> objects are elements. Compare: >> >> In [5]: np.array([(1, 2), (3, 4)]).shape >> Out[5]: (2, 2) >> >> In [6]: np.array([(1, 2), (3, 4)], dtype="i4,i4").shape >> Out[6]: (2,) >> >> -n >> >> -- >> Nathaniel J. Smith -- https://vorpus.org >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Mon Feb 15 11:49:29 2016 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 15 Feb 2016 16:49:29 +0000 Subject: [Numpy-discussion] Suggestion: special-case np.array(range(...)) to be faster In-Reply-To: References: Message-ID: On Mon, Feb 15, 2016 at 4:24 PM, Jeff Reback wrote: > > just an FYI. > > pandas implemented a RangeIndex in upcoming 0.18.0, mainly for memory savings, > see here, similar to how python range/xrange work. > > though there are substantial perf benefits, mainly with set operations, see here > though didn't officially benchmark thes. Since it is a numpy-aware object (unlike the builtins), you can (and have, if I'm reading the code correctly) implement __array__() such that it does the correctly performant thing and call np.arange(). RangeIndex won't be adversely impacted by retaining the status quo. -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From rgutenk at email.arizona.edu Mon Feb 15 12:06:28 2016 From: rgutenk at email.arizona.edu (Gutenkunst, Ryan N - (rgutenk)) Date: Mon, 15 Feb 2016 17:06:28 +0000 Subject: [Numpy-discussion] Subclassing ma.masked_array, code broken after version 1.9 In-Reply-To: <56BF7A99.5040507@gmail.com> References: <56BF7A99.5040507@gmail.com> Message-ID: <278614E7-1BEE-4442-80CE-2E62CB2B7D3A@email.arizona.edu> Thank Jonathan, Good to confirm this isn't something inappropriate I'm doing. I give up transparency here in my application, so I'll just work around it. I leave it up to wiser numpy heads as to whether it's worth altering these numpy.ma functions to enable subclassing. Best, Ryan On Feb 13, 2016, at 11:48 AM, Jonathan Helmus wrote: > > > On 2/12/16 6:06 PM, Gutenkunst, Ryan N - (rgutenk) wrote: >> Hello all, >> >> In 2009 I developed an application that uses a subclass of masked arrays as a central data object. My subclass Spectrum possesses additional attributes along with many custom methods. It was very convenient to be able to use standard numpy functions for doing arithmetic on these objects. However, my code broke with numpy 1.10. I've finally had a chance to track down the problem, and I am hoping someone can suggest a workaround. >> >> See below for an example, which is as minimal as I could concoct. In this case, I have a Spectrum object that I'd like to take the logarithm of using numpy.ma.log, while preserving the value of the "folded" attribute. Up to numpy 1.9, this worked as expected, but in numpy 1.10 and 1.11 the attribute is not preserved. >> >> The change in behavior appears to be driven by a commit made on Jun 16th, 2015 by Marten van Kerkwijk. In particular, the commit changed _MaskedUnaryOperation.__call__ so that the result array's update_from method is no longer called with the input array as the argument, but rather the result of the numpy UnaryOperation (old line 889, new line 885). Because that UnaryOperation doesn't carry my new attribute, it's not present for update_from to access. I notice that similar changes were made to MaskedBinaryOperation, although I haven't tested those. It's not clear to me from the commit message why this particular change was made, so I don't know whether this new behavior is intentional. >> >> I know that subclassing arrays isn't widely encouraged, but it has been very convenient in my code. Is it still possible to subclass masked_array in such a way that functions like numpy.ma.log preserve additional attributes? If so, can someone point me in the right direction? >> >> Thanks! 
>> Ryan >> >> *** Begin example >> >> import numpy >> print 'Working with numpy {0}'.format(numpy.__version__) >> >> class Spectrum(numpy.ma.masked_array): >> def __new__(cls, data, mask=numpy.ma.nomask, data_folded=None): >> subarr = numpy.ma.masked_array(data, mask=mask, keep_mask=True, >> shrink=True) >> subarr = subarr.view(cls) >> subarr.folded = data_folded >> >> return subarr >> >> def __array_finalize__(self, obj): >> if obj is None: >> return >> numpy.ma.masked_array.__array_finalize__(self, obj) >> self.folded = getattr(obj, 'folded', 'unspecified') >> >> def _update_from(self, obj): >> print('Input to update_from: {0}'.format(repr(obj))) >> numpy.ma.masked_array._update_from(self, obj) >> self.folded = getattr(obj, 'folded', 'unspecified') >> >> def __repr__(self): >> return 'Spectrum(%s, folded=%s)'\ >> % (str(self), str(self.folded)) >> >> fs1 = Spectrum([2,3,4.], data_folded=True) >> fs2 = numpy.ma.log(fs1) >> print('fs2.folded status: {0}'.format(fs2.folded)) >> print('Expectation is True, achieved with numpy 1.9') >> >> *** End example >> >> -- >> Ryan Gutenkunst >> Assistant Professor >> Molecular and Cellular Biology >> University of Arizona >> phone: (520) 626-0569, office LSS 325 >> http://gutengroup.mcb.arizona.edu >> Latest paper: "Computationally efficient composite likelihood statistics for demographic inference" >> Molecular Biology and Evolution; http://dx.doi.org/10.1093/molbev/msv255 > Ryan, > > I'm not sure if you will be able to get this to work as in NumPy 1.9, but the __array_wrap__ method is intended to be the mechanism for subclasses to set their return type, adjust metadata, etc [1]. Unfortunately, the numpy.ma.log function does not seem to make a call to __array_wrap__ (at least in NumPy 1.10.2) although numpy.log does: > > from __future__ import print_function > import numpy > print('Working with numpy {0}'.format(numpy.__version__)) > > > class Spectrum(numpy.ma.masked_array): > def __new__(cls, data, mask=numpy.ma.nomask, data_folded=None): > subarr = numpy.ma.masked_array(data, mask=mask, keep_mask=True, > shrink=True) > subarr = subarr.view(cls) > subarr.folded = data_folded > > return subarr > > def __array_finalize__(self, obj): > if obj is None: > return > numpy.ma.masked_array.__array_finalize__(self, obj) > self.folded = getattr(obj, 'folded', 'unspecified') > > def __array_wrap__(self, out_arr, context=None): > print('__array_wrap__ called') > return numpy.ndarray.__array_wrap__(self, out_arr, context) > > def __repr__(self): > return 'Spectrum(%s, folded=%s)'\ > % (str(self), str(self.folded)) > > fs1 = Spectrum([2,3,4.], data_folded=True) > > print('numpy.ma.log:') > fs2 = numpy.ma.log(fs1) > print('fs2 type:', type(fs2)) > print('fs2.folded status: {0}'.format(fs2.folded)) > > print('numpy.log:') > fs3 = numpy.log(fs1) > print('fs3 type:', type(fs3)) > print('fs3.folded status: {0}'.format(fs3.folded)) > > ---- > $ python example.py > Working with numpy 1.10.2 > numpy.ma.log: > fs2 type: > fs2.folded status: unspecified > numpy.log: > __array_wrap__ called > fs3 type: > fs3.folded status: True > > > The change mentioned in the original message was made in pull request 3907 [2] in case anyone wants to have a look. 
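For reference, one possible workaround in the meantime, shown only as a sketch that reuses the Spectrum class and fs1 from the example above (spectrum_log is just a throwaway helper name), is to restore the extra attribute by hand after calling numpy.ma.log:

import numpy

def spectrum_log(spec):
    # numpy.ma.log no longer carries the custom attribute through, so view
    # the result back as Spectrum and copy the attribute over explicitly
    out = numpy.ma.log(spec).view(Spectrum)
    out.folded = getattr(spec, 'folded', 'unspecified')
    return out

fs2 = spectrum_log(fs1)
print(fs2.folded)   # True, as with numpy 1.9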
> > Cheers, > > - Jonathan Helmus > > [1] http://docs.scipy.org/doc/numpy-1.10.1/user/basics.subclassing.html#array-wrap-for-ufuncs > [2] https://github.com/numpy/numpy/pull/3907 > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion -- Ryan Gutenkunst Assistant Professor Molecular and Cellular Biology University of Arizona phone: (520) 626-0569, office LSS 325 http://gutengroup.mcb.arizona.edu Latest paper: "Computationally efficient composite likelihood statistics for demographic inference" Molecular Biology and Evolution; http://dx.doi.org/10.1093/molbev/msv255 From derek at astro.physik.uni-goettingen.de Mon Feb 15 12:51:35 2016 From: derek at astro.physik.uni-goettingen.de (Derek Homeier) Date: Mon, 15 Feb 2016 18:51:35 +0100 Subject: [Numpy-discussion] ANN: pandas v0.18.0rc1 - RELEASE CANDIDATE In-Reply-To: References: Message-ID: On 14 Feb 2016, at 1:53 am, Jeff Reback wrote: > > I'm pleased to announce the availability of the first release candidate of Pandas 0.18.0. > Please try this RC and report any issues here: Pandas Issues > We will be releasing officially in 1-2 weeks or so. > Thanks, looking forward to give this a try! Do you have a download link to the source for non-Conda users and packagers? Finding anything in the github source tarball repositories without having the exact path seems hopeless. Derek From jeffreback at gmail.com Mon Feb 15 12:55:57 2016 From: jeffreback at gmail.com (Jeff Reback) Date: Mon, 15 Feb 2016 12:55:57 -0500 Subject: [Numpy-discussion] ANN: pandas v0.18.0rc1 - RELEASE CANDIDATE In-Reply-To: References: Message-ID: https://github.com/pydata/pandas/releases/tag/v0.18.0rc1 On Mon, Feb 15, 2016 at 12:51 PM, Derek Homeier < derek at astro.physik.uni-goettingen.de> wrote: > On 14 Feb 2016, at 1:53 am, Jeff Reback wrote: > > > > I'm pleased to announce the availability of the first release candidate > of Pandas 0.18.0. > > Please try this RC and report any issues here: Pandas Issues > > We will be releasing officially in 1-2 weeks or so. > > > Thanks, looking forward to give this a try! > Do you have a download link to the source for non-Conda users and > packagers? > Finding anything in the github source tarball repositories without having > the exact > path seems hopeless. > > Derek > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Mon Feb 15 13:31:48 2016 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Mon, 15 Feb 2016 19:31:48 +0100 Subject: [Numpy-discussion] Subclassing ma.masked_array, code broken after version 1.9 In-Reply-To: <278614E7-1BEE-4442-80CE-2E62CB2B7D3A@email.arizona.edu> References: <56BF7A99.5040507@gmail.com> <278614E7-1BEE-4442-80CE-2E62CB2B7D3A@email.arizona.edu> Message-ID: <1455561108.21603.4.camel@sipsolutions.net> On Mo, 2016-02-15 at 17:06 +0000, Gutenkunst, Ryan N - (rgutenk) wrote: > Thank Jonathan, > > Good to confirm this isn't something inappropriate I'm doing. I give > up transparency here in my application, so I'll just work around it. > I leave it up to wiser numpy heads as to whether it's worth altering > these numpy.ma functions to enable subclassing. > Frankly, when it comes to masked array stuff, at least I am puzzled most of the time, so input is very welcome. Most of the people currently contributing, barely use masked arrays as far as I know, and sometimes it is hard to make good calls. 
It is a not the easiest code base and any feedback or nudging is important. A new release is about to come out, and if you feel it there is a serious regression, we may want to push for fixing it (or even better, you may have time to suggest a fix yourself). - Sebastian > Best, > Ryan > > On Feb 13, 2016, at 11:48 AM, Jonathan Helmus > wrote: > > > > > > > On 2/12/16 6:06 PM, Gutenkunst, Ryan N - (rgutenk) wrote: > > > Hello all, > > > > > > In 2009 I developed an application that uses a subclass of masked > > > arrays as a central data object. My subclass Spectrum possesses > > > additional attributes along with many custom methods. It was very > > > convenient to be able to use standard numpy functions for doing > > > arithmetic on these objects. However, my code broke with numpy > > > 1.10. I've finally had a chance to track down the problem, and I > > > am hoping someone can suggest a workaround. > > > > > > See below for an example, which is as minimal as I could concoct. > > > In this case, I have a Spectrum object that I'd like to take the > > > logarithm of using numpy.ma.log, while preserving the value of > > > the "folded" attribute. Up to numpy 1.9, this worked as expected, > > > but in numpy 1.10 and 1.11 the attribute is not preserved. > > > > > > The change in behavior appears to be driven by a commit made on > > > Jun 16th, 2015 by Marten van Kerkwijk. In particular, the commit > > > changed _MaskedUnaryOperation.__call__ so that the result array's > > > update_from method is no longer called with the input array as > > > the argument, but rather the result of the numpy UnaryOperation > > > (old line 889, new line 885). Because that UnaryOperation doesn't > > > carry my new attribute, it's not present for update_from to > > > access. I notice that similar changes were made to > > > MaskedBinaryOperation, although I haven't tested those. It's not > > > clear to me from the commit message why this particular change > > > was made, so I don't know whether this new behavior is > > > intentional. > > > > > > I know that subclassing arrays isn't widely encouraged, but it > > > has been very convenient in my code. Is it still possible to > > > subclass masked_array in such a way that functions like > > > numpy.ma.log preserve additional attributes? If so, can someone > > > point me in the right direction? > > > > > > Thanks! 
> > > Ryan > > > > > > *** Begin example > > > > > > import numpy > > > print 'Working with numpy {0}'.format(numpy.__version__) > > > > > > class Spectrum(numpy.ma.masked_array): > > > def __new__(cls, data, mask=numpy.ma.nomask, > > > data_folded=None): > > > subarr = numpy.ma.masked_array(data, mask=mask, > > > keep_mask=True, > > > shrink=True) > > > subarr = subarr.view(cls) > > > subarr.folded = data_folded > > > > > > return subarr > > > > > > def __array_finalize__(self, obj): > > > if obj is None: > > > return > > > numpy.ma.masked_array.__array_finalize__(self, obj) > > > self.folded = getattr(obj, 'folded', 'unspecified') > > > > > > def _update_from(self, obj): > > > print('Input to update_from: {0}'.format(repr(obj))) > > > numpy.ma.masked_array._update_from(self, obj) > > > self.folded = getattr(obj, 'folded', 'unspecified') > > > > > > def __repr__(self): > > > return 'Spectrum(%s, folded=%s)'\ > > > % (str(self), str(self.folded)) > > > > > > fs1 = Spectrum([2,3,4.], data_folded=True) > > > fs2 = numpy.ma.log(fs1) > > > print('fs2.folded status: {0}'.format(fs2.folded)) > > > print('Expectation is True, achieved with numpy 1.9') > > > > > > *** End example > > > > > > -- > > > Ryan Gutenkunst > > > Assistant Professor > > > Molecular and Cellular Biology > > > University of Arizona > > > phone: (520) 626-0569, office LSS 325 > > > http://gutengroup.mcb.arizona.edu > > > Latest paper: "Computationally efficient composite likelihood > > > statistics for demographic inference" > > > Molecular Biology and Evolution; > > > http://dx.doi.org/10.1093/molbev/msv255 > > Ryan, > > > > I'm not sure if you will be able to get this to work as in NumPy > > 1.9, but the __array_wrap__ method is intended to be the mechanism > > for subclasses to set their return type, adjust metadata, etc [1]. > > Unfortunately, the numpy.ma.log function does not seem to make a > > call to __array_wrap__ (at least in NumPy 1.10.2) although > > numpy.log does: > > > > from __future__ import print_function > > import numpy > > print('Working with numpy {0}'.format(numpy.__version__)) > > > > > > class Spectrum(numpy.ma.masked_array): > > def __new__(cls, data, mask=numpy.ma.nomask, data_folded=None): > > subarr = numpy.ma.masked_array(data, mask=mask, > > keep_mask=True, > > shrink=True) > > subarr = subarr.view(cls) > > subarr.folded = data_folded > > > > return subarr > > > > def __array_finalize__(self, obj): > > if obj is None: > > return > > numpy.ma.masked_array.__array_finalize__(self, obj) > > self.folded = getattr(obj, 'folded', 'unspecified') > > > > def __array_wrap__(self, out_arr, context=None): > > print('__array_wrap__ called') > > return numpy.ndarray.__array_wrap__(self, out_arr, context) > > > > def __repr__(self): > > return 'Spectrum(%s, folded=%s)'\ > > % (str(self), str(self.folded)) > > > > fs1 = Spectrum([2,3,4.], data_folded=True) > > > > print('numpy.ma.log:') > > fs2 = numpy.ma.log(fs1) > > print('fs2 type:', type(fs2)) > > print('fs2.folded status: {0}'.format(fs2.folded)) > > > > print('numpy.log:') > > fs3 = numpy.log(fs1) > > print('fs3 type:', type(fs3)) > > print('fs3.folded status: {0}'.format(fs3.folded)) > > > > ---- > > $ python example.py > > Working with numpy 1.10.2 > > numpy.ma.log: > > fs2 type: > > fs2.folded status: unspecified > > numpy.log: > > __array_wrap__ called > > fs3 type: > > fs3.folded status: True > > > > > > The change mentioned in the original message was made in pull > > request 3907 [2] in case anyone wants to have a look. 
> > > > Cheers, > > > > - Jonathan Helmus > > > > [1] http://docs.scipy.org/doc/numpy-1.10.1/user/basics.subclassing. > > html#array-wrap-for-ufuncs > > [2] https://github.com/numpy/numpy/pull/3907 > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- > Ryan Gutenkunst > Assistant Professor > Molecular and Cellular Biology > University of Arizona > phone: (520) 626-0569, office LSS 325 > http://gutengroup.mcb.arizona.edu > Latest paper: "Computationally efficient composite likelihood > statistics for demographic inference" > Molecular Biology and Evolution; > http://dx.doi.org/10.1093/molbev/msv255 > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From izaid at continuum.io Mon Feb 15 13:42:00 2016 From: izaid at continuum.io (Irwin Zaid) Date: Mon, 15 Feb 2016 12:42:00 -0600 Subject: [Numpy-discussion] DyND 0.7.1 Release Message-ID: Hello everyone, I'm pleased to announce the latest 0.7.1 release of DyND. The release notes are at https://github.com/libdynd/libdynd/blob/master/docs/release_notes.txt . Over the last 6 months, DyND has really matured a lot and many features that were "experimental" before are quite usable at the moment. At the same time, we still have bugs and issues to sort out, so I wouldn't claim DyND has reached stability just yet. Nevertheless, I'd encourage early adopters to try it out. You can install it easily using conda, via "conda install -c dynd/channel/dev dynd-python". Presently, the core DyND team consists of myself, Mark Wiebe, and Ian Henriksen, alongside several other contributors. Our focus is almost entirely on gaps in stability and usability -- the novel features in DyND that people find attractive (including missing values, ragged arrays, variable-sized strings, dynamic custom types, and overloadable callables, among others) are functioning pretty well now. NumPy compatibility and interoperability is very important to us, and is something we are constantly improving. We would eventually like to have an optional NumPy-like API that is fully consistent with what a NumPy user would expect, but we're not there just yet. The DyND team would be happy to answer any questions people have about DyND, like "what is working and what is not" or "what do we still need to do to hit DyND 1.0". All the best, Irwin -------------- next part -------------- An HTML attachment was scrubbed... URL: From derek at astro.physik.uni-goettingen.de Mon Feb 15 14:56:19 2016 From: derek at astro.physik.uni-goettingen.de (Derek Homeier) Date: Mon, 15 Feb 2016 20:56:19 +0100 Subject: [Numpy-discussion] ANN: pandas v0.18.0rc1 - RELEASE CANDIDATE In-Reply-To: References: Message-ID: On 15 Feb 2016, at 6:55 pm, Jeff Reback wrote: > > https://github.com/pydata/pandas/releases/tag/v0.18.0rc1 Ah, think I forgot about the ?releases? pages. Built on OS X 10.10 + 10.11 with python 2.7.11, 3.4.4 and 3.5.1. 17 errors in the test suite + 1 failure with python2.7 only; I can send you details on the errors if desired, the majority seems to be generic to a urllib problem with openssl on OS X anyway. 
Thanks for the good work Derek From vilanova at ac.upc.edu Mon Feb 15 16:28:12 2016 From: vilanova at ac.upc.edu (=?utf-8?Q?Llu=C3=ADs_Vilanova?=) Date: Mon, 15 Feb 2016 22:28:12 +0100 Subject: [Numpy-discussion] [Suggestion] Labelled Array In-Reply-To: (Benjamin Root's message of "Fri, 12 Feb 2016 09:49:51 -0500") References: Message-ID: <87d1rxu0ib.fsf@fimbulvetr.bsc.es> Benjamin Root writes: > Seems like you are talking about xarray: https://github.com/pydata/xarray Oh, I wasn't aware of xarray, but there's also this: https://people.gso.ac.upc.edu/vilanova/doc/sciexp2/user_guide/data.html#basic-indexing https://people.gso.ac.upc.edu/vilanova/doc/sciexp2/user_guide/data.html#dimension-oblivious-indexing Cheers, Lluis > Cheers! > Ben Root > On Fri, Feb 12, 2016 at 9:40 AM, S?rgio wrote: > Hello, > This is my first e-mail, I will try to make the idea simple. > Similar to masked array it would be interesting to use a label array to > guide operations. > Ex.: >>>> x > labelled_array(data = > [[0 1 2] > [3 4 5] > [6 7 8]], > label = > [[0 1 2] > [0 1 2] > [0 1 2]]) >>>> sum(x) > array([9, 12, 15]) > The operations would create a new axis for label indexing. > You could think of it as a collection of masks, one for each label. > I don't know a way to make something like this efficiently without a loop. > Just wondering... > S?rgio. > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion From pmhobson at gmail.com Mon Feb 15 17:31:12 2016 From: pmhobson at gmail.com (Paul Hobson) Date: Mon, 15 Feb 2016 14:31:12 -0800 Subject: [Numpy-discussion] [Suggestion] Labelled Array In-Reply-To: <87d1rxu0ib.fsf@fimbulvetr.bsc.es> References: <87d1rxu0ib.fsf@fimbulvetr.bsc.es> Message-ID: Just for posterity -- any future readers to this thread who need to do pandas-like on record arrays should look at matplotlib's mlab submodule. I've been in situations (::cough:: Esri production ::cough::) where I've had one hand tied behind my back and unable to install pandas. mlab was a big help there. https://goo.gl/M7Mi8B -paul On Mon, Feb 15, 2016 at 1:28 PM, Llu?s Vilanova wrote: > Benjamin Root writes: > > > Seems like you are talking about xarray: > https://github.com/pydata/xarray > > Oh, I wasn't aware of xarray, but there's also this: > > > https://people.gso.ac.upc.edu/vilanova/doc/sciexp2/user_guide/data.html#basic-indexing > > https://people.gso.ac.upc.edu/vilanova/doc/sciexp2/user_guide/data.html#dimension-oblivious-indexing > > > Cheers, > Lluis > > > > > Cheers! > > Ben Root > > > On Fri, Feb 12, 2016 at 9:40 AM, S?rgio wrote: > > > Hello, > > > > This is my first e-mail, I will try to make the idea simple. > > > > Similar to masked array it would be interesting to use a label array > to > > guide operations. > > > > Ex.: > >>>> x > > labelled_array(data = > > > [[0 1 2] > > [3 4 5] > > [6 7 8]], > > label = > > [[0 1 2] > > [0 1 2] > > [0 1 2]]) > > > >>>> sum(x) > > array([9, 12, 15]) > > > > The operations would create a new axis for label indexing. > > > > You could think of it as a collection of masks, one for each label. > > > > I don't know a way to make something like this efficiently without a > loop. > > Just wondering... > > > > S?rgio. 
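For the record, the per-label sum from the example above can be done without a Python loop in plain numpy, for instance with np.bincount; this is only a sketch of that approach, not a proposal for a new array type:

import numpy as np

data = np.arange(9).reshape(3, 3)
label = np.tile(np.arange(3), (3, 1))   # [[0, 1, 2], [0, 1, 2], [0, 1, 2]]

sums = np.bincount(label.ravel(), weights=data.ravel())
# sums is array([  9.,  12.,  15.])

# np.add.at is an alternative that accumulates into a preallocated output:
out = np.zeros(3)
np.add.at(out, label.ravel(), data.ravel())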
> > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Mon Feb 15 22:50:44 2016 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 15 Feb 2016 22:50:44 -0500 Subject: [Numpy-discussion] NumPy 1.11.0b3 released. In-Reply-To: References: <56BE4BEA.5040100@gmail.com> Message-ID: On Mon, Feb 15, 2016 at 10:46 PM, wrote: > > > On Fri, Feb 12, 2016 at 4:19 PM, Nathan Goldbaum > wrote: > >> https://github.com/numpy/numpy/blob/master/doc/release/1.11.0-notes.rst >> >> On Fri, Feb 12, 2016 at 3:17 PM, Andreas Mueller >> wrote: >> >>> Hi. >>> Where can I find the changelog? >>> It would be good for us to know which changes are done one purpos >>> without hunting through the issue tracker. >>> >>> Thanks, >>> Andy >>> >>> >>> On 02/09/2016 09:09 PM, Charles R Harris wrote: >>> >>> Hi All, >>> >>> I'm pleased to announce the release of NumPy 1.11.0b3. This beta >>> contains additional bug fixes as well as limiting the number of >>> FutureWarnings raised by assignment to masked array slices. One issue that >>> remains to be decided is whether or not to postpone raising an error for >>> floats used as indexes. Sources may be found on Sourceforge >>> and both >>> sources and OS X wheels are availble on pypi. Please test, hopefully this >>> will be that last beta needed. >>> >>> As a note on problems encountered, twine uploads continue to fail for >>> me, but there are still variations to try. The wheeluploader downloaded >>> wheels as it should, but could not upload them, giving the error message >>> "HTTPError: 413 Client Error: Request Entity Too Large for url: >>> https://www.python.org/pypi". Firefox also >>> complains that http://wheels.scipy.org is incorrectly configured with >>> an invalid certificate. 
>>> >>> Enjoy, >>> >>> Chuck >>> >>> >>> _______________________________________________ >>> NumPy-Discussion mailing listNumPy-Discussion at scipy.orghttps://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >>> >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> https://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >>> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > (try to send again) > > another indexing question: (not covered by unit test but showed up in > examples in statsmodels) > > > This works in numpy at least 1.9.2 and 1.6.1 (python 2.7, and python 3.4) > > >>> list(range(5))[np.array([0])] > 0 > > > > on numpy 0.11.0b2 (I'm not yet at b3) (python 3.4) > > I get the same exception as here but even if there is just one element > > > >>> list(range(5))[np.array([0, 1])] > Traceback (most recent call last): > File "", line 1, in > list(range(5))[np.array([0, 1])] > TypeError: only integer arrays with one element can be converted to an > index > > > the actual code uses pop on a python list with a return from > np.where(...)[0] that returns a one element int64 array > > Josef > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Mon Feb 15 23:05:54 2016 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 15 Feb 2016 21:05:54 -0700 Subject: [Numpy-discussion] NumPy 1.11.0b3 released. In-Reply-To: References: <56BE4BEA.5040100@gmail.com> Message-ID: On Mon, Feb 15, 2016 at 8:50 PM, wrote: > > > On Mon, Feb 15, 2016 at 10:46 PM, wrote: > > >> >> On Fri, Feb 12, 2016 at 4:19 PM, Nathan Goldbaum >> wrote: >> >>> https://github.com/numpy/numpy/blob/master/doc/release/1.11.0-notes.rst >>> >>> On Fri, Feb 12, 2016 at 3:17 PM, Andreas Mueller >>> wrote: >>> >>>> Hi. >>>> Where can I find the changelog? >>>> It would be good for us to know which changes are done one purpos >>>> without hunting through the issue tracker. >>>> >>>> Thanks, >>>> Andy >>>> >>>> >>>> On 02/09/2016 09:09 PM, Charles R Harris wrote: >>>> >>>> Hi All, >>>> >>>> I'm pleased to announce the release of NumPy 1.11.0b3. This beta >>>> contains additional bug fixes as well as limiting the number of >>>> FutureWarnings raised by assignment to masked array slices. One issue that >>>> remains to be decided is whether or not to postpone raising an error for >>>> floats used as indexes. Sources may be found on Sourceforge >>>> and >>>> both sources and OS X wheels are availble on pypi. Please test, hopefully >>>> this will be that last beta needed. >>>> >>>> As a note on problems encountered, twine uploads continue to fail for >>>> me, but there are still variations to try. The wheeluploader downloaded >>>> wheels as it should, but could not upload them, giving the error message >>>> "HTTPError: 413 Client Error: Request Entity Too Large for url: >>>> https://www.python.org/pypi". Firefox >>>> also complains that http://wheels.scipy.org is incorrectly configured >>>> with an invalid certificate. 
>>>> >>>> Enjoy, >>>> >>>> Chuck >>>> >>>> >>>> _______________________________________________ >>>> NumPy-Discussion mailing listNumPy-Discussion at scipy.orghttps://mail.scipy.org/mailman/listinfo/numpy-discussion >>>> >>>> >>>> >>>> _______________________________________________ >>>> NumPy-Discussion mailing list >>>> NumPy-Discussion at scipy.org >>>> https://mail.scipy.org/mailman/listinfo/numpy-discussion >>>> >>>> >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> https://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >>> >> > (try to send again) > > >> >> another indexing question: (not covered by unit test but showed up in >> examples in statsmodels) >> >> >> This works in numpy at least 1.9.2 and 1.6.1 (python 2.7, and python >> 3.4) >> >> >>> list(range(5))[np.array([0])] >> 0 >> >> >> >> on numpy 0.11.0b2 (I'm not yet at b3) (python 3.4) >> >> I get the same exception as here but even if there is just one element >> >> >> >>> list(range(5))[np.array([0, 1])] >> Traceback (most recent call last): >> File "", line 1, in >> list(range(5))[np.array([0, 1])] >> TypeError: only integer arrays with one element can be converted to an >> index >> > Looks like a misleading error message. Apparently it requires scalar arrays (ndim == 0) In [3]: list(range(5))[np.array(0)] Out[3]: 0 Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Mon Feb 15 23:09:13 2016 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 15 Feb 2016 21:09:13 -0700 Subject: [Numpy-discussion] Subclassing ma.masked_array, code broken after version 1.9 In-Reply-To: <278614E7-1BEE-4442-80CE-2E62CB2B7D3A@email.arizona.edu> References: <56BF7A99.5040507@gmail.com> <278614E7-1BEE-4442-80CE-2E62CB2B7D3A@email.arizona.edu> Message-ID: On Mon, Feb 15, 2016 at 10:06 AM, Gutenkunst, Ryan N - (rgutenk) < rgutenk at email.arizona.edu> wrote: > Thank Jonathan, > > Good to confirm this isn't something inappropriate I'm doing. I give up > transparency here in my application, so I'll just work around it. I leave > it up to wiser numpy heads as to whether it's worth altering these > numpy.ma functions to enable subclassing. > There is a known bug MaskedArrays that might account for this. It will hopefully be fixed in the next beta. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Mon Feb 15 23:15:49 2016 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 15 Feb 2016 23:15:49 -0500 Subject: [Numpy-discussion] NumPy 1.11.0b3 released. In-Reply-To: References: <56BE4BEA.5040100@gmail.com> Message-ID: On Mon, Feb 15, 2016 at 11:05 PM, Charles R Harris < charlesr.harris at gmail.com> wrote: > > > On Mon, Feb 15, 2016 at 8:50 PM, wrote: > >> >> >> On Mon, Feb 15, 2016 at 10:46 PM, wrote: >> >> >>> >>> On Fri, Feb 12, 2016 at 4:19 PM, Nathan Goldbaum >>> wrote: >>> >>>> https://github.com/numpy/numpy/blob/master/doc/release/1.11.0-notes.rst >>>> >>>> On Fri, Feb 12, 2016 at 3:17 PM, Andreas Mueller >>>> wrote: >>>> >>>>> Hi. >>>>> Where can I find the changelog? >>>>> It would be good for us to know which changes are done one purpos >>>>> without hunting through the issue tracker. >>>>> >>>>> Thanks, >>>>> Andy >>>>> >>>>> >>>>> On 02/09/2016 09:09 PM, Charles R Harris wrote: >>>>> >>>>> Hi All, >>>>> >>>>> I'm pleased to announce the release of NumPy 1.11.0b3. 
This beta >>>>> contains additional bug fixes as well as limiting the number of >>>>> FutureWarnings raised by assignment to masked array slices. One issue that >>>>> remains to be decided is whether or not to postpone raising an error for >>>>> floats used as indexes. Sources may be found on Sourceforge >>>>> and >>>>> both sources and OS X wheels are availble on pypi. Please test, hopefully >>>>> this will be that last beta needed. >>>>> >>>>> As a note on problems encountered, twine uploads continue to fail for >>>>> me, but there are still variations to try. The wheeluploader downloaded >>>>> wheels as it should, but could not upload them, giving the error message >>>>> "HTTPError: 413 Client Error: Request Entity Too Large for url: >>>>> https://www.python.org/pypi". Firefox >>>>> also complains that http://wheels.scipy.org is incorrectly configured >>>>> with an invalid certificate. >>>>> >>>>> Enjoy, >>>>> >>>>> Chuck >>>>> >>>>> >>>>> _______________________________________________ >>>>> NumPy-Discussion mailing listNumPy-Discussion at scipy.orghttps://mail.scipy.org/mailman/listinfo/numpy-discussion >>>>> >>>>> >>>>> >>>>> _______________________________________________ >>>>> NumPy-Discussion mailing list >>>>> NumPy-Discussion at scipy.org >>>>> https://mail.scipy.org/mailman/listinfo/numpy-discussion >>>>> >>>>> >>>> >>>> _______________________________________________ >>>> NumPy-Discussion mailing list >>>> NumPy-Discussion at scipy.org >>>> https://mail.scipy.org/mailman/listinfo/numpy-discussion >>>> >>>> >>> >> (try to send again) >> >> >>> >>> another indexing question: (not covered by unit test but showed up in >>> examples in statsmodels) >>> >>> >>> This works in numpy at least 1.9.2 and 1.6.1 (python 2.7, and python >>> 3.4) >>> >>> >>> list(range(5))[np.array([0])] >>> 0 >>> >>> >>> >>> on numpy 0.11.0b2 (I'm not yet at b3) (python 3.4) >>> >>> I get the same exception as here but even if there is just one element >>> >>> >>> >>> list(range(5))[np.array([0, 1])] >>> Traceback (most recent call last): >>> File "", line 1, in >>> list(range(5))[np.array([0, 1])] >>> TypeError: only integer arrays with one element can be converted to an >>> index >>> >> > Looks like a misleading error message. Apparently it requires scalar > arrays (ndim == 0) > > In [3]: list(range(5))[np.array(0)] > Out[3]: 0 > We have a newer version of essentially same function a second time that uses squeeze and that seems to work fine. Just to understand Why does this depend on the numpy version? I would have understood that this always failed, but this code worked for several years. https://github.com/statsmodels/statsmodels/issues/2817 Josef > > Chuck > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Mon Feb 15 23:31:39 2016 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 15 Feb 2016 21:31:39 -0700 Subject: [Numpy-discussion] NumPy 1.11.0b3 released. 
In-Reply-To: References: <56BE4BEA.5040100@gmail.com> Message-ID: On Mon, Feb 15, 2016 at 9:15 PM, wrote: > > > On Mon, Feb 15, 2016 at 11:05 PM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> >> >> On Mon, Feb 15, 2016 at 8:50 PM, wrote: >> >>> >>> >>> On Mon, Feb 15, 2016 at 10:46 PM, wrote: >>> >>> >>>> >>>> On Fri, Feb 12, 2016 at 4:19 PM, Nathan Goldbaum >>> > wrote: >>>> >>>>> https://github.com/numpy/numpy/blob/master/doc/release/1.11.0-notes.rst >>>>> >>>>> On Fri, Feb 12, 2016 at 3:17 PM, Andreas Mueller >>>>> wrote: >>>>> >>>>>> Hi. >>>>>> Where can I find the changelog? >>>>>> It would be good for us to know which changes are done one purpos >>>>>> without hunting through the issue tracker. >>>>>> >>>>>> Thanks, >>>>>> Andy >>>>>> >>>>>> >>>>>> On 02/09/2016 09:09 PM, Charles R Harris wrote: >>>>>> >>>>>> Hi All, >>>>>> >>>>>> I'm pleased to announce the release of NumPy 1.11.0b3. This beta >>>>>> contains additional bug fixes as well as limiting the number of >>>>>> FutureWarnings raised by assignment to masked array slices. One issue that >>>>>> remains to be decided is whether or not to postpone raising an error for >>>>>> floats used as indexes. Sources may be found on Sourceforge >>>>>> and >>>>>> both sources and OS X wheels are availble on pypi. Please test, hopefully >>>>>> this will be that last beta needed. >>>>>> >>>>>> As a note on problems encountered, twine uploads continue to fail for >>>>>> me, but there are still variations to try. The wheeluploader downloaded >>>>>> wheels as it should, but could not upload them, giving the error message >>>>>> "HTTPError: 413 Client Error: Request Entity Too Large for url: >>>>>> https://www.python.org/pypi". Firefox >>>>>> also complains that http://wheels.scipy.org is incorrectly >>>>>> configured with an invalid certificate. >>>>>> >>>>>> Enjoy, >>>>>> >>>>>> Chuck >>>>>> >>>>>> >>>>>> _______________________________________________ >>>>>> NumPy-Discussion mailing listNumPy-Discussion at scipy.orghttps://mail.scipy.org/mailman/listinfo/numpy-discussion >>>>>> >>>>>> >>>>>> >>>>>> _______________________________________________ >>>>>> NumPy-Discussion mailing list >>>>>> NumPy-Discussion at scipy.org >>>>>> https://mail.scipy.org/mailman/listinfo/numpy-discussion >>>>>> >>>>>> >>>>> >>>>> _______________________________________________ >>>>> NumPy-Discussion mailing list >>>>> NumPy-Discussion at scipy.org >>>>> https://mail.scipy.org/mailman/listinfo/numpy-discussion >>>>> >>>>> >>>> >>> (try to send again) >>> >>> >>>> >>>> another indexing question: (not covered by unit test but showed up in >>>> examples in statsmodels) >>>> >>>> >>>> This works in numpy at least 1.9.2 and 1.6.1 (python 2.7, and python >>>> 3.4) >>>> >>>> >>> list(range(5))[np.array([0])] >>>> 0 >>>> >>>> >>>> >>>> on numpy 0.11.0b2 (I'm not yet at b3) (python 3.4) >>>> >>>> I get the same exception as here but even if there is just one element >>>> >>>> >>>> >>> list(range(5))[np.array([0, 1])] >>>> Traceback (most recent call last): >>>> File "", line 1, in >>>> list(range(5))[np.array([0, 1])] >>>> TypeError: only integer arrays with one element can be converted to an >>>> index >>>> >>> >> Looks like a misleading error message. Apparently it requires scalar >> arrays (ndim == 0) >> >> In [3]: list(range(5))[np.array(0)] >> Out[3]: 0 >> > > > We have a newer version of essentially same function a second time that > uses squeeze and that seems to work fine. > > Just to understand > > Why does this depend on the numpy version? 
I would have understood that > this always failed, but this code worked for several years. > https://github.com/statsmodels/statsmodels/issues/2817 > It's part of the indexing cleanup. In [2]: list(range(5))[np.array([0])] /home/charris/.local/bin/ipython:1: VisibleDeprecationWarning: converting an array with ndim > 0 to an index will result in an error in the future #!/usr/bin/python Out[2]: 0 The use of multidimensional arrays as indexes is likely a coding error. Or so we hope... Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Tue Feb 16 00:09:25 2016 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 16 Feb 2016 00:09:25 -0500 Subject: [Numpy-discussion] NumPy 1.11.0b3 released. In-Reply-To: References: <56BE4BEA.5040100@gmail.com> Message-ID: On Mon, Feb 15, 2016 at 11:31 PM, Charles R Harris < charlesr.harris at gmail.com> wrote: > > > On Mon, Feb 15, 2016 at 9:15 PM, wrote: > >> >> >> On Mon, Feb 15, 2016 at 11:05 PM, Charles R Harris < >> charlesr.harris at gmail.com> wrote: >> >>> >>> >>> On Mon, Feb 15, 2016 at 8:50 PM, wrote: >>> >>>> >>>> >>>> On Mon, Feb 15, 2016 at 10:46 PM, wrote: >>>> >>>> >>>>> >>>>> On Fri, Feb 12, 2016 at 4:19 PM, Nathan Goldbaum < >>>>> nathan12343 at gmail.com> wrote: >>>>> >>>>>> >>>>>> https://github.com/numpy/numpy/blob/master/doc/release/1.11.0-notes.rst >>>>>> >>>>>> On Fri, Feb 12, 2016 at 3:17 PM, Andreas Mueller >>>>>> wrote: >>>>>> >>>>>>> Hi. >>>>>>> Where can I find the changelog? >>>>>>> It would be good for us to know which changes are done one purpos >>>>>>> without hunting through the issue tracker. >>>>>>> >>>>>>> Thanks, >>>>>>> Andy >>>>>>> >>>>>>> >>>>>>> On 02/09/2016 09:09 PM, Charles R Harris wrote: >>>>>>> >>>>>>> Hi All, >>>>>>> >>>>>>> I'm pleased to announce the release of NumPy 1.11.0b3. This beta >>>>>>> contains additional bug fixes as well as limiting the number of >>>>>>> FutureWarnings raised by assignment to masked array slices. One issue that >>>>>>> remains to be decided is whether or not to postpone raising an error for >>>>>>> floats used as indexes. Sources may be found on Sourceforge >>>>>>> and >>>>>>> both sources and OS X wheels are availble on pypi. Please test, hopefully >>>>>>> this will be that last beta needed. >>>>>>> >>>>>>> As a note on problems encountered, twine uploads continue to fail >>>>>>> for me, but there are still variations to try. The wheeluploader downloaded >>>>>>> wheels as it should, but could not upload them, giving the error message >>>>>>> "HTTPError: 413 Client Error: Request Entity Too Large for url: >>>>>>> https://www.python.org/pypi". Firefox >>>>>>> also complains that http://wheels.scipy.org is incorrectly >>>>>>> configured with an invalid certificate. 
>>>>>>> >>>>>>> Enjoy, >>>>>>> >>>>>>> Chuck >>>>>>> >>>>>>> >>>>>>> _______________________________________________ >>>>>>> NumPy-Discussion mailing listNumPy-Discussion at scipy.orghttps://mail.scipy.org/mailman/listinfo/numpy-discussion >>>>>>> >>>>>>> >>>>>>> >>>>>>> _______________________________________________ >>>>>>> NumPy-Discussion mailing list >>>>>>> NumPy-Discussion at scipy.org >>>>>>> https://mail.scipy.org/mailman/listinfo/numpy-discussion >>>>>>> >>>>>>> >>>>>> >>>>>> _______________________________________________ >>>>>> NumPy-Discussion mailing list >>>>>> NumPy-Discussion at scipy.org >>>>>> https://mail.scipy.org/mailman/listinfo/numpy-discussion >>>>>> >>>>>> >>>>> >>>> (try to send again) >>>> >>>> >>>>> >>>>> another indexing question: (not covered by unit test but showed up in >>>>> examples in statsmodels) >>>>> >>>>> >>>>> This works in numpy at least 1.9.2 and 1.6.1 (python 2.7, and python >>>>> 3.4) >>>>> >>>>> >>> list(range(5))[np.array([0])] >>>>> 0 >>>>> >>>>> >>>>> >>>>> on numpy 0.11.0b2 (I'm not yet at b3) (python 3.4) >>>>> >>>>> I get the same exception as here but even if there is just one element >>>>> >>>>> >>>>> >>> list(range(5))[np.array([0, 1])] >>>>> Traceback (most recent call last): >>>>> File "", line 1, in >>>>> list(range(5))[np.array([0, 1])] >>>>> TypeError: only integer arrays with one element can be converted to an >>>>> index >>>>> >>>> >>> Looks like a misleading error message. Apparently it requires scalar >>> arrays (ndim == 0) >>> >>> In [3]: list(range(5))[np.array(0)] >>> Out[3]: 0 >>> >> >> >> We have a newer version of essentially same function a second time that >> uses squeeze and that seems to work fine. >> >> Just to understand >> >> Why does this depend on the numpy version? I would have understood that >> this always failed, but this code worked for several years. >> https://github.com/statsmodels/statsmodels/issues/2817 >> > > It's part of the indexing cleanup. > > In [2]: list(range(5))[np.array([0])] > /home/charris/.local/bin/ipython:1: VisibleDeprecationWarning: converting > an array with ndim > 0 to an index will result in an error in the future > #!/usr/bin/python > Out[2]: 0 > > The use of multidimensional arrays as indexes is likely a coding error. Or > so we hope... > Thanks for the explanation Or, it forces everyone to watch out for the color of the ducks :) It's just a number, whether it's python scalar, numpy scalar, 1D or 2D. And once we squeeze, we cannot iterate over it anymore. This looks like the last problem with have in statsmodels master. Part of the reason that 0.10 hurt quite a bit is that we are using in statsmodels some of the grey zones so we don't have to commit to a specific usage. Even if a user or developer tries a "weird" case, it works for most of the results, but breaks in some unknown places. (In the current case a cryptic exception would be raised if the user has two constant columns in the regression. Which is fine for some usecases but not for every result.) Josef > > Chuck > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Tue Feb 16 00:13:28 2016 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 16 Feb 2016 00:13:28 -0500 Subject: [Numpy-discussion] NumPy 1.11.0b3 released. 
In-Reply-To: References: <56BE4BEA.5040100@gmail.com> Message-ID: On Tue, Feb 16, 2016 at 12:09 AM, wrote: > > > On Mon, Feb 15, 2016 at 11:31 PM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> >> >> On Mon, Feb 15, 2016 at 9:15 PM, wrote: >> >>> >>> >>> On Mon, Feb 15, 2016 at 11:05 PM, Charles R Harris < >>> charlesr.harris at gmail.com> wrote: >>> >>>> >>>> >>>> On Mon, Feb 15, 2016 at 8:50 PM, wrote: >>>> >>>>> >>>>> >>>>> On Mon, Feb 15, 2016 at 10:46 PM, wrote: >>>>> >>>>> >>>>>> >>>>>> On Fri, Feb 12, 2016 at 4:19 PM, Nathan Goldbaum < >>>>>> nathan12343 at gmail.com> wrote: >>>>>> >>>>>>> >>>>>>> https://github.com/numpy/numpy/blob/master/doc/release/1.11.0-notes.rst >>>>>>> >>>>>>> On Fri, Feb 12, 2016 at 3:17 PM, Andreas Mueller >>>>>>> wrote: >>>>>>> >>>>>>>> Hi. >>>>>>>> Where can I find the changelog? >>>>>>>> It would be good for us to know which changes are done one purpos >>>>>>>> without hunting through the issue tracker. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Andy >>>>>>>> >>>>>>>> >>>>>>>> On 02/09/2016 09:09 PM, Charles R Harris wrote: >>>>>>>> >>>>>>>> Hi All, >>>>>>>> >>>>>>>> I'm pleased to announce the release of NumPy 1.11.0b3. This beta >>>>>>>> contains additional bug fixes as well as limiting the number of >>>>>>>> FutureWarnings raised by assignment to masked array slices. One issue that >>>>>>>> remains to be decided is whether or not to postpone raising an error for >>>>>>>> floats used as indexes. Sources may be found on Sourceforge >>>>>>>> and >>>>>>>> both sources and OS X wheels are availble on pypi. Please test, hopefully >>>>>>>> this will be that last beta needed. >>>>>>>> >>>>>>>> As a note on problems encountered, twine uploads continue to fail >>>>>>>> for me, but there are still variations to try. The wheeluploader downloaded >>>>>>>> wheels as it should, but could not upload them, giving the error message >>>>>>>> "HTTPError: 413 Client Error: Request Entity Too Large for url: >>>>>>>> https://www.python.org/pypi". Firefox >>>>>>>> also complains that http://wheels.scipy.org is incorrectly >>>>>>>> configured with an invalid certificate. 
>>>>>>>> >>>>>>>> Enjoy, >>>>>>>> >>>>>>>> Chuck >>>>>>>> >>>>>>>> >>>>>>>> _______________________________________________ >>>>>>>> NumPy-Discussion mailing listNumPy-Discussion at scipy.orghttps://mail.scipy.org/mailman/listinfo/numpy-discussion >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> _______________________________________________ >>>>>>>> NumPy-Discussion mailing list >>>>>>>> NumPy-Discussion at scipy.org >>>>>>>> https://mail.scipy.org/mailman/listinfo/numpy-discussion >>>>>>>> >>>>>>>> >>>>>>> >>>>>>> _______________________________________________ >>>>>>> NumPy-Discussion mailing list >>>>>>> NumPy-Discussion at scipy.org >>>>>>> https://mail.scipy.org/mailman/listinfo/numpy-discussion >>>>>>> >>>>>>> >>>>>> >>>>> (try to send again) >>>>> >>>>> >>>>>> >>>>>> another indexing question: (not covered by unit test but showed up >>>>>> in examples in statsmodels) >>>>>> >>>>>> >>>>>> This works in numpy at least 1.9.2 and 1.6.1 (python 2.7, and >>>>>> python 3.4) >>>>>> >>>>>> >>> list(range(5))[np.array([0])] >>>>>> 0 >>>>>> >>>>>> >>>>>> >>>>>> on numpy 0.11.0b2 (I'm not yet at b3) (python 3.4) >>>>>> >>>>>> I get the same exception as here but even if there is just one element >>>>>> >>>>>> >>>>>> >>> list(range(5))[np.array([0, 1])] >>>>>> Traceback (most recent call last): >>>>>> File "", line 1, in >>>>>> list(range(5))[np.array([0, 1])] >>>>>> TypeError: only integer arrays with one element can be converted to >>>>>> an index >>>>>> >>>>> >>>> Looks like a misleading error message. Apparently it requires scalar >>>> arrays (ndim == 0) >>>> >>>> In [3]: list(range(5))[np.array(0)] >>>> Out[3]: 0 >>>> >>> >>> >>> We have a newer version of essentially same function a second time that >>> uses squeeze and that seems to work fine. >>> >>> Just to understand >>> >>> Why does this depend on the numpy version? I would have understood that >>> this always failed, but this code worked for several years. >>> https://github.com/statsmodels/statsmodels/issues/2817 >>> >> >> It's part of the indexing cleanup. >> >> In [2]: list(range(5))[np.array([0])] >> /home/charris/.local/bin/ipython:1: VisibleDeprecationWarning: converting >> an array with ndim > 0 to an index will result in an error in the future >> #!/usr/bin/python >> Out[2]: 0 >> >> The use of multidimensional arrays as indexes is likely a coding error. >> Or so we hope... >> > > Thanks for the explanation > > > Or, it forces everyone to watch out for the color of the ducks :) > > It's just a number, whether it's python scalar, numpy scalar, 1D or 2D. > And once we squeeze, we cannot iterate over it anymore. > > > This looks like the last problem with have in statsmodels master. > Part of the reason that 0.10 hurt quite a bit is that we are using in > statsmodels some of the grey zones so we don't have to commit to a specific > usage. Even if a user or developer tries a "weird" case, it works for most > of the results, but breaks in some unknown places. > > I meant 1.11 here. > (In the current case a cryptic exception would be raised if the user has > two constant columns in the regression. Which is fine for some usecases but > not for every result.) > > Josef > > >> >> Chuck >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From tcaswell at gmail.com Tue Feb 16 00:18:59 2016 From: tcaswell at gmail.com (Thomas Caswell) Date: Tue, 16 Feb 2016 05:18:59 +0000 Subject: [Numpy-discussion] cycler v.0.10.0 released Message-ID: Folks, I am happy to announce the next release of Cycler. This will become the minimal version for the upcoming mpl v2.0 release. http://matplotlib.org/cycler/ Feature release for `cycler`. This release includes a number of new features: - `Cycler` objecst learned to generate an `itertools.cycle` by calling them, a-la a generator. - `Cycler` objects learned to change the name of a key via the new `.change_key(old_key, new_key)` method. - `Cycler` objects learned how to compare each other and determine if they are equal or not (`==`). - `Cycler` objects learned how to join another `Cycler` to be concatenated into a singel longer `Cycler` via `concat` method of function. `A.concat(B)` or `concat(A, B)`. - The `cycler` factory function learned to construct a complex `Cycler` from iterables provided as keyword arguments. - `Cycler` objects learn do show their insides with the `by_key` method which returns a dictionary of lists (instead of an iterable of dictionaries). Tom -------------- next part -------------- An HTML attachment was scrubbed... URL: From jfoxrabinovitz at gmail.com Tue Feb 16 00:49:42 2016 From: jfoxrabinovitz at gmail.com (Joseph Fox-Rabinovitz) Date: Tue, 16 Feb 2016 00:49:42 -0500 Subject: [Numpy-discussion] Proposal to add `weights` to `np.percentile` and `np.median` Message-ID: I would like to add a `weights` keyword to `np.partition`, `np.percentile` and `np.median`. My reason for doing so is to to allow `np.histogram` to process automatic bin selection with weights. Currently, weights are not supported for the automatic bin selection and would be difficult to support in `auto` mode without having `np.percentile` support a `weights` keyword. I suspect that there are many other uses for such a feature. I have taken a preliminary look at the C implementation of the partition functions that are the basis for `partition`, `median` and `percentile`. I think that it would be possible to add versions (or just extend the functionality of existing ones) that check the ratio of the weights below the partition point to the total sum of the weights instead of just counting elements. One of the main advantages of such an implementation is that it would allow any real weights to be handled correctly, not just integers. Complex weights would not be supported. The purpose of this email is to see if anybody objects, has ideas or cares at all about this proposal before I spend a significant amount of time working on it. For example, did I miss any functions in my list? Regards, -Joe From robbmcleod at gmail.com Tue Feb 16 03:48:45 2016 From: robbmcleod at gmail.com (Robert McLeod) Date: Tue, 16 Feb 2016 09:48:45 +0100 Subject: [Numpy-discussion] Numexpr-3.0 proposal In-Reply-To: References: Message-ID: On Mon, Feb 15, 2016 at 7:28 AM, Ralf Gommers wrote: > > > On Sun, Feb 14, 2016 at 11:19 PM, Robert McLeod > wrote: > >> >> 4.) I took a stab at converting from distutils to setuputils but this >> seems challenging with numpy as a dependency. I wonder if anyone has tried >> monkey-patching so that setup.py build_ext uses distutils and then pass the >> interpreter.pyd/so as a data file, or some other such chicanery? >> > > Not sure what you mean, since numpexpr already uses setuptools: > https://github.com/pydata/numexpr/blob/master/setup.py#L22. 
What is the > real goal you're trying to achieve? > > This monkeypatching is a bad idea: > https://github.com/robbmcleod/numexpr/blob/numexpr-3.0/setup.py#L19. Both > setuptools and numpy.distutils already do that, and that's already one too > many. So you definitely don't want to add a third place.... You can use the > -j (--parallel) flag to numpy.distutils instead, see > http://docs.scipy.org/doc/numpy-dev/user/building.html#parallel-builds > > Ralf > Dear Ralf, Yes, this appears to be a bad idea. I was just trying to think about if I could use the more object-oriented approach that I am familiar with in setuptools to easily build wheels for Pypi. Thanks for the comments and links; I didn't know I could parallelize the numpy build. Robert -- Robert McLeod, Ph.D. Center for Cellular Imaging and Nano Analytics (C-CINA) Biozentrum der Universit?t Basel Mattenstrasse 26, 4058 Basel Work: +41.061.387.3225 robert.mcleod at unibas.ch robert.mcleod at bsse.ethz.ch robbmcleod at gmail.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From robbmcleod at gmail.com Tue Feb 16 04:04:17 2016 From: robbmcleod at gmail.com (Robert McLeod) Date: Tue, 16 Feb 2016 10:04:17 +0100 Subject: [Numpy-discussion] Fwd: Numexpr-3.0 proposal In-Reply-To: References: <029E5E0A-21A5-42B4-B306-A2C8C12DF73B@gmail.com> Message-ID: On Mon, Feb 15, 2016 at 10:43 AM, Gregor Thalhammer < gregor.thalhammer at gmail.com> wrote: > > Dear Robert, > > thanks for your effort on improving numexpr. Indeed, vectorized math > libraries (VML) can give a large boost in performance (~5x), except for a > couple of basic operations (add, mul, div), which current compilers are > able to vectorize automatically. With recent gcc even more functions are > vectorized, see https://sourceware.org/glibc/wiki/libmvec But you need > special flags depending on the platform (SSE, AVX present?), runtime > detection of processor capabilities would be nice for distributing > binaries. Some time ago, since I lost access to Intels MKL, I patched > numexpr to use Accelerate/Veclib on os x, which is preinstalled on each > mac, see https://github.com/geggo/numexpr.git veclib_support branch. > > As you increased the opcode size, I could imagine providing a bit to > switch (during runtime) between internal functions and vectorized ones, > that would be handy for tests and benchmarks. > Dear Gregor, Your suggestion to separate the opcode signature from the library used to execute it is very clever. Based on your suggestion, I think that the natural evolution of the opcodes is to specify them by function signature and library, using a two-level dict, i.e. numexpr.interpreter.opcodes['exp_f8f8f8'][gnu] = some_enum numexpr.interpreter.opcodes['exp_f8f8f8'][msvc] = some_enum +1 numexpr.interpreter.opcodes['exp_f8f8f8'][vml] = some_enum + 2 numexpr.interpreter.opcodes['exp_f8f8f8'][yeppp] = some_enum +3 I want to procedurally generate opcodes.cpp and interpreter_body.cpp. If I do it the way you suggested funccodes.hpp and all the many #define's regarding function codes in the interpreter can hopefully be removed and hence simplify the overall codebase. One could potentially take it a step further and plan (optimize) each expression, similar to what FFTW does with regards to matrix shape. 
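Spelled out as a runnable Python sketch (the library names and enum values here are placeholders, not the real numexpr internals), the lookup with a fallback to the internal implementation could look like this:

opcodes = {
    'exp_f8f8f8': {'gnu': 100, 'msvc': 101, 'vml': 102, 'yeppp': 103},
    # ... generated procedurally for every (function, signature) pair
}

def pick_opcode(signature, lib='gnu'):
    table = opcodes[signature]
    # fall back to the internal (gnu) implementation when the requested
    # library does not provide this signature
    return table.get(lib, table['gnu'])

pick_opcode('exp_f8f8f8', lib='vml')   # 102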
That is, the basic way to control the library would be with a singleton library argument, i.e.: result = ne.evaluate( "A*log(foo**2 / bar**2", lib=vml ) However, we could also permit a tuple to be passed in, where each element of the tuple reflects the library to use for each operation in the AST tree: result = ne.evaluate( "A*log(foo**2 / bar**2", lib=(gnu,gnu,gnu,yeppp,gnu) ) In this case the ops are (mul,mul,div,log,mul). The op-code picking is done by the Python side, and this tuple could be potentially optimized by numexpr rather than hand-optimized, by trying various permutations of the linked C math libraries. The wisdom from the planning could be pickled and saved in a wisdom file. Currently Numexpr has cacheDict in util.py but there's no reason this can't be pickled and saved to disk. I've done a similar thing by creating wrappers for PyFFTW already. Robert -- Robert McLeod, Ph.D. Center for Cellular Imaging and Nano Analytics (C-CINA) Biozentrum der Universit?t Basel Mattenstrasse 26, 4058 Basel Work: +41.061.387.3225 robert.mcleod at unibas.ch robert.mcleod at bsse.ethz.ch robbmcleod at gmail.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From faltet at gmail.com Tue Feb 16 04:52:00 2016 From: faltet at gmail.com (Francesc Alted) Date: Tue, 16 Feb 2016 10:52:00 +0100 Subject: [Numpy-discussion] Fwd: Numexpr-3.0 proposal In-Reply-To: References: <029E5E0A-21A5-42B4-B306-A2C8C12DF73B@gmail.com> Message-ID: 2016-02-16 10:04 GMT+01:00 Robert McLeod : > On Mon, Feb 15, 2016 at 10:43 AM, Gregor Thalhammer < > gregor.thalhammer at gmail.com> wrote: > >> >> Dear Robert, >> >> thanks for your effort on improving numexpr. Indeed, vectorized math >> libraries (VML) can give a large boost in performance (~5x), except for a >> couple of basic operations (add, mul, div), which current compilers are >> able to vectorize automatically. With recent gcc even more functions are >> vectorized, see https://sourceware.org/glibc/wiki/libmvec But you need >> special flags depending on the platform (SSE, AVX present?), runtime >> detection of processor capabilities would be nice for distributing >> binaries. Some time ago, since I lost access to Intels MKL, I patched >> numexpr to use Accelerate/Veclib on os x, which is preinstalled on each >> mac, see https://github.com/geggo/numexpr.git veclib_support branch. >> >> As you increased the opcode size, I could imagine providing a bit to >> switch (during runtime) between internal functions and vectorized ones, >> that would be handy for tests and benchmarks. >> > > Dear Gregor, > > Your suggestion to separate the opcode signature from the library used to > execute it is very clever. Based on your suggestion, I think that the > natural evolution of the opcodes is to specify them by function signature > and library, using a two-level dict, i.e. > > numexpr.interpreter.opcodes['exp_f8f8f8'][gnu] = some_enum > numexpr.interpreter.opcodes['exp_f8f8f8'][msvc] = some_enum +1 > numexpr.interpreter.opcodes['exp_f8f8f8'][vml] = some_enum + 2 > numexpr.interpreter.opcodes['exp_f8f8f8'][yeppp] = some_enum +3 > Yes, by using a two level dictionary you can access the functions implementing opcodes much faster and hence you can add much more opcodes without too much slow-down. > > I want to procedurally generate opcodes.cpp and interpreter_body.cpp. 
If > I do it the way you suggested funccodes.hpp and all the many #define's > regarding function codes in the interpreter can hopefully be removed and > hence simplify the overall codebase. One could potentially take it a step > further and plan (optimize) each expression, similar to what FFTW does with > regards to matrix shape. That is, the basic way to control the library > would be with a singleton library argument, i.e.: > > result = ne.evaluate( "A*log(foo**2 / bar**2", lib=vml ) > > However, we could also permit a tuple to be passed in, where each element > of the tuple reflects the library to use for each operation in the AST tree: > > result = ne.evaluate( "A*log(foo**2 / bar**2", lib=(gnu,gnu,gnu,yeppp,gnu) > ) > > In this case the ops are (mul,mul,div,log,mul). The op-code picking is > done by the Python side, and this tuple could be potentially optimized by > numexpr rather than hand-optimized, by trying various permutations of the > linked C math libraries. The wisdom from the planning could be pickled and > saved in a wisdom file. Currently Numexpr has cacheDict in util.py but > there's no reason this can't be pickled and saved to disk. I've done a > similar thing by creating wrappers for PyFFTW already. > I like the idea of various permutations of linked C math libraries to be probed by numexpr during the initial iteration and then cached somehow. That will probably require run-time detection of available C math libraries (think that a numexpr binary will be able to run on different machines with different libraries and computing capabilities), but in exchange, it will allow for the fastest execution paths independently of the machine that runs the code. -- Francesc Alted -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Tue Feb 16 07:05:31 2016 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Tue, 16 Feb 2016 13:05:31 +0100 Subject: [Numpy-discussion] NumPy 1.11.0b3 released. In-Reply-To: References: <56BE4BEA.5040100@gmail.com> Message-ID: <1455624331.30480.6.camel@sipsolutions.net> On Di, 2016-02-16 at 00:13 -0500, josef.pktd at gmail.com wrote: > > > On Tue, Feb 16, 2016 at 12:09 AM, wrote: > > > > > > > > > > Or, it forces everyone to watch out for the color of the ducks :) > > > > It's just a number, whether it's python scalar, numpy scalar, 1D or > > 2D. > > And once we squeeze, we cannot iterate over it anymore. > > > > > > This looks like the last problem with have in statsmodels master. > > Part of the reason that 0.10 hurt quite a bit is that we are using > > in statsmodels some of the grey zones so we don't have to commit to > > a specific usage. Even if a user or developer tries a "weird" case, > > it works for most of the results, but breaks in some unknown > > places. > > > > > I meant 1.11 here. > The reason for this part is that `arr[np.array([1])]` is very different from `arr[np.array(1)]`. For `list[np.array([1])]` if you allow `operator.index(np.array([1]))` you will not get equivalent results for lists and arrays. The normal array result cannot work for lists. We had open bug reports about it. Of course I dislike it in any case ;), but that is the reasoning behind being a bit more restrictive for `__index__`. - Sebastian > > (In the current case a cryptic exception would be raised if the > > user has two constant columns in the regression. Which is fine for > > some usecases but not for every result.) 
> > > > Josef > > > > > > > > Chuck > > > > > > _______________________________________________ > > > NumPy-Discussion mailing list > > > NumPy-Discussion at scipy.org > > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From filaboia at gmail.com Tue Feb 16 09:05:51 2016 From: filaboia at gmail.com (=?UTF-8?Q?S=C3=A9rgio?=) Date: Tue, 16 Feb 2016 12:05:51 -0200 Subject: [Numpy-discussion] [Suggestion] Labelled Array Message-ID: Just something I tried with pandas: >>> image array([[[ 0, 1, 2, 3, 4], [ 5, 6, 7, 8, 9], [10, 11, 12, 13, 14], [15, 16, 17, 18, 19]], [[20, 21, 22, 23, 24], [25, 26, 27, 28, 29], [30, 31, 32, 33, 34], [35, 36, 37, 38, 39]], [[40, 41, 42, 43, 44], [45, 46, 47, 48, 49], [50, 51, 52, 53, 54], [55, 56, 57, 58, 59]]]) >>> label array([[0, 1, 2, 3, 4], [1, 2, 3, 4, 5], [2, 3, 4, 5, 6], [3, 4, 5, 6, 7]]) >>> dt = pd.DataFrame(np.vstack((label.ravel(), image.reshape(3, 20))).T) >>> labelled_image = dt.groupby(0) >>> labelled_image.mean().values array([[ 0, 20, 40], [ 3, 23, 43], [ 6, 26, 46], [ 9, 29, 49], [10, 30, 50], [13, 33, 53], [16, 36, 56], [19, 39, 59]]) Sergio > Date: Sat, 13 Feb 2016 22:41:13 -0500 > From: Allan Haldane > To: numpy-discussion at scipy.org > Subject: Re: [Numpy-discussion] [Suggestion] Labelled Array > Message-ID: <56BFF759.7010505 at gmail.com> > Content-Type: text/plain; charset=windows-1252; format=flowed > > Impressive! > > Possibly there's still a case for including a 'groupby' function in > numpy itself since it's a generally useful operation, but I do see less > of a need given the nice pandas functionality. > > At least, next time someone asks a stackoverflow question like the ones > below someone should tell them to use pandas! > > (copied from my gist for future list reference). > > http://stackoverflow.com/questions/4373631/sum-array-by-number-in-numpy > > http://stackoverflow.com/questions/31483912/split-numpy-array-according-to-values-in-the-array-a-condition/31484134#31484134 > > http://stackoverflow.com/questions/31863083/python-split-numpy-array-based-on-values-in-the-array > > http://stackoverflow.com/questions/28599405/splitting-an-array-into-two-smaller-arrays-in-python > > http://stackoverflow.com/questions/7662458/how-to-split-an-array-according-to-a-condition-in-numpy > > Allan > > > On 02/13/2016 01:39 PM, Jeff Reback wrote: > > In [10]: pd.options.display.max_rows=10 > > > > In [13]: np.random.seed(1234) > > > > In [14]: c = np.random.randint(0,32,size=100000) > > > > In [15]: v = np.arange(100000) > > > > In [16]: df = DataFrame({'v' : v, 'c' : c}) > > > > In [17]: df > > Out[17]: > > c v > > 0 15 0 > > 1 19 1 > > 2 6 2 > > 3 21 3 > > 4 12 4 > > ... .. ... > > 99995 7 99995 > > 99996 2 99996 > > 99997 27 99997 > > 99998 28 99998 > > 99999 7 99999 > > > > [100000 rows x 2 columns] > > > > In [19]: df.groupby('c').count() > > Out[19]: > > v > > c > > 0 3136 > > 1 3229 > > 2 3093 > > 3 3121 > > 4 3041 > > .. ... 
> > 27 3128 > > 28 3063 > > 29 3147 > > 30 3073 > > 31 3090 > > > > [32 rows x 1 columns] > > > > In [20]: %timeit df.groupby('c').count() > > 100 loops, best of 3: 2 ms per loop > > > > In [21]: %timeit df.groupby('c').mean() > > 100 loops, best of 3: 2.39 ms per loop > > > > In [22]: df.groupby('c').mean() > > Out[22]: > > v > > c > > 0 49883.384885 > > 1 50233.692165 > > 2 48634.116069 > > 3 50811.743992 > > 4 50505.368629 > > .. ... > > 27 49715.349425 > > 28 50363.501469 > > 29 50485.395933 > > 30 50190.155223 > > 31 50691.041748 > > > > [32 rows x 1 columns] > > > > > > On Sat, Feb 13, 2016 at 1:29 PM, > > wrote: > > > > > > > > On Sat, Feb 13, 2016 at 1:01 PM, Allan Haldane > > > wrote: > > > > Sorry, to reply to myself here, but looking at it with fresh > > eyes maybe the performance of the naive version isn't too bad. > > Here's a comparison of the naive vs a better implementation: > > > > def split_classes_naive(c, v): > > return [v[c == u] for u in unique(c)] > > > > def split_classes(c, v): > > perm = c.argsort() > > csrt = c[perm] > > div = where(csrt[1:] != csrt[:-1])[0] + 1 > > return [v[x] for x in split(perm, div)] > > > > >>> c = randint(0,32,size=100000) > > >>> v = arange(100000) > > >>> %timeit split_classes_naive(c,v) > > 100 loops, best of 3: 8.4 ms per loop > > >>> %timeit split_classes(c,v) > > 100 loops, best of 3: 4.79 ms per loop > > > > > > The usecases I recently started to target for similar things is 1 > > Million or more rows and 10000 uniques in the labels. > > The second version should be faster for large number of uniques, I > > guess. > > > > Overall numpy is falling far behind pandas in terms of simple > > groupby operations. bincount and histogram (IIRC) worked for some > > cases but are rather limited. > > > > reduce_at looks nice for cases where it applies. > > > > In contrast to the full sized labels in the original post, I only > > know of applications where the labels are 1-D corresponding to rows > > or columns. > > > > Josef > > > > > > In any case, maybe it is useful to Sergio or others. > > > > Allan > > > > > > On 02/13/2016 12:11 PM, Allan Haldane wrote: > > > > I've had a pretty similar idea for a new indexing function > > 'split_classes' which would help in your case, which > > essentially does > > > > def split_classes(c, v): > > return [v[c == u] for u in unique(c)] > > > > Your example could be coded as > > > > >>> [sum(c) for c in split_classes(label, data)] > > [9, 12, 15] > > > > I feel I've come across the need for such a function often > > enough that > > it might be generally useful to people as part of numpy. The > > implementation of split_classes above has pretty poor > > performance > > because it creates many temporary boolean arrays, so my plan > > for a PR > > was to have a speedy version of it that uses a single pass > > through v. > > (I often wanted to use this function on large datasets). > > > > If anyone has any comments on the idea (good idea. bad > > idea?) I'd love > > to hear. > > > > I have some further notes and examples here: > > https://gist.github.com/ahaldane/1e673d2fe6ffe0be4f21 > > > > Allan > > > > On 02/12/2016 09:40 AM, S?rgio wrote: > > > > Hello, > > > > This is my first e-mail, I will try to make the idea > simple. > > > > Similar to masked array it would be interesting to use a > > label array to > > guide operations. 
> > > > Ex.: > > >>> x > > labelled_array(data = > > [[0 1 2] > > [3 4 5] > > [6 7 8]], > > label = > > [[0 1 2] > > [0 1 2] > > [0 1 2]]) > > > > >>> sum(x) > > array([9, 12, 15]) > > > > The operations would create a new axis for label > indexing. > > > > You could think of it as a collection of masks, one for > > each label. > > > > I don't know a way to make something like this > > efficiently without a > > loop. Just wondering... > > > > S?rgio. > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From antony.lee at berkeley.edu Tue Feb 16 13:32:24 2016 From: antony.lee at berkeley.edu (Antony Lee) Date: Tue, 16 Feb 2016 10:32:24 -0800 Subject: [Numpy-discussion] Proposal to add `weights` to `np.percentile` and `np.median` In-Reply-To: References: Message-ID: See earlier discussion here: https://github.com/numpy/numpy/issues/6326 Basically, na?vely sorting may be faster than a not-so-optimized version of quickselect. Antony 2016-02-15 21:49 GMT-08:00 Joseph Fox-Rabinovitz : > I would like to add a `weights` keyword to `np.partition`, > `np.percentile` and `np.median`. My reason for doing so is to to allow > `np.histogram` to process automatic bin selection with weights. > Currently, weights are not supported for the automatic bin selection > and would be difficult to support in `auto` mode without having > `np.percentile` support a `weights` keyword. I suspect that there are > many other uses for such a feature. > > I have taken a preliminary look at the C implementation of the > partition functions that are the basis for `partition`, `median` and > `percentile`. I think that it would be possible to add versions (or > just extend the functionality of existing ones) that check the ratio > of the weights below the partition point to the total sum of the > weights instead of just counting elements. > > One of the main advantages of such an implementation is that it would > allow any real weights to be handled correctly, not just integers. > Complex weights would not be supported. > > The purpose of this email is to see if anybody objects, has ideas or > cares at all about this proposal before I spend a significant amount > of time working on it. For example, did I miss any functions in my > list? > > Regards, > > -Joe > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
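For concreteness, the sort-versus-selection comparison referred to above can be sketched as follows (illustrative only: the array size is an arbitrary assumption, and np.partition stands in for a quickselect-style approach):

import numpy as np

x = np.random.rand(1000000)
k = x.size // 2

# Full sort: O(n log n), but with a very well optimized inner loop.
med_sort = np.sort(x)[k]

# Introselect-based partition: O(n) on average; only guarantees that the
# element at position k is the one that would land there in sorted order.
med_part = np.partition(x, k)[k]

assert med_sort == med_part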
URL: From jfoxrabinovitz at gmail.com Tue Feb 16 13:41:30 2016 From: jfoxrabinovitz at gmail.com (Joseph Fox-Rabinovitz) Date: Tue, 16 Feb 2016 13:41:30 -0500 Subject: [Numpy-discussion] Proposal to add `weights` to `np.percentile` and `np.median` In-Reply-To: References: Message-ID: Thanks for pointing me to that. I had something a bit different in mind but that definitely looks like a good start. On Tue, Feb 16, 2016 at 1:32 PM, Antony Lee wrote: > See earlier discussion here: https://github.com/numpy/numpy/issues/6326 > Basically, na?vely sorting may be faster than a not-so-optimized version of > quickselect. > > Antony > > 2016-02-15 21:49 GMT-08:00 Joseph Fox-Rabinovitz : >> >> I would like to add a `weights` keyword to `np.partition`, >> `np.percentile` and `np.median`. My reason for doing so is to to allow >> `np.histogram` to process automatic bin selection with weights. >> Currently, weights are not supported for the automatic bin selection >> and would be difficult to support in `auto` mode without having >> `np.percentile` support a `weights` keyword. I suspect that there are >> many other uses for such a feature. >> >> I have taken a preliminary look at the C implementation of the >> partition functions that are the basis for `partition`, `median` and >> `percentile`. I think that it would be possible to add versions (or >> just extend the functionality of existing ones) that check the ratio >> of the weights below the partition point to the total sum of the >> weights instead of just counting elements. >> >> One of the main advantages of such an implementation is that it would >> allow any real weights to be handled correctly, not just integers. >> Complex weights would not be supported. >> >> The purpose of this email is to see if anybody objects, has ideas or >> cares at all about this proposal before I spend a significant amount >> of time working on it. For example, did I miss any functions in my >> list? >> >> Regards, >> >> -Joe >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > From josef.pktd at gmail.com Tue Feb 16 14:39:42 2016 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 16 Feb 2016 14:39:42 -0500 Subject: [Numpy-discussion] Proposal to add `weights` to `np.percentile` and `np.median` In-Reply-To: References: Message-ID: On Tue, Feb 16, 2016 at 1:41 PM, Joseph Fox-Rabinovitz < jfoxrabinovitz at gmail.com> wrote: > Thanks for pointing me to that. I had something a bit different in > mind but that definitely looks like a good start. > > On Tue, Feb 16, 2016 at 1:32 PM, Antony Lee > wrote: > > See earlier discussion here: https://github.com/numpy/numpy/issues/6326 > > Basically, na?vely sorting may be faster than a not-so-optimized version > of > > quickselect. > > > > Antony > > > > 2016-02-15 21:49 GMT-08:00 Joseph Fox-Rabinovitz < > jfoxrabinovitz at gmail.com>: > >> > >> I would like to add a `weights` keyword to `np.partition`, > >> `np.percentile` and `np.median`. My reason for doing so is to to allow > >> `np.histogram` to process automatic bin selection with weights. 
> >> Currently, weights are not supported for the automatic bin selection > >> and would be difficult to support in `auto` mode without having > >> `np.percentile` support a `weights` keyword. I suspect that there are > >> many other uses for such a feature. > >> > >> I have taken a preliminary look at the C implementation of the > >> partition functions that are the basis for `partition`, `median` and > >> `percentile`. I think that it would be possible to add versions (or > >> just extend the functionality of existing ones) that check the ratio > >> of the weights below the partition point to the total sum of the > >> weights instead of just counting elements. > >> > >> One of the main advantages of such an implementation is that it would > >> allow any real weights to be handled correctly, not just integers. > >> Complex weights would not be supported. > >> > >> The purpose of this email is to see if anybody objects, has ideas or > >> cares at all about this proposal before I spend a significant amount > >> of time working on it. For example, did I miss any functions in my > >> list? > >> > >> Regards, > >> > >> -Joe > >> _______________________________________________ > >> NumPy-Discussion mailing list > >> NumPy-Discussion at scipy.org > >> https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > statsmodels just got weighted quantiles https://github.com/statsmodels/statsmodels/pull/2707 I didn't try to figure out it's computational efficiency, and we would gladly delegate to whatever fast algorithm would be in numpy. Josef -------------- next part -------------- An HTML attachment was scrubbed... URL: From Christian.BERGER at 3ds.com Tue Feb 16 14:39:55 2016 From: Christian.BERGER at 3ds.com (BERGER Christian) Date: Tue, 16 Feb 2016 19:39:55 +0000 Subject: [Numpy-discussion] building NumPy with gcc if Python was built with icc?!? Message-ID: <867172D4AC0F194AACCDD3853364A34882F2AF@AG-DCC-MBX13.dsone.3ds.com> Hi All, Here's a potentially dumb question: is it possible to build NumPy with gcc, if python was built with icc? Right now, the build is failing in the toolchain check phase, because gcc doesn't know how to handle icc-specific c flags (like -fp-model, prec-sqrt, ...) In our environment we're providing an embedded python that our customers should be able to use and extend with 3rd party modules (like numpy). Problem is that our sw is built using icc, but we don't want to force our customers to do the same and we also don't want to build every possible 3rd party module for our customers. Thanks for your help, Christian This email and any attachments are intended solely for the use of the individual or entity to whom it is addressed and may be confidential and/or privileged. If you are not one of the named recipients or have received this email in error, (i) you should not read, disclose, or copy it, (ii) please notify sender of your receipt by reply email and delete this email and all attachments, (iii) Dassault Systemes does not accept or assume any liability or responsibility for any use of or reliance on this email. 
For other languages, go to http://www.3ds.com/terms/email-disclaimer -------------- next part -------------- An HTML attachment was scrubbed... URL: From gfyoung17 at gmail.com Tue Feb 16 14:44:54 2016 From: gfyoung17 at gmail.com (G Young) Date: Tue, 16 Feb 2016 19:44:54 +0000 Subject: [Numpy-discussion] building NumPy with gcc if Python was built with icc?!? In-Reply-To: <867172D4AC0F194AACCDD3853364A34882F2AF@AG-DCC-MBX13.dsone.3ds.com> References: <867172D4AC0F194AACCDD3853364A34882F2AF@AG-DCC-MBX13.dsone.3ds.com> Message-ID: I'm not sure about anyone else, but having been playing around with both gcc and icc, I'm afraid you might be out of luck. Is there any reason why you can't use a Python distribution built with gcc? On Tue, Feb 16, 2016 at 7:39 PM, BERGER Christian wrote: > Hi All, > > > > Here's a potentially dumb question: is it possible to build NumPy with > gcc, if python was built with icc? > > Right now, the build is failing in the toolchain check phase, because gcc > doesn't know how to handle icc-specific c flags (like -fp-model, prec-sqrt, > ...) > > In our environment we're providing an embedded python that our customers > should be able to use and extend with 3rd party modules (like numpy). > Problem is that our sw is built using icc, but we don't want to force our > customers to do the same and we also don't want to build every possible 3rd > party module for our customers. > > > > Thanks for your help, > > Christian > > > > This email and any attachments are intended solely for the use of the > individual or entity to whom it is addressed and may be confidential and/or > privileged. > > If you are not one of the named recipients or have received this email in > error, > > (i) you should not read, disclose, or copy it, > > (ii) please notify sender of your receipt by reply email and delete this > email and all attachments, > > (iii) Dassault Systemes does not accept or assume any liability or > responsibility for any use of or reliance on this email. > > For other languages, go to http://www.3ds.com/terms/email-disclaimer > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jfoxrabinovitz at gmail.com Tue Feb 16 14:48:26 2016 From: jfoxrabinovitz at gmail.com (Joseph Fox-Rabinovitz) Date: Tue, 16 Feb 2016 14:48:26 -0500 Subject: [Numpy-discussion] Proposal to add `weights` to `np.percentile` and `np.median` In-Reply-To: References: Message-ID: Please correct me if I misunderstood, but the code in that commit is doing a full sort, somewhat similar to what `scipy.stats.scoreatpercentile`. If that is correct, I will run some benchmarks first, but I think there is value to going forward with a numpy version that extends the current partitioning scheme. - Joe On Tue, Feb 16, 2016 at 2:39 PM, wrote: > > > On Tue, Feb 16, 2016 at 1:41 PM, Joseph Fox-Rabinovitz > wrote: >> >> Thanks for pointing me to that. I had something a bit different in >> mind but that definitely looks like a good start. >> >> On Tue, Feb 16, 2016 at 1:32 PM, Antony Lee >> wrote: >> > See earlier discussion here: https://github.com/numpy/numpy/issues/6326 >> > Basically, na?vely sorting may be faster than a not-so-optimized version >> > of >> > quickselect. 
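To make the intended weighted-percentile semantics concrete, a naive sort-based sketch (pure NumPy, not the partition-based C implementation being proposed, and using just one of several possible interpolation rules) might look like:

import numpy as np

def weighted_percentile(a, q, weights):
    # Illustration only: sort, accumulate the weights, then interpolate
    # at the requested fraction of the total weight.
    a = np.asarray(a, dtype=float)
    w = np.asarray(weights, dtype=float)
    order = np.argsort(a)
    a, w = a[order], w[order]
    cum = np.cumsum(w)
    frac = (cum - 0.5 * w) / cum[-1]
    return np.interp(q / 100.0, frac, a)

# With unit weights the median agrees with np.percentile:
x = np.random.rand(101)
print(weighted_percentile(x, 50, np.ones_like(x)), np.percentile(x, 50))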
>> > >> > Antony >> > >> > 2016-02-15 21:49 GMT-08:00 Joseph Fox-Rabinovitz >> > : >> >> >> >> I would like to add a `weights` keyword to `np.partition`, >> >> `np.percentile` and `np.median`. My reason for doing so is to to allow >> >> `np.histogram` to process automatic bin selection with weights. >> >> Currently, weights are not supported for the automatic bin selection >> >> and would be difficult to support in `auto` mode without having >> >> `np.percentile` support a `weights` keyword. I suspect that there are >> >> many other uses for such a feature. >> >> >> >> I have taken a preliminary look at the C implementation of the >> >> partition functions that are the basis for `partition`, `median` and >> >> `percentile`. I think that it would be possible to add versions (or >> >> just extend the functionality of existing ones) that check the ratio >> >> of the weights below the partition point to the total sum of the >> >> weights instead of just counting elements. >> >> >> >> One of the main advantages of such an implementation is that it would >> >> allow any real weights to be handled correctly, not just integers. >> >> Complex weights would not be supported. >> >> >> >> The purpose of this email is to see if anybody objects, has ideas or >> >> cares at all about this proposal before I spend a significant amount >> >> of time working on it. For example, did I miss any functions in my >> >> list? >> >> >> >> Regards, >> >> >> >> -Joe >> >> _______________________________________________ >> >> NumPy-Discussion mailing list >> >> NumPy-Discussion at scipy.org >> >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> > >> > >> > >> > _______________________________________________ >> > NumPy-Discussion mailing list >> > NumPy-Discussion at scipy.org >> > https://mail.scipy.org/mailman/listinfo/numpy-discussion >> > >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > statsmodels just got weighted quantiles > https://github.com/statsmodels/statsmodels/pull/2707 > > I didn't try to figure out it's computational efficiency, and we would > gladly delegate to whatever fast algorithm would be in numpy. > > Josef > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > From njs at pobox.com Tue Feb 16 15:02:33 2016 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 16 Feb 2016 12:02:33 -0800 Subject: [Numpy-discussion] building NumPy with gcc if Python was built with icc?!? In-Reply-To: <867172D4AC0F194AACCDD3853364A34882F2AF@AG-DCC-MBX13.dsone.3ds.com> References: <867172D4AC0F194AACCDD3853364A34882F2AF@AG-DCC-MBX13.dsone.3ds.com> Message-ID: In principle this should work (offer may be void on windows which has its own special weirdnesses, but I assume you're not on windows). icc and gcc should both support the same calling conventions and so forth. It sounds like you're just running into an annoying build system configuration issue where python likes to remember the compiler options used to build python, and then when it's time to build extension modules then the extension modules ask distutils what to do and distutils tells them to use these remembered options to build themselves as well. 
So you want to look into what compiler flag defaults are being exported by your python build, and figure out some way to make it export the ones you want instead of the defaults. I don't think there's anything really numpy specific about this, since it's about cpython's own build system plus stdlib -- I'd try asking on python-list or so. -n On Feb 16, 2016 11:40 AM, "BERGER Christian" wrote: > Hi All, > > > > Here's a potentially dumb question: is it possible to build NumPy with > gcc, if python was built with icc? > > Right now, the build is failing in the toolchain check phase, because gcc > doesn't know how to handle icc-specific c flags (like -fp-model, prec-sqrt, > ...) > > In our environment we're providing an embedded python that our customers > should be able to use and extend with 3rd party modules (like numpy). > Problem is that our sw is built using icc, but we don't want to force our > customers to do the same and we also don't want to build every possible 3rd > party module for our customers. > > > > Thanks for your help, > > Christian > > > > This email and any attachments are intended solely for the use of the > individual or entity to whom it is addressed and may be confidential and/or > privileged. > > If you are not one of the named recipients or have received this email in > error, > > (i) you should not read, disclose, or copy it, > > (ii) please notify sender of your receipt by reply email and delete this > email and all attachments, > > (iii) Dassault Systemes does not accept or assume any liability or > responsibility for any use of or reliance on this email. > > For other languages, go to http://www.3ds.com/terms/email-disclaimer > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Tue Feb 16 15:22:35 2016 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 16 Feb 2016 15:22:35 -0500 Subject: [Numpy-discussion] Proposal to add `weights` to `np.percentile` and `np.median` In-Reply-To: References: Message-ID: On Tue, Feb 16, 2016 at 2:48 PM, Joseph Fox-Rabinovitz < jfoxrabinovitz at gmail.com> wrote: > Please correct me if I misunderstood, but the code in that commit is > doing a full sort, somewhat similar to what > `scipy.stats.scoreatpercentile`. If that is correct, I will run some > benchmarks first, but I think there is value to going forward with a > numpy version that extends the current partitioning scheme. > I think so, but it's hiding inside pandas groupby, which also uses a hash, IIUC. AFAICS, the main reason it's implemented this way is to get correct tie handling. There could be large performance differences depending on whether there are many ties (discretized data) or only unique floats. (just guessing) Josef > > - Joe > > On Tue, Feb 16, 2016 at 2:39 PM, wrote: > > > > > > On Tue, Feb 16, 2016 at 1:41 PM, Joseph Fox-Rabinovitz > > wrote: > >> > >> Thanks for pointing me to that. I had something a bit different in > >> mind but that definitely looks like a good start. > >> > >> On Tue, Feb 16, 2016 at 1:32 PM, Antony Lee > >> wrote: > >> > See earlier discussion here: > https://github.com/numpy/numpy/issues/6326 > >> > Basically, na?vely sorting may be faster than a not-so-optimized > version > >> > of > >> > quickselect. 
> >> > > >> > Antony > >> > > >> > 2016-02-15 21:49 GMT-08:00 Joseph Fox-Rabinovitz > >> > : > >> >> > >> >> I would like to add a `weights` keyword to `np.partition`, > >> >> `np.percentile` and `np.median`. My reason for doing so is to to > allow > >> >> `np.histogram` to process automatic bin selection with weights. > >> >> Currently, weights are not supported for the automatic bin selection > >> >> and would be difficult to support in `auto` mode without having > >> >> `np.percentile` support a `weights` keyword. I suspect that there are > >> >> many other uses for such a feature. > >> >> > >> >> I have taken a preliminary look at the C implementation of the > >> >> partition functions that are the basis for `partition`, `median` and > >> >> `percentile`. I think that it would be possible to add versions (or > >> >> just extend the functionality of existing ones) that check the ratio > >> >> of the weights below the partition point to the total sum of the > >> >> weights instead of just counting elements. > >> >> > >> >> One of the main advantages of such an implementation is that it would > >> >> allow any real weights to be handled correctly, not just integers. > >> >> Complex weights would not be supported. > >> >> > >> >> The purpose of this email is to see if anybody objects, has ideas or > >> >> cares at all about this proposal before I spend a significant amount > >> >> of time working on it. For example, did I miss any functions in my > >> >> list? > >> >> > >> >> Regards, > >> >> > >> >> -Joe > >> >> _______________________________________________ > >> >> NumPy-Discussion mailing list > >> >> NumPy-Discussion at scipy.org > >> >> https://mail.scipy.org/mailman/listinfo/numpy-discussion > >> > > >> > > >> > > >> > _______________________________________________ > >> > NumPy-Discussion mailing list > >> > NumPy-Discussion at scipy.org > >> > https://mail.scipy.org/mailman/listinfo/numpy-discussion > >> > > >> _______________________________________________ > >> NumPy-Discussion mailing list > >> NumPy-Discussion at scipy.org > >> https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > > statsmodels just got weighted quantiles > > https://github.com/statsmodels/statsmodels/pull/2707 > > > > I didn't try to figure out it's computational efficiency, and we would > > gladly delegate to whatever fast algorithm would be in numpy. > > > > Josef > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From matthew.brett at gmail.com Tue Feb 16 21:10:27 2016 From: matthew.brett at gmail.com (Matthew Brett) Date: Tue, 16 Feb 2016 18:10:27 -0800 Subject: [Numpy-discussion] Fwd: Windows wheels for testing In-Reply-To: <56BF6DF8.30304@gmail.com> References: <201602122315.u1CNFmmU019206@blue-cove.com> <201602130419.u1D4J5FL030486@blue-cove.com> <56BF6DF8.30304@gmail.com> Message-ID: On Sat, Feb 13, 2016 at 9:55 AM, Jonathan Helmus wrote: > > > On 2/12/16 10:23 PM, Matthew Brett wrote: >> >> On Fri, Feb 12, 2016 at 8:18 PM, R Schumacher wrote: >>> >>> At 03:45 PM 2/12/2016, you wrote: >>>> >>>> PS C:\tmp> c:\Python35\python -m venv np-testing >>>> PS C:\tmp> .\np-testing\Scripts\Activate.ps1 >>>> (np-testing) PS C:\tmp> pip install -f >>>> https://nipy.bic.berkeley.edu/scipy_installers/atlas_builds numpy nose >>> >>> >>> C:\Python34\Scripts>pip install "D:\Python >>> distros\numpy-1.10.4-cp34-none-win_amd64.whl" >>> Unpacking d:\python distros\numpy-1.10.4-cp34-none-win_amd64.whl >>> Installing collected packages: numpy >>> Successfully installed numpy >>> Cleaning up... >>> >>> C:\Python34\Scripts>..\python >>> Python 3.4.2 (v3.4.2:ab2c023a9432, Oct 6 2014, 22:16:31) [MSC v.1600 64 >>> bit >>> (AMD64)] on win32 >>> Type "help", "copyright", "credits" or "license" for more information. >>>>>> >>>>>> import numpy >>>>>> numpy.test() >>> >>> Running unit tests for numpy >>> NumPy version 1.10.4 >>> NumPy relaxed strides checking option: False >>> NumPy is installed in C:\Python34\lib\site-packages\numpy >>> Python version 3.4.2 (v3.4.2:ab2c023a9432, Oct 6 2014, 22:16:31) [MSC >>> v.1600 64 bit (AMD64)] >>> nose version 1.3.7 >>> >>> .......................F....S........................................................................................... >>> >>> .....................................................................................................S.................. >>> >>> ..........................................................................................C:\Python34\lib\unittest\case. >>> py:162: DeprecationWarning: using a non-integer number instead of an >>> integer >>> will result in an error in the future >>> callable_obj(*args, **kwargs) >>> ........C:\Python34\lib\unittest\case.py:162: DeprecationWarning: using a >>> non-integer number instead of an integer will >>> result in an error in the future >>> callable_obj(*args, **kwargs) >>> C:\Python34\lib\unittest\case.py:162: DeprecationWarning: using a >>> non-integer number instead of an integer will result i >>> n an error in the future >>> callable_obj(*args, **kwargs) >>> >>> .......................................................................................S................................ >>> >>> ........................................................................................................................ 
>>> >>> ..........................................................................C:\Python34\lib\unittest\case.py:162: >>> Deprecat >>> ionWarning: using a non-integer number instead of an integer will result >>> in >>> an error in the future >>> callable_obj(*args, **kwargs) >>> ..C:\Python34\lib\unittest\case.py:162: DeprecationWarning: using a >>> non-integer number instead of an integer will result >>> in an error in the future >>> callable_obj(*args, **kwargs) >>> C:\Python34\lib\unittest\case.py:162: DeprecationWarning: using a >>> non-integer number instead of an integer will result i >>> n an error in the future >>> callable_obj(*args, **kwargs) >>> C:\Python34\lib\unittest\case.py:162: DeprecationWarning: using a >>> non-integer number instead of an integer will result i >>> n an error in the future >>> callable_obj(*args, **kwargs) >>> C:\Python34\lib\unittest\case.py:162: DeprecationWarning: using a >>> non-integer number instead of an integer will result i >>> n an error in the future >>> callable_obj(*args, **kwargs) >>> >>> ........................................................................................................................ >>> >>> ........................................................................................................................ >>> >>> ........................................................................................................................ >>> >>> ........................................................................................................................ >>> >>> ........................................................................................................................ >>> >>> ........................................................................................................................ >>> >>> ........................................................................................................................ >>> >>> ........................................................................................................................ >>> >>> ........................................................................................................................ >>> >>> ........................................................................................................................ >>> >>> ........................................................................................................................ >>> >>> ...............................................K.........................................................C:\Python34\lib >>> \site-packages\numpy\ma\core.py:989: RuntimeWarning: invalid value >>> encountered in multiply >>> masked_da = umath.multiply(m, da) >>> C:\Python34\lib\site-packages\numpy\ma\core.py:989: RuntimeWarning: >>> invalid >>> value encountered in multiply >>> masked_da = umath.multiply(m, da) >>> >>> ........................................................................................................................ >>> >>> ..................C:\Python34\lib\site-packages\numpy\core\tests\test_numerictypes.py:372: >>> DeprecationWarning: using a n >>> on-integer number instead of an integer will result in an error in the >>> future >>> return self.ary['f0', 'f1'] >>> >>> ........................................................................................................................ >>> >>> .............................................................................................K.......................... 
>>> >>> ........................................................................................................................ >>> >>> ........................................................................................................................ >>> >>> ........................................................................................................................ >>> >>> ....K..K................................K...SK.S.......S................................................................ >>> >>> ..................ES..SS................................................................................................ >>> >>> ........................................................................................................................ >>> >>> ........................................................................................................................ >>> >>> ........................................................................................................................ >>> >>> ........................................................................................................................ >>> >>> ................................................................................S....................................... >>> >>> ........................................................................................................................ >>> >>> ........................................................................................................................ >>> >>> ...........................................K.................K.......................................................... >>> >>> ........................................................................................................................ >>> >>> ........................................................................................................................ >>> >>> ........................................................................................................................ >>> >>> ...........................................S............................................................................ >>> >>> ..........................................S............................................................C:\Python34\lib\s >>> ite-packages\numpy\ma\core.py:4089: UserWarning: Warning: converting a >>> masked element to nan. >>> warnings.warn("Warning: converting a masked element to nan.") >>> >>> ..............................................................................................................C:\Python3 >>> 4\lib\site-packages\numpy\ma\core.py:5116: RuntimeWarning: invalid value >>> encountered in power >>> np.power(out, 0.5, out=out, casting='unsafe') >>> >>> ........................................................................................................................ >>> >>> ........................................................................................................................ >>> >>> ........................................................................................................................ >>> >>> ........................................................................................................................ >>> >>> ........................................................................................................................ >>> >>> ........................................................................................................................ 
>>> >>> ........................................................................................................................ >>> .......................... >>> ====================================================================== >>> ERROR: test_compile1 (test_system_info.TestSystemInfoReading) >>> ---------------------------------------------------------------------- >>> Traceback (most recent call last): >>> File >>> >>> "C:\Python34\lib\site-packages\numpy\distutils\tests\test_system_info.py", >>> line 182, in test_compile1 >>> c.compile([os.path.basename(self._src1)], output_dir=self._dir1) >>> File "C:\Python34\lib\distutils\msvc9compiler.py", line 460, in >>> compile >>> self.initialize() >>> File "C:\Python34\lib\site-packages\numpy\distutils\msvccompiler.py", >>> line >>> 17, in initialize >>> distutils.msvccompiler.MSVCCompiler.initialize(self, plat_name) >>> File "C:\Python34\lib\distutils\msvc9compiler.py", line 371, in >>> initialize >>> vc_env = query_vcvarsall(VERSION, plat_spec) >>> File "C:\Python34\lib\distutils\msvc9compiler.py", line 259, in >>> query_vcvarsall >>> raise DistutilsPlatformError("Unable to find vcvarsall.bat") >>> distutils.errors.DistutilsPlatformError: Unable to find vcvarsall.bat >>> >>> ====================================================================== >>> FAIL: test_blasdot.test_blasdot_used >>> ---------------------------------------------------------------------- >>> Traceback (most recent call last): >>> File "C:\Python34\lib\site-packages\nose\case.py", line 198, in >>> runTest >>> self.test(*self.arg) >>> File "C:\Python34\lib\site-packages\numpy\testing\decorators.py", line >>> 146, in skipper_func >>> return f(*args, **kwargs) >>> File "C:\Python34\lib\site-packages\numpy\core\tests\test_blasdot.py", >>> line 31, in test_blasdot_used >>> assert_(dot is _dotblas.dot) >>> File "C:\Python34\lib\site-packages\numpy\testing\utils.py", line 53, >>> in >>> assert_ >>> raise AssertionError(smsg) >>> AssertionError >>> >>> ---------------------------------------------------------------------- >>> Ran 5575 tests in 32.042s >>> >>> FAILED (KNOWNFAIL=8, SKIP=12, errors=1, failures=1) >>> >> >> Great - thanks - I got the same couple of failures - I believe they >> are benign... >> >> Matthew > > Matthew, > > The wheels seem to work fine in the Python provided by Continuum on > 32-bit Windows. Tested in Python 2.7, 3.3 and 3.4. The only test > errors/failures was the the vcvarsall.bat error on all three versions. Full > tests logs at https://gist.github.com/jjhelmus/de2b34779e83eb37a70f. Thanks all for testing, that is very helpful, Cheers, Matthew From shoyer at gmail.com Wed Feb 17 01:14:21 2016 From: shoyer at gmail.com (Stephan Hoyer) Date: Tue, 16 Feb 2016 22:14:21 -0800 Subject: [Numpy-discussion] GSoC? In-Reply-To: References: Message-ID: On Wed, Feb 10, 2016 at 4:22 PM, Chris Barker wrote: > We might consider adding "improve duck typing for numpy arrays" >> > > care to elaborate on that one? > > I know it come up on here that it would be good to have some code in numpy > itself that made it easier to make array-like objects (I.e. do indexing the > same way) Is that what you mean? > I was thinking particularly of improving the compatibility of numpy functions (e.g., concatenate) with non-numpy array-like objects, but now that you mention it utilities to make it easier to make array-like objects could also be a good thing. 
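As a small illustration of the coercion issue being described, np.concatenate currently converts array-like inputs into plain ndarrays (the wrapper class below is hypothetical, written only for this example):

import numpy as np

class MyArray(object):
    # A minimal array-like: just enough for numpy to coerce it.
    def __init__(self, data):
        self.data = np.asarray(data)

    def __array__(self):
        return self.data

a, b = MyArray([1, 2]), MyArray([3, 4])
out = np.concatenate([a, b])
print(type(out))  # numpy.ndarray -- the wrapper type is lost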
In any case, I've now elaborated on my thought into a full project idea on the Wiki: https://github.com/scipy/scipy/wiki/GSoC-2016-project-ideas#improved-duck-typing-support-for-n-dimensional-arrays Arguably, this might be too difficult for most GSoC students -- the API design questions here are quite contentious. But given that "Pythonic dtypes" is up there as a GSoC proposal still it's in good company. Cheers, Stephan -------------- next part -------------- An HTML attachment was scrubbed... URL: From cournape at gmail.com Wed Feb 17 06:20:15 2016 From: cournape at gmail.com (David Cournapeau) Date: Wed, 17 Feb 2016 11:20:15 +0000 Subject: [Numpy-discussion] building NumPy with gcc if Python was built with icc?!? In-Reply-To: <867172D4AC0F194AACCDD3853364A34882F2AF@AG-DCC-MBX13.dsone.3ds.com> References: <867172D4AC0F194AACCDD3853364A34882F2AF@AG-DCC-MBX13.dsone.3ds.com> Message-ID: On Tue, Feb 16, 2016 at 7:39 PM, BERGER Christian wrote: > Hi All, > > > > Here's a potentially dumb question: is it possible to build NumPy with > gcc, if python was built with icc? > > Right now, the build is failing in the toolchain check phase, because gcc > doesn't know how to handle icc-specific c flags (like -fp-model, prec-sqrt, > ...) > > In our environment we're providing an embedded python that our customers > should be able to use and extend with 3rd party modules (like numpy). > Problem is that our sw is built using icc, but we don't want to force our > customers to do the same and we also don't want to build every possible 3rd > party module for our customers. > If you are the one providing python, your best bet is to post process your python to strip the info from Intel-specific options. The process is convoluted, but basically: - at configure time, python makes a difference between CFLAGS and OPTS, and will store both in your Makefile - when python is installed, it will parse the Makefile and generate some dict written in the python stdlib as _sysconfigdata.py _sysconfigdata.py is what's used by distutils to build C extensions. This is only valid on Unix/cygwin, if you are on windows, the process is completely different. David > > Thanks for your help, > > Christian > > > > This email and any attachments are intended solely for the use of the > individual or entity to whom it is addressed and may be confidential and/or > privileged. > > If you are not one of the named recipients or have received this email in > error, > > (i) you should not read, disclose, or copy it, > > (ii) please notify sender of your receipt by reply email and delete this > email and all attachments, > > (iii) Dassault Systemes does not accept or assume any liability or > responsibility for any use of or reliance on this email. > > For other languages, go to http://www.3ds.com/terms/email-disclaimer > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bryanv at continuum.io Wed Feb 17 09:05:51 2016 From: bryanv at continuum.io (Bryan Van de Ven) Date: Wed, 17 Feb 2016 09:05:51 -0500 Subject: [Numpy-discussion] GSoC? In-Reply-To: References: Message-ID: <05A90159-9A13-4117-9A2E-76029AEFD0C4@continuum.io> [This is a complete tangent, and so I apologize in advance.] We are considering applying to GSOC for Bokeh. However, I have zero experience with GSOC, but non-zero questions (e.g. 
go it alone, vs apply through PSF... I think?) If anyone with experience from the mentoring organization side of things wouldn't mind a quick chat (or a few emails) to answer questions, share your experience, or offer advice, please drop me a line directly. Thanks, Bryan > On Feb 17, 2016, at 1:14 AM, Stephan Hoyer wrote: > > On Wed, Feb 10, 2016 at 4:22 PM, Chris Barker wrote: > We might consider adding "improve duck typing for numpy arrays" > > care to elaborate on that one? > > I know it come up on here that it would be good to have some code in numpy itself that made it easier to make array-like objects (I.e. do indexing the same way) Is that what you mean? > > I was thinking particularly of improving the compatibility of numpy functions (e.g., concatenate) with non-numpy array-like objects, but now that you mention it utilities to make it easier to make array-like objects could also be a good thing. > > In any case, I've now elaborated on my thought into a full project idea on the Wiki: > https://github.com/scipy/scipy/wiki/GSoC-2016-project-ideas#improved-duck-typing-support-for-n-dimensional-arrays > > Arguably, this might be too difficult for most GSoC students -- the API design questions here are quite contentious. But given that "Pythonic dtypes" is up there as a GSoC proposal still it's in good company. > > Cheers, > Stephan > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion From gfyoung17 at gmail.com Wed Feb 17 10:01:38 2016 From: gfyoung17 at gmail.com (G Young) Date: Wed, 17 Feb 2016 15:01:38 +0000 Subject: [Numpy-discussion] making "low" optional in numpy.randint Message-ID: Hello all, I have a PR open here that makes "low" an optional parameter in numpy.randint and introduces new behavior into the API as follows: 1) `low == None` and `high == None` Numbers are generated over the range `[lowbnd, highbnd)`, where `lowbnd = np.iinfo(dtype).min`, and `highbnd = np.iinfo(dtype).max`, where `dtype` is the provided integral type. 2) `low != None` and `high == None` If `low >= 0`, numbers are still generated over the range `[0, low)`, but if `low` < 0, numbers are generated over the range `[low, highbnd)`, where `highbnd` is defined as above. 3) `low == None` and `high != None` Numbers are generated over the range `[lowbnd, high)`, where `lowbnd` is defined as above. The primary motivation was the second case, as it is more convenient to specify a 'dtype' by itself when generating such numbers in a similar vein to numpy.empty, except with initialized values. Looking forward to your feedback! Greg -------------- next part -------------- An HTML attachment was scrubbed... URL: From t3kcit at gmail.com Wed Feb 17 11:12:39 2016 From: t3kcit at gmail.com (Andreas Mueller) Date: Wed, 17 Feb 2016 11:12:39 -0500 Subject: [Numpy-discussion] NumPy 1.11.0b3 released. In-Reply-To: References: <56BE4BEA.5040100@gmail.com> Message-ID: <56C49BF7.6020506@gmail.com> On 02/12/2016 04:19 PM, Nathan Goldbaum wrote: > https://github.com/numpy/numpy/blob/master/doc/release/1.11.0-notes.rst > Thanks. That doesn't cover the backward incompatible change to assert_almost_equal and assert_array_almost_equal, right? 
From alan.isaac at gmail.com Wed Feb 17 11:40:05 2016 From: alan.isaac at gmail.com (Alan Isaac) Date: Wed, 17 Feb 2016 11:40:05 -0500 Subject: [Numpy-discussion] making "low" optional in numpy.randint In-Reply-To: References: Message-ID: <56C4A265.6040903@gmail.com> Behavior of random integer generation: Python randint [a,b] MATLAB randi [a,b] Mma RandomInteger [a,b] haskell randomR [a,b] GAUSS rndi [a,b] Maple rand [a,b] In short, NumPy's `randint` is non-standard (and, I would add, non-intuitive). Presumably was due due to relying on a float draw from [0,1) along with the use of floor. The divergence in behavior between the (later) Python function of the same name is particularly unfortunate. So I suggest further work on this function is not called for, and use of `random_integers` should be encouraged. Probably NumPy's `randint` should be deprecated. If there is any playing with the interface, I think Mma provides a pretty good model. If I were designing the interface, I would always require a tuple argument (for the inclusive range), with possible `None` values to imply datatype extreme values. Proposed name (after `randint` deprecation): `randints`. Cheers, Alan Isaac From robert.kern at gmail.com Wed Feb 17 11:46:43 2016 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 17 Feb 2016 16:46:43 +0000 Subject: [Numpy-discussion] making "low" optional in numpy.randint In-Reply-To: <56C4A265.6040903@gmail.com> References: <56C4A265.6040903@gmail.com> Message-ID: On Wed, Feb 17, 2016 at 4:40 PM, Alan Isaac wrote: > > Behavior of random integer generation: > Python randint [a,b] > MATLAB randi [a,b] > Mma RandomInteger [a,b] > haskell randomR [a,b] > GAUSS rndi [a,b] > Maple rand [a,b] > > In short, NumPy's `randint` is non-standard (and, > I would add, non-intuitive). Presumably was due > due to relying on a float draw from [0,1) along > with the use of floor. No, never was. It is implemented so because Python uses semi-open integer intervals by preference because it plays most nicely with 0-based indexing. Not sure about all of those systems, but some at least are 1-based indexing, so closed intervals do make sense. The Python stdlib's random.randint() closed interval is considered a mistake by python-dev leading to the implementation and preference for random.randrange() instead. > The divergence in behavior between the (later) Python > function of the same name is particularly unfortunate. Indeed, but unfortunately, this mistake dates way back to Numeric times, and easing the migration to numpy was a priority in the heady days of numpy 1.0. > So I suggest further work on this function is > not called for, and use of `random_integers` > should be encouraged. Probably NumPy's `randint` > should be deprecated. Not while I'm here. Instead, `random_integers()` is discouraged and perhaps might eventually be deprecated. -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... URL: From gfyoung17 at gmail.com Wed Feb 17 11:48:14 2016 From: gfyoung17 at gmail.com (G Young) Date: Wed, 17 Feb 2016 16:48:14 +0000 Subject: [Numpy-discussion] making "low" optional in numpy.randint In-Reply-To: References: <56C4A265.6040903@gmail.com> Message-ID: Actually, it has already been deprecated because I did it myself. 
:) On Wed, Feb 17, 2016 at 4:46 PM, Robert Kern wrote: > On Wed, Feb 17, 2016 at 4:40 PM, Alan Isaac wrote: > > > > Behavior of random integer generation: > > Python randint [a,b] > > MATLAB randi [a,b] > > Mma RandomInteger [a,b] > > haskell randomR [a,b] > > GAUSS rndi [a,b] > > Maple rand [a,b] > > > > In short, NumPy's `randint` is non-standard (and, > > I would add, non-intuitive). Presumably was due > > due to relying on a float draw from [0,1) along > > with the use of floor. > > No, never was. It is implemented so because Python uses semi-open integer > intervals by preference because it plays most nicely with 0-based indexing. > Not sure about all of those systems, but some at least are 1-based > indexing, so closed intervals do make sense. > > The Python stdlib's random.randint() closed interval is considered a > mistake by python-dev leading to the implementation and preference for > random.randrange() instead. > > > The divergence in behavior between the (later) Python > > function of the same name is particularly unfortunate. > > Indeed, but unfortunately, this mistake dates way back to Numeric times, > and easing the migration to numpy was a priority in the heady days of numpy > 1.0. > > > So I suggest further work on this function is > > not called for, and use of `random_integers` > > should be encouraged. Probably NumPy's `randint` > > should be deprecated. > > Not while I'm here. Instead, `random_integers()` is discouraged and > perhaps might eventually be deprecated. > > -- > Robert Kern > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Wed Feb 17 12:07:55 2016 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 17 Feb 2016 10:07:55 -0700 Subject: [Numpy-discussion] NumPy 1.11.0b3 released. In-Reply-To: <56C49BF7.6020506@gmail.com> References: <56BE4BEA.5040100@gmail.com> <56C49BF7.6020506@gmail.com> Message-ID: On Wed, Feb 17, 2016 at 9:12 AM, Andreas Mueller wrote: > > > On 02/12/2016 04:19 PM, Nathan Goldbaum wrote: > >> https://github.com/numpy/numpy/blob/master/doc/release/1.11.0-notes.rst >> >> Thanks. > That doesn't cover the backward incompatible change to assert_almost_equal > and assert_array_almost_equal, > right? What changes? AFAICT, there have only been some PEP8 changes in those functions since 1.9. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From alan.isaac at gmail.com Wed Feb 17 12:10:52 2016 From: alan.isaac at gmail.com (Alan Isaac) Date: Wed, 17 Feb 2016 12:10:52 -0500 Subject: [Numpy-discussion] making "low" optional in numpy.randint In-Reply-To: References: <56C4A265.6040903@gmail.com> Message-ID: <56C4A99C.5080804@gmail.com> On 2/17/2016 11:46 AM, Robert Kern wrote: > some at least are 1-based indexing, so closed intervals do make sense. Haskell is 0-indexed. And quite carefully thought out, imo. Cheers, Alan From gfyoung17 at gmail.com Wed Feb 17 12:28:48 2016 From: gfyoung17 at gmail.com (G Young) Date: Wed, 17 Feb 2016 17:28:48 +0000 Subject: [Numpy-discussion] making "low" optional in numpy.randint In-Reply-To: <56C4A99C.5080804@gmail.com> References: <56C4A265.6040903@gmail.com> <56C4A99C.5080804@gmail.com> Message-ID: Perhaps, but we are not coding in Haskell. 
We are coding in Python, and the standard is that the endpoint is excluded, which renders your point moot I'm afraid. On Wed, Feb 17, 2016 at 5:10 PM, Alan Isaac wrote: > On 2/17/2016 11:46 AM, Robert Kern wrote: > >> some at least are 1-based indexing, so closed intervals do make sense. >> > > > > Haskell is 0-indexed. > And quite carefully thought out, imo. > > Cheers, > Alan > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Wed Feb 17 13:37:04 2016 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 17 Feb 2016 13:37:04 -0500 Subject: [Numpy-discussion] making "low" optional in numpy.randint In-Reply-To: References: Message-ID: On Wed, Feb 17, 2016 at 10:01 AM, G Young wrote: > Hello all, > > I have a PR open here that > makes "low" an optional parameter in numpy.randint and introduces new > behavior into the API as follows: > > 1) `low == None` and `high == None` > > Numbers are generated over the range `[lowbnd, highbnd)`, where `lowbnd = > np.iinfo(dtype).min`, and `highbnd = np.iinfo(dtype).max`, where `dtype` is > the provided integral type. > > 2) `low != None` and `high == None` > > If `low >= 0`, numbers are still generated over the range `[0, > low)`, but if `low` < 0, numbers are generated over the range `[low, > highbnd)`, where `highbnd` is defined as above. > > 3) `low == None` and `high != None` > > Numbers are generated over the range `[lowbnd, high)`, where `lowbnd` is > defined as above. > My impression (*) is that this will be confusing, and uses a default that I never ever needed. Maybe a better way would be to use low=-np.inf and high=np.inf where inf would be interpreted as the smallest and largest representable number. And leave the defaults unchanged. (*) I didn't try to understand how it works for various cases. Josef > > The primary motivation was the second case, as it is more convenient to > specify a 'dtype' by itself when generating such numbers in a similar vein > to numpy.empty, except with initialized values. > > Looking forward to your feedback! > > Greg > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Wed Feb 17 13:50:03 2016 From: chris.barker at noaa.gov (Chris Barker) Date: Wed, 17 Feb 2016 10:50:03 -0800 Subject: [Numpy-discussion] Suggestion: special-case np.array(range(...)) to be faster In-Reply-To: References: Message-ID: On Sun, Feb 14, 2016 at 11:41 PM, Antony Lee wrote: > So how can np.array(range(...)) even work? > range() (in py3) is not a generator, nor is is a iterator. it is a range object, which is lazily evaluated, and satisfies both the iterator protocol and the sequence protocol (at least most of it: In [*1*]: r = range(10) In [*2*]: r[3] Out[*2*]: 3 In [*3*]: len(r) Out[*3*]: 10 In [*4*]: type(r) Out[*4*]: range In [*9*]: isinstance(r, collections.abc.Sequence) Out[*9*]: True In [*10*]: l = list() In [*11*]: isinstance(l, collections.abc.Sequence) Out[*11*]: True In [*12*]: isinstance(r, collections.abc.Iterable) Out[*12*]: True I'm still totally confused as to why we'd need to special-case range when we have arange(). 
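A small runnable sketch of the same distinction (the generator case is included only for contrast):

    import numpy as np

    r = range(10)                     # lazy, but a real sequence: len() and indexing work
    np.array(r)                       # array([0, 1, ..., 9]); array() can walk any sequence

    np.arange(10)                     # the idiomatic (and fast) spelling for this case

    gen = (i * i for i in range(10))  # a generator: no len(), no indexing, one pass only
    np.fromiter(gen, dtype=np.intp)   # fromiter is the tool for one-shot iterators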
-CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From jfoxrabinovitz at gmail.com Wed Feb 17 13:52:51 2016 From: jfoxrabinovitz at gmail.com (Joseph Fox-Rabinovitz) Date: Wed, 17 Feb 2016 13:52:51 -0500 Subject: [Numpy-discussion] making "low" optional in numpy.randint In-Reply-To: References: Message-ID: On Wed, Feb 17, 2016 at 1:37 PM, wrote: > > > On Wed, Feb 17, 2016 at 10:01 AM, G Young wrote: >> >> Hello all, >> >> I have a PR open here that makes "low" an optional parameter in >> numpy.randint and introduces new behavior into the API as follows: >> >> 1) `low == None` and `high == None` >> >> Numbers are generated over the range `[lowbnd, highbnd)`, where `lowbnd = >> np.iinfo(dtype).min`, and `highbnd = np.iinfo(dtype).max`, where `dtype` is >> the provided integral type. >> >> 2) `low != None` and `high == None` >> >> If `low >= 0`, numbers are still generated over the range `[0, >> low)`, but if `low` < 0, numbers are generated over the range `[low, >> highbnd)`, where `highbnd` is defined as above. >> >> 3) `low == None` and `high != None` >> >> Numbers are generated over the range `[lowbnd, high)`, where `lowbnd` is >> defined as above. > > > My impression (*) is that this will be confusing, and uses a default that I > never ever needed. > > Maybe a better way would be to use low=-np.inf and high=np.inf where inf > would be interpreted as the smallest and largest representable number. And > leave the defaults unchanged. > > (*) I didn't try to understand how it works for various cases. > > Josef > As I mentioned on the PR discussion, the thing that bothers me is the inconsistency between the new and the old functionality, specifically in #2. If high is None, the behavior is completely different depending on the value of `low`. Using `np.inf` instead of `None` may fix that, although I think that the author's idea was to avoid having to type the bounds in the `None`/`+/-np.inf` cases. I think that a better option is to have a separate wrapper to `randint` that implements this behavior in a consistent manner and leaves the current function consistent as well. -Joe > > >> >> >> The primary motivation was the second case, as it is more convenient to >> specify a 'dtype' by itself when generating such numbers in a similar vein >> to numpy.empty, except with initialized values. >> >> Looking forward to your feedback!
-Chris On Wed, Feb 17, 2016 at 6:05 AM, Bryan Van de Ven wrote: > [This is a complete tangent, and so I apologize in advance.] > > We are considering applying to GSOC for Bokeh. However, I have zero > experience with GSOC, but non-zero questions (e.g. go it alone, vs apply > through PSF... I think?) If anyone with experience from the mentoring > organization side of things wouldn't mind a quick chat (or a few emails) to > answer questions, share your experience, or offer advice, please drop me a > line directly. > > Thanks, > > Bryan > > > > > On Feb 17, 2016, at 1:14 AM, Stephan Hoyer wrote: > > > > On Wed, Feb 10, 2016 at 4:22 PM, Chris Barker > wrote: > > We might consider adding "improve duck typing for numpy arrays" > > > > care to elaborate on that one? > > > > I know it come up on here that it would be good to have some code in > numpy itself that made it easier to make array-like objects (I.e. do > indexing the same way) Is that what you mean? > > > > I was thinking particularly of improving the compatibility of numpy > functions (e.g., concatenate) with non-numpy array-like objects, but now > that you mention it utilities to make it easier to make array-like objects > could also be a good thing. > > > > In any case, I've now elaborated on my thought into a full project idea > on the Wiki: > > > https://github.com/scipy/scipy/wiki/GSoC-2016-project-ideas#improved-duck-typing-support-for-n-dimensional-arrays > > > > Arguably, this might be too difficult for most GSoC students -- the API > design questions here are quite contentious. But given that "Pythonic > dtypes" is up there as a GSoC proposal still it's in good company. > > > > Cheers, > > Stephan > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From gfyoung17 at gmail.com Wed Feb 17 14:09:12 2016 From: gfyoung17 at gmail.com (G Young) Date: Wed, 17 Feb 2016 19:09:12 +0000 Subject: [Numpy-discussion] making "low" optional in numpy.randint In-Reply-To: References: Message-ID: Yes, you are correct in explaining my intentions. However, as I also mentioned in the PR discussion, I did not quite understand how your wrapper idea would make things any more comprehensive at the cost of additional overhead and complexity. What do you mean by making the functions "consistent" (i.e. outline the behavior *exactly* depending on the inputs)? As I've explained before, and I will state it again, the different behavior for the high=None and low != None case is due to backwards compatibility. 
On Wed, Feb 17, 2016 at 6:52 PM, Joseph Fox-Rabinovitz < jfoxrabinovitz at gmail.com> wrote: > On Wed, Feb 17, 2016 at 1:37 PM, wrote: > > > > > > On Wed, Feb 17, 2016 at 10:01 AM, G Young wrote: > >> > >> Hello all, > >> > >> I have a PR open here that makes "low" an optional parameter in > >> numpy.randint and introduces new behavior into the API as follows: > >> > >> 1) `low == None` and `high == None` > >> > >> Numbers are generated over the range `[lowbnd, highbnd)`, where `lowbnd > = > >> np.iinfo(dtype).min`, and `highbnd = np.iinfo(dtype).max`, where > `dtype` is > >> the provided integral type. > >> > >> 2) `low != None` and `high == None` > >> > >> If `low >= 0`, numbers are still generated over the range `[0, > >> low)`, but if `low` < 0, numbers are generated over the range `[low, > >> highbnd)`, where `highbnd` is defined as above. > >> > >> 3) `low == None` and `high != None` > >> > >> Numbers are generated over the range `[lowbnd, high)`, where `lowbnd` is > >> defined as above. > > > > > > My impression (*) is that this will be confusing, and uses a default > that I > > never ever needed. > > > > Maybe a better way would be to use low=-np.inf and high=np.inf where inf > > would be interpreted as the smallest and largest representable number. > And > > leave the defaults unchanged. > > > > (*) I didn't try to understand how it works for various cases. > > > > Josef > > > > As I mentioned on the PR discussion, the thing that bothers me is the > inconsistency between the new and the old functionality, specifically > in #2. If high is, the behavior is completely different depending on > the value of `low`. Using `np.inf` instead of `None` may fix that, > although I think that the author's idea was to avoid having to type > the bounds in the `None`/`+/-np.inf` cases. I think that a better > option is to have a separate wrapper to `randint` that implements this > behavior in a consistent manner and leaves the current function > consistent as well. > > -Joe > > > > > > > >> > >> > >> The primary motivation was the second case, as it is more convenient to > >> specify a 'dtype' by itself when generating such numbers in a similar > vein > >> to numpy.empty, except with initialized values. > >> > >> Looking forward to your feedback! > >> > >> Greg > >> > >> _______________________________________________ > >> NumPy-Discussion mailing list > >> NumPy-Discussion at scipy.org > >> https://mail.scipy.org/mailman/listinfo/numpy-discussion > >> > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jfoxrabinovitz at gmail.com Wed Feb 17 14:19:00 2016 From: jfoxrabinovitz at gmail.com (Joseph Fox-Rabinovitz) Date: Wed, 17 Feb 2016 14:19:00 -0500 Subject: [Numpy-discussion] making "low" optional in numpy.randint In-Reply-To: References: Message-ID: My point is that you are proposing to make the overall API have counter-intuitive behavior for the sake of adding a new feature. It is worth a little bit of overhead to have two functions that behave exactly as expected. Josef's footnote is a good example of how people will feel about having to figure out (not to mention remember) the different use cases. 
I think it is better to keep the current API and just add a "bounded_randint" function for which an input of `None` always means "limit of that bound, no exceptions". -Joe On Wed, Feb 17, 2016 at 2:09 PM, G Young wrote: > Yes, you are correct in explaining my intentions. However, as I also > mentioned in the PR discussion, I did not quite understand how your wrapper > idea would make things any more comprehensive at the cost of additional > overhead and complexity. What do you mean by making the functions > "consistent" (i.e. outline the behavior exactly depending on the inputs)? > As I've explained before, and I will state it again, the different behavior > for the high=None and low != None case is due to backwards compatibility. > > On Wed, Feb 17, 2016 at 6:52 PM, Joseph Fox-Rabinovitz > wrote: >> >> On Wed, Feb 17, 2016 at 1:37 PM, wrote: >> > >> > >> > On Wed, Feb 17, 2016 at 10:01 AM, G Young wrote: >> >> >> >> Hello all, >> >> >> >> I have a PR open here that makes "low" an optional parameter in >> >> numpy.randint and introduces new behavior into the API as follows: >> >> >> >> 1) `low == None` and `high == None` >> >> >> >> Numbers are generated over the range `[lowbnd, highbnd)`, where `lowbnd >> >> = >> >> np.iinfo(dtype).min`, and `highbnd = np.iinfo(dtype).max`, where >> >> `dtype` is >> >> the provided integral type. >> >> >> >> 2) `low != None` and `high == None` >> >> >> >> If `low >= 0`, numbers are still generated over the range `[0, >> >> low)`, but if `low` < 0, numbers are generated over the range `[low, >> >> highbnd)`, where `highbnd` is defined as above. >> >> >> >> 3) `low == None` and `high != None` >> >> >> >> Numbers are generated over the range `[lowbnd, high)`, where `lowbnd` >> >> is >> >> defined as above. >> > >> > >> > My impression (*) is that this will be confusing, and uses a default >> > that I >> > never ever needed. >> > >> > Maybe a better way would be to use low=-np.inf and high=np.inf where >> > inf >> > would be interpreted as the smallest and largest representable number. >> > And >> > leave the defaults unchanged. >> > >> > (*) I didn't try to understand how it works for various cases. >> > >> > Josef >> > >> >> As I mentioned on the PR discussion, the thing that bothers me is the >> inconsistency between the new and the old functionality, specifically >> in #2. If high is, the behavior is completely different depending on >> the value of `low`. Using `np.inf` instead of `None` may fix that, >> although I think that the author's idea was to avoid having to type >> the bounds in the `None`/`+/-np.inf` cases. I think that a better >> option is to have a separate wrapper to `randint` that implements this >> behavior in a consistent manner and leaves the current function >> consistent as well. >> >> -Joe >> >> >> > >> > >> >> >> >> >> >> The primary motivation was the second case, as it is more convenient to >> >> specify a 'dtype' by itself when generating such numbers in a similar >> >> vein >> >> to numpy.empty, except with initialized values. >> >> >> >> Looking forward to your feedback! 
>> >> >> >> Greg >> >> >> >> _______________________________________________ >> >> NumPy-Discussion mailing list >> >> NumPy-Discussion at scipy.org >> >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> >> > >> > >> > _______________________________________________ >> > NumPy-Discussion mailing list >> > NumPy-Discussion at scipy.org >> > https://mail.scipy.org/mailman/listinfo/numpy-discussion >> > >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > From josef.pktd at gmail.com Wed Feb 17 14:20:16 2016 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 17 Feb 2016 14:20:16 -0500 Subject: [Numpy-discussion] making "low" optional in numpy.randint In-Reply-To: References: Message-ID: On Wed, Feb 17, 2016 at 2:09 PM, G Young wrote: > Yes, you are correct in explaining my intentions. However, as I also > mentioned in the PR discussion, I did not quite understand how your wrapper > idea would make things any more comprehensive at the cost of additional > overhead and complexity. What do you mean by making the functions > "consistent" (i.e. outline the behavior *exactly* depending on the > inputs)? As I've explained before, and I will state it again, the > different behavior for the high=None and low != None case is due to > backwards compatibility. > One problem is that if there is only one positional argument, then I can still figure out that it might have different meanings. If there are two keywords, then I would assume standard python argument interpretation applies. If I want to save on typing, then I think it should be for a more "standard" case. (I also never sample all real numbers, at least not uniformly.) Josef > > On Wed, Feb 17, 2016 at 6:52 PM, Joseph Fox-Rabinovitz < > jfoxrabinovitz at gmail.com> wrote: > >> On Wed, Feb 17, 2016 at 1:37 PM, wrote: >> > >> > >> > On Wed, Feb 17, 2016 at 10:01 AM, G Young wrote: >> >> >> >> Hello all, >> >> >> >> I have a PR open here that makes "low" an optional parameter in >> >> numpy.randint and introduces new behavior into the API as follows: >> >> >> >> 1) `low == None` and `high == None` >> >> >> >> Numbers are generated over the range `[lowbnd, highbnd)`, where >> `lowbnd = >> >> np.iinfo(dtype).min`, and `highbnd = np.iinfo(dtype).max`, where >> `dtype` is >> >> the provided integral type. >> >> >> >> 2) `low != None` and `high == None` >> >> >> >> If `low >= 0`, numbers are still generated over the range `[0, >> >> low)`, but if `low` < 0, numbers are generated over the range `[low, >> >> highbnd)`, where `highbnd` is defined as above. >> >> >> >> 3) `low == None` and `high != None` >> >> >> >> Numbers are generated over the range `[lowbnd, high)`, where `lowbnd` >> is >> >> defined as above. >> > >> > >> > My impression (*) is that this will be confusing, and uses a default >> that I >> > never ever needed. >> > >> > Maybe a better way would be to use low=-np.inf and high=np.inf where >> inf >> > would be interpreted as the smallest and largest representable number. >> And >> > leave the defaults unchanged. >> > >> > (*) I didn't try to understand how it works for various cases. 
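A minimal sketch of the separate helper floated earlier in the thread (the name `bounded_randint` is Joseph's placeholder, not an existing numpy function), assuming `None` always means the corresponding dtype limit:

    import numpy as np

    def bounded_randint(low=None, high=None, size=None, dtype=np.int_):
        """Hypothetical helper: None always means the dtype's own limit."""
        info = np.iinfo(dtype)
        lo = info.min if low is None else low
        hi = info.max if high is None else high
        return np.random.randint(lo, hi, size=size, dtype=dtype)

    bounded_randint(high=10, dtype=np.int8)          # draws from [-128, 10)
    bounded_randint(low=0, size=4, dtype=np.uint8)   # draws from [0, 255)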
>> > >> > Josef >> > >> >> As I mentioned on the PR discussion, the thing that bothers me is the >> inconsistency between the new and the old functionality, specifically >> in #2. If high is, the behavior is completely different depending on >> the value of `low`. Using `np.inf` instead of `None` may fix that, >> although I think that the author's idea was to avoid having to type >> the bounds in the `None`/`+/-np.inf` cases. I think that a better >> option is to have a separate wrapper to `randint` that implements this >> behavior in a consistent manner and leaves the current function >> consistent as well. >> >> -Joe >> >> >> > >> > >> >> >> >> >> >> The primary motivation was the second case, as it is more convenient to >> >> specify a 'dtype' by itself when generating such numbers in a similar >> vein >> >> to numpy.empty, except with initialized values. >> >> >> >> Looking forward to your feedback! >> >> >> >> Greg >> >> >> >> _______________________________________________ >> >> NumPy-Discussion mailing list >> >> NumPy-Discussion at scipy.org >> >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> >> > >> > >> > _______________________________________________ >> > NumPy-Discussion mailing list >> > NumPy-Discussion at scipy.org >> > https://mail.scipy.org/mailman/listinfo/numpy-discussion >> > >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jaime.frio at gmail.com Wed Feb 17 14:59:40 2016 From: jaime.frio at gmail.com (=?UTF-8?Q?Jaime_Fern=C3=A1ndez_del_R=C3=ADo?=) Date: Wed, 17 Feb 2016 20:59:40 +0100 Subject: [Numpy-discussion] PyData Madrid Message-ID: Hi all, I just found out there is a PyData Madrid happening in early April, and it would feel wrong not to go, it being my hometown and all. Aside from the usual "Who else is going? We should meet!" I was also thinking of submitting a proposal for a talk. My idea was to put something together on "The future of NumPy indexing" and use it as an opportunity to raise awareness and hopefully gather feedback from users on the proposed changes, in sort of a "if the mountain won't come to Muhammad" type of thing. Thoughts? Comments? Anyone else going or thinking about going? Jaime -- (\__/) ( O.o) ( > <) Este es Conejo. Copia a Conejo en tu firma y ay?dale en sus planes de dominaci?n mundial. -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Wed Feb 17 15:04:37 2016 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 17 Feb 2016 15:04:37 -0500 Subject: [Numpy-discussion] making "low" optional in numpy.randint In-Reply-To: References: Message-ID: On Wed, Feb 17, 2016 at 2:20 PM, wrote: > > > On Wed, Feb 17, 2016 at 2:09 PM, G Young wrote: > >> Yes, you are correct in explaining my intentions. However, as I also >> mentioned in the PR discussion, I did not quite understand how your wrapper >> idea would make things any more comprehensive at the cost of additional >> overhead and complexity. What do you mean by making the functions >> "consistent" (i.e. outline the behavior *exactly* depending on the >> inputs)? 
As I've explained before, and I will state it again, the >> different behavior for the high=None and low != None case is due to >> backwards compatibility. >> > > > One problem is that if there is only one positional argument, then I can > still figure out that it might have different meanings. > If there are two keywords, then I would assume standard python argument > interpretation applies. > > If I want to save on typing, then I think it should be for a more > "standard" case. (I also never sample all real numbers, at least not > uniformly.) > One more thing I don't like: So far all distributions are "theoretical" distributions where the distribution depends on the provided shape, location and scale parameters. There is a limitation in how they are represented as numbers/dtype and what range is possible. However, that is not relevant for most use cases. In this case you are promoting `dtype` from a memory or storage parameter to an actual shape (or loc and scale) parameter. That's "weird", and even more so if this would be the default behavior. There is no proper uniform distribution on all integers. So, this forces users to think about the implementation detail like dtype, when I just want a random sample of a probability distribution. Josef > > Josef > > > >> >> On Wed, Feb 17, 2016 at 6:52 PM, Joseph Fox-Rabinovitz < >> jfoxrabinovitz at gmail.com> wrote: >> >>> On Wed, Feb 17, 2016 at 1:37 PM, wrote: >>> > >>> > >>> > On Wed, Feb 17, 2016 at 10:01 AM, G Young wrote: >>> >> >>> >> Hello all, >>> >> >>> >> I have a PR open here that makes "low" an optional parameter in >>> >> numpy.randint and introduces new behavior into the API as follows: >>> >> >>> >> 1) `low == None` and `high == None` >>> >> >>> >> Numbers are generated over the range `[lowbnd, highbnd)`, where >>> `lowbnd = >>> >> np.iinfo(dtype).min`, and `highbnd = np.iinfo(dtype).max`, where >>> `dtype` is >>> >> the provided integral type. >>> >> >>> >> 2) `low != None` and `high == None` >>> >> >>> >> If `low >= 0`, numbers are still generated over the range `[0, >>> >> low)`, but if `low` < 0, numbers are generated over the range `[low, >>> >> highbnd)`, where `highbnd` is defined as above. >>> >> >>> >> 3) `low == None` and `high != None` >>> >> >>> >> Numbers are generated over the range `[lowbnd, high)`, where `lowbnd` >>> is >>> >> defined as above. >>> > >>> > >>> > My impression (*) is that this will be confusing, and uses a default >>> that I >>> > never ever needed. >>> > >>> > Maybe a better way would be to use low=-np.inf and high=np.inf where >>> inf >>> > would be interpreted as the smallest and largest representable number. >>> And >>> > leave the defaults unchanged. >>> > >>> > (*) I didn't try to understand how it works for various cases. >>> > >>> > Josef >>> > >>> >>> As I mentioned on the PR discussion, the thing that bothers me is the >>> inconsistency between the new and the old functionality, specifically >>> in #2. If high is, the behavior is completely different depending on >>> the value of `low`. Using `np.inf` instead of `None` may fix that, >>> although I think that the author's idea was to avoid having to type >>> the bounds in the `None`/`+/-np.inf` cases. I think that a better >>> option is to have a separate wrapper to `randint` that implements this >>> behavior in a consistent manner and leaves the current function >>> consistent as well. 
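A short sketch of the distinction being drawn here (current API; the `None` behavior exists only in the proposal):

    import numpy as np

    # Today dtype is a storage detail: low/high fix the distribution, and both
    # calls below sample the same [0, 100) distribution, just stored differently.
    np.random.randint(0, 100, size=10, dtype=np.uint8)
    np.random.randint(0, 100, size=10, dtype=np.int64)

    # Under the proposal, omitting low/high would make dtype choose the bounds
    # as well, e.g. (hypothetically) randint(None, None, dtype=np.uint8)
    # drawing from [0, 255).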
>>> >>> -Joe >>> >>> >>> > >>> > >>> >> >>> >> >>> >> The primary motivation was the second case, as it is more convenient >>> to >>> >> specify a 'dtype' by itself when generating such numbers in a similar >>> vein >>> >> to numpy.empty, except with initialized values. >>> >> >>> >> Looking forward to your feedback! >>> >> >>> >> Greg >>> >> >>> >> _______________________________________________ >>> >> NumPy-Discussion mailing list >>> >> NumPy-Discussion at scipy.org >>> >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >> >>> > >>> > >>> > _______________________________________________ >>> > NumPy-Discussion mailing list >>> > NumPy-Discussion at scipy.org >>> > https://mail.scipy.org/mailman/listinfo/numpy-discussion >>> > >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> https://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From alan.isaac at gmail.com Wed Feb 17 15:30:45 2016 From: alan.isaac at gmail.com (Alan Isaac) Date: Wed, 17 Feb 2016 15:30:45 -0500 Subject: [Numpy-discussion] making "low" optional in numpy.randint In-Reply-To: References: <56C4A265.6040903@gmail.com> <56C4A99C.5080804@gmail.com> Message-ID: <56C4D875.4000007@gmail.com> On 2/17/2016 12:28 PM, G Young wrote: > Perhaps, but we are not coding in Haskell. We are coding in Python, and > the standard is that the endpoint is excluded, which renders your point > moot I'm afraid. I am not sure what "standard" you are talking about. I thought we were talking about the user interface. Nobody is proposing changing the behavior of `range`. That is an entirely separate question. I'm not trying to change any minds, but let's not rely on spurious arguments. Cheers, Alan From robert.kern at gmail.com Wed Feb 17 15:42:07 2016 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 17 Feb 2016 20:42:07 +0000 Subject: [Numpy-discussion] making "low" optional in numpy.randint In-Reply-To: <56C4D875.4000007@gmail.com> References: <56C4A265.6040903@gmail.com> <56C4A99C.5080804@gmail.com> <56C4D875.4000007@gmail.com> Message-ID: On Wed, Feb 17, 2016 at 8:30 PM, Alan Isaac wrote: > > On 2/17/2016 12:28 PM, G Young wrote: >> >> Perhaps, but we are not coding in Haskell. We are coding in Python, and >> the standard is that the endpoint is excluded, which renders your point >> moot I'm afraid. > > I am not sure what "standard" you are talking about. > I thought we were talking about the user interface. It is a persistent and consistent convention (i.e. "standard") across Python APIs that deal with integer ranges (range(), slice(), random.randrange(), ...), particularly those that end up related to indexing; e.g. `x[np.random.randint(0, len(x))]` to pull a random sample from an array. random.randint() was the one big exception, and it was considered a mistake for that very reason, soft-deprecated in favor of random.randrange(). -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From gfyoung17 at gmail.com Wed Feb 17 15:43:55 2016 From: gfyoung17 at gmail.com (G Young) Date: Wed, 17 Feb 2016 20:43:55 +0000 Subject: [Numpy-discussion] making "low" optional in numpy.randint In-Reply-To: <56C4D875.4000007@gmail.com> References: <56C4A265.6040903@gmail.com> <56C4A99C.5080804@gmail.com> <56C4D875.4000007@gmail.com> Message-ID: Joe: fair enough. A separate function seems more reasonable. Perhaps it was a wording thing, but you kept saying "wrapper," which is not the same as a separate function. Josef: I don't think we are making people think more. They're all keyword arguments, so if you don't want to think about them, then you leave them as the defaults, and everyone is happy. The 'dtype' keyword was needed by someone who wanted to generate a large array of uint8 random integers and could not just as call 'astype' due to memory constraints. I would suggest you read this issue here and the PR's that followed so that you have a better understanding as to why this 'weird' behavior was chosen. On Wed, Feb 17, 2016 at 8:30 PM, Alan Isaac wrote: > On 2/17/2016 12:28 PM, G Young wrote: > >> Perhaps, but we are not coding in Haskell. We are coding in Python, and >> the standard is that the endpoint is excluded, which renders your point >> moot I'm afraid. >> > > > I am not sure what "standard" you are talking about. > I thought we were talking about the user interface. > > Nobody is proposing changing the behavior of `range`. > That is an entirely separate question. > > I'm not trying to change any minds, but let's not rely > on spurious arguments. > > > Cheers, > Alan > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Wed Feb 17 15:46:34 2016 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Wed, 17 Feb 2016 21:46:34 +0100 Subject: [Numpy-discussion] PyData Madrid In-Reply-To: References: Message-ID: <1455741994.9869.18.camel@sipsolutions.net> On Mi, 2016-02-17 at 20:59 +0100, Jaime Fern?ndez del R?o wrote: > Hi all, > > I just found out there is a PyData Madrid happening in early April, > and it would feel wrong not to go, it being my hometown and all. > > Aside from the usual "Who else is going? We should meet!" I was also > thinking of submitting a proposal for a talk. My idea was to put > something together on "The future of NumPy indexing" and use it as an > opportunity to raise awareness and hopefully gather feedback from > users on the proposed changes, in sort of a "if the mountain won't > come to Muhammad" type of thing. > I guess you do know my last name means mountain in german? But if Muhammed might come, I should really improve my arabic ;). In any case sounds good to me if you like to do it, I don't think I will go, though it sounds nice. There are probably some other bigger things for "the future of NumPy", both impact and work wise. Such as the dtypes ideas, which might be nice to mention on such an occasion. Of course I like feedback on indexing, though (not sure if your ideas for indexing go further then what I think of right now). That NEP and code is sitting there after all with a decent chunk done and pretty much working (though relatively far from finished with testing and subclasses). Plus we have to make sure we get the details right, and there a talk may really help too :). - Sebastian > Thoughts? 
Comments? Anyone else going or thinking about going? > > Jaime > > -- > (\__/) > ( O.o) > ( > <) Este es Conejo. Copia a Conejo en tu firma y ay?dale en sus > planes de dominaci?n mundial. > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From robert.kern at gmail.com Wed Feb 17 15:48:24 2016 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 17 Feb 2016 20:48:24 +0000 Subject: [Numpy-discussion] making "low" optional in numpy.randint In-Reply-To: References: <56C4A265.6040903@gmail.com> <56C4A99C.5080804@gmail.com> <56C4D875.4000007@gmail.com> Message-ID: On Wed, Feb 17, 2016 at 8:43 PM, G Young wrote: > Josef: I don't think we are making people think more. They're all keyword arguments, so if you don't want to think about them, then you leave them as the defaults, and everyone is happy. I believe that Josef has the code's reader in mind, not the code's writer. As a reader of other people's code (and I count 6-months-ago-me as one such "other people"), I am sure to eventually encounter all of the different variants, so I will need to know all of them. -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... URL: From gfyoung17 at gmail.com Wed Feb 17 15:58:47 2016 From: gfyoung17 at gmail.com (G Young) Date: Wed, 17 Feb 2016 20:58:47 +0000 Subject: [Numpy-discussion] making "low" optional in numpy.randint In-Reply-To: References: <56C4A265.6040903@gmail.com> <56C4A99C.5080804@gmail.com> <56C4D875.4000007@gmail.com> Message-ID: I sense that this issue is now becoming more of "randint has become too complicated" I suppose we could always "add" more functions that present simpler interfaces, though if you really do want simple, there's always Python's random library you can use. On Wed, Feb 17, 2016 at 8:48 PM, Robert Kern wrote: > On Wed, Feb 17, 2016 at 8:43 PM, G Young wrote: > > > Josef: I don't think we are making people think more. They're all > keyword arguments, so if you don't want to think about them, then you leave > them as the defaults, and everyone is happy. > > I believe that Josef has the code's reader in mind, not the code's writer. > As a reader of other people's code (and I count 6-months-ago-me as one such > "other people"), I am sure to eventually encounter all of the different > variants, so I will need to know all of them. > > -- > Robert Kern > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Wed Feb 17 16:10:29 2016 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Wed, 17 Feb 2016 22:10:29 +0100 Subject: [Numpy-discussion] making "low" optional in numpy.randint In-Reply-To: References: <56C4A265.6040903@gmail.com> <56C4A99C.5080804@gmail.com> <56C4D875.4000007@gmail.com> Message-ID: <1455743429.10234.10.camel@sipsolutions.net> On Mi, 2016-02-17 at 20:48 +0000, Robert Kern wrote: > On Wed, Feb 17, 2016 at 8:43 PM, G Young wrote: > > > Josef: I don't think we are making people think more. 
They're all > keyword arguments, so if you don't want to think about them, then you > leave them as the defaults, and everyone is happy. > > I believe that Josef has the code's reader in mind, not the code's > writer. As a reader of other people's code (and I count 6-months-ago > -me as one such "other people"), I am sure to eventually encounter > all of the different variants, so I will need to know all of them. > Completely agree. Greg, if you need more then a few minutes to explain it in this case, there seems little point. It seems to me even the worst cases of your examples would be covered by writing code like: np.random.randint(np.iinfo(np.uint8).min, 10, dtype=np.uint8) And *everyone* will immediately know what is meant with just minor extra effort for writing it. We should keep the analogy to "range" as much as possible. Anything going far beyond that, can be confusing. On first sight I am not convinced that there is a serious convenience gain by doing magic here, but this is a simple case: "Explicit is better then implicit" since writing the explicit code is easy. It might also create weird bugs if the completely unexpected (most users would probably not even realize it existed) happens and you get huge numbers because you happened to have a `low=0` in there. Especially your point 2) seems confusing. As for 3) if I see `np.random.randint(high=3)` I think I would assume [0, 3).... Additionally, I am not sure the maximum int range is such a common need anyway? - Sebastian > -- > Robert Kern > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From josef.pktd at gmail.com Wed Feb 17 16:20:49 2016 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 17 Feb 2016 16:20:49 -0500 Subject: [Numpy-discussion] making "low" optional in numpy.randint In-Reply-To: References: <56C4A265.6040903@gmail.com> <56C4A99C.5080804@gmail.com> <56C4D875.4000007@gmail.com> Message-ID: On Wed, Feb 17, 2016 at 3:58 PM, G Young wrote: > I sense that this issue is now becoming more of "randint has become too > complicated" I suppose we could always "add" more functions that present > simpler interfaces, though if you really do want simple, there's always > Python's random library you can use. > > On Wed, Feb 17, 2016 at 8:48 PM, Robert Kern > wrote: > >> On Wed, Feb 17, 2016 at 8:43 PM, G Young wrote: >> >> > Josef: I don't think we are making people think more. They're all >> keyword arguments, so if you don't want to think about them, then you leave >> them as the defaults, and everyone is happy. >> >> I believe that Josef has the code's reader in mind, not the code's >> writer. As a reader of other people's code (and I count 6-months-ago-me as >> one such "other people"), I am sure to eventually encounter all of the >> different variants, so I will need to know all of them. >> > I have mostly the users in mind (i.e. me). I like simple patterns where I don't have to stare at a docstring for five minutes to understand it, or pull it up again each time I use it. dtype for storage is different from dtype as distribution parameter. --- aside, since I just read this https://news.ycombinator.com/item?id=11112763 what to avoid. 
you save a few keystrokes and spend months trying to figure out what's going on. (exaggerated) "*Note* that this convenience feature may lead to undesired behaviour when ..." from R docs Josef > >> -- >> Robert Kern >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Wed Feb 17 16:27:37 2016 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Wed, 17 Feb 2016 22:27:37 +0100 Subject: [Numpy-discussion] making "low" optional in numpy.randint In-Reply-To: <1455743429.10234.10.camel@sipsolutions.net> References: <56C4A265.6040903@gmail.com> <56C4A99C.5080804@gmail.com> <56C4D875.4000007@gmail.com> <1455743429.10234.10.camel@sipsolutions.net> Message-ID: <1455744457.10234.16.camel@sipsolutions.net> On Mi, 2016-02-17 at 22:10 +0100, Sebastian Berg wrote: > On Mi, 2016-02-17 at 20:48 +0000, Robert Kern wrote: > > On Wed, Feb 17, 2016 at 8:43 PM, G Young > > wrote: > > > > > Josef: I don't think we are making people think more. They're > > > all > > keyword arguments, so if you don't want to think about them, then > > you > > leave them as the defaults, and everyone is happy. > > > > I believe that Josef has the code's reader in mind, not the code's > > writer. As a reader of other people's code (and I count 6-months > > -ago > > -me as one such "other people"), I am sure to eventually encounter > > all of the different variants, so I will need to know all of them. > > > > Completely agree. Greg, if you need more then a few minutes to > explain > it in this case, there seems little point. It seems to me even the > worst cases of your examples would be covered by writing code like: > > np.random.randint(np.iinfo(np.uint8).min, 10, dtype=np.uint8) > > And *everyone* will immediately know what is meant with just minor > extra effort for writing it. We should keep the analogy to "range" as > much as possible. Anything going far beyond that, can be confusing. > On > first sight I am not convinced that there is a serious convenience > gain > by doing magic here, but this is a simple case: > > "Explicit is better then implicit" > > since writing the explicit code is easy. It might also create weird > bugs if the completely unexpected (most users would probably not even > realize it existed) happens and you get huge numbers because you > happened to have a `low=0` in there. Especially your point 2) seems > confusing. As for 3) if I see `np.random.randint(high=3)` I think I > would assume [0, 3).... > OK, that was silly, that is what happens of course. So it is explicit in the sense that you have pass in at least one `None` explicitly. But I am still not sure that the added convenience is big and easy to understand [1], if it was always lowest for low and highest for high, I remember get it, but it seems more complex (though None does also look a a bit like "default" and "default" is 0 for low). - Sebastian [1] As in the trade-off between added complexity vs. added convenience. > Additionally, I am not sure the maximum int range is such a common > need > anyway? 
> > - Sebastian > > > > -- > > Robert Kern > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From gfyoung17 at gmail.com Wed Feb 17 16:53:08 2016 From: gfyoung17 at gmail.com (G Young) Date: Wed, 17 Feb 2016 21:53:08 +0000 Subject: [Numpy-discussion] making "low" optional in numpy.randint In-Reply-To: <1455744457.10234.16.camel@sipsolutions.net> References: <56C4A265.6040903@gmail.com> <56C4A99C.5080804@gmail.com> <56C4D875.4000007@gmail.com> <1455743429.10234.10.camel@sipsolutions.net> <1455744457.10234.16.camel@sipsolutions.net> Message-ID: "Explicit is better than implicit" - can't argue with that. It doesn't seem like the PR has gained much traction, so I'll close it. On Wed, Feb 17, 2016 at 9:27 PM, Sebastian Berg wrote: > On Mi, 2016-02-17 at 22:10 +0100, Sebastian Berg wrote: > > On Mi, 2016-02-17 at 20:48 +0000, Robert Kern wrote: > > > On Wed, Feb 17, 2016 at 8:43 PM, G Young > > > wrote: > > > > > > > Josef: I don't think we are making people think more. They're > > > > all > > > keyword arguments, so if you don't want to think about them, then > > > you > > > leave them as the defaults, and everyone is happy. > > > > > > I believe that Josef has the code's reader in mind, not the code's > > > writer. As a reader of other people's code (and I count 6-months > > > -ago > > > -me as one such "other people"), I am sure to eventually encounter > > > all of the different variants, so I will need to know all of them. > > > > > > > Completely agree. Greg, if you need more then a few minutes to > > explain > > it in this case, there seems little point. It seems to me even the > > worst cases of your examples would be covered by writing code like: > > > > np.random.randint(np.iinfo(np.uint8).min, 10, dtype=np.uint8) > > > > And *everyone* will immediately know what is meant with just minor > > extra effort for writing it. We should keep the analogy to "range" as > > much as possible. Anything going far beyond that, can be confusing. > > On > > first sight I am not convinced that there is a serious convenience > > gain > > by doing magic here, but this is a simple case: > > > > "Explicit is better then implicit" > > > > since writing the explicit code is easy. It might also create weird > > bugs if the completely unexpected (most users would probably not even > > realize it existed) happens and you get huge numbers because you > > happened to have a `low=0` in there. Especially your point 2) seems > > confusing. As for 3) if I see `np.random.randint(high=3)` I think I > > would assume [0, 3).... > > > > OK, that was silly, that is what happens of course. So it is explicit > in the sense that you have pass in at least one `None` explicitly. > > But I am still not sure that the added convenience is big and easy to > understand [1], if it was always lowest for low and highest for high, I > remember get it, but it seems more complex (though None does also look > a a bit like "default" and "default" is 0 for low). > > - Sebastian > > [1] As in the trade-off between added complexity vs. added convenience. 
> > > > Additionally, I am not sure the maximum int range is such a common > > need > > anyway? > > > > - Sebastian > > > > > > > -- > > > Robert Kern > > > _______________________________________________ > > > NumPy-Discussion mailing list > > > NumPy-Discussion at scipy.org > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Wed Feb 17 17:18:44 2016 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Wed, 17 Feb 2016 23:18:44 +0100 Subject: [Numpy-discussion] making "low" optional in numpy.randint In-Reply-To: References: <56C4A265.6040903@gmail.com> <56C4A99C.5080804@gmail.com> <56C4D875.4000007@gmail.com> <1455743429.10234.10.camel@sipsolutions.net> <1455744457.10234.16.camel@sipsolutions.net> Message-ID: <1455747524.11979.3.camel@sipsolutions.net> On Mi, 2016-02-17 at 21:53 +0000, G Young wrote: > "Explicit is better than implicit" - can't argue with that. It > doesn't seem like the PR has gained much traction, so I'll close it. > Thanks for the effort though! Sometimes we get a bit carried away with doing fancy stuff, and I guess the idea is likely a bit too fancy for wide application. - Sebastian > On Wed, Feb 17, 2016 at 9:27 PM, Sebastian Berg < > sebastian at sipsolutions.net> wrote: > > On Mi, 2016-02-17 at 22:10 +0100, Sebastian Berg wrote: > > > On Mi, 2016-02-17 at 20:48 +0000, Robert Kern wrote: > > > > On Wed, Feb 17, 2016 at 8:43 PM, G Young > > > > wrote: > > > > > > > > > Josef: I don't think we are making people think more. > > They're > > > > > all > > > > keyword arguments, so if you don't want to think about them, > > then > > > > you > > > > leave them as the defaults, and everyone is happy. > > > > > > > > I believe that Josef has the code's reader in mind, not the > > code's > > > > writer. As a reader of other people's code (and I count 6 > > -months > > > > -ago > > > > -me as one such "other people"), I am sure to eventually > > encounter > > > > all of the different variants, so I will need to know all of > > them. > > > > > > > > > > Completely agree. Greg, if you need more then a few minutes to > > > explain > > > it in this case, there seems little point. It seems to me even > > the > > > worst cases of your examples would be covered by writing code > > like: > > > > > > np.random.randint(np.iinfo(np.uint8).min, 10, dtype=np.uint8) > > > > > > And *everyone* will immediately know what is meant with just > > minor > > > extra effort for writing it. We should keep the analogy to > > "range" as > > > much as possible. Anything going far beyond that, can be > > confusing. > > > On > > > first sight I am not convinced that there is a serious > > convenience > > > gain > > > by doing magic here, but this is a simple case: > > > > > > "Explicit is better then implicit" > > > > > > since writing the explicit code is easy. It might also create > > weird > > > bugs if the completely unexpected (most users would probably not > > even > > > realize it existed) happens and you get huge numbers because you > > > happened to have a `low=0` in there. Especially your point 2) > > seems > > > confusing. 
As for 3) if I see `np.random.randint(high=3)` I think > > I > > > would assume [0, 3).... > > > > > > > OK, that was silly, that is what happens of course. So it is > > explicit > > in the sense that you have pass in at least one `None` explicitly. > > > > But I am still not sure that the added convenience is big and easy > > to > > understand [1], if it was always lowest for low and highest for > > high, I > > remember get it, but it seems more complex (though None does also > > look > > a a bit like "default" and "default" is 0 for low). > > > > - Sebastian > > > > [1] As in the trade-off between added complexity vs. added > > convenience. > > > > > > > Additionally, I am not sure the maximum int range is such a > > common > > > need > > > anyway? > > > > > > - Sebastian > > > > > > > > > > -- > > > > Robert Kern > > > > _______________________________________________ > > > > NumPy-Discussion mailing list > > > > NumPy-Discussion at scipy.org > > > _______________________________________________ > > > NumPy-Discussion mailing list > > > NumPy-Discussion at scipy.org > > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From andy.terrel at gmail.com Wed Feb 17 17:35:28 2016 From: andy.terrel at gmail.com (Andy Ray Terrel) Date: Wed, 17 Feb 2016 16:35:28 -0600 Subject: [Numpy-discussion] GSoC? In-Reply-To: References: <05A90159-9A13-4117-9A2E-76029AEFD0C4@continuum.io> Message-ID: On Wed, Feb 17, 2016 at 12:57 PM, Chris Barker wrote: > Apparetnly, NumFocus is applyign to be a GSoC Umbrella org as well: > > https://github.com/numfocus/gsoc > > Not sure why one might choose NumFocus vs PSF... > > No reason to choose, you can get students from both orgs. > -Chris > > > On Wed, Feb 17, 2016 at 6:05 AM, Bryan Van de Ven > wrote: > >> [This is a complete tangent, and so I apologize in advance.] >> >> We are considering applying to GSOC for Bokeh. However, I have zero >> experience with GSOC, but non-zero questions (e.g. go it alone, vs apply >> through PSF... I think?) If anyone with experience from the mentoring >> organization side of things wouldn't mind a quick chat (or a few emails) to >> answer questions, share your experience, or offer advice, please drop me a >> line directly. >> >> Thanks, >> >> Bryan >> >> >> >> > On Feb 17, 2016, at 1:14 AM, Stephan Hoyer wrote: >> > >> > On Wed, Feb 10, 2016 at 4:22 PM, Chris Barker >> wrote: >> > We might consider adding "improve duck typing for numpy arrays" >> > >> > care to elaborate on that one? >> > >> > I know it come up on here that it would be good to have some code in >> numpy itself that made it easier to make array-like objects (I.e. do >> indexing the same way) Is that what you mean? >> > >> > I was thinking particularly of improving the compatibility of numpy >> functions (e.g., concatenate) with non-numpy array-like objects, but now >> that you mention it utilities to make it easier to make array-like objects >> could also be a good thing. 
>> > >> > In any case, I've now elaborated on my thought into a full project idea >> on the Wiki: >> > >> https://github.com/scipy/scipy/wiki/GSoC-2016-project-ideas#improved-duck-typing-support-for-n-dimensional-arrays >> > >> > Arguably, this might be too difficult for most GSoC students -- the API >> design questions here are quite contentious. But given that "Pythonic >> dtypes" is up there as a GSoC proposal still it's in good company. >> > >> > Cheers, >> > Stephan >> > >> > _______________________________________________ >> > NumPy-Discussion mailing list >> > NumPy-Discussion at scipy.org >> > https://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > > > -- > > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From alan.isaac at gmail.com Wed Feb 17 18:29:30 2016 From: alan.isaac at gmail.com (Alan Isaac) Date: Wed, 17 Feb 2016 18:29:30 -0500 Subject: [Numpy-discussion] making "low" optional in numpy.randint In-Reply-To: References: <56C4A265.6040903@gmail.com> <56C4A99C.5080804@gmail.com> <56C4D875.4000007@gmail.com> Message-ID: <56C5025A.2090601@gmail.com> On 2/17/2016 3:42 PM, Robert Kern wrote: > random.randint() was the one big exception, and it was considered a > mistake for that very reason, soft-deprecated in favor of > random.randrange(). randrange also has its detractors: https://code.activestate.com/lists/python-dev/138358/ and following. I think if we start citing persistant conventions, the persistent convention across *many* languages that the bounds provided for a random integer range are inclusive also counts for something, especially when the names are essentially shared. But again, I am just trying to be clear about what is at issue, not push for a change. I think citing non-existent standards is not helpful. I think the discrepancy between the Python standard library and numpy for a function going by a common name is harmful. (But then, I teach.) fwiw, Alan From jni.soma at gmail.com Wed Feb 17 18:48:39 2016 From: jni.soma at gmail.com (Juan Nunez-Iglesias) Date: Thu, 18 Feb 2016 10:48:39 +1100 Subject: [Numpy-discussion] making "low" optional in numpy.randint In-Reply-To: <56C5025A.2090601@gmail.com> References: <56C4A265.6040903@gmail.com> <56C4A99C.5080804@gmail.com> <56C4D875.4000007@gmail.com> <56C5025A.2090601@gmail.com> Message-ID: Also fwiw, I think the 0-based, half-open interval is one of the best features of Python indexing and yes, I do use random integers to index into my arrays and would not appreciate having to litter my code with "-1" everywhere. On Thu, Feb 18, 2016 at 10:29 AM, Alan Isaac wrote: > On 2/17/2016 3:42 PM, Robert Kern wrote: > >> random.randint() was the one big exception, and it was considered a >> mistake for that very reason, soft-deprecated in favor of >> random.randrange(). >> > > > randrange also has its detractors: > https://code.activestate.com/lists/python-dev/138358/ > and following. 
> > I think if we start citing persistant conventions, the > persistent convention across *many* languages that the bounds > provided for a random integer range are inclusive also counts for > something, especially when the names are essentially shared. > > But again, I am just trying to be clear about what is at issue, > not push for a change. I think citing non-existent standards > is not helpful. I think the discrepancy between the Python > standard library and numpy for a function going by a common > name is harmful. (But then, I teach.) > > fwiw, > > Alan > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gfyoung17 at gmail.com Wed Feb 17 18:52:35 2016 From: gfyoung17 at gmail.com (G Young) Date: Wed, 17 Feb 2016 23:52:35 +0000 Subject: [Numpy-discussion] making "low" optional in numpy.randint In-Reply-To: References: <56C4A265.6040903@gmail.com> <56C4A99C.5080804@gmail.com> <56C4D875.4000007@gmail.com> <56C5025A.2090601@gmail.com> Message-ID: Your statement is a little self-contradictory, but in any case, you shouldn't worry about random_integers getting removed from the code-base. However, it has been deprecated in favor of randint. On Wed, Feb 17, 2016 at 11:48 PM, Juan Nunez-Iglesias wrote: > Also fwiw, I think the 0-based, half-open interval is one of the best > features of Python indexing and yes, I do use random integers to index into > my arrays and would not appreciate having to litter my code with "-1" > everywhere. > > On Thu, Feb 18, 2016 at 10:29 AM, Alan Isaac wrote: > >> On 2/17/2016 3:42 PM, Robert Kern wrote: >> >>> random.randint() was the one big exception, and it was considered a >>> mistake for that very reason, soft-deprecated in favor of >>> random.randrange(). >>> >> >> >> randrange also has its detractors: >> https://code.activestate.com/lists/python-dev/138358/ >> and following. >> >> I think if we start citing persistant conventions, the >> persistent convention across *many* languages that the bounds >> provided for a random integer range are inclusive also counts for >> something, especially when the names are essentially shared. >> >> But again, I am just trying to be clear about what is at issue, >> not push for a change. I think citing non-existent standards >> is not helpful. I think the discrepancy between the Python >> standard library and numpy for a function going by a common >> name is harmful. (But then, I teach.) >> >> fwiw, >> >> Alan >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jni.soma at gmail.com Wed Feb 17 18:53:50 2016 From: jni.soma at gmail.com (Juan Nunez-Iglesias) Date: Thu, 18 Feb 2016 10:53:50 +1100 Subject: [Numpy-discussion] making "low" optional in numpy.randint In-Reply-To: References: <56C4A265.6040903@gmail.com> <56C4A99C.5080804@gmail.com> <56C4D875.4000007@gmail.com> <56C5025A.2090601@gmail.com> Message-ID: LOL "random integers" != "random_integers". 
=D On Thu, Feb 18, 2016 at 10:52 AM, G Young wrote: > Your statement is a little self-contradictory, but in any case, you > shouldn't worry about random_integers getting removed from the code-base. > However, it has been deprecated in favor of randint. > > On Wed, Feb 17, 2016 at 11:48 PM, Juan Nunez-Iglesias > wrote: > >> Also fwiw, I think the 0-based, half-open interval is one of the best >> features of Python indexing and yes, I do use random integers to index into >> my arrays and would not appreciate having to litter my code with "-1" >> everywhere. >> >> On Thu, Feb 18, 2016 at 10:29 AM, Alan Isaac >> wrote: >> >>> On 2/17/2016 3:42 PM, Robert Kern wrote: >>> >>>> random.randint() was the one big exception, and it was considered a >>>> mistake for that very reason, soft-deprecated in favor of >>>> random.randrange(). >>>> >>> >>> >>> randrange also has its detractors: >>> https://code.activestate.com/lists/python-dev/138358/ >>> and following. >>> >>> I think if we start citing persistant conventions, the >>> persistent convention across *many* languages that the bounds >>> provided for a random integer range are inclusive also counts for >>> something, especially when the names are essentially shared. >>> >>> But again, I am just trying to be clear about what is at issue, >>> not push for a change. I think citing non-existent standards >>> is not helpful. I think the discrepancy between the Python >>> standard library and numpy for a function going by a common >>> name is harmful. (But then, I teach.) >>> >>> fwiw, >>> >>> Alan >>> >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> https://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Wed Feb 17 18:55:00 2016 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 17 Feb 2016 23:55:00 +0000 Subject: [Numpy-discussion] making "low" optional in numpy.randint In-Reply-To: References: <56C4A265.6040903@gmail.com> <56C4A99C.5080804@gmail.com> <56C4D875.4000007@gmail.com> <56C5025A.2090601@gmail.com> Message-ID: He was talking consistently about "random integers" not "random_integers()". :-) On Wednesday, 17 February 2016, G Young wrote: > Your statement is a little self-contradictory, but in any case, you > shouldn't worry about random_integers getting removed from the code-base. > However, it has been deprecated in favor of randint. > > On Wed, Feb 17, 2016 at 11:48 PM, Juan Nunez-Iglesias > wrote: > >> Also fwiw, I think the 0-based, half-open interval is one of the best >> features of Python indexing and yes, I do use random integers to index into >> my arrays and would not appreciate having to litter my code with "-1" >> everywhere. >> >> On Thu, Feb 18, 2016 at 10:29 AM, Alan Isaac > > wrote: >> >>> On 2/17/2016 3:42 PM, Robert Kern wrote: >>> >>>> random.randint() was the one big exception, and it was considered a >>>> mistake for that very reason, soft-deprecated in favor of >>>> random.randrange(). 
>>>> >>> >>> >>> randrange also has its detractors: >>> https://code.activestate.com/lists/python-dev/138358/ >>> and following. >>> >>> I think if we start citing persistant conventions, the >>> persistent convention across *many* languages that the bounds >>> provided for a random integer range are inclusive also counts for >>> something, especially when the names are essentially shared. >>> >>> But again, I am just trying to be clear about what is at issue, >>> not push for a change. I think citing non-existent standards >>> is not helpful. I think the discrepancy between the Python >>> standard library and numpy for a function going by a common >>> name is harmful. (But then, I teach.) >>> >>> fwiw, >>> >>> Alan >>> >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> >>> https://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... URL: From alan.isaac at gmail.com Wed Feb 17 18:59:15 2016 From: alan.isaac at gmail.com (Alan Isaac) Date: Wed, 17 Feb 2016 18:59:15 -0500 Subject: [Numpy-discussion] making "low" optional in numpy.randint In-Reply-To: References: <56C4A265.6040903@gmail.com> <56C4A99C.5080804@gmail.com> <56C4D875.4000007@gmail.com> <56C5025A.2090601@gmail.com> Message-ID: <56C50953.4000108@gmail.com> On 2/17/2016 6:48 PM, Juan Nunez-Iglesias wrote: > Also fwiw, I think the 0-based, half-open interval is one of the best > features of Python indexing and yes, I do use random integers to index > into my arrays and would not appreciate having to litter my code with > "-1" everywhere. http://docs.scipy.org/doc/numpy-1.10.0/reference/generated /numpy.random.choice.html fwiw, Alan Isaac From jni.soma at gmail.com Wed Feb 17 19:01:46 2016 From: jni.soma at gmail.com (Juan Nunez-Iglesias) Date: Thu, 18 Feb 2016 11:01:46 +1100 Subject: [Numpy-discussion] making "low" optional in numpy.randint In-Reply-To: <56C50953.4000108@gmail.com> References: <56C4A265.6040903@gmail.com> <56C4A99C.5080804@gmail.com> <56C4D875.4000007@gmail.com> <56C5025A.2090601@gmail.com> <56C50953.4000108@gmail.com> Message-ID: Notice the limitation "1D array-like". On Thu, Feb 18, 2016 at 10:59 AM, Alan Isaac wrote: > On 2/17/2016 6:48 PM, Juan Nunez-Iglesias wrote: > >> Also fwiw, I think the 0-based, half-open interval is one of the best >> features of Python indexing and yes, I do use random integers to index >> into my arrays and would not appreciate having to litter my code with >> "-1" everywhere. >> > > > http://docs.scipy.org/doc/numpy-1.10.0/reference/generated > /numpy.random.choice.html > > fwiw, > Alan Isaac > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From alan.isaac at gmail.com Wed Feb 17 19:08:08 2016 From: alan.isaac at gmail.com (Alan Isaac) Date: Wed, 17 Feb 2016 19:08:08 -0500 Subject: [Numpy-discussion] making "low" optional in numpy.randint In-Reply-To: References: <56C4A265.6040903@gmail.com> <56C4A99C.5080804@gmail.com> <56C4D875.4000007@gmail.com> <56C5025A.2090601@gmail.com> <56C50953.4000108@gmail.com> Message-ID: <56C50B68.1070205@gmail.com> On 2/17/2016 7:01 PM, Juan Nunez-Iglesias wrote: > Notice the limitation "1D array-like". http://docs.scipy.org/doc/numpy-1.10.0/reference/generated/numpy.random.choice.html "If an int, the random sample is generated as if a was np.arange(n)" hth, Alan Isaac From jni.soma at gmail.com Wed Feb 17 19:17:24 2016 From: jni.soma at gmail.com (Juan Nunez-Iglesias) Date: Thu, 18 Feb 2016 11:17:24 +1100 Subject: [Numpy-discussion] making "low" optional in numpy.randint In-Reply-To: <56C50B68.1070205@gmail.com> References: <56C4A265.6040903@gmail.com> <56C4A99C.5080804@gmail.com> <56C4D875.4000007@gmail.com> <56C5025A.2090601@gmail.com> <56C50953.4000108@gmail.com> <56C50B68.1070205@gmail.com> Message-ID: Ah! Touch?! =) My last and admittedly weak defense is that I've been writing numpy since before 1.7. =) On Thu, Feb 18, 2016 at 11:08 AM, Alan Isaac wrote: > On 2/17/2016 7:01 PM, Juan Nunez-Iglesias wrote: > >> Notice the limitation "1D array-like". >> > > > > http://docs.scipy.org/doc/numpy-1.10.0/reference/generated/numpy.random.choice.html > "If an int, the random sample is generated as if a was np.arange(n)" > > hth, > > Alan Isaac > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From nr4qewd6v4 at snkmail.com Wed Feb 17 19:35:27 2016 From: nr4qewd6v4 at snkmail.com (.) Date: Thu, 18 Feb 2016 00:35:27 +0000 Subject: [Numpy-discussion] proposal: new logspace without the log in the argument Message-ID: <1258-1455755727-616572@sneakemail.com> I've suggested a new function similar to logspace, but where you specify the start and stop points directly instead of using log(start) and base arguments: https://github.com/numpy/numpy/issues/7255 https://github.com/numpy/numpy/pull/7268 From matthew.brett at berkeley.edu Wed Feb 17 20:08:01 2016 From: matthew.brett at berkeley.edu (Matthew Brett) Date: Wed, 17 Feb 2016 17:08:01 -0800 Subject: [Numpy-discussion] Fwd: Multi-distribution Linux wheels - please test Message-ID: Hi, On Tue, Feb 9, 2016 at 12:01 PM, Freddy Rietdijk wrote: > On Nix we also had trouble with OpenBLAS 0.2.15. Version 0.2.14 did not > cause any segmentation faults so we reverted to that version. > https://github.com/scipy/scipy/issues/5620 > > (hopefully this time the e-mail gets through) In hope, I tried building with 0.2.12 and 0.2.14, but 0.2.12 gave me an extra test failure, and 0.2.14 gave the same test failure as 0.2.15.. Sadly... Matthew From josef.pktd at gmail.com Wed Feb 17 21:24:41 2016 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 17 Feb 2016 21:24:41 -0500 Subject: [Numpy-discussion] making "low" optional in numpy.randint In-Reply-To: References: <56C4A265.6040903@gmail.com> <56C4A99C.5080804@gmail.com> <56C4D875.4000007@gmail.com> <56C5025A.2090601@gmail.com> <56C50953.4000108@gmail.com> <56C50B68.1070205@gmail.com> Message-ID: On Wed, Feb 17, 2016 at 7:17 PM, Juan Nunez-Iglesias wrote: > Ah! Touch?! 
=) My last and admittedly weak defense is that I've been > writing numpy since before 1.7. =) > > On Thu, Feb 18, 2016 at 11:08 AM, Alan Isaac wrote: > >> On 2/17/2016 7:01 PM, Juan Nunez-Iglesias wrote: >> >>> Notice the limitation "1D array-like". >>> >> >> >> >> http://docs.scipy.org/doc/numpy-1.10.0/reference/generated/numpy.random.choice.html >> "If an int, the random sample is generated as if a was np.arange(n)" >> > (un)related aside: my R doc quote about "may lead to undesired behavior" refers to this, IIRC, R's `sample` was the inspiration for this function but numpy distinguishes scalar from one element (1D) arrays >>> for i in range(3, 10): np.random.choice(np.arange(10)[i:]) Josef > >> hth, >> >> Alan Isaac >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From antony.lee at berkeley.edu Thu Feb 18 13:13:56 2016 From: antony.lee at berkeley.edu (Antony Lee) Date: Thu, 18 Feb 2016 10:13:56 -0800 Subject: [Numpy-discussion] FeatureRequest: support for array construction from iterators In-Reply-To: References: <56585843.80103@gmail.com> Message-ID: Actually, while working on https://github.com/numpy/numpy/issues/7264 I realized that the memory efficiency (one-pass) argument is simply incorrect: import numpy as np class A: def __getitem__(self, i): print("A get item", i) return [np.int8(1), np.int8(2)][i] def __len__(self): return 2 print(repr(np.array(A()))) This prints out A get item 0 A get item 1 A get item 2 A get item 0 A get item 1 A get item 2 A get item 0 A get item 1 A get item 2 array([1, 2], dtype=int8) i.e. the sequence is "turned into a concrete sequence" no less than 3 times. Antony 2016-01-19 11:33 GMT-08:00 Stephan Sahm : > just to not prevent it from the black hole - what about integrating > fromiter into array? (see the post by Benjamin Root) > > for me personally, taking the first element for deducing the dtype would > be a perfect default way to read generators. If one wants a specific other > dtype, one could specify it like in the current fromiter method. > > On 15 December 2015 at 08:08, Stephan Sahm wrote: > >> I would like to further push Benjamin Root's suggestion: >> >> "Therefore, I think it is not out of the realm of reason that passing a >> generator object and a dtype could then delegate the work under the hood to >> np.fromiter()? I would even go so far as to raise an error if one passes a >> generator without specifying dtype to np.array(). The point is to reduce >> the number of entry points for creating numpy arrays." >> >> would this be ok? >> >> On Mon, Dec 14, 2015 at 6:50 PM Robert Kern >> wrote: >> >>> On Mon, Dec 14, 2015 at 5:41 PM, Benjamin Root >>> wrote: >>> > >>> > Heh, never noticed that. Was it implemented more like a >>> generator/iterator in older versions of Python? >>> >>> No, it predates generators and iterators so it has always had to be >>> implemented like that. 
>>> >>> -- >>> Robert Kern >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> https://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From antony.lee at berkeley.edu Thu Feb 18 13:15:44 2016 From: antony.lee at berkeley.edu (Antony Lee) Date: Thu, 18 Feb 2016 10:15:44 -0800 Subject: [Numpy-discussion] Suggestion: special-case np.array(range(...)) to be faster In-Reply-To: References: Message-ID: Mostly so that there is no performance lost when someone passes range(...) instead of np.arange(...). At least I had never realized that one is much faster than the other and always just passed range() as a convenience. Antony 2016-02-17 10:50 GMT-08:00 Chris Barker : > On Sun, Feb 14, 2016 at 11:41 PM, Antony Lee > wrote: > >> So how can np.array(range(...)) even work? >> > > range() (in py3) is not a generator, nor is is a iterator. it is a range > object, which is lazily evaluated, and satisfies both the iterator protocol > and the sequence protocol (at least most of it: > > In [*1*]: r = range(10) > > > In [*2*]: r[3] > > Out[*2*]: 3 > > > In [*3*]: len(r) > > Out[*3*]: 10 > > > In [*4*]: type(r) > > Out[*4*]: range > > In [*9*]: isinstance(r, collections.abc.Sequence) > > Out[*9*]: True > > In [*10*]: l = list() > > In [*11*]: isinstance(l, collections.abc.Sequence) > > Out[*11*]: True > > In [*12*]: isinstance(r, collections.abc.Iterable) > > Out[*12*]: True > I'm still totally confused as to why we'd need to special-case range when > we have arange(). > > -CHB > > > > -- > > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Thu Feb 18 14:12:13 2016 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 18 Feb 2016 14:12:13 -0500 Subject: [Numpy-discussion] Suggestion: special-case np.array(range(...)) to be faster In-Reply-To: References: Message-ID: On Thu, Feb 18, 2016 at 1:15 PM, Antony Lee wrote: > Mostly so that there is no performance lost when someone passes range(...) > instead of np.arange(...). At least I had never realized that one is much > faster than the other and always just passed range() as a convenience. > > Antony > > 2016-02-17 10:50 GMT-08:00 Chris Barker : > >> On Sun, Feb 14, 2016 at 11:41 PM, Antony Lee >> wrote: >> >>> So how can np.array(range(...)) even work? >>> >> >> range() (in py3) is not a generator, nor is is a iterator. it is a range >> object, which is lazily evaluated, and satisfies both the iterator protocol >> and the sequence protocol (at least most of it: >> >> In [*1*]: r = range(10) >> > thanks, I didn't know that the range r here doesn't get eaten by iterating through it while r = (i for i in range(5)) is only good for a single pass. 
(tried on python 3.4) Josef > >> In [*2*]: r[3] >> >> Out[*2*]: 3 >> >> >> In [*3*]: len(r) >> >> Out[*3*]: 10 >> >> >> In [*4*]: type(r) >> >> Out[*4*]: range >> >> In [*9*]: isinstance(r, collections.abc.Sequence) >> >> Out[*9*]: True >> >> In [*10*]: l = list() >> >> In [*11*]: isinstance(l, collections.abc.Sequence) >> >> Out[*11*]: True >> >> In [*12*]: isinstance(r, collections.abc.Iterable) >> >> Out[*12*]: True >> I'm still totally confused as to why we'd need to special-case range when >> we have arange(). >> >> -CHB >> >> >> >> -- >> >> Christopher Barker, Ph.D. >> Oceanographer >> >> Emergency Response Division >> NOAA/NOS/OR&R (206) 526-6959 voice >> 7600 Sand Point Way NE (206) 526-6329 fax >> Seattle, WA 98115 (206) 526-6317 main reception >> >> Chris.Barker at noaa.gov >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Thu Feb 18 14:38:11 2016 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 18 Feb 2016 11:38:11 -0800 Subject: [Numpy-discussion] proposal: new logspace without the log in the argument In-Reply-To: <1258-1455755727-616572@sneakemail.com> References: <1258-1455755727-616572@sneakemail.com> Message-ID: Some questions it'd be good to get feedback on: - any better ideas for naming it than "geomspace"? It's really too bad that the 'logspace' name is already taken. - I guess the alternative interface might be something like np.linspace(start, stop, steps, spacing="log") what do people think? -n On Wed, Feb 17, 2016 at 4:35 PM, . wrote: > I've suggested a new function similar to logspace, but where you specify the start and stop points directly instead of using log(start) and base arguments: > > https://github.com/numpy/numpy/issues/7255 > https://github.com/numpy/numpy/pull/7268 > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion -- Nathaniel J. Smith -- https://vorpus.org From robert.kern at gmail.com Thu Feb 18 14:44:08 2016 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 18 Feb 2016 19:44:08 +0000 Subject: [Numpy-discussion] proposal: new logspace without the log in the argument In-Reply-To: References: <1258-1455755727-616572@sneakemail.com> Message-ID: On Thu, Feb 18, 2016 at 7:38 PM, Nathaniel Smith wrote: > > Some questions it'd be good to get feedback on: > > - any better ideas for naming it than "geomspace"? It's really too bad > that the 'logspace' name is already taken. geomspace() is a perfectly cromulent name, IMO. > - I guess the alternative interface might be something like > > np.linspace(start, stop, steps, spacing="log") > > what do people think? In a new function not named `linspace()`, I think that might be fine. I do occasionally want to swap between linear and logarithmic/geometric spacing based on a parameter, so this doesn't violate the van Rossum Rule of Function Signatures. -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... 
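For reference, a rough sketch of what the proposed geomspace(start, stop, num) boils down to for positive endpoints -- just the exp/log round trip, not the actual implementation in the linked PR, and the helper name here is made up:

    import numpy as np

    def geomspace_sketch(start, stop, num=50):
        # geometrically spaced samples between two positive endpoints,
        # specified directly rather than via their logarithms
        return np.exp(np.linspace(np.log(start), np.log(stop), num))

    geomspace_sketch(1.0, 1000.0, num=4)
    # -> array([    1.,    10.,   100.,  1000.]), up to floating point rounding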
URL: From jfoxrabinovitz at gmail.com Thu Feb 18 15:19:58 2016 From: jfoxrabinovitz at gmail.com (Joseph Fox-Rabinovitz) Date: Thu, 18 Feb 2016 15:19:58 -0500 Subject: [Numpy-discussion] proposal: new logspace without the log in the argument In-Reply-To: References: <1258-1455755727-616572@sneakemail.com> Message-ID: I like the idea, as long as we all remain aware of the irony of having a "log" spacing for a function named "lin"space. -Joe On Thu, Feb 18, 2016 at 2:44 PM, Robert Kern wrote: > On Thu, Feb 18, 2016 at 7:38 PM, Nathaniel Smith wrote: >> >> Some questions it'd be good to get feedback on: >> >> - any better ideas for naming it than "geomspace"? It's really too bad >> that the 'logspace' name is already taken. > > geomspace() is a perfectly cromulent name, IMO. > >> - I guess the alternative interface might be something like >> >> np.linspace(start, stop, steps, spacing="log") >> >> what do people think? > > In a new function not named `linspace()`, I think that might be fine. I do > occasionally want to swap between linear and logarithmic/geometric spacing > based on a parameter, so this doesn't violate the van Rossum Rule of > Function Signatures. > > -- > Robert Kern > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > From alan.isaac at gmail.com Thu Feb 18 17:19:36 2016 From: alan.isaac at gmail.com (Alan Isaac) Date: Thu, 18 Feb 2016 17:19:36 -0500 Subject: [Numpy-discussion] proposal: new logspace without the log in the argument In-Reply-To: References: <1258-1455755727-616572@sneakemail.com> Message-ID: <56C64378.4060805@gmail.com> On 2/18/2016 2:44 PM, Robert Kern wrote: > In a new function not named `linspace()`, I think that might be fine. I do occasionally want to swap between linear and logarithmic/geometric spacing based on a parameter, so this > doesn't violate the van Rossum Rule of Function Signatures. Would such a new function correct the apparent mistake (?) of `linspace` including the endpoint by default? Or is the current API justified by its Matlab origins? (Or have I missed the point altogether?) If this query is annoying, please ignore it. It is not meant to be. Alan From chris.barker at noaa.gov Thu Feb 18 17:21:02 2016 From: chris.barker at noaa.gov (Chris Barker) Date: Thu, 18 Feb 2016 14:21:02 -0800 Subject: [Numpy-discussion] Suggestion: special-case np.array(range(...)) to be faster In-Reply-To: References: Message-ID: On Thu, Feb 18, 2016 at 10:15 AM, Antony Lee wrote: > Mostly so that there is no performance lost when someone passes range(...) > instead of np.arange(...). At least I had never realized that one is much > faster than the other and always just passed range() as a convenience. > Well, pretty much everything in numpy is faster if you use the numpy array version rather than plain python -- this hardly seems like the extra code would be worth it. numpy's array() constructor can (and should) take an arbitrary iterable. It does make some sense that you we might want to special case iterators, as you don't want to loop through them too many times, which is what np.fromiter() is for. and _maybe_ it would be worth special casing python lists, as you can access items faster, and they are really, really common (or has this already been done?), but special casing range() is getting silly. And it might be hard to do. 
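Since np.fromiter() is the route being pointed to for true iterators, a small self-contained example of it (the generator and the count value are only illustrative; dtype is required, count is optional but lets numpy preallocate, and fromiter only builds 1-D arrays):

    import numpy as np

    gen = (i * i for i in range(10))                  # a one-shot generator
    a = np.fromiter(gen, dtype=np.int64, count=10)    # single pass, preallocated
    # a -> array([ 0,  1,  4,  9, 16, 25, 36, 49, 64, 81])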
At the C level I suppose you could actually know what the parameters and state of the range object are and create an array directly from that -- but that's what arange is for... -CHB > 2016-02-17 10:50 GMT-08:00 Chris Barker : > >> On Sun, Feb 14, 2016 at 11:41 PM, Antony Lee >> wrote: >> >>> So how can np.array(range(...)) even work? >>> >> >> range() (in py3) is not a generator, nor is is a iterator. it is a range >> object, which is lazily evaluated, and satisfies both the iterator protocol >> and the sequence protocol (at least most of it: >> >> In [*1*]: r = range(10) >> >> >> In [*2*]: r[3] >> >> Out[*2*]: 3 >> >> >> In [*3*]: len(r) >> >> Out[*3*]: 10 >> >> >> In [*4*]: type(r) >> >> Out[*4*]: range >> >> In [*9*]: isinstance(r, collections.abc.Sequence) >> >> Out[*9*]: True >> >> In [*10*]: l = list() >> >> In [*11*]: isinstance(l, collections.abc.Sequence) >> >> Out[*11*]: True >> >> In [*12*]: isinstance(r, collections.abc.Iterable) >> >> Out[*12*]: True >> I'm still totally confused as to why we'd need to special-case range when >> we have arange(). >> >> -CHB >> >> >> >> -- >> >> Christopher Barker, Ph.D. >> Oceanographer >> >> Emergency Response Division >> NOAA/NOS/OR&R (206) 526-6959 voice >> 7600 Sand Point Way NE (206) 526-6329 fax >> Seattle, WA 98115 (206) 526-6317 main reception >> >> Chris.Barker at noaa.gov >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Thu Feb 18 17:26:08 2016 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 18 Feb 2016 22:26:08 +0000 Subject: [Numpy-discussion] proposal: new logspace without the log in the argument In-Reply-To: <56C64378.4060805@gmail.com> References: <1258-1455755727-616572@sneakemail.com> <56C64378.4060805@gmail.com> Message-ID: On Thu, Feb 18, 2016 at 10:19 PM, Alan Isaac wrote: > > On 2/18/2016 2:44 PM, Robert Kern wrote: >> >> In a new function not named `linspace()`, I think that might be fine. I do occasionally want to swap between linear and logarithmic/geometric spacing based on a parameter, so this >> doesn't violate the van Rossum Rule of Function Signatures. > > Would such a new function correct the apparent mistake (?) of > `linspace` including the endpoint by default? > Or is the current API justified by its Matlab origins? > (Or have I missed the point altogether?) The last, I'm afraid. Different use cases, different conventions. Integer ranges are half-open because that is the most useful convention in a 0-indexed ecosystem. Floating point ranges don't interface with indexing, and the closed intervals are the most useful (or at least the most common). > If this query is annoying, please ignore it. It is not meant to be. The same for my answer. -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From chris.barker at noaa.gov Thu Feb 18 17:29:55 2016 From: chris.barker at noaa.gov (Chris Barker) Date: Thu, 18 Feb 2016 14:29:55 -0800 Subject: [Numpy-discussion] proposal: new logspace without the log in the argument In-Reply-To: <56C64378.4060805@gmail.com> References: <1258-1455755727-616572@sneakemail.com> <56C64378.4060805@gmail.com> Message-ID: On Thu, Feb 18, 2016 at 2:19 PM, Alan Isaac wrote: > Would such a new function correct the apparent mistake (?) of > `linspace` including the endpoint by default? > Or is the current API justified by its Matlab origins? > I don't think so -- we don't need no stinkin' Matlab ! But I LIKE including the endpoint in the sequence -- for the common use cases, it's often what you want, and if it didn't include the end point but you did want that, it would get pretty ugly to figure out how to get what you want. On the other hand, if I had it to do over, I would have the count specify the number of intervals, rather than the number of items. A common cae may be: values from zero to 10 (inclusive), and I want ten steps: In [19]: np.linspace(0, 10, 10) Out[19]: array([ 0. , 1.11111111, 2.22222222, 3.33333333, 4.44444444, 5.55555556, 6.66666667, 7.77777778, 8.88888889, 10. ]) HUH? I was expecting [0,1,2,3 ....] (OK, not me, this isn't my first Rodeo), so now I need to do: In [20]: np.linspace(0, 10, 11) Out[20]: array([ 0., 1., 2., 3., 4., 5., 6., 7., 8., 9., 10.]) This gets uglier if I know what "delta" I want: In [21]: start = 0.0; end = 9.0; delta = 1.0 In [24]: np.linspace(start, end, (end-start)/delta) Out[24]: array([ 0. , 1.125, 2.25 , 3.375, 4.5 , 5.625, 6.75 , 7.875, 9. ]) oops! In [25]: np.linspace(start, end, (end-start)/delta + 1) Out[25]: array([ 0., 1., 2., 3., 4., 5., 6., 7., 8., 9.]) But in any case, there is no changing it now. -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From antony.lee at berkeley.edu Thu Feb 18 17:46:40 2016 From: antony.lee at berkeley.edu (Antony Lee) Date: Thu, 18 Feb 2016 14:46:40 -0800 Subject: [Numpy-discussion] Suggestion: special-case np.array(range(...)) to be faster In-Reply-To: References: Message-ID: In a sense this discussion is really about making np.array(iterable) more efficient, so I restarted the discussion at https://mail.scipy.org/pipermail/numpy-discussion/2016-February/075059.html Antony 2016-02-18 14:21 GMT-08:00 Chris Barker : > On Thu, Feb 18, 2016 at 10:15 AM, Antony Lee > wrote: > >> Mostly so that there is no performance lost when someone passes >> range(...) instead of np.arange(...). At least I had never realized that >> one is much faster than the other and always just passed range() as a >> convenience. >> > > Well, pretty much everything in numpy is faster if you use the numpy > array version rather than plain python -- this hardly seems like the extra > code would be worth it. > > numpy's array() constructor can (and should) take an arbitrary iterable. > > It does make some sense that you we might want to special case iterators, > as you don't want to loop through them too many times, which is what > np.fromiter() is for. 
> > and _maybe_ it would be worth special casing python lists, as you can > access items faster, and they are really, really common (or has this > already been done?), but special casing range() is getting silly. And it > might be hard to do. At the C level I suppose you could actually know what > the parameters and state of the range object are and create an array > directly from that -- but that's what arange is for... > > -CHB > > > >> 2016-02-17 10:50 GMT-08:00 Chris Barker : >> >>> On Sun, Feb 14, 2016 at 11:41 PM, Antony Lee >>> wrote: >>> >>>> So how can np.array(range(...)) even work? >>>> >>> >>> range() (in py3) is not a generator, nor is is a iterator. it is a >>> range object, which is lazily evaluated, and satisfies both the iterator >>> protocol and the sequence protocol (at least most of it: >>> >>> In [*1*]: r = range(10) >>> >>> >>> In [*2*]: r[3] >>> >>> Out[*2*]: 3 >>> >>> >>> In [*3*]: len(r) >>> >>> Out[*3*]: 10 >>> >>> >>> In [*4*]: type(r) >>> >>> Out[*4*]: range >>> >>> In [*9*]: isinstance(r, collections.abc.Sequence) >>> >>> Out[*9*]: True >>> >>> In [*10*]: l = list() >>> >>> In [*11*]: isinstance(l, collections.abc.Sequence) >>> >>> Out[*11*]: True >>> >>> In [*12*]: isinstance(r, collections.abc.Iterable) >>> >>> Out[*12*]: True >>> I'm still totally confused as to why we'd need to special-case range >>> when we have arange(). >>> >>> -CHB >>> >>> >>> >>> -- >>> >>> Christopher Barker, Ph.D. >>> Oceanographer >>> >>> Emergency Response Division >>> NOAA/NOS/OR&R (206) 526-6959 voice >>> 7600 Sand Point Way NE (206) 526-6329 fax >>> Seattle, WA 98115 (206) 526-6317 main reception >>> >>> Chris.Barker at noaa.gov >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> https://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >>> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > > -- > > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.e.creasey.00 at googlemail.com Thu Feb 18 17:59:07 2016 From: p.e.creasey.00 at googlemail.com (Peter Creasey) Date: Thu, 18 Feb 2016 14:59:07 -0800 Subject: [Numpy-discussion] proposal: new logspace without the log in the argument Message-ID: > > Some questions it'd be good to get feedback on: > > - any better ideas for naming it than "geomspace"? It's really too bad > that the 'logspace' name is already taken. > > - I guess the alternative interface might be something like > > np.linspace(start, stop, steps, spacing="log") > > what do people think? > > -n > You?ve got to wonder how many people actually use logspace(start, stop, num) in preference to 10.0**linspace(start, stop, num) - i.e. I prefer the latter for clarity, and if I wanted performance I?d be prepared to write something more ugly. I don?t mind geomspace(), but if you are brainstorming >>> linlogspace(start, end) # i.e. ?linear in log-space? is ok for me too. 
Peter From andyfaff at gmail.com Fri Feb 19 07:10:34 2016 From: andyfaff at gmail.com (Andrew Nelson) Date: Fri, 19 Feb 2016 23:10:34 +1100 Subject: [Numpy-discussion] proposal: new logspace without the log in the argument Message-ID: With respect to geomspace proposals: instead of specifying start and end values and the number of points I'd like to have an option where I can set the start and end points and the ratio. The function would then work out the correct number of points to get closest to the end value. E.g. geomspace(start=1, finish=2, ratio=1.03) The first entries would be 1.0, 1.03, 1*1.03**2, etc. I have a requirement for the correct ratio between the points, and it's a right bind having to calculate the exact number of points needed. -------------- next part -------------- An HTML attachment was scrubbed... URL: From nr4qewd6v4 at snkmail.com Fri Feb 19 07:19:51 2016 From: nr4qewd6v4 at snkmail.com (.) Date: Fri, 19 Feb 2016 07:19:51 -0500 Subject: [Numpy-discussion] proposal: new logspace without the log in the argument In-Reply-To: References: Message-ID: <22379-1455884402-512225@sneakemail.com> What about this API? You specify the start point, ratio, and number of points. http://spacepy.lanl.gov/doc/autosummary/spacepy.toolbox.geomspace.html On Fri, Feb 19, 2016 at 7:10 AM, Andrew Nelson andyfaff-at-gmail.com |numpy mailing list/Example Allow| wrote: > With respect to geomspace proposals: instead of specifying start and end > values and the number of points I'd like to have an option where I can set > the start and end points and the ratio. The function would then work out > the correct number of points to get closest to the end value.. > > E.g. geomspace(start=1, finish=2, ratio=1.03) > > The first entries would be 1.0, 1.03, 1*1.03**2, etc. > > I have a requirement for the correct ratio between the points, and it's a > right bind having to calculate the exact number of points needed. > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Fri Feb 19 07:59:31 2016 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 19 Feb 2016 12:59:31 +0000 Subject: [Numpy-discussion] proposal: new logspace without the log in the argument In-Reply-To: References: Message-ID: On Fri, Feb 19, 2016 at 12:10 PM, Andrew Nelson wrote: > > With respect to geomspace proposals: instead of specifying start and end values and the number of points I'd like to have an option where I can set the start and end points and the ratio. The function would then work out the correct number of points to get closest to the end value. > > E.g. geomspace(start=1, finish=2, ratio=1.03) > > The first entries would be 1.0, 1.03, 1*1.03**2, etc. > > I have a requirement for the correct ratio between the points, and it's a right bind having to calculate the exact number of points needed. At the risk of extending the twisty little maze of names, all alike, I would probably call a function with this signature geomrange() instead. It is more akin to arange(start, stop, step) than linspace(start, stop, num_steps). -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jfoxrabinovitz at gmail.com Fri Feb 19 08:23:28 2016 From: jfoxrabinovitz at gmail.com (=?utf-8?B?Sm9zZXBoIEZveC1SYWJpbm92aXR6?=) Date: Fri, 19 Feb 2016 05:23:28 -0800 (PST) Subject: [Numpy-discussion] proposal: new logspace without the log in the argument Message-ID: <000f4242.430ec7217c8822d1@gmail.com> If the author is willing, I'd say both functions are useful. The "geom" prefix is very fitting. - Joe ------ Original message------From: Robert KernDate: Fri, Feb 19, 2016 08:00To: Discussion of Numerical Python;Subject:Re: [Numpy-discussion] proposal: new logspace without the log in the argumentOn Fri, Feb 19, 2016 at 12:10 PM, Andrew Nelson wrote: > > With respect to geomspace proposals: instead of specifying start and end values and the number of points I'd like to have an option where I can set the start and end points and the ratio. The function would then work out the correct number of points to get closest to the end value. > > E.g. geomspace(start=1, finish=2, ratio=1.03) > > The first entries would be 1.0, 1.03, 1*1.03**2, etc. > > I have a requirement for the correct ratio between the points, and it's a right bind having to calculate the exact number of points needed. At the risk of extending the twisty little maze of names, all alike, I would probably call a function with this signature geomrange() instead. It is more akin to arange(start, stop, step) than linspace(start, stop, num_steps). -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... URL: From allanhaldane at gmail.com Fri Feb 19 12:08:33 2016 From: allanhaldane at gmail.com (Allan Haldane) Date: Fri, 19 Feb 2016 12:08:33 -0500 Subject: [Numpy-discussion] [Suggestion] Labelled Array In-Reply-To: References: <87d1rxu0ib.fsf@fimbulvetr.bsc.es> Message-ID: <56C74C11.3030106@gmail.com> I also want to add a historical note here, that 'groupby' has been discussed a couple times before. Travis Oliphant even made an NEP for it, and Wes McKinney lightly hinted at adding it to numpy. http://thread.gmane.org/gmane.comp.python.numeric.general/37480/focus=37480 http://thread.gmane.org/gmane.comp.python.numeric.general/38272/focus=38299 http://docs.scipy.org/doc/numpy-1.10.1/neps/groupby_additions.html Travis's idea for a ufunc method 'reduceby' is more along the lines of what I was originally thinking. Just musing about it, it might cover few small cases pandas groupby might not: It could work on arbitrary ufuncs, and over particular axes of multidimensional data. Eg, to sum over pixels from NxNx3 image data. But maybe pandas can cover the multidimensional case through additional index columns or with Panel. Cheers, Allan On 02/15/2016 05:31 PM, Paul Hobson wrote: > Just for posterity -- any future readers to this thread who need to do > pandas-like on record arrays should look at matplotlib's mlab submodule. > > I've been in situations (::cough:: Esri production ::cough::) where I've > had one hand tied behind my back and unable to install pandas. mlab was > a big help there. > > https://goo.gl/M7Mi8B > > -paul > > > > On Mon, Feb 15, 2016 at 1:28 PM, Llu?s Vilanova > wrote: > > Benjamin Root writes: > > > Seems like you are talking about xarray: https://github.com/pydata/xarray > > Oh, I wasn't aware of xarray, but there's also this: > > > https://people.gso.ac.upc.edu/vilanova/doc/sciexp2/user_guide/data.html#basic-indexing > > https://people.gso.ac.upc.edu/vilanova/doc/sciexp2/user_guide/data.html#dimension-oblivious-indexing > > > Cheers, > Lluis > > > > > Cheers! 
> > Ben Root > > > On Fri, Feb 12, 2016 at 9:40 AM, S?rgio > wrote: > > > Hello, > > > > This is my first e-mail, I will try to make the idea simple. > > > > Similar to masked array it would be interesting to use a label > array to > > guide operations. > > > > Ex.: > >>>> x > > labelled_array(data = > > > [[0 1 2] > > [3 4 5] > > [6 7 8]], > > label = > > [[0 1 2] > > [0 1 2] > > [0 1 2]]) > > > >>>> sum(x) > > array([9, 12, 15]) > > > > The operations would create a new axis for label indexing. > > > > You could think of it as a collection of masks, one for each > label. > > > > I don't know a way to make something like this efficiently > without a loop. > > Just wondering... > > > > S?rgio. > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > From josef.pktd at gmail.com Fri Feb 19 13:39:19 2016 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 19 Feb 2016 13:39:19 -0500 Subject: [Numpy-discussion] [Suggestion] Labelled Array In-Reply-To: <56C74C11.3030106@gmail.com> References: <87d1rxu0ib.fsf@fimbulvetr.bsc.es> <56C74C11.3030106@gmail.com> Message-ID: On Fri, Feb 19, 2016 at 12:08 PM, Allan Haldane wrote: > I also want to add a historical note here, that 'groupby' has been > discussed a couple times before. > > Travis Oliphant even made an NEP for it, and Wes McKinney lightly hinted > at adding it to numpy. > > http://thread.gmane.org/gmane.comp.python.numeric.general/37480/focus=37480 > http://thread.gmane.org/gmane.comp.python.numeric.general/38272/focus=38299 > http://docs.scipy.org/doc/numpy-1.10.1/neps/groupby_additions.html > > Travis's idea for a ufunc method 'reduceby' is more along the lines of > what I was originally thinking. Just musing about it, it might cover few > small cases pandas groupby might not: It could work on arbitrary ufuncs, > and over particular axes of multidimensional data. Eg, to sum over > pixels from NxNx3 image data. But maybe pandas can cover the > multidimensional case through additional index columns or with Panel. > xarray is now covering that area. There are also recfunctions in numpy.lib that never got a lot of attention and expansion. There were plans to cover more of the matplotlib versions in numpy, but I have no idea and didn't check what happened to it.. Josef > > Cheers, > Allan > > On 02/15/2016 05:31 PM, Paul Hobson wrote: > > Just for posterity -- any future readers to this thread who need to do > > pandas-like on record arrays should look at matplotlib's mlab submodule. > > > > I've been in situations (::cough:: Esri production ::cough::) where I've > > had one hand tied behind my back and unable to install pandas. mlab was > > a big help there. 
> > > > https://goo.gl/M7Mi8B > > > > -paul > > > > > > > > On Mon, Feb 15, 2016 at 1:28 PM, Llu?s Vilanova > > wrote: > > > > Benjamin Root writes: > > > > > Seems like you are talking about xarray: > https://github.com/pydata/xarray > > > > Oh, I wasn't aware of xarray, but there's also this: > > > > > > > https://people.gso.ac.upc.edu/vilanova/doc/sciexp2/user_guide/data.html#basic-indexing > > > > > https://people.gso.ac.upc.edu/vilanova/doc/sciexp2/user_guide/data.html#dimension-oblivious-indexing > > > > > > Cheers, > > Lluis > > > > > > > > > Cheers! > > > Ben Root > > > > > On Fri, Feb 12, 2016 at 9:40 AM, S?rgio > > wrote: > > > > > Hello, > > > > > > > This is my first e-mail, I will try to make the idea simple. > > > > > > > Similar to masked array it would be interesting to use a label > > array to > > > guide operations. > > > > > > > Ex.: > > >>>> x > > > labelled_array(data = > > > > > [[0 1 2] > > > [3 4 5] > > > [6 7 8]], > > > label = > > > [[0 1 2] > > > [0 1 2] > > > [0 1 2]]) > > > > > > >>>> sum(x) > > > array([9, 12, 15]) > > > > > > > The operations would create a new axis for label indexing. > > > > > > > You could think of it as a collection of masks, one for each > > label. > > > > > > > I don't know a way to make something like this efficiently > > without a loop. > > > Just wondering... > > > > > > > S?rgio. > > > > > _______________________________________________ > > > NumPy-Discussion mailing list > > > NumPy-Discussion at scipy.org > > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > > > > > _______________________________________________ > > > NumPy-Discussion mailing list > > > NumPy-Discussion at scipy.org > > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben.v.root at gmail.com Fri Feb 19 13:44:16 2016 From: ben.v.root at gmail.com (Benjamin Root) Date: Fri, 19 Feb 2016 13:44:16 -0500 Subject: [Numpy-discussion] [Suggestion] Labelled Array In-Reply-To: References: <87d1rxu0ib.fsf@fimbulvetr.bsc.es> <56C74C11.3030106@gmail.com> Message-ID: matplotlib would be more than happy if numpy could take those functions off our hands! They don't get nearly the correct visibility in matplotlib because no one is expecting them to be in a plotting library, and they don't have any useful unit-tests. None of us made them, so we are very hesitant to update them because of that. Cheers! Ben Root On Fri, Feb 19, 2016 at 1:39 PM, wrote: > > > On Fri, Feb 19, 2016 at 12:08 PM, Allan Haldane > wrote: > >> I also want to add a historical note here, that 'groupby' has been >> discussed a couple times before. >> >> Travis Oliphant even made an NEP for it, and Wes McKinney lightly hinted >> at adding it to numpy. 
>> >> >> http://thread.gmane.org/gmane.comp.python.numeric.general/37480/focus=37480 >> >> http://thread.gmane.org/gmane.comp.python.numeric.general/38272/focus=38299 >> http://docs.scipy.org/doc/numpy-1.10.1/neps/groupby_additions.html >> >> Travis's idea for a ufunc method 'reduceby' is more along the lines of >> what I was originally thinking. Just musing about it, it might cover few >> small cases pandas groupby might not: It could work on arbitrary ufuncs, >> and over particular axes of multidimensional data. Eg, to sum over >> pixels from NxNx3 image data. But maybe pandas can cover the >> multidimensional case through additional index columns or with Panel. >> > > xarray is now covering that area. > > There are also recfunctions in numpy.lib that never got a lot of attention > and expansion. > There were plans to cover more of the matplotlib versions in numpy, but I > have no idea and didn't check what happened to it.. > > Josef > > > >> >> Cheers, >> Allan >> >> On 02/15/2016 05:31 PM, Paul Hobson wrote: >> > Just for posterity -- any future readers to this thread who need to do >> > pandas-like on record arrays should look at matplotlib's mlab submodule. >> > >> > I've been in situations (::cough:: Esri production ::cough::) where I've >> > had one hand tied behind my back and unable to install pandas. mlab was >> > a big help there. >> > >> > https://goo.gl/M7Mi8B >> > >> > -paul >> > >> > >> > >> > On Mon, Feb 15, 2016 at 1:28 PM, Llu?s Vilanova > > > wrote: >> > >> > Benjamin Root writes: >> > >> > > Seems like you are talking about xarray: >> https://github.com/pydata/xarray >> > >> > Oh, I wasn't aware of xarray, but there's also this: >> > >> > >> > >> https://people.gso.ac.upc.edu/vilanova/doc/sciexp2/user_guide/data.html#basic-indexing >> > >> > >> https://people.gso.ac.upc.edu/vilanova/doc/sciexp2/user_guide/data.html#dimension-oblivious-indexing >> > >> > >> > Cheers, >> > Lluis >> > >> > >> > >> > > Cheers! >> > > Ben Root >> > >> > > On Fri, Feb 12, 2016 at 9:40 AM, S?rgio > > > wrote: >> > >> > > Hello, >> > >> > >> > > This is my first e-mail, I will try to make the idea simple. >> > >> > >> > > Similar to masked array it would be interesting to use a label >> > array to >> > > guide operations. >> > >> > >> > > Ex.: >> > >>>> x >> > > labelled_array(data = >> > >> > > [[0 1 2] >> > > [3 4 5] >> > > [6 7 8]], >> > > label = >> > > [[0 1 2] >> > > [0 1 2] >> > > [0 1 2]]) >> > >> > >> > >>>> sum(x) >> > > array([9, 12, 15]) >> > >> > >> > > The operations would create a new axis for label indexing. >> > >> > >> > > You could think of it as a collection of masks, one for each >> > label. >> > >> > >> > > I don't know a way to make something like this efficiently >> > without a loop. >> > > Just wondering... >> > >> > >> > > S?rgio. 
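The per-label sum in the suggestion quoted just above can already be written without an explicit Python loop; the following lines are only an illustration using np.bincount (and np.add.at for the in-place variant) on the quoted 3x3 example, not a proposal for the labelled-array API itself.

import numpy as np

data = np.array([[0, 1, 2],
                 [3, 4, 5],
                 [6, 7, 8]])
label = np.array([[0, 1, 2],
                  [0, 1, 2],
                  [0, 1, 2]])

# Sum of all elements sharing a label, in one pass, no Python loop.
sums = np.bincount(label.ravel(), weights=data.ravel())
# sums -> array([  9.,  12.,  15.])

# The same reduction expressed with a ufunc method:
out = np.zeros(label.max() + 1)
np.add.at(out, label.ravel(), data.ravel())
# out -> array([  9.,  12.,  15.])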
>> > >> > > _______________________________________________ >> > > NumPy-Discussion mailing list >> > > NumPy-Discussion at scipy.org > > >> > > https://mail.scipy.org/mailman/listinfo/numpy-discussion >> > >> > >> > >> > >> > > _______________________________________________ >> > > NumPy-Discussion mailing list >> > > NumPy-Discussion at scipy.org >> > > https://mail.scipy.org/mailman/listinfo/numpy-discussion >> > _______________________________________________ >> > NumPy-Discussion mailing list >> > NumPy-Discussion at scipy.org >> > https://mail.scipy.org/mailman/listinfo/numpy-discussion >> > >> > >> > >> > >> > _______________________________________________ >> > NumPy-Discussion mailing list >> > NumPy-Discussion at scipy.org >> > https://mail.scipy.org/mailman/listinfo/numpy-discussion >> > >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sat Feb 20 11:58:45 2016 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sat, 20 Feb 2016 17:58:45 +0100 Subject: [Numpy-discussion] PyData Madrid In-Reply-To: <1455741994.9869.18.camel@sipsolutions.net> References: <1455741994.9869.18.camel@sipsolutions.net> Message-ID: On Wed, Feb 17, 2016 at 9:46 PM, Sebastian Berg wrote: > On Mi, 2016-02-17 at 20:59 +0100, Jaime Fern?ndez del R?o wrote: > > Hi all, > > > > I just found out there is a PyData Madrid happening in early April, > > and it would feel wrong not to go, it being my hometown and all. > > > > Aside from the usual "Who else is going? We should meet!" I was also > > thinking of submitting a proposal for a talk. My idea was to put > > something together on "The future of NumPy indexing" and use it as an > > opportunity to raise awareness and hopefully gather feedback from > > users on the proposed changes, in sort of a "if the mountain won't > > come to Muhammad" type of thing. > > > > I guess you do know my last name means mountain in german? But if > Muhammed might come, I should really improve my arabic ;). > > In any case sounds good to me if you like to do it, I don't think I > will go, though it sounds nice. > Sounds like a good idea to me too. I like both the concrete topic, as well as just having a talk on Numpy at a PyData conference. In general there are too few (if any) talks on Numpy and other core libraries at PyData and Scipy confs I think. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From kikocorreoso at gmail.com Sat Feb 20 12:26:31 2016 From: kikocorreoso at gmail.com (Kiko) Date: Sat, 20 Feb 2016 18:26:31 +0100 Subject: [Numpy-discussion] PyData Madrid In-Reply-To: References: <1455741994.9869.18.camel@sipsolutions.net> Message-ID: 2016-02-20 17:58 GMT+01:00 Ralf Gommers : > > > On Wed, Feb 17, 2016 at 9:46 PM, Sebastian Berg < > sebastian at sipsolutions.net> wrote: > >> On Mi, 2016-02-17 at 20:59 +0100, Jaime Fern?ndez del R?o wrote: >> > Hi all, >> > >> > I just found out there is a PyData Madrid happening in early April, >> > and it would feel wrong not to go, it being my hometown and all. >> > >> > Aside from the usual "Who else is going? We should meet!" 
I was also >> > thinking of submitting a proposal for a talk. My idea was to put >> > something together on "The future of NumPy indexing" and use it as an >> > opportunity to raise awareness and hopefully gather feedback from >> > users on the proposed changes, in sort of a "if the mountain won't >> > come to Muhammad" type of thing. >> > >> >> I guess you do know my last name means mountain in german? But if >> Muhammed might come, I should really improve my arabic ;). >> >> In any case sounds good to me if you like to do it, I don't think I >> will go, though it sounds nice. >> > > Sounds like a good idea to me too. I like both the concrete topic, as well > as just having a talk on Numpy at a PyData conference. In general there are > too few (if any) talks on Numpy and other core libraries at PyData and > Scipy confs I think. > +1. It would be great a numpy talk from a core developer. BTW, C4P closes tomorrow!!! Jaime, if you come to Madrid you know you have some beers waiting for you. Disclaimer, I'm one of co-organizers of the PyData Madrid. Best. > Ralf > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From cournape at gmail.com Sat Feb 20 14:13:42 2016 From: cournape at gmail.com (David Cournapeau) Date: Sat, 20 Feb 2016 19:13:42 +0000 Subject: [Numpy-discussion] PyData Madrid In-Reply-To: References: <1455741994.9869.18.camel@sipsolutions.net> Message-ID: On Sat, Feb 20, 2016 at 5:26 PM, Kiko wrote: > > > 2016-02-20 17:58 GMT+01:00 Ralf Gommers : > >> >> >> On Wed, Feb 17, 2016 at 9:46 PM, Sebastian Berg < >> sebastian at sipsolutions.net> wrote: >> >>> On Mi, 2016-02-17 at 20:59 +0100, Jaime Fern?ndez del R?o wrote: >>> > Hi all, >>> > >>> > I just found out there is a PyData Madrid happening in early April, >>> > and it would feel wrong not to go, it being my hometown and all. >>> > >>> > Aside from the usual "Who else is going? We should meet!" I was also >>> > thinking of submitting a proposal for a talk. My idea was to put >>> > something together on "The future of NumPy indexing" and use it as an >>> > opportunity to raise awareness and hopefully gather feedback from >>> > users on the proposed changes, in sort of a "if the mountain won't >>> > come to Muhammad" type of thing. >>> > >>> >>> I guess you do know my last name means mountain in german? But if >>> Muhammed might come, I should really improve my arabic ;). >>> >>> In any case sounds good to me if you like to do it, I don't think I >>> will go, though it sounds nice. >>> >> >> Sounds like a good idea to me too. I like both the concrete topic, as >> well as just having a talk on Numpy at a PyData conference. In general >> there are too few (if any) talks on Numpy and other core libraries at >> PyData and Scipy confs I think. >> > > +1. > > It would be great a numpy talk from a core developer. BTW, C4P closes > tomorrow!!! > > Jaime, if you come to Madrid you know you have some beers waiting for you. > > Disclaimer, I'm one of co-organizers of the PyData Madrid. > Since when does one need disclaimer when offering beers ? That would make for a dangerous precedent :) David > > Best. 
> > >> Ralf >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From kikocorreoso at gmail.com Sat Feb 20 14:21:24 2016 From: kikocorreoso at gmail.com (Kiko) Date: Sat, 20 Feb 2016 20:21:24 +0100 Subject: [Numpy-discussion] PyData Madrid In-Reply-To: References: <1455741994.9869.18.camel@sipsolutions.net> Message-ID: 2016-02-20 20:13 GMT+01:00 David Cournapeau : > > > On Sat, Feb 20, 2016 at 5:26 PM, Kiko wrote: > >> >> >> 2016-02-20 17:58 GMT+01:00 Ralf Gommers : >> >>> >>> >>> On Wed, Feb 17, 2016 at 9:46 PM, Sebastian Berg < >>> sebastian at sipsolutions.net> wrote: >>> >>>> On Mi, 2016-02-17 at 20:59 +0100, Jaime Fern?ndez del R?o wrote: >>>> > Hi all, >>>> > >>>> > I just found out there is a PyData Madrid happening in early April, >>>> > and it would feel wrong not to go, it being my hometown and all. >>>> > >>>> > Aside from the usual "Who else is going? We should meet!" I was also >>>> > thinking of submitting a proposal for a talk. My idea was to put >>>> > something together on "The future of NumPy indexing" and use it as an >>>> > opportunity to raise awareness and hopefully gather feedback from >>>> > users on the proposed changes, in sort of a "if the mountain won't >>>> > come to Muhammad" type of thing. >>>> > >>>> >>>> I guess you do know my last name means mountain in german? But if >>>> Muhammed might come, I should really improve my arabic ;). >>>> >>>> In any case sounds good to me if you like to do it, I don't think I >>>> will go, though it sounds nice. >>>> >>> >>> Sounds like a good idea to me too. I like both the concrete topic, as >>> well as just having a talk on Numpy at a PyData conference. In general >>> there are too few (if any) talks on Numpy and other core libraries at >>> PyData and Scipy confs I think. >>> >> >> +1. >> >> It would be great a numpy talk from a core developer. BTW, C4P closes >> tomorrow!!! >> >> Jaime, if you come to Madrid you know you have some beers waiting for you. >> >> Disclaimer, I'm one of co-organizers of the PyData Madrid. >> > > Since when does one need disclaimer when offering beers ? That would make > for a dangerous precedent :) > The disclaimer is not for the beers :-P The beers sentence should be a "P.D.:" > > David > >> >> Best. >> >> >>> Ralf >>> >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> https://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >>> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jaime.frio at gmail.com Sat Feb 20 17:19:58 2016 From: jaime.frio at gmail.com (=?UTF-8?Q?Jaime_Fern=C3=A1ndez_del_R=C3=ADo?=) Date: Sat, 20 Feb 2016 23:19:58 +0100 Subject: [Numpy-discussion] PyData Madrid In-Reply-To: References: <1455741994.9869.18.camel@sipsolutions.net> Message-ID: On Sat, Feb 20, 2016 at 8:13 PM, David Cournapeau wrote: > > > On Sat, Feb 20, 2016 at 5:26 PM, Kiko wrote: > >> >> >> 2016-02-20 17:58 GMT+01:00 Ralf Gommers : >> >>> >>> >>> On Wed, Feb 17, 2016 at 9:46 PM, Sebastian Berg < >>> sebastian at sipsolutions.net> wrote: >>> >>>> On Mi, 2016-02-17 at 20:59 +0100, Jaime Fern?ndez del R?o wrote: >>>> > Hi all, >>>> > >>>> > I just found out there is a PyData Madrid happening in early April, >>>> > and it would feel wrong not to go, it being my hometown and all. >>>> > >>>> > Aside from the usual "Who else is going? We should meet!" I was also >>>> > thinking of submitting a proposal for a talk. My idea was to put >>>> > something together on "The future of NumPy indexing" and use it as an >>>> > opportunity to raise awareness and hopefully gather feedback from >>>> > users on the proposed changes, in sort of a "if the mountain won't >>>> > come to Muhammad" type of thing. >>>> > >>>> >>>> I guess you do know my last name means mountain in german? But if >>>> Muhammed might come, I should really improve my arabic ;). >>>> >>>> In any case sounds good to me if you like to do it, I don't think I >>>> will go, though it sounds nice. >>>> >>> >>> Sounds like a good idea to me too. I like both the concrete topic, as >>> well as just having a talk on Numpy at a PyData conference. In general >>> there are too few (if any) talks on Numpy and other core libraries at >>> PyData and Scipy confs I think. >>> >> >> +1. >> >> It would be great a numpy talk from a core developer. BTW, C4P closes >> tomorrow!!! >> > With a full day to spare, I have submitted a talk proposal: Brief Description Advanced (a.k.a. "fancy") indexing is one of NumPy's greatest features. It is also well known for its ability to trip and confuse beginners and experts alike. This talk will review how it works and why it is great, give some insight on why it is how it is, explore some of its darkest corners, and go over some recent proposals to rationalize it. Detailed Abstract Advanced (a.k.a. _fancy_) indexing is one of NumPy's greatest features. Once past the rather steep learning curve, it enables a very expressive and powerful syntax, and makes coding a wide range of complex operations a breeze. But this versatility comes with a dark side of surprising results for some seemingly simple cases, and conflicts with the design choices of more recent data analysis packages. This has led to a viewpoint with growing support among the community that fancy indexing may be too fancy for its own good. This talk will review the workings of advanced indexing, highlighting where it excels, and where it falls short, and give some context on the logic behind some design decisions. It will also cover the existing [NumPy Enhancement Proposal (NEP)](https://github.com/numpy/numpy/pull/6256) to "implement an intuitive and fully featured advanced indexing." > >> Jaime, if you come to Madrid you know you have some beers waiting for you. >> > Talk or not, I'm really looking forward to those beers and getting to meet Juan Luis and you! Jaime -- (\__/) ( O.o) ( > <) Este es Conejo. Copia a Conejo en tu firma y ay?dale en sus planes de dominaci?n mundial. 
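As a concrete taste of the kind of surprise the proposed talk would cover (a small illustration, not taken from the abstract): a single advanced index keeps its place in the result, but advanced indices separated by a slice are broadcast together and moved to the front.

import numpy as np

a = np.arange(3 * 4 * 5).reshape(3, 4, 5)
idx = np.array([0, 1])

a[:, :, idx].shape    # (3, 4, 2): the advanced axis stays where it was
a[0, :, idx].shape    # (2, 4): the integer and the array index are separated by a
                      #         slice, so the broadcast dimension jumps to the front
a[0][:, idx].shape    # (4, 2): the "same" selection done in two steps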
-------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Sun Feb 21 20:18:02 2016 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Sun, 21 Feb 2016 17:18:02 -0800 Subject: [Numpy-discussion] DyND 0.7.1 Release In-Reply-To: References: Message-ID: <3007742961930363565@unknownmsgid> > The DyND team would be happy to answer any questions people have about DyND, like "what is working and what is not" or "what do we still need to do to hit DyND 1.0". OK, how about: How does the performance. I'd DyND compare to Numpy for the core functionality they both support? - CHB From charlesr.harris at gmail.com Mon Feb 22 20:47:35 2016 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 22 Feb 2016 18:47:35 -0700 Subject: [Numpy-discussion] Numpy 1.11.0rc1 released. Message-ID: Hi All, I'm delighted to announce the release of Numpy 1.11.0rc1. Hopefully the issues discovered in 1.11.0b3 have been dealt with and this release can go on to become the official release. Source files and documentation can be found on Sourceforge , while source files and OS X wheels for Python 2.7, 3.3, 3.4, and 3.5 can be installed from Pypi. Please test thoroughly. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Tue Feb 23 07:02:41 2016 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 23 Feb 2016 05:02:41 -0700 Subject: [Numpy-discussion] Numpy 1.11.0rc1 released. In-Reply-To: References: Message-ID: On Mon, Feb 22, 2016 at 6:47 PM, Charles R Harris wrote: > Hi All, > > I'm delighted to announce the release of Numpy 1.11.0rc1. Hopefully the > issues discovered in 1.11.0b3 have been dealt with and this release can go > on to become the official release. Source files and documentation can be > found on Sourceforge > , while > source files and OS X wheels for Python 2.7, 3.3, 3.4, and 3.5 can be > installed from Pypi. Please test thoroughly. > Issues reported by Christoph at https://github.com/numpy/numpy/issues/7316. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Tue Feb 23 10:44:40 2016 From: pav at iki.fi (Pauli Virtanen) Date: Tue, 23 Feb 2016 17:44:40 +0200 Subject: [Numpy-discussion] Numpy 1.11.0rc1 released. In-Reply-To: References: Message-ID: 23.02.2016, 03:47, Charles R Harris kirjoitti: > I'm delighted to announce the release of Numpy 1.11.0rc1. Hopefully the > issues discovered in 1.11.0b3 have been dealt with and this release can go > on to become the official release. Source files and documentation can be > found on Sourceforge > , while > source files and OS X wheels for Python 2.7, 3.3, 3.4, and 3.5 can be > installed from Pypi. Please test thoroughly. FWIW https://travis-ci.org/pv/testrig/builds/108384173 From ben.v.root at gmail.com Tue Feb 23 11:32:12 2016 From: ben.v.root at gmail.com (Benjamin Root) Date: Tue, 23 Feb 2016 11:32:12 -0500 Subject: [Numpy-discussion] reshaping empty array bug? Message-ID: Not exactly sure if this should be a bug or not. This came up in a fairly general function of mine to process satellite data. Unexpectedly, one of the satellite files had no scans in it, triggering an exception when I tried to reshape the data from it. 
>>> import numpy as np >>> a = np.zeros((0, 5*64)) >>> a.shape (0, 320) >>> a.shape = (0, 5, 64) >>> a.shape (0, 5, 64) >>> a.shape = (0, 5*64) >>> a.shape = (0, 5, -1) Traceback (most recent call last): File "", line 1, in ValueError: total size of new array must be unchanged So, if I know all of the dimensions, I can reshape just fine. But if I wanted to use the nifty -1 semantic, it completely falls apart. I can see arguments going either way for whether this is a bug or not. Thoughts? Ben Root -------------- next part -------------- An HTML attachment was scrubbed... URL: From warren.weckesser at gmail.com Tue Feb 23 11:41:01 2016 From: warren.weckesser at gmail.com (Warren Weckesser) Date: Tue, 23 Feb 2016 11:41:01 -0500 Subject: [Numpy-discussion] reshaping empty array bug? In-Reply-To: References: Message-ID: On Tue, Feb 23, 2016 at 11:32 AM, Benjamin Root wrote: > Not exactly sure if this should be a bug or not. This came up in a fairly > general function of mine to process satellite data. Unexpectedly, one of > the satellite files had no scans in it, triggering an exception when I > tried to reshape the data from it. > > >>> import numpy as np > >>> a = np.zeros((0, 5*64)) > >>> a.shape > (0, 320) > >>> a.shape = (0, 5, 64) > >>> a.shape > (0, 5, 64) > >>> a.shape = (0, 5*64) > >>> a.shape = (0, 5, -1) > Traceback (most recent call last): > File "", line 1, in > ValueError: total size of new array must be unchanged > > So, if I know all of the dimensions, I can reshape just fine. But if I > wanted to use the nifty -1 semantic, it completely falls apart. I can see > arguments going either way for whether this is a bug or not. > When you try `a.shape = (0, 5, -1)`, the size of the third dimension is ambiguous. From the Zen of Python: "In the face of ambiguity, refuse the temptation to guess." Warren > Thoughts? > > Ben Root > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben.v.root at gmail.com Tue Feb 23 11:45:38 2016 From: ben.v.root at gmail.com (Benjamin Root) Date: Tue, 23 Feb 2016 11:45:38 -0500 Subject: [Numpy-discussion] reshaping empty array bug? In-Reply-To: References: Message-ID: but, it isn't really ambiguous, is it? The -1 can only refer to a single dimension, and if you ignore the zeros in the original and new shape, the -1 is easily solvable, right? Ben Root On Tue, Feb 23, 2016 at 11:41 AM, Warren Weckesser < warren.weckesser at gmail.com> wrote: > > > On Tue, Feb 23, 2016 at 11:32 AM, Benjamin Root > wrote: > >> Not exactly sure if this should be a bug or not. This came up in a fairly >> general function of mine to process satellite data. Unexpectedly, one of >> the satellite files had no scans in it, triggering an exception when I >> tried to reshape the data from it. >> >> >>> import numpy as np >> >>> a = np.zeros((0, 5*64)) >> >>> a.shape >> (0, 320) >> >>> a.shape = (0, 5, 64) >> >>> a.shape >> (0, 5, 64) >> >>> a.shape = (0, 5*64) >> >>> a.shape = (0, 5, -1) >> Traceback (most recent call last): >> File "", line 1, in >> ValueError: total size of new array must be unchanged >> >> So, if I know all of the dimensions, I can reshape just fine. But if I >> wanted to use the nifty -1 semantic, it completely falls apart. I can see >> arguments going either way for whether this is a bug or not. 
>> > > > When you try `a.shape = (0, 5, -1)`, the size of the third dimension is > ambiguous. From the Zen of Python: "In the face of ambiguity, refuse the > temptation to guess." > > Warren > > > > >> Thoughts? >> >> Ben Root >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Tue Feb 23 13:58:41 2016 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Tue, 23 Feb 2016 19:58:41 +0100 Subject: [Numpy-discussion] reshaping empty array bug? In-Reply-To: References: Message-ID: <1456253921.9274.3.camel@sipsolutions.net> On Di, 2016-02-23 at 11:45 -0500, Benjamin Root wrote: > but, it isn't really ambiguous, is it? The -1 can only refer to a > single dimension, and if you ignore the zeros in the original and new > shape, the -1 is easily solvable, right? I think if there is a simple logic (like using 1 for all zeros in both input and output shape for the -1 calculation), maybe we could do it. I would like someone to think about it carefully that it would not also allow some unexpected generalizations. And at least I am getting a BrainOutOfResourcesError right now trying to figure that out :). - Sebastian > Ben Root > > On Tue, Feb 23, 2016 at 11:41 AM, Warren Weckesser < > warren.weckesser at gmail.com> wrote: > > > > > > On Tue, Feb 23, 2016 at 11:32 AM, Benjamin Root < > > ben.v.root at gmail.com> wrote: > > > Not exactly sure if this should be a bug or not. This came up in > > > a fairly general function of mine to process satellite data. > > > Unexpectedly, one of the satellite files had no scans in it, > > > triggering an exception when I tried to reshape the data from it. > > > > > > >>> import numpy as np > > > >>> a = np.zeros((0, 5*64)) > > > >>> a.shape > > > (0, 320) > > > >>> a.shape = (0, 5, 64) > > > >>> a.shape > > > (0, 5, 64) > > > >>> a.shape = (0, 5*64) > > > >>> a.shape = (0, 5, -1) > > > Traceback (most recent call last): > > > File "", line 1, in > > > ValueError: total size of new array must be unchanged > > > > > > So, if I know all of the dimensions, I can reshape just fine. But > > > if I wanted to use the nifty -1 semantic, it completely falls > > > apart. I can see arguments going either way for whether this is a > > > bug or not. > > > > > > > When you try `a.shape = (0, 5, -1)`, the size of the third > > dimension is ambiguous. From the Zen of Python: "In the face of > > ambiguity, refuse the temptation to guess." > > > > Warren > > > > > > > > > > > > Thoughts? > > > > > > Ben Root > > > > > > _______________________________________________ > > > NumPy-Discussion mailing list > > > NumPy-Discussion at scipy.org > > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From charlesr.harris at gmail.com Tue Feb 23 14:36:00 2016 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 23 Feb 2016 12:36:00 -0700 Subject: [Numpy-discussion] How to check for memory leaks? Message-ID: Hi All, I'm suspecting a possible memory leak in 1.11.x, what is the best way to check for that? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Tue Feb 23 14:46:47 2016 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 23 Feb 2016 20:46:47 +0100 Subject: [Numpy-discussion] How to check for memory leaks? References: Message-ID: <20160223204647.740d0704@fsol> On Tue, 23 Feb 2016 12:36:00 -0700 Charles R Harris wrote: > Hi All, > > I'm suspecting a possible memory leak in 1.11.x, what is the best way to > check for that? If that is due to a reference leak, you can use sys.getrefcount() or weakref.ref(). Otherwise you may want to change Numpy to go through PyMem_RawMalloc / PyMem_RawCalloc / PyMem_RawRealloc / PyMem_RawFree on recent Pythons, so as to have Numpy-allocated memory accounted by the tracemalloc module. (https://github.com/numpy/numpy/pull/5470 may make it more palatable ;-)) Regards Antoine. From ben.v.root at gmail.com Tue Feb 23 14:57:25 2016 From: ben.v.root at gmail.com (Benjamin Root) Date: Tue, 23 Feb 2016 14:57:25 -0500 Subject: [Numpy-discussion] reshaping empty array bug? In-Reply-To: <1456253921.9274.3.camel@sipsolutions.net> References: <1456253921.9274.3.camel@sipsolutions.net> Message-ID: I'd be more than happy to write up the patch. I don't think it would be quite like make zeros be ones, but it would be along those lines. One case I need to wrap my head around is to make sure that a 0 would happen if the following was true: >>> a = np.ones((0, 5*64)) >>> a.shape = (-1, 5, 64) EDIT: Just tried the above, and it works as expected (zero in the first dim)! Just tried out a couple of other combos: >>> a.shape = (-1,) >>> a.shape (0,) >>> a.shape = (-1, 5, 64) >>> a.shape (0, 5, 64) This is looking more and more like a bug to me. Ben Root On Tue, Feb 23, 2016 at 1:58 PM, Sebastian Berg wrote: > On Di, 2016-02-23 at 11:45 -0500, Benjamin Root wrote: > > but, it isn't really ambiguous, is it? The -1 can only refer to a > > single dimension, and if you ignore the zeros in the original and new > > shape, the -1 is easily solvable, right? > > I think if there is a simple logic (like using 1 for all zeros in both > input and output shape for the -1 calculation), maybe we could do it. I > would like someone to think about it carefully that it would not also > allow some unexpected generalizations. And at least I am getting a > BrainOutOfResourcesError right now trying to figure that out :). > > - Sebastian > > > > Ben Root > > > > On Tue, Feb 23, 2016 at 11:41 AM, Warren Weckesser < > > warren.weckesser at gmail.com> wrote: > > > > > > > > > On Tue, Feb 23, 2016 at 11:32 AM, Benjamin Root < > > > ben.v.root at gmail.com> wrote: > > > > Not exactly sure if this should be a bug or not. This came up in > > > > a fairly general function of mine to process satellite data. > > > > Unexpectedly, one of the satellite files had no scans in it, > > > > triggering an exception when I tried to reshape the data from it. 
> > > > > > > > >>> import numpy as np > > > > >>> a = np.zeros((0, 5*64)) > > > > >>> a.shape > > > > (0, 320) > > > > >>> a.shape = (0, 5, 64) > > > > >>> a.shape > > > > (0, 5, 64) > > > > >>> a.shape = (0, 5*64) > > > > >>> a.shape = (0, 5, -1) > > > > Traceback (most recent call last): > > > > File "", line 1, in > > > > ValueError: total size of new array must be unchanged > > > > > > > > So, if I know all of the dimensions, I can reshape just fine. But > > > > if I wanted to use the nifty -1 semantic, it completely falls > > > > apart. I can see arguments going either way for whether this is a > > > > bug or not. > > > > > > > > > > When you try `a.shape = (0, 5, -1)`, the size of the third > > > dimension is ambiguous. From the Zen of Python: "In the face of > > > ambiguity, refuse the temptation to guess." > > > > > > Warren > > > > > > > > > > > > > > > > > Thoughts? > > > > > > > > Ben Root > > > > > > > > _______________________________________________ > > > > NumPy-Discussion mailing list > > > > NumPy-Discussion at scipy.org > > > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > > > > _______________________________________________ > > > NumPy-Discussion mailing list > > > NumPy-Discussion at scipy.org > > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Tue Feb 23 15:06:43 2016 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Tue, 23 Feb 2016 21:06:43 +0100 Subject: [Numpy-discussion] reshaping empty array bug? In-Reply-To: References: <1456253921.9274.3.camel@sipsolutions.net> Message-ID: <1456258003.9274.14.camel@sipsolutions.net> On Di, 2016-02-23 at 14:57 -0500, Benjamin Root wrote: > I'd be more than happy to write up the patch. I don't think it would > be quite like make zeros be ones, but it would be along those lines. > One case I need to wrap my head around is to make sure that a 0 would > happen if the following was true: > > >>> a = np.ones((0, 5*64)) > >>> a.shape = (-1, 5, 64) > > EDIT: Just tried the above, and it works as expected (zero in the > first dim)! > > Just tried out a couple of other combos: > >>> a.shape = (-1,) > >>> a.shape > (0,) > >>> a.shape = (-1, 5, 64) > >>> a.shape > (0, 5, 64) > Seems right to me on first sight :). (I don't like shape assignments though, who cares for one extra view). Well, maybe 1 instead of 0 (ignore 0s), but if the result for -1 is to use 1 and the shape is 0 convert the 1 back to 0. But it is starting to sound a bit tricky, though I think it might be straight forward (i.e. no real traps and when it works it always is what you expect). The main point is, whether you can design cases where the conversion back to 0 hides bugs by not failing when it should. And whether that would be a tradeoff we are willing to accept. - Sebastian > > This is looking more and more like a bug to me. > > Ben Root > > > On Tue, Feb 23, 2016 at 1:58 PM, Sebastian Berg < > sebastian at sipsolutions.net> wrote: > > On Di, 2016-02-23 at 11:45 -0500, Benjamin Root wrote: > > > but, it isn't really ambiguous, is it? 
The -1 can only refer to a > > > single dimension, and if you ignore the zeros in the original and > > new > > > shape, the -1 is easily solvable, right? > > > > I think if there is a simple logic (like using 1 for all zeros in > > both > > input and output shape for the -1 calculation), maybe we could do > > it. I > > would like someone to think about it carefully that it would not > > also > > allow some unexpected generalizations. And at least I am getting a > > BrainOutOfResourcesError right now trying to figure that out :). > > > > - Sebastian > > > > > > > Ben Root > > > > > > On Tue, Feb 23, 2016 at 11:41 AM, Warren Weckesser < > > > warren.weckesser at gmail.com> wrote: > > > > > > > > > > > > On Tue, Feb 23, 2016 at 11:32 AM, Benjamin Root < > > > > ben.v.root at gmail.com> wrote: > > > > > Not exactly sure if this should be a bug or not. This came up > > in > > > > > a fairly general function of mine to process satellite data. > > > > > Unexpectedly, one of the satellite files had no scans in it, > > > > > triggering an exception when I tried to reshape the data from > > it. > > > > > > > > > > >>> import numpy as np > > > > > >>> a = np.zeros((0, 5*64)) > > > > > >>> a.shape > > > > > (0, 320) > > > > > >>> a.shape = (0, 5, 64) > > > > > >>> a.shape > > > > > (0, 5, 64) > > > > > >>> a.shape = (0, 5*64) > > > > > >>> a.shape = (0, 5, -1) > > > > > Traceback (most recent call last): > > > > > File "", line 1, in > > > > > ValueError: total size of new array must be unchanged > > > > > > > > > > So, if I know all of the dimensions, I can reshape just fine. > > But > > > > > if I wanted to use the nifty -1 semantic, it completely falls > > > > > apart. I can see arguments going either way for whether this > > is a > > > > > bug or not. > > > > > > > > > > > > > When you try `a.shape = (0, 5, -1)`, the size of the third > > > > dimension is ambiguous. From the Zen of Python: "In the face > > of > > > > ambiguity, refuse the temptation to guess." > > > > > > > > Warren > > > > > > > > > > > > > > > > > > > > > > Thoughts? > > > > > > > > > > Ben Root > > > > > > > > > > _______________________________________________ > > > > > NumPy-Discussion mailing list > > > > > NumPy-Discussion at scipy.org > > > > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > > > > > > > _______________________________________________ > > > > NumPy-Discussion mailing list > > > > NumPy-Discussion at scipy.org > > > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > _______________________________________________ > > > NumPy-Discussion mailing list > > > NumPy-Discussion at scipy.org > > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From sebastian at sipsolutions.net Tue Feb 23 15:14:03 2016 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Tue, 23 Feb 2016 21:14:03 +0100 Subject: [Numpy-discussion] reshaping empty array bug? 
In-Reply-To: <1456258003.9274.14.camel@sipsolutions.net> References: <1456253921.9274.3.camel@sipsolutions.net> <1456258003.9274.14.camel@sipsolutions.net> Message-ID: <1456258443.9274.18.camel@sipsolutions.net> On Di, 2016-02-23 at 21:06 +0100, Sebastian Berg wrote: > On Di, 2016-02-23 at 14:57 -0500, Benjamin Root wrote: > > I'd be more than happy to write up the patch. I don't think it > > would > > be quite like make zeros be ones, but it would be along those > > lines. > > One case I need to wrap my head around is to make sure that a 0 > > would > > happen if the following was true: > > > > > > > a = np.ones((0, 5*64)) > > > > > a.shape = (-1, 5, 64) > > > > EDIT: Just tried the above, and it works as expected (zero in the > > first dim)! > > > > Just tried out a couple of other combos: > > > > > a.shape = (-1,) > > > > > a.shape > > (0,) > > > > > a.shape = (-1, 5, 64) > > > > > a.shape > > (0, 5, 64) > > > > Seems right to me on first sight :). (I don't like shape assignments > though, who cares for one extra view). Well, maybe 1 instead of 0 > (ignore 0s), but if the result for -1 is to use 1 and the shape is 0 > convert the 1 back to 0. But it is starting to sound a bit tricky, > though I think it might be straight forward (i.e. no real traps and > when it works it always is what you expect). > The main point is, whether you can design cases where the conversion > back to 0 hides bugs by not failing when it should. And whether that > would be a tradeoff we are willing to accept. > Another thought. Maybe you can figure out the -1 correctly, if there is no *other* 0 involved. If there is any other 0, I could imagine problems. > - Sebastian > > > > > > This is looking more and more like a bug to me. > > > > Ben Root > > > > > > On Tue, Feb 23, 2016 at 1:58 PM, Sebastian Berg < > > sebastian at sipsolutions.net> wrote: > > > On Di, 2016-02-23 at 11:45 -0500, Benjamin Root wrote: > > > > but, it isn't really ambiguous, is it? The -1 can only refer to > > > > a > > > > single dimension, and if you ignore the zeros in the original > > > > and > > > new > > > > shape, the -1 is easily solvable, right? > > > > > > I think if there is a simple logic (like using 1 for all zeros in > > > both > > > input and output shape for the -1 calculation), maybe we could do > > > it. I > > > would like someone to think about it carefully that it would not > > > also > > > allow some unexpected generalizations. And at least I am getting > > > a > > > BrainOutOfResourcesError right now trying to figure that out :). > > > > > > - Sebastian > > > > > > > > > > Ben Root > > > > > > > > On Tue, Feb 23, 2016 at 11:41 AM, Warren Weckesser < > > > > warren.weckesser at gmail.com> wrote: > > > > > > > > > > > > > > > On Tue, Feb 23, 2016 at 11:32 AM, Benjamin Root < > > > > > ben.v.root at gmail.com> wrote: > > > > > > Not exactly sure if this should be a bug or not. This came > > > > > > up > > > in > > > > > > a fairly general function of mine to process satellite > > > > > > data. > > > > > > Unexpectedly, one of the satellite files had no scans in > > > > > > it, > > > > > > triggering an exception when I tried to reshape the data > > > > > > from > > > it. 
> > > > > > > > > > > > > > > import numpy as np > > > > > > > > > a = np.zeros((0, 5*64)) > > > > > > > > > a.shape > > > > > > (0, 320) > > > > > > > > > a.shape = (0, 5, 64) > > > > > > > > > a.shape > > > > > > (0, 5, 64) > > > > > > > > > a.shape = (0, 5*64) > > > > > > > > > a.shape = (0, 5, -1) > > > > > > Traceback (most recent call last): > > > > > > File "", line 1, in > > > > > > ValueError: total size of new array must be unchanged > > > > > > > > > > > > So, if I know all of the dimensions, I can reshape just > > > > > > fine. > > > But > > > > > > if I wanted to use the nifty -1 semantic, it completely > > > > > > falls > > > > > > apart. I can see arguments going either way for whether > > > > > > this > > > is a > > > > > > bug or not. > > > > > > > > > > > > > > > > When you try `a.shape = (0, 5, -1)`, the size of the third > > > > > dimension is ambiguous. From the Zen of Python: "In the > > > > > face > > > of > > > > > ambiguity, refuse the temptation to guess." > > > > > > > > > > Warren > > > > > > > > > > > > > > > > > > > > > > > > > > > Thoughts? > > > > > > > > > > > > Ben Root > > > > > > > > > > > > _______________________________________________ > > > > > > NumPy-Discussion mailing list > > > > > > NumPy-Discussion at scipy.org > > > > > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > > > > > > > > > > _______________________________________________ > > > > > NumPy-Discussion mailing list > > > > > NumPy-Discussion at scipy.org > > > > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > > > _______________________________________________ > > > > NumPy-Discussion mailing list > > > > NumPy-Discussion at scipy.org > > > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > _______________________________________________ > > > NumPy-Discussion mailing list > > > NumPy-Discussion at scipy.org > > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From njs at pobox.com Tue Feb 23 15:14:25 2016 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 23 Feb 2016 12:14:25 -0800 Subject: [Numpy-discussion] reshaping empty array bug? In-Reply-To: References: Message-ID: On Tue, Feb 23, 2016 at 8:45 AM, Benjamin Root wrote: > but, it isn't really ambiguous, is it? The -1 can only refer to a single > dimension, and if you ignore the zeros in the original and new shape, the -1 > is easily solvable, right? Sure, it's totally ambiguous. These are all legal: In [1]: a = np.zeros((0, 5, 64)) In [2]: a.shape = (0, 5 * 64) In [3]: a.shape = (0, 5 * 65) In [4]: a.shape = (0, 5, 102) In [5]: a.shape = (0, 102, 64) Generally, the -1 gets replaced by prod(old_shape) // prod(specified_entries_in_new_shape). If the specified new shape has a 0 in it, then this is a divide-by-zero. In this case it happens because it's the solution to the equation prod((0, 5, 64)) == prod((0, 5, x)) for which there is no unique solution for 'x'. 
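To make that rule concrete, here is a rough Python sketch of the -1 resolution (the real logic lives in NumPy's C code; the helper name is made up here). It also shows exactly where the zero-sized case in this thread falls over: the product of the specified new entries is 0, so there is no unique value for the -1.

import numpy as np

def resolve_minus_one(old_shape, new_shape):
    # Rough sketch of the current rule, not the actual implementation.
    known = int(np.prod([d for d in new_shape if d != -1], dtype=np.intp))
    total = int(np.prod(old_shape, dtype=np.intp))
    if known == 0 or total % known != 0:
        raise ValueError("total size of new array must be unchanged")
    return tuple(d if d != -1 else total // known for d in new_shape)

resolve_minus_one((0, 320), (-1, 5, 64))   # (0, 5, 64): known is 320, so -1 -> 0
resolve_minus_one((0, 320), (0, 5, -1))    # raises: known is 0, nothing to divide by

The variant discussed earlier in the thread (ignore zero-length axes when solving for the -1) would amount to taking both products over the nonzero entries only, which recovers 64 for the satellite-data example but, as noted, needs care as soon as other zeros or further generalizations are involved.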
Your proposed solution feels very heuristic-y to me, and heuristics make me very nervous :-/ If what you really want to say is "flatten axes 1 and 2 together", then maybe there should be some API that lets you directly specify *that*? As a bonus you might be able to avoid awkward tuple manipulations to compute the new shape. -n -- Nathaniel J. Smith -- https://vorpus.org From ben.v.root at gmail.com Tue Feb 23 15:23:03 2016 From: ben.v.root at gmail.com (Benjamin Root) Date: Tue, 23 Feb 2016 15:23:03 -0500 Subject: [Numpy-discussion] reshaping empty array bug? In-Reply-To: References: Message-ID: On Tue, Feb 23, 2016 at 3:14 PM, Nathaniel Smith wrote: > Sure, it's totally ambiguous. These are all legal: I would argue that except for the first reshape, all of those should be an error, and that the current algorithm is buggy. This isn't a heuristic. It isn't guessing. It is making the semantics consistent. The fact that I can do: a.shape = (-1, 5, 64) or a.shape = (0, 5, 64) but not a.shape = (0, 5, -1) is totally inconsistent. Ben Root -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Tue Feb 23 15:30:41 2016 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 23 Feb 2016 12:30:41 -0800 Subject: [Numpy-discussion] reshaping empty array bug? In-Reply-To: References: Message-ID: On Tue, Feb 23, 2016 at 12:23 PM, Benjamin Root wrote: > > On Tue, Feb 23, 2016 at 3:14 PM, Nathaniel Smith wrote: >> >> Sure, it's totally ambiguous. These are all legal: > > > > I would argue that except for the first reshape, all of those should be an > error, and that the current algorithm is buggy. Reshape doesn't care about axes at all; all it cares about is that the number of elements stay the same. E.g. this is also totally legal: np.zeros((12, 5)).reshape((10, 3, 2)) And so are the equivalents np.zeros((12, 5)).reshape((-1, 3, 2)) np.zeros((12, 5)).reshape((10, -1, 2)) np.zeros((12, 5)).reshape((10, 3, -1)) > This isn't a heuristic. It isn't guessing. It is making the semantics > consistent. The fact that I can do: > a.shape = (-1, 5, 64) > or > a.shape = (0, 5, 64) > > but not > a.shape = (0, 5, -1) > > is totally inconsistent. It's certainly annoying and unpleasant, but it follows inevitably from the most natural way of defining the -1 semantics, so I'm not sure I'd say "inconsistent" :-) What should this do? np.zeros((12, 0)).reshape((10, -1, 2)) -n -- Nathaniel J. Smith -- https://vorpus.org From charlesr.harris at gmail.com Tue Feb 23 15:40:48 2016 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 23 Feb 2016 13:40:48 -0700 Subject: [Numpy-discussion] Numpy 1.11.0rc1 released. In-Reply-To: References: Message-ID: Christoph reports the following problem that I am unable to reproduce on appveyor or find reported elsewhere. On all 32-bit platforms: ============================================================ ERROR: test_zeros_big (test_multiarray.TestCreation) ------------------------------------------------------------ Traceback (most recent call last): File "X:\Python27\lib\site-packages\numpy\core\tests\test_multiarray.py", line 594, in test_zeros_big d = np.zeros((30 * 1024**2,), dtype=dt) MemoryError I would be much obliged if someone else could demonstrate it. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben.v.root at gmail.com Tue Feb 23 15:50:14 2016 From: ben.v.root at gmail.com (Benjamin Root) Date: Tue, 23 Feb 2016 15:50:14 -0500 Subject: [Numpy-discussion] reshaping empty array bug? 
In-Reply-To: References: Message-ID: On Tue, Feb 23, 2016 at 3:30 PM, Nathaniel Smith wrote: > What should this do? > > np.zeros((12, 0)).reshape((10, -1, 2)) > It should error out, I already covered that. 12 != 20. Ben Root -------------- next part -------------- An HTML attachment was scrubbed... URL: From jtaylor.debian at googlemail.com Tue Feb 23 15:58:48 2016 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Tue, 23 Feb 2016 21:58:48 +0100 Subject: [Numpy-discussion] Numpy 1.11.0rc1 released. In-Reply-To: References: Message-ID: <56CCC808.5090308@googlemail.com> that test needs about 500Mb of memory on windows as it doesn't have sparse allocations like most *nixes. It used to fail for me during release testing when I only gave the windows VM 1GB of ram. If its a problem for CI we could disable it on windows, or at least skip the complex double case. On 23.02.2016 21:40, Charles R Harris wrote: > Christoph reports the following problem that I am unable to reproduce on > appveyor or find reported elsewhere. > > On all 32-bit platforms: > > ============================================================ > ERROR: test_zeros_big (test_multiarray.TestCreation) > ------------------------------------------------------------ > Traceback (most recent call last): > File > "X:\Python27\lib\site-packages\numpy\core\tests\test_multiarray.py", > line 594, in test_zeros_big > d = np.zeros((30 * 1024**2,), dtype=dt) > MemoryError > > I would be much obliged if someone else could demonstrate it. > > > > Chuck > > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > From pav at iki.fi Tue Feb 23 16:05:36 2016 From: pav at iki.fi (Pauli Virtanen) Date: Tue, 23 Feb 2016 23:05:36 +0200 Subject: [Numpy-discussion] Numpy 1.11.0rc1 released. In-Reply-To: References: Message-ID: 23.02.2016, 22:40, Charles R Harris kirjoitti: [clip] > On all 32-bit platforms: > > ============================================================ > ERROR: test_zeros_big (test_multiarray.TestCreation) > ------------------------------------------------------------ > Traceback (most recent call last): > File "X:\Python27\lib\site-packages\numpy\core\tests\test_multiarray.py", > line 594, in test_zeros_big > d = np.zeros((30 * 1024**2,), dtype=dt) > MemoryError > > I would be much obliged if someone else could demonstrate it. Memory fragmentation in the 2GB address space available? If dt==float64, that requires 250MB contiguous. -- Pauli Virtanen From sebastian at sipsolutions.net Tue Feb 23 16:06:57 2016 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Tue, 23 Feb 2016 22:06:57 +0100 Subject: [Numpy-discussion] How to check for memory leaks? In-Reply-To: References: Message-ID: <1456261617.14343.10.camel@sipsolutions.net> On Di, 2016-02-23 at 12:36 -0700, Charles R Harris wrote: > Hi All, > > I'm suspecting a possible memory leak in 1.11.x, what is the best way > to check for that? > Would like to learn better methods, but I tried valgrind with trace origins and full leak check, just thinking maybe it shows something. Unfortunately, I got the error below midway, I ran it before successfully (with only minor obvious leaks due to things like module wide strings) I think. My guess is, the error does not say much at all, but I have no clue :) (running without track-origins now, maybe it helps). - Sebastian Error: VEX temporary storage exhausted. 
Pool = TEMP, start 0x38f91668 curr 0x39456190 end 0x394561a7 (size 5000000) vex: the `impossible' happened: VEX temporary storage exhausted. > Chuck > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From charlesr.harris at gmail.com Tue Feb 23 16:19:47 2016 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 23 Feb 2016 14:19:47 -0700 Subject: [Numpy-discussion] Numpy 1.11.0rc1 released. In-Reply-To: <56CCC808.5090308@googlemail.com> References: <56CCC808.5090308@googlemail.com> Message-ID: On Tue, Feb 23, 2016 at 1:58 PM, Julian Taylor < jtaylor.debian at googlemail.com> wrote: > that test needs about 500Mb of memory on windows as it doesn't have > sparse allocations like most *nixes. > It used to fail for me during release testing when I only gave the > windows VM 1GB of ram. > If its a problem for CI we could disable it on windows, or at least skip > the complex double case. > It's not a problem on CI, just for Christoph. I asked him what memory resources the test had available but haven't heard back. AFAICT, nothing associated with the test has changed for this release. The options are probably to 1) ignore the failure, or 2) disable the test on 32 bits. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From cgohlke at uci.edu Tue Feb 23 17:06:40 2016 From: cgohlke at uci.edu (Christoph Gohlke) Date: Tue, 23 Feb 2016 14:06:40 -0800 Subject: [Numpy-discussion] Numpy 1.11.0rc1 released. In-Reply-To: References: Message-ID: <56CCD7F0.80505@uci.edu> On 2/23/2016 1:05 PM, Pauli Virtanen wrote: > 23.02.2016, 22:40, Charles R Harris kirjoitti: > [clip] >> On all 32-bit platforms: >> >> ============================================================ >> ERROR: test_zeros_big (test_multiarray.TestCreation) >> ------------------------------------------------------------ >> Traceback (most recent call last): >> File "X:\Python27\lib\site-packages\numpy\core\tests\test_multiarray.py", >> line 594, in test_zeros_big >> d = np.zeros((30 * 1024**2,), dtype=dt) >> MemoryError >> >> I would be much obliged if someone else could demonstrate it. > > Memory fragmentation in the 2GB address space available? If dt==float64, > that requires 250MB contiguous. > Before creating the dtype('D') test array, the largest contiguous block available to the 32 bit Python process on my system is ~830 MB. The creation of this array succeeds. However, the creation of the next dtype('G') test array fails because the previous array is still in memory and the largest contiguous block available is only ~318 MB. Deleting the test arrays after usage via del(d) fixes this problem . Another fix could be to change the order of data types tested. Christoph From charlesr.harris at gmail.com Tue Feb 23 17:44:40 2016 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 23 Feb 2016 15:44:40 -0700 Subject: [Numpy-discussion] Numpy 1.11.0rc1 released. In-Reply-To: <56CCD7F0.80505@uci.edu> References: <56CCD7F0.80505@uci.edu> Message-ID: Christoph, any chance you can test https://github.com/numpy/numpy/pull/7324 before it gets merged (or not). Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jfoxrabinovitz at gmail.com Tue Feb 23 18:20:00 2016 From: jfoxrabinovitz at gmail.com (Joseph Fox-Rabinovitz) Date: Tue, 23 Feb 2016 18:20:00 -0500 Subject: [Numpy-discussion] ENH: `scale` parameter for `sinc` Message-ID: I have created PR #7322 (https://github.com/numpy/numpy/pull/7322) to add a scale parameter to `sinc`. What this allows is to compute `sinc` as `sin(x)/x` or really `sin(n*x)/(n*x)` for arbitrary `n` instead of just `sin(pi*x)/(pi*x)` as is being done now. The parameter accepts two string arguments in addition to the actual scale value: 'normalized' and 'unnormalized'. 'normalized' is the default since that is the existing functionality. 'unnormalized' is equivalent to a `scale` of 1.0. The parameter also supports broadcasting against the input array. Regards, -Joe P.S. I would like to turn `sinc` into a `ufunc` at some point if the community approves. It would make the computation much cleaner (e.g., in-place `where`) and faster. It would also complement the existing trig functions nicely. The only question I have is whether or not it is possible to pass in optional parameters to ufuncs beyond the ones listed in http://docs.scipy.org/doc/numpy-1.10.0/reference/ufuncs.html#optional-keyword-arguments From njs at pobox.com Tue Feb 23 18:23:41 2016 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 23 Feb 2016 15:23:41 -0800 Subject: [Numpy-discussion] ENH: `scale` parameter for `sinc` In-Reply-To: References: Message-ID: On Tue, Feb 23, 2016 at 3:20 PM, Joseph Fox-Rabinovitz wrote: > P.S. I would like to turn `sinc` into a `ufunc` at some point if the > community approves. It would make the computation much cleaner (e.g., > in-place `where`) and faster. It would also complement the existing > trig functions nicely. The only question I have is whether or not it > is possible to pass in optional parameters to ufuncs beyond the ones > listed in http://docs.scipy.org/doc/numpy-1.10.0/reference/ufuncs.html#optional-keyword-arguments Right now it isn't possible, no. There are a lot of general improvements we can/should make to ufuncs, and this is one of them... -n -- Nathaniel J. Smith -- https://vorpus.org From gfyoung17 at gmail.com Wed Feb 24 03:40:15 2016 From: gfyoung17 at gmail.com (G Young) Date: Wed, 24 Feb 2016 08:40:15 +0000 Subject: [Numpy-discussion] fromnumeric.py internal calls Message-ID: Hello all, I have PR #7325 up that changes the internal calls for functions in *fromnumeric.py* from positional arguments to keyword arguments. I made this change for two reasons: 1) It is consistent with the external function signature 2) The inconsistency caused a breakage in *pandas* in its own implementation of *searchsorted* in which the *sorter* argument is not really used but is accepted so as to make it easier for *numpy* users who may be used to the *searchsorted* signature in *numpy*. The standard in *pandas* is to "swallow" those unused arguments into a *kwargs* argument so that we don't have to document an argument that we don't really use. However, that turned out not to be possible when *searchsorted* is called from the *numpy* library. Does anyone have any objections to the changes I made? Thanks! Greg -------------- next part -------------- An HTML attachment was scrubbed... URL: From cimrman3 at ntc.zcu.cz Wed Feb 24 08:20:48 2016 From: cimrman3 at ntc.zcu.cz (Robert Cimrman) Date: Wed, 24 Feb 2016 14:20:48 +0100 Subject: [Numpy-discussion] ANN: SfePy 2016.1 Message-ID: <56CDAE30.8030108@ntc.zcu.cz> I am pleased to announce release 2016.1 of SfePy. 
Description ----------- SfePy (simple finite elements in Python) is a software for solving systems of coupled partial differential equations by the finite element method or by the isogeometric analysis (preliminary support). It is distributed under the new BSD license. Home page: http://sfepy.org Mailing list: http://groups.google.com/group/sfepy-devel Git (source) repository, issue tracker, wiki: http://github.com/sfepy Highlights of this release -------------------------- - major simplification of finite element field code - automatic checking of shapes of term arguments - improved mesh parametrization code and documentation - support for fieldsplit preconditioners of PETSc For full release notes see http://docs.sfepy.org/doc/release_notes.html#id1 (rather long and technical). Best regards, Robert Cimrman on behalf of the SfePy development team --- Contributors to this release in alphabetical order: Robert Cimrman Vladimir Lukes From charlesr.harris at gmail.com Wed Feb 24 12:42:29 2016 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 24 Feb 2016 10:42:29 -0700 Subject: [Numpy-discussion] Numpy 1.11.0rc1 released. In-Reply-To: References: Message-ID: On Tue, Feb 23, 2016 at 8:44 AM, Pauli Virtanen wrote: > 23.02.2016, 03:47, Charles R Harris kirjoitti: > > I'm delighted to announce the release of Numpy 1.11.0rc1. Hopefully the > > issues discovered in 1.11.0b3 have been dealt with and this release can > go > > on to become the official release. Source files and documentation can be > > found on Sourceforge > > , while > > source files and OS X wheels for Python 2.7, 3.3, 3.4, and 3.5 can be > > installed from Pypi. Please test thoroughly. > > FWIW https://travis-ci.org/pv/testrig/builds/108384173 Thanks for that. Most of the new errors look to be the result of the change in divmod, where before divmod(float64(1.0), 0.0) was (inf, nan) and it is now (nan, nan). There are also two errors in matplotlib that look to be the result of the slight change in the numerical values of remainder due to improved precision. I would class those errors as more of a test problem resulting from the inherent imprecision of floating point than a numpy regression. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From jbednar at inf.ed.ac.uk Wed Feb 24 19:42:44 2016 From: jbednar at inf.ed.ac.uk (James A. Bednar) Date: Thu, 25 Feb 2016 00:42:44 +0000 Subject: [Numpy-discussion] ANN: Stop plotting your data -- HoloViews 1.4 released! Message-ID: <22222.19972.843557.670445@rockefeller.inf.ed.ac.uk> We are pleased to announce the fifth public release of HoloViews, a Python package for exploring and visualizing numerical data: http://holoviews.org HoloViews provides composable, sliceable, declarative data structures for building even complex visualizations easily. Instead of you having to explicitly and laboriously plot your data, HoloViews lets you simply annotate your data so that any part of it visualizes itself automatically. You can now work with large datasets as easily as you work with simple datatypes at the Python prompt. 
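For readers who have not yet seen the declarative style, a minimal sketch (assuming a Jupyter
notebook with NumPy installed and the 1.4-era element API; exact keyword names may differ from
the tutorials) looks roughly like this:

    import numpy as np
    import holoviews as hv
    hv.notebook_extension('bokeh')   # load the new Bokeh backend; 'matplotlib' also works

    xs = np.linspace(0, 2 * np.pi, 200)

    # Annotate the raw arrays as a Curve element with named dimensions;
    # displaying the object in the notebook renders it automatically,
    # with no explicit plotting call.
    curve = hv.Curve((xs, np.sin(xs)), kdims=['x'], vdims=['amplitude'])
    curve

Composing and slicing such annotated objects follows the same pattern; the tutorials cover
this in depth.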
The new version can be installed using conda:

    conda install -c ioam holoviews

Release 1.4 introduces major new features, incorporating over 1700 new commits and closing 142 issues:

- Now supports both Bokeh (bokeh.pydata.org) and matplotlib backends, with Bokeh providing extensive interactive features such as panning and zooming linked axes, and customizable callbacks
- DynamicMap: Allows exploring live streams from ongoing data collection or simulation, or parameter spaces too large to fit into your computer's or your browser's memory, from within a Jupyter notebook
- Columnar data support: Underlying data storage can now be in Pandas dataframes, NumPy arrays, or Python dictionaries, allowing you to define HoloViews objects without copying or reformatting your data
- New Element types: Area (area under or between curves), Spikes (sequence of lines, e.g. spectra, neural spikes, or rug plots), BoxWhisker (summary of a distribution), QuadMesh (nonuniform rasters), Trisurface (Delaunay-triangulated surface plots)
- New Container type: GridMatrix (grid of heterogeneous Elements)
- Improved layout handling, with better support for varying aspect ratios and plot sizes
- Improved help system, including recursively listing and searching the help for all the components of a composite object
- Improved Jupyter/IPython notebook support, including improved export using nbconvert, and standalone HTML output that supports dynamic widgets even without a Python server
- Significant performance improvements for large or highly nested data

And of course we have fixed a number of bugs found by our very dedicated users; please keep filing GitHub issues if you find any! For the full list of changes, see: https://github.com/ioam/holoviews/releases

HoloViews is now supported by Continuum Analytics, and is being used in a wide range of scientific and industrial projects. HoloViews remains freely available under a BSD license, is Python 2 and 3 compatible, and has minimal external dependencies, making it easy to integrate into your workflow. Try out the extensive tutorials at holoviews.org today!

Jean-Luc R. Stevens
Philipp Rudiger
James A. Bednar

Continuum Analytics, Inc., Austin, TX, USA
School of Informatics, The University of Edinburgh, UK

-- The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.

From erensezener at gmail.com Fri Feb 26 11:32:35 2016 From: erensezener at gmail.com (Eren Sezener) Date: Fri, 26 Feb 2016 17:32:35 +0100 Subject: [Numpy-discussion] Generalized flip function Message-ID:

Hi,

In PR #7346 we add a flip function that generalizes fliplr and flipud for arbitrary axes.

flipud and fliplr reverse the elements of an array along axis=0 and axis=1 respectively. The new flip function reverses the elements of an array along any given axis. In case flip is called with axis=0 or axis=1, the function is equivalent to flipud and fliplr respectively.

A similar function is also available in MATLAB.

We use this function in PR #7347 to generalize the rot90 function to rotate an arbitrary plane (defined by the axes argument) of a multidimensional array. By that we fix issue #6506.

Because the flip function introduces a new API, @shoyer asked us to consult the mailing list.

Any objection to adding the generalized flip function?

Best regards,
C. Eren Sezener & Denis Alevi

-------------- next part -------------- An HTML attachment was scrubbed...
URL: From jfoxrabinovitz at gmail.com Fri Feb 26 11:36:03 2016 From: jfoxrabinovitz at gmail.com (Joseph Fox-Rabinovitz) Date: Fri, 26 Feb 2016 11:36:03 -0500 Subject: [Numpy-discussion] Generalized flip function In-Reply-To: References: Message-ID: If nothing else, this is a nice complement to the generalized `stack` function. -Joe On Fri, Feb 26, 2016 at 11:32 AM, Eren Sezener wrote: > Hi, > > In PR #7346 we add a flip function that generalizes fliplr and flipud for > arbitrary axes. > > flipud and fliplr reverse the elements of an array along axis=0 and axis=1 > respectively. The new flip function reverses the elements of an array along > any given axis. In case flip is called with axis=0 or axis=1, the function > is equivalent to flipud and fliplr respectively. > > A similar function is also available in MATLAB?. > > We use this function in PR #7347 to generalize the rot90 function to rotate > an arbitrary plane (defined by the axes argument) of a multidimensional > array. By that we fix issue #6506. > > Because flip function introduces a new API, @shoyer asked us to consult the > mailing list. > > Any objection to adding the generalized flip function? > > Best regards, > C. Eren Sezener & Denis Alevi > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > From jeffreback at gmail.com Sun Feb 28 12:03:45 2016 From: jeffreback at gmail.com (Jeff) Date: Sun, 28 Feb 2016 09:03:45 -0800 (PST) Subject: [Numpy-discussion] ANN: pandas v0.18.0rc1 - RELEASE CANDIDATE In-Reply-To: <4f18bfb7-ad9e-4d50-bf3e-1f7164e96bfa@googlegroups.com> References: <4f18bfb7-ad9e-4d50-bf3e-1f7164e96bfa@googlegroups.com> Message-ID: <5224abfa-fccd-4f07-8d4a-7360c9764cc4@googlegroups.com> These are pre-releases. In other words, we would want the community to test out before an official release, and see if there are any show stoppers. The docs are setup for the official releases. These are not put into official channels at all (that is the point), e.g. not on PyPi, nor in the conda main channels. Only official releases will go there. Generally we will try to do release candidates before major changes, but not before minor changes. So the official release of 0.18.0 has not happened yet! (in fact going to do a v0.18.0rc2 next week). We would love for you to test out! Jeff On Sunday, February 28, 2016 at 11:50:57 AM UTC-5, John E wrote: > > I hope this doesn't come across as a trivial, semantical question, but... > > The initial releases of the last 2 or so versions have been labelled as > "release candidates" but still say "We recommend that all > users upgrade to this version." > > So this is a little confusing to me for using pandas in a production > environment. "Release candidate" seems to suggest that you should wait for > 0.18.1, but the note unambiguously says not to wait. So which > interpretation is recommended for a production environment? > > > On Saturday, February 13, 2016 at 7:53:18 PM UTC-5, Jeff wrote: >> >> Hi, >> >> I'm pleased to announce the availability of the first release candidate >> of Pandas 0.18.0. >> Please try this RC and report any issues here: Pandas Issues >> >> We will be releasing officially in 1-2 weeks or so. >> >> **RELEASE CANDIDATE 1** >> >> This is a major release from 0.17.1 and includes a small number of API >> changes, several new features, >> enhancements, and performance improvements along with a large number of >> bug fixes. 
We recommend that all >> users upgrade to this version. >> >> Highlights include: >> >> - pandas >= 0.18.0 will no longer support compatibility with Python >> version 2.6 GH7718 or >> version 3.3 GH11273 >> - Moving and expanding window functions are now methods on Series and >> DataFrame similar to .groupby like objects, see here >> >> . >> - Adding support for a RangeIndex as a specialized form of the >> Int64Index for memory savings, see here >> >> . >> - API breaking .resample changes to make it more .groupby like, see >> here >> >> - Removal of support for positional indexing with floats, which was >> deprecated since 0.14.0. This will now raise a TypeError, see here >> >> - The .to_xarray() function has been added for compatibility with the xarray >> package see here >> >> . >> - Addition of the .str.extractall() method >> , >> and API changes to the the .str.extract() method >> , >> and the .str.cat() method >> >> - pd.test() top-level nose test runner is available GH4327 >> >> >> See the Whatsnew >> for much >> more information. >> >> Best way to get this is to install via conda >> from >> our development channel. Builds for osx-64,linux-64,win-64 for Python 2.7 >> and Python 3.5 are all available. >> >> conda install pandas=v0.18.0rc1 -c pandas >> >> Thanks to all who made this release happen. It is a very large release! >> >> Jeff >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From shoyer at gmail.com Sun Feb 28 12:54:52 2016 From: shoyer at gmail.com (Stephan Hoyer) Date: Sun, 28 Feb 2016 17:54:52 +0000 Subject: [Numpy-discussion] Generalized flip function In-Reply-To: References: Message-ID: I also think this is a good idea -- the generalized flip is much more numpythonic than the specialized 2d versions. On Fri, Feb 26, 2016 at 11:36 AM Joseph Fox-Rabinovitz < jfoxrabinovitz at gmail.com> wrote: > If nothing else, this is a nice complement to the generalized `stack` > function. > > -Joe > > On Fri, Feb 26, 2016 at 11:32 AM, Eren Sezener > wrote: > > Hi, > > > > In PR #7346 we add a flip function that generalizes fliplr and flipud for > > arbitrary axes. > > > > flipud and fliplr reverse the elements of an array along axis=0 and > axis=1 > > respectively. The new flip function reverses the elements of an array > along > > any given axis. In case flip is called with axis=0 or axis=1, the > function > > is equivalent to flipud and fliplr respectively. > > > > A similar function is also available in MATLAB?. > > > > We use this function in PR #7347 to generalize the rot90 function to > rotate > > an arbitrary plane (defined by the axes argument) of a multidimensional > > array. By that we fix issue #6506. > > > > Because flip function introduces a new API, @shoyer asked us to consult > the > > mailing list. > > > > Any objection to adding the generalized flip function? > > > > Best regards, > > C. Eren Sezener & Denis Alevi > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jeffreback at gmail.com Sun Feb 28 13:49:41 2016 From: jeffreback at gmail.com (Jeff) Date: Sun, 28 Feb 2016 10:49:41 -0800 (PST) Subject: [Numpy-discussion] ANN: pandas v0.18.0rc1 - RELEASE CANDIDATE In-Reply-To: References: <4f18bfb7-ad9e-4d50-bf3e-1f7164e96bfa@googlegroups.com> <5224abfa-fccd-4f07-8d4a-7360c9764cc4@googlegroups.com> Message-ID: So you are probably reading sas7bdat, which was put in AFTER 0.18.0rc1 was cut (if you are reading xport format then you are good to go), otherwise you may want to wait a bit for 0.18.0rc2. On Sunday, February 28, 2016 at 1:42:53 PM UTC-5, John E wrote: > > OK, thanks, I got it. Although... I would consider pandas.pydata.org to > be a common end user gateway and if one starts there they will read "We > recommend that *all *users upgrade to this version." And then if they > scroll down a short distance they will see a single line instruction for > installing via conda: "conda install pandas=v0.18.0rc1 -c pandas". > > And also somewhat confusing to me about pandas.pydata.org is that looking > to the right, you have a choice of RC, dev, and previous releases, but > nothing that says something like "current, stable release". > > Anyways.... quite possibly this is confusing only to me, and not others, > but I thought I'd mention it just in case. FWIW. > > I've now installed 0.18.0rc1 and will try to test out some of the newer > features. I'm really interested to see how well the SAS reader works (i.e. > how fast). I hate SAS myself, but this would be a really, really nice > feature for my organization and likely increase adoption of python & pandas. > > > > On Sunday, February 28, 2016 at 12:03:45 PM UTC-5, Jeff wrote: >> >> >> These are pre-releases. In other words, we would want the community to >> test out before an official release, and see if there are any show >> stoppers. The docs are setup for the official releases. These are not put >> into official channels at all (that is the point), e.g. not on PyPi, nor in >> the conda main channels. Only official releases will go there. >> >> Generally we will try to do release candidates before major changes, but >> not before minor changes. >> >> So the official release of 0.18.0 has not happened yet! (in fact going to >> do a v0.18.0rc2 next week). >> >> We would love for you to test out! >> >> Jeff >> >> >> >> >> On Sunday, February 28, 2016 at 11:50:57 AM UTC-5, John E wrote: >>> >>> I hope this doesn't come across as a trivial, semantical question, but... >>> >>> The initial releases of the last 2 or so versions have been labelled as >>> "release candidates" but still say "We recommend that all >>> users upgrade to this version." >>> >>> So this is a little confusing to me for using pandas in a production >>> environment. "Release candidate" seems to suggest that you should wait for >>> 0.18.1, but the note unambiguously says not to wait. So which >>> interpretation is recommended for a production environment? >>> >>> >>> On Saturday, February 13, 2016 at 7:53:18 PM UTC-5, Jeff wrote: >>>> >>>> Hi, >>>> >>>> I'm pleased to announce the availability of the first release candidate >>>> of Pandas 0.18.0. >>>> Please try this RC and report any issues here: Pandas Issues >>>> >>>> We will be releasing officially in 1-2 weeks or so. >>>> >>>> **RELEASE CANDIDATE 1** >>>> >>>> This is a major release from 0.17.1 and includes a small number of API >>>> changes, several new features, >>>> enhancements, and performance improvements along with a large number of >>>> bug fixes. 
We recommend that all >>>> users upgrade to this version. >>>> >>>> Highlights include: >>>> >>>> - pandas >= 0.18.0 will no longer support compatibility with Python >>>> version 2.6 GH7718 or >>>> version 3.3 GH11273 >>>> - Moving and expanding window functions are now methods on Series >>>> and DataFrame similar to .groupby like objects, see here >>>> >>>> . >>>> - Adding support for a RangeIndex as a specialized form of the >>>> Int64Index for memory savings, see here >>>> >>>> . >>>> - API breaking .resample changes to make it more .groupby like, see >>>> here >>>> >>>> - Removal of support for positional indexing with floats, which was >>>> deprecated since 0.14.0. This will now raise a TypeError, see here >>>> >>>> - The .to_xarray() function has been added for compatibility with >>>> the xarray package see here >>>> >>>> . >>>> - Addition of the .str.extractall() method >>>> , >>>> and API changes to the the .str.extract() method >>>> , >>>> and the .str.cat() method >>>> >>>> - pd.test() top-level nose test runner is available GH4327 >>>> >>>> >>>> See the Whatsnew >>>> for >>>> much more information. >>>> >>>> Best way to get this is to install via conda >>>> from >>>> our development channel. Builds for osx-64,linux-64,win-64 for Python 2.7 >>>> and Python 3.5 are all available. >>>> >>>> conda install pandas=v0.18.0rc1 -c pandas >>>> >>>> Thanks to all who made this release happen. It is a very large release! >>>> >>>> Jeff >>>> >>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From shoyer at gmail.com Mon Feb 29 02:44:30 2016 From: shoyer at gmail.com (Stephan Hoyer) Date: Mon, 29 Feb 2016 07:44:30 +0000 Subject: [Numpy-discussion] fromnumeric.py internal calls In-Reply-To: References: Message-ID: I think this is an improvement, but I do wonder if there are libraries out there that use *args instead of **kwargs to handle these extra arguments. Perhaps it's worth testing this change against third party array libraries that implement their own array like classes? Off the top of my head, maybe scipy, pandas, dask, astropy, pint, xarray? On Wed, Feb 24, 2016 at 3:40 AM G Young wrote: > Hello all, > > I have PR #7325 up that > changes the internal calls for functions in *fromnumeric.py* from > positional arguments to keyword arguments. I made this change for two > reasons: > > 1) It is consistent with the external function signature > 2) > > The inconsistency caused a breakage in *pandas* in its own implementation > of *searchsorted* in which the *sorter* argument is not really used but > is accepted so as to make it easier for *numpy* users who may be used to > the *searchsorted* signature in *numpy*. > > The standard in *pandas* is to "swallow" those unused arguments into a > *kwargs* argument so that we don't have to document an argument that we > don't really use. However, that turned out not to be possible when > *searchsorted* is called from the *numpy* library. > > Does anyone have any objections to the changes I made? > > Thanks! > > Greg > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From gfyoung17 at gmail.com Mon Feb 29 03:07:28 2016 From: gfyoung17 at gmail.com (G Young) Date: Mon, 29 Feb 2016 08:07:28 +0000 Subject: [Numpy-discussion] fromnumeric.py internal calls In-Reply-To: References: Message-ID: Well I know pandas uses **kwargs (that was the motivation for this PR), but I can certainly have a look at those other ones. I am not too familiar with all of the third party libraries that implement their own array-like classes, so if there are any others that come to mind, let me know! Also, could you also add that as a comment to the PR as well? Thanks! On Mon, Feb 29, 2016 at 7:44 AM, Stephan Hoyer wrote: > I think this is an improvement, but I do wonder if there are libraries out > there that use *args instead of **kwargs to handle these extra arguments. > Perhaps it's worth testing this change against third party array libraries > that implement their own array like classes? Off the top of my head, maybe > scipy, pandas, dask, astropy, pint, xarray? > On Wed, Feb 24, 2016 at 3:40 AM G Young wrote: > >> Hello all, >> >> I have PR #7325 up that >> changes the internal calls for functions in *fromnumeric.py* from >> positional arguments to keyword arguments. I made this change for two >> reasons: >> >> 1) It is consistent with the external function signature >> 2) >> >> The inconsistency caused a breakage in *pandas* in its own >> implementation of *searchsorted* in which the *sorter* argument is not >> really used but is accepted so as to make it easier for *numpy* users >> who may be used to the *searchsorted* signature in *numpy*. >> >> The standard in *pandas* is to "swallow" those unused arguments into a >> *kwargs* argument so that we don't have to document an argument that we >> don't really use. However, that turned out not to be possible when >> *searchsorted* is called from the *numpy* library. >> >> Does anyone have any objections to the changes I made? >> >> Thanks! >> >> Greg >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL:

From cgodshall at enthought.com Fri Feb 26 13:11:22 2016 From: cgodshall at enthought.com (Courtenay Godshall (Enthought)) Date: Fri, 26 Feb 2016 18:11:22 -0000 Subject: [Numpy-discussion] ANN: SciPy (Scientific Python) 2016 Conference Call for Talk / Tutorial Proposals Open Until 3/25 Message-ID: <006f01d170c1$17790060$466b0120$@enthought.com>

**SciPy 2016 Conference (Scientific Computing with Python) Announcement**

*Call for Proposals: Submit Your Tutorial and Talk Ideas by March 25, 2016 at http://scipy2016.scipy.org.

SciPy 2016, the 15th annual Scientific Computing with Python conference, will be held July 11-17, 2016 in Austin, Texas. SciPy is a community dedicated to the advancement of scientific computing through open source Python software for mathematics, science, and engineering. The annual SciPy Conference brings together over 650 participants from industry, academia, and government to showcase their latest projects, learn from skilled users and developers, and collaborate on code development. The full program will consist of 2 days of tutorials (July 11-12), 3 days of talks (July 13-15), and 2 days of developer sprints (July 16-17).
More info is available on the conference website at http://scipy2016.scipy.org (where you can sign up for the mailing list); or follow @scipyconf on Twitter. We hope you'll join us - early bird registration is open until May 22, 2016 at http://scipy2016.scipy.org/ehome/146062/332936/?&&

We encourage you to submit tutorial or talk proposals in the categories below; please also share with others who you'd like to see participate! Submit via the conference website: http://scipy2016.scipy.org.

-----------------------------------------------------------------------------------------------------
*SUBMIT A SCIPY 2016 TUTORIAL PROPOSAL - DUE MARCH 21, 2016*
-----------------------------------------------------------------------------------------------------

Details and submission here: http://scipy2016.scipy.org/ehome/146062/332967/?&&

These sessions provide extremely affordable access to expert training, and consistently receive fantastic feedback from participants. We're looking for submissions on topics from introductory to advanced - we'll have attendees across the gamut looking to learn. Whether you are a major contributor to a scientific Python library or an expert-level user, this is a great opportunity to share your knowledge, and stipends are available.

-----------------------------------------------------------------------------------------------------
*SUBMIT A SCIPY 2016 TALK / POSTER PROPOSAL - DUE MARCH 25, 2016*
-----------------------------------------------------------------------------------------------------

Details and submission here: http://scipy2016.scipy.org/ehome/146062/332968/?&&

SciPy 2016 will include 3 major topic tracks and 8 mini-symposia tracks.

Major topic tracks include:
- Scientific Computing in Python
- Python in Data Science (Big data and not so big data)
- High Performance Computing

Mini-symposia will include the applications of Python in:
- Earth and Space Science
- Engineering
- Medicine and Biology
- Social Sciences
- Special Purpose Databases
- Case Studies in Industry
- Education
- Reproducibility

If you have any questions or comments, feel free to contact us at: scipy-organizers at scipy.org

-----------------------------------------------------------
**SCIPY 2016 REGISTRATION IS OPEN**
-----------------------------------------------------------

Please register early. SciPy early bird registration until May 22, 2016! Register at http://scipy2016.scipy.org. Plus, enter our t-shirt design contest to win a free registration. (Send a vector art file to scipy at enthought.com by March 31 to enter).

-------------- next part -------------- An HTML attachment was scrubbed... URL: